Adapted Binary Particle Swarm Optimization for Efficient Features Selection in the Case of Imbalanced Sensor Data

Abstract: Daily living activities (DLAs) classification using data collected from wearable monitoring sensors is very challenging due to the imbalanced nature of the monitored data. A major research challenge is to determine the combination of features that returns the best accuracy results using minimal computational resources, when the data is heterogeneous and not suited to classical algorithms that are designed for balanced low-dimensional datasets. This research article: (1) presents a modification of the classical version of the binary particle swarm optimization (BPSO) algorithm that introduces a particular type of particles called sensor particles, (2) describes the adaptation of this algorithm for data generated by sensors that monitor DLAs to determine the positions and features of the monitoring sensors that lead to the best classification results, and (3) evaluates and validates the proposed approach using a machine learning methodology that integrates the modified version of the algorithm. The methodology is tested and validated on the Daily Life Activities (DaLiAc) dataset.


Introduction
The development and large-scale deployment of IoT monitoring devices has made significant amounts of data available, which may be used for improving the care of elders. In this context, the identification of daily living activities from monitoring data is very important for understanding the elders' behaviour and assessing deviations from daily routines, which may signal illness progression and support the timely planning of intervention processes. For example, the ReMIND project [1] considers the enhancement of the quality of life of elders with mild neurocognitive impairments, and one major challenge addressed by that project is the stimulation of physical activity through music using James robots.
Therefore, the challenge of the daily living activities (DLAs) classification in real-time using low computational resources is critical as those elders might suffer unexpected falls caused by their critical health condition or sudden switches from one DLA to another DLA caused by memory problems. One way to increase the classification speed is to reduce the number of features of the data samples; however, that challenge is even more complex when the data is imbalanced and heterogeneous for each monitored person. In [2] the reduction of the number of features was addressed by proposing a novel bio-inspired algorithm. However, in that approach, the experimental DLAs datasets were balanced and normalized prior to the application of the feature selection.
The challenges of the aging population [3] have been addressed by various types of information and communication technology (ICT) systems in the last years, but few of them evaluated the challenges related to the acceptance and to the usability of the ambient assisted living (AAL) systems [4].
The main contributions of this article are: (1) the development of a modified version of the standard binary particle swarm optimization (BPSO) algorithm, which introduces sensor particles characterized by weights proportional to the importance of the monitoring sensors and used further in the equations for the updating of the velocities of the standard particles; (2) the adaptation of the proposed algorithm for DLAs data, with these adaptations reflected in the way in which the objective function is defined and in the ranking of the features returned by the proposed algorithm; (3) the evaluation and the validation of the adapted version of the BPSO algorithm using a machine learning methodology developed in-house, which compares the proposed algorithm with other feature selection approaches, and which uses the Daily Life Activities (DaLiAc) dataset [16] as experimental support.
The primary reasons to adapt BPSO for DLAs data are: (1) the flexibility to modify the classical BPSO, either by altering the mathematical equations or by introducing novel concepts in the original version, (2) the fact that the standard algorithms, in particular the standard versions of BPSO, do not fully exploit the nature of the DLAs datasets, and (3) the fact that in the case of the DLAs the data is generated by several monitoring sensors described by groups of features, information that might lead to better results if it is integrated in the feature selection algorithms used in the processing of this type of data.
The remaining sections of the article are organized as follows: Section 2 illustrates the research background, Section 3 presents the materials and methods used in this article, Section 4 describes the most representative results obtained in the experiments, Section 5 presents a discussion of those results, and Section 6 concludes the article.

Background
The first subsection addresses feature selection challenges related to imbalanced data with many features generated by monitoring sensors. The second subsection presents literature approaches that consider feature selection based on the particle swarm optimization (PSO) algorithm [17]. The third subsection considers various approaches from the literature that use the DaLiAc dataset for experimental support.

Feature Selection Challenges for Imbalanced Data Generated by Monitoring Sensors
In [18] the high dimensionality and the class imbalance properties of the datasets are tackled. Compared to the approach presented in this article, the authors of that article are motivated by medical applications from real-world systems. Their feature selection method aims to achieve a better separability between the minority class and the majority class. In the approach presented in this article the majority or the minority properties of the classes of the samples are not considered when the features are selected. However, the approach presented in this article considers the variability of the monitored data for each class for each feature, and it also considers the relations between features.
The authors of [19] address one of the major challenges of feature selection in the case of datasets with many features, namely that the class imbalance problem is ignored, so that in the majority of the cases the selected features are biased towards the majority classes. In the approach presented in this article the imbalance is addressed by the way in which the objective function of the adapted version of the BPSO algorithm is defined, as all the classes are treated equally.
In [20], the authors deal with the following two major challenges in machine learning, namely the high dimensionality and the class-imbalance. The cardinality of the set of features is penalized by a scaling factors technique, while the class-imbalance problem is addressed using a method based on a support vector machine (SVM) [21]. In the approach presented in this article, each class is treated equally when the objective function is defined, and moreover, the number of features is penalized in the objective function, increasing its value when the number of features is small.
The selection of features for imbalanced datasets is challenging, as an effective learning model should be constructed while reducing the memory consumption and the time consumption, two objectives that are addressed in the approach presented in this article. In [22] that challenge is addressed by the rough-set-based feature selection algorithm for imbalanced data (RSFSAID), a novel algorithm for feature selection. Similar to the approach from this article, those authors use PSO. However, in their approach, PSO is applied in the determination of the optimized parameters for the proposed algorithm. They also study the lower and the upper boundary regions when they define the significance of the features. In the approach from this article the minimum and the maximum values of the ranges of variability are considered in the definition of the matrix of variability.
The authors of [23] address the challenge of feature selection in the case of high dimensional imbalanced datasets from the perspective of neglected feature interactions. Many traditional approaches applied in the case of datasets with imbalanced classes are biased towards the majority classes, and the authors of that article propose a method for feature selection based on the interaction information (II). In this article the interactions between the features are considered by defining a matrix that approaches the features from the perspective of the range of variability. Moreover, this article considers particular characteristics of the DLAs datasets that are not approached in that article, and proposes a more generic algorithm for feature selection.

Feature Selection Approaches Based on Particle Swarm Optimization
The authors of [24] apply PSO in feature selection, considering three goals, which are also targeted by the approach presented in this research article: (1) the maximization of the classification performance, (2) the minimization of the number of the selected features and (3) the minimization of the computational time. However, they propose, as future research work, the development of an approach based on PSO, which is multi-objective and which simultaneously minimizes the number of the selected features and maximizes the performance of the classification. In the approach presented in this article, both of those two objectives are considered in the definition of the objective function.
In [25], a method is presented in which the features are selected using a hybrid version of PSO, which also integrates a local search strategy. That method is called hybrid particle swarm optimization with local search (HPSO-LS) and determines efficiently the discriminative features with reduced correlations. The objective function of that approach uses the k-nearest neighbor (k-NN) [26] method, and, between two solutions that return the same accuracy, it considers that solution with the smaller number of features. However, that approach is expensive in terms of computational resources, as it requires the training of a k-NN classifier for each particle in each iteration of the algorithm.
The authors of [27] considered an algorithm for feature selection based on PSO with learning memory (PSO-LM). Compared to the approach from this article, that approach introduces a memory learning strategy with the objective to balance the global exploration and the local exploitation in the algorithm. Moreover, their objective function is similar to the objective function of the adapted version of the BPSO algorithm described in this article, as it uses two parts as follows: The first part corresponds to the accuracy of the prediction model trained using the selected features and the second part corresponds to the ratio between the number of the selected features and the total number of features. However, the approach presented in this article is different, as, instead of using a classifier for the first part of the objective function as in the case of that approach, which considers the k-NN classifier, this article introduces an algorithm, which considers the variability of the monitored data, and which also considers the fact that the data is imbalanced.
In [28], an approach is considered for the joint moment prediction, in which the features are selected using a method based on BPSO and the objective function is based on the variance accounted for (VAF). Similar to the approach from this article, their approach is tested and validated on DLAs data. However, that approach does not consider the minimization of the number of features as a distinct part of the objective function, and the data considered in their experiments for the validation of the approach contains only gait patterns of running; therefore it cannot be generalized very easily.
Another approach for feature selection based on PSO is presented in [29]. The method presented in that article is called hybrid particle swarm optimization with a spiral-shaped mechanism (HPSO-SSM) and is based on the following three improvements: (1) the enhancement of the diversity in the searching process using a logistic map sequence, (2) the introduction of two new parameters in the original formulas, used in the updating of the positions, and (3) the adopting of a spiral-shaped mechanism as a local search operator. Their method presents broad applications, as it is validated using twenty classic benchmark classification datasets from the University of California, Irvine (UCI) machine learning repository. However, compared to the method presented in this article, it is not very focused, as it does not exploit the nature of those datasets at maximum.

Machine Learning Approaches from the Literature that Consider the Daily Life Activities (DaLiAc) Dataset
The Daily Life Activities (DaLiAc) dataset was introduced in [16] as a benchmark dataset that is analyzed in the context of a hierarchical classification system for DLAs. Compared to the method presented in this article, which evaluates the performance of the DLAs classification using 10-fold cross validation, the method presented in that article considers a leave-one-subject-out procedure.
The authors also address the challenge of a high number of features, which leads to a very high computational complexity in the case of embedded systems or real time applications and they even suggest the reduction of the features, giving, as an illustrative example, the sequential feature selection.
The authors of [30] use the DaLiAc dataset as experimental support for a method based on a classifier ensemble of randomized trees, and their objective is to maximize the recognition performance, both in the replacement scenario and in the relocation scenario. Compared to the approach presented in this article, they used several K-fold cross-validation schemes with K from {2, 3, 5, 10, 20} and they also used the accuracy to measure the performance. They also included junk data in their experiments and therefore their results are not directly comparable with the results obtained in this article.
Human activity recognition is approached in [31] by combining a number of classifiers, and the method presented in that article, which is based on a combination of models, outperforms other types of classifier combination models. However, the challenge that still remains is how to increase the robustness of that method against inter-person variability. The method based on BPSO described in this article addresses that challenge partially by assigning a weight to each monitoring sensor, so that the weights are different and adapted for each monitored subject.
The algorithms applied in [32] for improving the accuracy of activity recognition are k-NN, logistic regression (LR), naive Bayes (NB), random forest (RF), extremely randomized trees (ERT) and SVM. The method described in that article consists of four phases, namely segmentation, feature extraction, feature selection and evaluation. In the case of the DaLiAc dataset, the overall mean classification is 89.6%. The work described in this article is inspired by that work as follows: (1) our work considers the selection of the features according to the positions of the sensors that generate them, (2) it considers the RF algorithm, and (3) it applies the proposed research method on several datasets. However, one weakness of that work is that it does not address the importance of each monitoring sensor, even though it addresses the optimal number of sensors, the positions of the sensors, and the optimal number of features for each sensor.
The method presented in [33] considers the DaLiAc dataset from a different perspective than the other articles that use it as experimental support: it converts the inertial sensor signals to images and then applies a convolutional neural network (CNN) [34] for image classification. Even if that method leads to better classification results than the traditional classification methods, it is also subject to the limitations of CNNs, such as sensitivity to class imbalance, overfitting, and poor performance when the input data is small. Moreover, CNNs are also computationally expensive and, in this article, that challenge is addressed by trying to reduce the number of features so that the classifiers can be trained faster and the classification results are optimized.

Materials and Methods
This section presents the main concepts and the pseudo-code of the adapted version of BPSO.

Mathematical Formulation of the Optimization Problem Approached Using an Adapted Variant of the BPSO Algorithm
The objective of the optimization problem is to select an optimal number of features in the case of data generated by sensors that monitor various types of DLAs in order to: (1) minimize the number of selected features and (2) maximize the value of a mathematical formula based on a heuristic introduced by us, which describes the range of variability of the DLAs using the selected features. Those two sub-objectives are reflected in the objective function of the adapted version of the BPSO algorithm.
The optimization variables are represented by arrays of zeros and ones of the form $X = (x_1, \ldots, x_{24})$, such that $x_i \in \{0, 1\}$ for any $i \in \{1, \ldots, 24\}$. A value equal to 0 means that a feature is not selected and a value equal to 1 means that a feature is selected. The search space is thus represented by $2^{24} = 16777216$ possible values of $X$.
The optimization constraint is the maximum number of selected features, which is equal to six, as six is also the number of features generated by each monitoring sensor in the data used as experimental support. The search space is thus limited to fewer values of $X$, more specifically $\sum_{i=0}^{6} \binom{24}{i} = 190051$. That constraint is considered in this article using the ranking of the features according to a heuristic developed in-house, and it is applied to the best results obtained using the adapted version of BPSO when they are compared to the results obtained using other methods in Section 4.
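The two search-space sizes quoted above can be checked with a few lines of Python. This is only an arithmetic check; the constant names are illustrative and do not come from the article.

```python
from math import comb

D = 24            # number of features (four sensors, six features each)
MAX_SELECTED = 6  # constraint on the number of selected features

# Every binary vector of length D is a candidate solution.
unconstrained = 2 ** D

# Candidate solutions with at most MAX_SELECTED features set to 1.
constrained = sum(comb(D, i) for i in range(MAX_SELECTED + 1))

print(unconstrained)  # 16777216
print(constrained)    # 190051
```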

Matrix of Variability of DLAs Sensors Data for Each Monitored Subject
The matrix of variability $M$ has the form:

$$M = \begin{pmatrix} M_{1,1} & \cdots & M_{1,N_{DLAs}} \\ \vdots & \ddots & \vdots \\ M_{D,1} & \cdots & M_{D,N_{DLAs}} \end{pmatrix}$$

where $D$ is the number of features of the data generated by the monitoring sensors, which is also the dimension of the search space in the adapted version of BPSO, and $N_{DLAs}$ is the number of daily living activities. Each line $i$ of the matrix $M$, where $i = 1, \ldots, D$, is a permutation of the set $\{1, 2, \ldots, N_{DLAs}\}$, such that the order of the elements of that set is given by the increasing order of the ranges of variability of each monitored DLA. The range of variability for a DLA $a$ and the corresponding feature $f$ is given by the formula:

$$r\_var_{a,f} = v\_max_{a,f} - v\_min_{a,f}$$

where $r\_var_{a,f}$ is the range of variability for the DLA $a$ and the feature $f$, $v\_max_{a,f}$ is the maximum value monitored by the feature $f$ for the DLA $a$, and $v\_min_{a,f}$ is the minimum value monitored by the feature $f$ for the DLA $a$.
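As a minimal sketch of the construction described above, the following Python code computes the ranges of variability from labelled samples and then orders the activities for each feature. The function names, the input layout (a list of activity and feature-vector pairs), and the tie-breaking by activity identifier are assumptions made for illustration. The reading that row $f$ lists the activity identifiers in increasing order of their range is also one interpretation of the text, not a confirmed detail of the original implementation.

```python
def variability_ranges(samples):
    """samples: iterable of (activity_id, feature_vector) pairs.
    Returns r_var[a][f] = max - min of feature f over the samples of activity a."""
    lo, hi = {}, {}
    for a, x in samples:
        if a not in lo:
            lo[a], hi[a] = list(x), list(x)
        else:
            lo[a] = [min(u, v) for u, v in zip(lo[a], x)]
            hi[a] = [max(u, v) for u, v in zip(hi[a], x)]
    return {a: [h - l for h, l in zip(hi[a], lo[a])] for a in lo}


def variability_matrix(r_var):
    """Row f is a permutation of the activity ids, sorted by increasing
    range of variability for feature f (ties broken by id: an assumption)."""
    activities = sorted(r_var)
    n_features = len(r_var[activities[0]])
    return [sorted(activities, key=lambda a: (r_var[a][f], a))
            for f in range(n_features)]


# Toy data: three activities, two features (values are made up).
samples = [
    (1, [0.0, 5.0]), (1, [2.0, 6.0]),
    (2, [1.0, 0.0]), (2, [1.5, 4.0]),
    (3, [0.0, 1.0]), (3, [10.0, 1.5]),
]
r_var = variability_ranges(samples)
M = variability_matrix(r_var)
print(M)  # [[2, 1, 3], [3, 1, 2]]
```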

Metric for the Evaluation of the Matrix of Variability of DLAs Sensors Data for Each Monitored Subject
The value of the metric $M_\Delta$ used in the evaluation of the matrix of variability $M$ of DLAs sensors data is given by:

Heuristic for Features Ranking in the Optimal Solution Returned by the Adapted BPSO Algorithm for Each Monitored Subject
In order to restrict the number of selected features to a maximum threshold, a matrix $R$ of variability is proposed. The matrix $R = (r\_var_{a,f})$ is defined by the set of all $r\_var_{a,f}$ such that $a$ takes values from the set $\{1, \ldots, N_{DLAs}\}$ and $f$ takes values from the set $\{1, \ldots, D\}$. The feature $f_1$ is considered better than the feature $f_2$ if the value returned by Comparator($f_1$, $f_2$) is greater than or equal to 0. The comparator is defined using a sign function $s$, such that for any real number $x$ the value of $s(x)$ is 1 if $x \geq 0$ and $-1$ otherwise.

Mathematical Description of the Objective Function of the Adapted Version of the BPSO Algorithm
Algorithm 1: The objective function of the adapted BPSO algorithm. The inputs of Algorithm 1 are: $p$, the particle for which the fitness value is computed; $D$, the number of dimensions of the search space; $M$, the precomputed matrix of variability; $M_\Delta$, a metric used for the evaluation of that matrix; and DLAs, the set of daily living activities monitored by the monitoring sensors. The output of the algorithm is $r$, the fitness value of the particle $p$.
Initially, the value of $r$ is 0 (line 1), the value of $x$ is equal to the position of the particle $p$ (line 2) and $F_{selected}$ is an empty set (line 3). For each $d$ from the set $\{1, \ldots, D\}$, if the value of $x_d$ is equal to 1 then the feature $f_d$ is included in the set of selected features $F_{selected}$ (lines 4-8). The initial value of $\Delta$ is 0 (line 9). For each activity $a$ from the set of daily living activities DLAs, the steps from lines 11-23 are repeated. The initial values of $min_a$ and $max_a$ are equal to $+\infty$ (line 11) and $-\infty$ (line 12). For each feature $f$ from the set of features $F_{selected}$, the values of $min_a$ and $max_a$ are updated in lines 14-19 as follows: if the value of $M_{f,a}$ is less than the value of $min_a$ (line 14) then $min_a$ is set to the value of $M_{f,a}$ (line 15), and if the value of $M_{f,a}$ is greater than the value of $max_a$ (line 17) then $max_a$ is set to the value of $M_{f,a}$ (line 18). If $F_{selected} \neq \emptyset$ (line 21) then the value of $\Delta$ is incremented by $|max_a - min_a + 1|$ (line 22). In line 25 the value of $r$ is updated from $\Delta$, $M_\Delta$ and the number of selected features. The ideal value of $r$ is 1 and corresponds to the case when $\Delta = M_\Delta$ and the number of selected features is 0, and the worst value of $r$ is 0 and corresponds to the case when $\Delta = 0$ and all features are selected. Consequently, $r$ takes values from the interval $[0, 1]$ and the objective of the algorithm is to maximize that value. The algorithm returns the final value of $r$ in line 26.
The BPSO algorithm is adapted from the binary version of the PSO algorithm [35]. A method similar to the one presented in [36] is considered in order to restrict the values of the positions of the particles to the set $\{0, 1\}$.
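The walkthrough of Algorithm 1 can be sketched in Python as follows. Two points are assumptions, because the corresponding formulas are not reproduced in the text: the final combination of $\Delta$ and the number of selected features is taken here as an equal-weight average, chosen only because it matches the stated boundary cases ($r = 1$ for $\Delta = M_\Delta$ with no selected features, $r = 0$ for $\Delta = 0$ with all features selected), and `m_delta` is passed in by the caller rather than computed. The function and parameter names are illustrative.

```python
def objective(position, M, m_delta):
    """Fitness of a binary position vector.
    position: list of 0/1 values of length D.
    M: D rows, each a list of N_DLAs integers (the matrix of variability).
    m_delta: normalizing metric for delta (supplied by the caller)."""
    D = len(position)
    selected = [d for d in range(D) if position[d] == 1]
    n_dlas = len(M[0])

    delta = 0
    for a in range(n_dlas):
        lo, hi = float("inf"), float("-inf")
        for f in selected:
            lo = min(lo, M[f][a])
            hi = max(hi, M[f][a])
        if selected:  # skip the update when no feature is selected
            delta += abs(hi - lo + 1)

    # ASSUMPTION: equal-weight average of the two sub-objectives.
    variability_term = delta / m_delta
    sparsity_term = (D - len(selected)) / D
    return 0.5 * variability_term + 0.5 * sparsity_term


# Toy matrix with two features and three activities (made-up values).
M_toy = [[2, 1, 3], [3, 1, 2]]
M_DELTA_TOY = 9  # hypothetical value: 3 activities x largest possible term 3

r_both = objective([1, 1], M_toy, M_DELTA_TOY)
r_first = objective([1, 0], M_toy, M_DELTA_TOY)
r_none = objective([0, 0], M_toy, M_DELTA_TOY)
```

With this toy input, selecting both features yields $\Delta = 5$ and selecting only the first yields $\Delta = 3$, so the sketch trades a larger variability term against a smaller feature count, which is the behaviour the two sub-objectives are meant to balance.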
The inputs of the adapted BPSO algorithm are: $N_{particles}$, the number of particles; $N_{iterations}$, the maximum number of iterations; $c_1$, the acceleration coefficient for the cognitive component; $c_2$, the acceleration coefficient for the social component; $w_{min}$, the minimum value for the inertia; $w_{max}$, the maximum value for the inertia; $V_{min}$, the minimum possible value for the velocity; $V_{max}$, the maximum possible value for the velocity; $D$, the number of dimensions of the search space; and $N_{sensors}$, the number of monitoring sensors. The output $g_{best}$ describes the position of the best particle after the maximum number of iterations $N_{iterations}$.
The adapted BPSO algorithm starts with the initialization of the particles that correspond to each monitoring sensor (line 1) and then for each particle that corresponds to a monitoring sensor, the corresponding fitness value is calculated (lines 2-4). Next, the weights of each sensor particle are computed (lines 5-7) in such a way that the weights are proportional to the fitness values of the sensor particles.
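A minimal sketch of the sensor-particle step, under stated assumptions: a sensor particle is taken to be a position that selects exactly the six features of one sensor, the features are assumed to be stored grouped by sensor, and the weights are normalized so that they sum to 1. None of these details is confirmed by the text, which only states that the weights are proportional to the fitness values; the function names are hypothetical.

```python
def sensor_position(k, n_sensors=4, features_per_sensor=6):
    """Binary position that selects only the features of sensor k (0-based).
    ASSUMPTION: features are laid out contiguously, sensor by sensor."""
    d = n_sensors * features_per_sensor
    start, end = k * features_per_sensor, (k + 1) * features_per_sensor
    return [1 if start <= j < end else 0 for j in range(d)]


def sensor_weights(fitness_values):
    """Weights proportional to the fitness values of the sensor particles.
    ASSUMPTION: normalized to sum to 1; uniform if all fitness values are 0."""
    total = sum(fitness_values)
    if total == 0:
        return [1.0 / len(fitness_values)] * len(fitness_values)
    return [f / total for f in fitness_values]


# Made-up fitness values for four sensor particles.
weights = sensor_weights([0.2, 0.3, 0.1, 0.4])
```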
The particle swarm is initialized randomly (line 8) and the value of the initial iteration $i$ is set to 0 (line 9). The algorithm is then executed for a maximum number of iterations $N_{iterations}$ given as input.
In each iteration, for each particle from the swarm (lines 11-15), the fitness value is computed (line 12) using the ObjectiveFunction, and the functions UpdatedLocalBest and UpdateGlobalBest update the values of $p_{best}$ and $g_{best}$ as follows:

$$g_{best} = \begin{cases} p_{best} & \text{if } Fitness(p_{best}) > Fitness(g_{best}) \\ g_{best} & \text{otherwise} \end{cases}$$
The inertia is updated (line 16) considering the values of $i$, $w_{min}$ and $w_{max}$. Next, for each particle from the swarm (lines 17-30) and for each dimension of the particle (lines 18-29), the new value of the velocity of the particle for dimension $d$ is computed, taking into consideration the inertia, the cognitive component, the social component, and the weights of the sensor particles. If the computed value is not in the interval $(V_{min}, V_{max})$, then that value is updated either to $V_{min}$ or to $V_{max}$ (lines 20-22). The velocity is converted into a probability value (line 23) using the exponential function. In lines 24-28, a random value $r_0$ from $[0, 1]$ is used in order to update the position of the particle. In line 31 the current iteration $i$ is incremented by 1, and finally in line 33 the algorithm returns the value of $g_{best}$.
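The update step can be illustrated with the standard BPSO equations. This is a sketch, not the adapted algorithm itself: the linear inertia schedule, the logistic sigmoid, and the position rule are the textbook BPSO choices and are assumed here because the article's own formulas are not reproduced, and the contribution of the sensor-particle weights to the velocity is deliberately omitted for the same reason. All function names and parameter values are hypothetical.

```python
import math
import random


def inertia(i, n_iterations, w_min, w_max):
    # ASSUMPTION: inertia decreases linearly from w_max to w_min.
    return w_max - (w_max - w_min) * i / n_iterations


def update_particle(x, v, p_best, g_best, w, c1, c2, v_min, v_max, rng):
    """One in-place velocity and position update for a binary particle."""
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        # Inertia + cognitive + social components (sensor weights omitted).
        v[d] = (w * v[d]
                + c1 * r1 * (p_best[d] - x[d])
                + c2 * r2 * (g_best[d] - x[d]))
        # Clamp the velocity to [v_min, v_max].
        v[d] = max(v_min, min(v_max, v[d]))
        # Convert the velocity into a probability with the logistic sigmoid.
        prob = 1.0 / (1.0 + math.exp(-v[d]))
        # Draw r_0 in [0, 1) and set the bit accordingly.
        x[d] = 1 if rng.random() < prob else 0
    return x, v


rng = random.Random(42)
D = 24
x = [rng.randint(0, 1) for _ in range(D)]
v = [0.0] * D
p_best = list(x)
g_best = [1] * 6 + [0] * (D - 6)

for i in range(30):
    w = inertia(i, 30, w_min=0.4, w_max=0.9)  # hypothetical values
    update_particle(x, v, p_best, g_best, w, c1=2.0, c2=2.0,
                    v_min=-6.0, v_max=6.0, rng=rng)
```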

Results
The adapted version of the BPSO algorithm was written in Java, and the feature selection experiments for it were performed in IntelliJ IDEA. The experiments for feature selection using FFS and BFE were performed in Konstanz Information Miner (KNIME) [37]. The experiments for feature selection using RF were performed in JetBrains PyCharm, using the sklearn library from Python. The experiments for feature selection based on the adapted versions of GA and DE were performed in IntelliJ IDEA. The properties of the system on which the experiments were performed are: (1) processor properties: Intel(R) Core(TM) i5-7600K CPU @ 3.80GHz 3.80 GHz; (2) installed memory (RAM) properties: 16.0 GB; (3) system type properties: 64-bit Operating System, x64-based processor.

Machine Learning Methodology for the Classification of DLAs Based on Adapted BPSO
The steps of the machine learning methodology that is applied in the classification of the DLAs using data generated by monitoring sensors are described in more detail in the next subsections. In the Feature Selection step, the methodology integrates the algorithm that is presented in this article, namely the adapted BPSO algorithm. The methodology is also used to validate and to test this algorithm.

DLAs Sensors Data
In Figure 1, the placement of the monitoring sensors for the DaLiAc dataset is illustrated. The main characteristics of the DaLiAc dataset are presented in Table 1. In the case of the DaLiAc dataset the following activities are monitored: (1) $DLA_1$: Sitting;
The DaLiAc dataset contains information about 19 subjects (eight females and eleven males) aged 26 ± 8 years. The data was acquired using four SHIMMER (Shimmer Research, Dublin, Ireland) sensors. Each of those sensors is equipped with a triaxial accelerometer (three features) and a triaxial gyroscope (three features). The accelerometer measures the proper acceleration while the gyroscope measures the angular velocity. The sensors were placed on the left ankle, on the right hip, on the chest and on the right wrist of the monitored subjects. The sampling frequency of those sensors is 200 Hz. Figure 2 summarizes the number of samples of each daily living activity for each monitored subject. As can be seen in the figure, the walking outside activity ($DLA_7$) contains the largest number of samples for each monitored subject, and moreover the samples for each activity are not balanced.

Feature Selection
In this step, various combinations of feature selection techniques are investigated. Four different types of feature selection techniques are applied: (1) the first technique considers the features that correspond to each of the monitoring sensors used for collecting the data: six features (chest sensor), six features (right wrist sensor), six features (left ankle sensor), six features (right hip sensor); (2) the second technique considers the following three feature selection algorithms from the literature: the forward feature selection (FFS) algorithm [38], the backward features elimination (BFE) algorithm [39] and the random forest (RF) [40]; (3) the third technique considers the BPSO algorithm adapted for data generated by monitoring sensors placed on the bodies of the monitored subjects; (4) the fourth technique considers adapted versions of the genetic algorithm (GA) [41] and of differential evolution (DE) [42] for feature selection, using the same objective function as in the case of the adapted BPSO algorithm.

Cross Validation
The approach presented in this article is based on 10-fold cross validation. Thus, the data is split randomly in 10 folds of approximately equal size and the prediction model is run 10 times, such that, in each run, the testing data is represented by a different fold from those 10 folds and the training data is represented by the remaining nine folds.
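The splitting scheme described above can be sketched with the Python standard library. The function name, the fixed seed, and the round-robin assignment of shuffled indices to folds are illustrative choices, not details taken from the article.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)        # random split
    folds = [idx[i::k] for i in range(k)]   # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


splits = list(k_fold_indices(103, k=10))
for train, test in splits:
    # Each run tests on one fold and trains on the remaining nine.
    assert not set(train) & set(test)
```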

Machine Learning Classification Model
The machine learning classification model is used in the classification of the daily living activities performed by the monitored subjects based on the data from the monitoring sensors. The machine learning classification model that is applied in this article is RF. RF is an ensemble classification approach that is based on the development of a collection of decision trees and has applications in many domains, such as medicine, ecology, bioinformatics, astronomy, and agriculture, and, moreover, it is a very good option for imbalanced datasets.

DLAs Classification
The final output of the machine learning methodology is represented by the classified DLAs, considering the raw sensors data generated during the monitoring of the DLAs. The metrics that are applied in the evaluation of the models used for the classification of the DLAs are the recall, the precision, and the F-measure.
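As an illustration of the three evaluation metrics, the sketch below computes the precision, the recall and the F-measure for each class from true and predicted labels. The per-class formulation and the function name are assumptions; the text does not state how the metrics are averaged across the DLAs.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f_measure)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        metrics[c] = (precision, recall, f_measure)
    return metrics


# Made-up labels for three activities.
y_true = [1, 1, 1, 2, 2, 3]
y_pred = [1, 1, 2, 2, 2, 3]
scores = per_class_metrics(y_true, y_pred)
```

Reporting the metrics per class rather than only the overall accuracy keeps minority activities visible when the number of samples per DLA is imbalanced.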

Feature Selection Results for DLAs Data Generated by Monitoring Sensors Using the Adapted BPSO Algorithm
The configuration of the parameters of the adapted BPSO algorithm applied in the experiments is summarized in Table 2.
The mappings between the sensor particle weights and the sensor positions for the DaLiAc dataset are: $t_1$ for C (chest), $t_2$ for RW (right wrist), $t_3$ for LA (left ankle) and $t_4$ for RH (right hip). Table 3 summarizes the weights of the sensor particles for the DaLiAc dataset, for each monitored subject. Table 4 presents a summary of the experiments for each monitored subject $S$, as follows: the mean running time in milliseconds $t_m$, the mean fitness value $f_m$, the mean number of returned features $NRF_m$, and the mean accuracy $a_m$. For each monitored subject, five experiments were performed. The value of $NRF_m$ is from {5, 6, 7, 8}.
In Figure 3, the evolution of the fitness value in each iteration of the adapted BPSO algorithm for each subject is presented. The fitness value is improved after 30 iterations for each monitored subject.

Comparison of the Results Obtained Using the Adapted Version of BPSO for feature selection with the Results Obtained Using Other Methods
Both for the adapted versions of GA and DE for feature selection, the same objective function was applied as in the case of the adapted version of BPSO. Moreover, the number of returned features was restricted to six in the case of GA, DE, and BPSO, using the heuristic for features ranking introduced in this article, which returns the results presented in Appendix A. In the case of the adapted version of BPSO the configuration parameters described in the previous section were applied.
The following configurations are applied in the case of the other methods: (1) FFS and BFE: the standard configurations from KNIME, a threshold for the number of features equal to six and a random drawing strategy; (2) RF: the standard configurations from sklearn, a number of estimators equal to 1000, a maximum number of features equal to six; (3) GA: 20 chromosomes, 30 iterations, CR (crossover rate) = 0.5, MR (mutation rate) = 0.5; (4) DE: 20 agents, 30 iterations, CR (crossover probability) = 0.5, F (differential weight) = 1.0.
Table 5 presents the running times for each feature selection approach. The running time of BPSO is much shorter than the running time of BFE, GA, and DE, slightly longer than the running time of FFS, and comparable to the running time of RF.
Table 5. The running time for forward feature selection (FFS), backward features elimination (BFE), random forest (RF), binary particle swarm optimization (BPSO), genetic algorithm (GA) and differential evolution (DE) for each monitored subject, in milliseconds.
Figure 4 presents the selected features for each feature selection approach, namely FFS, BFE, RF, BPSO, GA, and DE, for each monitored subject.

Comparison of the DLAs Classification Results Obtained Using the Adapted Version of BPSO with the Results Obtained Using Other Methods
Table 6 presents the DLAs classification results for each feature selection approach when the applied classification algorithm is RF. For eight out of the 19 monitored subjects, BPSO returns classification results that are among the top three. Moreover, BPSO returns the best classification result for the ninth monitored subject. This result is promising, as it suggests that further improvements of the algorithm might lead to even better classification results. Another promising observation is that GA returns the best classification result for the fifteenth monitored subject and DE returns the best classification result for the eighteenth monitored subject.
Therefore, further improvements of the objective function and the application of other bio-inspired algorithms might lead to better classification results.
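A count such as "among the top three for eight out of 19 subjects" can be derived from a table like Table 6 along the following lines. This is a minimal sketch: the function names are hypothetical, the tie rule (a method is in the top k if fewer than k methods score strictly higher) is an assumption, and the accuracy values in the usage example are invented placeholders, not values from Table 6.

```python
from typing import Dict


def in_top_k(scores: Dict[str, float], method: str, k: int = 3) -> bool:
    """Return True if fewer than k other methods score strictly higher."""
    strictly_better = sum(
        1 for other, value in scores.items()
        if other != method and value > scores[method]
    )
    return strictly_better < k


def count_top_k(table: Dict[int, Dict[str, float]], method: str, k: int = 3) -> int:
    """Count the subjects for which `method` is among the top k approaches."""
    return sum(in_top_k(row, method, k) for row in table.values())


if __name__ == "__main__":
    # Placeholder accuracies for two fictitious subjects (NOT from Table 6).
    toy_table = {
        1: {"FFS": 0.90, "BFE": 0.91, "RF": 0.92, "BPSO": 0.93, "GA": 0.89, "DE": 0.88},
        2: {"FFS": 0.95, "BFE": 0.94, "RF": 0.93, "BPSO": 0.90, "GA": 0.91, "DE": 0.92},
    }
    print(count_top_k(toy_table, "BPSO"))
```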

Discussion
This section presents a critical discussion that focuses on the following aspects: (1) the results obtained when the adapted BPSO algorithm is applied for each type of DLA and (2) the comparison of the performance of the machine learning approach based on the BPSO algorithm with the performance of other approaches from the literature that consider the same benchmark dataset.

Application of Adapted BPSO Algorithm for DLAs Classification
The experiments were performed in KNIME as follows. The input data for each feature selection approach is read by an Excel Reader (XLS) node. The X-Partitioner node marks the beginning of the cross-validation and is configured with a number of validations equal to 10 and random sampling. Two Number To String nodes convert the labels of the training samples and of the testing samples from number to string, in order to prepare the data for classification. The Random Forest Learner node creates the classification model, with Information Gain Ratio as the split criterion. The Random Forest Predictor node predicts the results, taking as input the model produced by the Random Forest Learner node and the testing data. One String to Number node converts the predicted labels from string back to number. The X-Aggregator node marks the end of the cross-validation, and finally the Scorer node computes the accuracy.
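The cross-validation loop of this workflow can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the KNIME workflow itself: the function names are hypothetical, and `majority_learner` is a trivial stand-in for the Random Forest Learner and Predictor nodes, used only to keep the sketch self-contained.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, activity label)
Model = Callable[[Sequence[float]], int]


def random_folds(n_samples: int, n_folds: int = 10, seed: int = 0) -> List[List[int]]:
    """Randomly partition sample indices into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def majority_learner(train: List[Sample]) -> Model:
    """Stand-in classifier (NOT a random forest): predicts the most frequent label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common


def cross_validated_accuracy(data: List[Sample],
                             learner: Callable[[List[Sample]], Model] = majority_learner,
                             n_folds: int = 10, seed: int = 0) -> float:
    """Train on the other folds, predict the held-out fold, pool all predictions."""
    correct = 0
    for test_idx in random_folds(len(data), n_folds, seed):
        held_out = set(test_idx)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        model = learner(train)
        correct += sum(model(data[i][0]) == data[i][1] for i in test_idx)
    # Accuracy over the pooled predictions of all folds.
    return correct / len(data)
```

Pooling the predictions of all folds before scoring mirrors the role of the aggregation step that precedes the accuracy computation; a real experiment would replace `majority_learner` with an actual random forest implementation.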
In Figure 5, the classification results for the DaLiAc Dataset for each monitored subject are presented. The best results were obtained in the case of the activities sitting (DLA_1) and lying (DLA_2) and the worst results were obtained in the case of the activities bicycling (50 watt) (DLA_11) and bicycling (100 watt) (DLA_12).

Comparison of the Performance of the BPSO Based Approach with the Performance of Literature Approaches
In this subsection, the results obtained using the machine learning methodology based on BPSO are compared with some of the best literature results. Table 7 compares the results obtained using the adapted version of the BPSO algorithm with the results reported in the literature. As can be seen in the table, the approach proposed in this article returns results comparable to the ones from the literature. The method presented in [33] returns much better results; however, our approach considers a maximum of six features, whereas that approach uses all features, so the two approaches are not directly comparable.

Conclusions
In this article, we presented a novel approach for feature selection in the case of data that is generated by sensors that monitor various types of DLAs. The method was tested and validated on the DaLiAc dataset. The experimental results show that the adapted version of BPSO presented in this article is comparable to other classical algorithms such as FFS, BFE, and RF, in terms of running time and classification performance.
The running time of the adapted version of BPSO is better than the running time of BFE, and the developed objective function also returns promising results in combination with other bio-inspired algorithms, such as GA and DE. Moreover, for some monitored subjects the approach based on BPSO returns results that are much better than the ones obtained using classical algorithms for feature selection. Even if, for some monitored subjects, the proposed method based on BPSO returns results that are not as good as the ones obtained using classical methods, further improvements of the objective function might lead to better results.
For example, the method presented in [33] performs significantly better; however, it requires complex computational resources and image processing transformations.
As future research work, the following research directions are proposed: (1) the testing and the validation of the proposed methodology using other validation types, such as the leave-one-subject validation type or the leave-two-subjects validation type, (2) the proposal of another heuristic for the ranking of the features returned by the adapted version of BPSO and (3)

Funding: This research was funded by the Romanian National Authority for Scientific Research and Innovation, CCCDI-UEFISCDI, and by the AAL Programme with co-funding from the European Union's Horizon 2020 research and innovation programme, grant number AAL59/2018 ReMIND, within PNCDI III. The APC was funded by the Technical University of Cluj-Napoca through the grants for scientific research support programme.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix A
The ranking of the features for each monitored subject S of the DaLiAc dataset in the adapted version of BPSO introduced in this article is presented next: (