Motion Assessment for Accelerometric and Heart Rate Cycling Data Analysis

Motion analysis is an important topic in the monitoring of physical activities and recognition of neurological disorders. The present paper is devoted to motion assessment using accelerometers inside mobile phones located at selected body positions and the records of changes in the heart rate during cycling, under different body loads. Acquired data include 1293 signal segments recorded by the mobile phone and the Garmin device for uphill and downhill cycling. The proposed method is based upon digital processing of the heart rate and the mean power in different frequency bands of accelerometric data. The classification of the resulting features was performed by the support vector machine, Bayesian methods, k-nearest neighbor method, and neural networks. The proposed criterion is then used to find the best positions for the sensors with the highest discrimination abilities. The results suggest the sensors be positioned on the spine for the classification of uphill and downhill cycling, yielding an accuracy of 96.5% and a cross-validation error of 0.04 evaluated by a two-layer neural network system for features based on the mean power in the frequency bands 〈3,8〉 and 〈8,15〉 Hz. This paper shows the possibility of increasing this accuracy to 98.3% by the use of more features and the influence of appropriate sensor positioning for motion monitoring and classification.


Introduction
Recognition of human activities based on acceleration data [1][2][3][4][5][6] and their analysis by signal processing methods, computational intelligence, and machine learning, forms the basis of many systems for rehabilitation monitoring and evaluation of physical activities. Extensive attention has been paid to the analysis of these signals and their multimodal processing with further biomedical data [7,8] for feature extraction, classification, and human-computer interactions. Methods of motion detection and its analysis by accelerometers and global positioning systems (GPS) are also used for studies of physical activities including cycling [9][10][11][12][13][14], as assessed in this paper.
The present paper is devoted to the use of these systems to recognize selected motion activities using data acquired by accelerometers in mobile phones [42][43][44], with positioning and heart rate (HR) data simultaneously recorded by the Garmin system [45,46]. The locations of the accelerometric and Garmin sensors used to monitor the motion and heart rate data are presented in Figure 1. The methods used for the data processing include data de-noising, statistical methods, neural networks [47], and deep learning [48][49][50][51] methods with convolutional neural networks.
The main goal of the present paper is the analysis of accelerometric and heart rate signals to contribute to monitoring physical activities and to the assessment of rehabilitation exercises [11,52]. Selected sensors were used for the analysis of data recorded during cycling in different conditions, extending the results recorded on the exercise bike [53,54]. The proposed mathematical tools include the use of neural networks [55], machine learning for pattern recognition, and the application of signal processing methods for data analysis to enable the monitoring of selected physiological functions. Figure 1a presents the location of sensors for the acquisition of accelerometric, positioning, and heart rate data during cycling experiments with different loads. Both the mobile phone at different locations (for accelerometric data recording) and the Garmin system (for the simultaneous recording of GPS data and the heart rate) were used for data acquisition. Sample signals for uphill and downhill cycling are shown in Figure 1b-d.

Data Acquisition
The GPS and motion data (time stamps, longitude, latitude, altitude, cycling distance, and the cycling speed) were simultaneously measured by a Garmin fitness watch (Fenix 5S, Garmin Ltd., Schaffhausen, Switzerland). The heart rate data were acquired by a Garmin chest strap connected wirelessly to the Garmin watch. All data sets were acquired during cycling experiments performed by a healthy and trained adult volunteer. Records were subsequently stored on the Garmin Connect website, exported in the Training Center (TCX) format (used for data exchange between fitness devices), converted to comma-separated values (CSV), and imported into the MATLAB software for further processing.
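The TCX-to-CSV conversion step described above can be sketched as follows. This is only an illustration in Python, not the authors' toolchain (the paper reports MATLAB for the later processing). The function names `tcx_to_rows` and `rows_to_csv` and the CSV column names are hypothetical; the element names follow the public Garmin Training Center Database v2 schema. The cycling speed is usually stored in a vendor extension element and is left out of this sketch.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of the Garmin Training Center Database v2 schema used by TCX files.
NS = {"tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"}

# Hypothetical CSV column names mapped to element paths below each Trackpoint.
FIELDS = {
    "time": "tcx:Time",
    "latitude_deg": "tcx:Position/tcx:LatitudeDegrees",
    "longitude_deg": "tcx:Position/tcx:LongitudeDegrees",
    "altitude_m": "tcx:AltitudeMeters",
    "distance_m": "tcx:DistanceMeters",
    "heart_rate_bpm": "tcx:HeartRateBpm/tcx:Value",
}


def tcx_to_rows(tcx_text):
    """Return one dict per track point; missing fields become empty strings."""
    root = ET.fromstring(tcx_text)
    rows = []
    for point in root.iterfind(".//tcx:Trackpoint", NS):
        rows.append({
            name: point.findtext(path, default="", namespaces=NS)
            for name, path in FIELDS.items()
        })
    return rows


def rows_to_csv(rows):
    """Serialize the track points as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELDS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

All values are kept as strings so that the conversion stays lossless; numeric parsing can be done when the CSV is imported.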
A summary of the cycling segments for specific locations of the mobile phone used for accelerometric data acquisition is presented in Table 1.
The original mean sampling frequency was 142 Hz (changing in the range 〈15, 300〉 Hz with the standard deviation STD = 114) for accelerometric data and 0.48 Hz (changing in the range 〈0.2, 1〉 Hz, STD = 0.27) for heart rate data.
Table 2 presents the categories used for the classification. They were selected according to the profile of the terrain, its slope being evaluated by the Garmin GPS system. The individual categories include: (i) c(1)-HillUp; (ii) c(2)-HillDown; (iii) c(3)-SteepHillUp; and (iv) c(4)-SteepHillDown cycling. A sample time segment of the modulus of the accelerometric data simultaneously recorded by the mobile phone at the selected location (the RightLeg) is presented in Figure 1d. All procedures involving human participants were in accordance with the ethical standards of the institutional research committee and with the 1964 Helsinki Declaration and its later amendments.

Signal Processing
The proposed data processing method began with an initial analysis of the data. The total number of 1293 cycling segments was reduced to 1254 segments in this initial step, to exclude those with a standard deviation of the speed higher than a selected fraction of its mean value. This process excluded 3% of the cycling segments with gross errors and problems on the cycling route, as specified in Table 1.
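The segment screening rule above can be illustrated by a short sketch. The paper does not state the selected fraction, so the threshold `max_fraction = 0.5` below is purely an assumption, and the function name `keep_segment` is hypothetical.

```python
import numpy as np


def keep_segment(speed, max_fraction=0.5):
    """Return True if std(speed) does not exceed max_fraction * mean(speed).

    max_fraction is an assumed value; the source only refers to
    "a selected fraction" of the mean speed.
    """
    speed = np.asarray(speed, dtype=float)
    mean_speed = speed.mean()
    if mean_speed <= 0:
        return False  # a segment without forward motion is rejected
    return bool(speed.std() <= max_fraction * mean_speed)


# Usage: filter a list of per-segment speed records.
segments = [[5.0, 5.2, 4.9, 5.1], [0.1, 9.0, 0.2, 8.5]]
kept = [s for s in segments if keep_segment(s)]
```

A steady ride passes the test, while a segment with stops and sudden accelerations is excluded.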
In the next step, the linear acceleration data without additional gravity components were processed. The modulus $A_q(n)$ of the accelerometric data was evaluated from the components $Ax_q(n)$, $Ay_q(n)$, and $Az_q(n)$ recorded in three directions:

$$A_q(n) = \sqrt{Ax_q(n)^2 + Ay_q(n)^2 + Az_q(n)^2} \tag{1}$$

for all values $n = 0, 1, 2, \cdots, N-1$ in each segment $q = 1, 2, \cdots, Q(pos)$, $N$ values long, for all classes and at positions $pos$ specified in Table 1. The Garmin data were used to evaluate the mean heart rate, cycling speed, and the mean slope in each segment. Owing to the slightly changing time period during each observation, the initial preprocessing step included the linear interpolation into a vector of uniformly spaced instants with the same endpoints and number of samples.
The processing of multimodal records $\{s(n)\}_{n=0}^{N-1}$ of the accelerometric and heart rate signals was performed by similar numerical methods. In the initial stage, their de-noising was performed by finite impulse response (FIR) filtering of a selected order $M$, resulting in a new sequence $\{x(n)\}_{n=0}^{N-1}$ using the relation

$$x(n) = \sum_{k=0}^{M-1} b(k)\, s(n-k) \tag{2}$$

with coefficients $\{b(k)\}_{k=0}^{M-1}$ forming a filter of the selected type and cutoff frequencies. In the present study, the selected cutoff frequency $f_c = 60$ Hz was used for the antialiasing low pass FIR filter of the order $M = 4$. It allowed signal resampling for this new sampling frequency.
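The three preprocessing operations described above (modulus of the acceleration, linear interpolation onto uniformly spaced instants, and causal FIR filtering) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' MATLAB code. The function names are hypothetical, and the filter coefficients in the usage lines are an assumed 4-tap moving average, since the paper does not list its coefficients.

```python
import numpy as np


def modulus(ax, ay, az):
    """Modulus of the acceleration vector for each sample."""
    ax, ay, az = (np.asarray(v, dtype=float) for v in (ax, ay, az))
    return np.sqrt(ax**2 + ay**2 + az**2)


def resample_uniform(t, s):
    """Linearly interpolate s(t) onto uniformly spaced instants.

    The new time vector keeps the same endpoints and number of samples.
    """
    t = np.asarray(t, dtype=float)
    s = np.asarray(s, dtype=float)
    t_uniform = np.linspace(t[0], t[-1], t.size)
    return t_uniform, np.interp(t_uniform, t, s)


def fir_filter(s, b):
    """Causal FIR filter x(n) = sum_k b(k) s(n-k), with s(n) = 0 for n < 0."""
    s = np.asarray(s, dtype=float)
    return np.convolve(s, b)[: s.size]


# Usage with an assumed 4-tap moving-average filter (illustrative only).
b = np.ones(4) / 4
t_u, a_u = resample_uniform([0.0, 0.01, 0.03, 0.04], modulus([3, 0, 0, 3], [4, 0, 0, 4], [0, 1, 1, 0]))
a_smooth = fir_filter(a_u, b)
```

Truncating the full convolution to the first N samples keeps the output aligned with the input, at the cost of a start-up transient over the first M-1 samples.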
The accelerometric data were processed to evaluate the signal spectrum, covering the full frequency range of $\langle 0, f_s/2 \rangle = \langle 0, 30 \rangle$ Hz related to the sampling theorem. The mean normalized power components in 4 sub-bands were then evaluated to define the features of each segment $q = 1, 2, \cdots, Q(pos)$ for each class and sensor position. The resulting feature vector $F(:, q)$ includes, in each of its columns $q$, the relative mean power values in the frequency bands $\langle f_{c1}, f_{c2} \rangle$ Hz, which form a complete filter bank covering the frequency ranges of $\langle 0, 3 \rangle$, $\langle 3, 8 \rangle$, $\langle 8, 15 \rangle$, and $\langle 15, 30 \rangle$ Hz. The next row of the feature vector includes the mean heart rate in each segment $q = 1, 2, \cdots, Q(pos)$.
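The five features of one segment (relative power in four bands plus the mean heart rate) can be sketched as below. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, the bands are treated as half-open intervals so that each spectral bin is counted once (an assumption, since the source does not say how band edges are assigned), and the DC component is kept in the lowest band.

```python
import numpy as np

# Band edges in Hz taken from the text; fs = 60 Hz gives a 30 Hz Nyquist limit.
BANDS_HZ = [(0, 3), (3, 8), (8, 15), (15, 30)]


def relative_band_power(y, fs, bands=BANDS_HZ):
    """Relative power of y in each band, normalized by the total power.

    Bands are half-open [lo, hi) except the last one, which also
    includes its upper edge (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    power = np.abs(np.fft.rfft(y)) ** 2           # bins k = 0 .. N/2
    freqs = np.fft.rfftfreq(y.size, d=1.0 / fs)   # f_k = k * fs / N
    total = power.sum()
    if total == 0:
        return np.zeros(len(bands))
    result = []
    for i, (lo, hi) in enumerate(bands):
        upper = freqs <= hi if i == len(bands) - 1 else freqs < hi
        result.append(power[(freqs >= lo) & upper].sum() / total)
    return np.array(result)


def feature_vector(acc_modulus, heart_rate, fs):
    """Four relative band powers followed by the mean heart rate (R = 5)."""
    return np.append(relative_band_power(acc_modulus, fs), np.mean(heart_rate))
```

For a pure 5 Hz oscillation sampled at 60 Hz, nearly all the relative power falls in the second band, and the four band powers always sum to one.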
Each of the selected spectral features of a signal segment $\{y(n)\}_{n=0}^{N-1}$, $N$ samples long, was evaluated using the discrete Fourier transform, in terms of the relative power $PV$ in a specified frequency band $\langle f_{c1}, f_{c2} \rangle$ Hz, as follows:

$$PV = \frac{\sum_{k \in \Phi} |Y(k)|^2}{\sum_{k=0}^{N/2} |Y(k)|^2}, \qquad Y(k) = \sum_{n=0}^{N-1} y(n) \exp\left(-j \frac{2\pi}{N} k n\right) \tag{3}$$

where $\Phi$ is the set of indices for which the frequencies $f_k = \frac{k}{N} f_s \in \langle f_{c1}, f_{c2} \rangle$ Hz.
The validity of a pair of features F1, F2 selected from the feature vector $F(:, q)$ for all segments $q = 1, 2, \cdots, Q(pos)$ related to specific classes $c(k)$ and $c(l)$ and positions $pos$ was evaluated by the proposed criterion $Z_{pos}(k, l)$ for cluster couples $k, l$ defined by the relation:

$$Z_{pos}(k, l) = D_{pos}(k, l) - ST_{pos}(k, l) \tag{4}$$

where

$$D_{pos}(k, l) = \lVert C_k - C_l \rVert \tag{5}$$

$$ST_{pos}(k, l) = \mathrm{std}(C_{pos}(k)) + \mathrm{std}(C_{pos}(l)) \tag{6}$$

using the Euclidean distance $D_{pos}(k, l)$ between the cluster centers $C_k$, $C_l$ of the features associated with classes $k$ and $l$, respectively, and the sum $ST_{pos}(k, l)$ of their standard deviations. For well-separated and compact clusters, this criterion should take a value larger than zero.
Signal analysis resulted in the evaluation of the feature matrix $P_{R,Q}$. The feature vector $[p(1, q), p(2, q), \cdots, p(R, q)]$ in each of its columns includes both the mean power in specific frequency ranges and the mean heart rate. The target vector $TV_{1,Q} = [t(1), t(2), \cdots, t(Q)]$ includes the associated terrain specification according to Table 2 with selected results in Figure 2. Different classification methods were then applied to evaluate these features.
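The cluster separation criterion can be illustrated by the sketch below. Two points are assumptions rather than facts from the source: the criterion is taken as the difference between the center distance and the summed spreads (consistent with the statement that it should exceed zero for well-separated, compact clusters), and the scalar spread of a cluster is taken as the root-mean-square distance of its points from the cluster center. The function names are hypothetical.

```python
import numpy as np


def cluster_spread(points):
    """Assumed scalar spread: RMS Euclidean distance from the cluster center."""
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    return np.sqrt(np.mean(np.sum((points - center) ** 2, axis=1)))


def separation_criterion(cluster_k, cluster_l):
    """Z = D - ST for two clusters given as (n_points, n_features) arrays."""
    ck = np.asarray(cluster_k, dtype=float)
    cl = np.asarray(cluster_l, dtype=float)
    d = np.linalg.norm(ck.mean(axis=0) - cl.mean(axis=0))  # center distance D
    st = cluster_spread(ck) + cluster_spread(cl)            # summed spreads ST
    return d - st


# Usage: two compact clusters far apart give Z > 0, overlapping ones give Z < 0.
a = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
z_far = separation_criterion(a, a + [10, 0])
z_near = separation_criterion(a, a + [0.5, 0])
```

Evaluating this value for every sensor position and choosing the largest one mirrors how the criterion is used in the text to rank positions.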

Pattern Recognition
Pattern values in the feature matrix $P_{R,Q}$ and the associated target vector $TV_{1,Q}$ were then used for classifying all $Q$ feature vectors into separate categories. System modelling was performed by a support vector machine (SVM), a Bayesian method, the k-nearest neighbour method, and a neural network [22,55,56,57]. Both the accuracies and the cross-validation errors were then compared with the best results obtained by the two-layer neural network.
The machine learning [57,58] was based on the optimization of the classification system with $R = 5$ input values (corresponding to the features evaluated as the mean power in four frequency bands and the mean heart rate) and $S2$ output units in the learning stage. The target vector $TV_{1,Q}$ was transformed to the target matrix $T_{S2,Q}$ with unit elements in the rows corresponding to each class, in the range $\langle 1, S2 \rangle$, to enable evaluating the probability of each class.
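The transformation of the target vector into a target matrix with one unit element per column can be sketched as follows. This is a minimal illustration with a hypothetical function name, assuming class labels are the integers 1 to S2.

```python
import numpy as np


def target_matrix(tv, n_classes):
    """Build an (n_classes, Q) matrix with a single 1 in each column.

    Column q has its unit element in row tv[q] - 1, since class labels
    are assumed to start at 1.
    """
    tv = np.asarray(tv, dtype=int)
    t = np.zeros((n_classes, tv.size))
    t[tv - 1, np.arange(tv.size)] = 1.0
    return t


# Usage: four segments labeled with classes 1, 3, 2, and 4.
T = target_matrix([1, 3, 2, 4], n_classes=4)
```

Each column can then be compared directly with the class probabilities produced by a softmax output layer.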
In the case of the neural network classification model, the pattern matrix $P_{R,Q}$ formed the input of the two-layer neural network structure with sigmoidal and softmax transfer functions presented in Figure 3a and used to evaluate the values by the following relations:

$$A1_{S1,Q} = f1(W1_{S1,R}\, P_{R,Q},\; b1_{S1,1}) \tag{7}$$

$$A2_{S2,Q} = f2(W2_{S2,S1}\, A1_{S1,Q},\; b2_{S2,1}) \tag{8}$$
For each column vector in the pattern matrix, the corresponding target vector has one unit element in the row pointing to the correct target value. The network coefficients include the elements of the matrices $W1_{S1,R}$ and $W2_{S2,S1}$ and associated vectors $b1_{S1,1}$ and $b2_{S2,1}$. The proposed model uses the sigmoidal transfer function $f1$ in the first layer and the probabilistic softmax transfer function $f2$ in the second layer. The values of the output layer, based on the Bayes theorem [22], using the function

$$f2(\cdot) = \frac{\exp(\cdot)}{\mathrm{sum}(\exp(\cdot))} \tag{9}$$

provide the probabilities of each class. Figure 3b illustrates the pattern matrix formed by $Q$ column vectors of $R = 5$ values representing the mean power in 4 frequency bands and the mean heart rate. Figure 3c presents the associated target matrix for a selected position of the accelerometric sensor.
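The forward pass of the two-layer network described above can be sketched in a few lines. This is an illustration only: it assumes that each layer applies its transfer function to the weighted input plus the bias, the hidden layer size S1 = 10 and the random weights are hypothetical, and no training is shown.

```python
import numpy as np


def sigmoid(z):
    """Sigmoidal transfer function of the first layer."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Column-wise softmax; the maximum is subtracted for numerical stability."""
    z = z - z.max(axis=0, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)


def forward(p, w1, b1, w2, b2):
    """Return hidden activations A1 (S1, Q) and class probabilities A2 (S2, Q)."""
    a1 = sigmoid(w1 @ p + b1)
    a2 = softmax(w2 @ a1 + b2)
    return a1, a2


# Usage with assumed sizes: R = 5 features, S1 = 10 hidden units, S2 = 2 classes.
rng = np.random.default_rng(0)
P = rng.normal(size=(5, 7))                      # Q = 7 feature vectors in columns
W1, B1 = rng.normal(size=(10, 5)), rng.normal(size=(10, 1))
W2, B2 = rng.normal(size=(2, 10)), rng.normal(size=(2, 1))
A1, A2 = forward(P, W1, B1, W2, B2)
predicted_class = A2.argmax(axis=0) + 1          # class labels start at 1
```

Each column of `A2` sums to one, so its entries can be read as class probabilities for the corresponding segment.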

[Figure 3c. Target values for the Spine2 position: rows t(1,k), t(2,k), t(3,k), and t(4,k) for Class (1) to Class (4).]

The associated performance metrics [59] can then be used to evaluate:
• Sensitivity (the true positive rate, the recall) and specificity (the true negative rate):

$$TPR = \frac{TP}{TP + FN}, \qquad TNR = \frac{TN}{TN + FP}$$

• Accuracy:

$$AC = \frac{TP + TN}{TP + TN + FP + FN}$$

• Precision (the positive predictive value) and F1-score (the harmonic mean of the precision and sensitivity):

$$PPV = \frac{TP}{TP + FP}, \qquad F1s = \frac{2 \cdot PPV \cdot TPR}{PPV + TPR}$$

where TP, TN, FP, and FN denote the numbers of true positive, true negative, false positive, and false negative classifications, respectively.
Cross-validation errors [60] were then evaluated as a measure of the generalization abilities of classification models using the leave-one-out method.

[Table 3]

The validity of pairs of features F(i) and F(j) for separating classes c(k) and c(l) was then evaluated using the proposed criterion specified by Equation (4). Figure 4 presents the evaluation of two classes (c(3)-SteepHillUp and c(4)-SteepHillDown) with cluster centers for different locations of the sensors, and associated values of the criterion function for features evaluated as the mean power in the frequency range 〈0, 3〉 Hz and the mean heart rate (Figure 4a,b) and the mean power in frequency ranges 〈3, 8〉 Hz and 〈15, 30〉 Hz (Figure 4c,d). The results presented here show that the highest discrimination abilities are possessed by a sensor located at the Spine2 position.
The separate rows of Table 4 present the accuracy AC [%] and cross-validation errors for the classification of class A (HillUp: c(1)+c(3)) and class B (HillDown: c(2)+c(4)) for different locations of the sensors, chosen features (F1-frequency range 〈3, 8〉 Hz, F2-frequency range 〈8, 15〉 Hz) and selected classification methods. The highest accuracy and the lowest cross-validation errors were achieved for the Spine2 location of the accelerometric sensors with all classification methods.
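The performance metrics used in the result tables (sensitivity, specificity, accuracy, precision, and F1-score) follow the standard confusion-matrix definitions and can be computed as in the sketch below. The function names and the example counts are hypothetical and are not taken from the paper's tables.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN for a two-class problem with a chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def binary_metrics(tp, tn, fp, fn):
    """Standard metrics; assumes each denominator is non-zero."""
    tpr = tp / (tp + fn)                   # sensitivity (recall)
    tnr = tn / (tn + fp)                   # specificity
    acc = (tp + tn) / (tp + tn + fp + fn)  # accuracy
    ppv = tp / (tp + fp)                   # precision
    f1 = 2 * ppv * tpr / (ppv + tpr)       # harmonic mean of precision and recall
    return {"TPR": tpr, "TNR": tnr, "AC": acc, "PPV": ppv, "F1s": f1}


# Usage with made-up counts: class A is treated as the positive class.
m = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

Multiplying the returned values by 100 gives the percentages in which accuracy and F1-score are reported in the text.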
Table 5 presents the accuracy AC [%], specificity (TNR), sensitivity (TPR), F1-score (F1s), and cross-validation errors CV for classification into classes A and B by the neural network model for different locations of sensors and 5 features F1-F5 including the power in all four frequency bands and the mean heart rate in each cycling segment. The highest accuracy, 98.3%, was again achieved for the Spine2 position of the accelerometric sensor, which also had the highest F1-score of 98.2%.
The comparison of neural network classification for two and five features is presented in Figure 6 related to Tables 4 and 5. Cross-validation errors are evaluated by the leave-one-out method in all cases. Figure 6a shows that the accuracy increased by 6.17% on average, with the most significant increase for locations with the lowest accuracy, including the arm and neck positions. In a similar way, an increase in the number of features from two to five decreased the cross-validation error on average by 8.72%, as presented in Figure 6b. This decrease was again most significant for locations with the lowest accuracy and the highest error, which included the arm and neck positions.
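The leave-one-out cross-validation error mentioned above can be illustrated with a simple k-nearest neighbour classifier, one of the methods named in the paper. This sketch is not the authors' implementation: the value k = 3 is assumed, the function names are hypothetical, and the feature vectors are stored in rows here (the transpose of the column layout used in the text).

```python
import numpy as np


def knn_predict(train_x, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbours."""
    distances = np.linalg.norm(train_x - x, axis=1)
    nearest = train_y[np.argsort(distances)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]


def leave_one_out_error(x, y, k=3):
    """Fraction of samples misclassified when each one is held out in turn."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    errors = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # train on all samples except i
        if knn_predict(x[keep], y[keep], x[i], k) != y[i]:
            errors += 1
    return errors / len(y)


# Usage: two well-separated clusters of two-feature vectors give zero error.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
Y = np.array([1, 1, 1, 1, 2, 2, 2, 2])
cv_error = leave_one_out_error(X, Y)
```

The same loop applies to any classifier: only the `knn_predict` call needs to be replaced by a routine that trains on the retained samples and predicts the held-out one.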

Conclusions
This paper has presented the use of selected methods of machine learning and digital signal processing in the evaluation of motion and physical activities using wireless sensors for acquiring accelerometric and heart rate data. A mobile phone was used to record the accelerometric data at different body positions during cycling, under selected environmental conditions.
The results suggest that accelerometric data and the associated signal power in selected frequency bands can be used as features for the classification of different motion patterns to recognize cycling terrain and downhill and uphill cycling.
The proposed criterion selected the most appropriate position for classification of motion: it was the Spine2 position. All classification methods, including a support vector machine, a Bayesian method, the k-nearest neighbour method, and a two-layer neural network, were able to distinguish specific classes with an accuracy higher than 90%. The best results were achieved by the two-layer neural network and Spine2 position with an accuracy of 96.5% for two features, which was increased to 98.3% for five features.
These results correspond with those achieved during cycling on a home exercise bike [4,54] with different loads and additional sensors, including thermal cameras.
It is expected that further studies will be devoted to the analysis of more extensive data sets acquired by new non-invasive sensors, enabling the detection of further motion features with higher sampling frequencies. Special attention will be devoted to further multichannel data processing tools and deep learning methods with convolutional neural networks to improve the possibilities of remote monitoring of physiological functions.