Author Contributions
Conceptualization, methodology, formal analysis, software, validation, and writing—original draft preparation, M.Q.; writing—review and editing, literature review, synthetic data generation, and data and results visualization, S.C.U.; supervision, resources, project administration, and funding acquisition, E.A. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Accelerometer time sequence: accelerations (dashed red, green, blue), with magnitude (solid).
Figure 2.
Workflow of signal processing and analysis.
Figure 3.
(a) Moving average smoothing effect. (b) Step length influence on normal step length. (c) MinMax feature scaling before normalization. (d) MinMax feature scaling after normalization.
Figure 4.
Random forest classifier confusion matrix performance on time-domain features with p-values < 0.05.
Figure 5.
J48 classifier confusion matrix performance on frequency-domain features with p-values < 0.05.
Figure 6.
Random forest classifier confusion matrix performance on wavelet-domain features with p-values < 0.05.
Figure 7.
J48 classifier confusion matrix performance on statistical domain features with p-values < 0.05.
Figure 8.
Random forest classifier confusion matrix performance on information-theoretic domain features with p-values < 0.05.
Figure 9.
Random forest classifier confusion matrix performance on all domain features with p-values < 0.05.
Table 1.
Accelerometer gait features, their original use cases, and units or variable types.
Feature Name | Applied Cases | Unit/Variable Type |
---|
Number of Steps | Alcohol Usage [39], Parkinson’s Disease [48,49] | Count |
Average Step Time | Alcohol Usage [39,50], Parkinson’s Disease [48,49] | Seconds (s) |
Average Cadence | Alcohol Usage [39], Parkinson’s Disease [48] | Steps per minute |
Skewness | Alcohol Usage [39], Paraspinal Assessment [51] | Dimensionless (statistical) |
Kurtosis | Alcohol Usage [39], Neuron Discharge [52] | Dimensionless (statistical) |
Coefficient of Variation of Step Time | Parkinson’s, Peripheral Neuropathy [53] | Dimensionless (ratio or %) |
Harmonic Ratio | Parkinson’s Disease, Peripheral Neuropathy [53] | Dimensionless (ratio) |
Average Step Length | Alcohol Usage [39], Parkinson’s Disease [49] | Meters (m) |
Gait Velocity | Alcohol Usage [39], Parkinson’s Disease [49] | Meters/second (m/s) |
Minimum and Maximum Difference | Parkinson’s Disease [54] | Acceleration (m/s²) |
Standard Deviation | Parkinson’s, Peripheral Neuropathy [53,54], Paraspinal [51] | m/s² (or unit of original signal) |
Root Mean Square | Parkinson’s Disease [54] | m/s² |
Entropy Rate | Parkinson’s, Peripheral Neuropathy [53], Neural Control [55], Heart [56] | Bits or dimensionless |
Regression Line for Local Maxima and Minima | Parkinson’s Disease [54] | Slope (dimensionless) |
Average Power | Alcohol Usage [39], Paraspinal Assessment [51] | Power (a.u. or m²/s³) |
Ratio of Spectral Peak | Alcohol Usage [39] | Dimensionless (ratio) |
Signal-to-Noise Ratio (SNR) | Alcohol Usage [39], Coronary Artery [57] | Decibels (dB) |
Total Harmonic Distortion (THD) | Alcohol Usage [39] | Percentage (%) or dB |
Energy in Band 0.5–3 Hz | Parkinson’s Disease [54] | Energy (a.u.) |
Windowed Energy in Band 0.5–3 Hz | Parkinson’s Disease [54] | Energy (a.u.) |
Peak Frequency | Parkinson’s, Peripheral Neuropathy [53], Paraspinal [51] | Hertz (Hz) |
Spectral Centroid | Parkinson’s Disease, Peripheral Neuropathy [53] | Hertz (Hz) |
Bandwidth | Parkinson’s, Peripheral Neuropathy [53] | Hertz (Hz) |
Regression Line for Windowed Energy | Parkinson’s Disease [54] | Slope (dimensionless) |
Wavelet Bandwidth | Parkinson’s, Peripheral Neuropathy [53] | Hertz (Hz) |
Wavelet Entropy Rate | Parkinson’s, Dysphagia, Neural Control [58,59] | Bits or dimensionless |
Zeroth-Lag Cross-Correlation Coefficient | Parkinson’s, Peripheral Neuropathy [53] | Correlation coefficient (−1 to 1) |
Lempel-Ziv Complexity | Parkinson’s, Peripheral Neuropathy [53], EEG [60] | Dimensionless (complexity score) |
Table 2.
Time-domain features extracted from accelerometer data.
Feature | Description | Formula |
---|
Number of Steps (numSteps) | The number of steps taken in a given time interval [39,61] | - |
Average Step Time (AvgStepTime) | The average time elapsed for each step [39,50] | |
Average Cadence (AvgCad) | Ratio of total steps to total time [39,61] | AvgCad = $\frac{\text{numSteps}}{\text{total time}}$ |
Skewness (S) | Asymmetry of the signal distribution [39,53,61] | |
Kurtosis (K) | Tailedness of the signal amplitude distribution, i.e., how heavily extreme values weigh relative to a normal distribution [39,53,61] | |
Coefficient of Variation of Step Time | Standard deviation of stride interval divided by mean stride interval [53,58] | |
Harmonic Ratio (HR) | Quantifies harmonic composition of accelerations via DFT [53,62] | |
Average Step Length (AvgStepLength) | The average distance covered per step [39,50] | |
Gait Velocity (gaitVelocity) | Ratio of total distance covered to total time [39,61] | |
Minimum and Maximum Difference (minMaxDiff) | Global max of a step minus global min, averaged over all steps [54] | minMaxDiff = $\frac{1}{N}\sum_{i=1}^{N}\left(\max_i - \min_i\right)$, with $N$ the number of steps |
Standard Deviation (σ) | Measure of signal spread, the square root of the variance [53,54] | |
Root Mean Square (RMS) | Quadratic mean, statistical measure [54] | |
Entropy Rate (H) | Measures signal uncertainty and regularity [53,54,55,56] | |
Regression Line for Local Maxima and Minima | Regression line of local extrema in signal sequence [54] | – |
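As a rough illustration of how several of the time-domain quantities in Table 2 could be computed, the following is a minimal Python sketch. It assumes the standard textbook (population) definitions of standard deviation, RMS, skewness, and kurtosis, and it assumes that step timestamps are already available from some step detector. The function name `time_domain_features`, its arguments, and the toy input are hypothetical and not taken from the paper, whose exact formulas may differ.

```python
import math


def time_domain_features(signal, step_times, duration_s):
    """Illustrative time-domain gait features (assumed textbook definitions).

    signal     -- acceleration magnitudes in m/s^2
    step_times -- timestamps (s) of detected steps, ascending (hypothetical input)
    duration_s -- total recording time in seconds
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    std = math.sqrt(sum(c ** 2 for c in centered) / n)    # population standard deviation
    rms = math.sqrt(sum(x ** 2 for x in signal) / n)      # quadratic mean
    skew = sum(c ** 3 for c in centered) / n / std ** 3   # asymmetry (fails if std == 0)
    kurt = sum(c ** 4 for c in centered) / n / std ** 4   # tailedness (normal ~ 3)

    # Time between consecutive steps, used for step time and its variability.
    intervals = [b - a for a, b in zip(step_times, step_times[1:])]
    avg_step_time = sum(intervals) / len(intervals)
    iv_std = math.sqrt(
        sum((d - avg_step_time) ** 2 for d in intervals) / len(intervals)
    )

    return {
        "numSteps": len(step_times),
        "AvgStepTime": avg_step_time,
        "AvgCad": len(step_times) / duration_s * 60.0,  # steps per minute
        "CV": iv_std / avg_step_time,                   # coefficient of variation
        "std": std,
        "RMS": rms,
        "S": skew,
        "K": kurt,
    }


# Toy input, made up purely for illustration.
features = time_domain_features(
    signal=[1.0, 2.0, 3.0, 4.0, 5.0],
    step_times=[0.0, 0.5, 1.0, 1.5, 2.0],
    duration_s=2.0,
)
```

The sketch reports cadence in steps per minute to match the unit listed in Table 1; a constant signal would need a guard against division by zero before computing skewness and kurtosis.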
Table 3.
Frequency-domain features extracted from accelerometer data.
Feature | Description | Formula |
---|
Average Power (AvgPower) | The mean of the total power underneath the curve of the PSD estimate for a signal [39,50] | |
Ratio of Spectral Peak (Welch, FFT, DCT) (RSP) | Ratio of the energies of low- and high-frequency bands [39,50] | |
Signal-to-Noise Ratio (SNR) | Power of the whole signal over the power of its computed noise [39] | |
Total Harmonic Distortion (THD) | Distortion of the whole signal compared to its harmonics [39] | |
Energy in Band 0.5 to 3 Hz (EB (0.5–3 Hz)) | Energy in a frequency band describing parts of distinct frequencies in the signal [54] | |
Windowed Energy in Band 0.5 to 3 Hz (WEB) | Energy in a frequency band of 5 s windows with an overlap of 2.5 s, averaged from complete signal sequence [54] | |
Peak Frequency (PeakFreq.) | The frequency at which the spectral power is maximal [53] | |
Spectral Centroid (SP) | The power-weighted mean frequency of the spectrum, i.e., its center of mass [53] | |
Bandwidth (B) | Difference between the uppermost and lowermost frequencies in the signal [53] | |
Regression Line for Windowed Energy (y) | Regression line of energy values from a window (2.5 s) moved through a signal sequence [54] | |
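To make a few of the frequency-domain entries in Table 3 more concrete, here is a minimal Python sketch using a naive discrete Fourier transform from the standard library. It assumes a plain periodogram rather than the Welch or DCT estimates named in the table, defines peak frequency as the frequency of maximal power and spectral centroid as the power-weighted mean frequency, and uses hypothetical names (`power_spectrum`, `spectral_features`) and a synthetic 2 Hz sine as input. None of this is taken from the paper's implementation.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """One-sided periodogram via a naive O(n^2) DFT of the mean-removed signal."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)           # bin frequency in Hz
        power.append(abs(coeff) ** 2 / n)  # unnormalized power per bin
    return freqs, power


def spectral_features(signal, fs, band=(0.5, 3.0)):
    """Illustrative spectral features (assumed definitions, not the paper's code)."""
    freqs, power = power_spectrum(signal, fs)
    total = sum(power)  # a constant signal gives total == 0 and needs a guard
    peak_index = max(range(len(power)), key=lambda k: power[k])
    n = len(signal)
    mean = sum(signal) / n
    return {
        "PeakFreq": freqs[peak_index],
        "SP": sum(f * p for f, p in zip(freqs, power)) / total,
        "EB": sum(p for f, p in zip(freqs, power) if band[0] <= f <= band[1]),
        "total": total,
        "AvgPower": sum((v - mean) ** 2 for v in signal) / n,  # mean squared amplitude
    }


# Synthetic 2 Hz sine sampled at 20 Hz for 2 s (an integer number of periods).
fs = 20.0
sine = [math.sin(2 * math.pi * 2.0 * t / fs) for t in range(40)]
spec = spectral_features(sine, fs)
```

For this synthetic input, essentially all spectral power falls in the 0.5 to 3 Hz band, which is the band the table associates with gait. A real pipeline would more likely use an FFT library for speed.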
Table 4.
Wavelet-domain features extracted from the signal.
Feature | Description | Formula |
---|
Wavelet Bandwidth (WB) | The relative energy contribution in a time–frequency band [53] | |
Wavelet Entropy Rate (ER) | Wavelet entropy represents signal disorder in the time–frequency domain [53,58,59] | |
Table 5.
Statistical Features.
Feature | Description | Formula |
---|
Zeroth-Lag Cross-Correlation Coefficient | The agreement or similarity between two directional acceleration signals [53] | |
Kurtosis (K) | Tailedness of the signal amplitude distribution, i.e., how heavily extreme values weigh relative to a normal distribution [39,53,61] | |
Standard Deviation (σ) | Measure of signal spread, defined as the square root of the variance [53,54] | |
Table 6.
Information-Theoretic Features.
Feature | Description | Formula |
---|
Lempel–Ziv Complexity | The complexity–predictability of the signal [53,60,63,64] | Computed from the number of unique patterns in sequence X and the sequence length N. |
Entropy Rate (H(X)) | The uncertainty measure of the signal, representing the regularity of a signal when consecutive data points are related [53,54,55,56] | Computed from the probability distribution of the signal values. |
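The two information-theoretic quantities in Table 6 can be sketched as follows. This is only an illustration under stated assumptions: the entropy is taken to be the Shannon entropy in bits of a discrete symbol sequence, the complexity is the phrase count of the Lempel–Ziv (1976) parsing, normalized by the commonly used factor log2(N)/N for binary sequences, and the signal is binarized around its mean. The function names are hypothetical, and the paper's exact definitions (in particular of the entropy rate) may differ.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits of a sequence of discrete symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def lz76_phrase_count(s):
    """Number of phrases in the Lempel-Ziv (1976) parsing of string s.

    Each phrase is the shortest substring starting at position i that cannot
    be copied from a start position earlier than i (overlap is allowed).
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Grow the phrase while it can still be reproduced from earlier text.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def normalized_lz(s):
    """Phrase count scaled by log2(N)/N (an assumed normalization for binary input)."""
    n = len(s)
    return lz76_phrase_count(s) * math.log2(n) / n


def binarize(signal):
    """Map samples to '1' above the mean and '0' otherwise (assumed coarse-graining)."""
    mean = sum(signal) / len(signal)
    return "".join("1" if x > mean else "0" for x in signal)


# Classic textbook string: parses as 0 | 001 | 10 | 100 | 1000 | 101.
example = "0001101001000101"
phrases = lz76_phrase_count(example)
```

A more regular signal yields fewer phrases and lower entropy, which is the sense in which both features describe predictability.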
Table 7.
Time-domain features ranked by correlation coefficient.
Index | Feature Name | Before Norm.: Coef | Before Norm.: p-Value | Before Norm.: Predictable (p < 0.05) | After Norm.: Coef | After Norm.: p-Value | After Norm.: Predictable (p < 0.05) | Coef Diff |
---|
1 | Standard Deviation (σ) | −0.1068 | 0.0657 | 0 | −0.3947 | 0.0000 | 1 | 0.2880 |
2 | Root Mean Square (RMS) | −0.1067 | 0.0660 | 0 | −0.3943 | 0.0000 | 1 | 0.2877 |
3 | minMaxDiff | −0.1268 | 0.0286 | 1 | −0.3842 | 0.0000 | 1 | 0.2574 |
4 | Skewness (S) | −0.2649 | 0.0000 | 1 | −0.2715 | 0.0000 | 1 | 0.0066 |
5 | Kurtosis (K) | −0.1509 | 0.0091 | 1 | −0.2610 | 0.0000 | 1 | 0.1101 |
6 | gaitVelocity | −0.1131 | 0.0511 | 0 | −0.2523 | 0.0000 | 1 | 0.1392 |
7 | AvgCadence | 0.1108 | 0.0561 | 0 | −0.2490 | 0.0000 | 1 | 0.1383 |
8 | numSteps | −0.1309 | 0.0238 | 1 | −0.2102 | 0.0003 | 1 | 0.0793 |
9 | AvgStepLength | 0.1108 | 0.0561 | 0 | −0.1988 | 0.0006 | 1 | 0.0880 |
10 | Entropy Rate (H) | −0.0773 | 0.1831 | 0 | −0.1813 | 0.0017 | 1 | 0.1040 |
11 | Harmonic Ratio (HR) | 0.1505 | 0.0093 | 1 | 0.1708 | 0.0031 | 1 | 0.0203 |
12 | Coeff. of Variation of Step Time | 0.1128 | 0.0518 | 0 | −0.1346 | 0.0202 | 1 | 0.0218 |
Average Useful | 0.1302 | | | 0.2586 | | | 0.1284 |
13 | AvgStepTime | 0.0831 | 0.1525 | 0 | 0.0975 | 0.0928 | 0 | 0.0000 |
Average All | 0.1251 | | | 0.2312 | | | 0.1061 |
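The summary columns of Table 7 can be checked with a short Python sketch. It assumes that the coefficients are Pearson correlations, that "Coef Diff" is the change in absolute correlation, |after| − |before|, and that "Average Useful" is the mean absolute coefficient over the features with p < 0.05 after normalization. These readings are inferred from the numbers in the table, not stated by the source, and the function names are hypothetical. The p-value computation is not covered here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coef_diff(before, after):
    """Change in absolute correlation (assumed meaning of the 'Coef Diff' column)."""
    return abs(after) - abs(before)


# Coef columns of rows 1-12 of Table 7 (features with p < 0.05 after normalization).
before = [-0.1068, -0.1067, -0.1268, -0.2649, -0.1509, -0.1131,
          0.1108, -0.1309, 0.1108, -0.0773, 0.1505, 0.1128]
after = [-0.3947, -0.3943, -0.3842, -0.2715, -0.2610, -0.2523,
         -0.2490, -0.2102, -0.1988, -0.1813, 0.1708, -0.1346]

avg_useful_before = sum(abs(c) for c in before) / len(before)
avg_useful_after = sum(abs(c) for c in after) / len(after)
std_diff = coef_diff(before[0], after[0])
```

Under these assumptions the two averages round to the 0.1302 and 0.2586 reported in the "Average Useful" row. The standard deviation row gives 0.2879 from the rounded coefficients against 0.2880 in the table, which would be consistent with the table having been computed from unrounded values.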
Table 8.
Frequency-domain features ranked by correlation coefficient.
Index | Feature Name | Before Norm.: Coef | Before Norm.: p-Value | Before Norm.: Predictable (p < 0.05) | After Norm.: Coef | After Norm.: p-Value | After Norm.: Predictable (p < 0.05) | Coef Diff |
---|
1 | AvgPower | −0.1345 | 0.0202 | 1 | −0.3990 | 0.0000 | 1 | 0.2645 |
2 | WEB | −0.1393 | 0.0161 | 1 | −0.3974 | 0.0000 | 1 | 0.2581 |
3 | EB (0.5–3 Hz) | −0.1409 | 0.0149 | 1 | −0.3347 | 0.0000 | 1 | 0.1937 |
4 | PeakFreq. | −0.1239 | 0.0325 | 1 | −0.3196 | 0.0000 | 1 | 0.1958 |
5 | SNR | 0.2669 | 0.0000 | 1 | −0.2471 | 0.0000 | 1 | −0.0199 |
6 | RSPFFT | −0.1385 | 0.0168 | 1 | −0.1734 | 0.0027 | 1 | 0.0349 |
7 | RSPWelch | −0.0925 | 0.1111 | 0 | −0.1703 | 0.0032 | 1 | 0.0778 |
8 | RSPDCT | −0.1179 | 0.0420 | 1 | −0.1525 | 0.0084 | 1 | 0.0346 |
Average Useful | 0.1443 | | | 0.2742 | | | 0.1299 |
9 | Bandwidth (B) | −0.0682 | 0.2408 | 0 | −0.0795 | 0.1711 | 0 | 0.0000 |
10 | Spectral Centroid (SP) | 0.0910 | 0.1168 | 0 | 0.0393 | 0.4996 | 0 | 0.0000 |
11 | THD | 0.1056 | 0.0687 | 0 | 0.0362 | 0.5334 | 0 | 0.0000 |
Average All | 0.1314 | | | 0.2313 | | | 0.0999 |
Table 9.
Wavelet-domain features ranked by correlation coefficient.
Index | Feature Name | Before Norm.: Coef | Before Norm.: p-Value | Before Norm.: Predictable (p < 0.05) | After Norm.: Coef | After Norm.: p-Value | After Norm.: Predictable (p < 0.05) | Coef Diff |
---|
1 | Wavelet Entropy Rate (ER) | 0.1880 | 0.0011 | 1 | 0.1229 | 0.0340 | 1 | −0.0651 |
Average Useful | 0.1880 | | | 0.1229 | | | −0.0651 |
2 | Wavelet Bandwidth (WB) | −0.1565 | 0.0068 | 1 | −0.0889 | 0.1256 | 0 | 0.0000 |
Average All | 0.1723 | | | 0.1059 | | | −0.0664 |
Table 10.
Statistical-Domain features ranked by correlation coefficient.
Index | Feature Name | Before Norm.: Coef | Before Norm.: p-Value | Before Norm.: Predictable (p < 0.05) | After Norm.: Coef | After Norm.: p-Value | After Norm.: Predictable (p < 0.05) | Coef Diff |
---|
7 | Standard Deviation (σ) | −0.1068 | 0.0657 | 0 | −0.3947 | 0.0000 | 1 | 0.2880 |
11 | Zeroth-Lag Cross-Correlation Coeff. | 0.0720 | 0.2152 | 0 | −0.2848 | 0.0000 | 1 | 0.2128 |
5 | Kurtosis (K) | −0.1509 | 0.0091 | 1 | −0.2610 | 0.0000 | 1 | 0.1101 |
Average | 0.1099 | | | 0.3135 | | | 0.2036 |
Table 11.
Information-theoretic features ranked by correlation coefficient.
Index | Feature Name | Before Norm.: Coef | Before Norm.: p-Value | Before Norm.: Predictable (p < 0.05) | After Norm.: Coef | After Norm.: p-Value | After Norm.: Predictable (p < 0.05) | Coef Diff |
---|
1 | Entropy Rate (H(X)) | −0.0773 | 0.1831 | 0 | −0.1813 | 0.0017 | 1 | 0.1040 |
Average | 0.0773 | | | 0.1813 | | | 0.1040 |
Table 12.
Confusion matrix representation.
Actual/Predicted | Class 1 | Class 2 |
---|
Class 1 | True Positives (TP) | False Negatives (FN) |
Class 2 | False Positives (FP) | True Negatives (TN) |
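The per-class metrics reported in the later tables follow from the four cells of Table 12 once each BAC class is treated one-vs-rest. A minimal Python sketch, using the standard definitions (TP rate equals recall), is given here. The function name `binary_metrics` is hypothetical, and the counts in the example are illustrative values chosen only because they reproduce the rounded BAC = 0 row of Table 14; they are not reported in the source.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard one-vs-rest metrics from the four confusion-matrix cells."""
    tp_rate = tp / (tp + fn)      # recall / sensitivity
    fp_rate = fp / (fp + tn)      # fall-out
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "TP Rate": tp_rate,
        "FP Rate": fp_rate,
        "Precision": precision,
        "Recall": tp_rate,
        "F-Measure": f_measure,
        "Accuracy": accuracy,
    }


# Illustrative counts (an assumption, not data from the paper).
m = binary_metrics(tp=49, fn=3, fp=12, tn=234)
```

Because the F-measure is the harmonic mean of precision and recall, any one of the three can be recovered from the other two, which is a convenient consistency check on the reported rows.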
Table 13.
Classifiers ranked by accuracy for time-domain features with p-values < 0.05.
Classifier Type | Accuracy (%) |
---|
Random Forest | 83.22 |
JRip | 80.20 |
J48 | 78.86 |
Decision Table | 74.16 |
Naive Bayes | 48.66 |
SMO (SVM in WEKA) | 41.28 |
Table 14.
Random forest classifier performance metrics for different BAC Classes.
Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
---|
BAC = 0 | 0.942 | 0.049 | 0.803 | 0.942 | 0.867 | 0.968 |
BAC = 0.05 | 0.625 | 0.039 | 0.714 | 0.625 | 0.667 | 0.855 |
BAC = 0.12 | 0.807 | 0.054 | 0.780 | 0.807 | 0.793 | 0.909 |
BAC = 0.2 | 0.632 | 0.035 | 0.727 | 0.632 | 0.676 | 0.836 |
BAC = 0.3 | 0.937 | 0.032 | 0.945 | 0.937 | 0.941 | 0.979 |
Weighted Avg. | 0.832 | 0.040 | 0.830 | 0.832 | 0.829 | 0.929 |
Table 15.
Random forest configuration for time-domain features.
Parameter | Value |
---|
Classifier | weka.classifiers.trees.RandomForest |
Number of Trees (numTrees) | 100 |
Maximum Depth (maxDepth) | 0 (unlimited) |
Number of Features (numFeatures) | 0 (auto: int(log2(#predictors) + 1)) |
Seed | 1 |
Out-of-Bag Error Estimation | Enabled |
Bag Size Percent | 100 |
Batch Size | 100 |
Break Ties Randomly | False |
Print Classifier | False |
Table 16.
Classifier accuracy comparison on frequency-domain features with p-values < 0.05.
Classifier Type | Accuracy |
---|
J48 | 82.21% |
Random Forest | 79.53% |
JRip | 77.18% |
Decision Table | 74.83% |
Naive Bayes | 48.99% |
SMO (SVM in WEKA) | 43.29% |
Table 17.
J48 classifier performance metrics for different BAC classes.
Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
---|
BAC = 0 | 0.885 | 0.061 | 0.754 | 0.885 | 0.814 | 0.913 |
BAC = 0.05 | 0.650 | 0.027 | 0.788 | 0.650 | 0.712 | 0.904 |
BAC = 0.12 | 0.842 | 0.079 | 0.716 | 0.842 | 0.774 | 0.874 |
BAC = 0.2 | 0.605 | 0.023 | 0.793 | 0.605 | 0.687 | 0.852 |
BAC = 0.3 | 0.919 | 0.032 | 0.944 | 0.919 | 0.932 | 0.958 |
Weighted Avg. | 0.822 | 0.044 | 0.827 | 0.822 | 0.820 | 0.913 |
Table 18.
J48 configuration for frequency-domain features.
Parameter | Value |
---|
Classifier | weka.classifiers.trees.J48 |
Confidence Factor for Pruning (C) | 0.25 |
Minimum Number of Instances per Leaf (M) | 2 |
Unpruned | False |
Reduced Error Pruning | False |
Binary Splits | False |
Collapse Tree | True |
Subtree Raising | True |
Use Laplace for Smoothing | False |
Seed | 1 |
Table 19.
Classifiers ranked by accuracy for wavelet-domain features with p-values < 0.05.
Classifier Type | Accuracy |
---|
Random Forest | 77.85% |
J48 | 75.84% |
JRip | 70.81% |
Decision Table | 53.36% |
Naive Bayes | 42.62% |
SMO (SVM in WEKA) | 37.25% |
Table 20.
Performance metrics for different BAC classes.
Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
---|
BAC = 0 | 0.865 | 0.073 | 0.714 | 0.865 | 0.783 | 0.905 |
BAC = 0.05 | 0.600 | 0.054 | 0.632 | 0.600 | 0.615 | 0.740 |
BAC = 0.12 | 0.807 | 0.066 | 0.742 | 0.807 | 0.773 | 0.879 |
BAC = 0.2 | 0.658 | 0.038 | 0.714 | 0.658 | 0.685 | 0.789 |
BAC = 0.3 | 0.829 | 0.043 | 0.920 | 0.829 | 0.872 | 0.910 |
Weighted Avg. | 0.779 | 0.054 | 0.785 | 0.779 | 0.779 | 0.865 |
Table 21.
Random forest configuration for wavelet-domain features.
Parameter | Value |
---|
Classifier | weka.classifiers.trees.RandomForest |
Number of Trees (numTrees) | 150 |
Maximum Depth (maxDepth) | 0 (unlimited) |
Number of Features (numFeatures) | 0 (auto: int(log2(#predictors) + 1)) |
Seed | 1 |
Out-of-Bag Error Estimation | Enabled |
Bag Size Percent | 100 |
Batch Size | 100 |
Break Ties Randomly | False |
Print Classifier | False |
Table 22.
Classifiers ranked by accuracy for statistical domain features with p-values < 0.05.
Classifier Type | Accuracy |
---|
J48 | 83.89% |
Random Forest | 82.86% |
JRip | 76.51% |
Decision Table | 72.15% |
Naive Bayes | 50.34% |
SMO (SVM in WEKA) | 40.94% |
Table 23.
J48 performance metrics for statistical features with p-values < 0.05.
Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
---|
BAC = 0 | 0.904 | 0.041 | 0.825 | 0.904 | 0.862 | 0.942 |
BAC = 0.05 | 0.775 | 0.027 | 0.816 | 0.775 | 0.795 | 0.908 |
BAC = 0.12 | 0.772 | 0.058 | 0.759 | 0.772 | 0.765 | 0.874 |
BAC = 0.2 | 0.632 | 0.038 | 0.706 | 0.632 | 0.667 | 0.826 |
BAC = 0.3 | 0.937 | 0.037 | 0.937 | 0.937 | 0.937 | 0.977 |
Weighted Avg. | 0.839 | 0.041 | 0.837 | 0.839 | 0.839 | 0.923 |
Table 24.
J48 configuration for statistical domain features.
Parameter | Value |
---|
Classifier | weka.classifiers.trees.J48 |
Confidence Factor for Pruning (C) | 0.15 |
Min. No. of Instances per Leaf (M) | 3 |
Unpruned | False |
Reduced Error Pruning | False |
Binary Splits | False |
Collapse Tree | True |
Subtree Raising | True |
Laplace for Smoothing | False |
Seed | 1 |
Table 25.
Classifiers ranked by accuracy for information-theoretic features with p-values < 0.05.
Classifier Type | Accuracy |
---|
Random Forest | 58.05% |
J48 | 57.05% |
Decision Table | 53.36% |
JRip | 43.29% |
Naive Bayes | 37.92% |
SMO (SVM in WEKA) | 37.25% |
Table 26.
Performance metrics for different BAC classes.
Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
---|
BAC = 0 | 0.654 | 0.195 | 0.415 | 0.654 | 0.507 | 0.820 |
BAC = 0.05 | 0.650 | 0.097 | 0.510 | 0.650 | 0.571 | 0.776 |
BAC = 0.12 | 0.298 | 0.100 | 0.415 | 0.298 | 0.347 | 0.768 |
BAC = 0.2 | 0.237 | 0.031 | 0.529 | 0.237 | 0.327 | 0.709 |
BAC = 0.3 | 0.784 | 0.107 | 0.813 | 0.784 | 0.798 | 0.901 |
Weighted Avg. | 0.581 | 0.110 | 0.590 | 0.581 | 0.571 | 0.820 |
Table 27.
Random forest configuration for information-theoretic domain features.
Parameter | Value |
---|
Classifier | weka.classifiers.trees.RandomForest |
Number of Trees (numTrees) | 200 |
Maximum Depth (maxDepth) | 0 (unlimited) |
Number of Features (numFeatures) | 0 (auto: int(log2(#predictors) + 1)) |
Seed | 1 |
Out-of-Bag Error Estimation | Enabled |
Bag Size Percent | 100 |
Batch Size | 100 |
Break Ties Randomly | True |
Print Classifier | False |
Table 28.
Accuracy of different classifiers for all domain features.
Classifier Type | Accuracy |
---|
Random Forest | 84.90% |
J48 | 80.87% |
JRip | 80.54% |
Decision Table | 75.17% |
Naive Bayes | 56.04% |
SMO (SVM in WEKA) | 43.62% |
Table 29.
Performance metrics for different BAC classes.
Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
---|
BAC = 0 | 0.942 | 0.041 | 0.831 | 0.942 | 0.883 | 0.969 |
BAC = 0.05 | 0.650 | 0.031 | 0.765 | 0.650 | 0.703 | 0.854 |
BAC = 0.12 | 0.825 | 0.054 | 0.783 | 0.825 | 0.803 | 0.906 |
BAC = 0.2 | 0.711 | 0.042 | 0.711 | 0.711 | 0.711 | 0.848 |
BAC = 0.3 | 0.937 | 0.016 | 0.972 | 0.937 | 0.954 | 0.974 |
Weighted Avg. | 0.849 | 0.033 | 0.850 | 0.849 | 0.848 | 0.928 |
Table 30.
Random forest configuration for all domain features.
Parameter | Value |
---|
Classifier | weka.classifiers.trees.RandomForest |
Number of Trees (numTrees) | 200 |
Maximum Depth (maxDepth) | 0 (unlimited) |
Number of Features (numFeatures) | 0 (auto: int(log2(#predictors) + 1)) |
Seed | 1 |
Out-of-Bag Error Estimation | Enabled |
Bag Size Percent | 100 |
Batch Size | 100 |
Break Ties Randomly | True |
Print Classifier | False |
Table 31.
Comparison of performance metrics with prior work.
Classifier | Acc. | F1 Score | AUC Score | TP Rate | FP Rate | Prec. | Method | Device | Features |
---|
McAfee et al. [82] | 70% | 0.786 | 0.825 | — | — | — | J48 | Phone, Watch | Skew, Kurtosis, Gait Velocity, Residual Step Time, Band Power, XZ Sway, XY Sway, YZ Sway, Sway Volume |
Bremner et al. [81] | 62% | — | — | — | — | — | Conv. Neural Network | Phone, Watch | Raw data |
Our Work | 84.90% | 0.848 | 0.928 | 0.849 | 0.033 | 0.850 | Random Forest | Phone | Number of Steps, Average Step Time, Average Cadence, Skewness, Kurtosis, Coefficient of Variation of Step Time, Harmonic Ratio, Average Step Length, Gait Velocity, Minimum and Maximum Difference, Standard Deviation, Root Mean Square, Entropy Rate, Regression Line for Local Maxima and Minima, Average Power, Ratio of Spectral Peak, Signal-to-Noise Ratio, Total Harmonic Distortion, Energy in Band 0.5 to 3 Hz, Windowed Energy in Band 0.5 to 3 Hz, Peak Frequency, Spectral Centroid, Bandwidth, Regression Line for Windowed Energy, Wavelet Bandwidth, Wavelet Entropy Rate, Zeroth-Lag Cross-Correlation Coefficient, Lempel–Ziv Complexity |