One Small Step for a Man: Estimation of Gender, Age and Height from Recordings of One Step by a Single Inertial Sensor

A number of previous works have shown that information about a subject is encoded in sparse kinematic information, such as that revealed by so-called point light walkers. With the work at hand, we extend these results to classifications of soft biometrics from inertial sensor recordings at a single body location from a single step. We recorded accelerations and angular velocities of 26 subjects using inertial measurement units (IMUs) attached at four locations (chest, lower back, right wrist and left ankle) while they performed standardized gait tasks. The collected data were segmented into individual walking steps. We trained random forest classifiers in order to estimate soft biometrics (gender, age and height). We applied two different validation methods to the process: 10-fold cross-validation and subject-wise cross-validation. For all three classification tasks, we achieve high accuracy values for all four sensor locations. From these results, we can conclude that the data of a single walking step (6D: accelerations and angular velocities) allow for a robust estimation of the gender, height and age of a person.


Introduction
Sparse representation of human motions has been investigated for some decades now. It is well-known that representation of human motion by point light displays and similar concepts (e.g., point light walker [1,2]) contains detailed information on several aspects of motions and their initiators.
Over the years, the possibilities to identify certain parameters characterizing given motions have been explored. On the one hand, it is possible to discover information about the displayed motions as such. In the field of action recognition, it has been shown that estimation of poses and skeletons from video and motion capture data allows for recognition and analysis of human movement (Lv et al. [3], Junejo et al. [4], Barnachon et al. [5], Oshin et al. [6]). The survey of vision-based human motion capture by Moeslund et al. [7] discusses the advances and application of motion-capture-related techniques for tracking, pose estimation and recognition of movement. Recognition of motion patterns from video data can be achieved by machine learning approaches exploiting local space-time features (e.g., for SVM-based methods, Schüldt et al. [8]). On the other hand, information on the kinematic properties of living beings or animated objects can be detected by analyzing representations of motions. This can be done using motion capture data from passive or active devices, as well as contact forces measurements (Venture et al. [9], Kirk et al. [10]).
More recently, the market for wearable devices has virtually exploded (Liew et al. [11], Son et al. [12]). The sheer number of devices [13] reflects that there are numerous methods to capture and analyze human motion in a relatively new field of application associated with ubiquitous computing. Even though information acquired by such devices may be less accurate than information acquired by modern motion capture systems (Le Masurier et al. [14], Foster et al. [15]), it has been shown that reconstruction of motion from extremely sparse sensor setups is possible in practice (Tautges et al. [16], Riaz et al. [17]). This indicates that data collected using tri-axial accelerometers are suitable for classification tasks, e.g., associated with social actions (Hung et al. [18]), general everyday activities (Parkka et al. [19], Jean-Baptiste et al. [20], Dijkstra et al. [21]) or repetitive physical exercises (Morris et al. [22]).
We investigated whether data from a single wearable sensor can reveal similar information about the moving subject as motion capture data in the sense of the above-quoted [1,2]. We focus on classification of gender, age and height as exemplary inertial properties of moving subjects. Our experiments show that it is indeed possible to classify and thereby estimate such properties. Our method is able to process representations of single steps recorded by one accelerometer (as opposed to longer data sequences; Neugebauer et al. [23]). In sum, our method is able to recover soft biometric information with high accuracy consistently over various sensor positions. Since the classification task depends on the chosen feature sets, we further evaluated the role of different possible feature sets in the classification.
Modern machine learning techniques like decision trees can target pattern recognition and prediction tasks based on many different representations of motion (Brand et al. [24], Bao et al. [25], Kwapisz et al. [26]). We used random forests, a learning method based on the construction of multiple decision trees, which can be used for classification, as well as regression tasks. While learning predictive models by using decision trees on their own may result in over-fitting to a training set (Phan et al. [27]), random forests are less prone to this problem. For an overview of random forests, refer to the works of Breiman [28] or Liaw and Wiener [29].

Participants' Consent
All participants were informed in detail about the purpose of the study, the nature of the experiments, the types of data to be recorded and the data privacy policy. The subjects were aware that they were taking part in experiments where a number of biometric and kinematic properties were monitored. The main focus of the study was communicated to the subjects during their progress over the course of the training by the specialists of Gokhale Method Institute [30] (Stanford, CA, United States). Each willing participant was asked to fill in the data collection form with personal details, including full name, sex, age and height.

Population Characteristics and Sampling
The participants were selected during a gait and posture training program conducted in July of 2014 by the specialists of Gokhale Method Institute. They use special gait and posture training methods to help regain the structural integrity of the body. The training program consisted of six 90-minute training sessions. The study population consisted of a total of 26 adults with a male to female ratio of 12:14 and an average age of 48.1 years (σ = ±12.7). The average height of the participants was recorded at 174 cm (σ = ±10.2). The characteristics of the study population are shown in Table 1.

Table 1. Characteristics of the study population, including age, sex and height.
Characteristic          Value
Total participants      26
Male participants       12
Female participants     14
Age (years, ± SD)       48.1 ± 12.7
Height (cm, ± SD)       174 ± 10.2

For validation, two types of models were used: k-fold cross-validation and subject-wise cross-validation.
A k-fold cross-validation model (chosen value of k = 10) was used to compute the classification accuracy of the classifier. In k-fold cross-validation, original sample data are randomly partitioned into k equally-sized sub-samples or folds. Out of the k folds, k − 1 folds are used for training, and the left-out fold is used for validation. The cross-validation process is repeated k times, and each of the k folds is used exactly once for validation. For sampling, the stratified sampling method [31] is used to divide the population into training and test datasets.
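The stratified k-fold procedure can be illustrated with a minimal sketch in plain Python. The function names, the round-robin assignment and the toy labels are assumptions made for illustration; they are not taken from the study's implementation.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=10, seed=0):
    """Return k lists of sample indices with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin over the folds.
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds

def train_validation_splits(folds):
    """Yield (train_indices, validation_indices); each fold validates exactly once."""
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical toy labels: 60 steps of one class and 40 of the other.
labels = ["female"] * 60 + ["male"] * 40
folds = stratified_k_fold(labels, k=10)
splits = list(train_validation_splits(folds))
```

With these toy labels, every fold holds ten steps in the same 6:4 class ratio as the full set, which is the property stratification is meant to preserve.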
A subject-wise cross-validation model was also employed to compute the classification accuracy of each participant against others. Subject-wise cross-validation is a special variant of leave-one-out cross-validation in which instead of leaving one sample out for validation, all samples of one participant are left out for validation. For n participants (n = 26, in our case), all samples of n − 1 participants are used for training, and all samples of the left-out participant are used for testing. The cross-validation process is repeated n times in order to validate each participant exactly once against the rest. Unlike 10-fold cross-validation, the number of samples in each fold is not equal in subject-wise cross-validation. This is due to the difference in the step length of each subject. Subjects with shorter step lengths have more steps than the others.
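As a rough illustration of subject-wise cross-validation, the following sketch holds out all steps of one subject per round. The subject identifiers and step counts are hypothetical toy values.

```python
def subject_wise_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) once per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical toy data: three subjects with different numbers of steps,
# so the folds have unequal sizes, as described in the text.
subject_ids = ["s01"] * 5 + ["s02"] * 3 + ["s03"] * 4
splits = list(subject_wise_splits(subject_ids))
fold_sizes = [len(test) for _, _, test in splits]
```

Because no step of the held-out subject appears in the training set, this scheme measures how well a classifier generalizes to unseen persons rather than to unseen steps of known persons.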

Standardized Gait Tasks
The gait task consisted of a 10-meter straight walk from a starting point, turning around and walking back to the starting point. Participants were asked to walk in their natural manner and to repeat the gait task two times, resulting in a 4 × 10-meter walk. Three different types of experiments were performed: (1) walking on a hard surface (concrete floor) with shoes on; (2) walking on a hard surface (concrete floor) with bare feet; and (3) walking on a soft surface (exercise mattress) with bare feet. Data were recorded during three different stages of the training course: (1) at the start of the training (before the 1st session); (2) in the middle of the training (after the 3rd session); and (3) at the end of the training (after the 6th session). Hence, for each participant, 9 different recording sessions were carried out in total (see Table 2).

Sensor Placement and Data Collection
A set of four APDM Opal wireless inertial measurement units [32] was used to record accelerations and angular velocities. An APDM Opal IMU consists of a triad of three accelerometers and three gyroscopes. The technical specifications of the sensor are given in Table 3. The sensors were tightly attached to different body parts using adjustable elastic straps. We were particularly interested in the inertial measurements of four different body parts: (1) chest; (2) lower back; (3) right wrist; and (4) left ankle. The sensor placement at each body part is shown in Figure 1.

Pre-Processing
The output sampling rate of an APDM Opal IMU sensor is adjustable between 20 and 128 Hz. In our experiments, an output sampling rate of 128 Hz was chosen. Because the raw acceleration measurements are noisy, the data were pre-processed with a moving average filter with a window size of 9 frames, which smooths the raw signal and suppresses noise.
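A moving average with a 9-frame window could be sketched as follows. The centered window and the shrinking window at the signal borders are assumptions; the text does not specify how the borders were handled.

```python
def moving_average(signal, window=9):
    """Centered moving average; the window shrinks at the signal borders (assumption)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical toy signal: a single noise spike in an otherwise flat recording.
spike = [0.0] * 10 + [9.0] + [0.0] * 10
smoothed_spike = moving_average(spike, window=9)
```

The spike of height 9 is spread over nine neighbouring frames, so its peak drops to one ninth of its original height.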

Signal Decomposition
The input signal consists of a long sequence of steps, which is segmented into single steps in order to extract features. A simple approach to decompose a long sequence of steps into single steps is by means of peak and valley detection [33][34][35]. In this approach, peaks are detected by finding local maxima, whereas valleys are detected by finding local minima. The detection of false peaks is minimized by using two thresholds, ∆d and ∆h: ∆d defines the minimum distance between two peaks, and ∆h defines the minimum height of a peak. We have used the same approach to detect peaks and valleys in the input signal. The values of the two thresholds were chosen by experimentation. The valleys are then used to cut the input signal into individual steps. Peaks and valleys are only detected in the x-axis of the acceleration signal and are then used to decompose the y- and z-axes of the acceleration and all axes of the gyroscope. This approach makes sure that the length of an individual step is consistent across all axes of the accelerometer and gyroscope. In Figure 2, the left-hand image presents the pre-processed input signal from the x-axis of the IMU's accelerometer attached to the lower back. The detected valleys, highlighted with circles, are also shown.
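The valley-based segmentation can be sketched as below. This is an illustrative simplification, not the authors' implementation: the parameters `min_distance` and `max_value` play the role of the distance and height thresholds, and the rule that the deeper of two close valleys wins is an assumption.

```python
def find_valleys(signal, min_distance, max_value):
    """Indices of local minima at or below max_value that are at least
    min_distance samples apart (the deeper valley wins if two are too close)."""
    valleys = []
    for i in range(1, len(signal) - 1):
        is_local_min = signal[i] < signal[i - 1] and signal[i] <= signal[i + 1]
        if not is_local_min or signal[i] > max_value:
            continue
        if valleys and i - valleys[-1] < min_distance:
            if signal[i] < signal[valleys[-1]]:
                valleys[-1] = i
            continue
        valleys.append(i)
    return valleys

def split_steps(channels, valleys):
    """Cut every channel at the same valley indices so that each step has
    the same length on all axes."""
    return [
        {name: values[start:end] for name, values in channels.items()}
        for start, end in zip(valleys, valleys[1:])
    ]

# Hypothetical toy data: a triangle wave with a period of 20 samples stands in
# for the x-axis acceleration; a second channel is cut at the same indices.
acc_x = [abs((i % 20) - 10) - 5 for i in range(100)]
gyr_z = [float(i) for i in range(100)]
valleys = find_valleys(acc_x, min_distance=10, max_value=-3)
steps = split_steps({"acc_x": acc_x, "gyr_z": gyr_z}, valleys)
```

Detecting valleys on one axis and reusing the indices for all other channels is what keeps the step length consistent across the six signal components.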

Extraction of Features
All single steps detected by the signal decomposition are further processed to extract different features from the time and frequency domains. Table 4 presents a complete list of features extracted from the different components of accelerations and angular velocities. For each single step, the feature set consists of 50 features in total. Statistical features include: step length, step duration, average, standard deviation, global minimum, global maximum, root mean square and entropy. Energy features include the energy of the step. The maximum amplitude of the frequency spectrum of the signal is calculated using the fast Fourier transform (FFT). The step length and the step duration are only computed for the x-axis of the accelerations, as they remain the same in all other axes. All of the remaining features are computed for all 3D accelerations and 3D angular velocities. In Figure 2, the right-hand image presents a decomposed signal depicting a single step between the vertical dash-dot lines. Some of the extracted features are also shown, including the global minimum (marked with a square).

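Table 4 (excerpt). Energy and frequency-domain features; A and G denote the accelerometer and gyroscope signals.
Feature          Sensors   Axes      Count   Description
Signal energy    A, G      x, y, z   6       Energy of the step: $\sum_{n=1}^{N} |x[n]|^2$
Amplitude        A, G      x, y, z   6       Maximum amplitude of the frequency spectrum of the signal of the step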
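The per-step feature set can be sketched in plain Python as follows. Several details are assumptions because the text does not fix them: step length is taken as the number of samples, the standard deviation is the population form, entropy is the Shannon entropy of a 10-bin histogram, and the spectrum amplitude ignores the constant (DC) component. A direct discrete Fourier transform replaces the FFT for self-containment; the channel names are hypothetical.

```python
import cmath
import math
from collections import Counter

def shannon_entropy(values, bins=10):
    """Shannon entropy (bits) of a histogram of the values (assumed definition)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def max_spectrum_amplitude(values):
    """Largest DFT magnitude over the non-constant frequency bins."""
    n = len(values)
    best = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
        best = max(best, abs(coeff))
    return best

def axis_features(values):
    """The eight features computed for every axis of every sensor."""
    n = len(values)
    mean = sum(values) / n
    energy = sum(v * v for v in values)
    return {
        "mean": mean,
        "std": math.sqrt(sum((v - mean) ** 2 for v in values) / n),
        "min": min(values),
        "max": max(values),
        "rms": math.sqrt(energy / n),
        "entropy": shannon_entropy(values),
        "energy": energy,
        "amplitude": max_spectrum_amplitude(values),
    }

def step_features(step, sampling_rate_hz=128.0):
    """Two step-level features plus eight features for each of the six channels."""
    n = len(step["acc_x"])
    features = {"step_length": n, "step_duration": n / sampling_rate_hz}
    for axis, values in step.items():
        for name, value in axis_features(values).items():
            features[f"{axis}_{name}"] = value
    return features

# Hypothetical toy step: six channels of 64 samples each.
channel_names = ["acc_x", "acc_y", "acc_z", "gyr_x", "gyr_y", "gyr_z"]
step = {
    name: [math.sin(0.1 * (j + 1) * t) for t in range(64)]
    for j, name in enumerate(channel_names)
}
features = step_features(step)
```

The count works out as 2 + 8 × 6 = 50 features for the 6D case and 2 + 8 × 3 = 26 for accelerations or angular velocities alone, which matches the feature set sizes stated in the text.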

Classification of Features
Training and validation data were prepared for each sensor using the features extracted in the previous step. Three types of group classification tasks were performed: (1) gender classification; (2) height classification; and (3) age classification. Furthermore, training and validation data were also prepared for classification within participant subgroups for height and age classification. In Table 5, the characteristics of the population within different classification tasks are presented. For age and height classification, we chose classes based on the available data. Here, we tried to define meaningful thresholds for classes while keeping balanced populations for all classes. As the classifier, random forest [29] was chosen and trained on the training dataset with the following parameter values: number of trees = 400; maximum number of features for best split = 7. Two types of validation strategies were employed: stratified 10-fold cross-validation and subject-wise cross-validation. The 10-fold cross-validation was employed for all group and subgroup classification tasks, whereas the subject-wise cross-validation was employed for group classification tasks only.
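A training setup with these parameter values could look as follows in scikit-learn. This is only a sketch: the text does not name the software used, so the mapping of the two parameters to `n_estimators` and `max_features` is an assumption, and the data are random stand-ins for the real 50-dimensional step features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 steps x 50 features, two classes.
X = rng.normal(size=(200, 50))
y = np.repeat([0, 1], 100)
X[y == 1, :5] += 1.5  # make a few features informative

# 400 trees, at most 7 candidate features per split, as stated in the text.
clf = RandomForestClassifier(n_estimators=400, max_features=7, random_state=0)

# Stratified 10-fold cross-validation over the individual steps.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
```

For subject-wise validation, the same classifier could be evaluated with scikit-learn's `LeaveOneGroupOut` splitter, passing one group label per step.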
For each sensor in a classification task, the classifier was trained and validated for three different sets of features: (1) 3D accelerations (26 features); (2) 3D angular velocities (26 features); and (3) 6D accelerations and angular velocities (50 features). The 10-fold cross-validation was employed for all three sets of features, whereas the subject-wise cross-validation was employed for the third set of features (50 features) only. Finally, the classification rate, specificity, sensitivity and the positive predictive value (PPV) for each set of features were calculated as explained in [36]. The same approach was used for all group and subgroup classification tasks. The classification rate c, or classification accuracy, is given by Equation (1):

$$c = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

where TP, TN are the numbers of true positives and true negatives, respectively, and FP, FN are the numbers of false positives and false negatives, respectively.
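The reported measures can be computed from confusion-matrix counts as in the following sketch. It uses the standard definitions of classification rate, sensitivity, specificity and PPV, assuming these match the ones in [36]; the counts are hypothetical.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard measures derived from the four confusion-matrix counts."""
    return {
        "classification_rate": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
    }

# Hypothetical counts for 100 validation steps.
metrics = binary_metrics(tp=45, tn=40, fp=10, fn=5)
```

For the ternary tasks, the same counts can be obtained per class by treating that class as positive and the remaining two as negative.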

Results
In the following sections, we present the results of our investigations of the recorded gait data. Our classification results support a number of hypotheses regarding biometric and biographic characteristics of the human subjects. Specifically, the gender, the body height and the age of participants could be classified well. Each of the classification tasks was solved by training random forest classifiers, as introduced in the previous section.

Gender Classification
Our goal was to show that classification tasks regarding the gender of the trial subject can be performed sufficiently well by using the proposed sensors attached to each of the given locations.
H0: The gender can be identified by motion recordings of any of the employed sensors.
The results presented in Figure 3 show that the statement holds true for each of the four sensors individually. For each sensor, there are three different images visualizing the results of the binary classification, namely for the investigation of accelerations, of angular velocities, as well as of both combined. The confusion matrices encode the following information: each column represents the instances in one of the predicted classes, while each row represents the instances in the actual class (female/male). For the application of acceleration only, the classification rates are higher than 84.8% for each of the sensors. Classification results based on angular velocities show a lower classification rate, but still above 79.35%. The classification based on the combined features performs better than each of the individual feature sets, namely above 87%. More precisely, the results for the combined features are (listed by sensor in descending order of rates): chest (92.57%), lower back (91.52%), left ankle (89.96%), right wrist (87.16%). Table 6 presents 10-fold cross-validation results of gender classification, including correct classification accuracy, sensitivity, specificity, the positive predictive value (PPV) of each class and the average PPV of all classes. $PPV_{C_1}$ represents the PPV of the class $C_G^F$, and $PPV_{C_2}$ represents the PPV of the class $C_G^M$.

Body Height Classification
Another goal was body height classification from only accelerations, angular velocities and a combination of both.
H1: The body height can be identified by motion recordings of any of the employed sensors.
The results of the ternary classification for each individual sensor are given in Figure 4. Here, the classification estimated the assignment to three classes ($C_H^1$: height ≤ 170 cm; $C_H^2$: 170 cm < height < 180 cm; $C_H^3$: height ≥ 180 cm). A behavior similar to the gender classification was observed, where the classification based on the combined features of accelerations and angular velocities performs better than the individual ones. More precisely, the results for the combined features are (listed by sensor in descending order of rates): chest (89.05%), lower back (88.45%), left ankle (87.27%), right wrist (84.78%). Table 6 presents 10-fold cross-validation results of body height classification, including correct classification accuracy, sensitivity, specificity, the positive predictive value (PPV) of each class and the average PPV of all classes. $PPV_{C_1}$ shows the PPV of the class $C_H^1$; $PPV_{C_2}$ shows the PPV of the class $C_H^2$; and $PPV_{C_3}$ shows the PPV of the class $C_H^3$.

Age Classification
Another goal was age group classification from only accelerations, angular velocities and their combination.
H2: The age group of individuals can be identified by motion recordings of any of the employed sensors.
The results of the ternary classification for each individual sensor are given in Figure 5. Here, the classification estimated the assignment to three classes according to three age groups ($C_A^1$: age < 40; $C_A^2$: 40 ≤ age < 50; $C_A^3$: age ≥ 50) of participants. Similar to the previous classification tasks, the classification based on the combined features of accelerations and angular velocities performs better than the individual ones. More precisely, age classification results for the combined features are (listed by sensor in descending order of rates): lower back (88.822%), chest (88.818%), left ankle (85.74%), right wrist (83.50%). Table 6 presents 10-fold cross-validation results of age classification, including correct classification accuracy, sensitivity, specificity, the positive predictive value (PPV) of each class and the average PPV of all classes. $PPV_{C_1}$ represents the PPV of the class $C_A^1$; $PPV_{C_2}$ represents the PPV of the class $C_A^2$; and $PPV_{C_3}$ represents the PPV of the class $C_A^3$.

Figure 5. Confusion matrices of age classification computed with 10-fold cross-validation. Each column presents the sensor position (left to right): left ankle, lower back, chest and right wrist. 6D accelerations and angular velocities (50 features) were used for classification. $C_A^1$: age < 40; $C_A^2$: 40 ≤ age < 50; $C_A^3$: age ≥ 50.
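The class boundaries used in the ternary height and age tasks can be written down as a short sketch. The function names and the string labels are illustrative choices, not notation from the study.

```python
def height_class(height_cm):
    """Ternary height class: <= 170 cm, between 170 and 180 cm, >= 180 cm."""
    if height_cm <= 170:
        return "H1"
    if height_cm < 180:
        return "H2"
    return "H3"

def age_class(age_years):
    """Ternary age class: < 40, 40 to below 50, >= 50 years."""
    if age_years < 40:
        return "A1"
    if age_years < 50:
        return "A2"
    return "A3"

# Hypothetical participants as (height in cm, age in years).
participants = [(165, 35), (175, 45), (182, 61)]
labels = [(height_class(h), age_class(a)) for h, a in participants]
```

Note that the boundary values themselves fall into different classes for the two tasks: 170 cm belongs to the lowest height class, whereas an age of 40 belongs to the middle age class.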

Contribution of Individual Features to Classification Results
The contribution of each of the employed features in all three classification tasks was homogeneous in the sense that there is not one outstanding feature with a major contribution to the classification results. In all experiments, we made the following observation: in sum, accelerations contributed more to the overall results than angular velocities. However, the combination of the two feature types did better than accelerations or angular velocities individually. The permutation-based variable importance measures of random forests were used to evaluate the contribution of individual features to the overall classification results. For further details, refer to the works of Breiman [29] and Louppe et al. [37].
In detail, the classification results related to sensors at different locations can depend on quite different feature sets. In the following, we will give an overview of the most important contributors for each of the locations.
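The idea behind permutation-based importance can be sketched as follows: shuffle one feature column, re-evaluate the classifier, and record the loss in accuracy. This is a simplified, model-agnostic variant evaluated on a fixed dataset; random forests typically compute the measure on out-of-bag samples. The toy model and data are hypothetical.

```python
import random

def accuracy(predict, X, y):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def predict(row):
    """Toy model: the prediction depends on feature 0 only."""
    return int(row[0] > 0.5)

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]
importances = permutation_importance(predict, X, y)
```

In this toy setup, shuffling the unused second feature leaves the accuracy unchanged, so its importance is zero, while shuffling the first feature removes most of the predictive power.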

Gender Classification
For the location at the chest, angular velocities (around the y-axis, i.e., transverse axis) contributed most, especially the standard deviation, max, energy, and RMS. These are related to the rotation of the upper body around a horizontal axis over the course of the motion. Note that this does not contradict the overall observation that, in sum, accelerations contributed more than angular velocities. Furthermore, the amplitude of the accelerations along the x-axis, i.e., the cranio-caudal axis, is of high importance. For the lower back, the most important features are associated with acceleration of the z-axis. This corresponds to changes in the velocity of the hip movement within the sagittal plane, i.e., front to back. In addition, angular velocities associated with the z-axis, i.e., rotation around the anteroposterior axis (swinging of hips), contribute significantly to the results. Furthermore, the amplitude of the accelerations along the x-axis, i.e., the cranio-caudal axis, is also of high importance. For the right wrist, features associated with acceleration along the y- and z-axes are top contributors. In particular, minimum, maximum and entropy acceleration values associated with dorso-ventral, as well as lateral movement of the hand play a more important part in the classification. Furthermore, the RMS and energy of angular velocities associated with the z-axis are important. This is also linked to the swinging of the hand in the lateral direction.
For the ankles, the contribution of accelerations along each axis is generally higher compared to the contribution of other single features. Figure 6 shows bar graphs of the features' importance computed during gender classification. The graphs present a comparison of the importance of each feature (as a percentage) with respect to different sensor positions. In general, all features contribute significantly to the classification task. An overview of contribution percentages where the most important features are highlighted is given in Table 7.

Body Height Classification
For the location at the chest, accelerations along the z-axis contributed most, especially the mean, minimum, maximum and energy. These are associated with the motion of the upper body in the dorso-ventral direction. Furthermore, the minimum accelerations along the x-axis, i.e., the cranio-caudal axis, are of importance.
For the lower back, the most important features are associated with acceleration of the z-axis, especially the mean, maximum, RMS and energy. This corresponds to changes in the velocity of the movement of the hips within the sagittal plane, i.e., front to back. In addition, the minimum of the accelerations in the x-axis contributes significantly to the results. These are linked to the movement of the hips along the cranio-caudal axis (up and down). For the right wrist, features associated with acceleration along each of the three axes contribute significantly. In particular, maximum, RMS and energy values associated with dorso-ventral movement of the hand play a more important part. For the ankles, the contribution of accelerations along each axis is also generally high. Additionally, angular velocities associated with the rotation of the feet from side to side (around the z-axis) are significant contributors. Figure 7 shows bar graphs of the feature contribution computed during body height classification. The graphs present a comparison of the importance of each feature (as a percentage) with respect to different sensor positions. In general, all features contribute significantly to the classification task. An overview of the contribution percentages where the most important features are highlighted is given in Table 8.

Age Classification
For the location at the chest, the importance of the features is distributed similarly to the height classification results: accelerations along the z-axis contributed most, especially the mean, maximum, RMS and energy. These are associated with the motion of the upper body in the dorso-ventral direction. Furthermore, the minimum acceleration along the x-axis, i.e., the cranio-caudal axis, is important. For the lower back, the most important features are associated especially with acceleration of the z-axis. This is similar to the results found in the height classification scenario and corresponds to changes in the velocity of the movement of the hips within the sagittal plane, i.e., front to back. For the right wrist, features associated with acceleration along each of the three axes contribute significantly. Additionally, the minimum angular velocity associated with rotation around the z-axis, i.e., swinging laterally, is important. For the ankles, the contribution of features associated with lateral acceleration is high. Additionally, angular velocities associated with swinging of the feet from side to side (around the z-axis), as well as rolling over from heel to toes (rotation around the y-axis) are significant contributors. Figure 8 shows bar graphs of the features' importance computed during age classification. The graphs present a comparison of the importance of each feature (as a percentage) with respect to different sensor positions. In general, all features contribute significantly to the classification task. An overview of contribution percentages where the most important features are highlighted is given in Table 9.

Classification Results Based on Restriction to Subgroups
Since the correlation between body height and gender is very high (on average, men are taller than women), we performed a gait-based classification task on each of the groups of female and male participants in order to present height classification results that are independent of this particular phenomenon. Moreover, we also performed age classification on the data of each subgroup (female vs. male) separately. The number of subjects present in the study did not allow for ternary classification of subgroups (see Table 5 for the population characteristics). Therefore, there were two different classes in the height-related experiment: $C_H^1$ = the body height of the subject is less than or equal to $t_h$ cm; $C_H^2$ = the body height of the subject is greater than $t_h$ cm ($t_h$ = 180 for male, $t_h$ = 170 for female subjects). In the age-related experiment, the assigned classes were: $C_A^1$ = the subject is less than or equal to $t_a$ years old; $C_A^2$ = the subject is greater than $t_a$ years old ($t_a$ = 40 for male, $t_a$ = 50 for female subjects). Table 10 shows an overview of the results. The classification rate is higher than 90% in all but two cases (89.34% and 87.97% for the right wrist sensor in both female groups). The results also show balanced sensitivity, specificity, positive predictive value (PPV) of each class and average PPV of all classes. For body height classification, $PPV_{C_1}$ represents the PPV of the class $C_H^1$, and $PPV_{C_2}$ represents the PPV of the class $C_H^2$. For age classification, $PPV_{C_1}$ shows the PPV of the class $C_A^1$, and $PPV_{C_2}$ shows the PPV of the class $C_A^2$.

Subject-Wise Cross-Validation
In order to show that our results are not caused by over-fitting the classification to specific subjects rather than learning the properties we are looking for (gender, height, age), a subject-wise cross-validation model was also employed (as explained in Section 2.8). Table 11 presents the classification results of subject-wise cross-validation for all three group classification tasks: gender, height and age. The feature set contained all features of 6D accelerations and angular velocities (50 in total). For each sensor position, sensitivity, specificity, the PPV of each class and the average PPV of all classes were also computed. A comparison of the classification results of group classification tasks using 10-fold cross-validation and subject-wise cross-validation for chest (CH), lower back (LB), right wrist (RW) and left ankle (LA) is presented in Figure 9. The 10-fold cross-validation yields higher classification rates than subject-wise cross-validation in all cases. In the case of gender classification using chest and lower back sensors, the classification rates are 7.08% and 6.37% lower than with 10-fold cross-validation. For right wrist and left ankle sensors, the classification rates are 8.26% and 12.83% lower than with 10-fold cross-validation. In the case of height classification using chest and lower back sensors, the classification rates are 6.18% and 6.07% lower than with 10-fold cross-validation. For right wrist and left ankle sensors, the classification rates are 12.18% and 19.50% lower than with 10-fold cross-validation.
For the age classification task, a sharp decline in the classification rates is observable in subject-wise cross-validation. For chest and lower back sensors, the classification rates are 20.28% and 16.82% lower than with 10-fold cross-validation. For right wrist and left ankle, the classification rates are 21.51% and 21.79% lower than with 10-fold cross-validation. The main reason for such a sharp decline is the unbalanced population in classes $C_A^1$, $C_A^2$ and $C_A^3$, with a subject ratio of 9:6:11. On the level of subject-wise cross-validation, it is also possible to address the question of the invariance of the features across the different steps of a walking sequence, or to come up with random forest regressions for age and height. Not surprisingly, almost all steps of one walking sequence were classified identically: 99.1% for gender classification, 98.7% for height classification and 98.4% for age classification. When performing a random forest regression instead of a classification, we obtained age estimates with an average RMS error of about 11.51 years and height estimates with an average RMS error of about 9.14 cm.
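The RMS error reported for the regression experiments can be computed as in this minimal sketch. The height values are hypothetical and serve only to illustrate the calculation.

```python
import math

def rms_error(true_values, predicted_values):
    """Root mean square error between true and predicted values."""
    if not true_values or len(true_values) != len(predicted_values):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_values, predicted_values)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical true and predicted body heights in cm.
true_heights_cm = [160.0, 170.0, 180.0, 190.0]
predicted_heights_cm = [165.0, 170.0, 175.0, 180.0]
error_cm = rms_error(true_heights_cm, predicted_heights_cm)
```

Because the errors are squared before averaging, a single large deviation (here 10 cm for the tallest subject) dominates the result more than several small ones.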

Summary of Findings
The general problem we tackled is the estimation of soft biometric information from one single step recorded by one inertial sensor. We did so by solving different classification tasks based on the motion data of human walking steps represented by accelerations and angular velocities. Data were recorded by one sensor placed at various locations on the human body, namely the chest, the lower back, the wrist and the ankle. The results show that these classification tasks can be solved well by using accelerometers and/or gyroscopes at any of the given locations. The classification rates were highest for sensors located at the lower back and chest in each of the experiments, but still convincingly high when the sensor is attached to the wrist or ankle.
Our analysis of the feature sets used in each of the experiments has made clear that there is not one feature mainly responsible for any of the distinctions necessary for a classification. However, the feature importance in each of the classifications gave pointers as to what combination of features produces the best results. The most important finding was that angular velocities did not perform better than accelerations, while the combination of both performed best.

Comparison with Existing Research
It is not surprising that information about the gender can be recovered by analysis of chest or lower back movement. The effects of marker placement and viewpoint selection for recording locomotion are discussed extensively in the works of Troje [2], as was the high relevance of hip movement for gender classification by human observers. However, we have presented new findings, namely that accelerations associated with wrist and ankle movement alone allow for classification of gender, as well. To our knowledge, we are also the first to show that classification of height and age groups is possible from non-visual features. This is as yet done by solely relying on image-or video-based features. Makihara et al. [38] introduce a paper on gait-based age estimation by Gaussian process regression on silhouette-based features of bodies (contrary to face-based age estimation, as presented by Stewart et al. [39]). Their investigation was based on standard resolution video data. They have constructed a whole-generation database of over 1000 individuals, their age ranging from two to 94.
Our starting point clearly differs from this in terms of sensor modalities. Commercial smartphones and wearables now offer an attractive opportunity to monitor biometric properties. Mobile phones and smart devices are a convenient platform for recording information in an everyday setting. Our experiments have shown that information recorded by a single sensor, such as the one in a smart device, suffices for the estimation of basic soft biometric properties. In particular, the wrist was an important sensor location to test, because smart devices are commonly worn there.
Estimating biometric properties based on motion data makes sense in a number of different scenarios. In some of them, the focus may be on hard biometric properties in order to facilitate online identity checks and close security gaps. A number of previous works have shown that identification and authentication problems can be solved by classification of motion data acquired by mobile devices. Derawi and Bours [40] show that recognition of specific users can be done in real time based on data collected by mobile phones. Their method can correctly identify enrolled users by learning templates from different walking trials.
On the other hand, attention may be directed to soft biometric properties. Monitoring health or preventing and curing injury are use cases that represent this idea. Previous works have shown that accelerometers are well suited for detection and recognition of events and activity. In their paper on sensory motor performance, Albert et al. [41] discuss a new method to classify different types of falls in order to rapidly assess the cause and necessary emergency response. They present very good results classifying accelerometer data acquired by commercial mobile phones, which were attached to the lower backs of test subjects. In their comparative evaluation of five machine learning classifiers, support vector machines performed best, achieving accuracy values near 98%. Classification by decision trees performed only second best in their experiments, at 94% to 98% accuracy for fall detection and at 98% to 99% accuracy for fall type classification. In their paper on gait pattern classification, Von Tscharner et al. [42] even conclude that a combination of PCA, SVM and ICA is most reliable when dealing with high intra- and inter-subject variability. However, in their survey on mobile gait classification, Schneider et al. [43] make an attempt to settle the disagreement about suitable classification algorithms. In their study, they conclude that random forest is best suited for the classification of gait-related properties. In our setup, we decided to use random forest in order to produce comparable results. One additional benefit of this choice is that only a small number of parameters have to be chosen. Furthermore, the random forest method makes it possible to compute the importance of each feature for the overall classification. This helped us to compare the features' importance for each sensor position in the different classification tasks.
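As a rough illustration of how a random forest yields per-feature importance values, the following sketch trains a forest on synthetic data and ranks the features by impurity-based importance. It assumes scikit-learn and NumPy are available. The data, the six-column feature layout, the feature names and the forest size are hypothetical placeholders and do not reproduce the features or settings used in our experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names: one summary value per sensor axis.
FEATURE_NAMES = ["acc_x", "acc_y", "acc_z", "gyr_x", "gyr_y", "gyr_z"]


def rank_features(X, y, n_trees=100, seed=0):
    """Fit a random forest and return (name, importance) pairs, most important first."""
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y)
    # feature_importances_ holds impurity-based importances, normalized to sum to 1.
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(FEATURE_NAMES[i], float(importances[i])) for i in order]


# Synthetic stand-in for per-step features: the binary label depends only on acc_x.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, len(FEATURE_NAMES)))
y = (X[:, 0] > 0).astype(int)

ranking = rank_features(X, y)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

In this toy setting, the forest should attribute most of the importance to the one informative feature. On real step data, where several features carry partial information, the ranking is typically much flatter.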

Limitations
Since our database is much smaller than the one introduced by Makihara et al. [38] and the variety of biometric features was also smaller (e.g., age covered only three decades), our experiments can only serve as proof of concept for now. Testing classifiers of non-image-based features on a larger database comprising wider ranges of biometric properties is a direction for future work.
Another limitation of our database is that it only consists of data from patients with complaints of back pain. It would be worthwhile to perform further experiments recording data of participants without back pain (control group). Classification tasks can then be performed for the patient group, the control group and a combination of both.
One noteworthy limitation we had to face in our experiments is a possible uncertainty of sensor placement. Irrespective of how carefully each involved sensor is placed, the accuracy of placement depends on physical characteristics of test subjects, which may vary between individuals to some extent.

Conclusions and Future Work
We have classified biometric information based on the data of a single inertial measurement unit collected during a single step. As a novel empirical finding, we have shown that single steps of normal walking already reveal biometric information about gender, height and age quite well, not only for measurements of lower back or chest movements, but also for wrist or ankle movements. Using standard 10-fold cross-validation, the classification rates were as follows: gender classification, 87.16% (right wrist sensor) to 92.57% (chest sensor); height classification, 84.78% (right wrist sensor) to 89.05% (chest sensor); age classification, 83.50% (right wrist sensor) to 88.82% (chest, lower back sensor). When using the rather strict subject-wise evaluation, the classification rates for gender are lower by 6.37% (lower back sensor) to 12.83% (left ankle sensor) compared to the results of 10-fold cross-validation. For height classification, the classification rates using subject-wise evaluation are 6.07% (lower back sensor) to 19.50% (left ankle sensor) lower, and for age classification, 16.82% (lower back sensor) to 21.79% (left ankle sensor) lower. These values can be seen as "lower bounds" on the classification rates that biological variation permits, since neither our feature selection nor the machine learning techniques we used are necessarily optimal. In particular, a good estimate of the direction of gravity should improve the results; at sensor positions with less change in orientation (chest, lower back), the classification rates were better than at those with more change (wrist, ankle). In future work, we will try to adopt a model-based estimate of body-part orientation using techniques similar to the ones used in [17] to come up with such estimates.
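The gap between the two validation schemes is easier to see with a small sketch of how the splits differ. The code below assumes that subject-wise evaluation holds out all steps of one subject per fold (a leave-one-subject-out reading, which is our assumption for illustration), and compares it with a shuffled 10-fold split over individual steps. The function names and the toy subject labels are hypothetical.

```python
import random
from collections import defaultdict


def subject_wise_folds(subject_ids):
    """Yield (train_idx, test_idx) pairs, holding out all steps of one subject per fold."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = [i for s in sorted(by_subject) if s != held_out for i in by_subject[s]]
        yield train_idx, test_idx


def k_fold(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split over individual steps."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    for fold in range(k):
        test_idx = indices[fold::k]
        held = set(test_idx)
        train_idx = [i for i in indices if i not in held]
        yield train_idx, test_idx


def shares_subjects(subject_ids, train_idx, test_idx):
    """Return True if any subject contributes steps to both the training and the test set."""
    train_subjects = {subject_ids[i] for i in train_idx}
    return any(subject_ids[i] in train_subjects for i in test_idx)


# Hypothetical toy data: 4 subjects with 5 steps each.
subject_ids = [s for s in ("A", "B", "C", "D") for _ in range(5)]

leak_subject_wise = any(
    shares_subjects(subject_ids, tr, te) for tr, te in subject_wise_folds(subject_ids)
)
leak_k_fold = any(
    shares_subjects(subject_ids, tr, te) for tr, te in k_fold(len(subject_ids), k=10)
)
print(leak_subject_wise, leak_k_fold)
```

On this toy data, the subject-wise split never places steps of the same subject in both sets, whereas the step-level 10-fold split does. A classifier evaluated with the latter may therefore benefit from having seen other steps of the test subject, which is one plausible reason why subject-wise rates are lower.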
On the side of basic science questions about human movement control, we want to address in future work the degree to which movement patterns can be "spoofed" by trained and untrained persons. We will perform tests asking participants to try to walk like the other gender, or to pretend to have another age or another height, etc.
On the technological side, our work should help to gain information about the user from smartwatches, smartphones or smart shoes, given that many sensor systems in consumer electronics are limited: to save battery life, long-term recordings can be made at low frame rates only, and high-speed measurements only for a limited amount of time. Thus, it is becoming more and more important to extract information from sparse sensor readings. Our work presents a technique by which biometric parameters can be estimated from single steps. These biometric parameters can be used for further analysis of motions that are recorded at lower frame rates. Compared to previous work, where full sequences are considered for classification, we see this as a strong improvement.
However, our work also demonstrates the sensitivity of the sensor data of such devices with respect to privacy concerns: even a single step recorded by a smartphone or smartwatch reveals personal information on gender, height and age.