1. Introduction
With wearable technology advancing rapidly, billions of smart devices are now equipped with built-in motion sensors such as accelerometers and gyroscopes. These sensors can log the body motion of users, which makes them a useful tool for research communities studying motion sensing. A growing number of researchers have used the motion characteristics of the human body for various tasks, ranging from activity recognition [
1,
2,
3,
4,
5,
6], gesture categorization [
7], clinical condition monitoring [
8], BMI prediction [
9], to user gait recognition [
10,
11,
12,
13]. In particular, identity recognition based on the dynamics of the walking pattern is a promising technique for preventing the use of smart devices, and of other systems linked with them, without the owner’s permission. However, the quality of gait-based biometric systems is strongly influenced by the footwear the subject is wearing.
Previous works [
14,
15] studied the gait changes related to the different shoes worn by the subjects. Their experiments were carried out with four kinds of shoes of different weights. They found that heavy footwear reduces discrimination and that the sideways motion of the foot has more discriminating power than the up-down or forward-backward motion. Meanwhile, according to previous papers on exercise physiology, heel height is also an important parameter of human gait. A recent survey [
16] listed five main open problems for gait recognition, one of which is the variety of shoes. Walking requires ongoing, finely tuned interactions between muscular and tendinous tissues. Wearing high heels (HH) puts extra stress and pressure on various parts of the human body, which can affect the subject’s natural gait [
17]. Intuitively, an increase in heel height is expected to decrease the subject’s walking speed and stride length.
Although footwear alters gait, only a few studies address footwear recognition. Existing methods normally use an RGB camera [
18], a dedicated motion capture system [
19], ground reaction force sensors [
20], or a Microsoft Kinect sensor [
21], all of which are confined to the laboratory. To our knowledge, there is no research on footwear recognition in daily-life scenarios, and none on HH, which about 37% to 69% of American women frequently wear [
22]. Moreover, HH alone are divided into many categories by heel height, as shown in
Table 1 and
Figure 1.
Therefore, motion sensor-based footwear recognition from gait characteristics in daily life remains an open challenge. One major difficulty is that the daily-life walking environment is highly dynamic and includes a variety of factors that can directly or indirectly introduce variations into gait patterns. For example, the clothes the individual is wearing, different walking surfaces, slopes, and obstacles on the road can all contribute to gait changes besides footwear.
To gauge the difficulty of the task, we visualize the raw signals, as shown in
Figure 2. A participant of medium build (average weight) was asked to walk back and forth three times on the same surface, each time wearing different shoes (flat, mid HH, and ultra HH) with a smartphone placed on her waist. The visualized data show that the gait with ultra HH (9.8 cm) differs markedly from the other two conditions. Its acceleration component has a sharper peak, and, most notably, the angular velocity shows a lateral rotation lasting for one second. We believe that this is due to the reduced stability and the shift in the center of gravity caused by the ultra HH.
Inspired by the success of deep neural networks, several recent works apply them to motion sensor-based recognition, such as Convolutional Neural Networks (CNNs) [
3,
5,
23], which are well suited to capturing the local characteristics of multi-channel signals, and Recurrent Neural Networks (RNNs) [
24] and their variant, Long Short-Term Memory (LSTM) units [
1,
25], which are designed to extract the temporal dependencies and incrementally learn information over time.
Recently, combining CNNs and LSTMs in a unified stacked framework has produced state-of-the-art results in sensor-based recognition [
7]. In our previous study, we developed a hybrid deep neural network [
9] for gait analysis using data captured from built-in motion sensors in smartphones. The hybrid deep neural network overcomes the challenge of environmental factors. In this study, we extend that work by adding an attention mechanism to the previous model, and we test its performance by investigating gait changes related to footwear. The extensions introduced in this study and the major contributions of this paper are summarized in three points:
- (1)
To the best of our knowledge, we are the first to recognize the subject’s footwear by the dynamics of gait changes acquired from smartphone sensors in daily life. We categorize the shoes into 3 classes by the height of the heels (flat, mid HH and ultra HH). We propose Sensing-HH, a novel deep attention model, which can automatically learn a hierarchical feature representation and the infinite temporal contexts from raw signals through the hybrid net structures. It also has the ability to implicitly learn to suppress irrelevant parts in the raw signals and to highlight salient features useful for this specific task by adding the attention mechanism.
- (2)
We established a dataset of 35 young women wearing 3 kinds of shoes. Each participant was asked to walk for 4 min on a flat surface while 3 smartphones recorded simultaneously: one held in her hand, one attached to her waist, and one placed in her handbag.
- (3)
We conducted comprehensive experiments on this dataset to evaluate the proposed Sensing-HH model. Under cross-validation, our model achieved competitive performance, with a mean F1-score of 0.827 across the classes when the smartphone was attached to the waist. Meanwhile, the F1-score for the ultra HH class was as high as 0.91.
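As a point of reference for contribution (3), the following is a minimal sketch of how a mean F1-score over the three footwear classes could be computed. It assumes the mean is an unweighted (macro) average of per-class F1-scores, which this section does not state explicitly; the function names and the toy labels are hypothetical and are not taken from our experiments.

```python
def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, treating each label one-vs-rest."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Toy labels for illustration only (not experimental data).
labels = ["flat", "mid", "ultra"]
y_true = ["flat", "flat", "mid", "mid", "ultra", "ultra"]
y_pred = ["flat", "mid", "mid", "mid", "ultra", "ultra"]
print(per_class_f1(y_true, y_pred, labels))
print(macro_f1(y_true, y_pred, labels))
```

With a macro average, each class contributes equally regardless of how many windows it contains, which is why a strong per-class score for one class can coexist with a lower mean.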
The remaining part of this paper is structured as follows:
In
Section 2, we briefly review the related literature. In
Section 3, we present how the dataset was established. In
Section 4, we describe Sensing-HH, a deep attention model. Experimental results, including comparisons with the baseline methods, are presented in
Section 5.
Section 6 gives the conclusions.
3. Dataset
To the best of our knowledge, no existing dataset specifically targets motion sensor-based gait recognition of HH wearing in a daily environment. In this section, we describe our strategy for collecting the motion sensor data used to build the dataset.
3.1. Participants Selection
We recruited female participants who wear HH at least 5 days a week, for an average of 12 h a day (including walking, sitting, and standing). To keep other factors such as age, height, and weight from affecting the results, we selected 35 subjects aged 19 to 27 with similar builds. Participant details are as follows: age: 23 ± 4 years; height: 164.3 ± 12.4 cm; mass: 51.8 ± 7.6 kg. Each participant was informed of the aim of the experiment and the measuring method beforehand, and all of them signed a consent form to participate in the study. Prior to the gait measurement, we conducted a short survey about their preferred types of footwear and how frequently they wear HH. Two-thirds of the participants answered that they preferred flat shoes in their day-to-day life. One-third preferred high-heeled shoes, even with heels more than 8 cm in height. All of them wore 3 kinds of shoes (flat, mid HH, and ultra HH) for this study.
3.2. Data Collection
All of the motion sensor data were recorded by a logging application on 3 different Android smartphones (Samsung Galaxy S10, Samsung Galaxy Note8, and Smartisan Pro2).
Table 2 summarizes sensor specifications for the devices.
The tri-axial accelerometer and the tri-axial gyroscope are the motion sensors built into the smartphones we used. The accelerometer measures the acceleration (including gravity) along the X, Y, and Z axes of the smartphone. The gyroscope captures the angular velocity of the smartphone as it rotates in space. Both reflect the gait characteristics of the smartphone user.
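Because the accelerometer readings include gravity, it can be useful to see how the gravity component might be separated from the motion component. The sketch below uses a simple exponential low-pass filter, a common technique for this purpose; it is an illustration only and is not the preprocessing used for Sensing-HH. The function names and the smoothing factor `alpha` are assumptions.

```python
import math


def split_gravity(samples, alpha=0.8):
    """Split raw accelerometer samples into gravity and linear parts.

    samples: list of (x, y, z) readings in m/s^2, gravity included.
    alpha:   hypothetical smoothing factor of the low-pass filter (0..1).
    Returns a list of (gravity, linear) pairs, each an (x, y, z) tuple.
    """
    if not samples:
        return []
    gravity = list(samples[0])  # start the estimate from the first reading
    result = []
    for sample in samples:
        # Low-pass filter: the slowly varying part approximates gravity.
        gravity = [alpha * g + (1 - alpha) * s for g, s in zip(gravity, sample)]
        # What remains after removing gravity is the motion-induced part.
        linear = [s - g for s, g in zip(sample, gravity)]
        result.append((tuple(gravity), tuple(linear)))
    return result


def magnitude(vec):
    """Euclidean norm of a tri-axial reading."""
    return math.sqrt(sum(c * c for c in vec))


# A phone lying still: the linear part stays close to zero.
still = [(0.0, 0.0, 9.81)] * 5
gravity, linear = split_gravity(still)[-1]
print(magnitude(gravity), magnitude(linear))
```

The magnitude is orientation independent, which can help when the phone's pose differs between the hand, waist, and handbag positions.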
All of the participants were asked to walk for 4 min on flat ground, as shown in
Figure 3. The 3 smartphones described above served as the recording devices and were held in their hands, attached to their waists, and placed in their handbags, respectively, as shown in
Figure 4.
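A 4 min recording is usually cut into short fixed-length segments before it is fed to a recognition model. The sketch below shows one way to do this with overlapping sliding windows. The sampling rate (50 Hz), window length (2 s), and overlap (50%) are hypothetical values chosen for illustration; they are not specified in this section.

```python
def sliding_windows(signal, window_size, step):
    """Cut a sequence of samples into fixed-length, possibly overlapping windows.

    signal:      list of per-sample readings (any type).
    window_size: number of samples per window.
    step:        number of samples between the starts of consecutive windows.
    Trailing samples that do not fill a whole window are dropped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    last_start = len(signal) - window_size
    return [signal[i:i + window_size] for i in range(0, last_start + 1, step)]


# Hypothetical settings: 4 min at 50 Hz, 2 s windows, 50% overlap.
sampling_rate_hz = 50
recording = list(range(4 * 60 * sampling_rate_hz))  # stand-in for real samples
windows = sliding_windows(recording, window_size=2 * sampling_rate_hz,
                          step=sampling_rate_hz)
print(len(windows))  # 239 windows under these assumed settings
```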
6. Conclusions
In summary, we developed Sensing-HH for footwear recognition based on daily-life gait data captured by the built-in motion sensors of smartphones. To the best of our knowledge, we are the first to recognize the subject’s footwear by the dynamics of gait changes acquired from smartphone sensors in daily life. We categorize the shoes into 3 classes by the height of the heels (flat, mid HH, and ultra HH). Sensing-HH is a novel deep attention model that can automatically learn a hierarchical feature representation and the infinite temporal contexts from raw signals through the hybrid net structures. By adding the attention mechanism, it also implicitly learns to suppress irrelevant parts of the raw signals and to highlight salient features useful for this specific task. We used a daily-life gait dataset to evaluate the performance of Sensing-HH and other baseline models. Compared with three existing deep neural networks and two shallow models, Sensing-HH performed significantly better in most scenarios.
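The idea of suppressing irrelevant parts of a signal and highlighting salient ones can be illustrated with a generic dot-product attention pooling step. The sketch below is not the Sensing-HH architecture; it assumes a sequence of per-time-step feature vectors (for example, recurrent layer outputs) and a hypothetical learned `query` vector, and shows how softmax weights turn them into a single weighted summary.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its similarity to a query and sum.

    hidden_states: list of T feature vectors of equal length.
    query:         hypothetical learned vector of the same length.
    Returns (weights, context): T attention weights and the pooled vector.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Two time steps: the query favors the first, so it dominates the summary.
states = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attention_pool(states, query=[10.0, 0.0])
print(weights, context)
```

Time steps whose features align poorly with the query receive weights near zero, so they contribute little to the pooled representation.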
The results show that the proposed model can make footwear recognition more efficient and automated. It can also be applied to a large population, as it only requires data from smartphones and can accurately recognize footwear from daily-life gait data with no restriction on the location of the measuring device. Sensing-HH has the potential to extend the use of motion sensor data, for example by helping to build a robust biometric system that includes gait pattern analysis. Future studies will focus on accurately recognizing footwear in a dataset whose subjects cover a wider range of heights and weights, so that the model can work in scenarios even closer to reality.