Design and Implementation of Foot-Mounted Inertial Sensor Based Wearable Electronic Device for Game Play Application

Wearable electronic devices have developed rapidly with advances in the semiconductor industry and have received growing attention during the last decades. This paper presents the development and implementation of a novel inertial sensor-based foot-mounted wearable electronic device for a brand new application: game playing. The main objective of the introduced system is to monitor and identify the human foot stepping direction in real time and to map these motions to player operations in games. The proposed system extends the application field of currently available wearable devices and introduces a convenient and portable medium for performing exercise in a more compelling way in the near future. This paper provides an overview of previously developed system platforms, introduces the main idea behind this novel application, and describes the implemented human foot moving direction identification algorithm. Practical experimental results demonstrate that the proposed system is capable of recognizing five foot motions (jump, step left, step right, step forward, and step backward) and achieves over 97% accuracy for different users. The functionality of the system for real-time application has also been verified through practical experiments.


Introduction
In recent years, with the rapid development of MEMS (Micro-Electro-Mechanical System) technology, the inertial sensor production has made a leap forward in terms of chip-size minimization, low-cost manufacturing, low-power consumption, and simplification in operation. Due to these advancements, various types of inertial MEMS sensors have been adapted for multiple applications, such as vehicles and personal navigation [1], motion tracking systems [2], and consumer electronic devices (smartphones) [3]. The wearable electronic devices, which emerged during the last few years, also utilize the low-cost MEMS inertial sensor, and are becoming more attractive in the consumer market.
Wearable electronic devices refer to electronic technologies or devices that are incorporated into items of clothing and accessories and can be comfortably worn [4]. Generally, these devices can perform communications and allow the wearers to access their activity and behavior information. The foot-mounted inertial sensor based electronic device is one common type and has attracted attention for further study, development and implementation. The application fields of foot-mounted wearable devices can mainly be categorized into pedestrian navigation, human daily or sports activity recognition, and medical applications.

System Architecture
The proposed system architecture is shown in Figure 2. In this system, the foot motion dynamic data are captured by the inertial sensor and then wirelessly transmitted to various kinds of terminals (i.e., smartphones, tablets, computers, and smart TVs) through Bluetooth 4.0. The software, which is compatible with different platforms, receives the data, performs the step motion detection algorithm, and interacts with the games. Both the hardware and the software platform are included in this system and are described as follows.

Hardware Platform
The system hardware platform mainly combines a CC2540 microprocessor (Texas Instruments, Dallas, TX, USA), an MPU9150 9-axis inertial sensor (InvenSense, Sunnyvale, CA, USA), and other necessary electronic components. The CC2540 [33] features a high-performance, low-power 8051 microcontroller and a 2.4 GHz Bluetooth low energy System on Chip (SoC). It can run both the application and the BLE (Bluetooth Low Energy) protocol stack, so it is compatible with multiple mobile devices (i.e., smartphones and tablets). The MPU9150 [34] is an integrated nine-axis MEMS motion tracking device that combines a three-axis gyroscope, a three-axis accelerometer, and a three-axis magnetometer. Figure 3 shows the system hardware platform. In our system, the three tasks of the hardware platform are to read the inertial sensor data through the I2C interface at a pre-set sampling frequency (200 Hz), to package the data in a pre-defined user protocol, and to send the data via Bluetooth to the host.
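The pre-defined user protocol is not specified in the paper. As an illustrative sketch only — the sync bytes, word order, and checksum scheme below are assumptions, not the actual protocol — one 200 Hz sample of the nine 16-bit sensor words could be framed and parsed like this:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical frame layout: 2 sync bytes, nine little-endian int16 words
// (accel x/y/z, gyro x/y/z, mag x/y/z), and a 1-byte additive checksum.
struct ImuSample {
    int16_t acc[3];
    int16_t gyr[3];
    int16_t mag[3];
};

std::vector<uint8_t> packFrame(const ImuSample& s) {
    std::vector<uint8_t> f = {0xAA, 0x55};          // sync bytes (assumed)
    int16_t words[9];
    std::memcpy(words,     s.acc, sizeof s.acc);
    std::memcpy(words + 3, s.gyr, sizeof s.gyr);
    std::memcpy(words + 6, s.mag, sizeof s.mag);
    uint8_t sum = 0;
    for (int16_t w : words) {
        uint8_t lo = static_cast<uint8_t>(w & 0xFF);
        uint8_t hi = static_cast<uint8_t>((w >> 8) & 0xFF);
        f.push_back(lo);
        f.push_back(hi);
        sum = static_cast<uint8_t>(sum + lo + hi);
    }
    f.push_back(sum);                               // checksum over payload
    return f;
}

bool unpackFrame(const std::vector<uint8_t>& f, ImuSample& out) {
    if (f.size() != 21 || f[0] != 0xAA || f[1] != 0x55) return false;
    uint8_t sum = 0;
    for (size_t i = 2; i < 20; ++i) sum = static_cast<uint8_t>(sum + f[i]);
    if (sum != f[20]) return false;                 // corrupted frame
    int16_t words[9];
    for (int i = 0; i < 9; ++i)
        words[i] = static_cast<int16_t>(f[2 + 2 * i] | (f[3 + 2 * i] << 8));
    std::memcpy(out.acc, words,     sizeof out.acc);
    std::memcpy(out.gyr, words + 3, sizeof out.gyr);
    std::memcpy(out.mag, words + 6, sizeof out.mag);
    return true;
}
```

The checksum lets the host reject frames corrupted in transmission, which matters over a lossy Bluetooth link at 200 Hz.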

Sensors 2016, 16, 1752

Due to the diversity of shoe styles, sensor mounting manners, and user habits, system robustness and algorithm compatibility are further challenges to overcome.



Software Platform
The system software platform is developed in C++ in Visual Studio. The main functions of the software are to receive and decode data, log the user's motion data, calculate the attitude, run the foot motion detection algorithm, and map the detected motions to game controls. For real-time processing, a multi-threaded program is designed to implement these tasks simultaneously. Multithreading is a widespread programming and execution model that allows multiple threads to exist within the context of a single process; the threads share the processor's resources but execute their functions independently. This multi-threaded design guarantees the whole system's real-time operation and, moreover, gives the software a clear structure, which is beneficial for further revision or development.
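The receive/process split described above can be sketched with a standard thread-safe queue: one thread pushes incoming samples, another pops and processes them. The class and names here are illustrative, not the paper's actual implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Minimal blocking queue shared between a receiver thread and a
// processing thread. close() wakes any waiting consumer so it can exit.
template <typename T>
class SampleQueue {
public:
    void push(T v) {
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push(std::move(v));
        }
        cv_.notify_one();
    }
    // Blocks until an item arrives or the queue is closed and drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
    void close() {
        {
            std::lock_guard<std::mutex> lk(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
private:
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool closed_ = false;
};
```

In this pattern the Bluetooth receiver never stalls on the detection algorithm: it only pays for a brief lock while enqueuing, which is what preserves the 200 Hz real-time behavior.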

Methodology
The inertial sensor is attached to the human foot, and the measured rotation and acceleration information is used for stepping direction classification. The motion recognition process of the proposed system is illustrated in Figure 4. As shown in Figure 4, the identification process is executed as follows: first, the collected raw inertial data are pre-processed for error compensation, noise reduction, and misalignment elimination; second, the peak points of the norm of the 3-axis acceleration are detected to segment the data; and, finally, the selected features in each data segment are extracted and fed into the classifier to derive the foot motion type. A detailed description of each procedure is provided in the following subsections.





Preprocessing
MEMS inertial sensors have the advantages of small size and low cost; however, they suffer from various error sources, which degrade their performance. Therefore, calibration experiments are indispensable to remove the deterministic errors, such as bias, scale factor, and misalignment, before using a MEMS sensor. The inertial sensor error model [35] employed for error compensation is described as follows:

$$\tilde{f}^b = (I + S_b + N_b)\, f^b + b_a + \varepsilon(f)$$

$$\tilde{\omega}^b = (I + S_\omega + N_\omega)\, \omega^b + b_\omega + \varepsilon(\omega)$$

where $\tilde{f}^b$, $\tilde{\omega}^b$ denote the measured specific force and rotation rate, and $f^b$, $\omega^b$ denote the true specific force and angular velocity. $b_a$, $b_\omega$, respectively, denote the accelerometer and gyroscope instrument biases; $S_b$, $S_\omega$ denote the matrices of linear scale factor errors; $N_b$, $N_\omega$ denote the matrices representing axis non-orthogonality; and $\varepsilon(f)$, $\varepsilon(\omega)$ denote the stochastic errors of the sensors. The parameters $S_b$, $S_\omega$, $b_a$, $b_\omega$, $N_b$, $N_\omega$ can be derived through a calibration experiment before sensor usage [36,37]. With a hand-rotating calibration scheme, the experiment can be accomplished in approximately one minute [38]. In the proposed system, the IMU is attached to the shoe to detect the user's foot motions and control the game. However, due to the differences between shoe styles and sensor placements, the IMU orientation (pitch and roll) varies when mounted on different users' shoes, which causes misalignment across users.
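Given calibrated parameters, one sample is compensated by inverting the error model: with $M = I + S + N$, the true quantity is recovered as $M^{-1}(\text{measured} - b)$. A minimal sketch (parameter values illustrative; the 3×3 inverse uses the adjugate formula):

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Invert a 3x3 matrix via cofactors (fine for well-conditioned
// calibration matrices, which are close to identity).
Mat3 inverse(const Mat3& A) {
    double d = A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
             - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
             + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]);
    Mat3 inv;
    inv[0][0] =  (A[1][1] * A[2][2] - A[1][2] * A[2][1]) / d;
    inv[0][1] = -(A[0][1] * A[2][2] - A[0][2] * A[2][1]) / d;
    inv[0][2] =  (A[0][1] * A[1][2] - A[0][2] * A[1][1]) / d;
    inv[1][0] = -(A[1][0] * A[2][2] - A[1][2] * A[2][0]) / d;
    inv[1][1] =  (A[0][0] * A[2][2] - A[0][2] * A[2][0]) / d;
    inv[1][2] = -(A[0][0] * A[1][2] - A[0][2] * A[1][0]) / d;
    inv[2][0] =  (A[1][0] * A[2][1] - A[1][1] * A[2][0]) / d;
    inv[2][1] = -(A[0][0] * A[2][1] - A[0][1] * A[2][0]) / d;
    inv[2][2] =  (A[0][0] * A[1][1] - A[0][1] * A[1][0]) / d;
    return inv;
}

// measured = (I + S + N) * true + bias  =>  true = inv(I+S+N) * (measured - bias)
Vec3 compensate(const Vec3& meas, const Mat3& ISN, const Vec3& bias) {
    Mat3 inv = inverse(ISN);
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            out[i] += inv[i][j] * (meas[j] - bias[j]);
    return out;
}
```

The same routine serves both the accelerometer and the gyroscope, each with its own calibrated matrix and bias vector.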
Hence, in order to achieve a satisfactory identification result for different shoe styles or placement manners, data should be collected under various attachment conditions and fed into the training process to derive the classifier. However, this process is time-consuming, and the performance is not guaranteed if the sensor is attached in a new placement that is not included in the training set.
To avoid such drawbacks, we propose to project the measured acceleration and rotation data from the sensor frame (shoe frame) to the user frame, where the user frame is defined by the user's right, forward and up directions as the three axes of a right-handed coordinate system. Thus, no matter how the inertial sensor is placed on the shoe (the sensor frame always differs), the measured data can be unified and expressed in the same coordinate frame. During sensor installation, the forward axis of the IMU (the y-axis in the proposed system) is always aligned with the foot's forward moving direction, so only the misalignment of the pitch and roll angles needs to be considered. This data transformation from the sensor frame to the user frame effectively eliminates the misalignment caused by different shoe styles and sensor placements because it aligns all the collected data in the same frame. Figure 5 shows this process.
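The projection can be sketched as two elementary rotations that undo the mounting tilt. The axis convention here is an assumption chosen to match the paper's Right-Forward-Up user frame: x = right, y = forward, z = up, with pitch the tilt about the x (right) axis and roll the tilt about the y (forward) axis; yaw is fixed by mounting the sensor's forward axis along the foot.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Rotate a vector about the x (right) axis by angle a [rad].
Vec3 rotX(double a, const Vec3& v) {
    return { v[0],
             std::cos(a) * v[1] - std::sin(a) * v[2],
             std::sin(a) * v[1] + std::cos(a) * v[2] };
}

// Rotate a vector about the y (forward) axis by angle a [rad].
Vec3 rotY(double a, const Vec3& v) {
    return { std::cos(a) * v[0] + std::sin(a) * v[2],
             v[1],
            -std::sin(a) * v[0] + std::cos(a) * v[2] };
}

// Undo the estimated mounting tilt so the measurement is expressed in the
// common Right-Forward-Up user frame, whatever the shoe placement was.
Vec3 sensorToUser(const Vec3& v, double pitch, double roll) {
    return rotX(pitch, rotY(roll, v));
}
```

As a sanity check: a stationary sensor pitched by some angle reads a gravity vector tilted by the same angle in its own frame, and the projection maps it back onto the user frame's up axis.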
Figure 5 shows the alignment process with the rotation matrix $C_b^n$, where the inertial data, collected under different misalignment conditions, are aligned in the same frame (Right-Forward-Up). More importantly, the data expressed in this frame directly reflect the actual user moving direction in the horizontal plane, which provides a better data basis for the subsequent signal processing and helps achieve a more robust result. Therefore, a reliable and accurate attitude result is significant and necessary, since it is used to correctly project the inertial measurements onto the user frame with the rotation matrix, perform the data standardization (alignment in the same frame), and consequently support a dependable feature extraction stage.
Given an initial attitude and the gyroscope measurements, the orientation can be derived by integrating the angular velocity measured by the 3-axis gyroscope. However, due to the errors of the MEMS gyroscope, the attitude result drifts quickly with time and cannot provide a long-term solution. On the other hand, the accelerometer can provide attitude angles without long-term drift, which is complementary to the gyroscope and effective in compensating the attitude drift error. Hence, an attitude filter is used to integrate the gyroscope and accelerometer measurements together and derive a drift-free attitude solution. A Kalman filter is used to blend the information in a feature-level fusion [39]. The dynamic model, the measurement model of the filter, and the implemented adaptive measurement noise tuning strategy are described as follows.
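The gyroscope/accelerometer complementarity can be illustrated with a much simpler device than the paper's Kalman filter: a one-angle complementary filter, shown here for pitch only. This is a didactic sketch, not the system's actual filter; the blending weight is an assumed value.

```cpp
#include <cassert>
#include <cmath>

// Complementary filter for the pitch angle: integrate the gyro rate each
// step (smooth but drifting), then nudge the result toward the
// accelerometer-derived pitch (noisy but drift-free).
struct PitchFilter {
    double pitch = 0.0;   // estimated pitch [rad]
    double alpha = 0.98;  // weight on the integrated gyro path (assumed)

    // gyroRate: angular rate about the pitch axis [rad/s]
    // ay, az:   accelerometer components spanning the gravity tilt
    // dt:       sample interval [s] (0.005 s at the 200 Hz rate)
    void update(double gyroRate, double ay, double az, double dt) {
        double accPitch  = std::atan2(ay, az);    // gravity-derived, no drift
        double gyroPitch = pitch + gyroRate * dt; // integrated, drifts
        pitch = alpha * gyroPitch + (1.0 - alpha) * accPitch;
    }
};
```

With a stationary sensor the estimate converges to the accelerometer angle, while during fast motion the high-weighted gyro path dominates — the same division of labor the Kalman filter formalizes with its covariance matrices.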

Dynamic Model
The attitude angle error model, in which the state is the angle difference between the true navigation frame and the computed navigation frame, is employed as the dynamic model [40]. This model is expressed in linear form and is easy to implement. The 3-axis gyro biases are also included in the dynamic model; they are estimated in the filter and work in the feedback loop to mitigate the error of the raw measurements. The equations of the dynamic model are written as:

$$\dot{\psi} = -\omega_{in}^n \times \psi - C_b^n\, \varepsilon^b$$

$$\dot{\varepsilon}^b = -\frac{1}{\tau_b}\, \varepsilon^b + \omega_b$$

where $\psi$ denotes the attitude error; $\omega_{in}^n$ denotes the rotation angular rate of the n-frame relative to the inertial frame (i-frame), expressed in the n-frame; $C_b^n$ denotes the Direction Cosine Matrix (DCM) from the b-frame (i.e., the body frame) to the n-frame (i.e., the navigation frame); and the symbol "×" denotes the cross product of two vectors. $\varepsilon^b$ denotes the gyro output error; here, only the gyro bias is considered, and it is modeled as a first-order Gauss-Markov process. Finally, $\tau_b$ denotes the correlation time of the gyro biases and $\omega_b$ is the driving noise vector.

Measurement Model
The acceleration residuals in the body frame are used to derive the system measurement model. In our model, instead of using the attitude difference separately derived by the accelerometer and gyroscope, the acceleration difference is applied to avoid the singularity problem when the pitch angle is ±90° [41]. The acceleration residuals in the body frame are defined as the difference between the direct accelerometer measurements and the projection of the local gravity onto the body frame:

$$\delta a^b = a_m^b - a_{n_c}^b = a_m^b - C_{n_c}^b\, g^n$$

where $a_m^b$ denotes the accelerometer measurement and $a_{n_c}^b$ denotes the local gravity acceleration projected onto the body frame using the gyro-derived rotation matrix $C_{n_c}^b$. The subscript $n_c$ denotes the computed navigation frame. According to the DCM chain rule, $C_n^b$ is expressed as:

$$C_n^b = C_{n_c}^b\, C_n^{n_c} \approx C_{n_c}^b \left( I - [\psi \times] \right)$$

Substituting this into the residual definition, the relationship between the acceleration residuals in the body frame and the attitude error is written as:

$$\delta a^b \approx C_{n_c}^b \left( g^n \times \right) \psi + \nu$$

where $\nu$ denotes the measurement noise. This yields the measurement model: the measurement $Z$ is the acceleration residual in the body frame, $Z = [\delta a_x\ \ \delta a_y\ \ \delta a_z]^T$, and the measurement matrix $H$ is expressed as:

$$H = C_{n_c}^b \left( g^n \times \right)$$

This attitude filter works effectively under stationary or low-acceleration conditions. In these situations, the specific force measured by the accelerometer equals the local gravity acceleration, so the pitch and roll angles derived from the accelerometer are accurate and help fix the accumulated attitude error caused by the gyroscope. In highly dynamic situations, however, the accelerometer senses external dynamic acceleration, which is undesirable in the filter: if the measurement update then keeps the same weight as in the low-dynamic case, a side effect is introduced and the performance degrades. Hence, to achieve an optimal attitude estimation result, we propose to adaptively tune the measurement covariance matrix R according to a system dynamic index ε [42], designed as:

$$\varepsilon = \left|\, \| f \| - g \,\right|$$

where $\| f \|$ denotes the norm of the measured acceleration and $g$ denotes the local gravity acceleration. The specific tuning strategy of the covariance matrix R is then as follows:

1. Stationary mode: if ε < Thres1, the system is considered stationary, and the covariance matrix is set as $R = \mathrm{diag}(\sigma_x^2, \sigma_y^2, \sigma_z^2)$, where $\sigma_x^2$, $\sigma_y^2$, $\sigma_z^2$ denote the velocity random walk of the three-axis accelerometer.
2. Low acceleration mode: if Thres1 < ε < Thres2, the system experiences low external acceleration, which is treated as additional measurement noise, and R is enlarged accordingly.
3. High dynamic mode: if ε > Thres2, the norm of the measured acceleration is far from the gravity magnitude, and the acceleration residuals are not reliable. In this situation, only the angular velocity is used to calculate the attitude, and the filter performs the prediction loop without a measurement update.
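The three-mode switching logic above reduces to a small classifier on the dynamic index. The threshold values below are illustrative placeholders (in m/s²), not the paper's tuned Thres1/Thres2:

```cpp
#include <cassert>
#include <cmath>

enum class FilterMode { Stationary, LowAcceleration, HighDynamic };

// Classify the current dynamics from eps = | ||f|| - g |, the gap between
// the measured acceleration norm and the local gravity magnitude.
FilterMode classifyDynamics(double ax, double ay, double az,
                            double g = 9.81,
                            double thres1 = 0.5, double thres2 = 4.0) {
    double eps = std::fabs(std::sqrt(ax * ax + ay * ay + az * az) - g);
    if (eps < thres1) return FilterMode::Stationary;      // nominal R
    if (eps < thres2) return FilterMode::LowAcceleration; // enlarged R
    return FilterMode::HighDynamic;                       // prediction only
}
```

Each incoming sample is classified before the Kalman update, and the high-dynamic branch simply skips the measurement step.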

Data Segmentation
Data segmentation divides the continuous stream of collected sensor data into multiple subsequences and retrieves the important and useful information for activity recognition. Sliding-window algorithms are commonly used to segment data in various applications because they are simple, intuitive, online algorithms. However, this approach is not suitable here, because an entire stepping motion may not be contained in the current window and may instead be split across two adjacent windows, which can produce poor results in some cases. Moreover, the algorithm runs with a complexity of O(nL), where L is the average length of a segment, which affects the system's real-time capability.
Hence, the relationship between the gait cycle and the acceleration signal is analyzed to derive a practical approach to segment the data. Generally, a gait cycle can be divided into four phases [43]: (1) Push-off, heel off the ground and toe on the ground; (2) Swing, both heel and toe off the ground; (3) Heel strike, heel on the ground and toe off the ground; and (4) Stance, heel and toe on the ground at rest. Figure 6 shows these four phases and the correlated acceleration signal.
As shown in Figure 6, the blue line is the norm of the three accelerations, and the red line denotes the acceleration signal smoothed by a moving average algorithm: at each epoch, a window containing the previous N sample points is averaged to produce the acceleration value. This yields a smoother signal, reduces noise, and eliminates unexpected peak points. Figure 6 also illustrates that the smoothed acceleration signal during one walking cycle generally features two peak points: one in the push-off phase, when the foot is leaving the ground, and another in the heel-strike phase, when the foot hits the ground. Although some cycles may contain more than two peak points, due to users' different motion strengths, these two points are always present in each gait cycle. Here, we propose to use the peak point to trigger the data segmentation process: once a peak point is detected, the features in the vicinity of this point are extracted, and the foot motion type is then identified.
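The smoothing and peak-triggering steps can be sketched as follows; the window length and peak threshold are illustrative values, not the paper's tuned parameters:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Moving average over the previous n samples (fewer at the start of the
// stream), producing the smoothed red curve of Figure 6.
std::vector<double> movingAverage(const std::vector<double>& x, size_t n) {
    std::vector<double> y(x.size(), 0.0);
    double sum = 0.0;
    for (size_t i = 0; i < x.size(); ++i) {
        sum += x[i];
        if (i >= n) sum -= x[i - n];
        y[i] = sum / static_cast<double>(std::min(i + 1, n));
    }
    return y;
}

// A peak is a sample above the threshold that is larger than its left
// neighbour and at least as large as its right neighbour.
std::vector<size_t> detectPeaks(const std::vector<double>& y, double thresh) {
    std::vector<size_t> peaks;
    for (size_t i = 1; i + 1 < y.size(); ++i)
        if (y[i] > thresh && y[i] > y[i - 1] && y[i] >= y[i + 1])
            peaks.push_back(i);
    return peaks;
}
```

Each index returned by `detectPeaks` then triggers one segmentation-and-classification pass, so the detector's cost scales with the number of peaks rather than with the raw sample count.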
The reason for using the peak point is that a peak is always present in the push-off phase, when the foot leaves the ground, and this does not vary across users or stepping patterns. This point facilitates the detection of the beginning phase of each step and ensures reliable real-time performance. Furthermore, the foot motion detection algorithm runs with O(number of peak points) complexity: the classification process is only performed when a peak point is detected, which decreases the computation burden. Moreover, the specific phases of each walking cycle do not need to be classified, which simplifies the identification process. Additionally, the length of data used for feature extraction needs to be ascertained. There is a tradeoff here between the discrimination accuracy of motion types and real-time applicability: involving more data in the segmentation procedure helps identify the human motion correctly and achieve more reliable results, but causes a lagging response, whereas less data allows a quicker, lower-delay judgment but may not contain enough information for classification. Hence, the distribution of the three separate axis acceleration signals for different motions is analyzed to determine the length of the data segment for feature extraction. Figure 7 shows the collected three-axis acceleration signals in the vicinity of the peak points in the initial stage of a step; Figure 7a-e, respectively, represent the acceleration signals collected from forward, backward, left, right and jump motions. The blue, red and green solid lines denote the acceleration signals expressed in the user frame. The green dashed line, drawn from top to bottom, denotes the position of the peak points. The peak-point line is shifted slightly to the right due to the moving average algorithm, but this causes no negative effect on the identification process.
The acceleration signals are used to investigate the data segment length because they behave differently as the human steps in various directions, and they provide an intuitive, direct, and easily understood way to recognize the moving directions. For example, Figure 7c illustrates the left motion: the acceleration (red line) along the user's right direction shows an obvious difference compared with the other two axes. Similarly, for the forward and backward motions, the accelerations in the forward or backward directions exhibit more diversity.
It is suggested to use the acceleration signals to investigate the data segment length because they exhibit different behaviors during human stepping in various directions, and they provide an intuitive, direct, and easily understood means of recognizing the moving directions. For example, Figure 7c illustrates the left motion, and the acceleration in the user's right direction (red line) shows an obvious difference compared with the other two axes. Similarly, for the forward and backward motions, the accelerations in the forward or backward directions exhibit more diversity. Additionally, each figure illustrates the acceleration distribution of 500 motion samples performed by different testers: for each motion, 500 data groups are collected. Acceleration data in the vicinity of the first peak point are extracted, and the mean and standard deviation of these segments are calculated. The solid lines and dashed lines represent the mean and standard deviation, respectively. The acceleration distribution shown in Figure 7 provides an intuitive statistical result of the acceleration in the initial phase of a step and is helpful for confirming the data segment length. The data segment length selected for feature extraction is 31 samples, shown as the orange rectangle in the figure; it includes the 20 samples before the peak point, the peak point itself, and the 10 samples after it. The main justifications for choosing this length are: first, the features extracted within the selected interval provide enough distinguishing information for motion identification; and, second, it ensures reliable real-time applicability. The data shown in Figure 7 are sampled at 200 Hz and the first 30 samples of a gait cycle are utilized for classification, which means that the motion type can be decided approximately 0.15 s after the motion occurs.
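The windowing described above can be sketched as follows; this is a minimal illustration, and the array layout and function names are ours rather than the paper's:

```python
import numpy as np

def extract_segment(acc, peak_idx, before=20, after=10):
    """Cut the feature-extraction window around a detected peak:
    20 samples before the peak, the peak itself, and 10 samples after,
    i.e., 31 samples in total. Returns None when the window would run
    past the edges of the recorded signal."""
    start, end = peak_idx - before, peak_idx + after + 1
    if start < 0 or end > len(acc):
        return None
    return acc[start:end]

# At the 200 Hz sampling rate, 31 samples span roughly 0.15 s of motion,
# which matches the stated decision latency.
acc = np.zeros((1000, 3))          # placeholder 3-axis accelerometer stream
segment = extract_segment(acc, peak_idx=300)
```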

Feature Extraction
Generally, features can be defined as abstractions of raw data. The objective of feature extraction is to find the main characteristics of a data segment that accurately represent the original data and identify valid, useful, and understandable patterns. Features can be divided into various categories; time-domain and frequency-domain features are the ones most commonly used for recognition. Feature selection is an extremely important step because a good feature space can lead to a clear and easy classification, whereas a poor feature space may be time-consuming and computationally expensive and fail to produce good results. In our system, not all of the features commonly used in the activity recognition field are selected; instead, the collected signal is analyzed and the physical characteristics of foot motion are considered in order to choose features that are not only effective for discriminating motion types but also have low computational complexity. The features selected for foot motion classification in this system are described as follows.



Mean and Variance
The mean and variance of the three-axis accelerometer and gyroscope measurements are derived from the data segment and taken as features, according to the following equations:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$

where $x_i$ denotes the signal, $N$ denotes the data length, and $\bar{x}$, $\sigma^2$ denote the mean and variance of the data sequence.
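As a concrete illustration, the per-axis mean and variance of a segment can be computed as follows (a plain-Python sketch; the function name is ours):

```python
def mean_variance(x):
    """Mean and variance of one axis of a data segment,
    following the equations above."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    return mean, var

m, v = mean_variance([1.0, 2.0, 3.0])
```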

Signal Magnitude Area
The signal magnitude area (SMA) is a statistical measure of the magnitude of a varying quantity, computed from the absolute values of the signal. SMA is calculated according to Equation (9):

$$\mathrm{SMA} = \int_{t_1}^{t_2} \left| x(t) \right| \, dt \qquad (9)$$

where $x$ denotes the signal and $(t_1, t_2)$ denotes the integration time period.
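In discrete form, the integral in Equation (9) reduces to a scaled sum of absolute sample values; the rectangle-rule sketch below is one simple approximation (names are ours):

```python
def sma(x, dt):
    """Approximate the time integral of |x(t)| over a segment
    by a rectangle-rule sum with sample spacing dt."""
    return sum(abs(xi) for xi in x) * dt

value = sma([0.5, -1.0, 0.25], dt=1.0 / 200.0)   # 200 Hz sampling
```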

Position Change
Position change is an intuitive feature for foot direction identification because different foot moving directions cause different position changes. For example, jumping features a larger change in the vertical direction, while stepping right or left leads to an obvious position change in the horizontal plane. The Inertial Navigation System (INS) mechanization equations can provide the trajectory of a moving object in three dimensions from the measured rotations and accelerations [44]. However, due to the double integration in the INS mechanization and the sensor noise, accumulated errors enter the trajectory estimation and lead to a drift in position, especially when using a MEMS sensor.
Hence, it is not feasible to calculate the position over the whole identification process; the position is only derived within the data segment, with an initial velocity $(0,0,0)$, an initial position $(0,0,0)$, and a zero azimuth during the calculation. Inertial sensors remain accurate in the short term, so the position computed over the 31-sample interval is reliable and trustworthy. The position calculation is described as follows:

$$a^n = C_b^n\, a^b - g^n, \qquad v = v_0 + \int a^n\, dt, \qquad p = p_0 + \int v\, dt$$

where $a^b$ denotes the measured acceleration in the body frame, $C_b^n$ is the rotation matrix that projects the acceleration from the body frame to the navigation frame (local-level frame), $g^n$ is the local gravity vector, and $a^n$ denotes the projected, gravity-compensated acceleration in the navigation frame. $v$, $p$ denote the computed velocity and position, and $v_0$, $p_0$ denote the initial velocity and position.
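The short-horizon double integration can be sketched as follows; the identity-rotation setup, the z-up gravity convention, and the simple Euler integration step are illustrative simplifications of a full INS mechanization, not the paper's implementation:

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, 9.81])   # assumed navigation-frame gravity (z up)

def position_change(acc_body, rot_bn, dt):
    """Dead-reckon the position change over a short segment,
    starting from zero velocity and position as described above.
    acc_body: (N, 3) body-frame accelerations
    rot_bn:   (N, 3, 3) body-to-navigation rotation matrices
    """
    v = np.zeros(3)
    p = np.zeros(3)
    for a_b, C in zip(acc_body, rot_bn):
        a_n = C @ a_b - GRAVITY   # project to the navigation frame, remove gravity
        v = v + a_n * dt          # first integration: velocity
        p = p + v * dt            # second integration: position
    return p
```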

Ratio
The ratio feature is the proportion of a single-axis feature to the norm of the features in the three axes. The aim of introducing this ratio metric is to normalize the three-axis features so as to handle motions performed with different strengths by different users. For example, for the jump motion, the position change in the up direction (jump height) is larger than that in the horizontal plane and dominates the position change; although the jump height differs between users, the proportion of jump height in the position change remains large. Specifically, the position feature (the position change) derived from a heavy jump motion may be (0.2, 0.2, 0.5) and that from a slight jump motion (0.05, 0.05, 0.2); although the jump height amplitude varies significantly depending on user habits, the jump height accounts for over 50% of the whole position change in both cases. Hence, the ratio feature of the position change in different directions is a good metric to distinguish and evaluate motion types performed with different strengths. The ratio feature is calculated as in Equation (11):

$$\mathrm{ratioFeature}_X = \frac{\mathrm{Feature}_X}{\sqrt{\mathrm{Feature}_X^2 + \mathrm{Feature}_Y^2 + \mathrm{Feature}_Z^2}} \qquad (11)$$

where $\mathrm{Feature}_X$, $\mathrm{Feature}_Y$, $\mathrm{Feature}_Z$ denote the calculated features in the different axes and $\mathrm{ratioFeature}$ denotes the ratio. In our proposed system, the position, mean, variance, and SMA features calculated in the three directions or axes are all used to derive ratio features.
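This normalization can be sketched as follows, using the Euclidean norm as one plausible reading of Equation (11); the function name and the choice of norm are assumptions on our part:

```python
import math

def ratio_feature(fx, fy, fz):
    """Proportion of each single-axis feature in the three-axis norm."""
    norm = math.sqrt(fx ** 2 + fy ** 2 + fz ** 2)
    return (fx / norm, fy / norm, fz / norm)

# Heavy vs. slight jump from the example above: amplitudes differ,
# yet the up-direction ratio stays dominant in both cases.
heavy = ratio_feature(0.2, 0.2, 0.5)
slight = ratio_feature(0.05, 0.05, 0.2)
```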

Classification
Classification is the process of predicting or recognizing the motions from the extracted features. In order to achieve good motion classification performance, three popular supervised classification approaches are employed in our research work for validation; these three classifiers are described as follows.



Decision Tree
A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences. Generally, internal nodes, branches, and leaf nodes are included in a decision tree classifier, where an internal node represents a test on a selected feature, a branch denotes the outcome of the test, and the leaf nodes represent the class labels (the different moving directions). Figure 8 graphically illustrates the decision tree model. The blue circles denote the internal nodes that execute the tests on the features (comparison of a feature with a trained parameter), the green arrows denote the test outcomes, and the rectangles denote the different labels or classes. The red dashed lines from the top node to the leaf nodes represent a decision process or classification rule.
The tree generation is the training stage of this classifier and it works in a recursive procedure. The general tree generation process is that, for each feature of the samples, a metric (the splitting measure) is computed from splitting on that feature. Then, the feature that generates the optimal index (highest or lowest) is selected and a decision node is created to split the data based on that feature. The recursion procedure stops when the samples in a node belong to the same class (majority), or when there are no remaining features on which to split. Depending on different splitting measures, the decision tree can be categorized as: ID3 (Iterative Dichotomiser 3), Quest (Quick, Unbiased, Efficient, Statistical Tree), CART (Classification And Regression Tree), C4.5, etc. [45,46].
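As an illustration of such a splitting measure, the Gini impurity used by CART can be computed as follows; this is a toy sketch of the measure, not the paper's training code:

```python
def gini(labels):
    """Gini impurity of a set of class labels."""
    if not labels:
        return 0.0
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(samples, labels, feature, threshold):
    """Weighted Gini impurity after splitting on one feature at a
    threshold; tree growing picks the split minimizing this value."""
    left = [y for x, y in zip(samples, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(samples, labels) if x[feature] > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)
```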

K-Nearest Neighbors
The k-nearest neighbors algorithm (kNN) [47] is an approach based on the closest training samples in the feature space, where k denotes the number of nearest neighbors considered. In the kNN approach, an object is classified by a majority vote of its neighbors, the object being assigned to the most common class among its k nearest neighbors. Similarity measures are fundamental components of this algorithm, and different distance measures can be used to find the distance between data points. Figure 9 illustrates the main concept of the kNN algorithm.
Sensors 2016, 16, 1752

As shown in Figure 9, the test sample (blue circle) is classified into the class of its neighbors, either red square or green triangle. If k is selected as 3, the test sample is assigned to the red square class because two of its k neighbors are red squares. In the same way, if k = 5, the test sample is assigned to the green triangle class. Hence, the main idea of kNN is that the category of the predicted object is decided by the labels of the majority of its neighbors. Additionally, the votes of these neighbors can be weighted by distance to overcome the problem of non-uniform densities of the neighbor classes.
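The majority-vote rule can be sketched in a few lines; Euclidean distance is assumed here, since the paper does not specify its distance measure:

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` with the most common class among its k nearest
    training samples under Euclidean distance."""
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = Counter(y for _, y in neighbors)
    return votes.most_common(1)[0][0]

train_x = [(0, 0), (0, 1), (5, 5), (6, 5)]
train_y = ['square', 'square', 'triangle', 'triangle']
label = knn_predict(train_x, train_y, (0, 0.5), k=3)
```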


Support Vector Machine
The support vector machine (SVM) constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space for classification, regression, or other tasks. Since several hyperplanes may be able to separate the data, SVM chooses the one that represents the largest separation, or margin, between the two classes. The hyperplane chosen by SVM maximizes the distance between the plane and the nearest data point on each side. Figure 10 illustrates the SVM classifier.
As shown in this figure, the optimal separating hyperplane (solid red line) places the samples with different labels (blue circles, +1; red squares, −1) on the two sides of the plane, and the distances of the closest samples to the hyperplane on each side are maximized. These samples are called support vectors, and the distance is the optimal margin. A detailed description of the SVM classifier can be found in the literature [48-50].

Experiments and Results
The experiment is designed in two parts. In the first part, different testers are invited to perform the five foot motions in their own manner. We then collect the data, preprocess them to remove errors, divide them into segments, extract the features, and feed these into the training process of the introduced machine learning algorithms to derive the classifiers. The classifiers are additionally tested with two cross-validation approaches. In the second part, the data processing procedure is implemented in C++ on our software platform, and the program is connected to the game control interface to perform the practical game playing experiment.

Data Set
In order to obtain a sufficient amount of data for training, ten testers (two females and eight males) were invited to participate in the experiments. All testers were in good health, without any abnormality in their gait cycles. The IMU sensor was attached to the testers' shoes, and they were guided to perform the five stepping motions in their natural manner. In order to obtain diverse characteristics for each motion, some actions were performed with different strengths (heavy or slight), different frequencies (fast or slow), and different scopes (large or small amplitude), and some actions were performed by the same tester on different days. The data collected during this experiment were stored to form the training dataset. Figure 11 shows the system hardware platform. In this platform, a 3.7 V lithium battery (blue) provides the power supply. The IMU module is small and very convenient to mount on a user's shoe. Table 1 summarizes the collected training dataset, listing the quantitative information of the collected human stepping motions. The second row lists the actual numbers of motions collected in the experiment: 895 jump, 954 stepping left, 901 stepping right, 510 moving forward, and 515 moving backward.



Classification Results
In our proposed system, a corresponding classifier is trained for each motion instead of using a single classifier for the five motions. This training strategy is beneficial to improve the robustness and decrease the complexity of this system, since one classifier only needs to recognize two classes instead of five. Moreover, it offers the possibility of selecting typical features for each motion based on motion principle or data analysis in future work.
In order to better evaluate the classification performance, two cross-validation approaches were chosen: k-fold cross-validation and holdout validation. In the k-fold cross-validation approach, the original sample is randomly partitioned into k equal-sized subsamples. A single subsample is retained as the validation data for testing the model, and the remaining (k − 1) subsamples are used as training data. The cross-validation process is then repeated k times, with each of the k subsamples used exactly once as the validation data. The k results from these folds are then averaged to produce a single estimate. The advantage of this method over repeated random sub-sampling is that all of the observations are used for both training and validation, and each observation is used for validation exactly once. Here, the commonly used 10-fold test is employed. In holdout validation, a subset of observations is chosen randomly from the initial samples to form a validation or testing set, and the remaining observations are retained as the training data. Twenty-five percent of the initial samples are chosen for testing and validation. The two cross-validation approaches are performed for the three classifiers, and the classification results are listed in Tables 2 and 3. Additionally, in order to quantitatively evaluate the classifier performance, the Accuracy, Precision, and Recall metrics are introduced. The definitions of these metrics and their calculation equations are given below.
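The 10-fold protocol can be sketched generically as follows; this is a pure-Python illustration in which `train_fn` stands in for any of the three classifiers, and all names are ours:

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Randomly partition n sample indices into k near-equal folds,
    each used exactly once as the validation set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(samples, labels, train_fn, k=10):
    """Train on k-1 folds, test on the held-out fold, repeat k times,
    and average the k accuracies into a single estimate."""
    folds = k_fold_indices(len(samples), k)
    accuracies = []
    for i in range(k):
        held_out = set(folds[i])
        train_x = [s for j, s in enumerate(samples) if j not in held_out]
        train_y = [y for j, y in enumerate(labels) if j not in held_out]
        model = train_fn(train_x, train_y)
        hits = sum(model(samples[j]) == labels[j] for j in folds[i])
        accuracies.append(hits / len(folds[i]))
    return sum(accuracies) / k
```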
• Accuracy: the most standard metric, summarizing the overall classification performance over all classes:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

• Precision: often referred to as the positive predictive value, the ratio of correctly classified positive instances to the total number of instances classified as positive:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

• Recall: also called the true positive rate, the ratio of correctly classified positive instances to the total number of positive instances:

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

where TP (True Positives) is the number of correctly classified positive results, TN (True Negatives) is the number of negative instances classified as negative, FP (False Positives) is the number of negative instances classified as positive, and FN (False Negatives) is the number of positive instances classified as negative. According to these evaluation metrics, the accuracy, precision, and recall for the test result of each motion are calculated and listed in Tables 4-6. Based on the evaluation metrics listed in Tables 4-6, and the graphical comparison of accuracy and precision shown in Figures 12 and 13, the SVM classifier has an overall better performance than the other approaches. Moreover, the average time for each classifier to decide the motion type is: decision tree, 0.0056 ms; kNN, 0.53 ms; and SVM, 0.0632 ms. Although the decision tree classifier has the shortest response time, its identification performance is not satisfactory. The response time of the SVM is about 0.06 ms, which is acceptable because this level of lag causes no observable delay in the user experience. Hence, considering both the performance and the decision time of each classifier, the SVM classifier achieves the best result and is selected in our proposed system to classify the stepping motions.
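These three metrics follow directly from the confusion counts; the sketch below uses hypothetical counts for a binary jump classifier, not the paper's measured results:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall from the confusion counts
    defined above."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Hypothetical counts: 90 jumps found, 10 missed, 10 false alarms.
acc, prec, rec = classification_metrics(tp=90, tn=890, fp=10, fn=10)
```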
Additionally, we analyze the misclassified events of each motion to profile the errors, aiming to verify that no single stepping motion always contributes the wrong recognitions, which would potentially indicate unsuitable feature selection or data segmentation. The statistical results are listed in Table 7. Table 7 provides the false identifications of each motion under the two cross-validation approaches. For example, in 10-fold cross-validation, 27 true jump motions are missed or mistakenly classified, which accounts for 42.86% of the misclassified events, while eight left, seven right, nine forward, and 12 backward motions are wrongly treated as jump motions by the classifier, which together contribute 57.15% of the misclassified events.
In each classifier, the identification errors of its corresponding motion type (i.e., the wrong categorization of jump motions by the jump classifier) account for approximately 33% to 48% of the misclassified events, and the misclassification percentage of the other motions varies from 51% to 66%. Moreover, the error results also show that the misclassified events are evenly distributed across the motions, demonstrating that no specific motion error is predominant during the motion determination process.

Practical Experiment Result
A running game we programmed in Unity is used to test the algorithm in practice. In this game, a man runs through a forest full of obstacles, and in the traditional play mode the user controls the character to jump, go left, go right, or get down to avoid the obstacles. Here, we use the foot movement direction to control the character, and the results are shown in the following figures.
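The control scheme described above amounts to a small dispatch layer between the classifier output and the game input. The following is a minimal sketch under stated assumptions: the motion labels come from the paper, while `press_key` is a hypothetical stub standing in for whatever input-injection API the game engine actually exposes, and the key names are illustrative.

```python
# Hypothetical dispatch layer between the step-motion classifier and the game.
# press_key is an assumed stub for a real input-injection API; the key names
# are illustrative, not taken from the paper.

# Map each recognized foot motion to the in-game command it should trigger.
MOTION_TO_COMMAND = {
    "jump": "space",            # character jumps over an obstacle
    "left": "arrow_left",       # character moves to the left lane
    "right": "arrow_right",     # character moves to the right lane
    "forward": "arrow_up",      # e.g., mapped to jump in Subway Surfers test
    "backward": "arrow_down",   # character gets down / slides
}

def press_key(key: str) -> str:
    """Stub for a platform input API; returns the key it would inject."""
    return key

def dispatch(motion: str) -> str:
    """Translate one classifier output label into a game key press."""
    if motion not in MOTION_TO_COMMAND:
        raise ValueError(f"unrecognized motion: {motion}")
    return press_key(MOTION_TO_COMMAND[motion])
```

For example, `dispatch("left")` would inject the left-arrow key, moving the in-game character to the left lane, mirroring the behavior shown in the figures below.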
As shown in Figure 14, the red rectangle marks the virtual player in the game, the arrow denotes the player's moving direction, the green rectangle shows the step motion identification result, and the orange rectangle indicates the person's moving direction. Figure 15 shows the practical test results in the game Subway Surfers. Figure 15a illustrates that a person stepping forward corresponds to the jump of the kid in the game. On the left side of this figure, the person steps forward and the red arrow presents the stepping direction. The right side shows the game environment, where the kid highlighted by the green circle jumps up to avoid the obstacle ahead. In the same way, Figure 15b shows the person stepping left, which corresponds to the kid moving to the left.

Conclusions
This paper introduces a novel application of foot-mounted inertial sensor based wearable electronic devices: game play. The main contributions of this paper can be summarized as follows: (1) It presents the first attempt to employ the user's stepping direction to control player operation in game play. (2) It proposes and implements a novel, computationally efficient, real-time algorithm for identifying the foot moving direction. (3) In the proposed system, the accelerometer and gyroscope measurements are fused to derive the attitude, which is used to correct the misalignment error; this makes the proposed algorithm compatible with various shoe styles and sensor placements. (4) The stepping motion type can be recognized in the beginning phase of one step cycle, which guarantees the real-time applicability of the system. (5) A dedicated classifier is designed for each motion, so that each classifier only needs to distinguish two classes instead of one classifier recognizing all five motions; this yields a more precise and reliable identification result. (6) Three commonly used classifiers are compared in terms of cross-validation performance and response time; based on this comparison, the SVM classifier achieves the best performance. (7) It extends the inertial sensor based game play scenario to a foot motion control mode, which makes it possible to play running games indoors or anywhere, and potentially encourages users to exercise more for better health. Practical experiments with different users illustrate that the proposed system achieves highly accurate classification and an excellent user experience, and it effectively broadens the application scope of currently available wearable electronic devices.