Foot Pose Estimation Using an Inertial Sensor Unit and Two Distance Sensors

There are many inertial sensor-based foot pose estimation algorithms. In this paper, we present a methodology to improve the accuracy of foot pose estimation using two low-cost distance sensors (VL6180) in addition to an inertial sensor unit. The distance sensor is a time-of-flight range finder and can measure distance up to 20 cm. A Kalman filter with 21 states is proposed to estimate both the calibration parameter (relative pose of distance sensors with respect to the inertial sensor unit) and foot pose. Once the calibration parameter is obtained, a Kalman filter with nine states can be used to estimate foot pose. Through four activities (walking, dancing step, ball kicking, jumping), it is shown that the proposed algorithm significantly improves the vertical position estimation.


Introduction
Foot pose (position and orientation) estimation is used in many areas, such as gait analysis [1,2,3], exergaming [4,5] and pedestrian navigation systems [6]. The most accurate method to estimate foot pose is to use an optical motion tracker. However, optical motion tracking is rather expensive, and it can only capture motion in a limited space, which is determined by the number of cameras.
Recently, inertial sensor-based foot pose estimation methods have been increasingly used, since the sensor can be attached to a shoe. This requires no infrastructure in the environment and thus removes the space constraint imposed by optical motion trackers. As inertial sensors become smaller and less expensive, inertial sensor-based motion tracking (ISBMT) is expected to become more popular.
The basic principle of ISBMT is integration of the gyroscope output (which gives the orientation) and double integration of the accelerometer output (which gives the position). In the process, the sensor noise is also integrated. For example, a nonzero accelerometer bias produces a position estimation error that grows in proportion to the square of the elapsed time [7]. Furthermore, an initial orientation error also significantly affects the position estimation error, since the gravitational acceleration is then not correctly removed during the position estimation [7].
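The quadratic error growth can be illustrated numerically. The following minimal Python sketch double-integrates a constant accelerometer bias with simple Euler steps and compares the result with the closed-form value b t^2 / 2. The bias of 0.05 m/s^2 is a hypothetical number chosen for illustration, not a measured property of the sensor used in this paper.

```python
def position_error_from_bias(bias, dt, duration):
    """Double-integrate a constant accelerometer bias (m/s^2) with Euler steps."""
    velocity_error = 0.0
    position_error = 0.0
    for _ in range(round(duration / dt)):
        velocity_error += bias * dt            # first integration: velocity error
        position_error += velocity_error * dt  # second integration: position error
    return position_error


if __name__ == "__main__":
    bias = 0.05  # hypothetical bias in m/s^2 (assumption for illustration)
    dt = 0.01    # sampling period of a 100 Hz inertial sensor
    for t in (1.0, 2.0, 4.0):
        numeric = position_error_from_bias(bias, dt, t)
        closed_form = 0.5 * bias * t ** 2
        print(f"t = {t:.0f} s: numeric {numeric:.4f} m, closed form {closed_form:.4f} m")
```

Doubling the elapsed time roughly quadruples the position error, which is why uncorrected integration drifts quickly.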
In foot pose tracking, the error increase can be mitigated using zero velocity updating [8,9]. Whenever a foot touches the ground, we know the velocity of the foot is zero. Using this zero velocity information, the estimation error can be significantly reduced [6].
Another approach to reduce the estimation error is to use a smoother [10]. The smoother uses data from both before and after the current time to estimate the current state, and this combination reduces the estimation error. One drawback of a smoother is that its computation cannot be done online; thus, it is not suitable for applications such as gaming.
The pose estimation error can also be reduced by using additional sensors. In [11], a pressure sensor is attached to a shoe to accurately detect the zero velocity intervals, which increases the estimation accuracy. In [12], an ultrasound range sensor is used to measure the distance to walls. The ultrasound sensor does not improve foot pose estimation, but improves the location accuracy for indoor pedestrian navigation. In [13], a camera is used to read markers on the floor, which gives the absolute position and orientation of a foot. One disadvantage of this approach is that the markers must be installed on the floor, so the experiment space is limited by the number of markers. In [14], a camera and infrared LEDs are used to measure the relative pose between two feet.
In this paper, we present a methodology to improve the accuracy of foot pose estimation by attaching low-cost distance sensors on a shoe in addition to an inertial sensor unit. Since the distance sensor gives height information, it can help to improve height estimation accuracy. The improved accuracy could be helpful in gait analysis and exergaming applications. The foot pose estimation algorithm is implemented using a Kalman filter. To combine inertial sensors and distance sensors, the relative pose between two sensors should be known. This pose parameter is included in the proposed Kalman filter.
The paper is organized as follows. In Section 2, the system hardware and coordinate systems are introduced. In Section 3, dynamic equations of the Kalman filter for ISBMT are given. In Section 4, measurement equations of the Kalman filter are provided. In Section 5, the proposed algorithm is tested, and the estimated positions are compared with optical tracker measurement values. The discussion is given in Section 6.

System Overview
An inertial measurement unit (three-axis accelerometers and three-axis gyroscopes, Xsens MTi) is attached to a shoe as in Figure 1. The sampling frequency of the inertial sensors is 100 Hz. Two distance sensors (VL6180 [15]) are also attached to the shoe, where the symbols A and B are used to distinguish the two sensors. The VL6180 measures distance from the time-of-flight of infrared light, and its measurement range is up to 20 cm. This sensor is most often used as a proximity sensor in smartphones. The sampling frequency of the distance sensors is 33.33 Hz.
Two coordinate systems are used in this paper: the body and world coordinate systems. The three axes of the body coordinate system coincide with the three axes of the inertial sensor unit. The z axis of the world coordinate system is aligned with the local gravitational field and points upward. The x and y axes are chosen arbitrarily. The origin of the world coordinate system is assumed to be on the floor.
The relative position and orientation of a distance sensor with respect to the inertial sensor unit are required in the foot pose estimation algorithm. As shown in Figure 1, the positions of the two distance sensors are denoted by $[r_A]_b \in \mathbb{R}^3$ and $[r_B]_b \in \mathbb{R}^3$, which are the position coordinates in the body coordinate system. The notation $[p]_b$ ($[p]_w$) for a vector $p \in \mathbb{R}^3$ is used to emphasize that the vector is represented in the body (world) coordinate system. When there is no confusion, $[p]_b$ (or $[p]_w$) is simply written as $p$. The pointing directions of the distance sensors are denoted by the unit vectors $n_A \in \mathbb{R}^3$ and $n_B \in \mathbb{R}^3$.
The distance sensor parameters ($r_A$, $r_B$, $n_A$, $n_B$ in Figure 1) could be determined using a ruler and a protractor. However, it is not easy to determine the parameters with high accuracy in this way. Thus, the small errors in the parameters are estimated in the Kalman filter. We model the parameters as follows:
$$r_A = \hat{r}_A + \tilde{r}_A, \qquad r_B = \hat{r}_B + \tilde{r}_B, \qquad n_A = \hat{n}_A + \tilde{n}_A, \qquad n_B = \hat{n}_B + \tilde{n}_B \tag{1}$$
where $\hat{r}_A \sim \hat{n}_B$ are the initial estimated parameter values (most likely values measured using a ruler and a protractor) and $\tilde{r}_A \sim \tilde{n}_B$ are the errors in the parameter estimation.
The distance between the two distance sensors is denoted by $d_{AB} = \|r_A - r_B\|_2$, which is measured with a ruler. This scalar quantity can be measured more accurately than the vector parameters, which are rather difficult to measure and sometimes require guesswork. Thus, the error of $d_{AB}$ is not estimated. The measured value of $d_{AB}$ is denoted by $z_{AB}$:
$$z_{AB} = d_{AB} + v_{AB} \tag{2}$$
where $v_{AB}$ is the measurement noise.

Kalman Filter for ISBMT
In this section, the foot pose tracking algorithm is given. Let $r \in \mathbb{R}^3$ and $v \in \mathbb{R}^3$ denote the position and velocity of the inertial sensor unit in the world coordinate system. Let $q \in \mathbb{R}^4$ be the quaternion [16] representing the rotation relationship between the body and world coordinate systems. Let $C(q) \in SO(3)$ be the rotation matrix corresponding to the quaternion $q$.

Basic Pose Equations
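The basic equations for $r$, $v$ and $q$ are given by [17]:
$$\dot{q} = \tfrac{1}{2}\,\Omega(\omega)\,q, \qquad \dot{v} = C(q)^{\top}[a]_b, \qquad \dot{r} = v \tag{3}$$
where $[a]_b \in \mathbb{R}^3$ is the acceleration expressed in the body coordinate system and $\omega = [\omega_x \ \omega_y \ \omega_z]^{\top}$ is the angular velocity of the body coordinate system with respect to the world coordinate system. Here $C(q)$ maps world coordinates to body coordinates, so its transpose rotates $[a]_b$ into the world coordinate system. The symbol $\Omega(\omega)$ is defined by:
$$\Omega(\omega) = \begin{bmatrix} 0 & -\omega_x & -\omega_y & -\omega_z \\ \omega_x & 0 & \omega_z & -\omega_y \\ \omega_y & -\omega_z & 0 & \omega_x \\ \omega_z & \omega_y & -\omega_x & 0 \end{bmatrix}$$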
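The quaternion kinematics can be made concrete with a short Python sketch. It assumes a scalar-first unit quaternion and takes C(q) to be the rotation matrix from world to body coordinates; these conventions are assumptions made for the example, and the function names are hypothetical.

```python
import math


def omega_matrix(w):
    """4x4 skew-symmetric matrix Omega(w) for an angular velocity w = (wx, wy, wz)."""
    wx, wy, wz = w
    return [
        [0.0, -wx, -wy, -wz],
        [wx, 0.0, wz, -wy],
        [wy, -wz, 0.0, wx],
        [wz, wy, -wx, 0.0],
    ]


def quaternion_rate(q, w):
    """Time derivative of the quaternion: 0.5 * Omega(w) * q."""
    om = omega_matrix(w)
    return [0.5 * sum(om[i][j] * q[j] for j in range(4)) for i in range(4)]


def rotation_matrix(q):
    """World-to-body rotation matrix C(q) for a unit quaternion (assumed convention)."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 + q0*q3), 2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]


if __name__ == "__main__":
    # Body frame rotated by 90 degrees about the z axis relative to the world frame.
    q = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]
    C = rotation_matrix(q)
    world_x_in_body = [C[i][0] for i in range(3)]  # C(q) applied to [1, 0, 0]
    print(world_x_in_body)
```

Because Omega(w) is skew-symmetric, the quaternion rate is always orthogonal to q, so exact integration preserves the unit norm. A discrete integrator still needs to renormalize.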

Numerical Integration
The basic principle of ISBMT is that $q$, $v$ and $r$ can be computed by numerically integrating Equation (3) [7]. To do that, we need to know $\omega$ and $[a]_b$, which can be measured using gyroscopes and accelerometers.
The gyroscope output ($y_g \in \mathbb{R}^3$) and accelerometer output ($y_a \in \mathbb{R}^3$) can be modeled as follows:
$$y_g = \omega + v_g, \qquad y_a = [a]_b + C(q)\,\tilde{g} + v_a$$
where $\tilde{g} = [0 \ 0 \ g]^{\top}$ is the local gravitational vector in the world coordinate system and $g$ is the magnitude of the gravitational acceleration. The measurement noises $v_g$ and $v_a$ are assumed to be uncorrelated zero mean white Gaussian noises. We can integrate Equation (3) by replacing $\omega$ by $y_g$ and replacing $[a]_b$ by $y_a - C(q)\tilde{g}$. The numerical integration algorithm is given in [18]. Let the integrated values be denoted by $\hat{q}$, $\hat{r}$ and $\hat{v}$.
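As an illustration of this integration, the sketch below performs one simple Euler step in Python. It is not the integration algorithm of [18]. It assumes a scalar-first quaternion, a world-to-body rotation matrix C(q), a gravity vector [0, 0, g] in the upward-pointing world frame and g = 9.8 m/s^2. The function names are hypothetical.

```python
import math

GRAVITY = 9.8  # assumed magnitude g of the gravitational acceleration, m/s^2


def rotation_matrix(q):
    """World-to-body rotation matrix C(q) for a unit quaternion (assumed convention)."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 + q0*q3), 2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]


def integrate_step(q, v, r, y_g, y_a, dt):
    """One Euler step for orientation, velocity and position from sensor outputs."""
    q0, q1, q2, q3 = q
    wx, wy, wz = y_g
    # Quaternion rate 0.5 * Omega(y_g) * q, written out row by row.
    dq = [
        0.5 * (-wx*q1 - wy*q2 - wz*q3),
        0.5 * (wx*q0 + wz*q2 - wy*q3),
        0.5 * (wy*q0 - wz*q1 + wx*q3),
        0.5 * (wz*q0 + wy*q1 - wx*q2),
    ]
    q_new = [qi + dqi * dt for qi, dqi in zip(q, dq)]
    norm = math.sqrt(sum(c * c for c in q_new))
    q_new = [c / norm for c in q_new]  # keep the quaternion at unit length

    C = rotation_matrix(q)
    # Remove gravity, C(q) * [0, 0, g]', from the accelerometer output.
    a_body = [y_a[i] - C[i][2] * GRAVITY for i in range(3)]
    # Rotate the body acceleration into the world frame with the transpose of C(q).
    a_world = [sum(C[i][j] * a_body[i] for i in range(3)) for j in range(3)]
    r_new = [r[j] + v[j] * dt + 0.5 * a_world[j] * dt * dt for j in range(3)]
    v_new = [v[j] + a_world[j] * dt for j in range(3)]
    return q_new, v_new, r_new


if __name__ == "__main__":
    q, v, r = [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
    for _ in range(100):  # 1 s of data at 100 Hz with 1 m/s^2 forward acceleration
        q, v, r = integrate_step(q, v, r, [0.0, 0.0, 0.0], [1.0, 0.0, GRAVITY], 0.01)
    print(v, r)
```

Any noise in the sensor outputs enters both integrations, which is why the integrated values carry errors that the Kalman filter must estimate.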
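Since $\omega \neq y_g$ and $[a]_b \neq y_a - C(q)\tilde{g}$ (the sensor outputs contain noise), there are errors in $\hat{q}$, $\hat{r}$ and $\hat{v}$. These errors are denoted by $\tilde{q} \in \mathbb{R}^3$, $\tilde{r} \in \mathbb{R}^3$ and $\tilde{v} \in \mathbb{R}^3$ and are defined in Equation (5), where $\otimes$ denotes the quaternion multiplication and $q^*$ is the conjugate quaternion of $q \in \mathbb{R}^4$. The error $\tilde{q}$ is three-dimensional because the estimation error of $\hat{q}$ is assumed to be small, so that the error quaternion is approximately determined by its vector part [19]. The errors $\tilde{q}$, $\tilde{r}$ and $\tilde{v}$, along with the parameter errors $\tilde{r}_A \sim \tilde{n}_B$ in Equation (1), are estimated in the Kalman filter. Combining the nine states in Equation (5) and the 12 states in Equation (1) gives the 21-dimensional state $x$ of the Kalman filter in Equation (7). Once the calibration parameters $\hat{r}_A \sim \hat{n}_B$ are estimated, the parameter error terms $\tilde{r}_A \sim \tilde{n}_B$ in Equation (7) can be removed for fast computation.
The dynamic equation of $x$ combines the error dynamics of $\tilde{q}$, $\tilde{r}$ and $\tilde{v}$, which are taken from the result in [20], with the dynamics of the parameter errors. The derivatives of $\tilde{r}_A \sim \tilde{n}_B$ are zero, since $r_A \sim n_B$ are constant parameters.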

Measurement Equation of the Kalman Filter
Two measurement equations are used in the Kalman filter. One is the measurement from distance sensors (Section 4.1), and the other is the measurement equation using the zero velocity intervals (Section 4.2).

Distance Sensor Output and Parameters
The distance sensor outputs $z_A \in \mathbb{R}$ and $z_B \in \mathbb{R}$ can be modeled as follows:
$$z_A = d_A + v_A, \qquad z_B = d_B + v_B \tag{9}$$
where $d_A$ and $d_B$ are the true distances from the sensors to the floor along $n_A$ and $n_B$, and $v_A$ and $v_B$ are measurement noises. Assuming that the floor is flat and the origin of the world coordinate system lies on the floor, the following is satisfied:
$$[0 \ 0 \ 1]\left(r + C(q)^{\top}(r_A + n_A d_A)\right) = 0, \qquad [0 \ 0 \ 1]\left(r + C(q)^{\top}(r_B + n_B d_B)\right) = 0 \tag{10}$$
In Equation (10), $-(r_A + n_A d_A)$ is a vector (in the body coordinate system) from a point on the floor (the intersection of the line along $n_A$ and the floor plane) to the inertial sensor. After pre-multiplying by $C(q)^{\top}$ (the rotation matrix from the body coordinate system to the world coordinate system), the third component of $-C(q)^{\top}(r_A + n_A d_A)$ is the height of the inertial sensor.
Now, we relate Equation (10) to the state in Equation (7). The rotation matrix $C(q)$ is first approximated in terms of $C(\hat{q})$ and $\tilde{q}$ (Equation (11), see [21]). Equations (1), (5), (9) and (11) are then inserted into the first equation of Equation (10). Assuming that the error terms ($\tilde{q}$, $\tilde{r}_A$, $\tilde{n}_A$, $\tilde{r}$) and the noise term ($v_A$) are small, we can ignore the product terms: for example, $[\tilde{q}\times]\tilde{r}_A$ is ignored. Ignoring the product terms gives a linear relation between the measurement $\tilde{z}_A$ and the state $x$, which is the measurement equation for distance sensor A. A measurement equation for distance sensor B is derived similarly.
In addition to Equations (13) and (15), some constraints are added in the measurement equations. The first constraint is that $n_A$ and $n_B$ should be unit vectors, since they represent direction vectors. The second constraint is that $\|r_A - r_B\|_2 = d_{AB}$.
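The flat-floor geometry behind the distance measurement can be sketched in Python as follows. The sketch solves for the distance at which the sensor beam reaches the floor plane z = 0, assuming that C is the world-to-body rotation matrix (so its transpose maps body to world). The function name and the mounting values are hypothetical. The 0.20 m limit reflects the 20 cm range of the VL6180.

```python
import math


def predicted_distance(r, C, r_s, n_s, max_range=0.20):
    """Distance along n_s from a sensor mounted at r_s to the floor plane z = 0.

    r   : position of the inertial sensor in the world frame (m)
    C   : 3x3 world-to-body rotation matrix (assumed convention)
    r_s : sensor position in the body frame (m)
    n_s : unit pointing direction of the sensor in the body frame
    Returns None when the beam does not reach the floor within max_range.
    """
    e3_ct = [C[i][2] for i in range(3)]  # third row of C', i.e. [0 0 1] C'
    sensor_height = r[2] + sum(e3_ct[i] * r_s[i] for i in range(3))
    beam_z = sum(e3_ct[i] * n_s[i] for i in range(3))
    if beam_z >= 0.0:
        return None  # beam is horizontal or points away from the floor
    d = -sensor_height / beam_z
    if d < 0.0 or d > max_range:
        return None
    return d


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    r = [0.0, 0.0, 0.10]      # hypothetical: inertial sensor 10 cm above the floor
    r_a = [0.05, 0.0, -0.02]  # hypothetical mounting position of sensor A
    n_a = [0.0, 0.0, -1.0]    # hypothetical: sensor A points straight down
    print(predicted_distance(r, identity, r_a, n_a))
```

Comparing such a predicted distance with the sensor output is what makes the height, the orientation and the mounting parameters observable through the distance measurement.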
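The constraint that $n_A$ and $n_B$ should be unit vectors is expressed as:
$$\|n_A\|_2 = 1, \qquad \|n_B\|_2 = 1$$
From $n_A^{\top} n_A = (\hat{n}_A + \tilde{n}_A)^{\top}(\hat{n}_A + \tilde{n}_A) = 1$ and $\tilde{n}_A^{\top}\tilde{n}_A \approx 0$ (assuming that $\tilde{n}_A$ is small), we have the following approximation:
$$1 - \hat{n}_A^{\top}\hat{n}_A \approx 2\,\hat{n}_A^{\top}\tilde{n}_A \tag{17}$$
Thus, the constraints on $n_A$ and $n_B$ are imposed through the measurement equations in Equation (18), in which $1 - \hat{n}_A^{\top}\hat{n}_A$ and $1 - \hat{n}_B^{\top}\hat{n}_B$ are treated as measurements of $2\hat{n}_A^{\top}\tilde{n}_A$ and $2\hat{n}_B^{\top}\tilde{n}_B$ corrupted by a noise term $v_{\mathrm{constraint}} \in \mathbb{R}^{2\times 1}$. This is an artificial noise reflecting the fact that the constraint Equation (17) is an approximation.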
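The size of the term neglected in this approximation can be checked numerically. The Python sketch below builds the pseudo-measurement and its measurement row for one direction vector, assuming the additive error model (true value equals estimate plus error). The numerical values and the function name are hypothetical.

```python
def unit_norm_pseudo_measurement(n_hat):
    """Return (z, h): z = 1 - n_hat'n_hat and the row h = 2 n_hat' acting on the error."""
    z = 1.0 - sum(c * c for c in n_hat)
    h = [2.0 * c for c in n_hat]
    return z, h


if __name__ == "__main__":
    n_true = [0.0, 0.0, -1.0]    # hypothetical true unit direction
    n_hat = [0.02, -0.01, -0.99]  # hypothetical hand-measured estimate
    n_err = [t - e for t, e in zip(n_true, n_hat)]  # error: true minus estimate
    z, h = unit_norm_pseudo_measurement(n_hat)
    linear_part = sum(hi * ei for hi, ei in zip(h, n_err))
    neglected = sum(e * e for e in n_err)  # the dropped quadratic term
    print(z, linear_part, neglected)
```

The pseudo-measurement equals the linear part plus the neglected quadratic term exactly, so the artificial noise only has to cover a quantity of second order in the parameter error.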
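Inserting Equation (2) into the constraint $\|r_A - r_B\|_2 = d_{AB}$, we have:
$$z_{AB} = \|r_A - r_B\|_2 + v_{AB}$$
Since this relationship is nonlinear in the parameter errors, an extended Kalman filtering technique is used for this measurement: the norm is linearized about the current estimates $\hat{r}_A$ and $\hat{r}_B$, which gives the measurement equation in Equation (20).
In summary, Equations (14), (15), (18) and (20) are used as measurement equations for the Kalman filter if distance sensor data $z_A$ and $z_B$ are available. In this case, these equations are stacked into a single measurement equation.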
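A first-order linearization of the distance constraint, of the kind used in an extended Kalman filter, can be sketched in Python as follows. The residual and the two Jacobian rows are derived here under the additive error model. They illustrate the technique and are not necessarily the exact form of the measurement equation used in this paper. The function name and the numbers are hypothetical.

```python
import math


def distance_constraint_linearization(r_a_hat, r_b_hat, z_ab):
    """Linearize z_ab = ||r_a - r_b|| + noise about the estimates r_a_hat, r_b_hat.

    Returns (residual, h_a, h_b) such that, to first order,
    residual = h_a . (error of r_a) + h_b . (error of r_b) + noise.
    """
    diff = [a - b for a, b in zip(r_a_hat, r_b_hat)]
    norm = math.sqrt(sum(c * c for c in diff))
    residual = z_ab - norm
    h_a = [c / norm for c in diff]  # gradient of the norm with respect to r_a
    h_b = [-c for c in h_a]         # gradient of the norm with respect to r_b
    return residual, h_a, h_b


if __name__ == "__main__":
    r_a_hat = [0.10, 0.02, 0.0]   # hypothetical initial estimate for sensor A (m)
    r_b_hat = [-0.05, 0.02, 0.0]  # hypothetical initial estimate for sensor B (m)
    z_ab = 0.152                  # hypothetical ruler measurement (m)
    print(distance_constraint_linearization(r_a_hat, r_b_hat, z_ab))
```

The two rows point along the line joining the sensors, so this measurement only corrects the component of the position errors along that line.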

Zero Velocity Updating
In addition to the distance sensors, zero velocity updating is used as a measurement equation. If we know when a foot is not moving (for example, when a foot is on the ground), the velocity error in the ISBMT algorithm can be reset, which significantly reduces foot pose estimation errors. Since velocity sensors (such as the Doppler velocity sensor in [22]) are not used, the zero velocity intervals are detected indirectly. Many zero velocity detection algorithms have been proposed for this purpose [8,23].
In this paper, we use a simple zero velocity detection algorithm. Let $Z_m$ be the set of all discrete time indices belonging to the zero velocity intervals. The discrete time $k$ belongs to $Z_m$ if the detection conditions on the gyroscope and accelerometer outputs are satisfied, where the conditions involve two even integers $N_g$ and $N_a$. During zero velocity intervals, we assume the following:
$$v_k = 0, \qquad k \in Z_m \tag{22}$$
Inserting Equation (5) into Equation (22), we obtain a measurement equation for the Kalman filter in which the velocity error $\tilde{v}$ is measured with an additive noise term $v_{\mathrm{zero}} \in \mathbb{R}^{3\times 1}$. This is an artificial noise reflecting the fact that the true velocity may be almost zero, but not exactly zero.

Experiments and Results
The proposed algorithm is tested with several foot movement experiments. The foot position is estimated using an inertial sensor unit (Xsens MTi) and two distance sensors (VL6180) as in Figure 2. The estimated positions are compared with the positions obtained using an optical motion tracker (OptiTrack system with six Flex 13 cameras), which is considered as the ground truth.
The parameters used in the proposed algorithm are summarized in Table 1. All noises are assumed to be uncorrelated white Gaussian noises.
Four motions (walking, dancing steps, ball kicking, jumping) are tested. The estimated position is compared with two inertial sensor-only pose estimations. In the first, the same algorithm is used without the distance sensors: that is, Equations (14) and (15) are not used in the Kalman filter. We call this method "K.F. (zero velocity updating)". In the second, height updating is added to the first method. If we assume that the floor is flat, the foot height during the zero velocity intervals (that is, when the foot is on the ground) should be the same. Thus, in addition to zero velocity updating, the z axis value of r is updated to the initial z axis value during each zero velocity interval. We call this method "K.F. (zero velocity + height updating)".
In Figures 3-5, the estimated positions (walking case) by the three methods (the proposed method and the two inertial sensor-only methods) are given along with the position measured by the optical tracker. Since the optical tracker coordinate system is different from the world coordinate system, the position data are translated and rotated. Furthermore, the inertial sensor data and the optical tracker data are not synchronized at the hardware level, so the data are synchronized by maximizing the cross-correlation.
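Synchronization by cross-correlation can be sketched in Python as follows. The sketch assumes that both signals have already been resampled to a common rate and searches for the integer sample lag that maximizes the cross-correlation of the mean-removed signals. Which signals were correlated in the experiments is not stated here, so the synthetic height-like signals and the function name are hypothetical.

```python
import math


def best_lag(reference, signal, max_lag):
    """Return the lag (in samples) for which signal[i + lag] best matches reference[i]."""
    ref_mean = sum(reference) / len(reference)
    sig_mean = sum(signal) / len(signal)
    ref0 = [x - ref_mean for x in reference]
    sig0 = [x - sig_mean for x in signal]
    best, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(ref0):
            j = i + lag
            if 0 <= j < len(sig0):  # only overlapping samples contribute
                score += x * sig0[j]
        if score > best_score:
            best, best_score = lag, score
    return best


if __name__ == "__main__":
    # Synthetic foot-height bumps; the second signal is delayed by 7 samples.
    tracker = [math.exp(-((i - 40) / 5.0) ** 2) for i in range(100)]
    estimate = [math.exp(-((i - 47) / 5.0) ** 2) for i in range(100)]
    print(best_lag(tracker, estimate, max_lag=20))
```

Once the lag is known, one of the two data streams is shifted by that number of samples before the position errors are computed.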
We can see that the x and y axis position estimations are similar in all three methods. This is not surprising, since the distance sensor only gives the z axis position (height) information. In the z axis position estimation, the proposed method gives better results (see the third graphs of Figures 3-5 around 4.6, 6 and 7 s). Among the inertial sensor-only estimations, K.F. (zero velocity + height updating) is better, since its height is compensated during the zero velocity intervals.
Another example is given in Figures 6 and 7, where a ball kicking action is performed. The x and y position estimation results of the proposed method and K.F. (zero velocity + height updating) are similar. In this case, the position errors are quite large, presumably because there is a rather long moving interval (from 2.8 s to 4 s) and a quick movement (large sensor values). In the z axis position estimation, we can see that the proposed method gives a significantly better result.
The RMS error results for the four activities are given in Table 2. In general, there is no significant improvement in the x and y axis position estimation.
The calibration parameters are estimated in the Kalman filter using the walking data, and the resulting values are given in Equation (24). Once the calibration parameters are computed, the dimension of the Kalman filter state reduces from 21 to nine, since $\tilde{r}_A \sim \tilde{n}_B$ can be removed from the state. The proposed algorithm is then used with this nine-state Kalman filter in two ways. In one Kalman filter, the initial calibration parameters in Table 1 are used. In the other, the estimated calibration parameters in Equation (24) (which are supposed to be more accurate) are used. The position estimation results of the two nine-state Kalman filters are given in Table 3, where the result of the 21-state Kalman filter is repeated from Table 2 for easy comparison. The nine-state Kalman filter with calibrated parameters gives slightly better results, except for the kicking case.
Since the difference between the initial and calibrated parameters is small, the RMS error difference is also small. The calibration of the parameters is a one-time process, and after the calibration, we can estimate the position using the nine-state Kalman filter.

Conclusions
There are many inertial sensor-based foot pose estimation algorithms. In this paper, the position estimation is improved by additionally using distance sensors. The distance sensor has a 20-cm range limitation. However, this limitation does not degrade the estimation accuracy much, since a foot is above the 20-cm range only for a short time, unless the foot is intentionally held in the air. Longer range sensors exist, but they are either less accurate at short ranges or more expensive.
The inertial motion estimation algorithm is proposed using a 21-state Kalman filter (nine states for pose tracking and 12 states for calibration parameters). After the parameters are calibrated, the nine-state Kalman filter can be used to estimate the foot pose.
The proposed algorithm is tested for four activities: walking, dancing steps, ball kicking and jumping. It is shown in Table 2 that there is significant improvement in the z axis position estimation compared with inertial sensor-only foot pose estimation. On the other hand, there is no improvement in the x and y axis, since the distance sensor only gives the height information.
The proposed algorithm can be used in gait analysis (which requires a foot pose estimation), motion-based gaming (such as a soccer game) and exergaming (game-based exercises).