*Sensors* **2010**, *10*(6), 5378-5394; doi:10.3390/s100605378

## Abstract

This paper describes the development of a modified Kalman filter that integrates a multi-camera vision system and a strapdown inertial navigation system (SDINS) to track a hand-held moving device in slow or nearly static applications over extended periods of time. In this algorithm, the magnitudes of the changes in position and velocity are estimated and then added to the previous estimates of the position and velocity, respectively. The experimental results of the hybrid vision/SDINS design show that the position error of the tool tip in all directions is about one millimeter RMS. The proposed Kalman filter removes the effect of the gravitational force from the state-space model. As a result, the error caused by the gravitational force is eliminated and the resulting position is smoother and ripple-free.

## 1. Introduction

It is well known that inertial navigation sensors drift. The inertial sensor drift has two components: bias stability and bias variability. Because these components pass through a double integration in the position calculation, the output of the Inertial Navigation System (INS) becomes unreliable after a while, and they cause unavoidable drift in the orientation and position estimates. Removing the drift of inertial navigation systems requires that the sensors be assisted with other resources or technologies such as Global Positioning Systems (GPS) [1,2], vision systems [3–5], or odometers [6,7].

Kalman filtering is a common method in data fusion. The Kalman filter is a powerful method for improving the output estimation and reducing the effect of sensor drift. Although sensor integration is generally based on Kalman filtering, different types of Kalman filters are being developed in this area [8–14].

In the past, three-dimensional attitude representations were applied, but these representations are singular or discontinuous for certain attitudes [15]. As a result, the quaternion parameterization was proposed, which has the lowest possible dimension for a globally non-singular attitude representation [16,17].

In aided inertial motion tracking applications, the state variables of a Kalman filter usually take one of two forms: first, the sensed engineering quantities, that is, acceleration, velocity, attitude, and so on; and second, the errors of these quantities. The first form is used by the Centralized Kalman Filter [14], the Unscented Kalman Filter [18–20], the Adaptive Kalman Filter [10,21], and the Sigma-point Extended Kalman Filter [22], while the second is used by the Indirect Kalman Filter [23–25].

A Kalman filter that operates on the error states is called an indirect or a complementary Kalman filter. The optimal estimates of the errors are then subtracted from the sensed quantities to obtain the optimal estimates. Since the 1960s, the complementary Kalman filter has become the standard method of integrating non-inertial with inertial measurements in aviation and missile navigation. This method requires dynamic models for both the navigation variable states and the error states [26].

This research develops an EKF that estimates the changes in the state variables. The current estimated changes are then added to the previous estimates of the position and velocity, respectively. According to the general equations of the SDINS, the constant value of the gravitational force is removed from the resulting equations, and the error arising from uncertainty in the value of the gravitational force is eliminated.

## 3. Vision System

In this research, a vision system is proposed that includes four CCD cameras located on an arc to expand the field of view (see Figure 2). To find the Cartesian mapping grid for transforming 2D positions in the cameras' image planes to the corresponding 3D position in the navigation frame, a single-camera calibration for each camera and a stereo calibration for each pair of adjacent cameras are required.

The calibration of the vision system provides the intrinsic and extrinsic parameters of the cameras [38] in order to map a 2D point on the image planes to the 3D point in the world coordinate system. The estimation of camera parameters requires a single camera imaging model, as shown in Figure 3.

The camera lens distortion causes two kinds of displacement, radial and tangential [39]. The farther a point is from the center of the image plane, the larger the displacement, where the distance of a point $p={\left[\begin{array}{cc}x& y\end{array}\right]}^{T}={\left[\begin{array}{cc}\frac{{P}_{x}}{{P}_{z}}& \frac{{P}_{y}}{{P}_{z}}\end{array}\right]}^{T}$ on the image plane is defined by $r^{2} = x^{2} + y^{2}$.

Considering two vectors α and β as the radial and tangential distortion factors of a camera, the distortions can be calculated as [40]:

Consequently, the projection of each point in the world coordinate system into the image plane is:

where $f_{1}$ and $f_{2}$ denote the focal length factors of the lens. In fact, $f_{1}$ and $f_{2}$ are related to the focal length and the dimension of the pixels:

where $d_{px}$ and $d_{py}$ refer to the center-to-center distance between adjacent sensor elements in the x and y directions, respectively, and $s_{u}$ represents the image scale factor [41]; therefore:

According to the camera model obtained in Equation (13), the geometric parameters f, $s_{u}$, α, and β can be estimated by capturing enough images in which the coordinates of each 3D point P and of its 2D projected point p are known from calibration grids:

Applying the parameter estimation method [34,42] to Equation (11) gives the geometric parameters of a camera. Furthermore, the transformation matrix for each pair of adjacent cameras is computed by substituting the equations of the coordinate system transformation into Equation (11) for each corresponding projected point.
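The camera model described above can be illustrated with a minimal sketch. It assumes the widely used Brown form of radial and tangential distortion with two coefficients each (matching the four distortion coefficients per camera in Table 1); the exact expressions of the paper are not reproduced here, and the function name, argument order, and the omission of skew are illustrative assumptions.

```python
import numpy as np

def project_point(P, f1, f2, c, alpha, beta):
    """Project a 3D point P (camera frame) to pixel coordinates.

    alpha = (a1, a2) are radial and beta = (b1, b2) tangential coefficients.
    The distortion form is the common Brown model, assumed for illustration.
    """
    x, y = P[0] / P[2], P[1] / P[2]           # normalized image coordinates
    r2 = x * x + y * y                         # r^2 = x^2 + y^2
    radial = 1.0 + alpha[0] * r2 + alpha[1] * r2 ** 2
    dx = 2.0 * beta[0] * x * y + beta[1] * (r2 + 2.0 * x * x)
    dy = beta[0] * (r2 + 2.0 * y * y) + 2.0 * beta[1] * x * y
    xd, yd = radial * x + dx, radial * y + dy  # distorted coordinates
    # f1, f2 are the focal length factors in pixels; c is the principal point.
    return np.array([f1 * xd + c[0], f2 * yd + c[1]])
```

A point on the optical axis has r = 0, so it is unaffected by distortion and maps to the principal point, which is a quick sanity check for any calibrated parameter set.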

To localize the tool tip, edge detection and boundary extraction must be applied to every frame from each camera. Obtaining the edge of the tool tip requires a thresholding technique: a pixel is detected as an edge if its gradient is greater than the threshold. In this paper, the threshold is chosen so that the boundary pixels of the tool tip are detected as the edge positions. Since the tool tip is only a few pixels in size, an adaptive thresholding technique is applied to remove as many of the noise pixels around the tool tip as possible. For this purpose, a masking window is chosen around the initial guess of the position of the tool tip. Then, an initial threshold is chosen that selects the pixels whose value is above 80% of the value of all pixels of the image. If the boundary detection technique can identify the boundary of the tool tip, the threshold selection is appropriate. Otherwise, the previous threshold is reduced by 5%, and this procedure is repeated until a proper threshold is found. Afterwards, a morphological opening followed by a closing is applied to simplify and smooth the shape of the tool tip. Finally, the boundary of the tool tip is detected and extracted by using the eight-connected neighbors technique.
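The threshold search described above can be sketched as follows. The sketch assumes that "80%" means 80% of the brightest value in the masking window, and it replaces the boundary detector with a hypothetical stand-in (`enough_pixels`) that simply counts selected pixels; both are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def find_threshold(window, detect, start=0.80, step=0.05, max_iter=50):
    """Lower the threshold by 5% at a time until `detect` accepts the mask."""
    threshold = start * float(window.max())   # assumed meaning of "80%"
    for _ in range(max_iter):
        mask = window > threshold
        if detect(mask):
            return threshold, mask
        threshold *= 1.0 - step               # reduce the previous threshold by 5%
    return None, None                         # no acceptable threshold found

def enough_pixels(mask, min_pixels=4):
    """Hypothetical stand-in for the boundary detector's success check."""
    return int(mask.sum()) >= min_pixels

# Synthetic 5x5 masking window: one bright pixel with three dimmer neighbors.
window = np.full((5, 5), 10.0)
window[2, 2] = 100.0
window[1, 2] = window[2, 1] = window[2, 3] = 70.0
threshold, mask = find_threshold(window, enough_pixels)
print(threshold, int(mask.sum()))
```

In a real pipeline the `detect` callback would run the boundary extraction and report whether a closed tool-tip boundary was found.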

## 4. Modified Kalman Filter

The integrated navigation technique employs two or more independent sources of navigation information with complementary characteristics to achieve an accurate, reliable, and low-cost navigation system. Figure 4 shows a block diagram of the integration of the multi-camera vision system and the inertial navigation system:

Typically, the Extended Kalman Filter (EKF) is applied to combine two independent estimates of a nonlinear variable [43]. The continuous form of a nonlinear system is described as:

Since the measurements are, in practice, provided at discrete intervals of time, it is appropriate to express the system model in the form of discrete difference equations:

Therefore, the two sets of equations involving the prediction and updating of the state of the system are defined as:
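For orientation, the prediction and update steps of a generic discrete EKF can be sketched in their standard textbook form. The function names and arguments below (`f`, `h` for the nonlinear models, `F`, `H` for their Jacobians, `Q`, `R` for the noise covariances) are illustrative assumptions and are not taken from the paper's notation.

```python
import numpy as np

def ekf_predict(x, P, f, F, Q):
    """Propagate the state estimate x and covariance P through the model f."""
    x_pred = f(x)
    P_pred = F @ P @ F.T + Q          # F is the Jacobian of f evaluated at x
    return x_pred, P_pred

def ekf_update(x_pred, P_pred, z, h, H, R):
    """Correct the prediction with the measurement z."""
    innovation = z - h(x_pred)
    S = H @ P_pred @ H.T + R          # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x = x_pred + K @ innovation
    P = (np.eye(len(x)) - K @ H) @ P_pred
    return x, P
```

With a scalar state, unit prior variance, and unit measurement variance, the gain is 0.5, so the corrected estimate lies halfway between the prediction and the measurement.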

According to Equations (7), (17), and (18), the discrete form of the system is developed as:

where $T_{i}$ is the sampling rate of the inertial sensors. In this research, instead of estimating the actual value of these quantities, we propose to estimate how much the position and the velocity will change; that is:

As a consequence, the computation of the velocity is independent of the gravitational force in the new state-space model. In fact, the error caused by inaccurate value of the gravitational force in the new state-space model is completely eliminated.
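Why the gravitational term drops out can be checked numerically with a simplified sketch. It assumes a reduced velocity model, v[k+1] = v[k] + dt (f[k] + g), where f[k] is the measured acceleration already rotated into the navigation frame; Earth-rate and Coriolis terms are ignored, and the sign convention and all names are assumptions made only for this illustration.

```python
import numpy as np

def integrate_velocity(f_nav, g, v0, dt):
    """Conventional form: v[k+1] = v[k] + dt * (f_nav[k] + g)."""
    v = [np.asarray(v0, dtype=float)]
    for f in f_nav:
        v.append(v[-1] + dt * (f + g))
    return np.array(v)

def propagate_increments(f_nav, dv_first, dt):
    """Increment form: dv[k+1] = dv[k] + dt * (f_nav[k+1] - f_nav[k]).

    The constant g cancels in the difference, so it does not appear here.
    """
    dv = [np.asarray(dv_first, dtype=float)]
    for k in range(len(f_nav) - 1):
        dv.append(dv[-1] + dt * (f_nav[k + 1] - f_nav[k]))
    return np.array(dv)

rng = np.random.default_rng(0)
f_nav = rng.normal(size=(50, 3))            # synthetic accelerations
g = np.array([0.0, 0.0, -9.81])             # assumed gravity vector
dt = 0.01                                   # 100 Hz inertial sampling
v = integrate_velocity(f_nav, g, np.zeros(3), dt)
dv_direct = np.diff(v, axis=0)              # increments of the conventional form
dv_recursive = propagate_increments(f_nav, dv_direct[0], dt)
print(np.allclose(dv_direct, dv_recursive))
```

In this reduced model the gravity vector enters only through the first increment used for initialization; every later increment is driven by the change in measured acceleration alone.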

The inertial sensor noise is theoretically modeled as a zero-mean Gaussian random process. In practice, the average of the noise is not exactly zero. Owing to the inherent characteristics of the Gaussian random process, the discrete difference of a zero-mean Gaussian random process is also a zero-mean Gaussian random process, with a much smaller actual mean, while its variance is twice the variance of the original process. As a result, the drift resulting from the input noise is reduced and smooth positioning is expected.
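The statement about the differenced noise can be checked with a short simulation. The sketch assumes independent Gaussian samples with a small constant bias; the bias of 0.05 and unit standard deviation are arbitrary illustrative values, not sensor data.

```python
import numpy as np

def difference_statistics(samples):
    """Return (mean, variance) of the samples and of their first difference."""
    diff = np.diff(samples)
    return (samples.mean(), samples.var()), (diff.mean(), diff.var())

rng = np.random.default_rng(0)
noise = 0.05 + rng.normal(0.0, 1.0, size=100_000)   # hypothetical bias and sigma
(orig_mean, orig_var), (diff_mean, diff_var) = difference_statistics(noise)
print(f"mean: {orig_mean:.4f} -> {diff_mean:.6f}")
print(f"variance ratio: {diff_var / orig_var:.3f}")
```

The constant bias cancels in the difference, so the mean of the differenced sequence is far closer to zero, while the variance ratio stays near two. The factor of two relies on the samples being uncorrelated.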

The equation of the INS with the state vector $X = {\left[\begin{array}{ccc}\Delta x& \Delta v& q\end{array}\right]}^{T}$ can be reformulated as:

Subsequently, the transition matrix [44] can be calculated as:

By considering Δa = [Δ_{1} Δ_{2} Δ_{3}]^{T}:

Substituting $\dot{C}=\underset{\mathrm{\Delta}t\to 0}{\text{lim}}\left(\frac{\mathrm{\Delta}C}{\mathrm{\Delta}t}\right)$, where Δt = T, into Equation (3) leads to the following Equation:

Therefore:

As a result of Equation (106):

Because the vision system as the measurement system provides the position of the tool tip, velocity can be computed by knowing the present and the previous position at each time step:

where $T_{v}$ is the sampling rate of the cameras. Accordingly, the observation matrix would be:

## 5. Experimental Results

This section presents the experimental hardware setup and the results of applying the proposed EKF. The experimental hardware includes a 3DM-GX1 IMU from MicroStrain, an IDS Falcon Quattro PCIe frame grabber from IDS Imaging Development Systems, and four surveillance IR-CCD cameras. The IMU contains three rate gyros and three accelerometers with a sampling rate of 100 Hz and with noise densities of $3.5\,^{\circ}/\sqrt{\mathrm{hour}}$ and $0.4\ \mathrm{mg}/\sqrt{\mathrm{Hz}}$ rms, respectively [45].

All cameras are connected through the frame grabber to a PC. The frame grabber includes four parallel video channels able to capture images from the four cameras simultaneously at a sampling rate of 20 fps. Since the multi-camera vision system is used as a measurement system, the camera calibration procedure must be performed first. The intrinsic and extrinsic parameters of each camera are listed in Table 1.

Once the calibration is completed, the vision system is ready to track the tool and measure the position of the tool tip by applying image processing techniques. Figure 5 demonstrates the result of the video tracking by one of the cameras.

A predesigned path is printed on the 2D plane, and the tool tip is moved so as to trace this path on the plane, in order to compare the performance of the proposed EKF with that of the conventional EKF reported in [5].

Sensor fusion techniques allow us to estimate the state variables of the system at the sampling rate of the sensor with the highest measurement rate. In this experiment, the sampling rates of the cameras and the inertial sensors are 20 fps and 100 Hz, respectively. As a result of sensor fusion, the measurement rate of the proposed integrated system is 100 Hz.

The classical EKF is applied in both switch and continuous modes. In the switch mode, the estimate of the state variables is corrected whenever a measurement from the vision system is available; otherwise, the states are estimated based only on the SDINS. To reduce the computational load of the image processing algorithms, sensor fusion allows the sampling rate of the vision system to be reduced to 10 fps and 5 fps. As illustrated in Table 2, the positioning error increases as the sampling rate of the cameras is reduced. In addition, the error of the proposed EKF grows faster than that of the other methods, because this technique assumes that the rate of change of the state variables is constant from one frame to the next. This assumption does not hold at lower measurement rates.
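The scheduling of the switch mode can be illustrated with a deliberately reduced sketch: a scalar random-walk state whose error variance is predicted at every inertial sample and corrected only when a camera frame arrives. The noise values `q` and `r`, the scalar model, and the assumption that the inertial rate is an integer multiple of the camera rate are all illustrative and do not represent the filter used in the experiments.

```python
import numpy as np

def switch_mode_variance(imu_rate_hz, cam_rate_fps, n_steps, q=1e-4, r=1e-2):
    """Track the error variance P of a scalar state in switch mode."""
    steps_per_frame = imu_rate_hz // cam_rate_fps   # inertial samples per frame
    P, history, updates = 1.0, [], 0
    for k in range(n_steps):
        P = P + q                       # prediction from the inertial data only
        if k % steps_per_frame == 0:    # a camera measurement is available
            K = P / (P + r)             # scalar Kalman gain
            P = (1.0 - K) * P           # correction with the vision measurement
            updates += 1
        history.append(P)
    return np.array(history), updates

history_20, updates_20 = switch_mode_variance(100, 20, 100)
history_5, updates_5 = switch_mode_variance(100, 5, 100)
print(updates_20, updates_5)
print(history_20[-1], history_5[-1])
```

With fewer camera frames per second, the variance has more prediction steps in which to grow between corrections, which mirrors the trend reported in Table 2.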

Although Table 2 shows that the position error of the continuous EKF is smaller than that of the other methods, it should be mentioned that the position obtained by the multi-camera vision system still has errors compared with the predesigned path.

Figure 6 and Figure 7 compare the positions resulting from each method at two different parts of the trajectory of the tool tip, at two sampling rates, 16 fps and 5 fps. As shown, the camera path is traced smoothly by applying the continuous EKF. Since the position is estimated in real time, it is not possible to fit a curve between consecutive camera measurements without sensor fusion techniques.

The position resulting from the switch EKF is jagged due to the position drift in the SDINS, and the ripples are amplified as the measurement rate of the cameras decreases. The position estimated by the proposed EKF is smooth and ripple-free, and this method tries to reduce the errors of the entire system compared with the predesigned path. As a result, the proposed EKF is suitable for higher measurement rates, while the continuous EKF is recommended for lower sampling rates. Although the errors of the inertial sensors resulting from noise and the common motion-dependent errors are compensated, the remaining errors cause position estimation error in the integrated system. In addition, video tracking errors contribute to the position estimation error as well.

## 6. Conclusions

This paper describes the use of the EKF to integrate the multi-camera vision system and inertial sensors. Sensor fusion techniques allow estimation of the state variables at the sampling rate of the sensor with the highest measurement rate. This makes it possible to reduce the sampling rate of the sensors with a high computational load.

The classical EKF is designed for nonlinear dynamic systems such as the strapdown inertial navigation system. The performance of the classical EKF is reduced by lowering the sampling rate of the cameras. When the sampling rate of the cameras is reduced, the rate of updating decreases and the system must rely more on the inertial sensors output for estimating the position. Because of the drift in the SDINS, the position error increases.

The modified EKF is proposed to obtain position estimates with less error. Furthermore, it removes the effect of the gravitational force from the state-space model, so the error resulting from inaccuracy in the evaluation of the gravitational force is eliminated. In addition, the estimated position is smooth and ripple-free. However, the proposed EKF performs less well at lower measurement rates. The error of the estimated position results from inertial sensor errors, uncompensated common motion-dependent errors, attitude errors, video tracking errors, and unsynchronized data.

## References

1. Farrell, J.; Barth, M. Global Positioning System and Inertial Navigation; McGraw-Hill: New York, NY, USA, 1999; p. 145.
2. Grewal, M.; Weill, L.R.; Andrews, A.P. Global Positioning Systems, Inertial Navigation, and Integration, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2007.
3. Foxlin, E.; Naimark, L. VIS-Tracker: A Wearable Vision-Inertial Self-Tracker. Proceedings of IEEE Virtual Reality Conference, Los Angeles, CA, USA, March 2003; pp. 199–206.
4. Parnian, N.; Golnaraghi, F. Integration of Vision and Inertial Sensors for Industrial Tools Tracking. Sens. Rev. **2007**, 27, 132–141.
5. Parnian, N.; Golnaraghi, F. A Low-Cost Hybrid SDINS/Multi-Camera Vision System for a Hand-Held Tool Positioning. Proceedings of 2008 IEEE/ION Position, Location and Navigation Symposium, Monterey, CA, USA, May 6–8, 2008; pp. 489–496.
6. Ernest, P.; Mazl, R.; Preucil, L. Train Locator Using Inertial Sensors and Odometer. Proceedings of IEEE Intelligent Vehicles Symposium, Parma, Italy, June 2004; pp. 860–865.
7. Pingyuan, C.; Tianlai, X. Data Fusion Algorithm for INS/GPS/Odometer Integrated Navigation. Proceedings of IEEE Conference on Industrial Electronics and Applications, Harbin, China, May 2007; pp. 1893–1897.
8. Abuhadrous, I.; Nashashibi, F.; Laurgeau, C. 3D Land Vehicle Localization: A Real-time Multi-Sensor Data Fusion Approach using RTMAPS. Proceedings of the 11th International Conference on Advanced Robotics, Coimbra, Portugal, June 30–July 3, 2003; pp. 71–76.
9. Bian, H.; Jin, Z.; Tian, W. Study on GPS Attitude Determination System Aided INS Using Adaptive Kalman Filter. Meas. Sci. Technol. **2005**, 16, 2072–2079.
10. Hu, C.; Chen, W.; Chen, Y.; Liu, D. Adaptive Kalman Filtering for Vehicle Navigation. J. Global Position Syst. **2003**, 2, 42–47.
11. Crassidis, J.L.; Lightsey, E.G.; Markley, F.L. Efficient and Optimal Attitude Determination Using Recursive Global Positioning System Signal Operations. J. Guid. Control Dyn. **1999**, 22, 193–201.
12. Crassidis, J.L.; Markley, F.L. New Algorithm for Attitude Determination Using Global Positioning System Signals. J. Guid. Control Dyn. **1997**, 20, 891–896.
13. Kumar, N.V. Integration of Inertial Navigation System and Global Positioning System Using Kalman Filtering. PhD Thesis, Indian Institute of Technology, New Delhi, Delhi, India, 2004.
14. Lee, T.G. Centralized Kalman Filter with Adaptive Measurement Fusion: its Application to a GPS/SDINS Integration System with an Additional Sensor. Int. J. Control Autom. Syst. **2003**, 1, 444–452.
15. Pittelkau, M.E. An Analysis of Quaternion Attitude Determination Filter. J. Astron. Sci. **2003**, 51, 103–120.
16. Markley, F.L. Attitude Error Representation for Kalman Filtering. J. Guid. Control Dyn. **2003**, 26, 311–317.
17. Markley, F.L. Multiplicative vs. Additive Filtering for Spacecraft Attitude Determination. Proceedings of the 6th Conference on Dynamics and Control of Systems and Structures in Space (DCSSS), Riomaggiore, Italy, July 2004.
18. Crassidis, J.L.; Markley, F.L. Unscented Filtering for Spacecraft Attitude Estimation. J. Guid. Control Dyn. **2003**, 26, 536–542.
19. Grewal, M.S.; Henderson, V.D.; Miyasako, R.S. Application of Kalman Filtering to the Calibration and Alignment of Inertial Navigation Systems. IEEE Trans. Autom. Control **1991**, 39, 4–13.
20. Lai, K.L.; Crassidis, J.L.; Harman, R.R. In-Space Spacecraft Alignment Calibration Using the Unscented Filter. Proceedings of AIAA Guidance, Navigation, and Control Conference and Exhibit, Austin, TX, USA, August 2003; pp. 1–11.
21. Pittelkau, M.E. Kalman Filtering for Spacecraft System Alignment Calibration. J. Guid. Control. Dynam. **2001**, 24, 1187–1195.
22. Merwe, R.V.; Wan, E.A. Sigma-Point Kalman Filters for Integrated Navigation. Proceedings of the 60th Annual Meeting of the Institute of Navigation, Dayton, OH, USA, June 2004.
23. Chung, H.; Ojeda, L.; Borenstein, J. Sensor fusion for Mobile Robot Dead-reckoning with a Precision-calibrated Fibre Optic Gyroscope. Proceedings of IEEE International Conference on Robotics and Automation, Seoul, Korea, May 2001; pp. 3588–3593.
24. Chung, H.; Ojeda, L.; Borenstein, J. Accurate Mobile Robot Dead-reckoning with a Precision-Calibrated Fibre Optic Gyroscope. IEEE Trans. Rob. Autom. **2004**, 17, 80–84.
25. Roumeliotis, S.I.; Sukhatme, G.S.; Bekey, G.A. Circumventing Dynamic Modeling: Evaluation Of The Error-State Kalman Filter Applied To Mobile Robot Localization. Proceedings of IEEE International Conference on Robotics and Automation, Detroit, MI, USA, May 1999; pp. 1656–1663.
26. Friedland, B. Analysis Strapdown navigation Using Quaternions. IEEE Trans. Aerosp. Electron. Syst. **1974**, AES-14, 764–767.
27. Tao, T.; Hu, H.; Zhou, H. Integration of Vision and Inertial Sensors for 3D Arm Motion Tracking in Home-based Rehabilitation. Int. J. Robot. Res. **2007**, 26, 607–624.
28. Ang, W.T. Active Tremor Compensation in Handheld Instrument for Microsurgery. PhD Thesis, technology report CMU-RI-TR-04-28, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA, May 2004.
29. Ledroz, A.G.; Pecht, E.; Cramer, D.; Mintchev, M.P. FOG-Based Navigation in Downhole Environment During Horizontal Drilling Utilizing a Complete Inertial Measurement Unit: Directional Measurement-While-Drilling Surveying. IEEE Trans. Instrum. Meas. **2005**, 54, 1997–2006.
30. Pandiyan, J.; Umapathy, M.; Balachandar, S.; Arumugam, A.; Ramasamy, S.; Gajjar, N.C. Design of Industrial Vibration Transmitter Using MEMS Accelerometer. Instit. Phys. Conf. Ser. **2006**, 34, 442–447.
31. Huster, A.; Rock, S.M. Relative Position Sensing by Fusing Monocular Vision and Inertial Rate Sensors. Proceedings of IEEE International Conference on Advanced Robotics, Coimbra, Portugal, June 30–July 3, 2003; pp. 1562–1567.
32. Persa, S.; Jonker, P. Multi-sensor Robot Navigation System. SPIE Int. Soc. Opt. Eng. **2002**, 4573, 187–194.
33. Treiber, M. Dynamic Capture of Human Arm Motion Using Inertial Sensors and Kinematical Equations. Master Thesis, University of Waterloo, Ontario, Canada, 2004.
34. Titterton, D.H.; Weston, J.L. Strapdown Inertial Navigation Technology, 2nd ed.; The Institution of Electrical Engineers: Herts, UK, 2004.
35. Hibbeler, R.C. Engineering Mechanics: Statics and Dynamics, 8th ed.; Prentice-Hall: Bergen County, NJ, USA, 1998.
36. Angular Acceleration of the Earth. Available online: http://jason.kamin.com/projects_files/equations.html/ (accessed on 21 January 2010).
37. Angular Speed of the Earth. Available online: http://hypertextbook.com/facts/2002/Jason/Atkins.shtml/ (accessed on 21 January 2010).
38. Forsyth, D.A.; Ponce, J. Computer Vision: A Modern Approach; Prentice-Hall: Upper Saddle River, NJ, USA, 2003.
39. Yoneyama, S.; Kikuta, H.; Kitagawa, A.; Kitamura, K. Lens Distortion Correction for Digital Image Correlation by Measuring Rigid Body Displacement. Opt. Eng. **2006**, 42, 1–9.
40. Brown, D.C. Close-Range Camera Calibration. Photogram. Eng. **1971**, 37, 855–866.
41. Tsai, R.Y. A Versatile Camera Calibration Technique for High Accuracy 3D Machine Vision Metrology Using Off-the-shelf TV Cameras and Lenses. IEEE J. Rob. Autom. **1987**, RA-3, 323–344.
42. Heikkila, J. Accurate Camera Calibration and Feature-based 3-D Reconstruction from Monocular Image Sequences. Dissertation, University of Oulu, Oulun yliopisto, Finland, 1997.
43. Grewal, M.S.; Andrews, A.P. Kalman Filtering: Theory and Practice Using MATLAB, 2nd ed.; John Wiley: New York, NY, USA, 2001.
44. Zarchan, P.; Musoff, H. Fundamentals of Kalman Filtering: A Practical Approach, 2nd ed.; AIAA: Alexandria, VA, USA, 2005.
45. MicroStrain: Orientation Sensors—Wireless Sensors. Available online: http://www.microstrain.com/ (accessed on 21 January 2010).

**Figure 6.** Estimated position obtained with different estimation methods: continuous EKF (left), switch EKF (center), and proposed EKF (right), when the sampling rate of the cameras is 16 fps.

**Figure 7.** Estimated position obtained with different estimation methods: continuous EKF (left), switch EKF (center), and proposed EKF (right), when the sampling rate of the cameras is 5 fps.

**Table 1.** Intrinsic and extrinsic parameters of each camera.

| | Camera #1 | Camera #2 | Camera #3 | Camera #4 |
|---|---|---|---|---|
| Focal length | X: 400.69 pixels; Y: 402.55 pixels | X: 398.51 pixels; Y: 400.44 pixels | X: 402.00 pixels; Y: 405.10 pixels | X: 398.74 pixels; Y: 400.60 pixels |
| Principal point | X: 131.12 pixels; Y: 130.10 pixels | X: 152.74 pixels; Y: 122.79 pixels | X: 144.77 pixels; Y: 118.23 pixels | X: 136.90 pixels; Y: 145.34 pixels |
| Distortion coefficients | $K_{r,x}$: −0.3494; $K_{r,y}$: 0.1511; $K_{t,x}$: 0.0032; $K_{t,y}$: −0.0030 | $K_{r,x}$: −0.3522; $K_{r,y}$: 0.1608; $K_{t,x}$: 0.0047; $K_{t,y}$: −0.0005 | $K_{r,x}$: −0.3567; $K_{r,y}$: 0.0998; $K_{t,x}$: −0.0024; $K_{t,y}$: 0.0016 | $K_{r,x}$: −0.3522; $K_{r,y}$: 0.0885; $K_{t,x}$: 0.0024; $K_{t,y}$: −0.0002 |
| Rotation vector w.r.t. inertial reference frame | 1.552265, 2.255665, −0.635153 | 0.4686021, 2.889162, −0.7405382 | 0.6128003, −2.859007, 0.7741390 | 1.537200, −2.314144, 0.4821106 |
| Translation vector w.r.t. inertial reference frame | 729.4870 mm, 293.6999 mm, 873.3399 mm | 385.2578 mm, 625.1560 mm, 840.7220 mm | −61.1933 mm, 623.1377 mm, 851.9321 mm | −365.5847 mm, 289.6135 mm, 848.5442 mm |

**Table 2.** Positions estimated by different estimation methods are compared with the position estimated by the multi-camera vision system.

| Cameras Measurement Rate | Proposed EKF: Error (RMS) | Proposed EKF: Variance | EKF (Switch): Error (RMS) | EKF (Switch): Variance | EKF (Continuous): Error (RMS) | EKF (Continuous): Variance |
|---|---|---|---|---|---|---|
| 16 fps | 0.9854 | 0.1779 | 1.0076 | 0.7851 | 0.4320 | 0.1386 |
| 10 fps | 1.0883 | 0.3197 | 1.2147 | 0.8343 | 0.5658 | 0.2149 |
| 5 fps | 1.4730 | 1.5173 | 1.3278 | 0.8755 | 0.7257 | 0.8025 |

© 2010 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).