Real-Time Motion Tracking for Mobile Augmented/Virtual Reality Using Adaptive Visual-Inertial Fusion
Abstract
1. Introduction
- By combining a monocular camera and an inertial sensor, real-time sensor-fusion-based 6-DoF motion tracking for mobile AR/VR is realized.
- To alleviate jitter during visual-inertial fusion, an adaptive filter framework is proposed that balances jitter against latency, enabling real-time and smooth 6-DoF motion tracking for mobile AR/VR.
2. Related Works
3. Materials and Methods
3.1. Platform and System Description
3.2. Monocular Visual and IMU Based Tracking
3.2.1. Monocular Parameter Calibration
3.2.2. Visual-Based Tracking
3.2.3. Process Model for Visual-Inertial Fusion
- (a) When IMU data arrive at a given sample frequency, the state vector is propagated by numerical integration of Equation (8).
- (b) Calculate and .
- (c) Compute the propagated state covariance matrix according to Equation (16).
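The orientation part of the propagation step can be illustrated with a short sketch. Equation (8) is not reproduced in this section, so the code assumes the standard quaternion kinematic equation, q̇ = ½ q ⊗ (0, ω), with a body-frame angular rate held constant over one IMU sample. The function names (`quat_mul`, `quat_dot`, `rk4_step`) are hypothetical and the integrator is a generic 4th-order Runge-Kutta step, not necessarily the authors' exact formulation.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def quat_dot(q, omega):
    """Assumed kinematics: q_dot = 0.5 * q (x) (0, omega)."""
    w = quat_mul(q, (0.0, *omega))
    return tuple(0.5 * c for c in w)

def normalize(q):
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def rk4_step(q, omega, dt):
    """One 4th-order Runge-Kutta step with omega held constant over dt."""
    def axpy(a, b, s):
        return tuple(x + s * y for x, y in zip(a, b))
    k1 = quat_dot(q, omega)
    k2 = quat_dot(axpy(q, k1, dt / 2), omega)
    k3 = quat_dot(axpy(q, k2, dt / 2), omega)
    k4 = quat_dot(axpy(q, k3, dt), omega)
    q_next = tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(q, k1, k2, k3, k4)
    )
    # Re-normalize so the result stays a unit quaternion.
    return normalize(q_next)

# Usage: rotate about the z axis at pi/2 rad/s for 1 s in 100 IMU samples.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = rk4_step(q, (0.0, 0.0, math.pi / 2), 0.01)
```

After the loop, `q` should represent a 90° rotation about z. Re-normalizing after every step is a common design choice that keeps numerical drift from pulling the quaternion off the unit sphere.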
3.3. Adaptive Visual-Inertial Fusion for Mobile AR/VR
3.3.1. Measurement Model for Visual-Inertial Fusion
| Algorithm 1. Visual-inertial motion tracking process. | 
| 01. Initialize , and | 
| 02. for do | 
| 03. { Time update: | 
| 04. Compute and , , | 
| 05. Compute with the 4th-order Runge-Kutta integration | 
| 06. if a pose from visual-based tracking has arrived | 
| 07. {Measurement update: | 
| 08. Compute the residual: , Kalman gain: ; | 
| 09. Compute the correction: , ; | 
| 10. Use to correct the state estimate and then obtain } | 
| 11. end | 
| 12. end } | 
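The control flow of Algorithm 1 (a time update on every IMU sample, a measurement update only when a visual pose arrives) can be sketched with a deliberately simplified filter. The example below is an assumption-laden stand-in: it tracks a one-dimensional position and velocity with a plain linear Kalman filter instead of the paper's full 6-DoF state, and the names `predict`, `update`, `track`, `q_acc`, and `r_meas` are hypothetical.

```python
def predict(x, P, accel, dt, q_acc):
    """Time update: propagate (position, velocity) and the 2x2 covariance."""
    p, v = x
    x_pred = (p + v * dt + 0.5 * accel * dt * dt, v + accel * dt)
    (p00, p01), (p10, p11) = P
    # F P F^T with F = [[1, dt], [0, 1]]
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11
    # Process noise from acceleration noise: Q = q_acc * G G^T, G = [dt^2/2, dt]
    g0, g1 = 0.5 * dt * dt, dt
    P_pred = (
        (a00 + q_acc * g0 * g0, a01 + q_acc * g0 * g1),
        (a10 + q_acc * g0 * g1, a11 + q_acc * g1 * g1),
    )
    return x_pred, P_pred

def update(x, P, z, r_meas):
    """Measurement update with a position-only measurement (H = [1, 0])."""
    p, v = x
    (p00, p01), (p10, p11) = P
    residual = z - p
    s = p00 + r_meas                 # innovation covariance
    k0, k1 = p00 / s, p10 / s        # Kalman gain
    x_new = (p + k0 * residual, v + k1 * residual)
    P_new = (
        ((1 - k0) * p00, (1 - k0) * p01),
        (p10 - k1 * p00, p11 - k1 * p01),
    )
    return x_new, P_new

def track(imu_accels, visual_poses, dt, q_acc=0.01, r_meas=0.01):
    """Run the loop: predict on every IMU sample, correct when a pose exists.

    visual_poses maps an IMU sample index to a position measurement.
    """
    x, P = (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))
    for k, accel in enumerate(imu_accels):
        x, P = predict(x, P, accel, dt, q_acc)
        z = visual_poses.get(k)
        if z is not None:
            x, P = update(x, P, z, r_meas)
    return x, P

# Usage: 100 Hz IMU, a visual position every 5th sample, constant acceleration.
dt = 0.01
poses = {k: 0.5 * ((k + 1) * dt) ** 2 for k in range(4, 100, 5)}
state, cov = track([1.0] * 100, poses, dt)
```

Keeping the visual measurements in a dictionary keyed by sample index is only a convenience for the sketch; a real system would more likely compare timestamps from the two sensor streams.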
3.3.2. Quaternion-Based Linear Filter Framework
3.3.3. Different Motion Situations Analysis
- (a) Jitter-filtering: When the mobile AR/VR system is nearly static or moves slowly, the change between adjacent poses is negligible. Jitter therefore dominates in this scenario, while latency is imperceptible to users; this stage is defined as jitter-filtering in this paper. The real-time distances between successive arriving poses are small at this stage, meaning that the normalized distance is close to 0.
- (b) Moderation-filtering: When the mobile AR/VR system moves at a moderate speed, the stage is defined as moderation-filtering in our work, with a moderate normalized distance.
- (c) Latency-filtering: When the mobile AR/VR system undergoes rapid motion, the change between adjacent arriving poses is drastic. The user perceives obvious latency when a pose does not arrive in time. Latency therefore dominates in this scenario, while jitter can be neglected during fast motion. This stage is defined as latency-filtering. The real-time distances between successive arriving poses are relatively large, meaning that the normalized distance is close to 1.
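The three stages above can be read as a smoothing weight that grows with the normalized distance between successive poses. The sketch below is one possible illustration of that idea, not the paper's filter definition: the thresholds `low` and `high`, the weights `w_min` and `w_max`, and the distance bounds `d_min` and `d_max` are all hypothetical values chosen for the example, and only position is handled (orientation would need quaternion interpolation instead of a linear blend).

```python
import math

def normalized_distance(prev, curr, d_min=0.0, d_max=0.1):
    """Map the Euclidean distance between two positions into [0, 1].

    d_min and d_max are assumed bounds, in the same unit as the positions.
    """
    d = math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))
    return min(1.0, max(0.0, (d - d_min) / (d_max - d_min)))

def blend_weight(d_norm, low=0.2, high=0.8, w_min=0.1, w_max=1.0):
    """Pick how much of the new pose to accept.

    d_norm <= low  -> jitter-filtering: strong smoothing (small weight)
    d_norm >= high -> latency-filtering: follow the new pose (large weight)
    otherwise      -> moderation-filtering: linear ramp between the two
    """
    if d_norm <= low:
        return w_min
    if d_norm >= high:
        return w_max
    t = (d_norm - low) / (high - low)
    return w_min + t * (w_max - w_min)

def adaptive_smooth(filtered_prev, pose_new):
    """Blend the previous filtered position toward the newly arrived one."""
    w = blend_weight(normalized_distance(filtered_prev, pose_new))
    return tuple(f + w * (n - f) for f, n in zip(filtered_prev, pose_new))

# Usage: a tiny displacement is mostly suppressed, a large one is followed.
slow = adaptive_smooth((0.0, 0.0, 0.0), (0.001, 0.0, 0.0))
fast = adaptive_smooth((0.0, 0.0, 0.0), (0.2, 0.0, 0.0))
```

With these illustrative values, the slow case moves the filtered position only a tenth of the way toward the new pose, which damps jitter, while the fast case adopts the new pose outright, which avoids adding latency.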
3.3.4. Adaptive Filter Framework Definition
4. Experiments and Results
4.1. Adaptive Visual-Inertial Fusion Performance
4.2. Real-Time Motion Tracking for Mobile AR
4.3. Real-Time Motion Tracking for Mobile VR
5. Discussion
6. Conclusions and Future Works
Acknowledgments
Author Contributions
Conflicts of Interest
| | Translational Error (cm) | | | Rotational Error (deg) | | | 
| Mean error | 5.78 | 5.67 | 0.81 | 0.72 | 0.67 | 0.79 | 
| Standard Deviation | 2.98 | 2.83 | 1.29 | 0.37 | 0.28 | 0.42 | 
| Maximum error | 10.46 | 9.83 | 3.25 | 1.59 | 1.41 | 1.71 | 
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Fang, W.; Zheng, L.; Deng, H.; Zhang, H. Real-Time Motion Tracking for Mobile Augmented/Virtual Reality Using Adaptive Visual-Inertial Fusion. Sensors 2017, 17, 1037. https://doi.org/10.3390/s17051037