Real-Time Orthophoto Mosaicing on Mobile Devices for Sequential Aerial Images with Low Overlap
Abstract
1. Introduction
- A novel online georeferenced orthophoto mosaicing framework with high efficiency and robustness: Compared with existing commercial software and current state-of-the-art mosaicing systems, our method provides a complete solution for real-time incremental stitching on mobile devices. Several improvements to robustness and efficiency are made to meet the challenging requirement of generating high-quality orthoimages quickly and with little computation.
- Planar restricted online pose graph optimization: A planar-restricted global pose graph optimization algorithm is proposed and compared with other 2D aerial image-mosaicing schemes and traditional SLAM or SfM systems. Instead of using sparse 3D map points, which are sensitive to outliers, keypoint matches are parameterized as planar-restricted reprojection errors, which effectively reduces the number of parameters. This method achieves better robustness and efficiency, even when the overlap rate is low.
- An adapted weighted multiband image fusion algorithm with LoD-based tiling, caching, and rendering: Memory resources on mobile devices are often limited, so retaining the entire mosaic for large-scale datasets is impossible. In addition, the display system demands a tiled DOM for efficient rendering. To solve these problems, the orthophoto is composed of image tiles managed by a hashed least recently used (LRU) cache, and the level of detail (LoD) technique is used for quick rendering.
- An Android application demonstrating algorithm effectiveness on mobile devices: To show realistic performance on cellphones, an Android application is designed to upload flight missions to DJI drones. In addition, we integrate the presented algorithm by providing a RESTful web service implemented in C++.
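The hashed LRU tile cache named in the third contribution can be illustrated with a minimal sketch. It is not the implementation used in this work: the class name `TileCache`, the `(level, x, y)` key, the `capacity` parameter, and the `on_evict` callback (where an evicted tile could be written to storage) are all hypothetical.

```python
from collections import OrderedDict


class TileCache:
    """Hash map with least-recently-used eviction for mosaic tiles.

    Keys are hypothetical (level, x, y) tuples; values are tile payloads.
    """

    def __init__(self, capacity, on_evict=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.on_evict = on_evict      # e.g. a function that saves the tile to disk
        self._tiles = OrderedDict()   # insertion order doubles as recency order

    def get(self, key):
        """Return the tile for key, or None; a hit marks the tile as recently used."""
        tile = self._tiles.get(key)
        if tile is not None:
            self._tiles.move_to_end(key)
        return tile

    def put(self, key, tile):
        """Insert or update a tile, evicting the least recently used ones if full."""
        self._tiles[key] = tile
        self._tiles.move_to_end(key)
        while len(self._tiles) > self.capacity:
            old_key, old_tile = self._tiles.popitem(last=False)
            if self.on_evict is not None:
                self.on_evict(old_key, old_tile)

    def __len__(self):
        return len(self._tiles)
```

With this design, only the tiles touched most recently stay in memory, and both lookup and eviction take constant time on average.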
2. Related Work
3. Methodology
3.1. Proposed Framework
3.2. Appearance and Spatial Based Fast Neighbors Query and Matching
- BoW-accelerated global correspondences with cross check. Because brute-force matching is computationally expensive, a BoW vocabulary is pretrained with the k-means algorithm, and features are mapped into a number of small spaces so that global searching is accelerated by matching within each space separately. This strategy is adopted to obtain an initial matching, and the accelerated version of DBoW implemented by GSLAM [29] is used in this work.
- Outlier filtering based on epipolar geometry and matching angle histogram. The initial matching contains outliers. A simple histogram-based voting filter is applied to increase the inlier rate. In addition, a fundamental matrix is estimated with the random sample consensus (RANSAC) [24] procedure for geometric outlier removal.
- Multi-homography-based rematching. The previous procedures may discard some good matches, so we try to recover them under a multi-homography assumption. The previous inlier matches are used to calculate multiple homography matrices. Then, a window search is performed for every unmatched feature to find the remaining potential matches. The distance between the matched keypoint and its epipolar line is computed, and the match is accepted only when the distance is below a fixed threshold.
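Two of the checks described above, the angle-histogram vote and the point-to-epipolar-line distance test, can be sketched as follows. This is an illustration under assumptions rather than the code used in this work: the function names, the bin count `n_bins`, and the `keep_top` parameter are hypothetical, and keypoint orientations are assumed to be given in degrees.

```python
import numpy as np


def angle_histogram_filter(angles1, angles2, n_bins=30, keep_top=3):
    """Return a boolean mask keeping matches whose orientation difference
    falls into one of the `keep_top` most-voted histogram bins."""
    diff = np.mod(np.asarray(angles2, float) - np.asarray(angles1, float), 360.0)
    bins = np.floor(diff / (360.0 / n_bins)).astype(int) % n_bins
    votes = np.bincount(bins, minlength=n_bins)
    top = np.argsort(votes)[::-1][:keep_top]
    top = top[votes[top] > 0]          # ignore empty bins
    return np.isin(bins, top)


def epipolar_distance(F, pts1, pts2):
    """Pixel distance from each point in pts2 to the epipolar line F @ x1."""
    pts1 = np.asarray(pts1, float)
    pts2 = np.asarray(pts2, float)
    x1 = np.column_stack([pts1, np.ones(len(pts1))])
    x2 = np.column_stack([pts2, np.ones(len(pts2))])
    lines = x1 @ F.T                   # row k is the line l = F x1_k in image 2
    numerator = np.abs(np.sum(lines * x2, axis=1))
    denominator = np.hypot(lines[:, 0], lines[:, 1])
    return numerator / denominator


def accept_rematches(F, pts1, pts2, max_dist=2.0):
    """Accept candidate rematches only if they lie close to their epipolar line."""
    return epipolar_distance(F, pts1, pts2) < max_dist
```

The threshold `max_dist` stands in for the fixed threshold mentioned in the text; its value here is arbitrary.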
3.3. Online Planar Restricted Sparse Reconstruction
3.4. Georeferenced Images Fusion with Tiling and LoD
- The fusion should be efficient enough for real-time processing. Computationally expensive offline view selection and graph-cut-based seam finding methods are unsuitable here.
- The mosaic result should be rendered efficiently. Because the dynamic orthophoto is published over the network, publishing the entire image frequently is impractical. Map rendering engines often require tiled images with LoD support, and only tiles that are visible and updated should be refreshed.
- The processing should be memory efficient to run on mobile devices. After fusing hundreds of images, the final mosaic could be very large, and the memory available on mobile devices is very limited. An efficient caching algorithm should keep only active data in memory.
- The final mosaic should be as ortho and smooth as possible. The stitching is not fully ortho because, for efficiency, the pipeline includes no dense 3D reconstruction procedure. However, we can still preserve the view, and the blending method should smooth the seam lines to obtain a natural result.
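The requirement that only updated tiles be refreshed at each LoD level can be sketched as a tile-index computation. The sketch assumes a conventional tile pyramid in which each coarser level halves the resolution; the function name, the 256-pixel tile size, and the integer pixel bounding box are assumptions made for illustration and are not taken from this work.

```python
def touched_tiles(x_min, y_min, x_max, y_max, max_level, tile_size=256):
    """Return {level: set of (tx, ty)} for the tiles that a newly fused image touches.

    The bounding box is given in integer pixels at the finest level (max_level)
    and is half-open: [x_min, x_max) x [y_min, y_max).
    """
    tiles = {}
    for level in range(max_level, -1, -1):
        # One tile at this level spans this many finest-level pixels.
        span = tile_size * 2 ** (max_level - level)
        tx0, ty0 = x_min // span, y_min // span
        tx1, ty1 = (x_max - 1) // span, (y_max - 1) // span
        tiles[level] = {
            (tx, ty)
            for tx in range(tx0, tx1 + 1)
            for ty in range(ty0, ty1 + 1)
        }
    return tiles


if __name__ == "__main__":
    # A 400 x 100 pixel update touches three fine tiles but only one coarse tile.
    for level, keys in sorted(touched_tiles(200, 0, 600, 100, max_level=2).items()):
        print(level, sorted(keys))
```

Only the returned tiles would need to be re-blended and republished after a new image is fused; all other tiles can stay untouched in the cache or on disk.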
3.5. Mobile Application with Flight Planning and Live Map
4. Experiments
4.1. Results on DroneMap2 Dataset
- Seam-line cutting of moving objects: Traditional seam finding methods are unsuitable here because of the incremental stitching style. In addition, for efficiency, the seam lines are determined automatically by the adaptive weighted blending. A seam line may therefore cut through moving objects such as cars, so that only half of a car is rendered, as illustrated in the highlights of Figure 3a,d,n.
- Live reconstruction drift between flight lines: Because the algorithm renders the orthophoto live, the stitched result is difficult to adjust afterwards. However, even with fused GPS information, the sparse reconstruction and pose estimation contain small drift and are updated as more observations arrive, which may cause mismatches, as shown in Figure 3 for sequences mavic-factory and mavic-warriors.
- Homography mismatch caused by high buildings: The sparse reconstruction and fusion steps assume that the ground is locally planar. Thus, a homography projection is used to warp the original image onto the stitching plane. For high buildings, mismatches may be observed, as illustrated in Figure 3b.
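The size of the building-related mismatch can be estimated with the standard relief-displacement relation. The following is a sketch under assumptions not stated in the text: a nadir-pointing pinhole camera at height $H$ above the stitching plane, and a roof point at height $h$ above the plane and at horizontal distance $R$ from the nadir point. The ray through the roof point meets the plane at distance $R'$, so the point is drawn displaced by $\Delta R$:

```latex
\[
R' = R\,\frac{H}{H-h},
\qquad
\Delta R = R' - R = R\,\frac{h}{H-h}
\]
```

For example, with $H = 100$ m, $h = 20$ m, and $R = 30$ m, the roof point is displaced by $\Delta R = 30 \cdot 20 / 80 = 7.5$ m on the stitching plane. Two overlapping images see the same roof from different nadir points, so their displacements differ, and the difference appears as a mismatch along the seam.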
4.2. Live DOM Quality Comparison
- Planes are used instead of map points for optimization. We do not rely on outlier-sensitive map points. Thus, fewer keypoints are extracted, higher robustness is obtained, and fewer parameters are used for optimization. This dramatically decreases the optimization complexity and speeds up processing. Outliers do not need to be handled as carefully, and the optimization converges in less time.
- GPS information is tightly used throughout the matching, reconstruction, and fusion procedures. Because GPS information is available for our georeferenced stitching, we use it throughout the entire pipeline to reduce the time budget. The GPS-aided absolute rotation averaging prevents poor pose estimation and accelerates the convergence of graph optimization.
- Direct orthophoto blending without dense and mesh reconstruction. Most SfM systems perform dense reconstruction and mesh triangulation before DOM rendering. Our method fuses images directly into the final mosaic, and our reconstruction step uses the planar assumption from the first stage. Thus, it provides better quality than map point-based SLAM front-end systems.
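The idea of optimizing over planes instead of map points can be illustrated with a plane-induced reprojection residual. The sketch below assumes a single ground plane z = 0 in the world frame, a shared intrinsic matrix `K`, and world-to-camera poses `(R, t)`; these choices and all function names are illustrative, and the exact parameterization used in this work may differ.

```python
import numpy as np


def plane_to_image(K, R, t):
    """Homography mapping points (X, Y, 1) on the plane z = 0 to image pixels.

    For x_cam = R @ X_world + t and z = 0, the third column of R drops out.
    """
    return K @ np.column_stack([R[:, 0], R[:, 1], t])


def planar_reprojection_error(K, pose_i, pose_j, px_i, px_j):
    """Residual of a keypoint match (px_i in image i, px_j in image j).

    The pixel in image i is transferred to image j through the ground plane,
    so the residual depends only on the two camera poses, not on a 3D point.
    """
    H_i = plane_to_image(K, *pose_i)
    H_j = plane_to_image(K, *pose_j)
    H_ji = H_j @ np.linalg.inv(H_i)
    p = H_ji @ np.array([px_i[0], px_i[1], 1.0])
    return p[:2] / p[2] - np.asarray(px_j, dtype=float)


if __name__ == "__main__":
    K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 500.0], [0.0, 0.0, 1.0]])
    R = np.diag([1.0, -1.0, -1.0])             # camera looking straight down
    pose_i = (R, np.array([0.0, 0.0, 100.0]))  # camera centre at (0, 0, 100)
    pose_j = (R, np.array([-20.0, 0.0, 100.0]))  # camera centre at (20, 0, 100)
    # Pixels of the ground point (10, 5, 0) in both images: residual is close to zero.
    print(planar_reprojection_error(K, pose_i, pose_j, (600.0, 450.0), (400.0, 450.0)))
```

Because no per-point 3D coordinates enter the residual, each match constrains only the two poses, which is consistent with the reduced parameter count described above.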
4.3. Computation Performance Comparison
4.4. Quantitative DOM Precision Evaluation
4.5. Web-Based Live Map Sharing
5. Conclusions
Author Contributions
Acknowledgments
Conflicts of Interest
References
1. Verhoeven, G. Taking computer vision aloft–archaeological three-dimensional reconstructions from aerial photographs with photoscan. Archaeol. Prospect. 2011, 18, 67–73.
2. Zhao, Y.; Liu, G.; Xu, S.; Bu, S.; Jiang, H.; Wan, G. Fast Georeferenced Aerial Image Stitching with Absolute Rotation Averaging and Planar-Restricted Pose Graph. IEEE Trans. Geosci. Remote Sens. 2020, 1–16.
3. Vallet, J.; Panissod, F.; Strecha, C.; Tracol, M. Photogrammetric performance of an ultra light weight swinglet UAV. ISPRS Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, XXXVIII-1/C22, 253–258.
4. Bu, S.; Zhao, Y.; Wan, G.; Liu, Z. Map2dfusion: Real-time incremental UAV image mosaicing based on monocular slam. In Proceedings of the IEEE Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference, Daejeon, Korea, 9–14 October 2016; pp. 4564–4571.
5. Botterill, T.; Mills, S.; Green, R. Real-time aerial image mosaicing. In Proceedings of the IEEE Image and Vision Computing New Zealand (IVCNZ), 2010 25th International Conference, Queenstown, New Zealand, 8–9 November 2010; pp. 1–8.
6. De Souza, R.H.C.; Okutomi, M.; Torii, A. Real-time image mosaicing using non-rigid registration. In Pacific-Rim Symposium on Image and Video Technology; Springer: Berlin/Heidelberg, Germany, 2011; pp. 311–322.
7. Lütjens, M.; Kersten, T.; Dorschel, B.; Tschirschwitz, F. Virtual Reality in Cartography: Immersive 3D Visualization of the Arctic Clyde Inlet (Canada) Using Digital Elevation Models and Bathymetric Data. Multimodal Technol. Interact. 2019, 3, 9.
8. Edler, D.; Keil, J.; Wiedenlübbert, T.; Sossna, M.; Dickmann, F. Immersive VR Experience of Redeveloped Post-industrial Sites: The Example of "Zeche Holland" in Bochum-Wattenscheid. KN J. Cartogr. Geogr. Inf. 2019, 69, 267–284.
9. Hruby, F.; Castellanos, I.; Ressl, R. Cartographic Scale in Immersive Virtual Environments. KN J. Cartogr. Geogr. Inf. 2020.
10. Mur-Artal, R.; Montiel, J.; Tardos, J.D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. arXiv 2015, arXiv:1502.00956.
11. Mur-Artal, R.; Tardós, J.D. Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras. IEEE Trans. Robot. 2017, 33, 1255–1262.
12. Engel, J.; Koltun, V.; Cremers, D. Direct sparse odometry. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 611–625.
13. Forster, C.; Zhang, Z.; Gassner, M.; Werlberger, M.; Scaramuzza, D. Svo: Semidirect visual odometry for monocular and multicamera systems. IEEE Trans. Robot. 2017, 33, 249–265.
14. Schönberger, J.L.; Frahm, J.M. Structure-from-Motion Revisited. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016.
15. Snavely, N. Bundler: Structure from Motion (SFM) for Unordered Image Collections. 2008. Available online: http://phototour.cs.washington.edu/bundler/ (accessed on 10 January 2020).
16. Moulon, P.; Monasse, P.; Marlet, R. Global fusion of relative motions for robust, accurate and scalable structure from motion. In Proceedings of the Computer Vision (ICCV), 2013 IEEE International Conference, Sydney, Australia, 1–8 December 2013; pp. 3248–3255.
17. Sweeney, C. Theia Multiview Geometry Library: Tutorial & Reference. Available online: http://theia-sfm.org (accessed on 1 October 2020).
18. Lati, A.; Belhocine, M.; Achour, N. Robust aerial image mosaicing algorithm based on fuzzy outliers rejection. Evol. Syst. 2019, 11, 717–729.
19. Warren, M.; McKinnon, D.; He, H.; Glover, A.; Shiel, M.; Upcroft, B. Large scale monocular vision-only mapping from a fixed-wing sUAS. In Field and Service Robotics; Springer: Berlin/Heidelberg, Germany, 2014; pp. 495–509.
20. Gherardi, R.; Farenzena, M.; Fusiello, A. Improving the efficiency of hierarchical structure-and-motion. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 1594–1600.
21. Turner, D.; Lucieer, A.; Watson, C. An automated technique for generating georectified mosaics from ultra-high resolution unmanned aerial vehicle (UAV) imagery, based on structure from motion (SfM) point clouds. Remote Sens. 2012, 4, 1392–1410.
22. Salaün, Y.; Marlet, R.; Monasse, P. Robust SfM with Little Image Overlap. arXiv 2017, arXiv:1703.07957.
23. Botterill, T.; Mills, S.; Green, R.D. New Conditional Sampling Strategies for Speeded-Up RANSAC. In Proceedings of the BMVC 2009, London, UK, 7–10 September 2009; pp. 1–11.
24. Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
25. Davis, J. Mosaics of scenes with moving objects. In Proceedings of the 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No. 98CB36231), Santa Barbara, CA, USA, 25 June 1998; pp. 354–360.
26. Garcia-Fidalgo, E.; Ortiz, A.; Bonnin-Pascual, F.; Company, J.P. Fast image mosaicing using incremental bags of binary words. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 1174–1180.
27. Guizilini, V.; Sales, D.; Lahoud, M.; Jorge, L. Embedded mosaic generation using aerial images. In Proceedings of the 2017 IEEE Latin American Robotics Symposium (LARS) and 2017 Brazilian Symposium on Robotics (SBR), Curitiba, Brazil, 8–11 November 2017; pp. 1–6.
28. Ruiz, J.J.; Caballero, F.; Merino, L. MGRAPH: A Multigraph Homography Method to Generate Incremental Mosaics in Real-Time From UAV Swarms. IEEE Robot. Autom. Lett. 2018, 3, 2838–2845.
29. Zhao, Y.; Xu, S.; Bu, S.; Jiang, H.; Han, P. GSLAM: A General SLAM Framework and Benchmark. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019.
30. Snyder, J.P. Map Projections—A Working Manual; US Government Printing Office: Washington, DC, USA, 1987; Volume 1395.
31. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the Computer Vision (ICCV), 2011 IEEE International Conference, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
32. Muja, M.; Lowe, D.G. Fast approximate nearest neighbors with automatic algorithm configuration. VISAPP 2009, 2, 2.
33. Tang, C.; Wang, O.; Tan, P. Gslam: Initialization-robust monocular visual slam via global structure-from-motion. In Proceedings of the 2017 IEEE International Conference on 3D Vision (3DV), Qingdao, China, 10–12 October 2017; pp. 155–164.
34. Horn, B.K. Closed-form solution of absolute orientation using unit quaternions. JOSA A 1987, 4, 629–642.
35. Chatterjee, A.; Madhav Govindu, V. Efficient and robust large-scale rotation averaging. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 521–528.
Sequence | Location | Images | Resolution | Size (MB) | Ours: Time (s) | Ours: Average (MB/s) | DJITerra (s) | Pix4DMapper (s) |
---|---|---|---|---|---|---|---|---|
mavic2-road | Xi’an, Shaanxi | 240 | 5472 × 3078 | 1895.4 | 119.7 | 15.8 | 583.0 | 4867.0 |
mavic-campus | Xi’an, Shaanxi | 293 | 4000 × 3000 | 1479.4 | 64.4 | 22.9 | 568.0 | 4707.0 |
mavic-factory | Xi’an, Shaanxi | 359 | 4000 × 3000 | 1924.6 | 148.7 | 12.9 | 666.0 | 4811.0 |
mavic-fengniao | Xi’an, Shaanxi | 216 | 4000 × 3000 | 1102.5 | 87.6 | 12.6 | 434.0 | 3303.0 |
mavic-garden | Suzhou, Jiangsu | 247 | 4000 × 3000 | 1241.0 | 89.1 | 13.8 | 550.0 | 4753.0 |
mavic-hongkong | Hong Kong | 288 | 4000 × 3000 | 1439.5 | 130.2 | 11.1 | 575.0 | 4723.0 |
mavic-huangqi | Hengyang, Hunan | 229 | 4000 × 3000 | 1156.5 | 98.2 | 11.8 | 454.0 | 4351.0 |
mavic-library | Xi’an, Shaanxi | 205 | 4000 × 3000 | 997.3 | 69.2 | 14.4 | 365.0 | 3786.0 |
mavic-npu | Xi’an, Shaanxi | 119 | 4000 × 3000 | 603.2 | 34.1 | 17.7 | 194.0 | 1834.0 |
mavic-river | Xi’an, Shaanxi | 166 | 4000 × 3000 | 960.4 | 64.7 | 14.8 | 408.0 | 3746.0 |
mavic-warriors | Xi’an, Shaanxi | 96 | 4000 × 3000 | 779.7 | 26.2 | 29.7 | 182.0 | 1623.0 |
mavic-yangxian | Xi’an, Shaanxi | 165 | 4000 × 3000 | 840.2 | 75.4 | 11.1 | 392.0 | 2935.0 |
p4r-field | Xi’an, Shaanxi | 683 | 5472 × 3648 | 5939.0 | 381.1 | 15.6 | 1837.0 | 23,941.0 |
p4r-roads | Xi’an, Shaanxi | 138 | 5472 × 3648 | 1058.6 | 84.2 | 12.6 | 501.0 | 4135.0 |
p4r-roads2 | Xi’an, Shaanxi | 203 | 5472 × 3648 | 1556.1 | 125.6 | 12.4 | 556.0 | 6332.0 |
p4r-village | Xi’an, Shaanxi | 136 | 5472 × 3648 | 1038.4 | 85.4 | 12.2 | 353.0 | 2762.0 |
phantom3-olathe | Olathe, USA | 160 | 4000 × 3000 | 898.7 | 74.6 | 12.0 | 312.0 | 2514.0 |
phantom3-strawberry | Xi’an, Shaanxi | 184 | 4000 × 3000 | 990.5 | 86.3 | 11.5 | 473.0 | 3591.0 |
phantom4-mountain | Shenzhen, Guangdong | 81 | 4864 × 3648 | 627.7 | 51.9 | 12.1 | 252.0 | 1752.0 |
xag-xinjiang | Yuli, Xinjiang | 303 | 4864 × 3648 | 2179.4 | 163.7 | 13.3 | - | 4362.0 |
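The relation between the columns of the table above can be checked with simple arithmetic: the average throughput is the dataset size divided by our processing time, and a speedup is the ratio of a competing tool's time to ours. The sketch below copies three rows from the table; the function name and the dictionary keys are hypothetical, and the derived speedups are computed here rather than reported in this work.

```python
# (sequence, size in MB, our time in s, DJITerra time in s, Pix4DMapper time in s)
ROWS = [
    ("mavic2-road", 1895.4, 119.7, 583.0, 4867.0),
    ("mavic-warriors", 779.7, 26.2, 182.0, 1623.0),
    ("p4r-field", 5939.0, 381.1, 1837.0, 23941.0),
]


def summarize(row):
    """Derive throughput and speedup ratios from one table row."""
    name, size_mb, ours_s, djiterra_s, pix4d_s = row
    return {
        "sequence": name,
        "throughput_mb_s": size_mb / ours_s,
        "speedup_vs_djiterra": djiterra_s / ours_s,
        "speedup_vs_pix4d": pix4d_s / ours_s,
    }


if __name__ == "__main__":
    for row in ROWS:
        s = summarize(row)
        print(
            f"{s['sequence']}: {s['throughput_mb_s']:.1f} MB/s, "
            f"{s['speedup_vs_djiterra']:.1f}x vs DJITerra, "
            f"{s['speedup_vs_pix4d']:.1f}x vs Pix4DMapper"
        )
```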
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Zhao, Y.; Cheng, Y.; Zhang, X.; Xu, S.; Bu, S.; Jiang, H.; Han, P.; Li, K.; Wan, G. Real-Time Orthophoto Mosaicing on Mobile Devices for Sequential Aerial Images with Low Overlap. Remote Sens. 2020, 12, 3739. https://doi.org/10.3390/rs12223739