Dual-Branch Feature Generalization Method for AUV Near-Field Exploration of Hydrothermal Areas
Abstract
1. Introduction
- Dual-branch generalization model: combines the visual realism of NeRF with the computational efficiency of 3D Gaussian splatting, enabling both offline generation of high-fidelity descriptors and real-time generalization for SLAM tasks;
- Shared descriptor space for multi-modal fusion: aligns NeRF’s optical features with the geometric features of 3D Gaussian splatting in a unified descriptor space;
- Generalized feature confidence mechanism: dynamically adjusts the reliability of generalized descriptors to balance robustness and accuracy in feature matching, addressing challenges in environments with varying lighting and viewpoints.
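
The confidence mechanism in the last bullet can be pictured as a weight that decides how much a generalized descriptor contributes to a match. The sketch below is only an illustration under assumptions that are not taken from the paper: the confidence is a scalar in [0, 1], descriptors are compared with cosine distance, and the two distances are blended linearly. The names `cosine_distance` and `blended_distance` are hypothetical.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; zero-norm inputs count as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def blended_distance(
    query: Sequence[float],
    original: Sequence[float],
    generalized: Sequence[float],
    confidence: float,
) -> float:
    """Blend the distances to the original and generalized descriptors.

    ASSUMPTION: a linear blend weighted by a scalar confidence. This is an
    illustrative choice, not the formulation used by the authors.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    d_original = cosine_distance(query, original)
    d_generalized = cosine_distance(query, generalized)
    return (1.0 - confidence) * d_original + confidence * d_generalized
```

With a confidence of 0 the match relies only on the original descriptor, and with a confidence of 1 only on the generalized one; intermediate values trade one against the other.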
2. Related Works
- Fusion Module: Integration of the Feature Pyramid Network (FPN) into the UnsuperPoint framework to improve multi-sensor data fusion, enabling the extraction of more comprehensive and reliable features.
- Depth Module: A specialized module designed to ensure a uniform distribution of interest points in depth, significantly enhancing localization accuracy by maintaining balanced spatial coverage of detected features.
- Unsupervised Training Strategy: An unsupervised training approach comprising an auto-encoder framework for encoding sonar data, a ground-truth depth generation framework to support depth module training, and a mutually supervised framework that trains the fusion and depth modules without relying on extensive labeled datasets.
- Non-Rigid Feature Filter: Development of a camera data encoder equipped with a non-rigid feature filter to exclude features from non-rigid structures, such as smoke emitted from hydrothermal vents, thus mitigating environmental noise and interference.
3. Method
3.1. Construction of Optical–Acoustic Fused Feature Descriptors
3.2. Feature Generalization Method Based on NeRF
3.3. Dual-Branch Generalization Model
3.3.1. 3D Gaussian Splatting Model
3.3.2. Shared Descriptors
3.3.3. Feature Confidence
3.3.4. Training
4. Experiments and Results
4.1. Data Acquisition Experiment
4.1.1. Experimental Setup
4.1.2. Data Acquisition System
4.1.3. Data Acquisition Process
4.1.4. Negative Label Generation with OAF-IPD
4.1.5. Location
4.2. Metrics
4.3. Results
4.3.1. Comparison of Descriptor Generalization Effects
4.3.2. Relative Pose Estimation
4.3.3. Generalization Method Effectiveness in SLAM
4.3.4. Process Speed
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

Dimension | Sign | Definition | Data Type |
---|---|---|---|
1–256 | | Original feature | float |
257–512 | | Generalized feature | float |
513 | | Feature type | int |
514 | | Matching type | int |
515 | | Matching score | float |
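
The descriptor layout in this table can be sketched as a flat 515-element vector. The example below is a minimal illustration: the dimension ranges come from the table, while the class name `SharedDescriptor`, its field names, and the choice to store the two integer fields as floats inside the flat vector are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

FEATURE_DIM = 256                   # dimensions 1-256 and 257-512 in the table
VECTOR_DIM = 2 * FEATURE_DIM + 3    # 515 entries in total


@dataclass
class SharedDescriptor:
    """Hypothetical container mirroring the table rows."""
    original: List[float]      # dimensions 1-256
    generalized: List[float]   # dimensions 257-512
    feature_type: int          # dimension 513
    matching_type: int         # dimension 514
    matching_score: float      # dimension 515


def pack(desc: SharedDescriptor) -> List[float]:
    """Flatten a descriptor into a 515-element list."""
    if len(desc.original) != FEATURE_DIM or len(desc.generalized) != FEATURE_DIM:
        raise ValueError("each feature block must have 256 entries")
    return (
        list(desc.original)
        + list(desc.generalized)
        + [float(desc.feature_type), float(desc.matching_type), desc.matching_score]
    )


def unpack(vector: List[float]) -> SharedDescriptor:
    """Rebuild a descriptor; table indices are 1-based, Python slices 0-based."""
    if len(vector) != VECTOR_DIM:
        raise ValueError("expected a 515-element vector")
    return SharedDescriptor(
        original=list(vector[0:256]),
        generalized=list(vector[256:512]),
        feature_type=int(vector[512]),
        matching_type=int(vector[513]),
        matching_score=float(vector[514]),
    )
```

Keeping the original and generalized blocks side by side lets a matcher fall back on the original feature when the generalized one is not trusted.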

Methods | Bright A | Bright P | Bright R | Bright F1 | Dark A | Dark P | Dark R | Dark F1 |
---|---|---|---|---|---|---|---|---|
None | 40.7 | 81.4 | 44.9 | 57.9 | 30.2 | 75.4 | 43.0 | 54.6 |
NeRF | 87.7 | 76.4 | 98.6 | 86.1 | 80.0 | 63.4 | 81.8 | 71.5 |
3D-GS | 79.4 | 74.5 | 82.6 | 78.3 | 79.8 | 66.0 | 80.8 | 72.7 |
Dual-branch | 89.0 | 79.2 | 98.4 | 87.8 | 85.4 | 68.4 | 85.8 | 76.1 |
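
The P, R, and F1 columns are not expanded in the table. Assuming they denote precision, recall, and the F1 score, F1 is the harmonic mean of the other two, which gives a quick consistency check on any row. The helper name `f1_score` below is chosen for this sketch only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Bright environment, no generalization: P = 81.4, R = 44.9
print(round(f1_score(81.4, 44.9), 1))  # 57.9, matching the F1 column
```

Most rows reproduce the listed F1 to within 0.1; the bright-environment rows match to the printed precision.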

Generalization | AUC-RANSAC 5° | AUC-RANSAC 10° | AUC-RANSAC 20° | AUC-LO-RANSAC 5° | AUC-LO-RANSAC 10° | AUC-LO-RANSAC 20° |
---|---|---|---|---|---|---|
None | 37.2 | 54.3 | 59.4 | 49.7 | 61.3 | 63.4 |
NeRF | 49.7 | 63.2 | 72.9 | 68.1 | 76.4 | 85.3 |
3D-GS | 46.5 | 59.4 | 74.2 | 60.4 | 70.4 | 85.1 |
Dual-branch | 49.2 | 65.8 | 75.1 | 67.3 | 79.8 | 88.5 |
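
The AUC values at 5°, 10°, and 20° are not defined in the surviving text. A common convention in relative pose evaluation, assumed here and not confirmed by the source, is the area under the cumulative curve of pose error, normalized by the threshold. The sketch below implements that convention in plain Python; the function name `pose_auc` and the input format (one angular error in degrees per image pair) are assumptions.

```python
from bisect import bisect_left
from typing import List, Sequence


def pose_auc(errors_deg: Sequence[float], thresholds_deg: Sequence[float]) -> List[float]:
    """Normalized area under the recall-vs-error curve, one value per threshold.

    ASSUMPTION: this is the widely used pose-error AUC; the paper's exact
    metric definition is not available in this text.
    """
    if not errors_deg:
        raise ValueError("at least one pose error is required")
    errs = [0.0] + sorted(float(e) for e in errors_deg)
    n = len(errs) - 1
    recall = [i / n for i in range(n + 1)]  # fraction of pairs with error <= errs[i]

    aucs = []
    for t in thresholds_deg:
        last = bisect_left(errs, t)              # curve points strictly below t
        xs = errs[:last] + [float(t)]
        ys = recall[:last] + [recall[last - 1]]  # hold recall flat up to t
        area = sum(
            (xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2.0  # trapezoid rule
            for i in range(len(xs) - 1)
        )
        aucs.append(area / t)
    return aucs
```

Under this convention a value of 1.0 means every pair was estimated with zero error, and 0.0 means no pair fell below the threshold; multiplying by 100 gives percentages comparable in form to the table.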

System | ORB | OAF | OAF | OAF | OAF |
---|---|---|---|---|---|
Method | - | - | NeRF | 3D-GS | Dual-Branch |
Descriptor size | 128 | 256 | 517 | 517 | 517 |
Time cost (ms/): Detector | 16.07 | 21.77 | 21.77 | 21.77 | 21.77 |
Time cost (ms/): Generalization | - | - | 2290.31 | 4.78 | 75.23 |
Time cost (ms/): Tracking | 8.78 | 20.86 | 70.74 | 73.67 | 71.79 |
Time cost (ms/): Mapping | 230.54 | 280.41 | 259.63 | 284.12 | 280.17 |
Time cost (ms/): Total | 255.39 | 323.04 | 2642.45 | 384.34 | 448.96 |
Additional time (%) | 0 | 26.49 | 934.67 | 50.49 | 75.59 |
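
The totals and the additional-time row can be recomputed from the per-stage timings. The sketch below assumes that the detector time of 21.77 applies to every OAF configuration and that additional time is measured relative to the ORB total; both readings are inferred from the table layout. The configuration labels and function names are chosen for this example.

```python
# Per-stage times as listed: detector, generalization, tracking, mapping.
# A "-" in the table is treated as 0.0 here (assumption).
STAGE_TIMES_MS = {
    "ORB": [16.07, 0.0, 8.78, 230.54],
    "OAF": [21.77, 0.0, 20.86, 280.41],
    "OAF + NeRF": [21.77, 2290.31, 70.74, 259.63],
    "OAF + 3D-GS": [21.77, 4.78, 73.67, 284.12],
    "OAF + Dual-Branch": [21.77, 75.23, 71.79, 280.17],
}


def total_ms(config: str) -> float:
    """Sum of all pipeline stages for one configuration."""
    return sum(STAGE_TIMES_MS[config])


def additional_time_pct(config: str, baseline: str = "ORB") -> float:
    """Relative overhead of a configuration over the baseline, in percent."""
    base = total_ms(baseline)
    return 100.0 * (total_ms(config) - base) / base


if __name__ == "__main__":
    for name in STAGE_TIMES_MS:
        print(f"{name}: total {total_ms(name):.2f}, extra {additional_time_pct(name):.2f}%")
```

The recomputed totals agree with the Total row. The overheads for OAF, NeRF, and 3D-GS also agree with the listed values, while the dual-branch overhead comes out near 75.8%, slightly above the 75.59 listed, which may reflect a rounding or transcription difference in the source.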
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, Y.; Chen, G.; Xu, Y.; Wan, L.; Zhang, Z. Dual-Branch Feature Generalization Method for AUV Near-Field Exploration of Hydrothermal Areas. J. Mar. Sci. Eng. 2024, 12, 2359. https://doi.org/10.3390/jmse12122359