LDFE-SLAM: Light-Aware Deep Front-End for Robust Visual SLAM Under Challenging Illumination
Abstract
1. Introduction
- We propose LDFE (Light-Aware Deep Front-End), a novel architecture that treats image enhancement as geometric structure restoration rather than visibility improvement, fundamentally changing how low-light SLAM pipelines should be constructed.
- We demonstrate that enhancement, feature extraction, and matching must be co-designed rather than independently replaced. Our experiments show that naive combinations of state-of-the-art components can actually degrade performance, while our synergistic design achieves significant improvements.
- We present a comprehensive analysis of how low-light enhancement affects deep feature distributions, providing insights into the coupling relationships between enhancement methods and learned feature descriptors.
- We develop a complete visual SLAM system that achieves state-of-the-art performance on multiple challenging datasets, including EuRoC, TUM-VI, and 4Seasons, under various lighting conditions.
2. Related Work
2.1. Visual SLAM Under Challenging Illumination
2.2. Low-Light Image Enhancement
2.3. Deep Feature Detection and Matching
2.4. Point-Line Feature Fusion
2.5. Comparative Analysis of Illumination-Robust SLAM Systems
3. Proposed Method
3.1. System Overview
- Illumination Assessment: An illumination scoring module analyzes the input image to determine the degree of enhancement required.
- Adaptive Enhancement: Based on the illumination score, EnlightenGAN adaptively enhances the image to restore geometric structures.
- Feature Extraction: SuperPoint extracts keypoints and descriptors from the enhanced image, while EDlines detects line segments.
- Feature Matching: LightGlue establishes point correspondences between frames, complemented by line-segment matching using geometric constraints.
- Pose Estimation: The matched features are used for camera pose estimation through PnP-RANSAC with joint point–line optimization.
- Backend Optimization: Local bundle adjustment and loop closure detection refine the trajectory and map. A high-level sketch of this pipeline follows the list.
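To make the data flow concrete, the sketch below strings these stages together in Python. It is a minimal illustration, not the authors' implementation: the scorer, enhancer, SuperPoint, and LightGlue objects are hypothetical wrappers, OpenCV's FastLineDetector stands in for EDlines, and the previous keyframe is assumed to carry triangulated 3D points for PnP.

```python
import cv2
import numpy as np

def process_frame(frame_gray, scorer, enhancer, superpoint, lightglue, prev, K):
    """One front-end iteration of the pipeline listed above.

    scorer, enhancer, superpoint, and lightglue are hypothetical wrapper objects;
    prev holds the previous keyframe's descriptors and triangulated 3D points."""
    # 1. Illumination assessment: scalar score of the input frame
    score = scorer.score(frame_gray)

    # 2. Adaptive enhancement (skipped when lighting is adequate)
    image = frame_gray if score >= scorer.normal_threshold else enhancer.enhance(frame_gray, score)

    # 3. Feature extraction: SuperPoint keypoints/descriptors + line segments
    kpts, descs = superpoint.detect_and_describe(image)
    lines = cv2.ximgproc.createFastLineDetector().detect(image)  # stand-in for EDlines

    # 4. Point matching against the previous keyframe with LightGlue
    matches = lightglue.match(descs, prev["descs"])              # (K, 2) index pairs

    # 5. Pose estimation: PnP-RANSAC on the matched 3D-2D correspondences
    pts3d = np.asarray(prev["points3d"][matches[:, 1]], np.float32)
    pts2d = np.asarray(kpts[matches[:, 0]], np.float32)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, None)

    # Backend (local BA, loop closure) runs separately on accepted keyframes.
    return {"kpts": kpts, "descs": descs, "lines": lines,
            "pose": (rvec, tvec) if ok else None}
```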
3.2. Illumination-Adaptive Enhancement
3.2.1. Illumination Scoring Module
- Normal mode: no enhancement is applied and features are extracted directly from the input image;
- Light enhancement: a single reduced-intensity EnlightenGAN pass;
- Full enhancement: full EnlightenGAN processing for severe low-light conditions. The mode-selection logic is sketched below.
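A minimal sketch of this three-way dispatch is given below; the score definition (a luminance/contrast combination) and both threshold values are illustrative assumptions, since the calibrated values are not reproduced here.

```python
import numpy as np

# Illustrative thresholds (assumed values, not the paper's calibrated ones)
TAU_NORMAL = 0.45   # at or above this score: no enhancement
TAU_LIGHT = 0.20    # between TAU_LIGHT and TAU_NORMAL: light enhancement

def illumination_score(gray: np.ndarray) -> float:
    """Simple luminance/contrast score; higher means better lit."""
    g = gray.astype(np.float32) / 255.0
    return float(0.7 * g.mean() + 0.3 * g.std())

def select_mode(gray: np.ndarray) -> str:
    """Map the score to one of the three enhancement modes listed above."""
    s = illumination_score(gray)
    if s >= TAU_NORMAL:
        return "normal"   # no enhancement, direct feature extraction
    if s >= TAU_LIGHT:
        return "light"    # single reduced-intensity EnlightenGAN pass
    return "full"         # full EnlightenGAN processing
```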
3.2.2. EnlightenGAN for Geometric Restoration
3.3. Deep Feature Extraction
3.3.1. SuperPoint for Illumination-Invariant Keypoints
3.3.2. EDlines for Robust Line Detection
3.4. Attention-Based Feature Matching
3.4.1. LightGlue Matching
- Number of layers: 9 (reduced from 12 for efficiency). We systematically evaluated the precision–speed trade-off of reducing LightGlue from 12 self-attention layers to 9 through ablation experiments on EuRoC sequences. The 9-layer configuration retains 98.3% of the matching accuracy of the 12-layer baseline (inlier ratio of 0.71 vs. 0.72) while reducing inference latency by 28% (from 21 ms to 15 ms per frame on an RTX 4060 Ti). This reduction is critical for maintaining near-real-time performance in the complete pipeline (target: 22 fps). The accuracy loss is minimal because the first 9 layers capture the most discriminative attention patterns, while the final 3 layers primarily refine already-confident matches. For SLAM, which needs robust correspondences rather than pixel-perfect accuracy, this trade-off is highly favorable. We verified that the matching failure rate (matches with reprojection error > 5 pixels) rises by only 1.2 percentage points (from 3.1% to 4.3%) under severe degradation, a negligible cost for the latency gain. The resulting matcher configuration is sketched after this list;
- Flash attention: enabled for memory efficiency;
- Depth confidence threshold: 0.95 (increased for higher precision);
- Width confidence threshold: 0.99.
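For reference, this configuration maps onto the public cvg/LightGlue package roughly as follows. The parameter names (n_layers, flash, depth_confidence, width_confidence) are the library's own; the file names and keypoint budget are placeholder assumptions.

```python
import torch
from lightglue import LightGlue, SuperPoint
from lightglue.utils import load_image, rbd

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# SuperPoint front-end; the keypoint budget is an illustrative choice
extractor = SuperPoint(max_num_keypoints=2048).eval().to(device)

# LightGlue matcher with the settings listed above
matcher = LightGlue(
    features="superpoint",
    n_layers=9,              # reduced transformer depth, as discussed above
    flash=True,              # flash attention for memory efficiency
    depth_confidence=0.95,   # early-exit (depth pruning) confidence
    width_confidence=0.99,   # point-pruning (width) confidence
).eval().to(device)

# Placeholder frame names; any consecutive pair of frames works
img0 = load_image("frame_t0.png").to(device)
img1 = load_image("frame_t1.png").to(device)

feats0 = extractor.extract(img0)
feats1 = extractor.extract(img1)
out = matcher({"image0": feats0, "image1": feats1})
feats0, feats1, out = [rbd(x) for x in (feats0, feats1, out)]
matches = out["matches"]     # (K, 2) indices into the two keypoint sets
```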
3.4.2. Line-Segment Matching
3.5. Point–Line Fusion for Pose Estimation
3.5.1. Joint Optimization Formulation
3.5.2. Adaptive Feature Weighting
3.6. Backend Optimization
3.6.1. Local Bundle Adjustment
3.6.2. Loop Closure Detection
- Feature matching using LightGlue between the current frame and candidate;
- Geometric verification using the 5-point algorithm with RANSAC (a sketch is given after this list);
- Pose graph optimization to distribute the accumulated error.
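A minimal version of the verification step, using OpenCV's five-point essential-matrix estimation inside RANSAC, is shown below; the inlier-ratio acceptance threshold is an assumed value for illustration.

```python
import cv2
import numpy as np

def verify_loop_candidate(pts_cur, pts_cand, K, min_inlier_ratio=0.5):
    """Geometric verification of a loop-closure candidate.

    pts_cur, pts_cand: (N, 2) matched pixel coordinates from LightGlue.
    K: 3x3 camera intrinsic matrix.
    min_inlier_ratio: assumed acceptance threshold (illustrative).
    Returns (accepted, R, t), with t known only up to scale."""
    pts_cur = np.asarray(pts_cur, dtype=np.float64)
    pts_cand = np.asarray(pts_cand, dtype=np.float64)
    if len(pts_cur) < 8:                     # too few matches to verify reliably
        return False, None, None

    # Five-point essential-matrix estimation inside a RANSAC loop
    E, mask = cv2.findEssentialMat(pts_cur, pts_cand, K,
                                   method=cv2.RANSAC, prob=0.999, threshold=1.0)
    if E is None:
        return False, None, None

    inlier_ratio = float(mask.sum()) / len(mask)
    if inlier_ratio < min_inlier_ratio:
        return False, None, None

    # Decompose E into the relative pose used to constrain the pose graph
    _, R, t, _ = cv2.recoverPose(E, pts_cur, pts_cand, K, mask=mask)
    return True, R, t
```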
4. Experiments
4.1. Experimental Setup
4.1.1. Datasets
Ground-Truth Trajectories
Physical Validation of Synthetic Degradation
4.1.2. Evaluation Metrics
4.1.3. Implementation Details
4.2. Comparison with State-of-the-Art Methods
- M1 (ORB-SLAM3): Classical ORB-SLAM3 using hand-crafted ORB features, representing the traditional SLAM baseline;
- M2 (ORB3 + SuperPoint): ORB-SLAM3 backend with SuperPoint replacing ORB features, evaluating the benefit of learned features alone;
- M3 (ORB3 + EnlightenGAN): ORB-SLAM3 with EnlightenGAN preprocessing, evaluating enhancement with traditional features;
- M4 (SP + LightGlue): SuperPoint with LightGlue matching without enhancement, representing state-of-the-art deep feature methods;
- M5 (Ours): Complete LDFE-SLAM integrating EnlightenGAN enhancement with SuperPoint and LightGlue.
4.2.1. EuRoC Dataset Results
4.2.2. Feature Detection Analysis
- Under the original (100%) lighting condition, all methods extract >1000 keypoints per frame, ensuring reliable tracking.
- Under severe degradation (25% brightness), M1 drops to ∼200 keypoints, near the failure threshold, while M5 maintains ∼1200 keypoints.
- M5’s advantage stems from the synergy between EnlightenGAN’s restoration of gradient structures and SuperPoint’s learned robustness to residual noise.
- Critically, M3 (enhancement + ORB) fails to maintain high keypoint counts despite enhancement, confirming that the benefit of enhancement depends on the feature extractor.
4.2.3. Minimum Illumination Analysis
4.3. Ablation Studies
4.3.1. Component Contribution Analysis
4.3.2. Enhancement Method Comparison
4.3.3. Feature Distribution Analysis
- Raw low-light images: Features cluster in bright regions, leaving 60% of the image area without coverage.
- Traditional enhancement: Features spread more widely but concentrate around enhancement artifacts.
- LDFE-SLAM: Features are distributed nearly uniformly, reaching 85% spatial coverage (the coverage metric is sketched below).
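The spatial-coverage figures quoted above can be computed with a simple grid-occupancy measure; the 16 × 16 grid resolution below is an assumption for illustration.

```python
import numpy as np

def spatial_coverage(keypoints, image_shape, grid=(16, 16)):
    """Fraction of grid cells containing at least one keypoint.

    keypoints: (N, 2) array of (x, y) pixel coordinates.
    image_shape: (height, width) of the frame.
    grid: cells per axis; 16 x 16 is an assumed resolution."""
    h, w = image_shape
    occupied = np.zeros(grid, dtype=bool)
    for x, y in np.asarray(keypoints, dtype=np.float64):
        gy = min(int(y / h * grid[0]), grid[0] - 1)
        gx = min(int(x / w * grid[1]), grid[1] - 1)
        occupied[gy, gx] = True
    return float(occupied.sum()) / occupied.size
```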
4.3.4. Parameter Sensitivity
5. Discussion
5.1. Key Insights
5.2. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| SLAM | Simultaneous Localization and Mapping |
| LDFE | Light-Aware Deep Front-End |
| ATE | Absolute Trajectory Error |
| RPE | Relative Pose Error |
| GAN | Generative Adversarial Network |
| CNN | Convolutional Neural Network |
| BA | Bundle Adjustment |
| ORB | Oriented FAST and Rotated BRIEF |
| GPU | Graphics Processing Unit |
| CPU | Central Processing Unit |
References
- Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras. IEEE Trans. Robot. 2017, 33, 1255–1262.
- Campos, C.; Elvira, R.; Rodríguez, J.J.G.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual–Inertial, and Multimap SLAM. IEEE Trans. Robot. 2021, 37, 1874–1890.
- Zuñiga-Noël, D.; Jaenal, A.; Gomez-Ojeda, R.; Gonzalez-Jimenez, J. The UMA-VI Dataset: Visual–Inertial Odometry in Low-Textured and Dynamic Illumination Environments. Int. J. Robot. Res. 2020, 39, 1052–1060.
- Schmidt, F.; Daubermann, J.; Mitschke, M.; Blessing, C.; Meyer, S.; Enzweiler, M.; Valada, A. ROVER: A Multi-Season Dataset for Visual SLAM. IEEE Trans. Robot. 2025, published online.
- Canh, T.N.; Quoc, B.N.; Zhang, H.; Veeraiah, B.R.; HoangVan, X.; Chong, N.Y. IRAF-SLAM: An Illumination-Robust and Adaptive Feature-Culling Front-End for Visual SLAM in Challenging Environments. In Proceedings of the 2025 European Conference on Mobile Robots (ECMR), Lincoln, UK, 1–7 September 2025; pp. 1–7.
- Savinykh, A.; Kurenkov, M.; Kruzhkov, E.; Yudin, E.; Potapov, A.; Karpyshev, P.; Tsetserukou, D. DarkSLAM: GAN-Assisted Visual SLAM for Reliable Operation in Low-Light Conditions. In Proceedings of the 2022 IEEE 95th Vehicular Technology Conference (VTC2022-Spring), Helsinki, Finland, 19–22 June 2022; pp. 1–6.
- Chen, P.H.; Luo, Z.X.; Huang, Z.K.; Yang, C.; Chen, K.W. IF-Net: An Illumination-Invariant Feature Network. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 8630–8636.
- Hu, J.; Guo, X.; Chen, J.; Liang, G.; Deng, F.; Lam, T.L. A Two-Stage Unsupervised Approach for Low Light Image Enhancement. IEEE Robot. Autom. Lett. 2021, 6, 8363–8370.
- Jiang, Y.; Gong, X.; Liu, D.; Cheng, Y.; Fang, C.; Shen, X.; Yang, J.; Zhou, P.; Wang, Z. EnlightenGAN: Deep Light Enhancement Without Paired Supervision. IEEE Trans. Image Process. 2021, 30, 2340–2349.
- Singh, S.P.; Mazotti, B.; Rajani, D.M.; Mayilvahanan, S.; Li, G.; Ghaffari, M. Twilight SLAM: Navigating Low-Light Environments. arXiv 2023, arXiv:2304.11310.
- DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperPoint: Self-Supervised Interest Point Detection and Description. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 224–236.
- Zhao, Z.; Wu, C.; Kong, X.; Li, Q.; Guo, Z.; Lv, Z. Light-SLAM: A Robust Deep-Learning Visual SLAM System Based on LightGlue Under Challenging Lighting Conditions. IEEE Trans. Intell. Transp. Syst. 2025, 26, 9918–9931.
- Luo, H.; Liu, Y.; Guo, C.; Li, Z.; Song, W. SuperVINS: A Real-Time Visual-Inertial SLAM Framework for Challenging Imaging Conditions. IEEE Sens. J. 2025, 25, 26042–26050.
- Lindenberger, P.; Sarlin, P.E.; Pollefeys, M. LightGlue: Local Feature Matching at Light Speed. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023; pp. 17581–17592.
- Gomez-Ojeda, R.; Moreno, F.A.; Zuniga-Noel, D.; Scaramuzza, D.; Gonzalez-Jimenez, J. PL-SLAM: A Stereo SLAM System Through the Combination of Points and Line Segments. IEEE Trans. Robot. 2019, 35, 734–746.
- Zhang, Y.; Zhu, P.; Ren, W. PL-CVIO: Point-Line Cooperative Visual-Inertial Odometry. In Proceedings of the 2023 IEEE Conference on Control Technology and Applications (CCTA), Bridgetown, Barbados, 16–18 August 2023; pp. 859–865.
- Chen, J.; Hou, P.; Cao, Z.; Xu, T.; Zhao, J. An Improved Monocular SLAM System by Combining Geometric Features with Semantic Probability. IEEE Sens. J. 2025, 25, 27111–27125.
- Xu, K.; Hao, Y.; Yuan, S.; Wang, C.; Xie, L. AirSLAM: An Efficient and Illumination-Robust Point-Line Visual SLAM System. IEEE Trans. Robot. 2025, 41, 1673–1692.
- Liu, H.; Zhong, H.; Si, W. FTI-SLAM: Federated Learning-Enhanced Thermal-Inertial SLAM. Robot Learn. 2024, 1, 1.
- Lipson, L.; Deng, J. Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 19626–19635.
- Xin, Z.; Wang, Z.; Yu, Z.; Zheng, B. ULL-SLAM: Underwater Low-Light Enhancement for the Front-End of Visual SLAM. Front. Mar. Sci. 2023, 10, 1133881.
- Land, E.H. The Retinex Theory of Color Vision. Sci. Am. 1977, 237, 108–128.
- Tian, X.; Xianyu, X.; Li, Z.; Xu, T.; Jia, Y. Infrared and Visible Image Fusion Based on Multi-Level Detail Enhancement and Generative Adversarial Network. Intell. Robot. 2024, 4, 524–543.
- Sarlin, P.E.; DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperGlue: Learning Feature Matching with Graph Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4938–4947.
- Jiang, H.; Karpur, A.; Cao, B.; Huang, Q.; Araujo, A. OmniGlue: Generalizable Feature Matching with Foundation Model Guidance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 16865–16875.
- Wang, Y.; He, X.; Peng, S.; Tan, D.; Zhou, X. Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-like Speed. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 21666–21675.
- Verma, H.; Siruvuri, S.D.V.S.S.V.; Budarapu, P.R. A Machine Learning-Based Image Classification of Silicon Solar Cells. Int. J. Hydromechatronics 2024, 7, 49–66.
- Yuan, F.; Huang, X.; Wang, L.; Ding, J.; Tian, Z.; Wang, Y.; Gu, S.; Funabora, Y.; Peng, Y.; Mao, Z. Towards General Embodied Intelligence: Integrating Large Language Models, Knowledge Bases, and Reasoning Capabilities to Build the Next Generation of AI Agents. Int. J. Hydromechatronics 2025, in press.
- Peng, Y.; Yang, X.; Li, D.; Ma, Z.; Liu, Z.; Bai, X.; Mao, Z. Predicting Flow Status of a Flexible Rectifier Using Cognitive Computing. Expert Syst. Appl. 2025, 264, 125878.
- Mao, Z.; Suzuki, S.; Nabae, H.; Miyagawa, S.; Suzumori, K.; Maeda, S. Machine Learning-Enhanced Soft Robotic System Inspired by Rectal Functions to Investigate Fecal Incontinence. Bio-Des. Manuf. 2025, 8, 482–494.
- Ma, S.; Zhang, M.; Sun, W.; Gao, Y.; Jing, M.; Gao, L.; Wu, Z. Artificial Intelligence and Medical-Engineering Integration in Diabetes Management: Advances, Opportunities, and Challenges. Healthc. Rehabil. 2025, 1, 100006.
- Kannapiran, S.; Bendapudi, N.; Yu, M.Y.; Parikh, D.; Berman, S.; Vora, A.; Pandey, G. Stereo Visual Odometry with Deep Learning-Based Point and Line Feature Matching Using an Attention Graph Neural Network. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 3491–3498.
- He, Y.; Zhao, J.; Guo, Y.; He, W.; Yuan, K. PL-VIO: Tightly-Coupled Monocular Visual–Inertial Odometry Using Point and Line Features. Sensors 2018, 18, 1159.
- Shu, F.; Wang, J.; Pagani, A.; Stricker, D. Structure PLP-SLAM: Efficient Sparse Mapping and Localization Using Point, Line and Plane for Monocular, RGB-D and Stereo Cameras. arXiv 2022, arXiv:2207.06058.
- Akinlar, C.; Topal, C. EDLines: A Real-Time Line Segment Detector with a False Detection Control. Pattern Recognit. Lett. 2011, 32, 1633–1642.
- Zhang, L.; Koch, R. An Efficient and Robust Line Segment Matching Approach Based on LBD Descriptor and Pairwise Geometric Consistency. J. Vis. Commun. Image Represent. 2013, 24, 794–805.
- Burri, M.; Nikolic, J.; Gohl, P.; Schneider, T.; Rehder, J.; Omari, S.; Achtelik, M.W.; Siegwart, R. The EuRoC Micro Aerial Vehicle Datasets. Int. J. Robot. Res. 2016, 35, 1157–1163.
- Wenzel, P.; Wang, R.; Yang, N.; Cheng, Q.; Khan, Q.; von Stumberg, L.; Zeller, N.; Cremers, D. 4Seasons: A Cross-Season Dataset for Multi-Weather SLAM in Autonomous Driving. In Proceedings of the DAGM German Conference on Pattern Recognition (GCPR), Bonn, Germany, 28 September–1 October 2020; pp. 404–417.
| Method | Learned Front-End | Features | Enhancement | Backend | GPU | Real-Time |
|---|---|---|---|---|---|---|
| HF-Net-based SLAM | Yes | Point | No | Geometric BA | Yes | Yes |
| Light-SLAM | Yes | Point | Yes | Learned + BA | Yes | Near |
| SuperVINS | Yes | Point | Yes | VINS | Yes | Yes |
| AirSLAM | Yes | Point + Line | No | Geometric BA | Optional | Yes |
| Twilight-SLAM | Yes | Point | Yes | Geometric BA | Yes | Yes |
| LDFE-SLAM (Ours) | Yes | Point + Line | Light-aware | Geometric BA | Optional | Yes |
| Enhancement Method | ATE RMSE (m) | Keypoints/Frame | Inlier Ratio |
|---|---|---|---|
| None (raw low-light) | 0.198 | 245 | 0.42 |
| Histogram Equalization | 0.175 | 512 | 0.38 |
| CLAHE | 0.162 | 489 | 0.45 |
| Retinex-Net | 0.128 | 678 | 0.52 |
| Zero-DCE | 0.118 | 712 | 0.55 |
| EnlightenGAN | 0.095 | 845 | 0.62 |
| EnlightenGAN + (Ours) | 0.068 | 912 | 0.71 |
| Condition | Metric | M1 (ORB-SLAM3) | M2 (ORB3 + SP) | M3 (ORB3 + EG) | M4 (SP + LG) | M5 (Ours) |
|---|---|---|---|---|---|---|
| Original (100%) | ATE (m) | 0.72 | 0.68 | 0.75 | 0.89 | 1.21 |
| | Success (%) | 100 | 100 | 100 | 100 | 100 |
| Mild (50%) | ATE (m) | 3.70 | 1.05 | 1.18 | 0.98 | 1.15 |
| | Success (%) | 100 | 100 | 100 | 100 | 77 |
| Severe (25%) | ATE (m) | X | 1.02 | 3.35 | 1.05 | 1.18 |
| | Success (%) | 12 | 55 | 15 | 12 | 62 |
| Extreme (10%) | ATE (m) | X | X | X | X | 1.25 |
| | Success (%) | 0 | 12 | 0 | 0 | 0 |
| Configuration | ATE RMSE (m) | ATE Increase vs. Full |
|---|---|---|
| Full LDFE-SLAM | 0.068 | — |
| w/o EnlightenGAN | 0.142 | +108.8% |
| w/o SuperPoint (use ORB) | 0.185 | +172.1% |
| w/o LightGlue (use brute-force) | 0.125 | +83.8% |
| w/o Line Features | 0.089 | +30.9% |
| w/o Adaptive Illumination Scoring | 0.078 | +14.7% |
| w/o Geometric Consistency Loss | 0.082 | +20.6% |