Search Results (128)

Search Parameters:
Keywords = monocular visual measurement

25 pages, 2253 KB  
Article
Monocular Visual Pose Estimation Method Based on Spherical Cooperative Target
by Yanyu Ding, Chaoran Zhang, Yongbin Zhang, Fujin Yang, Zhiyuan Tang, Shipeng Li, Xinran Liu and Xiaojun Zhao
Sensors 2026, 26(10), 3139; https://doi.org/10.3390/s26103139 - 15 May 2026
Abstract
In close-range monocular visual measurement and cooperative target pose estimation, conventional planar targets are constrained by viewpoint changes and are prone to perspective distortion. Although spherical targets provide omnidirectional observability, their PnP-based pose estimation may still suffer from large errors under limited fields of view and sparse feature observations. To address this issue, this paper proposes an integrated visual measurement framework covering both high-precision spherical target construction and robust pose estimation. First, a composite marker layout based on adaptively scaled latitude–longitude topology is designed. To suppress cumulative distortion caused by long-sequence multi-view rigid registration, a center-to-pole point-cloud stitching strategy is developed, and multiple observations are fused using geometric-consistency weighting to accurately reconstruct the feature-point coordinate system of the target. Second, a joint optimization method is proposed by combining feature-point reprojection error with a contour center consistency constraint. Specifically, the theoretical contour center is predicted from the analytical projection model of the sphere and constrained to agree with the observed contour center fitted from the image. In addition, an SQPnP-based sequential reinitialization mechanism is introduced to improve robustness under sparse-point observations. Simulation results demonstrate that the proposed method achieves higher accuracy and robustness under continuous pose changes, sparse feature points, and different noise levels, compared with EPnP, EPnP+LM, LM, and SQPnP, while real-image experiments further demonstrate its practical feasibility. Full article
(This article belongs to the Section Sensing and Imaging)
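
A minimal sketch of this estimation pattern, assuming hypothetical marker coordinates, detections, and intrinsics (the paper's contour-center consistency residual and sequential reinitialization logic are not reproduced):

```python
import cv2
import numpy as np

def estimate_pose(obj_pts, img_pts, K, dist=None):
    """obj_pts: (N, 3) target-frame marker points; img_pts: (N, 2) detections."""
    dist = np.zeros(5) if dist is None else dist
    # SQPnP copes comparatively well with sparse point sets (N >= 3).
    ok, rvec, tvec = cv2.solvePnP(obj_pts.astype(np.float64),
                                  img_pts.astype(np.float64), K, dist,
                                  flags=cv2.SOLVEPNP_SQPNP)
    if not ok:
        raise RuntimeError("PnP failed")
    # Levenberg-Marquardt refinement of the reprojection error; the paper's
    # joint optimization would add a contour-center residual at this stage.
    rvec, tvec = cv2.solvePnPRefineLM(obj_pts.astype(np.float64),
                                      img_pts.astype(np.float64), K, dist,
                                      rvec, tvec)
    return rvec, tvec
```
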
18 pages, 2950 KB  
Article
A Target-Free Vision-Based Method for Measuring Girder Rigid-Body Displacement Under Long-Distance Imaging Conditions
by Guangyu Li, Hai-Bin Huang, Shengzhi Ai, Yuan Cheng and Dong Liang
Infrastructures 2026, 11(5), 161; https://doi.org/10.3390/infrastructures11050161 - 6 May 2026
Viewed by 196
Abstract
The rigid-body displacement of bridge girders, particularly the lateral displacement of curved girder bridges, is a critical indicator reflecting the structural safety reserve and durability of bridges. However, under long-distance imaging conditions, the inherent scale ambiguity and perspective distortion in monocular vision measurement, coupled with environmental interferences such as weakened natural edges and varying illumination, pose severe challenges to target-free, high-precision, and real-time displacement measurement. To this end, this paper proposes a target-free visual method for measuring rigid-body displacement of bridge girders under long-distance imaging. By fusing optical flow and Hough transform to extract seismic block edges and adopting hierarchical NCC matching for stable girder tracking, the method achieves millimeter-level accuracy, real-time performance, and strong illumination robustness. Model tests and field validation confirm its effectiveness for low-cost bridge health monitoring. Full article
(This article belongs to the Special Issue Sustainable Bridge Engineering)
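
As an illustration of coarse-to-fine NCC tracking of the kind the abstract describes, here is a minimal sketch, assuming grayscale uint8 frames and a template cropped from a reference frame; the paper's optical-flow/Hough edge extraction and exact hierarchy are not reproduced:

```python
import cv2
import numpy as np

def hierarchical_ncc(frame, template, levels=3, radius=8):
    """Track a template by NCC, coarse search at the top pyramid level,
    then refinement in a small window at each finer level."""
    f_pyr, t_pyr = [frame], [template]
    for _ in range(levels - 1):
        f_pyr.append(cv2.pyrDown(f_pyr[-1]))
        t_pyr.append(cv2.pyrDown(t_pyr[-1]))
    res = cv2.matchTemplate(f_pyr[-1], t_pyr[-1], cv2.TM_CCOEFF_NORMED)
    _, _, _, (x, y) = cv2.minMaxLoc(res)
    for lvl in range(levels - 2, -1, -1):
        x, y = 2 * x, 2 * y                   # upsample the coarse estimate
        h, w = t_pyr[lvl].shape
        x0, y0 = max(0, x - radius), max(0, y - radius)
        # Search only a small window around the upsampled estimate
        # (the window must still be at least as large as the template).
        win = f_pyr[lvl][y0:y + h + radius, x0:x + w + radius]
        res = cv2.matchTemplate(win, t_pyr[lvl], cv2.TM_CCOEFF_NORMED)
        _, _, _, (dx, dy) = cv2.minMaxLoc(res)
        x, y = x0 + dx, y0 + dy
    return x, y
```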

22 pages, 55205 KB  
Article
A Distributed and Reconfigurable Architecture for Unified Multimodal Indoor Localization of a Mobile Edge Node in a Cyber-Physical Context
by Theodoros Papafotiou, Emmanouil Tsardoulias and Andreas Symeonidis
Robotics 2026, 15(5), 91; https://doi.org/10.3390/robotics15050091 - 30 Apr 2026
Viewed by 218
Abstract
Precise 3D positioning in GPS-denied environments is a critical enabler of autonomous robotics, industrial automation, and smart logistics within the emerging cyber-physical landscape. This paper presents a distributed and reconfigurable architecture designed to benchmark and provide unified multimodal indoor localization for mobile edge nodes. Unlike rigid commercial solutions, our architecture employs a distributed, reconfigurable framework that allows the rapid interchange of Absolute Localization Methods (UWB, External RGB-D Vision) and Relative Localization Methods (Inertial Odometry, Visual Odometry). We evaluate these modalities individually and in hybrid configurations using a custom low-cost mobile edge node. Experimental results in a controlled environment demonstrate that while all-optical systems offer high precision, a cost-effective fusion of Ultra-Wideband (UWB) and Inertial Measurement Unit (IMU) data provides a robust balance of accuracy and reliability. Conversely, we identify significant limitations in monocular visual odometry within feature-poor indoor spaces. The developed platform serves as a reproducible foundation for researchers to prototype hybrid localization algorithms and assess the trade-offs between hardware cost and operational accuracy within complex cyber-physical ecosystems. Full article
(This article belongs to the Special Issue Localization and 3D Mapping of Intelligent Robotics)
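
The UWB + IMU balance can be illustrated with a toy one-dimensional EKF sketch (purely illustrative, not the paper's architecture): IMU acceleration drives the prediction, and a UWB range to an anchor at the origin corrects the drift.

```python
import numpy as np

def ekf_step(x, P, acc, z_range, dt, q=0.05, r=0.10):
    """x = [position, velocity]; z_range = UWB range to an anchor at the origin."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    x = F @ x + np.array([0.5 * dt**2, dt]) * acc   # IMU-driven prediction
    P = F @ P @ F.T + q * np.eye(2)
    H = np.array([[np.sign(x[0]), 0.0]])            # d|p|/dp for a 1D range
    y = z_range - abs(x[0])                         # innovation
    S = H @ P @ H.T + r                             # innovation covariance
    K = (P @ H.T) / S                               # Kalman gain
    x = x + (K * y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P
```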

32 pages, 436 KB  
Review
Amblyopia in 2026: A State-of-the-Art Review of Multidimensional Phenotyping, Response Heterogeneity, and Clinical Considerations
by Danjela Ibrahimi and José R. García-Martínez
Brain Sci. 2026, 16(5), 467; https://doi.org/10.3390/brainsci16050467 - 27 Apr 2026
Viewed by 852
Abstract
Amblyopia is increasingly conceptualized as a neurodevelopmental visual disorder that often arises from discordant binocular visual experience during early life and is associated with abnormal binocular interactions, interocular suppression, orientation-dependent developmental abnormalities in selected refractive phenotypes, and experience-dependent plasticity, consistent with a distributed-network perspective rather than a purely monocular acuity deficit. We performed a structured state-of-the-art narrative synthesis of peer-reviewed reviews, randomized controlled trials, and key mechanistic human studies indexed in PubMed/MEDLINE, Web of Science, and Scopus (1 January 2016–28 February 2026; last search 28 February 2026), prioritizing recent evidence from 2021–2026. Literature supports consideration of clinically trackable constructs beyond best-corrected visual acuity (BCVA), including quantified suppression/imbalance, binocular function, and functionally meaningful outcomes such as reading-related limitation and broader functional impact. Across established and emerging intervention classes, treatment effects are heterogeneous across ages and etiologies. Evidence is strongest for conventional penalization and selected active training-based approaches, whereas newer protocol-standardized approaches remain investigational and require prospective evaluation with transparent exposure/dose reporting. Based on these findings, we outline a clinically oriented, core outcome set for amblyopia and strabismus (COSAMS)-aligned framework that combines quantified binocular imbalance with multidimensional phenotyping and a hypothesis-driven, prospectively testable therapeutic model intended to structure (not replace) clinical decision-making. Priorities for precision-oriented amblyopia care include standardization of suppression metrics, adoption of core outcome sets, transparent reporting of ‘not measurable’ outcomes and missingness, and prospective validation of phenotype-driven, prediction-ready frameworks. Full article
(This article belongs to the Special Issue Brain Plasticity in Health and Disease: From Molecules to Circuits)

30 pages, 135773 KB  
Article
Robust 3D Multi-Object Tracking via 4D mmWave Radar-Camera Fusion and Disparity-Domain Depth Recovery
by Yunfei Xie, Xiaohui Li, Dingheng Wang, Zhuo Wang, Shiliang Li, Jia Wang and Zhenping Sun
Sensors 2026, 26(7), 2096; https://doi.org/10.3390/s26072096 - 27 Mar 2026
Viewed by 787
Abstract
4D millimeter-wave radar provides high-precision ranging capability and exhibits strong robustness under adverse weather and low-visibility conditions, but its point clouds are relatively sparse and suffer from severe elevation-angle measurement noise. Monocular cameras, by contrast, provide rich semantic information and high recall, yet are fundamentally limited by scale ambiguity. To exploit the complementary characteristics of these two sensors, this paper proposes a radar-camera fusion 3D multi-object tracking framework that does not rely on complex 3D annotated data. First, on the radar signal-processing side, a Gaussian distribution-based adaptive angle compression method and IMU-based velocity compensation are introduced to effectively suppress measurement noise, and an improved DBSCAN clustering scheme with recursive cluster splitting and historical static-box guidance is employed to generate high-quality radar detections. Second, a disparity-domain metric depth recovery method is proposed. This method uses filtered radar points as sparse metric anchors, performs robust fitting with RANSAC, and applies Kalman filtering for temporal smoothing, thereby converting the relative depth output of the visual foundation model Depth Anything V2 into metric depth. Finally, a hierarchical fusion strategy is designed at both the detection and tracking levels to achieve stable cross-modal state association. Experimental results on a self-collected dataset show that the proposed method achieves an overall MOTA of 77.93%, outperforming single-modality baselines and other comparison methods by 11 to 31 percentage points. This study provides an effective solution for low-cost and robust environment perception in complex dynamic scenarios. Full article
(This article belongs to the Section Vehicular Sensing)
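
A minimal sketch of the disparity-domain recovery idea, assuming radar points already projected into the image: robustly fit an affine map from the network's relative inverse depth to metric inverse depth at the radar anchors (the Kalman temporal smoothing is omitted; names are illustrative, not the paper's):

```python
import numpy as np

def fit_metric_scale(rel_inv_depth, radar_depth, iters=200, tol=0.05):
    """rel_inv_depth: relative inverse depth at radar pixels; radar_depth: meters."""
    target = 1.0 / radar_depth          # metric inverse depth (disparity-like)
    best, best_inliers = None, 0
    rng = np.random.default_rng(0)
    for _ in range(iters):
        i, j = rng.choice(len(target), size=2, replace=False)
        # Two-point hypothesis for scale a and shift b: target ~= a * rel + b.
        denom = rel_inv_depth[i] - rel_inv_depth[j]
        if abs(denom) < 1e-9:
            continue
        a = (target[i] - target[j]) / denom
        b = target[i] - a * rel_inv_depth[i]
        inliers = np.abs(a * rel_inv_depth + b - target) < tol
        if inliers.sum() > best_inliers:
            best_inliers, best = inliers.sum(), (a, b)
    return best  # apply densely: metric_depth = 1 / (a * rel_inv_depth + b)
```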

21 pages, 23671 KB  
Article
Zero-Shot Polarization-Intensity Physical Fusion Monocular Depth Estimation for High Dynamic Range Scenes
by Renhao Rao, Zhizhao Ouyang, Shuang Chen, Liang Chen, Guoqin Huang and Changcai Cui
Photonics 2026, 13(3), 268; https://doi.org/10.3390/photonics13030268 - 11 Mar 2026
Viewed by 532
Abstract
Monocular 3D reconstruction remains a persistent challenge for autonomous driving systems in Degraded Visual Environments (DVEs) with extreme glare and low illumination, such as highway tunnels, due to the lack of reliable texture cues. This paper proposes a physics-aware deep learning framework that overcomes these limitations by fusing polarization sensing with conventional intensity imaging. Unlike traditional end-to-end data-driven fusion strategies, we propose a Modality-Aligned Parameter Injection strategy. By remapping the weight space of the input layer, this strategy achieves a smooth transfer of the pre-trained Vision Transformer (i.e., MiDaS) to multi-modal inputs. Its core advantage lies in the seamless integration of four-channel polarization geometric information while fully preserving the pre-trained semantic representation capabilities of the backbone network, thereby avoiding the overfitting risk associated with training from scratch on small-sample data. Furthermore, we design a Reliability-Aware Gating mechanism that dynamically re-weights appearance and geometric cues based on intensity saturation and the physical validity of polarization signals as measured by the Degree of Linear Polarization (DoLP). We validate the proposed method on our self-constructed POLAR-GLV benchmark, a real-world dataset collected specifically for high dynamic range tunnel scenarios. Extensive experiments demonstrate that our method consistently outperforms intensity-only baselines, reducing geometric reconstruction error by 24.2% in high-glare tunnel exit zones and 10.0% at tunnel entrances. Crucially, compared to multi-stream fusion architectures, these performance gains come with negligible additional computational cost, making the framework highly suitable for resource-constrained onboard inference environments. Full article
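
For reference, the standard Stokes-parameter computation of total intensity and DoLP from a four-channel (0°/45°/90°/135°) polarization image, as produced by division-of-focal-plane sensors (a generic formula, not the paper's input pipeline):

```python
import numpy as np

def stokes_dolp(i0, i45, i90, i135, eps=1e-6):
    """Each input is an (H, W) intensity image at one polarizer orientation."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90                        # horizontal vs. vertical component
    s2 = i45 - i135                      # diagonal component
    dolp = np.sqrt(s1**2 + s2**2) / (s0 + eps)
    return s0, np.clip(dolp, 0.0, 1.0)
```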

17 pages, 1212 KB  
Article
Comparative Photopic and Mesopic Visual Performance of Enhanced Monofocal Versus Non-Diffractive Extended Depth-of-Focus Intraocular Lenses
by Inas Baoud Ould Haddi, Vanesa Blázquez-Sánchez, Dayan Flores-Cervantes, Emilio Dorronzoro-Ramirez, Nuria Garzón and Cristina Bonnin-Arias
J. Clin. Med. 2026, 15(4), 1368; https://doi.org/10.3390/jcm15041368 - 9 Feb 2026
Cited by 1 | Viewed by 622
Abstract
Background/Objectives: Enhanced monofocal and non-diffractive extended depth-of-focus (EDoF) intraocular lenses (IOLs) aim to improve intermediate vision while maintaining contrast sensitivity. However, direct comparative evidence under both photopic and mesopic conditions remains limited. This study prospectively compared the visual performance of two enhanced monofocal IOLs (Tecnis® Eyhance and ISOPure®) and one non-diffractive EDoF IOL (AcrySof® IQ Vivity). Methods: Sixty patients undergoing bilateral cataract surgery were implanted with one of the three IOLs (n = 20 per group); patients were assigned to an IOL group based on preoperative consultation and clinical indications. One month postoperatively, monocular corrected distance (CDVA), distance-corrected intermediate (DCIVA), and near visual acuity (DCNVA) were measured under photopic and mesopic conditions. Photopic defocus curves, contrast sensitivity under photopic and mesopic conditions, and the correlation between pupil diameter and visual acuity were also assessed. Results: Baseline characteristics were comparable among groups. Under photopic conditions, Vivity achieved significantly better UDVA, DCIVA and DCNVA than both Eyhance and ISOPure®. Under mesopic conditions, distance acuity did not differ significantly, but Vivity™ showed superior DCIVA and DCNVA. Defocus curves demonstrated a broader functional range of vision with Vivity™, while Eyhance and ISOPure® showed nearly overlapping profiles. Contrast sensitivity was similar among all IOLs under both lighting conditions, with no statistically significant differences. Conclusions: The non-diffractive EDoF AcrySof® IQ Vivity provided a wider and more functional depth-of-focus than the enhanced monofocal lenses evaluated, without compromising contrast sensitivity. Eyhance and ISOPure® offered comparable performance, with good distance vision and modest depth-of-focus extension. All three IOLs maintained contrast sensitivity levels comparable to those typically reported for standard monofocal IOLs under both photopic and mesopic illumination, indicating no clinically relevant contrast penalty. Full article
(This article belongs to the Section Ophthalmology)

45 pages, 5418 KB  
Review
Visual and Visual–Inertial SLAM for UGV Navigation in Unstructured Natural Environments: A Survey of Challenges and Deep Learning Advances
by Tiago Pereira, Carlos Viegas, Salviano Soares and Nuno Ferreira
Robotics 2026, 15(2), 35; https://doi.org/10.3390/robotics15020035 - 2 Feb 2026
Cited by 1 | Viewed by 2585
Abstract
Localization and mapping remain critical challenges for Unmanned Ground Vehicles (UGVs) operating in unstructured natural environments, such as forests and agricultural fields. While Visual SLAM (VSLAM) and Visual–Inertial SLAM (VI-SLAM) have matured significantly in structured and urban scenarios, their extension to outdoor natural domains introduces severe challenges, including dynamic vegetation, illumination variations, a lack of distinctive features, and degraded GNSS availability. Recent advances in Deep Learning have brought promising developments to VSLAM- and VI-SLAM-based pipelines, ranging from learned feature extraction and matching to self-supervised monocular depth prediction and differentiable end-to-end SLAM frameworks. Furthermore, emerging methods for adaptive sensor fusion, leveraging attention mechanisms and reinforcement learning, open new opportunities to improve robustness by dynamically weighting the contributions of camera and IMU measurements. This review provides a comprehensive overview of Visual and Visual–Inertial SLAM for UGVs in unstructured environments, highlighting the challenges posed by natural contexts and the limitations of current pipelines. Classic VI-SLAM frameworks and recent Deep-Learning-based approaches were systematically reviewed. Special attention is given to field robotics applications in agriculture and forestry, where low-cost sensors and robustness against environmental variability are essential. Finally, open research directions are discussed, including self-supervised representation learning, adaptive sensor confidence models, and scalable low-cost alternatives. By identifying key gaps and opportunities, this work aims to guide future research toward resilient, adaptive, and economically viable VSLAM and VI-SLAM pipelines, tailored for UGV navigation in unstructured natural environments. Full article
(This article belongs to the Special Issue Localization and 3D Mapping of Intelligent Robotics)

13 pages, 1345 KB  
Article
Clinical Performance of an Enhanced Monofocal IOL Bilaterally Implanted in Patients Targeted for Monovision: A Prospective Study
by Javier García-Bella, Celia Villanueva, Nuria Garzón, Bárbara Burgos-Blasco, Beatriz Vidal-Villegas and Julián García-Feijoo
J. Clin. Med. 2026, 15(2), 875; https://doi.org/10.3390/jcm15020875 - 21 Jan 2026
Viewed by 617
Abstract
Background/Objectives: The purpose of the study is to assess visual and refractive outcomes and patient satisfaction after bilateral implantation of an enhanced monofocal intraocular lens (IOL) in a monovision configuration. Methods: Prospective, monocentric, non-comparative study including adults 21 years or older, with astigmatism less than 1.50 D, who were suitable for bilateral cataract surgery targeted with −1.00 D monovision. Participants were implanted with the RayOne EMV and followed up for three months. Outcome measures included refraction, monocular and binocular uncorrected distance visual acuity (UDVA), corrected distance visual acuity (CDVA), uncorrected and distance-corrected intermediate visual acuity (UIVA and DCIVA) at 66 cm and 80 cm, binocular defocus curve, and CatQuest-9SF questionnaire. Results: Sixty eyes of thirty patients were included. Postoperative spherical equivalent (SEQ) was −0.16 ± 0.29 D in the dominant eyes and −1.24 ± 0.43 D in the non-dominant eyes. Binocularly, mean UDVA at 4 m was −0.01 ± 0.07 logMAR and was 0.1 logMAR or better in all patients. Mean binocular UIVA at 66 cm was 0.08 ± 0.08 logMAR and was 0.2 logMAR or better in 92.9% of patients. Binocular UDVA was statistically significantly improved compared to monocular UDVA of the dominant eye targeted for distance (p < 0.001). Similarly, binocular UIVA was statistically significantly improved compared to monocular UIVA of the non-dominant eye targeted for −1.00 D (p < 0.001). A total of 96.6% of patients were satisfied with their sight. Conclusions: Bilateral implantation of an enhanced monofocal IOL in a monovision configuration provided excellent binocular uncorrected vision at distance and intermediate ranges, demonstrating effective binocular summation and a high level of patient satisfaction. Full article
(This article belongs to the Section Ophthalmology)

30 pages, 5328 KB  
Article
DTVIRM-Swarm: A Distributed and Tightly Integrated Visual-Inertial-UWB-Magnetic System for Anchor Free Swarm Cooperative Localization
by Xincan Luo, Xueyu Du, Shuai Yue, Yunxiao Lv, Lilian Zhang, Xiaofeng He, Wenqi Wu and Jun Mao
Drones 2026, 10(1), 49; https://doi.org/10.3390/drones10010049 - 9 Jan 2026
Cited by 1 | Viewed by 1204
Abstract
Accurate Unmanned Aerial Vehicle (UAV) positioning is vital for swarm cooperation. However, this remains challenging in situations where Global Navigation Satellite System (GNSS) and other external infrastructures are unavailable. To address this challenge, we propose to use only the onboard Microelectromechanical System Inertial Measurement Unit (MIMU), magnetic sensor, monocular camera and Ultra-Wideband (UWB) device to construct a distributed and anchor-free cooperative localization system by tightly fusing their measurements. As the onboard UWB measurements under dynamic motion conditions are noisy and discontinuous, we propose an adaptive adjustment method based on chi-squared detection to effectively filter out inconsistent and false ranging information. Moreover, we introduce the pose-only theory to model the visual measurement, which improves the efficiency and accuracy of visual-inertial processing. A sliding window Extended Kalman Filter (EKF) is constructed to tightly fuse all the measurements and is capable of working under UWB- or vision-deprived conditions. Additionally, a novel Multidimensional Scaling-MAP (MDS-MAP) initialization method fuses ranging, MIMU, and geomagnetic data to solve the non-convex optimization problem in ranging-aided Simultaneous Localization and Mapping (SLAM), ensuring fast and accurate swarm absolute pose initialization. To overcome the state-consistency challenge inherent in the distributed cooperative structure, we model not only the UWB measurement noise but also the neighbor agent's position uncertainty in the measurement model. Furthermore, we incorporate the Covariance Intersection (CI) method into our UWB measurement fusion process to address the challenge of unknown correlations between state estimates from different UAVs, ensuring consistent and robust state estimation. To validate the effectiveness of the proposed methods, we have established both simulation and hardware test platforms. The proposed method is compared with state-of-the-art (SOTA) UAV localization approaches designed for GNSS-challenged environments. Extensive experiments demonstrate that our algorithm achieves superior positioning accuracy, higher computing efficiency and better robustness. Moreover, even when vision loss causes other methods to fail, our proposed method continues to operate effectively. Full article
(This article belongs to the Special Issue Autonomous Drone Navigation in GPS-Denied Environments)
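
A minimal sketch of chi-squared innovation gating for UWB ranges of the kind the abstract describes (the gate probability, models, and names are assumptions, not the paper's implementation):

```python
import numpy as np
from scipy.stats import chi2

GATE = chi2.ppf(0.95, df=1)  # 95% gate for a scalar range measurement

def accept_range(z, x_pred, P, peer_pos, r_var):
    """z: measured range; x_pred: predicted own position; peer_pos: neighbor position."""
    diff = x_pred - peer_pos
    d = np.linalg.norm(diff)
    H = (diff / d).reshape(1, -1)    # Jacobian of ||x - peer|| w.r.t. x
    S = H @ P @ H.T + r_var          # innovation covariance
    nu = z - d                       # range innovation
    # Reject inconsistent or false ranging whose normalized innovation
    # squared exceeds the chi-squared gate.
    return (nu**2 / S.item()) < GATE
```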

20 pages, 8043 KB  
Article
Development of a Cost-Effective UUV Localisation System Integrable with Aquaculture Infrastructure
by Thein Than Tun, Loulin Huang and Mark Anthony Preece
J. Mar. Sci. Eng. 2026, 14(2), 115; https://doi.org/10.3390/jmse14020115 - 7 Jan 2026
Viewed by 627
Abstract
In many aquaculture farms, Unmanned Underwater Vehicles (UUVs) are being deployed to perform dangerous and time-consuming repetitive tasks (e.g., fish net-pen visual inspection) on behalf of, or in collaboration with, farm operators. Most are remotely operated, and one of the main barriers to deploying them autonomously is UUV localisation. Specifically, the cost of the localisation sensor suite, sensor reliability in constrained operational workspaces, and the return on investment (ROI) for the large initial investment in the UUV and its localisation hinder R&D work and the industrial-scale adoption of autonomous UUV deployment. The proposed system, which leverages AprilTag (a fiducial marker used as a frame of reference) detection, provides cost-effective UUV localisation for initial trials of autonomous UUV deployment, requiring only minor modifications to the aquaculture infrastructure. With such a cost-effective approach, UUV R&D engineers can demonstrate and validate the advantages and challenges of autonomous UUV deployment to farm operators, policymakers, and governing authorities, supporting informed decision-making for the future large-scale adoption of autonomous UUVs in aquaculture. Initial validation of the proposed cost-effective localisation system indicates that centimetre-level accuracy can be achieved with a single monocular camera and only 10 AprilTags, without requiring physical measurements, in a 115.46 m³ laboratory workspace under various lighting conditions. Full article
(This article belongs to the Special Issue Infrastructure for Offshore Aquaculture Farms)
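
A minimal sketch of monocular AprilTag pose estimation using the third-party pupil-apriltags bindings; the intrinsics, tag size, and file name are hypothetical, and the tag-to-world composition used for full vehicle localisation is omitted:

```python
import cv2
from pupil_apriltags import Detector

detector = Detector(families="tag36h11")
fx, fy, cx, cy = 800.0, 800.0, 640.0, 360.0   # hypothetical calibration
gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
for det in detector.detect(gray, estimate_tag_pose=True,
                           camera_params=(fx, fy, cx, cy), tag_size=0.16):
    # pose_R, pose_t give the tag pose in the camera frame; inverting it and
    # composing with the known tag-to-world transform localises the vehicle.
    print(det.tag_id, det.pose_t.ravel())
```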

25 pages, 103370 KB  
Article
NeRF-Enhanced Visual–Inertial SLAM for Low-Light Underwater Sensing
by Zhe Wang, Qinyue Zhang, Yuqi Hu and Bing Zheng
J. Mar. Sci. Eng. 2026, 14(1), 46; https://doi.org/10.3390/jmse14010046 - 26 Dec 2025
Viewed by 1183
Abstract
Marine robots operating in low illumination and turbid waters require reliable measurement and control for surveying, inspection, and monitoring. This paper presents a sensor-centric visual–inertial simultaneous localization and mapping (SLAM) pipeline that combines low-light enhancement, learned feature matching, and NeRF-based dense reconstruction to provide stable navigation states. A lightweight encoder–decoder with global attention improves signal-to-noise ratio and contrast while preserving feature geometry. SuperPoint and LightGlue deliver robust correspondences under severe visual degradation. Visual and inertial data are tightly fused through IMU pre-integration and nonlinear optimization, producing steady pose estimates that sustain downstream guidance and trajectory planning. An accelerated NeRF converts monocular sequences into dense, photorealistic reconstructions that complement sparse SLAM maps and support survey-grade measurement products. Experiments on AQUALOC sequences demonstrate improved localization stability and higher-fidelity reconstructions at competitive runtime, showing robustness to low illumination and turbidity. The results indicate an effective engineering pathway that integrates underwater image enhancement, multi-sensor fusion, and neural scene representations to improve navigation reliability and mission effectiveness in realistic marine environments. Full article
(This article belongs to the Special Issue Intelligent Measurement and Control System of Marine Robots)
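
To illustrate the IMU pre-integration step mentioned in the abstract, here is a toy sketch that accumulates relative rotation, velocity, and position between two keyframes (gravity handling, bias terms, and noise-covariance propagation are deliberately omitted):

```python
import numpy as np

def preintegrate(gyro, accel, dt):
    """gyro, accel: (N, 3) samples between keyframes; returns delta R, v, p."""
    dR = np.eye(3)
    dv = np.zeros(3)
    dp = np.zeros(3)
    for w, a in zip(gyro, accel):
        dp += dv * dt + 0.5 * (dR @ a) * dt**2
        dv += (dR @ a) * dt
        # First-order exponential map for the incremental rotation.
        th = w * dt
        W = np.array([[0.0, -th[2], th[1]],
                      [th[2], 0.0, -th[0]],
                      [-th[1], th[0], 0.0]])
        dR = dR @ (np.eye(3) + W)
    return dR, dv, dp
```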

37 pages, 9718 KB  
Article
Monocular Visual Measurement System Uncertainty Analysis and One-Step End–End Estimation Upgrade
by Kuai Zhou, Wenmin Chu and Peng Zhao
Sensors 2025, 25(23), 7179; https://doi.org/10.3390/s25237179 - 24 Nov 2025
Viewed by 1200
Abstract
Monocular visual measurement and vision-guided robotics technology find extensive application in modern automated manufacturing, particularly in aerospace assembly. However, during assembly pose measurement and guidance, the propagation and accumulation of multi-source errors—including those from visual measurement, hand–eye calibration, and robot calibration—impact final assembly accuracy. To address this issue, this study first proposes an uncertainty analysis method for monocular visual measurement systems in assembly pose, encompassing the determination of uncertainty propagation paths and input uncertainty values. Building on this foundation, the system’s uncertainty is analyzed. Inspired by the uncertainty analysis results, this study further proposes a direct one-step solution to a series of problems in robot calibration and hand–eye calibration using a nonlinear mapping estimation method. Through experiments and discussion, a high-performance, one-step, end-to-end pose estimation convolutional neural network (OECNN) is constructed. The OECNN achieves direct mapping from the pose variation of the target object to the drive volume variation of the positioner. The uncertainty analysis conducted in this study yields a series of conclusions that are significant for further enhancing the precision of assembly pose estimation. The proposed uncertainty analysis methodology may also serve as a reference for uncertainty analysis in complex systems. Experimental validation demonstrates that the proposed one-step end-to-end pose estimation method exhibits high accuracy. It can be applied to automated assembly tasks involving various vision-guided robots, including those with typical configurations, and it is particularly suitable for high-precision assembly scenarios, such as aircraft assembly. Full article
(This article belongs to the Section Sensing and Imaging)
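
The uncertainty-propagation idea can be illustrated generically with Monte Carlo sampling (a sketch of the general technique, not the paper's propagation-path analysis): perturb each input error source and observe the spread of the pipeline output.

```python
import numpy as np

def propagate(pipeline, nominal_inputs, sigmas, n=5000, seed=0):
    """pipeline: callable mapping an input vector to a pose/drive estimate;
    sigmas: 1-sigma value of each input error source."""
    rng = np.random.default_rng(seed)
    outs = np.asarray([pipeline(nominal_inputs + rng.normal(0.0, sigmas))
                       for _ in range(n)])
    # Output mean and 1-sigma spread across the sampled error sources.
    return outs.mean(axis=0), outs.std(axis=0)
```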

24 pages, 7207 KB  
Article
YOLO–LaserGalvo: A Vision–Laser-Ranging System for High-Precision Welding Torch Localization
by Jiajun Li, Tianlun Wang and Wei Wei
Sensors 2025, 25(20), 6279; https://doi.org/10.3390/s25206279 - 10 Oct 2025
Viewed by 1170
Abstract
A novel closed-loop visual positioning system, termed YOLO–LaserGalvo (YLGS), is proposed for precise localization of welding torch tips in industrial welding automation. The proposed system integrates a monocular camera, an infrared laser distance sensor with a galvanometer scanner, and a customized deep learning detector based on an improved YOLOv11 model. In operation, the vision subsystem first detects the approximate image location of the torch tip using the YOLOv11-based model. Guided by this detection, the galvanometer steers the IR laser beam to that point and measures the distance to the torch tip. The distance feedback is then fused with the vision coordinates to compute the precise 3D position of the torch tip in real time. Under complex illumination, the proposed YLGS system exhibits superior robustness compared with color-marker and ArUco baselines. Experimental evaluation shows that the system outperforms traditional color-marker and ArUco-based methods in terms of accuracy, robustness, and processing speed. This marker-free method provides high-precision torch positioning without requiring structured lighting or artificial markers. Its pedagogical implications in engineering education are also discussed. Potential future work includes extending the method to full 6-DOF pose estimation and integrating additional sensors for enhanced performance. Full article
(This article belongs to the Section Navigation and Positioning)
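
The fusion step can be illustrated by back-projecting the detected pixel through a pinhole model and scaling the viewing ray by the measured distance (a simplified sketch: intrinsics are hypothetical, lens distortion is ignored, and the laser is assumed co-located with the camera, which the real galvanometer geometry is not):

```python
import numpy as np

def backproject(u, v, distance, fx, fy, cx, cy):
    """Return the 3D torch-tip position in the camera frame (meters)."""
    ray = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    ray /= np.linalg.norm(ray)      # unit viewing ray through the pixel
    return distance * ray           # the laser range fixes the metric scale
```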

19 pages, 4672 KB  
Article
Monocular Visual/IMU/GNSS Integration System Using Deep Learning-Based Optical Flow for Intelligent Vehicle Localization
by Jeongmin Kang
Sensors 2025, 25(19), 6050; https://doi.org/10.3390/s25196050 - 1 Oct 2025
Cited by 1 | Viewed by 2029
Abstract
Accurate and reliable vehicle localization is essential for autonomous driving in complex outdoor environments. Traditional feature-based visual–inertial odometry (VIO) suffers from sparse features and sensitivity to illumination, limiting robustness in outdoor scenes. Deep learning-based optical flow offers dense and illumination-robust motion cues. However, existing methods rely on simple bidirectional consistency checks that yield unreliable flow in low-texture or ambiguous regions. Global navigation satellite system (GNSS) measurements can complement VIO, but often degrade in urban areas due to multipath interference. This paper proposes a multi-sensor fusion system that integrates monocular VIO with GNSS measurements to achieve robust and drift-free localization. The proposed approach employs a hybrid VIO framework that utilizes a deep learning-based optical flow network, with an enhanced consistency constraint that incorporates local structure and motion coherence to extract robust flow measurements. The extracted optical flow serves as visual measurements, which are then fused with inertial measurements to improve localization accuracy. GNSS updates further enhance global localization stability by mitigating long-term drift. The proposed method is evaluated on the publicly available KITTI dataset. Extensive experiments demonstrate its superior localization performance compared to previous similar methods. The results show that the filter-based multi-sensor fusion framework with optical flow refined by the enhanced consistency constraint ensures accurate and reliable localization in large-scale outdoor environments. Full article
(This article belongs to the Special Issue AI-Driving for Autonomous Vehicles)
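
For context, here is a minimal forward-backward consistency check for dense optical flow, the simple baseline the paper improves upon (the enhanced local-structure and motion-coherence terms are not reproduced):

```python
import numpy as np

def fb_consistent(flow_fw, flow_bw, thresh=1.5):
    """flow_fw, flow_bw: (H, W, 2) flows I1->I2 and I2->I1; returns a valid mask."""
    h, w = flow_fw.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Where each pixel lands in I2 under the forward flow.
    x2 = np.clip(xs + flow_fw[..., 0], 0, w - 1).astype(int)
    y2 = np.clip(ys + flow_fw[..., 1], 0, h - 1).astype(int)
    # The round trip (forward then backward) should cancel for reliable flow.
    round_trip = flow_fw + flow_bw[y2, x2]
    return np.linalg.norm(round_trip, axis=-1) < thresh
```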
