Search Results (79)

Search Parameters:
Keywords = binocular stereo matching

24 pages, 84138 KB  
Article
A Hybrid Strategy for Achieving Robust Matching Inside the Binocular Vision of a Humanoid Robot
by Ming Xie, Xiaohui Wang and Jianghao Li
Mathematics 2025, 13(21), 3488; https://doi.org/10.3390/math13213488 - 1 Nov 2025
Abstract
Binocular vision is a core module in humanoid robots, and stereo matching is one of its key challenges, relying on template matching techniques and mathematical optimization methods to achieve precise image matching. However, occlusion significantly degrades matching accuracy and robustness in practical applications. To address this issue, we propose a novel hybrid matching strategy that requires no network training, is computationally efficient, and effectively handles occlusion. First, we propose the Inverse Template Matching Mathematical Method (ITM), which is based on optimization theory. This method generates multiple new templates from the image to be matched using mathematical segmentation techniques and then matches them against the original template through an inverse optimization process, thereby improving matching accuracy under mild occlusion. Second, we propose the Iterative Matching Mathematical Method (IMM), which repeatedly executes ITM combined with optimization strategies to continuously refine the size of the matching templates, thereby further improving matching accuracy under complex occlusion. Concurrently, we adopt a local region selection strategy that restricts inverse optimization matching to areas related to the occluded regions, significantly enhancing matching efficiency. Experimental results show that under severe occlusion the proposed method improves accuracy by 93% compared with traditional template matching methods and by 37% compared with convolutional neural network (CNN)-based methods, reaching the current state of the art in the field. Our method introduces an inverse optimization paradigm into template matching and provides an innovative mathematical solution to the occlusion problem.
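
A minimal sketch of the inverse-matching idea described in this abstract: candidate patches are cut out of the search image and each is scored against the original template, rather than sliding the template in the usual direction. The grid step, synthetic images, and scoring below are illustrative assumptions; the paper's ITM and IMM formulations are not reproduced.

```python
# Illustrative sketch only: a simplified "inverse" template matching pass.
import cv2
import numpy as np

def inverse_template_match(search_img, template, stride=10):
    """Cut candidate patches out of the search image and score each one
    against the original template (the 'inverse' matching direction)."""
    th, tw = template.shape[:2]
    sh, sw = search_img.shape[:2]
    best_score, best_xy = -1.0, None
    for y in range(0, sh - th + 1, stride):
        for x in range(0, sw - tw + 1, stride):
            patch = search_img[y:y + th, x:x + tw]
            if patch.std() < 1e-3:                        # skip textureless patches
                continue
            score = cv2.matchTemplate(template, patch, cv2.TM_CCOEFF_NORMED)[0, 0]
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy, best_score

# Synthetic example: the template pattern is pasted at (150, 100) and then
# partially occluded; the inverse pass still localizes it approximately.
template = np.zeros((60, 60), np.uint8)
template[10:50, 10:50] = 255
search = np.zeros((240, 320), np.uint8)
search[100:160, 150:210] = template
search[120:160, 185:210] = 40                             # simulated occlusion
loc, score = inverse_template_match(search, template)
print("best match at", loc, "score %.3f" % score)
```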

23 pages, 8095 KB  
Article
Three-Dimensional Measurement of Transmission Line Icing Based on a Rule-Based Stereo Vision Framework
by Nalini Rizkyta Nusantika, Jin Xiao and Xiaoguang Hu
Electronics 2025, 14(21), 4184; https://doi.org/10.3390/electronics14214184 - 27 Oct 2025
Viewed by 224
Abstract
The safety and reliability of modern power systems are increasingly challenged by adverse environmental conditions. (1) Background: Ice accumulation on power transmission lines is recognized as a severe threat to grid stability, as it may cause tower collapse, conductor breakage, and large-scale outages, making accurate monitoring essential. (2) Methods: A rule-driven and interpretable stereo vision framework is proposed for three-dimensional (3D) detection and quantitative measurement of transmission line icing. The framework consists of three stages. First, adaptive preprocessing and segmentation are applied using multiscale Retinex with nonlinear color restoration, graph-based segmentation with structural constraints, and hybrid edge detection. Second, stereo feature extraction and matching are performed through entropy-based adaptive cropping, self-adaptive keypoint thresholding with circular descriptor analysis, and multi-level geometric validation. Third, 3D reconstruction is realized by fusing segmentation and stereo correspondences through triangulation with shape-constrained refinement, reaching millimeter-level accuracy. (3) Results: An accuracy of 98.35%, sensitivity of 91.63%, specificity of 99.42%, and precision of 96.03% were achieved in contour extraction, while a precision of 90%, recall of 82%, and an F1-score of 0.8594 with real-time efficiency (0.014–0.037 s) were obtained in stereo matching. Millimeter-level accuracy (Mean Absolute Error: 1.26 mm, Root Mean Square Error: 1.53 mm, Coefficient of Determination = 0.99) was further achieved in 3D reconstruction. (4) Conclusions: Superior accuracy, efficiency, and interpretability are demonstrated compared with two existing rule-based stereo vision methods (Method A: ROI Tracking and Geometric Validation Method and Method B: Rule-Based Segmentation with Adaptive Thresholding) that perform line icing identification and 3D reconstruction, highlighting the framework's advantages under limited data conditions. The interpretability of the framework is ensured through rule-based operations and stepwise visual outputs, allowing each processing result, from segmentation to three-dimensional reconstruction, to be directly understood and verified by operators and engineers. This transparency facilitates practical deployment and informed decision-making in real-world grid monitoring systems.
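
The reconstruction stage described above ultimately reduces to triangulating matched stereo correspondences with the two camera projection matrices. The following is a minimal sketch of that generic step with assumed intrinsics, baseline, and a synthetic correspondence; it is not the authors' rule-based framework.

```python
# Minimal triangulation sketch: recover a 3D point from one matched pixel pair
# given assumed projection matrices (illustrative values only).
import cv2
import numpy as np

K = np.array([[1200.0, 0.0, 640.0],
              [0.0, 1200.0, 360.0],
              [0.0, 0.0, 1.0]])
baseline = 0.20                                    # assumed baseline in metres
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-baseline, 0.0, 0.0]]).T])

# Project a known 3D point to create a synthetic correspondence, then recover it.
X_true = np.array([0.15, -0.05, 3.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]

X_h = cv2.triangulatePoints(P1, P2, x1.reshape(2, 1), x2.reshape(2, 1))
X = (X_h[:3] / X_h[3]).ravel()
print("recovered 3D point (m):", np.round(X, 4))   # ~ [0.15, -0.05, 3.0]
```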

21 pages, 13473 KB  
Article
Ship Ranging Method in Lake Areas Based on Binocular Vision
by Tengwen Zhang, Xin Liu, Mingzhi Shao, Yuhan Sun and Qingfa Zhang
Sensors 2025, 25(20), 6477; https://doi.org/10.3390/s25206477 - 20 Oct 2025
Viewed by 319
Abstract
The unique hollowed-out catamaran hulls and complex environmental conditions in lake areas hinder traditional ranging algorithms (combining target detection and stereo matching) from accurately obtaining depth information near the center of ships. This not only impairs the navigation of electric tourist boats but also leads to high computing resource consumption. To address this issue, this study proposes a ranging method integrating improved ORB (Oriented FAST and Rotated BRIEF) with stereo vision technology. Combined with traditional optimization techniques, the proposed method calculates target distance and angle based on the triangulation principle, providing a rough alternative solution for the “gap period” of stereo matching-based ranging. The method proceeds as follows: first, it acquires ORB feature points with relatively uniform global distribution from preprocessed binocular images via a local feature weighting approach; second, it further refines feature points within the ROI (Region of Interest) using a quadtree structure; third, it enhances matching accuracy by integrating the FLANN (Fast Library for Approximate Nearest Neighbors) and PROSAC (Progressive Sample Consensus) algorithms; finally, it applies the screened matching point pairs to the triangulation method to obtain the position and distance of the target ship. Experimental results show that the proposed algorithm improves processing speed by 6.5% compared with the ORB-PROSAC algorithm. Under ideal conditions, the ranging errors at 10 m and 20 m are 2.25% and 5.56%, respectively. This method can partially compensate for the shortcomings of stereo matching in ranging under the specified lake area scenario.
(This article belongs to the Section Sensing and Imaging)
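
The matching stage of a pipeline like the one above can be sketched generically as ORB feature extraction, FLANN matching with a ratio test, and depth from disparity on a rectified pair. The images, intrinsics, and baseline below are synthetic assumptions; the paper's local weighting, quadtree refinement, and PROSAC screening are not reproduced.

```python
# Illustrative sketch only: ORB + FLANN (LSH) matching and depth from disparity.
import cv2
import numpy as np

rng = np.random.default_rng(0)
left = cv2.GaussianBlur(rng.integers(0, 255, (480, 640), dtype=np.uint8), (5, 5), 0)
right = np.roll(left, -40, axis=1)                 # fake rectified right view, ~40 px disparity

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(left, None)
kp2, des2 = orb.detectAndCompute(right, None)

# FLANN with an LSH index for binary ORB descriptors.
flann = cv2.FlannBasedMatcher(dict(algorithm=6, table_number=6,
                                   key_size=12, multi_probe_level=1), dict(checks=50))
matches = flann.knnMatch(des1, des2, k=2)
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:   # Lowe ratio test
        good.append(pair[0])
# The paper additionally screens pairs with PROSAC; recent OpenCV builds expose a
# comparable option as the cv2.USAC_PROSAC flag of the robust estimators.

# Depth from disparity on a rectified pair: Z = f * B / d (assumed f and B).
f_px, baseline_m = 800.0, 0.12
for m in good[:5]:
    d = kp1[m.queryIdx].pt[0] - kp2[m.trainIdx].pt[0]
    if d > 1.0:
        print("disparity %.1f px -> depth %.2f m" % (d, f_px * baseline_m / d))
```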

24 pages, 3187 KB  
Article
A Trinocular System for Pedestrian Localization by Combining Template Matching with Geometric Constraint Optimization
by Jinjing Zhao, Sen Huang, Yancheng Li, Jingjing Xu and Shengyong Xu
Sensors 2025, 25(19), 5970; https://doi.org/10.3390/s25195970 - 25 Sep 2025
Viewed by 420
Abstract
Pedestrian localization is a fundamental sensing task for intelligent outdoor systems. To overcome the limitations of accuracy and efficiency in conventional binocular approaches, this study introduces a trinocular stereo vision framework that integrates template matching with geometric constraint optimization. The system employs a trinocular camera configuration arranged in an equilateral triangle, which enables complementary perspectives beyond a standard horizontal baseline. Based on this setup, an initial depth estimate is obtained through multi-scale template matching on the primary binocular pair. The additional vertical viewpoint is then incorporated by enforcing three-view geometric consistency, yielding refined and more reliable depth estimates. We evaluate the method on a custom outdoor trinocular dataset. Experimental results demonstrate that the proposed approach achieves a mean absolute error of 0.435 m with an average processing time of 3.13 ms per target. This performance surpasses both the binocular Semi-Global Block Matching (0.536 m) and RAFT-Stereo (0.623 m for the standard model and 0.621 m for the real-time model without fine-tuning). When combined with the YOLOv8-s detector, the system can localize pedestrians in 7.52 ms per frame, maintaining real-time operation (>30 Hz) for up to nine individuals, with a total end-to-end latency of approximately 32.56 ms.
(This article belongs to the Section Navigation and Positioning)
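
The initial depth estimate above comes from multi-scale template matching on the primary binocular pair. Below is a minimal sketch of generic multi-scale template matching with OpenCV on synthetic data; the scales, images, and scoring are illustrative assumptions, and the paper's three-view consistency refinement is not reproduced.

```python
# Sketch of multi-scale template matching: the template is matched at several
# scales and the best response is kept.
import cv2
import numpy as np

def multiscale_match(image, template, scales=(0.75, 1.0, 1.25)):
    best = (-1.0, None, 1.0)                       # (score, top-left, scale)
    for s in scales:
        t = cv2.resize(template, None, fx=s, fy=s)
        if t.shape[0] > image.shape[0] or t.shape[1] > image.shape[1]:
            continue
        res = cv2.matchTemplate(image, t, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(res)
        if max_val > best[0]:
            best = (max_val, max_loc, s)
    return best

rng = np.random.default_rng(1)
img = rng.integers(0, 30, (200, 200), dtype=np.uint8)   # lightly textured background
img[60:140, 80:160] = 200                               # synthetic target blob
tpl = np.zeros((40, 40), np.uint8)
tpl[5:35, 5:35] = 200
score, loc, scale = multiscale_match(img, tpl)
print("score %.2f at %s, scale %.2f" % (score, loc, scale))
```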

18 pages, 26474 KB  
Article
Artificial Texture-Free Measurement: A Graph Cuts-Based Stereo Vision for 3D Wave Reconstruction in Laboratory
by Feng Wang and Qidan Zhu
J. Mar. Sci. Eng. 2025, 13(9), 1699; https://doi.org/10.3390/jmse13091699 - 3 Sep 2025
Viewed by 541
Abstract
A novel method for three-dimensional (3D) wave reconstruction based on stereo vision is proposed to overcome the challenges of measuring water surfaces under laboratory conditions. Traditional methods, such as adding seed particles or projecting artificial textures, can mitigate the imaging problems caused by the optical properties of the water surface. However, these methods can be costly and complicated to operate. In this paper, the proposed method uses affine consistency as a matching invariant, bypassing the need for artificial textures. The method presents new data and smoothness terms within the graph cuts framework to achieve robust wave reconstruction. In a laboratory tank experiment, the wave point clouds were successfully reconstructed using a binocular camera. The accuracy of the method was verified by comparing the reconstruction with theoretical values and with time series recorded by a wave probe.
(This article belongs to the Section Ocean Engineering)
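
For context, graph-cuts stereo methods of this kind minimize a labeling energy over per-pixel disparities; a generic form is shown below. The paper's contribution lies in its specific affine-consistency-based data and smoothness terms, which are not reproduced here.

```latex
E(d) = \sum_{p \in \mathcal{P}} D_p(d_p) \;+\; \lambda \sum_{(p,q) \in \mathcal{N}} V(d_p, d_q)
```

Here D_p scores the photo-consistency of assigning disparity d_p to pixel p, V penalizes disparity differences between neighboring pixels, and the minimizing labeling d is computed with graph cuts.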

21 pages, 33500 KB  
Article
Location Research and Picking Experiment of an Apple-Picking Robot Based on Improved Mask R-CNN and Binocular Vision
by Tianzhong Fang, Wei Chen and Lu Han
Horticulturae 2025, 11(7), 801; https://doi.org/10.3390/horticulturae11070801 - 6 Jul 2025
Cited by 1 | Viewed by 848
Abstract
With the advancement of agricultural automation technologies, apple-harvesting robots have gradually become a focus of research. As their “perceptual core,” machine vision systems directly determine picking success rates and operational efficiency. However, existing vision systems still exhibit significant shortcomings in target detection and positioning accuracy in complex orchard environments (e.g., uneven illumination, foliage occlusion, and fruit overlap), which hinders practical applications. This study proposes a visual system for apple-harvesting robots based on improved Mask R-CNN and binocular vision to achieve more precise fruit positioning. The binocular camera (ZED2i) carried by the robot acquires dual-channel apple images. An improved Mask R-CNN is employed to implement instance segmentation of apple targets in binocular images, followed by a template-matching algorithm with parallel epipolar constraints for stereo matching. Four pairs of feature points from corresponding apples in binocular images are selected to calculate disparity and depth. Experimental results demonstrate average coefficients of variation and positioning accuracy of 5.09% and 99.61%, respectively, in binocular positioning. During harvesting operations with a self-designed apple-picking robot, the single-image processing time was 0.36 s, the average single harvesting cycle duration reached 7.7 s, and the comprehensive harvesting success rate achieved 94.3%. This work presents a novel high-precision visual positioning method for apple-harvesting robots.
(This article belongs to the Section Fruit Production Systems)
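
The stereo step above reduces depth recovery to the standard disparity relation Z = f·B/d on a rectified pair, averaged over the matched feature points. A small illustrative calculation with assumed focal length, baseline, and pixel coordinates (not the paper's calibration values):

```python
# Illustrative depth-from-disparity calculation for matched fruit points on a
# rectified stereo pair. All numbers below are assumed example values.
import numpy as np

f_px = 1050.0        # focal length in pixels (assumed)
baseline_m = 0.12    # stereo baseline in metres (assumed, ZED-like)

# Four corresponding feature points (x_left, x_right) in pixels on the same rows.
pairs = np.array([[612.0, 574.5],
                  [630.2, 592.8],
                  [618.9, 581.1],
                  [624.4, 586.7]])
disparities = pairs[:, 0] - pairs[:, 1]
depths = f_px * baseline_m / disparities           # Z = f * B / d
print("per-point depths (m):", np.round(depths, 3))
print("averaged fruit depth (m): %.3f" % depths.mean())
```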

18 pages, 4774 KB  
Article
InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset
by Yuandong Niu, Limin Liu, Fuyu Huang, Juntao Ma, Chaowen Zheng, Yunfeng Jiang, Ting An, Zhongchen Zhao and Shuangyou Chen
Remote Sens. 2025, 17(12), 2035; https://doi.org/10.3390/rs17122035 - 13 Jun 2025
Viewed by 817
Abstract
In fields such as military reconnaissance, forest fire prevention, and autonomous driving at night, there is an urgent need for high-precision three-dimensional reconstruction in low-light or night environments. RGB cameras rely on external light to acquire remote sensing data, so image quality declines significantly and task requirements are difficult to meet. Lidar-based methods perform poorly in rainy and foggy weather, in close-range scenes, and in scenarios that require thermal imaging data. In contrast, infrared cameras can effectively overcome these challenges because their imaging mechanisms differ from those of RGB cameras and lidar. However, research on three-dimensional scene reconstruction from infrared images is relatively immature, especially in infrared binocular stereo matching. This situation presents two main challenges: first, there is no dataset specifically for infrared binocular stereo matching; second, the lack of texture information in infrared images limits the direct extension of RGB-based methods to infrared reconstruction. To solve these problems, this study first constructs an infrared binocular stereo matching dataset and then proposes an innovative transformer method based on perspective projection positional encoding to complete the infrared binocular stereo matching task. In this paper, a stereo matching network combining a transformer with a cost volume is constructed. Existing work on transformer positional encoding usually uses a parallel projection model to simplify the calculation; our method is based on the actual perspective projection model, so each pixel is associated with a distinct projection ray. This effectively solves the feature extraction and matching problems caused by insufficient texture information in infrared images and significantly improves matching accuracy. Experiments on the proposed infrared binocular stereo matching dataset demonstrate the effectiveness of the method.
(This article belongs to the Collection Visible Infrared Imaging Radiometers and Applications)
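
The core idea above, tying positional encoding to actual viewing rays rather than a parallel-projection grid, can be made concrete with a short sketch: back-projecting each pixel through an assumed intrinsic matrix yields a distinct unit ray per pixel that a transformer could consume as a positional signal. This illustrates the general idea only, not the paper's encoding design.

```python
# Sketch: per-pixel viewing-ray directions from an assumed pinhole intrinsic
# matrix, usable as a perspective-aware positional signal.
import numpy as np

K = np.array([[640.0, 0.0, 320.0],
              [0.0, 640.0, 240.0],
              [0.0, 0.0, 1.0]])
H, W = 480, 640

u, v = np.meshgrid(np.arange(W), np.arange(H))            # pixel grid
pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
rays = pix @ np.linalg.inv(K).T                           # back-project: K^-1 [u, v, 1]^T
rays /= np.linalg.norm(rays, axis=-1, keepdims=True)      # unit ray per pixel

# Each pixel now carries a distinct 3-vector encoding its projection ray; a
# transformer could consume these (e.g., after a small MLP) as positional input.
print(rays.shape, rays[0, 0], rays[H // 2, W // 2])
```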

17 pages, 10247 KB  
Article
Pose Measurement of Non-Cooperative Space Targets Based on Point Line Feature Fusion in Low-Light Environments
by Haifeng Zhang, Jiaxin Wu, Han Ai, Delian Liu, Chao Mei and Maosen Xiao
Electronics 2025, 14(9), 1795; https://doi.org/10.3390/electronics14091795 - 28 Apr 2025
Viewed by 653
Abstract
Pose measurement of non-cooperative targets in space is one of the key technologies in space missions. However, most existing methods are validated in simulated well-lit environments and do not consider how algorithms degrade in low-light conditions. Additionally, the limited computing capabilities of space platforms place stricter demands on real-time processing. This paper proposes a real-time pose measurement method based on binocular vision that is suitable for low-light environments. Firstly, the traditional point feature extraction algorithm is adaptively improved based on lighting conditions, greatly reducing the impact of lighting on the effectiveness of feature point extraction. By combining point feature matching with epipolar constraints, the matching range of feature points is narrowed down to the epipolar line, significantly improving matching speed and accuracy. Secondly, utilizing the structural information of the spacecraft, line features are introduced and processed in parallel with point features, greatly enhancing the accuracy of pose measurement results. Finally, an adaptive weighted multi-feature pose fusion method based on lighting conditions is introduced to obtain the optimal pose estimation results. Simulation and physical experiment results demonstrate that this method obtains high-precision target pose information stably and in real time, both in well-lit and low-light environments.
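
The epipolar constraint mentioned above restricts the search for each feature's match to a single line in the other image. A minimal sketch of that check, using an assumed fundamental matrix for a rectified pair (not the paper's adaptive point-line pipeline):

```python
# Sketch: accept a candidate stereo match only if it lies close to the epipolar
# line of the left-image point. The fundamental matrix below assumes a rectified
# pair, where corresponding points share the same image row.
import numpy as np

def epipolar_distance(F, x_left, x_right):
    """Distance of x_right from the epipolar line F @ [x_left, 1]."""
    l = F @ np.append(x_left, 1.0)                 # line (a, b, c) in the right image
    return abs(l @ np.append(x_right, 1.0)) / np.hypot(l[0], l[1])

F_rect = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, -1.0],
                   [0.0, 1.0, 0.0]])

x_l = np.array([400.0, 250.0])
good_candidate = np.array([362.0, 250.3])          # nearly on the same row
bad_candidate = np.array([362.0, 271.0])           # off the epipolar line
for cand in (good_candidate, bad_candidate):
    d = epipolar_distance(F_rect, x_l, cand)
    print("candidate", cand, "epipolar distance %.2f px ->" % d,
          "accept" if d < 1.0 else "reject")
```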

18 pages, 9039 KB  
Article
An Intelligent Monitoring System for the Driving Environment of Explosives Transport Vehicles Based on Consumer-Grade Cameras
by Jinshan Sun, Jianhui Tang, Ronghuan Zheng, Xuan Liu, Weitao Jiang and Jie Xu
Appl. Sci. 2025, 15(7), 4072; https://doi.org/10.3390/app15074072 - 7 Apr 2025
Viewed by 794
Abstract
Explosives are an important industrial product that is widely used in production and must be transported. Explosives transport vehicles are exposed to various external factors during driving, which increases transportation risk. New transport vehicles are generally equipped with intelligent driving monitoring systems, but retrofitting older vehicles with such systems is relatively costly. To enhance the safety of older explosives transport vehicles, this study proposes a cost-effective intelligent monitoring system using consumer-grade IP cameras and edge computing. The system integrates YOLOv8 for real-time vehicle detection and a novel hybrid ranging strategy combining monocular (fast) and binocular (accurate) techniques to measure distances, ensuring rapid warnings and precise proximity monitoring. An optimized stereo matching workflow reduces processing latency by 23.5%, enabling real-time performance on low-cost devices. Experimental results confirm that the system meets safety requirements, offering a practical, application-specific solution for improving driving safety in resource-limited explosives transport environments.
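
The hybrid strategy above pairs a fast monocular estimate with a more accurate binocular one. A small sketch of that division of labor, with assumed focal length, baseline, and nominal vehicle width (illustrative values, not the paper's system):

```python
# Sketch of the hybrid ranging idea: a coarse monocular range from a known
# vehicle width for early warnings, and a refined stereo range from disparity.
import numpy as np

f_px = 900.0             # focal length in pixels (assumed)
baseline_m = 0.25        # stereo baseline in metres (assumed)
real_width_m = 1.8       # nominal vehicle width for the monocular fallback (assumed)

def monocular_range(bbox_width_px):
    """Coarse range from apparent width: Z ~ f * W_real / w_pixels."""
    return f_px * real_width_m / bbox_width_px

def stereo_range(disparity_px):
    """Refined range from disparity on a rectified pair: Z = f * B / d."""
    return f_px * baseline_m / disparity_px

bbox_w, disparity = 54.0, 7.4                      # example detection measurements
print("monocular estimate: %.1f m" % monocular_range(bbox_w))
print("stereo estimate:    %.1f m" % stereo_range(disparity))
```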

22 pages, 16473 KB  
Article
Multi-Camera Hierarchical Calibration and Three-Dimensional Reconstruction Method for Bulk Material Transportation System
by Chengcheng Hou, Yongfei Kang and Tiezhu Qiao
Sensors 2025, 25(7), 2111; https://doi.org/10.3390/s25072111 - 27 Mar 2025
Viewed by 1421
Abstract
Three-dimensional information acquisition is crucial for the intelligent control and safe operation of bulk material transportation systems. However, existing visual measurement methods face challenges, including difficult stereo matching due to indistinct surface features, error accumulation in multi-camera calibration, and unreliable depth information fusion. This paper proposes a three-dimensional reconstruction method based on multi-camera hierarchical calibration. The method establishes a measurement framework centered on a core camera, enhances material surface features through speckle structured light projection, and implements a ‘monocular-binocular-multi-camera association’ calibration strategy with global optimization to reduce error accumulation. Additionally, a depth information fusion algorithm based on multi-epipolar geometric constraints improves reconstruction completeness through multi-view information integration. Experimental results demonstrate excellent precision with absolute errors within 1 mm for features as small as 15 mm and relative errors between 0.02% and 2.54%. Compared with existing methods, the proposed approach shows advantages in point cloud completeness, reconstruction accuracy, and environmental adaptability, providing reliable technical support for intelligent monitoring of bulk material transportation systems.
(This article belongs to the Section Sensing and Imaging)
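
A 'monocular-binocular-multi-camera association' strategy of the kind described above ultimately needs every camera's extrinsics expressed in one core-camera frame, which comes down to composing pairwise rotations and translations. A minimal sketch of that composition with assumed example extrinsics; the paper's global optimization and speckle projection are not reproduced.

```python
# Sketch: composing pairwise stereo extrinsics into a common "core camera" frame.
import cv2
import numpy as np

def compose(R_ab, t_ab, R_bc, t_bc):
    """Given transforms a->b and b->c (X_b = R_ab X_a + t_ab), return a->c."""
    R_ac = R_bc @ R_ab
    t_ac = R_bc @ t_ab + t_bc
    return R_ac, t_ac

# Assumed pairwise calibrations: camera0 -> camera1 and camera1 -> camera2.
R01 = cv2.Rodrigues(np.array([0.0, 0.02, 0.0]))[0]
t01 = np.array([[-0.15], [0.0], [0.0]])
R12 = cv2.Rodrigues(np.array([0.0, 0.03, 0.0]))[0]
t12 = np.array([[-0.15], [0.0], [0.01]])

# Chain them so every camera is expressed relative to camera0 (the core camera).
R02, t02 = compose(R01, t01, R12, t12)
print("camera2 position in camera0 frame (m):", np.round((-R02.T @ t02).ravel(), 3))
```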

17 pages, 9081 KB  
Article
A Rapid Deployment Method for Real-Time Water Surface Elevation Measurement
by Yun Jiang
Sensors 2025, 25(6), 1850; https://doi.org/10.3390/s25061850 - 17 Mar 2025
Viewed by 758
Abstract
In this research, I introduce a water surface elevation measurement method that combines point cloud processing techniques and stereo vision cameras. While current vision-based water level measurement techniques focus on laboratory measurements or are based on auxiliary devices such as water rulers, I investigated the feasibility of measuring elevation directly from images of the water surface. This research implements a monitoring system on-site, comprising a ZED 2i binocular camera (Stereolabs, San Francisco, CA, USA). First, the uncertainty of the camera is evaluated in a real measurement scenario. Then, the water surface images captured by the binocular camera are stereo matched to obtain parallax maps. Subsequently, the binocular calibration results are used to obtain the 3D point cloud coordinates of the water surface image. Finally, the water-plane equation is estimated with the RANSAC algorithm to determine the height of the camera above the water surface. This approach is particularly significant as it offers a non-contact, shore-based solution that eliminates the need for physical water references, thereby enhancing the adaptability and efficiency of water level monitoring in challenging environments, such as remote or inaccessible areas. Within a measured elevation of 5 m, the water level measurement error is less than 2 cm.
(This article belongs to the Section Environmental Sensing)
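
The final step above, fitting the water plane with RANSAC and reading off the camera height, can be sketched as follows on a synthetic point cloud; the capture and stereo matching stages, and all parameter values, are assumed for illustration.

```python
# Sketch: RANSAC plane fit to a stereo point cloud; the camera height is the
# distance from the origin (camera centre) to the fitted plane.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "water surface" 3.0 m below the camera, with noise and some outliers.
n = 2000
pts = np.column_stack([rng.uniform(-5, 5, n),
                       np.full(n, -3.0) + rng.normal(0, 0.01, n),
                       rng.uniform(2, 12, n)])
pts[:100] += rng.normal(0, 1.0, (100, 3))          # spurious points

def ransac_plane(points, iters=200, tol=0.02):
    best_inliers, best_model = None, None
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        if np.linalg.norm(normal) < 1e-8:
            continue
        normal /= np.linalg.norm(normal)
        d = -normal @ sample[0]                    # plane: normal . x + d = 0
        inliers = np.abs(points @ normal + d) < tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (normal, d)
    return best_model, best_inliers

(normal, d), inliers = ransac_plane(pts)
camera_height = abs(d) / np.linalg.norm(normal)    # distance from origin to plane
print("inliers: %d, estimated camera height above water: %.3f m"
      % (inliers.sum(), camera_height))
```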

23 pages, 12690 KB  
Article
MSS-YOLO: Multi-Scale Edge-Enhanced Lightweight Network for Personnel Detection and Location in Coal Mines
by Wenjuan Yang, Yanqun Wang, Xuhui Zhang, Le Zhu, Tenghui Wang, Yunkai Chi and Jie Jiang
Appl. Sci. 2025, 15(6), 3238; https://doi.org/10.3390/app15063238 - 16 Mar 2025
Cited by 6 | Viewed by 1218
Abstract
As a critical task in underground coal mining, personnel identification and positioning in fully mechanized mining faces are essential for safety. Yet, complex environmental factors—such as narrow tunnels, heavy dust, and uneven lighting—pose significant challenges to accurate detection. In this paper, we propose a personnel detection network, MSS-YOLO, for fully mechanized mining faces based on YOLOv8. By designing a Multi-Scale Edge Enhancement (MSEE) module and fusing it with the C2f module, the network's personnel feature extraction under high-dust or long-distance conditions is effectively enhanced. Meanwhile, a Spatial Pyramid Shared Conv (SPSC) module reduces model redundancy and compensates for the tendency of max pooling to lose personnel features at long distances. Finally, the lightweight Shared Convolutional Detection Head (SCDH) ensures real-time detection under limited computational resources. The experimental results show that compared to Faster-RCNN, SSD, YOLOv5s6, YOLOv7-tiny, YOLOv8n, and YOLOv11n, MSS-YOLO achieves AP50 improvements of 4.464%, 10.484%, 3.751%, 4.433%, 3.655%, and 2.188%, respectively, while reducing the inference time by 50.4 ms, 11.9 ms, 3.7 ms, 2.0 ms, 1.2 ms, and 2.3 ms. In addition, MSS-YOLO is combined with the SGBM binocular stereo vision matching algorithm to compute 3D spatial positions of personnel from the disparity results. The personnel location results show that within a measurement range of 10 m, the position errors in the x-, y-, and z-directions are within 0.170 m, 0.160 m, and 0.200 m, respectively, demonstrating that MSS-YOLO accurately detects underground personnel in real time and meets the underground personnel detection and localization requirements. The current limitations lie in the reliance on a calibrated binocular camera and the performance degradation beyond 15 m. Future work will focus on multi-sensor fusion and adaptive distance scaling to enhance practical deployment.
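
The localization step above converts an SGBM disparity at a detection's center pixel into a 3D position. A minimal sketch of that conversion on a synthetic rectified pair with assumed calibration values; the detector and the mine imagery are not reproduced.

```python
# Sketch: SGBM disparity at a detection centre -> 3D position (assumed calibration).
import cv2
import numpy as np

rng = np.random.default_rng(0)
left = cv2.GaussianBlur(rng.integers(0, 255, (480, 640), dtype=np.uint8), (5, 5), 0)
right = np.roll(left, -32, axis=1)            # synthetic rectified pair, ~32 px disparity

sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7,
                             P1=8 * 7 * 7, P2=32 * 7 * 7)
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0   # fixed-point -> pixels

# Assumed rectified calibration (not the paper's values).
f_px, cx, cy, baseline_m = 700.0, 320.0, 240.0, 0.10

u, v = 360, 240                               # e.g. the centre of a person's bounding box
d = float(disparity[v, u])
if d > 0:
    Z = f_px * baseline_m / d                 # depth from disparity
    X = (u - cx) * Z / f_px                   # lateral offset
    Y = (v - cy) * Z / f_px                   # vertical offset
    print("disparity %.1f px -> position (%.2f, %.2f, %.2f) m" % (d, X, Y, Z))
```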

35 pages, 37221 KB  
Article
Target Ship Recognition and Tracking with Data Fusion Based on Bi-YOLO and OC-SORT Algorithms for Enhancing Ship Navigation Assistance
by Shuai Chen, Miao Gao, Peiru Shi, Xi Zeng and Anmin Zhang
J. Mar. Sci. Eng. 2025, 13(2), 366; https://doi.org/10.3390/jmse13020366 - 16 Feb 2025
Cited by 3 | Viewed by 2157
Abstract
With the ever-increasing volume of maritime traffic, the risks of ship navigation are becoming more significant, making the use of advanced multi-source perception strategies and AI technologies indispensable for obtaining information about ship navigation status. In this paper, first, the ship tracking system was optimized using the Bi-YOLO network based on the C2f_BiFormer module and the OC-SORT algorithms. Second, to extract the visual trajectory of the target ship without a reference object, an absolute position estimation method based on binocular stereo vision attitude information was proposed. Then, a perception data fusion framework based on ship spatio-temporal trajectory features (ST-TF) was proposed to match GPS-based ship information with the corresponding visual target information. Finally, AR technology was integrated to fuse multi-source perceptual information into the real-world navigation view. Experimental results demonstrate that the proposed method achieves a mAP0.5:0.95 of 79.6% under challenging scenarios such as low resolution, noise interference, and low-light conditions. Moreover, under nonlinear own-ship motion, the average relative position error of target ship visual measurements remains below 8%, achieving accurate absolute position estimation without reference objects. Compared with existing navigation assistance, the AR-based navigation assistance system, which utilizes the ship ST-TF-based perception data fusion mechanism, enhances ship traffic situational awareness and provides reliable decision-making support to further ensure the safety of ship navigation.

20 pages, 8045 KB  
Article
Estimation of Wind Turbine Blade Icing Volume Based on Binocular Vision
by Fangzheng Wei, Zhiyong Guo, Qiaoli Han and Wenkai Qi
Appl. Sci. 2025, 15(1), 114; https://doi.org/10.3390/app15010114 - 27 Dec 2024
Cited by 1 | Viewed by 889
Abstract
Icing on wind turbine blades in cold and humid weather has become a detrimental factor limiting their efficient operation, and traditional methods for detecting blade icing have various limitations. Therefore, this paper proposes a non-contact ice volume estimation method based on binocular vision and improved image processing algorithms. The method employs a stereo matching algorithm that combines dynamic windows, multi-feature fusion, and reordering, integrating gradient, color, and other information to generate matching costs. It utilizes a cross-based support region for cost aggregation and generates the final disparity map through a Winner-Take-All (WTA) strategy and multi-step optimization. Subsequently, combining image processing techniques and three-dimensional reconstruction methods, the geometric shape of the ice is modeled, and its volume is estimated using numerical integration methods. Experimental results on volume estimation show that for ice blocks with regular shapes, the errors between the measured and actual volumes are 5.28%, 8.35%, and 4.85%, respectively; for simulated icing on wind turbine blades, the errors are 5.06%, 6.45%, and 9.54%, respectively. The results indicate that the volume measurement errors under various conditions are all within 10%, meeting the experimental accuracy requirements for measuring the volume of ice accumulation on wind turbine blades. This method provides an accurate and efficient solution for detecting blade icing without the need to modify the blades, making it suitable for wind turbines already in operation. However, in practical applications, it may be necessary to consider the impact of illumination and environmental changes on visual measurements.
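
The last step described above, estimating ice volume by numerical integration over the reconstructed geometry, can be sketched by summing a height field over its grid cells. The half-ellipsoid "ice" profile and grid resolution below are synthetic assumptions, checked against the analytic volume; the paper's disparity pipeline is not reproduced.

```python
# Sketch: volume of a reconstructed height field by numerical integration
# (sum of cell heights times cell area), compared against an analytic value.
import numpy as np

cell = 0.002                                        # grid resolution in metres (assumed)
x = np.arange(-0.05, 0.05, cell)
y = np.arange(-0.10, 0.10, cell)
X, Y = np.meshgrid(x, y)

a, b, c = 0.05, 0.10, 0.03                          # semi-axes of the synthetic ice cap (m)
inside = (X / a) ** 2 + (Y / b) ** 2 < 1.0
height = np.where(inside,
                  c * np.sqrt(np.clip(1 - (X / a) ** 2 - (Y / b) ** 2, 0, None)),
                  0.0)

volume = height.sum() * cell ** 2                   # numerical integration over the grid
analytic = 2.0 / 3.0 * np.pi * a * b * c            # half-ellipsoid volume for comparison
print("integrated volume: %.6f m^3, analytic: %.6f m^3" % (volume, analytic))
```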

22 pages, 6639 KB  
Article
Reliable Disparity Estimation Using Multiocular Vision with Adjustable Baseline
by Victor H. Diaz-Ramirez, Martin Gonzalez-Ruiz, Rigoberto Juarez-Salazar and Miguel Cazorla
Sensors 2025, 25(1), 21; https://doi.org/10.3390/s25010021 - 24 Dec 2024
Viewed by 1631
Abstract
Accurate estimation of three-dimensional (3D) information from captured images is essential in numerous computer vision applications. Although binocular stereo vision has been extensively investigated for this task, its reliability depends on the baseline between cameras. A larger baseline improves the resolution of disparity estimation but increases the probability of matching errors. This research presents a reliable method for disparity estimation through progressive baseline increases in multiocular vision. First, a robust rectification method for multiocular images is introduced, satisfying epipolar constraints and minimizing induced distortion. This method reduces rectification error by 25% for binocular images and by 80% for multiocular images compared with well-known existing methods. Next, a dense disparity map is estimated by stereo matching from the rectified images with the shortest baseline. Afterwards, the disparity map for the subsequent images with an extended baseline is estimated within a short optimized interval, minimizing the probability of matching errors and further error propagation. This process is iterated until the disparity map for the images with the longest baseline is obtained. The proposed method increases disparity estimation accuracy by 20% for multiocular images compared with a similar existing method. The proposed approach enables accurate scene characterization and spatial point computation from disparity maps with improved resolution. The effectiveness of the proposed method is verified through exhaustive evaluations using well-known multiocular image datasets and physical scenes, achieving superior performance over similar existing methods in terms of objective measures.
(This article belongs to the Collection Robotics and 3D Computer Vision)
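
Rectification is the step that turns the epipolar constraint into a same-row constraint before disparity estimation. As a point of reference, standard binocular rectification with assumed calibration is sketched below; the paper's distortion-minimizing multiocular rectification is not reproduced.

```python
# Sketch: standard binocular rectification with assumed calibration, shown only
# to make the epipolar-alignment idea concrete.
import cv2
import numpy as np

size = (640, 480)
K1 = K2 = np.array([[800.0, 0.0, 320.0],
                    [0.0, 800.0, 240.0],
                    [0.0, 0.0, 1.0]])
dist = np.zeros(5)
R = cv2.Rodrigues(np.array([0.0, 0.01, 0.0]))[0]    # slight relative rotation (assumed)
T = np.array([[-0.10], [0.0], [0.0]])               # 10 cm baseline (assumed)

R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, dist, K2, dist, size, R, T, alpha=0)
map1x, map1y = cv2.initUndistortRectifyMap(K1, dist, R1, P1, size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, dist, R2, P2, size, cv2.CV_32FC1)

# After cv2.remap with these maps, corresponding points share the same image row,
# so disparity search reduces to a 1D scan along that row.
print("P2 (rectified right projection matrix):\n", np.round(P2, 2))
```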
