Search Results (73)

Search Parameters:
Keywords = homography transform

25 pages, 24232 KiB  
Article
Topology-Aware Multi-View Street Scene Image Matching for Cross-Daylight Conditions Integrating Geometric Constraints and Semantic Consistency
by Haiqing He, Wenbo Xiong, Fuyang Zhou, Zile He, Tao Zhang and Zhiyuan Sheng
ISPRS Int. J. Geo-Inf. 2025, 14(6), 212; https://doi.org/10.3390/ijgi14060212 - 29 May 2025
Viewed by 469
Abstract
While deep learning-based image matching methods excel at extracting high-level semantic features from remote sensing data, their performance degrades significantly under cross-daylight conditions and wide-baseline geometric distortions, particularly in multi-source street-view scenarios. This paper presents a novel illumination-invariant framework that synergistically integrates geometric topology and semantic consistency to achieve robust multi-view matching for cross-daylight urban perception. We first design a self-supervised learning paradigm to extract illumination-agnostic features by jointly optimizing local descriptors and global geometric structures across multi-view images. To address extreme perspective variations, a homography-aware transformation module is introduced to stabilize feature representation under large viewpoint changes. Leveraging a graph neural network with hierarchical attention mechanisms, our method dynamically aggregates contextual information from both local keypoints and semantic topology graphs, enabling precise matching in occluded regions and repetitive-textured urban scenes. A dual-branch learning strategy further refines similarity metrics through supervised patch alignment and unsupervised spatial consistency constraints derived from Delaunay triangulation. Finally, a topology-guided multi-plane expansion mechanism propagates initial matches by exploiting the inherent structural regularity of street scenes, effectively suppressing mismatches while expanding coverage. Extensive experiments demonstrate that our framework outperforms state-of-the-art methods, achieving a 6.4% improvement in matching accuracy and a 30.5% reduction in mismatches under cross-daylight conditions. These advancements establish a new benchmark for reliable multi-source image retrieval and localization in dynamic urban environments, with direct applications in autonomous driving systems and large-scale 3D city reconstruction. Full article
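
A reader-facing sketch of the Delaunay-based spatial consistency idea mentioned above: keep only matches whose local triangulation-edge geometry is roughly preserved between views. This is an illustrative stand-in, not the authors' pipeline; the function name, voting rule, and tolerance are assumptions.

```python
# Illustrative Delaunay consistency filter (assumed names and thresholds).
import numpy as np
from scipy.spatial import Delaunay

def delaunay_consistent(pts1, pts2, ratio_tol=0.5):
    """pts1, pts2: (N, 2) matched keypoints in two views.
    Returns a boolean mask keeping matches whose Delaunay edges
    agree with the dominant inter-view scale."""
    tri = Delaunay(pts1)                        # triangulate view-1 keypoints
    edges = set()
    for s in tri.simplices:                     # collect unique triangle edges
        edges.update({tuple(sorted((s[0], s[1]))),
                      tuple(sorted((s[1], s[2]))),
                      tuple(sorted((s[0], s[2])))})
    ratios = [(i, j, np.linalg.norm(pts2[i] - pts2[j]) /
               max(np.linalg.norm(pts1[i] - pts1[j]), 1e-9)) for i, j in edges]
    med = np.median([r for _, _, r in ratios])  # dominant scale between views
    votes = np.zeros(len(pts1))
    for i, j, r in ratios:
        if abs(r - med) <= ratio_tol * med:     # edge consistent with that scale
            votes[i] += 1
            votes[j] += 1
    return votes >= 2                           # keep points with enough support
```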

23 pages, 13904 KiB  
Article
Symmetric Model for Predicting Homography Matrix Between Courts in Co-Directional Multi-Frame Sequence
by Pan Zhang, Jiangtao Luo and Xupeng Liang
Symmetry 2025, 17(6), 832; https://doi.org/10.3390/sym17060832 - 27 May 2025
Viewed by 431
Abstract
The homography matrix is essential for perspective transformation across consecutive video frames. While existing methods are effective when the visual content between paired images remains largely unchanged, they rely on substantial, high-quality annotated data for a multi-frame court sequence with content variation. To address this limitation and enhance homography matrix predictions in competitive sports images, a new symmetric stacked neural network model is proposed. The model first leverages the mutual invertibility of bidirectional homography matrices to improve prediction accuracy between paired images. Secondly, by theoretically validating and leveraging the decomposability of the homography matrix, the model significantly reduces the amount of data annotation required for continuous frames within the same shooting direction. Experimental evaluations on datasets for court homography transformations in sports, such as ice hockey, basketball, and handball, show that the proposed symmetric model achieves superior accuracy in predicting homography matrices, even when only one-third of the frames are annotated. Comparisons with seven related methods further highlight the exceptional performance of the proposed model. Full article
(This article belongs to the Special Issue Advances in Image Processing with Symmetry/Asymmetry)
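
The two structural properties the model exploits reduce to matrix identities: the backward homography is the inverse of the forward one, and a long-range homography decomposes into the product of per-step homographies. A minimal numerical illustration (matrix values are made up):

```python
# Illustration of the two homography identities the symmetric model builds on
# (a numerical sketch, not the authors' network).
import numpy as np

def normalize(H):
    return H / H[2, 2]   # fix the scale ambiguity of homogeneous matrices

H_ab = np.array([[1.02, 0.01, 5.0],
                 [-0.02, 0.99, -3.0],
                 [1e-4, 2e-4, 1.0]])   # frame A -> frame B
H_bc = np.array([[0.98, -0.03, 2.0],
                 [0.01, 1.01, 4.0],
                 [-1e-4, 1e-4, 1.0]])  # frame B -> frame C

# Invertibility: warping A->B and then B->A must be the identity.
assert np.allclose(normalize(np.linalg.inv(H_ab) @ H_ab), np.eye(3), atol=1e-9)

# Decomposability: a long-range homography is the product of per-step ones,
# which is why only a subset of frames needs annotation.
H_ac = normalize(H_bc @ H_ab)
print(H_ac)
```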

14 pages, 9340 KiB  
Article
Research on a Rapid Image Stitching Method for Tunneling Front Based on Navigation and Positioning Information
by Hongda Zhu and Sihai Zhao
Sensors 2025, 25(10), 3023; https://doi.org/10.3390/s25103023 - 10 May 2025
Viewed by 522
Abstract
To address the challenges posed by significant parallax, dynamic changes in monitoring camera positions, and the need for rapid wide-field image stitching in underground coal mine tunneling faces, this paper proposes a fast image stitching method for tunneling face images based on navigation and positioning data. First, using a pixel-based calculation approach, the tunneling face scene is partitioned into the cutting section and the ground, enhancing the reliability of scene segmentation. Then, the spatial distance between the camera and the cutting plane is computed based on the tunneling machine’s navigation and positioning data, and a plane-induced homography model is employed to efficiently determine the dynamic transformation matrix of the cutting section. Finally, the Dual-Homography Warping (DHW) method is applied to achieve fast panoramic image stitching of the tunneling face. Comparative experiments with three classical stitching methods, SURF, SIFT, and BRISK, demonstrate that the proposed method reduces stitching time by 60%. Field experiments in underground environments verify that this method can generate a complete panoramic stitched image of the tunneling face, providing an unobstructed perspective beyond the machine body and cutting head to clearly observe the shovel plate and surrounding ground conditions, significantly enhancing the visibility and convenience of remote operation. Full article
(This article belongs to the Section Intelligent Sensors)
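
The plane-induced homography the method computes has the standard closed form H = K(R - t n^T / d)K^{-1}; the sketch below evaluates it with assumed calibration values standing in for the machine's navigation data:

```python
# Sketch of a plane-induced homography (standard formula; the calibration
# values are made up and stand in for the navigation/positioning data).
import numpy as np

K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])        # shared camera intrinsics (assumed)
R = np.eye(3)                          # rotation between the two camera poses
t = np.array([[0.2], [0.0], [0.05]])   # translation between poses (metres)
n = np.array([[0.0], [0.0], [1.0]])    # unit normal of the cutting plane
d = 5.0                                # camera-to-plane distance from navigation

H = K @ (R - (t @ n.T) / d) @ np.linalg.inv(K)
H /= H[2, 2]                           # normalize the homogeneous scale
print(H)                               # warp matrix for the cutting-section region
```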

17 pages, 3782 KiB  
Article
DRHT: A Hybrid Mathematical Model for Accurate Ultrasound Probe Calibration and Efficient 3D Reconstruction
by Xuquan Ji, Yonghong Zhang, Huaqing Shang, Lei Hu, Xiaozhi Qi and Wenyong Liu
Mathematics 2025, 13(8), 1359; https://doi.org/10.3390/math13081359 - 21 Apr 2025
Viewed by 469
Abstract
The calibration of ultrasound probes is essential for three-dimensional ultrasound reconstruction and navigation. However, existing calibration methods are often cumbersome and inadequate in accuracy. In this paper, a hybrid mathematical model, Dimensionality Reduction and Homography Transformation (DRHT), is proposed. The model characterizes the relationship between the ultrasound image plane and the projected calibration lines through a homography transformation. The homography transformation, which can be estimated using the singular value decomposition method, reduces the dimensionality of the calibration data and can significantly accelerate the computation of image points in ultrasonic three-dimensional reconstruction. Experiments comparing the DRHT method with the PLUS library demonstrated that DRHT outperformed the PLUS algorithm in terms of accuracy (0.89 mm vs. 0.92 mm) and efficiency (268 ms vs. 761 ms). Furthermore, high-precision calibration can be achieved with only four images, which greatly simplifies the calibration process and enhances the feasibility of the clinical application of this model. Full article
(This article belongs to the Special Issue Robust Perception and Control in Prognostic Systems)
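
The SVD-based estimation step mentioned in the abstract is, in its generic form, the direct linear transform (DLT): stack the point-correspondence constraints into Ah = 0 and take the right singular vector of the smallest singular value. A generic sketch (not the DRHT-specific parameterization):

```python
# Generic DLT homography estimation via SVD -- the same linear-algebra step
# the abstract credits with accelerating the calibration (a sketch, not the
# DRHT model itself).
import numpy as np

def estimate_homography(src, dst):
    """src, dst: (N, 2) corresponding points, N >= 4."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)   # right singular vector of smallest singular value
    return H / H[2, 2]

src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
dst = np.array([[10, 12], [110, 8], [118, 105], [6, 98]], float)
print(estimate_homography(src, dst))
```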

21 pages, 6484 KiB  
Article
A Perspective Distortion Correction Method for Planar Imaging Based on Homography Mapping
by Chen Wang, Yabin Ding, Kai Cui, Jianhui Li, Qingpo Xu and Jiangping Mei
Sensors 2025, 25(6), 1891; https://doi.org/10.3390/s25061891 - 18 Mar 2025
Viewed by 1497
Abstract
In monocular vision measurement, a barrier to implementation is the perspective distortion caused by manufacturing errors in the imaging chip and non-parallelism between the measurement plane and its image, which seriously affects the accuracy of the pixel equivalent and the measurement results. This paper proposes a perspective distortion correction method for planar imaging based on homography mapping. Factors causing perspective distortion were analyzed from the camera's intrinsic and extrinsic parameters, followed by constructing a perspective transformation model. Then, a corrected imaging plane was constructed, and the model was further calibrated by utilizing the homography between the measurement plane, the actual imaging plane, and the corrected imaging plane. The nonlinear and perspective distortions were simultaneously corrected by transforming the original image to the corrected imaging plane. An experiment measuring the radius, length, angle, and area of a designed pattern shows root mean square errors of 0.016 mm, 0.052 mm, 0.16°, and 0.68 mm², with standard deviations of 0.016 mm, 0.045 mm, 0.033°, and 0.65 mm², respectively. The proposed method can effectively solve the problem of high-precision planar measurement under perspective distortion. Full article
(This article belongs to the Section Optical Sensors)
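
The building block the paper refines, rectifying a plane through a homography, can be sketched with standard OpenCV calls (file names and corner coordinates are illustrative):

```python
# Minimal homography-based perspective rectification -- the building block the
# paper extends with intrinsic/extrinsic calibration (file names are made up).
import cv2
import numpy as np

img = cv2.imread("pattern.png")                    # distorted planar target
# Four image corners of a feature whose true metric shape is known.
src = np.float32([[102, 88], [530, 64], [560, 470], [80, 488]])
# Where those corners should land on the corrected, fronto-parallel plane.
dst = np.float32([[0, 0], [500, 0], [500, 500], [0, 500]])

H = cv2.getPerspectiveTransform(src, dst)          # exact 4-point homography
corrected = cv2.warpPerspective(img, H, (500, 500))
cv2.imwrite("pattern_corrected.png", corrected)    # uniform pixel equivalent
```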

9 pages, 1408 KiB  
Article
Real-Time Integration of Optical Coherence Tomography Thickness Map Overlays for Enhanced Visualization in Epiretinal Membrane Surgery: A Pilot Study
by Ferhat Turgut, Keisuke Ueda, Amr Saad, Tahm Spitznagel, Luca von Felten, Takashi Matsumoto, Rui Santos, Marc D. de Smet, Zoltán Zsolt Nagy, Matthias D. Becker and Gábor Márk Somfai
Bioengineering 2025, 12(3), 271; https://doi.org/10.3390/bioengineering12030271 - 10 Mar 2025
Viewed by 1088
Abstract
(1) Background: The process of epiretinal membrane peeling (MP) requires precise intraoperative visualization to achieve optimal surgical outcomes. This study investigates the integration of preoperative Optical Coherence Tomography (OCT) images into real-time surgical video feeds, providing a dynamic overlay that enhances the decision-making process during surgery. (2) Methods: Five MP surgeries were analyzed, where preoperative OCT images were first manually aligned with the initial frame of the surgical video by selecting five pairs of corresponding points. A homography transformation was then computed to overlay the OCT onto that first frame. Subsequently, for consecutive frames, feature point extraction (the Shi–Tomasi method) and optical flow computation (the Lucas–Kanade algorithm) were used to calculate frame-by-frame transformations, which were applied to the OCT image to maintain alignment in near real time. (3) Results: The method achieved a 92.7% success rate in optical flow detection and maintained an average processing speed of 7.56 frames per second (FPS), demonstrating the feasibility of near real-time application. (4) Conclusions: The developed approach facilitates enhanced intraoperative visualization, providing surgeons with easier retinal structure identification which results in more comprehensive data-driven decisions. By improving surgical precision while potentially reducing complications, this technique benefits both surgeons and patients. Furthermore, the integration of OCT overlays holds promise for advancing robot-assisted surgery and surgical training protocols. This pilot study establishes the feasibility of real-time OCT integration in MP and opens avenues for broader applications in vitreoretinal procedures. Full article
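
The frame-to-frame alignment loop, Shi–Tomasi corners tracked by Lucas–Kanade flow and folded into an updated overlay homography, can be sketched with the standard OpenCV implementations (parameter values and the helper name are illustrative, not the study's):

```python
# Sketch of the per-frame overlay update described above, using OpenCV's
# Shi-Tomasi (goodFeaturesToTrack) and Lucas-Kanade (calcOpticalFlowPyrLK).
import cv2
import numpy as np

def update_overlay_transform(prev_gray, cur_gray, H_prev):
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=7)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    good_prev = pts[status.flatten() == 1]
    good_next = nxt[status.flatten() == 1]
    # Frame-to-frame homography from the successfully tracked corners.
    H_step, _ = cv2.findHomography(good_prev, good_next, cv2.RANSAC, 3.0)
    return H_step @ H_prev   # compose onto the initial OCT-to-frame alignment

# H_prev starts as the homography computed from the five hand-picked
# OCT/video point pairs; each new frame refines where the OCT map is drawn.
```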

21 pages, 4210 KiB  
Article
Cross-Field Road Markings Detection Based on Inverse Perspective Mapping
by Eric Hsueh-Chan Lu and Yi-Chun Hsieh
Sensors 2024, 24(24), 8080; https://doi.org/10.3390/s24248080 - 18 Dec 2024
Viewed by 819
Abstract
With the rapid development of the autonomous vehicle industry, research on related topics has proliferated, and road markings detection is an important issue among them. When no public open data exist for a target field, road markings data must be collected and labeled manually, which is labor-intensive and time-consuming. Moreover, object detection often struggles with small objects: detection accuracy decreases as detection distance increases, primarily because distant objects on the road occupy few pixels in the image and object scales vary with distance and perspective. To address these issues, this paper utilizes a virtual dataset and an open dataset to train the object detection model and performs cross-field testing on Taiwan roads. To make the model more robust and stable, data augmentation is employed to generate more data; the limited dataset is thus enlarged through augmentation and homography transformation of the images. Additionally, Inverse Perspective Mapping is performed on the input images to transform them into a bird's eye view, which mitigates both the "small objects at far distance" problem and the "perspective distortion of objects" problem so that the model can clearly recognize objects on the road. Testing the model on front-view and bird's eye view images shows a remarkable accuracy improvement of 18.62%. Full article
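
In implementation terms, Inverse Perspective Mapping is a single homography warp from the front view to the ground plane; a hedged sketch (the source trapezoid is made up; real IPM derives it from camera calibration):

```python
# Inverse Perspective Mapping as one homography warp (illustrative corner
# points; a calibrated system computes them from camera extrinsics).
import cv2
import numpy as np

frame = cv2.imread("front_view.jpg")
h, w = frame.shape[:2]

# Trapezoid on the road surface in the front view ...
src = np.float32([[w * 0.45, h * 0.62], [w * 0.55, h * 0.62],
                  [w * 0.95, h * 0.95], [w * 0.05, h * 0.95]])
# ... mapped to a rectangle in the bird's-eye image, countering the
# "small and distorted at distance" effect for road markings.
dst = np.float32([[0, 0], [400, 0], [400, 600], [0, 600]])

H = cv2.getPerspectiveTransform(src, dst)
birdseye = cv2.warpPerspective(frame, H, (400, 600))
cv2.imwrite("birdseye.jpg", birdseye)
```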

26 pages, 3704 KiB  
Article
Deep Unsupervised Homography Estimation for Single-Resolution Infrared and Visible Images Using GNN
by Yanhao Liao, Yinhui Luo, Qiang Fu, Chang Shu, Yuezhou Wu, Qijian Liu and Yuanqing He
Electronics 2024, 13(21), 4173; https://doi.org/10.3390/electronics13214173 - 24 Oct 2024
Cited by 1 | Viewed by 1533
Abstract
Single-resolution homography estimation of infrared and visible images is a significant and challenging research area within the field of computing, which has attracted a great deal of attention. However, due to the large modal differences between infrared and visible images, existing methods struggle to stably and accurately extract and match features between the two image types at a single resolution, which results in poor performance on the homography estimation task. To address this issue, this paper proposes an end-to-end unsupervised single-resolution infrared and visible image homography estimation method based on a graph neural network (GNN), homoViG. Firstly, the method employs a triple attention shallow feature extractor to capture cross-dimensional feature dependencies and enhance feature representation effectively. Secondly, Vision GNN (ViG) is utilized as the backbone network to transform the feature point matching problem into a graph node matching problem. Finally, this paper proposes a new homography estimator, the residual fusion vision graph neural network (RFViG), to reduce the feature redundancy caused by the frequent residual operations of ViG. Meanwhile, RFViG replaces the residual connections with an attention feature fusion module, highlighting the important features in the low-level feature graph. Furthermore, the model introduces a detail feature loss and a feature identity loss in the optimization phase, facilitating network optimization. Through extensive experimentation, we demonstrate the efficacy of all proposed components. The experimental results demonstrate that homoViG outperforms existing methods on synthetic benchmark datasets in both qualitative and quantitative comparisons. Full article
(This article belongs to the Section Computer Science & Engineering)

15 pages, 4913 KiB  
Article
Deep Learning-Based Monocular Estimation of Distance and Height for Edge Devices
by Jan Gąsienica-Józkowy, Bogusław Cyganek, Mateusz Knapik, Szymon Głogowski and Łukasz Przebinda
Information 2024, 15(8), 474; https://doi.org/10.3390/info15080474 - 9 Aug 2024
Cited by 3 | Viewed by 3375
Abstract
Accurately estimating the absolute distance and height of objects in open areas is quite challenging, especially when based solely on single images. In this paper, we tackle these issues and propose a new method that blends traditional computer vision techniques with advanced neural network-based solutions. Our approach combines object detection and segmentation, monocular depth estimation, and homography-based mapping to provide precise and efficient measurements of absolute height and distance. This solution is implemented on an edge device, allowing for real-time data processing using both visual and thermal data sources. Experimental tests on a height estimation dataset we created show an accuracy of 98.86%, confirming the effectiveness of our method. Full article
(This article belongs to the Special Issue Information Processing in Multimedia Applications)
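
The homography-based mapping component reduces to projecting a pixel through a calibrated image-to-ground homography and reading off metric coordinates; a sketch with an illustrative matrix (not the paper's calibration):

```python
# Pixel-to-ground mapping through a calibrated homography -- the classical
# piece of the distance/height pipeline (H below is illustrative).
import numpy as np

H_img_to_ground = np.array([[0.021, 0.004, -13.5],
                            [0.001, 0.058, -41.0],
                            [0.000, 0.0019, 1.0]])  # maps pixels to metres

def pixel_to_ground(u, v, H):
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]             # homogeneous normalization

# Feet of a detected person at pixel (640, 580): planar range from camera.
x, y = pixel_to_ground(640, 580, H_img_to_ground)
print(f"ground position: ({x:.2f} m, {y:.2f} m), range: {np.hypot(x, y):.2f} m")
```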

17 pages, 9837 KiB  
Article
Robust Calibration Technique for Precise Transformation of Low-Resolution 2D LiDAR Points to Camera Image Pixels in Intelligent Autonomous Driving Systems
by Ravichandran Rajesh and Pudureddiyur Venkataraman Manivannan
Vehicles 2024, 6(2), 711-727; https://doi.org/10.3390/vehicles6020033 - 19 Apr 2024
Cited by 1 | Viewed by 2322
Abstract
In the context of autonomous driving, the fusion of LiDAR and camera sensors is essential for robust obstacle detection and distance estimation. However, accurately estimating the transformation matrix between cost-effective low-resolution LiDAR and cameras presents challenges due to the generation of uncertain points by low-resolution LiDAR. In the present work, a new calibration technique is developed to accurately transform low-resolution 2D LiDAR points into camera pixels by utilizing both static and dynamic calibration patterns. Initially, the key corresponding points are identified at the intersection of 2D LiDAR points and calibration patterns. Subsequently, interpolation is applied to generate additional corresponding points for estimating the homography matrix. The homography matrix is then optimized using the Levenberg–Marquardt algorithm to minimize the rotation error, followed by a Procrustes analysis to minimize the translation error. The accuracy of the developed calibration technique is validated through various experiments (varying distances and orientations). The experimental findings demonstrate that the developed calibration technique significantly reduces the mean reprojection error by 0.45 pixels, rotation error by 65.08%, and distance error by 71.93% compared to the standard homography technique. Thus, the developed calibration technique promises the accurate transformation of low-resolution LiDAR points into camera pixels, thereby contributing to improved obstacle perception in intelligent autonomous driving systems. Full article
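
The core mapping step, a RANSAC homography from LiDAR-plane points to camera pixels plus a reprojection-error check, can be sketched as follows (the correspondences are placeholders for the pattern detections; the Levenberg–Marquardt and Procrustes refinements are omitted):

```python
# Sketch of estimating the 2D-LiDAR-to-pixel homography from calibration
# correspondences (placeholder data standing in for pattern detections).
import cv2
import numpy as np

lidar_pts = np.float32([[0.5, 2.0], [1.0, 2.1], [1.5, 2.0], [2.0, 2.2],
                        [0.7, 3.0], [1.3, 3.1], [1.9, 3.0], [2.4, 3.2]])
pixel_pts = np.float32([[310, 402], [420, 398], [528, 401], [640, 390],
                        [352, 305], [470, 300], [585, 304], [690, 295]])

# RANSAC homography; OpenCV refines the inlier solution internally.
H, inliers = cv2.findHomography(lidar_pts, pixel_pts, cv2.RANSAC, 3.0)

# Mean reprojection error -- the metric the paper reports improving.
proj = cv2.perspectiveTransform(lidar_pts.reshape(-1, 1, 2), H).reshape(-1, 2)
print("mean reprojection error:", np.linalg.norm(proj - pixel_pts, axis=1).mean())
```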

18 pages, 7048 KiB  
Article
U-ETMVSNet: Uncertainty-Epipolar Transformer Multi-View Stereo Network for Object Stereo Reconstruction
by Ning Zhao, Heng Wang, Quanlong Cui and Lan Wu
Appl. Sci. 2024, 14(6), 2223; https://doi.org/10.3390/app14062223 - 7 Mar 2024
Cited by 1 | Viewed by 1851
Abstract
The Multi-View Stereo model (MVS), which utilizes 2D images from multiple perspectives for 3D reconstruction, is a crucial technique in the field of 3D vision. To address the poor correlation between 2D features and 3D space in existing MVS models, as well as the high sampling rate required for static sampling, we propose U-ETMVSNet in this paper. Initially, we employ an integrated epipolar transformer module (ET) to establish 3D spatial correlations along epipolar lines, thereby enhancing the reliability of aggregated cost volumes. Subsequently, we devise a sampling module based on probability volume uncertainty to dynamically adjust the depth sampling range for the next stage. Finally, we utilize a multi-stage joint learning method based on multi-depth value classification to evaluate and optimize the model. Experimental results demonstrate that on the DTU dataset, our method achieves a relative performance improvement of 27.01% and 11.27% in terms of completeness error and overall error, respectively, compared to CasMVSNet, even at lower depth sampling rates. Moreover, our method exhibits excellent performance with a score of 58.60 on the Tanks & Temples dataset, highlighting its robustness and generalization capability. Full article

25 pages, 10231 KiB  
Article
Comprehensive Evaluation of Multispectral Image Registration Strategies in Heterogenous Agriculture Environment
by Shubham Rana, Salvatore Gerbino, Mariano Crimaldi, Valerio Cirillo, Petronia Carillo, Fabrizio Sarghini and Albino Maggio
J. Imaging 2024, 10(3), 61; https://doi.org/10.3390/jimaging10030061 - 29 Feb 2024
Cited by 8 | Viewed by 2900
Abstract
This article focuses on a comprehensive evaluation of alternative routes to scale-invariant feature transform (SIFT)- and random sample consensus (RANSAC)-based multispectral (MS) image registration. In this paper, the idea is to extensively evaluate three such SIFT- and RANSAC-based registration approaches over a heterogeneous mix containing Triticum aestivum crop and Raphanus raphanistrum weed. The first method is based on the application of a homography matrix, derived during the registration of MS images, on the spatial coordinates of individual annotations to achieve spatial realignment. The second method is based on the registration of binary masks derived from the ground truth of individual spectral channels. The third method is based on the registration of only the masked pixels of interest across the respective spectral channels. It was found that the MS image registration technique based on the registration of binary masks derived from the manually segmented images exhibited the highest accuracy, followed by the technique involving registration of masked pixels, and lastly, registration based on the spatial realignment of annotations. Among automatically segmented images, the technique based on the registration of automatically predicted mask instances exhibited higher accuracy than the technique based on the registration of masked pixels. In the ground truth images, the annotations performed through the near-infrared channel were found to have higher accuracy, followed by the green, blue, and red spectral channels. Among the automatically segmented images, the blue channel was observed to exhibit higher accuracy, followed by the green, near-infrared, and red channels. At the individual instance level, the registration based on binary masks showed the highest accuracy in the green channel, followed by the method based on the registration of masked pixels in the red channel, and lastly, the method based on the spatial realignment of annotations in the green channel. The instance detection of wild radish with YOLOv8l-seg achieved a mAP@0.5 of 92.11% and a segmentation accuracy of 98% in segmenting its binary mask instances. Full article
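
All three strategies share the canonical SIFT-plus-RANSAC backbone; a minimal sketch of registering one spectral channel onto another (file names and thresholds are illustrative):

```python
# Canonical SIFT + RANSAC registration of one spectral channel onto another
# (the common backbone of the three compared strategies; file names made up).
import cv2
import numpy as np

moving = cv2.imread("red_channel.png", cv2.IMREAD_GRAYSCALE)
fixed = cv2.imread("nir_channel.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(moving, None)
kp2, des2 = sift.detectAndCompute(fixed, None)

# Lowe's ratio test on 2-NN matches to discard ambiguous descriptors.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

registered = cv2.warpPerspective(moving, H, fixed.shape[::-1])
cv2.imwrite("red_on_nir.png", registered)
```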

13 pages, 1052 KiB  
Technical Note
Geolocalization from Aerial Sensing Images Using Road Network Alignment
by Yongfei Li, Dongfang Yang, Shicheng Wang, Lin Shi and Deyu Meng
Remote Sens. 2024, 16(3), 482; https://doi.org/10.3390/rs16030482 - 26 Jan 2024
Cited by 2 | Viewed by 2220
Abstract
Estimating geographic positions in GPS-denied environments is of great significance to the safe flight of unmanned aerial vehicles (UAVs). In this paper, we propose a novel geographic position estimation method for UAVs after road network alignment. We discuss a generally overlooked issue, namely how to estimate the geographic position of the UAV after successful road network alignment, and propose a precise and robust solution. In our method, the optimal initial estimate of the UAV's geographic position is first obtained from the road network alignment result, which is typically presented as a homography transformation between the observed road map and the reference one. The geographic position estimation is then modeled as an optimization problem that aligns the observed road with the reference one to further improve the estimation accuracy. Experiments on synthetic and real flight aerial image datasets show that the proposed algorithm estimates a more accurate geographic position of the UAV in real time and is more robust to errors in the homography transformation estimate than the currently common method. Full article
(This article belongs to the Special Issue Advanced Methods for Motion Estimation in Remote Sensing)
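
Recovering an initial position from the alignment result amounts to pushing the image point beneath the UAV through the estimated homography and then through the reference map's georeference; a sketch with illustrative constants:

```python
# Initial position from a road-network alignment result: push the UAV's
# nadir image point through the observed-map-to-reference-map homography
# (H and the georeferencing constants below are illustrative).
import numpy as np

H = np.array([[0.96, 0.05, 118.0],
              [-0.04, 0.97, -342.0],
              [1e-5, -2e-5, 1.0]])      # observed road map -> reference map

def to_reference(u, v, H):
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]

cx, cy = 512, 512                        # image point directly under the UAV
rx, ry = to_reference(cx, cy, H)
# Reference-map pixels to geographic coordinates via the map's georeference.
origin_lon, origin_lat, deg_per_px = 116.30, 39.98, 1e-5
print("UAV position:", origin_lon + rx * deg_per_px, origin_lat - ry * deg_per_px)
```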

18 pages, 9346 KiB  
Article
GNSS-Assisted Visual Dynamic Localization Method in Unknown Environments
by Jun Dai, Chunfeng Zhang, Songlin Liu, Xiangyang Hao, Zongbin Ren and Yunzhu Lv
Appl. Sci. 2024, 14(1), 455; https://doi.org/10.3390/app14010455 - 4 Jan 2024
Viewed by 2164
Abstract
Autonomous navigation and localization are the foundations of unmanned intelligent systems; continuous, stable, and reliable positioning services in unknown environments are therefore especially important. GNSS cannot localize continuously in complex environments because of weak signals, poor penetration, and susceptibility to interference, while visual navigation and localization are only relative. To address this, this paper proposes a GNSS-aided visual dynamic localization method that can provide global localization services in unknown environments. Taking three frames of images and their corresponding GNSS coordinates as the constraint data, the transformation matrix between the GNSS coordinate system and the world coordinate system is obtained through the Horn coordinate transformation, and the relative positions of the subsequent image sequences in the world coordinate system are obtained through epipolar geometry constraints, homography matrix transformations, and 2D–3D position and orientation solving, which ultimately yields the global position of the unmanned carrier in the GNSS coordinate system when GNSS is temporarily unavailable. Both dataset validation and measured-data validation showed that the GNSS initial-assisted positioning algorithm can be applied where intermittent GNSS signals exist and can provide global positioning coordinates with high accuracy over a short period, although the algorithm drifts when used over a long period. We further compared the errors of the GNSS initial-assisted and GNSS continuous-assisted positioning systems; the results showed that the accuracy of the GNSS continuous-assisted system was two to three times better than that of the GNSS initial-assisted system, which proved that the GNSS continuous-assisted positioning algorithm can maintain positioning accuracy over a long period and has good reliability and applicability in unknown environments. Full article
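
The Horn coordinate transformation step, aligning the visual world frame to the GNSS frame from corresponded positions, is the classical absolute-orientation problem; an SVD-based sketch in the Horn/Umeyama style (coordinates are synthetic):

```python
# SVD-based absolute orientation (Horn/Umeyama style): the step that maps
# the visual world frame onto the GNSS frame (synthetic data for checking).
import numpy as np

def absolute_orientation(src, dst):
    """src, dst: (N, 3) corresponding points (N >= 3, not collinear).
    Returns s, R, t such that dst ~ s * R @ src + t."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    X, Y = src - mu_s, dst - mu_d
    cov = Y.T @ X / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U) * np.linalg.det(Vt))])
    R = U @ S @ Vt                      # rotation, reflections guarded against
    s = np.trace(np.diag(D) @ S) / ((X ** 2).sum() / len(src))  # scale
    t = mu_d - s * R @ mu_s             # translation
    return s, R, t

# Synthetic check: build GNSS points from a known similarity transform and
# verify the solver recovers the mapping.
rng = np.random.default_rng(0)
world = rng.random((4, 3))              # visual positions (arbitrary units)
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
gnss = 2.5 * world @ R_true.T + np.array([425000.0, 3450000.0, 80.0])
s, R, t = absolute_orientation(world, gnss)
print(np.allclose(s * world @ R.T + t, gnss))   # True
```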

20 pages, 2470 KiB  
Article
Coarse-to-Fine Homography Estimation for Infrared and Visible Images
by Xingyi Wang, Yinhui Luo, Qiang Fu, Yuanqing He, Chang Shu, Yuezhou Wu and Yanhao Liao
Electronics 2023, 12(21), 4441; https://doi.org/10.3390/electronics12214441 - 29 Oct 2023
Cited by 4 | Viewed by 1855
Abstract
Homography estimation for infrared and visible images is a critical and fundamental task in multimodal image processing. Recently, the coarse-to-fine strategy has been gradually applied to the homography estimation task and has proved to be effective. However, current coarse-to-fine homography estimation methods typically require the introduction of additional neural networks to acquire multi-scale feature maps and the design of complex homography matrix fusion strategies. In this paper, we propose a new unsupervised homography estimation method for infrared and visible images. First, we design a novel coarse-to-fine strategy. This strategy utilizes different stages in the regression network to obtain multi-scale feature maps, enabling the progressive refinement of the homography matrix. Second, we design a local correlation transformer (LCTrans), which aims to capture the intrinsic connections between local features more precisely, thus highlighting the features crucial for homography estimation. Finally, we design an average feature correlation loss (AFCL) to enhance the robustness of the model. Through extensive experiments, we validated the effectiveness of all the proposed components. Experimental results demonstrate that our method outperforms existing methods on synthetic benchmark datasets in both qualitative and quantitative comparisons. Full article
(This article belongs to the Section Computer Science & Engineering)
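
Estimators in this family commonly parameterize the homography as four predicted corner offsets converted to a 3×3 matrix by a direct linear solve; a hedged sketch of that conversion (this detail is standard in the deep homography literature rather than stated in the abstract):

```python
# Common parameterization in this family of estimators (a hedged sketch,
# not this paper's network): a model regresses four corner offsets, and a
# direct 4-point solve turns them into the 3x3 homography.
import cv2
import numpy as np

corners = np.float32([[0, 0], [128, 0], [128, 128], [0, 128]])   # patch corners
offsets = np.float32([[2.1, -1.4], [-0.8, 3.2], [1.5, 0.6], [-2.2, -0.9]])

H = cv2.getPerspectiveTransform(corners, corners + offsets)
print(H)   # homography consistent with the predicted corner displacements
```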
