
Search Results (200)

Search Parameters:
Keywords = automatic image registration

23 pages, 4070 KiB  
Article
A Deep Learning-Based System for Automatic License Plate Recognition Using YOLOv12 and PaddleOCR
by Bianca Buleu, Raul Robu and Ioan Filip
Appl. Sci. 2025, 15(14), 7833; https://doi.org/10.3390/app15147833 - 12 Jul 2025
Viewed by 372
Abstract
Automatic license plate recognition (ALPR) plays an important role in applications such as intelligent traffic systems, vehicle access control in specific areas, and law enforcement. The main novelty of the present research is the development of an automatic vehicle license plate recognition system adapted to the Romanian context, which integrates the YOLOv12 detection architecture with the PaddleOCR library while also providing functionalities for recognizing the type of vehicle on which the license plate is mounted and identifying the county of registration. These functionalities extend the applicability of the proposed solution, including to restricting access for certain types of vehicles in specific areas and to monitoring vehicle traffic by county of registration. The dataset used in the study was manually collected and labeled using the makesense.ai platform and was made publicly available for future research. It includes 744 images of vehicles registered in Romania, captured in real traffic conditions (the training dataset being expanded by augmentation). The YOLOv12 model was trained to automatically detect license plates in images of vehicles and was then evaluated and validated using standard metrics such as precision, recall, F1 score, mAP@0.5, and mAP@0.5:0.95, demonstrating very good performance. Experimental results show that YOLOv12 achieved superior performance compared to YOLOv11 for the analyzed task, outperforming it with a 2.3% increase in precision (from 97.4% to 99.6%) and a 1.1% improvement in F1 score (from 96.7% to 97.8%). Full article
(This article belongs to the Collection Machine Learning in Computer Engineering Applications)
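The abstract mentions identifying the county of registration from the recognized plate text. A minimal, hypothetical sketch of that step, assuming standard Romanian plate formats (e.g. "TM 12 ABC"; "B 123 ABC" for Bucharest) and an illustrative subset of county codes — not the authors' implementation:

```python
import re
from typing import Optional

# Illustrative subset of Romanian county registration codes.
COUNTY_CODES = {
    "B": "Bucharest",
    "CJ": "Cluj",
    "CT": "Constanta",
    "IS": "Iasi",
    "TM": "Timis",
}

# Standard formats: "XX 12 ABC" for counties, "B 123 ABC" for Bucharest.
PLATE_RE = re.compile(r"^([A-Z]{1,2})\s?(\d{2,3})\s?([A-Z]{3})$")

def county_of(plate: str) -> Optional[str]:
    """Map a recognized plate string to its county name, if the code is known."""
    m = PLATE_RE.match(plate.strip().upper())
    return COUNTY_CODES.get(m.group(1)) if m else None
```

In a full pipeline this would run on the OCR output string, after detection and text recognition.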

42 pages, 5041 KiB  
Article
Autonomous Waste Classification Using Multi-Agent Systems and Blockchain: A Low-Cost Intelligent Approach
by Sergio García González, David Cruz García, Rubén Herrero Pérez, Arturo Álvarez Sanchez and Gabriel Villarrubia González
Sensors 2025, 25(14), 4364; https://doi.org/10.3390/s25144364 - 12 Jul 2025
Viewed by 271
Abstract
The increase in garbage generated in modern societies demands the implementation of a more sustainable model as well as new methods for efficient waste management. This article describes the development and implementation of a prototype of a smart bin that automatically sorts waste using a multi-agent system and blockchain integration. The proposed system has sensors that identify the type of waste (organic, plastic, paper, etc.) and uses collaborative intelligent agents to make instant sorting decisions. Blockchain has been implemented as a technology for the immutable and transparent control of waste registration, favoring traceability during the classification process, providing sustainability to the process, and making the audit of data in smart urban environments transparent. For the computer vision algorithm, three versions of YOLO (YOLOv8, YOLOv11, and YOLOv12) were used and evaluated with respect to their performance in automatic detection and classification of waste. The YOLOv12 version was selected due to its overall performance, which is superior to others with mAP@50 values of 86.2%, an overall accuracy of 84.6%, and an average F1 score of 80.1%. Latency was kept below 9 ms per image with YOLOv12, ensuring smooth and lag-free processing, even for utilitarian embedded systems. This allows for efficient deployment in near-real-time applications where speed and immediate response are crucial. These results confirm the viability of the system in both accuracy and computational efficiency. This work provides an innovative solution in the field of ambient intelligence, characterized by low equipment cost and high scalability, laying the foundations for the development of smart waste management infrastructures in sustainable cities. Full article
(This article belongs to the Special Issue Sensing and AI: Advancements in Robotics and Autonomous Systems)
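The blockchain component records classification events immutably and traceably. A minimal hash-chained ledger sketch (a single-node append-only chain, not a distributed blockchain; all names are illustrative):

```python
import hashlib
import json

def _hash(block: dict) -> str:
    """Deterministic SHA-256 digest of a block's JSON serialization."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

class ClassificationLedger:
    """Append-only, hash-chained log of waste-classification events."""

    def __init__(self):
        self.chain = [{"index": 0, "event": "genesis", "prev": "0" * 64}]

    def record(self, bin_id: str, waste_type: str, confidence: float) -> dict:
        block = {
            "index": len(self.chain),
            "event": {"bin": bin_id, "type": waste_type, "conf": confidence},
            "prev": _hash(self.chain[-1]),  # link to the previous block
        }
        self.chain.append(block)
        return block

    def verify(self) -> bool:
        """Recompute every link; any tampering breaks the chain."""
        return all(
            self.chain[i]["prev"] == _hash(self.chain[i - 1])
            for i in range(1, len(self.chain))
        )
```

Altering any recorded event invalidates every later link, which is what makes the classification history auditable.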

22 pages, 4371 KiB  
Article
Defining Keypoints to Align H&E Images and Xenium DAPI-Stained Images Automatically
by Yu Lin, Yan Wang, Juexin Wang, Mauminah Raina, Ricardo Melo Ferreira, Michael T. Eadon, Yanchun Liang and Dong Xu
Cells 2025, 14(13), 1000; https://doi.org/10.3390/cells14131000 - 30 Jun 2025
Viewed by 363
Abstract
10X Xenium is an in situ spatial transcriptomics platform that enables single-cell and subcellular-level gene expression analysis. In Xenium data analysis, defining matched keypoints to align H&E and spatial transcriptomic images is critical for cross-referencing sequencing and histology. Currently, it is labor-intensive for domain experts to manually place keypoints to perform image registration in the Xenium Explorer software. We present Xenium-Align, a keypoint identification method that automatically generates keypoint files for image registration in Xenium Explorer. We validated our proposed method on 14 human kidney samples and one human skin Xenium sample representing healthy and diseased states, against experts’ manually marked results. These results show that Xenium-Align can generate accurate keypoints for automatically implementing image alignment in the Xenium Explorer software for spatial transcriptomics studies. Our future research aims to optimize the method’s runtime efficiency and usability for image alignment applications. Full article
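Matched keypoint pairs like those Xenium-Align produces determine the image transform. As context, a sketch of fitting a 2D affine transform to matched keypoints by least squares (numpy); the actual Xenium Explorer keypoint-file format is tool-specific and not reproduced here:

```python
import numpy as np

def fit_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares 2x3 affine matrix A mapping src keypoints onto dst.

    src, dst: (N, 2) arrays of matched keypoints, N >= 3.
    """
    n = src.shape[0]
    # Homogeneous source coordinates: [x, y, 1]
    X = np.hstack([src, np.ones((n, 1))])
    # Solve X @ A.T ~= dst for the affine matrix.
    A_T, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A_T.T

def apply_affine(A: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Apply a 2x3 affine matrix to (N, 2) points."""
    X = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return X @ A.T
```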

30 pages, 8572 KiB  
Article
Robotic-Guided Spine Surgery: Implementation of a System in Routine Clinical Practice—An Update
by Mirza Pojskić, Miriam Bopp, Omar Alwakaa, Christopher Nimsky and Benjamin Saß
J. Clin. Med. 2025, 14(13), 4463; https://doi.org/10.3390/jcm14134463 - 23 Jun 2025
Viewed by 584
Abstract
Objective: The aim of this study is to present the initiation of robotic-guided (RG) spine surgery into routine clinical care at a single center with the use of intraoperative CT (iCT) automatic registration-based navigation. The workflow included iCT with automatic registration, fusion with preoperative imaging, verification of preplanned screw trajectories, RG introduction of K-wires, and the insertion of pedicle screws (PSs), followed by a control iCT scan. Methods: All patients who underwent RG implantation of pedicle screws using the Cirq® robotic arm (BrainLab, Munich, Germany) in the thoracolumbar spine at our department were included in the study. The accuracy of the pedicle screws was assessed using the Gertzbein–Robbins scale (GRS). Results: In total, 108 patients (60 female, mean age 68.7 ± 11.4 years) in 109 surgeries underwent RG PS placement. Indications included degenerative spinal disorders (n = 30 patients), spondylodiscitis (n = 24), tumor (n = 33), and fracture (n = 22), with a mean follow-up period of 7.7 ± 9 months. Thirty-seven cases (33.9%) were performed percutaneously, and all others were performed openly. Thirty-three operations were performed on the thoracic spine, forty-four on the lumbar and lumbosacral spine, thirty on the thoracolumbar, one on the cervicothoracic spine, and one on the thoracolumbosacral spine. The screws were inserted using a fluoroscopic technique (first 12 operations) or a navigated technique (all subsequent operations). The mean operation time was 228.8 ± 106 min, and the mean robotic time was 31.5 ± 18.4 min. The mean time per K-wire was 5.35 ± 3.98 min. The operation time was lower in the percutaneous group, while the robot time did not differ between the two groups. Robot time and the time per K-wire improved over time. Out of 688 screws, 592 were GRS A screws (86.1%), 54 B (7.8%), 22 C (3.2%), 12 D (1.7%), and 8 E (1.2%). Seven screws were revised intraoperatively, and after revision, all were GRS A. E screws were either revised or removed. In the case of D screws, screws located at the end of the construct were revised, while so-called in-out-in screws in the middle of the construct were not revised. Conclusions: Brainlab’s Cirq® Robotic Alignment Module feature enables placement of pedicle screws in the thoracolumbar spine with high accuracy. A learning curve is shown through improvements in robotic time and time per K-wire. Full article
(This article belongs to the Special Issue Spine Surgery: Clinical Advances and Future Directions)
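The reported Gertzbein–Robbins distribution can be recomputed from the raw screw counts (shares rounded to one decimal; the last digit may differ slightly from the abstract's rounded figures):

```python
# Reported Gertzbein-Robbins grade counts for the 688 screws.
grs_counts = {"A": 592, "B": 54, "C": 22, "D": 12, "E": 8}

total = sum(grs_counts.values())
# Percentage share of each grade, rounded to one decimal place.
shares = {g: round(100 * n / total, 1) for g, n in grs_counts.items()}
# Grades A and B are conventionally considered clinically acceptable.
acceptable_pct = round(100 * (grs_counts["A"] + grs_counts["B"]) / total, 1)
```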

16 pages, 3988 KiB  
Article
An Arthroscopic Robotic System for Meniscoplasty with Autonomous Operation Ability
by Zijun Zhang, Yijun Zhao, Baoliang Zhao, Gang Yu, Peng Zhang, Qiong Wang and Xiaojun Yang
Bioengineering 2025, 12(5), 539; https://doi.org/10.3390/bioengineering12050539 - 17 May 2025
Viewed by 475
Abstract
Meniscoplasty is a common surgical procedure used to treat meniscus tears. During the operation, there are often key challenges such as a limited visual field, a narrow operating space, and difficulties in controlling the resection range. Therefore, this study developed an arthroscopic robotic system with the ability of autonomous meniscus resection to achieve better surgical outcomes. To address the issue of limited visual fields during the operation, this study used the preoperative and intraoperative meniscus point cloud images for surgical navigation and proposed a novel cross-modal point cloud registration framework. After the registration was completed, the robotic system automatically generated a resection path that could maintain the crescent shape of the remaining meniscus based on the improved Rapidly Exploring Random Tree (RRT) path-planning algorithm in this study. Meanwhile, the Remote Center of Motion (RCM) constraint was introduced during the movement of the robot to enhance safety. In this study, the mean squared error of the preoperative–intraoperative meniscus point cloud registration was only 0.1964 mm2, which meets the surgical accuracy requirements. We conducted experiments to validate the autonomous operation capabilities of the robot. The robot successfully completed motion-planning and autonomous implementation, thus demonstrating the reliability of the robotic system. Full article
(This article belongs to the Section Biomedical Engineering and Biomaterials)
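The resection path builds on the Rapidly Exploring Random Tree algorithm. A minimal baseline RRT in an obstacle-free 2D workspace, for illustration only (the paper's improved variant additionally preserves the crescent shape of the remaining meniscus and operates under the RCM constraint):

```python
import math
import random

def rrt(start, goal, step=0.5, max_iters=5000, goal_tol=0.5, seed=0):
    """Basic RRT in an obstacle-free 2D workspace [0, 10]^2.

    Grows a tree from `start` by repeatedly steering the nearest node
    toward a random sample; returns the path once within `goal_tol`
    of `goal`, or None if the iteration budget is exhausted.
    """
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Goal biasing: sample the goal 10% of the time.
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10), rng.uniform(0, 10))
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        near = nodes[i]
        d = math.dist(near, sample)
        if d == 0:
            continue
        t = min(1.0, step / d)  # steer at most `step` toward the sample
        new = (near[0] + t * (sample[0] - near[0]), near[1] + t * (sample[1] - near[1]))
        nodes.append(new)
        parent[len(nodes) - 1] = i
        if math.dist(new, goal) <= goal_tol:
            path, k = [], len(nodes) - 1
            while k is not None:  # walk back to the root
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None
```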

21 pages, 3360 KiB  
Article
Radiomic Feature Characteristics of Ovine Pulmonary Adenocarcinoma
by David Collie, Ziyuan Chang, James Meehan, Steven H. Wright, Chris Cousens, Jo Moore, Helen Todd, Jennifer Savage, Helen Brown, Calum D. Gray, Tom J. MacGillivray, David J. Griffiths, Chad E. Eckert, Nicole Storer and Mark Gray
Vet. Sci. 2025, 12(5), 400; https://doi.org/10.3390/vetsci12050400 - 23 Apr 2025
Viewed by 474
Abstract
Radiomic feature (RF) analysis of computed tomography (CT) images may aid the diagnosis and staging of ovine pulmonary adenocarcinoma (OPA). We assessed the RF characteristics of OPA tumours in JSRV-infected sheep compared to non-tumour lung tissues, examined their stability over time, and analysed RF variations in the nascent tumour field (NTF) and nascent tumour margin field (NTmF). In monthly CT scans, lung tissues were automatically segmented by density, and lung tumours were manually segmented. RFs were calculated for each imaging session, selected according to stability and reproducibility, and adjusted for volume dependence where appropriate. Comparisons between scans within sheep were facilitated through fiducial registration and spatial transformations. Initially, 9/36 RFs differed significantly from non-tumour lung tissue of similar density. Predominant RF changes included ngtdm_Complexity, glrlm_RunLNUnif_VN, and gldm_SmDHGLE. RFs in lung tumour segments showed time-dependent changes, whereas non-tumour lung tissue of similar density remained consistent. OPA lung tumour RF characteristics are distinct from those of other lung tissues of similar density and evolve as the tumour develops. Such characteristics suggest that radiomic analysis offers potential for the early detection and management of JSRV-related lung tumours. This research enhances the understanding of OPA imaging, potentially informing better diagnosis and control measures for naturally occurring infections. Full article

29 pages, 26666 KiB  
Article
Automatic Registration of Multi-Temporal 3D Models Based on Phase Congruency Method
by Chaofeng Ren, Kenan Feng, Haixing Shang and Shiyuan Li
Remote Sens. 2025, 17(8), 1328; https://doi.org/10.3390/rs17081328 - 9 Apr 2025
Viewed by 498
Abstract
The application prospects of multi-temporal 3D models are broad, but it is difficult to ensure that multi-temporal 3D models share a consistent spatial reference. In this study, a method for automatic alignment of multi-temporal 3D models based on phase congruency (PC) matching is proposed. Firstly, the texture image of the multi-temporal 3D model is obtained, and the key points are extracted from the texture image. Secondly, the affine model between the plane of the key point and its corresponding tile triangle is established, and the 2D coordinates of the key point are mapped to 3D spatial coordinates. Thirdly, multi-temporal 3D model matching is completed based on PC to obtain a large number of evenly distributed corresponding points. Finally, the parameters of the 3D transformation model are estimated based on the multi-temporal corresponding points, and the vertex update of the 3D model is completed. The first experiment demonstrates that the method proposed in this study performs remarkably well in improving the positioning accuracy of feature point coordinates, effectively reducing the mean systematic error to below 0.001 m. The second experiment further reveals the significant impact of different 3D transformation models. The experimental results show that the coordinates obtained based on position and orientation system (POS) data have significant positioning errors, while the method proposed in this study can reduce the coordinate errors between the two-period models. Because this method requires neither ground control points (GCPs) nor manual measurement for 3D geometric registration, its application to multi-temporal 3D models can ensure high-precision spatial referencing, streamlining processes to reduce resource intensity and enhance economic efficiency. Full article
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)
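The final step estimates a 3D transformation model from the corresponding points. One common choice is a least-squares rigid fit (Kabsch/Umeyama), sketched here with numpy; the paper compares several transformation models, so this is illustrative rather than the authors' exact model:

```python
import numpy as np

def fit_rigid(src: np.ndarray, dst: np.ndarray):
    """Least-squares rigid transform (R, t) with dst ~= src @ R.T + t.

    src, dst: (N, 3) arrays of corresponding 3D points (Kabsch/Umeyama).
    """
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)       # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

Applying `R` and `t` to every vertex is the "vertex update" the abstract describes.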

27 pages, 11200 KiB  
Article
An Automatic Registration System Based on Augmented Reality to Enhance Civil Infrastructure Inspections
by Leonardo Binni, Massimo Vaccarini, Francesco Spegni, Leonardo Messi and Berardo Naticchia
Buildings 2025, 15(7), 1146; https://doi.org/10.3390/buildings15071146 - 31 Mar 2025
Cited by 1 | Viewed by 666
Abstract
Manual geometric and semantic alignment of inspection data with existing digital models (field-to-model data registration) and on-site access to relevant information (model-to-field data registration) represent cumbersome procedures that cause significant loss of information and fragmentation, hindering the efficiency of civil infrastructure inspections. To address the bidirectional registration challenge, this study introduces a high-accuracy automatic registration method and system based on Augmented Reality (AR) that streamlines data exchange between the field and a knowledge graph-based Digital Twin (DT) platform for infrastructure management, and vice versa. A centimeter-level 6-DoF pose estimation of the AR device in large-scale, open unprepared environments is achieved by implementing a hybrid approach based on Real-Time Kinematic and Visual Inertial Odometry to cope with urban-canyon scenarios. For this purpose, a low-cost and non-invasive RTK receiver was prototyped and firmly attached to an AR device (i.e., Microsoft HoloLens 2). Multiple filters and latency compensation techniques were implemented to enhance registration accuracy. The system was tested in a real-world scenario involving the inspection of a highway viaduct. Throughout the use case inspection, the system seamlessly and automatically provided field operators with on-field access to existing DT information (i.e., open BIM models) such as georeferenced holograms and facilitated the enrichment of the asset’s DT through the automatic registration of inspection data (i.e., images) with the open BIM models included in the DT. This study contributes to DT-based civil infrastructure management by establishing a bidirectional and seamless integration between virtual and physical entities. Full article

19 pages, 39933 KiB  
Article
SIFT-Based Depth Estimation for Accurate 3D Reconstruction in Cultural Heritage Preservation
by Porawat Visutsak, Xiabi Liu, Chalothon Choothong and Fuangfar Pensiri
Appl. Syst. Innov. 2025, 8(2), 43; https://doi.org/10.3390/asi8020043 - 24 Mar 2025
Viewed by 1560
Abstract
This paper describes a proposed method for preserving tangible cultural heritage by reconstructing a 3D model of cultural heritage using 2D captured images. The input data represent a set of multiple 2D images captured using different views around the object. An image registration technique is applied to configure the overlapping images with the depth of images computed to construct the 3D model. The automatic 3D reconstruction system consists of three steps: (1) Image registration for managing the overlapping of 2D input images; (2) Depth computation for managing image orientation and calibration; and (3) 3D reconstruction using point cloud and stereo-dense matching. We collected and recorded 2D images of tangible cultural heritage objects, such as high-relief and round-relief sculptures, using a low-cost digital camera. The performance analysis of the proposed method, in conjunction with the generation of 3D models of tangible cultural heritage, demonstrates significantly improved accuracy in depth information. This process effectively creates point cloud locations, particularly in high-contrast backgrounds. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
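Step 2 of the system computes depth from calibrated, overlapping views. As background only, the standard rectified-stereo depth relation Z = fB/d, sketched as a small function (the authors' pipeline uses SIFT-based matching across many views, which is more involved):

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline_m):
    """Depth map from a rectified stereo disparity map: Z = f * B / d.

    disparity : (H, W) array of pixel disparities (0 where unmatched)
    focal_px  : focal length in pixels
    baseline_m: camera baseline in metres
    """
    d = np.asarray(disparity, dtype=float)
    depth = np.full_like(d, np.inf)  # unmatched pixels get infinite depth
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth
```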

18 pages, 10219 KiB  
Article
Automatic Registration of Remote Sensing High-Resolution Hyperspectral Images Based on Global and Local Features
by Xiaorong Zhang, Siyuan Li, Zhongyang Xing, Binliang Hu and Xi Zheng
Remote Sens. 2025, 17(6), 1011; https://doi.org/10.3390/rs17061011 - 13 Mar 2025
Cited by 1 | Viewed by 671
Abstract
Automatic registration of remote sensing images is an important task, which requires the establishment of appropriate correspondence between the sensed image and the reference image. Nowadays, the trend of satellite remote sensing technology is shifting towards high-resolution hyperspectral imaging. Increasingly frequent revisit cycles and higher image resolutions demand greater accuracy and real-time performance from automatic registration. The push-broom payload is affected by the push-broom stability of the satellite platform and the elevation change of ground objects, and the obtained hyperspectral image may exhibit distortions such as stretching or shrinking in different parts of the image. To solve this problem, a new automatic registration strategy for remote sensing hyperspectral images combining the global and local features of the image was established, with registration carried out at two granularities: coarse-grained matching and fine-grained matching. The high-resolution spatial features are first employed for detecting scale-invariant features, while the spectral information is used for matching; the idea of image stitching is then employed to fuse the image after fine registration to obtain high-precision registration results. To verify the proposed algorithm, a simulated on-orbit push-broom imaging experiment was carried out to obtain hyperspectral images with local complex distortions under different lighting conditions. The simulation results show that the proposed remote sensing hyperspectral image registration algorithm is superior to existing automatic registration algorithms. Its advantages in registration accuracy and real-time performance give it broad application prospects in satellite ground application systems. Full article
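The abstract matches features using spectral information. One standard spectral similarity measure is the spectral angle mapper (SAM), sketched here as an illustration; the paper's exact matching criterion may differ:

```python
import numpy as np

def spectral_angle(a: np.ndarray, b: np.ndarray) -> float:
    """Spectral angle (radians) between two spectra; 0 means identical shape."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def best_spectral_match(query: np.ndarray, candidates: np.ndarray) -> int:
    """Index of the candidate spectrum with the smallest angle to `query`."""
    angles = [spectral_angle(query, c) for c in candidates]
    return int(np.argmin(angles))
```

SAM depends only on spectral shape, not overall brightness, which makes it robust to the illumination differences the experiment varies.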

26 pages, 7964 KiB  
Article
Pig Face Open Set Recognition and Registration Using a Decoupled Detection System and Dual-Loss Vision Transformer
by Ruihan Ma, Hassan Ali, Malik Muhammad Waqar, Sang Cheol Kim and Hyongsuk Kim
Animals 2025, 15(5), 691; https://doi.org/10.3390/ani15050691 - 27 Feb 2025
Viewed by 772
Abstract
Effective pig farming relies on precise and adaptable animal identification methods, particularly in dynamic environments where new pigs are regularly added to the herd. However, pig face recognition is challenging due to high individual similarity, lighting variations, and occlusions. These factors hinder accurate identification and monitoring. To address these issues under Open-Set conditions, we propose a three-phase Pig Face Open-Set Recognition (PFOSR) system. In the Training Phase, we adopt a decoupled design, first training a YOLOv8-based pig face detection model on a small labeled dataset to automatically locate pig faces in raw images. We then refine a Vision Transformer (ViT) recognition model via a dual-loss strategy—combining Sub-center ArcFace and Center Loss—to enhance both inter-class separation and intra-class compactness. Next, in the Known Pig Registration Phase, we utilize the trained detection and recognition modules to extract representative embeddings from 56 identified pigs, storing these feature vectors in a Pig Face Feature Gallery. Finally, in the Unknown and Known Pig Recognition and Registration Phase, newly acquired pig images are processed through the same detection–recognition pipeline, and the resulting embeddings are compared against the gallery via cosine similarity. If the system classifies a pig as unknown, it dynamically assigns a new ID and updates the gallery without disrupting existing entries. Our system demonstrates strong Open-Set recognition, achieving an AUROC of 0.922, OSCR of 0.90, and F1-Open of 0.94. In the closed set, it attains a precision@1 of 0.97, NMI of 0.92, and mean average precision@R of 0.96. These results validate our approach as a scalable, efficient solution for managing dynamic farm environments with high recognition accuracy, even under challenging conditions. Full article
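The gallery matching and dynamic enrollment described above can be sketched as follows (the 0.9 threshold and ID scheme are illustrative, not the paper's values):

```python
import numpy as np

class FaceGallery:
    """Embedding gallery with open-set matching via cosine similarity.

    Below the similarity threshold a query is treated as an unknown pig
    and enrolled under a new ID, without disrupting existing entries.
    """

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold
        self.ids, self.embs = [], []

    def enroll(self, pig_id: str, emb: np.ndarray) -> None:
        self.ids.append(pig_id)
        self.embs.append(emb / np.linalg.norm(emb))  # store unit vectors

    def identify(self, emb: np.ndarray) -> str:
        emb = emb / np.linalg.norm(emb)
        if self.embs:
            sims = np.array(self.embs) @ emb  # cosine similarities
            best = int(np.argmax(sims))
            if sims[best] >= self.threshold:
                return self.ids[best]
        # Unknown pig: assign a fresh ID and update the gallery.
        new_id = f"pig_{len(self.ids) + 1}"
        self.enroll(new_id, emb)
        return new_id
```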

16 pages, 4878 KiB  
Technical Note
A Robust Digital Elevation Model-Based Registration Method for Mini-RF/Mini-SAR Images
by Zihan Xu, Fei Zhao, Pingping Lu, Yao Gao, Tingyu Meng, Yanan Dang, Mofei Li and Robert Wang
Remote Sens. 2025, 17(4), 613; https://doi.org/10.3390/rs17040613 - 11 Feb 2025
Viewed by 761
Abstract
SAR data from the Lunar Reconnaissance Orbiter’s (LRO) Mini-RF and Chandrayaan-1’s Mini-SAR provide valuable insights into the properties of the lunar surface. However, public lunar SAR data products are not properly registered and are limited by localization issues. Existing registration methods for Earth SAR have proven to be inadequately robust for lunar data registration, and current research on methods for lunar SAR has not yet focused on producing globally registered datasets. To solve these problems, this article introduces a robust automatic registration method tailored for S-band Level-1 Mini-RF and Mini-SAR data with the assistance of lunar DEM. A simulated SAR image based on real lunar DEM data is first generated to assist the registration work, and then an offset calculation approach based on normalized cross-correlation (NCC) and specific processing, including background removal, is proposed to achieve registration between the simulated image and the real image. When applied to Mini-RF and Mini-SAR images, the method exhibits high robustness and good accuracy, producing fully registered datasets. After processing with the proposed method, the average error between Mini-RF images and DEM references was reduced from approximately 3000 m to about 100 m. To further explore the improvement offered by the proposed method, the registered lunar SAR datasets are used for further analysis, including a review of the circular polarization ratio (CPR) characteristics of anomalous craters. Full article
(This article belongs to the Section Engineering Remote Sensing)
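The offset calculation between the DEM-simulated image and the real SAR image uses normalized cross-correlation. A brute-force sketch over integer shifts (the paper's additional processing, such as background removal, is omitted):

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-mean normalized cross-correlation of two same-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def best_offset(reference: np.ndarray, template: np.ndarray):
    """(dy, dx) placing `template` at its best NCC match inside `reference`."""
    H, W = reference.shape
    h, w = template.shape
    best, best_dy, best_dx = -2.0, 0, 0
    for dy in range(H - h + 1):
        for dx in range(W - w + 1):
            score = ncc(reference[dy:dy + h, dx:dx + w], template)
            if score > best:
                best, best_dy, best_dx = score, dy, dx
    return best_dy, best_dx
```

Because NCC is zero-mean and normalized, the estimate is invariant to gain and offset differences between the simulated and real images.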

20 pages, 7090 KiB  
Article
An Infrared and Visible Image Alignment Method Based on Gradient Distribution Properties and Scale-Invariant Features in Electric Power Scenes
by Lin Zhu, Yuxing Mao, Chunxu Chen and Lanjia Ning
J. Imaging 2025, 11(1), 23; https://doi.org/10.3390/jimaging11010023 - 13 Jan 2025
Viewed by 1084
Abstract
In grid intelligent inspection systems, automatic registration of infrared and visible light images in power scenes is a crucial research technology. Since there are obvious differences in key attributes between visible and infrared images, direct alignment often fails to achieve the expected results. To overcome the difficulty of aligning infrared and visible light images, an image alignment method is proposed in this paper. First, we use the Sobel operator to extract the edge information of the image pair. Second, the feature points in the edges are recognised by a curvature scale space (CSS) corner detector. Third, the Histogram of Oriented Gradients (HOG) is extracted as the gradient distribution characteristic of the feature points, which is normalised with the Scale Invariant Feature Transform (SIFT) algorithm to form feature descriptors. Finally, initial matching and accurate matching are achieved by the improved fast approximate nearest-neighbour matching method and adaptive thresholding, respectively. Experiments show that this method can robustly match the feature points of image pairs under rotation, scale, and viewpoint differences, and achieves excellent matching results. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
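The first step extracts edge information with the Sobel operator; a minimal pure-numpy version of that step, for illustration:

```python
import numpy as np

# Sobel kernels for horizontal and vertical gradients.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
KY = KX.T

def _conv2_valid(img: np.ndarray, k: np.ndarray) -> np.ndarray:
    """'Valid' 2D correlation with a 3x3 kernel (no padding)."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:i + H - 2, j:j + W - 2]
    return out

def sobel_magnitude(img: np.ndarray) -> np.ndarray:
    """Gradient magnitude map; large values mark edges."""
    gx = _conv2_valid(img, KX)
    gy = _conv2_valid(img, KY)
    return np.hypot(gx, gy)
```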

25 pages, 12595 KiB  
Article
Fusion-Based Damage Segmentation for Multimodal Building Façade Images from an End-to-End Perspective
by Pujin Wang, Jiehui Wang, Qiong Liu, Lin Fang and Jie Xiao
Buildings 2025, 15(1), 63; https://doi.org/10.3390/buildings15010063 - 27 Dec 2024
Cited by 1 | Viewed by 1070
Abstract
Multimodal image data have found widespread applications in visual-based building façade damage detection in recent years, offering comprehensive inspection of façade surfaces with the assistance of drones and infrared thermography. However, the comprehensive integration of such complementary data has been hindered by low levels of automation due to the absence of properly developed methods, resulting in high cost and low efficiency. Thus, this paper proposes an automatic end-to-end building façade damage detection method by integrating multimodal image registration, infrared–visible image fusion (IVIF), and damage segmentation. An infrared and visible image dataset consisting of 1761 pairs encompassing 4 main types of façade damage has been constructed for processing and training. A novel infrared–visible image registration method using main orientation assignment for feature point extraction is developed, reaching an RMSE of 14.35 in aligning the multimodal images. Then, a deep learning-based IVIF network is trained to preserve damage characteristics between the modalities. For damage detection, a relatively high mean average precision (mAP) of 85.4% is achieved by comparing four instance segmentation models, affirming the effective utilization of the IVIF results. Full article
(This article belongs to the Special Issue Low-Carbon and Green Materials in Construction—2nd Edition)
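The registration accuracy above is reported as an RMSE over matched feature points. A minimal sketch of how such a pixel-space registration RMSE can be computed, assuming a homography-based alignment; the function name and the point data are illustrative, not the paper's implementation:

```python
import numpy as np

def registration_rmse(src_pts, dst_pts, H):
    """Root-mean-square error between keypoints mapped through a
    3x3 homography H and their matched counterparts in the other image."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    # Apply the homography in homogeneous coordinates.
    ones = np.ones((src.shape[0], 1))
    proj = np.hstack([src, ones]) @ H.T
    proj = proj[:, :2] / proj[:, 2:3]      # back to Cartesian coordinates
    return float(np.sqrt(np.mean(np.sum((proj - dst) ** 2, axis=1))))

# With an identity transform, the RMSE is the plain point-to-point error.
pts_a = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
pts_b = [(3.0, 4.0), (10.0, 0.0), (0.0, 10.0)]
print(registration_rmse(pts_a, pts_b, np.eye(3)))  # → ~2.8868
```

A lower value indicates tighter alignment between the infrared and visible keypoints.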

28 pages, 70926 KiB  
Article
Fusion of Visible and Infrared Aerial Images from Uncalibrated Sensors Using Wavelet Decomposition and Deep Learning
by Chandrakanth Vipparla, Timothy Krock, Koundinya Nouduri, Joshua Fraser, Hadi AliAkbarpour, Vasit Sagan, Jing-Ru C. Cheng and Palaniappan Kannappan
Sensors 2024, 24(24), 8217; https://doi.org/10.3390/s24248217 - 23 Dec 2024
Cited by 2 | Viewed by 2018
Abstract
Multi-modal systems extract information about the environment using specialized sensors that are optimized based on the wavelength of the phenomenology and material interactions. To maximize the entropy, complementary systems operating in regions of non-overlapping wavelengths are optimal. VIS-IR (Visible-Infrared) systems have been at the forefront of multi-modal fusion research and are used extensively to represent information in all-day, all-weather applications. Prior to image fusion, the image pairs have to be properly registered and mapped to a common resolution palette. However, due to differences in the device physics of image capture, information from VIS-IR sensors cannot be directly correlated, which is a major bottleneck for this area of research. In the absence of camera metadata, image registration is performed manually, which is not practical for large datasets. Most of the work published in this area assumes calibrated sensors and the availability of camera metadata providing registered image pairs, which limits the generalization capability of these systems. In this work, we propose a novel end-to-end pipeline termed DeepFusion for image registration and fusion. Firstly, we design a recursive crop-and-scale wavelet spectral decomposition (WSD) algorithm for automatically extracting the patch of visible data representing the thermal information. After data extraction, both images are registered to a common resolution palette and forwarded to the DNN for image fusion. The fusion performance of the proposed pipeline is compared and quantified against state-of-the-art classical and DNN architectures on open-source and custom datasets, demonstrating the efficacy of the pipeline. Furthermore, we also propose a novel keypoint-based metric for quantifying the quality of fused output. Full article
(This article belongs to the Section Physical Sensors)
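Wavelet-decomposition fusion of the kind the abstract builds on splits each image into approximation and detail sub-bands, fuses them with separate rules, and inverts the transform. A minimal single-level Haar sketch of that idea; the averaging and max-absolute fusion rules are common illustrative choices, not the DeepFusion pipeline itself:

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar wavelet transform (image sides must be even)."""
    a = (img[:, 0::2] + img[:, 1::2]) / 2.0   # row averages
    d = (img[:, 0::2] - img[:, 1::2]) / 2.0   # row details
    ll = (a[0::2] + a[1::2]) / 2.0            # approximation
    lh = (a[0::2] - a[1::2]) / 2.0            # horizontal detail
    hl = (d[0::2] + d[1::2]) / 2.0            # vertical detail
    hh = (d[0::2] - d[1::2]) / 2.0            # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exactly invert haar_dwt2."""
    rows, cols = ll.shape[0] * 2, ll.shape[1]
    a = np.empty((rows, cols)); a[0::2], a[1::2] = ll + lh, ll - lh
    d = np.empty((rows, cols)); d[0::2], d[1::2] = hl + hh, hl - hh
    img = np.empty((rows, cols * 2))
    img[:, 0::2], img[:, 1::2] = a + d, a - d
    return img

def fuse(vis, ir):
    """Fuse two registered, same-size images in the wavelet domain."""
    v, i = haar_dwt2(vis), haar_dwt2(ir)
    fused = [(v[0] + i[0]) / 2.0]             # average the approximations
    for dv, di in zip(v[1:], i[1:]):          # keep the stronger detail
        fused.append(np.where(np.abs(dv) >= np.abs(di), dv, di))
    return haar_idwt2(*fused)
```

Because the Haar transform here is exactly invertible, fusing an image with itself reconstructs it unchanged, which makes the rules easy to sanity-check before swapping in a learned fusion network.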
