Search Results (194)

Search Parameters:
Keywords = scale-invariant feature transformation (SIFT)

21 pages, 3448 KiB  
Article
A Welding Defect Detection Model Based on Hybrid-Enhanced Multi-Granularity Spatiotemporal Representation Learning
by Chenbo Shi, Shaojia Yan, Lei Wang, Changsheng Zhu, Yue Yu, Xiangteng Zang, Aiping Liu, Chun Zhang and Xiaobing Feng
Sensors 2025, 25(15), 4656; https://doi.org/10.3390/s25154656 - 27 Jul 2025
Abstract
Real-time quality monitoring using molten pool images is a critical focus in researching high-quality, intelligent automated welding. To address interference problems in molten pool images under complex welding scenarios (e.g., reflected laser spots from spatter misclassified as porosity defects) and the limited interpretability of deep learning models, this paper proposes a multi-granularity spatiotemporal representation learning algorithm based on the hybrid enhancement of handcrafted and deep learning features. A MobileNetV2 backbone network integrated with a Temporal Shift Module (TSM) is designed to progressively capture the short-term dynamic features of the molten pool and integrate temporal information across both low-level and high-level features. A multi-granularity attention-based feature aggregation module is developed to select key interference-free frames using cross-frame attention, generate multi-granularity features via grouped pooling, and apply the Convolutional Block Attention Module (CBAM) at each granularity level. Finally, these multi-granularity spatiotemporal features are adaptively fused. Meanwhile, an independent branch utilizes the Histogram of Oriented Gradient (HOG) and Scale-Invariant Feature Transform (SIFT) features to extract long-term spatial structural information from historical edge images, enhancing the model’s interpretability. The proposed method achieves an accuracy of 99.187% on a self-constructed dataset. Additionally, it attains a real-time inference speed of 20.983 ms per sample on a hardware platform equipped with an Intel i9-12900H CPU and an RTX 3060 GPU, thus effectively balancing accuracy, speed, and interpretability. Full article
(This article belongs to the Topic Applied Computing and Machine Intelligence (ACMI))
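The Temporal Shift Module referenced above mixes information across frames by displacing a fraction of channels in time. A minimal NumPy sketch of the shift operation (the 1/8 channel fold is the commonly cited TSM default, not a detail confirmed by this abstract):

```python
import numpy as np

def temporal_shift(x, shift_div=8):
    """Shift a fraction of channels along the time axis, TSM-style.

    x: (T, C, H, W) clip, e.g. a sequence of molten-pool frames.
    """
    t, c, h, w = x.shape
    fold = c // shift_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                   # these channels look one frame ahead
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]   # these look one frame back
    out[:, 2 * fold:] = x[:, 2 * fold:]              # the rest stay in place
    return out
```

The shift is zero-padded at the clip boundaries, so temporal context is exchanged at no parameter cost before the per-frame convolutions run.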

26 pages, 92114 KiB  
Article
Multi-Modal Remote Sensing Image Registration Method Combining Scale-Invariant Feature Transform with Co-Occurrence Filter and Histogram of Oriented Gradients Features
by Yi Yang, Shuo Liu, Haitao Zhang, Dacheng Li and Ling Ma
Remote Sens. 2025, 17(13), 2246; https://doi.org/10.3390/rs17132246 - 30 Jun 2025
Abstract
Multi-modal remote sensing images often exhibit complex and nonlinear radiation differences which significantly hinder the performance of traditional feature-based image registration methods such as Scale-Invariant Feature Transform (SIFT). In contrast, structural features—such as edges and contours—remain relatively consistent across modalities. To address this challenge, we propose a novel multi-modal image registration method, Cof-SIFT, which integrates a co-occurrence filter with SIFT. By replacing the traditional Gaussian filter with a co-occurrence filter, Cof-SIFT effectively suppresses texture variations while preserving structural information, thereby enhancing robustness to cross-modal differences. To further improve image registration accuracy, we introduce an extended approach, Cof-SIFT_HOG, which extracts Histogram of Oriented Gradients (HOG) features from the image gradient magnitude map of corresponding points and refines their positions based on HOG similarity. This refinement yields more precise alignment between the reference image and the image to be registered. We evaluated Cof-SIFT and Cof-SIFT_HOG on a diverse set of multi-modal remote sensing image pairs. The experimental results demonstrate that both methods outperform existing approaches, including SIFT, COFSM, SAR-SIFT, PSO-SIFT, and OS-SIFT, in terms of robustness and registration accuracy. Notably, Cof-SIFT_HOG achieves the highest overall performance, confirming the effectiveness of the proposed structure-preserving and corresponding point location refinement strategies in cross-modal registration tasks. Full article
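The HOG-based refinement above relies on orientation histograms of gradient magnitudes. A minimal single-cell HOG histogram can be sketched in NumPy (the unsigned 0–180° binning and 9-bin layout follow the common HOG convention; this is an illustration, not the paper's implementation):

```python
import numpy as np

def hog_cell_histogram(patch, n_bins=9):
    """Orientation histogram of one HOG cell, magnitude-weighted and L2-normalised."""
    gy, gx = np.gradient(patch.astype(float))       # image gradients (row, col)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0    # unsigned orientation in [0, 180)
    hist = np.zeros(n_bins)
    bin_width = 180.0 / n_bins
    idx = (ang // bin_width).astype(int) % n_bins
    for b in range(n_bins):
        hist[b] = mag[idx == b].sum()               # accumulate magnitude per bin
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

Comparing such histograms between candidate correspondences gives the similarity score used to nudge point positions.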

27 pages, 86462 KiB  
Article
SAR Image Registration Based on SAR-SIFT and Template Matching
by Shichong Liu, Xiaobo Deng, Chun Liu and Yongchao Cheng
Remote Sens. 2025, 17(13), 2216; https://doi.org/10.3390/rs17132216 - 27 Jun 2025
Abstract
Accurate image registration is essential for synthetic aperture radar (SAR) applications such as change detection, image fusion, and deformation monitoring. However, SAR image registration faces challenges including speckle noise, low-texture regions, and the geometric transformation caused by topographic relief due to side-looking radar imaging. To address these issues, this paper proposes a novel two-stage registration method, consisting of pre-registration and fine registration. In the pre-registration stage, the scale-invariant feature transform for the synthetic aperture radar (SAR-SIFT) algorithm is integrated into an iterative optimization framework to eliminate large-scale geometric discrepancies, ensuring a coarse but reliable initial alignment. In the fine registration stage, a novel similarity measure is introduced by combining frequency-domain phase congruency and spatial-domain gradient features, which enhances the robustness and accuracy of template matching, especially in edge-rich regions. For the topographic relief in the SAR images, an adaptive local stretching transformation strategy is proposed to correct the undulating areas. Experiments on five pairs of SAR images containing flat and undulating regions show that the proposed method achieves initial alignment errors below 10 pixels and final registration errors below 1 pixel. Compared with other methods, our approach obtains more correct matching pairs (up to 100+ per image pair), higher registration precision, and improved robustness under complex terrains. These results validate the accuracy and effectiveness of the proposed registration framework. Full article
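The fine-registration stage described above is a template-matching problem. As a baseline illustration (plain normalized cross-correlation, not the paper's combined phase-congruency and gradient measure), a sliding-window matcher can be sketched as:

```python
import numpy as np

def ncc_match(image, template):
    """Return the (row, col) of the window most correlated with the template."""
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -2.0, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            win = image[r:r + th, c:c + tw]
            w = win - win.mean()
            denom = np.sqrt((w * w).sum() * (t * t).sum())
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best:                  # keep the highest-scoring offset
                best, best_pos = score, (r, c)
    return best_pos, best
```

A perfect match scores 1.0; robust variants replace the intensity windows with structural feature maps so that speckle and radiometric differences matter less.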

19 pages, 8306 KiB  
Article
Plant Sam Gaussian Reconstruction (PSGR): A High-Precision and Accelerated Strategy for Plant 3D Reconstruction
by Jinlong Chen, Yingjie Jiao, Fuqiang Jin, Xingguo Qin, Yi Ning, Minghao Yang and Yongsong Zhan
Electronics 2025, 14(11), 2291; https://doi.org/10.3390/electronics14112291 - 4 Jun 2025
Abstract
Plant 3D reconstruction plays a critical role in precision agriculture and plant growth monitoring, yet it faces challenges such as complex background interference, difficulties in capturing intricate plant structures, and a slow reconstruction speed. In this study, we propose PlantSamGaussianReconstruction (PSGR), a novel method that integrates Grounding SAM with 3D Gaussian Splatting (3DGS) techniques. PSGR employs Grounding DINO and SAM for accurate plant–background segmentation, utilizes algorithms such as Scale-Invariant Feature Transform (SIFT) for camera pose estimation and sparse point cloud generation, and leverages 3DGS for plant reconstruction. Furthermore, a 3D–2D projection-guided optimization strategy is introduced to enhance segmentation precision. Experimental results on various multi-view plant image datasets demonstrate that PSGR effectively removes background noise under diverse environments, accurately captures plant details, and achieves peak signal-to-noise ratio (PSNR) values exceeding 30 in most scenarios, outperforming the original 3DGS approach. Moreover, PSGR reduces training time by up to 26.9%, significantly improving reconstruction efficiency. These results suggest that PSGR is an efficient, scalable, and high-precision solution for plant modeling. Full article
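The PSNR figures quoted above follow the standard definition; a small reference implementation, assuming 8-bit imagery (peak value 255):

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a reconstruction."""
    mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
    if mse == 0:
        return float('inf')                  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```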

16 pages, 9488 KiB  
Article
A Multitask Network for the Diagnosis of Autoimmune Gastritis
by Yuqi Cao, Yining Zhao, Xinao Jin, Jiayuan Zhang, Gangzhi Zhang, Pingjie Huang, Guangxin Zhang and Yuehua Han
J. Imaging 2025, 11(5), 154; https://doi.org/10.3390/jimaging11050154 - 15 May 2025
Abstract
Autoimmune gastritis (AIG) has a strong correlation with gastric neuroendocrine tumors (NETs) and gastric cancer, making its timely and accurate diagnosis crucial for tumor prevention. The endoscopic manifestations of AIG differ from those of gastritis caused by Helicobacter pylori (H. pylori) infection in terms of the affected gastric anatomical regions and the pathological characteristics observed in biopsy samples. Therefore, when diagnosing AIG based on endoscopic images, it is essential not only to distinguish between normal and atrophic gastric mucosa but also to accurately identify the anatomical region in which the atrophic mucosa is located. In this study, we propose a patient-based multitask gastroscopy image classification network that analyzes all images obtained during the endoscopic procedure. First, we employ the Scale-Invariant Feature Transform (SIFT) algorithm for image registration, generating an image similarity matrix. Next, we use a hierarchical clustering algorithm to group images based on this matrix. Finally, we apply the RepLKNet model, which utilizes large-kernel convolution, to each image group to perform two tasks: anatomical region classification and lesion recognition. Our method achieves an accuracy of 93.4 ± 0.5% (95% CI) and a precision of 92.6 ± 0.4% (95% CI) in the anatomical region classification task, which categorizes images into the fundus, body, and antrum. Additionally, it attains an accuracy of 90.2 ± 1.0% (95% CI) and a precision of 90.5 ± 0.8% (95% CI) in the lesion recognition task, which identifies the presence of gastric mucosal atrophic lesions in gastroscopy images. These results demonstrate that the proposed multitask patient-based gastroscopy image analysis method holds significant practical value for advancing computer-aided diagnosis systems for atrophic gastritis and enhancing the diagnostic accuracy and efficiency of AIG. Full article
(This article belongs to the Section Medical Imaging)
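The grouping step described above — hierarchical clustering over a pairwise image-similarity matrix — can be illustrated with a single-linkage, threshold-based sketch (the union-find formulation and the threshold are assumptions for illustration, not details from the paper):

```python
def cluster_by_similarity(sim, threshold):
    """Group items whose pairwise similarity meets the threshold (single linkage)."""
    n = len(sim)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:          # path-compressing union-find root lookup
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:  # similar enough: merge their clusters
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Each resulting group would then be classified as a whole (anatomical region, lesion presence), which is what makes the pipeline patient-based rather than image-based.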

33 pages, 36897 KiB  
Article
Making Images Speak: Human-Inspired Image Description Generation
by Chifaa Sebbane, Ikram Belhajem and Mohammed Rziza
Information 2025, 16(5), 356; https://doi.org/10.3390/info16050356 - 28 Apr 2025
Cited by 1
Abstract
Despite significant advances in deep learning-based image captioning, many state-of-the-art models still struggle to balance visual grounding (i.e., accurate object and scene descriptions) with linguistic coherence (i.e., grammatical fluency and appropriate use of non-visual tokens such as articles and prepositions). To address these limitations, we propose a hybrid image captioning framework that integrates handcrafted and deep visual features. Specifically, we combine local descriptors—Scale-Invariant Feature Transform (SIFT) and Bag of Features (BoF)—with high-level semantic features extracted using ResNet50. This dual representation captures both fine-grained spatial details and contextual semantics. The decoder employs Bahdanau attention refined with an Attention-on-Attention (AoA) mechanism to optimize visual-textual alignment, while GloVe embeddings and a GRU-based sequence model ensure fluent language generation. The proposed system is trained on 200,000 image-caption pairs from the MS COCO train2014 dataset and evaluated on 50,000 held-out MS COCO pairs plus the Flickr8K benchmark. Our model achieves a CIDEr score of 128.3 and a SPICE score of 29.24, reflecting clear improvements over baselines in both semantic precision—particularly for spatial relationships—and grammatical fluency. These results validate that combining classical computer vision techniques with modern attention mechanisms yields more interpretable and linguistically precise captions, addressing key limitations in neural caption generation. Full article
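The decoder's Bahdanau (additive) attention scores candidate visual features with a small feed-forward network before pooling them into a context vector. A minimal NumPy sketch of that scoring-and-pooling step (weight shapes are illustrative; the paper's AoA refinement is omitted):

```python
import numpy as np

def bahdanau_attention(query, keys, Wq, Wk, v):
    """Additive attention: e_i = v . tanh(Wq q + Wk k_i), softmax, weighted sum.

    query: (dq,) decoder state; keys: (n, dk) visual feature vectors.
    """
    scores = np.tanh(query @ Wq + keys @ Wk) @ v   # (n,) unnormalised scores
    weights = np.exp(scores - scores.max())        # stable softmax
    weights /= weights.sum()
    context = weights @ keys                       # attention-pooled feature
    return context, weights
```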

22 pages, 121478 KiB  
Article
Ground-Moving Target Relocation for a Lightweight Unmanned Aerial Vehicle-Borne Radar System Based on Doppler Beam Sharpening Image Registration
by Wencheng Liu, Zhen Chen, Zhiyu Jiang, Yanlei Li, Yunlong Liu, Xiangxi Bu and Xingdong Liang
Electronics 2025, 14(9), 1760; https://doi.org/10.3390/electronics14091760 - 25 Apr 2025
Abstract
With the rapid development of lightweight unmanned aerial vehicles (UAVs), the combination of UAVs and ground-moving target indication (GMTI) radar systems has received great interest. However, because of size, weight, and power (SWaP) limitations, the UAV may not be able to equip a highly accurate inertial navigation system (INS), which leads to reduced accuracy in the moving target relocation. To solve this issue, we propose using an image registration algorithm, which matches a Doppler beam sharpening (DBS) image of detected moving targets to a synthetic aperture radar (SAR) image containing coordinate information. However, when using conventional SAR image registration algorithms such as the SAR scale-invariant feature transform (SIFT) algorithm, additional difficulties arise. To overcome these difficulties, we developed a new image-matching algorithm, which first estimates the errors of the UAV platform to compensate for geometric distortions in the DBS image. In addition, to showcase the relocation improvement achieved with the new algorithm, we compared it with the affine transformation and second-order polynomial algorithms. The findings of simulated and real-world experiments demonstrate that our proposed image transformation method offers better moving target relocation results under low-accuracy INS conditions. Full article
(This article belongs to the Special Issue New Challenges in Remote Sensing Image Processing)
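The affine transformation used as a comparison baseline above maps one image's coordinates to the other's and can be estimated from matched point pairs by least squares. A sketch, assuming at least three non-collinear correspondences:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix A with dst ~ A @ [x, y, 1]^T.

    src, dst: (n, 2) arrays of matched points, n >= 3.
    """
    n = src.shape[0]
    X = np.hstack([src, np.ones((n, 1))])        # homogeneous source coordinates
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)  # solve X @ A = dst
    return A.T                                   # (2, 3) affine matrix
```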

23 pages, 1297 KiB  
Article
Multi-Granularity and Multi-Modal Feature Fusion for Indoor Positioning
by Lijuan Ye, Yi Wang, Shenglei Pei, Yu Wang, Hong Zhao and Shi Dong
Symmetry 2025, 17(4), 597; https://doi.org/10.3390/sym17040597 - 15 Apr 2025
Abstract
Despite the widespread adoption of indoor positioning technology, the existing solutions still face significant challenges. On one hand, Wi-Fi-based positioning struggles to balance accuracy and efficiency in complex indoor environments and architectural layouts formed by pre-existing access points (APs). On the other hand, vision-based methods, while offering high-precision potential, are hindered by prohibitive costs associated with binocular camera systems required for depth image acquisition, limiting their large-scale deployment. Additionally, channel state information (CSI), containing multi-subcarrier data, maintains amplitude symmetry in ideal free-space conditions but becomes susceptible to periodic positioning errors in real environments due to multipath interference. Meanwhile, image-based positioning often suffers from spatial ambiguity in texture-repeated areas. To address these challenges, we propose a novel hybrid indoor positioning method that integrates multi-granularity and multi-modal features. By fusing CSI data with visual information, the system leverages spatial consistency constraints from images to mitigate CSI error fluctuations while utilizing CSI’s global stability to correct local ambiguities in image-based positioning. In the initial coarse-grained positioning phase, a neural network model is trained using image data to roughly localize indoor scenes. This model adeptly captures the geometric relationships within images, providing a foundation for more precise localization in subsequent stages. In the fine-grained positioning stage, CSI features from Wi-Fi signals and Scale-Invariant Feature Transform (SIFT) features from image data are fused, creating a rich feature fusion fingerprint library that enables high-precision positioning. The experimental results show that our proposed method synergistically combines the strengths of Wi-Fi fingerprints and visual positioning, resulting in a substantial enhancement in positioning accuracy. Specifically, our approach achieves an accuracy of 0.4 m for 45% of positioning points and 0.8 m for 67% of points. Overall, this approach charts a promising path forward for advancing indoor positioning technology. Full article
(This article belongs to the Section Mathematics)
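Fingerprint-library positioning of the kind described above is commonly resolved by a weighted k-nearest-neighbour lookup over the stored feature vectors; a sketch of that final step (the inverse-distance weighting is an assumption for illustration, not the paper's exact matcher):

```python
import numpy as np

def knn_locate(fingerprints, positions, query, k=3):
    """Estimate a position as the inverse-distance-weighted mean of the k
    nearest fingerprints.

    fingerprints: (n, d) stored feature vectors; positions: (n, 2) coordinates.
    """
    d = np.linalg.norm(fingerprints - query, axis=1)
    idx = np.argsort(d)[:k]                       # k closest fingerprints
    w = 1.0 / (d[idx] + 1e-9)                     # closer fingerprints weigh more
    return (w[:, None] * positions[idx]).sum(0) / w.sum()
```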

30 pages, 33973 KiB  
Article
Research on Rapid and Accurate 3D Reconstruction Algorithms Based on Multi-View Images
by Lihong Yang, Hang Ge, Zhiqiang Yang, Jia He, Lei Gong, Wanjun Wang, Yao Li, Liguo Wang and Zhili Chen
Appl. Sci. 2025, 15(8), 4088; https://doi.org/10.3390/app15084088 - 8 Apr 2025
Abstract
Three-dimensional reconstruction entails the development of mathematical models of three-dimensional objects that are suitable for computational representation and processing. This technique constructs realistic 3D models of images and has significant practical applications across various fields. This study proposes a rapid and precise multi-view 3D reconstruction method to address the challenges of low reconstruction efficiency and inadequate, poor-quality point cloud generation in incremental structure-from-motion (SFM) algorithms in multi-view geometry. The methodology involves capturing a series of overlapping images of a campus. We employed the Scale-Invariant Feature Transform (SIFT) algorithm to extract feature points from each image, applied the KD-Tree algorithm for inter-image matching, and enhanced autonomous threshold adjustment by utilizing the Random Sample Consensus (RANSAC) algorithm to eliminate mismatches, thereby improving feature-matching accuracy and the number of matched point pairs. Additionally, we developed a feature-matching strategy based on similarity, which optimizes the pairwise matching process within the incremental structure-from-motion algorithm. This approach decreased the number of matches and enhanced both algorithmic efficiency and model reconstruction accuracy. For dense reconstruction, we utilized the patch-based multi-view stereo (PMVS) algorithm, which is based on facets. The results indicate that our proposed method achieves a higher number of reconstructed feature points and significantly enhances algorithmic efficiency by approximately ten times compared to the original incremental reconstruction algorithm. Consequently, the generated point cloud data are more detailed, and the textures are clearer, demonstrating that our method is an effective solution for three-dimensional reconstruction. Full article
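The RANSAC mismatch-elimination step mentioned above repeatedly fits a minimal model to random samples and keeps the hypothesis with the most inliers. A line-fitting sketch of the idea (the paper applies it to feature correspondences rather than lines; iteration count and tolerance here are illustrative):

```python
import numpy as np

def ransac_line(points, n_iter=200, tol=0.1, seed=0):
    """Fit y = a x + b robustly: sample point pairs, keep the most-supported model."""
    rng = np.random.default_rng(seed)
    best_inliers, best_model = 0, (0.0, 0.0)
    for _ in range(n_iter):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue                                  # vertical sample: skip
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        resid = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = int((resid < tol).sum())            # support for this hypothesis
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (a, b)
    return best_model, best_inliers
```

Outliers (here, gross mismatches) never accumulate support, so the consensus model ignores them.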

20 pages, 4789 KiB  
Communication
Fast Registration Algorithm for Laser Point Cloud Based on 3D-SIFT Features
by Lihong Yang, Shunqin Xu, Zhiqiang Yang, Jia He, Lei Gong, Wanjun Wang, Yao Li, Liguo Wang and Zhili Chen
Sensors 2025, 25(3), 628; https://doi.org/10.3390/s25030628 - 22 Jan 2025
Cited by 1
Abstract
In response to the issues of slow convergence and the tendency to fall into local optima in traditional iterative closest point (ICP) point cloud registration algorithms, this study presents a fast registration algorithm for laser point clouds based on 3D scale-invariant feature transform (3D-SIFT) feature extraction. First, feature points are preliminarily extracted using a normal vector threshold; then, more high-quality feature points are extracted using the 3D-SIFT algorithm, effectively reducing the number of points used for registration. Based on the extracted feature points, a coarse registration of the point cloud is performed using the fast point feature histogram (FPFH) descriptor combined with the sample consensus initial alignment (SAC-IA) algorithm, followed by fine registration using the point-to-plane ICP algorithm with a symmetric target function. The experimental results show that this algorithm significantly improved the registration efficiency. Compared with the traditional SAC-IA + ICP algorithm, the registration accuracy of this algorithm increased by 29.55% in experiments on a public dataset, and the registration time was reduced by 81.01%. In experiments on actual collected data, the registration accuracy increased by 41.72%, and the registration time was reduced by 67.65%. The algorithm presented in this paper maintains a high registration accuracy while greatly reducing the registration time. Full article
(This article belongs to the Section Sensing and Imaging)
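Once correspondences are fixed, each ICP-style iteration solves for the rigid transform in closed form. A Kabsch/SVD sketch of that inner point-to-point step (not the paper's symmetric point-to-plane variant):

```python
import numpy as np

def kabsch(P, Q):
    """Best-fit rotation R and translation t with Q ~ P @ R.T + t (rows = points)."""
    cp, cq = P.mean(0), Q.mean(0)
    H = (P - cp).T @ (Q - cq)                     # cross-covariance of centred clouds
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0] * (P.shape[1] - 1) + [d])   # guard against reflections
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

ICP alternates this closed-form solve with re-matching nearest neighbours until the alignment stops improving.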

18 pages, 17735 KiB  
Article
Toward Efficient Edge Detection: A Novel Optimization Method Based on Integral Image Technology and Canny Edge Detection
by Yanqin Li and Dehai Zhang
Processes 2025, 13(2), 293; https://doi.org/10.3390/pr13020293 - 21 Jan 2025
Cited by 3
Abstract
The traditional SIFT (Scale Invariant Feature Transform) registration algorithm is highly regarded in the field of image processing due to its scale invariance, rotation invariance, and robustness to noise. However, it faces challenges such as a large number of feature points, high computational demand, and poor real-time performance when dealing with large-scale images. A novel optimization method based on integral image technology and Canny edge detection is presented in this paper, aiming to maintain the core advantages of the SIFT algorithm while reducing the complexity involved in image registration computations, enhancing the efficiency of the algorithm for real-time image processing, and better adapting to the needs of large-scale image handling. Firstly, Gaussian separation techniques were used to simplify Gaussian filtering, followed by the application of integral image techniques to accelerate the construction of the entire pyramid. Additionally, during the feature point detection phase, an innovative feature point filtering strategy was introduced by combining Canny edge detection with dilation operations alongside the traditional SIFT approach, aiming to reduce the number of feature points and thereby lessen the computational load. The method proposed in this paper takes 0.0134 s for Image type a, 0.0504 s for Image type b, and 0.0212 s for Image type c. In contrast, the traditional method takes 0.1452 s for Image type a, 0.5276 s for Image type b, and 0.2717 s for Image type c, resulting in reductions of 0.1318 s, 0.4772 s, and 0.2505 s, respectively. A series of comparative experiments showed that the time taken to construct the Gaussian pyramid using our proposed method was consistently lower than that required by the traditional method, indicating greater efficiency and stability regardless of image size or type. Full article
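An integral image (summed-area table) turns any box-filter sum into four table lookups, which is the mechanism behind the accelerated pyramid construction described above. A minimal sketch:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero top row / left column for easy indexing."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(0).cumsum(1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) from four table lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```

Because the cost of a box sum no longer depends on the window size, large Gaussian-approximating filters at coarse pyramid levels become as cheap as small ones.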

20 pages, 7090 KiB  
Article
An Infrared and Visible Image Alignment Method Based on Gradient Distribution Properties and Scale-Invariant Features in Electric Power Scenes
by Lin Zhu, Yuxing Mao, Chunxu Chen and Lanjia Ning
J. Imaging 2025, 11(1), 23; https://doi.org/10.3390/jimaging11010023 - 13 Jan 2025
Abstract
In grid intelligent inspection systems, automatic registration of infrared and visible light images in power scenes is a crucial research technology. Since there are obvious differences in key attributes between visible and infrared images, direct alignment often fails to achieve the expected results. To overcome the high difficulty of aligning infrared and visible light images, an image alignment method is proposed in this paper. First, we use the Sobel operator to extract the edge information of the image pair. Second, the feature points in the edges are recognised by a curvature scale space (CSS) corner detector. Third, the Histogram of Oriented Gradients (HOG) is extracted as the gradient distribution characteristics of the feature points, which are normalised with the Scale Invariant Feature Transform (SIFT) algorithm to form feature descriptors. Finally, initial matching and accurate matching are achieved by the improved fast approximate nearest-neighbour matching method and adaptive thresholding, respectively. Experiments show that this method can robustly match the feature points of image pairs under rotation, scale, and viewpoint differences, and achieves excellent matching results. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
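Initial descriptor matching of the kind described above is typically a nearest-neighbour search gated by Lowe's ratio test; a sketch (the 0.8 ratio is a conventional default, not the paper's adaptive threshold):

```python
import numpy as np

def ratio_test_match(desc1, desc2, ratio=0.8):
    """Lowe-style matching: keep a match only when the best distance is clearly
    smaller than the second best, rejecting ambiguous candidates."""
    matches = []
    for i, d in enumerate(desc1):
        dist = np.linalg.norm(desc2 - d, axis=1)   # distances to all candidates
        order = np.argsort(dist)
        if dist[order[0]] < ratio * dist[order[1]]:
            matches.append((i, int(order[0])))
    return matches
```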

27 pages, 20664 KiB  
Article
Dual-Vehicle Heterogeneous Collaborative Scheme with Image-Aided Inertial Navigation
by Zi-Ming Wang, Chun-Liang Lin, Chian-Yu Lu, Po-Chun Wu and Yang-Yi Chen
Aerospace 2025, 12(1), 39; https://doi.org/10.3390/aerospace12010039 - 10 Jan 2025
Abstract
The Global Positioning System (GPS) has revolutionized navigation in modern society. However, the susceptibility of GPS signals to interference and obstruction poses significant navigational challenges. This paper introduces a GPS-denied method based on scene image coordinates instead of real-time GPS signals. Our approach harnesses advanced image feature-recognition techniques, employing an enhanced scale-invariant feature transform algorithm and a neural network model. The recognition of prominent scene features is prioritized, thus improving recognition speed and precision. The GPS coordinates are extracted from the best-matching image by juxtaposing recognized features from the pre-established image database. A Kalman filter facilitates the fusion of these coordinates with inertial measurement unit data. Furthermore, ground scene recognition cooperates with its aerial counterpart to overcome specific challenges. This innovative idea enables heterogeneous collaboration by employing coordinate conversion formulas, effectively substituting traditional GPS signals. Potential applications of the proposed scheme include military missions, rescue operations, and commercial services. Full article
(This article belongs to the Special Issue New Trends in Aviation Development 2024–2025)
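The Kalman fusion step above can be illustrated in its simplest scalar form: a constant-state filter blending a stream of noisy position fixes (the process and measurement noise values are placeholders, not parameters from the paper):

```python
import numpy as np

def kalman_1d(z, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter: constant-state model, process noise q, measurement noise r."""
    x, p, out = x0, p0, []
    for meas in z:
        p = p + q                      # predict: state unchanged, uncertainty grows
        k = p / (p + r)                # Kalman gain
        x = x + k * (meas - x)         # update toward the measurement
        p = (1 - k) * p
        out.append(x)
    return np.array(out)
```

In the paper's setting the measurement would be the image-derived coordinate and the prediction would come from the inertial measurement unit; the gain automatically balances trust between the two.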

17 pages, 4232 KiB  
Article
Real-Time Automatic Configuration of Brain MRI: A Comparative Study of SIFT Descriptors and YOLO Neural Network
by Rávison Amaral Almeida, Júlio César Porto de Carvalho, Antônio Wilson Vieira, Heveraldo Rodrigues de Oliveira and Marcos F. S. V. D’Angelo
Appl. Sci. 2025, 15(1), 147; https://doi.org/10.3390/app15010147 - 27 Dec 2024
Abstract
This work presents two approaches to image processing in brain magnetic resonance imaging (MRI) to enhance slice planning during examinations. The first approach involves capturing images from the operator’s console during slice planning for two different brain examinations. From these images, Scale-Invariant Feature Transform (SIFT) descriptors are extracted from the regions of interest. These descriptors are then utilized to train and test a model for image matching. The second approach introduces a novel method based on the YOLO (You Only Look Once) neural network, which is designed to automatically align and orient cutting planes. Both methods aim to automate and assist operators in decision making during MRI slice planning, thereby reducing human dependency and improving examination accuracy. The SIFT-based method demonstrated satisfactory results, meeting the necessary requirements for accurate brain examinations. Meanwhile, the YOLO-based method provides a more advanced and automated solution to detect and align structures in brain MRI images. These two distinct approaches are intended to be compared, highlighting their respective strengths and weaknesses in the context of brain MRI slice planning. Full article

20 pages, 8861 KiB  
Article
An Improved Registration Method for UAV-Based Linear Variable Filter Hyperspectral Data
by Xiao Wang, Chunyao Yu, Xiaohong Zhang, Xue Liu, Yinxing Zhang, Junyong Fang and Qing Xiao
Remote Sens. 2025, 17(1), 55; https://doi.org/10.3390/rs17010055 - 27 Dec 2024
Abstract
Linear Variable Filter (LVF) hyperspectral cameras possess the advantages of high spectral resolution, compact size, and light weight, making them highly suitable for unmanned aerial vehicle (UAV) platforms. However, challenges arise in data registration due to the imaging characteristics of LVF data and the instability of UAV platforms. These challenges stem from the diversity of LVF data bands and significant inter-band differences. Even after geometric processing, adjacent flight lines still exhibit varying degrees of geometric deformation. In this paper, a progressive grouping-based strategy for iterative band selection and registration is proposed. In addition, an improved Scale-Invariant Feature Transform (SIFT) algorithm, termed the Double Sufficiency–SIFT (DS-SIFT) algorithm, is introduced. This method first groups bands, selects the optimal reference band, and performs coarse registration based on the SIFT method. Subsequently, during the fine registration stage, it introduces an improved position/scale/orientation joint SIFT registration algorithm (IPSO-SIFT) that integrates partitioning and the principle of structural similarity. This algorithm iteratively refines registration based on the grouping results. Experimental data obtained from a self-developed and integrated LVF hyperspectral remote sensing system are utilized to verify the effectiveness of the proposed algorithm. A comparison with classical algorithms, such as SIFT and PSO-SIFT, demonstrates that the registration of LVF hyperspectral data using the proposed method achieves superior accuracy and efficiency. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
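The structural-similarity principle invoked in the fine-registration stage can be illustrated with the global (single-window) SSIM formula; production SSIM is computed over local sliding windows, so this is only a sketch:

```python
import numpy as np

def ssim_global(x, y, L=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM: compares luminance, contrast, and structure of x and y."""
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2        # stabilising constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images score exactly 1; registration refinement would maximise this score over candidate band alignments.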
