MDPI - Publisher of Open Access Journals

22 pages, 7096 KB

Open Access Article

An Improved ORB-KNN-Ratio Test Algorithm for Robust Underwater Image Stitching on Low-Cost Robotic Platforms

by Guanhua Yi, Tianxiang Zhang, Yunfei Chen and Dapeng Yu

J. Mar. Sci. Eng. 2026, 14(2), 218; https://doi.org/10.3390/jmse14020218 - 21 Jan 2026

Viewed by 671

Underwater optical images often exhibit severe color distortion, weak texture, and uneven illumination due to light absorption and scattering in water. These issues result in unstable feature detection and inaccurate image registration. To address these challenges, this paper proposes an underwater image stitching method that integrates ORB (Oriented FAST and Rotated BRIEF) feature extraction with a fixed-ratio constraint matching strategy. First, lightweight color and contrast enhancement techniques are employed to restore color balance and improve local texture visibility. Then, ORB descriptors are extracted and matched via a K-nearest-neighbor (KNN) search, and Lowe’s ratio test is applied to eliminate false matches caused by weak texture similarity. Finally, the geometric transformation between image frames is estimated with robust optimization, ensuring stable homography computation. Experimental results on real underwater datasets show that the proposed method significantly improves stitching continuity and structural consistency, achieving 40–120% improvements in SSIM (Structural Similarity Index) and PSNR (peak signal-to-noise ratio) over conventional Harris–ORB + KNN, SIFT (scale-invariant feature transform) + BF (brute force), SIFT + KNN, and AKAZE (accelerated KAZE) + BF methods while maintaining processing times within one second. These results indicate that the proposed method is well-suited for real-time underwater environment perception and panoramic mapping on low-cost, micro-sized underwater robotic platforms.

(This article belongs to the Section Ocean Engineering)
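
The matching step named in the abstract above (a 2-nearest-neighbour search followed by Lowe's ratio test) can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the function names, the toy 2-byte descriptors, and the 0.75 ratio are assumptions chosen for the example (real ORB descriptors are 32 bytes long and are compared with the Hamming distance).

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def ratio_test_matches(query, train, ratio=0.75):
    """Return (query_idx, train_idx) pairs that pass Lowe's ratio test.

    For each query descriptor, find its two nearest train descriptors (k=2)
    and keep the match only if best_distance < ratio * second_best_distance.
    The ratio value 0.75 is an assumed default, not taken from the paper.
    """
    matches = []
    for qi, q in enumerate(query):
        # Brute-force 2-nearest-neighbour search by Hamming distance.
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # the ratio test needs two neighbours
        (d1, best), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches


# Toy 2-byte descriptors (hypothetical data for illustration).
query = [bytes([0b11110000, 0b00001111]), bytes([0b10101010, 0b10101010])]
train = [
    bytes([0b11110000, 0b00001110]),  # 1 bit away from query[0]
    bytes([0b00001111, 0b11110000]),  # 16 bits away from query[0]
    bytes([0b11111111, 0b00000000]),  # 8 bits away from both queries
]
print(ratio_test_matches(query, train))
```

In this toy input the first query has one clearly closest neighbour and is kept, while the second query is almost equally far from all candidates and is rejected as ambiguous, which is the behaviour the ratio test is meant to provide in weakly textured scenes.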

23 pages, 3212 KB

Open Access Article

AKAZE-GMS-PROSAC: A New Progressive Framework for Matching Dynamic Characteristics of Flotation Foam

by Zhen Peng, Zhihong Jiang, Pengcheng Zhu, Gaipin Cai and Xiaoyan Luo

J. Imaging 2026, 12(1), 7; https://doi.org/10.3390/jimaging12010007 - 25 Dec 2025

Viewed by 504

Abstract

The dynamic characteristics of flotation foam, such as velocity and breakage rate, are critical factors that influence mineral separation efficiency. However, challenges inherent in foam images, including weak textures, severe deformations, and motion blur, present significant technical hurdles for dynamic monitoring. These issues lead to a fundamental conflict between the efficiency and accuracy of traditional feature matching algorithms. This paper introduces a novel progressive framework for dynamic feature matching in flotation foam images, termed “stable extraction, efficient coarse screening, and precise matching.” This framework first employs the Accelerated-KAZE (AKAZE) algorithm to extract robust, scale- and rotation-invariant feature points from a non-linear scale-space, effectively addressing the challenge of weak textures. Subsequently, it innovatively incorporates the Grid-based Motion Statistics (GMS) algorithm to perform efficient coarse screening based on motion consistency, rapidly filtering out a large number of obvious mismatches. Finally, the Progressive Sample and Consensus (PROSAC) algorithm is used for precise matching, eliminating the remaining subtle mismatches through progressive sampling and geometric constraints. This framework enables the precise analysis of dynamic foam characteristics, including displacement, velocity, and breakage rate (enhanced by a robust “foam lifetime” mechanism). Comparative experimental results demonstrate that, compared to ORB-GMS-RANSAC (with a Mean Absolute Error, MAE of 1.20 pixels and a Mean Relative Error, MRE of 9.10%) and ORB-RANSAC (MAE: 3.53 pixels, MRE: 27.36%), the proposed framework achieves significantly lower error rates (MAE: 0.23 pixels, MRE: 2.13%). It exhibits exceptional stability and accuracy, particularly in complex scenarios involving low texture and minor displacements. 
This research provides a high-precision, high-robustness technical solution for the dynamic monitoring and intelligent control of the flotation process.

(This article belongs to the Section Image and Video Processing)
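
The abstract above compares methods by Mean Absolute Error (MAE, in pixels) and Mean Relative Error (MRE, in percent). A minimal sketch of how such figures are commonly computed follows; the exact definitions used in the paper are not given in the abstract, so the formulas, function names, and sample displacement values here are assumptions.

```python
def mae(estimated, reference):
    """Mean absolute error, in the same unit as the inputs (e.g. pixels)."""
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(reference)


def mre(estimated, reference):
    """Mean relative error in percent; reference values must be non-zero."""
    total = sum(abs(e - r) / abs(r) for e, r in zip(estimated, reference))
    return 100.0 * total / len(reference)


# Hypothetical displacements in pixels: ground truth vs. estimates.
reference = [10.0, 20.0, 40.0]
estimated = [11.0, 19.0, 42.0]

print(f"MAE = {mae(estimated, reference):.2f} px")
print(f"MRE = {mre(estimated, reference):.2f} %")
```

Because MRE divides by the reference value, small true displacements inflate it, which is one reason relative error is informative for the minor-displacement scenarios the abstract mentions.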


24 pages, 248126 KB

Open Access Article

Image Matching for UAV Geolocation: Classical and Deep Learning Approaches

by Fatih Baykal, Mehmet İrfan Gedik, Constantino Carlos Reyes-Aldasoro and Cefa Karabağ

J. Imaging 2025, 11(11), 409; https://doi.org/10.3390/jimaging11110409 - 12 Nov 2025

Cited by 2 | Viewed by 2162

Abstract

Today, unmanned aerial vehicles (UAVs) are heavily dependent on Global Navigation Satellite Systems (GNSSs) for positioning and navigation. However, GNSS signals are vulnerable to jamming and spoofing attacks. This poses serious security risks, especially for military operations and critical civilian missions. In order to solve this problem, an image-based geolocation system has been developed that eliminates GNSS dependency. The proposed system estimates the geographical location of the UAV by matching the aerial images taken by the UAV with previously georeferenced high-resolution satellite images. For this purpose, common visual features were determined between satellite and UAV images and matching operations were carried out using methods based on the homography matrix. Thanks to image processing, a significant relationship has been established between the area where the UAV is located and the geographical coordinates, and reliable positioning is ensured even in cases where GNSS signals cannot be used. Within the scope of the study, traditional methods such as SIFT, AKAZE, and Multiple Template Matching were compared with learning-based methods including SuperPoint, SuperGlue, and LoFTR. The results showed that deep learning-based approaches can make successful matches, especially at high altitudes. Full article

(This article belongs to the Topic Image Processing, Signal Processing and Their Applications)


24 pages, 59144 KB

Open Access Article

EWAM: Scene-Adaptive Infrared-Visible Image Matching with Radiation-Prior Encoding and Learnable Wavelet Edge Enhancement

by Mingwei Li, Hai Tan, Haoran Zhai and Jinlong Ci

Remote Sens. 2025, 17(22), 3666; https://doi.org/10.3390/rs17223666 - 7 Nov 2025

Viewed by 1195

Abstract

Infrared–visible image matching is a prerequisite for environmental monitoring, military reconnaissance, and multisource geospatial analysis. However, pronounced texture disparities, intensity drift, and complex non-linear radiometric distortions in such cross-modal pairs mean that existing frameworks such as SuperPoint + SuperGlue (SP + SG) and LoFTR cannot reliably establish correspondences. To address this issue, we propose a dual-path architecture, the Environment-Adaptive Wavelet Enhancement and Radiation Priors Aided Matcher (EWAM). EWAM incorporates two synergistic branches: (1) an Environment-Adaptive Radiation Feature Extractor, which first classifies the scene according to radiation-intensity variations and then incorporates a physical radiation model into a learnable gating mechanism for selective feature propagation; (2) a Wavelet-Transform High-Frequency Enhancement Module, which recovers blurred edge structures by boosting wavelet coefficients under directional perceptual losses. The two branches collectively increase the number of tie points (reliable correspondences) and refine their spatial localization. A coarse-to-fine matcher subsequently refines the cross-modal correspondences. We benchmarked EWAM against SIFT, AKAZE, D2-Net, SP + SG, and LoFTR on a newly compiled dataset that fuses GF-7, Landsat-8, and Five-Billion-Pixels imagery. Across desert, mountain, gobi, urban and farmland scenes, EWAM reduced the average RMSE to 1.85 pixels and outperformed the best competing method by 2.7%, 2.6%, 2.0%, 2.3% and 1.8% in accuracy, respectively. These findings demonstrate that EWAM yields a robust and scalable framework for large-scale multi-sensor remote-sensing data fusion. Full article


20 pages, 10851 KB

Open Access Article

Evaluating Feature-Based Homography Pipelines for Dual-Camera Registration in Acupoint Annotation

by Thathsara Nanayakkara, Hadi Sedigh Malekroodi, Jaeuk Sul, Chang-Su Na, Myunggi Yi and Byeong-il Lee

J. Imaging 2025, 11(11), 388; https://doi.org/10.3390/jimaging11110388 - 1 Nov 2025

Viewed by 1393

Abstract

Reliable acupoint localization is essential for developing artificial intelligence (AI) and extended reality (XR) tools in traditional Korean medicine; however, conventional annotation of 2D images often suffers from inter- and intra-annotator variability. This study presents a low-cost dual-camera imaging system that fuses infrared (IR) and RGB views on a Raspberry Pi 5 platform, incorporating an IR ink pen in conjunction with a 780 nm emitter array to standardize point visibility. Among the tested marking materials, the IR ink showed the highest contrast and visibility under IR illumination, making it the most suitable for acupoint detection. Five feature detectors (SIFT, ORB, KAZE, AKAZE, and BRISK) were evaluated with two matchers (FLANN and BF) to construct representative homography pipelines. Comparative evaluations across multiple camera-to-surface distances revealed that KAZE + FLANN achieved the lowest mean 2D error (1.17 ± 0.70 px) and the lowest mean aspect-aware error (0.08 ± 0.05%) while remaining computationally feasible on the Raspberry Pi 5. In hand-image experiments across multiple postures, the dual-camera registration maintained a mean 2D error below ~3 px and a mean aspect-aware error below ~0.25%, confirming stable and reproducible performance. The proposed framework provides a practical foundation for generating high-quality acupoint datasets, supporting future AI-based localization, XR integration, and automated acupuncture-education systems. Full article

(This article belongs to the Section Computer Vision and Pattern Recognition)
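
The homography pipelines in the abstract above are scored by a mean 2D error in pixels. As a rough sketch of what such a score involves, the code below maps points through a 3x3 homography and averages the Euclidean distance to their expected positions. The matrix, the point lists, and the function names are hypothetical; the paper's "aspect-aware" error is not reproduced here because its definition is not given in the abstract.

```python
import math


def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)  # divide by w to return to image coordinates


def mean_2d_error(H, src_points, dst_points):
    """Mean Euclidean distance (px) between mapped source and target points."""
    errors = [
        math.dist(apply_homography(H, s), d)
        for s, d in zip(src_points, dst_points)
    ]
    return sum(errors) / len(errors)


# Hypothetical homography: a pure translation by (+5, -2) pixels.
H = [[1.0, 0.0, 5.0],
     [0.0, 1.0, -2.0],
     [0.0, 0.0, 1.0]]
src = [(0.0, 0.0), (10.0, 10.0)]
dst = [(5.0, -2.0), (18.0, 12.0)]  # second target is off by (3, 4) px

print(mean_2d_error(H, src, dst))  # 2.5
```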


19 pages, 5498 KB

Open Access Article

Fast and Accurate Sperm Detection Algorithm for Micro-TESE in NOA Patients

by Mahmoud Mohamed, Konosuke Kachi, Kohei Motoya and Masashi Ikeuchi

Bioengineering 2025, 12(6), 601; https://doi.org/10.3390/bioengineering12060601 - 31 May 2025

Cited by 2 | Viewed by 1791

Abstract

Purpose: Non-obstructive azoospermia (NOA) presents major challenges in assisted reproductive technology (ART) due to the extremely low number of viable sperm within testicular tissue. In Micro-TESE procedures, embryologists manually search for sperm under DIC microscopy—a slow, labor-intensive process. We aim to streamline this process with an efficient computational detection tool. Methods: We present SD-CLIP (Sperm Detection using Classical Image Processing), a lightweight, real-time algorithm that simulates sperm structure detection from unstained DIC images. The model first identifies convex sperm head candidates based on shape and width using edge gradients, then confirms the presence of a tail via principal component analysis (PCA) of pixel clusters. Results: Compared to the MB-LBP + AKAZE method, SD-CLIP improved processing speed by 4× and achieved a 3.8× higher posterior probability ratio, making detected sperm candidates significantly more reliable. Evaluation was performed on both human Micro-TESE and mouse testis images, demonstrating robustness in low-sperm environments. Conclusions: SD-CLIP simulates a domain-specific image interpretation model that identifies sperm morphology with high specificity. It requires minimal computational resources, supports real-time integration, and could be extended to automated sperm extraction systems. This tool has clinical value for accelerating Micro-TESE and increasing success rates in ART for NOA patients. Full article

(This article belongs to the Special Issue The Power of Models and Simulation Tools in Biomedical and Biochemical Engineering)
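
The abstract above says a tail is confirmed via principal component analysis (PCA) of pixel clusters. One plausible reading, sketched below, is that a tail-like cluster is strongly elongated, so the largest eigenvalue of its 2x2 coordinate covariance matrix dominates the smaller one. The elongation criterion, the `min_ratio` threshold, and all function names are assumptions for illustration, not details of SD-CLIP.

```python
import math


def principal_axis(points):
    """Return (lam1, lam2, angle) for the 2x2 covariance of 2D points.

    lam1 >= lam2 are the eigenvalues; angle (radians) is the direction of
    the first principal component, measured from the x axis.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigen-decomposition of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return half_trace + root, half_trace - root, angle


def looks_elongated(points, min_ratio=10.0):
    """True if variance along the main axis dominates the minor axis."""
    lam1, lam2, _ = principal_axis(points)
    return lam1 >= min_ratio * max(lam2, 1e-12)


tail_like = [(float(i), float(i)) for i in range(10)]       # thin diagonal line
blob_like = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]  # compact square

print(looks_elongated(tail_like), looks_elongated(blob_like))  # True False
```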


18 pages, 9335 KB

Open Access Article

Image Matching Algorithm for Transmission Towers Based on CLAHE and Improved RANSAC

by Ruihua Chen, Pan Yao, Shuo Wang, Chuanlong Lyu and Yuge Xu

Designs 2025, 9(3), 67; https://doi.org/10.3390/designs9030067 - 29 May 2025

Cited by 1 | Viewed by 2004

Abstract

To address the lack of robustness against illumination and blurring variations in aerial images of transmission towers, an improved image matching algorithm for aerial images is proposed. The proposed algorithm consists of two main components: an enhanced AKAZE algorithm and an improved three-stage feature matching strategy, which are used for feature point detection and feature matching, respectively. First, the improved AKAZE enhances image contrast using Contrast-Limited Adaptive Histogram Equalization (CLAHE), which highlights target features and improves robustness against environmental interference. Subsequently, the original AKAZE algorithm is employed to detect feature points and construct binary descriptors. Building upon this, an improved three-stage feature matching strategy is proposed to estimate the geometric transformation between image pairs. Specifically, the strategy begins with initial feature matching using the nearest neighbor ratio (NNR) method, followed by outlier rejection via the Grid-based Motion Statistics (GMS) algorithm. Finally, an improved Random Sample Consensus (RANSAC) algorithm computes the transformation matrix, further enhancing matching efficiency. Experimental results demonstrate that the proposed method exceeds the original AKAZE algorithm’s matching accuracy by 4∼15% on different image sets while achieving faster matching speeds. Under real-world conditions with UAV-captured aerial images of transmission towers, the proposed algorithm achieves over 95% matching accuracy, which is higher than other algorithms. Our proposed algorithm enables fast and accurate matching of transmission tower aerial images. Full article

(This article belongs to the Section Electrical Engineering Design)
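
The last stage of the strategy described above uses a Random Sample Consensus (RANSAC) variant to estimate the transformation from the surviving matches. The sketch below shows only the basic RANSAC loop, and for the simplest possible model (a pure 2D translation, which needs one match per sample). It is not the paper's improved RANSAC, and the threshold, iteration count, seed, and sample data are assumptions.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from point matches with basic RANSAC.

    matches: list of ((x1, y1), (x2, y2)) pairs.
    Returns ((dx, dy), inlier_indices) for the model with the most inliers.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Consensus set: matches whose residual is within the threshold.
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if (ax + dx - bx) ** 2 + (ay + dy - by) ** 2 <= threshold ** 2
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (dx, dy), inliers
    return best_model, best_inliers


# Eight matches consistent with a shift of (+7, -3), plus two gross outliers.
matches = [((float(i), float(2 * i)), (i + 7.0, 2 * i - 3.0)) for i in range(8)]
matches += [((0.0, 0.0), (50.0, 50.0)), ((5.0, 5.0), (-40.0, 12.0))]

model, inliers = ransac_translation(matches)
print(model, inliers)
```

A homography needs four matches per sample instead of one, but the hypothesise-and-verify structure is the same, which is why pre-filtering with a ratio test and GMS (fewer outliers, fewer iterations needed) can speed up this stage.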


24 pages, 20197 KB

Open Access Article

Thermal Infrared Orthophoto Geometry Correction Using RGB Orthophoto for Unmanned Aerial Vehicle

by Kirim Lee and Wonhee Lee

Aerospace 2024, 11(10), 817; https://doi.org/10.3390/aerospace11100817 - 6 Oct 2024

Cited by 2 | Viewed by 2453

Abstract

The geometric correction of thermal infrared (TIR) orthophotos generated by unmanned aerial vehicles (UAVs) presents significant challenges due to low resolution and the difficulty of identifying ground control points (GCPs). This study addresses the limitations of real-time kinematic (RTK) UAV data acquisition, such as network instability and the inability to detect GCPs in TIR images, by proposing a method that utilizes RGB orthophotos as a reference for geometric correction. The accelerated-KAZE (AKAZE) method was applied to extract feature points between RGB and TIR orthophotos, integrating binary descriptors and absolute coordinate-based matching techniques. Geometric correction results demonstrated a significant improvement in regions with stable and changing environmental conditions. Invariant regions exhibited an accuracy of 0.7~2 px (0.01~0.04 m), while areas with temporal and spatial changes saw corrections within 5~7 px (0.10~0.14 m). This method reduces reliance on GCP measurements and provides an effective supplementary technique for cases where GCP detection is limited or unavailable. Additionally, this approach enhances time and economic efficiency, offering a reliable alternative for precise orthophoto generation across various sensor data.

(This article belongs to the Special Issue New Trends in Aviation Development 2024–2025)


26 pages, 15055 KB

Open Access Article

Building Better Models: Benchmarking Feature Extraction and Matching for Structure from Motion at Construction Sites

by Carlos Roberto Cueto Zumaya, Iacopo Catalano and Jorge Peña Queralta

Remote Sens. 2024, 16(16), 2974; https://doi.org/10.3390/rs16162974 - 14 Aug 2024

Cited by 4 | Viewed by 6904

Abstract

The popularity of Structure from Motion (SfM) techniques has significantly advanced 3D reconstruction in various domains, including construction site mapping. Central to SfM is the feature extraction and matching process, which identifies and correlates keypoints across images. Previous benchmarks have assessed traditional and learning-based methods for these tasks but have not specifically focused on construction sites, often evaluating isolated components of the SfM pipeline. This study provides a comprehensive evaluation of traditional methods (e.g., SIFT, AKAZE, ORB) and learning-based methods (e.g., D2-Net, DISK, R2D2, SuperPoint, SOSNet) within the SfM pipeline for construction site mapping. It also compares matching techniques, including SuperGlue and LightGlue, against traditional approaches such as nearest neighbor. Our findings demonstrate that deep learning-based methods such as DISK with LightGlue and SuperPoint with various matchers consistently outperform traditional methods like SIFT in both reconstruction quality and computational efficiency. Overall, the deep learning methods exhibited better adaptability to complex construction environments and leveraged modern hardware effectively, highlighting their potential for large-scale and real-time applications in construction site mapping. This benchmark aims to assist researchers in selecting the optimal combination of feature extraction and matching methods for SfM applications at construction sites.

(This article belongs to the Special Issue Remote Sensing for 2D/3D Mapping)


21 pages, 5094 KB

Open Access Article

TQU-SLAM Benchmark Dataset for Comparative Study to Build Visual Odometry Based on Extracted Features from Feature Descriptors and Deep Learning

by Thi-Hao Nguyen, Van-Hung Le, Huu-Son Do, Trung-Hieu Te and Van-Nam Phan

Future Internet 2024, 16(5), 174; https://doi.org/10.3390/fi16050174 - 17 May 2024

Cited by 4 | Viewed by 5116

Abstract

The problem of data enrichment to train visual SLAM and VO construction models using deep learning (DL) is an urgent problem today in computer vision. DL requires a large amount of data to train a model, and more data covering many different contexts and conditions will yield a more accurate visual SLAM and VO construction model. In this paper, we introduce the TQU-SLAM benchmark dataset, which includes 160,631 RGB-D frame pairs. It was collected from the corridors of three interconnected buildings comprising a length of about 230 m. The ground-truth data of the TQU-SLAM benchmark dataset were prepared manually, including 6-DOF camera poses, 3D point cloud data, intrinsic parameters, and the transformation matrix between the camera coordinate system and the real world. We also tested the TQU-SLAM benchmark dataset using the PySLAM framework with traditional features such as SHI_TOMASI, SIFT, SURF, ORB, ORB2, AKAZE, KAZE, and BRISK and features extracted from DL such as VGG, DPVO, and TartanVO. The camera pose estimation results are evaluated, and we show that the ORB2 features have the best results ($Err_{d} = 5.74$ mm), while the ratio of the number of frames with detected keypoints of the SHI_TOMASI feature is the best ($r_{d} = 98.97\%$). At the same time, we also present and analyze the challenges of the TQU-SLAM benchmark dataset for building visual SLAM and VO systems.

(This article belongs to the Special Issue Machine Learning Techniques for Computer Vision)


20 pages, 1827 KB

Open Access Article

Efficient Crowd Anomaly Detection Using Sparse Feature Tracking and Neural Network

by Sarah Altowairqi, Suhuai Luo, Peter Greer and Shan Chen

Appl. Sci. 2024, 14(9), 3928; https://doi.org/10.3390/app14093928 - 4 May 2024

Cited by 11 | Viewed by 4271

Abstract

Crowd anomaly detection is crucial in enhancing surveillance and crowd management. This paper proposes an efficient approach that combines spatial and temporal visual descriptors, sparse feature tracking, and neural networks for crowd anomaly detection. The proposed approach utilises diverse local feature extraction methods, including SIFT, FAST, and AKAZE, with a sparse feature tracking technique to ensure accurate and consistent tracking. Delaunay triangulation is employed to represent the spatial distribution of features in an efficient way. Visual descriptors are categorised into individual behaviour descriptors and interactive descriptors to capture the temporal and spatial characteristics of crowd dynamics and behaviour, respectively. Neural networks are then utilised to classify these descriptors and pinpoint anomalies, making use of their strong learning capabilities. A significant component of our study is the assessment of how dimensionality reduction methods, particularly autoencoders and PCA, affect the feature set’s performance. This assessment aims to balance computational efficiency and detection accuracy. Tests conducted on benchmark crowd datasets highlight the effectiveness of our method in identifying anomalies. Our approach offers a nuanced understanding of crowd movement and patterns by emphasising both individual and collective characteristics. The visual and local descriptors facilitate high-level analysis by closely relating to semantic information and crowd behaviour. The analysis shows that this approach offers an efficient framework for crowd anomaly detection, contributing to improved crowd management and public safety. The proposed model achieves accuracies of 99.5%, 96.1%, 99.0%, and 88.5% on UMN scenes 1, 2, and 3 and the violence in crowds dataset, respectively.


25 pages, 9712 KB

Open Access Article

Comparative Analysis of Color Space and Channel, Detector, and Descriptor for Feature-Based Image Registration

by Wenan Yuan, Sai Raghavendra Prasad Poosa and Rutger Francisco Dirks

J. Imaging 2024, 10(5), 105; https://doi.org/10.3390/jimaging10050105 - 28 Apr 2024

Cited by 6 | Viewed by 3668

Abstract

The current study aimed to quantify the value of color spaces and channels as a potential superior replacement for standard grayscale images, as well as the relative performance of open-source detectors and descriptors for general feature-based image registration purposes, based on a large benchmark dataset. The public dataset UDIS-D, with 1106 diverse image pairs, was selected. In total, 21 color spaces or channels including RGB, XYZ, Y′CrCb, HLS, L*a*b* and their corresponding channels in addition to grayscale, nine feature detectors including AKAZE, BRISK, CSE, FAST, HL, KAZE, ORB, SIFT, and TBMR, and 11 feature descriptors including AKAZE, BB, BRIEF, BRISK, DAISY, FREAK, KAZE, LATCH, ORB, SIFT, and VGG were evaluated according to reprojection error (RE), root mean square error (RMSE), structural similarity index measure (SSIM), registration failure rate, and feature number, based on 1,950,984 image registrations. No meaningful benefits from color space or channel were observed, although XYZ, RGB color space and L* color channel were able to outperform grayscale by a very minor margin. Per the dataset, the best-performing color space or channel, detector, and descriptor were XYZ/RGB, SIFT/FAST, and AKAZE. The most robust color space or channel, detector, and descriptor were L*a*b*, TBMR, and VGG. The color channel, detector, and descriptor with the most initial detector features and final homography features were Z/L*, FAST, and KAZE. In terms of the best overall unfailing combinations, XYZ/RGB+SIFT/FAST+VGG/SIFT seemed to provide the highest image registration quality, while Z+FAST+VGG provided the most image features. Full article

(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)


22 pages, 56589 KB

Open Access Article

Target Search for Joint Local and High-Level Semantic Information Based on Image Preprocessing Enhancement in Indoor Low-Light Environments

by Huapeng Tang, Danyang Qin, Jiaqiang Yang, Haoze Bie, Yue Li, Yong Zhu and Lin Ma

ISPRS Int. J. Geo-Inf. 2023, 12(10), 400; https://doi.org/10.3390/ijgi12100400 - 30 Sep 2023

Cited by 4 | Viewed by 2286

Abstract

In indoor low-light environments, the lack of light makes the captured images often suffer from quality degradation problems, including missing features in dark areas, noise interference, low brightness, and low contrast. Therefore, the feature extraction algorithms are unable to extract the feature information contained in the images accurately, thereby hindering the subsequent target search task in this environment and making it difficult to determine the location information of the target. Aiming at this problem, a joint local and high-level semantic information (JLHS) target search method is proposed based on joint bilateral filtering and camera response model (JBCRM) image preprocessing enhancement. The JBCRM method improves the image quality by highlighting the dark region features and removing the noise interference in order to solve the problem of the difficult extraction of feature points in low-light images, thus providing better visual data for subsequent target search tasks. The JLHS method increases the feature matching accuracy between the target image and the offline database image by combining local and high-level semantic information to characterize the image content, thereby boosting the accuracy of the target search. Experiments show that, compared with the existing image-enhancement methods, the PSNR of the JBCRM method is increased by 34.24% at the highest and 2.61% at the lowest. The SSIM increased by 63.64% at most and increased by 12.50% at least. The Laplacian operator increased by 54.47% at most and 3.49% at least. When the mainstream feature extraction techniques, SIFT, ORB, AKAZE, and BRISK, are utilized, the number of feature points in the JBCRM-enhanced images are improved by a minimum of 20.51% and a maximum of 303.44% over the original low-light images. Compared with other target search methods, the average search error of the JLHS method is only 9.8 cm, which is 91.90% lower than the histogram-based search method. 
Meanwhile, the average search error is reduced by 18.33% compared to the VGG16-based target search method. As a result, the method proposed in this paper significantly improves the accuracy of the target search in low-light environments, thus broadening the application scenarios of target search in indoor environments, and providing an effective solution for accurately determining the location of the target in geospatial space.

(This article belongs to the Special Issue Unlocking the Power of Geospatial Data: Semantic Information Extraction, Ontology Engineering, and Deep Learning for Knowledge Discovery)
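
The abstract above reports enhancement quality as percentage increases in PSNR and SSIM over other methods. As a hedged illustration of the arithmetic behind such figures, the sketch below computes PSNR from the mean squared error of two intensity sequences and a relative increase between two scores. The pixel values and function names are made up for the example, and 8-bit images (peak value 255) are assumed.

```python
import math


def psnr(reference, test, max_value=255.0):
    """PSNR in dB between two equal-length sequences of pixel intensities."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


def percent_increase(baseline, improved):
    """Relative increase of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical flattened 8-bit pixels: each test value is off by exactly 1.
reference = [100, 120, 140, 160]
noisy = [101, 119, 141, 159]

print(f"PSNR = {psnr(reference, noisy):.2f} dB")
print(f"Increase from 20 dB to 25 dB: {percent_increase(20.0, 25.0):.1f} %")
```

Since PSNR is logarithmic, a percentage increase in dB is not a percentage reduction in error, so such figures are best read as relative rankings between methods.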


20 pages, 7517 KB

Open Access Article

Image Copy-Move Forgery Detection Based on Fused Features and Density Clustering

by Guiwei Fu, Yujin Zhang and Yongqi Wang

Appl. Sci. 2023, 13(13), 7528; https://doi.org/10.3390/app13137528 - 26 Jun 2023

Cited by 18 | Viewed by 4521

Abstract

Image copy-move forgery is a common simple tampering technique. To address issues such as high time complexity in most copy-move forgery detection algorithms and difficulty detecting forgeries in smooth regions, this paper proposes an image copy-move forgery detection algorithm based on fused features and density clustering. Firstly, the algorithm combines two detection methods, speeded up robust features (SURF) and accelerated KAZE (A-KAZE), to extract descriptive features by setting a low contrast threshold. Then, the density-based spatial clustering of applications with noise (DBSCAN) algorithm removes mismatched pairs and reduces false positives. To improve the accuracy of forgery localization, the algorithm uses the original image and the image transformed by the affine matrix to compare similarities in the same position in order to locate the forged region. The proposed method was tested on two datasets (Ardizzone and CoMoFoD). The experimental results show that the method effectively improved the accuracy of forgery detection in smooth regions, reduced computational complexity, and exhibited strong robustness against post-processing operations such as rotation, scaling, and noise addition. Full article

(This article belongs to the Special Issue Digital Image Processing: Technologies and Applications)
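
The abstract above uses density-based spatial clustering of applications with noise (DBSCAN) to discard mismatched pairs. A compact, dependency-free sketch of the textbook DBSCAN procedure is given below so the idea of "dense groups are kept, isolated points become noise" is concrete. It clusters plain 2D points; how the paper encodes matched pairs as inputs, and its `eps` and `min_pts` settings, are not stated in the abstract, so those are assumptions here.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within eps of point i (the point itself is included).
        return [j for j, p in enumerate(points) if math.dist(points[i], p) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point: keep expanding
    return labels


# Hypothetical keypoint locations: two dense groups and one stray point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, -1]
```

In a forgery-detection setting, points labelled -1 would correspond to isolated matches that lack spatial support, which is the kind of false positive the clustering step is meant to remove.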


11 pages, 2911 KB

Open Access Article

The Use of a 3D Image Comparison Program for Dental Identification

by Daijiro Kubo, Tomoki Itamiya, Norishige Kawanishi, Noriyuki Hoshi and Katsuhiko Kimoto

Appl. Sci. 2023, 13(13), 7517; https://doi.org/10.3390/app13137517 - 26 Jun 2023

Cited by 2 | Viewed by 3564

Abstract

Dental identification involves compiling a prescribed dental chart of a deceased person’s oral findings which is then compared with antemortem dental information. However, this process is complicated, and a comparison can be difficult. In this study, the authors evaluated whether it is possible to identify images from antemortem dental information images using an image comparison program (AKAZE) with one-sided cross-sectional images generated from the STL (Standard Triangle Language) data of upper and lower jaw models acquired with an intraoral scanner. From the STL data of 20 patients, 120 cross-sectional images were generated by three practitioners and compared with the cross-sectional images of 20 patients generated later, and the degree of agreement calculated by AKAZE was analyzed. Statistically significant differences were found between images of the same and different models, and statistically significant differences were obtained when comparing one-sided images with limited information, suggesting that partial dentition information can be used to identify the same dentition. Full article

(This article belongs to the Special Issue CAD & CAM Dentistry)

Search Results (36)
