Search Results (5,026)

Search Parameters:
Keywords = image matching

22 pages, 3386 KB  
Article
UAV Visual Localization via Multimodal Fusion and Multi-Scale Attention Enhancement
by Yiheng Wang, Yushuai Zhang, Zhenyu Wang, Jianxin Guo, Feng Wang, Rui Zhu and Dejing Lin
Sustainability 2026, 18(9), 4277; https://doi.org/10.3390/su18094277 (registering DOI) - 25 Apr 2026
Abstract
For power-grid applications such as transmission corridor inspection, substation asset inspection, and post-disaster emergency repair, reliable UAV self-localization under GNSS-degraded or GNSS-denied conditions is critical to ensuring operational safety and accurate defect geotagging. Due to substantial discrepancies in viewpoint, scale, and geometric structure between oblique UAV images and nadir satellite images, conventional RGB-based cross-view retrieval methods often suffer from unstable alignment and insufficient geometric modeling, particularly in scenarios with repetitive textures and partial overlap. To address these challenges, we propose a cross-view visual geo-localization model that integrates RGBD multimodal inputs with multi-scale attention enhancement. Specifically, MiDaS is used to estimate relative depth from UAV imagery, which is concatenated with RGB to form a four-channel input, while satellite images are padded with an additional zero channel to maintain dimensional consistency. A shared-weight ViTAdapter is adopted to learn joint semantic–geometric representations, and a lightweight Efficient Multi-scale Attention (EMA) module is adopted on spatial feature maps to strengthen multi-scale spatial consistency. In addition, an IoU-weighted InfoNCE loss is employed to accommodate partial matching during training, thereby improving the robustness of feature alignment. Experiments on the GTA-UAV dataset under the cross-area protocol show stable performance across both retrieval and localization metrics. Specifically, Recall@1, Recall@5, and Recall@10 reach 18.12%, 38.83%, and 49.47%, respectively; AP is 28.01 and SDM@3 is 0.53; meanwhile, the top-1 geodesic distance error Dis@1 is 1052.73 m. These results indicate that explicit geometric priors combined with multi-scale spatial enhancement can effectively improve cross-view feature alignment, leading to enhanced robustness and accuracy for localization in challenging power inspection scenarios. 
Full article
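The IoU-weighted InfoNCE loss mentioned in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the similarity matrix, temperature, and the exact weighting scheme (per-pair ground-overlap IoU scaling the positive term) are assumptions for illustration; the real model computes similarities from learned RGBD embeddings.

```python
import math

def iou_weighted_infonce(sim, iou, tau=0.1):
    """Toy IoU-weighted InfoNCE over a precomputed similarity matrix.

    sim[i][j]: similarity between UAV query i and satellite candidate j,
    where j == i is the paired (partially overlapping) reference tile.
    iou[i]: ground-overlap IoU of pair (i, i), used here to down-weight
    pairs with little true overlap (illustrative weighting scheme).
    """
    n = len(sim)
    total, weight_sum = 0.0, 0.0
    for i in range(n):
        logits = [s / tau for s in sim[i]]
        m = max(logits)  # subtract max for numerical stability
        denom = sum(math.exp(l - m) for l in logits)
        log_prob = (logits[i] - m) - math.log(denom)
        total += -iou[i] * log_prob
        weight_sum += iou[i]
    return total / weight_sum  # IoU-normalized mean loss
```

Raising the similarity of the true pairs drives the loss toward zero, while low-IoU pairs contribute less to the gradient signal.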

28 pages, 33073 KB  
Article
Pedestrian Localization Using Smartphone LiDAR in Indoor Environments
by Jaehun Kim and Kwangjae Sung
Electronics 2026, 15(9), 1810; https://doi.org/10.3390/electronics15091810 - 24 Apr 2026
Abstract
Many place recognition approaches, which identify previously visited places or locations by matching current sensory data, such as 2D RGB images and 3D point clouds, have been proposed to achieve accurate and robust localization and loop closure detection in global positioning system (GPS)-denied environments. Since visual place recognition (VPR) methods that rely on images captured by camera sensors are highly sensitive to variations in appearance, including changes in lighting, surface color, and shadows, they can lead to poor place recognition accuracy. In contrast, light detection and ranging (LiDAR)-based place recognition (LPR) approaches based on 3D point cloud data that captures the shape and geometric structure of the environment are robust to changes in place appearance and can therefore provide more reliable place recognition results than VPR methods. This work presents an indoor LPR method called PointNetVLAD-based indoor pedestrian localization (PIPL). PIPL is a deep network model that uses PointNetVLAD to learn to extract global descriptors from 3D LiDAR point cloud data. PIPL can recognize places previously visited by a pedestrian using point clouds captured by a low-cost LiDAR sensor on a smartphone in small-scale indoor environments, while PointNetVLAD performs place recognition for vehicles using high-cost LiDAR, GPS, and inertial measurement unit (IMU) sensors in large-scale outdoor areas. For place recognition on 3D point cloud reference maps generated from LiDAR scans, PointNetVLAD exploits the universal transverse mercator (UTM) coordinate system based on GPS and IMU measurements, whereas PIPL uses a virtual coordinate system designed in this study due to the unavailability of GPS indoors. In experiments conducted in campus buildings, PIPL shows significant advantages over NetVLAD (known as a convolutional neural network (CNN)-based VPR method). 
Particularly in indoor environments with repetitive scenes where geometric structures are preserved and image-based appearance features are sparse or unclear, PIPL achieved 39% higher top-1 accuracy and 10% higher top-3 accuracy compared to NetVLAD. Furthermore, PIPL achieved place recognition accuracy comparable to NetVLAD even with a small number of points in a 3D point cloud and outperformed NetVLAD even with a smaller model training dataset. The experimental results also indicate that PIPL requires over 76% less place retrieval time than NetVLAD while maintaining robust place classification performance. Full article
(This article belongs to the Special Issue Advanced Indoor Localization Technologies: From Theory to Application)
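Once PIPL (or NetVLAD) has produced a global descriptor per place, recognition reduces to nearest-neighbor retrieval over the reference map. A minimal sketch of top-k retrieval by cosine similarity, with made-up place ids and toy 3-dimensional descriptors (real descriptors are high-dimensional network outputs):

```python
import math

def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k_places(query, reference_map, k=3):
    """Rank reference places by descriptor similarity to the query.

    reference_map: dict place_id -> global descriptor (list of floats).
    Returns the k most similar place ids, best first.
    """
    scored = sorted(reference_map.items(),
                    key=lambda kv: cosine(query, kv[1]),
                    reverse=True)
    return [pid for pid, _ in scored[:k]]
```

Top-1 and top-3 accuracy, as reported in the abstract, are then just the fraction of queries whose true place appears at rank 1 or within the first 3 ranks.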
14 pages, 1169 KB  
Article
Assessing the Relationship Between Volumetric Changes and Functional Connectivity in Patients with Mild Cognitive Impairment
by Weronika Machaj, Przemyslaw Podgorski, Julian Maciaszek, Dorota Szczesniak, Joanna Rymaszewska, Patryk Piotrowski and Anna Zimny
J. Clin. Med. 2026, 15(9), 3229; https://doi.org/10.3390/jcm15093229 - 23 Apr 2026
Abstract
Background: Amnestic mild cognitive impairment (aMCI) is considered a transitional state between normal aging and dementia, often without visible abnormalities on standard brain magnetic resonance (MR) images. The aim of the study was to analyze both microstructural and functional brain abnormalities using advanced MR techniques. Methods: The study included 27 patients with aMCI and an age-matched control group (CG) of 25 healthy subjects. All MR studies were performed on a 3T MR scanner (Philips, Ingenia) with a 32-channel head and neck coil using volumetric 3D T1 sequences, followed by a resting-state functional MRI (rs-fMRI) sequence. Volumetric analysis was performed using the Destrieux atlas to assess potential structural differences between groups. Seed-to-voxel functional connectivity analyses were conducted using the bilateral hippocampi and both anterior and posterior divisions of the parahippocampal gyri as seed regions. Results: Compared to healthy controls, reduced cortical thickness was observed in aMCI subjects in the temporal regions, frontal and orbitofrontal areas, limbic areas, parietal and sensorimotor cortices, as well as occipito-temporal regions. Additionally, significantly increased functional connectivity was observed between bilateral medial temporal lobe (MTL) regions and the right thalamus. Conclusions: Cortical thinning in various brain regions along with the increased functional connectivity between the MTL regions and the right thalamus may reflect potential compensatory mechanisms in response to initial subtle degenerative changes, emphasizing the importance of using both functional and structural imaging to detect early changes in aMCI patients. Full article

26 pages, 1490 KB  
Systematic Review
Object Detection in Optical Remote Sensing Images: A Systematic Review of Methods, Benchmarks, and Operational Applications
by Neus Fontanet Garcia and Piero Boccardo
Remote Sens. 2026, 18(9), 1289; https://doi.org/10.3390/rs18091289 - 23 Apr 2026
Abstract
Object detection in optical remote sensing imagery has emerged as a crucial task in computer vision, with applications ranging from environmental monitoring to disaster management, precision agriculture, and urban planning. This review systematically examines current methodologies, categorising them into four principal approaches: (1) template matching-based methods, which leverage predefined patterns for object identification; (2) knowledge-based methods, which incorporate geometric and contextual information to enhance detection accuracy; (3) object-based image analysis (OBIA), which segments images into meaningful objects using spectral and spatial properties; and (4) machine learning-based methods, particularly deep convolutional neural networks (CNNs), which have revolutionised the field through automatic feature learning. Each methodology’s performance characteristics, computational requirements, and suitability for different remote sensing applications are analysed. Our systematic review, following PRISMA guidelines, analysed 189 studies published from 2010 to 2025, of which 73 provided quantitative results on standard benchmarks. The three most critical challenges identified are as follows: (1) the annotation bottleneck, as dense bounding box labelling of remote sensing imagery remains highly labour-intensive for deep learning approaches; (2) extreme scale variation spanning 2–3 orders of magnitude within single scenes; and (3) domain adaptation failures when models encounter new geographic regions or sensor characteristics. This review identifies critical research gaps and proposes prioritised future directions, emphasising foundation models for zero-shot detection, efficient architectures for resource-constrained deployment, and standardised benchmarks with size-specific metrics.
The analysis provides practitioners with evidence-based decision frameworks for method selection and researchers with a roadmap for advancing object detection in remote sensing applications. Full article
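Of the four method families surveyed, template matching is the simplest to illustrate. A 1-D zero-normalized cross-correlation sketch (a real detector slides a 2-D template over the image; the score below is the same quantity in one dimension, with made-up signals):

```python
import math

def ncc(window, template):
    """Zero-normalized cross-correlation of two equal-length sequences."""
    n = len(template)
    mw = sum(window) / n
    mt = sum(template) / n
    num = sum((w - mw) * (t - mt) for w, t in zip(window, template))
    den = math.sqrt(sum((w - mw) ** 2 for w in window)
                    * sum((t - mt) ** 2 for t in template))
    return num / den if den else 0.0

def best_match(signal, template):
    """Slide the template over the signal; return offset of the best NCC score."""
    scores = [ncc(signal[i:i + len(template)], template)
              for i in range(len(signal) - len(template) + 1)]
    return max(range(len(scores)), key=lambda i: scores[i])
```

Because the score is normalized by local mean and variance, it tolerates brightness and contrast shifts, which is precisely why template matching breaks down under the scale variation and viewpoint changes the review highlights.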
25 pages, 1701 KB  
Article
Concrete Crack Detection in Extremely Dark Environments Based on Infrared-Visible Multi-Level Registration Fusion and Frequency Decoupling
by Zixiang Li, Weishuai Xie and Bingquan Xiang
Sensors 2026, 26(9), 2612; https://doi.org/10.3390/s26092612 - 23 Apr 2026
Abstract
To address the issues of difficult heterogeneous image registration and low segmentation accuracy caused by the severe lack of illumination and significant modal differences in concrete cracks in extremely dark environments, this paper proposes a two-stage processing framework of registration–fusion first, and decoupling–segmentation later. In the registration and fusion stage, a registration algorithm based on morphological priors and multi-level quadtree spatial constraints is designed. This approach transforms the problem from pixel grayscale matching to spatial topological matching, achieving a feature fusion of high infrared saliency and high visible light sharpness. In the segmentation stage, a Latent Frequency-Decoupled Topological Network (LFDT-Net) is proposed. It utilizes Discrete Wavelet Transform (DWT) to achieve high-fidelity frequency decoupling of the low-frequency infrared backbone and the high-frequency visible light edges. Furthermore, a Cross-Frequency Guidance Module is utilized to eliminate double-edged artifacts, and a skeleton-aware topological loss function is introduced to constrain the topological integrity of the cracks. Experimental results on a self-built heterogeneous multi-modal crack dataset demonstrate that the proposed method significantly outperforms existing mainstream methods in registration accuracy, fusion quality, and segmentation accuracy. Achieving a mean Intersection over Union (mIoU) of 81.7%, the method effectively suppresses background noise in dark environments and precisely restores the microscopic edges and continuous topological structures of faint cracks. Full article
(This article belongs to the Special Issue AI-Based Visual Sensing for Object Detection)
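The frequency decoupling in LFDT-Net rests on the discrete wavelet transform. A single-level 1-D Haar DWT sketch conveys the underlying idea (the paper applies a 2-D DWT to images; this simplified version is for illustration only):

```python
import math

def haar_dwt_1d(signal):
    """One level of the 1-D Haar discrete wavelet transform.

    Returns (approximation, detail): low-frequency averages and
    high-frequency differences, each half the input length.
    Input length must be even.
    """
    s = math.sqrt(2.0)
    approx = [(signal[2 * i] + signal[2 * i + 1]) / s
              for i in range(len(signal) // 2)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / s
              for i in range(len(signal) // 2)]
    return approx, detail
```

The low-frequency band (analogous to the infrared backbone in the paper) carries coarse structure, while the high-frequency band (analogous to visible-light edges) carries sharp transitions; the transform is orthonormal, so the signal energy is preserved across the two bands.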
22 pages, 6548 KB  
Article
A Hybrid Lung and Colon Histopathological Image Classification Framework Using MobileNetV3-Small Deep Features and Differential Evolution Optimization
by Muhammad Usama Naveed, Sohail Jabbar, Muhammad Munwar Iqbal, Awais Ahmad, Ibrahim S. Alkhazi and Mansoor Alghamdi
Diagnostics 2026, 16(9), 1256; https://doi.org/10.3390/diagnostics16091256 - 22 Apr 2026
Abstract
Background/Objectives: Cancer remains one of the leading causes of mortality worldwide, with lung and colon cancers among the most prevalent. Conventional histopathological diagnosis is time-consuming, requires expert pathologists, and is susceptible to human error. Methods: To address these limitations, this study proposes an automated classification framework for lung and colon cancer using histopathological images. The proposed method employs a lightweight pretrained deep learning model, MobileNetV3-Small, through transfer learning. Training is performed on an enhanced version of the LC25000 dataset, in which redundant image patches are removed to improve robustness and clinical generalizability. The images were initially available in multiple resolutions and were resized to 224 × 224 × 3 to match the canonical input size of MobileNetV3-Small. Deep features are extracted from the dropout layer, as it provides a regularized representation of high-level features by reducing overfitting (dimension N × 1024); these features are then optimized using a differential evolution algorithm, reducing the feature space to N × 60. The optimized features are evaluated using multiple classifiers. Results: Experimental results demonstrate a maximum classification accuracy of 98.14% using a Quadratic Support Vector Machine (SVM) and a 21.3× speed-up achieved with bagged trees, outperforming several state-of-the-art approaches and representing a 3.34% improvement over the baseline study on the enhanced dataset. Conclusions: The results confirm that the proposed framework effectively balances high accuracy with computational efficiency. The use of a lightweight deep model combined with feature optimization makes the approach well-suited for practical clinical environments. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
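The differential evolution step can be sketched as the classic DE/rand/1/bin loop. This is a generic minimizer under assumed hyperparameters (F, CR, population size); the paper's actual fitness function over feature subsets is not reproduced here:

```python
import random

def differential_evolution(fitness, dim, bounds=(0.0, 1.0),
                           pop_size=20, F=0.8, CR=0.9,
                           generations=100, seed=0):
    """Minimize `fitness` with the classic DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # mutate: combine three distinct individuals other than i
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantee at least one crossover
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial.append(min(hi, max(lo, v)))  # clip to bounds
                else:
                    trial.append(pop[i][j])
            s = fitness(trial)
            if s <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]
```

In a feature-selection setting, the candidate vector would encode feature weights or a soft mask over the N × 1024 features, with classifier validation error as the fitness.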

60 pages, 7000 KB  
Article
Biometric Embedded Non-Blind Color Image Watermarking with Geometric Tamper Resistance via SIFT-ORB Keypoint Matching
by Swapnaneel Dhar, Riyanka Manna, Khaldi Amine and Aditya Kumar Sahu
Computers 2026, 15(5), 264; https://doi.org/10.3390/computers15050264 - 22 Apr 2026
Abstract
This work introduces a non-blind watermarking framework for color images to address tamper detection, particularly under geometric transformations. The proposed scheme fuses two watermarks, a personal signature and a biometric fingerprint, into a unified composite watermark embedded into the chrominance component of the cover image using a multi-level transform domain approach, discrete wavelet transforms (DWTs), discrete cosine transforms (DCTs), and singular value decomposition (SVD). By leveraging the rotation-invariant properties of scale-invariant feature transform (SIFT) and oriented FAST and rotated BRIEF (ORB) descriptors, the framework ensures robust tamper detection without requiring alignment, thus mitigating the limitations of conventional detection techniques vulnerable to transformation-induced tamper obfuscation (TITO). Extensive experimentation demonstrates that the method maintains high perceptual fidelity, achieving PSNR values ranging from 50 to 55 dB for embedding strength factor μ (0.01–0.04) and SSIM indices near 1 across multiple benchmark images. Furthermore, the scheme exhibits notable resilience to a range of image processing attacks and geometric distortion. Comparative evaluation reveals its superiority over existing grayscale, color, SIFT-based and DWT-DCT-SVD-based watermarking techniques, affirming its applicability in scenarios demanding secure, imperceptible, and transformation-invariant image watermarking. Full article
25 pages, 19124 KB  
Article
Multi-Scale Fractional-Order Image Fusion Algorithm Based on Polarization Spectral Images
by Zhenduo Zhang, Xueying Cao and Zhen Wang
Appl. Sci. 2026, 16(9), 4087; https://doi.org/10.3390/app16094087 - 22 Apr 2026
Abstract
With the continuous advancement of polarization spectral sensing technology, multi-band polarization image fusion has emerged as a novel approach to image fusion. By integrating spectral and polarization information, this method overcomes the limitations of relying on a single information source and significantly improves overall image quality. Building on this approach, this paper proposes a new polarization spectral fusion algorithm. First, feature matching is employed to achieve pixel-level spatial alignment of multi-band polarization images. Then, a fusion strategy based on multi-scale decomposition and singular value decomposition is adopted to preserve structural information and fine details. Subsequently, fractional-order processing and guided filtering are applied to enhance details and suppress noise. Finally, a progressive reconstruction from low to high scales is performed to ensure hierarchical consistency and information integrity throughout the fusion process. In addition, spectral information is utilized for color restoration, enabling the final image to achieve high spatial resolution while maintaining natural and rich color representation. Experimental results demonstrate that the proposed method effectively integrates features from different spectral bands and polarization information while preserving maximum similarity, leading to significant improvements in both image quality and detail representation. Full article
17 pages, 5236 KB  
Article
Two Non-Learning Filters for the Enhancement of Images Obtained from a Fluorescence Imaging System, a Near-Infrared Camera, and Low-Light Condition
by Jun Hong, Xi He, Haoru Ning, Zhonghuan Su, Ling Zhang, Yingcheng Lin and Ye Wu
Electronics 2026, 15(9), 1777; https://doi.org/10.3390/electronics15091777 - 22 Apr 2026
Abstract
Images obtained from imaging instruments can suffer from issues such as severe degradation, color distortion, and weak brightness. Effective systems for enhancing these images are critically required. To improve image quality, herein, we propose two filters based on simple functions, including cosine, sine, hyperbolic secant, and the inverse of hyperbolic cosecant. These filters are used to enhance images obtained from a fluorescence imaging system, a near-infrared camera, and under low-light conditions. The contrast is increased while the image quality is improved. They perform better than a matched filter. Moreover, the combination of our filters with the filter based on the watershed algorithm or the matched filter can be used to extract marginal features from images captured in underwater environments. Furthermore, their application in image fusion is explored. Our designed filters may potentially be used in future applications such as target identification and tracking. Full article
21 pages, 8256 KB  
Article
SemGeoFrame: A Visual Matching Framework for Aircraft Based on Surface Semantic Information
by Zhaoyun Luo, Yanfei Liu, Chen Liu, Min Kong, Dongfang Yang, Maoan Zhou and Cong An
Remote Sens. 2026, 18(9), 1267; https://doi.org/10.3390/rs18091267 - 22 Apr 2026
Abstract
In GNSS-denied environments, UAV visual positioning faces the critical bottleneck of low matching accuracy between heterogeneous images. To address this, we propose SemGeoFrame, a visual matching framework that leverages surface semantic information to enhance robustness. The key innovations are threefold: First, we construct a semantic prior from the probability distributions of image semantic segmentation and design a consistency screening mechanism based on Jensen–Shannon divergence to eliminate false matches by leveraging pixel-level semantic consistency for cross-view image matching. Second, a confidence-guided partition sampling strategy ensures balanced distribution of matches in both spatial and semantic categories, overcoming the limitations of conventional spatial-only sampling. Third, geometric, semantic, and confidence constraints are jointly optimized to achieve robust homography estimation. SemGeoFrame adopts a plug-and-play design and consistently improves the performance of mainstream matching algorithms (e.g., ORB, SuperPoint, LoFTR) on multiple heterogeneous datasets. The experimental results demonstrate that our framework significantly enhances matching accuracy and robustness across diverse scenarios. Full article
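The Jensen–Shannon consistency screening in SemGeoFrame can be illustrated with a small helper: compare the class-probability vectors of matched pixels from the two views and reject correspondences whose divergence is too high. The threshold and example distributions below are assumptions for illustration, not the paper's values:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def screen_matches(matches, threshold=0.2):
    """Keep only correspondences whose semantic distributions agree.

    matches: list of (p_uav, p_sat) class-probability vector pairs.
    Returns the indices of matches that pass the screening.
    """
    return [i for i, (p, q) in enumerate(matches)
            if js_divergence(p, q) <= threshold]
```

Unlike raw KL divergence, the JS divergence is symmetric and bounded in [0, 1] (base 2), which makes a fixed rejection threshold meaningful across match pairs.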
12 pages, 2265 KB  
Article
Optimizing Reconstruction Parameters for Detecting Peripheral In-Stent Restenosis with Photon-Counting Detector CT: A Phantom Study
by Yiheng Tan, Joost F. Hop, Magdalena Dobrolinska, Xinlin Zheng, Evie J. I. Hoeijmakers, Jean-Paul P. M. de Vries, Marcel J. W. Greuter and Reinoud P. H. Bokkers
Diagnostics 2026, 16(9), 1253; https://doi.org/10.3390/diagnostics16091253 - 22 Apr 2026
Abstract
Background/Objectives: To determine the optimal reconstruction parameters for accurate visualization of peripheral in-stent restenosis using photon-counting detector CT (PCD-CT), and to evaluate its potential advantages over energy-integrated detector CT (EID-CT). Methods: Endovascular peripheral stents with varying degrees of in-stent restenosis were scanned in a custom-made phantom using EID-CT (Somatom Force) and PCD-CT (Naeotom Alpha) under clinical acquisition protocols. EID-CT images were reconstructed with Bv40 and Bv59 kernels at 512 matrices. PCD-CT data were acquired in standard-resolution (SR) and ultra-high-resolution (UHR) modes. In both modes, images were reconstructed with multiple kernels (Bv40, Bv56 and Bv72) and matrix sizes (512 and 1024 matrix). In SR mode, additional virtual monoenergetic images (40–100 keV) were generated, while UHR mode included only polychromatic reconstructions. Quantitative image quality (noise, contrast, contrast-to-noise ratio [CNR]) was measured, and two blinded readers performed qualitative assessments of restenosis visualization. Results: PCD-CT with SR mode at VMI 40 keV achieved the highest image contrast and CNR, significantly outperforming EID-CT and PCD-CTUHR under matched conditions (all p < 0.05). The sharper reconstruction kernel further enhanced the image contrast and improved subjective visualization despite increased image noise. Both readers ranked PCD-CTSR-Bv72-40keV at 1024 matrix highest for detecting all degrees of restenosis, with excellent inter-reader agreement (ρ > 0.80). Conclusions: PCD-CT in SR mode at VMI 40 keV, specifically using the Bv72 kernel with a 1024 matrix, optimizes the visualization of peripheral in-stent restenosis. Compared to EID-CT, PCD-CT provides superior image quality and detectability of restenosis. Full article
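The contrast-to-noise ratio used in the quantitative assessment follows the standard definition. A minimal sketch, assuming CNR = |mean(lumen ROI) − mean(background ROI)| / std(background ROI), which is one common convention (the study's exact ROI placement and noise definition may differ):

```python
import math

def cnr(roi_lumen, roi_background):
    """Contrast-to-noise ratio between two regions of interest (HU samples).

    CNR = |mean(lumen) - mean(background)| / std(background).
    """
    def mean(xs):
        return sum(xs) / len(xs)
    mu_l, mu_b = mean(roi_lumen), mean(roi_background)
    var_b = sum((x - mu_b) ** 2 for x in roi_background) / len(roi_background)
    return abs(mu_l - mu_b) / math.sqrt(var_b)
```

This makes the trade-off in the abstract concrete: a sharper kernel raises both contrast (numerator) and noise (denominator), so CNR only improves when contrast grows faster than noise, as reported for the 40 keV VMI reconstructions.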

12 pages, 4476 KB  
Article
Broadband Polarization-Insensitive Tunable Terahertz Metamaterial Absorber Based on an Asymmetric Graphene Structure
by Ahmed Ali, Sulaiman Al-Sowayan, Waleed Shihzad, Asrafali Barkathulla, Zaid Ahmed Shamsan, Majeed A. S. Alkanhal and Yosef T. Aladadi
Nanomaterials 2026, 16(9), 502; https://doi.org/10.3390/nano16090502 - 22 Apr 2026
Abstract
A graphene-based tunable broad-band terahertz (THz) metamaterial absorber is presented, exhibiting strong and stable absorption across a wide frequency range. The device employs an ultra-thin three-layer structure consisting of a metallic reflector, a dielectric spacer, and a patterned graphene metasurface with an asymmetric geometry. Through optimized structural parameters, the absorber achieves broad-band absorption exceeding 90% between 2.45 THz and 6.11 THz with a bandwidth of 3.66 THz, featuring three distinct resonant frequencies at 2.764 THz, 3.534 THz, and 5.41 THz, corresponding to peak absorption efficiencies of 97.26%, 96.96%, and 99.90%, respectively. Impedance matching and electric field analyses confirm that the enhanced absorption arises from the strong coupling of electric and magnetic resonances within the multilayer structure. Moreover, the absorber exhibits polarization-insensitive behavior under varying polarization angles and maintains high absorption stability for both TE and TM modes up to an incident angle of 60°, as verified by simulation results, and allows dynamic tunability through Fermi-level modulation. These characteristics highlight the absorber’s potential for advanced THz imaging, sensing, and stealth applications. Full article

19 pages, 378 KB  
Article
Mislabel Detection in Multi-Label Chest X-Rays via Prototype-Weighted Neighborhood Consistency in CoAtNet Embedding Space
by Ariel Gamboa, Mauricio Araya and Camilo Sotomayor
Appl. Sci. 2026, 16(9), 4067; https://doi.org/10.3390/app16094067 - 22 Apr 2026
Abstract
Large-scale chest X-ray (CXR) datasets often rely on report-derived or weak labels, introducing missing and incorrect annotations that can degrade downstream models and limit trust. We study training-free mislabel detection in multi-label CXRs by scoring neighborhood label consistency in a fixed embedding space. Using the NIH Chest X-ray Kaggle sample (5606 CXRs), we extract intermediate CoAtNet features and obtain 64-dimensional embeddings with a frozen CoAtNet backbone and a lightweight refinement head. On top of these embeddings, we compare kNN consistency baselines with distance weighting and label-set similarity against LPV-DW-CS, clustered prototype voting weighted by distance and cluster support. We evaluate three synthetic label-noise regimes with review budgets matched to the corruption rate: random single-label (5% and 20%), boundary-noise (20% corruption within the lowest-density 20% subset), and disjoint-label replacement (20% within that subset). LPV-DW-CS yields the highest downstream macro-AUROC after filtering top-ranked samples (up to 0.8860), while kNN variants achieve higher Recall@budget at the same review rates (up to 99.44%). An image-only expert Likert review of top-ranked real samples finds substantial label-set inconsistencies (54.1% for LPV-DW-CS-280-A; 60.5% for KNN-DW-LSS), supporting neighborhood-consistency ranking as a practical, training-free tool for targeted dataset auditing. Full article
(This article belongs to the Special Issue Computer-Vision-Based Biomedical Image Processing)
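The kNN consistency scoring with distance weighting and label-set similarity can be sketched as follows. Jaccard similarity over label sets and inverse-distance weighting are assumptions here; the paper's exact similarity and weighting functions may differ:

```python
import math

def knn_consistency(query_labels, query_emb, neighbors, k=3, eps=1e-6):
    """Score how consistent a sample's label set is with its k nearest neighbors.

    neighbors: list of (embedding, label_set) pairs.
    Returns a value in [0, 1]; low scores flag likely mislabeled samples.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    def jaccard(a, b):
        if not a and not b:
            return 1.0
        return len(a & b) / len(a | b)
    ranked = sorted(neighbors, key=lambda nb: dist(query_emb, nb[0]))[:k]
    weights = [1.0 / (dist(query_emb, e) + eps) for e, _ in ranked]
    sims = [jaccard(query_labels, ls) for _, ls in ranked]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)
```

Ranking the dataset by ascending score and reviewing the lowest-scoring samples up to the budget is the training-free auditing loop the abstract describes; the prototype-voting variant replaces individual neighbors with cluster prototypes.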
23 pages, 5106 KB  
Article
A Multidimensional Framework for Analyzing Image–Text Consistency in Social Media
by Hongqi Xia, Zhijie Zhao, Binbin Zhao, Hong Lan, Han Wu, Xujing Jing and Yanrong Zhang
Appl. Sci. 2026, 16(8), 4044; https://doi.org/10.3390/app16084044 - 21 Apr 2026
Abstract
As image–text posts have become a dominant form of social media communication, understanding how the two modalities jointly convey meaning remains a key challenge in multimodal analysis. This study aims to examine whether image–text consistency is inherently multidimensional rather than reducible to a single similarity metric. Existing studies often reduce consistency to a single relevance score, which cannot capture semantic, emotional, and functional interactions. We construct a dataset of 28,650 multimodal posts and model image–text relationships along three dimensions: semantic consistency (CSC), emotional consistency (CEC), and informational matching consistency (IMC). Semantic and emotional alignment are measured using cross-modal representation and similarity computation, while IMC is defined through rule-based classification of informational roles. Results show that emotional consistency (CEC = 0.621) is higher than semantic consistency (CSC = 0.549, p < 0.001), while 61.0% of posts maintain consistent informational orientation. These findings demonstrate that image–text consistency exhibits distinct cross-dimensional patterns that cannot be captured by single-metric approaches.
(This article belongs to the Section Computing and Artificial Intelligence)
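The three-dimensional consistency profile described above can be sketched as: cosine similarity between cross-modal embeddings for the semantic and emotional dimensions, and a rule-based role match for the informational dimension. This is a hedged sketch, not the authors' pipeline; the function names, the specific embedding inputs, and the role-comparison rule are illustrative assumptions.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def consistency_profile(img_sem, txt_sem, img_emo, txt_emo, img_role, txt_role):
    # Hypothetical per-post profile over the three consistency dimensions.
    return {
        "CSC": cosine(img_sem, txt_sem),  # semantic consistency
        "CEC": cosine(img_emo, txt_emo),  # emotional consistency
        "IMC": img_role == txt_role,      # rule-based informational match
    }
```

Averaging CSC and CEC over a corpus, and taking the fraction of posts with IMC true, would yield corpus-level statistics of the kind reported in the abstract.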
22 pages, 45694 KB  
Article
Visual Localization for Deep-Sea Mining Vehicles During Operation
by Yangrui Cheng, Bingkun Wang, Xiaojun Zhuo, Kai Liu and Yingjie Guan
J. Mar. Sci. Eng. 2026, 14(8), 759; https://doi.org/10.3390/jmse14080759 - 21 Apr 2026
Abstract
Deep-sea mining operations demand continuous, drift-free positioning over multi-day missions—a requirement that traditional acoustic dead-reckoning systems struggle to meet due to error accumulation and frequent DVL bottom-lock loss in sediment plume environments. Inspired by Google Cartographer’s 2D grid mapping paradigm, we present a prior map-based visual localization framework that decouples offline mapping from real-time localization, fundamentally eliminating drift through absolute image registration against pre-built seabed mosaics. By integrating adaptive keyframe selection, Multi-Scale Retinex (MSR) enhancement, and the AD-LG deep feature matching architecture, our system constructs globally consistent seabed maps for absolute positioning. The framework leverages deformable convolutions and LightGlue to effectively mitigate challenges such as low texture and non-rigid distortion. Quantitative validation on tank simulation datasets demonstrates significant superiority over IMU-only and standard fusion schemes; qualitative deployment on real Pacific CCZ imagery confirms near-real-time operational feasibility on an embedded Jetson Orin NX platform. This system establishes visual navigation as a viable backup to acoustic systems, addressing a critical gap in deep-sea mining vehicle autonomy.
(This article belongs to the Special Issue Advances in Underwater Positioning and Navigation Technology)
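The Multi-Scale Retinex (MSR) enhancement step mentioned above can be sketched as: subtracting the log of a Gaussian-smoothed illumination estimate from the log image at several scales, then averaging, which suppresses non-uniform seabed lighting before feature matching. This is a minimal sketch of standard MSR, not the paper's exact implementation; the sigma values and the final min-max rescaling are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multi_scale_retinex(img, sigmas=(15, 80, 250), eps=1e-6):
    # img: 2-D float array in [0, 1]; returns an illumination-normalized image.
    img = img.astype(np.float64) + eps
    log_i = np.log(img)
    msr = np.zeros_like(img)
    for s in sigmas:
        # Gaussian blur at scale s approximates the illumination component.
        msr += log_i - np.log(gaussian_filter(img, sigma=s) + eps)
    msr /= len(sigmas)
    # Rescale to [0, 1] for display and downstream feature extraction.
    return (msr - msr.min()) / (msr.max() - msr.min() + eps)
```

In a pipeline like the one described, the enhanced frames would then feed the deep feature matcher for registration against the pre-built mosaic.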