Search Results (284)

Search Parameters:
Keywords = multi-focus image fusion

21 pages, 11455 KB  
Article
Cross-Scale Spectral Calibration for Spatiotemporal Fusion of Remote Sensing Images
by Yishuo Tian, Xiaorong Xue, Jingtong Yang, Wen Zhang, Bingyan Lu, Xin Zhao and Wancheng Wang
Sensors 2026, 26(7), 2090; https://doi.org/10.3390/s26072090 - 27 Mar 2026
Viewed by 417
Abstract
Spatiotemporal fusion aims to generate remote sensing images with both high spatial and high temporal resolution by integrating multi-source observations. However, significant spectral inconsistencies often arise when fusing images acquired at different spatial scales, severely degrading the radiometric fidelity and temporal reliability of the fused results. Most existing methods focus on enhancing spatial details or temporal consistency, while the cross-scale spectral discrepancy between coarse- and fine-resolution images has not been sufficiently addressed. To tackle this issue, we propose a cross-scale spectral calibration framework for spatiotemporal fusion (XSC-Net), which explicitly models and corrects spectral responses across different spatial scales. The proposed method introduces a spatial feature refinement block to enhance spatially discriminative structures and a hierarchical spectral refinement block to adaptively calibrate channel-wise spectral representations. By jointly exploiting spatial and spectral correlations, the framework effectively suppresses spectral distortion while preserving fine spatial details. Extensive experiments on the public CIA and LGC datasets show that XSC-Net outperforms established state-of-the-art baselines, and ablation studies verify the efficacy and contribution of the proposed architectural components.
(This article belongs to the Section Remote Sensors)
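
The paper's code is not shown in this listing; as a rough illustration of what "adaptively calibrating channel-wise spectral representations" can look like, here is a minimal squeeze-and-excitation-style gate in PyTorch. The class name, reduction ratio, and band count are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch of channel-wise spectral calibration (not the paper's code).
# A squeeze-and-excitation-style gate rescales each spectral channel adaptively.
import torch
import torch.nn as nn

class SpectralCalibration(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # global statistics per channel
            nn.Conv2d(channels, max(channels // reduction, 1), 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(channels // reduction, 1), channels, 1),
            nn.Sigmoid(),                                 # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)                           # rescale channels to correct spectral response

x = torch.randn(1, 6, 64, 64)                             # e.g., a 6-band coarse-resolution patch
print(SpectralCalibration(6)(x).shape)                    # torch.Size([1, 6, 64, 64])
```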

26 pages, 12081 KB  
Article
DEPART: Multi-Task Interpretable Depression and Parkinson’s Disease Detection from In-the-Wild Video Data
by Elena Ryumina, Alexandr Axyonov, Mikhail Dolgushin, Dmitry Ryumin and Alexey Karpov
Big Data Cogn. Comput. 2026, 10(3), 89; https://doi.org/10.3390/bdcc10030089 - 16 Mar 2026
Viewed by 484
Abstract
Automated video-based detection of cognitive disorders can enable scalable, non-invasive health monitoring. However, existing methods focus on a single disease and provide limited interpretability, whereas real-world videos often contain co-occurring conditions. We propose DEPART (DEpression and PArkinson’s Recognition Technique), a novel unified multi-task method to detect depression and Parkinson’s disease (PD) from in-the-wild video data. It performs body region extraction, Contrastive Language-Image Pre-training (CLIP)-based visual encoding, Transformer-based temporal modeling, and prototype-aware classification with a gated fusion technique. Gradient-based attention maps are used to visualize the task-specific regions that drive predictions. Experiments on the In-the-Wild Speech Medical (WSM) corpus demonstrate competitive performance: the multi-task model achieves a Recall of 82.39% for depression and 78.20% for PD, compared with 87.76% and 78.20% for the best single-task models. Multi-task learning initially increases false positives for healthy persons in the PD subset, mainly due to annotation–modality mismatches, static visual content misinterpreted as motor impairments, and occasional body-detection failures. After cleaning the test data, Recall for healthy individuals becomes comparable across models, and the multi-task model improves Recall for both depression (from 82.39% to 87.50%) and PD (from 78.20% to 86.14%), suggesting better robustness for real-life clinical applications.
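
As a hedged sketch of the gated fusion step the abstract names, the block below mixes two feature streams with a learned sigmoid gate. The dimensions, the semantics of the two streams, and the class name are assumptions for illustration, not the released DEPART code.

```python
# Hypothetical gated fusion layer: a learned gate mixes two feature streams per dimension.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([a, b], dim=-1))          # per-feature mixing weights in (0, 1)
        return g * a + (1 - g) * b                        # convex combination of the two streams

temporal = torch.randn(4, 512)                            # e.g., Transformer-pooled clip features
proto = torch.randn(4, 512)                               # e.g., prototype-aware features
print(GatedFusion(512)(temporal, proto).shape)            # torch.Size([4, 512])
```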

23 pages, 8147 KB  
Article
SDENet: A Novel Approach for Single Image Depth of Field Extension
by Xu Zhang, Miaomiao Wen, Junyang Jia and Yan Liu
Algorithms 2026, 19(3), 216; https://doi.org/10.3390/a19030216 - 13 Mar 2026
Viewed by 291
Abstract
Traditional hardware-based approaches for depth-of-field extension (DOF-E), such as optimized lens design or focus stacking via layer scanning, are often plagued by bulkiness and prohibitive costs. Meanwhile, conventional multi-focus image fusion algorithms demand precise spatial alignment, a challenge that becomes particularly acute in applications like microscopy. To address these limitations, this paper proposes a novel single-image DOF-E method termed SDENet. The method adopts an encoder–decoder architecture enhanced with multi-scale self-attention and depth enhancement modules, enabling the transformation of a single partially focused image into a fully focused output while effectively recovering regions outside the original depth of field (DOF). To support model training and performance evaluation, we introduce a dedicated dataset (MSED) containing 1772 pairs of single-focus and all-focus images covering diverse scenes. Experimental results on multiple datasets verify that SDENet significantly outperforms state-of-the-art deblurring methods, achieving a PSNR of 26.98 dB and an SSIM of 0.846 on the DPDD dataset, a substantial improvement in clarity and visual coherence over existing techniques. Furthermore, SDENet demonstrates performance competitive with multi-image fusion methods while requiring only a single input.
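
For readers unfamiliar with the reported metric, the PSNR figure above follows the standard definition, PSNR = 10 · log10(MAX² / MSE). A minimal sketch of that formula (the paper's exact evaluation script is not shown here):

```python
# Standard PSNR between a reference (all-focus) image and a restored image.
import numpy as np

def psnr(reference: np.ndarray, restored: np.ndarray, max_val: float = 255.0) -> float:
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

ref = np.random.randint(0, 256, (64, 64, 3))
out = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255)
print(f"{psnr(ref, out):.2f} dB")                         # higher means closer to the reference
```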

22 pages, 3475 KB  
Article
Cross-Layer Feature Fusion and Attention-Based Class Feature Alignment Network for Unsupervised Cross-Domain Remote Sensing Scene Classification
by Jiahao Wei, Erzhu Li and Ce Zhang
Remote Sens. 2026, 18(6), 859; https://doi.org/10.3390/rs18060859 - 11 Mar 2026
Viewed by 323
Abstract
Remote sensing scene classification is one of the crucial techniques for high-resolution remote sensing image interpretation and has received widespread attention in recent years. However, acquiring high-quality labeled data is both costly and time-consuming, making unsupervised domain adaptation (UDA) an important research focus in scene classification. Existing UDA methods focus primarily on aligning the overall feature distributions across domains but neglect class feature alignment, resulting in the loss of critical class information. To address this issue, a cross-layer feature fusion and attention-based class feature alignment network (CFACA-NET) is proposed for unsupervised cross-domain remote sensing scene classification. Specifically, a multi-layer feature extraction module (MFEM) consisting of a cross-layer feature fusion module (CFFM), a multi-scale dynamic attention module (MSDAM), and a fused feature optimization module (FFOM) is designed to enhance the representation ability of scene features. A high-confidence sample selection module is further introduced, which utilizes evidence theory and information entropy to obtain reliable pseudo-labels. Finally, a class feature alignment module is proposed, incorporating a two-stage training strategy to achieve effective class feature alignment. Experimental results on three remote sensing scene classification datasets demonstrate that CFACA-NET outperforms existing state-of-the-art methods in cross-domain classification performance, effectively enhancing cross-domain adaptation capability.
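
The entropy half of the high-confidence sample selection can be illustrated compactly: keep only target-domain samples whose predictive entropy falls below a threshold. The threshold value and function names below are hypothetical, and the evidence-theory component is omitted.

```python
# Hedged sketch of entropy-based pseudo-label selection (one ingredient the abstract names).
import numpy as np

def select_pseudo_labels(probs: np.ndarray, max_entropy: float = 0.5):
    """Keep target-domain samples whose predictive entropy is below a threshold."""
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # per-sample uncertainty
    keep = entropy < max_entropy
    return np.argmax(probs, axis=1)[keep], np.flatnonzero(keep)

probs = np.array([[0.95, 0.03, 0.02],                     # confident -> kept as pseudo-label
                  [0.40, 0.35, 0.25]])                    # ambiguous -> discarded
labels, idx = select_pseudo_labels(probs)
print(labels, idx)                                        # [0] [0]
```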

19 pages, 35815 KB  
Article
YOLOv10-TWD: An Improved YOLOv10n for Terracotta Warrior Recognition
by Yalin Li, Liang Wang, Xinyuan Zhang, Sijie Dong and Xinjuan Zhu
Appl. Sci. 2026, 16(5), 2616; https://doi.org/10.3390/app16052616 - 9 Mar 2026
Viewed by 267
Abstract
To address challenges such as complex backgrounds, partial occlusion, and high similarity of details in Terracotta Warrior image recognition, this paper proposes a lightweight detection method, YOLOv10-TWD, based on an improved YOLOv10n. Specifically, a lightweight Convolution-Attention Fusion Module (CAFMAttention) and a dual-branch feature extraction structure (DualConv) are integrated into the detection head to enhance the model’s focus on fine-grained features and its discriminative robustness under partial damage conditions. In the Neck network, Ghost-Shuffle Convolution (GSConv) is introduced to compress the computational cost of multi-scale feature fusion while strengthening context-aware capabilities. Experimental results on a self-built Terracotta Warrior dataset demonstrate that the proposed method achieves a 7.63% improvement in mAP@0.5 compared to the baseline YOLOv10n, while simultaneously achieving a 6.66% increase in inference speed. The model achieves high precision alongside significant optimization in inference efficiency, making it well-suited for rapid recognition tasks in cultural heritage and museum scenarios.
(This article belongs to the Section Computing and Artificial Intelligence)
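
GSConv pairs a standard convolution with a cheap depthwise branch and a channel shuffle, which is how it cuts the cost of multi-scale fusion. The sketch below follows that recipe in spirit; the kernel sizes and shuffle layout are assumptions rather than the authors' exact configuration.

```python
# Sketch of a Ghost-Shuffle Convolution (GSConv)-style block; details are assumed.
import torch
import torch.nn as nn

class GSConv(nn.Module):
    def __init__(self, c_in: int, c_out: int, stride: int = 1):
        super().__init__()
        half = c_out // 2
        self.dense = nn.Sequential(                       # standard convolution branch
            nn.Conv2d(c_in, half, 3, stride, 1, bias=False),
            nn.BatchNorm2d(half), nn.SiLU())
        self.cheap = nn.Sequential(                       # cheap depthwise branch
            nn.Conv2d(half, half, 5, 1, 2, groups=half, bias=False),
            nn.BatchNorm2d(half), nn.SiLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.dense(x)
        b = self.cheap(a)
        y = torch.cat([a, b], dim=1)                      # mix dense and cheap features
        n, c, h, w = y.shape                              # channel shuffle across the two halves
        return y.view(n, 2, c // 2, h, w).transpose(1, 2).reshape(n, c, h, w)

print(GSConv(64, 128)(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 128, 32, 32])
```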

20 pages, 5699 KB  
Article
An Improved YOLOv8 Detection Algorithm Based on Screen Printing Defect Images
by Shuqin Wu, Xinru Dong, Qiang Da, Meiou Wang, Yuxuan Sun, Ge Ge, Jinge Ma, Jiajie Kang, Yu Yao and Shubo Shi
Sensors 2026, 26(5), 1604; https://doi.org/10.3390/s26051604 - 4 Mar 2026
Viewed by 373
Abstract
Micro-defects such as ink spots, scratches, and sintering defects formed during the screen printing process of photovoltaic cells significantly impair module performance. Traditional machine vision methods exhibit limited detection efficiency and high false-positive and missed-detection rates, while existing deep learning algorithms struggle to achieve accurate and adaptive detection of small-target defects and background-similar defects in complex industrial environments. This study proposes an enhanced defect detection methodology based on an improved YOLOv8 algorithm. A multi-focus image acquisition platform using primary and auxiliary CCDs was independently developed, integrating a high-frame-rate industrial camera and a high-resolution electron microscope, with an LED ring light employed to suppress reflections, thereby establishing a high-quality dataset covering three defect categories. The algorithm was optimized along multiple dimensions: the RepNCSPELAN4 module was incorporated into the backbone network to improve multi-scale feature fusion, and a novel wavelet transform-based WaveConv module was designed to replace traditional downsampling, better preserving defect edges and texture details. The neck network integrates a lightweight shuffle attention mechanism and a new detail enhancement module to strengthen critical features while controlling model complexity. Additionally, a dedicated auxiliary detection head was added for spotting tiny ink dots. Experimental results demonstrate a marked improvement in performance: on the custom dataset, the improved model achieves a stable mean average precision of approximately 92%. Specifically, ink spot detection reached a precision of 84.9% and a recall of 77.7%, effectively reducing missed small-target defects; sintering defect detection attained 98.9% precision and 100% recall, addressing previous misclassifications caused by background similarity; and scratch detection precision improved to 92.2%. Visual comparisons confirm that the enhanced model effectively overcomes the limitations of the original approach. By constructing a specialized dataset and implementing targeted, coordinated optimizations to the YOLOv8 architecture, this study significantly enhances the accuracy and robustness of screen-printing defect detection in photovoltaic cells, providing an effective solution for real-time online quality inspection in smart manufacturing lines.
(This article belongs to the Special Issue Defect Detection Based on Vision Sensors)
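
The WaveConv idea, replacing strided downsampling with a wavelet transform so edges and textures survive, can be approximated with a Haar decomposition: each 2×2 block becomes four subbands, halving resolution without discarding detail. This is a plausible reconstruction under that assumption, not the paper's module.

```python
# Hedged sketch of wavelet-based downsampling in the spirit of WaveConv.
import torch
import torch.nn as nn

class HaarDownsample(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.mix = nn.Conv2d(4 * c_in, c_out, 1)          # fuse the four subbands

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = x[..., 0::2, 0::2]; b = x[..., 0::2, 1::2]
        c = x[..., 1::2, 0::2]; d = x[..., 1::2, 1::2]
        ll = (a + b + c + d) / 2                          # low-frequency average
        lh = (a + b - c - d) / 2                          # detail along one axis
        hl = (a - b + c - d) / 2                          # detail along the other axis
        hh = (a - b - c + d) / 2                          # diagonal detail
        return self.mix(torch.cat([ll, lh, hl, hh], dim=1))

print(HaarDownsample(32, 64)(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 64, 32, 32])
```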

25 pages, 6938 KB  
Article
A BIM-Centered Multi-Source Image Fusion Framework for Remote Client Site Visits
by Ren-Jye Dzeng, Chen-Wei Cheng and Yu-Hsiang Chen
Buildings 2026, 16(5), 994; https://doi.org/10.3390/buildings16050994 - 3 Mar 2026
Viewed by 384
Abstract
Clients need to visit project sites periodically during construction to visualize progress and identify deviations from expectations. However, physical site visits are time-consuming, costly, and potentially unsafe, especially for remote and overseas projects. More fundamentally, existing remote-site-visit solutions focus primarily on automatic recognition and visualization, while insufficiently addressing the scientific challenge of how heterogeneous, dynamic site data can be fused and operationalized to support timely, collaborative decision making. This research proposes a framework for clients’ remote site visits and develops an RASE system that enables multi-source data fusion and real-time collaborative decision support by integrating UAVs, 360° cameras, BIM, and VR/AR technologies. RASE allows clients to synchronize real-world visual data with BIM models within predefined scenes, annotate issues directly on BIM components, and seamlessly switch among heterogeneous image-capture sources to maintain situational awareness in highly dynamic construction environments. The proposed framework emphasizes an operational data-fusion mechanism and an interaction paradigm that reduce the cognitive and coordination burdens of remote decision making. A case study shows that RASE reduces site-visit time by 78.0%, though initial equipment costs increase total expenses by 44.1%. Sensitivity analyses indicate that greater project remoteness or higher visit frequency significantly improves both time and cost effectiveness. The core contribution of RASE lies in enabling a scalable, operational data-fusion mechanism that supports collaboration during remote site visits, with identified issues anchored to the corresponding BIM components. Automatic image and voice recognition functionality may be incorporated into RASE in the future to improve the efficiency of system control, textual input, and BIM association.
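
The reported trade-off (78.0% less visit time, +44.1% total cost) lends itself to a quick sensitivity check. The sketch below uses hypothetical monetary figures chosen so the baseline reproduces the +44.1% ratio, and shows the cost ratio shrinking as visit frequency grows, consistent with the abstract's sensitivity claim; none of the dollar amounts come from the case study.

```python
# Back-of-envelope sensitivity sketch; only the 78.0% and 44.1% figures are from the abstract.
def net_time_saved_hours(visits_per_year: int, hours_per_visit: float) -> float:
    return visits_per_year * hours_per_visit * 0.780      # 78.0% reduction per visit

def cost_ratio(base_visit_cost: float, visits: int, equipment: float) -> float:
    conventional = base_visit_cost * visits
    return (conventional + equipment) / conventional      # >1 means RASE costs more

equipment = 2000.0 * 4 * 0.441                            # calibrated to +44.1% at 4 visits/year
for visits in (4, 12, 24):                                # sensitivity to visit frequency
    print(visits, "visits:", round(net_time_saved_hours(visits, 10.0), 1), "h saved,",
          round(cost_ratio(2000.0, visits, equipment), 3), "cost ratio")
```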

24 pages, 1346 KB  
Systematic Review
Artificial Intelligence in Cadastre: A Systematic Review of Methods, Applications, and Trends
by Jingshu Chen, Majid Nazeer, Bo Sum Lee and Man Sing Wong
Land 2026, 15(3), 411; https://doi.org/10.3390/land15030411 - 2 Mar 2026
Viewed by 931
Abstract
Surveying and register administration are core to land administration, and land surveying and registration are accordingly essential to socio-economic development, where accuracy and efficiency are critical. Until now, customary land surveying and registration have relied on manual input, which undermines efficiency and is prone to errors in data handling. During the last decade, the exponential growth of artificial intelligence (AI), in particular geospatial artificial intelligence (GeoAI), has provided new methodologies that can overcome these deficiencies. This review examines AI in cadastral management by analyzing technical solutions and trends across three areas: data collection, modeling, and common applications. It aims to provide a comprehensive survey of the current use of AI in cadastral management and to define future research avenues. Based on a comprehensive review of the literature, this study reaches three conclusions. (1) Automated extraction of parcel boundaries has been achieved through deep learning in data collection and processing, removing the bottlenecks of manual interpretation. Models such as convolutional neural networks (CNNs) and Transformers have been used for pixel-level semantic segmentation of high-resolution remote sensing images, leading to significant improvements in efficiency and accuracy. (2) Non-spatial data have been processed with natural language processing techniques to automatically extract information and construct relationships, overcoming the limitations of paper-based archives and traditional relational databases. (3) Deep learning models have been applied to automatically detect parcel changes and to enable integrated analysis of spatial and non-spatial data, supporting the transition of cadastral management from two dimensions to three. However, several challenges remain, including differences in multi-temporal data processing, spatial semantic ambiguity, and the lack of large-scale, high-quality annotated data. Future research can focus on improving model generalization, advancing cross-modal data fusion, and providing recommendations for the development of a reliable and practical intelligent cadastral system.

19 pages, 84231 KB  
Article
Vision–Language Models for Transmission Line Fault Detection: A New Approach for Grid Reliability and Optimization
by Runle Yu, Lihao Mai, Yang Weng, Qiushi Cui, Guochang Xu and Pengliang Ren
J. Imaging 2026, 12(3), 106; https://doi.org/10.3390/jimaging12030106 - 28 Feb 2026
Viewed by 459
Abstract
Reliable fault detection along transmission corridors is essential for preventing small defects from developing into long outages and costly emergency operations. This study aims to improve the field reliability of an open-vocabulary vision-language backbone without retraining the large model end-to-end. The work focuses on four operational fault classes in multi-region corridor imagery collected during routine inspections and uses a Florence-2 vision-language model as the base recognizer. On top of this backbone, three domain-specific components are introduced. A subclass-aware fusion scheme keeps probability mass within the active parent concept so that insulator icing and conductor icing produce stable, action-oriented decisions. A Power-Line Focus Then Crop normalization uses an attention-guided corridor window together with isotropic resizing so that thin conductors and small fittings remain visible in the processed image. A corridor geo prior reduces scores as the distance from the mapped centerline increases, thereby suppressing detections that lie outside the corridor. All methods are evaluated under a shared preprocessing and scoring pipeline in training-free and parameter-efficient tuning modes. Experiments on unseen regions show higher accuracy for thin and low-contrast faults, fewer false alarms outside the right-of-way, and improved score calibration in the confidence range used for triage, while keeping throughput and memory usage suitable for unmanned aerial vehicles and substation edge devices.
(This article belongs to the Section Computer Vision and Pattern Recognition)
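
The corridor geo prior is stated only qualitatively (scores decrease with distance from the mapped centerline). One plausible realization is an exponential decay, sketched below; the decay form and the length scale are assumptions, not the paper's parameters.

```python
# Hedged sketch of a corridor geo-prior: scores decay with distance from the centerline.
import numpy as np

def apply_geo_prior(scores: np.ndarray, dist_m: np.ndarray, scale_m: float = 25.0) -> np.ndarray:
    return scores * np.exp(-np.maximum(dist_m, 0.0) / scale_m)

scores = np.array([0.90, 0.85, 0.80])
dist_m = np.array([2.0, 30.0, 120.0])                     # metres from the corridor centerline
print(apply_geo_prior(scores, dist_m).round(3))           # far-off detections are suppressed
```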

30 pages, 5797 KB  
Article
FADS-Fusion: A Post-Flood Assessment Using Dempster–Shafer Fusion for Segmentation and Uncertainty Mapping
by Daniel Sobien and Chelsea Sobien
Remote Sens. 2026, 18(5), 714; https://doi.org/10.3390/rs18050714 - 27 Feb 2026
Viewed by 397
Abstract
Machine Learning (ML) modeling for disaster management is a growing field, but existing works focus more on mapping the extent of floods or broad categories of damage, and they lack explainability methods to help users understand model outputs. In this study, we propose Flood Assessment using Dempster–Shafer Fusion (FADS-Fusion), a tool for post-flood damage assessment that uses Dempster–Shafer fusion to combine outputs from multiple deep learning models. FADS-Fusion generalizes to any pretrained models, once outputs are post-processed for consistency, making it applicable to other disaster-management or change-detection applications. The novelty of our work comes from applying Dempster–Shafer theory to multi-model fusion and uncertainty quantification on a flood dataset for segmenting both buildings and roads. We trained and evaluated models on the SpaceNet 8 challenge dataset and demonstrated that fusing the SpaceNet 8 Baseline (SN8) and Siamese Nested UNet (SNUNet) models yields a modest overall improvement of +1.93% in mAP, while the +12.3% increase in Precision and the −15.0% decrease in Recall are statistically significant relative to the baseline. FADS-Fusion also quantifies uncertainty through the conflict of evidence in Dempster–Shafer fusion, with a discount factor, serving as both a quantitative and qualitative explainability method. While uncertainty correlates with a drop in performance, this relationship depends on the class-weighted uncertainty values and location. Mapping uncertainty back onto the original image allows for visual inspection of fusion quality and indicates areas a human will need to reassess. Our work demonstrates that FADS-Fusion improves post-flood segmentation performance and adds uncertainty quantification for explainability, an aspect important for reliability and user decision-making but understudied in the ML disaster-management literature.
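
Dempster–Shafer combination with discounting, the mechanism FADS-Fusion builds on, fits in a few lines: discounting moves a fraction of each model's mass to ignorance, and conflicting mass between the two sources becomes the uncertainty signal. The masses, discount factor, and two-hypothesis frame below are illustrative, not values from the paper.

```python
# Minimal Dempster–Shafer sketch. Frame: {F} = flood-damaged, {N} = not, 'FN' = ignorance.
def discount(m: dict, alpha: float) -> dict:
    out = {k: alpha * v for k, v in m.items() if k != "FN"}
    out["FN"] = 1.0 - alpha * (1.0 - m.get("FN", 0.0))    # moved mass goes to ignorance
    return out

def combine(m1: dict, m2: dict) -> tuple[dict, float]:
    inter = lambda a, b: "".join(sorted(set(a) & set(b)))
    raw, conflict = {}, 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            key = inter(a, b)
            if key:
                raw[key] = raw.get(key, 0.0) + va * vb
            else:
                conflict += va * vb                       # conflicting evidence mass K
    fused = {k: v / (1.0 - conflict) for k, v in raw.items()}  # Dempster's rule
    return fused, conflict                                # conflict doubles as uncertainty

m_sn8 = discount({"F": 0.7, "N": 0.2, "FN": 0.1}, alpha=0.9)   # model 1 output (illustrative)
m_snu = discount({"F": 0.6, "N": 0.3, "FN": 0.1}, alpha=0.9)   # model 2 output (illustrative)
fused, conflict = combine(m_sn8, m_snu)
print({k: round(v, 3) for k, v in fused.items()}, "conflict:", round(conflict, 3))
```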

19 pages, 1446 KB  
Article
Optical Characteristics-Guided Asymmetric Dual Encoder Feature Fusion Cloud Detection Algorithm
by Jing Zhang, Qi Lang, Xinlong Shi, Jiaxuan Liu and Yunsong Li
Remote Sens. 2026, 18(5), 677; https://doi.org/10.3390/rs18050677 - 24 Feb 2026
Viewed by 356
Abstract
The rapid development of remote sensing satellite technology has enabled remote sensing images to be widely used in agriculture, meteorology, environmental monitoring, and other fields. However, the presence of clouds in these images can lead to blurred and incomplete observations of the Earth’s surface, limiting the quality and applicability of the data. Current cloud detection networks usually adopt a single encoder–decoder structure that processes all spectral features uniformly without distinguishing between spectral bands. To overcome this limitation, this paper proposes an Optical characteristics-guided Asymmetric Dual Encoder Feature Fusion cloud detection algorithm (OADEF2). The algorithm adopts an asymmetric dual-encoder framework that divides the spectral bands of Sentinel-2A into two groups, RGB visible-light bands and infrared/atmospheric-correction bands, which are then input into two different encoder branches. This design exploits the distinct physical characteristics of different spectral bands to improve cloud detection accuracy. To direct the network’s focus to cloud-related optical characteristics, an Optical characteristics-guided Multi-Scale cloud feature module (OCGMSCFM), based on a Dynamic HOT Index and a Full-Band Cloud Index, is introduced; it effectively addresses the insufficient representation of cloud features. To improve the efficiency of feature fusion, a Feature Aggregation and Filtering Module (FAFM) is proposed, which uses aggregation and filtering techniques to refine basic features and thereby improve detection accuracy. To overcome the limitations of feature modeling, a dual attention module fusing Multi-interaction Local Spatial Attention with mixed Channel Attention (MILSAMCAM) is added to the decoder. Experimental results validate the effectiveness of the algorithm in cloud detection tasks, achieving an F1-score of 97.30% on the S2-CMC dataset.
(This article belongs to the Section Remote Sensing Image Processing)
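
A minimal sketch of the asymmetric dual-encoder input split follows, assuming the RGB group holds Sentinel-2 bands B2–B4 and the second group holds the remaining ten bands; the band grouping, encoder widths, and class names are assumptions for illustration, not the OADEF2 architecture.

```python
# Hedged sketch of splitting Sentinel-2 bands into two encoder branches.
import torch
import torch.nn as nn

def conv_block(c_in: int, c_out: int) -> nn.Module:
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class DualEncoder(nn.Module):
    def __init__(self, rgb_bands=(3, 2, 1), other_bands=(0, 4, 5, 6, 7, 8, 9, 10, 11, 12)):
        super().__init__()
        self.rgb_idx, self.ir_idx = list(rgb_bands), list(other_bands)
        self.enc_rgb = conv_block(len(rgb_bands), 32)     # visible-light branch
        self.enc_ir = conv_block(len(other_bands), 32)    # infrared/atmospheric branch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f_rgb = self.enc_rgb(x[:, self.rgb_idx])
        f_ir = self.enc_ir(x[:, self.ir_idx])
        return torch.cat([f_rgb, f_ir], dim=1)            # fused feature map for the decoder

print(DualEncoder()(torch.randn(1, 13, 64, 64)).shape)    # torch.Size([1, 64, 64, 64])
```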

30 pages, 14583 KB  
Article
MF-CLF: Multi-Feature Chrominance–Luminance Fusion for Blind Underwater Image Quality Assessment
by Wei Chen, Yi Zhang, Damon M. Chandler and Mikolaj Leszczuk
J. Mar. Sci. Eng. 2026, 14(4), 333; https://doi.org/10.3390/jmse14040333 - 9 Feb 2026
Viewed by 505
Abstract
Underwater images commonly exhibit blurring, color casts, and low contrast due to light attenuation and scattering in water. Although numerous underwater image enhancement (UIE) algorithms have been developed to improve the usability of underwater imaging systems, evaluating the performance of these algorithms remains challenging due to the lack of reference images. Thus, blind/no-reference (NR) underwater image quality assessment (UIQA) has emerged as a key research focus. While existing NR-UIQA methods based on luminance and chrominance cues have shown effectiveness, modeling these attributes separately ignores valuable information arising from their joint behavior, since underwater degradations often induce simultaneous changes in luminance and chrominance that cannot be reliably characterized by either attribute alone. In this paper, we propose a lightweight and explainable NR-UIQA method, called multi-feature chrominance–luminance fusion (MF-CLF), based on jointly modeling the intra- and cross-attribute dependencies among chrominance and luminance statistics. Specifically, our approach constructs chrominance-attribute features across multiple color spaces, extracts luminance-attribute features using multi-kernel perceptual descriptors, and models the chrominance–luminance characteristics by explicitly capturing the interactions between the two attributes. The extracted features are then mapped to a quality score using a support vector machine (SVM), enabling objective and reliable underwater image quality prediction. Experimental results on four public benchmark datasets demonstrate that MF-CLF significantly outperforms other lightweight, statistical-learning-based methods. Specifically, our approach achieves an SROCC of 0.864 on the SAUD2.0 dataset, outperforming existing methods by 20.3%, and demonstrates strong robustness in cross-dataset evaluations with an SROCC of 0.737, more than twice that of traditional methods.
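
The overall pipeline, hand-crafted chrominance/luminance statistics regressed to a quality score with a support vector machine, can be sketched as follows; the toy statistics below are simple placeholders chosen for illustration, not the actual MF-CLF features.

```python
# Hedged sketch of a statistics-to-SVM quality pipeline (placeholder features).
import numpy as np
from skimage.color import rgb2lab, rgb2ycbcr
from sklearn.svm import SVR

def toy_features(img: np.ndarray) -> np.ndarray:
    lab, ycc = rgb2lab(img), rgb2ycbcr(img)
    return np.array([lab[..., 1].std(), lab[..., 2].std(),    # chrominance spread (a*, b*)
                     ycc[..., 0].mean(), ycc[..., 0].std()])  # luminance level and contrast

rng = np.random.default_rng(0)
imgs = rng.random((20, 32, 32, 3))                        # stand-in training images
X = np.stack([toy_features(im) for im in imgs])
y = rng.random(20)                                        # stand-in subjective scores (MOS)
model = SVR(kernel="rbf").fit(X, y)
print(round(float(model.predict(X[:1])[0]), 3))           # predicted quality score
```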

20 pages, 2128 KB  
Article
An Image Deraining Network Integrating Dual-Color Space and Frequency Domain Prior
by Luxia Yang, Yiying Hou and Hongrui Zhang
Technologies 2026, 14(2), 102; https://doi.org/10.3390/technologies14020102 - 4 Feb 2026
Viewed by 543
Abstract
Image deraining is a crucial preprocessing task for enhancing the robustness of high-level vision systems under adverse weather conditions. However, most existing methods are limited to the single RGB color space and struggle to effectively separate high-frequency rain streaks from low-frequency backgrounds, resulting in color distortion and detail loss in the restored image. Therefore, a deraining network that combines dual-color-space and frequency-domain priors is proposed. Specifically, the network employs a dual-branch Transformer architecture to extract color and structural features from the RGB and YCbCr color spaces, respectively. Meanwhile, a Hybrid Attention Feedforward Block (HAFB) is constructed; it achieves feature enhancement and regional focus through a progressive perception selection mechanism and a multi-scale feature extraction architecture, thereby effectively separating rain streaks from the background. Furthermore, a Wavelet-Gated Cross-Attention module is designed, comprising a Wavelet-Enhanced Attention Block (WEAB) and a Dual Cross-Attention (DCA) module; this design enhances the complementary fusion of structural information and color features through frequency-domain guidance and bidirectional semantic interaction. Finally, experimental results on multiple datasets (Rain100L, Rain100H, Rain800, Rain12, and SPA-Data) demonstrate that the proposed method outperforms other approaches.
(This article belongs to the Section Information and Communication Technologies)
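
The two priors the network combines can be previewed outside any network: a YCbCr conversion that separates luminance from chrominance, and a frequency split in which rain streaks concentrate in the high-frequency residual. A hedged illustration follows (standard BT.601 constants; the Gaussian sigma is an arbitrary choice, and this is not the paper's pipeline).

```python
# Illustrating the dual-color-space and frequency-domain priors on a single image.
import numpy as np
from scipy.ndimage import gaussian_filter

def ycbcr_and_frequency_split(rgb: np.ndarray, sigma: float = 2.0):
    # ITU-R BT.601 luma; chrominance planes are offsets around it.
    y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    cb, cr = 0.564 * (rgb[..., 2] - y), 0.713 * (rgb[..., 0] - y)
    low = gaussian_filter(y, sigma)                       # low-frequency background structure
    high = y - low                                        # high-frequency layer holding rain streaks
    return (y, cb, cr), (low, high)

rainy = np.random.rand(64, 64, 3)
(_, cb, cr), (low, high) = ycbcr_and_frequency_split(rainy)
print(low.shape, round(float(np.abs(high).mean()), 4))
```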

24 pages, 2143 KB  
Article
Intelligent Detection and 3D Localization of Bolt Loosening in Steel Structures Using Improved YOLOv9 and Multi-View Fusion
by Fangyuan Cui, Xiaolong Chen and Lie Liang
Buildings 2026, 16(3), 619; https://doi.org/10.3390/buildings16030619 - 2 Feb 2026
Viewed by 392
Abstract
Structural health monitoring of steel buildings requires accurate detection and localization of bolt loosening, a critical yet challenging task due to complex joint geometries and varying environmental conditions. We propose an intelligent framework that integrates an improved YOLOv9 model with multi-view image fusion to address this problem. The method constructs a comprehensive dataset with multi-angle images under diverse lighting, occlusion, and loosening conditions, annotated with multi-task labels for precise training. The YOLOv9 backbone is enhanced with attention mechanisms to focus on key bolt features, while an angle-aware detection head regresses both bounding boxes and rotation angles, enabling loosening state determination through a threshold-based criterion. Furthermore, the framework unifies camera coordinate systems and employs epipolar geometry to fuse 2D detections from multiple views, reconstructing 3D bolt positions and orientations for precise localization. The proposed method achieves robust performance in detecting loosening angles and spatially localizing bolts, offering a practical solution for real-world structural inspections. Its significance lies in the integration of advanced deep learning with multi-view geometry, providing a scalable and automated approach to enhance safety and maintenance efficiency in steel structures.
(This article belongs to the Section Building Structures)
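
The threshold-based loosening criterion reduces to comparing a regressed rotation angle against a baseline reading. In the sketch below, the 10-degree tolerance and the 60-degree hex-head symmetry fold are assumptions, since the paper's actual threshold is not quoted in this abstract.

```python
# Hedged sketch of a threshold-based bolt-loosening decision.
def is_loosened(angle_now_deg: float, angle_ref_deg: float, thresh_deg: float = 10.0) -> bool:
    delta = abs(angle_now_deg - angle_ref_deg) % 60.0     # fold hex-head symmetry (assumed)
    delta = min(delta, 60.0 - delta)                      # shortest angular distance
    return delta > thresh_deg

print(is_loosened(47.0, 45.0))                            # False: within tolerance
print(is_loosened(63.0, 45.0))                            # True: rotated ~18 degrees past baseline
```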

22 pages, 45754 KB  
Article
Chrominance-Aware Multi-Resolution Network for Aerial Remote Sensing Image Fusion
by Shuying Li, Jiaxin Cheng, San Zhang and Wuwei Wang
Remote Sens. 2026, 18(3), 431; https://doi.org/10.3390/rs18030431 - 29 Jan 2026
Viewed by 394
Abstract
Spectral data obtained from upstream remote sensing tasks contain abundant complementary information: infrared images are rich in radiative information, while visible images provide spatial details. Effective fusion of these two modalities improves the utilization of remote sensing data and provides a more comprehensive representation of target characteristics and texture details. The majority of current fusion methods focus primarily on intensity fusion between infrared and visible images, ignoring the chrominance information present in visible images and the color interference that infrared images introduce into the fusion results; consequently, the fused images exhibit inadequate color representation. To address these challenges, an infrared and visible image fusion method named Chrominance-Aware Multi-Resolution Network (CMNet) is proposed. CMNet integrates the Mamba module, which offers linear complexity and global awareness, into a U-Net backbone to form the Multi-scale Spatial State Attention (MSSA) framework. Furthermore, the Mamba module is enhanced through the design of a Chrominance-Enhanced Fusion (CEF) module, leading to better color and detail representation in the fused image. Extensive experimental results show that CMNet delivers better performance than existing fusion methods across various evaluation metrics.
(This article belongs to the Section Remote Sensing Image Processing)
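
Why chrominance awareness matters can be shown with a deliberately naive baseline: fuse infrared into the luminance channel only and pass the visible image's Cb/Cr through unchanged, so the infrared input cannot distort color. This stands in for, and is far simpler than, CMNet's learned CEF module; the 50/50 luma blend is an arbitrary assumption.

```python
# Naive chrominance-preserving IR/visible fusion baseline (not CMNet).
import numpy as np
from skimage.color import rgb2ycbcr, ycbcr2rgb

def chrominance_preserving_fusion(visible_rgb: np.ndarray, infrared: np.ndarray) -> np.ndarray:
    ycc = rgb2ycbcr(visible_rgb)                          # Y in [16, 235] for float input
    ir_y = 16.0 + 219.0 * infrared                        # map IR intensity to the luma range
    ycc[..., 0] = 0.5 * ycc[..., 0] + 0.5 * ir_y          # luma-only fusion
    return np.clip(ycbcr2rgb(ycc), 0.0, 1.0)              # chrominance passes through untouched

vis = np.random.rand(64, 64, 3)
ir = np.random.rand(64, 64)
print(chrominance_preserving_fusion(vis, ir).shape)       # (64, 64, 3)
```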
