Search Results (284)

Search Parameters:
Keywords = multi-focus image fusion

21 pages, 11455 KB  
Article
Cross-Scale Spectral Calibration for Spatiotemporal Fusion of Remote Sensing Images
by Yishuo Tian, Xiaorong Xue, Jingtong Yang, Wen Zhang, Bingyan Lu, Xin Zhao and Wancheng Wang
Sensors 2026, 26(7), 2090; https://doi.org/10.3390/s26072090 - 27 Mar 2026
Viewed by 417
Abstract
Spatiotemporal fusion aims to generate remote sensing images with both high spatial and high temporal resolution by integrating multi-source observations. However, significant spectral inconsistencies often arise when fusing images acquired at different spatial scales, severely degrading the radiometric fidelity and temporal reliability of the fused results. Most existing methods focus on enhancing spatial details or temporal consistency, while the cross-scale spectral discrepancy between coarse- and fine-resolution images has not been sufficiently addressed. To tackle this issue, we propose a cross-scale spectral calibration framework for spatiotemporal fusion (XSC-Net), which explicitly models and corrects spectral responses across different spatial scales. The proposed method introduces a spatial feature refinement block to enhance spatially discriminative structures and a hierarchical spectral refinement block to adaptively calibrate channel-wise spectral representations. By jointly exploiting spatial and spectral correlations, the framework effectively suppresses spectral distortion while preserving fine spatial details. Extensive experiments on the public CIA and LGC datasets show that XSC-Net outperforms established state-of-the-art baselines, and ablation studies verify the efficacy and contribution of the proposed architectural components.
(This article belongs to the Section Remote Sensors)
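
The paper's code is not shown in this listing; as a rough illustration of what "adaptively calibrating channel-wise spectral representations" can look like, here is a minimal squeeze-and-excitation-style gate in PyTorch. The class name, reduction ratio, and band count are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch of channel-wise spectral calibration (not the paper's code).
# A squeeze-and-excitation-style gate rescales each spectral channel adaptively.
import torch
import torch.nn as nn

class SpectralCalibration(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # global statistics per channel
            nn.Conv2d(channels, max(channels // reduction, 1), 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(channels // reduction, 1), channels, 1),
            nn.Sigmoid(),                                 # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)                           # rescale channels to correct spectral response

x = torch.randn(1, 6, 64, 64)                             # e.g., a 6-band coarse-resolution patch
print(SpectralCalibration(6)(x).shape)                    # torch.Size([1, 6, 64, 64])
```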

26 pages, 12081 KB  
Article
DEPART: Multi-Task Interpretable Depression and Parkinson’s Disease Detection from In-the-Wild Video Data
by Elena Ryumina, Alexandr Axyonov, Mikhail Dolgushin, Dmitry Ryumin and Alexey Karpov
Big Data Cogn. Comput. 2026, 10(3), 89; https://doi.org/10.3390/bdcc10030089 - 16 Mar 2026
Viewed by 484
Abstract
Automated video-based detection of cognitive disorders can enable scalable, non-invasive health monitoring. However, existing methods focus on a single disease and provide limited interpretability, whereas real-world videos often contain co-occurring conditions. We propose DEPART (DEpression and PArkinson’s Recognition Technique), a novel unified multi-task method to detect depression and Parkinson’s disease (PD) from in-the-wild video data. It performs body region extraction, Contrastive Language-Image Pre-training (CLIP)-based visual encoding, Transformer-based temporal modeling, and prototype-aware classification with a gated fusion technique. Gradient-based attention maps are used to visualize the task-specific regions that drive predictions. Experiments on the In-the-Wild Speech Medical (WSM) corpus demonstrate competitive performance: the multi-task model achieves a Recall of 82.39% for depression and 78.20% for PD, compared with 87.76% and 78.20% for the best single-task models. Multi-task learning initially increases false positives for healthy persons in the PD subset, mainly due to annotation–modality mismatches, static visual content misinterpreted as motor impairments, and occasional body-detection failures. After cleaning the test data, Recall for healthy individuals becomes comparable across models, and the multi-task model improves Recall for both depression (from 82.39% to 87.50%) and PD (from 78.20% to 86.14%), suggesting better robustness for real-life clinical applications.
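
As a hedged sketch of the gated fusion step the abstract names, the block below mixes two feature streams with a learned sigmoid gate. The dimensions, the semantics of the two streams, and the class name are assumptions for illustration, not the released DEPART code.

```python
# Hypothetical gated fusion layer: a learned gate mixes two feature streams per dimension.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([a, b], dim=-1))          # per-feature mixing weights in (0, 1)
        return g * a + (1 - g) * b                        # convex combination of the two streams

temporal = torch.randn(4, 512)                            # e.g., Transformer-pooled clip features
proto = torch.randn(4, 512)                               # e.g., prototype-aware features
print(GatedFusion(512)(temporal, proto).shape)            # torch.Size([4, 512])
```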

23 pages, 8147 KB  
Article
SDENet: A Novel Approach for Single Image Depth of Field Extension
by Xu Zhang, Miaomiao Wen, Junyang Jia and Yan Liu
Algorithms 2026, 19(3), 216; https://doi.org/10.3390/a19030216 - 13 Mar 2026
Viewed by 291
Abstract
Traditional hardware-based approaches for depth-of-field extension (DOF-E), such as optimized lens design or focus stacking via layer scanning, are often plagued by bulkiness and prohibitive costs. Meanwhile, conventional multi-focus image fusion algorithms demand precise spatial alignment, a challenge that becomes particularly acute in applications like microscopy. To address these limitations, this paper proposes a novel single-image DOF-E method termed SDENet. The method adopts an encoder–decoder architecture enhanced with multi-scale self-attention and depth enhancement modules, enabling the transformation of a single partially focused image into a fully focused output while effectively recovering regions outside the original depth of field (DOF). To support model training and performance evaluation, we introduce a dedicated dataset (MSED) containing 1772 pairs of single-focus and all-focus images covering diverse scenes. Experimental results on multiple datasets verify that SDENet significantly outperforms state-of-the-art deblurring methods, achieving a PSNR of 26.98 dB and an SSIM of 0.846 on the DPDD dataset, a substantial improvement in clarity and visual coherence over existing techniques. Furthermore, SDENet demonstrates performance competitive with multi-image fusion methods while requiring only a single input.
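
For readers unfamiliar with the reported metric, the PSNR figure above follows the standard definition, PSNR = 10 · log10(MAX² / MSE). A minimal sketch of that formula (the paper's exact evaluation script is not shown here):

```python
# Standard PSNR between a reference (all-focus) image and a restored image.
import numpy as np

def psnr(reference: np.ndarray, restored: np.ndarray, max_val: float = 255.0) -> float:
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

ref = np.random.randint(0, 256, (64, 64, 3))
out = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255)
print(f"{psnr(ref, out):.2f} dB")                         # higher means closer to the reference
```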

22 pages, 3475 KB  
Article
Cross-Layer Feature Fusion and Attention-Based Class Feature Alignment Network for Unsupervised Cross-Domain Remote Sensing Scene Classification
by Jiahao Wei, Erzhu Li and Ce Zhang
Remote Sens. 2026, 18(6), 859; https://doi.org/10.3390/rs18060859 - 11 Mar 2026
Viewed by 323
Abstract
Remote sensing scene classification is one of the crucial techniques for high-resolution remote sensing image interpretation and has received widespread attention in recent years. However, acquiring high-quality labeled data is both costly and time-consuming, making unsupervised domain adaptation (UDA) an important research focus in scene classification. Existing UDA methods focus primarily on aligning the overall feature distributions across domains but neglect class feature alignment, resulting in the loss of critical class information. To address this issue, a cross-layer feature fusion and attention-based class feature alignment network (CFACA-NET) is proposed for unsupervised cross-domain remote sensing scene classification. Specifically, a multi-layer feature extraction module (MFEM) consisting of a cross-layer feature fusion module (CFFM), a multi-scale dynamic attention module (MSDAM), and a fused feature optimization module (FFOM) is designed to enhance the representation ability of scene features. A high-confidence sample selection module is further introduced, which utilizes evidence theory and information entropy to obtain reliable pseudo-labels. Finally, a class feature alignment module is proposed, incorporating a two-stage training strategy to achieve effective class feature alignment. Experimental results on three remote sensing scene classification datasets demonstrate that CFACA-NET outperforms existing state-of-the-art methods in cross-domain classification performance, effectively enhancing cross-domain adaptation capability.
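
The entropy half of the high-confidence sample selection can be illustrated compactly: keep only target-domain samples whose predictive entropy falls below a threshold. The threshold value and function names below are hypothetical, and the evidence-theory component is omitted.

```python
# Hedged sketch of entropy-based pseudo-label selection (one ingredient the abstract names).
import numpy as np

def select_pseudo_labels(probs: np.ndarray, max_entropy: float = 0.5):
    """Keep target-domain samples whose predictive entropy is below a threshold."""
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # per-sample uncertainty
    keep = entropy < max_entropy
    return np.argmax(probs, axis=1)[keep], np.flatnonzero(keep)

probs = np.array([[0.95, 0.03, 0.02],                     # confident -> kept as pseudo-label
                  [0.40, 0.35, 0.25]])                    # ambiguous -> discarded
labels, idx = select_pseudo_labels(probs)
print(labels, idx)                                        # [0] [0]
```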

19 pages, 35815 KB  
Article
YOLOv10-TWD: An Improved YOLOv10n for Terracotta Warrior Recognition
by Yalin Li, Liang Wang, Xinyuan Zhang, Sijie Dong and Xinjuan Zhu
Appl. Sci. 2026, 16(5), 2616; https://doi.org/10.3390/app16052616 - 9 Mar 2026
Viewed by 267
Abstract
To address challenges such as complex backgrounds, partial occlusion, and high similarity of details in Terracotta Warrior image recognition, this paper proposes a lightweight detection method, YOLOv10-TWD, based on an improved YOLOv10n. Specifically, a lightweight Convolution-Attention Fusion Module (CAFMAttention) and a dual-branch feature extraction structure (DualConv) are integrated into the detection head to enhance the model’s focus on fine-grained features and its discriminative robustness under partial damage conditions. In the Neck network, Ghost-Shuffle Convolution (GSConv) is introduced to compress the computational cost of multi-scale feature fusion while strengthening context-aware capabilities. Experimental results on a self-built Terracotta Warrior dataset demonstrate that the proposed method achieves a 7.63% improvement in mAP@0.5 compared to the baseline YOLOv10n, while simultaneously achieving a 6.66% increase in inference speed. The model achieves high precision alongside significant optimization in inference efficiency, making it well-suited for rapid recognition tasks in cultural heritage and museum scenarios.
(This article belongs to the Section Computing and Artificial Intelligence)
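
GSConv pairs a standard convolution with a cheap depthwise branch and a channel shuffle, which is how it cuts the cost of multi-scale fusion. The sketch below follows that recipe in spirit; the kernel sizes and shuffle layout are assumptions rather than the authors' exact configuration.

```python
# Sketch of a Ghost-Shuffle Convolution (GSConv)-style block; details are assumed.
import torch
import torch.nn as nn

class GSConv(nn.Module):
    def __init__(self, c_in: int, c_out: int, stride: int = 1):
        super().__init__()
        half = c_out // 2
        self.dense = nn.Sequential(                       # standard convolution branch
            nn.Conv2d(c_in, half, 3, stride, 1, bias=False),
            nn.BatchNorm2d(half), nn.SiLU())
        self.cheap = nn.Sequential(                       # cheap depthwise branch
            nn.Conv2d(half, half, 5, 1, 2, groups=half, bias=False),
            nn.BatchNorm2d(half), nn.SiLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.dense(x)
        b = self.cheap(a)
        y = torch.cat([a, b], dim=1)                      # mix dense and cheap features
        n, c, h, w = y.shape                              # channel shuffle across the two halves
        return y.view(n, 2, c // 2, h, w).transpose(1, 2).reshape(n, c, h, w)

print(GSConv(64, 128)(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 128, 32, 32])
```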

20 pages, 5699 KB  
Article
An Improved YOLOv8 Detection Algorithm Based on Screen Printing Defect Images
by Shuqin Wu, Xinru Dong, Qiang Da, Meiou Wang, Yuxuan Sun, Ge Ge, Jinge Ma, Jiajie Kang, Yu Yao and Shubo Shi
Sensors 2026, 26(5), 1604; https://doi.org/10.3390/s26051604 - 4 Mar 2026
Viewed by 373
Abstract
Micro-defects such as ink spots, scratches, and sintering defects formed during the screen printing process of photovoltaic cells significantly impair module performance. Traditional machine vision methods exhibit limited detection efficiency and high false-positive and missed-detection rates, while existing deep learning algorithms struggle to achieve accurate and adaptive detection of small-target defects and background-similar defects in complex industrial environments. This study proposes an enhanced defect detection methodology based on an improved YOLOv8 algorithm. A multi-focus image acquisition platform using primary and auxiliary CCDs was independently developed, integrating a high-frame-rate industrial camera and a high-resolution electron microscope, with an LED ring light employed to suppress reflections, thereby establishing a high-quality dataset covering three defect categories. The algorithm was optimized along multiple dimensions: the RepNCSPELAN4 module was incorporated into the backbone network to improve multi-scale feature fusion, and a novel wavelet transform-based WaveConv module was designed to replace traditional downsampling, better preserving defect edges and texture details. The neck network integrates a lightweight shuffle attention mechanism and a new detail enhancement module to strengthen critical features while controlling model complexity. Additionally, a dedicated auxiliary detection head was added for spotting tiny ink dots. Experimental results demonstrate a marked improvement in performance: on the custom dataset, the improved model achieves a stable mean average precision of approximately 92%. Specifically, ink spot detection reached a precision of 84.9% and a recall of 77.7%, effectively reducing missed small-target defects; sintering defect detection attained 98.9% precision and 100% recall, addressing previous misclassifications caused by background similarity; and scratch detection precision improved to 92.2%. Visual comparisons confirm that the enhanced model effectively overcomes the limitations of the original approach. By constructing a specialized dataset and implementing targeted, coordinated optimizations to the YOLOv8 architecture, this study significantly enhances the accuracy and robustness of screen-printing defect detection in photovoltaic cells, providing an effective solution for real-time online quality inspection in smart manufacturing lines.
(This article belongs to the Special Issue Defect Detection Based on Vision Sensors)
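
The WaveConv idea, replacing strided downsampling with a wavelet transform so edges and textures survive, can be approximated with a Haar decomposition: each 2×2 block becomes four subbands, halving resolution without discarding detail. This is a plausible reconstruction under that assumption, not the paper's module.

```python
# Hedged sketch of wavelet-based downsampling in the spirit of WaveConv.
import torch
import torch.nn as nn

class HaarDownsample(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.mix = nn.Conv2d(4 * c_in, c_out, 1)          # fuse the four subbands

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = x[..., 0::2, 0::2]; b = x[..., 0::2, 1::2]
        c = x[..., 1::2, 0::2]; d = x[..., 1::2, 1::2]
        ll = (a + b + c + d) / 2                          # low-frequency average
        lh = (a + b - c - d) / 2                          # detail along one axis
        hl = (a - b + c - d) / 2                          # detail along the other axis
        hh = (a - b - c + d) / 2                          # diagonal detail
        return self.mix(torch.cat([ll, lh, hl, hh], dim=1))

print(HaarDownsample(32, 64)(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 64, 32, 32])
```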

25 pages, 6938 KB  
Article
A BIM-Centered Multi-Source Image Fusion Framework for Remote Client Site Visits
by Ren-Jye Dzeng, Chen-Wei Cheng and Yu-Hsiang Chen
Buildings 2026, 16(5), 994; https://doi.org/10.3390/buildings16050994 - 3 Mar 2026
Viewed by 384
Abstract
Clients need to visit project sites periodically during construction to visualize progress and identify deviations from expectations. However, physical site visits are time-consuming, costly, and potentially unsafe, especially for remote and overseas projects. More fundamentally, existing remote-site-visit solutions focus primarily on automatic recognition and visualization, while insufficiently addressing the scientific challenge of how heterogeneous, dynamic site data can be fused and operationalized to support timely, collaborative decision making. This research proposes a framework for clients’ remote site visits and develops an RASE system that enables multi-source data fusion and real-time collaborative decision support by integrating UAVs, 360° cameras, BIM, and VR/AR technologies. RASE allows clients to synchronize real-world visual data with BIM models within predefined scenes, annotate issues directly on BIM components, and seamlessly switch among heterogeneous image-capture sources to maintain situational awareness in highly dynamic construction environments. The proposed framework emphasizes an operational data-fusion mechanism and an interaction paradigm that reduce the cognitive and coordination burdens of remote decision making. A case study shows that RASE reduces site-visit time by 78.0%, though initial equipment costs increase total expenses by 44.1%. Sensitivity analyses indicate that greater project remoteness or higher visit frequency significantly improves both time and cost effectiveness. The core contribution of RASE lies in enabling a scalable, operational data-fusion mechanism that supports collaboration during remote site visits, with identified issues anchored to the corresponding BIM components. Automatic image and voice recognition functionality may be incorporated into RASE in the future to improve the efficiency of system control, textual input, and BIM association.
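
The reported trade-off (78.0% less visit time, +44.1% total cost) lends itself to a quick sensitivity check. The sketch below uses hypothetical monetary figures chosen so the baseline reproduces the +44.1% ratio, and shows the cost ratio shrinking as visit frequency grows, consistent with the abstract's sensitivity claim; none of the dollar amounts come from the case study.

```python
# Back-of-envelope sensitivity sketch; only the 78.0% and 44.1% figures are from the abstract.
def net_time_saved_hours(visits_per_year: int, hours_per_visit: float) -> float:
    return visits_per_year * hours_per_visit * 0.780      # 78.0% reduction per visit

def cost_ratio(base_visit_cost: float, visits: int, equipment: float) -> float:
    conventional = base_visit_cost * visits
    return (conventional + equipment) / conventional      # >1 means RASE costs more

equipment = 2000.0 * 4 * 0.441                            # calibrated to +44.1% at 4 visits/year
for visits in (4, 12, 24):                                # sensitivity to visit frequency
    print(visits, "visits:", round(net_time_saved_hours(visits, 10.0), 1), "h saved,",
          round(cost_ratio(2000.0, visits, equipment), 3), "cost ratio")
```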

24 pages, 1346 KB  
Systematic Review
Artificial Intelligence in Cadastre: A Systematic Review of Methods, Applications, and Trends
by Jingshu Chen, Majid Nazeer, Bo Sum Lee and Man Sing Wong
Land 2026, 15(3), 411; https://doi.org/10.3390/land15030411 - 2 Mar 2026
Viewed by 931
Abstract
Surveying and register administration are core to land administration, and land surveying and registration are accordingly essential to socio-economic development, where accuracy and efficiency are critical. Until now, customary land surveying and registration have relied on manual input, which undermines efficiency and is prone to errors in data handling. During the last decade, the exponential growth of artificial intelligence (AI), in particular geospatial artificial intelligence (GeoAI), has provided new methodologies that can overcome these deficiencies. This review examines AI in cadastral management by analyzing technical solutions and trends across three areas: data collection, modeling, and common applications. It aims to provide a comprehensive survey of the current use of AI in cadastral management and to define future research avenues. Based on a comprehensive review of the literature, this study reaches three conclusions. (1) Automated extraction of parcel boundaries has been achieved through deep learning in data collection and processing, removing the bottlenecks of manual interpretation. Models such as convolutional neural networks (CNNs) and Transformers have been used for pixel-level semantic segmentation of high-resolution remote sensing images, leading to significant improvements in efficiency and accuracy. (2) Non-spatial data have been processed with natural language processing techniques to automatically extract information and construct relationships, overcoming the limitations of paper-based archives and traditional relational databases. (3) Deep learning models have been applied to automatically detect parcel changes and to enable integrated analysis of spatial and non-spatial data, supporting the transition of cadastral management from two dimensions to three. However, several challenges remain, including differences in multi-temporal data processing, spatial semantic ambiguity, and the lack of large-scale, high-quality annotated data. Future research can focus on improving model generalization, advancing cross-modal data fusion, and providing recommendations for the development of a reliable and practical intelligent cadastral system.

19 pages, 84231 KB  
Article
Vision–Language Models for Transmission Line Fault Detection: A New Approach for Grid Reliability and Optimization
by Runle Yu, Lihao Mai, Yang Weng, Qiushi Cui, Guochang Xu and Pengliang Ren
J. Imaging 2026, 12(3), 106; https://doi.org/10.3390/jimaging12030106 - 28 Feb 2026
Viewed by 459
Abstract
Reliable fault detection along transmission corridors is essential for preventing small defects from developing into long outages and costly emergency operations. This study aims to improve the field reliability of an open-vocabulary vision-language backbone without retraining the large model end-to-end. The work focuses on four operational fault classes in multi-region corridor imagery collected during routine inspections and uses a Florence-2 vision-language model as the base recognizer. On top of this backbone, three domain-specific components are introduced. A subclass-aware fusion scheme keeps probability mass within the active parent concept so that insulator icing and conductor icing produce stable, action-oriented decisions. A Power-Line Focus Then Crop normalization uses an attention-guided corridor window together with isotropic resizing so that thin conductors and small fittings remain visible in the processed image. A corridor geo prior reduces scores as the distance from the mapped centerline increases, thereby suppressing detections that lie outside the corridor. All methods are evaluated under a shared preprocessing and scoring pipeline in training-free and parameter-efficient tuning modes. Experiments on unseen regions show higher accuracy for thin and low-contrast faults, fewer false alarms outside the right-of-way, and improved score calibration in the confidence range used for triage, while keeping throughput and memory usage suitable for unmanned aerial vehicles and substation edge devices.
(This article belongs to the Section Computer Vision and Pattern Recognition)
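
The corridor geo prior is stated only qualitatively (scores decrease with distance from the mapped centerline). One plausible realization is an exponential decay, sketched below; the decay form and the length scale are assumptions, not the paper's parameters.

```python
# Hedged sketch of a corridor geo-prior: scores decay with distance from the centerline.
import numpy as np

def apply_geo_prior(scores: np.ndarray, dist_m: np.ndarray, scale_m: float = 25.0) -> np.ndarray:
    return scores * np.exp(-np.maximum(dist_m, 0.0) / scale_m)

scores = np.array([0.90, 0.85, 0.80])
dist_m = np.array([2.0, 30.0, 120.0])                     # metres from the corridor centerline
print(apply_geo_prior(scores, dist_m).round(3))           # far-off detections are suppressed
```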

30 pages, 5797 KB  
Article
FADS-Fusion: A Post-Flood Assessment Using Dempster–Shafer Fusion for Segmentation and Uncertainty Mapping
by Daniel Sobien and Chelsea Sobien
Remote Sens. 2026, 18(5), 714; https://doi.org/10.3390/rs18050714 - 27 Feb 2026
Viewed by 397
Abstract
Machine Learning (ML) modeling for disaster management is a growing field, but existing works focus more on mapping the extent of floods or broad categories of damage, and they lack explainability methods to help users understand model outputs. In this study, we propose Flood Assessment using Dempster–Shafer Fusion (FADS-Fusion), a tool for post-flood damage assessment that uses Dempster–Shafer fusion to combine outputs from multiple deep learning models. FADS-Fusion generalizes to any pretrained models, once outputs are post-processed for consistency, making it applicable to other disaster-management or change-detection applications. The novelty of our work comes from applying Dempster–Shafer theory to multi-model fusion and uncertainty quantification on a flood dataset for segmenting both buildings and roads. We trained and evaluated models on the SpaceNet 8 challenge dataset and demonstrated that fusing the SpaceNet 8 Baseline (SN8) and Siamese Nested UNet (SNUNet) models yields a modest overall improvement of +1.93% in mAP, while the +12.3% increase in Precision and the −15.0% decrease in Recall are statistically significant relative to the baseline. FADS-Fusion also quantifies uncertainty through the conflict of evidence in Dempster–Shafer fusion, with a discount factor, serving as both a quantitative and qualitative explainability method. While uncertainty correlates with a drop in performance, this relationship depends on the class-weighted uncertainty values and location. Mapping uncertainty back onto the original image allows for visual inspection of fusion quality and indicates areas a human will need to reassess. Our work demonstrates that FADS-Fusion improves post-flood segmentation performance and adds uncertainty quantification for explainability, an aspect important for reliability and user decision-making but understudied in the ML disaster-management literature.
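
Dempster–Shafer combination with discounting, the mechanism FADS-Fusion builds on, fits in a few lines: discounting moves a fraction of each model's mass to ignorance, and conflicting mass between the two sources becomes the uncertainty signal. The masses, discount factor, and two-hypothesis frame below are illustrative, not values from the paper.

```python
# Minimal Dempster–Shafer sketch. Frame: {F} = flood-damaged, {N} = not, 'FN' = ignorance.
def discount(m: dict, alpha: float) -> dict:
    out = {k: alpha * v for k, v in m.items() if k != "FN"}
    out["FN"] = 1.0 - alpha * (1.0 - m.get("FN", 0.0))    # moved mass goes to ignorance
    return out

def combine(m1: dict, m2: dict) -> tuple[dict, float]:
    inter = lambda a, b: "".join(sorted(set(a) & set(b)))
    raw, conflict = {}, 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            key = inter(a, b)
            if key:
                raw[key] = raw.get(key, 0.0) + va * vb
            else:
                conflict += va * vb                       # conflicting evidence mass K
    fused = {k: v / (1.0 - conflict) for k, v in raw.items()}  # Dempster's rule
    return fused, conflict                                # conflict doubles as uncertainty

m_sn8 = discount({"F": 0.7, "N": 0.2, "FN": 0.1}, alpha=0.9)   # model 1 output (illustrative)
m_snu = discount({"F": 0.6, "N": 0.3, "FN": 0.1}, alpha=0.9)   # model 2 output (illustrative)
fused, conflict = combine(m_sn8, m_snu)
print({k: round(v, 3) for k, v in fused.items()}, "conflict:", round(conflict, 3))
```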

19 pages, 1446 KB  
Article
Optical Characteristics-Guided Asymmetric Dual Encoder Feature Fusion Cloud Detection Algorithm
by Jing Zhang, Qi Lang, Xinlong Shi, Jiaxuan Liu and Yunsong Li
Remote Sens. 2026, 18(5), 677; https://doi.org/10.3390/rs18050677 - 24 Feb 2026
Viewed by 356
Abstract
The rapid development of remote sensing satellite technology has enabled remote sensing images to be widely used in agriculture, meteorology, environmental monitoring, and other fields. However, the presence of clouds in these images can lead to blurred and incomplete observations of the Earth’s surface, limiting the quality and applicability of the data. Current cloud detection networks usually adopt a single encoder–decoder structure that processes all spectral features uniformly without distinguishing between spectral bands. To overcome this limitation, this paper proposes an Optical characteristics-guided Asymmetric Dual Encoder Feature Fusion cloud detection algorithm (OADEF2). The algorithm adopts an asymmetric dual-encoder framework that divides the spectral bands of Sentinel-2A into two groups, RGB visible-light bands and infrared/atmospheric-correction bands, which are then input into two different encoder branches. This design exploits the distinct physical characteristics of different spectral bands to improve cloud detection accuracy. To direct the network’s focus to cloud-related optical characteristics, an Optical characteristics-guided Multi-Scale cloud feature module (OCGMSCFM), based on a Dynamic HOT Index and a Full-Band Cloud Index, is introduced; it effectively addresses the insufficient representation of cloud features. To improve the efficiency of feature fusion, a Feature Aggregation and Filtering Module (FAFM) is proposed, which uses aggregation and filtering techniques to refine basic features and thereby improve detection accuracy. To overcome the limitations of feature modeling, a dual attention module fusing Multi-interaction Local Spatial Attention with mixed Channel Attention (MILSAMCAM) is added to the decoder. Experimental results validate the effectiveness of the algorithm in cloud detection tasks, achieving an F1-score of 97.30% on the S2-CMC dataset.
(This article belongs to the Section Remote Sensing Image Processing)
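
A minimal sketch of the asymmetric dual-encoder input split follows, assuming the RGB group holds Sentinel-2 bands B2–B4 and the second group holds the remaining ten bands; the band grouping, encoder widths, and class names are assumptions for illustration, not the OADEF2 architecture.

```python
# Hedged sketch of splitting Sentinel-2 bands into two encoder branches.
import torch
import torch.nn as nn

def conv_block(c_in: int, c_out: int) -> nn.Module:
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class DualEncoder(nn.Module):
    def __init__(self, rgb_bands=(3, 2, 1), other_bands=(0, 4, 5, 6, 7, 8, 9, 10, 11, 12)):
        super().__init__()
        self.rgb_idx, self.ir_idx = list(rgb_bands), list(other_bands)
        self.enc_rgb = conv_block(len(rgb_bands), 32)     # visible-light branch
        self.enc_ir = conv_block(len(other_bands), 32)    # infrared/atmospheric branch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f_rgb = self.enc_rgb(x[:, self.rgb_idx])
        f_ir = self.enc_ir(x[:, self.ir_idx])
        return torch.cat([f_rgb, f_ir], dim=1)            # fused feature map for the decoder

print(DualEncoder()(torch.randn(1, 13, 64, 64)).shape)    # torch.Size([1, 64, 64, 64])
```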

30 pages, 14583 KB  
Article
MF-CLF: Multi-Feature Chrominance–Luminance Fusion for Blind Underwater Image Quality Assessment
by Wei Chen, Yi Zhang, Damon M. Chandler and Mikolaj Leszczuk
J. Mar. Sci. Eng. 2026, 14(4), 333; https://doi.org/10.3390/jmse14040333 - 9 Feb 2026
Viewed by 505
Abstract
Underwater images commonly exhibit blurring, color casts, and low contrast due to light attenuation and scattering in water. Although numerous underwater image enhancement (UIE) algorithms have been developed to improve the usability of underwater imaging systems, evaluating the performance of these algorithms remains challenging due to the lack of reference images. Thus, blind/no-reference (NR) underwater image quality assessment (UIQA) has emerged as a key research focus. While existing NR-UIQA methods based on luminance and chrominance cues have shown effectiveness, modeling these attributes separately ignores valuable information arising from their joint behavior, since underwater degradations often induce simultaneous changes in luminance and chrominance that cannot be reliably characterized by either attribute alone. In this paper, we propose a lightweight and explainable NR-UIQA method, called multi-feature chrominance–luminance fusion (MF-CLF), based on jointly modeling the intra- and cross-attribute dependencies among chrominance and luminance statistics. Specifically, our approach constructs chrominance-attribute features across multiple color spaces, extracts luminance-attribute features using multi-kernel perceptual descriptors, and models the chrominance–luminance characteristics by explicitly capturing the interactions between the two attributes. The extracted features are then mapped to a quality score using a support vector machine (SVM), enabling objective and reliable underwater image quality prediction. Experimental results on four public benchmark datasets demonstrate that MF-CLF significantly outperforms other lightweight, statistical-learning-based methods. Specifically, our approach achieves an SROCC of 0.864 on the SAUD2.0 dataset, outperforming existing methods by 20.3%, and demonstrates strong robustness in cross-dataset evaluations with an SROCC of 0.737, more than twice that of traditional methods.
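
The overall pipeline, hand-crafted chrominance/luminance statistics regressed to a quality score with a support vector machine, can be sketched as follows; the toy statistics below are simple placeholders chosen for illustration, not the actual MF-CLF features.

```python
# Hedged sketch of a statistics-to-SVM quality pipeline (placeholder features).
import numpy as np
from skimage.color import rgb2lab, rgb2ycbcr
from sklearn.svm import SVR

def toy_features(img: np.ndarray) -> np.ndarray:
    lab, ycc = rgb2lab(img), rgb2ycbcr(img)
    return np.array([lab[..., 1].std(), lab[..., 2].std(),    # chrominance spread (a*, b*)
                     ycc[..., 0].mean(), ycc[..., 0].std()])  # luminance level and contrast

rng = np.random.default_rng(0)
imgs = rng.random((20, 32, 32, 3))                        # stand-in training images
X = np.stack([toy_features(im) for im in imgs])
y = rng.random(20)                                        # stand-in subjective scores (MOS)
model = SVR(kernel="rbf").fit(X, y)
print(round(float(model.predict(X[:1])[0]), 3))           # predicted quality score
```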

20 pages, 2128 KB  
Article
An Image Deraining Network Integrating Dual-Color Space and Frequency Domain Prior
by Luxia Yang, Yiying Hou and Hongrui Zhang
Technologies 2026, 14(2), 102; https://doi.org/10.3390/technologies14020102 - 4 Feb 2026
Viewed by 543
Abstract
Image deraining is a crucial preprocessing task for enhancing the robustness of high-level vision systems under adverse weather conditions. However, most existing methods are limited to the single RGB color space and struggle to effectively separate high-frequency rain streaks from low-frequency backgrounds, resulting in color distortion and detail loss in the restored image. Therefore, a deraining network that combines dual-color-space and frequency-domain priors is proposed. Specifically, the network employs a dual-branch Transformer architecture to extract color and structural features from the RGB and YCbCr color spaces, respectively. Meanwhile, a Hybrid Attention Feedforward Block (HAFB) is constructed; it achieves feature enhancement and regional focus through a progressive perception selection mechanism and a multi-scale feature extraction architecture, thereby effectively separating rain streaks from the background. Furthermore, a Wavelet-Gated Cross-Attention module is designed, comprising a Wavelet-Enhanced Attention Block (WEAB) and a Dual Cross-Attention (DCA) module; this design enhances the complementary fusion of structural information and color features through frequency-domain guidance and bidirectional semantic interaction. Finally, experimental results on multiple datasets (Rain100L, Rain100H, Rain800, Rain12, and SPA-Data) demonstrate that the proposed method outperforms other approaches.
(This article belongs to the Section Information and Communication Technologies)
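
The two priors the network combines can be previewed outside any network: a YCbCr conversion that separates luminance from chrominance, and a frequency split in which rain streaks concentrate in the high-frequency residual. A hedged illustration follows (standard BT.601 constants; the Gaussian sigma is an arbitrary choice, and this is not the paper's pipeline).

```python
# Illustrating the dual-color-space and frequency-domain priors on a single image.
import numpy as np
from scipy.ndimage import gaussian_filter

def ycbcr_and_frequency_split(rgb: np.ndarray, sigma: float = 2.0):
    # ITU-R BT.601 luma; chrominance planes are offsets around it.
    y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    cb, cr = 0.564 * (rgb[..., 2] - y), 0.713 * (rgb[..., 0] - y)
    low = gaussian_filter(y, sigma)                       # low-frequency background structure
    high = y - low                                        # high-frequency layer holding rain streaks
    return (y, cb, cr), (low, high)

rainy = np.random.rand(64, 64, 3)
(_, cb, cr), (low, high) = ycbcr_and_frequency_split(rainy)
print(low.shape, round(float(np.abs(high).mean()), 4))
```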

24 pages, 2143 KB  
Article
Intelligent Detection and 3D Localization of Bolt Loosening in Steel Structures Using Improved YOLOv9 and Multi-View Fusion
by Fangyuan Cui, Xiaolong Chen and Lie Liang
Buildings 2026, 16(3), 619; https://doi.org/10.3390/buildings16030619 - 2 Feb 2026
Viewed by 392
Abstract
Structural health monitoring of steel buildings requires accurate detection and localization of bolt loosening, a critical yet challenging task due to complex joint geometries and varying environmental conditions. We propose an intelligent framework that integrates an improved YOLOv9 model with multi-view image fusion to address this problem. The method constructs a comprehensive dataset with multi-angle images under diverse lighting, occlusion, and loosening conditions, annotated with multi-task labels for precise training. The YOLOv9 backbone is enhanced with attention mechanisms to focus on key bolt features, while an angle-aware detection head regresses both bounding boxes and rotation angles, enabling loosening state determination through a threshold-based criterion. Furthermore, the framework unifies camera coordinate systems and employs epipolar geometry to fuse 2D detections from multiple views, reconstructing 3D bolt positions and orientations for precise localization. The proposed method achieves robust performance in detecting loosening angles and spatially localizing bolts, offering a practical solution for real-world structural inspections. Its significance lies in the integration of advanced deep learning with multi-view geometry, providing a scalable and automated approach to enhance safety and maintenance efficiency in steel structures.
(This article belongs to the Section Building Structures)
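
The threshold-based loosening criterion reduces to comparing a regressed rotation angle against a baseline reading. In the sketch below, the 10-degree tolerance and the 60-degree hex-head symmetry fold are assumptions, since the paper's actual threshold is not quoted in this abstract.

```python
# Hedged sketch of a threshold-based bolt-loosening decision.
def is_loosened(angle_now_deg: float, angle_ref_deg: float, thresh_deg: float = 10.0) -> bool:
    delta = abs(angle_now_deg - angle_ref_deg) % 60.0     # fold hex-head symmetry (assumed)
    delta = min(delta, 60.0 - delta)                      # shortest angular distance
    return delta > thresh_deg

print(is_loosened(47.0, 45.0))                            # False: within tolerance
print(is_loosened(63.0, 45.0))                            # True: rotated ~18 degrees past baseline
```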

22 pages, 45754 KB  
Article
Chrominance-Aware Multi-Resolution Network for Aerial Remote Sensing Image Fusion
by Shuying Li, Jiaxin Cheng, San Zhang and Wuwei Wang
Remote Sens. 2026, 18(3), 431; https://doi.org/10.3390/rs18030431 - 29 Jan 2026
Viewed by 394
Abstract
Spectral data obtained from upstream remote sensing tasks contain abundant complementary information: infrared images are rich in radiative information, while visible images provide spatial details. Effective fusion of these two modalities improves the utilization of remote sensing data and provides a more comprehensive representation of target characteristics and texture details. The majority of current fusion methods focus primarily on intensity fusion between infrared and visible images, ignoring the chrominance information present in visible images and the color interference that infrared images introduce into the fusion results; consequently, the fused images exhibit inadequate color representation. To address these challenges, an infrared and visible image fusion method named Chrominance-Aware Multi-Resolution Network (CMNet) is proposed. CMNet integrates the Mamba module, which offers linear complexity and global awareness, into a U-Net backbone to form the Multi-scale Spatial State Attention (MSSA) framework. Furthermore, the Mamba module is enhanced through the design of a Chrominance-Enhanced Fusion (CEF) module, leading to better color and detail representation in the fused image. Extensive experimental results show that CMNet delivers better performance than existing fusion methods across various evaluation metrics.
(This article belongs to the Section Remote Sensing Image Processing)
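
Why chrominance awareness matters can be shown with a deliberately naive baseline: fuse infrared into the luminance channel only and pass the visible image's Cb/Cr through unchanged, so the infrared input cannot distort color. This stands in for, and is far simpler than, CMNet's learned CEF module; the 50/50 luma blend is an arbitrary assumption.

```python
# Naive chrominance-preserving IR/visible fusion baseline (not CMNet).
import numpy as np
from skimage.color import rgb2ycbcr, ycbcr2rgb

def chrominance_preserving_fusion(visible_rgb: np.ndarray, infrared: np.ndarray) -> np.ndarray:
    ycc = rgb2ycbcr(visible_rgb)                          # Y in [16, 235] for float input
    ir_y = 16.0 + 219.0 * infrared                        # map IR intensity to the luma range
    ycc[..., 0] = 0.5 * ycc[..., 0] + 0.5 * ir_y          # luma-only fusion
    return np.clip(ycbcr2rgb(ycc), 0.0, 1.0)              # chrominance passes through untouched

vis = np.random.rand(64, 64, 3)
ir = np.random.rand(64, 64)
print(chrominance_preserving_fusion(vis, ir).shape)       # (64, 64, 3)
```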
