MDPI - Publisher of Open Access Journals

26 pages, 25667 KB

Open Access Article

DFSMamba: A Spatial–Frequency Collaborative Modeling Framework for Remote Sensing Image Super-Resolution

by Jie Yu, Hui Li, Xiangyong Zheng, Cheng Zhong and Qiao Sun

Remote Sens. 2026, 18(12), 1910; https://doi.org/10.3390/rs18121910 (registering DOI) - 9 Jun 2026

Existing single-image super-resolution methods for remote sensing images suffer from insufficient global receptive fields, weak high-frequency texture recovery, and excessive computational complexity. To address these issues, this paper proposes DFSMamba, a novel spatial–frequency collaborative modeling framework. First, Semantic Continuous-Sparse Attention enhances semantic perception through dynamic chunking and sparse connections while maintaining linear complexity, effectively alleviating the semantic truncation problem caused by fixed window partitioning. Second, the Adaptive State-Space Module employs parallel forward and backward state-space model branches to achieve bidirectional long-range dependency modeling and introduces an activation-guided feature fusion mechanism to adaptively enhance semantically relevant regions. Third, the Discrete Fourier Transform Module maps images to the frequency domain, establishes a global lossless receptive field, and explicitly enhances high-frequency details, compensating for the insufficient utilization of frequency-domain information in pure spatial-domain methods. Experiments on five public datasets demonstrate that DFSMamba outperforms mainstream CNN, Transformer, and Mamba-based methods across ×2 to ×4 scales. On the AID×3 task, it achieves a PSNR of 31.48 dB, exceeding MambaIRv2 by 1.07 dB. Ablation studies verify the positive synergistic effect of the three modules, with the full configuration achieving a PSNR improvement of 0.85 dB over the single-module setup. Fine-grained category, multi-scale input, and loss function experiments further confirm its robustness and generalization capability, particularly in edge and texture detail reconstruction.
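
The PSNR figures quoted in this abstract follow the standard definition, PSNR = 10·log10(MAX² / MSE). The sketch below is a minimal illustration of that metric only, not the authors' evaluation code. The function name `psnr`, the nested-list image layout, and the 8-bit peak value of 255 are assumptions made for the example; the abstract does not say which channel or border-cropping protocol the paper uses.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    Images are nested lists of pixel intensities (an assumed layout).
    """
    if len(reference) != len(restored) or any(
        len(ref_row) != len(out_row) for ref_row, out_row in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    # Accumulate the squared error over every pixel.
    squared_error = 0.0
    count = 0
    for ref_row, out_row in zip(reference, restored):
        for ref_px, out_px in zip(ref_row, out_row):
            squared_error += (ref_px - out_px) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 1.07 dB at a fixed peak value corresponds to a mean squared error that is lower by a factor of 10^0.107, roughly 1.28.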

19 pages, 44405 KB

Open Access Article

SFQMamba: A Spatial–Frequency Deraining Framework for Robust Visual Sensing in UAV-Assisted IoT Systems

by Letian Deng, Chunyu Meng, Yuhong Zhou, Yuechao Guo, Zhiming Guo, Di Ya, Jianhai Yang, Huaibo Song and Lifeng Qin

Sensors 2026, 26(12), 3680; https://doi.org/10.3390/s26123680 (registering DOI) - 9 Jun 2026

Abstract

Existing single-image deraining methods often exhibit limited 2D long-range dependency modeling and underexploit frequency-domain priors. To address this, we propose SFQMamba, a dual-branch deraining network based on spatial–frequency feature fusion. The CNN branch employs a Fused Enhance Block (FEB), which integrates multi-scale spatial modeling with global frequency modulation, supported by residual coupling and channel guidance, to suppress rain streaks and recover structural details. Concurrently, the Mamba branch utilizes a Spatial-Aware Selective Fusion Block (SASFB). By incorporating a four-directional scanning mechanism and adaptive path-gating, SASFB extends 1D State Space Models into the 2D domain for content-aware feature fusion. Features from both branches are hierarchically aggregated via concatenation and pointwise convolution. Experiments on the Rain13K and Raindrop datasets show that SFQMamba provides robust restoration. Compared with TransMamba, it obtains improvements of 0.12 dB in PSNR and 0.11% in SSIM, removing dense rain streaks while preserving structural and textural details. Furthermore, on the RainVisDrone benchmark, specifically the medium-rain subset, our method improves YOLOv8s detection by 0.0737 AP, 0.1060 AP$_{50}$, and 0.0897 AP$_{75}$ over degraded inputs. These results indicate that the proposed framework benefits both low-level visual restoration and downstream object perception in UAV applications.
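
To make the idea of extending a 1D state space model to 2D through four-directional scanning more concrete, the sketch below flattens a small feature grid into four 1D sequences. The abstract does not state which four directions SASFB uses, so the choice here (row-major and column-major order, each forward and reversed) is an assumption based on common 2D selective-scan designs, and the function name `four_direction_scans` is hypothetical. The state space model and the adaptive path-gating are not shown.

```python
def four_direction_scans(feature_map):
    """Flatten a 2D grid into four 1D sequences, one per assumed scan direction.

    feature_map is a list of equally long rows (an assumed layout).
    """
    # Row-major order: left to right, top to bottom.
    row_major = [value for row in feature_map for value in row]
    # zip(*grid) transposes the grid, so this walks each column top to bottom.
    col_major = [value for column in zip(*feature_map) for value in column]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


if __name__ == "__main__":
    grid = [[1, 2, 3],
            [4, 5, 6]]
    for name, sequence in four_direction_scans(grid).items():
        print(name, sequence)
```

Each sequence would then be processed by a 1D sequence model and the four outputs merged back onto the grid, so that every position receives context from all four directions.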

(This article belongs to the Special Issue UAV Secure Communication for IoT Applications)

25 pages, 3812 KB

Open Access Article

Nondestructive Detection of Foreign Matter in Pu-erh Ripe Tea Based on Deep Learning

by Baijuan Wang, Xiaoxue Guo, Xin Fang, He Ji, Jihong Zhou, Junjie He, Shihao Zhang and Yuefei Wang

Foods 2026, 15(12), 2083; https://doi.org/10.3390/foods15122083 (registering DOI) - 8 Jun 2026

Abstract

To address the challenges of small foreign matter size, severe occlusion, and complex backgrounds in Pu-erh ripe tea processing, this study drew inspiration from primate visual mechanisms and proposed an improved YOLOv13-based network, AE-YOLOv13-S. To mitigate loss of fine details, the weakening of discriminative features, and the frequent occurrence of missed and false detections, the Adaptive Sparse Self-Attention Network was introduced to optimize the backbone of the network, inspired by the sequential cognitive pattern of primates involving target search, local verification, selective integration, and final decision making. To address insufficient long-range semantic associations and the submergence of fine-grained differences in background noise, Emulating Self-Attention with Convolution was employed to optimize part of the Conv modules of the network, drawing on the hierarchical information processing mechanisms of primates from peripheral perception to central fine analysis. In response to the limitations of bounding boxes, such as approximate target enclosure, the large amount of geometric supervision noise, the obvious localization deviation, and delayed model convergence, a Scale-based Dynamic Loss, inspired by primate visual perception mechanisms, was introduced to optimize the network’s loss function. The results showed that, during training, compared with the baseline, AE-YOLOv13-S achieved lower training loss values: Box Loss declined by 6.76%, Cls Loss by 6.52%, and DFL Loss by 8.65%. On the validation dataset, the model demonstrated reductions of 6.58%, 16.39%, and 8.33% for these respective metrics. After the overall improvements, AE-YOLOv13-S achieved increases of 1.43, 4.85, and 2.69 percentage points in precision, recall, and mAP@50, respectively, with only a 0.3 G increase in FLOPs. The improved model can classify and detect foreign matter in Pu-erh ripe tea efficiently and accurately, providing not only a new technical pathway for foreign matter detection in tea processing but also a practically meaningful technical solution for intelligent quality control and food safety assurance in the tea processing chain.

(This article belongs to the Special Issue Application of Artificial Intelligence, Machine Learning and Deep Learning in Food Safety Analysis and Quality Control)

35 pages, 1263 KB

Open Access Systematic Review

Advances in Artificial Intelligence-Enabled Crop Pest and Disease Detection: A Systematic Review

by Zhen Ma, Cundeng Wang, Xinzhong Wang and Xuegeng Chen

Agriculture 2026, 16(12), 1262; https://doi.org/10.3390/agriculture16121262 - 7 Jun 2026

Viewed by 263

Abstract

Crop pest and disease detection technology is transitioning from single-sensor monitoring to intelligent perception and multimodal fusion. Following the PRISMA 2020 standard, this paper systematically reviews the relevant core literature. It summarizes the development history of spectral sensing technology and analyzes the physical mechanisms of hyperspectral and multispectral imaging in the early identification of crop diseases. The focus is on the architectural evolution of deep learning models, including lightweight convolutional neural networks (CNNs), vision transformers (ViTs) with long-range dependency modeling capabilities, and the computationally efficient state space model Mamba. In addition, research progress in spatial–spectral joint learning, heterogeneous data fusion, and vision-language models (VLMs) for improving system robustness and interpretability is introduced. By synthesizing the integrated applications of UAV remote sensing, Internet of Things (IoT) edge computing, and intelligent robots in staple and cash crops, this paper summarizes the implementation of integrated perception, decision-making, and execution systems. To address insufficient cross-domain generalization and uneven allocation of computing resources in existing models, this paper offers perspectives on the future development of agricultural artificial intelligence (AI) toward foundation-model-driven, edge-intelligent collaborative, and green, sustainable directions, which can provide a theoretical reference for engineering applications in intelligent plant protection.

(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)

20 pages, 8392 KB

Open Access Article

Rail-BEV: A LiDAR-Centric and Sensor-Aware BEV Perception Framework for Long-Range Railway Obstacle Detection

by Jinghan Huang, Wentao Hu, Zifeng He, Chixiang Ma, Wenbo Song, Xinci Liu and Mingxin Yang

Sensors 2026, 26(12), 3637; https://doi.org/10.3390/s26123637 - 7 Jun 2026

Viewed by 204

Abstract

Reliable long-range onboard perception is a prerequisite for future railway safety systems, where potential obstacles must be recognized under long braking distances, sparse far-field returns, and strongly constrained rail-corridor geometry. This paper presents Rail-BEV as an initial reproducible baseline study for LiDAR-centric, sensor-aware bird’s-eye-view (BEV) railway obstacle perception. LiDAR is used as the primary geometric sensing modality, while a front-center RGB camera provides lightweight auxiliary visual evidence through calibrated LiDAR-to-image projection. The aligned geometric and visual cues are organized within a unified railway-oriented BEV backend that integrates geometry-aware fusion, rail-geometry prediction, and lightweight inference-time structural refinement. Evaluation was conducted on a scene-isolated railway benchmark with range-stratified center-distance matching, and all model variants were assessed on independent test sequences rather than on validation-selected checkpoints. Compared with CenterPoint and BEVFusion baselines evaluated under the same settings, Rail-BEV achieved the highest overall mAP of 0.6669, with particularly improved long-range pedestrian perception. The controlled ablation further shows that front-view RGB evidence improves the LiDAR-only baseline from 0.5612 to 0.5750 mAP, while ROI-based rail-corridor refinement further increases mAP to 0.5916 and Rail-BEV mIoU to 0.1193. These results indicate that LiDAR-centered sensing, lightweight visual assistance, and coarse rail-aware structural reasoning can be jointly organized to support reproducible long-range railway obstacle perception. This study also clarifies the remaining limitations in rail-geometry quality, calibration robustness, sensor degradation, and strict railway-oriented localization.
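
Calibrated LiDAR-to-image projection is usually a rigid transform followed by a pinhole projection. The sketch below illustrates that general technique only and should not be read as the Rail-BEV implementation. The function name `project_lidar_to_image`, the argument layout, and the camera-frame convention (z pointing forward, intrinsics given as `fx, fy, cx, cy`, no lens distortion) are all assumptions made for the example.

```python
def project_lidar_to_image(points, rotation, translation, intrinsics, image_size):
    """Project 3D LiDAR points to pixel coordinates with a pinhole model.

    points:      iterable of (x, y, z) in the LiDAR frame
    rotation:    3x3 nested list, LiDAR frame to camera frame (assumed extrinsic)
    translation: (tx, ty, tz) in the camera frame (assumed extrinsic)
    intrinsics:  (fx, fy, cx, cy), lens distortion ignored
    image_size:  (width, height) in pixels
    Returns one (u, v) tuple per point, or None if the point is not visible.
    """
    fx, fy, cx, cy = intrinsics
    width, height = image_size
    pixels = []
    for x, y, z in points:
        # Rigid transform from the LiDAR frame into the camera frame.
        xc = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z + translation[0]
        yc = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z + translation[1]
        zc = rotation[2][0] * x + rotation[2][1] * y + rotation[2][2] * z + translation[2]
        if zc <= 0:
            pixels.append(None)  # behind the camera plane
            continue
        # Perspective division followed by the intrinsic mapping.
        u = fx * xc / zc + cx
        v = fy * yc / zc + cy
        inside = 0 <= u < width and 0 <= v < height
        pixels.append((u, v) if inside else None)
    return pixels
```

Points that project inside the image can then be paired with the visual features at that pixel, which is one way the aligned geometric and visual cues mentioned in the abstract could be formed.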

(This article belongs to the Section Communications)

20 pages, 1967 KB

Open Access Article

Predicting CO₂ Pressure Loss in Aged Traditional-Method Sparkling Wine Bottles for Compliance with European Regulations

by Gérard Liger-Belair, Virginie Thollin and Clara Cilindre

Beverages 2026, 12(6), 70; https://doi.org/10.3390/beverages12060070 - 5 Jun 2026

Viewed by 117

Abstract

Today, billions of bottles are aging in the cellars of traditional-method sparkling wine regions prior to their release on the market. Given the fundamental role of carbon dioxide (CO₂) in both the production and sensory perception of sparkling wines, it is essential to understand and control all stages that influence its pressure and concentration in the bottle throughout the winemaking process. This study addressed the central question of how long traditional-method sparkling wine bottles can age in cellars while maintaining sufficient CO₂ pressure. By considering their capacity to retain the minimum CO₂ pressure of 3.5 bar at 20 °C, as required by European regulations, a predictive formula for the shelf life of older vintages was proposed and discussed, integrating the multiple relevant parameters that govern CO₂ retention. Moreover, based on previously published datasets, a comparison was carried out between CO₂ losses measured for a range of modern crown caps and those observed in collections of older champagne vintages sealed with cork-lined crown caps. The results clearly show that modern crown caps preserve dissolved CO₂ far more effectively in traditional-method sparkling wines than the cork-lined closures commonly used during the last century, leading to substantially longer predicted shelf lives.

(This article belongs to the Section Wine, Spirits and Oenological Products)

30 pages, 3899 KB

Open Access Article

An Improved YOLOv8n Framework for PCB Defect Detection via C2f-Mamba Feature Extraction and FPN-PAN++ Multi-Scale Fusion

by Xuan Hua, Haolin Jiang, Hao Wang and Yahui Shan

Symmetry 2026, 18(6), 969; https://doi.org/10.3390/sym18060969 - 3 Jun 2026

Viewed by 183

Abstract

To address the issues in existing PCB defect detection models, including insufficient capability for capturing small defects, weak global feature modeling, and inadequate multi-scale feature fusion, this paper proposes a C2f-FPN-PAN++-Mamba model based on an improved YOLOv8n. The Mamba state–space model is embedded into the C2f module to construct a C2f-Mamba feature extraction unit, which retains the local perception capability of convolution while enhancing long-range dependency modeling, capturing global semantic information of subtle defects in complex backgrounds and improving the model’s feature representation ability for small defects. Meanwhile, an FPN-PAN++ enhanced feature fusion structure is introduced, achieving efficient complementary interaction between high- and low-level features through bidirectional cross-scale feature aggregation and path augmentation, thereby strengthening the model’s robustness in identifying multi-scale and multi-form defects. Finally, C2f-Mamba and FPN-PAN++ are integrated, improving global modeling and multi-scale fusion capabilities while maintaining lightweight computational efficiency and reducing the missed and false detection rates for small defects. Experimental results indicate that the proposed method outperforms the original YOLOv8n model in PCB defect detection tasks. On the PCB defect dataset, the model’s precision increased from 96.4% to 98.5%, recall from 94.6% to 98.4%, and mAP@0.5 from 97.2% to 98.8%, while the mAP@0.5:0.95 metric, which reflects multi-scale detection performance, rose from 57.5% to 62.5%. Experiments demonstrate that this method enhances detection capability for small and complex defects while preserving the advantages of a lightweight model and high inference speed, providing a reliable technical solution for high-precision, real-time PCB defect detection in industrial scenarios.

39 pages, 7192 KB

Open Access Article

FreqMambaGAN: A Frequency-Decoupled Mamba-Enhanced CycleGAN for Underwater Image Enhancement

by Baojiang Ye, Haifeng Wang, Wenbin Wang and Tianyi Wang

J. Mar. Sci. Eng. 2026, 14(11), 1050; https://doi.org/10.3390/jmse14111050 - 3 Jun 2026

Viewed by 134

Abstract

Underwater images often suffer from color cast, low contrast, scattering-induced haze, and texture degradation, which limit the performance of underwater visual perception systems. To address these problems, this study proposes FreqMambaGAN, a frequency-decoupled selective state-space cycle-adversarial network for underwater image enhancement. The proposed method is built upon a CycleGAN-style bidirectional translation framework and introduces a frequency-decoupled Mamba generator to separately model low-frequency color and illumination information and high-frequency texture and edge details. In addition, Efficient Mamba Blocks are embedded into the generator and discriminator to enhance long-range dependency modeling with linear computational complexity. Skip-attention connections are further adopted to preserve shallow spatial details during reconstruction. To improve training stability and imaging plausibility, a multi-stage training strategy is designed by combining supervised warm-up, unpaired cycle-adversarial learning, perceptual regularization, total variation smoothing, and a lightweight physics-inspired consistency constraint based on dark-channel and underwater image-formation priors. Experiments on public underwater image enhancement datasets demonstrate that FreqMambaGAN achieves competitive quantitative performance and visually improved enhancement results in terms of color correction, contrast restoration, haze suppression, and structural preservation. These results indicate that integrating frequency-domain decomposition with selective state-space modeling is effective for underwater image enhancement.
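
Frequency decoupling, in the general sense of separating slowly varying content (color, illumination) from rapidly varying content (texture, edges), can be illustrated with a Fourier-domain mask. The sketch below is a deliberately simplified 1D version using a naive discrete Fourier transform from the standard library. It is not the FreqMambaGAN generator, whose decomposition method the abstract does not specify. The function names and the hard `cutoff` threshold are assumptions for illustration.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spectrum):
    """Inverse DFT, keeping only the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def split_frequency_bands(signal, cutoff):
    """Split a real 1D signal into low- and high-frequency parts.

    Bins whose frequency index is <= cutoff go to the low band, the rest to
    the high band, so the two parts sum back to the input.
    """
    n = len(signal)
    spectrum = dft(signal)
    low_spectrum, high_spectrum = [], []
    for k, coefficient in enumerate(spectrum):
        # Bins k and n - k carry the same frequency, so the mask stays
        # symmetric and both reconstructed bands remain real-valued.
        frequency = min(k, n - k)
        if frequency <= cutoff:
            low_spectrum.append(coefficient)
            high_spectrum.append(0j)
        else:
            low_spectrum.append(0j)
            high_spectrum.append(coefficient)
    return idft_real(low_spectrum), idft_real(high_spectrum)
```

With `cutoff=0` the low band reduces to the signal mean, which is a crude 1D analogue of a global brightness or color level, while the high band carries all remaining variation. An image version would apply the same masking to a 2D transform.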

(This article belongs to the Topic Applications and Development of Underwater Robotics and Underwater Vision Technology, 2nd Edition)

28 pages, 37658 KB

Open Access Article

LDSDet: Long-Range Context and Dynamic Cross-Modal Alignment for Multimodal Object Detection Under Challenging Illumination

by Shijun Sun, Shuai Ma, Xuyang Feng, Chen Sun, Baolong Ding, Yaoyao Ran and Yihong Zhang

Remote Sens. 2026, 18(11), 1827; https://doi.org/10.3390/rs18111827 - 3 Jun 2026

Viewed by 224

Abstract

In the field of remote sensing applications, multimodal object detection has emerged as an important technique for enhancing perception robustness in UAV-based scenarios. Nevertheless, RGB–IR UAV detection remains difficult: Degraded illumination destabilizes shallow representations and weakens local discriminative cues, while spatial inconsistencies and fluctuating modality reliability further hinder cross-modal interaction. In addition, existing methods, which often depend on global illumination estimation or simplistic fusion schemes, struggle to jointly maintain contextual stability, reliable cross-modal interaction, and compact discriminative representations in complex aerial scenes. To address these issues, this paper proposes LDSDet, an RGB–IR multimodal UAV object detector for challenging illumination conditions. Specifically, LDSDet integrates three complementary modules: a Long-range Aware Residual Convolution (LARC) module that enhances contextual perception and stabilizes shallow features; a Dynamic Attention-based Cross-modal Fusion (DACF) block that performs spatially adaptive RGB–IR interaction; and a lightweight SeqShuffleGate (SSG) module that suppresses redundant fusion responses to yield compact and discriminative multimodal representations. Extensive experiments on DroneVehicle, FLIR-Aligned, and LLVIP demonstrate the effectiveness of LDSDet, which achieves 85.2% mAP$_{50}$, 45.3% mAP, and 67.1% mAP, respectively, showing strong robustness under day–night alternation, low-light environments, and complex illumination variations.

(This article belongs to the Section Remote Sensing for Geospatial Science)

30 pages, 3776 KB

Open Access Review

Multimodal Sensor Fusion in Autonomous Vehicles: Technologies, Architectures, and Open Challenges

by Patrik Viktor and Gabor Kiss

Sensors 2026, 26(11), 3528; https://doi.org/10.3390/s26113528 - 2 Jun 2026

Viewed by 274

Abstract

The rapid progress of sensing technologies, artificial intelligence, and embedded computing has significantly accelerated the development of autonomous vehicles. Among the core challenges of higher-level driving automation, reliable environmental perception remains one of the most critical. This review presents a systematic PRISMA-based analysis of multimodal sensor technologies and fusion architectures applied in autonomous driving, based on 66 peer-reviewed studies published between 2014 and 2025. The study examines the operational characteristics, advantages, and limitations of major sensing modalities, including cameras, LiDAR, radar, ultrasonic sensors, and GNSS/IMU-based localization systems. Particular attention is given to multimodal fusion strategies, covering early, mid-level, high-level, and transformer-based architectures that combine complementary sensor information to improve perception robustness and decision reliability. The review further synthesizes current evidence on performance under adverse environmental conditions, benchmark validation practices, real-time computational constraints, and the growing role of functional safety frameworks such as ISO 26262 and SOTIF. Emerging research directions, including 4D radar, self-supervised long-range fusion, foundation models, and cooperative V2X perception, are also discussed. The findings indicate that multimodal sensor fusion is a highly effective architectural strategy for improving scalability, fail-operational robustness, and certifiable safety in autonomous driving systems, particularly in higher-level automation scenarios. Future research should focus on uncertainty-aware fusion, explainable cross-modal reasoning, large-scale real-world validation, and efficient hardware–software co-design to support robust Level 4–5 vehicle autonomy.

(This article belongs to the Section Vehicular Sensing)

25 pages, 12578 KB

Open Access Article

MCS-DETR: An Efficient Multi-Scale Context-Aware Detection Model for the Selective Harvesting of Greenhouse Cucumbers

by Lihong Rong, Weilong Zhang, Fang Sun, Huimin Liu, Changqing Cai, Fuzhu Ding and Zhimin Tong

Appl. Sci. 2026, 16(11), 5530; https://doi.org/10.3390/app16115530 - 2 Jun 2026

Viewed by 90

Abstract

Selective harvesting of greenhouse cucumbers requires accurate detection with low inference latency. In greenhouse canopies, mature cucumbers are often partly occluded and visually similar to surrounding stems and leaves, which makes harvestability recognition difficult. Existing real-time detectors still struggle to preserve fine boundary cues, capture long-range context, and remain compact enough for on-device inference under these conditions. This study proposes MCS-DETR, an efficient multi-scale context-aware detector built on RT-DETR. Instead of increasing model scale, MCS-DETR redesigns shallow feature extraction, high-level contextual interaction, and cross-scale feature aggregation within a compact framework. A shallow feature level is also retained to preserve fine contour information. On the greenhouse cucumber dataset, MCS-DETR achieved 93.4% mAP@0.5 and 76.8% mAP@0.5:0.95, outperforming RT-DETR-R18 while requiring fewer parameters and less computation. On an NVIDIA Jetson Orin NX Super (Hunan ChuangLebo Intelligent Technology Co., Ltd., Room 2003, Building C, Xinchanghai Digital Center, Changsha Economic and Technological Development Zone, Changsha, Hunan, China) platform, it reached 26.3 FPS after TensorRT acceleration. These results indicate that MCS-DETR can provide an efficient on-device perception module for real-time greenhouse cucumber detection.

(This article belongs to the Section Agricultural Science and Technology)

29 pages, 50937 KB

Open Access Article

MAFT: A Lightweight Network for Martian Rock Segmentation Based on an Adaptive Frequency Transformer

by Chu Li, Yutong Jia, Gang Wan, Qifang Ma, Jia Liu, Yang Wang, Biao Wang, Jia Liu and Zhanji Wei

Remote Sens. 2026, 18(11), 1794; https://doi.org/10.3390/rs18111794 - 1 Jun 2026

Viewed by 252

Abstract

The segmentation of rocks on the Martian surface is crucial for navigation and obstacle avoidance by Mars rovers. However, frequent dust storms degrade rock surface textures, and the wide range of rock scales, from sub-meter to ten-meter, further complicates segmentation, especially under the strict computational constraints of rover hardware. This paper proposes a lightweight network named MAFT, specifically designed for Martian rock segmentation. The network builds upon the Adaptive Frequency Transformer (AFFormer) and constructs an improved backbone termed the Improved Adaptive Frequency Transformer (IAFFormer). By replacing the traditional self-attention mechanism with a frequency-domain approach, it captures global feature dependencies while reducing the computational complexity from quadratic to linear. The spatially isolated 1 × 1 convolutions in the pixel descriptor module are further replaced with Adaptive Kernel Convolution (AKConv), enabling the backbone to dynamically adjust its sampling positions to conform to the irregular and diverse morphologies of Martian rocks. An Enhanced Multidimensional Convolutional Attention (EMCA) module is introduced as the decoding structure. By integrating max-pooling in the squeeze stage and adaptive dilated convolutions in the excitation stage, EMCA strengthens the boundary perception and long-range dependency modeling of dust-covered rocks without increasing the parameter count. Additionally, we constructed a dataset of Martian rocks for the Zhurong rover (TWMARS-V2) and conducted experiments using a synthetic dataset (SynMars) and a real dataset (MarsData-V2). Experimental results demonstrate that MAFT achieves the highest segmentation accuracy among all compared methods, with only 2.97 M parameters and 15.49 G FLOPs. On the TWMARS-V2 dataset, Pixel Accuracy (PA) reaches 98.17%, and IoU reaches 88.90%.
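
Pixel Accuracy and IoU, the two metrics reported for TWMARS-V2, can be computed directly from a predicted mask and a ground-truth mask. The sketch below shows the standard definitions for a single foreground class. The abstract does not say whether the reported IoU is for the rock class alone or averaged over classes, so the single-class form, the function name `pixel_accuracy_and_iou`, and the label value `1` for rock are assumptions.

```python
def pixel_accuracy_and_iou(predicted, target, positive=1):
    """Return (pixel accuracy, IoU of the positive class) for two 2D label masks.

    Masks are nested lists of integer labels with identical shapes (assumed).
    """
    true_positive = false_positive = false_negative = 0
    correct = total = 0
    for predicted_row, target_row in zip(predicted, target):
        for p, t in zip(predicted_row, target_row):
            total += 1
            if p == t:
                correct += 1
            if p == positive and t == positive:
                true_positive += 1
            elif p == positive:
                false_positive += 1  # predicted rock where there is none
            elif t == positive:
                false_negative += 1  # missed rock pixel

    union = true_positive + false_positive + false_negative
    # If neither mask contains the positive class, treat the overlap as
    # perfect; this is a convention chosen for this sketch.
    iou = true_positive / union if union else 1.0
    return correct / total, iou
```

Pixel Accuracy counts background pixels as well, so on scenes dominated by background it is typically much higher than IoU, which is consistent with the gap between the 98.17% and 88.90% figures above.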

(This article belongs to the Special Issue Advances in Exploring the Moon, Mars, and Asteroids Based on In Situ and Remote Sensing Measurements (Second Edition))

13 pages, 516 KB

Open Access Article

Auditory Perception and Psychosocial Well-Being in Long-Term Cochlear Implant Users

by Kadriye Guney, Ozlem Topcu, Patrizia Mancini and Hilal Dincer D’Alessandro

Audiol. Res. 2026, 16(3), 83; https://doi.org/10.3390/audiolres16030083 - 28 May 2026

Viewed by 178

Abstract

Background/Objectives: This study investigated auditory perception and psychosocial well-being in long-term cochlear implant (CI) users, with a particular focus on the effects of auditory (re)habilitation on learned helplessness and speech-in-noise perception, representing everyday listening performance. Methods: Thirty CI users and thirty peers with typical hearing (TH) participated in the study. Speech perception was assessed using the Hearing in Noise Test (HINT) and the Matrix Test in both quiet and noisy listening conditions. Psychosocial status was evaluated using the Learned Helplessness Scale (LHS), the Beck Depression Inventory (BDI), and the Beck Anxiety Inventory (BAI). Perceived hearing quality was evaluated using the Hearing Implant Sound Quality Index (HISQUI). Results: CI users showed significantly poorer speech perception performance than TH participants (p < 0.05), whereas between-group psychosocial outcomes, including LHS, BDI, and BAI scores, did not differ significantly (p > 0.05). Positive correlations were observed between Matrix and HINT scores in quiet and noisy conditions. Positive associations were also observed between CI hearing thresholds and HINT/Matrix results in noisy conditions. Within the prelingually deaf CI subgroup, age at implantation was correlated with CI thresholds, as well as with speech perception performance across both tests (p < 0.05). Conclusions: Although CI users showed significantly poorer speech perception performance, their levels of learned helplessness, depression, and anxiety were comparable to those of their TH peers. These results suggest auditory benefits following long-term CI rehabilitation, while psychosocial status appears to be within a typical range despite persistent listening difficulties in daily life.

(This article belongs to the Section Hearing)

16 pages, 7030 KB

Open AccessArticle

DDCATNet: Effective Deep Learning-Based Illumination Color Cast Estimation Approach for Achieving Computational Color Constancy

by Ho-Hyoung Choi

Sensors 2026, 26(11), 3313; https://doi.org/10.3390/s26113313 - 23 May 2026

Viewed by 248

Abstract

Digital camera sensors are designed to capture a wide range of incident illuminants, enabling the creation of high-quality images. However, these sensors lack the capability to differentiate between the color of the source illuminant and the actual color (or original color) of the object being captured. For this reason, computational color constancy (CCC) was introduced and has been developed over decades. CCC is an approach to modeling the color perception of the human visual system (HVS) by ensuring accurate object color determination under varying source illuminant conditions. At the core of human visual perception (HVP)-based CCC is attaining higher accuracy in scene illuminant estimation. The emergence of deep convolutional neural networks (DCNNs) was a recent innovation in accurate illuminant estimation, fundamentally transforming the CCC research landscape. Nevertheless, accurate illuminant estimation remains a major challenge for both traditional and state-of-the-art (SOTA) approaches. To further advance precision in illuminant estimation, this article presents a novel learning-based illumination color cast estimation approach to HVP-based CCC. Most importantly, the proposed approach is intended to integrate informative features into both channel and spatial regions while preserving long-term dependency feature information with the use of dense skip connections. To achieve these objectives, the proposed Dense Dual Connection Aggregated Transform Network (DDCATNet) architecture is designed to comprise several modules: shallow feature extraction, channel-wise and spatial feature-based Dense Dual Connection (DDC), fusion of the dense channel-wise attention (CA) and spatial attention (SA) branches through a gate mechanism (GM) unit, and aggregate transform.
It is worth noting that both the CA blocks and the SA blocks in the DDC module are characterized by dense and cascading connections, meant to preserve long-term feature information and modulate different-level feature information at both global and local scales. The densely connected CA branch (DCA) and the densely connected SA branch (DSA) are also highly effective in securing high-contribution information while suppressing redundant data. The GM unit is integrated at the back of the DDC module, fusing the DCA and DSA branches to ensure the adaptive merging of useful hierarchical feature information and the extraction of more valuable feature information. As a result, the proposed DDCATNet architecture significantly enhanced precision in illuminant estimation, thereby improving performance. In rigorous experiments on a wide range of datasets, the proposed DDCATNet approach outperformed its SOTA counterparts, validating its efficacy and generalization capabilities, as well as its robust camera invariance, across diverse single- and multi-illuminant datasets and model architectures.

(This article belongs to the Section Sensing and Imaging)

29 pages, 38428 KB

Open AccessArticle

A Dual-Path CNN and Transformer Network for Continuous Pavement Crack Detection

by Jinhe Zhang, Shangyu Sun, Weidong Song, Yuxuan Li and Qiaoshuang Teng

Sensors 2026, 26(11), 3286; https://doi.org/10.3390/s26113286 - 22 May 2026

Viewed by 286

Abstract

Cracks are among the most common pavement distresses, and their timely detection is crucial for road maintenance. Existing methods struggle to completely capture elongated and irregular cracks, often resulting in fragmented detection outputs, which leads to the inaccurate assessment of crack length and affects the reliability of pavement condition evaluation. To address this issue, this paper proposes a dual-path crack segmentation network that integrates a CNN and a Transformer. The CNN branch incorporates a dynamic multi-branch convolution module to enhance the directional perception and structural modeling of elongated cracks. The Transformer branch employs a lightweight DCNv4 module to replace traditional self-attention mechanisms, effectively capturing long-range dependencies while reducing computational complexity. A multi-path fusion module is designed to achieve the collaborative enhancement of dual-path features, improving the semantic representation of continuous crack regions. Additionally, a combined loss function of BCE and Dice is adopted to alleviate the severe class imbalance between crack and background pixels, further improving the completeness of crack segmentation. Experiments on four datasets (CFD, DeepCrack537, Gaps384, and Crack500) demonstrate that the proposed model outperforms all compared methods in terms of F-score and mIoU. Ablation studies further validate the effectiveness of the dual-path architecture and its key modules in improving performance. Furthermore, in field validation on real road scenarios, the pavement condition index (PCI) calculated based on the proposed method shows an average deviation of only 0.81 compared to manually interpreted ground truth, demonstrating the practical value of continuous crack detection for pavement maintenance assessment.
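
The combined BCE and Dice loss mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state how the two terms are weighted or smoothed, so the function name `bce_dice_loss`, the `bce_weight` parameter, and the `eps` smoothing constant below are assumptions made for illustration, not the authors' implementation. The sketch works on flat lists of per-pixel probabilities and binary labels.

```python
import math


def bce_dice_loss(probs, targets, bce_weight=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy and soft Dice loss.

    probs: predicted crack probabilities in [0, 1], one per pixel.
    targets: ground-truth labels (1 = crack, 0 = background).
    bce_weight and eps are hypothetical choices, not taken from the paper.
    """
    if len(probs) != len(targets) or not probs:
        raise ValueError("probs and targets must be non-empty and equal in length")

    # Clamp probabilities so log() never receives exactly 0 or 1.
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]

    # Binary cross-entropy, averaged over all pixels.
    bce = -sum(
        t * math.log(p) + (1 - t) * math.log(1.0 - p)
        for p, t in zip(clipped, targets)
    ) / len(clipped)

    # Soft Dice coefficient: overlap measured on probabilities, not hard masks.
    intersection = sum(p * t for p, t in zip(clipped, targets))
    dice = (2.0 * intersection + eps) / (sum(clipped) + sum(targets) + eps)

    return bce_weight * bce + (1.0 - bce_weight) * (1.0 - dice)
```

The Dice term is what addresses class imbalance in this sketch: a prediction that labels nearly every pixel as background keeps the averaged BCE small when cracks are rare, but its Dice loss stays close to 1 because almost none of the crack pixels are overlapped.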

(This article belongs to the Section Sensing and Imaging)
