Search Results (565)

Search Parameters:
Keywords = multi-scale discriminators

22 pages, 1342 KiB  
Article
Multi-Scale Attention-Driven Hierarchical Learning for Fine-Grained Visual Categorization
by Zhihuai Hu, Rihito Kojima and Xian-Hua Han
Electronics 2025, 14(14), 2869; https://doi.org/10.3390/electronics14142869 - 18 Jul 2025
Abstract
Fine-grained visual categorization (FGVC) presents significant challenges due to subtle inter-class variation and significant intra-class diversity, often leading to limited discriminative capacity in global representations. Existing methods inadequately capture localized, class-relevant features across multiple semantic levels, especially under complex spatial configurations. To address these challenges, we introduce a Multi-scale Attention-driven Hierarchical Learning (MAHL) framework that iteratively refines feature representations via scale-adaptive attention mechanisms. Specifically, fully connected (FC) classifiers are applied to spatially pooled feature maps at multiple network stages to capture global semantic context. The learned FC weights are then projected onto the original high-resolution feature maps to compute spatial contribution scores for the predicted class, serving as attention cues. These multi-scale attention maps guide the selection of discriminative regions, which are hierarchically integrated into successive training iterations to reinforce both global and local contextual dependencies. Moreover, we explore a generalized pooling operation that parametrically fuses average and max pooling, enabling richer contextual retention in the encoded features. Comprehensive evaluations on benchmark FGVC datasets demonstrate that MAHL consistently outperforms state-of-the-art methods, validating its efficacy in learning robust, class-discriminative, high-resolution representations through attention-guided hierarchical refinement.
(This article belongs to the Special Issue Advances in Machine Learning for Image Classification)
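The MAHL abstract describes a generalized pooling operation that parametrically fuses average and max pooling. A minimal NumPy sketch of one plausible parameterization, assuming a single mixing weight `alpha` (the authors' exact form may differ; in a network `alpha` would be a learnable parameter):

```python
import numpy as np

def generalized_pool(feat, alpha):
    """Parametric fusion of average and max pooling over spatial dims.

    feat : (C, H, W) feature map
    alpha: mixing weight in [0, 1]; alpha=1 -> pure average pooling,
           alpha=0 -> pure max pooling.
    """
    avg = feat.mean(axis=(1, 2))  # (C,)
    mx = feat.max(axis=(1, 2))    # (C,)
    return alpha * avg + (1.0 - alpha) * mx

feat = np.arange(8.0).reshape(2, 2, 2)  # two 2x2 channels
# channel 0: avg 1.5, max 3.0; channel 1: avg 5.5, max 7.0
pooled = generalized_pool(feat, alpha=0.5)  # midpoint of avg and max
```

With `alpha = 1` this reduces to pure average pooling and with `alpha = 0` to pure max pooling, so the learned weight interpolates between retaining context and keeping peak responses.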

25 pages, 6123 KiB  
Article
SDA-YOLO: An Object Detection Method for Peach Fruits in Complex Orchard Environments
by Xudong Lin, Dehao Liao, Zhiguo Du, Bin Wen, Zhihui Wu and Xianzhi Tu
Sensors 2025, 25(14), 4457; https://doi.org/10.3390/s25144457 - 17 Jul 2025
Abstract
To address the challenges of leaf–branch occlusion, fruit mutual occlusion, complex background interference, and scale variations in peach detection within complex orchard environments, this study proposes an improved YOLOv11n-based peach detection method named SDA-YOLO. First, in the backbone network, the LSKA module is embedded into the SPPF module to construct an SPPF-LSKA fusion module, enhancing multi-scale feature representation for peach targets. Second, an MPDIoU-based bounding box regression loss function replaces CIoU to improve localization accuracy for overlapping and occluded peaches. The DyHead Block is integrated into the detection head to form a DMDetect module, strengthening feature discrimination for small and occluded targets in complex backgrounds. To address insufficient feature fusion flexibility caused by scale variations from occlusion and illumination differences in multi-scale peach detection, a novel Adaptive Multi-Scale Fusion Pyramid (AMFP) module is proposed to enhance the neck network, improving flexibility in processing complex features. Experimental results demonstrate that SDA-YOLO achieves precision (P), recall (R), mAP@0.5, and mAP@0.5:0.95 of 90.8%, 85.4%, 90%, and 62.7%, respectively, surpassing YOLOv11n by 2.7%, 4.8%, 2.7%, and 7.2%. This verifies the method’s robustness in complex orchard environments and provides effective technical support for intelligent fruit harvesting and yield estimation.
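SDA-YOLO swaps CIoU for an MPDIoU-based regression loss. A sketch of the commonly cited MPDIoU formulation (IoU minus the squared distances between matching top-left and bottom-right corners, normalized by the squared image diagonal; the paper's exact variant may differ):

```python
import numpy as np

def mpdiou(box_p, box_g, img_w, img_h):
    """MPDIoU between two boxes in (x1, y1, x2, y2) format."""
    # intersection area
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    iou = inter / (area_p + area_g - inter)
    # squared corner distances, normalized by the squared image diagonal
    d1 = (box_p[0] - box_g[0]) ** 2 + (box_p[1] - box_g[1]) ** 2
    d2 = (box_p[2] - box_g[2]) ** 2 + (box_p[3] - box_g[3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d1 / norm - d2 / norm

b = (10.0, 10.0, 50.0, 50.0)
perfect = mpdiou(b, b, 640, 640)                       # identical boxes -> 1.0
shifted = mpdiou((12.0, 10.0, 52.0, 50.0), b, 640, 640)  # slight offset -> < 1.0
```

The training loss would then be `1 - mpdiou(...)`, so a perfectly aligned box incurs zero loss and misaligned corners are penalized even when the IoU is unchanged.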

24 pages, 20337 KiB  
Article
MEAC: A Multi-Scale Edge-Aware Convolution Module for Robust Infrared Small-Target Detection
by Jinlong Hu, Tian Zhang and Ming Zhao
Sensors 2025, 25(14), 4442; https://doi.org/10.3390/s25144442 - 16 Jul 2025
Abstract
Infrared small-target detection remains a critical challenge in military reconnaissance, environmental monitoring, forest-fire prevention, and search-and-rescue operations, owing to the targets’ extremely small size, sparse texture, low signal-to-noise ratio, and complex background interference. Traditional convolutional neural networks (CNNs) struggle to detect such weak, low-contrast objects due to their limited receptive fields and insufficient feature extraction capabilities. To overcome these limitations, we propose a Multi-Scale Edge-Aware Convolution (MEAC) module that enhances feature representation for small infrared targets without increasing parameter count or computational cost. Specifically, MEAC fuses (1) original local features, (2) multi-scale context captured via dilated convolutions, and (3) high-contrast edge cues derived from differential Gaussian filters. After fusing these branches, channel and spatial attention mechanisms are applied to adaptively emphasize critical regions, further improving feature discrimination. The MEAC module is fully compatible with standard convolutional layers and can be seamlessly embedded into various network architectures. Extensive experiments on three public infrared small-target datasets (SIRSTD-UAVB, IRSTDv1, and IRSTD-1K) demonstrate that networks augmented with MEAC significantly outperform baseline models using standard convolutions. When compared to eleven mainstream convolution modules (ACmix, AKConv, DRConv, DSConv, LSKConv, MixConv, PConv, ODConv, GConv, and Involution), our method consistently achieves the highest detection accuracy and robustness. Experiments conducted across multiple versions, including YOLOv10, YOLOv11, and YOLOv12, as well as various network levels, demonstrate that the MEAC module achieves stable improvements in performance metrics while slightly increasing computational and parameter complexity. These results validate MEAC’s significant advantages in enhancing the detection of small, weak targets and suppressing interference from complex backgrounds, highlighting its strong generalization ability and practical application potential.
(This article belongs to the Section Sensing and Imaging)
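MEAC’s edge branch derives high-contrast cues from differential Gaussian filters. A pure-NumPy difference-of-Gaussians sketch (the sigmas, kernel radii, and edge padding here are illustrative assumptions, not the paper's exact settings):

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2*radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with edge padding (pure NumPy)."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)
    pad = np.pad(img, ((0, 0), (radius, radius)), mode="edge")
    rows = np.array([np.convolve(r, k, mode="valid") for r in pad])
    pad = np.pad(rows, ((radius, radius), (0, 0)), mode="edge")
    cols = np.array([np.convolve(c, k, mode="valid") for c in pad.T]).T
    return cols

def dog_edges(img, sigma1=1.0, sigma2=2.0):
    """Difference of Gaussians: a band-pass response that highlights edges."""
    return blur(img, sigma1) - blur(img, sigma2)

# A vertical step edge: DoG responds near the discontinuity, ~0 elsewhere.
img = np.zeros((16, 16))
img[:, 8:] = 1.0
edges = dog_edges(img)
```

Flat regions cancel out in the subtraction, so the response concentrates around intensity discontinuities, which is exactly the cue a small low-contrast infrared target or its boundary produces.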

21 pages, 41202 KiB  
Article
Copper Stress Levels Classification in Oilseed Rape Using Deep Residual Networks and Hyperspectral False-Color Images
by Yifei Peng, Jun Sun, Zhentao Cai, Lei Shi, Xiaohong Wu, Chunxia Dai and Yubin Xie
Horticulturae 2025, 11(7), 840; https://doi.org/10.3390/horticulturae11070840 - 16 Jul 2025
Abstract
In recent years, heavy metal contamination in agricultural products has become a growing concern in the field of food safety. Copper (Cu) stress in crops not only leads to significant reductions in both yield and quality but also poses potential health risks to humans. This study proposes an efficient and precise non-destructive detection method for Cu stress in oilseed rape, which is based on hyperspectral false-color image construction using principal component analysis (PCA). By comprehensively capturing the spectral representation of oilseed rape plants, both the one-dimensional (1D) spectral sequence and spatial image data were utilized for multi-class classification. The classification performance of models based on 1D spectral sequences was compared from two perspectives: first, between machine learning and deep learning methods (best accuracy: 93.49% vs. 96.69%); and second, between shallow and deep convolutional neural networks (CNNs) (best accuracy: 95.15% vs. 96.69%). For spatial image data, deep residual networks were employed to evaluate the effectiveness of visible-light and false-color images. The RegNet architecture was chosen for its flexible parameterization and proven effectiveness in extracting multi-scale features from hyperspectral false-color images. This flexibility enabled RegNetX-6.4GF to achieve optimal performance on the dataset constructed from three types of false-color images, with the model reaching a Macro-Precision, Macro-Recall, Macro-F1, and Accuracy of 98.17%, 98.15%, 98.15%, and 98.15%, respectively. Furthermore, Grad-CAM visualizations revealed that latent physiological changes in plants under heavy metal stress guided feature learning within CNNs, and demonstrated the effectiveness of false-color image construction in extracting discriminative features. Overall, the proposed technique can be integrated into portable hyperspectral imaging devices, enabling real-time and non-destructive detection of heavy metal stress in modern agricultural practices.
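The false-color construction described above projects the hyperspectral cube onto its leading principal components. A minimal sketch (band-wise PCA via SVD, min-max scaled to 8-bit per component; the authors' preprocessing pipeline may differ):

```python
import numpy as np

def pca_false_color(cube, n_components=3):
    """Map a hyperspectral cube (H, W, B) to an (H, W, 3) false-color image
    using the first principal components of the band dimension."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(float)
    flat -= flat.mean(axis=0)                      # mean-center each band
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    scores = flat @ vt[:n_components].T            # (H*W, n_components) PC scores
    # min-max scale each component to [0, 255] for display
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    rgb = (scores - lo) / np.where(hi > lo, hi - lo, 1.0) * 255.0
    return rgb.reshape(h, w, n_components).astype(np.uint8)

rng = np.random.default_rng(0)
cube = rng.random((8, 8, 30))   # toy 30-band cube
img = pca_false_color(cube)     # (8, 8, 3) uint8 false-color image
```

The first components carry most of the spectral variance, so the resulting three-channel image concentrates stress-related spectral differences into something a standard RGB-input CNN can consume.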

22 pages, 3279 KiB  
Article
HA-CP-Net: A Cross-Domain Few-Shot SAR Oil Spill Detection Network Based on Hybrid Attention and Category Perception
by Dongmei Song, Shuzhen Wang, Bin Wang, Weimin Chen and Lei Chen
J. Mar. Sci. Eng. 2025, 13(7), 1340; https://doi.org/10.3390/jmse13071340 - 13 Jul 2025
Abstract
Deep learning models have obvious advantages in detecting oil spills, but their training heavily depends on a large number of high-quality samples. However, due to the accidental nature, unpredictability, and urgency of oil spill incidents, it is difficult to obtain a large number of labeled samples in real oil spill monitoring scenarios. Notably, few-shot learning can achieve excellent classification performance with only a small number of labeled samples. In this context, a new cross-domain few-shot SAR oil spill detection network is proposed in this paper. Significantly, the network is embedded with a hybrid attention feature extraction block, which consists of a coordinate attention module to perceive channel information and spatial location information, a global self-attention transformer module capturing global dependencies, and a multi-scale self-attention module depicting local detailed features, thereby achieving deep mining and accurate characterization of image features. In addition, to address the problem that suspected oil films in seawater are difficult to distinguish from real oil films under few-shot conditions due to their small feature differences, this paper proposes a double loss function category determination block, which consists of two parts: a well-designed category-perception loss function and a traditional cross-entropy loss function. The category-perception loss function optimizes the spatial distribution of sample features by shortening the distance between similar samples while expanding the distance between different samples. By combining the category-perception loss function with the cross-entropy loss function, the network’s performance in discriminating between real and suspected oil films is maximized. The experimental results demonstrate that this study provides an effective solution for high-precision oil spill detection under few-shot conditions, which is conducive to the rapid identification of oil spill accidents.
(This article belongs to the Section Marine Environmental Science)

21 pages, 21215 KiB  
Article
ES-Net Empowers Forest Disturbance Monitoring: Edge–Semantic Collaborative Network for Canopy Gap Mapping
by Yutong Wang, Zhang Zhang, Jisheng Xia, Fei Zhao and Pinliang Dong
Remote Sens. 2025, 17(14), 2427; https://doi.org/10.3390/rs17142427 - 12 Jul 2025
Abstract
Canopy gaps are vital microhabitats for forest carbon cycling and species regeneration, whose accurate extraction is crucial for ecological modeling and smart forestry. However, traditional monitoring methods have notable limitations: ground-based measurements are inefficient; remote-sensing interpretation is susceptible to terrain and spectral interference; and traditional algorithms exhibit an insufficient feature representation capability. To overcome these bottlenecks in canopy gap identification in mountainous forest regions, we constructed a multi-task deep learning model (ES-Net) integrating an edge–semantic collaborative perception mechanism. First, a refined sample library containing multi-scale interference features was constructed, which included 2808 annotated UAV images. Based on this, a dual-branch feature interaction architecture was designed. A cross-layer attention mechanism was embedded in the semantic segmentation module (SSM) to enhance the discriminative ability for heterogeneous features. Meanwhile, an edge detection module (EDM) was built to strengthen geometric constraints. Results from selected areas in Yunnan Province (China) demonstrate that ES-Net outperforms U-Net, boosting the Intersection over Union (IoU) by 0.86% (95.41% vs. 94.55%), improving the edge coverage rate by 3.14% (85.32% vs. 82.18%), and reducing the Hausdorff Distance by 38.6% (28.26 pixels vs. 46.02 pixels). Ablation studies further verify that the synergy between SSM and EDM yields a 13.0% IoU gain over the baseline, highlighting the effectiveness of joint semantic–edge optimization. This study provides a terrain-adaptive intelligent interpretation method for forest disturbance monitoring and holds significant practical value for advancing smart forestry construction and ecosystem sustainable management.

24 pages, 2440 KiB  
Article
A Novel Dynamic Context Branch Attention Network for Detecting Small Objects in Remote Sensing Images
by Huazhong Jin, Yizhuo Song, Ting Bai, Kaimin Sun and Yepei Chen
Remote Sens. 2025, 17(14), 2415; https://doi.org/10.3390/rs17142415 - 12 Jul 2025
Abstract
Detecting small objects in remote sensing images is challenging due to their small size, which results in limited distinctive features. This limitation necessitates the effective use of contextual information for accurate identification. Many existing methods often struggle because they do not dynamically adjust the contextual scope based on the specific characteristics of each target. To address this issue and improve the detection performance of small objects (typically defined as objects with a bounding box area of less than 1024 pixels), we propose a novel backbone network called the Dynamic Context Branch Attention Network (DCBANet). We present the Dynamic Context Scale-Aware (DCSA) Block, which utilizes a multi-branch architecture to generate features with diverse receptive fields. Within each branch, a Context Adaptive Selection Module (CASM) dynamically weights information, allowing the model to focus on the most relevant context. To further enhance performance, we introduce an Efficient Branch Attention (EBA) module that adaptively reweights the parallel branches, prioritizing the most discriminative ones. Finally, to ensure computational efficiency, we design a Dual-Gated Feedforward Network (DGFFN), a lightweight yet powerful replacement for standard FFNs. Extensive experiments conducted on four public remote sensing datasets demonstrate that the DCBANet achieves impressive mAP@0.5 scores of 80.79% on DOTA, 89.17% on NWPU VHR-10, 80.27% on SIMD, and a remarkable 42.4% mAP@0.5:0.95 on the specialized small object benchmark AI-TOD. These results surpass RetinaNet, YOLOF, FCOS, Faster R-CNN, Dynamic R-CNN, SKNet, and Cascade R-CNN, highlighting its effectiveness in detecting small objects in remote sensing images. However, there remains potential for further improvement in multi-scale and weak target detection. Future work will integrate local and global context to enhance multi-scale object detection performance.
(This article belongs to the Special Issue High-Resolution Remote Sensing Image Processing and Applications)

18 pages, 4631 KiB  
Article
Semantic Segmentation of Rice Fields in Sub-Meter Satellite Imagery Using an HRNet-CA-Enhanced DeepLabV3+ Framework
by Yifan Shao, Pan Pan, Hongxin Zhao, Jiale Li, Guoping Yu, Guomin Zhou and Jianhua Zhang
Remote Sens. 2025, 17(14), 2404; https://doi.org/10.3390/rs17142404 - 11 Jul 2025
Abstract
Accurate monitoring of rice-planting areas underpins food security and evidence-based farm management. Recent work has advanced along three complementary lines—multi-source data fusion (to mitigate cloud and spectral confusion), temporal feature extraction (to exploit phenology), and deep-network architecture optimization. However, even the best fusion- and time-series-based approaches still struggle to preserve fine spatial details in sub-meter scenes. Targeting this gap, we propose an HRNet-CA-enhanced DeepLabV3+ that retains the original model’s strengths while resolving its two key weaknesses: (i) detail loss caused by repeated down-sampling and feature-pyramid compression and (ii) boundary blurring due to insufficient multi-scale information fusion. The Xception backbone is replaced with a High-Resolution Network (HRNet) to maintain full-resolution feature streams through multi-resolution parallel convolutions and cross-scale interactions. A coordinate attention (CA) block is embedded in the decoder to strengthen spatially explicit context and sharpen class boundaries. We constructed a rice dataset of 23,295 images (11,295 rice + 12,000 non-rice) via preprocessing and manual labeling and benchmarked the proposed model against classical segmentation networks. Our approach boosts boundary segmentation accuracy to 92.28% MIOU and raises texture-level discrimination to 95.93% F1, without extra inference latency. Although this study focuses on architecture optimization, the HRNet-CA backbone is readily compatible with future multi-source fusion and time-series modules, offering a unified path toward operational paddy mapping in fragmented sub-meter landscapes.

20 pages, 3802 KiB  
Article
RT-DETR-FFD: A Knowledge Distillation-Enhanced Lightweight Model for Printed Fabric Defect Detection
by Gengliang Liang, Shijia Yu and Shuguang Han
Electronics 2025, 14(14), 2789; https://doi.org/10.3390/electronics14142789 - 11 Jul 2025
Abstract
Automated defect detection for printed fabric manufacturing faces critical challenges in balancing industrial-grade accuracy with real-time deployment efficiency. To address this, we propose RT-DETR-FFD, a knowledge-distilled detector optimized for printed fabric defect inspection. Firstly, the student model integrates a Fourier cross-stage mixer (FCSM). This module disentangles defect features from periodic textile backgrounds through spectral decoupling. Secondly, we introduce FuseFlow-Net to enable dynamic multi-scale interaction, thereby enhancing discriminative feature representation. Additionally, a learnable positional encoding (LPE) module transcends rigid geometric constraints, strengthening contextual awareness. Furthermore, we design a dynamic correlation-guided loss (DCGLoss) for distillation optimization. Our loss leverages masked frequency-channel alignment and cross-domain fusion mechanisms to streamline knowledge transfer. Experiments demonstrate that the distilled model achieves an mAP@0.5 of 82.1%, surpassing the baseline RT-DETR-R18 by 6.3% while reducing parameters by 11.7%. This work establishes an effective paradigm for deploying high-precision defect detectors in resource-constrained industrial scenarios, advancing real-time quality control in textile manufacturing.

24 pages, 3524 KiB  
Article
Transient Stability Assessment of Power Systems Based on Temporal Feature Selection and LSTM-Transformer Variational Fusion
by Zirui Huang, Zhaobin Du, Jiawei Gao and Guoduan Zhong
Electronics 2025, 14(14), 2780; https://doi.org/10.3390/electronics14142780 - 10 Jul 2025
Abstract
To address the challenges brought by the high penetration of renewable energy in power systems, such as multi-scale dynamic interactions, high feature dimensionality, and limited model generalization, this paper proposes a transient stability assessment (TSA) method that combines temporal feature selection with deep learning-based modeling. First, a two-stage feature selection strategy is designed using the inter-class Mahalanobis distance and Spearman rank correlation. This helps extract highly discriminative and low-redundancy features from wide-area measurement system (WAMS) time-series data. Then, a parallel LSTM-Transformer architecture is constructed to capture both short-term local fluctuations and long-term global dependencies. A variational inference mechanism based on a Gaussian mixture model (GMM) is introduced to enable dynamic representation fusion and uncertainty modeling. A composite loss function combining improved focal loss and Kullback–Leibler (KL) divergence regularization is designed to enhance model robustness and training stability under complex disturbances. The proposed method is validated on a modified IEEE 39-bus system. Results show that it outperforms existing models in accuracy, robustness, and interpretability. This provides an effective solution for TSA in power systems with high renewable energy integration.
(This article belongs to the Special Issue Advanced Energy Systems and Technologies for Urban Sustainability)
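The two-stage selection above combines an inter-class Mahalanobis distance with Spearman redundancy filtering. A simplified sketch (the paper's distance is presumably multivariate; this per-feature scoring, the greedy filter, and the 0.9 threshold are illustrative assumptions):

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation (no tie handling; illustrative only)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

def select_features(X, y, k=2, max_corr=0.9):
    """Stage 1: score each feature by a per-feature Mahalanobis-style
    inter-class distance (squared mean gap over pooled variance).
    Stage 2: greedily keep high-scoring features whose Spearman correlation
    with every already-kept feature stays below `max_corr`."""
    c0, c1 = X[y == 0], X[y == 1]
    pooled = (c0.var(axis=0) + c1.var(axis=0)) / 2 + 1e-12
    score = (c0.mean(axis=0) - c1.mean(axis=0)) ** 2 / pooled
    kept = []
    for f in np.argsort(score)[::-1]:           # best-scoring first
        if all(abs(spearman(X[:, f], X[:, g])) < max_corr for g in kept):
            kept.append(int(f))
        if len(kept) == k:
            break
    return kept

rng = np.random.default_rng(1)
n = 200
y = np.repeat([0, 1], n // 2)                   # stable / unstable labels
f0 = y + 0.1 * rng.standard_normal(n)           # discriminative feature
f1 = f0 + 1e-3 * rng.standard_normal(n)         # redundant copy of f0
f2 = rng.standard_normal(n)                     # pure noise
X = np.column_stack([f0, f1, f2])
kept = select_features(X, y, k=2)               # keeps one of f0/f1, plus f2
```

The redundant twin of the top feature is rejected by the correlation filter, so the selected set is both discriminative and low-redundancy, which is the stated goal of the strategy.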

19 pages, 14033 KiB  
Article
SCCA-YOLO: Spatial Channel Fusion and Context-Aware YOLO for Lunar Crater Detection
by Jiahao Tang, Boyuan Gu, Tianyou Li and Ying-Bo Lu
Remote Sens. 2025, 17(14), 2380; https://doi.org/10.3390/rs17142380 - 10 Jul 2025
Abstract
Lunar crater detection plays a crucial role in geological analysis and the advancement of lunar exploration. Accurate identification of craters is also essential for constructing high-resolution topographic maps and supporting mission planning in future lunar exploration efforts. However, lunar craters often suffer from insufficient feature representation due to their small size and blurred boundaries. In addition, the visual similarity between craters and surrounding terrain further exacerbates background confusion. These challenges significantly hinder detection performance in remote sensing imagery and underscore the necessity of enhancing both local feature representation and global semantic reasoning. In this paper, we propose a novel Spatial Channel Fusion and Context-Aware YOLO (SCCA-YOLO) model built upon the YOLO11 framework. Specifically, the Context-Aware Module (CAM) employs a multi-branch dilated convolutional structure to enhance feature richness and expand the local receptive field, thereby strengthening the feature extraction capability. The Joint Spatial and Channel Fusion Module (SCFM) is utilized to fuse spatial and channel information to model the global relationships between craters and the background, effectively suppressing background noise and reinforcing feature discrimination. In addition, the improved Channel Attention Concatenation (CAC) strategy adaptively learns channel-wise importance weights during feature concatenation, further optimizing multi-scale semantic feature fusion and enhancing the model’s sensitivity to critical crater features. The proposed method is validated on a self-constructed Chang’e 6 dataset, covering the landing site and its surrounding areas. Experimental results demonstrate that our model achieves an mAP@0.5 of 96.5% and an mAP@0.5:0.95 of 81.5%, outperforming other mainstream detection models including the YOLO family of algorithms. These findings highlight the potential of SCCA-YOLO for high-precision lunar crater detection and provide valuable insights into future lunar surface analysis.

19 pages, 2468 KiB  
Article
A Dual-Branch Spatial-Frequency Domain Fusion Method with Cross Attention for SAR Image Target Recognition
by Chao Li, Jiacheng Ni, Ying Luo, Dan Wang and Qun Zhang
Remote Sens. 2025, 17(14), 2378; https://doi.org/10.3390/rs17142378 - 10 Jul 2025
Abstract
Synthetic aperture radar (SAR) image target recognition has important application value in security reconnaissance and disaster monitoring. However, due to speckle noise and target orientation sensitivity in SAR images, traditional spatial domain recognition methods face challenges in accuracy and robustness. To effectively address these challenges, we propose a dual-branch spatial-frequency domain fusion recognition method with cross-attention, achieving deep fusion of spatial and frequency domain features. In the spatial domain, we propose an enhanced multi-scale feature extraction module (EMFE), which adopts a multi-branch parallel structure to effectively enhance the network’s multi-scale feature representation capability. Combining frequency domain guided attention, the model focuses on key regional features in the spatial domain. In the frequency domain, we design a hybrid frequency domain transformation module (HFDT) that extracts real and imaginary features through Fourier transform to capture the global structure of the image. Meanwhile, we introduce a spatially guided frequency domain attention to enhance the discriminative capability of frequency domain features. Finally, we propose a cross-domain feature fusion (CDFF) module, which achieves bidirectional interaction and optimal fusion of spatial-frequency domain features through cross attention and adaptive feature fusion. Experimental results demonstrate that our method achieves significantly superior recognition accuracy compared to existing methods on the MSTAR dataset.
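The HFDT module extracts real and imaginary features via the Fourier transform. A minimal sketch of that decomposition (the function name and the `fftshift` centering are illustrative, not the paper's exact implementation):

```python
import numpy as np

def hybrid_frequency_features(img):
    """Split an image's 2-D FFT into real and imaginary channels, giving a
    global (whole-image) frequency-domain representation."""
    spec = np.fft.fft2(img)
    spec = np.fft.fftshift(spec)              # move low frequencies to center
    return np.stack([spec.real, spec.imag])   # (2, H, W) feature tensor

img = np.ones((8, 8))                 # constant image: only a DC component
feat = hybrid_frequency_features(img) # real channel has DC at the center
```

Because every frequency coefficient depends on all pixels, these two channels encode global structure, complementing the locally receptive spatial branch before the cross-attention fusion.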

30 pages, 34072 KiB  
Article
ARE-PaLED: Augmented Reality-Enhanced Patch-Level Explainable Deep Learning System for Alzheimer’s Disease Diagnosis from 3D Brain sMRI
by Chitrakala S and Bharathi U
Symmetry 2025, 17(7), 1108; https://doi.org/10.3390/sym17071108 - 10 Jul 2025
Abstract
Structural magnetic resonance imaging (sMRI) is a vital tool for diagnosing neurological brain diseases. However, sMRI scans often show significant structural changes only in limited brain regions due to localised atrophy, making the identification of discriminative features a key challenge. Importantly, the human brain exhibits inherent bilateral symmetry, and deviations from this symmetry—such as asymmetric atrophy—are strong indicators of early Alzheimer’s disease (AD). Patch-based methods help capture local brain changes for early AD diagnosis, but they often struggle with fixed-size limitations, potentially missing subtle asymmetries or broader contextual cues. To address these limitations, we propose a novel augmented reality (AR)-enhanced patch-level explainable deep learning (ARE-PaLED) system. It includes an adaptive multi-scale patch extraction network (AMPEN) to adjust patch sizes based on anatomical characteristics and spatial context, as well as an informative patch selection algorithm (IPSA) to identify discriminative patches, including those reflecting asymmetry patterns associated with AD; additionally, an AR module is proposed for future immersive explainability, complementing the patch-level interpretation framework. Evaluated on 1862 subjects from the ADNI and AIBL datasets, the framework achieved an accuracy of 92.5% (AD vs. NC) and 85.9% (AD vs. MCI). The proposed ARE-PaLED demonstrates potential as an interpretable and immersive diagnostic aid for sMRI-based AD diagnosis.

25 pages, 8372 KiB  
Article
CSDNet: Context-Aware Segmentation of Disaster Aerial Imagery Using Detection-Guided Features and Lightweight Transformers
by Ahcene Zetout and Mohand Saïd Allili
Remote Sens. 2025, 17(14), 2337; https://doi.org/10.3390/rs17142337 - 8 Jul 2025
Abstract
Accurate multi-class semantic segmentation of disaster-affected areas is essential for rapid response and effective recovery planning. We present CSDNet, a context-aware segmentation model tailored to disaster scenes, designed to improve segmentation of both large-scale disaster zones and small, underrepresented classes. The architecture combines a lightweight transformer module for global context modeling with depthwise separable convolutions (DWSCs) to enhance efficiency without compromising representational capacity. Additionally, we introduce a detection-guided feature fusion mechanism that integrates outputs from auxiliary detection tasks to mitigate class imbalance and improve discrimination of visually similar categories. Extensive experiments on several public datasets demonstrate that our model significantly improves segmentation of both man-made infrastructure and natural damage-related features, offering a robust and efficient solution for post-disaster analysis.
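The efficiency claim for depthwise separable convolutions comes straight from their weight count: a k×k standard convolution mixes channels and space at once, while a DWSC splits this into a per-channel k×k filter followed by a 1×1 pointwise mix. A back-of-the-envelope sketch (bias terms omitted; the channel sizes are arbitrary assumptions, not CSDNet's):

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k

def dwsc_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise separable: one k x k filter per input channel,
    then a 1 x 1 pointwise convolution mixing channels."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# e.g. 64 -> 128 channels with 3x3 kernels:
# standard: 64*128*9 = 73728 weights, DWSC: 64*9 + 64*128 = 8768
```

For a 3×3 kernel the saving approaches a factor of k² = 9 as the output channel count grows, which is why DWSCs pair well with a lightweight transformer backbone.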

13 pages, 2285 KiB  
Article
STHFD: Spatial–Temporal Hypergraph-Based Model for Aero-Engine Bearing Fault Diagnosis
by Panfeng Bao, Wenjun Yi, Yue Zhu, Yufeng Shen and Boon Xian Chai
Aerospace 2025, 12(7), 612; https://doi.org/10.3390/aerospace12070612 - 7 Jul 2025
Abstract
Accurate fault diagnosis in aerospace transmission systems is essential for ensuring equipment reliability and operational safety, especially for aero-engine bearings. However, current approaches relying on Convolutional Neural Networks (CNNs) for Euclidean data and Graph Convolutional Networks (GCNs) for non-Euclidean structures struggle to simultaneously capture heterogeneous data properties and complex spatio-temporal dependencies. To address these limitations, we propose a novel Spatial–Temporal Hypergraph Fault Diagnosis framework (STHFD). Unlike conventional graphs that model pairwise relations, STHFD employs hypergraphs to represent high-order spatial–temporal correlations more effectively. Specifically, it constructs distinct spatial and temporal hyperedges to capture multi-scale relationships among fault signals. A type-aware hypergraph learning strategy is then applied to encode these correlations into discriminative embeddings. Extensive experiments on aerospace fault datasets demonstrate that STHFD achieves superior classification performance compared to state-of-the-art diagnostic models, highlighting its potential for enhancing intelligent fault detection in complex aerospace systems.
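The distinction between pairwise graphs and hypergraphs is easiest to see in the incidence matrix: each hyperedge column can connect many nodes at once. A minimal sketch of spatial and temporal hyperedges over (sensor, time-step) signal segments follows; the node layout and hyperedge definitions are illustrative assumptions, not the STHFD construction:

```python
import numpy as np

def build_incidence(n_sensors: int, n_steps: int) -> np.ndarray:
    """Incidence matrix H (nodes x hyperedges) for a spatial-temporal
    hypergraph with one node per (sensor, time-step) segment.

    - spatial hyperedge t: all sensors observed at time-step t
    - temporal hyperedge s: all time-steps of sensor s
    Each column joins many nodes at once, unlike a pairwise edge."""
    n_nodes = n_sensors * n_steps

    def node(s: int, t: int) -> int:
        return s * n_steps + t

    H = np.zeros((n_nodes, n_steps + n_sensors), dtype=int)
    for t in range(n_steps):              # spatial hyperedges
        for s in range(n_sensors):
            H[node(s, t), t] = 1
    for s in range(n_sensors):            # temporal hyperedges
        for t in range(n_steps):
            H[node(s, t), n_steps + s] = 1
    return H
```

Every node lies in exactly one spatial and one temporal hyperedge here; a type-aware learning step could then weight the two hyperedge families differently when aggregating node features.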
