
Search Results (776)

Search Parameters:
Keywords = multiscale feature representation

21 pages, 5917 KiB  
Article
VML-UNet: Fusing Vision Mamba and Lightweight Attention Mechanism for Skin Lesion Segmentation
by Tang Tang, Haihui Wang, Qiang Rao, Ke Zuo and Wen Gan
Electronics 2025, 14(14), 2866; https://doi.org/10.3390/electronics14142866 (registering DOI) - 17 Jul 2025
Abstract
Deep learning has advanced medical image segmentation, yet existing methods struggle with complex anatomical structures. Mainstream models, such as CNN, Transformer, and hybrid architectures, face challenges including insufficient information representation and redundant complexity, which limit their clinical deployment. Developing efficient, lightweight networks is therefore crucial for accurate lesion localization and optimized clinical workflows. We propose VML-UNet, a lightweight segmentation network whose core innovations are the CPMamba module and the multi-scale local supervision module (MLSM). The CPMamba module integrates the visual state space (VSS) block with a channel prior attention mechanism, enabling efficient modeling of spatial relationships at linear computational complexity through dynamic channel-space weight allocation while preserving channel feature integrity. The MLSM enhances local feature perception and reduces the inference burden. Comparative experiments were conducted on three public datasets (ISIC2017, ISIC2018, and PH2), with ablation experiments on ISIC2017. VML-UNet requires only 0.53 M parameters, 2.18 MB of memory, and 1.24 GFLOPs, while outperforming the comparison networks on all three datasets. This study provides a valuable reference for developing lightweight, high-performance skin lesion segmentation networks. Full article
(This article belongs to the Section Bioelectronics)
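The authors' implementation is not part of this listing; as a rough sketch of the general pattern behind multi-scale local supervision (auxiliary decoder outputs each supervised against a resized mask), with the function name, loss weights, and binary-mask setup assumed for illustration:

```python
import torch.nn.functional as F

def multiscale_supervision_loss(side_outputs, target, weights=(1.0, 0.5, 0.25)):
    """Hypothetical multi-scale supervision: each auxiliary logit map is
    compared against the ground-truth mask downsampled to its resolution,
    and the per-scale losses are combined with decaying weights."""
    total = 0.0
    for w, out in zip(weights, side_outputs):
        # target: (N, 1, H, W) binary mask; out: (N, 1, h, w) logits
        tgt = F.interpolate(target, size=out.shape[-2:], mode="nearest")
        total = total + w * F.binary_cross_entropy_with_logits(out, tgt)
    return total
```

Auxiliary heads of this kind are typically dropped at inference, which would be consistent with the abstract's note about reducing the inference burden.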

25 pages, 6123 KiB  
Article
SDA-YOLO: An Object Detection Method for Peach Fruits in Complex Orchard Environments
by Xudong Lin, Dehao Liao, Zhiguo Du, Bin Wen, Zhihui Wu and Xianzhi Tu
Sensors 2025, 25(14), 4457; https://doi.org/10.3390/s25144457 (registering DOI) - 17 Jul 2025
Abstract
To address the challenges of leaf–branch occlusion, mutual occlusion between fruits, complex background interference, and scale variation in peach detection within complex orchard environments, this study proposes an improved YOLOv11n-based peach detection method named SDA-YOLO. First, in the backbone network, the LSKA module is embedded into the SPPF module to construct an SPPF-LSKA fusion module, enhancing multi-scale feature representation for peach targets. Second, an MPDIoU-based bounding box regression loss replaces CIoU to improve localization accuracy for overlapping and occluded peaches. The DyHead Block is integrated into the detection head to form a DMDetect module, strengthening feature discrimination for small and occluded targets in complex backgrounds. To address the limited flexibility of feature fusion under the scale variations caused by occlusion and illumination differences, a novel Adaptive Multi-Scale Fusion Pyramid (AMFP) module is proposed to enhance the neck network. Experimental results demonstrate that SDA-YOLO achieves precision (P), recall (R), mAP@0.5, and mAP@0.5:0.95 of 90.8%, 85.4%, 90%, and 62.7%, respectively, surpassing YOLOv11n by 2.7%, 4.8%, 2.7%, and 7.2%. This verifies the method's robustness in complex orchard environments and provides effective technical support for intelligent fruit harvesting and yield estimation. Full article
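For readers unfamiliar with MPDIoU, a minimal PyTorch sketch of the published MPDIoU idea (IoU penalized by squared distances between matching box corners, normalized by the image size); the (x1, y1, x2, y2) box layout and the epsilon are assumptions, and this is not the authors' code:

```python
import torch

def mpdiou_loss(pred, target, img_w: int, img_h: int, eps: float = 1e-7):
    """MPDIoU-style loss: 1 - (IoU - d1^2/(w^2+h^2) - d2^2/(w^2+h^2)),
    where d1/d2 are distances between top-left / bottom-right corners."""
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)
    norm = float(img_w ** 2 + img_h ** 2)  # squared image diagonal
    d1 = (pred[..., 0] - target[..., 0]) ** 2 + (pred[..., 1] - target[..., 1]) ** 2
    d2 = (pred[..., 2] - target[..., 2]) ** 2 + (pred[..., 3] - target[..., 3]) ** 2
    return 1.0 - (iou - d1 / norm - d2 / norm)
```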

22 pages, 3502 KiB  
Article
NGD-YOLO: An Improved Real-Time Steel Surface Defect Detection Algorithm
by Bingyi Li, Andong Xiao, Xing Hu, Sisi Zhu, Gang Wan, Kunlun Qi and Pengfei Shi
Electronics 2025, 14(14), 2859; https://doi.org/10.3390/electronics14142859 (registering DOI) - 17 Jul 2025
Abstract
Steel surface defect detection is a crucial step in ensuring industrial production quality. However, due to significant variations in scale and irregular geometric morphology of steel surface defects, existing detection algorithms show notable deficiencies in multi-scale feature representation and cross-layer multi-scale feature fusion efficiency. To address these challenges, this paper proposes an improved real-time steel surface defect detection model, NGD-YOLO, based on YOLOv5s, which achieves fast and high-precision defect detection under relatively low hardware requirements. First, a lightweight and efficient Normalization-based Attention Module (NAM) is integrated into the C3 module to construct the C3NAM, enhancing multi-scale feature representation capabilities. Second, an efficient Gather–Distribute (GD) mechanism is introduced into the feature fusion component to build the GD-NAM network, effectively reducing information loss during cross-layer multi-scale information fusion, and a small-target detection layer is added to improve the detection of small defects. Finally, to mitigate the parameter increase caused by the GD-NAM network, a lightweight convolution module, DCConv, integrating Efficient Channel Attention (ECA), is proposed and combined with the C3 module to construct the lightweight C3DC module, improving detection speed and accuracy while reducing model parameters. Experimental results on the public NEU-DET dataset show that the proposed NGD-YOLO model achieves a detection accuracy of 79.2%, a 4.6% mAP improvement over the baseline YOLOv5s network with less than a 25% increase in parameters, and reaches 108.6 FPS, meeting real-time monitoring requirements in industrial production environments. Full article
(This article belongs to the Special Issue Fault Detection Technology Based on Deep Learning)
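NAM is published work (normalization-based attention); a compact sketch of its channel branch, which reuses batch-norm scale factors as channel importance instead of adding fully connected layers (a sketch, not the NGD-YOLO code):

```python
import torch
import torch.nn as nn

class NAMChannelAttention(nn.Module):
    """Channel attention in the NAM style: batch-norm gamma values act as
    per-channel importance, so the module adds almost no parameters."""
    def __init__(self, channels: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.bn(x)
        gamma = self.bn.weight.abs()
        weights = gamma / gamma.sum()          # normalized channel weights
        out = out * weights.view(1, -1, 1, 1)
        return x * torch.sigmoid(out)
```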

22 pages, 4882 KiB  
Article
Dual-Branch Spatio-Temporal-Frequency Fusion Convolutional Network with Transformer for EEG-Based Motor Imagery Classification
by Hao Hu, Zhiyong Zhou, Zihan Zhang and Wenyu Yuan
Electronics 2025, 14(14), 2853; https://doi.org/10.3390/electronics14142853 (registering DOI) - 17 Jul 2025
Abstract
The decoding of motor imagery (MI) electroencephalogram (EEG) signals is crucial for motor control and rehabilitation. However, as feature extraction is the core component of the decoding process, traditional methods, often limited to single-feature domains or shallow time-frequency fusion, struggle to comprehensively capture the spatio-temporal-frequency characteristics of the signals, thereby limiting decoding accuracy. To address these limitations, this paper proposes a dual-branch neural network architecture with multi-domain feature fusion, the dual-branch spatio-temporal-frequency fusion convolutional network with Transformer (DB-STFFCNet). The DB-STFFCNet model consists of three modules: the spatiotemporal feature extraction module (STFE), the frequency feature extraction module (FFE), and the feature fusion and classification module. The STFE module employs a lightweight multi-dimensional attention network combined with a temporal Transformer encoder, capable of simultaneously modeling local fine-grained features and global spatiotemporal dependencies, effectively integrating spatiotemporal information and enhancing feature representation. The FFE module constructs a hierarchical feature refinement structure by leveraging the fast Fourier transform (FFT) and multi-scale frequency convolutions, while a frequency-domain Transformer encoder captures the global dependencies among frequency domain features, thus improving the model’s ability to represent key frequency information. Finally, the fusion module effectively consolidates the spatiotemporal and frequency features to achieve accurate classification. To evaluate the feasibility of the proposed method, experiments were conducted on the BCI Competition IV-2a and IV-2b public datasets, achieving accuracies of 83.13% and 89.54%, respectively, outperforming existing methods. This study provides a novel solution for joint time-frequency representation learning in EEG analysis. Full article
(This article belongs to the Special Issue Artificial Intelligence Methods for Biomedical Data Processing)
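As a rough sketch of the FFE idea (an FFT into the frequency domain followed by multi-scale frequency convolutions); the channel counts, kernel sizes, and magnitude-spectrum choice are assumptions:

```python
import torch
import torch.nn as nn

class FrequencyBranch(nn.Module):
    """Hypothetical FFE-style branch: magnitude spectrum of each EEG
    channel via rFFT, then parallel 1-D convolutions at several kernel
    sizes over the frequency axis."""
    def __init__(self, channels: int):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2) for k in (3, 5, 7)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) EEG segment
        spec = torch.fft.rfft(x, dim=-1).abs()        # (batch, channels, freq)
        return torch.cat([conv(spec) for conv in self.convs], dim=1)
```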

15 pages, 1142 KiB  
Technical Note
Terrain and Atmosphere Classification Framework on Satellite Data Through Attentional Feature Fusion Network
by Antoni Jaszcz and Dawid Połap
Remote Sens. 2025, 17(14), 2477; https://doi.org/10.3390/rs17142477 (registering DOI) - 17 Jul 2025
Abstract
Analysis of surface, terrain, and atmosphere from images or image fragments is important because it enables further processing, and such analysis is particularly needed for satellite and drone imagery. Classifying image elements provides spatial information for autonomous systems, supports identification of landscape elements, and aids monitoring and maintenance of infrastructure and the environment. Hence, in this paper we propose a neural classifier architecture that analyzes different features through parallel processing paths and combines them with a feature fusion mechanism. The model extracts features that capture spatial structure, local patterns, and multi-scale representations, and is guided by attention mechanisms over channels, spatial locations, and feature pyramids. Atrous convolution operators are also used as stronger context feature extractors. The proposed classifier is the main element of the modeled framework for satellite data analysis, which can be trained according to the client's requirements. The methodology was evaluated on three publicly available remote sensing classification datasets (satellite images, Visual Terrain Recognition, and USTC SmokeRS), where the proposed model achieved accuracy scores of 97.8%, 100.0%, and 92.4%, respectively. These results indicate the effectiveness of the proposed attention mechanisms across different remote sensing challenges. Full article
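A minimal sketch of the combination the abstract describes (parallel atrous convolutions as context extractors, fused under learned attention weights); the dilation rates and gating design are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class AtrousAttentionFusion(nn.Module):
    """Parallel dilated branches whose outputs are mixed by softmax
    attention weights computed from pooled channel statistics."""
    def __init__(self, channels: int, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r) for r in rates
        )
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels * len(rates), len(rates), 1),
            nn.Softmax(dim=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]
        w = self.gate(torch.cat(feats, dim=1))        # (B, n_branches, 1, 1)
        return sum(w[:, i:i + 1] * f for i, f in enumerate(feats))
```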

24 pages, 20337 KiB  
Article
MEAC: A Multi-Scale Edge-Aware Convolution Module for Robust Infrared Small-Target Detection
by Jinlong Hu, Tian Zhang and Ming Zhao
Sensors 2025, 25(14), 4442; https://doi.org/10.3390/s25144442 - 16 Jul 2025
Abstract
Infrared small-target detection remains a critical challenge in military reconnaissance, environmental monitoring, forest-fire prevention, and search-and-rescue operations, owing to the targets' extremely small size, sparse texture, low signal-to-noise ratio, and complex background interference. Traditional convolutional neural networks (CNNs) struggle to detect such weak, low-contrast objects due to their limited receptive fields and insufficient feature extraction capabilities. To overcome these limitations, we propose a Multi-Scale Edge-Aware Convolution (MEAC) module that enhances feature representation for small infrared targets with minimal additional parameters and computational cost. Specifically, MEAC fuses (1) original local features, (2) multi-scale context captured via dilated convolutions, and (3) high-contrast edge cues derived from differential Gaussian filters. After fusing these branches, channel and spatial attention mechanisms adaptively emphasize critical regions, further improving feature discrimination. The MEAC module is fully compatible with standard convolutional layers and can be seamlessly embedded into various network architectures. Extensive experiments on three public infrared small-target datasets (SIRSTD-UAVB, IRSTDv1, and IRSTD-1K) demonstrate that networks augmented with MEAC significantly outperform baseline models using standard convolutions. Compared with mainstream convolution modules (ACmix, AKConv, DRConv, DSConv, LSKConv, MixConv, PConv, ODConv, GConv, and Involution), our method consistently achieves the highest detection accuracy and robustness. Experiments across multiple detector versions (YOLOv10, YOLOv11, and YOLOv12) and network levels show that MEAC delivers stable gains in performance metrics while only slightly increasing computational and parameter complexity. These results validate MEAC's effectiveness in enhancing weak small-target detection and suppressing complex background noise, highlighting its strong generalization ability and practical application potential. Full article
(This article belongs to the Section Sensing and Imaging)
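The differential-Gaussian edge cue is the most self-contained piece of MEAC; a sketch of a difference-of-Gaussians branch (the sigmas and kernel size are assumptions, and MEAC's fusion and attention stages are omitted):

```python
import torch
from torchvision.transforms.functional import gaussian_blur

def dog_edge_cues(x: torch.Tensor, s1: float = 1.0, s2: float = 2.0) -> torch.Tensor:
    """Difference of Gaussians: subtracting a coarser blur from a finer
    one leaves a band-pass response that highlights high-contrast edges,
    which is useful for small, dim infrared targets."""
    fine = gaussian_blur(x, kernel_size=[5, 5], sigma=[s1, s1])
    coarse = gaussian_blur(x, kernel_size=[5, 5], sigma=[s2, s2])
    return fine - coarse
```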

27 pages, 7645 KiB  
Article
VMMT-Net: A Dual-Branch Parallel Network Combining Visual State Space Model and Mix Transformer for Land–Sea Segmentation of Remote Sensing Images
by Jiawei Wu, Zijian Liu, Zhipeng Zhu, Chunhui Song, Xinghui Wu and Haihua Xing
Remote Sens. 2025, 17(14), 2473; https://doi.org/10.3390/rs17142473 - 16 Jul 2025
Abstract
Land–sea segmentation is a fundamental task in remote sensing image analysis, and plays a vital role in dynamic coastline monitoring. The complex morphology and blurred boundaries of coastlines in remote sensing imagery make fast and accurate segmentation challenging. Recent deep learning approaches lack the ability to model spatial continuity effectively, thereby limiting a comprehensive understanding of coastline features in remote sensing imagery. To address this issue, we have developed VMMT-Net, a novel dual-branch semantic segmentation framework. By constructing a parallel heterogeneous dual-branch encoder, VMMT-Net integrates the complementary strengths of the Mix Transformer and the Visual State Space Model, enabling comprehensive modeling of local details, global semantics, and spatial continuity. We design a Cross-Branch Fusion Module to facilitate deep feature interaction and collaborative representation across branches, and implement a customized decoder module that enhances the integration of multiscale features and improves boundary refinement of coastlines. Extensive experiments conducted on two benchmark remote sensing datasets, GF-HNCD and BSD, demonstrate that the proposed VMMT-Net outperforms existing state-of-the-art methods in both quantitative metrics and visual quality. Specifically, the model achieves mean F1-scores of 98.48% (GF-HNCD) and 98.53% (BSD) and mean intersection-over-union values of 97.02% (GF-HNCD) and 97.11% (BSD). The model maintains reasonable computational complexity, with only 28.24 M parameters and 25.21 GFLOPs, striking a favorable balance between accuracy and efficiency. These results indicate the strong generalization ability and practical applicability of VMMT-Net in real-world remote sensing segmentation tasks. Full article
(This article belongs to the Special Issue Application of Remote Sensing in Coastline Monitoring)
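A compact sketch of what a cross-branch fusion module can look like for a heterogeneous dual-branch encoder (each branch gated by the other, then merged); the gating structure is an assumption, not the published module:

```python
import torch
import torch.nn as nn

class CrossBranchFusion(nn.Module):
    """Hypothetical fusion of a Mix Transformer branch and a VSS branch:
    each branch is modulated by a sigmoid gate computed from the other,
    then the two are merged with a 1x1 convolution."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate_a = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.gate_b = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.merge = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        a = feat_a * self.gate_b(feat_b)   # transformer features, gated by VSS cues
        b = feat_b * self.gate_a(feat_a)   # VSS features, gated by transformer cues
        return self.merge(torch.cat([a, b], dim=1))
```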

21 pages, 41202 KiB  
Article
Copper Stress Levels Classification in Oilseed Rape Using Deep Residual Networks and Hyperspectral False-Color Images
by Yifei Peng, Jun Sun, Zhentao Cai, Lei Shi, Xiaohong Wu, Chunxia Dai and Yubin Xie
Horticulturae 2025, 11(7), 840; https://doi.org/10.3390/horticulturae11070840 - 16 Jul 2025
Abstract
In recent years, heavy metal contamination in agricultural products has become a growing concern in the field of food safety. Copper (Cu) stress in crops not only leads to significant reductions in both yield and quality but also poses potential health risks to humans. This study proposes an efficient and precise non-destructive detection method for Cu stress in oilseed rape, which is based on hyperspectral false-color image construction using principal component analysis (PCA). By comprehensively capturing the spectral representation of oilseed rape plants, both the one-dimensional (1D) spectral sequence and spatial image data were utilized for multi-class classification. The classification performance of models based on 1D spectral sequences was compared from two perspectives: first, between machine learning and deep learning methods (best accuracy: 93.49% vs. 96.69%); and second, between shallow and deep convolutional neural networks (CNNs) (best accuracy: 95.15% vs. 96.69%). For spatial image data, deep residual networks were employed to evaluate the effectiveness of visible-light and false-color images. The RegNet architecture was chosen for its flexible parameterization and proven effectiveness in extracting multi-scale features from hyperspectral false-color images. This flexibility enabled RegNetX-6.4GF to achieve optimal performance on the dataset constructed from three types of false-color images, with the model reaching a Macro-Precision, Macro-Recall, Macro-F1, and Accuracy of 98.17%, 98.15%, 98.15%, and 98.15%, respectively. Furthermore, Grad-CAM visualizations revealed that latent physiological changes in plants under heavy metal stress guided feature learning within CNNs, and demonstrated the effectiveness of false-color image construction in extracting discriminative features. Overall, the proposed technique can be integrated into portable hyperspectral imaging devices, enabling real-time and non-destructive detection of heavy metal stress in modern agricultural practices. Full article
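PCA-based false-color construction is a standard technique; a minimal sketch of one common recipe (the first three principal components of the per-pixel spectra stretched into an 8-bit three-channel image), which may differ in detail from the authors' pipeline:

```python
import numpy as np
from sklearn.decomposition import PCA

def hyperspectral_to_false_color(cube: np.ndarray) -> np.ndarray:
    """Project each pixel's spectrum onto the first three principal
    components and stretch them to [0, 255]. cube: (H, W, bands)."""
    h, w, bands = cube.shape
    scores = PCA(n_components=3).fit_transform(cube.reshape(-1, bands))
    scores -= scores.min(axis=0)
    scores /= scores.max(axis=0) + 1e-8
    return (scores.reshape(h, w, 3) * 255).astype(np.uint8)
```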

20 pages, 2926 KiB  
Article
SonarNet: Global Feature-Based Hybrid Attention Network for Side-Scan Sonar Image Segmentation
by Juan Lei, Huigang Wang, Liming Fan, Qingyue Gu, Shaowei Rong and Huaxia Zhang
Remote Sens. 2025, 17(14), 2450; https://doi.org/10.3390/rs17142450 - 15 Jul 2025
Abstract
With the rapid advancement of deep learning techniques, side-scan sonar image segmentation has become a crucial task in underwater scene understanding. However, the complex and variable underwater environment poses significant challenges for salient object detection, with traditional deep learning approaches often suffering from inadequate feature representation and the loss of global context during downsampling, thus compromising the segmentation accuracy of fine structures. To address these issues, we propose SonarNet, a Global Feature-Based Hybrid Attention Network specifically designed for side-scan sonar image segmentation. SonarNet features a dual-encoder architecture that leverages residual blocks and a self-attention mechanism to simultaneously capture both global structural and local contextual information. In addition, an adaptive hybrid attention module is introduced to effectively integrate channel and spatial features, while a global enhancement block fuses multi-scale global and spatial representations from the dual encoders, mitigating information loss throughout the network. Comprehensive experiments on a dedicated underwater sonar dataset demonstrate that SonarNet outperforms ten state-of-the-art saliency detection methods, achieving a mean absolute error as low as 2.35%. These results highlight the superior performance of SonarNet in challenging sonar image segmentation tasks. Full article
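As an illustration of the kind of hybrid channel-spatial attention the abstract refers to, a CBAM-flavored sketch (the reduction ratio and pooling choices are assumptions, not SonarNet's module):

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Channel attention from pooled global statistics, followed by a
    spatial map built from channel-wise mean and max pooling."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel(x)
        pooled = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(pooled)
```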

28 pages, 4068 KiB  
Article
GDFC-YOLO: An Efficient Perception Detection Model for Precise Wheat Disease Recognition
by Jiawei Qian, Chenxu Dai, Zhanlin Ji and Jinyun Liu
Agriculture 2025, 15(14), 1526; https://doi.org/10.3390/agriculture15141526 - 15 Jul 2025
Abstract
Wheat disease detection is a crucial component of intelligent agricultural systems in modern agriculture. However, its detection accuracy still has certain limitations: existing models struggle to capture the irregular, fine-grained texture features of lesions, and standard upsampling operations reconstruct spatial information inaccurately. In this work, the GDFC-YOLO method is proposed to address these limitations and improve detection accuracy. The method is based on YOLOv11 and comprises three key improvements: (1) a newly designed Ghost Dynamic Feature Core (GDFC) in the backbone, which improves the efficiency of disease feature extraction and enhances the model's ability to capture informative representations; (2) a redesigned neck, the Disease-Focused Neck (DF-Neck), which strengthens feature expressiveness, improves multi-scale fusion, and refines the feature processing pipeline; and (3) the Powerful Intersection over Union v2 (PIoUv2) loss function, which improves regression accuracy and convergence speed. GDFC-YOLO raised the mean average precision at an intersection-over-union threshold of 0.5 (mAP@0.5) from 0.86 to 0.90, with a precision of 0.899 and a recall of 0.821, while keeping the model at only 9.27 M parameters. These results show that GDFC-YOLO achieves strong detection performance with good practicality, offering a solution that can accurately and efficiently detect crop diseases in real agricultural scenarios. Full article
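The GDFC internals are not public in this listing; since the name suggests a Ghost-style core, here is a sketch of the classic ghost convolution pattern (half the channels from a regular convolution, half from a cheap depthwise op), offered only as background:

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost-style convolution: a cheap depthwise 3x3 generates the second
    half of the output channels from the first half, saving parameters
    and FLOPs. Assumes c_out is even."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, 1, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU(),
        )
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 3, padding=1, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```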

19 pages, 3619 KiB  
Article
An Adaptive Underwater Image Enhancement Framework Combining Structural Detail Enhancement and Unsupervised Deep Fusion
by Semih Kahveci and Erdinç Avaroğlu
Appl. Sci. 2025, 15(14), 7883; https://doi.org/10.3390/app15147883 - 15 Jul 2025
Abstract
The underwater environment severely degrades image quality by absorbing and scattering light. This causes significant challenges, including non-uniform illumination, low contrast, color distortion, and blurring. These degradations compromise the performance of critical underwater applications, including water quality monitoring, object detection, and identification. To address these issues, this study proposes a detail-oriented hybrid framework for underwater image enhancement that synergizes the strengths of traditional image processing with the powerful feature extraction capabilities of unsupervised deep learning. Our framework introduces a novel multi-scale detail enhancement unit to accentuate structural information, followed by a Latent Low-Rank Representation (LatLRR)-based simplification step. This unique combination effectively suppresses common artifacts like oversharpening, spurious edges, and noise by decomposing the image into meaningful subspaces. The principal structural features are then optimally combined with a gamma-corrected luminance channel using an unsupervised MU-Fusion network, achieving a balanced optimization of both global contrast and local details. The experimental results on the challenging Test-C60 and OceanDark datasets demonstrate that our method consistently outperforms state-of-the-art fusion-based approaches, achieving average improvements of 7.5% in UIQM, 6% in IL-NIQE, and 3% in AG. Wilcoxon signed-rank tests confirm that these performance gains are statistically significant (p < 0.01). Consequently, the proposed method significantly mitigates prevalent issues such as color aberration, detail loss, and artificial haze, which are frequently encountered in existing techniques. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
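The gamma-corrected luminance channel mentioned above is straightforward to reproduce; a sketch using OpenCV's CIELAB conversion (the gamma value is an assumption, and the MU-Fusion network itself is not shown):

```python
import cv2
import numpy as np

def gamma_corrected_luminance(bgr: np.ndarray, gamma: float = 0.7) -> np.ndarray:
    """Extract the L (lightness) channel in CIELAB and apply gamma
    correction; gamma < 1 lifts dark underwater regions before fusion."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    luminance = lab[..., 0].astype(np.float32) / 255.0
    return np.power(luminance, gamma)
```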

20 pages, 4820 KiB  
Article
Sem-SLAM: Semantic-Integrated SLAM Approach for 3D Reconstruction
by Shuqi Liu, Yufeng Zhuang, Chenxu Zhang, Qifei Li and Jiayu Hou
Appl. Sci. 2025, 15(14), 7881; https://doi.org/10.3390/app15147881 - 15 Jul 2025
Abstract
Amid the surge of research on integrating Simultaneous Localization and Mapping (SLAM) with neural implicit representations, existing methods show clear limitations in environmental semantic parsing and scene understanding. To address this, this paper proposes a SLAM system that integrates a full attention mechanism and a multi-scale information extractor. The system constructs a more accurate 3D environmental model by fusing semantic, shape, and geometric orientation features. To mine the semantic information in images, a pre-trained, frozen 2D segmentation network extracts semantic features, providing strong support for 3D reconstruction. Furthermore, a multi-layer perceptron and interpolation techniques extract multi-scale features that distinguish information at different scales, enabling effective decoding of semantic, RGB, and Truncated Signed Distance Field (TSDF) values from the fused features and high-quality rendering. Experimental results demonstrate that this method significantly outperforms the baseline methods in mapping and tracking accuracy on the Replica and ScanNet datasets, and shows superior performance in semantic segmentation and real-time semantic mapping tasks, offering a new direction for the development of SLAM technology. Full article
(This article belongs to the Special Issue Applications of Data Science and Artificial Intelligence)
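A sketch of the decoding pattern the abstract describes (interpolating multi-scale feature grids at query points, then MLP heads for TSDF, RGB, and semantics); the grid resolutions, feature widths, and head sizes are all assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleDecoder(nn.Module):
    """Hypothetical decoder: trilinearly sample coarse and fine feature
    grids at 3D query points, concatenate, and decode with small MLPs."""
    def __init__(self, c_coarse=32, c_fine=32, n_classes=20):
        super().__init__()
        c = c_coarse + c_fine
        self.tsdf = nn.Sequential(nn.Linear(c, 64), nn.ReLU(), nn.Linear(64, 1))
        self.rgb = nn.Sequential(nn.Linear(c, 64), nn.ReLU(), nn.Linear(64, 3))
        self.sem = nn.Sequential(nn.Linear(c, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, grids, pts):
        # grids: [(1, C, D, H, W) coarse, fine]; pts: (N, 3) in [-1, 1]
        g = pts.view(1, -1, 1, 1, 3)
        feats = [F.grid_sample(v, g, align_corners=True).view(v.shape[1], -1).t()
                 for v in grids]
        f = torch.cat(feats, dim=-1)          # (N, c_coarse + c_fine)
        return self.tsdf(f), self.rgb(f), self.sem(f)
```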

22 pages, 7562 KiB  
Article
FIGD-Net: A Symmetric Dual-Branch Dehazing Network Guided by Frequency Domain Information
by Luxia Yang, Yingzhao Xue, Yijin Ning, Hongrui Zhang and Yongjie Ma
Symmetry 2025, 17(7), 1122; https://doi.org/10.3390/sym17071122 - 13 Jul 2025
Abstract
Image dehazing technology is a crucial component in the fields of intelligent transportation and autonomous driving. However, most existing dehazing algorithms only process images in the spatial domain, failing to fully exploit the rich information in the frequency domain, which leads to residual haze in the images. To address this issue, we propose a novel Frequency-domain Information Guided Symmetric Dual-branch Dehazing Network (FIGD-Net), which utilizes the spatial branch to extract local haze features and the frequency branch to capture the global haze distribution, thereby guiding the feature learning process in the spatial branch. The FIGD-Net mainly consists of three key modules: the Frequency Detail Extraction Module (FDEM), the Dual-Domain Multi-scale Feature Extraction Module (DMFEM), and the Dual-Domain Guidance Module (DGM). First, the FDEM employs the Discrete Cosine Transform (DCT) to convert the spatial domain into the frequency domain. It then selectively extracts high-frequency and low-frequency features based on predefined proportions. The high-frequency features, which contain haze-related information, are correlated with the overall characteristics of the low-frequency features to enhance the representation of haze attributes. Next, the DMFEM utilizes stacked residual blocks and gradient feature flows to capture local detail features. Specifically, frequency-guided weights are applied to adjust the focus of feature channels, thereby improving the module’s ability to capture multi-scale features and distinguish haze features. Finally, the DGM adjusts channel weights guided by frequency information. This smooths out redundant signals and enables cross-branch information exchange, which helps to restore the original image colors. Extensive experiments demonstrate that the proposed FIGD-Net achieves superior dehazing performance on multiple synthetic and real-world datasets. Full article
(This article belongs to the Section Computer)
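The FDEM's frequency split can be illustrated with a 2-D DCT; a sketch for a single-channel image, where the kept low-frequency proportion is an assumed parameter:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_frequency_split(img: np.ndarray, low_ratio: float = 0.25):
    """Split an (H, W) image into low- and high-frequency parts by masking
    the top-left (low-frequency) block of its DCT coefficients."""
    coeffs = dctn(img, norm="ortho")
    h, w = img.shape
    mask = np.zeros_like(coeffs)
    mask[: int(h * low_ratio), : int(w * low_ratio)] = 1.0
    low = idctn(coeffs * mask, norm="ortho")
    high = idctn(coeffs * (1.0 - mask), norm="ortho")
    return low, high
```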

21 pages, 21215 KiB  
Article
ES-Net Empowers Forest Disturbance Monitoring: Edge–Semantic Collaborative Network for Canopy Gap Mapping
by Yutong Wang, Zhang Zhang, Jisheng Xia, Fei Zhao and Pinliang Dong
Remote Sens. 2025, 17(14), 2427; https://doi.org/10.3390/rs17142427 - 12 Jul 2025
Abstract
Canopy gaps are vital microhabitats for forest carbon cycling and species regeneration, and their accurate extraction is crucial for ecological modeling and smart forestry. However, traditional monitoring methods have notable limitations: ground-based measurements are inefficient; remote-sensing interpretation is susceptible to terrain and spectral interference; and traditional algorithms exhibit insufficient feature representation capability. To overcome these bottlenecks in canopy gap identification in mountainous forest regions, we constructed a multi-task deep learning model (ES-Net) integrating an edge–semantic collaborative perception mechanism. First, a refined sample library containing multi-scale interference features was constructed, comprising 2808 annotated UAV images. Based on this, a dual-branch feature interaction architecture was designed. A cross-layer attention mechanism was embedded in the semantic segmentation module (SSM) to enhance the discriminative ability for heterogeneous features. Meanwhile, an edge detection module (EDM) was built to strengthen geometric constraints. Results from selected areas in Yunnan Province (China) demonstrate that ES-Net outperforms U-Net, boosting the Intersection over Union (IoU) by 0.86% (95.41% vs. 94.55%), improving the edge coverage rate by 3.14% (85.32% vs. 82.18%), and reducing the Hausdorff Distance by 38.6% (28.26 pixels vs. 46.02 pixels). Ablation studies further verify that the synergy between SSM and EDM yields a 13.0% IoU gain over the baseline, highlighting the effectiveness of joint semantic–edge optimization. This study provides a terrain-adaptive intelligent interpretation method for forest disturbance monitoring and holds significant practical value for advancing smart forestry and sustainable ecosystem management. Full article
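Joint semantic-edge optimization of the kind ES-Net uses is typically trained with a weighted two-term loss; a minimal sketch (the weighting and loss choices are assumptions, not the paper's formulation):

```python
import torch
import torch.nn.functional as F

def edge_semantic_loss(seg_logits, edge_logits, seg_mask, edge_mask, lam=0.5):
    """Segmentation branch: per-pixel cross-entropy over classes.
    Edge branch: binary cross-entropy on a boundary map, acting as a
    geometric constraint on gap outlines."""
    seg_loss = F.cross_entropy(seg_logits, seg_mask)
    edge_loss = F.binary_cross_entropy_with_logits(edge_logits, edge_mask)
    return seg_loss + lam * edge_loss
```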

36 pages, 25361 KiB  
Article
Remote Sensing Image Compression via Wavelet-Guided Local Structure Decoupling and Channel–Spatial State Modeling
by Jiahui Liu, Lili Zhang and Xianjun Wang
Remote Sens. 2025, 17(14), 2419; https://doi.org/10.3390/rs17142419 - 12 Jul 2025
Abstract
As the resolution and data volume of remote sensing imagery continue to grow, achieving efficient compression without sacrificing reconstruction quality remains a major challenge: traditional handcrafted codecs often fail to balance rate-distortion performance and computational complexity, while deep learning-based approaches offer superior representational capacity but still struggle to balance fine-detail adaptation and computational efficiency. Mamba, a state-space model (SSM)-based architecture, offers linear-time complexity, excels at capturing long-range dependencies in sequences, and has been adopted in remote sensing compression to model long-distance dependencies between pixels. However, despite its effectiveness in global context aggregation, Mamba's uniform bidirectional scanning is insufficient for capturing high-frequency structures such as edges and textures. Moreover, existing visual state-space (VSS) models built upon Mamba typically treat all channels equally and lack mechanisms to dynamically focus on semantically salient spatial regions. To address these issues, we present the Multi-scale Channel Global Mamba Network (MGMNet), an architecture for remote sensing image compression. MGMNet integrates a spatial-channel dynamic weighting mechanism into the Mamba architecture, enhancing global semantic modeling while selectively emphasizing informative features. It comprises two key modules. The Wavelet Transform-guided Local Structure Decoupling (WTLS) module applies multi-scale wavelet decomposition to disentangle and separately encode low- and high-frequency components, enabling efficient parallel modeling of global contours and local textures. The Channel–Global Information Modeling (CGIM) module enhances conventional VSS with a dual-path attention strategy that reweights spatial and channel information, improving the modeling of long-range dependencies and edge structures. Extensive evaluations on three distinct remote sensing datasets show that MGMNet outperforms current state-of-the-art (SOTA) models across various performance metrics. Full article
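The WTLS module's first step (wavelet decoupling into low- and high-frequency components) can be sketched with PyWavelets; the wavelet choice and single decomposition level are assumptions:

```python
import numpy as np
import pywt

def wavelet_decouple(img: np.ndarray, wavelet: str = "haar"):
    """One level of 2-D DWT: the approximation band carries global
    contours (low frequency); the three detail bands carry edges and
    textures (high frequency) for separate encoding."""
    approx, (horiz, vert, diag) = pywt.dwt2(img, wavelet)
    high = np.stack([horiz, vert, diag])
    return approx, high
```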
