Search Results (6,747)

Search Parameters:
Keywords = visual learning

34 pages, 3703 KiB  
Article
Uncertainty-Aware Deep Learning for Robust and Interpretable MI EEG Using Channel Dropout and LayerCAM Integration
by Óscar Wladimir Gómez-Morales, Sofia Escalante-Escobar, Diego Fabian Collazos-Huertas, Andrés Marino Álvarez-Meza and German Castellanos-Dominguez
Appl. Sci. 2025, 15(14), 8036; https://doi.org/10.3390/app15148036 - 18 Jul 2025
Abstract
Motor Imagery (MI) classification plays a crucial role in enhancing the performance of brain–computer interface (BCI) systems, thereby enabling advanced neurorehabilitation and the development of intuitive brain-controlled technologies. However, MI classification using electroencephalography (EEG) is hindered by spatiotemporal variability and the limited interpretability of deep learning (DL) models. To mitigate these challenges, dropout techniques are employed as regularization strategies. Nevertheless, the removal of critical EEG channels, particularly those from the sensorimotor cortex, can result in substantial spatial information loss, especially under limited training data conditions. This issue, compounded by high EEG variability in subjects with poor performance, hinders generalization and reduces the interpretability and clinical trust in MI-based BCI systems. This study proposes a novel framework integrating channel dropout—a variant of Monte Carlo dropout (MCD)—with class activation maps (CAMs) to enhance robustness and interpretability in MI classification. This integration represents a significant step forward by offering, for the first time, a dedicated solution to concurrently mitigate spatiotemporal uncertainty and provide fine-grained neurophysiologically relevant interpretability in motor imagery classification, particularly demonstrating refined spatial attention in challenging low-performing subjects. We evaluate three DL architectures (ShallowConvNet, EEGNet, TCNet Fusion) on a 52-subject MI-EEG dataset, applying channel dropout to simulate structural variability and LayerCAM to visualize spatiotemporal patterns. Results demonstrate that among the three evaluated deep learning models for MI-EEG classification, TCNet Fusion achieved the highest peak accuracy of 74.4% using 32 EEG channels. At the same time, ShallowConvNet recorded the lowest peak at 72.7%, indicating TCNet Fusion’s robustness in moderate-density montages. 
Incorporating MCD notably improved model consistency and classification accuracy, especially in low-performing subjects where baseline accuracies were below 70%; EEGNet and TCNet Fusion showed accuracy improvements of up to 10% compared to their non-MCD versions. Furthermore, LayerCAM visualizations enhanced with MCD transformed diffuse spatial activation patterns into more focused and interpretable topographies, aligning more closely with known motor-related brain regions and thereby boosting both interpretability and classification reliability across varying subject performance levels. Our approach offers a unified solution for uncertainty-aware and interpretable MI classification. Full article
(This article belongs to the Special Issue EEG Horizons: Exploring Neural Dynamics and Neurocognitive Processes)
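The channel-dropout idea above, keeping dropout active at inference and averaging many stochastic forward passes, can be sketched in a few lines. This is a generic illustration with a toy linear model, not the authors' architecture; the channel count, dropout rate, and number of passes are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, w, drop_mask):
    # One stochastic pass of a toy linear model: channel dropout zeroes
    # whole input channels, followed by a softmax readout over two MI classes.
    h = x * drop_mask
    logits = h @ w
    e = np.exp(logits - logits.max())
    return e / e.sum()

def mc_dropout_predict(x, w, p_drop=0.2, n_passes=50):
    # Keep dropout active at inference: average many stochastic passes;
    # the per-class standard deviation serves as a simple uncertainty proxy.
    preds = []
    for _ in range(n_passes):
        mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)  # inverted dropout
        preds.append(forward(x, w, mask))
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)

x = rng.standard_normal(8)        # 8 hypothetical EEG channels (toy features)
w = rng.standard_normal((8, 2))   # toy weights for two MI classes
mean_pred, uncertainty = mc_dropout_predict(x, w)
print(mean_pred, uncertainty)
```

Because each pass is a valid softmax prediction, the averaged prediction still sums to one, and a large per-class spread flags inputs the model is structurally uncertain about.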
19 pages, 692 KiB  
Article
The Multimodal Rehabilitation of Complex Regional Pain Syndrome and Its Contribution to the Improvement of Visual–Spatial Memory, Visual Information-Processing Speed, Mood, and Coping with Pain—A Nonrandomized Controlled Trial
by Justyna Wiśniowska, Iana Andreieva, Dominika Robak, Natalia Salata and Beata Tarnacka
Brain Sci. 2025, 15(7), 763; https://doi.org/10.3390/brainsci15070763 - 18 Jul 2025
Abstract
Objectives: To investigate whether a Multimodal Rehabilitation Program (MRP) affects the change in visual–spatial abilities, especially attention, information-processing speed, visual–spatial learning, the severity of depression, and strategies for coping with pain in Complex Regional Pain Syndrome (CRPS) participants. Methods: The study was conducted between October 2021 and February 2023, with a 4-week rehabilitation program that included individual physiotherapy, manual and physical therapy, and psychological intervention such as psychoeducation, relaxation, and Graded Motor Imagery therapy. Twenty participants with CRPS and twenty healthy participants, forming a control group, were enlisted. The study had a 2-arm parallel design: a CRPS group receiving the MRP intervention and a healthy control group matched to the CRPS group according to demographic variables. Before and after the MRP, participants in the CRPS group were assessed for visual–spatial learning, attention abilities, severity of depression, and pain-coping strategy. The healthy control group underwent the same assessment at both measurement points, without intervention. The primary outcome measure was Reproduction on Rey–Osterrieth’s Complex Figure Test assessing visual–spatial learning. Results: In the post-test compared to the pre-test, the participants with CRPS obtained significantly higher scores in visual–spatial learning (p < 0.01) and visual information-processing speed (p = 0.01). They made significantly fewer omission mistakes in visual working memory (p = 0.01). After the MRP compared to the pre-test, the CRPS participants indicated a decrease in the severity of depression (p = 0.04) and used a task-oriented strategy for coping with pain more often than before the rehabilitation program (p = 0.02).
Conclusions: After a 4-week MRP, the following outcomes were obtained: an increase in visual–spatial learning, visual information-processing speed, a decrease in severity of depression, and a change in the pain-coping strategies—which became more adaptive. Full article
(This article belongs to the Section Neurorehabilitation)
22 pages, 1342 KiB  
Article
Multi-Scale Attention-Driven Hierarchical Learning for Fine-Grained Visual Categorization
by Zhihuai Hu, Rihito Kojima and Xian-Hua Han
Electronics 2025, 14(14), 2869; https://doi.org/10.3390/electronics14142869 - 18 Jul 2025
Abstract
Fine-grained visual categorization (FGVC) presents significant challenges due to subtle inter-class variation and significant intra-class diversity, often leading to limited discriminative capacity in global representations. Existing methods inadequately capture localized, class-relevant features across multiple semantic levels, especially under complex spatial configurations. To address these challenges, we introduce a Multi-scale Attention-driven Hierarchical Learning (MAHL) framework that iteratively refines feature representations via scale-adaptive attention mechanisms. Specifically, fully connected (FC) classifiers are applied to spatially pooled feature maps at multiple network stages to capture global semantic context. The learned FC weights are then projected onto the original high-resolution feature maps to compute spatial contribution scores for the predicted class, serving as attention cues. These multi-scale attention maps guide the selection of discriminative regions, which are hierarchically integrated into successive training iterations to reinforce both global and local contextual dependencies. Moreover, we explore a generalized pooling operation that parametrically fuses average and max pooling, enabling richer contextual retention in the encoded features. Comprehensive evaluations on benchmark FGVC datasets demonstrate that MAHL consistently outperforms state-of-the-art methods, validating its efficacy in learning robust, class-discriminative, high-resolution representations through attention-guided hierarchical refinement. Full article
(This article belongs to the Special Issue Advances in Machine Learning for Image Classification)
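The generalized pooling operation mentioned above, which parametrically fuses average and max pooling, can be illustrated as follows. The mixing weight `alpha` and the `(channels, H, W)` layout are assumptions for this sketch, not the paper's exact parameterization.

```python
import numpy as np

def generalized_pool(fmap, alpha=0.5):
    """Parametric fusion of average and max pooling over the spatial axes.
    alpha = 1.0 gives pure average pooling, alpha = 0.0 pure max pooling;
    the function name and (channels, H, W) layout are illustrative choices."""
    avg = fmap.mean(axis=(1, 2))
    mx = fmap.max(axis=(1, 2))
    return alpha * avg + (1.0 - alpha) * mx

fmap = np.arange(8, dtype=float).reshape(2, 2, 2)  # two toy 2x2 feature maps
print(generalized_pool(fmap, alpha=1.0))  # plain average pooling per channel
print(generalized_pool(fmap, alpha=0.0))  # plain max pooling per channel
```

Intermediate values of `alpha` retain the smoothing of average pooling while preserving some of the peak responses that max pooling keeps, which is the richer contextual retention the abstract refers to.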

18 pages, 1814 KiB  
Article
AI-Based Damage Risk Prediction Model Development Using Urban Heat Transport Pipeline Attribute Information
by Sungyeol Lee, Jaemo Kang, Jinyoung Kim and Myeongsik Kong
Appl. Sci. 2025, 15(14), 8003; https://doi.org/10.3390/app15148003 - 18 Jul 2025
Abstract
This study analyzed the probability of damage in heat transport pipelines buried in urban areas using pipeline attribute information and damage history data and developed an AI-based predictive model. A dataset was constructed by collecting spatial and attribute data of pipelines and defining basic units according to specific standards. Damage trends were analyzed based on pipeline attributes, and correlation analysis was performed to identify influential factors. These factors were applied to three machine learning algorithms: Random Forest, eXtreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). The model with optimal performance was selected by comparing evaluation indicators including the F2-score, accuracy, and area under the curve (AUC). The LightGBM model trained on data from pipelines in use for over 20 years showed the best performance (F2-score = 0.804, AUC = 0.837). This model was used to generate a risk map visualizing the probability of pipeline damage. The map can aid in the efficient management of urban heat transport systems by enabling preemptive maintenance in high-risk areas. Incorporating external environmental data and auxiliary facility information in future models could further enhance reliability and support the development of a more effective maintenance decision-making system. Full article
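The F2-score used as an evaluation indicator above is the F-beta score with beta = 2, which weights recall more heavily than precision, appropriate when missing a damaged pipeline is costlier than a false alarm. A minimal implementation from confusion-matrix counts (the counts themselves are made up for illustration):

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta from confusion-matrix counts; beta=2 weights recall over precision.
    Equivalent to (1+b^2)*P*R / (b^2*P + R) with P, R derived from the counts."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

# Hypothetical counts: 80 true damage detections, 20 false alarms, 10 misses.
print(f_beta(80, 20, 10))            # recall-weighted F2
print(f_beta(80, 20, 10, beta=1.0))  # ordinary F1 for comparison
```

With these counts the F2-score exceeds the F1-score because recall (8/9) is higher than precision (0.8), exactly the asymmetry the metric is chosen to reward.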

21 pages, 5917 KiB  
Article
VML-UNet: Fusing Vision Mamba and Lightweight Attention Mechanism for Skin Lesion Segmentation
by Tang Tang, Haihui Wang, Qiang Rao, Ke Zuo and Wen Gan
Electronics 2025, 14(14), 2866; https://doi.org/10.3390/electronics14142866 - 17 Jul 2025
Abstract
Deep learning has advanced medical image segmentation, yet existing methods struggle with complex anatomical structures. Mainstream models, such as CNN, Transformer, and hybrid architectures, face challenges including insufficient information representation and redundant complexity, which limit their clinical deployment. Developing efficient and lightweight networks is crucial for accurate lesion localization and optimized clinical workflows. We propose the VML-UNet, a lightweight segmentation network with core innovations including the CPMamba module and the multi-scale local supervision module (MLSM). The CPMamba module integrates the visual state space (VSS) block and a channel prior attention mechanism to enable efficient modeling of spatial relationships with linear computational complexity through dynamic channel-space weight allocation, while preserving channel feature integrity. The MLSM enhances local feature perception and reduces the inference burden. Comparative experiments were conducted on three public datasets, including ISIC2017, ISIC2018, and PH2, with ablation experiments performed on ISIC2017. VML-UNet achieves 0.53 M parameters, 2.18 MB memory usage, and a computational complexity of 1.24 GFLOPs, and outperforms the comparison networks on all three datasets, validating its effectiveness. This study provides valuable references for developing lightweight, high-performance skin lesion segmentation networks, advancing the field of skin lesion segmentation. Full article
(This article belongs to the Section Bioelectronics)

24 pages, 2173 KiB  
Article
A Novel Ensemble of Deep Learning Approach for Cybersecurity Intrusion Detection with Explainable Artificial Intelligence
by Abdullah Alabdulatif
Appl. Sci. 2025, 15(14), 7984; https://doi.org/10.3390/app15147984 - 17 Jul 2025
Abstract
In today’s increasingly interconnected digital world, cyber threats have grown in frequency and sophistication, making intrusion detection systems a critical component of modern cybersecurity frameworks. Traditional IDS methods, often based on static signatures and rule-based systems, are no longer sufficient to detect and respond to complex and evolving attacks. To address these challenges, Artificial Intelligence and machine learning have emerged as powerful tools for enhancing the accuracy, adaptability, and automation of IDS solutions. This study presents a novel, hybrid ensemble learning-based intrusion detection framework that integrates deep learning and traditional ML algorithms with explainable artificial intelligence for real-time cybersecurity applications. The proposed model combines an Artificial Neural Network and Support Vector Machine as base classifiers and employs a Random Forest as a meta-classifier to fuse predictions, improving detection performance. Recursive Feature Elimination is utilized for optimal feature selection, while SHapley Additive exPlanations (SHAP) provide both global and local interpretability of the model’s decisions. The framework is deployed using a Flask-based web interface in the Amazon Elastic Compute Cloud environment, capturing live network traffic and offering sub-second inference with visual alerts. Experimental evaluations using the NSL-KDD dataset demonstrate that the ensemble model outperforms individual classifiers, achieving a high accuracy of 99.40%, along with excellent precision, recall, and F1-score metrics. This research not only enhances detection capabilities but also bridges the trust gap in AI-powered security systems through transparency. The solution shows strong potential for application in critical domains such as finance, healthcare, industrial IoT, and government networks, where real-time and interpretable threat detection is vital. Full article
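Recursive Feature Elimination, used above for optimal feature selection, repeatedly fits a model and discards the weakest-ranked feature until the desired count remains. The toy version below ranks features by least-squares weight magnitude rather than the paper's ANN/SVM-based ranking; it is a sketch of the elimination loop under that substitution, not the authors' pipeline.

```python
import numpy as np

def rfe(X, y, n_keep):
    """Toy recursive feature elimination: repeatedly drop the feature whose
    least-squares weight has the smallest magnitude. The linear ranking is a
    stand-in for a model-based importance score."""
    keep = list(range(X.shape[1]))
    while len(keep) > n_keep:
        w, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
        keep.pop(int(np.argmin(np.abs(w))))  # remove the weakest feature
    return keep

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.01 * rng.standard_normal(200)
print(rfe(X, y, 2))  # the two informative features (0 and 3) survive
```

Refitting after each removal matters: a feature's weight can change once a correlated competitor is gone, which is what distinguishes RFE from one-shot filtering.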

15 pages, 3364 KiB  
Article
Potential Benefits of Polar Transformation of Time–Frequency Electrocardiogram (ECG) Signals for Evaluation of Cardiac Arrhythmia
by Hanbit Kang, Daehyun Kwon and Yoon-Chul Kim
Appl. Sci. 2025, 15(14), 7980; https://doi.org/10.3390/app15147980 - 17 Jul 2025
Abstract
There is a lack of studies on the effectiveness of polar-transformed spectrograms in the visualization and prediction of cardiac arrhythmias from electrocardiogram (ECG) data. In this study, single-lead ECG waveforms were converted into two-dimensional rectangular time–frequency spectrograms and polar time–frequency spectrograms. Three pre-trained convolutional neural network (CNN) models (ResNet50, MobileNet, and DenseNet121) served as baseline networks for model development and testing. Prediction performance and visualization quality were evaluated across various image resolutions. The trade-offs between image resolution and model capacity were quantitatively analyzed. Polar-transformed spectrograms demonstrated superior delineation of R-R intervals at lower image resolutions (e.g., 96 × 96 pixels) compared to conventional spectrograms. For deep-learning-based classification of cardiac arrhythmias, polar-transformed spectrograms achieved comparable accuracy to conventional spectrograms across all evaluated resolutions. The results suggest that polar-transformed spectrograms are particularly advantageous for deep CNN predictions at lower resolutions, making them suitable for edge computing applications where the reduced use of computing resources, such as memory and power consumption, is desirable. Full article
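A polar time-frequency representation of the kind described can be produced by inverse mapping: for each pixel of the output disc, the angle selects a time frame and the radius selects a frequency bin. The nearest-neighbour sampling and the 96 x 96 output size below are illustrative choices (96 x 96 echoes the low-resolution setting discussed in the abstract), not the authors' exact procedure.

```python
import numpy as np

def to_polar(spec, out_size=96):
    """Map a rectangular time-frequency spectrogram (freq x time) onto a polar
    image: angle encodes time, radius encodes frequency. Simplified sketch with
    nearest-neighbour sampling."""
    n_freq, n_time = spec.shape
    c = (out_size - 1) / 2.0
    yy, xx = np.mgrid[0:out_size, 0:out_size]
    r = np.hypot(xx - c, yy - c) / c                             # 0 at center
    theta = (np.arctan2(yy - c, xx - c) + np.pi) / (2 * np.pi)   # 0..1 around
    t_idx = np.minimum((theta * n_time).astype(int), n_time - 1)
    f_idx = np.minimum((r * n_freq).astype(int), n_freq - 1)
    out = spec[f_idx, t_idx]
    out[r > 1.0] = 0.0                                           # blank corners
    return out

spec = np.random.default_rng(2).random((32, 64))  # toy spectrogram
polar = to_polar(spec)
print(polar.shape)  # (96, 96)
```

Wrapping time around the angle axis places successive beats at similar angular offsets, which is one intuition for why R-R intervals stay legible at low resolution.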

17 pages, 3612 KiB  
Article
MPVT: An Efficient Multi-Modal Prompt Vision Tracker for Visual Target Tracking
by Jianyu Xie, Yan Fu, Junlin Zhou, Tianxiang He, Xiaopeng Wang, Yuke Fang and Duanbing Chen
Appl. Sci. 2025, 15(14), 7967; https://doi.org/10.3390/app15147967 - 17 Jul 2025
Abstract
Visual target tracking is a fundamental task in computer vision. Combining multi-modal information with tracking leverages complementary information, which improves the precision and robustness of trackers. Traditional multi-modal tracking methods typically employ a full fine-tuning scheme, i.e., fine-tuning pre-trained single-modal models to multi-modal tasks. However, this approach suffers from low transfer-learning efficiency, catastrophic forgetting, and high cross-task deployment costs. To address these issues, we propose the multi-modal prompt vision tracker (MPVT), based on an efficient prompt-tuning paradigm. Three key components are involved in the model: a decoupled input enhancement module, a dynamic adaptive prompt fusion module, and a fully connected head network module. The decoupled input enhancement module enhances input representations via positional and type embedding. The dynamic adaptive prompt fusion module achieves efficient prompt tuning and multi-modal interaction using scaled convolution and low-rank cross-modal attention mechanisms. The fully connected head network module addresses the shortcomings of traditional convolutional head networks, such as their inductive biases. Experimental results from RGB-T, RGB-D, and RGB-E scenarios show that MPVT outperforms state-of-the-art methods. Moreover, MPVT reduces GPU memory usage by 43.8% and training time by 62.9% compared with a full-parameter fine-tuned model. Full article
(This article belongs to the Special Issue Advanced Technologies Applied for Object Detection and Tracking)

17 pages, 10396 KiB  
Article
Feature Selection Based on Three-Dimensional Correlation Graphs
by Adam Dudáš and Aneta Szoliková
AppliedMath 2025, 5(3), 91; https://doi.org/10.3390/appliedmath5030091 - 17 Jul 2025
Abstract
The process of feature selection is a critical component of any decision-making system incorporating machine or deep learning models applied to multidimensional data. Feature selection on input data can be performed using a variety of techniques, such as correlation-based methods, wrapper-based methods, or embedded methods. However, many conventionally used approaches do not support backwards interpretability of the selected features, making their application in real-world scenarios impractical and difficult to implement. This work addresses that limitation by proposing a novel correlation-based strategy for feature selection in regression tasks, based on a three-dimensional visualization of correlation analysis results—referred to as three-dimensional correlation graphs. The main objective of this study is the design, implementation, and experimental evaluation of this graphical model through a case study using a multidimensional dataset with 28 attributes. The experiments assess the clarity of the visualizations and their impact on regression model performance, demonstrating that the approach reduces dimensionality while maintaining or improving predictive accuracy, enhances interpretability by uncovering hidden relationships, and achieves better or comparable results to conventional feature selection methods. Full article
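A correlation-based selection strategy of the general kind discussed (though without the paper's three-dimensional graph construction) can be sketched as: rank features by absolute correlation with the target, then skip any feature that is nearly redundant with one already kept. Both thresholds below are hypothetical.

```python
import numpy as np

def correlation_select(X, y, target_min=0.3, redundancy_max=0.9):
    """Keep features well correlated with the target, dropping features that
    are nearly collinear with an already-kept one. Thresholds are illustrative."""
    selected = []
    target_corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    for j in np.argsort(target_corr)[::-1]:          # strongest first
        if target_corr[j] < target_min:
            break                                     # remaining features too weak
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) < redundancy_max
               for k in selected):
            selected.append(int(j))
    return sorted(selected)

rng = np.random.default_rng(3)
x0 = rng.standard_normal(500)
x1 = x0 + 0.01 * rng.standard_normal(500)  # near-duplicate of x0
x2 = rng.standard_normal(500)              # irrelevant feature
y = x0 + 0.1 * rng.standard_normal(500)
X = np.column_stack([x0, x1, x2])
print(correlation_select(X, y))  # keeps one of the duplicated pair only
```

The redundancy check is what a pairwise correlation graph makes visual: two features joined by a near-unit edge carry the same information, so only one needs to survive.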

49 pages, 3444 KiB  
Article
A Design-Based Research Approach to Streamline the Integration of High-Tech Assistive Technologies in Speech and Language Therapy
by Anna Lekova, Paulina Tsvetkova, Anna Andreeva, Georgi Dimitrov, Tanio Tanev, Miglena Simonska, Tsvetelin Stefanov, Vaska Stancheva-Popkostadinova, Gergana Padareva, Katia Rasheva, Adelina Kremenska and Detelina Vitanova
Technologies 2025, 13(7), 306; https://doi.org/10.3390/technologies13070306 - 16 Jul 2025
Abstract
Currently, high-tech assistive technologies (ATs), particularly Socially Assistive Robots (SARs), virtual reality (VR) and conversational AI (ConvAI), are considered very useful in supporting professionals in Speech and Language Therapy (SLT) for children with communication disorders. However, despite a positive public perception, therapists face difficulties when integrating these technologies into practice due to technical challenges and a lack of user-friendly interfaces. To address this gap, a design-based research approach has been employed to streamline the integration of SARs, VR and ConvAI in SLT, and a new software platform called “ATLog” has been developed for designing interactive and playful learning scenarios with ATs. ATLog’s main features include visual programming with a graphical interface, enabling therapists to intuitively create personalized interactive scenarios without advanced programming skills. The platform follows a subprocess-oriented design, breaking down SAR skills and VR scenarios into microskills represented by pre-programmed graphical blocks, tailored to specific treatment domains, therapy goals, and language skill levels. The ATLog platform was evaluated by 27 SLT experts using the Technology Acceptance Model (TAM) and System Usability Scale (SUS) questionnaires, extended with additional questions specifically focused on ATLog’s structure and functionalities. According to the SUS results, most of the experts (74%) evaluated ATLog with grades over 70, indicating high acceptance of its usability. Over half (52%) of the experts rated the additional questions focused on ATLog’s structure and functionalities in the A range (90–100), while 26% rated them in the B range (80–89), showing strong acceptance of the platform for creating and running personalized interactive scenarios with ATs.
According to the TAM results, experts gave high grades for both perceived usefulness (44% in the A range) and perceived ease of use (63% in the A range). Full article
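The SUS grades reported above come from the standard System Usability Scale scoring rule: ten items rated 1 to 5, odd-numbered items contributing (rating - 1) and even-numbered items (5 - rating), with the sum scaled by 2.5 to a 0-100 range:

```python
def sus_score(responses):
    """Standard SUS scoring: 10 items rated 1-5; odd-numbered items contribute
    (rating - 1), even-numbered items (5 - rating); the sum is scaled by 2.5."""
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))  # i = 0 is item 1 (odd-numbered)
    return total * 2.5

print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # best possible answers -> 100.0
print(sus_score([3] * 10))                        # all-neutral answers -> 50.0
```

The alternating polarity exists because SUS alternates positively and negatively worded statements, so agreement is not uniformly "good"; a score over 70, the threshold cited above, is conventionally read as above-average usability.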

21 pages, 15709 KiB  
Article
Preliminary Quantitative Evaluation of the Optimal Colour System for the Assessment of Peripheral Circulation from Applied Pressure Using Machine Learning
by Masanobu Tsurumoto, Takunori Shimazaki, Jaakko Hyry, Yoshifumi Kawakubo, Takeshi Yokoyama and Daisuke Anzai
Sensors 2025, 25(14), 4441; https://doi.org/10.3390/s25144441 - 16 Jul 2025
Abstract
Peripheral circulatory failure refers to a condition in which the blood flow through superficial capillaries is markedly reduced or completely occluded. In clinical practice, nurses strictly adhere to regular repositioning protocols to prevent peripheral circulatory failure, during which the skin condition is evaluated visually. In this study, skin colour changes resulting from pressure application were continuously captured using a camera, and supervised machine learning was employed to classify the data into two categories: before and after pressure. The evaluation of practical colour space components revealed that the h component of the JCh colour space demonstrated the highest discriminative performance (Area Under the Curve (AUC) = 0.88), followed by the a* component of the CIELAB colour space (AUC = 0.84) and the H component of the HSV colour space (AUC = 0.83). These findings demonstrate that it is feasible to quantitatively evaluate skin colour changes associated with pressure, suggesting that this approach can serve as a valuable indicator for dimensionality reduction in feature extraction for machine learning and is potentially an effective method for preventing pressure-induced skin injuries. Full article
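The AUC values reported for each colour component can be computed directly from the two groups of scores via the Mann-Whitney formulation: the probability that a randomly chosen "after pressure" sample scores above a "before pressure" sample, counting ties as half. The hue values below are made up for illustration.

```python
def auc(pos, neg):
    """Mann-Whitney AUC: the fraction of (positive, negative) pairs in which
    the positive sample outscores the negative one, ties counting half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical hue values after pressure (pos) vs. before pressure (neg).
print(auc([0.9, 0.8, 0.7], [0.6, 0.75, 0.2]))  # 8 of 9 pairs correctly ordered
```

This pairwise form makes clear why a single colour component with AUC 0.88 is useful for dimensionality reduction: one scalar already orders most before/after pairs correctly.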

26 pages, 7645 KiB  
Article
VMMT-Net: A Dual-Branch Parallel Network Combining Visual State Space Model and Mix Transformer for Land–Sea Segmentation of Remote Sensing Images
by Jiawei Wu, Zijian Liu, Zhipeng Zhu, Chunhui Song, Xinghui Wu and Haihua Xing
Remote Sens. 2025, 17(14), 2473; https://doi.org/10.3390/rs17142473 - 16 Jul 2025
Abstract
Land–sea segmentation is a fundamental task in remote sensing image analysis, and plays a vital role in dynamic coastline monitoring. The complex morphology and blurred boundaries of coastlines in remote sensing imagery make fast and accurate segmentation challenging. Recent deep learning approaches lack the ability to model spatial continuity effectively, thereby limiting a comprehensive understanding of coastline features in remote sensing imagery. To address this issue, we have developed VMMT-Net, a novel dual-branch semantic segmentation framework. By constructing a parallel heterogeneous dual-branch encoder, VMMT-Net integrates the complementary strengths of the Mix Transformer and the Visual State Space Model, enabling comprehensive modeling of local details, global semantics, and spatial continuity. We design a Cross-Branch Fusion Module to facilitate deep feature interaction and collaborative representation across branches, and implement a customized decoder module that enhances the integration of multiscale features and improves boundary refinement of coastlines. Extensive experiments conducted on two benchmark remote sensing datasets, GF-HNCD and BSD, demonstrate that the proposed VMMT-Net outperforms existing state-of-the-art methods in both quantitative metrics and visual quality. Specifically, the model achieves mean F1-scores of 98.48% (GF-HNCD) and 98.53% (BSD) and mean intersection-over-union values of 97.02% (GF-HNCD) and 97.11% (BSD). The model maintains reasonable computational complexity, with only 28.24 M parameters and 25.21 GFLOPs, striking a favorable balance between accuracy and efficiency. These results indicate the strong generalization ability and practical applicability of VMMT-Net in real-world remote sensing segmentation tasks. Full article
(This article belongs to the Special Issue Application of Remote Sensing in Coastline Monitoring)

25 pages, 10906 KiB  
Article
Explainable Machine Learning for Mapping Rainfall-Induced Landslide Thresholds in Italy
by Xiangyu Shao, Wenjun Yan, Chaoying Yan, Wen Zhao, Yixuan Wang, Xia Shi, Hongchang Dong, Tianjiang Li, Junpo Yu, Peng Zuo, Zeyu Zhou and Jiming Jin
Appl. Sci. 2025, 15(14), 7937; https://doi.org/10.3390/app15147937 - 16 Jul 2025
Abstract
Reliable rainfall thresholds are critical for effective early warning and mitigating the risks of rainfall-induced landslides. Traditional statistical models have limitations in multi-variable modeling, while machine learning models face interpretability challenges. Explainable machine learning methods can address these challenges, but they are rarely applied to rainfall threshold modeling. In this study, we compared the performance of an empirical statistical model and machine learning models for predicting rainfall-induced landslides in Italy. Based on the optimal model, we visualized refined rainfall thresholds at three probability levels and employed SHAP (Shapley Additive Explanations) to enhance model explainability by quantifying the contribution of each input variable to the predictions. The results demonstrated that the XGBoost model achieved a good performance (AUC = 0.917 ± 0.026) with well-balanced sensitivity (0.792 ± 0.075) and specificity (0.812 ± 0.033) in landslide susceptibility modeling. Hydrological factors, particularly total rainfall, were identified as the dominant triggering mechanisms, with SHAP analysis confirming their substantially greater contribution compared to environmental factors in rainfall threshold modeling. The developed visualized threshold maps revealed distinct spatial variations in landslide-triggering rainfall thresholds across Italy, characterized by lower thresholds in gentle slope areas with moderate annual precipitation and higher thresholds in steep slope and mid-to-low-elevation regions, while these regional differences decreased under high-probability scenarios. This study offered a modeling approach for regional rainfall threshold assessment by integrating multi-variable modeling with explainable methods, contributing to the development of landslide early warning systems. Full article
20 pages, 41202 KiB  
Article
Copper Stress Levels Classification in Oilseed Rape Using Deep Residual Networks and Hyperspectral False-Color Images
by Yifei Peng, Jun Sun, Zhentao Cai, Lei Shi, Xiaohong Wu, Chunxia Dai and Yubin Xie
Horticulturae 2025, 11(7), 840; https://doi.org/10.3390/horticulturae11070840 - 16 Jul 2025
Abstract
In recent years, heavy metal contamination in agricultural products has become a growing food safety concern. Copper (Cu) stress in crops not only significantly reduces yield and quality but also poses potential health risks to humans. This study proposes an efficient, precise, and non-destructive detection method for Cu stress in oilseed rape based on hyperspectral false-color images constructed with principal component analysis (PCA). By comprehensively capturing the spectral representation of oilseed rape plants, both one-dimensional (1D) spectral sequences and spatial image data were used for multi-class classification. For the 1D spectral sequences, classification performance was compared from two perspectives: machine learning versus deep learning methods (best accuracy: 93.49% vs. 96.69%), and shallow versus deep convolutional neural networks (CNNs) (best accuracy: 95.15% vs. 96.69%). For the spatial image data, deep residual networks were employed to evaluate the effectiveness of visible-light and false-color images. The RegNet architecture was chosen for its flexible parameterization and proven effectiveness in extracting multi-scale features from hyperspectral false-color images. RegNetX-6.4GF achieved the best performance on the dataset constructed from three types of false-color images, reaching a Macro-Precision, Macro-Recall, Macro-F1, and Accuracy of 98.17%, 98.15%, 98.15%, and 98.15%, respectively. Furthermore, Grad-CAM visualizations revealed that latent physiological changes in plants under heavy metal stress guided feature learning within the CNNs, demonstrating the effectiveness of false-color image construction for extracting discriminative features.
Overall, the proposed technique can be integrated into portable hyperspectral imaging devices, enabling real-time, non-destructive detection of heavy metal stress in modern agricultural practice. Full article
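The false-color construction described above projects each pixel's spectrum onto the leading principal components. A minimal sketch of the idea (assuming NumPy and a synthetic cube, not the paper's pipeline or data):

```python
import numpy as np

def pca_false_color(cube):
    """Project a hyperspectral cube (H, W, bands) onto its top-3
    principal components and rescale each channel to [0, 1],
    yielding an RGB-like false-color image."""
    h, w, b = cube.shape
    X = cube.reshape(-1, b).astype(np.float64)
    X -= X.mean(axis=0)                              # center each band
    cov = np.cov(X, rowvar=False)                    # band covariance
    vals, vecs = np.linalg.eigh(cov)
    top3 = vecs[:, np.argsort(vals)[::-1][:3]]       # 3 leading components
    scores = X @ top3                                # (H*W, 3) PC scores
    scores -= scores.min(axis=0)                     # min-max per channel
    span = scores.max(axis=0)
    scores /= np.where(span == 0, 1, span)
    return scores.reshape(h, w, 3)

rng = np.random.default_rng(0)
img = pca_false_color(rng.random((8, 8, 32)))
print(img.shape)  # (8, 8, 3)
```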
21 pages, 31171 KiB  
Article
Local Information-Driven Hierarchical Fusion of SAR and Visible Images via Refined Modal Salient Features
by Yunzhong Yan, La Jiang, Jun Li, Shuowei Liu and Zhen Liu
Remote Sens. 2025, 17(14), 2466; https://doi.org/10.3390/rs17142466 - 16 Jul 2025
Abstract
Compared to other multi-source image fusion tasks, visible and SAR image fusion suffers from a shortage of training data for deep learning-based methods. Introducing structural priors into the fusion network design is a viable solution. We adopted the feature hierarchy concept from computer vision, dividing deep features into low-, mid-, and high-level tiers, and, based on the complementary modal characteristics of SAR and visible imagery, designed a fusion architecture that fully analyzes and exploits the differences among hierarchical features. Our framework has two stages. In the cross-modal enhancement stage, a CycleGAN generator-based method for cross-modal interaction and input-data enhancement generates pseudo-modal images. The fusion stage contains three innovations: (1) We designed distinct feature extraction branches and fusion strategies for each tier, matched to that tier's characteristics and to the complementary modal features of SAR and visible images, to fully exploit cross-modal complementary information. (2) We proposed the Layered Strictly Nested Framework (LSNF), which emphasizes hierarchical differences, to reduce feature redundancy. (3) Drawing on visual saliency theory, we proposed a Gradient-Weighted Pixel Loss (GWPL) that dynamically assigns higher weights to regions with large gradient magnitudes, emphasizing high-frequency detail preservation during fusion. Experiments on the YYX-OPT-SAR and WHU-OPT-SAR datasets show that our method outperforms 11 state-of-the-art methods, and ablation studies confirm each component's contribution. The framework effectively meets the high-precision image fusion needs of remote sensing applications. Full article
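The GWPL idea, weighting per-pixel error by local gradient magnitude so edges dominate the loss, can be sketched as follows (a NumPy illustration under assumed details; the paper's exact weighting scheme may differ):

```python
import numpy as np

def gradient_weighted_pixel_loss(fused, target, eps=1e-8):
    """L1 pixel loss weighted by the target's gradient magnitude,
    so high-frequency (edge) regions contribute more to the loss."""
    gy, gx = np.gradient(target.astype(np.float64))
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)
    weights = grad_mag / (grad_mag.sum() + eps)  # weights sum to ~1
    return float(np.sum(weights * np.abs(fused - target)))

# a vertical step edge: gradients vanish in flat areas, peak near the edge
target = np.zeros((4, 4))
target[:, 2:] = 1.0

flat_err = target.copy(); flat_err[0, 0] += 0.5  # error in a flat region
edge_err = target.copy(); edge_err[0, 1] += 0.5  # error next to the edge

loss_flat = gradient_weighted_pixel_loss(flat_err, target)
loss_edge = gradient_weighted_pixel_loss(edge_err, target)
```

Because the flat-region pixel has zero gradient weight, its error is ignored, while the same error near the edge is penalized.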