Search Results (2,018)

Search Parameters:
Keywords = deep multi-task learning

17 pages, 17693 KB  
Article
High-Resolution Mapping of Eucalyptus Plantations for Municipal Forest Governance: A Task-Specific Deep Learning Approach in Nanning, China
by Boyuan Zhuang and Qingling Zhang
Forests 2026, 17(4), 461; https://doi.org/10.3390/f17040461 - 9 Apr 2026
Abstract
Eucalyptus plantations are expanding rapidly in southern China, delivering economic benefits but also posing ecological risks, which creates a pressing need for precise, municipal-scale monitoring. Mapping eucalyptus with sub-meter resolution imagery, however, is confronted by two main challenges: (1) the pronounced multi-scale heterogeneity of fragmented stands, and (2) the difficulty in achieving precise boundary delineation due to shadowed and complex canopy edges. To address these, this study makes two primary contributions. First, we present the Eucalyptus Semantic Segmentation Dataset (ESSD)—a high-quality, pixel-level annotated dataset that includes geographic coordinates to support reproducible research. Second, we propose SDCNet, a task-specific deep learning network optimized for eucalyptus mapping. SDCNet incorporates a redesigned SD-ASPP module that leverages Deep Over-parameterized Convolution (DO-Conv) to capture multi-scale features, alongside a novel Coordinated Self-Attention Mechanism (CSAM) to enhance the accuracy of canopy boundary detection. Ablation studies confirm the effectiveness of each component. In benchmark tests against seven state-of-the-art semantic segmentation models, SDCNet achieves superior performance, obtaining a per-class Intersection over Union (IoU) of 88.83% and an F1-score of 93.81% for eucalyptus—an improvement of +2.24% in IoU and +1.71% in F1-score over the strongest baseline. Applied to Nanning City, SDCNet produces the first 0.3 m resolution eucalyptus distribution map for the region. This map reveals a critical finding: within the watershed of the Xiyunjiang Reservoir—Nanning’s primary drinking water source—eucalyptus plantations cover more than 50% of the forested area. This result provides the first quantitative, high-resolution evidence of potential hydrological risk at a municipal scale. 
Our work establishes an integrated framework that bridges advanced remote sensing with actionable forest governance, offering scientifically grounded support for ecological risk assessment and sustainable land-use policy. Full article
36 pages, 7325 KB  
Article
Intelligent Scheduling of Rail-Guided Shuttle Cars via Deep Reinforcement Learning Integrating Dynamic Graph Neural Networks and Transformer Model
by Fang Zhu and Shanshan Peng
Algorithms 2026, 19(4), 289; https://doi.org/10.3390/a19040289 - 8 Apr 2026
Abstract
With the rapid development of e-commerce and smart manufacturing, automated warehouse systems have become critical infrastructure for modern logistics. In China’s vast market, the dynamic scheduling of Rail-Guided Vehicles (RGVs) faces significant challenges due to complex task uncertainties, hierarchical supply chain structures, and real-time collision avoidance requirements. Traditional rule-based methods and static optimization models often fail to adapt to such dynamic environments. To address these issues, this paper proposes a novel hybrid deep reinforcement learning framework integrating a Dynamic Graph Neural Network (DGNN) and a Transformer model. The DGNN captures the spatiotemporal dependencies of the warehouse network topology, while the Transformer mechanism enhances long-range feature extraction for task prioritization. Furthermore, we design a centralized Deep Q-network (DQN) framework with parameterized action spaces to coordinate multiple RGVs collaboratively. While the system manages multiple physical vehicles, the learning architecture employs a single-agent global scheduler to avoid the non-stationarity issues inherent in multi-agent reinforcement learning. Experimental results based on real-world data from a large-scale electronics manufacturing warehouse demonstrate that our method reduces average task completion time by 18.5% and improves system throughput by 22.3% compared to state-of-the-art baselines. The proposed approach demonstrates potential for intelligent warehouse management in dynamic industrial scenarios. Full article
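The centralized DQN scheduler described above ultimately rests on the standard temporal-difference target. A minimal, hypothetical sketch (not the paper's network or parameterized action space, just the bootstrapped target a replay-batch update regresses toward; all values are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy replay batch: 4 transitions, 6 discrete scheduling actions.
gamma = 0.99
q_next = rng.normal(size=(4, 6))              # Q_target(s', a') estimates
rewards = np.array([1.0, 0.0, -0.5, 2.0])     # per-transition rewards
done = np.array([False, False, True, False])  # terminal states bootstrap nothing

# y = r + gamma * max_a' Q_target(s', a'), masked at terminal transitions.
targets = rewards + gamma * q_next.max(axis=1) * (~done)
```

The online Q-network would then be trained to minimize the squared error between its Q(s, a) outputs and these targets.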
26 pages, 7110 KB  
Article
Research on an Automatic Detection Method for Response Keypoints of Three-Dimensional Targets in Directional Borehole Radar Profiles
by Xiaosong Tang, Maoxuan Xu, Feng Yang, Jialin Liu, Suping Peng and Xu Qiao
Remote Sens. 2026, 18(7), 1102; https://doi.org/10.3390/rs18071102 - 7 Apr 2026
Abstract
During the interpretation of Borehole Radar (BHR) B-scan profiles, the accurate determination of the azimuth of geological targets in three-dimensional space is a critical issue for achieving precise anomaly localization and spatial structure inversion. However, existing directional BHR anomaly localization methods exhibit limited intelligence, insufficient adaptability to multi-site data, and weak generalization capability, rendering them inadequate for engineering applications under complex geological conditions. To address these challenges, a robust deep learning model, termed BSS-Pose-BHR, is developed based on YOLOv11n-pose for keypoint detection in directional BHR profiles. The model incorporates three key optimizations: Bi-Level Routing Attention (BRA) replaces Multi-Head Self-Attention (MHSA) in the backbone to improve computational efficiency; Conv_SAMWS enhances keypoint-related feature weighting in the backbone and neck; and Spatial and Channel Reconstruction Convolution (SCConv) is integrated into the detection head to reduce redundancy and strengthen local feature extraction, thereby improving suitability for keypoint detection tasks. In addition, a three-dimensional electromagnetic model of limestone containing a certain density of clay particles is established to construct a simulation dataset. On the simulated test set, compared with current mainstream deep learning approaches and conventional directional borehole radar anomaly localization algorithms, BSS-Pose-BHR achieves superior performance, with an mAP50(B) of 0.9686, an mAP50–95(B) of 0.7712, an mAP50(P) of 0.9951, and an mAP50–95(P) of 0.9952. Ablation experiments demonstrate that each proposed module contributes significantly to performance improvement. Compared with the baseline, BSS-Pose-BHR improves mAP50(B) by 5.39% and mAP50(P) by 0.86%, while increasing model weight by only 1.05 MB, thereby achieving a reasonable trade-off between detection accuracy and complexity. 
Furthermore, indoor physical model experiments validate the effectiveness of the method on measured data. Robustness experiments under different Peak Signal-to-Noise Ratio (PSNR) conditions and varying missing-trace rates indicate that BSS-Pose-BHR maintains high detection accuracy under moderate noise and data loss, demonstrating strong engineering applicability and practical value. Full article
28 pages, 3267 KB  
Article
A Hierarchical Dynamic Path Planning Framework for Autonomous Vehicles Based on Physics-Informed Potential Field and TD3 Reinforcement Learning
by Yan Pan, Yu Wang and Bin Ran
Appl. Sci. 2026, 16(7), 3610; https://doi.org/10.3390/app16073610 - 7 Apr 2026
Abstract
Autonomous driving in dense traffic demands policies that ensure safety, accurate path tracking, and ride comfort, yet reinforcement learning (RL) alone suffers from low sample efficiency and weak safety guarantees, while classical artificial potential field (APF) methods lack adaptability to dynamic scenarios. This paper proposes PIPF-TD3, which integrates APF theory with the Twin Delayed Deep Deterministic Policy Gradient (TD3) by embedding composite potential values and Doppler-weighted gradients as physics-informed features into the state vector. A Hybrid A* planner generates a reference path encoded as an attractive field; repulsive fields model nearby obstacles using real-time perception data; and a multi-objective reward function jointly optimizes path tracking, collision avoidance, and ride comfort. Experiments in CARLA 0.9.14 across two scenarios—a highway segment with mixed obstacles and a signalized intersection with conflicting turning movements—show that PIPF-TD3 achieves 100% task completion with zero collisions, whereas TD3 without potential field guidance suffers a 90% collision rate. PIPF-TD3 reduces mean cross-track error to 0.12 m (72.1% reduction over the rule-based FSM baseline), maintains 67.0% larger safety clearance, and yields RMS longitudinal and lateral accelerations of 1.12 and 0.75 m/s², outperforming the FSM by 37.1% and 42.7%. These results confirm that Doppler-weighted physical priors substantially enhance RL-based driving safety and quality in complex traffic conditions. Full article
(This article belongs to the Section Transportation and Future Mobility)
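The attractive/repulsive decomposition named in this abstract follows the classical APF formulation. A minimal sketch of the standard potentials (textbook form, not the paper's composite Doppler-weighted field; the gains k, eta and influence radius d0 are illustrative):

```python
import numpy as np

def attractive(pos, goal, k=1.0):
    """Quadratic attractive potential pulling toward the reference path/goal."""
    return 0.5 * k * np.sum((pos - goal) ** 2)

def repulsive(pos, obstacle, eta=1.0, d0=2.0):
    """Repulsive potential, active only within the influence radius d0."""
    d = np.linalg.norm(pos - obstacle)
    if d >= d0:
        return 0.0
    return 0.5 * eta * (1.0 / d - 1.0 / d0) ** 2

# Composite potential at one ego position: goal pull plus obstacle push.
pos = np.array([0.0, 0.0])
u = attractive(pos, np.array([3.0, 4.0])) + repulsive(pos, np.array([1.0, 0.0]))
```

In an APF-guided state vector, such potential values (and their gradients) would be appended to the RL observation rather than used directly for control.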
25 pages, 1501 KB  
Article
MA-JTATO: Multi-Agent Joint Task Association and Trajectory Optimization in UAV-Assisted Edge Computing System
by Yunxi Zhang and Zhigang Wen
Drones 2026, 10(4), 267; https://doi.org/10.3390/drones10040267 - 7 Apr 2026
Abstract
With the rapid development of applications such as smart cities and the industrial internet, the computation-intensive tasks generated by massive sensing devices pose significant challenges to traditional cloud computing paradigms. Unmanned aerial vehicle (UAV)-assisted edge computing systems, leveraging their high mobility and wide-area coverage capabilities, offer an innovative architecture for low-latency and highly reliable edge services. However, the practical deployment of such systems faces a highly complex multi-objective optimization problem characterized by the tight coupling of task offloading decisions, UAV trajectory planning, and edge server resource allocation. Conventional optimization methods struggle to adapt to the dynamic and high-dimensional characteristics of this problem, leading to suboptimal system performance. To address this critical challenge, this paper constructs an intelligent collaborative optimization framework for UAV-assisted edge computing systems and formulates the system quality of service (QoS) optimization problem as a mixed-integer non-convex programming problem with the dual objectives of minimizing task processing latency and reducing overall system energy consumption. A multi-agent joint task association and trajectory optimization (MA-JTATO) algorithm based on hybrid reinforcement learning is proposed to solve this intractable problem, which innovatively decouples the original coupled optimization problem into three interrelated subproblems and realizes their collaborative and efficient solution.
Specifically, the Advantage Actor-Critic (A2C) algorithm is adopted to realize dynamic and optimal task association between UAVs and edge servers for discrete decision-making requirements; the multi-agent deep deterministic policy gradient (MADDPG) method is employed to achieve cooperative and energy-efficient trajectory planning for multiple UAVs to meet the needs of continuous control in dynamic environments; and convex optimization theory is applied to obtain a closed-form optimal solution for the efficient allocation of computational resources on edge servers. Simulation results demonstrate that the proposed MA-JTATO algorithm significantly outperforms traditional baseline algorithms in enhancing overall QoS, effectively validating the framework’s superior performance and robustness in dynamic and complex scenarios. Full article
(This article belongs to the Section Drone Communications)
30 pages, 1921 KB  
Article
TinyML for Sustainable Edge Intelligence: Practical Optimization Under Extreme Resource Constraints
by Mohamed Echchidmi and Anas Bouayad
Technologies 2026, 14(4), 215; https://doi.org/10.3390/technologies14040215 - 7 Apr 2026
Abstract
Deep learning has emerged as an effective tool for automatic waste classification, supporting cleaner cities and more sustainable recycling systems. Because environmental protection is central to the United Nations Sustainable Development Goals (SDGs), improving the sorting and processing of everyday waste is a practical step toward this broader objective. In many real-world settings, however, waste is still sorted manually, which is slow, labor-intensive, and prone to human error. Although convolutional neural networks (CNNs) can automate this task with high accuracy, many state-of-the-art models remain too large and computationally demanding for low-cost edge devices intended for deployment in homes, schools, and small recycling facilities. In this work, we investigate lightweight waste-classification models suitable for TinyML deployment while preserving competitive accuracy. We first benchmark multiple CNN architectures to establish a strong baseline, then apply complementary compression strategies including quantization, pruning, singular value decomposition (SVD) low-rank approximation, and knowledge distillation. In addition, we evaluate an RL-guided multi-teacher selection benchmark that adaptively chooses one teacher per minibatch during distillation to improve student training stability, achieving up to 85% accuracy with only 0.496 M parameters (FP32 ≈ 1.89 MB; INT8 ≈ 0.47 MB). Across all experiments, the best accuracy–size trade-off is obtained by combining knowledge distillation with post-training quantization, reducing the model footprint from approximately 16 MB to 281 KB while maintaining 82% accuracy. The resulting model is feasible for deployment on mobile applications and resource-constrained embedded devices based on model size and TensorFlow Lite Micro compatibility. Full article
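Post-training INT8 quantization of the kind combined with distillation here can be sketched with symmetric per-tensor arithmetic (a generic illustration of the technique, not the TensorFlow Lite implementation the study uses):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor post-training quantization: map float32
    weights onto int8 using a single scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference-time arithmetic."""
    return q.astype(np.float32) * scale

w = np.array([-1.0, -0.5, 0.0, 0.25, 1.27], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

Each weight shrinks from 4 bytes to 1, which is the mechanism behind footprint reductions like the ~16 MB to 281 KB figure reported above (together with distillation into a smaller student).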
21 pages, 1876 KB  
Review
Artificial Intelligence in MRI-Based Glioma Imaging: From Radiomics-Based Machine Learning to Deep Learning Approaches
by Ammar Saloum, Israa Zaher, Christian Stipho, Enes Demir, Varun Naravetla, Mehrdad Pahlevani, Nasser Yaghi and Michael Karsy
BioMedInformatics 2026, 6(2), 20; https://doi.org/10.3390/biomedinformatics6020020 - 7 Apr 2026
Abstract
Gliomas are generally readily detected and broadly characterized using conventional MRI; however, substantial challenges remain in accurately delineating tumor extent, grading heterogeneous disease, and translating imaging findings into consistent, reproducible clinical decisions. Despite reported Dice coefficients of 0.85–0.91 for whole-tumor segmentation and classification AUC values exceeding 0.90 for glioma grading in curated datasets, most AI systems remain limited by validation design, dataset bias, and inadequate external generalizability. This narrative review synthesizes current AI applications for MRI-based glioma detection and segmentation, highlighting the evolution from radiomics-based classical machine learning approaches relying on handcrafted features to deep learning models capable of end-to-end representation learning. Commonly used MRI sequences, algorithmic paradigms, and reported performance trends are reviewed, with particular emphasis on tumor segmentation as a foundational enabling task. Key limitations that hinder clinical translation are examined, including limited dataset diversity, validation practices that inflate reported performance, domain shift across institutions, acquisition-related bias, and inadequate model interpretability. Emerging strategies to address these challenges, such as multi-institutional training, harmonization techniques, explainable AI frameworks, and workflow-integrated validation, are also discussed. While AI-based models demonstrate strong technical performance in research settings, their clinical impact will depend on rigorous external validation, transparency, and alignment with real-world neuro-oncology workflows. Full article
10 pages, 512 KB  
Proceeding Paper
Multitask Deep Neural Network for IMU Calibration, Denoising, and Dynamic Noise Adaption for Vehicle Navigation
by Frieder Schmid and Jan Fischer
Eng. Proc. 2026, 126(1), 44; https://doi.org/10.3390/engproc2026126044 - 7 Apr 2026
Abstract
In intelligent vehicle navigation, efficient sensor data processing and accurate system stabilization are critical to maintain robust performance, especially when GNSS signals are unavailable or unreliable. Classical calibration methods for Inertial Measurement Units (IMUs), such as discrete and system-level calibration, fail to capture time-varying, non-linear, and non-Gaussian noise characteristics. Likewise, Kalman filters typically assume static measurement noise levels for non-holonomic constraints (NHCs), resulting in suboptimal performance in dynamic environments. Furthermore, zero-velocity detection plays a vital role in preventing error accumulation by enabling reliable zero-velocity updates during motion stops, but classical thresholding approaches often lack robustness and precision. To address these limitations, we propose a novel multitask deep neural network (MTDNN) architecture that jointly learns IMU calibration, adaptive noise level estimation for NHC, and zero-velocity detection solely from raw IMU data. This shared-encoder design is utilized to minimize computational overhead, enabling real-time deployment on resource-constrained platforms such as Raspberry Pi. The model is trained using post-processed GNSS-RTK ground truth trajectories obtained from both a proprietary dataset and the publicly available 4Seasons dataset. Experimental results confirm the proposed system’s superior accuracy, efficiency, and real-time capability in GNSS-denied conditions. Full article
(This article belongs to the Proceedings of European Navigation Conference 2025)
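The shared-encoder multitask layout described above (one feature extractor feeding calibration, noise-level, and zero-velocity heads) can be sketched as a toy forward pass; all weights and dimensions are random stand-ins, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda x: np.maximum(x, 0.0)

# Batch of 2 raw IMU samples (3 accel + 3 gyro channels, illustrative).
x = rng.normal(size=(2, 6))
W_enc = rng.normal(size=(6, 16)) * 0.1
h = relu(x @ W_enc)  # shared representation, computed once per step

# Three lightweight task heads reuse the same features:
calib = h @ (rng.normal(size=(16, 6)) * 0.1)                       # calibrated IMU output
noise = np.exp(h @ (rng.normal(size=(16, 1)) * 0.1))               # NHC noise level (>0 via exp)
zv = 1.0 / (1.0 + np.exp(-(h @ (rng.normal(size=(16, 1)) * 0.1)))) # zero-velocity probability
```

Sharing the encoder is what keeps the per-step cost low enough for embedded targets: the expensive feature extraction runs once, and each head adds only a small linear layer.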
34 pages, 56063 KB  
Article
Deep Learning-Based Intelligent Analysis of Rock Thin Sections: From Cross-Scale Lithology Classification to Grain Segmentation for Quantitative Fabric Characterization
by Wenhao Yang, Ang Li, Liyan Zhang and Xiaoyao Qin
Electronics 2026, 15(7), 1509; https://doi.org/10.3390/electronics15071509 - 3 Apr 2026
Abstract
Quantitative microstructure evaluation of sedimentary rock thin sections is essential for revealing reservoir flow mechanisms and assessing reservoir quality. However, traditional manual identification is inefficient and prone to subjectivity. Although current deep learning approaches have improved efficiency, most remain confined to single tasks and lack a pathway to translate image recognition into quantifiable geological parameters. Moreover, these methods struggle with cross-scale feature extraction and accurate grain boundary localization in complex textures. To overcome these limitations, this study proposes a three-stage automated analysis framework integrating intelligent lithology identification, sandstone grain segmentation, and quantitative analysis of fabric parameters. To address scale discrepancies in lithology discrimination, Rock-PLionNet integrates a Partial-to-Whole Context Fusion (PWC-Fusion) module and the Lion optimizer, which mitigates cross-scale feature inconsistencies and enables accurate screening of target sandstone samples. Subsequently, to correct boundary deviations caused by low contrast and grain adhesion, the PetroSAM-CRF strategy integrates polarization-aware enhancement with dense conditional random field (DenseCRF)-based probabilistic refinement to extract precise grain contours. Based on these outputs, the framework automatically calculates key fabric parameters, including grain size and roundness. Experiments on 3290 original multi-source thin-section images show that Rock-PLionNet achieves a classification accuracy of 96.57% on the test set. Furthermore, PetroSAM-CRF reduces segmentation bias observed in general-purpose models under complex texture conditions, enabling accurate parameter estimation with a roundness error of 2.83%. 
Overall, this study presents an intelligent workflow linking microscopic image recognition with quantitative analysis of geological fabric parameters, providing a practical pathway for digital petrographic evaluation in hydrocarbon exploration. Full article
23 pages, 1312 KB  
Article
From Text to Structure: Precise Cognitive Diagnosis via Semantic Enhancement and Dynamic Q-Matrix Calibration
by Jingxing Fan, Zhichang Zhang and Yuming Du
Appl. Sci. 2026, 16(7), 3477; https://doi.org/10.3390/app16073477 - 2 Apr 2026
Abstract
Traditional cognitive diagnosis models typically rely on expert-annotated Q-matrices to define the relationship between exercises and knowledge concepts. This process is not only highly subjective and costly, but also prone to introducing noise and bias, which directly affects diagnostic accuracy. Meanwhile, most existing deep learning-based methods overlook the rich semantic information contained in concept descriptions, making it difficult to deeply model the intrinsic relationships among knowledge points, resulting in limited interpretability of the models. To address these issues, this paper proposes a cognitive diagnosis model that incorporates key textual information from concept descriptions to refine the Q-matrix (KECQCD). The core innovation of the model lies in leveraging the pre-trained language model RoBERTa to encode concept texts, fusing semantic features with identifier embeddings through a gating mechanism to construct semantically-enhanced concept representations. It designs a concept-exercise heterogeneous information network and employs a graph attention mechanism to adaptively aggregate node features, explicitly modeling high-order knowledge dependencies. Furthermore, a multi-task joint learning framework is established to predict student performance while dynamically correcting association errors in the initial Q-matrix. Experimental results on the public Junyi dataset show that the KECQCD model significantly outperforms mainstream baseline models across multiple metrics, including accuracy (ACC), area under the curve (AUC), and root mean square error (RMSE). Ablation studies confirm the effectiveness of each core module, and diagnostic consistency (DOA) evaluation further demonstrates the enhanced interpretability of the model’s outcomes. This research offers a new solution for building accurate, reliable, and interpretable cognitive diagnosis systems, contributing positively to the advancement of personalized intelligent education. 
Full article
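The gating mechanism that fuses semantic features with identifier embeddings is, in its usual form, an element-wise sigmoid gate over the concatenated inputs. A toy sketch under that assumption (random stand-in vectors and weights, not the paper's RoBERTa features or learned parameters):

```python
import numpy as np

rng = np.random.default_rng(42)
d = 8
sem = rng.normal(size=d)   # stand-in for a text-derived concept embedding
ide = rng.normal(size=d)   # stand-in for a learned identifier embedding

# Gate g in (0, 1) decides, per dimension, how much semantic signal to keep:
#   fused = g * sem + (1 - g) * ide,  with g = sigmoid(W [sem; ide] + b)
W = rng.normal(size=(d, 2 * d)) * 0.1
b = np.zeros(d)
g = 1.0 / (1.0 + np.exp(-(W @ np.concatenate([sem, ide]) + b)))
fused = g * sem + (1.0 - g) * ide
```

Because the output is a per-dimension convex combination, the fused representation always lies between the two inputs, which makes the gate's behavior easy to inspect.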
27 pages, 8750 KB  
Article
Uncertainty-Aware Prediction of Unconfined Compressive Strength and Fracture Anisotropy in Deep Shales: A Leakage-Free Physics-Constrained Machine Learning Framework
by Yicheng Song and Xinpu Shen
Appl. Sci. 2026, 16(7), 3471; https://doi.org/10.3390/app16073471 - 2 Apr 2026
Abstract
The continuous prediction and uncertainty quantification of unconfined compressive strength (UCS) and the fracture-related index of anisotropy (FRIA) are essential for optimizing drilling operations and hydraulic fracturing design in shale gas development. However, machine-learning-based log inversion often suffers from (1) spatial information leakage caused by autocorrelation in well logs, (2) implicit target contamination during multi-source data fusion, and (3) biased evaluation under random data splitting, which can overestimate apparent performance and underestimate extrapolation risk in deep heterogeneous intervals. To address these limitations, we propose a leakage-free, physics-constrained framework for predicting UCS and FRIA in the Weiyuan shale gas reservoir. Using 18,440 quality-controlled, depth-aligned samples, we adopt a contiguous depth-based split that preserves stratigraphic continuity while isolating training, validation, and test intervals to block spatial leakage. Under a strict leakage-free protocol, we evaluate single-task ensemble trees (STL-RF/HGB), a multi-task neural network (MTL-MLP), and a physics-informed variant (PINN-MLP) for deep-interval stabilization. The best model is target-dependent: STL-RF achieves R2 = 0.984 for FRIA, whereas MTL-MLP attains R2 = 0.874 for UCS. For deep formations (>4800 m), PINN-MLP with a depth-continuity constraint reduces deep-interval prediction error by 47.5%. Multi-seed experiments with 95% Student’s t confidence intervals further confirm robustness. Overall, the framework provides a reproducible workflow for continuous geomechanical-parameter prediction and risk-aware deployment in deep unconventional reservoirs. Full article
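The contiguous depth-based split that blocks spatial leakage can be illustrated directly (hypothetical helper and split fractions; the paper's exact interval boundaries are not given here):

```python
def contiguous_depth_split(samples, train_frac=0.7, val_frac=0.15):
    """Split depth-ordered well-log samples into contiguous train/val/test
    intervals, so no held-out sample sits between training depths (which
    would leak spatially autocorrelated log information)."""
    samples = sorted(samples, key=lambda s: s["depth"])
    n = len(samples)
    i = int(n * train_frac)
    j = int(n * (train_frac + val_frac))
    return samples[:i], samples[i:j], samples[j:]

# Synthetic depth-aligned log samples, 0.5 m spacing from 4000 m down.
logs = [{"depth": 4000 + 0.5 * k} for k in range(1000)]
train, val, test = contiguous_depth_split(logs)
```

The key property, unlike a random split, is that the intervals do not interleave: every training depth is shallower than every validation depth, which in turn is shallower than every test depth.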
38 pages, 1145 KB  
Article
Transfer Learning Strategies for Comic Character Recognition in Low-Data Regimes: A Comparative Study
by Marco Parrillo, Luigi Laura and Alessandro Manna
Future Internet 2026, 18(4), 192; https://doi.org/10.3390/fi18040192 - 2 Apr 2026
Abstract
Image classification in low-data regimes remains a challenging problem, particularly in stylized visual domains where intra-class similarity and inter-class feature overlap limit discriminative capacity. This study presents a systematic evaluation of regularization and transfer learning strategies for multi-class comic character recognition under constrained data conditions. Four convolutional architectures are compared: (i) a baseline CNN trained from scratch, (ii) a regularized CNN incorporating data augmentation, dropout, and early stopping, (iii) a pretrained ResNet-50 used as a fixed feature extractor, and (iv) a partially fine-tuned ResNet-50 with selective layer unfreezing. Experiments are conducted on a custom four-class dataset exhibiting moderate class imbalance, evaluated using both a fixed 70/20/10 split and 5-fold cross-validation to assess generalization stability. Results indicate that shallow CNN architectures suffer from substantial overfitting, even when regularization is applied, whereas transfer learning significantly improves macro-averaged F1-score and out-of-distribution detection performance. Cross-validated results, the primary basis for inference given the dataset scale, show that both ResNet-50 strategies achieve equivalent mean accuracy of 95.0% (SD: ±0.4% for feature extraction, ±0.8% for fine-tuning; paired t = 0.00, p = 1.000), while shallow CNN architectures reach only 81–87%. Under a single fixed 70/20/10 partition (n = 69 test samples, 95% CI: ±9–12%), fine-tuning nominally reaches 98.5%; crucially, cross-validation deflates this figure to parity with feature extraction, confirming it reflects favorable partitioning rather than genuine architectural superiority. The primary finding is therefore that frozen ResNet-50 feature extraction is the recommended strategy: it matches fine-tuning in cross-validated generalization while requiring 15× fewer trainable parameters and exhibiting lower fold-to-fold variance. 
The findings demonstrate that pretrained deep residual representations transfer effectively to stylized comic imagery and that evaluation protocol selection critically impacts perceived performance in small datasets. These results provide practical guidelines for robust model selection in domain-specific, limited-data image classification tasks.
(This article belongs to the Special Issue Innovations in Artificial Intelligence and Neural Networks)
43 pages, 1754 KB  
Systematic Review
Potential Clinical Applicability of Deep Learning in the Diagnosis of Major Depressive Disorder Using rs-fMRI: A Systematic Literature Review
by Maryam Saeedi, Lan Wei, Mercy Edoho and Catherine Mooney
Appl. Sci. 2026, 16(7), 3444; https://doi.org/10.3390/app16073444 - 1 Apr 2026
Abstract
Background: Major Depressive Disorder (MDD) is one of the leading causes of disability worldwide. Deep learning methods have been widely used for MDD detection, with research suggesting that deep models outperform traditional machine learning techniques. However, detecting MDD remains challenging due to data heterogeneity, model complexity, and the need for discriminative feature representations. Objective: This review outlines recent progress in deep learning methods for MDD detection from resting-state fMRI (rs-fMRI), focusing on model generalisability and on the features that most effectively represent brain function and anatomy, thereby contributing to biomarker identification and interpretability. It also assesses the applicability of current models to real-world challenges. Methods: This systematic review followed the PRISMA guidelines. Eligible studies involved clinically diagnosed MDD subjects, a control group, and deep learning methods for classification tasks. Results: The cerebellum, thalamus, amygdala, insula, and default mode network are the most frequently reported brain regions associated with depression. Although deep learning has shown impressive results, it is limited by its reliance on labelled data, the heterogeneity of data from different hospitals, and poor model interpretability. Most studies lacked external validation, relied on single-site or regionally homogeneous datasets, and did not consider the temporal and dynamic nature of rs-fMRI data. Conclusion: Deep learning offers considerable potential for advancing MDD diagnosis and understanding its mechanisms. Multi-regional data collection, harmonisation techniques, and rigorous testing in real-world workflows should be the primary focus of future research.
13 pages, 3260 KB  
Article
Efficient Deep Image Prior with Spatial-Channel Attention Transformer
by Weiwei Lin, Zeqing Zhang, Jin Lin and Ying You
Mathematics 2026, 14(7), 1185; https://doi.org/10.3390/math14071185 - 1 Apr 2026
Abstract
The deep image prior (DIP) suggests that a randomly initialized network with a suitable architecture can solve inverse imaging problems simply by optimizing its parameters to reconstruct a single degraded image. However, the prior knowledge exploited by vanilla DIP relies on basic local convolutions, which limits performance on inverse imaging tasks to the generative capacity of the model. Furthermore, image information depends not only on neighboring pixels but also on global color features and spatial distribution, and simple local convolutions cannot capture precise fine-grained details. Moreover, although DIP is unsupervised, it requires many iterations to learn an inverse imaging task, which consumes computational power and limits the adoption of global attention. To address these problems, this article explores an efficient global prior module, a tri-directional multi-head self-attention mechanism, which learns pixel-wise correlations along three directions: horizontal, vertical, and channel-wise. We observe that this global learning effectively enhances the detail of edge pixels, making images more vivid and textures clearer, and that tri-directional multi-head self-attention can efficiently replace the global perception ability of pixel-level self-attention. Finally, we demonstrate that global learning improves reconstruction quality in inverse imaging problems and enhances texture and edge information, while tri-directional multi-head self-attention alleviates the computational redundancy of pixel-level self-attention, achieving efficient, high-quality inverse imaging. The method rests on global feature capture and efficient attention modeling, striking a balance between detail fidelity and computational practicality.
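As an illustration of the vanilla DIP procedure this work builds on, here is a minimal sketch under toy assumptions: a tiny convolutional net and a random 32×32 "degraded" image stand in for the paper's architecture and data, and the net is fit to that single image from a fixed random input code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the DIP generator: a small purely-convolutional net.
net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1),
)
z = torch.randn(1, 3, 32, 32)        # fixed random input code
degraded = torch.rand(1, 3, 32, 32)  # the single observed (degraded) image
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

# Optimize the network parameters to reproduce the degraded image;
# in DIP, early stopping on this loop acts as the implicit regularizer.
losses = []
for _ in range(100):
    opt.zero_grad()
    loss = ((net(z) - degraded) ** 2).mean()
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

The paper's contribution can be read as replacing the purely local convolutions in such a generator with tri-directional (horizontal, vertical, channel-wise) multi-head self-attention to inject global context into this same single-image optimization loop.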
12 pages, 1514 KB  
Article
A Spatio-Temporal Dependency Modeling and Key Node Radiation-Based Method for Ultra-Short-Term Wind Farm Power Prediction Using GAT-TCN
by Shujun Liu, Tao Zhou, Xiaoze Du, Jiangbo Wu and Yiting He
Energies 2026, 19(7), 1710; https://doi.org/10.3390/en19071710 - 31 Mar 2026
Abstract
Deep learning has become an important tool for wind power forecasting because it can help improve wind energy utilization and support reliable grid-connected operation. For wind farms, accurate turbine-level forecasting depends on spatial interactions among turbines and the temporal evolution of historical operating data. In this study, a spatio-temporal forecasting framework is developed by combining a Graph Attention Network with a Temporal Convolutional Network. The graph attention module describes the neighborhood relations among turbines and learns their influence strengths adaptively, while the temporal convolution module extracts temporal patterns from multivariate SCADA sequences for multi-step prediction. On this basis, the learned attention weights are further used to define a node influence metric. This makes it possible to identify a small set of key turbines and use only their historical data to predict the future power output of the whole wind farm. The proposed framework is evaluated using one year of SCADA data from 134 turbines; a sliding-window dataset is constructed and divided into training, validation, and test sets. The results show that the method can capture the spatio-temporal dependencies within the wind farm and still provide effective farm-wide forecasting when only limited observation nodes are available. The value of this work lies in organizing existing techniques around a practical wind farm forecasting task and in providing an interpretable prediction strategy based on key turbine selection, rather than in proposing a fundamentally new theoretical model.
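The key-turbine selection step can be pictured with a small, hypothetical sketch (the matrix `A`, the influence score, and `k` below are illustrative stand-ins, not the paper's exact definitions): given a learned attention matrix where row `i` holds the attention turbine `i` pays to its neighbors, each turbine is scored by the total attention it receives, and the top-`k` scorers are kept as key nodes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_turbines = 8

# Stand-in for learned graph-attention weights: A[i, j] is the attention
# turbine i pays to turbine j; rows are normalized as a GAT softmax would.
A = rng.random((n_turbines, n_turbines))
A = A / A.sum(axis=1, keepdims=True)

# Influence metric: total incoming attention per turbine (column sums).
influence = A.sum(axis=0)

# Keep the k most influential turbines as the reduced observation set.
k = 3
key_nodes = np.argsort(influence)[::-1][:k]
print("key turbines:", key_nodes)
```

Only the historical SCADA series of these `k` turbines would then be fed to the temporal module to forecast farm-wide output, which is what makes the strategy interpretable: the selected nodes are exactly those the attention mechanism weights most heavily.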