Search Results (2,084)

Search Parameters:
Keywords = multi-scale training

20 pages, 6483 KB  
Article
Loop-MapNet: A Multi-Modal HDMap Perception Framework with SDMap Dynamic Evolution and Priors
by Yuxuan Tang, Jie Hu, Daode Zhang, Wencai Xu, Feiyu Zhao and Xinghao Cheng
Appl. Sci. 2025, 15(20), 11160; https://doi.org/10.3390/app152011160 - 17 Oct 2025
Viewed by 171
Abstract
High-definition maps (HDMaps) are critical for safe autonomy on structured roads. Yet traditional production—relying on dedicated mapping fleets and manual quality control—is costly and slow, impeding large-scale, frequent updates. Recently, standard-definition maps (SDMaps) derived from remote sensing have been adopted as priors to support HDMap perception, lowering cost but struggling with subtle urban changes and localization drift. We propose Loop-MapNet, a self-evolving, multimodal, closed-loop mapping framework. Loop-MapNet effectively leverages surround-view images, LiDAR point clouds, and SDMaps; it fuses multi-scale vision via a weighted BiFPN, and couples PointPillars BEV and SDMap topology encoders for cross-modal sensing. A Transformer-based bidirectional adaptive cross-attention aligns SDMap with online perception, enabling robust fusion under heterogeneity. We further introduce a confidence-guided masked autoencoder (CG-MAE) that leverages confidence and probabilistic distillation to both capture implicit SDMap priors and enhance the detailed representation of low-confidence HDMap regions. With spatiotemporal consistency checks, Loop-MapNet incrementally updates SDMaps to form a perception–mapping–update loop, compensating remote-sensing latency and enabling online map optimization. On nuScenes, within 120 m, Loop-MapNet attains 61.05% mIoU, surpassing the best baseline by 0.77%. Under extreme localization errors, it maintains 60.46% mIoU, improving robustness by 2.77%; CG-MAE pre-training raises accuracy in low-confidence regions by 1.72%. These results demonstrate advantages in fusion and robustness, moving beyond one-way prior injection and enabling HDMap–SDMap co-evolution for closed-loop autonomy and rapid SDMap refresh from remote sensing. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
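To make the alignment step concrete, here is a minimal sketch of bidirectional cross-attention between BEV perception features and SDMap prior features, in the spirit of the abstract above; the module name, dimensions, and residual/normalization choices are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    """Sketch: each stream queries the other, with residual connections."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.bev_from_sd = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.sd_from_bev = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_bev = nn.LayerNorm(dim)
        self.norm_sd = nn.LayerNorm(dim)

    def forward(self, bev, sd):              # bev: (B, N, dim), sd: (B, M, dim)
        bev2, _ = self.bev_from_sd(bev, sd, sd)   # perception attends to the prior
        sd2, _ = self.sd_from_bev(sd, bev, bev)   # prior attends to perception
        return self.norm_bev(bev + bev2), self.norm_sd(sd + sd2)

# toy usage with random BEV and SDMap token sequences
bev = torch.randn(2, 100, 256)
sd = torch.randn(2, 40, 256)
fused_bev, fused_sd = BidirectionalCrossAttention()(bev, sd)
```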

27 pages, 5792 KB  
Article
Optimized Hybrid Deep Learning Framework for Short-Term Power Load Interval Forecasting via Improved Crowned Crested Porcupine Optimization and Feature Mode Decomposition
by Shucheng Luo, Xiangbin Meng, Xinfu Pang, Haibo Li and Zedong Zheng
Algorithms 2025, 18(10), 659; https://doi.org/10.3390/a18100659 - 17 Oct 2025
Viewed by 67
Abstract
This paper presents an optimized hybrid deep learning model for power load forecasting—QR-FMD-CNN-BiGRU-Attention—that integrates similar day selection, load decomposition, and deep learning to address the nonlinearity and volatility of power load data. Firstly, the original data are classified using Gaussian Mixture Clustering optimized by ICPO (ICPO-GMM), and similar day samples consistent with the predicted day category are selected. Secondly, the load data are decomposed into multi-scale components (IMFs) using feature mode decomposition optimized by ICPO (ICPO-FMD). Then, with the IMFs as targets, the quantile interval forecasting is trained using the CNN-BiGRU-Attention model optimized by ICPO. Subsequently, the forecasting model is applied to the features of the predicted day to generate interval forecasting results. Finally, the model’s performance is validated through comparative evaluation metrics, sensitivity analysis, and interpretability analysis. The experimental results show that compared with the comparative algorithm presented in this paper, the improved model has improved RMSE by at least 39.84%, MAE by 26.12%, MAPE by 45.28%, PICP and MPIW indicators by at least 3.80% and 2.27%, indicating that the model not only outperforms the comparative model in accuracy, but also exhibits stronger adaptability and robustness in complex load fluctuation scenarios. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
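Since the abstract reports gains in PICP and MPIW, a small sketch of how these interval-forecast metrics are typically computed may help; the function and the toy quantile bounds are illustrative, not the paper's code.

```python
import numpy as np

def interval_metrics(y_true, lower, upper):
    """PICP (fraction of targets inside the interval) and MPIW (mean interval width)."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    picp = ((y_true >= lower) & (y_true <= upper)).mean()
    mpiw = (upper - lower).mean()
    return picp, mpiw

# toy usage: a fixed-width interval around a smooth forecast of a noisy load
rng = np.random.default_rng(0)
t = np.linspace(0, 6, 96)
actual = 100 + 10 * np.sin(t) + rng.normal(0, 2, 96)
forecast = 100 + 10 * np.sin(t)
picp, mpiw = interval_metrics(actual, forecast - 4, forecast + 4)
print(f"PICP={picp:.2f}, MPIW={mpiw:.1f}")
```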
26 pages, 1351 KB  
Review
Trends and Limitations in Transformer-Based BCI Research
by Maximilian Achim Pfeffer, Johnny Kwok Wai Wong and Sai Ho Ling
Appl. Sci. 2025, 15(20), 11150; https://doi.org/10.3390/app152011150 - 17 Oct 2025
Viewed by 247
Abstract
Transformer-based models have accelerated EEG motor imagery (MI) decoding by using self-attention to capture long-range temporal structures while complementing spatial inductive biases. This systematic survey of Scopus-indexed works from 2020 to 2025 indicates that reported advances are concentrated in offline, protocol-heterogeneous settings; inconsistent preprocessing, non-standard data splits, and sparse efficiency reporting frequently cloud claims of generalization and real-time suitability. Under session- and subject-aware evaluation on the BCIC IV 2a/2b datasets, typical performance clusters are in the high-80% range for binary MI and the mid-70% range for multi-class tasks, with gains of roughly 5–10 percentage points achieved by strong hybrids (CNN/TCN–Transformer; hierarchical attention) rather than by extreme figures often driven by leakage-prone protocols. In parallel, transformer-driven denoising—particularly diffusion–transformer hybrids—yields strong signal-level metrics but remains weakly linked to task benefit; denoise → decode validation is rarely standardized despite being the most relevant proxy when artifact-free ground truth is unavailable. Three priorities emerge for translation: protocol discipline (fixed train/test partitions, transparent preprocessing, mandatory reporting of parameters, FLOPs, per-trial latency, and acquisition-to-feedback delay); task relevance (shared denoise → decode benchmarks for MI and related paradigms); and adaptivity at scale (self-supervised pretraining on heterogeneous EEG corpora and resource-aware co-optimization of preprocessing and hybrid transformer topologies). Evidence from subject-adjusting evolutionary pipelines that jointly tune preprocessing, attention depth, and CNN–Transformer fusion demonstrates reproducible inter-subject gains over established baselines under controlled protocols. Implementing these practices positions transformer-driven BCIs to move beyond inflated offline estimates toward reliable, real-time neurointerfaces with concrete clinical and assistive relevance. Full article
(This article belongs to the Special Issue Brain-Computer Interfaces: Development, Applications, and Challenges)
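One of the protocol-discipline points above (leakage-prone splits) is easy to illustrate: a subject-aware split keeps any one subject's trials entirely in either the train or the test partition. The array shapes and subject counts below are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical EEG trial tensor: trials x channels x samples, grouped by subject.
X = np.random.randn(200, 22, 1000)
y = np.random.randint(0, 4, 200)         # 4-class motor imagery labels
subjects = np.repeat(np.arange(10), 20)  # 10 subjects, 20 trials each

# Subject-aware split: no subject appears in both partitions,
# avoiding the leakage-prone random trial splits the review criticizes.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=subjects))
assert set(subjects[train_idx]).isdisjoint(subjects[test_idx])
```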

22 pages, 1678 KB  
Article
Image Completion Network Considering Global and Local Information
by Yubo Liu, Ke Chen and Alan Penn
Buildings 2025, 15(20), 3746; https://doi.org/10.3390/buildings15203746 - 17 Oct 2025
Viewed by 85
Abstract
Accurate depth image inpainting in complex urban environments remains a critical challenge due to occlusions, reflections, and sensor limitations, which often result in significant data loss. We propose a hybrid deep learning framework that explicitly combines local and global modelling through Convolutional Neural Networks (CNNs) and Transformer modules. The model employs a multi-branch parallel architecture, where the CNN branch captures fine-grained local textures and edges, while the Transformer branch models global semantic structures and long-range dependencies. We introduce an optimized attention mechanism, Agent Attention, which differs from existing efficient/linear attention methods by using learnable proxy tokens tailored for urban scene categories (e.g., façades, sky, ground). A content-guided dynamic fusion module adaptively combines multi-scale features to enhance structural alignment and texture recovery. The framework is trained with a composite loss function incorporating pixel accuracy, perceptual similarity, adversarial realism, and structural consistency. Extensive experiments on the Paris StreetView dataset demonstrate that the proposed method achieves state-of-the-art performance, outperforming existing approaches in PSNR, SSIM, and LPIPS metrics. The study highlights the potential of multi-scale modeling for urban depth inpainting and discusses challenges in real-world deployment, ethical considerations, and future directions for multimodal integration. Full article
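As a rough illustration of the proxy-token idea behind agent-style attention, the sketch below lets a small set of learnable agent tokens first summarize the sequence and then serve as the keys/values the queries attend to; it is a generic sketch, not the paper's module, and omits multi-head projections and the scene-category specialization described above.

```python
import torch
import torch.nn as nn

class AgentAttention(nn.Module):
    """Sketch: n_agents learnable proxy tokens mediate between queries and keys."""
    def __init__(self, dim, n_agents=16):
        super().__init__()
        self.agents = nn.Parameter(torch.randn(1, n_agents, dim) * 0.02)
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):                          # x: (B, N, dim)
        B = x.size(0)
        q, k, v = self.q(x), self.k(x), self.v(x)
        a = self.agents.expand(B, -1, -1)          # (B, M, dim)
        # agents aggregate the full sequence ...
        agg = torch.softmax(a @ k.transpose(1, 2) * self.scale, dim=-1) @ v
        # ... and queries read from the agents instead of all keys
        return torch.softmax(q @ a.transpose(1, 2) * self.scale, dim=-1) @ agg

# toy usage on a batch of flattened image tokens
out = AgentAttention(dim=64)(torch.randn(2, 256, 64))
```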

32 pages, 4935 KB  
Article
Machine Learning Analytics for Blockchain-Based Financial Markets: A Confidence-Threshold Framework for Cryptocurrency Price Direction Prediction
by Oleksandr Kuznetsov, Oleksii Kostenko, Kateryna Klymenko, Zoriana Hbur and Roman Kovalskyi
Appl. Sci. 2025, 15(20), 11145; https://doi.org/10.3390/app152011145 - 17 Oct 2025
Viewed by 110
Abstract
Blockchain-based cryptocurrency markets present unique analytical challenges due to their decentralized nature, continuous operation, and extreme volatility. Traditional price prediction models often struggle with the binary trade execution problem in these markets. This study introduces a confidence-based classification framework that separates directional prediction from execution decisions in cryptocurrency trading. We develop a neural network system that processes multi-scale market data, combining daily macroeconomic indicators with a high-frequency order book microstructure. The model trains exclusively on directional movements (up versus down) and uses prediction confidence levels to determine trade execution. We evaluate the framework across 11 major cryptocurrency pairs over 12 months. Experimental results demonstrate 82.68% direction accuracy on executed trades with 151.11-basis point average net profit per trade at 11.99% market coverage. Order book features dominate predictive importance (81.3% of selected features), validating the critical role of blockchain microstructure data for short-term price prediction. The confidence-based execution strategy achieves superior risk-adjusted returns compared to traditional classification approaches while providing natural risk management capabilities through selective trade execution. These findings contribute to blockchain technology applications in financial markets by demonstrating how a decentralized market microstructure can be leveraged for systematic trading strategies. The methodology offers practical implementation guidelines for cryptocurrency algorithmic trading while advancing the understanding of machine learning applications in blockchain-based financial systems. Full article
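The core execution rule described above (always predict direction, but trade only when confident) can be sketched in a few lines; the 0.80 threshold and the abstain logic are illustrative, not the paper's calibrated values.

```python
import numpy as np

def confidence_gated_signals(p_up, threshold=0.80):
    """Separate direction from execution: act only when directional
    confidence clears a threshold, otherwise abstain.
    `p_up` is the predicted probability of an upward move."""
    p_up = np.asarray(p_up)
    confidence = np.maximum(p_up, 1.0 - p_up)   # confidence of the argmax class
    direction = np.where(p_up >= 0.5, 1, -1)    # +1 long, -1 short
    execute = confidence >= threshold           # abstain on low-confidence predictions
    return direction, execute

# toy usage: only the two most confident predictions are executed
direction, execute = confidence_gated_signals([0.91, 0.55, 0.48, 0.12, 0.60])
print(direction, execute, f"coverage={execute.mean():.0%}")
```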

16 pages, 5944 KB  
Article
A Gradient-Variance Weighting Physics-Informed Neural Network for Solving Integer and Fractional Partial Differential Equations
by Liang Zhang, Quansheng Liu, Ruigang Zhang, Liqing Yue and Zhaodong Ding
Appl. Sci. 2025, 15(20), 11137; https://doi.org/10.3390/app152011137 - 17 Oct 2025
Viewed by 112
Abstract
Physics-Informed Neural Networks (PINNs) have emerged as a promising paradigm for solving partial differential equations (PDEs) by embedding physical laws into the learning process. However, standard PINNs often suffer from training instabilities and unbalanced optimization when handling multi-term loss functions, especially in problems involving singular perturbations, fractional operators, or multi-scale behaviors. To address these limitations, we propose a novel gradient variance weighting physics-informed neural network (GVW-PINN), which adaptively adjusts the loss weights based on the variance of gradient magnitudes during training. This mechanism balances the optimization dynamics across different loss terms, thereby enhancing both convergence stability and solution accuracy. We evaluate GVW-PINN on three representative PDE models and numerical experiments demonstrate that GVW-PINN consistently outperforms the conventional PINN in terms of training efficiency, loss convergence, and predictive accuracy. In particular, GVW-PINN achieves smoother and faster loss reduction, reduces relative errors by one to two orders of magnitude, and exhibits superior generalization to unseen domains. The proposed framework provides a robust and flexible strategy for applying PINNs to a wide range of integer- and fractional-order PDEs, highlighting its potential for advancing data-driven scientific computing in complex physical systems. Full article
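A rough sketch of the weighting idea as described in the abstract: give each loss term a weight inversely related to the variance of its gradient magnitudes, so no single term dominates the optimization. The exact weighting rule below is an assumption for illustration, not the authors' formulation.

```python
import torch

def gradient_variance_weights(loss_terms, params, eps=1e-8):
    """Illustrative adaptive weights: inverse gradient-magnitude variance,
    normalized so the weights average to one."""
    variances = []
    for loss in loss_terms:
        grads = torch.autograd.grad(loss, params, retain_graph=True, allow_unused=True)
        mags = torch.cat([g.detach().abs().flatten() for g in grads if g is not None])
        variances.append(mags.var() + eps)
    inv = torch.stack([1.0 / v for v in variances])
    return (inv / inv.sum()) * len(loss_terms)

# usage inside a training step (model, pde_loss, bc_loss assumed defined elsewhere):
#   w = gradient_variance_weights([pde_loss, bc_loss], list(model.parameters()))
#   total = w[0] * pde_loss + w[1] * bc_loss
#   total.backward()
```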

23 pages, 507 KB  
Article
Sustainability in Education: Exploring Teachers’ Confidence in Establishing an Out-of-School Learning Environment
by Fatma Coştu and Neslihan Karakuş
Sustainability 2025, 17(20), 9160; https://doi.org/10.3390/su17209160 - 16 Oct 2025
Viewed by 205
Abstract
Outdoor learning offers dynamic, real-world educational opportunities that extend beyond traditional classrooms and foster sustainability awareness. This quantitative study endeavors to assess teachers’ competency in facilitating outdoor learning, aiming for a more engaging and impactful introduction. Employing a relational survey design in the form of a multi-survey model, the research engaged 586 teachers representing diverse academic disciplines across public and private elementary and secondary schools. Central to the investigation was the utilization of the “Outdoor Learning Regulation Scale [OLRS]” as the primary data collection instrument. The evaluation of teachers’ aptitude in regulating outdoor learning encompassed various variables, including gender, subject specialization, prior online or in-person training in outdoor learning, use of non-school environments for teaching, childhood environment, and teaching location. To analyze the collected data, a nuanced approach to statistical analysis was undertaken, aiming to provide a clearer and more specific explanation of the data analysis methods employed. The findings of the study unveiled no significant disparities in teachers’ outdoor learning regulation capabilities based on gender, subject specialization, childhood environment, or teaching location. However, discernible differences surfaced in their proficiency in outdoor learning regulation concerning previous online or in-person training in outdoor learning and their utilization of outdoor environments for teaching, thus providing deeper insights into the factors shaping teachers’ efficacy in facilitating outdoor learning experiences. Additionally, the study emphasizes the link between outdoor learning and sustainability education. By equipping teachers with the skills to regulate outdoor learning, this research supports the integration of sustainability into educational practices, promoting students’ ecological awareness and sustainable thinking. These results highlight the importance of professional development and targeted training in outdoor education, with direct implications for strengthening sustainability-oriented teaching practices. Full article

19 pages, 2733 KB  
Article
Style Transfer from Sentinel-1 to Sentinel-2 for Fluvial Scenes with Multi-Modal and Multi-Temporal Image Fusion
by Patrice E. Carbonneau
Remote Sens. 2025, 17(20), 3445; https://doi.org/10.3390/rs17203445 - 15 Oct 2025
Viewed by 183
Abstract
Recently, there has been significant progress in the area of semantic classification of water bodies at global scales with deep learning. For the key purposes of water inventory and change detection, advanced deep learning classifiers such as UNets and Vision Transformers have been shown to be both accurate and flexible when applied to large-scale, or even global, satellite image datasets from optical (e.g., Sentinel-2) and radar sensors (e.g., Sentinel-1). Most of this work is conducted with optical sensors, which usually have better image quality, but their obvious limitation is cloud cover, which is why radar imagery is an important complementary dataset. However, radar imagery is generally more sensitive to soil moisture than optical data. Furthermore, topography and wind-ripple effects can alter the reflected intensity of radar waves, which can induce errors in water classification models that fundamentally rely on the fact that water is darker than the surrounding landscape. In this paper, we develop a solution to the use of Sentinel-1 radar images for the semantic classification of water bodies that uses style transfer with multi-modal and multi-temporal image fusion. Instead of developing new semantic classification models that work directly on Sentinel-1 images, we develop a global style transfer model that produces synthetic Sentinel-2 images from Sentinel-1 input. The resulting synthetic Sentinel-2 imagery can then be classified with existing models. This has the advantage of obviating the need for large volumes of manually labeled Sentinel-1 water masks. Next, we show that fusing an 8-year cloud-free composite of the near-infrared band 8 of Sentinel-2 to the input Sentinel-1 image improves the classification performance. Style transfer models were trained and validated with global scale data covering the years 2017 to 2024, and include every month of the year. When tested against a global independent benchmark, S1S2-Water, the semantic classifications produced from our synthetic imagery show a marked improvement with the use of image fusion. When we use only Sentinel-1 data, we find an overall IoU (Intersection over Union) score of 0.70, but when we add image fusion, the overall IoU score rises to 0.93. Full article
(This article belongs to the Special Issue Multimodal Remote Sensing Data Fusion, Analysis and Application)
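The reported scores are IoU on binary water masks; for reference, a minimal implementation of that metric looks like the sketch below (toy masks, not the S1S2-Water data).

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Intersection over Union for binary masks (e.g., water vs. non-water)."""
    pred, true = np.asarray(pred_mask, bool), np.asarray(true_mask, bool)
    inter = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return inter / union if union else 1.0

# toy usage on two 4x4 masks: 4 overlapping pixels out of 5 in the union
pred = np.array([[1,1,0,0],[1,1,0,0],[0,0,0,0],[0,0,0,0]])
true = np.array([[1,1,1,0],[1,1,0,0],[0,0,0,0],[0,0,0,0]])
print(round(iou(pred, true), 3))   # 0.8
```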

21 pages, 7603 KB  
Article
Non-Invasive Inversion and Characteristic Analysis of Soil Moisture in 0–300 cm Agricultural Soil Layers
by Shujie Jia, Yaoyu Li, Boxin Cao, Yuwei Cheng, Abdul Sattar Mashori, Zheyu Bai, Mingyi Cui, Zhimin Zhang, Linqiang Deng and Wuping Zhang
Agriculture 2025, 15(20), 2143; https://doi.org/10.3390/agriculture15202143 - 15 Oct 2025
Viewed by 223
Abstract
Accurate profiling of deep (20–300 cm) soil moisture is crucial for precision irrigation but remains technically challenging and costly at operational scales. We systematically benchmark eight regression algorithms—including linear regression, Lasso, Ridge, elastic net, support vector regression, multi-layer perceptron (MLP), random forest (RF), and gradient boosting trees (GBDT)—that use easily accessible inputs of 0–20 cm surface soil moisture (SSM) and ten meteorological variables to non-invasively infer soil moisture at fourteen 20 cm layers. Data from a typical agricultural site in Wenxi, Shanxi (2020–2022), were divided into training and testing datasets based on temporal order (2020–2021 for training, 2022 for testing) and standardized prior to modeling. Across depths, non-linear ensemble models significantly outperform linear baselines. Ridge Regression achieves the highest accuracy at 0–20 cm, SVR performs best at 20–40 cm, and MLP yields consistently optimal performance across deep layers from 60 cm to 300 cm (R2 = 0.895–0.978, KGE = 0.826–0.985). Although ensemble models like RF and GBDT exhibit strong fitting ability, their generalization performance under temporal validation is relatively limited. Model interpretability combining SHAP, PDP, and ALE shows that surface soil moisture is the dominant predictor across all depths, with a clear attenuation trend and a critical transition zone between 160 and 200 cm. Precipitation and humidity primarily drive shallow to mid-layers (20–140 cm), whereas temperature variables gain relative importance in deeper profiles (200–300 cm). ALE analysis eliminates feature correlation biases while maintaining high predictive accuracy, confirming surface-to-deep information transmission mechanisms. We propose a depth-adaptive modeling strategy by assigning the best-performing model at each soil layer, enabling practical non-invasive deep soil moisture prediction for precision irrigation and water resource management. Full article
(This article belongs to the Section Agricultural Soils)
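Alongside R2, the abstract reports KGE per depth; the standard Kling-Gupta Efficiency can be computed as below (the toy arrays are illustrative, not the Wenxi data).

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta Efficiency: combines correlation, variability ratio, and bias ratio."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    r = np.corrcoef(sim, obs)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# toy usage: predicted vs. observed volumetric moisture for one 20 cm layer
obs = np.array([0.21, 0.24, 0.26, 0.23, 0.22, 0.25])
sim = np.array([0.20, 0.25, 0.27, 0.22, 0.23, 0.24])
print(round(kge(sim, obs), 3))
```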

22 pages, 3532 KB  
Article
Dual Weakly Supervised Anomaly Detection and Unsupervised Segmentation for Real-Time Railway Perimeter Intrusion Monitoring
by Donghua Wu, Yi Tian, Fangqing Gao, Xiukun Wei and Changfan Wang
Sensors 2025, 25(20), 6344; https://doi.org/10.3390/s25206344 - 14 Oct 2025
Viewed by 228
Abstract
The high operational velocities of high-speed trains present constraints on their onboard track intrusion detection systems for real-time capture and analysis, encompassing limited computational resources and motion image blurring. This emphasizes the critical necessity of track perimeter intrusion monitoring systems. Consequently, an intelligent monitoring system employing trackside cameras is constructed, integrating weakly supervised video anomaly detection and unsupervised foreground segmentation, which offers a solution for monitoring foreign objects on high-speed train tracks. To address the challenges of complex dataset annotation and unidentified target detection, weakly supervised learning detection is proposed to track foreign object intrusions based on video. The pretraining of Xception3D and the integration of multiple attention mechanisms have markedly enhanced the feature extraction capabilities. The Top-K sample selection alongside the amplitude score/feature loss function effectively discriminates abnormal from normal samples, incorporating time-smoothing constraints to ensure detection consistency across consecutive frames. Once abnormal video frames are identified, a multiscale variational autoencoder is proposed for the positioning of foreign objects. A downsampling/upsampling module is optimized to increase feature extraction efficiency. The pixel-level background weight distribution loss function is engineered to jointly balance background authenticity and noise resistance. Ultimately, the experimental results indicate that the video anomaly detection model achieved an AUC of 0.99 on the track anomaly detection dataset and processes 2 s video segments in 0.41 s. The proposed foreground segmentation algorithm achieved an F1 score of 0.9030 in the track anomaly dataset and 0.8375 on CDnet2014, with 91 Frames per Second, confirming its efficacy. Full article
(This article belongs to the Section Sensing and Imaging)
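The Top-K sample selection mentioned above is a common weakly supervised pattern: score a video by the mean of its k highest frame scores and train with a ranking margin between abnormal and normal videos. The sketch below shows that generic pattern, not the paper's exact loss (which also adds amplitude and smoothness terms).

```python
import torch

def topk_video_score(frame_scores, k=8):
    """Video-level anomaly score = mean of the top-k frame scores."""
    topk, _ = frame_scores.topk(min(k, frame_scores.numel()))
    return topk.mean()

def ranking_loss(abnormal_scores, normal_scores, margin=1.0, k=8):
    """Hinge ranking: an abnormal video's top-k score should exceed
    a normal video's by a margin."""
    sa = topk_video_score(abnormal_scores, k)
    sn = topk_video_score(normal_scores, k)
    return torch.clamp(margin - sa + sn, min=0.0)

# toy usage with random per-frame scores from one abnormal and one normal clip
loss = ranking_loss(torch.rand(64), torch.rand(64))
```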

19 pages, 2435 KB  
Article
A Lesion-Aware Patch Sampling Approach with EfficientNet3D-UNet for Robust Multiple Sclerosis Lesion Segmentation
by Hind Almaaz and Samia Dardouri
J. Imaging 2025, 11(10), 361; https://doi.org/10.3390/jimaging11100361 - 13 Oct 2025
Viewed by 189
Abstract
Accurate segmentation of multiple sclerosis (MS) lesions from 3D MRI scans is essential for diagnosis, disease monitoring, and treatment planning. However, this task remains challenging due to the sparsity, heterogeneity, and subtle appearance of lesions, as well as the difficulty in obtaining high-quality annotations. In this study, we propose Efficient-Net3D-UNet, a deep learning framework that combines compound-scaled MBConv3D blocks with a lesion-aware patch sampling strategy to improve volumetric segmentation performance across multi-modal MRI sequences (FLAIR, T1, and T2). The model was evaluated against a conventional 3D U-Net baseline using standard metrics including Dice similarity coefficient, precision, recall, accuracy, and specificity. On a held-out test set, EfficientNet3D-UNet achieved a Dice score of 48.39%, precision of 49.76%, and recall of 55.41%, outperforming the baseline 3D U-Net, which obtained a Dice score of 31.28%, precision of 32.48%, and recall of 43.04%. Both models reached an overall accuracy of 99.14%. Notably, EfficientNet3D-UNet also demonstrated faster convergence and reduced overfitting during training. These results highlight the potential of EfficientNet3D-UNet as a robust and computationally efficient solution for automated MS lesion segmentation, offering promising applicability in real-world clinical settings. Full article
(This article belongs to the Section Medical Imaging)
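Lesion-aware patch sampling usually means biasing patch centres toward lesion voxels so sparse lesions are represented during training; a minimal sketch follows, with the 0.7 lesion fraction and the mask shape as illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def sample_patch_centers(lesion_mask, n_patches, lesion_fraction=0.7, rng=None):
    """Draw a fraction of patch centres from lesion voxels, the rest uniformly."""
    rng = rng if rng is not None else np.random.default_rng()
    lesion_idx = np.argwhere(lesion_mask > 0)
    all_idx = np.argwhere(np.ones(lesion_mask.shape, dtype=bool))
    n_lesion = int(n_patches * lesion_fraction) if len(lesion_idx) else 0
    picks = []
    if n_lesion:
        picks.append(lesion_idx[rng.integers(len(lesion_idx), size=n_lesion)])
    picks.append(all_idx[rng.integers(len(all_idx), size=n_patches - n_lesion)])
    return np.concatenate(picks)

# toy usage on a 64^3 mask with a small lesion blob
mask = np.zeros((64, 64, 64), dtype=np.uint8)
mask[30:34, 30:34, 30:34] = 1
centers = sample_patch_centers(mask, n_patches=16)
```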

15 pages, 8859 KB  
Article
A Hybrid Estimation Model for Graphite Nodularity of Ductile Cast Iron Based on Multi-Source Feature Extraction
by Yongjian Yang, Yanhui Liu, Yuqian He, Zengren Pan and Zhiwei Li
Modelling 2025, 6(4), 126; https://doi.org/10.3390/modelling6040126 - 13 Oct 2025
Viewed by 200
Abstract
Graphite nodularity is a key indicator for evaluating the microstructure quality of ductile iron and plays a crucial role in ensuring product quality and enhancing manufacturing efficiency. Existing research often only focuses on a single type of feature and fails to utilize multi-source information in a coordinated manner. Single-feature methods are difficult to comprehensively capture microstructures, which limits the accuracy and robustness of the model. This study proposes a hybrid estimation model for the graphite nodularity of ductile cast iron based on multi-source feature extraction. A comprehensive feature engineering pipeline was established, incorporating geometric, color, and texture features extracted via Hue-Saturation-Value color space (HSV) histograms, gray level co-occurrence matrix (GLCM), Local Binary Pattern (LBP), and multi-scale Gabor filters. Dimensionality reduction was performed using Principal Component Analysis (PCA) to mitigate redundancy. An improved watershed algorithm combined with intelligent filtering was used for accurate particle segmentation. Several machine learning algorithms, including Support Vector Regression (SVR), Multi-Layer Perceptron (MLP), Random Forest (RF), Gradient Boosting Regressor (GBR), eXtreme Gradient Boosting (XGBoost) and Categorical Boosting (CatBoost), are applied to estimate graphite nodularity based on geometric features (GFs) and feature extraction. Experimental results demonstrate that the CatBoost model trained on fused features achieves high estimation accuracy and stability for geometric parameters, with R-squared (R2) exceeding 0.98. Furthermore, introducing geometric features into the fusion set enhances model generalization and suppresses overfitting. This framework offers an efficient and robust approach for intelligent analysis of metallographic images and provides valuable support for automated quality assessment in casting production. Full article
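For the texture part of the multi-source feature vector (GLCM and LBP), a small scikit-image sketch is shown below; the distances, angles, and LBP radius are illustrative choices, not the paper's settings, and the HSV and Gabor features are omitted for brevity.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

def texture_features(gray_img):
    """GLCM statistics plus a uniform-LBP histogram from an 8-bit grayscale image."""
    img = np.asarray(gray_img, dtype=np.uint8)
    glcm = graycomatrix(img, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    feats = [graycoprops(glcm, p).ravel()
             for p in ("contrast", "homogeneity", "energy", "correlation")]
    lbp = local_binary_pattern(img, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate(feats + [hist])

# toy usage on a random metallographic-style image
vec = texture_features(np.random.randint(0, 256, (128, 128)))
```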

20 pages, 49845 KB  
Article
DDF-YOLO: A Small Target Detection Model Using Multi-Scale Dynamic Feature Fusion for UAV Aerial Photography
by Ziang Ma, Chao Wang, Chuanzhi Chen, Jinbao Chen and Guang Zheng
Aerospace 2025, 12(10), 920; https://doi.org/10.3390/aerospace12100920 - 13 Oct 2025
Viewed by 418
Abstract
Unmanned aerial vehicle (UAV)-based object detection shows promising potential in intelligent transportation and disaster response. However, detecting small targets remains challenging due to inherent limitations (long-distance and low-resolution imaging) and environmental interference (complex backgrounds and occlusions). To address these issues, this paper proposes an enhanced small target detection model, DDF-YOLO, which achieves higher detection performance. First, a dynamic feature extraction module (C2f-DCNv4) employs deformable convolutions to effectively capture features from irregularly shaped objects. In addition, a dynamic upsampling module (DySample) optimizes multi-scale feature fusion by combining shallow spatial details with deep semantic features, preserving critical low-level information while enhancing generalization across scales. Finally, to balance rapid convergence with precise localization, an adaptive Focaler-ECIoU loss function dynamically adjusts training weights based on sample quality during bounding box regression. Extensive experiments on VisDrone2019 and UAVDT benchmarks demonstrate DDF-YOLO’s superiority. Compared to YOLOv8n, our model achieves gains of 8.6% and 4.8% in mAP50, along with improvements of 5.0% and 3.3% in mAP50-95, respectively. Furthermore, it exhibits superior efficiency, requiring only 7.3 GFLOPs and attaining an inference speed of 179 FPS. These results validate the model’s robustness for UAV-based detection, particularly in small-object scenarios. Full article
(This article belongs to the Section Aeronautics)

23 pages, 23535 KB  
Article
FANT-Det: Flow-Aligned Nested Transformer for SAR Small Ship Detection
by Hanfu Li, Dawei Wang, Jianming Hu, Xiyang Zhi and Dong Yang
Remote Sens. 2025, 17(20), 3416; https://doi.org/10.3390/rs17203416 - 12 Oct 2025
Viewed by 363
Abstract
Ship detection in synthetic aperture radar (SAR) remote sensing imagery is of great significance in military and civilian applications. However, two factors limit detection performance: (1) a high prevalence of small-scale ship targets with limited information content and (2) interference affecting ship detection from speckle noise and land–sea clutter. To address these challenges, we propose a novel end-to-end (E2E) transformer-based SAR ship detection framework, called Flow-Aligned Nested Transformer for SAR Small Ship Detection (FANT-Det). Specifically, in the feature extraction stage, we introduce a Nested Swin Transformer Block (NSTB). The NSTB employs a two-level local self-attention mechanism to enhance fine-grained target representation, thereby enriching features of small ships. For multi-scale feature fusion, we design a Flow-Aligned Depthwise Efficient Channel Attention Network (FADEN). FADEN achieves precise alignment of features across different resolutions via semantic flow and filters background clutter through lightweight channel attention, further enhancing small-target feature quality. Moreover, we propose an Adaptive Multi-scale Contrastive Denoising (AM-CDN) training paradigm. AM-CDN constructs adaptive perturbation thresholds jointly determined by a target scale factor and a clutter factor, generating contrastive denoising samples that better match the physical characteristics of SAR ships. Finally, extensive experiments on three widely used open SAR ship datasets demonstrate that the proposed method achieves superior detection performance, outperforming current state-of-the-art (SOTA) benchmarks. Full article

21 pages, 5980 KB  
Article
Research on the Classification Method of Pinus Species Based on Generative Adversarial Networks and Convolutional Neural Networks
by Shuo Xu, Hang Su and Lei Zhao
Appl. Sci. 2025, 15(20), 10942; https://doi.org/10.3390/app152010942 - 11 Oct 2025
Viewed by 227
Abstract
With the rapid expansion of the global timber trade, accurate wood identification has become essential for regulating ecosystems and combating illegal logging. Traditional methods, largely reliant on manual analysis, are inadequate for large-scale, high-precision demands. A multi-architecture fusion network (MAFNet) that combines generative adversarial networks and one-dimensional convolutional neural networks is proposed to address the data-quality and classification-accuracy challenges in pine species classification. The generative adversarial network augments the data, effectively expanding the training set, while the one-dimensional convolutional neural network extracts local and global features from the spectral data, improving both the accuracy and the stability of the model. Experimental results show that MAFNet achieves a classification accuracy of 99.63% for pine species, performing best on cross-sectional data. By integrating data augmentation with deep feature extraction, MAFNet provides strong technical support for the rapid, accurate, and non-destructive identification of pine species. Full article
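As a minimal illustration of the one-dimensional CNN component, the sketch below classifies spectra of shape (batch, 1, n_bands); the layer sizes and class count are assumptions and it does not reproduce MAFNet or its GAN-based augmentation.

```python
import torch
import torch.nn as nn

class Spectral1DCNN(nn.Module):
    """Sketch of a small 1D CNN for spectral classification."""
    def __init__(self, n_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.AdaptiveAvgPool1d(1),            # global pooling over spectral bands
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):                       # x: (batch, 1, n_bands)
        return self.classifier(self.features(x).flatten(1))

# toy usage: a batch of 4 spectra with 512 bands each
logits = Spectral1DCNN()(torch.randn(4, 1, 512))
```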