Search Results (127)

Search Parameters:
Keywords = weighted pose fusion

20 pages, 2524 KB  
Article
YOLO-PFA: Advanced Multi-Scale Feature Fusion and Dynamic Alignment for SAR Ship Detection
by Shu Liu, Peixue Liu, Zhongxun Wang, Mingze Sun and Pengfei He
J. Mar. Sci. Eng. 2025, 13(10), 1936; https://doi.org/10.3390/jmse13101936 - 9 Oct 2025
Viewed by 90
Abstract
Maritime ship detection faces challenges due to complex object poses, variable target scales, and background interference. This paper introduces YOLO-PFA, a novel SAR ship detection model that integrates multi-scale feature fusion and dynamic alignment. By leveraging the Bidirectional Feature Pyramid Network (BiFPN), YOLO-PFA enhances cross-scale weighted feature fusion, improving detection of objects of varying sizes. The C2f-Partial Feature Aggregation (C2f-PFA) module aggregates raw and processed features, enhancing feature extraction efficiency. Furthermore, the Dynamic Alignment Detection Head (DADH) optimizes classification and regression feature interaction, enabling dynamic collaboration. Experimental results on the iVision-MRSSD dataset demonstrate YOLO-PFA’s superiority, achieving an mAP@0.5 of 95%, outperforming YOLOv11 by 1.2% and YOLOv12 by 2.8%. This paper contributes significantly to automated maritime target detection. Full article
(This article belongs to the Section Ocean Engineering)
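The cross-scale weighted fusion that BiFPN contributes here can be illustrated with the "fast normalized fusion" rule from the original BiFPN design; the sketch below is a minimal numpy version, assuming the multi-scale inputs have already been resized to a common resolution (the tensor shapes and weight values are illustrative, not taken from YOLO-PFA).

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style weighted fusion: ReLU the learnable weights, then
    normalize them so the fused map is a convex combination of inputs."""
    w = np.maximum(weights, 0.0)            # ReLU keeps weights non-negative
    w = w / (w.sum() + eps)                 # fast normalization (no softmax)
    return sum(wi * fi for wi, fi in zip(w, features))

# Three feature maps from different pyramid levels, already resized to 64x64.
rng = np.random.default_rng(0)
p3, p4, p5 = (rng.standard_normal((256, 64, 64)) for _ in range(3))
learnable_w = np.array([1.0, 0.8, 1.2])     # would be trained in practice
fused = fast_normalized_fusion([p3, p4, p5], learnable_w)
print(fused.shape)                          # (256, 64, 64)
```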

25 pages, 20535 KB  
Article
DWTF-DETR: A DETR-Based Model for Inshore Ship Detection in SAR Imagery via Dynamically Weighted Joint Time–Frequency Feature Fusion
by Tiancheng Dong, Taoyang Wang, Yuqi Han, Deren Li, Guo Zhang and Yuan Peng
Remote Sens. 2025, 17(19), 3301; https://doi.org/10.3390/rs17193301 - 25 Sep 2025
Viewed by 627
Abstract
Inshore ship detection in synthetic aperture radar (SAR) imagery poses significant challenges due to the high density and diversity of ships. However, low inter-object backscatter contrast and blurred boundaries of docked ships often result in performance degradation for traditional object detection methods, especially under complex backgrounds and low signal-to-noise ratio (SNR) conditions. To address these issues, this paper proposes a novel detection framework, the Dynamic Weighted Joint Time–Frequency Feature Fusion DEtection TRansformer (DETR) Model (DWTF-DETR), specifically designed for SAR-based ship detection in inshore areas. The proposed model integrates a Dual-Domain Feature Fusion Module (DDFM) to extract and fuse features from both SAR images and their frequency-domain representations, enhancing sensitivity to both high- and low-frequency target features. Subsequently, a Dual-Path Attention Fusion Module (DPAFM) is introduced to dynamically weight and fuse shallow detail features with deep semantic representations. By leveraging an attention mechanism, the module adaptively adjusts the importance of different feature paths, thereby enhancing the model’s ability to perceive targets with ambiguous structural characteristics. Experiments conducted on a self-constructed inshore SAR ship detection dataset and the public HRSID dataset demonstrate that DWTF-DETR achieves superior performance compared to the baseline RT-DETR. Specifically, the proposed method improves mAP@50 by 1.60% and 0.72%, and F1-score by 0.58% and 1.40%, respectively. Moreover, comparative experiments show that the proposed approach outperforms several state-of-the-art SAR ship detection methods. The results confirm that DWTF-DETR is capable of achieving accurate and robust detection in diverse and complex maritime environments. Full article
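The dual-domain idea, fusing a SAR chip with its frequency-domain representation, can be sketched as below; this is a minimal numpy illustration under my own assumptions (a 2-D FFT log-magnitude as the frequency branch and one scalar weight per branch), not the actual DDFM implementation.

```python
import numpy as np

def dual_domain_fuse(img, w_spatial=0.6, w_freq=0.4):
    """Fuse a spatial-domain image with its log-magnitude spectrum.
    Both branches are min-max normalized before the weighted sum."""
    spec = np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img))))

    def norm(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    return w_spatial * norm(img) + w_freq * norm(spec)

chip = np.random.rand(128, 128)             # stand-in for a SAR image chip
fused = dual_domain_fuse(chip)
print(fused.shape, bool(fused.min() >= 0.0))
```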

33 pages, 2931 KB  
Article
Data-Fusion-Based Algorithm for Assessing Threat Levels of Low-Altitude and Slow-Speed Small Targets
by Wei Wu, Wenjie Jie, Angang Luo, Xing Liu and Weili Luo
Sensors 2025, 25(17), 5510; https://doi.org/10.3390/s25175510 - 4 Sep 2025
Viewed by 1001
Abstract
Low-Altitude and Slow-Speed Small (LSS) targets pose significant challenges to air defense systems due to their low detectability and complex maneuverability. To enhance defense capabilities against low-altitude targets and assist in formulating interception decisions, this study proposes a new threat assessment algorithm based on multisource data fusion under visible-light detection conditions. Firstly, threat assessment indicators and their membership functions are defined to characterize LSS targets, and a comprehensive evaluation system is established. To reduce the impact of uncertainties in weight allocation on the threat assessment results, a combined weighting method based on bias coefficients is proposed. The proposed weighting method integrates the analytic hierarchy process (AHP), entropy weighting, and CRITIC methods to optimize the fusion of subjective and objective weights. Subsequently, Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) and Dempster–Shafer (D-S) evidence theory are used to calculate and rank the target threat levels so as to reduce conflicts and uncertainties from heterogeneous data sources. Finally, the effectiveness and reliability of the two methods are verified through simulation experiments and measured data. The experimental results show that the TOPSIS method can significantly discriminate threat values, making it suitable for environments requiring rapid distinction between high- and low-threat targets. The D-S evidence theory, on the other hand, has strong anti-interference capability, making it suitable for environments requiring a balance between subjective and objective uncertainties. Both methods can improve the reliability of threat assessment in complex environments, providing valuable support for air defense command and control systems. Full article
(This article belongs to the Section Intelligent Sensors)
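Two of the building blocks named here, entropy weighting and TOPSIS ranking, are standard procedures and can be sketched compactly; the decision matrix below is synthetic and all criteria are treated as benefits, so this only illustrates the general mechanics, not the paper's combined AHP-entropy-CRITIC scheme.

```python
import numpy as np

def entropy_weights(X):
    """Objective weights from Shannon entropy of a (targets x criteria) matrix."""
    P = X / X.sum(axis=0, keepdims=True)
    E = -(P * np.log(P + 1e-12)).sum(axis=0) / np.log(len(X))
    d = 1.0 - E                              # degree of divergence per criterion
    return d / d.sum()

def topsis(X, w):
    """Closeness of each target to the ideal solution (all criteria as benefits)."""
    V = w * X / np.linalg.norm(X, axis=0)    # weighted, vector-normalized matrix
    ideal, anti = V.max(axis=0), V.min(axis=0)
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    return d_neg / (d_pos + d_neg)           # closeness coefficient in [0, 1]

# Synthetic threat indicators for 4 targets (e.g., speed, proximity, heading, size).
X = np.array([[0.8, 0.9, 0.7, 0.2],
              [0.3, 0.4, 0.5, 0.6],
              [0.9, 0.7, 0.8, 0.9],
              [0.2, 0.3, 0.1, 0.4]])
w = entropy_weights(X)
print("weights:", w.round(3))
print("threat ranking (high to low):", np.argsort(-topsis(X, w)))
```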

20 pages, 8561 KB  
Article
LCW-YOLO: An Explainable Computer Vision Model for Small Object Detection in Drone Images
by Dan Liao, Rengui Bi, Yubi Zheng, Cheng Hua, Liangqing Huang, Xiaowen Tian and Bolin Liao
Appl. Sci. 2025, 15(17), 9730; https://doi.org/10.3390/app15179730 - 4 Sep 2025
Viewed by 1273
Abstract
Small targets in drone imagery are often difficult to locate and identify accurately because of scale imbalance, limited pixel representation, and dynamic environmental interference, and balancing detection accuracy against model resource consumption poses a further challenge. Therefore, we propose an interpretable computer vision framework based on YOLOv12m, called LCW-YOLO. First, we adopt multi-scale heterogeneous convolutional kernels to improve the lightweight channel-level and spatial attention combined context (LA2C2f) structure, enhancing spatial perception while reducing computational load. Second, to strengthen feature fusion, we propose the Convolutional Attention Integration Module (CAIM), which fuses original features across channels, spatial dimensions, and layers, thereby reinforcing contextual attention. Finally, the model incorporates Wise-IoU (WIoU) v3, which dynamically allocates loss weights to detected objects, allowing the model to adjust its focus toward average-quality samples during training according to object difficulty and thus improving generalization. Experimental results show that, compared with YOLOv12m, LCW-YOLO uses 0.4 M fewer parameters and improves mAP@0.5 by 3.3% on the VisDrone2019 dataset, and it improves mAP@0.5 by 1.9% on the UAVVaste dataset. For small-object detection from drones, LCW-YOLO, as an explainable AI (XAI) model, provides visual detection results and effectively balances accuracy, lightweight design, and generalization. Full article
(This article belongs to the Special Issue Explainable Artificial Intelligence Technology and Its Applications)
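The dynamic loss weighting can be illustrated with a simplified, outlier-degree-based re-weighting of per-box IoU losses in the spirit of WIoU v3: boxes whose loss sits near the batch mean get more gradient weight than very easy or strongly outlying ones. The IoU routine and the focusing constants below are my own simplified stand-ins, not the exact WIoU v3 formulation.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = np.maximum(box_a[:2], box_b[:2])
    x2, y2 = np.minimum(box_a[2:], box_b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_a) + area(box_b) - inter + 1e-8)

def dynamic_box_weights(pred, gt, alpha=1.9, delta=3.0):
    """Non-monotonic focusing: weight peaks for average-quality boxes and
    decays for both very easy and very hard (outlier) samples."""
    losses = np.array([1.0 - iou(p, g) for p, g in zip(pred, gt)])
    beta = losses / (losses.mean() + 1e-8)       # outlier degree per box
    weights = beta / (delta * alpha ** (beta - delta))
    return losses, weights

pred = np.array([[10, 10, 50, 50], [12, 8, 48, 52], [30, 30, 90, 90]], float)
gt   = np.array([[10, 10, 50, 50], [10, 10, 50, 50], [10, 10, 50, 50]], float)
losses, w = dynamic_box_weights(pred, gt)
print(losses.round(2), w.round(2))
```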

19 pages, 2216 KB  
Article
A Photovoltaic Power Prediction Framework Based on Multi-Stage Ensemble Learning
by Lianglin Zou, Hongyang Quan, Ping Tang, Shuai Zhang, Xiaoshi Xu and Jifeng Song
Energies 2025, 18(17), 4644; https://doi.org/10.3390/en18174644 - 1 Sep 2025
Viewed by 553
Abstract
With the significant increase in solar power generation’s proportion in power systems, the uncertainty of its power output poses increasingly severe challenges to grid operation. In recent years, solar forecasting models have achieved remarkable progress, with various developed models each exhibiting distinct advantages and characteristics. To address complex and variable geographical and meteorological conditions, it is necessary to adopt a multi-model fusion approach to leverage the strengths and adaptability of individual models. This paper proposes a photovoltaic power prediction framework based on multi-stage ensemble learning, which enhances prediction robustness by integrating the complementary advantages of heterogeneous models. The framework employs a three-level optimization architecture: first, a recursive feature elimination (RFE) algorithm based on LightGBM–XGBoost–MLP weighted scoring is used to screen high-discriminative features; second, mutual information and hierarchical clustering are utilized to construct a heterogeneous model pool, enabling competitive intra-group and complementary inter-group model selection; finally, the traditional static weighting strategy is improved by concatenating multi-model prediction results with real-time meteorological data to establish a time-period-based dynamic weight optimization module. The performance of the proposed framework was validated across multiple dimensions—including feature selection, model screening, dynamic integration, and comprehensive performance—using measured data from a 75 MW photovoltaic power plant in Inner Mongolia and the open-source dataset PVOD. Full article
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)
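The final stage, replacing static ensemble weights with time-period-dependent weights driven by recent errors, can be sketched as follows; the inverse-MAE weighting rule and the hour-of-day grouping are my assumptions for illustration, not the paper's dynamic weight optimization module.

```python
import numpy as np

def period_weights(preds, target, hours, eps=1e-6):
    """Per-hour ensemble weights proportional to the inverse MAE of each model."""
    n_models = preds.shape[0]
    W = np.zeros((24, n_models))
    for h in range(24):
        m = hours == h
        mae = np.abs(preds[:, m] - target[m]).mean(axis=1) + eps
        W[h] = (1.0 / mae) / (1.0 / mae).sum()
    return W

rng = np.random.default_rng(1)
hours = np.tile(np.arange(24), 30)                            # 30 days of hourly data
target = np.clip(np.sin((hours - 6) / 12 * np.pi), 0, None)   # crude PV-shaped curve
preds = target + rng.normal(0, [[0.05], [0.15], [0.10]], (3, hours.size))
W = period_weights(preds, target, hours)
fused = np.array([W[h] @ preds[:, i] for i, h in enumerate(hours)])
print(W[12].round(3))                                         # weights at noon
```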

13 pages, 6742 KB  
Article
SD-FINE: Lightweight Object Detection Method for Critical Equipment in Substations
by Wei Sun, Yu Hao, Sha Luo, Zhiwei Zou, Lu Xing and Qingwei Gao
Energies 2025, 18(17), 4639; https://doi.org/10.3390/en18174639 - 1 Sep 2025
Viewed by 383
Abstract
The safe and stable operation of critical substation equipment is paramount to the power system, and its intelligent inspection relies on highly efficient and accurate object detection technology. However, the demanding requirements for both accuracy and efficiency in complex environments pose significant challenges for lightweight models. To address this, this paper proposes SD-FINE, a lightweight object detection technique specifically designed for detecting critical substation equipment. Specifically, we introduce a novel Fine-grained Distribution Refinement (FDR) approach, which fundamentally transforms the bounding box regression process in DETR from predicting coordinates to iteratively optimizing edge probability distributions. Central to the new FDR is an adaptive weight function learning mechanism that learns weights for these distributions. This mechanism is designed to enhance the model’s perception capability regarding equipment location information within complex substation environments. Additionally, this paper develops a new Efficient Hybrid Encoder that provides adaptive scale weighting for feature information at different scales during cross-scale feature fusion, enabling more flexible and efficient lightweight feature extraction. Experimental validation on a critical substation equipment detection dataset demonstrates that SD-FINE achieves an accuracy of 93.1% while maintaining model lightness. It outperforms mainstream object detection networks across various metrics, providing an efficient and reliable detection solution for intelligent substation inspection. Full article
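The core of the FDR idea, predicting a probability distribution over each box edge and reading the coordinate out as an expectation, follows the distribution-focal style of regression; the sketch below is a minimal numpy version with assumed bin counts and stride, not the SD-FINE head itself.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def edges_from_distributions(logits, reg_max=16, stride=8.0):
    """logits: (4, reg_max) raw scores for the left/top/right/bottom edge offsets.
    Each offset is the expectation over discrete bins 0..reg_max-1, then scaled
    back to pixels by the feature-map stride."""
    probs = softmax(logits, axis=-1)
    bins = np.arange(reg_max)
    offsets = (probs * bins).sum(axis=-1)    # expected offset per edge, in bins
    return offsets * stride                  # pixel distances from the anchor point

rng = np.random.default_rng(2)
logits = rng.standard_normal((4, 16))
print(edges_from_distributions(logits).round(2))   # [left, top, right, bottom]
```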

20 pages, 2496 KB  
Article
Mine-DW-Fusion: BEV Multiscale-Enhanced Fusion Object-Detection Model for Underground Coal Mine Based on Dynamic Weight Adjustment
by Wanzi Yan, Yidong Zhang, Minti Xue, Zhencai Zhu, Hao Lu, Xin Zhang, Wei Tang and Keke Xing
Sensors 2025, 25(16), 5185; https://doi.org/10.3390/s25165185 - 20 Aug 2025
Cited by 1 | Viewed by 702
Abstract
Environmental perception is crucial for achieving autonomous driving of auxiliary haulage vehicles in underground coal mines. The complex underground environment and working conditions, such as dust pollution, uneven lighting, and sensor data abnormalities, pose challenges to multimodal fusion perception. These challenges include: (1) the lack of a reasonable and effective method for evaluating the reliability of different modality data; (2) the absence of in-depth fusion methods for different modality data that can handle sensor failures; and (3) the lack of a multimodal dataset for underground coal mines to support model training. To address these issues, this paper proposes a coal mine underground BEV multiscale-enhanced fusion perception model based on dynamic weight adjustment. First, camera and LiDAR modality data are uniformly mapped into BEV space to achieve multimodal feature alignment. Then, a Mixture of Experts-Fuzzy Logic Inference Module (MoE-FLIM) is designed to infer weights for different modality data based on BEV feature dimensions. Next, a Pyramid Multiscale Feature Enhancement and Fusion Module (PMS-FFEM) is introduced to ensure the model’s perception performance in the event of sensor data abnormalities. Lastly, a multimodal dataset for underground coal mines is constructed to provide support for model training and testing in real-world scenarios. Experimental results show that the proposed method demonstrates good accuracy and stability in object-detection tasks in coal mine underground environments, maintaining high detection performance, especially in typical complex scenes such as low light and dust fog. Full article
(This article belongs to the Section Remote Sensors)
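The weighting step, scoring how trustworthy each modality's BEV feature is and blending accordingly, can be sketched with simple fuzzy-style reliability scores; the dust/illumination inputs, membership rules, and feature shapes below are assumptions for illustration, not the MoE-FLIM module.

```python
import numpy as np

def reliability(dust_level, brightness):
    """Crude fuzzy-style reliability scores in [0, 1] for camera and LiDAR.
    The camera degrades with darkness and dust; LiDAR mainly with dust."""
    cam = (1.0 - dust_level) * brightness
    lidar = 1.0 - 0.7 * dust_level
    return np.clip(cam, 0, 1), np.clip(lidar, 0, 1)

def fuse_bev(cam_bev, lidar_bev, dust_level, brightness, eps=1e-6):
    r_cam, r_lidar = reliability(dust_level, brightness)
    w_cam = r_cam / (r_cam + r_lidar + eps)
    return w_cam * cam_bev + (1.0 - w_cam) * lidar_bev

rng = np.random.default_rng(3)
cam_bev = rng.standard_normal((64, 200, 200))      # aligned BEV feature grids
lidar_bev = rng.standard_normal((64, 200, 200))
fused = fuse_bev(cam_bev, lidar_bev, dust_level=0.8, brightness=0.3)
print(fused.shape)
```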

28 pages, 2107 KB  
Article
A Scale-Adaptive and Frequency-Aware Attention Network for Precise Detection of Strawberry Diseases
by Kaijie Zhang, Yuchen Ye, Kaihao Chen, Zao Li and Hongxing Peng
Agronomy 2025, 15(8), 1969; https://doi.org/10.3390/agronomy15081969 - 15 Aug 2025
Viewed by 610
Abstract
Accurate and automated detection of diseases is crucial for sustainable strawberry production. However, the challenges posed by small size, mutual occlusion, and high intra-class variance of symptoms in complex agricultural environments make this difficult. Mainstream deep learning detectors often do not perform well under these demanding conditions. We propose a novel detection framework designed for superior accuracy and robustness to address this critical gap. Our framework introduces four key innovations: First, we propose a novel attention-driven detection head featuring our Parallel Pyramid Attention (PPA) module. Inspired by pyramid attention principles, our module’s unique parallel multi-branch architecture is designed to overcome the limitations of serial processing. It simultaneously integrates global, local, and serial features to generate a fine-grained attention map, significantly improving the model’s focus on targets of varying scales. Second, we enhance the core feature fusion blocks by integrating Monte Carlo Attention (MCAttn), effectively empowering the model to recognize targets across diverse scales. Third, to improve the feature representation capacity of the backbone without increasing the parametric overhead, we replace standard convolutions with Frequency-Dynamic Convolutions (FDConv). This approach constructs highly diverse kernels in the frequency domain. Finally, we employ the Scale-Decoupled Loss function to optimize training dynamics. By adaptively re-weighting the localization and scale losses based on target size, we stabilize the training process and improve the Precision of bounding box regression for small objects. Extensive experiments on a challenging dataset related to strawberry diseases demonstrate that our proposed model achieves a mean Average Precision (MAP) of 81.1%. This represents an improvement of 2.1% over the strong YOLOv12-n baseline, highlighting its practical value as an effective tool for intelligent disease protection. Full article
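The last component, re-weighting box losses according to target size so that small lesions are not drowned out, can be sketched with a simple size-dependent weight; the weighting curve and reference area below are assumptions for illustration rather than the paper's Scale-Decoupled Loss.

```python
import numpy as np

def size_adaptive_weights(boxes, ref_area=32.0 ** 2, w_min=0.5, w_max=2.0):
    """boxes: (N, 4) as (x1, y1, x2, y2). Smaller targets get larger weights,
    clipped to [w_min, w_max], so their localization loss is up-weighted."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    w = np.sqrt(ref_area / np.maximum(areas, 1.0))
    return np.clip(w, w_min, w_max)

boxes = np.array([[0, 0, 8, 8],        # tiny lesion
                  [0, 0, 32, 32],      # reference-sized target
                  [0, 0, 200, 150]])   # large leaf region
print(size_adaptive_weights(boxes).round(2))   # [2.0, 1.0, 0.5]
```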

18 pages, 1417 KB  
Article
A Fusion-Based Approach with Bayes and DeBERTa for Efficient and Robust Spam Detection
by Ao Zhang, Kelei Li and Haihua Wang
Algorithms 2025, 18(8), 515; https://doi.org/10.3390/a18080515 - 15 Aug 2025
Viewed by 632
Abstract
Spam emails pose ongoing risks to digital security, including data breaches, privacy violations, and financial losses. Addressing the limitations of traditional detection systems in terms of accuracy, adaptability, and resilience remains a significant challenge. In this paper, we propose a hybrid spam detection framework that integrates a classical multinomial naive Bayes classifier with a pre-trained large language model, DeBERTa. The framework employs a weighted probability fusion strategy to combine the strengths of both models—lexical pattern recognition and deep semantic understanding—into a unified decision process. We evaluate the proposed method on a widely used spam dataset. Experimental results demonstrate that the hybrid model achieves superior performance in terms of accuracy and robustness when compared with other classifiers. The findings support the effectiveness of hybrid modeling in advancing spam detection techniques. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
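The weighted probability fusion itself reduces to a convex combination of the two classifiers' spam probabilities; the sketch below assumes the per-message probabilities have already been produced (here they are synthetic numbers), so it shows only the fusion and thresholding step, not the naive Bayes or DeBERTa models.

```python
import numpy as np

def fuse_spam_probs(p_nb, p_llm, w_nb=0.4, threshold=0.5):
    """Weighted fusion of naive-Bayes and LLM spam probabilities.
    Returns the fused probability and the hard spam/ham decision."""
    p = w_nb * np.asarray(p_nb) + (1.0 - w_nb) * np.asarray(p_llm)
    return p, p >= threshold

# Hypothetical spam probabilities for four e-mails from the two models.
p_nb  = [0.92, 0.10, 0.55, 0.30]
p_llm = [0.97, 0.05, 0.80, 0.20]
p, is_spam = fuse_spam_probs(p_nb, p_llm)
print(p.round(2), is_spam)
```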

31 pages, 5280 KB  
Article
Attention Mechanism-Based Feature Fusion and Degradation State Classification for Rolling Bearing Performance Assessment
by Teng Zhan, Wentao Chen, Congchang Xu, Luoxing Li and Xiaoxi Ding
Sensors 2025, 25(16), 4951; https://doi.org/10.3390/s25164951 - 10 Aug 2025
Viewed by 691
Abstract
Rolling bearing failure poses significant risks to mechanical system integrity, potentially leading to catastrophic safety incidents. Current challenges in performance degradation assessment include complex structural characteristics, suboptimal feature selection, and inadequate health index characterization. This study proposes a novel attention mechanism-based feature fusion method for accurate bearing performance assessment. First, we construct a multidimensional feature set encompassing time domain, frequency domain, and time–frequency domain characteristics. A two-stage sensitive feature selection strategy is developed, combining intersection-based primary selection with clustering-based re-selection to eliminate redundancy while preserving correlation, monotonicity, and robustness. Subsequently, an attention mechanism-driven fusion model adaptively weights selected features to generate high-performance health indicators. Experimental validation demonstrates the proposed method’s superiority in degradation characterization through two case studies. The intersection clustering strategy achieves 32% redundancy reduction compared to conventional methods, while the attention-based fusion improves health indicator consistency by 18.7% over principal component analysis. This approach provides an effective solution for equipment health monitoring and early fault warning in industrial applications. Full article
(This article belongs to the Special Issue Deep Learning Based Intelligent Fault Diagnosis)
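The fusion step, attention weights over the selected degradation features producing a single health indicator, can be sketched as a softmax-weighted sum; the feature curves and relevance scores below are synthetic, and the real model learns its weights, so this is only a schematic of the idea.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def health_indicator(features, scores):
    """Fuse min-max-normalized degradation features into one health index
    using attention weights derived from per-feature relevance scores."""
    f = (features - features.min(axis=0)) / (np.ptp(features, axis=0) + 1e-8)
    w = softmax(scores)                       # attention over features
    return f @ w                              # one health value per time step

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 200)
# Three roughly monotonic degradation features (e.g., RMS, kurtosis, band energy).
features = np.stack([t**1.5, t**2, t], axis=1) + rng.normal(0, 0.02, (200, 3))
scores = np.array([1.2, 0.5, 0.9])            # learned relevance in the real model
hi = health_indicator(features, scores)
print(hi[:3].round(3), hi[-3:].round(3))      # should trend upward over time
```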

29 pages, 3188 KB  
Article
A Multimodal Bone Stick Matching Approach Based on Large-Scale Pre-Trained Models and Dynamic Cross-Modal Feature Fusion
by Tao Fan, Huiqin Wang, Ke Wang, Rui Liu and Zhan Wang
Appl. Sci. 2025, 15(15), 8681; https://doi.org/10.3390/app15158681 - 5 Aug 2025
Viewed by 582
Abstract
Among the approximately 60,000 bone stick fragments unearthed from the Weiyang Palace site of the Han Dynasty, about 57,000 bear inscriptions. Most of these fragments exhibit vertical fractures, leading to a separation between the upper and lower fragments, which poses significant challenges to digital preservation and artifact restoration. Manual matching is inefficient and may cause further damage to the bone sticks. This paper proposes a novel multimodal bone stick matching approach that integrates image, inscription, and archeological information to enhance the accuracy and efficiency of matching fragmented bone stick artifacts. Unlike traditional methods that rely solely on image data, our method leverages large-scale pre-trained models, namely Vision-RWKV for visual feature extraction, RWKV for inscription analysis, and BERT for archeological metadata encoding. A dynamic cross-modal feature fusion mechanism is introduced to effectively combine these features, enabling better interaction and weighting based on the contextual relevance of each modality. This approach significantly improves matching performance, particularly in challenging cases involving fractures, corrosion, and missing sections. The novelty of this method lies in its ability to simultaneously extract and fuse multiple sources of information, addressing the limitations of traditional image-based matching methods. This paper uses Rank-N and Cumulative Match Characteristic (CMC) curves as evaluation metrics. Experimental evaluation shows that the matching accuracy reaches 94.73% at Rank-15, and the method performs significantly better than the comparative methods on the CMC evaluation curve, demonstrating outstanding performance. Overall, this approach significantly enhances the efficiency and accuracy of bone stick artifact matching, providing robust technical support for the research and restoration of bone stick cultural heritage. Full article
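The dynamic cross-modal fusion can be illustrated as a learned gate that assigns a context-dependent weight to each modality embedding before they are combined; the embedding sizes, gate parameters, and random inputs below are placeholders, not the paper's Vision-RWKV/RWKV/BERT pipeline.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gated_fusion(embeddings, gate_W, gate_b):
    """embeddings: dict of modality name -> 1-D feature vector (same length).
    A scalar gate is computed from each embedding, normalized across
    modalities, and used as the fusion weight."""
    names = list(embeddings)
    scores = np.array([embeddings[n] @ gate_W[n] + gate_b[n] for n in names])
    weights = softmax(scores)
    fused = sum(w * embeddings[n] for w, n in zip(weights, names))
    return fused, dict(zip(names, weights.round(3)))

rng = np.random.default_rng(5)
dim = 256
emb = {m: rng.standard_normal(dim) for m in ("image", "inscription", "metadata")}
gate_W = {m: rng.standard_normal(dim) * 0.05 for m in emb}   # stand-ins for learned gates
gate_b = {m: 0.0 for m in emb}
fused, weights = gated_fusion(emb, gate_W, gate_b)
print(weights, fused.shape)
```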

26 pages, 4572 KB  
Article
Transfer Learning-Based Ensemble of CNNs and Vision Transformers for Accurate Melanoma Diagnosis and Image Retrieval
by Murat Sarıateş and Erdal Özbay
Diagnostics 2025, 15(15), 1928; https://doi.org/10.3390/diagnostics15151928 - 31 Jul 2025
Cited by 1 | Viewed by 779
Abstract
Background/Objectives: Melanoma is an aggressive type of skin cancer that poses serious health risks if not detected in its early stages. Although early diagnosis enables effective treatment, delays can result in life-threatening consequences. Traditional diagnostic processes predominantly rely on the subjective expertise of dermatologists, which can lead to variability and time inefficiencies. Consequently, there is an increasing demand for automated systems that can accurately classify melanoma lesions and retrieve visually similar cases to support clinical decision-making. Methods: This study proposes a transfer learning (TL)-based deep learning (DL) framework for the classification of melanoma images and the enhancement of content-based image retrieval (CBIR) systems. Pre-trained models including DenseNet121, InceptionV3, Vision Transformer (ViT), and Xception were employed to extract deep feature representations. These features were integrated using a weighted fusion strategy and classified through an Ensemble learning approach designed to capitalize on the complementary strengths of the individual models. The performance of the proposed system was evaluated using classification accuracy and mean Average Precision (mAP) metrics. Results: Experimental evaluations demonstrated that the proposed Ensemble model significantly outperformed each standalone model in both classification and retrieval tasks. The Ensemble approach achieved a classification accuracy of 95.25%. In the CBIR task, the system attained a mean Average Precision (mAP) score of 0.9538, indicating high retrieval effectiveness. The performance gains were attributed to the synergistic integration of features from diverse model architectures through the ensemble and fusion strategies. Conclusions: The findings underscore the effectiveness of TL-based DL models in automating melanoma image classification and enhancing CBIR systems. The integration of deep features from multiple pre-trained models using an Ensemble approach not only improved accuracy but also demonstrated robustness in feature generalization. This approach holds promise for integration into clinical workflows, offering improved diagnostic accuracy and efficiency in the early detection of melanoma. Full article
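The weighted fusion of backbone features can be sketched as L2-normalizing each model's feature vector, scaling it by a model weight, and concatenating before the final classifier; the dimensions and weights below are illustrative, not the tuned values from the study.

```python
import numpy as np

def weighted_feature_fusion(features, weights):
    """features: dict of model name -> (n_samples, dim_m) deep features.
    Each block is L2-normalized per sample, scaled by its model weight,
    and the blocks are concatenated into one fused representation."""
    blocks = []
    for name, w in weights.items():
        f = features[name]
        f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-8)
        blocks.append(w * f)
    return np.concatenate(blocks, axis=1)

rng = np.random.default_rng(6)
n = 8
features = {
    "densenet121": rng.standard_normal((n, 1024)),
    "inception_v3": rng.standard_normal((n, 2048)),
    "vit": rng.standard_normal((n, 768)),
    "xception": rng.standard_normal((n, 2048)),
}
weights = {"densenet121": 0.25, "inception_v3": 0.25, "vit": 0.30, "xception": 0.20}
fused = weighted_feature_fusion(features, weights)
print(fused.shape)     # (8, 5888)
```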

20 pages, 1536 KB  
Article
Graph Convolution-Based Decoupling and Consistency-Driven Fusion for Multimodal Emotion Recognition
by Yingmin Deng, Chenyu Li, Yu Gu, He Zhang, Linsong Liu, Haixiang Lin, Shuang Wang and Hanlin Mo
Electronics 2025, 14(15), 3047; https://doi.org/10.3390/electronics14153047 - 30 Jul 2025
Viewed by 594
Abstract
Multimodal emotion recognition (MER) is essential for understanding human emotions from diverse sources such as speech, text, and video. However, modality heterogeneity and inconsistent expression pose challenges for effective feature fusion. To address this, we propose a novel MER framework combining a Dynamic Weighted Graph Convolutional Network (DW-GCN) for feature disentanglement and a Cross-Attention Consistency-Gated Fusion (CACG-Fusion) module for robust integration. DW-GCN models complex inter-modal relationships, enabling the extraction of both common and private features. The CACG-Fusion module subsequently enhances classification performance through dynamic alignment of cross-modal cues, employing attention-based coordination and consistency-preserving gating mechanisms to optimize feature integration. Experiments on the CMU-MOSI and CMU-MOSEI datasets demonstrate that our method achieves state-of-the-art performance, significantly improving the ACC7, ACC2, and F1 scores. Full article
(This article belongs to the Section Computer Science & Engineering)
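The fusion module, cross-attention between two modalities followed by a gate that trusts the attended features more when the modalities agree, can be sketched in a few lines; the single-head attention, cosine-similarity gate, and random inputs are simplifications of the CACG-Fusion idea, not its actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(query_seq, key_value_seq):
    """Single-head cross-attention: one modality attends over the other."""
    d = query_seq.shape[-1]
    attn = softmax(query_seq @ key_value_seq.T / np.sqrt(d), axis=-1)
    return attn @ key_value_seq

def consistency_gated_fusion(text_seq, audio_seq):
    """Blend each text step with its cross-attended audio counterpart; the
    gate grows with the cosine similarity between the two streams."""
    attended = cross_attend(text_seq, audio_seq)
    cos = np.sum(text_seq * attended, axis=-1) / (
        np.linalg.norm(text_seq, axis=-1) * np.linalg.norm(attended, axis=-1) + 1e-8)
    gate = (cos[:, None] + 1.0) / 2.0          # map [-1, 1] similarity to [0, 1]
    return gate * attended + (1.0 - gate) * text_seq

rng = np.random.default_rng(7)
text_seq = rng.standard_normal((20, 64))       # 20 time steps, 64-dim text features
audio_seq = rng.standard_normal((20, 64))      # aligned audio features
fused = consistency_gated_fusion(text_seq, audio_seq)
print(fused.shape)                             # (20, 64)
```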

18 pages, 2469 KB  
Article
Neural Network-Based SLAM/GNSS Fusion Localization Algorithm for Agricultural Robots in Orchard GNSS-Degraded or Denied Environments
by Huixiang Zhou, Jingting Wang, Yuqi Chen, Lian Hu, Zihao Li, Fuming Xie, Jie He and Pei Wang
Agriculture 2025, 15(15), 1612; https://doi.org/10.3390/agriculture15151612 - 25 Jul 2025
Cited by 1 | Viewed by 625
Abstract
To address the issue of agricultural robot loss of control caused by GNSS signal degradation or loss in complex agricultural environments such as farmland and orchards, this study proposes a neural network-based SLAM/GNSS fusion localization algorithm aiming to enhance the robot’s localization accuracy and stability in weak or GNSS-denied environments. It achieves multi-sensor observed pose coordinate system unification through coordinate system alignment preprocessing, optimizes SLAM poses via outlier filtering and drift correction, and dynamically adjusts the weights of poses from distinct coordinate systems via a neural network according to the GDOP. Experimental results on the robotic platform demonstrate that, compared to the SLAM algorithm without pose optimization, the proposed SLAM/GNSS fusion localization algorithm reduced the whole process average position deviation by 37%. Compared to the fixed-weight fusion localization algorithm, the proposed SLAM/GNSS fusion localization algorithm achieved a 74% reduction in average position deviation during transitional segments with GNSS signal degradation or recovery. These results validate the superior positioning accuracy and stability of the proposed SLAM/GNSS fusion localization algorithm in weak or GNSS-denied environments. Orchard experimental results demonstrate that, at an average speed of 0.55 m/s, the proposed SLAM/GNSS fusion localization algorithm achieves an overall average position deviation of 0.12 m, with average position deviation of 0.06 m in high GNSS signal quality zones, 0.11 m in transitional sections under signal degradation or recovery, and 0.14 m in fully GNSS-denied environments. These results validate that the proposed SLAM/GNSS fusion localization algorithm maintains high localization accuracy and stability even under conditions of low and highly fluctuating GNSS signal quality, meeting the operational requirements of most agricultural robots. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
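The closing step, blending the SLAM pose and the GNSS pose with a weight that shrinks as GDOP grows, can be sketched without the neural network by using a simple monotone mapping from GDOP to a GNSS confidence; the mapping and thresholds below are my assumptions, whereas the paper learns the weighting.

```python
import numpy as np

def gnss_weight(gdop, gdop_good=2.0, gdop_bad=10.0):
    """Map GDOP to a GNSS confidence in [0, 1]: near 1 for small GDOP,
    falling to 0 as GDOP approaches the denied regime."""
    w = (gdop_bad - gdop) / (gdop_bad - gdop_good)
    return float(np.clip(w, 0.0, 1.0))

def fuse_pose(slam_xy_yaw, gnss_xy_yaw, gdop):
    """Weighted fusion of 2-D position; yaw is blended on the unit circle."""
    w = gnss_weight(gdop)
    slam, gnss = np.asarray(slam_xy_yaw), np.asarray(gnss_xy_yaw)
    xy = w * gnss[:2] + (1 - w) * slam[:2]
    yaw = np.arctan2(w * np.sin(gnss[2]) + (1 - w) * np.sin(slam[2]),
                     w * np.cos(gnss[2]) + (1 - w) * np.cos(slam[2]))
    return np.array([xy[0], xy[1], yaw]), w

pose, w = fuse_pose([10.0, 5.0, 0.10], [10.4, 4.8, 0.12], gdop=3.5)
print(w, pose.round(3))    # heavier GNSS weight under good satellite geometry
```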

24 pages, 4430 KB  
Article
Early Bearing Fault Diagnosis in PMSMs Based on HO-VMD and Weighted Evidence Fusion of Current–Vibration Signals
by Xianwu He, Xuhui Liu, Cheng Lin, Minjie Fu, Jiajin Wang and Jian Zhang
Sensors 2025, 25(15), 4591; https://doi.org/10.3390/s25154591 - 24 Jul 2025
Cited by 1 | Viewed by 638
Abstract
To address the challenges posed by weak early fault signal features, strong noise interference, low diagnostic accuracy, poor reliability when using single information sources, and the limited availability of high-quality samples in practical applications for permanent magnet synchronous motor (PMSM) bearings, this paper proposes an early bearing fault diagnosis method based on Hippopotamus Optimization Variational Mode Decomposition (HO-VMD) and weighted evidence fusion of current–vibration signals. The HO algorithm is employed to optimize the parameters of VMD for adaptive modal decomposition of current and vibration signals, resulting in the generation of intrinsic mode functions (IMFs). These IMFs are then selected and reconstructed based on their kurtosis to suppress noise and harmonic interference. Subsequently, the reconstructed signals are demodulated using the Teager–Kaiser Energy Operator (TKEO), and both time-domain and energy spectrum features are extracted. The reliability of these features is utilized to adaptively weight the basic probability assignment (BPA) functions. Finally, a weighted modified Dempster–Shafer evidence theory (WMDST) is applied to fuse multi-source feature information, enabling an accurate assessment of the PMSM bearing health status. The experimental results demonstrate that the proposed method significantly enhances the signal-to-noise ratio (SNR) and enables precise diagnosis of early bearing faults even in scenarios with limited sample sizes. Full article
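The evidence-fusion step can be illustrated with classic Dempster combination of two basic probability assignments after discounting each source by a reliability weight; the hypotheses, masses, and reliabilities below are made up, and the paper's WMDST modifies the plain rule further.

```python
import numpy as np

FRAME = ("normal", "inner_race_fault", "outer_race_fault")

def discount(bpa, reliability):
    """Shafer discounting: scale masses by the source reliability and move
    the remainder onto total ignorance (the whole frame)."""
    out = {h: reliability * m for h, m in bpa.items()}
    out[FRAME] = out.get(FRAME, 0.0) + (1.0 - reliability)
    return out

def dempster(m1, m2):
    """Combine two BPAs defined over singleton hypotheses plus the full frame."""
    combined, conflict = {}, 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a if a == b else (b if a == FRAME else (a if b == FRAME else None))
            if inter is None:
                conflict += ma * mb          # disjoint singletons -> conflict mass
            else:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Evidence from the current-signal and vibration-signal feature sets.
m_current = discount({"inner_race_fault": 0.6, "normal": 0.3, FRAME: 0.1}, reliability=0.8)
m_vib = discount({"inner_race_fault": 0.7, "outer_race_fault": 0.2, FRAME: 0.1}, reliability=0.9)
fused = dempster(m_current, m_vib)
print({h if isinstance(h, str) else "Theta": round(m, 3) for h, m in fused.items()})
```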
