Search Results (2,365)

Search Parameters:
Keywords = adaptive multiscale

28 pages, 5420 KB  
Article
HEMS-RTDETR: A Lightweight Edge-Enhanced and Deformation-Aware Detector for Floating Debris in Complex Water Environments
by Yiwei Cui, Xinyi Jiang, Haiting Yu, Meizhen Lei and Jia Ren
Electronics 2026, 15(6), 1226; https://doi.org/10.3390/electronics15061226 - 15 Mar 2026
Abstract
Floating debris detection in complex aquatic environments holds significant importance for water resource protection and maritime safety monitoring. However, this task faces three core challenges: severe background interference leading to blurred target textures, significant non-rigid deformations, and the frequent loss of small targets at long distances. To address these issues, we propose a high-performance lightweight detection algorithm, termed High-Efficiency Edge-Aware Multi-Scale Real-Time Detection Transformer (HEMS-RTDETR), built upon the Real-Time Detection Transformer (RT-DETR) architecture. First, to suppress disturbances induced by water surface ripples and specular reflections, a Cross-Stage Partial Multi-Scale Edge Information Enhancement (CSP-MSEIE) module is introduced to reconstruct the backbone network. By removing computational redundancy while incorporating explicit edge enhancement, feature extraction capability and noise robustness for weak-texture targets are significantly improved. Second, to handle irregular debris morphology, a Deformable Attention Transformer (DAT) module is integrated, enabling adaptive attention focusing on geometrically deformed regions. Finally, an Efficient Multi-Scale Bidirectional Feature Pyramid Network (EMBSFPN) is constructed to enhance cross-scale semantic interaction and alleviate small-target signal loss. Experimental results demonstrate that, compared with RTDETR-r18, HEMS-RTDETR reduces parameters to 12.57 M, improves mAP@0.5 and mAP@0.5:0.95 by 2.44% and 3.05%, respectively, and maintains real-time inference at 93 FPS, indicating strong robustness and application potential in dynamic aquatic environments. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 3rd Edition)

17 pages, 4808 KB  
Article
Predicting Groundwater Depth Using Historical Data Trend Decomposition: Based on the VMD-LSTM Hybrid Deep Learning Model
by Jie Yue, Hong Guo, Deng Pan, Huanxiang Wang, Yawen Xin, Furong Yu, Yingying Shao and Rui Dun
Water 2026, 18(6), 689; https://doi.org/10.3390/w18060689 - 15 Mar 2026
Abstract
Groundwater is a critical natural and strategic economic resource, and the accurate prediction of groundwater depth dynamics is essential for the rational development and utilization of water resources. However, under the combined influence of climate variability, human activities, and complex hydrogeological conditions, groundwater level time series exhibit strong nonlinear and non-stationary characteristics, posing great challenges to the accurate prediction of groundwater level dynamics. Most existing prediction models rely on sufficient hydro-meteorological and exploitation data that are difficult to obtain in water-scarce regions, or fail to effectively decouple the multi-scale features of non-stationary groundwater level signals, resulting in limited prediction accuracy and insufficient generalization ability. To address these research gaps, this study takes Zhengzhou, a typical water-deficient city in the Yellow River Basin, as the study area, and proposes a hybrid deep learning framework combining Variational Mode Decomposition (VMD) and a Long Short-Term Memory (LSTM) neural network for predicting shallow and intermediate-deep groundwater level changes. Kolmogorov–Arnold Networks (KANs) and Gated Recurrent Units (GRUs) are selected as benchmark models to verify the superior performance of the proposed framework. In this framework, the non-stationary groundwater level signal is adaptively decomposed into Intrinsic Mode Functions (IMFs) with distinct frequency characteristics via VMD. An independent LSTM model is constructed for each IMF to capture its unique temporal variation pattern, and the final groundwater level prediction is obtained by linearly reconstructing the predicted results of all IMFs. The results show that the coefficient of determination (R²) of the VMD-LSTM model exceeds 0.90 for all monitoring datasets, with low Mean Absolute Error (MAE) and Mean Squared Error (MSE). It significantly outperforms the benchmark models in handling nonlinear and non-stationary time series features. Using only historical groundwater level data as input, the proposed framework effectively overcomes the limitation of insufficient driving variables in data-scarce regions and fully explores the multi-scale evolution of groundwater dynamics through the synergistic effect of multi-scale decomposition and deep learning. The method presented in this study provides a novel and reliable technical approach for groundwater level prediction in water-deficient and data-limited areas, and also offers scientific support for the rational management and sustainable utilization of regional groundwater resources. Future research will incorporate driving factors such as meteorology and exploitation to further improve the model’s ability to capture abrupt changes in groundwater level dynamics. Full article
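The decompose–predict–reconstruct workflow this abstract describes can be sketched in a few lines. This is a toy illustration, not the authors' code: an edge-corrected moving-average split stands in for VMD (which would yield several band-limited IMFs), a one-parameter AR(1) fit stands in for each per-IMF LSTM, and the synthetic depth series, window size, and function names are all assumptions.

```python
import numpy as np

def moving_average(x, window=12):
    # Centered moving average with edge correction (divide by actual counts).
    num = np.convolve(x, np.ones(window), mode="same")
    den = np.convolve(np.ones_like(x), np.ones(window), mode="same")
    return num / den

def decompose(signal):
    # Toy stand-in for VMD: one slow "trend" component plus the fast
    # residual. Real VMD yields several band-limited IMFs.
    trend = moving_average(signal)
    return [trend, signal - trend]

def fit_ar1(component):
    # One-parameter AR(1) fit, standing in for a per-IMF LSTM.
    x, y = component[:-1], component[1:]
    return float(x @ y) / float(x @ x)

def predict_next(signal):
    # Decompose, predict each component one step ahead, then linearly
    # reconstruct -- mirroring the VMD-LSTM workflow.
    return sum(fit_ar1(c) * c[-1] for c in decompose(signal))

rng = np.random.default_rng(0)
t = np.arange(200)
depth = 10 + 0.01 * t + 0.5 * np.sin(2 * np.pi * t / 24) + 0.05 * rng.standard_normal(200)
print(predict_next(depth))  # one-step-ahead depth forecast
```

The key design point carried over from the paper is that each component gets its own predictor and the final forecast is a linear reconstruction of the per-component forecasts.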

17 pages, 2662 KB  
Article
A Swin-Transformer-Based Network for Adaptive Backlight Optimization
by Jin Li, Rui Pu, Junbang Jiang and Man Zhu
Symmetry 2026, 18(3), 502; https://doi.org/10.3390/sym18030502 - 15 Mar 2026
Abstract
Mini-LED local dimming systems commonly suffer from luminance discontinuity, halo artifacts, and temporal instability in dynamic scenes. Traditional heuristic-based methods and standard convolutional neural networks often fail to capture long-range spatial dependencies and struggle to balance spatial smoothness, content fidelity, and real-time performance under hardware constraints. To address these challenges, this paper proposes SwinLightNet, an efficient adaptive backlight optimization network tailored for Mini-LED displays. Built upon a Swin Transformer framework tailored for Mini-LED backlight optimization, SwinLightNet integrates five hardware-aware design strategies: (i) a lightweight Swin variant (window size = 8, MLP ratio = 2.0) for efficient global context modeling; (ii) CNN encoder–decoder integration for multi-scale feature extraction; (iii) a partition-level alignment module ensuring spatial consistency; (iv) a backlight constraint module enforcing local luminance consistency and contrast preservation; (v) a change-aware temporal decision framework stabilizing dynamic sequences. These components synergistically resolve core limitations: global modeling suppresses halo artifacts while preserving content fidelity; alignment and constraint modules eliminate luminance discontinuity without compromising contrast; and the temporal framework guarantees flicker-free output under motion. Evaluated on DIV2K (static images) and a custom 2K-resolution video dataset (dynamic scenes), SwinLightNet demonstrates robust reconstruction quality while maintaining only 1.18 million parameters and 0.088 GFLOPs (Computational Cost). The results confirm SwinLightNet’s effectiveness in holistically addressing spatial, temporal, and hardware constraints, demonstrating strong potential for practical deployment in resource-constrained Mini-LED backlight control systems. Full article
(This article belongs to the Special Issue Symmetry and Asymmetry in Optimization Algorithms and Control Systems)

21 pages, 10378 KB  
Article
A Method for Detecting Slow-Moving Landslides Based on the Integration of Surface Deformation and Texture
by Xuerong Chen, Cuiying Zhou, Zhen Liu, Chaoying Zhao, Xiaojie Liu and Zhong Lu
Remote Sens. 2026, 18(6), 899; https://doi.org/10.3390/rs18060899 - 15 Mar 2026
Abstract
Slow-moving landslides can trigger severe disasters when activated by earthquakes, torrential rains, or typhoons. Early detection is crucial for mitigating loss of life and property damage. Interferometric Synthetic Aperture Radar (InSAR) technology is among the most effective techniques for detecting slow-moving landslides, though its accuracy can be further improved through integration with optical imagery and Digital Elevation Models (DEM). Current machine learning approaches that combine InSAR and optical data suffer from limited efficiency, poor transferability, and challenges in regional-scale application. To address these limitations, this study proposes a multimodal dual-path network that integrates InSAR products with textural information from optical imagery to detect slow-moving landslides. One path processes InSAR deformation rates and topographic factors, while the other incorporates texture information and auxiliary data. Together, these paths extract semantic information from high-dimensional spatial features and condense it into low-dimensional representations. A pyramid pooling module is employed to capture multi-scale features during low-level semantic extraction. For feature fusion, a rate-constrained adaptive module is introduced to enhance the contribution of deformation rates to slow-moving landslides. According to the results, the proposed method improves the F1-score for landslide detection by 6% compared to using InSAR products alone. These results provide reliable technical support for regional landslide inventory compilation and disaster management, as well as new insights for regional-scale surveys in slow-moving landslide-prone areas. Full article
(This article belongs to the Special Issue Advances in AI-Driven Remote Sensing for Geohazard Perception)

31 pages, 23615 KB  
Article
A Memory-Efficient Class-Incremental Learning Framework for Remote Sensing Scene Classification via Feature Replay
by Yunze Wei, Yuhan Liu, Ben Niu, Xiantai Xiang, Jingdun Lin, Yuxin Hu and Yirong Wu
Remote Sens. 2026, 18(6), 896; https://doi.org/10.3390/rs18060896 - 15 Mar 2026
Abstract
Most existing deep learning models for remote sensing scene classification (RSSC) adopt an offline learning paradigm, where all classes are jointly optimized on fixed-class datasets. In dynamic real-world scenarios with streaming data and emerging classes, such paradigms are inherently prone to catastrophic forgetting when models are incrementally trained on new data. Recently, a growing number of class-incremental learning (CIL) methods have been proposed to tackle these issues, some of which achieve promising performance by rehearsing training data from previous tasks. However, implementing such a strategy in real-world scenarios is often challenging, as the requirement to store historical data frequently conflicts with strict memory constraints and data privacy protocols. To address these challenges, we propose a novel memory-efficient feature-replay CIL framework (FR-CIL) for RSSC that retains compact feature embeddings, rather than raw images, as exemplars for previously learned classes. Specifically, a progressive multi-scale feature enhancement (PMFE) module is proposed to alleviate representation ambiguity. It adopts a progressive construction scheme to enable fine-grained and interactive feature enhancement, thereby improving the model’s representation capability for remote sensing scenes. Then, a specialized feature calibration network (FCN) is trained in a transductive learning paradigm with manifold consistency regularization to adapt stored feature descriptors to the updated feature space, thereby effectively compensating for feature space drift and enabling a unified classifier. Following feature calibration, a bias rectification (BR) strategy is employed to mitigate prediction bias by exclusively optimizing the classifier on a balanced exemplar set. As a result, this memory-efficient CIL framework not only addresses data privacy concerns but also mitigates representation drift and classifier bias. Extensive experiments on public datasets demonstrate the effectiveness and robustness of the proposed method. Notably, FR-CIL outperforms the leading state-of-the-art CIL methods in mean accuracy by margins of 3.75%, 3.09%, and 2.82% on the six-task AID, seven-task RSI-CB256, and nine-task NWPU-45 datasets, respectively. At the same time, it reduces memory storage requirements by over 94.7%, highlighting its strong potential for real-world RSSC applications under strict memory constraints. Full article
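The storage saving from replaying feature embeddings instead of raw images is simple arithmetic. The sketch below is purely illustrative: the exemplar count (2,000), image resolution (224×224×3, uint8), and embedding size (512-dim float32) are assumed for the example and are not the paper's settings, which report a reduction of over 94.7% under their own configuration.

```python
def exemplar_bytes(n_exemplars, shape, itemsize):
    # Total bytes needed to store n exemplars of the given array shape.
    size = itemsize
    for dim in shape:
        size *= dim
    return n_exemplars * size

raw_images = exemplar_bytes(2000, (224, 224, 3), itemsize=1)  # uint8 RGB images
embeddings = exemplar_bytes(2000, (512,), itemsize=4)         # float32 features
saving = 1 - embeddings / raw_images
print(f"memory saved: {saving:.1%}")  # → memory saved: 98.6%
```

Under these assumed sizes a feature exemplar is roughly 70× smaller than the image it replaces, which is what makes feature replay attractive under strict memory budgets.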

23 pages, 1970 KB  
Article
SSFE-YOLO: A Shallow Structure Feature Enhancement-Based Algorithm for Detecting Foreign Objects on Mine Conveyor Belts
by Feng Tian, Yujie Wang and Xiaopei Liu
Appl. Sci. 2026, 16(6), 2773; https://doi.org/10.3390/app16062773 - 13 Mar 2026
Abstract
To address the insufficient capability of YOLO-series models in representing structural information for foreign objects with diverse scales and morphologies, an improved algorithm named SSFE-YOLO is proposed. First, the Space-to-Depth Convolution (SPDConv) is adopted into the backbone network to preserve edge and texture details in shallow features during downsampling, thereby maintaining the integrity of critical target structures at the feature generation stage. Second, an adaptive receptive field enhancement module (ARFE) is designed by introducing parallel feature branches with varying receptive fields. This module performs adaptive fusion to bolster the structural perception of the network towards polymorphic foreign objects. Furthermore, a distribution-feature stable compensation module (DFSC) is designed to suppress feature distribution shifts caused by illumination variations and noise interference through structural consistency enhancement and stable distribution constraints, which significantly improves the stability of feature representation in complex environments. Finally, a dual-dimension optimized loss function (D2-OL) is constructed to achieve differentiated supervision for samples of varying quality and balanced optimization for multi-scale target detection by modulating the supervisory weights of feature layers and filtering effective training samples. Experimental results on a self-built mine conveyor belt dataset demonstrate that the proposed method achieves an mAP@0.5 of 90.5% and an mAP@0.5:0.95 of 59.1%, consistently outperforming mainstream models such as YOLOv8, YOLOv11, and YOLOv13. Simulation results indicate that the proposed approach effectively enhances the detection accuracy and robustness of foreign objects in mining environments, showcasing substantial potential for engineering applications. Full article
(This article belongs to the Section Applied Industrial Technologies)

20 pages, 14849 KB  
Article
MCViM-YOLO: Remote Sensing Vehicle Detection for Sustainable Intelligent Transportation
by Kairui Zhang, Ningning Zhu, Fuqing Zhao and Qiuyu Zhang
Sustainability 2026, 18(6), 2836; https://doi.org/10.3390/su18062836 - 13 Mar 2026
Abstract
Vehicle detection is a core task in smart city perception management and an important technical support for sustainable urban development and intelligent transportation optimization. In high-resolution unmanned aerial vehicle (UAV) remote sensing images, it faces challenges such as variable target scales, severe occlusion, and difficulty in modeling long-range dependencies. To address these issues, this study proposes the MCViM-YOLO algorithm, which integrates the local perception advantage of convolution with the global modeling capability of the state space model (Mamba). Based on YOLOv12, the algorithm reconstructs the neck network: it introduces the Mix-Mamba module (parallel multi-scale convolution and selective state space model) to simultaneously capture local details and global spatial dependencies, adopts the dual-factor calibration fusion module (DCFM) to adaptively fuse heterogeneous features, and employs a dual-branch attention detection head (DADH) to optimize the prediction of difficult samples (e.g., occluded, small-scale vehicles). Experiments on the VEBAI dataset demonstrate that our proposed model achieves an mAP@0.5 of 92.391% and a recall rate of 86.070%, with a computational complexity of 10.41 GFLOPs. The results show that the proposed method effectively improves the accuracy and efficiency of vehicle detection in complex remote sensing scenarios, provides technical support for traffic flow monitoring, low-carbon urban planning, and other sustainable applications, and offers an innovative paradigm for the deep integration of CNN and state space models with both theoretical research value and engineering application prospects. Full article

43 pages, 2166 KB  
Article
Research on Root Cause Analysis Method for Certain Civil Aircraft Based on Ensemble Learning and Large Language Model Reasoning
by Wenyou Du, Jingtao Du, Haoran Zhang and Dongsheng Yang
Machines 2026, 14(3), 322; https://doi.org/10.3390/machines14030322 - 12 Mar 2026
Abstract
To address the challenges commonly encountered in civil aircraft operating under multi-mode, strongly coupled closed-loop control—namely scarce fault samples, pronounced distribution shift, and root-cause explanations that are easily confounded by covariates—this paper proposes a root-cause analysis method that integrates ensemble learning with constraint-guided reasoning by large language models (LLMs). First, for Full Authority Digital Engine Control (FADEC) monitoring sequences, a feature system comprising environment-normalized ratios, mechanism-informed mixing indices, and multi-scale temporal statistics is constructed, thereby improving cross-mode comparability and enhancing engineering-semantic expressiveness. Second, in the anomaly detection stage, a cost-sensitive LightGBM model is adopted and a validation-set-based adaptive thresholding strategy is introduced to achieve robust identification under highly imbalanced fault conditions. Furthermore, for Root Cause Analysis (RCA), a “computation–reasoning decoupling” framework is developed: Shapley Additive exPlanations (SHAP) are used to generate segment-level contribution evidence, while causal chains, engineering prohibitions, and structured output templates are injected into prompts to constrain the LLM, enabling it to infer root-cause candidates and produce structured explanations under mechanism-consistency constraints. Experiments on real flight data demonstrate that our method yields an anomaly detection F1-score of 0.9577 and improves overall RCA accuracy to 97.1% (versus 62.3% for a pure SHAP baseline). Practically, by translating complex high-dimensional data into actionable natural language diagnostic reports, the proposed method provides reliable and interpretable decision support for rapid RCA. Full article
(This article belongs to the Section Automation and Control Systems)

23 pages, 5616 KB  
Article
Informer–UNet: A Hybrid Deep Learning Framework for Multi-Point Soil Moisture Prediction and Precision Irrigation in Winter Wheat
by Dingkun Zheng, Chenghan Yang, Gang Zheng, Baurzhan Belgibaev, Madina Mansurova, Sholpan Jomartova and Baidong Zhao
Agriculture 2026, 16(6), 648; https://doi.org/10.3390/agriculture16060648 - 12 Mar 2026
Abstract
Soil moisture prediction is essential for precision irrigation in water-limited agricultural systems. This study presents a deep learning-driven irrigation framework for winter wheat, integrating a novel Informer–UNet model with a Comprehensive Irrigation Index for adaptive water management. The Informer–UNet combines ProbSparse self-attention mechanisms with UNet’s multi-scale feature fusion, enabling simultaneous prediction of soil moisture at 27 monitoring points across three depths, 10, 30, and 50 cm, while quantifying prediction uncertainty through Monte Carlo Dropout. A Comprehensive Irrigation Index incorporating moisture deviation, spatial variance, and confidence interval width was developed, with weights optimized via genetic algorithm. Field experiments were conducted in Chengdu, China, over two winter wheat growing seasons. The Informer–UNet achieved superior prediction accuracy, R2 greater than 0.98, RMSE less than 0.65, compared to LSTM, Transformer, and standard Informer models, with the fastest convergence and lowest validation loss. The proposed DeepIndexIrr strategy maintained soil moisture within the target range, 55% to 75%, for over 81% of the irrigation period, reducing water consumption by 38.2% compared to fixed-threshold control and 19.2% compared to expert manual scheduling. These results demonstrate that integrating spatially distributed deep learning predictions with uncertainty-informed decision rules offers a promising approach for sustainable precision irrigation. Full article
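The Comprehensive Irrigation Index described above combines three quantities: deviation of predicted moisture from a target, spatial variance across the monitoring points, and the Monte Carlo Dropout confidence-interval width. A minimal sketch follows; the weights, target moisture, and example values are assumptions for illustration (the paper optimizes the weights with a genetic algorithm), and the function name is hypothetical.

```python
import numpy as np

def comprehensive_irrigation_index(pred_mean, pred_low, pred_high,
                                   target=0.65, weights=(0.5, 0.3, 0.2)):
    # Weighted sum of (a) mean absolute deviation from the target moisture,
    # (b) spatial variance across monitoring points, and (c) mean
    # confidence-interval width from MC-Dropout. Weights are illustrative.
    pred_mean = np.asarray(pred_mean, dtype=float)
    deviation = float(np.mean(np.abs(pred_mean - target)))
    variance = float(np.var(pred_mean))
    ci_width = float(np.mean(np.asarray(pred_high) - np.asarray(pred_low)))
    w_dev, w_var, w_ci = weights
    return w_dev * deviation + w_var * variance + w_ci * ci_width

# 27 monitoring points, moisture expressed as a fraction
mean = np.full(27, 0.55)
low, high = mean - 0.02, mean + 0.02
print(round(comprehensive_irrigation_index(mean, low, high), 4))  # → 0.058
```

A larger index signals a stronger case for irrigation: it grows when predicted moisture drifts from the target, when the field becomes spatially uneven, or when the model is less certain.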

30 pages, 5823 KB  
Article
Complex Weather Highway Aerial Vehicle Detection Network with Feature Enhancement and Grid-Based Feature Fusion
by Ningzhi Zeng and Jinzheng Lu
Appl. Sci. 2026, 16(6), 2710; https://doi.org/10.3390/app16062710 - 12 Mar 2026
Abstract
In highway aerial imagery, complex weather conditions such as rain, fog, snow, and low illumination often lead to severe appearance degradation and feature loss of vehicle targets, posing significant challenges for vehicle detection. Existing research faces two major challenges: first, the lack of large-scale, high-quality annotated datasets tailored for complex weather scenarios; second, the difficulty traditional detectors encounter in effectively extracting feature information and performing multi-scale feature fusion under conditions of severe feature degradation and dense distribution of small objects. To address these issues, this paper investigates both data construction and algorithm design. Firstly, a Complex Weather Highway Vehicle Dataset (CWHVD) is established to provide a benchmark for related research. Secondly, a Feature-Enhanced Grid-Based Feature Fusion Complex-Weather Vehicle Detection Network (FGCV-Det) is proposed. A wavelet transform-based Feature Enhancement Module (FEWT) is introduced at the input stage to strengthen edge and texture representation. In the backbone, Adaptive Pinwheel Convolution (APConv) and a C3K2-HD module based on Hidden State Mixer-Based State Space Duality (HSM-SSD) are employed to enhance semantic modeling. Furthermore, a Complex Weather Grid Feature Pyramid Network (CWG-FPN) is designed to achieve weighted cross-scale fusion. The FGCV-Det significantly outperforms YOLO11s on CWHVD, achieving 63.4% precision, 48.6% recall, 51.7% mAP50, and 28.2% mAP50:95. It also generalizes well, reaching 47.1% and 49.6% mAP50 on VisDrone2019 and UAVDT, respectively, surpassing baseline and mainstream detectors, demonstrating strong robustness and generalization capability. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

31 pages, 1936 KB  
Article
A Multi-Scale Heterogeneous Graph Attention Network for Nested Named Entity Recognition with Syntactic and Dependency Tree Structures
by Yifan Zhao, Lin Zhang and Yangshuyi Xu
Electronics 2026, 15(6), 1183; https://doi.org/10.3390/electronics15061183 - 12 Mar 2026
Abstract
Nested Named Entity Recognition (nested NER) frequently encounters challenges like boundary conflicts, complications in modeling long-distance dependencies, and inadequate representation of deep nested semantics resulting from overlapping spans and hierarchical inclusion relationships of entities. This research presents a multi-scale heterogeneous graph attention network to facilitate end-to-end recognition of nested entities through the collaborative modeling of structure and semantics. The model initially presents the structural integration mechanism, which consolidates the hierarchical restrictions of the syntactic tree and the inter-word relationships of the dependency tree within a singular heterogeneous graph space. It subsequently generates 1/2/3-hop multi-scale subgraphs and employs multi-scale subgraph attention to adaptively integrate information from various structural receptive fields, harmonizing the local cues of shallow entities with the global dependencies of deep entities. The experimental findings on the ACE2004, ACE2005, and GENIA benchmark datasets indicate that the proposed method surpasses several robust baselines regarding overall performance and nested entity recognition, particularly exhibiting notable advantages in identifying long entities and low-frequency entities. We further evaluate MHGAT on KBP2017 and GermEval2014 to validate generalization across datasets and languages. Full article
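The 1/2/3-hop multi-scale subgraphs mentioned in the abstract can be extracted from an adjacency matrix with boolean matrix powers: a node's k-hop neighborhood is the union of everything reachable in 1..k steps. A minimal sketch, assuming an undirected word graph; the chain example and function name are illustrative, not the paper's implementation.

```python
import numpy as np

def k_hop_mask(adj, k):
    # Nodes reachable within k hops: boolean union of A^1 .. A^k.
    a = adj.astype(bool)
    reach = a.copy()
    power = a.copy()
    for _ in range(k - 1):
        power = (power.astype(int) @ adj) > 0  # one more hop
        reach |= power
    return reach

# Chain graph 0-1-2-3, e.g. words linked by dependency arcs.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
# Within 2 hops node 0 reaches node 1, node 2, and itself (0-1-0).
print(int(k_hop_mask(adj, 2)[0].sum()))  # → 3
```

Growing k widens the structural receptive field, which is the intuition behind fusing 1/2/3-hop subgraphs: shallow entities lean on the small-k local cues while deeply nested entities draw on the larger-k context.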

17 pages, 2354 KB  
Article
Real-Time Intelligent Detection Algorithm for Ship Targets in High-Resolution Wide-Swath Sea Surface Images Captured by Airborne Cameras
by Haiying Liu, Qiang Fu, Haoyu Wang, Huaide Zhou, Yingchao Li and Huilin Jiang
Sensors 2026, 26(6), 1786; https://doi.org/10.3390/s26061786 - 12 Mar 2026
Abstract
The critical task of ship detection in aerial imagery for maritime monitoring faces significant challenges in achieving real-time performance on embedded platforms. These challenges arise from the large data volume inherent in wide-format aerial images and the pronounced scale variations among vessels. To address this issue, an optimized YOLOv8-based model is proposed. Scale adaptability is enhanced by incorporating a Multi-Scale Fusion (MSF) module into the backbone. In addition, a lightweight Group-Wise Scale Fusion Neck (GSF-Neck) with a parallel multi-branch structure is designed to facilitate adaptive multi-scale feature fusion while reducing computational overhead. The proposed model achieves a state-of-the-art mAP@0.5 of up to 94.55% on a dedicated aerial ship dataset, outperforming other major detectors. When deployed on an RK3588 embedded system using a sliding window strategy to process single 300 MB images, it maintains a stable processing speed of ≥2 fps. Compared to the baseline under identical conditions, the model proposed in this study improves mAP by 1.4% with a 6.6% reduction in FPS, effectively balancing detection performance and computational efficiency. Full article
(This article belongs to the Section Environmental Sensing)
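The sliding-window strategy used above to process wide-format images can be laid out as follows. This is a generic sketch, not the authors' implementation: the 1280-pixel window, 128-pixel overlap, and example image size are assumptions, it presumes the image is at least one window wide and tall, and per-window detections would still need to be merged across tiles (e.g. by NMS).

```python
def sliding_windows(width, height, win=1280, overlap=128):
    # Tile a wide image into overlapping windows so every pixel is covered.
    # Overlap keeps ships that straddle a tile boundary detectable.
    step = win - overlap
    xs = list(range(0, width - win + 1, step))
    ys = list(range(0, height - win + 1, step))
    # Make sure the right and bottom edges are covered exactly.
    if xs[-1] + win < width:
        xs.append(width - win)
    if ys[-1] + win < height:
        ys.append(height - win)
    return [(x, y, win, win) for y in ys for x in xs]

tiles = sliding_windows(10000, 8000)
print(len(tiles))  # → 63
```

The per-frame cost scales with the tile count, which is why the abstract's stable ≥2 fps on a 300 MB image depends on keeping the window large and the overlap modest.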

29 pages, 6729 KB  
Article
A Novel Bearing Fault Diagnosis Framework with a Multi-Scale Feature Extraction Module and Efficient Content-Guided Attention Mechanism
by Yaru Liang, Jinxian Chen, Renxin Liu, Huamao Zhou, Nianqian Kang and Nanrun Zhou
Lubricants 2026, 14(3), 121; https://doi.org/10.3390/lubricants14030121 - 12 Mar 2026
Abstract
Rolling bearing faults originate from complex tribodynamic interactions among rolling elements, raceways, and the cage, yielding nonlinear, non-stationary vibration signals that are highly susceptible to noise and operating-condition variations, which compromises diagnostic reliability. To address this issue, this paper proposes the RConvNeXt–ECGA framework. The main contributions are twofold: (1) RConvNeXt, a convolutional module based on ConvNeXt, achieves efficient multi-scale feature extraction through grouped parallel convolutions with multiple receptive fields; (2) Efficient Content-Guided Attention (ECGA), a novel pixel-level attention mechanism, adaptively reweights feature maps to highlight informative regions and suppress irrelevant interference. The proposed method achieves an average accuracy of 99.8% on bearing datasets from Case Western Reserve University and Huazhong University of Science and Technology, and 94.33% under cross-operating-condition tests, demonstrating superior robustness and generalization over representative deep learning baseline models.
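The multi-receptive-field idea behind grouped parallel convolutions can be illustrated with a toy NumPy sketch: the same 1-D vibration signal is filtered by several kernel widths in parallel and the branch outputs are stacked. The function name, box-filter kernels, and kernel sizes are my own illustrative choices, not the paper's RConvNeXt design.

```python
import numpy as np

def multiscale_group_features(signal, kernel_sizes=(3, 7, 15)):
    """Filter one 1-D signal with several receptive-field widths in
    parallel and stack the results into a (branches, length) array."""
    feats = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k  # box filter as a stand-in for a learned kernel
        feats.append(np.convolve(signal, kernel, mode="same"))
    return np.stack(feats)
```

Wider kernels respond to slower trends while narrow kernels preserve sharp transients, which is the intuition behind combining multiple receptive fields for non-stationary fault signatures.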

25 pages, 6369 KB  
Article
A Lightweight Attention-Guided and Geometry-Aware Framework for Robust Maritime Ship Detection in Complex Electro-Optical Environments
by Zhe Zhang, Chang Lin and Bing Fang
Automation 2026, 7(2), 48; https://doi.org/10.3390/automation7020048 - 12 Mar 2026
Abstract
Reliable ship detection in complex maritime optical imagery is a fundamental requirement for intelligent maritime monitoring and maritime automation systems. However, severe image degradation, large variations in scale, and background clutter often lead to feature ambiguity and unstable detection performance in real-world maritime environments. To address these challenges, this paper proposes a lightweight one-stage ship detection framework designed for robust real-time perception under degraded maritime sensing conditions. The proposed method incorporates an Adaptive Expert Selection Attention (AESA) mechanism to perform adaptive feature selection and background suppression under visually degraded conditions, together with a Geometry-Aware Multi-Scale Fusion (GAMF) module that enables orientation-aware aggregation of contextual information for elongated ship targets near complex sea–sky boundaries. In addition, a geometry-aware bounding box regression refinement is introduced to improve localization consistency in image space. Extensive experiments conducted on a unified real-world maritime benchmark demonstrate that the proposed framework consistently outperforms the baseline YOLO11n model by approximately 2–5 percentage points in terms of mAP@0.5 and mAP@0.5:0.95, while maintaining moderate computational complexity and real-time inference capability. These results indicate that the proposed method provides a practical and deployment-oriented perception solution for maritime automation applications, including onboard electro-optical sensing and coastal surveillance.
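The pixel-level reweighting performed by attention modules of this kind can be shown in a minimal form: a content-derived guide map is squashed to (0, 1) and multiplied elementwise into the feature map. This is only a generic gating sketch; AESA's actual expert-selection mechanism is more involved, and the names here are hypothetical.

```python
import numpy as np

def content_gated(features, guide):
    """Elementwise attention: a sigmoid of the guide map produces a
    gate in (0, 1) that amplifies salient pixels and damps clutter."""
    gate = 1.0 / (1.0 + np.exp(-guide))  # sigmoid attention map
    return features * gate
```

Strongly positive guide values pass features through almost unchanged, strongly negative values suppress them toward zero, which is how background clutter near the sea–sky boundary would be attenuated.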

22 pages, 18777 KB  
Article
LSOD-YOLO: A Visual Object Detection Method for AGV Perception Systems Based on a Lightweight Backbone and Detection Head
by Sijing Cai, Zhanzheng Wu, Kang Liu, Tianbai Zhang, Wei Weng and Xiaoyi Zheng
Technologies 2026, 14(3), 173; https://doi.org/10.3390/technologies14030173 - 12 Mar 2026
Abstract
In smart logistics and intelligent manufacturing scenarios, the deployment of Autonomous Guided Vehicles (AGVs) necessitates vision systems that balance stringent real-time constraints with high detection accuracy. However, contemporary lightweight models often struggle with multi-scale feature representation and precision degradation. To address these challenges, this study presents LSOD-YOLO, a tailored evolution of YOLO11n designed for embedded AGV systems. Our methodology focuses on three architectural innovations: (1) we propose a Lightweight Shared Convolution Detection (LSCD) head integrated with Group Normalization (GN) and a scale-adaptive mechanism to harmonize multi-scale feature responses; (2) we re-engineer the backbone using a Star-Net architecture enhanced by Gated MLPs and Depthwise Attention to refine local spatial modeling; and (3) we integrate multi-branch residuals and Channel Attention (CAA) into the C3k2-Star-CAA module to enhance robustness against occlusions and complex backgrounds. Experimental validation on a self-built AGV industrial dataset and COCO128 shows a 30 FPS increase in throughput and a 1.5% gain in precision, achieved with 32.8% fewer parameters. These findings confirm that LSOD-YOLO achieves a superior trade-off between computational efficiency and reliability, showing great potential for deployment in resource-constrained AGV visual tasks.
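Group Normalization, used in the LSCD head above, normalizes channel groups independently of batch size, which suits embedded inference with batch size 1. Below is a minimal NumPy rendition for a single (C, H, W) feature map; the learnable scale and shift parameters of standard GN are omitted for brevity, and the function name is my own.

```python
import numpy as np

def group_norm(x, groups=2, eps=1e-5):
    """Group Normalization on one (C, H, W) map: split channels into
    groups and normalize each group to zero mean, unit variance."""
    c, h, w = x.shape
    g = x.reshape(groups, c // groups, h, w)
    mean = g.mean(axis=(1, 2, 3), keepdims=True)
    var = g.var(axis=(1, 2, 3), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(c, h, w)
```

Because the statistics come from each sample's own channel groups rather than the batch, the behavior is identical at training and inference time, unlike Batch Normalization.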
