Search Results (386)

Search Parameters:
Keywords = Light-YOLO

25 pages, 2513 KB  
Article
YOLO-DAA: Directional Area Attention for Lightweight Tiny Object Detection in Maritime UAV Imagery
by Kuan-Chou Chen, Vinay Malligere Shivanna and Jiun-In Guo
Drones 2026, 10(4), 283; https://doi.org/10.3390/drones10040283 - 14 Apr 2026
Abstract
Tiny object detection in maritime Unmanned Aerial Vehicle (UAV) imagery remains challenging due to low-resolution targets, dynamic lighting, and vast water backgrounds that obscure fine spatial cues. This study introduces You Only Look Once – Directional Area Attention (YOLO-DAA), a lightweight yet direction-aware detection framework designed to enhance spatial reasoning and feature discrimination for maritime environments. The proposed model integrates two key components: the Spatial Reconstruction Unit (SRU), which dynamically filters redundant activations and reconstructs informative spatial features, and the Directional Area Attention (DAA), which introduces controllable row–column attention to model anisotropic dependencies. Together, they enable the network to capture orientation-sensitive structures such as elongated vessels and vertically aligned swimmers while maintaining real-time efficiency. Experimental results on the Common Objects in Context (COCO) and SeaDronesSee datasets demonstrate that YOLO-DAA achieves significant improvements in both precision and recall, outperforming the YOLOv12-turbo baseline across multiple scales. In particular, the lightweight YOLO-DAA-n variant achieves a 12.5% AP95 gain on SeaDronesSee with minimal computational overhead. The findings confirm that directional attention and spatial reconstruction jointly enhance the representation of tiny maritime targets, offering an effective balance between accuracy and efficiency for real-world UAV deployments.

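The controllable row–column attention that DAA describes can be pictured with a short PyTorch sketch. The module below is an illustrative reconstruction, not the authors' code: each pixel attends along its own row and its own column, and a learned gate mixes the two directions.

```python
import torch
import torch.nn as nn

class DirectionalAreaAttentionSketch(nn.Module):
    """Illustrative row-column attention: each pixel attends along its own row
    and its own column; a learned gate balances the two directions."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.gate = nn.Parameter(torch.zeros(2))  # row/column mixing weights

    def _attend(self, q, k, v):
        # q, k, v: (batch, length, channels) -> attention along `length`
        scores = q @ k.transpose(1, 2) / q.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ v

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Row direction: sequences of length w, one per (image, row)
        row = self._attend(*(t.permute(0, 2, 3, 1).reshape(b * h, w, c) for t in (q, k, v)))
        row = row.reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Column direction: sequences of length h, one per (image, column)
        col = self._attend(*(t.permute(0, 3, 2, 1).reshape(b * w, h, c) for t in (q, k, v)))
        col = col.reshape(b, w, h, c).permute(0, 3, 2, 1)
        g = torch.softmax(self.gate, dim=0)
        return x + g[0] * row + g[1] * col
```
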
26 pages, 4957 KB  
Article
Detection of Traffic Lights and Status (Red, Yellow and Green) in Images with Different Environmental Conditions Using Architectures from Yolov8 to Yolov12
by Julio Saucedo-Soto, Viridiana Hernández-Herrera, Moisés Márquez-Olivera, Octavio Sánchez-García and Antonio-Gustavo Juárez-Gracia
Vehicles 2026, 8(4), 90; https://doi.org/10.3390/vehicles8040090 - 10 Apr 2026
Abstract
Given that approximately 70% of traffic accidents are attributable to driver-related factors, it is necessary for vehicles to incorporate technologies that reduce risk through preventive actions derived from traffic-scene analysis. Interpreting the driving environment is non-trivial and is commonly decomposed into sub-tasks; among them, traffic light perception is critical due to its role in regulating vehicular flow. This paper evaluates five YOLO CNN families (YOLOv8–YOLOv12) on two tasks: (i) traffic light detection and (ii) traffic light state recognition (green, yellow, red). The evaluation uses a hybrid dataset comprising the public LISA traffic light dataset and a custom dataset with images from Mexico City captured under diverse lighting conditions, a relevant setting given the city's high traffic intensity. The results show mAP@0.50 = 94.4–96.3% for traffic light detection and mAP@0.50 = 99.3–99.4% for traffic light state recognition, indicating that modern YOLO variants provide highly reliable performance for both tasks under natural illumination variability.
(This article belongs to the Special Issue AI-Empowered Assisted and Autonomous Driving)

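For the state-recognition task, inference with a trained Ultralytics model reduces to reading the predicted class of each detected light. A minimal sketch, assuming hypothetical weights (tl_state_best.pt) and a placeholder image path:

```python
from ultralytics import YOLO

model = YOLO("tl_state_best.pt")            # hypothetical fine-tuned weights
result = model("intersection.jpg")[0]       # placeholder test image
for box in result.boxes:
    state = result.names[int(box.cls)]      # e.g. "green", "yellow", "red"
    print(f"{state}: conf={float(box.conf):.2f}, xyxy={box.xyxy[0].tolist()}")
```
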
22 pages, 4431 KB  
Article
LA-YOLO: Robust Tea-Shoot Detection Under Dynamic Illumination via Input Illumination Stabilization and Discriminative Feature Learning
by Menghua Liu, Fanghua Liu and Junchao Chen
Agriculture 2026, 16(7), 809; https://doi.org/10.3390/agriculture16070809 - 4 Apr 2026
Abstract
Accurate tea-shoot detection in real tea gardens is essential for intelligent harvesting, yet dynamic illumination (low light, strong light, and shadows) can cause brightness/contrast fluctuations and feature distribution shifts, degrading detection stability and localization accuracy. This paper proposes LA-YOLO, a dynamic-light tea-shoot detector based on YOLOv11. First, we construct a dynamic-light benchmark dataset and a difficulty-stratified evaluation protocol with four single-light subsets (A–D) and a mixed-light subset (E). Second, we design LA-CSNorm, an input-side brightness-adaptive preprocessing module that applies gated enhancement to dark samples followed by channel-selective normalization to reduce illumination-induced drift. Third, we propose RECA, a residual efficient channel-attention module to enhance discriminative channels and improve localization stability. Ablation studies show that LA-CSNorm and RECA provide complementary gains, and their combination improves the YOLOv11 baseline to 0.831 mAP@0.5 and 0.621 mAP@0.5:0.95, with only 0.01 M additional parameters. On the mixed-light subset E, LA-YOLO achieves 0.816 mAP@0.5 and 0.613 mAP@0.5:0.95, and consistently outperforms mainstream YOLO variants (e.g., YOLOv11m) under dynamic lighting conditions. These results demonstrate that LA-YOLO offers a robust and deployment-friendly solution for tea-shoot detection in complex natural illumination.

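RECA is described as a residual efficient channel-attention module. A plausible reading, sketched below in the spirit of ECA-Net (the exact LA-YOLO formulation may differ), adds an identity path around the usual channel re-weighting:

```python
import torch
import torch.nn as nn

class RECASketch(nn.Module):
    """ECA-style channel attention with a residual connection: a 1-D conv over
    the pooled channel descriptor yields per-channel weights (sketch only)."""
    def __init__(self, k=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, 1, c)                     # squeeze to (b, 1, c)
        w = torch.sigmoid(self.conv(w)).view(b, c, 1, 1)   # excite per channel
        return x + x * w                                   # identity path keeps the original signal
```
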
20 pages, 7512 KB  
Article
PDA-YOLO: An Early Detection Method for Egg Fertilization Rate Based on Position-Decoupled Attention
by Yifan Zhou, Zhengxiang Shi, Geqi Yan, Haiqing Peng, Fuwei Li, Wei Liu and Dapeng Li
Agriculture 2026, 16(7), 784; https://doi.org/10.3390/agriculture16070784 - 2 Apr 2026
Abstract
This study addresses the inefficiencies, subjectivity, and poor adaptability to lighting variations inherent in traditional candling methods used in large-scale egg incubation. We developed a high-throughput transmissive imaging system capable of capturing 30 eggs simultaneously. Based on this system, we propose PDA-YOLO, an enhanced YOLOv8-based object detection model featuring a position-decoupled attention strategy. Specifically, a lightweight C2f-SE module is integrated into the backbone to amplify subtle feature responses in low-contrast regions, while a Convolutional Block Attention Module (CBAM) is deployed prior to the detection head to mitigate background clutter through precise spatial attention. Experimental results on a self-constructed Hailan White egg dataset show that at the critical 60 h incubation stage, PDA-YOLO achieves a Recall of 91.5% and an mAP@0.5 of 97.4%, outperforming the YOLOv8 baseline while maintaining a real-time inference speed of 62.1 FPS. Grad-CAM visualizations confirm the model's ability to focus on vascular textures and suppress noise. Furthermore, the model demonstrates robust performance under varying illumination (180–540 lumens), effectively mitigating missed detections in low light and recognition degradation from overexposure. This work provides a scalable, real-time solution for non-destructive, early-stage detection of poultry health and fertilization status in commercial hatcheries.
(This article belongs to the Special Issue Computer Vision Analysis Applied to Farm Animals)

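The C2f-SE module presumably wraps a standard squeeze-and-excitation block around the C2f bottleneck; the generic SE block looks like this (a sketch, not the paper's exact wiring):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: pool to a channel descriptor, pass it through a
    bottleneck MLP, and rescale the feature map channel-wise."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)
```
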
23 pages, 7126 KB  
Article
Dual-Modal Chicken Mortality Detection Using Dynamic Hybrid Convolution-Based Feature Fusion
by Tian Hua, Qian Fan, Runhao Chen, Yulin Bi, Hao Bai, Zhixiu Wang, Guobin Chang and Wenming Zhao
Animals 2026, 16(7), 1057; https://doi.org/10.3390/ani16071057 - 31 Mar 2026
Abstract
In large-scale caged broiler farms, daily inspection of dead broilers is essential for flock health management and disease prevention. To address the significant performance degradation of existing methods under challenging conditions such as poor lighting, severe occlusion, and complex backgrounds, this paper proposes a dual-modal dynamic hybrid convolutional feature fusion method for dead bird detection based on an improved YOLO11 framework, termed YOLO11-DualDynConv-FF. First, a dual-modal fusion network architecture was developed to combine RGB and infrared (IR) images, enabling the model to simultaneously process both modalities. By integrating complementary information from RGB and IR data, the proposed method significantly improved detection accuracy and efficiency under low-light conditions. Second, a dynamic hybrid convolution feature fusion module was designed to merge multi-scale feature maps with contextual information, allowing the network to capture fine-grained details and adapt better to complex farming environments. In addition, an occlusion-aware module was introduced to specifically address the physical occlusion challenges prevalent in crowded cage settings. Comparative experiments and ablation studies involving multiple object detection networks were conducted to evaluate the proposed method. The results show that the improved YOLO11 model achieves superior performance, with precision, recall, F1-score, and mAP@0.5 reaching 92.6%, 79.0%, 0.85, and 80.1%, respectively. These results represent improvements of 2.0%, 5.0%, 0.17, and 12.1%, respectively, over the original YOLO11 model. Compared with existing approaches, the proposed model is better suited to complex real-world poultry farming environments and achieves higher detection accuracy, providing a valuable reference for intelligent monitoring in caged poultry farming.
(This article belongs to the Section Poultry)

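The dual-modal idea, two input streams whose features are merged before detection, can be illustrated with a toy fusion stem (the paper's dynamic hybrid convolution is more elaborate; the names below are hypothetical):

```python
import torch
import torch.nn as nn

class DualStreamStem(nn.Module):
    """Toy RGB+IR fusion: separate stems per modality, concatenation, then a
    1x1 conv to mix the two feature sets into one map for the detector."""
    def __init__(self, c_out=64):
        super().__init__()
        self.rgb_stem = nn.Conv2d(3, c_out, 3, stride=2, padding=1)
        self.ir_stem = nn.Conv2d(1, c_out, 3, stride=2, padding=1)
        self.mix = nn.Conv2d(2 * c_out, c_out, 1)

    def forward(self, rgb, ir):
        fused = torch.cat([self.rgb_stem(rgb), self.ir_stem(ir)], dim=1)
        return self.mix(fused)
```
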
24 pages, 4742 KB  
Article
Comparative Evaluation of YOLOv8 and YOLO11 for Image-Based Classification of Sugar Beet Seed Treatment Levels
by Cihan Unal, Ilkay Cinar, Zulfi Saripinar and Murat Koklu
Sensors 2026, 26(7), 2137; https://doi.org/10.3390/s26072137 - 30 Mar 2026
Abstract
This study addresses the automatic classification of sugar beet seeds according to their spraying levels using RGB images, aiming to enable a fast, practical, and non-destructive early warning system without chemical analysis. A dataset of 16,519 seed images acquired under controlled lighting conditions was used to evaluate YOLOv8-CLS and YOLO11-CLS architectures, including the n, s, m, l, and x scale variants within the Ultralytics framework. All experiments were conducted using a 10-fold cross-validation strategy, with models trained under different batch size and learning rate configurations. The results indicate that both architectures achieve reliable performance, with accuracy values ranging from approximately 78–83% for YOLOv8-CLS and 80–82% for YOLO11-CLS models. ROC-AUC scores consistently above 0.94 demonstrate strong inter-class discrimination. Misclassification analysis shows that errors mainly occur between visually similar intermediate treatment levels, particularly 25% and 50%. Despite this challenge, low log-loss values and balanced precision–recall profiles indicate stable decision behavior. Overall, the findings confirm that sugar beet seed treatment levels can be effectively distinguished using only RGB imagery, providing a potentially low-cost and scalable approach for early warning and quality control in seed treatment processes.
(This article belongs to the Section Smart Agriculture)

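The reported metrics (accuracy, ROC-AUC, log-loss) are standard scikit-learn computations over the pooled cross-validation predictions. A sketch with dummy four-class data standing in for the per-fold model outputs:

```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 4, 500)           # dummy labels: four treatment levels
y_prob = rng.dirichlet(np.ones(4), 500)    # dummy class-probability outputs

print("accuracy:", accuracy_score(y_true, y_prob.argmax(axis=1)))
print("log-loss:", log_loss(y_true, y_prob))
print("ROC-AUC (one-vs-rest):", roc_auc_score(y_true, y_prob, multi_class="ovr"))
```
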
27 pages, 6255 KB  
Article
Lightweight Safety Helmet Wearing Detection Algorithm Based on GSA-YOLO
by Haodong Wang, Qiang Zhou, Zhiyuan Hao, Wentao Xiao and Luqing Yan
Sensors 2026, 26(7), 2110; https://doi.org/10.3390/s26072110 - 28 Mar 2026
Abstract
Electric power station confined spaces are high-risk and complex environments characterized by significant illumination variations. Whether safety helmets are properly worn directly affects the operational safety of workers in confined spaces. However, helmet detection in such environments faces several challenges, including drastic lighting changes and difficulties in small-object detection. Moreover, existing object detection models typically contain a large number of parameters, making real-time helmet detection difficult to deploy on field devices with limited computational resources. To address these issues, this paper proposes a lightweight safety helmet wearing detection algorithm named GSA-YOLO. To mitigate the effects of severe illumination variation and detail loss in confined spaces, a GCA-C2f module integrating GhostConv and the CBAM attention mechanism is embedded into the backbone network. This design reduces the number of parameters and computational cost while enhancing the model's feature extraction capability under challenging lighting conditions. To improve detection performance for occluded targets, an improved efficient channel attention (I-ECA) mechanism is introduced into the neck structure, which suppresses irrelevant channel features and enhances occluded object detection accuracy. Furthermore, to alleviate missed detections of small objects and inaccurate localization under low-light conditions, a P2 detection branch is added to the head, and the WIoU loss function is adopted to dynamically adjust the weights of hard and easy samples, thereby improving small-object detection accuracy and localization robustness. A confined space helmet detection dataset containing 5000 images was constructed through on-site data collection for model training and validation. Experimental results demonstrate that the proposed GSA-YOLO achieves an mAP@0.5 of 91.2% on the self-built dataset with only 2.3 M parameters, outperforming the baseline model by 2.9% while reducing the parameter count by 23.6%. The experimental results verify that the proposed algorithm is suitable for environments with significant illumination variation and small-object detection challenges. It provides a lightweight and efficient solution for on-site helmet detection in confined space scenarios, thereby helping to reduce industrial safety accidents.
(This article belongs to the Section Sensing and Imaging)

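GhostConv, one half of the GCA-C2f module, halves the cost of a convolution by deriving half of the output channels with a cheap depthwise operation. A generic sketch of the GhostNet-style block, not necessarily the paper's exact variant:

```python
import torch
import torch.nn as nn

class GhostConvSketch(nn.Module):
    """A primary conv produces half the output channels; a cheap 5x5 depthwise
    conv generates the remaining "ghost" half, and the two are concatenated."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```
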
27 pages, 4296 KB  
Article
Research on Lightweight Apple Detection and 3D Accurate Yield Estimation for Complex Orchard Environments
by Bangbang Chen, Xuzhe Sun, Xiangdong Liu, Baojian Ma and Feng Ding
Horticulturae 2026, 12(3), 393; https://doi.org/10.3390/horticulturae12030393 - 22 Mar 2026
Abstract
Severe foliage occlusion and dynamically changing lighting conditions in complex orchard environments pose significant challenges for visual perception systems in automated apple harvesting, including low detection accuracy, poor robustness, and insufficient real-time performance. To address these issues, this study proposes an improved lightweight detection network based on YOLOv11, named YOLO-WBL, along with a precise yield estimation algorithm based on 3D point clouds, termed CLV. The YOLO-WBL network is optimized in three aspects: (1) A C3K2_WT module integrating wavelet transform is introduced into the backbone network to enhance multi-scale feature extraction capability; (2) A weighted bidirectional feature pyramid network (BiFPN) is adopted in the neck network to improve the efficiency of multi-scale feature fusion; (3) A lightweight detection head with shared convolution and separated batch normalization (Detect-SCGN) is designed to significantly reduce the parameter count while maintaining accuracy. Based on this detection model, the CLV algorithm deeply integrates depth camera point cloud information through 3D coordinate mapping, irregular point cloud reconstruction, and convex hull volume calculation to achieve accurate estimation of individual fruit volume and total yield. Experimental results demonstrate the following: (1) The YOLO-WBL model achieves a precision of 93.8%, recall of 79.3%, and mean average precision (mAP@0.5) of 87.2% on the apple test set; (2) The model size is only 3.72 MB, a reduction of 28.87% compared to the baseline model; (3) When deployed on an NVIDIA Jetson Xavier NX edge device, its inference speed reaches 8.7 FPS, meeting real-time requirements; (4) In scenarios with an occlusion rate below 40%, the mean absolute percentage error (MAPE) of yield estimation can be controlled within 8%. Experimental validation was conducted using apple images selected from the dataset under varying lighting intensities and fruit occlusion conditions. The results demonstrate that the CLV algorithm significantly outperforms traditional average-weight-based estimation methods. This study provides an efficient, accurate, and deployable visual solution for intelligent apple harvesting and yield estimation in complex orchard environments, offering practical reference value for advancing smart orchard production.
(This article belongs to the Special Issue AI for a Precision and Resilient Horticulture)

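The convex-hull volume step of the CLV algorithm maps directly onto scipy. A sketch with a dummy point cloud standing in for the fruit points recovered from the depth camera:

```python
import numpy as np
from scipy.spatial import ConvexHull

points = np.random.rand(500, 3) * 0.08   # dummy fruit point cloud (~8 cm cube, meters)
hull = ConvexHull(points)
volume_ml = hull.volume * 1e6            # m^3 -> milliliters
print(f"estimated fruit volume: {volume_ml:.1f} mL")
```
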
28 pages, 14845 KB  
Article
Spatial Relation Reasoning Based on Keypoints for Railway Intrusion Detection and Risk Assessment
by Shanping Ning, Feng Ding and Bangbang Chen
Appl. Sci. 2026, 16(6), 3026; https://doi.org/10.3390/app16063026 - 20 Mar 2026
Abstract
Foreign object intrusion in railway tracks is a major threat to train operation safety, yet current detection methods face challenges in identifying small distant targets and adapting to low-light conditions. Moreover, existing systems often lack the ability to assess intrusion risk levels, limiting real-time warning and graded response capabilities. To address these gaps, this paper proposes a novel method for intrusion detection and risk assessment based on keypoint spatial discrimination. First, an XS-BiSeNetV2-based track segmentation network is developed, incorporating cross-feature fusion and spatial feature recalibration to improve track extraction accuracy in complex scenes. Second, an enhanced STI-YOLO detection model is introduced, integrating a Shuffle attention mechanism for better feature interaction, a high-resolution Transformer detection head to improve small-target sensitivity, and the Inner-IoU loss function to refine bounding box regression. Detected targets' bottom keypoints are then analyzed relative to track boundaries to determine intrusion direction. By combining lateral distance and motion state features, a multi-level risk classification system is established for quantitative threat assessment. Experiments on the RailSem19 and GN-rail-Object datasets show that the method achieves a track segmentation mIoU of 88.19% and a detection mAP of 82.6%. The risk assessment module effectively quantifies threats across scenarios and maintains stable performance under low-light and strong-glare conditions. This work offers a quantifiable risk assessment solution for intelligent railway safety systems.

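The keypoint-based spatial discrimination reduces to a point-in-polygon test against the segmented track region, plus a lateral distance to the boundary. A sketch with shapely and dummy coordinates:

```python
from shapely.geometry import Point, Polygon

track = Polygon([(100, 0), (300, 0), (400, 720), (0, 720)])  # dummy track region outline
bottom_keypoint = Point(220, 500)                            # detected box's bottom-center

intruding = track.contains(bottom_keypoint)
lateral_px = bottom_keypoint.distance(track.exterior)        # distance to track boundary
print("intruding" if intruding else "clear", f"(lateral distance: {lateral_px:.1f} px)")
```
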
22 pages, 6052 KB  
Article
HSMD-YOLO: An Anti-Aliasing Feature-Enhanced Network for High-Speed Microbubble Detection
by Wenda Luo, Yongjie Li and Siguang Zong
Algorithms 2026, 19(3), 234; https://doi.org/10.3390/a19030234 - 20 Mar 2026
Abstract
Underwater micro-bubble detection entails multiple challenges, including diminutive target sizes, sparse pixel information, pronounced specular highlights and water scattering, indistinct bubble boundaries, and adhesion or overlap between instances. To address these issues, we propose HSMD-YOLO, an improved detector tailored for high-resolution micro-bubble detection and built upon YOLOv11. The model incorporates three novel components: the Scale Switch Block (SSB), a scale-transformation module that suppresses artifacts and background noise, thereby stabilizing edges in thin-walled bubble regions and enhancing sensitivity to geometric contours; the Global Local Refine Block (GLRB), which achieves efficient global relationship modeling with an asymptotic linear complexity (O(N)) in spatial dimensions while further refining local features, thereby strengthening boundary perception and improving bubble–background separability; and the Bidirectional Exponential Moving Attention Fusion (BEMAF), which accommodates the multi-scale nature of bubbles by employing a parallel multi-kernel architecture to extract spatial features across scales, coupled with a multi-stage EMA-based attention mechanism to enhance detection robustness under weak boundaries and complex backgrounds. Experiments conducted on a Side-Illuminated Light Field Bubble Database (SILB-DB) and a public gas–liquid two-phase flow dataset (GTFD) demonstrate that HSMD-YOLO achieves mAP@50 scores of 0.911 and 0.854, respectively, surpassing mainstream detection methods. Ablation studies indicate that SSB, GLRB, and BEMAF contribute performance gains of 1.3%, 2.0%, and 0.4%, respectively, thereby corroborating the effectiveness of each module for micro-scale object detection.
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)

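BEMAF's parallel multi-kernel extraction can be pictured as several depthwise convolutions of different kernel sizes applied to the same input and summed; the sketch below shows only that skeleton, not the EMA-based attention stages:

```python
import torch.nn as nn

class MultiKernelBranches(nn.Module):
    """Parallel depthwise convs at several kernel sizes, summed with a residual
    path, to cover bubbles across a range of scales (skeleton only)."""
    def __init__(self, channels, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels, bias=False)
            for k in kernels)

    def forward(self, x):
        return x + sum(branch(x) for branch in self.branches)
```
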
20 pages, 3218 KB  
Article
MIP-YOLO11: An Underwater Object Detection Model Based on Improved YOLO11
by Xinyu Qu, Ying Shao, Zheng Wang and Man Chang
J. Mar. Sci. Eng. 2026, 14(6), 572; https://doi.org/10.3390/jmse14060572 - 19 Mar 2026
Abstract
Due to challenges such as inadequate lighting, water scattering, high density of small objects, and complex object morphology in underwater environments, traditional YOLO11 models face difficulties including interference from complex backgrounds, weak perception of small objects, and insufficient feature extraction when applied underwater. This paper proposes an improved MIP-YOLO11 model for underwater object detection based on the YOLO11 framework. First, an MCEA module is designed in the backbone network to replace the basic CBS convolution module. Through a lightweight multi-branch convolutional structure, the perception ability for small objects, object edges, contours, and morphological features in underwater scenes is enhanced without significantly increasing computational overhead. Second, an IMCA module based on the coordinate attention mechanism is introduced at the end of the backbone network to replace the C2PSA module, reducing the number of model parameters while maintaining detection accuracy. Finally, the Bottleneck module in C3k2 is improved by incorporating a PConv and a dual residual connection mechanism, thereby expanding the receptive field and enhancing the efficiency of complex feature extraction. Experimental results demonstrate that MIP-YOLO11 significantly outperforms the traditional YOLO11 in underwater environments. Precision (P) and recall (R) are improved by 2.5% and 4.1%, respectively. Moreover, the mAP@0.5 and mAP@0.5:0.95 metrics are increased by 4.2% and 7.5%, respectively. The improved model achieves a good balance between high accuracy and a lightweight design, and can provide a more reliable underwater object detection scheme for AUV underwater detection and other application scenarios.
(This article belongs to the Section Ocean Engineering)

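The PConv used in the improved C3k2 bottleneck follows the FasterNet idea: convolve only a fraction of the channels and forward the rest untouched, trading a small accuracy cost for far fewer FLOPs. A generic sketch:

```python
import torch
import torch.nn as nn

class PConvSketch(nn.Module):
    """FasterNet-style partial convolution: a 3x3 conv touches the first
    1/n_div of the channels; the remainder passes through unchanged."""
    def __init__(self, channels, n_div=4):
        super().__init__()
        self.c_conv = channels // n_div          # assumes channels % n_div == 0
        self.conv = nn.Conv2d(self.c_conv, self.c_conv, 3, padding=1, bias=False)

    def forward(self, x):
        x1, x2 = x[:, :self.c_conv], x[:, self.c_conv:]
        return torch.cat([self.conv(x1), x2], dim=1)
```
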
25 pages, 6302 KB  
Article
Artificial Intelligence-Based Detection of On-Ground Chestnuts Toward Automated Picking
by Kaixuan Fang, Yuzhen Lu and Xinyang Mu
AgriEngineering 2026, 8(3), 116; https://doi.org/10.3390/agriengineering8030116 - 19 Mar 2026
Abstract
Traditional mechanized chestnut harvesting is too costly for small producers, non-selective, and prone to damaging nuts. Accurate, reliable detection of chestnuts on the orchard floor is crucial for developing low-cost, vision-guided automated harvesting technology. However, developing a reliable chestnut detection system faces challenges in complex environments with shading, varying natural light conditions, and interference from weeds, fallen leaves, stones, and other foreign on-ground objects, which have remained unaddressed. This study collected 319 images of chestnuts on the orchard floor, containing 6524 annotated chestnuts. A comprehensive set of 29 state-of-the-art real-time object detectors, including 14 in the YOLO (v11–v13) and 15 in the RT-DETR (v1–v4) families at various model scales, was systematically evaluated through replicated modeling experiments for chestnut detection. Experimental results show that the YOLOv12m model achieved the best mAP@0.5 of 95.1% among all the evaluated models, while RT-DETRv2-R101 was the most accurate variant among the RT-DETR models, with mAP@0.5 of 91.1%. In terms of mAP@[0.5:0.95], the YOLOv11x model achieved the best accuracy of 80.1%. All models demonstrated significant potential for real-time chestnut detection, and YOLO models outperformed RT-DETR models in terms of both detection accuracy and inference speed, making them better suited for on-board deployment. This work lays a foundation for developing AI-based, vision-guided intelligent chestnut harvest systems.
(This article belongs to the Special Issue Applications of Computer Vision in Agriculture)

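A family comparison of this kind can be scripted against the Ultralytics API, which exposes both YOLO and RT-DETR model classes (only some of the 29 evaluated variants ship with Ultralytics; the dataset YAML below is a placeholder):

```python
from ultralytics import RTDETR, YOLO

candidates = [("yolo11m.pt", YOLO), ("yolo12m.pt", YOLO), ("rtdetr-l.pt", RTDETR)]
scores = {}
for weights, Model in candidates:
    model = Model(weights)
    model.train(data="chestnuts.yaml", epochs=100, imgsz=640)   # placeholder dataset
    metrics = model.val(data="chestnuts.yaml")
    scores[weights] = (metrics.box.map50, metrics.box.map)      # mAP@0.5, mAP@[0.5:0.95]
print(scores)
```
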
30 pages, 26587 KB  
Article
Research on Synthetic Data Methods and Detection Models for Micro-Cracks
by Yaotong Jiang, Tianmiao Wang, Xuanhe Chen and Jianhong Liang
Sensors 2026, 26(6), 1883; https://doi.org/10.3390/s26061883 - 17 Mar 2026
Abstract
Micro-crack detection on concrete surfaces is challenging because labeled micro-crack data are scarce, crack cues are extremely weak (often only a few pixels wide), and complex backgrounds (e.g., non-uniform illumination, shadows, and stains) degrade feature extraction; this study aims to improve both data availability and detection robustness for practical inspection. A Poisson image editing-based synthesis strategy is developed to generate visually coherent micro-crack samples via gradient-domain blending, and a Complex-Scene-Tolerant YOLO (CST-YOLO) detector is proposed on top of YOLOv10, following a "lighting decoupling–global perception–micro-feature enhancement" design. CST-YOLO integrates a Lighting-Adaptive Preprocessing Module (LAPM) to suppress illumination/shadow perturbations, a Spatial–Channel Sparse Transformer (SCS-Former) to model long-range crack topology efficiently, and a Small Object Focus Block (SOFB) to enhance micro-scale cues under cluttered backgrounds. Experiments are conducted on a 650-image dataset (200 real and 450 synthesized), in which synthesized samples are used only for training, and the validation/test sets contain only real images, with a 7:2:1 split. CST-YOLO achieves 0.990 mAP@0.5 and 0.926 mAP@0.5:0.95 at 139 FPS, and ablation results indicate complementary contributions from LAPM, SCS-Former, and SOFB. These results support the effectiveness of combining realistic synthesis and architecture-level robustness for real-time micro-crack detection in complex scenes.
(This article belongs to the Section Fault Diagnosis & Sensors)

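The Poisson image editing step has a close off-the-shelf analogue in OpenCV's seamlessClone, which performs the same gradient-domain blending. A sketch with placeholder file names:

```python
import cv2
import numpy as np

background = cv2.imread("concrete.jpg")        # placeholder clean surface
crack_patch = cv2.imread("crack_patch.jpg")    # placeholder crack cut-out
mask = 255 * np.ones(crack_patch.shape[:2], dtype=np.uint8)
center = (background.shape[1] // 2, background.shape[0] // 2)

# MIXED_CLONE keeps the stronger gradient, which suits thin cracks on texture.
synthetic = cv2.seamlessClone(crack_patch, background, mask, center, cv2.MIXED_CLONE)
cv2.imwrite("synthetic_crack.jpg", synthetic)
```
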
19 pages, 6716 KB  
Article
Multi-Type Weld Defect Detection in Galvanized Sheet MIG Welding Using an Improved YOLOv10 Model
by Bangzhi Xiao, Yadong Yang, Yinshui He and Guohong Ma
Materials 2026, 19(6), 1178; https://doi.org/10.3390/ma19061178 - 17 Mar 2026
Abstract
Shop-floor weld inspection may appear to be a solved problem until a camera is deployed near a galvanized-sheet MIG welding line. The seam reflects light, the texture changes from frame to frame, and the defects of interest are often small and visually subtle. Additionally, the hardware near the line is rarely a data-center GPU. With those constraints in mind, this paper presents YOLO-MIG, a compact detector built on YOLOv10n for weld-seam inspection in practical production conditions. We make three focused changes to the baseline: a C2f-EMSCP backbone block to better preserve weak defect cues with modest parameter growth, a BiFPN neck to keep small-target information alive during feature fusion, and a C2fCIB head to clean up predictions that otherwise get distracted by seam edges and illumination artifacts. On a workshop-collected dataset containing 326 original images, with the training subset expanded through augmentation to 2608 labeled samples in total, YOLO-MIG achieves 98.4% mAP@0.5 and 56.29% mAP@0.5:0.95 on the test set while remaining lightweight (1.83 M parameters, 3.87 MB FP16 weights). Compared with YOLOv10n, the proposed model improves mAP@0.5 by 9.36 points and mAP@0.5:0.95 by 4.89 points, while reducing parameters, GFLOPs, and model size by 43.4%, 19.9%, and 29.9%, respectively. The results suggest that YOLO-MIG is not only accurate but also realistic to deploy at the edge for intelligent weld quality control.
(This article belongs to the Section Manufacturing Processes and Systems)

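Footprint figures like "1.83 M parameters, 3.87 MB FP16 weights" follow from a simple parameter count (two bytes per FP16 weight, plus some serialization overhead). A quick check of the same quantities for any Ultralytics model, here the YOLOv10n baseline:

```python
from ultralytics import YOLO

model = YOLO("yolov10n.pt").model                 # underlying nn.Module
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.2f} M parameters, ~{n_params * 2 / 1e6:.2f} MB at FP16")
```
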
25 pages, 2978 KB  
Article
Performance Analysis of the YOLO Object Detection Algorithm in Embedded Systems: Generated Code vs. Native Implementation
by Pablo Martínez Otero, Alberto Tellaeche and Mar Hernández Melero
Computation 2026, 14(3), 67; https://doi.org/10.3390/computation14030067 - 12 Mar 2026
Abstract
This paper evaluates the current maturity of automatic code-generation workflows for deploying modern CNN-based object detectors on embedded GPU platforms. We compare a native pipeline against a code generation pipeline through a Model-Based Engineering (MBE) approach, using YOLOv8/YOLOv9 inference on NVIDIA Jetson Orin Nano and Jetson AGX Orin as representative edge-GPU workloads. We report detection-quality metrics (mAP, PR curves) and system-level metrics (latency distribution and initialization overhead) under a controlled single-class scenario based on a CARLA-generated sequence with frame-level annotations. Absolute accuracy and latency values are scenario-dependent and may vary under different camera optics, illumination, motion blur, sensor noise, occlusion patterns, and multi-class scenes. Results quantify the performance gap between code generation and native pipelines and show that, for the evaluated workloads, the automated pipeline remains less competitive in both latency and accuracy. We discuss the implications of this gap for deployment workflows in safety-oriented domains, and we outline bottlenecks that should be addressed. The study is intended as a controlled traffic-light detection micro-benchmark and does not aim to validate full ADAS perception stacks.
(This article belongs to the Special Issue Object Detection Models for Transportation Systems)

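Latency distributions of the kind reported here are typically measured with a warmup phase (to exclude initialization overhead) followed by repeated timed inferences. A sketch with a placeholder model and a dummy frame:

```python
import time
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
frame = np.zeros((640, 640, 3), dtype=np.uint8)   # dummy input frame

for _ in range(20):                               # warmup: loading, allocations, caches
    model(frame, verbose=False)

latencies_ms = []
for _ in range(200):
    t0 = time.perf_counter()
    model(frame, verbose=False)
    latencies_ms.append((time.perf_counter() - t0) * 1e3)

print(f"p50 = {np.percentile(latencies_ms, 50):.1f} ms, "
      f"p99 = {np.percentile(latencies_ms, 99):.1f} ms")
```
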