Search Results (387)

Search Parameters:
Keywords = YOLOv5s localization

20 pages, 3518 KiB  
Article
YOLO-AWK: A Model for Injurious Bird Detection in Complex Farmland Environments
by Xiang Yang, Yongliang Cheng, Minggang Dong and Xiaolan Xie
Symmetry 2025, 17(8), 1210; https://doi.org/10.3390/sym17081210 - 30 Jul 2025
Viewed by 141
Abstract
Injurious birds pose a significant threat to food production and the agricultural economy. To address the challenges posed by their small size, irregular shape, and frequent occlusion in complex farmland environments, this paper proposes YOLO-AWK, an improved bird detection model based on YOLOv11n. Firstly, to improve the ability of the enhanced model to recognize bird targets in complex backgrounds, we introduce the in-scale feature interaction (AIFI) module to replace the original SPPF module. Secondly, to more accurately localize and identify bird targets of different shapes and sizes, we use WIoUv3 as a new loss function. Thirdly, to remove the noise interference and improve the extraction of bird residual features, we introduce the Kolmogorov–Arnold network (KAN) module. Finally, to improve the model’s detection accuracy for small bird targets, we add a small target detection head. The experimental results show that the detection performance of YOLO-AWK on the farmland bird dataset is significantly improved, and the final precision, recall, mAP@0.5, and mAP@0.5:0.95 reach 93.9%, 91.2%, 95.8%, and 75.3%, respectively, which outperforms the original model by 2.7, 2.3, 1.6, and 3.0 percentage points, respectively. These results demonstrate that the proposed method offers a reliable and efficient technical solution for farmland injurious bird monitoring. Full article
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)
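For context on the WIoUv3 loss mentioned above, the sketch below shows a generic WIoU-v1-style core term in PyTorch: the IoU loss is rescaled by a center-distance penalty normalized to the smallest enclosing box (WIoUv3 additionally applies a dynamic, non-monotonic focusing coefficient, omitted here). This is an illustrative reconstruction under assumed box format and function name, not the authors' implementation.

```python
import torch

def wiou_v1_loss(pred, target, eps=1e-7):
    """Hypothetical WIoU-v1-style box loss; boxes are (x1, y1, x2, y2) tensors of shape (N, 4)."""
    # Intersection and IoU
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Center distance, normalized by the size of the smallest enclosing box
    cx_p, cy_p = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cx_t, cy_t = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    enc_lt = torch.min(pred[:, :2], target[:, :2])
    enc_rb = torch.max(pred[:, 2:], target[:, 2:])
    enc_w, enc_h = enc_rb[:, 0] - enc_lt[:, 0], enc_rb[:, 1] - enc_lt[:, 1]
    dist = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    # The enclosing-box term is detached so it only rescales the gradient
    r_wiou = torch.exp(dist / (enc_w ** 2 + enc_h ** 2 + eps).detach())
    return (r_wiou * (1 - iou)).mean()
```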

21 pages, 2965 KiB  
Article
Inspection Method Enabled by Lightweight Self-Attention for Multi-Fault Detection in Photovoltaic Modules
by Shufeng Meng and Tianxu Xu
Electronics 2025, 14(15), 3019; https://doi.org/10.3390/electronics14153019 - 29 Jul 2025
Viewed by 203
Abstract
Bird-dropping fouling and hotspot anomalies remain the most prevalent and detrimental defects in utility-scale photovoltaic (PV) plants; their co-occurrence on a single module markedly curbs energy yield and accelerates irreversible cell degradation. However, markedly disparate visual–thermal signatures of the two phenomena impede high-fidelity concurrent detection in existing robotic inspection systems, while stringent onboard compute budgets also preclude the adoption of bulky detectors. To resolve this accuracy–efficiency trade-off for dual-defect detection, we present YOLOv8-SG, a lightweight yet powerful framework engineered for mobile PV inspectors. First, a rigorously curated multi-modal dataset—RGB for stains and long-wave infrared for hotspots—is assembled to enforce robust cross-domain representation learning. Second, the HSV color space is leveraged to disentangle chromatic and luminance cues, thereby stabilizing appearance variations across sensors. Third, a single-head self-attention (SHSA) block is embedded in the backbone to harvest long-range dependencies at negligible parameter cost, while a global context (GC) module is grafted onto the detection head to amplify fine-grained semantic cues. Finally, an auxiliary bounding box refinement term is appended to the loss to hasten convergence and tighten localization. Extensive field experiments demonstrate that YOLOv8-SG attains 86.8% mAP@0.5, surpassing the vanilla YOLOv8 by 2.7 pp while trimming 12.6% of parameters (18.8 MB). Grad-CAM saliency maps corroborate that the model’s attention consistently coincides with defect regions, underscoring its interpretability. The proposed method, therefore, furnishes PV operators with a practical low-latency solution for concurrent bird-dropping and hotspot surveillance. Full article
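As a concrete illustration of the HSV step described above (decoupling chromatic from luminance cues), a minimal OpenCV sketch might look as follows. The file name and the choice of histogram equalization on the V channel are assumptions, not the authors' exact preprocessing.

```python
import cv2

# Hypothetical preprocessing: separate chromatic (H, S) from luminance (V) cues so that
# stain color and hotspot brightness can be handled independently.
img = cv2.imread("pv_module.jpg")                       # assumed frame from the inspector
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)

# Equalize only the luminance channel to suppress sensor-dependent brightness drift,
# then recombine; hue/saturation (e.g., the color of bird droppings) are left untouched.
v_eq = cv2.equalizeHist(v)
stabilized = cv2.cvtColor(cv2.merge([h, s, v_eq]), cv2.COLOR_HSV2BGR)
```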

18 pages, 5309 KiB  
Article
LGM-YOLO: A Context-Aware Multi-Scale YOLO-Based Network for Automated Structural Defect Detection
by Chuanqi Liu, Yi Huang, Zaiyou Zhao, Wenjing Geng and Tianhong Luo
Processes 2025, 13(8), 2411; https://doi.org/10.3390/pr13082411 - 29 Jul 2025
Viewed by 150
Abstract
Ensuring the structural safety of steel trusses in escalators is critical for the reliable operation of vertical transportation systems. While manual inspection remains widely used, its dependence on human judgment leads to extended cycle times and variable defect-recognition rates, making it less reliable for identifying subtle surface imperfections. To address these limitations, a novel context-aware, multi-scale deep learning framework based on the YOLOv5 architecture is proposed, which is specifically designed for automated structural defect detection in escalator steel trusses. Firstly, a method called GIES is proposed to synthesize pseudo-multi-channel representations from single-channel grayscale images, which enhances the network’s channel-wise representation and mitigates issues arising from image noise and defocused blur. To further improve detection performance, a context enhancement pipeline is developed, consisting of a local feature module (LFM) for capturing fine-grained surface details and a global context module (GCM) for modeling large-scale structural deformations. In addition, a multi-scale feature fusion module (MSFM) is employed to effectively integrate spatial features across various resolutions, enabling the detection of defects with diverse sizes and complexities. Comprehensive testing on the NEU-DET and GC10-DET datasets reveals that the proposed method achieves 79.8% mAP on NEU-DET and 68.1% mAP on GC10-DET, outperforming the baseline YOLOv5s by 8.0% and 2.7%, respectively. Although challenges remain in identifying extremely fine defects such as crazing, the proposed approach offers improved accuracy while maintaining real-time inference speed. These results indicate the potential of the method for intelligent visual inspection in structural health monitoring and industrial safety applications. Full article
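The GIES idea of synthesizing pseudo-multi-channel inputs from a single grayscale image can be illustrated with a simple stand-in: stack the raw channel with a contrast-enhanced channel and an edge-response channel so a standard 3-channel detector receives richer input. This is a hedged sketch of the general concept, not the paper's GIES method; the file name and channel choices are assumptions.

```python
import cv2
import numpy as np

gray = cv2.imread("truss.png", cv2.IMREAD_GRAYSCALE)    # assumed single-channel input

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(gray)   # contrast channel
edges = cv2.Laplacian(cv2.GaussianBlur(gray, (3, 3), 0), cv2.CV_8U)       # edge channel
pseudo_rgb = np.dstack([gray, clahe, edges])                              # (H, W, 3) detector input
```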

23 pages, 7839 KiB  
Article
Automated Identification and Analysis of Cracks and Damage in Historical Buildings Using Advanced YOLO-Based Machine Vision Technology
by Kui Gao, Li Chen, Zhiyong Li and Zhifeng Wu
Buildings 2025, 15(15), 2675; https://doi.org/10.3390/buildings15152675 - 29 Jul 2025
Viewed by 161
Abstract
Structural cracks significantly threaten the safety and longevity of historical buildings, which are essential parts of cultural heritage. Conventional inspection techniques, which depend heavily on manual visual evaluations, tend to be inefficient and subjective. This research introduces an automated framework for crack and damage detection using advanced YOLO (You Only Look Once) models, aiming to improve both the accuracy and efficiency of monitoring heritage structures. A dataset comprising 2500 high-resolution images was gathered from historical buildings and categorized into four levels of damage: no damage, minor, moderate, and severe. Following preprocessing and data augmentation, a total of 5000 labeled images were utilized to train and evaluate four YOLO variants: YOLOv5, YOLOv8, YOLOv10, and YOLOv11. The models’ performances were measured using metrics such as precision, recall, mAP@50, mAP@50–95, as well as losses related to bounding box regression, classification, and distribution. Experimental findings reveal that YOLOv10 surpasses other models in multi-target detection and identifying minor damage, achieving higher localization accuracy and faster inference speeds. YOLOv8 and YOLOv11 demonstrate consistent performance and strong adaptability, whereas YOLOv5 converges rapidly but shows weaker validation results. Further testing confirms YOLOv10’s effectiveness across different structural components, including walls, beams, and ceilings. This study highlights the practicality of deep learning-based crack detection methods for preserving building heritage. Future advancements could include combining semantic segmentation networks (e.g., U-Net) with attention mechanisms to further refine detection accuracy in complex scenarios. Full article
(This article belongs to the Special Issue Structural Safety Evaluation and Health Monitoring)
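A comparison of this kind can be reproduced at a sketch level with the Ultralytics API, training each variant on the same data and reading back mAP@50 and mAP@50-95 from validation. The dataset YAML and weight file names below are assumptions, and YOLOv10/YOLO11 support depends on the installed Ultralytics version.

```python
from ultralytics import YOLO

# Minimal sketch: train several YOLO generations on the same crack dataset and compare
# validation metrics, broadly mirroring the study's benchmarking setup.
variants = ["yolov5nu.pt", "yolov8n.pt", "yolov10n.pt", "yolo11n.pt"]
for weights in variants:
    model = YOLO(weights)
    model.train(data="cracks.yaml", epochs=100, imgsz=640)   # assumed dataset config
    metrics = model.val()
    print(weights, metrics.box.map50, metrics.box.map)       # mAP@50 and mAP@50-95
```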

24 pages, 14323 KiB  
Article
GTDR-YOLOv12: Optimizing YOLO for Efficient and Accurate Weed Detection in Agriculture
by Zhaofeng Yang, Zohaib Khan, Yue Shen and Hui Liu
Agronomy 2025, 15(8), 1824; https://doi.org/10.3390/agronomy15081824 - 28 Jul 2025
Viewed by 275
Abstract
Weed infestation contributes significantly to global agricultural yield loss and increases the reliance on herbicides, raising both economic and environmental concerns. Effective weed detection in agriculture requires high accuracy and architectural efficiency. This is particularly important under challenging field conditions, including densely clustered targets, small weed instances, and low visual contrast between vegetation and soil. In this study, we propose GTDR-YOLOv12, an improved object detection framework based on YOLOv12, tailored for real-time weed identification in complex agricultural environments. The model is evaluated on the publicly available Weeds Detection dataset, which contains a wide range of weed species and challenging visual scenarios. To achieve better accuracy and efficiency, GTDR-YOLOv12 introduces several targeted structural enhancements. The backbone incorporates GDR-Conv, which integrates Ghost convolution and Dynamic ReLU (DyReLU) to improve early-stage feature representation while reducing redundancy. The GTDR-C3 module combines GDR-Conv with Task-Dependent Attention Mechanisms (TDAMs), allowing the network to adaptively refine spatial features critical for accurate weed identification and localization. In addition, the Lookahead optimizer is employed during training to improve convergence efficiency and reduce computational overhead, thereby contributing to the model’s lightweight design. GTDR-YOLOv12 outperforms several representative detectors, including YOLOv7, YOLOv9, YOLOv10, YOLOv11, YOLOv12, ATSS, RTMDet and Double-Head. Compared with YOLOv12, GTDR-YOLOv12 achieves notable improvements across multiple evaluation metrics. Precision increases from 85.0% to 88.0%, recall from 79.7% to 83.9%, and F1-score from 82.3% to 85.9%. In terms of detection accuracy, mAP:0.5 improves from 87.0% to 90.0%, while mAP:0.5:0.95 rises from 58.0% to 63.8%. Furthermore, the model reduces computational complexity. GFLOPs drop from 5.8 to 4.8, and the number of parameters is reduced from 2.51 M to 2.23 M. These reductions reflect a more efficient network design that not only lowers model complexity but also enhances detection performance. With a throughput of 58 FPS on the NVIDIA Jetson AGX Xavier, GTDR-YOLOv12 proves both resource-efficient and deployable for practical, real-time weeding tasks in agricultural settings. Full article
(This article belongs to the Section Weed Science and Weed Management)
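The GDR-Conv block couples Ghost convolution with Dynamic ReLU; the Ghost-convolution half of that pairing is standard and is sketched below in PyTorch (DyReLU is not reproduced). The module name and layer choices are generic, not the authors' code.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution sketch (GhostNet-style): a small primary conv produces half the
    output channels, and a cheap depthwise conv generates the remaining 'ghost' features.
    Assumes an even number of output channels."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_hidden = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_hidden, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_hidden), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(c_hidden, c_hidden, 3, 1, 1, groups=c_hidden, bias=False),
            nn.BatchNorm2d(c_hidden), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```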

25 pages, 4296 KiB  
Article
StripSurface-YOLO: An Enhanced Yolov8n-Based Framework for Detecting Surface Defects on Strip Steel in Industrial Environments
by Haomin Li, Huanzun Zhang and Wenke Zang
Electronics 2025, 14(15), 2994; https://doi.org/10.3390/electronics14152994 - 27 Jul 2025
Viewed by 329
Abstract
Recent advances in precision manufacturing and high-end equipment technologies have imposed ever more stringent requirements on the accuracy, real-time performance, and lightweight design of online steel strip surface defect detection systems. To reconcile the persistent trade-off between detection precision and inference efficiency in complex industrial environments, this study proposes StripSurface-YOLO, a novel real-time defect detection framework built upon YOLOv8n. The core architecture integrates an Efficient Cross-Stage Local Perception module (ResGSCSP), which synergistically combines GSConv lightweight convolutions with a one-shot aggregation strategy, thereby markedly reducing both model parameters and computational complexity. To further enhance multi-scale feature representation, this study introduces an Efficient Multi-Scale Attention (EMA) mechanism at the feature-fusion stage, enabling the network to more effectively attend to critical defect regions. Moreover, conventional nearest-neighbor upsampling is replaced by DySample, which produces deeper, high-resolution feature maps enriched with semantic content, improving both inference speed and fusion quality. To heighten sensitivity to small-scale and low-contrast defects, the model adopts Focal Loss, dynamically adjusting to sample difficulty. Extensive evaluations on the NEU-DET dataset demonstrate that StripSurface-YOLO reduces FLOPs by 11.6% and parameter count by 7.4% relative to the baseline YOLOv8n, while achieving respective improvements of 1.4%, 3.1%, 4.1%, and 3.0% in precision, recall, mAP50, and mAP50:95. Under adverse conditions—including contrast variations, brightness fluctuations, and Gaussian noise—StripSurface-YOLO outperforms the baseline model, delivering improvements of 5.0% in mAP50 and 4.7% in mAP50:95, attesting to the model’s robust interference resistance. These findings underscore the potential of StripSurface-YOLO to meet the rigorous performance demands of real-time surface defect detection in the metal forging industry. Full article
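The Focal Loss adopted here for small, low-contrast defects is the standard formulation of Lin et al.; a minimal binary version is sketched below for reference (the function name and default hyperparameters are generic, not taken from the paper).

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Standard binary focal loss: down-weights easy examples so rare, hard defects
    dominate the gradient. logits and targets share the same shape; targets are 0/1."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)              # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```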

27 pages, 6143 KiB  
Article
Optical Character Recognition Method Based on YOLO Positioning and Intersection Ratio Filtering
by Kai Cui, Qingpo Xu, Yabin Ding, Jiangping Mei, Ying He and Haitao Liu
Symmetry 2025, 17(8), 1198; https://doi.org/10.3390/sym17081198 - 27 Jul 2025
Viewed by 191
Abstract
Driven by the rapid development of e-commerce and intelligent logistics, the volume of express delivery services has surged, making the efficient and accurate identification of shipping information a core requirement for automatic sorting systems. However, traditional Optical Character Recognition (OCR) technology struggles to meet the accuracy and real-time demands of complex logistics scenarios due to challenges such as image distortion, uneven illumination, and field overlap. This paper proposes a three-level collaborative recognition method based on deep learning that facilitates structured information extraction through regional normalization, dual-path parallel extraction, and a dynamic matching mechanism. First, geometric distortion is corrected through contour detection, and a lightweight direction classification model normalizes region orientation. Second, by integrating the enhanced YOLOv5s for key area localization with the upgraded PaddleOCR for full-text character extraction, a dual-path parallel architecture for positioning and recognition has been constructed. Finally, a dynamic space–semantic joint matching module has been designed that incorporates anti-offset IoU metrics and hierarchical semantic regularization constraints, thereby enhancing matching robustness through density-adaptive weight adjustment. Experimental results indicate that the accuracy of this method on a self-constructed dataset is 89.5%, with an F1 score of 90.1%, representing a 24.2% improvement over traditional OCR methods. The dynamic matching mechanism elevates the average accuracy of YOLOv5s from 78.5% to 89.7%, surpassing the Faster R-CNN benchmark model while maintaining a real-time processing efficiency of 76 FPS. This study offers a lightweight and highly robust solution for the efficient extraction of order information in complex logistics scenarios, significantly advancing the intelligent upgrading of sorting systems. Full article
(This article belongs to the Section Physics)
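The dynamic space-semantic matching stage pairs YOLO-detected key fields with OCR text boxes. A plain-IoU greedy matcher (without the paper's anti-offset metric or semantic constraints) conveys the basic mechanism; the names, box format, and threshold below are assumptions.

```python
def iou(a, b):
    """Plain IoU between two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-7)

def match_fields(yolo_regions, ocr_lines, thr=0.3):
    """Greedy assignment of OCR text lines to YOLO-detected key fields by best IoU.
    yolo_regions: {field_name: box}; ocr_lines: list of (box, text)."""
    matched = {}
    for name, region in yolo_regions.items():
        best = max(ocr_lines, key=lambda item: iou(region, item[0]), default=None)
        if best is not None and iou(region, best[0]) >= thr:
            matched[name] = best[1]
    return matched
```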

30 pages, 92065 KiB  
Article
A Picking Point Localization Method for Table Grapes Based on PGSS-YOLOv11s and Morphological Strategies
by Jin Lu, Zhongji Cao, Jin Wang, Zhao Wang, Jia Zhao and Minjie Zhang
Agriculture 2025, 15(15), 1622; https://doi.org/10.3390/agriculture15151622 - 26 Jul 2025
Viewed by 252
Abstract
During the automated picking of table grapes, the automatic recognition and segmentation of grape pedicels, along with the positioning of picking points, are vital components for all subsequent operations of the harvesting robot. In a real grape plantation, however, it is extremely difficult to accurately and efficiently identify and segment grape pedicels and then reliably locate the picking points. This is attributable to the low distinguishability between grape pedicels and surrounding structures such as branches, to the effects of weather, lighting, and occlusion, and to the requirement that the model be deployable on edge devices with limited computing resources. To address these issues, this study proposes a novel picking point localization method for table grapes based on an instance segmentation network called Progressive Global-Local Structure-Sensitive Segmentation (PGSS-YOLOv11s) and a simple combination strategy of morphological operators. More specifically, the network PGSS-YOLOv11s is composed of the original YOLOv11s-seg backbone, a spatial feature aggregation module (SFAM), an adaptive feature fusion module (AFFM), and a detail-enhanced convolutional shared detection head (DE-SCSH). PGSS-YOLOv11s was trained on a new grape segmentation dataset called Grape-⊥, which includes 4455 grape pixel-level instances with the annotation of ⊥-shaped regions. After PGSS-YOLOv11s segments the ⊥-shaped regions of grapes, morphological operations such as erosion, dilation, and skeletonization are combined to effectively extract grape pedicels and locate picking points. Finally, several experiments were conducted to confirm the validity, effectiveness, and superiority of the proposed method. Compared with other state-of-the-art models, the main metrics F1 score and mask mAP@0.5 of PGSS-YOLOv11s reached 94.6% and 95.2% on the Grape-⊥ dataset, as well as 85.4% and 90.0% on the Winegrape dataset. Multi-scenario tests indicated that the success rate of positioning the picking points reached up to 89.44%. In orchards, real-time tests on the edge device demonstrated the practical performance of our method. Nevertheless, for grapes with short or occluded pedicels, the designed morphological algorithm sometimes failed to compute picking points. In future work, we will enrich the grape dataset by collecting images under different lighting conditions, from various shooting angles, and including more grape varieties to improve the method’s generalization performance. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
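The morphological stage (erosion, dilation, skeletonization) can be illustrated with OpenCV and scikit-image: the sketch below cleans a pedicel mask, skeletonizes it, and takes the skeleton midpoint as the picking point. This is a simplified illustration under an assumed file name, not the paper's exact operator combination.

```python
import cv2
import numpy as np
from skimage.morphology import skeletonize

mask = cv2.imread("pedicel_mask.png", cv2.IMREAD_GRAYSCALE) > 127     # binary instance mask
kernel = np.ones((3, 3), np.uint8)
clean = cv2.dilate(cv2.erode(mask.astype(np.uint8), kernel), kernel)  # opening removes specks

skeleton = skeletonize(clean.astype(bool))                            # 1-pixel-wide pedicel curve
ys, xs = np.nonzero(skeleton)
order = np.argsort(ys)                                                # walk the skeleton top to bottom
mid = order[len(order) // 2]
print("picking point (x, y):", xs[mid], ys[mid])
```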

21 pages, 4863 KiB  
Article
Detection Model for Cotton Picker Fire Recognition Based on Lightweight Improved YOLOv11
by Zhai Shi, Fangwei Wu, Changjie Han, Dongdong Song and Yi Wu
Agriculture 2025, 15(15), 1608; https://doi.org/10.3390/agriculture15151608 - 25 Jul 2025
Viewed by 264
Abstract
In response to the limited research on fire detection in cotton pickers and the issue of low detection accuracy in visual inspection, this paper proposes a computer vision-based detection method. The method is optimized according to the structural characteristics of cotton pickers, and a lightweight improved YOLOv11 algorithm is designed for cotton fire detection in cotton pickers. The backbone of the model is replaced with the MobileNetV2 network to achieve effective model lightweighting. In addition, the convolutional layers in the original C3k2 block are optimized using partial convolutions to reduce computational redundancy and improve inference efficiency. Furthermore, a visual attention mechanism named CBAM-ECA (Convolutional Block Attention Module-Efficient Channel Attention) is designed to suit the complex working conditions of cotton pickers. This mechanism aims to enhance the model’s feature extraction capability under challenging environmental conditions, thereby improving overall detection accuracy. To further improve localization performance and accelerate convergence, the loss function is also modified. These improvements enable the model to achieve higher precision in fire detection while ensuring fast and accurate localization. Experimental results demonstrate that the improved model reduces the number of parameters by 38%, increases the frame processing speed (FPS) by 13.2%, and decreases the computational complexity (GFLOPs) by 42.8%, compared to the original model. The detection accuracy for flaming combustion, smoldering combustion, and overall detection is improved by 1.4%, 3%, and 1.9%, respectively, with an increase of 2.4% in mAP (mean average precision). Compared to other models—YOLOv3-tiny, YOLOv5, YOLOv8, and YOLOv10—the proposed method achieves higher detection accuracy by 5.9%, 7%, 5.9%, and 5.3%, respectively, and shows improvements in mAP by 5.4%, 5%, 4.8%, and 6.3%. The improved detection algorithm maintains high accuracy while achieving faster inference speed and fewer model parameters. These improvements lay a solid foundation for fire prevention and suppression in cotton collection boxes on cotton pickers. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
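The CBAM-ECA attention designed in this paper is a custom hybrid; for orientation, the standard ECA ingredient is sketched below, where a tiny 1-D convolution over pooled channel statistics produces the channel weights (generic code, not the authors' module).

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: a 1-D convolution over the pooled channel descriptor
    replaces the fully connected layers of SE, keeping the block nearly parameter-free."""
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                          # x: (B, C, H, W)
        w = x.mean(dim=(2, 3))                     # global average pool -> (B, C)
        w = self.conv(w.unsqueeze(1)).squeeze(1)   # local cross-channel interaction
        return x * torch.sigmoid(w)[:, :, None, None]
```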

25 pages, 9119 KiB  
Article
An Improved YOLOv8n-Based Method for Detecting Rice Shelling Rate and Brown Rice Breakage Rate
by Zhaoyun Wu, Yehao Zhang, Zhongwei Zhang, Fasheng Shen, Li Li, Xuewu He, Hongyu Zhong and Yufei Zhou
Agriculture 2025, 15(15), 1595; https://doi.org/10.3390/agriculture15151595 - 24 Jul 2025
Viewed by 248
Abstract
Accurate and real-time detection of rice shelling rate (SR) and brown rice breakage rate (BR) is crucial for intelligent hulling sorting but remains challenging because of small grain size, dense adhesion, and uneven illumination causing missed detections and blurred boundaries in traditional YOLOv8n. This paper proposes a high-precision, lightweight solution based on an enhanced YOLOv8n with improvements in network architecture, feature fusion, and attention mechanism. The backbone’s C2f module is replaced with C2f-Faster-CGLU, integrating partial convolution (PConv) local convolution and convolutional gated linear unit (CGLU) gating to reduce computational redundancy via sparse interaction and enhance small-target feature extraction. A bidirectional feature pyramid network (BiFPN) weights multiscale feature fusion to improve edge positioning accuracy of dense grains. Attention mechanism for fine-grained classification (AFGC) is embedded to focus on texture and damage details, enhancing adaptability to light fluctuations. The Detect_Rice lightweight head compresses parameters via group normalization and dynamic convolution sharing, optimizing small-target response. The improved model achieved 96.8% precision and 96.2% mAP. Combined with a quantity–mass model, SR/BR detection errors reduced to 1.11% and 1.24%, meeting national standard (GB/T 29898-2013) requirements, providing an effective real-time solution for intelligent hulling sorting. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
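The BiFPN contribution of weighted multiscale fusion reduces, at each fusion node, to learnable non-negative weights normalized to sum to one ("fast normalized fusion"). A minimal PyTorch sketch of one such node follows; the class name is an assumption, and resizing of inputs to a common resolution is omitted.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """BiFPN-style fast normalized fusion of same-resolution feature maps."""
    def __init__(self, n_inputs):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))

    def forward(self, feats):                      # feats: list of (B, C, H, W) tensors
        w = torch.relu(self.w)                     # keep weights non-negative
        w = w / (w.sum() + 1e-4)                   # normalize so they sum to one
        return sum(wi * f for wi, f in zip(w, feats))
```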

21 pages, 5181 KiB  
Article
TEB-YOLO: A Lightweight YOLOv5-Based Model for Bamboo Strip Defect Detection
by Xipeng Yang, Chengzhi Ruan, Fei Yu, Ruxiao Yang, Bo Guo, Jun Yang, Feng Gao and Lei He
Forests 2025, 16(8), 1219; https://doi.org/10.3390/f16081219 - 24 Jul 2025
Viewed by 291
Abstract
The accurate detection of surface defects in bamboo is critical to maintaining product quality. Traditional inspection methods rely heavily on manual labor, making the manufacturing process labor-intensive and error-prone. To overcome these limitations, TEB-YOLO is introduced in this paper, a lightweight and efficient defect detection model based on YOLOv5s. Firstly, EfficientViT replaces the original YOLOv5s backbone, reducing the computational cost while improving feature extraction. Secondly, BiFPN is adopted in place of PANet to enhance multi-scale feature fusion and preserve detailed information. Thirdly, an Efficient Local Attention (ELA) mechanism is embedded in the backbone to strengthen local feature representation. Lastly, the original CIoU loss is replaced with EIoU loss to enhance localization precision. The proposed model achieves a precision of 91.7% with only 10.5 million parameters, marking a 5.4% accuracy improvement and a 22.9% reduction in parameters compared to YOLOv5s. Compared with other mainstream models including YOLOv5n, YOLOv7, YOLOv8n, YOLOv9t, and YOLOv9s, TEB-YOLO achieves precision improvements of 11.8%, 1.66%, 2.0%, 2.8%, and 1.1%, respectively. The experiment results show that TEB-YOLO significantly improves detection precision and model lightweighting, offering a practical and effective solution for real-time bamboo surface defect detection. Full article
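Replacing CIoU with EIoU changes the regression penalty from an aspect-ratio term to direct width and height difference terms. A generic EIoU sketch is given below, based on the published formulation as commonly described rather than the authors' code; box format and function name are assumptions.

```python
import torch

def eiou_loss(pred, target, eps=1e-7):
    """EIoU-style box loss: 1 - IoU plus center-distance, width, and height penalties,
    each normalized by the smallest enclosing box. Boxes are (x1, y1, x2, y2), shape (N, 4)."""
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=1)
    w_p, h_p = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w_t, h_t = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    iou = inter / (w_p * h_p + w_t * h_t - inter + eps)

    # Smallest enclosing box
    enc_lt = torch.min(pred[:, :2], target[:, :2])
    enc_rb = torch.max(pred[:, 2:], target[:, 2:])
    cw, ch = enc_rb[:, 0] - enc_lt[:, 0], enc_rb[:, 1] - enc_lt[:, 1]

    # Center-distance penalty plus direct width/height difference penalties
    d_center = ((pred[:, 0] + pred[:, 2]) - (target[:, 0] + target[:, 2])) ** 2 / 4 \
             + ((pred[:, 1] + pred[:, 3]) - (target[:, 1] + target[:, 3])) ** 2 / 4
    loss = 1 - iou + d_center / (cw ** 2 + ch ** 2 + eps) \
         + (w_p - w_t) ** 2 / (cw ** 2 + eps) + (h_p - h_t) ** 2 / (ch ** 2 + eps)
    return loss.mean()
```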

22 pages, 5154 KiB  
Article
BCS_YOLO: Research on Corn Leaf Disease and Pest Detection Based on YOLOv11n
by Shengnan Hao, Erjian Gao, Zhanlin Ji and Ivan Ganchev
Appl. Sci. 2025, 15(15), 8231; https://doi.org/10.3390/app15158231 - 24 Jul 2025
Viewed by 216
Abstract
Frequent corn leaf diseases and pests pose serious threats to agricultural production. Traditional manual detection methods suffer from significant limitations in both performance and efficiency. To address this, the present paper proposes a novel biotic condition screening (BCS) model for the detection of corn leaf diseases and pests, called BCS_YOLO, based on the You Only Look Once version 11n (YOLOv11n). The proposed model enables accurate detection and classification of various corn leaf pathologies and pest infestations under challenging agricultural field conditions. It achieves this thanks to three key newly designed modules—a Self-Perception Coordinated Global Attention (SPCGA) module, a High/Low-Frequency Feature Enhancement (HLFFE) module, and a Local Attention Enhancement (LAE) module. The SPCGA module improves the model’s ability to perceive fine-grained targets by fusing multiple attention mechanisms. The HLFFE module adopts a frequency domain separation strategy to strengthen edge delineation and structural detail representation in affected areas. The LAE module effectively improves the model’s discrimination ability between targets and backgrounds through local importance calculation and intensity adjustment mechanisms. Conducted experiments show that BCS_YOLO achieves 78.4%, 73.7%, 76.0%, and 82.0% in precision, recall, F1 score, and mAP@50, respectively, representing corresponding improvements of 3.0%, 3.3%, 3.2%, and 4.6% compared to the baseline model (YOLOv11n), while also outperforming the mainstream object detection models. In summary, the proposed BCS_YOLO model provides a practical and scalable solution for efficient detection of corn leaf diseases and pests in complex smart-agriculture scenarios, demonstrating significant theoretical and application value. Full article
(This article belongs to the Special Issue Innovations in Artificial Neural Network Applications)
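The HLFFE module works by separating high- and low-frequency content. A toy spatial-domain analogue, shown below, isolates low frequencies with a depthwise box blur and treats the residual as the high-frequency (edge and lesion-texture) part; this is an illustration of the general idea only, not the paper's frequency-domain module.

```python
import torch
import torch.nn.functional as F

def split_frequencies(x, k=5):
    """Toy frequency split of a feature map or image tensor (B, C, H, W): a per-channel
    box blur keeps the low-frequency content, the residual carries edges and fine texture."""
    pad = k // 2
    weight = torch.ones(x.shape[1], 1, k, k, device=x.device) / (k * k)  # depthwise box filter
    low = F.conv2d(x, weight, padding=pad, groups=x.shape[1])
    high = x - low
    return low, high
```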

18 pages, 4203 KiB  
Article
SRW-YOLO: A Detection Model for Environmental Risk Factors During the Grid Construction Phase
by Yu Zhao, Fei Liu, Qiang He, Fang Liu, Xiaohu Sun and Jiyong Zhang
Remote Sens. 2025, 17(15), 2576; https://doi.org/10.3390/rs17152576 - 24 Jul 2025
Viewed by 250
Abstract
With the rapid advancement of UAV-based remote sensing and image recognition techniques, identifying environmental risk factors from aerial imagery has emerged as a focal point in intelligent inspection during the power transmission and distribution projects construction phase. The uneven spatial distribution of risk factors on construction sites, their weak texture signatures, and the inherently multi-scale nature of UAV imagery pose significant detection challenges. To address these issues, we propose a one-stage SRW-YOLO algorithm built upon the YOLOv11 framework. First, a P2-scale shallow feature detection layer is added to capture high-resolution fine details of small targets. Second, we integrate a reparameterized convolution based on channel shuffle (RCS) of a one-shot aggregation (RCS-OSA) module into the backbone and neck’s shallow layers, enhancing feature extraction while significantly reducing inference latency. Finally, a dynamic non-monotonic focusing mechanism WIoU v3 loss function is employed to reweigh low-quality annotations, thereby improving small-object localization accuracy. Experimental results demonstrate that SRW-YOLO achieves an overall precision of 80.6% and mAP of 79.1% on the State Grid dataset, and exhibits similarly superior performance on the VisDrone2019 dataset. Compared with other one-stage detectors, SRW-YOLO delivers markedly higher detection accuracy, offering critical technical support for multi-scale, heterogeneous environmental risk monitoring during the power transmission and distribution projects construction phase, and establishes the theoretical foundation for rapid and accurate inspection using UAV-based intelligent imaging. Full article
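The "channel shuffle" ingredient of the RCS-OSA module is the standard ShuffleNet operation, which interleaves channels across groups so grouped and reparameterized branches can exchange information. A minimal PyTorch version follows (generic, not the authors' code).

```python
import torch

def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle: reshape channels into (groups, c // groups),
    transpose, and flatten back so channels from different groups are interleaved."""
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)
```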

17 pages, 2072 KiB  
Article
Barefoot Footprint Detection Algorithm Based on YOLOv8-StarNet
by Yujie Shen, Xuemei Jiang, Yabin Zhao and Wenxin Xie
Sensors 2025, 25(15), 4578; https://doi.org/10.3390/s25154578 - 24 Jul 2025
Viewed by 271
Abstract
This study proposes an optimized footprint recognition model based on an enhanced StarNet architecture for biometric identification in the security, medical, and criminal investigation fields. Conventional image recognition algorithms exhibit limitations in processing barefoot footprint images characterized by concentrated feature distributions and rich texture patterns. To address this, our framework integrates an improved StarNet into the backbone of YOLOv8 architecture. Leveraging the unique advantages of element-wise multiplication, the redesigned backbone efficiently maps inputs to a high-dimensional nonlinear feature space without increasing channel dimensions, achieving enhanced representational capacity with low computational latency. Subsequently, an Encoder layer facilitates feature interaction within the backbone through multi-scale feature fusion and attention mechanisms, effectively extracting rich semantic information while maintaining computational efficiency. In the feature fusion part, a feature modulation block processes multi-scale features by synergistically combining global and local information, thereby reducing redundant computations and decreasing both parameter count and computational complexity to achieve model lightweighting. Experimental evaluations on a proprietary barefoot footprint dataset demonstrate that the proposed model exhibits significant advantages in terms of parameter efficiency, recognition accuracy, and computational complexity. The number of parameters has been reduced by 0.73 million, further improving the model’s speed. Gflops has been reduced by 1.5, lowering the performance requirements for computational hardware during model deployment. Recognition accuracy has reached 99.5%, with further improvements in model precision. Future research will explore how to capture shoeprint images with complex backgrounds from shoes worn at crime scenes, aiming to further enhance the model’s recognition capabilities in more forensic scenarios. Full article
(This article belongs to the Special Issue Transformer Applications in Target Tracking)
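The element-wise multiplication the abstract refers to is StarNet's "star operation": two parallel pointwise projections are multiplied, implicitly lifting features into a higher-dimensional nonlinear space without adding channels. The block below is a hedged sketch of that idea, not the improved StarNet backbone used in the paper; layer choices and the residual wiring are assumptions.

```python
import torch.nn as nn

class StarBlock(nn.Module):
    """Sketch of a star-operation block: act(f1(x)) * f2(x), then a projection and residual."""
    def __init__(self, c):
        super().__init__()
        self.f1 = nn.Conv2d(c, c, 1)
        self.f2 = nn.Conv2d(c, c, 1)
        self.act = nn.ReLU6()
        self.proj = nn.Conv2d(c, c, 1)

    def forward(self, x):
        return x + self.proj(self.act(self.f1(x)) * self.f2(x))   # element-wise product
```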

26 pages, 2261 KiB  
Article
Real-Time Fall Monitoring for Seniors via YOLO and Voice Interaction
by Eugenia Tîrziu, Ana-Mihaela Vasilevschi, Adriana Alexandru and Eleonora Tudora
Future Internet 2025, 17(8), 324; https://doi.org/10.3390/fi17080324 - 23 Jul 2025
Viewed by 168
Abstract
In the context of global demographic aging, falls among the elderly remain a major public health concern, often leading to injury, hospitalization, and loss of autonomy. This study proposes a real-time fall detection system that combines a modern computer vision model, YOLOv11 with integrated pose estimation, and an Artificial Intelligence (AI)-based voice assistant designed to reduce false alarms and improve intervention efficiency and reliability. The system continuously monitors human posture via video input, detects fall events based on body dynamics and keypoint analysis, and initiates a voice-based interaction to assess the user’s condition. Depending on the user’s verbal response or the absence thereof, the system determines whether to trigger an emergency alert to caregivers or family members. All processing, including speech recognition and response generation, is performed locally to preserve user privacy and ensure low-latency performance. The approach is designed to support independent living for older adults. Evaluation of 200 simulated video sequences acquired by the development team demonstrated high precision and recall, along with a decrease in false positives when incorporating voice-based confirmation. In addition, the system was also evaluated on an external dataset to assess its robustness. Our results highlight the system’s reliability and scalability for real-world in-home elderly monitoring applications. Full article
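At a sketch level, the detection side of such a system can be built on Ultralytics pose models; the snippet below flags a candidate fall from a simple bounding-box aspect-ratio heuristic, after which the paper's voice-interaction step would take over. The weight file name, camera index, and threshold are assumptions, and the actual system uses richer keypoint dynamics.

```python
import cv2
from ultralytics import YOLO

model = YOLO("yolo11n-pose.pt")          # assumed pose-estimation weights
cap = cv2.VideoCapture(0)                # assumed local camera

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = model(frame, verbose=False)[0]
    for box in result.boxes.xyxy:        # one box per detected person
        x1, y1, x2, y2 = box.tolist()
        if (x2 - x1) > 1.3 * (y2 - y1):  # lying-posture heuristic
            print("possible fall detected")   # voice-based confirmation would start here
cap.release()
```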
