MDPI - Publisher of Open Access Journals

23 pages, 3177 KB

Open Access Article

CMA-YOLO: A Network for Wind Turbine Blade Surface Defect Detection with Multi-Scale Features and Dual Attention

by Weining Li, Songsong Li, Xingshuo Yue, Xu Wang, Yuhang Zhu and Xiaoming Chen

Information 2026, 17(5), 512; https://doi.org/10.3390/info17050512 - 21 May 2026

Viewed by 146

This paper introduces CMA-YOLO, a network that integrates multi-scale features with dual attention mechanisms to address weak feature representation, low detection accuracy, and loss of fine-grained details in deep networks for wind turbine blade surface defect detection. First, we construct the C2MSA module by designing a Multi-scale Feature-enhanced Attention Convolution Mix (MS-ACmix) based on ACmix and embedding it into the C2PSA block. This lets the network capture local and global contextual features, strengthening multi-scale target recognition and lowering missed detections. Second, we devise a Monte Carlo Dual Attention (MCDA) mechanism combining random sampling with dual attention. This approach retains the regularization benefits of the Monte Carlo method while leveraging dual attention selection, enabling improved detection accuracy with low computational cost. Finally, we substitute the original downsampling layers in the backbone and neck with the ADown module. This lightweight design, together with efficient feature extraction and fusion, reduces fine-grained detail loss and improves defect detection capability. Quantitative results reveal that, compared to YOLO11n, CMA-YOLO yields improvements of 3.4% in mAP@0.5, 6.1% in mAP@0.5:0.95, and 8.8% in recall, with a 0.7 GFLOPs reduction in computational cost, thus validating the proposed algorithm. Overall, CMA-YOLO provides a lightweight and effective approach for inspecting blade surface defects on wind turbines operating in resource-limited settings. Full article

(This article belongs to the Section Artificial Intelligence)

17 pages, 22288 KB

Open Access Article

Surface Defect Detection of Copper Tube Based on YOLOX with Convolutional Block Attention and Adaptive Spatial Feature Fusion

by Jianjun He and Ji Wang

Appl. Sci. 2026, 16(10), 5155; https://doi.org/10.3390/app16105155 - 21 May 2026

Viewed by 92

Abstract

Surface defect detection technology is important for improving product quality and reducing production costs. To automate the detection of copper tube surface defects, we propose an improved YOLOX algorithm based on convolutional block attention (CBA) and adaptive spatial feature fusion (ASFF), named CBA-ASFF-YOLOX. In the improved YOLOX, the backbone feature extraction network is replaced with CSPDarknet-53. We then add a convolutional block attention module and an adaptive spatial feature fusion module to the feature fusion stage, which strengthens the spatial position correlation between features by learning the connections between different feature maps. To address the imbalance among ground-truth label samples, we replace the cross-entropy loss function with Focal Loss. In addition, the bounding box regression loss is based on the SIoU formulation, and a simplified variant is adopted to improve regression quality and facilitate more stable convergence. Finally, the algorithm is applied to the copper tube surface defect detection task. Experimental results show that CBA-ASFF-YOLOX achieves higher accuracy than the other YOLO-series algorithms it is compared with.
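
As an illustration of the Focal Loss mentioned in this abstract, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names and the default values gamma = 2.0 and alpha = 0.25 are assumptions for illustration; the abstract does not state the settings the authors used.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy for one sample; p is the predicted positive-class probability."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(min(max(p_t, eps), 1.0))


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one sample (gamma and alpha defaults are illustrative)."""
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):
        print(p, binary_cross_entropy(p, 1), binary_focal_loss(p, 1))
```

Relative to cross-entropy, the modulating factor down-weights easy samples far more than hard ones, which is why the loss is often used when a few sample types dominate training.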

24 pages, 27236 KB

Open Access Article

WFSCA-YOLO: Robust Object Detection for Terrestrial Optical Sensing Under Atmospheric Degradation via a Wavelet-Driven Frequency–Spatial Co-Awareness Framework

by Jiabao Yan, Qihang Xu, Zhian Zheng, Xian-Hua Han, Junjie Zhu and Yanhua Lin

Remote Sens. 2026, 18(10), 1667; https://doi.org/10.3390/rs18101667 - 21 May 2026

Viewed by 66

Abstract

Optical object detection under fog-induced atmospheric degradation remains a challenging problem for terrestrial sensing and monitoring systems. Atmospheric scattering reduces image contrast and attenuates high-frequency edge and texture features that are important for precise object localization, while standard downsampling in convolutional neural networks (CNNs) further amplifies this information loss during feature extraction. Existing spatial-domain methods largely improve pixel appearance or feature refinement without explicitly preserving fog-weakened high-frequency edge and texture features during feature extraction. To address this issue, we propose WFSCA-YOLO, a frequency-aware and feature-preserving detection framework with cross-domain fusion between frequency-domain details and spatial semantic responses. The framework introduces the Wavelet-driven Frequency–spatial Co-awareness Block (WFSCA-Block) into YOLOv8, where the Discrete Wavelet Transform (DWT) is used to decompose feature maps into multi-directional high-frequency subbands and preserve high-frequency edge and texture features degraded by atmospheric scattering. A Cross-Domain Feature Selector (CDFS) is further designed to adaptively recalibrate the fusion of frequency-domain details and spatial semantic responses under varying visibility conditions. Experiments on synthetic and real-world degraded optical benchmarks from near-ground scenes, namely Foggy Cityscapes and RTTS, show that WFSCA-YOLO consistently outperforms representative state-of-the-art methods, achieving 50.3% mAP@50 on Foggy Cityscapes (2.1 percentage points above the baseline) and a mean mAP@50 of 79.28% on RTTS over three independent runs. Under a unified FP32 batch-1 inference benchmark, WFSCA-YOLO runs at 134.76 FPS on an RTX 4090D, indicating real-time capability with only a slight latency increase relative to the YOLOv8-s baseline. 
These results indicate that preserving high-frequency edge and texture features is an effective strategy for robust perception under degraded visibility and offers practical potential for terrestrial sensing and monitoring platforms.
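
To make the wavelet decomposition step concrete, here is a minimal sketch of a single-level 2D Discrete Wavelet Transform on one feature map. It assumes the Haar wavelet, which the abstract does not specify, and the subband names (`ll`, `col_diff`, `row_diff`, `diag`) are illustrative labels rather than the authors' terminology.

```python
def haar_dwt2(x):
    """Single-level 2D Haar DWT of a 2D list with even height and width.

    Returns one low-frequency band and three high-frequency detail bands,
    each half the size of the input in both dimensions.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    bands = {"ll": [], "col_diff": [], "row_diff": [], "diag": []}
    for i in range(0, h, 2):
        rows = {name: [] for name in bands}
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            rows["ll"].append((a + b + c + d) / 2)        # local average
            rows["col_diff"].append((a - b + c - d) / 2)  # responds to vertical edges
            rows["row_diff"].append((a + b - c - d) / 2)  # responds to horizontal edges
            rows["diag"].append((a - b - c + d) / 2)      # responds to diagonal detail
        for name in bands:
            bands[name].append(rows[name])
    return bands


if __name__ == "__main__":
    feature_map = [[0, 1, 0, 1],
                   [0, 1, 0, 1],
                   [2, 2, 5, 5],
                   [0, 0, 1, 1]]
    for name, band in haar_dwt2(feature_map).items():
        print(name, band)
```

Because the transform is orthonormal, the total energy of the four subbands equals that of the input, so the edge and texture detail moved into the high-frequency bands is kept rather than discarded as it would be by strided downsampling.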

(This article belongs to the Section Engineering Remote Sensing)

23 pages, 14104 KB

Open Access Article

Symbol Recognition of Station Signal Layout Drawings Using a Fusion Design of Generalized Focal Loss and Dilated Residual Segmentation

by Qi Sun, Weizhi Deng, Mengxin Zhu, Wentong Fan and Tianyu Li

Symmetry 2026, 18(5), 874; https://doi.org/10.3390/sym18050874 (registering DOI) - 21 May 2026

Viewed by 140

Abstract

Station Signal Layout Plans (SSLPs) are pivotal engineering drawings used in the design of railway signaling systems. Accurate recognition of such drawings is essential for enabling intelligent railway operations and supporting digital management. However, the inherent complexity of engineering drawings—characterized by diverse object categories and significant scale variations—substantially increases the difficulty of detection tasks. To address these challenges, this paper proposes an improved YOLOv8-based algorithm for rapid and accurate object detection. First, to enhance the detection of small objects in engineering drawings, a cross-scale attention mechanism is introduced into the mid-scale detection head. During prediction, this mechanism leverages fine-grained details from lower-level features to improve small-object detection. In addition, to suppress noise and blurred edges in drawings, the YOLOv8 neck network is enhanced with a DWRSeg-based design. This structure enlarges the receptive field while preserving local details, thereby effectively reducing the impact of noise on localization. To evaluate the proposed method, a complex dataset was constructed from station signal layout plans provided by a railway bureau, featuring substantial variations in target scale, diverse categories, and densely distributed objects. Experimental results demonstrate that, compared with YOLOv8n, the proposed DCS-YOLO model improves precision, recall, and mAP@0.5 by 3.1%, 0.8%, and 2.1%, respectively, while maintaining a comparable mAP@0.5:0.95. Comparative experiments with representative object detection methods demonstrate that the proposed algorithm achieves competitive detection accuracy and real-time performance for SSLP symbol recognition, providing a practical technical solution for the intelligent analysis of engineering drawings in the railway industry. Full article

(This article belongs to the Section Computer)

18 pages, 4212 KB

Open Access Article

AHSC-Net: A Fish Pose Estimation Method for Intelligent Monitoring in Precision Aquaculture

by Xiaohong Peng, Ronghan Lu, Zhuohan Xiao and Xiaohan Chen

Fishes 2026, 11(5), 308; https://doi.org/10.3390/fishes11050308 - 21 May 2026

Viewed by 130

Abstract

In aquaculture, fish physiological information serves as the foundation for behavior recognition, precise feeding, and health monitoring. The acquisition of such information relies on accurate keypoint detection and pose estimation of the fish body. To address the challenges caused by inter-occlusion among fish schools and blurred keypoint boundaries in underwater environments, a novel fish pose estimation method based on the Adaptive-kernel Hybrid-center Structural Constraint Network (AHSC-Net) is proposed. Optimized specifically for the characteristics of fish poses, the proposed method effectively enhances detection accuracy and robustness in complex underwater scenarios. First, a Stochastic Local Centroid Sampling (SLCS) strategy is introduced to improve detection capability. By simulating centroid positions in occluded samples, this approach enhances the model’s ability to detect partially occluded fish. Next, a Spatial-Awareness Enhanced Pose Structural Constraint (SAPSC) is established through coordinate embedding and morphological constraints. It ensures the rationality of the predicted poses. Furthermore, an Adaptive Kernel Modulation Module (AKMM) is designed to dynamically adjust the Gaussian kernel distribution, effectively addressing challenges posed by underwater blurring and variations in fish scales. Experimental results demonstrate that AHSC-Net achieves 92.0% AP and 94.6% AR on a self-constructed largemouth bass dataset, outperforming state-of-the-art methods such as HRNet, HigherHRNet, DEKR, and YOLO-Pose. This study presents a fish pose estimation method that provides effective technical support for automated and precise monitoring in aquaculture. Full article

(This article belongs to the Special Issue Computer Vision Applications for Fisheries and Aquaculture)

14 pages, 1996 KB

Open Access Article

Lightweight Fire Detection Algorithm YOLO-LRP for UAV Firefighting

by Beibei Cui, Mengyuan Zhao, Dongxu Sun, Zicheng Wang, Lei Zhang and Jean-Charles Créput

Appl. Sci. 2026, 16(10), 5121; https://doi.org/10.3390/app16105121 - 21 May 2026

Viewed by 82

Abstract

This paper proposes YOLO-LRP, a lightweight fire detection algorithm for UAV firefighting scenarios. The algorithm builds on YOLOv10 and adopts the MobileOne reparameterized backbone to reduce computational complexity. The OREPA module is integrated into the neck network to enhance training efficiency and feature fusion. The standard CBS module is replaced with GhostConv to further compress the model size, and the number of detection layers is streamlined to focus on small and medium fire targets. In addition, the SE channel attention mechanism is introduced to strengthen the extraction of key fire features under complex interference. Experimental results show that YOLO-LRP achieves 93.6% mAP50 with only 2.26 M parameters and runs at 434.7 FPS in server-side tests. These improvements make the model more suitable for deployment on resource-constrained UAV platforms.

23 pages, 9952 KB

Open Access Article

A Bio-Inspired Lightweight Human Action Recognition Method Based on Human Keypoint Detection

by Weihao Huang, Mianting Wu, Weixiong Chen and Qiang Zhou

Biomimetics 2026, 11(5), 355; https://doi.org/10.3390/biomimetics11050355 - 20 May 2026

Viewed by 89

Abstract

Recognizing human actions from static images in complex industrial environments remains challenging due to insufficient feature representation and high computational complexity. This issue is particularly critical in power-grid safety monitoring, where improper worker postures (e.g., bending, climbing, falling) can lead to severe accidents and personal injuries, necessitating automated monitoring systems that operate reliably on resource-constrained edge devices. This study proposes a bio-inspired lightweight recognition framework that integrates an improved YOLO-Pose model with a gated recurrent unit (GRU) network. The scientific motivation is grounded in the observation that the human musculoskeletal system achieves highly efficient motion perception through three key mechanisms: hierarchical muscle coordination providing intrinsic rotation invariance, proprioceptive feedback enabling real-time error correction, and selective neural gating reducing redundant information transmission. These biological principles directly inspire our technical contributions: polar-coordinate encoding provides rotation invariance, three-stage filtering mimics proprioceptive feedback, and GRU gating mirrors selective information propagation. Unlike prior approaches that treat pose-based action recognition as a generic computer vision problem, this work explicitly incorporates anatomical structural constraints into the computational pipeline. The framework addresses three research gaps: (1) existing methods lack biomechanically derived invariance properties; (2) GCN-based approaches use fixed topologies that fail to adapt to occlusion patterns; (3) the trade-off between model complexity and accuracy remains unsatisfactory for edge deployment. 
Experiments on the self-constructed SKPose dataset demonstrate that the proposed method achieves 95.04% accuracy, outperforming ST-GCN by 3.67 percentage points and 2s-AGCN by 1.94 percentage points, with an inference speed of 48 FPS and 8.7 M parameters in underground power-grid environments. The method provides practical support for biomimetic perception systems and industrial safety monitoring.
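
The claim that polar-coordinate encoding provides rotation invariance can be illustrated with a generic sketch. This is not the authors' encoding: the choice of a root keypoint and a reference keypoint, and the function names `polar_encode` and `rotate`, are assumptions made for illustration. Each keypoint is described by its distance from the root and its angle measured relative to the root-to-reference direction, so rotating the whole pose leaves the encoding unchanged.

```python
import math


def polar_encode(keypoints, root=0, ref=1):
    """Encode 2D keypoints as (radius, relative angle) pairs.

    Radii are measured from keypoints[root]; angles are measured relative to
    the direction from keypoints[root] to keypoints[ref] (hypothetical choice).
    """
    rx, ry = keypoints[root]
    ref_angle = math.atan2(keypoints[ref][1] - ry, keypoints[ref][0] - rx)
    encoded = []
    for x, y in keypoints:
        dx, dy = x - rx, y - ry
        radius = math.hypot(dx, dy)
        if radius == 0.0:
            theta = 0.0  # the angle is undefined at the root itself
        else:
            theta = (math.atan2(dy, dx) - ref_angle) % (2 * math.pi)
        encoded.append((radius, theta))
    return encoded


def rotate(points, angle):
    """Rotate 2D points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


if __name__ == "__main__":
    pose = [(1.0, 1.0), (1.0, 3.0), (2.5, 2.0), (0.0, 2.5)]
    print(polar_encode(pose))
    print(polar_encode(rotate(pose, 0.7)))  # same values up to rounding
```

Subtracting the reference angle is what removes the dependence on global orientation; using raw `atan2` angles alone would shift every angle by the rotation amount.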

(This article belongs to the Special Issue Bionic Intelligent Robots)

19 pages, 12889 KB

Open Access Article

YOLO-AFL: A Novel Lightweight Algorithm for Real-Time Safety Helmet Detection in Factory Workshops

by Hao Wang, Xianying Feng, Peigang Li, Anning Wang and Ming Yao

Sensors 2026, 26(10), 3237; https://doi.org/10.3390/s26103237 - 20 May 2026

Viewed by 153

Abstract

In factory workshops, wearing safety helmets is vital for worker safety. However, current deep learning-based detection methods are often hindered by large model parameters and high computational demands, limiting their deployment in resource-constrained settings. This article introduces YOLO-AFL, a novel lightweight model designed to solve these problems. The algorithm introduces several key optimizations to improve performance without increasing computational load. Firstly, the K-Means++ algorithm is applied during the anchor box preprocessing stage, along with a new distance metric (1 − AIoU), which enhances anchor box size estimation and boosts performance without additional overhead. Secondly, by introducing a lightweight PConv operation into the C3 module, the complexity of the model is significantly reduced. Finally, a dual attention network (LDA-GC) is designed to compensate for any accuracy loss caused by the model’s simplifications. Experimental results on a custom dataset show that the proposed algorithm achieves an mAP50 of 94.1%. Compared to the baseline model, it reduces the number of parameters by 19.1% and decreases computational complexity by 16.9%, demonstrating its superior performance and efficiency in safety helmet wearing detection. Full article
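
The anchor-clustering step can be sketched as follows. The abstract's AIoU metric is not defined in this listing, so the sketch substitutes plain IoU between width-height pairs and uses 1 - IoU as the distance; the function names, the seed, and the mean-based center update are likewise assumptions for illustration, not the authors' procedure.

```python
import random


def wh_iou(a, b):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def iou_distance(a, b):
    """Distance 1 - IoU (plain IoU stands in for the paper's AIoU)."""
    return 1.0 - wh_iou(a, b)


def kmeanspp_init(boxes, k, rng):
    """K-Means++ seeding: sample new centers with probability proportional to d^2."""
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(iou_distance(b, c) for c in centers) ** 2 for b in boxes]
        if sum(d2) == 0:
            centers.append(rng.choice(boxes))  # all boxes coincide with a center
        else:
            centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centers


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            nearest = min(range(k), key=lambda i: iou_distance(b, centers[i]))
            clusters[nearest].append(b)
        # Mean update is a common heuristic, not the exact minimizer of 1 - IoU.
        new_centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    boxes = [(10, 12), (11, 10), (9, 11), (100, 90), (95, 105), (105, 100)]
    print(kmeans_anchors(boxes, k=2))
```

An overlap-based distance keeps large boxes from dominating the clustering, which Euclidean distance on raw widths and heights would allow.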

(This article belongs to the Section Intelligent Sensors)

22 pages, 8477 KB

Open Access Article

FAMA-DET: A Frequency-Domain Adaptive Multi-Scale Attention Detection Network for Aircraft Target Detection in Optical Remote Sensing Images

by Lan Ma, Mingyang Peng, Yun Luo and Yujie Pi

Sensors 2026, 26(10), 3236; https://doi.org/10.3390/s26103236 - 20 May 2026

Viewed by 172

Abstract

Aircraft target detection in optical remote sensing imagery is hindered by severe scale variation, cluttered backgrounds, and the limited capacity of the spatial-domain convolution to represent frequency-selective target features. We propose FAMA-DET, a frequency-domain adaptive detection framework built on YOLO11, which pursues a unified design principle of content-adaptive spectral representation across all architectural levels. The Frequency-Domain Adaptive Cross-Stage Feature Extractor (FDACFE) replaces static kernels with frequency-domain parameterised convolution driven by learnable DFT basis vectors, enabling differentiated perception of high-frequency edge details and low-frequency semantic components. The Soft-Aligned Bidirectional Feature Pyramid Network (SABFPN) eliminates upsampling amplitude distortion through scale-normalised interpolation and enriches cross-scale fusion with multi-receptive-field textural modelling. The Adaptive Multi-Scale Recalibrated Decoupled Detection Head (AMRDDHead) embeds multi-scale channel recalibration into both localisation and classification branches to suppress background redundancy and reinforce discriminative activations. On MAR20, FAMA-DET improves mAP50 and mAP50-95 over the YOLO11n baseline by 1.8% and 1.6% at only 5.4 GFLOPs, while maintaining real-time throughput of 109.7 FPS. Under zero-shot cross-domain transfer to CORS-ADD, FAMA-DET achieves the highest mAP50 of 93.3% among all compared methods, surpassing RT-DETR-R18 in mAP50 while using 91.0% fewer GFLOPs, confirming that frequency-domain adaptive design yields both strong generalisation and deployment efficiency. Full article

(This article belongs to the Special Issue Remote Sensing Image Fusion and Object Tracking)

22 pages, 3372 KB

Open Access Article

Multi-Class Marine Organism Detection Using Multi-Scale Attention-Enhanced YOLO11n

by Zehuan Bai, Haoxi Mao, Junliang Xu, Na Lv and Yiran Liu

Fishes 2026, 11(5), 301; https://doi.org/10.3390/fishes11050301 - 19 May 2026

Viewed by 160

Abstract

Monitoring marine organisms plays a vital role in biodiversity conservation, marine environmental management, and fisheries resource management. However, the underwater environment is often low-light and turbid, leading to indistinct target boundaries. Moreover, the wide variety of marine organisms—with significant differences in color, scale, texture, and morphology—can easily result in missed detections. To address these challenges, this paper proposes a multi-class marine organism detection method using multi-scale attention-enhanced You Only Look Once 11 nano (YOLO11n). The method incorporates the Convolutional Block Attention Module (CBAM) into the YOLO11n network, enabling the model to better focus on key feature regions while effectively suppressing background noise interference in complex marine environments. In addition, the model is trained using the Complete Intersection over Union (CIoU) loss function, which enhances bounding box regression accuracy, especially in handling targets of varying scales. The effectiveness of the proposed method is validated on the publicly available BrackishMOT dataset. The proposed model achieves an overall mAP@0.5 of 0.481, computed as the average AP across six organism categories. Category-wise results indicate stronger performance on visually distinguishable targets, such as Jellyfish, Starfish, and Small fish, with AP values of 0.808, 0.678, and 0.677, respectively. In contrast, performance remains limited for rare or visually ambiguous categories. These results suggest that the proposed method is effective for multi-class marine organism detection, particularly when discriminative visual features are present. Full article
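
For reference, the CIoU loss named in this abstract is commonly defined as 1 - IoU + rho^2 / c^2 + alpha * v, where rho is the distance between box centers, c is the diagonal of the smallest enclosing box, and v measures aspect-ratio mismatch. The following is a minimal sketch for two axis-aligned boxes in (x1, y1, x2, y2) form; the function name and the `eps` stabilizer are assumptions, and the sketch is not taken from the authors' code.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Complete IoU loss between a predicted box and a ground-truth box."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU term.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared center distance over squared diagonal of the enclosing box.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / ((1.0 - iou) + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Unlike plain IoU loss, the center-distance term still provides a useful signal when the two boxes do not overlap at all.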

(This article belongs to the Special Issue Computer Vision Applications for Fisheries and Aquaculture)

20 pages, 2996 KB

Open Access Article

IISD-YOLO: Infrared Detection of Insulator Strings for Transmission Lines Based on Improved YOLOv11

by Chen-Hao Zhao, Yi-Feng Ren, Long-Kun Cao and Hong-Yu Wang

Technologies 2026, 14(5), 306; https://doi.org/10.3390/technologies14050306 - 18 May 2026

Viewed by 177

Abstract

In transmission line inspection, combining Unmanned Aerial Vehicles (UAVs) with neural network object detection algorithms has become a prominent research direction. The task is challenging because of high computational resource consumption and poor infrared detection capability. In this study, we propose IISD-YOLO, an infrared image detection algorithm based on a modified YOLOv11 network, to detect transmission line insulator strings in infrared images. First, the original object detection layer is removed and replaced with the ShuffleNetv2 network to make the model lightweight. Second, the Manhattan Self-Attention (MaSA) mechanism is introduced into the original feature extraction module C3k2 to form a new module, C3k2-MaSA, which enhances feature extraction for infrared objects. Finally, the bidirectional feature pyramid network (Bi-FPN) replaces the original feature fusion module, enhancing the network’s ability to process and fuse information at different scales. Comparative experiments show that IISD-YOLO improves mAP@50 by 4.5, 6.1, and 4.8 percentage points over YOLOv5, YOLOv8, and YOLOv10, respectively. It also outperforms advanced models including YOLO-CIR, FA-YOLO, YOFIR, and RT-DETR, with mAP@50 improvements of 2.9, 9.1, 5.0, and 1.1 percentage points, respectively. The ablation study shows that each improvement effectively enhances the overall performance. Compared with the original YOLOv11, IISD-YOLO increases mAP@50 by 3.5 percentage points while reducing the parameter count by 1.1 million and the computational cost by 2 GFLOPs. These results confirm the superior performance of IISD-YOLO in infrared insulator string detection.

22 pages, 28095 KB

Open Access Article

LLE-YOLO: Adaptive Low-Light-Enhanced and Degradation-Aware Multi-Scale Attention Network for Miner Detection in Underground Coal Mines

by Yanyan Chen, Xiangrui Meng, Chaoyu Yang and Yijuan Wang

Appl. Sci. 2026, 16(10), 4983; https://doi.org/10.3390/app16104983 - 16 May 2026

Viewed by 168

Abstract

Underground coal mine environments commonly suffer from insufficient illumination, high dust concentrations, and cluttered backgrounds, which substantially degrade the accuracy of conventional object detection algorithms. To address these issues, this paper proposes LLE-YOLO, a detection network built upon YOLOv11n. At the input stage, an Adaptive Low-Light Enhancement Module (ALEM) is introduced, which integrates Retinex decomposition, Contrast-Limited Adaptive Histogram Equalization (CLAHE), and brightness-dependent Gamma mapping to dynamically select the optimal enhancement strategy according to the global luminance. Furthermore, a Degradation-Aware Efficient Multi-Scale Attention (DEMA) module is proposed, which incorporates Contrast-Aware Modulation (CAM), an asymmetric dilated convolution group, and a Degradation-aware Spatial Gate (DSG) into the EMA channel-grouping and cross-spatial learning framework, thereby strengthening multi-scale personnel detection while keeping the parameter count tractable. On the publicly available DsDPM66 dataset, which covers 66 coal mine sites and 105,096 annotated images, LLE-YOLO achieves an mAP@0.5 of 83.7%, representing gains of 8.1 percentage points over YOLOv11n and 5.2 percentage points over the GCB-YOLOv11 baseline, while the recall increases from 71.2% to 78.2%. Under extremely dark scenarios (<30 lux), the mAP@0.5 is further improved by 15.3 percentage points. Ablation studies and Grad-CAM visualizations confirm the contribution of each module, offering a practical engineering reference for intelligent underground monitoring systems. Full article

23 pages, 4189 KB

Open Access Article

DARE-YOLO: A Lightweight Object Detection Algorithm and Its FPGA Acceleration for Sustainable PV Panel Inspection

by Yuchuan Yang, Feng Xing, Caiyan Qin, Shuxu Chen, Hyundong Shin and Sungyoung Lee

Sustainability 2026, 18(10), 4999; https://doi.org/10.3390/su18104999 - 15 May 2026

Viewed by 118

Abstract

As a critical component of sustainable energy systems, the efficient maintenance of photovoltaic (PV) panels is essential. While deep learning is an important approach for PV panel defect detection, the high complexity of existing models and their substantial computational demand make deployment on edge platforms difficult. This paper studies an acceleration method for photovoltaic panel defect detection on the Zynq-7020 heterogeneous platform. We design DARE-YOLO, a lightweight network for photovoltaic panel defect detection, together with a Zynq-based accelerator. In DARE-YOLO, we introduce RepConv and a lightweight single-path backbone to reduce the memory bandwidth overhead caused by multi-branch structures. We further design a Dilated Context Block (DCB) and a Dual-scale Decoupled Head (DDH), which effectively improve the detection accuracy of DARE-YOLO. On the Zynq platform, we develop the accelerator through a mixed fixed-point quantization strategy, a custom convolution IP core, and pipeline unrolling. These optimizations reduce data access latency, improve computational parallelism, and increase computational throughput. Experimental results show that DARE-YOLO achieves 93.84% mAP@0.5 with only 6.4 M parameters. The accelerator has a total on-board power consumption of only 1.95 W while delivering a throughput of 37.5 GOPS and an energy efficiency of 19.23 GOPS/W, with an image inference latency of 661.3 ms. This low-power, high-efficiency co-design paradigm ensures the long-term reliability of renewable energy facilities.
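
The reported energy efficiency follows directly from the power and throughput figures in the abstract. The short check below reproduces it and, as a rough derived estimate, computes frames per second and energy per image. The energy figure assumes the 1.95 W draw is constant over the whole 661.3 ms inference, which the abstract does not state.

```python
# Figures quoted in the abstract.
power_w = 1.95          # total on-board power consumption, in watts
throughput_gops = 37.5  # accelerator throughput, in GOPS
latency_ms = 661.3      # per-image inference latency, in milliseconds

# Energy efficiency is throughput divided by power.
efficiency_gops_per_w = throughput_gops / power_w

# Derived estimates (not reported in the abstract).
frames_per_second = 1000.0 / latency_ms
energy_per_image_j = power_w * latency_ms / 1000.0  # assumes constant power draw

print(f"{efficiency_gops_per_w:.2f} GOPS/W")
print(f"{frames_per_second:.2f} frames/s")
print(f"{energy_per_image_j:.2f} J per image")
```

The computed efficiency rounds to the 19.23 GOPS/W stated in the abstract.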

(This article belongs to the Special Issue Sustainable Solar Power Systems and Applications)

19 pages, 17100 KB

Open Access Article

A Green Jujube Grading Model Using BiFPN and COT Attention Mechanism

by Pengyan Chang, Xudong Zhu, Huini Wu, Shuijin Wu and Fan Jiang

Agronomy 2026, 16(10), 982; https://doi.org/10.3390/agronomy16100982 (registering DOI) - 15 May 2026

Viewed by 110

Abstract

The grading of green jujube is a key factor in improving production efficiency and market competitiveness. However, traditional grading methods are inefficient, imprecise, and struggle to detect minor damages. This study proposes an improved BCW-YOLO deep learning model specifically designed for automated grading of green jujube. The model integrates a Bidirectional Feature Pyramid Network (BiFPN) and a Contextual Transformer Attention (COT) mechanism to enhance feature fusion accuracy and capture fine-grained details. In addition, the WIoU v3 loss function is introduced to optimize object localization performance. Constructing a multi-angle green jujube dataset and applying data augmentation techniques significantly improved the model’s generalization capability. Experimental results indicate that the improved BCW-YOLO achieves precision, recall, mAP, and F1 score of 90.87%, 92.12%, 95.66%, and 91.49%, respectively, representing increases of 1.93%, 2.77%, 1.58%, and 2.34% compared to the original YOLO model. The model’s performance was verified through confusion matrices, PR curves, heatmap analyses, and ablation studies. Compared with other YOLO series models, BCW-YOLO performs exceptionally well in detecting minor damages, demonstrating its potential in practical agricultural grading. The findings provide a new technical approach for precise grading and automated sorting of green jujube, showing promising application prospects.

(This article belongs to the Special Issue Agricultural Imagery and Machine Vision)

37 pages, 10460 KB

Open AccessArticle

Research on Visual Recognition and Harvesting Point Localization System for Grape-Picking Robots in Smart Agriculture

by Tao Lin, Qiurong Lv, Fuchun Sun, Wei Ma and Xiaoxiao Li

Agriculture 2026, 16(10), 1073; https://doi.org/10.3390/agriculture16101073 - 14 May 2026

Viewed by 189

Abstract

To improve grape target perception and picking-point positioning for intelligent harvesting robots, this study develops a vision-based method for orchard grape detection and harvesting-point localization. The method is intended to address missed detections, insufficient recognition accuracy, and unsatisfactory peduncle segmentation caused by illumination variation, occlusion, and interference from branches and leaves in complex orchard scenes. For grape cluster and peduncle detection, a lightweight YOLOv7-derived model, termed YOLO-FES, was established. In this model, FasterNet and SCConv were introduced to refine the backbone and neck structures, and the EMA mechanism was incorporated to lower parameter complexity and computational cost while improving detection performance. For suspended grape structure association and peduncle extraction, the GJK algorithm was combined with nearest-neighbor rectangular discrimination, and an improved YOLACT-based peduncle segmentation network, named M-YOLACT, was constructed. With the integration of the MLCA mechanism and the Mish activation function, accurate peduncle segmentation was achieved. In addition, a stereo depth camera was employed to obtain two-dimensional picking-point information and further recover the corresponding three-dimensional spatial coordinates. Experimental results showed that the mAP@0.5 of YOLO-FES for grape clusters and peduncles reached 95.37%. For grape peduncle segmentation, the mAP@0.5 values of the bounding boxes and masks produced by M-YOLACT reached 95.73% and 94.36%, respectively. The proposed method achieved an overall harvesting success rate of 89.2%, with an average time consumption of 11 s for a single harvesting operation. By integrating deep-learning-based detection and segmentation with binocular-vision localization, this study provides a practical technical solution and useful reference for the visual system design of grape-harvesting robots.
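The step of recovering three-dimensional coordinates from a two-dimensional picking point can be illustrated with a generic pinhole back-projection. This is a minimal sketch and not the authors' implementation: the abstract does not specify the camera model, so the pinhole assumption, the function name, and the intrinsic values (`fx`, `fy`, `cx`, `cy`) are all hypothetical.

```python
from typing import Tuple


def backproject_pixel(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Map pixel (u, v) with a measured depth to camera-frame (X, Y, Z).

    Assumes an ideal pinhole camera without lens distortion; depth is the
    distance along the optical axis, in the same unit as the result.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical intrinsics and picking-point pixel, for illustration only.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
point_3d = backproject_pixel(u=380.0, v=240.0, depth=0.5,
                             fx=fx, fy=fy, cx=cx, cy=cy)
print(point_3d)
```

In a real system the intrinsics would come from camera calibration, and the resulting camera-frame point would still need a hand-eye transform into the robot's coordinate frame.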

(This article belongs to the Special Issue Key Technology Research and Applications of Agricultural Inspection Robots Based on Machine Vision and Artificial Intelligence)

Search Results (1,750)
