Search Results (12)

Search Parameters:
Keywords = fast pyramid pooling enhanced with local attention networks

21 pages, 9038 KiB  
Article
Deep Learning-Based Detection and Digital Twin Implementation of Beak Deformities in Caged Layer Chickens
by Hengtai Li, Hongfei Chen, Jinlin Liu, Qiuhong Zhang, Tao Liu, Xinyu Zhang, Yuhua Li, Yan Qian and Xiuguo Zou
Agriculture 2025, 15(11), 1170; https://doi.org/10.3390/agriculture15111170 - 29 May 2025
Viewed by 780
Abstract
With the increasing urgency for digital transformation in large-scale caged layer farms, traditional methods for monitoring the environment and chicken health, which often rely on human experience, face challenges related to low efficiency and poor real-time performance. In this study, we focused on caged layer chickens and proposed an improved abnormal beak detection model based on the You Only Look Once v8 (YOLOv8) framework. Data collection was conducted using an inspection robot, enhancing automation and consistency. To address the interference caused by chicken cages, an Efficient Multi-Scale Attention (EMA) mechanism was integrated into the Spatial Pyramid Pooling-Fast (SPPF) module within the backbone network, significantly improving the model’s ability to capture fine-grained beak features. Additionally, the standard convolutional blocks in the neck of the original model were replaced with Grouped Shuffle Convolution (GSConv) modules, effectively reducing information loss during feature extraction. The model was deployed on edge computing devices for the real-time detection of abnormal beak features in layer chickens. Beyond local detection, a digital twin remote monitoring system was developed, combining three-dimensional (3D) modeling, the Internet of Things (IoT), and cloud-edge collaboration to create a dynamic, real-time mapping of physical layer farms to their virtual counterparts. This innovative approach not only improves the extraction of subtle features but also addresses occlusion challenges commonly encountered in small target detection. Experimental results demonstrate that the improved model achieved a detection accuracy of 92.7%. In terms of the comprehensive evaluation metric (mAP), it surpassed the baseline model and YOLOv5 by 2.4% and 3.2%, respectively. The digital twin system also proved stable in real-world scenarios, effectively mapping physical conditions to virtual environments. Overall, this study integrates deep learning and digital twin technology into a smart farming system, presenting a novel solution for the digital transformation of poultry farming. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
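To make the SPPF-plus-attention idea concrete, here is a minimal PyTorch sketch; the channel gate below is a generic stand-in for illustration, not the paper's EMA module.

```python
import torch
import torch.nn as nn

class SPPFWithAttention(nn.Module):
    """SPPF applies one 5x5 max-pool three times in sequence and concatenates
    the intermediate maps; here a channel gate reweights the fused result."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_hid = c_in // 2
        self.cv1 = nn.Conv2d(c_in, c_hid, 1)
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv2 = nn.Conv2d(c_hid * 4, c_out, 1)
        self.gate = nn.Sequential(          # simplified stand-in for EMA attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c_out, c_out, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        fused = self.cv2(torch.cat([x, y1, y2, y3], dim=1))
        return fused * self.gate(fused)     # channel-wise reweighting

# Example: a 512-channel feature map keeps its spatial size.
out = SPPFWithAttention(512, 512)(torch.randn(1, 512, 20, 20))
```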

21 pages, 9976 KiB  
Article
RLRD-YOLO: An Improved YOLOv8 Algorithm for Small Object Detection from an Unmanned Aerial Vehicle (UAV) Perspective
by Hanyun Li, Yi Li, Linsong Xiao, Yunfeng Zhang, Lihua Cao and Di Wu
Drones 2025, 9(4), 293; https://doi.org/10.3390/drones9040293 - 10 Apr 2025
Cited by 1 | Viewed by 2680
Abstract
In Unmanned Aerial Vehicle (UAV) target detection tasks, missed and erroneous detections frequently occur owing to the small size of the targets and the complexity of the image background. To address these issues, an improved target detection algorithm named RLRD-YOLO, based on You Only Look Once version 8 (YOLOv8), is proposed. First, the backbone network integrates the Receptive Field Attention Convolution (RFCBAMConv) module, which combines the Convolutional Block Attention Module (CBAM) and Receptive Field Attention Convolution (RFAConv). This integration addresses the problem of shared attention weights in receptive-field features and combines attention mechanisms across both the channel and spatial dimensions, enhancing feature extraction. Subsequently, Large-Scale Kernel Attention (LSKA) is integrated to further optimize the Spatial Pyramid Pooling Fast (SPPF) layer, employing a large convolutional kernel to improve the capture of intricate small-target features and minimize background interference. To enhance feature fusion and effectively integrate low-level details with high-level semantic information, the Reparameterized Generalized Feature Pyramid Network (RepGFPN) replaces the original architecture in the neck network, and a small-target detection layer is added to improve the model’s ability to perceive small targets. Finally, the detection head is replaced with the Dynamic Head, designed to improve the localization accuracy of small targets in complex scenarios by optimizing for scale, spatial, and task awareness. The experimental results showed that RLRD-YOLO outperformed YOLOv8 on the VisDrone2019 dataset, achieving improvements of 12.2% in mAP@0.5 and 8.4% in mAP@0.5:0.95, and surpassed other widely used object detection methods. Furthermore, experimental results on the HIT-UAV dataset demonstrate that RLRD-YOLO sustains excellent precision on infrared UAV imagery, validating its generalizability across diverse scenarios. Finally, RLRD-YOLO was deployed and validated on a typical airborne platform, the Jetson Nano, providing reliable technical support for the improvement of detection algorithms in aerial scenarios and their practical applications. Full article
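A hedged PyTorch sketch of the large-separable-kernel-attention idea named above: a large 2D depthwise kernel is decomposed into 1xk/kx1 depthwise pairs whose output gates the input. Kernel sizes here are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class SeparableLargeKernelAttention(nn.Module):
    """One plain and one dilated pair of 1D depthwise convolutions emulate a
    single large depthwise kernel at much lower cost; the result is used as
    a multiplicative attention map."""
    def __init__(self, c):
        super().__init__()
        self.dw_h = nn.Conv2d(c, c, (1, 5), padding=(0, 2), groups=c)
        self.dw_v = nn.Conv2d(c, c, (5, 1), padding=(2, 0), groups=c)
        # dilated pair extends the effective receptive field to roughly 19x19
        self.dwd_h = nn.Conv2d(c, c, (1, 7), padding=(0, 9), dilation=3, groups=c)
        self.dwd_v = nn.Conv2d(c, c, (7, 1), padding=(9, 0), dilation=3, groups=c)
        self.pw = nn.Conv2d(c, c, 1)  # pointwise conv mixes channels

    def forward(self, x):
        attn = self.pw(self.dwd_v(self.dwd_h(self.dw_v(self.dw_h(x)))))
        return x * attn               # attention map gates the input features
```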

21 pages, 6489 KiB  
Article
Peach Leaf Shrinkage Disease Recognition Algorithm Based on Attention Spatial Pyramid Pooling Enhanced with Local Attention Network
by Caihong Zhang, Pingchuan Zhang, Yanjun Hu, Zeze Ma, Xiaona Ding, Ying Yang and Shan Li
Electronics 2024, 13(24), 4973; https://doi.org/10.3390/electronics13244973 - 17 Dec 2024
Viewed by 832
Abstract
Recognition of peach leaf shrinkage disease faces many challenges, such as the varied sizes of diseased-leaf targets, complex background interference, and inflexible adjustment of the training learning rate. To address them, we propose a peach leaf shrinkage disease recognition algorithm based on an attention generalized efficient layer aggregation network. Firstly, the rectified linear unit activation function is used to improve the stability and performance of the model in low-precision computing environments and to alleviate partial gradient disappearance. Secondly, an integrated squeeze-and-excitation attention mechanism adaptively focuses on the key disease areas in the image, significantly enhancing the model’s ability to recognize disease characteristics. Finally, fast pyramid pooling enhanced with local attention networks realizes deep fusion of cross-layer features, improving the model’s ability to identify complex features while optimizing its operational efficiency. Experimental results on the peach leaf shrinkage disease recognition dataset show that the proposed algorithm achieves a significant performance improvement over the original YOLOv8 algorithm: mF1, mPrecision, mRecall, and mAP increased by 0.1075, 0.0723, 0.1224, and 0.1184, respectively, providing strong technical support for intelligent, automated monitoring of peach pests and diseases. Full article
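The squeeze-and-excitation mechanism this abstract relies on is a standard building block; a minimal PyTorch version follows.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global-average-pool each channel ("squeeze"),
    pass the result through a bottleneck MLP ending in a sigmoid
    ("excitation"), then rescale the feature map channel by channel."""
    def __init__(self, c, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(c, c // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(c // reduction, c),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # per-channel gates in (0, 1)

# Example: rescale a 64-channel feature map.
y = SEBlock(64)(torch.randn(2, 64, 32, 32))
```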

24 pages, 108807 KiB  
Article
SMEA-YOLOv8n: A Sheep Facial Expression Recognition Method Based on an Improved YOLOv8n Model
by Wenbo Yu, Xiang Yang, Yongqi Liu, Chuanzhong Xuan, Ruoya Xie and Chuanjiu Wang
Animals 2024, 14(23), 3415; https://doi.org/10.3390/ani14233415 - 26 Nov 2024
Cited by 1 | Viewed by 1094
Abstract
Sheep facial expressions are valuable indicators of their pain levels, playing a critical role in monitoring their health and welfare. In response to challenges such as missed detections, false positives, and low recognition accuracy in sheep facial expression recognition, this paper introduces an enhanced algorithm based on YOLOv8n, referred to as SimAM-MobileViTAttention-EfficiCIoU-AA2_SPPF-YOLOv8n (SMEA-YOLOv8n). Firstly, the proposed method integrates the parameter-free Similarity-Aware Attention Mechanism (SimAM) and MobileViTAttention modules into the CSP Bottleneck with 2 Convolutions (C2f) module of the neck network, aiming to enhance the model’s feature representation and fusion capabilities in complex environments while mitigating the interference of irrelevant background features. Additionally, the EfficiCIoU loss function replaces the original Complete IoU (CIoU) loss function, thereby improving bounding box localization accuracy and accelerating model convergence. Furthermore, the Spatial Pyramid Pooling-Fast (SPPF) module in the backbone network is refined with the addition of two global average pooling layers, strengthening the extraction of sheep facial expression features and bolstering the model’s core feature fusion capacity. Experimental results reveal that the proposed method achieves an mAP@0.5 of 92.5%, a Recall of 91%, a Precision of 86%, and an F1-score of 88.0%, reflecting improvements of 4.5%, 9.1%, 2.8%, and 6.0%, respectively, compared to the baseline model. Notably, the mAP@0.5 for normal and abnormal sheep facial expressions increased by 3.7% and 5.3%, respectively, demonstrating the method’s effectiveness in enhancing recognition accuracy under complex environmental conditions. Full article
(This article belongs to the Section Small Ruminants)
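SimAM, the parameter-free attention used above, has a closed-form weighting; a short PyTorch sketch following the published formulation, where lambda is the only hyperparameter.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free attention: per-channel spatial statistics define an
    energy for each position, and a sigmoid of the inverse energy weights
    the features. No learnable parameters are introduced."""
    def __init__(self, e_lambda=1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x):
        b, c, h, w = x.shape
        n = h * w - 1
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)  # (x - mu)^2
        v = d.sum(dim=(2, 3), keepdim=True) / n            # spatial variance
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5        # inverse energy
        return x * torch.sigmoid(e_inv)
```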

24 pages, 12126 KiB  
Article
Efficient Optimized YOLOv8 Model with Extended Vision
by Qi Zhou, Zhou Wang, Yiwen Zhong, Fenglin Zhong and Lijin Wang
Sensors 2024, 24(20), 6506; https://doi.org/10.3390/s24206506 - 10 Oct 2024
Cited by 7 | Viewed by 6561
Abstract
In the field of object detection, enhancing algorithm performance in complex scenarios represents a fundamental technological challenge. To address this issue, this paper presents an efficient optimized YOLOv8 model with extended vision (YOLO-EV), which optimizes the performance of the YOLOv8 model through a series of innovative improvement measures and strategies. First, we propose a multi-branch group-enhanced fusion attention (MGEFA) module and integrate it into YOLO-EV, which significantly boosts the model’s feature extraction capabilities. Second, we enhance the existing spatial pyramid pooling fast (SPPF) layer by integrating large scale kernel attention (LSKA), improving the model’s efficiency in processing spatial information. Additionally, we replace the traditional IOU loss function with the Wise-IOU loss function, thereby enhancing localization accuracy across various target sizes. We also introduce a P6 layer to augment the model’s detection capabilities for multi-scale targets. Through network structure optimization, we achieve higher computational efficiency, ensuring that YOLO-EV consumes fewer computational resources than YOLOv8s. In the validation section, preliminary tests on the VOC12 dataset demonstrate YOLO-EV’s effectiveness in standard object detection tasks. Moreover, YOLO-EV has been applied to the CottonWeedDet12 and CropWeed datasets, which are characterized by complex scenes, diverse weed morphologies, significant occlusions, and numerous small targets. Experimental results indicate that YOLO-EV exhibits superior detection accuracy in these complex agricultural environments compared to the original YOLOv8s and other state-of-the-art models, effectively identifying and locating various types of weeds, thus demonstrating its significant practical application potential. Full article
(This article belongs to the Section Smart Agriculture)
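A hedged sketch of the Wise-IoU idea mentioned above, in its simple monotonic (v1) form: the IoU loss is scaled by a detached, distance-based focusing factor computed over the smallest enclosing box. The paper's variant additionally uses a dynamic gain schedule not shown here.

```python
import torch

def wiou_v1_loss(pred, target, eps=1e-7):
    """Wise-IoU v1 sketch. Boxes are (x1, y1, x2, y2), shape (N, 4)."""
    # plain IoU
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # center distance normalized by the enclosing box, detached so it acts
    # purely as a focusing weight rather than a gradient path
    cpx, cpy = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    ctx, cty = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    ew = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    eh = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    r = torch.exp(((cpx - ctx) ** 2 + (cpy - cty) ** 2)
                  / (ew ** 2 + eh ** 2 + eps)).detach()
    return r * (1 - iou)  # per-box loss; reduce with .mean() as needed
```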

27 pages, 6983 KiB  
Article
DA-YOLOv7: A Deep Learning-Driven High-Performance Underwater Sonar Image Target Recognition Model
by Zhe Chen, Guohao Xie, Xiaofang Deng, Jie Peng and Hongbing Qiu
J. Mar. Sci. Eng. 2024, 12(9), 1606; https://doi.org/10.3390/jmse12091606 - 10 Sep 2024
Cited by 6 | Viewed by 2431
Abstract
Affected by the complex underwater environment and the limitations of low-resolution sonar image data and small sample sizes, traditional image recognition algorithms have difficulty achieving accurate sonar image recognition. This research builds on YOLOv7 and devises an innovative fast recognition model designed explicitly for sonar images, namely the Dual Attention Mechanism YOLOv7 model (DA-YOLOv7), to tackle such challenges. New modules such as the Omni-Directional Convolution Channel Prior Convolutional Attention Efficient Layer Aggregation Network (OA-ELAN), Spatial Pyramid Pooling Channel Shuffling and Pixel-level Convolution Bilateral-branch Transformer (SPPCSPCBiFormer), and Ghost-Shuffle Convolution Enhanced Layer Aggregation Network-High performance (G-ELAN-H) are central to its design; they reduce the computational burden and enhance accuracy in detecting small targets and capturing local features and crucial information. The study adopts transfer learning to deal with the lack of sonar image samples: by pre-training on the large-scale Underwater Acoustic Target Detection Dataset (UATD), DA-YOLOv7 obtains initial weights that are then fine-tuned on the smaller Common Sonar Target Detection Dataset (SCTD), thereby reducing the risk of overfitting commonly encountered with small datasets. The experimental results on the UATD, Underwater Optical Target Detection Intelligent Algorithm Competition 2021 (URPC), and SCTD datasets show that DA-YOLOv7 exhibits outstanding performance, with mAP@0.5 scores reaching 89.4%, 89.9%, and 99.15%, respectively. In addition, the model maintains real-time speed while offering superior accuracy and recall compared to existing mainstream target recognition models. These findings establish the superiority of DA-YOLOv7 in sonar image analysis tasks. Full article
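The pre-train-then-fine-tune recipe is straightforward to sketch in PyTorch. The toy backbone/head split and the checkpoint filename below are illustrative assumptions, not DA-YOLOv7's actual structure or released weights.

```python
import torch
import torch.nn as nn

# Toy stand-in network with a named backbone and head.
model = nn.ModuleDict({
    "backbone": nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    "head": nn.Conv2d(16, 8, 1),
})

# 1) Initialize from weights pre-trained on the large source dataset (UATD).
state = torch.load("uatd_pretrained.pt", map_location="cpu")  # hypothetical file
model.load_state_dict(state, strict=False)  # tolerate head-shape mismatches

# 2) Freeze the backbone so fine-tuning on the small target dataset (SCTD)
#    only adapts the head, reducing the overfitting risk the abstract notes.
for p in model["backbone"].parameters():
    p.requires_grad = False

# 3) Optimize only the remaining trainable parameters at a small learning rate.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9
)
```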

20 pages, 1928 KiB  
Article
An Automated Diagnosis Method for Lung Cancer Target Detection and Subtype Classification-Based CT Scans
by Lingfei Wang, Chenghao Zhang, Yu Zhang and Jin Li
Bioengineering 2024, 11(8), 767; https://doi.org/10.3390/bioengineering11080767 - 30 Jul 2024
Cited by 3 | Viewed by 2470
Abstract
When dealing with small targets in lung cancer detection, the YOLO V8 algorithm may encounter false positives and misses. To address this issue, this study proposes an enhanced YOLO V8 detection model. The model integrates a large separable kernel attention mechanism into the C2f module to expand the information retrieval range, strengthens the extraction of lung cancer features in the Backbone section, and achieves effective interaction between multi-scale features in the Neck section, thereby enhancing feature representation and robustness. Additionally, depth-wise convolution and Coordinate Attention mechanisms are embedded in the Fast Spatial Pyramid Pooling module to reduce feature loss and improve detection accuracy. This study introduces a Minimum Point Distance-based IOU loss to enhance correlation between predicted and ground truth bounding boxes, improving adaptability and accuracy in small target detection. Experimental validation demonstrates that the improved network outperforms other mainstream detection networks in terms of average precision values and surpasses other classification networks in terms of accuracy. These findings validate the outstanding performance of the enhanced model in the localization and recognition aspects of lung cancer auxiliary diagnosis. Full article
(This article belongs to the Section Biomedical Engineering and Biomaterials)
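A hedged sketch of the minimum-point-distance IoU loss as usually defined: the IoU is penalized by the squared distances between matching box corners, normalized by the image size.

```python
import torch

def mpdiou_loss(pred, target, img_w, img_h, eps=1e-7):
    """MPDIoU loss sketch. Boxes are (x1, y1, x2, y2), shape (N, 4)."""
    # intersection-over-union
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # squared distances between top-left and bottom-right corner pairs,
    # normalized by the squared image diagonal
    d1 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    d2 = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return 1 - (iou - d1 / norm - d2 / norm)  # per-box loss
```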

23 pages, 5350 KiB  
Article
Enhancing Automated Brain Tumor Detection Accuracy Using Artificial Intelligence Approaches for Healthcare Environments
by Akmalbek Abdusalomov, Mekhriddin Rakhimov, Jakhongir Karimberdiyev, Guzal Belalova and Young Im Cho
Bioengineering 2024, 11(6), 627; https://doi.org/10.3390/bioengineering11060627 - 19 Jun 2024
Cited by 11 | Viewed by 5826
Abstract
Medical imaging and deep learning models are essential to the early identification and diagnosis of brain cancers, facilitating timely intervention and improving patient outcomes. This research paper investigates the integration of YOLOv5, a state-of-the-art object detection framework, with non-local neural networks (NLNNs) to improve the robustness and accuracy of brain tumor detection. This study begins by curating a comprehensive dataset comprising brain MRI scans from various sources. To facilitate effective fusion, YOLOv5 is integrated with the NLNN, K-means+, and spatial pyramid pooling fast+ (SPPF+) modules within a unified framework. The brain tumor dataset is used to refine the YOLOv5 model through the application of transfer learning techniques, adapting it specifically to the task of tumor detection. The results indicate that the combination of YOLOv5 with the other modules enhances detection capabilities compared to using YOLOv5 exclusively, yielding recall rates of 86% and 83%, respectively. Moreover, the research explores the interpretability aspect of the combined model: by visualizing the attention maps generated by the NLNN module, the regions of interest associated with tumor presence are highlighted, aiding in the understanding and validation of the methodology’s decision-making procedure. Additionally, the impact of hyperparameters, such as the NLNN kernel size, fusion strategy, and training data augmentation, is investigated to optimize the performance of the combined model. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) in Biomedicine)
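The non-local blocks referenced above follow Wang et al.'s embedded-Gaussian form, in which every spatial position attends to every other; a compact PyTorch sketch:

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block: 1x1 convs produce query/key/value
    maps, a softmax over all position pairs yields the attention, and the
    output is added back residually."""
    def __init__(self, c):
        super().__init__()
        self.c_inner = c // 2
        self.theta = nn.Conv2d(c, self.c_inner, 1)  # query
        self.phi = nn.Conv2d(c, self.c_inner, 1)    # key
        self.g = nn.Conv2d(c, self.c_inner, 1)      # value
        self.out = nn.Conv2d(self.c_inner, c, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).view(b, self.c_inner, -1).transpose(1, 2)  # (b, hw, c')
        k = self.phi(x).view(b, self.c_inner, -1)                    # (b, c', hw)
        v = self.g(x).view(b, self.c_inner, -1).transpose(1, 2)      # (b, hw, c')
        attn = torch.softmax(q @ k, dim=-1)                          # (b, hw, hw)
        y = (attn @ v).transpose(1, 2).view(b, self.c_inner, h, w)
        return x + self.out(y)  # residual connection
```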

29 pages, 6850 KiB  
Article
YOLOv8-MU: An Improved YOLOv8 Underwater Detector Based on a Large Kernel Block and a Multi-Branch Reparameterization Module
by Xing Jiang, Xiting Zhuang, Jisheng Chen, Jian Zhang and Yiwen Zhang
Sensors 2024, 24(9), 2905; https://doi.org/10.3390/s24092905 - 1 May 2024
Cited by 14 | Viewed by 4481
Abstract
Underwater visual detection technology is crucial for marine exploration and monitoring. Given the growing demand for accurate underwater target recognition, this study introduces an innovative architecture, YOLOv8-MU, which significantly enhances the detection accuracy. This model incorporates the large kernel block (LarK block) from UniRepLKNet to optimize the backbone network, achieving a broader receptive field without increasing the model’s depth. Additionally, the integration of C2fSTR, which combines the Swin transformer with the C2f module, and the SPPFCSPC_EMA module, which blends Cross-Stage Partial Fast Spatial Pyramid Pooling (SPPFCSPC) with attention mechanisms, notably improves the detection accuracy and robustness for various biological targets. A fusion block from DAMO-YOLO further enhances the multi-scale feature extraction capabilities in the model’s neck. Moreover, the adoption of the MPDIoU loss function, designed around the vertex distance, effectively addresses the challenges of localization accuracy and boundary clarity in underwater organism detection. The experimental results on the URPC2019 dataset indicate that YOLOv8-MU achieves an mAP@0.5 of 78.4%, showing an improvement of 4.0% over the original YOLOv8 model. Additionally, on the URPC2020 dataset, it achieves 80.9%, and, on the Aquarium dataset, it reaches 75.5%, surpassing other models, including YOLOv5 and YOLOv8n, thus confirming the wide applicability and generalization capabilities of our proposed improved model architecture. Furthermore, an evaluation on the improved URPC2019 dataset demonstrates leading performance (SOTA), with an mAP@0.5 of 88.1%, further verifying its superiority on this dataset. These results highlight the model’s broad applicability and generalization capabilities across various underwater datasets. Full article
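The large-kernel flavor of the LarK block is easy to illustrate; a simplified sketch follows. UniRepLKNet's actual block additionally uses dilated re-parameterized branches that are merged at inference, which are omitted here.

```python
import torch
import torch.nn as nn

class LargeKernelDWBlock(nn.Module):
    """Simplified large-kernel block: a 13x13 depthwise conv widens the
    receptive field at low cost, a 1x1 conv mixes channels afterward, and
    a residual connection keeps training stable."""
    def __init__(self, c, k=13):
        super().__init__()
        self.dw = nn.Conv2d(c, c, k, padding=k // 2, groups=c)
        self.bn = nn.BatchNorm2d(c)
        self.pw = nn.Conv2d(c, c, 1)

    def forward(self, x):
        return x + self.pw(self.bn(self.dw(x)))

# Example: spatial size is preserved.
y = LargeKernelDWBlock(128)(torch.randn(1, 128, 40, 40))
```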

18 pages, 2997 KiB  
Article
YOLOv8-CGRNet: A Lightweight Object Detection Network Leveraging Context Guidance and Deep Residual Learning
by Yixing Niu, Wansheng Cheng, Chunni Shi and Song Fan
Electronics 2024, 13(1), 43; https://doi.org/10.3390/electronics13010043 - 20 Dec 2023
Cited by 21 | Viewed by 4265
Abstract
The growing need for effective object detection models on mobile devices makes it essential to design models that are both accurate and have few parameters. In this paper, we introduce a YOLOv8 Res2Net Extended Network (YOLOv8-CGRNet) approach that achieves enhanced precision under standards suitable for lightweight mobile devices. Firstly, we merge YOLOv8 with the Context GuidedNet (CGNet) and Residual Network with multiple branches (Res2Net) structures, augmenting the model’s ability to learn deep Res2Net features without adding to its complexity or computational demands. CGNet effectively captures local features and their contextual surroundings, exploiting spatial dependencies and context information to improve accuracy; by reducing the number of parameters and saving memory, it adheres to a ‘deep yet slim’ principle, lessening the channel count between stages. Secondly, we explore an improved feature pyramid network (FPN) combination and employ the Stage Partial Spatial Pyramid Pooling Fast (SimPPFCSPC) structure to further strengthen the network’s capability in processing the FPN. Using a dynamic non-monotonic focusing mechanism (FM) with a gradient gain distribution strategy based on Wise-IoU (WIoU) in an anchor-free context, the method effectively manages low-quality examples and enhances the overall performance of the detector. Thirdly, we introduce Unifying Object Detection Heads with Attention, adapting to various input scenarios and increasing the model’s flexibility. Experimental datasets include the commonly used detection datasets VOC2007, VOC2012, and VisDrone. The experimental results demonstrate a 4.3% improvement in detection performance by the proposed framework, affirming superior accuracy and robustness over the original YOLOv8 model and providing insights for future practical applications. Full article
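A simplified sketch of CGNet's context-guided block, which the approach above builds on: a local 3x3 conv and a dilated "surrounding" conv are concatenated, then an image-level gate injects global context. This is a lightly reduced version of the original CG block, not a faithful reproduction.

```python
import torch
import torch.nn as nn

class ContextGuidedBlock(nn.Module):
    """Local + dilated depthwise branches capture a pixel and its
    surroundings; a global-average-pool gate adds image-level context;
    a residual skip preserves the input. Assumes an even channel count."""
    def __init__(self, c, dilation=2):
        super().__init__()
        self.reduce = nn.Conv2d(c, c // 2, 1)
        self.f_loc = nn.Conv2d(c // 2, c // 2, 3, padding=1, groups=c // 2)
        self.f_sur = nn.Conv2d(c // 2, c // 2, 3, padding=dilation,
                               dilation=dilation, groups=c // 2)
        self.bn_act = nn.Sequential(nn.BatchNorm2d(c), nn.PReLU(c))
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c, 1), nn.Sigmoid()
        )

    def forward(self, x):
        r = self.reduce(x)
        joint = self.bn_act(torch.cat([self.f_loc(r), self.f_sur(r)], dim=1))
        return x + joint * self.gate(joint)  # global gating + residual skip
```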

16 pages, 4090 KiB  
Article
Object Detection for Hazardous Material Vehicles Based on Improved YOLOv5 Algorithm
by Pengcheng Zhu, Bolun Chen, Bushi Liu, Zifan Qi, Shanshan Wang and Ling Wang
Electronics 2023, 12(5), 1257; https://doi.org/10.3390/electronics12051257 - 6 Mar 2023
Cited by 15 | Viewed by 3463
Abstract
Hazardous material vehicles are a non-negligible mobile source of danger in transport and pose a significant safety risk. Current detection technology is well developed, but it still faces challenges such as heavy computational cost and unsatisfactory accuracy. To address these issues, this paper proposes a method based on YOLOv5 to improve the detection accuracy of hazardous material vehicles. The method introduces an attention module into both the YOLOv5 backbone and neck networks to extract better features, assigning different weights to different parts of the feature map so as to suppress non-critical information. To enhance the fusion capability of the model across feature maps of different sizes, the SPPF (Spatial Pyramid Pooling-Fast) layer in the network is replaced by the SPPCSPC (Spatial Pyramid Pooling Cross Stage Partial Conv) layer. In addition, the bounding box loss function is replaced with the SIoU loss function to effectively speed up bounding box regression and enhance the localization accuracy of the model. Experiments on the dataset show that the improved model effectively improves the detection accuracy of hazardous material vehicles compared with the original model. Our model is of great significance for traffic accident monitoring and effective emergency rescue. Full article
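For contrast with SPPF's sequential pooling (see the first sketch above), here is a lightly simplified PyTorch sketch of the SPPCSPC layout: a CSP-style split in which one branch runs parallel 5/9/13 max-pools before the two branches are fused. It follows the YOLOv7 design in spirit, with some convolutions folded together.

```python
import torch
import torch.nn as nn

def conv_bn_act(c1, c2, k=1):
    return nn.Sequential(
        nn.Conv2d(c1, c2, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(c2),
        nn.SiLU(),
    )

class SPPCSPC(nn.Module):
    """CSP split: one branch pools at three scales in parallel and is fused;
    the other is a plain 1x1 shortcut; both are concatenated at the end."""
    def __init__(self, c1, c2, k=(5, 9, 13)):
        super().__init__()
        c_ = c2 // 2
        self.cv1 = conv_bn_act(c1, c_)
        self.cv2 = conv_bn_act(c1, c_)      # CSP shortcut branch
        self.cv3 = conv_bn_act(c_, c_, 3)
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k
        )
        self.cv4 = conv_bn_act(c_ * 4, c_)
        self.cv5 = conv_bn_act(c_ * 2, c2)

    def forward(self, x):
        y1 = self.cv3(self.cv1(x))
        y1 = self.cv4(torch.cat([y1] + [p(y1) for p in self.pools], dim=1))
        return self.cv5(torch.cat([y1, self.cv2(x)], dim=1))
```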

17 pages, 8269 KiB  
Article
Leveraging Saliency in Single-Stage Multi-Label Concrete Defect Detection Using Unmanned Aerial Vehicle Imagery
by Loucif Hebbache, Dariush Amirkhani, Mohand Saïd Allili, Nadir Hammouche and Jean-François Lapointe
Remote Sens. 2023, 15(5), 1218; https://doi.org/10.3390/rs15051218 - 22 Feb 2023
Cited by 13 | Viewed by 3865
Abstract
Visual inspection of concrete structures using Unmanned Aerial Vehicle (UAV) imagery is a challenging task due to the variability of defects’ size and appearance. This paper proposes a high-performance model for automatic and fast detection of bridge concrete defects using UAV-acquired images. Our method, coined the Saliency-based Multi-label Defect Detector (SMDD-Net), combines pyramidal feature extraction and attention through a one-stage concrete defect detection model. The attention module extracts local and global saliency features, which are scaled and integrated with the pyramidal feature extraction module of the network using max-pooling, multiplication, and residual skip-connection operations. This has the effect of enhancing the localisation of small and low-contrast defects, as well as the overall accuracy of detection across varying image acquisition ranges. Finally, a multi-label detection loss function is used to identify and localise overlapping defects. The experimental results on a standard dataset and real-world images demonstrated the performance of SMDD-Net relative to state-of-the-art techniques. The accuracy and computational efficiency of SMDD-Net make it a suitable method for UAV-based bridge structure inspection. Full article
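A stand-in sketch of the fusion operations the abstract names (max-pooling, multiplication, residual skip), showing how a saliency map might be injected into a pyramid feature; this is a simplified illustration, not SMDD-Net's exact module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SaliencyFusion(nn.Module):
    """Max-pool a full-resolution saliency map down to the feature
    resolution, turn it into a multiplicative gate, and combine it with
    the features through a residual skip."""
    def __init__(self, c):
        super().__init__()
        self.proj = nn.Conv2d(1, c, 1)  # lift the 1-channel map to c channels

    def forward(self, feat, saliency):
        # saliency: (b, 1, H, W); feat: (b, c, h, w); H is a multiple of h
        stride = saliency.shape[-1] // feat.shape[-1]
        s = F.max_pool2d(saliency, kernel_size=stride, stride=stride)
        gate = torch.sigmoid(self.proj(s))
        return feat + feat * gate       # multiplication + residual skip

# Example: fuse a full-resolution saliency map into a stride-8 feature map.
out = SaliencyFusion(256)(torch.randn(1, 256, 32, 32), torch.rand(1, 1, 256, 256))
```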
