Deep Learning for Object Detection

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 May 2025 | Viewed by 39104

Special Issue Editor


Guest Editor
Digital Industry Center, Fondazione Bruno Kessler, 18, 38123 Trento, Italy
Interests: you only look once (YOLO); big data; convolutional neural networks (CNNs); object detection; artificial intelligence

Special Issue Information

Dear Colleagues,

Currently, models based on convolutional neural networks (CNNs) are increasingly being applied for image classification due to their ability to handle big data. Models such as you only look once (YOLO) have become very popular for having greater flexibility and good performance in object identification.

In this Special Issue, we are aiming to collate studies on all of the aspects surrounding “Deep Learning for Object Detection”. Any original, unpublished work is welcome. If you have an interest in this topic, please let us know.

Dr. Stéfano Frizzo Stefenon
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • you only look once (YOLO)
  • big data
  • convolutional neural networks (CNNs)
  • object detection
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (17 papers)

Research

21 pages, 4057 KiB  
Article
RHS-YOLOv8: A Lightweight Underwater Small Object Detection Algorithm Based on Improved YOLOv8
by Yifan Wei, Jun Tao, Wenjun Wu, Donghua Yuan and Shunzhi Hou
Appl. Sci. 2025, 15(7), 3778; https://doi.org/10.3390/app15073778 - 30 Mar 2025
Viewed by 557
Abstract
To address the challenge posed by the abundance of small objects with weak features and little information in images of underwater biomonitoring scenarios, and the added difficulty of recognizing these objects due to light absorption and scattering in the underwater environment, this study proposes an improved RHS-YOLOv8 (Ref-Dilated-HBFPN-SOB-YOLOv8). First, a combination of hybrid dilated convolution and RefConv is used to redesign the lightweight Ref-Dilated convolution block, which reduces the model computation. Second, a new feature pyramid network fusion module, the Hybrid Bridge Feature Pyramid Network (HBFPN), is designed to fuse the deep features with the high-level features, as well as the features of the current layer, to improve the feature extraction capability for fuzzy objects. Third, Efficient Localization Attention (ELA) is added to reduce the interference of irrelevant factors on prediction. Fourth, an Involution module is introduced to effectively capture spatial long-range relationships and improve recognition accuracy. Finally, a small object detection branch is incorporated into the original architecture to enhance the model’s performance in detecting small objects. Experiments on the DUO dataset show that RHS-YOLOv8 reduces the computational cost by 9.95%, while mAP@0.5 and mAP@0.50:0.95 are improved by 2.54% and 4.31%, respectively. Compared with other cutting-edge underwater object detection algorithms, the proposed algorithm improves detection accuracy while remaining lightweight, which effectively enhances the capability to detect small underwater objects.
(This article belongs to the Special Issue Deep Learning for Object Detection)
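
The "Ref-Dilated" block named in the abstract above builds on dilated convolution, but the abstract does not give its internals. As a minimal sketch of the general idea only (not the authors' block), the following shows how dilation widens the span a kernel covers without adding weights. The function names `effective_kernel_size` and `dilated_conv1d` are hypothetical, and a 1-D signal is used purely for brevity.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D cross-correlation with a dilated kernel."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += weight * signal[start + j * dilation]
        out.append(acc)
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # three weights in both cases
print(effective_kernel_size(3, 1), dilated_conv1d(signal, kernel, dilation=1))
print(effective_kernel_size(3, 2), dilated_conv1d(signal, kernel, dilation=2))
```

With the same three weights, a dilation of 2 covers five input samples instead of three, which is why stacking different dilation rates can enlarge the receptive field at little extra cost.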

18 pages, 72941 KiB  
Article
Performance Evaluation of YOLOv8, YOLOv9, YOLOv10, and YOLOv11 for Stamp Detection in Scanned Documents
by João Bento, Thuanne Paixão and Ana Beatriz Alvarez
Appl. Sci. 2025, 15(6), 3154; https://doi.org/10.3390/app15063154 - 14 Mar 2025
Viewed by 1902
Abstract
Stamps are an essential mechanism for authenticating documents in various sectors and institutions. Given the high volume of documents and the increase in forgery, it is necessary to adopt automated methods to identify stamps on documents. In this context, techniques based on deep learning stand out as an efficient solution for automating this process. To this end, this article presents a performance evaluation of YOLOv8s, YOLOv9s, YOLOv10s, and YOLOv11s in detecting stamps on scanned documents. To train, validate, and test the models, an adapted dataset with 732 images from the combination of the StaVer and DDI-100 datasets is used. The performance of the models is evaluated by means of quantitative and qualitative analyses and by analyzing the computational cost. The results show that, in terms of performance, the YOLOv9s model obtained the best result, with an mAP (Mean Average Precision) of 98.7% for a precision and recall of 97.6%. In terms of computational cost and shorter inference time, the YOLOv11s model stands out. This comparative approach is a contribution to the state of the art for implementation in automatic stamp authentication devices.

20 pages, 5309 KiB  
Article
DAPONet: A Dual Attention and Partially Overparameterized Network for Real-Time Road Damage Detection
by Weichao Pan, Jianmei Lei, Xu Wang, Chengze Lv, Gongrui Wang and Chong Li
Appl. Sci. 2025, 15(3), 1470; https://doi.org/10.3390/app15031470 - 31 Jan 2025
Viewed by 1128
Abstract
Existing methods for detecting road damage mainly depend on manual inspections or sensor-equipped vehicles, which are inefficient, have limited coverage, and are susceptible to errors and delays. These traditional methods also struggle with detecting minor damage, such as small cracks and initial potholes, making real-time road monitoring challenging. To address these issues and improve the performance for real-time road damage detection using Street View Image Data (SVRDD), this study proposes DAPONet, a new deep learning model. DAPONet introduces three main innovations: (1) a dual attention mechanism that combines global context and local attention, (2) a multi-scale partial overparameterization module (CPDA), and (3) an efficient downsampling module (MCD). Experimental results on the SVRDD public dataset show that DAPONet reaches a mAP50 of 70.1%, surpassing YOLOv10n (an optimized version of YOLO) by 10.4%, while reducing the model’s size to 1.6 M parameters and cutting FLOPs to 1.7 G, resulting in a 41% and 80% decrease, respectively. Furthermore, the model’s mAP50-95 of 33.4% on the MS COCO2017 dataset demonstrates its superior performance, with a 0.8% improvement over EfficientDet-D1, while reducing parameters and FLOPs by 74%.

16 pages, 2813 KiB  
Article
An Evaluation of Image Slicing and YOLO Architectures for Object Detection in UAV Images
by Muhammed Telçeken, Devrim Akgun and Sezgin Kacar
Appl. Sci. 2024, 14(23), 11293; https://doi.org/10.3390/app142311293 - 4 Dec 2024
Cited by 1 | Viewed by 1361
Abstract
Object detection in aerial images poses significant challenges due to the high dimensions of the images, requiring efficient handling and resizing to fit object detection models. The image-slicing approach for object detection in aerial images can increase detection accuracy by eliminating pixel loss in high-resolution image data. However, determining the proper dimensions to slice is essential for the integrity of the objects and their learning by the model. This study presents an evaluation of the image-slicing approach for alternative sizes of images to optimize efficiency. For this purpose, a dataset of high-resolution images collected with Unmanned Aerial Vehicles (UAVs) has been used. Experiments with alternative YOLO architectures such as YOLOv7, YOLOv8, and YOLOv9 show that the image dimensions significantly change the performance results. According to the experiments, the best mAP@0.5 of 88.2 was obtained with YOLOv7 using 1280×1280 slices. Results show that edge-related objects are better preserved as the overlap and slicing sizes increase, resulting in improved model performance.
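
The slicing idea in the abstract above can be illustrated with a short sketch that computes overlapping tile rectangles for a large image. This is only an assumption about how such slicing might be done, not the authors' procedure: the function names, the 256-pixel overlap, and the 4000×3000 image size are all hypothetical, and only the 1280×1280 tile size comes from the abstract.

```python
def slice_positions(length: int, tile: int, overlap_px: int):
    """Start offsets of tiles along one axis; the last tile sits flush with the edge."""
    if length <= tile:
        return [0]
    stride = max(1, tile - overlap_px)
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # guarantees the image border is covered
    return starts


def slice_boxes(width: int, height: int, tile: int = 1280, overlap_px: int = 256):
    """Return (x1, y1, x2, y2) tile rectangles covering the whole image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in slice_positions(height, tile, overlap_px)
        for x in slice_positions(width, tile, overlap_px)
    ]


# Hypothetical 4000x3000 UAV frame cut into 1280x1280 tiles.
tiles = slice_boxes(4000, 3000, tile=1280, overlap_px=256)
print(len(tiles), tiles[0], tiles[-1])
```

A larger overlap means an object cut by one tile border is more likely to appear whole in a neighbouring tile, at the price of more tiles to process per image.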

20 pages, 15897 KiB  
Article
EMB-YOLO: A Lightweight Object Detection Algorithm for Isolation Switch State Detection
by Haojie Chen, Lumei Su, Riben Shu, Tianyou Li and Fan Yin
Appl. Sci. 2024, 14(21), 9779; https://doi.org/10.3390/app14219779 - 25 Oct 2024
Cited by 1 | Viewed by 1167
Abstract
In power inspection, it is crucial to accurately and regularly monitor the status of isolation switches to ensure the stable operation of power systems. However, current methods for detecting the open and closed states of isolation switches based on image recognition still suffer from low accuracy and high edge deployment costs. In this paper, we propose a lightweight object detection model, EMB-YOLO, to address this challenge. Firstly, we propose an efficient mobile inverted bottleneck convolution (EMBC) module for the backbone network. This module is designed with a lightweight structure, aimed at reducing the computational complexity and parameter count, thereby optimizing the model’s computational efficiency. Furthermore, an ELA attention mechanism is used in the EMBC module to enhance the extraction of horizontal and vertical isolation switch features in complex environments. Finally, we propose an efficient-RepGDFPN fusion network. This network integrates feature maps from different levels to detect isolation switches at multiple scales in monitoring scenarios. An isolation switch dataset was self-built to evaluate the performance of the proposed EMB-YOLO. The experimental results demonstrated that the proposed method achieved superior detection performance on our self-built dataset, with a mean average precision (mAP) of 87.2%, while maintaining a computational cost of only 6.5 × 10^9 FLOPs and a parameter size of just 2.8 × 10^6 bytes.

19 pages, 3429 KiB  
Article
An Insulator Fault Diagnosis Method Based on Multi-Mechanism Optimization YOLOv8
by Chuang Gong, Wei Jiang, Dehua Zou, Weiwei Weng and Hongjun Li
Appl. Sci. 2024, 14(19), 8770; https://doi.org/10.3390/app14198770 - 28 Sep 2024
Viewed by 1143
Abstract
Aiming at the problem that insulator image backgrounds are complex and fault types are diverse, which makes it difficult for existing deep learning algorithms to achieve accurate insulator fault diagnosis, an insulator fault diagnosis method based on multi-mechanism optimization YOLOv8-DCP is proposed. Firstly, a feature extraction and fusion module, named CW-DRB, was designed. This module enhances the C2f structure of YOLOv8 by incorporating the dilation-wise residual module and the dilated re-param module. The introduction of this module improves YOLOv8’s capability for multi-scale feature extraction and multi-level feature fusion. Secondly, the CARAFE module, which is feature content-aware, was introduced to replace the up-sampling layer in YOLOv8n, thereby enhancing the model’s feature map reconstruction ability. Finally, an additional small-object detection layer was added to improve the detection accuracy of small defects. Simulation results indicate that YOLOv8-DCP achieves an accuracy of 97.7% and an mAP@0.5 of 93.9%. Compared to YOLOv5, YOLOv7, and YOLOv8n, the accuracy improved by 1.5%, 4.3%, and 4.8%, while the mAP@0.5 increased by 3.0%, 4.3%, and 3.1%. This results in a significant enhancement in the accuracy of insulator fault diagnosis.

22 pages, 7527 KiB  
Article
EAAnet: Efficient Attention and Aggregation Network for Crowd Person Detection
by Wenzhuo Chen, Wen Wu, Wantao Dai and Feng Huang
Appl. Sci. 2024, 14(19), 8692; https://doi.org/10.3390/app14198692 - 26 Sep 2024
Viewed by 985
Abstract
With the frequent occurrence of natural disasters and the acceleration of urbanization, it is necessary to carry out efficient evacuation, especially when earthquakes, fires, terrorist attacks, and other serious threats occur. However, due to factors such as small targets, complex posture, occlusion, and dense distribution, the current mainstream algorithms still have problems such as low precision and poor real-time performance in crowd person detection. Therefore, this paper proposes EAAnet, a crowd person detection algorithm. It is based on YOLOv5, with CBAM (Convolutional Block Attention Module) introduced into the backbone, BiFPN (Bidirectional Feature Pyramid Network) introduced into the neck, and combined with a loss function of CIoU_Loss to better predict the person number. The experimental results show that compared with other mainstream detection algorithms, EAAnet has achieved significant improvement in precision and real-time performance. The precision value of all categories was 78.6%, which was increased by 1.8. Among these, the categories of riders and partially visible persons were increased by 4.6 and 0.8, respectively. At the same time, EAAnet has only 7.1M parameters, with a computational cost of 16.0G FLOPs. Therefore, EAAnet is shown to be capable of efficient real-time crowd person detection and is feasible in the field of emergency management.
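
The abstract above names a CIoU loss without spelling it out. The sketch below follows the commonly cited Complete-IoU definition (IoU minus a normalized centre-distance term minus an aspect-ratio term); it is an illustration under that assumption, not code from the paper, and the function names and the `(x1, y1, x2, y2)` box layout are choices made here.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)

    # Squared distance between box centres.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1 + eps)) - math.atan((ax2 - ax1) / (ay2 - ay1 + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred, target):
    """Loss is 1 - CIoU, so a perfect match gives a loss near zero."""
    return 1.0 - ciou(pred, target)


print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike plain IoU, the centre-distance term still gives a useful gradient signal when the two boxes do not overlap at all.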

20 pages, 7732 KiB  
Article
Real-Time Detection of Insulator Defects with Channel Pruning and Channel Distillation
by Dewei Meng, Xuemei Xu, Zhaohui Jiang and Lei Xu
Appl. Sci. 2024, 14(19), 8587; https://doi.org/10.3390/app14198587 - 24 Sep 2024
Cited by 3 | Viewed by 1053
Abstract
Insulators are essential for electrical insulation and structural support in transmission lines. With the advancement of deep learning, object detection algorithms have become primary tools for detecting insulator defects. However, challenges such as low detection accuracy for small targets, weak feature map representation, the insufficient extraction of key information, and a lack of comprehensive datasets persist. This paper introduces OD (Omni-dimensional dynamic)-YOLOv7-tiny, an enhanced insulator defect detection method. We replace the YOLOv7-tiny backbone with FasterNet and optimize the convolution structure using PConv, improving spatial feature extraction efficiency and operational speed. Additionally, we incorporate the OD (Omni-dimensional dynamic)-SlimNeck feature fusion module and a decoupled detection head to enhance accuracy. For deployment on edge devices, channel pruning and channel-wise distillation are applied, significantly reducing model parameters while maintaining high accuracy. Experimental results show that the improved model reduces parameters by 53% and increases accuracy and mean average precision (mAP) by 3.9% and 2.2%, respectively. These enhancements confirm the effectiveness of our lightweight model for insulator defect detection on edge devices.

16 pages, 3615 KiB  
Article
High-Precision BEV-Based Road Recognition Method for Warehouse AMR Based on IndoorPathNet and Transfer Learning
by Tianwei Zhang, Ci He, Shiwen Li, Rong Lai, Zili Wang, Lemiao Qiu and Shuyou Zhang
Appl. Sci. 2024, 14(11), 4587; https://doi.org/10.3390/app14114587 - 27 May 2024
Viewed by 1253
Abstract
The rapid development and application of AMRs is important for Industry 4.0 and smart logistics. For large-scale dynamic flat warehouses, vision-based road recognition amidst complex obstacles is paramount for improving navigation efficiency and flexibility, while avoiding frequent manual settings. However, current mainstream road recognition methods face significant challenges of unsatisfactory accuracy and efficiency, as well as the lack of a large-scale high-quality dataset. To address this, this paper introduces IndoorPathNet, a transfer-learning-based Bird’s Eye View (BEV) indoor path segmentation network that furnishes directional guidance to AMRs through real-time segmented indoor pathway maps. IndoorPathNet employs a lightweight U-shaped architecture integrated with spatial self-attention mechanisms to augment the speed and accuracy of indoor pathway segmentation. Moreover, it surmounts the challenge of training posed by the scarcity of publicly available semantic datasets for warehouses through the strategic employment of transfer learning. Comparative experiments conducted between IndoorPathNet and four other lightweight models on the Urban Aerial Vehicle Image Dataset (UAVID) yielded a maximum Intersection Over Union (IOU) of 82.2%. On the Warehouse Indoor Path Dataset, the maximum IOU attained was 98.4% while achieving a processing speed of 9.81 frames per second (FPS) with a 1024 × 1024 input on a single 3060 GPU.

17 pages, 5379 KiB  
Article
RDD-YOLO: Road Damage Detection Algorithm Based on Improved You Only Look Once Version 8
by Yue Li, Chang Yin, Yutian Lei, Jiale Zhang and Yiting Yan
Appl. Sci. 2024, 14(8), 3360; https://doi.org/10.3390/app14083360 - 16 Apr 2024
Cited by 10 | Viewed by 4633
Abstract
The detection of road damage is highly important for traffic safety and road maintenance. Conventional detection approaches frequently require significant time and expenditure, the accuracy of detection cannot be guaranteed, and they are prone to misdetection or omission problems. Therefore, this paper introduces an enhanced version of the You Only Look Once version 8 (YOLOv8) road damage detection algorithm called RDD-YOLO. First, the simple attention mechanism (SimAM) is integrated into the backbone, which successfully improves the model’s focus on crucial details within the input image, enabling the model to capture features of road damage more accurately, thus enhancing the model’s precision. Second, the neck structure is optimized by replacing traditional convolution modules with GhostConv. This reduces redundant information, lowers the number of parameters, and decreases computational complexity while maintaining the model’s excellent performance in damage recognition. Last, the upsampling algorithm in the neck is improved by replacing the nearest interpolation with more accurate bilinear interpolation. This enhances the model’s capacity to maintain visual details, providing clearer and more accurate outputs for road damage detection tasks. Experimental findings on the RDD2022 dataset show that the proposed RDD-YOLO model achieves an mAP50 and mAP50-95 of 62.5% and 36.4% on the validation set, respectively. Compared to the baseline, this represents an improvement of 2.5% and 5.2%. The F1 score on the test set reaches 69.6%, a 2.8% improvement over the baseline. The proposed method can accurately locate and detect road damage, save labor and material resources, and offer guidance for the assessment and upkeep of road damage.
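
To make the difference between nearest and bilinear upsampling mentioned in the abstract above concrete, here is a minimal pure-Python sketch on a tiny feature map. It assumes half-pixel sample centres with edge clamping, which is one common convention; the abstract does not say which convention the authors' framework uses, and the function names are hypothetical.

```python
def upsample_nearest(img, scale=2):
    """Repeat each source value `scale` times along both axes."""
    return [
        [img[y // scale][x // scale] for x in range(len(img[0]) * scale)]
        for y in range(len(img) * scale)
    ]


def upsample_bilinear(img, scale=2):
    """Blend the four nearest source values (half-pixel centres, clamped at edges)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h * scale):
        sy = min(max((y + 0.5) / scale - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * scale):
            sx = min(max((x + 0.5) / scale - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


feature_map = [[0.0, 4.0]]
print(upsample_nearest(feature_map))   # blocky step from 0 to 4
print(upsample_bilinear(feature_map))  # gradual ramp between the two values
```

Nearest interpolation copies values and produces a hard step, while bilinear interpolation produces intermediate values, which is the smoother behaviour the abstract credits with preserving visual detail.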

18 pages, 4616 KiB  
Article
Seatbelt Detection Algorithm Improved with Lightweight Approach and Attention Mechanism
by Liankui Qiu, Jiankun Rao and Xiangzhe Zhao
Appl. Sci. 2024, 14(8), 3346; https://doi.org/10.3390/app14083346 - 16 Apr 2024
Viewed by 1764
Abstract
Precise and rapid detection of seatbelts is an essential research field for intelligent traffic management. In order to improve the detection precision of seatbelts and speed up algorithm inference, a lightweight seatbelt detection algorithm is proposed. Firstly, by adding the G-ELAN module designed in this paper to the YOLOv7-tiny network, the optimization of construction and reduction of parameters are accomplished, and the ResNet is compressed with the channel pruning approach to decrease computational overheads. Then, the Mish activation function is utilized to replace the Leaky ReLU in the neck to enhance the non-linear capability of the network. Finally, the triplet attention module is integrated into the model after pruning to make up for the underlying performance reduction caused by the previous stage and upgrade overall detection precision. The experimental results based on the self-built seatbelt dataset showed that, compared to the initial network, the Mean Average Precision (mAP) achieved by the proposed GM-YOLOv7 was improved by 3.8%, while the volume and the computation amount were lowered by 20% and 24.6%, respectively. Compared with YOLOv3, YOLOX, and YOLOv5, the mAP of GM-YOLOv7 increased by 22.4%, 4.6%, and 4.2%, respectively, and the number of computational operations decreased by 25%, 63%, and 38%, respectively. In addition, the accuracy of the improved RST-Net increased to 98.25%, while the parameter value was reduced by 48% compared to the basic model, effectively improving the detection performance and realizing a lightweight structure.
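
For readers unfamiliar with the activation swap described in the abstract above, the sketch below compares Leaky ReLU with Mish, using the standard definition mish(x) = x · tanh(softplus(x)). The negative slope of 0.1 is an assumption for illustration; the abstract does not state the value used.

```python
import math


def leaky_relu(x: float, negative_slope: float = 0.1) -> float:
    """Piecewise-linear activation; the slope value here is an assumed default."""
    return x if x >= 0 else negative_slope * x


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Smooth, non-monotonic activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={value:+.1f}  leaky_relu={leaky_relu(value):+.4f}  mish={mish(value):+.4f}")
```

Both functions pass positive inputs almost unchanged, but Mish is smooth everywhere and dips slightly below zero for small negative inputs instead of having a kink at the origin.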

19 pages, 6635 KiB  
Article
Fire Detection and Geo-Localization Using UAV’s Aerial Images and Yolo-Based Models
by Kheireddine Choutri, Mohand Lagha, Souham Meshoul, Mohamed Batouche, Farah Bouzidi and Wided Charef
Appl. Sci. 2023, 13(20), 11548; https://doi.org/10.3390/app132011548 - 21 Oct 2023
Cited by 16 | Viewed by 4278
Abstract
The past decade has witnessed a growing demand for drone-based fire detection systems, driven by escalating concerns about wildfires exacerbated by climate change, as corroborated by environmental studies. However, deploying existing drone-based fire detection systems in real-world operational conditions poses practical challenges, notably the intricate and unstructured environments and the dynamic nature of UAV-mounted cameras, often leading to false alarms and inaccurate detections. In this paper, we describe a two-stage framework for fire detection and geo-localization. The key features of the proposed work included the compilation of a large dataset from several sources to capture various visual contexts related to fire scenes. The bounding boxes of the regions of interest were labeled using three target levels, namely fire, non-fire, and smoke. The second feature was the investigation of YOLO models to undertake the detection and localization tasks. YOLO-NAS was retained as the best performing model using the compiled dataset with an average mAP50 of 0.71 and an F1_score of 0.68. Additionally, a fire localization scheme based on stereo vision was introduced, and the hardware implementation was executed on a drone equipped with a Pixhawk microcontroller. The test results were very promising and showed the ability of the proposed approach to contribute to a comprehensive and effective fire detection system.
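
The abstract above mentions stereo-vision-based localization but gives no geometry. As a minimal sketch, assuming an ideal rectified pinhole stereo pair (an assumption made here, not a detail from the paper), depth follows from disparity as Z = f · B / d. All function names and numeric values below are hypothetical.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair (pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


def camera_frame_position(u_px, v_px, cx_px, cy_px, focal_px, depth_m):
    """Back-project a pixel (u, v) at a known depth into camera coordinates (X, Y, Z)."""
    x = (u_px - cx_px) * depth_m / focal_px
    y = (v_px - cy_px) * depth_m / focal_px
    return x, y, depth_m


# Hypothetical numbers: 800 px focal length, 0.2 m baseline, 16 px disparity
# measured at the centre of a detected fire bounding box.
z = depth_from_disparity(800.0, 0.2, 16.0)
print(z, camera_frame_position(720.0, 400.0, 640.0, 360.0, 800.0, z))
```

Turning this camera-frame position into a geographic coordinate would additionally require the drone's pose, which is outside the scope of this sketch.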

21 pages, 35297 KiB  
Article
CEMLB-YOLO: Efficient Detection Model of Maize Leaf Blight in Complex Field Environments
by Shengjie Leng, Yasenjiang Musha, Yulin Yang and Guowei Feng
Appl. Sci. 2023, 13(16), 9285; https://doi.org/10.3390/app13169285 - 16 Aug 2023
Cited by 15 | Viewed by 2247
Abstract
Northern corn leaf blight is a severe fungal disease that adversely affects the health of maize crops. In order to prevent maize yield decline caused by leaf blight, we propose a lightweight YOLOv5-based object detection model to rapidly detect maize leaf blight disease in complex scenarios. Firstly, the Crucial Information Position Attention Mechanism (CIPAM) enables the model to focus on retaining critical information during downsampling to reduce information loss. Secondly, we introduce the Feature Restructuring and Fusion Module (FRAFM) to extract deep semantic information and make the fusion of feature maps at different scales more effective. Thirdly, we add the Mobile Bi-Level Transformer (MobileBit) to the feature extraction network to help the model understand complex scenes more effectively and cost-effectively. The experimental results demonstrate that the proposed model achieves 87.5% mAP@0.5 accuracy on the NLB dataset, which is 5.4% higher than the original model.

21 pages, 10442 KiB  
Article
Study on Lightweight Model of Maize Seedling Object Detection Based on YOLOv7
by Kai Zhao, Lulu Zhao, Yanan Zhao and Hanbing Deng
Appl. Sci. 2023, 13(13), 7731; https://doi.org/10.3390/app13137731 - 29 Jun 2023
Cited by 28 | Viewed by 3045
Abstract
Traditional maize seedling detection mainly relies on manual observation and experience, which is time-consuming and prone to errors. With the rapid development of deep learning and object-detection technology, we propose a lightweight model LW-YOLOv7 to address the above issues. The new model can be deployed on mobile devices with limited memory and can detect maize seedlings in the field in real time. LW-YOLOv7 is based on YOLOv7 but incorporates GhostNet as the backbone network to reduce parameters. The Convolutional Block Attention Module (CBAM) enhances the network’s attention to the target region. In the head of the model, the Path Aggregation Network (PANet) is replaced with a Bi-Directional Feature Pyramid Network (BiFPN) to improve semantic and location information. The SIoU loss function is used during training to enhance bounding box regression speed and detection accuracy. Experimental results reveal that LW-YOLOv7 outperforms YOLOv7 in terms of accuracy and parameter reduction. Compared to other object-detection models like Faster RCNN, YOLOv3, YOLOv4, and YOLOv5l, LW-YOLOv7 demonstrates increased accuracy, reduced parameters, and improved detection speed. The results indicate that LW-YOLOv7 is suitable for real-time object detection of maize seedlings in field environments and provides a practical solution for efficiently counting the number of seedling maize plants.
(This article belongs to the Special Issue Deep Learning for Object Detection)
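To illustrate why a GhostNet-style backbone reduces parameters, as the abstract above describes, the following is a minimal sketch that compares the weight count of an ordinary convolution with that of a Ghost module. It assumes the formulation from the original GhostNet design: a primary convolution produces `c_out / s` intrinsic feature maps, and cheap `d x d` depthwise operations generate the remaining maps. The function names and the layer sizes in the usage lines are hypothetical and are not taken from LW-YOLOv7.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_module_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost module that outputs c_out feature maps.

    Assumed structure (illustrative, bias ignored):
      1. a primary k x k convolution produces c_out / s intrinsic maps;
      2. each intrinsic map passes through (s - 1) cheap d x d depthwise
         filters to produce the remaining "ghost" maps.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s in this sketch")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k      # ordinary convolution weights
    cheap = intrinsic * (s - 1) * d * d     # depthwise "ghost" weights
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = standard_conv_params(64, 128, 3)   # 73728
ghost = ghost_module_params(64, 128, 3)       # 37440
print(standard, ghost, round(standard / ghost, 2))
```

With `s = 2` the Ghost module needs roughly half the weights of the ordinary convolution, which is the kind of saving a lightweight backbone relies on.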

15 pages, 3842 KiB  
Article
Rail Surface Defect Detection Based on An Improved YOLOv5s
by Hui Luo, Lianming Cai and Chenbiao Li
Appl. Sci. 2023, 13(12), 7330; https://doi.org/10.3390/app13127330 - 20 Jun 2023
Cited by 15 | Viewed by 2964
Abstract
As the operational time of a railway increases, rail surfaces develop irreversible defects. Once defects occur, they tend to grow rapidly, which seriously threatens the safe operation of trains. Accurate and rapid detection of rail surface defects is therefore very important. However, the task faces problems such as low contrast between defects and the background, large scale differences, and insufficient training samples. We therefore propose a rail surface defect detection method based on an improved YOLOv5s. First, the sample dataset of rail surface defect images was augmented with flip transformations, random cropping, and brightness transformations. Next, a Conv2D and Dilated Convolution (CDConv) module was designed to reduce the amount of network computation. In addition, the Swin Transformer was combined with the Backbone and Neck ends to improve the C3 module of the original network. Then, the global attention mechanism (GAM) was introduced into PANet to form a new prediction head, namely the Swin transformer and GAM Prediction Head (SGPH). Finally, we used the Soft-SIoUNMS loss to replace the original CIoU loss, which accelerates convergence and reduces regression errors. The experimental results show that the improved YOLOv5s detection algorithm reaches an average precision of 96.9% for rail surface defect detection, offering accurate and rapid detection with practical engineering value.
(This article belongs to the Special Issue Deep Learning for Object Detection)
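The augmentation step named in the abstract above (flips, random cropping, and brightness changes) can be sketched as follows. This is an illustrative pure-Python version that works on a grayscale image stored as a list of rows; it is not the authors' pipeline, and all function names are hypothetical. For detection data the bounding boxes must be transformed along with the image, which the sketch shows only for the horizontal flip.

```python
import random


def hflip(img):
    """Mirror a grayscale image (list of rows) left to right."""
    return [row[::-1] for row in img]


def hflip_box(box, width):
    """Mirror an (x1, y1, x2, y2) pixel box inside an image of given width."""
    x1, y1, x2, y2 = box
    return (width - x2, y1, width - x1, y2)


def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - crop_h)     # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


def adjust_brightness(img, factor):
    """Scale pixel values by factor and clip to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


rng = random.Random(0)                   # fixed seed for repeatability
image = [[10 * r + c for c in range(6)] for r in range(4)]
augmented = adjust_brightness(random_crop(hflip(image), 3, 4, rng), 1.2)
print(len(augmented), len(augmented[0]))  # 3 4
```

Cropping also shifts and may truncate boxes, so a full pipeline would need to adjust or discard labels that fall outside the crop window.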

14 pages, 4001 KiB  
Article
An Improved YOLOv7 Model Based on Visual Attention Fusion: Application to the Recognition of Bouncing Locks in Substation Power Cabinets
by Yang Wang, Xiaofeng Zhang, Longmei Li, Liming Wang, Ziyang Zhou and Peng Zhang
Appl. Sci. 2023, 13(11), 6817; https://doi.org/10.3390/app13116817 - 4 Jun 2023
Cited by 13 | Viewed by 3206
Abstract
With the continuous progress of intelligent power system technology, target detection algorithms are being applied to identify the status of equipment switches in order to meet the needs of substation operation and maintenance. YOLOv7, the latest of the YOLO (You Only Look Once) series algorithms at the time of writing, offers good speed and accuracy in target detection tasks. However, when the general-purpose network is applied in a specific scenario, its advantages are less apparent because of its large model weight and poor portability. In this paper, an improved GF-YOLOv7 network model is proposed for recognizing the status of bounce locks in a substation. The MobileViT module is used to improve the feature extraction ability of the backbone network. Following the CBAM feature attention mechanism, the channel attention module and the spatial attention module are used to design a more lightweight feature fusion network. The experimental results on the test set show that the proposed network significantly reduces the network weight and improves detection accuracy at the cost of a small reduction in detection speed. The accuracy reaches 97.8%, which meets the needs of the substation bounce-lock detection task.
(This article belongs to the Special Issue Deep Learning for Object Detection)

18 pages, 14068 KiB  
Article
An Improved Few-Shot Object Detection via Feature Reweighting Method for Insulator Identification
by Junpeng Wu and Yibo Zhou
Appl. Sci. 2023, 13(10), 6301; https://doi.org/10.3390/app13106301 - 22 May 2023
Cited by 5 | Viewed by 1925
Abstract
To address the low accuracy of insulator object detection in power systems caused by a scarcity of image samples, this paper proposes a method for identifying insulators based on improved few-shot object detection through feature reweighting. The approach uses a meta-feature transfer model together with an improved YOLOv5 network to recognize insulators under few-shot conditions. First, the feature extraction module of the model incorporates an improved self-calibrated feature extraction network to extract feature information from multi-scale insulators. Second, the reweighting module integrates the SKNet attention mechanism to facilitate precise segmentation of the mask. Finally, a multi-stage non-maximum suppression algorithm is designed in the prediction layer, and a confidence penalty function is set, so that multiple prediction boxes are retained and false and missed detections are reduced. To address poor detection results caused by low diversity in the sample space, a transfer learning strategy is applied during training to transfer the entire trained model to the detection of insulator targets. The experimental results show that the insulator detection mAP reaches 29.6%, 36.0%, and 48.3% at the 5-shot, 10-shot, and 30-shot settings, respectively. These findings indicate improved accuracy of insulator image detection under few-shot conditions. Furthermore, the proposed method enables the recognition of insulators under challenging conditions such as defects, occlusion, and other special circumstances.
(This article belongs to the Special Issue Deep Learning for Object Detection)
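The abstract above mentions a non-maximum suppression step that applies a confidence penalty and retains multiple prediction boxes. The authors' multi-stage algorithm is not specified here, so the following sketch substitutes a different, well-known technique with the same general idea: Gaussian Soft-NMS, which lowers the scores of overlapping boxes instead of deleting them. The function names, `sigma`, and `score_thresh` are illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: return (index, decayed_score) pairs in pick order.

    Overlapping boxes are kept with a reduced score rather than removed;
    boxes are dropped only once their score falls below score_thresh.
    """
    scores = list(scores)                 # copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thresh:
            break                         # everything left scores even lower
        kept.append((best, scores[best]))
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)  # confidence penalty
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for index, score in soft_nms(boxes, scores):
    print(index, round(score, 3))
```

In this example the second box overlaps the first heavily, so its score is reduced and it is ranked below the non-overlapping third box, but it is still retained.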