Deep Learning for Object Detection

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 October 2024 | Viewed by 17316

Special Issue Editor

Dr. Stéfano Frizzo Stefenon
Guest Editor
Digital Industry Center, Fondazione Bruno Kessler, 18, 38123 Trento, Italy
Interests: you only look once (YOLO); big data; convolutional neural networks (CNNs); object detection; artificial intelligence

Special Issue Information

Dear Colleagues,

Models based on convolutional neural networks (CNNs) are increasingly applied to image classification because of their ability to handle big data. Models such as You Only Look Once (YOLO) have become very popular for their flexibility and good performance in object identification.

In this Special Issue, we aim to collate studies on all aspects of “Deep Learning for Object Detection”. Any original, unpublished work is welcome. If you have an interest in this topic, please let us know.

Dr. Stéfano Frizzo Stefenon
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • you only look once (YOLO)
  • big data
  • convolutional neural networks (CNNs)
  • object detection
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (11 papers)


Research

22 pages, 7527 KiB  
Article
EAAnet: Efficient Attention and Aggregation Network for Crowd Person Detection
by Wenzhuo Chen, Wen Wu, Wantao Dai and Feng Huang
Appl. Sci. 2024, 14(19), 8692; https://doi.org/10.3390/app14198692 (registering DOI) - 26 Sep 2024
Abstract
With the frequent occurrence of natural disasters and the acceleration of urbanization, it is necessary to carry out efficient evacuation, especially when earthquakes, fires, terrorist attacks, and other serious threats occur. However, due to factors such as small targets, complex posture, occlusion, and dense distribution, the current mainstream algorithms still have problems such as low precision and poor real-time performance in crowd person detection. Therefore, this paper proposes EAAnet, a crowd person detection algorithm. It is based on YOLOv5, with CBAM (Convolutional Block Attention Module) introduced into the backbone, BiFPN (Bidirectional Feature Pyramid Network) introduced into the neck, and combined with a loss function of CIoU_Loss to better predict the person number. The experimental results show that compared with other mainstream detection algorithms, EAAnet has achieved significant improvement in precision and real-time performance. The precision value of all categories was 78.6%, which was increased by 1.8. Among these, the categories of riders and partially visible person were increased by 4.6 and 0.8, respectively. At the same time, the parameter number of EAAnet is only 7.1M, with a calculation amount of 16.0G FLOPs. Therefore, it is proved that EAAnet has the ability of the efficient real-time detection of the crowd person and is feasible in the field of emergency management.
(This article belongs to the Special Issue Deep Learning for Object Detection)
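The abstract above names CIoU_Loss as the bounding-box regression loss. As a rough illustration only, the sketch below follows the commonly cited Complete IoU formulation (IoU minus a normalized centre-distance term and an aspect-ratio term). The function names `ciou` and `ciou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions for this example, not taken from the paper.

```python
import math

def ciou(box_a, box_b):
    """Complete IoU of two boxes given as (x1, y1, x2, y2) with positive size."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Plain IoU: intersection area over union area.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Squared distance between the two box centres.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1))
        - math.atan((ax2 - ax1) / (ay2 - ay1))
    ) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return iou - rho2 / c2 - alpha * v

def ciou_loss(pred, target):
    """Loss is zero for a perfect match and grows as the boxes diverge."""
    return 1.0 - ciou(pred, target)
```

Unlike plain IoU, this quantity stays informative for non-overlapping boxes because the centre-distance term still changes as the boxes move apart.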

20 pages, 7732 KiB  
Article
Real-Time Detection of Insulator Defects with Channel Pruning and Channel Distillation
by Dewei Meng, Xuemei Xu, Zhaohui Jiang and Lei Xu
Appl. Sci. 2024, 14(19), 8587; https://doi.org/10.3390/app14198587 - 24 Sep 2024
Viewed by 258
Abstract
Insulators are essential for electrical insulation and structural support in transmission lines. With the advancement of deep learning, object detection algorithms have become primary tools for detecting insulator defects. However, challenges such as low detection accuracy for small targets, weak feature map representation, the insufficient extraction of key information, and a lack of comprehensive datasets persist. This paper introduces OD (Omni-dimensional dynamic)-YOLOV7-tiny, an enhanced insulator defect detection method. We replace the YOLOv7-tiny backbone with FasterNet and optimize the convolution structure using PConv, improving spatial feature extraction efficiency and operational speed. Additionally, we incorporate the OD (Omni-dimensional dynamic)-SlimNeck feature fusion module and a decoupled detection head to enhance accuracy. For deployment on edge devices, channel pruning and channel-wise distillation are applied, significantly reducing model parameters while maintaining high accuracy. Experimental results show that the improved model reduces parameters by 53% and increases accuracy and mean average precision (mAP) by 3.9% and 2.2%, respectively. These enhancements confirm the effectiveness of our lightweight model for insulator defect detection on edge devices.
(This article belongs to the Special Issue Deep Learning for Object Detection)
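The abstract above mentions channel pruning but does not state the selection criterion. The sketch below assumes one widely used heuristic, ranking output channels by the L1 norm of their filter weights and keeping the largest. The function name `select_channels` and the flattened-weights input layout are hypothetical and chosen only for illustration.

```python
def select_channels(filters, keep_ratio):
    """Return sorted indices of the output channels to keep.

    filters: one flattened weight list per output channel.
    keep_ratio: fraction of channels to retain (assumed criterion: L1 norm).
    """
    # Score each channel by the sum of absolute weights.
    scores = [sum(abs(w) for w in f) for f in filters]
    n_keep = max(1, round(len(filters) * keep_ratio))
    # Rank channels from highest to lowest score and keep the top ones.
    ranked = sorted(range(len(filters)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

In practice a pruned network is usually fine-tuned afterwards, which is the step where a distillation loss from the unpruned model can be applied.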

16 pages, 3615 KiB  
Article
High-Precision BEV-Based Road Recognition Method for Warehouse AMR Based on IndoorPathNet and Transfer Learning
by Tianwei Zhang, Ci He, Shiwen Li, Rong Lai, Zili Wang, Lemiao Qiu and Shuyou Zhang
Appl. Sci. 2024, 14(11), 4587; https://doi.org/10.3390/app14114587 - 27 May 2024
Viewed by 725
Abstract
The rapid development and application of AMRs is important for Industry 4.0 and smart logistics. For large-scale dynamic flat warehouses, vision-based road recognition amidst complex obstacles is paramount for improving navigation efficiency and flexibility, while avoiding frequent manual settings. However, current mainstream road recognition methods face significant challenges of unsatisfactory accuracy and efficiency, as well as the lack of a large-scale high-quality dataset. To address this, this paper introduces IndoorPathNet, a transfer-learning-based Bird’s Eye View (BEV) indoor path segmentation network that furnishes directional guidance to AMRs through real-time segmented indoor pathway maps. IndoorPathNet employs a lightweight U-shaped architecture integrated with spatial self-attention mechanisms to augment the speed and accuracy of indoor pathway segmentation. Moreover, it surmounts the challenge of training posed by the scarcity of publicly available semantic datasets for warehouses through the strategic employment of transfer learning. Comparative experiments conducted between IndoorPathNet and four other lightweight models on the Urban Aerial Vehicle Image Dataset (UAVID) yielded a maximum Intersection Over Union (IOU) of 82.2%. On the Warehouse Indoor Path Dataset, the maximum IOU attained was 98.4% while achieving a processing speed of 9.81 frames per second (FPS) with a 1024 × 1024 input on a single 3060 GPU.
(This article belongs to the Special Issue Deep Learning for Object Detection)
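The segmentation results above are reported as Intersection Over Union (IOU). A minimal sketch of that metric for binary masks follows. The name `mask_iou`, the nested-list mask layout, and the convention of returning 1.0 when both masks are empty are assumptions for this example.

```python
def mask_iou(pred, target):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0   # pixel set in both masks
            union += 1 if (p or t) else 0    # pixel set in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0
```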

17 pages, 5379 KiB  
Article
RDD-YOLO: Road Damage Detection Algorithm Based on Improved You Only Look Once Version 8
by Yue Li, Chang Yin, Yutian Lei, Jiale Zhang and Yiting Yan
Appl. Sci. 2024, 14(8), 3360; https://doi.org/10.3390/app14083360 - 16 Apr 2024
Cited by 1 | Viewed by 1726
Abstract
The detection of road damage is highly important for traffic safety and road maintenance. Conventional detection approaches frequently require significant time and expenditure, the accuracy of detection cannot be guaranteed, and they are prone to misdetection or omission problems. Therefore, this paper introduces an enhanced version of the You Only Look Once version 8 (YOLOv8) road damage detection algorithm called RDD-YOLO. First, the simple attention mechanism (SimAM) is integrated into the backbone, which successfully improves the model’s focus on crucial details within the input image, enabling the model to capture features of road damage more accurately, thus enhancing the model’s precision. Second, the neck structure is optimized by replacing traditional convolution modules with GhostConv. This reduces redundant information, lowers the number of parameters, and decreases computational complexity while maintaining the model’s excellent performance in damage recognition. Last, the upsampling algorithm in the neck is improved by replacing the nearest interpolation with more accurate bilinear interpolation. This enhances the model’s capacity to maintain visual details, providing clearer and more accurate outputs for road damage detection tasks. Experimental findings on the RDD2022 dataset show that the proposed RDD-YOLO model achieves an mAP50 and mAP50-95 of 62.5% and 36.4% on the validation set, respectively. Compared to baseline, this represents an improvement of 2.5% and 5.2%. The F1 score on the test set reaches 69.6%, a 2.8% improvement over the baseline. The proposed method can accurately locate and detect road damage, save labor and material resources, and offer guidance for the assessment and upkeep of road damage.
(This article belongs to the Special Issue Deep Learning for Object Detection)
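The abstract above replaces nearest-neighbour upsampling with bilinear interpolation in the neck. The sketch below contrasts the two on a small grid. It assumes a corner-aligned sampling convention, which is only one of the conventions used by deep learning frameworks, and the function names are hypothetical.

```python
def upsample_nearest(img, scale):
    """Repeat each pixel `scale` times along both axes (integer scale)."""
    return [[img[y // scale][x // scale] for x in range(len(img[0]) * scale)]
            for y in range(len(img) * scale)]

def upsample_bilinear(img, out_h, out_w):
    """Blend the four nearest source pixels (corner-aligned, an assumption)."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        # Map the output row to a fractional source row.
        fy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            # Interpolate horizontally on both rows, then vertically.
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out
```

Nearest-neighbour output is blocky because values are copied, while the bilinear output introduces intermediate values between neighbouring pixels.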

18 pages, 4616 KiB  
Article
Seatbelt Detection Algorithm Improved with Lightweight Approach and Attention Mechanism
by Liankui Qiu, Jiankun Rao and Xiangzhe Zhao
Appl. Sci. 2024, 14(8), 3346; https://doi.org/10.3390/app14083346 - 16 Apr 2024
Viewed by 738
Abstract
Precise and rapid detection of seatbelts is an essential research field for intelligent traffic management. In order to improve the detection precision of seatbelts and speed up algorithm inference velocity, a lightweight seatbelt detection algorithm is proposed. Firstly, by adding the G-ELAN module designed in this paper to the YOLOv7-tiny network, the optimization of construction and reduction of parameters are accomplished, and the ResNet is compressed with the channel pruning approach to decrease computational overheads. Then, the Mish activation function is utilized to replace the Leaky ReLU in the neck to enhance the non-linear competence of the network. Finally, the triplet attention module is integrated into the model after pruning to make up for the underlying performance reduction caused by the previous stage and upgrade overall detection precision. The experimental results based on the self-built seatbelt dataset showed that, compared to the initial network, the Mean Average Precision (mAP) achieved by the proposed GM-YOLOv7 was improved by 3.8%, while the volume and the computation amount were lowered by 20% and 24.6%, respectively. Compared with YOLOv3, YOLOX, and YOLOv5, the mAP of GM-YOLOv7 increased by 22.4%, 4.6%, and 4.2%, respectively, and the number of computational operations decreased by 25%, 63%, and 38%, respectively. In addition, the accuracy of the improved RST-Net increased to 98.25%, while the parameter value was reduced by 48% compared to the basic model, effectively improving the detection performance and realizing a lightweight structure.
(This article belongs to the Special Issue Deep Learning for Object Detection)
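The abstract above swaps Leaky ReLU for Mish in the neck. For reference, a minimal sketch of both activations is given below, using the standard definition Mish(x) = x · tanh(softplus(x)). The negative slope of 0.1 for Leaky ReLU is an assumed value, since the abstract does not state it.

```python
import math

def leaky_relu(x, slope=0.1):
    """Piecewise linear: identity for x >= 0, small slope otherwise (slope assumed)."""
    return x if x >= 0 else slope * x

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x):
    """Smooth, non-monotonic activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))
```

Mish is smooth around zero, whereas Leaky ReLU has a kink there, which is the usual motivation given for the swap.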

19 pages, 6635 KiB  
Article
Fire Detection and Geo-Localization Using UAV’s Aerial Images and Yolo-Based Models
by Kheireddine Choutri, Mohand Lagha, Souham Meshoul, Mohamed Batouche, Farah Bouzidi and Wided Charef
Appl. Sci. 2023, 13(20), 11548; https://doi.org/10.3390/app132011548 - 21 Oct 2023
Cited by 9 | Viewed by 2642
Abstract
The past decade has witnessed a growing demand for drone-based fire detection systems, driven by escalating concerns about wildfires exacerbated by climate change, as corroborated by environmental studies. However, deploying existing drone-based fire detection systems in real-world operational conditions poses practical challenges, notably the intricate and unstructured environments and the dynamic nature of UAV-mounted cameras, often leading to false alarms and inaccurate detections. In this paper, we describe a two-stage framework for fire detection and geo-localization. The key features of the proposed work included the compilation of a large dataset from several sources to capture various visual contexts related to fire scenes. The bounding boxes of the regions of interest were labeled using three target levels, namely fire, non-fire, and smoke. The second feature was the investigation of YOLO models to undertake the detection and localization tasks. YOLO-NAS was retained as the best performing model using the compiled dataset with an average mAP50 of 0.71 and an F1_score of 0.68. Additionally, a fire localization scheme based on stereo vision was introduced, and the hardware implementation was executed on a drone equipped with a Pixhawk microcontroller. The test results were very promising and showed the ability of the proposed approach to contribute to a comprehensive and effective fire detection system.
(This article belongs to the Special Issue Deep Learning for Object Detection)
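The abstract above reports an F1_score alongside mAP50. As a reminder of how F1 relates to detection counts, the sketch below computes precision, recall, and F1 from true positives, false positives, and false negatives. The function name and the made-up counts in the test are illustrative only and do not reproduce the paper's data.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (F1) from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For mAP50, a detection is typically counted as a true positive when its IoU with a ground-truth box is at least 0.5.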

21 pages, 35297 KiB  
Article
CEMLB-YOLO: Efficient Detection Model of Maize Leaf Blight in Complex Field Environments
by Shengjie Leng, Yasenjiang Musha, Yulin Yang and Guowei Feng
Appl. Sci. 2023, 13(16), 9285; https://doi.org/10.3390/app13169285 - 16 Aug 2023
Cited by 7 | Viewed by 1619
Abstract
Northern corn leaf blight is a severe fungal disease that adversely affects the health of maize crops. In order to prevent maize yield decline caused by leaf blight, we propose the YOLOv5-based object detection lightweight models to rapidly detect maize leaf blight disease in complex scenarios. Firstly, the Crucial Information Position Attention Mechanism (CIPAM) enables the model to focus on retaining critical information during downsampling to reduce information loss. Secondly, we introduce the Feature Restructuring and Fusion Module (FRAFM) to extract deep semantic information and make the feature map fusion across maps at different scales more effective. Thirdly, we add the Mobile Bi-Level Transformer (MobileBit) to the feature extraction network to help the model understand complex scenes more effectively and cost-effectively. The experimental results demonstrate that the proposed model achieves 87.5% mAP@0.5 accuracy on the NLB dataset, which is 5.4% higher than the original model.
(This article belongs to the Special Issue Deep Learning for Object Detection)

21 pages, 10442 KiB  
Article
Study on Lightweight Model of Maize Seedling Object Detection Based on YOLOv7
by Kai Zhao, Lulu Zhao, Yanan Zhao and Hanbing Deng
Appl. Sci. 2023, 13(13), 7731; https://doi.org/10.3390/app13137731 - 29 Jun 2023
Cited by 18 | Viewed by 2389
Abstract
Traditional maize seedling detection mainly relies on manual observation and experience, which is time-consuming and prone to errors. With the rapid development of deep learning and object-detection technology, we propose a lightweight model LW-YOLOv7 to address the above issues. The new model can be deployed on mobile devices with limited memory for real-time detection of maize seedlings in the field. LW-YOLOv7 is based on YOLOv7 but incorporates GhostNet as the backbone network to reduce parameters. The Convolutional Block Attention Module (CBAM) enhances the network’s attention to the target region. In the head of the model, the Path Aggregation Network (PANet) is replaced with a Bi-Directional Feature Pyramid Network (BiFPN) to improve semantic and location information. The SIoU loss function is used during training to enhance bounding box regression speed and detection accuracy. Experimental results reveal that LW-YOLOv7 outperforms YOLOv7 in terms of accuracy and parameter reduction. Compared to other object-detection models like Faster RCNN, YOLOv3, YOLOv4, and YOLOv5l, LW-YOLOv7 demonstrates increased accuracy, reduced parameters, and improved detection speed. The results indicate that LW-YOLOv7 is suitable for real-time object detection of maize seedlings in field environments and provides a practical solution for efficiently counting the number of seedling maize plants.
(This article belongs to the Special Issue Deep Learning for Object Detection)
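The abstract above replaces PANet with BiFPN. One distinguishing element of BiFPN as originally described is its learnable weighted feature fusion. The sketch below shows the "fast normalized fusion" variant on flat feature vectors. The function name, the flat-list feature layout, and the default `eps` are assumptions for illustration, and the paper's exact fusion may differ.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature vectors with non-negative normalized weights.

    features: list of equally long lists of floats (one per input branch).
    weights: one learnable scalar per branch; negatives are clipped to zero.
    """
    # ReLU keeps every weight non-negative so the normalization is stable.
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + eps
    n = len(features[0])
    return [sum(wi * f[k] for wi, f in zip(w, features)) / total
            for k in range(n)]
```

With equal weights this reduces to a plain average, so the learned weights can be read as how much each resolution contributes to the fused map.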

15 pages, 3842 KiB  
Article
Rail Surface Defect Detection Based on An Improved YOLOv5s
by Hui Luo, Lianming Cai and Chenbiao Li
Appl. Sci. 2023, 13(12), 7330; https://doi.org/10.3390/app13127330 - 20 Jun 2023
Cited by 10 | Viewed by 1954
Abstract
As the operational time of the railway increases, rail surfaces undergo irreversible defects. Once the defects occur, it is easy for them to develop rapidly, which seriously threatens the safe operation of trains. Therefore, the accurate and rapid detection of rail surface defects is very important. However, in the detection of rail surface defects, there are problems, such as low contrast between defects and the background, large scale differences, and insufficient training samples. Therefore, we propose a rail surface defect detection method based on an improved YOLOv5s in this paper. Firstly, the sample dataset of rail surface defect images was augmented with flip transformations, random cropping, and brightness transformations. Next, a Conv2D and Dilated Convolution (CDConv) module was designed to reduce the amount of network computation. In addition, the Swin Transformer was combined with the Backbone and Neck ends to improve the C3 module of the original network. Then, the global attention mechanism (GAM) was introduced into PANet to form a new prediction head, namely Swin transformer and GAM Prediction Head (SGPH). Finally, we used the Soft-SIoUNMS loss to replace the original CIoU loss, which accelerates the convergence speed of the algorithm and reduces regression errors. The experimental results show that the improved YOLOv5s detection algorithm reaches 96.9% in the average precision of rail surface defect detection, offering the accurate and rapid detection of rail surface defects, which has certain engineering application value.
(This article belongs to the Special Issue Deep Learning for Object Detection)
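The CDConv module above builds on dilated convolution. The sketch below is a generic one-dimensional dilated convolution, not the paper's CDConv module, and is included only to show how dilation widens the receptive field without adding weights. The function name and the "valid" (no padding) boundary handling are assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D convolution (cross-correlation) with gaps between kernel taps."""
    # A kernel of size k with dilation d covers (k - 1) * d + 1 input samples.
    span = (len(kernel) - 1) * dilation + 1
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]
```

With a 3-tap kernel, a dilation of 2 covers 5 input samples per output while still using only 3 multiplications.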

14 pages, 4001 KiB  
Article
An Improved YOLOv7 Model Based on Visual Attention Fusion: Application to the Recognition of Bouncing Locks in Substation Power Cabinets
by Yang Wang, Xiaofeng Zhang, Longmei Li, Liming Wang, Ziyang Zhou and Peng Zhang
Appl. Sci. 2023, 13(11), 6817; https://doi.org/10.3390/app13116817 - 4 Jun 2023
Cited by 9 | Viewed by 2545
Abstract
With the continuous progress of intelligent power system technology, in order to meet the needs of substation operation and maintenance, a target detection algorithm is applied to identify the status of equipment switches. YOLOv7, as the latest achievement of YOLO (You Only Look Once) series algorithms, has good speed and accuracy in target detection tasks. However, when the generalized network is applied in a specific scenario, its advantages are not obvious due to its high weight and poor portability. In this paper, an improved GF-YOLOv7 network model is proposed to apply in the recognition of the status of bounce locks in a substation. The MobileViT module is used to improve the feature extraction ability of the backbone network. Referring to the CBAM feature attention mechanism, the channel attention module and the spatial attention module are used to design a more lightweight feature fusion network. The experimental results in the test set show that the proposed network can significantly reduce the network weight and improve the detection accuracy on the basis of a small reduction in the detection speed, and the accuracy reaches 97.8%, which can meet the needs of the detection task of substation bounce locks.
(This article belongs to the Special Issue Deep Learning for Object Detection)

18 pages, 14068 KiB  
Article
An Improved Few-Shot Object Detection via Feature Reweighting Method for Insulator Identification
by Junpeng Wu and Yibo Zhou
Appl. Sci. 2023, 13(10), 6301; https://doi.org/10.3390/app13106301 - 22 May 2023
Cited by 4 | Viewed by 1454
Abstract
To address the issue of low accuracy in insulator object detection within power systems due to a scarcity of image sample data, this paper proposes a method for identifying insulator objects based on improved few-shot object detection through feature reweighting. The approach utilizes a meta-feature transfer model in conjunction with the improved YOLOv5 network to realize insulator recognition under conditions of few-shot. Firstly, the feature extraction module of the model incorporates an improved self-calibrated feature extraction network to extract feature information from multi-scale insulators. Secondly, the reweighting module integrates the SKNet attention mechanism to facilitate precise segmentation of the mask. Finally, the multi-stage non-maximum suppression algorithm is designed in the prediction layer, and the penalty function about confidence is set. The results of multiple prediction boxes are retained to reduce the occurrence of false detection and missing detection. For the poor detection results due to a low diversity of sample space, the transfer learning strategy is applied in the training to transfer the entire trained model to the detection of insulator targets. The experimental results show that the insulator detection mAP reaches 29.6%, 36.0%, and 48.3% at 5-shot, 10-shot, and 30-shot settings, respectively. These findings serve as evidence of improved accuracy levels of the insulator image detection under the condition of few shots. Furthermore, the proposed method enables the recognition of insulators under challenging conditions such as defects, occlusion, and other special circumstances.
(This article belongs to the Special Issue Deep Learning for Object Detection)
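The abstract above describes a non-maximum suppression step that penalizes confidence and retains multiple prediction boxes, but it does not give the penalty function. As a stand-in, the sketch below uses the well-known Gaussian Soft-NMS, which decays the scores of overlapping boxes instead of discarding them. This is a swapped-in technique for illustration, not the paper's multi-stage algorithm, and `sigma` and `score_thresh` are assumed values.

```python
import math

def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: returns (box, score) pairs in selection order."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Pick the highest-scoring remaining box.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        # Decay the scores of the others according to their overlap with it.
        rescored = []
        for other, s in remaining:
            s *= math.exp(-box_iou(box, other) ** 2 / sigma)
            if s >= score_thresh:
                rescored.append((other, s))
        remaining = rescored
    return kept
```

A heavily overlapping box survives with a reduced score instead of being removed outright, which matches the stated goal of retaining multiple prediction boxes to reduce missed detections.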