Search Results (10)

Search Parameters:
Keywords = dilated re-param block

20 pages, 13045 KB  
Article
Detection of Crack Sealant in the Pretreatment Process of Hot In-Place Recycling of Asphalt Pavement via Deep Learning Method
by Kai Zhao, Tianzhen Liu, Xu Xia and Yongli Zhao
Sensors 2025, 25(11), 3373; https://doi.org/10.3390/s25113373 - 27 May 2025
Viewed by 892
Abstract
Crack sealant is commonly used to fill pavement cracks and improve the Pavement Condition Index (PCI). However, during asphalt pavement hot in-place recycling (HIR), irregular shapes and random distribution of crack sealants can cause issues like agglomeration and ignition. To address these problems, it is necessary to mill large areas containing crack sealant or pre-mark locations for removal after heating. Currently, detecting and recording crack sealant locations, types, and distributions is conducted manually, which significantly reduces efficiency. While deep learning-based object detection has been widely applied to distress detection, crack sealants present unique challenges. They often appear as wide black patches that overlap with cracks and potholes, and complex background noise further complicates detection. Additionally, no dataset specifically for crack sealant detection currently exists. To overcome these challenges, this paper presents a specialized dataset created from 1983 pavement images. A deep learning detection algorithm named YOLO-CS (You Only Look Once Crack Sealant) is proposed. This algorithm integrates the RepViT (Representation Learning with Visual Tokens) network to reduce computational complexity while capturing the global context of images. Furthermore, the DRBNCSPELAN (Dilated Reparam Block with Cross-Stage Partial and Efficient Layer Aggregation Networks) module is introduced to ensure efficient information flow, and a lightweight shared convolution (LSC) detection head is developed. The results demonstrate that YOLO-CS outperforms other algorithms, achieving a precision of 88.4%, a recall of 84.2%, and an mAP (mean average precision) of 92.1%. Moreover, YOLO-CS significantly reduces parameters and memory consumption. Integrating Artificial Intelligence-based algorithms into HIR significantly enhances construction efficiency.
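Several of the results in this listing build on the same Dilated Reparam Block idea: during training, a large-kernel convolution runs in parallel with small dilated convolutions, and at inference the branches are re-parameterized into a single dense kernel. A minimal PyTorch sketch of that equivalence (layer sizes and names are illustrative, not taken from any of these papers):

```python
import torch
import torch.nn as nn

def dilate_kernel(k: torch.Tensor, d: int) -> torch.Tensor:
    """Embed a small kernel with dilation d into an equivalent dense kernel.
    A kxk conv with dilation d acts like a (d*(k-1)+1)-sized conv whose
    weights are zero everywhere except on the dilated grid."""
    out_c, in_c, kh, kw = k.shape
    big = torch.zeros(out_c, in_c, d * (kh - 1) + 1, d * (kw - 1) + 1)
    big[:, :, ::d, ::d] = k
    return big

# Training form: a 5x5 conv in parallel with a 3x3 conv of dilation 2
# (both span the same 5x5 window); the branch outputs are summed.
x = torch.randn(1, 4, 16, 16)
conv_large = nn.Conv2d(4, 4, 5, padding=2, bias=False)
conv_dil = nn.Conv2d(4, 4, 3, padding=2, dilation=2, bias=False)
y_train = conv_large(x) + conv_dil(x)

# Inference form: fold the dilated branch into the large kernel once.
merged = nn.Conv2d(4, 4, 5, padding=2, bias=False)
merged.weight.data = conv_large.weight.data + dilate_kernel(conv_dil.weight.data, 2)
y_infer = merged(x)

assert torch.allclose(y_train, y_infer, atol=1e-5)
```

Because a k×k convolution with dilation d equals a dense (d·(k−1)+1)-sized convolution with zeros off the dilated grid, the extra branches cost nothing at inference once merged.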

25 pages, 3467 KB  
Article
Side-Scan Sonar Small Objects Detection Based on Improved YOLOv11
by Chang Zou, Siquan Yu, Yankai Yu, Haitao Gu and Xinlin Xu
J. Mar. Sci. Eng. 2025, 13(1), 162; https://doi.org/10.3390/jmse13010162 - 18 Jan 2025
Cited by 13 | Viewed by 3444
Abstract
Underwater object detection using side-scan sonar (SSS) remains a significant challenge in marine exploration, especially for small objects. Conventional methods for small object detection face various obstacles, such as difficulties in feature extraction and the considerable impact of noise on detection accuracy. To address these issues, this study proposes an improved YOLOv11 network named YOLOv11-SDC. Specifically, a new Sparse Feature (SF) module is proposed, replacing the Spatial Pyramid Pooling Fast (SPPF) module from the original YOLOv11 architecture to enhance object feature selection. Furthermore, the proposed YOLOv11-SDC integrates a Dilated Reparam Block (DRB) with a C3k2 module to broaden the model’s receptive field. A Content-Guided Attention Fusion (CGAF) module is also incorporated prior to the detection module to assign appropriate weights to various feature maps, thereby emphasizing the relevant object information. Experimental results clearly demonstrate the superiority of YOLOv11-SDC over several earlier YOLO versions in detection performance. The proposed method was validated through extensive real-world experiments, yielding a precision of 0.934, recall of 0.698, mAP@0.5 of 0.825, and mAP@0.5:0.95 of 0.598. In conclusion, the improved YOLOv11-SDC offers a promising solution for detecting small objects in SSS images, showing substantial potential for marine applications.
(This article belongs to the Special Issue Artificial Intelligence Applications in Underwater Sonar Images)

19 pages, 4786 KB  
Article
RT-DETR-Tea: A Multi-Species Tea Bud Detection Model for Unstructured Environments
by Yiyong Chen, Yang Guo, Jianlong Li, Bo Zhou, Jiaming Chen, Man Zhang, Yingying Cui and Jinchi Tang
Agriculture 2024, 14(12), 2256; https://doi.org/10.3390/agriculture14122256 - 10 Dec 2024
Cited by 6 | Viewed by 1949
Abstract
Accurate bud detection is a prerequisite for automatic tea picking and yield statistics; however, current research suffers from missed detections caused by the singleness of tea varieties in existing datasets and from false detections under complex backgrounds. Traditional target detection models are mainly CNN-based, but CNNs extract only local feature information, which puts them at a disadvantage for the accurate identification of targets in complex environments; the Transformer architecture offers a good solution to this problem. Therefore, based on a multi-variety tea bud dataset, this study proposes RT-DETR-Tea, an improved object detection model under the real-time detection Transformer (RT-DETR) framework. This model uses cascaded group attention to replace the multi-head self-attention (MHSA) mechanism in the attention-based intra-scale feature interaction (AIFI) module, effectively optimizing deep features and enriching the semantic information of features. The original cross-scale feature-fusion module (CCFM) mechanism is improved to establish the gather-and-distribute-Tea (GD-Tea) mechanism for multi-level feature fusion, which can effectively fuse low-level and high-level semantic information as well as large and small tea bud features in natural environments. The DilatedReparamBlock submodule of UniRepLKNet was employed to improve RepC3, achieving an efficient fusion of tea bud feature information and ensuring the accuracy of the detection head. Ablation experiments show that the precision and mean average precision of the proposed RT-DETR-Tea model are 96.1% and 79.7%, respectively, increases of 5.2% and 2.4% over the original model, indicating the model’s effectiveness. The model also shows good detection performance on the newly constructed tea bud dataset. Compared with other detection algorithms, the improved RT-DETR-Tea model demonstrates superior tea bud detection performance, providing effective technical support for smart tea garden management and production.

23 pages, 5508 KB  
Article
YOLO-DroneMS: Multi-Scale Object Detection Network for Unmanned Aerial Vehicle (UAV) Images
by Xueqiang Zhao and Yangbo Chen
Drones 2024, 8(11), 609; https://doi.org/10.3390/drones8110609 - 24 Oct 2024
Cited by 13 | Viewed by 4588
Abstract
In recent years, research on Unmanned Aerial Vehicles (UAVs) has developed rapidly. Compared to traditional remote-sensing images, UAV images exhibit complex backgrounds, high resolution, and large differences in object scales. Therefore, UAV object detection is an essential yet challenging task. This paper proposes a multi-scale object detection network, namely YOLO-DroneMS (You Only Look Once for Drone Multi-Scale Object), for UAV images. Targeting the pivotal connection between the backbone and neck, the Large Separable Kernel Attention (LSKA) mechanism is adopted with the Spatial Pyramid Pooling Fast (SPPF) module, where weighted processing of multi-scale feature maps is performed to focus on salient features. In addition, Attentional Scale Sequence Fusion DySample (ASF-DySample) is introduced to perform attention scale sequence fusion and dynamic upsampling while conserving resources. Then, the faster cross-stage partial network bottleneck with two convolutions (named C2f) in the backbone is optimized using the Inverted Residual Mobile Block and Dilated Reparam Block (iRMB-DRB), which balances the advantages of dynamic global modeling and static local information fusion. This optimization effectively increases the model’s receptive field, enhancing its capability for downstream tasks. By replacing the original CIoU loss with WIoUv3, the model prioritizes anchor boxes of superior quality, dynamically adjusting weights to enhance detection performance for small objects. Experimental findings on the VisDrone2019 dataset demonstrate that, at an Intersection over Union (IoU) threshold of 0.5, YOLO-DroneMS achieves a 3.6% increase in mAP@50 compared to the YOLOv8n model. Moreover, YOLO-DroneMS exhibits improved detection speed, increasing the number of frames per second (FPS) from 78.7 to 83.3. The enhanced model supports diverse target scales and achieves high recognition rates, making it well-suited for drone-based object detection tasks, particularly in scenarios involving multiple object clusters.
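The VisDrone2019 comparison above is scored at an IoU threshold of 0.5 (mAP@50). A minimal sketch of the IoU test that decides whether a predicted box counts as a true positive (box format and helper name are illustrative, not from the paper):

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

# A detection matching a ground-truth box with IoU >= 0.5 counts as a
# true positive when computing mAP@0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # intersection 50 / union 150
```

At stricter thresholds (mAP@0.5:0.95, reported by several entries below) the same test is repeated at IoU 0.5, 0.55, …, 0.95 and the resulting APs averaged.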
(This article belongs to the Special Issue Intelligent Image Processing and Sensing for Drones, 2nd Edition)

19 pages, 3429 KB  
Article
An Insulator Fault Diagnosis Method Based on Multi-Mechanism Optimization YOLOv8
by Chuang Gong, Wei Jiang, Dehua Zou, Weiwei Weng and Hongjun Li
Appl. Sci. 2024, 14(19), 8770; https://doi.org/10.3390/app14198770 - 28 Sep 2024
Cited by 5 | Viewed by 1531
Abstract
Aiming at the problem that insulator image backgrounds are complex and fault types are diverse, which makes it difficult for existing deep learning algorithms to achieve accurate insulator fault diagnosis, an insulator fault diagnosis method based on multi-mechanism optimization YOLOv8-DCP is proposed. Firstly, a feature extraction and fusion module, named CW-DRB, was designed. This module enhances the C2f structure of YOLOv8 by incorporating the dilation-wise residual module and the dilated re-param module. The introduction of this module improves YOLOv8’s capability for multi-scale feature extraction and multi-level feature fusion. Secondly, the CARAFE module, which is feature content-aware, was introduced to replace the up-sampling layer in YOLOv8n, thereby enhancing the model’s feature map reconstruction ability. Finally, an additional small-object detection layer was added to improve the detection accuracy of small defects. Simulation results indicate that YOLOv8-DCP achieves an accuracy of 97.7% and an mAP@0.5 of 93.9%. Compared to YOLOv5, YOLOv7, and YOLOv8n, the accuracy improved by 1.5%, 4.3%, and 4.8%, while the mAP@0.5 increased by 3.0%, 4.3%, and 3.1%. This results in a significant enhancement in the accuracy of insulator fault diagnosis.
(This article belongs to the Special Issue Deep Learning for Object Detection)

22 pages, 8995 KB  
Article
Chili Pepper Object Detection Method Based on Improved YOLOv8n
by Na Ma, Yulong Wu, Yifan Bo and Hongwen Yan
Plants 2024, 13(17), 2402; https://doi.org/10.3390/plants13172402 - 28 Aug 2024
Cited by 15 | Viewed by 3176
Abstract
In response to the low accuracy and slow detection speed of chili recognition in natural environments, this study proposes a chili pepper object detection method based on the improved YOLOv8n. Evaluations were conducted among YOLOv5n, YOLOv6n, YOLOv7-tiny, YOLOv8n, YOLOv9, and YOLOv10 to select the optimal model. YOLOv8n was chosen as the baseline and improved as follows: (1) Replacing the YOLOv8 backbone with the improved HGNetV2 model to reduce floating-point operations and computational load during convolution. (2) Integrating the SEAM (spatially enhanced attention module) into the YOLOv8 detection head to enhance feature extraction capability under chili fruit occlusion. (3) Optimizing feature fusion using the dilated reparam block module in certain C2f (CSP bottleneck with two convolutions) blocks. (4) Substituting the traditional upsample operator with the CARAFE (content-aware reassembly of features) upsampling operator to further enhance network feature fusion capability and improve detection performance. On a custom-built chili dataset, the F0.5-score, mAP0.5, and mAP0.5:0.95 metrics improved by 1.98, 2, and 5.2 percentage points, respectively, over the original model, achieving 96.47%, 96.3%, and 79.4%. The improved model reduced parameter count and GFLOPs by 29.5% and 28.4%, respectively, with a final model size of 4.6 MB. Thus, this method effectively enhances chili target detection, providing a technical foundation for intelligent chili harvesting processes.
(This article belongs to the Section Plant Modeling)

26 pages, 10106 KB  
Article
DFLM-YOLO: A Lightweight YOLO Model with Multiscale Feature Fusion Capabilities for Open Water Aerial Imagery
by Chen Sun, Yihong Zhang and Shuai Ma
Drones 2024, 8(8), 400; https://doi.org/10.3390/drones8080400 - 16 Aug 2024
Cited by 10 | Viewed by 2930
Abstract
Object detection algorithms for open water aerial images present challenges such as small object size, unsatisfactory detection accuracy, numerous network parameters, and enormous computational demands. Current detection algorithms struggle to meet the accuracy and speed requirements while being deployable on small mobile devices. This paper proposes DFLM-YOLO, a lightweight small-object detection network based on the YOLOv8 algorithm with multiscale feature fusion. Firstly, to solve the class imbalance problem of the SeaDroneSee dataset, we propose a data augmentation algorithm called Small Object Multiplication (SOM). SOM enhances dataset balance by increasing the number of objects in specific categories, thereby improving model accuracy and generalization capabilities. Secondly, we optimize the backbone network structure by implementing Depthwise Separable Convolution (DSConv) and the newly designed FasterBlock-CGLU-C2f (FC-C2f), which reduces the model’s parameters and inference time. Finally, we design the Lightweight Multiscale Feature Fusion Network (LMFN) to address the challenges of multiscale variations by gradually fusing the four feature layers extracted from the backbone network in three stages. In addition, LMFN incorporates the Dilated Re-param Block structure to increase the effective receptive field and improve the model’s classification ability and detection accuracy. The experimental results on the SeaDroneSee dataset indicate that DFLM-YOLO improves the mean average precision (mAP) by 12.4% compared to the original YOLOv8s, while reducing parameters by 67.2%. This achievement provides a new solution for Unmanned Aerial Vehicles (UAVs) to conduct object detection missions in open water efficiently.

18 pages, 3242 KB  
Article
Multi-Object Vehicle Detection and Tracking Algorithm Based on Improved YOLOv8 and ByteTrack
by Longxiang You, Yajun Chen, Ci Xiao, Chaoyue Sun and Rongzhen Li
Electronics 2024, 13(15), 3033; https://doi.org/10.3390/electronics13153033 - 1 Aug 2024
Cited by 11 | Viewed by 8725
Abstract
Vehicle detection and tracking technology plays a crucial role in Intelligent Transportation Systems. However, due to factors such as complex scenarios, diverse scales, and occlusions, issues like false detections, missed detections, and identity switches frequently occur. To address these problems, this paper proposes a multi-object vehicle detection and tracking algorithm based on CDS-YOLOv8 and improved ByteTrack. For vehicle detection, the Context-Guided (CG) module is introduced during the downsampling process to enhance feature extraction capabilities in complex scenarios. The Dilated Reparam Block (DRB) is reconstructed to tackle multi-scale issues, and Soft-NMS replaces the traditional NMS to improve performance in densely populated vehicle scenarios. For vehicle tracking, the state vector and covariance matrix of the Kalman filter are improved to better handle the nonlinear movement of vehicles, and Gaussian Smoothed Interpolation (GSI) is introduced to fill in trajectory gaps caused by detection misses. Experiments conducted on the UA-DETRAC dataset show that the improved algorithm increases detection performance, with mAP@0.5 and mAP@0.5:0.95 improving by 9% and 8.8%, respectively. In terms of tracking performance, mMOTA improves by 6.7%. Additionally, comparative experiments with mainstream detection and two-stage tracking algorithms demonstrate the superior performance of the proposed algorithm.
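The paper above swaps hard NMS for Soft-NMS to cope with densely packed vehicles. A minimal sketch of the Gaussian variant of Soft-NMS (box format, thresholds, and helper names are illustrative assumptions, not the paper's implementation):

```python
import math

def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union else 0.0

def soft_nms(dets, sigma=0.5, score_thresh=0.1):
    """Gaussian Soft-NMS: rather than deleting boxes that overlap the
    current best detection (as hard NMS does), decay their scores by
    exp(-IoU^2 / sigma), so heavily occluded objects can survive."""
    dets = [list(d) for d in dets]  # each entry: (x1, y1, x2, y2, score)
    keep = []
    while dets:
        best = max(dets, key=lambda d: d[4])
        dets.remove(best)
        keep.append(tuple(best))
        for d in dets:  # decay neighbours instead of suppressing them
            d[4] *= math.exp(-iou(best[:4], d[:4]) ** 2 / sigma)
        dets = [d for d in dets if d[4] >= score_thresh]
    return keep

boxes = [(0, 0, 10, 10, 0.9), (1, 0, 11, 10, 0.8), (50, 50, 60, 60, 0.7)]
kept = soft_nms(boxes)
```

Where hard NMS would discard the second box outright for overlapping the first above the IoU threshold, Soft-NMS only down-weights it, so genuinely occluded vehicles can still be reported.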
(This article belongs to the Section Artificial Intelligence)

20 pages, 11509 KB  
Article
EMR-YOLO: A Study of Efficient Maritime Rescue Identification Algorithms
by Jun Zhang, Yiming Hua, Luya Chen, Li Li, Xudong Shen, Wei Shi, Shuai Wu, Yunfan Fu, Chunfeng Lv and Jianping Zhu
J. Mar. Sci. Eng. 2024, 12(7), 1048; https://doi.org/10.3390/jmse12071048 - 21 Jun 2024
Cited by 11 | Viewed by 2325
Abstract
Accurate target identification of UAV (Unmanned Aerial Vehicle)-captured images is a prerequisite for maritime rescue and maritime surveillance. However, UAV-captured images pose several challenges, such as complex maritime backgrounds, tiny targets, and crowded scenes. To reduce the impact of these challenges on target recognition, we propose an efficient maritime rescue network (EMR-YOLO) for recognizing images captured by UAVs. In the proposed network, the DRC2f (Dilated Reparam-based Channel-to-Pixel) module is first designed around the Dilated Reparam Block to effectively increase the receptive field, reduce the number of parameters, and improve feature extraction capability. Then, the ADOWN downsampling module is used to mitigate fine-grained information loss, thereby improving the efficiency and performance of the model. Finally, CASPPF (Coordinate Attention-based Spatial Pyramid Pooling Fast) is designed by fusing CA (Coordinate Attention) and SPPF (Spatial Pyramid Pooling Fast), which effectively enhances the feature representation and spatial information integration ability, making the model more accurate and robust when dealing with complex scenes. Experimental results on the AFO dataset show that, compared with the YOLOv8s network, the EMR-YOLO network improves the mAP (mean average precision) and mAP50 by 4.7% and 9.2%, respectively, while reducing the number of parameters and computation by 22.5% and 18.7%, respectively. Overall, combining UAV-captured imagery with deep learning for maritime target recognition improves the efficiency and safety of rescue and surveillance operations.
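CASPPF above extends SPPF with coordinate attention. For reference, a minimal sketch of the standard SPPF module being extended, assuming the usual YOLO-style formulation (channel sizes are illustrative; batch norm and activations are omitted for brevity):

```python
import torch
import torch.nn as nn

class SPPF(nn.Module):
    """Spatial Pyramid Pooling - Fast: three chained 5x5 max-pools whose
    outputs are concatenated, approximating parallel 5/9/13 pooling
    windows at a fraction of the cost."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_mid = c_in // 2
        self.cv1 = nn.Conv2d(c_in, c_mid, 1)           # squeeze channels
        self.pool = nn.MaxPool2d(k, stride=1, padding=k // 2)
        self.cv2 = nn.Conv2d(c_mid * 4, c_out, 1)      # fuse the 4 scales

    def forward(self, x):
        x = self.cv1(x)
        p1 = self.pool(x)       # effective 5x5 window
        p2 = self.pool(p1)      # effective 9x9 window
        p3 = self.pool(p2)      # effective 13x13 window
        return self.cv2(torch.cat([x, p1, p2, p3], dim=1))

y = SPPF(64, 64)(torch.randn(1, 64, 20, 20))
```

Chaining the pools preserves spatial resolution (stride 1, same padding), so the module can sit at the end of the backbone without changing feature-map size, which is what makes attention-based extensions like CASPPF drop-in replacements.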
(This article belongs to the Special Issue Motion Control and Path Planning of Marine Vehicles—2nd Edition)

20 pages, 5603 KB  
Article
Multi-Scale Fusion Uncrewed Aerial Vehicle Detection Based on RT-DETR
by Minling Zhu and En Kong
Electronics 2024, 13(8), 1489; https://doi.org/10.3390/electronics13081489 - 14 Apr 2024
Cited by 25 | Viewed by 6722
Abstract
With the rapid development of science and technology, uncrewed aerial vehicle (UAV) technology has shown a wide range of application prospects in various fields. The accuracy and real-time performance of UAV target detection play a vital role in ensuring safety and improving the work efficiency of UAVs. To address the challenges currently facing the UAV detection field, this paper proposes the Gathering Cascaded Dilated DETR (GCD-DETR) model, which aims to improve the accuracy and efficiency of UAV target detection. The main innovations of this paper are as follows: (1) The Dilated Re-param Block is applied to the dilation-wise residual module, combining a large-kernel convolution with parallel small-kernel convolutions and fusing the feature maps generated by multi-scale perception, which greatly improves feature extraction and thereby the accuracy of UAV detection. (2) The Gather-and-Distribute mechanism is introduced to effectively enhance multi-scale feature fusion, so that the model can make full use of the feature information extracted by the backbone network and further improve detection performance. (3) The Cascaded Group Attention mechanism is introduced, which not only saves computational cost but also improves the diversity of attention by splitting the attention heads in different ways, thus enhancing the model’s ability to process complex scenes. To verify the effectiveness of the proposed model, experiments were conducted on multiple UAV datasets with complex scenes. The experimental results show that the accuracy of the improved RT-DETR model proposed in this paper reaches 0.956 and 0.978 on the two UAV datasets, respectively, which is 2% and 1.1% higher than that of the original RT-DETR model. At the same time, the FPS of the model improves by 10 frames per second, achieving an effective balance between accuracy and speed.
(This article belongs to the Special Issue Applications of Computer Vision, 2nd Edition)
