

Search Results (5)

Search Parameters:
Keywords = weighted intersection over union (WISE-IOU) loss function

30 pages, 7695 KB  
Article
RTUAV-YOLO: A Family of Efficient and Lightweight Models for Real-Time Object Detection in UAV Aerial Imagery
by Ruizhi Zhang, Jinghua Hou, Le Li, Ke Zhang, Li Zhao and Shuo Gao
Sensors 2025, 25(21), 6573; https://doi.org/10.3390/s25216573 - 25 Oct 2025
Viewed by 1429
Abstract
Real-time object detection in Unmanned Aerial Vehicle (UAV) imagery is critical yet challenging, requiring high accuracy amidst complex scenes with multi-scale and small objects, under stringent onboard computational constraints. While existing methods struggle to balance accuracy and efficiency, we propose RTUAV-YOLO, a family of lightweight models based on YOLOv11 tailored for UAV real-time object detection. First, to mitigate the feature imbalance and progressive information degradation of small objects in current architectures' multi-scale processing, we developed a Multi-Scale Feature Adaptive Modulation module (MSFAM) that enhances small-target feature extraction capabilities through adaptive weight generation mechanisms and dual-pathway heterogeneous feature aggregation. Second, to overcome the limitations in contextual information acquisition exhibited by current architectures in complex scene analysis, we propose a Progressive Dilated Separable Convolution Module (PDSCM) that achieves effective aggregation of multi-scale target contextual information through continuous receptive field expansion. Third, to preserve fine-grained spatial information of small objects during feature map downsampling operations, we engineered a Lightweight DownSampling Module (LDSM) to replace the traditional convolutional module. Finally, to rectify the insensitivity of current Intersection over Union (IoU) metrics toward small objects, we introduce the Minimum Point Distance Wise IoU (MPDWIoU) loss function, which enhances small-target localization precision through the integration of distance-aware penalty terms and adaptive weighting mechanisms. Comprehensive experiments on the VisDrone2019 dataset show that RTUAV-YOLO achieves an average improvement of 3.4% and 2.4% in mAP50 and mAP50-95, respectively, compared to the baseline model, while reducing the number of parameters by 65.3%.
Its generalization capability for UAV object detection is further validated on the UAVDT and UAVVaste datasets. The proposed model is deployed on a typical airborne platform, Jetson Orin Nano, providing an effective solution for real-time object detection scenarios in actual UAVs. Full article
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
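The MPDWIoU loss above is described only as integrating distance-aware penalty terms with adaptive weighting; the abstract does not give its formula. As an illustration of the underlying MPDIoU idea it builds on (IoU minus corner-distance penalties normalised by the image diagonal), here is a minimal sketch; the function names are illustrative and the Wise-IoU-style adaptive weighting of the paper's MPDWIoU is omitted.

```python
def iou(b1, b2, eps=1e-7):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter + eps)

def mpd_iou_loss(pred, gt, img_w, img_h, eps=1e-7):
    """Sketch of an MPDIoU-style loss: 1 - IoU plus squared distances
    between matching corners, normalised by the image diagonal."""
    d_tl = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corners
    d_br = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corners
    diag2 = img_w ** 2 + img_h ** 2 + eps
    return 1.0 - iou(pred, gt) + d_tl / diag2 + d_br / diag2
```

The corner-distance terms vanish only when the two boxes coincide exactly, which is what makes such penalties more informative than plain IoU for small, nearly non-overlapping boxes.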

21 pages, 6919 KB  
Article
A Strawberry Ripeness Detection Method Based on Improved YOLOv8
by Yawei Yue, Shengbo Xu and Huanhuan Wu
Appl. Sci. 2025, 15(11), 6324; https://doi.org/10.3390/app15116324 - 4 Jun 2025
Cited by 2 | Viewed by 1330
Abstract
An enhanced YOLOv8-based network was developed to accurately and efficiently detect the ripeness of strawberries in complex environments. Firstly, a CA (channel attention) mechanism was integrated into the backbone and head of the YOLOv8 model to improve its ability to identify key features of strawberries. Secondly, the bilinear interpolation operator was replaced with DySample (dynamic sampling), which optimized data processing, reduced computational load, accelerated upsampling, and improved the model’s sensitivity to fine strawberry details. Finally, the Wise-IoU (Wise Intersection over Union) loss function optimized the IoU (Intersection over Union) through intelligent weighting and adaptive tuning, enhancing the bounding box accuracy. The experimental results show that the improved YOLOv8-CDW model has a precision of 0.969, a recall of 0.936, and a mAP@0.5 of 0.975 in complex environments, which are 8.39%, 18.63%, and 12.75% better than those of the original YOLOv8, respectively. The enhanced model demonstrates higher accuracy and faster detection of strawberry ripeness, offering valuable technical support for advancing deep learning applications in smart agriculture and automated harvesting. Full article
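The Wise-IoU loss mentioned above is a published bounding-box loss (Tong et al., 2023); its v1 form scales the plain IoU loss by a distance-based focusing factor computed from the box centres and the smallest enclosing box. A minimal sketch assuming that v1 formulation (later variants such as v3 add a non-monotonic focusing gain omitted here, and a real implementation detaches the factor from the gradient):

```python
import math

def wise_iou_v1(pred, gt, eps=1e-7):
    """Wise-IoU v1 sketch: (1 - IoU) scaled by exp(centre distance^2 /
    enclosing-box diagonal^2). Boxes are (x1, y1, x2, y2)."""
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (pred[2] - pred[0]) * (pred[3] - pred[1])
    a2 = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (a1 + a2 - inter + eps)
    # Centre distance over the smallest enclosing box diagonal; in a
    # real implementation this factor is treated as a constant (detached)
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    wg = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hg = max(pred[3], gt[3]) - min(pred[1], gt[1])
    r = math.exp(((cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2) / (wg ** 2 + hg ** 2 + eps))
    return r * (1.0 - iou)
```

The factor amplifies the loss for boxes whose centres are far apart, which is the "intelligent weighting" the abstract refers to.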

21 pages, 8937 KB  
Article
LSOD-YOLOv8: Enhancing YOLOv8n with New Detection Head and Lightweight Module for Efficient Cigarette Detection
by Yijie Huang, Huimin Ouyang and Xiaodong Miao
Appl. Sci. 2025, 15(7), 3961; https://doi.org/10.3390/app15073961 - 3 Apr 2025
Cited by 4 | Viewed by 2596
Abstract
Cigarette detection is a crucial component of public safety management. However, detecting such small objects poses significant challenges due to their size and limited feature points. To enhance the accuracy of small target detection, we propose a novel small object detection model, LSOD-YOLOv8 (Lightweight Small Object Detection using YOLOv8). First, we introduce a lightweight adaptive weight downsampling module in the backbone layer of YOLOv8 (You Only Look Once version 8), which not only mitigates information loss caused by conventional convolutions but also reduces the overall parameter count of the model. Next, we incorporate a P2 layer (Pyramid Pooling Layer 2) in the neck of YOLOv8, blending the concepts of shared convolutional information and independent batch normalization to design a P2-LSCSBD (P2 Layer-Lightweight Shared Convolutional and Batch Normalization-based Small Object Detection) detection head. Finally, we propose a new loss function, WIMIoU (Weighted Intersection over Union with Inner, Multi-scale, and Proposal-aware Optimization), by combining the ideas of WiseIoU (Wise Intersection over Union), InnerIoU (Inner Intersection over Union), and MPDIoU (Minimum Point Distance Intersection over Union), resulting in a significant accuracy improvement without any loss in performance. Our experiments demonstrate that LSOD-YOLOv8 enhances detection accuracy for cigarette detection specifically. Full article
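Of the three ingredients combined in WIMIoU, Inner-IoU has the simplest geometric intuition: shrink (or grow) both boxes about their centres by a fixed ratio and compute IoU on the scaled auxiliary boxes, which sharpens the loss for small objects. A minimal sketch of that idea alone; the ratio value and function name are illustrative, and the paper's combined WIMIoU formulation is not reproduced here.

```python
def inner_iou(pred, gt, ratio=0.75, eps=1e-7):
    """Inner-IoU sketch: IoU of auxiliary boxes scaled about their
    centres by `ratio`. Boxes are (x1, y1, x2, y2)."""
    def scale(b):
        cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
        w, h = (b[2] - b[0]) * ratio, (b[3] - b[1]) * ratio
        return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    p, g = scale(pred), scale(gt)
    ix1, iy1 = max(p[0], g[0]), max(p[1], g[1])
    ix2, iy2 = min(p[2], g[2]), min(p[3], g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (p[2] - p[0]) * (p[3] - p[1])
    a2 = (g[2] - g[0]) * (g[3] - g[1])
    return inter / (a1 + a2 - inter + eps)
```

A ratio below 1 makes the auxiliary overlap drop off faster as boxes separate, so gradients stay informative for tiny targets like cigarettes.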

16 pages, 8580 KB  
Article
Enhanced YOLOv8 with BiFPN-SimAM for Precise Defect Detection in Miniature Capacitors
by Ning Li, Tianrun Ye, Zhihua Zhou, Chunming Gao and Ping Zhang
Appl. Sci. 2024, 14(1), 429; https://doi.org/10.3390/app14010429 - 3 Jan 2024
Cited by 23 | Viewed by 9210
Abstract
In the domain of automatic visual inspection for miniature capacitor quality control, accurately detecting defects presents a formidable challenge. This challenge stems primarily from the small size and limited sample availability of defective micro-capacitors, which leads to issues such as reduced detection accuracy and increased false-negative rates in existing inspection methods. To address these challenges, this paper proposes an enhanced ‘you only look once’ version 8 (YOLOv8) architecture specifically tailored to the intricate task of micro-capacitor defect inspection. At the heart of this methodology is the merging of the bidirectional feature pyramid network (BiFPN) architecture and the simplified attention module (SimAM), which greatly improves the model’s capacity to recognize fine features and enrich feature representation. Furthermore, the model’s capacity for generalization was significantly improved by the addition of the weighted intersection over union (WISE-IOU) loss function. A micro-capacitor surface defect (MCSD) dataset comprising 1358 images representing four distinct types of micro-capacitor defects was constructed. The experimental results showed that our approach achieved a mean average precision (mAP) of 95.8% at a threshold of 0.5, a notable 9.5% enhancement over the original YOLOv8 architecture, underscoring the effectiveness of our approach in the automatic visual inspection of miniature capacitors. Full article
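The headline metric here, mAP at an IoU threshold of 0.5, reduces for a single defect class to the average precision of a confidence-ranked detection list. A minimal single-class sketch, assuming greedy matching of detections to ground truths and step-wise integration of the precision-recall curve (the interpolated precision envelope used by the VOC/COCO protocols is omitted for brevity):

```python
def average_precision(scored_preds, gts, iou_thr=0.5):
    """Single-class AP sketch. scored_preds: list of (score, box);
    gts: list of boxes; boxes are (x1, y1, x2, y2)."""
    def iou(b1, b2):
        ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
        ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
        a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
        return inter / (a1 + a2 - inter) if inter else 0.0

    # Greedily match each detection (highest confidence first) to the
    # best unused ground truth; a match above the threshold is a TP
    used = [False] * len(gts)
    tps = []
    for score, box in sorted(scored_preds, reverse=True):
        best, best_i = 0.0, -1
        for i, g in enumerate(gts):
            ov = iou(box, g)
            if not used[i] and ov > best:
                best, best_i = ov, i
        if best >= iou_thr:
            used[best_i] = True
            tps.append(1)
        else:
            tps.append(0)
    # Step-wise integral of precision over recall
    ap, tp, prev_recall = 0.0, 0, 0.0
    for k, t in enumerate(tps, 1):
        tp += t
        recall = tp / len(gts)
        ap += (recall - prev_recall) * (tp / k)
        prev_recall = recall
    return ap
```

mAP@0.5 is then the mean of this quantity over classes; the 95.8% figure above is that mean on the MCSD test set.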

29 pages, 6827 KB  
Article
Semantic Image Segmentation Based Cable Vibration Frequency Visual Monitoring Using Modified Convolutional Neural Network with Pixel-wise Weighting Strategy
by Han Yang, Hong-Cheng Xu, Shuang-Jian Jiao and Feng-De Yin
Remote Sens. 2021, 13(8), 1466; https://doi.org/10.3390/rs13081466 - 10 Apr 2021
Cited by 8 | Viewed by 3400
Abstract
With the widespread adoption of large-span spatial structures and infrastructure, in which cables serve as critical damage-sensitive elements, there is a pressing need to monitor cable vibration frequency to assess structural health. Neither existing contact methods using acceleration sensors nor conventional computer vision-based photogrammetry methods have, to date, achieved both cost-effectiveness and compatibility with real-world conditions. In this study, a method based on modified convolutional neural network semantic image segmentation, compatible with widely varying real-world backgrounds, is presented for remote visual monitoring of cable vibration frequency. Modifications to the underlying network framework lie in adopting simpler feature extractors and introducing class weights to the loss function through pixel-wise weighting strategies. Nine convolutional neural networks were established and modified. Discrete images with varying real-world backgrounds were captured to train and validate the network models. Continuous videos with different cable pixel-to-total pixel (C-T) ratios were captured to test the networks and derive vibration frequencies. Various metrics were used to evaluate the effectiveness of the network models. The optimal C-T ratio was also studied to provide guidelines for the parameter setting of monitoring systems in further research and practical application. Training and validation accuracies of all nine networks exceeded 90%. The network model with ResNet-50 as the feature extractor and uniform prior weighting showed the best learning and generalization ability, reaching a Precision of 0.9973, an F1 score of 0.9685, and an intersection over union (IoU) of 0.8226 when images with the optimal C-T ratio of 0.04 were used as the testing set.
Compared with frequencies sampled by an acceleration sensor, the first two modal vibration frequencies derived by the best network from video with the optimal C-T ratio showed negligible absolute percentage errors of 0.41% and 0.36%, substantiating the effectiveness of the proposed method. Full article
(This article belongs to the Section Engineering Remote Sensing)
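The final step of such a pipeline, turning per-frame segmentation output into vibration frequencies, is a standard spectral estimate. A sketch under the assumption that a per-frame cable displacement signal (e.g. the mean pixel coordinate of the segmented cable region in each frame) has already been extracted; the names and the peak-picking rule are illustrative, not the paper's exact procedure.

```python
import numpy as np

def dominant_frequencies(displacement, fps, n_peaks=2):
    """Return the n_peaks strongest frequencies (Hz) in a per-frame
    displacement signal sampled at `fps` frames per second."""
    x = np.asarray(displacement, dtype=float)
    x -= x.mean()                              # remove the static offset
    spectrum = np.abs(np.fft.rfft(x))          # one-sided magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    order = np.argsort(spectrum[1:])[::-1] + 1  # strongest bins, skipping DC
    return sorted(freqs[order[:n_peaks]])
```

The frequency resolution is fps / len(x), so longer videos resolve closely spaced cable modes better; the camera frame rate bounds the highest observable frequency at fps / 2.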
