Search Results (84)

Search Parameters:
Keywords = GIOU

25 pages, 6621 KiB  
Article
Application of Improved YOLOv8 Image Model in Urban Manhole Cover Defect Management and Detection: Case Study
by Yanqiong Ding, Baojiang Han, Hua Jiang, Hao Hu, Lei Xue, Jiasen Weng, Zhili Tang and Yuzhang Liu
Sensors 2025, 25(13), 4144; https://doi.org/10.3390/s25134144 - 3 Jul 2025
Viewed by 422
Abstract
Manhole covers are crucial for maintaining urban operations and ensuring safe travel for residents. The traditional inspection and maintenance management system, based on manual judgment, has low efficiency and poor accuracy, making it difficult to keep pace with rapidly expanding urban construction and the complex environments in which manhole covers sit. To address these challenges, an intelligent management model based on an improved YOLOv8 is proposed for three high-frequency urban defect types: breakage, loss, and shift. We design a lightweight dual-stream feature extraction network with EfficientNetV2 as the backbone. By introducing the fused MBConv structure, computational complexity is significantly reduced while feature extraction efficiency is improved. An innovative foreground attention module adaptively enhances manhole cover defect features, improving the model's ability to identify defects at various scales. In addition, an optimized feature fusion architecture is constructed by integrating NAS-FPN modules; this structure uses bidirectional feature transfer and automatic structure search, significantly enhancing the expressiveness of multi-scale features. A combined loss design using GIoU loss, dynamically weighted BCE loss, and Distribution Focal Loss (DFL) addresses sample imbalance and inter-class differences. Experimental results show that the model achieves excellent performance across multiple manhole cover defect recognition metrics, particularly classification accuracy, recall, and F1-score, with an overall recognition accuracy of 98.6%. Applying the improved model in a new smart management system for urban manhole covers can significantly improve management efficiency.
(This article belongs to the Special Issue Artificial Intelligence and Sensors Technology in Smart Cities)
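Every entry in this result set leans on the GIoU loss, so a compact reference implementation is useful context. The sketch below follows the standard Rezatofighi et al. formulation for (x1, y1, x2, y2) boxes in PyTorch; it is not the authors' code, and the dynamically weighted BCE and DFL terms from the abstract are omitted.

```python
import torch

def giou_loss(pred, target, eps=1e-7):
    """Generalized IoU loss for boxes in (x1, y1, x2, y2) format.

    GIoU = IoU - |C \ (A U B)| / |C|, where C is the smallest box
    enclosing both A and B. Loss = 1 - GIoU, in [0, 2].
    """
    # Intersection
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    # Union
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # Smallest enclosing box C
    cx1 = torch.min(pred[:, 0], target[:, 0])
    cy1 = torch.min(pred[:, 1], target[:, 1])
    cx2 = torch.max(pred[:, 2], target[:, 2])
    cy2 = torch.max(pred[:, 3], target[:, 3])
    c_area = (cx2 - cx1) * (cy2 - cy1)

    giou = iou - (c_area - union) / (c_area + eps)
    return (1.0 - giou).mean()
```

Because the enclosing-box penalty stays nonzero for disjoint boxes, the loss provides a gradient even when IoU is zero, which is the property most of the papers in this listing exploit.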

20 pages, 4060 KiB  
Article
Tomato Yield Estimation Using an Improved Lightweight YOLO11n Network and an Optimized Region Tracking-Counting Method
by Aichen Wang, Yuanzhi Xu, Dong Hu, Liyuan Zhang, Ao Li, Qingzhen Zhu and Jizhan Liu
Agriculture 2025, 15(13), 1353; https://doi.org/10.3390/agriculture15131353 - 25 Jun 2025
Cited by 1 | Viewed by 385
Abstract
Accurate and effective fruit tracking and counting are crucial for estimating tomato yield. In complex field environments, occlusion and overlap of tomato fruits and leaves often lead to inaccurate counting. To address these issues, this study proposed an improved lightweight YOLO11n network and an optimized region tracking-counting method, which estimates the quantity of tomatoes at different maturity stages. The improved lightweight YOLO11n network, which combines the C3k2-F and Depthwise Separable Convolution (DSConv) modules with the Generalized Intersection over Union (GIoU) loss, was employed for tomato detection and semantic segmentation. The improved model is adaptable to edge computing devices, enabling tomato yield estimation while maintaining high detection accuracy. An optimized region tracking-counting method was proposed, combining target tracking and region detection to count the detected fruits; the particle swarm optimization (PSO) algorithm was used to optimize the detection region, enhancing counting accuracy. In terms of network lightweighting, the improved YOLO11n network reduces the parameter count and Giga Floating-point Operations Per Second (GFLOPs) by 0.22 M and 2.5 G relative to the original, while achieving detection and segmentation accuracies of 91.3% and 90.5%, respectively. For fruit counting, the proposed region tracking-counting method achieved a mean counting error (MCE) of 6.6%, a reduction of 5.0% and 2.1% compared to the Bytetrack and cross-line counting methods, respectively. The proposed method therefore provides an effective approach for non-contact, accurate, efficient, real-time intelligent yield estimation for tomatoes.
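The abstract's PSO step tunes the counting region, but its encoding and objective are not given here. The sketch below assumes the region is a rectangle encoded as (x1, y1, x2, y2) and that a hypothetical count_error callable scores a candidate region against reference counts.

```python
import numpy as np

def pso_optimize_region(count_error, bounds, n_particles=20, iters=50,
                        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization over a counting region.

    count_error: callable mapping a region vector (x1, y1, x2, y2)
        to a scalar counting error (hypothetical objective).
    bounds: (low, high) arrays bounding each coordinate.
    """
    rng = np.random.default_rng(seed)
    low, high = (np.asarray(b, dtype=float) for b in bounds)
    dim = low.size

    x = rng.uniform(low, high, size=(n_particles, dim))  # positions
    v = np.zeros_like(x)                                 # velocities
    pbest = x.copy()
    pbest_f = np.array([count_error(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # inertia + pull toward personal best + pull toward global best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, low, high)
        f = np.array([count_error(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, pbest_f.min()
```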

27 pages, 21013 KiB  
Article
Improved YOLO-Goose-Based Method for Individual Identification of Lion-Head Geese and Egg Matching: Methods and Experimental Study
by Hengyuan Zhang, Zhenlong Wu, Tiemin Zhang, Canhuan Lu, Zhaohui Zhang, Jianzhou Ye, Jikang Yang, Degui Yang and Cheng Fang
Agriculture 2025, 15(13), 1345; https://doi.org/10.3390/agriculture15131345 - 23 Jun 2025
Viewed by 564
Abstract
The Lion-Head Goose is a crucial characteristic waterfowl breed whose egg-laying performance serves as a core indicator for precision breeding. Under large-scale flat rearing and selection, high phenotypic similarity among individuals within the same pedigree, together with the reliance of traditional manual observation and existing automation systems on fixed nesting boxes or RFID tags, makes accurate goose–egg matching in dynamic environments difficult, leading to inefficient individual selection. To address this, this study proposes YOLO-Goose, an improved YOLOv8s-based method that uses five high-contrast neck ring designs (DoubleBar, Circle, Dot, Fence, Cylindrical) as individual identifiers. The method constructs a lightweight model with a small-object detection layer, integrates the GhostNet backbone to reduce the parameter count by 67.2%, and employs the GIoU loss function to optimize neck ring localization accuracy. Experimental results show that the model achieves an F1 score of 93.8% and mAP50 of 96.4% on the self-built dataset, increases of 10.1% and 5% over the original YOLOv8s, with a 27.1% reduction in computational load. The dynamic matching algorithm, which incorporates spatiotemporal trajectories and egg positional data, achieves a 95% matching rate, 94.7% matching accuracy, and a 5.3% mismatching rate. Through lightweight deployment with TensorRT, inference speed is 1.4 times that of PyTorch-1.12.1, with detection results uploaded to a cloud database in real time. This solution overcomes the technical bottleneck of individual selection in flat rearing environments, providing an innovative computer-vision-based approach for precision breeding of pedigree Lion-Head Geese and offering significant engineering value for intelligent waterfowl breeding.
(This article belongs to the Special Issue Computer Vision Analysis Applied to Farm Animals)
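GhostNet's core building block is easy to sketch. The module below is a generic Ghost convolution (a primary pointwise conv plus cheap depthwise "ghost" features) with an assumed ratio of 2; it illustrates where parameter savings of this kind come from, not YOLO-Goose's exact backbone.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Generic GhostNet block: a primary 1x1 conv produces part of the
    output channels, and a cheap depthwise conv 'ghosts' the rest,
    roughly halving parameters versus a full conv (ratio=2 assumed)."""
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        primary_ch = out_ch // ratio
        ghost_ch = out_ch - primary_ch
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, 1, bias=False),
            nn.BatchNorm2d(primary_ch),
            nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, ghost_ch, dw_kernel,
                      padding=dw_kernel // 2, groups=primary_ch, bias=False),
            nn.BatchNorm2d(ghost_ch),
            nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        # concatenate primary features with their cheap 'ghost' copies
        return torch.cat([y, self.cheap(y)], dim=1)
```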

26 pages, 7941 KiB  
Article
An Edge-Computing-Driven Approach for Augmented Detection of Construction Materials: An Example of Scaffold Component Counting
by Xianzhong Zhao, Bo Cheng, Yujie Lu and Zhaoqi Huang
Buildings 2025, 15(7), 1190; https://doi.org/10.3390/buildings15071190 - 5 Apr 2025
Viewed by 543
Abstract
Construction material management is crucial for project progression. Counting massive amounts of scaffold components is a key step for efficient material management. However, traditional counting methods are time-consuming and laborious. Utilizing a vision-based method with edge devices for counting these materials undoubtedly offers a promising solution. This study proposed an edge-computing-driven approach for detecting and counting scaffold components. Two algorithm refinements of YOLOX, including generalized intersection over union (GIoU) and soft non-maximum suppression (Soft-NMS), were introduced to enhance detection accuracy in conditions of occlusion. An automated pruning method was proposed to compress the model, achieving a 60.2% reduction in computation and a 9.1% increase in inference speed. Two practical case studies demonstrated that the method, when deployed on edge devices, achieved 98.9% accuracy and reduced time consumption for counting tasks by 87.9% compared to the conventional method. This research provides an edge-computing-driven framework for counting massive materials, establishing a comprehensive workflow for intelligent applications in construction management. The paper concludes with limitations of the current study and suggestions for future work.
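Soft-NMS is a standard drop-in replacement for hard NMS; the Gaussian-decay variant below (Bodla et al.) shows the idea on NumPy boxes, which is what makes it useful for stacked, mutually occluding scaffold parts. Thresholds are placeholders, not the paper's settings.

```python
import numpy as np

def iou_xyxy(a, b, eps=1e-7):
    """Plain IoU between two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + eps)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores by exp(-iou^2 / sigma)
    instead of discarding boxes outright.
    boxes: (N, 4) array; scores: (N,) array. Returns kept indices."""
    scores = np.asarray(scores, dtype=float).copy()
    remaining = list(range(len(scores)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < score_thresh:
            break
        keep.append(best)
        remaining.remove(best)
        for i in remaining:
            # heavily overlapped neighbors are suppressed softly
            scores[i] *= np.exp(-iou_xyxy(boxes[best], boxes[i]) ** 2 / sigma)
    return keep
```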

22 pages, 6129 KiB  
Article
A Novel Machine Vision-Based Collision Risk Warning Method for Unsignalized Intersections on Arterial Roads
by Zhongbin Luo, Yanqiu Bi, Qing Ye, Yong Li and Shaofei Wang
Electronics 2025, 14(6), 1098; https://doi.org/10.3390/electronics14061098 - 11 Mar 2025
Cited by 1 | Viewed by 863
Abstract
To address the critical need for collision risk warning at unsignalized intersections, this study proposes an advanced predictive system combining YOLOv8 for object detection, Deep SORT for tracking, and Bi-LSTM networks for trajectory prediction. To adapt YOLOv8 for complex intersection scenarios, several architectural enhancements were incorporated. The RepLayer module replaced the original C2f module in the backbone, integrating large-kernel depthwise separable convolution to better capture contextual information in cluttered environments. The GIoU loss function was introduced to improve bounding box regression accuracy, mitigating the issues related to missed or incorrect detections due to occlusion and overlapping objects. Furthermore, a Global Attention Mechanism (GAM) was implemented in the neck network to better learn both location and semantic information, while the ReContext gradient composition feature pyramid replaced the traditional FPN, enabling more effective multi-scale object detection. Additionally, the CSPNet structure in the neck was substituted with Res-CSP, enhancing feature fusion flexibility and improving detection performance in complex traffic conditions. For tracking, the Deep SORT algorithm was optimized with enhanced appearance feature extraction, reducing the identity switches caused by occlusions and ensuring the stable tracking of vehicles, pedestrians, and non-motorized vehicles. The Bi-LSTM model was employed for trajectory prediction, capturing long-range dependencies to provide accurate forecasting of future positions. The collision risk was quantified using the predictive collision risk area (PCRA) method, categorizing risks into three levels (danger, warning, and caution) based on the predicted overlaps in trajectories. In the experimental setup, the dataset used for training the model consisted of 30,000 images annotated with bounding boxes around vehicles, pedestrians, and non-motorized vehicles. Data augmentation techniques such as Mosaic, Random_perspective, Mixup, HSV adjustments, Flipud, and Fliplr were applied to enrich the dataset and improve model robustness. In real-world testing, the system was deployed as part of the G310 highway safety project, where it achieved a mean Average Precision (mAP) of over 90% for object detection. Over a one-month period, 120 warning events involving vehicles, pedestrians, and non-motorized vehicles were recorded. Manual verification of the warnings indicated a prediction accuracy of 97%, demonstrating the system's reliability in identifying potential collisions and issuing timely warnings. This approach represents a significant advancement in enhancing safety at unsignalized intersections in urban traffic environments.
(This article belongs to the Special Issue Computer Vision and Image Processing in Machine Learning)
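The Bi-LSTM trajectory predictor can be sketched generically. The toy model below maps an observed (x, y) track to a fixed horizon of future positions; the layer sizes and the 12-step horizon are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class BiLSTMTrajectoryPredictor(nn.Module):
    """Toy bidirectional LSTM mapping an observed (x, y) track to the
    next few positions. Sizes and horizon are assumed, not the paper's."""
    def __init__(self, hidden=64, horizon=12):
        super().__init__()
        self.horizon = horizon
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden,
                            num_layers=2, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, horizon * 2)

    def forward(self, tracks):            # tracks: (batch, T_obs, 2)
        feats, _ = self.lstm(tracks)      # (batch, T_obs, 2*hidden)
        last = feats[:, -1]               # summary of the observed track
        out = self.head(last)             # (batch, horizon*2)
        return out.view(-1, self.horizon, 2)  # future (x, y) positions

# usage: predict 12 future positions from 8 observed ones
model = BiLSTMTrajectoryPredictor()
future = model(torch.randn(4, 8, 2))      # -> shape (4, 12, 2)
```

A PCRA-style risk check would then intersect these predicted positions across road users and grade the overlap into danger/warning/caution zones.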

31 pages, 11795 KiB  
Article
DT-YOLO: An Improved Object Detection Algorithm for Key Components of Aircraft and Staff in Airport Scenes Based on YOLOv5
by Zhige He, Yuanqing He and Yang Lv
Sensors 2025, 25(6), 1705; https://doi.org/10.3390/s25061705 - 10 Mar 2025
Viewed by 1211
Abstract
With the rapid development and increasing demands of civil aviation, the accurate detection of key aircraft components and staff on airport aprons is of great significance for ensuring flight safety and improving the operational efficiency of airports. However, existing detection models for airport aprons are relatively scarce, and their accuracy is insufficient. Based on YOLOv5, we propose an improved object detection algorithm, called DT-YOLO, to address these issues. We first built a dataset called AAD-dataset for airport apron scenes by randomly sampling and capturing real-world surveillance videos to support our research. We then introduced a novel module named D-CTR in the backbone, which integrates the global feature extraction capability of Transformers with the limited receptive field of convolutional neural networks (CNNs) to enhance feature representation and overall performance. A dropout layer was introduced to reduce redundant and noisy features, prevent overfitting, and improve the model's generalization ability. In addition, we utilized deformable convolutions to extract features from multi-scale and deformed objects, further enhancing the model's adaptability and detection accuracy. In the loss function design, we modified GIoU loss to address its discontinuities and instability in certain scenes, which effectively mitigated gradient explosion and improved the stability of the model. Finally, experiments on the self-built AAD-dataset demonstrated that DT-YOLO significantly improves the mean average precision (mAP): mAP increased by 2.6 points on the AAD-dataset, and other metrics, including detection speed, AP50, and AP75, also improved, which comprehensively shows that DT-YOLO can be applied to real-time object detection on airport aprons, ensuring the safe operation of aircraft and efficient management of airports.
(This article belongs to the Special Issue Computer Vision Recognition and Communication Sensing System)
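Deformable convolution is available off the shelf in torchvision, and the usual pattern is a plain conv that predicts per-location sampling offsets for the deformable layer. The block below is a generic illustration of that pattern, not DT-YOLO's D-CTR module or its modified GIoU loss.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    """Deformable 3x3 convolution: a plain conv predicts sampling
    offsets so the kernel can follow deformed objects."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # 2 offsets (dx, dy) per kernel tap: 2 * 3 * 3 = 18 channels
        self.offset = nn.Conv2d(in_ch, 18, kernel_size=3, padding=1)
        self.deform = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return self.deform(x, self.offset(x))

x = torch.randn(1, 64, 32, 32)
print(DeformBlock(64, 128)(x).shape)  # torch.Size([1, 128, 32, 32])
```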

19 pages, 10641 KiB  
Article
GE-YOLO for Weed Detection in Rice Paddy Fields
by Zimeng Chen, Baifan Chen, Yi Huang and Zeshun Zhou
Appl. Sci. 2025, 15(5), 2823; https://doi.org/10.3390/app15052823 - 5 Mar 2025
Cited by 2 | Viewed by 1078
Abstract
Weeds are a significant adverse factor affecting rice growth, and their efficient removal requires an accurate, efficient, and well-generalizing weed detection method. However, weed detection faces challenges such as complex vegetation environments, the similar morphology and color of weeds and crops, and varying lighting conditions, which current research has yet to address adequately. Therefore, we propose GE-YOLO to identify three common types of weeds in rice fields in Hunan province, China, and validate its generalization performance. GE-YOLO improves on the YOLOv8 baseline model. It introduces the Gold-YOLO feature aggregation and distribution network into the neck to enhance the network's ability to fuse multi-scale features and detect weeds of different sizes. Additionally, an EMA attention mechanism is used to better learn weed feature representations, while a GIoU loss function provides smoother gradients and reduces computational complexity. Multiple experiments demonstrate that GE-YOLO achieves 93.1% mAP, a 90.3% F1 score, and 85.9 FPS, surpassing almost all mainstream object detection algorithms such as YOLOv8, YOLOv10, and YOLOv11 in detection accuracy and overall performance. Furthermore, detection results under different lighting conditions consistently remained above 90% mAP, and under heavy occlusion the average mAP across all weed types reached 88.7%. These results indicate that GE-YOLO has excellent detection accuracy and generalization performance, highlighting its potential as a valuable tool for weed management practices in rice cultivation.

23 pages, 10794 KiB  
Article
Hand–Eye Separation-Based First-Frame Positioning and Follower Tracking Method for Perforating Robotic Arm
by Handuo Zhang, Jun Guo, Chunyan Xu and Bin Zhang
Appl. Sci. 2025, 15(5), 2769; https://doi.org/10.3390/app15052769 - 4 Mar 2025
Viewed by 726
Abstract
In subway tunnel construction, current hand–eye integrated drilling robots use a camera mounted on the drilling arm for image acquisition. However, dust interference and long-distance operation degrade image quality, affecting the stability and accuracy of the visual recognition system. Additionally, the computational complexity of high-precision detection models limits deployment on resource-constrained edge devices, such as industrial controllers. To address these challenges, this paper proposes a dual-arm tunnel drilling robot system with hand–eye separation, using a first-frame localization and follower tracking method. The vision arm (the "eye") provides real-time position data to the drilling arm (the "hand"), ensuring accurate and efficient operation. The study employs an RFBNet model for first-frame localization, replacing the original VGG16 backbone with ShuffleNet V2. This reduces model parameters by 30% (135.5 MB vs. 146.3 MB) through channel splitting and depthwise separable convolutions, lowering computational complexity. Additionally, the GIoU loss function replaces the traditional IoU, further optimizing bounding box regression through the calculation of the minimum enclosing box; this resolves the vanishing-gradient problem of traditional IoU for non-overlapping boxes and improves average precision (AP) by 3.3% (from 0.91 to 0.94). For continuous tracking, a SiamRPN-based algorithm combined with Kalman filtering and PID control ensures robustness against occlusions and nonlinear disturbances, increasing the success rate by 1.6% (0.639 vs. 0.629). Experimental results show that this approach significantly improves tracking accuracy and operational stability, achieving 31 FPS inference speed on edge devices and providing a deployable solution for the safety and efficiency needs of tunnel construction.
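The Kalman-plus-PID follower loop can be illustrated compactly. The sketch below runs a constant-velocity Kalman filter on one coordinate of the tracked target and feeds the filtered error to a PID controller; the dynamics model, noise levels, and gains are placeholders, not the paper's tuning.

```python
import numpy as np

class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate; smooths noisy
    detections before they reach the controller."""
    def __init__(self, q=1e-3, r=1e-1):
        self.x = np.zeros(2)                          # [position, velocity]
        self.P = np.eye(2)
        self.Q, self.R = q * np.eye(2), r             # process / measurement noise
        self.F = np.array([[1.0, 1.0], [0.0, 1.0]])   # dt = 1 step
        self.H = np.array([[1.0, 0.0]])

    def step(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update with measurement z
        S = self.H @ self.P @ self.H.T + self.R
        K = (self.P @ self.H.T) / S
        self.x = self.x + (K * (z - self.H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]

class PID:
    """Classic PID; gains are placeholders, not the paper's values."""
    def __init__(self, kp=0.8, ki=0.05, kd=0.2):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.i, self.prev = 0.0, 0.0

    def step(self, error, dt=1.0):
        self.i += error * dt
        d = (error - self.prev) / dt
        self.prev = error
        return self.kp * error + self.ki * self.i + self.kd * d

# follower loop: filter the measured target position, drive the arm toward it
kf, pid, arm = Kalman1D(), PID(), 0.0
for z in [10.2, 10.8, 11.5, 12.1]:        # noisy target positions
    target = kf.step(z)
    arm += pid.step(target - arm)
```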

20 pages, 3735 KiB  
Article
Pavement Disease Visual Detection by Structure Perception and Feature Attention Network
by Bin Lv, Shuo Zhang, Haixia Gong, Hongbo Zhang, Bin Dong, Jianzhu Wang, Cong Du and Jianqing Wu
Appl. Sci. 2025, 15(2), 551; https://doi.org/10.3390/app15020551 - 8 Jan 2025
Viewed by 917
Abstract
Balancing detection performance and computational efficiency is critical for sustainable pavement disease detection in energy-constrained scenarios. However, existing visual methods often struggle to adapt to structural transformations and to capture critical features of pavement diseases in complex environments, while their computational demands can be resource-intensive. To address these challenges, this paper proposes a structure perception and feature attention network (SPFAN). The network includes a structure perception module that employs an updated deformable convolution technique, enabling the model to dynamically adjust and focus on actual pavement disease regions and improving the accuracy of feature extraction, especially for diseases with irregular shapes and sizes. Additionally, the convolutional block attention module (CBAM) is integrated to optimize feature map attention across channel and spatial dimensions, enhancing the model's focus on critical disease features without significantly increasing complexity. To further improve robustness, the generalized intersection over union (GIoU) loss function is adopted, ensuring better stability across targets of varying shapes and sizes. Experimental results on real-world pavement disease images show that the mAP@0.5 of the proposed SPFAN increases from 66.2% to 71.2%, a relative improvement of 7.55%, while the F1-score increases by 9.03%, compared to the baseline YOLOv8n model. Furthermore, while achieving these accuracy gains, the proposed method keeps a parameter count similar to the baseline, preserving its low computational demands and high efficiency and making it suitable for real-time pavement damage detection in energy-constrained environments.
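CBAM is a published module, so a faithful generic sketch is possible: channel attention from average- and max-pooled descriptors passed through a shared MLP, then spatial attention from per-pixel channel statistics. How SPFAN wires it into YOLOv8n is not shown here.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module (Woo et al.): channel
    attention followed by spatial attention. Generic sketch."""
    def __init__(self, ch, reduction=16, spatial_kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch))
        self.spatial = nn.Conv2d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        # channel attention: shared MLP over avg- and max-pooled vectors
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # spatial attention: conv over per-pixel channel mean and max
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```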

21 pages, 9878 KiB  
Article
Deep Learning for Stomatal Opening Recognition in Gynura formosana Kitam Leaves
by Xinlong Shi, Yanbo Song, Xiaojing Shi, Wenjuan Lu, Yijie Zhao, Zhimin Zhou, Junmai Chai and Zhenyu Liu
Agronomy 2024, 14(11), 2622; https://doi.org/10.3390/agronomy14112622 - 6 Nov 2024
Cited by 1 | Viewed by 1105
Abstract
Gynura formosana Kitam possesses beneficial properties such as heat-clearing, detoxification, and cough suppression, making it a highly nutritious plant with significant economic value. During growth, the plant's leaves are prone to infections that can impair stomatal function and hinder growth. Effective identification of stomatal openings and timely application of appropriate chemicals or hormones, or indirect environmental adjustments (such as light, temperature, and humidity) to regulate stomatal openings, are essential for maintaining the plant's healthy growth. Currently, manual observation is the predominant method for monitoring stomatal openings of Gynura formosana Kitam; it is complex, labor-intensive, and unsuitable for automated detection. To address this, the study improves upon YOLOv8s by proposing a real-time, high-precision stomatal detection model, Refined GIoU. This model substitutes the original IoU evaluation method in YOLOv8s with GIoU, DIoU, and EIoU while incorporating the SE (Squeeze-and-Excitation) and SA (Self-Attention) attention mechanisms to enhance feature representation and spatial relationships. Additionally, enhancements to the P2 layer improve feature extraction and scale adaptation. The effectiveness of Refined GIoU is demonstrated through training and validation on a dataset of 1500 images of Gynura formosana Kitam stomata. The results show that Refined GIoU achieved a mean average precision (mAP) of 0.935, a recall of 0.98, and an F1-score of 0.88, reflecting excellent overall performance. The GIoU loss function is better suited to detecting stomatal openings of Gynura formosana Kitam, significantly enhancing detection accuracy. This model facilitates automated, real-time monitoring of stomatal openings, allowing timely control measures and improved economic benefits of Gynura formosana Kitam cultivation.
(This article belongs to the Section Precision and Digital Agriculture)
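Of the pieces named in this abstract, the SE block is the most self-contained. The sketch below is the standard squeeze-and-excitation mechanism (global pooling "squeezes" each channel, a small MLP "excites" per-channel weights), not the Refined GIoU model's exact configuration.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: reweight channels by globally pooled
    statistics. Generic sketch of the SE mechanism."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # (b, c) channel weights in (0, 1)
        return x * w.view(b, c, 1, 1)
```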

37 pages, 15011 KiB  
Article
Steering-Angle Prediction and Controller Design Based on Improved YOLOv5 for Steering-by-Wire System
by Cunliang Ye, Yunlong Wang, Yongfu Wang and Yan Liu
Sensors 2024, 24(21), 7035; https://doi.org/10.3390/s24217035 - 31 Oct 2024
Cited by 1 | Viewed by 2127
Abstract
Steering-angle prediction plays a crucial role in the control of autonomous vehicles (AVs), comprising both prediction and control of the steering angle. However, the prediction accuracy and computational efficiency of traditional YOLOv5 are limited, and for steering-angle control, angular velocity is difficult to measure while the control effect is degraded by external disturbances and unknown friction. This paper proposes a lightweight steering-angle prediction network model called YOLOv5Ms, based on YOLOv5, aiming to achieve accurate prediction while enhancing computational efficiency. Additionally, an adaptive output feedback control scheme with output constraints, based on neural networks, is proposed to effectively regulate the steering angle predicted by the YOLOv5Ms algorithm. Firstly, given that most lane-line datasets consist of simulated images and lack diversity, a novel lane dataset derived from real roads was manually created to train the proposed network model. To improve real-time accuracy in steering-angle prediction and enhance the effectiveness of steering control, we replace the generalized intersection over union (GIoU) bounding-box regression loss with Shape-IoU loss, a better-converging regression loss. The YOLOv5Ms model achieves a 30.34% reduction in weight storage space while simultaneously improving accuracy by 7.38% compared to the YOLOv5s model. Moreover, utilizing the backstepping control method and introducing a Lyapunov barrier function enables us to design an adaptive neural network output feedback controller with output constraints, and a strict stability analysis based on Lyapunov stability theory ensures the boundedness of all signals within the closed-loop system. Numerical simulations and experiments show that the proposed method provides a 39.16% better root mean squared error (RMSE) score than traditional backstepping control and achieves good estimation performance for angles, angular velocity, and unknown disturbances.
(This article belongs to the Special Issue Deep Learning for Perception and Recognition: Method and Applications)

22 pages, 17207 KiB  
Article
Multi-Target Vehicle Tracking Algorithm Based on Improved DeepSORT
by Dudu Guo, Zhuzhou Li, Hongbo Shuai and Fei Zhou
Sensors 2024, 24(21), 7014; https://doi.org/10.3390/s24217014 - 31 Oct 2024
Cited by 2 | Viewed by 1918
Abstract
In this paper, we address the insufficient accuracy and frequent identity switching of the multi-target tracking algorithm DeepSORT by proposing two improvement strategies. First, we optimize the appearance feature extraction process by training a lightweight appearance extraction network (OSNet) on a vehicle re-identification dataset, making the appearance features better suited to the vehicle tracking model required in this paper. Second, we improve the motion feature metric by replacing the original IoU distance with a GIoU metric. The optimized tracking algorithm using GIoU achieves effective improvements in tracking precision and accuracy. The experimental results show that the improved vehicle tracking model's MOTA and IDF1 are enhanced by 4.6% and 5.9%, respectively, allowing stable tracking of vehicles and reducing the occurrence of identity switching to a certain extent.
(This article belongs to the Section Sensing and Imaging)
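Swapping IoU for GIoU in DeepSORT's motion gate amounts to changing the association cost matrix. The sketch below builds a GIoU cost matrix between predicted track boxes and detections and solves it with the Hungarian algorithm; the gating threshold is a placeholder, not the paper's tuned value.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def giou(a, b, eps=1e-7):
    """GIoU between two (x1, y1, x2, y2) boxes; value in (-1, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou = inter / (union + eps)
    cw = max(a[2], b[2]) - min(a[0], b[0])   # enclosing box width
    ch = max(a[3], b[3]) - min(a[1], b[1])   # enclosing box height
    return iou - (cw * ch - union) / (cw * ch + eps)

def associate(track_boxes, det_boxes, giou_thresh=-0.2):
    """Match predicted track boxes to detections via a GIoU cost matrix.
    Unlike plain IoU, GIoU still ranks non-overlapping pairs by how far
    apart they are, which reduces dropped matches after occlusion."""
    cost = np.array([[1.0 - giou(t, d) for d in det_boxes]
                     for t in track_boxes])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols)
            if giou(track_boxes[r], det_boxes[c]) > giou_thresh]
```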

16 pages, 3898 KiB  
Article
APD-YOLOv7: Enhancing Sustainable Farming through Precise Identification of Agricultural Pests and Diseases Using a Novel Diagonal Difference Ratio IOU Loss
by Jianwen Li, Shutian Liu, Dong Chen, Shengbang Zhou and Chuanqi Li
Sustainability 2024, 16(20), 8855; https://doi.org/10.3390/su16208855 - 13 Oct 2024
Cited by 4 | Viewed by 1606
Abstract
The diversity and complexity of the agricultural environment pose significant challenges for the collection of pest and disease data. Additionally, pest and disease datasets often suffer from uneven distribution in quantity and inconsistent annotation standards. Enhancing the accuracy of pest and disease recognition remains a challenge for existing models. We constructed a representative agricultural pest and disease dataset, FIP6Set, through a combination of field photography and web scraping. This dataset encapsulates key issues encountered in existing agricultural pest and disease datasets. Referencing existing bounding box regression (BBR) loss functions, we reconsidered their geometric features and proposed a novel bounding box similarity comparison metric, DDRIoU, suited to the characteristics of agricultural pest and disease datasets. By integrating the focal loss concept with the DDRIoU loss, we derived a new loss function, namely Focal-DDRIoU loss. Furthermore, we modified the network structure of YOLOV7 by embedding the MobileViTv3 module. Consequently, we introduced a model specifically designed for agricultural pest and disease detection in precision agriculture. We conducted performance evaluations on the FIP6Set dataset using mAP75 as the evaluation metric. Experimental results demonstrate that the Focal-DDRIoU loss achieves improvements of 1.12%, 1.24%, 1.04%, and 1.50% compared to the GIoU, DIoU, CIoU, and EIoU losses, respectively. When employing the GIoU, DIoU, CIoU, EIoU, and Focal-DDRIoU loss functions, the adjusted network structure showed enhancements of 0.68%, 0.68%, 0.78%, 0.60%, and 0.56%, respectively, compared to the original YOLOv7. Furthermore, the proposed model outperformed the mainstream YOLOv7 and YOLOv5 models by 1.86% and 1.60%, respectively. The superior performance of the proposed model in detecting agricultural pests and diseases directly contributes to reducing pesticide misuse, preventing large-scale pest and disease outbreaks, and ultimately enhancing crop yields. These outcomes strongly support the promotion of sustainable agricultural development.
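DDRIoU itself is the authors' new metric and is not reproduced here, but the focal-weighting half of Focal-DDRIoU follows a known pattern (as in Focal-EIoU): scale each sample's regression loss by IoU**gamma so well-localized boxes dominate the gradient. A hedged sketch of that weighting applied to a generic IoU-family loss:

```python
import torch

def focal_iou_loss(iou, base_loss, gamma=0.5):
    """Focal weighting for a box-regression loss (Focal-EIoU pattern):
    per-box loss is scaled by iou**gamma. In the paper, `base_loss`
    would be the DDRIoU loss; here any IoU-family loss stands in."""
    return (iou.clamp(min=0) ** gamma * base_loss).mean()

# usage with a generic 1 - IoU base loss
iou = torch.tensor([0.9, 0.5, 0.1])
loss = focal_iou_loss(iou, base_loss=1.0 - iou)
```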

14 pages, 2090 KiB  
Article
Pest Detection Based on Lightweight Locality-Aware Faster R-CNN
by Kai-Run Li, Li-Jun Duan, Yang-Jun Deng, Jin-Ling Liu, Chen-Feng Long and Xing-Hui Zhu
Agronomy 2024, 14(10), 2303; https://doi.org/10.3390/agronomy14102303 - 7 Oct 2024
Cited by 10 | Viewed by 2033
Abstract
Accurate and timely monitoring of pests is an effective way to minimize the negative effects of pests in agriculture. Since deep learning-based methods have achieved good performance in object detection, they have been successfully applied for pest detection and monitoring. However, the current pest detection methods fail to balance the relationship between computational cost and model accuracy. Therefore, this paper proposes a lightweight, locality-aware faster R-CNN (LLA-RCNN) method for effective pest detection and real-time monitoring. The proposed model uses MobileNetV3 to replace the original backbone, reduce the computational complexity, and compress the size of the model to speed up pest detection. The coordinate attention (CA) blocks are utilized to enhance the locality information for highlighting the objects under complex backgrounds. Furthermore, the generalized intersection over union (GIoU) loss function and region of interest align (RoI Align) technology are used to improve pest detection accuracy. The experimental results on different types of datasets validate that the proposed model not only significantly reduces the number of parameters and floating-point operations (FLOPs), but also achieves better performance than some popular pest detection methods. This demonstrates strong generalization capabilities and provides a feasible method for pest detection on resource-constrained devices.
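RoI Align is exposed directly by torchvision, so its accuracy-relevant detail, bilinear sampling instead of the quantized pooling of RoI Pool, can be shown in a few lines. The feature stride and box below are made-up values, not the paper's configuration.

```python
import torch
from torchvision.ops import roi_align

# feature map from the backbone: batch 1, 256 channels, 1/16 resolution
feats = torch.randn(1, 256, 50, 50)

# one region proposal in image coordinates (x1, y1, x2, y2)
boxes = [torch.tensor([[120.0, 80.0, 360.0, 240.0]])]

# bilinear sampling avoids the coordinate quantization of RoI Pool,
# which matters for small pests whose boxes span few feature cells
pooled = roi_align(feats, boxes, output_size=(7, 7),
                   spatial_scale=1.0 / 16, sampling_ratio=2, aligned=True)
print(pooled.shape)  # torch.Size([1, 256, 7, 7])
```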

27 pages, 8828 KiB  
Article
Research on Detection Method of Chaotian Pepper in Complex Field Environments Based on YOLOv8
by Yichu Duan, Jianing Li and Chi Zou
Sensors 2024, 24(17), 5632; https://doi.org/10.3390/s24175632 - 30 Aug 2024
Cited by 3 | Viewed by 1472
Abstract
The intelligent detection of chili peppers is crucial for achieving automated operations. In complex field environments, challenges such as overlapping plants, branch occlusions, and uneven lighting make detection difficult. This study conducted comparative experiments to select the optimal detection model based on YOLOv8 and further enhanced it. The model was optimized by incorporating BiFPN, LSKNet, and FasterNet modules, followed by the addition of attention and lightweight modules such as EMBC, EMSCP, DAttention, MSBlock, and Faster. Adjustments to CIoU, Inner CIoU, Inner GIoU, and inner_mpdiou loss functions and scaling factors further improved overall performance. After optimization, the YOLOv8 model achieved precision, recall, and mAP scores of 79.0%, 75.3%, and 83.2%, respectively, representing increases of 1.1, 4.3, and 1.6 percentage points over the base model. Additionally, GFLOPs were reduced by 13.6%, the model size decreased to 66.7% of the base model, and the FPS reached 301.4. This resulted in accurate and rapid detection of chili peppers in complex field environments, providing data support and experimental references for the development of intelligent picking equipment.
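One plausible reading of the "Inner GIoU" and scaling-factor adjustments is the Inner-IoU family, which rescales both boxes about their centers by a ratio before computing overlap. The sketch below shows that pattern on plain IoU; the paper's exact inner_mpdiou formulation is not public here, and the ratio is a placeholder.

```python
def inner_box(box, ratio=0.75):
    """Shrink (ratio < 1) or grow (ratio > 1) a box about its center,
    as in Inner-IoU-style losses; ratio is the tunable scaling factor."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = (x2 - x1) * ratio, (y2 - y1) * ratio
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def iou(a, b, eps=1e-7):
    """Plain IoU between two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + eps)

def inner_iou_loss(pred, target, ratio=0.75):
    """1 - IoU of the rescaled boxes; ratio < 1 emphasizes center
    overlap, ratio > 1 keeps a gradient for slightly separated boxes."""
    return 1.0 - iou(inner_box(pred, ratio), inner_box(target, ratio))
```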