Search Results (2,120)

Search Parameters:
Keywords = YOLOV7 network model

18 pages, 3128 KiB  
Article
A Real-Time Mature Hawthorn Detection Network Based on Lightweight Hybrid Convolutions for Harvesting Robots
by Baojian Ma, Bangbang Chen, Xuan Li, Liqiang Wang and Dongyun Wang
Sensors 2025, 25(16), 5094; https://doi.org/10.3390/s25165094 - 16 Aug 2025
Abstract
Accurate real-time detection of hawthorn by vision systems is a fundamental prerequisite for automated harvesting. This study addresses the challenges in hawthorn orchards—including target overlap, leaf occlusion, and environmental variations—which lead to compromised detection accuracy, high computational resource demands, and poor real-time performance in existing methods. To overcome these limitations, we propose YOLO-DCL (group shuffling convolution and coordinate attention integrated with a lightweight head based on YOLOv8n), a novel lightweight hawthorn detection model. The backbone network employs dynamic group shuffling convolution (DGCST) for efficient and effective feature extraction. Within the neck network, coordinate attention (CA) is integrated into the feature pyramid network (FPN), forming an enhanced multi-scale feature pyramid network (HSPFN); this integration further optimizes the C2f structure. The detection head is designed utilizing shared convolution and batch normalization to streamline computation. Additionally, the PIoUv2 (powerful intersection over union version 2) loss function is introduced to significantly reduce model complexity. Experimental validation demonstrates that YOLO-DCL achieves a precision of 91.6%, recall of 90.1%, and mean average precision (mAP) of 95.6%, while simultaneously reducing the model size to 2.46 MB with only 1.2 million parameters and 4.8 GFLOPs computational cost. To rigorously assess real-world applicability, we developed and deployed a detection system based on the PySide6 framework on an NVIDIA Jetson Xavier NX edge device. Field testing validated the model’s robustness, high accuracy, and real-time performance, confirming its suitability for integration into harvesting robots operating in practical orchard environments. Full article
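The coordinate attention (CA) block integrated into the HSPFN neck is a general-purpose published module. As a point of reference, a minimal PyTorch sketch of the standard CA formulation (not the authors' exact integration or hyperparameters) looks like this:

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention: global pooling is factorized into two 1-D pools
    (along H and along W) so the channel weights retain positional information."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                       # (b, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (b, c, w, 1)
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                        # (b, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # (b, c, 1, w)
        return x * a_h * a_w
```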
(This article belongs to the Section Sensors and Robotics)
15 pages, 1378 KiB  
Article
Intelligent Vehicle Target Detection Algorithm Based on Multiscale Features
by Aijuan Li, Xiangsen Ning, Máté Zöldy, Jiaqi Chen and Guangpeng Xu
Sensors 2025, 25(16), 5084; https://doi.org/10.3390/s25165084 - 15 Aug 2025
Abstract
To address the issues of false detections and missed detections in object detection for intelligent driving scenarios, this study focuses on optimizing the YOLOv10 algorithm to reduce model complexity while enhancing detection accuracy. The method involves three key improvements. First, it involves the design of multi-scale flexible convolution (MSFC), which can capture multi-scale information simultaneously, thereby reducing network stacking and computational load. Second, it reconstructs the neck network structure by incorporating Shallow Auxiliary Fusion (SAF) and Advanced Auxiliary Fusion (AAF), enabling better capture of multi-scale features of objects. Third, it improves the detection head through the combination of multi-scale convolution and channel adaptive attention mechanism, enhancing the diversity and accuracy of feature extraction. Results show that the improved YOLOv10 model has a size of 13.4 MB, meaning a reduction of 11.8%, and that the detection accuracy mAP@0.5 reaches 93.0%, outperforming mainstream models in comprehensive performance. This work provides a detection framework for intelligent driving scenarios, balancing accuracy and model size. Full article
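The MSFC block is specific to this paper and its exact design is not given here. Purely to illustrate the general idea of one block capturing several receptive fields at once, a hypothetical parallel-kernel sketch in PyTorch might look as follows (this is not the authors' MSFC):

```python
import torch
import torch.nn as nn

class MultiScaleConv(nn.Module):
    """Hypothetical multi-scale block: parallel 1x1 / 3x3 / 5x5 branches are
    concatenated and fused, so a single layer sees multiple receptive fields."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in (1, 3, 5)])
        self.fuse = nn.Conv2d(3 * out_ch, out_ch, 1)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.fuse(torch.cat([b(x) for b in self.branches], dim=1)))
```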

17 pages, 1118 KiB  
Article
SMA-YOLO: A Novel Approach to Real-Time Vehicle Detection on Edge Devices
by Haixia Liu, Yingkun Song, Yongxing Lin and Zhixin Tie
Sensors 2025, 25(16), 5072; https://doi.org/10.3390/s25165072 - 15 Aug 2025
Abstract
Vehicle detection plays a pivotal role as a key technology for intelligent traffic management and driverless driving. However, current deep learning-based vehicle detection models face several challenges in practical applications. These include slow detection speeds, large computational and parameter requirements, high missed-detection and false-detection rates in target-dense environments, and difficulties in deploying them on edge devices with limited computing power and memory. To address these issues, this paper proposes an improved vehicle detection method called SMA-YOLO, based on the YOLOv7 model. Firstly, MobileNetV3 is adopted as the new backbone network to lighten the model. Secondly, the SimAM attention mechanism is incorporated to suppress background interference and enhance small-target detection capability. Additionally, the ACON activation function is substituted for the original SiLU activation function in the YOLOv7 model to improve detection accuracy. Lastly, SIoU is used to replace CIoU to optimize the loss function and accelerate model convergence. Experiments on the UA-DETRAC dataset demonstrate that the proposed SMA-YOLO model achieves a lightweight effect, significantly reducing model size, computational requirements, and the number of parameters. It not only greatly improves detection speed but also maintains higher detection accuracy. This provides a feasible solution for deploying a vehicle detection model on embedded devices for real-time detection. Full article
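SimAM is a published, parameter-free attention mechanism. A minimal PyTorch sketch of its standard formulation is shown below; the exact integration point inside SMA-YOLO is not specified here.

```python
import torch

def simam(x, e_lambda=1e-4):
    """Parameter-free SimAM attention: each activation is re-weighted by an
    inverse energy term based on how far it deviates from its channel mean."""
    b, c, h, w = x.shape
    n = h * w - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)) ** 2
    v = d.sum(dim=(2, 3), keepdim=True) / n
    e_inv = d / (4 * (v + e_lambda)) + 0.5
    return x * torch.sigmoid(e_inv)
```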
(This article belongs to the Section Vehicular Sensing)

19 pages, 4394 KiB  
Article
Research on Optimized YOLOv5s Algorithm for Detecting Aircraft Landing Runway Markings
by Wei Huang, Hongrui Guo, Xiangquan Li, Xi Tan and Bo Liu
Processes 2025, 13(8), 2572; https://doi.org/10.3390/pr13082572 - 14 Aug 2025
Abstract
During traditional aircraft landings, pilots face significant challenges in identifying runway numbers with the naked eye, particularly at decision height under adverse weather conditions. To address this issue, this study proposes a novel detection algorithm based on an optimized version of the YOLOv5s model (You Only Look Once, version 5) for recognizing runway markings during civil aircraft landings. By integrating a data augmentation strategy with external datasets, the method effectively reduces both false detections and missed targets through expanded feature representation. An Alpha Complete Intersection over Union (Alpha-CIoU) loss function is introduced in place of the original CIoU loss function, offering improved gradient optimization. Additionally, several advanced modules and techniques are incorporated into the backbone and neck of the YOLOv5 architecture, including a Convolutional Block Attention Module (CBAM), Soft Non-Maximum Suppression (Soft-NMS), cosine annealing learning rate scheduling, the FReLU activation function, and deformable convolutions. To further enhance detection, a specialized small-target detection layer is added to the head of the network and the resolution of feature maps is improved. These enhancements enable better feature extraction and more accurate identification of smaller targets. As a result, the optimized model shows significantly improved recall (R) and precision (P). Experimental results, visualized using custom-developed software, demonstrate that the proposed optimized YOLOv5s model achieved increases of 5.66% in P, 2.99% in R, and 2.74% in mean average precision (mAP) compared to the baseline model. This study provides valuable data and a theoretical foundation to support the accurate visual identification of runway numbers and other reference markings during aircraft landings. Full article
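Soft-NMS is a standard post-processing technique referenced in this abstract. A minimal NumPy sketch of the Gaussian variant follows; the paper's exact settings (e.g., the sigma value and score threshold) are assumptions.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: overlapping boxes are not deleted outright; their
    scores are decayed by exp(-IoU^2 / sigma), which helps in crowded scenes."""
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float).copy()
    keep, idxs = [], np.arange(len(scores))
    while len(idxs) > 0:
        best = idxs[np.argmax(scores[idxs])]
        keep.append(best)
        idxs = idxs[idxs != best]
        if len(idxs) == 0:
            break
        scores[idxs] *= np.exp(-(iou(boxes[best], boxes[idxs]) ** 2) / sigma)
        idxs = idxs[scores[idxs] > score_thresh]
    return keep
```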
(This article belongs to the Special Issue Modelling and Optimizing Process in Industry 4.0)

33 pages, 9679 KiB  
Article
Intelligent Defect Detection of Ancient City Walls Based on Computer Vision
by Gengpei Zhang, Xiaohan Dou and Leqi Li
Sensors 2025, 25(16), 5042; https://doi.org/10.3390/s25165042 - 14 Aug 2025
Abstract
As an important tangible carrier of historical and cultural heritage, ancient city walls embody the historical memory of urban development and serve as evidence of engineering evolution. However, due to prolonged exposure to complex natural environments and human activities, they are highly susceptible to various types of defects, such as cracks, missing bricks, salt crystallization, and vegetation erosion. To enhance the capability of cultural heritage conservation, this paper focuses on the ancient city wall of Jingzhou and proposes a multi-stage defect-detection framework based on computer vision technology. The proposed system establishes a processing pipeline that includes image processing, 2D defect detection, depth estimation, and 3D reconstruction. On the processing end, the Restormer and SG-LLIE models are introduced for image deblurring and illumination enhancement, respectively, improving the quality of wall images. The system incorporates the LFS-GAN model to augment defect samples. On the detection end, YOLOv12 is used as the 2D recognition network to detect common defects based on the generated samples. A depth estimation module is employed to assist in the verification of ancient wall defects. Finally, a Gaussian Splatting point-cloud reconstruction method is used to achieve a 3D visual representation of the defects. Experimental results show that the proposed system effectively detects multiple types of defects in ancient city walls, providing both a theoretical foundation and technical support for the intelligent monitoring of cultural heritage. Full article
(This article belongs to the Section Sensing and Imaging)

19 pages, 1619 KiB  
Article
Impact of Water Velocity on Litopenaeus vannamei Behavior Using ByteTrack-Based Multi-Object Tracking
by Jiahao Zhang, Lei Wang, Zhengguo Cui, Hao Li, Jianlei Chen, Yong Xu, Haixiang Zhao, Zhenming Huang, Keming Qu and Hongwu Cui
Fishes 2025, 10(8), 406; https://doi.org/10.3390/fishes10080406 - 14 Aug 2025
Abstract
In factory-controlled recirculating aquaculture systems, precise regulation of water velocity is crucial for optimizing shrimp feeding behavior and improving aquaculture efficiency. However, quantitative analysis of the impact of water velocity on shrimp behavior remains challenging. This study developed an innovative multi-objective behavioral analysis framework integrating detection, tracking, and behavioral interpretation. Specifically, the YOLOv8 model was employed for precise shrimp detection, ByteTrack with a dual-threshold matching strategy ensured continuous individual trajectory tracking in complex water environments, and Kalman filtering corrected coordinate offsets caused by water refraction. Under typical recirculating aquaculture system conditions, three water circulation rates (2.0, 5.0, and 10.0 cycles/day) were established to simulate varying flow velocities. High-frequency imaging (30 fps) was used to simultaneously record and analyze the movement trajectories of Litopenaeus vannamei during feeding and non-feeding periods, from which two-dimensional behavioral parameters—velocity and turning angle—were extracted. Key experimental results indicated that water circulation rates significantly affected shrimp movement velocity but had no significant effect on turning angle. Importantly, under only the moderate circulation rate (5.0 cycles/day), the average movement velocity during feeding was significantly lower than during non-feeding periods (p < 0.05). This finding reveals that moderate water velocity constitutes a critical hydrodynamic window for eliciting specific feeding behavior in shrimp. These results provide core parameters for an intelligent Litopenaeus vannamei feeding intensity assessment model based on spatiotemporal graph convolutional networks and offer theoretically valuable and practically applicable guidance for optimizing hydrodynamics and formulating precision feeding strategies in recirculating aquaculture systems. Full article
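As an illustration of how two-dimensional behavioural parameters of this kind can be derived from tracked positions, here is a small NumPy sketch; the frame rate, pixel-to-distance scaling, and exact parameter definitions are assumptions and may differ from the paper's.

```python
import numpy as np

def trajectory_features(points, fps=30.0, px_to_cm=1.0):
    """Per-frame speed and turning angle from a 2-D trajectory.
    points: (N, 2) array of tracked centroid positions in pixels."""
    pts = np.asarray(points, dtype=float)
    disp = np.diff(pts, axis=0)                              # frame-to-frame displacement
    speed = np.linalg.norm(disp, axis=1) * fps * px_to_cm    # distance per second
    headings = np.arctan2(disp[:, 1], disp[:, 0])
    turn = np.diff(headings)
    turn = np.degrees((turn + np.pi) % (2 * np.pi) - np.pi)  # wrap to (-180, 180]
    return speed, turn
```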
(This article belongs to the Special Issue Application of Artificial Intelligence in Aquaculture)

16 pages, 2479 KiB  
Article
FBStrNet: Automatic Fetal Brain Structure Detection in Early Pregnancy Ultrasound Images
by Yirong Lin, Shunlan Liu, Zhonghua Liu, Yuling Fan, Peizhong Liu and Xu Guo
Sensors 2025, 25(16), 5034; https://doi.org/10.3390/s25165034 - 13 Aug 2025
Abstract
Ultrasound imaging is widely used in early pregnancy to screen for fetal brain anomalies. However, the accuracy of diagnosis can be influenced by various factors, including the sonographer’s experience and environmental conditions. To address these limitations, advanced methods are needed to enhance the efficiency and reliability of fetal anomaly screening. In this study, we propose a novel approach based on a Fetal Brain Structures Detection Network (FBStrNet) for identifying key anatomical structures in fetal brain ultrasound images. Specifically, FBStrNet builds on the YOLOv5 baseline model, incorporating a lightweight backbone to reduce model parameters, replacing the loss function, and utilizing a decoupled detection head to improve accuracy. Additionally, our method integrates prior clinical knowledge to minimize false detection rates. Experimental results demonstrate that FBStrNet outperforms state-of-the-art methods, achieving real-time detection of fetal brain anatomical structures with an inference time of just 11.5 ms. This capability enables sonographers to efficiently visualize critical anatomical features, thereby improving diagnostic precision and streamlining clinical workflows. Full article
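A decoupled detection head is a common design in which classification and box regression run through separate branches rather than one shared output convolution. The generic PyTorch sketch below illustrates the idea only; it is not FBStrNet's actual head.

```python
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Generic decoupled detection head: separate conv branches for class
    scores and box regression, each ending in its own 1x1 output conv."""
    def __init__(self, in_ch, num_classes, num_anchors=1):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
                nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU())
        self.cls_branch, self.reg_branch = branch(), branch()
        self.cls_out = nn.Conv2d(in_ch, num_anchors * num_classes, 1)
        self.box_out = nn.Conv2d(in_ch, num_anchors * 4, 1)

    def forward(self, x):
        return self.cls_out(self.cls_branch(x)), self.box_out(self.reg_branch(x))
```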
(This article belongs to the Special Issue Spectral Detection Technology, Sensors and Instruments, 2nd Edition)

25 pages, 54500 KiB  
Article
Parking Pattern Guided Vehicle and Aircraft Detection in Aligned SAR-EO Aerial View Images
by Zhe Geng, Shiyu Zhang, Yu Zhang, Chongqi Xu, Linyi Wu and Daiyin Zhu
Remote Sens. 2025, 17(16), 2808; https://doi.org/10.3390/rs17162808 - 13 Aug 2025
Abstract
Although SAR systems can provide high-resolution aerial view images all-day, all-weather, the aspect and pose-sensitivity of the SAR target signatures, which defies the Gestalt perceptual principles, sets a frustrating performance upper bound for SAR Automatic Target Recognition (ATR). Therefore, we propose a network to support context-guided ATR by using aligned Electro-Optical (EO)-SAR image pairs. To realize EO-SAR image scene grammar alignment, the stable context features highly correlated to the parking patterns of the vehicle and aircraft targets are extracted from the EO images as prior knowledge, which is used to assist SAR-ATR. The proposed network consists of a Scene Recognition Module (SRM) and an instance-level Cross-modality ATR Module (CATRM). The SRM is based on a novel light-condition-driven adaptive EO-SAR decision weighting scheme, and the Outlier Exposure (OE) approach is employed for SRM training to realize Out-of-Distribution (OOD) scene detection. Once the scene depicted in the cut of interest is identified with the SRM, the image cut is sent to the CATRM for ATR. Considering that the EO-SAR images acquired from diverse observation angles often feature unbalanced quality, a novel class-incremental learning method based on the Context-Guided Re-Identification (ReID)-based Key-view (CGRID-Key) exemplar selection strategy is devised so that the network is capable of continuous learning in the open-world deployment environment. Vehicle ATR experimental results based on the UNICORN dataset, which consists of 360-degree EO-SAR images of an army base, show that the CGRID-Key exemplar strategy offers a classification accuracy 29.3% higher than the baseline model for the incremental vehicle category, SUV. Moreover, aircraft ATR experimental results based on the aligned EO-SAR images collected over several representative airports and the Arizona aircraft boneyard show that the proposed network achieves an F1 score of 0.987, which is 9% higher than YOLOv8. Full article
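The light-condition-driven EO-SAR decision weighting is described only at a high level in this abstract. Purely as an illustration of the concept, a hypothetical convex combination of per-class scene probabilities might look like the sketch below; the weighting function, inputs, and names are assumptions, not the paper's scheme.

```python
import numpy as np

def fuse_scene_probs(eo_probs, sar_probs, light_level):
    """Hypothetical light-driven fusion: trust the EO branch more in good light
    and the SAR branch more in darkness. Inputs are per-class probabilities
    (e.g., softmax outputs); light_level is a scalar in [0, 1]."""
    w_eo = float(np.clip(light_level, 0.0, 1.0))
    return w_eo * np.asarray(eo_probs) + (1.0 - w_eo) * np.asarray(sar_probs)
```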
(This article belongs to the Special Issue Applications of SAR for Environment Observation Analysis)

21 pages, 2657 KiB  
Article
A Lightweight Multi-Stage Visual Detection Approach for Complex Traffic Scenes
by Xuanyi Zhao, Xiaohan Dou, Jihong Zheng and Gengpei Zhang
Sensors 2025, 25(16), 5014; https://doi.org/10.3390/s25165014 - 13 Aug 2025
Abstract
In complex traffic environments, image degradation due to adverse factors such as haze, low illumination, and occlusion significantly compromises the performance of object detection systems in recognizing vehicles and pedestrians. To address these challenges, this paper proposes a robust visual detection framework that integrates multi-stage image enhancement with a lightweight detection architecture. Specifically, an image preprocessing module incorporating ConvIR and CIDNet is designed to perform defogging and illumination enhancement, thereby substantially improving the perceptual quality of degraded inputs. Furthermore, a novel enhancement strategy based on the Horizontal/Vertical-Intensity color space is introduced to decouple brightness and chromaticity modeling, effectively enhancing structural details and visual consistency in low-light regions. In the detection phase, a lightweight state-space modeling network, Mamba-Driven Lightweight Detection Network with RT-DETR Decoding, is proposed for object detection in complex traffic scenes. This architecture integrates VSSBlock and XSSBlock modules to enhance detection performance, particularly for multi-scale and occluded targets. Additionally, a VisionClueMerge module is incorporated to strengthen the perception of edge structures by effectively fusing multi-scale spatial features. Experimental evaluations on traffic surveillance datasets demonstrate that the proposed method surpasses the mainstream YOLOv12s model in terms of mAP@50–90, achieving a performance gain of approximately 1.0 percentage point (from 0.759 to 0.769). While ensuring competitive detection accuracy, the model exhibits reduced parameter complexity and computational overhead, thereby demonstrating superior deployment adaptability and robustness. This framework offers a practical and effective solution for object detection in intelligent transportation systems operating under visually challenging conditions. Full article
(This article belongs to the Section Sensing and Imaging)

28 pages, 9582 KiB  
Article
End-to-End Model Enabled GPR Hyperbolic Keypoint Detection for Automatic Localization of Underground Targets
by Feifei Hou, Yu Zhang, Jian Dong and Jinglin Fan
Remote Sens. 2025, 17(16), 2791; https://doi.org/10.3390/rs17162791 - 12 Aug 2025
Abstract
Ground-Penetrating Radar (GPR) is a non-destructive detection technique widely employed for identifying underground targets. Despite its utility, conventional approaches suffer from limitations, including poor adaptability to multi-scale targets and suboptimal localization accuracy. To overcome these challenges, we propose a lightweight deep learning framework, the Dual Attentive YOLOv11 (You Only Look Once, version 11) Keypoint Detector (DAYKD), designed for robust underground target detection and precise localization. Building upon the YOLOv11 architecture, our method introduces two key innovations to enhance performance: (1) a dual-task learning framework that synergizes bounding box detection with keypoint regression to refine localization precision, and (2) a novel Convolution and Attention Fusion Module (CAFM) coupled with a Feature Refinement Network (FRFN) to enhance multi-scale feature representation. Extensive ablation studies demonstrate that DAYKD achieves a precision of 93.7% and an mAP50 of 94.7% in object detection tasks, surpassing the baseline model by about 13% in F1-score, a balanced metric that combines precision and recall to evaluate overall model performance, underscoring its superior performance. These findings confirm that DAYKD delivers exceptional recognition accuracy and robustness, offering a promising solution for high-precision underground target localization. Full article
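For reference, the F1-score cited above is simply the harmonic mean of precision and recall:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in [0, 1])."""
    return 2 * precision * recall / (precision + recall)

# e.g. with the reported precision of 0.937 and an illustrative recall of 0.90,
# f1_score(0.937, 0.90) ~= 0.918 (the 0.90 recall is an assumed example value)
```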
(This article belongs to the Special Issue Advanced Ground-Penetrating Radar (GPR) Technologies and Applications)

17 pages, 5705 KiB  
Article
Cherry Tomato Bunch and Picking Point Detection for Robotic Harvesting Using an RGB-D Sensor and a StarBL-YOLO Network
by Pengyu Li, Ming Wen, Zhi Zeng and Yibin Tian
Horticulturae 2025, 11(8), 949; https://doi.org/10.3390/horticulturae11080949 - 11 Aug 2025
Abstract
For fruit harvesting robots, rapid and accurate detection of fruits and picking points is one of the main challenges for their practical deployment. Several fruits typically grow in clusters or bunches, such as grapes, cherry tomatoes, and blueberries. For such clustered fruits, it is desired for them to be picked by bunches instead of individually. This study proposes utilizing a low-cost off-the-shelf RGB-D sensor mounted on the end effector and a lightweight improved YOLOv8-Pose neural network to detect cherry tomato bunches and picking points for robotic harvesting. The problem of occlusion and overlap is alleviated by merging RGB and depth images from the RGB-D sensor. To enhance detection robustness in complex backgrounds and reduce the complexity of the model, the Starblock module from StarNet and the coordinate attention mechanism are incorporated into the YOLOv8-Pose network, termed StarBL-YOLO, to improve the efficiency of feature extraction and reinforce spatial information. Additionally, we replaced the original OKS loss function with the L1 loss function for keypoint loss calculation, which improves the accuracy in picking points localization. The proposed method has been evaluated on a dataset with 843 cherry tomato RGB-D image pairs acquired by a harvesting robot at a commercial greenhouse farm. Experimental results demonstrate that the proposed StarBL-YOLO model achieves a 12% reduction in model parameters compared to the original YOLOv8-Pose while improving detection accuracy for cherry tomato bunches and picking points. Specifically, the model shows significant improvements across all metrics: for computational efficiency, model size (−11.60%) and GFLOPs (−7.23%); for pickable bunch detection, mAP50 (+4.4%) and mAP50-95 (+4.7%); for non-pickable bunch detection, mAP50 (+8.0%) and mAP50-95 (+6.2%); and for picking point detection, mAP50 (+4.3%), mAP50-95 (+4.6%), and RMSE (−23.98%). These results validate that StarBL-YOLO substantially enhances detection accuracy for cherry tomato bunches and picking points while improving computational efficiency, which is valuable for resource-constrained edge-computing deployment for harvesting robots. Full article
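The switch from an OKS-based keypoint loss to a plain L1 loss amounts to penalizing absolute coordinate error directly. A minimal PyTorch sketch follows; the visibility masking and reduction shown here are assumptions rather than the paper's exact implementation.

```python
import torch

def keypoint_l1_loss(pred, target, visible):
    """Plain L1 keypoint loss: mean absolute coordinate error over visible keypoints.
    pred, target: (N, K, 2) keypoint coordinates; visible: (N, K) boolean mask."""
    mask = visible.unsqueeze(-1).expand_as(pred).float()
    return ((pred - target).abs() * mask).sum() / mask.sum().clamp(min=1)
```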
(This article belongs to the Special Issue Advanced Automation for Tree Fruit Orchards and Vineyards)

20 pages, 16638 KiB  
Article
GIA-YOLO: A Target Detection Method for Nectarine Picking Robots in Facility Orchards
by Longlong Ren, Yuqiang Li, Yonghui Du, Ang Gao, Wei Ma, Yuepeng Song and Xingchang Han
Agronomy 2025, 15(8), 1934; https://doi.org/10.3390/agronomy15081934 - 11 Aug 2025
Abstract
The complex and variable environment of facility orchards poses significant challenges for intelligent robotic operations. To address issues such as nectarine fruit occlusion by branches and leaves, complex backgrounds, and the demand for high real-time detection performance, this study proposes a target detection model for nectarine fruit based on the YOLOv11 architecture: Ghost–iEMA–ADown You Only Look Once (GIA-YOLO). We introduce the GhostModule to reduce the model size and the floating-point operations, adopt the fusion attention mechanism iEMA to enhance the feature extraction capability, and further optimize the network structure through the ADown lightweight downsampling module. The test results show that GIA-YOLO achieves 93.9% precision, 88.9% recall, and 96.2% mAP, which are 2.2, 1.1, and 0.7 percentage points higher than YOLOv11, respectively; the model size is reduced to 5.0 MB and the floating-point operations to 5.2 G, reductions of 9.1% and 17.5%, respectively, compared with the original model. The model was deployed in the picking robot system and field tested in a nectarine facility orchard; the results show that GIA-YOLO maintains high detection precision and stability at different picking distances, with a comprehensive missed-detection rate of 6.65% and a false-detection rate of 8.7%, while supporting real-time detection at 41.6 FPS. These results provide an important reference for optimizing the design and application of nectarine detection models in facility agriculture environments. Full article
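The GhostModule referenced here follows the published GhostNet idea of generating part of the output channels with a cheap depthwise convolution. A minimal PyTorch sketch of that standard formulation is below; it is not necessarily the exact variant used in GIA-YOLO.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Standard Ghost module idea: a small primary convolution produces part of
    the output channels, and a cheap depthwise convolution generates the rest
    ('ghost' features), cutting parameters and FLOPs versus a full convolution.
    Assumes out_ch is divisible by ratio."""
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        primary_ch = out_ch // ratio
        cheap_ch = out_ch - primary_ch
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, 1, bias=False),
            nn.BatchNorm2d(primary_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, cheap_ch, dw_kernel, padding=dw_kernel // 2,
                      groups=primary_ch, bias=False),
            nn.BatchNorm2d(cheap_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```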
(This article belongs to the Section Precision and Digital Agriculture)

21 pages, 2559 KiB  
Article
A Shape-Aware Lightweight Framework for Real-Time Object Detection in Nuclear Medicine Imaging Equipment
by Weiping Jiang, Guozheng Xu and Aiguo Song
Appl. Sci. 2025, 15(16), 8839; https://doi.org/10.3390/app15168839 - 11 Aug 2025
Abstract
Manual calibration of nuclear medicine scanners currently relies on handling phantoms containing radioactive sources, exposing personnel to high radiation doses and elevating cancer risk. We designed an automated detection framework for robotic inspection on the YOLOv8n foundation. It pairs a lightweight backbone with a shape-aware geometric attention module and an anchor-free head. Facing a small training set, we produced extra images with a GAN and then fine-tuned a pretrained network on these augmented data. Evaluations on a custom dataset consisting of PET/CT gantry and table images showed that the SAM-YOLOv8n model achieved a precision of 93.6% and a recall of 92.8%. These results demonstrate fast, accurate, real-time detection, offering a safer and more efficient alternative to manual calibration of nuclear medicine equipment. Full article
(This article belongs to the Section Applied Physics General)

17 pages, 3002 KiB  
Article
Train-YOLO: An Efficient and Lightweight Network Model for Train Component Damage Detection
by Hanqing Zong, Ying Jiang and Xinghuai Huang
Sensors 2025, 25(16), 4953; https://doi.org/10.3390/s25164953 - 10 Aug 2025
Abstract
Currently, train component fault detection is predominantly carried out through manual inspection, a process that is inefficient, prone to high omission rates, and carries safety risks. This study proposes an innovative fault detection model for train components based on YOLOv8, aiming to overcome the inefficiencies and high omission rates associated with traditional manual methods. By optimizing the YOLOv8 network architecture and integrating the ADown module, C2f-Rep, and DHD, the model significantly improves computational efficiency and detection accuracy. Experimental results demonstrate that the optimized Train-YOLO model achieves a peak accuracy of 92.9% in train component fault detection. Additionally, it features a smaller model size and reduced computational demands, making it ideal for rapid on-site deployment. A comparison with other leading detection models further highlights the superiority of Train-YOLO in both accuracy and lightweight design. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)

21 pages, 9664 KiB  
Article
A Detection Approach for Wheat Spike Recognition and Counting Based on UAV Images and Improved Faster R-CNN
by Donglin Wang, Longfei Shi, Huiqing Yin, Yuhan Cheng, Shaobo Liu, Siyu Wu, Guangguang Yang, Qinge Dong, Jiankun Ge and Yanbin Li
Plants 2025, 14(16), 2475; https://doi.org/10.3390/plants14162475 - 9 Aug 2025
Abstract
This study presents an innovative unmanned aerial vehicle (UAV)-based intelligent detection method utilizing an improved Faster Region-based Convolutional Neural Network (Faster R-CNN) architecture to address the inefficiency and inaccuracy inherent in manual wheat spike counting. We systematically collected a high-resolution image dataset (2000 images, 4096 × 3072 pixels) covering key growth stages (heading, grain filling, and maturity) of winter wheat (Triticum aestivum L.) during 2022–2023 using a DJI M300 RTK equipped with multispectral sensors. The dataset encompasses diverse field scenarios under five fertilization treatments (organic-only, organic–inorganic 7:3 and 3:7 ratios, inorganic-only, and no fertilizer) and two irrigation regimes (full and deficit irrigation), ensuring representativeness and generalizability. For model development, we replaced conventional VGG16 with ResNet-50 as the backbone network, incorporating residual connections and channel attention mechanisms to achieve 92.1% mean average precision (mAP) while reducing parameters from 135 M to 77 M (43% decrease). The GFLOPs of the improved model are reduced from 1.9 to 1.7, a decrease of 10.53%, improving computational efficiency. Performance tests demonstrated a 15% reduction in missed detection rate compared to YOLOv8 in dense canopies, with spike count regression analysis yielding R² = 0.88 (p < 0.05) against manual measurements and yield prediction errors below 10% for optimal treatments. To validate robustness, we established a dedicated 500-image test set (25% of total data) spanning density gradients (30–80 spikes/m²) and varying illumination conditions, maintaining >85% accuracy even under cloudy weather. Furthermore, by integrating spike recognition with agronomic parameters (e.g., grain weight), we developed a comprehensive yield estimation model achieving 93.5% accuracy under optimal water–fertilizer management (70% ETc irrigation with 3:7 organic–inorganic ratio). This work systematically addresses key technical challenges in automated spike detection through standardized data acquisition, lightweight model design, and field validation, offering significant practical value for smart agriculture development. Full article
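The agreement between predicted and manual counts is reported as a coefficient of determination. For reference, R² can be computed as in this small NumPy sketch:

```python
import numpy as np

def r_squared(manual_counts, predicted_counts):
    """Coefficient of determination between manual and model spike counts."""
    y = np.asarray(manual_counts, dtype=float)
    y_hat = np.asarray(predicted_counts, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
    return 1.0 - ss_res / ss_tot
```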
(This article belongs to the Special Issue Plant Phenotyping and Machine Learning)
