Search Results (50)

Search Parameters:
Keywords = you only look once version 3 (YOLOv3)

26 pages, 1495 KB  
Article
FlashLightNet: An End-to-End Deep Learning Framework for Real-Time Detection and Classification of Static and Flashing Traffic Light States
by Laith Bani Khaled, Mahfuzur Rahman, Iffat Ara Ebu and John E. Ball
Sensors 2025, 25(20), 6423; https://doi.org/10.3390/s25206423 - 17 Oct 2025
Cited by 1 | Viewed by 1811
Abstract
Accurate traffic light detection and classification are fundamental for autonomous vehicle (AV) navigation and real-time traffic management in complex urban environments. Existing systems often fall short of reliably identifying and classifying traffic light states in real time, including their flashing modes. This study introduces FlashLightNet, a novel end-to-end deep learning framework that integrates the nano version of You Only Look Once version 10 (YOLOv10n) for traffic light detection, an 18-layer Residual Neural Network (ResNet-18) for feature extraction, and a Long Short-Term Memory (LSTM) network for temporal state classification. The proposed framework is designed to robustly detect and classify traffic light states, including conventional signals (red, green, and yellow) and flashing signals (flash red and flash yellow), under diverse and challenging conditions such as varying lighting, occlusions, and environmental noise. The framework was trained and evaluated on a comprehensive custom dataset of traffic light scenarios organized into temporal sequences to capture spatiotemporal dynamics. The dataset was prepared from videos of traffic lights at intersections in Starkville, Mississippi, and at Mississippi State University, covering the red, green, yellow, flash red, and flash yellow states. In addition, simulation-based video datasets with flashing periods of 2, 3, and 4 s at several intersections were created using RoadRunner, further enhancing the diversity and robustness of the dataset. The YOLOv10n model achieved a mean average precision (mAP) of 99.2% in traffic light detection, while the ResNet-18 and LSTM combination classified traffic light states (red, green, yellow, flash red, and flash yellow) with an F1-score of 96%. Full article
(This article belongs to the Special Issue Deep Learning Technology and Image Sensing: 2nd Edition)
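The abstract describes a three-stage pipeline: YOLOv10n proposes traffic light boxes, ResNet-18 embeds each cropped box, and an LSTM classifies the state from the sequence of embeddings. As a rough illustration of how the ResNet-18 + LSTM stage could be wired up (a minimal PyTorch sketch; the class name, hidden size, sequence length, and crop resolution are illustrative assumptions, not the paper's settings):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class TemporalLightClassifier(nn.Module):
    """Classify a sequence of traffic-light crops into five states
    (red, green, yellow, flash red, flash yellow)."""
    def __init__(self, num_states: int = 5, hidden: int = 128):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()              # keep the 512-d pooled features
        self.backbone = backbone
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_states)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, 3, H, W) crops produced by the detector
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1))   # (b*t, 512)
        out, _ = self.lstm(feats.view(b, t, -1))
        return self.head(out[:, -1])                 # logits from the last step

print(TemporalLightClassifier()(torch.randn(2, 16, 3, 64, 64)).shape)  # (2, 5)
```

Flashing states are only separable across time, which is why such a classifier consumes a clip rather than a single frame.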

29 pages, 6039 KB  
Article
Tree Species Detection and Enhancing Semantic Segmentation Using Machine Learning Models with Integrated Multispectral Channels from PlanetScope and Digital Aerial Photogrammetry in Young Boreal Forest
by Arun Gyawali, Mika Aalto and Tapio Ranta
Remote Sens. 2025, 17(11), 1811; https://doi.org/10.3390/rs17111811 - 22 May 2025
Cited by 3 | Viewed by 2820
Abstract
The precise identification and classification of tree species in young forests during their early development stages are vital for forest management and silvicultural efforts that support their growth and renewal. However, achieving accurate geolocation and species classification through field-based surveys is often a labor-intensive and complicated task. Remote sensing technologies combined with machine learning techniques present an encouraging solution, offering a more efficient alternative to conventional field-based methods. This study aimed to detect and classify young forest tree species using remote sensing imagery and machine learning techniques. The study pursued two objectives: first, tree species detection using the latest version of You Only Look Once (YOLOv12), and second, semantic segmentation (classification) using random forest, Categorical Boosting (CatBoost), and a Convolutional Neural Network (CNN). To the best of our knowledge, this is the first exploration of YOLOv12 for tree species identification, and the first study to integrate digital aerial photogrammetry with Planet imagery for semantic segmentation in young forests. The study used two remote sensing datasets: RGB imagery from unmanned aerial vehicle (UAV) ortho photography and RGB-NIR from PlanetScope. For YOLOv12-based tree species detection, only the RGB ortho photography was used, while semantic segmentation was performed with three sets of data: (1) Ortho RGB (3 bands), (2) Ortho RGB + canopy height model (CHM) + Planet RGB-NIR (8 bands), and (3) Ortho RGB + CHM + Planet RGB-NIR + 12 vegetation indices (20 bands). With three models applied to these datasets, nine model-dataset combinations were trained and tested using 57 images (1024 × 1024 pixels) and their corresponding mask tiles. The YOLOv12 model achieved 79% overall accuracy, with Scots pine performing best (precision: 97%, recall: 92%, mAP50: 97%, mAP75: 80%) and Norway spruce showing slightly lower accuracy (precision: 94%, recall: 82%, mAP50: 90%, mAP75: 71%). For semantic segmentation, the CatBoost model with 20 bands outperformed the other models, achieving 85% accuracy, 80% Kappa, and 81% MCC, with CHM, EVI, NIRPlanet, GreenPlanet, NDGI, GNDVI, and NDVI being the most influential variables. These results indicate that a simple boosting model like CatBoost can outperform more complex CNNs for semantic segmentation in young forests. Full article
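To make the segmentation stage concrete: treating each pixel's stacked band values as a feature row reduces semantic segmentation to tabular classification, which is how a boosting model such as CatBoost can compete with CNNs here. A minimal sketch under assumed shapes (the random raster and labels are placeholders, not the study's data):

```python
import numpy as np
from catboost import CatBoostClassifier

H, W, BANDS = 128, 128, 20   # illustrative tile; the study used 1024 x 1024 tiles, 20 bands
cube = np.random.rand(H, W, BANDS).astype(np.float32)  # stacked RGB + CHM + Planet + indices
mask = np.random.randint(0, 4, size=(H, W))            # placeholder species labels

X = cube.reshape(-1, BANDS)  # one feature row per pixel
y = mask.ravel()

model = CatBoostClassifier(iterations=100, depth=6, verbose=False)
model.fit(X, y)
pred = np.asarray(model.predict(X)).reshape(H, W)      # back to a label raster
print(pred.shape)
```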

18 pages, 4565 KB  
Article
Improved Lightweight YOLOv11 Algorithm for Real-Time Forest Fire Detection
by Ye Tao, Bangyu Li, Peiru Li, Jin Qian and Liang Qi
Electronics 2025, 14(8), 1508; https://doi.org/10.3390/electronics14081508 - 9 Apr 2025
Cited by 5 | Viewed by 2387
Abstract
Modern computer vision techniques for forest fire detection face a trade-off between computational efficiency and detection accuracy in complex forest environments. To address this, we propose a lightweight YOLOv11n-based framework optimized for edge deployment. The backbone network integrates a novel C3k2MBNV2 (Cross Stage Partial Bottleneck with 3 convolutions and kernel size 2 MobileNetV2) block to enable efficient fire feature extraction via a compact architecture. We further introduce the SCDown (Spatial-Channel Decoupled Downsampling) block in both the backbone and neck to preserve critical information during downsampling. The neck further incorporates the C3k2WTDC (Cross Stage Partial Bottleneck with 3 convolutions and kernel size 2, combined with Wavelet Transform Depthwise Convolution) block, enhancing contextual understanding with reduced computational overhead. Experiments on a forest fire dataset demonstrate that our model achieves a 53.2% reduction in parameters and 28.6% fewer FLOPs compared to YOLOv11n (You Only Look Once version eleven), along with a 3.3% improvement in mean average precision. These advancements establish an optimal balance between efficiency and accuracy, enabling the proposed framework to attain real-time detection capabilities on resource-constrained edge devices in forest environments. This work provides a practical solution for deploying reliable forest fire detection systems in scenarios demanding low latency and minimal computational resources. Full article
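For orientation, spatial-channel decoupled downsampling separates channel mixing (a pointwise 1 × 1 convolution) from spatial reduction (a stride-2 depthwise convolution), which is cheaper than a single strided dense convolution. A minimal PyTorch sketch of that idea, with illustrative channel counts and no claim to match the paper's exact block:

```python
import torch
import torch.nn as nn

class SCDown(nn.Module):
    """Pointwise conv sets the channel count; depthwise stride-2 conv halves H and W."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.pw = nn.Conv2d(c_in, c_out, 1, bias=False)            # channel mixing
        self.dw = nn.Conv2d(c_out, c_out, 3, stride=2, padding=1,
                            groups=c_out, bias=False)              # spatial halving
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, x):
        return self.bn(self.dw(self.pw(x)))

print(SCDown(64, 128)(torch.randn(1, 64, 80, 80)).shape)  # (1, 128, 40, 40)
```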

22 pages, 27512 KB  
Article
Predicting Dairy Calf Body Weight from Depth Images Using Deep Learning (YOLOv8) and Threshold Segmentation with Cross-Validation and Longitudinal Analysis
by Mingsi Liao, Gota Morota, Ye Bi and Rebecca R. Cockrum
Animals 2025, 15(6), 868; https://doi.org/10.3390/ani15060868 - 18 Mar 2025
Cited by 1 | Viewed by 2149
Abstract
Monitoring calf body weight (BW) before weaning is essential for assessing growth, feed efficiency, health, and weaning readiness. However, labor, time, and facility constraints limit BW collection. Additionally, Holstein calf coat patterns complicate image-based BW estimation, and few studies have explored non-contact measurements taken at early time points for predicting later BW. The objectives of this study were to (1) develop deep learning-based segmentation models for extracting calf body metrics, (2) compare deep learning segmentation with threshold-based methods, and (3) evaluate BW prediction using single-time-point cross-validation with linear regression (LR) and extreme gradient boosting (XGBoost) and multiple-time-point cross-validation with LR, XGBoost, and a linear mixed model (LMM). Depth images from Holstein (n = 63) and Jersey (n = 5) pre-weaning calves were collected, with 20 Holstein calves being weighed manually. Results showed that You Only Look Once version 8 (YOLOv8) deep learning segmentation (intersection over union = 0.98) outperformed threshold-based methods (0.89). In single-time-point cross-validation, XGBoost achieved the best BW prediction (R2 = 0.91, mean absolute percentage error (MAPE) = 4.37%), while LMM provided the most accurate longitudinal BW prediction (R2 = 0.99, MAPE = 2.39%). These findings highlight the potential of deep learning for automated BW prediction, enhancing farm management. Full article
(This article belongs to the Section Cattle)
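To make the prediction stage concrete: once segmentation yields per-calf body metrics, BW prediction reduces to tabular regression. A minimal XGBoost sketch on synthetic features (the feature set and data are hypothetical, not the study's):

```python
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
# Hypothetical mask-derived features: [body area, body length, mean depth, volume proxy]
X = rng.random((200, 4))
y = 40 + 60 * X[:, 3] + rng.normal(0, 2, 200)    # synthetic body weight in kg

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X, y)
print(model.predict(X[:3]))                       # predicted BW for three calves
```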

28 pages, 7478 KB  
Article
A Comparative Study of YOLO Series (v3–v10) with DeepSORT and StrongSORT: A Real-Time Tracking Performance Study
by Khadijah Alkandary, Ahmet Serhat Yildiz and Hongying Meng
Electronics 2025, 14(5), 876; https://doi.org/10.3390/electronics14050876 - 23 Feb 2025
Cited by 7 | Viewed by 5309
Abstract
Many previous studies have explored the integration of a specific You Only Look Once (YOLO) model with real-time trackers like Deep Simple Online and Realtime Tracker (DeepSORT) and Strong Simple Online and Realtime Tracker (StrongSORT). However, few have conducted a comprehensive and in-depth analysis of integrating the family of YOLO models with these real-time trackers to study the performance of the resulting pipeline and draw critical conclusions. This work aims to fill this gap, with the primary objective of investigating the effectiveness of integrating the YOLO series, in their lightweight versions, with the real-time DeepSORT and StrongSORT tracking algorithms for real-time object tracking in a computationally limited environment. This work systematically compares the lightweight YOLO versions, from YOLO version 3 (YOLOv3) to YOLO version 10 (YOLOv10), combined with both tracking algorithms, and evaluates their performance using detailed metrics across diverse and challenging real-world datasets: the Multiple Object Tracking 2017 (MOT17) and Multiple Object Tracking 2020 (MOT20) datasets. The goal is to assess the robustness and accuracy of these light models in complex real-world environments under limited computational resources. Our findings reveal that YOLO version 5 (YOLOv5), when combined with either tracker (DeepSORT or StrongSORT), offers not only a solid baseline in terms of model size (enabling real-time performance on edge devices) but also competitive overall performance in terms of Multiple Object Tracking Accuracy (MOTA) and Multiple Object Tracking Precision (MOTP). The results suggest a strong correlation between the choice of YOLO version and the overall tracking performance. Full article
(This article belongs to the Section Artificial Intelligence)
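The detect-then-track pipeline being benchmarked can be assembled from off-the-shelf parts. A minimal sketch assuming the `ultralytics` and `deep_sort_realtime` packages as stand-ins (the paper's exact implementation and weights may differ):

```python
import cv2
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort

model = YOLO("yolov8n.pt")        # any lightweight YOLO weights serve as the detector
tracker = DeepSort(max_age=30)    # swap in StrongSORT for the other variant

cap = cv2.VideoCapture("input.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = model(frame, verbose=False)[0]
    # DeepSORT expects ([left, top, width, height], confidence, class) tuples
    dets = [([x1, y1, x2 - x1, y2 - y1], conf, int(cls))
            for x1, y1, x2, y2, conf, cls in result.boxes.data.tolist()]
    for track in tracker.update_tracks(dets, frame=frame):
        if track.is_confirmed():
            print(track.track_id, track.to_ltrb())
```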

28 pages, 13762 KB  
Article
Elderly Fall Detection in Complex Environment Based on Improved YOLOv5s and LSTM
by Thioanh Bui, Juncheng Liu, Jingyu Cao, Geng Wei and Qian Zeng
Appl. Sci. 2024, 14(19), 9028; https://doi.org/10.3390/app14199028 - 6 Oct 2024
Cited by 6 | Viewed by 4546
Abstract
This work was conducted mainly to provide a health and safety monitoring system for the elderly living in the home environment. In this paper, two different target fall detection schemes are proposed based on whether the target is visible or not. When the target is visible, a vision-based fall detection algorithm is proposed, where an image of the target captured by a camera is transmitted to the improved You Only Look Once version 5s (YOLOv5s) model for posture detection. In contrast, when the target is invisible, a WiFi-based fall detection algorithm is proposed, where channel state information (CSI) signals are used to estimate the target’s posture with an improved long short-term memory (LSTM) model. In the improved YOLOv5s model, an adaptive image-scaling technique named Letterbox is used to maintain consistency in the aspect ratio of images in the dataset, and the weighted bidirectional feature pyramid network (BiFPN) and the attention mechanisms of the squeeze-and-excitation (SE) and coordinate attention (CA) modules are added to the Backbone and Neck networks, respectively. In the improved LSTM model, the Hampel filter is used to eliminate noise from the CSI signals, and a convolutional neural network (CNN) is combined with the LSTM to process images formed from the CSI signals, so that at each time step the improved LSTM model analyzes the amplitudes of 90 CSI signals. The final monitoring result for the health status of the target combines the fall detection of the improved YOLOv5s and LSTM models with the target’s physiological information. Experimental results show the following: (1) the detection precision, recall rate, and average precision of the improved YOLOv5s model are increased by 7.2%, 9%, and 7.6%, respectively, compared with the original model, and there is almost no missed detection of the target; (2) the detection accuracy of the improved LSTM model is improved by 15.61%, 29.36%, and 52.39% compared with the original LSTM, CNN, and neural network (NN) models, respectively, while the convergence speed is improved by 90% compared with the original LSTM model; and (3) the proposed algorithm meets the requirements of accurate, real-time, and stable health monitoring applications. Full article
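The Hampel filter used for CSI de-noising replaces any sample that deviates from its local median by more than a few scaled median absolute deviations. A minimal sketch with the usual default window and threshold (assumed here, not taken from the paper):

```python
import numpy as np

def hampel(x: np.ndarray, window: int = 7, n_sigmas: float = 3.0) -> np.ndarray:
    """Replace local outliers with the local median."""
    k = 1.4826                 # scales MAD to sigma for Gaussian data
    y, half = x.copy(), window // 2
    for i in range(half, len(x) - half):
        seg = x[i - half:i + half + 1]
        med = np.median(seg)
        mad = k * np.median(np.abs(seg - med))
        if mad > 0 and abs(x[i] - med) > n_sigmas * mad:
            y[i] = med
    return y

rng = np.random.default_rng(0)
sig = rng.normal(0, 0.1, 50)
sig[25] = 5.0                    # inject a spike like a corrupted CSI sample
print(sig[25], hampel(sig)[25])  # spike replaced by the local median
```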

16 pages, 6488 KB  
Article
Magnetic-Controlled Microrobot: Real-Time Detection and Tracking through Deep Learning Approaches
by Hao Li, Xin Yi, Zhaopeng Zhang and Yuan Chen
Micromachines 2024, 15(6), 756; https://doi.org/10.3390/mi15060756 - 5 Jun 2024
Cited by 11 | Viewed by 4779
Abstract
As one of the most significant research topics in robotics, microrobots hold great promise in biomedicine for applications such as targeted diagnosis, targeted drug delivery, and minimally invasive treatment. This paper proposes an enhanced YOLOv5 (You Only Look Once version 5) microrobot detection and tracking system (MDTS), incorporating a visual tracking algorithm to elevate the precision of small-target detection and tracking. The improved YOLOv5 network is pretrained on magnetic bodies 3 mm and 1 mm in size and on a magnetic microrobot 2 mm in length, and the resulting weight model is used to obtain the position and motion information of the microrobot in real time. The experimental results show that the accuracy of the improved network model for magnetic bodies with a size of 3 mm is 95.81%, representing an increase of 2.1%; for magnetic bodies with a size of 1 mm, the accuracy is 91.03%, representing an increase of 1.33%; and for microrobots with a length of 2 mm, the accuracy is 91.7%, representing an increase of 1.5%. The combination of the improved YOLOv5 network model and the vision algorithm effectively realizes real-time detection and tracking of magnetically controlled microrobots. Finally, 2D and 3D detection and tracking experiments on the microrobots were designed to verify the robustness and effectiveness of the system, which provides strong support for the operation and control of microrobots in an in vivo environment. Full article
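As a small illustration of how per-frame detections become motion information, the sketch below differentiates detected box centers over time; the frame rate and coordinates are hypothetical:

```python
import numpy as np

FPS = 30.0                                    # assumed camera frame rate

def velocity_from_centers(centers) -> np.ndarray:
    """Per-frame velocity (px/s) from a sequence of detected box centers."""
    c = np.asarray(centers, dtype=float)      # shape (frames, 2)
    return np.diff(c, axis=0) * FPS

print(velocity_from_centers([(10.0, 10.0), (12.0, 11.0), (15.0, 13.0)]))
```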

18 pages, 6985 KB  
Article
Enhancing Livestock Detection: An Efficient Model Based on YOLOv8
by Chengwu Fang, Chunmei Li, Peng Yang, Shasha Kong, Yaosheng Han, Xiangjie Huang and Jiajun Niu
Appl. Sci. 2024, 14(11), 4809; https://doi.org/10.3390/app14114809 - 2 Jun 2024
Cited by 10 | Viewed by 3606
Abstract
Maintaining a harmonious balance between grassland ecology and local economic development necessitates effective management of livestock resources. Traditional approaches have proven inefficient, highlighting an urgent need for intelligent solutions. Accurate identification of livestock targets is pivotal for precise livestock farming management. However, the You Only Look Once version 8 (YOLOv8) model exhibits limitations in accuracy when confronted with complex backgrounds and densely clustered targets. To address these challenges, this study proposes an optimized CCS-YOLOv8 (Comprehensive Contextual Sensing YOLOv8) model. First, we curated a comprehensive livestock detection dataset encompassing the Qinghai region. Second, the YOLOv8n model underwent three key enhancements: (1) incorporating a Convolutional Block Attention Module (CBAM) to accentuate salient image information, thereby boosting feature representational power; (2) integrating a Content-Aware ReAssembly of FEatures (CARAFE) operator to mitigate irrelevant interference, improving the integrity and accuracy of feature extraction; and (3) introducing a dedicated small object detection layer to capture finer livestock details, enhancing the recognition of smaller targets. Experimental results on our dataset demonstrate the CCS-YOLOv8 model’s superior performance, achieving 84.1% precision, 82.2% recall, 84.4% mAP@0.5, 60.3% mAP@0.75, 53.6% mAP@0.5:0.95, and 83.1% F1-score. These metrics reflect substantial improvements of 1.1%, 7.9%, 5.8%, 6.6%, 4.8%, and 4.7%, respectively, over the baseline model. Compared to mainstream object detection models, CCS-YOLOv8 strikes an optimal balance between accuracy and real-time processing capability. Its robustness is further validated on the VisDrone2019 dataset. The CCS-YOLOv8 model enables rapid and accurate identification of livestock age groups and species, effectively overcoming the challenges posed by complex grassland backgrounds and densely clustered targets. It offers a novel strategy for precise livestock population management and overgrazing prevention, aligning seamlessly with the demands of modern precision livestock farming. Moreover, it promotes local environmental conservation and fosters sustainable development within the livestock industry. Full article
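The CBAM block added here is a standard module (Woo et al., 2018): channel attention from pooled descriptors, then spatial attention from channel-pooled maps. A minimal PyTorch sketch with illustrative sizes, not necessarily the paper's exact configuration:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, c: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(c, c // reduction), nn.ReLU(),
                                 nn.Linear(c // reduction, c))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))                 # avg-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))                  # max-pooled descriptor
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)   # channel attention
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))          # spatial attention

print(CBAM(64)(torch.randn(1, 64, 32, 32)).shape)  # (1, 64, 32, 32)
```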

17 pages, 2228 KB  
Article
Applying Machine Learning to Construct a Printed Circuit Board Gold Finger Defect Detection System
by Chien-Yi Huang and Pei-Xuan Tsai
Electronics 2024, 13(6), 1090; https://doi.org/10.3390/electronics13061090 - 15 Mar 2024
Cited by 4 | Viewed by 3018
Abstract
Machine vision systems use industrial cameras’ digital sensors to collect images and use computers for image pre-processing, analysis, and the measurement of various features to make decisions. With increasing capacity and quality demands in the electronics industry, incoming quality control (IQC) standards are becoming increasingly stringent. The industry’s incoming quality control is mainly based on manual sampling; although this saves time and costs, the miss rate is still high. This study aimed to establish an automatic defect detection system that could quickly identify defects in the gold fingers on printed circuit boards (PCBs) according to the manufacturer’s standard. During the iterative training process of deep learning, the parameters required for image processing and inference are updated automatically. In this study, we discussed and compared the object detection networks of the YOLOv3 (You Only Look Once, Version 3) and Faster Region-Based Convolutional Neural Network (Faster R-CNN) algorithms. The results showed that the defect detection and classification model established on the YOLOv3 network architecture could identify defects with an accuracy of 95%. Therefore, the IQC sampling inspection was changed to a full inspection, and the surface mount technology (SMT) full inspection station was canceled to reduce the need for inspection personnel. Full article

22 pages, 7098 KB  
Article
Detection of Small Lesions on Grape Leaves Based on Improved YOLOv7
by Mingji Yang, Xinbo Tong and Haisong Chen
Electronics 2024, 13(2), 464; https://doi.org/10.3390/electronics13020464 - 22 Jan 2024
Cited by 10 | Viewed by 4120
Abstract
The precise detection of small lesions on grape leaves is beneficial for the early detection of diseases. In response to the high missed-detection rate for small target diseases on grape leaves, this paper adds a new prediction branch and combines an improved channel attention mechanism with an improved E-ELAN (Extended Efficient Layer Aggregation Network) to propose an improved algorithm for the YOLOv7 (You Only Look Once version 7) model. Firstly, to address the issue of low resolution for small targets, a new detection head is added to detect smaller targets. Secondly, to increase the feature extraction ability of the E-ELAN components in YOLOv7 for small targets, asymmetric convolution is introduced into E-ELAN, replacing its original 3 × 3 convolutions to achieve multi-scale feature extraction. Then, to address the insufficient extraction of information from small targets in YOLOv7, a channel attention mechanism was introduced and improved to enhance the network’s sensitivity to small-scale targets. Finally, the CIoU (Complete Intersection over Union) loss in the original YOLOv7 network was replaced with SIoU (Structured Intersection over Union) to optimize the loss function and enhance the network’s localization ability. To verify the effectiveness of the improved YOLOv7 algorithm, three common grape leaf diseases were selected as detection objects to create a dataset for experiments. The results show that the average accuracy of the proposed algorithm is 2.7% higher than that of the original YOLOv7 algorithm, reaching 93.5%. Full article
(This article belongs to the Special Issue Deep Learning in Image Processing and Pattern Recognition)
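One common way to realize the asymmetric convolution described (as in ACNet-style blocks) is to sum parallel 3 × 3, 1 × 3, and 3 × 1 branches, which widens the receptive pattern without changing the output size. A minimal sketch, not necessarily the paper's exact formulation:

```python
import torch
import torch.nn as nn

class AsymConv(nn.Module):
    """3x3, 1x3, and 3x1 branches summed; output size matches a plain 3x3 conv."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.k33 = nn.Conv2d(c_in, c_out, (3, 3), padding=(1, 1))
        self.k13 = nn.Conv2d(c_in, c_out, (1, 3), padding=(0, 1))
        self.k31 = nn.Conv2d(c_in, c_out, (3, 1), padding=(1, 0))

    def forward(self, x):
        return self.k33(x) + self.k13(x) + self.k31(x)

print(AsymConv(32, 64)(torch.randn(1, 32, 40, 40)).shape)  # (1, 64, 40, 40)
```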

15 pages, 3710 KB  
Article
Improved YOLOv3 Integrating SENet and Optimized GIoU Loss for Occluded Pedestrian Detection
by Qiangbo Zhang, Yunxiang Liu, Yu Zhang, Ming Zong and Jianlin Zhu
Sensors 2023, 23(22), 9089; https://doi.org/10.3390/s23229089 - 10 Nov 2023
Cited by 12 | Viewed by 1990
Abstract
Occluded pedestrian detection faces huge challenges. False positives and false negatives in crowd occlusion scenes reduce the accuracy of occluded pedestrian detection. To overcome this problem, we proposed an improved you-only-look-once version 3 (YOLOv3) based on squeeze-and-excitation networks (SENet) and an optimized generalized intersection over union (GIoU) loss for occluded pedestrian detection, namely YOLOv3-Occlusion (YOLOv3-Occ). The proposed network model incorporates squeeze-and-excitation networks (SENet) into YOLOv3, assigning greater weights to the features of the unobstructed parts of pedestrians to strengthen feature extraction from those parts. For the loss function, a new generalized intersection over union-intersection over ground truth (GIoU-IoG) loss was developed, based on the GIoU loss, to keep the areas of the predicted pedestrian boxes invariant, which tackles the problem of inaccurate pedestrian localization. The proposed method, YOLOv3-Occ, was validated on the CityPersons and COCO2014 datasets. Experimental results show the proposed method could obtain 1.2% MR−2 gains on the CityPersons dataset and 0.7% mAP@50 improvements on the COCO2014 dataset. Full article
(This article belongs to the Section Vehicular Sensing)
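For reference, the GIoU that the proposed GIoU-IoG loss builds on penalizes plain IoU by the empty fraction of the smallest enclosing box. A minimal sketch for corner-format boxes (the IoG extension itself is not reproduced here):

```python
import torch

def giou(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Generalized IoU for boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = torch.max(a[..., 0], b[..., 0]), torch.max(a[..., 1], b[..., 1])
    ix2, iy2 = torch.min(a[..., 2], b[..., 2]), torch.min(a[..., 3], b[..., 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    union = area_a + area_b - inter
    # smallest box enclosing both
    cx1, cy1 = torch.min(a[..., 0], b[..., 0]), torch.min(a[..., 1], b[..., 1])
    cx2, cy2 = torch.max(a[..., 2], b[..., 2]), torch.max(a[..., 3], b[..., 3])
    c = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (c - union) / c

print(giou(torch.tensor([0., 0., 2., 2.]), torch.tensor([1., 1., 3., 3.])))
```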

17 pages, 1959 KB  
Article
Comparative Analysis of Machine Learning Models for Image Detection of Colonic Polyps vs. Resected Polyps
by Adriel Abraham, Rejath Jose, Jawad Ahmad, Jai Joshi, Thomas Jacob, Aziz-ur-rahman Khalid, Hassam Ali, Pratik Patel, Jaspreet Singh and Milan Toma
J. Imaging 2023, 9(10), 215; https://doi.org/10.3390/jimaging9100215 - 9 Oct 2023
Cited by 8 | Viewed by 3931
Abstract
(1) Background: Colon polyps are common protrusions in the colon’s lumen, with potential risks of developing colorectal cancer. Early detection and intervention of these polyps are vital for reducing colorectal cancer incidence and mortality rates. This research aims to evaluate and compare the performance of three machine learning image classification models in detecting and classifying colon polyps. (2) Methods: The performance of three machine learning image classification models, Google Teachable Machine (GTM), Roboflow3 (RF3), and You Only Look Once version 8 (YOLOv8n), in the detection and classification of colon polyps was evaluated using the testing split for each model. The external validity of the test was analyzed using 90 images that were not used to test, train, or validate the models. The study used a dataset of colonoscopy images of normal colon, polyps, and resected polyps. The study assessed the models’ ability to correctly classify the images into their respective classes using precision, recall, and F1 score generated from confusion matrix analysis and performance graphs. (3) Results: All three models successfully distinguished between normal colon, polyps, and resected polyps in colonoscopy images. GTM achieved the highest accuracy, 0.99, with consistent precision, recall, and F1 scores of 1.00 for the ‘normal’ class, 0.97–1.00 for ‘polyps’, and 0.97–1.00 for ‘resected polyps’. While GTM exclusively classified images into these three categories, both YOLOv8n and RF3 were able to detect and specify the location of normal colonic tissue, polyps, and resected polyps, with YOLOv8n and RF3 achieving overall accuracies of 0.84 and 0.87, respectively. (4) Conclusions: Machine learning, particularly models like GTM, shows promising results in ensuring comprehensive detection of polyps during colonoscopies. Full article
(This article belongs to the Section Medical Imaging)
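The reported precision, recall, and F1 scores follow directly from a confusion matrix. A minimal sketch with an illustrative (not the study's) matrix:

```python
import numpy as np

labels = ["normal", "polyp", "resected"]
cm = np.array([[30, 0, 0],       # rows: true class, cols: predicted class
               [1, 27, 2],
               [0, 1, 29]])

for i, name in enumerate(labels):
    tp = cm[i, i]
    precision = tp / cm[:, i].sum()   # of everything predicted as this class
    recall = tp / cm[i, :].sum()      # of everything truly this class
    f1 = 2 * precision * recall / (precision + recall)
    print(f"{name}: P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```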

23 pages, 4562 KB  
Article
Improved YOLOv5-Based Real-Time Road Pavement Damage Detection in Road Infrastructure Management
by Abdullah As Sami, Saadman Sakib, Kaushik Deb and Iqbal H. Sarker
Algorithms 2023, 16(9), 452; https://doi.org/10.3390/a16090452 - 21 Sep 2023
Cited by 39 | Viewed by 7171
Abstract
Deep learning has enabled a straightforward, convenient method of road pavement infrastructure management that facilitates a secure, cost-effective, and efficient transportation network. Manual road pavement inspection is time-consuming and dangerous, making timely road repair difficult. This research showcases You Only Look Once version 5 (YOLOv5), the most commonly employed object detection model, trained on the latest benchmark road damage dataset, Road Damage Detection 2022 (RDD 2022). The RDD 2022 dataset includes four common types of road pavement damage, namely vertical cracks, horizontal cracks, alligator cracks, and potholes. This paper presents an improved deep neural network model based on YOLOv5 for real-time road pavement damage detection in photographic representations of outdoor road surfaces, making it an indispensable tool for efficient, real-time, and cost-effective road infrastructure management. The YOLOv5 model has been modified to incorporate several techniques that improve its accuracy and generalization performance: the Efficient Channel Attention module (ECA-Net), label smoothing, the K-means++ algorithm, Focal Loss, and an additional prediction layer. The model attained a 1.9% improvement in mean average precision (mAP) and a 1.29% increase in F1-score compared to YOLOv5s, with an increment of 1.1 million parameters. Moreover, the proposed model achieved a 0.11% improvement in mAP and a 0.05% improvement in F1-score compared to YOLOv8s while having 3 million fewer parameters and requiring 12 fewer giga floating-point operations per second (GFLOPs). Full article
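Of the listed modifications, the K-means++ step is the most self-contained: it clusters ground-truth box sizes to propose anchor boxes. A minimal scikit-learn sketch on synthetic box dimensions (nine clusters is the YOLO convention, assumed here):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
wh = rng.uniform(8, 256, size=(500, 2))   # placeholder (width, height) pairs in pixels

km = KMeans(n_clusters=9, init="k-means++", n_init=10).fit(wh)
anchors = km.cluster_centers_[np.argsort(km.cluster_centers_.prod(axis=1))]
print(np.round(anchors))                  # nine anchors, ordered small to large
```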

4 pages, 739 KB  
Proceeding Paper
Optimizing Pothole Detection in Pavements: A Comparative Analysis of Deep Learning Models
by Tiago Tamagusko and Adelino Ferreira
Eng. Proc. 2023, 36(1), 11; https://doi.org/10.3390/engproc2023036011 - 30 Jun 2023
Cited by 12 | Viewed by 2380
Abstract
Advancements in computer vision applications have led to improved object detection (OD) in terms of accuracy and processing time, enabling real-time solutions across various fields. In pavement engineering, detecting visual defects such as potholes, cracking, and rutting is of particular interest. This study aims to evaluate YOLO models on a dataset of 665 road pavement images labeled with potholes for OD. Pre-trained deep learning models were customized for pothole detection using transfer learning techniques. The assessed models include You Only Look Once (YOLO) versions 3, 4, and 5. It was found that YOLOv4 achieves the highest mean average precision (mAP), while its compact variant, YOLOv4-tiny, offers the shortest inference time, making it ideal for mobile applications. Furthermore, the YOLOv5s model demonstrates potential, attaining good results and standing out for its ease of implementation and scalability. Full article
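Customizing a pre-trained detector via transfer learning is now a few lines with modern tooling. A minimal sketch assuming the `ultralytics` package and a hypothetical `potholes.yaml` dataset file, as a stand-in for the study's own training setup:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                               # pre-trained weights as the starting point
model.train(data="potholes.yaml", epochs=50, imgsz=640)  # fine-tune on the pothole dataset
metrics = model.val()
print(metrics.box.map50)                                 # mAP@0.5 on the validation split
```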

18 pages, 5815 KB  
Article
Damage Detection and Localization of Bridge Deck Pavement Based on Deep Learning
by Youhao Ni, Jianxiao Mao, Yuguang Fu, Hao Wang, Hai Zong and Kun Luo
Sensors 2023, 23(11), 5138; https://doi.org/10.3390/s23115138 - 28 May 2023
Cited by 17 | Viewed by 3914
Abstract
Bridge deck pavement damage has a significant effect on the driving safety and long-term durability of bridges. To achieve damage detection and localization for bridge deck pavement, a three-stage detection method based on the you-only-look-once version 7 (YOLOv7) network and a revised LaneNet was proposed in this study. In stage 1, the Road Damage Dataset 2022 (RDD2022) was preprocessed and adopted to train the YOLOv7 model, yielding five classes of damage. In stage 2, the LaneNet network was pruned to retain the semantic segmentation part, with the VGG16 network as an encoder to generate lane line binary images. In stage 3, the lane line binary images were post-processed by a proposed image processing algorithm to obtain the lane area. Based on the damage coordinates from stage 1, the final pavement damage classes and lane localization were obtained. The proposed method was compared and analyzed on the RDD2022 dataset and applied on the Fourth Nanjing Yangtze River Bridge in China. The results show that the mean average precision (mAP) of YOLOv7 on the preprocessed RDD2022 dataset reaches 0.663, higher than that of other models in the YOLO series. The accuracy of the lane localization of the revised LaneNet is 0.933, higher than that of instance segmentation (0.856). Meanwhile, the inference speed of the revised LaneNet is 12.3 frames per second (FPS) on an NVIDIA GeForce RTX 3090, higher than that of instance segmentation (6.53 FPS). The proposed method can provide a reference for the maintenance of bridge deck pavement. Full article
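One plausible reading of stage 3 is that the lane area becomes a binary mask and each damage box is assigned by testing its center against that mask. A minimal OpenCV sketch (the polygon, image size, and box are illustrative):

```python
import numpy as np
import cv2

lane = np.array([[100, 700], [500, 100], [600, 100], [900, 700]], np.int32)
mask = np.zeros((720, 1280), np.uint8)
cv2.fillPoly(mask, [lane], 255)              # lane area filled from the lane-line polygon

def box_in_lane(box) -> bool:
    """Assign a damage box to the lane if its center falls inside the lane mask."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    return bool(mask[cy, cx])

print(box_in_lane((450, 400, 520, 460)))     # True: box center lies in the lane area
```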
