Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8

Kandavel, Murugeswari; Sureshkumar, Aanjankumar; Dhanaraj, Rajesh Kumar

doi:10.3390/ECSA-12-26515

Open AccessProceeding Paper

Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8^†

by

Murugeswari Kandavel

^1,*,

Aanjankumar Sureshkumar

¹ and

Rajesh Kumar Dhanaraj

²

¹

School of Computing Science and Engineering, VIT Bhopal University, Bhopal-Indore Highway, Kothrikalan, Sehore 466114, India

²

Symbiosis Institute of Computer Studies and Research (SICSR), Symbiosis International (Deemed University), Pune 411016, India

^*

Author to whom correspondence should be addressed.

^†

Presented at the 12th International Electronic Conference on Sensors and Applications, 12–14 November 2025; Available online: https://sciforum.net/event/ECSA-12.

Eng. Proc. 2025, 118(1), 92; https://doi.org/10.3390/ECSA-12-26515 (registering DOI)

Published: 7 November 2025

(This article belongs to the Proceedings of The 12th International Electronic Conference on Sensors and Applications)

Download

Browse Figures

Versions Notes

Abstract

The rapid expansion of urban infrastructure and the complexity of analysing sudden vehicle movements in traffic systems necessitate autonomous traffic monitoring solutions for intelligent, autonomous traffic management. However, traditional methods like manual monitoring and static rule-based detection often fail to meet the real-time requirements of a modern city, resulting in inefficient congestion management, pedestrian safety issues, and inadequate road maintenance. Conventional approaches are highly dependent on human intervention and predefined algorithms, and consequently cannot adapt to dynamic traffic and the unpredictable movement of pedestrians. With urban populations on the rise, there is an urgent need for artificial intelligence-driven solutions that can effectively process large volumes of real-time data to ease traffic management and decision-making. This study presents an AI-based traffic monitoring framework that integrates deep learning and natural language processing (NLP) models for improved traffic safety, anomaly detection, and infrastructure optimisation. The system comprises high-accuracy object detection with YOLOv8, adaptive pedestrian crossing recognition with Few-Shot Learning (FSL), and contextual analysis and real-time decision-making with LLaMA 3.2B. Through the use of these technologies, along with a BDD100K available dataset, the system achieves a high detection accuracy of 96%, a low inference time of 75.1 ms, and an improved adaptability compared to SOTA of 89%. The results indicate the suitability of AI-driven methods for thoughtful city planning and autonomous mobility, with the potential for AI-driven frameworks to improve urban traffic management by increasing its efficiency and safety.

Keywords:

intelligent transportation system; YOLO; traffic monitoring; Few-Shot Learning; pedestrian crossing

1. Introduction

In the evolution of the modern world, with the rapid expansion of urban infrastructure and the increasing complexity of traffic systems, autonomous traffic monitoring solutions are needed for intelligent, autonomous traffic management. However, traditional methods like manual monitoring and static rule-based detection often fail to meet the real-time requirements of a smart city, resulting in inefficient congestion management, and road maintenance. Solutions that are considered to be intelligent and bring autonomy to traffic systems have been developing as urban traffic systems become more complicated and as the new generation requires real-time monitoring and management signs. Currently, most traditional traffic monitoring methods do not respond adaptively to dynamic changes in traffic and the location of moving pedestrians, the sudden presence of heavy vehicles, and changes in road infrastructure. Based on these approaches, conventional methods heavily depend on manual supervision or rule-based automation, which are unable to perform well in complying with real-time traffic dynamics. Additionally, existing AI-based traffic monitoring models require large-scale labelled datasets for training, which makes them numerically expensive and difficult to deploy in new environments without significant retraining [1]. Recently, advancements in deep learning and large language models (LLMs) have significantly facilitated autonomous decision-making in traffic management. In this research, an AI-based system is developed, utilising YOLOv8 for real-time object detection, Few-Shot Learning (FSL) for pedestrian crossing adaptation, and LLaMA 3.2B for intelligent decision-making in traffic maintenance. It is optimised to detect traffic objects, including vehicles, pedestrians, potholes, and road signs, ensuring fast and accurate real-time detection. One of the greatest shortcomings of traditional machine learning models is that they require large amounts of labelled data to adapt to new pedestrian behaviours. However, the Few-Shot Learning module enables the system to learn quickly with very little labelled data, allowing for it to adapt rapidly to new pedestrian behaviours and crossing patterns. It also utilises LLaMA 3.2B, a powerful transformer-based model, to analyse objects detected, interpret traffic patterns, and provide maintenance recommendations [2].

This study integrates the following:

YOLOv8 for real-time traffic object detection.
Few-Shot Learning for pedestrian crossing adaptation.
LLaMA 3.2B for intelligent traffic maintenance decisions.

Recently, deep learning and large language models (LLMs) have enabled the development of autonomous systems that operate in real-time, with the capability of making decisions in real-time.

2. Related Works

A summary of the various existing models related to intelligent transportation traffic condition analyses using Few-Shot Learning is provided here.

Conventional traditional traffic monitoring system approaches include CCTV-based monitoring (prone to blind spots and manual effort). LiDAR and SOTA-based systems are expensive and limited adaptability. Static rule-based AI models are unable to generalise to new conditions.

Deep Learning for Traffic Object Detection: YOLO models are fast and accurate, but require large datasets. CNN-based approaches have high accuracy, but are computationally expensive. Transformer-based models are powerful, but require extensive fine-tuning.

Few-Shot Learning for Pedestrian Detection: FSL enables models to learn new pedestrian behaviours from a small dataset. Prototypical networks and meta-learning are key FSL techniques that improve detection in new environments with minimal labelled data.

We trained the YOLO model in a manner that could generalise well from a few training examples. Rather than learning explicit pedestrian behaviours from a large dataset, we designed the model to identify general patterns and commonalities of pedestrian behaviours across different environments, allowing for it to be almost always correct with a small number of training samples. This is particularly beneficial for detecting unusual or unseen pedestrian behaviours by only needing to retrain on a small amount. The model observes a new pedestrian crossing instance, compares its feature embedding with the stored prototypes using Euclidean distance, and assigns that pedestrian to the nearest class when a new pedestrian instance is observed. It significantly improves our capability of detecting new crossing patterns that have unseen examples. Unlike static conventional models, the YOLO module incorporates adaptive learning mechanisms, that is, the ability of the system to be updated in real-time as new pedestrian behaviours arise. Each new data point from every diverse urban environment (congested intersections, school zones, and high-traffic areas) feeds into the model, and it continuously refines its understanding towards pedestrian movement. This makes sure that the system is protected and extensible to the changing conditions of traffic and pedestrian behaviour. Few-Shot Learning with LlaMa has several major advantages: it can learn from small datasets, and therefore it does not require the use of very massive manual annotation. Few-shot task sampling is used to achieve this, where the model is trained on several small datasets containing only a few labelled pedestrian crossing data points per dataset. The model exploits meta learning optimisation and can rapidly adapt to new environments, reducing costly and labour-intensive data collection efforts [3].

3. Materials and Methods

3.1. Materials

Thus, to make the proposed system robust and accurate, the dataset for training the system is collected from various sources, including self-recorded traffic videos and publicly available datasets such as BDD100K traffic surveillance data. It includes diverse road scenes or test cases, such as different times of the day and high traffic. The dataset consists of annotated images of pedestrians in various poses at crosswalks and with different crowd densities for pedestrian detection purposes. Because in Few-Shot Learning (FSL) it is crucial to utilise a few training samples, the model is fine-tuned using manually labelled pedestrian crossing scenarios. To achieve better detection accuracy under real-world conditions, advanced data augmentation is applied to the input data through rotation, scaling, and contrast adjustments. Vehicle dashcams are extracted to maintain data diversity and keep the model diverse from different viewpoints.

The dataset includes labelled instances of vehicles, roadblocks, crossings, and traffic signals for object detection using YOLOV8. The bounding boxes and class labels of each image are defined. Firstly, the dataset is processed by removing noise and normalising the resolution, and is then used for training and fine-tuning the model. The system’s overall accuracy and real-time responsiveness are enhanced by combining large-scale labelled data and domain-specific adaptive learning. YOLOv8 FSL and LLaMA 3.2B are integrated into the proposed system to enable real-time traffic monitoring and pedestrian detection. The architecture contains three main components: (i) YOLOv8 for object detection, (ii) Few-Shot Learning for adaptive pedestrian detection, and (iii) LLaMA 3.2B to opt for real-time decisions. Each of these components analyses the traffic conditions, detects anomalies, and predicts potential road safety risks based on historical patterns, as shown in Figure 1, which defines the schematic flow of the proposed work [4].

3.2. Methods

This study introduces a traffic monitoring system that utilises AI and motion sensors, leveraging an available dataset to enhance traffic safety, identify unusual events, and optimise infrastructure. The proposed design includes the following: 1. very accurate object detection using YOLOv8, which is trained on the BDD100K dataset and achieves 96% detection accuracy; 2. smart recognition of pedestrian crossings using Few-Shot Learning (FSL), which has an 85% recall rate for situations not previously seen; and 3. quick analysis and decision-making with a very short delay of 75 ms using LLaMA 3.2B. This is a fully convolutional model with anchor-free detection to improve accuracy. This is achieved by partitioning the feature map of the base layer into two parts and merging them through a cross-stage hierarchy, effectively balancing performance and efficiency. For the head, decoupled detection heads are employed for classification and localisation tasks, enhancing the detection of small and overlapping objects by separating these two functions [5].

To improve accuracy of object detection under diverse environment settings, the model is fine-tuned on the combination of available datasets. The fine-tuning process involves the following:

Using domain-specific traffic data, pre-trained YOLOv8 weights are adapted using transfer learning.
Data augmentation: Random cropping, flipping, brightness adjustment, and Gaussian noise are used expressed in Equation (1).

[d (f_{θ} (x_{q}), c_{k}) = {∥ f_{θ} (x_{q}) - c_{k} ∥}^{2}]

(1)

When a query image x_q is presented, its feature embedding f is compared to all class prototypes as c_k using Euclidean distance d.

Loss function optimisation: The loss function is optimised is expressed in Equation (2).

[L = - \sum_{i} . l o g P (y_{i}| x_{i})]

(2)

Objectless loss: Binary cross-entropy (BCE) loss for object presence.
Classification loss: Focal loss to mitigate class imbalance.

The problem of pedestrian detection in dynamic urban environments is challenging, as movement patterns are diverse and road conditions are not uniform for some vehicle crossings. Deep learning models tend to rely on large amounts of labelled training data, which can sometimes be difficult and time-consuming to collect and annotate. To overcome these limitations, the proposed system includes a Few-Shot Learning (FSL) module that combines meta-learning principles to improve pedestrian crossing detection with limited data. It is also useful as a real-time traffic monitoring system, since this approach enables the system to adapt quickly to new pedestrians. Video preprocessing the frames extracted from the pipeline begins with applying enhancers such as noise reduction and frame stabilisation to make the frames clear and consistent. Next, parallel model inference is executed in which the YOLOv8, Few-Shot Learning (FSL), and LLaMA 3.2B models are run concurrently on edge devices or cloud-based infrastructure to process the frames and analyse the results [6].

4. Experimental Results

By utilising the proposed model in large-scale traffic datasets such as BDD100K and Cityscapes, the system achieves a high detection accuracy of 96% for traffic flow on the road with YOLOV8, while state-of-the-art models like SOTA achieve only 89%. This indicates that the proposed model is 40% more accurate than rule-based AI systems, with FSL reducing false negatives by 22% compared to the video-based computer vision model. The results indicate that the proposed model is well-suited for AI-driven cycles in smart city planning and for enhancing autonomous mobility in urban traffic management. Table 1 presents a comparison of the proposed model with traditional methods, and Figure 2 illustrates the confusion matrix of the proposed work.

Figure 3a defines the output process by the proposed model detecting sudden events in crossing, and Figure 3b defines the vehicle progression in road lanes.

Figure 4a defines the proposed model accuracy comparison with false negative rate analysis, and Figure 4b defines the latency per frame of the proposed model.

5. Conclusions

The proposed modern real-time traffic monitoring system uses advanced artificial intelligence methods to boost safety levels and urban mobility performance. The system utilises the maximum advantages of YOLOv8, a state-of-the-art deep learning model, for high-precision object detection, enabling the accurate identification of vehicles, pedestrians, and other road entities. The system integrates Few-Shot Learning (FSL) as a method to detect pedestrians in dynamic conditions with a small amount of labelled data. LLaMA 3.2B, as a robust language model, enables smart decisions that allow traffic control systems to adjust their operations based on real-time conditions.

Author Contributions

Conceptualization, M.K.; methodology, M.K.; software, M.K.; validation, M.K., A.S. and R.K.D.; formal analysis, M.K.; investigation, M.K.; resources, M.K.; data curation, M.K.; writing—original draft preparation, M.K.; writing—review and editing, A.S. and R.K.D.; visualisation, A.S. and R.K.D.; supervision, A.S. and R.K.D.; project administration, A.S. and R.K.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available at https://www.kaggle.com/datasets/solesensei/solesensei_bdd100k (accessed on 1 February 2025).

Acknowledgments

During the preparation of this research authors, the used Grammarly’s language-editing function, and the authors remain fully responsible for the content.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Liu, Q.; Li, Z.; Zhang, L.; Deng, J. MSCD-YOLO: A Lightweight Dense Pedestrian Detection Model with Finer-Grained Feature Information Interaction. Sensors 2025, 25, 438. [Google Scholar] [CrossRef] [PubMed]
Zhang, X.; Wang, M. Research on Pedestrian Tracking Technology for Autonomous Driving Scenarios. IEEE Access 2024, 12, 149662–149675. [Google Scholar] [CrossRef]
Boudjit, K.; Ramzan, N. Human detection based on deep learning YOLO-v2 for real-time UAV applications. J. Exp. Theor. Artif. Intell. 2022, 34, 527–544. [Google Scholar] [CrossRef]
Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987. [Google Scholar] [CrossRef]
Terven, J.; Córdova-Esparza, D.-M.; Romero-González, J.-A. A comprehensive review of yolo architectures in computer vision: From YOLOV1 to YOLOV8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716. [Google Scholar] [CrossRef]
Manakitsa, N.; Maraslidis, G.S.; Moysis, L.; Fragulis, G.F. A review of machine learning and deep learning for object detection, semantic segmentation, and human action recognition in machine and robotic vision. Technologies 2024, 12, 15. [Google Scholar] [CrossRef]

Figure 1. Schematic diagram of the proposed work.

Figure 2. Confusion matrix of different classes.

Figure 3. (a) Crossing detected; (b) vehicle road lane movements.

Figure 4. (a) Accuracy and false negative comparison; (b) real-time latency time analysis.

Table 1. Proposed comparison with existing models.

Metric	Proposed YOLOv8	Baseline YOLOv8	SOTA	SSD
Detection Accuracy	96.0%	83.70%	88.00%	79.60%
Inference Speed	75.1 ms/frame	222.4 ms/frame	120 ms/frame	84 ms
Vehicle Object Detection	Enhanced	Moderate	High	Low

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kandavel, M.; Sureshkumar, A.; Dhanaraj, R.K. Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8. Eng. Proc. 2025, 118, 92. https://doi.org/10.3390/ECSA-12-26515

AMA Style

Kandavel M, Sureshkumar A, Dhanaraj RK. Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8. Engineering Proceedings. 2025; 118(1):92. https://doi.org/10.3390/ECSA-12-26515

Chicago/Turabian Style

Kandavel, Murugeswari, Aanjankumar Sureshkumar, and Rajesh Kumar Dhanaraj. 2025. "Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8" Engineering Proceedings 118, no. 1: 92. https://doi.org/10.3390/ECSA-12-26515

APA Style

Kandavel, M., Sureshkumar, A., & Dhanaraj, R. K. (2025). Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8. Engineering Proceedings, 118(1), 92. https://doi.org/10.3390/ECSA-12-26515

Article Menu

Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8^†

Abstract

1. Introduction

2. Related Works

3. Materials and Methods

3.1. Materials

3.2. Methods

4. Experimental Results

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8 †

Abstract

1. Introduction

2. Related Works

3. Materials and Methods

3.1. Materials

3.2. Methods

4. Experimental Results

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Autonomous Traffic-Monitoring Pedestrian Crossing Detection with Motion Sensors and Maintenance Decision Systems in Smart Cities Using YOLOV8^†