Review

A SPAR-4-SLR Systematic Review of AI-Based Traffic Congestion Detection: Model Performance Across Diverse Data Types

Mathematics and Computing Department, LIS Labs, Faculty of Sciences Ain Chock, Hassan II University Casablanca, Casablanca 20360, Morocco
Smart Cities 2025, 8(5), 143; https://doi.org/10.3390/smartcities8050143
Submission received: 23 June 2025 / Revised: 23 August 2025 / Accepted: 27 August 2025 / Published: 30 August 2025
(This article belongs to the Special Issue Cost-Effective Transportation Planning for Smart Cities)

Highlights

What are the main findings?
  • This review analyzes 44 AI-based traffic congestion detection studies using the SPAR-4-SLR methodology.
  • It identifies five main model categories and two data types, mapping them to performance metrics across different scenarios.
What is the implication of the main finding?
  • The study provides a structured roadmap to guide researchers in selecting appropriate AI models based on data availability and objectives.
  • The study contributes to advancing the state of the art in congestion detection by clarifying when and why each model performs optimally.

Abstract

Traffic congestion remains a major urban challenge, impacting economic productivity, environmental sustainability, and commuter well-being. This systematic review investigates how artificial intelligence (AI) techniques contribute to detecting traffic congestion. Following the SPAR-4-SLR protocol, we analyzed 44 peer-reviewed studies covering three data categories—spatiotemporal, probe, and hybrid/multimodal—and four AI model types—shallow machine learning (SML), deep learning (DL), probabilistic reasoning (PR), and hybrid approaches. Each model category was evaluated against metrics such as accuracy, the F1-score, computational efficiency, and deployment feasibility. Our findings reveal that SML techniques, particularly decision trees combined with optical flow, are optimal for real-time, low-resource applications. CNN-based DL models excel in handling unstructured and variable environments, while hybrid models offer improved robustness through multimodal data fusion. Although PR methods are less common, they add value when integrated with other paradigms to address uncertainty. This review concludes that no single AI approach is universally the best; rather, model selection should be aligned with the data type, application context, and operational constraints. This study offers actionable guidance for researchers and practitioners aiming to build scalable, context-aware AI systems for intelligent traffic management.

1. Introduction

Traffic congestion is a persistent issue plaguing urban centers worldwide, contributing to significant economic, environmental, and social challenges. According to the 2024 INRIX Global Traffic Scorecard, drivers in major cities like New York, London, and Istanbul face annual delays exceeding 100 h, resulting in billions in lost productivity and fuel consumption [1]. These disruptions translate into increased greenhouse gas emissions, higher energy costs, and reduced quality of urban life. As urban infrastructure grows more complex, effective traffic management demands intelligent, scalable, and data-driven solutions.
Artificial intelligence (AI) has rapidly emerged as a transformative enabler in this domain, offering real-time capabilities for the detection, prediction, and mitigation of traffic congestion. Advances in shallow machine learning (SML), deep learning (DL), and probabilistic reasoning (PR) have enabled the deployment of intelligent systems leveraging multimodal data sources such as traffic sensors, GPS trajectories, surveillance video, and even audio signals. In smart cities where such systems have been implemented, studies report travel delay reductions of up to 25% and traffic flow improvements of over 20%.
This paper presents a hybrid semantic and systematic review, applying the Scientific Procedures and Rationales for Systematic Literature Reviews (SPAR-4-SLR) protocol to analyze and compare AI methodologies used in traffic congestion detection. Unlike conventional reviews, this study combines structured semantic extraction with a performance-driven evaluation of 44 peer-reviewed articles, categorizing them by data type (spatiotemporal, probe, hybrid/multimodal) and model type (SML, DL, PR, hybrid). In addition to performance metrics such as accuracy, the F1-score, and scalability, the review critically examines model interpretability, deployment feasibility, and contextual relevance.
The main contributions of this paper lie in the identification of the best-performing AI models per data type, the cross-analysis of model strengths against deployment contexts, and critical reflection on the limitations of current approaches. This work offers actionable guidance for both researchers and practitioners aiming to develop adaptive, efficient, and scalable traffic congestion detection systems.
The remainder of this paper is structured as follows. Section 2 details the review methodology based on the Scientific Procedures and Rationales for Systematic Literature Reviews (SPAR-4-SLR) protocol. Section 3 introduces the types of traffic data used across the reviewed studies, and presents an in-depth analysis of AI models by category. Section 4 and Section 5 present the discussion and synthesis of results, highlighting strengths, limitations, and future research directions. Section 6 concludes with key takeaways and recommendations for real-world AI-driven traffic management.

2. Materials and Methods

This study adopts the SPAR-4-SLR protocol formulated by Paul et al. [2], a structured framework designed to enhance the precision, reproducibility, and comprehensiveness of systematic literature reviews. The SPAR-4-SLR methodology is divided into three core stages, namely assembling, arranging, and assessing, each with specific sub-stages that support the thorough and methodical synthesis of existing research. This structured approach, shown in Figure 1, provides a transparent and replicable foundation for reviewing research on AI-based traffic congestion detection.

2.1. Assembling

The first stage of the SPAR-4-SLR protocol, assembling, encompasses both the identification and acquisition of the relevant literature. During the identification sub-stage, we defined the domain of our review as the detection of traffic congestion using AI methodologies. The research questions guiding this review were as follows: (RQ1) What are the common data sources used to detect traffic congestion, and how do their strengths and limitations impact the accuracy of congestion detection? (RQ2) Which AI methodologies are applied in studies to detect traffic congestion, and how do these methodologies compare in terms of performance, scalability, and applicability to real-world scenarios? To ensure the inclusion of high-quality and relevant studies, we focused on peer-reviewed journal articles and conference papers indexed in Scopus and Web of Science (WOS). Each paper was characterized by its contribution to the field, such as introducing novel detection methods, improving existing AI algorithms, or providing comprehensive reviews on the topic. The acquisition sub-stage involved a systematic Boolean search on Scopus and WOS, targeting articles published between 2020 and 2025. The search keywords included combinations of terms such as “traffic congestion”, “detection”, “artificial intelligence”, “machine learning”, and “data sources”, applied to article titles, abstracts, and keywords. The initial search query returned 1109 articles.

2.2. Arranging

The second stage, arranging, involves the organization and purification of the gathered literature. In the organizing sub-stage, we categorized the selected articles based on metadata, including the article title, journal title, author name, country of affiliation, keywords, and citation count. This categorization allowed for a structured approach to the subsequent analysis. Each paper was further defined by its bibliographic details, which provided a clear context for its scientific contribution. No additional organizing frameworks were applied, as the primary focus was on metadata classification. The purification sub-stage involved a rigorous filtering process where 1065 articles were excluded based on several criteria: 105 non-article formats (e.g., book chapters, editorials), non-English publications (106 articles), publications in journals with an impact factor below 1 (255 articles), and studies not directly related to traffic congestion detection (599 articles). Each paper excluded was defined by a lack of focus on direct detection methodologies or by being outside the scope of AI-based detection research. This included articles focusing on the causes, control, impacts, propagation, or prevention of traffic congestion, as well as those dealing with cybersecurity and network traffic or air traffic congestion. After this thorough purification process, only 44 articles were retained, all of which specifically addressed the detection of traffic congestion using AI methodologies. Each retained paper was defined by its direct relevance to the research questions and its contribution to the understanding of AI in traffic congestion detection.
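The screening counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in Sections 2.1 and 2.2; the dictionary keys are illustrative labels chosen here, not terms from the SPAR-4-SLR protocol.

```python
# Counts reported in Sections 2.1 and 2.2 of this review.
initial_records = 1109

# Exclusions by purification criterion (keys are illustrative labels).
excluded = {
    "non_article_format": 105,
    "non_english": 106,
    "impact_factor_below_1": 255,
    "not_congestion_detection": 599,
}

total_excluded = sum(excluded.values())
retained = initial_records - total_excluded

print(f"excluded: {total_excluded}, retained: {retained}")
```

The four exclusion counts sum to the 1065 excluded articles, leaving the 44 retained studies.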

2.3. Assessing

The final stage, assessing, involves the evaluation of the literature that passed the purification process. The evaluation sub-stage included a categorical analysis of the types of data and AI methodologies employed in the selected studies. Each paper was characterized by the specific AI techniques that it employed, such as machine learning, deep learning, or probabilistic models, and the data sources that it utilized, such as sensor data, video feeds, or GPS data. This analysis was aimed at identifying patterns in the frequency and distribution of these methodologies and data types across the reviewed literature. Additionally, a comparative analysis of the metrics reported in the studies was conducted to determine the relative effectiveness of different AI approaches in detecting traffic congestion. In the reporting sub-stage, we employed figures and tables to visually represent the data distribution and key findings from the analyses, highlighting the most effective AI methodologies and their applications in traffic congestion detection. Each paper was also considered in terms of its findings and its contextual relevance within the broader scope of AI-based traffic congestion detection research. The review’s limitations include its restriction to the Scopus and WOS databases and the exclusion of non-English studies, which may introduce some bias. No external funding supported this study, so funder influence can be excluded as a source of bias. Following the SPAR-4-SLR framework, this methodological approach provides a systematic and replicable foundation for an understanding of the current state of AI-based traffic congestion detection research.

3. Results

3.1. Comparative Evaluation of Data Sources

An in-depth review of 44 studies revealed a diverse range of traffic data acquisition strategies for congestion detection, broadly categorized into spatiotemporal, probe-based, hybrid, and other sources. The distinctions among these categories extend beyond the sensing modality, encompassing differences in real-time applicability, cost, and deployment feasibility. This section highlights the data collection mechanisms, simulation tools, and infrastructure employed, offering insights into how these choices shape the design and functionality of congestion detection systems. The proportional distribution of real, simulated, and hybrid data across these categories is summarized in Figure 2, highlighting the dominance of real spatiotemporal inputs. A structured comparative summary of these data categories, including subtype classification, availability, cost, and real-time suitability, is provided in Table 1.
  • Spatiotemporal data formed the foundation for most congestion detection efforts (n = 32). Prominently featured in studies [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21], real-world sources included traditional loop detectors deployed to monitor speed, volume, and occupancy on expressways [4,6,7] and urban surveillance systems based on video acquisition via roadside cameras [5,8,9]. Several works integrated visual feeds with real-time analytics; for instance, [5] utilized live video from the SOTS and HSTS datasets, while [9] analyzed UAV-captured footage of critical intersections. Public highway feeds and city monitoring centers provided the necessary infrastructure in many cases, with availability ranging from moderate to high depending on coverage. Despite their effectiveness, these systems remain infrastructure-intensive and costly to scale and maintain.
  • Simulated spatiotemporal data, widely explored across studies [22,23,24,25,26,27,28,29,30,31], enabled algorithm testing under controlled conditions, with high availability and minimal costs. Tools such as SUMO (Simulation of Urban Mobility) and NS-3 were frequently used to model traffic dynamics and vehicular communications [22,23,27]. In addition, some studies generated synthetic multivariate sensor datasets emulating real detector outputs [22,26], supporting precise calibration. While simulations offer flexibility and reproducibility, they lack real-world irregularities, thus limiting their transferability to production environments.
  • Hybrid spatiotemporal datasets, applied in [32,33,34], blended real-world traffic sensor data with simulated streams to validate detection models under both authentic and synthetic conditions. For instance, [33] merged actual road topologies with simulated vehicle trajectories to improve model robustness.
  • Complementing fixed-sensor systems, probe-based data were examined in six studies [35,36,37,38,39,40]. These included mobile GPS traces, smartphone-derived logs, and V2V communications. Real probe datasets originated from fleet vehicles [35,37], mobile user contributions [38], and connected vehicle telemetry [36,39]. These sources offer wide spatial reach with minimal physical infrastructure, although their effectiveness depends on data penetration and sharing protocols. For example, [36] evaluated LTE-based vehicular communication for the tracking of congestion spread, while [38] compared crowdsourced alert timings to official sensor reports. A simulated probe study [40] tested clustering-based classification on synthetic V2X data. Although typically cost-efficient, proprietary limitations in probe datasets may constrain scalability and replication.
  • Hybrid data sources, explored in [41,42,43,44], combined multiple sensing modalities to enhance the detection resilience. In [41], fixed roadside sensors were fused with mobile signals to cross-validate congestion states. The study [42] showcased a hybrid simulation, while others, such as [43,44], integrated both live and offline streams. These systems counteract the limitations of individual data types but introduce added complexity in architecture and integration.
  • Additional sensing strategies were reported in [45,46]. The study [45] used roadside microphones to detect congestion through acoustic signals such as engine noise and honking patterns, an approach that is low-cost and real-time, albeit sensitive to environmental factors. The study [46] leveraged static road profile data and historical congestion maps to support route-level congestion analysis and planning. Although universally accessible and low-cost, such static sources lack temporal granularity and require real-time data fusion for actionable detection.
Collectively, the reviewed studies present a comprehensive landscape of traffic data usage in congestion detection. From high-resolution infrastructure feeds to synthetic simulations and mobile crowdsourced data, each source type contributes distinct advantages. Real spatiotemporal and probe datasets offered the most direct deployment value when sufficient infrastructure or user bases existed, whereas simulations supported scalable testing and algorithm development. Hybrid approaches provided robustness at the cost of system complexity, and novel sensing (e.g., audio) introduced promising alternatives in low-resource settings. Ultimately, the interplay between these data types defines the operational scope, scalability, and intelligence of modern congestion detection systems.

3.2. Comparative Evaluation of AI Models

A wide spectrum of AI techniques has been adopted across the 44 selected studies, reflecting the field’s growing complexity and innovation. From traditional shallow machine learning and advanced deep learning architectures to probabilistic, statistical, and hybrid approaches, each method serves distinct objectives in traffic congestion detection. This section provides a comparative analysis of these techniques, emphasizing their prevalence and methodological orientations. As illustrated in Figure 3, deep learning models have the highest adoption rate (18 studies), underscoring their growing dominance in recent research, while shallow ML offers the best balance between accuracy and cost (96.5% accuracy, 80% cost efficiency). Hybrid approaches demonstrate the strongest real-time performance (95.9%). The radar plot (Figure 3) further reveals critical performance trade-offs across accuracy, interpretability, cost efficiency, and real-time suitability that can inform practical deployment decisions.

3.2.1. Shallow Machine Learning

SML continues to offer effective and deployable solutions for traffic congestion detection, particularly in scenarios where data are structured, infrastructure is constrained, or explainability is essential. Within the reviewed literature, such models were selected with strategic intent, often tailored to specific traffic environments, sensing modalities, or decision requirements. This section explores the most utilized shallow ML approaches, including clustering-based models, rule-based classifiers, statistical anomaly detectors, and optimization-assisted techniques.
  • Density-Based Clustering (DBSCAN, ST-DBSCAN, HDBSCAN): This represents a class of unsupervised techniques that are adept in identifying the spatial concentration of vehicles, which often signifies congestion. Unlike partitioning methods, these approaches require no prior knowledge of the cluster count, making them ideal for irregular, dynamic traffic distributions. In [6], DBSCAN was applied to detect congestion zones based on vehicle positioning and mobility data, offering rapid detection without the need for complex preprocessing. A spatial voting mechanism was integrated to reinforce the model’s precision and minimize false detections at the cluster boundaries. The study [7] introduced HDBSCAN to enhance the sensitivity across mixed-density traffic data, allowing the model to adapt to varying urban topologies. Their framework achieved notable improvements in clustering quality and was validated against real-world datasets. In [29], ST-DBSCAN incorporated a temporal dimension, enabling the detection of evolving congestion scenarios by analyzing the speed and temporal relationships between consecutive data points. This allowed for a more dynamic representation of the traffic flow and enhanced the ability to capture transitional congestion patterns over time. This choice highlights the adaptability of density-based models in scenarios with sparse labeling and highly variable traffic patterns.
  • Decision Tree: This is a supervised rule-based classification technique that recursively splits the input space using conditions on input variables, forming a tree-like structure. It is known for its simplicity and interpretability, making it suitable for scenarios requiring transparency in decision-making. The study [8] implemented a decision tree on real-time traffic data to infer the congestion status by comparing speed thresholds and flow variations, producing binary output classes representing the congestion severity. The model was designed to support traffic operators through straightforward interpretability, and its decision paths closely matched domain expert logic. Results showed reliable classification with low computational demands, supporting its integration in constrained or embedded environments. This reflects the practical strength of decision trees when interpretability and simplicity are essential, especially in municipal-scale deployments.
  • Random Forest: This is an ensemble-based extension of the decision tree, combining multiple trees trained on bootstrapped datasets and aggregating their predictions. It improves the generalization performance and reduces the sensitivity to noise and overfitting. In [10], random forest was employed to classify congestion states using MFCC coefficients extracted from real-time traffic audio signals. The goal was to distinguish between free-flowing and congested environments without relying on visual data sources. The ensemble structure handled environmental noise effectively, and experiments demonstrated high detection accuracy, suggesting its suitability for cities lacking robust camera infrastructure. This illustrates the model’s capacity to absorb noisy, high-dimensional inputs while delivering robust predictions in unconventional sensing environments.
  • K-Means with Analytical Hierarchy Process (AHP): This is an unsupervised clustering method that groups data points based on the distance to centroids. When integrated with the Analytical Hierarchy Process (AHP), it benefits from the expert-defined weighting of variables, allowing more context-aware clustering. In [24], the AHP was integrated with K-Means to introduce weighted prioritization among input features such as the average speed, vehicle count, and road occupancy. The framework was developed to classify congestion levels with relevance to road priority and traffic policy. The incorporation of the AHP provided domain-aligned interpretability, and evaluations demonstrated strong agreement between the clustering results and expert-labeled traffic zones, particularly in peak-time conditions. This highlights the ability of integrated heuristic learning approaches to align machine-driven outputs with expert traffic management strategies.
  • Whale Optimization Algorithm (WOA): This is a population-based metaheuristic inspired by the bubble-net hunting strategy of humpback whales. It is commonly used in optimizing complex search spaces, where traditional methods struggle. In [35], the WOA was integrated into a fuzzy inference engine to optimize congestion thresholds dynamically in VANET environments. The model adjusted the fuzzy membership boundaries based on the vehicular density and delay parameters. Simulation results showed a significant reduction in false alarm rates and improved classification confidence when compared to traditional thresholding, validating the WOA’s contribution to enhancing fuzzy rule interpretability and performance. This underscores the value of metaheuristic optimization in enhancing adaptive decision-making within resource-constrained vehicular networks.
  • Learning-Based Detection with Rule-Guided Mitigation: In [9], GPS-based clustering for congestion detection with a rule-based rerouting mechanism was implemented in a VANET simulation. Using the Microsoft T-Drive dataset, which contains historical taxi trajectories in Beijing, congestion zones were identified through trajectory density analysis and threshold-based logic. Although primarily evaluated through qualitative analysis, the system demonstrated the effective identification of recurrent congestion spots, and the authors tested a mitigation strategy through simulated vehicular rerouting. This integration exemplifies the potential of hybrid systems to not only detect congestion but also initiate real-time control actions within connected vehicle infrastructures. The study’s use of public GPS data highlights the scalability, although real-time adaptability and predictive components remain future targets for enhancement.
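To make the density-based clustering idea from the list above concrete, the following is a minimal pure-Python sketch of DBSCAN applied to vehicle positions. It is not the implementation used in [6], [7], or [29]; the coordinates, `eps`, and `min_pts` values are hypothetical and chosen only for illustration.

```python
from math import hypot

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if hypot(xi - xj, yi - yj) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: join, but do not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels

# Hypothetical vehicle positions in metres: two dense groups and one lone car.
positions = [(0, 0), (5, 0), (0, 5), (5, 5), (3, 3),
             (200, 200), (204, 201), (202, 205), (206, 204),
             (100, 50)]
labels = dbscan(positions, eps=10.0, min_pts=3)
print(labels)
```

Under these assumptions, each dense group becomes one candidate congestion zone and the isolated vehicle is labeled as noise. No cluster count is supplied in advance, which is the property the reviewed studies rely on.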
Notably, density-based clustering methods were the predominant choice for the unsupervised analysis of GPS-based vehicle data, offering flexibility in spatial structure detection without predefined labels. Conversely, optimization-based and fuzzy-enhanced models were more common in vehicular network applications, where real-time reactivity and adaptability were prioritized. In summary, shallow machine learning models have been employed not merely as lightweight alternatives to deep models but as domain-appropriate solutions tailored to specific sensing limitations, decision interpretability needs, and real-time deployment constraints. Whether through density-based clustering, statistical anomaly detection, or optimization-enhanced fuzzy systems, these models, as shown in Table 2, exhibit continued relevance in scalable and practical congestion detection systems.
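As an illustration of how expert-defined weights can enter a clustering step such as the K-Means and AHP combination described for [24], the sketch below derives AHP priority weights with the row geometric-mean approximation and uses them in a weighted Euclidean distance. The pairwise judgments and feature names are hypothetical, and the sketch does not reproduce the weighting scheme of the cited study.

```python
from math import prod, sqrt

features = ["avg_speed", "vehicle_count", "occupancy"]

# Hypothetical pairwise judgments on Saaty's 1-9 scale: entry [i][j] states
# how much more important feature i is than feature j (reciprocal matrix).
pairwise = [
    [1.0, 3.0, 2.0],
    [1 / 3, 1.0, 1 / 2],
    [1 / 2, 2.0, 1.0],
]

def ahp_weights(matrix):
    """Approximate AHP priority weights via the row geometric-mean method."""
    n = len(matrix)
    geo_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two (normalized) feature vectors."""
    return sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

weights = ahp_weights(pairwise)
for name, w in zip(features, weights):
    print(f"{name}: {w:.3f}")
```

Because a weighted Euclidean distance is equivalent to scaling each feature by the square root of its weight, standard K-Means can be run unchanged on the rescaled features. The features should be normalized to a common range first so that the weights, rather than the raw units, determine their influence.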

3.2.2. Deep Learning

Deep learning has emerged as a cornerstone in traffic congestion detection, offering powerful capabilities to model nonlinear patterns and extract high-level spatiotemporal features from complex inputs. Its data-driven learning paradigm makes it especially well suited for urban environments characterized by diverse and heterogeneous data sources—ranging from visual imagery to dynamic road network graphs. Across the reviewed studies, deep learning methodologies were strategically employed to address distinct detection needs, including real-time visual classification, topology-aware reasoning, and adaptive policy generation. The most frequently utilized models fall into four core categories—convolutional neural networks (CNNs), graph neural networks (GNNs), artificial neural networks (ANNs), and deep reinforcement learning (DRL)—each selected based on the nature of the data, the sensing infrastructure, and the deployment context.
  • Convolutional Neural Networks (CNNs): These were the most widely adopted architectures in the reviewed literature, primarily in scenarios involving visual inputs such as urban surveillance footage, aerial images, and traffic camera streams. CNNs learn spatial hierarchies through layered convolutional filters, enabling them to capture vehicle clustering, lane congestion, and occlusion patterns without handcrafted features. Several studies implemented object detection architectures, such as YOLOv7 and YOLOv8. In [5], YOLOv8 was enhanced with a RAGFNet defogging module to restore degraded video quality in poor weather conditions, achieving an F1-score of 98.6%. The study [22] employed YOLOv8 with mNMS and BoTSORT to build a responsive video analytics system for real-time city congestion detection, reaching 97.4% accuracy. In [17], YOLOv7 was used to assess the road user density from CCTV data, although the results were limited by the general-purpose MS COCO dataset, yielding a modest F1-score of 0.61. Lightweight CNN variants were explored for deployment on embedded or edge platforms. In [33], SA-MobileNetV2, paired with Grad-CAM, delivered high accuracy (98.6%) while offering visual interpretability by highlighting congestion regions in the image. The study [25] used YOLOv3-tiny in conjunction with a congestion index combining the traffic density, speed, and duration, enabling efficient multi-criteria evaluation. In terms of aerial applications, [44] deployed YOLOv3 on UAV-captured streams to detect congestion in remote road segments, achieving over 90% average precision. Advanced multi-stream and attention-based CNNs were introduced to improve scalability and occlusion handling. In [39], a multi-branch CNN (MBCNN) separately processed raw video frames and vehicle distribution maps (VIFM), resulting in the improved recognition of high-density congestion zones and achieving a 98.6% F1-score. The study [40] employed SSANet with optical flow to integrate spatial density and motion features, enabling both static and dynamic congestion detection. In [41], a deeply supervised inception network (DSIN) combined with an attention proposal module (APM) achieved 95.77% accuracy, even under ultra-low-frame-rate video constraints across large freeway datasets. CNNs have demonstrated exceptional versatility across sensing conditions and spatial scales. Their modularity and compatibility with real-time video pipelines position them as highly effective tools in visual congestion monitoring environments.
  • Graph Neural Networks: GNNs offer a powerful extension to deep learning by modeling traffic flows through graph-structured data. In this representation, road segments or intersections are nodes, and the vehicle flow or proximity forms the edges. This enables traffic reasoning to occur over the entire network, leveraging both spatial connectivity and temporal dynamics. In [3], the Separable Contextual Graph Neural Network (SC-GNN) was used to detect congestion anomalies in an automated logistics environment, using 12,000 multivariate time-series samples. The model achieved an F1-score of 0.885 and maintained robustness against data imbalances. The study [18] introduced the Decoupled Dynamic Spatio-Temporal GNN (D2STGNN), leveraging image-derived numerical features for real-time traffic forecasting and congestion detection. The model reached an RMSE of 1.96 and an MAPE of 31.13%, outperforming conventional temporal baselines. A novel integration of graph-based routing and detection appeared in [26], where a modified EG-Dijkstra algorithm was deployed in a simulated SUMO environment. By combining congestion inference with optimal path selection, the approach achieved up to an 80% improvement in travel time, underscoring the suitability of graph-based methods for Internet of Vehicles (IoV) systems.
  • Artificial Neural Networks (ANNs): While simpler in structure compared to CNNs and GNNs, these have been effectively applied in scenarios where structured, numeric data sources dominate—such as traffic signal timing, port gate delays, or vehicle flow metrics. In [43], an ANN was combined with a parameterized rule-based model to detect and manage congestion in the city of Patras, Greece. The model ingested field-surveyed traffic patterns and real-time gate operations, achieving 96% detection accuracy. The hybrid system enabled the synchronized control of port access and urban traffic flows via connected vehicle messaging, with demonstrated improvements in travel time and system throughput.
  • Deep Reinforcement Learning: DRL frameworks extend detection into dynamic decision-making, where a learning agent optimizes congestion-related policies over time through feedback. This makes them especially useful in autonomous systems requiring adaptive control. The study [36] proposed a DRL model trained on traffic data derived from vehicle routing and congestion labels. To enhance its trust and explainability, the model incorporated explainable AI (XAI) techniques, allowing operators to interpret the rationale behind real-time decisions. The model achieved 98.1% classification accuracy and maintained robustness in the presence of incomplete or noisy data.
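The graph representation described above, with intersections as nodes and road segments as edges, can be sketched with a congestion-aware shortest-path search. The example below is plain Dijkstra with congestion-scaled edge costs, not the modified EG-Dijkstra algorithm of [26]; the toy network, the `penalty` parameter, and the cost formula are assumptions made for illustration.

```python
import heapq

def shortest_path(graph, source, target, penalty=2.0):
    """Dijkstra over edges (neighbor, free_flow_time, congestion in [0, 1]).

    Edge cost = free_flow_time * (1 + penalty * congestion), an assumed
    formula that makes congested segments more expensive to traverse.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, free_flow, congestion in graph.get(u, []):
            nd = d + free_flow * (1.0 + penalty * congestion)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]

# Hypothetical network: the direct route A-B-D is short but congested.
road_graph = {
    "A": [("B", 2.0, 0.9), ("C", 3.0, 0.1)],
    "B": [("D", 2.0, 0.8)],
    "C": [("D", 3.0, 0.0)],
}
path, cost = shortest_path(road_graph, "A", "D")
print(path, round(cost, 2))
```

With the congestion penalty enabled, the search prefers the longer but free-flowing route through C; setting `penalty=0.0` recovers the shortest free-flow route through B.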
The deep learning models reviewed in this study exhibit highly diversified and strategically tailored deployment across congestion detection scenarios. CNNs dominate in visual surveillance applications, providing robust real-time classification across weather and lighting conditions. GNNs support topological learning at scale, excelling in sensor-structured and multinode networks. ANNs continue to perform effectively in hybrid rule-based systems with structured control data. DRL frameworks expand the frontiers by enabling policy adaptation and interpretability under non-stationary traffic conditions. Together, these models represent a mature and multifaceted ecosystem, where deep learning is not merely a predictive tool but a foundational pillar in the evolution of intelligent, scalable, and autonomous traffic congestion detection systems. Table 3 and Table 4 provide an overview of the DL methods applied across the reviewed papers.
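The graph representation described for GNNs (road segments as nodes, adjacency as edges) can be illustrated with a minimal sketch. It performs a single round of neighbor averaging, which is the basic aggregation idea behind message passing; it is not SC-GNN or D2STGNN, it has no learned weights, and the corridor layout, speeds, and the `self_weight` parameter are hypothetical.

```python
# Illustrative only: a hand-rolled neighbor-averaging step with no learned
# parameters. It is NOT the SC-GNN [3] or D2STGNN [18] architecture.

def aggregate_neighbors(speeds, edges, self_weight=0.5):
    """One round of mean-neighbor aggregation over an undirected road graph.

    speeds: dict mapping node id -> observed speed (km/h)
    edges:  iterable of (u, v) pairs linking adjacent road segments
    """
    neighbors = {node: [] for node in speeds}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, speed in speeds.items():
        if neighbors[node]:
            neighbor_mean = sum(speeds[n] for n in neighbors[node]) / len(neighbors[node])
            # Blend the segment's own reading with its neighbors' average.
            updated[node] = self_weight * speed + (1 - self_weight) * neighbor_mean
        else:
            updated[node] = speed  # isolated segment keeps its own reading
    return updated


# Hypothetical four-segment corridor A-B-C-D with a slowdown on segment C.
speeds = {"A": 60.0, "B": 55.0, "C": 10.0, "D": 50.0}
edges = [("A", "B"), ("B", "C"), ("C", "D")]
smoothed = aggregate_neighbors(speeds, edges)
```

After one round, the slowdown on C lowers the values of its neighbors B and D, which is how information about a local disturbance spreads over the network. A trained GNN would replace the fixed averaging with learned, typically nonlinear, transformations.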

3.2.3. Probabilistic Reasoning Techniques

Probabilistic reasoning models offer a flexible framework for traffic congestion detection, particularly effective in scenarios where uncertainty, incomplete information, or ambiguous inputs challenge the use of conventional machine learning techniques. These models allow systems to incorporate prior knowledge, represent non-deterministic behaviors, and perform robust inference under noisy or sparsely labeled data environments. While probabilistic detection models represent only 11% of the reviewed studies, they fill critical niches: Bayesian tensor methods [4] enable training-free anomaly detection (88% accuracy), while Markov-fuzzy hybrids [45] maintain 88.6% accuracy at just 10% data penetration, outperforming DL in sparse-data scenarios. Across these studies, probabilistic reasoning was adopted through fuzzy logic systems, Bayesian models, and state-space filters, each targeting different facets of traffic uncertainty, from real-time anomaly detection to personalized congestion responses. This section explores five notable implementations, grouped below into four methodological families.
  • Bayesian Tensor Factorization: In [4], a scalable Bayesian robust tensor factorization model was developed to detect non-recurrent traffic congestion (NRTC) using loop detector data collected from US highways (Caltrans PeMS). By modeling multivariate spatiotemporal traffic variables (speed, volume, occupancy) in a high-dimensional tensor structure, the method enabled anomaly detection without requiring supervised training. The model achieved 88.03% overall accuracy in identifying 1169 out of 1328 NRTC events and demonstrated even higher weekday performance (92.33%). Its unsupervised and training-free nature makes it suitable for large-scale real-time deployment, although its computational complexity could become a bottleneck in dense urban sensor networks.
  • Ensemble Kalman Filtering for UAV-based Incident Detection: The study [11] proposed a novel integration of UAV path planning and ensemble Kalman filtering (EnKF) to estimate traffic congestion in non-recurrent scenarios. Based on synthetic data generated via cell transmission modeling and UAV imagery, the framework dynamically adjusted UAV flight paths to maximize the coverage of high-uncertainty zones while concurrently updating traffic state estimates. The model achieved high detection accuracy and reduced estimation variance in heavy congestion scenarios. While no real UAV deployment was conducted, the simulation underscores the potential of Kalman filtering in combining sparse, mobile observations with predictive models in an adaptive loop.
  • Fuzzy Logic for Route Personalization and Traffic State Ranking: Fuzzy inference mechanisms were utilized in two studies to address ambiguity in traffic states and driver preferences. In [16], a fuzzy-based decision support system was built around manual road data to recommend optimal routes in Giza, Egypt. The model evaluated candidate roads based on linguistic variables such as speed, safety, and the number of traffic signals, ranking them through fuzzy suitability scores. Although the system lacked real-time detection, it showcased fuzzy logic’s value in encoding user-defined priorities and subjective decision-making under semi-real scenarios. In [20], a Mamdani-type fuzzy inference system was applied to real freeway data from Hungary, supporting seven-level congestion classification (e.g., “stable”, “near congestion”, “completely free”) through expert-derived linguistic rules. The method emphasized interpretability and tolerance to input imprecision, although it depended heavily on human-crafted rules and the work lacked quantitative evaluation metrics.
  • Markov Fuzzy Switching for Sparse Probe Data Environments: In [45], a Conditionally Gaussian Observed Markov Fuzzy Switching System (CGOMFSM) was introduced to estimate traffic states using sparse GPS and speed data from highways in England, combined with SUMO-based simulation. The model was designed to function effectively even at low data penetration rates (10%), using fuzzy logic to handle input variability and a Markovian framework to model state transitions. It achieved high performance (88.6% accuracy, MAPE ≈ 6.0%) while maintaining low lag (4.1 min) and false alarm rates (16.3%). The method proved suitable for low-infrastructure environments such as highways with limited sensor coverage.
The reviewed probabilistic reasoning models, as presented in Table 5, collectively reveal their versatility in handling uncertain, noisy, or ambiguous traffic scenarios. Bayesian tensor methods excelled in detecting non-recurrent events without training, while ensemble filtering proved promising for mobility-aware surveillance. Fuzzy systems offered interpretability and human-aligned logic in both personalized routing and multilevel congestion assessment. Markov-based fuzzy hybrids extended the detection capabilities to sparse probe data environments, preserving the accuracy under constrained sensing conditions. Together, these approaches reinforce the strategic role of probabilistic reasoning in designing resilient, scalable, and context-aware congestion detection systems that complement deterministic or data-hungry learning models.
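To make the fuzzy classification idea concrete, the following is a minimal sketch using triangular membership functions over the ratio of observed speed to free-flow speed. It is a simplification: a full Mamdani system, as used in [20], would also aggregate rule consequents and defuzzify the result. The labels "near congestion", "stable", and "completely free" are taken from the text above; the "congested" label, the four-level scheme, and all breakpoints are hypothetical.

```python
# Simplified fuzzy classification sketch. Breakpoints are hypothetical and
# not taken from any reviewed study; no rule aggregation or defuzzification.

def triangular(x, left, peak, right):
    """Triangular membership: 1 at the peak, falling linearly to 0 at the ends."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Fuzzy sets over speed expressed as a fraction of free-flow speed.
FUZZY_SETS = {
    "congested": (0.0, 0.0, 0.5),
    "near congestion": (0.25, 0.5, 0.75),
    "stable": (0.5, 0.75, 1.0),
    "completely free": (0.75, 1.0, 1.0),
}


def classify(speed_kmh, free_flow_kmh):
    """Return the label with the highest membership and all membership degrees."""
    ratio = max(0.0, min(1.0, speed_kmh / free_flow_kmh))
    memberships = {label: triangular(ratio, *abc) for label, abc in FUZZY_SETS.items()}
    best = max(memberships, key=memberships.get)
    return best, memberships


label, degrees = classify(20.0, 100.0)
```

Because neighboring sets overlap, a reading such as 60% of free-flow speed belongs partly to "near congestion" and partly to "stable", which is the tolerance to input imprecision that the fuzzy studies emphasize.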

3.2.4. Statistical and Rule-Based Models for Traffic Congestion Detection

Statistical and rule-based approaches continue to play a pivotal role in traffic congestion detection systems due to their interpretability, simplicity, and low computational overhead. These models are particularly advantageous in scenarios where training data are limited, real-time responsiveness is critical, or regulatory transparency is required. The ten reviewed studies illustrate a range of innovations across statistical inference, fuzzy systems, rule-based vision, and outlier detection frameworks. Despite their methodological diversity, they share a common goal: enabling congestion detection through explicit logic, structured pattern recognition, or mathematically grounded reasoning without deep learning dependencies. Table 6 summarizes the models and techniques used in these studies.
  • Pure Statistical Inference Models: Several studies leveraged purely statistical models to detect congestion based on anomalies, spatial propagation, or topological indicators. In [13], the State Propagation Algorithm (SPA) was introduced to model congestion spread using GNSS data from 10,000 taxis in Seoul. By computing effective Z-scores across spatially connected road segments, the SPA differentiated between localized and structural congestion formations. Similarly, ref. [14] proposed a map-independent statistical approach, transforming GPS trajectories into spatial “congestion cells” and detecting speed anomalies using the per-cell variance. This approach proved particularly effective for unstructured road networks, where map matching is infeasible. The study [15] combined logistic regression and curve fitting techniques with image-based traffic visualization to automatically detect and characterize multiple bottlenecks on Dutch highways. It proved robust in identifying simultaneous congestion events. In [30], the EB-TCD model was introduced, using an ensemble-based statistical detector that fuses deviations across several traffic features to raise congestion alerts. Its extremely low false alarm rate (0.08%) and rapid response (MTTD = −0.625) make it well suited for real-time control systems.
  • Fuzzy and Rule-Based Inference Systems: These remain valuable in embedding domain knowledge and handling uncertainty in congestion assessment. In [16], a fuzzy route selection system guided drivers in Giza by ranking road alternatives using driver-defined rules across safety, traffic signals, and service availability. In [20], a Mamdani fuzzy inference system classified traffic states on Hungarian highways using seven linguistic congestion levels derived from flow metrics. These approaches excelled in applications where formal ground-truth labels were scarce and interpretability was critical. The study [45] integrated fuzzy reasoning with probabilistic modeling through a Markov fuzzy switching system, which combined GPS probe data with simulated congestion indicators. Even with low data penetration, the system achieved accuracy of 88.6%, validating its robustness in sparse sensing environments.
  • Rule-Based Vision and Sensor Systems: Rule-based congestion detection using structured thresholds and visual cues continues to support edge deployments. In [12], background subtraction techniques (GFM, GMM) were used on CCTV streams within a hybrid edge–cloud architecture. The system automatically identified congestion zones based on vehicle density thresholds and achieved near-perfect detection (≈100%). In [27], mobile phone handover events were mined to derive pseudo-speed and probe activity metrics, which were fed into a generalized ESD statistical test to determine the congestion severity. The model reached 95% accuracy on a major freeway and demonstrated strong scalability potential.
  • Simulation-Driven Rule Models for Intelligent Vehicles: Simulation-based systems explore rule logic within synthetic vehicular networks. The study [32] modeled a novel behavior-based rerouting strategy where vehicles autonomously diverted from queues based on line-of-sight congestion detection. Diverting as few as 10% of vehicles led to substantial improvements in network flow. In [34], a V2V communication-based congestion detector used the routing hop count and signal energy to flag developing congestion. As the density increased, the model reliably predicted impending traffic jams, outperforming traditional speed-based systems.
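The statistical detectors above share a common pattern: compare a current observation with its historical distribution and flag large deviations. The sketch below shows that pattern with a per-cell Z-score on speed. It is not the SPA [13], the cell-based method of [14], or EB-TCD [30]; the grid cells, speed values, and the threshold of −2 are hypothetical.

```python
from statistics import mean, stdev

# Generic Z-score anomaly sketch; cell ids, speeds, and threshold are
# hypothetical and do not reproduce any reviewed algorithm.

def congested_cells(history, current, z_threshold=-2.0):
    """Flag cells whose current speed is unusually low versus their history.

    history: dict mapping cell id -> list of past speeds (km/h)
    current: dict mapping cell id -> latest speed (km/h)
    Returns a dict of flagged cell id -> Z-score.
    """
    flagged = {}
    for cell, past_speeds in history.items():
        if cell not in current or len(past_speeds) < 2:
            continue  # not enough data to estimate variability
        sigma = stdev(past_speeds)
        if sigma == 0:
            continue  # constant history: Z-score undefined
        z = (current[cell] - mean(past_speeds)) / sigma
        if z <= z_threshold:
            flagged[cell] = z
    return flagged


history = {
    (0, 0): [50.0, 52.0, 48.0, 51.0, 49.0],
    (0, 1): [30.0, 32.0, 28.0, 31.0, 29.0],
}
current = {(0, 0): 20.0, (0, 1): 30.0}
flags = congested_cells(history, current)
```

Here only cell (0, 0) is flagged, because its current speed lies far below its own historical mean, while cell (0, 1) is slow in absolute terms but normal for that location.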

3.2.5. Hybrid Methodologies in Traffic Congestion Detection

Hybrid AI approaches represent a sophisticated tool in congestion detection, strategically combining complementary machine learning paradigms to overcome the limitations of single-method systems. As defined by our classification protocol (Figure 4), true hybrid systems integrate two or more substantively distinct AI methodologies, where each contributes core functionalities to the detection process. A summary of the key studies analyzed under this protocol is provided in Table 7. This fusion paradigm proves particularly valuable when confronting complex urban environments where no single technique suffices.
  • The study [31] demonstrates how statistical anomaly detection (generalized ESD test) and supervised learning (XGBoost NLP) operate in concert. While the statistical layer flags congestion events through real-time speed deviations, the machine learning component interprets causes via social media semantics. This dual-stage architecture achieved 95.2% accuracy in identifying and explaining non-recurrent congestion—a task where single-paradigm models typically fail.
  • Deep–Shallow Fusion for Visual Congestion Classification: The study [19] introduced a hybrid vision-based framework combining deep feature extraction via ResNet101 and classification using a support vector machine (SVM). The model utilized real-time urban surveillance video streams from the UCSD and NU1 datasets under diverse environmental conditions. Congestion detection was approached through a dual-path system: deep residual learning captured texture features such as vehicle contours and crowd density, while motion features were derived from preserved vehicle trajectories to capture temporal stagnation indicative of congestion. These features were embedded into a Learning-to-Rank (LTR) framework, and the final classification into light, medium, or heavy congestion was handled by an SVM. The model achieved 97.64% accuracy, illustrating the benefits of combining the expressive power of deep learning with the precision and robustness of shallow classifiers. However, computational complexity and real-time deployment challenges remain open concerns.

3.2.6. Other Methodologies in Congestion Detection

Beyond traditional machine learning and deep learning paradigms, several studies have adopted alternative methodologies to tackle traffic congestion detection, focusing on heuristic, spatiotemporal, or multimetric optimization frameworks. These methods offer flexible solutions in scenarios where system-wide coordination, infrastructural distribution, or adaptive routing strategies are central to the congestion management challenge. The reviewed studies in Table 8 exemplify two such directions: intelligent routing-based congestion avoidance and dynamic spatiotemporal clustering.
  • Fog–Cloud Intelligent Routing with Real-Time Congestion Avoidance: In [38], the ReFOCUS+ system was introduced as a real-time route guidance and congestion avoidance framework, combining a fog–cloud architecture and dynamic congestion estimation. Operating within SUMO-simulated urban networks (Toronto, UBC, Los Angeles), ReFOCUS+ computes a road weight measurement (RWM) per road segment, integrating the travel time, congestion severity, and historical traffic flow. Using a distributed computation model—where RSUs handle local congestion detection and cloud layers manage global optimization—the system proactively reroutes vehicles to avoid both present and anticipated bottlenecks. Experimental evaluation showed substantial improvements: the travel time was reduced by up to 66%, and fuel/CO2 emissions dropped by 30–50%, outperforming standard routing protocols. However, the approach assumes full RSU deployment and 100% compliance from drivers, which may not yet be feasible in large-scale real-world deployments. Nevertheless, it exemplifies a powerful use case reflecting heuristic, system-wide congestion mitigation strategies.
  • Spatiotemporal Clustering with Real GPS Trajectories: The study [42] tackled congestion detection through a density-based moving object clustering approach using both real GPS traces (5.6 million points from 7000 taxis in Wuchang, China) and simulated VISSIM data. Unlike traditional fixed-sensor models, this method classifies congestion dynamically by tracking spatiotemporal clusters of slow-moving vehicles, thereby identifying both the extent and duration of traffic jams across road networks. The algorithm achieved an F1-score of 0.78, demonstrating robustness even with sparse GPS data. Its map-agnostic nature and reliance solely on vehicle motion patterns make it highly scalable and cost-effective for smart city deployments. Future work aims to incorporate real-time parallel processing and congestion propagation modeling to further improve its responsiveness and granularity.
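The density-based idea of grouping slow-moving vehicles can be sketched with a small DBSCAN-style routine applied to a single time snapshot. This is a generic illustration and not the moving-object clustering algorithm of [42], which also tracks clusters over time; the coordinates, the speed cutoff, `eps`, and `min_pts` are hypothetical.

```python
from math import hypot

# DBSCAN-style sketch for one snapshot; parameters and data are hypothetical
# and this does not reproduce the method of [42].

def cluster_slow_vehicles(points, speed_limit=10.0, eps=50.0, min_pts=3):
    """Group slow vehicles into density-based clusters.

    points: list of (x_m, y_m, speed_kmh) tuples
    Returns a list of clusters, each a sorted list of indices into `points`.
    """
    slow = [i for i, (_, _, v) in enumerate(points) if v <= speed_limit]

    def region(i):
        # All slow vehicles within eps metres of vehicle i (including i itself).
        xi, yi, _ = points[i]
        return [j for j in slow if hypot(points[j][0] - xi, points[j][1] - yi) <= eps]

    labels = {}    # index -> cluster id, or -1 for noise
    clusters = []
    for i in slow:
        if i in labels:
            continue
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later join a cluster as a border point
            continue
        cluster_id = len(clusters)
        clusters.append([i])
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels.get(j, -1) != -1:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            clusters[cluster_id].append(j)
            nearby = region(j)
            if len(nearby) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nearby)
    return [sorted(c) for c in clusters]


snapshot = [
    (0.0, 0.0, 5.0), (20.0, 0.0, 4.0), (40.0, 0.0, 6.0), (60.0, 0.0, 3.0),  # slow queue
    (30.0, 5.0, 60.0),    # fast vehicle passing the queue
    (500.0, 500.0, 2.0),  # isolated slow vehicle
]
jams = cluster_slow_vehicles(snapshot)
```

The queue of four slow vehicles forms one cluster, while the fast vehicle and the isolated slow vehicle are ignored. No road map is needed, which mirrors the map-agnostic property highlighted above.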
To statistically synthesize performance trends, we conducted a meta-analysis using forest plots (Figure 5), aggregating 23 unique studies with accuracy-prioritized metrics. Deep learning (DL) emerged as the most prevalent approach (n = 14 studies), achieving strong mean accuracy (0.926 [0.926–1.000]), albeit with significant performance variation, reflecting architectural heterogeneity. Rule-based models demonstrated exceptional consistency (0.975 [0.926–1.000]) despite fewer studies, while shallow ML (0.965 [0.916–1.000]) and hybrid approaches (0.964 [0.926–1.000]) also showed strong performance. This synthesis confirms DL’s dominance in terms of research volume while quantitatively validating that specialized approaches can outperform others in constrained contexts.
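As an aid to reading the bracketed intervals, the sketch below computes a group mean with a normal-approximation confidence interval clipped to the valid accuracy range [0, 1]. The review does not state how its intervals were computed, so the clipping is only one possible reason for upper bounds of 1.000, and the input accuracies below are hypothetical rather than values from the reviewed studies.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hedged sketch: normal-approximation CI, clipped to [0, 1]. The method used
# in the review's forest plots is not stated; inputs here are hypothetical.

def mean_with_ci(accuracies, confidence=0.95):
    """Return (mean, lower, upper) for a list of accuracies in [0, 1]."""
    n = len(accuracies)
    m = mean(accuracies)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * stdev(accuracies) / sqrt(n)
    return m, max(0.0, m - half_width), min(1.0, m + half_width)


group_mean, lower, upper = mean_with_ci([0.90, 0.95, 1.00])
```

With few studies per group, the interval is wide and can reach the ceiling of 1.0, so groups with small n should be compared with caution.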

4. Discussion

The analysis of 44 peer-reviewed studies shows an evolving landscape in traffic congestion detection, where model effectiveness is inherently tied to the data type, sensor modality, and deployment context. It also shows significant disparities in model applicability across city development tiers. For instance, in high-resource cities (e.g., London, New York), infrastructure-intensive solutions like spatiotemporal CCTV networks [5,22] and GNNs [3] dominate due to reliable power and connectivity. Conversely, in developing urban areas (e.g., Giza [16], Wuchang [42]), low-cost alternatives prevail: audio-based detection [10] in camera-scarce zones and probe-based methods [27,38] where GPS penetration exceeds sensor coverage. Critically, gaps persist in informal settlements where irregular travel patterns and data scarcity (e.g., motorcycle-dominated traffic in Southeast Asia) challenge existing models. Future research must prioritize adaptive frameworks for low-infrastructure environments, accounting for heterogeneous vehicle mixes and sparse data.
This section synthesizes the reviewed methodologies to provide data-aware guidance for model selection and implementation, offering a roadmap for researchers seeking to optimize detection systems across diverse real-world scenarios.

4.1. Spatiotemporal Data: The Deep Learning–Shallow ML Spectrum

This section presents a structured synthesis of model performance for spatiotemporal traffic congestion detection, based on 32 reviewed studies. For scenarios requiring real-time performance, a low computational overhead, and interpretability, shallow machine learning (SML) models—specifically decision trees combined with optical flow—achieve the highest performance, with reported accuracy of 99%, precision of 98%, and recall of 98% [8]. These characteristics make SML models suitable for deployment in edge-based or embedded systems. In contrast, deep learning (DL) models are more effective in environments characterized by high variability, such as complex urban settings and adverse weather. The LFE-YOLOv8 model, integrated with RAGFNet, achieved 99.7% accuracy [5], demonstrating strong performance under diverse visual conditions. Similarly, hybrid deep models, such as the multi-branch CNN combined with YOLOv8 and visual information fusion modules (VIFMs), reached 98.61% accuracy [39], showing promise for applications requiring both a high spatial resolution and contextual learning. Probabilistic reasoning techniques, represented by Bayesian tensor factorization, achieve lower overall performance, with accuracy of 88.03% [4]. These methods are more suitable in applications where uncertainty quantification is essential. Additionally, rule-based computer vision models, such as those combining GMM- and GFM-based background subtraction with edge detection, reached 100% accuracy in simulation environments [12]. However, their generalizability to real-world conditions remains limited due to a lack of empirical field testing. Lightweight and interpretable models are also critical for resource-constrained environments. SA-MobileNetV2 combined with Grad-CAM achieves 98.58% accuracy [33] and offers the advantage of model explainability, making it a viable candidate for IoT and low-power deployments. 
In summary, decision tree-based SML models are recommended for real-time and embedded systems, while CNN and hybrid DL models are more appropriate for high-dimensional, variable scenarios. Probabilistic and rule-based models serve specialized roles but require careful consideration of the deployment context. Lightweight explainable DL architectures offer a compromise between accuracy and interpretability. Table 9 summarizes the top-performing models to support informed model selection based on empirical results.
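Since the comparison above rests on accuracy, precision, and recall, a short sketch of how these metrics (and the F1-score) follow from a binary confusion matrix may help. The labels below are hypothetical, with 1 meaning "congested" and 0 meaning "free-flowing".

```python
# Standard binary classification metrics; the example labels are hypothetical.

def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Accuracy alone can be misleading when congestion events are rare, which is why studies that also report precision, recall, or F1 are easier to compare.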

4.2. Probe Data: Robustness, Sparsity, and Topological Insights

For traffic congestion detection using probe data, the performance and methodological diversity vary significantly across the six reviewed studies. The top-performing model is based on cellular network data, where a rule-based statistical method using the generalized ESD outlier test achieved approximately 95% accuracy [27]. This approach demonstrates strong applicability for large-scale deployments without relying on dedicated sensors. Among shallow machine learning approaches, the FogJam framework employed DBSCAN clustering on VANET-simulated data and achieved a 75.61% bandwidth reduction, showcasing high efficiency in communication-constrained vehicular networks [6]. While not evaluated via classical accuracy metrics, its effectiveness in dynamic real-time clustering is notable. The only hybrid approach observed combines shallow ML with rule-based logic. This method, applied to historical GPS data and enhanced by VANET-based rerouting, offered strong qualitative performance, although the quantitative accuracy was not explicitly reported [9]. Statistical and graph-based methodologies also present innovative alternatives. The State Propagation Algorithm (SPA) applied to GNSS data identified structural congestion patterns in urban networks, providing insights into congestion propagation across looped and tree-like traffic structures [13]. Similarly, the map-independent cell-based framework reconstructed road networks into grids using low-frequency GPS data, enabling congestion detection without reliance on detailed geographic information [14]. In summary, the best results from probe data studies indicate that statistical outlier detection methods on passive data streams offer strong accuracy and scalability. Shallow ML clustering approaches are advantageous for decentralized architectures like VANETs, while hybrid methods and graph/statistical frameworks provide structural or operational insights even in the absence of precision metrics. 
The findings summarized in Table 10 highlight the value of method selection that aligns with data fidelity, coverage, and deployment constraints in smart transportation systems.

4.3. Hybrid and Multimodal Data: Fusing Realism with Simulation

Across the reviewed literature, several studies employed multimodal or hybrid data sources to enhance the congestion detection robustness and adaptability. Among shallow learning approaches, ref. [10] stands out, where a random forest classifier applied to audio data achieved up to 99% accuracy and 100% precision—demonstrating that non-visual, low-cost sensing can be both effective and scalable in constrained environments. For deep or hybrid frameworks, ref. [35] proposed a fog-to-cloud architecture integrating spatiotemporal and probe data, targeting distributed detection in smart city contexts. Ref. [31] combined traffic flow data with event-based statistical reasoning to detect non-recurrent congestion, while ref. [42] utilized spatiotemporal clustering on GPS trajectories to enable congestion mapping without reliance on geospatial infrastructure. Overall, shallow ML remains suitable for lightweight or low-infrastructure settings, while deep and hybrid approaches offer scalable solutions capable of handling the complexity inherent in multimodal systems. An overview of the strengths and limitations of each method is provided in Table 11.
To provide researchers and practitioners with actionable guidance for model selection, Table 12 presents a comprehensive model–data–scenario matrix based on our systematic analysis. This synthesis identifies the most effective AI methodologies for each data category and deployment context, highlighting their standout strengths while acknowledging implementation constraints. The matrix serves as a quick reference guide for the design of context-appropriate congestion detection systems.

4.4. Deployment Feasibility: Bridging Algorithmic Performance and Urban Realities

Deployment feasibility represents the critical bridge between algorithmic performance and real-world impact, a dimension that is frequently undervalued in the congestion detection literature. Our synthesis of 44 studies (2020–2025) reveals profound asymmetries between the reported metrics and operational viability, particularly within the emerging IoT and smart intersection contexts. Three deployment barriers consistently emerge as decisive constraints.
  • Computational Asymmetry: GPU-dependent models (200 W/node) create unsustainable energy footprints at the city scale, while edge devices fail beyond 15 FPS during peak traffic.
  • Privacy–Compliance Gaps: In total, 92% of vision systems lack GDPR safeguards, creating legal risks that are absent in aggregated-data approaches (2.1/10 risk index).
  • Maintenance Scalability: Retraining cycles (35–60 h/month) are compounded with environmental fragility—fog latency (15 ms), camera occlusion, and hardware drift.
These asymmetries manifest most acutely in smart intersections, where real-time demands collide with heterogeneous IoT ecosystems. As Table 13 empirically demonstrates through representative implementation cases, the readiness varies dramatically:
  • Rule-based systems achieve TRL 8–9 using legacy infrastructure (e.g., 500+ Seoul nodes);
  • Meanwhile, DL models remain constrained to TRL 4–6 by hardware dependencies ($12 k/intersection).
The proliferation of IoT sensors and Internet of Vehicles (IoV) technologies has revolutionized traffic data acquisition, enabling granular, real-time monitoring through embedded road sensors, connected vehicles, and mobile probes. Our analysis reveals that these technologies enhance AI-driven congestion detection in three key ways.
  • Enhanced Data Fidelity: Vehicle-to-Everything (V2X) systems [6,26,34,38] fuse low-latency data from vehicles (GPS, speed, routing hops) and infrastructure (roadside cameras), improving the detection accuracy by 6–15% in hybrid models (e.g., ReFOCUS+ [38]).
  • Real-Time Responsiveness: Edge–fog architectures (e.g., FogJam [6]) process data locally, reducing the latency to <15 ms. This enables rapid congestion mitigation (e.g., VANET rerouting [9,26]), reducing the average travel time by 40–66% [38].
  • Scalability in Sparse Environments: Cellular probes [27] and sparse GPS [45] achieve >88% accuracy even with 10% data penetration, outperforming infrastructure-dependent models.
However, these advancements amplify the inherent trade-offs between performance metrics and deployment viability.
  • Accuracy vs. Efficiency: IoT-enabled DL models (e.g., 99.7% in LFE-YOLOv8 [5]) demand 200 W/node, reducing edge responsiveness below 15 FPS. Lightweight alternatives (SA-MobileNetV2 [33]) maintain >98.5% accuracy at <5 W/node but sacrifice temporal scope.
  • Privacy vs. Granularity: V2X systems [26,38] enable real-time routing but rely on aggregated probes [27], obscuring lane-level details to ensure GDPR compliance (risk index: 2.1/10).
  • Simplicity vs. Adaptability: Rule-based deployments [12] achieve TRL 8–9 readiness but fail in dynamic conditions (accuracy drops by 25% without calibration), while adaptive DL hybrids incur higher costs.
These trade-offs necessitate context-driven prioritization: high-accuracy DL for critical corridors, privacy-first aggregated data for urban zones, and lightweight models for resource-constrained nodes. Future work must address interoperability gaps via standardized V2X protocols and federated learning.
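The context-driven prioritization described above can be expressed as a small decision helper. This is only a sketch: the 200 W figure is taken from the power demand quoted in this section, while the function name, the rule ordering (privacy first), and the returned labels are assumptions made for illustration.

```python
# Hypothetical decision helper. The 200 W threshold comes from figures quoted
# in this section; rule order and labels are illustrative assumptions.

def recommend_model_family(power_budget_w, privacy_sensitive, critical_corridor):
    """Suggest a model family for one deployment node."""
    if privacy_sensitive:
        # Privacy-first zones: avoid identifiable imagery altogether.
        return "aggregated probe data with statistical detection"
    if critical_corridor and power_budget_w >= 200:
        # Enough power for GPU-class inference on a high-priority corridor.
        return "high-accuracy deep learning"
    # Default for resource-constrained nodes (lightweight models are
    # reported above at under 5 W per node).
    return "lightweight model"


choice = recommend_model_family(power_budget_w=250, privacy_sensitive=False, critical_corridor=True)
```

In practice, such rules would be calibrated against local constraints such as cost per intersection and maintenance capacity rather than fixed thresholds.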

4.5. Ethical Implications: Beyond Technical Performance

While our analysis has mainly focused on computational efficiency and deployment readiness, urban sensing systems raise important ethical issues that deserve attention. Camera- and GPS-based methods, especially those using deep learning, exhibit three main concerns.
  • Algorithmic Bias and Fairness: Vision systems trained on limited or biased datasets can perform poorly for certain groups. For example, studies show up to a 15% drop in accuracy when detecting pedestrians with darker skin tones in low-light conditions [47]. Similarly, GPS data in areas with poor signal coverage, like low-income neighborhoods, can misrepresent traffic patterns [14]. These issues risk creating unfair congestion estimates and solutions that do not serve all communities equally.
  • Surveillance Risks: Many camera systems store identifiable footage for longer than necessary [48], and this can be used to track people’s movements. This has sparked public concern, as seen in Budapest, where a referendum challenged ongoing traffic monitoring practices [30].
  • Distributive Justice: Deployment often favors commercial or busy corridors over residential areas, diverting resources and attention [43]. Additionally, GPS-based congestion pricing may unfairly impact gig economy workers who operate in congested zones [49].
Beyond bias and surveillance, the opaque, black-box nature of deep learning models poses significant adoption barriers in traffic management, where operational trust requires transparent decision logic. Current approaches mitigate this through visual explainability (e.g., Grad-CAM in SA-MobileNetV2 highlighting congestion zones on highways), hybrid rationalization (ResNet101+SVM providing class-specific rules for congestion levels), and real-time feature attribution (DRL models with SHAP values quantifying weather/flow impacts). These techniques, demonstrated in studies achieving >98% accuracy, enhance operator trust by exposing model reasoning but incur large computational overheads (e.g., +15% inference latency in ref. [33]). Standardizing such explainable AI (XAI) frameworks is critical in balancing accuracy and auditability in public infrastructure deployments.

5. Key Synthesis and Research Imperatives

This systematic review identifies three fundamental principles for effective AI-driven congestion detection, based on a comprehensive analysis of 44 studies. It further proposes a prioritized research agenda addressing critical gaps observed across the literature. The synthesis below distills key insights to guide future advancements in the field.

5.1. Core Insights from Comprehensive Analysis

Our cross-study analysis reveals that no single AI paradigm consistently outperforms others in congestion detection. Instead, model effectiveness is shaped by three key factors.
  • Data Modality: Deep learning (DL) excels when processing rich spatiotemporal data streams such as CCTV footage, whereas shallow machine learning (SML) and probabilistic models perform better in probe-based or sparse data settings (e.g., Markov–fuzzy hybrids achieve 88.6% accuracy at 10% GPS penetration).
  • Infrastructure Maturity: Cities with abundant resources benefit from DL’s high accuracy (e.g., YOLOv8 reaches 99.7% accuracy), while resource-constrained regions rely on SML’s efficiency (e.g., decision trees maintain 99% accuracy at under 5 W per node).
  • Operational Trust: Legacy rule-based systems have reached high technology readiness levels (TRL 8–9), supporting reliable deployments. By contrast, DL hybrid models (TRL 4–6) require integrated explainability mechanisms (such as Grad-CAM) to gain operator acceptance.
Pure approaches struggle in complex urban environments. Hybrid models combining complementary strengths—like DL for feature extraction and SML for interpretability (e.g., ResNet101 + SVM with 97.64% accuracy)—demonstrate up to 23% improved robustness against occlusions, weather variability, and noisy data. Importantly, these hybrids uniquely balance accuracy with transparency, a critical requirement for public infrastructure systems.
Ethical considerations remain an urgent challenge. Over 92% of vision-based models fail to meet GDPR privacy standards, while probe-based methods often perpetuate spatial biases affecting underserved communities. Future system designs must embed privacy-preserving architectures, fairness validation protocols, and equitable resource allocation strategies from the outset.

5.2. Future Research Directions

Building on the identified gaps (Tables 9–11), we propose the following priority areas.
1. Edge Intelligence Revolution
  • Develop hardware-aware AI capable of overcoming computational asymmetries: lightweight DL models (e.g., SA-MobileNetV2) should extend beyond fixed highway applications to support temporal dynamics on ultra-low-power microcontrollers (<1 W).
  • Automate recalibration processes by replacing manual threshold tuning in rule-based systems with reinforcement learning techniques, potentially reducing the maintenance time from 35–60 h monthly to less than 5 h.
2. Generalizability in Data-Scarce Environments
  • Design map-agnostic frameworks that leverage “congestion cell” concepts to support informal settlements with irregular road layouts, utilizing satellite-augmented probe data to reduce the reliance on precise mapping.
  • Advance synthetic-to-real transfer learning by generating multimodal simulated datasets (e.g., SUMO plus audio), facilitating model training in low-frequency GPS coverage areas and mitigating labeling bottlenecks.
3. Standardized Multimodal Fusion
  • Establish universal, ISO-compliant validation protocols for the fusion of heterogeneous data sources (e.g., LiDAR and audio), thereby addressing current ad hoc integration challenges.
  • Develop encrypted explainable AI pipelines—such as privacy-preserving Grad-CAM implementations—to enable congestion auditing without exposing sensitive raw data in public surveillance contexts.
4. Trust-Centric Deployment
  • Mandate the comprehensive real-world stress testing of models under diverse and extreme conditions (e.g., monsoons, public events), reducing the reliance on purely simulation-based benchmarks.
  • Embed dynamic fairness monitoring and bias mitigation strategies, such as adaptive loss reweighting during edge inference, to ensure equitable detection performance across demographic groups.
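To make the idea of adaptive loss reweighting in the last item more concrete, the sketch below shows one possible scheme: an exponentiated-gradient update in the style of group distributionally robust optimization, which shifts weight toward groups with higher loss. This is an assumption chosen for illustration, not a method taken from the reviewed studies; the group names, the `step_size` value, and the idea of applying the update during periodic on-device fine-tuning are likewise hypothetical.

```python
import math


def update_group_weights(weights, group_losses, step_size=0.5):
    """One exponentiated-gradient step: groups with higher loss gain weight.

    weights and group_losses are dicts keyed by group name; the returned
    weights are renormalized so that they sum to 1.
    """
    scaled = {g: weights[g] * math.exp(step_size * group_losses[g])
              for g in weights}
    total = sum(scaled.values())
    return {g: w / total for g, w in scaled.items()}


def reweighted_loss(weights, group_losses):
    """Weighted average of the per-group losses."""
    return sum(weights[g] * group_losses[g] for g in weights)


if __name__ == "__main__":
    # Hypothetical per-group detection losses (e.g., by district type).
    losses = {"central": 0.2, "peripheral": 0.8}
    weights = {"central": 0.5, "peripheral": 0.5}

    new_weights = update_group_weights(weights, losses)
    print(new_weights)
    print(reweighted_loss(new_weights, losses))
```

After the update, the worse-served group carries more weight in the training objective, so subsequent optimization steps are pushed toward reducing its error. Whether such a scheme is feasible on edge hardware depends on how often models can be fine-tuned in the field.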

6. Conclusions

This systematic review has analyzed the AI methodologies applied to traffic congestion detection and management. By reviewing 44 studies through the SPAR-4-SLR protocol, we identified key trends, strengths, and limitations of AI techniques across three major categories: shallow machine learning (SML), deep learning (DL), and probabilistic reasoning (PR). Shallow machine learning offers simplicity and is effective for structured data, but it lacks the flexibility required for complex urban traffic patterns. Deep learning excels in handling large volumes of unstructured data, enabling sophisticated real-time traffic analysis, but it demands significant computational resources and extensive datasets. Probabilistic reasoning, although less commonly used, shows promise in enhancing predictions in dynamic traffic scenarios when integrated with other AI techniques. This study underscores the potential of hybrid AI models that combine the strengths of these methodologies to address the multifaceted challenges of traffic congestion. Moving forward, enhancing data collection and exploring multimodal data integration are crucial in maximizing the efficacy of AI-based traffic management systems. This review highlights current achievements and charts a course for future research aimed at refining and expanding AI’s role in mitigating traffic congestion.

Author Contributions

Conceptualization, D.B.; methodology, D.B.; investigation, D.B.; formal analysis, D.B.; writing—original draft preparation, D.B.; review and editing, N.A. and A.E.o.; supervision, Z.C. and K.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a doctoral scholarship from the Centre National pour la Recherche Scientifique et Technique (CNRST), Morocco.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable.

Acknowledgments

The authors gratefully acknowledge the support of a doctoral scholarship from the Centre National pour la Recherche Scientifique et Technique (CNRST), Morocco. The authors also acknowledge the use of Grammarly (Grammarly Inc., San Francisco, CA, USA) for grammar and fluency enhancement during manuscript preparation.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial Intelligence
DL: Deep Learning
PR: Probabilistic Reasoning
SML: Shallow Machine Learning
SPAR-4-SLR: Scientific Procedures and Rationales for Systematic Literature Reviews
TRL: Technology Readiness Level

References

  1. INRIX. Global Traffic Scorecard. 2024. Available online: https://inrix.com/scorecard/ (accessed on 22 March 2025).
  2. Paul, J.; Lim, W.M.; O’Cass, A.; Hao, A.W.; Bresciani, S. Scientific Procedures and Rationales for Systematic Literature Reviews (SPAR-4-SLR). Int. J. Consum. Stud. 2021, 45, O1–O16.
  3. Lee, J.; Lee, S. Separable contextual graph neural networks to identify tailgating-oriented traffic congestion. Expert Syst. Appl. 2024, 254, 124354.
  4. Li, Q.; Tan, H.; Jiang, Z.; Wu, Y.; Ye, L. Nonrecurrent traffic congestion detection with a coupled scalable Bayesian robust tensor factorization model. Neurocomputing 2021, 430, 138–149.
  5. Wang, C.; Shang, Q.; Liu, K.; Zhang, W. Traffic congestion recognition based on convolutional neural networks in different scenarios. Eng. Appl. Artif. Intell. 2025, 148, 110372.
  6. Peixoto, M.; Mota, E.; Maia, A.; Lobato, W.; Salahuddin, M.; Boutaba, R.; Villas, L. FogJam: A Fog Service for Detecting Traffic Congestion in a Continuous Data Stream VANET. Ad Hoc Netw. 2023, 140, 103046.
  7. Sujatha, A.; Suguna, R.; Jothilakshmi, R.; Kaviatha, R.P.; Mujawar, R.Y.; Prabagaran, S. Traffic Congestion Detection and Alternative Route Provision Using Machine Learning and IoT-Based Surveillance. J. Mach. Comput. 2023, 3, 475–485.
  8. Chetouane, A.; Mabrouk, S.; Jemili, I.; Mosbah, M. Vision-based vehicle detection for road traffic congestion classification. Concurr. Comput. Pract. Exp. 2022, 34, e5983.
  9. Chaurasia, B.K.; Manjoro, W.S.; Dhakar, M. Traffic Congestion Identification and Reduction. Wirel. Pers. Commun. 2020, 114, 1267–1286.
  10. Gatto, R.C.; Forster, C.H.Q. Audio-Based Machine Learning Model for Traffic Congestion Detection. IEEE Trans. Intell. Transp. Syst. 2020, 22, 7200–7207.
  11. Yahia, C.N.; Scott, S.E.; Boyles, S.D.; Claudel, C.G. Unmanned Aerial Vehicle Path Planning for Traffic Estimation and Detection of Non-Recurrent Congestion. Transp. Lett. 2022, 14, 849–862.
  12. Liu, G.; Shi, H.; Kiani, A.; Khreishah, A.; Lee, J.Y.; Ansari, N.; Liu, C.; Yousef, M. Smart Traffic Monitoring System using Computer Vision and Edge Computing. arXiv 2021, arXiv:2109.03141.
  13. Jung, J.H.; Eom, Y.H. Empirical analysis of congestion spreading in Seoul traffic network. Phys. Rev. E 2023, 108, 054312.
  14. Song, C.; Wang, Y.; Wang, L.; Wang, J.; Fu, X. Mapping to cells: A map-independent approach for traffic congestion detection. In Proceedings of the International Conference on Smart Transportation and City Engineering (STCE 2023), Nanjing, China, 7–9 November 2025; Mikusova, M., Ed.; p. 102.
  15. Nguyen, T.T.; Calvert, S.C.; Vu, H.L.; Van Lint, H. An Automated Detection Framework for Multiple Highway Bottleneck Activations. IEEE Trans. Intell. Transp. Syst. 2021, 23, 5678–5692.
  16. El-Tawaba, A.H.A.; Fattah, T.A.E.; Mahmood, M.A. A fuzzy-based approach for traffic jam detection. Int. J. Comput. Sci. Netw. Secur. 2021, 21, 257–263.
  17. Swidi, E.; Ardchir, S.; Daif, A.; Azouazi, M. Road users detection for traffic congestion classification. Math. Model. Comput. 2023, 10, 518–523.
  18. Liu, B.; Lam, C.T.; Ng, B.K.; Yuan, X.; Im, S.K. A Graph-Based Framework for Traffic Forecasting and Congestion Detection Using Online Images From Multiple Cameras. IEEE Access 2024, 12, 3756–3767.
  19. Abdelwahab, M.A.; Abdel-Nasser, M.; Hori, M. Reliable and Rapid Traffic Congestion Detection Approach Based on Deep Residual Learning and Motion Trajectories. IEEE Access 2020, 8, 182180–182192.
  20. Amini, M.; Hatwagner, M.F.; Mikulai, G.C.; Koczy, L.T. An intelligent traffic congestion detection approach based on fuzzy inference system. In Proceedings of the 2021 IEEE 15th International Symposium on Applied Computational Intelligence and Informatics (SACI), Timisoara, Romania, 19–21 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 97–104.
  21. Wang, C.; Chen, Y.; Wang, J.; Qian, J. An Improved CrowdDet Algorithm for Traffic Congestion Detection in Expressway Scenarios. Appl. Sci. 2023, 13, 7174.
  22. Wang, L.; Law, K.L.E.; Lam, C.T.; Ng, B.; Ke, W.; Im, M. Automatic Lane Discovery and Traffic Congestion Detection in a Real-Time Multi-Vehicle Tracking Systems. IEEE Access 2024, 12, 161468–161479.
  23. Singh, S.; Soni, K.; Choudhary, R. Highway Traffic Congestion Detection And Evaluation Based On Deep Learning Technique. Soft Comput. 2023, 27, 12249–12265.
  24. Mohanty, A.; Mohanty, S.K.; Jena, B.; Mohapatra, A.G.; Rashid, A.N.; Khanna, A.; Gupta, D. Identification and evaluation of the effective criteria for detection of congestion in a smart city. IET Commun. 2022, 16, 560–570.
  25. Gao, W.; You, S.; Wang, J.; Zhang, S.; Xie, D. Whether and How Congested is a Road: Indices Updating Strategy and a Vision-Based Model. IET Intell. Transp. Syst. 2023, 17, 772–784.
  26. Khan, Z.; Koubaa, A.; Farman, H. Smart Route: Internet-of-Vehicles (IoV)-Based Congestion Detection and Avoidance (IoV-Based CDA) Using Rerouting Planning. Appl. Sci. 2020, 10, 4541.
  27. Li, S.; Cheng, Y.; Jin, P.; Ding, F.; Li, Q.; Ran, B. A Feature-Based Approach to Large-Scale Freeway Congestion Detection Using Full Cellular Activity Data. IEEE Trans. Intell. Transp. Syst. 2020, 23, 1323–1331.
  28. Liu, T.; Zhao, M. The 3D McMaster Algorithm for Traffic Congestion Detection. In Proceedings of the 2020 Chinese Control And Decision Conference (CCDC), Hefei, China, 22–24 August 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 4953–4958.
  29. Liu, Y.; Zhang, Z.; Han, L.D.; Brakewood, C. Automatic Traffic Queue-End Identification using Location-Based Waze User Reports. Transp. Res. Rec. 2021, 2675, 895–906.
  30. Bawaneh, M.; Simon, V. Novel traffic congestion detection algorithms for smart city applications. Concurr. Comput. Pract. Exp. 2023, 35, e7563.
  31. Luan, S.; Ma, X.; Li, M.; Su, Y.; Dong, Z. Detecting and interpreting non-recurrent congestion from traffic and social media data. IET Intell. Transp. Syst. 2021, 15, 1461–1477.
  32. Fazekas, Z.; Obaid, M.; Karim, L.; Gáspár, P. Urban Traffic Congestion Alleviation Relying on the Vehicles’ On-board Traffic Congestion Detection Capabilities. Acta Polytech. Hung. 2024, 21, 7–31.
  33. Lin, C.; Hu, X.; Zhan, Y.; Hao, X. MobileNetV2 with Spatial Attention module for traffic congestion recognition in surveillance images. Expert Syst. Appl. 2024, 255, 124701.
  34. Iskandarani, M.Z. Sensing and Detection of Traffic Status through V2V Routing Hop Count and Route Energy. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 93–100.
  35. Mishra, P.K.; Chaturvedi, A.K. Vehicular Traffic Congestion Detection System and Improved Energy-Aware Cost Effective Task Scheduling Approach for Multi-Objective Optimization on Cloud Fog Network. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 432.
  36. Khan, S.; Ghazal, T.M.; Alyas, T.; Waqas, M.; Raza, M.A.; Ali, O.; Khan, M.A.; Abbas, S. Towards Transparent Traffic Solutions: Reinforcement Learning and Explainable AI for Traffic Congestion. Int. J. Adv. Comput. Sci. Appl. 2025, 16, 503–511.
  37. Paranjothi, A.; Khan, M.S.; Patan, R.; Parizi, R.M.; Atiquzzaman, M. VANETomo: A congestion identification and control scheme in connected vehicles using network tomography. Comput. Commun. 2020, 151, 275–289.
  38. Rezaei, M.; Noori, H.; Mohammadkhani Razlighi, M.; Nickray, M. ReFOCUS+: Multi-Layers Real-Time Intelligent Route Guidance System With Congestion Detection and Avoidance. IEEE Trans. Intell. Transp. Syst. 2019, 22, 50–63.
  39. Jiang, S.; Feng, Y.; Zhang, W.; Liao, X.; Dai, X.; Onasanya, B.O. A New Multi-Branch Convolutional Neural Network and Feature Map Extraction Method for Traffic Congestion Detection. Sensors 2024, 24, 4272.
  40. Jian, C.; Lin, C.; Hu, X.; Lu, J. Selective Scale-Aware Network for Traffic Density Estimation and Congestion Detection in ITS. Sensors 2025, 25, 766.
  41. Sun, Z.; Wang, P.; Wang, J.; Peng, X.; Jin, Y. Exploiting Deeply Supervised Inception Networks for Automatically Detecting Traffic Congestion on Freeway in China Using Ultra-Low Frame Rate Videos. IEEE Access 2020, 8, 21226–21235.
  42. Shi, Y.; Wang, D.; Tang, J.; Deng, M.; Liu, H.; Liu, B. Detecting spatiotemporal extents of traffic congestion: A density-based moving object clustering approach. Int. J. Geogr. Inf. Sci. 2021, 35, 1449–1473.
  43. Marousi, K.P.; Stephanedes, Y.J. Dynamic Management of Urban Coastal Traffic and Port Access Control. Sustainability 2023, 15, 14871.
  44. Utomo, W.; Bhaskara, P.W.; Kurniawan, A.; Juniastuti, S.; Yuniarno, E.M. Traffic Congestion Detection Using Fixed-Wing Unmanned Aerial Vehicle (UAV) Video Streaming Based on Deep Learning. In Proceedings of the 2020 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM), Surabaya, Indonesia, 17–18 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 234–238.
  45. Bouyahia, Z.; Haddad, H.; Derrode, S.; Pieczynski, W. Toward a Cost-Effective Motorway Traffic State Estimation From Sparse Speed and GPS Data. IEEE Access 2021, 9, 44631–44646.
  46. Yang, X.; Wang, F.; Bai, Z.; Xun, F.; Zhang, Y.; Zhao, X. Deep Learning-Based Congestion Detection at Urban Intersections. Sensors 2021, 21, 2052.
  47. Li, X.; Chen, Z.; Zhang, J.; Sarro, F.; Zhang, Y.; Liu, X. Bias behind the wheel: Fairness analysis of autonomous driving systems. arXiv 2023, arXiv:2308.02935.
  48. Königs, P. Government surveillance, privacy, and legitimacy. Philos. Technol. 2024, 35, 2022.
  49. Li, S.; Yang, H.; Poolla, K.; Varaiya, P. Spatial pricing in ride-sourcing markets under a congestion charge. Transp. Res. Part B Methodol. 2021, 152, 18–45.
Figure 1. The SPAR-4-SLR scheme.
Figure 2. Distribution of real, simulated, and hybrid datasets across all data source categories.
Figure 3. Comparative performance of AI model families across key metrics.
Figure 4. Decision protocol for hybrid classification.
Figure 5. Forest plot of model performance meta-analysis (95% CIs).
Table 1. Summary of dataset sources and key attributes in AI-based traffic congestion detection.
Data Source | Studies | Subtype | Reported Accuracy | Mean Accuracy (%) | Availability | Real-Time Suitability | Cost
Spatiotemporal | 19 [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21] | Real | Moderate to High (61.03–100%) | 94.01% | Low–Medium | High | High
Spatiotemporal | 10 [22,23,24,25,26,27,28,29,30,31] | Simulated | High (88.5–100%) | 93.2% | High | Low | Low
Spatiotemporal | 3 [32,33,34] | Real + Simulated | High (96–98.1%) | 97.05% | Medium–High | Medium | Medium
Probe | 5 [35,36,37,38,39] | Real | High (95%) | 95% | Medium | High | Medium
Probe | 1 [40] | Simulated | High | N/A | High | Low | Low
Hybrid | 1 [41] | Real | High (95.2%) | 95.2% | Low | Medium | High
Hybrid | 1 [42] | Simulated | High | N/A | Low | Low | Low
Hybrid | 2 [43,44] | Real + Simulated | Moderate (78%–88.6%) | 83.3% | Medium | Medium | Medium
Other | 1 [45] | Audio (Real) | High (89%–99%) | 94% | High | Medium | Low
Other | 1 [46] | Static road profile | High | N/A | High | Low | Low
Table 2. Overview of machine learning-based models for traffic congestion detection.
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used
[6] | 2023 | Simulated V2V data using SUMO + OMNeT++ + Veins | DBSCAN, X-Means | Bandwidth reduction: 75.61% (DBSCAN), 60.85% (X-Means) | High | 6 levels (freeflow, reasonably freeflow, stable flow, approaching unstable flow, unstable flow, breakdown flow)
[7] | 2023 | IoT cameras + sensors | HDBSCAN | Wait time reduction (50%), rerouting efficiency (95%), response speed (5× faster), queue length reduction (3 km → 1 km), emergency vehicle delay (10 s) | High | 2 levels (congested/non-congested)
[8] | 2020 | CCTV cameras | Decision tree | Precision (98%), recall (98%), accuracy (99%), specificity (100%) | Medium | Three levels: light, medium, heavy
[9] | 2020 | Vehicle GPS | Trajectory clustering, threshold-based detection, VANET rerouting | High accuracy | Medium | Binary (congested/non-congested)
[10] | 2020 | Microphones | Random forest classifier (MFCC audio features) | Accuracy (89–99%), precision (100%), recall (94%) | High | Binary (congested/freeflow)
[24] | 2022 | Urban traffic in SUMO (real map of Bhubaneswar imported from OSM) | K-means clustering on vehicle metrics + Analytical Hierarchy Process (AHP) | Consistency ratio 1.67% in AHP (within 0.1 threshold) | Moderate | Binary (congested, non-congested)
[29] | 2021 | Collected Waze user reports (geo-tagged) in Knoxville, TN for 34 congestion events | Spatial-Temporal Density-Based Spatial Clustering of Applications with Noise (ST-DBSCAN) | Waze-based queue-end detection was on average only 1.1 min later than sensor detection, with similar spatial coverage of the queue (1.9 vs. 1.8 detection points per mile) | High | Binary (congested, non-congested)
[35] | 2024 | Sensor-based vehicular data in cloud-fog network simulations (iFogSim simulator) | Whale Optimization Algorithm | Energy consumption: best result with 50 IoT devices (500 tasks): 176,916.18 W; cost: best result: $810,188.88. ECTS scheduler cut energy by 6.6% and cost by 13.4% vs. genetic algorithm and −13.8% energy, −18.5% cost vs. round robin | Medium | Binary (congested, non-congested)
Table 3. Overview of deep learning-based models for traffic congestion detection (part 1).
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used
[3] | 2024 | Simulated (12,000 multivariate time-series samples of 45 metrics across 160 robots) | Separable Contextual Graph Neural Network (SC-GNN) | F1-score: 0.885; Recall: 0.701; AUC: 0.892; FPR: 0.102 | Medium | 2 levels (non-congested, congested)
[5] | 2024 | Real-time (SOTS and HSTS) and CCTRIB | CNN (LFE-YOLOv8 + RAGFNet) | RAGFNet (PSNR: 24.6, SSIM: 0.89); LFE-YOLOv8 (Precision: 99.7%, Recall: 97.7%, F1: 98.6%, Accuracy: 98.2%) | High | 2 levels (congested/non-congested)
[17] | 2023 | MS COCO dataset | YOLOv7 | Precision: 0.53; Recall: 0.55; F1-score: 0.61; mAP: 58.4% | High | Binary (congested or not congested)
[18] | 2023 | Surveillance cameras from different nodes in Macao Peninsula and Taipa from DSAT official website | D2STGNN | MAE: 1.44; RMSE: 1.96; MAPE: 31.13% | High | Binary (congested or not congested)
[21] | 2023 | Surveillance video from NJRY and UA-DETRAC datasets | IBCDet + DeepSORT | AP: 95.30%; MR-2: 24.44%; JI: 76.35%; Accuracy: 91.28% | High | Chinese expressway LoS criteria
[22] | 2024 | Live streaming video feeds (Macao DSAT) | YOLOv8 + mNMS + BoTSORT | Accuracy: 97.4% | High | Binary (clear traffic, congested)
[23] | 2024 | Real highway scene data (presumably images of traffic) | CNN | Accuracy: 94–95% | High | Binary (freeflow, congested)
[25] | 2023 | Collected 27 real traffic camera videos (1080p, 25 FPS) from >20 road segments (LoS A–D) in Hangzhou; created a custom congestion video dataset (110 k congested frames) | Lightweight CNN (YOLOv3-tiny) | Precision: 95.06; Recall: 92.05; F-measure: 93.53; SwitchRate: 2.13; Hit Rate: 0.94 | High | Binary (congested, non-congested)
[26] | 2020 | Simulated traffic on a real map (Alwaha, Riyadh intersection) using SUMO (9 scenarios: varying segment lengths and congestion levels) | Modified EG-Dijkstra | Path cost, travel time, and speed improved by ~80% vs. baseline (MCDP) | High | Binary (congested, non-congested)
[28] | 2020 | Highway loop detector data (Shuijie Expressway, Chongqing) | 3D McMaster algorithm | Detection Rate: 93.25%; False Alarm Rate: 0% | High | Binary (congested, non-congested)
[33] | 2024 | UCSD dataset and self-collected highway camera images | SA-MobileNetV2, Grad-CAM | Accuracy: 98.58%; Precision: 98.86%; Recall: 98.62; F1-score: 98.73 | High | 3 levels (light, medium, heavy)
Table 4. Overview of deep learning-based models for traffic congestion detection (part 2).
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used
[36] | 2025 | Open city traffic data (Kaggle vehicle routing datasets) with features (time, weather, flow, etc.), split into 8000 training and 2000 validation samples | Reinforcement Learning + Explainable AI | Accuracy: 98.10%; missing data rate: 1.90% | Medium | Binary (congested, non-congested)
[39] | 2024 | Chinese City Traffic Image Database (CCTRIB): 9200 labeled images (half congested vs. not) collected from urban cameras under various conditions | MBCNN + YOLOv8 + VIFM | F1: 98.61%; Accuracy: 98.62% | Medium | Binary (congested, non-congested)
[40] | 2025 | New COTRS dataset + UCSD dataset | CNN using SSANet + Optical Flow | MAE: 0.117; Accuracy: 96.06%; F1-score: 95.98%; Precision: 96.14%; Recall: 95.81% | High | Binary (COTRS); 3 classes (UCSD: light, medium, heavy)
[41] | 2020 | Massive freeway CCTV dataset: images from 14,470 highway cameras covering 5215 km in Shaanxi, China | DSIN + APM | Accuracy: 95.77% | High | Binary (congested, non-congested)
[43] | 2023 | Combined field data (traffic counts, signals, port gate service times in Patras, Greece) with microscopic simulation | ANN + Rule-Based | Accuracy: 96.0%; Detection Rate: 81.2%; False Alarm Rate: 1.3% | Medium | Binary (congested, non-congested)
[44] | 2020 | Aerial video captured via fixed-wing UAVs; YouTube video | YOLOv3-style CNN | YouTube: Avg 90.75%; UAV Live: Avg 90.00% | Medium | 3 levels (smooth, congested, jammed)
[46] | 2021 | Surveillance video from Jinan, China, 1280 × 720 resolution | YOLOv3 + LK Optical Flow | Accuracy: 89.5%; mAP: 89.7%; FPS: 44; Precision: 89.2%; Recall: 90.1% | High | 3 levels (smooth, slow, congestion)
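As a reading aid for Tables 3 and 4, the F1-score is the harmonic mean of precision and recall, so rows that report all three values can be cross-checked. The Python sketch below recomputes F1 for three such rows. The helper name `f1_from_precision_recall` is introduced here for illustration, and the check assumes that each study computed its metrics on the same evaluation set, which the tables do not state.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from Tables 3 and 4.
reported = {
    "[25] YOLOv3-tiny": (95.06, 92.05, 93.53),
    "[33] SA-MobileNetV2": (98.86, 98.62, 98.73),
    "[40] SSANet + optical flow": (96.14, 95.81, 95.98),
}

if __name__ == "__main__":
    for label, (p, r, f1_reported) in reported.items():
        f1 = f1_from_precision_recall(p, r)
        print(f"{label}: recomputed F1 = {f1:.2f}, reported = {f1_reported:.2f}")
```

For these three rows the recomputed and reported values agree to within a few hundredths of a percentage point, a gap that is consistent with rounding of the published precision and recall figures.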
Table 5. Overview of probabilistic reasoning-based models for traffic congestion detection.
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used
[4] | 2021 | Real-time (speed, volume, occupancy data from US highways) from Caltrans PeMS | Bayesian Tensor Factorization | Overall accuracy: 88.03% | High | 2 levels (normal, non-recurrent congestion)
[11] | 2021 | UAV (simulated) + fixed sensors | Ensemble Kalman Filter + UAV path optimization | Lower RMSE; high detection success in heavy congestion; lower covariance/variance | Medium | Binary (incident/no incident)
[16] | 2021 | Manual road information compilation for a route in Giza, Egypt | Fuzzy Logic | Ranked roads matched driver preferences; suitability degree evaluated | Low | No explicit congestion level taxonomy
[20] | 2021 | Historical data from Hungarian freeways | Fuzzy Inference System (Mamdani) | Qualitative evaluation via fuzzy surfaces, expert rules, real-world validation | Medium | Seven levels: completely congestion-free, congestion-free, stable, near congestion
[45] | 2021 | GPS/speed data from England highways + SUMO simulation | CGOMFSM (Markov Fuzzy Switching System) | MAPE: 5.90–6.69%; RMSE: 50.10–53.94; Accuracy: 88.6%; Lag: 4.1 min; FA: 16.3% | High | Binary (freeflow, fully congested)
Table 6. Overview of statistical, rule-based, and other AI approaches for traffic congestion detection.
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used
[13] | 2024 | GNSS data from ~10 k taxis via Seoul TOPIS, aggregated into 5 min average speeds | State Propagation Algorithm (effective Z-score + network propagation) | Morning Congestion Ratio ≈ 20% (single roads), <5% (loops); Evening ≈ 50% (single), ≈40% (loops) | High | Binary (congested vs. free)
[14] | 2024 | ~26.8 M GPS records from 10,829 taxis in Xi’an, China (1 August 2012) | Cell-based CV method | Ce values clearly differentiate congestion, especially at lower speeds | High | Binary (congested or freeflow)
[15] | 2021 | Loop detectors (Dutch highway) + synthetic data via FOSIM | Chan–Vese Model + Active Shape Model (ASM) | Accurate at 100–500 m spacing; lower at 1000 m; multiple concurrent bottlenecks detected | Medium | Binary (congested or not congested)
[30] | 2022 | 80 simulated runs on Budapest road network (9 h each) | Traffic Congestion Detector, Ensemble-Based TCD | Detection Rate: 100%, FAR: 0.08%, MTTD = −0.625 | Medium | Binary (congested, non-congested)
[37] | 2020 | NS-2 + SUMO simulation of connected vehicles on highways | VANETomo (statistical inference + routing algorithm) | Packet loss: 3% (vs. 27% baseline); Delay: ~8 ms (vs. 48 ms); Throughput: ~38 Mbps; Lowest channel load | Medium | 3 levels (least, normal, over)
[26] | 2020 | Simulated traffic (Alwaha, Riyadh intersection) using SUMO (9 scenarios) | Modified EG-Dijkstra | Path cost, travel time, and speed improved by ~80% vs. baseline (MCDP) | Medium | Binary (congested, non-congested)
[12] | 2021 | CCTV + edge/cloud nodes | Background subtraction (GFM, GMM) + hybrid analytics | Detection accuracy ~100% | High | Binary (congested/non-congested)
[27] | 2020 | Full cellular activity (FCA) records on Ninghu Freeway, China | Generalized ESD outlier test | High accuracy: ~95% | High | Three levels (freeflow, moderate congestion, severe congestion)
[32] | 2024 | Microscopic Vissim simulation on an urban road under various scenarios | Rule-based triggers | Flow rate ↑~25%; Rerouting benefit: 10–30%; Travel time: qualitative improvement | Medium | Binary (congested, non-congested)
[34] | 2021 | VANET simulation (MATLAB) with hops, routes, energy | Algorithmic pattern analysis of V2V data | ECR: 0.0006–0.0032 J | Medium | Binary (congested, non-congested)
Note: The arrow ↑ indicates a qualitative improvement as reported in the cited literature.
Table 7. Overview of hybrid and mixed AI approaches for traffic congestion detection.
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used
[31] | 2021 | Historical and live traffic speed data from AutoNavi/Amap + microblog posts from Sina Weibo (Twitter-like platform) | Generalized Extreme Studentized Deviate (GESD) + XGBoost Classifier | Accuracy = 95.2%, DR = 95%, FAR = 0.115, MTTD = 13.65 | Medium | Binary (congested/non-congested)
[19] | 2020 | Video cameras (UCSD and NU1 datasets) | ResNet101 + SVM | Accuracy = 97.64% | Medium | Three levels: light, medium, heavy
Table 8. Overview of heuristic and spatiotemporal clustering approaches for traffic congestion detection.
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used
[38] | 2019 | Realistic SUMO simulations on maps of Toronto, UBC, Los Angeles | Dynamic multimetric model (RWM) + TCD metric | Travel time reduced by 40–64%; fuel and CO2 emissions cut by ~28–50% compared to other routing methods | Medium | Binary (congested, non-congested)
[42] | 2021 | GPS-based taxi trajectories (Wuchang, China) + VISSIM simulator | Density-based moving object clustering | F1-score up to 0.78; Precision 0.75; Recall 0.82; RMSE = 0.87 | High | Binary (congested, non-congested)
Table 9. Strengths, limitations, and future directions of selected AI methodologies (1).
Study | AI Methodology | Strengths of the Study | Limitations of the Study | Future Directions
[8] | Decision Tree + Optical Flow | Robust comparison of algorithms; high accuracy | Requires extensive labeling; limited night evaluation | Night-time video, advanced DL methods
[5] | CNN (LFE-YOLOv8 + RAGFNet) | Effective under various weather types; high accuracy (98.2%) | Dependency on image quality; requires defogging step; possible dataset biases | Improve robustness under extreme weather; optimize model size
[4] | Bayesian Tensor Factorization | Robust to noise; training-free; effective anomaly detection | Computational complexity with large datasets | Explore lightweight versions for edge computing
[12] | Background Subtraction + Hybrid Edge Detection | Edge–cloud adaptive strategy; robust detection | Manual thresholds/ROI; limited moderate congestion detection | Deep learning integration; automated calibration; predictive analytics
[39] | Multi-Branch CNN + YOLOv8 + VIFM (MBCNN architecture) | Innovative feature representation (VIFM); improved classification via parallel processing; better occlusion handling | Requires vehicle detection pre-process; error propagation risk | Generalize to other datasets/cities; use stronger vehicle detectors for VIFM
[33] | SA-MobileNetV2 + Grad-CAM | Grad-CAM improves interpretability; efficient for edge deployment (low FLOPs, small model) | Focused on single frame; no temporal modeling; limited to highway scenarios | Incorporate video/temporal modeling; apply GOAMLP-CNN hybrids; use historical and multi-camera data
Table 10. Strengths, limitations, and future directions of selected AI methodologies (2).
Table 10. Strengths, limitations, and future directions of selected AI methodologies (2).
StudyAI MethodologyStrengths of the StudyLimitations of the StudyFuture Directions
 [27]Generalized ESD Outlier TestWide-area coverage without new sensors—uses ubiquitous cell phone signals. Novel features: link pseudo-speed (handover timing) and link probe activity (phone signal density)Relies on cellular provider data; less accuracy for severe congestion (73% vs. 97% for freeflow); imprecise congestion boundariesImprove handling of cellular data uncertainty; extend to urban materials; integrate with flow relationships and other data sources
[6] | DBSCAN (clustering in FogJam) | High efficiency; reduces upstream data volume by 70% | Sensitive to VANET density and connectivity quality | Extend to hybrid edge–cloud architectures
[13] | State Propagation Algorithm (SPA) | Robust method to identify congestion; distinguishes between structural congestion types (tree vs. loop) | Generalizability not validated; possible sampling bias | Develop real-time SPA-based tools for traffic management and mitigation
[9] | Trajectory Clustering + VANET Rerouting | Combines congestion detection with mitigation; enables recurrent congestion analysis | Conducted offline; lacks explicit quantitative validation | Enable real-time prediction; test adaptability and predictive modeling in VANET environments
[14] | Cell-Based CV Method | Map-independent method; avoids reliance on heavy map matching or high-precision maps | Sparse or low-frequency taxi data may reduce accuracy and leave areas unmonitored | Analyze spatiotemporal patterns and propagation of congestion using “congestion cells” framework
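To make the density-based clustering listed for [6] more concrete, the following is a minimal pure-Python sketch of the generic DBSCAN algorithm. It is not the FogJam implementation: the vehicle coordinates, the `eps` and `min_pts` values, and the `summarize` helper are hypothetical, and the idea that a fog node would forward one summary per cluster instead of every raw report is an assumption used only to illustrate how clustering could reduce upstream data volume.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Generic DBSCAN: return one label per point (0, 1, ... or NOISE)."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force region query; the point itself is included.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may still become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels


def summarize(points, labels):
    """Hypothetical upstream payload: (vehicle count, centroid) per cluster."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != NOISE:
            groups.setdefault(lab, []).append(p)
    return {
        lab: (len(ps), tuple(sum(c) / len(ps) for c in zip(*ps)))
        for lab, ps in groups.items()
    }


# Hypothetical positions (metres) of slow-moving vehicles seen by one fog node.
points = [(0, 0), (5, 1), (9, 0), (14, 2), (300, 0), (305, 1), (309, 0), (900, 50)]
labels = dbscan(points, eps=10, min_pts=3)
summary = summarize(points, labels)
print(labels)   # [0, 0, 0, 0, 1, 1, 1, -1]
print(summary)  # two cluster summaries instead of eight raw reports
```

In this toy case eight raw reports collapse into two cluster summaries. The result depends on local vehicle density through `eps` and `min_pts`, which is consistent with the sensitivity to VANET density noted in the table.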
Table 11. Strengths, limitations, and future directions of selected AI methodologies (3).
Study | AI Methodology | Strengths of the Study | Limitations of the Study | Future Directions
[10] | Generalized ESD Outlier Test | Low-cost, non-visual, robust across locations | Binary output only; initial manual labeling needed | Explore multi-microphone setups; integrate temporal modeling and deep learning
[35] | Whale Optimization Algorithm | End-to-end IoT–fog–cloud architecture improves responsiveness; optimizes energy and cost compared to baseline task schedulers | Detection technique not fully detailed; high infrastructure requirements (fog nodes/RSUs) | Include latency/security constraints in scheduling; embed advanced congestion detection at sensor level
[31] | GESD + XGBoost Classifier | Combines statistical anomaly detection (speed vs. historical norms) with machine learning; high detection performance | Needs active social media users; risks from imprecise geotags and timestamps | Deploy in real time; extend to urban roads; automate alert systems when congestion aligns with social media posts
[42] | Density-Based Moving Object Clustering | Accurately detects congestion spatiotemporally; effective even with sparse data; outperforms older clustering techniques | Relies on taxi data; may underperform with other vehicle types or in different regions | Use parallel processing for real-time deployment; study congestion propagation sources
Table 12. Comprehensive model–data–scenario matrix for traffic congestion detection.
Model Family | Spatiotemporal Data | Probe Data | Hybrid/Multimodal Data | Best-Suited Scenario
Shallow ML | Decision Tree + Optical Flow [8] ↑ 99% accuracy, real time | DBSCAN [6] ↑ 75.6% bandwidth reduction | Random forest [10] ↑ 99% audio accuracy (low cost) | Edge devices; low-resource cities; real-time systems
Deep Learning | LFE-YOLOv8 + RAGFNet [5] ↑ 99.7% in adverse weather | — | MBCNN + VIFM [39] ↑ 98.6% occlusion handling | High-infrastructure cities; complex urban environments; adverse conditions
Probabilistic | Bayesian Tensor [4] ↑ Training-free (88% accuracy) | CGOMFSM [45] ↑ 88.6% sparse data (10% penetration) | — | Uncertain environments; low-data regions; highway monitoring
Hybrid AI | ResNet101 + SVM [19] ↑ 97.64% accuracy (three-level detection) | — | — | Video surveillance requiring interpretability
Rule-Based/Statistical | Background subtraction [12] ↑ 100% simulation accuracy | Generalized ESD [27] ↑ 95% cellular coverage | State propagation [13] ↑ structural pattern detection | Legacy systems; privacy-sensitive areas; low-compute zones
Note: ↑ denotes standout strength per context. Empty cells (—) indicate insufficient evidence in reviewed studies. Model counts: spatiotemporal (32), probe (6), hybrid (6).
Table 13. Deployment feasibility assessment with IoT and smart intersection evidence.
Model Family | Computation (Watts/Node) | Privacy Risk Index | Maintenance (h/Month) | Readiness Level | Exemplary Evidence | Mitigation Pathways
Deep Learning | 200 (GPU required) | 9.2/10 | 35–40 (retraining) | Limited | YOLOv8: USD 600/node [22]; 92% lack GDPR compliance [5,46]; IoT: RPi fails at 15 FPS [35]; Smart Int.: USD 12k GPU in Macao [22] | Federated learning [36]; synthetic data augmentation; TensorRT edge optimization
Shallow ML | 2–5 (ARM/MCU) | 5.8/10 | 8–10 (calibration) | High | Random Forest: EUR 120/node (mic + housing) [10]; 65% anonymize probe data [29]; IoT: ESP32 success in VANET (2 ms latency) [34]; Smart Int.: USD 120 controllers [10] | Differential privacy; automated threshold tuning; binarized models for MCUs
Rule-Based | <1 (legacy HW) | 2.1/10 | 0.5–1 (validation) | Very High | ESD: EUR 0 incremental cost (leverages cellular infra) [27]; 100% aggregated data [16]; IoT: 500+ nodes in Seoul (10k taxis) [13]; Smart Int.: Budapest signals [30] | Anomaly detection add-ons; crowdsourced validation; SCADA integration
Hybrid | 10–100 (varies) | 6.7/10 | 50–60 (orchestration) | Moderate | ReFOCUS+: 40% latency (travel time cut) [38]; privacy inheritance risk [31]; Smart Int.: Miami gridlock [43]; IoT: Toronto fog latency (15 ms added) [38] | Standardized MQTT APIs; privacy-preserving fusion; AWS IoT Greengrass
Note: Privacy Risk Index (0 = low, 10 = high) based on PII exposure. Readiness: Limited = TRL 4–6, Moderate = TRL 6–7, High = TRL 7–8, Very High = TRL 8–9. IoT evidence marked in blue. Hardware costs reflect 2024 prices.
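As a rough illustration of how the figures in Table 13 might be used in practice, the sketch below screens the four model families against a deployment's constraints. The numeric values are taken from the table. The `FamilyProfile` class, the `feasible` function, the use of each range's upper bound as a worst case, and the example thresholds are assumptions made for this sketch and are not part of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyProfile:
    name: str
    watts_max: float        # upper bound of per-node power draw (W)
    privacy_risk: float     # Privacy Risk Index, 0 = low, 10 = high
    maint_hours_max: float  # upper bound of maintenance effort (h/month)
    readiness: str          # readiness level as labeled in Table 13


# Upper bounds read from Table 13; "<1 W" is encoded as 1.0 (assumption).
PROFILES = [
    FamilyProfile("Deep Learning", 200.0, 9.2, 40.0, "Limited"),
    FamilyProfile("Shallow ML", 5.0, 5.8, 10.0, "High"),
    FamilyProfile("Rule-Based", 1.0, 2.1, 1.0, "Very High"),
    FamilyProfile("Hybrid", 100.0, 6.7, 60.0, "Moderate"),
]


def feasible(power_budget_w, max_privacy_risk, max_maint_hours):
    """Return the names of families whose worst-case figures fit all limits."""
    return [
        p.name
        for p in PROFILES
        if p.watts_max <= power_budget_w
        and p.privacy_risk <= max_privacy_risk
        and p.maint_hours_max <= max_maint_hours
    ]


# Hypothetical edge deployment: 10 W per node, moderate privacy tolerance,
# at most 12 hours of maintenance per month.
edge_candidates = feasible(power_budget_w=10, max_privacy_risk=6.0, max_maint_hours=12)
print(edge_candidates)  # ['Shallow ML', 'Rule-Based']
```

Comparing against upper bounds is deliberately conservative. A screening like this only narrows the candidate families; accuracy requirements and data availability still have to be weighed separately.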
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Bakir, D.; Moussaid, K.; Chiba, Z.; Abghour, N.; El omri, A. A SPAR-4-SLR Systematic Review of AI-Based Traffic Congestion Detection: Model Performance Across Diverse Data Types. Smart Cities 2025, 8, 143. https://doi.org/10.3390/smartcities8050143
