A SPAR-4-SLR Systematic Review of AI-Based Traffic Congestion Detection: Model Performance Across Diverse Data Types
Abstract
Highlights
- This review analyzes 44 AI-based traffic congestion detection studies using the SPAR-4-SLR methodology.
- It identifies five main model categories and two data types, mapping them to performance metrics across different scenarios.
- The study provides a structured roadmap to guide researchers in selecting appropriate AI models based on data availability and objectives.
- The study contributes to advancing the state of the art in congestion detection by clarifying when and why each model performs optimally.
1. Introduction
2. Materials and Methods
2.1. Assembling
2.2. Arranging
2.3. Assessing
3. Results
3.1. Comparative Evaluation of Data Sources
- Spatiotemporal data formed the foundation for most congestion detection efforts. Prominently featured in studies [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21], real-world sources included traditional loop detectors deployed to monitor speed, volume, and occupancy on expressways [4,6,7] and urban surveillance systems based on video acquisition via roadside cameras [5,8,9]. Several works integrated visual feeds with real-time analytics; for instance, Ref. [5] utilized live video from the SOTS and HSTS datasets, while [9] analyzed UAV-captured footage of critical intersections. Public highway feeds and city monitoring centers provided the necessary infrastructure in many cases, with availability ranging from moderate to high depending on coverage. Despite their effectiveness, these systems remain infrastructure-intensive and costly to scale and maintain.
- Simulated spatiotemporal data, widely explored across studies [22,23,24,25,26,27,28,29,30,31], enabled algorithm testing under controlled conditions, with high availability and minimal costs. Tools such as SUMO (Simulation of Urban Mobility) and NS-3 were frequently used to model traffic dynamics and vehicular communications [22,23,27]. In addition, some studies generated synthetic multivariate sensor datasets emulating real detector outputs [22,26], supporting precise calibration. While simulations offer flexibility and reproducibility, they lack real-world irregularities, thus limiting their transferability to production environments. Hybrid spatiotemporal datasets, applied in [32,33,34], blended real-world traffic sensor data with simulated streams to validate detection models under both authentic and synthetic conditions. For instance, [33] merged actual road topologies with simulated vehicle trajectories to improve model robustness.
- Complementing fixed-sensor systems, probe-based data were examined in six studies [35,36,37,38,39,40]. These included mobile GPS traces, smartphone-derived logs, and V2V communications. Real probe datasets originated from fleet vehicles [35,37], mobile user contributions [38], and connected vehicle telemetry [36,39]. These sources offer wide spatial reach with minimal physical infrastructure, although their effectiveness depends on data penetration and sharing protocols. For example, [36] evaluated LTE-based vehicular communication for the tracking of congestion spread, while [38] compared crowdsourced alert timings to official sensor reports. A simulated probe study [40] tested clustering-based classification on synthetic V2X data. Although typically cost-efficient, proprietary limitations in probe datasets may constrain scalability and replication. Hybrid data sources, explored in [41,42,43,44], combined multiple sensing modalities to enhance the detection resilience. In [41], fixed roadside sensors were fused with mobile signals to cross-validate congestion states. The study [42] showcased a hybrid simulation, while others, such as [43,44], integrated both live and offline streams. These systems counteract the limitations of individual data types but introduce added complexity in architecture and integration.
- Additional sensing strategies were reported in [45,46]. The study [45] used roadside microphones to detect congestion through acoustic signals such as engine noise and honking patterns—an approach that is low-cost and real-time, albeit sensitive to environmental factors. The study [46] leveraged static road profile data and historical congestion maps to support route-level congestion analysis and planning. Although universally accessible and low-cost, such static sources lack temporal granularity and require real-time data fusion for actionable detection. Collectively, the reviewed studies present a comprehensive landscape of traffic data usage in congestion detection. From high-resolution infrastructure feeds to synthetic simulations and mobile crowdsourced data, each source type contributes distinct advantages. Real spatiotemporal and probe datasets offered the most direct deployment value when sufficient infrastructure or user bases existed, whereas simulations supported scalable testing and algorithm development. Hybrid approaches provided robustness at the cost of system complexity, and novel sensing (e.g., audio) introduced promising alternatives in low-resource settings. Ultimately, the interplay between these data types defines the operational scope, scalability, and intelligence of modern congestion detection systems.
3.2. Comparative Evaluation of AI Models
3.2.1. Shallow Machine Learning
- Density-Based Clustering (DBSCAN, ST-DBSCAN, HDBSCAN): This represents a class of unsupervised techniques that are adept at identifying the spatial concentration of vehicles, which often signifies congestion. Unlike partitioning methods, these approaches require no prior knowledge of the cluster count, making them ideal for irregular, dynamic traffic distributions. In [6], DBSCAN was applied to detect congestion zones based on vehicle positioning and mobility data, offering rapid detection without the need for complex preprocessing. A spatial voting mechanism was integrated to reinforce the model’s precision and minimize false detections at the cluster boundaries. The study [7] introduced HDBSCAN to enhance the sensitivity across mixed-density traffic data, allowing the model to adapt to varying urban topologies. Their framework achieved notable improvements in clustering quality and was validated against real-world datasets. In [29], ST-DBSCAN incorporated a temporal dimension, enabling the detection of evolving congestion scenarios by analyzing the speed and temporal relationships between consecutive data points. This allowed for a more dynamic representation of the traffic flow and enhanced the ability to capture transitional congestion patterns over time. Together, these studies highlight the adaptability of density-based models in scenarios with sparse labeling and highly variable traffic patterns.
- Decision Tree: This is a supervised rule-based classification technique that recursively splits the input space using conditions on input variables, forming a tree-like structure. It is known for its simplicity and interpretability, making it suitable for scenarios requiring transparency in decision-making. The study [8] implemented a decision tree on real-time traffic data to infer the congestion status by comparing speed thresholds and flow variations, producing binary output classes representing the congestion severity. The model was designed to support traffic operators through straightforward interpretability, and its decision paths closely matched domain expert logic. Results showed reliable classification with low computational demands, supporting its integration in constrained or embedded environments. This reflects the practical strength of decision trees when interpretability and simplicity are essential, especially in municipal-scale deployments.
- Random Forest: This is an ensemble-based extension of the decision tree, combining multiple trees trained on bootstrapped datasets and aggregating their predictions. It improves the generalization performance and reduces the sensitivity to noise and overfitting. In [10], random forest was employed to classify congestion states using MFCC coefficients extracted from real-time traffic audio signals. The goal was to distinguish between free-flowing and congested environments without relying on visual data sources. The ensemble structure handled environmental noise effectively, and experiments demonstrated high detection accuracy, suggesting its suitability for cities lacking robust camera infrastructure. This illustrates the model’s capacity to absorb noisy, high-dimensional inputs while delivering robust predictions in unconventional sensing environments.
- K-Means with Analytical Hierarchy Process (AHP): This is an unsupervised clustering method that groups data points based on the distance to centroids. When integrated with the Analytical Hierarchy Process (AHP), it benefits from the expert-defined weighting of variables, allowing more context-aware clustering. In [24], the AHP was integrated with K-Means to introduce weighted prioritization among input features such as the average speed, vehicle count, and road occupancy. The framework was developed to classify congestion levels with relevance to road priority and traffic policy. The incorporation of the AHP provided domain-aligned interpretability, and evaluations demonstrated strong agreement between the clustering results and expert-labeled traffic zones, particularly in peak-time conditions. This highlights the ability of integrated heuristic learning approaches to align machine-driven outputs with expert traffic management strategies.
- Whale Optimization Algorithm (WOA): This is a population-based metaheuristic inspired by the bubble-net hunting strategy of humpback whales. It is commonly used in optimizing complex search spaces, where traditional methods struggle. In [35], the WOA was integrated into a fuzzy inference engine to optimize congestion thresholds dynamically in VANET environments. The model adjusted the fuzzy membership boundaries based on the vehicular density and delay parameters. Simulation results showed a significant reduction in false alarm rates and improved classification confidence when compared to traditional thresholding, validating the WOA’s contribution to enhancing fuzzy rule interpretability and performance. This underscores the value of metaheuristic optimization in enhancing adaptive decision-making within resource-constrained vehicular networks.
- Learning-Based Detection with Rule-Guided Mitigation: In [9], GPS-based clustering for congestion detection with a rule-based rerouting mechanism was implemented in a VANET simulation. Using the Microsoft T-Drive dataset, which contains historical taxi trajectories in Beijing, congestion zones were identified through trajectory density analysis and threshold-based logic. Although primarily evaluated through qualitative analysis, the system demonstrated the effective identification of recurrent congestion spots, and the authors tested a mitigation strategy through simulated vehicular rerouting. This integration exemplifies the potential of hybrid systems to not only detect congestion but also initiate real-time control actions within connected vehicle infrastructures. The study’s use of public GPS data highlights the scalability, although real-time adaptability and predictive components remain future targets for enhancement.
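To make the density-based idea from this subsection concrete, the following is a minimal pure-Python sketch of plain DBSCAN applied to vehicle positions. It is not the implementation used in [6], [7], or [29]: the function names, the coordinates, and the `eps` and `min_pts` values are illustrative assumptions, and the spatial voting, hierarchical, and temporal extensions described above are omitted.

```python
from math import hypot


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels


# Hypothetical vehicle positions in metres: two dense groups and one lone vehicle.
vehicles = [(0, 0), (5, 0), (10, 0), (5, 5),
            (200, 200), (205, 200), (210, 205), (205, 205),
            (100, 50)]
labels = dbscan(vehicles, eps=12.0, min_pts=3)
print(labels)
```

With these assumed inputs, the two dense groups receive distinct cluster labels and the isolated vehicle is marked as noise, without the number of clusters being specified in advance.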
3.2.2. Deep Learning
- Convolutional Neural Networks (CNNs): These were the most widely adopted architectures in the reviewed literature, primarily in scenarios involving visual inputs such as urban surveillance footage, aerial images, and traffic camera streams. CNNs learn spatial hierarchies through layered convolutional filters, enabling them to capture vehicle clustering, lane congestion, and occlusion patterns without handcrafted features. Several studies implemented object detection architectures, such as YOLOv7 and YOLOv8. In [5], YOLOv8 was enhanced with a RAGFNet defogging module to restore degraded video quality in poor weather conditions, achieving an F1-score of 98.6%. The study [22] employed YOLOv8 with mNMS and BoTSORT to build a responsive video analytics system for real-time city congestion detection, reaching 97.4% accuracy. In [17], YOLOv7 was used to assess the road user density from CCTV data, although the results were limited by the general-purpose MS COCO dataset, yielding a modest F1-score of 0.61. Lightweight CNN variants were explored for deployment on embedded or edge platforms. In [33], SA-MobileNetV2, paired with Grad-CAM, delivered high accuracy (98.6%) while offering visual interpretability by highlighting congestion regions in the image. The study [25] used YOLOv3-tiny in conjunction with a congestion index combining the traffic density, speed, and duration, enabling efficient multi-criteria evaluation. In terms of aerial applications, [44] deployed YOLOv3 on UAV-captured streams to detect congestion in remote road segments, achieving over 90% average precision. Advanced multi-stream and attention-based CNNs were introduced to improve scalability and occlusion handling. In [39], a multi-branch CNN (MBCNN) separately processed raw video frames and vehicle distribution maps (VIFM), resulting in the improved recognition of high-density congestion zones and achieving a 98.6% F1-score. 
The study [40] employed SSANet with optical flow to integrate spatial density and motion features, enabling both static and dynamic congestion detection. In [41], a deeply supervised inception network (DSIN) combined with an attention proposal module (APM) achieved 95.77% accuracy, even under ultra-low-frame-rate video constraints across large freeway datasets. CNNs have demonstrated exceptional versatility across sensing conditions and spatial scales. Their modularity and compatibility with real-time video pipelines position them as highly effective tools in visual congestion monitoring environments.
- Graph Neural Networks: GNNs offer a powerful extension to deep learning by modeling traffic flows through graph-structured data. In this representation, road segments or intersections are nodes, and the vehicle flow or proximity forms the edges. This enables traffic reasoning to occur over the entire network, leveraging both spatial connectivity and temporal dynamics. In [3], the Separable Contextual Graph Neural Network (SC-GNN) was used to detect congestion anomalies in an automated logistics environment, using 12,000 multivariate time-series samples. The model achieved an F1-score of 0.885 and maintained robustness against data imbalances. The study [18] introduced the Decoupled Dynamic Spatio-Temporal GNN (D2STGNN), leveraging image-derived numerical features for real-time traffic forecasting and congestion detection. The model reached an RMSE of 1.96 and an MAPE of 31.13%, outperforming conventional temporal baselines. The novel integration of graph-based routing and detection appeared in [26], where a modified EG-Dijkstra algorithm was deployed in a simulated SUMO environment. By combining congestion inference with optimal path selection, the approach achieved up to an 80% improvement in travel time, underscoring GNNs’ suitability in Internet of Vehicles (IoV) systems.
- Artificial Neural Networks (ANNs): While simpler in structure compared to CNNs and GNNs, these have been effectively applied in scenarios where structured, numeric data sources dominate—such as traffic signal timing, port gate delays, or vehicle flow metrics. In [43], an ANN was combined with a parameterized rule-based model to detect and manage congestion in the city of Patras, Greece. The model ingested field-surveyed traffic patterns and real-time gate operations, achieving 96% detection accuracy. The hybrid system enabled the synchronized control of port access and urban traffic flows via connected vehicle messaging, with demonstrated improvements in travel time and system throughput.
- Deep Reinforcement Learning: DRL frameworks extend detection into dynamic decision-making, where a learning agent optimizes congestion-related policies over time through feedback. This makes them especially useful in autonomous systems requiring adaptive control. The study [36] proposed a DRL model trained on traffic data derived from vehicle routing and congestion labels. To enhance its trust and explainability, the model incorporated explainable AI (XAI) techniques, allowing operators to interpret the rationale behind real-time decisions. The model achieved 98.1% classification accuracy and maintained robustness in the presence of incomplete or noisy data.
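The results quoted in this subsection mix classification metrics (accuracy, F1-score) with regression metrics (RMSE, MAPE), which are not directly comparable. As a reading aid, the sketch below shows the standard definitions of three of them in plain Python. The function names and sample values are illustrative assumptions and do not reproduce any reviewed study's evaluation code.

```python
from math import sqrt


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the predicted quantity."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical labels: 1 = congested, 0 = free-flowing.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(f1_score(y_true, y_pred))            # 3 TP, 1 FP, 1 FN

# Hypothetical segment speeds in km/h.
observed = [50, 40, 20]
forecast = [48, 44, 20]
print(rmse(observed, forecast), mape(observed, forecast))
```

F1 and accuracy score discrete congestion labels, whereas RMSE and MAPE score a continuous forecast such as speed or flow, so a low MAPE and a high F1-score describe different tasks rather than a ranking of models.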
3.2.3. Probabilistic Reasoning Techniques
- Bayesian Tensor Factorization: In [4], a scalable Bayesian robust tensor factorization model was developed to detect non-recurrent traffic congestion (NRTC) using loop detector data collected from US highways (Caltrans PeMS). By modeling multivariate spatiotemporal traffic variables (speed, volume, occupancy) in a high-dimensional tensor structure, the method enabled anomaly detection without requiring supervised training. The model achieved 88.03% overall accuracy in identifying 1169 out of 1328 NRTC events and demonstrated even higher weekday performance (92.33%). Its unsupervised and training-free nature makes it suitable for large-scale real-time deployment, although its computational complexity could become a bottleneck in dense urban sensor networks.
- Ensemble Kalman Filtering for UAV-based Incident Detection: The study [11] proposed the novel integration of UAV path planning and ensemble Kalman filtering (EnKF) to estimate traffic congestion in non-recurrent scenarios. Based on synthetic data generated via cell transmission modeling and UAV imagery, the framework dynamically adjusted UAV flight paths to maximize the coverage of high-uncertainty zones while concurrently updating traffic state estimates. The model achieved high detection accuracy and reduced estimation variance in heavy congestion scenarios. While no real UAV deployment was conducted, the simulation underscores the potential of Kalman filtering in combining sparse, mobile observations with predictive models in an adaptive loop.
- Fuzzy Logic for Route Personalization and Traffic State Ranking: Fuzzy inference mechanisms were utilized in two studies to address ambiguity in traffic states and driver preferences. In [16], a fuzzy-based decision support system was built around manual road data to recommend optimal routes in Giza, Egypt. The model evaluated candidate roads based on linguistic variables such as speed, safety, and the number of traffic signals, ranking them through fuzzy suitability scores. Although the system lacked real-time detection, it showcased fuzzy logic’s value in encoding user-defined priorities and subjective decision-making under semi-real scenarios. In [20], a Mamdani-type fuzzy inference system was applied to real freeway data from Hungary, supporting seven-level congestion classification (e.g., “stable”, “near congestion”, “completely free”) through expert-derived linguistic rules. The method emphasized interpretability and tolerance to input imprecision, although it depended heavily on human-crafted rules and the work lacked quantitative evaluation metrics.
- Markov Fuzzy Switching for Sparse Probe Data Environments: In [45], a Conditionally Gaussian Observed Markov Fuzzy Switching System (CGOMFSM) was introduced to estimate traffic states using sparse GPS and speed data from highways in England, combined with SUMO-based simulation. The model was designed to function effectively even at low data penetration rates (10%), using fuzzy logic to handle input variability and a Markovian framework to model state transitions. It achieved high performance (88.6% accuracy, MAPE ≈ 6.0%) while maintaining low lag (4.1 min) and false alarm rates (16.3%). The method proved suitable for low-infrastructure environments such as highways with limited sensor coverage.
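The Mamdani-type inference described for [20] can be sketched as follows, under stated assumptions. The sketch uses three output levels rather than seven, and every membership breakpoint, rule, and function name is hypothetical and not taken from the reviewed studies. It follows the usual Mamdani steps: fuzzify the inputs, combine rule antecedents with min, clip each output set by its rule strength, aggregate with max, and defuzzify by centroid.

```python
def trap(x, a, b, c, d):
    """Trapezoidal membership: 0 at a, rising to 1 at b, flat to c, falling to 0 at d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical input sets (speed in km/h, occupancy in percent).
INPUT_SETS = {
    "speed_low": lambda s: trap(s, -1, 0, 20, 50),
    "speed_high": lambda s: trap(s, 30, 70, 200, 201),
    "occ_low": lambda o: trap(o, -1, 0, 10, 30),
    "occ_high": lambda o: trap(o, 20, 40, 100, 101),
}

# Hypothetical output sets on a 0..100 congestion scale.
OUTPUT_SETS = {
    "free": lambda y: trap(y, -1, 0, 20, 50),
    "near": lambda y: trap(y, 30, 50, 50, 70),
    "jam": lambda y: trap(y, 50, 80, 100, 101),
}


def congestion_score(speed_kmh, occupancy_pct):
    """Crisp congestion score in 0..100 from a three-rule Mamdani system."""
    s_lo = INPUT_SETS["speed_low"](speed_kmh)
    s_hi = INPUT_SETS["speed_high"](speed_kmh)
    o_lo = INPUT_SETS["occ_low"](occupancy_pct)
    o_hi = INPUT_SETS["occ_high"](occupancy_pct)
    # Rule strengths: AND is min, OR is max.
    strength = {
        "free": min(s_hi, o_lo),                        # fast and sparse
        "jam": min(s_lo, o_hi),                         # slow and dense
        "near": max(min(s_lo, o_lo), min(s_hi, o_hi)),  # mixed evidence
    }
    num = den = 0.0
    for y in range(101):  # discretised centroid defuzzification
        mu = max(min(strength[name], OUTPUT_SETS[name](y)) for name in OUTPUT_SETS)
        num += y * mu
        den += mu
    return num / den if den > 0 else None


print(congestion_score(10, 60))   # slow and dense: high score
print(congestion_score(100, 5))   # fast and sparse: low score
```

The rule base is where expert knowledge enters, which is the source of both the interpretability and the dependence on human-crafted rules noted above.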
3.2.4. Statistical and Rule-Based Models for Traffic Congestion Detection
- Pure Statistical Inference Models: Several studies leveraged purely statistical models to detect congestion based on anomalies, spatial propagation, or topological indicators. In [13], the State Propagation Algorithm (SPA) was introduced to model congestion spread using GNSS data from 10,000 taxis in Seoul. By computing effective Z-scores across spatially connected road segments, the SPA differentiated between localized and structural congestion formations. Similarly, ref. [14] proposed a map-independent statistical approach, transforming GPS trajectories into spatial “congestion cells” and detecting speed anomalies using the per-cell variance. This approach proved particularly effective for unstructured road networks, where map matching is infeasible. The study [15] combined logistic regression and curve fitting techniques with image-based traffic visualization to automatically detect and characterize multiple bottlenecks on Dutch highways. It proved robust in identifying simultaneous congestion events. In [30], the EB-TCD model was introduced, using an ensemble-based statistical detector that fuses deviations across several traffic features to raise congestion alerts. Its extremely low false alarm rate (0.08%) and rapid response (MTTD = −0.625) make it well suited for real-time control systems.
- Fuzzy and Rule-Based Inference Systems: These remain valuable in embedding domain knowledge and handling uncertainty in congestion assessment. In [16], a fuzzy route selection system guided drivers in Giza by ranking road alternatives using driver-defined rules across safety, traffic signals, and service availability. In [20], a Mamdani fuzzy inference system classified traffic states on Hungarian highways using seven linguistic congestion levels derived from flow metrics. These approaches excelled in applications where formal ground-truth labels were scarce and interpretability was critical. The study [45] integrated fuzzy reasoning with probabilistic modeling through a Markov fuzzy switching system, which combined GPS probe data with simulated congestion indicators. Even with low data penetration, the system achieved accuracy of 88.6%, validating its robustness in sparse sensing environments.
- Rule-Based Vision and Sensor Systems: Rule-based congestion detection using structured thresholds and visual cues continues to support edge deployments. In [12], background subtraction techniques (GFM, GMM) were used on CCTV streams within a hybrid edge–cloud architecture. The system automatically identified congestion zones based on vehicle density thresholds and achieved near-perfect detection (≈100%). In [27], mobile phone handover events were mined to derive pseudo-speed and probe activity metrics, which were fed into a generalized ESD statistical test to determine the congestion severity. The model reached 95% accuracy on a major freeway and demonstrated strong scalability potential.
- Simulation-Driven Rule Models for Intelligent Vehicles: Simulation-based systems explore rule logic within synthetic vehicular networks. The study [32] modeled a novel behavior-based rerouting strategy where vehicles autonomously diverted from queues based on line-of-sight congestion detection. Diverting as few as 10% of vehicles led to substantial improvements in network flow. In [34], a V2V communication-based congestion detector used the routing hop count and signal energy to flag developing congestion. As the density increased, the model reliably predicted impending traffic jams, outperforming traditional speed-based systems.
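The statistical detectors in this subsection share a common core: compare the current speed in a spatial unit with its own history and flag large negative deviations. The sketch below illustrates that idea with a per-cell Z-score on a simple latitude/longitude grid. It is an assumption-laden illustration rather than the SPA of [13] or the congestion-cell method of [14]; the grid size, the threshold, the sample data, and all function names are hypothetical.

```python
from math import floor
from statistics import mean, stdev


def to_cell(lat, lon, cell_deg=0.001):
    """Map a GPS fix to a grid cell; no road map or map matching is needed."""
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def speed_z_scores(history, current):
    """history: {cell: [past speeds]}, current: {cell: current mean speed}."""
    scores = {}
    for cell, now in current.items():
        past = history.get(cell, [])
        if len(past) < 2:
            continue               # too little data to estimate the spread
        sigma = stdev(past)
        if sigma == 0:
            continue               # constant history: Z-score undefined
        scores[cell] = (now - mean(past)) / sigma
    return scores


def congested_cells(history, current, z_threshold=-2.0):
    """Cells whose current speed is far below their historical mean."""
    return {cell for cell, z in speed_z_scores(history, current).items()
            if z <= z_threshold}


# Hypothetical data: two cells with past speeds in km/h.
cell_a = to_cell(37.56651, 126.97852)
cell_b = to_cell(37.57251, 126.98452)
history = {cell_a: [50, 52, 48, 51, 49], cell_b: [30, 28, 32, 31, 29]}
current = {cell_a: 20, cell_b: 29}
print(congested_cells(history, current))
```

Because each cell is judged against its own baseline, a slow urban street is not flagged simply for being slower than a highway, which is what makes this family of detectors usable without labeled congestion data.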
3.2.5. Hybrid Methodologies in Traffic Congestion Detection
- The study [31] demonstrates how statistical anomaly detection (generalized ESD test) and supervised learning (XGBoost NLP) operate in concert. While the statistical layer flags congestion events through real-time speed deviations, the machine learning component interprets causes via social media semantics. This dual-stage architecture achieved 95.2% accuracy in identifying and explaining non-recurrent congestion—a task where single-paradigm models typically fail.
- Deep–Shallow Fusion for Visual Congestion Classification: The study [19] introduced a hybrid vision-based framework combining deep feature extraction via ResNet101 and classification using a support vector machine (SVM). The model utilized real-time urban surveillance video streams from the UCSD and NU1 datasets under diverse environmental conditions. Congestion detection was approached through a dual-path system: deep residual learning captured texture features such as vehicle contours and crowd density, while motion features were derived from preserved vehicle trajectories to capture temporal stagnation indicative of congestion. These features were embedded into a Learning-to-Rank (LTR) framework, and the final classification into light, medium, or heavy congestion was handled by an SVM. The model achieved 97.64% accuracy, illustrating the benefits of combining the expressive power of deep learning with the precision and robustness of shallow classifiers. However, computational complexity and real-time deployment challenges remain open concerns.
3.2.6. Other Methodologies in Congestion Detection
- Fog–Cloud Intelligent Routing with Real-Time Congestion Avoidance: In [38], the ReFOCUS+ system was introduced as a real-time route guidance and congestion avoidance framework, combining a fog–cloud architecture and dynamic congestion estimation. Operating within SUMO-simulated urban networks (Toronto, UBC, Los Angeles), ReFOCUS+ computes a road weight measurement (RWM) per road segment, integrating the travel time, congestion severity, and historical traffic flow. Using a distributed computation model—where RSUs handle local congestion detection and cloud layers manage global optimization—the system proactively reroutes vehicles to avoid both present and anticipated bottlenecks. Experimental evaluation showed substantial improvements: the travel time was reduced by up to 66%, and fuel/CO2 emissions dropped by 30–50%, outperforming standard routing protocols. However, the approach assumes full RSU deployment and 100% compliance from drivers, which may not yet be feasible in large-scale real-world deployments. Nevertheless, it exemplifies a powerful use case reflecting heuristic, system-wide congestion mitigation strategies.
- Spatiotemporal Clustering with Real GPS Trajectories: The study [42] tackled congestion detection through a density-based moving object clustering approach using both real GPS traces (5.6 million points from 7000 taxis in Wuchang, China) and simulated VISSIM data. Unlike traditional fixed-sensor models, this method classifies congestion dynamically by tracking spatiotemporal clusters of slow-moving vehicles, thereby identifying both the extent and duration of traffic jams across road networks. The algorithm achieved an F1-score of 0.78, demonstrating robustness even with sparse GPS data. Its map-agnostic nature and reliance solely on vehicle motion patterns make it highly scalable and cost-effective for smart city deployments. Future work aims to incorporate real-time parallel processing and congestion propagation modeling to further improve its responsiveness and granularity.
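The congestion-aware rerouting described for ReFOCUS+ [38] can be illustrated with a small sketch. The source does not give the road weight measurement (RWM) formula, so `road_weight` below is a purely hypothetical stand-in that inflates free-flow travel time by a severity term and a historical-load term; the coefficients, the toy network, and the function names are assumptions. Route selection uses standard Dijkstra search over the resulting weights.

```python
import heapq


def road_weight(travel_time_s, severity, historical_load,
                alpha=1.0, beta=0.5, gamma=0.25):
    """Hypothetical composite weight; severity and historical_load lie in 0..1."""
    return travel_time_s * (alpha + beta * severity + gamma * historical_load)


def shortest_route(graph, source, target):
    """Dijkstra over graph = {node: {neighbor: weight}}; returns (cost, path)."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue                      # stale heap entry
        for nxt, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[target], path[::-1]


def build_graph(severity_ab, severity_bd):
    """Toy network: a short corridor A-B-D and a longer detour A-C-D."""
    return {
        "A": {"B": road_weight(60, severity_ab, severity_ab),
              "C": road_weight(80, 0.0, 0.0)},
        "B": {"D": road_weight(60, severity_bd, severity_bd)},
        "C": {"D": road_weight(80, 0.0, 0.0)},
    }


print(shortest_route(build_graph(0.0, 0.0), "A", "D"))  # free flow: corridor via B
print(shortest_route(build_graph(0.8, 1.0), "A", "D"))  # congested: detour via C
```

In the architecture described above, the per-segment weights would be estimated locally by RSUs and the route search run at the cloud layer; the sketch collapses both into a single process for brevity.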
4. Discussion
4.1. Spatiotemporal Data: The Deep Learning–Shallow ML Spectrum
4.2. Probe Data: Robustness, Sparsity, and Topological Insights
4.3. Hybrid and Multimodal Data: Fusing Realism with Simulation
4.4. Deployment Feasibility: Bridging Algorithmic Performance and Urban Realities
- Computational Asymmetry: GPU-dependent models (200 W/node) create unsustainable energy footprints at the city scale, while edge devices fail beyond 15 FPS during peak traffic.
- Privacy–Compliance Gaps: In total, 92% of vision systems lack GDPR safeguards, creating legal risks that are absent in aggregated-data approaches (2.1/10 risk index).
- Maintenance Scalability: Retraining cycles (35–60 h/month) are compounded with environmental fragility—fog latency (15 ms), camera occlusion, and hardware drift.
- Rule-based systems achieve TRL 8–9 using legacy infrastructure (e.g., 500+ Seoul nodes);
- Meanwhile, DL models remain constrained to TRL 4–6 by hardware dependencies ($12 k/intersection).
- Simplicity vs. Adaptability: Rule-based deployments [12] achieve TRL 8–9 readiness but fail in dynamic conditions (accuracy drops by 25% without calibration), while adaptive DL hybrids incur higher costs.
4.5. Ethical Implications: Beyond Technical Performance
- Algorithmic Bias and Fairness: Vision systems trained on limited or biased datasets can perform poorly for certain groups. For example, studies show up to a 15% drop in accuracy when detecting pedestrians with darker skin tones in low-light conditions [47]. Similarly, GPS data in areas with poor signal coverage, like low-income neighborhoods, can misrepresent traffic patterns [14]. These issues risk creating unfair congestion estimates and solutions that do not serve all communities equally.
5. Key Synthesis and Research Imperatives
5.1. Core Insights from Comprehensive Analysis
- Data Modality: Deep learning (DL) excels when processing rich spatiotemporal data streams such as CCTV footage, whereas shallow machine learning (SML) and probabilistic models perform better in probe-based or sparse data settings (e.g., Markov–fuzzy hybrids achieve 88.6% accuracy at 10% GPS penetration).
- Infrastructure Maturity: Cities with abundant resources benefit from DL’s high accuracy (e.g., YOLOv8 reaches 99.7% accuracy), while resource-constrained regions rely on SML’s efficiency (e.g., decision trees maintain 99% accuracy at under 5 W per node).
- Operational Trust: Legacy rule-based systems have reached high technology readiness levels (TRL 8–9), supporting reliable deployments. By contrast, DL hybrid models (TRL 4–6) require integrated explainability mechanisms (such as Grad-CAM) to gain operator acceptance.
5.2. Future Research Directions
1. Edge Intelligence Revolution
- Develop hardware-aware AI capable of overcoming computational asymmetries: lightweight DL models (e.g., SA-MobileNetV2) should extend beyond fixed highway applications to support temporal dynamics on ultra-low-power microcontrollers (<1 W).
- Automate recalibration processes by replacing manual threshold tuning in rule-based systems with reinforcement learning techniques, potentially reducing the maintenance time from 35–60 h monthly to less than 5 h.
2. Generalizability in Data-Scarce Environments
- Design map-agnostic frameworks that leverage “congestion cell” concepts to support informal settlements with irregular road layouts, utilizing satellite-augmented probe data to reduce the reliance on precise mapping.
- Advance synthetic-to-real transfer learning by generating multimodal simulated datasets (e.g., SUMO plus audio), facilitating model training in low-frequency GPS coverage areas and mitigating labeling bottlenecks.
3. Standardized Multimodal Fusion
- Establish universal, ISO-compliant validation protocols for the fusion of heterogeneous data sources (e.g., LiDAR and audio), thereby addressing current ad hoc integration challenges.
- Develop encrypted explainable AI pipelines, such as privacy-preserving Grad-CAM implementations, to enable congestion auditing without exposing sensitive raw data in public surveillance contexts.
4. Trust-Centric Deployment
- Mandate the comprehensive real-world stress testing of models under diverse and extreme conditions (e.g., monsoons, public events), reducing the reliance on purely simulation-based benchmarks.
- Embed dynamic fairness monitoring and bias mitigation strategies, such as adaptive loss reweighting during edge inference, to ensure equitable detection performance across demographic groups.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AI | Artificial Intelligence
DL | Deep Learning
PR | Probabilistic Reasoning
SML | Shallow Machine Learning
SPAR-4-SLR | Scientific Procedures and Rationales for Systematic Literature Reviews
TRL | Technology Readiness Level
References
- INRIX. Global Traffic Scorecard. 2024. Available online: https://inrix.com/scorecard/ (accessed on 22 March 2025).
- Paul, J.; Lim, W.M.; O’Cass, A.; Hao, A.W.; Bresciani, S. Scientific Procedures and Rationales for Systematic Literature Reviews (SPAR-4-SLR). Int. J. Consum. Stud. 2021, 45, O1–O16.
- Lee, J.; Lee, S. Separable contextual graph neural networks to identify tailgating-oriented traffic congestion. Expert Syst. Appl. 2024, 254, 124354.
- Li, Q.; Tan, H.; Jiang, Z.; Wu, Y.; Ye, L. Nonrecurrent traffic congestion detection with a coupled scalable Bayesian robust tensor factorization model. Neurocomputing 2021, 430, 138–149.
- Wang, C.; Shang, Q.; Liu, K.; Zhang, W. Traffic congestion recognition based on convolutional neural networks in different scenarios. Eng. Appl. Artif. Intell. 2025, 148, 110372.
- Peixoto, M.; Mota, E.; Maia, A.; Lobato, W.; Salahuddin, M.; Boutaba, R.; Villas, L. FogJam: A Fog Service for Detecting Traffic Congestion in a Continuous Data Stream VANET. Ad Hoc Netw. 2023, 140, 103046.
- Sujatha, A.; Suguna, R.; Jothilakshmi, R.; Kaviatha, R.P.; Mujawar, R.Y.; Prabagaran, S. Traffic Congestion Detection and Alternative Route Provision Using Machine Learning and IoT-Based Surveillance. J. Mach. Comput. 2023, 3, 475–485.
- Chetouane, A.; Mabrouk, S.; Jemili, I.; Mosbah, M. Vision-based vehicle detection for road traffic congestion classification. Concurr. Comput. Pract. Exp. 2022, 34, e5983.
- Chaurasia, B.K.; Manjoro, W.S.; Dhakar, M. Traffic Congestion Identification and Reduction. Wirel. Pers. Commun. 2020, 114, 1267–1286.
- Gatto, R.C.; Forster, C.H.Q. Audio-Based Machine Learning Model for Traffic Congestion Detection. IEEE Trans. Intell. Transp. Syst. 2020, 22, 7200–7207.
- Yahia, C.N.; Scott, S.E.; Boyles, S.D.; Claudel, C.G. Unmanned Aerial Vehicle Path Planning for Traffic Estimation and Detection of Non-Recurrent Congestion. Transp. Lett. 2022, 14, 849–862.
- Liu, G.; Shi, H.; Kiani, A.; Khreishah, A.; Lee, J.Y.; Ansari, N.; Liu, C.; Yousef, M. Smart Traffic Monitoring System using Computer Vision and Edge Computing. arXiv 2021, arXiv:2109.03141.
- Jung, J.H.; Eom, Y.H. Empirical analysis of congestion spreading in Seoul traffic network. Phys. Rev. E 2023, 108, 054312.
- Song, C.; Wang, Y.; Wang, L.; Wang, J.; Fu, X. Mapping to cells: A map-independent approach for traffic congestion detection. In Proceedings of the International Conference on Smart Transportation and City Engineering (STCE 2023), Nanjing, China, 7–9 November 2025; Mikusova, M., Ed.; p. 102.
- Nguyen, T.T.; Calvert, S.C.; Vu, H.L.; Van Lint, H. An Automated Detection Framework for Multiple Highway Bottleneck Activations. IEEE Trans. Intell. Transp. Syst. 2021, 23, 5678–5692.
- El-Tawaba, A.H.A.; Fattah, T.A.E.; Mahmood, M.A. A fuzzy-based approach for traffic jam detection. Int. J. Comput. Sci. Netw. Secur. 2021, 21, 257–263.
- Swidi, E.; Ardchir, S.; Daif, A.; Azouazi, M. Road users detection for traffic congestion classification. Math. Model. Comput. 2023, 10, 518–523.
- Liu, B.; Lam, C.T.; Ng, B.K.; Yuan, X.; Im, S.K. A Graph-Based Framework for Traffic Forecasting and Congestion Detection Using Online Images From Multiple Cameras. IEEE Access 2024, 12, 3756–3767.
- Abdelwahab, M.A.; Abdel-Nasser, M.; Hori, M. Reliable and Rapid Traffic Congestion Detection Approach Based on Deep Residual Learning and Motion Trajectories. IEEE Access 2020, 8, 182180–182192.
- Amini, M.; Hatwagner, M.F.; Mikulai, G.C.; Koczy, L.T. An intelligent traffic congestion detection approach based on fuzzy inference system. In Proceedings of the 2021 IEEE 15th International Symposium on Applied Computational Intelligence and Informatics (SACI), Timisoara, Romania, 19–21 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 97–104.
- Wang, C.; Chen, Y.; Wang, J.; Qian, J. An Improved CrowdDet Algorithm for Traffic Congestion Detection in Expressway Scenarios. Appl. Sci. 2023, 13, 7174.
- Wang, L.; Law, K.L.E.; Lam, C.T.; Ng, B.; Ke, W.; Im, M. Automatic Lane Discovery and Traffic Congestion Detection in a Real-Time Multi-Vehicle Tracking Systems. IEEE Access 2024, 12, 161468–161479.
- Singh, S.; Soni, K.; Choudhary, R. Highway Traffic Congestion Detection And Evaluation Based On Deep Learning Technique. Soft Comput. 2023, 27, 12249–12265.
- Mohanty, A.; Mohanty, S.K.; Jena, B.; Mohapatra, A.G.; Rashid, A.N.; Khanna, A.; Gupta, D. Identification and evaluation of the effective criteria for detection of congestion in a smart city. IET Commun. 2022, 16, 560–570.
- Gao, W.; You, S.; Wang, J.; Zhang, S.; Xie, D. Whether and How Congested is a Road: Indices Updating Strategy and a Vision-Based Model. IET Intell. Transp. Syst. 2023, 17, 772–784.
- Khan, Z.; Koubaa, A.; Farman, H. Smart Route: Internet-of-Vehicles (IoV)-Based Congestion Detection and Avoidance (IoV-Based CDA) Using Rerouting Planning. Appl. Sci. 2020, 10, 4541.
- Li, S.; Cheng, Y.; Jin, P.; Ding, F.; Li, Q.; Ran, B. A Feature-Based Approach to Large-Scale Freeway Congestion Detection Using Full Cellular Activity Data. IEEE Trans. Intell. Transp. Syst. 2020, 23, 1323–1331.
- Liu, T.; Zhao, M. The 3D McMaster Algorithm for Traffic Congestion Detection. In Proceedings of the 2020 Chinese Control And Decision Conference (CCDC), Hefei, China, 22–24 August 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 4953–4958.
- Liu, Y.; Zhang, Z.; Han, L.D.; Brakewood, C. Automatic Traffic Queue-End Identification using Location-Based Waze User Reports. Transp. Res. Rec. 2021, 2675, 895–906.
- Bawaneh, M.; Simon, V. Novel traffic congestion detection algorithms for smart city applications. Concurr. Comput. Pract. Exp. 2023, 35, e7563.
- Luan, S.; Ma, X.; Li, M.; Su, Y.; Dong, Z. Detecting and interpreting non-recurrent congestion from traffic and social media data. IET Intell. Transp. Syst. 2021, 15, 1461–1477.
- Fazekas, Z.; Obaid, M.; Karim, L.; Gáspár, P. Urban Traffic Congestion Alleviation Relying on the Vehicles’ On-board Traffic Congestion Detection Capabilities. Acta Polytech. Hung. 2024, 21, 7–31.
- Lin, C.; Hu, X.; Zhan, Y.; Hao, X. MobileNetV2 with Spatial Attention module for traffic congestion recognition in surveillance images. Expert Syst. Appl. 2024, 255, 124701.
- Iskandarani, M.Z. Sensing and Detection of Traffic Status through V2V Routing Hop Count and Route Energy. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 93–100.
- Mishra, P.K.; Chaturvedi, A.K. Vehicular Traffic Congestion Detection System and Improved Energy-Aware Cost Effective Task Scheduling Approach for Multi-Objective Optimization on Cloud Fog Network. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 432.
- Khan, S.; Ghazal, T.M.; Alyas, T.; Waqas, M.; Raza, M.A.; Ali, O.; Khan, M.A.; Abbas, S. Towards Transparent Traffic Solutions: Reinforcement Learning and Explainable AI for Traffic Congestion. Int. J. Adv. Comput. Sci. Appl. 2025, 16, 503–511.
- Paranjothi, A.; Khan, M.S.; Patan, R.; Parizi, R.M.; Atiquzzaman, M. VANETomo: A congestion identification and control scheme in connected vehicles using network tomography. Comput. Commun. 2020, 151, 275–289.
- Rezaei, M.; Noori, H.; Mohammadkhani Razlighi, M.; Nickray, M. ReFOCUS+: Multi-Layers Real-Time Intelligent Route Guidance System With Congestion Detection and Avoidance. IEEE Trans. Intell. Transp. Syst. 2019, 22, 50–63.
- Jiang, S.; Feng, Y.; Zhang, W.; Liao, X.; Dai, X.; Onasanya, B.O. A New Multi-Branch Convolutional Neural Network and Feature Map Extraction Method for Traffic Congestion Detection. Sensors 2024, 24, 4272.
- Jian, C.; Lin, C.; Hu, X.; Lu, J. Selective Scale-Aware Network for Traffic Density Estimation and Congestion Detection in ITS. Sensors 2025, 25, 766.
- Sun, Z.; Wang, P.; Wang, J.; Peng, X.; Jin, Y. Exploiting Deeply Supervised Inception Networks for Automatically Detecting Traffic Congestion on Freeway in China Using Ultra-Low Frame Rate Videos. IEEE Access 2020, 8, 21226–21235.
- Shi, Y.; Wang, D.; Tang, J.; Deng, M.; Liu, H.; Liu, B. Detecting spatiotemporal extents of traffic congestion: A density-based moving object clustering approach. Int. J. Geogr. Inf. Sci. 2021, 35, 1449–1473.
- Marousi, K.P.; Stephanedes, Y.J. Dynamic Management of Urban Coastal Traffic and Port Access Control. Sustainability 2023, 15, 14871.
- Utomo, W.; Bhaskara, P.W.; Kurniawan, A.; Juniastuti, S.; Yuniarno, E.M. Traffic Congestion Detection Using Fixed-Wing Unmanned Aerial Vehicle (UAV) Video Streaming Based on Deep Learning. In Proceedings of the 2020 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM), Surabaya, Indonesia, 17–18 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 234–238.
- Bouyahia, Z.; Haddad, H.; Derrode, S.; Pieczynski, W. Toward a Cost-Effective Motorway Traffic State Estimation From Sparse Speed and GPS Data. IEEE Access 2021, 9, 44631–44646.
- Yang, X.; Wang, F.; Bai, Z.; Xun, F.; Zhang, Y.; Zhao, X. Deep Learning-Based Congestion Detection at Urban Intersections. Sensors 2021, 21, 2052.
- Li, X.; Chen, Z.; Zhang, J.; Sarro, F.; Zhang, Y.; Liu, X. Bias behind the wheel: Fairness analysis of autonomous driving systems. arXiv 2023, arXiv:2308.02935.
- Königs, P. Government surveillance, privacy, and legitimacy. Philos. Technol. 2024, 35, 2022.
- Li, S.; Yang, H.; Poolla, K.; Varaiya, P. Spatial pricing in ride-sourcing markets under a congestion charge. Transp. Res. Part B Methodol. 2021, 152, 18–45.
Data Source | Studies | Subtype | Reported Accuracy | Mean Accuracy (%) | Availability | Real-Time Suitability | Cost |
---|---|---|---|---|---|---|---|
Spatiotemporal | 19 [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21] | Real | Moderate to High (61.03–100%) | 94.01% | Low–Medium | High | High |
Spatiotemporal | 10 [22,23,24,25,26,27,28,29,30,31] | Simulated | High (88.5–100%) | 93.2% | High | Low | Low |
Spatiotemporal | 3 [32,33,34] | Real + Simulated | High (96–98.1%) | 97.05% | Medium–High | Medium | Medium |
Probe | 5 [35,36,37,38,39] | Real | High (95%) | 95% | Medium | High | Medium |
Probe | 1 [40] | Simulated | High | N/A | High | Low | Low |
Hybrid | 1 [41] | Real | High (95.2%) | 95.2% | Low | Medium | High |
Hybrid | 1 [42] | Simulated | High | N/A | Low | Low | Low |
Hybrid | 2 [43,44] | Real + Simulated | Moderate (78–88.6%) | 83.3% | Medium | Medium | Medium |
Other | 1 [45] | Audio (Real) | High (89–99%) | 94% | High | Medium | Low |
Other | 1 [46] | Static road profile | High | N/A | High | Low | Low |
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used |
---|---|---|---|---|---|---|
[6] | 2023 | Simulated V2V data using SUMO + OMNeT++ + Veins | DBSCAN, X-Means | Bandwidth reduction: 75.61% (DBSCAN), 60.85% (X-Means). | High | 6 levels (freeflow, reasonably freeflow, stable flow, approaching unstable flow, unstable flow, breakdown flow) |
[7] | 2023 | IoT cameras + sensors | HDBSCAN | Wait time reduction (), rerouting efficiency (), response speed (× faster), queue length reduction (3 km → 1 km), emergency vehicle delay () | High | 2 levels (congested/non-congested) |
[8] | 2020 | CCTV cameras | Decision tree | Precision (98%), recall (98%), accuracy (99%), specificity (100%) | Medium | Three levels: light, medium, heavy |
[9] | 2020 | Vehicle GPS | Trajectory clustering, threshold-based detection, VANET rerouting | High accuracy | Medium | Binary (congested/non-congested) |
[10] | 2020 | Microphones | Random forest classifier (MFCC audio features) | Accuracy (89–99%), precision (100%), recall (94%) | High | Binary (congested/freeflow) |
[24] | 2022 | Urban traffic in SUMO (real map of Bhubaneswar imported from OSM) | K-means clustering on vehicle metrics + Analytical Hierarchy Process (AHP) | AHP consistency ratio: 1.67% (below the 10% threshold) | Moderate | Binary (congested, non-congested)
[29] | 2021 | Collected Waze user reports (geo-tagged) in Knoxville, TN for 34 congestion events | Spatial-Temporal Density-Based Spatial Clustering of Applications with Noise (ST-DBSCAN) | Waze-based queue-end detection was on average only 1.1 min later than sensor detection, with similar spatial coverage of the queue (1.9 vs. 1.8 detection points per mile) | High | Binary (congested, non-congested) |
[35] | 2024 | Sensor-based vehicular data in cloud-fog network simulations (iFogSim simulator) | Whale Optimization Algorithm | Best energy consumption with 50 IoT devices (500 tasks): 176,916.18 W; best cost: $810,188.88. The ECTS scheduler cut energy by 6.6% and cost by 13.4% vs. a genetic algorithm, and by 13.8% and 18.5%, respectively, vs. round robin | Medium | Binary (congested, non-congested)
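For readers unfamiliar with the AHP figure reported for [24], the consistency ratio follows Saaty's standard definition, restated below as a reminder. The symbols are not given in the table and are introduced here only for illustration: $\lambda_{\max}$ is the principal eigenvalue of the pairwise comparison matrix, $n$ is the number of criteria, and $RI$ is the random index for that $n$.

```latex
CI = \frac{\lambda_{\max} - n}{n - 1}, \qquad
CR = \frac{CI}{RI}, \qquad
CR = 0.0167 \;(1.67\%) < 0.10
```

A ratio below 0.10 is the conventional acceptance level for the pairwise judgments, which the value reported for [24] satisfies.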
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used |
---|---|---|---|---|---|---|
[3] | 2024 | Simulated (12,000 multivariate time-series samples of 45 metrics across 160 robots) | Separable Contextual Graph Neural Network (SC-GNN) | F1-score: 0.885; Recall: 0.701; AUC: 0.892; FPR: 0.102 | Medium | 2 levels (non-congested, congested) |
[5] | 2024 | Real-time (SOTS and HSTS) and CCTRIB | CNN (LFE-YOLOv8 + RAGFNet) | RAGFNet (PSNR: 24.6, SSIM: 0.89); LFE-YOLOv8 (Precision: 99.7%, Recall: 97.7%, F1: 98.6%, Accuracy: 98.2%) | High | 2 levels (congested/non-congested) |
[17] | 2023 | MS COCO dataset | YOLOv7 | Precision: 0.53; Recall: 0.55; F1-score: 0.61; mAP: | High | Binary (congested or not congested) |
[18] | 2023 | Surveillance cameras from different nodes in Macao Peninsula and Taipa from DSAT official website. | D2STGNN | MAE: 1.44; RMSE: 1.96; MAPE: 31.13% | High | Binary (congested or not congested) |
[21] | 2023 | Surveillance video from NJRY and UA-DETRAC datasets | IBCDet + DeepSORT | AP: 95.30%; MR-2: 24.44%; JI: 76.35%; Accuracy: 91.28% | High | Chinese expressway LoS criteria |
[22] | 2024 | Live streaming video feeds (Macao DSAT) | YOLOv8 + mNMS + BoTSORT | Accuracy: 97.4% | High | Binary (clear traffic, congested) |
[23] | 2024 | Real highway scene data (presumably images of traffic) | CNN | Accuracy: 94–95% | High | Binary (freeflow, congested) |
[25] | 2023 | Collected 27 real traffic camera videos (1080p, 25 FPS) from >20 road segments (LoS A–D) in Hangzhou; created a custom congestion video dataset (110 k congested frames) | Lightweight CNN (YOLOv3-tiny) | Precision: 95.06; Recall: 92.05; F-measure: 93.53; SwitchRate: 2.13; Hit Rate: 0.94 | High | Binary (congested, non-congested) |
[26] | 2020 | Simulated traffic on a real map (Alwaha, Riyadh intersection) using SUMO (9 scenarios: varying segment lengths and congestion levels) | Modified EG-Dijkstra | Path cost, travel time, and speed improved by ~80% vs. baseline (MCDP) | High | Binary (congested, non-congested)
[28] | 2020 | Highway loop detector data (Shuijie Expressway, Chongqing) | 3D McMaster algorithm | Detection Rate: 93.25%; False Alarm Rate: 0% | High | Binary (congested, non-congested) |
[33] | 2024 | UCSD dataset and self-collected highway camera images | SA-MobileNetV2, Grad-CAM | Accuracy: 98.58%; Precision: 98.86%; Recall: 98.62%; F1-score: 98.73% | High | 3 levels (light, medium, heavy)
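Because several rows in the table above report precision, recall, and F1 together, a reader can cross-check them with the harmonic-mean relation F1 = 2PR / (P + R). The Python sketch below does this for four rows. The function names and the tolerances are assumptions chosen for illustration, and averaging across classes in the primary studies can cause small deviations, so a flagged row is a prompt to consult the original paper rather than evidence of an error.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def is_consistent(precision, recall, reported_f1, tol):
    """True if the reported F1 lies within `tol` of the value implied by P and R."""
    return abs(f1_from_pr(precision, recall) - reported_f1) <= tol


# (study, precision, recall, reported F1, tolerance), values as listed in the table.
# Percent-scale rows use a tolerance of 0.15 points, the fraction-scale row 0.01.
rows = [
    ("[5]", 99.7, 97.7, 98.6, 0.15),
    ("[25]", 95.06, 92.05, 93.53, 0.15),
    ("[33]", 98.86, 98.62, 98.73, 0.15),
    ("[17]", 0.53, 0.55, 0.61, 0.01),
]

flagged = [study for study, p, r, f1, tol in rows
           if not is_consistent(p, r, f1, tol)]
print(flagged)
```

With the values as listed, only the row for [17] is flagged: a precision of 0.53 and a recall of 0.55 imply an F1 of about 0.54, not 0.61.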
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used |
---|---|---|---|---|---|---|
[36] | 2025 | Open city traffic data (Kaggle vehicle routing datasets) with features (time, weather, flow, etc.), split into 8000 training and 2000 validation samples | Reinforcement Learning + Explainable AI | Accuracy: 98.10%; missing data rate: 1.90% | Medium | Binary (Congested, Non congested) |
[39] | 2024 | Chinese City Traffic Image Database (CCTRIB): 9200 labeled images (half congested vs. not) collected from urban cameras under various conditions | MBCNN + YOLOv8 + VIFM | F1: 98.61%; Accuracy: 98.62% | Medium | Binary (congested, non-congested) |
[40] | 2025 | New COTRS dataset + UCSD dataset | CNN using SSANet + Optical Flow | MAE: 0.117; Accuracy: 96.06%; F1-score: ; Precision: 96.14%; Recall: 95.81% | High | Binary (COTRS); 3 classes (UCSD: light, medium, heavy) |
[41] | 2020 | Massive freeway CCTV dataset: images from 14,470 highway cameras covering 5215 km in Shaanxi, China | DSIN + APM | Accuracy: 95.77% | High | Binary (congested, non-congested) |
[43] | 2023 | Combined field data (traffic counts, signals, port gate service times in Patras, Greece) with microscopic simulation | ANN + Rule-Based | Accuracy: 96.0%; Detection Rate: 81.2%; False Alarm Rate: 1.3% | Medium | Binary (congested, non-congested) |
[44] | 2020 | Aerial video captured via fixed-wing UAVs; YouTube video | YOLOv3-style CNN | YouTube: Avg 90.75%; UAV Live: Avg 90.00% | Medium | 3 levels (smooth, congested, jammed) |
[46] | 2021 | Surveillance video from Jinan, China, 1280 × 720 resolution | YOLOv3 + LK Optical Flow | Accuracy: 89.5%; mAP: 89.7%; FPS: 44; Precision: 89.2%; Recall: 90.1% | High | 3 levels (smooth, slow, congestion) |
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used |
---|---|---|---|---|---|---|
[4] | 2021 | Real-time (speed, volume, occupancy data from US highways) from Caltrans PeMS | Bayesian Tensor Factorization | Overall accuracy: 88.03% | High | 2 levels (normal, non-recurrent congestion) |
[11] | 2021 | UAV (simulated) + fixed sensors | Ensemble Kalman Filter + UAV path optimization | Lower RMSE; high detection success in heavy congestion; lower covariance/variance | Medium | Binary (incident/no incident) |
[16] | 2021 | Manual road information compilation for a route in Giza, Egypt | Fuzzy Logic | Ranked roads matched driver preferences; suitability degree evaluated | Low | No explicit congestion level taxonomy |
[20] | 2021 | Historical data from Hungarian freeways | Fuzzy Inference System (Mamdani) | Qualitative evaluation via fuzzy surfaces, expert rules, real-world validation | Medium | Seven levels: completely congestion-free, congestion-free, stable, near congestion |
[45] | 2021 | GPS/speed data from England highways + SUMO simulation | CGOMFSM (Markov Fuzzy Switching System) | MAPE: 5.90–6.69%; RMSE: 50.10–53.94; Accuracy: 88.6%; Lag: 4.1 min; FA: 16.3% | High | Binary (freeflow, fully congested) |
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used |
---|---|---|---|---|---|---|
[13] | 2024 | GNSS data from ~10 k taxis via Seoul TOPIS, aggregated into 5 min average speeds | State Propagation Algorithm (effective Z-score + network propagation) | Morning Congestion Ratio ≈ 20% (single roads), <5% (loops); Evening ≈ 50% (single), ≈40% (loops) | High | Binary (congested vs. free) |
[14] | 2024 | ~26.8 M GPS records from 10,829 taxis in Xi’an, China (1 August 2012) | Cell-based CV method | Ce values clearly differentiate congestion, especially at lower speeds | High | Binary (congested or freeflow) |
[15] | 2021 | Loop detectors (Dutch highway) + synthetic data via FOSIM | Chan–Vese Model + Active Shape Model (ASM) | Accurate at 100–500 m spacing; lower at 1000 m; multiple concurrent bottlenecks detected | Medium | Binary (congested or not congested) |
[30] | 2022 | 80 simulated runs on Budapest road network (9 h each) | Traffic Congestion Detector, Ensemble-Based TCD | Detection Rate: 100%, FAR: 0.08%, MTTD = −0.625 | Medium | Binary (congested, non-congested) |
[37] | 2020 | NS-2 + SUMO simulation of connected vehicles on highways | VANETomo (statistical inference + routing algorithm) | Packet loss: 3% (vs. 27% baseline); Delay: ~8 ms (vs. 48 ms); Throughput: ~38 Mbps; Lowest channel load | Medium | 3 levels (least, normal, over)
[26] | 2020 | Simulated traffic (Alwaha, Riyadh intersection) using SUMO (9 scenarios) | Modified EG-Dijkstra | Path cost, travel time, and speed improved by ~80% vs. baseline (MCDP) | Medium | Binary (congested, non-congested) |
[12] | 2021 | CCTV + edge/cloud nodes | Background subtraction (GFM, GMM) + hybrid analytics | Detection accuracy ~100% | High | Binary (congested/non-congested) |
[27] | 2020 | Full cellular activity (FCA) records on Ninghu Freeway, China | Generalized ESD outlier test | High accuracy: ~95% | High | Three levels (freeflow, moderate congestion, severe congestion) |
[32] | 2024 | Microscopic Vissim simulation on an urban road under various scenarios | Rule-based triggers | Flow rate ↑~25%; Rerouting benefit: 10–30%; Travel time: qualitative improvement | Medium | Binary (congested, non-congested) |
[34] | 2021 | VANET simulation (MATLAB) with hops, routes, energy | Algorithmic pattern analysis of V2V data | ECR: 0.0006–0.0032 J | Medium | Binary (congested, non-congested) |
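Several studies in these tables report detection rate (DR), false alarm rate (FAR), and mean time to detect (MTTD). Definitions differ between papers, so the Python sketch below fixes one set of assumptions purely for illustration: an alarm counts as a detection if it falls within a matching window around a true onset, FAR is the number of unmatched alarms divided by the total number of detector decisions, and MTTD is the mean signed delay between onset and matched alarm. The function name, the `window` parameter, and the sample times are hypothetical.

```python
def detection_metrics(onsets, alarms, n_checks, window):
    """Return (DR, FAR, MTTD) for event times given in minutes.

    onsets:   true congestion onset times.
    alarms:   times at which the detector raised an alarm.
    n_checks: total number of detector decisions (assumed FAR denominator).
    window:   an alarm within +/- window minutes of an onset is a detection.
    """
    delays = []
    used = set()  # indices of alarms already matched to an onset
    for onset in onsets:
        candidates = [(t, i) for i, t in enumerate(alarms)
                      if i not in used and abs(t - onset) <= window]
        if candidates:
            t, i = min(candidates)  # earliest matching alarm
            used.add(i)
            delays.append(t - onset)
    false_alarms = len(alarms) - len(used)
    dr = len(delays) / len(onsets) if onsets else 0.0
    far = false_alarms / n_checks if n_checks else 0.0
    mttd = sum(delays) / len(delays) if delays else None
    return dr, far, mttd


# Toy example: three true events, three alarms, 200 detector decisions.
dr, far, mttd = detection_metrics(
    onsets=[10.0, 60.0, 120.0],
    alarms=[9.0, 62.0, 90.0],
    n_checks=200,
    window=5.0,
)
print(dr, far, mttd)
```

Under this convention, a negative MTTD such as the one listed for [30] would mean that alarms were raised, on average, before the labeled onset of congestion.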
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used |
---|---|---|---|---|---|---|
[31] | 2021 | Historical and live traffic speed data from AutoNavi/Amap + microblog posts from Sina Weibo (Twitter-like platform) | Generalized Extreme Studentized Deviate (GESD) + XGBoost Classifier | Accuracy = 95.2%, DR = 95%, FAR = 0.115, MTTD = 13.65 | Medium | Binary (congested/non-congested) |
[19] | 2020 | Video cameras (UCSD and NU1 datasets) | ResNet101 + SVM | Accuracy = 97.64% | Medium | Three levels: light, medium, heavy |
Study | Year | Data Collection Method | AI Methodology | Performance Results | Real-World Applicability | Congestion Levels Used |
---|---|---|---|---|---|---|
[38] | 2019 | Realistic SUMO simulations on maps of Toronto, UBC, Los Angeles | Dynamic multimetric model (RWM) + TCD metric | Travel time reduced by 40–64%; fuel and CO2 emissions cut by ~28–50% compared to other routing methods | Medium | Binary (congested, non-congested) |
[42] | 2021 | GPS-based taxi trajectories (Wuchang, China) + VISSIM simulator | Density-based moving object clustering | F1-score up to 0.78; Precision 0.75; Recall 0.82; RMSE = 0.87 | High | Binary (congested, non-congested) |
Study | AI Methodology | Strengths of the Study | Limitations of the Study | Future Directions |
---|---|---|---|---|
[8] | Decision Tree + Optical Flow | Robust comparison of algorithms; high accuracy | Requires extensive labeling; limited night evaluation | Night-time video, advanced DL methods |
[5] | CNN (LFE-YOLOv8 + RAGFNet) | Effective under various weather types; high accuracy (98.2%) | Dependency on image quality; requires defogging step; possible dataset biases | Improve robustness under extreme weather; optimize model size |
[4] | Bayesian Tensor Factorization | Robust to noise; training-free; effective anomaly detection | Computational complexity with large datasets | Explore lightweight versions for edge computing |
[12] | Background Subtraction + Hybrid Edge Detection | Edge–cloud adaptive strategy; robust detection | Manual thresholds/ROI; limited moderate congestion detection | Deep learning integration; automated calibration; predictive analytics |
[39] | Multi-Branch CNN + YOLOv8 + VIFM (MBCNN architecture) | Innovative feature representation (VIFM); improved classification via parallel processing; better occlusion handling | Requires vehicle detection pre-process; error propagation risk | Generalize to other datasets/cities; use stronger vehicle detectors for VIFM |
[33] | SA-MobileNetV2 + Grad-CAM | Grad-CAM improves interpretability; efficient for edge deployment (low FLOPs, small model) | Focused on single frame; no temporal modeling; limited to highway scenarios | Incorporate video/temporal modeling; apply GOAMLP-CNN hybrids; use historical and multi-camera data |
Study | AI Methodology | Strengths of the Study | Limitations of the Study | Future Directions |
---|---|---|---|---|
[27] | Generalized ESD Outlier Test | Wide-area coverage without new sensors (uses ubiquitous cell phone signals); novel features: link pseudo-speed (handover timing) and link probe activity (phone signal density) | Relies on cellular provider data; lower accuracy for severe congestion (73% vs. 97% for freeflow); imprecise congestion boundaries | Improve handling of cellular data uncertainty; extend to urban arterials; integrate with flow relationships and other data sources
[6] | DBSCAN (clustering in FogJam) | High efficiency; reduces upstream data volume by 70% | Sensitive to VANET density and connectivity quality | Extend to hybrid edge–cloud architectures |
[13] | State Propagation Algorithm (SPA) | Robust method to identify congestion; distinguishes between structural congestion types (tree vs. loop) | Generalizability not validated; possible sampling bias | Develop real-time SPA-based tools for traffic management and mitigation |
[9] | Trajectory Clustering + VANET Rerouting | Combines congestion detection with mitigation; enables recurrent congestion analysis | Conducted offline; lacks explicit quantitative validation | Enable real-time prediction; test adaptability and predictive modeling in VANET environments |
[14] | Cell-Based CV Method | Map-independent method; avoids reliance on heavy map matching or high-precision maps | Sparse or low-frequency taxi data may reduce accuracy and leave areas unmonitored | Analyze spatiotemporal patterns and propagation of congestion using “congestion cells” framework |
Study | AI Methodology | Strengths of the Study | Limitations of the Study | Future Directions |
---|---|---|---|---|
[10] | Random Forest Classifier (MFCC audio features) | Low-cost, non-visual, robust across locations | Binary output only; initial manual labeling needed | Explore multi-microphone setups; integrate temporal modeling and deep learning
[35] | Whale Optimization Algorithm | End-to-end IoT–fog–cloud architecture improves responsiveness; optimizes energy and cost compared to baseline task schedulers | Detection technique not fully detailed; high infrastructure requirements (fog nodes/RSUs) | Include latency/security constraints in scheduling; embed advanced congestion detection at sensor level |
[31] | GESD + XGBoost Classifier | Combines statistical anomaly detection (speed vs. historical norms) with machine learning; high detection performance | Needs active social media users; risks from imprecise geotags and timestamps | Deploy in real time; extend to urban roads; automate alert systems when congestion aligns with social media posts |
[42] | Density-Based Moving Object Clustering | Accurately detects congestion spatiotemporally; effective even with sparse data; outperforms older clustering techniques | Relies on taxi data; may underperform with other vehicle types or in different regions | Use parallel processing for real-time deployment; study congestion propagation sources |
Model Family | Spatiotemporal Data | Probe Data | Hybrid/Multimodal Data | Best-Suited Scenario |
---|---|---|---|---|
Shallow ML | Decision Tree + Optical Flow [8] ↑ 99% accuracy, real time | DBSCAN [6] ↑ 75.6% bandwidth reduction | Random forest [10] ↑ 99% audio accuracy (low cost) | Edge devices; low-resource cities; real-time systems |
Deep Learning | LFE-YOLOv8 + RAGFNet [5] ↑ 99.7% in adverse weather | — | MBCNN + VIFM [39] ↑ 98.6% occlusion handling | High-infrastructure cities; complex urban environments; adverse conditions |
Probabilistic | Bayesian Tensor [4] ↑ Training-free (88% accuracy) | CGOMFSM [45] ↑ 88.6% sparse data (10% penetration) | — | Uncertain environments; low-data regions; highway monitoring |
Hybrid AI | ResNet101 + SVM [19] ↑ 97.64% accuracy (three-level detection) | — | — | Video surveillance requiring interpretability |
Rule-Based/Statistical | Background subtraction [12] ↑ 100% simulation accuracy | Generalized ESD [27] ↑ 95% cellular coverage | State propagation [13] ↑ structural pattern detection | Legacy systems; privacy-sensitive areas; low-compute zones |
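The "Best-Suited Scenario" column of the table above can be read as a simple lookup from deployment context to model family. The Python sketch below transcribes that column into a dictionary and filters it by keyword. The dictionary name, the function name, and the substring-matching rule are illustrative assumptions, not part of the review's methodology.

```python
# Best-suited scenarios per model family, transcribed from the table above.
ROADMAP = {
    "Shallow ML": ["edge devices", "low-resource cities", "real-time systems"],
    "Deep Learning": ["high-infrastructure cities",
                      "complex urban environments",
                      "adverse conditions"],
    "Probabilistic": ["uncertain environments", "low-data regions",
                      "highway monitoring"],
    "Hybrid AI": ["video surveillance requiring interpretability"],
    "Rule-Based/Statistical": ["legacy systems", "privacy-sensitive areas",
                               "low-compute zones"],
}


def suggest_families(keyword):
    """Return model families whose best-suited scenarios mention `keyword`."""
    key = keyword.lower()
    return [family for family, scenarios in ROADMAP.items()
            if any(key in scenario for scenario in scenarios)]


print(suggest_families("edge"))
print(suggest_families("low"))
```

A keyword match is only a first filter. The accuracy figures and data-type columns in the same table still need to be weighed against the data actually available.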
Model Family | Computation (Watts/Node) | Privacy Risk Index | Maintenance (h/Month) | Readiness Level | Exemplary Evidence | Mitigation Pathways |
---|---|---|---|---|---|---|
Deep Learning | 200 (GPU required) | 9.2/10 | 35–40 (retraining) | Limited | YOLOv8: USD 600/node [22]; 92% lack GDPR compliance [5,46]; IoT: RPi fails at 15 FPS [35]; Smart Int.: USD 12k GPU in Macao [22] | Federated learning [36]; synthetic data augmentation; TensorRT edge optimization |
Shallow ML | 2–5 (ARM/MCU) | 5.8/10 | 8–10 (calibration) | High | Random Forest: EUR 120/node [10]; 65% anonymize probe data [29]; IoT: ESP32 success in VANET [34]; Smart Int.: USD 120 controllers [10] | Differential privacy; automated threshold tuning; binarized models for MCUs |
Rule-Based | <1 (legacy HW) | 2.1/10 | 0.5–1 (validation) | Very High | ESD: EUR 0 incremental cost [27]; 100% aggregated data [16]; IoT: 500+ nodes in Seoul [13]; Smart Int.: Budapest signals [30] | Anomaly detection add-ons; crowdsourced validation; SCADA integration |
Hybrid | 10–100 (varies) | 6.7/10 | 50–60 (orchestration) | Moderate | ReFOCUS+: 40% latency [38]; privacy inheritance risk [31]; Smart Int.: Miami gridlock [43]; IoT: Toronto fog latency [38] | Standardized MQTT APIs; privacy-preserving fusion; AWS IoT Greengrass |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Bakir, D.; Moussaid, K.; Chiba, Z.; Abghour, N.; El omri, A. A SPAR-4-SLR Systematic Review of AI-Based Traffic Congestion Detection: Model Performance Across Diverse Data Types. Smart Cities 2025, 8, 143. https://doi.org/10.3390/smartcities8050143