Systematic Review

A Systematic Literature Review of Traffic Congestion Forecasting: From Machine Learning Techniques to Large Language Models

Laboratory of Applied Mathematics and Computer Sciences, Higher Normal School, University Hassan II, Casablanca 50069, Morocco
* Authors to whom correspondence should be addressed.
Vehicles 2025, 7(4), 142; https://doi.org/10.3390/vehicles7040142
Submission received: 8 October 2025 / Revised: 20 November 2025 / Accepted: 20 November 2025 / Published: 28 November 2025

Abstract

Traffic congestion continues to pose a significant challenge to contemporary urban transportation systems, exerting substantial effects on economic productivity, environmental sustainability, and the overall quality of life. This systematic literature review thoroughly explores the development of traffic congestion forecasting methodologies from 2014 to 2024 by analyzing 100 peer-reviewed publications according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We examine the technological advancements from traditional machine learning (achieving 75–85% accuracy) through deep learning approaches (85–92% accuracy) to recent large language model (LLM) implementations (90–95% accuracy). Our analysis indicates that LLM-based systems exhibit superior performance in managing multimodal data integration, comprehending traffic events, and predicting non-recurrent congestion scenarios. The key findings suggest that hybrid approaches, which integrate LLMs with specialized deep learning architectures, achieve the highest prediction accuracy while addressing the traditional limitations of edge case management and transfer learning capabilities. Nonetheless, challenges remain, including higher computational demands (50–100× higher than traditional methods), domain adaptation complexity, and constraints on real-time implementation. This review offers a comprehensive taxonomy of methodologies, performance benchmarks, and practical implementation guidelines, providing researchers and practitioners with a roadmap for advancing intelligent transportation systems using next-generation AI technologies.

1. Introduction

Traffic congestion is one of the most significant challenges confronting contemporary urban transportation systems, with extensive implications for economic productivity, environmental sustainability, and the quality of life. According to the 2024 INRIX Global Traffic Scorecard [1], traffic congestion continues to impose considerable socioeconomic burdens, with drivers in major metropolitan areas losing more than 100 h annually because of traffic delays. In the United States alone, the average driver lost 43 h to congestion in 2024, resulting in USD 771 in lost time and productivity per individual and USD 74 billion in cumulative economic losses.
Addressing traffic congestion also necessitates the promotion of sustainable mobility alternatives, particularly for short-distance, urban trips. Research has demonstrated that bicycle usage is a viable alternative to motor vehicles for urban journeys, significantly reducing congestion and providing environmental and space efficiency benefits. Macioszek and Jurdana [2] emphasized that bicycles not only reduce exhaust emissions and occupy minimal space in transport networks but also offer competitive travel times compared with private cars or public transport in urban areas. The integration of cycling infrastructure into traffic management systems represents a complementary approach to congestion forecasting, as bicycle adoption can substantially influence overall traffic patterns and congestion, particularly in mixed-traffic and urban environments.
The incorporation of artificial intelligence (AI) and machine learning (ML) technologies into intelligent transportation systems (ITSs) has emerged as a promising strategy for addressing these challenges. Over the past decade, the traffic congestion forecasting domain has undergone a significant transformation, transitioning from traditional statistical methods to advanced AI-driven approaches. This evolution has been notably accelerated by the advent of deep learning techniques and, more recently, by the integration of large language models (LLMs).
Researchers have increasingly acknowledged that traffic congestion is not merely a result of vehicular dynamics but also involves intricate multimodal interactions, including pedestrian movements, cyclist behavior, and their interactions with vehicular traffic at critical junctures such as intersections and pedestrian crossings. Recent studies have demonstrated that pedestrian crossing behaviors and traffic density interactions significantly influence urban traffic patterns, with machine learning approaches revealing causal relationships between traffic conditions and pedestrian waiting times [3]. Similarly, adaptive signal control systems powered by deep reinforcement learning have shown substantial improvements in managing mixed traffic scenarios at signalized intersections [4]. Mixed-traffic environments, where vehicles, pedestrians, and cyclists interact, present additional complexity to congestion forecasting models, necessitating sophisticated approaches that can capture these heterogeneous interactions [5].
Although several reviews have examined specific aspects of traffic prediction [6,7,8], a significant gap persists in comprehensive analyses that trace the complete technological evolution from traditional machine learning to LLMs. Previous surveys have typically concentrated on either traditional methods or deep learning approaches in isolation [9] without exploring the transformative potential of LLMs in traffic prediction contexts.
The rapid advancement of large language model (LLM) technologies and their successful implementation across various domains suggest their significant potential for predicting traffic congestion. However, the existing literature exhibits several gaps, including the absence of a comprehensive analysis comparing the performance of LLMs with traditional machine learning (ML) and deep learning techniques in traffic prediction tasks, of detailed performance benchmarks across different methodological approaches, of explicit guidelines helping practitioners select appropriate technologies for specific use cases and constraints, and of an investigation into the unique capabilities that LLMs offer for the integration of multimodal traffic data.

1.1. Research Gaps

This systematic literature review addresses these gaps by offering a comprehensive analysis of traffic congestion forecasting methodologies from 2014 to 2024. Our review provides five distinct insights as follows. First, we achieved complete technological spectrum coverage through an extensive analysis of 100 peer-reviewed publications across three distinct periods: traditional ML (2014–2017, achieving 75–85% accuracy), deep learning (2018–2020, achieving 85–92% accuracy), and LLMs with hybrid approaches (2021–2024, achieving 90–95% accuracy), applying consistent evaluation criteria across all methodological epochs. Second, we present the first systematic LLM integration analysis in traffic congestion forecasting, examining specific mechanisms for textual data processing (traffic reports, social media, and weather descriptions), multimodal fusion architectures, performance in non-recurrent scenarios, and transfer learning capabilities, as well as practical deployment guidelines. Third, we employed a structured PRISMA-based methodology with multiple analytical frameworks, ensuring rigorous PRISMA 2020 compliance (documented with a complete checklist in Supplementary Table S1 PRISMA 2020 Checklist and flow diagram in Figure 1), utilizing the PICO framework for systematic search strategy design, and applying standardized quality assessment criteria to all 100 studies. Fourth, we developed comprehensive technical taxonomies and capability matrices, including detailed taxonomies of AI/ML methods, data types, and performance metrics, and a capability matrix that evaluates approaches across eight critical dimensions: spatial dependency modeling, temporal pattern recognition, multimodal data integration, transfer learning, real-time processing, interpretability, edge case handling, and computational efficiency. Fifth, we provide an evidence-based implementation roadmap with practical model selection frameworks based on seven operational constraints (computational budgets, latency requirements, data availability, interpretability demands, prediction horizons, congestion scenarios, and infrastructure types). This reveals that LLM-based approaches require 50–100× more computational resources than traditional ML methods while achieving 10–15% accuracy gains and offering actionable recommendations for real-world deployment.

1.2. Contributions

This systematic literature review makes five distinct contributions to the traffic congestion forecasting field.
1. Complete Technological Spectrum Coverage: We present the first comprehensive analysis encompassing three distinct technological eras (2014–2024), with a consistent evaluation across 100 peer-reviewed publications. Our temporal analysis examined traditional machine learning approaches (2014–2017, achieving 75–85% accuracy), deep learning methods (2018–2020, achieving 85–92% accuracy), and recent large language model-based and hybrid systems (2021–2024, achieving 90–95% accuracy). This complete spectrum enables researchers to comprehend not only the current state of the art but also the evolutionary trajectory and incremental improvements across methodological transitions.
2. Systematic LLM Integration Analysis: We present the inaugural comprehensive analysis of large language models (LLMs) specifically applied to traffic congestion forecasting. Our study elucidates the specific mechanisms by which LLMs process textual traffic data, including incident reports, social media feeds, weather descriptions, and traffic news; multimodal fusion architectures that integrate LLMs with traditional sensors and spatial–temporal models; performance characteristics in non-recurrent congestion scenarios where traditional models encounter difficulties; transfer learning capabilities that facilitate cross-city and cross-domain applications; and practical deployment considerations, encompassing computational requirements, latency constraints, and interpretability trade-offs.
3. Structured PRISMA-Based Methodology with Multiple Analytical Frameworks: We employ a rigorous systematic review methodology to ensure transparency and reproducibility, which includes full compliance with PRISMA 2020 as documented through a comprehensive checklist (Supplementary Table S1) and a transparent flow diagram (Figure 1); the application of the PICO framework for designing a systematic search strategy, specifying the population (traffic systems), intervention (AI/ML forecasting methods), comparison (methodological approaches), and outcomes (prediction accuracy, computational costs); a standardized quality assessment with explicit scoring criteria (publication quality 30%, methodological rigor 50%, reporting transparency 20%), achieving an inter-rater reliability of κ = 0.91; and a comprehensive data extraction protocol capturing over 25 variables per study, enabling multidimensional analysis.
4. Comprehensive Technical Taxonomies and Capability Matrices: We develop comprehensive classification systems to facilitate systematic comparison: (a) technical taxonomies that organize AI/ML methods, data types and sources, and performance metrics, employing hierarchical classification for detailed analysis; (b) a capability matrix that evaluates approaches across eight critical dimensions—spatial dependency modeling, temporal pattern recognition, multimodal data integration, transfer learning, real-time processing capability, interpretability, edge case handling, and computational efficiency, using evidence-based ratings (low/medium/high) derived from quantitative performance data; and (c) detailed justifications for each rating, linked to specific study findings and performance benchmarks.
5. Evidence-Based Implementation Roadmap: We offer actionable guidelines for practitioners, derived from empirical evidence across 100 studies: a model selection framework that accounts for seven operational constraints, including computational budget, latency requirements, data availability, interpretability needs, prediction horizon, congestion type, and infrastructure context; a quantitative performance–cost trade-off analysis indicating that LLM-based approaches necessitate 50–100× more computational resources while achieving a 10–15% improvement in accuracy compared to traditional methods; deployment recommendations tailored to three infrastructure scenarios—urban networks, highways, and mixed environments—with specific algorithm suggestions; data requirement specifications, encompassing minimum temporal resolution (5 min optimal), spatial coverage (network-level versus link-level), and multimodal integration strategies; and practical guidelines for addressing real-world challenges, including missing data handling, model adaptation, and scalability considerations.
This review evaluated 100 carefully selected publications according to the PRISMA guidelines for systematic reviews (Figure 1). Our analysis focused on peer-reviewed journal articles and conference proceedings that presented original research on traffic congestion forecasting using AI/ML.
The remainder of this paper is organized as follows: Section 2 presents a comprehensive literature review that establishes the context and background of this study. Section 3 details our systematic review methodology, including the search strategies and the selection criteria. Section 4 outlines the procedures for data extraction and analysis. Section 5 offers a comparative analysis of various approaches. Section 6 discusses our key findings across multiple dimensions and proposes directions for future research. Finally, Section 7 concludes the paper with key takeaways and implications for the field.

2. Background and Related Work

2.1. Evolution of Traffic Prediction Research

In the past decade, traffic congestion forecasting has evolved through three distinct phases.

2.1.1. Traditional Machine Learning Era (2014–2017)

During this period, researchers primarily employed classical machine learning techniques, such as Support Vector Machines (SVMs), Random Forests, and Bayesian methods [11,12]. These approaches focused on extracting hand-crafted features from historical traffic data and demonstrated satisfactory performance in short-term predictions, achieving accuracy rates between 75% and 85%. Traditional ML methods excel in scenarios with relatively stable traffic patterns and well-structured datasets. For instance, SVM-based approaches achieved Mean Absolute Percentage Errors (MAPEs) of 12–18% for short-term (15–30 min) predictions on highway traffic datasets [11]. Random Forest algorithms have demonstrated particular effectiveness in handling feature interactions, with reported accuracies reaching 82–85% for recurrent congestion prediction [13]. K-Nearest Neighbors (K-NN) algorithms provided computational efficiency advantages, enabling real-time predictions with minimal resource requirements, albeit at reduced accuracy compared to more sophisticated approaches [14,15]. However, these methods face significant challenges in addressing complex spatial dependencies and nonrecurring events such as accidents or special events. The requirement for extensive feature engineering limits their adaptability to diverse traffic scenarios, with domain experts needing to manually design features that are specific to each geographic context or prediction task. Furthermore, traditional ML approaches struggle with long-term predictions exceeding 60 min, with accuracy degrading substantially as the prediction horizon is extended. The linear or shallow nonlinear relationships captured by these models are insufficient for modeling the intricate spatiotemporal dynamics of the urban traffic networks. Although advantageous for real-time applications, the computational efficiency of these methods comes at the cost of reduced predictive power in complex urban networks with intricate spatiotemporal correlations. These limitations motivated the transition to deep learning approaches, which can automatically learn hierarchical feature representations from raw traffic data without requiring explicit feature engineering processes.
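To make the feature-engineering workflow of this era concrete, the following minimal sketch trains a Random Forest on hand-crafted lag and calendar features for a 15 min ahead prediction. It assumes a pandas DataFrame of 5 min flow observations with a datetime index; the column names, lag choices, and hyperparameters are illustrative assumptions rather than settings taken from any reviewed study.

```python
# Minimal sketch of a traditional-ML pipeline for short-term traffic prediction.
# Assumes a DataFrame `df` with a DatetimeIndex and a 'flow' column (5 min counts);
# all feature and hyperparameter choices are illustrative.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_percentage_error

def make_features(df: pd.DataFrame, horizon_steps: int = 3) -> pd.DataFrame:
    """Hand-crafted lag and calendar features; target is flow `horizon_steps` ahead."""
    out = pd.DataFrame(index=df.index)
    for lag in (1, 2, 3, 6, 12):                       # 5-60 min of history
        out[f"flow_lag_{lag}"] = df["flow"].shift(lag)
    out["hour"] = df.index.hour                        # time-of-day effect
    out["dow"] = df.index.dayofweek                    # weekday/weekend effect
    out["target"] = df["flow"].shift(-horizon_steps)   # 15 min ahead
    return out.dropna()

feats = make_features(df)
X, y = feats.drop(columns="target"), feats["target"]
split = int(len(feats) * 0.8)                          # chronological split, no shuffling
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X.iloc[:split], y.iloc[:split])
mape = mean_absolute_percentage_error(y.iloc[split:], model.predict(X.iloc[split:]))
print(f"Test MAPE: {mape:.1%}")
```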

2.1.2. Deep Learning Revolution (2018–2020)

The advent of deep learning has marked a significant advancement in traffic prediction. Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs) have become predominant architectures, achieving accuracy rates between 85 and 92% [16,17]. These deep learning approaches have demonstrated substantial improvements over traditional ML methods by automatically extracting hierarchical features from raw sensor data. LSTM networks have proven particularly effective in capturing long-term temporal dependencies in traffic sequences, addressing the vanishing gradient problem that hindered earlier recurrent architectures. Hybrid CNN-LSTM models [17] achieved MAPE reductions to 6–10% by leveraging CNNs for spatial feature extraction and LSTMs for temporal modeling. Gated Recurrent Unit (GRU) variants offer computational efficiency advantages while maintaining competitive prediction accuracy, with 15–20% faster training times compared to LSTM counterparts [18]. Graph Neural Networks (GNNs) have emerged as particularly effective tools for capturing spatial relationships within road networks. The T-GCN model [16] integrated graph convolution operations with temporal modeling, achieving RMSE improvements of 15–20% compared to conventional approaches. Attention mechanisms introduced during this period [19,20] enabled models to dynamically focus on relevant spatial and temporal patterns, providing 5–10% accuracy enhancements over non-attention architectures. This period witnessed enhanced management of spatiotemporal dependencies, with models successfully predicting traffic states across network scales. However, challenges persist in integrating unstructured data sources such as social media, weather descriptions, and event information. The requirement for large-scale labeled training data and the limited interpretability of deep neural networks also present obstacles to practical deployment. Real-time inference constraints necessitate model compression techniques, with knowledge distillation and pruning approaches achieving 40–60% computational cost reductions while maintaining acceptable accuracy levels. The demonstrated success of deep learning in traffic prediction established the foundation for the subsequent integration of large language models, which can address the limitations of contextual understanding and multimodal data processing.
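As an illustration of the sequence models that dominated this period, the sketch below defines a compact LSTM regressor in PyTorch that maps a window of recent network-wide speed observations to the next interval's speeds. The layer sizes, window length, and sensor count are placeholder assumptions, not values from a specific study.

```python
# Minimal PyTorch sketch of an LSTM traffic-speed predictor (illustrative sizes only).
import torch
import torch.nn as nn

class SpeedLSTM(nn.Module):
    def __init__(self, n_sensors: int = 32, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_sensors, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_sensors)        # predict next-step speed per sensor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, n_sensors) -- e.g. 12 past 5 min observations
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])                       # (batch, n_sensors)

model = SpeedLSTM()
window = torch.randn(8, 12, 32)                         # dummy batch: 8 samples, 1 h history
pred = model(window)                                     # next 5 min speed estimate
loss = nn.functional.mse_loss(pred, torch.randn(8, 32))
loss.backward()                                          # standard gradient-based training step
```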

2.1.3. LLM Integration Phase (2021–2024)

Research initiatives such as TrafficBERT [21], GPT4MTS [22], and spatial–temporal LLMs [23] have demonstrated the potential of these models to integrate contextual information from various sources, achieving accuracy rates of 90–95%. LLMs have shown particular promise in processing multimodal data and understanding the semantic context of traffic events [24]. Unlike traditional deep learning approaches, which struggle with textual information, LLMs can directly process event descriptions, weather reports, social media feeds, and other unstructured data. The LLM-MPE framework [25] demonstrated superior performance in human mobility prediction during public events, with textual data integration significantly enhancing prediction accuracy beyond that of sensor-only approaches. The transfer learning capabilities distinguish LLM-based methods from conventional deep learning architectures. Pretrained language models capture general transportation knowledge that can be fine-tuned with limited city-specific data, partially addressing the challenge of geographic generalization. The TPLLM framework [26] showed that LLM-based models achieve 12–18% better performance in low-data scenarios compared to traditional deep learning approaches trained from scratch. Recent comprehensive reviews have identified several significant trends and challenges in LLM integration for traffic forecasting [27,28,29,30]. Recently, LLMs have been increasingly incorporated into traffic prediction systems, representing the latest frontier in forecasting methodology evolution [31,32,33]. However, the integration of LLMs introduces substantial computational challenges that must be addressed. Training and inference costs increase by 50–100× compared to traditional ML methods, with state-of-the-art models requiring GPU clusters for practical deployment. Real-time prediction latency remains a critical concern, with LLM inference times often exceeding the acceptable bounds for time-sensitive traffic management applications. The complexity of domain adaptation and the “black box” nature of LLMs also present obstacles to regulatory acceptance and operational deployment in safety-critical transportation systems. Hybrid architectures that combine LLM contextual understanding with specialized deep learning spatiotemporal models represent a promising direction, balancing performance with computational feasibility. These approaches leverage LLMs for event interpretation and contextual feature extraction while utilizing efficient GNN-based architectures for core traffic state prediction.
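One way such hybrid LLM-plus-spatiotemporal architectures can be realized is sketched below: a pretrained text encoder embeds a free-text incident report, and the embedding is concatenated with recent sensor readings before a small prediction head. The encoder name, dimensions, and fusion design are illustrative assumptions and do not reproduce any specific reviewed framework (e.g., TrafficBERT or TPLLM).

```python
# Illustrative fusion of an LLM-derived text embedding with sensor features.
# The encoder name and all dimensions are assumptions for this sketch.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
encoder = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

def embed_report(text: str) -> torch.Tensor:
    """Mean-pooled token embeddings of a free-text incident/weather report."""
    tokens = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**tokens).last_hidden_state     # (1, seq_len, 384)
    return hidden.mean(dim=1)                            # (1, 384)

class FusionHead(nn.Module):
    """Concatenate text context with recent sensor readings and predict congestion."""
    def __init__(self, text_dim: int = 384, sensor_dim: int = 12):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(text_dim + sensor_dim, 64),
                                 nn.ReLU(), nn.Linear(64, 1))

    def forward(self, text_emb, sensor_feats):
        return self.mlp(torch.cat([text_emb, sensor_feats], dim=-1))

report = "Two-lane closure on the eastbound highway after a collision near exit 14."
sensor_feats = torch.randn(1, 12)                         # last hour of 5 min speeds (dummy)
pred = FusionHead()(embed_report(report), sensor_feats)   # congestion-level estimate
```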

2.2. Current State of Research

2.2.1. Methodological Advances

Graph-based methodologies have emerged as the prevailing framework for spatial modeling, with T-GCN [16] and S-GCN-GRU-NN [18] demonstrating superior efficacy in capturing network-wide dependencies. The adoption of attention mechanisms and transformer architectures has gained momentum, with empirical studies indicating a 5–10% enhancement over traditional sequential models [19,20].
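For readers unfamiliar with how graph convolution captures network-wide spatial dependencies, the following is a minimal sketch of the normalized-adjacency propagation step used in GCN-based models such as T-GCN; the toy adjacency matrix and feature dimensions are illustrative, and the published models wrap this step inside a recurrent (GRU-style) cell for temporal modeling.

```python
# Minimal sketch of a single graph-convolution step over a road network
# (generic GCN propagation as used by T-GCN-style models; sizes illustrative).
import torch
import torch.nn as nn

def normalized_adjacency(adj: torch.Tensor) -> torch.Tensor:
    """A_hat = D^{-1/2} (A + I) D^{-1/2}: each node mixes its own and neighbors' states."""
    a_hat = adj + torch.eye(adj.size(0))
    deg_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * a_hat * deg_inv_sqrt.unsqueeze(0)

class GraphConv(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        # x: (n_nodes, in_dim) node features, e.g. recent speeds per road segment
        return torch.relu(a_hat @ self.weight(x))

adj = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])  # toy 3-segment corridor
a_hat = normalized_adjacency(adj)
features = torch.randn(3, 12)                                    # 12 past observations per node
spatial_repr = GraphConv(12, 16)(features, a_hat)                # fed to a GRU/LSTM in T-GCN
```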

2.2.2. Data Integration Challenges

The integration of heterogeneous data sources poses a substantial challenge. Traditional methodologies predominantly depend on sensor data; however, recent studies have investigated the incorporation of weather information [34], social media feeds [35], and event data [36]. Large language models (LLMs) have demonstrated particular promise in this domain because of their ability to process textual information in conjunction with numerical data.

2.2.3. Practical Implementation Barriers

Despite these theoretical advancements, several obstacles hinder practical implementation.
  • Computational requirements: Advanced models necessitate substantial resources.
  • Real-time constraints: Numerous models are unable to satisfy stringent latency demands.
  • Data quality issues: The presence of missing or noisy sensor data adversely impacts performance.
  • Generalization challenges: Models trained in one city frequently fail to transfer effectively.

3. Research Methodology

3.1. Systematic Review Protocol

This systematic literature review was conducted and documented in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement [10,37]. The PRISMA 2020 guidelines, which offer an evidence-based minimum set of items for reporting in systematic reviews, were adhered to throughout all phases of this review.
PRISMA Compliance Documentation
  • A completed PRISMA 2020 checklist documenting where each of the 27 checklist items appears in this manuscript is provided as Supplementary Table S1.
  • The PRISMA 2020 flow diagram documenting the study selection process is presented in Figure 1.
  • This review followed the PRISMA 2020 guidelines adapted for computer science systematic reviews to ensure comprehensive, transparent, and reproducible results.
Protocol Registration: This review was not prospectively registered because it aimed to offer a comprehensive overview of the research landscape rather than assess the effects of an intervention. This approach is consistent with the PRISMA 2020 guidelines for scoping and methodology-focused systematic reviews in computer science.

3.2. Research Questions

Six research questions were identified, as shown in Table 1. Each research question was accompanied by a justification statement explaining its relevance.

3.3. Search Strategy, Criteria and Selection Process

In accordance with our comprehensive review strategy across multiple databases, we established specific inclusion and exclusion criteria for the included studies [38]. These criteria were essential for refining our search to ensure the relevance and rigor of the studies included. Specifically, we concentrated on research published between 2014 and 2024 that addressed traffic congestion forecasting or prediction, utilized machine learning, deep learning, or LLM approaches, and consisted of peer-reviewed journal articles and conference proceedings with clear methodological descriptions and performance evaluations. The exclusion criteria encompassed publications prior to 2014 that lacked clear performance metrics, papers not focused on traffic congestion or flow prediction, reviews, surveys, or meta-analyses without original research, and publications in languages other than English.
Following the establishment of our inclusion and exclusion criteria, we meticulously applied them during the study selection process using a search query across various web-based databases.
  • IEEE Xplore Digital Library
  • ScienceDirect (Elsevier)
  • SpringerLink
Our search strategy employed the following query with Boolean operators. The search string was applied to the title, abstract, and keywords fields, as these fields offer the most pertinent initial screening while ensuring comprehensive coverage of the literature. This approach effectively balances sensitivity (capturing relevant studies) with precision (avoiding excessive irrelevant results), which is regarded as best practice for systematic reviews in computer science [37]. A full-text search would have produced an excessive number of false positives, in which congestion forecasting terms appear only in peripheral sections or in the references.
(traffic AND (congestion OR flow OR volume OR density) AND
(prediction OR forecasting) AND (machine learning OR deep learning OR
neural networks) AND (LLM OR “large language model” OR transformer))
OR (intelligent transportation systems AND prediction) OR
(traffic data AND (analytics OR modeling))
The selection process consisted of four stages:
1. Identification: The initial search yielded 3847 records.
2. Screening: Title and abstract screening reduced the set to 512 relevant papers.
3. Eligibility: Full-text assessment excluded 412 studies.
4. Inclusion: Quality assessment resulted in 100 high-quality papers.
Search Strategy Rationale: Our search strategy incorporated terms related to large language models (LLMs), specifically “LLM OR large language model OR transformer,” which may theoretically omit certain studies predating 2020 that do not reference these terms. Nonetheless, several factors alleviate this concern: (1) the use of OR logic enables the inclusion of papers that align with traditional machine learning (ML) and deep learning (DL) terminology, even in the absence of LLM-specific terms, during the initial identification phase; (2) the extensive inclusion of traditional ML and DL terms (machine learning, deep learning, and neural networks) ensures the identification of literature predating 2020; (3) the term “transformer” encompasses papers from 2017 to 2020 that discuss attention mechanisms prior to the widespread adoption of LLM terminology; and (4) additional relevant studies were identified through supplementary backward citation tracking from key papers. To validate the comprehensiveness of our search, we conducted an additional search in October 2024, employing only traditional ML/DL terms without LLM-related keywords, specifically targeting the 2014–2020 period. This validation search yielded 2847 records, and cross-referencing revealed that 94% of our 2014–2020 papers (54 out of 57 papers) were captured by this alternative approach, thereby confirming the comprehensive coverage of the pre-LLM literature. The three papers uniquely identified by our primary search were early transformer papers (2020) that discussed attention mechanisms, which are pertinent to our analysis of technological evolution.
Quality assessment also revealed specific areas where future research could contribute to the field. Notably, several studies showed weaknesses in the clarity of their research objectives and methodology, the appropriateness of the AI techniques employed, the validation approach and experimental design, the performance evaluation metrics, the acknowledgment of limitations, and the discussion of results in the context of the existing literature.
A systematic quality assessment framework was implemented for all studies that advanced to the full-text review. This assessment utilized a quantitative scoring system adapted from Kitchenham’s guidelines for systematic reviews in software engineering [37] and specifically modified for the evaluation of AI/ML research. Each study was evaluated across three dimensions using explicit scoring rubrics (Table 2).
Minimum Quality Threshold: Studies scoring below 60/100 points were excluded from the final analysis. This threshold ensured the inclusion of only high-quality, methodologically sound research with adequate reporting of transparency. The final quality scores for the 100 included studies showed the following distributions:
  • High quality (80–100 points): 34 studies (34%);
  • Good quality (70–79 points): 41 studies (41%);
  • Acceptable quality (60–69 points): 25 studies (25%);
  • Mean quality score: 73.8 (SD = 8.4; range: 60–94).
This distribution confirms that all the included studies met the stringent quality standards, with 75% achieving good-to-high quality ratings.

3.4. Key Terms Combination

This stage is consistent across most PICO frameworks, wherein keywords were derived from the preceding research questions to formulate the key term combination for the search [39]. Each question was deconstructed into four primary elements: population (P), intervention (I), comparison (C), and outcome (O). In Table 3, each element is accompanied by a sentence that illustrates the core of the systematic literature review (SLR).
This PICO process has also been used in [40,41,42] but in this study, we projected this procedure onto traffic congestion prediction by focusing on the evolution from machine learning-based techniques to large language models. Table 4 lists the words obtained for each PICO component.
We developed a comprehensive taxonomy of the key terms employed in the PICO process for traffic congestion forecasting research, as shown in Table 5, Table 6 and Table 7. This taxonomy organizes terms based on their conceptual similarities and roles within the research domain and encompasses seven categories:
AI/ML Method Taxonomy: This hierarchically organized taxonomy ranges from traditional machine learning methods to deep learning, large language models (LLMs), and advanced techniques, illustrating the evolution of approaches over time.
Data Type Taxonomy: This categorizes the various types of data utilized in traffic forecasting, from structured sensor data to spatiotemporal information and contextual data such as weather and events.
Congestion Scenario Taxonomy: This classifies different congestion types by cause (recurrent versus non-recurrent) and time horizon (short-, medium-, and long-term prediction).
Performance Metrics Taxonomy: This groups the evaluation metrics used to assess the model performance, including both accuracy and error metrics.
Limitation and Challenge Taxonomy: This organizes the common limitations mentioned in the literature, ranging from data-related issues to performance-related constraints.
Implementation Considerations Taxonomy: This categorizes the practical aspects of model deployment, including resource requirements and real-world applicability of the model.
Methodological Evolution Taxonomy: This maps the progression of research paradigms and momentum in the field over the past decade to the present.

4. Data Extraction

Taxonomy has functioned as a structured vocabulary for systematically analyzing papers and organizing findings according to the PICO framework. It encapsulates the transition from traditional methods to LLMs while preserving the conceptual relationships between analogous terms. This classification facilitated our data analysis using a mixed-methods approach, encompassing a quantitative analysis of performance metrics, qualitative assessment of methodologies, chronological analysis to discern trends, comparative analysis of various AI techniques, and thematic analysis of limitations and challenges.

4.1. PRISMA-Based Data Extraction

The data extraction process was meticulously designed to systematically address all pertinent PRISMA 2020 reporting guidelines. To ensure reliability and consistency, two reviewers independently extracted data from a 20% random sample of the included studies, achieving a Cohen’s κ of 0.91, which indicates near-perfect agreement. Disagreements were resolved through discussion and consensus. The remaining studies were extracted by one reviewer, and random checks were conducted by a second reviewer to ensure quality.

4.2. Publication Trends

The distribution of the 100 selected papers [11,12,13,14,16,17,18,19,20,21,22,23,25,26,34,35,36,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124] across the study period (2014–2024) is illustrated in Figure 2. There is a clear upward trend in publications, with a notable surge beginning in 2019–2020, which corresponds to an increased interest in deep learning applications for traffic analysis.

4.3. AI Techniques Distribution

The distribution of AI techniques across the selected papers, as shown in Figure 3, reveals the predominance of machine learning and deep learning approaches, with the recent emergence of LLM-based methods being noted.
Figure 4, Figure 5 and Figure 6 show the detailed distribution of machine learning, deep learning, and LLM-related techniques used for traffic congestion forecasting in the selected studies.
The methodological landscape of the papers under consideration reveals a diverse array of approaches to studying this topic in the literature. Machine learning techniques, excluding large language models, constitute a substantial portion of the research methodologies. A smaller yet significant number of studies have focused exclusively on language models. Some studies have demonstrated a hybrid approach that integrates both machine learning and large language model (LLM) techniques. Notably, deep learning methodologies have emerged as the most prevalent approach in these studies. This distribution highlights the varied technological strategies employed by researchers in their studies, reflecting the dynamic nature of the field and the ongoing exploration of different computational techniques.

4.4. Performance Metrics

The assessment of machine learning models generally encompasses various performance metrics, with accuracy being the most commonly reported metric. Both general and specific error rates are frequently used to evaluate the model performance. For regression tasks, the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are widely utilized, offering insights into the average magnitude of prediction errors. The Mean Absolute Percentage Error (MAPE) provides a percentage-based measure of prediction accuracy. In the context of classification problems, the Receiver Operating Characteristic (ROC) curve and its associated Area Under the Curve (AUC) are often used to assess model discrimination. Furthermore, precision, F1-score, and recall are employed, albeit less frequently, to provide a more nuanced understanding of model performance, particularly in scenarios involving imbalanced classification. The most commonly reported performance metrics across studies are shown in Figure 7.
A critical challenge in systematic reviews of machine learning research is the heterogeneity of performance metrics and evaluation protocols across studies. This section details our approach to harmonizing the metrics and aggregating the performance ranges reported in this review. Studies in our corpus employed diverse accuracy metrics, including RMSE, MAE, MAPE, $R^2$, and classification accuracy. To enable a fair comparison, we applied systematic conversions.
1. Accuracy Unification: For regression tasks (the most common case), we prioritized the Mean Absolute Percentage Error (MAPE) as the primary metric because it provides a scale-independent comparison. When studies reported only the RMSE or MAE, we estimated
$\mathrm{MAPE}_{\mathrm{estimated}} = \dfrac{\mathrm{RMSE}}{\mathrm{mean}(\mathrm{ground\ truth})} \times 100$
This approximation assumes a normal error distribution and provides conservative estimates. Where both MAPE and RMSE/MAE were reported (68 of 100 studies), we validated this conversion formula, finding a correlation of r = 0.89 (p < 0.001) between the calculated and reported MAPE values.
2. Classification Metrics: For studies framing congestion as classification (22 of 100 studies), accuracy, F1-score, and AUC were used directly. When converting to a scale comparable with the regression metrics:
  • Accuracy > 90% ≈ MAPE < 10%;
  • Accuracy 80–90% ≈ MAPE 10–20%;
  • Accuracy < 80% ≈ MAPE > 20%.
These equivalencies were established by analyzing 15 studies that reported both classification and regression metrics. Table 8 summarizes the performance results of various methodological approaches.
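The harmonization rules above can be expressed as a small helper, sketched here with the thresholds stated in the text; the function names are ours, and the RMSE-to-MAPE conversion remains the approximation described above.

```python
# Sketch of the metric-harmonization rules described above (approximate by design).

def mape_from_rmse(rmse: float, mean_ground_truth: float) -> float:
    """MAPE_estimated = RMSE / mean(ground truth) * 100 (assumes roughly normal errors)."""
    return rmse / mean_ground_truth * 100.0

def mape_band_from_accuracy(accuracy_pct: float) -> str:
    """Map classification accuracy to the equivalent MAPE band used in this review."""
    if accuracy_pct > 90:
        return "MAPE < 10%"
    if accuracy_pct >= 80:
        return "MAPE 10-20%"
    return "MAPE > 20%"

# Example: a study reporting RMSE = 4.2 veh/5 min with a mean flow of 38 veh/5 min
print(round(mape_from_rmse(4.2, 38.0), 1))    # ~11.1% estimated MAPE
print(mape_band_from_accuracy(87.0))          # "MAPE 10-20%"
```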
To ensure transparency and enable reproducibility, we systematically documented the characteristics of all the datasets employed in the reviewed studies. Table 9 presents comprehensive metadata for the eight primary datasets, accounting for 93% of all studies, including temporal and spatial resolution specifications, sensor infrastructure details, and data collection timeframes.

4.5. Common Limitations

Researchers frequently encounter various challenges in their studies, with data-related issues being the most prevalent, followed closely by limitations in prediction accuracy and overall model performance. Less common but still noteworthy are constraints on computing power, concerns about efficiency, and difficulties in processing information in real time. Although less frequent, problems with data handling and accurate detection continue to pose challenges for some researchers. The most frequently cited limitations in the reviewed papers are presented in Figure 8.
The primary challenges are as follows:
  • Data-related issues (66% of studies)
  • Prediction accuracy limitations (50%)
  • Performance constraints (36%)
  • Computational requirements (32%)

5. Comparative Analysis Across Model Types

5.1. Traditional Machine Learning vs. Deep Learning

In earlier research conducted between 2014 and 2018, traditional machine learning techniques, such as Support Vector Machines (SVMs), random forests, and regression models, were predominant. However, these methods have been supplanted by deep learning approaches (Figure 9). The primary distinctions are as follows.
The advantages of traditional machine learning include reduced computational demands, enhanced interpretability, effectiveness with smaller datasets, expedited training times, and straightforward implementation.
The advantages of deep learning include superior accuracy in modeling complex traffic patterns, enhanced capability for managing spatiotemporal relationships, ability to automatically extract relevant features, increased robustness to noisy data, and improved performance in non-stationary traffic environments.
The shift from traditional machine learning to deep learning is evident in the chronological analysis, with a significant increase in the adoption of deep learning methodologies post-2018. Notably, Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) architectures have been predominantly utilized for time-series predictions.

5.2. Deep Learning vs. LLM Approaches

The advent of LLM-based methodologies in the domain of traffic congestion forecasting represents a recent development, with most studies pertaining to LLMs emerging post-2021. The comparative analysis yielded the following results:
The strengths of deep learning (non-LLM) in the context of traffic analysis include its well-established validation within the domain, generally lower computational requirements than LLMs, more accessible implementation for transportation researchers, and direct design for time-series prediction.
The strengths of large language model (LLM) approaches include their ability to incorporate contextual and semantic information, enhanced capacity to handle multimodal data such as text descriptions and sensor data, superior transfer learning capabilities derived from pretrained models, potential for integrating external knowledge, and increased robustness in the presence of missing data.
Large language model (LLM) methodologies exhibit significant potential in scenarios requiring the synthesis of varied data sources, particularly when contextual elements such as events, meteorological conditions, or traffic incidents exert a substantial impact on congestion patterns (Figure 10).

5.3. Hybrid Approaches

Several studies (n = 4) have investigated hybrid methodologies that integrate traditional machine learning (ML), deep learning, and large language model (LLM) techniques. These hybrid models typically exhibit enhanced performance compared to approaches that utilize a single technique. Common configurations of these hybrid models include LSTM + CNN architectures for spatiotemporal data, Transformer + LSTM for the integration of sequential and attention mechanisms, BERT combined with traditional ML for feature extraction and prediction, and GCN + GRU for capturing both local and global spatial correlations.
These hybrid methodologies capitalize on the advantages of various techniques to effectively address the complex nature of traffic congestion.

5.4. Cross-Methodology Performance Analysis

Table 10 provides a comprehensive comparison of representative studies that used various methods.

5.5. Capability Matrix

Table 11 presents a comprehensive assessment of the capabilities of these methodologies.

5.6. Performance–Complexity Trade-Off Analysis

Figure 11 visualizes the relationship between model complexity and prediction accuracy. The scatter plot relates prediction accuracy to computational cost for the different model types: traditional ML offers lower accuracy (75–85%) with minimal computational requirements (1–5× baseline); deep learning improves accuracy (85–92%) at moderate computational cost (10–25× baseline); LLMs achieve high accuracy (90–95%) but impose substantial computational demands (50–80× baseline); and hybrid models reach the highest accuracy (90–95%) with variable computational requirements.
The computational costs were normalized relative to the baseline to allow for a fair comparison.
Baseline Reference: Support Vector Machine (SVM) with RBF kernel on PeMS dataset (5 min resolution, 1-hour prediction horizon) implemented on single CPU core (Intel Xeon E5-2680v4 @ 2.4 GHz).
Cost Metrics Normalized:
  • Training Time Ratio: $\mathrm{Cost}_{\mathrm{training}} = \mathrm{Time}_{\mathrm{method}} / \mathrm{Time}_{\mathrm{baseline}}$
  • Inference Latency Ratio: $\mathrm{Cost}_{\mathrm{inference}} = \mathrm{Latency}_{\mathrm{method}} / \mathrm{Latency}_{\mathrm{baseline}}$
  • Memory Usage Ratio: $\mathrm{Cost}_{\mathrm{memory}} = \mathrm{RAM}_{\mathrm{peak,method}} / \mathrm{RAM}_{\mathrm{peak,baseline}}$
When studies used different hardware (GPUs), time-based costs were adjusted using standard benchmark ratios (e.g., NVIDIA V100 ≈ 20× single central processing unit (CPU) core for neural network training).
Cost Range Interpretation: Ranges such as “50–100×” for computational cost indicate the following:
  • Lower bound (50×): simplest implementation, smaller model, single GPU;
  • Upper bound (100×): complex architecture, larger model, distributed training;
  • The range reflects real-world deployment variance based on implementation choices.
All computational costs in this review are reported relative to this normalized baseline, ensuring interpretable and consistent comparisons.
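As a worked illustration of this normalization, the sketch below expresses a method's measured costs as multiples of the SVM/CPU baseline and applies the GPU-to-CPU adjustment mentioned above; all numbers are invented for the example, and the baseline values are placeholders rather than measurements from the reviewed studies.

```python
# Sketch of the cost-normalization procedure described above (values illustrative).
BASELINE = {"train_s": 120.0, "infer_ms": 2.0, "ram_gb": 1.5}   # assumed SVM/CPU baseline
GPU_TO_CPU_FACTOR = 20.0   # V100-to-single-core adjustment quoted in the text

def normalized_costs(train_s, infer_ms, ram_gb, measured_on_gpu=False):
    """Express a method's costs as multiples of the SVM/CPU baseline."""
    if measured_on_gpu:                        # convert GPU wall-clock to CPU-equivalent
        train_s *= GPU_TO_CPU_FACTOR
        infer_ms *= GPU_TO_CPU_FACTOR
    return {"training_x": train_s / BASELINE["train_s"],
            "inference_x": infer_ms / BASELINE["infer_ms"],
            "memory_x": ram_gb / BASELINE["ram_gb"]}

# e.g. an LLM-based model trained in 6 min of GPU time with 40 GB peak memory
print(normalized_costs(train_s=360.0, infer_ms=15.0, ram_gb=40.0, measured_on_gpu=True))
# -> training ~60x, inference ~150x, memory ~27x the baseline
```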
Key insights:
  • Traditional ML offers best efficiency for simple scenarios;
  • Deep learning provides balanced performance–complexity ratio;
  • LLMs excel in accuracy but at significant computational cost;
  • Hybrid approaches offer flexibility but increase complexity.

5.7. Application Suitability Matrix

Based on our analysis, we developed recommendations for different application scenarios (Table 12).

5.8. Detailed Comparison Tables

We present detailed comparison tables of the reviewed studies categorized by the primary methodological approach. For each paper, we included the methods used, datasets, limitations, performance metrics, and key findings to facilitate direct comparisons between studies. Table 13 presents a sample of the 100 reviewed papers, selected to illustrate the diversity of approaches and evolution over the study period.

6. Results and Outcomes: Answering Research Questions

6.1. Impact of Evolution from ML to LLMs on Traffic Congestion Forecasting (RQ1)

The progression from traditional machine learning to deep learning and ultimately to large language models has fundamentally transformed the prediction of traffic congestion. Our analysis revealed several significant advancements along this methodological continuum.
Predictive accuracy has shown consistent improvement, with traditional ML approaches (2014–2017) achieving 80–85% accuracy, deep learning methods (2018–2020) achieving 85–92%, and recent LLM-integrated systems (2021–2024) demonstrating 90–95% accuracy across diverse traffic scenarios.
Contemporary LLM-based approaches have transcended the limitations of earlier systems by effectively integrating unstructured data sources, including incident reports, weather descriptions, and social media feeds, with traditional sensor data. This multimodal integration creates a comprehensive analytical framework.
Notably, LLM implementations exhibit sophisticated contextual understanding by interpreting the semantic significance of traffic events and their potential cascading effects on traffic congestion. Additionally, these systems have transfer learning capabilities and can apply pre-existing knowledge structures without requiring extensive domain-specific training data.
Our review further indicates substantial improvements in edge case management and non-recurrent congestion event prediction, which are historically challenging scenarios that limit the practical utility of earlier forecasting systems.
These developments suggest promising research directions, particularly in hybrid systems that combine traditional physics-based models with the contextual understanding capabilities of modern language models.

6.2. Performance–Cost Trade-Offs Between ML, DL, and LLMs (RQ2)

Our analysis reveals distinct performance–cost trade-offs across the methodological spectrum, as shown in Table 14.
A clear correlation was observed between computational cost and accuracy: large language model (LLM) methodologies employ 50–100 times more computational resources than traditional machine learning (ML) techniques, yielding prediction accuracy improvements of up to 10–15%, and optimized deep learning models frequently represent the best balance for real-time applications.

6.3. AI Model Performance Across Transportation Infrastructure Scenarios (RQ3)

The performance of the AI models varied significantly across the different transportation infrastructure scenarios, as shown in Table 15.

6.4. Effectiveness of AI Models in Predicting Different Types of Congestion (RQ4)

The effectiveness of the AI models varied significantly based on the type of congestion and prediction time frame (Table 16 and Table 17). In the context of recurrent congestion, traditional machine learning (ML) techniques are cost effective for making short-term predictions. However, for non-recurrent events and extended prediction horizons, the significant performance benefits of large language models (LLMs) and hybrid methodologies justify their increased computational expense.

6.5. Impact of Traffic Parameters on AI Model Performance (RQ5)

Our analysis identified several critical traffic parameters that significantly influenced the performance of the AI model for congestion forecasting as listed in Table 18 and Table 19. The principal insight is that the integration of multimodal approaches results in the most substantial enhancement in performance, whereas temporal resolution consistently improves the outcomes across various model types.

6.6. Comparative Advantages of Different AI Architectures (RQ6)

A comparative analysis of different AI architectures and their combinations revealed distinct advantages for specific traffic prediction scenarios, as shown in Table 20.
The analysis revealed that methodologies that integrate complementary architectures consistently outperform single-architecture approaches, yielding performance enhancements ranging from 5 to 12% in complex traffic scenarios. CNN + LSTM configurations [17] exhibit high efficacy by adeptly capturing both spatial and temporal traffic patterns and are particularly effective in video-based traffic analyses. GNN + GRU approaches [18] achieve notable performance by merging network topology modeling with temporal sequence learning, excelling in intricate urban networks. BERT + ML implementations [21] offer efficient solutions that leverage contextual understanding while maintaining computational efficiency, rendering them effective for incorporating textual data. Transformer + GNN architectures [111] demonstrate the highest overall performance in complex scenarios with multiple influencing factors. The optimal architectural combination is significantly dependent on the specific prediction task, available data types, and the computational constraints.

7. Discussion and Future Directions

7.1. Temporal Evolution of Methods

An examination of publications from 2014 to 2024 demonstrated discernible chronological advancements in the methodologies used to predict traffic congestion.
  • 2014–2017: Traditional ML Dominance This period was marked by the widespread use of traditional machine learning techniques, such as Support Vector Machines, Random Forests, and Bayesian methods. These approaches primarily focus on analyzing historical traffic data with limited integration of external factors such as weather conditions.
  • 2018–2020: Deep Learning Emergence During this period, a notable transition occurred, with deep learning methodologies, particularly Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) architectures, emerging as the dominant techniques. These models effectively address the limitations of traditional machine learning in capturing intricate temporal and spatial dependencies in traffic data analysis.
  • 2021–2024: LLM Integration and Hybrid Models Recent developments have seen the advent of approaches based on large language models (LLMs) and advanced hybrid models (HM). These methodologies facilitate the integration of multimodal data sources and contextual information (Figure 12), resulting in enhanced predictive accuracy, particularly for non-recurrent congestion events.

7.2. Performance Comparison

A comparative analysis of the performance metrics across various methodological approaches yielded several key insights, as presented in Table 21. Regarding accuracy improvements, a general trend of increasing prediction accuracy was observed over the study period, with hybrid models demonstrating the highest performance, achieving 90–95% accuracy compared to 75–85% for traditional machine learning approaches. In terms of error reduction, recent approaches have achieved significant reductions in the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) compared to earlier methods, with hybrid models reporting MAE values as low as 2–4% and RMSE values of 3–5, representing substantial improvements over traditional methods. Regarding computational trade-offs, although advanced models, particularly those based on large language models (LLMs), exhibit superior accuracy, they entail significantly increased computational requirements, with training times extending to days and memory requirements ranging from 10 to 100 GB, posing implementation challenges for real-time applications such as autonomous driving. In terms of sensitivity, LLM and hybrid approaches have shown notably improved performance during special events, adverse weather conditions, and other non-recurrent congestion scenarios compared with traditional methods. Regarding the temporal horizon impact, deep learning and LLM approaches consistently outperform traditional machine learning for longer prediction horizons (>30 min), whereas the performance gap narrows for very short-term predictions.

7.3. Domain-Specific Considerations

Various domain-specific factors influence the efficacy of these methods. In the context of urban versus highway environments, Convolutional Neural Network (CNN)- and Graph Neural Network (GNN)-based approaches demonstrate particular effectiveness in complex urban networks, whereas Long Short-Term Memory (LSTM) models frequently excel in highway traffic prediction. Regions with limited sensor infrastructure can derive substantial benefits from the transfer learning capabilities of large language model (LLM)-based approaches. Regarding computational resources, implementation contexts with constrained computational capacities may find optimized traditional machine learning (ML) approaches more advantageous despite their relatively lower accuracy. In terms of integration requirements (Figure 13), scenarios necessitating the fusion of heterogeneous data sources, such as traffic sensors, weather data, events, and social media, were most effectively addressed by the LLM and hybrid approaches.

7.4. Specific Considerations for LLM vs. ML Comparison

7.4.1. Contextual Understanding

A significant advantage of large language model (LLM)-based approaches over traditional machine learning (ML) and basic deep learning methods is their enhanced ability to comprehend context. This advantage is evident in several domains. In event impact modeling, LLMs can incorporate information about special events, roadwork, or incidents by processing textual descriptions and relating them to the traffic patterns. Regarding semantic interpretation, LLMs can interpret the semantic meaning of traffic-related texts, allowing news reports, social media, and other textual sources to be used as input. In terms of transfer learning capabilities, the application of LLM-based approaches has substantially improved transfer learning within this domain, enabling models to leverage the pretrained knowledge of various factors influencing traffic patterns, thereby reducing the need for extensive domain-specific training data sets. Regarding edge case handling, the review highlights a significant improvement in managing edge cases and non-recurrent congestion events with the implementation of LLM-based methodologies, effectively addressing the critical limitations of previous approaches. In the temporal context, LLMs exhibit a superior ability to comprehend temporal references (such as holiday periods, weekends, and rush hours) with greater nuance than traditional models do.

7.4.2. Technical Implementation Challenges

Despite their advantages, techniques based on large language models (LLMs) face significant practical challenges compared to traditional machine learning approaches. LLMs require substantial computational resources for both training and inference, which limits their applicability in real-time scenarios. In the context of domain adaptation, it is crucial to employ domain-specific data augmentation to effectively fine-tune general-purpose LLMs for traffic-related applications. Regarding interpretability, the predictions generated by LLMs are less interpretable than those produced by simpler models, raising concerns regarding stakeholder confidence and system validation. For data integration, the incorporation of numerical sensor data with textual information for LLM processing necessitates the use of sophisticated data fusion algorithms.

7.4.3. Performance in Edge Cases

The comparative analysis identified significant differences in the handling of traffic prediction edge cases between the machine learning (ML) and large language model (LLM) approaches. In instances of non-recurrent congestion, such as accidents, severe weather, and special events, LLMs and hybrid models exhibit superior performance. Regarding data sparsity, LLM-based approaches demonstrate greater robustness to missing data and sparse sensor coverage than traditional ML methods. Regarding the prediction horizon, whereas traditional ML models experience rapidly degrading performance beyond short-term predictions, LLM and hybrid approaches maintain better accuracy for long-term predictions. In terms of transferability, LLM-based models show enhanced cross-location transferability and require less location-specific training data than traditional models do.

7.5. Hybrid Approaches: Taxonomy and Characteristics

Hybrid approaches that combine multiple methodological paradigms achieve the highest performance levels (Table 22). We identified four distinct types of hybrid architectures based on the integration mechanisms:
Type 1: Ensemble Hybrids. Ensemble methodologies integrate predictions from multiple independent models using mechanisms such as weighted averaging, stacking, and voting. The architectural characteristics are as follows:
- Multiple models trained independently on identical or varied datasets.
- Integration at the prediction stage via meta-learning or straightforward aggregation.
- An example includes the combination of Random Forest, XGBoost, and LSTM through stacked generalization (see the sketch below).
- Performance improvement ranges from 3% to 7% compared to individual models.
- Computational cost comprises the sum of individual model costs with minimal aggregation overhead.
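As a hedged illustration of Type 1 stacking, the following sketch uses scikit-learn's StackingRegressor on synthetic lagged flow features; GradientBoostingRegressor stands in for XGBoost, the LSTM base learner is omitted for brevity, and all data and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic lagged traffic features: previous five 5-min flow readings -> next reading.
rng = np.random.default_rng(0)
X = rng.uniform(200, 1200, size=(500, 5))          # vehicles per 5 min on upstream detectors
y = 0.6 * X[:, -1] + 0.3 * X[:, -2] + rng.normal(0, 30, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Type 1 hybrid: independently trained base learners, fused by a meta-learner (stacking).
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("gbt", GradientBoostingRegressor(random_state=0)),   # stands in for XGBoost here
    ],
    final_estimator=RidgeCV(),
)
stack.fit(X_train, y_train)
print("R^2 on held-out data:", stack.score(X_test, y_test))
```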
Type 2: Physics-Informed Hybrids. These approaches integrate data-driven ML/DL models with physical traffic flow models or constraints. The architectural characteristics are as follows:
- Neural networks that incorporate traffic flow equations as soft constraints.
- Deep learning with physics-based loss functions ensuring adherence to conservation laws (see the loss sketch below).
- Example: a GNN constrained by macroscopic traffic flow dynamics (the LWR model).
- Performance gain: 5–10%, especially in data-scarce scenarios.
- Additional benefit: improved generalization and physical plausibility.
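The sketch below illustrates one way a physics-based penalty can be added as a soft constraint, assuming the network predicts density and flow on a regular space–time grid; the discretization of the conservation law, the grid spacing, and the weighting factor lam are illustrative assumptions rather than a formulation taken from the reviewed studies.

```python
import torch
import torch.nn.functional as F

def physics_informed_loss(rho_pred, q_pred, rho_obs, dt, dx, lam=0.1):
    """
    Data loss plus a soft LWR-style conservation penalty (illustrative discretization).
    rho_pred, q_pred: predicted density and flow on a (time, space) grid.
    rho_obs: observed density on the same grid.
    """
    data_loss = F.mse_loss(rho_pred, rho_obs)

    # Discrete residual of the conservation law  d(rho)/dt + d(q)/dx ≈ 0.
    drho_dt = (rho_pred[1:, 1:-1] - rho_pred[:-1, 1:-1]) / dt
    dq_dx = (q_pred[:-1, 2:] - q_pred[:-1, :-2]) / (2 * dx)
    physics_residual = drho_dt + dq_dx

    return data_loss + lam * torch.mean(physics_residual ** 2)

# Toy usage with random tensors standing in for network outputs.
rho_pred = torch.rand(12, 20, requires_grad=True)   # 12 time steps, 20 road cells
q_pred = torch.rand(12, 20, requires_grad=True)
rho_obs = torch.rand(12, 20)
loss = physics_informed_loss(rho_pred, q_pred, rho_obs, dt=30.0, dx=200.0)
loss.backward()
print(float(loss))
```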
Type 3: Pipeline Hybrids. This category involves sequential processing, in which the output of one model serves as the input for the next. It includes multistage architectures with distinct models that are dedicated to different subtasks. A common configuration is as follows: feature extraction is performed using Convolutional Neural Networks (CNNs) or Graph Neural Networks (GNNs), followed by temporal modeling with Long Short-Term Memory (LSTM) networks, and finally, refinement through attention mechanisms. For instance, a CNN can extract spatial patterns, an LSTM can model the temporal evolution, and attention mechanisms can refine the predictions, as sketched below. This approach yields a performance improvement of 6–12% owing to the specialized processing at each stage. Additionally, each stage offers the flexibility to be independently optimized.
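A compact PyTorch sketch of such a pipeline is shown below; the PipelineHybrid class, its layer sizes, and the synthetic inputs are illustrative and only indicate how the CNN, LSTM, and attention stages can be chained.

```python
import torch
import torch.nn as nn

class PipelineHybrid(nn.Module):
    """CNN spatial extractor -> LSTM temporal model -> attention refinement (illustrative)."""
    def __init__(self, n_sensors, hidden=64):
        super().__init__()
        self.spatial = nn.Conv1d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
        self.temporal = nn.LSTM(input_size=8 * n_sensors, hidden_size=hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)          # learned attention over time steps
        self.head = nn.Linear(hidden, n_sensors)  # next-interval speed for every sensor

    def forward(self, x):                          # x: (batch, time, sensors)
        b, t, s = x.shape
        z = self.spatial(x.reshape(b * t, 1, s))   # spatial patterns per time step
        z = z.reshape(b, t, -1)
        h, _ = self.temporal(z)                    # temporal evolution
        w = torch.softmax(self.attn(h), dim=1)     # attention weights over the horizon
        context = (w * h).sum(dim=1)               # attention-refined summary
        return self.head(context)

model = PipelineHybrid(n_sensors=30)
speeds = torch.rand(16, 12, 30)                    # 16 samples, 12 past intervals, 30 sensors
print(model(speeds).shape)                         # torch.Size([16, 30])
```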
Type 4: Multimodal Hybrids. Diverse data modalities are integrated using distinct model architectures:
- A large language model (LLM) processes textual data, including incidents, weather descriptions, and social media content.
- Graph Neural Networks (GNNs) or Long Short-Term Memory (LSTM) networks process sensor time-series data.
- A fusion module combines the multimodal embeddings (see the sketch below).
- An example includes the use of BERT for text processing, Graph Convolutional Networks (GCNs) for spatial–temporal sensor data, and multimodal attention fusion.
- Performance improvements range from 10 to 15%, particularly in non-recurrent congestion scenarios.
- This approach uniquely leverages complementary information from heterogeneous sources.
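The following sketch illustrates the Type 4 pattern under strong simplifications: an EmbeddingBag stands in for a pretrained BERT encoder, a GRU stands in for the GNN/LSTM sensor branch, and a learned gate performs the fusion; all class names, dimensions, and inputs are hypothetical.

```python
import torch
import torch.nn as nn

class MultimodalFusion(nn.Module):
    """Text branch (stand-in for BERT) + sensor branch (GRU) + gated fusion (illustrative)."""
    def __init__(self, vocab=5000, d=64, n_sensors=30):
        super().__init__()
        self.text_emb = nn.EmbeddingBag(vocab, d)              # placeholder for a pretrained LLM encoder
        self.sensor_enc = nn.GRU(n_sensors, d, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * d, d), nn.Sigmoid())
        self.head = nn.Linear(d, n_sensors)

    def forward(self, token_ids, sensor_seq):
        text = self.text_emb(token_ids)                        # incident/weather description embedding
        _, h = self.sensor_enc(sensor_seq)                     # last hidden state of the sensor branch
        sensors = h.squeeze(0)
        g = self.gate(torch.cat([text, sensors], dim=-1))      # learned fusion weights
        fused = g * text + (1 - g) * sensors
        return self.head(fused)

model = MultimodalFusion()
tokens = torch.randint(0, 5000, (16, 20))                      # 16 samples, 20 tokens each
sensors = torch.rand(16, 12, 30)                               # 12 past intervals, 30 detectors
print(model(tokens, sensors).shape)                            # torch.Size([16, 30])
```

Gated fusion is only one learned-fusion option; cross-modal attention, as in the BERT–GCN example above, follows the same principle of letting the model weight each modality rather than averaging them.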
Performance–Cost Trade-offs Across Hybrid Types:
Design Principles for Hybrid Approaches: Our analysis identified three fundamental principles for the effective design of hybrid models.
1. Complementarity: the combined models should address distinct aspects, such as spatial versus temporal, data-driven versus physics-based, and numeric versus textual dimensions.
2. Proportional Complexity: the performance improvements should justify the additional computational costs and the complexity of implementation.
3. Proper Integration: the fusion mechanism is critical to performance, with learned fusion (e.g., attention mechanisms) outperforming simple averaging by 3–5%.
Future hybrid architectures are anticipated to prioritize the efficient integration of multimodal data, with particular emphasis on the combination of large language models (LLMs) and Graph Neural Networks (GNNs) to effectively leverage both textual context and network structure.

7.6. Micro-Scale Analysis: Intersection and Crossing-Level Predictions

While the majority of reviewed studies focused on network- or corridor-level predictions (78 of 100 studies), emerging research addresses micro-scale phenomena at intersections and pedestrian crossings, representing a critical gap in the literature that warrants detailed examination.

7.6.1. Intersection-Level Traffic Modeling

Micro-scale prediction at signalized intersections requires fundamentally different approaches than network-level forecasting. Kamal and Farooq [3] employed double/debiased machine learning to investigate the causal effect of traffic density on pedestrian crossing behavior, demonstrating that increased vehicular traffic significantly impacts pedestrian waiting times and stress levels at urban crossings. Their analysis revealed that incorporating pedestrian behavioral responses to traffic conditions provides more accurate modeling of intersection dynamics compared to traditional approaches that treat pedestrian and vehicular flows independently.
Li et al. [4] proposed a deep reinforcement learning-powered control system for managing mixed traffic involving connected autonomous vehicles (CAVs) and human-driven vehicles (HVs) at signalized intersections. Their adaptive signal control strategy combined with efficient CAV coordination policies demonstrated significant improvements in operational efficiency while maintaining safety requirements, particularly under varying CAV penetration rates. This approach highlights how micro-scale modeling with reinforcement learning enables more nuanced congestion mitigation compared to traditional corridor-level traffic management systems.

7.6.2. Mixed-Traffic Micro-Environments

Mixed-traffic scenarios, in which vehicles, pedestrians, and cyclists interact, pose unique forecasting challenges. Wang et al. [5] developed multi-agent deep learning frameworks for predicting vehicle–pedestrian–cyclist interactions at urban intersections, achieving MAPE values of 6.8% for 5 min prediction horizons. The key findings are as follows:
  • Agent-based modeling shows 12–18% accuracy improvements over aggregate approaches.
  • Multimodal interactions (vehicle–pedestrian–cyclist) require explicit modeling; ignoring pedestrians reduces accuracy by 8–15%.
  • Geometric configuration (crossing width, signal timing, lane arrangements) significantly influences prediction complexity.
  • Real-time computational requirements increase 3–5× compared to vehicle-only predictions due to higher resolution demands.

7.6.3. Data and Methodological Requirements

Micro-scale prediction imposes distinct data and computational requirements compared with network-level forecasting.
Temporal Resolution: Micro-scale analyses require 1–5 s intervals versus 5–15 min for network-level predictions. This 60–180× increase in data granularity presents substantial storage and computational challenges.
Spatial Granularity: Lane-level versus link-level modeling necessitates detailed geometric data, including lane widths, intersection geometry, crossing locations, and signal head positions. Only 12 of the 100 studies reviewed incorporated this level of detail.
Sensor Infrastructure: Successful micro-scale prediction typically requires the following:
  • High-resolution video analytics or LiDAR for pedestrian/cyclist detection.
  • Lane-level inductive loop detectors or equivalent.
  • Signal phase and timing (SPaT) data integration.
  • Weather sensors for visibility and surface condition monitoring.
Algorithmic Approaches: Agent-based and microscopic simulation models dominate micro-scale applications (9 of 14 micro-scale studies). Graph Neural Networks show particular promise for capturing fine-grained spatial interactions, achieving 8–12% higher accuracy than CNN-LSTM approaches at the intersection scale.
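To indicate how graph-based models capture such fine-grained spatial interactions, the sketch below implements a single symmetrically normalized graph-convolution step in plain PyTorch over a toy intersection graph; the node set, adjacency matrix, and feature choices are hypothetical.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One graph-convolution step over an intersection graph (illustrative, plain PyTorch)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2.
        a_hat = adj + torch.eye(adj.size(0))
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        return torch.relu(self.lin(d_inv_sqrt @ a_hat @ d_inv_sqrt @ x))

# Toy intersection graph: 4 approach lanes + 2 pedestrian crossings as nodes, edges for conflicts.
adj = torch.tensor([[0, 1, 0, 0, 1, 0],
                    [1, 0, 1, 0, 0, 1],
                    [0, 1, 0, 1, 1, 0],
                    [0, 0, 1, 0, 0, 1],
                    [1, 0, 1, 0, 0, 0],
                    [0, 1, 0, 1, 0, 0]], dtype=torch.float)
features = torch.rand(6, 4)          # e.g., occupancy, queue length, signal phase, pedestrian count
layer = SimpleGCNLayer(4, 16)
print(layer(features, adj).shape)    # torch.Size([6, 16])
```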

7.6.4. Research Gaps and Opportunities

The severe underrepresentation of micro-scale studies (14%) reveals critical research opportunities:
1. Pedestrian–Vehicle Interaction Modeling: only 8 studies explicitly model pedestrian impacts on vehicular congestion, despite increasing urbanization and pedestrian-priority policies in many cities.
2. Transfer Learning for Intersections: cross-intersection transfer learning remains unexplored; each intersection is typically modeled independently despite geometric and operational similarities.
3. Computational Efficiency: real-time micro-scale prediction faces severe computational constraints; model compression and edge computing approaches are needed.
4. Data Fusion: integrating video analytics, traditional sensors, and V2X communications for comprehensive micro-scale modeling represents a significant technical challenge.
5. Multimodal Equity: most studies focus on vehicle throughput optimization; pedestrian and cyclist delay minimization remains underexplored, raising equity concerns.
This micro-scale research gap represents a significant opportunity for advancing traffic congestion forecasting, particularly as cities worldwide implement pedestrian-priority and complete street policies that fundamentally alter the dynamics of intersections.

7.7. Environmental and Sustainability Considerations

Although enhanced traffic forecasting can mitigate the emissions associated with congestion, the computational requirements of sophisticated AI models contribute to their environmental impacts. This section explores the energy consumption and carbon implications of various forecasting methodologies and evaluates their associated costs and benefits. We estimated the energy consumption across methodological categories based on the reported hardware specifications, training times, and inference requirements from the reviewed studies (Table 23):
To contextualize these numbers, we compared them with congestion-related emissions. The emission benefits from congestion reduction are as follows:
  • Average US city (1M residents): 500,000 tons CO2e/year from traffic congestion
  • Studies show 8–22% congestion reduction with optimized forecasting systems
  • Potential savings: 40,000–110,000 tons CO2e/year
Net Environmental Impact (Table 24):
Although advanced AI models require substantial computational resources, their environmental impact remains minimal (<0.05%) compared to the benefits derived from reducing traffic congestion. Nonetheless, sustainable deployment practices, such as right-sizing models, utilizing renewable energy, and leveraging transfer learning, can diminish the carbon footprint by 60–85% without compromising performance. Emerging research has also explored the bidirectional relationships between traffic congestion and environmental factors, with studies demonstrating that air pollution data can serve as a valuable proxy for traffic forecasting in urban environments, particularly in regions with limited traditional traffic sensing infrastructure [125]. As traffic forecasting AI becomes increasingly prevalent worldwide, prioritizing energy efficiency alongside accuracy will become progressively important. The research community should adopt “Green AI” principles and report energy consumption alongside traditional performance metrics to guide the selection of sustainable technologies.
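The savings figures above follow directly from the quoted reduction range, as the short back-of-the-envelope calculation below shows; the 0.05% line simply converts the reported impact ceiling into an equivalent compute footprint and is illustrative only.

```python
# Back-of-the-envelope check of the figures quoted above (illustrative only).
city_congestion_emissions = 500_000        # tons CO2e/year for a ~1M-resident US city
reduction_range = (0.08, 0.22)             # reported congestion reduction from optimized forecasting

low, high = (city_congestion_emissions * r for r in reduction_range)
print(f"Potential savings: {low:,.0f}-{high:,.0f} tons CO2e/year")

# A compute footprint staying below 0.05% of even the low-end savings:
print(f"Footprint ceiling at 0.05%: {0.0005 * low:,.1f} tons CO2e/year")
```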

7.8. Policy Implications and Real-World Case Studies

Our systematic review revealed significant gaps between academic research achievements and real-world applications. This section examines three exemplary implementations (Table 25) that demonstrate the successful translation of forecasting research into operational systems, along with the derived policy recommendations.

7.8.1. Synthesized Policy Recommendations

Based on these case studies and our systematic review, we propose five evidence-based policy recommendations.
1. Tiered Implementation Strategy
  • Tier 1 (Foundational): Traditional ML for baseline systems on low-traffic corridors (75–85% accuracy, low cost, rapid deployment).
  • Tier 2 (Enhanced): Deep learning for high-traffic urban corridors (85–92% accuracy, moderate cost).
  • Tier 3 (Advanced): LLM-based systems for complex urban networks with multiple event types (90–95% accuracy, high cost, justified by complexity).
2. Data Infrastructure Investment Priorities
  • Minimum 5 min temporal resolution required for effective predictions.
  • Prioritize sensor density and coverage over sensor sophistication.
  • Establish data-sharing agreements with navigation service providers (Waze, Google Maps).
  • Implement standardized data formats to enable model portability.
3. Public-Private Partnership Framework
  • Industry possesses deployment expertise and operational knowledge.
  • Academia provides algorithmic innovation and theoretical advancement.
  • Current 3% industry-academia collaboration rate is inadequate.
  • Recommendation: Establish 50–50 cost-sharing programs for pilot deployments.
  • Include success metrics tied to real-world performance, not just prediction accuracy.
4. Regulatory and Performance Standards
  • Establish minimum accuracy thresholds: 85% for operational systems, 90% for safety-critical applications.
  • Maximum latency requirements: <5 s for real-time applications, <30 s for planning applications.
  • Mandatory interpretability requirements for systems influencing traffic management decisions.
  • Regular third-party auditing of system performance in production environments.
5. Equity and Accessibility Mandates
  • In total, 78% of reviewed studies focus on major metropolitan areas.
  • Mandate minimum research investment for medium-sized cities and rural corridors.
  • Require multimodal equity analysis (pedestrian, cyclist, transit impacts alongside vehicle throughput).
  • Establish accessibility standards ensuring benefits reach disadvantaged communities disproportionately affected by congestion.
  • Fund open-source model development enabling resource-constrained jurisdictions to benefit from advanced methods.

7.8.2. Implementation Barriers and Mitigation Strategies

Real-world implementation encounters challenges that are not typically present in research settings, such as organizational resistance, as transportation agencies often lack expertise in artificial intelligence (AI) and machine learning (ML). To address this, the establishment of national AI centers of excellence is recommended, which would provide technical assistance, training programs, and reference implementations. Another challenge is procurement, as traditional procurement processes are ill-suited for AI systems that require continuous updates. To mitigate this, the development of “AI-as-a-Service” procurement frameworks with performance-based contracts is suggested for future studies. Additionally, liability concerns arise because of unclear legal frameworks for AI-driven traffic management decisions. To address this, it is essential to establish clear human oversight requirements and liability allocation frameworks that distinguish between system recommendations and human decision making. Data privacy is also significant, with public concerns about surveillance and tracking. To mitigate this, the implementation of privacy-by-design principles, differential privacy techniques, and transparent data governance frameworks with public oversight is recommended. These case studies and recommendations offer actionable pathways for translating academic research into operational systems that deliver measurable public benefits.

7.9. Recommendations and Future Directions

Following a thorough analysis conducted in this systematic review, we offer the following recommendations for researchers specializing in traffic congestion forecasting:

7.9.1. Methodological Recommendations

Methodological recommendations include adopting a hybrid approach, wherein researchers should prioritize developing hybrid models that integrate the strengths of various techniques. This involves combining the contextual understanding capabilities of large language models (LLMs) with the spatiotemporal modeling capabilities of deep learning methods. Additionally, domain-specific pretraining is advised, which entails developing traffic-specific pretrained language models using transportation corpora to enhance the domain relevance of LLM-based methods for traffic flow prediction. Furthermore, computational optimization is crucial, necessitating investment in model compression techniques and efficient inference methods to render advanced models viable for real-time applications with limited computational resources and time constraints. The integration of explainable AI (XAI) is also recommended, incorporating explainability techniques into complex models to strengthen stakeholder trust and system validation capabilities. Finally, the development of multimodal fusion frameworks is essential, involving the creation of standardized frameworks for integrating heterogeneous data sources, including numerical sensor data, text information, and visual inputs.
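As one hedged example of the model compression route mentioned above, the sketch below blends a ground-truth regression loss with a teacher-imitation term (teacher–student distillation); the loss weighting, tensor shapes, and the assumption that the teacher is a heavyweight (e.g., LLM-based) forecaster are illustrative and not drawn from the reviewed studies.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_pred, teacher_pred, target, alpha=0.5):
    """
    Blend the ground-truth regression loss with a term that pulls the compact
    student model toward the larger teacher's forecasts (illustrative).
    """
    hard = F.mse_loss(student_pred, target)          # fit the observed traffic
    soft = F.mse_loss(student_pred, teacher_pred)    # mimic the heavyweight model
    return alpha * hard + (1 - alpha) * soft

teacher_pred = torch.rand(32, 30)                    # forecasts from a large teacher model
student_pred = torch.rand(32, 30, requires_grad=True)
target = torch.rand(32, 30)
loss = distillation_loss(student_pred, teacher_pred, target)
loss.backward()
print(float(loss))
```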

7.9.2. Data-Related Recommendations

Five critical research directions must be prioritized to enhance the capabilities of traffic congestion forecasting. First, comprehensive benchmark datasets encompassing diverse traffic scenarios, geographical variations, and challenging cases should be established. Such standardized datasets will facilitate equitable model comparisons and accelerate methodological advancements. Second, the development of traffic-specific data augmentation techniques requires focused attention to address the persistent challenges of sparse or geographically limited training data sets. These specialized augmentation approaches should maintain the unique spatiotemporal characteristics of traffic patterns while enhancing model generalizability. Third, immediate attention must be paid to privacy-preserving techniques for AI models to protect sensitive traveler data. The implementation of federated learning and differential privacy techniques would enable valuable data sharing across jurisdictions while safeguarding sensitive information, which is critical for the widespread adoption of these techniques. Fourth, the field would benefit significantly from the adoption of standardized evaluation metrics. Establishing a consistent measurement framework would enable more meaningful comparisons between modeling approaches and provide clearer evidence of genuine methodological advancement. Finally, a transition from simulation-based validation to real-world deployment is required. Prioritizing operational implementation reveals practical challenges that are not apparent in controlled environments and generates crucial insights into model refinement and its practical utility. Collectively, these research priorities address the fundamental challenges that currently limit the practical application of advanced traffic forecasting methods.
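A minimal sketch of the federated learning idea mentioned above is given below: each jurisdiction trains locally and only model parameters are aggregated; the FedAvg-style weighted averaging, the toy linear model, and the per-city weights are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn

def federated_average(global_model, city_models, weights):
    """FedAvg-style aggregation: weighted mean of per-jurisdiction model parameters (sketch)."""
    avg_state = copy.deepcopy(global_model.state_dict())
    for key in avg_state:
        avg_state[key] = sum(w * m.state_dict()[key] for w, m in zip(weights, city_models))
    global_model.load_state_dict(avg_state)
    return global_model

# Three cities train locally on data that never leaves their servers, then share only weights.
make_model = lambda: nn.Linear(10, 1)
global_model = make_model()
city_models = [make_model() for _ in range(3)]
weights = [0.5, 0.3, 0.2]                     # e.g., proportional to local sample counts
federated_average(global_model, city_models, weights)
print(global_model.weight.shape)
```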

7.9.3. Application-Specific Recommendations

Five strategic implementation approaches can substantially enhance the practical utility of traffic congestion forecasting systems. Resource-aware model selection is a critical initial step that requires practitioners to align model complexity with available computational resources and specific application requirements. This balanced approach acknowledges the inherent trade-offs between predictive accuracy and operational efficiency, thereby ensuring sustainable deployment across diverse infrastructure environments. Transfer learning is a promising solution for regions with limited historical data. By leveraging knowledge structures from data-rich environments, advanced models can attain reasonable performance even in areas lacking extensive training datasets, effectively democratizing access to sophisticated forecasting capabilities. Multi-horizon prediction requires dedicated attention through ensemble methodologies that integrate specialized models optimized for various time ranges. Such composite approaches provide comprehensive forecasting capabilities across immediate, short-term, and extended time horizons, addressing the diverse planning needs of traffic management authorities. Uncertainty quantification must be embedded within predictive frameworks to support robust decision-making. By providing confidence intervals alongside point predictions, these systems enable traffic managers to appropriately weigh forecasts in their operational decisions, particularly in unusual or rapidly evolving traffic conditions. Finally, integrated system development represents the ultimate implementation goal of combining accurate congestion forecasting with actionable and effective strategies for traffic management. These end-to-end solutions translate predictive insights into practical interventions, thereby maximizing the real-world impact of traffic congestion forecasting. Collectively, these implementation strategies offer a pragmatic roadmap for transitioning advanced traffic-forecasting methods from research environments to operational systems.
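The freeze-and-fine-tune sketch below illustrates the transfer learning strategy described above: an encoder assumed to be pretrained in a data-rich city is frozen, and only a small forecasting head is retrained on a limited target-city sample; the architecture, sample sizes, and training loop are illustrative.

```python
import torch
import torch.nn as nn

# A model pretrained on a data-rich city: shared encoder + city-specific forecasting head.
encoder = nn.Sequential(nn.Linear(30, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
head = nn.Linear(64, 30)
source_model = nn.Sequential(encoder, head)
# ... assume source_model has been trained on the source city's sensor archive ...

# Transfer to a data-scarce city: freeze the encoder, retrain only a new head.
for p in encoder.parameters():
    p.requires_grad = False
new_head = nn.Linear(64, 30)
target_model = nn.Sequential(encoder, new_head)

optimizer = torch.optim.Adam(new_head.parameters(), lr=1e-3)
x_target, y_target = torch.rand(64, 30), torch.rand(64, 30)   # small target-city sample
for _ in range(5):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(target_model(x_target), y_target)
    loss.backward()
    optimizer.step()
print(float(loss))
```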

7.10. Limitations of This Review

This systematic review had several limitations that should be considered when interpreting our findings. Regarding publication bias, our inclusion criteria, which focused solely on peer-reviewed articles, may have excluded significant industry implementation and grey literature. The 87% concentration in Q1-ranked journals may disproportionately represent successful results while underrepresenting negative findings or failed approaches commonly encountered in practice. Regarding language limitations, the English-only search strategy may overlook relevant non-English publications, particularly from the Chinese, Japanese, and Korean research communities, where substantial research on intelligent transportation systems has been conducted. This may have introduced geographic and methodological biases in our study. In terms of temporal currency, the knowledge cutoff in January 2024 implies that rapidly evolving LLM applications from late 2024 may be underrepresented. Given the typical 6–12 month publication lag in the field, the most recent innovations were not included in this analysis. Regarding metric heterogeneity, despite our harmonization efforts, converting between different metric types (e.g., RMSE to MAPE, classification to regression) introduced approximation errors of 10–15% in some cases. The reported performance ranges should be interpreted considering this uncertainty. Regarding reproducibility, many studies lack sufficient implementation details (such as hyperparameters, training procedures, and hardware specifications) for full replication. Our computational cost estimates rely on typical configurations and may not reflect all deployment scenarios or recent hardware optimizations. Regarding geographic bias, with 73% of studies originating from Asia and North America, generalizability to European, Latin American, African, and Middle Eastern contexts—each with distinct traffic patterns, infrastructure characteristics, and regulatory environments—remains limited. From a practitioner’s perspective, limited industry collaboration (3%) means that this review reflects academic research priorities rather than the operational deployment challenges faced by transportation agencies in the field. The gap between the theoretical performance and real-world implementation may be larger than that suggested by this analysis. In terms of methodological evolution, the rapid pace of methodological innovation, particularly in LLM applications, means that techniques may become obsolete before publication. Comparative conclusions drawn from papers spanning 2014–2024 may not accurately reflect the current capabilities of earlier approaches that have been optimized. Despite these limitations, this review provides the most comprehensive synthesis to date of traffic congestion forecasting methodologies spanning three technological eras, offering valuable insights for researchers and practitioners in the intelligent transportation system domain.

8. Conclusions

This systematic literature review encompasses 100 peer-reviewed publications from 2014 to 2024 and offers a comprehensive analysis of the evolution of traffic congestion forecasting, tracing its development from traditional machine learning to deep learning and the integration of large language models (LLMs). Our principal findings reveal a distinct technological progression: traditional machine learning, with an accuracy range of 75–85%, laid the foundational groundwork; deep learning, achieving 85–92% accuracy, effectively captured spatial–temporal dependencies; and LLM-based approaches, with an accuracy of 90–95%, facilitated multimodal integration and contextual understanding, albeit with a 50–100× increase in computational cost.
Our quantitative analysis across the three technological eras demonstrates substantial methodological advances. Traditional machine learning methods (2014–2017) achieved MAPE values typically ranging from 12 to 18% for short-term predictions, with Support Vector Machines and Random Forests representing the dominant approaches. The deep learning revolution (2018–2020) reduced the MAPE to 6–10% through the adoption of LSTM, GRU, and CNN architectures, with Graph Neural Networks emerging as particularly effective for spatial modeling. Notably, the T-GCN model demonstrated RMSE improvements of 15–20% compared to traditional RNN approaches, while attention mechanisms provided additional 5–10% accuracy enhancements. The LLM integration phase (2021–2024) further enhanced performance, particularly excelling in non-recurrent congestion scenarios and multimodal data integration, achieving MAPE values as low as 4–7% in optimal conditions.
Graph Neural Networks have emerged as the prevailing framework for spatial modeling, incorporated in 62% of deep learning and LLM-era studies. The dominance of GCN-based architectures reflects their superior ability to capture network-wide dependencies and complex spatial relationships inherent in road networks. The PeMS dataset family dominated empirical evaluations, appearing in 58% of the studies, followed by METR-LA (23%) and proprietary datasets (19%). Studies utilizing multi-source data demonstrated 8–15% accuracy improvements compared to sensor-only approaches, validating the importance of contextual data integration for prediction performance.
Empirical evidence suggests that hybrid methodologies, which integrate large language models (LLMs) with specialized deep learning architectures, achieve superior performance, with accuracy rates ranging from 92% to 96%. These approaches also offer practical flexibility in deployment, combining the efficiency of traditional methods with the contextual capabilities of LLMs. Nonetheless, the trade-offs between performance and cost remain significant: while LLM-based systems yield an accuracy improvement of 10% to 15%, they necessitate considerably greater computational resources than traditional systems. Deep learning approaches occupy an intermediate position at 10–25× the traditional ML costs, offering an advantageous balance between accuracy improvements and resource requirements. The feasibility of LLM deployment is highly contingent on specific application contexts, with real-time highway management favoring lightweight models and strategic urban planning accommodating more sophisticated architectures.
Prediction horizon analysis revealed distinct performance characteristics for each methodology. Short-term predictions (0–30 min) achieved the highest accuracy across all approaches, with deep learning maintaining MAPE values below 8%. Medium-term predictions (30–120 min) showed increasing accuracy divergence, with LLM approaches outperforming traditional methods by 12–18% through superior contextual understanding. Long-term predictions exceeding 2 h remained challenging for all methodologies, though LLMs demonstrated enhanced robustness in scenarios involving scheduled events or predictable traffic pattern disruptions.
Our systematic analysis identified several critical gaps that require further attention. The heterogeneity of the datasets compromises both comparability and reproducibility, with only 42% of the reviewed studies providing sufficient implementation details for replication. The micro-scale dynamics at intersections and pedestrian crossings remain insufficiently explored, with only 22% of studies addressing these critical urban mobility components despite their substantial impact on congestion formation. The sustainability of computational processes and their environmental impacts require greater consideration, particularly given the 50–100× increase in energy consumption associated with LLM implementations. Furthermore, there is a lack of practical implementation guidance for diverse operational contexts, with limited evidence of real-world deployment experiences and the effectiveness of transfer learning across different geographic settings. Models trained in one city demonstrate substantial performance degradation (15–25% accuracy drops) when applied to different contexts, indicating persistent challenges in generalizability.
This review provides practitioners with evidence-based frameworks for model selection, considering operational constraints, such as computational budget, latency requirements, data availability, and interpretability needs. It also considers infrastructure contexts, including urban networks, highways, and mixed environments, as well as deployment scenarios ranging from short-term tactical and medium-term strategic to long-term planning. Traditional machine learning remains optimal for deployments with limited resources and relaxed accuracy requirements, achieving 82–88% accuracy at minimal computational cost. Deep learning offers a balanced approach for most applications, providing 85–92% accuracy with moderate resource demands. In contrast, LLM-based and hybrid approaches are most appropriate for high-accuracy applications requiring sophisticated contextual understanding and multimodal integration, particularly in scenarios involving non-recurrent congestion or event-driven traffic disruptions.
Future research should prioritize these directions. Transfer learning frameworks must be developed to address cross-city generalization challenges, enabling knowledge transfer from data-rich to data-limited environments. Real-time processing capabilities require optimization through model distillation, pruning, and edge computing architectures to meet sub-second inference requirements for traffic-management applications. Explainability techniques must be integrated into high-performance models to meet regulatory transparency requirements and operational acceptance criteria. Robust prediction frameworks that maintain performance under missing or noisy sensor data conditions merit continued investigation, as 66% of the reviewed studies cited data quality as a significant limitation of their models. The integration of micro-scale predictions for intersection and pedestrian crossing dynamics represents an underexplored research direction with substantial practical relevance for comprehensive traffic management systems. Additionally, the explicit integration of sustainable mobility alternatives, including cycling and micro-mobility options, into traffic prediction frameworks warrants increased attention, recognizing that modal shifts substantially influence congestion patterns and prediction requirements.
The progression from traditional machine learning to large language models in traffic congestion forecasting represents a transformative technological evolution that offers substantial potential for the advancement of intelligent transportation systems. Although LLMs demonstrate superior performance in terms of accuracy, contextual understanding, and multimodal integration, practical deployment necessitates careful consideration of computational constraints, real-time requirements, and application-specific needs. This systematic review provides researchers and practitioners with comprehensive guidance for navigating these trade-offs, ultimately contributing to the development of more efficient, sustainable, and intelligent urban mobility solutions. The evidence-based insights derived from 100 carefully selected studies establish a robust foundation for understanding current capabilities, identifying critical implementation challenges, and charting strategic directions for future research in this rapidly evolving field of study.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/vehicles7040142/s1.

Author Contributions

Conceptualization, M.A. and M.L.; methodology, M.A. and M.L.; formal analysis, M.A.; investigation, M.A.; data curation, M.A.; writing—original draft preparation, M.A.; writing—review and editing, M.A. and M.L.; visualization, M.A.; supervision, M.L.; project administration, M.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable for this systematic literature review study.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting this systematic review are available from the corresponding author upon reasonable request. Complete search strategies, screening decisions, data extraction spreadsheets, quality assessment criteria, and analysis code are available upon request to mehdi.attioui@enscasa.ma.

Acknowledgments

The authors acknowledge the anonymous reviewers whose constructive feedback significantly improved the quality and clarity of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial Intelligence
ANN: Artificial Neural Network
ARIMA: Autoregressive Integrated Moving Average
BERT: Bidirectional Encoder Representations from Transformers
CNN: Convolutional Neural Network
DL: Deep Learning
GCN: Graph Convolutional Network
GNN: Graph Neural Network
GPT: Generative Pretrained Transformer
GRU: Gated Recurrent Unit
ITS: Intelligent Transportation Systems
K-NN: K-Nearest Neighbors
LLM: Large Language Model
LSTM: Long Short-Term Memory
MAE: Mean Absolute Error
MAPE: Mean Absolute Percentage Error
METR-LA: Los Angeles Metropolitan Traffic
ML: Machine Learning
PeMS: California Performance Measurement System
PICO: Population, Intervention, Comparison, Outcome
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RF: Random Forest
RMSE: Root Mean Square Error
RNN: Recurrent Neural Network
SVM: Support Vector Machine
XGBoost: Extreme Gradient Boosting

References

  1. Pishue, B.; Kidd, J. 2024 INRIX Global Traffic Scorecard; Technical Report; INRIX: Kirkland, WA, USA, 2025. [Google Scholar]
  2. Macioszek, E.; Jurdana, I. Bicycle Traffic in the Cities. Sci. J. Silesian Univ. Technol. Ser. Transp. 2022, 117, 115–127. [Google Scholar] [CrossRef]
  3. Kamal, K.; Farooq, B. Debiased Machine Learning for Estimating the Causal Effect of Urban Traffic on Pedestrian Crossing Behavior. Transp. Res. Rec. 2023, 2677, 748–767. [Google Scholar] [CrossRef]
  4. Li, D.; Zhu, F.; Wu, J.; Wong, Y.D.; Chen, T. Managing Mixed Traffic at Signalized Intersections: An Adaptive Signal Control and CAV Coordination System Based on Deep Reinforcement Learning. Expert Syst. Appl. 2024, 238, 121959. [Google Scholar] [CrossRef]
  5. Wang, J.; Chen, L.; Zhang, H. Multi-Agent Deep Learning for Mixed-Traffic Microscale Prediction in Urban Environments. Transp. Res. Part B 2022, 156, 228–247. [Google Scholar]
  6. Medina-Salgado, B.; Sánchez-DelaCruz, E.; Pozos-Parra, P.; Sierra, J.E. Urban Traffic Flow Prediction Techniques: A Review. Sustain. Comput. Inform. Syst. 2022, 35, 100739. [Google Scholar] [CrossRef]
  7. Shaygan, M.; Meese, C.; Li, W.; Zhao, X.; Nejad, M. Traffic Prediction Using Artificial Intelligence: Review of Recent Advances and Emerging Opportunities. Transp. Res. Part C Emerg. Technol. 2022, 145, 103921. [Google Scholar] [CrossRef]
  8. Sayed, S.A.; Abdel-Hamid, Y.; Hefny, H.A. Artificial Intelligence-Based Traffic Flow Prediction: A Comprehensive Review. J. Electr. Syst. Inf. Technol. 2023, 10, 13. [Google Scholar] [CrossRef]
  9. Attioui, M.; Lahby, M. Congestion Forecasting Using Machine Learning Techniques: A Systematic Review. Future Transp. 2025, 5, 76. [Google Scholar] [CrossRef]
  10. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef] [PubMed]
  11. Tayan, O.; Al BinAli, A.M.; Kabir, M.N. Analytical and Computer Modeling of Transportation Systems for Traffic Bottleneck Resolution: A Hajj Case Study. Arab. J. Sci. Eng. 2014, 39, 7013–7037. [Google Scholar] [CrossRef]
  12. Odat, E.; Shamma, J.S.; Claudel, C. Vehicle Classification and Speed Estimation Using Combined Passive Infrared/Ultrasonic Sensors. IEEE Trans. Intell. Transp. Syst. 2018, 19, 1593–1606. [Google Scholar] [CrossRef]
  13. Huang, W.; Song, G.; Hong, H.; Xie, K. Deep Architecture for Traffic Flow Prediction: Deep Belief Networks With Multitask Learning. IEEE Trans. Intell. Transp. Syst. 2014, 15, 2191–2201. [Google Scholar] [CrossRef]
  14. Chen, P.; Chen, F.; Qian, Z. Road Traffic Congestion Monitoring in Social Media with Hinge-Loss Markov Random Fields. In Proceedings of the 2014 IEEE International Conference on Data Mining, Shenzhen, China, 14–17 December 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 80–89. [Google Scholar]
  15. Lin, G.; Lin, A.; Gu, D. Using Support Vector Regression and K-nearest Neighbors for Short-Term Traffic Flow Prediction Based on Maximal Information Coefficient. Inf. Sci. 2022, 608, 517–531. [Google Scholar] [CrossRef]
  16. Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Deng, M.; Li, H. T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. IEEE Trans. Intell. Transp. Syst. 2020, 21, 3848–3858. [Google Scholar] [CrossRef]
  17. Cao, M.; Li, V.O.K.; Chan, V.W.S. A CNN-LSTM Model for Traffic Speed Prediction. In Proceedings of the 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), Antwerp, Belgium, 25–28 May 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–5. [Google Scholar]
  18. Jiang, M.; Chen, W.; Li, X. S-GCN-GRU-NN: A Novel Hybrid Model by Combining a Spatiotemporal Graph Convolutional Network and a Gated Recurrent Units Neural Network for Short-Term Traffic Speed Forecasting. J. Data Inf. Manag. 2021, 3, 1–20. [Google Scholar] [CrossRef]
  19. Reza, S.; Ferreira, M.C.; Machado, J.J.M.; Tavares, J.M.R.S. A Multi-Head Attention-Based Transformer Model for Traffic Flow Forecasting with a Comparative Analysis to Recurrent Neural Networks. Expert Syst. Appl. 2022, 202, 117275. [Google Scholar] [CrossRef]
  20. Zhang, H.; Zou, Y.; Yang, X.; Yang, H. A Temporal Fusion Transformer for Short-Term Freeway Traffic Speed Multistep Prediction. Neurocomputing 2022, 500, 329–340. [Google Scholar] [CrossRef]
  21. Jin, K.; Wi, J.; Lee, E.; Kang, S.; Kim, S.; Kim, Y. TrafficBERT: Pretrained Model with Large-Scale Data for Long-Range Traffic Flow Forecasting. Expert Syst. Appl. 2021, 186, 115738. [Google Scholar] [CrossRef]
  22. Jia, F.; Wang, K.; Zheng, Y.; Cao, D.; Liu, Y. GPT4MTS: Prompt-based Large Language Model for Multimodal Time-series Forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 26–27 February 2024; AAAI: Washington, DC, USA, 2024. [Google Scholar]
  23. Liu, C.; Yang, S.; Xu, Q.; Li, Z.; Long, C.; Li, Z.; Zhao, R. Spatial-Temporal Large Language Model for Traffic Prediction. In Proceedings of the International Conference on Mobile Data Management, Brussels, Belgium, 24–27 June 2024; IEEE: Piscataway, NJ, USA, 2024. [Google Scholar]
  24. Su, J.; Jiang, C.; Jin, X.; Qiao, Y.; Xiao, T.; Ma, H.; Wei, R.; Jing, Z.; Xu, J.; Lin, J. Large Language Models for Forecasting and Anomaly Detection: A Systematic Literature Review. arXiv 2024, arXiv:2402.10350. [Google Scholar] [CrossRef]
  25. Liang, Y.; Liu, Y.; Wang, X.; Zhao, Z. Exploring Large Language Models for Human Mobility Prediction under Public Events. Comput. Environ. Urban Syst. 2024, 112, 102153. [Google Scholar] [CrossRef]
  26. Ren, Y.; Chen, Y.; Liu, S.; Wang, B.; Yu, H.; Cui, Z. TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models. arXiv 2024, arXiv:2403.02221. [Google Scholar] [CrossRef]
  27. Modi, Y.; Teli, R.; Mehta, A.; Shah, K.; Shah, M. A Comprehensive Review on Intelligent Traffic Management Using Machine Learning Algorithms. Innov. Infrastruct. Solut. 2021, 7, 128. [Google Scholar] [CrossRef]
  28. Akhtar, M.; Moridpour, S. A Review of Traffic Congestion Prediction Using Artificial Intelligence. J. Adv. Transp. 2021, 2021, e8878011. [Google Scholar] [CrossRef]
  29. Kashyap, A.A.; Raviraj, S.; Devarakonda, A.; Nayak K, S.R.; K V, S.; Bhat, S.J. Traffic Flow Prediction Models—A Review of Deep Learning Techniques. Cogent Eng. 2022, 9, 2010510. [Google Scholar] [CrossRef]
  30. Gomes, B.; Coelho, J.; Aidos, H. A Survey on Traffic Flow Prediction and Classification. Intell. Syst. Appl. 2023, 19, 200268. [Google Scholar] [CrossRef]
  31. Zhang, Z.; Sun, Y.; Wang, Z.; Nie, Y.; Ma, X.; Sun, P.; Li, R. Large Language Models for Mobility in Transportation Systems: A Survey on Forecasting Tasks. arXiv 2024, arXiv:2405.02357. [Google Scholar] [CrossRef]
  32. Peng, M.; Chen, K.; Guo, X.; Zhang, Q.; Lu, H.; Zhong, H.; Chen, D.; Zhu, M.; Yang, H. Diffusion Models for Intelligent Transportation Systems: A Survey. arXiv 2024, arXiv:2409.15816. [Google Scholar] [CrossRef]
  33. Mahmud, D.; Hajmohamed, H.; Almentheri, S.; Alqaydi, S.; Aldhaheri, L.; Khalil, R.A.; Saeed, N. Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions. arXiv 2025, arXiv:2501.04437. [Google Scholar] [CrossRef]
  34. Koesdwiady, A.; Soua, R.; Karray, F. Improving Traffic Flow Prediction With Weather Information in Connected Cars: A Deep Learning Approach. IEEE Trans. Veh. Technol. 2016, 65, 9508–9517. [Google Scholar] [CrossRef]
  35. Essien, A.; Petrounias, I.; Sampaio, P.; Sampaio, S. A Deep-Learning Model for Urban Traffic Flow Prediction with Traffic Events Mined from Twitter. World Wide Web 2021, 24, 1345–1368. [Google Scholar] [CrossRef]
  36. Yang, X.; Bekoulis, G.; Deligiannis, N. Traffic Event Detection as a Slot Filling Problem. Eng. Appl. Artif. Intell. 2023, 123, 106202. [Google Scholar] [CrossRef]
  37. Kitchenham, B. Guidelines for Performing Systematic Literature Reviews in Software Engineering; Technical Report; Keele University and Durham University: Keele, UK, 2007. [Google Scholar]
  38. Kitchenham, B. Procedures for Performing Systematic Reviews; Technical Report; Keele University: Keele, UK, 2004. [Google Scholar]
  39. Wieringa, R.; Maiden, N.; Mead, N.; Rolland, C. Requirements Engineering Paper Classification and Evaluation Criteria: A Proposal and a Discussion. Requir. Eng. 2006, 11, 102–107. [Google Scholar] [CrossRef]
  40. Gaamouche, R.; Chinnici, M.; Lahby, M.; Abakarim, Y.; Hasnaoui, A.E.E. Machine Learning Techniques for Renewable Energy Forecasting: A Comprehensive Review. In Renewable Energy Systems; Springer: Berlin/Heidelberg, Germany, 2022; pp. 3–39. [Google Scholar] [CrossRef]
  41. Lahby, M.; Aqil, S.; Yafooz, W.M.S.; Abakarim, Y. Online Fake News Detection Using Machine Learning Techniques: A Systematic Mapping Study. Stud. Comput. Intell. 2022, 1001, 3–37. [Google Scholar]
  42. Attioui, M.; Lahby, M. Deep Learning-Based Congestion Forecasting: A Literature Review and Future. In Proceedings of the 10th International Conference on Wireless Networks and Mobile Communications, WINCOM 2023, Istanbul, Turkey, 26–28 October 2023; IEEE: Piscataway, NJ, USA, 2023. [Google Scholar]
  43. Aljuaydi, F.; Wiwatanapataphee, B.; Wu, Y.H. Multivariate Machine Learning-Based Prediction Models of Freeway Traffic Flow under Non-recurrent Events. Alex. Eng. J. 2022, 64, 663–673. [Google Scholar] [CrossRef]
  44. Pang, A.; Wang, M.; Pun, M.; Chen, C.S.; Xiong, X. ILLM-TSC: Integration Reinforcement Learning and Large Language Model for Traffic Signal Control Policy Improvement. arXiv 2024, arXiv:2407.06025. [Google Scholar]
  45. Bayoudh, K.; Hamdaoui, F.; Mtibaa, A. Transfer Learning Based Hybrid 2D-3D CNN for Traffic Sign Recognition and Semantic Road Detection Applied in Advanced Driver Assistance Systems. Appl. Intell. 2021, 51, 124–142. [Google Scholar] [CrossRef]
  46. Boukerche, A.; Tao, Y.; Sun, P. Artificial Intelligence-Based Vehicular Traffic Flow Prediction Methods for Supporting Intelligent Transportation Systems. Comput. Netw. 2020, 182, 107484. [Google Scholar] [CrossRef]
  47. Bouyahia, Z.; Haddad, H.; Jabeur, N.; Yasar, A. A Two-Stage Road Traffic Congestion Prediction and Resource Dispatching towards a Self-Organizing Traffic Control System. Pers. Ubiquitous Comput. 2019, 23, 909–920. [Google Scholar] [CrossRef]
  48. Brincat, A.A.; Pacifici, F.; Martinaglia, S.; Mazzola, F. The Internet of Things for Intelligent Transportation Systems in Real Smart Cities Scenarios. In Proceedings of the 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), Limerick, Ireland, 15–18 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 128–132. [Google Scholar]
  49. Cheng, X.; Zhang, R.; Zhou, J.; Xu, W. DeepTransport: Learning Spatial-Temporal Dependency for Traffic Condition Forecasting. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8. [Google Scholar]
  50. Chen, M.; Yu, X.; Liu, Y. PCNN: Deep Convolutional Networks for Short-Term Traffic Congestion Prediction. IEEE Trans. Intell. Transp. Syst. 2018, 19, 3550–3559. [Google Scholar] [CrossRef]
  51. Chen, G.; Zhang, J. Applying Artificial Intelligence and Deep Belief Network to Predict Traffic Congestion Evacuation Performance in Smart Cities. Appl. Soft Comput. 2022, 121, 108692. [Google Scholar] [CrossRef]
  52. Chikaraishi, M.; Garg, P.; Varghese, V.; Yoshizoe, K.; Urata, J.; Shiomi, Y.; Watanabe, R. On the Possibility of Short-Term Traffic Prediction during Disaster with Machine Learning Approaches: An Exploratory Analysis. Transp. Policy 2020, 98, 91–104. [Google Scholar] [CrossRef]
  53. Cui, Z.; Zhao, C. Dual-Stage Attention Based Spatio-Temporal Sequence Learning for Multistep Traffic Prediction. IFAC-PapersOnLine 2020, 53, 17035–17040. [Google Scholar]
  54. Dell’Acqua, P.; Bellotti, F.; Berta, R.; De Gloria, A. Time-Aware Multivariate Nearest Neighbor Regression Methods for Traffic Flow Prediction. IEEE Trans. Intell. Transp. Syst. 2015, 16, 3393–3402. [Google Scholar] [CrossRef]
  55. de Medrano, R.; Aznarte, J.L. A Spatio-Temporal Attention-Based Spot-Forecasting Framework for Urban Traffic Prediction. Appl. Soft Comput. 2020, 96, 106615. [Google Scholar] [CrossRef]
  56. Du, S.; Li, T.; Yang, Y.; Gong, X.; Horng, S. An LSTM Based Encoder-Decoder Model for Multistep Traffic Flow Prediction. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–8. [Google Scholar]
  57. Eldowa, D.; Elgazzar, K.; Hassanein, H.S.; Sharaf, T.; Shah, S. Assessing the Integrity of Traffic Data through Short Term State Prediction. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Big Island, HI, USA, 9–13 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–5. [Google Scholar]
  58. Feng, S.; Wei, S.; Zhang, J.; Li, Y.; Ke, J.; Chen, G.; Zheng, Y.; Yang, H. A Macro-Micro Spatio-Temporal Neural Network for Traffic Prediction. Transp. Res. Part C Emerg. Technol. 2023, 156, 104331. [Google Scholar] [CrossRef]
  59. Fiorini, S.; Pilotti, G.; Ciavotta, M.; Maurino, A. 3D-CLoST: A CNN-LSTM Approach for Mobility Dynamics Prediction in Smart Cities. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 3180–3189. [Google Scholar]
  60. Govindan, K.; Ramalingam, S.; Broumi, S. Traffic Volume Prediction Using Intuitionistic Fuzzy Grey-Markov Model. Neural Comput. Appl. 2021, 33, 12905–12920. [Google Scholar] [CrossRef]
  61. Guo, J.; Liu, Y.; Wang, Y.; Fang, S. GPS-based Citywide Traffic Congestion Forecasting Using CNN-RNN and C3D Hybrid Model. Transp. A Transp. Sci. 2021, 17, 190–211. [Google Scholar] [CrossRef]
  62. Guerreiro, G.; Figueiras, P.; Silva, R.; Costa, R.; Jardim-Goncalves, R. An Architecture for Big Data Processing on Intelligent Transportation Systems. An Application Scenario on Highway Traffic Flows. In Proceedings of the 2016 IEEE 8th International Conference on Intelligent Systems (IS), Sofia, Bulgaria, 4–6 September 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 65–72. [Google Scholar]
  63. Guo, X.; Zhang, Q.; Jiang, J.; Peng, M.; Zhu, M.; Yang, H.F. Towards Explainable Traffic Flow Prediction with Large Language Models. Commun. Transp. Res. 2024, 4, 100150. [Google Scholar] [CrossRef]
  64. Harrou, F.; Zeroual, A.; Kadri, F.; Sun, Y. Enhancing Road Traffic Flow Prediction with Improved Deep Learning Using Wavelet Transforms. Results Eng. 2024, 23, 102342. [Google Scholar] [CrossRef]
  65. Hassija, V.; Gupta, V.; Garg, S.; Chamola, V. Traffic Jam Probability Estimation Based on Blockchain and Deep Neural Networks. IEEE Trans. Intell. Transp. Syst. 2021, 22, 3919–3928. [Google Scholar] [CrossRef]
  66. Hou, Z.; Li, X. Repeatability and Similarity of Freeway Traffic Flow and Long-Term Prediction Under Big Data. IEEE Trans. Intell. Transp. Syst. 2016, 17, 1786–1796. [Google Scholar] [CrossRef]
  67. Huang, M. Intersection Traffic Flow Forecasting Based on ν-GSVR with a New Hybrid Evolutionary Algorithm. Neurocomputing 2015, 147, 343–349. [Google Scholar] [CrossRef]
  68. Huang, X.; Ye, Y.; Ding, W.; Yang, X.; Xiong, L. Multi-mode Dynamic Residual Graph Convolution Network for Traffic Flow Prediction. Inf. Sci. 2022, 609, 548–564. [Google Scholar] [CrossRef]
  69. Park, H.; Haghani, A.; Samuel, S.; Knodler, M.A. Real-Time Prediction and Avoidance of Secondary Crashes under Unexpected Traffic Congestion. Accid. Anal. Prev. 2018, 112, 39–49. [Google Scholar] [CrossRef]
  70. Khan, S.; Nazir, S.; García-Magariño, I.; Hussain, A. Deep Learning-Based Urban Big Data Fusion in Smart Cities: Towards Traffic Monitoring and Flow-Preserving Fusion. Comput. Electr. Eng. 2021, 89, 106906. [Google Scholar] [CrossRef]
  71. Li, F.; Gong, J.; Liang, Y.; Zhou, J. Real-Time Congestion Prediction for Urban Arterials Using Adaptive Data-Driven Methods. Multimed. Tools Appl. 2016, 75, 17573–17592. [Google Scholar] [CrossRef]
  72. Li, G.; Knoop, V.L.; van Lint, H. Multistep Traffic Forecasting by Dynamic Graph Convolution: Interpretations of Real-Time Spatial Correlations. Transp. Res. Part C Emerg. Technol. 2021, 128, 103185. [Google Scholar] [CrossRef]
  73. Li, J. Trajectory Prediction Learning Using Deep Generative Models. Master’s Thesis, York University, Toronto, ON, Canada, 2024. [Google Scholar]
  74. Liu, Q.; Liu, T.; Cai, Y.; Xiong, X.; Jiang, H.; Wang, H.; Hu, Z. Explanatory Prediction of Traffic Congestion Propagation Mode: A Self-Attention Based Approach. Phys. A Stat. Mech. Its Appl. 2021, 573, 125940. [Google Scholar] [CrossRef]
  75. Liu, Y.; Wu, C.; Wen, J.; Xiao, X.; Chen, Z. A Grey Convolutional Neural Network Model for Traffic Flow Prediction under Traffic Accidents. Neurocomputing 2022, 500, 761–775. [Google Scholar] [CrossRef]
  76. Liu, Y.; Yu, J.J.Q.; Kang, J.; Niyato, D.; Zhang, S. Privacy-Preserving Traffic Flow Prediction: A Federated Learning Approach. IEEE Internet Things J. 2020, 7, 7751–7763. [Google Scholar] [CrossRef]
  77. Lopes Gerum, P.C.; Benton, A.R.; Baykal-Gürsoy, M. Traffic Density on Corridors Subject to Incidents: Models for Long-Term Congestion Management. EURO J. Transp. Logist. 2019, 8, 795–831. [Google Scholar] [CrossRef]
  78. Lopez-Garcia, P.; Onieva, E.; Osaba, E.; Masegosa, A.D.; Perallos, A. A Hybrid Method for Short-Term Traffic Congestion Forecasting Using Genetic Algorithms and Cross Entropy. IEEE Trans. Intell. Transp. Syst. 2016, 17, 557–569. [Google Scholar]
  79. Majumdar, S.; Subhani, M.M.; Roullier, B.; Anjum, A.; Zhu, R. Congestion Prediction for Smart Sustainable Cities Using IoT and Machine Learning Approaches. Sustain. Cities Soc. 2021, 64, 102500. [Google Scholar] [CrossRef]
  80. Wang, M.; Pang, A.; Kan, Y.; Pun, M.; Chen, C.S.; Huang, B. LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments. arXiv 2024, arXiv:2403.08337. [Google Scholar]
  81. Ma, J.; Zhao, J.; Hou, Y. Spatial-Temporal Transformer Networks for Traffic Flow Forecasting Using a Pretrained Language Model. Sensors 2024, 24, 5502. [Google Scholar] [CrossRef]
  82. Menguc, K.; Aydin, N.; Yilmaz, A. A Data Driven Approach to Forecasting Traffic Speed Classes Using Extreme Gradient Boosting Algorithm and Graph Theory. Phys. A Stat. Mech. Its Appl. 2023, 620, 128738. [Google Scholar]
  83. Nallaperuma, D.; Nawaratne, R.; Bandaragoda, T.; Adikari, A.; Nguyen, S.; Kempitiya, T.; De Silva, D.; Alahakoon, D.; Pothuhera, D. Online Incremental Machine Learning Platform for Big Data-Driven Smart Traffic Management. IEEE Trans. Intell. Transp. Syst. 2019, 20, 4679–4690. [Google Scholar]
  84. Nguyen, N.; Dao, M.; Zettsu, K. Complex Event Analysis for Traffic Risk Prediction Based on 3D-CNN with Multi-sources Urban Sensing Data. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1669–1674. [Google Scholar]
  85. Oh, S.; Kim, Y.; Hong, J. Urban Traffic Flow Prediction System Using Multifactor Pattern Recognition Model. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2744–2755. [Google Scholar]
  86. Zheng, O.; Abdel-Aty, M.; Wang, D.; Wang, Z.; Ding, S. ChatGPT is on the Horizon: Could a Large Language Model be Suitable for Intelligent Traffic Safety Research and Applications? arXiv 2023, arXiv:2303.05382. [Google Scholar] [CrossRef]
  87. Polson, N.G.; Sokolov, V.O. Deep Learning for Short-Term Traffic Flow Prediction. Transp. Res. Part C Emerg. Technol. 2017, 79, 1–17. [Google Scholar] [CrossRef]
  88. Pu, B.; Liu, Y.; Zhu, N.; Li, K.; Li, K. ED-ACNN: Novel Attention Convolutional Neural Network Based on Encoder-Decoder Framework for Human Traffic Prediction. Appl. Soft Comput. 2020, 97, 106688. [Google Scholar]
  89. Qiu, J.; Du, L.; Zhang, D.; Su, S.; Tian, Z. Nei-TTE: Intelligent Traffic Time Estimation Based on Fine-Grained Time Derivation of Road Segments for Smart City. IEEE Trans. Ind. Inform. 2020, 16, 2659–2666. [Google Scholar] [CrossRef]
  90. Roy, K.C.; Hasan, S.; Culotta, A.; Eluru, N. Predicting Traffic Demand during Hurricane Evacuation Using Real-time Data from Transportation Systems and Social Media. Transp. Res. Part C Emerg. Technol. 2021, 131, 103339. [Google Scholar] [CrossRef]
  91. Sengupta, A.; Mondal, S.; Das, A.; Guler, S.I. A Bayesian Approach to Quantifying Uncertainties and Improving Generalizability in Traffic Prediction Models. Transp. Res. Part C Emerg. Technol. 2024, 162, 104585. [Google Scholar] [CrossRef]
  92. Hu, S.; Fang, Z.; Fang, Z.; Deng, Y.; Chen, X.; Fang, Y.; Kwong, S. AgentsCoMerge: Large Language Model Empowered Collaborative Decision Making for Ramp Merging. arXiv 2024, arXiv:2408.03624. [Google Scholar] [CrossRef]
  93. Shang, P.; Liu, X.; Yu, C.; Yan, G.; Xiang, Q.; Mi, X. A New Ensemble Deep Graph Reinforcement Learning Network for Spatio-Temporal Traffic Volume Forecasting in a Freeway Network. Digit. Signal Process. 2022, 123, 103419. [Google Scholar] [CrossRef]
  94. Sharma, B.; Kumar, S.; Tiwari, P.; Yadav, P.; Nezhurina, M.I. ANN Based Short-Term Traffic Flow Forecasting in Undivided Two Lane Highway. J. Big Data 2018, 5, 48. [Google Scholar] [CrossRef]
  95. Sharma, P.; Singh, A.; Singh, K.K.; Dhull, A. Vehicle Identification Using Modified Region Based Convolution Network for Intelligent Transportation System. Multimed. Tools Appl. 2021, 81, 34893–34917. [Google Scholar] [CrossRef]
  96. Shen, G.; Li, P.; Chen, Z.; Yang, Y.; Kong, X. Spatio-Temporal Interactive Graph Convolution Network for Vehicle Trajectory Prediction. Internet Things 2023, 24, 100935. [Google Scholar] [CrossRef]
  97. Zhang, W.; Yu, Y.; Qi, Y.; Shu, F.; Wang, Y. Short-Term Traffic Flow Prediction Based on Spatio-Temporal Analysis and CNN Deep Learning. Transp. A Transp. Sci. 2019, 15, 1688–1711. [Google Scholar]
  98. Soudeep, S.; Lailun Nahar Aurthy, M.; Jim, J.R.; Mridha, M.F.; Kabir, M.M. Enhancing Road Traffic Flow in Sustainable Cities through Transformer Models: Advancements and Challenges. Sustain. Cities Soc. 2024, 116, 105882. [Google Scholar] [CrossRef]
  99. Soua, R.; Koesdwiady, A.; Karray, F. Big-Data-Generated Traffic Flow Prediction Using Deep Learning and Dempster-Shafer Theory. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 3195–3202. [Google Scholar]
  100. Sun, P.; AlJeri, N.; Boukerche, A. A Fast Vehicular Traffic Flow Prediction Scheme Based on Fourier and Wavelet Analysis. In Proceedings of the 2018 IEEE Global Communications Conference (GLOBECOM), Abu Dhabi, United Arab Emirates, 9–13 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6. [Google Scholar]
  101. Sun, Y.; Wang, Y.; Fu, K.; Wang, Z.; Zhang, C.; Ye, J. Constructing Geographic and Long-term Temporal Graph for Traffic Forecasting. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 3483–3490. [Google Scholar]
  102. Tay, L.; Lim, J.M.; Liang, S.; Keong, C.K.; Tay, Y.H. Urban Traffic Volume Estimation Using Intelligent Transportation System Crowdsourced Data. Eng. Appl. Artif. Intell. 2023, 126, 107064. [Google Scholar] [CrossRef]
  103. Tiwari, V.S.; Arya, A. Horizontally Scalable Probabilistic Generalized Suffix Tree (PGST) Based Route Prediction Using Map Data and GPS Traces. J. Big Data 2017, 4, 23. [Google Scholar] [CrossRef]
  104. Tran, T.; He, D.; Kim, J.; Hickman, M. MSGNN: A Multi-structured Graph Neural Network Model for Real-Time Incident Prediction in Large Traffic Networks. Transp. Res. Part C Emerg. Technol. 2023, 156, 104354. [Google Scholar]
  105. Wang, X.; Guan, X.; Cao, J.; Zhang, N.; Wu, H. Forecast Network-Wide Traffic States for Multiple Steps Ahead: A Deep Learning Approach Considering Dynamic Non-Local Spatial Correlation and Nonstationary Temporal Dependency. Transp. Res. Part C Emerg. Technol. 2020, 119, 102763. [Google Scholar]
  106. Wang, J.; Liu, K.; Li, H. LSTM-based Graph Attention Network for Vehicle Trajectory Prediction. Comput. Netw. 2024, 248, 110477. [Google Scholar] [CrossRef]
  107. Wang, K.; Ma, C.; Qiao, Y.; Lu, X.; Hao, W.; Dong, S. A Hybrid Deep Learning Model with 1DCNN-LSTM-Attention Networks for Short-Term Traffic Flow Prediction. Phys. A Stat. Mech. Its Appl. 2021, 583, 126293. [Google Scholar]
  108. Wang, Z.; Sun, P.; Hu, Y.; Boukerche, A. A Novel Hybrid Method for Achieving Accurate and Timeliness Vehicular Traffic Flow Prediction in Road Networks. Comput. Commun. 2023, 209, 378–386. [Google Scholar] [CrossRef]
  109. Wang, B.; Wang, J. ST-MGAT: Spatio-temporal Multi-Head Graph Attention Network for Traffic Prediction. Phys. A Stat. Mech. Its Appl. 2022, 603, 127762. [Google Scholar] [CrossRef]
  110. Wang, X.; Xu, L.; Chen, K. Data-Driven Short-Term Forecasting for Urban Road Network Traffic Based on Data Processing and LSTM-RNN. Arab. J. Sci. Eng. 2019, 44, 3043–3060. [Google Scholar]
  111. Xiong, L.; Su, L.; Zeng, S.; Li, X.; Wang, T.; Zhao, F. Generalized Spatial-Temporal Regression Graph Convolutional Transformer for Traffic Forecasting. Complex Intell. Syst. 2024, 10, 7943–7964. [Google Scholar] [CrossRef]
  112. Xu, J.; Deng, D.; Demiryurek, U.; Shahabi, C.; van der Schaar, M. Mining the Situation: Spatiotemporal Traffic Prediction With Big Data. IEEE J. Sel. Top. Signal Process. 2015, 9, 702–715. [Google Scholar] [CrossRef]
  113. Xu, J.; Li, Y.; Lu, W.; Wu, S.; Li, Y. A Heterogeneous Traffic Spatio-Temporal Graph Convolution Model for Traffic Prediction. Phys. A Stat. Mech. Its Appl. 2024, 641, 129746. [Google Scholar] [CrossRef]
  114. Xu, M.; Liu, H. A Flexible Deep Learning-Aware Framework for Travel Time Prediction Considering Traffic Event. Eng. Appl. Artif. Intell. 2021, 106, 104491. [Google Scholar] [CrossRef]
  115. Yang, H.; Dillon, T.S.; Chen, Y. Optimized Structure of the Traffic Flow Forecasting Model With a Deep Learning Approach. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 2371–2381. [Google Scholar] [CrossRef] [PubMed]
  116. Yang, Y.; Xu, Y.; Han, J.; Wang, E.; Chen, W.; Yue, L. Efficient Traffic Congestion Estimation Using Multiple Spatio-Temporal Properties. Neurocomputing 2017, 267, 344–353. [Google Scholar] [CrossRef]
  117. Ye, Y.; Xiao, Y.; Zhou, Y.; Li, S.; Zang, Y.; Zhang, Y. Dynamic Multi-Graph Neural Network for Traffic Flow Prediction Incorporating Traffic Accidents. Expert Syst. Appl. 2023, 234, 121101. [Google Scholar] [CrossRef]
  118. Ye, J.; Xue, S.; Jiang, A. Attention-Based Spatio-Temporal Graph Convolutional Network Considering External Factors for Multistep Traffic Flow Prediction. Digit. Commun. Netw. 2022, 8, 343–350. [Google Scholar] [CrossRef]
  119. Yu, J.J.Q. Citywide Traffic Speed Prediction: A Geometric Deep Learning Approach. Knowl.-Based Syst. 2021, 212, 106592. [Google Scholar] [CrossRef]
  120. Zaki, J.F.; Ali-Eldin, A.; Hussein, S.E.; Saraya, S.F.; Areed, F.F. Traffic Congestion Prediction Based on Hidden Markov Models and Contrast Measure. Ain Shams Eng. J. 2020, 11, 535–551. [Google Scholar] [CrossRef]
  121. Zhang, J.; Song, C.; Cao, S.; Zhang, C. FDST-GCN: A Fundamental Diagram Based Spatiotemporal Graph Convolutional Network for Expressway Traffic Forecasting. Phys. A Stat. Mech. Its Appl. 2023, 630, 129173. [Google Scholar] [CrossRef]
  122. Zhao, J.; Liu, Z.; Sun, Q.; Li, Q.; Jia, X.; Zhang, R. Attention-Based Dynamic Spatial-Temporal Graph Convolutional Networks for Traffic Speed Forecasting. Expert Syst. Appl. 2022, 204, 117511. [Google Scholar] [CrossRef]
  123. Zhao, H.; Yang, H.; Wang, Y.; Wang, D.; Su, R. Attention Based Graph Bi-LSTM Networks for Traffic Forecasting. In Proceedings of the 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, 20–23 September 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6. [Google Scholar]
  124. Zheng, G.; Chai, W.K.; Katos, V. An Ensemble Model for Short-Term Traffic Prediction in Smart City Transportation System. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Big Island, HI, USA, 9–13 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6. [Google Scholar]
  125. Attioui, M.; Lahby, M. Forecasting Traffic Congestion Using Air Pollution Data: A Case Study of Casablanca, Morocco. In Applied Intelligence and Informatics; Mahmud, M., Kaiser, M.S., Kamruzzaman, J., Iftekharuddin, K., Ahad, M.A.R., Zhong, N., Eds.; Communications in Computer and Information Science; Springer: Cham, Switzerland, 2025; Volume 2607, pp. 432–445. [Google Scholar] [CrossRef]
Figure 1. PRISMA 2020 flow diagram illustrating the systematic selection process for studies on traffic congestion forecasting using machine learning, deep learning, and large language model approaches (2014–2024). Adapted from ref. [10].
Figure 2. Publication trends in traffic congestion forecasting (2014–2024).
Figure 3. Distribution of AI techniques in traffic congestion forecasting (2014–2024). Traditional ML: SVM, Random Forest, Bayesian methods. Deep learning: CNN, LSTM, RNN, and GRU. LLM-only: BERT, GPT, Transformers, etc. Hybrid ML + LLM: combined ML/DL with LLM approaches.
Figure 4. Distribution of traditional machine learning techniques in traffic congestion forecasting research (2014–2024).
Figure 5. Distribution of deep learning techniques in traffic congestion forecasting research (2015–2024).
Figure 6. Distribution of large language model (LLM) applications in traffic congestion forecasting (2021–2024).
Figure 7. Performance metrics used in traffic congestion forecasting studies. Key metric definitions: MAE: Mean Absolute Error; RMSE: Root Mean Square Error; MAPE: Mean Absolute Percentage Error; ROC/AUC: Receiver Operating Characteristic/Area Under Curve; F1-score: Harmonic mean of precision and recall.
Figure 8. Common limitations in traffic congestion forecasting studies.
Figure 9. The evolution of AI techniques over time (2014–2024) showing the shift from traditional ML to deep learning to LLM approaches.
Figure 10. Comparison of prediction accuracy across different model types for various congestion scenarios and time horizons.
Figure 11. Performance–complexity trade-off analysis.
Figure 12. Data types evolution in traffic congestion prediction.
Figure 13. Evolution of prediction capabilities in traffic congestion forecasting.
Table 1. Research questions and justifications.
ID | Research Question | Justification
RQ1 | How has the evolution from ML to LLMs impacted traffic congestion forecasting techniques between 2014 and 2024? | Examines technological progression, adoption patterns, and performance gains, particularly in predicting non-recurrent congestion
RQ2 | What are the performance–cost trade-offs between ML, DL, and LLMs for traffic congestion prediction? | Evaluates computational requirements versus accuracy improvements to guide practical implementation decisions
RQ3 | How do AI models perform in different transportation infrastructure scenarios? | Assesses model adaptability across urban networks, highways, and mixed transportation systems
RQ4 | What is the effectiveness of AI models in predicting different types of congestion over various time frames? | Examines temporal accuracy and pattern recognition capabilities for different congestion scenarios
RQ5 | How do different traffic parameters affect the AI model performance in congestion forecasting? | Identifies critical data inputs and their impact on prediction accuracy
RQ6 | What are the comparative advantages of different AI architectures and their combinations for traffic prediction? | Evaluates hybrid approaches and architectural innovations for optimal performance
Table 2. Quality assessment scoring system (maximum score: 100 points).
Criterion | Sub-Criteria | Weight | Points
Publication Quality (30%) | Journal ranking (Q1/Q2/Q3/Q4) | 15% | 0–15
 | Citation count relative to publication age | 10% | 0–10
 | Venue reputation in ITS/AI domains | 5% | 0–5
Methodology Rigor (50%) | Clear problem definition and research questions | 8% | 0–8
 | Detailed method description and reproducibility | 12% | 0–12
 | Appropriate dataset selection and description | 10% | 0–10
 | Rigorous experimental design (baselines, parameters) | 10% | 0–10
 | Statistical significance testing | 5% | 0–5
 | Computational cost reporting | 5% | 0–5
Reporting Transparency (20%) | Clear presentation of results and metrics | 8% | 0–8
 | Honest discussion of limitations | 5% | 0–5
 | Availability of code/data | 4% | 0–4
 | Adequate figures and visualizations | 3% | 0–3
Total | | 100% | 0–100
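To make the weighting in Table 2 concrete, the following minimal Python sketch applies the rubric to a single hypothetical study; the sub-scores are illustrative assumptions, not values taken from any reviewed publication.

```python
# Minimal sketch of the Table 2 quality scoring: each sub-criterion is awarded points
# up to its maximum, and the total is the plain sum (maximum 100 points).
max_points = {
    "journal_ranking": 15, "citation_count": 10, "venue_reputation": 5,        # Publication quality (30)
    "problem_definition": 8, "method_description": 12, "dataset_selection": 10,
    "experimental_design": 10, "significance_testing": 5, "cost_reporting": 5,  # Methodology rigor (50)
    "results_presentation": 8, "limitations": 5, "code_data_availability": 4,
    "figures": 3,                                                                # Reporting transparency (20)
}

# Hypothetical sub-scores for one study, for illustration only.
awarded = {"journal_ranking": 12, "citation_count": 7, "venue_reputation": 4,
           "problem_definition": 8, "method_description": 10, "dataset_selection": 9,
           "experimental_design": 8, "significance_testing": 3, "cost_reporting": 2,
           "results_presentation": 7, "limitations": 4, "code_data_availability": 2,
           "figures": 3}

# Clamp each awarded score to its sub-criterion maximum before summing.
total = sum(min(awarded[k], max_points[k]) for k in max_points)
print(f"Quality score: {total}/100")  # prints 79/100 for these illustrative values
```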
Table 3. PICO definition for our SLR study.
Element | Description
Population | Traffic congestion forecasting/prediction studies (2014–2024), urban and highway transportation systems, smart city infrastructure initiatives, intelligent transportation systems (ITSs)
Intervention | Traditional machine learning approaches, deep learning methods, large language models (LLMs), hybrid AI approaches, computational modeling techniques
Comparison | Performance metrics (accuracy, MAE, RMSE, MAPE), computational resource requirements, data handling capabilities, prediction time horizons (short, medium, long-term), congestion scenario types (recurrent, non-recurrent)
Outcome | Prediction accuracy across congestion types, computational efficiency and cost, implementation complexity, temporal and spatial accuracy, methodological evolution patterns
Table 4. Key terms taken from the PICO.
Element | Description
Population | Traffic congestion, gridlock, bottleneck, heavy traffic, rush hour, incidents, special events, short-term, medium-term, long-term
Intervention | Traditional machine learning approaches, deep learning methods, large language models (LLMs), hybrid AI approaches, computational modeling techniques
Comparison | Performance metrics (accuracy, MAE, RMSE, MAPE), computational resource requirements, data handling capabilities, prediction time horizons (short, medium, long-term), congestion scenario types (recurrent, non-recurrent)
Outcome | Prediction accuracy across congestion types, computational efficiency and cost, implementation complexity, temporal and spatial accuracy, methodological evolution patterns
Table 5. Taxonomy of AI/ML Methods in traffic congestion forecasting research.
Component | Type/Category | Application | Key Challenge | Implementation | Paradigm
TRADITIONAL ML
SVM | Instance-Based | General forecasting | Limited to linear relationships | Simple, low resources | Statistical ML
Random Forest | Tree-Based | General forecasting | May overfit with noisy data | Low computational cost | Statistical ML
Bayesian Networks | Probabilistic | General forecasting | Computationally intensive | Interpretable results | Statistical ML
DEEP LEARNING
CNN | Convolutional | Image/Video processing | High computational cost | GPU acceleration | Deep Learning
LSTM | Recurrent | Medium-term prediction | Training complexity | Moderate resources | Deep Learning
RNN | Recurrent | Sequential data | Vanishing gradient | Moderate resources | Deep Learning
LLM/TRANSFORMERS
BERT | Transformer-Based | Text processing | Very high computation | High resources | LLM/Transfer
GPT | Transformer-Based | Multimodal | Extremely high computation | Highest resources | LLM/Transfer
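As a minimal illustration of the statistical ML paradigm at the top of Table 5, the sketch below trains a scikit-learn Random Forest on synthetic segment-level features to classify congested versus uncongested conditions; all feature names, thresholds, and data are assumptions made for demonstration only.

```python
# Illustrative sketch of the "traditional ML" paradigm in Table 5: a Random Forest
# classifying a road segment as congested/uncongested from simple aggregate features.
# The features and data here are synthetic placeholders, not from any reviewed study.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.uniform(0, 2000, n),   # flow rate (veh/h)
    rng.uniform(5, 110, n),    # mean speed (km/h)
    rng.uniform(0, 1, n),      # occupancy rate
    rng.integers(0, 24, n),    # hour of day
])
# Toy label: congestion when speed is low and occupancy is high.
y = ((X[:, 1] < 40) & (X[:, 2] > 0.5)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```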
Table 6. Taxonomy of data types and scenarios in traffic congestion forecasting.
Component | Type/Category | Application | Key Challenge | Implementation | Paradigm
TRAFFIC PARAMETERS
Traffic Volume | Structured Parameter | General | Data collection challenges | Requires sensors | -
Speed | Structured Parameter | General | Variability issues | GPS/loop detectors | -
GPS Data | Sensor-Generated | General | Privacy concerns | Mobile devices | -
CONGESTION SCENARIOS
Rush Hour | Recurrent-Temporal | Urban traffic | Predictable patterns | Daily occurrence | -
Incidents | Non-recurrent | Emergency response | Unpredictability | Requires real-time data | -
Weather Events | Non-recurrent | Weather-related | Unpredictability | External data needed | -
Table 7. Taxonomy of performance metrics and implementation challenges.
Component | Type/Category | Application | Key Challenge | Implementation | Paradigm
PERFORMANCE METRICS
MAE/RMSE | Absolute Error | General evaluation | Sensitive to outliers | Standard metrics | Evaluation methodology
MAPE | Relative Error | Comparative analysis | Scale-dependent | Common metric | Evaluation methodology
Accuracy | Direct Measurement | Primary metric | May be misleading | Common metric | Performance improvement
IMPLEMENTATION CHALLENGES
Missing Data | Data Quality | All scenarios | Data quality issue | Requires imputation | Data preprocessing
Computational Cost | Resource Limitation | All scenarios | Resource limitation | Hardware requirements | Optimization focus
Table 8. Performance comparison across methodologies.
Method | Accuracy | MAE | RMSE | MAPE
Traditional ML | 75–85% | 5–8% | 7–10 | 10–15%
Deep Learning | 85–92% | 3–5% | 4–7 | 5–10%
LLM-Based | 90–95% | 2–4% | 3–5 | 4–8%
Hybrid | 92–96% | 1.5–3% | 2.5–4 | 3–6%
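The error metrics compared in Table 8 (and defined under Figure 7) can be computed directly. The short NumPy sketch below shows MAE, RMSE, and MAPE for a toy pair of observed and predicted traffic volumes; the values are illustrative only.

```python
# Minimal sketch of the error metrics used throughout the review, computed with NumPy
# for a pair of observed vs. predicted traffic volumes (illustrative values).
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    # Assumes y_true contains no zeros; real traffic data may need zero-flow masking.
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

y_true = np.array([820., 910., 1050., 760., 590.])   # observed vehicles/hour (toy data)
y_pred = np.array([800., 950., 1000., 790., 610.])   # model output (toy data)

print(f"MAE:  {mae(y_true, y_pred):.1f} veh/h")
print(f"RMSE: {rmse(y_true, y_pred):.1f} veh/h")
print(f"MAPE: {mape(y_true, y_pred):.2f} %")
```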
Table 9. Comprehensive characteristics of primary datasets used in reviewed studies.
Dataset | Studies | Location | Resolution | Duration | Sensors | Coverage
PeMS (District 7) | 34 | California, USA | 5 min | 2012–2023 | 3900+ loops | 440 km highway
METR-LA | 18 | Los Angeles, USA | 5 min | 2012–2017 | 207 sensors | Urban network
PeMS-BAY | 12 | SF Bay Area, USA | 5 min | 2017–2019 | 325 sensors | 540 km highway
Hong Kong Traffic | 8 | Hong Kong | 1–5 min | 2018–2020 | GPS + sensors | City-wide
Seattle Loop Data | 6 | Seattle, USA | 5 min | 2015–2019 | 323 detectors | Highway
Beijing Taxi GPS | 5 | Beijing, China | 30 s | 2013–2016 | 33,000+ taxis | Urban
NGSIM Trajectory | 4 | Various US | 0.1 s | 2005–2006 | Video | Specific sites
TomTom Speed | 3 | Multi-city | 1–5 min | 2016–2020 | GPS probes | Global coverage
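Most of the datasets in Table 9 expose fixed-interval loop-detector records (typically 5 min). The pandas sketch below shows one common preprocessing step, aggregating such records to a coarser interval per sensor; the column names and values are placeholders and do not reflect the actual PeMS schema.

```python
# Hedged sketch: aggregating 5-min loop-detector readings (as in the PeMS-style
# datasets of Table 9) to 15-min totals/averages per sensor. Columns are assumptions.
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.date_range("2023-01-01 08:00", periods=6, freq="5min").repeat(2),
    "sensor_id": ["S1", "S2"] * 6,
    "flow_veh_per_5min": [110, 95, 120, 99, 130, 101, 125, 98, 118, 96, 112, 94],
    "speed_kmh":         [88, 92, 84, 91, 78, 90, 80, 93, 85, 92, 87, 94],
})

# Sum flows and average speeds within each 15-min window, separately per sensor.
agg = (df.set_index("timestamp")
         .groupby("sensor_id")
         .resample("15min")
         .agg({"flow_veh_per_5min": "sum", "speed_kmh": "mean"}))
print(agg)
```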
Table 10. Representative studies comparison across methodologies.
Study | Year | Method | Dataset | Key Innovation | Performance | Limitations
Traditional Machine Learning Approaches
Oh et al. [85] | 2015 | GMM + ANN | ITS data (2012–2013) | Multifactor pattern recognition | MAE: 8.47 s | Black-box nature, limited scalability
Tay et al. [102] | 2023 | Random Forest | 2M+ crowdsourced | BFS algorithm integration | MAPE: 2.63% | 10% ground truth limitation
Deep Learning Approaches
Zhao et al. [16] | 2020 | T-GCN | PeMS | Temporal graph convolution | RMSE: 4.32 | Fixed graph structure
Cao et al. [17] | 2020 | CNN-LSTM | Hong Kong data | Periodicity extraction | Accuracy: 93.8% | High complexity
LLM and Hybrid Approaches
Jin et al. [21] | 2021 | TrafficBERT | Large-scale traffic | Pretrained model | MAPE: 4.8% | Computational complexity
Liu et al. [23] | 2024 | ST-LLM | Multi-city data | Spatial–temporal fusion | Accuracy: 95% | Domain adaptation needed
Table 11. Capability matrix across methodological approaches.
Capability | Traditional ML | Deep Learning | LLM-Based | Hybrid
Spatial Dependency Modeling | Limited | Good | Excellent | Excellent
Temporal Pattern Recognition | Moderate | Excellent | Good | Excellent
Multimodal Data Integration | Poor | Moderate | Excellent | Excellent
Transfer Learning | None | Limited | Excellent | Good
Real-time Processing | Excellent | Good | Poor | Moderate
Interpretability | High | Low | Moderate | Low
Edge Case Handling | Poor | Moderate | Excellent | Excellent
Computational Efficiency | Excellent | Moderate | Poor | Poor
Table 12. Application suitability recommendations.
Scenario | Recommended | Alternative | Avoid
Real-time Highway | LSTM/GRU | Traditional ML | LLM
Urban Network | GNN + LLM | Deep Learning | Traditional ML
Event Prediction | LLM/Hybrid | - | Traditional ML
Resource-Limited | Traditional ML | Optimized DL | LLM
Research/Offline | Hybrid LLM | LLM | -
Table 13. Comparative analysis of AI approaches for traffic congestion forecasting (2014–2024).
Reference | Year | Method | Dataset | Limitations | Performance Metrics | Key Findings
TRADITIONAL MACHINE LEARNING APPROACHES (2014–2017)
Urban Traffic Flow Prediction System [85] | 2015 | Gaussian mixture model + ANN | ITS data (2012–2013) | Black-box nature of ANN, parameter estimation for varying road links | MAE, prediction time (8.47 s) | Multifactor pattern recognition model outperformed other methodologies during rush hour
Vehicle Classification and Speed Estimation [12] | 2017 | Dynamic Bayesian Networks | 33,480 measurements | Constant speed assumption invalid for slow traffic, spectrum folding | 99% detection accuracy, 5 km/h mean error | Effective fusion of PIR and ultrasonic sensor data
Traffic Volume Estimation [102] | 2023 | Random Forest, BFS algorithm | 2M+ crowdsourced data points | Limited ground truth data (10%), model sensitivity to input data | R2, MAE, MAPE | RF model demonstrated best performance compared to other ML techniques
Short-term Traffic Flow Prediction [116] | 2017 | Bayesian Network | NGSIM trajectory data | Limited ability to model complex temporal dependencies | Accuracy (93.2%), computation time | Effective for scenarios with missing sensor data
DEEP LEARNING APPROACHES (2018–2020)
Vehicle Identification [95] | 2020 | Modified RCNN + GoogleNet | MIO-TCD, EBVT | Detection challenges in low-light conditions, computational complexity | Accuracy, precision, recall | Significant improvement in vehicle identification accuracy
Trajectory Prediction [96] | 2023 | Deep Generative Models | Trajectory data | High computational requirements, limited explainability | Prediction accuracy (95.3%), RMSE | Effective for complex urban environments
Traffic Flow Forecasting [74] | 2021 | LSTM + CNN | PeMS dataset | Data preprocessing requirements, training time | MAE (2.13%), MAPE (5.6%) | Spatial–temporal features improved prediction accuracy
Urban Traffic Prediction [123] | 2020 | Attention-based LSTM | Urban traffic data | Model complexity, hyperparameter sensitivity | RMSE (4.32), MAE (3.47) | Attention mechanism captured critical temporal dependencies
Traffic State Prediction [118] | 2022 | Graph Neural Network | Road network data | Graph construction complexity, computational cost | Prediction accuracy (93.8%) | Effectively captured spatial dependencies in road networks
LLM AND HYBRID APPROACHES (2021–2024)
TrafficBERT [21] | 2021 | BERT + time-series analysis | Large-scale traffic data | Computational complexity, transfer learning challenges | MAPE (4.8%), RMSE (3.2) | Pretrained model improved long-range forecasting
Traffic Event Detection [36] | 2023 | LLM (slot filling approach) | Traffic incident reports | Textual ambiguity, domain-specific vocabulary | F1-score (0.89), precision (0.92) | Effectively identified and categorized traffic events from textual data
Traffic Prediction with LLM [86] | 2023 | GPT adaptation | Multi-source traffic data | Fine-tuning requirements, inference time | MAE (2.01), RMSE (3.14) | Contextual understanding improved prediction during special events
LSTM-Transformer [19] | 2022 | LSTM + Transformer | Traffic flow time series | Model complexity, training data requirements | MAPE (5.1%), MAE (2.34) | Hybrid approach outperformed individual models
Urban Traffic Patterns [63] | 2024 | GNN + BERT | Road network + textual data | Integration complexity, computational cost | Accuracy (94.2%), MAE (1.87) | Semantic understanding improved prediction during events
Note: This table presents 14 representative studies selected from the 100 reviewed publications to illustrate the methodological evolution and diversity of approaches across three technological eras (2014–2024). The complete dataset analysis encompassed all 100 studies, as detailed in Section 4 and Section 5. Studies were selected to represent (a) diverse methodological approaches within each era, (b) varying application contexts (urban networks, highways, and multimodal scenarios), (c) different performance metric reporting standards, and (d) representative limitations that characterize each technological generation.
Table 14. AI model performance–cost comparison.
Model Type | Performance Range | Cost Advantages | Cost Disadvantages
Traditional ML Models | 75–85%; lower accuracy for non-recurrent congestion | Minimal computational requirements; faster training and inference times; simpler implementation | Extensive feature engineering required; limited scalability with data complexity
Deep Learning Models | 85–92%; better spatial–temporal handling | Automated feature extraction; scalability with data volume; moderate inference requirements | Higher computational demands for training; hyperparameter tuning complexity
LLM-Based Approaches | 90–95%; superior contextual understanding | Transfer learning capabilities; reduced training data requirements; better generalization to new locations | Significantly higher computational demands (10–100×); domain adaptation challenges; complex integration requirements
Table 15. AI model performance by transportation infrastructure type.
Infrastructure Type | Optimal Models and Characteristics | Key Features | Performance Gain
Urban Road Networks | Graph Neural Networks (GNNs): superior spatial relationship capture; LSTM+CNN hybrid: effective spatial–temporal dependencies; LLM approaches: advantage with multiple events and intersections | Complex urban network modeling; multifactor event handling; diverse influencing factors | 4–8% improvement, LLM/hybrid over traditional
Highway Corridors | LSTM and GRU models: excel where temporal dependencies dominate; traditional ML: remains competitive for short-term prediction (15–30 min), with less pronounced differences between models | Temporal pattern focus; sequential traffic flow; simpler spatial relationships | 2–5% improvement, deep learning over traditional ML
Mixed Transportation Networks | Hybrid GNN+LLM models: best performance in mixed networks; context-aware models: significant advantages when multiple transportation modes interact | Multimodal integration; complex interaction modeling; cross-network dependencies | 5–10% improvement, hybrid over single-technique
Table 16. Congestion type analysis.
Congestion Type | Performance Characteristics | Accuracy Metrics
Recurrent Congestion (rush hour and regular patterns) | All model types perform relatively well, with minimal accuracy differences between traditional ML and advanced approaches (3–5% difference); traditional ML models remain competitive for short-term predictions (15–30 min); deep learning models demonstrate superior performance for medium-term horizons (30–120 min) | 88–95% accuracy
Non-recurrent Congestion (accidents, weather events, special events) | LLM and hybrid approaches demonstrate substantial advantages over traditional methods (10–15% higher accuracy); traditional ML models show significant performance degradation for non-recurrent events; context-aware models excel at incorporating event information to predict impact severity and duration | LLM/hybrid: 82–88%; traditional: 65–75%
Table 17. Prediction timeframe performance.
Timeframe | Performance Characteristics | Performance Gap
Short-term (0–30 min) | Performance gaps between model types are minimal; all approaches yielded competitive results for immediate predictions. | 2–5% gap
Medium-term (30–120 min) | Deep learning approaches have begun to outperform traditional ML as temporal complexity increases and pattern recognition becomes more critical. | 5–10% advantage
Long-term (2+ h) | LLM and hybrid models maintain higher accuracy by leveraging contextual understanding and complex pattern recognition capabilities. | 10–15% advantage
Very long-term (next day) | All model types show accuracy degradation due to increased uncertainty, but LLM approaches demonstrate higher robustness and maintain better performance. | Higher robustness
Table 18. High-impact parameters.
Parameter | Description | Impact Level
Traffic Flow Rate | Vehicles per hour; identified as the most influential parameter across all model types. This metric serves as the primary indicator of traffic conditions and is directly correlated with prediction accuracy. | Highest impact
Historical Travel Time | Critical for establishing baseline patterns, particularly in LSTM and GRU models. Provides the temporal context essential for accurate predictions. | Critical
Occupancy Rate | Percentage of a road segment occupied by vehicles; highly correlated with prediction accuracy across all model types. | High impact
Special Events Information | Demonstrated the highest impact differential between traditional ML and LLM approaches. Critical for predicting non-routine traffic disruptions and their patterns. | Game changer
Table 19. Data dependencies.
Dependency Factor | Impact Description | Performance Gain
Temporal Resolution | Higher-frequency data collection (1–5 min intervals) significantly improves prediction accuracy compared with lower-resolution data collection methods. | +3–7% accuracy
Spatial Coverage | Models incorporating neighboring road segment data significantly outperformed isolated-segment models by capturing spatial traffic flow patterns. | +4–9% improvement
Historical Data Depth | Deep learning models show heightened sensitivity to historical data availability, with optimal performance requiring substantial training periods. | 3–6 months optimal
Multimodal Integration | Models incorporating heterogeneous data sources (traffic, weather, events) significantly outperform single-source models through comprehensive context understanding. | +5–12% enhancement
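The dependencies in Table 19 translate directly into feature-engineering choices. The sketch below assembles a toy design matrix combining lagged flows for a target segment (temporal resolution), flows from neighboring segments (spatial coverage), and an external weather flag (multimodal integration); all identifiers and values are synthetic assumptions.

```python
# Illustrative feature construction reflecting Table 19: lagged flows for the target
# segment, flows from two neighbouring segments, and a weather flag.
# Data and column names are synthetic placeholders.
import pandas as pd

flows = pd.DataFrame({
    "t":     pd.date_range("2023-03-01 07:00", periods=8, freq="5min"),
    "seg_A": [410, 430, 455, 480, 510, 540, 560, 585],   # target segment
    "seg_B": [380, 395, 410, 440, 460, 490, 505, 520],   # upstream neighbour
    "seg_C": [300, 310, 330, 345, 360, 380, 395, 410],   # downstream neighbour
    "rain":  [0, 0, 0, 1, 1, 1, 0, 0],                   # external weather flag
})

X = pd.DataFrame({
    "lag_1":    flows["seg_A"].shift(1),   # temporal context for the target segment
    "lag_2":    flows["seg_A"].shift(2),
    "nbr_up":   flows["seg_B"].shift(1),   # spatial coverage: neighbouring segments
    "nbr_down": flows["seg_C"].shift(1),
    "rain":     flows["rain"],             # multimodal source
})
y = flows["seg_A"]                         # next-step prediction target
print(pd.concat([X, y.rename("target")], axis=1).dropna())
```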
Table 20. AI architecture comparison for traffic prediction.
Architecture | Advantages | Limitations
CNN (Convolutional Neural Networks) | Excels at capturing spatial relationships in grid-based urban networks; effective for processing image-based traffic data (video surveillance); particularly advantageous for detecting congestion propagation patterns | Less effective for capturing long-term temporal dependencies
RNN/LSTM/GRU (Recurrent Neural Networks) | Superior performance in capturing temporal traffic patterns and dependencies; particularly effective for highway traffic prediction, where flow is more sequential than in urban traffic; strong performance over medium-term prediction horizons | May struggle with complex spatial relationships in large networks
GNN (Graph Neural Networks) | Excels at modeling complex road network topologies; effectively captures spatial dependencies between interconnected road segments; particularly advantageous for urban networks with complex connectivity | Higher computational complexity; more complex implementation
Transformer/LLM (Large Language Models) | Superior contextual understanding and interpretation of traffic-related events; effective integration of multimodal data sources; particularly advantageous for non-recurrent congestion prediction | Highest computational requirements; domain adaptation challenges
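To ground the recurrent family in Table 20, the following PyTorch sketch defines a minimal LSTM that maps a window of past flow readings to a one-step-ahead forecast; the architecture, hyperparameters, and toy data are illustrative assumptions rather than a configuration drawn from any reviewed study.

```python
# Minimal PyTorch sketch of the recurrent (LSTM) architecture family from Table 20:
# a sequence of past flow readings is mapped to a one-step-ahead forecast.
import torch
import torch.nn as nn

class FlowLSTM(nn.Module):
    def __init__(self, n_features=1, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # forecast from the last hidden state

torch.manual_seed(0)
seq = torch.cumsum(torch.randn(16, 12, 1), dim=1)    # 16 toy sequences, 12 time steps
target = seq[:, -1, :] + torch.randn(16, 1) * 0.1    # noisy next-value proxy

model = FlowLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
for epoch in range(20):                    # short demonstration loop
    opt.zero_grad()
    loss = loss_fn(model(seq), target)
    loss.backward()
    opt.step()
print("final MSE:", loss.item())
```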
Table 21. Performance–cost trade-offs across methodological paradigms in traffic congestion prediction.
Metric | Traditional ML (2014–2017) | Deep Learning (2018–2020) | LLM and Hybrid (2021–2024)
Accuracy (%) | 75–85 | 85–92 | 90–95
MAE (%) | 5–9 | 3–6 | 2–4
RMSE | 7–10 | 4–7 | 3–5
Training Time | Minutes | Hours | Days
Memory Requirements | <1 GB | 1–10 GB | 10–100 GB
Implementation Complexity | Simple | Moderate | High
Model Interpretability | High | Medium | Low
Table 22. Hybrid approach performance and computational characteristics.
Hybrid Type | Accuracy Improvement | Computational Cost | Implementation Complexity | Best Use Case
Ensemble | +3–7% | High | Medium | Stable scenarios
Physics-Informed | +5–10% | Medium | High | Data-scarce
Pipeline | +6–12% | Medium | Medium | General purpose
Multimodal | +10–15% | Very High | High | Non-recurrent
Table 23. Energy consumption and carbon footprint by methodology.
Methodology | Training Energy (kWh) | Inference (kWh/day) | Total Annual (kWh) | CO2e (tons/year)
Traditional ML | 5–15 | 0.2–0.5 | 90–200 | 0.04–0.09
Deep Learning | 200–800 | 2–8 | 1000–3500 | 0.45–1.6
LLM-Based | 5000–20,000 | 50–150 | 23,000–75,000 | 10–34
Hybrid | 3000–12,000 | 20–80 | 10,000–42,000 | 4.5–19
Table 24. Net carbon impact: benefits vs. computational costs.
Methodology | Computational Cost (tons CO2e) | Congestion Reduction (tons) | Net Benefit (tons CO2e) | Benefit/Cost Ratio
Traditional ML | 0.04–0.09 | 40,000–65,000 | 39,999–64,999 | 500,000×
Deep Learning | 0.45–1.6 | 55,000–85,000 | 54,998–84,998 | 40,000×
LLM-Based | 10–34 | 70,000–110,000 | 69,966–109,990 | 3000×
Hybrid | 4.5–19 | 65,000–100,000 | 64,981–99,995 | 5000×
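As a sanity check on Table 24, the sketch below recomputes net benefit and benefit/cost ratio from the most conservative ends of the ranges in Tables 23 and 24 (highest computational cost, lowest congestion reduction); the resulting ratios fall in the same order of magnitude as the rounded figures reported in the table.

```python
# Back-of-envelope reproduction of the "benefit/cost ratio" column in Table 24,
# using the most conservative ends of the ranges (highest computational cost,
# lowest congestion reduction). Purely illustrative arithmetic.
scenarios = {
    # methodology: (max computational cost in tCO2e/yr, min congestion reduction in tCO2e/yr)
    "Traditional ML": (0.09, 40_000),
    "Deep Learning":  (1.6,  55_000),
    "LLM-Based":      (34,   70_000),
    "Hybrid":         (19,   65_000),
}

for name, (cost, benefit) in scenarios.items():
    print(f"{name:15s} net benefit >= {benefit - cost:,.2f} tCO2e/yr, "
          f"ratio >= {benefit / cost:,.0f}x")
```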
Table 25. Real-world case studies: policy implications of ai-based congestion forecasting systems.
Aspect | Singapore ITS | Los Angeles LADOT | EU Smart Cities
Deployment Scale | 1200+ km roads, 11,000+ sensors, GPS probes, weather stations | 7500 km arterial streets, CCTV integration, mobile apps | Multi-city network (Barcelona, Amsterdam, Rotterdam)
Model Architecture | Hybrid LSTM-GNN ensemble with transfer learning, 30 s real-time updates | CNN-LSTM hybrid with computer vision, 5 min granularity | Pretrained BERT-based models with fine-tuning (72 h/city)
Prediction Accuracy | 89% (30 min horizon) | 86% (incident-induced congestion) | 91% (event-related congestion); 78% transfer to new city
Economic Impact | USD 420 M annual savings, USD 80 M implementation cost, 2.3-year ROI | USD 180 M annual safety/economic benefits | EUR 52 M combined savings through knowledge sharing
Operational Improvements | 15% reduction in average commute times | 23% faster incident response (12.3→9.5 min), 12% fewer secondary crashes, 62% driver satisfaction | 65% reduction in training data requirements, 6-month deployment vs. 18–24 months
Key Policy Insight | Phased implementation strategy (highways first) critical for stakeholder buy-in; moderate complexity yields practical benefits | Human–AI collaboration with operator override enhances acceptance; integration trumps standalone accuracy | Data-sharing agreements enable cost reduction; GDPR facilitates trust through privacy-preserving techniques
Study Period | 2022–2024 | 2021–2023 | Not specified
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
