Machine Learning Applications in Surface Transportation Systems: A Literature Review

Abstract: Surface transportation has evolved through technology advancements using parallel knowledge areas such as machine learning (ML). However, the transportation industry has not yet taken full advantage of ML. To evaluate this gap, we utilized a literature review approach to locate, categorize, and synthesize the principal concepts of research papers regarding surface transportation systems using ML algorithms, and we then decomposed them into their fundamental elements. We explored more than 100 articles, literature review papers, and books. The results show that 74% of the papers concentrate on forecasting, while multilayer perceptrons, long short-term memory, random forest, support vector machine, XGBoost, and deep convolutional neural networks are the most preferred ML algorithms. However, sophisticated ML algorithms have been minimally used. The root-cause analysis revealed a lack of effective collaboration between the ML and transportation experts, resulting in the most accessible transportation applications being used as a case study to test or enhance a given ML algorithm and not necessarily to address a mobility or safety issue. Additionally, the transportation community does not define transportation issues clearly and does not provide publicly available transportation datasets. The transportation sector must offer an open-source platform to showcase the sector's concerns and build spatiotemporal datasets for ML experts to accelerate technology advancements.


Introduction
Mobility is a multidimensional, heterogeneous concept that involves the movement of goods and people. With advancements in information and communication technology, mobility has provided more comfortable, more efficient, and faster accessibility in recent decades. Although advances in mobility continuously improve the quality of human life, the accompanying changes result in hurdles and setbacks, known as transportation externalities (e.g., air pollution, congestion, noise pollution, and health problems). Due to their internal interactions, transportation ecosystem problems are typically categorized as non-deterministic polynomial-hard (NP-hard) problems [1,2]. One example of an NP-hard problem is automobiles speeding up mobility while causing traffic congestion, which slows down mobility. NP-hard problems combined with externalities make transportation problems complex system problems.
Science, technology, and engineering fields (e.g., computer science and systems engineering) have become involved in transportation engineering to solve some of the complex system problems that have arisen over the past decades. These interdisciplinary resolutions have generally been called intelligent transportation systems (ITSs) [3]. Over the past 30 years, ITSs have included dynamic message signs, real-time transit information systems, traveler information systems, electronic toll collection systems, ramp meters, smart parking systems, mapping applications, and advanced traffic signals [4].
Subsequent ITS generations have included automation, connectivity, cybersecurity for ITSs, and enabling technologies such as artificial intelligence (AI) [5]. Artificial intelligence, the application of computers mimicking human decision-making and problem-solving logic capabilities, has been introduced into transportation systems. For instance, AI tools such as machine vision and voice recognition have already been applied in automated vehicle technologies [6].
As a subset of AI, machine learning (ML) is a sophisticated, state-of-the-art branch of knowledge that offers the potential to solve unsettled or difficult-to-solve problems. Machine learning is a data-driven approach that allows computers to imitate human learning capabilities by gradually improving accuracy [7]. This approach can address some NP-hard problems, including optimization problems, as it does not have the limitations associated with analytical methods [8].
Several learning techniques are used to implement ML algorithms, including classical learners, neural networks and deep learners, and ensemble learners. Ensemble learners use two or more shallow learners (from classical or neural network learners) to improve their performance. Generally, neural networks and deep learners aim to solve problems with homogeneous data representations, such as images and video. Classical learners, in contrast, target problems with heterogeneous data, such as independent properties of a particular subject. Consequently, feature engineering could be a performance improvement strategy for classical learners.
On the other hand, deep learners perform better with a large amount of input data, although they are computationally expensive. Additionally, their results are usually difficult for humans to interpret. These specific properties make different ML algorithms suitable for various applications depending on input data, required results, and expected performance.
Given the potential and capabilities of ML techniques, academic researchers and specific technology companies have considered ML for addressing transportation problems. Machine learning models have been widely adopted for forecasting and prediction regarding traffic flow, traffic congestion modeling, capacity analysis, mode choice analysis, and demand prediction [9][10][11][12][13].
Machine learning has primarily been explored within the academic research world and technology and manufacturing enterprises to accelerate pushing knowledge and products forward. In contrast, transportation authorities, including planners and policymakers, have not yet employed ML to its fullest potential regarding operations, infrastructure, management, planning, sustainability, equity, accessibility, and service improvement. Data scientists and researchers often apply a particular ML technique to a transportation application, but not vice versa. The existing literature on transportation systems reveals a considerable gap in the area of ML algorithm applications for improving transportation safety, mobility, and accessibility.
The primary objective of this research is to find the root cause of the slowdown in applying ML to surface transportation systems (STSs) compared to parallel sectors such as health care. For this root cause analysis, we utilized a literature review to investigate ML applications in STSs so as to evaluate the hurdles and barriers. The review results give an adequate overview, revealing the gaps and opportunities that can be taken advantage of to help improve the efficiency and performance of ML applications in STSs.
By providing a comprehensive literature review, we aim to deliver an opportunity for transportation authorities, including but not limited to planners, operators, managers, law enforcement, regulators, and policymakers, to use machine learning for solving complex surface transportation system problems. The ultimate goal is to identify the prospects, gaps, and means by which ML techniques can be used to help improve safety, mobility, accessibility, level of service, efficiency, security, comfort, quality, cost, integrity, and sustainability in society's STSs. Furthermore, we present a platform for researchers to efficiently survey what has been done in STS research, the potential opportunities, and the existing gaps. A by-product of this study is an up-to-date dynamic website that serves as a repository of peer-reviewed research articles at the intersection of ML and surface transportation.
The remainder of this paper is organized as follows. Section 2 explains the objective and primary reasons behind initiating this research. Section 3 briefly describes the various ML algorithms considered in this literature review. Furthermore, this section describes the surface transportation system components and applications considered in this paper. Section 4 summarizes the published "literature reviews" and books on ML applications in transportation systems and identifies distinctions between these publications. Next, Section 5 describes the systematic approach selected for this literature review. Section 6 describes the findings from more than 100 reviewed papers according to how they align with our research goal. The papers were classified and presented based on their application sector areas, applied ML algorithms, results, and data input. Section 7 explains why some ML algorithms are more prevalent in transportation system application areas whereas other compelling ML algorithms and critical STS application areas are less popular. Finally, Section 8 presents the fundamental findings, future steps, limitations, and further research opportunities.

Motivation
Mobility is inherently unpredictable, not because of its stochastic nature but because of the many difficult-to-measure interacting factors that impact it, including environmental and climate conditions, socioeconomic status, special events, and accidents. Traditionally, transportation has been planned and managed using incomplete, inaccurate, inconsistent, and uncertain data. Stochastic methods and models have been the most popular approach over the past few decades [14][15][16]. Statistical and analytical techniques such as regression, simulation, flow equilibrium, and spatiotemporal physics rules have been predominantly used to address management and planning problems. Although serving a purpose, these stochastic approaches cannot precisely model the existing external and internal factors that impact mobility systems. At best, these traditional approaches measure flow, capacity, and speed as swayed by internal and external interactions. Wardrop's first and second principles [17] have been widely used as the basis of transportation system analysis and planning, with the caveat that some external factors and interactions are missing.
The primary factors that influence the transportation system can be categorized into two main groups: the spatial transportation behaviors present in the graphical network (e.g., connectivity, routing, and directionality) and the external factors that impact mobility (e.g., weather conditions, time of day, time of year, time of the month, accidents, and socioeconomic status). Traffic and transportation modeling approaches typically ignore one or both of those primary factors. Ideally, we want to consider both groups: external factors and spatial behaviors.
AI and machine learning techniques allow us to combine spatial behaviors and external factors with the traditional models for more effective decision making, risk evaluation, and overall advancement while eliminating externalities. For instance, the multilayer perceptron (MLP) algorithm, as a neural network (NN) learning method, has been the most frequently considered strategy for many years, particularly in prediction, clustering, and optimization. This method possesses the unique power of nonlinear feature extraction from input data. Moreover, this algorithm is similar to the human brain in terms of intuition and logical decisions. The MLP's unique potential makes it a suitable strategy for highly uncertain ecosystems, such as transportation systems. In addition to the MLP, other variants of the NN and deep learning (DL) algorithms (e.g., long short-term memory (LSTM), long-short memory (LSM), and graph neural networks (GNN)) have gained the attention of many researchers.
The dynamics of the surface transportation system play out in a spatiotemporal context. It is impossible to interpret STS interactions, such as congestion or crashes, without considering the spatiotemporal characteristics. For instance, STS spatial data include road networks, rail networks and stations, bus networks and stops, and vehicle trajectories. Neural networks can capture surface transportation's sequential, time-related spatial parameters. These spatiotemporal features are fed to a GNN and interpreted as a directed network. The message-passing technique between network nodes is the central concept within the GNN algorithm. Each node in the graph passes its property to its neighbors, and then the neighbors aggregate the properties using a specific aggregation method. As a result, this aggregation also gives each node additional information about its neighbors. Figure 1 presents a sample of a road network framed in a GNN model utilizing the message-passing concept.
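The message-passing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the model from any reviewed paper: the four-node road graph, the vehicle-count features, the mean aggregation, and the equal-weight update rule are all assumptions chosen for readability.

```python
# Hypothetical toy road network: intersections are nodes, directed road
# segments are edges. All names and values here are illustrative assumptions.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
features = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0}  # e.g., vehicle counts


def message_passing_round(edges, features):
    """Run one round of mean-aggregation message passing on a directed graph."""
    # Step 1: every node sends its current feature along its outgoing edges.
    inbox = {node: [] for node in features}
    for src, dst in edges:
        inbox[dst].append(features[src])

    # Step 2: every node averages the messages it received and mixes the
    # result with its own feature (an equal-weight update, chosen for clarity).
    updated = {}
    for node, own in features.items():
        messages = inbox[node]
        if messages:
            neighborhood = sum(messages) / len(messages)
            updated[node] = 0.5 * own + 0.5 * neighborhood
        else:
            updated[node] = own  # no incoming roads: keep the old value
    return updated


after_one_round = message_passing_round(edges, features)
```

After one round, each node's value already reflects its upstream neighbor; repeating the round would spread information across longer paths. Practical GNNs typically replace the fixed 0.5 weights with learned parameters.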
Graph neural networks with unique spatial representation capabilities have received considerable attention in the research world, including transportation research. In summary, more recent sophisticated machine learning algorithms, including the NN algorithms, recurrent neural network (RNN) algorithms, GNN, and variational autoencoder (VAE), have been minimally used in the planning and management of the surface transportation sector. However, these algorithms provide numerous opportunities for improvements to the efficiency and accuracy of transportation system models if utilized appropriately. Therefore, we aimed to conduct a thorough literature review of the studies conducted in the STS applications field to provide a proper understanding of the gaps and opportunities for applying advanced ML techniques. This review is intended to benefit STSs by helping to alleviate externalities and enhance the overall system behavior.

Machine Learning
Machine learning is an evolving field that has impacted many applied sciences, including healthcare [18], the automotive industry [19], cybersecurity [20], and robotics [21,22]. Unlike classical programming, ML data processing is modeled in advance (Figure 2). Numerous ML publications cover various applications, algorithms, fundamentals, and theories [23][24][25][26][27][28][29][30]. Machine learning is the knowledge of applying data and techniques to imitate human learning competence patterns and behaviors. In their basic form, ML algorithms receive input data, analyze them, and provide output data by learning any perceived patterns and behavior inherent to the data. These algorithms typically apply a set of data, treat it as prior knowledge, train themselves, and search for similar patterns and ingrained behavior in the new data. Access to a large amount of prior knowledge would improve the efficiency of the learning experience.
As data-driven tools, ML algorithms learn from input data that contain correct results or functions. Several approaches exist for implementing this simple concept, depending on how the algorithm uses the input data and maintains the existing hyperparameters. These algorithms are organized into three primary categories: (1) classical algorithms, (2) neural network and deep learning algorithms, and (3) ensemble algorithms, as presented in Figure 3 and described below.

Classical Algorithms
Classical algorithms can be supervised, unsupervised, or reinforcement learning algorithms. In supervised learning, the input dataset includes both the input and the desired output data. The algorithm is trained on how the input data are related to the output data. Supervised learning algorithms can be used for regression, to predict a continuously variable output, or for classification, to predict predefined classes or groups.
Classification models are used to predict discrete variables. Logistic regression, naive Bayes (NB), support vector machine (SVM), decision trees, and k-nearest neighbor (KNN) represent the most popular ML models for supervised classification.
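As a concrete instance of supervised classification, the sketch below implements a bare-bones k-nearest neighbor classifier using only the Python standard library. The labeled speed and occupancy samples and the two class names are hypothetical values invented for illustration; they do not come from any dataset in the reviewed papers.

```python
from collections import Counter
import math

# Hypothetical labeled samples: (mean speed in km/h, occupancy in %) -> state.
training_data = [
    ((95.0, 10.0), "free-flow"),
    ((88.0, 15.0), "free-flow"),
    ((90.0, 12.0), "free-flow"),
    ((30.0, 60.0), "congested"),
    ((25.0, 70.0), "congested"),
    ((35.0, 55.0), "congested"),
]


def knn_classify(sample, data, k=3):
    """Return the majority label among the k training points nearest to sample."""
    by_distance = sorted(data, key=lambda item: math.dist(sample, item[0]))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


prediction_fast = knn_classify((85.0, 18.0), training_data)
prediction_slow = knn_classify((28.0, 65.0), training_data)
```

The training set carries both inputs and desired outputs, which is what makes the method supervised; the classifier simply assigns a new reading the label that dominates among its closest labeled neighbors.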
Additionally, popular supervised regression ML algorithms include linear, polynomial, and ridge/lasso regression. In unsupervised learning, by contrast, the output data are not fed into the model. Instead, the model only looks at the input data, organizes them based on internally recognized patterns, and clusters them into structured groups. Subsequently, the trained model can predict the appropriate group to which new input data belong.
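The unsupervised workflow described above (group unlabeled inputs, then place new inputs into a learned group) can be illustrated with k-means clustering, used here only as one common example of the idea. The one-dimensional speed readings and the starting centroids are hypothetical assumptions.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids


def assign(value, centroids):
    """Return the index of the learned group that a new value belongs to."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


# Hypothetical unlabeled link speeds (km/h); no output labels are provided.
speeds = [22.0, 25.0, 28.0, 88.0, 92.0, 96.0]
centers = kmeans_1d(speeds, centroids=[20.0, 100.0])
group_of_new_reading = assign(30.0, centers)
```

No labels are supplied at any point: the two groups emerge from the data alone, and `assign` then maps a previously unseen reading to the nearer group.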
The reinforcement learning algorithm is a reward-based algorithm. The algorithm is placed in an environment with predefined rewarding actions, where each action corresponds to a specific reward. The algorithm then tries actions in that environment and, based on the rewards received, learns how to handle the input data and its moves so as to maximize the total reward. Reinforcement learning is a form of trial and error that learns from past experiences and adapts its knowledge to achieve the best conceivable outcome.
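The trial-and-error loop can be sketched with tabular Q-learning, one widely used reinforcement learning method chosen here purely for illustration. The five-cell corridor environment, the reward of 1 at the goal, and the hyperparameter values are all hypothetical assumptions.

```python
import random

N_STATES = 5          # positions 0..4 along a hypothetical corridor
GOAL = N_STATES - 1   # reaching the last position ends the episode
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Move within the corridor; pay out a reward of 1 only at the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values by trial and error with tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Explore with uniformly random actions; Q-learning is off-policy,
            # so it can still learn the values of the greedy policy.
            a = rng.randrange(len(ACTIONS))
            next_state, reward = step(state, ACTIONS[a])
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-goal state, pick the action with the larger value.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:GOAL]]
```

The agent is never told that moving right is correct. It discovers this because the values of right-moving actions are repeatedly pulled toward the discounted reward that follows them.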

Neural Networks and Deep Learning Algorithms
Inspired by the human brain's neural structure, the NN and DL algorithms are based on multi-layered perceptrons. These algorithms contain an input layer, an output layer, and some hidden layers consisting of neurons or perceptrons. These layers are connected through weighted edges, and the training process is based on backpropagation. The input can be any real-world digitized data, such as an image, voice, or text. Training on the input data establishes the edge weights and perceptron activations that improve the output accuracy by matching the training data. Neural network learning algorithms include supervised, unsupervised, reinforcement, clustering, regression, and classification strategies. The only difference between the NN and DL is the number of hidden layers (Figure 4). A DL algorithm can contain several hidden layers, each of which helps to extract a specific feature from the input data.
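To make the layer structure concrete, the following sketch runs a forward pass through a tiny network with two inputs, one hidden layer of two neurons, and one output. The weights are hand-picked assumptions that make the network compute XOR; they were not obtained by backpropagation, which is omitted here for brevity.

```python
def relu(x):
    """Rectified linear activation: pass positives, zero out negatives."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """Compute one fully connected layer: activation(W @ inputs + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (not trained) weights for a 2-2-1 network that computes XOR.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]


def forward(x1, x2):
    """Propagate two inputs through the hidden layer and the output layer."""
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, relu)
    output = dense(hidden, OUTPUT_W, OUTPUT_B, lambda v: v)  # linear output
    return output[0]


xor_table = {(a, b): forward(a, b) for a in (0.0, 1.0) for b in (0.0, 1.0)}
```

XOR cannot be represented by a single linear layer, so the example also shows why a hidden layer with a nonlinear activation matters. A deep network stacks more such `dense` layers between input and output.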
Recurrent neural networks are special NNs that allow the model to receive a series as input and another as output. This type of NN is suitable for time-series prediction. Feedforward neural networks (FNNs) send the data in one direction from input to output. In contrast, RNNs have an internal circular structure (loop feedback) in which the output data of one layer can be fed into the same layer again as feedback. Therefore, the output at state t depends on the input data at state t and the network status at state t-1. This property makes RNNs suitable for temporal prediction. Convolutional neural networks (CNNs), another category of NNs, can extract features from multidimensional, grid-structured input such as an image. These networks apply several convolutional and pooling layers to extract the features of the input. This capability makes CNNs suitable for image and video recognition.
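The dependence of the state at step t on both the input at t and the state at t-1 can be shown with a single scalar recurrent cell. This is a simplified sketch: the tanh activation is a common choice, and the weight values and the input sequence are hypothetical assumptions.

```python
import math


def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """Compute the new hidden state from the current input and previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, h0=0.0):
    """Feed a sequence through the recurrent cell and collect every state."""
    states = []
    h = h0
    for x_t in sequence:
        h = rnn_step(x_t, h)  # the loop feedback: h re-enters the same cell
        states.append(h)
    return states


# Hypothetical normalized traffic-flow readings at consecutive time steps.
states = run_rnn([1.0, 1.0, 1.0])
```

Although the three inputs are identical, the three states differ because each one also carries the memory of the steps before it. A feedforward network given the same input three times would return the same output three times.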

Ensemble Algorithms
Ensemble algorithms, or multiple learning classifier algorithms, combine numerous learning algorithms to solve a single problem. Ensemble algorithms train several individual ML algorithms and then combine the results by applying specific strategies to improve the accuracy. Homogeneous ensemble learners employ the same ML algorithm (weak classifiers), whereas heterogeneous ensemble algorithms contain various types of learners. As a result, the heterogeneous ensemble is stronger than the homogeneous ensemble in generalization.
The boosting algorithm, a subcategory of the ensemble ML algorithm, tries to build a robust classifier from several weak classifiers. The algorithm groups input data using several classifiers and strategies to accomplish the best outcome. In this category, well-known algorithms include adaptive boosting (AdaBoost); categorical gradient boosting (CatBoost); light gradient-boosting machine (LightGBM); and extreme gradient boosting (XGBoost), a scalable, portable, and distributed gradient-boosting library.
The bagging algorithm is another subcategory that feeds each ensemble learner a separate sample of the input data so that the trained weak learners are independent of each other. Bagging (bootstrap aggregating) refers to this sampling of the input data, typically with replacement. Random forest (RF) is a well-known classification and regression algorithm based on building several decision trees in the training process. The final result is the majority vote (for classification) or the average (for regression) of the decision trees' conclusions. Finally, in the stacking subcategory, the primary ensemble learners use the original training set and generate a new dataset for training the second-level models.
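The bagging idea described above (independent weak learners trained on resampled data, then combined by voting) can be sketched with one-threshold decision stumps standing in for full decision trees. The density data, the number of models, and the stump-fitting rule are hypothetical simplifications and do not represent a full random forest.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a one-threshold classifier: predict 1 when x exceeds the threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        errors = sum((x > threshold) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def fit_bagged_stumps(samples, n_models=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(samples) for _ in samples]) for _ in range(n_models)
    ]


def predict(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Hypothetical data: (traffic density, 1 if congested else 0).
data = [(5, 0), (10, 0), (15, 0), (20, 0), (60, 1), (70, 1), (80, 1), (90, 1)]
ensemble = fit_bagged_stumps(data)
```

Each stump sees a slightly different resampled dataset and may settle on a different threshold; the vote in `predict` smooths out those individual differences. A random forest applies the same recipe to deeper trees and also randomizes the features considered at each split.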

Surface Transportation Systems
The entire dynamic of the surface transportation ecosystem's components is illustrated in Figure 5. The top categories include mode, policy/education, safety/security, traffic management, externality, demand, and supply. Each STS subcategory, independently or in combination with others, has been considered according to interdisciplinary knowledge and technology application areas for performance improvement.
These knowledge and technology areas have addressed mobility challenges and problems from various approaches to make specific advancements. Figure 6 presents multiple aspects of advancements and their subcategories impacting STSs. For instance, autonomous vehicle (AV) research and development by technology providers and academic researchers have resulted in advanced technologies and enhancements for drivers and the automobile industry. These technologies and enhancements include automated parking technology, car-following and platooning technologies, adaptive cruise control, collision detection warning, lane change warning and maintenance systems, location-based and navigation systems, and dynamic routing services. Intelligent transportation systems, as a multidisciplinary knowledge area, have likewise produced some significant accomplishments during the past three decades (e.g., adaptive traffic signal control, opposite direction driving detection, over-speeding detection, automatic license plate recognition systems, red-light surveillance sensors, and traffic simulations). Each subcategory has fulfilled or improved one or more surface transportation components. For instance, weather condition forecasting improves "weather information" in the "traffic management" component of the STSs.

Existing "Literature Reviews"
Most surveys, reviews, and books focusing on the application of AI to the transportation industry [31][32][33][34][35] cover algorithms, techniques, and pattern recognition algorithms that classify and predict traffic and transportation measures. These publications often compare the ML approaches with classical analytic and heuristic methods to assess their performance, computing costs, and accuracy. We present a brief description of existing literature reviews on AI and ML applications in the transportation sector in Table 1.
Table 1.Existing literature reviews on AI application in transportation systems.

Research
Goal Result De la Torre et al. [34] Promotes sustainable transportation systems by applying simulation, optimization, ML, and fuzzy sets.
As socioeconomic and environmental factors add complexity to sustainable transportation systems, hybrid methods that use two or more complex methods, such as ML and fuzzy algorithms, address sustainable transportation problems.

Existing "Literature Reviews"
Most surveys, reviews, and books focusing on the application of AI to the transportation industry [31][32][33][34][35] cover algorithms, techniques, and pattern recognition algorithms that classify and predict traffic and transportation measures.These publications often compare the ML approaches with classical analytic and heuristic methods to assess their performance, computing costs, and accuracy.We present a brief description of existing literature reviews on AI and ML applications in the transportation sector in Table 1.
The scope of this paper is limited to the application of ML algorithms in STSs.First, we decomposed STSs into their fundamental components and defined the components' interactions (Background section).Next, the ML algorithms were categorized into algorithms and input/output.Then, the articles that applied ML algorithms in STSs were located, reviewed, and sensitized by defining the application areas, the type of ML algorithm involved, and the input/output data category.Finally, we explore the gaps and opportunities of applying ML algorithms to STSs using a systematic review approach.

Research Goal Result
De la Torre et al. [34] Promotes sustainable transportation systems by applying simulation, optimization, ML, and fuzzy sets.
As socioeconomic and environmental factors add complexity to sustainable transportation systems, hybrid methods that use two or more complex methods, such as ML and fuzzy algorithms, address sustainable transportation problems.
Wang Y et al. [36] Concentrates on deep learning model applications to improve transportation systems' intelligence by focusing on computer vision, time-series prediction, classification, and optimization.
The convolutional neural network (CNN) model is the best choice for application areas such as image classification, traffic sign recognition, vehicle and passenger tracking, obstacle and lane detection, and video-based surveillance.LSTM, gated recurrent unit (GRU), and bi-directional LSTM achieve acceptable accuracy for time-series prediction, such as for traffic flow, traffic speed, and travel time prediction.
Akhtar M. and Moridpour S. [10] Examines ML algorithms' pros and cons by gathering 48 articles on traffic congestion predictions and categorizing them into probabilistic reasoning, shallow ML, and DL.
Artificial NNs and RNNs are the most applied models, whereas hybrid or ensembled models are primarily used for probabilistic and shallow learning classes.RNNs are more suitable for time-series prediction.Shallow models yield better results than DL models for short-term traffic congestion forecasting with a non-intense computational requirement.
Abduljabbar R. et al. [37] Targets three primary AI applications in transportation: corporate decision making, planning and management, improving public transport, and connected and autonomous vehicles (CAV).
AVs and public transportation systems benefit from AI for the avoidance of disruptions, accidents, and congestion.However, two challenges result from the application of the AI models: (1) neural networks are "black boxes" that make them unintuitive for human logic, and (2) human errors in data labeling cause biasing in the ML models' training procedure.
Pamuła T. [38] Focuses on the two primary NN properties that make them well suited to transportation applications: (1) mapping the variables' nonlinear functions to describe the objects' behavior and (2) NN design simplicity. The review presents sample solutions in road traffic parameter prediction, traffic control, traffic parameter measurement, driver behavior and AVs, and transport policy and economics.
Describes the NN algorithms applied in transport research to solve problems such as classification and clustering, function approximation, time-series analysis, and forecasting. A noticeable lack of effort is reported in developing and tuning network configurations to achieve a set performance level. Specifically, multilayer feedforward networks are often configured based on heuristics or literature records. In several examples, the topology of the MLP hidden layer is altered to improve the model performance. Dimension reduction of the processed variables is reported to be the most effective NN performance enhancement, particularly in controlling AVs. A research gap is reported regarding the development of a practical NN configuration design system by which to resolve computation limitations within specific accuracy bounds.

Varghese V. et al. [39] Summarizes the relationship between accuracy and the influencing factors of the DL prediction models by adopting a search strategy, followed by a meta-analysis on prediction accuracy.
DL models demonstrate better prediction accuracy than conventional ML models; with a 100-million-fold increase in the input data, the prediction accuracy increases by 5.9% on average. In contrast, the accuracy diminishes by 5.3% with a 100 min longer prediction horizon. The combined convolutional neural network long short-term memory (CNN-LSTM) has the most significant prediction accuracy, followed by LSTM and deep belief network models.
Boukerche A. and Wang J. [40] Examines the ML model by comparing implementation difficulty, implementation cost, dataset requirements, DL structure, spatiotemporal features, prediction time costs, maintenance costs, and robustness.
The review concludes that combinations of RNNs and CNNs, convolutional recurrent neural networks (Conv-RNNs) with a sequence-to-sequence (Seq2Seq) structure, and attention-based models are the most commonly applied algorithms.
Jiang W, and Luo J. [41] Focuses on applied GNN, such as graph convolutional neural networks and graph attention networks (GATs), to forecast traffic and transportation measures based on a review of 212 papers.
Provides recommendations for improving the research ecosystem, including: (1) a centralized data repository for GNN-based traffic forecasting resources to facilitate models' performance comparison and collaboration; (2) GNN fusions with other techniques and modeling approaches to overcome inherent challenges and achieve better performance; (3) applying data augmentation for DL algorithm performance boosting; (4) applying the transfer learning method to traffic prediction problems with a frequent lack of historical data.
Zhu L. et al. [42] Focuses on the big data evolutional attributes in ITS and categorizes them into smart cards, GPS, video, road site sensors, floating car sensors, wide-area sensors, connected and automated vehicles, passive collection, and other sources.
The review reports substantial unconsidered remaining challenges in ITS, including (1) inaccurate, incomplete, or unreliable data collection in particular locations or at certain times; (2) privacy issues when collecting personal data; (3) data storage and processing capacity limitations; (4) the absence of an open-access data ecosystem for transportation service providers and app developers to find and re-use the data effectively.
Wang Y, Zeng Z. [43] Focuses on transportation and data-driven methods, such as autonomous vehicles and energy, traffic data analysis and enhancements, travel time estimation accuracy, travel behavior analysis, public transportation data mining, network modeling, and railway system prognostics and health management.
Presents a series of data-driven methodologies for transportation problems, such as an online energy management strategy for plug-in hybrid vehicles, NN algorithms for classifying vehicles through a single loop detector, data fusion algorithms for travel behavior predictions, an algorithm applying density-based spatial clustering of applications with noise (DBSCAN) to cluster travelers' pick-up/drop-off locations, and public transportation planning using big data approaches. Adopts an example-oriented approach by defining transportation problems and implementing data-driven algorithms as the solution.

Method
Research and development regarding ML techniques and applications have exploded in recent years. A simple search on the Scopus [44] database for the term "Machine Learning" returns more than 330,000 documents dated between 1959 and 2021; however, the documents are not evenly distributed over time. For example, although there were 8000 documents in 2014, this surged to 80,000 documents in 2021.
These terms generally resulted in documents that fundamentally focus on applying ML to various STSs. The search results returned more than 3300 documents between 1965 and 2021. A significant number of documents were articles and conference papers. The document distribution based on their type is presented in Figure 7a. The yearly document distribution is presented in Figure 7b. The distribution shows that most of the documents have been published since 2014.
We applied a customized systematic literature review framework to capture reliable and diverse outcomes. The framework identified the review purpose, the need for revisions, and review strategy development. The process was a tailored adaptation of Tsafnet, G. et al. [45], illustrated in Figure 8 and explained thereafter.
The intent of Step 1 was to explore specific review questions. The review questions were finalized as follows:
• Which STS application areas can use ML algorithms?
• Which ML algorithm(s) would be most suitable for a particular STS problem?
• Can external factors (e.g., weather conditions) be considered as input data?
• Can spatial factors be considered as input data?
• Which ML algorithm properties make a particular algorithm suitable for a specific STS problem?
• Why are some ML algorithms not utilized within the STS domain?
• How can the surface transportation sector further exploit ML algorithms?

In Step 2, the existing literature reviews and books that address the questions identified in Step 1, either fully or partially, were explored and evaluated. The existing reviews were used to identify and brainstorm the methods, approaches, and results from other studies within the STS domain. This step has already been showcased in the "Existing Literature Review" section of this article. In Step 3, a comprehensive web search was conducted through the Web of Science [46] and Google Scholar [47] databases to collect literature in the related fields. In Step 4, we searched for articles using the logical "and" and "or" format. By conducting this search, more than 400 papers were found. The keywords used for our search were terms such as follows:

In Step 5, duplicates and potentially non-applicable articles were eliminated. In Step 6, we reviewed the papers' abstracts and dismissed papers that were not explicitly related to the STS and ML algorithms. In total, 140 papers remained after this step. In Step 7, the full manuscripts of the 140 papers were downloaded, and fields such as the publication year, web source link, title, abstract, methodology, summary of results and findings, applied ML algorithms, considered transportation areas, and the publication of each paper were extracted to build a database. In Step 8, papers deemed irrelevant or outside of this study's scope were likewise eliminated.
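Steps 4 to 6 amount to a boolean keyword filter followed by de-duplication. The following is a minimal sketch of that idea, not the procedure actually used in this review. The keyword groups, the record fields (`title`, `abstract`), and the sample records are hypothetical and are included only for illustration.

```python
def matches(text, all_groups):
    """True if the text contains at least one term from every group (AND of ORs)."""
    text = text.lower()
    return all(any(term in text for term in group) for group in all_groups)


def screen(records, all_groups):
    """Drop duplicate titles, then keep records whose title or abstract match."""
    seen, kept = set(), []
    for rec in records:
        # Normalize case and whitespace so trivially different copies collide.
        key = " ".join(rec["title"].lower().split())
        if key in seen:
            continue
        seen.add(key)
        if matches(rec["title"] + " " + rec["abstract"], all_groups):
            kept.append(rec)
    return kept


# Hypothetical query: (machine learning OR deep learning) AND (traffic OR transit)
QUERY = [("machine learning", "deep learning"), ("traffic", "transit")]

# Hypothetical records, not taken from the reviewed papers.
RECORDS = [
    {"title": "Deep learning for traffic speed", "abstract": "..."},
    {"title": "Deep Learning for  Traffic Speed", "abstract": "duplicate entry"},
    {"title": "Machine learning in medicine", "abstract": "clinical data"},
]

SCREENED = screen(RECORDS, QUERY)
```

In this toy run, the second record is discarded as a duplicate and the third fails the transportation keyword group, so a single record remains. Abstract-level screening in the review was a manual judgment, which a substring filter only approximates.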
Moreover, if the paper was specifically a review paper, it was tagged separately for further referencing. In Step 9, the collected papers' citations were considered by applying the snowballing concept; in particular, the most highly cited papers were added to the database if they were not already included. In Step 10, a graph structure was created for STS applications and linked to each STS ecosystem component. The graph is a many-to-many graph, in which each STS component can be connected to multiple applications (e.g., rail, as an STS component, is related to the dynamic routing and location-based services application areas), and one application area can be linked to many STS components (e.g., lane change detection is connected to bus, taxi, private vehicle, e-ride hailing, and freight fleet). A partial presentation of this graph is depicted in Figure 9.

In Step 11, each reviewed paper was added to the aforementioned graph based on the type of ML algorithm and focused STS application area. For instance, if a paper used LSTM and GRU to forecast the speed, a link from LSTM to the speed forecasting application area and a link from GRU to speed forecasting were created. A partial presentation of this graph is illustrated in Figure 10. This multilayer graph for papers and their covered subjects serves as a connected structure that can answer several queries, such as the number of papers that used LSTM for speed forecasting. We extracted properties of the applied ML algorithms' input data from the reviewed articles. In addition to the conventional traffic data, the papers were separately categorized if they used spatial data such as mobility trajectory or external data such as weather conditions. These two groups of factors were organized in a spatial-external factor matrix that presented the applied ML algorithms' input data. These categorizations were then added to the database. In Step 12, the final review, we added the reasoning, outcomes, and analysis based on the findings of the synthesis process.
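The graph built in Steps 10 and 11 can be sketched with plain dictionaries. This is an illustrative data structure under our own assumptions, not the tool used for the review. The class name `ReviewGraph`, its method names, and the paper identifiers `P1` and `P2` are hypothetical.

```python
from collections import defaultdict


class ReviewGraph:
    """Many-to-many links between STS components, applications, and papers."""

    def __init__(self):
        self.component_to_apps = defaultdict(set)
        self.app_to_components = defaultdict(set)
        # (algorithm, application) -> set of paper identifiers
        self.papers = defaultdict(set)

    def link_component(self, component, application):
        """Step 10: connect an STS component to an application area."""
        self.component_to_apps[component].add(application)
        self.app_to_components[application].add(component)

    def add_paper(self, paper_id, algorithms, applications):
        """Step 11: one link per (algorithm, application) pair in the paper."""
        for algorithm in algorithms:
            for application in applications:
                self.papers[(algorithm, application)].add(paper_id)

    def count(self, algorithm, application):
        """Number of papers that used the algorithm for the application."""
        return len(self.papers.get((algorithm, application), ()))


GRAPH = ReviewGraph()
GRAPH.link_component("rail", "dynamic routing")
GRAPH.link_component("rail", "location-based services")
GRAPH.link_component("bus", "lane change detection")
GRAPH.link_component("taxi", "lane change detection")

# Hypothetical papers: P1 uses LSTM and GRU, P2 uses LSTM only.
GRAPH.add_paper("P1", ["LSTM", "GRU"], ["speed forecasting"])
GRAPH.add_paper("P2", ["LSTM"], ["speed forecasting"])
```

With these toy entries, `GRAPH.count("LSTM", "speed forecasting")` answers the example query from Step 11. Storing paper identifiers in sets means a paper is counted once per algorithm and application pair, even if it is added twice.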

Results
We conducted a comprehensive review of more than 100 recent peer-reviewed papers (95% published after 2018) in the area of ML algorithm applications in STSs. The ML algorithms' application popularity in the reviewed articles is ranked and presented in Table 2. The most applied ML algorithm was MLP, followed by LSTM. Supervised and unsupervised classical learning algorithms, including random forest, SVM, KNN, fuzzy, and linear regression, follow the MLP and LSTM algorithms in terms of usage. The high usage is perhaps due to their simple hyperparameter definitions and their ability to be used in regression and clustering. Furthermore, the algorithms' modeling initiation is minimal, and these algorithms do not require input data graph representation. Instead, the algorithms attempt to determine the input data's internal relationship and make a prediction based on the recognized patterns. These capabilities make the algorithms easy to use and powerful for predicting transportation parameters, such as speed, travel time, demand, mode choice, collision, parking demand, and operation cost optimization. The factors are local and quantitative and follow the demand/supply equilibrium concept. XGBoost, an ensemble learning algorithm, is a popularly applied algorithm for forecasting and clustering. Researchers have generally reported that XGBoost has better accuracy and performance than comparable methods. XGBoost has proven its ability in other application fields and presents the same capability in traffic and transportation systems forecasting and clustering problem solving.
According to the reviewed papers, most STS ML papers focus on forecasting and predictions (74%), followed by optimization (11%). Service advancements, ITS technology advancements, and automated vehicles with technology advancements between levels 0 and 5 account for 15%, as illustrated in Figure 11. Some papers applied an ML algorithm to more than one application; therefore, they were counted for each application sector separately.
The distribution is given for each application area separately in Table 3. Some areas remain largely unexplored by ML, such as autonomous vehicles with technology advancements between level 0 and 5 (AV L0-L5) that focus on the applications planned for future vehicles. This could be related to a limitation explained in the conclusion section. Likewise, ITS technology advancements are also less focused upon, and applications such as simulations, with all their potential and requirements, are disregarded in ML applications. The forecasting and prediction area, with 69 articles, was found to be the most popular ML application in the STS subject area. Optimization, as part of analytical methods, specifically in operations research [130], has been well practiced by applying ML algorithms with nonlinearity analysis capabilities [131]. However, optimization, as the second-ranked group with nine reviewed papers, is far from the first rank.

The papers were also categorized based on network behavior and external factors (spatial-external) as inputs to the model. The results are presented in Table 4. Although the forecast traffic-related factors are ultimately impacted by external factors such as weather conditions, many reviewed articles do not consider external factors as inputs to their model. The reviewed papers regarding ITS technology advancements were limited to four, and all employ external factors as their model inputs. The following presents some of the reviewed papers and their highlighted methods and outputs as a sample.
The work in [87] is one of the few research articles applying network behavior and external factors as inputs. It focuses on the public transportation network's spatial property with a 30 min temporal interval. It applied k-fold cross-validation and a combination of decision trees with bagging and boosting to forecast ridership. Moreover, it evaluated the impact of various external measures on ridership. The works in [99,100] applied CNN and DL for passenger face detection to improve passenger-counting system performance. Although this ML application did not directly address transportation system factors, it helped improve traffic performance. The work in [9] applied GNN and GRU for real-time mode choice prediction and reported a maximum accuracy of R² = 0.95. The work in [132] applied spatiotemporal data for short- and long-term traffic prediction by defining a spatiotemporal network. The model consisted of several layers of spatiotemporal blocks, with a multi-resolution temporal module and a global correlated spatial module. The model was arranged concurrently in sequence to extract the dynamic temporal dependencies and the global spatial correlations. The work in [111] implemented a traffic scene as a graph of interacting vehicles in a flexible abstract representation and applied GNN models for traffic prediction. It reported that the prediction error in scenarios with considered interaction decreased by 30% compared to a model without considered interactions. The work in [77] proposes a DL model that concurrently extracts the spectral traffic features using a graph CNN. Additionally, it considered the temporal features by applying LSTM to predict travel time. The model transfers the fixed-length-interval GPS trajectory data to a formatted adjacency matrix as the graph CNN input. The model was applied to taxi service data and reported an acceptable accuracy compared to a shallow model such as LSTM.
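The k-fold cross-validation mentioned for [87] can be illustrated by how the sample indices are partitioned. The sketch below is a generic illustration and not the setup of that study. It assumes contiguous, unshuffled folds, and the function name `k_fold_indices` is our own.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, validation) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


# Ten samples split into three folds of sizes 4, 3, and 3.
FOLDS = list(k_fold_indices(10, 3))
```

Each sample appears in exactly one validation fold, and a model is trained k times on the remaining samples. For ridership or other time-ordered data, a shuffled or time-aware split may be more appropriate than the contiguous folds assumed here.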
The work in [51] reports on the open-source Pikalert [133] system, which combines weather information and real-time data from connected vehicles to provide crucial information to improve STS safety and efficiency. The paper concludes that the Pikalert developmental framework offers essential environmental information required to expand and develop CAV. The work in [64] applied ANN models to capture the spatial distribution of particulate matter and black carbon emissions and compared the results to land use regression (LUR) methods that were extensively employed. The paper reports a superior performance for ML algorithms compared with the LUR model. The work in [57] proposes the application of the wavelet neural network (WNN) [134] and the intelligent-particle swarm optimization (IPSO) algorithm [135] (as a replacement for the gradient descent method) to predict bus arrival time. A 49% maximum relative error reduction with the IPSO algorithm, compared with the shallow implementation of WNN, was reported.
The work in [50] is one of the oldest studies applying NN analysis for physical and behavioral transportation planning problems. The paper concentrates on socioeconomic and demographic factors that impact travel demand patterns by focusing on behavioral properties. Additionally, the study examined the physical aspect of traffic management in a typical intersection by applying various NN algorithms. The paper reports that NN algorithms have a more impressive result for physical applications than for behavioral ones because of deficiencies in behavioral data. The work in [49] applied explainable AI methods to forecast household transportation energy consumption based on household travel survey urban zonal data. The paper reports an 83.4% prediction validation accuracy. Furthermore, the paper reports that although household travel time, mode, and frequency are influential features in transportation energy consumption, the proportional impact of those features varies throughout different zones.
The work in [122] presents a hybrid approach of combining order packing by solving a three-dimensional loading problem. This approach used an ML algorithm and a routing method by applying the genetic algorithm in intermodal network optimization. The paper compared the ML algorithms' performance to the performance of ordinary operations research approaches.
As a traffic and demand management strategy, parking planning and management solely depend on demand estimation. The works in [67,91] are two experimental parking demand forecasting studies that used ML algorithms and traffic and network data. Both papers report an acceptable accuracy in terms of parking demand prediction. The work in [101] presents a framework of recurrent convolutional networks (R-CNNs) for car detection. It applied unmanned aerial vehicles to capture images over signalized intersections. The paper reports that the model was insensitive to detection load with acceptable accuracy.
The work in [128] applied a deep Gaussian process to train a regression model on a small data sample of acceleration, braking, and steering angles to capture the most significant autonomous driving features. The paper reports an achievement by using only 0.34% real-time input data and producing the same accurate result as that of TORCS (an open racing car simulator [136]). The work in [107] addresses the relationship between land use and mobility by applying a KNN clustering approach and using two weeks of mobile phone data to group origin/destination (OD) trips. The paper reports an 80% accuracy for OD land use category prediction for weekdays and 67% for weekends.
The work in [126] applied reinforcement learning on the three parameters of the cycle, arterial coordination offset, and green split of the multiple intersection signals to optimize timing schemes. The paper concludes that the traffic control method based on reinforcement learning performs better in complex traffic situations but is not applicable to all traffic conditions. Traffic management benefits from automatic number plate recognition (ANPR). ANPR depends on an optical character recognition (OCR) technique, and ML algorithms excel at OCR. As a result, ANPR has a long history in ML algorithm applications [137]. The work in [56] proposes an ANPR based on a feature extraction model and a backpropagation neural network that is adaptable in weak light and complex backgrounds. The paper reports a 97.7% accuracy with 46.1 milliseconds of processing time.
The work in [76] presents a framework for safely approaching an exit ramp by concentrating on the CAV network in a multi-lane road corridor. The framework applies a deep reinforcement learning algorithm that combines a graph CNN with a deep Q-network as a control algorithm. The CAVs on a highway are represented as a GNN, while messages are passed based on the graph. The work in [119] proposes a rear-end collision detection model by applying a global navigation satellite system with accurate positioning data, a digital compass, and lane information. The model combines a cubature Kalman filter and applies an adaptive neuro-fuzzy inference system to judge the car-following status. A 99.61% prediction accuracy in the field test result is reported.

Discussion
As illustrated in the previous sections, the distribution of STS applications in the reviewed research papers with applied ML algorithms is not uniform. Some specific ML algorithms are used more frequently than others. We suggest a framework, based on a customized problem-solving framework, to examine the reasons for selecting a particular ML algorithm to solve an STS application problem. According to the adapted Jenkins' tetrahedral model of memory experiments, problem solving depends on four interacting parameters: 'problem domain', 'problem-solving goals', 'general thinking and problem-solving skills', and 'specialized knowledge and skills' [138]. However, 'general thinking and problem-solving skills' influence the ability to apply 'specialized knowledge and skills' to solve problems. Therefore, problem-solving strategy selection depends on the problem definition as domain specificity.
Additionally, problem-solving goals are driven by data availability, the required outcome results, and expertise as 'specialized knowledge and skills'. Therefore, the framework could be simplified to a three-pillar problem-solving strategy selection criteria context, as shown in Figure 12. We used the framed context for ML technique selection for application in STS problem solving. Selecting a suitable ML algorithm for a particular application depends on these three main criteria.
Our research shows that a limited set of ML algorithms is used for a narrow group of STS applications. Table 5 presents the ML algorithms most frequently utilized for STS application areas. It demonstrates that LSTM (the most prevalent RNN algorithm) and popular classical learning algorithms (e.g., SVM, random forest, KNN, k-means, SVR, and fuzzy) are the most preferred ML algorithms for prediction in surface transportation. Such popularity might be due to the ML algorithms' primary nature: their extensive utilization for forecasting purposes. Additionally, data scientists are intimately familiar with implementing a digital model of a real-world problem to estimate some system behavior as a typical ML problem-solving method. Several internal and external factors impact transportation systems. These impacts have been examined only by measuring influencing factors such as mode choice, demand, traffic volume, speed, and flow changes. For example, although a rainy day affects people's mode choice, demand prediction models rarely consider weather conditions as an input and instead focus on influencing factors only. Furthermore, traffic measures such as speed, flow, and demand are time-series data. Therefore, the use of RNNs and LSTM primarily for STSs' forecasting objectives is practical. Moreover, many practices employ RNN algorithms in other time-series prediction knowledge areas. Traffic-related data with time-series contexts are likewise appropriate for the use of the RNN algorithm. Additionally, the accessibility of data resources in the STS context is crucial. Few publicly available data sources are offered for STSs, such as the Performance Measurement System (PeMS) [139]. This dataset contains speed and volume data in five-minute intervals from 18,000 vehicle detectors on California highways between 2001 and 2019. Other trajectory data, such as taxi and ride-hailing data, are rarely available [140,141]. A few open public traffic data sources for ML applications are presented in [42].
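Interval-based detector measurements fit sequence models such as LSTM because they can be framed as a supervised problem with a sliding window. The following minimal sketch illustrates that framing only. The function name `make_windows`, its parameters, and the speed values are made up and are not drawn from any dataset cited here.

```python
def make_windows(series, n_lags, horizon=1):
    """Build (inputs, targets) from a univariate series.

    Each input holds `n_lags` consecutive past values; each target is the
    value observed `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets


# Made-up speeds (mph) at consecutive five-minute intervals.
SPEEDS = [62.0, 60.5, 58.0, 41.0, 35.5, 39.0, 52.0]

# Use the last 15 minutes (3 intervals) to predict the next interval.
X, Y = make_windows(SPEEDS, n_lags=3)
```

Here the seven readings yield four training pairs, and any regressor, from linear regression to an LSTM, can be fitted on `X` and `Y`. External inputs such as weather could be appended to each window as extra features, which is the extension the reviewed papers rarely make.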
Therefore, according to the ML three-decision-criteria context, the scarcity of publicly available STS data, the lack of expertise in spatiotemporal ML modeling, and the absence of clearly presented STS problem definitions are obstacles to the application of ML algorithms for the enhancement of STS application areas.
A simple time-series prediction problem is the most straightforward fit for ML algorithms and is hence the most used. However, mobility data have spatial dimensions that create a spatiotemporal context for STSs. Therefore, the surface transportation network dataset must be well presented to link the mobility data to an integrated network. The most applied ML algorithms do not use spatial data as an input and focus only on the time-series dimension. Although ignoring the STS spatial data simplifies the problem, there are ML algorithms that can take the spatial data into account for resolving more complex STS problems. In the absence of spatial data, ML algorithms can still serve as valuable prediction tools compared with statistical and heuristic methods.
Several STS applications use MLP and DL algorithms. This popularity results from the algorithms' ability to adapt to a specific problem's context by tuning the number of layers for feature extraction. Some variations of NN algorithms, such as CNNs, account for the spatial property of surface transportation. Although CNNs improve the presentation of the surface transportation network in ML algorithms, they depend on the Euclidean distance, which is not a valid interpretation of a transport network. For instance, two parallel roads can be near each other without any network relation, but CNNs consider them related.
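The parallel-roads example can be made concrete with a small sketch that compares the Euclidean distance with the shortest-path (network) distance. The coordinates, node names, and link list below are hypothetical, and Dijkstra's algorithm is used here only as a convenient way to compute the network distance.

```python
import heapq
import math
from typing import Dict, List, Tuple

# Hypothetical layout: two parallel roads 50 m apart that meet only at the eastern end.
coords: Dict[str, Tuple[float, float]] = {
    "a1": (0.0, 0.0), "a2": (1000.0, 0.0), "a3": (2000.0, 0.0),
    "b1": (0.0, 50.0), "b2": (1000.0, 50.0), "b3": (2000.0, 50.0),
}
links: List[Tuple[str, str]] = [
    ("a1", "a2"), ("a2", "a3"),  # road A
    ("b1", "b2"), ("b2", "b3"),  # road B
    ("a3", "b3"),                # the only connection between the two roads
]


def euclidean(u: str, v: str) -> float:
    """Straight-line distance between two nodes, in metres."""
    return math.dist(coords[u], coords[v])


def network_distance(source: str, target: str) -> float:
    """Shortest-path length along the links (Dijkstra's algorithm)."""
    adjacency: Dict[str, List[Tuple[str, float]]] = {node: [] for node in coords}
    for u, v in links:
        length = euclidean(u, v)
        adjacency[u].append((v, length))
        adjacency[v].append((u, length))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, length in adjacency[node]:
            candidate = dist + length
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf  # target not reachable


print(euclidean("a1", "b1"))         # 50.0
print(network_distance("a1", "b1"))  # 4050.0
```

In this toy layout, the western ends of the two roads are 50 m apart on the map but 4050 m apart for a vehicle, so a model that treats map proximity as relatedness would overstate their interaction.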
Recent attempts employ more sophisticated techniques that combine the temporal dimension and the network presentation within the ML algorithm to improve STS modeling. The most popular one is the GNN, which exploits NNs' capabilities alongside network presentation techniques. A GNN can combine STS network features with time-series properties to develop a spatiotemporal network for node, edge, feature, and pattern forecasting, clustering, recognition, and regeneration.
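As a rough illustration of how a GNN propagates information along network links rather than across map distance, the sketch below performs message passing with mean aggregation on a six-node road graph. The adjacency list, the speed values, and the fixed `self_weight` are hypothetical; a real GNN would learn its weights from data and apply a nonlinearity, both of which are omitted here.

```python
from typing import Dict, List

# Hypothetical road graph; nodes a-f each carry one traffic feature (speed, mph).
adjacency: Dict[str, List[str]] = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b"],
    "d": ["b", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}
speed: Dict[str, float] = {
    "a": 60.0, "b": 30.0, "c": 50.0, "d": 40.0, "e": 20.0, "f": 10.0,
}


def message_passing_step(
    features: Dict[str, float],
    graph: Dict[str, List[str]],
    self_weight: float = 0.5,
) -> Dict[str, float]:
    """Blend each node's feature with the mean of its neighbors' features."""
    updated: Dict[str, float] = {}
    for node, neighbors in graph.items():
        neighbor_mean = sum(features[n] for n in neighbors) / len(neighbors)
        updated[node] = (
            self_weight * features[node] + (1.0 - self_weight) * neighbor_mean
        )
    return updated


step1 = message_passing_step(speed, adjacency)  # each node sees 1-hop neighbors
step2 = message_passing_step(step1, adjacency)  # information now spans 2 hops
print(step1["a"], step2["a"])                   # 45.0 42.5
```

After the second step, the value at node a already reflects nodes c and d, which are two links away. This is how stacked message-passing layers capture network structure.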
The primary constraint on sophisticated ML algorithms is the lack of publicly available spatial data. As a result, researchers typically employ OpenStreetMap [142] as a base roadway network and add other surface transportation network properties to the base layer. However, adapting OpenStreetMap can be challenging, time-consuming, and inaccurate. Such hurdles are caused by the diverse scales and georeferenced data standards involved in map transformations. For instance, the elements of a bus network, including the trajectory, bus stops, terminals, entrance/exit ports, ticketing stations, and real-time bus movements, are all georeferenced data that must be adapted to road networks.
Moreover, the adjusted data must be integrated with infrastructure components (i.e., walkways, bikeways, subway tunnels, and stations) and traffic components (i.e., traffic signals, on-street parking, and land use). Due to its crowdsourced nature, OpenStreetMap is not a consistent or reliable network platform. Therefore, a publicly available, integrated, reliable, and accurate georeferenced transportation system database is a fundamental prerequisite for the use of spatiotemporal data in the application of ML algorithms, and its absence is a major obstacle. As a result, many robust ML algorithms, such as reinforcement learning algorithms and unsupervised learning algorithms, are hardly used in STS problem solving. Moreover, sophisticated NN algorithms, such as GAN and VAE, are not considered for STS problem solving. This neglect may be due to the algorithms' complexity and novelty: they do not fit simply into STS modeling and are not easily understood by practitioners.
The simulation technique is fundamental to several microscopic and macroscopic traffic and transportation studies. Transportation professionals have widely adopted this method for planning, optimizing, and evaluating transportation projects. The simulation technique collects transportation-related data, aggregates the data, creates a computer-formulated model, and reproduces the transportation system. The outcome is generally not an accurate approximation of real-world traffic behavior, given the uncertainty and unpredictable interactions within STSs. ML algorithms can be utilized to improve the accuracy of transportation simulation and modeling. This is an unexplored field for which we could not find any research papers. The lack of a knowledge base regarding the application of ML algorithms to modeling and simulation techniques is an identified research gap.
On the other hand, GAN and VAE are robust algorithms that take real-world input data, recognize their patterns, and reproduce scenarios similar to those in the real-world data. For example, these algorithms have proven their power in creating virtual photographs after viewing real-world pictures. Similarly, they can write a short story after viewing several human-written novels. These examples are conceptually very similar to simulations. Thus, there is an opportunity to feed vast real-time traffic data into GAN and VAE algorithms to produce similar scenarios. These reproduced instances could represent the same concept as a simulation but with a data-driven approach.
Although ML algorithms can use external factors other than traffic data to implement an STS model, most applied ML models use internal factors, such as speed, flow, and capacity. For example, the models could incorporate weather conditions, time of day, accidents, regional economic data, and other external factors as input data and examine these factors to develop a more sophisticated model. On the other hand, there is serious concern about the computational intensity of ML algorithms, specifically DL algorithms. DL algorithms with several layers are computationally and memory intensive, depending on the number of features and the volume of input data. This high computational cost severely limits the application of DL algorithms. However, the market is answering the intensifying demand for higher processing power. For instance, big data analysis requirements, such as real-time video processing for AVs, push technology providers to offer cheaper processing power and memory. Advancements in processors remove the barriers to applying DL methods such as GNN, VAE, LSTM, and RNN in STSs more efficiently and to including spatiotemporal and external factors.
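The dependence of memory cost on the number of input features and layers can be illustrated with a simple parameter count for a fully connected network. The layer sizes below are hypothetical (288 inputs correspond to one day of five-minute readings), and 4 bytes per parameter assumes 32-bit floats. The count covers weights and biases only, not activations or optimizer state.

```python
from typing import List


def count_parameters(layer_sizes: List[int]) -> int:
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first and output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weight matrix
        total += n_out         # bias vector
    return total


# Hypothetical model: one day of five-minute readings in, one forecast out.
small = count_parameters([288, 512, 512, 1])
# Same model fed with the previous seven days instead of one.
wide = count_parameters([288 * 7, 512, 512, 1])

for name, n_params in (("one-day input", small), ("seven-day input", wide)):
    megabytes = n_params * 4 / 1e6  # assuming 32-bit floats
    print(f"{name}: {n_params} parameters, about {megabytes:.1f} MB")
```

Under these assumptions, widening the input from one day to seven days roughly triples the parameter count, because the first weight matrix grows in proportion to the number of input features.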
Hence, a well-established STS database presentation with external and internal characteristics could be a potential platform.The platform would inspire data scientists to apply ML when implementing more sophisticated STS models.

Conclusions
This literature review has examined 100 recent research papers and publications on applying ML algorithms to STSs. It has established the opportunities and gaps for local, state, and federal transportation authorities and professionals to improve transportation performance by using AI technology to fill the gaps and exploit the opportunities. Our findings indicate that 74% of the ML research papers focus on 4 surface transportation application areas, out of a total of 31.
The findings reveal that the LSTM algorithm, among the NN and DL algorithm categories, and popular classical algorithms (i.e., SVM, random forest, KNN, K-means, SVR, and fuzzy) are the preferred algorithms for surface transportation problem solving. ML algorithm selection for a specific STS application problem is essentially driven by the available expertise and experience in applying ML algorithms in parallel industries. Therefore, well-known ML algorithms are also preferred in STS application areas. Additionally, many STS applications have a time-series property, which favors ML algorithms that can work with time-series data, such as LSTM.
On the other hand, STS application areas with more available data sources are also preferred for applying ML algorithms. The ML algorithms applied to STSs outperform statistical and analytical methods. However, their internal logic is not human interpretable, making it difficult to improve their performance. Generally, DL and MLP algorithms are used extensively in various surface transportation applications because of their nonlinear properties and their history of experimental use. They have proven their capability in forecasting and clustering with acceptable results while using limited input data from the surface transportation ecosystem.
Overall, 17 out of 31 well-known surface transportation application areas have not yet been explored using ML algorithms. The absence of data and data integrity in STSs are the primary reasons that impact ML algorithm selection and deployment choices. Although algorithms such as reinforcement learning algorithms can be instrumental in solving some complex transportation system problems, data scarcity and the unavailability of open-source data platforms limit the use of such tools.
Spatial behavior and external factors that affect traffic and transportation systems are rarely used as input data for ML models in the research literature. This problem is primarily caused by a lack of reliable, open, integrated, and accessible transportation system databases.
The transportation sector must clearly state its problems and needs and develop an open-source, high-quality, high-frequency integrated spatiotemporal dataset for outside knowledge areas. Access to such platforms would accelerate ML professionals' engagement in transportation problem solving, given that ML experts have initiated most ML transportation projects.
The transportation community has not yet been able to take advantage of ML techniques to their fullest potential in applied projects. Additionally, there is an ongoing challenge regarding the collection of the large volume of traffic data required to process and build valuable information. This vicious circle must be broken by a collaborative effort such as a public-private partnership. Federal, state, and local authorities, academic research bodies, and private technology provider firms must collaborate on a shared platform to present accurate, integrated, valid, and high-frequency traffic data. Additionally, this collaboration must define the essential goals and problems of the research and development domain. The platform can be enriched by applying ML techniques to answer predefined challenges. These efforts will accelerate the application of ML algorithms in STSs, not only to enhance the knowledge base but also to address the mobility, accessibility, and safety issues entangled with STSs.
It is important to note that we considered only peer-reviewed and published articles, journals, and books available through academic sources in this study. We realize that application areas including, but not limited to, AV L0-L5 are under continuous research and development in manufacturing and technology companies. However, due to information security constraints and the lack of access to proprietary information, the internal research and development efforts of technology and automobile manufacturing companies have not been reviewed or included in this research.
Developing an open-source website would be the next step of this research. The site would be used as a resource to identify the links in the publications between STS components, application areas, and ML algorithms. Researchers and developers would be invited to utilize the website and identify any gaps or opportunities for future research work. In addition, public agencies would be invited to use the website to identify funding opportunities. In terms of potential future research, we recommend the following two directions:

• Develop an NLP engine to automatically recognize and match ML algorithms and STS application areas by analyzing the input papers' full text.
• Develop a benchmarking web crawler to automatically find reports, documents, and projects that match specific criteria.

Figure 1. Sample GNN message-passing in a road network (a, b, c, d, e, and f are road network nodes with traffic data).

Graph neural networks with unique spatial presentation capabilities have received considerable attention in the research world, including transportation research. In summary, more recent sophisticated machine learning algorithms, including the NN algorithms, recurrent neural network (RNN) algorithms, GNN, and variational autoencoder (VAE), have been minimally used in the planning and management of the surface transportation sector. However, these algorithms provide numerous opportunities for improvements to the efficiency and accuracy of the transportation systems models if utilized appropriately. Therefore, we aimed to conduct a thorough literature review of all the studies conducted in the STS applications field to provide a proper understanding of the gaps and opportunities for applying advanced ML techniques. This review will be beneficial to STSs by alleviating the externalities and enhancing the overall system behavior.
Appl. Sci. 2022, 12, x FOR PEER REVIEW
ingrained behavior in the new data. Access to a large amount of prior knowledge would improve the efficiency of the learning experience.

Figure 2. ML versus analytical approach in programming.

Figure 4. Neural network and deep learning model format.

Figure 7. Published documents within ML and surface transportation distribution: (a) documents' distribution by type; (b) the number of documents by year.

Figure 9. Graph of partial STS components and application areas.

Figure 10. Graph of partial ML algorithms linked to application areas by papers.

Figure 11. Number of papers in various applied research areas.

Table 1. Existing literature reviews on AI application in transportation systems.

Table 2. Applied ML algorithms in reviewed papers.

Table 3. Research papers' distribution in STS application areas.

Table 5. Most utilized ML algorithms in STS application areas.