Urban Traffic Flow Congestion Prediction Based on a Data-Driven Model

Abstract: Intelligent transportation systems need to realize accurate traffic congestion prediction. The spatio-temporal features of traffic flow are essential for analyzing and predicting congestion. Our study proposes a data-driven model to predict congested traffic flow. First, the traffic zone/grid method is used to store the average vehicle speed on the roads of each local area. Second, the discrete snapshot set is proposed to characterize the spatial and temporal features of traffic flow over a continuous period. Third, the evolution of congested traffic flow in various time dimensions (weekdays, weekend days, and one week) is examined by transforming the global urban transportation network into traffic zones. Finally, the data-driven model is constructed to predict urban road traffic congestion from the extracted spatio-temporal characteristics of the traffic zones' traffic flow, whose snapshot set serves as the model input. The model adopts a convolutional LSTM network to learn the temporal and local spatial features of traffic flow, and a convolutional neural network to capture the global spatial features of traffic flow. Numerical experiments are conducted on two cities' transportation networks, and the results demonstrate that the proposed model outperforms traditional traffic flow prediction models.


Introduction
With the advancement of society and the growing number of vehicles, traffic congestion has emerged as a ubiquitous issue, significantly affecting people's productivity and societal progress. Predicting how the city-wide traffic speed distribution will propagate is crucial for developing proactive strategies to decrease traffic congestion. In recent years, the transportation industry has witnessed the rapid development of new technologies such as artificial intelligence, big data, and the Internet of Things. These technologies generate vast amounts of data from various devices. Given the proliferation of extensive traffic flow data and related environmental data, predicting traffic flow and roadway congestion has garnered extensive attention in recent years [1-3]. Researchers generally estimate short-term traffic flow by studying historical traffic data [4]. Existing methods for predicting traffic flow primarily fall into model-driven and data-driven approaches [5]. The model-driven approach involves establishing mathematical models of traffic flow parameters to estimate the traffic flow state [6]. Data-driven approaches, on the other hand, mine hidden information in historical data to estimate the future state of traffic flow iteratively, and typically rely on machine learning and deep learning models.
The urban transportation network system exhibits a complex nonlinear structure influenced by multiple factors, such as road capacity, drivers, passengers, weather, and holidays [7]. Model-driven prediction algorithms struggle to use data on these factors to analyze and predict traffic flow effectively. Data-driven technologies such as artificial intelligence and big data, however, offer new engines for technological advancement. Although still in the early stages of application exploration, they have considerable potential to transform traffic flow prediction, decision-making, and traffic system simulation [8,9]. Data-driven approaches such as artificial neural networks (ANNs) have gained attention for traffic flow forecasting. The ANN method does not require constructing a physical model, and it can use abundant historical data to learn optimal features for prediction [10].
Data-driven traffic prediction models offer several advantages. First, data-driven models use abundant historical traffic data from various sources, such as traffic surveillance cameras, loop detectors, and GPS devices. By harnessing this wealth of data, data-driven models can capture the complex patterns and relationships in traffic flow more accurately [11,12]. Second, these models can adapt and learn from new data, continuously improving their predictive performance, which is especially vital in dynamic traffic environments where conditions can change rapidly [13]. Additionally, data-driven models can incorporate diverse factors influencing traffic flow, such as weather conditions, time of day, and road network characteristics [14]. By considering these factors, data-driven models can provide more precise and reliable predictions than traditional approaches relying solely on historical traffic data.

Literature Review
Model-driven prediction approaches have been widely used in traffic flow analysis and forecasting, including moving averages, exponential smoothing, the Kalman filter, and autoregressive integrated moving average (ARIMA) models [15-19]. For instance, Williams et al. [15] employed the exponential smoothing method to analyze and forecast traffic flow in the city transportation network. Ding et al. [16] extended a spatio-temporal ARIMA model to forecast traffic flow in urban areas at five-minute intervals. Li et al. [17] combined ARIMA and support vector regression to enhance traffic prediction performance. Xia et al. [18] then introduced a non-parametric K-nearest neighbor (KNN) model for urban road traffic speed and flow prediction. Chang et al. [19] developed a dynamic multi-interval traffic flow prediction method based on KNN non-parametric regression models.
Compared with the above model-driven approaches, data-driven approaches utilizing neural networks have demonstrated their effectiveness in traffic flow forecasting [13,14,20-27]. Zhu et al. [24] proposed a regression model for traffic flow prediction based on multi-layer perceptron (MLP) neural networks. Kumar et al. [20] employed an ANN model to predict short-term traffic flow using traffic volume, speed, density, and time as input variables. Lv et al. [21] applied deep learning models to capture complex traffic flow characteristics without prior knowledge. Ma et al. [22] then introduced a convolutional neural network (CNN) method to estimate traffic flow speed in large-scale networks. They showed that a CNN uses convolutional filter layers to extract local features and model the spatial dependencies within the city using sliding windows. Zhang et al. [14] developed a deep learning framework combining residual and convolutional networks. They integrated traffic flow data and weather data as inputs to capture temporal similarity in various road areas and optimize traffic flow prediction. Moreover, Luo et al. [23] and Wu et al. [25] presented short-term traffic flow prediction models based on a hybrid deep learning approach. In addition, Zhao et al. [26] introduced a long short-term memory (LSTM) neural network to estimate traffic flow within the scope of a traffic network. Furthermore, Duan et al. [27] applied the LSTM network to study and estimate commuter travel time, and Kim et al. [13] combined linear regression with LSTM to forecast travel demand for New York City. The above methods cannot effectively consider both the spatial and temporal characteristics of city-scale traffic flow, which greatly limits their prediction accuracy.
It is well known that CNN focuses on capturing local spatial features, while LSTM specializes in modeling long-term dependencies in sequential data [11,28-32]. Over the past few years, the coronavirus disease (COVID-19) pandemic has significantly impacted people's travel behavior [28]. In light of this, Islam et al. [29] trained a combined CNN and LSTM network on X-ray images of COVID-19 to assist doctors in diagnosing and treating patients with this disease. Furthermore, Koller et al. [30] utilized a combined CNN and LSTM network to recognize the meanings of different hand shapes and lip-reading representations in sign language videos. Chen et al. [31] successfully applied a CNN and LSTM network to predict the remaining useful life of lithium-ion batteries. In the context of traffic problems, Bogaerts et al. [11] trained a hybrid CNN and LSTM network on travel demand data for online car-hailing services, enabling effective prediction of traffic demand over horizons ranging from the next five minutes to two hours. Additionally, Archana and Sanjay used this hybrid model to estimate the traffic flow during rainfall [32]. Moreover, the dynamic graph neural network has been used to study the spatial and temporal features of traffic flow [5,12]. Inspired by the findings of these studies, our study proposes a tailored data-driven model to study the spatial and temporal features of city-scale traffic flow and predict the short-term traffic flow.

Objective and Contribution
This paper introduces a novel approach to analyzing and predicting congested urban traffic flow. This work aims to analyze the spatio-temporal features of traffic flow and develop a tailored data-driven model to estimate short-term congested traffic flow. The main contributions of the paper are as follows: (1) the discrete snapshot set is proposed to store the spatial and temporal features of traffic flow over a continuous period; (2) the evolution of traffic flow is analyzed in various time dimensions (weekdays, weekend days, and one week); (3) the data-driven model is constructed to predict urban traffic congestion, combining ConvLSTM, batch normalization (BN), and CNN layers to study the spatial and temporal features of traffic flow; (4) numerical experiments are conducted on two Chinese cities' transportation networks, and the proposed model outperforms traditional statistical and machine learning methods.
The remainder of this paper is organized as follows: Section 2 presents the mathematical formulations of the clustering algorithm and traffic zone. Section 3 introduces the data-driven model for congested traffic flow prediction. We present the numerical experiments in Section 4. Finally, we conclude this study in Section 5.

Clustering the Congested Regions
Within the framework of the Urban Traffic Management Evaluation Index System [33], the average velocity of motorized vehicles was employed to quantitatively assess the congestion levels prevalent on urban roadways. The congestion classification for urban roads is shown in Table 1.
According to Table 1, a road with an average vehicle speed below 30 km/h is classified as congested. To identify congestion events, this study first extracted clustered GPS trajectory sequences from the floating vehicle tracks (continuous slow-speed travel lasting for a minimum of 5 min with a speed below 30 km/h). Subsequently, abnormal behaviors were filtered based on vehicle engine state information to obtain the trajectory sequences [34]. These extracted congestion trajectories were aggregated to determine the road congestion state.
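The extraction rule stated above (speed below 30 km/h sustained for at least 5 min) can be illustrated with a minimal sketch. The input layout, a time-ordered list of `(timestamp_seconds, speed_kmh)` pairs for one vehicle, and the function name `find_congestion_events` are assumptions made for illustration; the engine-state filtering of [34] is not reproduced here.

```python
from typing import List, Tuple

SPEED_THRESHOLD_KMH = 30.0  # congestion threshold stated in the text
MIN_DURATION_S = 300        # minimum duration of 5 min stated in the text


def find_congestion_events(
    samples: List[Tuple[int, float]],
    speed_threshold: float = SPEED_THRESHOLD_KMH,
    min_duration: int = MIN_DURATION_S,
) -> List[Tuple[int, int]]:
    """Return (start, end) timestamps of continuous slow-speed runs.

    `samples` is a hypothetical time-ordered list of
    (timestamp_seconds, speed_kmh) pairs for a single vehicle.
    """
    events = []
    run_start = None  # timestamp of the first slow sample in the current run
    last_slow = None  # timestamp of the most recent slow sample
    for ts, speed in samples:
        if speed < speed_threshold:
            if run_start is None:
                run_start = ts
            last_slow = ts
        else:
            # A faster sample ends the run; keep it only if long enough.
            if run_start is not None and last_slow - run_start >= min_duration:
                events.append((run_start, last_slow))
            run_start = None
    # Close a run that is still open at the end of the track.
    if run_start is not None and last_slow - run_start >= min_duration:
        events.append((run_start, last_slow))
    return events


track = [(0, 25.0), (120, 20.0), (240, 18.0), (360, 22.0), (480, 45.0),
         (600, 10.0), (720, 50.0)]
events = find_congestion_events(track)
print(events)
```

In this toy track, the first slow run spans 360 s and is kept, while the isolated slow sample at 600 s is discarded because it does not last 5 min.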
The spatial distribution of congestion events can be obtained by clustering the travelers' pick-up and drop-off points [35,36]. Our study uses these points and employs the graph-based spatial clustering of applications with noise (G-DBSCAN) algorithm to find the congested regions in the study area (Xi'an city, China), as illustrated in Figure 1. The G-DBSCAN algorithm has a graph-based index structure and can be efficient for nearby search operations [37]. The silhouette coefficient is used as an evaluation indicator for the G-DBSCAN algorithm. For a given sample, $A$ denotes the average distance from it to the other samples in the same cluster, and $B$ denotes the average distance from it to the samples in the nearest different cluster. Its silhouette coefficient is:
$$S = \frac{B - A}{\max(A, B)}$$
where the range of $S$ is $[-1, 1]$.
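The silhouette coefficient described above can be computed for a single sample as in the following sketch. The cluster layout (a dictionary from a cluster label to a list of 2-D points), the Euclidean distance, and the function names are assumptions for illustration; how noise points produced by G-DBSCAN are treated is not specified in the text and is not handled here.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def mean_distance(p: Point, others: List[Point]) -> float:
    """Average Euclidean distance from p to every point in `others`."""
    return sum(math.dist(p, q) for q in others) / len(others)


def silhouette(label: int, index: int, clusters: Dict[int, List[Point]]) -> float:
    """Silhouette coefficient (B - A) / max(A, B) of one sample."""
    p = clusters[label][index]
    same = [q for i, q in enumerate(clusters[label]) if i != index]
    if not same:
        return 0.0  # common convention for a single-sample cluster
    a = mean_distance(p, same)  # A: cohesion within the sample's own cluster
    b = min(                    # B: distance to the nearest other cluster
        mean_distance(p, pts) for lab, pts in clusters.items() if lab != label
    )
    return (b - a) / max(a, b)


clusters = {0: [(0.0, 0.0), (1.0, 0.0)], 1: [(4.0, 0.0)]}
s = silhouette(0, 0, clusters)
print(s)
```

A value close to 1 indicates that the sample sits well inside its own cluster, while a value close to -1 suggests it would fit better in the neighboring cluster.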

The numerical values accompanying the icons in Figure 1 represent the count of drop-off points within the current hotspot. These congestion zones are primarily concentrated within the city's central region, with a significant portion of them located near the main arteries of the second ring road. The most congested area has a travel flow of 234,308. The underlying cause is that the periphery of the second ring road is dotted with numerous workplaces, commercial districts, and educational institutions. The surge in traffic demand is attributed to the extensive commuting of employees between residential areas and workplaces and of students traveling back and forth between homes and schools. This traffic surge is particularly prominent during morning and evening rush hours, resulting in heightened traffic volumes along the second ring road.

Partitioning Traffic Zones
The study area's traffic network structure can be gridded into traffic zones established according to the road spatial location information, as shown in Figure 2a. The flow spatial information of the city-scale traffic zones can be represented using a matrix, as shown:
$$X_t = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1J} \\ x_{21} & x_{22} & \cdots & x_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ x_{I1} & x_{I2} & \cdots & x_{IJ} \end{bmatrix}$$
where $I$ and $J$ denote the traffic zones' latitude and longitude range, and $x_{ij}$ represents the average traffic flow speed (ATFS) in the region $(i, j)$ at timeslot $t$. Note that the traffic flow data recorded by floating cars equipped with GPS devices include the vehicle's speed, position, and recording time.
The matrix is constructed to effectively capture the spatio-temporal features of the traffic flow. The matrix accurately portrays the spatial relationship between traffic zones, with adjacent values representing traffic flow between adjacent zones [27]. In essence, traffic flow over a continuous period can be represented by discrete snapshots as follows:
$$SnapShotSet = \bigcup_{t=1}^{T} X_t$$
where $T$ represents the number of time slots. Therefore, SnapShotSet stores the spatio-temporal information of traffic flow.
Figure 2b provides the snapshot set of the entire traffic network. The city-scale traffic network is partitioned into I × J zones, each zone with one or more roads. The value of each zone denotes the average traffic flow speed at timeslot t. The detailed process is as follows: First, the study area is divided into distinct traffic zones, and the ATFS of each zone during each period t is calculated. Subsequently, these average speeds are mapped onto the corresponding grids. If multiple links traverse the same zone, their speeds are averaged to determine the zone's speed. Figure 3a visually represents this process, emphasizing its application to arterial roads. The green curve shows the main road with an average speed of 60 km/h, while the red curve shows the secondary road with an average speed of 40 km/h. In the zone where the main road and the secondary road intersect, the zone's average speed is therefore 50 km/h.
Furthermore, feature extraction is crucial for traffic prediction in machine learning and deep learning. High-quality data features enable ordinary models to achieve accurate predictions [14]. In our research, we use the insights gained from data analysis to extract relevant features that are then incorporated into the model training process. This feature analysis is instrumental in developing a more accurate traffic prediction model. By incorporating the spatio-temporal characteristics of traffic flow identified in our study, the model trained in the next section can predict traffic flow more accurately.
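The zone-averaging procedure described above can be sketched as follows. The input layout, a list of `(row, col, speed_kmh)` records with one record per road link crossing a zone, the function name `build_snapshot`, and the fill value for zones without roads are assumptions for illustration rather than the implementation used in this study.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def build_snapshot(
    link_speeds: Iterable[Tuple[int, int, float]],
    n_rows: int,
    n_cols: int,
    fill: float = 0.0,  # assumed value for zones that contain no road
) -> List[List[float]]:
    """Build one I x J snapshot of average traffic flow speed (ATFS)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, j, speed in link_speeds:
        sums[(i, j)] += speed
        counts[(i, j)] += 1
    # Links crossing the same zone are averaged; empty zones get `fill`.
    return [
        [sums[(i, j)] / counts[(i, j)] if counts[(i, j)] else fill
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Main road at 60 km/h and secondary road at 40 km/h meet in zone (0, 1).
links = [(0, 0, 60.0), (0, 1, 60.0), (0, 1, 40.0), (1, 1, 40.0)]
snapshot = build_snapshot(links, n_rows=2, n_cols=2)
print(snapshot)

# A snapshot set is then the time-ordered list of such matrices.
snapshot_set = [snapshot]  # one entry per timeslot t = 1, ..., T
```

The shared zone receives the mean of 60 km/h and 40 km/h, matching the 50 km/h value of the arterial road example.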

Traffic Congested Flow Prediction Data-Driven Model
In order to accurately predict traffic congestion across the entire transport network, it is essential to simultaneously incorporate the temporal and spatial dimensions of traffic flow [38]. Therefore, this paper presents the traffic congested flow prediction model, which investigates the spatio-temporal features of traffic flow from the snapshot set.

Prediction Data-Driven Model
Traffic flow evolution exhibits significant temporal dependencies, wherein flow conditions from previous hours can exert a lasting influence on the present state. LSTM models store the temporal features of traffic flow in memory units that let them decide whether to retain or update past hidden states, which makes them particularly effective in time series prediction [26]. Given the time-related attributes of traffic flow, LSTM is apt for capturing the temporal dependencies in traffic flow evolution. To enhance the handling of spatio-temporal traffic flow data within the data-driven model, we draw inspiration from convolutional LSTM (ConvLSTM), an extended LSTM model detailed in the literature [39]. ConvLSTM leverages both temporal features and prior local traffic flow states to predict future states. Deep learning models commonly employ batch normalization (BN) layers to expedite convergence, enhance generalization, and mitigate gradient vanishing issues. Moreover, congestion in a particular traffic zone not only impacts neighboring zones, but can also affect remote zones. The convolutional neural network (CNN) has inherent local connectivity characteristics, making it adept at addressing spatially correlated problems [22].
Therefore, our study proposes a data-driven model to predict traffic congested flow. This model comprises four combined ConvLSTM and BN layers, and a CNN layer, as illustrated in Figure 4. This model integrates time series features and spatial relationships within a hybrid deep learning architecture to study the spatio-temporal characteristics of traffic flow evolution.

The Process of Prediction Model
Step 1: Input data.
The input is a series of discrete snapshots that represent the traffic flow evolution over a continuous period. Each snapshot stores the average traffic flow speed of city-wide traffic zones. The relationships of these zones can represent the spatial relationships of different roads' traffic flow.
The ConvLSTM layers can study the temporal features and explore the local spatial characteristics of traffic flow [39]. Moreover, including the normalization layer improves training accuracy by re-normalizing activation values within each batch.
In the ConvLSTM layer, each time series element is processed separately after convolution and merging, and the output becomes the time series vector. The network structure of ConvLSTM, which combines the convolution layer and the LSTM layer, is shown in Figure 5; for more details, please see the literature [39]. Each element of the snapshot is a spatial correlation of traffic flow among multiple roads in the local region. The time series snapshot set is then used as the input to the LSTM layer. In the LSTM layer of the ConvLSTM, the LSTM module contains the input gate, forget gate, and output gate. To explore the temporal features of traffic flow, the LSTM layer takes the first 80 percent of the time series data, $\bigcup_{t=1}^{0.8T} X_t$, as input. The output hidden state matrix at time $t$ is denoted as $H_t$. To study the temporal characteristics and local spatial characteristics of traffic flow, equations are set for the input gate, the forget gate, the memory unit, the output gate, and the hidden state matrix $H_t$ at time $t$, following the ConvLSTM formulation in [39].
We add the BN layer following each ConvLSTM layer to enhance the model's generalization and mitigate gradient vanishing. The BN layer typically requires configuring the scaling coefficient $\gamma$ and translation coefficient $\beta$, which are utilized to normalize each batch's samples by computing the batch mean $\mu$ and variance $\sigma^2$. The BN layer's output can be expressed as:
$$\mu = \frac{1}{N}\sum_{i=1}^{N} X_i, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2, \qquad \mathrm{BN}(X_i) = \gamma \, \frac{X_i - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$$
where $N$ represents the batch size, $X_i$ denotes each sample within the batch, and $\epsilon$ is a small constant that keeps the denominator away from zero. In the context of the traffic prediction model input, the output of the convolutional part in the ConvLSTM preceding the BN layer has the shape $(N, P, I, J)$, where $P$ signifies the number of output channels, and $I$ and $J$ represent the width and height of the output feature map. Consequently, each sample within the batch can be defined as $X_{p,i,j}$. The BN layer normalizes each $X_{p,i,j}$ individually, so the computed $\mu$ has $P \times I \times J$ entries.
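For reference, the gates named above can be written in the commonly used ConvLSTM form of [39]. This is a sketch of that standard formulation and may differ from the exact notation of this study: the weight symbols $W_{x\cdot}$, $W_{h\cdot}$, $W_{c\cdot}$ and the biases $b_{\cdot}$ are assumed names, $*$ denotes convolution, $\circ$ denotes the Hadamard product, $\sigma$ is the sigmoid function, and some implementations omit the peephole terms that involve $C$.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) && \text{(input gate)} \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) && \text{(forget gate)} \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) && \text{(memory unit)} \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) && \text{(output gate)} \\
H_t &= o_t \circ \tanh\left(C_t\right) && \text{(hidden state)}
\end{aligned}
```

Because the inputs $X_t$, the states $H_t$, and the memory $C_t$ are all grids over the $I \times J$ traffic zones, the convolutions let each zone's update depend on its neighboring zones, which is how the local spatial features enter the recurrence.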
The last layer, CNN, is utilized to capture the global spatial features of traffic flow. The CNN extracts the spatial features of traffic flow by convolving the result of the preceding ConvLSTM-BN layer with the filter. $H_t$ denotes the input of the CNN layer. $H^{l}_{t,r}$ denotes the output of the $r$th filter of the $l$th layer, $H^{l-1}_{t,k}$ is the output of the $k$th filter of the preceding layer, and $O^{l}_{r}$ can be obtained by Equation (14):
$$O^{l}_{r} = f\left(\sum_{k=1}^{K} w^{l}_{kr} * H^{l-1}_{t,k} + b^{l}_{r}\right) \quad (14)$$
where $f$ represents the nonlinear activation function, $K$ is the depth of the CNN layer, and $w^{l}_{kr}$ and $b^{l}_{r}$ represent the weights and biases.
Step 4: Output the future short-term traffic flow.
The trained model $f_{w,b}(\cdot)$ is obtained from the CNN layer. We can determine each traffic zone's estimated average traffic flow speed from the predicted result $Y_t$.
Training stops once the acceptable errors $\varepsilon_1$ and $\varepsilon_2$ are reached; otherwise, the above iterative process is repeated. The mean absolute error (MAE) and root mean square error (RMSE) are employed as evaluation criteria to evaluate the efficacy of the proposed data-driven model. A lower value for these metrics signifies enhanced prediction accuracy and a more robust feature expression capability of the model.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| X_i - Y_i \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( X_i - Y_i \right)^2}$$
where $X_i$ and $Y_i$ denote the observed and forecasted traffic flow matrix at time $t$, and $n$ is the number of compared values.
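The two evaluation criteria can be computed for a pair of zone-speed matrices as in the sketch below. Averaging over every zone entry of the matrices is an assumption about how the errors are aggregated, and the function name `mae_rmse` and the sample values are illustrative only.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def mae_rmse(observed: Matrix, predicted: Matrix) -> Tuple[float, float]:
    """Return (MAE, RMSE) over all zone entries of two equal-shape matrices."""
    errors = [
        x - y
        for row_obs, row_pred in zip(observed, predicted)
        for x, y in zip(row_obs, row_pred)
    ]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mae, rmse


observed = [[60.0, 50.0], [30.0, 40.0]]   # hypothetical observed speeds (km/h)
predicted = [[58.0, 53.0], [30.0, 44.0]]  # hypothetical forecast speeds (km/h)
mae, rmse = mae_rmse(observed, predicted)
print(mae, rmse)
```

RMSE is never smaller than MAE, and the gap between the two grows when a few zones have large errors, which is why reporting both is informative.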

Model Parameters and Experimental Platform
The data-driven model in this study utilizes empirical values for its parameters. The initial four ConvLSTM layers have 2 × 2 filters, while the CNN layer has 3 × 3 filters. To capture short-term temporal features, the feature dimension at each time step is set to 25, and the regularization norm for the convolutional layer weights is set to 0.002. The model uses a learning rate of $5.0 \times 10^{-5}$ for 160 iterations and an attenuation parameter of 0.9. A batch size of 6 is utilized. Dropout layers are implemented to prevent overfitting in the CNN. Python is the programming language used for model implementation and training. The experiments are conducted on a Windows platform with an Intel i7 CPU, 16 GB of random access memory, and a TITAN XP GPU.

Data Description
The evaluation uses GPS vehicle trajectory data collected from the transportation networks of two Chinese cities (Xi'an and Shenzhen), as shown in Figure 6. Table 2 describes the two datasets. The first dataset is three months of floating vehicles' GPS trajectories in Xi'an city, China. The second dataset consists of one week of floating vehicles' GPS trajectory data [40] in Shenzhen, China. Table 3 provides details of the data format, including license plate numbers, recording times, longitude, latitude, vehicle instantaneous speeds, speeds, and direction angles. The "*" indicates the province where the license plate belongs. The first eighty percent of the dataset is used as a training dataset to train the proposed model, while the remaining twenty percent is utilized as the test dataset.
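The 80/20 split described above keeps the time order of the snapshots, which can be sketched as follows. The helper `make_windows`, which pairs a fixed number of past snapshots with the next snapshot as the prediction target, and its `history` length are hypothetical additions for illustration; the text does not state how input sequences are assembled.

```python
from typing import List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # a snapshot; any object works for this sketch


def chronological_split(
    snapshots: Sequence[S], train_fraction: float = 0.8
) -> Tuple[Sequence[S], Sequence[S]]:
    """Use the first part of the series for training and the rest for testing."""
    cut = int(len(snapshots) * train_fraction)
    return snapshots[:cut], snapshots[cut:]


def make_windows(
    snapshots: Sequence[S], history: int
) -> List[Tuple[List[S], S]]:
    """Pair `history` consecutive snapshots with the snapshot that follows."""
    return [
        (list(snapshots[k:k + history]), snapshots[k + history])
        for k in range(len(snapshots) - history)
    ]


series = list(range(10))  # stand-in for ten time-ordered snapshots
train, test = chronological_split(series)
windows = make_windows(train, history=3)  # `history` is an assumed setting
print(train, test, len(windows))
```

Splitting by time rather than at random avoids training on snapshots that occur after the ones used for testing.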

Analysis of the Most Congested Main Road in Xi'an City
This study conducts macro-level congestion analysis within the area of the Xi'an third ring road, and the traffic zones are utilized to analyze the congested areas. We use a Python library to visualize the tested area's traffic flow (average speed) for the whole day, as shown in Figure 7.
Figure 7a represents the distribution of traffic flow speeds within the third ring road area on the work day (4 September 2016, Tuesday). The darker colors indicate higher speeds on road segments, with white areas representing non-road sections such as buildings, parks, and lakes. It was discovered that the second ring road and its surrounding areas are the most congested areas; the location of the second ring road is marked by a bold red line in Figure 6a. Further, we analyzed the temporal patterns of congestion appearance and dissipation. Figure 7a reveals that during the midnight period (from 2 a.m. to 6 a.m.), the second ring road displays higher speeds than other areas. However, in the morning, noon, and evening peak hours, vehicle speeds on the second ring road decrease significantly, and the low-speed status can last for a long time. This implies that during the three peak periods of the day, the second ring road experiences lower speeds, indirectly indicating higher traffic volume. From 8 p.m. to 11 p.m., the speed on the second ring road gradually increases, as reflected by the brighter color in Figure 7a, which indicates a gradual decrease in citizens' travel frequency. Moreover, citizen activities are less frequent between 2 a.m. and 6 a.m., with most citizens resting.
Furthermore, we analyze the specific congestion conditions of the second ring road. Figure 7b visually represents the traffic speed of each section of the second ring road over one day. The brighter the color of a segment, the higher the traffic speed. Each image is labeled with a specific time, such as "hours2.5" for 2:30 a.m. The traffic speed on the second ring road is evidently higher from 12 a.m. to 6 a.m., whereas it decreases during the morning and evening peak hours. This reflects the travel patterns of commuters, who travel mainly during the morning and evening peak hours.
Due to the periodic nature of vehicle trajectories, we analyze the traffic flow evolution on the second ring road during the early morning peak (5 a.m. to 7 a.m.) for one week. As shown in Figure 8a, each row consists of seven small graphs representing each day of the week, and each column shows how the speed distribution on the second ring road evolves from 5 a.m. to 7 a.m. Above each small image is a time tag, such as "day1b '2016111405", indicating Monday (14 November 2016) at 5 a.m. The figure shows that the traffic flow is periodic during the morning peak hours, and the traffic speed at the same moment on two adjacent days (except Sunday) is similar. Because most people rest during the Sunday morning rush hour and hardly travel during this period, the recorded traffic flow speed is near zero. Furthermore, we counted the number of congestion events lasting 300 s, including moderate and severe congestion, as shown in Figure 8b. Congestion events are found to be periodic. Congestion was more pronounced during the morning and evening peaks on the five weekdays. On weekends, however, congestion events continued from 10 a.m. to 8 p.m., especially during the evening rush hours, resulting in lower travel speeds on the second ring road.
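The event count described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the speed threshold of 20 km/h, the 60 s sampling step, and the function name `count_congestion_events` are assumptions made for the example, and an event is taken to be a maximal run of consecutive congested samples lasting at least 300 s.

```python
def count_congestion_events(speeds, threshold_kmh=20.0, step_s=60, min_duration_s=300):
    """Count maximal runs of congested samples that last at least min_duration_s.

    speeds: time-ordered speed samples (km/h) for one road segment.
    threshold_kmh and step_s are hypothetical values chosen for illustration.
    """
    events = 0
    run = 0  # number of consecutive congested samples seen so far
    for v in speeds:
        if v < threshold_kmh:
            run += 1
        else:
            if run * step_s >= min_duration_s:
                events += 1
            run = 0
    # A congested run may extend to the end of the series.
    if run * step_s >= min_duration_s:
        events += 1
    return events


speeds = [35, 18, 15, 12, 14, 19, 40, 17, 16, 42, 10, 11, 9, 8, 13]
print(count_congestion_events(speeds))
```

In this toy series, the two-sample dip is too short to count, while the two five-sample runs each reach the 300 s minimum.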

The Impact of Residents' Travel on the Second Ring Road of Xi'an on Weekdays and Weekends
In Figure 9, the x-axis represents the time slots of a whole day, while the y-axis denotes the positions along the four sections of the second ring road, encompassing its north, west, south, and east roads. The daily traffic speeds of the network are encoded as a time-space matrix that serves as the channel of an image, so each image corresponds to one distinct day of the network. These images offer comprehensive traffic insights, revealing congested areas (highlighted in red) and congestion propagation patterns. Notable patterns include oscillating congested traffic and localized clusters exhibiting pinned behavior. A more in-depth exploration of these traffic patterns is available in the literature [41]. During weekday morning and evening peak hours, as shown in Figure 9a, the second ring road experienced heavy congestion. However, on weekends, as shown in Figure 9b, congested events are only observed during the morning and evening peak hours.
Furthermore, following the congestion classification outlined in Table 1, the speed data for the second ring road were categorized into five distinct traffic congestion levels. Figure 10 is the three-dimensional (3D) extension of Figure 9, adding a z-axis that represents the speed values used for classifying congestion. Figure 10a shows the 3D traffic flow on a weekday. During morning and evening peak hours, congestion persists for extended durations, reaching severe levels. Notably, the western part of the second ring road has congested events lasting up to 2 h. However, a distinct pattern emerges on weekends, shown in Figure 10b, with congestion intensifying progressively from noon to evening, except on the north side of the second ring road.
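The mapping from speeds to five congestion levels can be illustrated with a short sketch. The speed boundaries and level names below are hypothetical placeholders and are not the values of Table 1; only the idea of binning a time-space speed matrix into five ordered classes is taken from the text.

```python
from bisect import bisect_right

# Hypothetical speed boundaries in km/h (NOT the thresholds of Table 1).
BOUNDS = [10.0, 20.0, 30.0, 40.0]
# Hypothetical level names, ordered from slowest to fastest traffic.
LEVELS = ["severe", "moderate", "light", "basically smooth", "smooth"]


def congestion_level(speed_kmh):
    """Return the congestion level whose speed interval contains speed_kmh."""
    return LEVELS[bisect_right(BOUNDS, speed_kmh)]


def classify_matrix(speed_matrix):
    """Classify every cell of a time-space speed matrix (rows: positions, columns: time slots)."""
    return [[congestion_level(v) for v in row] for row in speed_matrix]


print(classify_matrix([[5, 25], [35, 50]]))
```

Using `bisect_right` keeps the lookup independent of the number of classes, so replacing the placeholder boundaries with the actual table values would not change the code.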
By analyzing the historical traffic data, we identify key patterns and trends that influence traffic flow, such as the impact of peak time and of weekdays versus weekend days. The data analysis in this section plays a crucial role in our research, as it enables us to identify the underlying characteristics of traffic flow and use these features for model training in the next section.

Models Comparison
To further verify the effectiveness of the proposed data-driven model, it was compared against common forecasting methods, including statistical models such as the simple average method, ARIMA, and the exponential smoothing method (ESM), and neural network algorithms such as MLP, CNN, and LSTM. The RMSE and MAE are used to evaluate the performance of the proposed method. The experimental analysis is conducted with 160 iterations. The compared methods are configured as follows:
The simple average method is a time series-based approach that predicts future traffic flow by averaging historical data over a specific observation period.
The exponential smoothing method (ESM) is derived from the moving average method and assigns weights to past observations, giving more importance to recent data [15]. Double and triple exponential smoothing techniques, named two ESM and three ESM, are compared with the proposed method.
Autoregressive integrated moving average (ARIMA) is a general forecasting method designed explicitly for non-stationary time series data in transportation [16].
Multi-layer perceptron (MLP) is a popular neural network model for traffic prediction. It learns from historical data to establish an optimal mapping between input and expected output. This MLP has two hidden layers, with the first layer consisting of 12 neurons as the test set model and the second with 8 neurons as the training set model [24].
The convolutional neural network (CNN) method utilizes two convolutional layers for the prediction of traffic flow. It employs stacked convolutional and activation layers to capture traffic patterns [42].
The convolutional long short-term memory (ConvLSTM) method is a convolutional recurrent neural network structure widely used for time series analysis. This method has two ConvLSTM layers [39].
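The two simplest statistical baselines in the list above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: "two ESM" is interpreted here as Brown's double exponential smoothing, the smoothing factor `alpha` and the window length are free parameters that the text does not specify, and the function names are chosen for the example.

```python
def simple_average(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def exponential_smoothing(history, alpha):
    """Single exponential smoothing; returns the one-step-ahead forecast."""
    level = history[0]
    for x in history[1:]:
        # Recent observations receive weight alpha, older ones decay geometrically.
        level = alpha * x + (1 - alpha) * level
    return level


def double_exponential_smoothing(history, alpha, steps_ahead=1):
    """Brown's double exponential smoothing with a linear trend term."""
    s1 = s2 = history[0]
    for x in history[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2  # second pass, applied to s1
    intercept = 2 * s1 - s2
    slope = alpha / (1 - alpha) * (s1 - s2)
    return intercept + steps_ahead * slope


speeds = [30.0, 32.0, 34.0, 36.0]
print(simple_average(speeds, window=2))
print(exponential_smoothing(speeds, alpha=0.5))
print(double_exponential_smoothing(speeds, alpha=0.5))
```

All three functions use only the history of a single series, which is why such baselines capture temporal but not spatial structure.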
Firstly, Figure 14a presents the RMSE values of the comparative methods. Traditional forecasting methods (simple average method, ESM, and ARIMA) can only capture the temporal characteristics of traffic flow, resulting in higher RMSE values. The MLP model is a simple neural network that focuses on the temporal characteristics, while the CNN method with two convolutional layers can learn the global spatial characteristics of traffic flow. Although ConvLSTM can learn both temporal and local spatial features, its RMSE performance is worse than that of CNN. The proposed model leverages the convolutional LSTM network to learn the temporal and local spatial features of traffic flow, while capturing the global spatial features with a convolutional neural network. This data-driven model achieves a lower RMSE than the traditional models, and its evaluation metrics also surpass those of the CNN and LSTM models.
Secondly, Figure 14b reports the MAE of all algorithms. Our proposed method achieves the smallest MAE value, which is 4.28. The MAE values of all algorithms except ESM and ARIMA are less than 10. Specifically, the MAE values of ConvLSTM and CNN are 6.78 and 5.41, respectively. Moreover, the model demonstrates a stable error rate, making it suitable for large-scale transportation networks. The training process is also completed within a reasonable timeframe. These results highlight the superior short-term traffic flow prediction performance of our method, which combines ConvLSTM and CNN.
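For readers who want to reproduce the two metrics, a minimal sketch is given below. It uses the standard definitions of RMSE (square root of the mean squared error) and MAE (mean absolute error); the sample values are made up for illustration and are not results from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Made-up speeds in km/h, for illustration only.
actual = [30.0, 25.0, 40.0, 35.0]
predicted = [28.0, 27.0, 37.0, 39.0]
print(rmse(actual, predicted), mae(actual, predicted))
```

Because squaring weights large errors more heavily, RMSE is never smaller than MAE on the same data, which is consistent with the RMSE values in the paper being larger than the corresponding MAE values.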

Conclusions
This paper applies the G-DBSCAN clustering algorithm to obtain the traffic congestion state. The traffic network is decomposed into multiple traffic zones by the gridded method, and the discrete snapshot set is then proposed to store the spatial and temporal features of traffic flow over a continuous period. Moreover, we analyze the evolution of traffic flow in various time dimensions (weekday, weekend day, and one continuous week). Further, a data-driven model is constructed to predict urban traffic congestion, combining the ConvLSTM and CNN networks to learn the spatial and temporal features of traffic flow. The numerical experiments are conducted on two cities' transportation networks, and the proposed model outperforms both the traditional statistical and machine learning methods.
This study primarily analyzes traffic flow evolution through a statistical lens and utilizes black-box deep learning models for prediction; thus, this work lacks interpretable algorithms for traffic modeling. A further limitation is that the study focuses on gathering local road traffic flow data within planar traffic zones as the training and prediction object. As a result, our method may not be able to effectively analyze traffic flow with multiple overlapping layers in space, such as overpasses or complex interchanges [27].
In future work, we not only expand our efforts by incorporating methodologies such as the Gaussian process [43] and Bayesian optimization [44] to enhance the internal

Figure 1. Cluster analysis using G-DBSCAN of the 20 most congested urban regions on weekdays.


(a) The grid representation of the city transport network. (b) The representation of SnapShotSet.

Figure 2. The representation of traffic zones.


Mathematics 2023, 11, x FOR PEER REVIEW
(a) The average speed of traffic zones: (1) road vehicles' average speed, (2) partition of traffic zones, (3) average speed of each zone. (b) The traffic flow evolution during the weekend day's noon peak (10 a.m. to 12 p.m.).

Figure 3. The zone/grid representation of spatio-temporal trajectory data.

We aggregate the traffic speed data from floating vehicle tracks within each zone in each timeslot and employ color variations to visually depict the ATFS, which yields a grid-based depiction of urban road traffic flow, as shown in Figure 3b. This figure represents the average traffic flow speed in the study area during the weekend, specifically from 10 a.m. to 12 p.m. In this representation, darker colors indicate slower vehicle speeds and more congested traffic flow on the corresponding road, while lighter colors indicate higher vehicle speeds and smoother traffic conditions. As time progresses from 10 a.m. to 12 p.m., the traffic flow gradually increases, with significant growth on arterial roads. Consequently, the next section aims to study the traffic flow features by utilizing the traffic flow snapshot set from the current and preceding moments.
Furthermore, feature extraction is crucial for traffic prediction in machine learning and deep learning. High-quality data features enable ordinary models to achieve accurate predictions [14]. In our research, we use the insights gained from data analysis to extract relevant features that are then incorporated into the model training process. This feature analysis is instrumental in developing a more accurate traffic prediction model. By incorporating the spatio-temporal characteristics of traffic flow identified in our study, we can train a model that achieves superior accuracy in predicting traffic flow in the next section.
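The zone-level aggregation of floating vehicle speeds can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the record layout `(timestamp_s, lon, lat, speed_kmh)`, the square cells measured in degrees, and the function name `average_zone_speeds` are all assumptions made for the example.

```python
from collections import defaultdict


def average_zone_speeds(records, origin, cell_deg, slot_s):
    """Average speed per (time slot, grid row, grid column).

    records: iterable of (timestamp_s, lon, lat, speed_kmh) tuples (hypothetical layout).
    origin:  (lon, lat) of the south-west corner of the gridded area.
    """
    lon0, lat0 = origin
    sums = defaultdict(lambda: [0.0, 0])  # key -> [speed total, sample count]
    for ts, lon, lat, speed in records:
        key = (
            int(ts // slot_s),               # time slot index
            int((lat - lat0) // cell_deg),   # grid row
            int((lon - lon0) // cell_deg),   # grid column
        )
        acc = sums[key]
        acc[0] += speed
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


records = [
    (0, 108.1, 34.1, 30.0),
    (100, 108.2, 34.3, 50.0),
    (4000, 108.7, 34.1, 20.0),
]
zones = average_zone_speeds(records, origin=(108.0, 34.0), cell_deg=0.5, slot_s=3600)
print(zones)
```

Each key of the result identifies one cell of one snapshot, so stacking the values for a fixed time slot gives the kind of grid image that the snapshot set stores.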

To enhance the handling of spatio-temporal traffic flow data within the data-driven model, we draw inspiration from convolutional LSTM (ConvLSTM), an extended LSTM model detailed in the literature [39]. ConvLSTM effectively leverages both temporal features and prior local traffic flow states to predict future states. Deep learning models commonly employ batch normalization (BN) layers to expedite convergence, enhance generalization, and mitigate gradient vanishing issues. Moreover, congestion in a particular traffic zone not only impacts neighboring zones, but can also affect remote zones. The convolutional neural network (CNN) has inherent local connectivity characteristics, making it adept at addressing spatially correlated problems [22]. Therefore, our study proposes a data-driven model to predict traffic congested flow. This model comprises four combined ConvLSTM and BN layers, and a CNN layer, as illustrated in Figure 4. It integrates time series features and spatial relationships within a hybrid deep learning architecture to study the spatio-temporal characteristics of traffic flow evolution.
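For reference, the batch normalization step mentioned above is usually written as follows. This is the standard formulation from the general deep learning literature, not an equation taken from this paper; the symbols $m$ (mini-batch size), $\epsilon$ (a small constant for numerical stability), and the learnable scale $\gamma$ and shift $\beta$ are introduced here for illustration.

```latex
\begin{aligned}
\mu_B &= \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad
\sigma_B^2 = \frac{1}{m} \sum_{i=1}^{m} \left(x_i - \mu_B\right)^2, \\
\hat{x}_i &= \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad
y_i = \gamma \hat{x}_i + \beta .
\end{aligned}
```

Normalizing each mini-batch to zero mean and unit variance before rescaling keeps the inputs of the following ConvLSTM layer in a stable range, which is the usual explanation for the faster convergence noted in the text.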

Figure 4. The structure of the data-driven prediction model.


To explore the temporal features of traffic flow, the LSTM layer takes the first 80 percent of as input. The output hidden state matrix at time $t$ is denoted as $H_t$.
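One common way to realize such an 80 percent split is a chronological cut of the time-ordered snapshots, followed by sliding windows that pair past snapshots with future ones. The sketch below assumes that this is what the split refers to; the function names and the window lengths are hypothetical and chosen only for illustration.

```python
def chronological_split(snapshots, train_fraction=0.8):
    """Split a time-ordered sequence into leading and trailing parts without shuffling."""
    cut = int(len(snapshots) * train_fraction)
    return snapshots[:cut], snapshots[cut:]


def sliding_windows(snapshots, n_in, n_out):
    """Pair n_in consecutive snapshots with the n_out snapshots that follow them."""
    samples = []
    for start in range(len(snapshots) - n_in - n_out + 1):
        inputs = snapshots[start:start + n_in]
        targets = snapshots[start + n_in:start + n_in + n_out]
        samples.append((inputs, targets))
    return samples


frames = list(range(10))  # stand-ins for ten consecutive traffic snapshots
train, test = chronological_split(frames)
samples = sliding_windows(train, n_in=3, n_out=2)
print(train, test, samples[0])
```

Keeping the split chronological avoids leaking future traffic states into the training portion, which matters for any time series forecast.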

Figure 5. The network structure of ConvLSTM.

Here, $W_x^{in}$, $W_x^{f}$, $W_x^{o}$, and $W_x^{g}$ are the weight matrices connecting the input $X_t$ to the three gates and memory units, while $W_h^{in}$, $W_h^{f}$, $W_h^{o}$, and $W_h^{g}$ represent the weight matrices linking the hidden state matrix $H_{t-1}$ to these gates and input units. Additionally, $W_g^{in}$, $W_g^{f}$, and $W_g^{o}$ correspond to the weight matrices connecting the memory unit to the three gates. The biases of the three gates and memory units are denoted by $b_{in}$, $b_f$, $b_o$, and $b_g$. Moreover, the symbol $*$ denotes the convolution operation; $\circ$ signifies the element-wise (Hadamard) product; and $\sigma(\cdot)$ and $\tanh(\cdot)$ denote the nonlinear activation functions, namely the logistic sigmoid and the hyperbolic tangent.

(a) Xi'an city, China (b) Shenzhen city, China
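For reference, the gate updates that the ConvLSTM weight notation describes can be written in the standard peephole ConvLSTM form from the literature [39]. This is a sketch under assumptions: the names $i_t$, $f_t$, $o_t$, $g_t$, and $C_t$ for the input gate, forget gate, output gate, candidate input, and memory unit are chosen here for illustration, and the authors' exact equations may differ in detail.

```latex
\begin{aligned}
i_t &= \sigma\left(W_x^{in} * X_t + W_h^{in} * H_{t-1} + W_g^{in} \circ C_{t-1} + b_{in}\right) \\
f_t &= \sigma\left(W_x^{f} * X_t + W_h^{f} * H_{t-1} + W_g^{f} \circ C_{t-1} + b_{f}\right) \\
g_t &= \tanh\left(W_x^{g} * X_t + W_h^{g} * H_{t-1} + b_{g}\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ g_t \\
o_t &= \sigma\left(W_x^{o} * X_t + W_h^{o} * H_{t-1} + W_g^{o} \circ C_t + b_{o}\right) \\
H_t &= o_t \circ \tanh\left(C_t\right) \\
\sigma(x) &= \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
\end{aligned}
```

The convolutions let each gate depend on a neighborhood of traffic zones rather than a single cell, which is how the layer captures local spatial features alongside the temporal recurrence.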
(a) Speed distribution in the third ring road of Xi'an city. (b) 4 h congestion change in the second ring road.

Figure 7. Evolution of traffic flow throughout the day in Xi'an city.

(a) Analysis of congestion in the second ring road. (b) The number of congested events lasting 300 s.

Figure 8. Analysis of morning rush hour traffic flow in the second ring road over the course of a week.

Figure 9. Speed diagram of a traffic cell w.r.t. the space-time dimension.


4.3. Model Prediction Results and Analysis

4.3.1. Sensitivity Analysis
The proposed model is trained using vehicle trajectory data from Shenzhen city to predict the short-term traffic flow for the next two hours. To reduce overfitting, batch normalization (BN) layers are incorporated into our model architecture. A comparative analysis of ConvLSTM-BN models with varying numbers of layers indicates that the four-layer ConvLSTM-BN model achieves the lowest RMSE, as depicted in Figure 11. Therefore, our data-driven model combines four ConvLSTM layers and a CNN layer for traffic flow prediction.

Figure 10. Three-dimensional diagram of traffic flow for a whole day.


Figure 11. Optimal number of ConvLSTM layers in the proposed model.

Figure 12 provides the RMSE and MAE results of the proposed model for different short-term prediction horizons. It displays the pre-training results for Xi'an city, indicating that the model achieves its lowest RMSE and MAE values of 7.26 and 3.87, respectively, when predicting traffic flow in the next two hours. However, the RMSE and MAE values are relatively high when predicting traffic flow in the next four and eight hours. Then, we use the pre-trained model to train and test on the traffic flow in Shenzhen. When forecasting road congestion status for the next two hours in Shenzhen, the model achieves its minimum RMSE and MAE values of 11.70 and 7.68, respectively.


Figure 12. The performance evaluation of the proposed model in predicting short-term traffic flow.

Figure 13 visualizes the prediction results of the proposed data-driven model for congested traffic flow on both weekdays and weekends. In Figure 13, the label "predict:2016052316" represents the predicted traffic flow results from 4 p.m. to 5 p.m. on a weekend day (23 May 2016), while "GroundTruth:2016052316" represents the actual traffic flow during the same period. Figure 13a predicts the future two-hour (evening peak) traffic flow in Xi'an city. It reveals that traffic congestion is mainly observed on the second ring road and the roads within it. Similar prediction results are obtained for Shenzhen city, as depicted in Figure 13b, where traffic congestion is observed on the primary roads. Therefore, Figures 12 and 13 demonstrate the model's effectiveness in short-term traffic flow prediction.

Figure 13. Forecast of traffic flow speed in the next two hours.

Figure 14. The performance comparison of predicting the future two-hour traffic flow in Shenzhen city, China.


Table 1. The congestion classification for urban roads.


Table 2. The description of two datasets.

Table 3. The trajectory information of vehicles.

4.2. Analysis of the Evolutionary Process of Urban Road Congestion

4.2.1. Analysis of the Most Congested Main Road in Xi'an City