Traffic Flow Online Prediction Based on a Generative Adversarial Network with Multi-Source Data

Abstract: Traffic prediction is essential for advanced traffic planning, design, management, and network sustainability. Current prediction methods are mostly offline and fail to capture the real-time variation of traffic flows. This paper establishes a sustainable online generative adversarial network (GAN) by combining bidirectional long short-term memory (BiLSTM) and a convolutional neural network (CNN) as the generative model and discriminative model, respectively, to keep learning with continuous feedback. BiLSTM constantly generates temporal candidate flows based on valuable memory units, and CNN screens out the best spatial prediction by returning the feedback gradient to BiLSTM. Multi-dimensional indicators are selected to map the multi-view fusion local trend for accurate prediction. To balance computing efficiency and accuracy, different batch sizes are pre-tested and allocated to different lanes. The models are trained with rectified adaptive moment estimation (RAdam) by dividing the dataset into the training and testing sets with a rolling time-domain scheme. In comparison with the autoregressive integrated moving average (ARIMA), BiLSTM, the generative adversarial network for traffic flow (GAN-TF), and the generative adversarial network for non-signal traffic (GAN-NST), the proposed improved generative adversarial network for traffic flow (IGAN-TF) generates more accurate and stable flows.


Introduction
As a crucial technology in intelligent transportation systems (ITS) and a research hotspot, traffic flow prediction is necessary for a sustainable traffic network. Traffic flow is determined by modal split and route choice, which are influenced by many factors, including daily commuting and non-commuting activities, holidays, weather, public activities, traffic events, urban construction, and individual relocation.
In recent years, many deep learning-based models have been proposed to predict traffic volume, and they process the time-series traffic data [1,2]. Because there is no explicit order dependence for traffic estimation, the observations in a period are treated as the same to predict a series of future journey times via the simulation of the historical observations' distribution, which time series methods are not suitable to handle. Therefore, a deep learning method based on generative adversarial networks (GANs) [3] is considered. GANs are promising image generation methods because they generate samples based on their simulated probability distribution of a complex dataset. In a GAN, which follows an adversarial process to estimate generative models, two models are trained simultaneously. A generative model captures the distribution of the training data, while a discriminative model estimates the probability that a sample came from the training data rather than from the generative model. The training procedure for the generative model aims at maximizing the probability of the discriminative model making a mistake. As in a minimax two-player game, a unique solution exists in the space of arbitrary functions, in which the generative model recovers the training data distribution and the discriminative model outputs 0.5 everywhere. Radford et al. [4] introduced deep convolutional generative adversarial networks (DCGANs), which are a class of convolutional neural network (CNN) with certain architectural constraints. DCGANs were proved to be powerful for unsupervised learning. The training results on different image datasets showed a hierarchy of representations, from object parts to scenes, in both the generator and the discriminator. Mescheder et al. [5,6] analyzed the common training algorithms of a GAN. Using the form of a smooth two-player game, the gradient vector field associated with the GAN training objective was analyzed.
The general GAN structure was proved to be superior in training. They also showed convergence for the well-known GAN structure, which is difficult to train. Arjovsky et al. [7] introduced a new algorithm. They proved that it could enhance learning stability, eliminate mode collapse, and deliver meaningful learning curves, which helps debugging and hyperparameter search. Zhang et al. [8] proposed a travel information maximizing generative adversarial network based on deep learning. By modeling the joint travel time distribution (TTD) of two continuous links and considering the spatiotemporal correlation of the whole network, the TTD was estimated successfully in the framework of a GAN for the first time. While the generative and adversarial models were originally defined by multilayer perceptrons, most recent research is based on long short-term memory (LSTM) or CNNs for various applications. LSTM, a fundamental deep learning model, can learn long-term dependencies. An LSTM internal unit, which includes a cell, an input gate, an output gate, and a forget gate, has hidden states improved by nonlinear mechanisms that allow states to propagate without amendment and use simple learned gating functions to update or reset states. LSTM works exceptionally well on different problems, including handwriting recognition, electric load forecasting, and natural language text compression. As a class of deep, feed-forward artificial neural networks, CNNs have been successfully used to analyze visual imagery. A CNN consists of multiple hidden layers (including convolutional layers, pooling layers, fully connected layers, and normalization layers), but it has only one input layer and one output layer. The applications of CNNs include natural language processing, recommender systems, and image and video recognition. Because the traffic data are not preprocessed and the models take a single input, most artificial intelligence (AI)-based prediction models cannot match the traffic variation among various intervals (e.g., hours, days, and weeks), which leads to poor performance under different traffic conditions.
This paper puts forward a GAN-based prediction model combining BiLSTM as the generative model with a CNN as the discriminative model, in which a rolling time-domain scheme is proposed by dividing the dataset into slices of training and test data on the same period to capture the variation of traffic data and realize the self-learning ability of the GAN. Additionally, the influence of signal control on traffic flow prediction has generally not been considered in previous works. To resolve this issue, this paper adopts min-max normalization pre-processing of traffic flow and signal timing data as the input parameters.
The remainder of this paper is organized as follows: we review the literature on GAN methods in Section 2. Then, the methodology framework of the improved generative adversarial network for forecasting traffic flow, including the adopted structure, rolling time-domain training and testing, model fusion mechanism, and multi-dimensional indicators, is presented in Section 3. The experimental dataset, evaluation indexes, distribution probability, model calibration, and time series stationarity test process are in Section 4. Experimental results and comparison with benchmark methods are analyzed in Section 5. Section 6 concludes with the contributions and future work of this research.

Literature Review
Many existing prediction methods have been suggested to resolve the complex relationship between different factors and traffic fluctuation. One is the statistical-based method, which is simple to calculate and maintains stable prediction accuracy. However, the statistical-based method is required to satisfy linear or other strong assumptions related to statistical properties. Thus, it cannot accurately reflect the complex and nonlinear relationship hidden in time-varying traffic flows. Also, the predicted flows may lose accuracy since the utilized statistical characteristics can eliminate local fluctuations to ensure a relatively acceptable accuracy in general. Another type is the AI-based method, which has strong abilities of multi-source information fusion and nonlinear prediction. Still, it often faces problems of overfitting and calculation costs for online applications. Both of these methods can be applied to offline and online scenarios. However, due to the complex and nonlinear relationship between the influence factors and traffic flows, it is impossible to fit the time-varying properties of traffic flows into a fixed offline expression.
The traditional AI methods for traffic flow prediction can be grouped into two types: parametric and non-parametric methods. The parametric methods include the artificial neural network (ANN), autoregressive integrated moving average model (ARIMA), exponential smoothing, and Kalman filter (KF). Voort et al. [9] suggested a hybrid method for short-term traffic prediction, which could better handle traffic flow changes and significantly improve prediction performance. Ding et al. [10] divided traffic patterns into six levels and established a time-space-based ARIMA to predict urban traffic flow at 5-min intervals. Chandra and Al-Deek [11] established a fast-path short-time traffic flow vector autoregressive prediction model, which considered the influence of upstream and downstream location information and achieved good results. Huang et al. [12] proposed a hybrid model in which the linear and non-linear parts were processed by the ARIMA and the ANN, respectively. The hybrid model improved the prediction accuracy. KF-based traffic prediction research was proposed because of its strong recursive ability. Gong et al. [13] suggested a KF-based model for the prediction of short-term traffic volume, and they used the Internet of Things for data collection. Ojeda et al. [14] adopted an adaptive scheme to build a multi-step KF-based method for traffic flow prediction. Guo et al. [15] argued that the adaptive KF-based method for short-term traffic prediction did not have a large flow rate. The short-term traffic flow prediction performance could be improved by a new approach with better adaptability and stability.
Compared to parametric methods, non-parametric methods are more complicated and flexible because of the uncertainty of groups of parameters and traffic data structure. Based on statistical theory, the support vector machine (SVM) is adopted in many areas. However, because of its sensitivity to the selection of kernel functions and parameters, scholars have established a least squares SVM, chaotic wavelet SVM, particle swarm optimization SVM, and genetic algorithm SVM. Another typical non-parametric method is the neural network, which is widely used in prediction [16,17]. The improved extreme gradient boosting (XGBoost) with spatial lag has improved prediction accuracy [18]. Neural networks are good at learning multidimensional complex nonlinear problems. Khotanzad and Sadek [19] used a fuzzy neural network and a multilayer perceptron to simulate a fast-path network, and the result was better than that from the parametric regression model. Qiu et al. [20] built a Bayesian regularized neural network to forecast short-term velocity. Ma et al. [21] constructed a large-scale congestion prediction model and proposed the restricted Boltzmann machine based on a recurrent neural network. In recent years, an increasing interest in deep neural networks, such as deep belief networks [22], has appeared in traffic prediction. Bidirectional long short-term memory (BiLSTM) has an excellent predictive application effect. Cui et al. [23] suggested a deep stacked bidirectional and unidirectional LSTM neural network structure. The forward and backward correlations of time series data were used, and the structure was applied to predict the whole network's traffic speed. With the backward dependence of traffic data for prediction, the model can predict the traffic speed of freeways and complex urban traffic networks.
Deep learning-based GANs can deal with these drawbacks. More hybrid prediction frameworks of parametric and non-parametric methods have been proposed due to the complex and nonlinear relationship among time-varying traffic flow sequences. Through training, high-quality data that confuse the discriminative model are generated, and GANs are widely used in image restoration, semantic segmentation, image prediction, etc., because they can flexibly perform real and virtual scenes in tasks such as artificial systems, computational experiments, and synchronous execution. Wang et al. [24,25] summarized the latest GAN technology and found that parallel system research on functional interaction and integration has great potential. Liang et al. [26] considered the traffic state of the original traffic and the traffic state of the training samples based on a GAN, and estimated the missing traffic state under spatio-temporal and flow-speed relationships. Li et al. [27] utilized the joint distribution of relative entropy to reduce the difference among traffic distribution situations and established a multi-quantity model based on a GAN, which could improve the performance of a traffic prediction model for extreme values.
Reviewing the above literature, we found three critical problems for online traffic prediction:
1. How to find the periodicity and trend of flows in the time domain?
Though special cases and many other factors cause traffic flows to change frequently, traffic patterns still show an obvious tendency in most cases, with stable periodicities and trends. Daily, weekly, monthly, and quarterly patterns of traffic flows are significant fluctuation time domains for prediction approaches.

2. How to describe the local trend raised by special cases with steady accuracy?
There is a correlation among local temporal neighbors. However, the local closing values tend to fluctuate heavily, so a prediction based mainly on closing values is unreliable. The prediction module should depend on groups of statistical local information from temporal neighbors for steady accuracy. In this way, the fluctuation direction and range can be resolved in a reasonable domain.

3. How to maintain the continuous self-learning ability in the time domain?
As complex factors influence traffic flows, the ability of the prediction model to continuously learn the latest pattern of sudden variations, periodicities, and trends is noteworthy, especially for online prediction methods. In most offline methods, the parameters or strategies are fixed, making it a challenge to deal with unknown fluctuations. Online methods are mostly trained with the latest small dataset under a calculation limitation, which cannot fit long-term periodicity and trends. As a result, an online discriminative tool is necessary for the frequent online prediction process. Recently, the loss of most online methods has usually been focused on the current interval, regardless of the whole sequence discrimination. In urban signalized intersections, signal timings must be considered to improve the self-learning ability of online prediction methods.
To enhance traffic prediction, this paper presents an improved generative adversarial network framework for traffic flow (IGAN-TF), which introduces a parametric-vector-based fusion method to fuse three components (closeness, periodicity, and trend). Then, multiple temporal views (i.e., daily periodicity, weekly periodicity, biweekly periodicity, and monthly trend) are employed for different temporal sequences with different periodicities and trends. The method of rolling time-domain training and testing further expands the use of data. A relative strength index is introduced for the complex and nonlinear relationship between the target traffic flows. Five kinds of indicators are resolved by the generator and the discriminative model of IGAN-TF. Finally, the effectiveness and accuracy of the proposed model are proved by comparing the prediction results of different lanes (straight-lane, left-lane, right-lane, and straight right-lane) at the intersection with those of the baseline methods.
In conclusion, for question 1, time series models with long-term and short-term memory, such as LSTM and BiLSTM, are selected as basic units in the basic model. At the data fusion level, the model combines daily, weekly, biweekly, and monthly features. In addition, in terms of data feature extraction, the trend features median and moving average are selected to learn the trend of traffic flow data and make accurate predictions. For question 2, the model captures the local trend of the data. At the level of data fusion, recent features are combined. In addition, in terms of data feature extraction, the volatility features maximum and minimum are selected, together with the signal timing phase time, which dramatically impacts the traffic. For question 3, the self-learning ability proposed in this paper is mainly realized by introducing the GAN mechanism: real data are input, data are generated, and the generative model constantly competes against the discriminative model to realize the final prediction. In this process, the discriminative model continuously feeds the loss back to the generative model to realize self-learning.

Methodology
In this section, we first give the definition of the prediction problem and its motivation to illustrate the time series composition and relationship of traffic data. Then, the proposed improved generative adversarial network for traffic flow (IGAN-TF) is depicted in detail, and the corresponding algorithm (Algorithm 1) and dataset division strategy are introduced to design advanced models.

Problem Definition and Motivation
Although traffic flow is simply time-series data, it is widely influenced by various factors, making it difficult to fit with a fixed expression without continuously renewing the latest fluctuation information. The travel demand is mainly composed of regular daily commuting and non-commuting active travels, which form a relatively stable traffic pattern. Special cases (such as weather change, public activities, and traffic events) may suddenly change the short-term travel demand. Other factors, such as holidays, urban construction, and individual relocation, could affect the traffic flows in the long term. Unfortunately, these kinds of data are challenging to collect entirely and to fuse. In urban signalized intersections, signal timings are another significant factor that affects traffic flows in each lane. However, no matter how these factors influence the traffic flow, their combined effect is reflected as the variation of traffic data in the time domain with various fluctuation ranges and temporal ranges, as shown in Figure 1.
In Figure 1a, weekdays have much higher peak values than weekends. However, traffic flows fall obviously during relatively long holidays because the outflow from the studied area outnumbers the inflow to it during holidays. The two curves in Figure 1b have different shapes, which demonstrates that special cases could result in a sudden change of the local closing trend at any time. Figure 1c describes daily traffic flows on a newly constructed road from 6 October 2016 to 31 December 2017. Traffic flows gradually increased with an obvious trend as time passed. Hence, traffic flow online prediction is required to precisely capture the fluctuation range and temporal range by considering different degrees of temporal influence by closeness, periodicity, and trend.

Improved Generative Adversarial Network Framework for Traffic Flow
With the high-frequency detected traffic flows, high-quality one-step forecasting is generally of great concern to traffic engineers and policymakers to design robust traffic management schemes. To tackle the above problems, an IGAN-TF is proposed to predict traffic flows by combining BiLSTM as the generative model (generator) G with CNN as the discriminative model D. The traffic flow of each lane is predicted one step ahead with the IGAN-TF, solely based on its historical information. By introducing an adversarial process between the generator G and the discriminative model D, the online discriminative model can judge the generated traffic flows of the generator G continuously, which improves the self-learning ability of online prediction. Table 1 presents the notations used in IGAN-TF. Let $i_t$ represent a set of multidimensional indicators and $f_t$ denote the closing traffic flow for an l-minute interval at the interval t. Given the past S timesteps of closing flows $f_1, f_2, \ldots, f_S$ and the corresponding historical multi-source indicators $I_{S+1}(i_1, i_2, \ldots, i_S)$, the goal is to predict the closing flow $\hat{f}_{S+1}$ for the following l-minute time interval.
Table 1. Notations used in IGAN-TF.
$f(f_1, f_2, \ldots, f_S, f_{S+1})$: Traffic flow ground truth of temporal sequence 1, 2, ..., S, S+1 ($f$ is a vector form)
$\hat{f}(f_1, f_2, \ldots, f_S, \hat{f}_{S+1})$: Generated traffic flow of temporal sequence 1, 2, ..., S, S+1 ($\hat{f}$ is a vector form)

Figure 2 shows the framework of IGAN-TF. IGAN-TF is composed of two processes: the data preparation section (Figure 2a) and the model learning/predicting section (Figure 2b), as follows.

(a) Data preparation
In the time domain T, traffic flows in different lanes are first transferred into the range [0, 1] with min-max normalization. The ground truth sequence of the targeted interval $f_{S+1}$ and its neighbors $f_1, f_2, \ldots, f_S$ are concatenated as $f$ and prepared to be input into the discriminative model. Moreover, the traffic ground truth in multiple temporal views, such as the daily, weekly, biweekly, and monthly domains, is matched with the recent intervals by multi-view temporal fusion as time goes by. Note that some of the indicators, which are not related to temporal views, are not determined from the multi-view temporal fusion and keep their value at the current time. With the temporal fusion, E key indicators $i_t(i_t(1), i_t(2), \ldots, i_t(E))$ are selected to form the input data of generator G.
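The min-max step in the data preparation stage can be illustrated with a minimal sketch. The function names and the sample counts below are hypothetical and are not taken from the paper; the sketch only assumes that each lane is scaled independently with its own minimum and maximum.

```python
from typing import List, Tuple


def min_max_normalize(flows: List[float]) -> Tuple[List[float], float, float]:
    """Scale one lane's flows into [0, 1]; also return the (min, max) used."""
    lo, hi = min(flows), max(flows)
    if hi == lo:
        # A constant series carries no spread; map it to 0 to avoid dividing by zero.
        return [0.0 for _ in flows], lo, hi
    return [(x - lo) / (hi - lo) for x in flows], lo, hi


def min_max_denormalize(scaled: List[float], lo: float, hi: float) -> List[float]:
    """Map normalized predictions back to vehicle counts."""
    return [x * (hi - lo) + lo for x in scaled]


# Hypothetical 15-min counts for a single lane (illustration only).
lane_flows = [12.0, 30.0, 48.0, 21.0]
scaled, lo, hi = min_max_normalize(lane_flows)
restored = min_max_denormalize(scaled, lo, hi)
```

In an online setting, the minimum and maximum would presumably be taken from the training window only, so that the test day does not leak into the scaling.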

(b) Model learning/predicting
Since traffic flow data are a specific kind of time-series data, the BiLSTM model, widely used for long and short-term time series prediction, is selected as the generator G to predict the output $\hat{f}_{S+1}$ according to the input data $I_{S+1}(i_1, i_2, \ldots, i_S)$.
Based on the CNN architecture, the discriminative model D calculates the probability of whether a sequence comes from the dataset, $f(f_1, f_2, \ldots, f_S, f_{S+1})$, or is produced by the generator G, $\hat{f}(f_1, f_2, \ldots, f_S, \hat{f}_{S+1})$, by performing convolution operations on the one-dimensional input sequence.
The generator G and the discriminative model D evolve and interact with each other. When one side improves, the other loses ground, and the evolution ceases once the generated target is similar to the true target. With historical ground truth $f_1, f_2, \ldots, f_S$ and indicators $I_{S+1}(i_1, i_2, \ldots, i_S)$ as the inputs, the discriminative model D should be as "confused" as possible. The generator G should diminish the adversarial loss so that the discriminative model D cannot discriminate the prediction correctly.
Training neural networks with the mean absolute error (MAE) may miss the minimum point during gradient descent because its gradient keeps the same magnitude however small the loss is. In contrast, the gradient of the mean squared error (MSE) decreases as the loss decreases. The Huber loss, which combines the advantages of both and lets the gradient shrink near the minimum, is chosen as the adversarial loss for generator G. The generator G minimizes $L_G(f, \hat{f})$ while keeping the weights of D fixed.
where δ is the Huber loss function threshold.
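As a minimal sketch of how a Huber-type loss behaves, the following assumes the standard Huber definition (quadratic for residuals up to δ, linear beyond it) averaged over the sequence. Whether the paper's $L_G$ averages or sums over timesteps is not stated here, so the averaging and the function name are assumptions.

```python
from typing import Sequence


def huber_loss(truth: Sequence[float], generated: Sequence[float], delta: float = 1.0) -> float:
    """Mean Huber loss between a ground-truth and a generated sequence.

    Standard definition per element with residual r = |f - f_hat|:
        0.5 * r**2                 if r <= delta  (MSE-like, gradient shrinks near 0)
        delta * (r - 0.5 * delta)  otherwise      (MAE-like, gradient is bounded)
    """
    total = 0.0
    for f, f_hat in zip(truth, generated):
        r = abs(f - f_hat)
        if r <= delta:
            total += 0.5 * r * r
        else:
            total += delta * (r - 0.5 * delta)
    return total / len(truth)


# One small residual (0.5) and one large residual (3.0), with delta = 1.
example = huber_loss([0.0, 0.0], [0.5, 3.0], delta=1.0)
```

With δ = 1, the small residual contributes 0.125 (quadratic branch) and the large one contributes 2.5 (linear branch), which shows how outlying intervals are penalized less aggressively than under MSE.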
Since the role of the discriminative model is just to classify the real input sequence $f(f_1, f_2, \ldots, f_S, f_{S+1})$ into 1 and the generated sequence $\hat{f}(f_1, f_2, \ldots, f_S, \hat{f}_{S+1})$ into 0, the discriminative model D aims to minimize the sigmoid cross-entropy loss function $L_{sce}$, keeping the weights of G fixed.
The generative BiLSTM and discriminative CNN are trained iteratively. During the adversarial process, the generator and discriminator are iteratively trained with minibatches of size K by summarizing the losses over every batch_size during an iteration, accelerating convergence and improving accuracy. Due to the temporal urgency of online prediction, different batch sizes are tested in this paper. The state-of-the-art optimizers for parameters, such as mini-batch gradient descent (MBGD), stochastic gradient descent (SGD), adaptive moment estimation (Adam), and momentum, have a series of problems in saddle points, time consumption, and learning rate selection. To further stabilize training, accelerate convergence, and improve generalization, an advanced optimizer, RAdam [28], is applied to train IGAN-TF by rectifying the adaptive learning rate to have a consistent variance. The iterative process is summarized below with batch_size K. $\beta_1$ and $\beta_2$ are the decay rates used to calculate the moving average and the moving 2nd moment, respectively.
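For reference, the sigmoid cross-entropy of a logit $x$ against a label $y \in \{0, 1\}$ has the standard form below. The split of the discriminator objective into a real term (label 1) and a generated term (label 0) follows the usual GAN setup and is an assumption, not a formula taken from this paper; $D(\cdot)$ denotes the logit output of the discriminative model.

```latex
L_{sce}(x, y) = -\left[\, y \log \sigma(x) + (1 - y) \log\bigl(1 - \sigma(x)\bigr) \right],
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}

L_D(f, \hat{f}) = L_{sce}\bigl(D(f), 1\bigr) + L_{sce}\bigl(D(\hat{f}), 0\bigr)
```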
Algorithm 1. Iterative mini-batch adversarial training with backpropagation and learning rate decayed by rectified Adam.
Step 1: Set the initial learning rates of the generator and the discriminative model (i.e., $\rho_G$ and $\rho_D$, respectively);
Step 2: Initialize the exponential decay moving 1st moment, $m_0$, the exponential decay moving 2nd moment, $v_0$, and the weights of the generator and the discriminative model (i.e., $W_G$ and $W_D$, respectively);
Step 3: Compute the maximum length of the approximated simple moving average (SMA) with $\lambda_\infty = 2/(1-\beta_2) - 1$; while $t = 1, \ldots, S$ and the convergence is not reached:
Step 4: Train the generator G with RAdam;
(1) Get a new sequence $(i_1, f_1), (i_2, f_2), (i_3, f_3), \ldots, (i_K, f_K)$ for the generator G;
(2) Calculate the gradients with regard to the stochastic objective at timestep t, the length of the approximated SMA, and the variance rectification term;
Step 5: Train the discriminative CNN by updating weights $W_D$ with RAdam, in the same way as for the generator;
Step 6: End the algorithm if reaching convergence.
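The RAdam update used in Steps 4 and 5 can be sketched as follows. The sketch follows the update rule published for RAdam [28] rather than the authors' implementation; the function names, the small `eps` constant, and the toy quadratic objective are illustrative assumptions.

```python
import math
from typing import List, Tuple


def sma_max_length(beta2: float) -> float:
    """Maximum length of the approximated SMA: 2 / (1 - beta2) - 1."""
    return 2.0 / (1.0 - beta2) - 1.0


def radam_step(theta: List[float], grad: List[float], m: List[float], v: List[float],
               t: int, lr: float = 0.001, beta1: float = 0.9, beta2: float = 0.999,
               eps: float = 1e-8) -> Tuple[List[float], List[float], List[float]]:
    """One RAdam update at timestep t (t starts at 1)."""
    sma_inf = sma_max_length(beta2)
    # Exponential moving 1st and 2nd moments of the gradient.
    m = [beta1 * mi + (1.0 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1.0 - beta2) * g * g for vi, g in zip(v, grad)]
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]  # bias-corrected 1st moment
    # Length of the approximated SMA at timestep t.
    sma_t = sma_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)
    if sma_t > 4.0:
        # Variance is tractable: apply the rectified adaptive learning rate.
        r_t = math.sqrt(((sma_t - 4.0) * (sma_t - 2.0) * sma_inf)
                        / ((sma_inf - 4.0) * (sma_inf - 2.0) * sma_t))
        theta = [th - lr * r_t * mh / (math.sqrt(vi / (1.0 - beta2 ** t)) + eps)
                 for th, mh, vi in zip(theta, m_hat, v)]
    else:
        # Early steps: fall back to an un-adapted momentum update.
        theta = [th - lr * mh for th, mh in zip(theta, m_hat)]
    return theta, m, v


# Toy objective (illustration only): minimize (x - 3)^2 starting from x = 0.
theta, m, v = [0.0], [0.0], [0.0]
history = []
for t in range(1, 201):
    grad = [2.0 * (theta[0] - 3.0)]
    theta, m, v = radam_step(theta, grad, m, v, t, lr=0.1)
    history.append(theta[0])
```

With $\beta_2 = 0.999$, the maximum SMA length is about 1999 and the SMA length at the first timestep is about 1, so the first few updates use the momentum fallback before the rectification term becomes active.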

Strategies for Improving IGAN-TF
Concerning the three problems mentioned in Section 2, a series of strategies are proposed to improve the performance of IGAN-TF. Traditional stationary methods use transformation, moving average, and differencing processes to eliminate the temporal periodicity and trend from the sequence, which are then complicated to capture precisely and scale back. Hence, a multi-view fusion method fuses the recent l-minute interval view and another long-term view. As the analysis of Figure 1 shows, traffic flows are more or less influenced by closeness, periodicity, and trend. Inspired by these observations, we propose a parametric-vector-based fusion method to fuse three components (i.e., periodicity, closeness, and trend). As different temporal sequences may have different periodicities and trends, the parametric-vector-based fusion method is employed to fuse the multiple temporal views between $O_{cycle}$ (cycle: daily, weekly, biweekly, monthly) and $O_{recent}$.
where the learnable parameters, $\omega_{recent}$ and $\omega_{cycle}$, adjust the impacts caused by closeness, daily periodicity, weekly periodicity, biweekly periodicity, and monthly trend, and the operator between each parameter vector and its view is element-wise multiplication. For each of the different temporal views, a series of flow vectors are fetched and concatenated to construct five inputs as follows.
where $l_r$, $l_d$, $l_w$, $l_b$, and $l_m$ are the input lengths of the recent, daily, weekly, biweekly, and monthly views, respectively; $p_d$ and $p_w$ are the daily and weekly periods; and $p_b$ and $p_m$ are the biweekly and monthly trend spans. Then, IGAN-TF can identify multiple kinds of temporal characteristics. In our approach, the complexity of the input data between $O_{recent}$ and $O_{cycle}$ is $(l_r + l_d + l_w + l_b + l_m)$, which is much less than the period considered by recurrent neural network-based methods. As time goes by, the periodicity and trend roll along with the recent closeness. Additionally, BiLSTM with timesteps and long-term memory units can also store valuable temporal information to find the periodicity and trend of flows in the time domain.
The selected indicators are listed in Table 2, representing the fluctuation direction and range of the local trend in a stable way. The median is used to express the central tendency of flows regardless of maximum and minimum. Maximum and minimum indicate the volatility, while the moving average is utilized to embody different kinds of steady local trend to be fitted. The relative strength index introduces the average increment level of fluctuation. To this end, the traffic fluctuation direction and range can be resolved in a reasonable domain. The complex and nonlinear relationship between the targeted traffic flows and these indicators is resolved by the generator and the discriminative model of IGAN-TF. It is worth mentioning that signal phase time is not calculated by multi-view fusion as the other indicators are. It is accumulated from the green time in the current phases of the lane during every interval and is regarded as an indicator of the signal influence on the number of passing flows in urban intersections. Thus, multi-source data, including traffic flows and signal timings, are introduced into the same framework. The activation functions, such as the Sigmoid function and the Tanh function, quickly propagate the error back. The ReLU function stops some neurons from ever being activated, with the gradient being zero. The ELU activation function is introduced, which combines the advantages of Sigmoid and ReLU with a linear zone to mitigate the gradient disappearance and a soft saturation zone to improve the robustness. Finally, the training process, which uses RAdam, batch normalization, and minibatch, can avoid the challenges of the local optimal solution, overfitting, and unsteady accuracy.
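The multi-view fetching and parametric-vector-based fusion described in this subsection can be sketched as follows. Everything specific here is an assumption made for illustration: the periods are expressed in 15-min intervals, a "month" is taken as four weeks, all views share the same length so that element-wise weighting is defined, and the weights are fixed constants instead of learned parameters.

```python
from typing import Dict, List

INTERVALS_PER_DAY = 96  # 15-min intervals per day

# Hypothetical period (in intervals) of each temporal view.
PERIODS: Dict[str, int] = {
    "recent": 1,
    "daily": INTERVALS_PER_DAY,
    "weekly": 7 * INTERVALS_PER_DAY,
    "biweekly": 14 * INTERVALS_PER_DAY,
    "monthly": 28 * INTERVALS_PER_DAY,  # four weeks, an assumption
}


def fetch_view(flows: List[float], t: int, length: int, period: int) -> List[float]:
    """Return [f(t - length*period), ..., f(t - 2*period), f(t - period)]."""
    return [flows[t - k * period] for k in range(length, 0, -1)]


def fuse(views: Dict[str, List[float]], weights: Dict[str, List[float]]) -> List[float]:
    """Element-wise weighted sum of the views (each weight vector matches its view)."""
    length = len(next(iter(views.values())))
    fused = [0.0] * length
    for name, view in views.items():
        for j in range(length):
            fused[j] += weights[name][j] * view[j]
    return fused


# Synthetic flow series (illustration only): flow value equals the interval index.
flows = [float(i) for i in range(6000)]
t, view_length = 5500, 2
views = {name: fetch_view(flows, t, view_length, p) for name, p in PERIODS.items()}
weights = {name: [0.2] * view_length for name in PERIODS}  # stand-in for learned vectors
fused = fuse(views, weights)
```

In the actual framework the weight vectors would be trained together with the generator, so the constant 0.2 only shows where those parameters enter the computation.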

To Maintain a Continuous Self-Learning Ability in the Time Domain
Multi-view temporal fusion is applied to learn the closeness, periodicity, and trend in the time domain. Different kinds of statistical indicators are fed to learn other properties of the fluctuation of traffic flows, including phase time, which affects the self-learning ability to adapt to the variation of signal timing. Rather than fitting the sequence with a fixed expression or using the latest small dataset for online prediction, a discriminative model is introduced to continuously judge the whole sequence during the online prediction process. Rolling time-domain training promotes the continuous self-learning ability of IGAN-TF by selecting appropriate periodicity and trend properties and satisfying the online prediction demand for accuracy and calculating speed. The training process, which uses RAdam, batch normalization, and minibatch, can adaptively control the gradients to achieve optimal solutions by fine-tuning input distribution and parameters during training, accelerating convergence for online applications.

Data Source and Analysis
Figure 4 depicts the intersection of Hongzehu Road and Qingnian Road in Suqian City, Jiangsu Province, China.The traffic flow and signal timings dataset were stochastically chosen from this intersection.The interval length of observed traffic flow was 15 min, and the time-domain was from 26 October 2016 to 9 May 2017 (a total of 18,816 intervals of 196 days).For a neural network, necessary data analysis is essential [29,30].We conducted a series of data analyses to show the different patterns of lanes.In Figure 4, there are 12 lanes in the intersection.The Talib library was used to calculate the indicators for every four neighboring intervals (representing one-hour intervals).The flow distribution probability of different lanes is shown in Figure 5. Distinct lanes presented different traffic flow data in 196 days; lanes 1, 4, 7, and 10 were left-lane, lanes 2, 5, 8, and 11 were straight-lane, lane 3 was right-lane, and lanes 6, 9, and 12 were straight right-lane.The flow in the east-west direction was much greater than that in the north-south direction because the east-west direction is the main lane for people to commute and tide happening.
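The indicator calculation over four neighboring intervals can be sketched in plain Python. This is not the TA-Lib code path used by the authors: the relative strength index below is a simple-sum variant over one window, so its values may differ from those produced by TA-Lib, and the function name and the neutral value of 50 for a flat window are assumptions.

```python
from statistics import median
from typing import Dict, List


def window_indicators(flows: List[float]) -> Dict[str, float]:
    """Indicators for one window of consecutive flows (e.g., four 15-min counts)."""
    # Interval-to-interval increases and decreases within the window.
    gains = [max(b - a, 0.0) for a, b in zip(flows, flows[1:])]
    losses = [max(a - b, 0.0) for a, b in zip(flows, flows[1:])]
    total_gain, total_loss = sum(gains), sum(losses)
    if total_gain + total_loss == 0.0:
        rsi = 50.0  # flat window: neutral value chosen by assumption
    else:
        # Equivalent to 100 - 100 / (1 + RS) with RS = average gain / average loss.
        rsi = 100.0 * total_gain / (total_gain + total_loss)
    return {
        "median": median(flows),
        "maximum": max(flows),
        "minimum": min(flows),
        "moving_average": sum(flows) / len(flows),
        "rsi": rsi,
    }


# Hypothetical one-hour window of 15-min counts (illustration only).
indicators = window_indicators([20.0, 26.0, 24.0, 30.0])
```

For this window, the flow rises by 12 vehicles in total and falls by 2, so the index is 100 × 12/14, indicating a predominantly increasing local trend.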
The augmented Dickey-Fuller (ADF) test and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test were adopted to verify the time stability of traffic flow data.The inspection chart is shown in Figure 6.The ADF test, one of the most common statistical tests, can be adapted to determine the unit roots' existence in a sequence and thus help to determine whether the sequence is stationary.It can be seen from Table 3 that the test statistics of lane 1 to lane 12 were less than the critical value, so the original hypothesis was rejected.That is to say, the linear stationarity and differential stationarity of the time series were excellent.The KPSS test results show that the test statistics of lane 1 to lane 12 were less than the critical value, so the original hypothesis was rejected.That is, the trend stability of the data time series was good.
Signal control has a vital impact on traffic flow. Thus, as in Figure 2, we fed the signal timing phase time into our model as an indicator. As shown in Figure 7, the signal timing phase times of the different lanes over a whole day were calculated for each 15-min interval and used to aid prediction.

Discussion
In this section, we use the ground-truth traffic flow datasets to evaluate IGAN-TF. Comparative experiments against state-of-the-art benchmark deep learning models and an analysis of the effects of different components were conducted to demonstrate the prediction performance of the IGAN-TF framework.

Hardware Environment and Implementation Settings
The experimental details and parameter settings of these models are as follows.

1. Hardware environment: The Python library Keras, based on TensorFlow, was adopted to construct our models. All experiments were completed on a PC server (Intel Xeon CPU E5-1650, 3.50 GHz, 64 GB memory). In this paper, the dataset of every two months was used to compute the multi-view fusion for the indicators of the generator's input data, except the signal phase time in the same view. The fused indicators were trained in a rolling scheme, with M set to one week and N set to one day. Under the rolling time-domain training scheme, the evolution process of the online prediction was repeated every day, since N was set to one day. The computing time for predicting each day was 1.5 h. Considering the modest computer used and the evening hours available for training, IGAN-TF is feasible for online applications.

2. Network architecture: The implementation details of the IGAN-TF framework are shown in Tables 4 and 5 for the architecture of the generator and the discriminative model, respectively. The parameters of BiLSTM were initialized from the normal distribution $N(0, 1)$. The learning rates of the generator and the discriminative model were $\rho_G = 0.001$ and $\rho_D = 0.02$, respectively. To avoid overfitting, early stopping and dropout layers (with a ratio of 20%) were used. The Huber loss threshold $\delta$ was set to 1. In RAdam, the initial exponential moving first moment, $m_0$, and second moment, $v_0$, were set to zero, and the decay rates $\beta_1$ and $\beta_2$ were set to 0.9 and 0.999, respectively.

The batch_size hyperparameter was initially optimized via cross-validation. As Figure 8 shows, adjusting the batch size (generally a power of two) can improve the efficiency and accuracy of the calculation while reducing the number of iterations required to run one epoch over the full dataset. Too large a batch_size is time-consuming, requires more epochs, and may lead to a locally optimal solution, while too small a batch_size may make convergence difficult. Based on the cross-validation performance on the datasets of the different lanes, the best batch_size for lanes 1, 2, 3, 5, 6, 7, 8, 9, 11, and 12 was 16, and that for lanes 4 and 10 was 32.

Figure 9, in ascending order of lane number, shows the adversarial loss of the 12 lanes based on IGAN-TF. Most of the lanes' results converged between 500 and 700 epochs, with the adversarial loss fluctuating around 0.5. The training process was more accurate, steadier, and faster than that of GAN-TF, meaning that the speed of convergence was greatly improved by RAdam and other strategies, such as the loss choice and multi-view temporal fusion.
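To make the RAdam settings concrete, the following is a minimal sketch of a single-parameter RAdam update following the published algorithm, with moments starting at zero, decay rates 0.9 and 0.999, and the generator learning rate 0.001. It is not the Keras implementation used in the experiments; the function name `radam_step`, the `eps` term, and the toy objective θ² are assumptions for illustration only.

```python
import math


def radam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One RAdam update for a scalar parameter; t is the 1-based step count."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment

    # Length of the approximated simple moving average.
    rho_inf = 2 / (1 - beta2) - 1
    rho_t = rho_inf - 2 * t * beta2 ** t / (1 - beta2 ** t)

    if rho_t > 4:
        # Variance is tractable: apply the rectified adaptive step.
        v_hat = math.sqrt(v / (1 - beta2 ** t))
        r_t = math.sqrt(((rho_t - 4) * (rho_t - 2) * rho_inf)
                        / ((rho_inf - 4) * (rho_inf - 2) * rho_t))
        theta -= lr * r_t * m_hat / (v_hat + eps)
    else:
        # Early steps: fall back to an un-adapted momentum update.
        theta -= lr * m_hat
    return theta, m, v


# Toy objective (assumption): minimize theta**2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0  # m_0 = v_0 = 0, as in the settings above
for step in range(1, 201):
    theta, m, v = radam_step(theta, 2 * theta, m, v, step)
print(theta)
```

The rectification term keeps the adaptive step small during the first iterations, when the second-moment estimate is still unreliable, which is the property usually credited for RAdam's steadier early training.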

Benchmark Methods
IGAN-TF was compared with the following four benchmarks:
1. ARIMA: The autoregressive integrated moving average model is mainly used in time series analysis, such as the analysis of traffic flows, whose data are non-stationary. To eliminate the non-stationarity, ARIMA can apply an initial differencing step one or more times.
2. BiLSTM: Bidirectional long short-term memory networks use a finite sequence to estimate each element of the traffic flows based on past and future flows. This is done by concatenating the outputs of two LSTM networks, one processing the flow sequence from left to right and the other from right to left. The input data (indicators and multiple temporal views) and the RAdam optimizer of BiLSTM are the same as for IGAN-TF. The loss of BiLSTM is the Huber loss.
3. GAN-TF: Compared with IGAN-TF, generative adversarial networks for traffic flow use the indicators of the median, maximum, minimum, moving average, and signal timing phase time in neighbor intervals, without multi-view temporal fusion in the data preparation. The loss of the generator is the Huber loss, and the training optimizer is RAdam.
4. GAN-NST: Generative adversarial networks for non-signal traffic do not use the signal timing phase time as input data.
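The differencing step mentioned for ARIMA can be sketched in a few lines. This is a generic illustration of repeated first differencing (the "I" in ARIMA), not the authors' ARIMA configuration; the helper name `difference` and the sample series are hypothetical.

```python
def difference(series, order=1):
    """Apply first differencing `order` times to a list of numbers."""
    for _ in range(order):
        series = [later - earlier for earlier, later in zip(series, series[1:])]
    return series


# Hypothetical flow counts with an accelerating upward trend.
trend = [3, 5, 8, 12, 17]
print(difference(trend, order=1))  # [2, 3, 4, 5]
print(difference(trend, order=2))  # [1, 1, 1]
```

Each pass shortens the series by one element; here the second difference is constant, which is the kind of series the autoregressive and moving-average terms are then fitted to.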

Evaluation Index
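To evaluate the different prediction methods, we employed the root mean square error (RMSE) as the evaluation index. Given ground-truth values $f_t$ and predicted values $\hat{f}_t$ over $T$ intervals, the RMSE was determined as below:

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(f_t - \hat{f}_t\right)^2}$$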
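As a small illustration of the evaluation index, the sketch below computes the RMSE in its standard form, the square root of the mean squared difference between ground truth and prediction. The function name `rmse` and the sample values are hypothetical, and any normalization the paper may apply is not reproduced here.

```python
import math


def rmse(truth, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(f - p) ** 2 for f, p in zip(truth, predicted)]
    return math.sqrt(sum(squared_errors) / len(truth))


# Hypothetical ground-truth and predicted flows for three intervals.
print(rmse([10, 12, 14], [11, 12, 12]))
```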

Comprehensive Results
Without multi-view temporal fusion and different indicators, GAN-TF performed worse than IGAN-TF. In addition, GAN-NST was worse than BiLSTM and IGAN-TF because it ignored the influence of signal timing phase time as multi-source input data alongside traffic flows. The RMSEs of ARIMA, BiLSTM, GAN-TF, GAN-NST, and IGAN-TF for all lanes on all days are shown in Table 6. Except for lanes 3 and 12, the average RMSE of IGAN-TF was clearly better than those of ARIMA, BiLSTM, GAN-TF, and GAN-NST. If no signal timing information is added, only the RMSEs of lanes 3, 4, 6, 7, 8, 9, and 11 are better than performance of lane 12 and the increase of accuracy in peak hours and straight-lanes illustrate that applying IGAN-TF can reach a better prediction performance in predicting traffic flows of high and stable volumes.

Effects of Different Components
Different time-domain components influence traffic flows to varying degrees. Different temporal sequences may have different periodicities and trends. Multiple temporal views and different prediction steps were employed to examine their effects on traffic flow prediction.

Multiple Temporal Views in Fusion
We compared the results of different experiments with multiple temporal views for the test content, including by month, bi-week, week, and day. As shown in Figure 11, if traffic flow was predicted with single-day traffic data, the results were relatively poor. As the temporal view span increased, the results improved significantly. The RMSE decreased by different amounts in the 12 lanes of the intersection, indicating that the periodicity and similarity of the time series are essential features of the traffic flow pattern.
We formed an input sequence for the signal timing by randomly adding a fixed effective green time to each set of input information. The temporal information input every day is the same, so the periodicity of the time series can be maintained. In comparison, the model with fixed sequence input (for example, signal timing information added to the front of the daily flow data) was slightly better than the model without signal timing, but the difference was not large. With random sequence input (that is, random insertion of timing information), the effect was greatly improved with the flow data of one day.

Different Predicting Steps
In a further analysis, we experimented with different time intervals for the traffic flow data. Five different intervals (0.25 h, 0.5 h, 1 h, 2 h, and 3 h) were used as the target sequence to train the model, as shown in Figure 12. As the time interval increased, the RMSE increased slightly in the 12 lanes. IGAN-TF was optimal in all lanes across the time intervals. The performance of BiLSTM was also noteworthy: although it was not optimal in most lanes, it remained fairly stable. ARIMA became more unstable as the time interval increased. The improvement in overall accuracy and stability was evident, indicating that IGAN-TF is well suited to traffic flow forecasting.
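One way to build target sequences at the longer intervals is to aggregate the 15-min observations. The sketch below does this by summing consecutive counts; summation, the helper name `aggregate`, and the sample data are assumptions, since the paper does not state how the longer intervals were formed.

```python
def aggregate(flows, factor):
    """Sum consecutive 15-min counts in groups of `factor`.

    A trailing incomplete group is dropped.
    """
    usable = len(flows) - len(flows) % factor
    return [sum(flows[i:i + factor]) for i in range(0, usable, factor)]


# Hypothetical counts for twelve 15-min intervals (three hours).
quarter_hour_flows = list(range(1, 13))

# Interval lengths from the experiment, expressed in hours.
for hours in (0.25, 0.5, 1, 2, 3):
    factor = int(round(hours / 0.25))
    print(hours, aggregate(quarter_hour_flows, factor))
```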

Different Loss Selection
To test the accuracy of long-term prediction of IGAN-TF and the benchmarks, we compared the RMSE of the 12 lanes of the intersection under different loss functions. Table 9 shows that the Huber loss yielded better traffic flow time series prediction than the other losses and was more robust to outliers.
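The robustness of the Huber loss to outliers can be seen in a minimal sketch that compares it with half the squared error, using the threshold δ = 1 from the experimental settings. The function names and the residual values are hypothetical.

```python
def huber(residual, delta=1.0):
    """Huber loss: quadratic within `delta`, linear beyond it."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * residual * residual
    return delta * (magnitude - 0.5 * delta)


def half_squared(residual):
    """Half squared error, which matches Huber inside the threshold."""
    return 0.5 * residual * residual


# Hypothetical prediction residuals; the last one is an outlier.
residuals = [0.2, -0.4, 10.0]
huber_total = sum(huber(r) for r in residuals)
squared_total = sum(half_squared(r) for r in residuals)
print(huber_total, squared_total)
```

The two losses agree on the small residuals, but the outlier contributes 9.5 under the Huber loss versus 50 under the squared loss, so a single abnormal interval pulls the gradient far less.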

Effects of Different Intersections
Figure 13 shows three consecutive real intersections in Suqian City, Jiangsu Province, China: Hongzehu Road and Qingnian Road (Figure 13a), which is the same intersection as in Figure 4; Huanghe and Hongzehu Road (Figure 13b), which includes four left-lanes, six straight-lanes, two right-lanes, and two straight right-lanes; and Hongzehu Road and Xingfu Road (Figure 13c), which includes four left-lanes, six straight-lanes, and four straight right-lanes. According to the RMSE results of the IGAN-TF prediction model in Figure 14, the overall prediction accuracy was excellent, and the difference between lanes across the three intersections was not particularly large. Because Huanghe and Hongzehu Road has many right-lanes, the corresponding RMSE was negligible. The other two intersections have straight right-lanes, and their RMSE was larger. Moreover, the right lanes of Hongzehu Road and Xingfu Road also carry straight-through traffic, which led to a greater RMSE in the mixed traffic flow mode. The average accuracy of the three intersections was 0.088, which shows the scalability and usability of the proposed improved generative adversarial network framework for traffic flow.

Conclusions
Based on the deep learning models LSTM and CNN, this paper proposes IGAN-TF, an improved GAN-TF framework with a BiLSTM generator and a CNN discriminator, to predict traffic flows in lanes during fixed intervals. The dataset combines indicators of current green time aggregated from the phases in the signal timings. The average, minimum, and maximum values of neighbor intervals were applied to generate the input of the CNN discriminator to compete with the true flow sequence. Different loss functions were defined to train and test IGAN-TF with the RAdam optimizer and a rolling time-domain strategy. We measured the prediction results of IGAN-TF by RMSE and compared them with those of ARIMA, BiLSTM, GAN-TF, and GAN-NST. The results for the morning and evening peak periods and for the left-turn, straight, and right-turn directions showed the accuracy and stability of IGAN-TF under different time and traffic conditions, as well as its potential for further exploration.
The proposed IGAN-TF and its application suggest a new way to resolve traffic prediction issues. IGAN-TF provides an open framework that combines the advantages of different sub-models and offers a suitable platform for them to compete with each other. As it was initially designed for the machine learning environment, big data scenarios (e.g., traffic prediction and traffic state estimation) can be addressed with IGAN-TF. However, the application efficiency of IGAN-TF on datasets with various distributions and statistical properties is worth further study. The indicators and their input methods, which influence the data variation most, significantly improve IGAN-TF. As IGAN-TF has a flexible framework, the cooperation of different approaches and strategies is also an efficient way to optimize the accuracy of IGAN-TF.

Figure 1. Different influences of factors on time-varying traffic flows. (a) Daily traffic pattern with periodicity and holiday long-term influence. (b) Special case with closing local trend. (c) Factors with long-term trend.

3.3.1. To Resolve the Periodicity and Trend of Flows in the Time Domain

In terms of cross-validation, the training and testing datasets follow the rolling time-domain scheme shown in Figure 3 to capture the temporal periodicity and trend of the dataset. In the first cycle, we select the first M days as the training set and the next N days as the testing set; in the second cycle, days N + 1 to M + N are chosen to form the training set, and days M + N + 1 to M + 2N are selected for testing. The process is repeated until all the data have been used in the experiment.
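The rolling time-domain scheme can be sketched as a generator of day-index splits. The example uses M = 7 training days and N = 1 testing day, matching the one-week and one-day settings reported in the experiments; the function name `rolling_splits` and the ten-day horizon are hypothetical.

```python
def rolling_splits(n_days, m_train, n_test):
    """Yield (train_days, test_days) as lists of 1-based day numbers.

    Each cycle shifts the window forward by `n_test` days, so cycle k trains
    on days (k-1)*N + 1 .. (k-1)*N + M and tests on the following N days.
    """
    start = 1
    while start + m_train + n_test - 1 <= n_days:
        train = list(range(start, start + m_train))
        test = list(range(start + m_train, start + m_train + n_test))
        yield train, test
        start += n_test


# Ten days of data with M = 7 and N = 1 gives three cycles.
for train_days, test_days in rolling_splits(n_days=10, m_train=7, n_test=1):
    print(train_days, test_days)
```

In the second cycle the training set covers days 2 to 8 (N + 1 to M + N) and the test set is day 9 (M + N + 1 to M + 2N), which agrees with the description above.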

Figure 3. Rolling time-domain training and testing.

3.3.2. To Describe the Local Trend Raised by Special Cases with Steady Accuracy

To enable a steady local trend for IGAN-TF, key indicators are selected to map the multi-view fusion local trend, that is, groups of local statistical information from every five temporal neighbors. In this paper, five indicators are chosen, as shown in Table

Figure 4. Signalized intersection layout with a group of lanes.

Figure 5. Distribution probability of different lanes.

Figure 7. Signal timing phase time of different lanes in a whole day.

Figure 9. GAN-TF predicted loss of each lane flow.

Based on the comparison results of the above benchmarks with IGAN-TF, ARIMA represents the prediction performance of traditional statistical methods; BiLSTM, with the same input and optimizer, shows the online feedback effect of the discriminative model; GAN-TF shows the contribution of multi-source indicators to stability and accuracy; and GAN-NST shows the effect of considering the impact of signal phase time on prediction.

Figure 10 presents a comparison between the predicted and ground-truth traffic flows of each lane by the different GAN-based methods for 15 May 2017, based on traffic flows from 8 May 2017 to 14 May 2017. GAN-NST performed the worst in general because it produced many outliers in each lane, showing unsteady performance when signal data are not used as an indicator. It is apparent that GAN-TF and GAN-NST are worse than IGAN-TF. The difference between GAN-TF and IGAN-TF is small because GAN-TF produced only a few outliers. More specific details should be derived from the tables of the average evaluation performance. As the fusion of IGAN-TF is temporal multi-view, the reason why IGAN-TF outperformed GAN-TF might lie in the advantage of the discriminative model.


Figure 10. Comparison of predicted flows by models and ground truth of each lane.

Figure 11. RMSE (percentage) of different groups of temporal views.
S+1 (i1, i2, ..., S): Multi-dimensional indicators of temporal sequence S to generate target traffic flow at the interval S+1
$\hat{f}_{S+1}$: Predicted target traffic flow at the interval S+1, generated from outputs of Generator G
$f_t$: Traffic flow ground truth at the interval t

Table 2. Multi-dimensional indicators for prediction.

Table 3. Result of time series stationary test.

Table 4. Network architecture of generator.

Table 5. Network architecture of discriminative model.

Table 9. Loss comparison of RMSE.