Dynamic Estimation of Saturation Flow Rate at Information-Rich Signalized Intersections

Abstract: Intersections are the bottlenecks of the road network, and the capacity of signalized intersections restricts the operation of the whole network. Dynamic estimation of capacity is necessary for the refined management of signalized intersections. With the development of technology, more and more detectors have been installed near intersections, creating an information-rich environment that supports dynamic estimation of capacity. A dynamic estimation method for the saturation flow rate based on a neural network was developed. It captures the dynamic change of saturation flow rates and their influencing factors. Measured data from three scenarios (through lanes, shared right-turn and through lanes, and shared left-turn and through lanes) at signalized intersections in Beijing were taken as examples to validate the proposed method. Firstly, the traffic flow characteristics of the three scenarios and the factors affecting the saturation flow rate were analyzed. Secondly, neural network models of the three scenarios were established. Then the hyperparameters of the neural network models were determined. After training, the neural network structure and parameters were saved. Lastly, the trained models were validated on the test set data. At the same time, the proposed method was compared with the Highway Capacity Manual (HCM) method and the statistical regression method. The results show that both regression models and neural network models have better accuracy than HCM models. In a simple scenario, the neural network models are not much different from the regression models. As the complexity of the scenarios increases, the advantages of the neural network models become apparent. In through-left lane and through-right lane scenarios, the estimated saturation flow rates used by the proposed method were 7.02% and 4.70%, respectively. In complex traffic scenarios, the proposed method can estimate the saturation flow rate accurately and in a timely manner.
The results could be used to refine the optimization of signal control schemes and the operation management of signalized intersections.

It was found that the changes in the HCM-adjusted saturation flow rates differ from the measured ones. In scenarios with narrow lanes and a high percentage of heavy vehicles, the measured saturation flow rate decreased rapidly. This showed that there was an interaction between the lane width and the percentage of heavy vehicles.


Introduction
With rapid urbanization and growing mobility, the number of vehicles in developing countries has increased year by year. Traffic jams have become frequent in many cities. In Beijing, the transport annual report stated that the average daily congestion duration was 2 h and 50 min in 2018 [1]. Intersections are the bottlenecks of the road network, where multiple streams of traffic and people gather [2]. Traffic congestion usually occurs at intersections. There are many causes of congestion, and the primary cause is an imbalance between supply and demand. Engineers often improve the

Information 2020, 11, 178

the saturation flow rate of a signalized intersection is the sum of the saturation flow rates for each lane group of each approach. Two conventional methods for determining the saturation flow rate are proposed in the U.S. Highway Capacity Manual [3]. One is the adjustment method. The other is the measurement technique. In the adjustment method, the computed saturation flow rate is referred to as the "adjusted" saturation flow rate because it reflects the application of various factors that adjust the base saturation flow rate to the specific conditions present on the subject intersection approach. Equation (1) is used to compute the adjusted saturation flow rate per lane for the subject lane group.

$$s = s_0 \, f_w \, f_{HV} \, f_g \, f_p \, f_{bb} \, f_a \, f_{LU} \, f_{LT} \, f_{RT} \, f_{Lpb} \, f_{Rpb} \tag{1}$$
where $s_0$ is the base saturation flow rate (pc/h/ln); $f_w$ is the adjustment factor for lane width; $f_{HV}$ is the adjustment factor for heavy vehicles in the traffic stream; $f_g$ is the adjustment factor for the approach grade; $f_p$ is the adjustment factor for the existence of a parking lane and parking activity adjacent to the lane group; $f_{bb}$ is the adjustment factor for the blocking effect of local buses that stop within the intersection area; $f_a$ is the adjustment factor for area type; $f_{LU}$ is the adjustment factor for lane utilization; $f_{LT}$ is the adjustment factor for left-turn vehicle presence in a lane group; $f_{RT}$ is the adjustment factor for right-turn vehicle presence in a lane group; $f_{Lpb}$ is the pedestrian adjustment factor for left-turn groups; $f_{Rpb}$ is the pedestrian-bicycle adjustment factor for right-turn groups. In general, the adjustment method is used to estimate the saturation flow rate at planned intersections.
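The adjustment method described above amounts to multiplying a base rate by a set of factors. The following is a minimal sketch of that arithmetic, not the HCM procedure itself: the function name `adjusted_saturation_flow`, the dictionary layout, and the numeric factor values are illustrative assumptions, and in practice each factor would be looked up or computed from the HCM tables.

```python
from math import prod

def adjusted_saturation_flow(base_rate, factors):
    """Return the adjusted saturation flow rate per lane.

    base_rate: base saturation flow rate s0 in pc/h/ln.
    factors:   mapping from a factor name (e.g. "f_w", "f_HV") to its value.
               A factor that is left out is treated as 1.0 (no adjustment).
    """
    return base_rate * prod(factors.values())

# Illustrative values only; they are not taken from the paper or the HCM.
example_factors = {"f_w": 1.0, "f_HV": 0.95, "f_LT": 0.9}
print(round(adjusted_saturation_flow(1900.0, example_factors), 1))
```

Keeping the factors in a mapping makes it easy to see which adjustments were applied to a given lane group and to add local factors without changing the function.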
In the measurement technique, data is taken cycle by cycle. In general, vehicles are recorded when their front axles cross the stop line. Saturation flow rate is calculated only from the data recorded after the fourth to sixth vehicle in the queue passes the stop line. To reduce the data for each cycle, the time recorded for the fourth vehicle is subtracted from the time recorded for the last vehicle in the queue. This value represents the sum of the headways for the fifth through n-th vehicle, where n is the number of the last vehicle surveyed (which may not be the last vehicle in the queue). This sum is divided by the number of headways after the fourth vehicle (i.e., divided by (n − 4)) to obtain the average headway per vehicle under saturation flow. The saturation flow rate is 3600 divided by this average headway as expressed in Equation (2). In general, the measurement technique is used to obtain a saturation flow rate at operating intersections.
$$s = \frac{3600}{h_t} \tag{2}$$
where $h_t$ is the average saturation headway (s).
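The measurement technique can be sketched for a single cycle as follows. This is a hedged illustration of the procedure described in the text (elapsed time from the fourth vehicle to the last surveyed vehicle, divided by the number of headways, then 3600 divided by that average); the function name, the argument layout and the sample crossing times are assumptions made for the example.

```python
def saturation_flow_from_crossings(crossing_times, first_counted=4):
    """Estimate the saturation flow rate (veh/h) for one signal cycle.

    crossing_times: stop-line crossing times in seconds, in queue order
                    (index 0 is the first queued vehicle).
    first_counted:  queue position whose crossing time starts the
                    measurement (the fourth vehicle in the text).
    """
    n = len(crossing_times)
    if n <= first_counted:
        raise ValueError("not enough queued vehicles to measure saturation flow")
    # Sum of the headways of vehicles (first_counted + 1) .. n.
    elapsed = crossing_times[n - 1] - crossing_times[first_counted - 1]
    mean_headway = elapsed / (n - first_counted)
    return 3600.0 / mean_headway

# Illustrative crossing times for eight queued vehicles.
times = [3.0, 5.5, 7.8, 10.0, 12.0, 14.0, 16.0, 18.0]
print(saturation_flow_from_crossings(times))
```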

Improved Adjustment Methods
In order to improve the local applicability of the HCM adjustment method, researchers have focused on adjustment factors. In China, owing to limitations on land use, many intersections are irregular crossings where the approach and exit lanes are offset or two roads cross at oblique angles. Guideline markings are therefore set in signalized intersections. However, this special factor is not taken into account when modifying the base saturation flow rate in the HCM. Qin et al. [4] proposed a new adjustment factor for guideline markings, with which the saturation flow rate could be estimated accurately at intersections with guideline markings. It was found that painting guideline markings could improve the saturation flow rate at signalized intersections. In most developing countries, such as India, two-wheelers are the major mode of personal transportation. Therefore, their effect on the saturation flow at signalized intersections could be substantial, and it is not possible to use the U.S. Highway Capacity Manual model directly. Anusha et al. [5] proposed a new adjustment factor for two-wheelers and incorporated the effect of two-wheelers on the saturation flow rate into a previous model. It was found that the saturation flow rate estimated using the modified model was closer to observed values. Due to differences in roadway conditions, driver behavior, culture, etc., the parameter values of the HCM are not directly applicable in China. Shao et al. [6] collected data from 18 cities. Then, a base saturation flow rate and adjustment factors for lane width and approach grade were suggested. The adjustment factor for turn radius and PCE (passenger car equivalents) were developed. Wang et al. [7] and Lewis et al. [8] found that there was an interaction effect between heavy vehicles and lane width. Wang et al. [7] proposed a comprehensive adjustment factor that considers the interaction. It effectively improved the estimation accuracy.
The above methods preserve the structure of the HCM model. They directly express the influencing factors and the level of their effect on the saturation flow at signalized intersections. As such, the relationship between each factor and the saturation flow rate needs to be surveyed. With respect to geometric factors, the intersection is usually not rebuilt, so engineers need to collect the data only once in a period. With respect to traffic factors, the traffic composition, driver behavior, and pedestrian and bicycle interference differ in each cycle. Therefore, a large number of engineers are required for data collection, resulting in a high economic cost.

Estimated Departure Headway Methods
In general, departure headways at signalized intersections are defined as the time intervals between successive vehicles in a queue passing a stop line or a reference line at the intersection. Many researchers have focused on the recognition of saturation flow and achieved automatic extraction. Yang et al. [9] proposed a dynamic extraction method for the saturation headway using an induction coil detector. The average saturation headways of historical and current cycles were analyzed with an exponential smoothing method. Compared with the traditional methods, the proposed method can be realized without additional cost and can meet the demand of dynamic extraction. Wang et al. [10] proposed an automatic estimation method for the saturation headway based on video detector data. The Dickey-Fuller test was used to verify whether the headways in the time series were saturation headways. An iterative method using quantiles was proposed to filter out abnormal data. A quantile of 80% and a data duration of no less than 150 min were suggested. In addition, Radhakrishana et al. [11] analyzed the factors affecting discharge headway under heterogeneous traffic conditions and proposed a novel method to measure headways. It was found that discharge headway values varied and were different from the homogeneous traffic scenario, where the headway tends to follow a constant value after the initial four or five vehicles. Models for computing discharge headway were developed using linear regression and linear mixed regression. In contrast, Tong et al. [12] proposed a neural network approach to estimate the queued vehicle discharge headway.
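The exponential smoothing of historical and current-cycle headways mentioned above can be illustrated with a generic simple exponential smoother. This sketch is not the formulation of Yang et al. [9], whose details are not given here; the function name and the smoothing weight `alpha` are assumptions chosen only for illustration.

```python
def smooth_headways(cycle_headways, alpha=0.3):
    """Exponentially smooth per-cycle average saturation headways (s).

    cycle_headways: average saturation headway of each cycle, oldest first.
    alpha:          hypothetical weight given to the newest cycle (0..1).
    The first cycle initializes the smoothed series.
    """
    if not cycle_headways:
        return []
    smoothed = [cycle_headways[0]]
    for h in cycle_headways[1:]:
        # New estimate = weighted mix of the current cycle and the history.
        smoothed.append(alpha * h + (1.0 - alpha) * smoothed[-1])
    return smoothed

# Illustrative per-cycle averages; the third cycle is an outlier.
print(smooth_headways([2.0, 2.1, 2.8, 2.0, 2.1]))
```

A small `alpha` damps cycle-to-cycle noise at the cost of reacting more slowly to genuine changes in traffic conditions.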
The above methods extract headways from historical or current-cycle data, so it is difficult to use them to predict the saturation headways in the next cycle. Due to the limitations of data collection equipment, many influencing factors are difficult to consider in real time. In recent years, video detectors have been widely installed near signalized intersections. The video detectors can automatically record the time at which each vehicle crosses the stop line, the vehicle speed and other information (such as pedestrians and bicycles). The roadside units (RSUs) placed at intersections can also record the vehicle type, location, trajectory and other information [17][18][19]. In the connected vehicle environment, the status information of each vehicle can also be collected [20]. The two types of data complement each other. We can obtain the queuing situation and traffic composition from connected vehicle data. In an information-rich environment, automatic estimation of the saturation flow rate at all intersections in a city can be achieved without a manual survey, resulting in low economic costs. In China, the government promotes the installation of electronic toll collection (ETC) equipment in all vehicles [21]. At present, the number of ETC users has exceeded 180 million. In Chongqing, all vehicles are required to carry radio frequency identification (RFID) license plates [22]. At the same time, with the development of new transportation modes such as mobility as a service (MaaS), vehicle information will become more open. According to Accenture's report [23], in 2030 the automotive industry will shift from marketing to transportation services. These new technologies will support the prediction of saturation headways.

Statistical and Physical Methods
Some researchers analyze the saturation flow rate from a new perspective, developing statistical models of the influencing factors and the saturation flow rate, or establishing physical models of microscopic traffic characteristics. Saha et al. [13] proposed a new saturation flow model that can accurately estimate the saturation flow rate under Indian traffic conditions. This model takes four forms: universal Kriging, pseudo-likelihood Kriging, blind Kriging, and co-Kriging based saturation flow models. The performance of the new model is better than that of the conventional model. Shao et al. [14] studied the randomness of the saturation flow rate with a statistical method. It was found that the saturation flow rate calculated from the average value of saturation headways may be low. A new method for estimating saturation flow rates was developed based on the median value of queue discharge headways. Murat et al. [15] built a new model for saturation flow based on driver behavior and vehicle characteristics. The vehicle length, vehicle acceleration, speed of vehicles crossing the intersection and reaction time of vehicles in the queue are considered. There is no significant difference between the new model and measured data. Reaction time is the major factor in saturation flow. Vehicle length is also an important factor. Hossain [16] developed a regression model to estimate the saturation flow rate. It was found that the saturation flow rates were dependent on the lane width, the percentage of turning vehicles, and the composition of vehicles. Chen et al. [24] proposed a four-stage saturation departure model based on an empirical analysis of discharge headways in shared lanes. The model reflects the stochastic nature of vehicle-pedestrian conflict and constructs the logical relationships between shared right-turn saturation flow and its influencing factors.
The above models were developed under specific traffic conditions. They can describe the relationship between saturation flow and influencing factors. However, these models are relatively complex and are difficult to use in engineering applications. Among these models, the performance of the regression models is acceptable in simple traffic scenarios with few influencing factors. However, in complex traffic scenarios, the regression models are difficult to fit and the accuracy of the estimated saturation flow rate decreases significantly.
In summary, previous studies provided approaches for estimating saturation flow from different perspectives. However, the previous models usually used historical data over a period to estimate the saturation flow rate. They did not predict the saturation flow cycle by cycle. In addition, under complex traffic scenarios, the accuracy of the traditional methods for estimating saturation flow rates is unsatisfactory. In this study, we combined the advantages of the adjustment method and the field measurement method and proposed a neural network model that can predict the saturation flow rate cycle by cycle. The results of this study can improve the accuracy of saturation flow rate estimation, support dynamic optimization of signal control schemes, and improve the level of traffic management.
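The observation attributed to Shao et al. [14], that an estimate based on the average headway may be low, can be illustrated with a small numeric sketch. The headway values below are invented for illustration, and the function is a generic comparison of mean-based and median-based estimates, not the method of [14].

```python
from statistics import mean, median

def flow_from_headways(headways, use_median=True):
    """Saturation flow rate (veh/h) from queue discharge headways (s)."""
    typical = median(headways) if use_median else mean(headways)
    return 3600.0 / typical

# Illustrative headways with one slow vehicle stretching the right tail.
headways = [1.9, 2.0, 2.0, 2.1, 3.5]
print(flow_from_headways(headways, use_median=True))
print(flow_from_headways(headways, use_median=False))
```

Because a single long headway pulls the mean upward but leaves the median almost unchanged, the mean-based flow rate comes out lower than the median-based one for right-skewed samples such as this.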

Conventional Method
In this paper, we use two traditional methods to estimate the saturation flow rate for comparison with the proposed neural network method: the adjustment model and the multiple linear regression model. In the adjustment model, each adjustment factor expresses the relationship between an influencing factor and the saturation flow rate. The adjustment factors are divided into dynamic factors and static factors. The static factors are geometric factors, such as lane width, which differ between intersections. The dynamic factors are traffic factors, such as heavy vehicles in the traffic stream. The adjustment factors in the HCM are mainly applicable to American traffic conditions. In China, a Chinese National Standard (Code for the Planning of Intersections on Urban Roads, Standard Number GB50647-2011) [25]
where $S_t$ is the adjusted saturation flow rate in the through lane; $S_l$ is the adjusted saturation flow rate in the left-turn lane; $S_r$ is the adjusted saturation flow rate in the right-turn lane; $S_{bt}$ is the base saturation flow rate in the through lane; $S_{bl}$ is the base saturation flow rate in the left-turn lane; $S_{br}$ is the base saturation flow rate in the right-turn lane; $f_t$ is the adjustment factor for lane width; $f_g$ is the adjustment factor for approach grade and heavy vehicles in the traffic stream; $f_z$ is the adjustment factor for turning radius.

Neural Network Method
Since 1990, neural networks have been widely used in transportation [26]. Neural networks have two advantages that make them suitable for describing complex traffic flow. One advantage is that a neural network can handle nonlinear systems of arbitrary complexity. The other is that it does not require any prior knowledge for model building. For example, in regression models, the basic relationships (linear, polynomial, exponential, etc.) need to be determined before model building. Therefore, neural networks are more flexible than statistical methods. The basic elements of a neural network are nodes, which are arranged in a layered structure. Nodes are connected to each other through links. Each link is associated with a weight. Each node may receive many inputs from other nodes and applies an activation function to its input to determine its output signal. The sigmoid function (see Equation (6)), which is one of the typical activation functions, was chosen in this study.

$$f(x) = \frac{1}{1 + e^{-x}} \tag{6}$$

A typical fully connected three-layered neural net is shown in Figure 1. The first layer is the input layer that contains the input units $x_i$ ($n$ nodes). The inputs are linked to the nodes in the hidden layer $z_j$ ($p$ nodes) with associated weights $\omega_{ij}$. The nodes in the hidden layer are also linked to the output nodes $y_k$ with associated weights $v_{jk}$. The output units and hidden units may also have bias units. The terms $b_j$ and $bb_k$ are the biases of the hidden layer and output layer, respectively. The neural net can be trained by adjusting the weights using input training patterns.
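The three-layer structure described above can be sketched as a forward pass in plain Python. The variable names follow the symbols in the text ($x_i$, $\omega_{ij}$, $b_j$, $v_{jk}$, $bb_k$); applying the sigmoid at the output layer as well as the hidden layer is an assumption of this sketch, as is the use of nested lists for the weight matrices.

```python
import math

def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w, b, v, bb):
    """Forward pass of a fully connected three-layer network.

    x:  list of n inputs x_i
    w:  w[i][j] is the weight from input i to hidden node j
    b:  b[j] is the bias of hidden node j
    v:  v[j][k] is the weight from hidden node j to output node k
    bb: bb[k] is the bias of output node k
    Returns (z, y): hidden-layer and output-layer activations.
    """
    z = [sigmoid(b[j] + sum(x[i] * w[i][j] for i in range(len(x))))
         for j in range(len(b))]
    y = [sigmoid(bb[k] + sum(z[j] * v[j][k] for j in range(len(z))))
         for k in range(len(bb))]
    return z, y

# Two inputs, two hidden nodes, one output; weights are illustrative.
z, y = forward([0.2, 0.7],
               w=[[0.1, -0.3], [0.4, 0.2]], b=[0.0, 0.1],
               v=[[0.5], [-0.6]], bb=[0.05])
print(z, y)
```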

Currently, there are many available training algorithms. The backpropagation rule, one of the most widely used training algorithms for prediction problems, was selected in this study. The principle of this rule is to minimize the total output error with respect to the weights. The network adjusts the weights in the direction that reduces the error. The training rules are as follows [12]:
Step 1: Select a learning rate $\alpha$ and initialize the weights $\omega_{ij}$ and $v_{jk}$ using random values.
Step 2: Present the input data and compute the layers' output by Equations (7) and (8).
Step 3: Compute the weight correction terms of output and hidden units by Equations (9) and (10).
Step 4: If all the input data patterns are trained, go to step 5, otherwise, go to step 2.
Step 5: Sum all the weight correction terms and then add them to the corresponding old weights.
Step 6: Repeat steps 2-5 until a sufficiently small output error has been obtained.
Each iteration from steps 2-6 is called an epoch. Usually, many epochs are required before training is completed. The number of hidden layer nodes, the activation function, the learning rate, and the gradient descent function affect the speed and accuracy of the training process. The number of nodes in the output layer is one; it is the measured saturation flow rate. The number of nodes in the hidden layer is based on Equation (11).
where $n$ is the number of input layer nodes; $m$ is the number of output layer nodes; and $a$ is a constant (ranging from 1 to 10). This paper determines the range of the number of hidden layer nodes according to Equation (11).
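The training steps above can be sketched as batch backpropagation for a network with $n$ inputs, $p$ hidden nodes and one output. Several points are assumptions of this sketch rather than statements from the paper: the correction terms are the standard gradient-descent deltas for a squared-error loss with sigmoid units, the initial weight range, learning rate and epoch count are arbitrary, and `hidden_node_candidates` assumes that Equation (11) is the common empirical rule $\sqrt{n + m} + a$ with $a = 1, \dots, 10$.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_node_candidates(n, m):
    """Candidate hidden-layer sizes, ASSUMING the rule sqrt(n + m) + a."""
    base = math.sqrt(n + m)
    return [round(base + a) for a in range(1, 11)]

def predict(x, w, b, v, bb):
    """Forward pass (Step 2) for an n-p-1 network; returns (z, y)."""
    p = len(b)
    z = [sigmoid(b[j] + sum(x[i] * w[i][j] for i in range(len(x))))
         for j in range(p)]
    y = sigmoid(bb + sum(z[j] * v[j] for j in range(p)))
    return z, y

def train(samples, p, alpha=0.5, epochs=1000, seed=0):
    """Batch backpropagation; targets must be scaled into (0, 1)."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    # Step 1: random initial weights and biases (range is an assumption).
    w = [[rng.uniform(-0.5, 0.5) for _ in range(p)] for _ in range(n)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(p)]
    v = [rng.uniform(-0.5, 0.5) for _ in range(p)]
    bb = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):              # Step 6: repeat for many epochs
        dw = [[0.0] * p for _ in range(n)]
        db, dv, dbb = [0.0] * p, [0.0] * p, 0.0
        for x, t in samples:             # Step 4: visit every pattern
            z, y = predict(x, w, b, v, bb)
            # Step 3: correction terms for output and hidden units.
            delta_out = (t - y) * y * (1.0 - y)
            dbb += alpha * delta_out
            for j in range(p):
                delta_hid = delta_out * v[j] * z[j] * (1.0 - z[j])
                dv[j] += alpha * delta_out * z[j]
                db[j] += alpha * delta_hid
                for i in range(n):
                    dw[i][j] += alpha * delta_hid * x[i]
        # Step 5: add the summed corrections to the old weights.
        bb += dbb
        for j in range(p):
            v[j] += dv[j]
            b[j] += db[j]
            for i in range(n):
                w[i][j] += dw[i][j]
    return w, b, v, bb

print(hidden_node_candidates(3, 1))
```

In use, the saturation flow rates would first be scaled into the (0, 1) range of the sigmoid output and scaled back after prediction; that scaling step is also an assumption of this sketch.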

Performance Evaluation
In order to validate the trained neural network, the data set had to be divided into two subsets, namely the training and test sets. The training set was used to train the network and to check whether the network was over-fitted. Several types of prediction error are available to assess the performance of a neural network, such as the mean squared error and the mean absolute percentage error [12]. In the present study, the mean absolute percentage error (MAPE), shown in Equation (12), was used to evaluate model performance.
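$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{E_i - M_i}{M_i} \right| \times 100\% \tag{12}$$
where $E_i$ is the model output value, $M_i$ is the measured value, and $n$ is the number of samples in the test data set.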
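The MAPE can be computed directly from paired model outputs and measurements. The following is a minimal sketch; the function name and the sample values are illustrative and are not results from the paper.

```python
def mape(estimated, measured):
    """Mean absolute percentage error (%) between estimates and measurements.

    estimated: model output values E_i
    measured:  measured values M_i (assumed to be positive)
    """
    if len(estimated) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(e - m) / m for e, m in zip(estimated, measured))
    return 100.0 * total / len(measured)

# Illustrative saturation flow rates in veh/h.
print(mape([1450.0, 1320.0, 1500.0], [1400.0, 1355.0, 1480.0]))
```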

Proposed Model
The modeling technique proposed in this research is based on the main factors affecting the saturation flow rate. To estimate the saturation flow rate of each cycle, a three-layered neural network with one output unit was used. The output unit was the measured saturation flow rate.

Selection of Input Variables
The factors influencing the saturation flow rate differ between traffic scenarios. As shown in Figure 2, intersection users compete for conflict space, which further explains the characteristics of each factor. In through lanes, the saturation flow rate is affected by few factors, mainly the composition of vehicles in the queue. However, it was found that there was mutual interference among vehicles in multiple through lanes. After passing the stop line, vehicles freely chose their departure lanes. This affected the running vehicles and caused fluctuation of the saturation flow rate. In shared right-turn and through lanes, both through and right-turning vehicles can cross the stop line. Therefore, the saturation flow rate was affected not only by the composition of vehicles, but also by the percentage of right-turning vehicles. As the right-turning vehicles departed, they were also disturbed by pedestrians and bicycles. The influencing factors in shared left-turn and through lanes were similar to those in shared right-turn and through lanes, namely the percentage of left-turning vehicles and the disturbance from pedestrians and bicycles. However, the saturation flow rate was also affected by opposing vehicles.
In general, the influencing factors are used as input variables. The selected variables must be easy to measure on-site, since a considerable amount of data has to be collected to develop the model. Based on this objective, seven variables were chosen as input variables for the estimation of the saturation flow rate in each cycle. They are shown in Table 1. Different variables should be chosen in different scenarios. The structure and information of the model are shown in Figure 3.
Firstly, it is necessary to obtain data. Since the saturation flow rate is a key parameter of signal timing, it is very important to obtain the saturation flow rate in real time. The characteristics of the saturation flow rate and the traffic volume are different. The traffic volume was measured in real time at 5 min or 15 min intervals. Because the saturation flow rate is related to the signal period, it is more appropriate to use each signal period as the statistical interval. According to the characteristics of the neural network model, the input and output parameters need to be determined. In this paper, the input parameters are the influencing factors, and the output parameters are the measured saturation flow rates. The output parameters can be obtained through detectors (video camera, roadside unit, etc.) near the intersections. There were four steps in the measurement method. (a) We determine whether the number of queued vehicles exceeds seven. If it is less than seven, the headway has not reached saturation, and the saturation headway cannot be obtained directly. (b) We determine whether the headways from the fourth vehicle to the last vehicle in the queue are saturation headways. (c) The average saturation headway is extracted. (d) The saturation flow rate is calculated by Equation (2).
There were two types of input parameters, static and dynamic, which were obtained through different methods. Static parameters can be collected manually and updated every year. Dynamic parameters can be collected by detectors.
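The four-step screening of each signal cycle, steps (a) to (d) above, can be sketched as follows. The text does not state how step (b) decides whether the headways are saturation headways, so the optional `is_saturated` predicate is a hypothetical hook; the function name, the default thresholds and the sample data are likewise assumptions of this sketch.

```python
def cycle_saturation_flow(headways, min_queue=7, skip=4, is_saturated=None):
    """Apply steps (a)-(d) to the discharge headways of one signal cycle.

    headways:     discharge headways in seconds, one per queued vehicle,
                  in queue order (headways[0] is the first vehicle).
    min_queue:    minimum number of queued vehicles for a usable cycle.
    skip:         number of leading vehicles excluded from the average.
    is_saturated: optional HYPOTHETICAL predicate for step (b); it receives
                  the candidate headways and returns True or False.
    Returns the saturation flow rate in veh/h, or None if the cycle
    cannot be used.
    """
    # (a) Too few queued vehicles: saturation is not reached.
    if len(headways) < min_queue:
        return None
    candidate = headways[skip:]
    # (b) Check whether the remaining headways are saturation headways.
    if is_saturated is not None and not is_saturated(candidate):
        return None
    # (c) Average saturation headway.
    mean_headway = sum(candidate) / len(candidate)
    # (d) Saturation flow rate = 3600 / average headway.
    return 3600.0 / mean_headway

print(cycle_saturation_flow([3.0, 2.5, 2.2, 2.1, 2.0, 2.0, 2.0]))
print(cycle_saturation_flow([3.0, 2.5, 2.2, 2.1, 2.0]))
```

Returning `None` for unusable cycles lets a downstream training set simply drop them instead of mixing unsaturated cycles into the output variable.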

Secondly, the structure of the neural network is determined. The number of nodes in the input layer equals the number of influencing factors (static and dynamic). The number of nodes in the output layer was one, namely the measured saturation flow rate. Then, the optimal number of nodes in the hidden layer was determined by finding the minimum error. Similarly, the activation function, gradient descent function and learning rate were determined experimentally. Finally, the neural network model is built by training and verification. Training may take a long time, so the neural network can be retrained and optimized at intervals. The neural network model for dynamic estimation of the saturation flow rate is divided into five steps in this paper, as shown in Figure 4.
Step 1: According to the traffic situation, we should analyze the influencing factors of subject lanes and determine input parameters.
Step 2: According to the parameters of step 1, the structure of a neural network, such as neural nodes in a hidden layer, activation function, gradient descent function, learning rate, can be determined.
Step 3: We use one part of historical data as a training set. After training, optimizing, iterating to convergence, the neural network model is built. We use another part of historical data as a verifying set. After verification, the parameters, weight matrix and bias matrix are saved.
Step 4: We use the trained neural network model and measured data of the current signal cycle to predict the saturation flow rate in the next cycle.
Step 5: We determine whether the neural network should be updated. If so, the procedure returns to step 3; otherwise, it returns to step 4.

Data Collection
In order to demonstrate the availability and reliability of the proposed model, different traffic scenarios were selected. Then, the neural network model, the HCM model, and the regression models for estimating the saturation flow rate were developed based on measured data. According to the characteristics of traffic flow, the influencing factors differed among lanes. Therefore, the traffic flow characteristics and the number of influencing factors should be considered when selecting

Northbound and southbound left-turn phase: 15 s (green) + 3 s (amber) + 2 s (all-red); 20 s (green) + 3 s (amber) + 2 s (all-red); 33 s (green) + 3 s (amber) + 2 s (all-red).
1 TH express through lane. 2

Because of the large amount of data required, a video camera was used to record vehicle movements at the intersections. The vehicle headways and the data on influencing factors were extracted manually. The lane width was measured with a range finder. The signal cycle time and phases were recorded with a stopwatch. During extraction, the influencing factors and headways were recorded for each cycle. If the vehicles in the subject lane were affected by an adjacent vehicle's lane change, we recorded 1; otherwise, we recorded 0. A total of 420 cycles of data were collected for the nine through lanes in the evening peak.

Data Summary
A total of 600 cycles of data were collected in this paper. The details are shown in Table 3. Scenario 1 comprised 420 cycles, for which the saturation headways, the percentage of heavy vehicles, and adjacent vehicles' lane changes were recorded. The average saturation flow rate was 1399 veh/h, the maximum was 2030 veh/h, the minimum was 967 veh/h, and the standard deviation was 195.09 veh/h. Scenario 2 comprised 90 cycles, for which the saturation headways, the percentage of heavy vehicles, the percentage of right-turning vehicles, and the pedestrian and bicycle volumes were recorded. The average saturation flow rate was 1355 veh/h, the maximum was 2105 veh/h, the minimum was 701 veh/h, and the standard deviation was 292.67 veh/h. Scenario 3 comprised 90 cycles, for which the saturation headways, the percentage of heavy vehicles, the percentage of left-turning vehicles, and the pedestrian and bicycle volumes were recorded. The average saturation flow rate was 1395 veh/h, the maximum was 1782 veh/h, the minimum was 960 veh/h, and the standard deviation was 177.17 veh/h.
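As a hedged illustration of how per-cycle values like those summarized above can be obtained, the sketch below converts saturation headways (in seconds) into a saturation flow rate with the standard relation s = 3600 / h and then computes summary statistics. It assumes the headways passed in are already the saturation headways of one cycle, and it uses the sample standard deviation; the text does not say which form of the standard deviation was used. The function names are hypothetical.

```python
from statistics import mean, stdev


def saturation_flow_rate(saturation_headways_s):
    """Saturation flow rate in veh/h from saturation headways in seconds."""
    return 3600.0 / mean(saturation_headways_s)


def summarize(rates_veh_per_h):
    """Mean, maximum, minimum and sample standard deviation of per-cycle rates."""
    return {
        "mean": mean(rates_veh_per_h),
        "max": max(rates_veh_per_h),
        "min": min(rates_veh_per_h),
        "std": stdev(rates_veh_per_h),  # assumption: sample standard deviation
    }
```

For example, a cycle whose saturation headways average 2.0 s corresponds to a saturation flow rate of 1800 veh/h.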

Saturation Flow Rate Estimation Model with a Neural Network
A three-layer neural network model is proposed. Based on the measured data, the parameters of the input, hidden, and output layers are calibrated with the goal of minimizing the output error. The number of hidden-layer nodes, the activation function, the gradient descent function, and the learning rate are selected. The neural network program was developed with TensorFlow, an open-source software library released by Google. It provides standardized neural network architectures and functions that make coding convenient. In addition, the calculation process and results were visualized with TensorBoard, which makes optimizing and debugging the program easier. Its interface is shown at https://tensorflow.google.cn/.
Firstly, the input and output variables were determined. In general, the influencing factors were used as the input variables, and the measured saturation flow rates were used as the output variable. The input variables differed between scenarios, whereas the output variable was always the measured saturation flow rate. Scenario 1: The three input variables were lane width, vehicle composition, and multi-lane lateral interference. Scenario 2: The four input variables were vehicle composition, percentage of right-turning vehicles, pedestrians, and bicycles. Scenario 3: The four input variables were the percentage of left-turning vehicles, pedestrians, bicycles, and opposing vehicles. Secondly, the data were divided into training and test sets. In scenario 1, 300 cycles of data were used to train the model and 120 cycles were used to test it. In scenarios 2 and 3, 65 cycles of data were used to train each model and 25 cycles were used to test it. Then, the hyperparameters of the neural network model were calibrated and saved. In TensorFlow, the adjusted hyperparameters included the number of hidden layers, the number of neural nodes, the activation function, the learning rate, and the gradient descent function. In general, the number of hidden layers was one. The calibration process for the hyperparameters is as follows.
Step 1: The ranges of the hyperparameters were determined. For example, the number of neural nodes could be calculated according to Equation (9), which gave a range from 3 to 12. In general, there are 6 types of activation functions. The learning rate ranged from 0.01 to 0.1. There are 6 gradient descent functions.
Step 2: The hyperparameters were varied in a controlled way for model training and testing. For example, each number of nodes from 3 to 12 was used in turn to train the model, and the structure and parameters were saved. Then, we used the same structure and parameters to test the model. The mean absolute error (MAE) and mean absolute percentage error (MAPE) of the predictions were calculated.
Step 3: The prediction indexes obtained with different hyperparameters were compared. For each scenario, the prediction indexes on the test set were compared, and the hyperparameters that gave the smallest prediction indexes were selected.
Taking the model developed for scenario 1 as an example, the hyperparameters were determined by the above steps, as shown in Figure 6. The number of neural nodes was 12. The activation function was "sigmoid". The learning rate was 0.04. The gradient descent function was "RMSPropOptimizer". Scenarios 2 and 3 were handled in the same way as scenario 1. The results are shown in Table 4. In practice, the hyperparameters were determined automatically by the program.
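The calibration steps above amount to a search over candidate hyperparameter combinations. The following Python sketch shows one way such a search could be organized; it is not the authors' program. It assumes that a single score (for instance the test-set MAPE) is minimized, and that the caller supplies an `evaluate` function that trains and tests a model for a given configuration. The names `select_hyperparameters`, `evaluate`, and the grid keys are hypothetical.

```python
from itertools import product


def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def mape(measured, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 = 10%)."""
    return sum(abs(m - p) / m for m, p in zip(measured, predicted)) / len(measured)


def select_hyperparameters(grid, evaluate):
    """Try every combination in grid and keep the one with the lowest score.

    grid maps a hyperparameter name to its list of candidate values.
    evaluate(config) is assumed to train a model with that configuration
    and return its test-set error (lower is better).
    """
    names = list(grid)
    best_config, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Candidate ranges taken from the text; the 0.01 step for the learning
# rate is an assumption.
example_grid = {
    "hidden_nodes": list(range(3, 13)),
    "learning_rate": [round(0.01 * k, 2) for k in range(1, 11)],
}
```

The activation function and the gradient descent function can be added to the grid as two further lists of candidates; they are left out here because the text does not name the six options of each.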
After all hyperparameters were determined, the models of the three scenarios were trained with the training sets. After training, the weight and bias matrices of each model were saved. Then, we used the same parameters to verify the model with the test sets. In scenario 1, the weight matrix and bias matrix between the input layer and the hidden layer are denoted "W11" and "b11", and those between the hidden layer and the output layer are denoted "W12" and "b12". Similarly, the scenario 2 matrices are denoted "W21, b21, W22, b22", and the scenario 3 matrices are denoted "W31, b31, W32, b32". They are shown in Appendix A.
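To make the role of the saved matrices concrete, the sketch below evaluates a three-layer network from a pair of weight and bias matrices. It is a minimal sketch under assumptions: a sigmoid hidden layer (as reported for scenario 1), a linear output layer (the output activation is not stated in the text), and no input scaling. The argument names `w1`, `b1`, `w2`, `b2` stand for matrices such as W11, b11, W12, b12; the numbers used to exercise it are placeholders, not the values from Appendix A.

```python
import math


def sigmoid(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    """Estimate the saturation flow rate for one input vector.

    x  : list of n_inputs values (the influencing factors)
    w1 : n_inputs x n_hidden weights, w1[i][j] links input i to hidden node j
    b1 : list of n_hidden biases
    w2 : list of n_hidden weights linking hidden nodes to the single output
    b2 : scalar output bias
    The linear output layer is an assumption of this sketch.
    """
    hidden = [
        sigmoid(sum(xi * w1[i][j] for i, xi in enumerate(x)) + b1[j])
        for j in range(len(b1))
    ]
    return sum(h * w2[j] for j, h in enumerate(hidden)) + b2
```

With all-zero first-layer weights and biases, every hidden node outputs 0.5, so the estimate reduces to half the sum of the output weights plus the output bias, which is a convenient sanity check.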
The prediction results for the training and test sets in the three scenarios are shown in Figure 7. The predicted and measured saturation flow rates were in good agreement in both the training and test sets. In scenario 1, the mean absolute percentage error values were 0.06 (94% accuracy) and 0.11 (89% accuracy) for the training and test sets, respectively. In scenario 2, the values were 0.07 (93% accuracy) and 0.07 (93% accuracy), respectively. In scenario 3, the values were 0.04 (96% accuracy) and 0.05 (95% accuracy), respectively.
Information 2020, 11, x FOR PEER REVIEW

Comparison of Proposed Method and Conventional Method
To further verify the effectiveness of the proposed method, it was compared with two conventional methods: the HCM method and the statistical regression method. In scenario 1, the factor for adjacent vehicles' lane changes is not considered in the HCM method. There are two ways to account for this factor. Firstly, the relationship between the factor and the saturation flow rate can be found, an adjustment factor derived from it, and that factor introduced into the HCM multiplicative equation. This approach is suitable when there is a simple relationship between one factor and the saturation flow rate. Secondly, a multiple regression model can be developed, with the influencing factors as independent variables and the saturation flow rate as the dependent variable. In general, linear regression is used. If there are too many independent variables, multicollinearity may occur in the model. In this paper, the multiple linear regression method and the HCM method were used as the control group. In the HCM method, two types of parameters need to be determined: the base saturation flow rate and the adjustment factors. To make the estimated saturation flow rate close to the measured one, the base saturation flow rate, the lane width adjustment factor, and the heavy vehicle adjustment factor were set to the values recommended in the Chinese National Standard (standard number GB50647-2011). The left-turn, right-turn, and pedestrian-bicycle adjustment factors recommended in the Chinese National Standard were likewise used in the HCM method. For the multiple linear regression method, the models of the three scenarios were developed from the training set data with SPSS software. The calibrated results of the three models are shown in Table 5. The goodness of fit of the models was 0.436, 0.355, and 0.170, respectively.
It was found that the explanatory variables (adjustment factors) could not fully explain the changes in the explained variable (saturation flow rate).
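The two control-group models differ in form: the HCM method multiplies a base rate by adjustment factors, whereas the regression model adds weighted inputs to an intercept. The Python sketch below shows both forms side by side. It is only illustrative; the function names are hypothetical, and any base rate, factor, or coefficient passed in is a placeholder rather than a value from GB50647-2011, the HCM, or Table 5.

```python
from math import prod


def hcm_saturation_flow(base_rate, adjustment_factors):
    """Multiplicative form: base saturation flow rate times each adjustment factor."""
    return base_rate * prod(adjustment_factors)


def regression_saturation_flow(intercept, coefficients, inputs):
    """Additive form: intercept plus the weighted sum of the independent variables."""
    return intercept + sum(b * x for b, x in zip(coefficients, inputs))
```

Because the multiplicative form applies each factor independently, it has no term that couples two factors, which is relevant to the interaction effect discussed for lane width and heavy vehicles.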
Because the adjustment factors interact, the HCM method cannot express the complex relationship between the factors. Figure 8 shows saturation flow rates collected in Beijing for different lane widths and different percentages of heavy vehicles. In the figure, "m" denotes the measured saturation flow rates and "hcm" denotes the saturation flow rates adjusted with the HCM method. The adjusted saturation flow rates change differently from the measured ones. With narrow lanes and a high percentage of heavy vehicles, the measured saturation flow rate decreased rapidly. This shows that there is an interaction between the lane width and the percentage of heavy vehicles.
Table notes: 4 Sig. means the p-value and stands for the significance level. 5 PoHV means the percentage of heavy vehicles. 6 LW means lane width. 7 MTL means multiple through lanes. 8 PoRV means the percentage of right-turn vehicles. 9 PoLV means the percentage of left-turn vehicles.
Based on the test set data of the three scenarios, the saturation flow rate was estimated with the neural network model trained on the training set, the multiple linear regression model calibrated on the training set, and the HCM model. The error of each cycle was calculated, and the resulting error distributions are shown in Figure 9. In scenario 1, the mean absolute percentage error (MAPE) of the estimated saturation flow rates was 29.30% (HCM model), 12.99% (regression model), and 11.23% (proposed model). In scenario 2, the MAPE was 21.60% (HCM model), 14.35% (regression model), and 7.02% (proposed model). In scenario 3, the MAPE was 24.53% (HCM model), 11.90% (regression model), and 4.70% (proposed model). Both the regression model and the proposed model performed better than the HCM model. In scenario 1, the error distribution of the proposed model was similar to that of the regression model, and the MAPE of the proposed model was only slightly lower. Because the traffic scenario was simple, the saturation flow rates were affected by fewer factors, and the relationship between the independent and dependent variables was close to linear. In this case the neural network model effectively degenerates into a linear regression model, so the prediction performance was not significantly different. In scenarios 2 and 3, however, the saturation flow rates were affected by many factors. In addition to internal interference (within the queue), the flow was also disturbed by external traffic participants (pedestrians and bicycles). Here the neural network model had an obvious advantage, and its accuracy was better than that of the regression model. This shows that the neural network model is advantageous in complex scenarios.
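The MAPE values reported above can be turned into relative error reductions to make the comparison between scenarios easier to read. The short Python sketch below does this arithmetic; the MAPE figures are those quoted in the text, while the dictionary layout and the function name `relative_reduction` are only illustrative choices.

```python
# Test-set MAPE in percent, as reported in the text.
MAPE_PERCENT = {
    "scenario 1": {"hcm": 29.30, "regression": 12.99, "neural_network": 11.23},
    "scenario 2": {"hcm": 21.60, "regression": 14.35, "neural_network": 7.02},
    "scenario 3": {"hcm": 24.53, "regression": 11.90, "neural_network": 4.70},
}


def relative_reduction(baseline, proposed):
    """Fraction by which the proposed model's error is lower than the baseline's."""
    return (baseline - proposed) / baseline


# Reduction of the neural network's MAPE relative to the regression model.
reduction_vs_regression = {
    scenario: relative_reduction(values["regression"], values["neural_network"])
    for scenario, values in MAPE_PERCENT.items()
}
```

Relative to the regression model, the reduction is roughly 14% in scenario 1, 51% in scenario 2, and 61% in scenario 3, which matches the observation that the advantage grows with scenario complexity.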

Potential Applications
An earlier section demonstrated that the model can estimate the saturation flow rate in each cycle based on the selected variables. The neural network model is also a useful tool for studying the effect of individual variables on the saturation flow rate, because in a field survey it is difficult to isolate one variable from the disturbances of other factors. Besides, the neural network model is highly flexible in handling different traffic conditions. In this paper, three models were developed for three traffic scenarios with different input variables. In fact, the model can accommodate more input variables; no further variables were introduced in this study because some influencing factors were not considered. During the operation of an intersection, data are collected constantly and can cover a variety of traffic conditions. The model can therefore be trained continuously, so that it keeps adapting to the conditions observed.
In addition, the model can be applied to delay calculation and signal timing. Delay is the most widely used measure of effectiveness for the level of service analysis of signalized intersections, but it is difficult to measure. Researchers have therefore developed models for computing delay. Webster's delay model [27] (see Equation (13)), the HCM delay model [3] (see Equation (14)), Akcelik's model [28] (see Equation (15)), and Robertson's model [29] (see Equation (16)) are the most widely adopted.
where d_p is the average delay per vehicle on the particular approach, λ = g/c is the proportion of the cycle which is effectively green for the phase under consideration, x = q/(λs) is the degree of saturation, c is the cycle time, g is the effective green time, q is the flow in vehicles per unit time, and s is the saturation flow.
where d_1 is the uniform delay, c is the cycle time, g is the effective green time, X = q/s is the degree of saturation for the lane group, and s is the saturation flow.
where OD is the overflow delay, X_0 is the smallest significant q/s ratio, and s is the saturation flow.
where OD is the overflow delay, T is the analysis period in minutes, v is the volume, and c is the saturation flow.
The results of the above classic delay models show a large error compared with measured values. Part of the reason is that the saturation flow is introduced into the delay model as a fixed value, whereas in fact it changes dynamically. If the saturation flow rate is estimated with the proposed model, the accuracy of the delay model will improve. Similarly, in the signal timing process, if the estimated saturation flow rate is introduced into the signal timing model [30] (see Equation (17)), the timing will also be improved.
where C_0 is the cycle time, y is the ratio of flow to saturation flow (q/s) in one phase, Y is the sum of the y values over all phases, and L is the lost time.
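To illustrate how a dynamically estimated saturation flow rate would feed into these calculations, the Python sketch below implements the commonly cited forms of Webster's delay formula and Webster's optimal cycle formula. These forms are given from the general literature as an assumption; they are not copied from Equations (13) and (17) of this paper, which should be checked before use. Flows are in veh/s and times in seconds, and the function names are hypothetical.

```python
def webster_delay(c, g, q, s):
    """Average delay per vehicle (s) in the commonly cited Webster form.

    c: cycle time (s), g: effective green time (s),
    q: arrival flow (veh/s), s: saturation flow (veh/s).
    Valid only for undersaturated conditions (x < 1).
    """
    lam = g / c                      # effective green ratio
    x = q / (lam * s)                # degree of saturation
    if x >= 1.0:
        raise ValueError("Webster's formula requires a degree of saturation below 1")
    uniform = c * (1.0 - lam) ** 2 / (2.0 * (1.0 - lam * x))
    random_term = x ** 2 / (2.0 * q * (1.0 - x))
    correction = 0.65 * (c / q ** 2) ** (1.0 / 3.0) * x ** (2.0 + 5.0 * lam)
    return uniform + random_term - correction


def webster_optimal_cycle(lost_time, flow_ratios):
    """Optimal cycle time C_0 = (1.5 L + 5) / (1 - Y), with Y the sum of y = q/s."""
    total_ratio = sum(flow_ratios)
    if total_ratio >= 1.0:
        raise ValueError("the sum of critical flow ratios must be below 1")
    return (1.5 * lost_time + 5.0) / (1.0 - total_ratio)
```

Calling `webster_delay` once with a fixed saturation flow and once with a lower, cycle-specific estimate shows the sensitivity the text describes: for the same cycle, green time, and arrival flow, a lower saturation flow yields a higher computed delay.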

Conclusions
In this paper, the saturation flow rates at the approaches of signalized intersections were collected cycle by cycle. At the same time, the influencing factors were recorded as time series. The saturation flow rates were then estimated dynamically, cycle by cycle, with the neural network. The main findings from comparing the HCM, regression, and neural network models are as follows.
(1) Many factors, both static and dynamic, affect the saturation flow rate. Therefore, the saturation flow rate is not constant but changes continuously with traffic conditions.
(2) The Highway Capacity Manual covers only basic influencing factors and does not consider local traffic characteristics. The complex relationship between influencing factors is difficult to express with a regression model. The neural network can describe the relationship between multiple influencing factors and the saturation flow rate.
(3) Compared with the HCM method and the regression method, the advantage of the neural network model becomes more obvious as the traffic scenario becomes more complex. In scenarios 2 and 3, the MAPE of the saturation flow rate estimated with the neural network model was 7.02% and 4.70%, respectively.
The proposed method for dynamic estimation of the saturation flow rate relies on an information-rich environment. In practical application, model development and calibration are automatic. Connecting the model to the signal timing system would then further support the goal of refined traffic management.
Funding: This research was funded by the National Natural Science Foundation of China, grant number 5170080357.

Conflicts of Interest:
The authors declare no conflicts of interest.

Appendix A
In scenario 1, the weight matrices W11 and W12 and the bias matrices b11 and b12 are shown as follows. In scenario 3, the weight matrices W31 and W32 and the bias matrices b31 and b32 are shown as follows.