Impacts of Weather on Short-Term Metro Passenger Flow Forecasting Using a Deep LSTM Neural Network

Abstract: Metro systems play a key role in meeting urban transport demands in large cities. The close relationship between historical weather conditions and the corresponding passenger flow has been widely analyzed by researchers. However, few studies have explored how to use historical weather data to make passenger flow forecasting more accurate. To this end, an hourly metro passenger flow forecasting model using a deep long short-term memory neural network (LSTM_NN) was developed. The optimized traditional input variables, including the different temporal data and historical passenger flow data, were combined with weather variables for data modeling. A comprehensive analysis of the weather impacts on short-term metro passenger flow forecasting is presented in this paper. The experimental results confirm that weather variables have a significant effect on passenger flow forecasting. Interestingly, the previous one-hour temperature and wind speed are the two most important weather variables for obtaining more accurate forecasting results on rainy days at Taipei Main Station, a primary interchange station in Taipei. Compared to four widely used algorithms, the deep LSTM_NN is an extremely powerful method, capable of making more accurate forecasts when suitable weather variables are included.


Introduction
As the urban metro system is an important part of smart cities [1], passenger flow analysis and forecasting for public transport have been widely studied over the last two decades. Metro systems play an important role in meeting urban transport demand. Because of their high speed, efficiency, capacity, and punctuality, urban metros are the first choice for many daily commuters [2]. More accurate short-term passenger flow forecasting for metro systems helps solve a series of problems in the process of urban development, such as mitigating the adverse effects of traffic congestion, reducing vehicle pollution, and enabling better planning of public transit networks and of land use along metro lines.
Following an overview of previous studies, we can conclude that short-term passenger flow forecasting is an ongoing research issue, primarily studied from two perspectives, namely data modeling and methodology modeling. In the field of data modeling, accurately forecasting passenger flow is fairly challenging because many factors affect the target station's passenger flow [3]. The most widely used input variables are historical passenger flow data and their corresponding spatiotemporal data, although experimental results show that weather data [4,5] can also help characterize passenger flow. In the field of methodology modeling, deep learning methods, such as the recurrent neural network (RNN) [19], long short-term memory neural network (LSTM_NN) [20], and gated recurrent unit neural network (GRU_NN) [21], have been comprehensively applied to traffic flow forecasting. An RNN-based microscopic car-following model was developed to capture and accurately predict traffic oscillation in Emeryville, California [22]. The results indicated that the RNN had a strong performance in predicting the trajectories of a group of subsequent vehicles given the trajectory of the first vehicle and the initial/boundary conditions for the following vehicles. A novel traffic volume forecasting model based on an LSTM_NN was proposed in [23]. The experimental results showed that the proposed LSTM_NN was more robust for traffic volume forecasting than other state-of-the-art methodologies. A deep GRU-based car-following model was proposed to better capture and describe complicated human driving behaviors [24]. The data were retrieved from the southbound direction of U.S. Highway 101 in Los Angeles, California.
In summary, motivated by the close relationship between passenger flow and weather conditions, along with the high performance of LSTM_NNs in traffic flow forecasting, we applied a deep LSTM_NN to develop a metro passenger flow forecasting model by fusing historical hourly passenger flow data with the corresponding temporal and weather data in a station-level study. The main contributions of this paper are as follows.
(1) An hourly passenger flow forecasting model considering weather conditions was developed using a deep LSTM_NN. The single weather variables of temperature, wind speed, relative humidity, and rainfall, as well as their various combinations, were separately assembled with the temporal data and historical passenger flow data to build a series of station-level forecasting models, in order to comprehensively analyze the impacts of different weather conditions on hourly metro passenger flow forecasting. Researchers have widely applied deep LSTM_NNs to passenger flow forecasting, but they have rarely used them to analyze the impacts of different weather conditions on passenger flow forecasting in such detail.
(2) The majority of weather data present only mild variations, which do not have much impact on passengers' daily commuting, even in the case of heavy rain. This indicates that adding weather data as inputs to a forecasting model may not improve the forecasting results, because the weather data are likely to be noise in comparison with the temporal data and historical passenger flow data. Furthermore, studying weather impacts on hourly passenger flow forecasting may be more meaningful than studying daily, monthly, or seasonal passenger flows. In these circumstances, selecting suitable weather data to develop a higher-performance short-term passenger flow forecasting model is challenging. The experimental results verified that adding the previous one-hour temperature and wind speed variables could indeed improve the results for hourly metro passenger flow forecasting at Taipei Main Station on rainy days, in comparison with the model using no weather data.
The remainder of this paper is organized as follows. Section 2 describes the station-level passenger demand in the Taipei Mass Rapid Transit (MRT). The input variables used and the impacts of weather on hourly passenger flow are also analyzed in this section. Section 3 introduces the architecture of a cell in LSTM and the dataflow of the deep LSTM_NN in this paper. A case study of Taipei Main Station is presented in Section 4 in detail, and the experimental results are analyzed comprehensively against RF and the other three deep neural networks. Section 5 concludes the paper and suggests directions for future study.

Input Variables
The hourly inbound and outbound flows in the Taipei MRT have been available from the Taipei city government's open dataset since 1 November 2015 [25]. Figure 1 shows that Taipei Main Station has the highest average daily passenger flow; its combined inbound and outbound flow is twice that of the second-highest of the top ten stations in Taipei. Therefore, Taipei Main Station was selected as the case study in the experimental section.

Station-level Data Description
As shown in Table 1, we used the data of 363 days from 1 November 2015 to 31 October 2016 for training, and the remaining 151 days from 1 November 2016 to 31 March 2017 for testing. The ratio of training data to testing data was nearly 7:3. A detailed description of the data used for Taipei Main Station is given in Table 1.
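As a quick arithmetic check of the split quoted above (with both end dates counted inclusively), the testing window does span 151 days, and the reported day counts give roughly a 7:3 ratio:

```python
from datetime import date

# Sanity-check the split sizes quoted above (dates inclusive).
test_days = (date(2017, 3, 31) - date(2016, 11, 1)).days + 1
print(test_days)                      # 151, as reported
print(round(363 / (363 + 151), 2))   # ~0.71, i.e. roughly a 7:3 split
```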

Endogenous Input Variables
The traditional input variables, namely passenger flow direction (v Dir ), date (month (v M ), day (v D )), week (v W ), hour (v H ), national holiday (v NH ), previous average hourly passenger flow (v Pre_PF ), previous 1-hour, 2-hour, 3-hour passenger flow (v PF_Pre_1 , v PF_Pre_2 , v PF_Pre_3 ), and previous 2-hour passenger flow trend (v PFTr_Pre_2 ), are defined as the endogenous input variables in this paper. The results in [26] show that the above endogenous input variables are the optimized combination for developing a high-performance model for passenger flow forecasting. Thus, all of them were used in the experiment.
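The lag-based endogenous inputs can be sketched in plain Python. The exact encoding of the 2-hour trend variable is given in [26], so the difference-based definition used here is only an assumption for illustration, and the flow values are made up:

```python
def endogenous_features(flows, t):
    """Build the lag-based endogenous inputs for hour t from an hourly
    passenger-flow list. Needs at least three earlier hours."""
    if t < 3:
        raise ValueError("need three previous hours of data")
    pf_pre_1, pf_pre_2, pf_pre_3 = flows[t - 1], flows[t - 2], flows[t - 3]
    # Hypothetical trend definition: change over the previous two hours;
    # the paper cites [26] for the actual encoding.
    pftr_pre_2 = pf_pre_1 - pf_pre_2
    return {
        "v_PF_Pre_1": pf_pre_1,
        "v_PF_Pre_2": pf_pre_2,
        "v_PF_Pre_3": pf_pre_3,
        "v_PFTr_Pre_2": pftr_pre_2,
    }

flows = [1200, 1500, 1800, 2600, 3100]   # toy hourly counts
print(endogenous_features(flows, 4))
```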

Exogenous Input Variables
Because of the potential relationship between weather conditions and urban transit ridership, weather variables were defined as the exogenous input variables in this paper. The weather data were provided by the Taiwan Environmental Protection Administration (Taiwan EPA) [27]. An extensive review of the previous literature [4][5][6][7][8][9][10][11][12][13][14][15][16][17] showed that temperature (v Temp ), rainfall (v Rain ), relative humidity (v Hum ), wind speed (v Wind ), and snow (v Snow ) were the most commonly used weather variables. Because Taipei almost never sees snow, v Temp , v Rain , v Hum , and v Wind were selected as the four weather variables in the study. The weather data were available from the Zhongshan meteorological monitoring station, which is the nearest observation station to Taipei Main Station. The weather data were hourly, which has the same time interval as the endogenous input variables described in Section 2.2.

Overview of Annual Weather Conditions in Taipei
Taipei is a typical city with a subtropical monsoon climate. Figure 2 shows the historical annual weather information from 1 November 2015 to 31 October 2016 around Taipei Main Station. Rainfall was copious throughout the year, especially from June through September. However, heavy rainfall was uncommon. The temperatures were usually pleasantly warm with a high relative humidity. The winter months from December to February were colder, with relatively low temperatures of less than 20 °C.

Figure 3 shows the relationship between rainfall and hourly passenger flow on weekdays, weekends, and national holidays. To highlight the impact of rainfall on passenger flow, only the hourly passenger flow on rainy days is given. The blue bars represent the millimeters of hourly rainfall, and the corresponding hourly passenger flow is marked by the broken black line. To compare the hourly passenger flow on rainy days, on days without rain, and on both, we calculated the average hourly passenger flow and marked it with the green, red, and yellow dotted lines, respectively.

Figure 3a illustrates the impact of rainfall on the inbound passenger flow at different hours on all Mondays in the training dataset. The three peak hours each day are 8:00, 18:00, and 21:00. The red, green, and yellow dotted lines almost overlap with each other. This shows that the passenger flow on weekdays is scarcely affected by rain, since most trips are commutes, even during the morning and evening peak hours. The lagged impact of rainfall on passenger flow is not obvious after the heavy rain at 6:00.

Figure 3b illustrates the impact of rainfall on the hourly outbound passenger flow on all Saturdays. The most prominent peak hours are 16:00 to 18:00. This single peak is quite different from the double peaks on weekdays. Clearly, the hourly passenger flow on rainy days is lower than on days without rain, as seen during 6:00-13:00 in Figure 3b. However, if the rain is not heavy, people still carry out their travel plans, especially from 16:00 to 18:00, the period with the largest passenger volume on weekends.

Figure 3c depicts the impact of rainfall on the hourly inbound passenger flow on all national holidays. The most obvious peak hours are 17:00 and 21:00. The difference between the passenger flow on rainy days and on days without rain is much greater than in the scenarios in Figure 3b. The impact of rainfall on hourly passenger flow at 8:00, 9:00, 11:00, 15:00, 17:00, 19:00, and 20:00 is greater than at other hours. The flow at 17:00 falls dramatically on rainy days. This shows that the impact of rainfall on hourly passenger flow varies across the morning, noon, and evening hours on national holidays.
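The rainy-day versus rain-free comparison behind the dotted lines in Figure 3 amounts to grouping hourly observations by rainfall and averaging. A minimal sketch with made-up records (the real data come from the Taipei open dataset and the Zhongshan weather station):

```python
# Toy records: (hour, rainfall_mm, passengers). Average flow per hour is
# compared for rainy vs. rain-free observations, mirroring the dotted
# lines in Figure 3 (data values are illustrative only).
records = [
    (8, 0.0, 5200), (8, 2.5, 5100), (8, 0.0, 5300),
    (17, 0.0, 6400), (17, 6.0, 4800), (17, 1.2, 5000),
]

def avg_by_hour(rows, rainy):
    """Average passengers per hour, keeping only rainy (rain > 0)
    or rain-free observations."""
    out = {}
    for hour, rain, pax in rows:
        if (rain > 0) == rainy:
            out.setdefault(hour, []).append(pax)
    return {h: sum(v) / len(v) for h, v in out.items()}

dry, wet = avg_by_hour(records, False), avg_by_hour(records, True)
print(dry)   # {8: 5250.0, 17: 6400.0}
print(wet)   # {8: 5100.0, 17: 4900.0}
```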

Impact of Rainfall on Hourly Passenger Flow
In summary, daily trips on national holidays and weekends are more likely to be affected by rainy weather than trips on weekdays, especially on national holidays, a finding similar to [5]. The fluctuation of the black line in Figure 3c is indeed more pronounced than in the other two panels. A possible reason is that trips on national holidays and weekends are often discretionary, so people are more likely to change or cancel their travel plans on rainy days. Although weather conditions are extremely complex, understanding them is important for improving the hourly passenger flow forecasting results for Taipei Main Station.

Methodology
RNN is a class of artificial neural network in which connections between nodes form a directed graph along a temporal sequence. LSTM is a special architecture of RNN, which was first introduced for sequence modeling [28]. It is a useful choice for handling the vanishing gradient problem since it can model long-term dependencies in the data. The vital component in LSTM is a cell. The state of a cell may be changed via control by the input gate, forget gate, and output gate simultaneously. Using this gate mechanism, LSTM has the ability to remove or add extracted hierarchical features and their past states to the cell, which enables LSTM to remember information for long time periods [29]. This state-memory characteristic makes LSTM useful in solving nonlinear time series problems, including a station-level passenger flow forecasting problem.
The architecture of a cell in LSTM is illustrated in Figure 4. Its current input data are typically composed of the input vector x_t at time step t and the hidden output h_{t−1} at time step t−1. The complete work cycle of a cell in LSTM is as follows. First, the three coupled parameter pairs, namely the weights and biases (w_i, b_i), (w_f, b_f), and (w_o, b_o), are trained to formulate the nonlinear relationships between the input gate (i_t) and [x_t, h_{t−1}], the forget gate (f_t) and [x_t, h_{t−1}], and the output gate (o_t) and [x_t, h_{t−1}] at time step t, respectively. Here, x_t refers to the input data at time step t, and h_{t−1} refers to the output of the hidden layer at time step t−1. The formulations of the three gates are defined in Equations (1)-(3). Second, an input squashing vector (C̃_t) is obtained directly from Equation (4) without any gate effect at time step t. The new state is then generated and stored in C_t at time step t under the control of f_t, C_{t−1}, i_t, and C̃_t, as shown in Equation (5). This special architecture shows that a cell can add or remove the previous memories stored in C_{t−1} under the control of the forget gate. Finally, the hidden output is obtained from Equation (6) [30].
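The cell's work cycle just described can be sketched numerically. The toy below uses scalar (dimension-1) weights purely for illustration; real cells use weight matrices over the concatenated input, and all weight values here are arbitrary:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, w, b):
    """One scalar LSTM cell step following Equations (1)-(6); w and b hold
    per-gate weights for the concatenated input [x_t, h_prev]."""
    concat = lambda wx, wh: wx * x_t + wh * h_prev
    i_t = sigmoid(concat(*w["i"]) + b["i"])        # (1) input gate
    f_t = sigmoid(concat(*w["f"]) + b["f"])        # (2) forget gate
    o_t = sigmoid(concat(*w["o"]) + b["o"])        # (3) output gate
    c_sq = math.tanh(concat(*w["c"]) + b["c"])     # (4) input squashing
    c_t = f_t * c_prev + i_t * c_sq                # (5) cell state update
    h_t = o_t * math.tanh(c_t)                     # (6) hidden output
    return h_t, c_t

w = {k: (0.5, 0.1) for k in "ifoc"}   # arbitrary demo weights
b = {k: 0.0 for k in "ifoc"}
h, c = lstm_cell(1.0, 0.0, 0.0, w, b)
print(round(h, 4), round(c, 4))
```

With zero initial state, the forget-gate term in Equation (5) contributes nothing, so the new cell state is just the gated input squashing.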
Input gate: i_t = σ1(w_i · [x_t, h_{t−1}] + b_i) (1)
Forget gate: f_t = σ1(w_f · [x_t, h_{t−1}] + b_f) (2)
Output gate: o_t = σ1(w_o · [x_t, h_{t−1}] + b_o) (3)
Input squashing: C̃_t = σ2(w_c · [x_t, h_{t−1}] + b_c) (4)
Cell: C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t (5)
Hidden output: h_t = o_t ⊙ σ2(C_t) (6)

where σ1 is the recurrent_activation function and σ2 is the activation function. In the source code of high-level neural network APIs, such as Keras and Caffe, σ1 is usually defined as the sigmoid [31] or hard sigmoid function [32,33] and σ2 is defined as the tanh function [31-33]. ⊙ refers to the Hadamard product [30,34].

The dataflow in a deep LSTM_NN is shown in Figure 5. There is a total of n training examples in the input data. Each set of input data at time step t is defined as an input vector X^{m×s}_t, where m is the number of input variables and s refers to the number of time steps. When modeling, the input variables are assembled as an input vector [X_t] = [v_Dir, v_M, ..., v_Hum, v_Wind], which is the combination of the endogenous and exogenous input variables introduced in Sections 2.2 and 2.3. Since each set of input data contains the variables of the previous 1-hour, 2-hour, and 3-hour passenger flows, we set s = 1 as the time-step parameter. This means that the previous passenger flows at time steps (t−1), (t−2), and (t−3) are used as input variables to forecast the passenger flow at time step t. It is worth noting that the dimension of the input vector [X_t] is changed by multiplication with the weight matrices in Equations (1)-(4). As a result, the vectors i_t, f_t, o_t, C̃_t, C_t, and h_t all have the same dimension a, the number of hidden nodes, which is not equal to the dimension n of the input vector [X_t]. At the output layer, all the hidden outputs are fully connected to obtain Ŷ_t, whose dimension is returned to n. Ŷ_t is the forecasting result for the passenger flow and Y_t is the actual value of the passenger flow given in the training dataset. This dataflow structure shows that the hourly passenger flow forecasting model can be transformed into a supervised learning problem using a deep LSTM_NN. In other forecasting models, Y_t was defined as one of the input variables contained in the input vector [X_t] [35] to fit the model, or a series of previous k-hour passenger flows [Y_{t−k}, Y_{t−(k−1)}, ..., Y_{t−1}] was used as the only input variables to fit the model [36]. In summary, different forecasting tasks should be constructed with the most suitable input variables to obtain optimized results.
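The purely autoregressive setup mentioned for [36], which uses only the previous k-hour flows as inputs, can be framed as a sliding window over the series; a minimal sketch:

```python
def sliding_window(series, k):
    """Frame a series as supervised pairs ([Y_{t-k}, ..., Y_{t-1}], Y_t),
    i.e. the purely autoregressive setup described for [36]."""
    pairs = []
    for t in range(k, len(series)):
        pairs.append((series[t - k:t], series[t]))
    return pairs

print(sliding_window([10, 20, 30, 40, 50], k=3))
# [([10, 20, 30], 40), ([20, 30, 40], 50)]
```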
Finally, we calculated the loss l(t) between Ŷ_t and Y_t, where f refers to a loss function in Equation (7). Equation (8) was used to minimize the error between Ŷ and Y over an entire sequence of length T using one of the state-of-the-art optimization algorithms [37]. To deeply and rapidly extract the relationship between the input variables and passenger flow, a deep structural LSTM_NN was applied in the experiment.


Data Specification
As discussed in Section 2, the considered input variables and their encoded values are listed in Table 2. The traditional input variables, labeled No. 1-11, were assembled with the different weather variables, labeled No. 12-15. These assembled input variables are defined as the input vector [X_t]. Hourly passenger flow (v*_PF) is the target variable for forecasting (Y_t). v_NH has been encoded as a two-digit figure, shown in Table 3. The encoded values of the input variables labeled No. 1-11 are described in [26]. In the experiment, we developed many LSTM_NN-based models for hourly passenger flow forecasting with Keras in Python. All training data and testing data were normalized and shuffled before training and testing.
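The normalize-and-shuffle preprocessing mentioned above can be sketched without any framework. Min-max scaling to [0, 1] is assumed here as one common choice (the paper does not state which scaler was used), applied column-wise before shuffling the examples:

```python
import random

def min_max_normalize(rows):
    """Column-wise min-max scaling to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0
         for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

data = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]   # toy feature rows
scaled = min_max_normalize(data)
random.seed(0)
random.shuffle(scaled)          # shuffle examples before training
print(sorted(scaled))           # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```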
Root mean square error (RMSE) and mean absolute percentage error (MAPE) are the two most commonly used criteria to evaluate the performance of a passenger flow forecasting model [26]. Thus, RMSE and MAPE in Equations (9) and (10) were used to evaluate the performance of the forecasting model.
The lower the value in RMSE or MAPE, the better the performance.
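Equations (9) and (10) correspond to the standard definitions of the two criteria, which can be written directly as:

```python
import math

def rmse(actual, forecast):
    """Root mean square error, Equation (9)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, Equation (10), as a fraction."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

actual = [100.0, 200.0, 400.0]      # toy hourly flows
forecast = [110.0, 190.0, 400.0]
print(round(rmse(actual, forecast), 2), round(mape(actual, forecast), 4))
# 8.16 0.05
```

Note that MAPE is undefined for hours with zero actual flow, so such hours must be excluded or handled separately.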

Results of not Using any Weather Variables
To evaluate the forecasting performance of the deep LSTM_NN, we first compared the experimental results of the deep LSTM_NN, a one-layer LSTM_NN, and RF. Without adding any weather variable, the optimized combination of traditional input variables [26] was implemented first in this section.
The tuning parameters for the deep LSTM_NN and the one-layer LSTM_NN are shown in Table 4. The best tuning results for the deep LSTM_NN, one-layer LSTM_NN, and RF are given in Table 5. As shown in Table 5 and the experimental results in [26], the three-layer LSTM_NN, with RMSE = 490.77 and MAPE = 0.0547, achieves better forecasting results than the one-layer LSTM_NN and RF. The experimental results verify that the deep LSTM_NN enables more accurate results than either the one-layer LSTM_NN or RF.
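Grid-style tuning of the kind reported in Tables 4 and 5 enumerates the Cartesian product of the candidate parameter values. A sketch with hypothetical candidates (the actual grids are given in Table 4):

```python
from itertools import product

# Hypothetical tuning grid; the paper's candidate values are in Table 4.
grid = {
    "layers": [1, 2, 3],
    "hidden_nodes": [64, 128],
    "batch_size": [32, 64],
}
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(combos))   # 12 candidate configurations to train and score
print(combos[0])
```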

Results of Using Weather Variables
As shown in Figure 3, the passenger flow on rainy days is quite different from that on days without rain. Adverse weather conditions may have great impacts on daily trips. Thus, making a more accurate passenger flow forecast on rainy days is useful. The training data and testing data used for rainy days are shown in Table 6. To comprehensively discuss the effects of different combinations of weather variables on passenger flow forecasting, 31 models were built with the proposed deep LSTM_NN on rainy days and categorized into three groups. Model_1 in Group_1 was built without using any weather variable, to serve as a baseline for the models using weather variables. Models 2-16 in Group_2 were built with different combinations of the current hourly weather variables, and Models 17-31 in Group_3 were built with different combinations of the previous one-hour weather variables. The tuning parameters of the proposed deep LSTM_NN are shown in Table 4.

To understand the socio-psychological aspects of the travel behavior of metro passengers, we suppose that individuals may give up their plans to go out if it is raining outside, especially during non-working hours. In this case, the previous one-hour weather variables may be more important inputs for forecasting than the current hourly weather variables. Therefore, Models 17-31 were built with various combinations of the previous one-hour weather variables, in order to improve on the forecasts made without any weather variable on rainy days. Table 8 shows the experimental results for Models 1 and 17-31. Both RMSE and MAPE in Models 17 and 23 are lower than in Model_1. These better results indicate that using the previous one-hour weather data as input variables can improve the passenger flow forecasts on rainy days. These results match our assumptions about travelers' psychology.
Using the combined weather variables of v Temp_Pre_1 and v Wind_Pre_1 in Model_23 results in more accurate forecasts than Model_1 by RMSE. The results indicate that travelers may cancel their travel plans if the weather is cold and windy on rainy days. These two previous one-hour weather variables, namely v Temp_Pre_1 and v Wind_Pre_1 , may be helpful in developing a more powerful forecast model on rainy days. These encouraging results demonstrate that understanding the socio-psychological aspects of the travel behavior of metro passengers is truly important to apply appropriate weather data for developing a more powerful passenger flow forecasting model.
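The 15 models in each of Group_2 and Group_3 correspond to the non-empty subsets of the four weather variables, which can be enumerated directly:

```python
from itertools import combinations

# The four weather variables from Section 2.3 (current-hour or
# previous one-hour versions, depending on the group).
weather = ["v_Temp", "v_Rain", "v_Hum", "v_Wind"]
subsets = [c for r in range(1, len(weather) + 1)
           for c in combinations(weather, r)]
print(len(subsets))   # 15 non-empty combinations -> Models 2-16 (or 17-31)
```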

Comparisons
To better analyze the forecasting performance of the deep LSTM_NN, four methods were selected for comparison, namely RF [38], the deep neural network (DNN) [18], RNN [20], and GRU [39]. We discussed the optimized combination of the endogenous input variables using RF in our previous study [26], and the impacts of added weather variables on passenger flow forecasting were further studied here; therefore, RF was selected as one of the important comparative methods. Furthermore, the other three neural networks (DNN, RNN, and GRU) are typical artificial intelligence algorithms that have been successfully applied to many different forecasting tasks. The parameters n_estimators and max_features in RF were set to the same values as in Section 4.2. To make a fair comparison among the different neural network models, the parameters used in the DNN, RNN, and GRU were the same as in the deep LSTM_NN, described in Section 4.3. The training samples and input variables used in the above four methods were identical to those used in Models 17-31. Figure 6 shows the RMSE and MAPE of the forecasting results of the five methods. The deep LSTM_NN is better than the other four methods in most scenarios when incorporating the previous one-hour weather variables as inputs. These results verify that the deep LSTM_NN can better extract and represent the deep features of passenger flow embedded in the training dataset.
It is a more powerful method for time series forecasting problems.
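The two error metrics reported in Figure 6 are standard and can be computed as follows. This is a minimal Python sketch; the forecast values below are made-up placeholders, not the paper's results.

```python
import math

def rmse(actual, predicted):
    """Root mean square error over paired actual/predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error (in %); assumes no zero actuals."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

# Illustrative hourly passenger flows and two hypothetical models' forecasts.
actual = [5200, 4800, 4500]
forecasts = {
    "RF": [5100, 4900, 4400],
    "LSTM_NN": [5150, 4830, 4480],
}

# Score each method on both metrics, as done when comparing the five models.
scores = {name: (rmse(actual, pred), mape(actual, pred))
          for name, pred in forecasts.items()}
```

Lower RMSE and MAPE indicate more accurate forecasts; comparing the entries of `scores` mirrors the model-ranking exercise in Figure 6.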

Conclusions
Inspired by the close relationship between weather conditions and passenger flow, a detailed experimental analysis of the impacts of weather variables on passenger flow forecasting using a deep LSTM_NN was conducted. The experimental results show that the previous one-hour temperature (v_Temp_Pre_1) and previous one-hour wind speed (v_Wind_Pre_1), especially v_Temp_Pre_1, are the two most important weather variables for passenger flow forecasting on rainy days. Compared with the other four algorithms, the deep LSTM_NN is a more powerful method that makes more accurate forecasts when suitable weather variables are included. Since previous one-hour weather data are easy to obtain, the model developed herein for short-term metro passenger flow forecasting can be used in practical applications.
The majority of weather data show only mild variations, which have little impact on passengers' daily commuting, even in the case of heavy rain. This indicates that adding current hourly weather data to a forecasting model may not improve the forecasting results, because such data are more likely to act as noise. The experimental results in Table 7 support this assumption. Instead of using current hourly weather data, we therefore used the previous one-hour weather data to develop the model, which yields better results. This confirms that understanding the socio-psychological aspects of passengers' travel behavior is important when applying appropriate weather data to develop a more powerful passenger flow forecasting model, and this modeling strategy proved helpful in practice.
Because the testing data ran from November 2016 to March 2017, the weather during this period was relatively cold and dry. The experimental results can therefore only verify that the proposed model forecasts passenger flows more accurately when weather conditions are broadly similar. As shown in Table 6, the training and testing examples are not rich enough for forecasting passenger flows on rainy days. In the future, we will collect more data to perform these forecasting tasks and further verify the results on days with adverse weather, such as typhoons, heavy rain, or high winds, and on weekends and national holidays with rain. In addition, because weather conditions vary from hour to hour, clustering methods may be used to group the training and testing data by weather condition before developing a forecasting model. As another avenue for exploring the impact of weather on passenger flow forecasting, seasonal variations or outdoor thermal comfort may also have an essential impact on travel behavior at specific traffic nodes [40]. All of these represent productive directions for our future research.
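The proposed grouping of training data by weather condition before model fitting could be sketched as a simple partitioning step. The following Python sketch uses a hand-written rainy/dry labeling rule purely for illustration; the sample tuples and the threshold are assumptions, not part of the paper.

```python
from collections import defaultdict

def group_by_weather(samples, label_fn):
    """Partition (features..., target) samples by a weather label, so that
    a separate forecasting model can be fitted per group."""
    groups = defaultdict(list)
    for sample in samples:
        groups[label_fn(sample)].append(sample)
    return dict(groups)

# Hypothetical samples: (hour of day, rainfall in mm, passenger flow).
samples = [(7, 0.0, 5200), (8, 2.5, 4800), (9, 0.0, 5100), (10, 6.0, 4300)]

# Label each hour "rainy" if any rainfall was recorded, else "dry".
groups = group_by_weather(samples, lambda s: "rainy" if s[1] > 0 else "dry")
```

Once grouped, one model per weather regime (or a clustering algorithm in place of the hand-written rule) could be trained and evaluated separately, which is the direction suggested above.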