1. Introduction
With the continuous formation and expansion of urban rail transit networks, this mode of transportation brings high efficiency and convenience to residents’ travel. However, the growing passenger flow also markedly increases the risk of emergencies such as overcrowding and stampedes. In intelligent urban rail transit systems, accurate, real-time prediction of dynamically changing passenger flow is of great significance to daily operational safety management, emergency prevention, and dispatching. Many researchers have proposed methods for predicting urban rail transit passenger flow, as well as road traffic. These studies mainly used historical time series or spatiotemporal series sampled at fixed time intervals, combined with other auxiliary information, and applied data models or algorithms to mine the temporal or spatiotemporal relationships within the historical data in order to predict future passenger flow.
Generally, prediction methods for time series or spatiotemporal series can be divided into two categories: linear methods and nonlinear methods [1,2]. Linear methods assume the linearity and stationarity of the time series or spatiotemporal series [3]. In the past, the most commonly used linear methods were the moving average model and the exponential smoothing model [4,5,6]. In 1979, Ahmed and Cook [7] first applied the autoregressive integrated moving average (ARIMA) model to predict traffic flow time series, and they obtained more accurate prediction results than the moving average model and the exponential smoothing model. Since then, classical time series analysis models such as ARIMA [8] and the seasonal autoregressive integrated moving average (SARIMA) model [5,9] have been widely used in road traffic and urban rail transit passenger flow prediction, and they have achieved fairly good results. In 2005, Kamarianakis and Prastacos [10] introduced the space–time autoregressive integrated moving average (STARIMA) model into the short-term traffic flow prediction of the road network in the center of Athens, Greece. By using a spatial weight matrix to quantify the correlations between the traffic flow at any observation location and the traffic conditions at adjacent locations, they statistically described the spatiotemporal evolution of traffic flow in the road network, and STARIMA achieved satisfactory prediction results. Because these methods usually require the stationarity hypothesis, their application is limited. Moreover, a simple linear relationship cannot fully characterize the internal structure of time series or spatiotemporal series. Therefore, some nonlinear algorithms have been proposed, such as Gaussian maximum likelihood estimation, the nonparametric regression model, and Kalman filtering [11,12,13]. In recent years, intelligent algorithms such as the Bayesian network, the neural network, wavelet analysis, chaos theory, and the support vector machine have also been applied directly or combined into hybrid models for predicting road traffic or urban rail transit passenger flow [14,15,16,17,18]. Compared to linear methods, nonlinear methods are more flexible, and their prediction results are generally better [19,20].
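The role of the spatial weight matrix mentioned above can be illustrated with a small sketch. It assumes a row-normalized adjacency matrix, which is one common way to build first-order spatial weights and not necessarily the construction used in [10]; the three-location network, the flow values, and the function names are hypothetical.

```python
def row_normalize(adjacency):
    """Turn a 0/1 adjacency matrix into a spatial weight matrix whose rows sum to 1."""
    weights = []
    for row in adjacency:
        total = sum(row)
        # Isolated locations (no neighbours) keep an all-zero row.
        weights.append([value / total if total else 0.0 for value in row])
    return weights


def spatial_lag(weights, flows):
    """For each location, return the weighted average of the flows at its neighbours."""
    return [sum(w * z for w, z in zip(row, flows)) for row in weights]


# Hypothetical chain of three observation locations: 0 - 1 - 2
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
flows = [100.0, 200.0, 400.0]  # illustrative flow at one time step

W = row_normalize(adjacency)
lagged = spatial_lag(W, flows)
print(lagged)  # [200.0, 250.0, 200.0]
```

In a space–time model, spatially lagged values of this kind enter the regression alongside the temporal lags, so each location's forecast depends on recent conditions at its neighbours as well as on its own history.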
With the widespread use of various types of data acquisition equipment, the intelligent transportation system (ITS) has accumulated a large amount of traffic data [21]. General parameter approximation algorithms can only capture shallow correlations in the data, and it is difficult for them to achieve good prediction performance in the face of the curse of dimensionality caused by the data explosion. Artificial neural networks (ANNs) address the curse of dimensionality by using distributed and hierarchical feature representations and by modeling complex nonlinear relationships with deeper network layers, which gave rise to the field of deep learning [22]. In the 1990s, Hua and Faghri [23] introduced an ANN to the estimation of the travel time of highway vehicles. Since then, various ANNs have been applied to traffic prediction in the ITS, such as the feed forward neural network (FFNN), the radial basis function neural network (RBFNN), the spectral-basis neural network (SNN), and the recurrent neural network (RNN) [24,25,26,27]. Among them, RNNs handle arbitrary input series through memory blocks; they are memory-based neural networks suitable for studying the evolutionary law of spatiotemporal data. However, they have two shortcomings: (1) they need continuous trial and error to predetermine the optimal time lags, and (2) they cannot perform well in long-term predictions because of vanishing and exploding gradients [28]. As a special RNN, the long short-term memory neural network (LSTM NN) overcomes these shortcomings and has been introduced into road traffic time series prediction, where it has obtained significantly better prediction results [19]. In addition, Fei Lin et al. [29] proposed a sparse autoencoder method to extract spatial features from the spatial-temporal matrix through a fully connected layer. They then combined this with the LSTM NN to predict the average taxi speed in Qingyang District of Chengdu and obtained higher accuracy and robustness than the LSTM NN alone. Xiaolei Ma et al. [30] proposed a method based on the convolutional neural network (CNN), which used a two-dimensional time-space matrix to convert spatiotemporal traffic dynamics into an image describing the time and space relationships of traffic flow; they confirmed through two examples from the Beijing transportation network that the method can accurately predict traffic speed.
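The idea of turning spatiotemporal traffic data into an image can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [30]: the axis orientation (rows as locations, columns as time steps), the min-max scaling, the function name, and the sample readings are all hypothetical.

```python
def to_time_space_image(series_by_location):
    """Stack per-location series into a 2D matrix and min-max scale it to [0, 1].

    Rows are locations and columns are time steps (an assumed orientation).
    Each scaled entry can then be treated as a single-channel pixel intensity.
    """
    lengths = {len(series) for series in series_by_location}
    if len(lengths) != 1:
        raise ValueError("all locations need the same number of time steps")
    flat = [value for series in series_by_location for value in series]
    low, high = min(flat), max(flat)
    span = high - low
    return [
        [(value - low) / span if span else 0.0 for value in series]
        for series in series_by_location
    ]


# Hypothetical readings: two locations observed over three time intervals
readings = [
    [10.0, 30.0, 50.0],
    [20.0, 40.0, 30.0],
]
image = to_time_space_image(readings)
print(image)  # [[0.0, 0.5, 1.0], [0.25, 0.75, 0.5]]
```

A matrix of this form places neighbouring locations and consecutive time steps next to each other, which is what allows convolutional filters to pick up local spatiotemporal patterns.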
For large samples of urban rail transit passenger flow, there has been very little research on prediction based on deep learning methods. In this paper, two deep learning methods, the LSTM NN and the CNN, are introduced to predict the time series and spatiotemporal series of urban rail transit passenger flow, respectively. In addition, the traditional linear models ARIMA, SARIMA, and STARIMA are used as baselines in different experiments to test the prediction performance of the two deep learning methods.
4. Discussion and Conclusions
The LSTM NN and the CNN are two popular deep learning methods. The former’s memory cell structure can capture both the long-term and short-term dependencies of time series, while the latter has a powerful ability to extract key image features. According to our studies, the main findings and conclusions are as follows:
- (1) When the LSTM NN predicts the dayparting passenger flow of metro stations, the MRE for both the evening-peak and full-time periods remains within 10%, which is much lower than that of ARIMA. In addition, its superiority in predicting the shape of the peak is particularly obvious, indicating that the LSTM NN is highly adaptable to sharply changing data and less constrained by the prediction horizon.
- (2) When the LSTM NN predicts the daily passenger flow of metro lines, it overcomes the shortcoming of SARIMA, which responds weakly to dramatic changes in the training set because it is dominated by the overall periodicity. Hence, the LSTM NN achieves good prediction accuracy on both holidays and non-holidays.
- (3) When the CNN predicts the dayparting passenger flow of metro stations, its MAE, MRE, and RMSE are 24.62%, 29.43%, and 27.86% lower, respectively, than those of STARIMA, which indicates that the CNN is less affected by the random fluctuation of passenger flow and is more robust.
- (4) When the CNN predicts the daily passenger flow of metro lines, its prediction accuracy is comparable to that of the LSTM NN. Like the LSTM NN, the CNN can capture both long-term and short-term dependencies. Both methods can capture the overall periodicity and dramatic changes of passenger flow, but both will treat a data shock that is too short-lived as abnormal noise and ignore it.
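The percentage decreases quoted in finding (3) compare error metrics between two models. The sketch below shows how such figures could be computed, assuming the commonly used definitions of MAE, MRE (mean of the absolute error divided by the actual value), and RMSE. This section does not restate the formulas, so the definitions, the function names, and the sample values are illustrative assumptions and not the paper's data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mre(actual, predicted):
    """Mean relative error; assumes no actual value is zero."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def percent_decrease(baseline_error, model_error):
    """Relative reduction of an error metric versus a baseline, in percent."""
    return (baseline_error - model_error) / baseline_error * 100.0


# Hypothetical passenger counts and forecasts for three time intervals
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(mae(actual, predicted))        # 10.0
print(mre(actual, predicted))        # about 0.0667
print(rmse(actual, predicted))       # about 12.91
print(percent_decrease(20.0, 15.0))  # 25.0
```

Because MRE divides by the actual value, it weights errors in low-flow intervals more heavily than MAE or RMSE do, which is one reason the three metrics can improve by different percentages for the same pair of models.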
Based on the above experimental findings, we conclude that the LSTM NN and the CNN can both accurately predict the long-term and short-term passenger flow of urban rail transit. They also show good data adaptability and robustness. However, it is difficult for these two networks to capture transient fluctuations of passenger flow caused by external disturbances, because they learn only from the spatiotemporal series themselves. Therefore, in future work, it is necessary to feed prior knowledge from outside the spatiotemporal series into the networks to improve the prediction results.