An Online Energy-Saving Driving Strategy for Metro Train Operation Based on the Model Predictive Control of Switched-Mode Dynamical Systems

Abstract: With the rapid development of urban rail transit systems and the consequent sharp increase in energy consumption, the energy-saving train operation problem has attracted much attention. Extensive studies have been devoted to the optimal control of a single metro train in an inter-station run to minimize energy consumption. However, most existing work focuses on offline optimization of the energy-saving driving strategy, which must still be tracked in real train operation. To attain better performance in the presence of disturbances, this paper studies the online optimization of the energy-saving driving strategy for a single metro train by employing the model predictive control (MPC) approach. First, a switched-mode dynamical system model is introduced to describe the dynamics of a metro train. Based on this model, an MPC-based online optimization problem is formulated to obtain the optimal mode switching times with minimal energy consumption for a single train in an inter-station run. We then propose an algorithm that solves the constrained optimization problem at each time step using the exterior point penalty function method. The proposed online optimal train control algorithm, which determines the mode switching times, not only improves computational efficiency but also enhances robustness to disturbances in real scenarios. Finally, the effectiveness and advantages of the algorithm are illustrated through case studies of a single train in an inter-station run.


Background and Motivation
Urban rail transit, which can effectively alleviate traffic pressure in large modern cities, is widely recognized as an attractive mode of transportation due to its features of large capacity, high speed, safety, punctuality and energy-efficiency [1]. Many large cities in the world are committed to developing and expanding their metro train projects. As of 2016, the global light rail market value is estimated to be $180.78 billion, which is expected to increase 2.3% by 2020. With the expansion of the metro line scale and the quick growth of passenger traffic demand, the total energy consumption of train operations is increasing rapidly. Therefore, energy-efficient train operation for urban rail transit, which needs to reduce the energy consumption while ensuring safety and punctuality, is becoming an urgent problem, and researchers have been continually striving to address the energy-efficient train operation problem.
So far, extensive studies have been devoted to searching for the optimal driving strategy, which is widely regarded as an effective way to reduce the energy consumption of metro train operation [2]. Most existing studies focus only on offline optimization of the driving strategy; the result is then taken as the reference trajectory that the automatic train operation (ATO) system tracks. Because an offline-optimized trajectory does not account for the disturbances encountered in actual operation, it is hard to ensure optimality online. In addition, the complex operating environment and uncertain operational delays can degrade the tracking accuracy. Online adjustment of the reference trajectory is therefore particularly important, because it can respond quickly to changes during a train's operation. One approach is to formulate the problem in the discrete domain, which converts it to a conventional optimization problem that can be solved online with a numerical method [3]. It is therefore of great significance to address the online optimization of the driving strategy.
In the studies concerning the online trajectory optimization, the speed or force during the whole operation segment is taken as the optimization variable. Although high dimensions of the optimization variable can increase the degree of freedom, that will degrade the computational efficiency. According to the existing studies, the optimal train control strategy usually follows the sequential operation modes: maximum traction (MT), speed holding (SH), coasting (CS) and maximum braking (MB), in which the control of each mode is known. In this regard, it is practically efficient to optimize the mode switching times instead of the speed/force trajectory by modeling the train as a switched-mode dynamical system. In this paper, we will address the online optimization of energy-saving driving strategy with switching times as the optimization variables, so as to improve the computation efficiency and the robustness to disturbances.

Literature Review
Finding the optimal driving strategy is a typical optimal control problem. The first work on optimal train control was carried out by Ishikawa in 1968 [4], in which the Pontryagin maximum principle was used to find the optimal driving strategy for trains. He showed that the optimal train control strategy on level tracks with a constant speed limit follows the fixed mode sequence MT-SH-CS-MB. Since then, there have been many outstanding works on optimal train control based on optimal control theory and solved by the Pontryagin maximum principle [5-7]; this approach belongs to the category of analytic methods. More recently, Albrecht et al. [8,9] gave a detailed summary of the problem of finding an optimal driving strategy on an undulating track with variable speed restrictions. Using the Pontryagin maximum principle, they derived necessary conditions for an optimal strategy and pointed out that an optimal strategy always exists. However, under more complex actual operating conditions, e.g., local constraints on state variables, nonlinear resistance and variable grade profiles, the analytic method does not work well. In contrast, the dynamic programming (DP) method can cope with these complex issues [10-12]. In [10], the optimal train control problem was reformulated as a multi-stage decision process, and the optimal control strategy was obtained directly through numerical DP. Similarly, once the optimal train control problem is discretized, many mathematical programming techniques can be used to solve it, such as linear programming [13] and nonlinear programming [14]. Given enough computation time, these numerical methods can obtain a near-optimal solution.
In recent years, with the rise of evolutionary algorithms, many scholars have applied the genetic algorithm (GA) [15] and ant colony optimization (ACO) [16] to the optimal train control problem, although neither the optimality nor the convergence of the solutions can be guaranteed. In addition, the actual running environment of a train is complex, and many situations cannot be known in advance. Because not all possible cases can be computed offline, evolutionary algorithms are of limited use in such situations. Detailed reviews of the classical single-train control problem can be found in [1,17,18].
It is worth noting that the studies mentioned above only focused on the offline optimization of the driving strategy. In order to acquire better performance in the presence of disturbances, scholars have been paying more attention to the online optimization problem of the driving strategy. Considering the uncertain disturbances on resistance coefficients and the possible delay time, Yan et al. [19] proposed a moving horizon train optimization approach for dynamic train trajectory planning problem, which can be solved by immune differential evolution algorithm. Yan et al. [20] presented a cooperative energy-efficient trajectory planning scheme for multiple high-speed train movements using distributed model predictive control, under which each train can get the optimal speed trajectory online. Xun et al. [21] discussed an online train speed profile optimization problem solved by sequential quadratic programming. He et al. [22] developed a shrinking horizon MPC algorithm combined with the Radau pseudo-spectral method to obtain the optimal speed-time trajectory online based on the real-time information. We can easily see that in all these studies, the online optimization variable is the speed or force during the whole operation segment, which is high dimensional, and it will bring heavy computation cost so as to hinder online implementation.
Motivated by the fact that the optimal train control strategy usually follows the fixed mode sequence of MT-SH-CS-MB, some researchers have introduced the switched-mode dynamical system to describe the dynamics of a metro train [23,24], in which the optimization variable is the gear angle. In [25], based on the switched-mode dynamical system, the optimal driving problem was converted to a nonlinear optimization problem, and an exterior point method was proposed to calculate the driving strategies, in which the optimization variables were switching gear sequence and switching locations. However, these results cannot be applied to online optimization of reference trajectories.

The Focus of This Study
From the perspective of reducing the optimization variable's dimensions to improve the computation efficiency, we regard the train as a switched-mode dynamical system, and then convert the online optimal train control problem to finding the optimal mode switching times online. A general optimal control framework for such systems was established in [26], and an online algorithm to calculate the optimal mode-switching times in switched-mode dynamical systems was investigated in [27]. Nevertheless, the proposed algorithms cannot be directly applied to solving the optimal train control problem due to the process and terminal state constraints in train operation.
In this paper, we address the online optimization of the energy-saving driving strategy based on MPC, in which the train model is a switched-mode dynamical system and the optimization variables are the switching times. An online algorithm is then presented to calculate the optimal switching times.
The main contributions of this paper are summarized as follows:

1. A switched-mode dynamical system is introduced to describe the dynamics of a metro train. By using the switching times as the optimization variables, the high dimensionality of the optimization variable in [3,19-22,28] is avoided, which reduces the computation cost.

2. By employing MPC, an online algorithm is presented to obtain the optimal switching times at each sampling instant, which partially extends the offline optimization in [25] to online optimization and can respond promptly to disturbances during train operation.

The remainder of this paper is organized as follows. In Section 2, the basic optimal train control problem is described, along with train traffic model and energy consumption model. In Section 3, a switched-mode dynamical system model is presented to describe the dynamics of a metro train, and the MPC-based online optimization problem is formulated for obtaining energy-saving switching times. An algorithm to solve the proposed optimization problem is proposed in Section 4. Section 5 of this paper presents case studies for metro trains to demonstrate the effectiveness of the proposed algorithm. In Section 6, we conclude the whole paper.
To describe and understand the proposed method easily, we summarize the notation needed in our formulation in Appendix A.

The Optimal Train Control Problem
In this section, the basic optimal train control problem is reviewed briefly, including the basic notation and the main modeling concepts used later in this paper. The movement of a point-mass train with time as the independent variable is described as

$$\frac{dp}{dt} = v, \tag{1}$$

$$M\frac{dv}{dt} = u - R(v) - G(p), \tag{2}$$

where p = p(t) is the train position (m) and v = v(t) is the train speed (m/s); M is the total mass of the train; u = u(t) is the instantaneous force applied to the train; R(v) is the resistance force, which is usually approximated by a quadratic function of v,

$$R(v) = a_0 + a_1 v + a_2 v^2, \tag{3}$$

where the parameters $a_0$, $a_1$ and $a_2$ are the rolling resistance coefficient, the mechanical resistance coefficient and the external air resistance coefficient, respectively; G(p) is the gradient resistance along the track, i.e.,

$$G(p) = M g\,\theta(p), \tag{4}$$

with g being the gravitational acceleration and θ(p) being the track gradient at location p. The speed of the running train is bounded as follows:

$$0 \le v \le v_{\mathrm{lim}}(p), \tag{5}$$

where $v_{\mathrm{lim}}$ is the upper speed limit at location p. Considering the performance of the train power system, the applied force must also be limited to a proper range, between the maximum braking force and the maximum traction force (constraint (6)).

According to the characteristics of electric trains, the effort of a tractive motor is divided into a "constant torque region" and a "constant power region" [29]. As shown in Figure 1, we suppose that the maximum traction and the maximum braking forces are piecewise linear functions of speed, where $v_c$ is the critical speed at which the maximum tractive/braking force function switches from one piece to the other. The piece with speed lower than $v_c$ is called Regime 1, and the piece with speed higher than $v_c$ is called Regime 2.

Consider the problem of driving a train described by Equations (1) and (2) from one station to the next along a level track within a given allowable time T so as to minimize the energy consumption. The optimal train control problem then consists of minimizing the energy-consumption objective (9) subject to the initial and terminal constraints

$$p(0) = 0, \quad v(0) = 0, \quad p(T) = P, \quad v(T) = 0, \tag{10}$$

where P is the length between two adjacent stations.
Given the train dynamics (1) and (2) with initial and terminal constraints (10), find a control law u such that the objective function (9) is minimized subject to the constraints.
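As a minimal numerical sketch of the point-mass model (1)-(2) on a level track, the following Python snippet integrates position and speed with a forward Euler step. The mass, resistance coefficients, force value and step size are illustrative assumptions and are not the parameters used in this paper.

```python
def resistance(v, a0, a1, a2):
    """Quadratic running resistance R(v) = a0 + a1*v + a2*v**2 (in N)."""
    return a0 + a1 * v + a2 * v * v


def simulate_point_mass(u_of_t, mass, coeffs, t_end, dt=0.01):
    """Forward-Euler integration of dp/dt = v, M dv/dt = u - R(v) on level track.

    u_of_t : callable returning the applied force (N) at time t
    coeffs : (a0, a1, a2), assumed resistance coefficients
    Returns the final (position, speed), starting from rest at p = 0.
    """
    p, v, t = 0.0, 0.0, 0.0
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        acc = (u_of_t(t) - resistance(v, *coeffs)) / mass
        p += v * dt
        v = max(v + acc * dt, 0.0)  # the train is not allowed to roll backwards
        t += dt
    return p, v


# Hypothetical values chosen only for illustration.
MASS = 2.0e5                 # kg
COEFFS = (2.0e3, 30.0, 6.0)  # a0 (N), a1 (N s/m), a2 (N s^2/m^2)
p_end, v_end = simulate_point_mass(lambda t: 2.0e5, MASS, COEFFS, t_end=10.0)
```

With a constant force, the speed grows slightly more slowly than u/M because the resistance increases with speed.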

Online Energy-Saving Driving Strategy Based on MPC of Switched-Mode Systems
In this section, the dynamic model of a metro train is established as a switched-mode dynamical system. Then, the MPC-based online optimization problem is formulated with mode-switching times as the optimization variables to minimize the energy consumption for a single train in an inter-station run.

Train Switched-Mode Dynamical Model
Firstly, a switched-mode dynamical system model is introduced to describe the characteristics of a metro train. It is commonly recognized that the energy-saving strategy on level tracks with a constant speed limit, for an individual inter-station run under a given running time, uses four modes of movement in sequence: MT, SH, CS and MB (see Figure 2). Given a fixed time horizon [0, T], let $\tau_i$, i = 1, 2, 3, be a finite sequence of mode-switching time instants such that $0 \le \tau_1 \le \tau_2 \le \tau_3 \le T$; $\tau_1$, $\tau_2$ and $\tau_3$ are the time instants at which SH, CS and MB start, respectively, as shown in Figure 2. For the train running between two stations, the dynamic equation is expressed in terms of the state x(t) of the system, and the force applied in each mode is given by Equation (13), where R(v) is the quadratic function of v given in Equation (3). Because the train runs on level track, $Mg\theta(p)$ is equal to zero. Substituting Equation (13) into the speed dynamics (2) gives the speed dynamics of each mode. Thus, the switched-mode dynamical system concerned can be written as

$$\dot{x} = f_i(x), \quad t \in [\tau_{i-1}, \tau_i), \quad i = 1, 2, 3, 4, \tag{15}$$

with $\tau_0 := 0$, $\tau_4 := T$ and a given initial condition $x_0 := x(0)$. The functions $f_i$, i = 1, 2, 3, 4, are called the modal functions, and the times $\tau_i$, i = 1, 2, 3, are called the switching times.
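The switched-mode structure can be illustrated with a short rollout. This is a sketch under stated assumptions: the traction and braking forces are replaced by constants (the paper uses speed-dependent piecewise linear limits), all numerical values are hypothetical, and the function names `applied_force`, `active_mode` and `simulate` are introduced here for illustration only.

```python
from bisect import bisect_right

# All numerical values below are illustrative assumptions.
MASS = 2.0e5                    # kg
A0, A1, A2 = 2.0e3, 30.0, 6.0   # assumed resistance coefficients
F_TRACTION = 2.0e5              # N, constant stand-in for the maximum traction force
F_BRAKE = 2.0e5                 # N, constant stand-in for the maximum braking force


def resistance(v):
    """Quadratic running resistance R(v)."""
    return A0 + A1 * v + A2 * v * v


def applied_force(mode, v):
    """Force in each mode: 0 = MT, 1 = SH, 2 = CS, 3 = MB."""
    if mode == 0:
        return F_TRACTION
    if mode == 1:
        return resistance(v)    # speed holding: traction balances resistance
    if mode == 2:
        return 0.0              # coasting
    return -F_BRAKE             # maximum braking


def active_mode(t, taus):
    """Index of the mode active at time t for ordered switching times taus."""
    return bisect_right(taus, t)


def simulate(taus, t_end, dt=0.01):
    """Forward-Euler rollout of x' = f_i(x) with x = (p, v), starting from rest."""
    p, v = 0.0, 0.0
    for k in range(int(round(t_end / dt))):
        mode = active_mode(k * dt, taus)
        acc = (applied_force(mode, v) - resistance(v)) / MASS
        p += v * dt
        v = max(v + acc * dt, 0.0)
    return p, v
```

Calling `simulate([20.0, 60.0, 85.0], 105.0)` returns the terminal position and speed for one candidate set of switching times; different switching times give different trajectories under the same fixed mode sequence.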

Optimization of Switching Times via MPC
Consider the train switched-mode dynamical system (15), with a train running from one station to the next along a level track within a given time T. Denote the vector of switching times by $\bar{\tau} := [\tau_1, \tau_2, \tau_3]^{\mathsf{T}}$, and define the feasible set as

$$\Lambda = \{\bar{\tau} = [\tau_1, \tau_2, \tau_3]^{\mathsf{T}} : 0 \le \tau_1 \le \tau_2 \le \tau_3 \le T\}. \tag{16}$$

The sequence of modes (MT, SH, CS and MB) is fixed; it is commonly recognized as the energy-saving sequence for an individual train in an inter-station run under a given running time. The power function of the train at time t ∈ [0, T), denoted by L(x), is a piecewise function of the active mode (Equation (17)), with ε ∈ (0, 1) representing the efficiency factor in converting kinetic energy to electrical energy during regenerative braking.
Then, we formulate the online optimization problem with respect to the switching times. Assume the metro train measures its current state at each sampling instant $t_k = k\Delta t$, with k = 0, 1, 2, ... and Δt being the sampling period, and then solves an MPC-based optimization problem to determine its switching times at that instant. Define the cost-to-go function of the train at time $t_k \in [0, T)$ as

$$J(x(t|t_k), \bar{\tau}(t_k)) = \int_{t_k}^{T} L(x(t|t_k))\,dt, \tag{18}$$

where $x(t|t_k)$, $t \in [t_k, T]$, is the future state trajectory of the train predicted at $t_k$ with respect to $\bar{\tau}(t_k)$, and $\bar{\tau}(t_k) = [\tau_1(t_k), \tau_2(t_k), \tau_3(t_k)]^{\mathsf{T}}$ is the vector of the switching times at time $t_k$. We define the integer $i(t_k, \bar{\tau}(t_k))$ as the index n of the last switching time that has taken place before or at time $t_k$:

$$i(t_k, \bar{\tau}(t_k)) = \max\{n \in \{0, 1, 2, 3\} : \tau_n(t_k) \le t_k\}, \quad \tau_0 := 0.$$

The cost-to-go function (18) is not affected by the switching times that took place prior to $t_k$ and depends only on the future switching times $\tau_n(t_k)$, $n = i(t_k, \bar{\tau}(t_k)) + 1, \dots, 3$. It should be noted that the actual optimization variable is only the sub-vector of $\bar{\tau}(t_k)$ that contains the future switching times. Nonetheless, we use $\bar{\tau}(t_k)$ in (18) for notational brevity. When $t_k \in [0, T)$, the train solves the following optimization problem:

$$\min_{\bar{\tau}(t_k) \in \Lambda} J(x(t|t_k), \bar{\tau}(t_k)) \quad \text{s.t.} \quad \text{(15)}, \quad p(T|t_k) = P, \quad v(T|t_k) = 0. \tag{19}$$

With $\bar{\tau}(t_k)$ being the optimal control scheme $\bar{\tau}^*(t_k)$, the corresponding predicted state is also optimal and is denoted by $x^*(t|t_k)$ for $t \in [t_k, T)$.
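To make the cost-to-go (18) concrete, the sketch below evaluates it numerically along a predicted rollout from the current state. The piecewise power function used here is an assumption about the structure of L(x): traction power in MT and SH, zero while coasting, and a recovered share ε of the braking power in MB. The constant force limits and all numerical values except ε = 0.4 are hypothetical.

```python
from bisect import bisect_right

# Illustrative assumptions, except EPS, which is the value quoted in the case study.
MASS = 2.0e5
A0, A1, A2 = 2.0e3, 30.0, 6.0
F_TRACTION = 2.0e5
F_BRAKE = 2.0e5
EPS = 0.4


def resistance(v):
    return A0 + A1 * v + A2 * v * v


def force(mode, v):
    """Applied force per mode: 0 = MT, 1 = SH, 2 = CS, 3 = MB."""
    return (F_TRACTION, resistance(v), 0.0, -F_BRAKE)[mode]


def power(mode, v):
    """Assumed piecewise power function L(x), in W."""
    if mode in (0, 1):
        return force(mode, v) * v        # energy drawn for traction
    if mode == 2:
        return 0.0                       # coasting consumes nothing
    return EPS * force(mode, v) * v      # negative: energy recovered while braking


def cost_to_go(p, v, t_k, taus, t_end, dt=0.01):
    """Rectangle-rule estimate of J = integral of L from t_k to T.

    Returns (energy in J, predicted terminal position, predicted terminal speed).
    """
    energy = 0.0
    for k in range(int(round((t_end - t_k) / dt))):
        mode = bisect_right(taus, t_k + k * dt)
        energy += power(mode, v) * dt
        acc = (force(mode, v) - resistance(v)) / MASS
        p += v * dt
        v = max(v + acc * dt, 0.0)
    return energy, p, v
```

Switching times that lie before `t_k` only determine the current mode, so the value depends on the future switching times alone, as noted above.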
The online control of the train is realized through the following process. At each sampling instant $t_k$, the current state information can be obtained directly by on-board equipment. By solving the optimization problem (19), we obtain the optimal switching times $\bar{\tau}^*(t_k)$. As long as the current time satisfies $t_k < \tau_{i(t_k, \bar{\tau}^*(t_k))+1}$, the train continues to operate in the current mode. When $t_k \ge \tau_{i(t_k, \bar{\tau}^*(t_k))+1}$, the train switches to the next mode.
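The receding-horizon process described above can be sketched as a simple loop. The `reoptimize` argument is a hypothetical placeholder for a solver of problem (19); here it is any callable that returns ordered switching times, and the names `last_switch_index` and `mpc_mode_schedule` are introduced for illustration only.

```python
from bisect import bisect_right

MODES = ("MT", "SH", "CS", "MB")


def last_switch_index(t_k, taus):
    """i(t_k, tau): number of switching times that occurred at or before t_k."""
    return bisect_right(taus, t_k)


def mpc_mode_schedule(taus0, t_end, dt, reoptimize):
    """Receding-horizon loop over the sampling instants t_k = k * dt.

    At each instant the switching times are re-optimized and the mode implied
    by the updated times is applied. reoptimize(t_k, taus) stands in for a
    solver of problem (19) and must return ordered switching times.
    """
    taus = list(taus0)
    schedule = []
    k = 0
    while k * dt < t_end:
        t_k = k * dt
        taus = reoptimize(t_k, taus)
        schedule.append((t_k, MODES[last_switch_index(t_k, taus)]))
        k += 1
    return schedule


def keep_plan(t_k, taus):
    """Trivial stand-in optimizer: keep the current switching times."""
    return taus


schedule = mpc_mode_schedule([20.0, 60.0, 85.0], 105.0, 5.0, keep_plan)
```

Replacing `keep_plan` with an optimizer that postpones a future switching time changes the instant at which the corresponding mode switch is applied, which is how the loop reacts to a disturbance.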

Solution Algorithm Development
In this section, we present an algorithm to solve the optimization problem (19). Concerning the switching time optimization problem, a number of algorithms have been developed in [27,30]. However, these algorithms cannot be directly applied to solving the optimization problem (19) due to the existence of a terminal state constraint.
As it is difficult to select the initial value which satisfies the terminal state constraints, the exterior point penalty function method is a reasonable solution.
In addition, the optimal solution should satisfy the constraint $\bar{\tau} \in \Lambda$. One should observe that this constraint can be viewed as a series of linear inequality constraints, so the projection method can be applied. The projection can be obtained easily and quickly by using the algorithm given in reference [31]. Using this method, we can ignore the constraint unless it is active and the iteration point is on the boundary of the feasible set, in which case the projection of the descent direction onto the feasible set should be used as the feasible direction.
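As an illustration of how a candidate vector of switching times can be mapped back into Λ, the sketch below computes a Euclidean projection of a point onto the ordered box {0 ≤ τ_1 ≤ ... ≤ τ_n ≤ T}. It uses the pool-adjacent-violators scheme followed by clipping, which is a standard technique for this kind of set; it is a stand-in chosen for illustration and is not necessarily the algorithm of reference [31], and it projects a point rather than a descent direction.

```python
def project_onto_ordered_box(taus, t_end):
    """Project taus onto {0 <= tau_1 <= ... <= tau_n <= t_end}.

    Pool-adjacent-violators enforces the ordering by averaging neighbouring
    entries that are out of order; clipping to [0, t_end] handles the bounds.
    """
    blocks = []  # each block is [sum of entries, number of entries]
    for x in taus:
        blocks.append([float(x), 1])
        # Merge while the previous block mean exceeds the newest block mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    projected = []
    for s, c in blocks:
        projected.extend([min(max(s / c, 0.0), t_end)] * c)
    return projected
```

For example, an out-of-order pair is replaced by its average, while an already feasible vector is returned unchanged.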
Combining this with the online optimization algorithm given in reference [27], we propose an online algorithm based on the penalty function method.
First, we construct a new cost function (20) by adding two penalty terms, weighted by a penalty factor γ, to the cost-to-go function (18). The first penalty term enforces the running-distance constraint at the terminal time, and the second enforces the zero-speed constraint when the train arrives at the station.
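One common form of an exterior quadratic penalty that matches this description is written out below. It is an assumption about the structure of the penalized cost (20), not necessarily the exact expression used; in particular, the use of a single factor γ for both terms and the squared form of each term are assumed.

```latex
\tilde{J}_{\gamma}\bigl(\bar{\tau}(t_k)\bigr)
  = J\bigl(x(t|t_k), \bar{\tau}(t_k)\bigr)
  + \gamma \Bigl[ \bigl(p(T|t_k) - P\bigr)^{2} + v(T|t_k)^{2} \Bigr]
```

Both penalty terms vanish exactly when the terminal constraints p(T) = P and v(T) = 0 hold, so increasing γ drives the minimizer of the penalized cost toward the feasible set from outside.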
Then, the detailed algorithm to solve the optimization problem (19) is described as follows.
Algorithm 1 mainly comprises three loops. The outermost loop solves the optimization problem at each sampling time step k to obtain the optimal switching times, until the train stops at the next station. The middle loop iteratively updates the penalty factor γ: the minimization of (20) is solved for each value of γ until the total value of the penalty terms falls below a pre-set threshold ς. The innermost loop iteratively updates the switching times until the gradient of the current cost function falls below a pre-set threshold ϑ. Note that at the beginning of Algorithm 1, the related parameters $\bar{\tau}(0)$, γ, ρ, ς and ϑ need to be initialized. At each time step k, the train collects its current state information, including speed and position, and then calculates the optimal switching times by iteratively updating the penalty factor (iteration index j) and the switching times (iteration index q) until both stopping conditions, with respect to ς and ϑ, are satisfied.
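[Algorithm 1: online optimization of the mode switching times. Recoverable steps: the innermost iteration restricts the update to feasible points, sets q = q + 1 and returns to step 4; the penalty-factor update returns to step 3.]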
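The two inner loops described for Algorithm 1 can be sketched on a toy equality-constrained problem. This is a generic exterior penalty scheme with gradient descent, written under stated assumptions: gradients are approximated by central finite differences, the step size is chosen by Armijo backtracking, the projection onto Λ and the outermost sampling loop are omitted, and the toy cost and constraint are hypothetical. The parameter names mirror γ, ρ, ς and ϑ.

```python
import math


def exterior_penalty_minimize(cost, residuals, x0, gamma=1.0, rho=10.0,
                              varsigma=1e-3, vartheta=1e-4,
                              max_outer=20, max_inner=10_000):
    """Minimize cost(x) subject to residuals(x) == 0 with a quadratic exterior penalty.

    Middle loop: multiply gamma by rho until gamma * penalty < varsigma.
    Inner loop: gradient descent until the gradient norm is below vartheta.
    """
    def penalty(x):
        return sum(r * r for r in residuals(x))

    def gradient(f, x, h=1e-6):
        g = []
        for i in range(len(x)):
            xp, xm = list(x), list(x)
            xp[i] += h
            xm[i] -= h
            g.append((f(xp) - f(xm)) / (2.0 * h))
        return g

    x = list(x0)
    for _outer in range(max_outer):              # middle loop over gamma
        def penalized(z):
            return cost(z) + gamma * penalty(z)

        for _inner in range(max_inner):          # innermost loop over x
            g = gradient(penalized, x)
            g2 = sum(gi * gi for gi in g)
            if math.sqrt(g2) < vartheta:
                break
            step, fx = 1.0, penalized(x)
            while step > 1e-12:                  # Armijo backtracking
                trial = [xi - step * gi for xi, gi in zip(x, g)]
                if penalized(trial) <= fx - 1e-4 * step * g2:
                    break
                step *= 0.5
            x = trial
        if gamma * penalty(x) < varsigma:
            break
        gamma *= rho                             # tighten the penalty and re-solve
    return x, gamma


# Toy problem: minimize (x1 - 2)^2 + x2^2 subject to x1 + x2 = 1.
x_opt, gamma_final = exterior_penalty_minimize(
    lambda x: (x[0] - 2.0) ** 2 + x[1] ** 2,
    lambda x: [x[0] + x[1] - 1.0],
    [0.0, 0.0],
)
```

The constrained minimizer of the toy problem is (1.5, -0.5); the penalty iterates approach it from outside the feasible set, and each re-solve is warm-started from the previous solution, which is also how an online scheme can reuse the switching times from the previous sampling instant.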

Case Studies
In this section, to verify the efficiency of the proposed online driving strategy based on optimal switching times, Algorithm 1 is applied to a metro train in numerical experiments. The scenario of an inter-station run for a single train is shown in Figure 3: the operation mode sequence is MT-SH-CS-MB, and an inter-station run with this fixed mode sequence admits many different driving trajectories, each with different switching times. Our objective is to find the one with minimum energy consumption. In Section 5.1, the main train parameters are given. In Section 5.2, case studies of two different inter-station runs with fixed running times are presented. In Section 5.3, we analyze the changes of switching times and energy consumption with different running times. In Section 5.4, the train's operation with a disturbance is analyzed. All computations in the following numerical experiments were performed in MATLAB.

Parameter Setting
Here, the online optimal switching time driving strategy for the metro train DKZ32 is studied, which is used in Beijing Yizhuang metro line of China. Each train consists of three motor cars and three trailer cars; the train is powered by DC 750 V supply via third-rail. The main train parameters are listed as follows.

• The maximum traction force is given as follows.
• The maximum electrical braking force is calculated by:
• The running resistance can be calculated by:
• The efficiency factor ε = 0.4.

Inter-Station Run with Fixed Running Time
In this subsection, we choose two different intervals of the Beijing Yizhuang metro line as examples; their line parameters are shown in Table 1. The optimal speed trajectories obtained by Algorithm 1 for the Jiugong-Yizhuangqiao and Yizhuangqiao-Wenhuayuan intervals are shown in Figures 4 and 5, respectively, where (a) shows the speed trajectories with respect to position and (b) shows those with respect to time. The optimal switching times and the corresponding energy consumption values for the two intervals are given in Table 2.

Analysis of Different Running Times
In this subsection, we analyze the changes of switching times and energy consumption with different running times. We take the interval from Jiugong Station to Yizhuangqiao Station of the Beijing Yizhuang metro line as an example, with given running times of 105, 110, 115 and 120 s, and let the train operate following Algorithm 1. The optimal speed trajectories for these running times are shown in Figure 6, where (a) shows the speed trajectories with respect to position and (b) shows those with respect to time. The switching times and energy consumption for the different running times are shown in Table 3. As the running time increases, the proportion of coasting in the optimal speed trajectory increases and the energy consumption decreases.

Analysis with Disturbance
In this case, we simulate the train operation with a disturbance under Algorithm 1. The line parameters are the same as those in Section 5.3, and the given running time is 105 s. We assume that, because of a disturbance, the train reaches the state $p(t_k)$ = 45 m and $v(t_k)$ = 9.3 m/s at $t_k$ = 9.1 s. The pre-optimized trajectory without disturbance is shown by Tra1 in Figure 6, in which the train speed at $t_k$ = 9.1 s should be 10 m/s. If the disturbance is exerted and the pre-optimized switching times are kept, the train eventually fails to meet the terminal constraints, as illustrated by Tra2 in Figure 6. The trajectory Tra3 in Figure 7 represents the updated result under Algorithm 1. Because the train fails to reach the predetermined speed under the disturbance, the cruising time after the train reaches the speed limit is increased. Under the updated switching times, the train operates in an energy-optimal manner and meets the terminal constraints simultaneously.

Discussion
Firstly, the above case studies verify the feasibility of the proposed algorithm, and the results obtained are consistent with previous conclusions on optimal train control. Secondly, because the optimal train control problem has only three decision variables, the scale of the optimization problem is greatly reduced, which improves the computational efficiency and makes online computation possible. In addition, when the train deviates from the original optimal trajectory due to disturbances, Algorithm 1 can recalculate the optimal strategy for the remaining journey according to the latest state of the train, so the online driving strategy proposed in this paper is robust to disturbances.

Conclusions
This paper has studied the online energy-saving driving strategy for a single metro train. A switched-mode dynamical system model has been established with consideration of regenerative braking energy, which can accurately describe the metro train operation modes and the punctuality constraint. According to previous conclusions on optimal train control, the optimal train operation strategy has a fixed operation sequence (MT-SH-CS-MB). Therefore, the switched-mode system model established in this paper has four modes, and the optimal control in each mode is known and can be regarded as speed feedback control. As long as the current state of the train is obtained, the optimal control under the current mode can be obtained, so the critical problem is to determine the optimal mode switching times. On this basis, an online optimization problem with the switching times as the optimization variables has been formulated to minimize the energy consumption by employing the MPC approach, and an online algorithm to solve the optimization problem has been introduced by utilizing the exterior point penalty function method. The proposed online algorithm can not only respond in a timely way to disturbances during train operation, but also improve the computational efficiency, which enables online real-time application. Case studies have been presented to verify the feasibility and advantages of the proposed algorithm. As future work, we may further consider online multi-train energy-saving operation based on switched-mode dynamical systems.
Author Contributions: Conceptualization, methodology, software and writing-original draft preparation were performed by F.S.; writing-review and editing were performed by J.Z.; resources, supervision, and project administration were performed by Y.C. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A
In this appendix, we list the notation in this paper, as shown in Table A1.