Article

Economic Machine-Learning-Based Predictive Control of Nonlinear Systems

by Zhe Wu 1 and Panagiotis D. Christofides 1,2,*

1 Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, CA 90095-1592, USA
2 Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095-1592, USA
* Author to whom correspondence should be addressed.
Mathematics 2019, 7(6), 494; https://doi.org/10.3390/math7060494
Submission received: 2 May 2019 / Revised: 24 May 2019 / Accepted: 28 May 2019 / Published: 1 June 2019

Abstract: In this work, a Lyapunov-based economic model predictive control (LEMPC) method is developed to address economic optimality and closed-loop stability of nonlinear systems using machine-learning-based models to make predictions. Specifically, an ensemble of recurrent neural network (RNN) models is first developed via k-fold cross-validation to capture the process dynamics in an operating region. Then, the LEMPC using an RNN ensemble is designed to simultaneously maintain the closed-loop state in a stability region and optimize process economic benefits. Parallel computing is employed to improve the computational efficiency of the real-time implementation of the LEMPC with an RNN ensemble. The proposed machine-learning-based LEMPC method is demonstrated using a nonlinear chemical process example.

1. Introduction

Economic model predictive control (EMPC), which integrates optimization of process economics with process control, has attracted significant attention over the last decade (e.g., [1,2,3]). Unlike the traditional two-layered real-time optimization (RTO) architecture, which continuously optimizes process set-points in the upper layer and applies a tracking model predictive controller or a proportional-integral-derivative controller in the lower layer, EMPC operates the system in a time-varying fashion (off steady-state) by dynamically optimizing an economic cost function subject to stability constraints in a single layer. A large body of research has addressed EMPC in terms of closed-loop stability, economic optimality, model uncertainty and process operational safety (e.g., [4,5,6]). However, to achieve a desired closed-loop performance under EMPC, a reliable process model that can accurately predict the future state evolution is a key requirement. To that end, multiple linear state-space models have been used within EMPC to capture the nonlinear dynamics in the operating region in [7]. Additionally, a well-conditioned polynomial nonlinear state-space model has been used in [8] to approximate the actual (first-principles) nonlinear process model, normally serving as a substitute for it within EMPC. Given the recent interest in machine learning modeling, and particularly its use in solving nonlinear regression problems ([9,10]), in this work we derive a nonlinear data-driven process model for EMPC by taking advantage of machine learning techniques.
Machine learning has been widely used in a variety of engineering problems such as classification, pattern recognition, and numerical prediction. Among machine learning methods, neural networks have shown great success in regression problems. With the rapid development of neural network libraries and application programming interfaces (e.g., TensorFlow [11], Caffe [12], Keras [13], etc.), building a neural network has become intuitive and straightforward. Additionally, neural-network-based modeling outperforms other data-driven modeling methods due to its ability to implicitly detect complex nonlinearities and the availability of multiple training algorithms ([14]). Neural network research has undergone several phases in which different theoretical approaches led to an increasing understanding of the power and limitations of these models. For example, in the 1980s, the first RNN (i.e., the Hopfield network) was created for pattern recognition ([15]). At that time, implementing a neural network with many layers for complex systems was a technical challenge [16]. Subsequently, deep learning was proposed to alleviate the vanishing-gradient problem in RNNs, which provided a new perspective to the machine learning field [17]. Compared to feedforward neural networks, an RNN is able to memorize previous information in the network and can thus be applied to model a nonlinear dynamical system from time-series data. However, since the majority of research has focused on validating the identified system, it is not clear how RNN models can be incorporated in EMPC with guaranteed closed-loop stability.
In addition, ensemble learning, a machine learning paradigm in which multiple models are trained for the same problem, has gained increasing attention in recent years ([18,19]). By taking advantage of multiple learning algorithms or various training datasets, ensemble learning provides better predictive performance than a single machine learning model. However, the consequent challenge for implementing ensemble regression models in process control is how to evaluate the ensemble of RNNs efficiently and in a timely fashion. Although reducing the computation time of the neural network training process remains an important research topic in the machine learning community, process engineers are more concerned with the real-time implementation of neural networks in process control systems. To address these computational implementation issues, parallel computing ([20]) is employed to evaluate multiple RNN models simultaneously within EMPC.
Motivated by the above considerations, a Lyapunov-based EMPC method using an ensemble of RNN models is developed in this work. Specifically, in Section 2, the notation, the class of nonlinear systems considered and the stabilizability assumptions are introduced. In Section 3, the class of RNN models and the learning algorithm are presented. Then, ensemble learning is used to obtain multiple RNN models based on k-fold cross-validation, while parallel computing is used to improve computational efficiency. Subsequently, in Section 4, the Lyapunov-based EMPC that incorporates an RNN ensemble is formulated to guarantee the boundedness of the state in the closed-loop stability region, accounting for sufficiently small modeling error, sufficiently small sampling time and bounded disturbances. Finally, in Section 5, the proposed LEMPC method is applied to a chemical process example to demonstrate its guaranteed closed-loop stability and economic optimality properties.

2. Preliminaries

2.1. Notation

Throughout the manuscript, the operator $|\cdot|$ denotes the Euclidean norm of a vector. $x^T$ is used to denote the transpose of $x$. The function $f(\cdot)$ is of class $\mathcal{C}^1$ if it is continuously differentiable in its domain. The notation $L_f V(x)$ denotes the standard Lie derivative $L_f V(x) := \frac{\partial V(x)}{\partial x} f(x)$. The symbol "$\setminus$" denotes set subtraction, i.e., $A \setminus B := \{ x \in \mathbf{R}^n \mid x \in A,\ x \notin B \}$. A continuous function $\alpha : [0, a) \to [0, \infty)$ is said to belong to class $\mathcal{K}$ if $\alpha(0) = 0$ and it is strictly increasing.

2.2. Class of Systems

We consider the class of continuous-time nonlinear systems described by the following system of first-order nonlinear ordinary differential equations:

$$\dot{x} = F(x, u, w) := f(x) + g(x)u + h(x)w, \quad x(t_0) = x_0 \qquad (1)$$

where the state vector is $x \in \mathbf{R}^n$, the manipulated input vector is $u \in \mathbf{R}^m$, and the disturbance vector is $w \in W$ with $W := \{ w \in \mathbf{R}^q \mid |w| \le w_m,\ w_m \ge 0 \}$. The control action constraint is defined by $u \in U := \{ u_i^{\min} \le u_i \le u_i^{\max},\ i = 1, \ldots, m \} \subset \mathbf{R}^m$. It is assumed that $f(\cdot) : \mathbf{R}^n \to \mathbf{R}^n$, $g(\cdot) : \mathbf{R}^n \to \mathbf{R}^{n \times m}$, and $h(\cdot) : \mathbf{R}^n \to \mathbf{R}^{n \times q}$ are sufficiently smooth vector and matrix functions. Additionally, it is assumed that the initial time $t_0$ is zero (i.e., $t_0 = 0$) and $f(0) = 0$. Therefore, a steady-state of the nominal system of Equation (1) (i.e., $w(t) \equiv 0$) is at the origin.
A stabilizing control law $u = \Phi(x) \in U$ that renders the origin of the nominal system of Equation (1) (i.e., $w(t) \equiv 0$) exponentially stable is assumed to exist, in the sense that there exists a $\mathcal{C}^1$ Control Lyapunov function $V(x)$ such that the following inequalities hold for all $x$ in an open neighborhood $D$ around the origin:

$$c_1 |x|^2 \le V(x) \le c_2 |x|^2 \qquad (2a)$$
$$\frac{\partial V(x)}{\partial x} F(x, \Phi(x), 0) \le -c_3 |x|^2 \qquad (2b)$$
$$\left| \frac{\partial V(x)}{\partial x} \right| \le c_4 |x| \qquad (2c)$$

where $c_1$, $c_2$, $c_3$ and $c_4$ are positive constants, and $F(x, u, w)$ is the nonlinear system of Equation (1). Based on the universal Sontag control law ([21]), a candidate controller $\Phi(x)$ is given by the saturated control law accounting for the input constraint $u \in U$.
The stability region for the closed-loop system of Equation (1) is characterized as a level set of the Lyapunov function $V(x)$ within $D$ where the time-derivative of $V$ is negative. Specifically, based on Equation (2), we can first find a region $\phi_u$ where $\dot{V}(x)$ is rendered negative under the controller $u = \Phi(x) \in U$: $\phi_u := \{ x \in \mathbf{R}^n \mid \dot{V}(x) = L_f V + L_g V u < -kV(x),\ u = \Phi(x) \in U \} \cup \{0\}$, with $k > 0$. Then the closed-loop stability region $\Omega_\rho$ is characterized as a subset of $\phi_u$: $\Omega_\rho := \{ x \in \phi_u \mid V(x) \le \rho \}$, where $\rho > 0$. Since $\Omega_\rho \subseteq \phi_u$, it follows that $\Omega_\rho$ is a forward invariant set. Additionally, the Lipschitz property of $F(x, u, w)$ combined with the bounds on $u$ and $w$ implies that there exist positive constants $M$, $L_x$, $L_w$, $L'_x$, $L'_w$ such that the following inequalities hold for all $x, x' \in D$, $u \in U$, and $w \in W$:
$$|F(x, u, w)| \le M \qquad (3a)$$
$$|F(x, u, w) - F(x', u, 0)| \le L_x |x - x'| + L_w |w| \qquad (3b)$$
$$\left| \frac{\partial V(x)}{\partial x} F(x, u, w) - \frac{\partial V(x')}{\partial x} F(x', u, 0) \right| \le L'_x |x - x'| + L'_w |w| \qquad (3c)$$
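Although the sets above are defined analytically, in practice the level $\rho$ is typically found numerically. The following is a minimal grid-based sketch of such a construction (our illustration, not a procedure given in the paper); `V` and `Vdot_cl` are assumed to be user-supplied implementations of $V(x)$ and $\dot{V}(x)$ under $u = \Phi(x)$:

```python
import numpy as np

def characterize_stability_region(grid, V, Vdot_cl, k):
    """Grid-based sketch of constructing Omega_rho: phi_u collects the grid
    points where V_dot < -k V under u = Phi(x); rho is then the largest level
    of V whose sublevel set (on the grid) contains no point outside phi_u."""
    V_vals = np.array([V(x) for x in grid])
    in_phi_u = np.array([Vdot_cl(x) < -k * V(x) for x in grid])
    failing = V_vals[~in_phi_u]
    rho = failing.min() if failing.size > 0 else V_vals.max()
    return rho
```

The resolution of the grid determines how faithfully the level set is captured; a finer grid gives a less conservative estimate of $\rho$ at a higher computational cost.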
Remark 1.
Since closed-loop stability is achieved for the nonlinear system of Equation (1) under the controller $u = \Phi(x) \in U$ for all initial conditions $x_0 \in \Omega_\rho$ ([4]), the dataset for training recurrent neural networks will be generated within the closed-loop stability region $\Omega_\rho$, such that there exist feasible control actions $u \in U$ for the resulting recurrent neural network that captures the nonlinear dynamics in $\Omega_\rho$.

3. Recurrent Neural Network

In this work, we develop a nonlinear recurrent neural network (RNN) model of the following form:

$$\dot{\hat{x}} = F_{nn}(\hat{x}, u) := A\hat{x} + \Theta^T y \qquad (4)$$

where $\hat{x} \in \mathbf{R}^n$ is the RNN state vector and $u \in \mathbf{R}^m$ is the manipulated input vector. $y = [y_1, \ldots, y_n, y_{n+1}, \ldots, y_{m+n}] = [\sigma(\hat{x}_1), \ldots, \sigma(\hat{x}_n), u_1, \ldots, u_m] \in \mathbf{R}^{n+m}$ is a vector of both the network state $\hat{x}$ and the input $u$, where $\sigma(\cdot)$ is the nonlinear activation function (e.g., a sigmoid function $\sigma(x) = 1/(1 + e^{-x})$). $A$ is a diagonal matrix, i.e., $A = \mathrm{diag}\{-a_1, \ldots, -a_n\} \in \mathbf{R}^{n \times n}$, and $\Theta = [\theta_1, \ldots, \theta_n] \in \mathbf{R}^{(m+n) \times n}$ is a coefficient matrix with $\theta_i = b_i [w_{i1}, \ldots, w_{i(m+n)}]$, $i = 1, \ldots, n$. $a_i$ and $b_i$ are constants, where $a_i$ is assumed to be positive such that each state $\hat{x}_i$ is bounded-input bounded-state stable. $w_{ij}$ is the weight connecting the $j$th input to the $i$th neuron, where $i = 1, \ldots, n$ and $j = 1, \ldots, (m+n)$. Since the RNN model of Equation (4) is affine in the manipulated inputs $u$, it can also be represented in a form similar to Equation (1) by $\dot{\hat{x}} = \hat{f}(\hat{x}) + \hat{g}(\hat{x})u$, where $\hat{f}(\cdot)$ and $\hat{g}(\cdot)$ are derived from the coefficients $A$ and $\Theta$ in Equation (4). Similarly, $\hat{f}(\cdot)$ and $\hat{g}(\cdot)$ are assumed to be sufficiently smooth vector and matrix functions of dimensions $n \times 1$ and $n \times m$, respectively.
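For concreteness, the right-hand side of Equation (4) can be evaluated as in the following minimal sketch (ours, not the authors' code), where the diagonal of $A$ is stored as the vector of positive constants $a_i$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_rhs(x_hat, u, a, Theta):
    """Right-hand side of the RNN model of Equation (4):
    x_hat_dot = A x_hat + Theta^T y, with A = diag{-a_1, ..., -a_n} and
    y = [sigma(x_hat); u] in R^(n+m); Theta is (m+n) x n, so Theta^T y is in R^n."""
    y = np.concatenate([sigmoid(x_hat), u])
    return -a * x_hat + Theta.T @ y
```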
As shown in Figure 1, RNNs possess a certain type of memory by introducing feedback loops in the hidden layer, such that the state information derived from earlier inputs remains in the network. Additionally, if we unfold one of the RNN states in time as shown in Figure 1, the manipulated input vector $u_t$ and the previous state vector $x_{t-1}$ are fed into the current state $x_t$ with the weight matrix $\Theta$, where $x_{t-1}$, $x_t$, and $x_{t+1}$ are all internal states of an RNN model.
Because RNNs are capable of capturing nonlinear dynamic behavior, and because the universal approximation theorem ([22,23]) shows that an RNN model with sufficiently many neurons can approximate any nonlinear dynamic system on compact subsets of the state-space over a finite time interval, the RNN model of Equation (4) is used to approximate the continuous-time nonlinear system of Equation (1) in this work.

3.1. RNN Learning Algorithm

The RNN learning algorithm is developed to obtain the optimal weight matrix $\Theta^*$, under which the modeling error between the nominal system of Equation (1) (i.e., $w(t) \equiv 0$) and the RNN model of Equation (4) is minimized. The modeling error $\nu := F(x, u, 0) - F_{nn}(\hat{x}, u)$ under $\Theta = \Theta^*$ is assumed to be nonzero, since the obtained RNN model may not perfectly approximate the nominal system of Equation (1) due to an insufficient number of nodes or layers. Additionally, the modeling error $\nu(t)$ is bounded (i.e., $|\nu(t)| \le \nu_m$, $\nu_m > 0$), since the RNN model of Equation (4) is developed based on the dataset generated within $\Omega_\rho$, where $x(t)$ and $u(t) \in U$ are bounded. The weight vector $\theta_i$ of the RNN model of Equation (4) is also bounded by $|\theta_i| \le \theta_m$, with $\theta_m > 0$, to prevent the weights from drifting to infinity. Throughout the manuscript, we will use $x(t)$ to represent the state of the nonlinear system of Equation (1) and $\hat{x}(t)$ to represent the state of the RNN model of Equation (4).
According to the definition of the modeling error $\nu$, the nominal system of Equation (1) (i.e., $w(t) \equiv 0$) can be written in the following form:

$$\dot{x}_i = -a_i x_i + \theta_i^{*T} y + \nu_i, \quad i = 1, \ldots, n \qquad (5)$$

where the optimal weight vector $\theta_i^*$ is obtained as follows:

$$\theta_i^* := \arg\min_{|\theta_i| \le \theta_m} \left\{ \sum_{k=1}^{N_d} \left| F_i(x_k, u_k, 0) + a_i x_k - \theta_i^T y_k \right| \right\} \qquad (6)$$
where $N_d$ is the number of data samples used for training. Consider the state error $e = \hat{x} - x \in \mathbf{R}^n$. Based on Equations (4) and (5), the time-derivative of the state error is derived as follows:

$$\dot{e}_i = \dot{\hat{x}}_i - \dot{x}_i = -a_i e_i + \zeta_i^T y - \nu_i, \quad i = 1, \ldots, n \qquad (7)$$

where $\zeta_i = \theta_i - \theta_i^*$ is the error between the current weight vector $\theta_i$ and the unknown optimal weight vector $\theta_i^*$. The weight vector $\theta_i$ is updated by the following equation:

$$\dot{\theta}_i = -\eta_i y e_i, \quad i = 1, \ldots, n \qquad (8)$$

where the learning rate $\eta_i$ is a positive real number. Based on the learning law of Equation (8), the following theorem is established to demonstrate that the state error $e$ remains bounded and is constrained by the modeling error $|\nu|$.
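A minimal sketch of how the learning law of Equation (8) could be integrated numerically (ours; the projection step enforcing $|\theta_i| \le \theta_m$ is one common way to realize the boundedness assumption stated above, not a step given in the paper):

```python
import numpy as np

def weight_update_step(Theta, y, e, eta, theta_m, dt):
    """One explicit-Euler step of the learning law of Equation (8),
    theta_i_dot = -eta_i * y * e_i, followed by a projection of each
    column theta_i back onto the ball |theta_i| <= theta_m."""
    for i in range(Theta.shape[1]):
        Theta[:, i] -= dt * eta[i] * y * e[i]
        norm = np.linalg.norm(Theta[:, i])
        if norm > theta_m:
            Theta[:, i] *= theta_m / norm
    return Theta
```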
Theorem 1.
Consider the RNN model of Equation (4) with the weight updating law of Equation (8). In the presence of a nonzero modeling error $\nu$, the state error $e_i$ and the weight error $\zeta_i$ are bounded, and there exist $\lambda \in \mathbf{R}$ and $\mu > 0$ such that the following inequality holds:

$$\int_0^t |e(\tau)|^2 \, d\tau \le \lambda + \mu \int_0^t |\nu(\tau)|^2 \, d\tau \qquad (9)$$
Proof. 
We first define a Lyapunov function $\tilde{V} = \frac{1}{2} \sum_{i=1}^n \left( e_i^2 + \zeta_i^T \eta_i^{-1} \zeta_i \right)$ and prove the boundedness of $e_i$ and $\zeta_i$ by showing that $\tilde{V}$ is bounded. Specifically, using Equations (7) and (8) and the fact that $\dot{\zeta}_i = \dot{\theta}_i$, the time-derivative of $\tilde{V}$ is first derived as follows:

$$\dot{\tilde{V}} = \sum_{i=1}^n \left( e_i \dot{e}_i + \eta_i^{-1} \zeta_i^T \dot{\zeta}_i \right) = \sum_{i=1}^n \left( -a_i e_i^2 - e_i \nu_i \right) \qquad (10)$$

In the presence of modeling error $\nu_i \neq 0$, it is observed that $\dot{\tilde{V}} \le 0$ does not hold for all times. Therefore, let $\alpha := \min\{ a_i, 1/\lambda_m,\ i = 1, \ldots, n \}$ and $\beta := \sum_{i=1}^n \left( 2\theta_m^2 + \nu_m^2 / 2a_i \right)$, where $\lambda_m$ represents the maximum eigenvalue of $\eta_i^{-1}$. The following inequality is further derived based on Equation (10):

$$\dot{\tilde{V}} = \sum_{i=1}^n \left[ \left( -\frac{a_i}{2} e_i^2 - \frac{1}{2} |\zeta_i|^2 \right) + \left( \frac{1}{2} |\zeta_i|^2 - \frac{a_i}{2} e_i^2 - e_i \nu_i \right) \right] \le -\alpha \tilde{V} + \sum_{i=1}^n \left( \frac{1}{2} |\zeta_i|^2 + \frac{1}{2a_i} \nu_i^2 \right) \qquad (11)$$

Since the weight vector is bounded by $|\theta_i| \le \theta_m$, it follows that $\frac{1}{2} |\zeta_i|^2 \le 2\theta_m^2$. According to the definitions of $\alpha$ and $\beta$, it follows that $\dot{\tilde{V}} \le -\alpha \tilde{V} + \beta$. Therefore, for all $\tilde{V} \ge V_0 = \beta / \alpha$, $\dot{\tilde{V}} \le 0$ holds. This implies that $\tilde{V}$ is bounded, and thus, $e_i$ and $\zeta_i$ are bounded as well. Additionally, from Equation (10), it is readily shown that the following relations hold:

$$\dot{\tilde{V}} = \sum_{i=1}^n \left( -\frac{a_i}{2} e_i^2 - \left( \frac{a_i}{2} e_i^2 + e_i \nu_i + \frac{1}{2a_i} \nu_i^2 \right) + \frac{1}{2a_i} \nu_i^2 \right) \le \sum_{i=1}^n \left( -\frac{a_i}{2} e_i^2 + \frac{1}{2a_i} \nu_i^2 \right) \qquad (12)$$

From Equation (12), $\tilde{V}(t)$ is obtained as follows:

$$\tilde{V}(t) \le \tilde{V}(0) + \sum_{i=1}^n \left( -\frac{a_i}{2} \int_0^t e_i(\tau)^2 \, d\tau + \frac{1}{2a_i} \int_0^t \nu_i(\tau)^2 \, d\tau \right) \le \tilde{V}(0) - \frac{a_{\min}}{2} \int_0^t |e(\tau)|^2 \, d\tau + \frac{1}{2a_{\min}} \int_0^t |\nu(\tau)|^2 \, d\tau \qquad (13)$$

where $a_{\min}$ is the minimum value of $a_i$, $i = 1, \ldots, n$. Let $\lambda = \frac{2}{a_{\min}} \sup_{t \ge 0} \left( \tilde{V}(0) - \tilde{V}(t) \right)$ and $\mu = 1/a_{\min}^2$. The following inequality is derived:

$$\int_0^t |e(\tau)|^2 \, d\tau \le \frac{2}{a_{\min}} \left( \tilde{V}(0) - \tilde{V}(t) \right) + \frac{1}{a_{\min}^2} \int_0^t |\nu(\tau)|^2 \, d\tau \le \lambda + \mu \int_0^t |\nu(\tau)|^2 \, d\tau \qquad (14)$$

The above inequality shows that the state error $e$ is bounded and is proportional to the modeling error $|\nu|$ between the RNN model of Equation (4) and the nominal system of Equation (1) (i.e., $w(t) \equiv 0$). Furthermore, if there exists a constant $C > 0$ such that $\int_0^\infty |\nu(t)|^2 \, dt$ is bounded (i.e., $\int_0^\infty |\nu(t)|^2 \, dt = C < \infty$), then it follows that $\int_0^\infty |e(t)|^2 \, dt \le \lambda + \mu C < \infty$. Therefore, $e(t)$ converges to zero asymptotically, since $e(t)$ is a uniformly continuous function, as implied by the boundedness of its derivative $\dot{e}$. □

3.2. Ensemble of RNN Models

In this section, an RNN model is initially developed to approximate the nonlinear system of Equation (1) within an operating region. Based on that model, ensemble learning using the aforementioned RNN learning algorithm is employed together with k-fold cross-validation to improve the prediction accuracy of the RNN models.

3.2.1. Development of RNN Models

The development of RNN models consists of two steps: data generation and training. In general, the dataset for training an RNN model is a collection of sampled data from industrial processes, laboratory experiments, or simulations of a given system. There are many high-quality training datasets available, for example, the collection of databases in the UCI Machine Learning Repository [24] that is widely used by researchers in the machine learning community. In this work, we use the simulation data of the nominal system of Equation (1) (i.e., $w(t) \equiv 0$) as the training and validation datasets for the RNN model of Equation (4). Specifically, to generate a dataset that captures the system dynamics for $x \in \Omega_\rho$ and $u \in U$, open-loop simulations are first conducted for initial conditions $x_0 \in \Omega_\rho$ with the constrained input vector $u \in U$. It should be mentioned that the continuous system of Equation (1) is simulated in a sample-and-hold fashion (i.e., the input is a piecewise constant function, $u(t) = u(t_k)$, $\forall t \in [t_k, t_{k+1})$, where $t_{k+1} := t_k + \Delta$ and $\Delta$ is the sampling period). The nominal system of Equation (1) is integrated via the explicit Euler method with a sufficiently small integration time step $h_c < \Delta$. The dataset is generated with the following steps: (1) The targeted region $\Omega_\rho$ and the range of inputs $u \in U$ are first discretized with sufficiently small intervals. (2) Open-loop simulations of the nominal system of Equation (1) are performed in a sample-and-hold fashion with various initial states $x_0 \in \Omega_\rho$ and inputs $u \in U$ to obtain a large number of state trajectories of fixed time length, sweeping over all the values that $(x, u)$ can take. It is noted that data points are collected at every integration time step $h_c$, instead of every sampling step $\Delta$, to obtain as many intermediate states as possible and thereby approximate the continuous system of Equation (1). (3) The obtained state trajectories are then separated into many time-series data samples of length $P_{nn}$, which represents the prediction period for the RNN models. In other words, the RNN model is developed to predict future states over $t \in [t_k, t_k + P_{nn}]$, where all the data points between $t_k$ and $t_k + P_{nn}$ are treated as internal states of the RNN model with time interval $h_c$ (i.e., the time interval between two consecutive states $x_{t-1}$ and $x_t$ for the unfolded RNN shown in Figure 1 is equal to $h_c$). (4) The dataset is partitioned into training, validation and testing datasets. A sketch of this data-generation procedure is given below.
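The following minimal sketch (ours, not the authors' code) illustrates steps 2 and 3 above, under the simplifying assumption that the input is held constant over the whole trajectory rather than redrawn every sampling period; `F` stands for the nominal right-hand side of Equation (1):

```python
import numpy as np

def generate_open_loop_data(F, x0_list, u_list, h_c, P_nn):
    """Open-loop data generation: integrate the nominal system of
    Equation (1) with explicit Euler at step h_c and record every
    intermediate state, producing time-series samples of length P_nn."""
    samples = []
    n_steps = int(round(P_nn / h_c))
    for x0 in x0_list:
        for u in u_list:
            x = np.array(x0, dtype=float)
            traj = [x.copy()]
            for _ in range(n_steps):
                x = x + h_c * F(x, u)   # explicit Euler step
                traj.append(x.copy())
            samples.append((np.stack(traj), np.array(u)))
    return samples
```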
Next, the RNN model is developed based on the dataset generated from the open-loop simulations via an application programming interface (API), i.e., Keras ([13]). To guarantee that the RNN model of Equation (4) achieves good performance in a neighborhood around the origin and has the same equilibrium point as the nonlinear system of Equation (1), the modeling error is further required to be bounded by $|\nu| \le \gamma |x| \le \nu_m$ during the training process. Before starting an RNN training process, the number of layers and nodes, the optimizer, and the type of loss function need to be specified. Specifically, in our work, the number of layers and nodes for the RNNs is determined through a grid search, and the adaptive moment estimation method (i.e., Adam in Keras) is used as the RNN optimizer. The loss function is the mean absolute percentage error between the predicted states $\hat{x}$ and the actual states $x$ from the training data. Additionally, to prevent the RNNs from over-training, an early-stopping criterion is employed to terminate the training process before the maximum number of epochs is reached if the following two conditions are satisfied: (1) The modeling error is below its threshold. (2) The performance on the validation dataset stops improving.
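For illustration, a minimal Keras sketch consistent with the description above (ours, not the authors' code); the layer widths of 96 and 64 follow Section 5, and `n` and `m` denote the state and input dimensions:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense
from tensorflow.keras.callbacks import EarlyStopping

def build_rnn(n, m):
    """Two recurrent hidden layers trained with Adam on a mean absolute
    percentage error loss, matching the training setup described above."""
    model = Sequential([
        SimpleRNN(96, return_sequences=True, input_shape=(None, n + m)),
        SimpleRNN(64, return_sequences=True),
        Dense(n),
    ])
    model.compile(optimizer='adam', loss='mean_absolute_percentage_error')
    return model

# X_*: (samples, timesteps, n + m) inputs; Y_*: (samples, timesteps, n) targets.
# model = build_rnn(n, m)
# model.fit(X_train, Y_train, validation_data=(X_val, Y_val), epochs=500,
#           callbacks=[EarlyStopping(monitor='val_loss', patience=10)])
```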
Remark 2.
It is noted that the original data samples from the open-loop simulations are shortened to a fixed length, $P_{nn}$, in step 3 of the data generation procedure above. Through this data preprocessing step, the dataset becomes applicable for developing an RNN with prediction period $P_{nn}$, which is chosen to be an integer multiple of the sampling period $\Delta$ such that the RNN model of Equation (4) is able to predict future states for at least one sampling period. One of the advantages of introducing steps 2 and 3 is that a new dataset can be conveniently generated for an RNN with a different prediction period $P_{nn}$, while the original dataset in step 2 only needs to be created once.
Remark 3.
Unlike feedforward neural networks (FNNs), which take the state measurement $x(t_k)$ and the manipulated inputs $u(t_k)$ at time $t = t_k$ to predict the state at $t = t_k + P_{nn}$ only, RNNs use all the intermediate states between $x(t_k)$ and $x(t_k + P_{nn})$ to predict the state trajectory over this prediction period. In other words, all the internal states with time step $h_c$ are used as one of the inputs (the other being the manipulated input $u$) to the RNN (shown in Figure 1) over $t \in [t_k, t_k + P_{nn}]$ during the training process. Therefore, besides the final predicted state $x(t_k + P_{nn})$, all the preceding internal states can be predicted and exported (if needed) by the RNN. Additionally, it is important to mention that since the time interval for the RNN internal states is chosen to be the sufficiently small integration time step $h_c$ that is used to simulate the continuous-time nonlinear system of Equation (1), the obtained RNN model can be regarded as a good representation of the continuous-time RNN model of Equation (4).
Remark 4.
Data quality plays an important role in data-driven modeling techniques, including RNN modeling. In case a good RNN model is initially not available due to poor data quality, the RNN model can be improved via on-line updating using real-time process state measurements. Additionally, if the real-time process measurements are of poor quality, for example, noisy measurements, a data preprocessing step can be employed before training an RNN. Overall, creating a good dataset remains an important issue in RNN modeling; in the chemical reactor example of Section 5, the data are generated from open-loop simulations.
Remark 5.
The RNN models considered in our work involve the use of nonlinear terms in the dynamic model (this can be thought of as a nonlinear process gain $k(x)$ multiplying the input), which allows capturing a change in the sign of the gain across different regions of the operating state-space, if it occurs. This issue does not arise in the operating region of the chemical reactor example considered in Section 5. The reader may refer to [25] for an approach to neural network modeling that captures gain change.

3.2.2. Ensemble Learning

Ensemble learning is a machine learning process that combines multiple learning algorithms to obtain better prediction performance ([26]) in terms of reduced variability and improved generalization. The reasons why ensemble regression models are able to improve prediction performance are given in [10,26]. In this work, homogeneous ensemble regression models are constructed following the RNN learning algorithm of Section 3.1 and k-fold cross-validation. Specifically, as shown in Figure 2, the dataset is initially separated into k folds of equal size, where $k-1$ folds are used for training and the remaining one is used for validation. By choosing a different subset as the validation dataset each time, a total of k different RNNs can be obtained. Subsequently, the final prediction of the ensemble of RNN models is calculated using the stacking method, which averages the predictions of the individual RNN models. It is noted that, besides the stacking method, other ensemble-based algorithms such as boosting and bagging [19] can also be applied to improve the overall prediction performance within the targeted operating region.
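A minimal sketch of the k-fold construction and the averaging combiner (ours; `build_and_train` is a hypothetical hook that trains one RNN and returns a model object exposing a `predict` method):

```python
import numpy as np
from sklearn.model_selection import KFold

def train_rnn_ensemble(X, Y, k, build_and_train):
    """Train k RNNs, each on k-1 folds with the remaining fold held out
    for validation, as in the k-fold scheme of Figure 2."""
    models = []
    for train_idx, val_idx in KFold(n_splits=k, shuffle=True).split(X):
        models.append(build_and_train(X[train_idx], Y[train_idx],
                                      X[val_idx], Y[val_idx]))
    return models

def ensemble_predict(models, inputs):
    """Stacking by simple averaging of the individual RNN predictions."""
    return np.mean([model.predict(inputs) for model in models], axis=0)
```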

3.2.3. Parallel Computing

Parallel computing is a type of computation in which multiple processes or computational tasks are executed simultaneously. Since the final prediction in ensemble learning is obtained by combining the results of individual RNN models that are independent of each other, the computational tasks for all RNN models can be carried out in parallel to improve computational efficiency. The algorithm for parallel computation of an ensemble of k RNN models is shown in Figure 3. The blocks on the left demonstrate the state prediction using parallel computing, and the dashed blocks on the right represent the actual nonlinear system of Equation (1) and the model-based controller that will be discussed in the next section. Specifically, we first reserve $k+1$ nodes, where node 0 is the host node and the rest are worker nodes. The host node is used to exchange information, while the worker nodes are mainly used for computation. Each worker node is assigned a corresponding RNN model for prediction. At time $t = t_k$, the model-based controller receives the state measurement $x(t_k)$ and sends it to host node 0 along with the manipulated inputs $u_g(t_k)$ (i.e., the guess of control actions over the prediction horizon). The host node broadcasts $x(t_k)$ and $u_g(t_k)$ to all worker nodes such that all RNN models share the same input vector. Predictions of the RNN models are carried out in all worker nodes simultaneously. It is noted that a synchronization barrier is placed before the ensemble combiner in node 0 to ensure that all the computation tasks in the worker nodes are completed. Lastly, the host node combines the prediction results from the multiple RNN models to obtain the predicted states over $t \in [t_k, t_k + P_{nn}]$ and sends them to the model-based controller. The optimal control action $u^*(t_k)$ is calculated by the optimization solver in the model-based controller by iterating over guesses of the control actions $u_g(t_k)$, and is then sent to the nonlinear process of Equation (1) to be applied over the next sampling period. The above process is repeated as the horizon of the nonlinear process of Equation (1) is rolled one sampling period forward each time.
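A minimal MPI4Py sketch of the host/worker pattern of Figure 3 (ours, not the authors' code; `rnn_models`, `get_measurement` and `initial_input_guess` are hypothetical placeholders). The `gather` call at the host acts as the synchronization barrier before the ensemble combiner:

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # rank 0: host node; ranks 1..k: worker nodes

# Host collects x(t_k) and the input guess u_g(t_k) and broadcasts them
if rank == 0:
    data = (get_measurement(), initial_input_guess())
else:
    data = None
x_tk, u_guess = comm.bcast(data, root=0)

# Each worker evaluates its own RNN model over the prediction horizon
pred = rnn_models[rank - 1].predict(x_tk, u_guess) if rank > 0 else None

# gather() synchronizes all workers; the host then averages the predictions
preds = comm.gather(pred, root=0)
if rank == 0:
    x_pred = np.mean([p for p in preds if p is not None], axis=0)
```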

4. Lyapunov-Based EMPC Using Ensemble RNN Models

In this section, the Lyapunov-based economic MPC (LEMPC) that incorporates an ensemble of RNN models to predict future states is developed for the nonlinear system of Equation (1). In the following subsections, a Lyapunov-based controller using the RNN model of Equation (4) is first used to characterize the closed-loop stability region in which the origin of the nonlinear system of Equation (1) can be rendered exponentially stable. Then, the formulation of the LEMPC using an ensemble of RNN models is given and Theorem 2 is established to demonstrate closed-loop stability for all initial conditions in the closed-loop stability region.

4.1. Lyapunov-Based Control Using RNN Models

We assume that there exists a stabilizing feedback controller $u = \Phi_{nn}(x) \in U$ that can render the origin of the RNN model of Equation (4) exponentially stable in an open neighborhood $\hat{D}$, in the sense that there exists a $\mathcal{C}^1$ Lyapunov function $\hat{V}(x)$ such that the following inequalities hold for all $x$ in $\hat{D}$:

$$\hat{c}_1 |x|^2 \le \hat{V}(x) \le \hat{c}_2 |x|^2 \qquad (15a)$$
$$\frac{\partial \hat{V}(x)}{\partial x} F_{nn}(x, \Phi_{nn}(x)) \le -\hat{c}_3 |x|^2 \qquad (15b)$$
$$\left| \frac{\partial \hat{V}(x)}{\partial x} \right| \le \hat{c}_4 |x| \qquad (15c)$$
where $\hat{c}_1$, $\hat{c}_2$, $\hat{c}_3$ and $\hat{c}_4$ are positive constants. Although the RNN model of Equation (4) is obtained from a dataset generated within the closed-loop stability region $\Omega_\rho$ of the nonlinear system of Equation (1), it is noted that the existence of a feasible control action $u \in U$ that can render the origin of the RNN model of Equation (4) exponentially stable is not guaranteed, due to model mismatch. Therefore, based on Equation (15), the closed-loop stability region is re-characterized for the RNN model of Equation (4) accounting for the input constraints. Specifically, a Lyapunov-based controller based on the RNN model is first developed using the universal Sontag control law ([21]), since the RNN model of Equation (4) can be represented by $\dot{\hat{x}} = \hat{f}(\hat{x}) + \hat{g}(\hat{x})u$. Then, following the characterization method of $\Omega_\rho$ for the nonlinear system of Equation (1), an open set from which the origin of the RNN model of Equation (4) can be rendered exponentially stable under the controller $u = \Phi_{nn}(x) \in U$ is characterized as follows: $\hat{\phi}_u := \{ x \in \mathbf{R}^n \mid \dot{\hat{V}}(x) < -k\hat{V}(x),\ u = \Phi_{nn}(x) \in U \} \cup \{0\}$, where $\dot{\hat{V}}(x) = \frac{\partial \hat{V}(x)}{\partial x} F_{nn}(x, \Phi_{nn}(x))$ and $k > 0$. The closed-loop stability region for the RNN model of Equation (4) is defined as the largest level set of $\hat{V}$ inside $\hat{\phi}_u$: $\Omega_{\hat{\rho}} := \{ x \in \hat{\phi}_u \mid \hat{V}(x) \le \hat{\rho} \}$, where $\hat{\rho} > 0$. From now on, $\Omega_{\hat{\rho}}$ will be taken as the operating region for the closed-loop system of Equation (1) in the following sections. Additionally, the Lipschitz property of $F_{nn}(x, u)$ and the boundedness of $u$ imply that there exist positive constants $M_{nn}$ and $L_{nn}$ such that the following inequalities hold for all $x, x' \in \Omega_{\hat{\rho}}$ and $u \in U$:
$$|F_{nn}(x, u)| \le M_{nn} \qquad (16a)$$
$$\left| \frac{\partial \hat{V}(x)}{\partial x} F_{nn}(x, u) - \frac{\partial \hat{V}(x')}{\partial x} F_{nn}(x', u) \right| \le L_{nn} |x - x'| \qquad (16b)$$
The following proposition demonstrates that the state of the nonlinear system of Equation (1) can be driven towards the origin exponentially for all $x_0 \in \Omega_{\hat{\rho}}$ under the stabilizing controller $u = \Phi_{nn}(x) \in U$ designed for the RNN model of Equation (4), provided that the modeling error is sufficiently small.
Proposition 1.
Consider the stabilizing controller $u = \Phi_{nn}(x) \in U$ that renders the origin of the RNN system of Equation (4) exponentially stable for all $x_0 \in \Omega_{\hat{\rho}}$. If there exists a positive real number $\gamma$ that constrains the modeling error $|\nu| = |F(x, u, 0) - F_{nn}(x, u)| \le \gamma |x| \le \nu_m$, $\forall x \in \Omega_{\hat{\rho}}$, $u \in U$, and satisfies the following inequality:

$$\gamma < \hat{c}_3 / \hat{c}_4 \qquad (17)$$

then the origin of the nominal closed-loop system of Equation (1) under $u = \Phi_{nn}(x) \in U$ is also exponentially stable for all $x \in \Omega_{\hat{\rho}}$.
Proof. 
We demonstrate the exponential stability of the origin of the nominal system of Equation (1) under $u = \Phi_{nn}(x) \in U$, $x \in \Omega_{\hat{\rho}}$, by showing that there exists a positive real number $k$ such that $\dot{\hat{V}}(x) \le -k|x|^2$ holds for all $x \in \Omega_{\hat{\rho}}$, where $\dot{\hat{V}}(x) = \frac{\partial \hat{V}(x)}{\partial x} F(x, \Phi_{nn}(x), 0)$. Based on Equations (15b) and (15c), the time-derivative of $\hat{V}$ is derived as follows:

$$\dot{\hat{V}} = \frac{\partial \hat{V}(x)}{\partial x} F(x, \Phi_{nn}(x), 0) = \frac{\partial \hat{V}(x)}{\partial x} \left( F_{nn}(x, \Phi_{nn}(x)) + F(x, \Phi_{nn}(x), 0) - F_{nn}(x, \Phi_{nn}(x)) \right) \le -\hat{c}_3 |x|^2 + \hat{c}_4 |x| \left| F(x, \Phi_{nn}(x), 0) - F_{nn}(x, \Phi_{nn}(x)) \right| \le -\hat{c}_3 |x|^2 + \hat{c}_4 \gamma |x|^2 \qquad (18)$$

If $\gamma$ satisfies Equation (17), then there exists a positive real number $\tilde{c}_3 = \hat{c}_3 - \hat{c}_4 \gamma > 0$ such that $\dot{\hat{V}} \le -\tilde{c}_3 |x|^2 \le 0$ holds. Letting $k = \tilde{c}_3$ completes the proof that the closed-loop state of the nominal system of Equation (1) is driven to the origin with an exponential rate of convergence under $u = \Phi_{nn}(x) \in U$ for all $x_0 \in \Omega_{\hat{\rho}}$. □

4.2. LEMPC Using an Ensemble of RNN Models

The Lyapunov-based economic MPC (LEMPC) design using an ensemble of RNN models is represented by the following optimization problem:
$$\mathcal{J} = \max_{u \in S(\Delta)} \int_{t_k}^{t_{k+N}} l_e(\tilde{x}(t), u(t)) \, dt \qquad (19a)$$
$$\text{s.t.} \quad \dot{\tilde{x}}(t) = \frac{1}{N_e} \sum_{j=1}^{N_e} F_{nn}^j(\tilde{x}(t), u(t)) \qquad (19b)$$
$$u(t) \in U, \quad \forall t \in [t_k, t_{k+N}) \qquad (19c)$$
$$\tilde{x}(t_k) = x(t_k) \qquad (19d)$$
$$\hat{V}(\tilde{x}(t)) \le \hat{\rho}_e, \quad \forall t \in [t_k, t_{k+N}), \quad \text{if } x(t_k) \in \Omega_{\hat{\rho}_e} \qquad (19e)$$
$$\dot{\hat{V}}(x(t_k), u) \le \dot{\hat{V}}(x(t_k), \Phi_{nn}(x(t_k))), \quad \text{if } x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\hat{\rho}_e} \qquad (19f)$$

where $S(\Delta)$ is the set of piecewise constant functions with period $\Delta$, $\tilde{x}$ is the predicted state trajectory, $N_e$ is the number of RNN models used in predicting future states, and $N$ is the number of sampling periods in the prediction horizon. We use $\dot{\hat{V}}(x, u)$ to represent the time-derivative of $\hat{V}$, i.e., $\dot{\hat{V}}(x, u) = \frac{\partial \hat{V}(x)}{\partial x} F_{nn}(x, u)$. The optimization problem of Equation (19) is solved by maximizing the time integral of the stage cost function $l_e(\tilde{x}(t), u(t))$ of Equation (19a) over the prediction horizon, subject to the constraints of Equations (19b)–(19f). The constraint of Equation (19b) is the ensemble of RNN models of Equation (4) used as the prediction model. Equation (19c) defines the input constraints applied over the entire prediction horizon. Equation (19d) defines the initial condition $\tilde{x}(t_k)$ of Equation (19b). The constraint of Equation (19e) maintains the closed-loop state predicted by Equation (19b) in $\Omega_{\hat{\rho}_e}$ over the prediction horizon if the state $x(t_k)$ is inside $\Omega_{\hat{\rho}_e}$. However, if $x(t_k)$ leaves $\Omega_{\hat{\rho}_e}$ but still remains in $\Omega_{\hat{\rho}}$, the contractive constraint of Equation (19f) drives the state towards the origin over the next sampling period, such that the state eventually re-enters $\Omega_{\hat{\rho}_e}$ within a finite number of sampling periods. By introducing the conservative region $\Omega_{\hat{\rho}_e}$ with $\hat{\rho}_e < \hat{\rho}$ and the stability constraints of Equations (19e) and (19f), it will be proved in Theorem 2 that for any initial condition $x_0 \in \Omega_{\hat{\rho}}$, the closed-loop state of the nonlinear system of Equation (1) is bounded in $\Omega_{\hat{\rho}}$ for all times. A schematic of $\Omega_{\hat{\rho}}$ and $\Omega_{\hat{\rho}_e}$ is given in Figure 4, in which it is shown that a state trajectory starting inside $\Omega_{\hat{\rho}}$ remains bounded in $\Omega_{\hat{\rho}}$ under the LEMPC.
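To make the two operation modes concrete, the sketch below (ours) shows how a receding-horizon implementation would select between the constraints of Equations (19e) and (19f) at each sampling time; `solve_ocp` is a hypothetical wrapper around the nonlinear program (solved, e.g., by IPOPT in Section 5):

```python
def lempc_step(x_tk, V_hat, rho_hat_e, solve_ocp):
    """One receding-horizon step of the LEMPC of Equation (19): activate
    Equation (19e) inside Omega_rho_hat_e, otherwise the contractive
    constraint of Equation (19f); apply only the first control action."""
    mode = 'economic' if V_hat(x_tk) <= rho_hat_e else 'contractive'
    u_star = solve_ocp(x_tk, mode)   # optimal input trajectory over [t_k, t_{k+N})
    return u_star[0]                 # u*(t_k), held for one sampling period
```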
The optimal input trajectory is calculated over the entire prediction horizon $t \in [t_k, t_{k+N})$ and is denoted by $u^*(t)$. However, only the control action $u^*(t_k)$ for the first sampling period is sent to the nonlinear system of Equation (1) to be applied over the next sampling period. Before we demonstrate closed-loop stability of the nonlinear system of Equation (1) under the LEMPC of Equation (19), two propositions are first derived to provide useful tools for proving closed-loop stability in Theorem 2. Specifically, the following proposition provides an upper bound for the state error in the presence of sufficiently small modeling error and bounded disturbances.
Proposition 2.
Consider the solution $x(t)$ of the nonlinear system $\dot{x} = F(x, u, w)$ of Equation (1) in the presence of bounded disturbances $|w(t)| \le w_m$, and the solution $\hat{x}(t)$ of the RNN model $\dot{\hat{x}} = F_{nn}(\hat{x}, u)$ of Equation (4) with the same initial condition $x_0 = \hat{x}_0 \in \Omega_{\hat{\rho}}$. If $x(t), \hat{x}(t) \in \Omega_{\hat{\rho}}$ and the modeling error is bounded (i.e., $|\nu(t)| = |\dot{x} - \dot{\hat{x}}| \le \nu_m$) for all times, then there exist a positive constant $\kappa$ and a class $\mathcal{K}$ function $f_w(\cdot)$ such that the following inequalities hold $\forall x, \hat{x} \in \Omega_{\hat{\rho}}$ and $w(t) \in W$:

$$|x(t) - \hat{x}(t)| \le f_w(t) := \frac{L_w w_m + \nu_m}{L_x} \left( e^{L_x t} - 1 \right) \qquad (20a)$$
$$\hat{V}(x) \le \hat{V}(\hat{x}) + \hat{c}_4 \sqrt{\frac{\hat{\rho}}{\hat{c}_1}} \, |x - \hat{x}| + \kappa |x - \hat{x}|^2 \qquad (20b)$$
Proof. 
Following the proofs in [5,7], we define the state error vector by $e(t) = x(t) - \hat{x}(t)$. The time-derivative of $e(t)$ is bounded $\forall x, \hat{x} \in \Omega_{\hat{\rho}}$, $u \in U$ and $w(t) \in W$ as follows:

$$|\dot{e}| = |F(x, u, w) - F_{nn}(\hat{x}, u)| \le |F(x, u, w) - F(\hat{x}, u, 0)| + |F(\hat{x}, u, 0) - F_{nn}(\hat{x}, u)| \le L_x |e(t)| + L_w w_m + \nu_m \qquad (21)$$

where the last inequality is derived from Equation (3b) and the fact that $|\nu| \le \nu_m$. Since $x(t)$ and $\hat{x}(t)$ share the same initial condition (i.e., $e(0) = 0$), the upper bound for $|e(t)|$ is derived for all $x(t), \hat{x}(t) \in \Omega_{\hat{\rho}}$ and $|w(t)| \le w_m$ as follows:

$$|e(t)| = |x(t) - \hat{x}(t)| \le \frac{L_w w_m + \nu_m}{L_x} \left( e^{L_x t} - 1 \right) \qquad (22)$$

Subsequently, $\forall x, \hat{x} \in \Omega_{\hat{\rho}}$, Equation (20b) is derived based on the Taylor series expansion of $\hat{V}(x)$ around $\hat{x}$ as follows:

$$\hat{V}(x) \le \hat{V}(\hat{x}) + \left| \frac{\partial \hat{V}(\hat{x})}{\partial x} \right| |x - \hat{x}| + \kappa |x - \hat{x}|^2 \le \hat{V}(\hat{x}) + \hat{c}_4 \sqrt{\frac{\hat{\rho}}{\hat{c}_1}} \, |x - \hat{x}| + \kappa |x - \hat{x}|^2 \qquad (23)$$

where $\kappa$ is a positive real number, and the last inequality is derived using Equations (15a) and (15c). □
The above proposition demonstrates that the error between the state trajectories of the nonlinear system of Equation (1) and the RNN model of Equation (4), starting from the same initial condition, is bounded over finite time. The next proposition demonstrates that if $x \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$, the stabilizing controller $u = \Phi_{nn}(x) \in U$ implemented in a sample-and-hold fashion is able to keep $\dot{\hat{V}}(x)$ negative, such that the state is driven towards the origin and ultimately enters a small neighborhood of the origin (i.e., $\Omega_{\rho_s}$) within a finite number of sampling steps.
Proposition 3.
Consider the system of Equation (1) under the controller $u = \Phi_{nn}(\hat{x}) \in U$ that meets the conditions of Equation (15) and is implemented in a sample-and-hold fashion, i.e., $u(t) = \Phi_{nn}(\hat{x}(t_k))$, $\forall t \in [t_k, t_{k+1})$, where $t_{k+1} := t_k + \Delta$. Let $\epsilon_w, \epsilon_s > 0$, $\Delta > 0$ and $\hat{\rho} > \hat{\rho}_e > \rho_s > 0$ satisfy

$$-\frac{\hat{c}_3}{\hat{c}_2} \rho_s + L_{nn} M_{nn} \Delta \le -\epsilon_s \qquad (24a)$$
$$-\frac{\tilde{c}_3}{\hat{c}_2} \rho_s + L'_x M \Delta + L'_w w_m \le -\epsilon_w \qquad (24b)$$
$$\hat{\rho}_e > \max\left\{ \hat{V}(\hat{x}(t_k + \Delta)) \mid \hat{x}(t_k) \in \Omega_{\rho_s},\ u \in U,\ w \in W \right\} \qquad (24c)$$

Then, for any $x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$, the following inequalities hold:

$$\hat{V}(\hat{x}(t)) \le \hat{V}(\hat{x}(t_k)), \quad \forall t \in [t_k, t_{k+1}) \qquad (25a)$$
$$\hat{V}(x(t)) \le \hat{V}(x(t_k)), \quad \forall t \in [t_k, t_{k+1}) \qquad (25b)$$
Proof. 
To show that the value of $\hat{V}$ decreases along the trajectory $\hat{x}(t)$ of the RNN model of Equation (4) over $t \in [t_k, t_{k+1})$, we calculate the time-derivative of $\hat{V}(\hat{x})$ along $\hat{x}(t)$ as follows:

$$\dot{\hat{V}}(\hat{x}(t)) = \frac{\partial \hat{V}(\hat{x}(t))}{\partial \hat{x}} F_{nn}(\hat{x}(t), \Phi_{nn}(\hat{x}(t_k))) = \frac{\partial \hat{V}(\hat{x}(t_k))}{\partial \hat{x}} F_{nn}(\hat{x}(t_k), \Phi_{nn}(\hat{x}(t_k))) + \frac{\partial \hat{V}(\hat{x}(t))}{\partial \hat{x}} F_{nn}(\hat{x}(t), \Phi_{nn}(\hat{x}(t_k))) - \frac{\partial \hat{V}(\hat{x}(t_k))}{\partial \hat{x}} F_{nn}(\hat{x}(t_k), \Phi_{nn}(\hat{x}(t_k))) \qquad (26)$$

Using Equations (15a) and (15b) and the Lipschitz condition of Equation (16), the following inequalities are obtained:

$$\dot{\hat{V}}(\hat{x}(t)) \le -\frac{\hat{c}_3}{\hat{c}_2} \rho_s + \frac{\partial \hat{V}(\hat{x}(t))}{\partial \hat{x}} F_{nn}(\hat{x}(t), \Phi_{nn}(\hat{x}(t_k))) - \frac{\partial \hat{V}(\hat{x}(t_k))}{\partial \hat{x}} F_{nn}(\hat{x}(t_k), \Phi_{nn}(\hat{x}(t_k))) \le -\frac{\hat{c}_3}{\hat{c}_2} \rho_s + L_{nn} M_{nn} \Delta \qquad (27)$$

Therefore, $\dot{\hat{V}}(\hat{x}(t)) \le -\epsilon_s$ holds $\forall \hat{x}(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$ and $t \in [t_k, t_{k+1})$ if Equation (24a) is satisfied. By integrating this inequality, it follows that $\hat{V}(\hat{x}(t)) \le \hat{V}(\hat{x}(t_k)) - \epsilon_s (t - t_k) \le \hat{V}(\hat{x}(t_k))$, $\forall \hat{x}(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$, $t \in [t_k, t_{k+1})$ (i.e., Equation (25a)).
Next, to show that $\hat{V}(x(t)) \le \hat{V}(x(t_k))$ holds for all $t \in [t_k, t_{k+1})$, the time-derivative of $\hat{V}(x)$ for the nonlinear system of Equation (1) (i.e., $\dot{x} = F(x, u, w)$) in the presence of bounded disturbances (i.e., $|w(t)| \le w_m$) is derived as follows:

$$\dot{\hat{V}}(x(t)) = \frac{\partial \hat{V}(x(t))}{\partial x} F(x(t), \Phi_{nn}(x(t_k)), w) = \frac{\partial \hat{V}(x(t_k))}{\partial x} F(x(t_k), \Phi_{nn}(x(t_k)), 0) + \frac{\partial \hat{V}(x(t))}{\partial x} F(x(t), \Phi_{nn}(x(t_k)), w) - \frac{\partial \hat{V}(x(t_k))}{\partial x} F(x(t_k), \Phi_{nn}(x(t_k)), 0) \qquad (28)$$

Since Equation (18) shows that $\frac{\partial \hat{V}(x(t_k))}{\partial x} F(x(t_k), \Phi_{nn}(x(t_k)), 0) \le -\tilde{c}_3 |x(t_k)|^2$ holds for all $x \in \Omega_{\hat{\rho}}$, the following inequality is obtained for all $x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$ and $t \in [t_k, t_{k+1})$ using Equation (15a) and the Lipschitz condition of Equation (3):

$$\dot{\hat{V}}(x(t)) \le -\frac{\tilde{c}_3}{\hat{c}_2} \rho_s + L'_x |x(t) - x(t_k)| + L'_w |w| \le -\frac{\tilde{c}_3}{\hat{c}_2} \rho_s + L'_x M \Delta + L'_w w_m \qquad (29)$$

Therefore, the following inequality is further derived for all $x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$ and $t \in [t_k, t_{k+1})$ if the condition of Equation (24b) is satisfied:

$$\dot{\hat{V}}(x(t)) \le -\epsilon_w \qquad (30)$$

Similarly, this implies that $\hat{V}(x(t)) \le \hat{V}(x(t_k)) - \epsilon_w (t - t_k) \le \hat{V}(x(t_k))$, $\forall x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$, $t \in [t_k, t_{k+1})$. Therefore, the state of the nonlinear system of Equation (1) will enter $\Omega_{\rho_s}$ within a finite number of sampling periods if $x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\rho_s}$. Additionally, if $x(t_k) \in \Omega_{\rho_s}$, in which case Equations (29) and (30) do not hold, Equation (24c) guarantees that the state will not leave $\Omega_{\hat{\rho}_e}$ within one sampling period for all $u \in U$ and $w \in W$. If the state $x(t_{k+1})$ leaves $\Omega_{\rho_s}$ but remains in $\Omega_{\hat{\rho}_e}$, then over the next sampling period $t \in [t_{k+1}, t_{k+2})$, Equation (30) is satisfied again and the state is driven towards the origin. Therefore, the state of the nonlinear system of Equation (1) is bounded in $\Omega_{\hat{\rho}}$ for all times. □
Based on Propositions 2 and 3, the following theorem is established to show recursive feasibility of the LEMPC optimization problem of Equation (19), and the boundedness of the closed-loop state within Ω ρ ^ under the sample-and-hold implementation of the LEMPC.
Theorem 2.
Consider the closed-loop system of Equation (1) under the sample-and-hold implementation of the LEMPC of Equation (19) with the stabilizing controller $\Phi_{nn}(x)$ that satisfies Equation (15). Let $\Delta > 0$, $\epsilon_w > 0$ and $\hat{\rho} > \hat{\rho}_e > 0$ satisfy Equation (24) and the following inequality:

$$\hat{\rho}_e \le \hat{\rho} - \hat{c}_4 \sqrt{\frac{\hat{\rho}}{\hat{c}_1}} \, f_w(\Delta) - \kappa (f_w(\Delta))^2 \qquad (31)$$

If $x_0 \in \Omega_{\hat{\rho}}$ and the conditions of Propositions 2 and 3 are satisfied, then there always exists a feasible solution for the optimization problem of Equation (19), and the closed-loop state $x(t)$ is bounded in the closed-loop stability region $\Omega_{\hat{\rho}}$, $\forall t \ge 0$.
Proof. 
The LEMPC of Equation (19) predicts future states by averaging the results of an ensemble of RNN models in Equation (19b). Since each RNN model is trained to satisfy the modeling error constraint and Equation (16), it is readily shown that the averaged results of an ensemble of multiple RNN models also satisfy the above conditions, and thus, the results derived in Propositions 1–3 for a single RNN model can be generalized to an RNN ensemble.
We first prove that the optimization problem of Equation (19) can be solved with recursive feasibility for all $x \in \Omega_{\hat{\rho}}$. Specifically, if $x(t_k) \in \Omega_{\hat{\rho}_e}$, the control actions $\Phi_{nn}(x(t_{k+i}))$, $i = 0, 1, \ldots, N-1$, satisfy the input constraint of Equation (19c) and the Lyapunov-based constraint of Equation (19e), since it is shown in Equation (25a) that the states predicted by the RNN model of Equation (19b) remain inside $\Omega_{\hat{\rho}_e}$ under the controller $\Phi_{nn}(x)$. Additionally, if $x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\hat{\rho}_e}$, the control action $u(t) = \Phi_{nn}(x(t_k)) \in U$, $t \in [t_k, t_{k+1})$, satisfies the input constraint of Equation (19c) and the Lyapunov-based constraint of Equation (19f), such that the state can be driven towards the origin during the next sampling period. Therefore, the stabilizing controller $u = \Phi_{nn}(x) \in U$ provides a feasible solution that satisfies all the constraints of the LEMPC optimization problem of Equation (19) if $x(t) \in \Omega_{\hat{\rho}}$ for all times.

Next, we prove that for $x_0 \in \Omega_{\hat{\rho}}$, the state of the closed-loop system of Equation (1) is bounded in $\Omega_{\hat{\rho}}$ for all times. Specifically, if $x(t_k) \in \Omega_{\hat{\rho}_e}$, the predicted states $\hat{x}(t)$ of the RNN model of Equation (19b) are maintained in $\Omega_{\hat{\rho}_e}$ under the constraint of Equation (19e). According to Proposition 2, the actual state $x(t)$, $t \in [t_k, t_{k+1})$, of the nonlinear system of Equation (1) is bounded by the following inequality:

$$\hat{V}(x) \le \hat{V}(\hat{x}) + \hat{c}_4 \sqrt{\frac{\hat{\rho}}{\hat{c}_1}} \, |x - \hat{x}| + \kappa |x - \hat{x}|^2 \le \hat{V}(\hat{x}) + \hat{c}_4 \sqrt{\frac{\hat{\rho}}{\hat{c}_1}} \, f_w(\Delta) + \kappa (f_w(\Delta))^2 \qquad (32)$$

Therefore, if $\Omega_{\hat{\rho}_e}$ is chosen as a level set of $\hat{V}$ that satisfies Equation (31), $\hat{V}(x)$ evaluated along the actual state $x(t)$ remains bounded in $\Omega_{\hat{\rho}}$ for all $t \in [t_k, t_{k+1})$. However, if $x(t_k) \in \Omega_{\hat{\rho}} \setminus \Omega_{\hat{\rho}_e}$, the constraint of Equation (19f) is activated, such that the control action $u$ decreases the value of $\hat{V}(\hat{x})$ along the states predicted by the RNN model of Equation (19b) within the next sampling period. According to Equation (25b) in Proposition 3, the value of $\hat{V}$ also decreases along the state trajectory of the actual nonlinear system of Equation (1) over $t \in [t_k, t_{k+1})$. Therefore, it is concluded that for any initial condition in $\Omega_{\hat{\rho}}$, the closed-loop state of the system of Equation (1) is bounded in $\Omega_{\hat{\rho}}$ for all times under the LEMPC of Equation (19). □

5. Application to a Chemical Process Example

A chemical process example is used to illustrate the application of LEMPC using an ensemble of RNN models, under which the state of the closed-loop system is maintained within the stability region in state-space for all times. We consider a well-mixed, non-isothermal continuous stirred tank reactor (CSTR) in which an irreversible second-order exothermic reaction takes place. Specifically, the reaction transforms a reactant A to a product B (A → B). The inlet temperature, the inlet concentration of A, and the feed volumetric flow rate of the reactor are $T_0$, $C_{A0}$ and $F$, respectively. Additionally, the CSTR is equipped with a heating jacket to supply or remove heat at a rate $Q$. The following material and energy balance equations are used to represent the CSTR dynamic model:
$$\frac{dC_A}{dt} = \frac{F}{V} (C_{A0} - C_A) - k_0 e^{-E/RT} C_A^2 \qquad (33a)$$
$$\frac{dT}{dt} = \frac{F}{V} (T_0 - T) + \frac{-\Delta H}{\rho_L C_p} k_0 e^{-E/RT} C_A^2 + \frac{Q}{\rho_L C_p V} \qquad (33b)$$
where $V$ is the volume of the reacting liquid in the reactor, $C_A$ is the concentration of reactant A in the reactor, $Q$ is the heat input rate and $T$ is the temperature of the reactor. The feed volumetric flow rate and temperature are $F$ and $T_0$, respectively. The feed concentration of reactant A is $C_{A0}$. The reacting liquid has a constant heat capacity $C_p$ and density $\rho_L$. $k_0$ is the pre-exponential constant, $\Delta H$ is the enthalpy of reaction, $R$ is the ideal gas constant, and $E$ is the activation energy. Process parameter values are listed in Table 1.
The CSTR is initially operated at the unstable steady-state $(C_{As}, T_s) = (1.95\ \mathrm{kmol/m^3}, 402\ \mathrm{K})$ and $(C_{A0_s}, Q_s) = (4\ \mathrm{kmol/m^3}, 0\ \mathrm{kJ/h})$. In this example, the two manipulated inputs are the heat input rate, $\Delta Q = Q - Q_s$, and the inlet concentration of species A, $\Delta C_{A0} = C_{A0} - C_{A0_s}$, respectively. Additionally, the manipulated inputs are subject to the following constraints: $|\Delta Q| \le 5 \times 10^5\ \mathrm{kJ/h}$ and $|\Delta C_{A0}| \le 3.5\ \mathrm{kmol/m^3}$. The states and the inputs of the system of Equation (33) are represented by $x^T = [C_A - C_{As},\ T - T_s]$ and $u^T = [\Delta C_{A0},\ \Delta Q]$, respectively. Therefore, the unstable steady-state of the system of Equation (33) is at the origin of the state-space (i.e., $(x_s^*, u_s^*) = (0, 0)$).
The control objective of the LEMPC of Equation (19) is to maximize the production rate of B in the CSTR by manipulating the inlet concentration $\Delta C_{A0}$ and the heat input rate $\Delta Q$, while maintaining the closed-loop state trajectories in the stability region $\Omega_{\hat{\rho}}$ for all times. Accordingly, the objective function of the LEMPC optimizes the production rate of B:

$$l_e(\tilde{x}, u) = k_0 e^{-E/RT} C_A^2 \qquad (34)$$
To simulate the dynamic model of Equation (33) numerically, the explicit Euler method is used with an integration time step of $h_c = 10^{-4}$ h. The Python module of the IPOPT software package [27], named PyIpopt, is used to solve the nonlinear optimization problem of the LEMPC of Equation (19) with a sampling period $\Delta = 10^{-2}$ h. Additionally, a Message Passing Interface (MPI) for the Python programming language, named MPI4Py ([28]), is incorporated in the LEMPC optimization problem to execute the multiple RNN predictions of Equation (19b) concurrently in independent computing processes.
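As an illustration of this numerical setup (ours, not the authors' code; parameter names stand in for the values of Table 1, and the sign conventions follow Equation (33)), one explicit-Euler simulation step of the CSTR in deviation variables:

```python
import numpy as np

def cstr_rhs(x, u, p):
    """Right-hand side of Equation (33) in deviation form:
    x = [C_A - C_As, T - T_s], u = [Delta_CA0, Delta_Q]."""
    CA, T = x[0] + p['CAs'], x[1] + p['Ts']
    r = p['k0'] * np.exp(-p['E'] / (p['R'] * T)) * CA ** 2   # reaction rate
    dCA = p['F'] / p['V'] * (p['CA0s'] + u[0] - CA) - r
    dT = (p['F'] / p['V'] * (p['T0'] - T)
          - p['dH'] / (p['rhoL'] * p['Cp']) * r              # dH < 0 (exothermic)
          + (p['Qs'] + u[1]) / (p['rhoL'] * p['Cp'] * p['V']))
    return np.array([dCA, dT])

def euler_step(x, u, p, h_c=1e-4):
    """One explicit-Euler step with integration time step h_c = 1e-4 h."""
    return x + h_c * cstr_rhs(x, u, p)
```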
Extensive open-loop simulations of the CSTR system of Equation (33) are initially conducted to generate the dataset for the RNN models. The control Lyapunov function $\hat{V}(x) = x^T P x$ is designed with the following positive definite $P$ matrix:

$$P = \begin{bmatrix} 1060 & 22 \\ 22 & 0.52 \end{bmatrix} \qquad (35)$$
Two hidden recurrent layers consisting of 96 and 64 recurrent units, respectively, are used in the RNN model. Additionally, 10-fold cross-validation is used to construct an ensemble of RNN models for the LEMPC of Equation (19). The closed-loop stability region $\Omega_{\hat{\rho}}$ with $\hat{\rho} = 368$ and a subset $\Omega_{\hat{\rho}_e}$ with $\hat{\rho}_e = 320$ for the CSTR system are characterized based on the obtained RNN models and the stabilizing controller $u = \Phi_{nn}(x) \in U$. To demonstrate the effectiveness of the proposed LEMPC of Equation (19), a linear state-space model (i.e., $\dot{x} = Ax + Bu$ with $A = 100 \times \begin{bmatrix} 0.154 & 0.003 \\ 5.19 & 0.138 \end{bmatrix}$ and $B = \begin{bmatrix} 4.03 & 0 \\ 1.23 & 0.004 \end{bmatrix}$) is also identified following the system identification method in [29], and is included in the following performance comparison against the ensemble of RNN models and the first-principles model of Equation (33).
Open-loop simulation studies are first carried out for the first-principles model of Equation (33), the RNN model ensemble, and the linear state-space model, respectively. The optimal number of RNN models used in the ensemble is determined to be five through extensive simulation tests, as this choice gave the closest trajectories to those under the first-principles model. As shown in Figure 5, starting from various initial conditions in $\Omega_{\hat{\rho}}$, the state trajectories under the RNN model ensemble almost overlap those under the first-principles model using the same input sequences, while the state trajectories under the linear state-space model show considerable deviation, especially in the top and bottom regions of $\Omega_{\hat{\rho}}$. Figure 5 thus demonstrates that the RNN model ensemble is a good representation of the first-principles model of Equation (33), in that it successfully captures the CSTR nonlinear dynamics everywhere in $\Omega_{\hat{\rho}}$.
Subsequently, the closed-loop simulations of the nominal CSTR system (i.e., $w(t) \equiv 0$) under the first-principles model of Equation (33), the ensemble of RNN models, and the linear state-space model, respectively, are shown in Figure 6. The following material constraint is used in the LEMPC of Equation (19) to fix the time-averaged reactant material fed over the entire operating period $t_p$ at its steady-state value, $C_{A0_s}$ (i.e., the time-averaged reactant feed in deviation form, $u_1$, is equal to zero):

$$\frac{1}{t_p} \int_0^{t_p} u_1(\tau) \, d\tau = 0\ \mathrm{kmol/m^3} \qquad (36)$$
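Under the sample-and-hold input parameterization of the LEMPC, this integral constraint reduces to a linear equality on the piecewise-constant inputs (our reading of the implementation; the discretization is not spelled out in the text):

$$\frac{1}{t_p} \sum_{i=0}^{t_p/\Delta - 1} u_1(t_i) \, \Delta = 0$$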
In Figure 6, it is shown that for the initial condition $(0, 0)$, the state trajectory under the ensemble of RNN models is close to that under the first-principles model, and both state trajectories are bounded in $\Omega_{\hat{\rho}_e}$ in the absence of disturbances. However, it is observed that the state trajectory under the linear state-space model ultimately leaves $\Omega_{\hat{\rho}}$ due to its large model mismatch. This implies that a more conservative set $\Omega_{\hat{\rho}_e}$ needs to be characterized for the state-space model to guarantee the boundedness of the state in the stability region $\Omega_{\hat{\rho}}$.
Additionally, Figure 7 and Figure 8 show the manipulated input profiles in deviation from the steady-state values. It is shown that the manipulated inputs are bounded within the input constraints for all times. Moreover, the material constraint of Equation (36) is satisfied: as shown in Figure 7, the LEMPC using the first-principles model and the LEMPC using the RNN model ensemble both consume the maximum allowable $\Delta C_{A0}$ during the first few sampling steps, and thus have to lower the consumption of $\Delta C_{A0}$ near the end of the operating period. The LEMPC using the linear state-space model also satisfies the material constraint, but shows persistent oscillations due to model mismatch. Additionally, in Figure 8, it is shown that the consumption of $\Delta Q$ under the RNN ensemble is close to that under the first-principles model (both correspond to the left y-axis), since their closed-loop trajectories shown in Figure 6 are similar. However, $\Delta Q$ shows large oscillations under the LEMPC using the linear state-space model (right y-axis), since the closed-loop trajectory in Figure 6 leaves $\Omega_{\hat{\rho}_e}$, and thus the contractive constraint of Equation (19f) is activated frequently.
To demonstrate that the closed-loop system under LEMPC achieves higher process economic performance than steady-state operation using the same amount of reactant $C_{A0}$, we calculate the accumulated economic benefit $L_E = \int_0^{t_p} l_e(x, u) \, dt$ over the entire operating period $t_p = 0.2$ h, which is $2.04$, $4.14$, and $4.22$ for the steady-state operation, the LEMPC under an ensemble of RNN models, and the LEMPC under the first-principles model, respectively. Therefore, it is demonstrated that both closed-loop stability and economic optimality are achieved under the proposed LEMPC using an ensemble of RNN models.

6. Conclusions

This work presented a new class of economic model predictive controllers that use an ensemble of recurrent neural network models as the prediction model for nonlinear systems. Specifically, an RNN ensemble was first developed based on the dataset of extensive open-loop simulations within the operating region. Parallel computing was used to reduce the computation time of prediction by multiple RNN models. Then, the LEMPC that incorporates RNN models and Lyapunov-based stability constraints was formulated to stabilize a nonlinear process within the closed-loop stability region while optimizing process economic benefits simultaneously. The proposed LEMPC using an RNN ensemble was applied to a nonlinear chemical process example to demonstrate its economic optimality and closed-loop stability.

Author Contributions

Z.W. developed the idea of machine learning-based EMPC for nonlinear systems, performed the simulation studies and prepared the initial draft of the paper. P.D.C. oversaw all aspects of the research and revised this manuscript.

Funding

Financial support from the National Science Foundation and the Department of Energy is gratefully acknowledged.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this research article.

References

1. Angeli, D.; Amrit, R.; Rawlings, J.B. On average performance and stability of economic model predictive control. IEEE Trans. Autom. Control 2012, 57, 1615–1626.
2. Müller, M.A.; Angeli, D.; Allgöwer, F. Economic model predictive control with self-tuning terminal cost. Eur. J. Control 2013, 19, 408–416.
3. Ellis, M.; Durand, H.; Christofides, P.D. A tutorial review of economic model predictive control methods. J. Process Control 2014, 24, 1156–1178.
4. Heidarinejad, M.; Liu, J.; Christofides, P.D. Economic model predictive control of nonlinear process systems using Lyapunov techniques. AIChE J. 2012, 58, 855–870.
5. Wu, Z.; Durand, H.; Christofides, P.D. Safe economic model predictive control of nonlinear systems. Syst. Control Lett. 2018, 118, 69–76.
6. Wu, Z.; Zhang, J.; Zhang, Z.; Albalawi, F.; Durand, H.; Mahmood, M.; Mhaskar, P.; Christofides, P.D. Economic model predictive control of stochastic nonlinear systems. AIChE J. 2018, 64, 3312–3322.
7. Alanqar, A.; Ellis, M.; Christofides, P.D. Economic model predictive control of nonlinear process systems using empirical models. AIChE J. 2015, 61, 816–830.
8. Alanqar, A.; Durand, H.; Christofides, P.D. On identification of well-conditioned nonlinear systems: Application to economic model predictive control of nonlinear processes. AIChE J. 2015, 61, 3353–3373.
9. Wang, T.; Gao, H.; Qiu, J. A combined adaptive neural network and nonlinear model predictive control for multirate networked industrial process control. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 416–425.
10. Wu, Z.; Tran, A.; Ren, Y.M.; Barnes, C.S.; Chen, S.; Christofides, P.D. Model predictive control of phthalic anhydride synthesis in a fixed-bed catalytic reactor via machine learning modeling. Chem. Eng. Res. Des. 2019, 145, 173–183.
11. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-scale machine learning on heterogeneous systems. arXiv 2015, arXiv:1603.04467.
12. Jia, Y.; Shelhamer, E.; Donahue, J.; Karayev, S.; Long, J.; Girshick, R.; Guadarrama, S.; Darrell, T. Caffe: Convolutional architecture for fast feature embedding. arXiv 2014, arXiv:1408.5093.
13. Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 10 March 2019).
14. Tu, J.V. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J. Clin. Epidemiol. 1996, 49, 1225–1231.
15. Hopfield, J.J. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA 1982, 79, 2554–2558.
16. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117.
17. Venkatasubramanian, V. The promise of artificial intelligence in chemical engineering: Is it here, finally? AIChE J. 2019, 65, 466–478.
18. Sewell, M. Ensemble Learning; Technical Report; UCL: London, UK, 2010.
19. Zhang, C.; Ma, Y. Ensemble Machine Learning: Methods and Applications; Springer: Berlin, Germany, 2012.
20. Almasi, G.S.; Gottlieb, A. Highly Parallel Computing; Benjamin-Cummings Publishing Co., Inc.: Redwood City, CA, USA, 1988.
21. Lin, Y.; Sontag, E.D. A universal formula for stabilization with bounded controls. Syst. Control Lett. 1991, 16, 393–397.
22. Sontag, E.D. Neural nets as systems models and controllers. In Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, New Haven, CT, USA, 20–22 May 1992; pp. 73–79.
23. Kosmatopoulos, E.B.; Polycarpou, M.M.; Christodoulou, M.A.; Ioannou, P.A. High-order neural network structures for identification of dynamical systems. IEEE Trans. Neural Netw. 1995, 6, 422–431.
24. Dua, D.; Graff, C. UCI Machine Learning Repository. 2017. Available online: http://archive.ics.uci.edu/ml (accessed on 10 March 2019).
25. Sayyar-Rodsari, B.; Hartman, E.; Plumer, E.; Liano, K.; Schweiger, C. Extrapolating gain-constrained neural networks-effective modeling for nonlinear control. In Proceedings of the 43rd IEEE Conference on Decision and Control, Nassau, Bahamas, 14–17 December 2004; Volume 5, pp. 4964–4971.
26. Mendes-Moreira, J.; Soares, C.; Jorge, A.M.; Sousa, J.F.D. Ensemble approaches for regression: A survey. ACM Comput. Surv. 2012, 45, 10.
27. Wächter, A.; Biegler, L.T. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 2006, 106, 25–57.
28. Dalcin, L.D.; Paz, R.R.; Kler, P.A.; Cosimo, A. Parallel distributed computing using Python. Adv. Water Resour. 2011, 34, 1124–1139.
29. Kheradmandi, M.; Mhaskar, P. Data driven economic model predictive control. Mathematics 2018, 6, 51.
Figure 1. A recurrent neural network and its unfolded structure, where $\Theta$ is the weight matrix, $x$ is the state vector, $u$ is the manipulated input vector, and $o$ is the output vector.
Figure 2. An ensemble of k RNN models based on a k-fold cross validation.
Figure 3. Algorithm for parallel computation of an ensemble of k RNN models, where $x(t_k)$ is the state measurement, $u_g(t_k)$ is the initial guess of manipulated inputs provided by the MPC optimizer, and $u^*(t_k)$ is the optimal control action sent to the nonlinear process of Equation (1).
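A minimal sketch of the parallel prediction step depicted in Figure 3 is given below, using Python's multiprocessing module; the helper functions load_rnn and rollout are hypothetical placeholders for loading a trained RNN model and simulating it over the prediction horizon:

```python
from multiprocessing import Pool

def predict_with_model(args):
    model_id, x_tk, u_g = args
    # load_rnn and rollout are hypothetical helpers: load the trained RNN
    # for this worker and simulate it forward under the candidate inputs.
    model = load_rnn(model_id)
    return rollout(model, x_tk, u_g)

def parallel_ensemble_prediction(k, x_tk, u_g):
    # Evaluate the k RNN models concurrently; x_tk is the state measurement
    # and u_g the initial input guess from the MPC optimizer, as in Figure 3.
    with Pool(processes=k) as pool:
        trajectories = pool.map(predict_with_model,
                                [(i, x_tk, u_g) for i in range(k)])
    # The optimizer then evaluates the LEMPC cost and constraints using
    # these k predicted trajectories.
    return trajectories
```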
Figure 4. A schematic representing the set $\hat{\phi}_u$, the closed-loop stability region $\Omega_{\hat{\rho}}$, and the set $\Omega_{\hat{\rho}_e}$. Under the LEMPC of Equation (19), the closed-loop state trajectory is bounded in $\Omega_{\hat{\rho}}$ for all times for any $x_0 \in \Omega_{\hat{\rho}}$.
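As an illustration of the region logic in Figure 4, the following sketch checks whether the current state lies in $\Omega_{\hat{\rho}_e}$ (economic optimization mode) or only in $\Omega_{\hat{\rho}}$ (contractive mode); the quadratic Lyapunov function and the matrix P below are illustrative assumptions, not the exact choices of this work:

```python
import numpy as np

P = np.array([[10.0, 0.5],
              [0.5, 2.0]])  # illustrative positive-definite matrix

def V(x):
    # Quadratic Lyapunov function V(x) = x^T P x.
    return float(x @ P @ x)

def lempc_mode(x, rho_e_hat, rho_hat):
    # Economic optimization is allowed while V(x) <= rho_e_hat; otherwise a
    # contractive Lyapunov constraint drives the state back toward the origin,
    # keeping the trajectory bounded in Omega_rho_hat.
    if V(x) <= rho_e_hat:
        return "economic"
    assert V(x) <= rho_hat, "state left the stability region"
    return "contractive"
```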
Figure 5. The state trajectories for the open-loop simulation using the first-principles model of Equation (33), the ensemble of RNN models, and the linear state-space model, respectively, for various sets of inputs and initial conditions $x_0$ (marked as blue stars) in the closed-loop stability region $\Omega_{\hat{\rho}}$.
Figure 6. The state-space profiles for the closed-loop CSTR under the LEMPC using the following models: the first-principles model (blue trajectory), the RNN model ensemble (red trajectory), and the linear state-space model (yellow trajectory) for an initial condition (0, 0).
Figure 7. Manipulated input profiles ($u_1 = \Delta C_{A0}$) for the initial condition (0, 0) under the LEMPC using the following models: the first-principles model (blue trajectory), the RNN model ensemble (red trajectory), and the linear state-space model (yellow trajectory), where the dashed black horizontal lines represent the upper and lower bounds for $\Delta C_{A0}$.
Figure 8. Manipulated input profiles ($u_2 = \Delta Q$) for the initial condition (0, 0) under the LEMPC using the following models: the first-principles model (blue trajectory, left y-axis), the RNN model ensemble (red trajectory, left y-axis), and the linear state-space model (yellow trajectory, right y-axis).
Table 1. Parameter values of the CSTR.

$T_0 = 300\ \mathrm{K}$        $F = 5\ \mathrm{m^3/h}$
$V = 1\ \mathrm{m^3}$        $E = 5 \times 10^4\ \mathrm{kJ/kmol}$
$k_0 = 8.46 \times 10^6\ \mathrm{m^3/(kmol \cdot h)}$        $\Delta H = -1.15 \times 10^4\ \mathrm{kJ/kmol}$
$C_p = 0.231\ \mathrm{kJ/(kg \cdot K)}$        $R = 8.314\ \mathrm{kJ/(kmol \cdot K)}$
$\rho_L = 1000\ \mathrm{kg/m^3}$        $C_{A0s} = 4\ \mathrm{kmol/m^3}$
$Q_s = 0.0\ \mathrm{kJ/h}$        $C_{As} = 1.2\ \mathrm{kmol/m^3}$
$T_s = 438\ \mathrm{K}$
