Real-Time Optimization and Control of Nonlinear Processes Using Machine Learning

Abstract: Machine learning has attracted extensive interest in the process engineering field, due to its capability of modeling complex nonlinear process behavior. This work presents a method for combining neural network models with first-principles models in real-time optimization (RTO) and model predictive control (MPC) and demonstrates the application to two chemical process examples. First, the proposed methodology that integrates a neural network model and a first-principles model in the optimization problems of RTO and MPC is discussed. Then, two chemical process examples are presented. In the first example, a continuous stirred tank reactor (CSTR) with a reversible exothermic reaction is studied. A feed-forward neural network model is used to approximate the nonlinear reaction rate and is combined with a first-principles model in RTO and MPC. An RTO is designed to find the optimal reactor operating condition balancing energy cost and reactant conversion, and an MPC is designed to drive the process to the optimal operating condition. A variation in energy price is introduced to demonstrate that the developed RTO scheme is able to minimize operation cost and yields a closed-loop performance that is very close to the one attained by RTO/MPC using the first-principles model. In the second example, a distillation column is used to demonstrate an industrial application of the use of machine learning to model nonlinearities in RTO. A feed-forward neural network is first built to obtain the phase equilibrium properties and then combined with a first-principles model in RTO, which is designed to maximize the operation profit and calculate optimal set-points for the controllers. A variation in feed concentration is introduced to demonstrate that the developed RTO scheme can increase operation profit for all considered conditions.


Introduction
In the last few decades, chemical processes have been studied and represented with different models for real-time optimization (RTO) and model predictive control (MPC) in order to improve the process steady-state and dynamic performance. The available models range from linear to nonlinear and from first-principles models to neural network models, among others [1]. For many applications, first-principles models are the preferable choice, especially when applied with process systems methodologies [2]. However, first-principles models are difficult to maintain due to the variation of some parameters. Furthermore, it could be difficult or impractical to obtain first-principles models for large-scale applications [3]. As a well-tested alternative, machine learning methods, especially neural network models, are able to represent complicated nonlinear systems [4,5], since they learn the process behavior directly from data. To date, little attention has been paid to utilizing the full benefit of employing hybrid modeling in both the RTO and MPC layers.
Motivated by the above, this work demonstrates the implementation of a hybrid approach that combines a first-principles model and a neural network model in the RTO and MPC optimization problems. Specifically, the nonlinear part of the first-principles model is replaced by a neural network model to represent the complex, nonlinear term in a nonlinear process. We note that in our previous works, we developed recurrent neural network models from process data for use in MPC without using any information from a first-principles model or process structure in the recurrent neural network model formulation [4,5,31]. Furthermore, the previous works did not consider the use of neural network models to describe nonlinearities in the RTO layer and focused exclusively on model predictive control. In the present work, we use neural networks to describe nonlinearities arising in chemical processes and embed these neural network models in first-principles process models used in both RTO (nonlinear steady-state process model) and MPC (nonlinear dynamic process model), resulting in the use of hybrid model formulations in both layers. The rest of the paper is organized as follows: in Section 2, the proposed method that combines neural network models with first-principles models is discussed. In Section 3, a continuous stirred tank reactor (CSTR) example is utilized to illustrate the combination of neural network models and first-principles models in RTO and Lyapunov-based MPC, where the reaction rate equation is represented by a neural network model. In Section 4, an industrial distillation column is co-simulated in Aspen Plus Dynamics and MATLAB. A first-principles steady-state model of the distillation column is first developed, and a neural network model is constructed for phase equilibrium properties. The combined model is then used in RTO to investigate the performance of the proposed methodology.

Neural Network Model
The neural network model is a nonlinear function y = f_NN(x) with input vector x = [x_1, x_2, ..., x_n] and output vector y = [y_1, y_2, ..., y_m]. Mathematically, a neural network function is defined as a series of functional transformations. The structure of a two-layer (one hidden-layer) feed-forward neural network is shown in Figure 1, where h_1, h_2, ..., h_p are hidden neurons [32,33]. Specifically, the hidden neurons h_j and the outputs y_k are obtained by Equation (1):

h_j = σ_1( Σ_{i=1}^{n} w^(1)_ji x_i + w^(1)_j0 ),  j = 1, 2, ..., p  (1a)
y_k = σ_2( Σ_{j=1}^{p} w^(2)_kj h_j + w^(2)_k0 ),  k = 1, 2, ..., m  (1b)

where the parameters w^(1)_ji and w^(2)_kj are the weights in the first and the second layer and the parameters w^(1)_j0 and w^(2)_k0 are the biases. σ_1 and σ_2 are nonlinear element-wise transformations σ: R^1 → R^1, which are generally chosen to be sigmoid functions such as the logistic sigmoid S(x) = 1/(1 + e^(−x)) or the hyperbolic tangent function tanh(x) = 2/(1 + e^(−2x)) − 1. Each hidden neuron h_j is calculated by the activation function σ_1 applied to a linear combination of the input variables x_i. Each output variable y_k is calculated by the activation function σ_2 applied to a linear combination of the hidden neurons h_j. Since the neural network models in this work are developed to solve regression problems, no additional output unit activation functions are needed. All the neural network models in this work follow the structure discussed in this section. In Figure 1, the weight w^(1)_ji is marked on the structure, and the neuron "1" is used to represent the biases.
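As a minimal illustration of Equation (1), a network of this structure can be evaluated with a few lines of NumPy; the dimensions and random weights below are hypothetical stand-ins for trained parameters:

```python
import numpy as np

def nn_forward(x, W1, b1, W2, b2):
    """Two-layer feed-forward network of Equation (1):
    tanh hidden layer (1a) and a linear output layer (1b) for regression."""
    h = np.tanh(W1 @ x + b1)  # hidden neurons h_j
    return W2 @ h + b2        # outputs y_k (no output activation)

# hypothetical sizes: n = 3 inputs, p = 10 hidden neurons, m = 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(10, 3)), np.zeros(10)
W2, b2 = rng.normal(size=(1, 10)), np.zeros(1)
y = nn_forward(np.array([0.5, 0.5, 450.0]), W1, b1, W2, b2)
```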
Given a set of input vectors {x_n} together with a corresponding set of target output vectors {ŷ_n} as a training set of N data points, the neural network model is trained by minimizing the following sum-of-squares error function [33]:

E(w) = (1/2) Σ_{n=1}^{N} || f_NN(x_n; w) − ŷ_n ||^2  (2)

The proper weight vectors w are obtained by minimizing the above cost function via the gradient descent optimization method:

w^(τ+1) = w^(τ) − η ∇E(w^(τ))  (3)

where τ labels the iteration, η > 0 is known as the learning rate, and ∇E(w^(τ)) is the derivative of the cost function with respect to the weights w. The weight vectors are optimized by moving through weight space in a succession of steps of the form of Equation (3), starting from some initial value w^(0). The gradient of the error function ∇E(w) is evaluated by the back-propagation method. Additionally, the data are first normalized, and then k-fold cross-validation is used to separate the dataset into training and validation sets in order to avoid model overfitting.
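A sketch of this training loop, assuming full-batch gradient descent with back-propagated gradients (the case studies in this paper instead use Levenberg-Marquardt via the MATLAB toolbox), might look as follows; the learning rate and iteration count are illustrative:

```python
import numpy as np

def train_gd(X, Y, p=10, eta=0.1, iters=5000, seed=0):
    """Fit the two-layer network by minimizing the sum-of-squares error of
    Equation (2) with the gradient-descent update of Equation (3); gradients
    come from back-propagation. X: (N, n) inputs, Y: (N, m) targets."""
    rng = np.random.default_rng(seed)
    N, n, m = X.shape[0], X.shape[1], Y.shape[1]
    W1, b1 = 0.5 * rng.normal(size=(p, n)), rng.normal(size=p)
    W2, b2 = 0.5 * rng.normal(size=(m, p)), np.zeros(m)
    for _ in range(iters):
        H = np.tanh(X @ W1.T + b1)            # hidden activations
        E = H @ W2.T + b2 - Y                 # residuals of Equation (2)
        dW2, db2 = E.T @ H / N, E.mean(0)     # output-layer gradients
        dH = (E @ W2) * (1.0 - H**2)          # back-propagate through tanh
        dW1, db1 = dH.T @ X / N, dH.mean(0)
        W1 -= eta * dW1; b1 -= eta * db1      # Equation (3): w <- w - eta*grad
        W2 -= eta * dW2; b2 -= eta * db2
    return W1, b1, W2, b2

# toy regression problem: learn y = x^2 on [-1, 1]
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
Y = X**2
W1, b1, W2, b2 = train_gd(X, Y)
mse = np.mean((np.tanh(X @ W1.T + b1) @ W2.T + b2 - Y)**2)
```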

Application of Neural Network Models in RTO and MPC
In the chemical engineering field, model fitting is a popular technique in both academia and industry. In most applications, a certain model formulation needs to be assumed first, and then the model is fitted with experiment data. However, a good approximation is not guaranteed, since the assumed model formulation may be developed based on deficient assumptions and uncertain mechanisms, which lead to an inaccurate model. Alternatively, a neural network model can be employed to model complex, nonlinear systems, since neural networks do not require any a priori knowledge about the process and are able to fit any nonlinearity with a sufficient number of layers and neurons, according to the universal approximation theorem [34]. The obtained neural network model can be used together with existing first-principles models. Specifically, the combination of the neural network model and the first-principles model can be used in optimization problems, such as real-time optimization (RTO) and model predictive control (MPC).

RTO with the Neural Network Model
Real-time optimization (RTO) maximizes the economic productivity of the process subject to operational constraints via the continuous re-evaluation and alteration of operating conditions of a process [35]. The economically-optimal plant operating conditions are determined by RTO and sent to the controllers to operate the process at the optimal set-points [36].
Since RTO is an optimization problem, an explicit steady-state model is required in order to obtain optimal steady-states. First-principles models are commonly used in RTO; however, first-principles models may not represent the real process well due to model mismatch, and may thus lead to non-optimal or even infeasible steady-states. In these cases, machine learning methods become a good option to improve model accuracy. Specifically, a neural network model can be used to replace the complicated nonlinear part of the steady-state model to increase the accuracy of the first-principles model.
In general, the RTO problem is formulated as the optimization problem of Equation (4), where x ∈ R^n is the state and x̃ ∈ R^m is a part of the state. g(x̃) is a nonlinear function of x̃, which is a part of the steady-state model:

min_x  J(x)  (4a)
s.t.  F(x, g(x̃)) = 0  (4b)
      other constraints  (4c)

Since it is difficult to obtain an accurate functional form of g(x̃), a neural network F_NN(x̃) is developed using simulation data to replace g(x̃) in Equation (4). Therefore, the RTO based on the integration of the first-principles model and the neural network model is developed as follows:

min_x  J(x)  (5a)
s.t.  F(x, F_NN(x̃)) = 0  (5b)
      other constraints  (5c)
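As a schematic illustration of Equation (5), the snippet below solves a toy RTO problem in which a surrogate f_nn stands in for the trained network F_NN and the steady-state model is a single equality constraint; all functions and numbers are hypothetical placeholders, not the models of the case studies:

```python
import numpy as np
from scipy.optimize import minimize

def f_nn(x_part):
    """Placeholder for the trained neural network F_NN replacing g(x~)."""
    return float(np.tanh(x_part[0]))

def steady_state(x):
    """Equality constraint in the spirit of Equation (5b): 0 = F(x, F_NN(x~))."""
    return x[0] + x[1] - f_nn(x[:1])

def economic_cost(x):
    """Hypothetical economic objective J(x) to be minimized."""
    return (x[0] - 0.5)**2 + 0.1 * x[1]**2

res = minimize(economic_cost, x0=np.array([0.2, 0.2]),
               bounds=[(0.0, 1.0), (-1.0, 1.0)],
               constraints={"type": "eq", "fun": steady_state})
x_opt = res.x  # optimal steady-state, sent to the controllers as set-points
```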

MPC with Neural Network Models
Model predictive control (MPC) is an advanced control technique that uses a dynamic process model to predict future states over a finite-time horizon to calculate the optimal input trajectory. Since MPC is able to account for multi-variable interactions and process constraints, it has been widely used to control constrained multiple-input multiple-output nonlinear systems [37]. Since MPC is an optimization problem, an explicit dynamic model is required to predict future states and make optimal decisions. First-principles models can be developed and used as the prediction model in MPC; however, first-principles models suffer from model mismatch, which might lead to offsets and other issues. Therefore, machine learning methods can be used to reduce model mismatch by replacing the complicated nonlinear part of the dynamic model with a neural network model.
In general, MPC can be formulated as the optimization problem of Equation (6), where the notations follow those in Equation (4) and ẋ = F(x, g(x)) is the first-principles dynamic process model:

min_{u ∈ S(∆)}  ∫_{t_k}^{t_(k+N)} L(x̃(τ), u(τ)) dτ  (6a)
s.t.  x̃̇(t) = F(x̃(t), g(x̃(t)))  (6b)
      x̃(t_k) = x(t_k)  (6c)
      u(t) ∈ U, ∀ t ∈ [t_k, t_(k+N))  (6d)
      other constraints  (6e)

Similar to Equation (5), a neural network F_NN(x) is developed using simulation data to replace g(x) in Equation (6). As a result, the MPC based on the integration of the first-principles model and the neural network model is developed as follows:

min_{u ∈ S(∆)}  ∫_{t_k}^{t_(k+N)} L(x̃(τ), u(τ)) dτ  (7a)
s.t.  x̃̇(t) = F(x̃(t), F_NN(x̃(t)))  (7b)
      x̃(t_k) = x(t_k)  (7c)
      u(t) ∈ U, ∀ t ∈ [t_k, t_(k+N))  (7d)
      other constraints  (7e)

Remark 1. To derive stability properties for the closed-loop system under MPC, additional stabilizing constraints can be employed within the MPC of Equation (7) (e.g., terminal constraints [38] and Lyapunov-based constraints [39]). In this work, a Lyapunov-based MPC (LMPC) is developed to achieve closed-loop stability in the sense that the closed-loop state is bounded in a stability region for all times and is ultimately driven to the origin. The discussion and the proof of closed-loop stability under LMPC using machine learning-based models can be found in [4,31].
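A minimal single-state sketch of solving an MPC problem of the form of Equation (7), by discretizing the dynamics and optimizing over the input sequence, is given below; the model, horizon, and weights are hypothetical stand-ins (the MPC problems in this work are solved with IPOPT):

```python
import numpy as np
from scipy.optimize import minimize

def f_nn(x):
    """Placeholder surrogate for the nonlinear term g(x) in x_dot = F(x, g(x))."""
    return -np.tanh(x)

def step(x, u, dt=0.1):
    """Explicit Euler step of the hybrid dynamic model over one period."""
    return x + dt * (f_nn(x) + u)

def horizon_cost(u_seq, xk):
    """Finite-horizon quadratic cost accumulated along the predicted states."""
    x, cost = xk, 0.0
    for u in u_seq:
        x = step(x, u)
        cost += x**2 + 0.01 * u**2
    return cost

xk = 1.0                                   # current state measurement
res = minimize(horizon_cost, x0=np.zeros(10), args=(xk,),
               bounds=[(-2.0, 2.0)] * 10)  # input constraints u in U
u_apply = res.x[0]                         # receding horizon: apply first input
```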

Remark 2.
All the optimization problems of MPC and RTO in this manuscript are solved using IPOPT, which is an interior point optimizer for large-scale nonlinear programs. The IPOPT solver was run on the OPTI Toolbox in MATLAB. It is noted that the global optimum of the nonlinear optimization problem is not required in our case, since the control objective of MPC is to stabilize the system at its set-point, rather than to find the globally-optimal trajectory. The Lyapunov-based constraints can guarantee closed-loop stability in terms of convergence to the set-point for the nonlinear system provided that a feasible solution (could be a locally-optimal solution) to the LMPC optimization problem exists.

Remark 3.
In the manuscript, the MPC is implemented in a sample-and-hold fashion, under which the control action remains the same over one sampling period, i.e., u(t) = u(x(t_k)), ∀t ∈ [t_k, t_(k+1)), where t_(k+1) represents t_k + ∆ and ∆ is the sampling period. Additionally, one possible way to solve the optimization problems of Equations (6) and (7) is to use continuous-time optimization schemes. This method has recently gained researchers' attention and can be found in [40,41].

Remark 4.
In this work, the neural network is used to replace the nonlinear term in the first-principles model, for which it is generally difficult to obtain an accurate functional form from first-principles calculations. It should be noted that the neural network F NN (x) was developed as an input-output function to replace only a part (static nonlinearities) of the first-principles model, and thus does not replace the entire steady-state model or dynamic model.

Process Description and Simulation
The first example considers a continuous stirred tank reactor (CSTR), where a reversible exothermic reaction A ⇌ B takes place [42,43]. After applying mass and energy balances, the following dynamic model is obtained to describe the process:

dC_A/dt = (C_A0 − C_A)/τ − (k_A e^(−E_A/(RT)) C_A − k_B e^(−E_B/(RT)) C_B)  (8a)
dC_B/dt = −C_B/τ + (k_A e^(−E_A/(RT)) C_A − k_B e^(−E_B/(RT)) C_B)  (8b)
dT/dt = (T_0 − T)/τ + (−∆H/C_P)(k_A e^(−E_A/(RT)) C_A − k_B e^(−E_B/(RT)) C_B) + Q/(ρ C_P V)  (8c)

In the model of Equation (8), C_A and C_B are the concentrations of A and B in the reactor, and T is the temperature of the reactor. The feed temperature and concentration are denoted by T_0 and C_A0, respectively. k_A and k_B are the pre-exponential factors for the forward and reverse reactions, respectively. E_A and E_B are the activation energies for the forward and reverse reactions, respectively. τ is the residence time in the reactor; ∆H is the enthalpy of the reaction; C_P is the heat capacity of the liquid mixture; ρ is the density of the liquid mixture; and V is the reactor volume. The CSTR is equipped with a jacket that provides heat to the reactor at rate Q. All process parameter values and steady-state values are listed in Table 1. Additionally, it is noted that the second equation of Equation (8) for C_B is unnecessary if C_A0 is fixed, since then C_B = C_A0 − C_A. This does not hold when C_A0 is varying, and thus the full model is used in this work for generality.
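For intuition, a model of the form of Equation (8) can be integrated with the explicit Euler scheme used later in the closed-loop simulations; the parameter values and lumped heat-term coefficients below are illustrative assumptions, not the values of Table 1:

```python
import numpy as np

# illustrative stand-in parameters (the paper's values are listed in Table 1)
kA, kB, EA, EB, Rg = 1e4, 1e5, 8e3, 1.2e4, 1.987
tau, CA0, T0 = 60.0, 1.0, 400.0
dHCp, QCp = 5.0, 1e-5      # assumed lumped (-dH)/Cp and 1/(rho*Cp*V)

def rhs(z, Q):
    """Right-hand side of the CSTR balances of Equation (8)."""
    CA, CB, T = z
    r = kA * np.exp(-EA / (Rg * T)) * CA - kB * np.exp(-EB / (Rg * T)) * CB
    return np.array([(CA0 - CA) / tau - r,
                     -CB / tau + r,
                     (T0 - T) / tau + dHCp * r + QCp * Q])

z = np.array([1.0, 0.0, 400.0])            # start from pure A at feed temperature
for _ in range(int(1000 / 1e-2)):          # integrate 1000 s at h_c = 1e-2 s
    z = z + 1e-2 * rhs(z, Q=4e4)
```

With a fixed feed composition, the simulated trajectory preserves C_A + C_B = C_A0, consistent with the remark above that the C_B balance is redundant in that case.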
When the tank temperature T is too low, the reaction rate remains slow, so that reactant A is not fully converted during the residence time and the reactant conversion (1 − C_A/C_A0) is low. When the tank temperature T is too high, the equilibrium of the reversible exothermic reaction shifts backwards, so that the reactant conversion (1 − C_A/C_A0) also drops. As a result, there exists an optimal tank temperature that maximizes the reactant conversion. Figure 2 shows the variation of the CSTR steady-state (i.e., concentration C_A and temperature T) under varying heat input rate Q, where Q is not explicitly shown in Figure 2. Specifically, the minimum point of C_A represents the steady-state of C_A and T under which the highest conversion (conversion = 1 − C_A/C_A0) is achieved. Therefore, the CSTR process should be operated at this steady-state for economic optimality if no other cost is accounted for.

Neural Network Model
In the CSTR model of Equation (8), the reaction rate r = k_A e^(−E_A/(RT)) C_A − k_B e^(−E_B/(RT)) C_B is a nonlinear function of C_A, C_B, and T. To obtain this reaction rate from experiment data, an assumption of the reaction rate mechanism and the reaction rate function formulation is required. In practice, it could be challenging to obtain an accurate reaction rate expression using this approach if the reaction mechanism is unknown and the rate expression is very complicated.
In this work, a neural network model is built to represent the reaction rate r as a function of C_A, C_B, and T (i.e., r = F_NN(C_A, C_B, T)), and then the neural network model replaces the first-principles rate equation in the process model. Specifically, around eight million data points were generated from the first-principles rate expression. The dataset was generated such that various reaction rates under different operating conditions (i.e., temperature and the concentrations of A and B) were covered, with the operating conditions discretized equidistantly. Activation functions such as tanh, sigmoid, and ReLU were tried for the hidden layers, and a linear unit and the softmax function were tried for the output layer. It was found that the choice of activation function for the output layer significantly affected the performance of the neural network in a regression problem, while the choices for the hidden layers achieved similar results. tanh(x) = 2/(1 + e^(−2x)) − 1 was ultimately chosen as the activation function for the hidden layers, and a linear unit was used for the output layer, since they achieved the best training performance with a mean squared error less than 10^−7. The data were first normalized and then fed to the MATLAB Deep Learning toolbox to train the model. The neural network model had one hidden layer with 10 neurons, and the parameters were trained using the Levenberg-Marquardt optimization algorithm. In terms of the accuracy of the neural network model, the coefficient of determination R^2 was 1, and the error histogram of Figure 3 demonstrates that the neural network represented the reaction rate with high accuracy, as can be seen from the error distribution (we note that error metrics used in classification problems, such as the confusion matrix, precision, recall, and F1-score, are not applicable to the regression problems considered in this work). In the process model of Equation (8), the first-principles reaction rate term was replaced with the obtained neural network F_NN(C_A, C_B, T).
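The data-generation step described above can be sketched as follows; the rate parameters are illustrative stand-ins for Table 1, and a 50-point grid per variable is used here for brevity instead of the roughly 200 points per variable that would yield eight million samples:

```python
import numpy as np

# illustrative rate parameters (stand-ins for the values in Table 1)
kA, kB, EA, EB, Rg = 1e4, 1e5, 8e3, 1.2e4, 1.987

def rate(CA, CB, T):
    """First-principles net reaction rate that the network learns to mimic."""
    return kA * np.exp(-EA / (Rg * T)) * CA - kB * np.exp(-EB / (Rg * T)) * CB

# equidistant grid over the operating region, as described in the text
CA = np.linspace(0.0, 1.0, 50)
CB = np.linspace(0.0, 1.0, 50)
T = np.linspace(400.0, 500.0, 50)
G = np.stack(np.meshgrid(CA, CB, T, indexing="ij"), axis=-1).reshape(-1, 3)
r = rate(G[:, 0], G[:, 1], G[:, 2])     # training targets r = F_NN(CA, CB, T)
X = (G - G.mean(0)) / G.std(0)          # inputs normalized before training
```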
The integration of the first-principles model and the neural network model that was used in RTO and MPC will be discussed in the following sections.

Remark 5.
The activation function plays an important role in the neural network training process and may affect its prediction performance significantly. Specifically, in the CSTR example, since it is known that the reaction rate is generally in the form of exponential functions, we tried the tanh and sigmoid activation functions. It was demonstrated that both achieved the desired performance with a mean squared error less than 10^−7.

RTO Design
It is generally accepted that energy costs vary significantly compared to capital, labor, and other expenses in an actual plant. Therefore, in addition to productivity, it is important to account for the energy cost in the real-time optimization of plant operation. Specifically, in this example, the heating cost was regarded as the entire energy cost, since other energy costs may be lumped into the heating energy cost. The overall cost function is defined as follows:

total cost = C_A + p_Q Q  (9)

where p_Q is the heat price. Equation (9) attempts to find the balance between the reactant conversion (through C_A) and the heat cost (through Q). A simple linear combination of Q and C_A was taken in this case study, since it was sufficient to illustrate the trade-off between energy cost and reactant conversion. The above total cost is minimized in real time by solving the optimization problem of Equation (10).
The constraints of Equations (10b)-(10d) are the steady-state model of the CSTR process, obtained by setting the time derivatives of Equation (8) to zero and replacing the reaction rate term by the neural network model built in Section 3.2. Since the feed concentration C_A0 is 1 mol/L, C_A and C_B must be between 0 and 1 mol/L. The temperature constraint [400, 500] K and the heat input constraint [0, 10^5] cal/s reflect the desired operating conditions. At the initial steady-state, the heat price is 7 × 10^−7, and the CSTR operates at T = 426.7 K, C_A = 0.4977 mol/L, and Q = 40,386 cal/s. The conversion is not compromised much, since C_A = 0.4977 mol/L is close to the conversion-optimal value C_A = 0.4912 mol/L, while the energy saving is considerable when Q = 40,386 cal/s is compared to the conversion-optimal value Q = 59,983 cal/s. In the presence of variation in process variables or in the heat price, RTO recalculates the optimal operating condition, provided that the variation is measurable every RTO period. The RTO of Equation (10) is solved every RTO period, and the resulting steady-state values are then sent to the controllers as the optimal set-points for the next 1000 s. Since the CSTR process has relatively fast dynamics, a small RTO period of 1000 s is chosen to illustrate the performance of RTO.
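Assuming the linear cost form described above (a conversion term plus a priced energy term), the reported numbers are consistent; the check below uses the initial operating point given in the text:

```python
pQ = 7e-7                  # heat price at the initial steady-state
CA, Q = 0.4977, 40386.0    # initial operating point reported in the text
total_cost = CA + pQ * Q   # assumed linear cost of Equation (9)
# evaluates to about 0.526, the initial cost later used in Equation (13)
```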

Controller Design
In order to drive the process to the optimal steady-state, a Lyapunov-based model predictive controller (LMPC) is developed in this section. The controlled variables are C_A, C_B, and T, and the manipulated variable is the heat rate Q. The CSTR is initially operated at the steady-state [C_As C_Bs T_s] = [0.4977 mol/L 0.5023 mol/L 426.743 K], with the steady-state input Q_s = 40,386 cal/s. At the beginning of each RTO period, a new set of steady-state values is calculated, and then the input and the states are represented in deviation variable form as u = Q − Q_s and x = [C_A − C_As, C_B − C_Bs, T − T_s], such that the system of Equation (8), together with F_NN(C_A, C_B, T), can be written in the form ẋ = f(x) + g(x)u. A Lyapunov function is designed using the standard quadratic form V(x) = 100,000x_1^2 + 100,000x_2^2 + x_3^2, where the weights are chosen to ensure that all terms are of a similar order of magnitude, since the temperature varies over a much larger range than the concentrations. We characterize the stability region Ω_ρ as a level set of the Lyapunov function, i.e., Ω_ρ = {x ∈ R^3 | V(x) ≤ ρ}. For the system of Equation (8), the stability region Ω_ρ with ρ = 1000 is found based on the above Lyapunov function V and an explicit stabilizing controller h(x) [44]. The control objective is to stabilize C_A, C_B, and T in the reactor at the steady-state calculated by the RTO by manipulating the heat rate Q. Specifically, the LMPC is presented by the following optimization problem:

min_{u ∈ S(∆)}  ∫_{t_k}^{t_(k+N)} ( |x̃(τ)|^2_Qc + |u(τ)|^2_Rc ) dτ  (12a)
s.t.  x̃̇(t) = f(x̃(t)) + g(x̃(t)) u(t)  (12b)
      x̃(t_k) = x(t_k)  (12c)
      u(t) ∈ U, ∀ t ∈ [t_k, t_(k+N))  (12d)
      V̇(x(t_k), u(t_k)) ≤ V̇(x(t_k), h(x(t_k)))  (12e)

where x̃ is the predicted state, V̇(x, u) denotes (∂V(x)/∂x)(f(x) + g(x)u), N is the number of sampling periods within the prediction horizon, and S(∆) is the set of piece-wise constant functions with period ∆. The LMPC optimization problem calculates the optimal input trajectory over the entire prediction horizon t ∈ [t_k, t_(k+N)), but only applies the control action for the first sampling period, i.e., u(t) = u(x(t_k)), ∀t ∈ [t_k, t_(k+1)).
In the optimization problem of Equation (12), Equation (12a) is the objective function minimizing the time integral of |x̃(τ)|^2_Qc + |u(τ)|^2_Rc over the prediction horizon. Equation (12b) is the process model of Equation (8) in its deviation form and is used to predict the future states, where a neural network F_NN(x_1, x_2, x_3) replaces the reaction rate term of Equation (8). Equation (12c) uses the state measurement x(t_k) at t = t_k as the initial condition x̃(t_k) of the optimization problem. Equation (12d) defines the input constraints over the entire prediction horizon, where U = [0 − Q_s, 10^5 − Q_s]. The constraint of Equation (12e) is used to decrease V(x) such that the state x(t) is forced to move towards the origin. It guarantees that the origin of the closed-loop system is rendered asymptotically stable under the LMPC for any initial condition inside the stability region Ω_ρ. The detailed proof of closed-loop stability can be found in [39].
To simulate the dynamic model of Equation (8) numerically under the LMPC of Equation (12), we used the explicit Euler method with an integration time step of h_c = 10^−2 s. Additionally, the optimization problem of the LMPC of Equation (12) is solved using the solver IPOPT in the OPTI Toolbox in MATLAB with the following parameters: sampling period ∆ = 5 s and prediction horizon N = 10. Q_c = diag(1, 1, 5 × 10^−5) and R_c = 10^−11 were chosen such that the state and input terms |x̃(τ)|^2_Qc and |u(τ)|^2_Rc have a similar order of magnitude.
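The sample-and-hold closed-loop simulation described here can be sketched generically: the controller is evaluated once per sampling period ∆ = 5 s, and the dynamics are integrated by explicit Euler at h_c = 10^−2 s between controller calls. The scalar system and proportional controller below are placeholders, not the CSTR model or the LMPC:

```python
import numpy as np

def simulate(f, controller, x0, t_end=50.0, Delta=5.0, hc=1e-2):
    """Closed-loop simulation with sample-and-hold inputs:
    u(t) = u(x(t_k)) is held constant over each sampling period."""
    x, traj = np.array(x0, dtype=float), []
    for _ in range(int(t_end / Delta)):       # one controller call per period
        u = controller(x)
        for _ in range(int(Delta / hc)):      # explicit Euler within the period
            x = x + hc * f(x, u)
        traj.append(x.copy())
    return np.array(traj)

# placeholder stable system and proportional feedback for illustration
traj = simulate(lambda x, u: -x + u, lambda x: -0.5 * x[0], x0=[1.0])
```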

Simulation Results
In the simulation, a variation of the heat price is introduced to demonstrate the performance of the designed RTO and MPC. Since the heat price changes as shown in Figure 4, the initial steady-state is no longer the optimal operating condition. The RTO of Equation (10) is solved at the beginning of each RTO period to obtain a set of improved set-points, which are then tracked by the MPC of Equation (12). With the updated set-points, the CSTR process keeps adjusting its operating condition to account for the varying heat price. After the controller receives the set-points, the MPC of Equation (12) calculates the input u to bring x to the new set-point, and finally, both the state x and the input u are maintained at their new steady-state values. The concentration profiles, temperature profile, and heat rate profile are shown in Figures 5-7. During the first half of the simulation, the heat price doubles. Considering the increasing heat price, the operation tends to decrease the heat rate to reduce the energy cost, while compromising the reactant conversion. Therefore, the energy cost and reactant conversion are balanced by RTO to reach a new optimum. As demonstrated in Figure 5, C_A increases and C_B decreases during the first half of the simulation, which implies that less reactant A is converted to product B in the tank. The reactor temperature also drops, as shown in Figure 6, which corresponds to the decreasing heat rate shown in Figure 7.
Total cost is calculated by Equation (9) using state measurements of C A and Q from the closed-loop simulation and is plotted in Figure 8. The total cost with fixed steady-state is also calculated and plotted for comparison. After the heat price starts to increase, both total costs inevitably increase. Since RTO keeps calculating better steady-states compared to the initial steady-state, the total cost under RTO increases less than the simulation without RTO. The total cost is integrated with time to demonstrate the difference in cost increment, using Equation (13).
cost increase = ∫_0^(t_final) (total cost − initial cost) dt  (13)

where initial cost = 0.526 and t_final = 10,000 s. The ratio of the cost increment between the simulations with RTO and without RTO is 195 : 241. Although the operating cost increases because of the rising heat price, RTO reduces the cost increment by approximately one fifth when compared to the fixed operating condition without RTO. The combination of neural network models and first-principles models works well in both RTO and MPC. Additionally, it is shown in Figures 5-7 that the RTO with the combined first-principles/neural-network model calculates the same steady-state as the RTO with a pure first-principles model. Moreover, the MPC also drives all the states to the set-points without offset when the MPC uses the combination of a neural network model with a first-principles model. In this case study, the neural network model is accurate enough that the combination of the neural network and first-principles model attains almost the same closed-loop result as the pure first-principles model (the curves overlap when plotted in the same figure, as is done in Figures 5-7, where the blue curve denotes the solution under MPC with the combined first-principles/neural network model, the red curve denotes the solution under MPC with the first-principles model, the green curve denotes the set-points calculated by RTO with the hybrid model, and the black curve denotes the set-points calculated by RTO with the first-principles model). Additionally, we calculated the accumulated relative error

E = ∫_0^(10,000 s) |T_f − T_h| dt / ∫_0^(10,000 s) T_f dt

between the temperature curves (Figure 6) under the first-principles model (i.e., T_f) and under the hybrid model (i.e., T_h) over the entire operating period from t = 0 to t = 10,000 s. It was obtained that E = 4.98 × 10^−6, which is sufficiently small. This implies that the neural network successfully approximated the nonlinear reaction rate term.
In practice, a neural network could be even more effective when the reaction rate is very complicated, depends on more variables, and the reaction mechanism is unknown.

Process Description
A simple binary separation of propane from isobutane in a distillation column was used for the second case study [45]. Aspen Plus (Aspen Technology, Inc., Bedford, MA, USA) and Aspen Plus Dynamics V10.0 were utilized to perform high-fidelity dynamic simulation for the distillation column. Specifically, Aspen Plus uses the mass and energy balances to calculate the steady-state of the process based on a process flowsheet design and carefully-chosen thermodynamic models. After the steady-state model is solved in Aspen Plus, it can be exported to a dynamic model in Aspen Plus Dynamics, which runs dynamic simulations based on the obtained steady-state models and detailed process parameters [46,47].
A schematic of the distillation process is shown in Figure 9. The feed to the separation process was at 20 atm, 322 K, and 1 kmol/s, with a propane mole fraction of 0.4 and an isobutane mole fraction of 0.6. After a valve controlling the feed flow rate, the feed enters the distillation column at Tray 14. The feed tray is carefully chosen to achieve the best separation performance and minimum energy cost, as discussed in [45]. The column has 30 trays with a tray spacing of 0.61 m, and the diameter of the tray is 3.85 m and 4.89 m for the rectifying section and stripping section, respectively. At the initial steady-state, the distillate product has a propane mole fraction of 0.98 and a flow rate of 0.39 kmol/s, while the bottom product has a propane mole fraction of 0.019 and a flow rate of 0.61 kmol/s. The reflux ratio is 3.33, together with a condenser heat duty of −2.17 × 10^7 W and a reboiler heat duty of 2.61 × 10^7 W. The pressures at the top and bottom are 16.8 atm and 17 atm, respectively. Both the top and bottom products are followed by a pump and a control valve. All the parameters are summarized in Table 2. In our simulation, the components propane and isobutane were specified, and the CHAO-SEA model was selected for the thermodynamic property calculation. The steady-state model was first built in Aspen Plus using the detailed information discussed above and the parameters in Table 2. Then, the achieved steady-state simulation was exported to the dynamic model as a pressure-driven model, based on additional parameters such as the reboiler size and drum size. After checking the open-loop response of the dynamic model, controllers will be designed in Section 4.3.2. In order to calculate the steady-state of the distillation process, an analytic steady-state model is developed in this section. Since the Aspen model cannot be used in the optimization problem explicitly, this analytic steady-state model will be used in the RTO.
The analytic steady-state model consists of five variables, which are the reflux ratio R, the distillate mole flow rate D, the bottom mole flow rate B, the distillate mole fraction x D , and the bottom mole fraction x B . For clarification, x is denoted as the mole fraction for the light component propane. Other parameters include feed conditions: feed molar flow rate F, feed mole fraction x F , feed heat condition q; column parameters: total number of trays N T , feed tray N F ; component property: relative volatility α. Three equations were developed for the steady-state model.
The first equation F_1(D, B) = 0 is the overall molar balance between the feed and the products, as shown in Equation (14):

F − D − B = 0  (14)
The second equation F_2(D, B, x_D, x_B) = 0 is the overall component balance of the light component propane, as shown in Equation (15):

F x_F − D x_D − B x_B = 0  (15)

The third equation applies the binary McCabe-Thiele method. The constant molar overflow assumption of the McCabe-Thiele method was held in this case study: the liquid and vapor flow rates are constant in a given section of the column. Equilibrium was also assumed to be reached on each tray, and the top tray was defined as the first tray. To apply the McCabe-Thiele method, the rectifying operating line (ROL), the stripping operating line (SOL), and the phase equilibrium were developed as follows:

Rectifying operating line (ROL): y_(n+1) = (R/(R + 1)) x_n + x_D/(R + 1)  (16)
Stripping operating line (SOL): y_(m+1) = (L̄/V̄) x_m − (B/V̄) x_B, with L̄ = RD + qF and V̄ = L̄ − B  (17)
Phase equilibrium: y_n = α x_n/(1 + (α − 1) x_n)  (18)

where α = (y_C3/x_C3)/(y_C4/x_C4) = 1.79 is the approximate relative volatility between propane and isobutane at a pressure of 16.9 atm, which is the mean of the top and bottom pressures.
The third equation F_3(R, D, x_D, x_B) = 0 is expressed in Equation (19) below. There are five variables (R, D, B, x_D, x_B) and three equations (F_1, F_2, F_3), which implies that there are two degrees of freedom. In order to determine the whole process operating condition, two more states need to be fixed, potentially by RTO. It is necessary to point out that the concentrations x_i and y_i on each tray can be calculated by Equation (19) once all five variables R, D, B, x_D, x_B are determined. Additionally, if the equilibrium temperature-composition curve T = f_e(x) (bubble point curve) or T = f_e(y) (dew point curve) is provided, then the temperature T_i on each tray can also be calculated by simply using T_i = f_e(x_i) or T_i = f_e(y_i).
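The tray-by-tray construction behind F_3 can be sketched in code. The product purities x_D = 0.95 and x_B = 0.05 below are illustrative values (not the paper's operating point), and the sketch switches from the rectifying to the stripping line at the q-line crossing rather than at the fixed feed tray; the constant relative volatility α = 1.79 and the constant-molar-overflow flows are as in the text.

```python
# Minimal McCabe-Thiele sketch for the binary C3/iC4 column
# (constant relative volatility, constant molar overflow).
F, xF, q = 1.0, 0.4, 1.0       # feed: 1 kmol/s, 40 mol% propane, sat. liquid
alpha, R = 1.79, 3.33          # relative volatility, reflux ratio
xD, xB = 0.95, 0.05            # assumed (illustrative) product purities

# Overall and component balances (Eqs. (14)-(15)) give D and B directly
D = F * (xF - xB) / (xD - xB)
B = F - D

L, V = R * D, (R + 1.0) * D    # rectifying liquid / vapor flows
Lp, Vp = L + q * F, V          # stripping flows (q = 1: saturated liquid feed)

def equilibrium_x(y):
    """Liquid composition in equilibrium with vapor y (Eq. (19b))."""
    return y / (alpha - (alpha - 1.0) * y)

# Step down the column: equilibrium stage, then the operating line
y, stages = xD, 0
while stages < 100:
    x = equilibrium_x(y)
    stages += 1
    if x <= xB:                # bottoms purity reached
        break
    if x > xF:                 # above the feed: rectifying operating line
        y = L / V * x + D * xD / V
    else:                      # below the feed: stripping operating line
        y = Lp / Vp * x - B * xB / Vp
```

With the column's 30 trays, F_3 = 0 is the requirement that this stepping closes the balance at the bottoms composition; the sketch instead simply counts the equilibrium stages the separation needs.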

Neural Network Model
Phase equilibrium properties are usually nonlinear, and first-principles models of them are often found to be inaccurate and to demand modifications. In the above steady-state model, the phase equilibrium x_n = y_n/(α − (α − 1)y_n) of Equation (19b) assumes that the relative volatility α is constant; however, α does not remain constant with varying concentration and pressure. Therefore, a more accurate model of the phase equilibrium x ∼ y can improve the model performance. Similarly, the dew point curve T ∼ y can be built from a first-principles formulation based on Raoult's law and the Antoine equation. However, the Antoine equation is an empirical equation, and it is hard to relate saturated pressure with temperature accurately, especially for a mixture. As a result, a machine learning method can be used to achieve a better model of the phase equilibrium properties.
In this case study, a neural network (x, T) = F_NN(y) was built, with one input (the vapor phase mole fraction y) and two outputs (the equilibrium temperature T and the liquid phase mole fraction x). A total of 1500 data points of T, x, and y were generated by the Aspen property library and were then normalized and fed into the MATLAB Deep Learning Toolbox. The hyperbolic tangent tanh(x) = 2/(1 + e^{−2x}) − 1 was chosen as the activation function. The neural network model had one hidden layer with five neurons. The parameters were trained by Levenberg-Marquardt optimization, and the mean squared error for the test dataset was around 10^{−7}. It is demonstrated in Figure 10 that the neural network model fits the data from the Aspen property library very well, where the blue solid curve is the neural network model prediction and the red curve denotes the Aspen model. Additionally, we calculated the accumulated relative error E = ∫_0^1 |T_f − T_h| dy / ∫_0^1 T_f dy between the temperature curves (Figure 10) under the Aspen model (T_f) and under the neural network model (T_h), and obtained E = 2.32 × 10^{−6}; the result was similar for the liquid mole fraction curves. This sufficiently small error implies that the neural network model successfully approximated the nonlinear behavior of the thermodynamic properties. Additionally, the coefficient of determination R^2 was 1, and the error histogram of Figure 11 demonstrates that the neural network model represented the thermodynamic properties with great accuracy.
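The network structure above (one input, five tanh hidden neurons, two normalized outputs) can be sketched as follows. scikit-learn's MLPRegressor stands in for the MATLAB Deep Learning Toolbox (its L-BFGS solver replaces Levenberg-Marquardt), and synthetic constant-α equilibrium data with an illustrative linear dew-point curve stand in for the Aspen property data.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

alpha = 1.79
y = np.linspace(0.0, 1.0, 1500).reshape(-1, 1)  # vapor mole fraction (input)
x = y / (alpha - (alpha - 1.0) * y)             # equilibrium liquid composition
T = 330.0 - 30.0 * y                            # illustrative dew point [K]

# Normalize the two targets, as in the paper, before training
targets = np.hstack([x, T])
mean, std = targets.mean(axis=0), targets.std(axis=0)

net = MLPRegressor(hidden_layer_sizes=(5,), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(y, (targets - mean) / std)

r2 = net.score(y, (targets - mean) / std)       # coefficient of determination
```

Five tanh neurons are ample for these smooth one-dimensional curves, which is consistent with the near-unity R^2 reported for the trained network.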

Figure 11. Error distribution histogram (targets minus outputs) for training, validation, and testing data.
After training the neural network model, the first-principles phase equilibrium expression x_n = y_n/(α − (α − 1)y_n) in Equation (19b) is replaced by the neural network phase equilibrium expression x_n = F_NN,1(y_n), and the integrated first-principles/neural network model is then used in the RTO, as discussed in the following sections. In addition, the second output of the neural network model, T_n = F_NN,2(y_n), can be combined with Equation (19) to calculate the temperature on each tray, which will be used later to calculate the set-points for the controllers.
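The swap of the analytic equilibrium for the learned one, and the per-tray temperature calculation, can be sketched as below. Here f_nn is only a closed-form stand-in for the trained network (x, T) = F_NN(y), with an illustrative dew-point curve, so that the example is self-contained; in the paper the trained network itself supplies both outputs.

```python
alpha = 1.79

def f_nn(y):
    """Stand-in for the trained network (x, T) = F_NN(y)."""
    x = y / (alpha - (alpha - 1.0) * y)  # F_NN,1: equilibrium liquid comp
    T = 330.0 - 30.0 * y                 # F_NN,2: illustrative dew point [K]
    return x, T

R, xD = 3.33, 0.98                       # reflux ratio and distillate purity
y, profile = xD, []
for tray in range(1, 8):                 # Trays 1..7 (top tray is Tray 1)
    x, T = f_nn(y)                       # learned equilibrium on each tray
    profile.append((tray, x, T))
    y = R / (R + 1.0) * x + xD / (R + 1.0)  # rectifying operating line

T7 = profile[-1][2]                      # temperature set-point for TC
```

Stepping the operating lines through F_NN instead of the constant-α expression yields the concentration and temperature on every tray, including the Tray 7 temperature used as a controller set-point.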

RTO Design
Since the process has two degrees of freedom, the operating condition is not yet determined. An RTO was designed for the distillation process to obtain the optimal operating condition. Since the RTO needs an objective function, a profit function was formulated to represent the operation profit. Based on the product, feed, and energy prices in [45], the profit is defined by Equation (20).
The profit equals the revenue from the products minus the costs of the feed and energy. The profit used in the RTO is represented as a function of R, D, B, x_D, and x_B. To this end, the heat duty Q of both the condenser and reboiler is approximated by Q = L(R + 1)F, where L = 1.29 × 10^7 J/kmol is the molar latent heat of the mixture. Moreover, mass-based prices are converted to mole-based prices because all flow rates are mole-based. The price of the top distillate rises linearly as the mole fraction x_D increases in order to reflect that a higher purity product has a higher price. To maximize the operation profit, the RTO problem is formulated as Equation (22).
Equation (22a) minimizes the negative profit with respect to the five optimization variables R, D, B, x_D, and x_B. The first three constraints, Equations (22b)-(22d), are the steady-state model equations, Equations (14), (15), and (19), as discussed in Section 4.1.2, where the neural network model x_n = F_NN,1(y_n) replaces x_n = y_n/(α − (α − 1)y_n) in Equation (19). Constraints on the optimization variables are determined based on the process parameters. Specifically, the reflux ratio R can be any positive number; D and B should be between 0 and 1 kmol/s because the feed is only 1 kmol/s; and x_D and x_B should be between zero and one because they are mole fractions. Since there are two degrees of freedom in the optimization problem, two steady-state values are sent to the controllers as set-points.
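The shape of the RTO problem (22) can be sketched with SciPy. All prices, the energy coefficient, and the smooth purity-vs-reflux surrogate that stands in for the tray constraint (and its neural-network replacement) are illustrative assumptions, not the values from [45]; the balances F_1 and F_2 are eliminated analytically, leaving the two degrees of freedom (here R and x_B) as decision variables.

```python
import numpy as np
from scipy.optimize import minimize

F, xF = 1.0, 0.4                  # feed flow [kmol/s] and composition
Lh = 1.29e7                       # molar latent heat [J/kmol]
p_D, p_B, c_F = 50.0, 5.0, 15.0   # assumed mole-based prices [$/kmol]
c_E = 0.3 / Lh                    # assumed energy price [$/J]

def neg_profit(z):
    R, xB = z
    # Smooth surrogate for the tray/equilibrium constraint (assumed form):
    # achievable distillate purity improves with reflux ratio
    xD = 1.0 - (1.0 - xF) * np.exp(-0.5 * R)
    D = F * (xF - xB) / (xD - xB)           # from F1 and F2
    B = F - D
    Q = Lh * (R + 1.0) * F                  # heat-duty approximation Q = L(R+1)F
    profit = p_D * xD * D + p_B * B - c_F * F - c_E * Q
    return -profit                          # Eq. (22a): minimize negative profit

res = minimize(neg_profit, x0=[3.0, 0.05], method="L-BFGS-B",
               bounds=[(0.5, 15.0), (1e-3, 0.2)])
R_opt, xB_opt = res.x
```

The purity-dependent distillate price makes higher reflux pay off up to the point where the energy term dominates, so the optimizer returns an interior reflux ratio, mirroring the cost/conversion trade-off the RTO resolves.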

Controller Design
Six controllers were added to the distillation column, four of which had fixed set-points and two of which received set-points from the RTO. The control scheme is shown in Figure 12. (1) A flow rate controller FC controls the feed mole flow rate at 1 kmol/s by manipulating the feed valve V_1. A fixed feed flow rate helps to fix the parameters in the first-principles steady-state model.
(2) A pressure controller PC controls the column top pressure at 16.8 atm by manipulating the condenser heat duty Q_top. A fixed column pressure helps to operate the process with fixed thermodynamic properties.
(3) A level controller LC_1 controls the reflux drum liquid level at 5.1 m by manipulating the distillate outlet valve V_2. A certain liquid level in the condenser is required to avoid flooding or drying.
(4) A level controller LC_2 controls the reboiler liquid level at 6.35 m by manipulating the bottom outlet valve V_3. A certain liquid level in the reboiler is required to avoid flooding or drying.
(5) A concentration controller CC controls the distillate C_3 mole fraction by manipulating the reflux mole flow rate. A time delay of 5 min was added to simulate the concentration measurement delay. At the beginning of each RTO period, the RTO sends the optimized distillate C_3 mole fraction x_D to the concentration controller CC as the set-point. The controller CC then adjusts the reflux flow to track the mole fraction to its set-point.
(6) A temperature controller TC controls the temperature T_7 on Tray 7 by manipulating the reboiler heat duty Q_bottom. A time delay of 1 min was added to simulate the temperature measurement delay. Tray temperature control is common in industry, and two methods were carried out to determine the best tray temperature to control. A steady-state simulation was used to obtain the temperature profile along the column, which showed that the temperature changes across Tray 6, Tray 7, and Tray 8 were greater than those across other trays. One more simulation was performed to obtain the gain of each tray temperature in response to a small change in the reboiler heat duty. It was found that the temperature on Tray 7 had a greater gain than those on other trays. As a result, the temperature on Tray 7 was chosen as the controlled variable.
At the beginning of each RTO period, the RTO optimizes the profit and calculates a set of steady-states. Given the optimal values of R, D, B, x_D, and x_B, the steady-state model F_1 = 0, F_2 = 0, F_3 = 0 is used again to obtain the concentration profile in the distillation column. Then, the neural network model T_n = F_NN,2(y_n) is used to calculate the temperature on Tray 7. After that, the tray temperature T_7 is sent to the controller TC as the set-point and is tracked by manipulating the reboiler heat duty.
The flow rate controller FC, the pressure controller PC, and both level controllers LC_1 and LC_2 had fixed set-points, which stabilized the process at fixed operating parameters. The concentration controller CC and the temperature controller TC received set-points from the RTO at the beginning of each RTO period and drove the process to a more profitable steady-state. All the PI parameters were tuned by the Ziegler-Nichols method and are shown in Table 3.
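A single loop of this kind can be sketched as a discrete PI controller with Ziegler-Nichols closed-loop tuning (K_p = 0.45 K_u, T_i = P_u/1.2 for a PI controller). The ultimate gain K_u and period P_u below are assumed identification results, and the first-order plant is purely illustrative, not one of the column's actual loops.

```python
# Ziegler-Nichols PI tuning from assumed ultimate-gain test results
Ku, Pu = 10.0, 4.0                  # assumed ultimate gain and period [s]
Kp, Ti = 0.45 * Ku, Pu / 1.2        # Ziegler-Nichols PI rules
Ki = Kp / Ti

# Illustrative first-order plant: tau * dy/dt = -y + K * u
K, tau = 2.0, 5.0
dt, y, integral = 0.01, 0.0, 0.0
setpoint = 1.0                      # e.g. a set-point sent down from RTO

for _ in range(int(40.0 / dt)):     # simulate 40 s of closed-loop response
    error = setpoint - y
    integral += error * dt
    u = Kp * error + Ki * integral  # PI control law
    y += dt * (-y + K * u) / tau    # explicit Euler step of the plant
```

The integral action removes the steady-state offset, so the output settles at the RTO-supplied set-point, as the fixed- and varying-set-point loops above are required to do.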

Simulation Results
To demonstrate the effectiveness of the RTO, a variation in the feed mole fraction x_F was introduced to the process, as shown in Figure 13. At the beginning of each RTO period (20 h), one measurement of the feed mole fraction x_F was sent to the RTO to optimize the profit. A set of steady-states was then obtained from the RTO and sent to the controllers as set-points. The simulation results are shown in Figures 14 and 15. In Figure 14, the set-point of x_D increases as the feed concentration x_F increases at the beginning of the simulation, because a higher distillate concentration is more profitable and a higher feed concentration x_F allows further separation to achieve a higher concentration in the distillate. The set-point for x_D also decreased later when the feed concentration x_F decreased. At the beginning of the simulation, the reflux flow increased to reach the higher x_D set-points, and the reflux flow never reached a steady-state during the whole simulation because the feed composition kept changing, as shown in Figure 13. In some cases, the mole fraction x_D did not track the set-point exactly because of the ever-changing feed, set-point changes that were too small, and coupled effects with other variables and controllers. Figure 15 illustrates the performance of the temperature controller TC. When the feed x_F increased, the set-point for the Tray 7 temperature T_7 decreased according to the RTO. The controller then manipulated the reboiler heat duty to track the tray temperature, with good performance as shown in Figure 15. It is noted in Figure 15 that the reboiler heat duty increased as the tray temperature decreased at the beginning of the simulation. The reason is that the reboiler heat duty mainly depends on the liquid flow into the reboiler and the vapor flow leaving the reboiler. Since the reflux flow was increased by the concentration controller CC at the beginning of the simulation, both the liquid flow into the reboiler and the vapor flow leaving the reboiler increased, thus increasing the reboiler heat duty.
The other controllers stayed at their fixed set-points throughout the simulation by adjusting their manipulated inputs; therefore, we do not show the plots for these controllers. It is demonstrated in Figure 16 that the RTO increased the operation profit when the distillation column had a varying feed concentration. The profit in Figure 16 was calculated by the profit definition of Equation (20), using the closed-loop simulation data for the variables D, B, F, and R. The black line is the operation profit calculated from the closed-loop simulation in which the four controllers (FC, PC, LC_1, and LC_2) had fixed set-points and the two controllers (CC and TC) received varying set-points from the RTO. The blue line is the simulation in which the set-points of all controllers were fixed at the initial steady-state and the controlled variables stayed at the initial set-points by adjusting the manipulated variables, in the presence of the same feed variation as in Figure 13. Although the feed concentration kept changing every second and the RTO updated the steady-state only every 20 h, the profit was still improved significantly by the RTO, as shown in Figure 16. In this case study, the neural network model was combined only with the steady-state first-principles model, not the dynamic model. Additionally, it was demonstrated that the steady-states calculated by the RTO using the combined model were very close to the steady-state values in the Aspen simulator, which means that the combination of the neural network model and the first-principles model was of high accuracy. The neural network model was used to represent the phase equilibrium properties so that the RTO could calculate the optimal steady-state in this work. Neural network models can be useful when the phase equilibrium is highly nonlinear such that the first-principles model is inaccurate. They can also be used when a large number of states are included in the thermodynamic equations, such as the pressure or additional concentrations in the multi-component case.

Conclusions
In this work, we presented a method for integrating neural network models with first-principles models in RTO and MPC. First, a general framework that integrates neural network models with first-principles models in the optimization problems of RTO and MPC was discussed. Then, two chemical process examples were studied. In the first case study, a CSTR with a reversible exothermic reaction was utilized to analyze the performance of integrating the neural network model and the first-principles model in RTO and MPC. Specifically, a neural network was first built to represent the nonlinear reaction rate. An RTO was designed to find the operating steady-state providing the optimal balance between the energy cost and the reactant conversion. Then, an LMPC was designed to stabilize the process at the optimal operating condition. A variation in the energy price was introduced, and the simulation results demonstrated that the RTO minimized the operation cost and yielded a closed-loop performance very close to the one attained by RTO/MPC using the first-principles model. In the second case study, a distillation column was studied to demonstrate an application to a large-scale chemical process. A neural network was first trained to capture the phase equilibrium properties. An RTO scheme was designed to maximize the operation profit and calculate the optimal set-points for the controllers using the neural network model combined with a first-principles model. A variation in the feed concentration was introduced to demonstrate that the RTO increased the operation profit for all considered conditions.
In closing, it is important to note that the two simulation studies only demonstrated how the proposed approach can be applied and provided a "proof of concept" for the use of hybrid models in RTO and MPC. Certainly, both examples yield limited conclusions and cannot substitute for an industrial/experimental implementation to evaluate the proposed approach, which will be the subject of future work.
Author Contributions: Z.Z. developed the idea of incorporating machine learning in real-time optimization and model predictive control, performed the simulation studies, and prepared the initial draft of the paper. Z.W. and D.R. revised this manuscript. P.D.C. oversaw all aspects of the research and revised this manuscript.