Surrogate Models for Online Monitoring and Process Troubleshooting of NBR Emulsion Copolymerization

Chemical processes with complex reaction mechanisms generally lead to dynamic models which, while beneficial for predicting and capturing the detailed process behavior, are not readily amenable for direct use in online applications related to process operation, optimisation, control, and troubleshooting. Surrogate models can help overcome this problem. In this research article, the first part focuses on obtaining surrogate models for emulsion copolymerization of nitrile butadiene rubber (NBR), which is usually produced in a train of continuous stirred tank reactors. The predictions and/or profiles for several performance characteristics such as conversion, number of polymer particles, copolymer composition, and weight-average molecular weight, obtained using surrogate models are compared with those obtained using the detailed mechanistic model. In the second part of this article, optimal flow profiles based on dynamic optimisation using the surrogate models are obtained for the production of NBR emulsions with the objective of minimising the off-specification product generated during grade transitions.


Introduction
Nitrile butadiene rubber (NBR) is an elastomer used in a wide variety of applications demanding oil, fuel and chemical resistance where the content of acrylonitrile influences the end use.NBR can be produced by emulsion copolymerization of acrylonitrile (AN) and butadiene (Bd) using batch, semi-batch, and continuous processes.Usually it is produced using a series of eight to ten continuous-stirred tank reactors (CSTRs).Cold NBR polymers are synthesized between 5 and 15 ˝C, while hot NBR polymers are usually synthesized between 30 and 50 ˝C.A comprehensive mechanistic model that can predict different property trajectories for NBR emulsion polymerization has been developed by our group and has been successfully verified with experimental results over a long period [1][2][3].This model is capable of simulating the emulsion polymerisation of NBR in batch and in a train of CSTRs, with add-on options, such as choosing the type of reactor start-up, different modes of monomer partitioning, and the effect of impurities.More detailed information regarding the traits and attributes of this model can be obtained elsewhere [4,5].The mechanistic model developed by our group is complete and comprehensive and has been used for more than just obtaining the simulated dynamic behavior of commercial trains of CSTRs corresponding to different operating conditions.Depending on the type of start-up of the reactor train and different mechanisms selected by the user to be operative (related to radical desorption, partitioning methods, etc.), the simulation time varied from several minutes to an hour.In the case of starting up the reactors full of water or empty, the simulation time was found to be almost two hours, depending on the detail of the selected thermodynamic approach for monomer partitioning.To integrate the process model for control and optimisation applications, though, the current mechanistic model can be used, as it adds significant delay in the response of the measured variables before the control action is taken.To overcome this problem, suitable surrogate models (whose order is significantly less than the order of the actual mechanistic model and whose simulation times are far less than that of the fully mechanistic model) are proposed and used in this article for further online applications.The objective of the current article is two-fold: firstly, we explain the need for using surrogate models (in addition to the detailed model developed by our group), obtained using various techniques such as neural networks and transfer function models; secondly, we use them for real-time process applications, such as recipe formulations, control, and dynamic optimisation.
Surrogate models can come to the rescue when the objective is to control or optimise a process whose dynamics are either complex or involves relatively tedious and time-consuming numerical analysis to solve the original complex model for obtaining different state variables, or when the actual physics/chemistry of the process are poorly understood/not known.Artificial neural networks (ANN) are an important and useful tool that belongs to the class of surrogate modeling and is used for control and optimisation of processes which are highly nonlinear [6,7].Several authors have reported using ANN alone for predicting dynamic behavior or for controlling polymerization reactors [8,9], or otherwise, in a hybrid mode where ANN is used in addition to the corresponding simpler mechanistic model of the system [10,11].In general, ANNs are highly efficient when trained with large datasets involving a wide range of operating conditions.Otherwise, their performance will be limited and their predictions will be different from the expected dynamics [11].With sufficient training data, ANN can efficiently be used for prediction of steady state properties as reported by Vijayabhaskar et al. [12], Assenhaimer et al. [13], and Delfa and Boschetti [14].For emulsion copolymerisation systems.Although ANNs are used like a black box for modeling processes that are nonlinear, very little is available in the literature for using them as inverse modeling tools for complex polymerisation processes.In the current article, an inverse modeling approach with ANNs based on the back propagation technique is used for obtaining formulations for recipe ingredients to be used in the first reactor of the train that will give desired properties of the copolymer in the last reactor of the train (in a continuous mode).To further illustrate the points and shed more light, the capability of ANNs to predict the dynamic behaviour of emulsion copolymerisation of NBR in a batch reactor is described and discussed next.A major contribution of this research article (in addition to using ANNs) is to discuss how transfer function models are obtained for emulsion copolymerisation of NBR.For the first time, a complex process like the NBR system is described using transfer functions, which are, in turn, used for controlling the properties of the final product.

ANNs for NBR Emulsion Copolymerization in a Batch Reactor
In this section we explain how ANNs are used to simulate the production of NBR emulsion copolymerisation in a batch reactor.With the mechanistic model, the overall simulation time strongly depended on several factors such as the type of method for calculating monomer partitioning (i.e., thermodynamics vs constant partition coefficients), presence of monomer or water soluble impurities, and other details in the copolymerization kinetics.For any fast control action to be taken, relying on the mechanistic model would be equivalent to adding a dead time or delay to the measurement signal, the typical fast measurement being on conversion (X), with other rather slow measurements on copolymer composition or particle size or Mooney viscosity.Mooney viscosity is a well-known indirect measure of an average molecular weight of a polymer (usually a rubber product), determined by a Mooney visometer.In batch operation, ANN is designed to predict the effect of time on important latex and polymer properties such as conversion, number of particles, copolymer composition (CPC), weight-based and number-based average molecular weights, and tri-and tetra-functional branching frequencies.In the simulations, ANN based on back-propagation is programmed in MATLAB, where the total prediction error, EpWq (given by Equation ( 1)), is minimized.The error, EpWq, also referred to as quadratic error, is defined as the sum of the squares of the differences between the desired output, Y i , and the predicted output, X i .The predicted output X i , is a function of the weights (W jk for one hidden layer) used in the network, where the subscripts j and k represent the indices of the input and output neurons.In vector notation, W jk is usually represented by W " pW 12 , W 13 , W 14 , ...q: EpWq " 1 2 The predicted output is operated upon by an activation function also known as a squashing function.The two most commonly used activation functions are the sigmoid and hyperbolic tangent.It should be noted that, in the absence of the activation function, the problem reduces to that of a multiple linear regression model.It is the activation function that provides the ability to handle nonlinearities.The best values for the weights, W jk , are obtained by training the network using Levenberg-Marquardt back-propagation algorithm.In this algorithm, the weights are adjusted using the method of steepest descent with respect to the error E, as defined by Equation ( 1).The builtin function, trainlm.m, in MATLAB is based on this algorithm and is used in the simulations for the NBR system.The desired output, Y, is obtained from the mechanistic model, which gives good predictions when compared with the experimental data, as established in previous work [4,5].The original mechanistic model is highly nonlinear with 32 multiscale state variables.The data obtained from the mechanistic model are divided into training, validation, and untested datasets (also called unseen datasets).From the available data sets, 70% of the data is used for training, 15% percent of the data is used for validation, and the remaining 15% of the data is used for testing.To avoid an overfitted model for the training data, a program in MATLAB is written for obtaining the optimum size and complexity of the network with the objective that the training error be comparable to the prediction error.Performance characteristics of the designed ANN for the prediction of conversion, cumulative copolymer composition (CPC), weight-based average molecular weight (MW w ) and tri-functional branching frequency (BN 3 ) are shown in Figure 1a-d.In these figures, the profiles obtained from the mechanistic model (MM) are compared to those obtained using ANN for a targeted batch time of 700 min (a typical value used in commercial production).A very good agreement was achieved between the predictions using ANN with those obtained using the well-established and tested MM.The designed ANN for NBR emulsion copolymerisation in a batch reactor can, thus, be safely used in conjunction with process control and optimisation algorithms for describing the desired properties of the polymer.

) (W E
, also referred to as quadratic error, is defined as the sum of the squares of the differences between the desired output, Yi, and the predicted output, Xi.The predicted output Xi, is a function of the weights (Wjk for one hidden layer) used in the network, where the subscripts j and k represent the indices of the input and output neurons.In vector notation, Wjk is usually represented by ,...) , , ( The predicted output is operated upon by an activation function also known as a squashing function.The two most commonly used activation functions are the sigmoid and hyperbolic tangent.It should be noted that, in the absence of the activation function, the problem reduces to that of a multiple linear regression model.It is the activation function that provides the ability to handle nonlinearities.The best values for the weights, Wjk, are obtained by training the network using Levenberg-Marquardt back-propagation algorithm.In this algorithm, the weights are adjusted using the method of steepest descent with respect to the error E, as defined by Equation (1).The builtin function, trainlm.m, in MATLAB is based on this algorithm and is used in the simulations for the NBR system.The desired output, Y, is obtained from the mechanistic model, which gives good predictions when compared with the experimental data, as established in previous work [4,5].The original mechanistic model is highly nonlinear with 32 multiscale state variables.The data obtained from the mechanistic model are divided into training, validation, and untested datasets (also called unseen datasets).From the available data sets, 70% of the data is used for training, 15% percent of the data is used for validation, and the remaining 15% of the data is used for testing.To avoid an overfitted model for the training data, a program in MATLAB is written for obtaining the optimum size and complexity of the network with the objective that the training error be comparable to the prediction error.Performance characteristics of the designed ANN for the prediction of conversion, cumulative copolymer composition (CPC), weight-based average molecular weight (MWw) and tri-functional branching frequency (BN3) are shown in Figure 1a-d.In these figures, the profiles obtained from the mechanistic model (MM) are compared to those obtained using ANN for a targeted batch time of 700 min (a typical value used in commercial production).A very good agreement was achieved between the predictions using ANN with those obtained using the well-established and tested MM.The designed ANN for NBR emulsion copolymerisation in a batch reactor can, thus, be safely used in conjunction with process control and optimisation algorithms for describing the desired properties of the polymer.

ANN for Inverse Modeling of NBR Emulsion Copolymerisation in a Train of CSTRs
NBR is commercially produced in a continuous fashion using eight to ten reactors operated in series.Depending on the demand, a CSTR can be added or removed from the series, which indirectly affects the mean residence time of the train of CSTRs.The properties of the product from the last reactor are affected by the recipe ingredients used in the first reactor of the train and by the operating conditions of the train.While ANNs can be used to predict these properties based on recipe ingredients given as inputs, by using ANN-based inverse modeling, the recipe ingredients to be used in the first reactor for targeted properties exiting the last reactor can be obtained.This is achieved by using the properties of the product from the last reactor (say, the eighth reactor in an eight-reactor CSTR train) to be the inputs to the network, while the outputs are the recipe ingredients to the first reactor.This type of inverse modeling is easy and efficient with ANNs compared to the alternative, i.e., offline optimisation using the corresponding mechanistic model as a constraint.The ingredients that are fed to the first reactor are initiator (I), reducing agent (RA), emulsifier(s) (E), monomer(s) (M), water (W), and chain transfer agent (CTA).Considering typical high and low levels for each one of these reaction ingredients, the corresponding conversion (X), cumulative copolymer composition (CPC), and the weight-based average molecular weight (MWw) at the exit of the eighth reactor can be simulated using the mechanistic model.Out of the 64 available datasets, 52 datasets are then used for training the network and 12 datasets are left in order to check the performance of the ANN with respect to inverse modeling of the recipe ingredients for obtaining the desired polymer properties.The different levels of the reaction ingredients used for MM simulations are shown in Table 1.The low and high levels of the reaction ingredients can be normalized to −1 and +1, respectively which, in turn, are used as outputs from the network.The inputs to the network, as shown in Figure 2, are X, CPC, and MWw, while the outputs are the recipe ingredients RA, I, E, M, W, and CTA.

ANN for Inverse Modeling of NBR Emulsion Copolymerisation in a Train of CSTRs
NBR is commercially produced in a continuous fashion using eight to ten reactors operated in series.Depending on the demand, a CSTR can be added or removed from the series, which indirectly affects the mean residence time of the train of CSTRs.The properties of the product from the last reactor are affected by the recipe ingredients used in the first reactor of the train and by the operating conditions of the train.While ANNs can be used to predict these properties based on recipe ingredients given as inputs, by using ANN-based inverse modeling, the recipe ingredients to be used in the first reactor for targeted properties exiting the last reactor can be obtained.This is achieved by using the properties of the product from the last reactor (say, the eighth reactor in an eight-reactor CSTR train) to be the inputs to the network, while the outputs are the recipe ingredients to the first reactor.This type of inverse modeling is easy and efficient with ANNs compared to the alternative, i.e., offline optimisation using the corresponding mechanistic model as a constraint.The ingredients that are fed to the first reactor are initiator (I), reducing agent (RA), emulsifier(s) (E), monomer(s) (M), water (W), and chain transfer agent (CTA).Considering typical high and low levels for each one of these reaction ingredients, the corresponding conversion (X), cumulative copolymer composition (CPC), and the weight-based average molecular weight (MW w ) at the exit of the eighth reactor can be simulated using the mechanistic model.Out of the 64 available datasets, 52 datasets are then used for training the network and 12 datasets are left in order to check the performance of the ANN with respect to inverse modeling of the recipe ingredients for obtaining the desired polymer properties.The different levels of the reaction ingredients used for MM simulations are shown in Table 1.The low and high levels of the reaction ingredients can be normalized to ´1 and +1, respectively which, in turn, are used as outputs from the network.The inputs to the network, as shown in Figure 2, are X, CPC, and MW w , while the outputs are the recipe ingredients RA, I, E, M, W, and CTA.With the objective of obtaining the minimum mean sum of squared errors (MSE), simulations were performed to study the effect of the number of hidden layers and the number of neurons in each layer.Table 2 shows the MSE values for monomer and CTA concentrations obtained by varying the number of hidden layers from one to three and the corresponding number of neurons in each layer from five to 20.All ANN simulations took less than 3 s for obtaining the trained networks.As is evident from Table 2, a network with three hidden layers and 20 neurons gives minimum MSE values and the same trend was found for other reaction ingredients (I, RA, W, and E).A possible reason that can be attributed to the resulting large network is that the difference in the magnitude for the output variables such as conversion and weight-based molecular weight is almost 10 6 .Hence, the optimum configuration used in this work was a network with three hidden layers with 20 neurons in each layer.Though the number of weights for such a large network is very high, the simulation time for obtaining the trained network in all simulations was only a few seconds.The trained network (saved as .MAT file in MATLAB) in turn is used as an inverse modeling tool to predict the recipe ingredients (of the first reactor) for targeted properties of the stream exiting the eighth reactor.Figure 3 (a through f) shows the recipe ingredients ' predictions obtained using the ANN-based inverse modeling for desired conversion (X) compared to the data obtained from the mechanistic model.The proposed ANN-based inverse modeling could predict the required recipe ingredients very well for desired conversion levels.Similar results and trends were obtained for desired CPC and MWw (as shown in Figures 4 and 5).The predictions for all reaction ingredients are precise, except for the initiator in some cases.The prediction capability of the ANN can be quantified using the mean sum of squared errors (MSE) for each dataset.The MSE values for each of the predicted datasets are shown in Table 3.With the objective of obtaining the minimum mean sum of squared errors (MSE), simulations were performed to study the effect of the number of hidden layers and the number of neurons in each layer.Table 2 shows the MSE values for monomer and CTA concentrations obtained by varying the number of hidden layers from one to three and the corresponding number of neurons in each layer from five to 20.All ANN simulations took less than 3 s for obtaining the trained networks.As is evident from Table 2, a network with three hidden layers and 20 neurons gives minimum MSE values and the same trend was found for other reaction ingredients (I, RA, W, and E).A possible reason that can be attributed to the resulting large network is that the difference in the magnitude for the output variables such as conversion and weight-based molecular weight is almost 10 6 .Hence, the optimum configuration used in this work was a network with three hidden layers with 20 neurons in each layer.Though the number of weights for such a large network is very high, the simulation time for obtaining the trained network in all simulations was only a few seconds.The trained network (saved as .MAT file in MATLAB) in turn is used as an inverse modeling tool to predict the recipe ingredients (of the first reactor) for targeted properties of the stream exiting the eighth reactor.Figure 3 (a through f) shows the recipe ingredients' predictions obtained using the ANN-based inverse modeling for desired conversion (X) compared to the data obtained from the mechanistic model.The proposed ANN-based inverse modeling could predict the required recipe ingredients very well for desired conversion levels.Similar results and trends were obtained for desired CPC and MW w (as shown in Figures 4 and 5).The predictions for all reaction ingredients are precise, except for the initiator in some cases.The prediction capability of the ANN can be quantified using the mean sum of squared errors (MSE) for each dataset.The MSE values for each of the predicted datasets are shown in Table 3.The MSE values from Table 3 clearly show that the prediction using ANN-based inverse modeling is precise for almost all reaction ingredients except for the initiator.The obtained the MSE of the initiator is slightly greater than the MSE values of the other reaction ingredients.This is due to the fact that the magnitude of initiator concentration is very small compared to the concentrations of the other ingredients.The above results show that ANN-based inverse modeling can give good estimates of the reaction ingredients to be used in the first reactor (which can be applied to batch reactor operation as well) for obtaining the desired properties of the polymer exiting the last reactor of the CSTR train.This method of obtaining the estimates (for initial reaction ingredients) is easier and less time consuming compared to using the fully mechanistic model.Using the mechanistic model (which is very useful in its own right, as we have shown in previous publications, e.g., Madhuranthakam and Penlidis [4,5]), a trial and error approach has to be used for initial reaction ingredients.During the operation of the reactor train, for any slight discrepancies in the desired properties of the products, it is always possible to fine-tune the estimates obtained using the ANN-based inverse modeling.The MSE values from Table 3 clearly show that the prediction using ANN-based inverse modeling is precise for almost all reaction ingredients except for the initiator.The obtained the MSE of the initiator is slightly greater than the MSE values of the other reaction ingredients.This is due to the fact that the magnitude of initiator concentration is very small compared to the concentrations of the other ingredients.The above results show that ANN-based inverse modeling can give good estimates of the reaction ingredients to be used in the first reactor (which can be applied to batch reactor operation as well) for obtaining the desired properties of the polymer exiting the last reactor of the CSTR train.This method of obtaining the estimates (for initial reaction ingredients) is easier and less time consuming compared to using the fully mechanistic model.Using the mechanistic model (which is very useful in its own right, as we have shown in previous publications, e.g., Madhuranthakam and Penlidis [4,5]), a trial and error approach has to be used for initial reaction ingredients.During the operation of the reactor train, for any slight discrepancies in the desired properties of the products, it is always possible to fine-tune the estimates obtained using the ANN-based inverse modeling.

Surrogate Modeling for NBR Emulsion Copolymerization
In this section, models that are capable of predicting the dynamics and are amenable for control and/or optimisation applications for the emulsion copolymerisation of NBR are discussed.The ANNs discussed in the previous sections can also be used for predicting the dynamics and for control purposes, but the additional benefit of the surrogate models is that these models in their standard forms, such as a first order plus time delay or a second order plus time delay, etc., have fewer parameters than the number of weights obtained using the ANN.In many situations, the parameters of a feedback controller or a model predictive controller can be obtained as functions of the corresponding parameters of the model, which is not feasible with the case of ANN.Surrogate models are obtained by reducing the order of the original mechanistic model so that computation of the dynamic behaviour is fast, which in turn helps with online control and optimisation applications.Depending on the type and order of the original model, the model can be reduced either by using a model balancing approach or by error minimization.Model balancing involves evaluating the controllability and observability Gramians and partitioning the state vector into important states and less important states.The reduced model is obtained by truncation of the least important states [15].This method involves linearizing the original model around an operating condition or empirically obtaining Gramians corresponding to each operating condition, which may be very tedious.The actual mechanistic model for NBR emulsion copolymerization includes 32 state variables and its highly nonlinear nature makes it cumbersome to obtain a corresponding linearized model.Due to this reason, empirical models are obtained by using error minimization criteria.The initial choice of type of empirical models is very crucial as there could be multiple models (which could differ in the number of parameters) that can fit equally well the corresponding data available.After choosing a specific transfer function model, the models are fine-tuned later based on the objective of minimizing the error between the responses of the proposed empirical model and the data (obtained from the mechanistic model).The parameters of the final surrogate model are obtained by simulation and using the nonlinear least squares fitting function lsqnonlin in MATLAB.

Transfer Function Models for the First CSTR in the Reactor Train
As mentioned in the previous sections, the ingredients entering the first CSTR are the initiator, emulsifier(s), monomers, water, and chain transfer agent streams.Typical outputs considered for obtaining the corresponding surrogate models are conversion, cumulative copolymer composition, weight-based average molecular weight, and the total number of latex particles per liter of water (N p ).These are the typical outputs (in principle, measurable) which are, in turn, used in the controlled production of NBR latex.Surrogate modeling is conducted in the Laplace domain by programming interactive simulations between SIMULINK and MATLAB.The performance of the corresponding fitted transfer function model is evaluated in terms of the coefficient of determination, R 2 .The input variables chosen to be related to any output variable are restricted to the states that have higher impact than others and that can also be used as practical manipulated variables in control applications.For example, when the reactor is started full of batch recipe (for other types of start-up policies refer to [4]), the conversion obtained at the exit of the reactor is expressed as a function of initiator, and acrylonitrile and butadiene (monomers) flow rates, as shown in Equation (2): where Y 1 denotes conversion, X 1 represents initiator flowrate, X 2 is the acrylonitrile flow rate, and X 3 is the butadiene flow rate, all in the Laplace domain.τ 1 , τ 2 , and K 1 are the parameters of the model.Since the input and output variables in the transfer function models represent perturbations from initial steady states, the final model constitutes an initial value problem with all variables (outputs) to be zero at time t = 0.The structure of this model basically consists of a combination of a second order and two first order systems.The parameters τ 1 , τ 2 , and K are obtained by fitting the model response to the data obtained from the mechanistic model for a given step change in the input variables X 1 , X 2 , and X 3 .Similarly, for other output variables such as CPC (Y 2 ), MW w (Y 3 ), and N p (Y 4 ), the corresponding models are given by Equations ( 3)-( 5): where Y 2 , Y 3 and Y 4 represent the output variables CPC, MW w , and N p , respectively, whereas X 4 , X 5 , and X 6 represent the ratio of flow rates of the two monomers (AN to Bd), chain transfer agent flow rate, and emulsifier flow rate, respectively.K 2 , K 3 , K 4 , and τ 3 through τ 8 are parameters of the empirical models.Equations ( 2)-( 5) represent the empirical surrogate models for the dynamics of the first reactor in the reactor train.When the reactor is started full of recipe, the inflow to the first reactor is equivalent to giving a step input to all input variables X 1 through X 6 and the output responses Y 1 through Y 4 are used to obtain the corresponding parameters of the empirical models.The final comparison of the responses obtained for X, CPC, MW w , and N p using the data from the mechanistic model (MM) and the proposed empirical models (EM) are shown in Figure 6a-d, respectively.
Processes 2016, 4, x 9 of 14 to be zero at time t = 0.The structure of this model basically consists of a combination of a second order and two first order systems.The parameters τ1, τ2, and K are obtained by fitting the model response to the data obtained from the mechanistic model for a given step c hange in the input variables X1, X2, and X3.Similarly, for other output variables such as CPC (Y2), MWw (Y3), and Np (Y4), the corresponding models are given by Equations ( 3)-( 5): where Y2, Y3 and Y4 represent the output variables CPC, MWw, and Np, respectively, whereas X4, X5, and X6 represent the ratio of flow rates of the two monomers (AN to Bd), chain transfer agent flow rate, and emulsifier flow rate, respectively.K2, K3, K4, and τ3 through τ8 are parameters of the empirical models.Equations ( 2)-( 5) represent the empirical surrogate models for the dynamics of the first reactor in the reactor train.When the reactor is started full of recipe, the inflow to the first reactor is equivalent to giving a step input to all input variables X1 through X6 and the output responses Y1 through Y4 are used to obtain the corresponding parameters of the empirical models.The final comparison of the responses obtained for X, CPC, MWw, and Np using the data from the mechanistic model (MM) and the proposed empirical models (EM) are shown in Figure 6a-d   The proposed empirical models fit very well the data obtained from the mechanistic model is evident from the very high R 2 values (close to unity) reported on the corresponding figures.Since the weight-based average molecular weight (MW w ) in the mechanistic model is obtained from a set of very highly nonlinear model equations that originate using the method of moments, the corresponding empirical model had a lower R 2 value compared to the values of the other output variables.In general, the performance of the surrogate models is very good.The proposed empirical models are not only simple and amenable to use for online purposes but also have very few parameters.With the success of using this method for a single CSTR (the first reactor of the CSTR train), the properties at the exit of the eighth reactor can also be empirically modeled and further used in online applications, such as control and grade transitions.

Optimal CTA Profile for Minimizing off-Spec Product
The weight-based molecular weight response exiting the eighth reactor in the reactor train is empirically modeled using the knowledge of the corresponding dynamics in the first reactor of the train.The validity and versatility of the proposed empirical model is shown in Figure 7 where two different reactor start-up procedures are used (one of them starting the reactor train full of recipe (Figure 7a) and the other starting full of water (Figure 7b).The empirical model obtained for this case is given by Equation (6).
The proposed empirical models fit very well the data obtained from the mechanistic model is evident from the very high R 2 values (close to unity) reported on the corresponding figures.Since the weight-based average molecular weight (MWw) in the mechanistic model is obtained from a set of very highly nonlinear model equations that originate using the method of moments, the corresponding empirical model had a lower R 2 value compared to the values of the other output variables.In general, the performance of the surrogate models is very good.The proposed empirical models are not only simple and amenable to use for online purposes but also have very few parameters.With the success of using this method for a single CSTR (the first reactor of the CSTR train), the properties at the exit of the eighth reactor can also be empirically modeled and further used in online applications, such as control and grade transitions.

Optimal CTA Profile for Minimizing off-Spec Product
The weight-based molecular weight response exiting the eighth reactor in the reactor train is empirically modeled using the knowledge of the corresponding dynamics in the first reactor of the train.The validity and versatility of the proposed empirical model is shown in Figure 7 where two different reactor start-up procedures are used (one of them starting the reactor train full of recipe (Figure 7a) and the other starting full of water (Figure 7b).The empirical model obtained for this case is given by Equation (6).Y8 is the MWw obtained at the exit of the eighth reactor, X5 is the flow rate of the CTA to the first reactor in the reactor train, and K, τ1, and τ2 are model parameters.The empirical model consists of eight second-order transfer functions in series, each of them corresponding to the dynamics of each individual reactor in the reactor train.From the R 2 values reported in Figure 7a,b, it is evident that the proposed second order transfer function in series fits very well the corresponding mechanistic model in addition to the benefit of employing three parameters only (K, τ1, and τ2).
One of the most common operational constraints that occur in the commercial production of NBR is during grade changes.During grade changes, the reactor train is switched to operate from one steady state to another desired steady state.These can be viewed as set point changes that occur due to customer or production campaign requirements.One of the primary objectives during grade changes or for start-up of the reactor train is to minimize the off-specification product that is produced before reaching the steady state.The off-specification product can be minimized in multiple ways but a usual practice is to add intermittent flows of the manipulated variables along the reactor train in a feedforward fashion where the magnitudes of the flow rates of the manipulated variables are estimated by performing offline optimisation.For batch and semi-batch production of Y 8 is the MW w obtained at the exit of the eighth reactor, X 5 is the flow rate of the CTA to the first reactor in the reactor train, and K, τ 1 , and τ 2 are model parameters.The empirical model consists of eight second-order transfer functions in series, each of them corresponding to the dynamics of each individual reactor in the reactor train.From the R 2 values reported in Figure 7a,b, it is evident that the proposed second order transfer function in series fits very well the corresponding mechanistic model in addition to the benefit of employing three parameters only (K, τ 1 , and τ 2 ).
One of the most common operational constraints that occur in the commercial production of NBR is during grade changes.During grade changes, the reactor train is switched to operate from one steady state to another desired steady state.These can be viewed as set point changes that occur due to customer or production campaign requirements.One of the primary objectives during grade changes or for start-up of the reactor train is to minimize the off-specification product that is produced before reaching the steady state.The off-specification product can be minimized in multiple ways but a usual practice is to add intermittent flows of the manipulated variables along the reactor train in a feedforward fashion where the magnitudes of the flow rates of the manipulated variables are estimated by performing offline optimisation.For batch and semi-batch production of polymers, Fujisawa and Penlidis [16] have shown different reactor control policies targeted for obtaining desired copolymer compositions using an offline mechanistic model.For reducing transients during grade changes for NBR and styrene butadiene systems, Minari et al. [17,18] have used a bang-bang method (which also uses an offline mechanistic model), where predetermined quantities of CTA and monomer are added along the reactor train in one shot to reduce the molecular weights and the CPC in the last reactor of the reactor train.In the present work we propose an online method, where continuous profiles for manipulated variables such as CTA, monomers etc. are obtained using dynamic optimisation.This method assumes that the information for molecular weight becomes available online via an inferential estimator based on Mooney viscosity, in order to minimize the time delay related to the measurement of molecular weight.The empirical model (as given be Equation ( 6)) can be used either online or offline optimisation when there is a grade change with respect to MW w .For simultaneous control of conversion, CPC, and MW w , a multiobjective function based on a weighted sum or ε-constrained methods can be used [19,20].While grade changes can involve an increase or decrease in several product specifications, such as conversion, CPC, N P , MW w , etc., the application to the scenario where a decrease in MW w by adding extra CTA to the reactor train is discussed here as an example.In general, the inflows to the last few reactors are manipulated rather than manipulating the inflows to the first few reactors, due to the fact that the monomer droplets are absent in the last few reactors.For example, in a reactor train started up full of recipe, the monomer droplets will disappear in the sixth reactor of the train [5].Especially for grade changes involving MW w , the corresponding manipulated variable is the flow rate of CTA added to the reactors with monomer droplets present.The optimal flow rate of CTA to be added to the first reactor in the reactor train is obtained using the optimisation function represented by Equation ( 7): min where F CAT is the flow rate of CTA to the eighth reactor, MW des w is the desired steady state weight-based molecular weight, MW w (t) is the measured value of the weight-based molecular weight at any time t, and F CTA is the steady state value of the flow rate of CTA.Assuming a control valve with a rangeability of 50:1 is used to manipulate the CTA flow rate, the manipulated flow rate is constrained between 1 7 F CTA and 7F CTA .In Equation ( 7), time t refers to the operation time of the eighth reactor.The mean residence time for each CSTR in the train is 60 min; hence, the total time for a reactor train of eight CSTRs will be 480 min.Since it takes three times the total mean residence time for the reactor train to reach steady state operation, the corresponding operational time used in the simulations was set to 1500 min (approximately).The optimum value for F CTA is obtained by minimizing the cost function (as shown in Equation ( 7)) at different time steps simultaneously.This procedure can be extended for other grade change applications with specifications on other variables, such as CPC or X, with manipulated variables being the flow rates of monomers, initiator, and/or emulsifiers.Figure 8a shows the profiles obtained for MW w at the exit of the eighth reactor for the cases where a regular CTA flow rate based on a "full of recipe" start-up is compared to that of the CTA flow rate obtained from optimisation using Equation (7).In both cases (refer to Figure 8a), the area under the solid curve and the dashed curve with respect to the steady state value of MW w is an indirect measure of the amount of off-specification product generated during the operation of the reactor train.Figure 8a clearly shows that using the proposed optimisation method, the amount of off-specification product/material can be minimized by several folds compared to the base case where a constant CTA flow rate is used.The corresponding CTA flow rate profile to be added to the first reactor obtained from the above mentioned optimisation procedure is shown in Figure 8b.This CTA flow profile can be practically achieved by using an automatic flow controller installed on the CTA flow line.The proposed online method for the adjustment of the manipulated CTA flow rate can also be applied to the flow rates of monomers and initiator to control CPC and/or X. Figure 8a shows the profiles obtained for MWw at the exit of the eighth reactor for the cases where a regular CTA flow rate based on a "full of recipe" start-up is compared to that of the CTA flow rate obtained from optimisation using Equation (7).In both cases (refer to Figure 8a), the area under the solid curve and the dashed curve with respect to the steady state value of MWw is an indirect measure of the amount of off-specification product generated during the operation of the reactor train.Figure 8a clearly shows that using the proposed optimisation method, the amount of off-specification product/material can be minimized by several folds compared to the base case where a constant CTA flow rate is used.The corresponding CTA flow rate profile to be added to the first reactor obtained from the above mentioned optimisation procedure is shown in Figure 8b.This CTA flow profile can be practically achieved by using an automatic flow controller installed on the CTA flow line.The proposed online method for the adjustment of the manipulated CTA flow rate can also be applied to the flow rates of monomers and initiator to control CPC and/or X.

Concluding Remarks
Surrogate models were investigated in lieu of the original higher order mechanistic model for NBR emulsion copolymerisation, with the objective of minimizing the computational time for implementing the models for control/optimisation approaches.Different types of surrogate models, such as models based on artificial intelligence using neural networks and empirical models in t he form of first order and/or second order (with and without time delay), were designed for studying the dynamics of NBR production.It was shown that ANNs can be used to efficiently predict the dynamics, and also as an inverse modeling tool where the reaction ingredients to be added to the first reactor in the reactor train are obtained for targeted desired properties of the polymer produced in the eighth reactor of the reactor train.The transfer function models were in the form of standard first and second order processes (with or without time delay) and could readily be used in control and optimisation applications.These proposed models fitted well the dynamics of the NBR emulsion polymerization in the CSTR train and were subsequently used in an optimisation application.With the objective of minimizing the off-specification product exiting the eighth reactor, the optimal CTA flow rate was obtained.Compared to offline methods, the proposed (potentially online) method is a very promising tool with respect to optimal reactor train operation and minimizing the waste generated due to different startups or grade changes.

Concluding Remarks
Surrogate models were investigated in lieu of the original higher order mechanistic model for NBR emulsion copolymerisation, with the objective of minimizing the computational time for implementing the models for control/optimisation approaches.Different types of surrogate models, such as models based on artificial intelligence using neural networks and empirical models in the form of first order and/or second order (with and without time delay), were designed for studying the dynamics of NBR production.It was shown that ANNs can be used to efficiently predict the dynamics, and also as an inverse modeling tool where the reaction ingredients to be added to the first reactor in the reactor train are obtained for targeted desired properties of the polymer produced in the eighth reactor of the reactor train.The transfer function models were in the form of standard first and second order processes (with or without time delay) and could readily be used in control and optimisation applications.These proposed models fitted well the dynamics of the NBR emulsion polymerization in the CSTR train and were subsequently used in an optimisation application.With the objective of minimizing the off-specification product exiting the eighth reactor, the optimal CTA flow rate was obtained.Compared to offline methods, the proposed (potentially online) method is a very promising tool with respect to optimal reactor train operation and minimizing the waste generated due to different startups or grade changes.

Figure 1 .
Figure 1.Comparison of (a) conversion, (b) cumulative copolymer composition, (c) weight-average molecular weight, and (d) tri-functional branching frequency profiles obtained using ANN and mechanistic models.For process conditions, refer to Dube et al. [2].

Figure 1 .
Figure 1.Comparison of (a) conversion; (b) cumulative copolymer composition; (c) weight-average molecular weight; and (d) tri-functional branching frequency profiles obtained using ANN and mechanistic models.For process conditions, refer to Dube et al. [2].

Figure 2 .
Figure 2. ANN structure used for inverse modeling with X, CPC, and MWw as inputs; RA, I, E, M, W, and CTA as outputs.

Figure 2 .
Figure 2. ANN structure used for inverse modeling with X, CPC, and MW w as inputs; RA, I, E, M, W, and CTA as outputs.

Processes 2016, 4 , x 6 of 14 Figure 3 .
Figure 3.Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired conversion levels vs. (a) reducing agent, (b) initiator, (c) emulsifier, (d) monomer, (e) water, and (f) chain transfer agent.

Figure 4 .
Figure 4. Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired cumulative copolymer composition levels vs. (a) reducing agent, (b) initiator, (c) emulsifier, (d) monomer, (e) water, and (f) chain transfer agent.

Figure 3 .
Figure 3.Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired conversion levels vs. (a) reducing agent; (b) initiator; (c) emulsifier; (d) monomer; (e) water; and (f) chain transfer agent.

Figure 3 .
Figure 3.Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired conversion levels vs. (a) reducing agent, (b) initiator, (c) emulsifier, (d) monomer, (e) water, and (f) chain transfer agent.

Figure 4 .
Figure 4. Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired cumulative copolymer composition levels vs. (a) reducing agent, (b) initiator, (c) emulsifier, (d) monomer, (e) water, and (f) chain transfer agent.

Figure 4 .
Figure 4. Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired cumulative copolymer composition levels vs. (a) reducing agent; (b) initiator; (c) emulsifier; (d) monomer; (e) water; and (f) chain transfer agent.

Figure 5 .
Figure 5.Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired weight-based average molecular weight levels vs. (a) reducing agent, (b) initiator, (c) emulsifier, (d) monomer, (e) water, and (f) chain transfer agent.

Figure 5 .
Figure 5.Comparison of the predictions between the unseen targeted values (o) and the values obtained using ANN (+) for desired weight-based average molecular weight levels vs. (a) reducing agent; (b) initiator; (c) emulsifier; (d) monomer; (e) water; and (f) chain transfer agent.

Figure 6 .
Figure 6.Comparison of the responses obtained from the proposed empirical models (EM) and mechanistic model (MM) for (a) conversion, (b) CPC, (c) MWw, and (d) Np.

Figure 6 .
Figure 6.Comparison of the responses obtained from the proposed empirical models (EM) and mechanistic model (MM) for (a) conversion; (b) CPC; (c) MW w ; and (d) N p .

Figure 7 .
Figure 7.Comparison of the model validity for MWw at the exit of the eighth reactor using mechanistic model (MM) and empirical model (EM) for reactor train start-ups (a) full of recipe and (b) full of water.

Figure 7 .
Figure 7.Comparison of the model validity for MW w at the exit of the eighth reactor using mechanistic model (MM) and empirical model (EM) for reactor train start-ups (a) full of recipe and (b) full of water.

Figure 8 .
Figure 8.(a) Comparison of MWw profiles in the eighth reactor using regular CTA flow rate to that of using optimal flow rate of CTA; and (b) the dynamic CTA flow rate obtained from optimisation.

Figure 8 .
Figure 8.(a) Comparison of MW w profiles in the eighth reactor using regular CTA flow rate to that of using optimal flow rate of CTA; and (b) the dynamic CTA flow rate obtained from optimisation.

Table 1 .
Recipe ingredients to the first reactor in the reactor train and their levels.
Note: Mean residence time of each reactor in train is 60 min.

Table 1 .
Recipe ingredients to the first reactor in the reactor train and their levels.
Note: Mean residence time of each reactor in train is 60 min.

Table 2 .
Effect of number of hidden layers and number of neurons on MSE.

Table 2 .
Effect of number of hidden layers and number of neurons on MSE.

Table 3 .
MSE values for the 12 untested datasets using ANN.

Table 3 .
MSE values for the 12 untested datasets using ANN.