Application of Artificial Neural Networks for Producing an Estimation of High-Density Polyethylene

Polyethylene is among the most widely used thermoplastics across a vast variety of applications, and it is produced by several commercially available technologies. Since Ziegler–Natta catalysts generate polyolefins with broad molecular weight and copolymer composition distributions, this catalyst type was considered in simulating the polymerization process. The EIX (ethylene index) is the critical controlled variable that indicates product characteristics. Because the EIX is difficult to measure directly, its estimation is one of the greatest challenges to applicability in production. To resolve this problem, ANNs (artificial neural networks) are used in the present paper to predict the EIX from several easily measured variables of the system. Specifically, the EIX is computed as a function of pressure, ethylene flow, hydrogen flow, 1-butene flow, catalyst flow, and TEA (triethylaluminium) flow. The estimation was carried out with Multi-Layer Perceptron, Radial Basis, Cascade Feed-forward, and Generalized Regression Neural Networks. The results clearly demonstrate the superior performance of the Multi-Layer Perceptron model over the other ANN models. Based on our findings, this model can predict production levels with R2 (coefficient of determination), MSE (mean square error), AARD% (average absolute relative deviation percent), and RMSE (root mean square error) of, respectively, 0.89413, 0.02217, 0.4213, and 0.1489.


Introduction
Polyethylene is among the most widely used thermoplastics in a vast variety of applications. In spite of the considerable financial investment required to produce polyethylene, the end consumer may receive a very inexpensive, disposable product. Therefore, improvements in polymer fabrication procedures aimed at reducing manufacturing costs remain a topic of research, development, and process scale-up [1].
The RBFNN is a common feedforward neural network that has been shown to be capable of global approximation without the local-minima problem. It also has a simple structure and a fast learning algorithm compared with other neural networks [16]. Although several activation functions are available for radial basis neurons, the Gaussian function is by far the most popular. An RBF ANN must be trained before use, which is commonly done in two stages. The first stage chooses the centers from among the data. The second stage uses ordinary least squares to estimate a weighting vector linearly. Self-organized center selection is the most widespread learning approach for choosing the RBF centers.
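The two training stages described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy data, the crude center choice (the first ten samples), and the spread value are all assumptions.

```python
import numpy as np

def train_rbf(X, y, centers, sigma=1.0):
    """Stage 2 of RBF training: with the centers already chosen
    (stage 1, e.g., by self-organized selection), fit the linear
    output weights by ordinary least squares."""
    # Squared distances from every sample to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2.0 * sigma ** 2))       # Gaussian activations
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # normal least squares
    return w

def rbf_predict(X, centers, w, sigma=1.0):
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2)) @ w

# Toy usage: fit y = x1 + x2 on random points in [-1, 1]^2.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(40, 2))
y = X.sum(axis=1)
centers = X[:10]            # crude stand-in for self-organized selection
w = train_rbf(X, y, centers)
y_hat = rbf_predict(X, centers, w)
```

The linear second stage is what makes RBF training fast: once the centers and spread are fixed, finding the output weights is a convex least-squares problem with a closed-form solution.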
Construction of the CFNN topology begins with the input and output neurons. The output neurons are present in the network from the start; new neurons are then added, and the network attempts to increase the correlation between outputs and inputs by comparing the network residual with the newly measured error. This procedure continues until the network error falls below a target value, which explains why the architecture is labeled a cascade [17]. The CFNN generally comprises three major layers, namely the input, hidden, and output layers. The variables entering a hidden layer are multiplied by the weights (computed in the creation phase to reduce the prediction error), the bias (applied to a constant 1.0) is added, and the results are summed at the neuron. The resulting value then passes through a transfer function to produce the output value.
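A distinguishing feature of the cascade-forward topology is that the output neuron receives direct connections from the input layer in addition to the hidden-layer connections. A minimal sketch of such a forward pass follows; the weights are random placeholders, not trained values.

```python
import numpy as np

def logsig(x):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-x))

def cascade_forward(x, W_ih, b_h, w_ho, w_io, b_o):
    """Cascade-forward pass: unlike a plain MLP, the output neuron also
    receives direct input-to-output connections (w_io)."""
    h = logsig(W_ih @ x + b_h)           # hidden layer (bias on constant 1.0)
    return w_ho @ h + w_io @ x + b_o     # hidden + direct input contributions

# Illustrative call with random placeholder weights.
rng = np.random.default_rng(1)
x = rng.uniform(size=3)
W_ih = rng.normal(size=(4, 3)); b_h = rng.normal(size=4)
w_ho = rng.normal(size=4); w_io = rng.normal(size=3); b_o = 0.1
y = cascade_forward(x, W_ih, b_h, w_ho, w_io, b_o)
```

Setting `w_io` to zero recovers an ordinary one-hidden-layer feedforward pass, which makes the extra cascade connections easy to see.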
The GRNN is a type of supervised network that works on the basis of a probabilistic model and can produce continuous outputs. It is a robust instrument for non-linear regression analysis based on approximating probability density functions with the Parzen window technique [18]. The GRNN architecture does not need an iterative training process such as the back-propagation learning algorithm. A GRNN can estimate arbitrary functions between input and output datasets directly from the training data.
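The Parzen-window idea behind the GRNN can be sketched as a Gaussian-kernel weighted average of the training targets (a Nadaraya–Watson style estimate). The data and spread value below are illustrative assumptions.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, spread=1.0):
    """GRNN sketch: estimate E[y|x] with Gaussian Parzen windows.
    No iterative training is needed -- the training data *are* the model."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * spread ** 2))       # Gaussian kernel weights
    return (K @ y_train) / K.sum(axis=1)        # weighted average of targets

# Toy usage: a linear relation y = 2x recovered with a small spread.
X = np.linspace(0, 1, 20).reshape(-1, 1)
y = 2.0 * X.ravel()
y_hat = grnn_predict(X, y, X, spread=0.05)
```

The spread controls the bias-variance trade-off: a small spread tracks the data closely, while a large one smooths toward the global mean, which is why the spread is the key tuning parameter for this architecture.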
The RBF and GRNN topologies are mainly used with rather small input datasets. In contrast to the common neural network topology, every neuron in the CFNN depends on all neurons of the preceding layers. Additionally, the CFNN can scale to large input datasets provided sufficient memory capacity is available.
As revealed by a literature review, ANN models have been applied to predicting the production rate. Nonetheless, only the MLP model has previously been used to forecast it [19]. The novelty of the present study is to apply and compare several models, including the GRNN and RBF models, which have not previously been used to predict the production rate. Accordingly, the current research mainly aims to introduce and assess a model for predicting the rate of polyethylene fabrication. The novelty of our model lies in its capability to predict system productivity while taking the uncertainty issue into account.

HDPE Process
The HDPE plant comprises two process lines, through which the polymerization reaction is carried out. Figure 1 illustrates a representative gas-phase polymerization procedure for producing HDPE considered in this research. Every line involves two polymerization reactors. The polymerization reaction is highly exothermic, with a heat of reaction of around 1000 kcal/kg of ethylene. There is, therefore, a need for suitable cooling systems that remove nearly 80% of the heat of polymerization. Co-monomer (1-butene or a higher alpha-olefin), ethylene, an activator, hydrogen, hexane, and a catalyst, as well as continually recycled liquid, are supplied to the reactors as reactants. Generally, the slurry phase occupies almost 90-95% of the reactor volume. As the reaction pressure builds up, the polyethylene slurry is transferred to the next process unit while the reactor level is kept within an allowable range. Separating the reaction slurry in the centrifugal separator yields a cake holding diluents, after which the diluents are removed with hot nitrogen gas in a dryer. Thereafter, suitable additives are added depending on the final use. After underwater pelletizing, the pellets are dried, placed in a homogenizer, and cooled.

Artificial Neural Networks
The ANN is an AI (artificial intelligence) method, defined as an information processing model inspired by the way the human nervous system processes information [21][22][23]. ANNs can identify patterns and learn from their interactions with the environment [24][25][26][27][28]. An ANN consists of three major parts, the input, output, and hidden layer(s), all of which comprise parallel units named neurons [29]. The neurons are coupled through weighted links, allowing information to be transferred among the layers. The ANN model relies on two key phases for predicting the response of different systems: the training phase and the testing phase. In the training phase, the neurons receive the inputs over their incoming connections and combine them so as to find the connection weight values that best reproduce the output. The learned association between output and input variables is then used to forecast fresh data. In the testing phase, the system performance is tested on a held-out portion of the data and the predicted data are compared with the real data. The principal benefit of the ANN model is its ability to solve sophisticated problems that traditional models cannot easily solve; it can also handle problems that have no algorithmic solution or whose algorithmic solutions are too complicated to define.
The independent variables from outside sources enter the input layer, are processed by several mathematical operators, and are passed to the hidden layers. The output neuron(s), in turn, determine the dependent variables. The output value for all ANN structures can be defined as

y = f( Σ_i w_i x_i + b )

where the value is computed by the activation function f, b is the bias value, and w_i is the weight of input x_i. Figure 2 presents the schematic diagram of the proposed neural network for simulating the EIX.
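The single-neuron output relation, y = f(Σ w_i x_i + b), can be sketched directly; the log-sigmoid activation used here is an illustrative choice, matching the transfer function the paper reports for its MLP.

```python
import math

def neuron_output(x, w, b, f=lambda s: 1.0 / (1.0 + math.exp(-s))):
    """Output of a single neuron: the activation f applied to the
    weighted sum of inputs plus the bias, y = f(sum_i w_i * x_i + b)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(s)

# Weighted sum is 2.0*0.5 + 1.0*(-1.0) + 0.0 = 0, and logsig(0) = 0.5.
y = neuron_output([0.5, -1.0], [2.0, 1.0], b=0.0)
```

A full layer is just this computation repeated for each neuron with its own weight vector and bias, which is why layers are conveniently expressed as matrix-vector products.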

Accuracy Assessment of AI Models
The current study designed various AI-based approaches with diverse topologies in order to select the best model based on predictive accuracy. This serves as a selection paradigm for the network configuration, determining the number of hidden layers and neurons, the spread factor, and the training algorithm. The dependence of the prediction on the number of neurons in the hidden layer was also examined. The performance of the ANN models is quantified with R2 (coefficient of determination), MSE (mean square error), AARD% (average absolute relative deviation percent), and RMSE (root mean square error), calculated, respectively, as

R2 = 1 − Σ_i (Y_i,act − Y_i,pred)^2 / Σ_i (Y_i,act − Ȳ_act)^2
MSE = (1/N) Σ_i (Y_i,act − Y_i,pred)^2
AARD% = (100/N) Σ_i |(Y_i,act − Y_i,pred) / Y_i,act|
RMSE = sqrt(MSE)

where Y_i,act is the actual value and Y_i,pred is the predicted value; N and Ȳ_act denote the number of data points and the mean of the actual values, respectively. The predicted data are also subjected to statistical analysis to evaluate the MSE for the various parameters. The MSE measures the squared deviation between predicted and actual values, while the sign of an individual residual denotes overestimation or underestimation of the parameter. The RMSE likewise reflects model efficiency through the difference between predicted and real data: a large RMSE indicates a high deviation between the predicted and real data, and vice versa. The R2 index defines the proximity of the actual data points to the predicted values.
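The four indices can be computed in a few lines; this sketch follows the standard definitions of R2, MSE, AARD% and RMSE, applied to a small made-up sample.

```python
import math

def metrics(y_act, y_pred):
    """R2, MSE, AARD% and RMSE for equally long actual/predicted samples."""
    n = len(y_act)
    mean_act = sum(y_act) / n
    sse = sum((a - p) ** 2 for a, p in zip(y_act, y_pred))   # squared errors
    sst = sum((a - mean_act) ** 2 for a in y_act)            # total variation
    mse = sse / n
    return {
        "R2": 1.0 - sse / sst,
        "MSE": mse,
        "AARD%": 100.0 / n * sum(abs((a - p) / a) for a, p in zip(y_act, y_pred)),
        "RMSE": math.sqrt(mse),
    }

# Illustrative data, not from the paper's databank.
m = metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Note that RMSE is simply the square root of MSE, so the two always rank models identically; AARD% adds a scale-free view that weights each error relative to the actual value.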

Results and Discussion
This section summarizes the actual databank gathered from an industrial polyethylene petrochemical company, considering the significant independent variables identified with the Pearson correlation matrix. Furthermore, this section determines the best structures of the different models and compares their precision. The section concludes by selecting the best model and analyzing the results.

Industrial Database
To investigate the EIX, eleven independent input variables, namely temperature, operating pressure, level, loop flow, ethylene flow, hydrogen flow, 1-butene flow, hydrogen concentration, 1-butene concentration, catalyst flow, and TEA (triethylaluminium) flow (inputs 1 to 11, denoted by X1 to X11), together with the EIX response (denoted by Y), were collected. Table 1 presents a summary of the industrial data used in this work. According to the industrial databank, 93 data points were gathered at steady-state conditions. The trained neural network needs validation to determine the precision of the introduced model. Network performance can be analyzed by cross-validation against an unseen dataset: a fraction of the dataset (e.g., 15%) is held back for validation and the rest is used for training. After the training phase, the data predicted by the ANN topology and the measured data undergo a correlation analysis.
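The hold-out scheme described above can be sketched as a shuffled split; the 15% fraction follows the text, while the seed and the use of index lists are illustrative assumptions.

```python
import random

def split_data(data, holdout_frac=0.15, seed=42):
    """Hold out a fraction of the dataset for validation and use the
    rest for training (the 85:15 split described in the text)."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)          # reproducible shuffle
    n_val = round(len(data) * holdout_frac)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return [data[i] for i in train_idx], [data[i] for i in val_idx]

# With 93 steady-state points, this yields 79 training and 14 validation points.
points = list(range(93))
train, val = split_data(points)
```

Shuffling before splitting matters here because the industrial data were collected sequentially; a contiguous split could put one operating regime entirely in the validation set.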
All ANN models were developed in MATLAB® with the Levenberg-Marquardt optimization algorithm. The choice of training algorithm and neuron transfer function contributes substantially to model precision. Researchers have shown that the Levenberg-Marquardt (LM) algorithm produces faster responses for regression-type problems across neural network variants [31,32]. The LM training algorithm is most often reported to have the highest efficiency, the fastest convergence, and high accuracy compared with other training algorithms.

Scaling the Data
To enhance the rate of convergence in the training step and to avoid parameter saturation in the intended ANNs, all actual data were mapped into the interval [0.01 0.99]. Data were normalized by Equation (6):

V_normal = 0.01 + 0.98 (V − V_min) / (V_max − V_min)     (6)

where V denotes an independent or dependent variable, V_normal represents the normalized value, and V_max and V_min are the maximum and minimum values of each variable.
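The mapping into [0.01, 0.99] can be sketched as a linear min-max scaling together with its inverse (the linear form is our assumption, consistent with the stated interval); the inverse is needed later to recover predictions on the original scale.

```python
def normalize(v, v_min, v_max, lo=0.01, hi=0.99):
    """Linearly map v from [v_min, v_max] into [lo, hi] = [0.01, 0.99]."""
    return lo + (hi - lo) * (v - v_min) / (v_max - v_min)

def denormalize(v_norm, v_min, v_max, lo=0.01, hi=0.99):
    """Inverse mapping, back to the original variable range."""
    return v_min + (v_norm - lo) * (v_max - v_min) / (hi - lo)

# Example with the EIX response range [24.3, 26.9] mentioned later in the text.
y_lo = normalize(24.3, 24.3, 26.9)   # maps to 0.01
y_hi = normalize(26.9, 24.3, 26.9)   # maps to 0.99
```

Keeping the targets away from the exact 0 and 1 endpoints avoids the flat tails of sigmoid transfer functions, where gradients vanish and training stalls.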

Independent Variable Selection
Mathematical investigation of the dependency between two variables is possible through a correlation matrix study, whose coefficients range from −1 to +1. The signs of these coefficients indicate whether the two variables are correlated directly or inversely, whereas the magnitude defines the strength of their association. Our research employed a multivariate AI-based method with the Pearson correlation test to estimate the degree of relationship between each pair of variables [23]. Figure 3 displays the correlation coefficient values for the possible pairs of variables. The Pearson correlation coefficient is used as a variable-ranking criterion for choosing appropriate inputs for the neural network [33,34]. Values delivered by the Pearson method reveal the type and intensity of the association between every variable pair, with values of −1 and +1 representing the strongest inverse and the strongest direct association, respectively. The coefficient takes a zero value where the given variables have no association. Independent variables with non-zero correlation coefficients justify their selection. The greatest consideration is devoted to large absolute values because they indicate strong associations. The Pearson correlation coefficients for each input are presented in Table 2.
Accordingly, this examination confirmed that Input2, Input5, Input6, Input7, Input10, and Input11 had the strongest direct dependency and that the other inputs presented the weakest, inverse associations. Hence, it is possible to model polyethylene production as a function of pressure, ethylene flow, hydrogen flow, 1-butene flow, catalyst flow, and TEA flow. Therefore, we seek a smart model of the following form: Output = function(Input 2, Input 5, Input 6, Input 7, Input 10, Input 11). Consequently, if a specific transformation of the dependent variable maximizes the AAPC (average of absolute Pearson's coefficients), it is inferred to yield the most reliable association between the dependent and independent variables. The input variables of the differing models were selected with the Pearson correlation coefficient; the inputs of every model are presented in Table 3. From this table, it can be concluded that it is better to model the output raised to the power of 12 than the output itself; at the end, the inverse transformation recovers the dependent variable for comparison with the actual values.
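The screening step above can be sketched as computing Pearson coefficients column by column and keeping the inputs whose absolute coefficient exceeds a threshold; the 0.5 cut-off in the usage line is an illustrative assumption, not the paper's criterion.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_inputs(X_cols, y, threshold=0.1):
    """Indices of input columns whose |Pearson coefficient| with the
    response exceeds the threshold (threshold is an assumed illustration)."""
    return [i for i, col in enumerate(X_cols) if abs(pearson(col, y)) > threshold]

# Toy usage: both columns are perfectly (anti-)correlated with y.
chosen = select_inputs([[1, 2, 3, 4], [4, 3, 2, 1]], [2, 4, 6, 8], threshold=0.5)
```

Because the sign only encodes direction, ranking is done on absolute values, which is why both the directly and the inversely correlated column are retained here.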

Configuration Selection for Different ANN Approaches
In this work, the EIX is predicted using a proper ANN model obtained through a logical procedure. As mentioned earlier, the number of hidden neurons contributes substantially to network performance. Most related investigations obtain the number of neurons by trial and error. Large training and generalization errors may occur when the number of hidden neurons is below the optimum, whereas too many hidden neurons may result in over-fitting and considerable variance. Therefore, the optimum number of hidden neurons must be determined to achieve the best network performance.
Subsequently, the ANN approaches were developed and the MLP network was compared in terms of performance with the CF, RBF, and GR neural networks. The numerical validation is based on the AARD%, R2, MSE, and RMSE observed between actual and estimated data. According to the literature, the capability of an MLP network with one hidden layer has been proven [35]. As such, an MLP network with only a single hidden layer is used for the analysis.
It is noted that the number of training data points should be at least twice the number of biases and weights. For an MLP with six independent variables, one dependent variable, and n_H hidden neurons, the number of weights and biases is 6 n_H + n_H + n_H + 1 = 8 n_H + 1, so the training set size must satisfy N_train ≥ 2 (8 n_H + 1). Therefore, the number of hidden neurons can range from 1 to 4 (the highest acceptable number) in this network, and each network was trained 50 times. The best configuration of hidden neurons in the MLP model is presented in Table 4. The MLP network with three hidden neurons and a 6-3-1 structure was determined to be the most appropriate model. The MSE values of the MLP network with various numbers of hidden-layer neurons are presented in Figure 4. The data reveal that three neurons are optimal, with the highest R2 (0.89413) and the lowest MSE (0.02217).
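The rule of thumb above can be checked with a few lines of arithmetic. The parameter count n_H (n_inputs + n_outputs + 1) + n_outputs is our reconstruction of the weight-and-bias total for a one-hidden-layer network, and the 79 training points correspond to roughly 85% of the 93 available data points.

```python
def param_count(n_h, n_inputs=6, n_outputs=1):
    """Weights and biases of a one-hidden-layer network:
    n_inputs*n_h + n_h*n_outputs weights, n_h + n_outputs biases."""
    return n_h * (n_inputs + n_outputs + 1) + n_outputs

def max_hidden_neurons(n_train, n_inputs=6, n_outputs=1):
    """Largest hidden-layer size with n_train >= 2 * (weights + biases)."""
    n_h = 0
    while n_train >= 2 * param_count(n_h + 1, n_inputs, n_outputs):
        n_h += 1
    return n_h

# 79 training points (about 85% of 93) with 6 inputs and 1 output.
n_max = max_hidden_neurons(79)
```

For n_H = 4 the network has 8·4 + 1 = 33 parameters (66 ≤ 79 training points), while n_H = 5 would need 82, which exceeds the available data; this reproduces the stated upper limit of 4.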

Other Types of ANN
To find an appropriate model for evaluating the EIX, different artificial neural network topologies must be compared based on their performance. Therefore, the MLP approach developed with the optimum configuration was evaluated against the other ANN models (GR, CF, and RBF) in terms of predictive accuracy. The sensitivity results for selecting the best number of hidden neurons are presented in Tables 5-7. The number of hidden neurons in the other ANNs was determined in the same way as for the MLP model. In the GR network, the number of hidden neurons is not a free parameter; instead, the spread value must be tuned. The spread value for the GR was therefore varied from 0.1 to 10 in steps of 0.1, and 50 different GR networks were compared using the statistical indices. The MSE of the GR model is at its minimum (0.10808) when the spread value is 4.81.
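The spread scan described above can be sketched as a simple grid search over validation MSE. The GRNN predictor, the toy data, and the validation-MSE criterion are illustrative assumptions; only the 0.1-to-10 grid follows the text.

```python
import numpy as np

def grnn_predict(X_tr, y_tr, X_q, spread):
    """Gaussian-kernel weighted average of training targets (GRNN sketch)."""
    d2 = ((X_q[:, None, :] - X_tr[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * spread ** 2))
    return (K @ y_tr) / K.sum(axis=1)

def best_spread(X_tr, y_tr, X_val, y_val, spreads):
    """Scan candidate spread values and keep the lowest validation MSE."""
    mses = [float(np.mean((grnn_predict(X_tr, y_tr, X_val, s) - y_val) ** 2))
            for s in spreads]
    i = int(np.argmin(mses))
    return spreads[i], mses[i]

# Toy usage on a smooth target y = x1 + x2.
rng = np.random.default_rng(2)
X_tr = rng.uniform(0, 1, (60, 2)); y_tr = X_tr.sum(axis=1)
X_val = rng.uniform(0, 1, (20, 2)); y_val = X_val.sum(axis=1)
spreads = [0.1 * k for k in range(1, 101)]   # 0.1 to 10 in 0.1 steps
s_best, mse_best = best_spread(X_tr, y_tr, X_val, y_val, spreads)
```

Because the GRNN has no iterative training, each grid point costs only one batch of kernel evaluations, which is what makes an exhaustive spread scan practical.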
Considering the task, the best model is the one with the lowest MSE and AARD%. Table 8 clearly reveals that the MLP model predicts the EIX more accurately than the other types of ANN models. Based on the statistical error values, the MSE of the MLP model (0.02217) is less than the MSEs estimated with the CF (0.03914), GR (0.10808), and RBF (0.09255) models. These findings confirm that the MLP model is superior to the other ANN models in predicting the EIX. The MLP model, trained by the Levenberg-Marquardt algorithm with a 6-3-1 structure, has the logsig transfer function in the hidden and output layers. In fact, this model was chosen from 600 models (200 MLPNN, 150 CFNN, 50 GRNN, and 200 RBFNN models) on the basis of four statistical indices: AARD%, MSE, RMSE, and R2. Table 9 summarizes the weight and bias values of the proposed MLP model. The MLP was trained on the training dataset by adjusting the biases and weights, and the validity of the trained MLP was assessed on the training and (independent) testing datasets. The optimal division ratio for segregating the data is 85:15. The model is applied as follows:

2. Multiply the first six columns of Table 9 by the variables obtained in step 1.
3. The 7th column of
5. Multiply the transposition of the 8th column of Table 9 by the values obtained in step 4.
6. Add the value of the last column of Table 9, i.e., 0.69286, to the values obtained in step 5.
7. Substitute the value obtained in step 6 into the following equation to calculate NO OL.
9. Map the output values from the previous step into the actual range of the dependent variable, i.e., [24.3 26.9], using the following equation.
10. The value obtained in step 9 is the estimated value of the dependent variable from the proposed MLP approach.
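The steps above amount to a forward pass through the 6-3-1 network followed by denormalization. The sketch below follows that reading under stated assumptions: step 1 is assumed to normalize the inputs via Equation (6), W1/b1 stand in for the first seven columns of Table 9 and w2/the last column for the output weights and bias, and all weight values here are random placeholders rather than the paper's trained values (only the bias 0.69286 and the output range [24.3, 26.9] come from the text).

```python
import numpy as np

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_631_predict(x_raw, x_min, x_max, W1, b1, w2, b2,
                    y_min=24.3, y_max=26.9):
    """Forward pass through a 6-3-1 MLP following the listed steps."""
    # Step 1 (assumed): normalize raw inputs into [0.01, 0.99].
    x = 0.01 + 0.98 * (x_raw - x_min) / (x_max - x_min)
    # Steps 2-4: weighted inputs plus hidden biases, then logsig.
    h = logsig(W1 @ x + b1)
    # Steps 5-7: output weights, output bias, then logsig to get NO_OL.
    no_ol = logsig(w2 @ h + b2)
    # Step 9: map back into the actual range of the dependent variable.
    return y_min + (no_ol - 0.01) * (y_max - y_min) / 0.98

# Placeholder weights (NOT Table 9's values) and an illustrative input.
rng = np.random.default_rng(3)
x_raw = rng.uniform(1, 2, 6)
x_min = np.full(6, 0.0); x_max = np.full(6, 3.0)
W1 = rng.normal(size=(3, 6)); b1 = rng.normal(size=3)
w2 = rng.normal(size=3); b2 = 0.69286    # output bias quoted in the text
y_hat = mlp_631_predict(x_raw, x_min, x_max, W1, b1, w2, b2)
```

Because logsig keeps NO_OL in (0, 1), the final mapping always returns a value close to the physical range of the response, regardless of the weights.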
Finally, the reliability of the proposed MLP model was checked statistically using the leverage approach. Figure 6 shows the outlier detection for the MLP model using the Williams plot method. It is clear from Figure 6 that only 4 of the 75 available data points (red dots) are difficult to model; in fact, about 95% of the data lie in the valid range (blue squares).

Conclusions
In HDPE processes, the EIX is the critical controlled variable indicative of product quality. Many approaches exist for estimating and correlating the EIX, since it behaves nonlinearly and is difficult to measure directly. The current paper applied several prediction schemes, namely the MLPNN (multi-layer perceptron), CFNN (cascade-forward), RBFNN (radial basis function), and GRNN (general regression) neural networks. Comparisons were made between the findings of the different prediction schemes to identify the best performance. The superior performance of the present MLP model over the other models was demonstrated on the same case-study dataset for predicting the EIX. The results clearly suggest that three hidden neurons is the best number of neurons for the proposed MLP model. For this model, the MSE and R2 values over the total dataset are 0.02217 and 0.89413, respectively. The main advantages of using ANNs for the EIX are the ability to predict the production rate quickly and to relate the characteristics of high-density polyethylene to the network inputs. Although these models use complex computational algorithms, fast convergence together with accuracy is not always guaranteed.

Nomenclature:
b Bias
N Number of actual data points
X_i i-th input variable
Y Response
w Weight