Model Prediction and Optimization of Waste Lube Oil Treated with Natural Clay

Abstract: In this work, used lube oil was treated using natural acid-free clay. Clay was added in different amounts (5, 10, and 20 g) to 100 mL of waste engine oil at various temperatures (250, 350, 400, and 450 °C) and mixed at a speed of 800 rpm for 30 min. After settling and separation, the treated oil was diluted with kerosene before being examined using an ultraviolet–visible (UV) spectrophotometer. To achieve cost-effective recycling, the process was modeled using the response surface method (RSM). Five regression models (linear, quadratic, two-factor interaction (2FI), cubic, and reduced-order quadratic) were developed, tested, and compared using statistical performance indicators (R², R²adj, Akaike's information criterion corrected (AICc), Bayesian information criterion (BIC), and root mean square error (RMSE)). The results reveal that the modified quadratic model outperforms the other models, with a low RMSE, the lowest AICc, the lowest BIC, and the highest R² and R²adj. The modified quadratic model was then optimized to predict the optimum operating conditions. The optimum corresponds to a minimum area under the UV absorption curve of 223.358, which can be achieved with a process temperature of 266.246 °C and a clay quantity of 5.331 g. This model agreed with the experimental data regardless of the effectiveness of red clay in the treatment of lube oil.


Introduction
Lubricants generated by the crude oil refining sector account for approximately 1.2% of annual petroleum consumption. At a worldwide level, this value corresponds to more than 40 million tonnes of base oil [1]. Refining of crude oil produces about 67% of base lube oil, while recycling of used oil can produce up to 95% of base oil. These lubricating oils mainly consist of complicated mixtures of isoalkanes with slightly longer branches, and of monocycloalkanes and monoaromatics having several short branches on the ring [2]. The chemical structure of the lubricating oil differs considerably depending on the kinds and percentages of additives; these also depend on the applications of the oil [3].
During application, lubricating oils tend to degrade leading to the deterioration of various important features, such as rheological and film-forming properties. Rheological properties are considered to be among the most important degradable properties [4]. Waste lubricant oil disposal causes serious environmental damage [5]. Unburned lubricating oil was found to have a high contribution to particulate emissions [6]. Therefore, waste engine oil has gained significant attention from researchers for a long time, which has resulted in various methods for the treatment and recovery of base oil being developed. Over the years, the removal of waste lubricating oil contaminants using solvents has been studied [7][8][9][10][11]; the use of clay as a bleaching agent, especially bentonite, has been mentioned in the literature [12].
Traditional treatment with clay-acid utilizes concentrated sulfuric acid to remove asphalt compounds and generates highly toxic acid products. Waste oils are originally handled with natural polymers to remove carbonic elements in the regeneration technique using acid-free clays. The oil is then subjected to vacuum distillation and clay treatment in an appropriate amount to accomplish the necessary product coloration [13,14]. In addition to the high expense due to the amount of clay needed, the recovered oils obtained through this process still have relatively high metal percentages [15]. The use of activated carbon has also been reported in the literature and the results showed a highly effective waste engine oil treatment by polycyclic aromatic hydrocarbon (PAH) removal [16]. Process optimization and oil property improvement achieved through acid modification of clays is now generating a renewed interest by researchers in recycling waste engine oil [17,18].
In this paper, the quality of the oil obtained after clay treatment at various conditions is assessed using a UV spectrophotometer. The variation in light intensity due to absorption is quantified by the absorbance, according to the Beer-Lambert relationship. This relationship states that there is a logarithmic dependency between the transmission of light through a substance and the product of the substance's absorption coefficient and the distance the light travels through the material, i.e., the path length:

$$A = -\log_{10} T = \log_{10}\left(\frac{I_0}{I}\right) = a\,b\,c$$

where A is the absorbance, T is the transmission defined as I/I_0, I_0 is the incident light intensity, I is the transmitted light intensity, a is the absorptivity of the absorbing species, c is the concentration of the absorbing species (g/L), and b is the path length of the light. The chemicals dissolved in oil can be characterized by the transmission of certain wavelengths through a compound and by the measurement of light intensity [19].
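As a small worked illustration of the Beer-Lambert relationship, the following Python sketch converts a pair of intensity readings into an absorbance and then into a concentration. The intensity values, the absorptivity, and the 1 cm path length are hypothetical numbers chosen only for illustration; they are not measurements from this study.

```python
import math


def absorbance_from_intensities(incident, transmitted):
    """Absorbance A = -log10(T), with transmission T = I / I0."""
    transmission = transmitted / incident
    return -math.log10(transmission)


def concentration_from_absorbance(absorbance, absorptivity, path_length_cm):
    """Solve A = a * b * c for the concentration c (g/L)."""
    return absorbance / (absorptivity * path_length_cm)


# Hypothetical reading: 10% of the incident light reaches the detector.
A = absorbance_from_intensities(incident=100.0, transmitted=10.0)

# Hypothetical absorptivity of 0.5 L/(g*cm) in a 1 cm cuvette.
c = concentration_from_absorbance(A, absorptivity=0.5, path_length_cm=1.0)

print(f"A = {A:.3f}, c = {c:.3f} g/L")
```

Because the absorbance is logarithmic in the transmission, each tenfold drop in transmitted intensity adds one absorbance unit.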
In the process industry, an advanced process control system needs precise models if high efficiency is to be achieved. Most chemical processes are nonlinear in nature, making the development of precise models difficult [20,21]. The accuracy of a modeling technique is considerably influenced by different factors, ranging from model nonlinearity and dimensionality to the data sampling method and internal parameters [22].
The scope of this paper is to develop a prediction model for ultraviolet-visible (UV-Vis) spectrophotometer data for the lube oil treatment process, which can play an important role in optimizing process operating conditions. A model capable of predicting experimental behavior correctly has been pursued earnestly by various researchers over the years.
In many engineering applications, such models can dramatically decrease time and operating costs. The need for modeling the used lube oil recycling process using red clay under various operating conditions emerged from the above-mentioned reasons. Response surface models are among the most recognized and widely used modeling techniques [22][23][24][25][26][27]. Extensive surveys and reviews of various modeling techniques and their applications are provided by [28][29][30][31].

Experimental Methods
Fresh SAE (Society of Automotive Engineers) 20W-50 oil was introduced into an automotive engine, which was then operated for 3500 km. The used oil was subsequently collected for treatment and evaluation. Three liters were filtered and left for 48 h, after which all settled particles were removed. The natural red clay was washed with distilled water, dried, and then crushed and sieved with a 200 µm mesh. Different quantities of clay (5, 10, and 20 g) were added to 100 mL of the waste engine oil. Each set of flasks containing the oil-clay mixture was then subjected to a specific temperature (100, 250, 350, or 450 °C) with a constant mixing speed of 800 rpm for 3 h, according to the method reported in [32]. The mixtures were then left to cool naturally to room temperature to allow the clay to settle. Simple decantation was used to obtain most of the oil before filtration. Samples of the treated oil (4 mL) were then diluted with kerosene (1 L) before being tested using a UV spectrophotometer. Kerosene was chosen because it is more suitable for mass spectrometric analysis, as reported in [33]. The UV results are shown in Figure 1; the complexity of the graph motivated the search for an effective way to identify the operating conditions that give the best treatment.
Processes 2019, 7, 729
A summary of the UV spectrophotometer results is presented in Table 1. A range of wavelengths from 200 to 350 nm was defined, and the area under the curve was recorded for all twelve tested samples.
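The text does not state how the area under the curve was obtained from the spectra. As one plausible approach, the sketch below applies the trapezoidal rule to sampled absorbance values inside the 200 to 350 nm window. The integration rule and the spectrum values are assumptions made for illustration and are not data from Table 1.

```python
def area_under_curve(wavelengths_nm, absorbances, lower_nm=200.0, upper_nm=350.0):
    """Trapezoidal area of a sampled spectrum between lower_nm and upper_nm.

    Only sampled points inside the window are used; no interpolation is
    performed at the window edges.
    """
    points = sorted(
        (w, a)
        for w, a in zip(wavelengths_nm, absorbances)
        if lower_nm <= w <= upper_nm
    )
    area = 0.0
    for (w0, a0), (w1, a1) in zip(points, points[1:]):
        area += 0.5 * (a0 + a1) * (w1 - w0)
    return area


# Hypothetical coarse spectrum (wavelength in nm, absorbance unitless).
wavelengths = [190.0, 200.0, 250.0, 300.0, 350.0, 400.0]
absorbances = [2.5, 2.0, 1.0, 0.5, 0.5, 0.4]

area = area_under_curve(wavelengths, absorbances)
print(f"Area between 200 and 350 nm: {area}")
```

A real spectrum would be sampled far more densely, which makes the choice of integration rule much less important.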

Modeling Techniques
Multiple regression models based on the response surface method were used in this study: (a) linear model, (b) 2FI model, (c) quadratic model, (d) cubic model, and (e) reduced-term quadratic model.

Response Surface Methodology (RSM)
The response surface methodology originated in the work reported in [23], which addressed the problem of determining the optimum operating conditions for a particular chemical process. The methodology is used in many practical applications where the goal is to identify the levels of design factors or variables that optimize a response. Despite its simplicity, RSM provides efficient and accurate solutions. It has therefore been applied effectively to many engineering problems [34][35][36][37].
Before implementing the RSM methodology, it is first essential to select an experimental design to define which runs should be conducted in the studied experimental range [38].
For this purpose, there are some experimental matrices. Experimental designs can be used for first-order models (e.g., factorial designs) if there is no curvature in the data set [38]. However, in order to approximate a response function to experimental data which cannot be characterized by linear functions, experimental designs for quadratic response surfaces, such as three-level factorial models, Box-Behnken, central composite, and Doehlert designs should be used [39].
In this study, a three-level full factorial design is applied. The minimum number of experiments required for this design is given by N = 3^k, where N is the number of experiments and k is the number of independent variables. Other designs, such as the Box-Behnken and central composite design (CCD), are used more frequently for more than two factors because the full factorial design needs more experimental runs than can normally be accommodated in practice [39].
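To make the run count concrete, the following sketch enumerates a full factorial grid with the Python standard library. The helper name `full_factorial` is hypothetical. The factor levels are the clay quantities and temperatures listed in the Experimental Methods section; crossing them gives twelve runs, which matches the twelve tested samples, whereas a strict three-level design with two factors would give 3^2 = 9 runs.

```python
from itertools import product


def full_factorial(factor_levels):
    """Return every combination of the given factor levels as a list of dicts."""
    names = list(factor_levels)
    return [dict(zip(names, combo)) for combo in product(*factor_levels.values())]


# Strict three-level design with k = 2 factors: N = 3**k runs.
three_level_runs = 3 ** 2

# Levels as listed in the Experimental Methods section.
study_grid = full_factorial({
    "clay_g": [5, 10, 20],
    "temperature_c": [100, 250, 350, 450],
})

print(three_level_runs, len(study_grid))
for run in study_grid[:3]:
    print(run)
```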
RSM determines a best-fit polynomial of the desired order, developed after applying an analysis of variance (ANOVA) test, to express the value of the response variable Y (lube oil absorbance, in this case) as a function of the independent variables (X_1 and X_2) as follows [27]:

$$Y = \beta_0 + \sum_{i=1}^{k} \beta_i X_i + \sum_{i=1}^{k} \beta_{ii} X_i^2 + \sum_{i<j} \beta_{ij} X_i X_j$$

where β_0, β_i, β_ii, and β_ij are the intercept, linear, quadratic, and interaction regression coefficients, respectively, and X_1 = A and X_2 = B are the independent variables, as shown in Table 2.
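The sketch below shows one way such a second-order polynomial can be fitted by ordinary least squares for two factors. It is a minimal pure-Python illustration that solves the normal equations with Gaussian elimination; in practice a numerical library or a package such as the Design Expert software used in this study would do this step. All function names are hypothetical, and the data are synthetic values generated from known coefficients, not the experimental data.

```python
from itertools import product


def quadratic_terms(x1, x2):
    """Model terms of the two-factor second-order response surface."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(points, responses):
    """Least-squares coefficients from the normal equations (X^T X) b = X^T y."""
    X = [quadratic_terms(x1, x2) for x1, x2 in points]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, responses)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic check on a 3 x 3 grid in coded units (-1, 0, +1).
points = list(product([-1.0, 0.0, 1.0], repeat=2))
true_beta = [10.0, 2.0, -1.0, 0.5, 3.0, 1.5]  # made-up coefficients
y = [sum(b * t for b, t in zip(true_beta, quadratic_terms(*p))) for p in points]

beta = fit_quadratic(points, y)
print([round(b, 6) for b in beta])
```

The coefficient order returned by `fit_quadratic` is intercept, X_1, X_2, X_1X_2, X_1², X_2², matching the order of `quadratic_terms`.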

Model Validation and Evaluation
R² and error analyses were performed between the experimental and predicted data for the developed models to evaluate the goodness of fit and the prediction accuracy of the constructed models. Many validation approaches used for error analysis are reported in the literature; some are listed in [40].
In this paper, techniques that use the error as a performance index to measure model accuracy are introduced. There are a number of different measures of model accuracy. The first two are the root mean square error (RMSE) and the R² values.
In general, the larger the values of R² and R²adj, and the smaller the value of RMSE, the better the fit. It is more appropriate to look at R²adj in situations where the number of design variables is large because R² always increases as the number of terms in the model increases, while R²adj actually decreases if unnecessary terms are added to the model. Different techniques for statistical analysis, e.g., the ANOVA test, can be used to check the fitness of an RSM model and thus identify the main effects of the design variables. The major statistical parameters used for evaluating model fitness are the F statistic, R², adjusted R², and the root mean square error (RMSE) [27,41].
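These parameters are not totally independent of each other. The coefficient of determination is calculated by:

$$R^2 = 1 - \frac{SSE}{SST}$$

In general terms, the lower the RMSE value, the better the fit. It can be calculated as:

$$RMSE = \sqrt{\frac{SSE}{n - p - 1}}$$

where n is the number of experimental runs, p is the number of non-constant terms in the RSM model, SSE is the sum of squared errors, and SST is the total sum of squares. SSE and SST are calculated as:

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

where y_i is the i-th observed response, \hat{y}_i is the corresponding predicted response, and \bar{y} is the mean of the observed responses. In situations where the number of design variables is large, it is more appropriate to look at R²adj because R² always increases as the number of terms in the model increases, whereas R²adj decreases if unnecessary terms are added to the model:

$$R^2_{adj} = 1 - \frac{SSE/(n - p - 1)}{SST/(n - 1)} = 1 - \frac{n - 1}{n - p - 1}\left(1 - R^2\right)$$

R²pred is a measure of the percentage of variation in new data that is described by the model:

$$R^2_{pred} = 1 - \frac{PRESS}{SST}$$

where the predicted residual error sum of squares (PRESS) is a measure of how well the model fits each point in the design:

$$PRESS = \sum_{i=1}^{n} e_{-i}^2 = \sum_{i=1}^{n} \left(y_i - \hat{y}_{-i}\right)^2$$

Here e_{-i} is a deletion residual, computed by fitting the model without the i-th run and then predicting the i-th observation with the resulting model.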
The adjusted R-squared and the predicted R-squared should be within 0.20 of each other.
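The following sketch computes these fit statistics for a deliberately simple case: a straight line with one predictor (p = 1), so that the example stays short. PRESS is obtained by literally refitting the model with each run left out, as described above. The function names and the data are hypothetical and serve only as an illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_statistics(x, y, p=1):
    """RMSE, R2, adjusted R2, PRESS, and predicted R2 for a straight-line fit."""
    n = len(y)
    b0, b1 = fit_line(x, y)
    mean_y = sum(y) / n
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)

    # Deletion residuals: refit without run i, then predict run i.
    press = 0.0
    for i in range(n):
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (c0 + c1 * x[i])) ** 2

    r2 = 1.0 - sse / sst
    return {
        "RMSE": math.sqrt(sse / (n - p - 1)),
        "R2": r2,
        "R2_adj": 1.0 - (n - 1) / (n - p - 1) * (1.0 - r2),
        "PRESS": press,
        "R2_pred": 1.0 - press / sst,
        "SSE": sse,
    }


# Made-up data, roughly linear with some scatter.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

stats = fit_statistics(x, y)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Since each deletion residual is at least as large in magnitude as the ordinary residual, PRESS is never smaller than SSE, and the predicted R² never exceeds R².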

AIC and BIC
AIC stands for Akaike's information criterion. It is commonly used to compare the validity of models within a cohort of nonlinear models and is often used for model selection. It is calculated as follows:

$$AIC = 2p - 2\ln(L)$$

where p is the number of parameters and ln(L) is the maximum log-likelihood of the estimated model. In the case of a nonlinear fit with normally distributed errors [42], the latter is calculated by:

$$\ln(L) = -\frac{N}{2}\left(\ln(2\pi) + 1 - \ln(N) + \ln\sum_{i=1}^{N} x_i^2\right)$$

where x_1, ..., x_N are the residuals from the nonlinear least-squares fit and N is their number. For small sample sizes, AIC tends to choose models that have too many terms. The corrected AIC (AICc) overcomes this issue by increasing the penalty on additional terms for smaller designs:

$$AICc = AIC + \frac{2p(p + 1)}{n - p - 1}$$

where n is the sample size and p is the number of parameters.

Bayesian information criterion (BIC)
BIC is an alternative to AICc that performs better for larger designs. It is calculated by:

$$BIC = p\ln(n) - 2\ln(L)$$

where p is the number of parameters, n is the sample size, and L is the maximum likelihood of the estimated model.
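The three criteria can be computed directly from the residuals of a least-squares fit. The sketch below assumes normally distributed errors and the standard textbook forms AIC = 2p − 2 ln(L), AICc = AIC + 2p(p + 1)/(n − p − 1), and BIC = p ln(n) − 2 ln(L). The residual values and the parameter count are made up for illustration.

```python
import math


def log_likelihood(residuals):
    """Maximum log-likelihood of a least-squares fit with normal errors."""
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return -0.5 * n * (math.log(2.0 * math.pi) + 1.0 - math.log(n) + math.log(sse))


def information_criteria(residuals, p):
    """Return (AIC, AICc, BIC) for a model with p estimated parameters."""
    n = len(residuals)
    ll = log_likelihood(residuals)
    aic = 2.0 * p - 2.0 * ll
    aicc = aic + 2.0 * p * (p + 1) / (n - p - 1)  # small-sample correction
    bic = p * math.log(n) - 2.0 * ll
    return aic, aicc, bic


# Hypothetical residuals from a 12-run design and a 4-parameter model.
residuals = [1.2, -0.8, 0.5, -1.5, 2.0, -0.3, 0.9, -1.1, 0.4, -0.7, 1.6, -2.2]
aic, aicc, bic = information_criteria(residuals, p=4)
print(f"AIC = {aic:.2f}, AICc = {aicc:.2f}, BIC = {bic:.2f}")
```

For a 12-run design the correction term is substantial: with p = 4 it adds 40/7 to the AIC, which is why AICc penalizes large models such as the cubic model so heavily.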

Modeling Statistics
Several statistical indices can be used to compare predictive models, including but not limited to the following.
RMSE: the square root of the residual mean square. It can be regarded as an estimate of the standard deviation associated with the experiment.
Mean: the overall average of all the response data.
C.V. (coefficient of variation): the standard deviation expressed as a percentage of the mean. It is calculated by dividing the standard deviation by the mean and multiplying by 100.
PRESS (predicted residual error sum of squares): a measure of how well the model fits each design point.
Adequate precision (Adeq Precision): the ratio of signal to noise. It compares the range of predicted values at the design points with the average error of prediction. Ratios larger than 4 indicate adequate discrimination in the model.
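The text defines the coefficient of variation and the signal-to-noise ratio only in words. The sketch below gives one possible numerical reading. The adequate precision formula used here, the range of predicted values divided by the square root of the average prediction variance p·σ²/n, is an assumption based on how this statistic is commonly defined for design-of-experiments software; it is not confirmed by the source, and all inputs are hypothetical.

```python
import math


def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * std_dev / mean


def adequate_precision(predicted, residual_mean_square, n_params):
    """Range of predictions over the root of the average prediction variance.

    Assumed definition: average prediction variance = n_params * sigma^2 / n,
    where n_params counts all model parameters including the intercept.
    """
    n = len(predicted)
    average_variance = n_params * residual_mean_square / n
    return (max(predicted) - min(predicted)) / math.sqrt(average_variance)


# Hypothetical values: 8 design points, a 2-parameter model, sigma^2 = 4.
predicted = [220.0, 222.0, 224.0, 225.0, 226.0, 227.0, 228.0, 230.0]
cv = coefficient_of_variation(std_dev=2.0, mean=225.0)
ratio = adequate_precision(predicted, residual_mean_square=4.0, n_params=2)

print(f"C.V. = {cv:.2f}%, adequate precision = {ratio:.1f}")
print("adequate signal" if ratio > 4 else "weak signal")
```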

Results and Discussion
The UV absorbance values for the treated oil (S1-S12) differ substantially over a broad spectrum of wavelengths; a comparative absorbance assessment is shown in Figure 1. This is due to the presence of very complex mixtures of paraffinic, naphthenic, and aromatic hydrocarbon molecules in addition to waste material [9]. As kerosene is introduced to dilute the samples, the viscosity of the oil samples decreases. Consequently, the sample oil becomes optically more transparent. The UV spectrometer data are modeled to optimize the process operation parameters. The model predictions were created using Design Expert 10 on Windows 7 with an i5 processor and 8 gigabytes (GB) of random access memory (RAM). To assess the computational effectiveness and precision of the developed models, the performance evaluation measures described above are used, as they are recognized as excellent indices. Large values of R²adj and R², as well as small values of RMSE, indicate a good fit for the models.

Modeling Experimental Data Using RSM
Five models were compared: the linear model, the two-factor interaction (2FI) model, the quadratic model, the cubic model, and the modified (reduced-order) quadratic model.

Linear Model
Applying the linear model to the previously defined data gives the following final equation in terms of actual factors:
Absorption area under the curve = +204.98717 + 0.063973 × Temperature + 0.19579 × Concentration
Regression coefficients and the statistical analysis for the predicted response surface model are shown in Table 3. The calculated F-value of 18.81 in Table 3 indicates that the model is significant (a larger value is desirable). However, for an individual model term to be significant, its p-value should be less than 0.05. Only A (temperature) is a significant model term in this case.
The model was verified using the ANOVA statistical test, the results of which are given in Table 4. The R²pred of 0.6808 is in reasonable agreement with the R²adj of 0.7641; the difference is less than 0.2. The "Adeq Precision" measures the ratio of signal to noise. A ratio higher than 4 is desirable; the value of 11.402 indicates an adequate signal in this case. However, because the model contains an insignificant term, a better model was sought.
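To illustrate how the equation in actual factors is used, the sketch below evaluates the linear model at a chosen operating point and applies the 0.2 agreement rule to the reported R² values. It assumes that "Concentration" denotes the clay quantity in grams and that temperature is in °C; the function names are hypothetical. The operating point is the optimum reported later for the modified quadratic model and is used here only as a convenient input.

```python
def linear_model_area(temperature_c, clay_g):
    """Linear model in actual factors, with the coefficients given in the text.

    Assumption: 'Concentration' in the equation is the clay quantity in grams.
    """
    return 204.98717 + 0.063973 * temperature_c + 0.19579 * clay_g


def in_reasonable_agreement(r2_adj, r2_pred, tolerance=0.2):
    """Rule of thumb: adjusted and predicted R2 should differ by less than 0.2."""
    return abs(r2_adj - r2_pred) < tolerance


area_at_point = linear_model_area(temperature_c=266.246, clay_g=5.331)
linear_ok = in_reasonable_agreement(r2_adj=0.7641, r2_pred=0.6808)

print(f"Predicted area under the curve: {area_at_point:.3f}")
print(f"R2 adj and R2 pred agree: {linear_ok}")
```

Because both coefficients are positive, the linear model on its own would always place the minimum at the lowest temperature and clay quantity in the design, which is one reason a model with curvature is needed to locate an interior optimum.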

The Quadratic Model
The final equation in terms of coded factors is presented in the following quadratic form:
Abs. area under the curve = +227.03 + 6.76 × A + 1.35 × B + 0.55 × AB + 4.19 × A² + 0.55 × B²
As can be seen from Table 5, only the temperature and temperature² terms are significant in this quadratic model (p-value less than 0.05). Table 6 shows that the R²pred of 0.6980 is in reasonable agreement with the R²adj of 0.8700, as the difference is lower than 0.2. "Adeq Precision" measures the signal-to-noise ratio, for which a value greater than 4 is desirable. The ratio of 11.195 indicates an adequate signal, although it is lower than that of the previous model. In addition, the F-value of 15.73 is also lower than the previous one. This model may be enhanced, since only the temperature terms have a p-value less than 0.05.

2-Factor Interactions Model (2FI)
The 2-factor interactions model can be written in terms of coded factors, without the square terms, as follows:
Abs. area under the curve = +229.82 + 6.46 × A + 1.40 × B + 0.55 × AB
P-values less than 0.05 in Table 7 indicate that the corresponding model terms are significant. In this case, A is the only significant term. Values higher than 0.1 imply that the model terms are not significant. The model F-value of 11.43 means that the model is significant; there is only a 0.29% chance that an F-value this large could occur because of noise. The R²pred of 0.5631 in Table 8 is in reasonable agreement with the R²adj of 0.7399, i.e., the difference is less than 0.2. "Adeq Precision" measures the signal-to-noise ratio, and a ratio greater than 4 is desirable. The ratio of 9.396 indicates an adequate signal.
Similarly, the cubic model can be written in terms of coded factors as follows:
Abs. area under the curve = +227.76 + 3.37
The statistical parameters for the cubic model are summarized in Table 9, showing that the parameters are estimated inadequately because their p-values are greater than 0.05. The p-value for every cubic term in the table is more than 0.05, which shows that the model is insignificant; terms with values greater than 0.1 have no significance. A negative R²pred in Table 10 indicates that the fit is worse than simply fitting a horizontal line. This is a major drawback of the model and forces it to be rejected.
Tables 11 and 12 summarize the statistical parameters for the modified quadratic model, showing that the parameters are properly estimated, as the calculated p-values are less than 0.05. For the response surface models with high coefficients of determination (R² > 0.90), the variation in the responses can be attributed to the factor effects rather than to random errors.
On the other hand, for the cubic model, the value of R²adj (−0.1212) indicates the insignificance of the explanatory variables. Moreover, the difference between R² and R²adj, which is greater than 0.2, indicates unreasonable agreement, as reported in the literature [25,38]. These statistical indicators are supplemented by the scatterplots in Figure 2. The findings summarized in Table 13 demonstrate that the modified quadratic model is more precise and can be used to model the treatment of used lube oil with red clay. This accuracy is reflected by the low RMSE value (1.84) and the high R² and R²adj values (0.9235 and 0.8948, respectively), in addition to the smallest PRESS value (54.02), the smallest AICc value (57.57), and the smallest BIC value (53.79), compared with the corresponding values obtained by the linear, 2FI, quadratic, and cubic models. Although the linear model is suggested for modeling the process under study, it presents the lowest values of R² and R²adj (0.8070 and 0.7641, respectively). The cubic and quadratic models, on the other hand, have the highest values of AICc and BIC, whereas the 2FI model has the lowest values of R² and R²adj after the linear model.
In brief, the results indicate the superiority of the reduced-order quadratic model over the other models in terms of the minimum RMSE and the highest R² and R²adj. This finding is in line with that of many scientists, who confirm that "choosing the best model should concentrate on maximizing the R 2 adj and the R 2 pred" [34][35][36][37].
Other scientists advise the use of AICc and BIC for nonlinear models because they perform much better than R² [42].
The optimum operating conditions, which maximize production while minimizing the cost of treatment, occur at the minimum area under the absorption curve, predicted to be 223.358; this can be achieved at a process temperature of 266.246 °C and with 5.331 g of clay. These optimum values can be viewed in the contour plot and three-dimensional (3D) surface in Figure 3. Experiments were carried out under these optimal conditions to validate the predicted optimum values. The resulting experimental value of 223.8 is in close agreement with that obtained from the regression model. Adsorption rates improved with temperature owing to the diffusion of adsorbate molecules into the adsorbent; in addition, solubility and adsorption are inversely linked, as temperature affects the adsorption range [17,43]. This study has revealed that effective oil treatment can be achieved with natural acid-free clay under optimized process conditions; it is recommended that different types of clays be tested, optimized, and mixed to achieve better recycling.
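The equation of the modified quadratic model is not given in the text, so the optimization step can only be illustrated on a stand-in. The sketch below locates the stationary point of the full quadratic model in coded factors reported earlier, assuming its last term is 0.55 × B², by setting the gradient to zero and solving the resulting 2 × 2 linear system. The mapping between coded and actual factors is not stated either, so the result stays in coded units and is not expected to reproduce the reported optimum of 266.246 °C and 5.331 g. Function names are hypothetical.

```python
def quadratic_model(a, b):
    """Full quadratic model in coded factors (B^2 coefficient assumed to be 0.55)."""
    return 227.03 + 6.76 * a + 1.35 * b + 0.55 * a * b + 4.19 * a * a + 0.55 * b * b


def stationary_point():
    """Solve grad = 0, i.e. H [a, b]^T = -g, by Cramer's rule."""
    h11, h12, h22 = 2 * 4.19, 0.55, 2 * 0.55  # Hessian entries
    g1, g2 = -6.76, -1.35                      # negated linear coefficients
    det = h11 * h22 - h12 * h12
    a = (g1 * h22 - h12 * g2) / det
    b = (h11 * g2 - h12 * g1) / det
    is_minimum = h11 > 0 and det > 0           # Hessian positive definite
    return a, b, is_minimum


a_opt, b_opt, is_minimum = stationary_point()
inside_design = -1.0 <= a_opt <= 1.0 and -1.0 <= b_opt <= 1.0
y_opt = quadratic_model(a_opt, b_opt)

print(f"Coded optimum: A = {a_opt:.4f}, B = {b_opt:.4f}")
print(f"Minimum: {is_minimum}, inside coded range: {inside_design}")
print(f"Predicted area under the curve: {y_opt:.3f}")
```

If the stationary point fell outside the coded range of −1 to +1, the constrained minimum would lie on the boundary of the design region and would have to be found by a bounded search instead.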

Conclusions
This study presents an application of the RSM technique to optimize the recycling of used lube oil treated with natural red clay. Statistical indices were used to compare the precision and accuracy of the presented models for the used lube oil treatment process. Five different models were tested to find the best fit with the obtained experimental data: the linear, 2FI, quadratic, modified quadratic, and cubic models.
The results showed that the minimum RMSE value is obtained by the modified quadratic model, whereas the maximum calculated value of the R² index is obtained by the cubic model. However, the cubic model had a low F-value and the highest PRESS, AIC, and BIC, making it the least favorable model among those tested.
In addition, the results show the superiority of the modified quadratic model in terms of its low RMSE (1.84), high values of R² and R²adj (0.9235 and 0.8948, respectively), the lowest AICc (57.57), and the lowest BIC (53.79), compared with the linear, 2FI, quadratic, and cubic models.
Optimization of the model has provided insight into the operating conditions of the used lube oil treatment process, revealing that the optimum operating conditions, which maximize production at the minimum cost of treatment, occur at a minimum area under the absorption curve of 223.358; this can be achieved with a temperature of 266.246 °C and a clay amount of 5.331 g. This model can also be used for online state estimation and advanced control of the oil recycling process.