Development of a Mathematical Model Based on an Artificial Neural Network (ANN) to Predict Nickel Uptake Data by a Natural Zeolite

In this investigation, an artificial-neural-network-based mathematical model was developed for the prediction of nickel adsorption data. The initial concentration, adsorbent dosage, and pH of the nickel solution were chosen as input variables, and the removal efficiency was chosen as the output variable. The hyperparameters were optimized to determine the best topology for the model. The study showed that the 3-12-1 ANN architecture was the most suitable topology. The determination coefficient of 0.98 and the mean squared error of 0.02 indicated the high performance of the developed model, which was successfully applied to isotherm data prediction.


Introduction
Nowadays, due to the rapid increase in industrial production, a massive amount of industrial effluent is being created and released into the aquatic system. Heavy metal contaminants found in wastewater and industrial effluent include cobalt, nickel, lead, and copper. A high concentration of these heavy metals can induce acute or chronic poisoning [1][2][3]. Nickel is a toxic, non-biodegradable, and carcinogenic metal that can cause several health problems, including chronic asthma, dermatitis, and cancer. The permissible limit set by WHO for drinking water is 0.01 mg/L, whereas for wastewater it is 2 mg/L [3][4][5].
Several chemical and physical methods are used for heavy-metal removal, such as chemical precipitation, ion exchange, electro-coagulation, and solvent extraction. However, most of these methods have drawbacks, notably high cost and high energy consumption. Adsorption, on the other hand, has proven to be an effective, simple, and less expensive method for heavy-metal removal [1,6].
Several low-cost adsorbents have been studied for heavy-metal adsorption. One of the most promising materials in this sector is natural zeolite, a porous hydrated aluminosilicate mineral with a three-dimensional structure. The fundamental building blocks of zeolite are $SiO_4$ and $AlO_4$, and the isomorphic substitution of $Si^{4+}$ by $Al^{3+}$ gives the framework surface a net negative charge, which is balanced by alkali and alkaline-earth metal cations, such as $Na^+$, $Ca^{2+}$, $K^+$, and $Mg^{2+}$ [4,7].
An artificial neural network (ANN) is a reliable, robust, and powerful mathematical tool of the artificial intelligence (AI) family. It captures the non-linear relationship between input and output variables in complex problems [8,9]. The principal objective of this paper was to develop a mathematical model based on ANN simulation for nickel-adsorption data prediction. Fifteen data points were collected from our previous work [4] and divided into training and validation sets (70/30). The ANN architecture and the hyperparameters were optimized to find the best topology. The performance of the model was evaluated by minimizing the mean squared error (MSE) for both the training and validation sets. The ANN model was validated by predicting the removal efficiency of nickel adsorption, and finally, it was tested for isotherm prediction to confirm the adequacy of the model.

Data Collection
The data used in the current paper to develop an ANN-based mathematical model for predicting nickel removal from aqueous solutions came from our previous work [4]. In that paper, we studied the adsorption of nickel on a NaCl-activated natural zeolite, and the adsorption parameters (initial concentration, adsorbent dosage, and pH of the nickel solution) were optimized using the Box-Behnken design as a response surface methodology. The adsorption process is comprehensively described in [4]. The same data were used to develop the ANN model, with the initial concentration, adsorbent dosage, and pH of the nickel solution selected as the input variables and nickel removal chosen as the output variable. The selected data are summarized in Table 1.

ANN Model
An artificial neural network is an artificial-intelligence tool inspired by the human brain. It simulates the working principles of biological neurons, which makes it a powerful approach for solving many complex problems, such as regression or classification problems. The ANN architecture consists of three types of layers, namely input, hidden, and output. Each layer has a number of neurons, which are linked to each other, and different layer sizes give different ANN architectures [9,10]. In this paper, a 3-12-1 ANN architecture was adopted, with three input variables (initial concentration, adsorbent dosage, and pH), twelve neurons in the hidden layer, and one output neuron (nickel removal). The best architecture is shown in Figure 1.
In general, the data are divided into three sets: training, validation, and testing. In this research, however, the 15 available data points were too few for a three-way split. As a result, the data were randomly divided into two sets: 70% of the data was used for training and the remaining 30% for validation. The ANN model was implemented in MATLAB. The trainlm function, based on the Levenberg-Marquardt algorithm, was applied for back-propagation training. The tan-sigmoid (tansig) and linear (purelin) transfer functions were applied at the hidden and output layers, respectively.
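The random 70/30 division described above can be sketched in Python as follows. This is only an illustration, not the MATLAB routine used in the study: the function name `split_indices`, the fixed seed, and the choice to round the training count down (10 training and 5 validation points for 15 samples) are assumptions.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Randomly split sample indices into training and validation sets."""
    indices = list(range(n_samples))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(indices)
    # Assumption: the training count is rounded down (15 * 0.7 -> 10).
    n_train = int(n_samples * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# 15 data points, as in the study; which points land in each set
# depends on the (arbitrary) seed.
train_idx, val_idx = split_indices(15)
```

Every index appears in exactly one of the two lists, so no data point is used for both training and validation.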
The operation of the ANN is defined by Equation (1) [10]:
$$y_j = f\left(\sum_{i=1}^{n} w_{ji} X_i + B_j\right) \qquad (1)$$
where $y_j$ is the output variable, $f$ is the transfer function, $B_j$ is the bias in the hidden layer, $n$ is the number of neurons in the hidden layer, $w_{ji}$ is the connection weights between the input and hidden layers, and $X_i$ is the input variable. To avoid overfitting or underfitting, the data were normalized in the scaled range of −1 to 1, using Equation (2) [11]:
$$R_{nor} = (M_{max} - M_{min}) \, \frac{y_i - Min(y_i)}{Max(y_i) - Min(y_i)} + M_{min} \qquad (2)$$
where $R_{nor}$ is the normalized data, and $M_{max}$ and $M_{min}$ are the maximum and minimum values of the scaling range, respectively. $y_i$ is the actual data. $Max(y_i)$ and $Min(y_i)$ are the maximum and minimum values of the actual data, respectively.
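A minimal Python sketch of the scaling to the range −1 to 1 described above, assuming the usual linear min-max form. The helper names `normalize` and `denormalize` and the sample pH values are hypothetical and are not taken from Table 1.

```python
def normalize(values, m_min=-1.0, m_max=1.0):
    """Scale values linearly so that min(values) -> m_min and max(values) -> m_max."""
    lo, hi = min(values), max(values)
    return [(m_max - m_min) * (v - lo) / (hi - lo) + m_min for v in values]


def denormalize(scaled, lo, hi, m_min=-1.0, m_max=1.0):
    """Invert normalize(): map scaled values back to the original range [lo, hi]."""
    return [(s - m_min) * (hi - lo) / (m_max - m_min) + lo for s in scaled]


# Hypothetical pH levels, used only to illustrate the scaling.
ph_values = [2.0, 4.0, 6.0]
ph_scaled = normalize(ph_values)               # [-1.0, 0.0, 1.0]
ph_restored = denormalize(ph_scaled, 2.0, 6.0)  # [2.0, 4.0, 6.0]
```

The sketch assumes that the values are not all identical; otherwise the denominator in `normalize` is zero.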

ANN Optimization
To find the best architecture for the ANN model, hyperparameters such as the number of neurons, the transfer function type, and the learning rate should be optimized. In this work, the number of hidden neurons was varied from 1 to 15, as shown in Figure 2. The best architecture was selected based on the root mean square error (RMSE) values for both the training and validation sets [12,13]. The optimum number of neurons was 12, where the RMSE values for the training and validation sets were as low as possible and converged to almost the same value.
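The selection criterion described above (low RMSE on both sets, with the two values close to each other) can be expressed as a simple scoring rule. The sketch below is one possible formalization, not the procedure used in the study: the score (worse of the two RMSE values plus the gap between them) is an assumption, and the RMSE values are made-up placeholders rather than the values plotted in Figure 2.

```python
def pick_hidden_neurons(rmse_train, rmse_val):
    """Return the hidden-neuron count with the lowest combined score.

    Assumed score: the worse of the two RMSE values plus the gap between
    them, so that a low and similar error on both sets is favored.
    """
    best_n, best_score = None, float("inf")
    for n in rmse_train:
        tr, va = rmse_train[n], rmse_val[n]
        score = max(tr, va) + abs(tr - va)
        if score < best_score:
            best_n, best_score = n, score
    return best_n


# Made-up RMSE values keyed by hidden-neuron count (illustration only).
rmse_train = {1: 0.40, 2: 0.25, 3: 0.10}
rmse_val = {1: 0.45, 2: 0.50, 3: 0.12}
best = pick_hidden_neurons(rmse_train, rmse_val)
```

With these placeholder values the rule selects 3 neurons, because that entry has both the lowest errors and the smallest gap between the training and validation errors.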

ANN Performance
The performance of the model was evaluated based on the variation of the mean squared error (MSE) as a function of the number of training cycles. As shown in Figure 3, the training stopped after three epochs, and the best validation performance (MSE) was 0.02 at epoch 1. In addition, Figure 4 shows the regression plot for the model. The $R^2$ values for both the training and validation sets are above 0.90, which indicates the high accuracy of the ANN model [14].

Phys. Sci. Forum 2023, 6, 4

Mathematical Model Development
To develop a mathematical model for data prediction, the simulated ANN was transformed into a mathematical equation that relates the input variables to the output variable, based on the weights and biases extracted from the model in conjunction with the transfer function. The overall equation can be written as follows [10]:
$$y = b_0 + \sum_{k=1}^{n} w_k \, f_{sig}\left(b_{nk} + \sum_{i=1}^{m} w_{ik} X_i\right) \qquad (3)$$
where $b_0$ is the bias in the output layer, $n$ is the number of neurons in the hidden layer, $w_k$ is the connection weights between the hidden and output layers, $f_{sig}$ is the transfer function, $b_{nk}$ is the bias at each neuron in the hidden layer, $m$ is the number of neurons in the input layer, $w_{ik}$ is the connection weights between the input and hidden layers, $X_i$ is the normalized input data, and $y$ is the normalized output data.
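The structure of this equation (a bias plus a weighted sum of tan-sigmoid hidden-neuron outputs) can be sketched in Python as follows. This is a hedged illustration of a single-hidden-layer forward pass, not the authors' MATLAB code: the function names and all weight values are placeholders and are not the values of Table 2.

```python
import math


def tansig(x):
    """Tan-sigmoid transfer function; mathematically equal to tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def predict_normalized(x, w_in, b_hidden, w_out, b_out):
    """Normalized network output for normalized inputs x.

    x        : list of m normalized inputs
    w_in     : n rows of m input-to-hidden weights
    b_hidden : n hidden-neuron biases
    w_out    : n hidden-to-output weights
    b_out    : output-layer bias
    """
    y = b_out
    for weights_k, bias_k, w_k in zip(w_in, b_hidden, w_out):
        # Weighted sum of the inputs arriving at hidden neuron k.
        a_k = bias_k + sum(w * xi for w, xi in zip(weights_k, x))
        y += w_k * tansig(a_k)
    return y


# Placeholder 3-2-1 network (hypothetical weights, illustration only).
w_in = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]
b_hidden = [0.0, 0.0]
w_out = [1.0, -1.0]
b_out = 0.25
y_center = predict_normalized([0.0, 0.0, 0.0], w_in, b_hidden, w_out, b_out)
```

With all inputs at the center of the normalized range and zero hidden biases, every hidden neuron outputs zero, so the result reduces to the output bias. The same function would apply to a 3-12-1 network by passing twelve rows of weights.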
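In the present paper, the mathematical model was constructed based on the weights and biases extracted from the 3-12-1 ANN model. The extracted values of the biases and weights are presented in Table 2. Using the information in Table 2, Equation (3) is transformed into Equation (4) as follows:
$$y = b_0 + \sum_{n=1}^{12} B_n \qquad (4)$$
$B_n$ is unknown and can be calculated using Equation (5):
$$B_n = w_{n\text{-}outp} \left(\frac{2}{1 + e^{-2A_n}} - 1\right) \qquad (5)$$
$A_n$ is also unknown and can be calculated using Equation (6):
$$A_n = w_{n\text{-}inp,1} \, IC + w_{n\text{-}inp,2} \, Ad + w_{n\text{-}inp,3} \, pH + b_n \qquad (6)$$
where IC is the initial concentration, Ad is the adsorbent dosage, $n$ is the number of the hidden neuron, $b_n$ is the bias of that neuron, and $w_{n\text{-}inp}$ and $w_{n\text{-}outp}$ are the connection weights in the input and output layers, respectively. The final equation used for predicting the nickel removal after de-normalizing the data, obtained by inverting Equation (2) for the scaling range of −1 to 1, is presented as follows:
$$\text{Nickel removal} = \frac{(y + 1)\left[Max(y_i) - Min(y_i)\right]}{2} + Min(y_i) \qquad (7)$$

Discussion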

ANN Validation
To validate the model, it was used to predict the adsorption efficiency, and the predictions were compared to the original data. Figure 5 shows the original data compared to the predicted data, and Figure 6 shows their error histogram. The data follow a straight line with an $R^2$ value of 0.98, which confirms the validity and high accuracy of the model. Furthermore, the error between the output and the target is very low. Therefore, the model can be considered appropriate for predicting future data.
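For reference, the two metrics used in this comparison can be computed as in the following Python sketch. It assumes the common definitions MSE = mean of squared errors and $R^2 = 1 - SS_{res}/SS_{tot}$; the source does not state which formula was used, and the sample numbers are arbitrary placeholders, not data from Figure 5.

```python
def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Arbitrary placeholder values (illustration only).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
example_mse = mse(actual, predicted)       # 0.25
example_r2 = r_squared(actual, predicted)  # 0.8
```

An $R^2$ close to 1 together with a small MSE means that the predicted points lie close to the line of perfect agreement.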

Isotherm Prediction
To confirm the accuracy of the model in a typical adsorption study, it was used to predict the data for an isotherm study, and the results were compared to the original results. Figure 7 shows the predicted isotherm plot (Figure 7b) against the original isotherm plot (Figure 7a), and Table 3 summarizes the predicted isotherm parameters.

Figure 7 and Table 3 show that the Redlich-Peterson isotherm gave the best fit to the data predicted by the ANN model, with the highest correlation coefficient of 0.996, an adjusted $R^2$ of 0.993, and the lowest ARE and RMSE values. These results are consistent with the experimental results. In addition, the predicted maximal adsorption capacity (Qm = 28.92) was very close to the experimental value (Qm = 28.79), a difference of 0.13. As a result, the developed ANN model is a valid and appropriate model for nickel-adsorption data prediction.
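As an illustration of the quantities compared here, the sketch below evaluates the Redlich-Peterson isotherm in its commonly used form, $q_e = K_R C_e / (1 + a_R C_e^{g})$, together with one common definition of the average relative error (ARE). The exact expressions used in the study are not given in the text, so both formulas are assumptions, and all parameter values and data points are placeholders rather than the values of Table 3.

```python
def redlich_peterson(c_e, k_r, a_r, g):
    """Redlich-Peterson isotherm: uptake q_e at equilibrium concentration c_e."""
    return k_r * c_e / (1.0 + a_r * c_e ** g)


def average_relative_error(q_exp, q_calc):
    """ARE in percent: mean of |q_exp - q_calc| / q_exp, times 100."""
    n = len(q_exp)
    return 100.0 / n * sum(abs(e - c) / e for e, c in zip(q_exp, q_calc))


# Placeholder parameters and data (illustration only).
q_example = redlich_peterson(2.0, k_r=3.0, a_r=0.5, g=1.0)      # 6 / (1 + 1) = 3.0
are_example = average_relative_error([10.0, 20.0], [9.0, 22.0])  # about 10 %
```

When the exponent g equals 1, the expression reduces to the Langmuir form, which is why the Redlich-Peterson model is often described as a hybrid of the Langmuir and Freundlich isotherms.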

Conclusions
In this work, an ANN model was developed for nickel-adsorption data prediction. The trained network was transformed into a simple mathematical equation that correlates the input with the output data using the weights and biases extracted from the model. The ANN model showed a high $R^2$ of 0.98, which indicates the high accuracy of the model. In addition, the model was tested for isotherm data prediction, where the predicted data were in agreement with the experimental data. The developed ANN model was accurate and appropriate for nickel-adsorption data prediction.