Prediction of Arsenic Removal from Contaminated Water Using Artificial Neural Network Model

Arsenic is a deleterious heavy metal that is usually removed from polluted water based on adsorption processes. The latest mode of modeling such a process is to implement artificial intelligence (AI). In the current work, a new artificial neural network (ANN) model was developed to predict the adsorption efficiency of arsenate (As(III)) from contaminated water by analyzing different architectures of an adaptive network-based fuzzy inference system (ANFIS). The database for the current study consisted of the experimental data of the adsorption of As(III) by different adsorbents/biosorbents. The data were randomly divided into two sets: 70% for the training phase and 30% for the testing phase. Four statistical evaluation metrics, namely, mean square error (MSE), root-mean-square error (RMSE), Pearson’s correlation coefficient (R%), and the determination coefficient (R2) were used for the analysis. The best performing ANFIS model was characterized with the average values of 97.72%, 0.9333, 0.137, and 0.274 of R%, R2, MSE, and RMSE, respectively. In addition, a parametric investigation revealed that the most dominating parameters on the adsorption process efficiency were in the following order: pH, As initial concentration, contact time, adsorbent dosage, inoculum size, and temperature. The results of the current study would be useful in the adsorption process scale-up and optimization.


Introduction
Heavy metals, discharged even at low concentrations into natural water bodies, have threatening impacts on human life and the environment. One such toxic heavy metal is arsenic (As), which is considered an environmental hazard [1]. The immediate symptoms of acute As poisoning include abdominal pain and vomiting, and it can lead to death in extreme cases. Long-term exposure can cause diabetes, pulmonary disease, cardiovascular disease, cancer, and skin lesions [2]. Various technologies, such as coagulation-filtration [3], membrane separation [4,5], ion exchange [6], adsorption [7,8], and hybrid membrane systems [9,10] have been employed to remove As from contaminated water. Among these methods, adsorption is probably the most effective separation method for removing hazardous heavy metals such as As from water. The adsorption process has long been used in the water and wastewater industries for its ease of handling, minimal sludge production, cost-effectiveness, and regeneration capability [11].
An integral part of applying an adsorbent to remove a heavy metal ion from a contaminated aqueous solution is developing a process model. The traditional means of modeling adsorption is to obtain the parameters of the kinetic and isotherm models using experimental data at optimum conditions. However, the optimum values of the adsorption variables, such as pH, adsorbent dosage, adsorbate initial concentration, contact time, and temperature, are investigated independently [7,12,13,14]. Considering the analytical error, uncertainty, and independent investigation associated with the traditional experimental work, different artificial intelligence (AI)-based machine learning (ML) models are being used to correlate all input variables directly to the output parameter (contaminant removal percentage). A few examples of such applications are presented in Table 1.
The most widely used ML algorithm for modeling various adsorption processes is the artificial neural network (ANN) [28,29]. It has been used worldwide for classification and prediction in a wide range of real-time adsorption applications [14][15][16][17][18][22][23][24][25][26][27]. The ANN correlates the input(s) to the output(s) with nodes arranged in single or multiple hidden layers. The nodes in one layer are connected with weight functions to the nodes in the next layer. An activation function is used to non-linearly map the inputs to the outputs.
In addition to ANN, other machine learning algorithms (MLAs) such as decision trees, support vector regression, random forest, genetic model, particle swarm optimization, and adaptive network-based fuzzy inference system (ANFIS) have been used for modeling various adsorption processes [18][19][20][21][22][25][26][27]. Among these MLAs, the ANFIS offers the ability to capture the non-linear structure of a process, adaptation capability, and rapid learning capacity. It is a sub-category of ANN that integrates the principles of neural networks and fuzzy logic to acquire the advantages of both in a single computational platform. Unlike other MLAs, ANFIS has not been used to model the adsorption of As. Although it was applied earlier for modeling the adsorption of other heavy metals, such as copper and chromium [25][26][27], those models are based on a single dataset with a relatively high prediction error. Therefore, a generalized model, using different datasets with different adsorbents, needs to be developed.
In this work, a highly efficient ANFIS model was developed to predict the adsorption removal efficiency of arsenate (As(III)) from aqueous solutions. A parametric investigation was also performed to identify the contribution of the input parameters in predicting the output parameter (As removal %).

Database
Seven experimental datasets were selected from the literature [15][16][17][18][30][31][32] for the current study. These studies were considered suitable, as a variety of adsorbents or biosorbents were used to remove As(III) from contaminated aqueous solutions or ground/wastewater. In general, the adsorption or biosorption experimentation was followed by modeling As removal based on several input parameters, including initial As concentration, adsorbent dose, pH, contact time, agitation speed, temperature, and others.

ANFIS Model
The ANFIS model was introduced as a combination of the neural network model and fuzzy logic [33][34][35][36][37][38]. The ANFIS model showed a better performance than ANN in processing small training datasets [39]. A typical architecture of the ANFIS model has five layers: fuzzification, rule, normalization, defuzzification, and aggregation [40][41][42][43]. In this study, the Takagi-Sugeno function and if-then rules were used to represent the non-linear relationship between the input and the output parameters [44,45]. A fuzzy inference system with two inputs and one output is used in describing the ANFIS model. The rules of the fuzzy inference system are as follows:

$$\text{Rule 1: if } x \text{ is } A_1 \text{ and } y \text{ is } B_1, \text{ then } f_1 = p_1 x + q_1 y + r_1 \qquad (1)$$

$$\text{Rule 2: if } x \text{ is } A_2 \text{ and } y \text{ is } B_2, \text{ then } f_2 = p_2 x + q_2 y + r_2 \qquad (2)$$

where: $A_1$, $A_2$, $B_1$, and $B_2$ are the fuzzy sets; $x$ and $y$ are the input variables; $p_1$, $p_2$, $q_1$, $q_2$, $r_1$, and $r_2$ are the linear polynomial parameters; and $f$ is the output of the ANFIS model. The layers of the ANFIS model used for the current study are described below and illustrated in Figure 1.
Layer 1: This layer is called the fuzzification layer. In this layer, the fuzzy inference system uses a membership function to convert the input parameters into a fuzzy set. Among different types of membership functions, the Gaussian-shape membership function was applied in this work to map the training values into the range [0,1].

$$O_{1,i} = \mu_{A_i}(x), \qquad O_{1,i} = \mu_{B_i}(y), \qquad \text{for } i = 1, 2$$

where: $O_{1,i}$ is the output of layer 1; $\mu$ is the Gaussian-shape membership function; $A_i$ and $B_i$ represent the linguistic variables; $x$ and $y$ are the input variables; and $\sigma_i$, $b_i$, and $c_i$ are the constants of the Gaussian-shape function.

Layer 2: This layer has fixed nodes. The output of this layer is obtained by multiplying all the incoming signals. The output is represented by $w_i$:

$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \qquad i = 1, 2$$

Layer 3: The nodes in this layer are fixed and labeled as N. In this layer, the outputs are obtained by normalizing the firing strengths of the inference system rules.

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \qquad i = 1, 2$$

where: $O_{3,i}$ is the output of layer 3; and $\bar{w}_i$ is the normalized firing strength of the inference system rules.

Layer 4: This layer has adaptive nodes. It has three parameters used to adjust the adaptive nodes.

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right)$$

where: $O_{4,i}$ is the output of layer 4; and $p_i$, $q_i$, and $r_i$ are the parameters of the inference system.

Layer 5: This layer is the inference layer, which is used to obtain the overall output based on the previous layers.

$$O_5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
Earlier, the ANFIS model demonstrated a good performance in predicting the removal efficiency of some heavy metals other than As from aqueous solutions [25][26][27].
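The five layers described above can be sketched as a single forward pass for the two-input, two-rule case. This is a minimal illustration and not the authors' MATLAB implementation: the function names, the parameter values in the usage note, and the two-parameter Gaussian form exp(-0.5((x - c)/sigma)^2) are all assumptions made for the example.

```python
import math


def gaussian_mf(x, c, sigma):
    """Gaussian membership value in (0, 1]; assumed two-parameter form."""
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)


def anfis_forward(x, y, mf_a, mf_b, consequents):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    mf_a, mf_b:   lists of (center, sigma) pairs for fuzzy sets A_i and B_i.
    consequents:  list of (p_i, q_i, r_i) tuples, one per rule.
    Returns the overall output and the normalized firing strengths.
    """
    # Layer 1: fuzzification of both inputs
    mu_a = [gaussian_mf(x, c, s) for c, s in mf_a]
    mu_b = [gaussian_mf(y, c, s) for c, s in mf_b]
    # Layer 2: firing strength of each rule (product of memberships)
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalization of the firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted linear consequents f_i = p_i*x + q_i*y + r_i
    weighted = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: aggregation into a single output
    return sum(weighted), w_bar
```

For example, with hypothetical sets centered at 0 and 1 (sigma = 1) for both inputs and the point x = y = 0.5, both rules fire equally, so the output is the plain average of the two rule consequents.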

Performance Evaluation
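A set of statistical analysis metrics comprising mean square error (MSE), root-mean-square error (RMSE), Pearson's correlation coefficient (R%), and the determination coefficient (R2) was used to evaluate the developed ANFIS model in this study. These statistical parameters are defined as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i,\mathrm{exp}} - y_{i,\mathrm{pred}}\right)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i,\mathrm{exp}} - y_{i,\mathrm{pred}}\right)^2}$$

$$R\% = \frac{n\sum_{i=1}^{n} y_{i,\mathrm{exp}}\, y_{i,\mathrm{pred}} - \sum_{i=1}^{n} y_{i,\mathrm{exp}} \sum_{i=1}^{n} y_{i,\mathrm{pred}}}{\sqrt{\left[n\sum_{i=1}^{n} y_{i,\mathrm{exp}}^2 - \left(\sum_{i=1}^{n} y_{i,\mathrm{exp}}\right)^2\right]\left[n\sum_{i=1}^{n} y_{i,\mathrm{pred}}^2 - \left(\sum_{i=1}^{n} y_{i,\mathrm{pred}}\right)^2\right]}} \times 100$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_{i,\mathrm{exp}} - y_{i,\mathrm{pred}}\right)^2}{\sum_{i=1}^{n}\left(y_{i,\mathrm{exp}} - y_{\mathrm{avg,exp}}\right)^2}$$

where: $y_{i,\mathrm{exp}}$ is the experimental value of the data point $i$; $y_{i,\mathrm{pred}}$ is the predicted value of the data point $i$; $y_{\mathrm{avg,exp}}$ is the average of the experimental values; and $n$ is the total number of data points.
Appl. Sci. 2022, 12, 999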
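As an illustration, the four metrics can be computed together in a few lines. The sketch below assumes the standard definitions (MSE as the mean squared residual, RMSE as its square root, Pearson's coefficient expressed as a percentage, and R2 as one minus the ratio of residual to total sum of squares); the function name and dictionary keys are chosen for the example only.

```python
import math


def evaluation_metrics(y_exp, y_pred):
    """Return MSE, RMSE, Pearson's R (in %) and R2 for paired samples."""
    n = len(y_exp)
    # Residual sum of squares between experimental and predicted values
    residual_sq = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    mse = residual_sq / n
    rmse = math.sqrt(mse)

    # Pearson's correlation coefficient, computational (sum-based) form
    sum_e, sum_p = sum(y_exp), sum(y_pred)
    sum_ep = sum(e * p for e, p in zip(y_exp, y_pred))
    sum_e2 = sum(e * e for e in y_exp)
    sum_p2 = sum(p * p for p in y_pred)
    r_percent = 100.0 * (n * sum_ep - sum_e * sum_p) / math.sqrt(
        (n * sum_e2 - sum_e ** 2) * (n * sum_p2 - sum_p ** 2)
    )

    # Determination coefficient relative to the experimental mean
    mean_e = sum_e / n
    total_sq = sum((e - mean_e) ** 2 for e in y_exp)
    r2 = 1.0 - residual_sq / total_sq

    return {"MSE": mse, "RMSE": rmse, "R%": r_percent, "R2": r2}
```

Note that a prediction with a constant offset from the experimental values still gives R% = 100 while R2 drops, which is one reason to report both alongside MSE and RMSE.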

Parametric Importance Analysis
After developing an AI model, it is necessary to conduct a parametric importance analysis. It allows identifying the most dominating parameters and checking if any input parameter can be omitted to simplify the model. In this work, an importance investigation of each parameter was performed for each dataset. Multiple non-linear regressions were used to obtain the relationship between input and output parameters. The value of the Pearson's correlation coefficient (R%) was used to evaluate the relative impact of each input parameter on the output parameter (As removal percentage).
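The ranking idea can be sketched as follows. This is a deliberate simplification: it scores each input by the absolute value of its plain (linear) Pearson coefficient against the removal percentage, rather than by the multiple non-linear regressions used in this work, and the toy data and names are hypothetical.

```python
import math


def pearson_percent(xs, ys):
    """Pearson's correlation coefficient between xs and ys, in percent."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return 100.0 * cov / math.sqrt(var_x * var_y)


def rank_inputs(inputs, removal):
    """Rank input parameters by |R%| against the removal percentage."""
    scores = {name: abs(pearson_percent(column, removal))
              for name, column in inputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not taken from any of the seven datasets
toy_inputs = {
    "pH": [3.0, 5.0, 7.0, 9.0],
    "contact_time_min": [10.0, 60.0, 30.0, 20.0],
}
toy_removal = [40.0, 60.0, 80.0, 95.0]
ranking = rank_inputs(toy_inputs, toy_removal)
```

In this toy case `ranking` lists pH first, because the removal values rise almost linearly with pH while they show little relationship with the contact times.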

Results and Discussion
As mentioned earlier, an efficient ANFIS model was developed based on a database comprising the experimental measurements published in seven independent studies. The computational platform used for the modeling was MATLAB 2020. The input variables used for the modeling were As initial concentration, pH, temperature, contact time, adsorbent dosage, solution volume, agitation speed, inoculum size, and flow rate. The removal percentage of arsenate from the contaminated water was the output parameter. The log method was used for normalizing the data. The prediction capability of the developed model was evaluated using four statistical parameters: MSE, RMSE, R%, and R2.
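The data preparation steps can be sketched as below. Two points are assumptions, since the text does not spell them out: the "log method" is taken to mean a base-10 logarithmic transform of strictly positive values, and the random 70%/30% division is done by shuffling with a fixed seed. The function names are illustrative.

```python
import math
import random


def log_normalize(values):
    """Base-10 log transform; assumes every value is strictly positive."""
    return [math.log10(v) for v in values]


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Randomly split rows into training and testing subsets."""
    rng = random.Random(seed)       # fixed seed for a repeatable split
    shuffled = list(rows)           # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

With ten rows, `train_test_split` returns seven training rows and three testing rows, and every row appears in exactly one of the two subsets.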

Development of the ANFIS Model
In this work, the scatter partition method was applied because of the small sizes of some experimental datasets. It depends on fuzzy clustering of the data subgroups to find the optimum membership values. The architecture of the ANFIS model is presented in Figure 2, while the values of the corresponding model parameters are listed in Table 2.

Training Phase of the ANFIS Model
The training process is an essential stage to develop an efficient model using experimental data. For this purpose, 70% of the data were used to train the developed ANFIS model. As mentioned earlier, the ANFIS model is based on fuzzy if-then rules and fuzzy reasoning. Therefore, the results of the ANFIS model have no deviation, and this is one of the advantages of this model.
The performance of the developed ANFIS model is shown in Figure 3, and the values of the evaluation metrics are presented in Table 3. The training results show near-complete agreement between the predicted values and the experimental measurements of the arsenate removal percentage for all datasets. High values of R% (≥99.99%) and R2 (≥0.9995), together with very low values of MSE and RMSE, indicate that the ANFIS model is ready to be tested.

Testing of the ANFIS Model
Thirty percent (30%) of the data were used as unseen data to test and validate the current ANFIS model. Its performance in the testing stage is illustrated in Figure 4 and Table 4. As shown in Figure 4, an excellent agreement between the predicted values and the targeted (experimental) values was achieved. Similar to the training phase, high values of R% (97.72%) and R2 (0.9333) and low values of MSE (0.137) and RMSE (0.274) were attained. This kind of agreement indicates the robustness of the developed ANFIS model in predicting the As removal from polluted water by the adsorption processes.

Even though ANFIS was used to model the adsorption of other heavy metals such as chromium and copper [25][26][27], it had not been applied earlier to model As adsorption. Its application to As adsorption is a novel contribution of the current study. A comparison of the performance of different MLAs is presented in Table 5. The MLAs used before to model As adsorption were ANN, random forest (RF), and support vector regression (SVR) [15][16][17][18][19]. The primary challenge the researchers faced in using ANN was the small sizes of the datasets [15][16][17][18]. They overcame the challenge by producing synthetic data using a specific interpolation algorithm. However, they did not validate the utility of this kind of artificial data. On the other hand, Hafsa et al. [21] tested two non-ANN models, RF and SVR, for modeling As adsorption. Even though these MLAs could predict the trends quite well (R2 > 0.93), the errors or data dispersions were significantly high (RMSE > 2.5). In comparison, the current ANFIS model not only could predict the trends of the data more accurately (R2 > 0.93) but also yielded significantly less error (RMSE < 0.48). It should be noted that Hafsa et al. [21] used the same datasets as those used for the current study.

Parametric Importance Analysis
The impact of nine parameters on the As removal percentage was investigated in this study. However, not all experimental parameters are expected to have similar impacts on predicting the output. Therefore, the relative importance of each parameter in each dataset was quantified using the values of R%. The higher the R% value, the higher the importance of the parameter. Among all the investigated parameters, pH, the As initial concentration, and contact time were found to be the most influential, with different impacts. While pH scored the highest at 575, the initial concentration and contact time scored 568 and 445, respectively. Figure 5 shows the importance of each input parameter on the efficiency of As adsorption for all datasets. The current investigation provides a comparative ranking of the most dominating parameters in modeling the arsenic adsorption process: pH > arsenate initial concentration > contact time > adsorbent mass > inoculum size > temperature.
Generally, as the pH increases, the concentration of H+ ions decreases. The H+ ions in solution can compete for the available active sites, thus reducing the active sites available for the As ions. Therefore, increasing pH can increase the removal percentage of As [46]. However, there is an optimum value beyond which any further increase in pH will not affect the adsorption efficiency. Therefore, the value of pH must be optimized before conducting the batch adsorption experiments.
In addition, the arsenate removal percentage can be calculated using Equation (14):

$$\text{Removal}\ (\%) = \frac{C_o - C_f}{C_o} \times 100 \qquad (14)$$

where: $C_o$ is the As initial concentration (mg/L); and $C_f$ is the As concentration at the end of the experiment (mg/L). For the same adsorbent dosage (mass), as the adsorbate initial concentration increases, the removal percentage decreases [46]. An adsorbent has a limited number of active adsorption sites at which As ions are adsorbed, so the active sites become insufficient as the As concentration increases. Therefore, the initial concentration has a significant impact on the adsorption process.
Moreover, contact time is one of the dominating parameters in the adsorption process. As the contact time increases, the adsorption efficiency increases until reaching equilibrium, at which no further mass transfer will occur. The optimum contact time must be determined before conducting experiments.
Adsorbent dosage ranked fourth in terms of relative importance with a total score of 305. As the adsorbent dosage increases, the mass transfer surface area also increases for the same particle size, which enhances the adsorption efficiency. However, an economic adsorbent mass, i.e., the minimum mass with the highest adsorbability, should be used to achieve the optimum adsorption. Similarly, when using biosorbents, the adsorption efficiency increases as the inoculum size increases. The inoculum size ranked fifth in terms of relative importance with a total score of 270.
Furthermore, the nature and the spontaneity of the adsorption process have a significant effect on the removal efficiency. Temperature plays a key role in the determination of the thermodynamic properties of the adsorption process and thus has a non-negligible impact. Temperature ranked sixth in terms of relative importance with a total score of 237.
The rest of the parameters, namely, agitation speed, flow rate, and solution volume, showed less significant impacts on the As removal efficiency. Their evaluation scores varied between 63 and 76. That is, all the parameters selected for the current study contributed to predicting the output, although to varying degrees.
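The removal percentage referred to as Equation (14) can be illustrated with a short calculation, assuming its usual form, the drop in concentration divided by the initial concentration, times 100. The function name and the concentrations used below are hypothetical.

```python
def removal_percentage(c_initial, c_final):
    """Removal (%) from initial and final As concentrations in mg/L."""
    if c_initial <= 0:
        raise ValueError("initial concentration must be positive")
    return (c_initial - c_final) / c_initial * 100.0


# Hypothetical case: the adsorbent removes the same 7.5 mg/L in both runs
low_start = removal_percentage(10.0, 2.5)    # initial 10 mg/L
high_start = removal_percentage(20.0, 12.5)  # initial 20 mg/L
```

Here `low_start` is 75.0 and `high_start` is 37.5: when a fixed number of active sites caps the amount adsorbed, doubling the initial concentration halves the removal percentage, which matches the trend described for a fixed adsorbent dosage.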

Conclusions
In this work, the application of the ANFIS model was investigated to predict the arsenate removal efficiency from aqueous solutions using different adsorbents at different conditions. Based on the promising results of this study, the following conclusions can be drawn:
• An efficient ANFIS model was successfully developed to predict the adsorption removal of arsenate from water. High values of R% and R2 with low values of MSE/RMSE were reported for both training and testing phases.
• The parametric investigation of the current study can be used to optimize the parameters and, hence, increase the removal efficiency. The relative ranking of the most dominating parameters in the modeling of the arsenate adsorption process was as follows: pH, arsenic initial concentration, contact time, adsorbent mass, inoculum size, and then temperature.
• Prediction of the removal of single or multi-component heavy metal(s) can be investigated using an appropriate ANFIS model.
• Deep learning can solve complex problems that need to find hidden patterns from the available input data. Therefore, advanced artificial intelligence models such as the Long Short-Term Memory (LSTM) model are highly recommended for future use.
• Developing a smart system as an alternative tool to predict the arsenic (or other heavy metal(s)) removal from contaminated drinking/ground/irrigation/wastewater is very important to save cost and time. Further investigation in this regard is currently under consideration.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.