Artificial Neural Network Modeling for Prediction of Dynamic Changes in Solution from Bioleaching by Indigenous Acidophilic Bacteria

Abstract: In this study, indigenous acidophilic bacteria living in mine drainage and a hot acidic spring were collected and used for bioleaching experiments. The incubated bacteria were inoculated on various minerals. Changes in pH, Eh, and heavy metal concentrations were examined over time and compared with uninoculated controls. The bioleaching behavior varied greatly depending on the origin of the microorganisms, the type of mineral, the temperature, and other conditions. We applied artificial neural network (ANN) modeling to express and predict these complex bioleaching trends, developing ANN models that predict the changes in pH, Eh, and heavy metal ion concentrations, and we evaluated their predictability. The results confirm that bioleaching can be predicted with ANN models. However, we also identified limitations, showing that further testing and application of the ANN models under more diverse experimental conditions are needed to improve their predictability.


Introduction
Bioleaching is a method of metal extraction via biogenically produced metabolites and is thus considered a green alternative to chemical leaching processes [1]. The application of microbial processes to extract metals from nearly insoluble ores is commonly referred to as bioleaching, and understanding of this bacterial leaching has advanced rapidly over the past decades. The main metals extracted include copper, cobalt, nickel, zinc, and uranium [2]. Although bioleaching is an eco-friendly technology for the recovery of useful metals, it is difficult to predict because of the various factors affecting the trend [3]. According to our previous studies, the tendency of bioleaching varied greatly depending on many conditions such as the origin of microorganisms, the composition of minerals, and the temperature.
An artificial neural network (ANN) is a data-driven, self-adaptive technique that does not require an understanding of the complex nature of the underlying processes [4]. The ANN architecture is composed of an input layer, hidden layers, and an output layer, each made up of neurons (perceptrons), to model nonlinear complex systems. The neurons are nonlinearly connected to the neurons of the next layer via transfer functions, weights, and biases [5]. The ANN model produces the required response via modification of the weights and biases of the neurons in the network [6]. In ANN model development, the training process is initiated by generating output values from input values through internal calculations. Based on the differences between the calculated output values and the observed (target) output values, backpropagation is performed to reduce these differences by adjusting the weights and biases. This training process is called feedforward backpropagation and is the most widely used method in ANN architecture [7]. Because of this, an ANN can predict outputs efficiently even for phenomena in which the interrelationship between inputs and outputs is difficult to understand. Therefore, ANN can be used to predict bioleaching when it is difficult to clarify the effects of the various factors.
The prediction of metal bioleaching using ANN has been reported by several researchers. Laberge [8] used neural network procedures to predict the metal (Cu, Zn, and Cd) solubilization percentages in municipal sludge treated with a continuous process using Thiobacillus ferrooxidans. Pazouki [9] optimized the amount of iron removed by bioleaching a kaolin sample with high iron impurities using Aspergillus niger. Abdollahi [10] proposed the application of ANN to predict the effect of operating parameters on the dissolution of Cu, Mo, and Re from molybdenite concentrates via mesoacidophilic bioleaching. The initial pH, solid concentration, inoculation percent, and time (days) were used as input operating parameters for prediction. Vyas [11] used ANN to predict Mo bioleaching from experimental data on a spent catalyst leached with Escherichia coli.
The aim of this study was to predict the dynamic characteristics of the leachate from bioleaching in batch conditions using ANN modeling. The bioleaching data used in this study were collected from our eight previous studies on bioleaching [12][13][14][15][16][17][18][19]. Most of the results of the eight studies were used for ANN learning and optimization. We also evaluated the predictability of the optimized ANN models by applying each of them to one selected result, not used in optimization, for each output variable.

Information on Indigenous Acidophilic Bacteria Sampling and Cultivation of Reference Studies
Our eight previous studies on bioleaching were performed with indigenous acidophilic bacteria. From the 16S rRNA sequence analysis, the indigenous acidophilic bacteria were identified as Acidithiobacillus ferrooxidans, which are acidophilic iron-oxidizing bacteria. The collected bacteria were cultivated in ATCC 125 medium, composed of a mineral salts medium and an energy source. For the mineral salts medium of ATCC 125 (Thiobacillus medium), 0.2 g/L of (NH4)2SO4, 0.5 g/L of MgSO4·7H2O, 0.25 g/L of CaCl2, 3.0 g/L of KH2PO4, and 5.0 mg/L of FeSO4 were dissolved in 1.0 L of distilled water. As the energy source, 1.0 g/L of elemental sulfur was used. The growth medium, before inoculation with the indigenous acidophilic bacteria, and all glassware were sterilized in an autoclave (SW-90AV100). Several sub-culturing strategies, including temperature, sub-culture cycles, pH, and toxic impact, were successfully used. To simplify the ANN application, the indigenous acidophilic bacteria were divided into two groups according to the characteristics of the collection areas: those collected from mine drainage and those collected from acidic hot springs. The bacteria from mine drainage were collected from mine drainage located in Hwasun-gun (Jeollanam-do, Korea) [11], Samcheok-si (Kangwon-do, Korea) [18], and Goseong-gun (Gyeongsangnam-do, Korea) [19]. The bacteria from the acidic hot spring were collected near the Hatchobaru thermal electrical plant (Oita Ken, Japan) [17].

Information on Bioleaching Experiments of Reference Studies
Bioleaching experiments were conducted to investigate the effect of bioleaching by indigenous acidophilic bacteria on the extraction of metals from various mine wastes such as pyrite (collected from the abandoned coal mine in Hwasun-gun, Jeollanam-do, Korea), galena (purchased from Australia or collected from an abandoned jade mine in Gwangyang-si, Jeollanam-do, Korea), sphalerite (purchased from Australia), pyrrhotite (collected from the abandoned mine in Samcheok-si, Kangwon-do, Korea), etc., containing the metals Fe, Cu, Pb, Zn, etc. Batch experiments were prepared in 500-mL flasks containing 150 mL of culture medium (140 mL fresh growth medium + 10 mL inoculation medium) amended with specific minerals at a specific dosage. Control experiments were carried out to check the pH and Eh changes and the heavy metal leaching in the growth medium. At regular time intervals, liquid samples were withdrawn and filtered with a 0.45-µm filter to measure pH, Eh, and heavy metal concentrations. The pH and Eh of the solutions were measured using a pH meter (Eijelkam, multi-parameter analyzer, Belgium). Heavy metal concentrations in the bioleached solutions were measured by inductively coupled plasma mass spectrometry (ICP-MS) (ELAM DRCa; PerkinElmer, USA). The bioleaching using indigenous acidophilic bacteria of mine waste leached 62-92% and 21-55% for Fe, suggesting the feasibility of bioleaching. A comparison of the moderate thermophiles (40-60 °C) with the mesophiles (28-37 °C) showed that the moderate thermophiles were more effective. It seems that the higher temperature enhances metal solubilization by increasing the rate of reaction. In addition, bacteria adapted to a toxic impact (CuSO4) were more effective in bioleaching than unadapted ones. Specific conditions are organized in Table 1.

Artificial Neural Network Modeling
Data for ANN modeling were collected from our eight previous studies on bioleaching and organized into input variables and output variables. If all factors were used as input variables, the amount of data required for learning the ANN model would be enormous. Therefore, only the factors considered important were selected as input variables. Among the various characteristics of microorganisms, the origin of the bacteria was selected as an input variable because it was considered to have the greatest impact. Our previous studies have confirmed that temperature and the dosage between mineral and medium have a great effect on bioleaching, so temperature and dosage were also selected as input variables. In addition, because the preference of microorganisms varies with the composition of minerals and the bioleaching tendency varies significantly as a result, the ratio of each element making up the mineral was selected as an input variable. Finally, we selected time as an input variable because we wanted to predict bioleaching over time through the ANN model. For those reasons, nine factors were selected as input variables: origin of microorganisms (none = 0, mine drainage = 1, hot spring water = 2), temperature (°C), dosage between solution and mineral (g/L), Pb percent ratio of mineral, Fe percent ratio of mineral, S percent ratio of mineral, Zn percent ratio of mineral, Cu percent ratio of mineral, and bioleaching time (day). When input values are normalized, the relative effect of each factor can be examined. However, in this study, some inputs, such as the origin of bacteria and the bioleaching time, were difficult to normalize in the same way as the other inputs. Therefore, these values were used directly without normalization.
As output variables, pH, Eh, and the Cu, Pb, Zn, and Fe concentrations of the leachate at a specific bioleaching time were selected. Since the observed output variables differ between studies, the data used for ANN modeling in each study are organized in Table 2. None of the input or output variables were normalized. For the input variables, normalization was not carried out because the differing characteristics of the variables meant that some could not be normalized on a minimum-maximum basis. For the output variables, no maximum can be set for the concentration of heavy metal ions that are continuously leached, so they were not normalized either.
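As a minimal sketch of how one row of the nine input variables listed above could be encoded, the following Python snippet may help. The class name, field names, and the numeric values in the example are hypothetical illustrations and are not taken from the reference studies; only the integer codes for the origin of microorganisms follow the text.

```python
from dataclasses import dataclass

# Integer codes for the origin of microorganisms, as listed in the text.
ORIGIN_CODES = {"none": 0, "mine drainage": 1, "hot spring water": 2}


@dataclass
class BioleachingInput:
    """One input row for the ANN; all field names are hypothetical."""
    origin: str             # key of ORIGIN_CODES
    temperature_c: float    # temperature (deg C)
    dosage_g_per_l: float   # dosage between solution and mineral (g/L)
    pb_pct: float           # Pb percent ratio of mineral
    fe_pct: float           # Fe percent ratio of mineral
    s_pct: float            # S percent ratio of mineral
    zn_pct: float           # Zn percent ratio of mineral
    cu_pct: float           # Cu percent ratio of mineral
    time_day: float         # bioleaching time (day)

    def to_vector(self):
        """Return the nine raw (non-normalized) inputs in a fixed order."""
        return [
            float(ORIGIN_CODES[self.origin]),
            self.temperature_c,
            self.dosage_g_per_l,
            self.pb_pct,
            self.fe_pct,
            self.s_pct,
            self.zn_pct,
            self.cu_pct,
            self.time_day,
        ]


# Illustrative values only (not measured data).
example_row = BioleachingInput("mine drainage", 30.0, 10.0,
                               0.0, 46.5, 53.5, 0.0, 0.0, 7.0)
example_vector = example_row.to_vector()
```

Keeping the origin as an integer code mirrors the statement that this input was used directly without normalization.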
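MATLAB software (2019b, MathWorks, Natick, MA, USA) was used for ANN model composition and optimization. In the ANN model, a hyperbolic tangent sigmoid transfer function (tansig) and a linear transfer function (purelin) were used for the hidden and output layers, respectively:
$$\mathrm{tansig}(n) = \frac{2}{1 + e^{-2n}} - 1, \qquad \mathrm{purelin}(n) = n$$
With the tansig and purelin transfer functions, the general form of the ANN model for a single hidden layer is presented as follows:
$$f_k = \mathrm{purelin}\left(\sum_{j=1}^{m} w^{(2)}_{kj}\,\mathrm{tansig}\left(\sum_{p=1}^{9} w^{(1)}_{jp}\,x_p + b^{(1)}_j\right) + b^{(2)}_k\right)$$
where k is the order of the output variables, f_k is the predicted value of the kth output variable, x_p is the pth input variable, m is the number of hidden-layer neurons, and w and b are the weights and biases of the hidden layer (superscript 1) and the output layer (superscript 2). The schematic structure of the ANN model used in this study is illustrated in Figure 1. As an indicator of the optimal weights (w) and biases (b) and of the ANN topology, the mean squared error (MSE) over the n observations of a dataset was chosen (Equation (3)):
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - f_i\right)^2 \tag{3}$$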
where y_i is the observed output value and f_i is the predicted value from the ANN model. The Levenberg-Marquardt algorithm was employed to optimize the ANN model by minimizing the MSE value [20]. Although a single model that predicts all output variables simultaneously would be preferable, we developed a separate model for each output variable because data for all output variables were not available in every dataset used for learning. Each dataset for a specific output variable was randomly divided into three subsets: training (60%), validation (20%), and testing (20%). Using the Levenberg-Marquardt algorithm, the ANN model was iterated to reduce the MSE between the predicted and observed values of the training subset by adjusting the w and b matrices. As the w and b matrices were adjusted, the MSE between the predicted and observed values of the validation subset was also calculated. The w and b matrices with the lowest MSE on the validation subset were chosen as the optimized values for the ANN model. After the ANN model was optimized, the testing subset was used to compare the predicted values with the observed values and thus assess the performance of the developed ANN model [21,22].
This optimization process is performed for a given topology. Since the topology is also a factor that greatly affects the predictability of ANN models, we optimized and compared ANN models for various topologies in this study. Using nine input variables and one output variable, the number of hidden-layer neurons was varied to determine the optimal topology. A single hidden layer with the number of neurons varying from five (9:5:1) to fifteen (9:15:1) was tested for the network topology. The topology with the lowest MSE value was selected as the best topology for each output variable.
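The forward calculation, the MSE indicator, and the random 60/20/20 division described above can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the function names, argument layout, and the seed are hypothetical, the study itself used MATLAB, and the Levenberg-Marquardt weight update is deliberately not implemented here.

```python
import math
import random


def tansig(n):
    """Hyperbolic tangent sigmoid, 2 / (1 + exp(-2n)) - 1, which equals tanh(n)."""
    return math.tanh(n)


def purelin(n):
    """Linear transfer function."""
    return n


def predict(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a single-hidden-layer network with one output.

    w_hidden: one weight list per hidden neuron (each as long as x)
    b_hidden: one bias per hidden neuron
    w_out:    one weight per hidden neuron
    b_out:    scalar output bias
    """
    hidden = [
        tansig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return purelin(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    return sum((y - f) ** 2 for y, f in zip(observed, predicted)) / len(observed)


def split_dataset(rows, seed=0):
    """Randomly divide rows into training (60%), validation (20%), testing (rest)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(0.6 * len(shuffled)))
    n_val = int(round(0.2 * len(shuffled)))
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


# Usage with placeholder data: 50 row indices split into 30 / 10 / 10.
train_rows, val_rows, test_rows = split_dataset(range(50))
```

In a training loop, the weights that give the lowest `mse` on `val_rows` would be kept, and `test_rows` would be used only for the final assessment.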
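The topology comparison can be sketched as a simple sweep over the number of hidden-layer neurons. In this sketch, `train_and_score` is a hypothetical callable, assumed to train a 9:m:1 network and return its MSE; the stand-in scorer in the usage lines is an arbitrary toy function and does not represent results from this study.

```python
def select_topology(train_and_score, hidden_sizes=range(5, 16)):
    """Return (best_size, scores), where scores maps hidden size -> MSE.

    train_and_score is a hypothetical callable: given the number of
    hidden neurons m, it trains a 9:m:1 network and returns its MSE.
    """
    scores = {m: train_and_score(m) for m in hidden_sizes}
    best_size = min(scores, key=scores.get)
    return best_size, scores


# Usage with a toy stand-in scorer (not real training, not real data).
best_size, scores = select_topology(lambda m: (m - 9) ** 2 + 1.0)
```

Because training starts from random weights, repeating the sweep several times per topology and comparing the best or average MSE is a common precaution, although the text does not state whether this was done.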
Further verification with data not used for ANN model optimization was also performed to test the predictability of the developed ANN model. The dataset for further verification was chosen from our previous studies, as presented in Table 2.
Table 3 presents the MSE and R value of each topology for each output variable. Regardless of the type of output variable, the optimal topology had more than eight neurons in the hidden layer, probably because a larger number of neurons in the same hidden-layer structure can represent more specific cases. However, the topology with the largest number of neurons was never optimal, apparently because an increasing number of neurons makes it more difficult for the optimization process to find the optimal w and b matrices. Table 4 presents the values of the weights and biases for each layer and neuron from the best topology for each output variable. The diagnostic plots of observed versus predicted values for the ANN models are illustrated in Figure 2, which shows high correlation coefficient (R) values.
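For readers who want to reproduce the R value from observed and predicted values, a minimal sketch follows. It assumes that R is the Pearson correlation coefficient, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_y = sum(observed) / n
    mean_f = sum(predicted) / n
    cov = sum((y - mean_y) * (f - mean_f) for y, f in zip(observed, predicted))
    var_y = sum((y - mean_y) ** 2 for y in observed)
    var_f = sum((f - mean_f) ** 2 for f in predicted)
    return cov / math.sqrt(var_y * var_f)
```

Note that R measures linear association only, so a model with a constant offset from the observations can still reach R close to 1 while having a large MSE; this is one reason to report both indicators.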

ANN Modeling for pH and Eh
The developed ANN models were also used to predict the results of the data for further verification in order to examine their predictability. The pH results and the predictions of the ANN model are shown in Figure 3. The X-axes differ depending on the experimental conditions. The range of pH values does not vary much between studies, and pH was consistently lower when bacteria were present. Therefore, the presence or absence of bacteria has the greatest influence. Temperature and dosage were found to have relatively little effect on pH changes. For this reason, the predictions of the ANN model were similar to the actual results, regardless of the experimental conditions. However, the results from further verification were predicted less well. The model predicted that the pH would be lower if bacteria were present, but, in reality, there was no significant difference depending on the presence or absence of bacteria under these conditions. This is because most of the data used in ANN model optimization are cases where pH varies greatly depending on the presence or absence of bacteria. Because there are no learning data for conditions where pH does not change much, it is difficult to predict the results under such conditions. Therefore, it is necessary to conduct additional experiments under conditions where the presence or absence of bacteria does not significantly affect pH and to use them for an advanced ANN model.
The Eh results and the predictions of the ANN model are shown in Figure 4. The X-axes differ depending on the experimental conditions. As with pH, the Eh values fall within a similar range across studies, so the overall predictions are good. Eh also showed a large difference in trend depending on the presence or absence of bacteria, and the effects of temperature and dosage were again smaller. The data for further verification differ from Experimental Condition #4 only in dosage. Although the difference in dosage has no significant effect on the observed results, the ANN model appears to have assigned a greater effect to the dosage. For the Eh model, a more sophisticated ANN model is expected to be obtained by performing additional experiments with various dosage conditions and including the results in training.
In the figures, the red number marks the result from the data for further verification; the other numbers refer to Table 1.

ANN Modeling for Heavy Metals' Concentration
Unlike pH and Eh, heavy metals have a diverse range of concentrations. Therefore, both the X- and Y-axes differ depending on the experimental conditions. Since the optimization of the model is based on the MSE, the model was more predictive for high-concentration results, which have a large impact on the MSE.
The Cu concentration results and the predictions of the ANN model are shown in Figure 5. Table 2 shows that, under Conditions #7 and #8, the copper content was higher than under the other conditions, resulting in relatively high leaching of copper under both conditions. Unlike pH and Eh, the results change relatively significantly with changes in temperature and dosage. Overall, the higher the temperature and the higher the dosage, the greater the amount of heavy metals leached. For the additional experimental conditions, the observed trend is similar to that of the pyrite in Condition #8. However, the prediction of the ANN model, which reflects the mineral composition, differed significantly from it. To overcome this limitation, experiments with more diverse mineral compositions are needed, and their results should be reflected in ANN training.
The Pb concentration results and the predictions of the ANN model are shown in Figure 6. Unlike copper, Pb leaching did not increase with the Pb content of the mineral. Compared to Condition #2, the amount of elution was lower in Condition #3 despite its high Pb content. We could also confirm that, although the minerals of Conditions #5 and #3 are the same, the leaching in Condition #5 is much greater. For the Pb concentration data, the magnitude of the values varies, but because there are few data the model seems to fit well regardless of the conditions. The results of Condition #2 suggest that elution increases as the temperature rises from 42 to 62 °C. For this reason, under the additional experimental condition of 52 °C, otherwise similar to Condition #2, the ANN model predicted leaching between that at 42 and 62 °C. However, the actual results are similar to those at 42 °C. In other words, the effect of temperature is not linear, so the effect of temperature needs to be examined further.
The Zn concentration results and the predictions of the ANN model are shown in Figure 7. In the case of Zn, the higher the Zn content of the mineral, the more likely it is to be leached. However, while Condition #2 had the highest Zn content and the highest leaching, Condition #7 tended to have higher leaching, even though Conditions #3 and #5 had less Zn content. The effects of temperature and dosage showed a tendency similar to that of the other heavy metals. As with the Cu concentration, the higher the concentration, the more accurate the prediction. In addition, the ANN predictions tend to fluctuate at low concentrations, which seems to be a result of minimizing the MSE, since the ANN is not constrained to increase gradually over time. To overcome these shortcomings, it is necessary to optimize the model by using a Recurrent Neural Network (RNN) structure or by log-transforming the Zn concentration. The prediction for the additional experiment was much lower than the actual value, which, as with Pb, appears to be because the model does not capture the nonlinear effect of temperature seen in the actual measurements.
The Fe concentration results and the predictions of the ANN model are shown in Figure 8. In the case of Fe, the largest leaching occurred in Conditions #7 and #8, which had the highest Fe content of minerals, and leaching tended to increase as temperature and dosage increased. The ANN results for Fe also showed high predictability at high concentrations and fluctuations at low concentrations. Condition #7 and the additional experiment differ in dosage, and because the effect of this change is not linear, the predictions of the ANN model appear to have been significantly off.
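The suggestion of taking logs of the concentration can be illustrated with a small numeric sketch. It compares the squared error of the same 50% overprediction at a low and a high concentration, first on the raw scale and then on a log10 scale. The function names, the `offset` parameter (an assumed guard for zero concentrations), and the numbers are hypothetical and are not measured data.

```python
import math


def squared_errors(observed, predicted):
    """Per-point squared errors on the raw concentration scale."""
    return [(y - f) ** 2 for y, f in zip(observed, predicted)]


def log_squared_errors(observed, predicted, offset=0.0):
    """Per-point squared errors after a log10(value + offset) transform.

    offset is an assumed small constant that would be needed if any
    concentration were zero; it is left at 0.0 in this example.
    """
    return [
        (math.log10(y + offset) - math.log10(f + offset)) ** 2
        for y, f in zip(observed, predicted)
    ]


# Illustrative values: a low and a high concentration, both overpredicted by 50%.
observed = [1.0, 1000.0]
predicted = [1.5, 1500.0]

raw = squared_errors(observed, predicted)         # [0.25, 250000.0]
logged = log_squared_errors(observed, predicted)  # two nearly identical values
```

On the raw scale the high-concentration point contributes a million times more to the MSE, whereas on the log scale both points contribute equally, which is why a log transform could reduce the fluctuation of predictions at low concentrations.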

Conclusions
The ANN models generated predictions that were accurate overall for pH and Eh. However, the predictions of heavy metal ion concentrations were accurate only at high concentrations. In addition, bioleaching tended to increase nonlinearly with temperature and dosage. There were also conditions in which the leaching tendency varied greatly with the experimental conditions even though the content of certain minerals was similar. These differences were also confirmed to be expressible and predictable through the ANN model. Therefore, we confirmed that the pH, Eh, and leaching of heavy metals by bioleaching can be expressed and predicted through ANN.
However, in this study, the characteristics of the input factors were different, and normalization was not carried out, thus the order of the factors' effects and the correlations between the factors were not clear. Therefore, in the future, it is necessary to conduct experiments under conditions where each factor can be normalized and apply these results to ANN. Furthermore, it is also necessary to select the experimental conditions based on the experimental design so that the effects of each factor can be fully examined. In addition, the data were organized differently for each output, forcing the model to be developed individually. If all outputs were identified in the same condition through experimental design, an ANN model could be developed that can predict all outputs simultaneously.
Considering the above-mentioned limitations, adopting the recommendations is expected to result in more sophisticated bioleaching prediction models than the ANN models obtained from this study.

Conflicts of Interest:
The authors declare no conflict of interest.