Precise Modeling of the Protective Effects of Quercetin against Mycotoxin via System Identification with Neural Networks

Cell cytotoxicity assays, such as cell viability and lactate dehydrogenase (LDH) activity assays, play an important role in toxicological studies of pharmaceutical compounds. Precise modeling of cytotoxicity data is also essential for successful drug discovery. The aim of our study was to develop a computational model capable of precise prediction, processing, and data representation of cell cytotoxicity. To this end, we investigated the protective effect of quercetin (QCT) against various mycotoxins (MTXs), including citrinin (CTN), patulin (PAT), and zearalenol (ZEAR), in four different human cancer cell lines (HeLa, PC-3, Hep G2, and SK-N-MC) in vitro. In addition, the protective effect of QCT against the MTXs was verified by modeling their nonlinear protective functions using artificial neural networks (ANNs). The protective model of QCT is built by training the ANNs on sparsely measured experimental data. The neuromodel revealed that QCT pretreatment at doses of 7.5 to 20 μg/mL significantly attenuated the MTX-induced alteration of cell viability and LDH activity in HeLa, PC-3, Hep G2, and SK-N-MC cell lines. The results show that the neuromodel can be used to predict the protective effect of QCT against MTX-induced cytotoxicity in terms of percentage (%) of inhibition, cell viability, and LDH activity.


Introduction
Cell cytotoxicity assays have a central role in toxicology studies in the assessment of the in vivo toxic potential of pharmaceutical chemical agents based on in vitro cell cytotoxicity studies [1]. In vitro cell cytotoxicity assays are generally used for drug screening to detect whether the test molecules affect cell proliferation or display direct cytotoxic effects. Using this concept, it is also possible to measure the protective effect of pharmaceutical lead compounds. Two cytotoxicity assays, the cell viability assay and the lactate dehydrogenase (LDH) leakage assay, are widely used in in vitro toxicology studies. Cell viability is an important indicator for understanding the underlying mechanisms of certain genes, proteins, and pathways involved in cell survival or death after exposure to toxic agents [2]. The LDH leakage assay is based on the measurement of LDH activity in the extracellular medium, where the loss of intracellular LDH and its release into the culture medium is taken as an indicator of cell damage.
The system to be modeled is placed in parallel with a neural network with nonlinear learning capability. The same input signal x is given to both the system and the neural network, and the output of the system is taken as the desired response d for training the neural network. The objective of system identification is then to build a neural network whose response o matches the response y of the system for a given set of inputs x. To achieve this objective, input signals are presented repeatedly to the network and the norm of the error vector, ‖d − o‖, is minimized using the back-propagation algorithm, which iteratively adjusts the weights of the neural network until its response is close to the desired system response.

Multilayer Neural Networks
A multilayer neural network (MNN), as shown in Figure 2, is a class of feed-forward artificial neural network that consists of an input layer, an output layer, and at least one hidden layer between them [16,17]. Except for the input layer, the basic element in each layer is a node called a neuron. The input layer receives inputs x from the external world. These data are weighted with different synaptic weights {w_ji} and fed forward to the hidden layer. Each neuron in the hidden layer sums the weighted inputs it receives from its preceding layer and applies a nonlinear transfer function, also called an "activation function", before passing the result as input to the output layer. The output layer neurons perform a similar computation and yield the output values y of the neural network. The outputs of the hidden and output neurons can be expressed as follows.
$$h_j = f_h\left(\sum_i w_{ji} x_i + b_j\right) \qquad (1)$$
$$y_k = f_o\left(\sum_j w_{kj} h_j + b_k\right) \qquad (2)$$
where h_j is the output of hidden neuron j, f_h and f_o are the activation functions, and b_j and b_k are the biases of the hidden and output layers, respectively. In this study the sigmoid activation function is used:
$$f(z) = \frac{1}{1 + e^{-z}} \qquad (3)$$
The multiple layers and nonlinear activation enable the MNN to learn complex nonlinear mappings. The network is trained by repeated presentation of the training samples and iterative adjustment of the weights using the back-propagation algorithm.
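The forward computation of such a network can be sketched in a few lines. The following is a minimal illustration, assuming one hidden layer with sigmoid activations in both layers; the 2-3-1 layer sizes and all weight values are made up for the example and are not taken from the study.

```python
import math

def sigmoid(z):
    """Logistic sigmoid activation, f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer.

    weights[j][i] connects input i to neuron j; each neuron sums its
    weighted inputs, adds its bias, and applies the sigmoid.
    """
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Feed x through the hidden layer and then the output layer."""
    h = layer(x, w_hidden, b_hidden)   # hidden neuron outputs
    y = layer(h, w_out, b_out)         # network outputs
    return h, y

# Hypothetical 2-3-1 network with arbitrary illustrative weights.
w_hidden = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b_hidden = [0.0, 0.1, -0.1]
w_out = [[0.7, -0.5, 0.2]]
b_out = [0.05]

h, y = forward([1.0, 0.5], w_hidden, b_hidden, w_out, b_out)
print("hidden outputs:", h)
print("network output:", y)
```

Because the output neuron is also a sigmoid, every output lies strictly between zero and one, which is why targets are scaled to that range before training.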


System Identification via Learning with Back-propagation
Assume that the error function of the network shown in Figure 2 is defined as
$$E = \frac{1}{2}\sum_{p}\sum_{k}\left(t_k^{p} - y_k^{p}\right)^2 \qquad (4)$$
where y is the network output, t is the desired target, and the error is computed over all the data points p. Using (1) and (2), (4) can be written as
$$E = \frac{1}{2}\sum_{p}\sum_{k}\left(t_k^{p} - f_o\left(\sum_j w_{kj}\, f_h\left(\sum_i w_{ji} x_i^{p} + b_j\right) + b_k\right)\right)^2 \qquad (5)$$
If the weights W are chosen appropriately for all the patterns, the error E approaches zero. In this situation, the system produces output values close to the target values for all the inputs. In this state the network is regarded as completely trained, and we can say that the function $f_o\left(\sum_j w_{kj} f_h(\ldots(X_p))\right)$ is the identified system of the target system.
So the goal is to find appropriate values of the network weights W. This goal is achieved via learning, which is performed by iteratively updating W such that the error E at the output is reduced. Different optimization techniques, such as gradient-based back-propagation, genetic algorithms, and simulated annealing, are used for training neural networks. However, back-propagation [24] is the most commonly used algorithm for training neural networks. It propagates the error backwards through the network layers and updates the weights by computing the gradient of the error. The derivative of the error with respect to the weights of the network is computed using the chain rule of differentiation.
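The weight update described above can be illustrated with a small sketch. It assumes a sum-of-squares error, a single hidden layer, one sigmoid output, and full-batch gradient descent; these choices, the toy data, and all function names are assumptions made for illustration and are not the authors' MATLAB implementation. Only the learning rate of 0.05 is taken from the text.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Hidden outputs h and scalar network output y for one input x."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def total_error(data, W1, b1, W2, b2):
    """Sum-of-squares error over all (input, target) pairs."""
    return 0.5 * sum((t - forward(x, W1, b1, W2, b2)[1]) ** 2 for x, t in data)

def gradients(data, W1, b1, W2, b2):
    """Gradient of the error w.r.t. every weight, via the chain rule."""
    gW1 = [[0.0] * len(row) for row in W1]
    gb1 = [0.0] * len(b1)
    gW2 = [0.0] * len(W2)
    gb2 = 0.0
    for x, t in data:
        h, y = forward(x, W1, b1, W2, b2)
        # Error signal at the output neuron: dE/dy * dy/d(net input).
        delta_out = (y - t) * y * (1.0 - y)
        gb2 += delta_out
        for j, hj in enumerate(h):
            gW2[j] += delta_out * hj
            # Error signal propagated back to hidden neuron j.
            delta_hid = delta_out * W2[j] * hj * (1.0 - hj)
            gb1[j] += delta_hid
            for i, xi in enumerate(x):
                gW1[j][i] += delta_hid * xi
    return gW1, gb1, gW2, gb2

def train(data, n_hidden, lr=0.05, epochs=2000, seed=0):
    """Full-batch gradient descent from small random initial weights."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        gW1, gb1, gW2, gb2 = gradients(data, W1, b1, W2, b2)
        W1 = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, gW1)]
        b1 = [b - lr * g for b, g in zip(b1, gb1)]
        W2 = [w - lr * g for w, g in zip(W2, gW2)]
        b2 -= lr * gb2
    return W1, b1, W2, b2

# Made-up toy data: a normalized dose mapped to a normalized response.
data = [([0.0], 0.9), ([0.25], 0.7), ([0.5], 0.5), ([0.75], 0.35), ([1.0], 0.3)]

initial = train(data, n_hidden=3, epochs=0)
trained = train(data, n_hidden=3, epochs=2000)
print("error before training:", total_error(data, *initial))
print("error after training: ", total_error(data, *trained))
```

A finite-difference comparison against `gradients` is a convenient way to confirm that the chain-rule terms are coded correctly before running a long training job.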
The aim of our study is to model the protective effect of QCT against MTX-induced cytotoxicity in four different human cell lines (HeLa, PC-3, Hep G2, and SK-N-MC) using the above-mentioned multilayer neural network based system identification method. Toward this end, two ANN models are designed and trained to predict the protective effect of QCT against MTX-induced cytotoxicity.

Protective Effect of QCT on MTX-Induced Cytotoxicity in HeLa, PC-3, HepG2, and SK-N-MC Cells
The experimental results revealed that the % of inhibition increased with increasing dose of MTX, while QCT pretreatment significantly decreased the % of inhibition in HeLa, PC-3, Hep G2, and SK-N-MC cell lines (Table 1). An increase in cell viability was observed in QCT-treated cells compared to the MTX-alone group (Figure 3). The results showed that QCT at doses of 7.5 to 20 µg/mL possessed the best protective effects. Treatment with QCT alone did not change the cell viability compared to the control group (Figure 3). QCT pretreatment also markedly decreased the MTX-induced LDH release (Figure 4). Treatment with QCT alone did not change the LDH activity compared to the control group (Figure 4).

Neuro Modeling of Protective Effect of QCT
The neuromodeling of the protective effect of QCT against MTXs (CTN, PAT, and ZEAR) in different cell lines (HeLa, PC-3, Hep G2, and SK-N-MC) was performed by training neural networks. To this end, the % of inhibition and cell viability were determined using the crystal violet assay, and LDH activity was determined using the LDH assay. The 480 data points in Table 1 were used to train the neural network model, whereas 240 data points corresponding to the % inhibition for QCT values of 7.5 and 15 µg/mL were reserved for testing. Of the 480 training data points in Table 1, 20% were randomly chosen and set aside for validation.
The doses of MTX and QCT are the primary inputs to the neural network, and the target to be learned is the corresponding % of inhibition. As the % inhibition of QCT differs across cell lines and MTXs, a naïve approach would be to train 12 different neural network models corresponding to the four cell lines and three MTXs. However, in this study we aim to design a single neural network that models the overall behavior on all of the above-mentioned cell lines and MTXs. To distinguish the % inhibition corresponding to the four cell lines and the three MTXs, additional binary codes are used as inputs to the network. The four cell lines HeLa, PC-3, Hep G2, and SK-N-MC are encoded as 1000, 0100, 0010, and 0001, respectively, and the three MTXs (CTN, PAT, and ZEAR) are encoded as 100, 010, and 001, respectively. For example, the input to the neural network corresponding to 10 µg/mL QCT on the HeLa cell line for 0.78125 µM CTN is (1, 0, 0, 0, 1, 0, 0, 0.78125, 10). The doses of QCT and MTXs are normalized using the corresponding maximum doses used in this study, i.e., (dose of QCT/20, dose of MTX/200). The output of the network, i.e., the % of inhibition, is expressed as a real number between zero and one [25].
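The encoding scheme can be sketched as a small helper. The function and constant names below are hypothetical; the one-hot codes, the input order (cell line, MTX, MTX dose, QCT dose), and the normalization constants of 200 µM and 20 µg/mL follow the description above.

```python
CELL_LINES = ["HeLa", "PC-3", "Hep G2", "SK-N-MC"]   # encoded 1000, 0100, 0010, 0001
MYCOTOXINS = ["CTN", "PAT", "ZEAR"]                  # encoded 100, 010, 001
MAX_MTX_DOSE = 200.0   # µM, maximum MTX dose used for normalization
MAX_QCT_DOSE = 20.0    # µg/mL, maximum QCT dose used for normalization

def one_hot(item, vocabulary):
    """Binary code with a single 1 at the position of item."""
    if item not in vocabulary:
        raise ValueError(f"unknown category: {item!r}")
    return [1.0 if item == v else 0.0 for v in vocabulary]

def encode_input(cell_line, mycotoxin, mtx_dose, qct_dose):
    """Build the nine-element input vector with normalized doses."""
    return (one_hot(cell_line, CELL_LINES)
            + one_hot(mycotoxin, MYCOTOXINS)
            + [mtx_dose / MAX_MTX_DOSE, qct_dose / MAX_QCT_DOSE])

# The example from the text, after dose normalization.
x = encode_input("HeLa", "CTN", mtx_dose=0.78125, qct_dose=10)
print(x)  # [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.00390625, 0.5]
```

Keeping the categorical codes and the two doses in one vector is what lets a single network cover all twelve cell line and MTX combinations.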
The neural network to be designed should therefore have nine input nodes and one output node. The appropriate number of hidden nodes required to learn the system was determined empirically. Starting with five hidden nodes in a single hidden layer, the number of hidden nodes was incremented gradually. The network was trained using the back-propagation algorithm with a learning rate of 0.05 on the training dataset for a fixed 10,000 epochs, and its performance was evaluated on the randomly selected validation dataset. The root mean square error (RMSE) was employed to evaluate the performance on the validation dataset:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(t_n - y_n\right)^2}$$
where t_n and y_n are the target and the network output for the n-th data point, and N is the number of validation data points. The results of the empirical method are presented in Table 2. The networks with 20 and 30 hidden nodes produced the lowest error on the validation dataset. So, keeping the number of nodes in the first hidden layer at 20, another hidden layer was added to this network. Starting with three hidden nodes, the number of nodes in the second hidden layer was increased in steps of three.
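As a small illustration of the validation metric, the root mean square error over N target and output pairs can be computed as follows. The function name and the sample values are hypothetical.

```python
import math

def rmse(targets, outputs):
    """Root mean square error between targets and network outputs."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and equally long")
    n = len(targets)
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n)

# Made-up normalized % inhibition values and predictions.
print(rmse([0.20, 0.40, 0.60], [0.21, 0.38, 0.61]))
```

Because the targets are normalized to the range zero to one, an RMSE of 0.015 corresponds to a typical deviation of about 1.5 percentage points of inhibition.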
The network was then trained and validated as described earlier. The results of the empirical method for determining the number of nodes in the second hidden layer are presented in Table 3.

Table 3. Performance comparison of a two hidden layer neural network by varying the number of nodes in the last hidden layer. The first and the second elements of the No. of Hidden Nodes are the node numbers of the first and the second hidden layers, respectively.

No. of Hidden Nodes    Train RMSE    Validation RMSE
(20, 3)                0.0129        0.0150
(20, 6)                0.0110        0.0173
(20, 9)                0.0109        0.0162
(20, 12)               0.0113        0.0161
(20, 15)               0.0106        0.0147
(20, 18)               0.0105        0.0153
(20, 21)               0.0108        0.0151

From Table 3, it is seen that the neural network with two hidden layers produces the lowest error on the validation dataset. Hence, a three-layered neural network of nine input nodes, 20 hidden nodes, 15 hidden nodes, and one output node, namely, a 9-20-15-1 network, as shown in Figure 5, was chosen as the optimal network to learn the % of inhibition for all the combinations of cell lines and MTXs. This empirically selected network was then finally trained on the whole training dataset (including the validation dataset). Figure 6 shows the error curve obtained during the learning of this network on the training and the test dataset, and the corresponding RMSEs are given in Table 4.

Figure 5. The 9-20-15-1 network. Biases of nodes are omitted in this figure. Four input terminals indicate the cell line, three input terminals indicate the MTX, and the other two input terminals are for the doses of MTX and QCT, respectively. The single output terminal is for the % of inhibition.

After the learning was completed, the network was evaluated on the test set and the % of inhibition values were obtained from the output of the ANN. The % of inhibition surface obtained from the trained neural network for the Hep G2 cell line with CTN ranging from 0 to 200 µM and doses of QCT ranging from 0 to 20 µg/mL is presented in Figure 7. Note that the increasing directions of the QCT and CTN axes are indicated by arrowheads. The experimentally measured data are superimposed on this surface as solid circles. The black solid circles are the % of inhibition values used for training the ANN, and the blue and red circles are the test data. The solid circles partially hidden in the surface indicate that the % inhibition values predicted by the ANN are close to the experimentally measured data.
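The architecture choice made from Table 3 amounts to picking the candidate with the lowest validation RMSE. A minimal sketch of that selection step, using the values reported in Table 3 (the dictionary layout and function name are illustrative assumptions):

```python
# (first hidden layer, second hidden layer) -> (train RMSE, validation RMSE),
# copied from Table 3.
table3 = {
    (20, 3): (0.0129, 0.0150),
    (20, 6): (0.0110, 0.0173),
    (20, 9): (0.0109, 0.0162),
    (20, 12): (0.0113, 0.0161),
    (20, 15): (0.0106, 0.0147),
    (20, 18): (0.0105, 0.0153),
    (20, 21): (0.0108, 0.0151),
}

def select_architecture(results):
    """Return the hidden-layer sizes with the lowest validation RMSE."""
    return min(results, key=lambda sizes: results[sizes][1])

best = select_architecture(table3)
print(best, table3[best])  # (20, 15) (0.0106, 0.0147)
```

Note that the lowest training RMSE belongs to (20, 18), so selecting on the validation column rather than the training column changes the outcome.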
For better visualization of the modeling accuracy of the ANN on the test data, the % inhibition values predicted by the ANN at QCT values of 7.5 and 15 µg/mL are presented in Figure 7c. The solid lines represent the outputs of ANN, whereas the solid dots are the measured data. Observe that the output of the ANN is close to the measured data at every test point.
The results of the ANN were compared with those of linear regression, which is commonly used for a rough estimation of experimental results. Figure 7b shows the surface of the % of inhibition values obtained with linear regression for the Hep G2 cell line, and Figure 7d shows the % of inhibition curves at QCT concentrations of 7.5 and 15 µg/mL.
Similar figures for the other MTXs, PAT and ZEAR, are presented in Figures 8 and 9, respectively. As seen in Figures 7-9, the qualitative results of modeling the system with the ANN show consistently excellent performance for the four different cell lines and three different MTXs. A quantitative evaluation of the network's performance on the test set exhibited high correlation (R = 0.999) with the experimentally measured data, which is substantially higher than the correlation obtained from another statistical method, partial least squares (PLS) regression (R = 0.90). Also note that separate linear regression or PLS regression models have to be built for each cell line and MTX combination, whereas the proposed ANN models all the cell line and MTX combinations in a single model. The plot of the error between the measured % of inhibition and the values predicted using the ANN model and PLS is shown in Figure 10.
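A correlation coefficient between measured and predicted values can be computed as sketched here. The text does not state which correlation measure was used; this example assumes the Pearson coefficient, and the sample values are invented for illustration.

```python
import math

def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_m * var_p)

# Made-up measured and predicted % inhibition values (normalized).
measured = [0.10, 0.25, 0.40, 0.55, 0.80]
predicted = [0.12, 0.24, 0.41, 0.53, 0.81]
print(pearson_r(measured, predicted))
```

A high R only shows that predictions track the measurements linearly, so it is usually read together with the error plot and the RMSE.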


Neuro-Modeling of Cell Viability and LDH Activity
Experiments similar to those for modeling the % of inhibition were also conducted to model the cell viability and LDH activity. As in the case of % of inhibition, the variation of cell viability and LDH activity was measured on cells treated with QCT (5, 7.5, 10, 15, and 20 µg/mL) and incubated with MTXs: CTN (100 µM), PAT (50 µM), and ZEAR (100 µM). As shown in Figures 3 and 4, a total of 72 data points each were measured for % of cell viability and LDH activity. Out of the 72 data points, 48 were used to train the ANN, whereas the remaining 24, corresponding to QCT values of 7.5 and 15 µg/mL, were reserved for testing. As before, 20% of the samples, chosen randomly from the training set, were used for validating the neural network.
The aim of the neural network in this case is to predict the cell viability and LDH activity given the doses of MTXs and QCT for all the combinations of MTXs and cell lines. The architecture of the ANN for modeling this system differs from that of the % of inhibition model, as the network needs to output two values. The outputs of the network, i.e., the cell viability and LDH activity, are normalized to real numbers between zero and one by dividing by the corresponding maximum values encountered in this study, i.e., 100 and 350, respectively. The binary encoding scheme described in Section 3.2.1 is used to distinguish the different cell line and MTX combinations. Hence the total number of input nodes is equal to 9. The optimal network architecture for this task was also determined by following the empirical method described in Section 3.2.1. The results of the empirical method are presented in Tables 5 and 6.
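The data split described above can be sketched as follows. The grid assumes that the sixth QCT level needed to reach 72 conditions is the 0 µg/mL (MTX-only) group, which is an assumption, as are the function name, the random seed, and the rounding of the 20% validation share.

```python
import random
from itertools import product

CELL_LINES = ["HeLa", "PC-3", "Hep G2", "SK-N-MC"]
MYCOTOXINS = ["CTN", "PAT", "ZEAR"]
QCT_DOSES = [0, 5, 7.5, 10, 15, 20]   # µg/mL; the 0 level is assumed
TEST_QCT_DOSES = {7.5, 15}            # doses held out for testing

def split_conditions(validation_fraction=0.2, seed=0):
    """Split the condition grid into training, validation, and test sets."""
    conditions = list(product(CELL_LINES, MYCOTOXINS, QCT_DOSES))
    test = [c for c in conditions if c[2] in TEST_QCT_DOSES]
    pool = [c for c in conditions if c[2] not in TEST_QCT_DOSES]
    rng = random.Random(seed)
    rng.shuffle(pool)                 # random choice of validation samples
    n_val = round(validation_fraction * len(pool))
    return pool[n_val:], pool[:n_val], test

train_set, val_set, test_set = split_conditions()
print(len(train_set), len(val_set), len(test_set))  # 38 10 24
```

Holding out whole dose levels, rather than random points, tests whether the network interpolates between the doses it was trained on.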
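The scaling of the two network outputs can be sketched as below. The divisors 100 and 350 are the maximum values stated in the text; the function names and sample values are hypothetical.

```python
MAX_VIABILITY = 100.0   # maximum cell viability value in the study
MAX_LDH = 350.0         # maximum LDH activity value in the study

def normalize_targets(viability, ldh):
    """Scale the two targets to the range zero to one."""
    return [viability / MAX_VIABILITY, ldh / MAX_LDH]

def denormalize_outputs(outputs):
    """Convert the two network outputs back to the measured scales."""
    return outputs[0] * MAX_VIABILITY, outputs[1] * MAX_LDH

targets = normalize_targets(viability=50.0, ldh=175.0)
print(targets)                       # [0.5, 0.5]
print(denormalize_outputs(targets))  # (50.0, 175.0)
```

Scaling both targets to the same range keeps either output from dominating the summed training error.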
Table 6. Performance comparison of a two hidden layer neural network by varying the number of nodes in the last hidden layer. The first and the second elements of the No. of Hidden Nodes are the node numbers of the first and the second hidden layers, respectively.

No. of Hidden Nodes    Train RMSE    Validation RMSE
(10, 2)                0.0100        0.0612
(10, 4)                0.0100        0.0450
(10, 6)                0.0100        0.0384
(10, 8)                0.0114        0.0338
(10, 10)               0.0100        0.0337

A network with nine input nodes, 10 hidden nodes, 10 hidden nodes, and two output nodes, namely, a 9-10-10-2 network, shown in Figure 11, was determined to be the optimal network for this task. This network was then trained on the whole training dataset (including the validation dataset). Figure 12 shows the evolution of the error during the learning of this network on the training and the test dataset, and the corresponding RMSEs are given in Table 7.

Figure 11. Architecture of the ANN used for the modeling of cell viability and LDH activity. The size of the ANN is 9-10-10-2. Biases of nodes are omitted in this figure.

Table 7. Performance of the final 9-10-10-2 network on the train and test dataset.

Network      Train RMSE    Test RMSE
9-10-10-2    0.0052        0.0364

The trained network was evaluated on the test set and the cell viability and LDH activity values were obtained from the output of the ANN. The cell viability and LDH activity curves obtained from the trained neural network for different cell lines pretreated with doses of QCT ranging from 0 to 20 µg/mL and incubated with MTXs, CTN (100 µM), PAT (50 µM), and ZEAR (100 µM), are presented in Figure 13. The experimentally measured data are superimposed on these curves with markers, where '*' indicates the data points used for training the ANN and 'o' those used for testing.
As seen from Figure 13, the qualitative results of modeling the system with the ANN show consistently excellent performance for the four different cell lines and three different MTXs. A quantitative evaluation of the network's performance on the test set exhibited high correlation with the experimentally measured data (R = 0.995 for cell viability and R = 0.997 for LDH activity).
The plot of the error between the measured cell viability and LDH activity and the corresponding values predicted using the ANN model is shown in Figure 14.

Figure 14. Error in modeling the (a) cell viability and (b) LDH activity on the test data using the artificial neural network.

Discussion
The main objective of this paper was to model the protective effect of QCT against MTXs using ANNs, and to verify the ability of the model in estimating the protective effect of QCT. Specifically, the protective effect of QCT against three different MTXs (Citrinin, Patulin, and Zearalenol) on four different cell lines (HeLa, PC-3, Hep G2, and SK-N-MC cell lines) was measured experimentally, and the data were used to model their nonlinear protective functions (% of inhibition, cell viability, and LDH activity) using multilayer neural networks.
The experimental measurements, conducted via cell viability and LDH release assays in HeLa, PC-3, Hep G2, and SK-N-MC cell lines, revealed that treatment with MTX significantly decreased cell viability and increased LDH activity. However, the % of inhibition in the four cell lines treated with the three MTXs decreased consistently with increasing dose of QCT. Pretreatment with QCT also significantly attenuated the MTX-induced alteration of cell viability and LDH activity, which suggests that it could protect the cell lines from cytotoxicity. These results therefore suggest that QCT may inhibit MTX-induced diseases in humans.
Two different ANN models with sizes of 9-20-15-1 and 9-10-10-2, determined empirically, were used to model the % of inhibition, and the cell viability and LDH activity, respectively. For both tasks, the experimentally measured data for the protective effects of QCT against three different MTXs in four different cell lines were used for training the neural network. As a result, twelve and twenty-four different models of the protective effects of QCT were built on the two ANNs, respectively. Unlike commonly used statistical methods, such as linear regression or partial least squares regression, which require a separate model to be computed for each MTX and cell line combination, a single neural network was designed to model the different combinations using a binary encoding scheme for each MTX and cell line combination. Moreover, quantitative evaluations of the network's performance on the test sets exhibited a high correlation with the experimentally measured data, substantially higher than that of the individual models computed using other statistical methods.
It was observed that the additional burden on the neural network of discriminating between the different input combinations, expressed as binary codes, demands comparatively larger networks, which require more iterations to converge and hence longer training times. However, it was shown that a single model for the different input combinations provides a unified and elegant solution with the ability to precisely model the protective effects of QCT against MTXs.
To prevent mycoplasma contamination effectively, we followed a procedure described previously [26]. Briefly, we obtained all cells and culture materials from reliable sources, used good aseptic technique, added the recommended antibiotic to the culture medium to eradicate contamination, and disinfected the laminar flow hood after work.

Measurement of Mycotoxin-Induced Cytotoxicity
The MTX-induced cytotoxicity in different cell lines (HeLa, PC-3, Hep G2, and SK-N-MC) was measured by determination of percentage (%) of inhibition, cell viability, and LDH activity.

Measurement of Percentage (%) of Inhibition
At first, we measured the percentage (%) of inhibition to observe the cytotoxic effect of the MTXs (citrinin, patulin, and zearalenol) and of MTX + QCT on HeLa, PC-3, Hep G2, and SK-N-MC cells. Briefly, cells were seeded in 24-well plates at 5 × 10⁴ cells per well in culture media and allowed to attach overnight; cells were pretreated with QCT (0-20 µg/mL) at 37 °C in a humidified atmosphere of 5% CO₂/95% air for 6 h, followed by incubation with mycotoxins (0-200 µM) for 24 h.

Cell Viability
The crystal violet assay was used to determine MTX-induced cell death. Briefly, cells were seeded in 24-well plates at 5 × 10⁴ cells per well in culture media and allowed to attach overnight. The cells were pretreated with QCT at doses of 5, 10, and 20 µg/mL at 37 °C in a humidified atmosphere of 5% CO₂/95% air for 6 h, followed by incubation with CTN (100 µM), PAT (50 µM), and ZEAR (100 µM) for 24 h. After 24 h of incubation, the medium was removed, the cells were washed with phosphate-buffered saline (PBS), and 0.2% crystal violet solution was added to each well. After 10 min of incubation, the crystal violet solution was removed carefully by washing with water. Finally, 100 µL of 1% sodium dodecyl sulfate (SDS) was added to solubilize the stain until the color was uniform, with no areas of dense coloration at the bottom of the wells. The samples were read at 590 nm in a microplate reader (Spectra MAX, Gemini EM, Molecular Device, Sunnyvale, CA, USA). The cell viability is expressed as the percentage of absorbance of control.
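Expressing a reading as a percentage of the control absorbance is a one-line calculation. The sketch below uses a hypothetical function name and made-up absorbance values, and it assumes no blank correction, since none is described in the text.

```python
def percent_of_control(sample_absorbance, control_absorbance):
    """Express a sample reading as a percentage of the control reading."""
    if control_absorbance <= 0:
        raise ValueError("control absorbance must be positive")
    return 100.0 * sample_absorbance / control_absorbance

# Made-up 590 nm readings for a treated well and the untreated control.
print(percent_of_control(sample_absorbance=0.5, control_absorbance=1.0))  # 50.0
```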

Lactate Dehydrogenase (LDH) Activity
The lactate dehydrogenase (LDH) activity assay was used to determine MTX-induced cytotoxicity. LDH release into the media was taken as an indicator of cell damage, and the assay is based on the principle of reduction of nicotinamide adenine dinucleotide (NAD) by LDH. The reduced NAD (NADH) is utilized in the stoichiometric conversion of a tetrazolium dye, which is measured spectrophotometrically using an LDH assay kit (Cat. No. 04744926001, Sigma, Saint Louis, MO, USA). Briefly, cells were seeded (5 × 10⁴ cells/well) and cultured in 24-well culture plates. The cells were then preincubated with or without different concentrations of QCT (5, 7.5, 10, 15, and 20 µg/mL) at 37 °C for 6 h, followed by incubation with CTN (100 µM), PAT (50 µM), and ZEAR (100 µM) for 24 h. After treatment, cells were centrifuged at 240 × g for 4 min and the culture supernatant was transferred to a new plate. The assay mixture was prepared and added to each well, and the plate was wrapped in foil and incubated at room temperature for 30 min. The reaction was terminated by adding the stop solution to each well. The plate was read at 490 nm with a reference wavelength of 690 nm. The extent of cytotoxicity is expressed as the percentage of absorbance of control.

Statistical Data Analysis and Neural Network Training
All data are expressed as mean ± SD, and one-way ANOVA (analysis of variance) followed by Dunnett's test was used for the statistical analysis with SPSS software (version 16, SPSS, Inc., Chicago, IL, USA). Values of * p < 0.05 and ** p < 0.01 were considered significant. The artificial neural network was trained using custom code written by the authors in MATLAB(R) (2017b, Mathworks, Natick, MA, USA) on a standard workstation (Intel(R) Core(TM) i7-6700k 4.00 GHz, 8 cores, 8 GB RAM), whereas the partial least squares method was implemented using MATLAB's built-in plsregress function.