Optimization of Ultrasonic-Assisted Extraction Conditions for Bioactive Components and Antioxidant Activity of Poria cocos (Schw.) Wolf by an RSM-ANN-GA Hybrid Approach

In this study, a hybrid of response surface methodology and an artificial neural network coupled with a genetic algorithm (RSM-ANN-GA) was used to predict and optimize the ultrasonic-assisted extraction conditions of Poria cocos. The ingredient yield and antioxidant potential were determined under different levels of three independent variables: ethanol concentration (X1; 25–75%), extraction time (X2; 30–50 min), and extraction solution volume (X3; 20–60 mL). The optimal conditions predicted by the RSM-ANN-GA model were 55.53% ethanol concentration for 48.64 min in 60.00 mL solvent for the four triterpenoid acids, and 40.49% ethanol concentration for 30.25 min in 20.00 mL solvent for antioxidant activity and total polysaccharide and phenolic contents. The evaluation of the two modeling strategies showed that RSM-ANN-GA provided better predictability and greater accuracy than response surface methodology alone for ultrasonic-assisted extraction of P. cocos. These findings provide guidance on the efficient extraction of P. cocos and a feasible analysis, modeling, and optimization process for the extraction of natural products.


Introduction
Poria cocos (Schw.) Wolf (Fu-ling) is an edible fungus commonly found on the dead bark and roots of pine trees [1], which are distributed all over the world. To date, numerous scientific studies have revealed the bioactivities of P. cocos, such as its antioxidant, immunomodulatory, anti-inflammatory, anticancer, and antidiabetic effects [2,3]. Additionally, phytochemical analysis has identified many important functional components in Fu-ling, mainly triterpene acids (pachymic acid, trametenolic acid, tsugaric acid A, and dehydrotrametenolic acid), polysaccharides, and polyphenols [4–6]. Nowadays, owing to its marked health benefits, non-toxicity, and rich resources, P. cocos or its extracts are widely used as raw materials or supplements to produce beverages, porridge, biscuits, and functional foods [7,8].
Extraction processes are the critical step in the separation, purification, and identification of functional components in food raw materials [9]. Ultrasonic-assisted extraction is one of the most commonly used techniques, which can fully release the ingredients in raw materials by ultrasonic energy [10,11]. However, the extraction efficiency is always affected by a variety of factors, including the type of solvent, extraction time, solvent volume, pH, and temperature. Ensuring the extraction efficiency of active ingredients from original materials has always been the focus of food processing and natural medicine development. For a long time, researchers have been committed to revealing the relationship between yields and influencing factors through statistics and mathematical techniques [12]. Appropriate mathematical modeling provides theoretical support and technical means for

Determination of Contents and Activity

Triterpene Acids Content
The four triterpene acids (PA: pachymic acid, TA: trametenolic acid, TAA: tsugaric acid A, and DA: dehydrotrametenolic acid) in the extracts were analyzed using an AB Sciex ExionLC UHPLC coupled with a QTRAP instrument. The UHPLC system was equipped with an auto-sampler, a binary pump, a column heater, and a degasser. The samples were separated on an ACQUITY UPLC BEH C18 column (100 mm × 2.1 mm; particle size, 1.7 µm). The mobile phase consisted of (A) water containing 0.1% formic acid and (B) acetonitrile containing 0.1% formic acid. Gradient elution was performed as follows: 20% B at 0–0.50 min, 20–95% B at 0.50–2.00 min, 95% B at 2.00–4.50 min, and 95–20% B at 4.50–4.51 min. The flow rate was 0.3 mL/min at a column temperature of 40 °C. The electrospray ionization-quadrupole ion trap (ESI-QTRAP) spectra were acquired in the negative ion mode, and the optimized parameters were as follows: ionspray voltage, 4500 V; ion source temperature, 500 °C; declustering potential, 200 eV; and collision energy, 55–58 eV. The multiple reaction monitoring (MRM) parameters and MRM-extracted ion chromatogram of the four triterpene acids are provided in Table S1 and Figure S1.

Total Polysaccharide Content (TPs)
For the analysis of total polysaccharide content, the 3,5-dinitrosalicylic acid (DNS) method was adopted [27]. The total polysaccharides in the extracts were hydrolyzed into reducing sugars by HCl, and the DNS reagent was then reduced by these sugars to a reddish-brown amino compound upon co-heating under alkaline conditions. After the reaction, the absorbance was determined at 540 nm using a microplate reader. Glucose was used as the standard, and the results were expressed as mg glucose equivalents per gram (mg GLU/g) of sample powder.

Total Phenolic Content (TPc)
Total phenolic content (TPc) was analyzed according to a method in the literature [28]. The extracted sample (10 µL) and Folin-Ciocalteu reagent (50 µL) were mixed in 96-well plates, and then 7.5% sodium carbonate (50 µL) and distilled water (90 µL) were added to each well. After the mixture was kept at room temperature for 10 min, the absorbance was determined at 760 nm using a microplate reader. Gallic acid was used as the standard, and the results were expressed as mg gallic acid equivalents per gram (mg GAE/g) of sample powder.

Antioxidant Activity
Antioxidant activity was evaluated by a 2,2-diphenyl-1-picrylhydrazyl radical scavenging assay (DPPH-SC) and a FRAP total antioxidant capacity assay (T-AOC). The free radical scavenging activity and reducing power were determined according to a previously reported method [29].
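The scavenging percentage reported as DPPH-SC is commonly computed from the absorbance of a control and of the sample. The expression below is that common formulation, given only as a sketch; the symbols A_control and A_sample are assumptions, and the cited method [29] may apply blank corrections not shown here.

```latex
\text{DPPH-SC}\,(\%) = \frac{A_{\text{control}} - A_{\text{sample}}}{A_{\text{control}}} \times 100
```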

Statistical Analysis
The data were expressed as means ± standard deviation. Principal component analysis (PCA) was performed in Origin 2021 to give an overview of the relationships between the different extract components and antioxidant activity for grouping before optimization. Design Expert Software 13.0 was used to generate the experimental runs and to perform the statistical analysis of RSM, and the Neural Network Toolbox™ in MATLAB 2018a was used for ANN modeling.

RSM Modeling
ANOVA was used to analyze the statistical significance of the RSM fitting model and of each of its terms. The interaction effect of each variable on the response value was visualized by a 3D surface plot and expressed by a modified cubic polynomial model, Equation (1). Eventually, the optimal extraction process was determined.
where Y represents the dependent variable as a function of the independent variables (X1–X3); and α0, αi, αii, αij, and αiij are the constant coefficients of the intercept, linear, quadratic, two-factor interaction, and quadratic-by-linear interaction terms, respectively.
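Equation (1) itself is not reproduced in this text. A form consistent with the five coefficient types named above would be the following sketch; which cubic terms the authors' modified model actually retains is an assumption here.

```latex
Y = \alpha_0
  + \sum_{i=1}^{3} \alpha_i X_i
  + \sum_{i=1}^{3} \alpha_{ii} X_i^{2}
  + \sum_{i=1}^{2} \sum_{j=i+1}^{3} \alpha_{ij} X_i X_j
  + \sum_{i=1}^{3} \sum_{j \neq i} \alpha_{iij} X_i^{2} X_j
```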

Artificial Neural Network-Genetic Algorithm (ANN-GA) Modeling
ANN was used to explore the nonlinear correlation between the independent variables (ethanol concentration, extraction time, and extraction solution volume) and the dependent variables (PA, TA, TAA, DA, TPs, TPc, DPPH, and FRAP). Following the work of Hee-Jeong Choi et al., the multi-layer perceptron (MLP) was constructed from input, hidden, and output layers, and the back propagation feed-forward (BPFF) model was used in the process [21]. For the network construction, all experimental data were apportioned into training (70%), testing (15%), and validation (15%) sets. Based on the mean square error (MSE) function, 10 hidden neurons were set in the process modeling. Each neuron was activated by the output signals generated from the weighted independent variables. MSE values were calculated by Equation (2), and the ANN model with the minimum MSE and maximum R² was selected for further optimization [30].
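Equation (2) is not reproduced in this text. The standard definition of the mean square error, assumed here with n as the number of data points (a symbol not defined in the original), is:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{k=1}^{n} \left( Y_{\mathrm{ANN},k} - Y_{\mathrm{EXP},k} \right)^{2}
```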
where Y_ANN and Y_EXP are the results from ANN prediction and experiment, respectively. The tansig function, Equation (3), was employed for pattern recognition and modeling.
$$\operatorname{tansig}(n) = \frac{2}{1 + e^{-2n}} - 1 \qquad (3)$$
GA has been widely applied in multiple fields for algorithm optimization [31,32]. As previously reported, a hybrid ANN-GA can be used in ultrasonic extraction procedures [33]. After the ANN was constructed, GA was performed using genetic operators such as reproduction, crossover, and mutation until optimized results were obtained.
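The tansig transfer function and the MSE criterion described above can be sketched in a few lines. This is an illustrative NumPy version, not the MATLAB toolbox code used in the study: the 10 hidden neurons come from the text, while the linear output layer, the random weights, and the function names are assumptions made for the example.

```python
import numpy as np

def tansig(n):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2n)) - 1."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One tansig hidden layer followed by a linear output layer (assumed)."""
    hidden = tansig(w_hidden @ x + b_hidden)
    return w_out @ hidden + b_out

def mse(y_ann, y_exp):
    """Mean square error between predicted and experimental values."""
    diff = np.asarray(y_ann, dtype=float) - np.asarray(y_exp, dtype=float)
    return float(np.mean(diff ** 2))

# Untrained network with random weights, only to show the data flow:
# 3 inputs (X1-X3, coded), 10 hidden neurons, 4 outputs (e.g. one response group).
rng = np.random.default_rng(0)
x = np.array([0.0, -1.0, 1.0])
w_hidden, b_hidden = rng.normal(size=(10, 3)), rng.normal(size=10)
w_out, b_out = rng.normal(size=(4, 10)), rng.normal(size=4)
y_pred = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

The tansig expression is mathematically identical to the hyperbolic tangent, which is why its outputs are bounded between -1 and 1.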

Principal Component Analysis (PCA)
PCA was conducted to visualize the correlation among all extract components (PA, TA, TAA, DA, TPs, and TPc) and antioxidant activity (DPPH-SC and T-AOC). As a powerful and common tool, PCA can reduce the dimensionality of multivariate data to two or three principal components while conserving as much information as possible [34]. As shown in Figure S2, the first (PC1) and second (PC2) principal components accounted for 43.7% and 21.8% of the original variance of the variables, respectively. The four triterpene acids were on the positive side of PC1 and were highly correlated (r ≥ 0.6) with each other. There was also a highly positive correlation (r ≥ 0.6) between TPs and TPc, whereas the two antioxidant activity indexes, DPPH-SC and T-AOC, were moderately correlated (0.4 < r < 0.6). However, TPs and TPc had a higher correlation with the activity indexes [35]. This provided a basis for classifying the ingredients and activities. All responses could be divided into Group 1 (PA, TA, TAA, and DA) or Group 2 (TPs, TPc, and antioxidant activity) for subsequent algorithm optimization.
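The explained-variance percentages, scores, and loadings that underlie a plot such as Figure S2 can be obtained as in the sketch below. The study used Origin 2021; this NumPy version, the function name, and the random placeholder matrix (17 runs by 8 responses) are assumptions for illustration and do not reproduce the reported 43.7% and 21.8%.

```python
import numpy as np

def pca(data, n_components=2):
    """PCA on standardized columns via singular value decomposition."""
    data = np.asarray(data, dtype=float)
    # Standardize each response so that differing units do not dominate.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # variance ratio per component
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T               # one row per original variable
    return scores, loadings, explained

# Placeholder data standing in for the measured responses
# (columns: PA, TA, TAA, DA, TPs, TPc, DPPH-SC, T-AOC).
rng = np.random.default_rng(1)
responses = rng.normal(size=(17, 8))
scores, loadings, explained = pca(responses)
```

Variables whose loading vectors point in similar directions on the PC1-PC2 plane are positively correlated, which is the kind of evidence used above to split the responses into two groups.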

RSM Modeling
The aim was to improve the extraction efficiency of the four triterpene acids, polysaccharides, phenolics, and antioxidant activity from P. cocos. The factors with the greatest influence on the extraction process were ethanol concentration (X1, %), extraction time (X2, min), and extraction solution volume (X3, mL), and these were optimized by constructing a Box-Behnken design (BBD). Table 1 shows a comparison between the experimental response variables (PA, TA, TAA, DA, TPs, TPc, DPPH-SC, and T-AOC) and the responses predicted by RSM and ANN for the 17 runs under different extraction conditions. In the design matrix, all the experimental values were close to the RSM- and ANN-predicted values. ANOVA was performed to evaluate the precision of the RSM models. As the ANOVA results in Table 2 illustrate, each model reflected the relationship between input variables and output responses, with a high F-value, a low probability value (p < 0.05), and an insignificant lack of fit. Moreover, the coefficient of determination (R²) and adequate precision were also important indicators of model fit. In our research, the R² values of all responses fell within the acceptable range (R² ≥ 0.74), which indicated that these models were reliable and fitted the responses well [36,37]. Adequate precision is an index of the signal-to-noise ratio. A value greater than four is desirable and indicates that the model can be used to navigate the design space. Our results satisfied this minimum limit [38,39]. As previous research has demonstrated, p-values (the lower the better), F-values (the higher the better), and R² values (ideally as close to 1.00 as possible) indicate the predictive power of a model [40]. Sushma Chakraborty et al. optimized an ultrasound-assisted extraction process for bioactive compounds from bitter gourd by response surface methodology and reported p-values < 0.0001, F-values ≥ 29.36, and R² ≥ 0.9635 for all RSM models, which was similar to our research [41]. In Chen Chen's study, the F-test gave a very high model F-value (73.85) and a very low p-value (p < 0.0001), which means that there was only a 0.01% chance that such a large model F-value could be attributed to noise [42]. The larger the F-value and the smaller the p-value, the more significant the corresponding term, which implies that the model is suitable for use. In summary, the current experimental data were suitable for optimization of the extraction conditions.
PA: pachymic acid (µg/g), TA: trametenolic acid (µg/g), TAA: tsugaric acid A (µg/g), DA: dehydrotrametenolic acid (µg/g), TPs: total polysaccharides (mg GLU/g), TPc: total phenolic (µg GAE/g), DPPH-SC: DPPH free radical scavenging capacity (%), T-AOC: total antioxidant capacity (µmol/g). df: degree of freedom, Std. Dev: standard deviation, R²: coefficient of determination, Adeq precision: adequate precision.
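The 17-run design matrix mentioned above is consistent with a three-factor Box-Behnken design made of 12 edge-midpoint runs plus 5 center replicates. The sketch below generates such a design from the three levels of each factor given in the study; the split into 12 + 5 runs and the function name are assumptions, since the text states only the total number of runs.

```python
from itertools import combinations

def box_behnken(levels, n_center=5):
    """Build a Box-Behnken design.

    levels: one (low, mid, high) tuple per factor.
    For every pair of factors, the pair takes all low/high combinations
    while the remaining factors stay at their mid level.
    """
    center = [mid for _, mid, _ in levels]
    runs = []
    for i, j in combinations(range(len(levels)), 2):
        for a in (0, 2):          # low or high level of factor i
            for b in (0, 2):      # low or high level of factor j
                run = list(center)
                run[i] = levels[i][a]
                run[j] = levels[j][b]
                runs.append(tuple(run))
    runs.extend([tuple(center)] * n_center)   # replicated center points
    return runs

# Ethanol (%), time (min), solvent volume (mL)
levels = [(25, 50, 75), (30, 40, 50), (20, 40, 60)]
design = box_behnken(levels)
```

The replicated center point is what lets the ANOVA estimate pure error and test lack of fit.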
The interactive influence of the factors (ethanol concentration, solvent volume, and extraction time) on the responses (PA, TA, TAA, DA, TPs, TPc, DPPH-SC, and T-AOC) was visually analyzed by 3D response surface plots. The yields of the four triterpene acids showed similar tendencies with respect to ethanol concentration and extractant volume, increasing gradually with more ethanol and larger volumes, whereas extraction time had a relatively smaller effect (Figure 1). For TPs and TPc, the effects of extraction conditions were complex. Generally, the yield increased as the extractant volume increased. However, it decreased slightly with increasing time at the same ethanol concentration and an extractant volume lower than 50 mL (Figure 2). At different durations or lower solvent volumes, the content of TPs declined with a decrease in ethanol concentration and then increased as the ethanol concentration increased. However, at a higher volume of extraction solution, it showed a parabolic trend, first increasing and then decreasing. Additionally, the TPc content was significantly influenced by ultrasonic time at a constant ethanol concentration and solvent volume, with the highest amount in the experimental data found at 25% ethanol in 40 mL extractant for 30 min. For antioxidant activity, the impact of ethanol concentration and extraction duration on DPPH-SC and T-AOC followed similar trends. The activity increased within a small range as the ethanol content in the extract increased when the other factors were kept constant. With increasing duration, the two activity responses first increased and then decreased at a constant ethanol concentration and extractant dosage. Moreover, as the dosage was increased at a constant ethanol content or extraction time, DPPH-SC first decreased and then increased, while T-AOC decreased under the same conditions.
Based on the response data, ethanol concentration, extraction solution volume, and extraction time clearly affected all response parameters. Appropriately increasing the ethanol concentration, solvent amount, and ultrasonic time could increase the yield of the four triterpene acids. However, the antioxidant activity (DPPH-SC and T-AOC) and bioactive components (TPs and TPc) decreased upon prolonging the ultrasonic process after reaching their maximum levels. This was probably due to degradation on prolonged exposure to powerful ultrasonic energy [43,44]. All the above data revealed that the three investigated factors had different impacts on the extracted ingredients and activities. Therefore, these outputs could be grouped, which was consistent with the PCA. Although the responses were divided into two groups according to the PCA and RSM analysis, this did not mean that the triterpene acids had no antioxidant activity; rather, the four triterpene acids in this study correlated more weakly with DPPH-SC and T-AOC than TPs and TPc did. Thinzar Aung et al. also used this method to study the optimization of the extraction process for functional components from Porphyra dentata [20]. The ethanol amount, time, and solvent volume were found to exert significant effects on the release of triterpene acids. This is in good agreement with previous studies of triterpene acid extraction from Corni Fructus and olives using sonication [20,45]. Moreover, the stability and extractability of total polysaccharides and total phenols at longer ultrasonic durations were in good agreement with previous ultrasound-assisted extraction research [46,47]. Antioxidant function is one of the most studied biological activities of P. cocos and is associated with a variety of components, such as triterpene acids, TPs, and TPc [5,48,49]. Therefore, the bioactivity indexes (DPPH-SC and T-AOC) exhibited a trend similar to that of TPs and TPc. The more appropriate the extraction conditions, the better the extraction efficiency of antioxidants and active ingredients of P. cocos.

RSM-ANN-GA Modeling
The extraction conditions were set as ANN inputs for optimization, and the data generated from RSM experimental responses were fed into the output layer. The artificial neural network architecture topology of both groups is shown in Figure 3A. All the data were randomly allocated to training, testing, and validation sets for modeling. The experimental and predicted regressions for each group in the ANN are presented in Figure 3B,E and for each output in Figures S3 and S4. The best fit was obtained after 1000 iterations by using a feed-forward and BP algorithm according to previous studies [50,51]. As shown in Figure 3C,F, GA was used to optimize the weights and thresholds of each node in the ANN for an optimal network structure. The initial population was generated from the RSM-ANN model. Finally, the optimum fit and minimum sum-square error were obtained within 100 iterations. In Table 2, all RSM-ANN-GA prediction values were compared with predicted and experimental RSM data for each test condition. Then, the training process was evaluated by mean square error (MSE) (Figure 3D,G). MSE values dropped rapidly to reach a minimum within 10 epochs, which meant the best validation performance had been reached for each model and for each output in Figures S3 and S4 [52]. At this stage, training was stopped, and the weights and biases were applied to generate the RSM-ANN-GA model. The RSM-ANN-GA model, with a higher coefficient of determination (R²), attained better predictive power and higher predictive accuracy than the RSM model. Based on the regression analysis above, these models showed statistical agreement between experimental data and predicted values.
ANN-GA modeling is a well-known, flexible, and powerful technology. It has been widely used in many research fields owing to its capacity to optimize multiple variables simultaneously. In addition to optimizing the ANN network structure, GA was also applied to the optimization task itself.
As reported by Lahiri and Ghanta, GA requires only scalar values of the objective function rather than its first- and/or second-order derivatives [53]. For maximizing outputs, the network from RSM-ANN-GA was taken as the objective function and the three independent variables as a matrix variable with boundary constraints running from the lower limits [25;30;20] to the upper limits of the design ranges [75;50;60]. The parameters applied in the GA optimization process were the defaults of the optimization tool in MATLAB. The constraint-dependent population size was 50, and the rank scaling function was applied [54]. Stochastic uniform selection was used to choose parents for the next generation based on their scaled values from the fitness scaling function [55]. The elite count, i.e., the number of individuals guaranteed to survive to the next generation, was set to 0.05 × the population size, and the default crossover fraction of 0.8 was used to produce the new generation in the reproduction process. A constraint-dependent mutation function provided genetic diversity and enabled the GA to search a broader space. The crossover function was also constraint dependent and was used to form a new individual, or child, for the next generation. Forward migration was adopted with a migration fraction of 20% and an interval of 20 generations. The augmented Lagrangian, a nonlinear constraint algorithm, was applied to achieve the required accuracy [56].
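The GA settings described above can be illustrated with a small bounded, real-coded genetic algorithm. This is a simplified sketch and not MATLAB's optimization tool: the population size of 50, elite fraction of 0.05, crossover fraction of 0.8, and rank-based scaling follow the text, whereas the blend crossover, Gaussian mutation, random weighted selection, and the `toy_surrogate` objective (a hypothetical stand-in for the trained network, with bounds taken from the design ranges of X1-X3) are assumptions.

```python
import numpy as np

LOWER = np.array([25.0, 30.0, 20.0])   # ethanol %, time min, volume mL
UPPER = np.array([75.0, 50.0, 60.0])

def toy_surrogate(x):
    """Hypothetical objective with its maximum at (55, 48, 60)."""
    target = np.array([55.0, 48.0, 60.0])
    return -np.sum(((x - target) / (UPPER - LOWER)) ** 2, axis=-1)

def run_ga(objective, pop_size=50, generations=100,
           crossover_fraction=0.8, seed=0):
    rng = np.random.default_rng(seed)
    n_elite = int(np.ceil(0.05 * pop_size))
    n_cross = int(round(crossover_fraction * (pop_size - n_elite)))
    n_mut = pop_size - n_elite - n_cross
    pop = rng.uniform(LOWER, UPPER, size=(pop_size, LOWER.size))
    # Rank scaling: selection weight proportional to 1 / sqrt(rank).
    weights = 1.0 / np.sqrt(np.arange(1, pop_size + 1))
    probs = weights / weights.sum()
    for _ in range(generations):
        pop = pop[np.argsort(-objective(pop))]        # best individual first
        elite = pop[:n_elite]                         # survive unchanged
        pa = pop[rng.choice(pop_size, size=n_cross, p=probs)]
        pb = pop[rng.choice(pop_size, size=n_cross, p=probs)]
        mix = rng.uniform(size=pa.shape)
        children = mix * pa + (1.0 - mix) * pb        # blend crossover
        base = pop[rng.choice(pop_size, size=n_mut, p=probs)]
        noise = rng.normal(scale=0.05 * (UPPER - LOWER), size=base.shape)
        mutants = base + noise                        # Gaussian mutation
        # Clipping keeps every individual inside the boundary constraints.
        pop = np.clip(np.vstack([elite, children, mutants]), LOWER, UPPER)
    fitness = objective(pop)
    return pop[np.argmax(fitness)], float(fitness.max())

best_x, best_f = run_ga(toy_surrogate)
```

Because only objective values are compared, the same loop would work with any trained surrogate model in place of `toy_surrogate`, without derivative information.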

Optimum Ultrasonic Extraction Conditions
The optimization of the two response groups was statistically evaluated by comparing RSM with the hybrid RSM-ANN-GA in Table 3. According to the RSM model, the best predicted yield of the four triterpene acids in Group 1 was reached at 55.97% ethanol concentration, 49.30 min, and 60.00 mL of extraction agent, with a desirability of 0.9. For Group 2, 25.00% ethanol, 30.00 min, and 20.00 mL were more suitable, with a desirability of 0.87. Under their respective optimal conditions, the PA, TA, TAA, DA, TPs, TPc, DPPH-SC, and T-AOC values were 697.92, 51.93, 184.87, and 108.86 µg/g, 38.82 mg GLU/g, 319.78 µg GAE/g, 10.24%, and 1.77 µmol/g, respectively. Regarding the RSM-ANN-GA model, the optimized extraction conditions of Group 1 were 53.53% ethanol concentration, 48.64 min, and 60.00 mL, similar to the RSM model predictions. For Group 2, the optimized extraction time and volume were consistent with the RSM results. The outputs predicted by the two models under their optimized conditions were close to each other. However, a higher ethanol concentration (40.49% rather than 25%) was required to achieve the same yield as RSM in Group 2. The predicted values of the RSM-ANN-GA model were lower than those of RSM alone, except for DPPH-SC, which was slightly higher; nevertheless, the RSM-ANN-GA predictions were considered more credible and accurate based on the above modeling analysis [20,21].

Conclusions
In the present study, the optimization of ultrasound-assisted extraction was successfully carried out for the preparation of P. cocos extracts at different ethanol concentrations (25, 50, and 75%), times (30, 40, and 50 min), and extractant volumes (20, 40, and 60 mL) using two different types of statistical approaches. RSM was used to predict and optimize the global results with the minimum number of test samples, and PCA was applied to classify the multiple responses reasonably. The response indicators to be optimized were the PA, TA, TAA, DA, TPc, TPs, DPPH-SC, and T-AOC values. On comparing the predictive ability of the two types of mathematical modeling, the RSM-ANN-GA approach proved to be more reliable and accurate, exhibiting a higher R² value than the RSM model. This research not only proposes a statistical process that covers experimental design, variable correlation analysis, modeling, model structure optimization, and prediction, but also proposes a suitable extraction technology for the preparation of antioxidant and multicomponent P. cocos extracts. It should be emphasized that the factors affecting the ingredients and activities extracted from P. cocos are complex and diverse, including pH, grinding fineness, ultrasound frequency, and even origin and species. These factors limit the application scope of the model established in this study. Moreover, qualitative and quantitative analyses of bioactive components should also be explored for further purposeful modeling and optimization.