Construction of an MLR-QSAR Model Based on Dietary Flavonoids and Screening of Natural α-Glucosidase Inhibitors

Postprandial hyperglycemia can be reduced by inhibiting α-glucosidase activity. Common α-glucosidase inhibitors such as acarbose may have various side effects. Therefore, it is important to find natural products that are non-toxic and have high α-glucosidase-inhibitory activity. In the present study, a comprehensive computational analysis of 27 dietary flavonoid compounds with α-glucosidase-inhibitory activity was performed. These included flavonoids, flavanones, isoflavonoids, dihydrochalcone, flavan-3-ols, and anthocyanins. Firstly, molecular fingerprint similarity clustering analysis was performed on the target molecules. Secondly, multiple linear regression quantitative structure–activity relationship (MLR-QSAR) models of dietary flavonoids (2D descriptors and 3D descriptors optimized), with R2 of 0.927 and 0.934, respectively, were constructed using genetic algorithms. Finally, the MolNatSim tool based on the COCONUT database was used to match the similarity of each flavonoid in this study, and to sequentially perform molecular enrichment, similarity screening, and QSAR prediction. After screening, five kinds of natural product molecule (2-(3,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, norartocarpetin, 2-(2,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, 2-(3,4-dihydroxyphenyl)-5-hydroxy-4H-chromen-4-one, and morelosin) were finally obtained. Their IC50pre values were 8.977, 31.949, 78.566, 87.87, and 94.136 µM, respectively. Pharmacokinetic predictions evaluated the properties of the new natural products, such as bioavailability and toxicity. Molecular docking analysis revealed the interaction of candidate novel natural flavonoid compounds with the amino acid residues of α-glucosidase. Molecular dynamics (MD) simulations and molecular mechanics/generalized Born surface area (MMGBSA) further validated the stability of these novel natural compounds bound to α-glucosidase. The present findings may provide new directions in the search for novel natural α-glucosidase inhibitors.


Introduction
Diabetes mellitus is a metabolic disorder characterized by chronic hyperglycemia caused by various etiologies. Diabetes can be divided into different types according to its etiology. The common clinical types of diabetes are type 1 diabetes, type 2 diabetes, and gestational diabetes. Type 2 diabetes mellitus (T2DM) is the main type of diabetes, accounting for 90-95% of the incidence of diabetes [1]. T2DM can lead to certain secondary complications, such as nerve damage, kidney failure, and blindness [2]. Controlling postprandial blood glucose plays an important role in interfering with T2DM. Inhibiting China). The inhibition of α-glucosidase activity was performed using a modified method with p-nitrophenyl-α-D-glucopyranoside (p-NPG) as a substrate. The reaction was started by adding 4 mmol L −1 p-NPG (diluted in 50 mmol L −1 phosphate buffer, pH 6.8). The reaction mixture was incubated at 37 • C for 10 min, and the absorbance values were measured at 405 nm using a Spectra Max M5 microplate meter. Acarbose (purity > 95%) was used as a positive control, and 5% DMSO was used as a negative control. These molecules were used to perform the QSAR analysis. The average IC 50 values of flavonoids for α-glucosidase inhibition were converted to pIC 50 (−log IC 50 ) values as the dependent variable for QSAR analysis. The QSAR model was constructed by using MolAICal (version 1.3) [22] software. Additionally, the whole dataset was split into two datasets: 80% of the flavonoids were used as the training set, and the remaining 20% were used as the validation set, which was used to generate the final QSAR models and evaluate the predictive power of the derived models.

Structural Modeling
SMILES of dietary flavonoids were taken from the PubChem database, and the SMILES format for each dietary flavonoid was obtained using the RDKit program (version 2021.09, http://www.rdkit.org/ (accessed on 8 June 2022)) [23]. All structures performed energy minimization using the steepest descent method of the MMFF94s force field [24], which was used to eliminate poor atomic contacts and geometries.

Molecular Fingerprint Similarity Calculation
Molecular fingerprinting, as a method of characterizing chemical structures, is widely used in clustering or recursive partitioning and similarity searching [25]. The molecular fingerprint similarity of the dietary flavonoids was calculated using the Open-Source Cheminformatics Software RDKit (version 2021.09, http://www.rdkit.org/ (accessed on 12 June 2022)). Additionally, Simplified Molecular-Input Line-Entry System (SMILES) formats and topological fingerprints for each dietary flavonoid compound were obtained using the RDKit software. Finally, hierarchical clustering of dietary flavonoids was performed using the single-link algorithm of the Python scikit-learn library (version 0.23.2, https://scikit-learn.org.cn/ (accessed on 20 July 2022)) [26].

MLR-QSAR Modeling
The 2D and 3D molecular descriptors for each dietary flavonoid compound were obtained using the PaDEL-Descriptor software package (version 2.21, http://www.yapcwsoft. com/dd/padeldescriptor/ (accessed on 3 August 2022)) [27]. PaDEL descriptors contain 1875 descriptors (1444 1D and 2D descriptors, and 431 3D descriptors). The 2D molecular descriptors represent structural information that can be calculated from the 2D structure of the molecule, such as the number of benzene rings, the number of hydrogen bond donors, etc. The 3D molecular descriptors represent structural information that must be obtained from the 3D representation of the molecule (e.g., the solvent-accessible and surface area of the structure with a positive partial charge). Additionally, parameters such as the 3D autocorrelation charged part surface area, gravity index, length and width, moment of inertia, Petitjean shape index, RDF, and WHIM were described. Then, blank columns and columns with all zeros were removed, and correlation analysis was performed to remove the high covariance parameter (p > 0.9). Finally, genetic algorithms were used to further optimize these parameters, and the pIC 50 values of the flavonoids were iteratively optimized to build a multiple linear regression (MLR)-QSAR model.

Validation of the MLR-QSAR Model
The prediction ability and stability of the MLR-QSAR model were evaluated by using the parameters Q 2 (leave-one-out cross-validation), R 2 fitting, adjusted R 2 , RSS, PRESS, SDEC, SDEP, MSE, and MAE of partial least squares analysis. The formula definitions of these parameters can be found in the Supplementary Materials (Table S1).

Natural Product Screening
Firstly, molecular fingerprint similarity clustering analysis was performed on the target molecules, and the MolNatSim tool based on the COCONUT database was used to match the similarity of each flavonoid in this study to establish a clustering library of the target compounds. To rapidly locate the most similar natural product molecules and perform conditional screening, pre-optimized molecular similarity prediction models (molecular ECFP4 fingerprint and mini-batch k-means algorithm) are required [28]. In addition, the MolNatSim tool also provides visual analysis operations. Finally, the generated MLR-QSAR model and ADMET prediction analysis were used to make predictions for the possibility of the screened natural product molecules.

Molecular Docking and Molecular Dynamics Simulation
AutoDock Vina was used for molecular docking to obtain complexes in potentially optimal poses [29]. The 3D conformational isomers of the flavonoids were downloaded from PubChem and optimized using RDKit software based on the MMFF94s force field. The crystal structure of α-glucosidase from Saccharomyces cerevisiae is still not available., so we used the homology model reported by Jia et al. [21]. The structure was taken from the Protein Data Bank (PDB ID: 3A4A). ORCA 5.0.3 was used to calculate the structural optimization vibration and single-point energy, and the calculation level used for vibration and optimization was r2scan-3c [30]. The single-point energy used was RI-B3LYP-D3(BJ)/def2-TZVP [31], and the RESP charge was fitted using Multiwfn [32]. The small molecule Topol was obtained by fitting the bond and angle parameters based on the Hessian matrix using Sobtop [33]. Molecular dynamics (MD) simulations were used to explain conformational changes in the complex ligand-receptor binding interactions and stability [34]. All MD simulations in the current work used the GROMACS 19.5 package (https://manual.gromacs.org/ (accessed on 20 August 2022)) [35]. The docking structure models of flavonoids and α-glucosidase proteins were subjected to 50 ns MD simulations in solution with AMBER14SB [36] and the general AMBER force field (GAFF), respectively. The water box adopted the TIP3P water model with a minimum distance of 1.0 nm between the solute atoms and the edge of the periodic box. The system removed overlapping water molecules, and appropriate amounts of Na + /Cl − ions were added to neutralize the system. The energy minimization was conducted in two steps by gauge ensemble (NVT) and isothermal isobaric ensemble. After MD simulations, the root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF) of the complexes were analyzed using the GROMACS package. Unless otherwise stated, other parameters can be found in our previous work [37].

Combined Free Energy Calculation
Molecular mechanics/generalized Born surface area (MMGBSA) was used to calculate the binding free energy of the enzyme protein receptor and ligand small molecule complexes [38]. This algorithm can be used to resolve complex interactions between complex molecules by decomposing and calculating the components that make up the binding free energy. This method extracts the architecture from the MD simulation trajectory of the complex at certain time intervals and calculates the average binding free energy. As a scoring function, the MMGBSA calculation method is widely used in drug design. MMGBSA was used in this study to obtain binding free energies for the design of α-glucosidase inhibitors.

Molecular Fingerprint Similarity Analysis
Molecular fingerprints are special qualitative molecular descriptors originally designed for similarity searching, comparison, and clustering of molecules, and they are widely used in drug discovery and virtual screening [15]. The molecular similarity method was developed based on the principle of molecular similarity (i.e., structurally similar molecules should exhibit the same or similar biological activity) to search for compounds with similar characteristics to known ligands. Therefore, to reveal the difference in the α-glucosidase-inhibitory activity of dietary flavonoids, fingerprint similarity analysis was performed on the selected dietary flavonoid compounds ( Figure 1).

Molecular Fingerprint Similarity Analysis
Molecular fingerprints are special qualitative molecular descriptors originally designed for similarity searching, comparison, and clustering of molecules, and they are widely used in drug discovery and virtual screening [15]. The molecular similarity method was developed based on the principle of molecular similarity (i.e., structurally similar molecules should exhibit the same or similar biological activity) to search for compounds with similar characteristics to known ligands. Therefore, to reveal the difference in the α-glucosidase-inhibitory activity of dietary flavonoids, fingerprint similarity analysis was performed on the selected dietary flavonoid compounds ( Figure 1).  The hierarchical clustering results of the molecular fingerprint similarity of the selected dietary flavonoid compounds are shown in Figure 1. The more purple the color of the heatmap, the higher the molecular fingerprint similarity (MFS). Flavonoids are structurally composed of two benzene rings (A and B rings) with phenolic hydroxyl groups linked by three central carbon atoms [39]. According to the hierarchical clustering results of molecular fingerprint similarity, these dietary flavonoids can be classified into six categories according to the differences in the three-carbon-atom structure connecting the A ring and the B ring, such as whether the ring is formed, oxidized, and/or substituted. Flavanones (e.g., hesperitin, taxifolin, and eriodictyol) and (+)-catechin, with the largest IC 50 values, can be clustered as Cluster 1 (mean MFS score > 0.6). Among them, epigallocatechin gallate and (+)-catechin both belong to flavan-3-ols and are therefore clustered in Cluster 1. Since phloretin belongs to the dihydrochalcones, it was separately aggregated as Cluster 2. Flavonoids that do not contain O-glucose groups (e.g., myricetin, luteolin, baicalein, isorhamnetin, quercetin, apigenin, galangin, kaempferol, fisetin) were clustered into Cluster 3 (mean MFS score > 0.7). Isoflavones (e.g., genistein, formononetin) were clustered into one category as Cluster 4, which had a higher paired MFS score (mean MFS score > 0.8). Anthocyanins (e.g., cyanidin-3-O-glucoside) and flavonoids containing the O-glucose group (e.g., isorhamnetin-3-O-rutinoside, myricitrin, apigenin-7-O-glucoside, luteoloside, rutin, kaempferol-7-O-βglucoside) were clustered into Cluster 5 (mean MFS score > 0.75). Among the flavonoids, vitexin, vitexin glucoside, and isoschaftoside, with more similar side-chain linkages, were clustered into one category as Cluster 6, which had the highest paired MFS scores (mean MFS score > 0.9). Overall, the hierarchical clustering results showed that the molecular fingerprint similarity and structures of the flavonoids were in good agreement.

MLR-QSAR Analysis
QSAR modeling is a method for establishing quantitative relationships between the structural or physicochemical parameters of compounds and their biological activities using mathematical calculation and statistical analysis based on MLR. Predicting the biological activity of unknown compounds based on the MLR-QSAR model can significantly reduce the cost of target compound screening and design. We used the PaDEL-Descriptor software package to generate 2D and 3D descriptors to fully characterize the structure of the dietary flavonoids, analyzing the autocorrelation of the descriptors while excluding highly linear descriptors. Finally, a genetic algorithm was used to further optimize the descriptors, and the pIC 50 values were iteratively modeled to construct the MLR-QSAR [40]. The results in Figure 2 show that all of the points are located near the diagonal line, indicating a good correlation between the experimental inhibition rate (pIC 50 ) and the predicted inhibition rate (pIC 50pre ), suggesting that the model is very predictable ( Table 1). In the 2D-MLR-QSAR model, the R 2 fitting was 0.9273, the R 2 adjusted was 0.9046, and the Q 2 LOO was 0.8861. In the 3D-MLR-QSAR model, the R 2 fitting, R 2 adjusted, and Q 2 LOO were 0.9336, 0.9129, and 0.8991, respectively ( Table 2). These results indicate that the developed model has a good fit and could adequately predict the inhibitory effects of dietary flavonoid compounds on α-glucosidase.
The RSS, PRESS, SDEC, SDEP, MSE, and MAE parameters were calculated to further evaluate the robustness of the model, and the results are shown in Table 2. The RSS of the 2D and 3D QSAR models was 0.6906 and 0.5679, the PRESS was 1.0829 and 0.8633, the SDEC was 0.1772 and 0.1607, the SDEP was 0.2219 and 0.1981, the MSE was 0.1636 and 0.2545, and the MAE was 0.3138 and 0.349, respectively. These results show that this model has good robustness and predictive capability, and that the predictive performance of 2D-MLR-QSAR is better than that of 3D-MLR-QSAR.
Next, the relationships between the structural properties and pIC 50 values of the dietary flavonoids were further analyzed based on key descriptors. In the MLR equation of the 2D-QSAR model, MIC1, ATS4v, AATS7m, CIC3, and minssCH2 were identified as key descriptors affecting the inhibitory activity of dietary flavonoids on α-glucosidase ( Table 2). The maximum coefficient of CIC3 was 3.06615, and the coefficient of AATS7m was 0.05492, contributing positively to the model. Meanwhile, the coefficients of MIC1, ATS4v, and minssCH2 were −0.15599, −0.00011, and −3.00189, respectively, contributing negatively to the model. ATS4v is a molecular descriptor for characterizing the topology of compounds based on the intrinsic states of atoms [41]. AATS7m corresponds to the length of the branch on the R group in the molecular structure of dietary flavonoids [42]. CIC3 is the complementary information content index (third-order neighborhood symmetry). MIC1 is a molecular descriptor that considers steric effects in molecule-receptor interactions based on proximity and edge multiplicity [43]. Unlike the MLR equation of the 2D-QSAR model, LOBMAX, RDF35i, TDB10i, TDB9i, and TDB6m in the MLR equation of the 3D-QSAR model were considered to be the key descriptors affecting the inhibitory activity of dietary flavonoids on α-glucosidase (Table 2). Among them, TDB10i, TDB9i, and TDB6m were considered to be based on molecular 3D topological distance autocorrelation [43]. Based on the coefficients of these parameters listed in Table 2, LOBMAX, TDB10i, and TDB6m were 0.49135, 0.00505, and 0.01155, respectively, contributing positively to the model, while the other two parameters (RDF35i and TDB9i) contributed negatively to the model. RDF35i corresponds to the radial distribution function-055/weighted by the first ionization potential, which describes the molecular radial distribution function [44].
Based on Pearson's correlation coefficients, autocorrelation analysis of the two descriptors used to train the MLR-QSAR model was conducted. Figure 3a,b show the optimal autocorrelation plots of the descriptors for the 2D-MLR-and 3D-MLR-QSAR models of the flavonoids, respectively. The results showed that the pairwise correlation of most of the key descriptors was less than 0.8, indicating that the correlation between these key descriptors was not strong. It follows that the constructed model can be used to quantify the inhibitory activity of dietary flavonoids on α-glucosidase. Since the 2D-MLR-QSAR model had better predictive performance, it was used for prediction and screening in subsequent study. these key descriptors was not strong. It follows that the constructed model can be used to quantify the inhibitory activity of dietary flavonoids on α-glucosidase. Since the 2D-MLR-QSAR model had better predictive performance, it was used for prediction and screening in subsequent study.

Discovery of Natural α-Glucosidase Inhibitors and ADMET Analysis
To better apply the model to predict the potential natural α-glucosidase inhibitors, we used the COCONUT-based MolNatSim tool (which contains over 400,000 natural products) to match the 27 natural dietary flavonoid molecules modeled in this paper. Refer to our previous work for an introduction to the MolNatSim Tool [45]. Natural product clustering libraries with five different clustering degrees were used to enrich each molecule. Firstly, the Morgan fingerprint similarity between the molecules in each molecular arrowhead and the target molecule was calculated using RDKit, and molecules with a similarity greater than 80% were retained. Secondly, a total of 1716 natural molecules were enriched by merging 27 molecular arrowheads. Finally, 1198 natural molecules were obtained by deduplication according to SMILE and number.
Drug similarity is a key consideration when selecting compounds in the early stages of drug discovery. Applying quantitative estimation of drug-likeness (QED) to the molecularly targeted druggability assessment problem facilitates drug discovery by prioritizing a large number of published biologically active compounds [46]. Therefore, QED and other screening methods should be used to accelerate the discovery of αglucosidase-inhibitory activity of natural dietary flavonoid molecules based on similarity. We chose a QED value of 0.5 and MWs of 650 as the threshold to reduce the number of molecules from 852 to 82 and removed the elements not contained in the molecules from the modeling. The results are shown in Table S2, and 78 natural product molecules were obtained, including the 3 molecules in the modeling. The prediction of pIC50pre was carried out using the 2D-MLR-QSAR model, and the results are shown in Table S2. Finally, five natural products were found to meet the requirements that pIC50pre be larger than the pIC50 of the 27 natural dietary flavonoid molecules. These were 2-(3,5dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, norartocarpetin, 2-(2,5dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, 2-(3,4-dihydroxyphenyl)-5hydroxy-4H-chromen-4-one, and morelosin. The quantitative conformational relationship predictions showed that their IC50pre values were 8.98, 31.95, 78.57, 87.87, and 94.14 µM, respectively.
To assess bioavailability and avoid drug interactions based on the proposed natural compounds, the prediction of ADMET properties was conducted to reduce potential problems in later clinical trials. For this purpose, the pharmacokinetics of these novel natural product α-glucosidase inhibitors were considered using SwissADME web server

Discovery of Natural α-Glucosidase Inhibitors and ADMET Analysis
To better apply the model to predict the potential natural α-glucosidase inhibitors, we used the COCONUT-based MolNatSim tool (which contains over 400,000 natural products) to match the 27 natural dietary flavonoid molecules modeled in this paper. Refer to our previous work for an introduction to the MolNatSim Tool [45]. Natural product clustering libraries with five different clustering degrees were used to enrich each molecule. Firstly, the Morgan fingerprint similarity between the molecules in each molecular arrowhead and the target molecule was calculated using RDKit, and molecules with a similarity greater than 80% were retained. Secondly, a total of 1716 natural molecules were enriched by merging 27 molecular arrowheads. Finally, 1198 natural molecules were obtained by deduplication according to SMILE and number.
Drug similarity is a key consideration when selecting compounds in the early stages of drug discovery. Applying quantitative estimation of drug-likeness (QED) to the molecularly targeted druggability assessment problem facilitates drug discovery by prioritizing a large number of published biologically active compounds [46]. Therefore, QED and other screening methods should be used to accelerate the discovery of α-glucosidase-inhibitory activity of natural dietary flavonoid molecules based on similarity. We chose a QED value of 0.5 and MWs of 650 as the threshold to reduce the number of molecules from 852 to 82 and removed the elements not contained in the molecules from the modeling. The results are shown in Table S2, and 78 natural product molecules were obtained, including the 3 molecules in the modeling. The prediction of pIC 50pre was carried out using the 2D-MLR-QSAR model, and the results are shown in Table S2. Finally, five natural products were found to meet the requirements that pIC 50pre be larger than the pIC 50 of the 27 natural dietary flavonoid molecules. These were 2-(3,5-dihydroxyphenyl)-5,7dihydroxy-4H-chromen-4-one, norartocarpetin, 2-(2,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, 2-(3,4-dihydroxyphenyl)-5-hydroxy-4H-chromen-4-one, and morelosin. The quantitative conformational relationship predictions showed that their IC 50pre values were 8.98, 31.95, 78.57, 87.87, and 94.14 µM, respectively.
To assess bioavailability and avoid drug interactions based on the proposed natural compounds, the prediction of ADMET properties was conducted to reduce potential problems in later clinical trials. For this purpose, the pharmacokinetics of these novel natural product α-glucosidase inhibitors were considered using SwissADME web server [47] (http://www.swissadme.ch (accessed on 17 August 2022)) and PreADMET web server [48] (https://preadmet.bmdrc.kr (accessed on 18 August 2022)) predictions. ADMET parameters were calculated for the five natural products, and the results are shown in Table 3. Solubility is an important parameter in ADMET prediction to evaluate the possibility of intestinal absorption and blood distribution. Three of these natural products-2-(3,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, norartocarpetin, and 2-(2,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one-had good solubility, and all five natural products may have good bioavailability and gastrointestinal absorption. In addition, the blood-brain barrier (BBB) is the main interface separating the central nervous system from the blood circulation [49]. Our results showed that these five natural products have non-permeable BBB ability. Pgp is an important transporter protein responsible for the excretion of many harmful substances from cells, and naturally also for the removal of many drugs from cells [50]. Our results showed that these five natural products are not Pgp substrates, indicating they cannot reduce the debilitating effects of drug efficacy. Fortunately, our candidate compounds 2-(3,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one and norartocarpetin are not expected to show toxicity to cancerous rat and mouse cells. As shown in Table S3, based on drug similarity predictions, the natural compounds conformed to Lipinski's rule, Veber's rule, and Egan's rule, indicating that the compounds can be easily synthesized [51]. Thus, these studies provide favorable support for our model to predict the outcomes of natural α-glucosidase inhibitors.

Molecular Dynamics Simulation Analysis
The dynamic stability and binding energy of the complexes were further analyzed by molecular dynamics (MD) simulations. RMSD was used to determine the average deviation of the complex conformation from the original conformation at a given time and to assess whether the complex system had reached a steady state [55]. As shown in Figure 5a, both 2-(3,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one and 2-(2,5dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one systems reached equilibrium at 20,000 ps, and their RMSD values remained stable at 0.12 and 0.14 nm, respectively. Both 2-(3,4-dihydroxyphenyl)-5-hydroxy-4H-chromen-4-one and morelosin showed a rise in their RMSD values in the first 25,000 ps of the simulation and then became more stable at 45,000 ps while reaching equilibrium, and their RMSD values remained stable at 0.20 and 0.18 nm, respectively. The norartocarpetin system reached equilibrium faster, within 15,000 ps, and was stable at an average value of 0.10 nm. Moreover, norartocarpetin was the least fluctuating system overall among the five systems. In summary, the five systems converged and balanced within 50,000 ps of simulation, with 2-(3,5dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, 2-(2,5-dihydroxyphenyl)-5,7dihydroxy-4H-chromen-4-one, and norartocarpetin showing the best stability.
RMSF can be used to characterize the stability of a composite system by evaluating the fluctuation of conformational change from the original conformation [55]. Figure 5b

Combining Free Energy Calculations by the MMGBSA Method
To quantitatively characterize the effects of the interactions between the five novel natural products and α-glucosidase, we calculated the binding free energy of the two products using the MMGBSA algorithm. After choosing 40-50 ns and taking conformations at certain intervals, the MD of the complex systems was simulated by the MM-GBSA algorithm. As shown in Table 4, all of the complexes showed good binding free energies; the simulated values of free energy of binding (∆Gbind) for the five complexes were −23.87, −37.78, −27.81, −22.53, and −16.56 kcal/mol, respectively. The negative values of the free energy of binding indicate that the process of binding is spontaneous and can reach a steady state [56]. Among all types of interactions, the van der Waals interaction energy term (ΔEvdw) and the electrostatic interaction energy term (∆Eele) contribute the most to the average binding free energy, providing the main driving force for the binding of novel natural products to α-glucosidases. The non-polar solvation energy term (∆Gnon-pol) also acts as a key force to maintain the stability of the system. ΔGgas includes van der Waals energy (ΔEvdw), electrostatic energy (ΔEele), and non-polar solvation energy (∆Gnon-pol) [56]. However, the polar solvation energy (∆Gpol) exhibited an unfavorable energetic contribution, inhibiting the spontaneous binding of the natural product to α-glucosidase, but was mostly counteracted by the molecular electrostatic interaction energy under vacuum conditions. Notably, in terms of the total binding free RMSF can be used to characterize the stability of a composite system by evaluating the fluctuation of conformational change from the original conformation [55]. Figure 5b shows the RMSF values of the five natural product-α-glucosidase complex molecules as a function of residue number. In the active center, GLU274, GLN276, ASP304, GLN350, and ARG439 are the key amino acid residues. The binding of the five natural products to the α-glucosidase receptor leads to the increased flexibility of residues (212-239), (274-304), and (407-439) in the critical region. This suggests that novel natural flavonoid compounds may inhibit α-glucosidase activity by interacting with key residues affecting the activity pocket. As shown in Figure 5b, the mean values of RMSF for all residues of the five natural products were 0.07, 0.06, 0.07, 0.08, and 0.09 nm, respectively. Thus, the natural products 2-(3,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, norartocarpetin, and 2-(2,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one have lower flexibility (or higher rigidity), except for a limited number of regions. The RMSF provided information on the local structural mobility of the proteins during the MD simulations. Among the five environments, the natural products 2-(3,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, norartocarpetin, and 2-(2,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one showed lower residue fluctuations, indicating that these three natural products have fewer conformational changes and higher stability after binding to α-glucosidase.

Combining Free Energy Calculations by the MMGBSA Method
To quantitatively characterize the effects of the interactions between the five novel natural products and α-glucosidase, we calculated the binding free energy of the two products using the MMGBSA algorithm. After choosing 40-50 ns and taking conformations at certain intervals, the MD of the complex systems was simulated by the MM-GBSA algorithm. As shown in Table 4, all of the complexes showed good binding free energies; the simulated values of free energy of binding (∆G bind ) for the five complexes were −23.87, −37.78, −27.81, −22.53, and −16.56 kcal/mol, respectively. The negative values of the free energy of binding indicate that the process of binding is spontaneous and can reach a steady state [56]. Among all types of interactions, the van der Waals interaction energy term (∆E vdw ) and the electrostatic interaction energy term (∆E ele ) contribute the most to the average binding free energy, providing the main driving force for the binding of novel natural products to α-glucosidases. The non-polar solvation energy term (∆G non-pol ) also acts as a key force to maintain the stability of the system. ∆G gas includes van der Waals energy (∆E vdw ), electrostatic energy (∆E ele ), and non-polar solvation energy (∆G non-pol ) [56]. However, the polar solvation energy (∆G pol ) exhibited an unfavorable energetic contribution, inhibiting the spontaneous binding of the natural product to α-glucosidase, but was mostly counteracted by the molecular electrostatic interaction energy under vacuum conditions. Notably, in terms of the total binding free energy, the natural product inhibitors norartocarpetin, 2-(3,5-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one, and 2-(2,5dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one can form more stable complexes with α-glucosidase. Table 4. Binding free energy of complex formation between α-glucosidase and its inhibitors.  Note: ∆E vdw , the van der Waals interaction energy term; ∆E ele , the electrostatic interaction energy term; ∆G pol , the polar solvation energy term; ∆G non-pol , The non-polar solvation energy term; ∆G gas , the gas phase free energy term; ∆G sol , the solvation free energy term; ∆G bind , the free energy of binding.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/foods11244046/s1, Table S1: Equations presented in the research paper; Table S2 2D-MLR-QSAR prediction of potential natural α-glucosidase inhibitors; Table S3. Drug similarity prediction of natural products based on Lipinski, Veber and Egan rules.