A Strategy Based on GC-MS/MS, UPLC-MS/MS and Virtual Molecular Docking for Analysis and Prediction of Bioactive Compounds in Eucalyptus Globulus Leaves

The discovery of medicinal plants is crucial for drug development. Eucalyptus globulus leaves are used as a traditional medicine in many areas of the world owing to their herbicidal and insecticidal activity. Because the constituents of natural products are difficult to separate and assay individually, a new approach is needed to predict the active ingredients they contain. In this study, a new method for screening active compounds extracted from E. globulus leaves was developed by combining GC-MS/MS and UPLC-MS/MS with molecular docking, and compounds predicted to be highly active were proposed. First, 35 volatile compounds and 34 aqueous-extracted compounds were obtained from E. globulus leaves and identified by GC-MS/MS and UPLC-MS/MS. The identified compounds were then docked with the herbicidal receptor (1BX9) using docking software and evaluated by docking models and seven scoring functions. The results showed that gallic acid had strong inhibitory activity against 1BX9, which was speculated to be the main reason for the inhibitory effect of E. globulus leaves. Finally, allelopathic tests of gallic acid, citric acid, and isopulegol were carried out on grass seeds to verify their predicted inhibitory activity against the herbicide receptor 1BX9. The results show that the method can screen compounds with specific activity from the complex system of a medicinal plant, which is important for the screening of new active ingredients, the confirmation of new medicinal ingredients, and the in-depth development of animal and plant medicines.


Introduction
Eucalyptus globulus Labill., an evergreen plant, is known in China as a fast-growing species. It belongs to the Myrtaceae family and is usually distributed in tropical and subtropical regions, including the Guangdong, Guangxi and Yunnan provinces of China [1]. It has both pharmacological and biotoxic activities. Its pharmacological activities mainly include antioxidant, anti-tumor and hypoglycemic effects, and its biotoxic activities include allelopathy, insecticidal action and antimicrobial activity [2,3].
At present, research on the allelopathic activity of E. globulus leaves focuses mainly on the essential oil, and the allelopathic effect of the aqueous extracts is less well known. Allelopathic active ingredients are released in four ways: volatilization, root exudation, leaching and decay [4]. Leaching is one route by which E. globulus leaves release their active compounds, which then spread through water and affect the environment. The aqueous extracts of E. globulus leaves may therefore contain a variety of important biologically active compounds. For this reason, we studied the allelopathic activity of both the volatile components and the aqueous-extracted components of E. globulus leaves.
Traditional separation techniques for natural products have low separation efficiency and high cost. More importantly, the active ingredients are usually present at low levels and can be lost during separation through irreversible adsorption on solid supports. Sometimes the recovery of an isolated compound is too low to allow further activity testing [5,6]. Over the last decade, GC-MS/MS and UPLC-MS/MS have been widely used for the separation and rapid identification of compounds in natural products [7]. However, chromatography-mass spectrometry alone cannot determine which compound in a complex mixture is responsible for which activity. To address this problem, the molecular docking method [8] was introduced.
The molecular docking technique uses energy-based scoring functions to predict interactions between ligands and receptor active sites [9]. It simulates the interaction between small-molecule ligands and biomacromolecular receptors based on the "lock-and-key" principle. Ligand-receptor interaction is a process of molecular recognition that mainly involves electrostatic interaction, hydrogen bonding, hydrophobic interaction and van der Waals interaction. Through calculation, the binding mode and affinity between ligand and receptor can be predicted, which allows virtual screening of drugs.
In this study, the compounds of E. globulus leaves were identified by GC-MS/MS and UPLC-MS/MS and evaluated by molecular docking. The potential active ingredients were determined from the scoring functions and the predicted interactions. Finally, the predicted activity of these ingredients was verified by an activity test, and the active compounds of the E. globulus leaves were screened.

Optimization of UPLC Conditions
In order to obtain the best separation effect of the components in the aqueous extracts from the E. globulus leaves, the separation conditions of the ultra-high-performance liquid chromatography (UPLC) were optimized in this study, including the optimization of the column, mobile phase and column temperature.
Columns of different lengths and particle sizes were tested. The optimization of mobile phase conditions covered the separation obtained with methanol-formic acid solution and acetonitrile-formic acid solution, different mobile phase flow rates, and different formic acid concentrations (0.1%, 0.5%, 1%). Column temperatures of 25, 30, 35 and 40 °C were tested. The best separation was obtained with a Waters XSelect HSS T3 column (150 mm × 2.1 mm, 3.5 µm), an acetonitrile-0.1% formic acid mobile phase, a flow rate of 0.2 mL/min and a column temperature of 30 °C.

Optimization of HS-SPME Extraction Method
The equilibrium time, extraction time and extraction temperature of the headspace solid phase microextraction (HS-SPME) method have a great influence on the volatile components recovered from the E. globulus leaves. To find the optimal conditions, three equilibrium times (20, 30, 40 min), three extraction times (30, 40, 50 min) and three extraction temperatures (40, 50, 60 °C) were tested.
After comparing the number of volatile components and the peak intensities in GC-MS under the different conditions, an equilibrium time of 30 min, an extraction time of 40 min and an extraction temperature of 50 °C were selected as the optimum extraction conditions.

Aqueous Extracts from E. Globulus Leaves
To analyze the bioactive compounds of the aqueous extracts from the E. globulus leaves, UPLC-Q-Orbitrap-MS was used; the mass spectra of the aqueous extracts are shown in Figure 1. A total of 34 active compounds were identified, mainly including gallic acid, protocatechuic acid, quinic acid, caffeic acid, p-coumaric acid, benzoic acid and ferulic acid. Among all compound classes, tannins were the most abundant, accounting for 3.45% of the aqueous extracts. Within this class, gallic acid and ellagic acid were the main components, at 2.81% and 0.19%, respectively. Phenolic acids were the second most abundant class at 2.93%, of which quinic acid was the highest at 2.40%, while caffeic acid, protocatechuic acid and gentisic acid each accounted for 0.15% (Table 1). Gallic acid is thus the most abundant component in the aqueous extracts. Gallic acid is a common ingredient in eucalyptus plants. Many studies have addressed its antibacterial activity, and it has a good inhibitory effect on Staphylococcus aureus and Salmonella enteritidis [10]. It has also been identified in Eucalyptus robusta Smith and Eucalyptus urophylla [11] and may be highly correlated with the activity of the aqueous extracts from E. globulus leaves.

Volatile Compounds of E. Globulus Leaves
The mass spectra of the volatile components of E. globulus leaves were obtained by HS-SPME-GC-MS, as shown in Figure 2. A total of 35 volatile compounds were identified, mainly including isopulegol, α-terpineol, β-eudesmol, γ-terpinene, 3-carene, β-pinene, α-pinene, and camphene. Each volatile component of the E. globulus leaves was well separated and identified. Monoterpenoids were the most abundant and most important class, accounting for 43.09% of the total volatile content. Studies have found that various biological activities are closely related to monoterpenoids, and monoterpenoids extracted from many plants have demonstrated or potential biological activity. Isopulegol is a monoterpenoid that accounts for 0.41% of the volatile compounds from E. globulus leaves (Table 2). It is present in the essential oils of many plants, such as Corymbia citriodora H. [21], Zanthoxylum schinifolium L. [22] and Melissa officinalis L. [23], all of which show anti-inflammatory activity. Isopulegol also has central and peripheral analgesic effects [24], as well as a broad spectrum of antifungal activity [25]. It may therefore be related to the biological activity of the volatile compounds in E. globulus leaves.

Docking Results
Using "herbicide" as a keyword, the relevant protein was retrieved from the TTD database as the herbicide target protein, and the corresponding literature was searched and screened. The crystal structure was obtained from the PDB database (http://www.rcsb.org/pdb/home/home.do); the herbicidal protein is a glutathione S-transferase-herbicide complex (1BX9). The Ligandfit [26] package in Discovery Studio is a classic tool for virtual screening by molecular docking. It provides automatic searching and confirmation of receptor active sites, flexible-conformation docking of multiple ligands, and force-field-based evaluation of interaction scores.
After docking the 69 active compounds (Supplementary 1) from E. globulus leaves with the target protein, 23 aqueous-extracted compounds and 29 volatile compounds were found to dock successfully with 1BX9. Gallic acid had the highest dock score, 89.142, and also the highest content among the aqueous-extracted compounds, so it was suspected to have potential 1BX9 inhibitory activity. The dock score of glucogallic acid was also high at 82.628. Glucogallic acid is the glycosylated form of gallic acid, with enhanced aqueous solubility and low toxicity [27]. Its score was lower than that of gallic acid, in line with its lower toxicity (Table 3). Citric acid ranked third among the aqueous-extracted compounds with a dock score of 76.598 and also had potential 1BX9 inhibitory activity. Among the volatile compounds from the E. globulus leaves, isopulegol, a monoterpenoid, had the highest dock score and may have strong biological activity. Table 3 shows that most aqueous-extracted compounds from E. globulus leaves scored better than the volatile compounds in docking with 1BX9.

Docking Model
Among all identified compounds from E. globulus leaves, gallic acid, glucogallic acid, and citric acid had the highest dock scores, and all belonged to the aqueous extracts; isopulegol scored highest among the volatile compounds. Therefore, the docking models of gallic acid, glucogallic acid, citric acid, and isopulegol were analyzed (Table 4) to study the interaction of the active compounds with the herbicidal receptor (1BX9). The greater the absolute energy in the docking model, the better the docking result; dashed lines indicate hydrogen bonds formed with amino acid side chains. The H-Bind number gives the number of hydrogen bonds formed, and the surrounding residues also participate in the interaction. The interaction of gallic acid, glucogallic acid and citric acid with 1BX9 was mainly mediated by the amino acid residue Lys 35, and the interaction between isopulegol and 1BX9 was mainly mediated by the residues Lys 35 and Tyr 126 (Table 4). In both cases the interaction occurs through the formation of hydrogen bonds. The absolute energies of gallic acid, glucogallic acid, citric acid and isopulegol were 15.277, 13.502, 12.893 and 4.431 kcal/mol, respectively. Gallic acid had the highest absolute energy, indicating a good docking result with 1BX9. In addition to these residues, other amino acid residues (His 7, Ile 125, Ala 9, Pro 8) promoted substrate recognition through a combination of electrostatic and hydrophobic interactions. The scoring and docking model analysis showed that gallic acid had the strongest potential inhibitory activity.

Allelopathic Effect of E. Globulus Leaves Extract
To study the allelopathic effect of the extracts from E. globulus leaves on grass seeds, the aqueous-extracted compounds and volatile compounds were diluted to different concentrations, with distilled water as the control. The germination rate, root inhibition rate and seedling inhibition rate of the grass seeds were measured to evaluate the allelopathic effects. In this experiment, a lower germination rate indicates stronger inhibition of seed germination, whereas a higher root or seedling inhibition rate indicates a stronger allelopathic effect. The results showed that both the aqueous-extracted compounds and the volatile compounds had an allelopathic effect, and that the concentration of the extract was positively correlated with the allelopathic effect on the grass seeds. As the concentration increased, the germination rate of the grass seeds gradually decreased, while the inhibition rates of root growth and seedling growth gradually increased. At the same concentration, the germination rate of grass seeds treated with the aqueous-extracted compounds was lower than that of seeds treated with the volatile compounds, and the inhibition rates of seedlings and roots were higher. This indicates that the allelopathic effect of the aqueous-extracted compounds on grass seeds was stronger than that of the volatile compounds from E. globulus leaves.

Allelopathic Effects of Predicted Active Ingredients of E. Globulus Leaves
The above allelopathic experiments showed that the allelopathic effect of the aqueous-extracted compounds was stronger than that of the volatile compounds from E. globulus leaves, which was consistent with the predictions of molecular docking. Molecular docking predicted that the compound with the strongest allelopathic effect in E. globulus leaves was gallic acid, followed by gallic acid glucoside and citric acid. Because gallic acid glucoside is the glycosylated form of gallic acid, with enhanced aqueous solubility and reduced toxicity [27], its allelopathic effect on grass seeds is expected to be lower than that of gallic acid. Among the volatile components, isopulegol was predicted to have the strongest allelopathic effect. The allelopathic effects of gallic acid, citric acid, and isopulegol were therefore further verified.
With distilled water as the control, gallic acid, citric acid and isopulegol were diluted to different concentrations, and their allelopathic effects on grass seeds were assessed through the germination rate and the inhibition rates of root growth and seedling growth. The results showed that gallic acid, citric acid and isopulegol significantly inhibited the germination, seedling growth and root growth of grass seeds compared with the control group, so all three had allelopathic effects. The concentration was positively correlated with the allelopathic effect on grass seeds. With increasing concentration, the germination rate of grass seeds gradually decreased, while the root growth inhibition rate and seedling growth inhibition rate gradually increased. At the same concentration, gallic acid inhibited germination, seedling growth and root growth more strongly than citric acid and isopulegol. Comparing Figures 3 and 4, gallic acid inhibited germination, seedling growth and root growth more strongly than the aqueous-extracted compounds, which in turn were much more inhibitory than the volatile compounds. The inhibition by citric acid was slightly lower than that of gallic acid but higher than that of the aqueous-extracted compounds and volatile compounds from E. globulus leaves. Isopulegol had the weakest allelopathic effect of the three, with the lowest inhibition of grass seed germination, root growth and seedling growth.
Figure. Allelopathic effects of predicted active compounds in E. globulus leaves. A is gallic acid, B is citric acid, and C is isopulegol. * represents a significant difference between the experimental group and the control group (* p < 0.05, ** p < 0.01).
From the above experiments, it can be concluded that the aqueous-extracted compounds and volatile compounds from E. globulus leaves, as well as the active compounds predicted by molecular docking, all have an allelopathic effect. The allelopathic effect of the aqueous-extracted compounds was stronger than that of the volatile components, and the individual compounds ranked as follows: gallic acid > citric acid > isopulegol. Gallic acid had the strongest allelopathic effect and isopulegol the weakest, consistent with the results of molecular docking, in which the aqueous-extracted compounds scored better than the volatile compounds and gallic acid scored better than citric acid and isopulegol.

Chemicals and Plant Materials
E. globulus leaves (fresh and undamaged); grass seeds (Cynodon dactylon, purchased from garden flower seed shop, Hangzhou, China); gallic acid (purchased from source leaf standard material center, Shanghai, China); isopulegol (purchased from Tan ink Standard Substance Center, Beijing, China); citric acid (purchased from Tan ink Standard Substance Center, Beijing, China); acetophenone and n-tetradecane (purchased from Aladdin, Los Angeles, CA, USA).

Extraction of Aqueous Extracts from E. Globulus Leaves
Fresh E. globulus leaves (10 g) were randomly sampled and weighed, and 100 mL of pure water was added (ratio 1:10, m/v). The mixture was stirred, ultrasonically extracted for 30 min, leached for 24 h at room temperature, and then centrifuged for 10 min. The supernatant was taken as the aqueous extract of E. globulus leaves.

Extraction of Volatile Components from E. Globulus Leaves
Fresh E. globulus leaves (100 g) were randomly sampled, weighed, pulverized in a Q-250A3 pulverizer, and then passed through a 100-mesh sieve. Then 1.00 g (±1%) of the pulverized leaves was weighed into a 20 mL headspace vial for testing. The above tests were repeated twice.

UPLC-MS/MS Analysis of Aqueous Extracts from Leaves
In this experiment, the active organic matter of the E. globulus leaves was separated on a Dionex Ultimate 3000 ultra-high-performance liquid chromatograph with an XSelect HSS T3 column (150 mm × 2.1 mm, 3.5 µm). The column temperature was set to 30 °C and the sample tray temperature to 5 °C. The autosampler needle was washed twice with 200 µL of 70% methanol solution before and after each injection. Detection was performed with a Waters 2996 PDA detector. The mobile phase consisted of acetonitrile (phase A) and 0.1% formic acid (phase B), with the following elution gradient: 3% A, 0-5 min; 3-5% A, 5-10 min; 5% A, 10-20 min; 5-10% A, 20-25 min; 10% A, 25-35 min; 10-20% A, 35-50 min; 20% A, 50-60 min; 20-30% A, 60-70 min; 30-95% A, 70-75 min; 95-3% A, 75-80 min; 3% A, 80-90 min. The flow rate was 0.2 mL/min and the injection volume was 10 µL.
Mass spectrometry was performed by UPLC-Q-Orbitrap-MS with an ESI source in negative ion mode. The MS1 resolution was set to RP = 70,000 at m/z 200 and the MS2 resolution to RP = 35,000 at m/z 200. The ESI probe vaporizer temperature was 250 °C, and the scan range was m/z 100-1000. Switching between MS1 and MS2 was triggered by the MS1 ion intensity: an MS2 scan was performed when the MS1 ion intensity of a substance was at its highest.
Data analysis was performed in the Xcalibur software, and some of the components were identified using the mzCloud (www.mzCloud.org) mass spectral library and NIST08f. In the identification process, the suspected parent ion is first extracted, so that a narrow mass range can be quickly located among the complex components of each sample, and then the precise retention time, molecular ion peak and MS2 spectrum at a specific energy are compared. The characteristic fragments are analyzed qualitatively with reference to the literature. Finally, the relative content was quantified by the area normalization method.
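The area normalization step can be illustrated with a minimal sketch, in which the relative content of each peak is its integrated area divided by the summed area of all peaks. The function name and the peak areas below are hypothetical and serve only as an illustration; they are not data from this study.

```python
def area_normalize(peak_areas):
    """Return the relative content (%) of each peak: area / total area * 100."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated peak areas (arbitrary units), for illustration only.
example_areas = {"compound_a": 600.0, "compound_b": 300.0, "compound_c": 100.0}
relative_content = area_normalize(example_areas)

for name, percent in relative_content.items():
    print(f"{name}: {percent:.2f}%")
```

Because each value is expressed relative to the summed area, the result is a relative content rather than an absolute concentration.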

GC-MS/MS Analysis of Volatile Components of Leaves
The main part of the headspace solid phase microextraction device is the extraction head (fiber), which must be conditioned for 1 h at 280 °C in the gas chromatograph inlet before use. The micro-extraction steps were as follows. First, 5 mL of the double internal standard (2.5 mg/mL acetophenone and 1.0 mg/mL n-tetradecane in acetone) was added to the headspace vial containing the sample, and the vial was placed in a 50 °C dry bath and equilibrated for 30 min. A 1 cm 50/30 µm DVB/CAR/PDMS extraction head was then inserted into the vial to enrich the volatile components of the headspace for 40 min. Finally, the extraction head was removed and immediately inserted into the GC inlet for 5 min.
Qualitative and quantitative analysis of the volatile constituents was performed on an Agilent 7890A-7000 gas chromatograph-mass spectrometer with an HP-5 MS capillary column (30 m × 0.25 mm × 0.25 µm). The injector port was set at 280 °C, and helium was used as the carrier gas at a flow rate of 0.8 mL/min. The following temperature program was applied: 50 °C for 1 min; to 90 °C at 8 °C/min; to 130 °C at 4 °C/min, held for 8 min; to 160 °C at 4 °C/min, held for 2 min; and finally to 230 °C at 10 °C/min. Mass spectrometry used an electron ionization (EI) source with an ion source temperature of 230 °C and a scan range of 35-400 amu.
Data analysis was performed in Agilent MassHunter software, and the spectra of all peaks in the gas chromatogram were compared with the NIST 2.0 database for initial characterization. The retention indices (RI) of the compounds were determined relative to the retention times of a series of n-alkanes (C6-C26) and then compared with the retention indices reported for each substance on columns of the same or similar polarity, to verify all assignments. The internal standard method was used to quantify the relative content. The internal standards were acetophenone and n-tetradecane, and the peak area of each analyte was compared with the peak area of the internal standard.
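As a quick consistency check on the temperature program, the total oven run time can be computed from the ramp rates and hold times. The sketch below assumes that "for 8 min" and "for 2 min" are isothermal holds at 130 °C and 160 °C and that there is no final hold at 230 °C; the function name is hypothetical.

```python
def oven_program_minutes(initial_temp, initial_hold, steps):
    """Total run time (min) of a GC oven program.

    steps: list of (rate in degC/min, target temperature in degC, hold in min).
    """
    total = initial_hold
    temp = initial_temp
    for rate, target, hold in steps:
        total += abs(target - temp) / rate + hold  # ramp time plus hold time
        temp = target
    return total


# Program as read from the text (holds are an interpretation, see above).
steps = [
    (8.0, 90.0, 0.0),    # 50 -> 90 degC at 8 degC/min
    (4.0, 130.0, 8.0),   # 90 -> 130 degC at 4 degC/min, hold 8 min
    (4.0, 160.0, 2.0),   # 130 -> 160 degC at 4 degC/min, hold 2 min
    (10.0, 230.0, 0.0),  # 160 -> 230 degC at 10 degC/min
]
run_time = oven_program_minutes(50.0, 1.0, steps)
print(f"Total run time: {run_time} min")
```

Under these assumptions the program lasts 40.5 min per injection.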
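The retention index calculation can be sketched as follows. The text does not state which formula was used; this sketch assumes the linear retention index commonly applied to temperature-programmed GC, RI = 100 × [n + (t_x − t_n) / (t_{n+1} − t_n)], where t_n and t_{n+1} are the retention times of the n-alkanes bracketing the analyte. The function name and retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at `rt` (min).

    alkane_rts maps n-alkane carbon number -> retention time (min);
    retention times must increase with carbon number.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if len(times) < 2 or not times[0] <= rt <= times[-1]:
        raise ValueError("rt must lie within the n-alkane retention window")
    # Index of the last alkane eluting at or before the analyte.
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    fraction = (rt - times[i]) / (times[i + 1] - times[i])
    return 100.0 * (n + (n_next - n) * fraction)


# Hypothetical n-alkane retention times (min), for illustration only.
alkanes = {10: 8.0, 11: 10.0, 12: 12.5}
ri = linear_retention_index(9.0, alkanes)
print(ri)
```

A peak eluting midway between C10 and C11 therefore receives an index of 1050, which can then be compared with literature values for columns of similar polarity.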

Preparation of Ligands and Receptors
Information on the components of E. globulus leaves was collected from the TCMSP database (http://lsp.nwu.edu.cn/tcmsp.php). This database covers 499 Chinese herbal medicines registered in the Chinese Pharmacopoeia (2010), with a total of 12,144 compounds [28]. The components of E. globulus leaves were obtained by GC-MS/MS and UPLC-MS/MS. According to the content and activity of each component, 69 small molecules from the volatile and aqueous extracts were finally selected. The three-dimensional structures of the 69 small molecules were downloaded from TCMSP, imported into the docking software Discovery Studio 2.5, supplemented with hydrogen atoms, optimized with the CHARMm force field, and then stored as candidate small molecules for molecular docking.
Herbicide targets were collected from the Therapeutic Target Database (TTD, http://bidd.nus.edu.sg/group/cjttd/). The TTD is a specialized target database developed by the National University of Singapore. It provides information on known therapeutic protein and nucleic acid targets, diseases and signaling pathways described in the published literature [26], and it plays an important role in the innovative research and development of traditional Chinese medicine. Using "herbicide" as the keyword, the related protein was retrieved from the TTD database as the herbicide target protein, and the corresponding literature was searched and screened. Its crystal structure is available through the PDB database (http://www.rcsb.org/pdb/home/home.do). The target was imported into Discovery Studio, water molecules were removed, hydrogen atoms were added, the energy was optimized with the CHARMm force field, and the structure was then stored as the receptor.

Docking Method
The active constituents of E. globulus leaves were docked with the herbicide target protein using the Ligandfit module [29] of Discovery Studio 2.5 software (Chuangteng Technology Corporation, Beijing, China) and scored according to the energy of the system. Seven scoring functions were used [33]: LigScore1, LigScore2 [30], PLP1, PLP2 [31], Jain [32], PMF and Dock score [29]. All other parameters were left at their default values.
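One possible way to combine several scoring functions into a single ordering is a rank-sum consensus, sketched below. This is an illustrative assumption, not the procedure used in this study, which reports the individual scores. The sketch also assumes that every score is oriented so that a higher value indicates better predicted binding; the ligand names and score values are hypothetical.

```python
def consensus_rank(scores):
    """Order ligands by the sum of their ranks across scoring functions.

    scores maps ligand -> {scoring function -> score}, higher score = better.
    Every ligand must have a value for every scoring function.
    """
    ligands = list(scores)
    functions = sorted({f for per_ligand in scores.values() for f in per_ligand})
    rank_sum = {ligand: 0 for ligand in ligands}
    for function in functions:
        ordered = sorted(ligands, key=lambda lig: scores[lig][function], reverse=True)
        for position, ligand in enumerate(ordered, start=1):
            rank_sum[ligand] += position  # rank 1 is the best score
    return sorted(ligands, key=lambda lig: rank_sum[lig])


# Hypothetical scores for three ligands and three scoring functions.
example_scores = {
    "ligand_a": {"LigScore1": 3.1, "PLP1": 60.0, "DockScore": 80.0},
    "ligand_b": {"LigScore1": 2.0, "PLP1": 65.0, "DockScore": 70.0},
    "ligand_c": {"LigScore1": 1.5, "PLP1": 40.0, "DockScore": 50.0},
}
ranking = consensus_rank(example_scores)
print(ranking)
```

A rank-based combination avoids mixing scores that are reported on different numerical scales.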

Allelopathic Effects on Seeds
The petri-dish filter paper method was used [34,35]. Grass seeds (Cynodon dactylon) of uniform size and full grain were selected and soaked for one day. On the second day, cut filter paper was placed in each culture dish and wetted with distilled water, and 50 grass seeds were placed in each dish. Samples of different concentrations were then added, 10 mL per dish, with three replicates for each concentration to improve the reliability of the test. Distilled water was added to the control group. The number of germinated seeds in each petri dish was counted from the third day onwards. The final germination number was counted on the 9th day, and the root length and seedling length of all the seeds in the petri dish were measured.
Determination of allelopathic test indicators: according to the tree seed test method GB2772-99 (Supplementary 2), the germination rate of seeds and the inhibition rates of root and seedling growth were measured. Grass seed growth inhibition rate (%) = [control group root (seedling) length − experimental group root (seedling) length] / [control group root (seedling) length] × 100. Experimental data were analyzed using Excel (Microsoft Excel 2016, Microsoft, Redmond, WA, USA) or Origin 8.1 software (OriginLab, Northampton, MA, USA).
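The two indicators can be computed as in the following minimal sketch. The inhibition rate follows the formula given above; the germination rate is assumed here to be the percentage of germinated seeds among the 50 seeds placed in each dish, which the text does not define explicitly. The function names and the example values are hypothetical.

```python
def germination_rate(germinated, total_seeds=50):
    """Germination rate (%), assumed to be germinated seeds / seeds sown * 100."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    return 100.0 * germinated / total_seeds


def inhibition_rate(control_length, treated_length):
    """Root or seedling growth inhibition rate (%) relative to the control."""
    if control_length <= 0:
        raise ValueError("control_length must be positive")
    return 100.0 * (control_length - treated_length) / control_length


# Hypothetical measurements, for illustration only.
rate = germination_rate(40)            # 40 of 50 seeds germinated
inhibition = inhibition_rate(20.0, 5.0)  # mean root length: control 20 mm, treated 5 mm
print(rate, inhibition)
```

A negative inhibition rate would indicate that the treated seedlings grew longer than the control, that is, stimulation rather than inhibition.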

Conclusions
A simple, rapid and efficient method combining GC-MS/MS and UPLC-MS/MS with molecular docking was established for the identification of bioactive compounds in natural products. The chemical composition was first identified by mass spectrometry, the active compounds were then predicted by molecular docking, and the activity of the predicted compounds was finally verified. In this study, 69 components of E. globulus leaves were docked with the herbicidal receptor (1BX9), and gallic acid was predicted to have strong bioactivity. The activity experiment showed that gallic acid had herbicidal activity, which demonstrates the effectiveness of the method and its value for discovering potential active compounds in complex matrices.