Annotation and Identification of Phytochemicals from Eleusine indica Using High-Performance Liquid Chromatography Tandem Mass Spectrometry: Databases-Driven Approach

Eleusine indica (L.) Gaertn is a perennial herb belonging to the Poaceae family. As the only species of Eleusine found abundantly in Malaysia, it is locally known as “rumput sambau” and has been traditionally used to treat various ailments including pain relief from vaginal bleeding, hastening the placenta delivery after childbirth, asthma, hemorrhoids, urinary infection, fever, and as a tonic for flu-related symptoms. A diverse array of biological activities have been reported for the plant, such as antimicrobial, cytotoxic, anticonvulsant, anti-inflammatory, analgesic, antipyretic, and hepatoprotective action. Despite many reports on its traditional uses and biological activities, limited chemical databases are available for the plant. Thus, the aims of this study were to annotate and identify the phytochemical constituents in the methanolic extract of E. indica through tandem LCMS-based analysis techniques using MZmine, GNPS, Compound Discoverer, and SIRIUS platforms. This technique managed to identify a total of 65 phytochemicals in the extract, comprising primary and secondary metabolites, and was verified by the isolation of one of the identified phytochemicals. The structural elucidation mainly using 1D and 2D NMR as well as comparison with values in the literature confirms the isolated phytochemical to be a 3-OH anomer of loliolide, a benzofuran-type of compound, which consequently increases the level of confidence in the applied technique. The research describes a useful method for the fast and simultaneous identification of phytochemicals in E. indica, contributing to the study of the chemical properties of the genus and family.


Introduction
Identification of phytochemicals is crucial in the investigation of plant samples. In the past two decades, new technologies and methods for structural identification have emerged that improve the speed and accuracy of phytochemical analysis [1]. Liquid chromatography coupled to mass spectrometry (LCMS) has gained popularity for structural identification owing to its high throughput, soft ionization, and good coverage of phytochemicals [2]. LCMS is well suited to the analysis of plant chemicals because of its versatility, sensitivity, and ability to separate and detect highly diverse semipolar compounds, including key secondary metabolite groups. Tandem MS analysis acquires both precursor and fragment ion information, which provides precise structural information that can be used to annotate, identify, and dereplicate phytochemicals [3]. LCMS-based phytochemical analysis can be classified into two types, namely, untargeted and targeted approaches. The former refers to a comprehensive analysis of all the measurable chemicals including the unknowns, while the latter focuses on the measurement of defined groups of chemicals [4]. The analysis can be facilitated with metabolite annotation tools such as MZmine, Global Natural Products Social Molecular Networking (GNPS), Compound Discoverer, and SIRIUS 4.0 [5]. This approach leads to the structural characterization of phytochemical mixtures, especially through identifying biomarkers and minor components, which consequently facilitates and accelerates the discovery of novel active compounds [6].
Eleusine indica (L.) Gaertn is a perennial herb belonging to the Poaceae family that has been utilized widely for its medicinal values. The plant is widespread in tropical regions and most of the Pacific Islands [7]. In Malaysia, E. indica is locally known as "rumput sambau" and it is the only species of Eleusine that can be easily found; it grows abundantly as a weed along roads and pavements [8]. E. indica has been used as a traditional medicine around the world to treat various ailments including symptoms related to microbial infection, sprained muscle, coughing blood, and centipede or scorpion poisoning [9]. In Peninsular Malaysia, the plant's leaves are pounded to extract the juice, which is used to hasten the delivery of the placenta after childbirth and to relieve pain during vaginal bleeding. The root decoction is used in treating asthma, while the decoction of the whole plant is used to treat urinary infections [10,11]. In East Malaysia, Kadazandusun people used an infusion of the plant's aerial part with rice to treat symptoms related to flu viral infection, and the decoction of roots mixed with Capsicum sp. (Solanaceae) to treat piles [9,12]. This plant has a diverse array of biological activities including antioxidant, antibacterial, cytotoxic, anticonvulsant, anti-inflammatory, antidiabetic, antiplasmodial, hepatoprotective, analgesic, antipyretic, and others [7,13,14]. Hitherto, only a few phytochemicals have been isolated from the plant; they include schaftoside, vitexin, isovitexin, β-sitosterol, stigmasterol, 3-O-β-D-glucopyranosyl-β-sitosterol and its 6′-O-palmitoyl derivatives, 1-[[[(2-aminoethoxy) hydroxyphosphinyl]oxy]methyl]-1,2-ethanediyl ester, and hexadecanoic acid [15][16][17][18][19]. An LCMS metabolite profiling and fingerprinting study on the plant extract has identified p-coumaric acid and isoschaftoside along with a series of primary metabolites and amino acids [20].
Despite many reports on the traditional uses and biological activities of E. indica, not many phytochemicals have been isolated, thus limiting the plant's available chemical databases. Therefore, the present study was undertaken to explore the chemistry of E. indica through tandem LCMS-based analysis. This paper discusses the analysis process for the characterization and identification of phytochemicals in the methanolic extract of the plant employing annotation platforms of MZmine, GNPS, Compound Discoverer 3.0 and SIRIUS; the platforms were integrated with several available spectral and compound databases as well as a custom-built one based on the Poaceae family. In addition, the isolation and structural elucidation of an identified phytochemical are also reported here, to verify the reliability and to increase the level of confidence in the technique. This research describes a useful method for the fast and simultaneous characterization and identification of phytochemicals in E. indica, contributing to knowledge about its genus and family chemical properties. Figure 1 shows an infographic summarizing the steps of the research: the methanolic extraction, tandem LCMS and data analyses; the use of MZmine, Compound Discoverer (CD), GNPS, and SIRIUS platforms for annotation and identification; the use of a Venn diagram to display the distribution of the annotated phytochemicals in each platform; and the verification of the technique through the isolation and structural elucidation of phytochemical 1. The detail for each step is explained in the subsection below.

Results and Discussion
Molecules 2023, 28, x FOR PEER REVIEW
Figure 1. Infographic illustrating the annotation and identification process of phytochemicals from E. indica using high-performance liquid chromatography tandem mass spectrometry-based analysis.

Tandem LCMS Analysis for Phytochemicals Annotation and Identification
In this study, a comprehensive high-resolution MS in a data-dependent full-scan acquisition method was developed to separate and detect the phytochemicals in E. indica methanolic extract. The phytochemicals profile was composed of hundreds of features that are recognized by their measured mass-to-charge ratio (m/z), retention time (Rt), and relative abundance. The annotation of the phytochemicals was performed using a library of natural products that contains the previously reported phytochemicals from the plant family, Poaceae. The library was custom-built by plant names (with all synonyms) that were queried in the Dictionary of Natural Products (DNP) Ver. 26.2 (December 2017), and all resulting hits were used to build a library of natural products. All the detected phytochemicals were screened against the prepared library using MZmine 2.53, Compound Discoverer 3.0, GNPS, and SIRIUS platforms; this enabled comparison of the mass errors (ppm) and isotopic patterns of the phytochemicals in the library with the observed mass spectra and ranking of the probable identity of the phytochemicals based on match score. The combination of these annotation tools accelerates the process of identification of the phytochemicals. This approach has been shown to succeed in the identification of phytochemicals in a number of studies including fully clarifying the chemical constituents of a herbal medicine in China, the Pingxiao capsule (PXC) [21]. The processes for annotation and identification of the phytochemicals in E. indica methanolic extract through the mentioned platforms are discussed in detail below.

Data Processing, Enrichment and Phytochemicals Annotation by MZmine 2.53
The total ion chromatogram (TIC) of E. indica's phytochemical profile (Figure 2), acquired from the optimized 30-min gradient elution, gave m/z features in the range of 166.0859-806.5875. Pre-processing of the positive ion mode raw data file using MZmine resulted in 426 m/z features. Annotation of these features was carried out based on accurate mass (MS1) information with the curated DNP and custom databases (as previously outlined). To characterize the phytochemicals, a specialized compound database was retrieved from the online DNP database. The biological source keyword, Poaceae (family), yielded 1106 compounds. To supplement this database, a custom library generated from the previously isolated compounds of E. indica was added [14,15,20]. Both databases were imported to MZmine and employed as the custom-built database for peak identification. Hits were manually cross-checked against the MS/MS spectral fragmentation data. To check for overlapping peaks, a 3D chromatogram plot (front, left and right view) was generated for the profile. As there were many overlapping peaks at the same Rt, the most abundant m/z at each Rt was selected.
As shown in Table 1, the above processing of the extract managed to annotate 12 phytochemicals based on the custom-built database (LR and DNP sourced). Three of the phytochemicals were detected at several different retention times, namely, loliolide (1, 1′), isoschaftoside (2, 2′), and adenosine (5, 5′); each duplicate was assigned the same number with a prime (′) added to avoid confusion. This redundancy in detection is probably due to the stereoisomerism of the compounds, which may affect their polarity [22]. Notably, MS spectra typically cannot differentiate stereoisomers, and additional experiments, including comparison with standards, are required to assign the absolute structure [23]. The annotated phytochemicals loliolide (1, 1′), isoschaftoside (2, 2′), and vitexin (3) were hit with the custom database developed through the LR, while 4-ethoxy-6-methoxy-2-(8,11,14-pentadecatrienyl)-1,3-benzenediol,5-Ethoxy-3-(10,13,16-heptadecatrienyl)-1,2,4-benzenetriol (4), oryzamutaic acid E (7), 1-feruloyl-2-hydroxyputrescine (8) and 2-[2-(3-Methoxyphenyl) ethenyl]-4H-3,1-benzoxazin-4-one, 2-(3-methoxycinnamoyl)-4H-3,1-benzoxazin-4-one (9) were detected using the enriched database from DNP. Adenosine (5, 5′) was hit by both the LR- and DNP-sourced databases. Of these annotated phytochemicals, only 2 and 3 have been previously reported as constituents of E. indica [14,20,24]. Since the LR database focused only on secondary metabolites, the DNP database appears to have complemented it by annotating some common compounds and primary metabolites from the plant extract. All the phytochemicals annotated in MZmine were identified with confidence level 3 because they were characterized by MS1 data only [25,26]. To acquire a higher level of confidence, the structures must be further characterized with their MSn data using the GNPS, CD and SIRIUS platforms.
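The MS1-level annotation described above amounts to comparing each measured m/z with the theoretical m/z of library compounds within a mass tolerance. The following is a minimal sketch of that idea, not the MZmine implementation: it assumes [M+H]+ adducts and a 5 ppm tolerance (both assumptions, not settings reported here), and the mini-library and the feature values 433.1130 and 300.0000 are illustrative only.

```python
import re

# Monoisotopic masses (Da) of the elements needed here (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.007276  # proton mass in Da, used for the assumed [M+H]+ adduct


def monoisotopic_mass(formula: str) -> float:
    """Sum element masses for a simple formula such as 'C21H20O10'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def mz_protonated(formula: str) -> float:
    """Theoretical m/z of the singly protonated molecule."""
    return monoisotopic_mass(formula) + PROTON


def match_features(feature_mzs, library, tol_ppm=5.0):
    """Return (m/z, name, ppm error) for every library entry within tolerance."""
    hits = []
    for mz in feature_mzs:
        for name, formula in library.items():
            theo = mz_protonated(formula)
            error_ppm = (mz - theo) / theo * 1e6
            if abs(error_ppm) <= tol_ppm:
                hits.append((mz, name, round(error_ppm, 2)))
    return hits


# Illustrative mini-library: formulas of compounds named in the text,
# not the authors' DNP/LR database.
LIBRARY = {
    "loliolide": "C11H16O3",
    "isoschaftoside": "C26H28O14",
    "vitexin": "C21H20O10",
    "adenosine": "C10H13N5O4",
}

hits = match_features([565.1557, 433.1130, 300.0000], LIBRARY)
for mz, name, err in hits:
    print(f"m/z {mz:.4f} -> {name} ({err:+.2f} ppm)")
```

Because isomers such as schaftoside and isoschaftoside share one molecular formula, a lookup of this kind cannot tell them apart, which is consistent with the level 3 confidence assigned to MS1-only annotations.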

Phytochemicals Annotation by GNPS
The same LCMS data pre-treated by MZmine, exported as an MGF file and a feature table, were uploaded into the Feature-Based Molecular Networking (FBMN) workflow in GNPS. FBMN is an advanced method for molecular networking that provides accurate ion abundance for statistical analysis and support for isomer resolution or ion mobility. It has been validated to be an effective strategy for the identification of natural compounds from different sources [27,28]. GNPS facilitated the annotation of phytochemicals through the comparison of the spectra from experimental data with open-access reference spectral libraries [28]. In GNPS, annotation relies on the cosine score, a normalized dot product that measures the spectral similarity between two fragmentation spectra, and on the library class of the matched reference spectrum, which is rated gold, silver or bronze. A cosine score of 1 represents identical spectra while a cosine score of 0 denotes no similarity at all. The degree of confidence in the annotated phytochemicals is determined through the cosine value and library class index used globally for each cluster across all networking views [28].
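To make the cosine score concrete, the sketch below computes a simplified normalized dot product between two centroided spectra. It is an illustration under stated assumptions rather than the GNPS scoring code: peaks are paired greedily within a fixed m/z tolerance (0.02 Da here, an assumed value), no intensity scaling or precursor-shift matching is applied, and the example spectra are hypothetical.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.02):
    """Simplified cosine between two spectra given as lists of (m/z, intensity)."""
    # Candidate peak pairs whose m/z values agree within the tolerance.
    pairs = [
        (ia * ib, i, j)
        for i, (mza, ia) in enumerate(spec_a)
        for j, (mzb, ib) in enumerate(spec_b)
        if abs(mza - mzb) <= tol
    ]
    # Greedy assignment: strongest products first, each peak used at most once.
    pairs.sort(reverse=True)
    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product
    # Normalize by the Euclidean length of each intensity vector.
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical spectra, for illustration only.
query = [(100.0, 10.0), (150.0, 50.0), (200.0, 100.0)]
partial = [(100.0, 10.0), (150.0, 50.0)]
unrelated = [(300.0, 5.0)]

print(cosine_score(query, query))      # identical spectra
print(cosine_score(query, partial))    # partial overlap
print(cosine_score(query, unrelated))  # no shared peaks
```

Identical spectra score 1 and spectra with no shared peaks score 0, matching the two extremes described above; real library matches fall in between.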
As shown in Figure 3, analysis through the GNPS platform managed to annotate 14 phytochemicals, of which three (1, 2 and 3) were also hit in the MZmine platform. The detailed features of the 14 annotated phytochemicals are listed in Table 2, which includes their library class, cosine, spectral and library m/z, ionization used, instrumentation, and ion source. With all the library hits, the mirror match between the experimental and library mass spectra can be obtained. For example, the mirror spectral match for isoschaftoside (2) shows a very close similarity (gold level) to the experimental data ( Figure 4), with a cosine value of 0.82. This confirms that the m/z 565.26 peak detected in the methanolic extract can be putatively annotated as isoschaftoside. An m/z peak at 433.11 with a cosine value of 0.96 was annotated as vitexin (3) similar to that annotated in MZmine platform. This match was also classified in the GNPS platform as a gold level due to the excellent quality of a match with the experimental spectrum considering the mass accuracy of the reference spectrum (resolution and calibration of the instrument), sample type, experimental setup, and associated sample information (metadata). It is worth mentioning that annotations 2 and 3 were matched with the Bioinformatics and Molecular Design Research Center Mass Spectral Library-Natural Products (BMDMS-NP), which contains high reliability references on experimental spectral data of metabolites. In fact, BMDMS-NP reports that the reference data for the two phytochemicals were obtained from the same instrument used in the present study (orbitrap), thus further illustrating the reliability of the result. Another phytochemical that shared high reliability in the annotation is loliolide (1) of cosine value 0.94 and bronze match classification. Isoschaftoside and vitexin have been reported previously as constituents of this plant [14,20]. 
The pre-treated LCMS data were further analyzed on the CD platform, where MS/MS fragmentation patterns were compared against various spectral databases. The applied databases included the mzCloud spectral library, mzVault, ChemSpider™, the Human Metabolome Database (HMDB), the Kyoto Encyclopedia of Genes and Genomes (KEGG), MassBank, BioCyc, NIST and DrugBank, as well as our own custom-built database from the Poaceae family. Here, the exact mass and chemical formula were calculated, and several compounds with similar masses and formulas were selected as candidates. To ensure a better hit, entries with mass errors above 10 ppm were discarded from this analysis. However, in the present sample, mass errors were below 2 ppm in most cases. The fragment ions in the MS/MS data were analyzed in silico; the results were generated by manually dissecting the molecules at various possible sites and comparing the theoretical fragments with those obtained from the data.
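The ppm cutoff can be related to absolute mass differences with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and the example values are the m/z 565.1557 and the 0.00113 Da mass difference quoted in this paper for compound 2.

```python
def ppm_from_delta(delta_da: float, mz: float) -> float:
    """Convert an absolute mass difference (Da) at a given m/z into ppm."""
    return delta_da / mz * 1e6


def delta_from_ppm(ppm: float, mz: float) -> float:
    """Convert a ppm tolerance at a given m/z into an absolute window (Da)."""
    return ppm * mz / 1e6


def keep_candidate(delta_da: float, mz: float, max_ppm: float = 10.0) -> bool:
    """Apply the cutoff: keep entries whose error does not exceed max_ppm."""
    return abs(ppm_from_delta(delta_da, mz)) <= max_ppm


mz = 565.1557
print(round(ppm_from_delta(0.00113, mz), 2))  # mass difference in Da expressed in ppm
print(round(delta_from_ppm(10.0, mz), 5))     # the 10 ppm cutoff expressed in Da
print(keep_candidate(0.00113, mz), keep_candidate(0.01, mz))
```

At this m/z, a 10 ppm cutoff corresponds to a window of roughly 0.0057 Da, so a candidate that is off by 0.01 Da would be discarded while one off by 0.00113 Da is retained.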
Table 3 shows the 38 phytochemicals that could be annotated using the CD software, along with their features (Rt, molecular formula, mzCloud similarity, FISh scoring and MS information). FISh scoring provides fragment assignment based on mzCloud literature and in silico fragmentation rules (MS2 and MS3 data), and comparisons to the parent molecule for phytochemical assignment. The identification was either formal (when at least two physicochemical parameters, such as chromatographic retention time and MS/MS spectrum, matched those of our spectral library of reference compounds) or putative (based on information from mzCloud and the interpretation of MS and MS/MS spectra), corresponding to levels 1 and 2 of the Metabolomics Standards Initiative [29,30]. In the present work, the molecular formula of the phytochemicals was mainly considered to be putative based on the predicted composition on the platform.
For consistency, the phytochemical annotated as isoschaftoside (2) is again used to discuss the CD features. As shown in Table 3, compound 2 was annotated with good accuracy, with a mass error (∆Mass) of only 0.00113 Da, or 2 ppm; the mzCloud similarity match was 97.1% and the FISh score was 45.16; it was annotated as level 2 [31]. Figure 5 shows how the in silico spectral fragmentation of 2 in the mzCloud library matches the m/z 565.26 peak detected in the plant extract. The number of matched (green) fragment ions indicates that the majority of the signals in the spectrum match the mzCloud library, consequently increasing the confidence in identification. This compound also matched the custom database from the Poaceae family and has previously been isolated, elucidated and characterized from E. indica [14,20].
Another annotated phytochemical in the extract that is worth highlighting is adenosine (5). The in silico fragmentation from the FISh algorithm structurally explained more than 83.33% of the fragment ions for adenosine (Table 3). Adenosine is an organic compound that occurs widely in nature in various derivatives. It participates in improving the action of the plant on memory impairment and in increasing cyclic adenosine monophosphate (cAMP). This compound was previously isolated and identified from Anredera cordifolia [32]. However, this is the first report on adenosine in E. indica. Figure 6 shows the TIC containing the structures of all the phytochemicals annotated by the CD platform with their respective Rt. Although the in silico fragmentation behavior in the CD platform provides an efficient characterization feature that accelerates the annotation of the phytochemicals, these data still need to be confirmed with the MS/MS spectra. Thus, the SIRIUS platform was further applied in the analysis to confirm the phytochemicals' fragmentation patterns.
SIRIUS is a software tool dedicated to the annotation of ions from fragmentation spectra. This software complements the other platforms described above, particularly the CD platform. First, SIRIUS computes the candidate molecular formula (MF) by (1) matching the MS1 experimental spectra against the predicted isotopic pattern and (2) establishing how much of the fragmentation spectra can be explained by the candidate MFs using fragmentation trees. SIRIUS integrates other algorithms or models such as ZODIAC for improved MF prediction, CSI:FingerID for putative annotation of structure, COSMIC for establishing confidence in the match, and CANOPUS for putative chemical class annotation. SIRIUS has an advanced graphical user interface, and the tools can also be run in command line mode. This work uses the COSMIC (Confidence of Small Molecule IdentifiCations) workflow, which combines the selection or generation of a structure database, searching in the structure database with CSI:FingerID, and a confidence score to differentiate between correct and incorrect annotations. Candidate structures and database-independent fingerprint vectors were obtained by loading the above-mentioned MGF files into the SIRIUS and CSI:FingerID pipeline. Data were acquired in positive mode due to the higher sensitivity and the higher quality of fingerprint predictions of SIRIUS + CSI:FingerID compared to the negative mode. After the processed data (MS2) were computed on this platform, the results for each feature were displayed through the Rt, m/z and COSMIC value data.
A fragmentation tree annotates peaks in the fragmentation spectrum with molecular formulas and identifies likely losses between the fragments, similar to "fragmentation diagrams" created by experts. For each fragmentation spectrum, COSMIC considers only the structure candidate that is top ranked by CSI:FingerID as an annotation; COSMIC neither changes annotations (re-ranks structure candidates) nor discards any annotations. COSMIC's confidence score combines E-value estimation and a linear support vector machine (SVM) with enforced directionality. The calculated tree must not be understood as ground truth but can be used to derive information about the measured phytochemical. The fragmentation tree is computed from the fragmentation spectrum given the (candidate) molecular formula of the precursor ion. Initially, a fragmentation graph is constructed in the following way. For every fragment peak, all possible molecular formula explanations are computed. These explanations must be subformulas of the precursor molecular formula, because a fragment only loses atoms and never gains new atoms. Every such molecular formula is a node in the graph. Nodes are connected by an edge if one node is a subformula of another, representing a potential loss. Using combinatorial optimization, the best scoring fragmentation tree is computed, which explains every peak at most once. Unexplained peaks are considered noise. Figure 7 shows an example of a fragmentation tree for candidate 2, which is annotated as isoschaftoside. As shown from the fragmentation tree, the m/z 565.1557 C 28
Candidate structures for the phytochemicals of E. indica were obtained by searching the top hit of CSI:FingerID in all databases and manually curating the results; for all the other analyses, fingerprint vectors of the top 10 candidates of all predicted formulas were exported.
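The fragmentation-graph construction described in this section can be sketched in a few lines. This is a simplified illustration, not the SIRIUS algorithm: candidate formulas are supplied directly instead of being decomposed from peak masses, no scoring or tree optimization is performed, and the fragment formulas below are hypothetical examples (two successive water losses from an assumed [M+H]+ formula, C26H29O14, plus one invalid candidate), not peaks taken from Figure 7.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Parse a simple formula such as 'C26H29O14' into element counts."""
    counts = Counter()
    for element, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(n) if n else 1
    return counts


def is_subformula(fragment: Counter, parent: Counter) -> bool:
    """True if the fragment can be formed from the parent by only losing atoms."""
    return fragment != parent and all(parent[el] >= n for el, n in fragment.items())


def loss_formula(parent: Counter, fragment: Counter) -> str:
    """Formula of the atoms lost when going from parent to fragment."""
    diff = parent - fragment  # Counter subtraction keeps positive counts only
    return "".join(f"{el}{n if n > 1 else ''}" for el, n in sorted(diff.items()))


def fragmentation_graph(precursor: str, candidates: list):
    """Nodes are valid subformulas; edges connect a formula to its subformulas."""
    parsed_precursor = parse_formula(precursor)
    nodes = [precursor] + [
        c for c in candidates if is_subformula(parse_formula(c), parsed_precursor)
    ]
    edges = []
    for a in nodes:
        for b in nodes:
            pa, pb = parse_formula(a), parse_formula(b)
            if is_subformula(pb, pa):
                edges.append((a, b, loss_formula(pa, pb)))
    return nodes, edges


# Hypothetical candidate explanations; the last one gains atoms and is rejected.
nodes, edges = fragmentation_graph(
    "C26H29O14", ["C26H27O13", "C26H25O12", "C27H30O14"]
)
for parent, child, loss in edges:
    print(f"{parent} -> {child} (loss of {loss})")
```

In the full method, each edge would carry a score and a combinatorial optimization would then select the best-scoring tree from this graph, explaining every peak at most once.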
Annotation candidates were sorted by their score and the similarity between the predicted fingerprint and the fingerprint of each candidate. The higher the percentage, the higher the similarity. Candidates can be filtered by database, SMARTS string and XlogP value. In the SIRIUS platform, the practical databases used for annotation of phytochemicals would be Natural Products, COCONUT, and ChEBI, which are databases for natural products. It is worth mentioning that a unique feature of the SIRIUS platform is CANOPUS, which can predict the classification (i.e., functional group) of the phytochemicals. This helps to narrow down the search for the correct phytochemical annotation. Notably, COSMIC complements compound class annotation tools such as CANOPUS. COSMIC targets molecular structure annotations but annotates only a fraction of the compounds in a sample; in contrast, CANOPUS annotates practically all compounds in a sample for which fragmentation spectra have been measured but is restricted to annotating compound classes. Hence, both methods provide valuable information. Which method is better suited depends on the underlying question and focus of the research [33]. For example, COSMIC classified a phytochemical with an Rt of 5.00 min and an m/z peak of 565.16 as belonging to the kingdom of organic compounds, superclass of phenylpropanoids, class of flavonoids and subclass of flavonoid glycosides. The structure was further narrowed to the flavonoid C-glycoside compound type and to flavonoid 8-C-glycosides. From this information and the literature values, the phytochemical that possibly satisfies these features is isoschaftoside (2). In fact, the MZmine, CD and GNPS platforms also support this annotation. This is evidence that COSMIC features can accelerate the selection of the candidates to be annotated. In addition, the level of confidence is evaluated through the metabolite identification confidence (MIC) level, which is ranked 1-5 for putative metabolite annotation.
Table 4 displays the 20 phytochemicals that were identified through the SIRIUS platform along with their MIC values. Table 5 lists 42 phytochemicals that could be characterized by their respective classes of compounds.

Verification of the Identified Phytochemicals
The platforms used in the present work each have their own strengths in the phytochemical annotation and identification process. The Venn diagram in Figure 8 below shows the performance of each platform in annotating the phytochemicals of E. indica. It can be clearly seen that only three phytochemicals, namely, loliolide (1), isoschaftoside (2), and vitexin (3), could be consistently identified through the four platforms. The CD platform gave the highest number of annotated phytochemicals (40) which is probably due to the large databases integrated into the platform, as mentioned earlier. As expected, the MZmine platform gave the lowest number of annotated phytochemicals (7) since it only integrates with a custom database built based on the phytochemicals previously reported from the Poaceae family. Since each platform gave a diversity of phytochemicals, it is suggested to use all of the platforms on a complementary basis, to annotate and identify different phytochemicals.
Even though the present work has managed to annotate and identify 65 phytochemicals from the methanolic extract of E. indica, some of the previously isolated constituents were not detected in the extract. The annotated and identified phytochemicals remain tentative unless a reference standard is used or the method is verified. Thus, to assess the reliability of the LCMS analysis in annotating phytochemicals in the E. indica methanolic extract, one of the phytochemicals consistently identified on all the platforms, loliolide (1), was subjected to further isolation and purification. From the total ion chromatogram of LCMS, compound 1 was predicted to elute at 6.78 min (Tables 1, 3 and 4). To isolate and purify 1, the extract was subjected to chromatographic techniques, including semi-preparative HPLC and recycling HPLC, which yielded 10 mg of 1. The structure of phytochemical 1 was elucidated through various spectroscopic techniques and comparisons with the literature values. The physical and spectroscopic characterization of phytochemical 1 was as follows.
[35]. The H-3α proton appeared at a slightly lower field at δH 4.84 as it was spatially away from the H-11 methyl electron cloud.
A total of 11 carbon resonances were observed in the 13C-APT NMR spectrum, of which one conjugated ketone carbonyl resonance was observed at δC 181.5. In addition, quaternary carbon resonances were observed, including a deshielded trisubstituted methine carbon at δC 170.1 which belongs to C-6, while C-5 shifted downfield due to the neighboring oxygen. Moreover, comparative analyses of the 13C-APT and 1H NMR spectra showed the resonances of three methyl groups at δC 25.2, 26.5 and 29.7, a trisubstituted olefinic bond at δC 112.2 and 170.1, and a secondary hydroxy group at δC 66.1. The 1H and 13C-APT NMR data suggested that 1 is a bicyclic molecule, which pointed to a benzofuran type of compound. Comparison with literature data [35,36] confirmed that 1 is a benzofuran type of compound known as loliolide, a 3-OH β-oriented structure. However, based on the deviation in the chemical shifts of the protons and carbons at positions 2, 3 and 4 from the reported data (Table 6), the stereochemistry at position C-3 is believed to be different. Based on the NOE correlations discussed earlier, the 3-OH was established to be α-oriented, consequently elucidating compound 1 as the C-3 anomer of loliolide (Figure 9). Loliolide was previously reported as a constituent of other plants including L. salicaria, H. angiospermum, A. lappa, S. oleraceus, P. campanulatus (Cav.), P. indicus, M. alba, and M. whitei [37]. The isolation and structural elucidation of 1 as loliolide verified that structural annotation using tandem LCMS analysis is reliable. However, the elucidation also supports the limitation of MS spectra in differentiating stereoisomers.

Plant Materials and Extraction
E. indica was collected in Tanjung Karang, Malaysia. The plant specimen was identified by a certified botanist, En. Ahmad Zainudin Ibrahim, and a voucher specimen with the code DBKL177 was deposited at the Herbarium Taman Botani Perdana Kuala Lumpur. The aerial and leaf parts of the plant (10 kg) were cut into small pieces and dried in an oven at 40 °C. The dried sample was weighed and ground before being extracted using methanol at room temperature for 72 h. The extract was filtered, and the solvent was evaporated under reduced pressure, resulting in 41.62 g of methanol extract. The extract was stored at 4 °C before analysis.

Chemicals and Solvents
All analytical reagent (AR) grade chemicals used in this study were purchased from reputed manufacturers. Methanol (MeOH) and acetone (Ace) were of analytical grade. HPLC-grade MeOH and acetonitrile (MeCN) were purchased from RCI Labscan (Bangkok, Thailand), and ultra-pure water (UPW) was from Sartorius.

Sample Preparation
Solid phase extraction (SPE) using Strata® C18-E cartridges (55 µm, 70 Å; 500 mg, 6 mL; Phenomenex) was employed for sample clean-up and pre-concentration. The cartridges were activated with 6 mL of UPW followed by 6 mL of methanol. Before 2 mL of crude extract was loaded, the cartridge was equilibrated with 6 mL of 95% MeOH at a constant flow rate. Elution was performed with 5 mL of 95% MeOH. The extract was dried using a vacuum concentrator [38]. Then, 2 mg of the extract was dissolved in MeOH, filtered through a 0.22 µm syringe filter into a vial, capped, and submitted for LCMS analysis.

LCMS Optimization
LCMS analysis was performed using a Phenomenex reversed-phase Kinetex XB-C18 column (100 × 2.1 mm, 100 Å, 1.7 µm particle size). Mobile phase A was UPW and mobile phase B was LCMS-grade MeCN. A constant flow rate of 0.8 mL/min was used with the following gradient: 0 min, 10% B; 20 min, 100% B; 30 min, 100% B. The column was equilibrated with 90% mobile phase A for 15 min before the next injection. The column oven was set at 35 °C, and the full-loop injection volume was 5 µL [39]. The LCMS instrument was a Thermo Scientific Orbitrap Elite with electrospray ionization (ESI) operated in positive mode. The resolving power for accurate mass measurement during the LCMS run was 120,000 (defined at m/z 400). The instrument was externally calibrated with Thermo Pierce calibration solution before the LCMS runs. Full scan mode was used to record all masses in the range of m/z 100-600. In addition to the full scan, data-dependent MS/MS fragmentation was recorded for the five most intense peaks in each spectral scan at various collision energies.
The spectral data obtained from the LCMS analysis were processed using MZmine, GNPS, Compound Discoverer, and SIRIUS. The raw data were converted into mzML files using the ProteoWizard msconvert tool and then processed using MZmine version 2.53 [40] with the following steps: peak detection, chromatogram builder, chromatogram deconvolution, deisotoping, peak alignment, duplicate peak filtering, peak list row filtering, and gap-filling. The parameters for these steps were adjusted based on the centroid mass detector, noise level, minimum group size of scans, minimum group intensity threshold, m/z tolerance, S/N estimator and ratio, minimum feature height, coefficient area threshold, peak duration ranges, Rt wavelet range, isotopic peaks grouper, adduct search, complex search, m/z tolerance for peak alignment, absolute Rt tolerance, weight for m/z, weight for Rt, and peak-list rows filter. The online DNP database was used to create the custom compound database [41]. The resulting MS1 feature data were exported as a CSV file, and the MS2 feature data were exported to SIRIUS and GNPS as an MGF file.
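The custom-database annotation step can be illustrated with a minimal Python sketch of matching an observed [M + H]+ ion against a compound list within a ppm tolerance. This is not the MZmine implementation. The 5 ppm default, the `Compound` class, and the function names are assumptions introduced for illustration, and the neutral mass of loliolide is computed here from its molecular formula C11H16O3 rather than taken from the study's tables.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da; added to the neutral mass to model [M + H]+


@dataclass
class Compound:
    name: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate(observed_mz: float, database: list, tol_ppm: float = 5.0) -> list:
    """Return (name, ppm error) for every entry whose [M + H]+ matches.

    tol_ppm is an assumed tolerance, not a value reported in the text.
    """
    hits = []
    for compound in database:
        theoretical_mz = compound.monoisotopic_mass + PROTON_MASS
        error = ppm_error(observed_mz, theoretical_mz)
        if abs(error) <= tol_ppm:
            hits.append((compound.name, round(error, 2)))
    return hits


# Hypothetical one-entry database; mass computed from C11H16O3.
custom_db = [Compound("loliolide", 196.10994)]
print(annotate(197.1172, custom_db))
```

A ppm-based tolerance scales with m/z, which is why it is usually preferred over a fixed Da window for high-resolution Orbitrap data.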

Global Natural Products Social Molecular Networking (GNPS)
After processing the LCMS/MS data with MZmine, the data were exported as two files: a feature quantification table (TXT or CSV) containing the intensities of the LCMS ion features, and an MS/MS spectral summary file in MGF format containing the MS/MS spectra associated with those features. These files were used as input for the SuperQuick FBMN (feature-based molecular networking) tool, which was accessed using GNPS credentials and an email address. The "Feature Generation tool" was selected, and the feature quantification table and MS/MS spectral file were uploaded. The FBMN job was started by clicking "Analyze Uploaded Files with GNPS Molecular Networking". Once the job was finished, which could take from 10 min to 10 h depending on the number of samples and the instrument used, an email notification with a link to the results page was sent, and the results were inspected on GNPS [28].
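To show what the MS/MS spectral summary file contains, the following Python sketch serializes a feature into the general MGF layout (one BEGIN IONS/END IONS block per feature). The field set shown is an assumption based on the common MGF convention, and the exact fields written by MZmine may differ. The function name and the fragment peaks are hypothetical placeholders; only the precursor retention time of 6.78 min is taken from the text.

```python
def to_mgf(features: list) -> str:
    """Serialize features into MGF-style text, one ion block per feature."""
    lines = []
    for feature in features:
        lines.append("BEGIN IONS")
        lines.append(f"FEATURE_ID={feature['id']}")
        lines.append(f"PEPMASS={feature['mz']:.4f}")
        # MGF stores retention time in seconds; the text reports minutes.
        lines.append(f"RTINSECONDS={feature['rt_min'] * 60:.1f}")
        lines.append("CHARGE=1+")
        lines.append("MSLEVEL=2")
        for fragment_mz, intensity in feature["peaks"]:
            lines.append(f"{fragment_mz:.4f} {intensity:.1f}")
        lines.append("END IONS")
        lines.append("")  # blank line between ion blocks
    return "\n".join(lines)


# Placeholder feature: the fragment peaks are invented for illustration.
example = {
    "id": 1,
    "mz": 197.1172,
    "rt_min": 6.78,
    "peaks": [(100.0000, 1000.0), (150.0000, 500.0)],
}
print(to_mgf([example]))
```

The feature identifier is what links each spectrum in the MGF file to its row in the feature quantification table, so both files must come from the same export.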

Compound Discoverer™ 3.0
Data processing was carried out using the "Natural Product Unknown ID with Online and Local Database Searches" workflow. This workflow involves several steps to detect and identify unknown phytochemicals, including Rt alignment, unknown phytochemical detection, and phytochemical grouping across all samples without statistics. In addition, elemental compositions were predicted for all phytochemicals, and the chemical background was hidden using blank samples. The phytochemicals were then identified using mzCloud [data-dependent MS2 (ddMS2) and/or data-independent acquisition (DIA)], ChemSpider (exact mass or formula), and local database searches against Mass Lists (exact mass with or without Rt) and mzVault spectral libraries. A spectral similarity search was performed against mzCloud for compounds with ddMS2 data, and mzLogic was applied to rank-order structure candidates from ChemSpider and mass list matches. Finally, spectral distance scoring was applied to ChemSpider and mass list matches [42].

SIRIUS 4.0
The MGF file from MZmine was imported into SIRIUS, where the CSI:FingerID pipeline was applied to further annotate the phytochemicals in the plant. MS/MS data extraction was carried out using in-house code that searched for fragmentation events triggered within a 0.5 min window around the feature Rt. To avoid misidentification of closely eluting isobaric phytochemicals, the intensity maximum in the MS1 extracted ion chromatogram (XIC) of the feature m/z (within 5 ppm error) that was closest to the feature Rt was located. After the Rt window for a selected feature was determined, all fragmentation events (MS/MS data) within the new Rt window whose parent ions matched the feature m/z within 5 ppm were stored. CSI:FingerID was used to generate candidate structures for the phytochemicals of E. indica. The top hit from this search was manually curated to ensure accuracy. For all other analyses, fingerprint vectors of the top 10 candidates of all predicted formulas were exported. When multiple adducts were present in a feature, only the formulas that matched the adducts' formulas were retained. Then, only fingerprints that explained over three peaks and over a third of the intensity were retained. The final selection of fingerprint vectors was determined by collapsing all adducts per feature and retaining only those fingerprint vectors corresponding to the candidate with the highest score and those with a percentage of less than 30%. If the candidates had a greater COSMIC value and CSI:FingerID matching percentage, their fingerprints were retained. The following SIRIUS molecular formula calculation parameters were used for the analysis: potential ionisation, [M + H]+; instrument, Orbitrap; tolerance, 50 ppm; candidate molecular formulas, 3; filtered by formulas from biological databases. For the CSI:FingerID process, the following parameters were used: potential adducts, [M + H]+; filter, compounds present in the biological database; maximum number of returned candidates, infinite [43,44].
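The selection of MS/MS events by retention time and precursor mass can be sketched in Python as follows. This is not the in-house code referred to above, which is not available in the text. The function name, the dictionary keys, and the reading of the 0.5 min window as a half-width around the feature Rt are assumptions made for illustration, and the event values are placeholders.

```python
def select_msms_events(events, feature_mz, feature_rt,
                       rt_half_width=0.5, tol_ppm=5.0):
    """Keep fragmentation events that match a feature in both Rt and m/z.

    events: list of dicts with keys 'scan', 'rt' (min) and 'precursor_mz'.
    rt_half_width: assumed half-width (min) of the Rt window.
    tol_ppm: allowed precursor mass error in ppm.
    """
    selected = []
    for event in events:
        ppm = abs(event["precursor_mz"] - feature_mz) / feature_mz * 1e6
        in_rt_window = abs(event["rt"] - feature_rt) <= rt_half_width
        if in_rt_window and ppm <= tol_ppm:
            selected.append(event)
    return selected


# Placeholder events around a feature at m/z 197.1172 and Rt 6.78 min.
events = [
    {"scan": 1, "rt": 6.70, "precursor_mz": 197.1172},  # matches
    {"scan": 2, "rt": 6.90, "precursor_mz": 197.1175},  # matches (~1.5 ppm)
    {"scan": 3, "rt": 7.60, "precursor_mz": 197.1172},  # outside Rt window
    {"scan": 4, "rt": 6.80, "precursor_mz": 197.1300},  # outside 5 ppm
]
kept = select_msms_events(events, feature_mz=197.1172, feature_rt=6.78)
print([event["scan"] for event in kept])
```

Applying both filters together is what prevents spectra from a closely eluting isobaric compound from being assigned to the wrong feature.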

Isolation and Purification of Compound 1

General Chromatographic Procedure
The semi-preparative HPLC analysis was carried out using a DIONEX Ultimate 3000 HPLC system (Thermo Fisher Scientific, Waltham, MA, USA). The system included a photodiode array detector (PDA), an auto-sampler injector, a fraction collector, and a 10 mL sample loop. The separation was performed on a Thermo Scientific Hypersil GOLD C18 column (250 mm × 10 mm, 5 µm particle size, 175 Å pore size). Instrument control and data analysis were both carried out using Chromeleon software version 7.2 provided by the supplier. Recycling HPLC was performed on a JAI model LC-9103 from Japan Analytical Industry Co., Ltd. (Mizuho, Tokyo, Japan) equipped with a reciprocating double-plunger pump (type P-9140B) and a UV detector set to 210 nm. The separation was carried out on the preparative columns JAIGEL-ODS-AP, SP-120-15 and GL Sciences InertSustain C18, both with dimensions of 20 mm × 250 mm (10 µm particle size).

Semi-Preparative and Recycling HPLC
The extract was accurately weighed and dissolved in MeOH to a concentration of 8 mg/mL. The solution was filtered through a 0.45 µm PTFE filter into a screw-cap vial prior to injection into the HPLC system. The mobile phase consisted of UPW (A) and MeCN (B), delivered with the following gradient program: 10-95% B over 0-18 min, 95% B from 18-24 min, and 95-10% B from 25-30 min. The absorbance of the eluent was monitored at 210 nm. A 3 mL sample was injected at 30 °C with a flow rate of 4.7 mL/min, resulting in the isolation of 36.5 mg of the component of interest. This component was dissolved in 10 mL of MeCN:UPW (80:20), and 1 mL was injected into the recycling HPLC system. The separation was performed with isocratic elution using MeCN:UPW (80:20) at a flow rate of 4 mL/min, with detection at 210 nm. Thirty minutes were allotted for column conditioning and baseline stabilization. After four complete cycles, 10 mg of compound 1 was eluted at 242 min. Each cycle took 60 min, and the entire run took 254 min to complete [22].
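The semi-preparative gradient program can be expressed as a piecewise-linear function of time, as in the following Python sketch. The function and variable names are hypothetical. The text does not state the composition between 24 and 25 min, so holding 95% B over that interval is an assumption.

```python
def percent_b(t: float, program: list) -> float:
    """Linearly interpolate %B at time t (min) from (time, %B) breakpoints."""
    if t <= program[0][0]:
        return float(program[0][1])
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    # Past the last breakpoint: keep the final composition.
    return float(program[-1][1])


# Breakpoints from the text; the 24-25 min hold at 95% B is assumed.
SEMI_PREP_GRADIENT = [(0, 10), (18, 95), (24, 95), (25, 95), (30, 10)]

for minute in (0, 9, 20, 27.5, 30):
    print(minute, percent_b(minute, SEMI_PREP_GRADIENT))
```

The same function describes the analytical LCMS gradient when given the breakpoints (0, 10), (20, 100), (30, 100).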

Structural Elucidation
An X-4 microscope melting-point apparatus with a JM628 digital thermometer was used to determine the melting point. The 1H and 13C NMR spectra were acquired in deuterated methanol (MeOD) using a Bruker 600 Ultrashield NMR spectrometer at 600 and 150 MHz, respectively. UV and IR spectra were measured using a JASCO V-730 UV/Vis spectrophotometer and a Bruker TENSOR II FT-IR spectrometer, respectively.

Conclusions
The tandem LCMS-based phytochemical analysis of the methanolic extract of E. indica through the integration of the MZmine, GNPS, Compound Discoverer and SIRIUS platforms annotated and identified a total of 65 phytochemicals, comprising primary and secondary metabolites. These platforms were found to complement each other by providing a wealth of information on the characterization, annotation, and identification of phytochemicals. The reliability of the technique was verified by the isolation of one of the consistently identified phytochemicals, 1, known as loliolide. The structural elucidation of 1 using 1D and 2D NMR, together with comparison with literature values, confirmed that the isolated phytochemical is an anomer of loliolide at the C-3 position. This has consequently increased the level of confidence in the technique applied. The present work describes tandem LCMS-based data analysis as a useful method for the fast and simultaneous identification of phytochemicals in E. indica, contributing to the study of the chemical properties of the genus and family. However, the elucidation of 1 as the anomer of the annotated compound also supports the limitation of MS spectra in differentiating stereoisomers. Therefore, additional experiments, including comparison with standards, are required to assign the absolute structure.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available because they are part of an ongoing research project.