How Does Lagenaria siceraria (Bottle Gourd) Metabolome Compare to Cucumis sativus (Cucumber) F. Cucurbitaceae? A Multiplex Approach of HR-UPLC/MS/MS and GC/MS Using Molecular Networking and Chemometrics

Cucurbitaceae comprises 800 species, the majority of which are known for their nutritive, economic, and health-promoting effects. This study aims at the metabolome profiling of cucumber (Cucumis sativus) and bottle gourd (Lagenaria siceraria) fruits in a comparative manner for the first time, considering that both species are reported to share several phytochemical classes and bioactivities. Nevertheless, bottle gourd is far less known and/or consumed than cucumber, which is famous worldwide. A multiplex approach, including HR-UPLC/MS/MS, GNPS networking, SPME, and GC/MS, was employed to profile primary and secondary metabolites in both species that could mediate new health and nutritive aspects, in addition to their aroma profiling, which affects consumers' preferences. Spectroscopic datasets were analyzed using multivariate data analyses (PCA and OPLS) for assigning biomarkers that distinguish each fruit. Herein, 107 metabolites were annotated in cucumber and bottle gourd fruits via HR-UPLC/MS/MS analysis in both modes, aided by GNPS networking. Metabolites belong to amino acids, organic acids, cinnamates, alkaloids, flavonoids, pterocarpans, alkyl glycosides, sesquiterpenes, saponins, lignans, fatty acids/amides, and lysophospholipids, including several first-time reported metabolites and classes in Cucurbitaceae. Aroma profiling detected 93 volatiles present at comparable levels in both species, from which it can be inferred that bottle gourd possesses a consumer-pleasant aroma, although data analyses detected further enrichment of bottle gourd with ketones and esters versus aldehydes in cucumber. GC/MS analysis of silylated compounds detected 49 peaks in both species, including alcohols, amino acids, fatty acids/esters, nitrogenous compounds, organic acids, phenolic acids, steroids, and sugars, from which data analyses recognized that bottle gourd was further enriched with fatty acids, in contrast to higher sugar levels in cucumber.
This study provides new possible attributes for both species in nutrition and health-care fields based on the newly detected metabolites, and further highlights the potential of the less famous fruit “bottle gourd”, recommending its propagation.


Introduction
Family Cucurbitaceae comprises 130 genera and 800 species, including several crucial crops of high nutritive and medicinal values and increasing economic interest [1,2]. Roots and seeds of most Cucurbitaceae members share the presence of toxic triterpenes, cucurbitacins. In contrast, their fruits are safe, edible, nutritious, and possess health-promoting effects [1,3]. Cucumis sativus (cucumber) is a well-known species of Cucurbitaceae that is

Molecular Networking
The molecular network (MN) was constructed using the UPLC-HRMS/MS data (in the negative and positive ion modes) from both fruit extracts, prepared as described in Section 2.3, following the exact conditions mentioned in Hegazi, N.M. et al. [9]. The MN parameters were as follows: a minimum cosine score of 0.70, a parent mass tolerance of 0.1 Da, a fragment ion tolerance of 0.5 Da to create consensus spectra, more than 6 matched peaks, and a minimum cluster size of 2. All matches kept between network spectra and library spectra were required to have a score above 0.7 and at least 4 matched peaks. Cytoscape (ver. 3.8.2) was used for network visualization and analysis.
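To make the role of these thresholds concrete, the following is a minimal sketch of how two centroided MS/MS spectra could be compared and tested against the cosine and matched-peak criteria listed above. It is a simplified greedy cosine, not the GNPS implementation (which also handles precursor mass shifts); the function names `cosine_score` and `is_edge` and the example peak list are hypothetical.

```python
import math

def cosine_score(spec_a, spec_b, frag_tol=0.5):
    """Greedy cosine similarity between two peak lists [(mz, intensity), ...].

    Simplified illustration only; returns (score, number_of_matched_peaks).
    """
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    # Collect every pair of peaks whose m/z values agree within the tolerance.
    candidates = []
    for ia, (mz_a, int_a) in enumerate(spec_a):
        for ib, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= frag_tol:
                candidates.append((int_a * int_b, ia, ib))
    # Greedily keep the highest-intensity pairs, using each peak at most once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot, matched = 0.0, 0
    for product, ia, ib in candidates:
        if ia in used_a or ib in used_b:
            continue
        used_a.add(ia)
        used_b.add(ib)
        dot += product
        matched += 1
    return dot / (norm_a * norm_b), matched

def is_edge(spec_a, spec_b, min_cosine=0.70, min_matched=6, frag_tol=0.5):
    """Apply the thresholds from the text: cosine >= 0.70 and more than 6 matched peaks."""
    score, matched = cosine_score(spec_a, spec_b, frag_tol)
    return score >= min_cosine and matched > min_matched

# Hypothetical peak list (m/z, intensity) used only to exercise the functions.
example = [(59.0, 10.0), (71.0, 20.0), (89.0, 15.0), (101.0, 30.0),
           (119.0, 25.0), (237.0, 60.0), (281.0, 100.0)]
print(cosine_score(example, example))
print(is_edge(example, example))
```

In a real network, an edge would be drawn between two nodes only when both criteria pass, which is why sparse spectra with few fragments tend to remain singletons.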

Headspace Volatiles Analysis of C. sativus and L. siceraria
Samples were prepared and analyzed following the same procedure and conditions reported in Farag, M.A. et al. [12]. GC-MS analysis was performed on an Agilent 5977B GC/MSD (Santa Clara, CA, USA) equipped with a DB-5 column (30 m × 0.25 mm i.d. × 0.25 µm film thickness; Supelco, Bellefonte, PA, USA) and coupled to a quadrupole mass spectrometer. For assessment of replicates, three different samples of each fruit were analyzed under the same conditions. Blank runs were conducted during sample analyses. The mass spectrometer was operated in EI mode at 70 eV with a scan range of m/z 40-500.

GC-MS Analysis of Silylated Primary Metabolites in C. sativus and L. siceraria Fruits
A 100 mg portion of finely powdered freeze-dried sample (for each fruit) was extracted with 5 mL of 100% methanol by sonication for 30 min in a Branson CPX-952-518R bath set at 36 °C (Branson Ultrasonics SA, Carouge, Switzerland) with regular shaking, followed by centrifugation (LC-04C 80-2C regen lab prp centrifuge, Zhejiang, China) at 12,000× g for 10 min to eliminate debris. For evaluation of biological replicates, three independent samples of each fruit were analyzed under the same conditions. Then, 100 µL of the methanol extract was placed in open screw-cap vials and evaporated under a stream of nitrogen gas until full dryness. For derivatization, 150 µL of N-methyl-N-(trimethylsilyl)-trifluoroacetamide (MSTFA), previously diluted 1/1 with anhydrous pyridine, was mixed with the dried methanol extract and incubated (Yamato Scientific DGS400 Oven, QTE TECHNOLOGIES, Hanoi, Vietnam) for 45 min at 60 °C prior to GC-MS analysis. Separation of silylated derivatives was completed on an Rtx-5MS column (Restek, Bellefonte, PA, USA; 30 m length, 0.25 mm inner diameter, and 0.25 µm film thickness). Analysis of these primary metabolites followed the exact protocol detailed in Sedeek, M.S. et al. and Farag, M.A. et al. [11,12].

Metabolites Identification and Multivariate Data Analyses of Volatile and Non-Volatile Silylated Components Analyzed Using GC/MS
Volatile and silylated components were identified by comparing their retention indices (RI), calculated relative to n-alkanes (C8-C30), and their mass spectra with the NIST and WILEY library databases, and with standards where available. Peaks were first deconvoluted using AMDIS software (www.amdis.net, accessed on 23 April 2022) before mass spectral matching. Peak abundance data were extracted using MET-IDEA software (Broeckling, Reddy, Duran, Zhao, and Sumner, 2006) and exported for multivariate data analysis. Data were then normalized to the amount of the spiked internal standard, (Z)-3-hexenyl acetate, mean-centered, Pareto scaled, and subjected to principal component analysis (PCA), hierarchical clustering analysis (HCA), and orthogonal projection to latent structures-discriminant analysis (OPLS-DA) using the SIMCA-P version 13.0 software package (Umetrics, Umeå, Sweden).
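The preprocessing chain described above (internal standard normalization, mean-centering, Pareto scaling, then PCA) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the SIMCA-P workflow used in the study: the function names, the position of the internal standard column, and the small abundance matrix are all hypothetical.

```python
import numpy as np

def normalize_to_internal_standard(peaks, is_column):
    """Divide each sample (row) by the abundance of its internal standard peak."""
    return peaks / peaks[:, [is_column]]

def pareto_scale(X):
    """Mean-center each variable and divide by the square root of its standard deviation."""
    centered = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # constant variables (e.g. the standard itself) stay at zero
    return centered / np.sqrt(sd)

def pca(X, n_components=2):
    """PCA via singular value decomposition of an already centered/scaled matrix."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, loadings, explained[:n_components]

# Hypothetical abundances: 3 replicates per fruit (rows), 4 peaks (columns);
# the last column plays the role of the spiked internal standard.
peaks = np.array([
    [100.0, 50.0, 10.0, 20.0],
    [110.0, 52.0, 11.0, 20.0],
    [95.0, 49.0, 9.0, 20.0],
    [20.0, 50.0, 80.0, 20.0],
    [22.0, 51.0, 85.0, 20.0],
    [19.0, 48.0, 78.0, 20.0],
])
X = pareto_scale(normalize_to_internal_standard(peaks, is_column=3))
scores, loadings, explained = pca(X)
print(scores[:, 0])   # PC1 scores: the two groups fall on opposite sides
print(explained)      # fraction of variance captured by PC1 and PC2
```

Pareto scaling sits between no scaling and unit-variance scaling: it reduces the dominance of very abundant peaks while keeping low-abundance noise from being inflated to the same weight.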

Metabolome Profiling of L. siceraria and C. sativus Fruit Extracts via HR-UPLC/PDA/ESI-MS Based Molecular Networking
The comparative profiling of metabolites in bottle gourd and cucumber crude fruit methanol extracts was conducted via HR-UPLC/MS/MS in both negative and positive ionization modes (Section 2.4) for a comprehensive overview of metabolites that belong to different phytochemical classes. A gradient elution system of formic acid in water (0.1%): acetonitrile allowed for metabolites' elution in order of their decreasing polarity. The obtained base peak chromatograms (BPCs) of each fruit extract in both ionization modes are presented in Figure 1. Furthermore, molecular networking was applied herein for the in-depth exploration and discrimination of samples guided by the Global Natural Products Social (GNPS) networking software and its spectral library that aided in peaks' assignment and annotation based on analyzing the HR-tandem MS/MS data [9,10]. Two molecular networks (MN) were constructed from both ionization mode analyses that provided visual discrimination of samples based on metabolites' abundance in each node, presented as a pie chart, allowing for rapid dereplication of known compounds (Figures 2 and S1). Samples were coded with orange and green colors for bottle gourd and cucumber, respectively. All nodes were labelled with parent mass and edges were labelled with neutral loss values. Identification was based on determining retention time (Rt. min.) for each metabolite, its mass spectral data (including molecular ion, daughter ions, their respective formulae and fragmentation pattern), and comparing the collective data with the reported literature and databases such as HMDB, PubChem, FooDB, the Phytochemical dictionary of natural products, and others, combined with GNPS library annotations. The clustering of metabolites in MN (minimum two connected nodes) was based on shared fragments and their fragmentation pattern; this allowed us to extrapolate the identification of annotated compounds to the unknown peaks aided by the generated formulae [10].
Overall, 107 peaks were annotated in both modes (Table 1), belonging to amino acids, organic acids, cinnamates, alkaloids, flavonoids, pterocarpans, alkyl glycosides, sesquiterpenes, saponins, lignans, fatty acids/amides, and lysophospholipids, including several first-time reported metabolites and classes, as explained in detail in the following subsections. The constructed MN from the HR-negative ESI-MS/MS analysis was composed of 351 nodes and 449 edges, in which the clusters of interest included cluster A: fatty acids/amides, cluster B: lysophospholipids, cluster C: flavonoids and pterocarpans, cluster D: alkyl glycosides, cluster E: saponins, cluster F: organic acids, cluster G: amino acid derivatives, cluster H: sesquiterpenes, cluster I: lysophosphatidic acid derivatives, and cluster J: cinnamates (Figure 2). The positive MN was composed of 1002 nodes and 1451 edges, in which the clusters of interest included cluster A: lignans, cluster B: flavonoids, cluster C: amino acid derivatives, singleton D: alkaloid, and cluster E: fatty acids/amides (Figure S1).
The detection of alkaloids in positive mode versus phenolic acids in negative mode is expected considering the improved sensitivity for each class in the respective mode, and highlights the importance of acquiring data in different ionization modes. Glycosides were detected based on the neutral loss of the attached O-sugar moieties at 162 amu (C 6 [13]. Acylation of glycosides with acetyl or malonyl moieties was recognized by additional mass and/or neutral loss of 42 amu (C2H2O) or 86 amu (C3H2O3), respectively, while C-glycosides showed significant neutral losses of 90 amu and 120 amu resulting from 0,2 and 0,3-sugar ring cleavage [14].
Several amino acids and amine derivatives were eluted early, as detected in chromatograms at Rt. 4-11 min. (Figure 1), and grouped in two major clusters, i.e., G and C, in negative and positive MNs, respectively (Figure 2 and Figure S1), from which 14 metabolites could be annotated in peaks 2-15 (Table 1). The identified metabolites included four amines and 10 derivatives of amino acids, viz. lysine, iso/leucine, phenylalanine, and glutamine, in which decarboxylation (-CO2, 44 amu), demethylation (-CH2, 14 amu), deglycosylation, deamination (-NH3, 17 amu) and dehydration (-H2O, 18 amu) were the major fragmentation pathways, matching the reported literature, GNPS library, and HMDB and MassBank databases. All identified metabolites were common between both fruits except for peaks 14 and 15 (  Figure S1) at Rt. 11.4 min. (Table 1) and was identified as N,N,N-trimethyltryptophan betaine, known as lenticin or hypaphorine alkaloid, which is reported herein for the first time in cucumber fruit. Identification was based on the presence of diagnostic daughter ions at m/z 188, C11H10NO2+, after the loss of the N-trimethyl moiety, followed by decarboxylation at m/z 144, C10H10N+, or dehydration at m/z 170, C11H8NO+.
The latter proceeded into ring cleavage and demethylation at m/z 146, C9H8NO+, and m/z 122, C7H8NO+, as explained in Figure S2, matching the reported literature [18] and the HMDB spectrum (https://hmdb.ca/spectra/ms_ms/2947783, accessed on 11 January 2023). Lenticin is distributed in various vegetables and is known to exhibit cardioprotective and neurological effects; this metabolomics approach thus links it, for the first time, to the reported cardioprotective effect of cucumber [19]. Cucumber may contain several other alkaloids, but a targeted extraction method using solvents of lower polarity is needed to enhance their detection.
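The diagnostic neutral losses quoted in this section lend themselves to a simple lookup. The sketch below, which is only an illustration and not the annotation workflow used in the study, flags fragments whose mass difference from a precursor matches one of the nominal losses named in the text; the dictionary labels, the function name `annotate_neutral_losses`, and the 0.5 amu tolerance are assumptions.

```python
# Nominal neutral losses (amu) taken from the text above; labels are shorthand.
NEUTRAL_LOSSES = {
    162: "O-linked sugar loss",
    42: "acetyl (C2H2O)",
    86: "malonyl (C3H2O3)",
    90: "C-glycoside sugar ring cleavage",
    120: "C-glycoside sugar ring cleavage",
    44: "decarboxylation (CO2)",
    18: "dehydration (H2O)",
    17: "deamination",
    14: "demethylation (CH2)",
}

def annotate_neutral_losses(precursor_mz, fragment_mzs, tol=0.5):
    """Return (fragment, nominal_loss, label) for each fragment matching a known loss."""
    hits = []
    for frag in fragment_mzs:
        loss = precursor_mz - frag
        for nominal, label in NEUTRAL_LOSSES.items():
            if abs(loss - nominal) <= tol:
                hits.append((frag, nominal, label))
    return hits

# The m/z 188 ion of lenticin and its two daughter ions described in the text.
for hit in annotate_neutral_losses(188.0, [144.0, 170.0]):
    print(hit)
```

Such a table only proposes candidate assignments; as in the text, each one still needs support from the generated formulae and from reference spectra.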

Phenolics and Cinnamic Acid Derivatives
In total, 15 glycosylated and/or acylated derivatives of phenolic and cinnamic acids were eluted at Rt. 8-16 min. in peaks 23-38 (Table 1). While sinapic acid derivatives were detected in cucumber only, bottle gourd extract was exclusively enriched with conjugates of phenyl and caffeoyl or feruloyl glycosides (Figures S3-S5) that appeared in negative MN grouped in cluster J (Figure 2) and were identified in peaks 28, 29, 31-33 and 35-38 (Table 1). Little information is available on the bioactivities of such acylated conjugates and how they contribute to the health effects of L. siceraria; thus, further studies are required to explore their potential pharmacological effects.

Sesquiterpenes
The family Cucurbitaceae is known to accumulate di-, sesqui-, and triterpenes, to which various bioactivities are attributed [4,21] (Table 1). Cynaroside A is a sesquiterpene lactone-O-hexoside, abundant in artichoke, that undergoes deglycosylation at m/z 281, C15H21O5−, and decarboxylation at m/z 237, C14H21O3, followed by sequential demethylation, dehydration and ring cleavage fragmentation processes to yield several diagnostic daughter ions, as explained in detail in Figure S6, in accordance with databases and the literature [22]. Although it was previously detected in/isolated from other green vegetables [23], this is the first report of its presence in F. Cucurbitaceae, and whether these fruits could serve as a source of this sesquiterpene, similar to artichoke, should be considered. The other four metabolites clustered with cynaroside A shared ions corresponding to lactone ring cleavage followed by sequential demethylation and dehydration at m/z 119, m/z 101, m/z 89, m/z 71, and m/z 59 (Figure S6), but could not be completely identified.

Alkyl Glycosides
From cluster D in negative MN (Figure 2), four metabolites were detected exclusively in cucumber as formate adducts in peaks 44-47 that belong to the alkyl glycosides class (Table 1) Figure S7) in good accordance with the literature [24] and databases. Similarly, peaks 45 and 46 were identified as benzyl-O-pentosyl-hexoside and hexanol-O-pentosyl-hexoside, respectively. Such compounds were previously detected in several fruits, such as apples, anise, cumin, and others [15,25-27]; however, to our knowledge, this is the first report of their presence in Cucurbitaceae. The exact bioactivities of this class of compounds in these fruits have yet to be studied.

Flavonoids
Previous studies reported the presence of isoflavones and acylated/methoxylated apigenin and luteolin-O/C-glycosides in cucumber [28,29], while flavonoid-O/C-glycosides were reported in both species [3,4,30]. In this study, 24 flavonoids were detected and tentatively identified in both species, including flavones, flavonols, and isoflavones. C-glycosides were recognized by MSn ions corresponding to sugar ring cleavage, unlike O-glycosides, which showed intact release of the dehydrated sugar. Aglycones were differentiated based on their typical RDA fragments [31]. Identification was guided by GNPS-based networking, which allowed extrapolating peaks' annotations to unknown compounds, as observed in clusters B and C, in positive and negative MNs, respectively (Figure S1 and Figure 2).
Such a mixture of flavonoids likely contributes to the various health effects of both species, since acylation and methoxylation are known to impart structural and functional modifications that improve flavonoids' bioactivity by increasing their lipophilicity and intracellular bioavailability, leading to better receptor/ligand binding compared to their non-acylated/non-methoxylated analogues, as observed in cancer chemoprevention and anti-acetylcholinesterase activities [20,33-35].

Pterocarpans
Pterocarpans are derivatives of isoflavonoids de novo biosynthesized as a response to stress, mainly detected in legumes [36,37]. Herein, three pterocarpans were detected in cucumber grouped with flavonoids in negative MN cluster C (Figure 2) and identified in peaks 72-74 (Table 1). Identification was based on the diagnostic fragmentation pattern, showing sequential loss of two methyl groups (−2 × CH2, 14 amu) followed by decarbonylation (-CO, 28 amu), confirmed by generated formulae [38]. For example, peak 72 in Table 1 Table 1). This is the first report of pterocarpans in Cucurbitaceae; further studies are required to confirm their biosynthesis in plants other than legumes and present them as potential sources of pterocarpans.
(Table 1), identified as pentahydroxydimethoxylignan, known as carinol and secoisolariciresinol, previously reported in Cucurbitaceae [39], and trihydroxy-dimethoxylignan, respectively. Elemental compositions of peaks 76, C20H27O6+, and 77, C20H27O5+, showed one and two hydroxyl moieties fewer than peak 75, C20H27O7+, respectively. Identification was based on diagnostic fragments corresponding to sequential dehydration, demethylation, and demethoxylation prior to and/or after cleavage of the bis(benzylbutanediol) bonding (Figure S11), sharing MSn ions at m/z 137, C8H9O2+, m/z 121, C8H9O+, m/z 107, C7H7O+, and m/z 93, C7H9+, in accordance with references and databases [40].

Saponins
Previously, cucumber was reported to contain triterpenoid saponins [41]. In this study, five saponins in peaks 78-82 were detected in cucumber fruit (  Figure S12) matching references and databases [42]. On the other hand, all saponins were detected in negative ionization mode analysis as formate adducts (+46 amu, HCOOH), in which soyasaponin I showed few fragments indicative of the loss of hydroxymethyl, hydroxyl and ring A cleavage at m/z 397 and m/z 341, as predicted in HMDB spectra. In negative MN (Figure 2), cluster E consists of saponins correlated with soyasaponin I for their soyasapogenol B/E MSn ions at m/z 397 and m/z 341, in addition to other ions indicating terminal sugar loss or a fragment of hexose at m/z 113 (Table 1, Figure S13). From the aforementioned data, peaks 79 and 81 were annotated as soyasapogenol E-O-dihexosyl-O-glucuronide (soyasaponin Bd) and soyasapogenol B-O-rhamnosyl-O-pentosyl-O-hexosyl-O-glucuronide (melilotus saponin O1), respectively. This mixture of saponins is reported in legumes [43], but this is the first report of it in cucumber. Other saponins may be present in L. siceraria and C. sativus fruits and could be revealed by using a targeted extraction method instead of crude methanol extracts.

Fatty Acids/Amides
In total, 16 fatty acids and one fatty acyl amide were annotated in peaks 83-99, equally distributed in both species (Table 1) and grouped in the major clusters in both negative and positive MNs, A and E, respectively (Figure 2 and Figure S1). This included mono-, di-, tri-, and tetrahydroxylated fatty acids, as evidenced by the loss of water (H 2 [44]. Azelaic acid is incorporated into skin products for the treatment of alopecia and is also reported to have cytotoxic and anti-inflammatory effects [44,45]. This could account for the popular usage of cucumber in skin preparations [3]. In contrast, 10 unsaturated fatty acids were detected in peaks 85, 88, and 90-97 (Table 1)

Lysophospholipids
In negative MN (Figure 2), cluster B and cluster I grouped lysophospholipids and lysophosphatidic acids, respectively, from which peaks 100-107 were detected (Table 1)
Overall, metabolome profiling of cucumber and bottle gourd based on GNPS networking showed that both fruits share the abundance of amino acids, organic acids, flavones-C-glycosides, fatty acids and lysophospholipids in a comparable manner. By contrast, triterpenoid saponins, alkaloids, flavones-C-glycosides acylated with ferulic or coumaric acids, isoflavones, alkyl glycosides and pterocarpans were exclusively detected in cucumber, and the latter two classes were reported for the first time in Cucurbitaceae. On the other hand, bottle gourd was further distinguished by the presence of dimethoxylated flavonoids acylated with malonic or acetic acids, lignans and conjugates of phenyl compounds with caffeic/ferulic acid-O-glycosides, all of which are reported for the first time in genus Lagenaria. Sesquiterpenes were more abundant in bottle gourd, except for cynaroside A, which was detected in both fruits for the first time as well. Successive extraction of fruits with solvents of different polarities is recommended to further reveal phytochemical classes that require targeting, such as alkaloids and saponins. The herein-identified metabolites rationalized several of the reported pharmacological effects in both species, and further promoted them for new in vitro and in vivo medicinal research. Cucumber shared several phytochemicals that are characteristic of legumes (Fabaceae), e.g., soyasaponins, isoflavones, and pterocarpans, thus supporting the idea of an in-depth study of its biosynthetic pathways and gene expression data [37,43].

Aroma Profiling of L. siceraria and C. sativus Fruits Using SPME Coupled to GC-MS
SPME is well suited to aroma profiling of food matrices with low-strength aroma, and allows collection at low temperature compared to steam distillation, providing the true composition of volatile blends [11]; this is the first time it has been reported in both species.

Multivariate Data Analysis of Volatiles Dataset Acquired from SPME-GC/MS
Although notable differences in volatile constituents were observed visually between both fruits, multivariate data analyses were employed to classify both fruits in an untargeted manner using PCA and OPLS modelling (Figure S14). Sample segregation was observed along PC1 and PC2, which together accounted for more than 80% of the total variance (Figure S14A,B). Supervised OPLS was further applied for its superiority in class separation [47] to obtain a model with good explained variance and prediction power (R2 = 0.88, Q2 = 0.82) (Figure S14C,D). The loading plot and S-loading plot identified the metabolites mediating sample separation, denoted by Mol. ion/Rt and labelled with their peak numbers in accordance with Table S1 (Figure S14B,D).
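For readers less familiar with the two model statistics quoted above, R2 measures how well the model fits the data it was trained on, and Q2 measures how well it predicts samples left out during cross-validation. The sketch below illustrates both with a one-component PLS regression and leave-one-out cross-validation. It is a stand-in for illustration only: it is not the OPLS algorithm or the cross-validation scheme implemented in SIMCA-P, and the function names and data are hypothetical.

```python
import numpy as np

def pls1_fit(X, y):
    """Fit a one-component PLS regression (illustrative stand-in for OPLS-DA)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc                 # weight vector: covariance of each variable with y
    w = w / np.linalg.norm(w)
    t = Xc @ w                    # sample scores on the predictive component
    q = (yc @ t) / (t @ t)        # regression of y on the scores
    return x_mean, y_mean, w, q

def pls1_predict(model, X):
    x_mean, y_mean, w, q = model
    return y_mean + ((X - x_mean) @ w) * q

def r2_q2(X, y):
    """R2 = 1 - RSS/TSS on fitted values; Q2 = 1 - PRESS/TSS from leave-one-out."""
    tss = np.sum((y - y.mean()) ** 2)
    fitted = pls1_predict(pls1_fit(X, y), X)
    r2 = 1.0 - np.sum((y - fitted) ** 2) / tss
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep])
        press += (y[i] - pls1_predict(model, X[i:i + 1])[0]) ** 2
    return r2, 1.0 - press / tss

# Hypothetical scaled abundances: 3 replicates per class, class coded as 0 or 1.
X = np.array([[1.0, 10.0, 0.1, 5.0], [1.1, 10.2, 0.0, 5.1], [0.9, 9.8, 0.2, 4.9],
              [5.0, 2.0, 0.1, 5.0], [5.2, 2.1, 0.0, 5.1], [4.8, 1.9, 0.2, 4.9]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
print(r2_q2(X, y))
```

A large gap between R2 and Q2 usually signals overfitting, so the closeness of the two values reported above is what supports describing the model as predictive.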
The overall aroma of bottle gourd and cucumber is manifested by a mixture of characteristic volatiles. Linalool, the major detected constituent in both fruits, is known to impart an orange oil-like aroma, whereas fenchone in bottle gourd possesses a pleasant camphor-like aroma and is incorporated in food as a perfumery flavor [48]. On the other hand, aldehydes in cucumber, i.e., nonadienal, nonenal, and hexenal, are known to mediate the pleasant characteristic aroma of fresh cucumber that is further enhanced upon chewing, while benzaldehyde has an almond-like odor; all of these are also present in bottle gourd at variable concentrations [49-51]. In addition to their role in taste and odor, such volatiles are known to mediate antimicrobial activity [51,52]. The study infers that bottle gourd possesses a consumer-pleasant aroma.

Unsupervised PCA Data Analysis of C. sativus and L. siceraria Primary Metabolites
To provide insight into the primary metabolome of C. sativus and L. siceraria fruits that mediates their nutritive value, GC-MS was employed, leading to the detection of 49 peaks. Chromatograms displayed a representative profile of the nutrient primary metabolites of C. sativus and L. siceraria fruits (Figure S15).

In C. sativus, sugars represented the most abundant class at ca. 63% of detected primary metabolites, mostly represented by monosaccharides at 61% and trace disaccharides at ca. 2%. Fructose, glucose, and galactose were the main identified monosaccharides, amounting of phenyl compounds with caffeic/ferulic acid-O-glycosides, all of which were reported for the first time in genus Lagenaria. Sesquiterpenes were more abundant in bottle gourd, except for cynaroside A, which was detected in both fruits for the first time. The herein-identified metabolites rationalized several of the reported pharmacological effects of both species and further promoted them for new in vitro and in vivo medicinal research. Several of the herein-detected phytochemicals, e.g., pterocarpans, alkyl glycosides, and phenyl-cinnamic acid conjugates, require further bioactivity studies to establish their potential health effects. Aroma profiling via SPME-GC/MS analysis detected 93 volatiles at comparable levels in both species, responsible for the pleasant aroma of bottle gourd, despite its enrichment with ketones and esters. In addition, 49 silylated primary metabolites were detected in both species via GC/MS analysis at comparable levels, of which bottle gourd was further enriched with fatty acids, and cucumber with sugars.

Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/foods12040771/s1, Table S1: Relative percentile of volatile constituents detected in C. sativus and L. siceraria fruits via SPME/GCMS; Figure S1: Full molecular networking created using MS/MS data in positive ionization mode for L. siceraria (bottle gourd) and C. sativus (cucumber) crude fruit extracts showing 1002 nodes and 1451 edges. All nodes are labelled with parent mass and edges are labelled with neutral loss values.
The network is displayed as a pie chart with orange and green colors representing the distribution of the precursor ion intensity in the bottle gourd and cucumber extracts, respectively; Figure Figure S13: Tandem MS/MS spectrum of melilotus saponin O1 (peak 81) detected as formate adducts in C. sativus fruit crude extract via HR-UPLC/MS/MS analysis in negative ionization mode and identified based on GNPS networking with soyasaponin I in cluster E; Figure S14: Principal component analysis (PCA) and supervised orthogonal projection to latent structures-discriminant analysis (OPLS-DA) modelling of C. sativus and L. siceraria fruit specimens analyzed via SPME GC-MS for their volatile metabolites (n = 3). PCA score (A) and loading plot (B) with PC1 = 61% and PC2 = 22%; OPLS-DA score plot (C) and loading S-plot (D). Variables labelled with peak numbers (as in Table S1) correspond to discriminating metabolites for each sample identified by their Mol.wt/Rt. Benzaldehyde (peak 14), octanal (peak 15), benzeneacetaldehyde (peak 17), nonadienal (peak 19), anethol ether (peak 83), fenchone (peak 67) and methyl hexadecanoate (peak 58) are the discriminating biomarkers; Figure S15: Representative GC-MS chromatograms of C. sativus and L. siceraria fruit specimens' silylated primary metabolites. Assigned peak numbers follow those shown in Table 2; Figure S16: Principal component analysis (PCA) and supervised orthogonal projection to latent structures-discriminant analysis (OPLS-DA) modelling of C. sativus and L. siceraria fruit specimens analyzed via GC-MS for their silylated primary metabolites (n = 3). PCA score (A) and loading plot (B) with PC1 = 73% and PC2 = 21%; OPLS-DA score plot (C) and loading S-plot (D). Variables labelled with their names follow those listed in Table 2.

Institutional Review Board Statement:
Not applicable.

Data Availability Statement:
The data presented in this study are available in article and supplementary materials.

Conflicts of Interest:
The authors declare no conflict of interest.