Margaritaria nobilis L.F. (Phyllanthaceae): Ethnopharmacology and Application of Computational Tools in the Annotation of Bioactive Molecules

Margaritaria nobilis is a shrubby species widely distributed in Brazil from the Amazon to the Atlantic Rainforest. Its bark and fruit are used in the Peruvian Amazon for disinfecting abscesses and as a tonic in pregnancy, respectively, and its leaves are used to treat cancer symptoms. Using UHPLC-MS/MS, we sought to determine the chemical profile of the ethanolic extract of M. nobilis leaves through putative annotation supported by computational tools and spectral libraries. This made it possible to annotate 44 compounds, of which 12 are phenolic acid derivatives, 16 are O-glycosylated flavonoids and 16 are hydrolysable tannins. All of the flavonoids are known compounds, but only kaempferol has previously been isolated from this species; the other flavonoids (10, 14, 15, 21, 24–26, 28–30, 33–35, 40 and 41) are being reported for the first time in the genus. Among the hydrolysable tannins, six ellagitannins present the HHDP group (6, 19, 22, 31, 38 and 43), one presents the DHHDP group (5), and four contain oxidatively modified congeners (12, 20, 37 and 39). Through the annotation of these compounds, we hope to contribute to improved chemosystematic knowledge of the genus. Furthermore, supported by a metric review of the literature, we observed that many of the compounds reported here are congeners of authentically bioactive compounds. Thus, we believe that this work may help in understanding future pharmacological activities.


Introduction
The species Margaritaria nobilis was for a time classified in the genus Phyllanthus, whose chemical and pharmacological properties are widely documented [1]. However, phylogenetic studies suggested its reclassification to the genus Margaritaria, which is the currently accepted placement [2].
This species is popularly known as "botãozinho", "figueirinha", "sobragirana", "cafébravo" and "fruto-de-jacamin", and although not endemic in Brazil, it has well-established phytogeographic domains in the Amazon, Caatinga and Atlantic Forest [3,4]. In traditional medicine, the decoction of its bark is used for asepsis of abscesses, the slightly boiled fruit is used as a pregnancy tonic [5], and the leaves are used to treat cancer-like symptoms [6].
Chemically, the genus Margaritaria is reported to contain phenolic derivatives, such as gallic acid and glycosylated flavonoids from M. discoidea [7,8], and the alkaloids securinine and phyllocrisin from M. indica [9].
Beyond that, according to our literature review, there are few phytochemical studies of M. nobilis. Their authors reported the presence of kaempferol, the phenols gallic acid and methyl gallate and the tannin corilagin in the leaves of the plant; betulinic acid and the alkaloid phyllanthidine were isolated from the stem [4].
In accordance with pharmacological reports for these species, we believe that activities such as cytotoxic [8], antioxidant [7], anti-inflammatory [10], analgesic [11], antimicrobial [12] and leishmanicidal [4] activities can be understood in light of the potential of these compound classes.
In this regard, we emphasize that a multifaceted investigation of these activities is only possible once the structures of these biomolecules are unequivocally defined [13]. In this field, although nuclear magnetic resonance spectroscopy is the main technique [14], computational tools based on machine learning can now provide preliminary structural annotations of phytoconstituents in complex matrices [15,16].
At this juncture, the workflows for mining pharmacologically relevant natural products have arguably become faster and more precise, as they support bioguided screening and isolation of active molecules [17–19]. These advances are expected to become increasingly significant as the sharing of scientific data becomes standardized (Aron et al. 2020). Moreover, the continuous supply of spectral data for identified compounds has served as a reference for the prospecting of unknown compounds, revealing new natural matrices with high therapeutic potential [20,21].
Thus, on this and other evidence, we strongly believe that plant extracts that have never been thoroughly investigated can be satisfactorily targeted to various pharmacological segments from the chemical annotation provided by robust computational tools.
In this perspective, considering that the species M. nobilis possesses an authentic arsenal of chemical constituents capable of providing formidable pharmacological bioprospecting, and supported by computational tools, we sought to annotate the largest number of the compounds present in the ethanolic extract of M. nobilis leaves through putative analysis via UHPLC-MS/MS, followed by a metric review of the pharmacological properties of compounds already reported in the literature. Thus, we describe here the annotation of 44 compounds, of which 12 are phenolic acid derivatives, 16 are flavonoids and their O-glycosylated derivatives, and 16 are hydrolysable tannins.

Characterization of Detectable Components in the EtOH Extract of Margaritaria nobilis
The characterization of detectable compounds was performed using two approaches: (1) analysis of LC-MS/MS results using cheminformatics tools, and (2) manual analysis of MS and MS/MS spectra. As a result of this process, a feature-based molecular network (Figure S2) was generated on the GNPS platform, which allowed the annotation of M. nobilis metabolites.
To increase the reliability of the putative identification of the compounds, the chemotaxonomy of the Phyllanthaceae family, and more precisely that of the genus Margaritaria, was considered. As shown in Table 1, forty-four compounds (Figure S3) were identified and classified into three groups: phenolic acid derivatives, flavonoids and their O-glycosylated derivatives, and hydrolysable tannins.

Phenolic Acid Derivatives
The main phenolic compounds identified in M. nobilis were gallic acid (1), methyl gallate (4), ethyl gallate (11), p-coumaric acid (9), O-coumaroylgalactaric acid (2) and O-feruloylgalactaric acid (3). These compounds showed common losses of 44 Da (CO2), characteristic of this class [30]. [...] followed by m/z 124 due to loss of CO2.
[...] It is noteworthy that glycosylation at the 3-O position of the aglycone was defined based on the intensity and ratio of the radical ion and negative ion observed in the MS/MS spectrum [33]. The presence of glycosylated flavonoids in species of the genus Margaritaria has already been reported in the literature [7]. However, with the exception of kaempferol, which has already been isolated from M. nobilis [4], the other flavonoids (10, 14, 15, 21, 24–26, 28–30, 33–35, 40 and 41) are being reported for the first time in the genus.
[...] [24]. The scheme in Figure 1 shows the main fragmentation pathways of the [M−H]− ion at m/z 603, identified as trigalloyl-dideoxyglucose (32). In addition to the characteristic losses mentioned, the fragment ion m/z 211 probably resulted from a retro Diels-Alder mechanism [...] [34]. However, constitutional isomers of ellagitannins cannot be differentiated by mass spectrometry alone [24,34].
For this reason, the annotations were based on the structural proposals provided by the Sirius 4 software [35], taking into account the systematic classification provided by CANOPUS [36], and the proposed structural formula was chosen based on compounds of this class already reported in the genus or family of M. nobilis.
In our study, six ellagitannins containing only HHDP groups (6, 19, 22, 31, 38 and 43), one containing a DHHDP group (5), two isomers containing a Che group (7 and 17) and four containing oxidatively modified congeners (12, 20, 37 and 39) were putatively identified.
[...] which allowed its identification as an isomer of phyllanthusiin C (12), already isolated from the species Phyllanthus myrtifolius and P. urinaria (Phyllanthaceae) [37]. The diagram in Figure 2 presents the main fragmentation pathways of this compound. [...] Figure 3A (see spectrum in Figure S6B). [...] Figure 4, it is plausible to infer that it is an ellagitannin isomer of excoecariphenol C (22) (see spectrum in Figure S6D). [...] Figure 5, it is possible to suggest that it is an ellagitannin isomer of phyllanthusiin U (39) (see spectrum in Figure S6E).
A search for the annotated hydrolysable tannins in natural product databases, such as KNApSAcK and the Dictionary of Natural Products, confirmed the presence of these compounds in the Phyllanthaceae family, especially in the genus Phyllanthus, which is closely related to Margaritaria. The ellagitannin corilagin has already been isolated from M. nobilis [4] and was identified in our study by mass spectrometry (compound 6). The remaining hydrolysable tannins are being reported for the first time in the genus.

Discussion
Despite reports of the use of Margaritaria nobilis in traditional medicine, only one study has evaluated its antimycobacterial activity [38], and studies characterizing its secondary metabolites remain limited [4].
In view of this, and as an alternative to the use of the bark, whose harvesting compromises the integrity and perpetuation of the species, we chose to evaluate the leaves, given their high availability and rapid natural replacement, expecting that they may contain compounds as interesting as those already observed in the bark. Based on the results obtained, a search was carried out in the scientific literature on the pharmacological activities already attributed to compounds (or their classes) that were putatively identified in the ethanolic extract of M. nobilis leaves.
This search showed that extracts of plant species rich in glycosylated flavonoids display pharmacological activities such as analgesic and anti-inflammatory effects [39]. For example, rutin (15) produces antinociceptive effects involving central modulation of the vlPAG downstream circuit, partially through an opioidergic mechanism [40]. A mixture of quercetin 3-O-glucoside (21 and 26) showed antinociceptive activity comparable to that of the reference compound indomethacin [41].
Kaempferol (42) and its glycosylated derivatives are widely distributed in nature and have several biological activities. A review of kaempferol discussed the anti-inflammatory effects and mechanisms of action of this substance, confirming its potential to improve inflammation under both in vitro and in vivo conditions [42]. Other biological effects can be attributed to these substances, such as: hepatoprotective [43], gastroprotective [44], anti-arthritis [45], anti-cancer [46] and neuroprotective [47].
Ellagic acid (18) is a polyphenol widely investigated for its pharmacological properties, mainly against toxicity and liver diseases. These effects can be explained by its antioxidant capacity, in addition to its ability to improve the lipid profile and lipid metabolism, alter pro-inflammatory mediators and decrease nuclear factor kappa B (NF-κB) activity. In addition to being detected in its free form, ellagic acid can be released by the hydrolysis of ellagitannins under physiological conditions [48,49].
Currently, articles and patents show a growing interest in hydrolysable tannins due to their economic, chemical and biological value, with applications as veterinary products, food additives and biopesticides and in structural bone repair. Among their biological activities, we can mention anticancer, antioxidant, antimicrobial, anti-inflammatory, antidiabetic, healing, cardiovascular protective and antiviral activity [34,50,51].
The hydrolysable tannins are subdivided into gallotannins and ellagitannins. In our analyses, three gallotannins and several ellagitannins were identified. We mention here those that were detected with the highest degree of ionization, which are the isomers of: corilagin (6), geraniin (5) and chebulagic acid (7 and 17).
A systematic review of the pharmacological effects of corilagin described this substance as a promising herbal agent, highlighting its good antitumor activity in hepatocellular carcinoma and ovarian cancer cells [52]. Recently, this substance was tested as a nonnucleoside inhibitor of SARS-CoV-2, the virus that causes COVID-19. The results of this study indicate that this substance has great potential to become a new and effective drug to treat patients infected with this virus [53].
Geraniin has also been shown to be a promising therapeutic agent against SARS-CoV-2, inhibiting the entry of the virus into human cells [54]. Another study reports the potential of this substance against hepatitis B virus (HBV), interfering with the synthesis, stability or transcription of viral DNA [55]. A comprehensive review of this substance highlighted its diverse bioactive properties and recommended additional studies on possible applications in the food, cosmetic and pharmaceutical industries [56].
The promising pharmacological potential of ellagitannins is undeniable, and we cite as a last example chebulagic acid, which was evaluated for its inhibition of the pleiotropic cytokine TNFα that induces pro-inflammatory and pro-angiogenic changes, configuring this compound as an anti-inflammatory agent [57]. Another test performed with this compound showed antiviral activity, which may represent a potential therapeutic agent to control enterovirus 71 infections [58].

Botanical Collection and Identification
Approximately 1 kg of green and homogeneous leaves of mature specimens of Margaritaria nobilis was collected in the forest region of the municipality of Bragança/PA, Brazil, at the coordinates 1°02′08″ S and 46°49′41″ W. The botanical identification was carried out at Embrapa Amazônia Oriental by the botanist Nascimento, E.A.P., with an exsiccata deposited in the IAN herbarium, in the same institution, under registration number 191496. After the botanical certification, the material was washed with 0.1% sodium hypochlorite solution (NaClO) to eliminate micro-organisms (fungi, bacteria, etc.), then with distilled water to remove residues, and sprinkled with absolute ethanol for asepsis. Then, the material was dried in a circulation oven (Quimis, Diadema, Brazil) at 45 °C until constant weight.

Obtaining the Ethanol Extract
The dried leaves were ground in a ball mill (Fritsch, Idar-Oberstein, Germany) to a semi-fine powder (60–100 µm). The ground material was subjected to a 48 h extraction divided into two 24 h batches, using ethanol (99%) as solvent in the proportion of 4 L of solvent for each 1.0 kg of dry ground material. Subsequently, the volumes were pooled and concentrated in a rotary evaporator (Büchi, Flawil, Germany). The concentrate was oven dried at 40 °C to constant weight.

Sample Preparation for Analysis via UHPLC-MS/MS
The extract (10 mg) was subjected to a pre-treatment by solid phase extraction (SPE) in an H2O:MeOH 2:8 (v/v) system to retain interferences, especially fat and chlorophyll present in the leaves. For this, a C18 analytical cartridge (SPE, Phenomenex, Torrance, CA, USA) with 50 mg of stationary phase and a volume of 1 mL was used, previously conditioned with 1 mL of MeOH and 1 mL of ultrapure water. After SPE treatment, a 3 mg aliquot was solubilized in 1 mL of H2O:MeOH 2:8 (v/v), followed by filtration through a 0.22 µm hydrophilic syringe filter (Millipore, Merck, Darmstadt, Germany) for analysis.

Analysis via UHPLC-ESI-QToF-MS/MS
The matrix was analyzed in an ultra-performance liquid chromatography system coupled to an ESI-QToF Xevo G2-S mass spectrometer (Waters Corp., Milford, MA, USA) with an electrospray ionization (ESI) source operating in negative ionization mode. The mass scan ranged from 100 to 1200 Da, and leucine-enkephalin was used as the LockSpray reference mass.
UHPLC analysis was performed on a Waters BEH C18 column (50 × 2.1 mm, 1.7 µm). The column and autoinjector temperatures were maintained at 40 and 25 °C, respectively. The chromatographic run was performed with ultrapure water (solvent A) and acetonitrile (solvent B), both acidified with 0.1% formic acid. The gradient method was defined as follows: 0 min, 10% B; 2 min, 20% B; 30 min, 50% B. The flow rate was 300 µL/min, and the injection volume was 2.00 µL. The total ion chromatogram was acquired using MassLynx V4.1 software (Waters Corp., Milford, MA, USA). The mass spectrometry parameters were set as follows: desolvation gas flow (N2) at 800 L/h, desolvation temperature at 450 °C, cone gas flow (N2) at 50 L/h and source temperature at 120 °C. The capillary and sampling cone voltages were set to 2.0 kV and 80 V, respectively.
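The gradient above is given only as three breakpoints. As a minimal sketch, assuming linear ramps between breakpoints (the text does not state the curve shape), the mobile-phase composition at any time of the run can be estimated as follows; the function name `percent_b` is illustrative only.

```python
from bisect import bisect_right

# Gradient breakpoints from the text: (time in min, % solvent B).
GRADIENT = [(0.0, 10.0), (2.0, 20.0), (30.0, 50.0)]


def percent_b(t_min, gradient=GRADIENT):
    """Return % B at time t_min, ASSUMING linear ramps between breakpoints."""
    times = [t for t, _ in gradient]
    # Clamp to the first and last breakpoints outside the programmed window.
    if t_min <= times[0]:
        return gradient[0][1]
    if t_min >= times[-1]:
        return gradient[-1][1]
    # Locate the segment that contains t_min and interpolate within it.
    i = bisect_right(times, t_min)
    (t0, b0), (t1, b1) = gradient[i - 1], gradient[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0.0, 1.0, 2.0, 16.0, 30.0):
    print(f"{t:5.1f} min: {percent_b(t):.1f}% B")
```

Under this assumption, the composition halfway through the second segment (16 min) is 35% B.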
Data-dependent acquisition (DDA, MS/MS) was performed on the five most abundant ions detected in full-scan MS (top 5 experiments per scan). The ion peaks were detected at charge states +1 and +2 with the inclusion of the 10 most intense ion peaks, with a charge state tolerance of 0.2 Da (m/z) and an extraction tolerance of 2 Da. The differentiation of molecular ions, adducts and fragment ions was performed by chromatographic deconvolution with 3 Da isotope tolerance and 6 Da isotope extraction tolerance. The MS/MS isolation window width was 1 Da, and the scaled normalized collision energy (NCE) was set to 10, 20, 30, 40 and 50 eV.
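The top-5 precursor selection described above can be illustrated with a short sketch. It is not the instrument's acquisition logic, which also applies charge-state and isotope rules; it only ranks the peaks of one full scan by intensity. The peak list is hypothetical and the function name `select_precursors` is an assumption.

```python
def select_precursors(ms1_peaks, top_n=5):
    """Return the m/z values of the top_n most intense (m/z, intensity) peaks."""
    ranked = sorted(ms1_peaks, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in ranked[:top_n]]


# Hypothetical full-scan peaks (m/z, intensity); values are illustrative only.
scan = [
    (169.0, 2.0e4),
    (301.0, 9.0e4),
    (447.1, 1.5e4),
    (463.1, 6.0e4),
    (603.1, 3.0e4),
    (609.1, 7.5e4),
    (633.1, 1.2e5),
]

print(select_precursors(scan))
```

With these illustrative intensities, the two weakest peaks (m/z 447.1 and 169.0) would not be fragmented in this cycle.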

Processing of UHPLC-MS/MS Data
UHPLC-MS/MS data were converted from the standard .raw format (Waters Corp., Milford, MA, USA) to .mzML format using MSConvert 3.0.2 software [59]. The resulting file was processed using MZmine v2.53 [60]. For mass detection at the MS1 and MS2 levels, cut-off levels of 5.0 × 10³ and 1.0 × 10³, respectively, were used. The ADAP chromatogram builder was used with a minimum scan group size of 3, a minimum group intensity threshold of 5.0 × 10³, a minimum highest intensity of 5.0 × 10³ and an m/z tolerance of 0.002 Da. The ADAP (wavelets) algorithm was used for chromatogram deconvolution, with the intensity window S/N as the S/N estimator, a signal-to-noise threshold of 15, a minimum feature height of 5.0 × 10³, a coefficient/area threshold of 50, a peak duration range of 0.01 to 1.0 min and an RT wavelet range of 0.01 to 0.1 min; an m/z range for MS2 scan pairing of 0.02 Da and an RT range for MS2 scan pairing of 0.2 min were also used. Isotopes were detected using the isotope peak grouper with an m/z tolerance of 0.02 Da, an RT tolerance of 0.2 min (absolute) and a maximum charge of 2, and the representative isotope was the most intense. Using the peak list rows filter, features without an associated MS2 spectrum were removed, with the minimum peaks in a row set to 1 and the minimum peaks in an isotope pattern set to 1. Finally, a manual validation step was performed to exclude false features, such as fragments from the ionization source [61] and features with low-quality MS2 spectra, resulting in a final list containing 151 features.
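The row-filtering step can be sketched in a few lines. This is not MZmine code; it only mirrors two of the criteria quoted above (minimum feature height of 5.0 × 10³ and the presence of an associated MS2 spectrum). The `Feature` class and the example values are hypothetical.

```python
from dataclasses import dataclass

MIN_FEATURE_HEIGHT = 5.0e3  # minimum feature height quoted in the text


@dataclass
class Feature:
    mz: float
    rt_min: float
    height: float
    has_ms2: bool


def filter_features(features, min_height=MIN_FEATURE_HEIGHT):
    """Keep features at or above the height cut-off that have an MS2 spectrum."""
    return [f for f in features if f.height >= min_height and f.has_ms2]


# Hypothetical feature list; values are illustrative only.
features = [
    Feature(169.014, 0.9, 2.1e5, True),
    Feature(301.000, 8.4, 4.0e3, True),   # dropped: below the height cut-off
    Feature(463.088, 9.7, 8.8e4, False),  # dropped: no associated MS2 spectrum
    Feature(633.073, 6.2, 1.3e5, True),
]

kept = filter_features(features)
print([f.mz for f in kept])
```

The manual validation described in the text (removal of in-source fragments and low-quality MS2 spectra) is a separate, expert-driven step and is not represented here.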

Feature-Based Molecular Network Creation
From the .mgf and .csv files obtained from processing the raw data with MZmine 2.53, a molecular network was created using the Feature-Based Molecular Networking workflow [62] on the GNPS platform (https://gnps.ucsd.edu/ProteoSAFe/static/gnpssplash.jsp) (accessed on 1 April 2022). The precursor ion mass and MS/MS fragment ion tolerances were both set at 0.02 Da. A molecular network was then created in which the edges were filtered to have a cosine score above 0.65 and more than 4 matched peaks. The edges between two nodes were kept in the network only if each of the nodes appeared in the other's respective top 10 most similar nodes. The molecular family size was set to a maximum of 100, and the lowest-scoring edges were removed from the molecular families until the molecular family size was below this threshold. The spectra in the network were searched against the GNPS spectral libraries [63]. The library spectra were filtered in the same way as the input data. All matches kept between the network spectra and the library spectra were required to have a score above 0.65 and at least 4 matched peaks. Molecular networks were visualized using Cytoscape software version 3.8.0 [64]. The molecular networking job can be publicly accessed at https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=72419d61c18f424d9544b41bc32c87e9 (accessed on 7 April 2022).
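The edge-filtering rules described above can be sketched as follows. This is not GNPS code; it is a minimal illustration, assuming that pairwise cosine scores and matched-peak counts have already been computed. The function name `filter_edges` and the example network are hypothetical, and the maximum family size of 100 is not implemented.

```python
from collections import defaultdict


def filter_edges(edges, min_cosine=0.65, min_matched=4, top_k=10):
    """Filter (node_a, node_b, cosine, matched_peaks) tuples.

    Thresholds follow the wording of the text: cosine above min_cosine,
    more than min_matched matched peaks, and mutual top_k membership.
    """
    # Step 1: apply the score and matched-peak thresholds.
    passed = [e for e in edges if e[2] > min_cosine and e[3] > min_matched]

    # Step 2: for each node, collect its top_k neighbours ranked by cosine.
    neighbours = defaultdict(list)
    for a, b, cosine, _ in passed:
        neighbours[a].append((cosine, b))
        neighbours[b].append((cosine, a))
    top = {
        node: {m for _, m in sorted(lst, key=lambda x: x[0], reverse=True)[:top_k]}
        for node, lst in neighbours.items()
    }

    # Step 3: keep an edge only if each node is in the other's top_k set.
    return [e for e in passed if e[1] in top[e[0]] and e[0] in top[e[1]]]


# Hypothetical pairwise comparisons; values are illustrative only.
edges = [
    ("A", "B", 0.90, 6),
    ("A", "C", 0.80, 6),
    ("C", "D", 0.70, 5),
    ("B", "D", 0.60, 8),  # fails the cosine threshold
    ("A", "D", 0.95, 3),  # fails the matched-peak threshold
]

print(filter_edges(edges))           # top_k=10: the three passing edges remain
print(filter_edges(edges, top_k=1))  # top_k=1: only the mutual best pair A-B remains
```

Lowering `top_k` makes the mutual-neighbour rule visible on a small example; with the value of 10 used in the text, the rule only prunes edges around highly connected nodes.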

Putative Identification of Compounds
An extensive search of the scientific literature was carried out to build an in-house database for the genus Margaritaria (Table S2), which resulted in 28 compounds already isolated from species of the genus. This table was used to evaluate the chemotaxonomy of M. nobilis and, together with the molecular network created, served as a guide for the putative identification of the compounds present in the matrix under study. MS/MS spectra without any match on the GNPS platform were annotated using Sirius 4 software and compared with spectral data from the scientific literature.

Conclusions
Using a workflow based on previous chemical reports from species of the genus Margaritaria and supported by high-performance computational tools, we were able to establish a chemical profile for the ethanolic extract of M. nobilis leaves. In our results, 44 compounds were annotated; among these, we highlight ellagic acid, galloyl-HHDP-glucose, quercetin 3-O-glucoside and galloyl-Che-HHDP-glucose, which, in the first instance, may support the understanding of expected pharmacological activities for the species. We also highlight that UHPLC-MS allowed us to analyze trace compounds that would not be detected by conventional methods. We emphasize that monitoring the availability of these compounds is also important, since the magnitude of the bioactive profile of this species can change dramatically due to seasonality.
Finally, we understand that, through this work, we contributed to the knowledge of the chemical profile of the leaves of this species, providing valuable information for the understanding and certification of pharmacological activities that will be studied in the future.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/metabo12080681/s1. Figure S1. LC-MS Base Peak Intensity (BPI) chromatogram of the EtOH extract from Margaritaria nobilis leaves (negative mode). The selected chromatographic peaks are annotated with peak numbers referred to in Table 1; Table S1. Summary of compound-dependent parameters used in the UHPLC-ESI-QToF-MS/MS experiment; Figure S2. Molecular network from UHPLC-MS/MS data in the negative ion mode for Margaritaria nobilis leaf extract; Figure S3. Proposed structures for annotated metabolites in the ethanolic extract of Margaritaria nobilis leaves; Figure S4. General fragmentation scheme and MS/MS spectra of O-glycosylated kaempferol derivatives; Figure S5. O-glycosylated quercetin derivatives MS/MS spectra; Figure S6. MS/MS spectra of hydrolysable tannins annotated in silico; Table S2. In-house database of compounds reported in the genus Margaritaria (Phyllanthaceae).
Institutional Review Board Statement: Not applicable because this study does not involve humans or animals.
Informed Consent Statement: Not applicable because this study does not involve humans or animals.

Data Availability Statement: The data presented in this study are available in the main article and the supplementary materials.