Genome-Scale Metabolic Reconstruction, Non-Targeted LC-QTOF-MS Based Metabolomics Data, and Evaluation of Anticancer Activity of Cannabis sativa Leaf Extracts

Over the past decades, Colombia has suffered complex social problems related to illicit crops, including forced displacement, violence, and environmental damage, among other consequences for vulnerable populations. Considerable effort has been made in the regulation of illicit crops, predominantly Cannabis sativa, leading to advances such as the legalization of medical cannabis and its derivatives, the improvement of crops, and leaving an open window to the development of scientific knowledge to explore alternative uses. It is estimated that C. sativa can produce approximately 750 specialized secondary metabolites. Some of the most relevant due to their anticancer properties, besides cannabinoids, are monoterpenes, sesquiterpenoids, triterpenoids, essential oils, flavonoids, and phenolic compounds. However, despite the increase in scientific research on the subject, it is necessary to study the primary and secondary metabolism of the plant and to identify key pathways that explore its great metabolic potential. For this purpose, a genome-scale metabolic reconstruction of C. sativa is described and contextualized using LC-QTOF-MS metabolic data obtained from the leaf extract from plants grown in the region of Pesca-Boyaca, Colombia under greenhouse conditions at the Clever Leaves facility. A compartmentalized model with 2101 reactions and 1314 metabolites highlights pathways associated with fatty acid biosynthesis, steroids, and amino acids, along with the metabolism of purine, pyrimidine, glucose, starch, and sucrose. Key metabolites were identified through metabolomic data, such as neurine, cannabisativine, cannflavin A, palmitoleic acid, cannabinoids, geranylhydroquinone, and steroids. They were analyzed and integrated into the reconstruction, and their potential applications are discussed. 
Cytotoxicity assays revealed high anticancer activity against gastric adenocarcinoma (AGS), melanoma cells (A375), and lung carcinoma cells (A549), combined with negligible impact against healthy human skin cells.


Introduction
Over the past decades, Colombia has suffered from complex social problems related to illicit crops, including forced displacement, violence, and environmental damage, among other consequences for vulnerable populations [1]. Considerable effort has been made in Colombia to address this issue by creating a regulatory framework for import, export, cultivation, extraction, and research activities, especially of Cannabis sativa [2,3]. When the contingency caused by the coronavirus began, the former Minister of Health authorized Resolution 315 of 2020, which updates the lists of precursor drugs subject to state control and gives free access to the sale of master formulations (preparations made for medical indications) in order to eliminate some access barriers for research, medical, and scientific use [4]. In addition, two years later, Resolution 227 of 2022 was approved, regulating the use of medicinal C. sativa (non-psychoactive components) in food, beverages, and dietary supplements. Furthermore, since the beginning of this year, the national government, through Resolution 2808 of 2022, decided to include magistral preparations of C. sativa medicines within the health benefits plan for patients with pathologies such as refractory epilepsy, fibromyalgia, sleep and appetite disorder, cachexia due to cancer, insomnia, chronic pain, neuropathic pain, and pain associated with cancer, in order to address those public health concerns [5]. These laws laid the groundwork for the cultivation of C. sativa plants, the emergence of the medical cannabis industry, and safe access to medical and scientific use, among other developments. Hence, the current regulatory framework promotes scientific knowledge of C. sativa and allows for the exploration of potential markets for its alternative uses [6].
The field of research related to C. sativa has been expanding at an accelerated rate [7] thanks to the biotechnological capacity hidden in the plant. It is estimated that C. sativa can produce approximately 750 specialized secondary metabolites [8-10]. Some of the most relevant are monoterpenes, sesquiterpenoids, triterpenoids, essential oils, flavonoids, phenolic compounds (known as polyphenols [7]), lignans, stilbenoid derivatives, alkaloids, amino acids, spiro-indans, steroids, and glycoproteins, mainly due to their anticancer properties [8,11-14]. Previous studies have shown a synergy among the metabolic compounds of the plant that, as a whole, behave differently from each metabolite acting individually, a phenomenon known as the "entourage effect" [15,16]. It is established that C. sativa chemotypes rich in cannabinoids and terpenoids offer better pharmacological activities that can broaden clinical applications and improve therapeutic outcomes [17-19]. In the same way, remarkable anticancer activity has been demonstrated for C. sativa extracts against different carcinoma cell lines such as melanoma [20], ovarian [21], prostate [22], breast [23], and pancreatic cancer [16]. These studies have revealed a reduction in tumor growth and promotion of apoptosis and autophagy in carcinoma cells [15,23-25]. At the taxonomic level, chemotypes are grouped in terms of the relative amounts of their main compounds, the cannabinoids. Drug-type plants (chemotype I) contain high concentrations of the most prevalent cannabinoid known for its psychotropic capacity, (-)-trans-Δ9-tetrahydrocannabinol, or Δ9-THC. When the cannabinoid content corresponds mostly to the second most abundant cannabinoid in the C. sativa plant, cannabidiol (CBD), the plant is assigned to chemotype III [26]. Finally, chemotype II, which is very scarce, is defined by a balanced content of the two main cannabinoids [27].
For all these reasons, it is critical to understand plant metabolism on a system-wide level to identify metabolic pathways involved in the production of key metabolites, characterize specific phenotypes influenced by environmental factors, and explore alternative uses of the leaf, such as nutraceuticals.
Two processes were carried out with the data: functional annotation and automatic reconstruction of the metabolic network (Figure 1).

Functional Annotation and Automated Reconstruction
Starting from the updated version of the C. sativa reference genome annotation [44], functional annotation and automated draft reconstruction were conducted based on the Plant SEED workflow for GEM [45,46]. The Plant SEED database (licensed under a Creative Commons Attribution 4.0 International License) describes the core metabolism of the plants and includes several refinement reconstruction steps such as embodiment of reaction stoichiometry and directionality, compartmentalization, transport reactions, charged molecules, and proton balancing on reactions, among others [28,46].
Next, the conversion of reconstructed data into a computable format was performed using the COBRA toolbox (GNU General Public License) and loading the reconstruction into MATLAB (Licence number 40902167) [47,48]; the topological metrics were obtained to evaluate the stoichiometric matrix, and an objective function was set based on the biomass composition of the plant cell [49].
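As an illustration of the kind of topological metric that can be read from a stoichiometric matrix, the following minimal sketch counts non-zero entries and derives the matrix density. It is not the COBRA toolbox routine used in this work; the function name `sparsity_metrics` and the three-reaction toy network are hypothetical and serve only to show the calculation.

```python
def sparsity_metrics(S):
    """Return basic topological metrics of a stoichiometric matrix.

    S is a non-empty list of rows: one row per metabolite, one column
    per reaction, each entry a stoichiometric coefficient.
    """
    n_metabolites = len(S)
    n_reactions = len(S[0])
    total_entries = n_metabolites * n_reactions
    # Count the non-zero stoichiometric coefficients.
    nonzero = sum(1 for row in S for coeff in row if coeff != 0)
    return {
        "metabolites": n_metabolites,
        "reactions": n_reactions,
        "total_entries": total_entries,
        "nonzero": nonzero,
        # Share of matrix entries that are non-zero, in percent.
        "density_percent": 100.0 * nonzero / total_entries,
        # Average number of metabolites taking part in each reaction.
        "mean_metabolites_per_reaction": nonzero / n_reactions,
    }


# Hypothetical toy network: R1: A -> B, R2: B -> C, R3: A -> C
S_toy = [
    [-1, 0, -1],  # A
    [1, -1, 0],   # B
    [0, 1, 1],    # C
]
metrics = sparsity_metrics(S_toy)
print(metrics)
```

The density is the quantity that lets models of different sizes be compared, because it reflects how many metabolites take part in each reaction on average.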

Refinement of Reconstruction
After the first draft model was obtained, extensive curation was required until the model represented the phenotypic states of the organism [50]. Gap-find was used to identify network pathologies, which include root no-consumption, root no-production, downstream no-production, and upstream no-consumption metabolites, as well as blocked reactions [28,51].
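The two root gap types named above can be found directly from the signs of the stoichiometric coefficients. The sketch below is a simplified illustration, not the Gap-find implementation used here: the function name `find_root_gaps`, the `reversible` flags, and the toy network are assumptions made for the example. It does not cover downstream or upstream gaps or blocked reactions, which require flux-based analysis.

```python
def find_root_gaps(S, reversible=None):
    """Find root no-production and root no-consumption metabolites.

    S is a list of rows (metabolites) by columns (reactions). A negative
    coefficient means the metabolite is consumed, a positive one that it
    is produced. A reversible reaction counts as both for its participants.
    Returns two lists of metabolite row indices.
    """
    n_reactions = len(S[0])
    if reversible is None:
        reversible = [False] * n_reactions
    root_no_production = []   # consumed somewhere, never produced
    root_no_consumption = []  # produced somewhere, never consumed
    for i, row in enumerate(S):
        produced = any(c > 0 or (c != 0 and reversible[j])
                       for j, c in enumerate(row))
        consumed = any(c < 0 or (c != 0 and reversible[j])
                       for j, c in enumerate(row))
        if consumed and not produced:
            root_no_production.append(i)
        elif produced and not consumed:
            root_no_consumption.append(i)
    return root_no_production, root_no_consumption


# Hypothetical linear pathway without exchange reactions:
# R1: A -> B, R2: B -> C
S_toy = [
    [-1, 0],  # A (index 0)
    [1, -1],  # B (index 1)
    [0, 1],   # C (index 2)
]
gaps_irreversible = find_root_gaps(S_toy)
gaps_r1_reversible = find_root_gaps(S_toy, reversible=[True, False])
print(gaps_irreversible, gaps_r1_reversible)
```

In this toy case A has no producing reaction and C has no consuming reaction, so both would be candidates for an added transport, exchange, or biosynthetic reaction during curation.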

Identify Candidate Reactions to Fill Gaps
An exhaustive review of the literature was carried out to identify reactions related to the secondary metabolism of the plant that could fill the gaps and facilitate integrating diverse metabolic pathways taking place in the different cellular compartments [6,11,14,16,52-55]. Furthermore, KEGG tools were used to complement the metabolic information of the reconstruction through a second functional annotation carried out based on BlastKOALA (KEGG Orthology and Links Annotation) [56]. This was aided by the updated annotation release of the C. sativa reference genome [44].

Add Gap Reactions to Reconstruction
Regarding the manual curation of models, one of the most complex problems researchers face is the diversity of terminology in reference databases. The present model relies on the ModelSEED repository [57], which draws on several databases (KEGG, MetaCyc, AraGEM, BiGG, Maize_C4GEM, PlantCyc, and TS_Athaliana, among others) and assigns a unique identifier to their entries [12].
An iterative workflow was carried out to add the previously identified reactions to the reconstruction. First, reactions were translated into the ModelSEED nomenclature using the reference ModelSEED database. Next, the renamed reactions were integrated into the reconstruction, taking into account the compartment in which each one occurs. Finally, the network was evaluated again to look for additional gaps introduced by the new reactions (Figure 1).
Once the reconstruction was obtained, successive flux balance analyses (FBAs) were carried out. FBA is a mathematical approach that calculates the flow of metabolites through a metabolic reconstruction, making it possible to predict the growth rate of an organism [58]. This is done by taking advantage of the constraints that the stoichiometric coefficients of each reaction impose on the metabolic fluxes. The FBAs of C. sativa use the biomass composition of the plant cell [49] as the objective function of the model (Supplementary Table S1) and then evaluate the flux distribution within the system.
The sample material was obtained from plants grown in the region of Pesca-Boyaca, Colombia, under greenhouse conditions at the Clever Leaves facility, in a legal operation and under controlled growing conditions, following the guidelines for good agricultural and collection practices (GACP) for starting materials of herbal origin.
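For reference, the flux balance analysis described earlier in this section is conventionally written as a linear program. The formulation below is the standard textbook form, not a statement of the exact settings used in this study; the symbols S, v, c, and the flux bounds are notation introduced for this sketch.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_j^{\min} \le v_j \le v_j^{\max}, \qquad j = 1, \dots, n
\end{aligned}
```

Here S is the m × n stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, and c is a vector that selects the objective, which in this case would be the biomass reaction. The equality S v = 0 expresses the steady-state assumption that no internal metabolite accumulates, and the bounds encode reaction directionality and uptake limits.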
The drying process of the plant material was carried out in rooms with controlled conditions for this purpose. The extraction process was carried out from fresh leaf tissue that was ground to a particle size of 1.4 mm, at a 5:1 ratio of ethanol to dry leaves by weight. Constant agitation was performed in a Heidolph shaker at 2000 rpm for 4 h. The supernatant was transferred to a new vial.
Subsequently, the extract obtained was used for LC-PDA and LC-QTOF-MS analysis under the conditions described below.

LC-PDA
The chromatographic analysis was carried out using a methodology validated by Clever Leaves, a company dedicated to pharmaceutical grade cannabis-based products.
The liquid chromatography method with PDA (photodiode array) detection was employed, using the following conditions. Mobile phase A was a solution of 0.1% trifluoroacetic acid in water, while mobile phase B was acetonitrile. A total injection volume of 2 µL was used for the analysis. UV detection was set at a wavelength of 220 nm. Chromatographic separation was carried out on a CORTECS® UPLC® Shield RP18 column (Milford, USA) with dimensions of 2.1 × 100 mm and a particle size of 1.6 µm. The autosampler and column temperatures were maintained at 8 °C and 35 °C, respectively. The total run time for the analysis was 11 min. HPLC-grade acetonitrile was used as the solvent for dilutions, while a mixture of acetonitrile and water (70:30) was employed as solvent. The purge solvent consisted of a water-acetonitrile mixture (90:10). The flow rate was set at 0.7 mL/min, and the mobile phase composition was kept isocratic at 41% mobile phase A and 59% mobile phase B. The system suitability test required the resolution between peaks to be greater than 1.5 for proper analysis.

Analysis by RP-LC-QTOF-MS
For metabolic analysis, 5 mg of the crude extract of C. sativa, which contains a high cannabidiol (CBD) content (>85% of the total phytocannabinoids extracted) [18], was dissolved in methanol to a final concentration of 250 mg/L for subsequent analysis via reverse-phase liquid chromatography coupled with mass spectrometry (RP-LC-QTOF-MS).
Samples were analyzed in a liquid chromatography system (Agilent Technologies 1260) coupled with a quadrupole time-of-flight (Q-TOF) mass analyzer (Agilent Technologies 6545B) with an electrospray ionization source (ESI). Separation was conducted in a C18 column (InfinityLab Poroshell 120 EC-C18, 100 × 3.0 mm, 2.7 µm) at 30 °C with a gradient elution consisting of 0.1% (v/v) formic acid in Milli-Q water (Phase A) and 0.1% (v/v) formic acid in acetonitrile (Phase B) at a constant flow rate of 0.4 mL/min. Mass spectrometric detection was performed initially in positive mode, followed by a subsequent analysis in negative mode using the same set of acquired data at full scan from 70 to 1100 m/z. The QTOF instrument was operated in 4 GHz (high resolution) mode. The data acquisition parameters were configured as follows: ion source temperature of 325 °C, gas flow of 8 L/min, nebulizer gas pressure at 50 psi, and capillary voltage of 2800 V. MS/MS acquisition was performed in data-dependent acquisition (DDA) mode in the range of m/z 50 to 1100 with a scan rate of 3 spectra/s and under chromatographic and spectrometric conditions identical to those employed in the initial analysis. For each sample, analysis was performed at different collision energies: 20 eV, 40 eV, and an equation-based mode (CE = 3.6 × (m/z)/100 + 4.8) [59,60], using 3 precursors per cycle. During the analysis, several reference masses were used for mass correction: m/z 121.0509 (C 5

Data Processing
Data processing was performed with the Agilent MassHunter Profinder 10.0 software program for deconvolution, alignment, and integration, using the recursive feature extraction (RFE) algorithm. This algorithm performs a deconvolution of the chromatogram and integration of the molecular features present in the samples according to mass and retention time. The data obtained from the deconvolution and integration were filtered by area by calculating the total area for the sample and then the area of each molecular feature. The annotation of the more abundant molecular features obtained was carried out using the CEU MASS MEDIATOR tool (https://ceumass.eps.uspceu.es/ (accessed on 1 October 2021)) [47], including the METLIN, KEGG, HMDB, and LIPID MAPS platforms as parameters, and with a tolerance of 10 ppm. Then, MS/MS analyses were performed in order to confirm the identity of the metabolites using MS-DIAL 4.8 (http://prime.psc.riken.jp/compms/msdial/main.html (accessed on 1 October 2021)), in silico mass spectral fragmentation through CFM-ID 4.0 (https://cfmid.wishartlab.com/ (accessed on October 2021)), and manual MS/MS spectral interpretation using the Agilent MassHunter Qualitative Analysis program (version 10.0, USA).
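To make the 10 ppm annotation tolerance concrete, the following minimal sketch computes the mass error in parts per million between an observed and a theoretical m/z and checks it against the tolerance. The function names and the m/z values are illustrative assumptions and are not taken from the study or from the CEU Mass Mediator interface.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True when the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Illustrative values only: one candidate ion and two observed features.
theoretical = 315.2319
close_feature = 315.2335   # roughly 5 ppm away from the candidate
distant_feature = 315.2360 # roughly 13 ppm away from the candidate

print(round(ppm_error(close_feature, theoretical), 1),
      within_tolerance(close_feature, theoretical))
print(round(ppm_error(distant_feature, theoretical), 1),
      within_tolerance(distant_feature, theoretical))
```

Because the tolerance is relative, the absolute window in m/z units widens with mass: at m/z 300 a 10 ppm window is about ±0.003, while at m/z 1000 it is about ±0.01.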
Cell viability was determined via an MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) metabolic activity assay following the manufacturer's instructions. For this, cells (7000-10,000 cells/well depending on the cell line) were seeded on 96-well microplates with supplemented culture medium (10% FBS) and then incubated at 37 °C in a humidified atmosphere (humidity above 90%) with 5% CO2 for 24 h. Next, the culture medium was removed and replaced by a non-supplemented medium containing the C. sativa leaf extract at concentrations ranging from 0.05 to 0.0004 mg/mL (serial dilutions were performed). Cells were incubated at 37 °C in a humidified atmosphere with 5% CO2 for 24 and 72 h. After the incubation time, 10 µL of the MTT reagent (5 mg/mL) was added to each well, and the microplates were incubated for 2 h under the same conditions. Finally, supernatants were removed and replaced by 100 µL of DMSO to dissolve the formazan crystals. Absorbance was recorded at 595 nm in a microplate reader (Multiskan™ FC Microplate Photometer, ThermoFisher Scientific, Waltham, MA, USA).
Cell viability was calculated using the following equation:
Cell viability (%) = (Abs (sample) / Abs (C−)) × 100
where Abs (C−) corresponds to the absorbance of the negative control (non-supplemented medium) at 595 nm and Abs (sample) corresponds to the absorbance of the sample at 595 nm. In addition, Cytotoxicity (%) was calculated as 100 − Cell viability (%).
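The viability and cytotoxicity calculation can be sketched in a few lines. This example assumes the usual MTT convention in which viability is the ratio of the sample absorbance to the negative-control absorbance, expressed as a percentage; the function names and absorbance readings are hypothetical and do not come from the study.

```python
def cell_viability_percent(abs_sample, abs_negative_control):
    """Viability (%) as sample absorbance relative to the negative control.

    Both absorbances are assumed to be read at the same wavelength
    (595 nm in the protocol described above).
    """
    if abs_negative_control <= 0:
        raise ValueError("negative-control absorbance must be positive")
    return abs_sample / abs_negative_control * 100.0


def cytotoxicity_percent(abs_sample, abs_negative_control):
    """Cytotoxicity (%) defined as 100 minus cell viability (%)."""
    return 100.0 - cell_viability_percent(abs_sample, abs_negative_control)


# Hypothetical readings for one well and its negative control.
viability = cell_viability_percent(0.25, 1.0)
cytotoxicity = cytotoxicity_percent(0.25, 1.0)
print(viability, cytotoxicity)
```

In practice the absorbances would normally be averaged over replicate wells before the ratio is taken.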

Genome-Scale C. sativa Metabolic Reconstruction
The PlantSEED semi-automatic reconstruction strategy was performed and curated with an exhaustive review of the literature and BLASTKOALA, to obtain the first C. sativa GEM reported in the literature (Figure 1). Results were analyzed considering the challenges involved in modeling eukaryotic cells (large size, compartmentalization of metabolic processes, and variation in tissue-specific metabolic activity [61]) and also by considering topological characteristics of the network that can be analyzed from the stoichiometric matrix [62]. Features of the initially reconstructed network and topological analysis of the stoichiometric matrix through the sparsity pattern are shown in Tables 1 and 2 and Figure 2.
Metabolic pathways with the highest number of reactions and compounds were associated with the biosynthesis of fatty acids, steroids, arginine, and tyrosine, along with the metabolism of purine, pyrimidine, glucose, starch, and sucrose (Figure 3).

Functional Annotation
The initial genome annotation reported by Grassa et al. [44] contains 31,170 genes, of which 25,296 are protein-coding genes (81%). A PlantSEED functional annotation was performed and complemented via BLASTKOALA to describe the metabolic capacity of C. sativa leaves (Figure 4). A total of 10,636 C. sativa genes were related to KO numbers. Most orthologous groups were observed in metabolic pathways related to primary plant metabolism (amino acid, carbohydrate, energy, cofactors and vitamins, and lipid metabolism). C. sativa leaf metabolism reveals the complexity behind the biochemical reactions that occur in plant eukaryotic cells. A closer look at each of the modules (Figure 4) shows that energy acquisition, storage, and the utilization of stored energy are central processes in the overall control of plant metabolism [35]. Additionally, 6% of KO numbers were related to the biosynthesis of secondary cannabinoid and non-cannabinoid metabolites. These results were used to strengthen the representation of secondary metabolism in the metabolic reconstruction.
Some of the metabolic modules manually added are biosynthesis of flavanone, flavonoids, tryptophan, catecholamine, phenylalanine, proline, arginine, valine, leucine, cholesterol, cannabinoids, and fatty acids, among others. While terpenoids and cannabinoids share the metabolite geranyl pyrophosphate as a common precursor, coumarins and toxins originate from tryptophan and phenylalanine biosynthesis. Metabolic modules of phenylpropanoid biosynthesis, essential and non-essential amino acids such as tryptophan and tyrosine, and the biosynthesis of monoterpenes, terpenes, and sesquiterpenes could be responsible for the observed synergistic effects that enhance the bioactivities of cannabinoids (entourage effect).
Figure 5. Pathway of secondary metabolism in the biosynthesis of cannabinoids and terpenes in C. sativa [63]. Enzymes related to functional annotation are illustrated in green.

Non-Targeted LC-QTOF-MS Based Metabolomics Data
Characterization of the compounds present in the C. sativa leaf sample was performed using a non-targeted metabolomics approach. For the present study, this approach has the advantage of analyzing the sample as a whole, without focusing on a particular set of metabolites, allowing for a more descriptive metabolomic characterization of the sample. Table 3 summarizes the identification of 41 molecules in negative ionization mode and 38 molecules in positive ionization mode. The metabolites obtained were grouped into four main clusters [64] of plant secondary metabolites (Figure 7).
The molecules with the highest intensity in the abundance peaks were mostly cannabinoids (delta-9-THC, cannabidiolic acid, cannabichromene) and terpenoids (geranylhydroquinone). However, high-intensity peaks were also found for coumarins (clausarinol), phenylflavonoids (cannflavin A), and steroids (pregna-4,9(11)-diene-3,20-dione, neriantogenin) (Table 3). Prenol lipids and glycerophospholipids were identified as the subgroups contributing to the greatest diversity of metabolites in the sample. The metabolic profile of the sample is illustrated in Supplementary Material Figure S3. The main precursors in their biosynthesis were identified and integrated into the reconstruction (Table 4) and will be key to studying and understanding the metabolic transition from primary to secondary metabolism and the relationship between chemical synergy and the valuable characteristics of C. sativa. Taking advantage of the reconstruction, it is possible to study the biosynthesis of various value-added compounds. From here, various approaches such as bio-organic synthesis can be used to obtain these valuable compounds in a more economical way.

Cytotoxicity and Anticancer Activity of C. sativa Leaf Extracts
The cytotoxicity of the C. sativa extract was clearly affected by different factors such as concentration, exposure time, and cell line. Results showed high anticancer activity against gastric adenocarcinoma (AGS) and melanoma cells (A375) (Figure 8). Cytotoxicity levels ranging from 50 to 90% for concentrations between 0.0125 and 0.05 mg/mL were observed in both cell lines. In contrast, for Vero and lung carcinoma cells (A549), these cytotoxicity levels were observed in concentrations between 0.025 and 0.05 mg/mL. This indicates less activity against A549 and significant toxicity against Vero cells. Surprisingly, results obtained for healthy skin fibroblasts (HFF) showed negligible toxicity (below 10%) in concentrations between 0.0004 and 0.025 mg/mL.

GEM Reconstruction, Functional Annotation and Secondary Metabolism of C. sativa
The whole-genome assembly of C. sativa (CBDRx:18:580) obtained by Grassa et al. [44] serves as the main input for the GEM reconstruction. CBDRx:18:580 was obtained from the leaf of a female plant grown indoors at 20-25 °C [44]. The plant belongs to chemotype III, which is associated with a high content of cannabidiol (CBD) [26]. While this plant chemotype is widely recognized for its applications in the textile and paper industry, there exist other significant avenues that present potential opportunities to diversify and enhance its value chain [6]. Some of these potential uses are in the food industry (thanks to its nutraceutical value), medicine (thanks to the unique properties of cannabinodiol), and cosmetics (thanks to the possible effects of cannabinoids in synergy with terpenes) [17].
As for the reconstruction, analysis of the corresponding stoichiometric matrix enables the identification of topological features of the network. The sparsity pattern is illustrated in Figure 2. The stoichiometric matrix consists of 1314 rows (metabolites) and 2101 columns (reactions). Out of a total of 2,760,714 entries, 8361 (approximately 0.30%) are non-zero (nz). Generally, fewer than 1% of the elements in a genome-scale stoichiometric matrix are non-zero. This value is particularly useful for comparing models based on the number of metabolites involved in each reaction. The double upper diagonal appearance observed in the stoichiometric matrix is primarily a result of the ordering of reactions, rather than an intrinsic feature [62,65].
The GEM reconstruction of C. sativa incorporates various compartments, including the cytosol, stroma, Golgi, vacuole, cell wall, peroxisome, mitochondria, nucleus, and endoplasmic reticulum (Supplementary Figure S1). Notably, around 50% of the model reactions belong to compartments that play crucial roles in primary metabolism, such as the cytosol, mitochondria, and plastids. Numerous studies have demonstrated the relationship between primary and secondary metabolism in plants and how, despite the great variety of secondary metabolites, only a few basic pathways of primary metabolism supply their precursors [66]. Glycolysis is the precursor of fatty acid biosynthesis, the mevalonate pathway, and the DXP-MEP pathway, which give rise to a variety of important terpenes and phenolic compounds such as cannabinoids, flavonoids, and fatty acids. On the other hand, the Krebs cycle is a primary precursor in the biosynthesis of glutamate and aspartate, while the shikimate pathway is a precursor in the biosynthesis of phenylalanine, tyrosine, and tryptophan. This part of the relationship between primary and secondary metabolism is specific to N-containing compounds.
Thus, it is possible to affirm that glycolysis, the Krebs cycle, and the shikimate pathway are the most important precursors in the reconstruction of the secondary metabolism of C. sativa when researching its potential as a cosmeceutical, cosmetic, or additive in the food industry (thanks to its properties derived from terpenes), as a therapeutic and nutraceutical (thanks to its properties derived from phenolic compounds such as flavonoids, cannabinoids, alkaloids, and N-containing compounds), as a potential phytonutrient (thanks to its properties derived from fatty acids), and in many other applications previously mentioned. A significant number of transport reactions were found in the reconstruction, corresponding to the high flux of metabolites passing from one compartment to another. These reactions are linked to alkaloids, furanocoumarins, terpenes, and carotenoids formed in the chloroplast; similarly, sesquiterpenes, sterols, and hydroxylation steps together with fatty acid synthesis take place in a constant exchange between the cytosol and the endoplasmic reticulum. Most hydrophilic compounds originate in the cytosol, whereas the vacuole is the site associated with alkaloids, non-protein amino acids, glucosinolates, flavonoids, and carotenoids [66].

Non-Targeted LC-QTOF-MS Based Metabolomics Data Analysis
In the present study, C. sativa chemotype III (with a phytocannabinoid content consisting of 12.78% CBD, 3.21% CBDA, and 0.54% THC as determined by LC-PDA) was chosen for the integration of metabolomics data into the reconstruction. The polar extract used in LC-QTOF-MS facilitates the identification of polar and low-volatility compounds, mainly cannabinoids, some terpenes, and flavonoids [43]. In addition, the LC-QTOF-MS data show a complementarity between the positive and negative ionization modes. The positive ionization mode possibly reveals a higher quantity of CBD when compared to the quantity of THC. However, some THC isomers may play a role in peak discrimination. The metabolic profile obtained tentatively agrees with the initial chemotype III of the plant and also corresponds to the chemotype reported by Grassa et al. [44,67] in obtaining the reference genome of C. sativa.
The metabolites obtained from the analysis were found to exhibit a diverse range of chemical structures and functionalities. For the organization and classification of these metabolites, they were grouped into four distinct clusters. These clusters represent the main categories of plant secondary metabolites, highlighting the chemical diversity present in the sample [64] (Figure 7). This clustering approach provides valuable insights into the composition and distribution of secondary metabolites in the studied plant system [64].
Alkaloids can be defined as nitrogen-containing compounds derived from secondary, or specialized, metabolism. Their nitrogen atom is derived from an amino acid and is part of a complex ring structure [69]. Although there is an immense diversity of alkaloids, they all share a biosynthetic origin, derived from the formation and reactivity of the iminium cation. Their transition from primary to secondary metabolism is considered the most important as it opens the door to a new chemical space [54]. The first of the four stages found in alkaloid biosynthesis consists of the accumulation of an amino precursor from amino acid metabolism; these amino precursors can be divided into two categories: polyamines derived from lysine, arginine, and ornithine, or aromatic amines derived from tryptophan and tyrosine. In the first case, polyamines are produced through the Krebs cycle pathway, which generates aspartate as a precursor of lysine and pyrimidines, as well as glutamate, which functions as a precursor of ornithine, arginine, and non-protein amino acids [66]. In the second case, the aromatic amines come from the shikimate pathway, which produces chorismate as a precursor, on one hand via arogenate to produce tyrosine and phenylalanine and on the other hand via anthranilate to produce tryptophan [66] (Figure 6). Tyrosine is the precursor to multiple alkaloid families, including the benzylisoquinolines, the Amaryllidaceae alkaloids, and the betalains.
LC-QTOF-MS data revealed that neurine and cannabisativine are two alkaloids present in the leaves of the plant, in addition to their previously reported presence in the roots of samples collected in Mexico [12,70]. These cannabis alkaloids have demonstrated antiparasitic, antipyretic, antiemetic, antitumor, diuretic, and analgesic properties [11,71]. Neurine can be biosynthesized from choline [72], which has been classified as an essential nutrient for humans and is also a precursor of the osmoprotectant glycine betaine, an enhancer of the plant's osmotic resistance to drought and salinity [73]. Choline biosynthesis is thus a potential nutraceutical pathway. It proceeds through three methylation reactions: the cytosolic enzyme phosphoethanolamine N-methyltransferase (EC 2.1.1.103) catalyzes the first and also mediates the next two to produce phosphocholine [73]. Another N-containing compound reported in the LC-QTOF-MS data was glyceryl lactopalmitate, which is used in the food industry as an emulsifier [74] and belongs to the pyrazole-type alkaloids from ornithine. A further compound identified was pipercitine, which has proven insecticidal activity [75] and can be obtained from lysine.
In plants, between 20 and 30% of fixed carbon is invested in the synthesis of phenylalanine and then converted into lignin, a major structural compound of the cell wall. Phenylalanine-derived compounds also contribute to ultraviolet protection, signaling, and reproduction, thanks to anthocyanins and volatile phenylpropanoids/benzenoids [76]. The latter are the second largest group of volatiles in plants and are divided into three classes according to their carbon backbone: benzenoids (C6-C1), phenylpropanoids (C6-C3), and phenylpropanoid-related compounds (C6-C2) [77]. Their biosynthesis is based on the shikimic acid pathway (shikimate dehydrogenase, EC 1.1.1.25), which consists of seven reactions catalyzed by six enzymes and transforms phosphoenolpyruvate (PEP) and erythrose 4-phosphate (E4P) into chorismate (Figure 6).
Data extracted from LC-QTOF-MS described some of the unique metabolites of the species, such as cannflavin A (Table 3). Cannflavins originate from the condensation of three malonyl-CoA molecules to form naringenin chalcone. Ring closure yields naringenin, and the action of flavone synthase then produces apigenin, which is a precursor of luteolin [16]. Among the reported benefits of flavonoids, and particularly of cannflavins, are their antioxidant and anti-inflammatory activity and their cardioprotective, neuroprotective, hepatoprotective, and immunomodulatory effects [80]. Flavonoids also contribute to flavor, color, and aroma, and show anti-diabetic and neuroprotective activities through the modulation of a number of cellular signaling cascades [11].
However, there are gaps in our knowledge of the biosynthesis of flavonoids and therefore the means by which some esters, lignins, flavonoids, and coumarins are formed is unknown [7].

Fatty Acid Derivatives
Fatty acids are often esterified in the form of phospholipids, glycerolipids, or sterol esters. Their structure consists of a long hydrocarbon chain with a terminal carboxyl group (-COOH) [11]. This functional group is key to their function as energy reservoirs. In this regard, they provide structure and, in the absence of glucose, energy for cells, and they participate in low-temperature tolerance. Finally, they are involved in the production of cholesterol as a precursor for the biosynthesis of hormones such as estrogen, testosterone, vitamin D hormone, steroids, and prostaglandins [55]. These functions also explain their high nutritional value and pharmaceutical potential.
Fatty acids are synthesized in plastids and assembled into glycerolipids or triacylglycerols in the endoplasmic reticulum [81]. Fatty acid synthesis is a complex process involving three main phases: de novo synthesis of fatty acids from acetyl-CoA in the plastidial compartment; desaturation in the chloroplast and elongation by elongases; and modification reactions such as hydroxylation and epoxidation, which take place in the endoplasmic reticulum [82]. Figures 5 and 6 describe fatty acid biosynthesis in general terms.
About 22% of the metabolites detected in this non-targeted LC-QTOF-MS metabolomic analysis of a C. sativa sample are involved in reactions related to fatty acid biosynthesis. As products of de novo fatty acid synthesis, palmitoleic acid and other linolenic acids (13-hydroxyoctadecatrienoic acid, octadecatetraenoic acid, and trihydroxyoctadecadienoic acid) were identified. It has been reported that increasing the dietary intake of these fatty acids reduces the risk of coronary heart disease [83] through inhibition of coagulation, improvement of glucose homeostasis, and attenuation of inflammation. On the other hand, fatty acids metabolized via modification reactions increase the production of vitamin E, prostacyclin, prostaglandins, leukotrienes, and hydroxy and hydroperoxy fatty acids, which have been reported to be involved in the modulation of cell growth, angiogenesis, inflammation, thrombosis, and immune response, in the inhibition of carcinogenesis and tumor growth, and in the stimulation of cancer cell apoptosis, among others [84,85].

Terpenes
Terpenes are hydrocarbon compounds made up of five-carbon (C5) units called isoprenes. They are classified according to the number of these units. Their biosynthesis is mediated by the cytosolic mevalonate (MVA) pathway, which provides farnesyl diphosphate (FPP) for sesquiterpenoids (C15) and squalene as a precursor for triterpenoids (C30) and sterols. Alternatively, they can come from the plastidial DOXP/MEP pathway, which provides GPP to form monoterpenoids [8] (Figure 6). Almost 30% of the data obtained via LC-QTOF-MS are related to various terpenes and terpenoids. These compounds have shown multiple therapeutic benefits, including suppression of the immune system response against COVID-19 and inhibition of many species of bacteria and fungi [11]. Additionally, they have been reported to exhibit antimicrobial, repellent, antiallergy, anticancer, antifungal, antibacterial, antioxidant, anti-inflammatory, antidepressant, sedative, anticonvulsant, analgesic, gastroprotective, and antispasmodic properties [11].
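The size-based classification described above can be illustrated with a short Python sketch. It is only a lookup over the number of C5 isoprene units and is not part of the reported analysis; the function name `classify_terpene` and the handling of irregular carbon counts are assumptions made for illustration.

```python
# Illustrative lookup: terpene class by number of isoprene (C5) units.
# The names below follow standard terpene nomenclature; the function itself
# is a hypothetical helper, not something used in the study.
TERPENE_CLASSES = {
    1: "hemiterpene",    # C5
    2: "monoterpene",    # C10
    3: "sesquiterpene",  # C15
    4: "diterpene",      # C20
    5: "sesterterpene",  # C25
    6: "triterpene",     # C30
    7: "sesquarterpene", # C35
    8: "tetraterpene",   # C40
}


def classify_terpene(n_carbons: int) -> str:
    """Return the terpene class for a backbone with n_carbons carbon atoms."""
    # Regular terpene backbones are built from whole C5 units.
    if n_carbons <= 0 or n_carbons % 5 != 0:
        return "not a regular terpene backbone"
    units = n_carbons // 5
    return TERPENE_CLASSES.get(units, f"polyterpene ({units} isoprene units)")


# Classes named in the text: GPP-derived C10, FPP-derived C15, squalene-derived C30.
for carbons in (10, 15, 30):
    print(carbons, classify_terpene(carbons))
```

Modified terpenoids that have gained or lost carbons fall outside this simple rule, so the sketch reports them as irregular rather than guessing a class.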
The main precursors of the metabolites identified from the LC-QTOF-MS metabolomics data were described and used for integration into the metabolic reconstruction (Table 4). The integrated data are mainly primary precursors for the metabolic modules of interest: the anthocyanin biosynthetic pathway, which is an extension of the flavonoid pathway; fatty acid biosynthesis, degradation, and elongation; phenylalanine, tyrosine, and tryptophan biosynthesis; and terpenoid backbone biosynthesis. After data integration, the reconstruction increased by 297 active reactions and 118 metabolites.
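The idea that adding precursor metabolites activates further reactions can be sketched with a simple reachability check. This is a toy illustration under stated assumptions, not the integration procedure used for the reconstruction: the reaction identifiers (`R1` to `R4`) are hypothetical, the steps are lumped simplifications of the pathways named in the text, and a reaction is treated as active as soon as all of its substrates are available.

```python
def activate_reactions(reactions, available):
    """Iteratively mark reactions as active when all substrates are available.

    reactions: dict mapping reaction id -> (set of substrates, set of products)
    available: iterable of metabolite ids assumed to be present (seed set)
    Returns (set of active reaction ids, set of reachable metabolite ids).
    """
    reachable = set(available)
    active = set()
    changed = True
    while changed:
        changed = False
        for rid, (substrates, products) in reactions.items():
            # A reaction fires once every substrate is reachable.
            if rid not in active and substrates <= reachable:
                active.add(rid)
                reachable |= products
                changed = True
    return active, reachable


# Toy network with hypothetical ids; each entry lumps several real steps.
toy = {
    "R1": ({"PEP", "E4P"}, {"chorismate"}),      # lumped shikimate pathway
    "R2": ({"chorismate"}, {"phenylalanine"}),   # lumped, via arogenate
    "R3": ({"phenylalanine"}, {"cinnamate"}),    # entry to phenylpropanoids
    "R4": ({"GPP"}, {"monoterpenoid"}),          # terpenoid branch
}

before, _ = activate_reactions(toy, {"PEP"})
after, metabolites = activate_reactions(toy, {"PEP", "E4P"})
print(len(after) - len(before), "reactions activated by adding E4P")
```

In this toy case, supplying E4P alongside PEP activates three reactions, while the terpenoid branch stays inactive until GPP is available. Constraint-based tools would additionally account for stoichiometry and flux bounds, which this set-based sketch ignores.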

Cytotoxicity and Anticancer Activity of C. sativa Leaf Extracts
The results confirmed the remarkable anticancer activity of the C. sativa extracts against different cancer cell lines (AGS, A375, and A549). This agrees well with previous works that studied the anticancer activity of C. sativa on different cell lines such as melanoma [20], ovarian cancer [21], prostate cancer [22], and breast and pancreatic cancer [16], among others. The results are also consistent with biological activities reported as a function of both the chemotype and the extract obtained from the leaves of the plant. Manosroi et al. [86] demonstrated that the ethanolic extract of the leaves and seeds of C. sativa chemotype III exhibited cytotoxic activity against B16F10 melanoma cells in a concentration-dependent manner (cytotoxicity of 46% at 1 mg/mL and total inhibition at 10 mg/mL). Additionally, both leaf and seed extracts demonstrated negligible toxicity against human skin fibroblasts (viability above 80% for concentrations below 0.5 mg/mL), confirming high biocompatibility.
The notable activity against melanoma cells, combined with the negligible impact on healthy human skin cells, confirms the great pharmacological potential of the extracts and makes them suitable candidates for the development of new-generation topical treatments with reduced side effects, especially for melanoma, the most aggressive type of skin cancer. These findings are supported by several works presenting promising results, both in vitro [87] and in vivo [20].
On the other hand, the potential selective toxicity of C. sativa leaf extracts has been widely studied in order to develop novel therapies with reduced negative side effects. Janatová and colleagues [15] evaluated selectivity by comparing the toxicity of six different genotypes of medical cannabis against three cancer cell lines (HT-29, Caco-2, and Hep-G2) and two healthy cell lines (FHs 74 Int, healthy intestinal cells, and MRC-5, healthy lung fibroblasts). They demonstrated that the compound content of the different genotypes strongly affects selectivity, highlighting specific compounds such as myrcene, β-elemene, β-selinene, and α-bisabolol oxide as enhancers of selectivity and β-ocimene and β-caryophyllene oxide as cytotoxicity-associated molecules. Selectivity is therefore determined by the plant genotype (chemical profile and content) and by the specific cell line.
Consequently, these findings can explain the differences in selectivity among the evaluated cell lines, especially the significant increase in cytotoxicity observed in Vero cells. Furthermore, the toxicity profiles obtained against Vero cells agree strongly with previous reports. For example, Lamdabsri and coworkers [88] showed that the toxicity of cannabis extracts against Vero cells is highly influenced by compound content, reporting high toxicity for the crude and CBN extracts (IC50 of 13.4 and 10.6 µg/mL, respectively) and lower toxicity for the CBG, CBD, and THC extracts (IC50 of 699.7, 39.77, and 67.2 µg/mL, respectively).
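To make the comparison of toxicity profiles concrete, the Vero-cell IC50 values quoted above from [88] can be ranked in a few lines of Python. The sketch also includes a selectivity index, defined here as the ratio of the IC50 in non-cancerous cells to the IC50 in cancer cells. That ratio is a commonly used convention and an assumption of this sketch; the text does not state which selectivity metric was applied, and the function name `selectivity_index` is hypothetical.

```python
def selectivity_index(ic50_normal: float, ic50_cancer: float) -> float:
    """Ratio of IC50 in non-cancerous cells to IC50 in cancer cells.

    Values above 1 indicate higher toxicity toward the cancer cell line.
    This definition is an assumption of the sketch, not taken from the text.
    """
    if ic50_normal <= 0 or ic50_cancer <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_normal / ic50_cancer


# Vero-cell IC50 values in ug/mL, as quoted in the text from reference [88].
vero_ic50 = {"crude": 13.4, "CBN": 10.6, "CBG": 699.7, "CBD": 39.77, "THC": 67.2}

# Lower IC50 means higher toxicity, so sort ascending to rank by toxicity.
ranked = sorted(vero_ic50, key=vero_ic50.get)
print("Most to least toxic toward Vero cells:", ranked)
```

Given a measured IC50 for the same extract in a cancer cell line, `selectivity_index(vero_ic50["CBD"], ic50_cancer)` would return the corresponding ratio; no cancer-cell IC50 values are supplied here because none are reported in this passage.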

Conclusions
The GEM reconstruction of C. sativa contributes to a better understanding of cellular phenotypes and metabolic behavior [41,89] by identifying different biosynthetic pathways through the integration of omics data and experimental anticancer results. Using the current model, it is possible to explore biosynthetic pathways for many valuable compounds, especially those that are of major interest to the scientific community and represent a significant opportunity to improve the value chain for C. sativa. The high number of reactions observed in the cytosol, plastid, and mitochondrion compartments confirms the significance of primary metabolic pathways such as glycolysis, the Krebs cycle, and the shikimate pathway. These pathways play a crucial role as principal sources of precursors for secondary metabolites, including cannabinoids, flavonoids, fatty acids, and nitrogen-containing compounds. Transport reactions have a crucial role in facilitating the exchange of metabolites between different cellular compartments. This is especially important in compartments such as the chloroplast, cytosol, endoplasmic reticulum, and vacuole, which are related to the synthesis of various metabolites, including alkaloids, terpenes, sterols, and hydrophilic compounds.
On the other hand, the LC-QTOF-MS metabolomics analysis provided insights into the diverse chemical composition and distribution of secondary metabolites in C. sativa. The LC-QTOF-MS data revealed a high abundance of secondary metabolite modules such as cannabinoids, terpenoids, coumarins, phenylpropanoids, and steroids. Specific metabolites identified included delta-9-THC, cannabidiolic acid, cannabichromene, geranylhydroquinone, cannflavin A, pregna-4,9(11)-diene-3,20-dione, and neriantogenin. These metabolites exhibit a range of biological activities and potential therapeutic benefits. Additionally, these metabolites contributed to the integration of the reconstruction, demonstrating that the use of omics contributes to the activation of a greater number of reactions that are required for the synthesis of metabolites in the reconstruction.
Finally, regarding the cytotoxicity and anticancer activity of C. sativa, it can be concluded that although the extracts demonstrated low selectivity in Vero cells, their remarkable selectivity against melanoma cells compared with healthy skin fibroblasts leaves an open window for continuing studies on C. sativa leaf extract as a potential candidate for the development of new-generation treatments for skin cancer with reduced side effects.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/metabo13070788/s1, Figure S1: Number of reactions per compartment in metabolic reconstruction; Figure S2: Glycolysis in GEM reconstruction of C. sativa; Figure S3: Chromatogram of the main ions extracted from C. sativa. (A) ESI (+) detection mode. (B) ESI (−) detection mode; Figure S4: Fluxer nodes and edge representation of metabolic reconstruction of C. sativa model; Figure S5: Fluxer nodes and edge representation of metabolic reconstruction of AraGEM model; Table S1: Biomass compounds in the objective function; Spreadsheet S1: CannGEM.xls.