Comprehensive GCMS and LC-MS/MS Metabolite Profiling of Chlorella vulgaris

The commercial cultivation of microalgae began in the 1960s, and Chlorella was one of the first target organisms. It has long been considered a potential source of renewable energy, an alternative for phytoremediation and, more recently, a growth and immune stimulant. However, Chlorella vulgaris, one of the most studied microalgae, has never been comprehensively profiled chemically. In the present study, the metabolome of C. vulgaris grown under normal culture conditions was comprehensively profiled, employing LC-MS/MS to profile the ethanolic extract and GC-MS for fatty acid analysis. The fatty acid profile of C. vulgaris was shown to be rich in omega-6, -7, -9, and -13 fatty acids, with omega-6 being the most abundant, representing more than sixty percent (>60%) of the total fatty acids. This indicates that this species of Chlorella could serve as a good source of nutrition when incorporated in diets. The profile also showed that the fatty acids were mainly C16-C18 (>92%), suggesting that C. vulgaris might be a potential candidate for biodiesel production. LC-MS/MS analysis revealed carotenoid constituents comprising violaxanthin, neoxanthin, lutein, β-carotene, vulgaxanthin I, astaxanthin, and antheraxanthin, along with other pigments such as the chlorophylls. In addition, amino acids, vitamins, and simple sugars were profiled, and 48 phospholipids were putatively identified through mass spectrometry-based molecular networking.


Introduction
Microalgae are fast-growing autotrophic organisms. They use light energy and carbon dioxide for photosynthetic biomass production with higher efficiency than plants. There are over 300,000 species of microalgae, of which only around 30,000 (10%) have been documented [1]. They live in complex natural habitats and are able to adapt rapidly to extreme conditions such as variable salinity, temperature, nutrients and UV irradiation. Therefore, they can produce a great variety of fascinating metabolites with novel structures and biological activities that are not found in other organisms [2]. Microalgae produce useful bio-products including β-carotene, astaxanthin, fucoxanthin, docosahexaenoic acid (DHA), eicosapentaenoic acid (EPA), bioactive and functional pigments, natural dyes, polysaccharides, amino acids, vitamins, antioxidants, and many more [3]. Algal research began in the 1950s with studies on Chlorella and Spirulina species. Chlorella has been considered as an alternative in phytoremediation and a potential source of renewable [...] result of normal media formulation. Usually, normal media formulations such as Bold's Basal Medium (BBM) (chemical composition) imitate the favorable natural conditions in which microalgae grow via metamorphosis. In addition, most studies also reported the unavoidable depletion of other metabolites in the course of enhancing the routes for optimal production of their targeted metabolites. For instance, a C. vulgaris mutant strain was reported to display superior lipid productivity with a minor PUFA level [26]. The same applies to the total protein content of the wild species, which depends solely on culturing conditions [27]. Under NO3 deficit, the production of photosynthetic pigments is drastically reduced, with a large decline in antioxidant activities, particularly in C. vulgaris [28]. On the other hand, replete NO3 led to accumulation of chlorophyll pigments and amino acids, while NO3 limitation produced a very low amino acid content but was very effective at increasing the neutral lipid content [29,30].
Furthermore, most of the identification work on carotenoids and chlorophylls was conducted on dietary supplements containing C. vulgaris. However, no information is available on the comprehensive metabolite profile of C. vulgaris cultured under normal growing conditions and formulation, which is ideal for incorporation into diets. It is important to note that the metabolite composition of a microalga may contribute in one way or another to its biological effect [31]. Thus, a good knowledge of the diversity in metabolite composition is a prerequisite to understanding its correlation with the nutritional and pharmacological effects of the microalgae. Therefore, it is equally important to comprehensively profile microalgae grown under normal conditions and formulations in order to gain better insight into the C. vulgaris metabolome, which is the aim of the present study.

Identification of Fatty Acids in Chlorella vulgaris
Extracted fatty acids from C. vulgaris were derivatized to fatty acid methyl esters (FAMEs) via reaction with acetyl chloride [32] and subjected to GC-MS analysis (Figure 1A). From C. vulgaris, nine saturated fatty acids ranging from C14 to C22 were detected, accounting for 27.8% of the total fatty acids (Figure 1B), while the unsaturated fatty acids accounted for 71.2% of the total fatty acids. The compositional ratio of fatty acids was calculated from peak areas integrated in the total ion chromatogram, as shown in Figure 1.
The fatty acid profile of C. vulgaris is shown to be rich in omega-6, -7, -9, and -13, with omega-6 being the major fatty acid, representing more than sixty percent (>60%) of the total fatty acids (Figure 1D). This is a clear indication that this Chlorella species could serve as a good source of nutrition when incorporated in diets. The profile also showed that the fatty acid composition is mostly made up of C16-C18 (>92%) fatty acids (Figure 1C,D), which further supports C. vulgaris as a potential candidate for biodiesel production. These results are in agreement with the recently published work of Fernández-Linares et al. [33].
Mar. Drugs 2020, 16
Figure 1. (A) [...] (see Table 1). (B) The percentage composition of saturated and unsaturated fatty acid methyl esters. (C) Percentage distribution of individual saturated fatty acids relative to the total saturated fatty acid content. (D) Percentage distribution of individual unsaturated fatty acids relative to the total unsaturated fatty acid content.

Metabolite Profiling of Chlorella vulgaris Ethanolic Extract
The ethanolic extract of C. vulgaris was analyzed by LC-MS/MS. The total scan PDA chromatogram and the total ion chromatogram of the extract are shown in Figure 2A,B, respectively. The observed peaks, indicating the different metabolites present in the sample, were labeled with numbers. A general idea of the compound classes contained in the extract was first obtained from the UV absorptions of the compound peaks observed in the total scan PDA chromatogram (Figure 2A). Most of the UV-absorbing compounds detected by the photodiode array (PDA) detector showed maximum absorptions in the range of 200-300 nm. Carotenoids and chlorophyll pigments, however, show a characteristic absorption range of 200-400 nm and above, a fingerprint that supports their identification. The identities of the metabolites were then further elucidated based on their molecular masses and mass fragmentation patterns (Supplementary Figures S1-S5). Where the identity of a peak was masked by overlap with other peaks of similar retention time, the extracted ion chromatogram (EIC) was used for metabolite identification, as in the case of astaxanthin (Supplementary Figure S6). Although the LC-MS analysis was run in switching mode, almost all the compounds in the sample were better ionized in the positive mode. Therefore, the compounds were identified based on their full MS and MS/MS spectra obtained in positive ion mode.

Identification of Carotenoids
The carotenoids were easily recognizable based on their typical λmax values of 410, 430, 440 and 460 nm [33–39], as shown in Table 2. The mass fragmentation pathway for carotenoid peaks was very useful in assigning the mass fragments and greatly aided their identification.
Peak 6 showed characteristics consistent with vulgaxanthin I, with a parent ion at m/z 340 [M + H]+ (observed m/z 340.2592, exact m/z 340.1139, error m/z 0.1453). The peak occurred at tR 3.98, giving MS/MS fragment ions at m/z 322 (a loss of 18 amu due to the elimination of water from the protonated parent ion), m/z 209 (product of α-cleavage of the pi-bond dissociated parent ion that undergoes electron loss), and m/z 84 (α-cleavage at the amide terminal of the molecule). A possible fragmentation pathway for vulgaxanthin I is shown in Figure 3. The proposed fragment ions with m/z 339 and m/z 129, precursors to the fragment ions at m/z 209 and m/z 84, respectively, were however not detected by the MS detector. This could result from their rapid dissociation owing to the instability of the ions. Vulgaxanthin I has previously been identified in the higher plant Berberis vulgaris [40]. Although it has been reported before in brown algae Chlorococcum humicola [41], this is the first report of its occurrence in C. vulgaris.
[...] [41], corresponds to the loss of a lone C7H8 fragment. These two peaks share the same parent ion and some MS/MS product ions; the differences in parts of their fragmentation patterns suggest that they are isomers. These fragmentation results agree with previously published results [42]. Fragmentation pathways proposed for neoxanthin and violaxanthin are provided in Supplementary Figures S1 and S2, respectively.
[...] [42]. The fragmentation pathway for β-carotene is provided in Supplementary Figure S4.
Peak 26, which was not clearly seen in Figure 2B due to overlap, was identified from its EIC (Supplementary Figure S6) [...] water molecules, respectively. Based on this information, peak 26 was identified as astaxanthin [33]. The fragmentation pathway for astaxanthin is shown in Supplementary Figure S5. The carotenoids detected in this work agree with recently published data [43,44].

Identification of Chlorophyll Pigments
The chlorophylls are macromolecules with highly conjugated systems that do not always give MS/MS fragment data within the range of collision energies usually set for molecules of lower mass. Hence, their identification is best conducted with the help of their UV spectra. Chlorophyll pigments (peaks 19 and 27-31) showed characteristic λmax values in the range between 210 and 530 nm [35], as tabulated in Table 3, which also lists their respective parent ion masses. Compound identification made use of the mass fragmentation data provided in the supplementary material (Supplementary Figure S7).
Table 3 (row recovered from the source): Chlorophyll-a; tR 28.66; λmax 202, 410, 538 nm; m/z 894; [34].
Peak 19, the most intense peak in the UV PDA chromatogram, at tR 14.68 with characteristic λmax absorptions of 268, 474 and 536 nm, was identified as pheophorbide-a based on the parent ion at m/z 593 [M + H]+, which corresponds to the MS base peak at tR 14.74. Peak 27, with a parent ion at m/z 885 [M + H]+ at MS tR 21.13 and a UV peak at tR 21.08 with absorption bands at λmax 222, 436 and 528 nm, is pheophytin-b. Pheophorbide-b, with a parent ion at m/z 607 [M + H]+, was assigned to peak 28. This peak was found at MS tR 21.25, corresponding to the UV PDA peak at tR 20.90, which shows absorption bands at λmax 222, 436 and 528 nm. Pheophytin-a, with an MS parent ion at m/z 871 [M + H]+ at tR 24.30 (peak 30), also shows a UV peak at tR 24.28 with absorption bands at λmax 408 and 536 nm, in agreement with previously reported data [34]. The peak at tR 28.70 (peak 31), which gives a mass peak at m/z 894 [M + H]+ corresponding to a UV peak at tR 28.66 with absorptions at λmax 410 and 538 nm, is attributed to chlorophyll a. The detected chlorophyll pigments agree with recently published data [44–46].
Table 4 shows peaks 2, 3, 5, 7 and 9 for the amino acids present in C. vulgaris. At tR 1.44, leucine (peak 2) was observed to fragment into m/z 115, 86, 72 and 57. The characteristic MS/MS fragmentation pattern for peak 3 at tR 2.29, with fragment ions at m/z 165, 120, 103, 93, 91 and 79, was attributable to phenylalanine. Peak 5 at tR 2.72, with m/z 205 at 100% relative abundance, was identified as tryptophan based on the fragment ions at m/z 146, 144, 143, 142, 132, 118, 91, and 74. Lysophosphatidylethanolamine (Lyso-PE) was assigned to peak 7 at tR 5.44, with fragment ions at m/z 548, 452, 322, 209, 157, 114, 97, and 57. Disopyramide was the peak at tR 6.30, with fragment ions at m/z 322, 306, 212, 196, 114, and 74. Table 4 also shows the identified fatty acids, fatty acyls and lipids.

Identification of Other Compounds
The identification of simple sugars in the complex crude ethanol fraction of C. vulgaris was not very successful, and hence very few were identified (Table 6). This may be attributed to many factors, including the probable formation of many adducts that could not be fully identified because of the large error values calculated relative to the literature. (R)-cryptone (observed m/z 139.1228, exact m/z 139.1117, error m/z 0.0111) is known to occur in the higher plant Eucalyptus bosistoana [49] and has never been reported before in the literature as a metabolite of C. vulgaris. Hence, further research is ongoing to establish more facts regarding its identification in C. vulgaris. The proposed mass fragment ions resulting from the fragmentation of (R)-cryptone are shown in Figure 5. The detected simple sugars agree with previously published results [46].
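The mass errors quoted in this work are absolute differences in m/z units. As a minimal sketch, the same difference can also be expressed in parts per million (ppm), a common way to compare errors across masses. The function name `mass_error` is an illustrative assumption and not part of the original workflow; the input values are those quoted in the text for (R)-cryptone and vulgaxanthin I.

```python
def mass_error(observed_mz: float, exact_mz: float) -> tuple[float, float]:
    """Return (absolute error in m/z units, relative error in ppm)."""
    delta = observed_mz - exact_mz
    ppm = delta / exact_mz * 1e6
    return delta, ppm


# Observed and exact m/z values as quoted in the text.
quoted = {
    "(R)-cryptone": (139.1228, 139.1117),
    "vulgaxanthin I": (340.2592, 340.1139),
}

for name, (observed, exact) in quoted.items():
    delta, ppm = mass_error(observed, exact)
    print(f"{name}: error {delta:.4f} m/z ({ppm:.0f} ppm)")
```

Reporting both forms side by side makes it easier to judge whether an assignment with a small absolute error is also small relative to the ion's mass.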

Identification of Lipids via Molecular Networking
Molecular networking (MN) assists in data mining by clustering MS/MS spectra based on the cosine similarity of their fragmentation patterns [50,51]. The molecular network of the ethanol extract was generated in order to analyze the lipid content of C. vulgaris more comprehensively and to enrich the information obtained from the fatty acid analysis via GC-MS. The typical nature of the lipid content can therefore be viewed in full, using MN, via a putative annotation of the different lipids that underlie the potential of C. vulgaris as a candidate for nutrition and biodiesel production. Figure 6 shows the generated MN with its different clusters; the members of each cluster share distinct fragments and fragmentation patterns.
The putative annotation was conducted with reference to different mass spectrometric databases. Three major and distinct clusters were observed in the network, comprising diacylglycerophosphoserines, diacylglycerophosphocholines, and glycosphingolipids. Another cluster comprising several separate sub-clusters of monoacylglycerophosphoethanolamines was also detected. The metabolites were identified based on a systematic study of the fragmentation pathways and patterns observed in the resulting network. The mass spectrometry-based MN allowed the identification and putative annotation of 48 lipids in the major clusters, as shown in Table 7. The MN not only dereplicated known lipids but also pointed out related derivatives described for the first time in C. vulgaris. The annotation resulting from MN is consistent with the GC-MS fatty acid analysis in terms of the number and degree of unsaturation; most of the lipids identified from the MN contain substituent fatty acid carbon chains ranging from C16 [...]
Visual inspection of the C. vulgaris MN showed that glycosphingolipids clustered according to the similarities of their substituents, i.e., grouped according to the type and size of the attached fatty acids. The glycosphingolipids identified are mainly disubstituted lipids and mostly contain dialkene fatty acid substituents, with a few saturated, mono- and tri-alkene ones. Some phosphoethanolamines, which were totally absent from their own class cluster, were also seen in this cluster, sharing some common fragments with the glycosphingolipids.
Some of the identified lipids agree with previously published results [52,53]. The fragmentation pattern in the phosphoserine cluster (Figure 7) arises from π-bond dissociation of one of the alkene bonds of the corresponding fatty acid to yield a free radical ion, which eventually leads to the elimination of C2H5 or C3H7 via σ-bond dissociation. These fragments correspond to a loss of 29 and 43 amu, respectively. The fragment m/z 837 was the result of a radical-site rearrangement that led to the elimination of two subsequent H2O molecules via α-bond cleavage, each corresponding to a loss of 18 amu. In addition, fragment m/z 593 mostly resulted from the complete elimination of one of the fatty acid substituents through γ-charge-site rearrangement. Fragments m/z 178 and m/z 113 were due to σ-bond dissociation, eliminating either two alkene bonds involving 13 C-atoms or a portion without alkene bonds involving only 8 C-atoms, respectively, as seen in Figure 8.
The fragmentation pattern in diacylglycerophosphocholines (Figure 9) is slightly different from the other classes because most of the fragments commonly shared among the lipids of this group contain the phospho-terminal of the lipid molecule. For instance, fragment m/z 189, which was a result of charge-remote rearrangement, contained both the phospho-terminal and two hydroxyl groups (OH). Other fragments, m/z 137, 125, 99, and 81, contain only the phospho-terminal in different ionic forms, which are also products of charge-remote rearrangements and, in a few instances, of inductive cleavages, as shown in Figure 10.
The largest cluster of the MN, the glycosphingolipids (Figure 11), is mainly characterized by the presence of one or two tetrahydropyran rings within the fragments and the presence of mono-alkene substituted fatty acids, as shown in Figure 12.
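The neutral losses quoted in the fragmentation discussion (18 amu for H2O, 29 amu for C2H5, 43 amu for C3H7) are nominal masses. As a small illustrative check, they can be recomputed from the elemental formulas. The helper `nominal_mass` and the element table are assumptions introduced for this sketch; it handles only simple formulas without brackets or charges.

```python
import re

# Nominal (integer) masses of the most abundant isotopes.
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16, "P": 31}


def nominal_mass(formula: str) -> int:
    """Sum nominal masses for a simple formula such as 'C3H7'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL[element] * (int(count) if count else 1)
    return total


# Neutral losses mentioned in the text.
for formula in ("H2O", "C2H5", "C3H7", "C7H8"):
    print(formula, nominal_mass(formula))
```

A table of this kind can help when scanning MS/MS spectra for differences between a precursor and its product ions.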

Fresh Water Microalgae Culture
Chlorella vulgaris was obtained from the Aquatic Laboratory, Faculty of Veterinary Medicine, Universiti Putra Malaysia, and prepared by raising 200 L of culture from 40 mL of starter culture using Bold's Basal Medium (BBM). The whole process was conducted in duplicate (100 L each). The culture was scaled up by adding ten liters of culture medium at three-day intervals, after growth had been checked using a hemacytometer under a microscope at 400× magnification. The culture parameters were adopted from the original BBM protocol with a slight modification in temperature, which could not be maintained at 28 °C (it varied in the range of 25-30 °C). The microalgae biomass was cultivated in fresh water containing the prepared BBM, made up from autoclaved solutions of different acids and trace elements at various concentrations. The growth medium was adjusted to and monitored at pH 7.5. The wet crude product was harvested by centrifugation using a high-speed Sorvall Evolution RC centrifuge (Thermo Electron Corporation, Asheville, NC, USA) at 12,000 rpm and 25 °C, kept at −80 °C for five days, and lyophilized using a freeze drier [48]. Solvent extraction was carried out with ethanol for LC-MS/MS analysis.

Solvents and Chemicals
Analytical grade methanol, ethanol, ethyl acetate, chloroform and hexane were purchased from Merck Millipore (Darmstadt, Germany).

Microalgae Solvent Extraction Procedure
Lyophilized microalgae biomass (100 mg) was dissolved in 50 mL ethanol and vortexed for 5 min. The solution was then extracted by sonication for 30 min in an ultrasonic water bath (SK8210HP, Shanghai KUDOS Ultrasonic Instrument Co. Ltd., Shanghai, China) at room temperature. The solvent extract was then filtered through Whatman No. 1 filter paper, and the procedure was repeated with another 50 mL of ethanol for a second and a third round of extraction. The filtered extracts were pooled, evaporated to dryness using a rotary evaporator (Heidolph Instruments GmbH and Co. KG, Schwabach, Germany) at 30 °C, and stored at −20 °C until further analysis. The ethanol fraction was then used for LC-MS analysis.

Extraction of Fatty Acids for GC-MS Analysis
Fatty acids in the crude sample of C. vulgaris (~3 mg) were extracted using chloroform and methanol (1:2 v/v) in an ultrasonic device for 10 min. The residue was separated by centrifugation at 3000 rpm for 10 min, and the liquid phase was transferred into a glass tube. These extraction steps were repeated three times to maximize the extract. After the extraction, several mL of Milli-Q water was added and the glass tube was vortexed. Then, to separate impurities (dissolved in the water) from the organic solvent phase containing the fatty acids, the glass tube was centrifuged at 3000 rpm for 10 min. The water was removed, and the organic solvent phase was dried. Derivatization into FAMEs was performed with acetyl chloride and methanol (5:100 v/v) [32]. A 10 mL portion of the acetyl chloride and methanol mixture was added to the dried glass tube, and derivatization was achieved by heating at 100 °C for 60 min on a heat block. After cooling to room temperature, 2 mL of hexane was added to the glass tube, the tube was shaken well, and an aliquot of the upper phase was transferred to a new glass vial. This extraction step was repeated two more times. The hexane was finally removed on a 40 °C hot plate under a nitrogen stream. The sample was then redissolved in an accurately measured volume of hexane (300 µL). Finally, 1 µL of the sample was injected into the GC-MS for fatty acid analysis. Using this method, non-esterified fatty acids were obtained [49]. The identification of FAMEs was assisted by the National Institute of Standards and Technology Library (NIST 17 version 2.3) and was confirmed manually. The mono-unsaturated fatty acids were detected manually by the presence of intense peaks at m/z 55 and m/z 69 along with the molecular weight peak. The di-unsaturated fatty acids were detected by the presence of intense peaks at m/z 55 and m/z 67, while intense peaks at m/z 74 and m/z 87 were features of saturated fatty acids.
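The manual confirmation rule described above (m/z 74 and 87 for saturated, m/z 55 and 69 for mono-unsaturated, m/z 55 and 67 for di-unsaturated FAMEs) can be sketched as a simple lookup. This is an illustrative simplification, not the authors' procedure: the function `classify_fame`, the scoring by the weaker ion of each pair, and the example intensities are all assumptions, and the molecular ion check mentioned in the text is not included.

```python
# Diagnostic ion pairs taken from the text.
DIAGNOSTIC_IONS = {
    "saturated": (74, 87),
    "mono-unsaturated": (55, 69),
    "di-unsaturated": (55, 67),
}


def classify_fame(spectrum: dict[int, float]) -> str:
    """Classify a FAME spectrum (nominal m/z -> intensity).

    Each class is scored by the weaker of its two diagnostic ions,
    so both ions must be intense for a class to win (assumed rule).
    """
    scores = {
        label: min(spectrum.get(a, 0.0), spectrum.get(b, 0.0))
        for label, (a, b) in DIAGNOSTIC_IONS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unassigned"


# Hypothetical relative intensities, for illustration only.
example = {55: 60.0, 67: 100.0, 69: 30.0, 74: 10.0, 87: 5.0}
print(classify_fame(example))
```

Such a rule is only a first screen; library matching and manual inspection, as used in the study, remain necessary for a confident assignment.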

Sample Preparation for UHPLC-MS/MS Analysis
The ethanol extract (2 mg) was dissolved in LC-MS grade methanol (1 mL). The dissolved extract was vortexed for 10 min, centrifuged for 10 min, and filtered through a nylon filter (0.22 µm) into a glass vial for LC-MS/MS analysis [50].

GC-MS Analysis
The extracted fatty acids were analyzed using gas chromatography (GC, HP 6890) coupled to mass spectrometry (MS, HP 5973) (Hewlett Packard, Palo Alto, CA, USA). The GC column was a ZB-WAX column, 30 m (length) × 0.25 mm (I.D.) × 0.25 µm film thickness. Helium was used as the carrier gas at a flow rate of 1.0 mL/min. The injector was set at 220 °C in split mode. The temperature program of the GC oven started at an initial temperature of 70 °C, followed by a linear increase to 170 °C at 11 °C/min, a slower linear increase to 175 °C at 0.8 °C/min, an increase to 220 °C at 20 °C/min, and a final 2.5 min hold. The total run time was 60 min. The MS quadrupole and MS ion source were set at 150 °C and 230 °C, respectively. MS data were acquired in scan mode from 20 to 450 amu.
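As a worked check of the oven program, the time taken by each listed ramp is the temperature step divided by the rate. The sketch below sums the segments exactly as quoted in the text; the function name `program_minutes` is an assumption for illustration.

```python
def program_minutes(initial_c: float,
                    ramps: list[tuple[float, float]],
                    final_hold_min: float) -> float:
    """Total duration of a ramped oven program.

    ramps: list of (target temperature in deg C, rate in deg C/min).
    """
    total = final_hold_min
    temperature = initial_c
    for target_c, rate_c_per_min in ramps:
        total += (target_c - temperature) / rate_c_per_min
        temperature = target_c
    return total


# Segments as listed in the text: 70 -> 170 -> 175 -> 220 deg C, 2.5 min hold.
ramps = [(170.0, 11.0), (175.0, 0.8), (220.0, 20.0)]
total = program_minutes(70.0, ramps, final_hold_min=2.5)
print(f"Listed segments: {total:.1f} min")
```

With the quoted values the listed segments sum to roughly 20 min, so the remainder of the stated 60 min run would have to come from steps not detailed in the text.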
Analysis of the MS data for the fatty acids was conducted using Enhanced MSD ChemStation E.02.02.143 (Agilent Technologies, Inc., Santa Clara, CA, USA) fitted with the NIST Library database (NIST 17 Version 2.3). The compositional ratio of the fatty acids was calculated by comparing the peak area of each fatty acid with the total fatty acid peak area in the total ion chromatogram.
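The compositional ratio described above is an area-percent calculation: each peak area divided by the summed area of all fatty acid peaks. A minimal sketch follows; the function name `percent_composition`, the peak labels and the area values are hypothetical placeholders and are not data from this study.

```python
def percent_composition(areas: dict[str, float]) -> dict[str, float]:
    """Convert integrated peak areas into percentages of the total area."""
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in areas.items()}


# Hypothetical integrated TIC peak areas (arbitrary units).
example_areas = {"C16:0": 200.0, "C18:1": 150.0, "C18:2": 650.0}

for name, pct in percent_composition(example_areas).items():
    print(f"{name}: {pct:.1f}%")
```

Area percent from a total ion chromatogram assumes comparable detector response across the FAMEs, which is a simplification.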

UHPLC-MS/MS Analysis
The crude methanol extract was separated using a C18 reversed-phase Hypersil GOLD aQ column (100 × 2.1 mm, 1.9 µm) (Thermo, Waltham, MA, USA) at 30 °C on a Dionex Ultimate 3000 UHPLC with a diode-array DAD-3000 detector (Thermo Fisher Scientific, Waltham, MA, USA). Gradient elution was performed with LC-MS grade Solvent A (0.1% formic acid and 10 mM ammonium formate in 500 mL of methanol (70%) and acetonitrile (30%)) and Solvent B (0.1% formic acid and 10 mM ammonium formate in 500 mL of water) using the following gradient: 20% A in the first 5 min, then 20-80% A over the next 25 min, at a flow rate of 0.2 mL/min. The concentration of the sample extract was 1 mg/mL, the injection volume was set to 10 µL, and the UV detector was set at 210, 310, 410 and 510 nm. The MS analysis was performed on a Q-Exactive Focus Orbitrap LC-MS/MS system. The eluent was monitored by ESI-MS in positive and negative switching mode and scanned from m/z 100 to 1500. ESI was conducted using a spray voltage of 4.2 kV. High-purity nitrogen was used as the dry gas at a sheath gas flow rate of 40 (arbitrary units) and an aux gas flow rate of 10 (arbitrary units). The capillary temperature was set at 350 °C, while the aux gas heater temperature was set at 10 °C. The MS data analysis was conducted using Thermo Xcalibur 2.2 SP1.48 (Thermo Fisher Inc., Waltham, MA, USA) and literature data. Furthermore, the identification of the newly reported compounds was supported by their relatively low mass errors, which indicate a high probability of correct identification.
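The gradient can be written as a piecewise-linear function of time. The sketch below assumes one reading of the description: 20% Solvent A is held for the first 5 min and then raised linearly to 80% A over the following 25 min. Holding 80% A after 30 min is also an assumption, as is the function name `percent_a`.

```python
def percent_a(t_min: float) -> float:
    """Percentage of Solvent A at time t (minutes) under the assumed gradient."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 5.0:
        return 20.0                      # assumed initial hold
    if t_min <= 30.0:
        # linear ramp from 20% to 80% A between 5 and 30 min
        return 20.0 + (80.0 - 20.0) * (t_min - 5.0) / 25.0
    return 80.0                          # assumed hold after the ramp


for t in (0.0, 5.0, 17.5, 30.0):
    print(f"t = {t:4.1f} min -> {percent_a(t):.0f}% A")
```

Solvent B makes up the remainder at each time point, i.e. 100 minus the value returned.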
The mass spectrometry molecular networks were created using the Global Natural Products Social Molecular Networking (GNPS) platform (http://gnps.ucsd.edu) [51]. The MS data were first converted into mzXML format using MSConvert [51]. The spectral information generated was uploaded to GNPS using FileZilla and used to generate an MS/MS molecular network with the GNPS Data Analysis Workflow. The precursor ion mass tolerance and the fragment ion mass tolerance were both set to 0.02 Da. Fragment ions below 10 counts were removed from the MS/MS spectra. The MN was generated using a minimum of 6 matched peaks and a cosine score of 0.7. The resulting data were downloaded and visualized using Cytoscape 3.7.1 software (Institute for Systems Biology, Seattle, WA, USA) [50].
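To illustrate how the networking parameters interact, the sketch below scores two MS/MS spectra with a plain cosine similarity and applies the thresholds quoted above (0.02 Da fragment tolerance, 10-count intensity filter, at least 6 matched peaks, cosine score of 0.7). This is a simplified stand-in and not the GNPS implementation, which uses its own scoring and preprocessing; the function names and example peaks are assumptions.

```python
import math

Peak = tuple[float, float]  # (m/z, intensity)


def filter_peaks(peaks: list[Peak], min_intensity: float = 10.0) -> list[Peak]:
    """Drop fragment ions below the intensity threshold."""
    return [(mz, inten) for mz, inten in peaks if inten >= min_intensity]


def cosine_score(peaks_a: list[Peak], peaks_b: list[Peak],
                 tol: float = 0.02) -> tuple[float, int]:
    """Greedy one-to-one peak matching; returns (cosine score, matched peaks)."""
    a, b = filter_peaks(peaks_a), filter_peaks(peaks_b)
    # All candidate pairs within tolerance, strongest products first.
    candidates = sorted(
        ((ia * ib, x, y)
         for x, (mza, ia) in enumerate(a)
         for y, (mzb, ib) in enumerate(b)
         if abs(mza - mzb) <= tol),
        reverse=True,
    )
    used_a, used_b = set(), set()
    dot, matched = 0.0, 0
    for product, x, y in candidates:
        if x in used_a or y in used_b:
            continue
        used_a.add(x)
        used_b.add(y)
        dot += product
        matched += 1
    norm = (math.sqrt(sum(i * i for _, i in a))
            * math.sqrt(sum(i * i for _, i in b)))
    return (dot / norm if norm else 0.0), matched


def is_edge(peaks_a: list[Peak], peaks_b: list[Peak],
            min_matched: int = 6, min_cosine: float = 0.7,
            tol: float = 0.02) -> bool:
    """Decide whether two spectra would be connected under the thresholds."""
    score, matched = cosine_score(peaks_a, peaks_b, tol)
    return matched >= min_matched and score >= min_cosine


# Hypothetical spectra for illustration only.
spec_1 = [(81.0, 50.0), (99.0, 80.0), (125.0, 120.0),
          (137.0, 60.0), (189.0, 200.0), (593.0, 150.0)]
spec_2 = [(mz + 0.01, inten) for mz, inten in spec_1]
print(cosine_score(spec_1, spec_2), is_edge(spec_1, spec_2))
```

Requiring both a minimum number of matched peaks and a minimum score keeps spectra with only one or two intense shared fragments from being linked.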

Conclusions
MS-based metabolite profiling of a crude sample of C. vulgaris microalgae grown under normal conditions proved efficient in the determination of certain metabolites, including carotenoids, amino acids, vitamins and other pigments such as the chlorophylls. The combination of methanol (70%) and acetonitrile (30%) as the organic solvent proved efficient in determining these compounds. In total, 31 metabolites were successfully determined using LC-MS, two of which, vulgaxanthin I and (R)-cryptone, had never been reported in C. vulgaris before. In addition, 48 lipids were putatively identified via MN. The fatty acid content, on the other hand, was best determined by GC-MS fatty acid analysis owing to the volatility of the fatty acids and their ease of derivatization into FAMEs; a total of 20 fatty acids were identified in the derivatized sample, with a higher percentage of unsaturated fatty acids and omega-6 being the most dominant. Novel bioinformatics approaches such as MN and in-silico fragmentation tools have recently emerged and provide a new perspective for early metabolite identification in natural products research. Thus, in this research, the datasets were exploited efficiently through automated data treatment and access to dedicated fragmentation databases during MN. As a result, a larger profile of C. vulgaris was successfully established, and many lipids that had not been reported before were putatively identified.
The presence of carotenoids, chlorophylls and amino acids suggests that C. vulgaris is a good source of nutrition, which is further supported by the presence of omega-6, -7, -9, and -13 fatty acids. The fatty acid profile also suggests that C. vulgaris is a good candidate for biodiesel production. More studies need to be carried out in order to determine the correct proportion of nutrient composition in this promising microalga.