Analysis of Phenolic and Cyclic Compounds in Plants Using Derivatization Techniques in Combination with GC-MS-Based Metabolite Profiling

Metabolite profiling has been established as a modern technology platform for the description of complex chemical matrices and compound identification in biological samples. Gas chromatography coupled with mass spectrometry (GC-MS) in particular is a fast and accurate method widely applied in diagnostics, functional genomics and for screening purposes. Following solvent extraction and derivatization, hundreds of metabolites from different chemical groups can be characterized in one analytical run. Besides sugars, acids, and polyols, diverse phenolic and other cyclic metabolites can be efficiently detected by metabolite profiling. This review draws on the authors' own results from plant research to exemplify the applicability of GC-MS profiling for the concurrent detection and identification of phenolics and other cyclic structures.


Introduction
Chromatographic techniques for the detection and identification of metabolites in plant material have undergone major changes in recent years due to improvements in analysis time, detection limit and separation characteristics. Depending on the biological question, one might distinguish between targeted and non-targeted strategies. Gas chromatography (GC) in particular is characterized by sensitive and reliable separation and detection of complex sample mixtures. Coupling with mass spectrometry (MS) provides highly robust analysis platforms compared to liquid chromatography (LC-MS) and allows for the identification of compounds based on the use of commercially or publicly available MS libraries and resources (Table 1) in combination with retention time index (RI) data.

Table 1. Selection of commercially and publicly available MS libraries and resources for structure elucidation and compound identification of GC-MS data. Also included is a list of freely available software tools for identification, deconvolution and alignment purposes.

GC-MS Profiling of Complex Chemical Matrices
Besides parameters related to proper MS compound identification and the use of retention index (RI) data, several aspects related to metabolite extraction and derivatization need to be considered in GC-MS-based metabolite profiling. Depending on the biological question, metabolite targets, and sample matrix, solvents of differing polarity and varying phases have been investigated for compound extraction. For global metabolomics approaches, extraction mixtures covering a wide range from polar to apolar metabolites, such as H2O:MeOH:CHCl3 (1:2.5:1) or H2O:MeOH, are favoured [2,5]. In other cases, one might need to separate lipids from the polar phase as described by Lisec et al. [1], either for separate chromatographic study of the different fractions [6], or for a more targeted metabolite analysis.

Extraction Methods and Metabolite Coverage
While sample preparation and processing need to be individually adapted and customized, sample extraction often follows standardized procedures using established protocols for comprehensive metabolite profiling and non-targeted metabolomics approaches, as already outlined at the beginning. An important parameter in high-throughput metabolite profiling is the high degree of miniaturization and automation in sample handling, requiring the processing of small or ultrasmall sample sizes as low as a few milligrams. Unless secondary metabolites are highly abundant, their recovery under such extraction conditions and their GC-MS detection prove insufficient. Moreover, microextraction techniques such as sorbent-based solid-phase microextraction (SPME) [7] and stir bar sorptive extraction (SBSE) [8], and solvent-based methods such as single-drop microextraction (SDME) and liquid-liquid microextraction (LLME) [9], provide sensitive methods for the characterization of complex profiles of volatile compounds found in plant and food samples. Microextraction might also be successfully combined with derivatization techniques and subsequent gas chromatographic separation for the detection of a wide range of volatiles with different polarity [10][11][12]. In general, metabolomic approaches based on SPME extraction and detection of non-derivatized volatiles emphasize the technique's tremendous capacity to cover a broad range of different compound groups [13], also including aromatic structures, as shown for tomato [14] and peach [15], for example.

Derivatization of Metabolites
Derivatization prior to GC-MS is an essential preparatory step, which reduces polarity and increases the volatility and, simultaneously, the thermal stability of metabolites. Compound derivatization is based on either silylation, alkylation or acylation reactions, and a wide range of reagents with different properties is available. Comprehensive studies have shown the superior properties of silylation agents [16], which substitute protons bound to heteroatoms in functional groups (-OH, -COOH, -NH2, -NH, -SH, -OP(=O)(OH)2, etc.) and generate trimethylsilyl (TMS) and tert-butyldimethylsilyl (TBS) derivatives [17]. While both N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) and N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA) are widely applied in biological analyses, the use of MSTFA as derivatization agent has been favoured by leading metabolomics labs worldwide. In specific cases, the use of other derivatization agents might be advised. Newer studies suggest alkylation with, e.g., methyl chloroformate (MCF) instead of, or in combination with, compound silylation due to improved analytical performance [18]. MCF derivatization shows improved reproducibility and compound stability compared to silylation, and is suitable for the analysis of microbial-derived samples with matrices mainly composed of amino and non-amino organic acids, amines and nucleotides [18]. However, due to the wide range of compound structures covered by combined methoximation and TMS derivatization [19], comprising amino acids, fatty acids, lipids, amines, alcohols, sugars, amino-sugars, sugar alcohols, sugar acids, organic phosphates, hydroxy acids, aromatics, purines, and sterols, major metabolomics efforts towards the development of MS libraries have favoured the coverage of oximated and silylated metabolites.
Despite the feasibility and power of combined oximation/silylation in global metabolite profiling approaches, several factors impair sample analysis and data quality. A major point is that silylation reactions have to be carried out under anhydrous reaction conditions [18], which requires an additional drying step for sample extracts (e.g., SpeedVac™ (Thermo Scientific, Waltham, MA, USA) or lyophilization). Excess derivatization reagents are commonly introduced into the GC injection port, potentially leading to additional peaks in the chromatogram. Moreover, non-volatile, non-derivatized metabolites or even macromolecules such as peptides, proteins or polysaccharides might be injected, depending on preceding sample clean-up conditions (precipitation, centrifugation and/or filtration), and thus impede separation performance and GC data analysis. Nevertheless, the use of packed inlet liners and/or suitable guard columns circumvents these problems and can protect analytical capillary columns from sample impurities.
Another important factor affecting the quality of metabolite data is the occurrence of artefacts of silylated compounds in GC-MS profiling [17]. Unexpected by-products might add to the complexity of peaks in a chromatogram and interfere with the identification process. This also includes conversion reactions of unstable intermediates, e.g., arginine to ornithine when using BSTFA or MSTFA, potentially leading to misinterpretation of metabolic data [4]. More importantly, multiple peaks of one and the same metabolite, i.e., with different degrees of TMS silylation of the original molecule, might be detected. This is particularly true for metabolites with several functional groups such as amino acids (-COOH, -NH2, -OH) and monosaccharides, which carry a high number of hydroxy groups. The amino acid serine, for example, has four active hydrogens which might be exchanged for TMS groups (Figure 1).

Figure 1. Trimethylsilylation levels of the amino acid serine commonly found in TMS-derivatized samples. In serine (2TMS), the OH and COOH groups are trimethylsilylated; in serine (3TMS), one hydrogen of the amino group is additionally exchanged; in serine (4TMS), all active hydrogens are exchanged.
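The mass difference between these silylation levels can be worked out directly. The sketch below is illustrative only and not part of the original protocols: it assumes that each TMS group replaces one active hydrogen by Si(CH3)3 (a net gain of C3H8Si) and uses standard monoisotopic element masses. The helper names `mono_mass` and `tms_mass` are hypothetical.

```python
# Monoisotopic masses (u) of the elements needed for this example.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Si": 27.9769265}


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


# Net change per TMS group: gain Si(CH3)3, lose one H, i.e. +C3H8Si.
TMS_SHIFT = mono_mass({"C": 3, "H": 8, "Si": 1})


def tms_mass(formula, n_tms):
    """Mass of the derivative carrying n_tms trimethylsilyl groups."""
    return mono_mass(formula) + n_tms * TMS_SHIFT


SERINE = {"C": 3, "H": 7, "N": 1, "O": 3}  # C3H7NO3

for n in (2, 3, 4):
    print(f"serine ({n}TMS): {tms_mass(SERINE, n):.3f} u")
```

Each additional TMS group shifts the molecular mass by about 72 u, which is why the 2TMS, 3TMS and 4TMS forms of one metabolite appear as separate analytes in a profile.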
In GC-MS one might deal with both the 2TMS stage, where only the carboxy and hydroxy groups are silylated, and the 3TMS and 4TMS stages with one or both hydrogens of the amino group exchanged. For this reason, MS data of metabolites showing varying numbers of TMS groups, as well as oximated/TMS metabolite derivatives, have been included in MS libraries (e.g., GMD database, NIST, etc.). Such considerations play an important role if metabolite levels are to be determined quantitatively. Provided that the multiple derivatives of a compound show the same detector response, their individual responses might be summed. Regarding secondary metabolites, it is reasonable to assume that artefact formation also occurs for these compounds, depending on the number of active hydrogens and the given derivatization conditions. However, MS information on such by-products is scarcely, if at all, included in available TMS-MS libraries.
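Summing the responses of several derivatives of one metabolite can be sketched as follows. This is a minimal illustration under two assumptions that are not taken from the source: analyte names follow the "metabolite (nTMS)" pattern used above, and all derivatives have an equal detector response. The peak areas are invented values and the function name `sum_derivative_areas` is hypothetical.

```python
from collections import defaultdict


def sum_derivative_areas(peaks):
    """Collapse analytes such as 'serine (2TMS)' onto their parent metabolite.

    peaks: iterable of (analyte_name, peak_area) pairs.
    Returns a dict mapping the metabolite name to the summed peak area.
    """
    totals = defaultdict(float)
    for analyte, area in peaks:
        # Everything before the first " (" is taken as the metabolite name.
        metabolite = analyte.split(" (")[0].strip().lower()
        totals[metabolite] += area
    return dict(totals)


# Invented peak areas, for illustration only.
peaks = [
    ("serine (2TMS)", 1200.0),
    ("serine (3TMS)", 8300.0),
    ("glycine (2TMS)", 400.0),
    ("glycine (3TMS)", 2600.0),
]
totals = sum_derivative_areas(peaks)
print(totals)
```

If the derivatives respond differently at the detector, each would need its own response factor before summation; the plain sum is only valid under the equal-response assumption.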

GC-MS Metabolite Profiling-Applications, Performance and Reliability
The term metabolite profiling has existed for many years, but only the development of high-capacity and high-throughput chromatographic systems in recent years has established the basis for generating extensive amounts of profiling data, comparable in extent to the output of proteomic and transcriptomic analyses. Moreover, the applicability and significance of GC-MS metabolite profiling in particular in functional genomics was recognized early using the plant model systems Arabidopsis thaliana and potato (Solanum tuberosum) [20,21]. Common to those reports is the low coverage of secondary structures, which primarily include major phenolic acids, pyridines, tocopherols and sterols. The concept of metabolomics has also been applied to the study of crop plants in more recent years [22]. Using tomato (Solanum lycopersicum) as an example, breeding goals towards nutritional quality [23] and yield [24], the impact of environmental factors such as fertilization [25], and temporal metabolite patterns in molecular crop physiology have been addressed [26]. Here, too, GC-MS metabolite profiling focused mainly on central metabolism and less on secondary structures. However, phenolic acids commonly found in plant material have been detected in potato [27,28], maize [29], soybean leaves [30] and tobacco [31]. GC-MS profiling of lipophilic compounds including sterols/triterpenes and tocopherols has been described for tomato cuticles [32] and maize grain [29]. One important and necessary concept for the description of genetically modified (GM) crops is so-called substantial equivalence, based on the evaluation of whether or not the chemical composition of a GM crop differs from that of its non-GM counterpart. Metabolite profiling is utilized as an essential tool for screening GM crops with regard to quality and health requirements in order to investigate potential changes in metabolite profiles in, e.g., wheat [33], rice [34], and maize [35].
Global metabolomic approaches based on derivatization followed by GC-MS for the mapping of biosynthetic pathways and characterization of metabolic perturbations and genotypes appear to be less laborious and more cost-effective compared to metabolite targeting. The use of highly sensitive and fast-scanning GC-TOF-MS instrumentation in particular facilitates proper compound detection and resolution of co-eluting peaks. Co-elution is a common feature of comprehensive metabolite matrices, and can successfully be addressed based on the acquired data, including exact retention time, accurate mass, and characteristic MS fragmentation patterns and ion intensities. In GC-QMS or GC-ITMS, on the other hand, analysis time might be extended in order to improve separation performance and resolution of overlapping peaks. In recent years, more advanced extraction and separation methods have been added to the toolbox of automated profiling techniques. Such advances include GC × GC-MS (or so-called 2D GC-MS), where samples are sequentially separated on columns of different polarity, thus increasing separation capacity and performance in the detection of biomolecules [36].
However, even the use of accurate mass, MS spectra and RI values might lead to misidentification of compound structures, not least because mass spectra of different TMS derivatives might be notoriously similar (e.g., sugar alcohols), or totally different metabolites might have the same RI value. Another major problem when analysing complex sample mixtures is the fact that metabolite abundances can vary by many orders of magnitude. The linearity and detector response dynamics of metabolites from different structure groups differ greatly [18], thus impeding quantitation of absolute compound concentrations. For that purpose, levels of distinct metabolites might be determined by comparison with calibration curves built from the response ratios of standard solutions at various concentrations, as described by, e.g., Roessner-Tunali et al. [37]. Moreover, variations in sample amount, i.e., when using a sample range with a defined maximal tolerance, only allow for relative quantitation of metabolites, in contrast to procedures using exactly the same sample amounts, which allow for absolute quantitation [38]. This fact needs suitable consideration when working with small sample sizes.
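Quantitation against a calibration curve of response ratios can be illustrated with a short sketch. It is not the procedure of the cited reference: it assumes a linear detector response over the calibrated range, defines the response ratio as analyte peak area divided by internal standard peak area, and uses invented standard concentrations and ratios. The function names `fit_line` and `quantify` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(response_ratio, slope, intercept):
    """Invert the calibration line to obtain a concentration."""
    return (response_ratio - intercept) / slope


# Invented calibration data: concentration of the standard (e.g., in mg/L)
# and the measured response ratio (analyte area / internal standard area).
std_conc = [5.0, 10.0, 20.0, 40.0]
std_ratio = [0.26, 0.51, 1.01, 2.01]

slope, intercept = fit_line(std_conc, std_ratio)
sample_conc = quantify(0.76, slope, intercept)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, sample={sample_conc:.2f}")
```

Because detector response dynamics differ between structure groups, a separate curve would be needed for each metabolite of interest, and samples falling outside the calibrated range should not be extrapolated.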

Secondary Metabolites in GC-MS-Based Metabolomic Approaches
GC-MS-based metabolite profiling of TMS derivatives not only generates vast chemical information about primary metabolism, but also includes extensive MS data on secondary metabolites and "unknowns". In order to advance the identification process of less abundant structures in plant samples, comprehensive RI tables for cinnamic acids and other simple phenolic structures, flavonoids, tocopherols and sterols have been generated to provide useful information about the elution order of silylated secondary metabolites on a GC system [27,39]. Furthermore, MS fragmentation patterns have been reported for a wide range of secondary metabolites including phenols and phenolic acids [40][41][42][43], flavonoids [42,44,45], alkylresorcinols [46], phytoestrogens [47], secoiridoids and ligstrosides [41,48], diterpenes and diterpenic acids [49], phenolic diterpenes and pentacyclic triterpenes [50], sterols, stanols, and esters thereof [51,52], lignans [53], stilbenes [54,55], and alkaloids [56]. The use of relevant scientific literature reporting GC-MS information about phytochemicals is crucial for the tentative identification of less abundant chemical structures in profiling experiments. Given the restricted number of primary and major secondary metabolites commonly included in spectral libraries of TMS analytes, several specific secondary structures are described based on the case studies presented in Section 4.

Detection of Plant Phenolics and Other Cyclic Structures
The following classification of the quite diverse group of phenolic structures and other plant-derived cyclic compounds follows chemical structure characteristics rather than biosynthetic relationships, which makes it easier to discuss the topic from an analytical point of view. In a wider sense, the phenolic structures addressed here contain either one (Section 3.1) or several aromatic rings (Section 3.2) as part of the molecule. In order to cope with the tremendous variability of primary and secondary metabolites detectable by GC-MS, several attempts have been made to facilitate identification through the construction of combined RI and MS databases of derivatized compounds, generally termed mass spectral tags (MST) [57]. Retention time indices are a prerequisite for the tentative annotation of metabolites whose mass spectra show high similarity, such as pentoses and hexoses. Recent efforts have focused on the need for extended and comprehensive RI information by compiling RI values of TMS analytes of various phytochemicals including phenolic acids and flavonoids [39,45,51,58,59], diterpenes [49], sterols [51] and tocopherols [32].
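How an RI value is derived from a retention time can be sketched as follows. The example assumes a temperature-programmed run calibrated with n-alkanes and uses the common linear interpolation between the two bracketing alkanes (the van den Dool and Kratz form); the text does not state which RI system the cited databases use, and the alkane retention times below are invented. The function name `linear_retention_index` is hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkanes):
    """Linear retention index by interpolation between bracketing n-alkanes.

    rt: retention time of the analyte.
    alkanes: list of (carbon_number, retention_time), sorted by retention time.
    """
    times = [t for _, t in alkanes]
    i = bisect_right(times, rt) - 1  # index of the last alkane eluting at or before rt
    if i < 0 or i >= len(alkanes) - 1:
        raise ValueError("retention time is not bracketed by two alkanes")
    (n, t_n), (n_next, t_next) = alkanes[i], alkanes[i + 1]
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Invented alkane ladder: (carbon number, retention time in minutes).
ALKANES = [(10, 5.0), (12, 7.0), (14, 9.4)]

print(linear_retention_index(6.0, ALKANES))  # halfway between C10 and C12
print(linear_retention_index(8.2, ALKANES))  # halfway between C12 and C14
```

Because the index is expressed relative to the alkane ladder rather than in minutes, it is more transferable between instruments and runs than the raw retention time, which is what makes combined RI and MS databases practical.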
Moreover, several research groups and consortia have addressed the complexity of compound structures, since commercially available libraries (e.g., NIST or Wiley) contained insufficient MS information about the derivatized analytes that are frequently acquired in metabolomics experiments. The Golm Metabolome Database (GMD) [60] today comprises online information on >4600 MS analyte entries, including >3500 analytes with valid spectra (TMS derivatives, tert-butyldimethylsilyl (TBS) derivatives, and isotopically labeled compounds). However, more than 1400 spectra have not been annotated, underscoring the need for further chemical information in order to approach the metabolome of plants and other organisms. The downloadable and searchable GMD library contains about 3600 analytes and MST information on 1200 single metabolites. In comparison, the Fiehn GC-MS Metabolomics RTL Library [19] is based on 1400 analyte entries relating to 900 metabolites, while the MassBank database [61] contains 963 MSTs of TMS analytes relating to >700 single metabolites. Public repositories of GC-MS-based spectral information such as MetabolomeExpress [62] might help to extend accessible MS library information to include secondary metabolites as well.
Only those phenolic and cyclic structures which are commonly detectable in derivatized (silylated) samples following the GC-MS profiling protocols applied by the majority of metabolomics labs worldwide [1,2,63,64] will be discussed in Subsections 3.1 through 3.6. For the silylated metabolites presented and discussed in Sections 3 and 4, information about readily searchable and publicly available MS databases and libraries providing their MS spectra will be included in each figure.

Simple Phenolics, Aromatic Acids and Related Structures
In most cases phenolic structures are derived from aromatic amino acids such as phenylalanine and tyrosine. Detectable phenolic structures comprise monophenols such as thymol (an aromatic monoterpene), benzyl alcohols, phenylethanoids (e.g., tyrosol), and the coumarins (e.g., umbelliferone) (Figure 2). The huge class of aromatic acids includes benzoic acid and cinnamic acid derivatives with different degrees of hydroxylation and methoxylation. Metabolites with vitamin function such as vitamin E (tocopherols) and vitamin K (phylloquinone and menaquinone) represent minor groups of phenolic structures found in food materials. However, tocopherols like α-, β-, γ- and δ-tocopherol are readily detected in biological samples during profiling experiments. More complexly structured monophenolics comprise phenolic diterpenes (e.g., carnosic acid), phenolic amides including the capsaicinoids with capsaicin as a well-known representative, the phenolic lipids, e.g., alkylresorcinols, which are commonly found in cereals (wheat, rye, barley and sorghum), but also in certain tree species and bacteria, and finally benzothiazoles, which are mentioned in Subsection 3.6.

Polyphenols
The highly diverse group of plant flavonoids comprises flavonols (e.g., kaempferol and quercetin), flavanones (e.g., naringenin and hesperidin), flavones (e.g., luteolin and apigenin), flavan-3-ols (e.g., catechin and gallocatechin), and flavanonols (e.g., taxifolin) (Figure 3). All these structures, either as aglycone or glycoside, are characterized by a certain number of hydroxy groups which can be silylated. However, due to the relatively higher molecular weight of glycosylated polyphenols, the detection and structure elucidation of intact glycosides is preferably achieved on LC platforms, which is also true for hydrolyzable tannins. Another closely related group of polyphenols, the phytoestrogens, comprises well-known structures such as isoflavonoids (e.g., genistein and daidzein), commonly found in species of the Fabaceae family, and the lignans (e.g., secoisolariciresinol and pinoresinol) derived from different plant food sources. The minor class of stilbenoids, hydroxylated stilbene derivatives such as resveratrol and piceatannol, is also readily silylated and detectable if present in appreciable amounts in the sample. In contrast, the important group of anthocyanidins and their glycosylated counterparts, the anthocyanins, show high abundances in fruits and berries but also in other plant tissues. These metabolites are normally detected on LC-MS systems, not least due to the molecules' positive charge.

Terpenoids and Sterols
The terpenoids, including the biosynthetically derived sterols, constitute a huge class of secondary metabolites which can be found in diverse organisms (Figure 4). Mono- and sesquiterpenes are volatile and lipophilic metabolites commonly found in high abundances in herbs and spices, conifers and other tree species. The lipophilic phase might preferably be analysed separately prior to GC-MS profiling. However, this step is not always practically feasible, and omitting it might result in the detection of silylated derivatives of alcoholic mono- and sesquiterpenes, hydroxylated and/or carboxylated di- and triterpenes, and phytosterols including sterols and stanols. Structurally closely related is the group of secosteroids, including cholecalciferol (vitamin D) and derivatives, which are also determined in profiling studies.

N-Containing Cyclic Structures
Alkaloids constitute a chemically quite diverse group of basic, nitrogenous secondary metabolites, most of which are characterized by heterocyclic structures (Figure 5). Well-known metabolites include stimulant and/or medicinally significant compounds such as nicotine (based on pyrrolidine/pyridine ring structures), caffeine (a purine), morphine (an isoquinoline), serotonin (an indole), and cocaine (a tropane). Apart from alkaloids, tryptophan-derived indoles also form the basic ring structures of the plant hormone-related auxins (e.g., indole-3-acetic acid) and several amino acids. The cytokinins are also N-heterocyclic structures, with zeatin (a purine) as an important representative compound. Furthermore, most B vitamins are built up of N-heterocycles, comprising detectable structures such as B3 niacin and B6 pyridoxine (both pyridines), B7 biotin (an imidazole), and B9 folic acid (a pteridine). Importantly, nitrogenous bases, their nucleosides and phosphorylated nucleotides, which are essential components of RNA and DNA in all organisms, are readily detected in profiling experiments. Cyclic structures comprise pyrimidines (thymine, uracil and cytosine) and purines (adenine and guanine). The vast diversity of N-containing cyclic metabolites from plants does not allow all structural classes to be presented here. However, it is noteworthy that only a minor fraction of potentially detectable compounds is included in MS libraries of silylated compounds.

O-Containing Cyclic Structures
Silylated carbohydrates comprise the largest group of GC-MS-detectable oxygen-containing cyclic metabolites. Pentoses and hexoses and oligosaccharides thereof occur in all biological samples, showing cyclic structures either as furanoside or pyranoside (Figure 6). Furan structures, in particular lactones derived from sugar acids (e.g., ascorbic acid) but also from amino acids, are readily determined. This includes the potential detection of glycosylated compounds such as terpenes, aromatic structures and purines during profiling experiments. Benzopyran structures have already been mentioned in the context of tocopherols (see Subsection 3.1).

S-Containing Cyclic Structures
Though nature produces a vast diversity of S-containing metabolites, only a few cyclic structures have been included in MS libraries of silylated compounds. Those mentioned here include both natural products and compounds which are not biosynthesized by organisms (Figure 7). Lipoic acid and its derivatives represent dithiolane structures with a characteristic disulfide bond in a five-membered ring. Though normally covalently bound in mitochondrial enzyme complexes, these compounds might potentially be detected in profiling experiments. This is also true for thiazoles, i.e., five-membered N- and S-containing rings, heterocyclic compounds such as natural and synthetic benzothiazoles (e.g., the artificial sweetener saccharin), and S-containing polycyclic thioxanthenes used as photoinitiators (e.g., 2-ITX) in paper and packaging materials.

Case Studies-GC-MS Profiling of Plant Samples
GC-MS is one of the most efficient technology platforms to approach complex mixtures of organic compounds based on a combination of MS database search and the use of calculated RI values. Suitable MS and RI resources have been developed and are well established for the analysis of, e.g., essential oil constituents [67][68][69], and for the investigation of environmental samples using either the comprehensive NIST or Wiley MS libraries or vendor-specific and customized databases. In the case of silylation-based derivatization techniques, which cover a broad range of molecular masses and different polarities, the information base regarding phenolic and other cyclic structures belonging to the complex group of plant secondary metabolites is rather limited. This is particularly true for higher molecular weight compounds (≥300) as pointed out by Isidorov and Szczepaniak [39]. However, the specificity of molecular structures and thus of MS fragmentation patterns in most cases allows for the assignment of distinct compound groups and sub-classes.
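The combination of MS database search and RI values can be illustrated with a minimal sketch. It is not the scoring used by NIST, Wiley or any other library mentioned here: it assumes unit-mass spectra stored as m/z-to-intensity dictionaries, a plain cosine similarity as match score, and an RI tolerance window chosen arbitrarily. The library entries, spectra, thresholds and the function names `cosine_similarity` and `annotate` are all hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def annotate(query_spec, query_ri, library, ri_window=10.0, min_score=0.8):
    """Return (name, score) hits that pass both the RI window and the MS score."""
    hits = []
    for name, (lib_ri, lib_spec) in library.items():
        if abs(lib_ri - query_ri) > ri_window:
            continue  # similar spectrum but wrong elution position
        score = cosine_similarity(query_spec, lib_spec)
        if score >= min_score:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical library: two analytes with identical spectra but different RI.
LIBRARY = {
    "analyte A": (1700.0, {73: 100.0, 147: 40.0, 217: 60.0}),
    "analyte B": (1745.0, {73: 100.0, 147: 40.0, 217: 60.0}),
}
query = {73: 100.0, 147: 38.0, 217: 62.0}

hits = annotate(query, 1743.0, LIBRARY)
print(hits)
```

In this toy case the spectra alone cannot separate the two candidates, and only the RI window resolves the annotation, which mirrors the situation described for analytes with highly similar mass spectra.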
In the following Subsections 4.1 to 4.4, these aspects will be addressed by using examples from the analysis of various plant raw materials and processed plant food to emphasize the applicability of GC-MS profiling for the separation, detection and identification of phenolics and cyclic structures with respect to metabolic phenotyping and quality assessment purposes. Extraction, derivatization and GC-QMS conditions followed procedures for plant samples as described earlier [35,70,71,72].

Fresh Plant Samples: Flavonoids and Derivatives
Flavonoids represent a highly diverse class of polycyclic secondary structures commonly found in the plant kingdom. In addition to their function as pigments for insect attraction, seed dispersal and UV light absorption, flavonoids serve as antioxidants and radical scavengers, in plant signaling and as defense compounds. Chemically, flavonoids are characterized by a C6-C3-C6 flavone skeleton (A-C-B rings) enclosed with oxygen in the 3-carbon bridge (C-ring) between the phenyl groups. The different sub-classes of flavonoid structures include flavones, flavonols, flavanones, flavanols, flavanonols, anthocyanidins, isoflavones, chalcones and neoflavonoids, as reviewed by Tsao and McCallum [73].
Depending on desaturation and oxidation status of the C-ring and moreover, hydroxylation, methoxylation, and/or prenylation patterns of the flavone backbone (for examples refer to Figure 3), EI-based fragmentation is expected to generate quite stable MS fragments with distinct molecular masses, thus providing sufficient identification capability for structure elucidation.
Strawberry (Fragaria × ananassa Duch.) is renowned as a marketable fruit due to its pleasant taste and flavour [74], and its high content of health-beneficial polyphenolic compounds [75][76][77]. Other plant parts of different Fragaria species were also shown to contain high levels of phenolic structures, as studied in flowers [78], leaves [79,80] and roots [81,82]. Moreover, GC-MS profiling has recently been applied for the characterization of shifts in metabolite pools in leaf and crown tissue of strawberry plants exposed to cold temperatures [70,72,83] in order to identify those compounds uniquely linked to cold acclimation. The published chemical information related primarily to primary metabolites; however, a large number of secondary structures could be deduced easily based on the available high-resolution MS information from crown [83], and leaf and root samples [72]. Besides cinnamic acid- and benzoic acid-derived structures, vegetative tissue contains reasonable amounts of flavan-3-ols (catechin derivatives) and flavonols as presented in Figures 8 and 9.

Plant-Based Aquafeeds: Phenolic Acids
Fish feeds are formulated from marine (fish meal and oil), animal (e.g., blood meal and poultry by-products), and plant (e.g., starchy grains, protein meals, and oils) feedstuffs, and additional amino acid and micronutrient supplements. Due to limited resources of fishmeal and fish oil and the over-exploitation of wild fish stocks, plant feedstuffs derived from seeds are considered more sustainable, cost efficient and highly valuable as protein ingredients in aquafeeds. On the other hand, seeds contain well-characterized and supposedly unknown phytochemicals (secondary metabolites), which might impair appetite, nutrient utilization, physiology, fish health and growth [84], particularly in carnivorous fishes.
Therefore, these substances are also termed antinutritional factors (ANF) because of their non-nutrient function [85]. In a recent study, GC-MS-based metabolite profiling was applied in order to gain detailed chemical information about plant-derived feedstuffs with regard to nutritious small molecules (carbohydrates, lipids, amino acids and amines) and, simultaneously, potential ANFs [35], comprising compounds from diverse structure groups such as phenolics, alkaloids, terpenes, and glycosides. Protein-rich seeds from the legume plant family (e.g., soybean and pea) or refined plant ingredients derived from industrial processes after vegetable oil or starch extraction (e.g., sunflower, rapeseed and corn) represent good sources of plant proteins (Figure 10). Despite industrial refinement and concentration steps, plant ingredients such as sunflower meal, soy protein concentrate and corn gluten might still contain reasonable amounts of free and bound phenolic ANFs due to interactions with polysaccharides and/or proteins [86], showing a broad spectrum of compounds derived from benzoic acid, cinnamic acid and phenyl ethanol (Figure 11). Relatively high levels of vanillic acid (20-100 mg/kg), syringic acid (40-150 mg/kg), (E)-ferulic acid (20-60 mg/kg) and sinapic acid (20-60 mg/kg) were found in soybean meal and thus underscore the potential ANF content related to non-flavonoid phenolics, which ranges between 660 and 2000 mg/kg in dehulled beans [87]. In comparison, sunflower seeds are known to contain relatively high levels of mono- and diacylquinic acids [88], which might negatively affect taste and nutrient uptake in humans [89]. Levels of different caffeoylquinic acid structures including chlorogenic acid in sunflower meal (Figure 11) ranged between 600 and 2200 mg/kg and compared well with estimates reported in the literature, e.g., [90,91].

Cereals: Alkylresorcinols
Cereal grains (wheat, rye, barley and oat) constitute one of the major sources of staple foods for human consumption worldwide due to their nutritious content of carbohydrates, proteins and lipids, also including minerals, vitamins and dietary fibre. Moreover, recent epidemiological studies have shown that intake of whole grain products is positively linked to prevention of metabolic syndrome, obesity, cardiovascular disease and type 2 diabetes [92]. Despite relatively high concentrations of phenolic antioxidants in fruits and berries, the impact of levels of phenolic compounds in grain products and cereals on human health is underestimated. In other words, based on Western food traditions the intake of health-beneficial plant phenolics is not necessarily and primarily based on the consumption of fruits and vegetables. Cereal products contain relatively high levels of benzoic acid and cinnamic acid derivatives [93], either free or in bound form esterified with cell wall components. Based on data from the multinational HEALTHGRAIN study, wheat, rye and oat grain show comparable levels of phenolics, while concentration levels in barley are somewhat lower [94].
The so-called alkylresorcinols (AR) constitute a characteristic sub-class of phenolic compounds found in cereals. ARs are based on a 1,3-dihydroxy-5-alkylbenzene structure carrying an odd-numbered alkyl or alkenyl chain (C17:0 to C25:0) [95-97], and have been suggested as biomarkers for the estimation of whole grain consumption in humans [98]. Due to their unique chemical structure, ARs show distinct EI-MS fragmentation patterns, generating a base peak of m/z = 268 and a molecular ion (M+) peak whose mass depends on the length and degree of saturation of the side chain (Figure 12). These molecular features facilitate straightforward detection and identification, also in comprehensive GC-MS metabolite profiles. Detected levels of ARs in industrially processed grain and bakery products clearly depended on coarseness and declined with the degree of refinement. AR levels ranged from 10 to 300 mg/kg dry weight (DW) in bread and from 200 to 300 mg/kg DW in wholemeal bread, while levels in low-processed four-grain meals were estimated at 500 to 650 mg/kg DW, thus corresponding well with results from other studies [46,91,95-97].
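The dependence of the molecular ion on the side chain can be illustrated with a short calculation. The following sketch assumes that ARs are measured as bis-trimethylsilyl (TMS) ethers, i.e., that both phenolic hydroxyl groups are silylated, and it uses nominal (integer) atomic masses. The function name and the homologue list are illustrative and are not taken from the profiling workflow described here.

```python
# Nominal atomic masses (integer mass of the most abundant isotope).
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16, "Si": 28}

# Base peak stated in the text for AR derivatives.
AR_BASE_PEAK = 268


def ar_tms_molecular_ion(chain_carbons: int, double_bonds: int = 0) -> int:
    """Nominal M+ of a 5-alk(en)ylresorcinol measured as its bis-TMS ether.

    Assumption: both phenolic hydroxyl groups carry a TMS group.
    """
    # Free AR with a saturated C_n side chain: C(n+6) H(2n+6) O2.
    carbons = chain_carbons + 6
    hydrogens = 2 * chain_carbons + 6 - 2 * double_bonds
    # Each TMS group replaces one H by Si(CH3)3: +3 C, +8 H, +1 Si.
    carbons += 2 * 3
    hydrogens += 2 * 8
    return (
        carbons * NOMINAL_MASS["C"]
        + hydrogens * NOMINAL_MASS["H"]
        + 2 * NOMINAL_MASS["O"]
        + 2 * NOMINAL_MASS["Si"]
    )


# Odd-numbered saturated homologues named in the text (C17:0 to C25:0).
for n in (17, 19, 21, 23, 25):
    print(f"C{n}:0  base peak m/z {AR_BASE_PEAK}, M+ m/z {ar_tms_molecular_ion(n)}")
```

Under these assumptions, each additional C2H4 unit in the side chain shifts M+ by 28 mass units and each double bond lowers it by 2, while the base peak remains unchanged. This is why a homologous AR series can be picked out of a complex profile.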

Olive Oil: Simple Phenolic Structures and Secoiridoids
Olive oil is a vegetable oil produced from fruits of the olive tree (Olea europaea L.) and its subspecies. The olive tree is a traditional woody species in Mediterranean countries, the main production region in the world, but olive oil is also produced in Asia, the Americas and Australia. The oil is commonly used in cooking, cosmetics, pharmaceuticals, soaps and as a fuel for oil lamps. Olive oil is considered a highly valuable and healthy oil because of its high content of glyceride-bound unsaturated fatty acids, mainly the monounsaturated oleic acid (C18:1), alongside linoleic acid (C18:2) and α-linolenic acid (C18:3). In addition, olive oil contains sterols, triterpenic compounds, aliphatic alcohols and esters, and appreciable amounts of different phenolic structures. Tyrosol and its derivatives, namely oleuropeins and ligstrosides, represent characteristic phenolic structures found in olive, with health-beneficial effects as reported by the European Food Safety Authority (EFSA) [99]. Other commonly detected phenolic compounds comprise hydroxybenzoic, hydroxycinnamic and hydroxyphenylacetic acids, lignans and flavonoids [91]. Phenolic patterns found in olive oil can be used to study regional and varietal differences [100], and effects of oil production and processing [101]. Different approaches towards extraction and chromatographic separation have been described [102,103], including derivatization methods followed by GC-MS for the analysis of olive and other vegetable oils [41,104-106].
The quality of olive oils is mainly determined by extraction and processing conditions, and by distinct quality parameters such as acidity, taste and flavour characteristics. According to the classification system of the International Olive Council, oils can be divided into extra-virgin, virgin and chemically treated refined oils, all of which are originally obtained by pressing, and solvent-extracted pomace oils. Depending on the physical and chemical extraction and processing steps utilized, the content and composition of phenolics show a high degree of variation. Extra-virgin oils often show two to three times higher levels of phenolic compounds than refined oil qualities [91]. As an effect of oil processing, refined oils generally show a higher abundance of simple tyrosol structures due to the degradation of oleuropeins and ligstrosides. On the other hand, oleuropein and ligstroside content is generally much higher in extra-virgin and virgin oils, not uncommonly exceeding 300 mg/kg. Based on a quality screening of different commercially available olive oils and olive extracts, variations in phenolic metabolites could easily be characterized using liquid-liquid extraction techniques and compound derivatization. Owing to their EI fragmentation patterns, secoiridoid aglycones could be easily traced using selected ion monitoring (SIM) plots depicting the distinct MS base peaks related to ligstroside (m/z = 192) and oleuropein (m/z = 280) structures (Figure 13). A total of 55 phenolic metabolites and three other cyclic structures could be detected, of which 28 compounds were tentatively identified based on a combination of MS database search and retention index values (Table 2).
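The tracing of secoiridoid structures by their diagnostic ions can be sketched in a few lines of code. The example below works on full-scan data and therefore produces extracted-ion traces, which serve the same purpose as the SIM plots mentioned above. The data structure, the mass tolerance and all intensity values are hypothetical and serve only to exercise the functions; they are not measurements from the olive oil screening.

```python
from typing import Dict, List, Tuple

# One scan: (retention time in min, list of (m/z, intensity) pairs).
Scan = Tuple[float, List[Tuple[float, float]]]

# Diagnostic base peaks named in the text.
DIAGNOSTIC_IONS: Dict[str, float] = {
    "ligstroside-type": 192.0,
    "oleuropein-type": 280.0,
}


def extract_ion_trace(
    scans: List[Scan], target_mz: float, tol: float = 0.5
) -> List[Tuple[float, float]]:
    """Return (retention time, summed intensity) for ions within tol of target_mz."""
    trace = []
    for rt, peaks in scans:
        intensity = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        trace.append((rt, intensity))
    return trace


def apex(trace: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Retention time and intensity of the most intense point of a trace."""
    return max(trace, key=lambda point: point[1])


# Hypothetical full-scan data, invented for illustration only.
demo_scans: List[Scan] = [
    (20.0, [(73.0, 900.0), (192.0, 50.0), (280.0, 10.0)]),
    (20.1, [(73.0, 950.0), (192.0, 800.0), (280.0, 15.0)]),
    (20.2, [(73.0, 920.0), (192.0, 120.0), (280.0, 30.0)]),
    (22.5, [(73.0, 870.0), (192.0, 20.0), (280.0, 650.0)]),
]

traces = {
    name: extract_ion_trace(demo_scans, mz) for name, mz in DIAGNOSTIC_IONS.items()
}
for name, trace in traces.items():
    rt, intensity = apex(trace)
    print(f"{name}: apex at {rt} min, intensity {intensity}")
```

A peak in one of these traces only flags a candidate. Assignment to a specific compound would still rely on the full spectrum and the retention index, as in the tentative identifications summarized in Table 2.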

Conclusions
GC-MS is frequently applied to characterize the chemical complexity of analytical samples based on its separation and identification capacity. Recent developments in GC-MS technology have facilitated global metabolomics approaches to investigate biological functions and perturbations of biological systems, and for diagnostics and quality assessment purposes. However, one should be aware of the limitations of global GC-MS metabolite profiling. Processing, automated sample handling, and analysis conditions need to be strictly defined and controlled in order to minimize data variation and allow for quantitative calculations. When using standard protocols that are adapted to cover a broad range of biochemical structures, single metabolites or groups of compounds might be discriminated against by the generalized extraction and derivatization conditions, which negatively affects compound recovery rates. Moreover, in GC-MS analysis of highly complex mixtures of derivatized metabolites, co-eluting peaks and similar mass spectra might impair separation and detection capacity and thus lower the level of confidence in compound identification. For determination of absolute metabolite concentrations, the use of standard compounds is required; otherwise, targeted methods need to be applied for the proper quantitation of compounds of interest. Despite limitations in GC-MS with respect to the mass range and polarity of metabolites, the utilization of derivatization techniques and automation technology has extended the range of separable and detectable compounds in high-throughput profiling experiments. Besides the qualitative and quantitative analysis of trimethylsilyl derivatives of highly abundant compounds found in plant samples, such as sugars, amino acids and polyols, instrument sensitivity and resolution also allow for the successful detection of minor constituents such as plant secondary metabolites.
Even though mass spectral information about monophenolic, polyphenolic and other cyclic compounds in MS libraries is limited, structure-specific MS fragmentation patterns make it possible to trace and identify low-concentration metabolites, often in combination with published MS data from targeted GC-MS analyses. Current limitations in MS-based metabolomics due to the relatively small number of compounds included in MS databases, in particular secondary metabolites, and hurdles in compound identification might be overcome by ongoing and future efforts. These include in silico derivatization, retention indices and mass spectra matching [107], in silico enzymatic synthesis of biochemical compounds for non-targeted metabolomics [108], and endeavours such as web-based and shared collections of experimental metabolomics datasets, MS spectra and RI values for the processing and interpretation of GC-MS data [62]. Based on experimental data from our own research, the present review has emphasized the capabilities of GC-MS to deduce chemical information on phenolics and cyclic compounds found in complex mixtures of plant metabolites.
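As an illustration of how retention indices and mass spectra matching can be combined for tentative identification, the following minimal sketch computes a linear (temperature-programmed) retention index from bracketing n-alkanes and compares two spectra by cosine similarity. The acceptance thresholds, the alkane retention times and the spectra are hypothetical values chosen for the example; they are not taken from the libraries or tools cited in this review.

```python
import math
from typing import Dict, List, Tuple


def linear_retention_index(rt: float, alkanes: List[Tuple[int, float]]) -> float:
    """Linear retention index from bracketing n-alkanes.

    alkanes: (carbon number, retention time) pairs sorted by retention time.
    """
    for (n_lo, t_lo), (n_hi, t_hi) in zip(alkanes, alkanes[1:]):
        if t_lo <= rt <= t_hi:
            return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))
    raise ValueError("retention time lies outside the alkane ladder")


def cosine_similarity(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Cosine similarity of two spectra given as {nominal m/z: intensity}."""
    dot = sum(a[mz] * b[mz] for mz in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def is_tentative_match(
    query_spec: Dict[int, float],
    query_ri: float,
    lib_spec: Dict[int, float],
    lib_ri: float,
    min_score: float = 0.8,   # hypothetical spectral threshold
    ri_window: float = 20.0,  # hypothetical RI tolerance
) -> bool:
    """Accept a library hit only if both spectrum and RI agree."""
    return (
        cosine_similarity(query_spec, lib_spec) >= min_score
        and abs(query_ri - lib_ri) <= ri_window
    )


# Hypothetical alkane ladder and spectra, invented for illustration only.
alkane_ladder = [(20, 18.0), (22, 21.0), (24, 24.0)]
query_ri = linear_retention_index(19.5, alkane_ladder)
query_spectrum = {73: 40.0, 268: 100.0, 492: 15.0}
library_spectrum = {73: 35.0, 268: 100.0, 492: 20.0}

print(round(query_ri, 1))
print(is_tentative_match(query_spectrum, query_ri, library_spectrum, lib_ri=2110.0))
```

Requiring agreement in both dimensions reflects the point made above: spectra of structurally related derivatives can be very similar, so the retention index provides an independent constraint on the candidate list.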