Identification and Expression Analysis of Glucosinolate Biosynthetic Genes and Estimation of Glucosinolate Contents in Edible Organs of Brassica oleracea Subspecies

Glucosinolates are anti-carcinogenic, anti-oxidative biochemical compounds that defend plants from insect and microbial attack. Glucosinolates are abundant in all cruciferous crops, including all vegetable and oilseed Brassica species. Here, we studied the expression of glucosinolate biosynthesis genes and determined glucosinolate contents in the edible organs of a total of 12 genotypes of Brassica oleracea: three genotypes each from the cabbage, kale, kohlrabi and cauliflower subspecies. Among the 81 genes analyzed by RT-PCR, 19 are transcription factor-related, two different sets of 25 genes are involved in the aliphatic and indolic biosynthesis pathways and the rest are breakdown-related. The expression of glucosinolate-related genes in the stems of kohlrabi was remarkably different from that in the leaves of cabbage and kale and the florets of cauliflower, as only eight genes out of 81 were expressed in the stem tissues of kohlrabi. In the stem tissue of kohlrabi, only one aliphatic transcription factor-related gene, Bol036286 (MYB28), and one indolic transcription factor-related gene, Bol030761 (MYB51), were expressed. The results indicated that expression of all genes is not essential for glucosinolate biosynthesis. Using HPLC analysis, a total of 16 different types of glucosinolates were identified in four subspecies, nine of them were aliphatic, four of them were indolic and one was aromatic. Cauliflower florets contained the highest number of glucosinolates (14). Among the aliphatic glucosinolates, gluconapin was found only in the florets of cauliflower. Glucoiberverin and glucobrassicanapin contents were the highest in the stems of kohlrabi. The contents of the indolic methoxyglucobrassicin and the aromatic gluconasturtiin were the highest in the florets of cauliflower. Further detailed investigation and analysis are required to discern the precise roles of each of the genes for aliphatic and indolic glucosinolate biosynthesis in the edible organs.


Introduction
Glucosinolates, β-thioglucoside-N-hydroxysulfates (cis-N-hydroximinosulfate esters), are sulfur-enriched, anionic secondary metabolites of plants synthesized from amino acids and sugars. They are synthesized in all vegetables and oilseed plants of the order Brassicales [1]. Upon hydrolysis, these metabolites not only confer characteristic flavors to Brassica vegetables [2,3] but also serve to prevent carcinogenesis in animals by regulating the cell cycle and stimulating apoptosis [4]. Hydrolysis by the myrosinase enzyme degrades glucosinolates into different bioactive products, mostly isothiocyanates [5][6][7]. Isothiocyanates such as sulforaphane [8,9] and indole-3-carbinol [10] are strongly anti-carcinogenic, whereas phenethyl isothiocyanate inhibits the transformation of carcinogens from one form to another [11,12]. In addition to their anti-carcinogenic properties in the animals that consume them, glucosinolates are anti-oxidative [13] and help defend against herbivores and microbes [14,15]. Apart from the various benefits of glucosinolates, a few of them, for example progoitrin, are also reported to have adverse effects in animals, with goitrogenic effects (i.e., enlargement of the thyroid) [16], although no evidence of any such effect has been reported in humans from Brassica consumption [17]. It is important to understand the genetics of biosynthesis and accumulation of health-promoting glucosinolates in order to increase their content for human and animal health and plant protection.
Plants contain over 200 structurally different glucosinolates, which are generally classified as aliphatic, indolic or aromatic based on their primary precursor amino acids [13,18]. The basic precursors of aliphatic, indolic and aromatic glucosinolates are methionine (or alanine, leucine, isoleucine and valine), tryptophan and phenylalanine (or tyrosine), respectively [13,19]. All three types of glucosinolates are generated by a characteristic biosynthetic pathway that involves elongation of the amino acid side chain by the addition of methylene groups, formation of the core structure and subsequent secondary modification of amino acid side chains by oxidation, hydroxylation, methoxylation, sulfation, glycosylation and similar reactions [20][21][22]. In Brassica species, most of the glucosinolates are biosynthesized from methionine. Elongation of the methionine side chain involves methylthioalkylmalate synthase (MAM), bile acid:sodium symporter family protein 5 (BASS5) and branched-chain aminotransferase (BCAT) [23][24][25][26]. Formation of the core structure is a five-step process that includes formation of aldoxime by cytochromes P450 of the CYP79 and CYP83 families, oxidation of aldoxime by members of the CYP83 family, formation of thiohydroximic acid followed by C-S cleavage, and formation of desulfoglucosinolate by S-glucosyltransferase and glucosinolates by sulfotransferase [27,28]. Subsequent secondary modification involves several gene loci, for example those encoding GS-OX, GS-AOP, GS-OH, BZO1 and CYP81F2. R2R3-MYB transcription factors and other nucleus-localized regulators participate in glucosinolate biosynthesis [18,29,30,31,32,33,34]. Moreover, the sulfate assimilatory pathway, which provides the glutathione and 3′-phosphoadenosine 5′-phosphosulfate co-substrates that act on the desulfo precursor during glucosinolate biosynthesis, also involves several other genes [20].
Brassica oleracea is an important, diversified vegetable species whose glucosinolate biosynthetic and catabolic pathways differ from those in Arabidopsis and B. rapa. B. oleracea also shows greater glucosinolate profile diversity than B. rapa and B. napus [35]. B. oleracea and B. rapa respectively have 105 and 101 glucosinolate metabolism-related genes, among which 22 genes are related to catabolism [35]. The coding DNA sequences of 84 B. oleracea genes related to glucosinolate biosynthesis [35] are available in two independent databases, Bolbase and EnsemblPlants, but expression analysis has not been carried out for any of those genes to date. Therefore, a comparative validation of the coding sequences deposited in the two databases is necessary prior to functional analysis. The glucosinolate biosynthesis and catabolism pathways of Arabidopsis, B. rapa and B. oleracea are likely related but vary in the proportion of tandem genes and in the number and functions of genes for MAM and 2-oxoglutarate-dependent dioxygenase (AOP) [35]. The functions of MAM family members in condensation, side-chain elongation and chain length determination during glucosinolate biosynthesis differ in Arabidopsis compared to B. rapa and B. oleracea. The MYB76 transcription factor is present in Arabidopsis, but B. oleracea and B. rapa lack that factor [36]. In addition to 4C glucosinolates, biosynthesis of sinigrin, a 3C glucosinolate, in B. oleracea is assumed to be related to high expression of the Bol017070 gene, while its ortholog, Bra013007, remains silenced in B. rapa [35]. By contrast, B. rapa biosynthesizes more of the 5C glucosinolate glucobrassicanapin due to higher expression of MAM3 compared to that in B. oleracea [35]. B. oleracea has only one functional AOP gene (AOP2), whereas Arabidopsis and B. rapa have four and three functional AOP genes, respectively [35].

Plant Materials and Growth Conditions
Seeds of 12 different genotypes from four different groups of B. oleracea L., three genotypes from each group, were purchased from Asia Seed Co., Ltd. (Seoul, Korea). The groups were B. oleracea capitata (cabbage), B. oleracea acephala (curly kale), B. oleracea gongylodes (kohlrabi) and B. oleracea botrytis (cauliflower) (Table 1). The seedlings were raised in garden soil composed of peat moss, coco peat, perlite, zeolite and vermiculite in a growth chamber. Four-week-old seedlings were transferred to a glasshouse. Plants were grown for four months in the glasshouse before samples were destructively excised from several plants. The sampling sites were the edible organs of the plants (Table 1). The collected samples were snap-frozen in liquid nitrogen, freeze-dried and stored at −80 °C for RNA isolation and/or high performance liquid chromatography (HPLC) analysis.

In Silico Analysis
For in silico analysis, the following databases were utilized: the B. rapa genome database (http://brassicadb.org/brad/glucoGene.php); Bolbase, a comprehensive genomics database for B. oleracea (http://www.ocri-genomics.org/bolbase/index.html); and EnsemblPlants (http://plants.ensembl.org/index.html). Gene symbols and annotated names for glucosinolate-related genes of B. rapa, such as those for transcription factors and enzymes related to side-chain elongation, core structure formation, secondary modification and co-substrate pathways, were obtained from the two B. rapa databases [42]. B. oleracea orthologs, along with the full-length coding sequence (CDS) and the percentage of matching sequence between the Bolbase and EnsemblPlants databases, were then obtained (Table 2). In cases where Bolbase data were not available, EnsemblPlants data were used.
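The text does not state how the percentage of matching sequence between the two databases was computed. As a minimal sketch only, assuming the two CDS strings for one gene have already been downloaded, the share of matching bases can be approximated with Python's standard `difflib` module. The function name `percent_identity` and the toy sequences are hypothetical, and a dedicated pairwise aligner or BLAST would be the more usual choice for real data.

```python
from difflib import SequenceMatcher


def percent_identity(cds_a: str, cds_b: str) -> float:
    """Percentage of matching bases, relative to the longer of the two sequences."""
    a, b = cds_a.upper(), cds_b.upper()
    longest = max(len(a), len(b))
    if longest == 0:
        raise ValueError("both sequences are empty")
    # autojunk=False stops difflib from ignoring very frequent characters,
    # which matters for long sequences over a four-letter alphabet.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / longest


# Hypothetical toy sequences, not real Bolbase or EnsemblPlants entries.
bolbase_cds = "ATGGCCTAA"
ensembl_cds = "ATGGCTAA"
print(round(percent_identity(bolbase_cds, ensembl_cds), 1))
```

Dividing by the longer sequence means that a difference in CDS length between the two databases lowers the score, which is one reasonable convention among several.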

cDNA Synthesis and Reverse-Transcriptase PCR Analysis
Total RNA of the samples harvested from 12 genotypes of four B. oleracea subspecies was extracted using the RNeasy mini kit (Catalogue No. 74106, Qiagen, Valencia, CA, USA). For cDNA synthesis, 5 μg total RNA, 1 μL gene-specific primer, 1 μL annealing buffer and 8 μL RNase-free water were combined in a 0.2 mL thin-walled PCR tube on ice. Gene-specific primers were designed using the Primer3 website (http://primer3.ut.ee/) (Table 3). A PrimeScript-based kit (Takara Bio, Inc., Shiga, Japan) was used for cDNA synthesis. There were two biological replicates for each genotype and gene combination. The RT-PCR experiment was repeated twice for each gene and genotype combination.
Table 2 notes: * participates in biosynthesis of both aliphatic and indolic GSLs; ** these genes have the same CDS sequence in both databases; *** the number of matching base pairs between the Bolbase sequence (http://www.ocri-genomics.org/bolbase/) and the EnsemblPlants sequence (http://plants.ensembl.org/Brassica_oleracea/Info/Index?db=core); **** comparison between the B. rapa and B. oleracea EnsemblPlants sequences.

Extraction of Desulfo-Glucosinolates and HPLC Analysis
Desulfoglucosinolates from the selected samples were isolated using the HPLC protocol previously used by Choi et al. [47], with modifications. Fresh leaf tissue (100 mg) was sampled, snap-frozen in liquid nitrogen and stored in a −80 °C freezer. The frozen samples were ground and treated with 1 mL 70% alcohol, followed by incubation at 70 °C in a water bath for 10 min and at room temperature for 1 h. The tissue and proteins were precipitated by centrifugation for 8 min at 10,000× g at 4 °C and the supernatant was collected for anion-exchange chromatography. The extraction process was repeated twice and the combined supernatant was collected in a 5 mL tube. The combined supernatants represented the crude glucosinolate extracts. The supernatant was mixed with 0.5 mL each of 50 mM barium acetate and 50 mM lead acetate. The crude glucosinolates were centrifuged again at 2000× g for 10 min, loaded onto a pre-equilibrated column and rinsed three times with 1 mL distilled water, after which 250 μL aryl sulfatase was added for desulfation. The desulfation process was allowed to continue for 16 h and then the desulfated glucosinolates were eluted with 1 mL distilled water. The eluted glucosinolates were centrifuged at 20,000× g for 4 min at 4 °C and passed through a filter (PTFE, 13 mm, 0.2 μm; Advantec, Pleasanton, CA, USA) to remove any impurities.
The samples were then analyzed with an HPLC system (Waters 2695, Waters, Milford, MA, USA) equipped with a C18 column (Zorbax Eclipse XDB C18, 4.6 mm × 150 mm, Agilent Technologies, Palo Alto, CA, USA). Water and acetonitrile were used as mobile phase solvents. The flow rate was set at 0.4 mL·min⁻¹ and the temperature at 30 °C. The desulfoglucosinolates were detected at 229 nm using a UV-visible detector (PDA 996, Waters). Commercially available sinigrin served as the glucosinolate standard, and a sinigrin standard curve was used to quantify the amount of glucosinolates in the samples. Individual glucosinolates were identified by mass spectrometry analysis (HPLC/MS, Agilent 1200 series, Agilent Technologies) using an electrospray ionization interface operated in positive ion mode.
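Quantification against a sinigrin standard curve amounts to fitting a straight line through the peak areas of known standards and inverting it for each sample peak. The following is a minimal sketch under stated assumptions: all concentrations and peak areas are hypothetical illustration values, the function names are not from the original protocol, and no compound-specific response factors are applied.

```python
def fit_standard_curve(concentrations, peak_areas):
    """Ordinary least-squares fit of peak_area = slope * concentration + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(peak_areas):
        raise ValueError("need at least two paired standard measurements")
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def area_to_concentration(peak_area, slope, intercept):
    """Invert the standard curve to estimate the concentration of a sample peak."""
    return (peak_area - intercept) / slope


# Hypothetical sinigrin standards (concentration in arbitrary units, area at 229 nm).
standard_conc = [0.1, 0.2, 0.4, 0.8]
standard_area = [105.0, 205.0, 405.0, 805.0]

slope, intercept = fit_standard_curve(standard_conc, standard_area)
sample_conc = area_to_concentration(305.0, slope, intercept)
print(slope, intercept, sample_conc)
```

Sample peaks that fall outside the range of the standards would be extrapolated by this line, so in practice the standards should bracket the expected sample areas.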

Statistical Analysis
The HPLC measurements of glucosinolate contents were analyzed via one-way analysis of variance using MINITAB 17 statistical software (Minitab Inc., State College, PA, USA). Where the analysis of variance indicated a statistically significant difference, pairwise comparisons of means were conducted following Tukey's procedure.
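The analysis itself was run in MINITAB. Purely as an illustration of what the one-way analysis of variance computes, the F statistic can be sketched in plain Python as the ratio of the between-group to the within-group mean square. The group values below are hypothetical, and the p-value and Tukey comparisons are left to a statistics package.

```python
def one_way_anova(groups):
    """Return (F statistic, between-group df, within-group df) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical contents of one glucosinolate in three edible organs, three genotypes each.
leaves = [1.0, 2.0, 3.0]
stems = [2.0, 3.0, 4.0]
florets = [5.0, 6.0, 7.0]
print(one_way_anova([leaves, stems, florets]))
```

A large F value relative to the F distribution with the returned degrees of freedom is what would justify proceeding to the pairwise comparisons.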

Genes Related to Glucosinolate Biosynthesis and Breakdown
A total of 84 B. oleracea genes orthologous to B. rapa genes related to glucosinolate biosynthesis, transcriptional regulation and breakdown were identified (Table 2). These 84 genes were distributed across all nine chromosomes of B. oleracea (Supplementary Table S1). The aliphatic biosynthesis, indolic biosynthesis and transcription factor-related genes identified in this study were not clustered on any particular chromosome (Supplementary Table S1). The highest numbers of aliphatic and indolic biosynthesis genes were located on chromosomes C4 and C3, respectively (Supplementary Table S1). Nine genes could not be assigned to a chromosome, and three AOP2 genes, related to aliphatic glucosinolate biosynthesis, were absent in Bolbase although they were present in EnsemblPlants (Table 2, Supplementary Table S1). Per Bolbase, the highest and the lowest numbers of glucosinolate genes per chromosome were 14 and four, on chromosomes C3 and C2, respectively. There was more than 93% identity in coding sequence (CDS) for 77 genes between Bolbase and EnsemblPlants (Table 2). Three MAM1/2 genes and one SUR1 gene showed 78%-89% identity between the two databases. CDS regions differed in number of nucleotides between the two databases for several genes (Table 2). The aliphatic and indolic glucosinolate pathways involved two sets of 32 genes and seven shared genes, including three genes for GGP1, four genes for SUR1 (Bol038764 and Bol038765 have identical CDS) and one gene for UGT74B1 (Table 2, Figure 1). Twenty genes were transcription-related and five genes were related to aglucone biosynthesis through the breakdown of glucosinolates (Figure 1). MYB28 and MYB29 are aliphatic transcription factor-related genes, and MYB51, MYB122 and MYB34 are indolic transcription factor-related genes in B. oleracea (Figure 1).
The pairs of genes Bol000201 and Bol019784 for TFL2, Bol033373 and Bol033374 for GS-OH, and Bol038764 and Bol038765 for SUR1 each shared the same CDS (Table 2).

Glucosinolate Analysis in B. oleracea Subspecies
HPLC analysis revealed the presence of 16 different types of glucosinolates in three different edible organs of four different subspecies of B. oleracea (Supplementary Table S2, Table 4). Cabbage leaves contained 12 glucosinolates, kale leaves contained 10, kohlrabi stems contained 11 and cauliflower florets contained 14. Gluconapin, glucoalyssin, gluconapoleiferin and 4-hydroxy glucobrassicin were identified only in the florets of cauliflower (Table 4). Glucoerucin was found only in the cabbage leaves, and glucoiberverin was identified in the cabbage leaves and kohlrabi stems (Table 4). The absolute amount of the three aliphatic glucosinolates gluconapin, glucoiberverin and glucobrassicanapin differed significantly or marginally among the edible organs (Table 4). Stems of kohlrabi contained the most glucoiberverin and glucobrassicanapin, while the florets of cauliflower contained the highest gluconapin content compared to other edible organs (Table 4). Cauliflower florets recorded the highest content of methoxyglucobrassicin and gluconasturtiin, which are respectively indolic and aromatic glucosinolates (Table 4). Out of the 11 aliphatic glucosinolates identified in three types of edible organs in 12 genotypes of four B. oleracea subspecies, gluconapin, glucoalyssin and gluconapoleiferin were detected only in the florets of cauliflower (Table 4, Supplementary Table S3).

Discussion
In this study, a total of 84 genes related to glucosinolate biosynthesis in B. oleracea were compared. Furthermore, the expression of those genes and the biosynthesis of glucosinolates in the edible organs were monitored in four B. oleracea subspecies. The study revealed a disparity in chromosome position for glucosinolate biosynthesis genes between the Bolbase and EnsemblPlants databases (Supplementary Table S1). Moreover, the number of nucleotides in the CDS for several glucosinolate-related genes differs between Bolbase and EnsemblPlants (Table 2). These observations suggest that further investigation and validation of those two databases are required. Future experiments will target cloning and sequencing of the mismatched CDS. In the present study, the 84 genes identified and expressed are expected to have high similarity with those of Arabidopsis thaliana and Brassica rapa, which have high ancestral synteny [35]. Beyond those 84 genes, a recent study identified the basic helix-loop-helix transcription factors bHLH04, bHLH05 and bHLH06/MYC2 as novel regulators of glucosinolate biosynthesis in Arabidopsis that are essential for basal glucosinolate levels and for the response to the jasmonic acid signaling pathway; GTR1 and GTR2, in turn, are involved in glucosinolate translocation [48]. Therefore, future investigations should include these genes along with the 84 genes reported in Liu et al. [35]. A previous study compared 52 glucosinolate biosynthetic genes between A. thaliana GLS (AtGS) and the draft B. rapa genome using nucleotide BLAST analysis [42]; high nucleotide sequence identity of about 72%-92% for the transcription factor-related genes was noted.
Kim et al. [44] studied a total of 17 transcription factor-related genes in B. rapa ssp. pekinensis involved in glucosinolate biosynthesis through the aliphatic and indolic pathways in leaves, flowers, stems and roots. Similar to our study, expression of transcription factor-related genes was strikingly different in stem samples compared to leaves and florets [44]. Their expression level relative to the reference gene was much higher in young leaves and flowers than in stems [44], similar to the results of the present study. In B. rapa, the highest glucosinolate content was measured in seeds and the lowest in roots and old leaves [44]. The gene Bra035929 (encoding MYB28) in B. rapa exhibited 16- to 552-fold higher transcript levels in stems compared to seeds, young leaves and roots. Notably, the only B. oleracea orthologue of Bra035929, namely Bol036286, was expressed in the stem samples of all three kohlrabi genotypes, as well as in the other edible organs (Figure 2). A MYB29 gene, Bol08849, an orthologue of Bra005949, which has 11- to 92-fold higher gene expression in stems of B. rapa [44], was expressed only in two genotypes of kale and one genotype of kohlrabi (Figure 2). These results warrant further investigation, as those genes were differentially expressed among genotypes within subspecies.
Both transcription factor-related genes and glucosinolate biosynthesis genes showed differences in expression in different plant organs such as seeds, stems, leaves and flowers in previous studies [25,44]. In A. thaliana, some important glucosinolate biosynthetic genes, such as CYP79B2, UGT74B1, CYP79F1, CYP79F2, IQD1, and Dof1.1, are expressed only in vascular tissues [19,30,31,49,50,51,52,53]. Desulfoglucosinolate sulfotransferase (BrST) isoforms, involved in core glucosinolate biosynthesis in B. rapa, were found to be more highly expressed in mature leaves and roots than in other tissues, displaying functional redundancy for differential expression [53]. In our study, the edible organs of kohlrabi (stems) and those of cauliflower (florets) have very different types of structural and vascular tissues compared to the leaves of the other two subspecies analyzed, cabbage and kale, and hence variation in expression of glucosinolate biosynthesis genes is expected. The fact that only one gene, namely Bol036286, out of five aliphatic transcription factor-related genes was expressed in all 12 genotypes, including the stems of kohlrabi, suggests that expression of this particular gene is essential in B. oleracea to induce desulfo-glucosinolates as precursors of different aliphatic glucosinolates (Figure 2). Similarly, only one indolic transcription factor-related gene, Bol030761, was expressed in all 12 genotypes, suggesting that expression of that gene is needed for continuation of the glucosinolate biosynthetic pathway (Supplementary Figure S1). The genes Bol025706 and Bol030092 are likely to be essential for aglucone biosynthesis from the aliphatic and indolic glucosinolate pathways, respectively (Figures 1 and 2).
Expression analysis further suggests that in the aliphatic biosynthetic pathway two genes, Bol031350 (FMOGS-OX5) and Bo9g006240 (AOP2), successively carry out glucosinolate transformation in the stems of kohlrabi from the primary glucosinolates glucoerucin and glucoiberverin, which are derived from desulfo-glucosinolates produced by three ST5 genes (Figure 1, Supplementary Figure S1). Our results thus indicate that: (i) simultaneous expression of all genes is not required for glucosinolate biosynthesis in a particular organ; and (ii) the expression of a single gene or a few genes from each step is required to complete glucosinolate biosynthesis. In addition, in the stems of kohlrabi, MYB28 expression and aliphatic glucosinolates were both detected, but expression of genes related to side-chain elongation was extremely low compared to that in other subspecies. This suggests either the involvement of other, recently reported transcription factors [48] or the possibility that glucosinolates were transported to the stem.
Glucosinolate concentrations are commonly estimated on a tissue dry weight basis. The variation in glucosinolate concentrations we found in the different edible parts might be related to the fact that leaves, stems and florets have differences in water content. Accordingly, glucosinolate concentration on a tissue fresh weight basis could be different from that on a tissue dry weight basis. Thus, the variation observed in glucosinolate content in our study comparing tissues on a dry weight basis might be explained as a methodological variation.
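The dependence on water content can be made concrete with a small calculation. As a sketch only, assuming the water fraction of the fresh tissue is known, a dry-weight concentration converts to a fresh-weight concentration by multiplying by the dry matter fraction (one minus the water fraction). The water fractions and the concentration below are hypothetical illustration values, not measurements from this study.

```python
def dry_to_fresh_weight_basis(conc_per_g_dry: float, water_fraction: float) -> float:
    """Convert a concentration per gram dry weight to per gram fresh weight.

    water_fraction is the mass fraction of water in the fresh tissue (0 <= w < 1).
    """
    if not 0.0 <= water_fraction < 1.0:
        raise ValueError("water_fraction must be in the range [0, 1)")
    dry_matter_fraction = 1.0 - water_fraction
    return conc_per_g_dry * dry_matter_fraction


# Hypothetical example: the same dry-weight content in two organs with different water content.
content_dw = 10.0  # e.g. micromoles per gram dry weight
print(dry_to_fresh_weight_basis(content_dw, 0.90))  # tissue with 90% water
print(dry_to_fresh_weight_basis(content_dw, 0.80))  # tissue with 80% water
```

Under these assumed values, two organs with identical dry-weight contents differ twofold on a fresh-weight basis, which is why the basis of expression matters when organs are compared.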
Velasco et al. [54] found that glucosinolate concentration in the floral parts of the B. oleracea acephala subspecies greatly increases from 300 days of age, but that it decreases rapidly in the leaf samples of the same plants. In this study, we measured glucosinolate concentration at only one time point. Similar to our study, the presence of glucoiberin, sinigrin and glucobrassicin was previously reported in all different subspecies of B. oleracea [55][56][57]. Likewise, in this study, other glucosinolates such as glucoraphanin, progoitrin, glucobrassicin, methoxyglucobrassicin, neoglucobrassicin and gluconasturtiin were also detected in all three types of edible organs, namely the leaves, stems and florets (Table 4). In B. oleracea var. italica, the patterns of glucosinolates were found to be mainly controlled genetically and less affected by environmental factors, but several agronomic and environmental factors strongly influence the absolute content of various glucosinolates [2,58]. In particular, biosynthesis of aliphatic glucosinolates was found to be under strong genetic control in broccoli, whereas that of indolic glucosinolates was controlled by genetic and environmental factors and by their interactions [59,60]. For example, high nitrogen and high sulphur content were found to increase the content of indolic glucobrassicin in cabbage cultivars [61,62].
Glucoraphanin and glucoiberin are the two most desirable glucosinolates from a nutritional perspective, whereas 2-hydroxy-3-butenyl (progoitrin) glucosinolate is undesirable because upon hydrolysis it produces oxazolidine-2-thione, which causes goiters and other harmful effects in mammals [56,63]. Glucoraphanin and glucoiberin were found in all four subspecies (Table 4). In this study, one of the cauliflower genotypes contained no measurable progoitrin (Supplementary Table S3). Wang et al. [56] found comparatively higher progoitrin in commercial broccoli genotypes compared to inbred lines, 1.77-6.07 μmol·g⁻¹, and reported that it contributed around 20% of the total glucosinolates measured in that subspecies. Generally, B. rapa is abundant in that undesired glucosinolate [63]. The glucosinolates gluconapin, sinigrin, progoitrin, glucobrassicin and neoglucobrassicin show chemoprotective activity, but produce bitter and pungent isothiocyanates [60], so an excessive content might decrease consumer preference [64]. Cauliflower florets contained all five of these glucosinolates, whereas all but gluconapin were identified in all four subspecies under study (Table 4).
Among the four subspecies, the florets of cauliflower contained the highest number of glucosinolates with the lowest absolute content of progoitrin. This study thus showed that natural variation in glucosinolate types and their absolute contents exists among the edible organs of different B. oleracea subspecies, a finding that might be useful in breeding for glucosinolate contents or in transformation studies.

Conclusions
In this study, a total of 84 genes related to aliphatic, indolic and aromatic glucosinolate pathways or transcription/breakdown were subjected to RT-PCR-based analysis of expression in the edible organs of four subspecies of B. oleracea. Only eight genes were expressed in the stem samples of kohlrabi, whereas the majority of those genes were expressed in leaves of cabbage or kale and florets of cauliflower. The results warrant further investigation, as genotypic variation within subspecies is also evident along with differences between subspecies. Out of 16 different types of identified glucosinolates, only five differed among the edible organs of four subspecies. Stems of kohlrabi contained the most glucoiberverin and glucobrassicanapin, whereas the florets of cauliflower had the highest contents of glucoraphanin, methoxyglucobrassicin and gluconasturtiin in the four-month-old plants. Overall, cauliflower florets had the highest number of glucosinolates and lacked undesirable progoitrin in a genotype-dependent manner.