Influence of Genotype on High Glucosinolate Synthesis Lines of Brassica rapa

This study investigated doubled haploid (DH) lines derived from a cross between a high-glucosinolate (HGSL) parent, Brassica rapa ssp. trilocularis (yellow sarson), and a low-glucosinolate (LGSL) parent, B. rapa ssp. chinensis (pak choi). In total, 161 DH lines were generated. The glucosinolate (GSL) content of the HGSL DH lines ranged from 44.12 to 57.04 μmol·g−1 dry weight (dw), which is comparable to the level of the high-GSL B. rapa ssp. trilocularis (47.46 to 59.56 μmol·g−1 dw). We resequenced five of the HGSL DH lines and three of the LGSL DH lines. Recombination blocks were identified between the parental and DH lines using 108,328 single-nucleotide polymorphisms (SNPs) across all chromosomes. Among the measured GSLs, gluconapin was the major GSL in the HGSL DH lines. Among the HGSL DH lines, BrYSP_DH005 had glucoraphanin levels approximately 12-fold higher than those of the HGSL mother plant. The hydrolysis capacity of GSL was analyzed in the HGSL DH lines with a Korean pak choi cultivar as a control. Bioactive compounds, such as 3-butenyl isothiocyanate, 4-pentenyl isothiocyanate, 2-phenethyl isothiocyanate, and sulforaphane, were present in the HGSL DH lines at levels 3-fold to 6.3-fold higher than in the commercial cultivar. The selected HGSL DH lines, resequencing data, and SNP identification were utilized for genome-assisted selection to develop elite GSL-enriched cultivars and for the industrial production of potentially anti-cancer metabolites such as gluconapin and glucoraphanin.

Figure 1. Chemical structure of glucosinolates and hydrolyzed bioactive compounds. Structures were drawn in ChemSketch software using SMILES notation from the ChemSpider database (www.chemspider.com (accessed on 7 June 2021)).
The biosynthesis of GSL is a complex process, and more than 130 GSLs have been identified so far [8]. The diversity of GSL structures is controlled by genetic variation among individual genotypes, especially at loci involved in the initial elongation and side-chain modification reactions [9]. In addition, the molecular function of a gene can vary with the plant species, allelic condition, and polymorphic state of the regulatory network controlling it [10]. According to previous work, four major loci, GS-ELONG, GS-OX, GS-AOP (GS-ALK and GS-OHP), and GS-OH, control differences in the accumulation of aliphatic GSLs [6,11,12]. GS-ELONG comprises the methylthioalkylmalate synthase genes MAM1, MAM2, and MAM3, which regulate the chain length of GSLs [13,14]. The GS-OX locus contains flavin monooxygenase (FMOGS-ox) genes, and GS-AOP comprises two tightly linked loci, GS-ALK encoding the 2-oxoglutarate-dependent dioxygenase AOP2 and GS-OHP encoding AOP3; these side-chain modification reactions determine the type of GSL product [15][16][17]. AOP2 genes are responsible for the conversion of glucoraphanin (GRA) to gluconapin (GNA). In most B. rapa, GNA makes up the major proportion of total GSL. However, stop codons render BoAOP2.2 and BoAOP2.3 non-functional in B. oleracea, so the major GSL in B. oleracea is GRA instead of GNA [18][19][20]. The GS-OH locus controls the oxidation of 3-butenyl GSL to hydroxyalkenyl GSL [6,21]. Most importantly, the accumulation and content of GSL are strongly influenced by R2R3-myeloblastosis (MYB) transcription factors (TFs) [22]. MYB28, MYB29, and MYB76 regulate the aliphatic GSL genes, while MYB34, MYB51, and MYB122 regulate the indolic GSL genes [23]. For fine mapping and selection of individual genotypes, it is necessary to identify allelic differences at the key loci and in the different TFs.
Resequencing provides the opportunity to develop a vast number of novel markers. Identification of genes related to important agronomic traits, genetic diversity analysis, and characterization of environmental influences are essential in genome-assisted breeding for crop improvement [24]. Combining quantitative trait loci (QTL) mapping with whole-genome sequencing helps in the precise detection of functional loci for traits of interest and their candidate genes. Major advantages of resequencing are functional allele mining and single-nucleotide polymorphism (SNP) discovery across large populations. The identification of trait loci in biparental cross populations by high-resolution linkage mapping and functional allele mining, using SNPs and insertions-deletions (InDels) as markers, provides a powerful complementary strategy to genome-wide association studies (GWAS) [25]. Recently, Chen et al. (2016) resequenced 199 and 119 accessions representing 12 and nine morphotypes of B. rapa and B. oleracea, respectively. These resequencing data aided in the identification of leaf-heading and tuberous morphotypes associated with sub-genome parallel selection during diversification and domestication of B. rapa and B. oleracea [26]. Meanwhile, the resequencing of 588 B. napus accessions led to the identification of A and C sub-genome origins [27]. Similarly, genome mapping helped to characterize genes responsible for clubroot resistance [28], blackleg resistance [29], black rot resistance [30], and flowering time [31].
B. rapa is one of the model plants for GSL metabolism and polyploid species, with phenotypically diverse cultivated subspecies. In our previous study, we determined that eight subspecies of B. rapa have different GSL levels, ranging from 4.42 µmol·g−1 dry weight (dw) in B. rapa ssp. narinosa to 53.51 µmol·g−1 dw in B. rapa ssp. trilocularis [32]. The current study was conducted to identify SNPs based on resequencing between high GSL (HGSL) and low GSL (LGSL) content doubled haploid (DH) lines generated from two different parents, B. rapa ssp. trilocularis (yellow sarson) and B. rapa ssp. chinensis (pak choi). Yellow sarson is an oil plant with relatively high amounts of several beneficial GSLs, whereas pak choi is one of the major leafy vegetables in Korean and Chinese diets but contains smaller amounts of GSL. The aim of this work is to produce edible DH lines with high levels of beneficial GSLs and low levels of toxic compounds such as progoitrin (PRO). Detailed analysis of recombinant blocks between the HGSL and LGSL lines was carried out based on GSL biosynthetic pathways. The results from this study could be useful for precise genome-assisted selection in the development of elite B. rapa cultivars with enriched GSL content for commercial purposes.

Generation of BrYSP DH Lines and GSL Content Profiles
Accessions YS-033 (CGN06835) and PC-099 (CGN132924) [33] were maintained in our greenhouse. As neither of these accessions was a perfect inbred genotype, YS-033 was selfed for four generations (S4) and the resulting line was named LP08. A DH plant of PC-099 was named LP21. LP08 and LP21 were used as the set of homozygous parents. The F1 plant was developed using LP08 as the female and LP21 as the male parent (Figure 2). Microspore culture followed our previously described method [34,35]. Leaf edge shapes segregated within the range bounded by the female and male parental phenotypes (Figure 2, middle). Leaf edge shape was classified as (a) entire, (b) slightly serrated, (c) intermediately serrated, or (d) very serrated. Our population was named "BrYSP_DH000" in accordance with B. rapa (Br), yellow sarson + pak choi (YSP), doubled haploid (DH), and a unique three-digit number for each line. The regulation of GSL biosynthesis was then examined in Chiffu, LP08, LP21, F1, and 161 BrYSP_DH plants.

Resequencing of Parents, HGSL Lines, and LGSL Lines
Paired-end reads of 150 bp were generated from 10 sequencing libraries with a 350 bp insert size. Resequencing of LP08 and LP21 produced around 775 and 704 million raw reads, respectively. The total sequence obtained was 78.2 Gb for LP08 and 71.1 Gb for LP21. The DH lines yielded from around 125.5 to 192.4 million clean reads. These reads were mapped to the B. rapa v3.0 reference genome [36]. The percentage of mapped reads differed between the parents: 90.65% for LP08 and 93.19% for LP21. The DH lines had slightly higher and more uniform mapping percentages, ranging from 94.8% to 95.98%. Genome coverage depth was 157.05× for LP08 and 156.07× for LP21. The DH lines were covered at depths from 46.89× (BrYSP_DH059) to 70.80× (BrYSP_DH026) (Table 2).

SNP Genotyping and InDels
Approximately 3.5 million and 2.5 million SNPs were predicted in LP08 and LP21, respectively, relative to the B. rapa v3.0 reference genome [36]. SNP density was calculated as 27.5 per kb in LP08 and 22.2 per kb in LP21. More InDels were identified in LP08 (724,760) than in LP21 (552,220). On average, 2,707,400 high-quality SNPs and 617,408 InDels were identified in the DH lines. Details regarding the SNPs and InDels of the parents and DH lines are provided in Table S1.

Identification of GSL Biosynthesis-Specific Recombinant Blocks
Overall, 108,328 variants were extracted as allelic differences between LP08 and LP21, and the recombinant block search identified 342 recombinant blocks among the genotypes (Table S2, Figure S1). Out of 110 GSL biosynthetic genes, 75 were identified in recombinant blocks and mapped to their respective chromosomes (Figure 3). A recombinant block spanning 12,929,867-23,248,122 bp of A03 and containing 3716 SNPs was consistently present only in the HGSL lines (Table 3). The ten GSL synthesis genes positioned between 12.9 Mb and 23.2 Mb of A03 differed consistently between all resequenced HGSL and LGSL lines. In detail, the TFs MYB28.1 (aliphatic GSLs) and MYB34.2 (indolic GSLs) were of the LP08 type in the HGSL lines. Similarly, for the chain elongation step, branched-chain amino acid aminotransferase 4 (BCAT4; Bra001761), MAM1 (Bra013007), MAM3 (Bra013009 and Bra013011), and bile acid transporter 5 (BAT5; Bra000760) of the HGSL lines were of the LP08 type. AOP1 (Bra000847) and AOP2 (Bra000848) were the two key genes derived from LP08 in the HGSL lines for the side-chain modification process. APS kinase 1 (APK1; Bra013120), one of the genes involved in sulfur donation from the chloroplast to sulfotransferase for the production of desulfo GSL, was of the LP08 type in all HGSL lines (Table 3).
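The idea of a recombinant block can be illustrated with a minimal Python sketch. It assumes that each parent-informative SNP in a DH line has already been labeled with its parental origin ("LP08" or "LP21"); the function name `call_blocks` and the toy marker positions are hypothetical and do not reproduce the pipeline used in this study. Consecutive markers of the same origin on one chromosome are merged into a block, and the last lines compute the SNP density of the A03 block from the coordinates and SNP count reported above.

```python
from itertools import groupby

def call_blocks(markers):
    """Merge consecutive same-origin markers into blocks.

    markers: list of (chrom, pos, origin) tuples sorted by chrom, then pos.
    Returns a list of (chrom, start, end, origin, n_markers) tuples.
    """
    blocks = []
    # groupby only merges adjacent items, so each run is one contiguous block.
    for (chrom, origin), run in groupby(markers, key=lambda m: (m[0], m[2])):
        run = list(run)
        blocks.append((chrom, run[0][1], run[-1][1], origin, len(run)))
    return blocks

# Toy, hypothetical markers for one DH line (not data from this study).
toy_markers = [
    ("A03", 100, "LP21"), ("A03", 250, "LP21"),
    ("A03", 400, "LP08"), ("A03", 900, "LP08"), ("A03", 1500, "LP08"),
    ("A04", 120, "LP08"), ("A04", 300, "LP21"),
]
blocks = call_blocks(toy_markers)

# SNP density of the A03 block, using the figures reported in the text.
block_start, block_end, block_snps = 12_929_867, 23_248_122, 3716
block_len_bp = block_end - block_start + 1
snps_per_kb = block_snps / (block_len_bp / 1000)
print(blocks)
print(f"A03 block: {block_len_bp / 1e6:.2f} Mb, {snps_per_kb:.2f} SNPs per kb")
```

Grouping on the pair (chromosome, origin) keeps a block from running across a chromosome boundary even when the origin label is unchanged.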
Although all MAM genes were of the LP08 type in BrYSP_DH005, three MAM genes were of the LP21 type in the other HGSL DH lines. Similarly, all MAM genes were of the LP21 type in BrYSP_DH059 and BrYSP_DH061, but three were of the LP08 type in BrYSP_DH009. In the core structure synthesis phase, most genes of BrYSP_DH005 were identified as LP21 type, whereas about 50% of these genes were LP08 type in the other HGSL lines. Notably, BrYSP_DH014 and BrYSP_DH026, as well as BrYSP_DH016 and BrYSP_DH017, shared similar parental-type recombinant blocks of GSL biosynthetic genes.
Table notes. Column groups: side-chain elongation; side-chain modification (aliphatic). Bold letters indicate genes for which all HGSL lines had the LP08 genotype and all LGSL lines had the LP21 genotype. Pink, LP08; sky-blue, LP21.
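The criterion in the table note (LP08 in every HGSL line and LP21 in every LGSL line) can be written as a simple filter. The sketch below is illustrative only: the grouping of line names is inferred from the text, and the gene labels `geneA` to `geneC`, their genotype calls, and the helper names are hypothetical placeholders rather than data from Table 3.

```python
# Line names as they appear in the text; the HGSL/LGSL grouping is inferred.
HGSL = ["BrYSP_DH005", "BrYSP_DH014", "BrYSP_DH016", "BrYSP_DH017", "BrYSP_DH026"]
LGSL = ["BrYSP_DH009", "BrYSP_DH059", "BrYSP_DH061"]

def diagnostic_genes(calls, high=HGSL, low=LGSL):
    """Return genes that are LP08 in every high line and LP21 in every low line.

    calls: dict mapping gene ID -> dict mapping line name -> "LP08" or "LP21".
    """
    return sorted(
        gene for gene, by_line in calls.items()
        if all(by_line[line] == "LP08" for line in high)
        and all(by_line[line] == "LP21" for line in low)
    )

def make_calls(high_type, low_type):
    """Build a toy genotype row with one type per phenotype group."""
    row = {line: high_type for line in HGSL}
    row.update({line: low_type for line in LGSL})
    return row

# Hypothetical genotype calls, for illustration only.
toy_calls = {
    "geneA": make_calls("LP08", "LP21"),  # meets the criterion
    "geneB": make_calls("LP08", "LP08"),  # same type in both groups
    "geneC": {**make_calls("LP08", "LP21"), "BrYSP_DH005": "LP21"},  # one exception
}
result = diagnostic_genes(toy_calls)
print(result)
```

A single exception in either group, as for `geneC`, is enough to exclude a gene, which matches the strict wording of the note.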

Comparative Analysis of GSL Pathway between Individual Genotypes
A stepwise comparative analysis of the GSL biosynthetic pathway between the HGSL line BrYSP_DH005 and the LGSL line BrYSP_DH059 was performed based on the recombinant blocks of parents LP08 and LP21. Key differences were observed between BrYSP_DH005 and BrYSP_DH059 in the amino acid elongation and side-chain modification steps and in TFs such as MYB28 and MYB29. All MAM1, AOP1, AOP2, and GSL-OH genes of BrYSP_DH059 were present as LP21 recombinant blocks. In contrast, all MAM1 copies, one AOP1 (Bra000847), two AOP2 (Bra018521 and Bra000848), and one GSL-OH (Bra022920) were LP08 blocks in BrYSP_DH005. Interestingly, all genes of the indolic and aromatic pathways in BrYSP_DH059, except ST5a (Bra024634) and UGT74B1 (Bra024634), were LP21 recombination blocks. In contrast, critical genes for the synthesis of glucobrassicin (GBA) in BrYSP_DH005, such as ST5a (Bra008132 and Bra015935) and CYP81F2 (Bra020459 and Bra006830), were LP08 blocks. Except for MYB28.2 (Bra035929), TFs involved in the biosynthesis of aliphatic GSLs were LP08 blocks in BrYSP_DH005, whereas the aliphatic TFs of BrYSP_DH059 were LP21 blocks. A similar trend was observed for the MYB34 TFs. Other than MYB34.2 (Bra013000), all MYB34 homologues in BrYSP_DH005 were LP08 types, whereas the MYB34 TFs in BrYSP_DH059 remained of the LP21 genotype (Figure 4).

GSL Hydrolysis Products
The main hydrolysis products of GSL in the HGSL lines, namely 3-butenyl isothiocyanate (BITC), 4-pentenyl isothiocyanate (4-PEITC), 2-phenethyl isothiocyanate (2-PEITC), and sulforaphane (SFN), were compared with those of a commercial pak choi cultivar used in South Korea. All five representative HGSL DH lines contained increased amounts of hydrolysis products. Overall, BrYSP_DH014 had the highest level of hydrolysis products (870.29 µg·g−1 dw), about 6.3-fold higher than that of the commercial cultivar. The HGSL line with the lowest level of hydrolysis products was BrYSP_DH017 (about 417.5 µg·g−1 dw), which was still 3.0-fold higher than that of the control pak choi. Most importantly, the amount of the anti-carcinogenic agent SFN in BrYSP_DH005 was 20.2 µg·g−1 dw. This was 7- to 10-fold higher than in the other HGSL DH lines and nearly 35 times higher than in the commercial cultivar (Table 4).
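The fold values in this paragraph are ratios to the commercial cultivar, whose absolute level is given in Table 4 rather than in the text. As a quick consistency check, the short sketch below back-calculates the control level implied by each reported pair of values. The function names are hypothetical, and the implied levels are derived arithmetic, not measured values.

```python
def fold_change(value, reference):
    """Ratio of a measured level to a reference level (same units)."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference

def implied_reference(value, fold):
    """Reference level implied by a measured value and a reported fold change."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return value / fold

# Values reported in the text (µg per g dw).
highest_total, highest_fold = 870.29, 6.3   # BrYSP_DH014
lowest_total, lowest_fold = 417.5, 3.0      # BrYSP_DH017

ctrl_from_highest = implied_reference(highest_total, highest_fold)
ctrl_from_lowest = implied_reference(lowest_total, lowest_fold)
print(f"implied control: {ctrl_from_highest:.1f} and {ctrl_from_lowest:.1f} µg/g dw")
```

The two implied control levels agree to within about 1%, which is what rounding of the reported fold values would be expected to produce.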

Nitrile Formation
Among the five HGSL DH lines tested, BrYSP_DH005 showed significantly lower nitrile formation than the other HGSL DH lines in both SNG and GNT substrate-based assays (Figure 5). The lower SNG substrate-based nitrile formation may explain why BrYSP_DH005 generated higher concentrations of SFN than the other HGSL DH lines (Table 4). Although GNT-based nitrile formation was lower in BrYSP_DH005 than in the other HGSL DH lines, it was still higher than in the commercial pak choi.

Discussion
In this study, representative lines were selected from 161 previously developed DH lines with HGSL and LGSL content for metabolite analysis, resequencing, SNP mapping, and GSL biosynthetic pathway analysis. Total GSL content ranged from 44 to 57 µmol·g−1 dw in the HGSL DH lines (Table 1). These amounts were significantly higher than those in the inbred line Brassicoraphanus "BB1" [7] and broccoli (B. oleracea var. italica) [38]. Glucoraphasatin is the major GSL of Raphanus sativus [39] and Brassicoraphanus "BB1" [7], whereas SNG is the major GSL in mustard [1] and horseradish [40] cultivars. Similar to our study, GNA was found to be the predominant GSL in B. rapa "Chinese cabbage". However, GNA accounted for only around 54% of total GSL in that report [41], whereas in our study it accounted for 80% to 91%. Although GNA is the major GSL in the HGSL DH lines, BrYSP_DH005 also contained considerable amounts of other important metabolites such as GRA, GAL, NGBS, and GNT (Table 1). Because of its higher content of several beneficial GSLs, BrYSP_DH005 was selected as a key genotype, and a detailed analysis of its GSL pathway was carried out in comparison with the representative LGSL line BrYSP_DH059 (Table 1).
In any hybrid study, SNPs and InDels are valuable for developing linkage maps of trait loci in the cross-population lines. High-density SNP and InDel polymorphism markers could be a valuable resource for genetic-linkage studies and precise QTL mapping of desirable traits [24,42] in Brassica (Table S1, Figure S1). The recombinant block predicted between 12.9 and 23.2 Mb of chromosome A03 in the GSL-rich parent, all five HGSL lines, and three LGSL lines indicates that this is a key region that should be the focus of GWAS (Table 3, Figure 3). GWAS analysis in combination with metabolite profiling has gained widespread acceptance for assessing natural variation between populations [43]. It has been noted that decreases in a few transcripts can have a major impact on the accumulation of GSL [22]. Epistatic effects among transcript expression levels are too complex to be mirrored directly in GSL content. Resequencing followed by recombinant block analysis gave a clear picture of the genes derived from the HGSL and LGSL parents (Table S2, Figure 3). The results of this study indicate that the integration of critical recombinant blocks from parents LP08 and LP21 activated the genes involved in biosynthesis, transport, and regulation in the GSL metabolic pathway (Figures 2-4).
In the first step of chain elongation, BCAT4 deaminates Met and homoMet to the corresponding 2-oxo acids. Two copies of BAT5 serve as importers of 2-oxo acids into the plastid. Of the two copies of BCAT4, one (Bra001761) was of the LP08 genotype in all HGSL lines. The other copy (Bra022448) was also LP08-type in all HGSL DH lines except BrYSP_DH005. Both copies of BAT5 were LP08-type in BrYSP_DH005. Individual bcat4 and bat5 mutants exhibit an approximately 50% reduction in the level of aliphatic GSLs [48]. As the GS-ELONG locus contains the MAM genes, the side-chain length of a GSL is determined by the number of elongation cycles it undergoes in the initial steps [14]. Gene duplication, neo-functionalization, and polymorphism of MAM1 lead to the diversification of GSL profiles [49]. Following isomerization by isopropylmalate (IPM) isomerase and oxidative decarboxylation by IPM dehydrogenase (IPM-DH), the 2-oxo acid yields homoMet and chain-elongated derivatives of Met that enter the core GSL structure pathway [6].
There are five genes in the GS-OX locus of A. thaliana (FMOGS-ox1-5), but only one copy of FMOGS-ox2 and two copies of FMOGS-ox5 have been identified in B. rapa. This locus is responsible for the oxygenation of glucoibervirin, glucoerucin, and glucoberteroin, depending on the co-expression of TFs for aliphatic GSLs [15,16]. AOP2 belongs to GS-ALK, which is responsible for the conversion of S-oxygenated GSLs, and AOP3 belongs to GS-OHP, which is responsible for the conversion to hydroxyalkyl GSLs [6]. Our fine mapping of key loci such as BCAT4, MAM1, BAT5, AOP2, and GS-OH in recombinant inbred lines showed high production of GNA with less conversion to PRO (Figure 4, Table 1, Table 3 and Table S2).
Cytochrome P450 (CYP) CYP79F1 catalyzes the conversion of all chain-elongated Met derivatives. CYP79F2 acts mainly on long-chained Met derivatives, but no copies of CYP79F2 have been found in B. rapa [50]. CYP79B2 and CYP79B3 act on indolic derivatives, whereas CYP79A2 acts on aromatic derivatives [51]. The resulting aldoximes are then converted into nitrile oxides or aci-nitro compounds by CYP83A1 for Met derivatives and by CYP83B1 for Trp and Phe derivatives [52]. Glutathione-S-transferase (GSTF) is involved in the conversion of these compounds into S-alkyl-thiohydroximates. In this step, sulfur is supplied to the alkyl-thiohydroximate in the form of glutathione (GSH) [53]. GSH is a tripeptide in which glutamate is joined to cysteine through a gamma peptide linkage, followed by glycine. S-alkyl-thiohydroximates are converted into thiohydroximates by four copies of γ-glutamyl peptidase 1 (GGP1) and two copies of superroot1 (SUR1) [54]. S-glucosylation is catalyzed by glucosyltransferases: UDP-glucosyl transferase 74C1 (UGT74C1) and UGT74B1 convert Met-derived and Phe-derived thiohydroximates into desulfo GSL [55]. In the final step of core structure biosynthesis, the sulfate donor 3′-phosphoadenosine-5′-phosphosulfate (PAPS) is produced in two steps. First, ATP sulfurylase (ATPS) produces the intermediate adenosine-5′-phosphosulfate (APS). APS kinase then phosphorylates APS to PAPS. Of the four copies of APK1, Bra013120 was of the LP08 type in all HGSL lines. The other three copies varied among the HGSL DH lines (Table 3). All APK1 and APK2 genes were of the LP21 type in BrYSP_DH059 (Figure 4b). Double mutants of apk1 and apk2 reduce total GSL in A. thaliana by almost 80%. In addition, desulfo GSL accumulates noticeably in the mutants but is present at undetectable levels in wild-type plants [56]. The core structures of GSLs depend on sulfur assimilation, especially the sulfotransferase (ST) reaction [9,55]. Two copies of ST5a act on both Trp- and Phe-derived desulfo GSL. About ten copies of ST5b and one copy of ST5c are involved in the synthesis of aliphatic GSLs, such as glucoibervirin, glucoerucin, and glucoberteroin [57].
Hydrolysis of GSLs by endogenous myrosinase (β-D-thioglucosidase) produces active compounds, such as isothiocyanates (ITCs), nitriles, and indoles [3]. An important ITC, SFN, is reported to be a natural inducer of phase II detoxification enzymes, including glutathione-S-transferase and quinone reductase (QR). SFN triggers cytostasis and apoptosis and also detoxifies xenobiotics [7]. BITC and PEITC have been shown to induce apoptosis in cancer cell lines [8,9]. Nitriles have a weaker chemopreventive effect than ITCs [7]. Although broccoli is a well-known health-promoting vegetable because of its high SFN concentrations, epithiospecifier protein (ESP) activity ranged widely, from 17.1% to 46%, among 20 commercial broccoli cultivars [58]. In contrast, the HGSL DH lines of the current study exhibited much lower nitrile formation, and their increased GRA levels may have contributed directly to the induction of phase II detoxification enzymes (Figure 6). Nitrile formation using SNG as a substrate in the HGSL DH lines ranged from 0.44% to 2.66%, which was considerably lower than in 11 commercial mustard cultivars (7.4-62.4%) or USDA fancy horseradish (average, 7.1%) [1,40]. Although GNT-based nitrile formation in BrYSP_DH005 was significantly higher than in commercial pak choi, it was still lower than that recently reported for broccoli [38]. In this study, important ITCs including SFN, BITC, 4-PEITC, and 2-PEITC were the major hydrolysis compounds, along with some nitrile hydrolysis products from GNA (1-cyano-3,4-epithiobutane) and GNT (benzenepropanenitrile) (Table S3). Our multilayered analysis, combining resequencing with SNP-based recombinant block discovery, will be helpful for further fine QTL mapping. This information will be beneficial for the production of elite GSL-enriched cultivars and for the commercialization of potentially anti-cancer metabolites, such as GRA, GAL, NGBS, and GNT, with higher SFN activity.
Figure 6. Comparison of the cancer-preventive effect of a high-glucosinolate DH line with low nitrile formation ability against broccoli, a well-known cancer-fighting vegetable, as a model. Genome-assisted precision breeding in B. rapa achieved high-GSL DH lines that directly contribute to high ITC-mediated restoration of Nrf2/ARE signaling.

Plant Materials
Accessions YS-033 (CGN06835) and PC-099 (CGN132924) [33] were provided by Professor Guusje Bonnema. YS-033 was selfed for four generations (S4) and the resulting line was named LP08. A DH plant of PC-099 was named LP21. LP08 and LP21 were used as the set of homozygous parents. The F1 plant was developed using LP08 as the female and LP21 as the male parent (Figure 2). Microspores were collected and cultured according to our previously described method [34,35]. B. rapa ssp. pekinensis "Chiffu-401-42" was used as the reference.

HPLC Analysis for Identification of GSL Content
GSL content was estimated according to Seo et al. [32]. Fresh leaves of 6-week-old plants were freeze-dried, and 100 mg samples were extracted by boiling with 1.5 mL of 70% (v/v) methanol in a 10 mL test tube for 10 min at 95 °C. Extracts were loaded on Sephadex A25 columns and desulfation was conducted with aryl sulfatase (EC 3.1.6.1) before HPLC. Desulfated GSLs were quantified on an Agilent 1200 Series HPLC System (Agilent Technologies, Santa Clara, CA, USA) equipped with an Inertsil ODS-3 column (150 × 3.0 mm inner diameter, particle size 3 µm; GL Science, Tokyo, Japan). Analysis was performed at a flow rate of 0.4 mL·min−1, a column oven temperature of 35 °C, and a detection wavelength of 227 nm. Sinigrin (SNG) was used as an external standard for quantification. Total and individual GSL contents were calculated as means of three biological replicates.
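The quantification step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Seo et al. [32]: it assumes a sinigrin calibration line through the origin, a relative response factor per GSL, and no dilution during desulfation and elution. The function name `gsl_content`, the slope, the response factor, and the peak areas are all hypothetical; only the 1.5 mL extraction volume and the 100 mg sample mass come from the text.

```python
from statistics import mean, stdev

def gsl_content(peak_area, slope, response_factor, extract_volume_ml, dry_weight_g):
    """Return GSL content in µmol per g dry weight for one peak.

    slope: peak area per (µmol/mL) of sinigrin, from an external calibration
           line assumed to pass through the origin.
    response_factor: relative response factor of the GSL versus sinigrin.
    """
    if slope <= 0 or dry_weight_g <= 0:
        raise ValueError("slope and dry weight must be positive")
    conc_umol_per_ml = peak_area / slope * response_factor
    return conc_umol_per_ml * extract_volume_ml / dry_weight_g

# Hypothetical peak areas for three biological replicates of one line.
replicate_areas = [1200.0, 1260.0, 1140.0]
contents = [
    gsl_content(area, slope=400.0, response_factor=1.0,
                extract_volume_ml=1.5, dry_weight_g=0.1)
    for area in replicate_areas
]
mean_content = mean(contents)
sd_content = stdev(contents)
print(f"{mean_content:.2f} ± {sd_content:.2f} µmol/g dw (n = {len(contents)})")
```

Reporting the sample standard deviation alongside the mean is a design choice of this sketch; the text specifies only that means of three biological replicates were used.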

Resequencing
Genomic DNA was extracted from fresh leaves as previously described [59]. Samples (5 g) were finely ground in liquid nitrogen and transferred to a 50 mL Falcon tube. The powder was mixed with 15 mL of pre-warmed DNA extraction buffer (500 mM NaCl, 100 mM Tris-HCl, pH 8.0, 50 mM EDTA, pH 8.0, and 1.25% SDS; 0.38% sodium bisulfite was added before use and the pH was adjusted to 8.0 with 0.2 N NaOH). The mixture was incubated at 65 °C for 30 min and gently inverted every 10 min. Then, 5 mL of aquaphenol was added and the tube was rotated at 14 rpm for 10 min at room temperature (RT). An equal volume of chloroform:isoamyl alcohol (24:1) was added and the tube was rotated at 20 rpm for 15 min at RT. After centrifugation at 10,000 rpm for 15 min at 15 °C, the supernatant was transferred to a new 50 mL Falcon tube. RNase A (10 µL, 20 mg/mL) was added and the sample was incubated at 37 °C for 10 min. An equal volume of isopropanol was added and mixed gently by inversion. The DNA pellet was carefully lifted out with a closed sterile glass tube, washed several times with 70% ethanol, dried completely under vacuum, and finally dissolved in 0.1× TE buffer. Libraries with an average insert size of 350 bp were constructed from the genomic DNA using the TruSeq Nano DNA Sample Prep Kit according to the manufacturer's protocol. Sequencing was performed as 150 bp paired-end sequencing on the NovaSeq6000 platform. Reads were converted from the binary base call (BCL) format using bcl2fastq v2.20 software with the parameter "--barcode-mismatches 0".

Quantification of Glucosinolate Hydrolysis Products
Freeze-dried sample powder (50 mg) was suspended in 1 mL of distilled water in a 2 mL microcentrifuge tube (Fisher Scientific, Waltham, MA, USA). Hydrolysis products were generated naturally by endogenous myrosinase in darkness for 24 h. Dichloromethane (1 mL) was then added to each sample, followed by centrifugation at 12,000× g for 2 min, and the lower organic layer was carefully collected. The GSL hydrolysis products were quantified using a gas chromatograph (GC) (6890N, Agilent Technologies) coupled to an MS detector (5975B, Agilent Technologies) and equipped with an auto sampler (7683B, Agilent Technologies) and a capillary column (30 m × 0.32 mm × 0.25 µm J&W HP-5, Agilent Technologies). A 1 µL aliquot of the extract was injected into the GC-MS at a split ratio of 1:1. The oven temperature was held at 40 °C for 2 min, then increased to 260 °C at 10 °C/min and held for 10 min. The injector and detector temperatures were set to 200 °C and 280 °C, respectively, and the flow rate of the helium carrier gas was 1.1 mL/min. Peaks were identified using the respective standards [40,41].
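As a worked check of the oven program described above, the run time per injection follows from the two holds and the single linear ramp. The helper below is a minimal sketch; its name and signature are hypothetical, and it assumes no equilibration or cool-down time is counted.

```python
def oven_program_minutes(initial_hold_min, start_c, end_c,
                         ramp_c_per_min, final_hold_min):
    """Total run time (min) of a single-ramp GC oven program."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# 40 °C for 2 min, ramp to 260 °C at 10 °C/min, hold for 10 min.
run_time = oven_program_minutes(2, 40, 260, 10, 10)
print(run_time)  # 2 + 22 + 10 = 34.0 min
```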

Measurement of Nitrile Formation
Nitrile formation (%) was measured to estimate epithiospecifier protein (ESP) activity, as ESP enhances the formation of nitriles over isothiocyanates. Nitrile formation in each sample was determined by incubating concentrated horseradish root extract with the crude protein extract of the sample, followed by analysis using gas chromatography-mass spectrometry (GC-MS). First, concentrated horseradish extract was prepared by mixing 10 g of powdered root sample with 100 mL of 70% methanol. After centrifugation at 4000× g for 5 min, the supernatant was boiled in a beaker until all solvent had evaporated, and the residue was reconstituted in 50 mL of deionized water. Freeze-dried powder (75 mg) of B. rapa DH leaf samples was mixed with 1.5 mL of concentrated "1091" horseradish root extract in a 2 mL microcentrifuge tube and centrifuged at 12,000× g for 2 min. The supernatant (0.6 mL) was transferred to a 1.5 mL Teflon centrifuge tube (Savillex Corporation, Eden Prairie, MN, USA) and mixed with 0.6 mL of dichloromethane. Samples were incubated upside down at RT for 1 h to minimize the loss of volatile compounds. The tubes were then vortexed and centrifuged at 12,000× g for 2 min. The bottom organic layer was injected into a GC (Trace 1310 GC, Thermo Fisher Scientific, Waltham, MA, USA) coupled to an MS detector system (ISQ QD, Thermo Fisher Scientific, Waltham, MA, USA) and an auto sampler (Triplus RSH, Thermo Fisher Scientific), fitted with a DB-5MS capillary column (Agilent Technologies; 30 m × 0.25 mm × 0.25 µm). The oven was held at 40 °C for 2 min, then the temperature was increased to 320 °C at 15 °C/min and held for 4 min. The injector and detector temperatures were set to 270 °C and 275 °C, respectively. The flow rate of the helium carrier gas was 1.2 mL/min. A standard curve was used to quantify the hydrolysis rate of nitriles [40,41].
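The text does not state how the percentage was calculated. One common convention, assumed here and not confirmed by the source, expresses nitrile formation as the nitrile share of the total hydrolysis products (nitrile plus isothiocyanate) from the same precursor. The function name and the input amounts below are hypothetical.

```python
def nitrile_formation_percent(nitrile, isothiocyanate):
    """Nitrile share (%) of total hydrolysis products.

    ASSUMPTION: percentage = nitrile / (nitrile + isothiocyanate) * 100,
    with both amounts in the same units (e.g. µmol from a standard curve).
    """
    total = nitrile + isothiocyanate
    if nitrile < 0 or isothiocyanate < 0 or total <= 0:
        raise ValueError("amounts must be non-negative with a positive total")
    return 100.0 * nitrile / total


# Hypothetical amounts quantified from one sample.
pct = nitrile_formation_percent(nitrile=2.0, isothiocyanate=6.0)
print(pct)  # 25.0
```

Under this convention, a lower percentage indicates lower ESP activity and a hydrolysis profile shifted toward isothiocyanates.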

Statistical Analysis
Analysis of variance (ANOVA) was performed using SAS Enterprise Guide 7.1 (SAS Institute Inc., Cary, NC, USA). Tukey's honest significant difference (HSD) test was performed using Prism 5 software (GraphPad, San Diego, CA, USA).
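The analyses themselves were run in SAS and Prism. The sketch below only illustrates the quantity that a one-way ANOVA tests, the ratio of between-group to within-group mean squares, using the Python standard library. The function name and the data are hypothetical, and neither the p-value nor Tukey's HSD is computed here, since both require statistical distributions that are not in the standard library.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replication")
    grand_mean = mean(x for g in groups for x in g)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values: three lines, three biological replicates each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

For these made-up values the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F(2, 6) = 13. The function divides by the within-group mean square, so it fails if every group has zero variance.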

Conclusions
The present study advances our knowledge of the inheritance of GSL biosynthetic genes in B. rapa for high GSL synthesis and increased ITCs with low concentrations of nitriles. The results of this work will be useful for genome-assisted precision breeding in B. rapa. Metabolite profiling of the DH lines, together with the recombinant block predicted on chromosome A03 (12.9 Mb to 23.2 Mb) from SNPs and InDels, broadens our understanding of GSL biosynthesis and of the key genes responsible for producing beneficial GSLs. The integrative genetic-linkage map provides detailed knowledge of variants in GSL biosynthetic genes in this region of chromosome A03. Further, the recombinant blocks responsible for GSL biosynthesis can be used for the selection and development of GSL-rich edible cultivars of B. rapa.