Single Nucleotide Polymorphism (SNP) Discovery and Association Study of Flowering Times, Crude Fat and Fatty Acid Composition in Rapeseed ( Brassica napus L.) Mutant Lines Using Genotyping-by-Sequencing (GBS)

: Rapeseed is the most important oil crop used in the food and biodiesel industries. In this study, based on single nucleotide polymorphism (SNP) identiﬁed from genotyping-by-sequencing (GBS), and an association study of ﬂowering time, crude fat and fatty acid contents were investigated in 46 rapeseed mutant lines derived from gamma rays. A total of 623,026,394 clean data reads were generated with 6.6 million reads on average. A set of 37,721 ﬁltered SNPs was used to perform gene ontology and phylogenetic analysis. Hierarchical cluster analysis of the rapeseed mutant lines gave eight groups based on ﬂowering time and fatty acid compositions. Gene ontological analysis of the mutant lines showed that many genes displaying SNPs are involved in cellular processes, cellular anatomy, and binding. A total of 40 SNPs were signiﬁcantly associated with ﬂowering time (1 SNP), crude fat content (2 SNPs), and fatty acid content (37 SNPs). A total of 21 genes were annotated from fatty acid content SNPs; among them, nine genes were signiﬁcantly enriched in reproductive processes, such as embryonic development, fruit development, and seed development. This study demonstrated that SNPs are efﬁcient tools for mutant screening and it provides a basis that the improving the oil qualities of rapeseed.


Introduction
Rapeseed (Brassica napus L.) is an oilseed plant that grows three to five feet tall and produces beautiful little yellow flowers and is widely used as food, an animal feedstock, and biodiesel. The rapeseed plant is part of the brassica family, as are cabbage, broccoli, Brussels sprouts, and mustard [1,2]. Rapeseed provides for the beautiful landscape of yellow flowers in winter or spring [2]. The plant produces pods from which seeds are harvested. Rapeseed oil is obtained from ground seeds and these seeds contain about 40% oil. Rapeseed oil is recognized as being beneficial to human health because it can lower levels of cholesterol and reduce the risk of arteriosclerosis [3,4]. Crude fat content and fatty acid composition are the characteristics of rapeseed that are important for industrial applications [3,5]. The oil qualities of rapeseed depend mainly on its fatty acid content [5,6]. Many breeders have produced new rapeseed cultivars through breeding for modified fatty acid compositions to make a specific rapeseed oil [6]. Erucic acid, which is considered unhealthy to humans, is found in rapeseed [7]. Its most serious harmful effects include successfully used in association mapping selection [2,[9][10][11]. Additionally, many QTL for flowering time have been reported in rapeseed that co-localize with yield and major QTL that explain the variation accounting flowering time [9,25,32].
We have developed mutant rapeseed lines derived from gamma-radiation mutation, and these lines have a variety of flowering times and increased crude fat and fatty acid contents. The objectives of this study were the identification of SNP in gamma-irradiated rapeseed mutant lines. Meanwhile, new SNP locations for flowering time, crude fat content, and fatty acid composition in rapeseed mutant lines were found by an association study using GBS analysis.

Plant Materials
The mutant lines of rapeseed were generated by the treatment of seeds of the commercial cultivar 'Tammi' with 500 Gy of gamma ( 60 Co) radiation at 2009. The development of the mutant lines of rapeseed is shown in Figure 1. The selfing procedure was continued until the M 9 generations. Mutants that varied in flowering time, crude fat content, and fatty acid content that exhibited stable inheritance of the mutated characteristics from M 5 to M 7 generations were selected. The radiation-generated mutant genotypes were grown by the Radiation Breeding Research Team at the Advanced Radiation Technology Institute of the Korea Atomic Energy Research Institute. Young leaves were sampled from the original cultivar 'Tammi' and the 46 rapeseed mutants. Genomic DNA was isolated using a DNeasy Plant Mini Kit (Qiagen, Hilden, Germany).  4 population that flowering times, crude fat and fatty acid composition and was derived from the Korean rapeseed cultivar 'Tammi' with 500 Gy gamma ray irradiation. This mutant line was homozygous from the M 5 to the M [8][9] generations. At each generation, phenotypes, crude fat, and fatty acid compositions were screened.

Library Construction and Genotypin-by-Sequencing
GBS libraries were constructed using the restriction enzyme ApeKI (5 -GCWGC-3 ) following a protocol modified from that of previous study [23]. Oligonucleotides constituting the top and bottom strands of each barcode adapter and a common adapter were diluted (separately) with TE (50 µM each), and annealed using a thermocycler. DNA samples (100 ng/µL) were added to individual adapter-containing wells. Samples (DNA with adapters) were digested with ApeKI (New England Biolabs, Ipswich, MA, USA) overnight at 75 • C. Sets of digested DNA samples, each with a different barcode adapter, were combined (5 µL each), and purified using a commercial kit (QIAquick PCR Purification Kit; Qiagen, Valencia, CA, USA), according to the manufacturer's instructions. Restriction fragments from each library were then amplified in 50

Sequence Preprocessing and Alignment to Reference Genome Sequence
Demultiplexing was performed using the barcode sequence, and adapter sequence removal and sequence quality trimming were performed. Adapter trimming was performed using cutadapt (version 1.8.3) [33], and sequence quality trimming was performed using DynamicTrim and LengthSort of the SolexaQA program (v.1.13) [20]. DynamicTrim cuts low-quality bases at both ends of short reads according to the Phred score, and refines it with high-quality cleaned reads. LengthSort removes excess base cuts made in DynamicTrim; Phred score of Dynamic-Trim ≥ 20, and LengthSort using short read lengths ≥ 25 bp. BWA (0.6.1-r104) [34] generated cleaned reads passing the preprocessing process, and performed mapping to the reference genome sequence (Brassica napus V5.1; (http://www.genoscope.cns.fr/brassicanapus accessed on 5 February 2021). Mapping was a preliminary step to detect raw SNPs (In/Del) between B. napus V5.1 and sequenced samples. A SAM file was created, and default values were used, except for the following options: a seed length (−l) of 30, maximum differences in the seed (−k) of 1, number of threads (−t) of 16, mismatch penalty (−M) of 6, gap opening penalty (−O) of 15, and gap extension penalty (-E) of 8. The experiment was conducted in repetition.

Raw SNP Detection and Consensus Sequence Extraction
Clean reads were mapped to the standard genome sequence, and the generated SAM files were used to detect raw SNPs using SAMtools (0.1.16) [35], and extract consensus sequences. SNP validation was performed using SEEDERS in-house script [36] before SNP detection; raw SNP detection was performed, and default values were used, except for the following options: a minimum mapping quality for SNPs (−Q) of 30, minimum mapping quality for gaps (−q) of 15, minimum read depth (−d) of 3, minimum indel score for nearby SNP filtering (−G) of 30, SNPs within INT bp around a gap to be filtered (−w) of 15, window size for filtering dense SNPs (−W) of 30, and maximum read depth (−D) of 223.

Generate SNP Matrix
To conduct the analysis of SNPs between the analyzed objects, an integrated SNP matrix was generated between samples. A list of unions was created using the raw SNP positions obtained by comparing each sample with a standard dielectric, and a non-SNP locus was filled in from the consensus sequence of the sample. Then, the final SNP matrix was generated by filtering the miscalled SNP positions through SNP comparison among samples. SNPs were divided into homozygous (SNP read depth ≥ 90%), heterozygous (40% ≤ SNP read depth ≤ 60%), etc., (homozygous/heterozygous; could not be distinguished by type) groups based on their position. The named SNP positions were classified into "intergenic or genic regions" based on the position information of the standard genome sequence (Brassica napus V. 5.1), and the genic region was further classified into "CDS or intron regions." The common SNP in original cultivar 'Tammi' was selected first in the integrated SNP matrix position between mutants, and the polymorphic SNP was selected by comparing the common SNP of the original cultivar with the base sequence of each mutant. In order to perform GO analysis and flexible relationship analysis, SNP loci of each mutant line are integrated to secure the SNP locus of the union.

Gene Ontology Analysis of Genes with Polymorphic SNPs
Gene ontology alignment was performed using candidate sequences containing polymorphic SNPs and sequences provided by GO database in house scrips [37]. Thresholds were classified into three functional categories: BP (Biological Process), CC (Cellular Component), and MF (Molecular Function), with the significance level for 0.01 (e-value ≤ 1.0 × 10 −10 , best hits).

Construction of Phylogenetic Tree and Heatmap
The evolutionary history of the SNP was inferred using the neighbor-joining method. The optimal tree with the sum of branch lengths equal to 2.32462671 is shown. The percentage of replicate trees in which the same clusters were formed, determined by bootstrapping analysis (1000 replicates), are shown next to the branches. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the maximum composite likelihood method and are in units of number of base substitutions per site. The analysis involved 47 nucleotide sequences. All ambiguous positions were removed for each sequence pair. A total of 35,397 positions were included in the final dataset. Evolutionary analyses were conducted using MEGA6 [38]. The clustering analysis of samples from the 47 rapeseed lines was performed using the complete linkage method in the SPSS (IBM, Armonk, NY, USA). The flowering times, crude fat contents, and fatty acid compositions were visualized as z-values on the heatmap. The hierarchical clustering analysis of the 35,397 union SNPs was performed using the neighbor-joining method in TASSEL v5.0.

Association Analysis
Association analysis was performed using 35,397 union SNP dataset and FarmCPU model in genomic association and prediction integrated tool (GAPIT) R package [39]. The retained PC obtained in association analysis was used as a covariate in the FarmCPU procedure, as well as also kinship matrix. Significant SNPs were defined based on the threshold of p < 0.0001. The annotated genes as selected significant SNPs were evaluated enrichment analysis of GOBP category for defined gene functions by AgriGO database ( http://bioinfo.cau.edu.cn/agriGO/ accessed on 5 February 2021) based on the threshold of Hochberg multitest adjustment method (false discovery rate (FDR) < 0.05).

Determination of Crude Fat and Fatty Acid Compositions
The seed oil content was analyzed using the AOAC method as previously described [40]. Using the Soxhlet extraction procedure, 5 g crushed seeds (80 mashed) were packed into a thimble and the oils were extracted with diethyl ether for 6 h. Fatty acid composition was measured previously described method [17]. The rapeseed oil was extracted by rapeseed powder in chloroform: hexane: methanol (8:5:2, v/v/v) 1 mL for 12 h. From this, 200 µL of extracted oil was added to 75 µL of methylation reagent (0.25 M methanolic sodium methoxide: petroleum ether: ethyl ether = 1:5:2, v/v/v) for derivatization. Hexane was added to bring the total volume up to 1 mL. The fatty acid composition of the rapeseed seed oil was analyzed using a GC-MS (Plus-2010, Shimadzu, Japan) instrument equipped with an HP-88 capillary column (J&W Scientific, 60 m × 0.25 mm × 0.25 m) under the following conditions: ionization voltage, 70 eV; mass scan range, 50-450 mass units; injector temperature, 230 • C; detector temperature, 230 • C; injection volume, 1 µL; split ratio, 1:30; carrier gas, helium; and flow rate, 1.7 mL/min. The column temperature program specified an isothermal temperature of 40 • C for 5 min increasing to 180 • C at a rate of 5 • C/min then a subsequent increase to 230 • C at a rate of 1 • C/min. We identified the substances present in the extracts according to their retention time (RT) and using the mass spectra database (NIST 62 Library).  The fatty acid composition of the original cultivar was 4.8%, 1.9%, 64.9%, 15.7%, and 4.4% for palmitic, stearic, oleic, linoleic, and linolenic acids, respectively. Oleic, linoleic, and linolenic acids were the principal fatty acids represented in the rapeseed lines except for three mutant lines. Oleic, linoleic, and erucic acids were the major fatty acids in the Tm4M-2, Tm8-15, and Tm10-1St mutant lines. The palmitic acid content of the mutant lines ranged from 1.3% to 9.8%, with an average of 4.6%. Stearic acid was detected in 36 mutant lines, with a composition ranging from 0.2% to 2.7%. Significant differences in oleic acid composition were observed between all the mutant lines. The oleic acid contents of the mutant lines ranged from 35.3% to 76.7%. The Tm6-8 mutant line had the highest oleic acid content, whereas the Tm10-1St mutant line had the lowest. The highest linoleic acid content was recorded at 28.8% in the Tm8-7 mutant line, and the lowest value of 13.5% was found in the Tm8-2 and Tm7M-2 lines. The highest content of linolenic acid was found in the Tm10-1Lin line (12.8%), while the lowest content was detected in the TM8-14 line (1.4%). Erucic acid was determined in nine mutant lines (Tm8-14, Tm8-15, Tm10-1St, Tm10-6EF, Tm10-8EF, Tm10-9EF, Tm10-10EF, and Tm10-11EF) with a composition range of 0.1% to 21.6%. The Tm10-1St mutant line had the highest erucic acid content.

Genotyping-by-Sequencing of Rapeseed Mutant Lines
The GBS library was constructed from 46 rapeseed mutant lines, and the original cultivar was sequenced using the Illumina Hiseq 2000 platform (Illumina, Madison, WI, USA). A summary of these sequencing results is presented in Table 2 and Table S1. GBS was carried out a second time (two biological replicates were used for GBS analysis), and a total of 710 million reads comprising 107,525,459,536 nucleotides (107.5 Gb) were generated, with an average of 7.5 million reads per genotype. After removing low quality sequences, 623,026,394 clean reads remained, with 6.6 million reads per genotype on average. The total length range of clean reads was between 70,671,425 bp and 1,768,913,538 bp, with an average read length of 646,640,540 bp. The total number of mapped reads was 396,325,056 in all lines, with an average of 4,216,224 reads per sample. The mapped read rates (%) ranged from 29.42% to 85.14%. On average, 54.6% of the filtered reads were mapped to the reference genome sequence. The total length of the mapped region was 1,537,448,005 bp, with an average of 16,355,830 bp per sample, which covered approximately 1.92% of the reference genome sequence. Among the 47 lines, the average depth of the mapped region ranged from 7.83 to 51.44 (Table S1).

Identification of SNPs
The SNPs for each line were first selected from the union of SNPs in the matrix position between samples and the reference genome sequence (Table S2). A total of 1,528,875 SNPs were identified for all lines, of which the largest number of SNPs was recorded for the Tm8-10 mutant line and the lowest number was observed in the Tm7-1EF mutant line. To identify gamma-irradiated mutant lines, the polymorphic SNPs collected by comparing the common SNPs (1st selected 1,528,875 SNPs) in the original cultivar with the base sequences of the mutant lines were considered mutational changes (Table S3 and Supplementary Materials File S1). A total of 277,036 SNPs were observed in 46 mutants. Most of the SNPs (203,046) were homozygous, and there were 73,990 heterozygous variants. A union of 35,397 SNPs without overlap were constructed by combining SNPs for each mutant line. The SNP distribution in the rapeseed chromosomes ranged from 229 to 2765, with a mean of 1605 per chromosome ( Figure 2). Specifically, chromosomes C01, C03, A09, and A03 had the highest numbers of SNPs.

Association Analysis via GBS
All SNPs with significantly associated with flowering time, crude fat content, and fatty acid composition are described in Table 4. Among all 35,397 union SNP dataset associations identified, 40 SNPs were significantly associated with fatty acid composition, flowering time, and crude fat content by applying the fixed and random model circulating probability unification (FarmCPU). Of the 40 selected union SNPs, 28 SNPs were located in genic regions, and 12 SNPs were detected in intergenic regions. A total of 37 SNP loci were significantly associated with fatty acid composition, including 21 genes were annotated from fatty acid content SNPs (CCCH-type zinc finger family protein, glutamine-dependent asparagine synthase 1, arginyl-tRNA synthetase, class Ic, cullin 1 | NADH-ubiquinone oxidoreductase 24 kDa subunit, putative, plant U-box 24, uncharacterized protein family (UPF0497) | tetratricopeptide repeat (TPR)-like superfamily protein, response regulator 1, dicer-like 1, and NADPH-dependent thioredoxin reductase A) in mutant genotypes.
Of these, 11 SNPs from GBS were significantly associated with palmitic acid (C16:0) and located on chromosomes A2, A6, A8, C3, C5, C9, Ann, and Cnn. Four oleic acid (C18:1)associated region were located on chromosomes A5, A7, and C9 in the mutant genotypes. We discovered 10 SNPs associated with linoleic acid (C18:2) located on chromosomes A3, A6, A9, and C5. Nine SNPs were associated with linolenic acid (C18:3) and located on chromosomes A5, A6, C3, C5, and C9. Three SNPs were associated with erucic acid (C22:1) and located on chromosomes A5, A10, and C9. Of the two other candidate SNP loci, both were associated with crude fat content, and one SNP was associated with flowering time. To explore whether the genes related to fatty acid composition were related to gene functions, we performed enrichment analysis. Enrichment analysis was performed by the 21 genes from 40 significant associated SNPs (Table 4) based on the threshold of FDR < 0.05. We found that nine genes were significantly enriched in reproductive processes with sub-processes in the BP term (FDR < 0.05), such as embryonic development, fruit development, and seed developmental processes (Table 5).

Discussion
Rapeseed cultivars have been improved in terms of seed-oil quality and yield. Industrial requirements for rapeseed oils are high-quality fatty acids, and this is recognized by various breeders [4][5][6]. Rapeseed oil is rich in mono-and poly-unsaturated fatty acids, such as oleic acid, linoleic acid, linolenic acid, and erucic acid; of these fatty acids, oleic and linoleic acid are generally recognized to be useful oil compounds [1,3]. However, linolenic acid is readily oxidized, which leads to decreased stability for high-temperature frying and shelf-life [1,[5][6][7]. Therefore, a major goal has been breeding rapeseed with high oleic acid and/or low linolenic acid content. Rapeseed oil with higher oleic acid has many benefits compared with oil from low oleic acid cultivars, such as anti-oxidative properties, cost-effectiveness in fried cooking, and human health benefits [1][2][3]. Additionally, the consumption of oils with low levels of saturated fatty acids reduces the accumulation of low-density lipoprotein (LDL) cholesterol and the occurrence of heart disease in humans [3,4,7]. The average oleic acid contents of commercial rapeseed cultivars range from 60% to 65% of the total fatty acid content in Korea [41,42]. The oleic acid contents of the Tm6-8 and Tm6-13 mutant lines were higher (≥75%) than those from the other mutant lines. In addition, Tm6-8 and Tm6-13 mutant lines had significantly lower levels of saturated fatty acids (palmitic and stearic acid) compared with the original cultivar. Pod shattering is another important technological trait in rapeseed as it is the main reason of grain loss. Tm2M-1 had the strongest to pod shattering resistance (data not shown). These results indicate that these mutant lines are potentially useful as materials for developing new rapeseed cultivars.
Recently, many rapeseed genetic resources are being used in horticulture, food and biodiesel industry [1][2][3][4][5]. In spite of its great industrial value, Korean has limited genetic resources for oil sources of rapeseed cultivars. The 'Tammi' cultivar has major agronomic characteristics such as early flowering, high seed yield (3420 kg/ha) and non-erucic acid oil [41]. The fatty acid composition of 'Tammi' seed oil is common types. Therefore, it is not highly valuable for the oil industry. Ionizing radiation mutagens have been widely applied to the study of genetics and mutation-based breeding programs [12,14,15]. Ionizing radiation contributes to free radical generation and causes DNA damage. It has been widely used in mutation breeding because it can generate useful traits in plant genetic resources [12,15,16,43]. Gamma rays are widely used as mutagens for plant-mutation breeding because of their accuracy and their deep penetration of plant organs that can cause mutations affecting chemical compositions [12,44]. Generally, an easy method for modifying oil traits using mutation is by inducing mutated genes involved in metabolic pathways that alter fatty acid composition in oil crops [5,14]. Modifications of the fatty acid composition in oil crops have been achieved beyond naturally-occurring variation by mutagenesis [12,15,17]. Gamma irradiation has been found to be an especially effective method of improving oil traits in plants [12]. Gamma induced mutation for altering fatty acid composition has been the most frequently carried out and the mutants obtained had increased or decreased values of linolenic acid without any change in linoleic acid [12,45]. Kumar et al. [33] reported gamma rays induced Indian mustard mutant were developed containing a high oil percentage. Moreover, Rapeseed has been mutagenized using gamma rays for the improvement of several important characteristics, including reduction of toxin levels, such as glucosinolates and erucic acid [12,45]. In this study, oleic acid content has changed most and palmitic acid composition is lowest. Gamma irradiation can be used as an effective method for increasing the variability of oleic acid contents and lowering saturated fatty acids in oil crops [12,46]. Similar to the previous of development of high oleic acid composition (over 70%) of rapeseed breeding lines were obtained by 80-100 kR gamma ray treatment and oleic acid composition of M 5 progeny increased greatly [45]. However, Kim et al., reported that the wild-type soybean produced 9.4% palmitic acid, 2.3% stearic acid, 26.6% oleic acid, 53.0% linoleic acid, and 8.8% linolenic acid, while the gammainduced mutant (Hfa180) produced 9.4% palmitic acid, 19.7% stearic acid, 19.4% oleic acid, 43.0% linoleic acid, and 8.5% linolenic acid [13]. The development of erucic acid-inducing mutants are unintentional product. Normally, the conventional mutagenesis approaches induce random mutations in the plant genome [12,14]. Similarly, the gamma irradiation induced mutation influence on oleic acid and erucic acid composition in rapeseed [45,46]. Those mutant lines can be used as a material to study for biosynthesis of erucic acid.
NGS techniques have been applied to the molecular research of rapeseed. GBS is the effective method for analyzing genetic variations conferred by large numbers of SNPs, and it is possible to use it to develop marker systems for crop selection due to its low cost and locus specificity [20][21][22][23]. The NGS techniques has led to the identification of mutations. When comparing the mutation breeding lines, such as induced mutant and its original cultivar, the only expected differences are the mutations caused by the mutagen [47]. However, there is a lack of research about using NGS techniques to study the mutations in fatty acids arising from gamma irradiation of rapeseed genetic resources. In this study, we discovered new SNPs in rapeseed mutant lines, including mutant lines related to flowering time and oil traits (crude fat content and fatty acid composition). In this study, we discovered new SNPs in rapeseed mutant lines, including mutant lines related to flowering time and oil traits (crude fat content and fatty acid composition). As result of GBS, a large number of SNPs were detected on chromosomes C01, C03, A03, and A09. The number of SNPs ranged from 2376 to 9446, with 6022 SNPs per mutant line on average. In GBS analysis, total amount of produced sequences may not be even among samples. Library preparation process contains DNA amplification step which may result in large differences in final outputs even when the amount of input DNA or quality of DNA is slightly different between samples [48]. Although there were substantial differences among samples in sequencing depth, the lowest average of sequencing depth among samples was much higher than 5× which was the minimum threshold in SNP filtering. Therefore, there was not significant problem in SNP detection in samples with low sequencing depth [49,50]. In a previous rapeseed investigation of 633 ecotype germplasms, the GBS method detected 96.4 SNPs per sample, which was higher than the SNPs per sample detected in the present research [51]. Furthermore, there were more SNPs in the 38 mutant lines compared with the reference genome sequence. These results suggested that gamma ray treatment could effectively induce genetic variability in rapeseed plants.
Ultimately, we detected 35,397 high-quality SNPs, which were used for phylogenetic analysis. In this study, we found high levels of genetic diversity among the rapeseed mutant lines derived from gamma irradiation. The phylogenetic tree showed that all mutant lines were classified regardless of their flowering time and crude fat and fatty acid contents. Accordingly, the classification of genetic diversity and variation is one of the most important elements in the selection of germplasms for breeding programs and gives other useful information [52,53]. Hierarchical cluster analysis categorized the 47 rapeseed mutant lines according to their flowering times and/or oleic acid levels. These trends suggest that the oleic acid content of rapeseed oil could be used as a marker to assess chemotypes. Consequently, hierarchical cluster analysis is important for evaluating the chemical diversity of rapeseed as well as genetic variation among novel mutant lines. There was no agreement between the chemical and genetic classifications. Overall results could be applied to breeding programs to develop rapeseed cultivars with improved flowering times and fatty acid compound profiles.
We conducted GO analysis of rapeseed genes containing polymorphic SNPs induced by gamma radiation. The GO grouping of mutated genes indicated that many variations induced functional changes. For the categories of BP, mutations occurred most frequently in genes involved in cellular processes, primary metabolic processes, and nitrogen compound metabolic processes. Of the CC categories, intracellular entities were the major term, followed by organelles, and intracellular organelles. For the MF categories, the major polymorphic SNPs were nucleotide binding, nucleoside phosphate binding, anion binding, and carbohydrate derivative binding. Gamma rays can affect plant cell metabolism through thylakoid membrane dilation, alteration of photosynthesis, and changes to secondary metabolism [12,[14][15][16]. Previously, Go analysis of variation in Dendrobium mutant derived from gamma ray, aerospace, and somaclonal mutagenesis, including cellular aromatic compound metabolic (BP), nucleobase-containing compound metabolic process (BP), protein metabolic process (BP), intracellular membrane-bound organelles (CC), nucleic acid binding (MF), cation binding(MF) and nucleotide binding (MF) [54]. While, GO analysis revealed that genes of rapeseed, including response to herbivore, metabolic process, regulation of immune response, axillary shoot meristem initiation, meristem initiation, triglyceride biosynthetic process, terpenoid metabolic process, development process, photomorphogenesis, trichome morphogenesis, regulation of cell proliferation, and inflorescence development exhibited high mutation rate in 588 natural populations [55]. Therefore, it appears believable to expect that an appreciable difference in mutations is observed between natural variation and radiation mutagenesis.
Of the 35,397 high-quality SNPs, 40 SNPs were significantly associated with flowering time, crude fat content, and fatty acid composition. We found SNPs associated with flowering time, including one in an NPH3 family/ATP synthase subunit on chromosome A09. Rapeseed flowering time affects its yield, cultivation area, and ornamental value. Previous reports indicate that rapeseed flowering time is controlled by complex factors through allelic effects [25]. In rapeseed, the co-localization of a flowering time QTL with yield and SNP marker density was sufficient to describe several candidate genes on chromosomes A03, C04 and C08 [25,32]. The high oil yield of rapeseed may be due to improvements in the application of large amounts of fertilizers and pesticides, which may negatively impact the farmer. The development of new rapeseed cultivars with high crude fat contents could solve this problem [2,12,46]. We observed that two SNPs significantly associated with crude fat content were found on chromosomes C08 and Cnn. Unfortunately, these SNPs occur in the intergenic region. Lu et al. reported that chromosome A-specific selection may have promoted the oil accumulation among the landrace of rapeseed population [55]. It also believable to expect that an appreciable difference in mutation is radiation mutagenesis. We observed that 37 SNPs were significantly associated with fatty acid composition. Specifically, these SNPs occur in nine genes involved in embryonic development, fruit development, and seed development in our study ( Figure S3). Seed fatty acid biosynthetic pathways have been identified in plants. Fatty acid biosynthetic pathways are related to specific enzymes, which form a cooperative system for seed development, energy metabolism, and triacylglycerol biosynthetic pathways [28,29]. Additionally, the marker associated technique has been applied in the identification of genetic variation of inbred lines, including flowering time, pathogen resistance, glucosinolate content, and phenolic compounds in rapeseed [29][30][31][32]56,57]. The SNPs related to flowering time, crude fat content and fatty acid compositions could be developed as a marker for selecting lines.
The fatty acid desaturase 2 (FAD2) and fatty acid desaturase 3 (FAD3) genes encode key enzymes responsible for producing precursors of oleic and/or linoleic acid in the canola lipid biosynthetic pathway [5,28,29]. Rapeseed FAD2 genes are separated and located on chromosomes A01, A05, C01, and C05, and they may be involved in oleic acid regulation [29,31,32]. In our study, a total of nine SNPs were detected on chromosomes A05 and C05, including plant U-box 24, uncharacterized protein family (UPF0497) | tetratricopeptide repeat (TPR)-like superfamily protein, and response regulator 1. The linolenic acid content in rapeseed is controlled by FAD3 genes; they encode delta-15 linoleate desaturase, which is responsible for the dehydration of linoleic acid to form linolenic acid. Furthermore, we observed 19 SNPs significantly associated with linoleic and linolenic acid content on chromosomes A03, A05, A06, A09, C03, C05, and C09. However, most of the SNPs significantly associated with linoleic and linolenic acid content were found on chromosomes A08 and C03, and their detection varied widely depending on the environment [29][30][31][32]. Additionally, 30 SNP loci significantly associated with erucic acid, oleic acid, and linoleic acid contents were detected by specific locus amplified fragment sequencing and identified functional candidate genes related to fatty acid biosynthesis, including FAE1, FAD2, LACS09, KCS17, CER4, TT16, and ACBP5 on chromosomes A06, A08, A09, A10, C02, C03, C07, and C08 in different environments [26]. Fatty acid content is a typical quantitative trait controlled by multiple genes that regulate fatty acid desaturation, and most genes residing on chromosomes A03, A05, A06, A10, and C02 were repeatedly detected due to natural variation in rapeseed [26,31,55]. Moreover, an ethyl methanesulfonate (EMS)-treated population with varying fatty acid contents was also investigated; high oleic acid content mutants had nucleotide deletions in FAD2 on chromosomes A01 and A05 [19]. Previously, two genomic regions on chromosome C3 were associated with QTL for six (palmitic, stearic, oleic, linoleic, linolenic and archidic acid) fatty acid content [58] and a total of 406 identified QTLs were detected for eight fatty acid content in China and 14 microenvironments were involved [59]. Furthermore, 101 consensus QTLs were detected for six fatty acid using an inbreeding lines in six environments [60]. Identification of agronomical important genes are crucial benefits for plant breeding [61]. Identifying the genetic variation between SNP loci and mutants will be beneficial for improving the oil qualities of rapeseed. Therefore, the candidate genes selected in this study should be verified by additional analysis.

Conclusions
In conclusion, GBS was used to develop high-quality SNPs in rapeseed mutant genotypes that were subsequently used to assess genetic diversity and variation. Novel SNP locations for flowering time, crude fat content, and fatty acid composition in the rapeseed mutant lines were found by an association study. We revealed that GBS technology is a powerful tool for identifying SNPs for rapeseed mutants derived by gamma irradiation. Based on GBS and FarmCPU model analysis, significant association signals for flowering time, crude fat content, and palmitic, oleic, linoleic, linolenic, and erucic acids in seeds of rapeseed were found on chromosomes A02, A03, A05, A06, A07, A09, A10, Ann, C03, C05, C08, C09, and Cnn. Specifically, these results indicate that significantly associated SNPs are reliably involved in fatty acid biosynthesis and metabolism during seed reproductive stages in our study. This is the first report describing SNPs generated by GBS for mutants derived by gamma irradiation. Our results may improve molecular breeding for suitable SNPs among rapeseed mutants and could be useful for the development of novel cultivars for mutation breeding programs.
Supplementary Materials: The following are available online at https://www.mdpi.com/2073-439 5/11/3/508/s1, Table S1. Results of genotyping-by-sequencing analysis of rapeseed mutant lines, Table S2. Summary of GBS sequence data and alignment to the reference genome sequence, Table S3. List of union SNP matrix loci that were generated for 46 rapeseed mutant lines. Table S4. Statistics of polymorphic single nucleotide polymorphisms (SNPs) annotation by genomic locations. Table S5. List of functional groups under the three main categories; cellular components, biological process, and molecular function. Figure S1. Diagram of the significant enriched term of GOBP by candidate SNPs to enrichment of reproductive process in BP term. Supplementary Materials File S1: VCF file of union SNP matrix.