Transcriptomics Integrated with Metabolomics Unveil Carotenoids Accumulation and Correlated Gene Regulation in White and Yellow-Fleshed Turnip (Brassica rapa ssp. rapa)

Turnip (Brassica rapa ssp. rapa) is considered to be a highly nutritious and health-promoting vegetable crop, whose flesh color can be divided into yellow and white. It is widely accepted that yellow-fleshed turnips have higher nutritional value. However, reports about flesh color formation is lacking. Here, the white-fleshed inbred line, W21, and yellow-fleshed inbred line, W25, were profiled from the swollen root of the turnip at three developmental periods to elucidate the yellow color formation. Transcriptomics integrated with metabolomics analysis showed that the PSY gene was the key gene affecting the carotenoids formation in W25. The coding sequence of BrrPSY-W25 was 1278 bp and that of BrrPSY-W21 was 1275 bp, and BrrPSY was more highly expressed in swollen roots in W25 than in W21. Transient transgenic tobacco leaf over-expressing BrrPSY-W and BrrPSY-Y showed higher transcript levels and carotenoids contents. Results revealed that yellow turnip formation is due to high expression of the PSY gene rather than mutations in the PSY gene, indicating that a post-transcriptional regulatory mechanism may affect carotenoids formation. Results obtained in this study will be helpful for explaining the carotenoids accumulation of turnips.


Introduction
Turnip (B. rapa ssp. rapa 2n = 20) is one of the most important Curciferae leaf and root vegetable crops in China and throughout East Asia, which is used for human consumption, Tibetan medicine and animal fodder [1]. Turnip is a rich source of glucosinolates [2,3], dietary phenolics [4], dietary fiber, vitamin C [5] and other bioactive compounds [6]. It was also found that flavonoid compounds in Tibetan turnip, p-coumaric acid and glucoside (p coumaric acid-β-D-glucopyranoside), were closely related to an anti-hypoxia effect [7]. Color is an important commodity trait for fruits and vegetables [8]. Yellow-fleshed turnips are a large part of the turnip species, which are highly popular with consumers. Generally, yellow-fleshed turnips are caused by the accumulation of carotenoids.
Carotenoids are a class of important natural pigments with 40 carbons in their backbone, which provide the precursors for vitamin A synthesis [9]. In addition, carotenoids can reduce various chronic diseases due to their antioxidant properties [10]. Carotenoids can absorb light within 400-500 nm. Therefore, the high accumulation of carotenoids Genes 2022, 13, 953 2 of 17 makes many plants appear yellow, orange and red [11]. In carrot, red is attributed to the massive accumulation of lycopene and β-carotene, orange is due to the enrichment of β-carotene, while yellow is thanks to the concentration of lutein [12]. In tomato, lycopene is predominant in the red type [13], δ-carotene and lycopene are major components in the orange/red type [14], β-carotene and lycopene are predominant in the orange type [15]. In pepper, capsanthin is the main pigment in the red type; zeaxanthin, capsanthin and lutein are predominant in the orange type [16]; lutein and β-carotene are the main two compounds in the yellow type [17]. In orange cauliflower and orange heading Chinese cabbage, β-carotene is predominant [18,19].
Many genes have been reported to control carotenoids' regulation in vegetable crops. Carotenoids synthesis begins with geranylgeranyl diphosphate (GGPP) which is formed by the condensation of isopentenyl pyrophosphate (IPP) and dimethylallyl diphosphate (DMAPP) [20]. GGPP is condensed into colorless phytoene by the activity of phytoene synthase (PSY) [21]. Then, the colorless phytoene forms multiple compounds, such as β-carotene, δ-carotene, zeaxanthin, lutein, violaxanthin and neoxanthin, through a series of reactions [22][23][24]. In orange carrot, the CYP97A3 gene was reported to be responsible for the high levels of α-carotene accumulation [25]. In tomato, the SISGR1 gene was reported to interact with PSY1 and regulate lycopene enrichment [26]. In pepper, the CCS gene was considered to be the main gene that controls orange formation [16]. In heading Chinese cabbage, a large insertion in the CRTISO gene leads to the orange formation [27]. These studies suggested that most yellow/orange traits are controlled by the structure genes. In addition, many transcription factors have been proved to be involved in carotenoids biosynthesis. Wu et al. [28] reported that SlMYB72, a R2R3-MYB subfamily, directly bound to phytoene synthase, z-carotene isomerase and lycopeneb-cyclase genes and regulated carotenoids biosynthesis. Shi et al. [29] also published that SlZHD17 is involved in the control of chlorophyll and carotenoids metabolism in tomato fruit. To date, the carotenoid component and correlated gene analyses in white and yellow-fleshed turnip have been infrequent.
In this study, we identified the carotenoid components and correlated genes in two cultivars at three developmental periods by transcriptome and metabolome methods. The data from this study enhanced our understanding of the carotenoid accumulation and correlated gene expression in turnips, and provided an insight for the improvement of yellow turnips or other yellow/orange-fleshed root vegetable crops.

Plant Materials
Two B. rapa ssp. rapa cultivars (white-fleshed cultivar, W21, and yellow-fleshed cultivar, W25) were planted in a randomized field plot according to standard agricultural practices in a field at the Xining experimental station (36 • 42 N; 101 • 45 E) of the Academy of Agriculture and Forestry Sciences, Qinghai University. They were planted in June and swollen root tissue samples were collected at 30 d (the beginning of swelling of the root), 50 d (swelling period of the root) and 70 d (maturation period of the root as a commercial organ) after planting. The swollen root was washed with tap water and the peel was removed. Then, the swollen root tissues were cut into pieces and stored at −80 • C for further analysis. According to the sampling period, samples were marked as W1, W2, W3, Y1, Y2 and Y3, separately. Each cultivar included five individuals and each stage contained three samples. In total, 18 samples were quickly frozen and stored.

RNA Extraction, Quantification and Transcriptome Sequencing
Eighteen libraries representing the six flesh samples and the three replicates were constructed for transcriptome sequencing. The RNA extraction, quantification and transcriptome sequencing were conducted by Illumina HiSeq high-throughput sequencing platform (Illumina, San Diego, CA, USA) at Wuhan MetWare Biotechnology Co., Ltd. (Wuhan, China) following their standard procedures [30] and the raw data were obtained. After removing the adapter sequences, reads with unknown nucleotides (more than 10% Genes 2022, 13, 953 3 of 17 ambiguous residues N) and reads with low quality (containing >50% bases with quality score < 20), the clean reads were held. The high-quality reads were de novo assembled into transcripts using Trinity (Version 2.6.6 [31]) by employing a paired-end method [32]. The dataset is available from the NCBI (https://www.ncbi.nlm.nih.gov/, accessed on 27 July 2020) Short Read Archive (SRA) under accession number PRJNA645708.
The chemical reagents used in this study, including methanol (MeOH), ethanol (EtOH), acetone, methyl tert-butyl ether and BHT, purchased from Merck (Darmstadt, Germany), and Milli-Q water (Millipore, Bradford, MA, USA), were used in all experiments. All standards and formic acid were purchased from Sigma (St. Louis, MO, USA). The stock solutions of standards were prepared at a concentration of 1 mg/mL and were stored at −20 • C.

Weighted Gene Co-Expression Network Analysis
The searched DEGs were used to generate co-expression network modules by the weighted gene co-expression network analysis (WGCNA) package and the obtained coexpression modules were merged on eigengenes [46]. Eigengene value was calculated for each module, which was used to search the association with carotenoid metabolites. The co-expression network between structural genes and transcription factors was also created based on the Pearson correlation coefficient (PCC > 0.8).

DNA/RNA Extraction, Candidate Gene Prediction and Cloning
The extractions of the genomic DNA from the leaf sample and total RNA from the different tissues were performed as described by Ren et al. [47]. The candidate gene was predicted according to the association analysis of DEGs and DEMs. The specific primers for the candidate gene (Table S1) were designed using Primer Premier 6.0 on the basis of the reference genome of B. rapa [48,49] and the sequences of the candidate gene were cloned from 'W21' and 'W25'. The PCR products were purified and used for TA cloning, and the recombinant plasmids were transformed into E. coli strain DH5α. The recombinant plasmids were sequenced by Sunny Biotechnology Co. Ltd. (Shanghai, China) and sequence alignment was performed with DANMAN software (Lynnon BioSoft, San Ramon, CA, USA).

Sequence Analysis and Gene Expression Analysis of BrrPSY Gene
Gene sequences obtained from 'W21' and 'W25' were called PSY-W and PSY-Y, respectively. Gene structure of exons and introns was predicted based on the PSY gene in B. rapa. Sequence alignments were performed by the multiple sequence alignment module of DNAMAN software (Lynnon BioSoft, USA). The physicochemical properties of the proteins encoded by the BrrPSY gene were predicted using on-line software tools SOPMA and ExPaSy. The phylogenetic relationship tree of PSY from multiple species was constructed with MEGA 6.0 with the neighbor-joining method.
The analysis of gene expression was carried out by using quantitative real-time PCR (RT-qPCR) and the actin gene was used as the reference gene [49]. The specific primers of the BrrPSY gene are shown in Table S1. The qPCR reactions were performed in a Roche LightCycler 480 Detection System and all qPCR reactions of each sample were performed with three biological repetitions and three technical repetitions. The relative expression levels of genes were measured by the 2 −∆∆Ct method [50].

Transformation of Protoplasts and Fluorescence Observation
To clarify the location of the BrrPSY protein in the cells, the GFP gene was fused with BrrPSY as a reporter. The recombinant vector PBI211GFP-BrrPSY was transferred into Arabidopsis protoplasts using a polyethylene glycol-mediated approach. The transformed protoplasts were incubated at 23 • C for 16 h with low light level and then fluorescence was observed with a laser confocal microscope. PBI211GFP was used as the negative control.

Transformation of Tobacco
To validate the function of the BrrPSY gene, the PSY-Y and PSY-W genes with ScaI and XbaI restriction enzyme cutting sites were cloned into a PCAMBIA2300 vector, respectively. Then, Agrobacterium tumefaciens GV3101 cell lines carrying PCAMBIA2300, PCAMBIA2300-PSY-Y and PCAMBIA2300-PSY-W were transferred into a tobacco line using the injection infiltration method, separately. Three day later, the infiltration areas were collected for RNA extraction, RT-qPCR and carotenoid content detection. The primers used in these experiments are listed in Table S1.

De Novo Transcriptome Assembly in the Two Turnips at Three Development Stages
Swollen root tissue samples of two B. rapa ssp. rapa varieties (white-fleshed cultivar, W21, and yellow-fleshed cultivar, W25) were collected for the RNA-seq (Figure 1a,b). Each sample included five individuals and each stage contained three samples. A total of 173.82G clean bases with an average of 9.65 Gb sequencing data, 94.09% of bases scoring Q30 and 45.21% GC content for each sample, were obtained (Table S2). Compared with Zhuang et al. [1], we obtained more sequencing data in swollen root tissue samples of turnips than in root skin samples (6.20 Gb). After split joint by Trinity software, a total of 298,926 transcripts were generated with an N50 length of 1365 bp, an N90 of 362 bp and a mean length of 879 bp. The obtained transcripts were used as reference sequences for the unigene analysis based on the corset hierarchical clustering. The results showed that a total of 253,720 unigenes were obtained with an N50 length of 1424 bp, an N90 of 460 bp and a mean length of 989 bp. In other studies, Zhuang et al. [1] obtained 76,152 unigenes after assembly in turnip root skin samples, and Lin et al. [51] generated 84,132 unigenes in turnip floral buds, which suggest that more unigenes are expressed in turnip swollen root tissue than in root skin samples and floral bud samples.
173.82G clean bases with an average of 9.65 Gb sequencing data, 94.09% of bases scoring Q30 and 45.21% GC content for each sample, were obtained (Table S2). Compared with Zhuang et al. [1], we obtained more sequencing data in swollen root tissue samples o turnips than in root skin samples (6.20 Gb). After split joint by Trinity software, a total o 298,926 transcripts were generated with an N50 length of 1365 bp, an N90 of 362 bp and a mean length of 879 bp. The obtained transcripts were used as reference sequences for th unigene analysis based on the corset hierarchical clustering. The results showed that a total of 253,720 unigenes were obtained with an N50 length of 1424 bp, an N90 of 460 bp and a mean length of 989 bp. In other studies, Zhuang et al. [1] obtained 76,152 unigene after assembly in turnip root skin samples, and Lin et al. [51] generated 84,132 unigene in turnip floral buds, which suggest that more unigenes are expressed in turnip swollen root tissue than in root skin samples and floral bud samples. The size and distribution of transcripts and unigene length analysis showed tha 133,252 (44.58%) transcripts were less than 500 bp in length and 26,755 (8.95%) transcript were more than 2000 bp in length, 88,523 (34.89%) unigenes were less than 500 bp in length and 26,755 (10.55%) unigenes were more than 2000 bp in length ( Figure S1a, Table S3) Coding sequence (CDS) length and predicted protein sequence were also analyzed. A tota of 193,914 CDSs were searched and 99,798 (51.47%) CDSs were less than 500 bp and 6869 (3.54%) CDSs were greater than 2000 bp in length ( Figure S1b, Table S4). The results o predicted protein sequence analysis revealed that 152,628 (76.40%) protein sequence were less than 300 amino acids and 1711 (0.86%) protein sequences were greater than 1000 amino acids in length ( Figure S1c).  (Table S5). After the in tegrated analysis, 73,885 (29.12%) unigenes were found to be annotated in all seven data bases ( Figure 2a, Table S6). Compared with the Nr database, an interesting result observed The size and distribution of transcripts and unigene length analysis showed that 133,252 (44.58%) transcripts were less than 500 bp in length and 26,755 (8.95%) transcripts were more than 2000 bp in length, 88,523 (34.89%) unigenes were less than 500 bp in length and 26,755 (10.55%) unigenes were more than 2000 bp in length ( Figure S1a, Table S3). Coding sequence (CDS) length and predicted protein sequence were also analyzed. A total of 193,914 CDSs were searched and 99,798 (51.47%) CDSs were less than 500 bp and 6869 (3.54%) CDSs were greater than 2000 bp in length ( Figure S1b, Table S4). The results of predicted protein sequence analysis revealed that 152,628 (76.40%) protein sequences were less than 300 amino acids and 1711 (0.86%) protein sequences were greater than 1000 amino acids in length ( Figure S1c).  (Table S5). After the integrated analysis, 73,885 (29.12%) unigenes were found to be annotated in all seven databases ( Figure 2a, Table S6). Compared with the Nr database, an interesting result observed was that the functional information of the homologous sequences was similar to that of B. napus (80,389, 39.53%) than that of B. rapa (70,990, 34.91%) (Figure 2b). In fact, many researchers consider turnip to be a subspecies of Chinese cabbage. At the genomic level, the results of the phylogenetic relationship of the chloroplast and mitochondrial genome of turnip showed that turnip is closely related to B. rapa [52,53]. Results of homologous sequence alignment in this study may provide a new insight into the root development of B. napus, B. rapa and turnip.

Unigene Annotation
was that the functional information of the homologous sequences was similar to that of B. napus (80,389, 39.53%) than that of B. rapa (70,990, 34.91%) (Figure 2b). In fact, many researchers consider turnip to be a subspecies of Chinese cabbage. At the genomic level, the results of the phylogenetic relationship of the chloroplast and mitochondrial genome of turnip showed that turnip is closely related to B. rapa [52,53]. Results of homologous sequence alignment in this study may provide a new insight into the root development of B. napus, B. rapa and turnip. Gene Ontology annotation results showed that 38,256 unigenes were classified as biological process, 55,166 unigenes were classified into cellular component and 33,987 unigenes were annotated as molecular function (Table S7). In terms of KOG classification, 24,147 unigenes were general function prediction only, 13,277 unigenes were categorized into posttranslational modification, protein turnover and chaperones and 12,696 unigenes were categorized into signal transduction mechanisms (Table S8).

Identification of Differentially Expressed Unigenes
To identify the DEGs between W21 and W25, gene expression levels were estimated with the fragments per kilobase of exon per million fragments mapped (FPKM) values.
The FPKM values in W21 and W25 at the different developmental stages were compared and the DEGs were selected with the |log 2 Fold Change| ≥ 1 and false discovery rate  Table S9). Among these TFs, 18, 18, 14, 11, 11 and 10 unigenes were classified into ethylene-responsive transcription factor (ERF), C3H, bZIP, bHLH, WRKY and nuclear transcription factor Y subunit A (NF-YA) TF family, and they were considered as predominant common differentially expressed TFs. Among these TF gene families, 163 TFs exhibited a higher expression level in W25 than in W21 in all three pairwise comparisons of W1-vs-Y1, W2-vs-Y2 and W3-vs-Y3, and the predominant TFs were the same as the common differentially expressed TFs. An interesting phenomenon was that most of the above highly expressed TFs were mostly not expressed in W21 and there were 107/163 (65.64%). Based on the Log 2 FPKM values, the trend change map of these 163 TFs was constructed (Figure 2f).

Carotenoid Metabolites in Turnip at Three Developmental Stages
Though two turnip cultivars were planted simultaneously and grown in the same field and under the same conditions, the colors of the swollen root differed distinctly. The flesh color was white in W21 and yellow in W25. To elucidate the carotenoid metabolites in turnip, W21 and W25 were chosen for carotenoid metabolome profiling analysis. A total of 18 carotenoid compounds were detected and 10/18 were detected in swollen root tissues of the two turnips. Ten carotenoid compounds (phytoene, lycopene, α-carotene, γ-carotene, β-carotene, lutein, zeaxanthin, violaxanthin, neoxanthin and apocarotenal) were detected in W25 and four compounds (phytoene, lycopene, γ-carotene and apocarotenal) were unique. Seven carotenoid compounds (α-carotene, β-carotene, lutein, zeaxanthin, violaxanthin, neoxanthin and β-cryptoxanthin) were detected in W21 and only one compound (β-cryptoxanthin) was unique. At the same time, all detected carotenoid components were quantified and the contents are shown in Table 1. Among these carotenoids, phytoene was colorless, thus, the high contents of lycopene and γ-carotene made W25 develop a yellow swollen root. In the study of carotenoid metabolites, components of the carotenoids in most species are already well understood, such as in carrot [12], tomato [13][14][15], pepper [16,17] and orange cauliflower and orange heading Chinese cabbage [18,19]. Additionally, Hadjipieri et al. [21] also indicated that trans-lutein and trans-β-carotene were the major carotenoids in the peel of loquat fruit, and trans-β-cryptoxanthin, followed by trans-β-carotene and 5,8-epoxy-β-carotene, to be the most predominant carotenoids in the flesh of loquat fruit. Xu et al. [24] reported significant accumulation of antheraxanthin, zeaxanthin, neoxanthin and β-cryptoxanthin in orange zucchini. Our findings were different from previous studies in most vegetable crops, as we found that the high accumulation of lycopene and γ-carotene made the turnip develop a yellow-fleshed root.
The trends of the carotenoid component content changes were different. There were three trends of the content changes of carotenoid compounds detected from stage 1 to stage 3 in W25. The content of phytoene and lycopene decreased, the content of α-carotene, lutein, β-carotene, zeaxanthin, violaxanthin and neoxanthin increased and the content of γ-carotene and apocarotenal increased from Y1 to Y2 and decreased from Y2 to Y3. Of these carotenoid compounds, the content of phytoene at stage one (Y1) was higher than others, reaching 37.050 ± 0.920 mg/100 g DW, which was predominant. The content of apocarotenal at stage one (Y1) was lower than others, reaching 0.003 ± 0.000 mg/100 g DW. β-Cryptoxanthin was detected only in W21 and all carotenoid compounds detected in W21 maintained a similar change trend, continuously declining.

Weighted Gene Co-Expression Network and Module-Carotenoid Metabolite Correlation Analysis
In order to identify the genes related to the carotenoid components, the weighted gene co-expression network analysis (WGCNA) was performed to investigate the co-expression networks of DEGs and a total of 48 co-expression modules were identified based on their similar expression patterns (Figure 3a). The heatmap of module-carotenoid correlations was also created and the results showed that the accumulation of transcripts for the purple module was correlated with carotenoid metabolites, including phytoene, lycopene and γ-carotene (r > 0.8). The accumulation of transcripts for the brown module was correlated with phytoene (r = 0.82) and the accumulation of transcripts for the turquoise module was correlated with γ-carotene (r = 0.91) (Figure 3b), which suggest that unigenes clustering in these modules may be closely related to the formation of yellow root in W25. 1

Genetic Basic of Carotenoid Metabolites
Combining analysis of DEGs and carotenoid metabolites, |correlation coefficient| > 0.8 was considered to represent the gene that regulated the metabolites. In order to identify the genes related to the carotenoid components, the genes involved in carotenoid biosynthesis pathways were analyzed. In total, 125 DEGs (Table S10) were annotated as involved in the carotenoid biosynthesis pathways and the expression levels of these unigenes are shown in Based on these differentially expressed structural genes and differential accumulation of carotenoid metabolites in the carotenoid biosynthesis pathway, we reconstructed the carotenoid biosynthesis pathway and predicted the molecular mechanisms resulting in the yellow-fleshed root coloration in turnip (Figure 5a). Five carotenoid metabolites, phytoene, lycopene, α-carotene, β-carotene and γ-carotene, were highly accumulated in W25 and were used to reconstruct the carotenoid biosynthesis pathway. Of the above differentially expressed structural unigenes, PSY is the first enzyme to produce the colorless carotenoid 15-cis-phytoene in the carotenoid biosynthetic pathway [54,55]. Cluster-30924.23408, annotated as PSY, was highly expressed in W25, while it was mostly not expressed in W21. Similarity, phytoene accumulated in W25, while it was not detected in W21. The expression pattern of the PSY gene was consistent with the accumulation of 15-cis-phytoene in the two turnips, which explained the lack of phytoene in W21. The colorless phytoene was converted to the red-colored all-trans-lycopene via a series of enzymes, including phytoene desaturase (PDS), ζ-carotene isomerase (Z-ISO), ζ-carotene desaturase (ZDS) and carotenoid isomerase (CRTISO) [8,56]. However, the deficiency of the 15-cis-phytoene in W21 meant that the downstream lycopene and γ-carotene were also not detected in W21. The contents of downstream carotenoid metabolites were very interesting. Eight downstream carotenoid metabolites were detected in W21 except apocarotenal, which suggests that metabolites in other biosynthetic pathways may influence the formation of downstream carotenoid metabolites.  rase (ZDS) and carotenoid isomerase (CRTISO) [8,56]. However, the deficiency of the 15cis-phytoene in W21 meant that the downstream lycopene and γ-carotene were also not detected in W21. The contents of downstream carotenoid metabolites were very interesting. Eight downstream carotenoid metabolites were detected in W21 except apocarotenal, which suggests that metabolites in other biosynthetic pathways may influence the formation of downstream carotenoid metabolites.

Identification of Key Transcription Factors Regulating Sugar Metabolism
In order to predict the transcription factors involved in the carotenoids biosynthetic pathways, 45 differentially expressed TFs whose expressions were highly correlated with the above five carotenoid metabolites and eight carotenoid biosynthetic pathway structural unigenes were shown to form a correlated network (Figure 5b). All these transcription factors showed lower expression in W21 than in W25, including four NF-YA genes,

Identification of Key Transcription Factors Regulating Sugar Metabolism
In order to predict the transcription factors involved in the carotenoids biosynthetic pathways, 45 differentially expressed TFs whose expressions were highly correlated with the above five carotenoid metabolites and eight carotenoid biosynthetic pathway structural unigenes were shown to form a correlated network (Figure 5b). All these transcription factors showed lower expression in W21 than in W25, including four NF-YA genes, five WRKY genes, eight bZIP genes, eight ERF genes, six zinc finger CCCH domain-containing protein (C3H) genes and 14 other TFs. Results suggested that these TFs may correspond to the putative regulators controlling carotenoids biosynthesis.
Based on carotenoid metabolism in tomato fruit, Wu et al. [28] and Shi et al. [29] identified an R2R3-MYB transcription factor, SlMYB72, and demonstrated that the SlMYB72interacting protein SlZHD17, which belongs to the zinc-finger homeodomain transcription factor family, also functions in carotenoid metabolism. They also found that the expressions of PSY1 and Z-ISO were suppressed due to the regulation by SlZHD17, which decreased the lycopene content in fruits. In addition, Mannen et al. [57] reported that phytochromeinteracting factor 5 (PIF5), a basic helix-loop-helix family transcription factor, positively regulates expression of key enzymatic genes in the MEP pathway and improves the accumulation of chlorophyll and carotenoids in cultured cells. We also identified a basic helix-loop-helix family transcription factor, PIF3, which was highly correlation with γcarotene (R = 0.974). The possible regulation mechanism of PIF3 involved in the control of carotenoid metabolism in turnip root will investigated in the future. All these findings provided new insight into deep exploration of the formation of yellow turnips.

Cloning and Analysis of Candidate Gene Sequence
Based on the reconstructed carotenoid biosynthesis pathway in the yellow-fleshed root coloration of turnip, the PSY gene was considered as the candidate gene controlling the yellow-fleshed root coloration. In order to compare the PSY gene sequence between W21 and W25, the sequences of PSY were cloned, sequenced and named as PSY-W and PSY-Y for W21 and W25, respectively. The full length gDNA sequence of PSY-W and PSY-Y was 2086 bp and 2106 bp, cDNA sequence was 1275 bp and 1278 bp, respectively. Referring to the PSY gene sequence of Chinese cabbage in the BRAD database, the BrrPSY gene in turnip contains six exons and five introns and PSY-W in turnip was totally consistent with the PSY in Chinese cabbage. Sequence alignment revealed four mutations in the PSY-Y amino acid sequence compared with PSY-W, including mutations of proline to glutamine (P-Q) at the 18th amino acid position, serine to alanine (S-A) at the 35th amino acid position, a serine (S) insertion at the 62nd amino acid position and serine to arginine (S-R) at the 220th amino acid position (Figure 6a). the control of carotenoid metabolism in turnip root will investigated in the future. All these findings provided new insight into deep exploration of the formation of yellow turnips.

Cloning and Analysis of Candidate Gene Sequence
Based on the reconstructed carotenoid biosynthesis pathway in the yellow-fleshed root coloration of turnip, the PSY gene was considered as the candidate gene controlling the yellow-fleshed root coloration. In order to compare the PSY gene sequence between W21 and W25, the sequences of PSY were cloned, sequenced and named as PSY-W and PSY-Y for W21 and W25, respectively. The full length gDNA sequence of PSY-W and PSY-Y was 2086 bp and 2106 bp, cDNA sequence was 1275 bp and 1278 bp, respectively. Referring to the PSY gene sequence of Chinese cabbage in the BRAD database, the BrrPSY gene in turnip contains six exons and five introns and PSY-W in turnip was totally consistent with the PSY in Chinese cabbage. Sequence alignment revealed four mutations in the PSY-Y amino acid sequence compared with PSY-W, including mutations of proline to glutamine (P-Q) at the 18th amino acid position, serine to alanine (S-A) at the 35th amino acid position, a serine (S) insertion at the 62nd amino acid position and serine to arginine (S-R) at the 220th amino acid position (Figure 6a). To clarify the evolutionary relationship between PSY protein sequences in turnip and in other species, twenty-eight PSY proteins sequences were downloaded from the National Center for Biotechnology Information and a phylogenetic tree was constructed by MEGA 6.0 software. Results revealed that PSY in turnip (BrrPSY) and PSY in Brassicaceae species were clustered into the same clade, and there was a closer genetic relationship with BraPSY in Chinese cabbage and RsaPSY in Raphanus sativus rather than AthPSY in Arabidopsis thaliana and BnaPSY in B. napus (Figure 6b).
Relative expression levels of the BrrPSY gene in turnips at different root development stages in W21 and W25 and the BrrPSY-Y gene in W25 in the vegetative growth period (leaf, leaf petiole, fleshed root) and reproductive growth period (bud, flower, stem leaf, flower stem and seed) were measured by RT-qPCR. Results of relative expression levels of the BrrPSY gene in turnips at different root development stages showed that the expression levels of BrrPSY in W25 were evidently higher than in W21 at three stages, expression levels at S2 and S3 stages were higher than in S1 and BrrPSY expression levels at the S2 stage in W25 were the highest (Figure 6c). This result was consistent with transcript accumulation by RNA-seq. Results of the BrrPSY-Y gene in W25 in different tissues revealed that the BrrPSY-Y gene was expressed in various tissues at different levels. The BrrPSY-Y gene had the highest expression in the root, and the relative expression of each tissue was as follows: root > flower stem > leaf > petiole > seeds > bud > flower > stem leaf (Figure 6d).

Subcellular Localization and Functional Analysis of BrrPSY Gene in Turnip
In order to clear the BrrPSY protein localization in cells, we performed assays for the transient transformation of Arabidopsis protoplasts with PBI211GFP-BrrPSY recombinant plasmids and a PBI211GFP plasmid was used as control. Fluorescence of transformed Arabidopsis protoplasts showed that the BrrPSY protein appeared as green fluorescence at 488 nm, and the chloroplast in Arabidopsis protoplasts appeared as red fluorescence at 555 nm. The merged field showed green and red fluorescence presenting the superimposed orange, indicating that BrrPSY protein was localized in the chloroplasts (Figure 7a).
To test the ability of BrrPSY to impact the accumulation of carotenoid content in vivo, we generated over-expressing constructs targeting the PSY-Y and PSY-W genes. For this purpose, Agrobacterium tumefaciens strain GV3101 containing recombinant plasmid 35S::PSY-Y or 35S::PSY-W was injected into a tobacco line using the injection infiltration method, and corresponding empty plasmids were used as control. The relative expression of PSY-Y and PSY-W was measured by qRT-PCR analysis and the results showed that PSY-Y and PSY-W were both significantly up-regulated in the transient expression tobacco (Figure 7b,c). Similarly, the carotenoid contents were increased in both the PSY-Y and PSY-W over-expressing tobacco (Figure 7d,e), confirming a direct regulation of BrrPSY on carotenoid content in turnip. Results revealed that yellow turnips are due to high expression of the PSY gene rather than mutations in the PSY gene, indicating that a posttranscriptional regulatory mechanism may affect carotenoid formation.
Since PSY catalyzes the first committed reaction, it is considered to be the major ratelimiting enzyme regulating the flux into carotenoid production [37,38], while there are few reports on the function of a mutated PSY gene. Our research confirmed the mutated BrrPSY gene in carotenoid accumulation and validated the function of PSY-Y and PSY-W genes. All these findings provide new insight into deep exploration of the formation of yellow turnips. To test the ability of BrrPSY to impact the accumulation of carotenoid content in vivo, we generated over-expressing constructs targeting the PSY-Y and PSY-W genes. For this purpose, Agrobacterium tumefaciens strain GV3101 containing recombinant plasmid 35S::PSY-Y or 35S::PSY-W was injected into a tobacco line using the injection infiltration method, and corresponding empty plasmids were used as control. The relative expression of PSY-Y and PSY-W was measured by qRT-PCR analysis and the results showed that PSY-Y and PSY-W were both significantly up-regulated in the transient expression tobacco (Figure 7b,c). Similarly, the carotenoid contents were increased in both the PSY-Y and PSY-W over-expressing tobacco (Figure 7d,e), confirming a direct regulation of

Conclusions
In this study, we identified eight unigenes involved in carotenoid biosynthesis and that were significantly upregulated in W25 and barely expressed in W21. Metabolomics profiling in turnip revealed that four carotenoid components, phytoene, lycopene, γ-carotene and apocarotenal, were unique in W25. Thus, we hypothesized high accumulation of lycopene and γ-carotene in W25 to cause the yellow-fleshed root coloration in turnip. A carotenoid biosynthesis pathway was reconstructed and forty-five transcription factors were identified to predict the yellow-fleshed root coloration in turnip. The PSY gene was the key gene affecting the carotenoid formation in W25. The coding sequence of BrrPSY-W25 was 1278 bp and BrrPSY-W21 was 1275 bp, and BrrPSY was more highly expressed in swollen roots in W25 than in W21. Transient transgenic tobacco leaf overexpressing BrrPSY-W and BrrPSY-Y showed higher transcript levels and carotenoid contents. Results revealed that yellow turnip formation is due to high expression of the PSY gene rather than mutations in the PSY gene, indicating that a posttranscriptional regulatory mechanism may affect carotenoid formation.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/genes13060953/s1, Table S1: All primers used in this paper. Table S2: The summary of the transcriptome sequencing dataset and quality control; Table S3: The size and distribution of transcripts and unigene length data; Table S4: Coding sequence length and predicted protein sequence analysis data; Table S5: Functional annotation of assembled unigenes in turnip; Table S6: Number of unigenes annotated in seven databases; Table S7: Gene Ontology classification of unigenes in turnips; Table S8: EuKaryotic Ortholog Groups classification of unigenes in turnips; Table S9: The identified differentially expressed TFs in two turnips at three development stages; Table S10: The FPKM values of 125 differentially expressed unigenes involved in the carotenoid biosynthesis pathways; Figure S1: The column diagram of transcripts, unigene, coding sequences and predicted protein sequence size and distribution in turnips.