Red Chinese Cabbage Transcriptome Analysis Reveals Structural Genes and Multiple Transcription Factors Regulating Reddish Purple Color.

Reddish purple Chinese cabbage (RPCC) is a popular variety of Brassica rapa (AA = 20). It is rich in anthocyanins, which have many health benefits. We detected novel anthocyanins including cyanidin 3-(feruloyl) diglucoside-5-(malonoyl) glucoside and pelargonidin 3-(caffeoyl) diglucoside-5-(malonoyl) glucoside in RPCC. Analyses of transcriptome data revealed 32,395 genes including 3345 differentially expressed genes (DEGs) between 3-week-old RPCC and green Chinese cabbage (GCC). The DEGs included 218 transcription factor (TF) genes and some functionally uncharacterized genes. Sixty DEGs identified from the transcriptome data were analyzed in 3-, 6- and 9-week old seedlings by RT-qPCR, and 35 of them had higher transcript levels in RPCC than in GCC. We detected cis-regulatory motifs of MYB, bHLH, WRKY, bZIP and AP2/ERF TFs in anthocyanin biosynthetic gene promoters. A network analysis revealed that MYB75, MYB90, and MYBL2 strongly interact with anthocyanin biosynthetic genes. Our results show that the late biosynthesis genes BrDFR, BrLDOX, BrUF3GT, BrUGT75c1-1, Br5MAT, BrAT-1, BrAT-2, BrTT19-1, and BrTT19-2 and the regulatory MYB genes BrMYB90, BrMYB75, and BrMYBL2-1 are highly expressed in RPCC, indicative of their important roles in anthocyanin biosynthesis, modification, and accumulation. Finally, we propose a model anthocyanin biosynthesis pathway that includes the unique anthocyanin pigments and genes specific to RPCC.


Identification of DEGs
The cDNA libraries of RPCC and GCC were mapped to the B. rapa reference genome with coverage of 95.2% and 94.66%, respectively ( Figure 2a). Among the mapped and annotated DEGs, the highest proportion had RPKM (reads per kilobase per million mapped reads) values of >10, followed by RPKM values of 1 to 4 ( Figure 2b). Among the predicted 3345 DEGs, 2706 were up-regulated and 639 were down-regulated in RPCC vs. GCC (Figure 2c; Table S2 and Table S3). Of them, 643 of the up-regulated DEGs and 354 of the down-regulated DEGs between RPCC and GCC samples were not functionally characterized (Table S2 and Table S3). The DEGs included many ABGs. A Bland-Altman (MA) plot was constructed to show the differentiation of gene expression between the two samples by plotting the values onto M (log ratio) and A (mean average) scales. The differences in gene expression are shown on the MA plot, where genes with ≥1-fold expression values are shown in red and those with negative log2 fold values are shown in green ( Figure S2).

Functional Characteristics of DEGs and KEGG Pathway Enrichment Analysis
We performed GO analyses to annotate the DEGs. The top 20 enriched terms in each GO category of DEGs were selected based on significance (p < 0.05) and are summarized in Figure 3. In the BP category, the majority of DEGs were involved in "response to chemical stimulus" followed by "response to organic substance", "response to endogenous stimulus", and "cellular response to chemical stimulus" processes. Similarly, in the MF category, the majority of genes were involved in "DNA binding", followed by "transcription factor activity", "sequence specific DNA binding" and "calcium ion binding". In the CC category, many of the DEGs were related to "extracellular region", "cell wall", "external encapsulating structure" and "plant type cell wall" (Figure 3). We further categorized the up-regulated and down-regulated genes. Among the up-regulated DEGs, a total of 385 GO terms were identified, comprising 328 in the BP category, 35 in the MF category, and 22 in the CC category. Among the down-regulated DEGs, a total of 210 GO terms were identified, comprising 133 terms in the BP category, 50 in the MF category, and 22 in the CC category (Table S6.) In the BP category, a few genes were involved in "biosynthetic and metabolic processes of anthocyanins, ethylene signaling, flavonoids and phenylpropanoids", "catabolic and metabolic processes of L-phenylalanine", "cinnamic acid biosynthesis", "response to absence of light", and "response to temperature stimulus". In the MF category, some genes were involved in "phenylalanine ammonia-lyase activity" and "O-methyltransferase activity" (Table S6.). Some of the down-regulated genes were involved in "biosynthesis process", "negative regulation of cellular metabolic process", and "response to UV-C" (Table S6). Next, we conducted a KEGG enrichment analysis to identify pathways significantly enriched with DEGs. The pathways enriched with up-regulated DEGs were "biosynthesis of secondary metabolites", "metabolic pathways", "biosynthesis of flavonoids and phenylpropanoids", "metabolism of fructose, mannose, starch, and sucrose", and several others ( Figure 4 and Table S7). The down-regulated DEGs were involved in 46 types of functions related to organ development, and other functions related to plant growth and development (Table S7).

Expression Analysis of Anthocyanin Biosynthetic Genes by qRT-PCR
In general, gene expression studies can demonstrate the biological activity of genes in plants. We confirmed the reproducibility and accuracy of DEGs identified in our transcriptome data through qRT-PCR analyses. Although transcriptome sequencing was performed using samples from 3-week-old plants (seedlings), we checked the expression profiles of genes in samples from 6-week-old (rosette stage) and 9-week-old (heading stage) plants of GCC and RPCC. This analysis of gene expression at three developmental stages sheds light on the expression profile of ABGs and their effect on anthocyanin biosynthesis and accumulation throughout plant development. In general, the anthocyanin biosynthesis pathway can be classified into three phases; 1. The phenylpropanoid pathway; 2. Early steps of the flavonoid pathway; and 3. The anthocyanin pathway [31]. For the validation of gene expression, we selected 60 genes identified from the transcriptome data and from a previous study [26] that are involved in various stages of anthocyanin biosynthesis (Table 4; Table S9).
In the phenylpropanoid pathway, three PAL genes (BrPAL1, BrPAL2 and BrPAL4), BrC4H, and the 4CL homolog Br4CL2 were detected in GCC and RPCC at three developmental stages. The transcriptome data from 3-week-old plants (heat map, Figure 5a) and the qRT-PCR analyses showed that the transcript levels of these genes were much higher in RPCC than in GCC. Gene expression was also compared among stages (seedling, rosette, and heading stages) (Figure 5a). Similar to PBGs, EBGs such as BrCHS, BrCHI and BrCHI1, BrF3H, and BrF3 H-1 showed similar expression patterns in both the transcriptome analysis (heat map, Figure 5b) and qRT-PCR analyses (Figure 5b). Unlike PBGs and EBGs, the LBG BrDFR was expressed only in the RPCC at all stages, while the LBG BrLDOX was expressed at lower levels at the early stage. The BrLDOX transcript levels gradually increased from the rosette to the heading stage in RPCC (Figure 5c). The transcript levels of MYBs including BrMYB90, BrMYB75, and BrMYBL2-1 varied among different stages. BrMYB90 and BrMYB75 showed maximum transcript levels in RPCC at the seedling stage, while BrMYBL2-1 had higher transcript levels at the rosette and heading stages than at the seedling stage in RPCC (Figure 5d).
The downstream LBGs are involved in the acylation, glycosylation, and methylation of anthocyanins and include genes encoding acyltransferase (AT), glycosyltransferase (GT) and O-methyltransferase (OMT) [32,33]. The downstream LBGs selected from the transcriptome data for qRT validation included BrUF3GT, BrUGT75C1-1, BrUGT73B2, Br5MAT, BrAT-1, and BrAT-2. BrUF3GT, BrUGT75C1-1, and BrUGT73B2 showed increased expression while Br5MAT and BrAT-1 showed decreased expression from the seedling to heading stages in RPCC. The highest transcript level of BrAT-2 was at the rosette stage in RPCC (Figure 5e). Interestingly, all the LBGs including regulatory MYB (RM) genes showed low or no expression in GCC compared with RPCC in the transcriptome data (heat map, Figure 5e) and in the qRT-PCR analyses (Figure 5e).    Among many anthocyanin transporter genes, four B. rapa transporter genes including BrMATE-1, BrMATE-2, and BrTT19-1 showed differential expression between GCC and RPCC (heat map, Figure 5f). At the seedling and rosette stages, both BrMATE-1 and BrMATE-2 were expressed at higher levels in RPCC than in GCC. BrTT19-1 and BrTT19-2 transcript levels were high at all stages in RPCC, but at negligible or undetectable levels in GCC at all stages (Figure 5f). The remaining ABGs, showed diverse expression patterns in the qRT-PCR analyses ( Figure S3). Most of the analyzed genes had higher transcript levels in RPCC than in GCC ( Figure S3).
We also conducted qRT-PCR analyses for some MYB TF genes showing differences in expression levels between RPCC and GCC in transcriptome data (Table S4). Among the five selected MYB genes, BrMYB15 and BrMYB51-2 showed >1-fold expression in RPCC than in GCC at all stages, and BrMYB77 transcript levels increased from the rosette to the heading stage in RPCC but not in GCC ( Figure 6). A correlation analysis revealed a strong correlation between the transcriptome data and qRT-PCR data (R = 0.81) (Figure 7). Overall, similar trends in gene expression were detected from transcriptome data and qRT-PCR analyses. The expression patterns of PBGs, EBGs, LBGs, TGs, and RMs implied that LBGs, TGs and RMs play crucial roles in anthocyanin biosynthesis during different developmental stages of RPCC.

Promoter Analysis of Anthocyanin Biosynthetic Genes
Cis-regulatory elements (CREs) are binding sites for TFs in the promoters of target genes. To identify CREs, we analyzed the promoter regions of 22 important ABGs. The 2-kb region upstream of the transcription start site (TSS) was extracted and analyzed by the New PLACE program to find CRE motifs (Figure 8a). Among the predicted CREs, most were binding elements for MYB, bHLH, WRKY, bZIP and Ap2/ERF TFs (Figure 8b and Table S10). Those binding to MYB TFs were the most abundant, followed by those binding to bHLH, WRKY, bZIP and AP2/ERF TFs (Figure 8b). MYB CREs have been found in the promoters of genes related to secondary metabolism, flavonoid biosynthesis, anthocyanin biosynthesis, and plant defense (Table S10). The bHLH and bZIP CREs are known to be involved in the light response, tissue specific activation of phenylpropanoid biosynthetic genes, sugar repression, seed development, and the biosynthesis of phenylpropanoids, lignin, and flavonoids. AP2/ERF CREs are involved in functions related to the ethylene response, the jasmonate response, and secondary metabolism (Table S10). Therefore, the results of our study and other studies [34,35] indicate that MYB, and bHLH CREs regulate the expression of genes at all stages in the anthocyanin biosynthesis pathway.

Regulatory Network Analysis of Anthocyanin Biosynthesis Genes
To identify the interactions among anthocyanin biosynthetic genes and the transcription factor genes including MYB, bHLH, WRKY, bZIP, and AP2/ERF (with log2fold change > 2) (Tables 3 and 4), a putative interactive network was constructed (Figure 9). Among them, 37 B. rapa genes (yellow circles) showed 147 interactions, which could be classified into two types: activation (↓/↑) and repression (⊥) (Figure 9). The gene network results showed that MYB75 interacts with a gene encoding an acyl transferase family protein, as well as MYB90, 5MAT, TT5, AGT, TT4, UGT78D2, TT19, DFR, and UF3GT, and activates them to promote anthocyanin biosynthesis. Two LBGs (DFR and LDOX) are positively regulated by TFs such as PIF3, MYB32, HY5, and TT2 ( Figure 9) and DFR is also positively regulated by the TT8 TF. Besides gene-gene interactions, this network analysis revealed many other interactions among TFs and structural genes involved in various functions. Among the repressors, the MYBL2 TF interacts with MYB75, DFR, TT2, TT8, GL2, and EGL3; MYB75 represses SCPL10; both EGL3 and GL3 repress LDOX; and bHLH32 represses DFR (Figure 9). These results indicate that interactions between TFs and their target genes play a vital role in the regulation of anthocyanin biosynthesis and other metabolic functions related to the growth and development of RPCC.

Discussion
The red color in Chinese cabbage has been introduced through different techniques of introgression breeding. Among the introduced varieties, Xie et al. [3] introgressed the red color phenotype by crossing a Chinese cabbage variety with red color Brassica juncea through the embryo rescue technique. While in our study, RPCC and red Chinese cabbage (RCC) reported by Lee et al. [4], have been developed through interspecific-crossing between Chinese cabbage and red cabbage.

Novel Anthocyanin Pigments Responsible for the Color of RPCC
Red Chinese cabbage is an economically important variety that is rich in various secondary metabolites including anthocyanins [4]. This variety accumulates red pigments at an early stage of plant growth (seedling stage), so it is an ideal system to study the genes and regulatory TFs that are involved in regulating color (anthocyanin) accumulation at the early stages of plant development (Figure 10). Fruits and vegetables contain five main types of anthocyanins, with different frequencies of occurrence: cyanidin (50%), delphinidin (12%), pelargonidin (12%), peonidin (12%), malvidin (7%), and petunidin (7%) [36]. We detected 11 anthocyanin variants in the RPCC samples, approximately 85% of which were cyanidin isoforms. Thus, our results and those of other studies show that cyanidin derivatives are the most abundant type of anthocyanins in red Chinese cabbage [4]. In addition, the anthocyanin pigments accumulated in RPCC and in previous studies [3,4] are entirely different, indicating that the color accumulation and regulation are due to different sets of anthocyanin pigments. In accordance with the results of the HPLC analysis, we propose a model of the anthocyanin biosynthesis pathway that generates cyanidin 3-(feruloyl) diglucoside-5-(malonoyl) glucoside, and pelargonidin 3-(caffeoyl) diglucoside-5-(malonoyl) glucoside in RPCC (Figure 11). Most of the anthocyanin components identified in this study have been detected in radish or other Brassica crops but not in red Chinese cabbage [20][21][22][23][37][38][39][40][41][42][43]. Hence, 11 of the 13 pigments identified in our study were detected for the first time in RPCC (Table S1). Joo et al [28] have proved that the anthocyanin rich extract has the ability to lower the risk of vascular inflammatory diseases.

Anthocyanin Biosynthesis Genes Are Differentially Regulated in RPCC
Transcriptome sequencing is an advanced NGS technique that can be used to predict novel genes, gene function, and genome evolution. Comparative transcriptome sequencing between two different phenotypes, GCC and RPCC, revealed differences in the expression levels of genes involved in anthocyanin biosynthesis and the regulation of this pathway [26,44]. The transcriptome sequencing of RPCC and GCC at the seedling stage revealed 3345 DEGs, which included unique genes with unknown functions and TF genes. About 255 DEGs were involved in various functions related to phenylpropanoids, lignins, flavonols, and anthocyanins. Further qRT-PCR analyses showed that PBGs and EBGs were expressed at levels 0.5-to 1.0-fold higher at the seedling stage than at the other two stages (rosette and heading) in RPCC. Many of these genes encoded proteins involved in the early phases of anthocyanin biosynthesis [7,45,46]. Our results indicate that BrPAL, BrPAL2, BrPAL4, BrC4H, Br4CL2, BrCHS, BrCHI, BrCHI1, BrF3H, and BrF3 H-1 may be involved in the early phase of anthocyanin biosynthesis (i.e., the conversion of phenylalanine to dihydroquercetin) in RPCC. As shown in Figure 5e, the important LBGs BrDFR and BrLDOX, whose encoded products catalyze the conversion of dihydroquercetin to cyanidin at the late stage of anthocyanin biosynthesis, were expressed in RPCC at all stages, but not in GCC at any stage. Previous studies have shown that DFR plays crucial roles in anthocyanin accumulation in many plant species under different abiotic stress conditions [47,48]. Similarly, in Arabidopsis, sucrose and jasmonic acid have been shown to induce LBGs such as DFR, LDOX, and UF3GT, leading to anthocyanin accumulation [45,49]. Interestingly, we detected high transcript levels of the LBGs BrDFR, BrLDOX, and BrUF3GT in RPCC but not in GCC, indicating that anthocyanin biosynthesis occurs in RPCC under normal growth conditions without induction by external factors such as sugars or hormones.

Differential Expression of MYBs Regulates Reddish Purple Color Accumulation
In general, ABGs are controlled by various TFs, including the MYB, bHLH, and WD40 TFs that make up the most well-known complex in plants, the MBW complex [55,56]. A study using the transcriptome approach in red Chinese cabbage identified that the anthocyanin pathway regulating genes are TT8 (Bra037887) and PAP1 (c3563g1i2) [3]. Most TFs that are known to regulate anthocyanins are MYB TFs. A study on grapes reported that the MYBA transcription factor regulates the anthocyanin biosynthesis pathway through controlling the expression of UFGT [57]. Because TFs, especially MYB, are known to be important for controlling expression of ABGs, we searched our transcription data for TF sequences. As a result, we identified 25 TF genes with log2fold change > 3: 12 TF genes with log2fold change > 4, five TF genes with log2fold change > 5 and > 6, and one TF gene with log2fold change > 10 (Table 3). They included two important MYB genes: MYB90 (log2fold change, 11.3) and MYB75 (log2fold change, 6.8), which are homologs of PRODUCTION OF ANTHOCYANIN PIGMENT 2 (PAP2) and PRODUCTION OF ANTHOCYANIN PIGMENT 1 (PAP1). These genes showed similar transcript levels in qRT-PCR and transcriptome analyses, and their high expression levels in RPCC suggested that they might be involved in the positive regulation of LBGs during anthocyanin biosynthesis [13,58]. We identified duplicate copies of MYBL2, Bra016164 (BrMYBL2-1) and Bra007957 (BrMYBL2-2), in B. rapa, which are orthologs of A. thaliana AtMYBL2 (AT1G71030). The transcript levels of both BrMYBL2-1 and BrMYBL2-2 were very high, as revealed by transcriptome analysis (log2fold change of 5.5 and 4.06, respectively) and qRT-PCR analysis. A previous study detected a similar expression pattern of MYBL2 in B. rapa, indicative of its role as a positive regulator [59]. However, other studies on the Brassicaceae have demonstrated that MYBL2 can function as a negative regulator of anthocyanin biosynthesis [60,61]. Functional characterization of BrMYBL2-1 and Bra007957 (BrMYBL2-2) will clarify the molecular mechanisms of these genes in RPCC. Our results also showed that MYB binding motifs are highly conserved not only in LBGs but in most of the analyzed ABGs.

Plant Material and Sample Collection
The RPCC was developed (through introgression hybridization) and registered by the Kwonnong Seed Company (Cheongju, S. Korea) as described by Lee et al. [4]. The lines used in this study, green Chinese cabbage (GCC) and RPCC, belong to B. rapa L. ssp. pekinensis and were selected on the basis of their distinct color phenotypes ( Figure 1). Seeds of GCC and RPCC were germinated in a growth chamber under a 16-h light/8-h dark photoperiod at 24 • C. Leaf samples (innermost and outermost leaves) were collected from three biological replicates at three growth stages: the seedling, rosette, and heading stages (at 3-, 6-, and 9-weeks-old, respectively). The leaf samples were stored at −70 • C until further analysis.

Anthocyanin Extraction and HPLC Analysis
The total anthocyanin content of freeze-dried outer and inner leaf tissues of 9-week-old GCC and RPCC was determined by HPLC and LC-MS/MS, as described previously [37]. Each 100-mg lyophilized leaf sample was mixed with 2 mL water:formic acid (95:5 v/v) followed by 5 min vortexing and 20 min sonication. The sample was centrifuged at 9200× g for 15 min at 4 • C and the supernatant was filtered through a 0.45-µm PTFE hydrophilic syringe filter. From this filtrate, 10 µL was used for estimating anthocyanin content. The sample was injected into an Agilent 1200 series HPLC connected to a 4000 Q-Trap LC-ECI-MS

Sequence Pre-Processing and Assembly
For transcriptome analysis, total RNA was extracted from leaf tissues of 3-week-old plants using an RNeasy Mini kit (Qiagen, Valencia, CA, USA) and sequenced with the Illumina Hi-seq2000 platform by SEEDERS Inc. (Korea). The sequence data was submitted to the NCBI database and are available under the accession number PRJNA612946. The raw reads were trimmed using the Dynamic-Trim and Length-Sort programs of the Solexa QA [62] package. Based on the Dynamic-Trim (phred score ≥ 20) and Length-Sort (short read length ≥ 25bp) parameters, clean reads were obtained. The clean reads were assembled according to the protocols of Velvet (version 1.2.08) and Oases (version 0.2.08) software [63]. The optimal k-mer was selected based on the max length, average length, and N50 according to the total length of the assembled sequence.

Mapping and Annotation of Transcripts
To identify gene function, the transcripts were used as queries in BlastX searches against the amino acid sequences in the BRAD [64] and KEGG databases with the following parameters: filter criterion: e-value ≤ 1e−10, best hits. Mapping was performed using Bowtie2 (v2.1.0) software with the following limitation: mismatch ≤ 2 bp, computed by the penalty method) [65]. Transcript levels were normalized using the R package of DESeq [66], and this software was also used to calculate the gene expression values for each sample with data deviation.

Identification of Transcription Factors
To identify TFs, sequences of all B. rapa 4127 TFs were downloaded from the Plant Transcription Factor Database [67] (http://planttfdb.cbi.pku.edu.cn/). The total assembled transcripts were compared and analyzed using BlastX software with the parameters e-value ≤ 1e−50 and identity ≥ 50, and annotated.

Prediction of Differentially Expressed Genes
Differentially expressed genes (DEGs) were defined as those with at least a log2fold difference in transcript levels between the RPCC and GCC samples. Up-regulated genes were those with log2 fold-change greater than 1, and down-regulated genes were those with log2 fold-change less than −1 [22].

Expression Analyses
Leaf samples collected from 3-, 6-, and 9-week-old plants were collected and immediately frozen in liquid nitrogen. Total RNA was extracted using an RNeasy Mini kit (Qiagen). cDNA was synthesized using a Reverse Transcription System kit (Promega, Madison, WI, USA). The resulting cDNA was used as a template for qRT-PCR analyses, which were conducted with the CFX96 Real-Time system (Bio-Rad, Hercules, CA, USA). qRT-PCR analyses were conducted for 60 anthocyanin biosynthetic genes (ABGs) with three biological replicates and three technical replicates using the following conditions: 95 • C for 3 min; 39 cycles of 95 • C for 15 s and 58 • C for 20 s. The relative expression levels were determined by normalizing the data with the comparative Ct method 2−[∆∆Ct] [68] using the Actin gene as a reference.

Gene Ontology Annotation and KEGG Enrichment Analyses
Gene ontology (GO) [69] alignments were performed using total transcripts and 'GO DB' with the threshold of 'counts ≥ 1' and 'GO depth' set to 3. Genes were classified into three functional categories, BP (Biological Process), CC (Cellular Component), and MF (Molecular Function). Then, gene annotation (filter criterion: e-value ≤ 1e−10, best hits) was performed through comparisons with amino acid sequences in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database using BLASTX.
To predict the functional characteristics of up-and down-regulated genes, we first computed p-values based on Fisher's exact test; these values were taken as indicators of significance of the gene in the respective function. Further KEGG enrichment analysis was performed using the KOBAS online tool (http://kobas.cbi.pku.edu.cn/kobas3) [70]. These analyses provided pathway annotations for the transcripts.

Identification of Cis-Regulatory Elements
The 2-kb region upstream from the transcription start site of selected genes was screened to identify cis-regulatory motifs using the New PLACE web tool [71].

Network Analysis
We carried out network analysis to construct an active gene-to-gene regulatory network of genes positively correlated with the anthocyanin biosynthetic pathway. The STRING 10.0 database (http://string-db.org/) was used to obtain an interaction network of these genes in B. rapa [72], according to orthologous genes in A. thaliana. Every link has a score from 0 to 1, where 1 is considered as the highest confidence link for reconstruction [73].

Conclusions
In conclusion, 11 of the 13 pigments detected in RPCC are reported for the first time for this variety. Analyses of the transcriptome data from two varieties at the seedling stage revealed many unique transcripts including DEGs and TF genes that are involved in a multitude of functions in growth and development. Our results show that many DEGs between the red and green varieties are involved in the biosynthesis of secondary metabolites such as phenylpropanoids, lignins, flavonoids, and anthocyanins, and in the regulation of these biosynthetic pathways. Further qRT-PCR expression analyses confirmed that ABGs and many TFs play essential roles in anthocyanin biosynthesis. The gene-to-gene interaction network illustrates the possible regulatory mechanism of MYBs with ABGs during anthocyanin biosynthesis in RPCC. Overall, our study describes the pigments in RPCC, identifies the important anthocyanin biosynthetic genes and TF genes that control the anthocyanin biosynthesis pathway, and proposes a model for the possible interaction mechanism between ABGs and TFs.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/21/8/2901/ s1. References [74][75][76][77][78][79][80][81][82][83][84][85] are cited in Supplementary Materials. Figure S1. Statistics of non-redundant transcripts predicted in transcriptome data. (a) Length distribution of transcripts. (b) Transcripts annotated in public databases. Figure S2 MA plot of DEGs between green (G) vs. reddish purple (RP) Chinese cabbage identified from transcriptome data. X-axis shows log2fold change and y-axis shows differentially expressed genes. Red and green dots indicate up-regulated and down-regulated genes, respectively. Figure S3. Additional ABGs showing differential expression in transcriptome data validated by qRT-PCR. Expression values were normalized against that of Actin. Error bars are based on mean of three technical replicates. Table S1. Total anthocyanin pigments identified in reddish purple Chinese cabbage leaf samples. Table S2. Total up-regulated DEG transcripts between RPCC and GCC with annotations from BRAD and TAIR databases. Table S3. Total down-regulated DEG transcripts between RPCC and GCC with annotations from BRAD and TAIR databases. Table S4. Transcription factor genes up-regulated between RPCC and GCC samples, as identified from transcriptome data. Table S5. Transcription factor genes down-regulated between RPCC and GCC, as identified from transcriptome data. Table S6. Gene ontology (GO) annotation of DEG transcripts predicted from transcriptome data. GO annotations are shown for up-regulated genes (left) and down-regulated genes (right). Table S7. KEGG pathway annotation of DEG transcripts predicted in transcriptome data. KEGG annotations are shown for up-regulated genes (left) and down-regulated genes (right). Table S8. Genes identified from transcriptome data that are involved in phenylpropanoid and anthocyanin pathways with expression values and GO and KEGG annotations. Table S9. List of primer sequences used for qRT-PCR of anthocyanin biosynthetic genes. Table S10. Cis-regulatory elements (CREs) identified in upstream promoter regions of anthocyanin biosynthetic genes.