Differential Regulation of Anthocyanins in Cerasus humilis Fruit Color Revealed by Combined Transcriptome and Metabolome Analysis

Coloring is an important appearance quality of fruit. In order to evaluate the relationship between metabolites and fruit color, we analyzed the metabolites and transcriptional profiles of two different Cerasus humilis cultivars: “RF” (cv. Zhangwu, red fruit) and “YF” (cv. Nongda No.5, yellow fruit). The results of identification and quantification of metabolites showed that there were significant differences in the contents of 11 metabolites between RF and YF. Transcriptomics was used to analyze the expression patterns of genes related to the anthocyanin biosynthesis pathway, and subsequently, the regulation network of anthocyanin biosynthesis was established to explore their relationship with color formation. QRT-PCR, performed for 12 key genes, showed that the expression profiles of the differentially expressed genes were consistent with the results of the transcriptome data. A co-expression analysis revealed that the late genes were significantly positively correlated with most of the different metabolites. The results of the study provide a new reference for improving the fruit color of Cerasus humilis in the future.


Introduction
Cerasus humilis (Bge.) Sok, which belongs to the genus Cerasus in the family Rosaceae, is a small dwarf shrub that originated in China. The fruit of C. humilis can be eaten as a fruit; they are known as "calcium fruits", owing to their relatively high calcium content [1][2][3]. They have a bright color, crystal beauty, and display an exceptionally wide diversity of fruit colors, including purple, red, yellow, and green [4]. This tree species has the advantages of early bearing, high yield, cold resistance, drought resistance, barren resistance, anti-wind abilities, sand-fixation, and soil conservation [5]. As a strategic tree species for sustainable development, C. humilis has been proven to be promising for building an eco-economic industrial chain in China. C. humilis seeds (semen pruni) have been used as a traditional Chinese medicine for heat-clearing and water-disinhibiting [6,7]. In recent years, because of their rich functional components, C. humilis has attracted more and more attention from consumers and researchers. Moreover, it has the characteristics of being highly nutritious, resistant, and having high bond retention, It is known as the third generation fruit tree variety together with the blueberry, raspberry, and Hippophae rhamnoides L. [8][9][10].
Forests 2020, 11 Fruit color is an important quality characteristic of fruit and an important index by which to evaluate fruit quality, as well as being one of the most important factors in attracting consumers [11,12]. Flavonoid biosynthesis is controlled by a variety of exogenous and endogenous factors, and is the most studied secondary metabolic pathway in plants, in which anthocyanins and procyanidins are the end product of the flavonoid metabolic pathway [13]. Anthocyanins, as one of the main pigments that form fruit colors, accumulate in specific organs, tissues and cells, displaying significant chemical diversity [14,15]. The structure of anthocyanins determines the color of fruit, and different plant species contain different anthocyanins [16]. The color of apples and peaches are mainly formed by anthocyanins [17,18]. In grapes, however, anthocyanins and delphinidin are synthesized first, then converted to other pigments [11]. In strawberries, pelargonidin is the main anthocyanin [19]. In addition to endowing plant organs with various colors, anthocyanins play an important role in pollination, seed transmission, protection against UV damage, and defense against pathogen infection [20][21][22][23]. At the same time, anthocyanins are known to have antioxidant, anti-insulin resistance, and anti-inflammatory effects [24][25][26]. Therefore, anthocyanins also have good application prospects in the field of functional foods [27,28].
Despite the economic potential of C. humilis, its genome sequencing has not yet been completed. Therefore, researchers have used other technologies to acquire information, to improve production and health-related benefits. In recent years, transcriptome sequencing has been an effective strategy of the discovery and identification of genes involved in secondary metabolite biosynthesis [29,30]. For example, Xu et al. [31] found key genes responsible for betalain-/anthocyanidin-production in bougainvilleas by transcriptome sequencing. Similarly, Yu et al. [32] illustrated the molecular regulatory mechanism of radish formation by transcriptome sequencing. Additionally, metabolomics analysis could be regarded as a technical means of association analysis, together with other data, to analyze the function of genes involved in the metabolic pathway of interest and also provide powerful supporting information for gene mining [33][34][35]. In recent years, combined metabolome and transcriptome analyses in fruits have clarified the relationship between contents of various secondary metabolites and corresponding differentially expressed genes, broadening our horizons of plant color regulation [36][37][38]. Until now, C. humilis has been seen as a resistant ornamental plant. It is necessary to cultivate new varieties to improve ornamental and fruit quality, however no studies have yet taken into account the differences in gene expression and metabolism between fruits from different kinds of C. humilis.
The anthocyanin biosynthesis pathway has been well studied in some fruit trees, such as apples, pears, and peaches [17,20,37]. Its biosynthesis is regulated by a highly conserved structure and regulatory genes [13]. The anthocyanin biosynthesis pathway involves a variety of metabolic processes, including ten core structural genes: PAL, C4H, 4CL, CHS, CHI, F3H, F3'H, DFR, ANS, and UFGT [39]. Blocking the anthocyanin biosynthesis pathway could directly lead to a change of pigment production, and the color of fruit peel [13]. In addition to the structural genes, transcription factor (TF) regulates the transcription and spatiotemporal expression of the structural genes by binding their own DNA binding domains and the promoter regions of target genes, to promote fruit peel color change [40]. For fruit color diversity within populations, understanding their special anthocyanin biosynthesis pathways will help breeders predict evolutionary differences.
In this study, we systematically studied the anthocyanin metabolism differences and differences in gene expression between YF and RF. We performed a metabonomics analysis of YF and RF to understand the differences in metabolites between them. Additionally, to identify differentially expressed genes (DEGs) related to anthocyanin metabolism in YF and RF, transcriptomic analysis was carried out. Our research provides valuable information to further elucidate the molecular mechanism of fruit color formation in C. humilis, and provides novel gene resources for C. humilis plant breeding. In addition, these results could be a useful reference for relevant research on color changes in other fruit species.

Plant Materials and Treatments
Three year old C. humilis cultivars "Zhangwu" (RF) and "Nongda No.5" (YF) were used as experimental materials and planted at Xiangyang farm, Harbin, Heilongjiang Province (45 • 47 0 N, 126 • 55 15 E). RF is a special cultivar from Northeast China; its fruit has red skin and red flesh, and when ripe the whole fruit turns blood red. YF is a variety bred by Shanxi Agricultural University, and its fruit has yellow skin and yellow flesh; when ripe, the whole fruit turns yellow. Mature fruits were collected and used in this experiment. After enucleation, the fruit peel was immediately frozen in liquid nitrogen and stored in the refrigerator at −80 • C until required for metabolite analysis and RNA extraction. Three biological repeats were performed for each sample.

Metabolite Extraction and Profiling
Vacuum freeze-drying technology was used to freeze dry the samples. Then, the samples were ground (30 Hz, 1.5 min) into powder by a grinding instrument (MM 400, Retsch). We weighed 100 mg of powder and dissolved it in 1.0 mL extract (70% methanol aqueous solution). The dissolved samples were refrigerated at 4 • C overnight, and in order to improve the extraction rate, three vortex-mixings were carried out during this period. After centrifugation at 10,000× g for 10 min, the supernatant was aspirated and filtered with a microporous membrane (0.22 µm pore size). Each obtained sample was then stored in a sample injection bottle for UPLC-MS/MS analysis. Meanwhile, a quality control standard was established. Each quality control (QC) sample was made by mixing 200 µL of supernatant of all biological test samples in equal volume. The mixed samples were used to observe and analyze the repeatability within the same batch. The data acquisition instrument system included Ultra Performance Liquid Chromatography (UPLC) and Tandem Mass Spectrometry (MS/MS) [41][42][43].

Qualitative and Quantitative Analysis of Metabolites
Based on the self-built MWBD (metaware database) and public database of metabolite information, qualitative analysis of the primary and secondary spectrum data mass spectrometry was carried out. Linear ion trap (LIT) and triple quadrupole (QQQ) scans were acquired on a triple quadrupole-linear ion trap mass spectrometer (Q TRAP), API 6500 Q TRAP LC/MS/MS System, equipped with an ESI Turbo Ion-Spray interface, operating in positive ion mode and controlled by Analyst 1.6 software (AB Sciex, Beijing, China). The ESI source operation parameters were as follows: ion source, turbo spray; source temperature 500 C; ion spray voltage (IS) 5500 V; ion source gas I (GSI), gas II (GSII), and curtain gas (CUR) were set at 55, 60, and 25.0 psi, respectively; the collision gas (CAD) was high. Instrument tuning and mass calibration were performed with 10 and 100 µmol/L polypropylene glycol solutions in QQQ and LIT modes, respectively. QQQ scans were acquired as MRM experiments with collision gas (nitrogen) set to 5 psi. The declustering potential (DP) and collision energy (CE) for individual MRM transitions were undertaken with further DP and CE optimization. A specific set of MRM transitions were monitored for each period, according to the metabolites eluted within this period. The metabolites with fold change ≥2 and fold changes ≤0.5 were selected as final differential metabolites [44,45].

RNA Extraction and Transcriptome Sequencing
Total RNA was extracted from 0.5 g RF and 0.5 g YF using an Omega RNA Extraction Kit (Biel, Switzerland). Transcriptome sequencing using 150 bp paired-end Illumina HiSeq and the construction of an RNA-Seq library was carried out by Annoroad Biotechnology (Zhejiang, China). Trinity Software for de novo assembly was developed by the broad Institute and the Hebrew University of Jerusalem. Trinity represents a new method to effectively and accurately reconstruct transcriptome data from RNA sequences. Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms. Trinotate makes use of a number of different, well-referenced methods for functional annotation, including homology search of the known sequence data (BLAST+/SwissProt), protein domain identification (HMMER/PFAM), protein signal peptides, and transmembrane domain prediction (signalP/tmHMM), and leverages various annotation databases (EMBL Uniprot eggNOG/GO/Kegg databases) [46][47][48].

Expression Level Analysis
DESeq2 was used to analyze the differential expression between groups with biological duplication. The statistical method of the negative binomial distribution was used to standardize the data. In order to control false positive, the obtained p value was corrected by Benjamini and Hochberg's approach. Genes with q < 0.05 and |log2_Ratio| ≥ 1 were selected as differential expression genes (DEGs) [49].
GO (Gene Ontology, http://geneontology.org/) enrichment analysis was performed for the DEGs. According to the number of DEGs contained in each GO term, the hypergeometric test was used to identify the significantly enriched GO terms against the whole genome background. After the calculated P values were corrected by multiple testing, GO terms with a q value <0.05 were selected as significantly enriched in the DEGs. The main biological functions of DEGs can be determined by Go enrichment analysis. KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.kegg.jp/) is a genome-wide metabolic pathway database. The hypergeometric test was applied to the enrichment analysis of KEGG pathways. The calculated p-value was corrected by multiple testing, taking q < 0.05 as the threshold value. Pathways meeting this condition were defined as significantly enriched pathways in DEGs [20].

Quantitative Real-Time PCR and Expression Validation
In order to validate the RNA-seq results, quantitative real-time PCR (qRT-PCR) was performed on the peel samples. Total RNA isolation was conducted using an Omega RNA Extraction Kit (Biel, Switzerland) according to the given protocol. The qRT-PCR analysis was carried out using the Roche LightCycler 480 II Real-Time PCR System (Roche, Basel, Switzerland) with a 96-well plate. The UltraSYBR Mixture (Low ROX) (CWBIO, Beijing, China) was used for all PCR reactions, according to the instruction protocols. The qRT-PCR primers were designed using IDT (https://sg.idtdna.com/ Scitools/Applications/RealTimePCR/) (Table S1). All analyses (including three biological repetitions) were quantified in triplicate.

Statistical Analysis
The expression level of different genes compared to the control was calculated according to the 2 −∆∆CT method. The correlation between qRT-PCR and RNA-seq was analyzed using Excel [50]. Some figures were presented using Adobe Illustrator CS6. Metabolome data statistics and transcriptome sequencing were prepared on Metware Biotechnology (Wuhan, China) and Annoroad Biotechnology (Zhejiang, China), respectively.

Metabolomics Analysis of C. humilis Fruit Peel
The ion flow diagram of all samples (total ion currents (TIC) each point in time for all the ions in the mass spectrogram strength after adding and continuously depicting the map) and the MRM metabolites detection multimodal graph, abscissa for the retention time of the detected metabolites (the retention time, Rt), and the vertical for ion detection of ion flow intensity (CPS, count per second), are shown in Figure S1. To detect anthocyanins in the two cultivars, the extracts from the peels of RF and YF were analyzed by MRM, and 11 metabolites were identified as having significantly different contents, among which peonidin O-hexoside, cyanidin 3-O-malonylhexoside, pelargonidin O-acetylhexoside, cyanidin 3-O-glucoside (kuromanin), cyanidin 3-O-rutinoside (Keracyanin), cyanidin 3,5-O-diglucoside (cyanin), cyanidin 3-O-galactoside, and peonidin 3-O-glucoside chloride were found to be abundant in RF, whereas they could not be detected in YF (Table 1). To compare RF and YF metabolites, datasets obtained from the ESI Turbo Ion-Spray interface by electrospray ionization (ESI + and ESI − ) were subjected to principal component analysis (PCA). It has been shown that metabolites from mature stage of RF and YF are clearly separated in the score plots, where the first principal component (PC1) was plotted against the second principal component (PC2). For the ESI + mode, PC1 and PC2 represented 73.58% and 113.18% of the total variation, respectively (Figure 1a). For the ESImode, PC1 and PC2 accounted for 74.66% and 14.22% of the variation, respectively (Figure 1b). These results suggest significant biochemical differences between RF and YF.

Assembly and Annotation of the C. humilis Fruit Peel Transcriptome
Six RNA libraries (RF-1, RF-2, and RF-3 for the red fruit peel and YF-1, YF-2, and YF-3 for the yellow fruit peel) were prepared and analyzed. In total, 284,739,040 raw reads were obtained from the six libraries. The raw reads in this study were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (accession number: SAMN15805207 and SAMN15088359). For further analysis, after raw read quality filtering, 138,849,652 and 124,195,294 clean reads were obtained from the RF and YF libraries, respectively, among which the average GC content was 46.3% (Table 2). In addition, the Q30 percentages of all six libraries were above 93%, indicating that the sequencing and RNA qualities were high, and the data obtained was reliable enough for further profile studies on gene expression. Trinity software was used to assemble the obtained clean reads, resulting in 50,003 unigenes with an N50 length of 1972 nt and an average length of 1019.05 nt (max length 16,782 and min length 201). The length of 7953 (15.9%) of these unigenes exceeded 2 kb. A total of 50,003 unigenes of C. humilis were annotated by BLAST alignment with seven public databases (NT, NR, Uniprot database, RNAMMER, eggNOG, Gene Ontology (GO), and KEGG). In

Assembly and Annotation of the C. humilis Fruit Peel Transcriptome
Six RNA libraries (RF-1, RF-2, and RF-3 for the red fruit peel and YF-1, YF-2, and YF-3 for the yellow fruit peel) were prepared and analyzed. In total, 284,739,040 raw reads were obtained from the six libraries. The raw reads in this study were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (accession number: SAMN15805207 and SAMN15088359). For further analysis, after raw read quality filtering, 138,849,652 and 124,195,294 clean reads were obtained from the RF and YF libraries, respectively, among which the average GC content was 46.3% (Table 2). In addition, the Q30 percentages of all six libraries were above 93%, indicating that the sequencing and RNA qualities were high, and the data obtained was reliable enough for further profile studies on gene expression. Trinity software was used to assemble the obtained clean reads, resulting in 50,003 unigenes with an N50 length of 1972 nt and an average length of 1019.05 nt (max length 16,782 and min length 201). The length of 7953 (15.9%) of these unigenes exceeded 2 kb. A total of 50,003 unigenes of C. humilis were annotated by BLAST alignment with seven public databases (NT, NR, Uniprot database, RNAMMER, eggNOG, Gene Ontology (GO), and KEGG). In summary, 29,653 unigenes could be annotated using the NR database (non-redundant protein database), representing 59.3% of the total, the highest percentage. In addition, there were 25,555, 24,179, and 17,489 unigenes annotated only using the NT, BLASTX, and BLASTP databases, respectively ( Figure 2).  Three categories with 59 GO terms were obtained according to GO classification, namely biological process, cellular component and molecular function ( Figure S2). In the biological process, 58.51% unigenes were associated with the metabolic process between RF and YF. Through KOG (EuKaryotic Orthologous Groups of proteins database) analysis, 19,157 unigenes were classified into 25 functional categories ( Figure S3). The KOG and GO annotations showed that unigenes expressed in C. humilis fruit peel encode various proteins related to metabolism.

GO Enrichment and KEGG Pathway Analysis of DEGs
Next, we compared differential expression between yellow and red coloration in C. humilis fruit peel. As a consequence, a total of 9496 DEGs (3939 up-regulated and 5557 down-regulated genes) between the YF and RF libraries were detected through the hierarchical clustering of DEGs ( Figure  3a,b). In the heat map, the change of expression level is indicated by the blue, white and red gradient, respectively. Red indicates a higher expression level, and blue represents a lower expression level. The expression patterns of the yellow and red fruit peel libraries were compared, revealing that the colors of the same variety were similar. Three categories with 59 GO terms were obtained according to GO classification, namely biological process, cellular component and molecular function ( Figure S2). In the biological process, 58.51% unigenes were associated with the metabolic process between RF and YF. Through KOG (EuKaryotic Orthologous Groups of proteins database) analysis, 19,157 unigenes were classified into 25 functional categories ( Figure S3). The KOG and GO annotations showed that unigenes expressed in C. humilis fruit peel encode various proteins related to metabolism.

GO Enrichment and KEGG Pathway Analysis of DEGs
Next, we compared differential expression between yellow and red coloration in C. humilis fruit peel. As a consequence, a total of 9496 DEGs (3939 up-regulated and 5557 down-regulated genes) between the YF and RF libraries were detected through the hierarchical clustering of DEGs (Figure 3a,b). In the heat map, the change of expression level is indicated by the blue, white and red gradient, respectively. Red indicates a higher expression level, and blue represents a lower expression level. The expression patterns of the yellow and red fruit peel libraries were compared, revealing that the colors of the same variety were similar.
Gene ontology (GO) is an internationally standardized gene function classification system that describes the properties of genes and gene products in organisms. We functionally categorized the differentially expressed unigenes using GO terms, showing that 9496 unigenes of DEGs (q value < 0.05) were divided into three major categories and 58 subcategories: 19 were cellular components, 26 were biological processes, and 13 were molecular functions. For the cellular component category,'cell part and organelle were the top enriched terms, whereas in the biological process category, most of the DEGs were enriched in cellular processes and metabolic processes, and in the molecular function category, most of the DEGs were participants in binding and catalytic activity (Figure 4). These results represent an overall condition of the DEG functions, providing information to understand inter-varietal differences at the gene function level.  Gene ontology (GO) is an internationally standardized gene function classification system that describes the properties of genes and gene products in organisms. We functionally categorized the differentially expressed unigenes using GO terms, showing that 9496 unigenes of DEGs (q value < 0.05) were divided into three major categories and 58 subcategories: 19 were cellular components, 26 were biological processes, and 13 were molecular functions. For the cellular component category,'cell part and organelle were the top enriched terms, whereas in the biological process category, most of the DEGs were enriched in cellular processes and metabolic processes, and in the molecular function category, most of the DEGs were participants in binding and catalytic activity (Figure 4). These results represent an overall condition of the DEG functions, providing information to understand intervarietal differences at the gene function level.
KEGG analysis provides information and further understanding on how C. humilis regulates its biological functions, biosynthesis of secondary metabolites, and multiple gene interactions at the transcriptome and molecular level. It has been shown that the DEGs are enriched in the 130 KEGG pathways, among which the most significantly enriched pathways were linoleic acid metabolism, plant-pathogen interaction, flavonoid biosynthesis and phenylpropanoid biosynthesis. These pathways are mainly composed of down-regulated DEGs ( Figure 5). Among the 130 KEGG pathways, there are two pathways related to color formation; namely, the flavonoids synthesis pathway (ko00941) and anthocyanin synthesis pathway (ko00942), with 36 unigenes and 1 unigene, respectively. The pathways annotated by these DEGs were closely related to the abundance of metabolites of YF and RF. KEGG analysis provides information and further understanding on how C. humilis regulates its biological functions, biosynthesis of secondary metabolites, and multiple gene interactions at the transcriptome and molecular level. It has been shown that the DEGs are enriched in the 130 KEGG pathways, among which the most significantly enriched pathways were linoleic acid metabolism, plant-pathogen interaction, flavonoid biosynthesis and phenylpropanoid biosynthesis. These pathways are mainly composed of down-regulated DEGs ( Figure 5). Among the 130 KEGG pathways, there are two pathways related to color formation; namely, the flavonoids synthesis pathway (ko00941) and anthocyanin synthesis pathway (ko00942), with 36 unigenes and 1 unigene, respectively. The pathways annotated by these DEGs were closely related to the abundance of metabolites of YF and RF.

Analysis of Transcription Factors
Transcription factors (TFs) are important regulatory factors, which can activate or inhibit the expression of structural genes by binding to their promoter sequences. A total of 3571 TF genes were identified through transcriptome data analysis, and could be grouped into 55 transcription factor families (Table S2), the majority of which were bHLH and MYB, followed by the NAC, B3, WRKY, ERF, FAR1, C2H2 and C3H transcription factor families ( Figure 6). It has been widely confirmed that members of the bHLH and MYB families regulate anthocyanin biosynthesis in plants. In the present study, we identified 383 and 330 unigenes in the bHLH and MYB families, with 102 and 124 up-regulated and 281 and 206 down-regulated, in YF vs. RF, respectively (Table S3). It is worth noting that three genes were annotated as being associated with R2R3-MYB (TRINITY_DN18821_c0_g5, TRINITY_DN20762_c1_g1 and TRINITY_DN23575_c1_g1). In particular, the MYB TF (TRINITY_DN20762_c1_g1) was only highly expressed in RF, but not in YF (Table S4). Transcription factors (TFs) are important regulatory factors, which can activate or inhibit the expression of structural genes by binding to their promoter sequences. A total of 3571 TF genes were identified through transcriptome data analysis, and could be grouped into 55 transcription factor families (Table S2), the majority of which were bHLH and MYB, followed by the NAC, B3, WRKY, ERF, FAR1, C2H2 and C3H transcription factor families ( Figure 6). It has been widely confirmed that members of the bHLH and MYB families regulate anthocyanin biosynthesis in plants. In the present study, we identified 383 and 330 unigenes in the bHLH and MYB families, with 102 and 124 upregulated and 281 and 206 down-regulated, in YF vs. RF, respectively (Table S3). It is worth noting that three genes were annotated as being associated with R2R3-MYB (TRINITY_DN18821_c0_g5, TRINITY_DN20762_c1_g1 and TRINITY_DN23575_c1_g1). In particular, the MYB TF (TRINITY_DN20762_c1_g1) was only highly expressed in RF, but not in YF (Table S4).

Regulatory Network of Anthocyanin Biosynthetics.
In order to confirm the expression profiles of key DEGs involved in the anthocyanin biosynthesis pathway, real-time quantitative PCR (qRT-PCR) was used to verify the 12 DEGs with different peel coloration in two varieties of C. humilis, and the gene expression profiles were analyzed (Figure 7a). The qRT-PCR was highly correlated with RNA-seq data, showing consistency in up-regulation and

Regulatory Network of Anthocyanin Biosynthetics
In order to confirm the expression profiles of key DEGs involved in the anthocyanin biosynthesis pathway, real-time quantitative PCR (qRT-PCR) was used to verify the 12 DEGs with different peel coloration in two varieties of C. humilis, and the gene expression profiles were analyzed (Figure 7a). The qRT-PCR was highly correlated with RNA-seq data, showing consistency in up-regulation and down-regulation of DEG expression (Figure 7b). The Pearson's correlation coefficient between RNA-seq and qRT-PCR was 0.8829 (Figure 7b). These results indicate that the RNA-seq data were reliable. For a better understanding of the relationship between metabolites and genes in anthocyanin biosynthesis, we combined all metabolites and transcriptome data to establish a network. We clearly found the relationship between gene expression and metabolites in the whole anthocyanin biosynthesis pathway of different varieties (Figure 8).
Two main classes of functional enzymes exist in this pathway; the early biosynthetic enzymes catalyze all flavonoid synthesis, and later biosynthetic enzymes, which are required for the synthesis of anthocyanins. First, compared with the RF group, the YF group had no significant changes in the early genes (such as PAL, C4H, 4CL, CHS, and CHI) of anthocyanin synthesis. Subsequently, F3H was responsible for the conversion of naringenin to dihydrokaempferol, as this was significantly upregulated in the YF group. However, transcriptome data showed that the late genes of anthocyanin synthesis (such as DFR, ANS, and UFGT) in the RF group were significantly higher than those in the For a better understanding of the relationship between metabolites and genes in anthocyanin biosynthesis, we combined all metabolites and transcriptome data to establish a network. We clearly found the relationship between gene expression and metabolites in the whole anthocyanin biosynthesis pathway of different varieties (Figure 8).
Two main classes of functional enzymes exist in this pathway; the early biosynthetic enzymes catalyze all flavonoid synthesis, and later biosynthetic enzymes, which are required for the synthesis of anthocyanins. First, compared with the RF group, the YF group had no significant changes in the early genes (such as PAL, C4H, 4CL, CHS, and CHI) of anthocyanin synthesis. Subsequently, F3H was responsible for the conversion of naringenin to dihydrokaempferol, as this was significantly up-regulated in the YF group. However, transcriptome data showed that the late genes of anthocyanin synthesis (such as DFR, ANS, and UFGT) in the RF group were significantly higher than those in the YF group. Therefore, these candidate genes obtained from this study could be used for further study to uncover the coloring mechanism of C. humilis fruit peel (Table S5).

Integrated Analysis of the Transcriptome and Metabolome of Anthocyanins.
Ten differential metabolites were obtained from anthocyanins, among which highly positive correlations were obtained among all of the metabolites (Pearson correlation coefficient >0.85). Moreover, the correlation analysis between anthocyanin biosynthesis-related genes and metabolites showed highly positive correlations among most genes and metabolites across the whole level. However, three genes (TRINITY_DN19104_c0_g1, TRINITY_DN22982_c3_g2 and TRINITY_DN23387_c0_g1) were negatively correlated with all other genes and metabolites, and the three genes have highly positive correlations with each other (Figure 9a). The results of metabonomics and transcriptome analysis showed a significant negative correlation between the three genes (TRINITY_DN19104_c0_g1, TRINITY_DN23387_c0_g1 and TRINITY_DN22982_c3_g2) and pmb0545 (Rosinidin O-hexoside) in the metabolites (Pearson correlation coefficient > -0.98) (Figure 9b). In addition, eight genes (TRINITY_DN20425_c0_g9, TRINITY_DN9931_c0_g2, TRINITY_DN9931_c0_g1, TRINITY_DN18971_c0_g3, TRINITY_DN21824_c1_g2, TRINITY_DN19472_c1_g2, TRINITY_DN20281_c1_g1 and TRINITY_DN21332_c0_g2) were significantly correlated with most of the differential metabolites (Figure 9b, Table S6); their regulatory relationships need to be further studied.

Integrated Analysis of the Transcriptome and Metabolome of Anthocyanins
Ten differential metabolites were obtained from anthocyanins, among which highly positive correlations were obtained among all of the metabolites (Pearson correlation coefficient >0.85). Moreover, the correlation analysis between anthocyanin biosynthesis-related genes and metabolites showed highly positive correlations among most genes and metabolites across the whole level. However, three genes (TRINITY_DN19104_c0_g1, TRINITY_DN22982_c3_g2 and TRINITY_DN23387_c0_g1) were negatively correlated with all other genes and metabolites, and the three genes have highly positive correlations with each other (Figure 9a). The results of metabonomics and transcriptome analysis showed a significant negative correlation between the three genes (TRINITY_DN19104_c0_g1, TRINITY_DN23387_c0_g1 and TRINITY_DN22982_c3_g2) and pmb0545 (Rosinidin O-hexoside) in the metabolites (Pearson correlation coefficient > −0.98) (Figure 9b). In addition, eight genes (TRINITY_DN20425_c0_g9, TRINITY_DN9931_c0_g2, TRINITY_DN9931_c0_g1, TRINITY_DN18971_c0_g3, TRINITY_DN21824_c1_g2, TRINITY_DN19472_c1_g2, TRINITY_ N20281_ c1_g1 and TRINITY_DN21332_c0_g2) were significantly correlated with most of the differential metabolites ( Figure 9b, Table S6); their regulatory relationships need to be further studied.

Effect of Content and Types of Anthocyanins on Coloration in Yellow and Red Fruit Peel
Fruit color is an important trait that affects the edible value of fruit to consumers, and the relationship between color formation and anthocyanin accumulation has been extensively studied for many years. Currently, several reports have described key genes and enzymes for fruit color formation and accumulation in many fruits, including apples and peaches [17,18]. However, the mechanism of anthocyanin biosynthesis in C. humilis is still obscure. In this study, the differences in anthocyanin metabolites in different varieties of C. humilis were qualitatively and quantitatively analyzed, and the UPLC-MS/MS (anthocyanin targeted) method was used to analyze the types of anthocyanin-related metabolites in different varieties of C. humilis for the first time. Based on target metabolomics analysis, a total of 16 metabolites of anthocyanins were identified, of which 11 were significantly different ( Table 1). Among the three main categories of pigmented glycosides, pelargonidin mainly shows orange/red, cyanidin mainly shows pink/magenta and delphinidin mainly shows purple/blue. Interestingly, cyanidin and pelargonidin contents were higher in RF but only cyanidin O-syringic acid and pelargonidin 3-O-beta-D-glucoside (Callistephin chloride) were detected in YF (Table 1). Zhou et al. [50] found that cyanidin O-syringic acid, petunidin 3-O-glucoside, and pelargonidin 3-O-β-d-glucoside were detected in pink tea flowers, while these secondary metabolites were not detected in white flowers. Similarly, anthocyanins were found in red Lycoris longitude, however no anthocyanins were found in white and yellow flowers [51]. Such a result provides us with insight into the difference between red and yellow coloring in C. humilis fruit peel.

Key Genes in the Anthocyanin Pathway Affect Coloration in Yellow and Red Fruit Peel
The genes in the pathway of anthocyanin metabolism are relatively conservative in different plants; for instance, for genes related to anthocyanin synthesis in C. humilis fruit peel, 23 transcripts encoding 11 enzymes were isolated. Three of the 11 were phenylalanine synthesis genes, including PAL, C4H and 4CL; six were flavonoid synthase genes, including CHS, CHI, F3H, F3'H, DFR, and ANS; one was a proanthocyanidin synthase gene, LAR; and one was an anthocyanin synthase gene, UFGT [36,38,52]. In our studies, the expression analysis of early genes showed that the expression of F3H in YF was higher than that in RF (Figure 8), suggesting that a large amount of dihydrokaempferol could be produced in YF, but this could not eventually outflow into anthocyanin synthase. Similar reports have revealed the role of naringenin chalcone as a yellow pigment in the coloring of yellow flowers [53]. Therefore, we speculated that this could be the reason for the color formation of YF. Our results also showed that DFR was significantly more upregulated in RF than in YF (Table S5). DFR led dihydrokaempferol and dihydroquercetin into the final pathway of anthocyanin synthesis [22]. It is worth mentioning that ANS and UFGT showed higher expression levels in RF, with several hundred more folds than YF (Table S5). The conversion of leucopelargonidin and leucocyanidin into unstable anthocyanins occurred under the catalysis of ANS [54]. These unstable anthocyanins were further glycosylated by UFGT to form stable anthocyanins, which finally showed color [55]. Therefore, we speculate that ANS and UFGT are two key genes that determine the color of RF.

Impact of Different Varieties on TFs
Several studies have shown that the MBW protein complex, composed of R2R3-MYB and bHLH TFs, plays an important role in regulating the transcription of anthocyanin biosynthesis-related genes, together with WD40 protein [12,13,56]. The activity of R2R3-MYB has a unique role in determining the function of the complex, which could promote or inhibit the transcription of structural genes. In this study, three R2R3-MYB TFs were identified by RNA-seq analysis, and all of them were significantly up-regulated in RF. We speculated that they may play a key role in the anthocyanin synthesis of RF (Table S4). For example, Meng et al. [57] found overexpression of the R2R3-MYB transcription factor LeAN2 could promote anthocyanin accumulation in transgenic tobacco. Similarly, Yao et al.
reported that PyMYB114 could increase anthocyanin biosynthesis in red-skinned pears [58]. In contrast, Zhou et al. reported that the overexpression of VvMYBC2L2 in tobacco resulted in a significant decrease in petal anthocyanin concentration [59]. However, the results of the current work can only show the expression of TFs in different varieties of C. humilis. The molecular mechanisms of interactions between these TFs and the promoter sequences of structural genes require further elucidation.

Integrated Analysis of the Transcriptome and Metabolome
Anthocyanins are the final product of plant development, and play an important role in regulating the color changes of fruits. In this study, combined metabonomics and transcriptome analysis showed that three late genes, DFR, ANS and UFGT, were significantly positively correlated with most of the different metabolites (Figure 9b), which provided a reference for the correlations between anthocyanin metabolites and genes. These late genes are likely regulated by upstream genes, which may finally lead to changes in the metabolism of downstream pathways. The results provide valuable information for explaining the color difference of C. humilis. Nevertheless, the specific mechanism still needs further research and exploration.

Conclusions
In this study, we sequenced and analyzed the transcriptome and metabolome in red and yellow fruit peel. A total of 9496 DEGs were identified in the transcriptome of C. humilis, of which 3939 were up-regulated and 5557 were down-regulated. A total of 16 anthocyanin metabolites were detected, including 11 differential metabolites. Through the combined analysis of the transcriptome and metabolomic data, we investigated the regulatory network and correlation analysis of anthocyanin accumulation in different varieties. Additionally, we screened out the key TFs and structural genes that may participate in anthocyanin accumulation. The results of this study provide a deeper understanding of the molecular mechanism of anthocyanin accumulation in C. humilis and lay a foundation for breeding new varieties of C. humilis.

Supplementary Materials:
The following are available online at http://www.mdpi.com/1999-4907/11/10/1065/s1, Figure S1: The total ion current (TIC) of Cerasus humilis fruit peel, Figure S2: Statistics of annotation information of assembled unigenes in Cerasus humilis. Gene Ontology (GO) classification of unigenes, Figure S3: Statistics of annotation information of assembled unigenes in Cerasus humilis. KOG function annotation of putative proteins, Table S1: Primer sequences for qRT-PCR, Table S2: The numbers of transcription factors, Table S3: The percent of transcription factors involved in the top 20 transcription factor families, Table S4: MYB TF expression in RF and YF, Table S5: Anthocyanin biosynthesis-related genes, Table S6: Correlation analysis of anthocyanin synthesis genes and metabolites.
Author Contributions: X.J. and X.S. conceived and designed the experiments. X.J. and L.Z. performed the experiments. X.J., J.R., S.L. D.W. and X.S. analyzed the data. X.J., J.R. and X.S. wrote the manuscript. All authors have read and agreed to the published version of the manuscript.