Combined Analysis of the Fruit Metabolome and Transcriptome Reveals Candidate Genes Involved in Flavonoid Biosynthesis in Actinidia arguta

To assess the interrelation between the change of metabolites and the change of fruit color, we performed a combined metabolome and transcriptome analysis of the flesh in two different Actinidia arguta cultivars: “HB” (“Hongbaoshixing”) and “YF” (“Yongfengyihao”) at two different fruit developmental stages: 70d (days after full bloom) and 100d (days after full bloom). Metabolite and transcript profiling was obtained by ultra-performance liquid chromatography quadrupole time-of-flight tandem mass spectrometer and high-throughput RNA sequencing, respectively. The identification and quantification results of metabolites showed that a total of 28,837 metabolites had been obtained, of which 13,715 were annotated. In comparison of HB100 vs. HB70, 41 metabolites were identified as being flavonoids, 7 of which, with significant difference, were identified as bracteatin, luteolin, dihydromyricetin, cyanidin, pelargonidin, delphinidin and (−)-epigallocatechin. Association analysis between metabolome and transcriptome revealed that there were two metabolic pathways presenting significant differences during fruit development, one of which was flavonoid biosynthesis, in which 14 structural genes were selected to conduct expression analysis, as well as 5 transcription factor genes obtained by transcriptome analysis. RT-qPCR results and cluster analysis revealed that AaF3H, AaLDOX, AaUFGT, AaMYB, AabHLH, and AaHB2 showed the best possibility of being candidate genes. A regulatory network of flavonoid biosynthesis was established to illustrate differentially expressed candidate genes involved in accumulation of metabolites with significant differences, inducing red coloring during fruit development. Such a regulatory network linking genes and flavonoids revealed a system involved in the pigmentation of all-red-fleshed and all-green-fleshed A. arguta, suggesting this conjunct analysis approach is not only useful in understanding the relationship between genotype and phenotype, but is also a powerful tool for providing more valuable information for breeding.


Introduction
Kiwifruit (Actinidiaceae, genus Actinidia), a kind of perennial and deciduous plant, is one of the fruit trees that have been domesticated and cultivated successfully in the last century, and in recent years has been grown commercially worldwide [1,2]. As a fruit tree originating from China, the genus Actinidia comprises a total of 76 species and about 125 known taxa worldwide [3], but among these, only two species-Actinidia chinensis and A. deliciosa-have been cultivated commercially [4]. software was conducted by the overall MS signal intensity controlled by TIC (total ion chromatogram) ( Figure S1), evaluation of width for m/z peak, and assessment of width for retention time peak for each feature ( Figure S2).
The original wiff format data was converted to the mzXML format using MSConvert software [27], followed by alignment and extraction of the peak and calculation of the peak area. The final statistics showed that 18,598 and 10,239 metabolites were obtained by the POS (positive) and NEG (negative) models, respectively, 9016 and 4699 of which were annotated (Table 1).   Figure S1), evaluation of width for m/z peak, and assessment of width for retention time peak for each feature ( Figure S2). The original wiff format data was converted to the mzXML format using MSConvert software [27], followed by alignment and extraction of the peak and calculation of the peak area. The final statistics showed that 18,598 and 10,239 metabolites were obtained by the POS (positive) and NEG (negative) models, respectively, 9016 and 4699 of which were annotated (Table 1). Mode indicates that the mode of MS analysis is mainly divided into a positive ion mode and a negative ion mode; all metabolites indicates the number of substances extracted by XCMS software (3.2.0 version, UC, Berkeley, CA, USA); all annotated indicates the amount of metabolites annotated by level-one and level-two MS data; MS2 indicates the number of metabolites that could not only match that of level one m/z, but could also match that of a level-two fragment ion in the database; MS1 PLANTCYC indicates the number of metabolites that could match that of level-one m/z in the database; MS1 KEGG (Kyoto Encyclopedia of Genes and Genomes) indicates the number of metabolites assigned to KEGG pathways, POS (positive) and NEG (negative). The metabolites identified above were assigned to the KEGG and PLANTCYC databases. All 8070 and 4321 metabolites were classified into 18 KEGG second-grade pathways, 3580 and 2461 of which were classified into "metabolism" in the POS and NEG models, respectively. For the "metabolism" term, the top priority was "Global and overview maps", followed by "biosynthesis of other secondary metabolites", "metabolism of terpenoids and polyketides", "amino acid metabolism", "metabolism of cofactors and vitamins" and "carbohydrate metabolism" (Figure 2). 480 and 340 metabolites were assigned to "biosynthesis of other secondary metabolites" in the POS Figure 1. The phenotype of Actinidia arguta cv. "HB" and "YF" at two different sampling stages 70 DAFB and 100 DAFB, respectively. (A) Green fruit "HB" at 70 DAFB, HB70; (B) Red fruit "HB" at 100 DAFB, HB100; (C) Green fruit "YF" at 70 DAFB, YF70; (D) Green fruit "YF" at 100 DAFB, YF100.
The metabolites identified above were assigned to the KEGG and PLANTCYC databases. All 8070 and 4321 metabolites were classified into 18 KEGG second-grade pathways, 3580 and 2461 of which were classified into "metabolism" in the POS and NEG models, respectively. For the "metabolism" term, the top priority was "Global and overview maps", followed by "biosynthesis of other secondary metabolites", "metabolism of terpenoids and polyketides", "amino acid metabolism", "metabolism of cofactors and vitamins" and "carbohydrate metabolism" (Figure 2). 480 and 340 metabolites were assigned to "biosynthesis of other secondary metabolites" in the POS and NEG models, respectively. Of these, 73 and 67 metabolites were involved in "flavonoid biosynthesis" and "anthocyanin biosynthesis" in the POS and NEG models, respectively ( Figure 3).
All 1267 and 360 metabolites obtained from level-two identification were assigned to the HMDB database, 550 and 250 of which were matched and classified into 11 HMDB super classes in POS and NEG, respectively. Of these, 305 metabolites were included in the "organic acids and derivatives" term, which presented the majority of all classes in the POS model. Meanwhile, "organic acids and derivatives", which contained 59 metabolites, was the first class, followed by "organooxygen compounds" and "lipids and lipid-like molecules" in the NEG model ( Figure 4). and NEG models, respectively. Of these, 73 and 67 metabolites were involved in "flavonoid biosynthesis" and "anthocyanin biosynthesis" in the POS and NEG models, respectively ( Figure 3). All 1267 and 360 metabolites obtained from level-two identification were assigned to the HMDB database, 550 and 250 of which were matched and classified into 11 HMDB super classes in POS and NEG, respectively. Of these, 305 metabolites were included in the "organic acids and derivatives" term, which presented the majority of all classes in the POS model. Meanwhile, "organic acids and derivatives", which contained 59 metabolites, was the first class, followed by "organooxygen compounds" and "lipids and lipid-like molecules" in the NEG model ( Figure 4).

Identified Metabolites Involved in Flavonoid Biosynthesis
In order to obtain quantitative information on the metabolites, 14,132 and 8400 high-quality metabolites obtained from the 18,598 and 10,239 of all metabolites were used for differential analysis ( Table 2). According to the quantitative results of the identified metabolites, the differential metabolites between different group comparisons were analyzed based on fold-change and p-value. In the comparison between HB100 and HB70, a total of 2421 and 1838 metabolites presented as being up-regulated and down-regulated, respectively (Table 3). In addition, a total of 41 metabolites were identified as being flavonoids among the high-quality metabolites, 7 of which were with significant difference, and were identified as bracteatin, luteolin, dihydromyricetin, cyanidin, pelargonidin, delphinidin and (−)-epigallocatechin (Tables S1 and 4).

Comprehensive Analysis of Metabolome and Transcriptome
In order to investigate the association between metabolites and genes involved in the same biological process (KEGG Pathway), the comprehensive analysis of metabolome and transcriptome was performed using Pearson's Correlation Coefficient [28,29]. The results showed that 646, 156 and 484 differentially expressed genes participated in 13, 18 and 12 pathways among three groups HB100 vs. HB70, HB100 vs. YF10 and YF100 vs. YF70, respectively (Table S3). The 2 pathways, "flavonoid biosynthesis" and "galactose metabolism", were a common presence among the three group comparisons. In flavonoid biosynthesis, a total of 8 candidate DEGs were found ( Figure 6).

RT-qPCR, Cluster and Phylogenetic Analysis
A total of 14 structural genes comprising the 8 DEGs found above and 6 other structural genes involved in flavonoid biosynthesis were determined to conduct RT-qPCR, as well as 5 regulatory genes AaMYBC1, AaMYB, AabHLH, AaHB1 and AaHB2 (Figure 7). The expression patterns of these three structural genes AaF3H, AaLDOX and AaUFGT were similar to the two regulatory genes AaMYB and AaHB2. The expression levels of AaF3H, AaLDOX and AaUFGT in "HB100" was significantly higher than those in "HB70" and "YF100" ( Figure 7A). A similar rule of expression was detected for AaMYB and AaHB2. The expression levels of AaMYB and AaHB2 in "HB100" were higher than those found in "HB70" and "YF100" ( Figure 7B). Cluster analysis results showed that the genes AaF3H, AaLDOX, AaUFGT, AaMYB, AabHLH and AaHB2 were clustered into the same class (Figure 8), suggesting that the expression patterns of these genes were similar to each other. Because MYB is the biggest and most important transcription factor family involved in flavonoids, AaMYB was used for BLAST in order to search for other MYB sequences in other species. A phylogenetic tree was constructed with the BLAST results. The results showed that AaMYB and CsMYB5a (MYB5a of Camellia sinensis) were clustered into one class ( Figure 9).

Comprehensive Analysis of Metabolome and Transcriptome
In order to investigate the association between metabolites and genes involved in the same biological process (KEGG Pathway), the comprehensive analysis of metabolome and transcriptome was performed using Pearson's Correlation Coefficient [28,29]. The results showed that 646, 156 and 484 differentially expressed genes participated in 13, 18 and 12 pathways among three groups HB100 vs. HB70, HB100 vs. YF10 and YF100 vs. YF70, respectively (Table S3). The 2 pathways, "flavonoid biosynthesis" and "galactose metabolism", were a common presence among the three group comparisons. In flavonoid biosynthesis, a total of 8 candidate DEGs were found ( Figure 6).

RT-qPCR, Cluster and Phylogenetic Analysis
A total of 14 structural genes comprising the 8 DEGs found above and 6 other structural genes involved in flavonoid biosynthesis were determined to conduct RT-qPCR, as well as 5 regulatory genes AaMYBC1, AaMYB, AabHLH, AaHB1 and AaHB2 (Figure 7). The expression patterns of these three structural genes AaF3H, AaLDOX and AaUFGT were similar to the two regulatory genes AaMYB and AaHB2. The expression levels of AaF3H, AaLDOX and AaUFGT in "HB100" was significantly higher than those in "HB70" and "YF100" (Figure 7A). A similar rule of expression was detected for AaMYB and AaHB2. The expression levels of AaMYB and AaHB2 in "HB100" were higher than those found in "HB70" and "YF100" (Figure 7B). Cluster analysis results showed that the genes AaF3H, AaLDOX, AaUFGT, AaMYB, AabHLH and AaHB2 were clustered into the same class (Figure 8), suggesting that the expression patterns of these genes were similar to each other. Because MYB is the biggest and most important transcription factor family involved in flavonoids, AaMYB was used for BLAST in order to search for other MYB sequences in other species. A phylogenetic tree was constructed with the BLAST results. The results showed that AaMYB and CsMYB5a (MYB5a of Camellia sinensis) were clustered into one class (Figure 9).

Regulatory Network of Flavonoid Biosynthesis
In order to better understand the relationship between metabolites and genes in flavonoid biosynthesis, all results of metabolites and genes were combined to establish a network, aiming to show the relationship between gene expression and metabolite accumulation more intuitively (Figure 10). Among the 14 structural genes, AaF3H, AaLDOX and AaUFGT were highly expressed in HB100 vs. HB70 in comparison with the other 11 structural genes. The 7 metabolites with significant differences in HB100 vs. HB70 of flavonoid biosynthesis were: bracteatin, luteolin, dihydromyricetin, cyanidin, pelargonidin, delphinidin and (−)-epigallocatechin. It is generally known that transcription factor plays a role by binding the promoter of structural genes. Therefore, we speculate the hypothesis that the three transcription factors AaMYB, AabHLH, and AaHB2 activate expression of AaF3H, AaLDOX and AaUFGT by binding the promoter of these three structural genes, inducing the accumulation of the 7 metabolites described above in flavonoid biosynthesis. Specifically, AaMYB, AabHLH, AaHB2, AaF3H, AaLDOX and AaUFGT were the candidate genes obtained from this study and could be used for further study, revealing the red mechanism of A. arguta.

Regulatory Network of Flavonoid Biosynthesis
In order to better understand the relationship between metabolites and genes in flavonoid biosynthesis, all results of metabolites and genes were combined to establish a network, aiming to show the relationship between gene expression and metabolite accumulation more intuitively ( Figure  10). Among the 14 structural genes, AaF3H, AaLDOX and AaUFGT were highly expressed in HB100 vs. HB70 in comparison with the other 11 structural genes. The 7 metabolites with significant differences in HB100 vs. HB70 of flavonoid biosynthesis were: bracteatin, luteolin, dihydromyricetin, cyanidin, pelargonidin, delphinidin and (−)-epigallocatechin. It is generally known that transcription factor plays a role by binding the promoter of structural genes. Therefore, we speculate the hypothesis that the three transcription factors AaMYB, AabHLH, and AaHB2 activate expression of AaF3H, AaLDOX and AaUFGT by binding the promoter of these three structural genes, inducing the accumulation of the 7 metabolites described above in flavonoid biosynthesis. Specifically, AaMYB, AabHLH, AaHB2, AaF3H, AaLDOX and AaUFGT were the candidate genes obtained from this study and could be used for further study, revealing the red mechanism of A. arguta.

Metabolites Were Obtained by Metabolome Analysis
Plant metabolomics, a new field in the post-genome era [30], presents the rule of change of metabolites in various tissues. Metabolites are the final products of cell biological regulation process, and their level can be regarded as the response of plant development to genetic and environmental changes [31]. Therefore, metabolomic analysis can be used to investigate the relationship between biological processes and phenotypes; furthermore, some intuitive changes can also be observed at the metabolic level [32]. Up until now, several reports have examined the metabolic responses of different species to flavonoid biosynthesis, such as Fagopyrum esculentum [33], Camellia sinensis [34], and Ficus carica L. [35]. This research suggests that metabolome analysis plays a crucial role in explaining the molecular responses to flavonoid biosynthesis. In this study, through fruit metabolome analysis on two different A. arguta cultivars "HB" and "YF" at two developmental stages, a total of 28,837 metabolites were obtained, of which 13,715 had been annotated (Table 1). In addition, 22,532 high-quality metabolites were identified and selected for differential analysis ( Table 2). The focus of our research was on flesh coloring, so the HB100 vs. HB70 comparison was selected as the object for further analysis. In the comparison HB100 vs. HB70, 41 metabolites were identified as being flavonoids, 7 of which, with significant difference, were identified as bracteatin, luteolin, dihydromyricetin, cyanidin, pelargonidin, delphinidin and (−)-epigallocatechin (Table S1). Such a result provides us with the insight that the difference between these 7 metabolites leads to the difference between red and green coloring in A. arguta flesh.

Transcription Factors Involved in Flavonoid Biosynthesis
Transcription factors are absolutely necessary for regulation of gene expression [36]. Normally, transcription factor proteins play a role through the combination of their own DNA-binding domain and the cis-acting element of their target genes [37,38]. In plants, transcription factors could participate in various biological process, including developmental regulation, defense elicitation, and stress responses [39][40][41][42][43]. Numerous studies have shown that flavonoid biosynthesis is regulated by the MBW complex, comprising MYB, bHLH, WD40 [44][45][46]. Therefore, finding transcription factors related to flavonoid biosynthesis is essential for investigating fruit coloring. In this study, 5 transcription factor genes were obtained by transcriptome analysis and used for expression analysis. The results showed that AaMYB, AabHLH and AaHB2 were highly expressed at HB100, indicating that these three transcription factors might play a key role in fruit coloring in A. arguta. Phylogenetic analysis revealed that AaMYB was highly homologous to CsMYB5a, rather than the other MYB in A. chinensis, in which AcMYB75 [47], AcMYB110 [48] and AcMYBF110 [13] were the key MYB transcription factors controlling fruit coloring, indicating that the MYB transcription factor that plays a key role in regulating flavonoid biosynthesis might be different in different Actinidia species.

Candidate Genes are Involved in Regulating Fruit Coloring
Genes, including structural and regulatory genes, involved in flavonoid biosynthesis and regulation have been found, studied and reported in many plants, including Actinidia [11,13,[15][16][17][18][19]. However, most genes have been obtained and identified using traditional study technology. Since transcriptome analysis has come to be regarded as a crucial way to study the expression level, structure and function of genes in order to revealing phenotypic traits, combined analysis of transcriptome and metabolome has increasingly become a popular and practical tool for the mining of new genes involved in various metabolic pathways [49]. In this study, metabolome data, coupled with transcriptome profiling, was carried out in a combined analysis for the discovery of genes involved in flavonoid biosynthesis, thus searching for useful information to illustrate phenomenon of red coloring in A. arguta fruit. 19 genes, comprising 14 structural genes and 5 transcription factor genes, were obtained and used to analyze expression level; additionally, cluster analysis was conducted and the construction of phylogenetic tree was also performed. The results showed that structural genes including AaF3H, AaLDOX and AaUFGT and transcription factor genes including AaMYB, AabHLH and AaHB2 were highly expressed at HB100, when the flesh color significantly presented red (Figure 7). In addition, the cluster analysis results suggested that AaF3H, AaLDOX, AaUFGT, AaMYB, AabHLH and AaHB2 were clustered into one class, indicating that their expression patterns were similar to each other (Figure 8). Based on these results, a regulatory network of flavonoid biosynthesis was established to show the role of genes involved in pathways more intuitively ( Figure 10). Thus, such a model of action was derived: the three transcription factors AaMYB, AabHLH and AaHB2 interact with promotors of the three structural genes aF3H, AaLDOX, and AaUFGT to control the expression, inducing the accumulation of metabolites and the appearance of the red phenotype of A. arguta fruit.
This method-combined metabolome and transcriptome-is an effective analytical method for explaining the relationship between key genes and metabolites involved in biosynthesis pathways. Using this method, we determined the candidate genes and metabolites involved in the flavonoid biosynthesis pathway, providing valuable information and a useful reference for explaining the phenomenon of red coloring of A. arguta fruit. Nevertheless, the specific mechanism still needs further research, and remains to be explored.

Fruit Materials
Two different types of kiwifruit (Actinidia arguta), "Hongbaoshixing" ("HB", a kind of all-red-fleshed A. arguta cultivar) and "Yongfengyihao" ("YF", a kind of all-green-fleshed A. arguta cultivar), were selected as experimental materials. Two different sampling stages of fruit development, defined in days after full bloom (DAFB), were 70 DAFB (70d, green fruit) and 100 DAFB (100d, the color-break stage, at which point they start to turn red). 'YF' fruit selected as a control case were green throughout the whole development process and were also sampled at 70d to 100d (Figure 1). Both of them are tetraploids with similar genetic backgrounds. The fruit samples at the two developmental stages described above were harvested from the National Kiwifruit Germplasm Garden of Zhengzhou Fruit Research Institute, CAAS, Henan province, China. Three biological replicates were collected per sample, each with 30 fruits randomly collected from 6 kiwifruit trees, every two of which were set as a biological replication; thus, all data were obtained based on three independent biological replicates. All flesh samples of the fresh fruits were dissected using a blade, frozen immediately in liquid nitrogen, and then stored at −80 • C until further use.

Metabolite Extraction and Parameter Setting
The collected samples were thawed on ice, and metabolites were extracted with 50% methanol buffer (50% solution of methanol in distilled water). Briefly, 20 µL of sample was extracted with 120 µL of precooled 50% methanol, vortexed for 1 min, and incubated at room temperature for 10 min; the extraction mixture was then stored overnight at −20 • C. After centrifugation at 4000× g for 20 min, the supernatants were transferred into new 96-well plates. Three independent repetitions were executed for the extraction and subsequent analysis process.
A high-resolution tandem mass spectrometer TripleTOF5600plus (SCIEX, Cheshire, UK) was used to detect metabolites eluted form the column. The Q-TOF was operated in both positive and negative ion modes. XCMS software 3.2.0 (UC, Berkeley, CA, USA) was used to control the chromatograph and mass spectrometer.

Identification and Quantification of Metabolite
MSConvert was used to transform LC-MS raw data into the mzXML format, which was then processed by the XCMS, CAMERA and metaX toolbox, implemented in the R software [50][51][52]. The combined retention time (RT) and m/z data were used to identify each ion.

RNA Sequencing
RNA isolation, purification and monitoring, and cDNA library construction and sequencing were performed as previously described [53]. Briefly, RNA purity, concentration and integrity were checked, measured and assessed using the NanoPhotometer ® spectrophotometer (IMPLEN, Westlake Village, CA, USA), Qubit ® RNA Assay Kit in Qubit ® 2.0 Flurometer (Life Technologies, Carlsbad, CA, USA) and RNA Nano 6000 Assay Kit of the Agilent Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA), respectively. Sequencing libraries were generated using NEBNext ® Ultra™ RNA Library Prep Kit for Illumina ® (NEB, Ipswich, MA, USA) following manufacturer's recommendations and were sequenced on an Illumina Hiseq platform. According to the manufacturer's instructions, NEBNext ® Ultra™ RNA Library Prep Kit for Illumina ® (NEB, Ipswich, MA, USA) was used to generate sequencing libraries, which were then sequenced on an Illumina Hiseq platform.

Analysis of Transcription Factors and DEGs (Differentially Expressed Genes)
Transcription factors were predicted using iTAK software [54]. The identification and classification of transcription factors were conducted by previously described methods [55,56]. On the basic of model of negative binomial distribution, DESeq R package (1. 10. 1) was used to perform differential expression analysis by providing statistical routines used to determine differential expression by means of the digital data of gene expression. The false discovery rate was deliberately controlled by the adjusted P values, which were adjusted by Benjamini and Hochberg's approach. The differentially expressed genes were defined as whose adjusted p-value < 0.05 found by DESeq.

RT-qPCR (Real-Time Quantitative PCR)
Total RNA was extracted using a modified CTAB (cetyltrimethyl ammonium bromide) method [16]. The testing of RNA quality and determination of RNA concentration were performed by 1.0% agarose gel electrophoresis and micro ultraviolet spectrophotometry (Thermo NanoDrop 2000, Thermo Fisher Scientific, Waltham, MA, USA), respectively. Approximately 1µg of total RNA was determined for cDNA synthesis using RevertAid™ First Strand cDNA synthesis kit (Thermo Fisher Scientific, Waltham, MA, USA). The LightCycler ® 480 real-time PCR system with a 96-well plate was used to conduct an amplified reaction consisting of 95 • C for 5 min, followed by 45 cycles of 10 s at 95 • C, 20 s at 60 • C, and 20 s at 72 • C in a volume of 10 µL. At the end of each experiment, a melt-curve analysis was carried out using the default parameters (5 s at 95 • C and 1 min at 65 • C). The Actinidia β-actin was used for normalization [57]. All analyses were repeated three times using biological replicates. Primer sequences are listed in Table S4.

Statistical Analysis
The relative expressions were calculated using the 2 −∆∆Ct method [35], and GraphPad Prism 5 (GraphPad Software Inc., San Diego, CA, USA) was used for chart preparation. The R-3.4.2 and MEGA6 were used to conduct the heatmap and cluster analysis. IBM SPSS Statistics 20 was used to test significant differences.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/19/5/ 1471/s1. Author Contributions: Y.L. designed the experiments, carried out the research, analyzed the data, wrote and revised the manuscript. M.L., Y.Z., L.S. and W.C. gave important suggestions and assisted in data analysis. J.F. and X.Q. designed the study and revised the manuscript. All authors participated in this study and approved the final version of the manuscript.