Proteomic Analysis of Hylocereus polyrhizus Reveals Metabolic Pathway Changes

Red dragon fruit or red pitaya (Hylocereus polyrhizus) is the only edible fruit that contains betalains. The color of betalains ranges from red and violet to yellow in plants. Betalains may also serve as an important component of health-promoting and disease-preventing functional foods. The biosynthetic and regulatory pathways for betalain production remain to be fully deciphered. In this study, isobaric tags for relative and absolute quantitation (iTRAQ)-based proteomic analyses were used to investigate the molecular mechanism of betalain biosynthesis in H. polyrhizus fruits at the white and red pulp stages. A total of 1946 proteins were identified as differentially expressed between the two samples, and 936 of them were significantly more abundant at the red pulp stage. Integrated RNA-seq and iTRAQ analyses showed that some transcripts and proteins were positively correlated; they belonged to “phenylpropanoid biosynthesis”, “tyrosine metabolism”, “flavonoid biosynthesis”, “ascorbate and aldarate metabolism”, “betalains biosynthesis” and “anthocyanin biosynthesis”. In the betalain biosynthesis pathway, several proteins/enzymes such as polyphenol oxidase, CYP76AD3 and 4,5-dihydroxy-phenylalanine (DOPA) dioxygenase extradiol-like protein were identified. The present study provides new insight into the molecular mechanism of betalain biosynthesis at the posttranscriptional level.


Introduction
Betalains are red and yellow pigments found only in plants of the order Caryophyllales. Interestingly, betalains and anthocyanins are naturally mutually exclusive in an individual plant. Betalain pigments have potential benefits in promoting health and preventing human diseases, acting as potent antioxidants and possessing anti-inflammatory and chemo-preventive activities in vitro and in vivo [1][2][3][4][5][6]. Betalains also contribute to the early-phase insulin response [7] and play a role in [...]
Here we used this technique to perform comparative proteomic analyses of the key enzymes involved in betalain biosynthetic pathways in pitaya. For the proteomic analysis, two proteomic libraries were constructed using pitaya (H. polyrhizus) fruits at two developmental stages (white and red pulp). Total proteins and changes in the protein profile during the coloring process were explored using the iTRAQ technique. A total of 89,583 peptides were identified, belonging to 33,548 peptide species. In total, 6725 proteins were identified and quantified at a 1% false-discovery rate (FDR), and only unique peptides were used for quantitative comparison. A fold change (FC) threshold of ≥1.5 was set to filter the comparative data sets, resulting in the identification of 1946 proteins as differentially expressed between the two samples. Among the differentially expressed proteins, 936 were significantly up-regulated in the fruits at the red pulp stage compared with those at the white pulp stage. iTRAQ has the advantages of high coverage, accuracy and sensitivity, and our proteomic analyses of H. polyrhizus fruits using this technique could provide useful information for understanding pitaya fruit development in general and the betalain biosynthesis pathway in particular, as discussed below (Tables 1 and 2).
Table 1. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment of the genes/proteins with a positive correlation (p-value < 0.05, FC (pro) > 1.5, FC (RNA) > 2) between the levels of transcripts and proteins.

KEGG Pathway                                 No. a   No. b   No. c
[...]                                        3       0       3
Chloroalkane and chloroalkene degradation    3       3       0
Glycerolipid metabolism                      3       3       0
Valine leucine isoleucine biosynthesis       3       0       3
Isoquinoline alkaloid biosynthesis           2       0       2
Flavonoid biosynthesis                       2       1       1
Naphthalene degradation                      2       2       0
a, the total number of differentially expressed genes and proteins; b, the number significantly up-regulated in the pitaya fruit at the red pulp stage; c, the number significantly down-regulated in the pitaya fruit at the red pulp stage. No., number; FC, fold change.
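The text does not state which statistical test underlies the KEGG pathway enrichment. A common choice for this kind of analysis is a one-sided hypergeometric test, and the following minimal sketch illustrates that test under this assumption. The function name and all arguments are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(total_annotated, pathway_size, selected, selected_in_pathway):
    """One-sided hypergeometric p-value, P(X >= selected_in_pathway).

    total_annotated      -- number of annotated background genes/proteins
    pathway_size         -- how many of the background belong to the pathway
    selected             -- number of differentially expressed genes/proteins
    selected_in_pathway  -- how many of the selected belong to the pathway
    """
    upper = min(pathway_size, selected)
    denominator = comb(total_annotated, selected)
    # Sum the probabilities of observing k pathway members for every k at or
    # above the observed count. math.comb returns 0 when k exceeds n, so
    # impossible combinations drop out of the sum.
    tail = sum(
        comb(pathway_size, k) * comb(total_annotated - pathway_size, selected - k)
        for k in range(selected_in_pathway, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the call; not study data.
    p = hypergeom_enrichment_p(
        total_annotated=1000, pathway_size=20, selected=100, selected_in_pathway=6
    )
    print(f"enrichment p-value: {p:.4g}")
```

A pathway would then be reported as enriched when this p-value falls below the chosen cutoff (p-value < 0.05 in Table 1), usually after correction for multiple testing.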

Functional Classification of Differentially Expressed Proteins during Pitaya Fruit Coloring
The proteomes representing the protein profiles of the pitaya fruits at the white and red pulp stages were comparatively analyzed based on the functions assigned to the proteins in public protein databases. Five hundred and seven (507) differentially accumulated proteins (p-value < 0.05, fold change (FC) > 1.5) in the red pulp fruits were annotated and classified into the "biological process (BP)", "molecular function (MF)", and "cellular component (CC)" categories and their sub-categories. Based on the numbers of unique proteins identified in each functional category, the two largest categories for each functional group were as follows: "translation" and "carbohydrate metabolic processes" for BP; "structural constituent of ribosome" and "DNA binding" for MF; and "ribosome" and "intracellular" for CC (Figure 1). Some of these proteins could be mapped to the following pathways: "phenylpropanoid biosynthesis", "phenylalanine metabolism", "tyrosine metabolism" and "flavonoid biosynthesis" (Figure 1). These pathways might influence the formation of betalains. Two hundred and eighteen (218) proteins were assigned to 23 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The most frequently detected proteins, representing 37.6% of all the proteins identified, belonged to the "ribosome" pathway (Figure 2). Further work is needed to establish the metabolic profiles of red pitaya across a series of developmental stages.

Integrated Analyses of Transcriptomic and Proteomic Datasets on the Pitaya Ripening
In our previous study, highly efficient RNA sequencing (RNA-Seq) technology was used to identify key genes related to betalain biosynthesis during pulp coloration of H. polyrhizus. About 12 Gb of raw RNA-Seq data were generated and de novo assembled into 122,677 transcripts, of which 122,668 were annotated [68]. Differentially expressed transcripts and proteins between the two stages were identified by comparing the RNA-seq and iTRAQ datasets. Although most of the differentially expressed genes/proteins were consistent at both the transcript and protein levels (see below), discrepancies were detected between the transcript and protein data, and such genes/proteins fell into the following two groups. The first group comprised those differentially expressed at the transcript level but not at the protein level (p-value < 0.05, FC (pro) < 1.3, FC (RNA) > 2). They functionally belonged to "oxidation-reduction process" (BP); "oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen" (MF); and "protein complex" (CC) (Table S1). These genes/proteins were further analyzed using the KEGG database and fell into 12 pathways, with five differentially accumulated proteins associated with "arginine metabolism" (Table S2). The second group comprised those differentially expressed at the protein level but not at the transcript level (p-value < 0.05, FC (pro) > 1.5, FC (RNA) < 2). However, no proteins with matching transcripts were detected in this group. Analyzing the Gene Ontology (GO) categories and KEGG pathways of the genes/proteins with a negative correlation between the transcript and protein levels allowed us to map 22 differentially expressed genes/proteins (Tables S3 and S4). The fact that the abundance of transcripts differed from that of the respective proteins strongly suggested that posttranscriptional regulation was involved in the metabolism and other cellular processes associated with pitaya ripening.
In fact, similar findings have previously been reported in other organisms, such as human [70], yeast [71] and plants [72]. Such findings became possible only when the transcriptomic and proteomic datasets were integrated.
[Figure legend, KEGG pathways shown: Phagosome; ko00071 Fatty acid metabolism; ko00940 Phenylpropanoid biosynthesis; ko00240 Pyrimidine metabolism; ko01004 Lipid biosynthesis proteins; ko05323 Rheumatoid arthritis; ko00350 Tyrosine metabolism; ko05110 Vibrio cholerae infection; ko05120 Epithelial cell signaling in Helicobacter pylori infection; ko00360 Phenylalanine metabolism; ko04966 Collecting duct acid secretion; ko00460 Cyanoamino acid metabolism; ko00625 Chloroalkane and chloroalkene degradation; ko00980 Metabolism of xenobiotics by cytochrome P450; ko00982 Drug metabolism-cytochrome P450; ko00626 Naphthalene degradation; ko00830 Retinol metabolism; ko04920 Adipocytokine signaling pathway; ko00592 alpha-Linolenic acid metabolism; ko00941 Flavonoid biosynthesis; ko00603 Glycosphingolipid biosynthesis-globo series; ko00199 Cytochrome P450; ko00950 Isoquinoline alkaloid biosynthesis; ko05416 Viral myocarditis; ko04210 Apoptosis; ko04974 Protein digestion and absorption.]
Despite the discrepancy discussed above, most of the genes/proteins had a positive correlation (p-value < 0.05, FC (pro) > 1.5, FC (RNA) > 2), i.e., they were differentially expressed at both the transcript and protein levels. These genes/proteins were mapped to such categories as "oxidation-reduction process" (BP), "oxidoreductase activity" (MF) and "membrane" (CC) (Figure 3). In total, 15 pathways were obtained, including "starch and sucrose metabolism", "phenylpropanoid biosynthesis", "lipid biosynthesis proteins", "ascorbate and aldarate metabolism", "tyrosine metabolism" and "flavonoid biosynthesis" (Table 1).
Among the positively correlated genes/proteins, three were up-regulated in "starch and sucrose metabolism". It was reported that soluble solids concentration (SSC) was closely related to the synthesis of betacyanin [73]. Five genes/proteins up-regulated at both the transcriptional and posttranscriptional levels were in "phenylpropanoid biosynthesis". Two up-regulated and two down-regulated genes/proteins belonged to the "tyrosine metabolism" pathway. Betalains are derived from tyrosine [28], and "phenylpropanoid biosynthesis" has been considered to be the upstream pathway of the tyrosine pathway. Two genes/proteins were involved in "flavonoid biosynthesis", one significantly up-regulated and the other significantly down-regulated. Four up-regulated transcripts and proteins were enriched in "ascorbate and aldarate metabolism" (Table 2), which plays a role in the biosynthesis of betalains [28]. These proteins may be associated with betalain biosynthesis in H. polyrhizus.
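The grouping rules used in this section can be sketched as a small classifier. This is an illustrative sketch rather than the authors' pipeline. It assumes that each fold change is a red/white ratio, that a ratio below 1 is treated as a change of magnitude 1/ratio, and that a "negative correlation" means both levels pass their thresholds in opposite directions (the text does not give thresholds for that group). The function and label names are hypothetical.

```python
def magnitude(ratio):
    """Size of a change regardless of direction (ratio must be positive)."""
    return ratio if ratio >= 1.0 else 1.0 / ratio


def direction(ratio):
    """+1 for up-regulation, -1 for down-regulation, 0 for no change."""
    return (ratio > 1.0) - (ratio < 1.0)


def classify(fc_rna, fc_pro, p_value, alpha=0.05):
    """Assign a gene/protein pair to one of the groups described in the text."""
    if p_value >= alpha:
        return "not significant"
    rna_changed = magnitude(fc_rna) > 2.0    # FC (RNA) > 2
    pro_changed = magnitude(fc_pro) > 1.5    # FC (pro) > 1.5
    pro_unchanged = magnitude(fc_pro) < 1.3  # FC (pro) < 1.3
    if rna_changed and pro_changed:
        if direction(fc_rna) == direction(fc_pro):
            return "positive correlation"
        return "negative correlation"        # assumed definition
    if rna_changed and pro_unchanged:
        return "transcript only"             # first group in the text
    if pro_changed and not rna_changed:
        return "protein only"                # second group in the text
    return "unclassified"


if __name__ == "__main__":
    # Hypothetical values, not study data.
    for rna, pro in [(3.0, 2.0), (3.0, 0.5), (3.0, 1.1), (1.2, 1.8)]:
        print(rna, pro, classify(rna, pro, p_value=0.01))
```

Pairs whose protein change lies between 1.3 and 1.5 fall outside every stated group, which is why the sketch returns "unclassified" for them instead of forcing an assignment.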
Naturally, betalains and anthocyanins do not co-exist in one plant. However, they can be made to exist together through genetic engineering [30,31]. Chalcone isomerase (CHI) and chalcone synthase (CHS) are upstream enzymes of the anthocyanin biosynthesis pathway. In this study, one protein annotated as CHI and another as CHS were identified. CHI and CHS had higher expression levels in the pitaya fruit at the red stage than at the white stage. This result was consistent with the conjecture that betalain-producing plants cannot produce anthocyanins due to lower levels of dihydroflavonol reductase (DFR), anthocyanidin synthase (ANS) and leucoanthocyanidin reductase (LAR) [74]. More interestingly, five proteins, namely comp37375_c0_seq2_4 and comp37375_c0_seq1_4 (polyphenol oxidase), comp37692_c0_seq1_7 (4,5-DOPA dioxygenase extradiol-like), comp16058_c0_seq1_4 (CYP76AD3), and comp26435_c0_seq1_2 (aromatic-L-amino-acid decarboxylase), were identified as some of the key enzymes in the betalain biosynthesis pathway (Table 2). Not surprisingly, the transcript levels of the genes encoding these proteins increased with the progression of betalain formation in the pitaya fruit, as revealed in our previous study [68].
These results suggested that these pathways could be involved in the biosynthesis of pigments, and that polyphenol oxidase, CYP76AD3 and 4,5-DOPA dioxygenase extradiol-like protein might be responsible for betalain biosynthesis in the pitaya fruit. These results were consistent with our previous findings from transcriptomic analysis that these enzymes might be involved in betalain biosynthesis in H. polyrhizus [68].
[...] (Figure S1). Every library consisted of equal amounts of protein from three fruits at each developmental stage.

Samples Preparation, Protein Extraction and Detection
Upon harvest, the samples described above were immediately frozen in liquid nitrogen and stored at −80 °C prior to protein extraction. The trichloroacetic acid (TCA)-acetone method was used for protein extraction. Briefly, pulp without seeds was ground to a fine powder in liquid nitrogen. Cold TCA-acetone (material-to-liquid ratio of 1:4) was added to a cold 15 mL tube and vortexed for 30 s. Total proteins were precipitated overnight at −20 °C. The precipitate was collected by centrifugation at 12,000 rpm for 15 min at 4 °C and subsequently washed with cold acetone and then with 90% acetone. The precipitate was freeze-dried under vacuum and dissolved in lysis buffer [8 M ureophil, ...]. The concentration and quality of the protein were assayed by the bicinchoninic acid (BCA) method and SDS-PAGE gel electrophoresis. At least 400 µg of protein (>2.0 µg/µL) was used for further experiments.

iTRAQ Labeling
The protein samples extracted above were prepared using the iTRAQ® Reagents 8plex Multi-plex kit (AB Sciex, Boston, MA, USA). Enriched phosphopeptides were labeled with isobaric tags for relative and absolute quantification reagents (AB Sciex). iTRAQ 113 and 115 were used to label the white and red samples, respectively. The labeled samples were combined in equal amounts and fractionated on an RP C18 chromatographic column.
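Because the white sample carries the 113 tag and the red sample the 115 tag, a red/white fold change can be derived from the 115/113 reporter-ion intensity ratio. The sketch below shows one way this could be done under stated assumptions: protein-level ratios are taken as the median of the unique-peptide ratios (the aggregation method is not given in the text), and the ≥1.5 cutoff is applied symmetrically for down-regulation. All names and the input layout are hypothetical.

```python
from statistics import median


def protein_ratio(peptides):
    """Median 115/113 ratio over the unique peptides of one protein.

    peptides -- iterable of (intensity_113, intensity_115, is_unique) tuples.
    Returns None when no unique peptide has usable reporter intensities.
    """
    ratios = [
        i115 / i113
        for i113, i115, is_unique in peptides
        if is_unique and i113 > 0 and i115 > 0
    ]
    return median(ratios) if ratios else None


def call_change(ratio, threshold=1.5):
    """Label a red/white ratio using a symmetric fold-change cutoff."""
    if ratio is None:
        return "not quantified"
    if ratio >= threshold:
        return "up in red pulp"
    if ratio <= 1.0 / threshold:
        return "down in red pulp"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical reporter intensities; the shared peptide is ignored.
    example = [(100.0, 200.0, True), (100.0, 300.0, True), (100.0, 1000.0, False)]
    ratio = protein_ratio(example)
    print(ratio, call_change(ratio))
```

Restricting the calculation to unique peptides follows the statement that only unique peptides were used for quantitative comparison; a median is used here simply because it is robust to a single outlying peptide.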

Strong Cation Exchange Chromatography (SCX)
The mixed, labeled samples were fractionated by SCX using high performance liquid chromatography (HPLC). The mobile phases consisted of buffer A (2% acetonitrile and 98% H2O, pH 10) and buffer B (90% acetonitrile and 10% H2O, pH 10). The labeled samples were concentrated under vacuum and dissolved in 100 µL of buffer A (pH 10). After the mixture was centrifuged, the supernatant was loaded onto a reverse phase (RP) C18 precolumn (LC Packings) (Agilent Technologies, Palo Alto, CA, USA). Separation was performed using a linear gradient at a flow rate of 1 mL/min. The elution gradient is shown in Table S5. The liquid effluent was collected at a speed of 1.5 mL/min. Multiple fractions were obtained by merging samples according to the chromatogram.

LC-MS/MS Analysis
A liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed for the separation and analysis of samples. The mass spectrometer used for detection was a Q Exactive mass spectrometer (Thermo Scientific, Waltham, MA, USA). The HPLC was an NCS3500 system. The mobile phases consisted of buffer A (99.9% H2O and 0.1% formic acid) and buffer B (99.9% acetonitrile and 0.1% formic acid). The flow rate was 300 nL/min. The elution gradient is shown in Table S6.
The full MS scan range was 350-1600 m/z. The runtime was 75 min and the resolution was 70,000. Precursor ions were selected for MS/MS scans using higher-energy collisional dissociation (HCD). The MS2 (secondary mass spectrum) sequences were then determined. The dynamic exclusion option was implemented with a repeat count of 1 and an exclusion duration of 15 s. The automated gain control (AGC) values were set to 1 × 10^6 and 2 × 10^5 for full MS and MS2, respectively.

Proteomic Data Analysis
For peptide data analysis, raw mass data were processed and searched against protein databases downloaded from public databases using Proteome Discoverer software (Thermo Scientific, Waltham, MA, USA). Searches were performed using the following criteria: the precursor mass tolerance was set to 20 ppm, and the fragment ion mass tolerance was set to 0.02 Da for HCD, in addition to the general settings. The search parameters allowed two missed cleavages for trypsin. The maximum delta Cn was set to 0.05, and the maximum number of peptides reported was 10. The score threshold for peptide identification was set at a false-discovery rate (FDR) of 0.01 in the iTRAQ experiment. The list of proteins obtained from the iTRAQ data was exported to Supplementary File S1, which contains such protein-specific information as accession numbers, percent coverage, protein scores, and the number of peptides matching individual proteins. Proteins with a fold change ≥1.5 between the two stages were identified as differentially expressed.
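The two mass tolerances above are expressed in different units: the precursor tolerance is relative (ppm) while the fragment tolerance is absolute (Da). The following minimal sketch shows how each check is computed. It is an illustration of the arithmetic only, not of Proteome Discoverer's internals, and the function names are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_precursor_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the precursor matches within a relative (ppm) tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def within_fragment_tolerance(observed_mz, theoretical_mz, tol_da=0.02):
    """True if the fragment ion matches within an absolute (Da) tolerance."""
    return abs(observed_mz - theoretical_mz) <= tol_da


if __name__ == "__main__":
    # Hypothetical m/z values chosen only to illustrate the two checks.
    print(ppm_error(500.005, 500.0))
    print(within_precursor_tolerance(500.005, 500.0))
    print(within_fragment_tolerance(200.015, 200.0))
```

A ppm tolerance scales with the measured value, so the same 20 ppm setting corresponds to a wider absolute window at higher m/z, whereas the 0.02 Da fragment window stays constant.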

Conclusions
Betalains contribute to the appearance quality and nutritional value of red pitaya, and they are also involved in stress resistance. Studying the proteome of pitaya pulp pigments is important for understanding the pathways that contribute to betalain biosynthesis. In this study, proteomic changes in H. polyrhizus were investigated for the first time using iTRAQ. In total, 1946 differentially expressed proteins were identified to characterize the proteome of red pitaya. Based on integrated analyses of the transcriptomic and proteomic datasets, several transcripts and proteins showing a positive correlation were grouped in "phenylpropanoid biosynthesis", "tyrosine metabolism", "flavonoid biosynthesis", "ascorbate and aldarate metabolism", "betalains biosynthesis" and "anthocyanin biosynthesis". These pathways are related to the metabolites of betalain biosynthesis. Five proteins, which were annotated as polyphenol oxidase, CYP76AD3, and 4,5-DOPA dioxygenase extradiol-like, were found to be differentially expressed in the betalain biosynthesis pathway. The present study provides the first proteomic analysis of red pitaya pulp by iTRAQ and could offer new insights into the molecular mechanism of betalain biosynthesis at the posttranscriptional level.