Photosystem Disorder Could be the Key Cause for the Formation of Albino Leaf Phenotype in Pecan

Pecan is one of the most famous nut species in the world. Mutants with albino leaves were found while raising pecan seedlings, providing ideal material for studying the molecular mechanisms that lead to the chlorina phenotype in plants. Both chlorophyll a and chlorophyll b contents in albino leaves (ALs) were significantly lower than those in green leaves (GLs). A total of 5171 differentially expressed genes (DEGs) were identified in the comparison of ALs vs. GLs using high-throughput transcriptome sequencing; 2216 DEGs (42.85%) were upregulated and 2955 DEGs (57.15%) were downregulated. The expression levels of genes related to chlorophyll biosynthesis (HEMA1, encoding glutamyl-tRNA reductase; ChlH, encoding the Mg-protoporphyrin IX chelatase (Mg-chelatase) H subunit; CRD, encoding Mg-protoporphyrin IX monomethylester cyclase; POR, encoding protochlorophyllide reductase) were significantly lower in ALs than in GLs. In contrast, the expression levels of genes related to chlorophyll degradation (PAO, encoding pheophorbide a oxygenase) were significantly higher in ALs than in GLs, indicating that disturbance of chlorophyll a biosynthesis and intensification of chlorophyll degradation lead to the absence of chlorophyll in ALs of pecan. A total of 72 DEGs associated with the photosynthesis pathway were identified in ALs compared to GLs, including photosystem I (15), photosystem II (19), cytochrome b6-f complex (3), photosynthetic electron transport (6), F-type ATPase (7), and photosynthesis-antenna proteins (22). Moreover, almost all of these genes (68) showed decreased expression in ALs compared to GLs, indicating that the photosynthetic system embedded within the thylakoid membrane of the chloroplast was disturbed in ALs of pecan. This study provides a theoretical basis for elucidating the molecular mechanism underlying the chlorina phenotype of pecan seedlings.


Introduction
Plants synthesize the carbohydrates and energy needed for growth and development through photosynthesis in leaves. Leaf color directly affects photosynthesis. Leaves are usually green; however, leaf color variations, including chlorina, albino, red, and green leaves, with white or yellow interion, have been observed in many plants, such as tea plant [1][2][3], Anthurium andraeanum [4], red maple [5], and oilseed rape [6]. To our knowledge, the occurrence of leaf color variations is a very complex biological process and is largely determined by genetic and environmental factors. Mutants with leaf color variations are ideal genetic material for exploring the physiological, biochemical, and molecular mechanisms of chlorophyll biosynthesis, chloroplast structure and function,

Content of Chlorophyll and Carotenoid in Green and Albino Leaves in Pecan
A few pecan seedlings with albino leaves were found during seedling cultivation (Figure S1). Because chlorophyll biosynthesis underlies leaf greening, the chlorophyll contents of leaves from green leaf (GL) seedlings and albino leaf (AL) seedlings were measured (Figure 1). The results showed that in AL, both chlorophyll a and chlorophyll b contents were significantly lower than those in GL (approximately 3.46% and 20.87% of the contents in GL, respectively; Figure 1B). The ratio of chlorophyll a/b in GL was significantly lower than that in AL (Table S1). The carotenoid contents in AL were significantly lower than those in GL (Figure 1B), and the ratio of carotenoid/chlorophyll in AL was significantly higher than that in GL (Table S1). These results suggested that albino leaves result from reduced chlorophyll levels and that the lower chlorophyll content might have resulted from abnormal chlorophyll biosynthesis and degradation.

RNA Sequencing of Leaf Transcriptomes of the GL and AL Seedlings and Mapping of RNA Sequences to the Reference Genome
RNA-seq, followed by strict quality control and processing, generated a total of 32.35 GB of clean data from six transcriptome libraries. The six libraries represented two groups (GL and AL), each with three biological replicates. After filtering out duplicate sequences and ambiguous and low-quality reads, we obtained a total of 231,590,820 high-quality (HQ) clean reads: 115,546,190 reads for GL and 116,044,630 reads for AL (Table S2). The average GC percentage was 45.51%, with a Q30 base percentage above 90.01%. Details on data and data quality, before and after filtering, are shown in Table S3. HQ clean reads were mapped to the pecan reference genome (Cil.genome.fa).
Approximately 35.83 million clean reads (92.81% of the total) were mapped; 34.91 million were unique. An overview and detailed data are given in Table 1 and Table S2.
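The read statistics above can be cross-checked with simple arithmetic. The following Python sketch sums the GL and AL read counts and compares the reported number of mapped reads with the mean library size. It assumes that the figure of 35.83 million mapped reads is a per-library average, which the text does not state explicitly; the variable names are illustrative only, and Table S2 remains the authoritative source.

```python
# HQ clean read counts as reported in the text (see Table S2).
GL_READS = 115_546_190
AL_READS = 116_044_630
N_LIBRARIES = 6  # two groups x three biological replicates

total_reads = GL_READS + AL_READS
mean_reads_per_library = total_reads / N_LIBRARIES

# Assumption (not stated in the source): 35.83 million mapped reads
# is an average per library rather than a grand total.
MAPPED_PER_LIBRARY = 35.83e6
mapped_fraction = MAPPED_PER_LIBRARY / mean_reads_per_library

print(f"total HQ clean reads: {total_reads:,}")
print(f"mean reads per library: {mean_reads_per_library / 1e6:.2f} million")
print(f"mapped fraction under the stated assumption: {mapped_fraction:.2%}")
```

Under this reading, the group totals add up to the reported overall total, and the mapped fraction falls close to the reported 92.81%.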

Differentially Expressed Gene Analysis
Three biological replicates per group were used for RNA-seq. To test sample repeatability, we calculated the correlation coefficients between samples. The correlation coefficients within each replicate group were greater than 0.9375 (Figure S2), indicating consistency among the three biological replicates. The RNA-seq data were therefore considered reliable for further analyses.
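A replicate-consistency check of this kind can be sketched as follows. The text does not specify which correlation coefficient or data transformation was used, so this example assumes Pearson correlation on log2(x + 1)-transformed expression values; the sample names and expression values are hypothetical and serve only to show the calculation.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def log2p1(values):
    """log2(x + 1) transform, which keeps zero-expression genes defined."""
    return [math.log2(v + 1.0) for v in values]


# Hypothetical expression values for five genes in three GL replicates.
replicates = {
    "GL-1": [12.0, 150.3, 0.0, 48.9, 890.2],
    "GL-2": [10.5, 162.8, 0.4, 51.2, 905.7],
    "GL-3": [13.1, 141.9, 0.0, 45.0, 870.4],
}

# Pairwise correlations within the replicate group.
pairwise = {
    (a, b): pearson(log2p1(replicates[a]), log2p1(replicates[b]))
    for a, b in combinations(sorted(replicates), 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:.4f}")
```

In practice the same calculation would be run over all expressed genes for each pair of libraries, and the lowest within-group coefficient compared against the chosen threshold.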
In the current study, a total of 5171 DEGs were identified in the comparison of AL vs. GL; 2216 DEGs (42.85%) were upregulated, and 2955 DEGs (57.15%) were downregulated (Table S3). Additionally, 4389 of the 5171 DEGs (84.88%) were aligned to known proteins in the nr database, whereas 3596 (69.54%) could be annotated based on sequences in the Swiss-Prot database (Table 2 and Table S4). Moreover, 1546 DEGs (29.90%) were assigned to 25 categories of clusters of orthologous groups of proteins (COG) (Figure 2A and Table 2). The three largest categories were (1) general function prediction only (415, 26.84%), (2) transcription (191, 12.35%), and (3) carbohydrate transport and metabolism (190, 12.29%). In total, 3337 DEGs (64.53%) were categorized into the three GO trees of cellular component, molecular function, and biological process (Figure 2B, Table 2, and Table S5). These three main categories were further classified into 51 functional groups. In the cellular component category, the largest groups were cell, cell part, and organelle. Binding, catalytic activity, and transcription regulator activity were the dominant groups in the molecular function category, and cellular process, metabolic process, and response to stimulus were the major groups in the biological process category. The ten most enriched GO terms were chloroplast thylakoid membrane, photosystem I, photosystem II, chlorophyll binding, reductive pentose-phosphate cycle photosynthesis, light-harvesting in photosynthesis, pigment binding, chloroplast envelope, photosynthesis, and integral component of membrane (Table S6). Furthermore, to understand the biological functions of these DEGs, all DEGs were mapped to terms in the KEGG database. In total, 939 DEGs (18.16%) were matched and assigned to 128 KEGG pathways (Table S7).
The top three pathways, photosynthesis (51), photosynthesis-antenna proteins (22), and metabolic pathways (360), were significantly enriched in the comparison of AL vs. GL (Table 3).
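The percentages quoted in this section follow directly from the reported counts. As a quick consistency check, the short Python sketch below recomputes them from the numbers given in the text; the dictionary keys and the helper name `percent` are illustrative and are not part of the original analysis.

```python
TOTAL_DEGS = 5171

# DEG counts as reported in the text.
counts = {
    "upregulated": 2216,
    "downregulated": 2955,
    "nr": 4389,
    "Swiss-Prot": 3596,
    "COG": 1546,
    "GO": 3337,
    "KEGG": 939,
}

# The three largest COG categories, expressed relative to COG-assigned DEGs.
cog_top3 = {
    "general function prediction only": 415,
    "transcription": 191,
    "carbohydrate transport and metabolism": 190,
}


def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


shares = {name: percent(n, TOTAL_DEGS) for name, n in counts.items()}
cog_shares = {name: percent(n, counts["COG"]) for name, n in cog_top3.items()}

for name, value in {**shares, **cog_shares}.items():
    print(f"{name}: {value:.2f}%")
```

Note that the COG category shares are computed against the 1546 COG-assigned DEGs rather than against all 5171 DEGs.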

Chlorophyll Metabolism-Related Genes Expression Analysis
To validate the RNA sequencing data, chlorophyll metabolism-related genes were selected for qRT-PCR analysis. The qRT-PCR results indicated that all of these DEGs exhibited expression patterns similar to those obtained from the RNA sequencing analysis (Figure S3), thus supporting the validity of the method used for determining DEGs from the RNA sequencing analysis.
Twelve genes involved in chlorophyll metabolism, including biosynthesis, the chlorophyll cycle, and degradation, were differentially expressed in the comparison of AL vs. GL using de novo transcriptome sequencing (Table 4 and Figure 3). In chlorophyll biosynthesis, HEMA1 (encoding glutamyl-tRNA reductase), ChlH (encoding the Mg-protoporphyrin IX chelatase (Mg-chelatase) H subunit), CRD (encoding Mg-protoporphyrin IX monomethylester cyclase), and POR (encoding protochlorophyllide reductase) showed significantly lower expression in ALs than in GLs, indicating that chlorophyll biosynthesis was downregulated in ALs. Among genes related to the chlorophyll cycle, two CAO (encoding chlorophyllide a oxygenase) and three CBR (encoding chlorophyll(ide) b reductase NYC1) genes also showed significantly lower expression in ALs than in GLs. Among DEGs related to chlorophyll degradation, the expression of SGR (STAY-GREEN, encoding Mg-dechelatase) was significantly lower in ALs than in GLs. However, the expression of two PAO (encoding pheophorbide a oxygenase) genes was significantly higher in ALs than in GLs, indicating that chlorophyll degradation was upregulated in ALs.

Identified Differentially Expressed Genes Involved in Photosynthesis
A total of 72 DEGs associated with the photosynthesis pathway was identified in AL compared to GL (Table 5), including PSI (15), PSII (19), cytochrome b6-f complex (3), photosynthetic electron

Response of Transcription Factors in the Comparison of AL vs. GL
Differentially expressed transcription factor genes were analyzed to identify the transcription factors involved in the regulation of chlorophyll metabolism in pecan (Table 6 and Table S8). Forty-two transcription factor families were identified in the comparison of AL vs. GL in this study (Table 2 and Table S8). We identified 40 significantly differentially expressed MYB transcription factors, including 16 upregulated and 24 downregulated members, suggesting that MYB transcription factors could be involved in chlorophyll metabolism. In the AP2/ERF transcription factor family, 23 members were upregulated and 12 members were downregulated in AL compared with GL. The NAC, C2C2, C2H2, bHLH, and WRKY transcription factor families were over-represented in the list of regulated genes, suggesting that these families probably also play key roles in the transcriptional regulation of genes involved in chlorophyll metabolism in pecan.

Discussion
A few pecan seedlings with albino leaves were found during seedling cultivation (Figure S1). Chlorophyll content in AL was significantly lower than that in GL (Figure 1B), suggesting that the albino leaves resulted from reduced chlorophyll levels. To elucidate the key factors in the formation of the AL mutation of pecan, de novo transcriptome sequencing and comparative analysis of DEGs were performed for AL vs. GL. GO classification showed that genes associated with the chloroplast thylakoid membrane, photosystem I, photosystem II, chlorophyll binding, reductive pentose-phosphate cycle photosynthesis, light-harvesting in photosynthesis, pigment binding, chloroplast envelope, and photosynthesis (Table S6) were highly represented among the significantly regulated genes in AL. Additionally, the results showed that many of the genes related to photosynthesis were transcriptionally downregulated in AL.
Chlorophyll metabolism, including chlorophyll biosynthesis, the chlorophyll cycle, and chlorophyll degradation, is a complex biological process in plants. Twelve genes involving 10 enzymes exhibited significant regulation in AL. A key observation was that the content of chlorophyll was much lower in AL than in GL. The expression of four chlorophyll biosynthesis genes (encoding HEMA, CHLH, CRD, and POR) was lower in AL than in GL. These enzymes are considered key enzymes for chlorophyll biosynthesis during photomorphogenesis in plants [12][13][14][15][16][17]. Given the remarkably low expression levels of these genes, we conclude that chlorophyll biosynthesis activity is lower in AL than in GL. This would explain why the content of chlorophyll a in AL was much lower than in GL. The interconversion of chlorophyll a and chlorophyll b is called the "chlorophyll cycle" [18,19]. Previous studies have reported that a portion of chlorophyll a is converted to chlorophyll b through the activity of CAO. Additionally, chlorophyll b can be converted back to chlorophyll a through 7-hydroxymethyl chlorophyll a via CBR and 7-hydroxymethyl chlorophyll a reductase (HCAR) [20][21][22]. Two members of CAO and three members of CBR were downregulated in AL. This might explain why the contents of chlorophyll a and chlorophyll b in AL were lower than those in GL under the condition of disturbed chlorophyll a biosynthesis. PAO, which encodes pheophorbide a oxygenase, catalyzes the oxidation of pheophorbide a. Chen et al. reported that the chlorophyll degradation pathway is also called the "PAO pathway" [5]. In our study, the expression levels of two PAO members were upregulated in AL compared to GL, suggesting that chlorophyll degradation was enhanced in ALs of pecan. Based on our results, we hypothesize that the disturbance of chlorophyll a biosynthesis and intensification of chlorophyll degradation lead to the absence of chlorophyll in ALs of pecan.
Abnormal chloroplast structure was observed in yellow and variegated leaves compared with green leaves in C. sinensis, and the expression levels of the proteins related to the chlorophyll a-b binding protein, plastid-encoded genes (Lhcb, rbcL, rbcS, psaA, and psbA), photosystem I P700 chlorophyll a apoprotein A1, photosystem II Qb protein D1, and ribulose bisphosphate carboxylase were remarkably repressed in the variegated leaf, suggesting that the abnormal chloroplast profiles in yellow and variegated leaves might be connected with the downregulation of these proteins in C. sinensis [1]. The transcripts of differentially expressed proteins related to PSI subunits, PSII subunits, antenna proteins, the cytochrome b6/f complex, and F-type ATPase beta also declined in yellow and variegated leaves compared with green leaves in C. sinensis [1]. Thus, a dramatic downregulation of proteins related to the photosystems might be linked to abnormal chloroplast profiles. In this study, the expression of most of the genes related to the PSI subunits, PSII subunits, cytochrome b6/f complex, photosynthetic electron transport, F-type ATPase, and photosynthesis-antenna proteins declined significantly in AL compared with GL (Table 5), indicating that the photosynthetic system embedded within the thylakoid membrane of the chloroplast was disturbed in ALs of pecan.
Most transcription factors play important roles in developmental processes in plants [23]. In tomato fruit, SlMYB72 directly targets protochlorophyllide reductase, Mg-chelatase H subunit, and knotted1-like homeobox2 genes and regulates chlorophyll biosynthesis and chloroplast development [24]. Kiwifruit MYB7 plays a role in modulating carotenoid and chlorophyll pigment accumulation in tissues through transcriptional activation of metabolic pathway genes [25]. LfWRKY70, LfWRKY75, LfWRKY65, LfNAC1, LfSPL14, LfNAC100, and LfMYB113 were shown to be key regulators of leaf senescence, and the genes regulated by LfWRKY75, LfNAC1, and LfMYB113 are candidates to link chlorophyll degradation and anthocyanin biosynthesis to senescence in Formosan gum [26]. The LHCB members, which are the apoproteins of the light-harvesting complex of photosystem II, were shown to be targets of WRKY40. Additionally, the positive function of LHCBs was balanced by WRKY40, which represses the expression of LHCB genes in ABA signaling [27]. The overexpression of SlNAC1 resulted in reduced carotenoids by altering carotenoid pathway flux and decreasing ethylene synthesis, mediated mainly by the reduced expression of system-2 ethylene biosynthetic genes in tomato [28]. Reduced expression of SlNAC4 by RNA interference (RNAi) in tomato resulted in delayed fruit ripening, suppressed chlorophyll breakdown, and decreased ethylene synthesis [29]. Many differentially expressed transcription factor members were identified in this study, including MYB, NAC, and WRKY members (Table 6), suggesting that these transcription factors were involved in leaf formation in pecan.

Plant Materials and Sample Preparation
For this study, the mutant material with albino leaves was found in a nursery during pecan seedling cultivation (Figure S1). The seedlings were planted in seedbeds at the Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Jiangsu, China. The substrate contained peat, perlite, and vermiculite in the ratio 5:1:1. The growth conditions consisted of a relative humidity of ~60%, a 12 h light/12 h dark photoperiod, and a mean temperature of 25 °C. The albino leaves (ALs) and green leaves (GLs) were harvested from six-month-old seedlings. Three independent biological replicates were performed, and each replicate was collected from one pecan seedling. All samples were flash-frozen in liquid nitrogen and stored at −80 °C for future experiments.

Chlorophyll and Carotenoid Content Analysis
Chlorophyll and carotenoid contents were measured using high-performance liquid chromatography (HPLC), as published by Montefiori et al. [30]. Samples were ground into powder in liquid nitrogen and extracted with acetone. Chlorophyll and carotenoid contents were analyzed in biological triplicate.

RNA Isolation, cDNA Library Preparation and Sequencing
Total RNA was extracted from leaves using the cetyltrimethylammonium bromide (CTAB) method [31], and mRNA was then enriched using oligo(dT) magnetic adsorption. The cDNA library was constructed using an Illumina TruSeq RNA Sample Preparation Kit (Illumina, San Diego, CA, United States). The samples were sequenced on an Illumina HiSeq 2000 machine at Nanjing Genepioneer Biotechnologies Co Ltd., China.

Analysis of Differentially Expressed Genes
After adaptor trimming and quality trimming, the clean reads were mapped to the pecan (Carya illinoensis) reference genome (Cil.genome.fa, ftp://parrot.genomics.cn/gigadb/pub/10.5524/100001_101000/100571/) using HISAT2. Gene expression levels were measured as RPKM (reads per kilobase of exon model per million mapped reads) [32] values using the software StringTie (The Center for Computational Biology at Johns Hopkins University, Baltimore, MD, USA). Gene expression differences between AL and GL were obtained using DESeq2 software (European Molecular Biology Laboratory, Heidelberg, Germany) [33]. We defined genes with at least a 2-fold change between the two groups and an FDR (false discovery rate) less than 0.05 as differentially expressed genes. All differentially expressed gene sequences were searched against GenBank's nonredundant (nr) protein, Swiss-Prot, KEGG, and COG databases using BLASTx to identify the most descriptive annotation for each sequence. To understand the biological functions of the genes, gene ontology (GO) enrichment (p-value < 0.05) was analyzed by mapping all DEGs to the GO database (http://www.geneontology.org/) to further classify genes or their products into terms (molecular function, biological process, and cellular component). Pathway assignments were performed according to the KEGG pathway database in order to carry out pathway enrichment analysis of DEGs.
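The expression measure and the DEG criteria described above can be illustrated with a minimal Python sketch. It shows the standard RPKM formula, a Benjamini-Hochberg adjustment as one common way to obtain FDR values, and the two thresholds (at least 2-fold change, FDR < 0.05). This is a simplified illustration only: DESeq2 estimates fold changes and p-values from count data with its own statistical model, and the gene identifiers, expression values, p-values, and the pseudocount of 1 used here are all hypothetical.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_degs(genes, min_abs_log2fc=1.0, max_fdr=0.05):
    """Label genes as 'up' or 'down' in AL relative to GL.

    `genes` is a list of (gene_id, expr_AL, expr_GL, p_value) tuples.
    A 2-fold change corresponds to |log2 fold change| >= 1.
    """
    fdrs = benjamini_hochberg([g[3] for g in genes])
    calls = {}
    for (gene_id, expr_al, expr_gl, _), fdr in zip(genes, fdrs):
        # Pseudocount of 1 (an assumption) avoids division by zero.
        log2fc = math.log2((expr_al + 1.0) / (expr_gl + 1.0))
        if abs(log2fc) >= min_abs_log2fc and fdr < max_fdr:
            calls[gene_id] = "up" if log2fc > 0 else "down"
    return calls


# Hypothetical genes: (id, mean expression in AL, in GL, raw p-value).
example_genes = [
    ("geneA", 5.0, 40.0, 0.0004),
    ("geneB", 30.0, 10.0, 0.001),
    ("geneC", 20.0, 18.0, 0.6),
    ("geneD", 100.0, 20.0, 0.2),
]

calls = call_degs(example_genes)
print(calls)
```

In this toy example, geneD shows a large fold change but is not called because its adjusted p-value exceeds the FDR threshold, which illustrates why both criteria are applied together.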

Illumina RNA-seq Result Validation by qRT-PCR
To validate the Illumina RNA-seq results, the differentially expressed genes related to chlorophyll metabolism were selected for qRT-PCR analysis. RNA was isolated from leaves using the abovementioned method [31], and RNA quality and quantity met the requirements of the qRT-PCR experiment. First-strand cDNA synthesis was performed using the PrimeScript RT Reagent Kit with gDNA Eraser (Takara, Dalian, China) according to the manufacturer's protocol. The primers were designed based on the gene sequences using Beacon Designer software (PREMIER Biosoft, San Francisco, CA, USA) and are listed in Table S9. To ensure gene-specific amplification, conventional PCR reactions were performed with the primers (Table S9) to amplify the target genes. A single PCR fragment of the expected size was amplified, suggesting that the primers were suitable for qRT-PCR analyses. The resulting PCR products were cloned and sequenced to confirm the expected fragment of the target genes. qRT-PCR was carried out, as previously described [34], on an Applied Biosystems 7300 Real-Time PCR System (Applied Biosystems, Waltham, MA, USA) using TaKaRa SYBR Premix Ex Taq™ II (Perfect Real Time, TaKaRa, code: DRR041A, Dalian, China). Dissociation curves from 55 to 95 °C were generated for each reaction to ensure specific amplification. The CiActin gene was used as an internal control [35]. The expression levels of the genes relative to the actin control mRNA were analyzed using the 7300 System's software (Applied Biosystems, Waltham, MA, USA) and the 2^(−ΔΔCt) method [36].
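The 2^(−ΔΔCt) calculation can be sketched in a few lines of Python. The example assumes approximately 100% amplification efficiency for both the target and the reference gene, which the method requires; the function name and the Ct values are hypothetical and are used only to show the arithmetic, with AL as the test sample, GL as the calibrator, and an actin gene as the reference.

```python
def relative_expression(ct_target_test, ct_ref_test,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    Ct values of the target gene are first normalized to the reference
    gene in each sample (dCt), then the test sample is compared with
    the calibrator sample (ddCt).
    """
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_test - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: a target gene and the actin reference
# measured in AL (test) and GL (calibrator).
fold_change = relative_expression(
    ct_target_test=26.0, ct_ref_test=18.0,
    ct_target_calibrator=24.0, ct_ref_calibrator=18.0,
)
print(f"expression in AL relative to GL: {fold_change:.2f}")
```

Here the target gene crosses the threshold two cycles later in AL after normalization, which corresponds to one quarter of the expression level in GL.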

Conclusions
A total of 5171 DEGs were identified in the comparison of AL vs. GL through de novo transcriptome sequencing; 2216 DEGs (42.85%) were upregulated and 2955 DEGs (57.15%) were downregulated. Chlorophyll contents in AL were significantly lower than those in GL. Additionally, the expression of genes related to chlorophyll biosynthesis (HEMA1, ChlH, CRD, and POR) was significantly suppressed in AL, and the expression of chlorophyll degradation (PAO) genes was enhanced in AL, suggesting that the disturbance of chlorophyll biosynthesis and the intensification of chlorophyll degradation lead to the absence of chlorophyll in ALs of pecan. Genes associated with the chloroplast thylakoid membrane, photosystem I, photosystem II, chlorophyll binding, reductive pentose-phosphate cycle photosynthesis, light-harvesting in photosynthesis, pigment binding, chloroplast envelope, and photosynthesis were highly represented among the DEGs in AL, indicating that photosynthesis was disrupted in ALs. Many genes associated with photosynthesis were differentially regulated in AL, indicating that the photosynthetic system embedded within the thylakoid membrane of the chloroplast was disturbed in ALs of pecan. These results indicated that the photosynthetic system disturbance was the key cause for the formation of the albino leaf phenotype in pecan. This study provides a theoretical basis for elucidating the molecular mechanism underlying the chlorina phenotype of pecan seedlings.
Supplementary Materials: The following are available online at http://www.mdpi.com/1422-0067/21/17/6137/s1, Figure S1: Albino leaf seedlings of pecan, Figure S2: The correlation coefficient in the repeat group of ALs and GLs, Figure S3: qRT-PCR analysis validation, Table S1: Content of chlorophyll and carotenoid in leaves of green leaf (GL) seedlings and albino leaf (AL) seedlings in pecan, Table S2: Information statistics of data after filtering and mapping to the pecan genome, Table S3: List of differentially expressed genes (DEGs) in albino leaf compared to green leaf, Table S4: Function annotation of DEGs, Table S5: Gene ontology (GO) functional annotation of DEGs, Table S6: GO function of DEGs, Table S7: KEGG pathway mapping, Table S8: Transcription factors differentially expressed in the comparison of AL vs. GL, Table S9: Gene primers used in this paper.