1. Introduction
Nelumbo nucifera Gaertn. belongs to the family Nelumbonaceae. Lotus is a famous traditional flower in China and has special religious significance in Buddhism [
1]. It is also a well-known aquatic plant with ornamental, edible and medicinal value that is distributed in East Asia, Southeast Asia and Australia. China is one of the world's centers of lotus distribution and cultivation. Indeed, the history of cultivation in China can be traced back 2000 years. Because of its beautiful flowers, edible seeds and rhizomes, three types of cultivars have been bred (flower, seed and rhizome lotus), and more than a thousand cultivars have been developed in China to date [
2,
3].
Lotus seeds have a long life span and still have vitality after thousands of years [
4,
5], and the high content of proanthocyanidins in lotus is one of the reasons for the long life span of the seeds. According to studies, lotus plants are rich in proanthocyanidins, especially in the seedpod and seed epicarps [
6,
7]. Moreover, the mass fraction of proanthocyanidins in mature samples is greater than that in young samples [
8]. In our preliminary survey of 10 cultivars, the seedpod proanthocyanidin content was highest in “Guoqing Hong” and lowest in “Space Lotus”. This pronounced difference prompted the selection of these two cultivars for subsequent transcriptomic analyses.
Proanthocyanidin oligomers can scavenge free radicals. Their antioxidant capacity decreases as the polymer length increases. Compared with the more highly polymerized proanthocyanidins, the oligomeric proanthocyanidins (degree of polymerization: 2–5) are more effectively absorbed and utilized by the human body. Proanthocyanidins are synthesized by 12 key enzymes through the common phenylpropanoid pathway, core flavonoid pathway and proanthocyanidin pathway [
9]. Among the oligomeric proanthocyanidins, eight isomers of dimeric proanthocyanidins have been found. However, there is not much information on the polymerization mechanism that produces proanthocyanidins. Three potential candidate polymerases have been postulated to produce the conformational changes and polymerization reactions that produce proanthocyanidins [
10], namely polyphenol oxidases (
PPOs) [
11,
12], laccases (
LACs) [
13] and peroxidases (
PODs) [
14,
15,
16]. Pourcel et al. showed that a laccase encoded by TT10 catalyzes the polymerization of oligomeric proanthocyanidins to form polymeric proanthocyanidins [
13]. Plant peroxidases can polymerize catechins and other phenolics as substrates to form dimers in an alkaline environment [
14]. Peroxidases in strawberries can oxidize catechins to yield dimeric proanthocyanidins, trimeric proanthocyanidins and oligomeric proanthocyanidins [
15,
16].
Transcriptomes are dynamic and respond to developmental and environmental cues. Thus, analysis of transcriptomes can provide insight into the regulatory expression of all genes in different conditions, species, tissues and organs [
17]. Differential expression of transcripts can reveal molecular mechanisms related to the phenotype and genetic variation. With the published lotus genome data as a reference [
18], we performed a transcriptome analysis of seedpods and seed epicarps during seedpod development in two lotus cultivars. Transcriptomic profiling was used to identify candidate polymerization nodes that distinguish the high- and low-proanthocyanidin cultivars, and stage-specific expression profiles were used to identify the time points and genes associated with the extent of polymerization. Our analysis indicates that three enzyme classes (
PPOs,
LACs and
PODs) are candidate polymerases for proanthocyanidin polymerization.
Most studies of lotus have focused on the extraction, separation, structure and function of proanthocyanidins, and little is known about the polymerization reaction. The identification and analysis of the key genes and regulatory factors responsible for the polymerization reactions that yield proanthocyanidins will provide a theoretical basis for understanding the mechanism of proanthocyanidin synthesis.
2. Materials and Methods
2.1. Plant Materials for Transcriptome and HPLC-MS Analyses
All materials were collected from Qujing Lotus Park in Qujing City, Yunnan Province, China, and plant materials with good growth and no obvious diseases or pests were selected. The cultivars “Guoqing Hong” (GQH) and “Space Lotus” (TKL) were selected as contrasting groups based on their markedly divergent proanthocyanidin contents; “Guoqing Hong” displayed the highest accumulation in mature leaves and seedpods, whereas “Space Lotus” exhibited the lowest. Conversely, “Space Lotus” achieved the greatest proanthocyanidin content in mature seed epicarps, whereas “Guoqing Hong” showed the lowest. A total of 36 samples were collected from lotus seedpods (F) and seed epicarps (K) in three biological replicates (
Table 1). The sample identification key used throughout the study encodes the cultivar, tissue, developmental stage and biological replicate in a nine-character ID (
Table 2).
Samples were collected during three developmental stages (
Figure 1). The immature stage (S1) occurs approximately 10 d after pollination and is characterized by a yellow seed and a poorly developed embryo. The nearly mature stage (S2) occurs approximately 30 d after pollination and is characterized by a green seed and a nearly fully developed embryo. The mature stage (S3) occurs approximately 45 d after pollination and is characterized by a purple-brown seed and a fully developed embryo. Samples used for transcriptome sequencing were rapidly frozen in liquid nitrogen and stored at −80 °C. Samples intended for HPLC-MS analysis of the total proanthocyanidin extracts were blended with silica gel and dried in a ventilated oven at 65 °C for 12 h. Upon complete dehydration, they were stored at room temperature.
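The sampling design described above (two cultivars, two tissues, three stages, three biological replicates) can be enumerated as a quick consistency check. The following is a minimal sketch: the codes GQH, TKL, F and K come from the text, but the hyphen-separated layout of the nine-character ID (for example `GQH-F-1-1`) and the helper names are assumptions made for illustration, not the key given in Table 2.

```python
from itertools import product

# Codes taken from the text; the ID layout below is an assumed illustration.
CULTIVARS = ("GQH", "TKL")   # "Guoqing Hong", "Space Lotus"
TISSUES = ("F", "K")         # seedpod, seed epicarp
STAGES = (1, 2, 3)           # S1 immature, S2 nearly mature, S3 mature
REPLICATES = (1, 2, 3)       # three biological replicates


def make_sample_id(cultivar, tissue, stage, replicate):
    """Build a nine-character ID such as 'GQH-F-1-1' (hypothetical layout)."""
    return f"{cultivar}-{tissue}-{stage}-{replicate}"


def parse_sample_id(sample_id):
    """Split an ID back into its four fields."""
    cultivar, tissue, stage, replicate = sample_id.split("-")
    return {
        "cultivar": cultivar,
        "tissue": tissue,
        "stage": int(stage),
        "replicate": int(replicate),
    }


# Full factorial design: 2 cultivars x 2 tissues x 3 stages x 3 replicates.
SAMPLE_IDS = [
    make_sample_id(*combo)
    for combo in product(CULTIVARS, TISSUES, STAGES, REPLICATES)
]
```

Enumerating the full design gives 2 × 2 × 3 × 3 = 36 IDs, which matches the 36 samples reported.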
2.2. Library Construction and Sequencing of cDNA
Thirty-six lotus samples were sent to Biomarker Technologies Co., Ltd. (Beijing, China). After the samples passed the purity, concentration and integrity tests, the mRNA was enriched using oligo (dT) linked to magnetic beads. The mRNA was randomly fragmented with fragmentation buffer, and the first cDNA strand was synthesized using six-base random primers (i.e., random hexamers). The second strand was synthesized using buffer, dNTPs, RNase H and DNA polymerase I. The cDNA was purified using AMPure XP beads. The purified double-stranded cDNA was end-repaired, A-tailed and ligated to sequencing adapters, followed by fragment size selection using AMPure XP beads. Finally, the cDNA libraries were amplified using PCR. After library construction, the effective library concentration (>2 nM) was accurately quantified using qPCR. After passing the library quality check, the libraries were pooled according to the target downstream data volume and sequenced using the Illumina platform.
2.3. Functional Annotation and Enrichment Analysis of Differentially Expressed Genes
We analyzed the quality and credibility of the raw sequence data. (1) Reads containing adapters were removed. (2) Low-quality reads were removed, including reads in which the proportion of N exceeded 10% and reads in which bases with a quality value ≤ 10 (i.e., Q ≤ 10) accounted for more than 50% of the entire read. The high-quality clean data obtained from the quality control procedure were aligned to the lotus genome to yield mapped data. Then, FPKM values were calculated as a measure of the expression abundance of each gene. The selection criteria for differentially expressed genes (DEGs) were log2 (FC) ≥ 2 and a false discovery rate (FDR) < 0.01. Venn diagrams were generated using TBtools (v1.09813). The R packages (R-4.2.3) topGO [
19] and clusterProfiler [
20] were used for gene ontology (GO) and KEGG enrichment analysis, respectively.
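The FPKM normalization and the DEG selection criteria described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the function names are hypothetical, the pseudocount is an assumed guard against division by zero, and the fold-change threshold is applied to the absolute value of log2(FC) on the assumption that both up- and downregulated genes are retained. The FDR value is taken as an input because the differential expression tool is not named in the text.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def is_deg(fpkm_a, fpkm_b, fdr, min_abs_log2fc=2.0, max_fdr=0.01, pseudocount=1.0):
    """Apply the thresholds from the text: log2(FC) >= 2 and FDR < 0.01.

    Assumptions: the threshold is applied to |log2(FC)|, and a pseudocount
    is added to both values so that genes with zero expression in one
    sample do not cause a division by zero.
    """
    log2fc = math.log2((fpkm_b + pseudocount) / (fpkm_a + pseudocount))
    return abs(log2fc) >= min_abs_log2fc and fdr < max_fdr
```

For example, a 2000 bp gene with 500 mapped fragments in a library of 10 million mapped fragments has an FPKM of 25.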
2.4. WGCNA Analysis
The co-expression networks were constructed using the WGCNA package in R (R-4.2.3). Eigengene values were calculated for each module and used to test associations with each developmental stage.
2.5. RT-qPCR Tests of Candidate Genes
Two micrograms of the total RNA from different materials of
N. nucifera was reverse transcribed using an M-MLV4 Reverse Transcriptase Kit (Biomed, Beijing, China) according to the manufacturer’s instructions. After a 10-fold dilution, the cDNA reaction mixture was used as a template in 10 μL qPCRs that were performed in a Roche LightCycler® 480 II real-time system (Roche Diagnostics AG, Basel, Switzerland) using a BlasTaq™ 2× qPCR MasterMix kit (abm, Zhenjiang, China). Each 10 μL reaction contained 5 μL of BlasTaq™ 2× qPCR MasterMix, 0.25 μL of each primer, 0.5 μL of template cDNA and 4 μL of nuclease-free H2O. The qPCR program was initiated with a pre-incubation step at 95 °C for 3 min, followed by 40 cycles of 95 °C for 15 s, 60 °C for 1 min and 72 °C for 10 s. At the end of each run, a melting curve was generated for each sample to ensure the purity of the amplified product, with melting segments of 95 °C for 5 s, 65 °C for 1 min and a continuous 97 °C segment, followed by cooling to 40 °C for 30 s. Three biological replicates were used for each sample. Gene-specific primers for qPCR were designed based on the sequences selected from the RNA-seq data (
Supplemental Table S1). The mRNA levels of each sample were normalized to the expression level of a gene encoding the ubiquitin carrier (
UBC). ANOVA was used to analyze the data. The mean differences were compared using Duncan’s multiple range tests (DMRTs). Different letters indicate significant differences at
p < 0.05 according to the DMRTs.
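The normalization of target transcripts to the UBC reference gene can be illustrated with a short calculation. The text does not state which formula was used, so the 2^−ΔCt form below is an assumption; it also assumes amplification efficiencies close to 100% for both primer pairs. The function name and the Ct values in the usage note are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Mean of 2^-(Ct_target - Ct_reference) across paired replicates.

    Assumed 2^-dCt normalization; each target Ct is paired with the
    reference-gene Ct measured in the same biological replicate.
    """
    if len(ct_target) != len(ct_reference):
        raise ValueError("target and reference Ct lists must have equal length")
    return mean(2.0 ** -(t - r) for t, r in zip(ct_target, ct_reference))
```

With hypothetical Ct values of 24.0 for a target gene and 22.0 for UBC in all three replicates, the relative expression is 2^−2 = 0.25.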
2.6. HPLC-MS Analysis of Major Proanthocyanidin Substances
Three replicates of lotus seedpods and seed epicarps were ground into powder with a pulverizer and passed through a 40-mesh sieve. Samples of 0.5 g were ground into a powder using liquid nitrogen, extracted with 20 mL of analytically pure methanol and allowed to stand for 30 min. Ultrasonic homogenization was performed at 40 °C for 30 min (room temperature, 160 W). After 10 min of incubation, ultrasonic homogenization was conducted for another 30 min. The extract was collected after standing for 1 h. The extract was filtered under reduced pressure with a Büchner funnel and then dried in a vacuum concentrator. The dried extract was dissolved in chromatographically pure methanol, filtered through a funnel, filtered twice through a 0.45-μm microporous membrane using a disposable syringe, brought to volume in a 25-mL volumetric flask with chromatographically pure methanol and stored in a refrigerator at 2–5 °C. The samples were analyzed using high-performance liquid chromatography-mass spectrometry (HPLC-MS) with a Thermo Fisher UltiMate 3000 high-performance liquid chromatograph (Thermo Fisher Scientific, Waltham, MA, USA) and a Thermo Fisher Hypersil-GOLD liquid chromatography C18 column, 23002-102130 (particle size of 1.9 μm, 100 × 2.1 mm) (Thermo Fisher Scientific, Waltham, MA, USA). The chromatographic conditions were as follows. Mobile phase A was water and mobile phase B was 95% methanol, the elution window was 0–10 min, and the flow rate was 0.3 mL/min. The mass spectrometry conditions were as follows.
The positive ion mode was used, the ion source was an ESI source, the injection volume was 0.3 μL, the scan range was m/z 100–1000, the heating temperature was 300 °C, the sheath gas flow rate was 35 arb, the auxiliary gas flow rate was 15 arb, the purge gas flow rate was 2 arb, the spray voltage was 3.5 V, the ion transfer tube temperature was 350 °C, the ion transfer tube voltage was 1 V and the tube lens voltage was 45 V. A calibration curve was constructed using (+)-catechin as the external standard, and the concentrations of the detected polyphenols were estimated semi-quantitatively against this curve.
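The external-standard calibration described above amounts to fitting a line through the peak areas of the (+)-catechin standards and inverting it for each sample. The sketch below assumes a linear, unweighted least-squares fit, which the text does not specify; the function names are hypothetical, and the values in the usage note are invented for illustration and are not data from the study.

```python
def fit_line(concentrations, peak_areas):
    """Ordinary least-squares fit of peak area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(peak_area, slope, intercept):
    """Invert the calibration line to estimate a catechin-equivalent concentration."""
    return (peak_area - intercept) / slope
```

As a usage example with made-up standards at 1, 2, 4 and 8 concentration units giving peak areas of 210, 410, 810 and 1610, `fit_line` returns a slope of about 200 and an intercept of about 10, so a sample peak area of 1010 corresponds to about 5 catechin-equivalent units. Because every compound is read against the catechin curve, the result is semi-quantitative, as stated in the text.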
3. Results
3.1. Transcriptome Analysis
RNA sequencing was used to analyze gene expression during three periods in the seedpods and seed epicarps of two lotus cultivars. Genes were considered to be differentially expressed if the log2 fold change (FC) was ≥2 and the false discovery rate (FDR) was <0.01. In period 1, we found 2783 DEGs between the “Guoqing Hong” and “Space Lotus” seedpods, of which 1587 were upregulated and 1196 were downregulated, and 1350 DEGs between the seed epicarps, of which 807 were upregulated and 543 were downregulated. In period 2, there were 3149 DEGs between the seedpods (1989 upregulated and 1160 downregulated) and 5854 DEGs between the seed epicarps (3267 upregulated and 2587 downregulated). In period 3, we found 3118 DEGs between the seedpods (1723 upregulated and 1395 downregulated) and 543 DEGs between the seed epicarps (286 upregulated and 257 downregulated). The number of DEGs between “Guoqing Hong” and “Space Lotus” was highest in the seed epicarps in period 2 (
Figure 2A). Indeed, 398 genes were expressed at significantly different levels in the seedpods during the different periods, with 98 genes expressed at significantly different levels in the seed epicarps (
Figure 2B).
Multiple-testing correction was performed using the Benjamini–Hochberg FDR method. Briefly, the raw
p values from hypergeometric tests for each GO term were ranked in ascending order, and corrected q values were calculated as q =
p × (m/k), where m is the total number of tests and k is the rank of the
p value. GO terms with q < 0.05 were considered significantly enriched. GO enrichment analysis of all DEGs in the lotus seedpods and seed epicarps that were detected during the three periods showed that these DEGs were widely distributed among three functional groups (biological processes, molecular functions and cellular components). Among the biological process categories, most of the DEGs were associated with the metabolic process group, followed by the cellular process group. In the cellular components category, most of the DEGs were associated with the “cell”, “cell part” and “organelle” groups. In the molecular functions category, the DEGs were predominantly associated with the “catalytic activity” and “binding” groups (
Figure 2C).
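The Benjamini–Hochberg correction described above, q = p × (m/k), can be sketched as follows. In addition to the formula given in the text, this version applies the usual step-up adjustment, in which each q value is capped by the q values of all larger p values so that q never decreases as p decreases. That extra step is part of the standard procedure but is not spelled out in the text, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q values in the same order as the input p values."""
    m = len(p_values)
    # Indices of the p values sorted in ascending order (rank k = 1..m).
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, so each q is capped by later ones.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values
```

For example, the raw p values 0.01, 0.04, 0.03 and 0.005 (m = 4) give q values of 0.02, 0.04, 0.04 and 0.02, so all four terms would pass the q < 0.05 threshold.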
3.2. WGCNA
To understand the biological processes in different parts of the lotus plant at different times from the perspective of the overall network, WGCNA was performed (
Figure 3). Using correlation coefficients among the genes, a hierarchical clustering tree was constructed. Different branches of the tree represent different genes, and different colors represent different modules. Based on the weighted correlation coefficients of the genes, genes were classified according to their expression patterns, and genes with similar patterns were grouped into one module. The differences between modules can be distinguished by different colors. The WGCNA divided all DEGs into 11 detailed modules (
Figure 3A,B). Some modules were highly correlated with particular cultivars and tissues during the three developmental periods. For example, the cyan and plum3 modules were strongly associated with the immature and mature stages of the “Guoqing Hong” lotus seedpods. The brown4 module was strongly associated with the immature stage of the “Space Lotus” seedpod, with correlation coefficients greater than 0.8. The blue and dark-turquoise modules were strongly associated with the immature and mature stages of the “Space Lotus” seed epicarp, with correlation coefficients of 0.97 and 0.87, respectively. In addition, the skyblue2 and plum2 modules were strongly associated with the near-mature stage of the lotus seed epicarp (
Figure 3C,D). The correlation between gene expression and the module characteristic values reflects the relationship between genes and a module. If a module has a correlation coefficient close to one, then the expression patterns of the genes associated with the module are similar.
3.3. KEGG Enrichment Analysis of Modules
To mine the key genes of the proanthocyanidin pathway, we performed a KEGG enrichment analysis for the different modules. We found that four modules—bisque4, brown4, dark-turquoise and grey60—were associated with the upstream steps of flavonoid-anthocyanin biosynthesis, which contributes to proanthocyanidin biosynthesis. The brown4 and dark-turquoise modules were analyzed further because they had the highest correlation coefficients of 0.84 and 0.87, respectively. Carbon metabolism was most enriched in the brown4 module, followed by the highly enriched flavonoid biosynthesis and photosynthesis-related pathways, including photosynthesis, photosynthesis-antenna proteins and carbon fixation in photosynthetic organisms (
Figure 4A). These data provide evidence that photosynthesis is vigorous and that core metabolic pathways are activated to provide the necessary precursors for other metabolic pathways. Consistent with these interpretations, the biosynthesis of flavonoids begins during the early stages of lotus development. In the dark-turquoise module, the DEGs were mainly enriched in metabolic pathways and secondary metabolic pathways (
Figure 4B), indicating that the synthesis of macromolecules is highly active during the maturation stage of the lotus seed epicarp. Secondly, the DEGs were enriched in phytohormone signaling and the plant–pathogen interaction pathways (
Figure 4B), which indicates the presence of particular mechanisms that respond to endogenous and environmental cues at the mature phase of development in the lotus seed epicarp.
3.4. Expression Analysis of Key Genes Associated with the Flavonoid Pathway
The expression of 10 DEGs associated with the flavonoid pathway and the brown4 and dark-turquoise modules was quantified using RT-qPCR (
Figure 5). These genes include one chalcone synthase (
CHS), two chalcone isomerases (
CHIs), one flavanone 3-hydroxylase (
F3H), one dihydroflavonol-4-reductase (
DFR), two leucoanthocyanidin dioxygenases (
LDOXs), two leucoanthocyanidin reductases (
LARs) and one anthocyanidin reductase (
ANR) (
Table 3).
CHS catalyzes the committed step of flavonoid biosynthesis. The relative expression of CHS was higher in the seedpods of both lotus cultivars than in the seed epicarps. The expression of CHS was highest at approximately 10 d after pollination in the “Space Lotus” seedpods and decreased to the lowest levels in the green and purple-brown lotus seed epicarps. There was no significant difference in the expression of CHI-1 (gene 10159) in “Guoqing Hong” relative to “Space Lotus”. CHS was expressed at the lowest levels in the lotus seed epicarp at the mature stage. The expression of CHI-2 (gene 5973) was upregulated throughout the development of the seedpods of “Guoqing Hong”, whereas it did not change significantly throughout seedpod development in “Space Lotus”. In the “Space Lotus” seed epicarps, CHI-2 was expressed at low levels during the early stages of development and at higher levels during the maturation stage. In “Guoqing Hong”, the expression of F3H increased and then decreased during the development of seedpods, and it decreased and then increased during the development of seed epicarps. In contrast, in “Space Lotus”, F3H expression was downregulated throughout the development of seedpods and seed epicarps. The expression of these three genes is a prerequisite for the accumulation of dihydroflavonols.
In “Guoqing Hong”, the expression of DFR was first upregulated and then downregulated in the seedpods. DFR expression was highest in the immature seed epicarps and then was gradually downregulated. In “Space Lotus”, the expression of DFR was downregulated over time in the seedpods and first downregulated and then upregulated in the seed epicarps. The expression of LDOX-1 (newgene 8998) was not significantly different in the seedpods of the two cultivars. In contrast, in “Guoqing Hong”, the expression of LDOX-1 was upregulated in the immature and near-mature stages of seed epicarp development and significantly downregulated in the mature stage. In “Space Lotus”, LDOX-1 expression was significantly downregulated and then significantly upregulated during the development of seed epicarps. In “Guoqing Hong”, the expression of LDOX-2 (gene 7404) was highest in the seedpods near maturity and slightly downregulated at maturity. In “Space Lotus”, the expression of LDOX-2 was highest during the first period of seedpod development and then was gradually downregulated in both the seed epicarps and seedpods. LAR converts leucoanthocyanidins to catechins (2,3-trans-flavan-3-ols). In “Guoqing Hong”, the expression of LAR-1 (gene 1239) was upregulated in the seedpods and peaked at maturity. In contrast, the expression of LAR-1 increased and subsequently decreased in the seed epicarps. In “Space Lotus”, the expression of LAR-1 gradually decreased in the seedpods. In the seed epicarps, the expression of LAR-1 decreased and then subsequently increased. The expression pattern of LAR-2 (gene 13595) in the seed epicarps and seedpods of the different cultivars was the same as that for LAR-1.
3.5. Semi-Quantitative Analysis via UHPLC-MS
An accurately weighed 0.5-g sample was extracted to analyze the total proanthocyanidins. The extract was analyzed semi-quantitatively using UHPLC-MS. The mass spectral information from the six proanthocyanidin target substances was extracted with a Thermo Xcalibur analysis of the mass spectral data in positive ion mode. The quantitative ions of each substance were [M + H]
+. The peak area values of all substances in each sample were obtained based on the mass spectral peak areas of the substances (
Supplemental Table S2). It can be seen from
Supplemental Table S2 that epi-catechin, catechin, proanthocyanidin A, proanthocyanidin B and proanthocyanidin OPC accumulated in the seed epicarps and seedpods of both lotus varieties. Gallocatechin and epigallocatechin were not detected in the mature “Space Lotus” seed epicarps. In “Guoqing Hong”, proanthocyanidin C was not present in the seedpod and accumulated only in the near-ripe and mature stages of the lotus seed epicarp. In “Space Lotus”, proanthocyanidin C accumulated in both the seed epicarps and seedpods and peaked in the near-ripe stages in both the seedpods and seed epicarps. The six molecules associated with the accumulation of proanthocyanidins accumulated to higher levels in the “Space Lotus” than in the “Guoqing Hong” (
Figure 6A). Oligomeric proanthocyanidins (OPCs) accumulated in the seed epicarps and seedpods of “Guoqing Hong” during the development of each, but the content was low. In “Space Lotus”, OPCs accumulated to the highest levels during the early stage of lotus seedpod development and to the lowest levels in the mature lotus seed epicarps (
Figure 6A). These data indicate that there were some differences in the composition and content of molecules associated with the accumulation of proanthocyanidins in seedpods and seed epicarps in “Space Lotus” and “Guoqing Hong”.
3.6. Correlation Network Analysis of Compounds and Candidate Polymerases
In total, we identified 15
LACs, 42
PODs and 5
PPOs among the DEGs. To visualize the correlation between the three types of genes and the different proanthocyanidins detected by HPLC-MS, we performed a correlation network analysis. The six kinds of proanthocyanidins are catechin and epi-catechin (C/EC) monomer, gallocatechin and epigallocatechin (GC/EGC) monomer, proanthocyanidin A (PA A) dimer, proanthocyanidin B (PA B) dimer, proanthocyanidin C (PA C) trimer and oligomeric proanthocyanidins (OPC) with polymerization degrees of 2–5 (
Figure 6B). The data with
p values greater than 0.5 were selected for visualization. The orange circles represent the different proanthocyanidins, the yellow circles are
LACs, the green circles are
PODs, and the blue circles are
PPOs. The straight line indicates positive correlation, and the dashed line indicates negative correlation. A total of 8
LACs, 22
PODs and 3
PPOs were considered to be positively or negatively correlated with proanthocyanidin substances.
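A correlation network of this kind can be sketched by computing a correlation coefficient between each candidate gene's expression profile and each compound's abundance profile across samples, and keeping the pairs that pass a cutoff. The sketch below makes several assumptions: it uses the Pearson coefficient, it applies the 0.5 cutoff to the absolute correlation coefficient, and the function names, gene labels and values in the usage note are hypothetical. The text does not specify the correlation method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_edges(gene_expression, compound_levels, min_abs_r=0.5):
    """List (gene, compound, sign, r) edges whose |r| exceeds the assumed cutoff."""
    edges = []
    for gene, expression in gene_expression.items():
        for compound, levels in compound_levels.items():
            r = pearson(expression, levels)
            if abs(r) > min_abs_r:
                sign = "positive" if r > 0 else "negative"
                edges.append((gene, compound, sign, r))
    return edges
```

As a usage example with invented profiles, a gene whose expression rises with PA B across samples yields a "positive" edge (drawn as a straight line in the network), and a gene whose expression falls as PA B rises yields a "negative" edge (drawn as a dashed line).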
3.7. Validation of Expression Analysis of Candidate Polymerases
Based on our analysis of transcriptome data and the WGCNA, we randomly selected nine candidate genes (
Table 4) that may contribute to the polymerization of proanthocyanidin and quantified their relative expression using RT-qPCR. The two genes encoding laccases (
LACs) are mainly expressed in the lotus seedpods and the immature lotus seed epicarps (
Figure 6C). The expression of the peroxidase gene
POD-1 (
gene 13067) was higher in the lotus seedpods than in the seed epicarps. The expression of
POD-2 (
gene 1490) was upregulated in the mature lotus seed epicarps. The expression of
POD-3 (
gene 18702) reached its peak level at the mature stage in the “Guoqing Hong” seed epicarps. In contrast, in “Space Lotus”, the expression of
POD-3 decreased and then increased during the development of the seed epicarps (
Figure 6C). The genes encoding polyphenol oxidases (
PPOs) were expressed in the seedpods and seed epicarps of the “Guoqing Hong”. The expression of
PPO-3 (
gene 6867) was upregulated only during the immature stages in the seedpods and seed epicarps of the “Space Lotus”. The other three genes that encode
PPOs were expressed at the highest levels during the mature stages in the seedpods of the “Guoqing Hong” (
Figure 6C).
PPO-4 (
gene 12491) was expressed at the highest levels during the mature stages in the seedpods of “Guoqing Hong” and in the seed epicarps of “Space Lotus”. The expression patterns of these DEGs matched well with the FPKM values obtained using RNA-seq.
4. Discussion
Our semi-quantitative analysis demonstrated that the two cultivars accumulate different levels of proanthocyanidins during different developmental periods. The proanthocyanidins we detected in the two lotus cultivars were dimers and trimers composed mainly of catechin, epi-catechin, gallocatechin and epigallocatechin monomers, which is consistent with previous work [
21,
22]. Polyphenols in the lotus seed epicarp at the mature stage were found to include proanthocyanidin dimers and trimers [
23]. Proanthocyanidin dimers accumulated stage-dependently: GQH-K-3 > GQH-K-2 > GQH-K-1 and TKL-K-2 > TKL-K-1 > TKL-K-3 (Supplemental
Table S2). Lotus genotypes accumulate OPC, dimers and trimers in a genotype- and stage-specific manner governed by LAR and ANR expression, providing a clear breeding target for modulating the proanthocyanidin content in lotus seed products.
Anthocyanins and proanthocyanidins are derived from the same biosynthetic pathway and may compete for substrates, which may regulate the synthesis and accumulation of proanthocyanidins to some extent [
24]. The different expression levels of
LAR,
ANR and
LDOX may lead to differences in the biosynthesis and accumulation of anthocyanins, proanthocyanidins and flavan-3-ols [
25].
LAR converts leucoanthocyanidins to catechins [
26,
27], and
LDOX oxidizes leucoanthocyanidins to anthocyanidins, which are subsequently reduced by ANR to epi-catechins [
25]. In this study, we found that the two
LAR genes were expressed at higher levels than the two
LDOX genes during seedpod development (
Figure 5). Thus,
LAR was presumably superior to
LDOX in competition for the same substrate.
In recent years, our knowledge of the regulation of proanthocyanidin biosynthesis has progressed in a variety of plants. Although the mechanism of polymerization has attracted the attention of researchers, it remains poorly understood. There is general agreement that proanthocyanidin polymerization occurs in the large central vacuole of plant cells [
10,
28]. The putative mechanism that transports proanthocyanidins from the site of biosynthesis in the cytoplasm to the vacuole depends on glutathione S-transferase (GST)
TT19 [
29,
30] and MATE (
TT12) [
31]. GST and MATE use different means to mediate the transport and accumulation of proanthocyanidins. Although both epi-catechin and catechin are the substrates that initiate the polymerization reaction [
32], three possible precursors contribute to the extension reaction: leucoanthocyanidin, flavan-3-ol and anthocyanidin. The polymerization of proanthocyanidins was suggested to require a nucleophilic flavan-3-ol and an electrophilic carbocation [
33]. A polyphenol oxidase, laccase or peroxidase was suggested to produce nucleophilic catechins or epi-catechins that attack carbocations to yield dimeric to oligomeric proanthocyanidins [
34,
35,
36]. In this study, the expression trends of these genes were consistent with the accumulation trends of the corresponding compounds, which suggests that they may contribute to polymerization. We therefore speculate that polyphenol oxidases, laccases or peroxidases are candidate enzymes for the conformational change and polymerization of proanthocyanidins. In this study, 8
LAC, 22
POD and 3
PPO candidate polymerases were found among the DEGs. Whether any of these enzymes can catalyze the polymerization reaction needs to be verified in additional experiments.