Genome-Wide Identification of PAP1 Direct Targets in Regulating Seed Anthocyanin Biosynthesis in Arabidopsis

Anthocyanins are widespread water-soluble pigments in the plant kingdom. Anthocyanin accumulation is activated by the MYB-bHLH-WD40 (MBW) protein complex. In Arabidopsis, the R2R3-MYB transcription factor PAP1 activates anthocyanin biosynthesis. While prior research primarily focused on seedlings, seeds received limited attention. This study explores PAP1’s genome-wide target genes in anthocyanin biosynthesis in seeds. Our findings confirm that PAP1 is a positive regulator of anthocyanin biosynthesis in Arabidopsis seeds. PAP1 significantly increased anthocyanin content in developing and mature seeds in Arabidopsis. Transcriptome analysis at 12 days after pollination reveals the upregulation of numerous genes involved in anthocyanin accumulation in 35S:PAP1 developing seeds. Chromatin immunoprecipitation and dual luciferase reporter assays demonstrate PAP1’s direct promotion of ten key genes and indirect upregulation of TT8, TTG1, and eight key genes during seed maturation, thus enhancing seed anthocyanin accumulation. These findings enhance our understanding of PAP1’s novel role in regulating anthocyanin accumulation in Arabidopsis seeds.


Introduction
Anthocyanins, water-soluble pigments, fall under the flavonoid class of secondary metabolites, contributing red, purple, and blue hues to various fruits and vegetables [1]. These pigments play a role in attracting pollinators and seed dispersers [1,2]. Anthocyanin formation is influenced by environmental factors such as UV irradiation [3], temperature [4], drought [5], and nutrient deficiency [6]. Under adverse conditions, anthocyanin concentrations generally increase, indicating their involvement in biotic and abiotic stress responses [7-9]. Anthocyanins, with their antioxidant properties, also serve as vital micronutrients for humans, guarding against cardiovascular, neurodegenerative, and metabolic diseases, as well as cancer [10]. Therefore, gaining a deeper understanding of anthocyanin accumulation and its regulatory mechanisms holds significant scientific and economic importance.
These findings suggest that the MBW complex serves as a central regulatory hub for anthocyanin accumulation. Notably, PAP1 is a key activator in the anthocyanin biosynthesis pathway with conserved functions in various crops [11,21,28]. However, genome-wide targets of PAP1 in Arabidopsis seeds remain unexplored. In our study, we demonstrated that PAP1 enhances seed anthocyanin accumulation by upregulating some anthocyanin biosynthesis-related genes during Arabidopsis seed development. Our results offer new insights into PAP1's regulatory role in Arabidopsis seed anthocyanin accumulation.
Compared to wild-type plants, the developing seeds of these three PAP1-over-expressing transgenic lines exhibited enhanced pigmentation at 10 and 12 days after pollination (DAP) as well as in mature seeds (Figure 2A-C). Additionally, the seedlings of the transgenic lines (35S:PAP1 #1, 35S:PAP1 #3, and 35S:PAP1 #5) displayed purple stems and petioles, while wild-type seedlings were green (Supplementary Figure S1).
Furthermore, we measured the anthocyanin levels in mature seeds of the wild-type and three transgenic lines, revealing a significant increase in anthocyanin content in the transgenic lines compared to wild-type plants (Figure 2D). Proanthocyanidins (PAs) are a class of oligomeric or polymeric flavonoids. The intermediate compound dihydroflavonol can be further converted to anthocyanins or PAs through distinct branches of the flavonoid pathway [36]. Previous studies have shown that PAs accumulate in the seed coat and protect the embryo and endosperm [37]. Consequently, we assessed the PA levels in mature seeds of both the wild-type and the three transgenic lines, revealing no significant difference in PA content (Figure 2E). In summary, our findings suggested that PAP1 selectively regulates the accumulation of anthocyanins, but not PAs, during Arabidopsis seed development.

A Whole-Genome Analysis of Genes Associated with Seed Anthocyanin Accumulation
To elucidate the regulatory mechanism of PAP1 in seed anthocyanin accumulation, we performed RNA-Sequencing (RNA-Seq) analysis on developing seeds from the transgenic line 35S:PAP1 #5 and wild-type Col-0 plants at 12 DAP. The results identified 5174 differentially expressed genes (DEGs), with 4760 upregulated and 414 downregulated DEGs (Table 1). Among them, seventy-four upregulated genes (1.6%) and three downregulated genes (0.7%) were involved in flavonoid biosynthesis (Table 1). Additionally, 30% of upregulated genes participated in primary metabolic processes, including carbohydrate metabolism (12.8%), nucleic acid (5.2%), amino acid and protein (5.0%), cell wall (3.3%), and photosynthesis (2.3%) (Table 1). Notably, 780 upregulated genes (16.4%) and 95 downregulated genes (22.9%) were linked to stress/defense responses (Table 1). These findings highlight PAP1's pivotal role in seed anthocyanin accumulation and other crucial physiological and biochemical processes.

Table 1 note: Percentage refers to the ratio of genes of each functional category relative to total upregulated or downregulated DEGs identified in the RNA-seq experiment. The DEGs with log2 ratios greater than 1 or less than −1 (only Gene Ontology Slim identifiers with p ≤ 0.05 and FDR ≤ 0.05) are listed.
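The DEG threshold and the percentages reported for Table 1 can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the authors' pipeline; only the log2-ratio cutoff of ±1 and the gene counts come from the text.

```python
def classify_by_log2_ratio(log2_ratio, cutoff=1.0):
    """Label a gene as 'up', 'down', or 'unchanged' from its log2 ratio.

    The cutoff of 1 (log2 ratio > 1 or < -1) follows the Table 1 note;
    the function itself is an illustrative assumption.
    """
    if log2_ratio > cutoff:
        return "up"
    if log2_ratio < -cutoff:
        return "down"
    return "unchanged"


def category_percentage(n_in_category, n_total):
    """Percentage of DEGs in one functional category, to one decimal place."""
    return round(100.0 * n_in_category / n_total, 1)


# Counts quoted in the text: 4760 upregulated and 414 downregulated DEGs.
flavonoid_up = category_percentage(74, 4760)       # flavonoid biosynthesis, up
flavonoid_down = category_percentage(3, 414)       # flavonoid biosynthesis, down
stress_all = category_percentage(780 + 95, 5174)   # stress/defense, all DEGs
```

With the counts given in the text, these calls reproduce the reported 1.6%, 0.7%, and 16.9% figures.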

PAP1 Promotes Anthocyanin Accumulation by Directly Activating the Expression of ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26 in Arabidopsis Developing Seeds
To investigate PAP1's regulation of seed anthocyanin accumulation, we performed chromatin immunoprecipitation (ChIP) assays on developing siliques at 12 DAP from 35S:PAP1-6HA #5 plants. This allowed us to understand how PAP1 controls the transcription of target genes. From the twenty genes mentioned earlier, we selected ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26 because they possess PAP1 binding sites. The core binding motif of PAP1, identified as a 7 bp MYB-recognizing element (MRE) (ANCNNCC), was found in the promoter regions of these ten genes [34,63]. We designed primers to cover all possible MRE sites bound by PAP1 in the promoter regions of these genes. There are three MREs within the promoter of ADT5, five MREs in CHS, three MREs in F3H, two MREs in DFR, three MREs in ANS, three MREs in 3GT, two MREs in UGT79B2, two MREs in UGT79B3, three MREs in 5MAT, and four MREs in GST26 (Figure 4). The ChIP assay revealed that PAP1-6HA was associated with specific promoter regions: P3 of ADT5, P1 of CHS, P1 of F3H, P1 of DFR, P1 and P2 of ANS, P1 and P2 of 3GT, P2 of UGT79B2, P1 and P2 of UGT79B3, P3 of 5MAT, and P2 of GST26 (Figure 4). These results demonstrated that PAP1 directly binds to the promoter regions of ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26 to promote their expression. Moreover, we further evaluated the positive regulatory function of PAP1 on the transcription of ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26 using a transient dual-luciferase reporter assay. We constructed effectors with or without ...
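As an illustration of how candidate MRE sites (ANCNNCC, with N being any base) might be located in a promoter sequence, the following is a minimal sketch. Scanning both strands and reporting 0-based positions are assumptions made here; the text does not describe the authors' own search procedure, and the function name is hypothetical.

```python
import re

# ANCNNCC on the forward strand; a lookahead is used so overlapping sites are found.
MRE_FORWARD = re.compile(r"(?=(A[ACGT]C[ACGT][ACGT]CC))")
# Reverse complement of ANCNNCC is GGNNGNT (a site located on the minus strand).
MRE_REVERSE = re.compile(r"(?=(GG[ACGT][ACGT]G[ACGT]T))")


def find_mre_sites(promoter):
    """Return sorted (position, strand, matched sequence) tuples for MRE sites.

    Positions are 0-based offsets on the given (plus-strand) sequence.
    """
    seq = promoter.upper()
    sites = []
    for match in MRE_FORWARD.finditer(seq):
        sites.append((match.start(), "+", match.group(1)))
    for match in MRE_REVERSE.finditer(seq):
        sites.append((match.start(), "-", match.group(1)))
    return sorted(sites)


# Hypothetical 7 bp test sequences, not real promoter fragments.
plus_hit = find_mre_sites("ACCTACC")
minus_hit = find_mre_sites("GGTAGGT")
```

In practice, such a scan only nominates candidate sites; binding still has to be tested experimentally, as done here by ChIP-qPCR.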

Discussion
Anthocyanin accumulation is a dynamic phenomenon in various plant species. The transcription factor PAP1, an R2R3-MYB member, serves as a pivotal hub, integrating diverse internal and external stimuli affecting anthocyanin biosynthesis, with prior research mainly focused on seedling phenotypes [28,29]. However, information on the direct targets of PAP1 in the regulation of anthocyanin accumulation, especially in seeds of Arabidopsis, remains limited. In this study, we identified new genes targeted by PAP1, directly or indirectly regulating anthocyanin accumulation at the genome-wide level in Arabidopsis seeds.
Numerous R2R3-MYB genes are known to positively or negatively regulate anthocyanin biosynthesis [24,64]. Previous reports indicated that two PAP1 over-expressing lines, the pap1-D mutant and the PAP1 cDNA over-expressing transgenic plant, exhibited similar anthocyanin accumulation in vegetative tissues but differed in seed color and accumulation patterns of anthocyanins and PAs in seeds [20,65]. The pap1-D mutant showed increased pigmentation in leaves, stems, and roots but no change in seed color [20,65]. In contrast, the PAP1 cDNA over-expressing plant displayed darker colors in all vegetative organs as well as in seeds [65]. Soluble anthocyanin levels increased in seeds of transgenic plants over-expressing PAP1 cDNA but remained unchanged in seeds of the pap1-D mutant [65]. Consistent with results from PAP1 cDNA over-expressing lines, our reverse genetic approach demonstrated that PAP1 over-expression modulates anthocyanin biosynthesis, leading to anthocyanin hyperaccumulation in both developing and mature seeds of Arabidopsis (Figure 2A-C). The increased anthocyanin content is the likely reason for the darker color of mature seeds in transgenic lines (Figure 2C,D). Notably, total PA content showed no difference between the wild-type and transgenic lines in mature seeds (Figure 2E). Successful PA accumulation in Arabidopsis reportedly requires the cooperation of multiple genes [51]. Based on this observation and previous findings showing increased PA content in the pap1-D mutant but decreased PA content in PAP1 cDNA over-expressing plants in Arabidopsis seeds [65], it is reasonable to suggest that PAP1's influence on PA accumulation is more complex than its influence on anthocyanin production in Arabidopsis seeds, likely due to various unknown factors. Further research is needed to understand the relationships between PAP1 and PAs in Arabidopsis seeds. Therefore, we speculate that PAP1 plays a specific positive role in the accumulation of anthocyanins, but not PAs, during seed development in Arabidopsis.
The regulation of gene expression involved in the anthocyanin biosynthetic pathway is largely coordinated by a complex network of interactions between transcription factors and their target genes [21,34]. The transcriptome analysis revealed that seventy-four upregulated genes (1.6%) and three downregulated genes (0.7%) in developing seeds of 35S:PAP1 #5 were related to flavonoid metabolism (Tables 1 and S2). Additionally, a significant portion of all DEGs in developing seeds of 35S:PAP1 #5 (16.9%) were associated with stress/defense responses (Table 1), consistent with PAP1's known role as a stress regulator [66,67]. Previous studies have discovered that anthocyanins play a crucial role in enhancing tolerance to biotic and abiotic stresses in vegetative tissues [8,9]. It is possible that PAP1 directly altered the expression patterns of these differentially expressed stress/defense-responsive genes (Supplementary Tables S2 and S3). Alternatively, the hyperaccumulation of anthocyanins in developing seeds may have indirectly regulated these stress/defense-responsive genes.
In summary, our study reveals that the R2R3-MYB transcription factor PAP1 directly activates the expression of ten anthocyanin biosynthetic pathway-related structural genes, ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26, during seed development (Figure 6). This makes PAP1 a promising target for genetic manipulation to enhance seed anthocyanin levels, improving seed anthocyanin quality. Furthermore, we demonstrated that PAP1 indirectly enhanced the expression of two regulatory genes, TT8 and TTG1, during seed anthocyanin biosynthesis (Figure 3). This activation is instrumental in accelerating anthocyanin production in seeds (Figure 2). The feedback mechanisms between the MYB and bHLH components of the MBW activation complex play a pivotal role in flavonoid regulation. TT8, a bHLH transcription factor, regulates its own expression via a positive feedback loop through an MBW complex, contributing to anthocyanin and PA biosynthesis regulation [60,70]. TTG1, encoding a WD40 repeat transcription factor, collaborates with TT8 and TT2 (encoding MYB123) to mediate anthocyanin pigment production in developing seeds [59]. Our data suggest that PAP1, an R2R3-MYB factor, upregulates TT8 and TTG1 expression within the MBW complex for anthocyanin gene regulation, with the hierarchical regulation of TT8 and TTG1 expression requiring further investigation.
PAP1 expression has the potential to significantly enhance anthocyanin accumulation in many plant species, resulting in a dark purple color in various plant organs [20,21,65]. It has been demonstrated that increased anthocyanin accumulation confers enhanced tolerance to abiotic stress on plants [66,67]. This raises an interesting question that remains to be investigated: do enhanced anthocyanin levels in seeds enhance abiotic stress tolerance? Previous studies indicated that the anthocyanins present in seed extracts could serve as defense molecules against abiotic stresses such as UVB radiation, drought, and low or high temperatures [71]. Recently, some researchers have observed an association between seed dormancy and seed color [72,73]. These findings suggested that PAP1 may be a valuable candidate gene that is associated with seed dormancy and germination under stress due to its increased anthocyanin content.

Plant Materials and Growth Conditions
The Arabidopsis thaliana ecotype Col-0 served as the wild-type control. As previously stated [74], the plants were grown in a 16 h light/8 h dark cycle at 22 °C in a growth chamber with overhead light at 160 µmol m⁻² s⁻¹.

Plasmid Construction and Transgenic Plants Generation
Specific primers were designed from the full-length coding sequence (CDS) of PAP1 (At1g56650) without a stop codon from the TAIR database. To create 35S:PAP1-6HA, the CDS of PAP1 (without the stop codon) was amplified using PAP1_F and PAP1_R primers (Supplemental Table S1). PCR products were digested using XmaI and SpeI, then ligated into the pGreen-35S-6HA binary vector to create an in-frame fusion of PAP1-6HA under the 35S promoter.
The 35S:PAP1-6HA vector was transformed into Arabidopsis Col-0 using the Agrobacterium tumefaciens-mediated floral dip method [75]. 35S:PAP1 transgenic plants were selected on soil using Basta, and successful transformation was confirmed through DNA genotyping until homozygous T3 transgenic progenies were obtained.

Phenotypic Observation of Seed Color and Seed Size
To analyze seed color and seed size, the immature (10 and 12 DAP) and mature seeds of three Arabidopsis transgenic lines were randomly selected from major inflorescences and photographed using an SZ61 stereomicroscope (Olympus, Tokyo, Japan).

Determination of Anthocyanin and PAs Content
Anthocyanin measurement followed the protocol by Li et al. [76] with minor adjustments. Seeds were briefly frozen in liquid nitrogen and ground in a mortar; approximately 5 mg of seed powder was placed in a 10 mL graduated test tube and incubated overnight at 4 °C in 3 mL of methanol containing 1% (v/v) HCl. After a 60 min incubation at 75 °C and cooling to room temperature, the sample was centrifuged for 15 min at 1500 rpm (HC-3018R, Zonkia, Anhui, China). The supernatant was mixed with 2 mL of distilled water and an equal volume of chloroform, followed by centrifugation (1500 rpm, 15 min, HC-3018R, Zonkia, Anhui, China). The supernatant's absorbance was measured at 535 nm using a spectrophotometer (V-1200, Mapada, Shanghai, China) and the anthocyanin content was then normalized to the dry seed weight.
PA extraction, adapted from Kitamura et al. [77], involved grinding mature seeds (10 mg) and mixing the powder with 1.5 mL of 70% (v/v) acetone extraction buffer containing 5.26 mM Na2S2O5. This mixture was sonicated using an ultrasonic bath (SB-5200 DT, Scientz, Ningbo, China) for 20 min at room temperature and then centrifuged at 1500 rpm for 15 min (HC-3018R, Zonkia, Anhui, China). The supernatant was dried and resuspended in HCl:butanol:70% acetone (2:10:3). The resulting absorbance was measured at 545 nm using an Infinite M200 PRO (Tecan). Following this, the solution was heated at 95 °C for 60 min and the absorbance at 545 nm was recorded again. The soluble PA fraction was calculated by subtracting the initial absorbance from the final one. The pellet obtained after 70% acetone extraction was dried via evaporation, resuspended in the HCl/butanol solution, and hydrolyzed to determine the insoluble PAs. Three independent biological replicates were conducted, each with three technical repetitions.

RNA Extraction and RT-qPCR Analysis
Total RNA extraction from 12 DAP developing seeds was performed using the Steady-Pure Plant RNA Extraction Kit (Accurate Biology, Changsha, China), followed by cDNA reverse transcription (TransGen, Beijing, China). RT-qPCR analysis was conducted with three independent biological replicates using SYBR Green Master Mix (Cofitt, Hong Kong, China) on the QuantStudio™ 7 Flex Real-Time System (Thermo Fisher Scientific, Waltham, MA, USA). The Arabidopsis house-keeping gene EF1αA4 served as the internal control, and relative expression values of the target genes were calculated via normalization against EF1αA4 using a modified double-delta method [78]. The RT-qPCR primer details are listed in Supplemental Table S4.
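For orientation, the standard double-delta Ct calculation can be sketched as follows. The text refers to a modified version of this method [78] whose details are not given here, so the code shows only the textbook form and assumes roughly 100% amplification efficiency; the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Standard 2^(-ddCt) relative expression.

    'ref' is the internal control gene (EF1aA4 in the text); 'sample' and
    'control' would be, for example, 35S:PAP1 #5 and wild-type seeds.
    Assumes a doubling of product per cycle (100% efficiency).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in the
# transgenic sample than in the control, with identical reference Ct values.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

A three-cycle difference after reference normalization corresponds to an eight-fold change under the efficiency assumption stated above.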

ChIP-qPCR Assay
The ChIP-qPCR assay followed a previously described protocol [79]. Developing siliques (3-5 g) at 12 DAP were harvested from both wild-type (Col-0) and Col-0 35S:PAP1 #5 over-expressing plants. The samples were washed three times with ddH2O and crosslinked using 1% (v/v) formaldehyde (37 mL) under vacuum on ice for 15 min. Crosslinking was terminated by adding 2.5 mL of 2 M glycine. After being ground in liquid nitrogen, nuclear protein was separately extracted using sucrose-based buffers containing 0.4, 0.25, and 1.7 M sucrose. Chromatin was isolated, and DNA was sheared into 200-700 bp fragments through sonication with ultrasonic cell disruptors (Scientz-IID, Scientz, Ningbo, China). After centrifugation at 4 °C for 5 min at 12,000 rpm (Sorvall Legend™ Micro 17, Thermo Fisher Scientific, Waltham, MA, USA), the chromatin remained in the upper aqueous phase. PAP1-6HA chromatin DNA was immunoprecipitated overnight using anti-HA magnetic beads (Thermo, USA) at 4 °C. The beads were washed and collected using a magnetic rack, and the immune complexes were eluted twice. Subsequently, the eluted complexes were reverse-crosslinked at 65 °C for 10 h in 5 M NaCl. Proteins were digested with 0.5 M EDTA, 1 M Tris-HCl (pH 6.5), and 3 mL of proteinase K (10 mg/mL) at 45 °C for 1 h. The DNA fragments were extracted using a phenol/chloroform/isoamyl alcohol solution (25:24:1, pH > 7) and stored at −80 °C. The relative enrichment of each fragment was assessed via RT-qPCR. Each experiment involved three biological replicates with three technical replicates per biological replicate. Arabidopsis EF1αA4 and ACTIN7 served as the internal reference and negative control, respectively. The ChIP-qPCR assay primer details are provided in Supplemental Table S5.
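The two-step normalization described in the Figure 4 legend (target fragment against the EF1αA4 genomic fragment, then 35S:PAP1 #5 against wild type) can be written as a short sketch. The function names and the numeric inputs are hypothetical; the inputs stand for relative DNA amounts derived from qPCR.

```python
from statistics import mean, stdev


def fold_enrichment(target_tg, ref_tg, target_wt, ref_wt):
    """Enrichment of one promoter fragment in the transgenic line over wild type.

    Step 1: normalize the target fragment to the internal reference fragment
            within each genotype.
    Step 2: divide the transgenic value by the wild-type value.
    """
    normalized_tg = target_tg / ref_tg
    normalized_wt = target_wt / ref_wt
    return normalized_tg / normalized_wt


def summarize(replicates):
    """Mean and sample standard deviation across biological replicates."""
    return mean(replicates), stdev(replicates)


# Hypothetical relative amounts for three biological replicates.
enrichments = [
    fold_enrichment(8.0, 2.0, 1.0, 1.0),
    fold_enrichment(6.0, 2.0, 1.0, 1.0),
    fold_enrichment(10.0, 2.0, 1.0, 1.0),
]
avg, sd = summarize(enrichments)
```

The same calculation applied to the ACTIN7 fragment would give the negative-control baseline against which enrichment is compared.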

Transient Dual-Luciferase Reporter Analysis
The PAP1 CDS was amplified and cloned into pGreenII 62-SK under the 35S promoter to form effector constructs. The effector constructs without PAP1 served as the empty control. Promoters for ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26 were separately cloned into pGreenII 0800-LUC [80] to form reporter constructs. A. tumefaciens strain GV3101 was transformed with all the constructs along with pSoup-P19 (Weidi Biotechnology, Shanghai, China). Effector and reporter constructs were mixed in a 1:1 ratio in a buffer comprising 10 mM MgCl2, 10 mM MES-KOH (pH 5.8), and 10 µM acetosyringone, and then injected into the young leaves of 4-week-old N. benthamiana. The infiltrated plants were cultured in a climate incubator (RXD-1000D-LED, Prandt, Ningbo, China) with a light/dark 16:8 h photoperiod cycle at 22 °C for 72 h. The firefly luciferase (LUC) and Renilla luciferase (REN, an internal control) activities were assessed using a dual-luciferase reporter assay kit (YEASEN, Shanghai, China) on a multifunctional enzyme label instrument (Spark®, Tecan, Männedorf, Switzerland). Six independent biological samples were examined. The primers for the dual-luc assay are provided in Supplemental Table S6.
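A common way to summarize dual-luciferase readings is the LUC/REN ratio of each sample, expressed relative to the empty-effector control. The sketch below assumes that convention; the text does not spell out the authors' exact calculation, and the function names and readings are hypothetical.

```python
from statistics import mean


def luc_ren_ratios(luc_readings, ren_readings):
    """Per-sample LUC/REN ratios; REN serves as the internal control."""
    if len(luc_readings) != len(ren_readings):
        raise ValueError("LUC and REN readings must be paired")
    return [luc / ren for luc, ren in zip(luc_readings, ren_readings)]


def relative_activation(effector_ratios, empty_ratios):
    """Mean LUC/REN with the PAP1 effector relative to the empty control."""
    return mean(effector_ratios) / mean(empty_ratios)


# Hypothetical readings for two samples per condition.
with_pap1 = luc_ren_ratios([12.0, 24.0], [2.0, 4.0])
empty_control = luc_ren_ratios([4.0, 8.0], [2.0, 4.0])
activation = relative_activation(with_pap1, empty_control)
```

A value above 1 would indicate that the effector increases reporter activity driven by the tested promoter.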

Statistical Analysis
This study used a completely randomized design. Data were expressed as mean and standard deviation and analyzed using one-way analysis of variance (ANOVA) via SPSS software (version 17.0, SPSS Inc., Chicago, IL, USA). Significant differences were determined using a two-tailed paired Student's t-test at the 0.05 significance level.

Figure 4. PAP1 targets ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26 promoters and directly promotes their expression in developing Arabidopsis seeds. Schematic diagrams show the promoter regions of ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26, while ChIP-qPCR assays show PAP1 binding to their promoter regions in the developing Arabidopsis siliques at 12 DAP. The transcriptional start site (TSS) and exons are represented by black boxes, whereas promoter regions are represented by white boxes. The triangle represents the MYB-recognizing element (MRE) site ANCNNCC and the black lines represent the DNA fragments amplified in ChIP assays for each gene. The enrichment fold of each fragment was calculated first by normalizing the amount of a target DNA fragment against a genomic fragment of Arabidopsis EF1αA4 as the internal control. Then, the value for Col-0 35S:PAP1 #5 was normalized against that for wild-type (Col-0) plants. The Arabidopsis ACTIN7 fragment was amplified as the negative control. Values are means ± SD (n = 3). Significant differences in comparison with the ACTIN7 fragment enrichment are indicated with asterisks (*) (two-tailed paired Student's t-test, p ≤ 0.05).

Int. J. Mol. Sci. 2023, 24

... transcription in N. benthamiana leaves. In summary, our findings collectively indicate that PAP1 promotes anthocyanin accumulation by directly activating the expression of ADT5, CHS, F3H, DFR, ANS, 3GT, UGT79B2, UGT79B3, 5MAT, and GST26, while also indirectly promoting the expression of C4H, 4CL3, CHI, F3′H, 5GT, UF3GT, 3AT1, 3AT2, TT8, and TTG1 during seed development in Arabidopsis.