Transcriptome Analysis of Populus Overexpression in SVL Transcription Factor

: Flowering is an essential part of the productive process, and ﬂowering time is determined by endogenous genetic components and many ambient factors. SHORT VEGETATIVE PHASE ( SVP ), a MADS-box transcription factor, regulates ﬂoral transition by repressing ﬂoral integrator genes and is involved in ABA-mediated drought stress. In this study, we transformed the poplar ( Populus ) clone “84K” with the SVP-Like gene, while stable overexpression transgenic lines were obtained. Transcriptome analysis of the leaves of the transgenic lines and WT (Wide Type) poplars revealed that a total of 477 genes showed signiﬁcantly altered expression, overexpressing SVL genes, including 342 upregulated and 135 downregulated genes. Ten subclusters in DEGs were analyzed, and KEGG terms of the largest subcluster were associated with two key pathways: hormone-related genes and glutathione metabolism. Meanwhile, many transcriptional factors were involved. Our results are helpful for in-depth analysis of the MADS transcriptional factor in poplars. This work provides the basis for studying woody plant growth, and development and molecular mechanisms responded to environmental stresses.


Introduction
Stem/progenitor cells that sustain the post-embryonic growth of all plant organs are encompassed by meristems [1]. In such meristems, the central zone contains stem cells, where cells differentiate into organs at the periphery of the meristem primordia [2]. The primordia develop into leaves during the vegetative phase, and a floral transition is established during the generative phase, when the shoot apical meristem becomes an inflorescence meristem (IM) [2]. In Arabidopsis, floral meristems were developed from the IM in a spiral manner, and a number of floral organs were produced precisely in a similar way.
Photoperiod and vernalization are important environmental cues for flowering in many annual and perennial plants, including horticultural, woody, and crop plants. FLOW-ERING LOCUS T (FT) is closely associated with flowering on long days, and the TWIN SISTER OF FT (TSF) probably acts in a similar way to FT. The FT-FLOWERING LOCUS D (FD) complex activates the expression of flowering genes, shown as a network in the shoot meristem. Compared with wild-type Arabidopsis and in response to long days, transcription of SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) is delayed in ft tsf double mutants or fd single mutant. When overexpressing SOC1 is derived from the 35S promoter, Arabidopsis displayed early flowering under both long-day and short-day [3][4][5].

Plant Material and Growth Conditions
The transcriptomes of 3 independent wild-type seedlings (Populus alba × Populus tremula glandulosa (84K)) with the same growth conditions and the same number of independent and consistent transgenic seedlings were compared. Both wild-type and transgenic seedlings were grown on 1/2 MS medium with 0.05 mg/L IBA and 0.05 mg/L NAA under 16 h light/8 h dark conditions at 25 • C and then were transplanted into soil and grown naturally for 4 to 5 weeks.

Vector Construction and Production of Transgenic Plants
The full-length cDNAs encoding PagSVL (Potri.007G010800.2) were amplified using PCR from the cDNA of the hybrid poplar (Populus alba × Populus tremula var. glandulosa, 84K). For PagSVL-3 × FLAG chimeric gene construction, a 3 × FLAG sequence was added to PagSVL cDNA. The chimeric gene was inserted downstream of the CaMV35S promoter into the pBWA(V)HS vector (BioRun, Wuhan, China). The construct used in this study was verified using DNA sequencing at Rui Biotech (Beijing, China). The promoters and primers used for amplification are listed in Table S1. The poplar transformation protocol followed that described in Meng et al. [17]. The overexpression vector was introduced into Agrobacterium tumefaciens (EHA105), after which the transgenic bacteria were transferred to 84K poplar plants using the leaf disc method. After one night culture for Agrobacterium, and when its OD 600 reached approximately 0.7, Agrobacterium was centrifuged and resuspended in 1/2 MS liquid medium containing 1.8 g/L galactose, 250 mg/L 2-(N-morpholino) ethanesulfonic acid (MES), and 50 mg/L acetosyringone (pH = 5.0). Leaf discs of aseptic 84K plantlets were cut into small pieces with a sterile razor blade, syringed with an Agrobacterium suspension for approximately 20 min, and then co-cultured on MS medium (pH = 5.8, 10 µM 1-naphthaleneacetic acid, 5 µM 2-ip, 250 mg/L cefotaxime, 3 mg/L hygromycin, and 0.80% (w/v) agar) for the selection of transformed calli. After 3 weeks, explants were cultured using shoot-selection medium (0.2 µM thidiazuron, 250 mg/L cefotaxime, and 0.80% (w/v) agar) for 2-3 months, during which time they were subcultured every 3-4 weeks. The regenerated shoots were further screened for kanamycin resistance by rooting on 1/2 MS medium supplemented with 0.05 mg/L indolebutyric acid and 0.05 mg/L NAA. The successful overexpression seedlings were identified using DNA sequencing at Rui Biotech (Beijing, China), and the primer sequences used for amplification are listed in Table S1 (Supplementary Data).

RNA Quantification and Qualification
Leaves of WT and transgenic poplars were harvested for RNA sequencing (RNA-seq) using liquid nitrogen and immediately stored at −80 • C until analysis. Total RNA was extracted from the leaves according to the Qiagen RNAeasy kit manual (Qiagen, Hilden, Germany). The quality of the RNA samples was measured using a NanoDrop ND-1000 and an Agilent 2100 Bioanalyzer after purification and DNA digestion.

Library Preparation for Transcriptome Sequencing
A NEBNext ® Ultra™ RNA Library Prep Kit for Illumina ® (NEB, Ipswich, MA, USA) was prepared for the libraries. A 3 µg subsample of quantified RNA from each plant was enriched using magnetic beads with oligo (dT) and then broken into fragments. Using mRNA as the template, the first strand of cDNA was synthesized using random hexamer primers and M-MuLV reverse transcriptase (RNaseH). Second-strand cDNA was synthesized using DNA polymerase I and RNase H. Library fragments were purified using the AMPure XP system (Beckman Coulter, Beverly, MA, USA) and eluted with EB buffer for terminal repair. Agarose gel electrophoresis was performed after performing end repair and adding polyA, and library quality was assessed using an Agilent Bioanalyzer 2100 system. Clustering was performed using the TruSeq PE Cluster Kitv3-cBot-HS. The library was sequenced using an Illumina HiSeq 2100 platform.

Quality Control, Mapping Reads to the Reference Genome, and Annotation
An equivalent nucleotide sequence was transformed from image data obtained from the Illumina platform. The original sequencing data were first filtered to obtain high-quality sequencing data (clean data) to ensure the smooth progress of subsequent analysis. Reads were treated as failed or low-quality reads as follows: (1) without the inserted fragment, (2) with an N of >10% ratio, (3) with a quality value <10, or (4) with a length <50 bp after adapter and quality trimming. The SeqPrep (https://github.com/jstjohn/SeqPrep) and Sickle (https://github.com/najoshi/sickle) (accessed on 15 October 2021) programs were used to control quality.
The sequences were aligned to poplar reference genomes (http://plants.ensembl. org/Populus_trichocarpa/Info/Index, Pop_tri_v3, accessed on 16 July 2023). Multiple alignment statistical models were used to consider the structural features during sequence alignment. Gene and RNA expression levels were normalized using an FPKM indicator, which signified the fragments per kilobase of transcripts per million mapped fragments. The expression levels and differential gene expression were calculated using the standard read mode within the reference gene regions using Cufflink software (Version 2.2.1) [18].

Quantification of Gene Expression Levels
Reads corresponding to sequence joints from each sample were mapped to transcripts and spliced using Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2, Version 2.4.1, accessed on 16 July 2023). Standard RESM was used to evaluate the expression abundance and gene expression levels represented by read counts. DESeq2 software (Version 1.10.1) was used to calculate fold change, a p-value was used for hypothesis testing, and a gene with a fold change of >1 (upregulated) or <1 (downregulated); p-value <0.05 was considered to be a differentially expressed gene (DEG) [19,20]. The presentation of differential genes was displayed within a volcano plot using the R language.

GO Enrichment and KEGG Pathway Enrichment Analyses
The probability of the Gene Ontology (GO) term was calculated using the R package GOsEq to identify biological functions relevant to the DEGs. The function was considered enriched when the corrected p-value was <0.05 [21]. KEGG pathway enrichment was also performed using KOBAS software (Version 2.1.1) with Benjamini-Hochberg (BH) correction applied to the false discovery rate (FDR) parameter to elucidate the main biochemical, metabolic, and signal transduction pathways. The pathway was reported as enriched when FDR ≤ 0.05 [22].

Phylogenetic Analysis
Protein sequences for the SVP of different species from the NCBI data and previously published articles were aligned, and a phylogenetic tree was developed using MEGA7 (Version 7.0) software [23]. The parameters of the program were as follows: neighborjoining method with 1000 bootstraps and Poisson correction method with the unit number of amino acid substitutions per site.

Candidate Populus SVL Homologs and Overexpression of SVL
DORMANCY-ASSOCIATED MADS-box (DAM) and SHORT VEGETATIVE PHASE (SVP) genes are implicated in regulating perennial winter dormancy. We acquired a MADS TF named SVL (SVP-Like) from the 84K poplar (Populus alba × P. glandulosa). A phylogenetic tree showed that the hybrid aspen SVL was more homologous to the Arabidopsis floral repressor SVP, kiwifruit, and apple SVP than to the other MADS-box or apple DAM genes ( Figure 1a and Table S2). The SVL overexpression vector was transformed at 84K, and a transgenic event was obtained (Figure 1b

Illumina Sequencing and Alignment to the Reference Genome
Six cDNA libraries derived from the OE1-OE3 (transgenic) and WT1-WT3 (wildtype) lines were analyzed on an Illumina platform. A total of 46.6-55.0 million raw reads were generated whose Q20% or Q30% score was high enough to ensure accuracy, and GC content was reasonable from the samples; after removing reads of unreliable quality, 46.1-54.4 million reads were processed for analysis (Table 1). Finally, the clean reads were mapped to the reference genome sequences of Populus trichocarpa. Of the total reads, more than 70% aligned with the reference genomes because of the general dissimilarity of Populus.

Illumina Sequencing and Alignment to the Reference Genome
Six cDNA libraries derived from the OE1-OE3 (transgenic) and WT1-WT3 (wildtype) lines were analyzed on an Illumina platform. A total of 46.6-55.0 million raw reads were generated whose Q20% or Q30% score was high enough to ensure accuracy, and GC content was reasonable from the samples; after removing reads of unreliable quality, 46.1-54.4 million reads were processed for analysis (Table 1). Finally, the clean reads were mapped to the reference genome sequences of Populus trichocarpa. Of the total reads, more than 70% aligned with the reference genomes because of the general dissimilarity of Populus. Cufflinks were used to assemble the mapped reads and compare the assembled reads with the reference genome. A total of 91,642 transcripts ranged from 200 to >1800 bp. Analysis of the dataset (Figure 2) showed that 39,553 expressed transcripts in the >1800 bp range accounted for 43.1% of the expressed genes, which was the maximum. The number of expressed genes ranged from 201 to 400 bp, accounting for only 5.6%. The percentage of transcripts of <1000 bp was similar to that of transcripts ranging from 1000 to 1800 bp, accounting for approximately 25%.
The relationships among expressed genes between the six samples, including three wild-type poplars (WT1, WT2, and WT3) and three poplars overexpressing SVL genes (OE1, OE2, and OE3), are illustrated in Figure 3, which shows the distribution of expressed genes. The two plant types shared 24,049 of the 25,412 total transcripts expressed in both the WT and OE plants. We found that the number of genes expressed was similar between the two sample types and biological replicates.
The relationships among expressed genes between the six samples, including three wild-type poplars (WT1, WT2, and WT3) and three poplars overexpressing SVL genes (OE1, OE2, and OE3), are illustrated in Figure 3, which shows the distribution of expressed genes. The two plant types shared 24,049 of the 25,412 total transcripts expressed in both the WT and OE plants. We found that the number of genes expressed was similar between the two sample types and biological replicates.   wild-type poplars (WT1, WT2, and WT3) and three poplars overexpressing SVL genes (OE1, OE2, and OE3), are illustrated in Figure 3, which shows the distribution of expressed genes. The two plant types shared 24,049 of the 25,412 total transcripts expressed in both the WT and OE plants. We found that the number of genes expressed was similar between the two sample types and biological replicates.

Analysis of DEGs between Transgenic and Non-Transgenic Poplars
The gene expression profiles of the two types of poplar were analyzed and compared in order to identify DEGs. The fold-change values between samples were calculated according to the TPM. Genes with a p-value <0.05, calculated with DESeq2, were regarded as DEGs. A total of 477 genes showed significantly altered expression in poplar-overexpressing SVL genes. Most (342) showed upregulated expression; the remaining 135 genes were downregulated ( Figure 4; Table S3).
DEGs from the six samples were clustered to fully explore the relationships among DEGs; 10 major clusters with distinct transcriptional dynamics were identified Figure 4 and Figure S1). Subcluster 4 consisted of the largest number (152 genes) of up-or downregulated genes. KEGG analysis revealed that hormone-related genes and glutathione metabolism were enriched in this cluster and GO analysis revealed that oxidoreductase activity was enriched (Figures 5 and 6; Figure S2 and Table S4).
These results suggest that the overexpression of exogenous genes results in remarkable changes in the transcriptome of transgenic poplars, primarily by increasing the expression of hundreds of genes.
genes. KEGG analysis revealed that hormone-related genes and glutathione metabolism were enriched in this cluster and GO analysis revealed that oxidoreductase activity was enriched ( Figure 5 and 6; Figure S2 and Table S4).
These results suggest that the overexpression of exogenous genes results in remarkable changes in the transcriptome of transgenic poplars, primarily by increasing the expression of hundreds of genes.

Identification of Transcription Factors by Transcriptome Analysis
Proteins containing a DNA-binding domain that recognizes a specific DNA sequence are usually defined as transcription factors (TFs), which regulate the first step of gene expression. Early studies on TFs revealed their typical structure, containing DNA-binding domains (DBDs) and critical activation domains (ADs), which were required for transcription [24]. Transcriptional regulation plays a pivotal role in controlling gene expression in plants. To better understand the role of SVL transcription factors, we screened the DEGs to identify potential transcription factors. Finally, 34 transcriptional factors were identified, except for the overexpressed target gene, Potri.007G010800 (Table 2). Four WRKY transcription factors, which act as differentially regulated genes, play important roles in development, senescence, defense responses, and stress tolerance. WRKY41 is involved in plant development, such as seed dormancy, and defends plants against Pseudomonas syringae pv. tabaci 6605 [25]. WRKY6 is associated with plant senescence processes as well as plant defense responses [26]. WRKY57 confers drought tolerance in rice and Arabidopsis and enhances the resistance of Arabidopsis against Botrytis cinerea infection [27]. WRKY53 plays a central role in the stress response and during the early stages of leaf senescence [28,29]. In addition to the WRKY transcription factors, the expression levels of NAC, MYB, ERF, bZIP, and GRAS plant-specific transcription factors were also altered. For example, Arabidopsis NAC domain-containing proteins 56 and 55. 2008 [25].
Potri.004G007500 AT1G62300 WRKY transcription factor 6 Serves a function in plant senescence, pathogen defense, abiotic stress, and accumulation of FAs.
In maize, mutation of the TGA9 homolog LIGULELESS2 causes defects in the formation of the blade-sheath boundary in leaves and delayed flowering.
Potri.011G123300 AT3G15500 NAC domain containing protein 55 Particates in pathogen response pathway induced by chitin. Appears to be dependent on ANAC055.

Dehydration responsive element binding protein 19
Involved in response to drought. 3 Potri.017G147000 AT1G01490 Heavy metal transport/detoxification superfamily protein.

Validation of RNA-Seq Results by qRT-RCR
Quantitative real-time (qRT)-PCR was performed on nine randomly selected genes with increased or decreased transcript abundances in transgenic poplars to confirm the accuracy and reproducibility of the Illumina RNA-Seq results. The correlation between the RNA-Seq and qRT-PCR data was evaluated using log2-fold change measurements, defined as ∆∆CT (comparative threshold cycle). Of the nine genes examined, the qRT-PCR expression data showed similar trends to the RNA-Seq data (Figure 7). The fold changes in the expressions of the nine genes were generally in agreement with the RNA-Seq data. Specifically, the fold changes in the expressions of seven genes (WRKY53-Like, TCP12-Like, MYB transcription factor, Hombox12-Like, WRKY41-Like, WRKY6-Like, and bZIP transcription factor) were higher than those obtained with RNA-Seq. The remaining two genes (HMP1-Like and TGA1.1-Like) showed slightly lower expression levels than the RNA-Seq data (Figure 7). Potri.T127400 AT2G04890 SCARECROW-Like 21 0

Validation of RNA-Seq Results by qRT-RCR
Quantitative real-time (qRT)-PCR was performed on nine randomly selected genes with increased or decreased transcript abundances in transgenic poplars to confirm the accuracy and reproducibility of the Illumina RNA-Seq results. The correlation between the RNA-Seq and qRT-PCR data was evaluated using log2-fold change measurements, defined as ΔΔCT (comparative threshold cycle). Of the nine genes examined, the qRT-PCR expression data showed similar trends to the RNA-Seq data (Figure 7). The fold changes in the expressions of the nine genes were generally in agreement with the RNA-Seq data. Specifically, the fold changes in the expressions of seven genes (WRKY53-Like, TCP12-Like, MYB transcription factor, Hombox12-Like, WRKY41-Like, WRKY6-Like, and bZIP transcription factor) were higher than those obtained with RNA-Seq. The remaining two genes (HMP1-Like and TGA1.1-Like) showed slightly lower expression levels than the RNA-Seq data (Figure 7).

Discussion
Transcription factors regulate complex networks with downstream signaling genes characterized by specific responsive elements to participate in plant development and stress responses. In most eukaryotic plants, the MADS-box family transcription factors,

Discussion
Transcription factors regulate complex networks with downstream signaling genes characterized by specific responsive elements to participate in plant development and stress responses. In most eukaryotic plants, the MADS-box family transcription factors, serving as key determinants of gene regulatory networks, always are involved in responses to stress and development in plants [54]. The MADS-domain factor SVP is present in the leaves and SAM during the vegetative phase and serves as a negative regulator of floral transition and an important floral meristem gene. SVP was repressed in the meristem when the plant entered reproductive development, while sufficient SVP transcriptional activity was detectable in the inflorescence apex and the primordia of the coflorescences at the very early flower stage [8,55,56]. In Arabidopsis, SVP regulates ABA catabolism by directly binding to the CArG motifs of CYP707A1/3 and AtBG1 promoters and plays a key role in plant responses to water-limiting assays and long-drought stress [57]. SVP represses plant growth regulator gibberellin (GA) biosynthesis under inductive photoperiods, contributing to delayed flowering [10]. In Populus, SVL, a homolog of Arabidopsis SVP, acts as a negative regulator in bud break by two antagonistic plant hormones, gibberellin and abscisic acid [12]. In yellowhorn, AGAMOUS-LIKE22-regulated ABA biosynthesis and overexpression of yellowhorn AGL22 helped poplars increase resistance to drought stress [58].
Gregis et al. employed ChIP-seq analysis to study SVP binding behavior in Arabidopsis at the genome-wide level, and approximately 3000 genes were identified [2]. In this study, we performed differential gene expression analysis using Illumina HiSeq 2000 paired-end sequencing (RNA-Seq) performed in transgenic poplars. At least 46.6 million raw reads with Q20% or Q30% scores that were sufficiently high to ensure accurate sequence reads were generated. Approximately 75.14% of the sequences mapped to the reference genome sequence of P. trichocarpa. Global analysis of gene expression revealed transcriptome reprogramming in the transgenic poplar (OE1-3) compared with that in non-transgenic WT poplar. A total of 477 genes showed significantly altered expression in poplars overexpressing SVL genes, including 342 upregulated and 135 downregulated genes, indicating that flowering inhibitors play a key role in the insertion/expression of transgenes in poplars. Additionally, 34 transcriptional factors were identified. The RNA-seq data were validated using qRT-PCR analysis of nine genes. The expressions of the nine genes revealed by qRT-PCR analysis (Figure 7) showed that they may be associated with the introduction of exogenous genes and may be involved in different regulatory networks.
The defining features of WRKY transcription factors were several almost nonchanged amino acid sequences at the N-terminus and an atypical zinc-finger structure at the Cterminus, which was called the WRKY domain. Accumulating evidence has revealed that WRKY proteins play diverse roles in response to ambient stresses and plant growth, as well as development, by binding to the W-box cis-elements of target genes [59]. In rice, WRKY53 acted as an important regulator of BR signaling. Phenotypic analyses showed that the leaf angles of the OsWRKY53 overexpression line were enlarged, revealing a novel role for OsWRKY53 in mediating crosstalk between hormones and other signaling pathways [60]. In addition, WRKY53 transcriptionally represses GA biosynthesis genes to regulate male fertility in the anthers, leading to changes in breeding potential when rice is subjected to cold stress [61]. WRKY6 played key roles in plant senescence, pathogen defense, ambient stress, seed oil accumulation, and fatty acid composition [39]. WRKY41 was an important regulator of ABSCISIC ACID INSENSITIVE 3 (ABI3) expression, which played an essential role in seed dormancy [37]. When poplars overexpressed SVL, the expression of the WRKY53-Like, WRKY6-Like, and WRKY41-Like genes in the poplar also increased, consistent with the integration of SVL into plant hormone signaling to mediate senescence, pathogen defense, and abiotic stress.
The TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR (TCP) has been characterized as a semiredundant regulator in plant cell elongation and proliferation, stature, germination, flowering time, pollen development, and leaf morphology [62].
TCPs have also been implicated in several plant hormones' biosynthesis and signaling pathways, including jasmonic acid (JA), salicylic acid (SA), cytokinin (CK), ABA, auxin, and BR [63][64][65]. AtTCP8 positively responded to early immune signaling, and when combined with mutations in AtTCP14 and AtTCP15, additional layers of defense signaling in Arabidopsis. The target genes of TCP8, along with other TCP members by pathogen effectors, may be modulators of BR and other plant hormone signaling pathways [66]. TCP12 and TCP18, renamed as BRC1 and BRC2 based on their similarity to teosinte branched1 (tb1) from maize, were the only Arabidopsis TCP genes that retained a respective role in branching and axillary bud development [67]. In poplars, the expression of TCP12-Like genes increased when SVL was overexpressed, implying that SVL may play a role in axillary bud development. The MYB, bZIP, Homebox12-Like, and TGA1.1-Like transcription factors were also upregulated.
Meanwhile, PME35-Like (PECTIN METHYLESTERASE 35), which encoded a pectin methylesterase, was significantly upregulated in our transcriptome. The stem of the lossof-function mutant of Arabidopsis pme35 exhibited a pendant phenotype and an increased deformation rate [68]. In plants, pectin methylesterases are involved in vegetative as well as reproductive processes, including wood, pollen formation, and plant-pathogen interactions [69]. Dodder is a stem parasite with haustoria which is an important tool to extract necessary nutrients and water when it is attached to its host. Cuscuta haustoria showed the phenotype of reducing pectin digestion and lacking searching hypha when growing on CcLBD25 RNAi tomatoes, indicating that there was a correlation between pectin digestion and parasitism [70]. Our study provides preliminary insights into genes affected by SVL by means of transcriptome analysis. The specific functions of these genes mediated by SVL require further experimental validation in future studies.

Conclusions
In the present study, we cloned the SHORT VEGETATIVE PHASE (SVP)-Like gene in the poplar, an MADS-box transcription factor and a repressor for flowering, and transformed the poplar clone "84K" with the SVP-Like gene, and obtained stable overexpression transgenic lines. Transcriptome analysis of the leaves of the transgenic and WT lines revealed that there were 342 upregulated and 135 downregulated genes. Ten subclusters in DEGs were analyzed, and the KEGG terms of the largest subcluster were associated with two key pathways: hormone-related genes and glutathione metabolism. Meanwhile, many transcriptional factors were involved. Overall, our results contributed to a deeper understanding of how candidate genes were regulated by the SVP-Like gene using overexpression and transcriptome technology and provided target genes for plant senescence, pathogen defense, and abiotic stress in the future.