Conservation and Divergence of SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) Gene Family between Wheat and Rice

The SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) gene family affects plant architecture, panicle structure, and grain development, and its members are key genes for crop improvement. The objective of the present study is to use the well-characterized functions of SPLs in rice to facilitate the functional genomics of TaSPL genes. To achieve this goal, we combined several approaches, including genome-wide analysis of TaSPLs, comparative genomic analysis, expression profiling, and a functional study of TaSPL3 in rice. We established the orthologous relationships of 56 TaSPL genes with the corresponding OsSPLs, laying a foundation for the comparison of known SPL functions between wheat and rice. Some TaSPLs exhibited different spatial-temporal expression patterns when compared to their rice orthologs, thus implicating functional divergence. TaSPL2/6/8/10 were found to respond to different abiotic stresses through a combination of RNA-seq and qPCR expression analyses. Additionally, ectopic expression of TaSPL3 in rice advances heading date, affects leaf and stem development, and leads to smaller panicles and decreased yield per panicle. In conclusion, our work provides useful information toward cataloging the functions of TaSPLs, emphasizes the conservation and divergence between TaSPLs and OsSPLs, and identifies important SPL genes for wheat improvement.


Introduction
Wheat is one of the most important crops worldwide, providing a food supply for about 28% of the global population [1]. However, sustaining wheat yield and quality has become unprecedentedly challenging for several reasons, including the reduction of arable land area, water resource shortages, and the emergence of new pathogens and pests. Our fundamental understanding of the genes involved in wheat functional traits represents one of the key aspects for wheat molecular breeding and, hence, is of great significance for the improvement of wheat yield and quality.
Transcription factors (TFs) are a class of DNA-binding proteins encoded by certain gene families, which bind to their target genes in a sequence-specific manner and activate or inhibit the transcription of target genes. Thus, TF families are involved in various aspects of plant growth and development, and often contain master regulators and key genes for crop improvement.
[...] paleoduplication, allopolyploidization, and tandem or segmental duplication of genes. On one hand, several gene family studies have consistently identified 56 TaSPL genes in wheat [41][42][43][44]. On the other hand, functional characterization of TaSPLs in transgenic monocot plants remains very limited [45][46][47]. For example, transgenic wheat lines with the TaSPL8 gene edited by the CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 method were characterized by deficient leaf stalk bases (auricles, ligules, and laminar joints), erect leaves, and compact plant architectures [46]. Overexpression of TaSPL13 affects inflorescence architecture in wheat, with an increased number of florets and grains [45]. Other TaSPLs have been functionally studied in ectopic expression systems, including rice and Arabidopsis. In transgenic rice, panicle length, primary and secondary branches of the panicle, and grain numbers were increased significantly after overexpression of TaSPL20 and TaSPL21 (renamed to TaSPL10 and TaSPL5, which are actually the wheat orthologs of OsSPL10 and 5, respectively) [47]. Ectopic overexpression of TaSPL16 in A. thaliana promoted flowering [48], while TaSPL3 or TaSPL6 overexpression in A. thaliana affected flowering time and organ size [49].
As techniques for generating transgenic or genome-edited plants are still not routine in wheat, and are challenging and time-consuming, utilizing the existing knowledge of OsSPL functions is expected to facilitate the functional study of TaSPLs. We hypothesize that combining the spatial-temporal expression patterns of TaSPLs with the orthologous relationships between TaSPLs and OsSPLs will greatly help to understand and predict the functions of TaSPLs. We also hypothesize that a TaSPL gene may have largely conserved or overlapping functions with its OsSPL ortholog if the two genes exhibit similar expression patterns. For the present study, we aimed to: (1) establish orthologous relationships between TaSPLs and OsSPLs; (2) collect expression evidence for the functional conservation and divergence of TaSPL members; and (3) summarize the current state of the art of OsSPL functions and identify and validate examples of TaSPL genes with conserved functions between wheat and rice.

Results
During the evolution of Brassicaceae and Poaceae, Arabidopsis and rice have experienced distinct sets of paleopolyploidization events, which have had major impacts in terms of gene expansion and functional divergence. In light of this, SPLs have expanded in a lineage-specific manner in Arabidopsis and rice, even though Arabidopsis and rice have similar numbers of SPL genes (17 in Arabidopsis vs. 19 in rice) [15,23]. Differing from AtSPLs, lineage-specific gene expansion has involved OsSPLs in a diverse set of biological processes related to plant growth and development, which have been functionally characterized using mutants, transgenic lines, or CRISPR/Cas9-based knockout lines [5,10,11,13,14,31,37,50,51]. Among the agronomically important monocot cereal crops, rice and wheat are very similar in plant development and architecture, except for their inflorescence structures, making it possible to transfer the current knowledge of gene functions from rice to wheat.
To utilize the large volume of knowledge related to the functions of OsSPLs, we combined BLAST- and protein domain-based methods for gene identification and found 56 genes encoding TaSPLs. The TaSPLs reported in our work are consistent with those identified in several previous studies [41,44,45]. All of the TaSPL proteins have one SBP domain with a length of approximately 78 amino acids, which is conserved between OsSPLs and TaSPLs (Figure S1). This SBP domain contains two zinc-binding sites (the zinc finger 1 and zinc finger 2 motifs), together with a conserved nuclear localization signal (NLS) located at the C-terminus of the SBP domain. TaSPLs have a Cys-Cys-Cys-His (CCCH)-type zinc finger 1 motif, except for TaSPL9-A/B/D, which contain the Cys-Cys-Cys-Cys (CCCC) type, while all TaSPLs have the Cys-Cys-His-Cys (CCHC)-type zinc finger 2 motif (Figure S1).

Polyploidization and Gene Tandem Duplication Shape the Expansion of TaSPLs
To focus on the phylogeny of SPLs in Pooideae, we combined the TaSPLs with 18 SPLs from Aegilops tauschii (Ae. tauschii), 10 SPLs from Triticum urartu (T. urartu), 16 SPLs from Brachypodium distachyon (B. distachyon), and the 19 SPLs from Oryza sativa L. (O. sativa) to construct a maximum-likelihood tree (Figure 1) [41,44]. Phylogenetic analysis clustered the SPLs into five groups (groups I-V). Further, we compared the micro-synteny between the genomic segments harboring OsSPL genes and TaSPL genes in order to determine the wheat syntelogs of each OsSPL gene (Figure S2, Table S1). Consistent with the syntelogous relationships between OsSPLs and TaSPLs, our phylogenetic tree clustered each set of TaSPL homeologs (labeled using green, purple, and yellow circles for the homeologous copies from the wheat A, B, and D sub-genomes, respectively) together with the corresponding rice ortholog (Figure 1A). In rice, ten of the 19 OsSPLs (about 53% of the OsSPL family) form five sister gene pairs, hereafter named SPL pairs 1 to 5 for OsSPL3/12, OsSPL4/11, OsSPL5/10, OsSPL14/17, and OsSPL16/18, respectively, in order to describe the SPL genes in an evolutionary context. These five paleoduplicated gene pairs were retained after ancient whole-genome duplication (WGD) events, with evidence of sub-functionalization [31]. In contrast to rice, wheat and its diploid relatives have lost one member of each of the pair-1 and pair-2 SPLs (SPL3/12 and SPL4/11), retaining only SPL3 and SPL4, respectively (Figure 1). More interestingly, tandem duplications have driven the expansion of TaSPL10 (belonging to the pair-3 SPLs), resulting in 11 copies, designated TaSPL10a-A/B/D, TaSPL10b-A/B/D, TaSPL10c-A/B/D, and TaSPL10d-A/D (Figure 1). For singleton SPLs, syntelogous relationships have been well preserved between OsSPLs and TaSPLs, indicating that singleton SPLs and paleoduplicated SPL pairs may have had different evolutionary fates in wheat.
In addition, SPL19 genes were not identified in the analyzed Triticeae species. Taken together, our genome-wide analysis revealed that the expansion of TaSPL genes in wheat is a consequence of the allopolyploidization of the A/B/D sub-genomes and the tandem duplication of TaSPL10.
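The gene counts reported above can be cross-checked with a short tally. The following Python sketch is only a consistency check, not the authors' identification pipeline. It assumes that every retained SPL gene other than TaSPL10 is present as one complete A/B/D triad (an assumption made here for illustration), and the function and variable names are hypothetical.

```python
# Tally of wheat SPL genes implied by the text (a consistency check only).
RICE_SPLS = set(range(1, 20))       # OsSPL1 ... OsSPL19
ABSENT_IN_WHEAT = {11, 12, 19}      # no TaSPL11, TaSPL12, or SPL19 reported in wheat
TANDEM_COPIES = {10: 11}            # TaSPL10a/b/c-A/B/D (9 copies) + TaSPL10d-A/D (2 copies)
SUBGENOMES = 3                      # A, B, and D


def count_wheat_spls(rice_spls, absent, tandem_copies, subgenomes=SUBGENOMES):
    """Sum homeolog copies over the SPL genes retained in wheat.

    Assumption: each retained gene without an explicit tandem count
    is present as one complete A/B/D triad.
    """
    retained = sorted(rice_spls - absent)
    return sum(tandem_copies.get(gene, subgenomes) for gene in retained)


total_taspls = count_wheat_spls(RICE_SPLS, ABSENT_IN_WHEAT, TANDEM_COPIES)
print(total_taspls)  # 56
```

Under this assumption, 15 triads (45 genes) plus the 11 TaSPL10 copies give the 56 TaSPL genes reported in the text.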
Analyses of gene structures and protein motifs found, on one hand, that TaSPLs from the same phylogenetic groups tend to have similar exon-intron structures and similar combinations of MEME-predicted protein motifs (Figure S3). On the other hand, differences in the predicted protein motifs between SPL triads were sometimes observed, suggesting divergence between SPL triads at the protein sequence or protein-protein interaction levels (shown in red boxes in Figure S3B).
MicroRNA156 (miR156) is the major microRNA that regulates several SPL genes at the post-transcriptional level in many plant species [55]. The miR156-SPL module is an important regulatory module for plant growth and development [3,5,26,56,57,58]. In rice, 12 of the 19 OsSPLs are targeted by miR156, covering phylogenetic groups II, IV, and V of OsSPLs (labeled by red crosses in Figure 1A). As the miRNA-target complementarity rules in plants have been well studied [59], we utilized psRNATarget prediction to identify the TaSPLs that are likely targeted by miR156 (Figure S4) [60]. A total of 27 TaSPLs (i.e., the TaSPL2, 3, 4, 7, 13, 14, 16, 17, and 18 triads) were predicted to be targeted by tae-miR156 (Figure S4), all of which are evolutionarily conserved wheat syntelogs of the miR156-regulated OsSPLs (Figure 1A). Sequence alignment results demonstrated that the mature tae-miR156 sequence is conserved between dicot and monocot species, while the miR156 recognition sites show only a few sequence variations between TaSPL genes (Figure S5), suggesting that miR156-SPL regulation is likely conserved between rice and wheat.
Figure 1. (A) Maximum-likelihood phylogenetic tree of SPLs from wheat, Ae. tauschii, T. urartu, B. distachyon, and rice (see also Table S1); (B) diagram of the chromosomal locations of TaSPL genes, identifying several TaSPL clusters and gene expansion at the TaSPL10 locus driven by tandem duplications. TaSPL gene clusters are shown by blue boxes and the tandemly duplicated genes of TaSPL10 are highlighted in red.
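The idea of miRNA-target complementarity behind the miR156 target prediction can be illustrated with a minimal Python sketch. It only counts mismatches against a perfectly complementary site and does not reproduce the psRNATarget scoring scheme. The sequences are short toy strings chosen for illustration, not the real tae-miR156 or TaSPL sequences, and all function names are hypothetical.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def perfect_target_site(mirna):
    """Return the mRNA site (5'->3') that would pair perfectly with the miRNA (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(mirna))


def count_mismatches(mirna, site):
    """Count positions where a candidate site differs from the perfect site.

    Simplification: G:U wobble pairs, gaps, and position-dependent
    penalties are ignored; every difference counts as one mismatch.
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and candidate site must have the same length")
    ideal = perfect_target_site(mirna)
    return sum(1 for a, b in zip(ideal, site) if a != b)


toy_mirna = "UGACAGAAGAGA"      # toy sequence, for illustration only
toy_site = "UCUCUUCUGUCU"       # toy candidate site with one changed base

print(perfect_target_site(toy_mirna))        # UCUCUUCUGUCA
print(count_mismatches(toy_mirna, toy_site))  # 1
```

A candidate site with few mismatches, as in this toy example, would be a plausible target under such a simplified rule; real predictions also weigh where along the miRNA the mismatches fall.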

Expression Profiling Indicates the Conservation and Divergence of TaSPL Genes
Determining when and where a gene is expressed is a necessary step to perform reverse genetics for the study of gene functions. Several RNA-seq studies have recently been reported based on the high-quality wheat reference genome, documenting gene expression profiles in various tissues and organs across wheat developmental stages [61]. We utilized RNA-seq data and compiled expression profiles of TaSPLs in leaves, roots, stems, inflorescences, flowers, and seeds (Figure 2A), likely representing one of the most comprehensive expression analyses of TaSPLs, to the best of our knowledge.

RNA-Seq Analysis Highlights Sub-Genome Expression Biases of TaSPL Genes
TaSPL genes were grouped into eight clusters (Clusters 1-8) based on their RNA-seq expression patterns (Figure 2A). Clusters 1, 2, and 4 (containing TaSPL14/17, TaSPL5/10, and TaSPL7, respectively) are those with tissue-specific expression patterns, while Cluster 5 (TaSPL2/13/16/18) contains TaSPLs preferentially expressed in some tissues or developmental stages. The remaining TaSPLs (Clusters 3, 6, 7, and 8; see Figure 2A) were widely expressed in multiple tissues. For instance, TaSPL14/17 were strongly expressed in shoots and inflorescences. All of the TaSPL10 copies were specifically expressed in young leaves and spikes. Moreover, we found that TaSPL8 exhibited strong and specific expression in the leaf ligule, matching its validated function in leaf ligule development [46]. In addition, TaSPL1/3/4/6/9/15 were widely expressed in multiple tissues, with strong expression observed in developing tissues and organs, including the shoot apical meristem (SAM), shoots, and developing roots, stems, inflorescences, and seeds, indicating that these TaSPLs might play pleiotropic roles in the regulation of plant development. We also validated the expression profiles of several TaSPLs, including TaSPL2/3/4/6/8/10/17/18, in wheat cv. Chinese Spring at multiple developmental stages and in various tissues (Figure S6).
Our RNA-seq analysis demonstrated a good agreement between the expression patterns and biological functions of TaSPLs. For example, TaSPL13 was predominantly expressed in inflorescences (Figure 2A), consistent with the qPCR-based spatial-temporal expression results, matching its confirmed functions in spike and floret development in wheat [45]. TaSPL3 and TaSPL6 had similar expression patterns, with highest expression observed in the shoot axis and spikes at boot stage (Figure 2A), in agreement with the functions in regulating heading date and flowering time in A. thaliana [49].
These expression results also indicated potential functional divergence between the paleoduplicated TaSPL gene pairs (i.e., TaSPL16 and TaSPL18). For example, TaSPL18 was expressed in the leaf sheath across multiple stages, while TaSPL16 was not; furthermore, TaSPL18 showed stronger expression levels in flower organs (glume, lemma, and stigma) than TaSPL16. In contrast, the other two duplicated pairs of TaSPLs (TaSPL5/10 and TaSPL14/17) showed relatively similar expression profiles.
As a polyploid species, the homeologous copies from the three sub-genomes of common wheat adopted different spatial-temporal expression patterns or were expressed at distinct levels, thus differing in their contributions to a particular phenotype. This phenomenon is known as sub-genome expression bias (SEB). We analyzed the SEBs of TaSPLs using RNA-seq data. Statistical analysis of the expression ranges between the A, B, and D sub-genome TaSPL homeologous copies revealed 11 TaSPL genes with SEBs. In particular, many TaSPLs (TaSPL1/3/4/5/9/15/18) from the B sub-genome had relatively lower expression levels when compared with the corresponding copies from the A and D sub-genomes (Figure 2B). Typical examples are as follows: TaSPL7-B was not expressed in any of the analyzed RNA-seq samples, while TaSPL7-A and -D were expressed in inflorescences; furthermore, TaSPL1-B was not expressed in most shoot, leaf sheath, and inflorescence samples, while TaSPL1-A and -D were expressed, indicating potential sub-functionalization of TaSPL1-B.
Figure 2. RNA-seq expression analysis of TaSPL genes. (A) The TaSPL genes are row-clustered into eight clusters (namely, C1-C8) based on their expression similarity, as determined by the k-means clustering method; (B) comparison of expression between each set of TaSPL homeologous copies, identifying multiple TaSPL genes with sub-genome expression biases. The Y-axis indicates the gene expression levels (as TPM). Statistical significance of sub-genome expression biases (SEB) for each TaSPL gene was determined by two-tailed Student's t-test, with *, **, and *** representing p < 0.05, p < 0.01, and p < 0.005, respectively (details in Method Section 4.5).
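The homeolog comparison underlying the SEB analysis can be sketched as a two-sample Student's t statistic. The Python sketch below uses only the standard library, assumes equal variances (the classical pooled form), and uses made-up TPM values. The function name and data are hypothetical, and converting the statistic to a two-tailed p-value would additionally require the t distribution (for example via `scipy.stats.ttest_ind`).

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("both samples have zero variance")
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical TPM values for the A and B homeologs of one gene across four samples.
tpm_a = [10.0, 12.0, 11.0, 13.0]
tpm_b = [2.0, 3.0, 1.0, 2.0]

t_stat, df = student_t(tpm_a, tpm_b)
print(round(t_stat, 3), df)  # 12.438 6
```

A large absolute t value with few degrees of freedom, as in this toy case, corresponds to a homeolog pair that would be flagged as biased once the p-value is looked up.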

RNA-Seq Analysis Shows the Different Spatial-Temporal Expression Preferences between TaSPLs and OsSPLs
With the extensive expression data of TaSPLs, we questioned whether the wheat SPL orthologs retained similar expression preferences after rice-wheat divergence. To address this, we retrieved both the microarray- and RNA-seq-based expression profiles of OsSPLs (Figure S7). The microarray expression data of OsSPLs cover a wide range of tissues/organs across different rice developmental stages [23], with comparable tissues and stages to the wheat expression atlas of cv. Azhurnaya [62] (see Method Section 4.5). We summarized the TaSPL and OsSPL expression profiles, respectively, for each organ (i.e., leaves, roots, stems, inflorescences, flowering organs, and seed tissues), in order to simplify the comparison between rice and wheat (Figure 3). Both TaSPLs and OsSPLs can be grouped into three classes (ubiquitously expressed, tissue-preferentially expressed, and specifically expressed) based on the number of tissues where each SPL gene is expressed (see Methods section). In rice, OsSPL1/3/4/6/9/11/12/15 are ubiquitously expressed, OsSPL2/8/14/16/18 are expressed preferentially in some tissues, and OsSPL5/7/10/13/17 are expressed specifically in particular tissues (Figure S7). In wheat, TaSPL1/3/4/6/9/15 are ubiquitously expressed, TaSPL2/13/16/18 have tissue-preferential expression patterns, and TaSPL5/7/8/10/14/17 are expressed in a tissue-specific manner (Figure 2A). The SPL genes with ubiquitous and tissue-preferential expression largely overlap in rice and wheat, whereas the tissue-specifically expressed SPL genes mostly differed between rice and wheat.
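The three expression classes described above are defined by the number of tissues in which a gene is expressed. A minimal Python sketch of such a rule follows. The cutoffs (TPM of at least 1 to call a gene expressed, at most two organs for "tissue-specific", at least five of six organs for "ubiquitous") and the example profiles are hypothetical placeholders, since the actual thresholds are given in the Methods section rather than here.

```python
ORGANS = ("leaves", "roots", "stems", "inflorescences", "flowers", "seeds")


def classify_breadth(profile, expressed_cutoff=1.0, specific_max=2, ubiquitous_min=5):
    """Classify a gene by the number of organs in which it is expressed.

    All three thresholds are illustrative assumptions, not the study's values.
    """
    n_expressed = sum(1 for organ in ORGANS if profile.get(organ, 0.0) >= expressed_cutoff)
    if n_expressed == 0:
        return "not expressed"
    if n_expressed >= ubiquitous_min:
        return "ubiquitous"
    if n_expressed <= specific_max:
        return "tissue-specific"
    return "tissue-preferential"


# Made-up organ-level expression values (TPM) for three hypothetical genes.
profiles = {
    "geneU": {"leaves": 8.0, "roots": 5.0, "stems": 9.0, "inflorescences": 20.0, "flowers": 6.0, "seeds": 3.0},
    "geneP": {"leaves": 0.2, "roots": 4.0, "stems": 7.0, "inflorescences": 15.0, "flowers": 0.5, "seeds": 0.0},
    "geneS": {"leaves": 0.0, "roots": 0.1, "stems": 0.3, "inflorescences": 12.0, "flowers": 0.4, "seeds": 0.0},
}

classes = {gene: classify_breadth(profile) for gene, profile in profiles.items()}
print(classes)
```

Applying the same rule to organ-level summaries from both species is what makes the rice and wheat classes directly comparable.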
In rice, the two SPL genes within the paleoduplicated pairs 1 or 2 showed distinct expression patterns; for example, OsSPL3 had the highest expression in roots, whereas OsSPL12 was highly expressed in roots, stems, inflorescences, and flower organs (Figure 3A). In contrast, the two OsSPL genes within the other duplicated pairs (e.g., OsSPL5/10 and OsSPL16/18) exhibited similar expression abundance at the organ level.
In wheat, the majority of SPL triads showed similar organ-level expression preferences (Figure 3B); however, TaSPL1-B had relatively lower expression in roots, stems, and flower organs when compared with TaSPL1-A and -D. TaSPL18-A and -D, but not TaSPL18-B, were expressed in stems. It is well accepted that where a gene is expressed is correlated with its function in a particular tissue/organ. Therefore, we considered that comparison of the expression patterns at tissue or organ levels between OsSPLs and TaSPLs provides valuable information regarding the transfer of the known functions of OsSPLs to TaSPLs, thus facilitating the functional study of TaSPLs. For example, TaSPL8 and OsSPL8 have high expression levels specifically in the leaf sheath and ligule; indeed, their conserved function in regulating the development of the leaf ligule can be reasoned from such expression analyses, and has been experimentally confirmed [32,46].
When comparing the expression patterns of TaSPLs and OsSPLs, some TaSPLs appeared to exhibit expression patterns differing from their rice SPL orthologs, with some TaSPLs even being expressed in a manner much more specific to certain tissues and stages. For instance, OsSPL7 is highly expressed in both inflorescences and seeds, while TaSPL7 is specifically expressed in inflorescence tissues (Figures 3 and S7). Another example is that OsSPL5 is expressed highly in inflorescences, flower organs, and seeds, whereas TaSPL5 is particularly expressed in leaf, root, and inflorescence samples. These obvious changes in expression patterns between OsSPLs and their orthologs in wheat possibly indicate the TaSPLs may possess functions different from those of the corresponding OsSPLs.

TaSPLs Respond to Abiotic Stresses and Phytohormone Treatments
While TaSPLs exhibited spatial-temporal expression preferences, the potential roles of TaSPLs in stress response and regulation have been understudied. This is, at least partly, due to the fact that the high-quality wheat reference genome has only become available recently, and comprehensive transcriptome studies of wheat plants under stress treatments have been limited. Based on the limited RNA-seq data of different wheat varieties treated by abiotic stresses (summarized in Table S3) [63,64], we selected the stress-responsive TaSPL genes to perform qPCR expression analysis under abiotic stresses and phytohormone treatments, in order to gain more insights into the roles of TaSPLs in stress responses and regulation, as many stress-tolerance mechanisms are mediated by phytohormone signaling pathways [65,66].
Our qPCR analysis demonstrated that TaSPL genes responded quite differently to the same treatment, and that a given TaSPL gene showed distinct expression responses across the stress and phytohormone treatments (Figure 4). The expression of TaSPL2 was significantly upregulated after 1 h of drought treatment. TaSPL2 was also upregulated after 6 h of combined drought and heat stresses (Figure 4). Under drought stress, eight TaSPL genes (TaSPL2/3/4/6/8/10/17/18) were up- or downregulated, with different stress-response patterns. TaSPL6 was downregulated throughout the whole process of drought stress, while TaSPL10 was dramatically upregulated (by over seven-fold) after 3 h of treatment. After PEG treatment, the expression of TaSPL2/4/6/8/18 peaked at 1 h, TaSPL10 peaked at 12 h, while TaSPL3 was downregulated. Different from PEG treatment, NaCl treatment induced the expression of TaSPL2/6/10 but repressed the expression of TaSPL3/4/17. Under cold treatment, TaSPL6/10 responded rapidly at 1 h, while TaSPL2/3/18 had two upregulation peaks in response.
Previous studies have shown that SPL genes respond to drought [67], heat [68], auxin (IAA), and brassinolide signal pathways [46], and biotic and abiotic stresses [34,67]. For the present study, we investigated the expression of TaSPLs under several phytohormone treatments, including ABA, IAA, GA, GR24, MeJA, and BR (Figure 4). The four phytohormones other than GA and MeJA induced the expression of the analyzed TaSPLs, with IAA causing the highest upregulation of TaSPLs (ranging from ~3-fold to over 25-fold upregulation). IAA mediated strong upregulation of TaSPL2/8/10/17 at 12 h, while TaSPL4/6/18 responded to IAA treatment earlier, at 1 to 3 h. For ABA treatment, all eight TaSPL genes showed a similar up-and-down expression pattern, with the majority of TaSPL expression induced at 3-6 h. The expression patterns of TaSPLs in response to GA were complex. TaSPL3/6 were downregulated by GA; TaSPL2/17 were first downregulated at early stages and then upregulated by GA at late stages; while TaSPL8/10/18 expression was significantly induced by GA treatment. Unlike the complex responses to GA treatment, a few TaSPL genes (TaSPL3/6/10/17/18) were upregulated at late stages (12 to 24 h) under GR24 treatment. MeJA was the only phytohormone that predominantly repressed TaSPLs, inducing downregulation of several TaSPLs, including TaSPL2/4/6/8/10/18. TaSPL17 was the only TaSPL gene significantly upregulated after MeJA treatment. For the BR treatment, several TaSPL genes (i.e., TaSPL2/4/6/8/10/17) exhibited two upregulation peaks, one at 3 h and the other at 12 h, while TaSPL18 was downregulated by BR. These results suggest that the functions of some TaSPLs are (directly or indirectly) related to abiotic stress tolerance mechanisms and other biological processes, such as phytohormone biosynthesis, degradation, and signaling, thus regulating the plant growth and development of wheat.
Figure 4. Spatial-temporal expression preference of TaSPLs and their responses to abiotic and phytohormone treatments, as determined by qPCR. Temporal expression profiles of TaSPL2, TaSPL3, TaSPL4, TaSPL6, TaSPL8, TaSPL10, TaSPL17, and TaSPL18 at 0, 1, 3, 6, 12, and 24 h after the treatments of abiotic stresses or phytohormones. Significant differences in expression levels were determined by comparing each treatment at each time point with that at 0 h for each gene per treatment using Student's t-test (p < 0.05). * and ** indicate significant differences at p < 0.05 and p < 0.01, respectively, in gene expression levels compared to that at 0 h for each treatment.
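Fold changes relative to the 0 h time point, as plotted for the qPCR time courses, are commonly derived from threshold cycle (Ct) values with the 2^(-ΔΔCt) method. Whether that exact method was used here is not stated in this section, so the Python sketch below is an assumption, and the Ct values, reference-gene readings, and function name are all hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumes ~100% primer efficiency)."""
    delta_treated = ct_target_treated - ct_ref_treated    # normalize to the reference gene
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical (target Ct, reference Ct) pairs at each time point (hours after treatment).
time_course = {0: (25.0, 18.0), 1: (24.0, 18.0), 3: (22.0, 18.0)}

ct_target_0h, ct_ref_0h = time_course[0]
folds = {
    hours: fold_change(ct_target, ct_ref, ct_target_0h, ct_ref_0h)
    for hours, (ct_target, ct_ref) in time_course.items()
}
print(folds)  # {0: 1.0, 1: 2.0, 3: 8.0}
```

Because each time point is normalized to 0 h, the control value is 1.0 by construction, and values above or below 1.0 read directly as up- or downregulation.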

TaSPL3 Encodes a SPL Transcription Factor Highly Expressed in Young Spikes
When analyzing the expression preference between OsSPLs and their corresponding orthologous SPLs in wheat, the ubiquitously expressed OsSPLs and TaSPLs have been suggested as sharing conserved and significant roles during the development of rice and wheat, respectively. In order to prove the concept that a TaSPL gene may have largely conserved or overlapping functions with its OsSPL ortholog, if both the OsSPL and TaSPL exhibit similar expression patterns, we performed experiments to study the functions of TaSPL3. The expression preference of SPL3 in developing spikes and stems in both rice and wheat indicates that SPL3 could play significant and conserved biological roles in the development of spikes and/or stems.
TaSPL3 consists of three highly conserved homeologous copies, TaSPL3-A (TraesCS6A02G1101001.2), TaSPL3-B (TraesCS6B02G138400.1), and TaSPL3-D (TraesCS6D02G098500.1), with over 96% identity for both the nucleotide and amino acid sequences. According to the public RNA-seq results of different tissues across wheat developmental stages, the TaSPL3 triad exhibits particularly high expression in developing spikes, moderate expression in stems and roots, low expression in leaves, and almost no expression in seeds (Figure 5A). TaSPL3-A has relatively higher expression compared with TaSPL3-B and -D. Similarly, qPCR analysis of TaSPL3 validated that it has the highest expression in young spikes, with moderate expression in other green tissues at different stages (Figures 3A and S6). To substantiate the functional study of TaSPL3, we investigated the sub-cellular localization of TaSPL3 (Figure 5B). The TaSPL3-A-GFP fusion protein was specifically localized in the nucleus, matching its role as a transcription factor. Furthermore, we performed a transactivation assay to determine the self-activation activity of TaSPL3 (Figure 5C). TaSPL3-A was truncated into three parts: the N-terminal (amino acids 1-183), the SBP domain (amino acids 184-261), and the C-terminal (amino acids 262-475) (for primers, see Table S1). Our results showed that the N-terminal and SBP domain do not possess transactivation activity, while the C-terminal and the full-length TaSPL3-A can activate transcription. These results indicate that TaSPL3 is a nucleus-localized transcription activator.
To investigate the function of TaSPL3 in rice, we generated transgenic lines of rice ubiquitously expressing TaSPL3-A, fused with the 3 × myc tag to facilitate protein detection (Figure S8). Six transgenic lines of TaSPL3-A (hereafter referred to as the TaSPL3-OE lines) were obtained, together with one line transformed with the empty vector (the vector control line, VC) as a negative control.
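The three TaSPL3-A fragments used in the transactivation assay are defined by 1-based, inclusive amino acid coordinates. The following Python sketch shows how such coordinates map onto a sequence and confirms that the fragments tile the 475-residue protein, with an SBP fragment of 78 residues. The sequence is a placeholder string rather than the real TaSPL3-A protein, and the helper name is hypothetical.

```python
# Fragment boundaries from the text (1-based, inclusive amino acid positions).
FRAGMENTS = {
    "N-terminal": (1, 183),
    "SBP domain": (184, 261),
    "C-terminal": (262, 475),
}


def fragment(sequence, start, end):
    """Extract residues start..end (1-based, inclusive) from a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


# Placeholder 475-residue sequence; NOT the real TaSPL3-A protein.
placeholder_protein = "M" + "X" * 474

lengths = {
    name: len(fragment(placeholder_protein, start, end))
    for name, (start, end) in FRAGMENTS.items()
}
print(lengths)  # {'N-terminal': 183, 'SBP domain': 78, 'C-terminal': 214}
```

The 78-residue SBP fragment agrees with the approximate SBP domain length reported for TaSPL proteins in the genome-wide analysis.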
PCR results confirmed that the six TaSPL3-OE lines were TaSPL3-positive, with various expression levels of TaSPL3, as determined by qPCR (Figure S9). To investigate the phenotypic effects of ectopically expressed TaSPL3 in rice, the transgenic lines and controls (VC and wild-type cv. Nipponbare) of the T2 generation were grown in the experimental field (Wuhan, China) using a randomized complete block design. Meanwhile, another batch of transgenic and control lines was grown in pots placed beside the experimental field for ease of observation.

Ectopic Expression of
The ectopic expression of TaSPL3 in rice promotes heading. The period from sowing to maturity ranged from ~91 to ~96 days for the six TaSPL3-OE lines, while the same period for the control lines was 97 to 99 days (Figure 6). We found that the TaSPL3-OE lines started to head as early as ~60 days after sowing, while the control lines headed at ~72 days after sowing. Statistical analysis showed that the ectopic expression of TaSPL3 led to 6-7 days earlier heading.
[...] (Table S4). To identify detailed reasons for the shorter plant stature of the TaSPL3-OE lines, we compared the length and diameter of each internode between the transgenic and control lines. Our results showed that, compared with the control lines, the TaSPL3-OE lines had shorter internodes I and II, with some of the internode diameters also being smaller (Figure 7E,F); however, there was no obvious difference in tiller number between the transgenic and control lines, as shown in Figure 7D. In addition, we found that the length, width, and area of the flag leaf were also affected in the TaSPL3-OE lines (Figure 7G-K). These results clearly indicated that TaSPL3 affects vegetative growth, but it remains to be investigated whether the decreased flag leaf and internode sizes were an indirect consequence of early heading, or were due to ectopically expressed TaSPL3 influencing the development of leaves and internodes.

Ectopic Expression of TaSPL3-A in Transgenic Rice Affects Panicle Structures
As young panicles are the tissue where SPL3 is primarily expressed, we investigated the phenotypic effects of TaSPL3 overexpression on rice panicles. Ectopic overexpression of TaSPL3 affected the panicle size and structure, but did not influence grain-related traits (Figure 8). Expression of TaSPL3 in rice led to smaller panicles (Figure 8A-C), primarily due to shorter primary branches and a decreased number of secondary branches and, hence, a reduced number of grains per panicle (Figure 8D-J). Slight differences in the number of primary branches and setting rate were also observed for the TaSPL3-OE lines when compared to the control lines (Figure 8I-K). As for the grain-related traits, grain length, width, thickness, and thousand kernel weight were not affected by TaSPL3 expression (Figure 8M-P). Collectively, TaSPL3 expression resulted in lower yield when compared to the non-transgenic cv. Nipponbare.
Figure 7. Comparison of TaSPL3-OE transgenic lines and WT shows that both the internode lengths (E) and stem diameters (F) in transgenic rice plants were significantly smaller than those in WT, leading to shorter plant heights of the transgenic plants (C). Tiller numbers of the TaSPL3-OE transgenic lines did not differ from WT (D). The length (E) and diameter (F) of each internode (I to V) were compared. Morphological comparison and detailed measurements showed that both the flag leaf length (G, I) and flag leaf width (H, J) in TaSPL3-OE transgenic lines were significantly lower than those in WT, leading to decreased flag leaf areas (K) in TaSPL3-OE transgenic lines. Statistical differences of the traits were determined using Tukey's test, with mean values marked with different letters differing significantly (p < 0.05) among the lines. For panels (E) and (F), * and ** indicate significant differences in stem lengths or diameters at p < 0.05 and p < 0.01, respectively, compared to those of the wild-type plants (determined by Student's t-test).

Divergence between OsSPLs and TaSPLs
Due to the functional importance of SPLs in plant growth and development, the TaSPL genes have previously been identified in a genome-wide manner [41][42][43][44]. Consistent with the previous studies, we found 56 TaSPL genes in the wheat reference genome of cv. Chinese Spring (CS). Interestingly, TaSPLs have experienced different evolutionary trajectories compared to the OsSPLs. The evolutionary differences of the SPL family are reflected in the following three aspects. First, not all of the paleoduplicated SPL gene pairs were retained in common wheat (Figure 1). A previous study has shown that the most recent whole-genome duplication in rice led to five duplicated pairs of OsSPL genes (pair 1: OsSPL3/12; pair 2: OsSPL4/11; pair 3: OsSPL5/10; pair 4: OsSPL14/17; and pair 5: OsSPL16/18). All five OsSPL pairs are retained and exhibit partially redundant functions and evidence for neofunctionalization [5,10,13,14,31,33,37,50,69]. For example, OsSPL3 knockout altered the heading date, whereas OsSPL12 knockout did not; OsSPL4 knockout altered the heading date, whereas OsSPL11 knockout did not; OsSPL5 knockout changed the tiller number, whereas OsSPL10 knockout did not [31]. In wheat, we did not find TaSPL11 and TaSPL12, suggesting that their functions may have been replaced by other SPL members.
Second, TaSPLs differ from OsSPLs in terms of gene duplication patterns. During the allohexaploidization process of the wheat genome, the number of TaSPL genes was not only tripled by polyploidization, but also increased by tandem duplications, as in the case of TaSPL10a, 10b, 10c, and 10d (Figure 1). A preliminary analysis regarding TaSPL gene duplication has been reported [44]; however, attempts have not previously been made to reconstruct the evolutionary relationships between OsSPLs and TaSPLs. Evolutionary comparison between the wheat and rice genomes clearly shows the expansion of the wheat genome by multiple mechanisms, including whole-genome duplications (WGD), tandem duplications, and segmental duplications [70][71][72]. In rice, the WGD that occurred 70 to 90 million years ago (MYA) strongly impacted the rice genome and served as one of the major driving forces of gene duplication and divergence [73][74][75]. The WGD in the rice genome also led to the duplication of OsSPLs and drove subfunctionalization within the OsSPL pairs [31]. Unlike the rice genome, the wheat genome is characterized by a huge proportion of transposable elements (TEs, about 80% to 90%), polyploidization, and a recent burst of gene duplications (RBGD) [72,76,77]. Indeed, our work demonstrates that polyploidization and tandem duplications represent the major evolutionary forces driving the expansion of TaSPLs, differing from the case of OsSPLs.
Third, some TaSPL genes adopted distinct expression patterns, compared with those of OsSPLs. For example, TaSPL5, 7, 10, 14, and 17 exhibited tissue-specific expression profiles, while other SPLs (e.g., TaSPL2, 13, 16, and 18) were preferentially expressed in some tissues and stages, but barely expressed in some other tissues, such as leaf sheaths, roots, stems, and seeds (see Figure 2A). Unlike TaSPLs, OsSPLs generally show tissue or organ expression preferences, but do not exhibit very specialized expression patterns (Figures 3 and S7) [23]. In rice, a previous study has shown that the expression patterns of OsSPLs are associated with their functions. For example, the duplicated pair of OsSPL3 and OsSPL12 are expressed in leaves and panicles, with OsSPL3 expressed higher in leaves and OsSPL12 expressed higher in panicles, which partly explains the phenotypic effects of OsSPL3 or OsSPL12 knockout lines [31]. In addition, altered expression levels or expression patterns between the homeologous copies of a given TaSPL further impact their functions. For instance, we observed that TaSPL8-D was specifically and strongly expressed in leaf ligules, followed by TaSPL8-B and TaSPL8-A. Indeed, CRISPR-mediated knockout lines of the individual TaSPL8 homeologous copies have shown that TaSPL8-D plays a determinant role in leaf-ligule development, while TaSPL8-B has only a moderate phenotypic effect, consistent with its expression level [46].

Toward Linking the Functions between OsSPLs and TaSPLs
Rice is the model species for gene function studies in monocot plants. It has several advantages when being used to facilitate the functional study of TaSPL genes: (1) rice has a similar plant architecture to wheat; (2) the orthologous relationships between TaSPLs and OsSPLs are studied and reported herein; (3) rice has extensive gene expression data sets and several well-established expression databases [78][79][80]; (4) extensive functional studies have been reported, using forward genetics, mutants, transgenic, or genome-editing approaches to characterize OsSPL members (as summarized in Table S5); and (5) rice is a monocot species that can be easily transformed and genome-edited with high efficiency, making it a prime model system for heterologously investigating the effects of TaSPL.
In wheat, unlike in the monocot model species rice, genome editing has only recently become possible [85,86], although genetic transformation has been established for decades [81][82][83][84]. The transformation efficiency in wheat has been improved recently by optimizing the Agrobacterium-mediated transformation system [87]. Nevertheless, the transformation efficiency in wheat is not comparable to that in rice, nor have improved wheat transformation systems become widely used as a routine technique yet. Therefore, comparisons between OsSPLs and TaSPLs in terms of gene orthology, expression patterns, and functions may indicate the functions of TaSPL genes, thus helping to prioritize the TaSPL genes for detailed genetic and functional studies. To this end, we collected the known functions of OsSPL genes based on previous reports [5,6,11,13,14,[30][31][32][33][34][35][36][37]44,69] (Table S5). Consistent with the observations that most OsSPL genes are widely expressed in several organs (including leaves, stems, inflorescences, and seeds), with expression preferences in certain tissues and stages (Figure S7), several published investigations have revealed the pleiotropic functions of OsSPLs and the complex functional redundancy between the OsSPL members (summarized in Table S5). For example, transgenic and mutant studies have unveiled that many OsSPLs (i.e., SPL3, 4, 7, 9, 10, 13, 14, 16, 17, and 18) have impacts on plant height, flowering time, panicle structure, and grain development [5,31,33,34,36,37,50]. Similarly, the paleoduplicated pairs of OsSPL genes exhibit overlapping functions with sub-functionalization in some traits, which likely results from the differentiated expression between the members of each SPL pair [31].
Owing to the close relationship between expression patterns and gene functions demonstrated in OsSPLs, the expression patterns of OsSPLs and TaSPLs have been compared in the present study (Figures 2, 3 and S7). In such comparative analyses, we acknowledge that several limitations hamper the direct comparison of expression levels between rice and wheat. The widely used expression data sets in rice have been generated by microarray (Figure S7), while wheat expression has been profiled more recently by using RNA-seq (Figure 2) [78][79][80][88][89][90]. Therefore, we compared the organs where OsSPL and TaSPL genes were preferentially or highly expressed (Figure 3), utilizing the concept that, if a gene shows particularly high expression in a certain tissue, it likely exerts an important function in that tissue. Following this concept, we discovered that OsSPL3 and TaSPL3 shared a similar expression pattern, being highly expressed in leaves, roots, stems, inflorescences, and flower organs (Figures 2, 3 and 5A). As a proof of concept, we sought to validate the function of TaSPL3 in transgenic rice. Indeed, ectopic expression of TaSPL3 in rice affected flowering time, plant height, flag leaf development, and panicle structures, but did not alter tiller number or grain size (Figures 6-8). Similarly, a previous study has demonstrated that OsSPL3 knockout by CRISPR-mediated genome editing also modified plant height, flowering time, and panicle-related traits, supporting the conserved functions between OsSPL3 and TaSPL3 [31]. Other lines of evidence also support the concept that the tissue where an SPL is highly expressed affects its functions. For example, TaSPL8 is specifically expressed in leaf ligules (Figure 2A). Knockout lines of TaSPL8-A, -B, and -D showed that TaSPL8 controls leaf ligule development, with TaSPL8-D having the highest expression level and the largest phenotypic contribution [46].
By contrast, OsSPL8 is expressed in leaves, panicles, and developing seeds (Figure S7). Matching this expression pattern, functional characterization of OsSPL8 showed that it not only controls leaf ligule development, but also affects plant height, panicle size, and grain length [31]. In rice, several forward genetic studies have demonstrated that natural alleles of OsSPLs with elevated and/or ectopic expression confer agronomically desirable traits in different accessions of rice or wild rice [5,10,11,13,36,91,92]. Natural variation in the promoter region leads to decreased expression of OsSPL10 and regulates trichome development in rice cultivars [91]. Natural variation of OsSPL14 in rice causes its deregulation by miR156 and higher expression levels in developing tissues, leading to the Ideal Plant Architecture (IPA) phenotypes with fewer tillers, bigger panicles, and bigger grains [10]. Because the TaSPL-OsSPL comparison described here highlighted gene divergence and differences in expression patterns, we suggest that TaSPLs could provide a novel genetic resource to modify growth, development, and yield in cereal crops.
Another limitation for the comparison between OsSPLs and TaSPLs lies in the miRNAs that regulate SPL genes. In rice, miR156 and miR529 are known to target some OsSPLs [10,36,93], while the expression profiling of miR156 and miR529 has not been reported in wheat. Only a few studies in wheat have annotated and profiled miRNAs [92,94,95]. In particular, molecular characterization of the miR529 family has not been carried out in wheat. Integrated multi-omics analysis combining the expression data sets of TaSPL genes and their regulatory miRNAs (miR156 and miR529) is expected to be indispensable for gaining a thorough understanding of the pleiotropic functions of TaSPLs in the future [96].
Phylogenetic analysis of the SPL gene family was performed using the maximum-likelihood method with 1000 bootstrap replicates, using MEGA X for TaSPLs, AetSPLs, TuSPLs, BdSPLs, OsSPLs, and AtSPLs [99]. The full-length protein sequences of SPLs were used.

Analyses of Sequence Alignment, Protein Domains, and Conserved Motifs
The protein sequences of OsSPLs and the 56 TaSPLs identified in the present study were aligned using ClustalW 2.0, in order to determine the SBP domain [100]. The genomic and cDNA sequences of the 56 TaSPL genes were retrieved from the wheat reference genome, and the exon-intron structures of TaSPLs were analyzed using TBtools [101]. Protein motifs conserved in TaSPLs were identified using MEME [102].

Syntenic Analysis of SPL Genes between Wheat and Rice
The chromosomal locations of the 56 TaSPL genes were visualized using TBtools (Figure 1B). Syntenic relationships between OsSPLs and TaSPLs were established using TriGeneTribe and MCScanX (Figure S2) [103]. Based on this result, the nomenclature of TaSPL genes used in several previous studies of the TaSPL family was compared and then adjusted in the present study to reflect the SPL syntelogous connections between wheat and rice (Table S1) [42][43][44].

Plant Materials
The wheat cultivar Chinese Spring was planted in the experimental field of Huazhong University of Science and Technology (Wuhan, China) for TaSPL gene cloning and expression analysis. Plants of the rice (Oryza sativa L. japonica) cultivar "Nipponbare" grown in a greenhouse were used for rice protoplast preparation and transformation. To study the phenotypic effects of TaSPL3, transgenic lines of rice expressing TaSPL3-6A were generated (see Method Section 4.9). The transgenic lines of rice with ectopic TaSPL3-6A expression or with the TaSPL3pro:uidA expression cassette were also grown under field conditions for molecular and phenotypic studies.

Gene Expression Analysis
Two wheat expression databases, WheatOmics and expVIP, were used to retrieve the gene expression profiles of TaSPL genes [61,89]. To examine the expression patterns of TaSPLs across different tissues and stages during development, three RNA-seq data sets were used: one from wheat cultivar Azhurnaya [62] (the data set "BCS" in Figure 2A), one from the different tissues of developing grain of cv. Chinese Spring [90] (the data set "Dev_Grain" in Figure 2A), and one from the immature inflorescences of cultivar Kenong9204 [88] (the data set "Spikes" in Figure 2A). All of the gene expression data were presented in transcripts per million (TPM), and then converted to z-scores for heatmap visualization using the "pheatmap" function from the R package "COMPASS" (http://www.bioconductor.org/packages/devel/bioc/html/COMPASS.html) (accessed on 22 November 2021). When identifying sub-genome expression biases, only the largest data set in wheat (namely, BCS) was used, in order to avoid potential batch effects between the RNA-seq data sets, and the statistical differences of expression between sub-genomes were calculated by two-tailed Student's t-test (Figure 2B). RNA-seq data and qPCR data from previous studies were used to analyze the expression patterns of TaSPLs in response to abiotic stresses (primers in Table S2; previous results summarized in Table S3) [47,49,83,84].
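As an illustration of the TPM-to-z-score conversion mentioned above, the following is a minimal Python sketch rather than the R workflow used in the study. It assumes that scaling is applied per gene (row-wise) and uses the sample standard deviation, as R's sd() does; the gene names and TPM values are hypothetical.

```python
from statistics import mean, stdev

def zscore_rows(tpm):
    """Convert each gene's TPM values to z-scores across its samples."""
    scaled = {}
    for gene, values in tpm.items():
        mu = mean(values)
        sd = stdev(values)  # sample standard deviation (n - 1 denominator)
        # A gene with constant expression has zero variance; report zeros.
        scaled[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return scaled

# Hypothetical TPM values for illustration only (not data from this study).
tpm = {
    "geneA": [2.0, 4.0, 6.0],
    "geneB": [5.0, 5.0, 5.0],
}
z = zscore_rows(tpm)
```

Row-wise scaling makes the tissue preference of each gene visible in a heatmap regardless of its absolute expression level, which is why it suits comparisons of where, rather than how strongly, a gene is expressed.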
Two rice expression databases, RiceXPro and the rice expression database (RED), were used to retrieve the gene expression profiles of OsSPL genes [79,80]. As several OsSPLs are involved in grain development, four data sets in RiceXPro were chosen, including RXP_0001 (spatial-temporal expression of various tissues throughout entire growth period) [78], RXP_0010 (gene expression profile during reproductive organ development), RXP_0011 (gene expression in grains at early developmental stages), and RXP_0012 (gene expression in embryos and endosperms at ripening stages). These four RiceXPro data sets provided microarray-based gene expression profiles with developmental tissues and stages similar to those of the TaSPL expression data. The expression analysis of rice microarray data sets has been described elsewhere [78]. To facilitate the data comparison and heatmap visualization across several data sets, the microarray-based expression values were z-score transformed and scaled within the range from −1 to 1 (Figure S7A). The RNA-seq-based OsSPL expression data from the RED were used to validate the tissues in which OsSPLs are preferentially expressed (Figure S7B). As many RNA-seq data sets collected in the RED emphasize the expression responses to nutritional elements and to abiotic and biotic stresses, the OsSPL expression data obtained from the RED were not used for comparison with those of TaSPLs. Due to technical difficulties in direct comparison between microarray- and RNA-seq-based expression, we grouped the TaSPL expression data (RNA-seq-based) and OsSPL expression data (microarray-based) first by organs and then by tissues and stages, and generally compared preferentially expressed organs and the relative expression abundance at the organ level (Figure 3).
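The organ-level comparison described above can be sketched as follows. This is only an illustrative Python example under stated assumptions: the text does not specify how values were scaled to the range −1 to 1, so division by the largest absolute value is assumed here, and the sample names, organ labels, and values are hypothetical.

```python
def scale_to_unit_range(zscores):
    """Divide by the largest absolute value so results lie in [-1, 1] (assumed scaling)."""
    peak = max(abs(z) for z in zscores)
    return [0.0 if peak == 0 else z / peak for z in zscores]

def preferred_organ(expr_by_sample, organ_of_sample):
    """Average scaled expression per organ and return the top organ with all means."""
    grouped = {}
    for sample, value in expr_by_sample.items():
        grouped.setdefault(organ_of_sample[sample], []).append(value)
    organ_means = {organ: sum(v) / len(v) for organ, v in grouped.items()}
    return max(organ_means, key=organ_means.get), organ_means

# Hypothetical scaled expression of one gene (illustration only).
samples = ["leaf_seedling", "leaf_flag", "panicle_young", "panicle_mature", "root"]
scaled = scale_to_unit_range([0.4, 0.8, 2.0, 1.2, -2.0])
expr = dict(zip(samples, scaled))
organs = {
    "leaf_seedling": "leaf", "leaf_flag": "leaf",
    "panicle_young": "panicle", "panicle_mature": "panicle",
    "root": "root",
}
top_organ, organ_means = preferred_organ(expr, organs)
```

Collapsing samples to organs before comparing species sidesteps the fact that microarray intensities and RNA-seq TPM values are not on a common scale; only the ranking of organs within each species is compared.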

qPCR-Based Expression Profiling of TaSPLs
The expression profiles of TaSPL genes were analyzed using quantitative PCR (qPCR) in different tissues across the developmental stages and under several abiotic stresses or phytohormone treatments.
Total RNA was extracted using an RNA Extraction Kit (Zomanbio, Beijing, China) and reverse transcribed into cDNA for qRT-PCR as described previously [106]. qRT-PCR was carried out using SYBR Green Master Mix on a CFX96 Real-Time System (Bio-Rad, Hercules, CA, USA), with three biological replicates for each sample or treatment. TaActin (TraesCS1B02G283900) was used as the internal reference gene for qPCR. The qPCR program included pre-denaturation at 95 °C for 10 min and 40 cycles of denaturation at 95 °C for 10 s, annealing at 60 °C for 30 s, and extension at 72 °C for 1 min. The primers used for qPCR are provided in Table S2. To carry out the functional study, TaSPL3-6A was amplified from the cDNA of wheat young spikes with the following PCR program: pre-denaturation at 98 °C for 30 s and 35 cycles of denaturation at 98 °C for 10 s, annealing at 55-65 °C for 15 s, and extension at 72 °C for 1 min (primer sequences in Table S2). The TaSPL3-6A sequence was verified using Sanger sequencing at the AuGCT company (Beijing, China).
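For the qPCR measurements in this section, the quantification method is not stated here beyond the citation to [106]. A commonly used option is the 2^-ΔΔCt method, in which the target gene is normalized to an internal reference gene (TaActin in this study) and then to a control sample. The Python sketch below assumes that method purely for illustration; the function name and all Ct values are hypothetical.

```python
from statistics import mean

def fold_change(ct_target, ct_reference, ct_target_control, ct_reference_control):
    """Relative expression by the 2^-ddCt method (assumed; not stated in the text)."""
    delta_ct_sample = ct_target - ct_reference            # normalize to reference gene
    delta_ct_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_ct_sample - delta_ct_control)   # normalize to control sample

# Hypothetical (target Ct, reference Ct) pairs for three biological replicates.
treated = [(24.0, 18.0), (24.5, 18.5), (23.0, 17.0)]
control = [(26.0, 18.0), (26.0, 18.0), (26.0, 18.0)]

folds = [fold_change(t, r, tc, rc) for (t, r), (tc, rc) in zip(treated, control)]
mean_fold = mean(folds)
```

In this made-up example every treated replicate has a ΔCt two cycles lower than its control, which corresponds to a four-fold higher relative expression.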
The coding region of TaSPL3-6A was fused with GFP ORF through XbaI/BamHI restriction sites to obtain the construct (namely, pSGN-TaSPL3-6A-GFP) for the sub-cellular localization experiment, in which the expression of TaSPL3-GFP was driven by the CaMV35S promoter. An empty vector, pSGN-GFP, was used as the negative control. Protoplasts of rice seedling leaves were prepared. Under an induction of 40% PEG-4000, empty or recombinant plasmids (pSGN-GFP or pSGN-TaSPL3-6A-GFP) were co-transformed into rice protoplasts with the marker plasmid (CFP). The transformed protoplasts were cultured at 28 °C for 8-10 h under dark conditions. A laser confocal microscope (FV1200, Olympus, Valley, PA, USA) was used to detect the sub-cellular localization of GFP proteins or TaSPL3-6A-GFP fusion proteins.

Transactivation Assay
The TaSPL3-6A coding region was truncated into three parts: the N-terminal region (N), the SBP domain (SBP), and the C-terminal region (C). Each of the three parts or the full-length TaSPL3-6A was cloned into the pGBKT7 plasmid through BamHI/NcoI restriction sites, in order to obtain the recombinant constructs pGBKT7-TaSPL3-N, pGBKT7-TaSPL3-SBP, pGBKT7-TaSPL3-C, and pGBKT7-TaSPL3-6A-FL, respectively. According to the manufacturer's protocol (Clontech, Foster City, CA, USA), the abovementioned recombinant constructs, as well as pGBKT7 and the positive control, were transformed into yeast strain AH109. The transformed yeast strains were diluted to different concentrations, then spotted onto SD/-Trp or SD/-Trp/-His/-Ade medium. After culturing for four days, the trans-activation activities of full-length TaSPL3-6A or its fragments were evaluated by the growth of the transformed yeast colonies.

Generation of the Transgenic Rice Lines
To ectopically express TaSPL3 in rice, the open reading frame (ORF) of TaSPL3-6A was fused with the 3-myc tag and then inserted into the Agrobacterium transformation vector pCAMBIA1304, with its expression driven by the CaMV 35S promoter. Rice transformation was performed using the Agrobacterium immersion method with strain EHA105 and calluses induced from cv. Nipponbare immature embryos [107]. To determine transgenic positive events of rice, DNA was extracted from leaves of independent T0 plants, and specific PCR primers were designed to amplify the 383-bp fragment within the selection gene (hygromycin B phosphotransferase gene) or the 2161-bp fragment of the TaSPL3-6A gene. Quantitative PCR analysis was performed to evaluate TaSPL3 expression levels in the leaves of transgenic rice plants in the T0 generation (all primer sequences provided in Table S1). Subsequently, the T0 plants were selected based on the aforementioned PCR and qPCR results, in order to propagate to T1 and T2 lines for phenotypic observation. Additionally, transgenic rice lines transformed with the empty vector pCAMBIA1304 (the vector control lines, VC) were also generated to serve as a negative control.

Phenotypic Analysis of TaSPL3-OE Transgenic Lines
The T2 lines of TaSPL3-OE and control lines (including both non-transformed cv. Nipponbare and VC) of rice were planted in a randomized block field experiment with three replicates at the experimental fields (Wuhan, China). In each plot, three rows of plants were grown with 25-cm row spacing. Within each row, about 15 individual plants were grown with 20-cm plant spacing. Regular field management, including irrigation, fertilization, and insect, disease, and weed control, was applied.
The growth periods for each line of rice, including seedling date, tiller date, stem elongation date, heading date, flowering time, and maturity time, were recorded. In each plot, 10 plants were chosen to measure various agronomic traits, including plant height, flag leaf sizes (leaf length, leaf width, and leaf area), tiller numbers, panicle length, numbers of primary and secondary branches per panicle, grain weight per panicle, grain numbers per panicle, seed-setting percentage per panicle, length, width, and thickness of grains, thousand-grain weight, and yield per plant. The length, width, and thickness of grains were measured using a seed testing instrument (SC-G, Wanshen, Hangzhou, China). The significance of differences among means of agronomic traits was determined using Tukey's honest significant difference test.

Conclusions
In the present study, we performed a comprehensive analysis of the TaSPL family, identified 56 TaSPL genes, and established the orthologous relationship between TaSPLs and OsSPLs. A detailed qRT-PCR analysis pinpointed several TaSPLs, TaSPL2/6/8/10, involved in the tolerance of different abiotic stresses. Our results highlighted the conservation and divergence between TaSPLs and OsSPLs. As a proof of the functional prediction from the dry lab data, we demonstrated that TaSPL3 shares a conserved function with OsSPL3 in regulating plant height, flowering time and panicle-related traits by using transgenic lines of rice. Importantly, our work leads to a clear take-home message that the combination of evolutionary and expression analyses can serve as an efficient approach to transfer the functional knowledge from the monocot model species rice to wheat, helping to gain better understanding of the functions of TaSPLs. The approach exemplified here may also be effective in functional characterization of agronomically important gene families in wheat, such as other transcription factors.

Acknowledgments:
The authors would like to thank the International Wheat Genome Sequencing Consortium (IWGSC), the providers of the publicly available RNA-seq data, and the WheatOmics website (http://202.194.139.32/help/contact.html) (accessed on 10 December 2021) operated by Shengwei Ma from Institute of Genetics and Developmental Biology, CAS. We also express our thanks to Zhenwu He for managing the experimental fields.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.