Genome-Wide Identification and Characterization of SQUAMOSA-Promoter Binding Protein (SBP) Genes Involved in the Flowering Development of Citrus clementina

SQUAMOSA-promoter binding protein (SBP)-box genes encode a family of plant-specific transcription factors that play vital roles in plant growth and development. In this study, 15 SBP-box genes were identified and isolated from Citrus clementina (CclSBPs), 10 of which were predicted to be putative targets of Citrus clementina microRNA156 (CclmiR156). The 15 CclSBP genes could be classified into six groups based on phylogenetic analysis, intron–exon structure, and motif prediction, similar to the SQUAMOSA promoter binding protein-like (SPL) gene family of Populus trichocarpa and Arabidopsis thaliana. Furthermore, CclSBPs within the same group or subgroup have similar gene structures and conserved motifs, implying functional redundancy. Tissue-specific expression analysis of CclSBPs demonstrated their diversified expression patterns. To explore the potential role of CclSBPs in floral induction, the expression dynamics of the 15 CclSBPs were investigated during floral inductive water deficits, and the results showed that some CclSBPs were associated with floral induction. Among these genes, CclSBP6 was not homologous to any member of the Arabidopsis SBP-box gene family, and CclSBP7 was alternatively spliced. Therefore, CclSBP6 and CclSBP7 were transformed into Arabidopsis. Overexpression of the two genes changed the flowering time of Arabidopsis.


Introduction
SQUAMOSA-promoter binding protein (SBP)-box genes are a family of plant-specific transcription factors that play crucial roles in the regulation of plant growth and development [1]. The common feature of SBP genes is that their protein products contain a highly conserved SBP-box DNA-binding domain (approximately 76 amino acid residues). This domain features a zinc finger motif that contains two zinc finger domains. A putative nuclear localization signal (NLS) is located at the C-terminus of the SBP domain and partly overlaps with the DNA-binding domain, particularly with the second zinc finger domain [2,3]. The SBP-box genes were first identified in snapdragon (Antirrhinum

Identification and Molecular Cloning of CclSBPs
The nucleotide and deduced amino acid sequences of 16 SPL genes from Arabidopsis [5] were obtained from the TAIR (The Arabidopsis Information Resource) database [33]. A genome-wide search for CclSBP genes was carried out using Basic Local Alignment Search Tool (BLAST) analyses, with the 16 AtSPL genes used as queries against the C. clementina genome [34]. All non-redundant protein sequences of the putative citrus SBP-box genes were checked for the SBP domain using the protein families database (Pfam) [35]. To verify the coding regions of CclSBPs, gene-specific primers were designed for amplification of the 15 CclSBP genes using polymerase chain reaction (PCR) with cDNA templates from leaves of C. clementina (Supplementary Table S1). The PCR was performed in a 20 µL reaction volume in a Veriti 96-Well Thermal Cycler. Amplified products were separated by 1.0% agarose gel electrophoresis. The target bands were recovered, cloned into pMD18-T vectors (TaKaRa Biotech, Dalian, China), and then transformed into the Escherichia coli strain DH5α. Three positive clones were sequenced for each candidate SBP-box gene.

Phylogenetic Analysis of CclSBPs
Multiple alignments of the SBP-box protein sequences were performed using the ClustalW program [36]. The sequence logo was obtained using the online WebLogo platform [37]. The phylogenetic trees were generated in MEGA 6.0 using the maximum likelihood (ML) algorithm [38]. Bootstrap analysis with 1000 replicates was used to evaluate the significance of the nodes. The Jones-Taylor-Thornton (JTT) model was used to ensure that the divergent domains could contribute to the topology of the ML tree [38].

Gene Structure, Conserved Domain and miR156 Target Site Analysis of CclSBPs
The exon/intron organization of CclSBPs was determined by comparing the coding sequences to their corresponding genomic sequences using the Gene Structure Display Server (GSDS) program [39]. The Simple Modular Architecture Research Tool (SMART) and Multiple Em for Motif Elicitation (MEME) were used to identify the conserved motif structures of the SBP-box protein sequences [40,41]. The sequence of miR156 (miR156a and miR156b) was determined in a previous work on C. trifoliata [42]. The targets of miR156 were predicted by searching the coding regions, as well as the 3′-UTRs, of all the CclSBP genes using the psRNATarget tool with default parameters, as in Reference [43].
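The idea behind miRNA target-site prediction can be illustrated with a naive complementarity scan. The sketch below is not the psRNATarget scoring scheme used in this study; it only slides a window along a transcript and counts mismatches against the reverse complement of the miRNA. The miR156 sequence, the toy transcript, the function names, and the mismatch cut-off are all assumptions made for illustration.

```python
# Illustrative only: a naive complementarity scan, not the psRNATarget algorithm.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (RNA 'U' is read as 'T')."""
    dna = seq.upper().replace("U", "T")
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def find_target_sites(mirna: str, transcript: str, max_mismatches: int = 2):
    """Yield (start, site, mismatches) for windows that could pair with the miRNA."""
    target = reverse_complement(mirna)  # what a perfectly paired site would look like
    dna = transcript.upper().replace("U", "T")
    for start in range(len(dna) - len(target) + 1):
        window = dna[start:start + len(target)]
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatches:
            yield start, window, mismatches


# Assumed mature miR156 sequence; substitute the citrus sequence from Reference [42].
MIR156 = "UGACAGAAGAGAGUGAGCAC"
# Hypothetical transcript fragment with one embedded site carrying a single mismatch.
toy_transcript = "ATGGCC" + "GTGCTCTCTCTCTTCTGTCA" + "TTGA"

hits = list(find_target_sites(MIR156, toy_transcript))
for start, site, mismatches in hits:
    print(start, site, mismatches)
```

Dedicated tools additionally weight mismatch positions, allow G:U pairs, and consider target accessibility, so a mismatch count alone should be read only as a first-pass filter.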

Identification of the Specific Repetitive Elements of CclSBPs
To determine whether specific repetitive elements drive the sequence divergence of CclSBPs, tandem repeats (TRs), transposable elements (TEs), low-complexity repeats (LCRs), and simple sequence repeats (SSRs) were investigated in the region extending from 1.5 kb upstream of the 5′-UTR to the 3′-UTR of each gene. TRs were searched using Tandem Repeat Finder 4.04 [44] with default parameter values. TEs and LCRs were detected using RepeatMasker [45]. SSRs were examined using the web-based software SSRIT [46]. Putative cis-elements in the promoter regions of CclSBPs were annotated using the PlantCARE software [47]. The motifs putatively involved in plant growth and development, hormone responses, light response, and stress responses were summarized in this study.
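As an illustration of what an SSR search looks for, the following minimal sketch finds perfect tandem repeats of short motifs with a regular expression. It is not the SSRIT implementation; the motif-length range, the minimum repeat count, and the function name are assumptions chosen for the example.

```python
import re


def find_ssrs(sequence: str, min_motif: int = 2, max_motif: int = 6, min_repeats: int = 4):
    """Return (start, motif, repeat_count) for perfect tandem repeats of short motifs."""
    seq = sequence.upper()
    # A motif of min_motif..max_motif bases (shortest tried first), followed by
    # at least (min_repeats - 1) further copies of itself.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        results.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return results


# Hypothetical sequence: five copies of the dinucleotide AG flanked by other bases.
print(find_ssrs("GGT" + "AG" * 5 + "CCAT"))
```

Because the shortest motif is tried first, a homopolymer run such as AAAAAAAA would be reported here as repeats of "AA"; a production tool would normally treat mononucleotide runs separately.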

Co-Expression Analysis of CclSBPs During Floral Inductive Water Deficits
The published transcriptome of the lemon bud provided a useful complement for understanding the expression patterns of the CclSBPs during floral inductive water deficits [32]. A co-expression network of the CclSBPs was constructed in which gene pairs with a correlation above 0.85 were retained, and the network was visualized in Cytoscape as in Reference [48]. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of genes in the co-expression network were conducted in R, using the packages "clusterProfiler" and "pathview" as described in References [49,50]. The predicted protein interaction network of CclSBP proteins in citrus was built according to knowledge obtained from Arabidopsis.
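The thresholding step can be sketched as follows. The text does not state which correlation measure was used, so Pearson correlation is an assumption here, as are the function names and the toy expression values; the resulting edge list is the kind of table that can be imported into a network viewer.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.85):
    """Return (gene_a, gene_b, r) for every gene pair whose correlation exceeds the threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if r > threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Hypothetical expression profiles across three stages (not data from the study).
toy_expression = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [2.0, 4.1, 6.0],
    "geneC": [3.0, 1.0, 2.0],
}
edges = coexpression_edges(toy_expression)
print(edges)
```

Only positive correlations pass this filter; if negatively correlated pairs should also be kept, the comparison would use the absolute value of r instead.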

Real-Time Polymerase Chain Reaction (RT-PCR)
Total RNA was isolated using the Oligotex mRNA mini kit (Qiagen, Gaithersburg, MD, USA), according to the manufacturer's instructions. The RNA preparation was treated with DNase I (Promega, Madison, WI, USA), and first strand synthesis of the cDNA was performed using RT Primer Mix and Primescript RT Enzyme Mix I (life technology). Three biological replicates were performed in this study. To normalize the variance among samples, the expression level of the citrus β-actin was used as the internal control. The program was performed as described previously in [51]. The gene-specific primers for real-time polymerase chain reaction (RT-PCR) of CclSBPs were listed in Table S1.

Arabidopsis Transformation
The full-length cDNA sequences were ligated into pBI121 driven by CaMV35S, and then transferred into the Agrobacterium strain GV3101. Arabidopsis (Col) was transformed using the floral dip method as in Reference [52]. The third generation of transgenic lines and wild type were grown under long-day conditions. Plants were arranged in a randomized complete block design with three blocks. Each genotype within a block was represented by five plants. Thus, 15 plants from each genotype were observed. Flowering time was measured by counting the number of rosette leaves and the number of days from sowing to when the first flower bud was seen. Statistical analysis was performed by one-way analysis of variance (ANOVA), taking p < 0.05 as significant.
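For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as in the sketch below. This is a generic textbook calculation, not the authors' analysis script; the function name and the flowering-time values are made up for illustration, and the block factor of the experimental design is ignored here, as it is in a one-way model.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up days-to-flowering values for a wild type and two transgenic lines.
wild_type = [28, 29, 27, 28, 29]
line_1 = [36, 35, 37, 36, 35]
line_2 = [38, 39, 37, 38, 40]
f_stat, df_between, df_within = one_way_anova_f([wild_type, line_1, line_2])
print(f_stat, df_between, df_within)
```

The p-value is then obtained from the F distribution with these degrees of freedom, for example with a statistics package such as SciPy (`scipy.stats.f_oneway`).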

Identification, Cloning and Sequence Feature Analysis of CclSBPs
A total of 15 unique genes containing the SBP motif were identified after removing the redundant sequences, and they were named CclSBP1 to CclSBP15 based on their location on seven scaffolds (equivalent to seven chromosomes). The length of the CclSBPs varied from 1247 (CclSBP1) to 7069 bp (CclSBP15), and the length of the coding sequences varied from 393 (CclSBP1) to 3309 bp (CclSBP12). Furthermore, the number of transcripts of each CclSBP gene and the percentage of the expressed sequence tags (ESTs) that match the database are shown in Table 1. CclSBPs were unevenly distributed across seven chromosomes: chromosomes 3 and 4 displayed four genes, chromosomes 2 and 9 had three genes, and chromosomes 1, 5, and 6 only showed one gene (Table 1). Only one conserved domain (the SBP domain) was shared by all the CclSBPs. The SBP domains of the CclSBPs were very similar, with high conservation at certain positions (Figure 1a). All of the CclSBPs shared two zinc finger-like structures (Zn-1, Zn-2) and a highly conserved bipartite nuclear localization signal (NLS). For all the CclSBPs, the second zinc finger within the SBP domain was CysCysHisCys, whilst the first zinc finger was CysCysCysHis, except in CclSBP15, where the first zinc finger was CysCysCysCys (Figure 1b). Notably, the second zinc finger-like structure of all the CclSBPs partly overlapped with the nuclear localization signal. To verify the CclSBP gene models obtained from computational prediction, we cloned and sequenced the coding regions of all the CclSBPs (Figure 1c). Clear single bands were amplified for all CclSBP genes except CclSBP7. Three positive clones were sequenced for each gene, and the complete cDNA of each gene was submitted to GenBank (KT601172-KT601186). Subsequently, the cloned cDNAs were aligned with the coding sequences from the C. clementina genome database, and no differences were found in eight CclSBPs (CclSBP2, CclSBP4, CclSBP5, CclSBP6, CclSBP7, CclSBP8, CclSBP10, and CclSBP12). A single nucleotide polymorphism (SNP) site was found in each of CclSBP1, CclSBP3, CclSBP11, and CclSBP13; three SNP sites were found in CclSBP15; 36 SNP sites were found in CclSBP14; and one 15 bp deletion was found in CclSBP9.

Gene Structure, Conserved Domain and miR156 Target Site Analysis of CclSBPs
To provide further insights into the evolutionary relationship of CclSBPs, a phylogenetic tree was constructed based on the citrus full-length SBP-box protein sequences. The 15 SBP-box genes were clustered into different groups based on the evolutionary relationship (Figure 2a). The exon/intron organization in the coding sequences of each CclSBP gene was also analyzed, and the number of introns in the coding region of the 15 CclSBPs varied from 1 to 9 (Figure 2b). Most closely related members shared similar exon/intron structures in terms of intron number and exon length. In addition to the SBP domain, other conserved motifs could also be important for the functioning of SBP-box proteins. Hence, we further searched for conserved motifs using the Multiple Em for Motif Elicitation (MEME) online server, applying an e-value cut-off of 10^-10. As a result, eight conserved motifs were discovered among the 15 CclSBP proteins (Figure 2c). The number of motifs in each CclSBP varied from 1 to 8 (Figure 2c). Motif 1 corresponded to the SBP domain and was present in all the CclSBPs. However, the putative functions of the other motifs are currently unknown and need to be further investigated. As expected, most of the closely related members had common motif compositions, such as CclSBP4 and CclSBP9, CclSBP8 and CclSBP10, CclSBP5 and CclSBP14, and CclSBP1 and CclSBP3. This probably implies functional similarities in plant growth and development amongst the CclSBPs that share similar gene structures and conserved motifs.
To understand the miR156-mediated posttranscriptional regulation of CclSBPs, miR156 target sites were predicted, and a total of 10 CclSBPs were predicted to be targets of miR156 (Figure 2d). The target sites of miR156 were located in the coding regions for five CclSBPs (CclSBP4, CclSBP6, CclSBP9, CclSBP11, and CclSBP13) and two CclSBPs (CclSBP8 and CclSBP10) belonging to two different subgroups. The target sites for the other three CclSBPs (CclSBP1, CclSBP2, and CclSBP3), which belong to the same subgroup, were located in the 3′-UTR region. This suggested that miR156-mediated posttranscriptional regulation of SBP-box genes was conserved in these plant species.

Comparative and Phylogenetic Analysis of the SBP-Box Genes in Various Plants
Previous studies and a growing number of fully sequenced plant genomes make it possible to perform a comparative genomic analysis of the SBP-box gene family across a wide range of plant species. In this study, we analyzed some of the major types of model organisms whose genomes have already been sequenced, including green algae, chlorophytes, moss, lycophyte, eudicots and monocots (Figure 3a). The number of SBP-box genes has been identified in 15 plant species, and the SBP genes of seven plant species from the Phytozome database (Medicago truncatula, Glycine max, Fragaria vesca, Brassica rapa, Carica papaya, Citrus sinensis and C. clementina) were checked in our study. We constructed a phylogenetic tree and displayed the duplication events of these 22 species (Figure 3a). In brief, the number of SBP-box genes in algal, lowland species, monocots, and dicots offered further insight into the evolutionary processes of this gene family.
To further investigate the evolutionary relationship between citrus, Arabidopsis, poplar and rice, a total of 79 SBP-box genes were used for phylogenetic analysis based on their conserved SBP domains (Figure 3b). These SBP genes were also clustered into six groups, each of which contained at least one AtSPL and one CclSBP. The number of citrus, Arabidopsis, Populus and rice SBP genes varied across the six groups (the CclSBPs in group 1 to group 6 had 4, 1, 4, 2, 1, 3 members, respectively; the PtSPLs in group 1 to group 6 had 7, 2, 7, 4, 2, 6 members, respectively; the OsSPLs in group 1 to group 6 had 5, 3, 2, 6, 1, 2 members, respectively; and the AtSPLs in group 1 to group 6 had 2, 1, 3, 5, 1, 4 members, respectively). Interestingly, the SBP domain of the SBP-box genes in group 5 was divergent from those of the other groups (Figure 3b). The N-terminal zinc finger of group 5 SBP-box genes has four cysteine residues in the SBP domain, while SBP-box genes in other groups mainly contain three cysteines and one histidine. Based on the phylogenetic tree, the miR156-targeted SBP-box genes, including sequences from rice, Arabidopsis and poplar, were distributed into only three of the groups (Groups 1, 3 and 4). Generally, most of the CclSBPs showed a closer phylogenetic relationship with PtSPLs than with AtSPLs, and they showed the most distant phylogenetic relationship with OsSPLs. These results indicated that orthologous genes between woody plants showed higher similarities than those between woody plants and herbaceous plants, and orthologous genes between dicots had a higher similarity than those between monocots and dicots. Additionally, it is worth noting that the clusters group together Arabidopsis, rice, poplar, and citrus genes, which are closer to their orthologous counterparts from the other species than to the other family members from their own species.
These results indicated that the SBP-box family of genes was present in the ancestor plants that gave rise to the monocot and dicot lineages, making it possible to estimate the minimum number of SBP-box genes in this ancestor.

Specific Repetitive Elements in the CclSBPs
To characterize the sequence divergence of CclSBPs, the spatial distribution of repetitive sequences with respect to the genomic position of the CclSBPs was also examined. We investigated the distribution of four types of repetitive sequences that are frequently found in promoter and coding regions (TR, TE, LCR, and SSR). The results indicated that all the CclSBPs had repetitive sequence insertions and that the repetitive sequences were frequent in the promoter and genomic DNA regions (Figure 4a). The frequency of each type of repetitive sequence differed among the 15 CclSBP genes. SSRs were the most numerous and were present in all the CclSBPs except CclSBP1. TRs were also present in the genomic regions of all the CclSBPs, except CclSBP8 and CclSBP11. LCRs were found in only eight CclSBP genes. It is worth noting that TEs were not found in any of the CclSBP sequences (Figure 4a).
Cis-elements play important roles in the regulation of gene transcription during plant growth, development, and stress responses. To understand the transcriptional regulation mechanisms, the cis-elements in the promoter regions of CclSBP genes were identified through the PlantCARE database (Figure 4b). In addition to the common cis-acting elements, such as the CAAT-box and TATA-box, many cis-elements were identified in the promoter regions of the 15 CclSBP genes (Table S2). According to their putative functions, these elements were categorized into four classes. Light-responsive elements were the most numerous and were present in the promoter regions of all the CclSBPs. Hormone-responsive elements, plant growth and development elements, and stress-responsive elements were also present in the promoter regions of all the CclSBP genes. In addition, other rarely distributed cis-elements in the CclSBPs were found to be functionally involved in transcription regulation, circadian control, protein binding, and stress responsiveness. Therefore, the transcription of the CclSBP genes could be regulated by various environmental and developmental changes, which implies that CclSBP genes are involved in important physiological processes and developmental events. Furthermore, the distribution of cis-elements showed little similarity among the CclSBP genes, even among those in the same phylogenetic group.

Analysis of the Expression Patterns of the CclSBPs Gene Family
To preliminarily elucidate the roles of CclSBPs in citrus growth and development, we examined the relative expression levels of 15 CclSBPs in 11 different tissues (Figure 5a). All the CclSBPs were detected in at least one of the tissues examined, but a differential expression was observed. The expression data showed a high variability in the transcript abundance of CclSBPs in various tissues and organs, strongly indicating the diversified functions of CclSBPs in citrus growth and development. It is worth noting that many genes have shown similar expression patterns, such as CclSBP4 and CclSBP9, CclSBP8 and CclSBP10, CclSBP12 and CclSBP14, and CclSBP2 and CclSBP3 belonging to the same subgroup (Figure 2a), indicating their redundant functions. Nevertheless, the expression patterns of a few similar gene pairs, including CclSBP1 and CclSBP3, CclSBP6 and CclSBP13, and CclSBP5 and CclSBP14 belonging to the same subgroup (Figure 2a) are distinct. This suggests that these genes may play different roles in citrus growth and development, although they have high sequence similarity.
To explore the potential role of CclSBPs during floral inductive water deficits, we used RNA sequencing data from a previous study on lemon buds at three stages (stage 1: one week before water deficit; stage 2: one week after the beginning of water deficit; and stage 3: one week after release from water deficit) [32]. We classified the 15 CclSBPs into four clusters based on the similarity of their expression patterns (Figure 5b). Cluster 1 genes (including CclSBP3 and CclSBP12) were induced immediately at stage 1 and mostly maintained high expression levels at stage 3 (Figure 5b). These genes were significantly induced at the beginning of the water deficit, indicating that this gene cluster might play a key role in the necessary growth and development of citrus. Cluster 2 genes (including CclSBP11 and CclSBP15) were suppressed immediately at stage 1 and mostly maintained low expression levels at stage 3 (Figure 5b). Cluster 3 genes (including CclSBP2, CclSBP6, CclSBP7, CclSBP8, CclSBP9, CclSBP10, CclSBP12, and CclSBP13) were transiently suppressed at stage 2 and were then induced at stage 3 (Figure 5b). This cluster showed up-regulated expression at later stages of treatment, suggesting that these genes are involved in flowering and the recovery of vegetative growth. Cluster 4 genes (including CclSBP1, CclSBP5, and CclSBP14) were transiently induced at stage 2 and were then suppressed at stage 3 (Figure 5b). The suppression of this gene cluster may imply involvement in the drought stress response, floral induction, and flower bud differentiation of lemon. A total of four genes (CclSBP4, CclSBP7, CclSBP8, and CclSBP13) were considered to be differentially expressed, using a probability ≥ 0.8 and an absolute log2 ratio ≥ 1 as the threshold. These findings suggested that these genes may play important roles during the floral inductive water deficit process.
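The differential-expression criterion stated above combines two cut-offs, which can be written as a simple filter. The sketch below only restates that rule; the function names and the example records are hypothetical and are not values from the study.

```python
def is_differentially_expressed(probability, log2_ratio,
                                min_probability=0.8, min_abs_log2=1.0):
    """Apply both cut-offs: probability >= 0.8 and |log2 ratio| >= 1."""
    return probability >= min_probability and abs(log2_ratio) >= min_abs_log2


def filter_degs(records):
    """Keep the gene IDs whose (probability, log2_ratio) pass both thresholds."""
    return [gene for gene, (prob, lfc) in records.items()
            if is_differentially_expressed(prob, lfc)]


# Hypothetical values for illustration only; they are not measurements from the study.
toy_records = {
    "gene_up": (0.92, 1.6),
    "gene_down": (0.85, -1.2),
    "gene_small_change": (0.95, 0.4),
    "gene_low_probability": (0.60, 2.3),
}
selected = filter_degs(toy_records)
print(selected)
```

Taking the absolute value of the log2 ratio means that both a two-fold increase and a two-fold decrease satisfy the fold-change criterion.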
Figure 5b. Cluster analysis of the expression profiles of CclSBPs at three stages (stage 1: one week before water deficit; stage 2: one week after the beginning of water deficit; and stage 3: one week after release from the water deficit), as shown in a previous study [32]. Each column represents a sample, and each row represents a single citrus transcript sequence. The bar represents the scale of relative expression levels of differentially expressed genes (DEGs), and the colors indicate relative signal intensities.

Co-Expression Analysis of the CclSBPs Under Floral Inductive Water Deficits Conditions
To uncover the possible roles that CclSBPs play in citrus, a co-expression analysis was performed under the floral inductive water deficit conditions. The 15 CclSBPs, together with 1638 differentially expressed genes (DEGs), formed a co-expression network with 20,450 interactions (edges). The co-expression network naturally clustered into two modules containing certain CclSBPs (Figure 6a). Module 1 was the largest module in the co-expression network and contained six CclSBPs (CclSBP2, CclSBP4, CclSBP7, CclSBP8, CclSBP9, and CclSBP13). Genes from module 2 were uniquely co-expressed with CclSBP14. Gene ontology (GO) term analysis of the CclSBP-centered co-expression network uncovered the possible roles of CclSBPs in the processes of growth and stress response (Figure 6b). For instance, the genes from the co-expression network were enriched in the terms "response to water/water deprivation", "response to wounding", "regulation of meristem development", and "photosynthesis", as well as in terms related to hormone synthesis and metabolism (Figure 6b). Because the co-expression network naturally separated into two modules, we examined the specific roles of each module. The GO term analysis further confirmed that the genes from the different modules had distinct functions involved in different aspects of plant growth and development.
Figure 6. Co-expression network analysis of the CclSBPs using the data from a previous study [32]. (a) The CclSBP-centered gene co-expression network under floral inductive water deficit conditions. (b) The GO biological process terms that were significantly enriched in the CclSBP-centered network.

Phenotypes of CclSBP6 Over-Expression in Arabidopsis
CclSBP6 did not exhibit homology with SPL gene family members of Arabidopsis (Figure 3b), and we speculated that this gene may perform special functions during the growth and development of citrus. Therefore, the function of CclSBP6 was investigated by introducing it into Arabidopsis (Figure 7). Twenty-three transgenic plants were obtained in the T1 generation, and all of the lines flowered later than their wild-type counterparts. For further analysis of CclSBP6 function, three independent transgenic lines in the third generation were selected for phenotypic observation (Figure 7g). The three CclSBP6 transgenic lines flowered significantly later than the wild-type plants in terms of both days to flowering and the number of leaves. The average time to flowering of the transgenic plants ranged from 35.8 to 38.4 days, while that of the wild-type plants was 28.2 days (Figure 7h). The average number of leaves at flowering ranged from 16.0 to 16.8 in the transgenic plants, and was 11.8 in the wild-type plants (Figure 7g). In addition, the CclSBP6 transgenic plants showed multiple morphological changes, such as smaller flowers, slender leaves, shorter siliques, and extended root systems under long days (Figure 7c-f).
To further elucidate the physiological functions of CclSBP6 during the flowering of transgenic Arabidopsis, we analyzed the abundance of Arabidopsis early flowering related genes at 1-cm inflorescence stages of the wild-type. The levels of Arabidopsis endogenous FLOWERING LOCUS T (FT) and SPL2/3/4/5/9 transcripts were clearly reduced in the transgenic lines (Figure 7j). These data suggested that CclSBP6 may act as a floral repressor and might be involved in citrus flowering. Previously, transgenic Arabidopsis expressing citrus miR156a was developed [53], and over-expression of citrus miR156a resulted in an extended juvenile phase in transgenic plants compared to the control plants. Because over-expression of CclSBP6 and of citrus miR156a produced the same phenotype in transgenic Arabidopsis, we speculated that CclSBP6 may not be a target gene of miR156a in citrus.

Alternative Splicing and Functional Analysis of CclSBP7 in Transgenic Arabidopsis
The expression of CclSBP7 was further analyzed using RT-PCR, with primers designed according to the open reading frame (ORF) of CclSBP7 from the citrus database; this annotated transcript was named CclSBP7α. Templates were prepared from the mRNA of adult plants. However, three bands were observed in the RT-PCR analysis (Figure 1c). These amplification products were recovered and sequenced. In addition to CclSBP7α, two different transcripts were isolated. When these cDNA sequences were compared, the three CclSBP7 transcripts showed high identity with each other. They shared the same 5′- and 3′-UTRs, whereas the region from the transcription initiation site (TSS) to position 739 of the ORF was strongly divergent among the transcripts owing to nucleotide deletions and insertions. To investigate whether these different transcripts were due to alternative splicing or were transcribed from different genes, we isolated the full-length genomic DNA of CclSBP7. Sequence analysis revealed that these CclSBP7 transcripts came from the same genomic DNA, which was about 2 kb in size and had three introns and three exons with reference to the nucleotide sequence of CclSBP7α. CclSBP7β contained an ORF of 1098 nucleotides because of retention of the first intron. CclSBP7γ contained an ORF of 651 nucleotides because of the partial deletion of the first exon (Figure 8a). Conceptual translation of the three CclSBP7 transcripts yielded three unique peptide sequences: CclSBP7α (329 aa), CclSBP7β (365 aa), and CclSBP7γ (216 aa).
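The reported ORF and peptide lengths can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that each reported ORF length includes one terminal stop codon (this is not stated in the text); the function name `peptide_length` and the dictionary keys are hypothetical and are not part of the original analysis.

```python
def peptide_length(orf_nt: int) -> int:
    """Return the number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumption (not stated in the text): the ORF length includes
    one terminal stop codon, which does not encode an amino acid.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_nt // 3 - 1


# ORF lengths (in nucleotides) reported for the two alternative transcripts.
reported_orfs = {"CclSBP7beta": 1098, "CclSBP7gamma": 651}

# Predicted peptide length for each transcript under the assumption above.
peptides = {name: peptide_length(nt) for name, nt in reported_orfs.items()}
print(peptides)
```

Under this assumption, the computed values (365 and 216 amino acids) agree with the peptide lengths reported for CclSBP7β and CclSBP7γ, and the 329-aa CclSBP7α peptide would correspond to an ORF of 990 nucleotides.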
To further analyze the function of CclSBP7, three transgenic lines were randomly selected for each alternatively spliced transcript (Figure 8e). We selected 15 T3 plants for each transgenic line. The CclSBP7α, CclSBP7β, and CclSBP7γ transgenic lines flowered significantly earlier than the control plants in terms of both the number of leaves and days to flowering (Figure 8b). For CclSBP7α, CclSBP7β, and CclSBP7γ, the average time to flowering ranged from 23.7 to 27.6 days in six transgenic lines, whereas that of the control plants was 29.5 days (Figure 8f). The average number of leaves at flowering ranged from 8.4 to 10.9 in the transgenic lines, and was 12.7 in the control plants (Figure 8g). In addition, the transgenic siliques were approximately 80% as long as the siliques of the control plants (Figure 8c-d). To evaluate the possible relation between the expression of CclSBP7 and the early flowering phenotype of the transgenic Arabidopsis, the expression of some endogenous Arabidopsis flowering-related genes was also assessed at the 1-cm inflorescence stage of transgenic Arabidopsis. The levels of FT, FRUITFULL (FUL), APETALA1 (AP1), and LEAFY (LFY) transcripts were clearly elevated in the transgenic plants compared to the wild-type (Figure 8h). These findings further supported our conclusion that the early flowering phenotype of the transgenic Arabidopsis was attributable to the expression of CclSBP7. Meanwhile, these data suggested that CclSBP7 may act as a floral activator and might be involved in citrus flowering.
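To put the reported flowering-time ranges on a common scale, the shift of each range endpoint relative to the control can be expressed in absolute and percentage terms. The sketch below uses only the means quoted above; the helper name `shift_vs_control` is hypothetical, and the calculation is purely descriptive (it is not a statistical test and was not part of the original analysis).

```python
def shift_vs_control(value: float, control: float) -> tuple:
    """Return (absolute difference, percent difference) relative to the control."""
    diff = value - control
    return diff, 100.0 * diff / control


# Mean values quoted in the text: (lowest line mean, highest line mean, control mean).
traits = {
    "days to flowering": (23.7, 27.6, 29.5),
    "leaves at flowering": (8.4, 10.9, 12.7),
}

for trait, (low, high, control) in traits.items():
    for value in (low, high):
        diff, pct = shift_vs_control(value, control)
        # Negative values mean earlier flowering or fewer leaves than the control.
        print(f"{trait}: {value} vs {control} -> {diff:+.1f} ({pct:+.1f}%)")
```

By this arithmetic, the transgenic lines flowered about 1.9 to 5.8 days earlier than the control (roughly 6% to 20%) and carried about 1.8 to 4.3 fewer leaves at flowering.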

Discussion
The SBP-box genes are plant-specific transcription factors encoding proteins that contain a highly conserved SBP domain. This domain can specifically bind to the promoters of floral meristem identity genes, and SBP-box proteins play significant regulatory roles in plant growth and development, including sporogenesis, leaf development, vegetative and reproductive phase transitions, response to copper and fungal toxins, and hormone signaling [4,5,15,21–23,25,54]. In a previous study, Shalom et al. identified the members of the SBP gene family in citrus and studied their seasonal expression patterns in buds and leaves, and in response to de-fruiting [30]. Compared with that study, the present work provides a more comprehensive overview of the SBP-box gene family, including the gene structures, phylogeny, chromosome locations, conserved motifs, and cis-elements in the promoter sequences. We also performed a functional analysis of some members of the SBP gene family in Arabidopsis. The roles of SPLs and miR156 as regulators of flowering have been extensively studied in Arabidopsis [15,17,24]. However, considerably less research has been done with woody plants. In the current work, similar to other plants, about two-thirds of the CclSBPs contained sequences that were complementary to miR156. Furthermore, in three (CclSBP1, CclSBP2, and CclSBP3) of these 10 citrus SBP-box genes, which encode relatively short proteins, the sequence complementary to miR156 was located in the 3′-UTR, whereas in the other seven members the miR156-binding site was located in the exon region, consistent with previous reports on Arabidopsis [15,55]. These results provide a basis for elucidating the functions of the SBP-box genes in citrus.
It is believed that the SBP-box gene family has undergone gene duplications in many plants, such as Arabidopsis [55], rice [8], maize [9], Populus [10], grape [13], and apple [12]. Fewer SBP-box genes were identified in citrus than in Arabidopsis, even though the genome of Citrus clementina (367 Mb) is about three-fold larger than that of Arabidopsis (125 Mb). This finding was similar to the results of a previous study that analyzed the CCCH gene family [51]. Evidence suggests that citrus is rather ancient and has an infrequent reproductive cycle in some taxa due to apomixis, male or female sterility, long juvenility, and vegetative propagation [56]. These properties may greatly contribute to the restriction of genome expansion and evolution. Likewise, previous studies have demonstrated that there were no whole genome duplication events (WGDs) in citrus except an ancient triplication, called the γ event, which was shared by all the core eudicots [56]. In contrast, additional recent WGDs have occurred in Arabidopsis, rice, maize, Populus, and apple. Therefore, recent WGDs are likely to be the reason for the larger number of SBP-box genes in these plants. It is noteworthy that CclSBP6 and CclSBP7 appeared to be putative tandem duplicated genes based on the comprehensive analysis of gene locations and sequence properties. Several SBP-box gene pairs in citrus shared high sequence similarity and were likely to be putative segmental duplicated genes (CclSBP4/9, CclSBP8/10, and CclSBP5/14). The phylogenetic tree demonstrated that most citrus SBP-box genes clustered more tightly with Populus and Arabidopsis SBP-box genes than with rice SBP-box genes. This is consistent with the fact that citrus, Populus, and Arabidopsis are dicots and diverged more recently from a common ancestor than plants from the lineage leading to monocots. These results indicate that although plant SBP-box genes may be derived from a common ancestor and appeared after the divergence of plants and animals, many of them have undergone distinct patterns of differentiation and have played different roles after the separation of each lineage.
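The contrast between genome size and SBP-box gene number can be made explicit with a short calculation. This is a minimal sketch using the figures quoted in this paper (15 CclSBPs, 16 AtSPLs, and genome sizes of 367 Mb and 125 Mb); the variable names and the "genes per 100 Mb" measure are illustrative assumptions, not part of the original analysis.

```python
# Genome sizes (Mb) and SBP-box gene counts as quoted in this paper.
genomes = {
    "Citrus clementina": {"size_mb": 367, "sbp_genes": 15},
    "Arabidopsis thaliana": {"size_mb": 125, "sbp_genes": 16},
}

# Ratio of genome sizes (citrus relative to Arabidopsis).
size_ratio = (
    genomes["Citrus clementina"]["size_mb"]
    / genomes["Arabidopsis thaliana"]["size_mb"]
)

# SBP-box genes per 100 Mb of genome, an illustrative density measure.
density = {
    species: 100.0 * info["sbp_genes"] / info["size_mb"]
    for species, info in genomes.items()
}

print(f"genome size ratio: {size_ratio:.2f}")
for species, value in density.items():
    print(f"{species}: {value:.2f} SBP-box genes per 100 Mb")
```

The size ratio of about 2.9 matches the "three-fold" description, while the citrus genome carries roughly one-third as many SBP-box genes per 100 Mb as Arabidopsis, which fits the argument that gene number here tracks duplication history rather than genome size.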
To further reveal the possible roles of CclSBP genes, we constructed a phylogenetic tree based on 15 CclSBPs, 16 AtSPLs, and several SBP-box genes whose functions have been characterized in other plant species (Figure S1). The phylogenetic tree was further divided into eight groups, and the functionally characterized members in each group were annotated using different colored circles. Within group I, all members contained the miR156-binding site, aside from AmSBP1, for which the presence of a miR156-binding site is unknown. AmSBP1 and AtSPL3 have also been reported to bind cis-elements in the promoters of the floral organ identity genes SQUAMOSA and APETALA1 (AP1), respectively [4], and AtSPL3/4/5, as well as AmSBP1, have all been implicated in the vegetative phase change and floral induction [4,18]. Another SBP-box gene within this group, CNR, has been reported to be pivotal for normal fruit ripening in tomato [27]. Surprisingly, a recent study has shown that CclSBP1 was able to promote flowering independently of the photoperiod in Arabidopsis, while miR156 repressed its flowering-promoting activity [30]. Meanwhile, CclSBP1, CclSBP2, and CclSBP3 are closely related to these functionally characterized genes, and they showed high expression levels during flowering. These results strongly suggest that CclSBP1, CclSBP2, and CclSBP3 have significant roles in citrus flower and/or fruit development.
In contrast to group I, the members of group II were relatively large and lacked negative regulation by miR156. Only PpSBP2 and AtSPL14 have been functionally characterized within this group. Although PpSBP2 plays important roles in the regulation of copper homeostasis in Barbula unguiculata [57], it showed a more distant relationship with the other members from citrus and Arabidopsis than AtSPL14 did. AtSPL14 has been reported to play significant roles in plant architecture [25]. As the ortholog of AtSPL14, CclSBP12 probably has a similar function. Within group III, all the SBP-box members contained the miR156-binding site. CclSBP6, CclSBP13, AtSPL9, AtSPL15, and OsSPL14 clustered closely together in one subgroup. In Arabidopsis, AtSPL9 and AtSPL15 play redundant roles in vegetative phase change and reproductive transition [20]. In rice, OsSPL14 can repress vegetative branching and promote inflorescence branching [26]. Recently, a new SBP-box gene (NbSPL6) was identified in Nicotiana benthamiana and shown to be essential for N-mediated resistance to the Tobacco mosaic virus. Similarly, AtSPL6 functions in resistance to the bacterial pathogen Pseudomonas syringae expressing the AvrRps4 effector [58]. Therefore, CclSBP4 and CclSBP9 may have a similar function because of their close phylogenetic relationship with these genes in the other subgroup.
Although they lack regulation by miR156, the members of group IV were characterized by functional diversity. Two SBP-box genes, PpSBP1 and PpSBP4, fell into one subgroup and were involved in regulating phase change and the circadian clock in moss [59]. In the other subgroup, CclSBP7, AtSPL8, OsSPL8, and ZmLG1 clustered closely together. AtSPL8 was the first SBP-box gene to be functionally characterized in Arabidopsis [22]. Although mutations of AtSPL8 did not affect phase change, they had a profound effect on seed set, petal trichome production, root growth, and male fertility [22–24]. In our study, the sequence analysis of several CclSBP7 clones obtained using RT-PCR resulted in the discovery of three alternatively spliced transcripts. Transgenic Arabidopsis over-expressing these transcripts flowered earlier than the control. A previous study indicated that CclSBP7 played a role in the floral inductive water deficit process. These results further suggested that CclSBP7 acts as a floral inducer in citrus. One possible explanation for this observation is that the regulatory mechanism of CclSBP7 differs between Arabidopsis and woody plants. Within group V, CclSBP8 and CclSBP10 were highly expressed in the leaf and were homologous to AtSPL13, teosinte glume architecture 1 (tga1), and OsSPL16. AtSPL13 has been shown to affect the initiation of the first true leaves [54], maize tga1 is involved in ear glume development [60], and OsSPL16 controls grain size, shape, and quality in rice [61]. Therefore, we hypothesized that CclSBP8 and CclSBP10 may function in controlling the characteristics of leaves and/or fruit.
Surprisingly, only CclSBP11 was classified together with AtSPL2, AtSPL10, and AtSPL11 into group VI. AtSPL10/11/2 are involved in the development of lateral organs, the shape of the cauline leaves, and the number of trichomes on cauline leaves and flowers [19]. The spatial expression results showed that CclSBP11 was expressed at relatively higher levels in the leaves. Therefore, CclSBP11 probably has functions in citrus similar to those of the three Arabidopsis SBP-box genes. Within group VII, AtSPL7 and its ortholog CclSBP15 were characterized by their large size and lack of a miR156-binding site. Evidence reveals that AtSPL7 can bind to the Cu-response element (CuRE) and is involved in copper homeostasis [62]. Furthermore, COPPER RESPONSE REGULATOR 1 (CRR1) of Chlamydomonas reinhardtii was the only member classified in group VIII, and it is homologous to AtSPL7. Similar to AtSPL7, CRR1 recognizes and binds to the GTAC core sequence of CuRE in Chlamydomonas reinhardtii, and it has a similar function in the copper response [7]. In addition, CclSBP15 also exhibited responsiveness to abiotic stress, being up-regulated by drought. These results suggest that CclSBP15 probably has a similar function in copper homeostasis. The phylogenetic tree helped to predict the putative functions of the CclSBP genes based on the functions of SBP-box proteins from other species clustered in the same group. However, these findings provide only indirect information for understanding the function and regulatory mechanism of CclSBPs in citrus. Further efforts will be made to find more direct evidence, including ectopic expression of the different transcripts by transformation in citrus.

Conclusions
Functional analyses have shown that SBP-box genes play crucial roles in the regulation of plant growth and development, especially in terms of flowering time, meristem identity and architecture, and fruit development. Citrus, as a perennial fruit tree, has distinct growth habits and quantitative traits in contrast to annual plant species. Comparative analysis of the SBP-box genes among various plants suggests that CclSBPs probably play some unique roles and have undergone distinct evolutionary processes. In the present work, a total of 15 CclSBPs were identified from the whole genome sequence of Citrus clementina, and the complete cDNAs of all the members were isolated. Subsequently, a clear nomenclature was provided for them based on their chromosomal locations. A comprehensive analysis including gene structures, phylogenetic relationships, conserved motifs, expression patterns, and miR156-mediated post-transcriptional regulation was conducted. We also performed a functional analysis of some CclSBPs in Arabidopsis. The results showed that over-expression of these genes changed the flowering time of the transgenic plants.
The results provide useful information on CclSBPs, which should facilitate further research on elucidating the functions of SBP-box genes in citrus.