Molecular Characterization of the miR156/MsSPL Model in Regulating the Compound Leaf Development and Abiotic Stress Response in Alfalfa

Plant leaf patterns and shapes are spectacularly diverse. Changing the complexity of leaflet numbers is a valuable approach to increase its nutrition and photosynthesis. Alfalfa (Medicago sativa) is the most important forage legume species and has diversified compound leaf patterns, which makes it a model species for studying compound leaf development. However, transcriptomic information from alfalfa remains limited. In this study, RNA-Seq technology was used to identify 3746 differentially expressed genes (DEGs) between multifoliate and trifoliate alfalfa. Through an analysis of annotation information and expression data, SPL, one of the key regulators in modifiable plant development and abiotic stress response, was further analyzed. Here, thirty MsSPL genes were obtained from the alfalfa genome, of which 16 had the putative miR156 binding site. A tissue expression pattern analysis showed that the miR156-targeted MsSPLs were divided into two classes, namely, either tissue-specific or widely expressed in all tissues. All miR156-targeted SPLs strongly showed diversification and positive roles under drought and salt conditions. Importantly, miR156/MsSPL08 was significantly suppressed in multifoliate alfalfa. Furthermore, in the paralogous mutant of MsSPL08 isolated from Medicago truncatula, the phenotypes of mutant plants reveal that miR156/MsSPL08 is involved not only involved the branches but also especially regulates the number of leaflets. The legume is a typical compound leaf plant; the ratio of the leaflet often affects the quality of the forage. This study sheds light on new functions of SPL genes that regulate leaflet number development.


Introduction
Each step of biological and developmental progress requires appropriate transformation and expression changes in plants, such as the formation of organs, the division of cells, and the production of metabolites [1,2]. Leaves are the major component of plant architecture, and the main tissue of photosynthesis, thermoregulation, and gas exchange. Plant leaf forms are spectacularly diverse, and can be grouped into simple leaves and compound leaves. Compound leaves exhibit more diversity than simple leaves, particularly because of a specific morphogenetic process [3]. Alfalfa is the most widely cultivated legume for livestock, with leaves having a typical trifoliate pattern [4]. As one of the forages with the highest digestible protein, 70% of the protein is stored in the leaves, and the cellulose content in the leaves is only 1/3 of the stem [5]. Therefore, studying the molecular mechanism of alfalfa multifoliate formation can provide a strong theoretical basis for creating new alfalfa germplasm resources with high protein content.

Plant Material and Abiotic Stress Treatment
Two first-generation backcross populations were constructed from a cross between multifoliate alfalfa "PL34HQ", and trifoliate alfalfa "Huaiyin". A single F 1 plant from the cross was used as the female parent, trifoliate alfalfa "Huaiyin" as a recurrent parent to create the BC 1 backcross populations. Then choose the offspring with multifoliate trait as the female parent, "Huaiyin" alfalfa as a father to create the BC 2 backcross populations. Finally, we obtained two genetic strains: multifoliate (pinnate pentafoliate leaves in most cases) and trifoliate alfalfa, which were used as RNA sequencing materials. The leaflets were collected from the apical meristem of five vegetative growth stage plants in May 2019 (active stage of physiological activity), then mix them as one biological replicate. Three biological replicates were applied for each leaf type, all sampled leaflets were rizol Kit (Invitrogen, Gaithersburg, MD, USA) immediately frozen in liquid nitrogen and stored at −80 • C for RNA isolation. A M. truncatula mtspl mutant (ecotype R108: NF10482-4) was isolated from a tobacco (Nicotiana tabacum) Tnt1 retrotransposon-tagged mutant collection by forward genetic screening [26]. The MtSPL4 gene has one exon with 423 nucleotides. The MtSPL4 gene was tested using the following primers designed based on MtSPL4: NF10482_F: TTCACCTGTGTTCTTGTGCAT, NF10482_R: TGGGACATTTGATCACTACTCTT, LTR6-R: GCTACCAACCAAACCAAGTCAA.
Alfalfa seeds (cultivar Gannong No. 9) were sterilized in 1.0% (v/v) sodium hypochlorite and germinated at 25 • C for 5 days. Then the uniform seedlings were cultivated into the sand culture at 25 • C, 50% relative humidity, and 16-h light/8-h dark conditions. After 30 days of cultivating, the uniform growth seedlings were classified into two groups: one was keeping the culture conditions as control, the other one was subjected to 20% PEG-6000 and 300 mM NaCl for 1, 2, and 4 days. Three biological replicates were applied for each time point. After treatment, the young root was immediately frozen in liquid nitrogen and stored at −80 • C for RNA isolation.

RNA Isolation, Illumina Library Construction, and Sequencing
Total RNA from each sample was separately isolated for Trizol Kit (Invitrogen, Gaithersburg, MD, USA). In total, six RNA samples were obtained, representing the two leaf types from five single plants. RNA was quantified and quality-checked for each sample using one percent of agarose gels and Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Furthermore, the cDNA libraries were constructed according to the manufacturer's instructions of the NEBNext UltraTM RNA Library Prep Kit for Illumina (NEB, Shanghai, China). The cDNA library was sequenced using the HiseqXTen platform at Novogene (Beijing, China). FastQC raw data were processed using Trimmomatic (Version 0.39) to obtain clean reads by removing reads containing N base and adapter, low-quality bases from reads 3 to 5 and 5 to 3 (Q value < 20), and the bases whose tail mass value is less than 20 (the window size is 5 bp).

Bioinformatics Analysis of RNA-Seq Data
The genome assembly and annotation files (Version 3) of alfalfa from the figshare were used as the reference [6]. The TPM (Transcripts Per Million) method was used to calculate expression patterns through StringTie software (version 1.3.3b). Differentially expressed genes (DEGs) were conducted using the DESeq2 R package, and the q-Value < 0.05, FoldChange > 2 were defined as DEGs. The DEGs Gene Ontology (GO) and the Kyoto Encyclopedia (KEGG) enrichment analyses were implemented using the clusterProfiler (q-Value < 0.05). The software of HISAT2 was used to map the clean reads onto the cultivated alfalfa (cultivar XinJiangDaYe) genome.

Identification of Alfalfa SPL Genes
Alfalfa genome data were derived from the Figshare website (https://figshare.com/articles/ dataset/genome_fasta_sequence_and_annotation_files/12327602, accessed on 9 July 2021). Ini-tially, the conserved SBP domain (PF03110) based on Hidden Markov Model (HMM) was retrieved from the PFAM database (http://pfam.xfam.org/, accessed on 9 July 2021) and was employed to search the alfalfa protein sequence with the threshold of e < 1 × 10 −5 . Furthermore, a TBLASTN homology search was performed against the alfalfa genome using Arabidopsis, rice, and M.truncatula SPL proteins, to identify additional SPLs. For the SPL with splice variants, only the primary one was kept and redundant sequences were removed using the CD-HIT tool (http://weizhongli-lab.org/cd-hit/, accessed on 9 July 2021). The protein length, molecular weight, grand average of hydropathy (GRAVY), and theoretical isoelectric point (pI) of each MsSPL protein were calculated within the ExPasy program (http://www.expasy.org/, accessed on 12 July 2021).

SPL Gene Bioinformatics Analysis in Alfalfa
The physical location of the MsSPL was displayed to chromosomes using TBtools software. The exon/intron structures of MsSPLs were analyzed in the GSDS 2.0 (http: //gsds.gao-lab.org/, accessed on 12 July 2021). Conserved motifs of MsSPL proteins were identified using the MEME (http://meme-suite.org/tools/meme, Version 5.2.0) (accessed on 12 July 2021), with the maximum number of the motif is ten. The SPL protein sequences of MtSPLs (24), AtSPLs (16), and OsSPLs (19) were downloaded from the Phytozome 12 database (https://phytozome.jgi.doe.gov/pz/portal.html, accessed on 12 July 2021). Multiple protein sequence alignment was conducted using the ClustalW program. An unrooted phylogenetic tree was constructed with the maximum likelihood method and 1000 bootstrap by MEGA 7.0 software. Collinear blocks were evaluated by MCScan. The syntenic relationship between the MsSPL genes and SPL genes from A. thaliana, Oryza sativa, M. truncatula, and Glycine max were determined by using TBtools [27]. Furthermore, the synonymous substitution rate (Ks) and nonsynonymous substitution rate (Ka) ratios were calculated by the software of TBtools. The divergence date (T) of collinear gene pairs was estimated using the formula T = Ks/2λ, where λ was 1.5 × 10 −8 in dicots [28].

Prediction of the miR156 Target Site and Promoter Cis-Element
The MtmiR156 sequences were obtained from the miRBase (http://www.mirbase.org/, accessed on 20 July 2021). Then the psRNATarget server (http://plantgrn.noble.org/ psRNATarget/, accessed on 20 July 2021) was employed to predict potentially targeted by miR156 in alfalfa CDS and 3 UTRs regions of all MsSPLs for complementary sequences, with the maximum expectation of three.
The 2.0 kb upstream sequence of the predicated MsSPL genes were used to analyze the putative stress or hormone-responsive cis-acting regulatory elements in PlantCARE online database [29].

RNA Isolation and qRT-PCR Verification
Total RNA was isolated using Sangon UNIQ-10 column Trizol total RNA extraction kit (TaKaRa, Dalian, China). cDNA synthesis, qRT-PCR reactions, and data analysis were performed according to previous studies [31], the actin gene used to evaluate the relative fold differences based on the comparative Ct method. The gene-specific primers were designed by the Primer3 software (Table S1).

Statistical Analysis
In the initial sampling portion of the entire experiment, we used three independent biological replicates. RNAseq analysis, SPL genes identify and bioinformatics analysis described in the Section 2 part. In addition, the significance between control and treatment groups were evaluated using Duncan's multiple range test in SPSS 20. Graphs were generated with Photoshop 7.

Illumina Sequencing, Mapping, and Annotation
To deeply understand the compound leaf development of alfalfa at the transcriptome level, after filtering out low-quality reads, approximately 0.245 billion clean reads were generated. The GC content of clean data ranged from 43.84% to 45.09%, and the Q30 ranged from 92.70% to 97.19%, with an average of 93.90% from six RNA samples (Table 1). Overall, the average total mapped, multi mapped and uniquely mapped clean read ratios were 91.89%, 58.12%, and 33.77%, respectively (Table 1). In total, 164,632 genes with 251,850 transcripts were detected with a maximum of 13,074 bp, and the mean length was 757.78 bp with an N50 of 1135 kb ( Figure 1a). The functional annotation of all genes aligned with six commonly used public databases, which revealed that the number of genes ranged from 88,116 (53.52%, KEGG) to 163,176 (99.12%, NR), and 44,239 (27%) genes were annotated in all six databases (Table S2).

Functional Annotation and Transcription Factor Identification
Among 3746 DEGs, 1633 DEGs were upregulated and 2113 DEGs were downregulated between multifoliate and trifoliate alfalfa (Figure 1b). A GO classification summarized all DEGs into three main categories including 64 functional groups ( Figure S1). In the biological process (BP) category, "rhythmic process" (105) and "response to stimulus" (1130) were highly represented groups. Within the cellular component (CC) category, the "extracellular region" (321) was the most abundant group, followed by "nucleoid" (8). For the molecular function (MF) category, "catalytic activity" (1772), "nutrient reservoir activity" (34), and "transporter activity" (244) were the top 3 highly represented groups. All genes were further assigned to the KEGG and KOG databases. A total of 495 genes were assigned to 184 KEGG pathways, and the top 30 significant enrichment pathways are displayed in Figure 2a. "Fructose and mannose metabolism" (26, 5.25%) was the most significantly enriched pathway, followed by "Cysteine and methionine metabolism" (28, 5.66%), "Biosynthesis of amino acids" (36, 7.27%), and "Ubiquinone and another terpenoid-quinone biosynthesis" (12, 2.42%). Furthermore, 2028 genes were annotated in 23 individual KOG functional classes, with "Amino acid transport and metabolism" (162, 7.99%) being the most significant enrichment term, followed by "Lipid transport and metabolism" (149, 7.35%), "Carbohydrate transport and metabolism" (179, 8.83%), and "Inorganic ion transport and metabolism" (103, 5.08%) (Figure 2b). generated. The GC content of clean data ranged from 43.84% to 45.09%, and the Q30 ranged from 92.70% to 97.19%, with an average of 93.90% from six RNA samples (Table  1). Overall, the average total mapped, multi mapped and uniquely mapped clean read ratios were 91.89%, 58.12%, and 33.77%, respectively (Table 1). In total, 164,632 genes with 251,850 transcripts were detected with a maximum of 13,074 bp, and the mean length was 757.78 bp with an N50 of 1135 kb (Figure 1a). The functional annotation of all genes aligned with six commonly used public databases, which revealed that the number of genes ranged from 88,116 (53.52%, KEGG) to 163,176 (99.12%, NR), and 44,239 (27%) genes were annotated in all six databases (Table S2).   32.52% Note: Q30 the probability that the base was miscalled is 0.1%, GC content the percentage of guanine and cytosine out of four bases. T: trifoliate, M: Multifoliate.

Functional Annotation and Transcription Factor Identification
Among 3746 DEGs, 1633 DEGs were upregulated and 2113 DEGs were downregulated between multifoliate and trifoliate alfalfa (Figure 1b). A GO classification summarized all DEGs into three main categories including 64 functional groups ( Figure S1). In the biological process (BP) category, "rhythmic process" (105) and "response to stimulus" (1130) were highly represented groups. Within the cellular component (CC) category, the "extracellular region" (321) was the most abundant group, followed by "nucleoid" (8). For the molecular function (MF) category, "catalytic activity" (1772), "nutrient reservoir activity" (34), and "transporter activity" (244) were the top 3 highly represented groups. All genes were further assigned to the KEGG and KOG databases. A total of 495 genes were assigned to 184 KEGG pathways, and the top 30 significant enrichment pathways are displayed in Figure 2a. "Fructose and mannose metabolism" (26, 5.25%) was the most significantly enriched pathway, followed by "Cysteine and methionine metabolism" (28, 5.66%), "Biosynthesis of amino acids" (36, 7.27%), and "Ubiquinone and another terpenoid-quinone biosynthesis" (12, 2.42%). Furthermore, 2028 genes were annotated in 23 individual KOG functional classes, with "Amino acid transport and metabolism" (162, 7.99%) being the most significant enrichment term, followed by "Lipid transport and metabolism" (149, 7.35%), "Carbohydrate transport and metabolism" (179, 8.83%), and "Inorganic ion transport and metabolism" (103, 5.08%) (Figure 2b). TFs are master regulators of gene expression by binding to their promoter regions. A total of 227 DEGs were identified as TFs and classified into 29 gene families ( Figure S2). Among the recognized TFs, bHLH has the most members at 19, followed by the ERF (ethylene response factor: 17) (17), NAC (NAM, ATAF, and CUC: 15), MYB_related (V-myb myeloblastosis viral oncogene homologue: 14), and GATA (14) families. Compared with upregulated TFs, most of them were downregulated, such as ERF, DBB (double B-box), and Dof (DNA binding with one finger), which were all downregulated, and NAC and SPL, which had only one and two members, respectively, were upregulated. Among the various TF members, SPL genes are post-transcriptionally regulated by miR156 and play roles in plant morphology and development. This study found that the expression of SPLs was significantly suppressed in multifoliate alfalfa. Thus, the genome-wide characterization of the SPL genes in alfalfa was further analyzed.

Identification of MsSPL Genes in Alfalfa
A total of 30 SPL putative genes were identified in alfalfa after removing redundant sequences. Furthermore, a uniform nomenclature was assigned to these genes based on the order of their chromosomal locations ( Table 2). Among the 16 SPL genes in Arabidopsis, AtSPL15 and AtSPL16 no orthologues in alfalfa; AtSPL2 has the maximum orthologues (five); AtSPL13 has four paralogues; AtSPL1, AtSPL6, and AtSPL14 have three paralogues; AtSPL2, AtSPL10, AtSPL11, and AtSPL12 have two paralogues; and remaining genes have one ( Table 2). Among the 30 MsSPL proteins, the relative molecular weights varied from 16.58 kDa (MsSPL08) to 113.32 kDa (MsSPL08), and from 141 (MsSPL08) to 1025 (MsSPL05 and MsSPL07) amino acids (aa) in length, with 21 members showing a pI > 7 and the remaining showing a pI < 7. The GRAVY index of all MsSPLs below zero indicated that all of the deduced polypeptides were hydrophilic.

Phylogenetic, Gene Structure, Conserved Motif, and Domain Analyses
To study the evolutionary correlations of MsSPLs, an unrooted phylogenetic tree was created using the full-length SPL proteins of M. truncatula, Arabidopsis, alfalfa, and rice by the maximum likelihood algorithm in MEGA 7.0, which divided the 30 MsSPL proteins into eight distinct groups ( Figure 3). As expected, the SPL proteins from legumes generally exhibited closer relationships, than those from rice (monocot). Except for subfamily VI SPLs from dicots, the remaining subfamilies contained four species. The phylogenetic analysis suggests that the SPL genes have the conserved characteristics of dicots and monocots.

Phylogenetic, Gene Structure, Conserved Motif, and Domain Analyses
To study the evolutionary correlations of MsSPLs, an unrooted phylogenetic tree was created using the full-length SPL proteins of M. truncatula, Arabidopsis, alfalfa, and rice by the maximum likelihood algorithm in MEGA 7.0, which divided the 30 MsSPL proteins into eight distinct groups ( Figure 3). As expected, the SPL proteins from legumes generally exhibited closer relationships, than those from rice (monocot). Except for subfamily VI SPLs from dicots, the remaining subfamilies contained four species. The phylogenetic analysis suggests that the SPL genes have the conserved characteristics of dicots and monocots. To examine the diversity in the MsSPLs, 10 conserved motifs were predicted ( Figure  4a). The number of conserved motifs in 30 MsSPL proteins varied from 1 to 9. Generally, the MsSPL proteins that shared similar motif compositions tended to cluster together. All MsSPLs contained motif 1, which represented two Zn-finger-like structures, and NLS. In addition, some groups were found to share specific motifs. For instance, motifs 3, 7, 8, and 9 are found exclusively in Group I, and groups II, III, and VII only contain only motif 1. To examine the diversity in the MsSPLs, 10 conserved motifs were predicted (Figure 4a). The number of conserved motifs in 30 MsSPL proteins varied from 1 to 9. Generally, the MsSPL proteins that shared similar motif compositions tended to cluster together. All MsSPLs contained motif 1, which represented two Zn-finger-like structures, and NLS. In addition, some groups were found to share specific motifs. For instance, motifs 3, 7, 8, and 9 are found exclusively in Group I, and groups II, III, and VII only contain only motif 1.
Gene structural features are known to play a crucial role in the evolution of gene families. We analyzed the intron/exon distribution, and the results showed that the number of introns in the MsSPL genes ranged from one to ten (Figure 4b). Additionally, we found that nearly 66.7% of the MsSPL genes had fewer than four introns. Compared with other groups, Group I contained more introns, which ranged from eight to ten.

Multiple Sequence Alignment and Prediction of the miR156 Target Site of MsSPLs
The detailed domain structures of all MsSPLs and five previously reported protein sequences were determined. As a result, the specific SBP domain is shared by the MsSPL members ( Figure S3a). The MsSPL genes shared 76 highly conserved amino acid residues, including two Zn finger-like structures and one NLS. Compared with C3H, C2HC was more conserved in all MsSPLs.
The previous study has isolated a miR156 precursor from alfalfa by PCR (MsmiR156d: CACGAGTGAGAGAAGACAGT). BLAST search results showed that the isolated sequence was similar to miR156d precursor in M. truncatula [16]. Then we compared the sequence of miRNA156 among different species, the result solidly confirms that plant miRNA156 are evolutionarily conserved Thus, ten mtr-miR156 (mtr-miR156a-j) genes were obtained from the M. truncatula genome to predicate miR156 target site of MsSPLs. Furthermore, a multiple sequence alignment was performed, which produced five types of mature miRNA sequences as shown in Figure S3b. A comparison of the miR156 mature sequences to the MsSPL transcript sequences indicated that 16 out of 30 MsSPL are complementary to the miR156 mature sequences, with a maximum of one to three mismatches which suggests that miR156 may specifically target these genes in alfalfa. Among 16 MsSPL genes specifically targeted by miR156, 15 MsSPL genes contained target sites in their coding regions, and MsSPL08 contained a miR156 target site in the 3 -UTR. MsSPL18 has two miR156 target sites. The miR156-targeted SPL genes from 11 target OsSPL and 10 target AtSPL genes were distributed into six out of eight groups (I-VIII), and most of them clustered into Group V.  Gene structural features are known to play a crucial role in the evolution of gene families. We analyzed the intron/exon distribution, and the results showed that the number of introns in the MsSPL genes ranged from one to ten (Figure 4b). Additionally, we found that nearly 66.7% of the MsSPL genes had fewer than four introns. Compared with other groups, Group I contained more introns, which ranged from eight to ten.

Multiple Sequence Alignment and Prediction of the miR156 Target Site of MsSPLs
The detailed domain structures of all MsSPLs and five previously reported protein sequences were determined. As a result, the specific SBP domain is shared by the MsSPL members ( Figure S3a). The MsSPL genes shared 76 highly conserved amino acid residues, including two Zn finger-like structures and one NLS. Compared with C3H, C2HC was more conserved in all MsSPLs.
The previous study has isolated a miR156 precursor from alfalfa by PCR (MsmiR156d: CACGAGTGAGAGAAGACAGT). BLAST search results showed that the isolated sequence was similar to miR156d precursor in M. truncatula [16]. Then we compared the sequence of miRNA156 among different species, the result solidly confirms that plant miRNA156 are evolutionarily conserved Thus, ten mtr-miR156 (mtr-miR156a-j) genes were obtained from the M. truncatula genome to predicate miR156 target site of MsSPLs. Furthermore, a multiple sequence alignment was performed, which produced five types of mature miRNA sequences as shown in Figure S3b. A comparison of the miR156 mature sequences to the MsSPL transcript sequences indicated that 16 out of 30 MsSPL are complementary to the miR156 mature sequences, with a maximum of one to three mismatches

Chromosomal Distributions, Synteny, and Evolutionary Analyses of the MsSPL
The physical positions of 30 MsSPL genes were mapped onto 22 allelic chromosomes, which represented an unbalanced distribution ( Figure 5). Chromosome 4 contained the largest number of MsSPL genes (7;~23.33%), whereas no gene was distributed on chromosome 6. Recently, a study showed that a round of whole-genome duplication (WGD) events were estimated to occur approximately 58 Myr ago in the alfalfa genome [6]. Among the 30 MsSPL genes, one gene pair of MsSPL12/MsSPL13 displayed tandem duplications and 14 segmentally duplicated pairs located on duplicated segments on 5 chromosomes ( Figure 5). largest number of MsSPL genes (7; ~23.33%), whereas no gene was distributed on chromosome 6. Recently, a study showed that a round of whole-genome duplication (WGD) events were estimated to occur approximately 58 Myr ago in the alfalfa genome [6]. Among the 30 MsSPL genes, one gene pair of MsSPL12/MsSPL13 displayed tandem duplications and 14 segmentally duplicated pairs located on duplicated segments on 5 chromosomes ( Figure 5).  Figure  S4). These results indicate that the MsSPL genes were highly conserved and that the  Figure S4). These results indicate that the MsSPL genes were highly conserved and that the MsSPL genes were closer to the legume than to other plants. Additionally, some orthologous gene pairs were specifically mapped between alfalfa and legumes plants, including MsSPL01, MsSPL02, MsSPL03, MsSPL05, MsSPL07, MsSPL25, MsSPL28, MsSPL29, and MsSPL30, which may show that these orthologous pairs formed after the divergence of legume plants. These results indicated that MsSPL genes were closer to the legume than to others.
MsSPL genes were closer to the legume than to other plants. Additionally, some orthologous gene pairs were specifically mapped between alfalfa and legumes plants, including MsSPL01, MsSPL02, MsSPL03, MsSPL05, MsSPL07, MsSPL25, MsSPL28, MsSPL29, and MsSPL30, which may show that these orthologous pairs formed after the divergence of legume plants. These results indicated that MsSPL genes were closer to the legume than to others. Furthermore, the Ka, Ks, Ka/Ks, and duplication event data of the orthologous gene pairs were estimated through Nei and Gojobori's method (Table S3). All MsSPL orthologous gene pairs had Ka/Ks < 0.8, which suggests that the MsSPL genes might have experienced strong purifying selective pressure during evolution. The latest duplication event occurred between MsSPL12/13, which might occur ~0.11 million years ago, and all duplication events occurred after WGD events.

Cis-element Analysis of the Promoter
Previous studies have shown that SPL is involved in plant growth processes and abiotic stress responses. To further forecast the potential regulatory mechanisms of MsSPL, five hormone-related, four abiotic stress-related, and four tissue development-related elements were discovered and are displayed in Figure 7. The numbers in MsSPL04 and MsSPL30 are the greatest, and have the most cis-elements (16). MsSPL07 contains only two abscisic acid responsiveness elements. Notably, all of the MsSPL genes contained at least one plant hormone-responsive element, including 27 (90%), 18 (60%), 16 (53%), 16 (53%) and 14 (47%) MsSPLs that had one or more abscisic acid responsive, auxin responsive, salicylic acid responsive and gibberellin responsive elements. The hormone-related elements were broadly distributed, which indicates that these genes may be regulated by phytohormones. Additionally, abiotic stress-related elements were enriched in the promoter region, and 47% and 33% of the members contained at least one MYB binding site involved in drought-induced ability (MBS) and low-temperature response (LTR) elements, which speculates that the expressions of these MsSPLs largely participates in signal perception and abiotic stress responses. Furthermore, some plant growth and development- Furthermore, the Ka, Ks, Ka/Ks, and duplication event data of the orthologous gene pairs were estimated through Nei and Gojobori's method (Table S3). All MsSPL orthologous gene pairs had Ka/Ks < 0.8, which suggests that the MsSPL genes might have experienced strong purifying selective pressure during evolution. The latest duplication event occurred between MsSPL12/13, which might occur~0.11 million years ago, and all duplication events occurred after WGD events.

Cis-element Analysis of the Promoter
Previous studies have shown that SPL is involved in plant growth processes and abiotic stress responses. To further forecast the potential regulatory mechanisms of MsSPL, five hormone-related, four abiotic stress-related, and four tissue development-related elements were discovered and are displayed in Figure 7. The numbers in MsSPL04 and MsSPL30 are the greatest, and have the most cis-elements (16). MsSPL07 contains only two abscisic acid responsiveness elements. Notably, all of the MsSPL genes contained at least one plant hormone-responsive element, including 27 (90%), 18 (60%), 16 (53%), 16 (53%) and 14 (47%) MsSPLs that had one or more abscisic acid responsive, auxin responsive, salicylic acid responsive and gibberellin responsive elements. The hormone-related elements were broadly distributed, which indicates that these genes may be regulated by phytohormones. Additionally, abiotic stress-related elements were enriched in the promoter region, and 47% and 33% of the members contained at least one MYB binding site involved in droughtinduced ability (MBS) and low-temperature response (LTR) elements, which speculates that the expressions of these MsSPLs largely participates in signal perception and abiotic stress responses. Furthermore, some plant growth and development-related elements, including the differentiation of palisade mesophyll cells, seed-specific regulation, meristem expression, and endosperm expression elements, were also identified. The expression of MsSPLs with different cis-elements illustrates that MsSPL may play vital roles in regulating alfalfa growth and stress response. related elements, including the differentiation of palisade mesophyll cells, seed-specific regulation, meristem expression, and endosperm expression elements, were also identified. The expression of MsSPLs with different cis-elements illustrates that MsSPL may play vital roles in regulating alfalfa growth and stress response.

Transcriptome and qRT-PCR Analysis Revealed Diverse Expression Patterns of MsSPL
Expression profiles were mined from a comprehensive profiling study of different alfalfa tissues [30]. As shown in Figure 8a, all MsSPLs were expressed in at least one tissue, and nearly half of them exhibited tissue specificity. According to expression profiles of MsSPL members, which were divided into two groups. The first group were widely expressed across six tissues, with expression levels ranging from 0 to 63, except for MsSPL08, MsSPL16, and MsSPL19 were not detected in roots and nodules. The second group showed relatively low expression levels in most tissues, which were highly induced in a specific tissue; for example, most MsSPLs were expressed in flowers and elongating stem internodes, but not detected in leaf, nodule and root tissues, including MsSPL09, MsSPL10, MsSPL20 and MsSPL29. To explore the functions of MsSPLs in the regulation of leaflet development, a compare transcriptome between multifoliate and trifoliate alfalfa were conducted as shown in Figure 8b. Most of MsSPLs were not, or only lightly induced in multifoliate alfalfa, but the expression levels of MsSPL08 and MsSPL25 were significantly suppressed, and MsSPL28 was significantly induced in multifoliate alfalfa.

Transcriptome and qRT-PCR Analysis Revealed Diverse Expression Patterns of MsSPL
Expression profiles were mined from a comprehensive profiling study of different alfalfa tissues [30]. As shown in Figure 8a, all MsSPLs were expressed in at least one tissue, and nearly half of them exhibited tissue specificity. According to expression profiles of MsSPL members, which were divided into two groups. The first group were widely expressed across six tissues, with expression levels ranging from 0 to 63, except for MsSPL08, MsSPL16, and MsSPL19 were not detected in roots and nodules. The second group showed relatively low expression levels in most tissues, which were highly induced in a specific tissue; for example, most MsSPLs were expressed in flowers and elongating stem internodes, but not detected in leaf, nodule and root tissues, including MsSPL09, MsSPL10, MsSPL20 and MsSPL29. To explore the functions of MsSPLs in the regulation of leaflet development, a compare transcriptome between multifoliate and trifoliate alfalfa were conducted as shown in Figure 8b. Most of MsSPLs were not, or only lightly induced in multifoliate alfalfa, but the expression levels of MsSPL08 and MsSPL25 were significantly suppressed, and MsSPL28 was significantly induced in multifoliate alfalfa.
To verify the functions of miR156 target MsSPL in response to drought ( Figure S5a) and salt ( Figure S5b) stresses, 16 MsSPLs were used to examine the expression profiles under PEG and salt treatments by using qRT-PCR. All miR156 target MsSPLs responded to PEG and salt stresses, but the response intensities and speeds were different. The majority of genes were upregulated on the fourth day of the PEG and NaCl treatments. Under drought conditions, all genes were upregulated, except for two (MsSPL16 and MsSPL19) that were downregulated on the 1st day, and MsSPL23 was downregulated on the 2nd day. After NaCl treatment for one day, three of the MsSPL genes (MsSPL08, MsSPL10, and MsSPL11) were significantly upregulated. After NaCl treatment for two and four days, most of them were greatly upregulated. Genes 2022, 13, x FOR PEER REVIEW 13 of 18 To verify the functions of miR156 target MsSPL in response to drought ( Figure S5a) and salt ( Figure S5b) stresses, 16 MsSPLs were used to examine the expression profiles under PEG and salt treatments by using qRT-PCR. All miR156 target MsSPLs responded to PEG and salt stresses, but the response intensities and speeds were different. The majority of genes were upregulated on the fourth day of the PEG and NaCl treatments. Under drought conditions, all genes were upregulated, except for two (MsSPL16 and MsSPL19) that were downregulated on the 1st day, and MsSPL23 was downregulated on the 2nd day. After NaCl treatment for one day, three of the MsSPL genes (MsSPL08, MsSPL10, and MsSPL11) were significantly upregulated. After NaCl treatment for two and four days, most of them were greatly upregulated.

Paralogous of MsSPL08 Isolation and Functional Analysis in M. truncatula
The cDNA sequences of MsSPL08 were compared among M. truncatula SPLs (MtSPL) to identify the paralogous gene. A multiple sequence alignment showed that MsSPL08 has a high identity with the MtSPL4 protein sequence, which shows a 93% similarity and similar gene structure (Figure 9a,b). M. truncatula with a good collinearity with alfalfa, has long been considered as a model species for the studies of legume biology [32]. A significant level of sequence conservation was reported between alfalfa and M. truncatula allowing estimates of gene function between two species. As the sequence conservation between alfalfa and M. truncatula is high, we chose M. truncatula as a model to explore the possible functions of the miR156-targeted MsSPL8 in compound leaf development. The expression profiles of MsSPL08 and MtSPL4 in different tissues showed that MsSPL08 and MtSPL4 play important roles in regulating aboveground tissue development (Figure 9c,d).

Paralogous of MsSPL08 Isolation and Functional Analysis in M. truncatula
The cDNA sequences of MsSPL08 were compared among M. truncatula SPLs (MtSPL) to identify the paralogous gene. A multiple sequence alignment showed that MsSPL08 has a high identity with the MtSPL4 protein sequence, which shows a 93% similarity and similar gene structure (Figure 9a,b). M. truncatula with a good collinearity with alfalfa, has long been considered as a model species for the studies of legume biology [32]. A significant level of sequence conservation was reported between alfalfa and M. truncatula allowing estimates of gene function between two species. As the sequence conservation between alfalfa and M. truncatula is high, we chose M. truncatula as a model to explore the possible functions of the miR156-targeted MsSPL8 in compound leaf development. The expression profiles of MsSPL08 and MtSPL4 in different tissues showed that MsSPL08 and MtSPL4 play important roles in regulating aboveground tissue development (Figure 9c,d).
To determine the biological funct-ions of MsSPL08 in compound leaf formation in alfalfa, a Tnt1 retrotransposon-tagged MtSPL4 mutant line was obtained. The mutant plants had more than three leaflets, increased lateral branches, a reduced plant height, and a delayed flowering time (Figure 9e), which reveals that MtSPL4 plays important roles in compound leaf development, stem elongation, and branching. Compared with two cotyledons and one true leaf of wild-type seedlings, the mutant had three cotyledons and a trifoliate form of the true leaf (Figure 9f). The mutant had a different type of leaflet, and the number varied from one to seven, with 30.5% of the leaves exhibiting four or five leaflets in adult plants (Figure 9g). To determine the biological funct-ions of MsSPL08 in compound leaf formation in alfalfa, a Tnt1 retrotransposon-tagged MtSPL4 mutant line was obtained. The mutant plants had more than three leaflets, increased lateral branches, a reduced plant height, and a delayed flowering time (Figure 9e), which reveals that MtSPL4 plays important roles in compound leaf development, stem elongation, and branching. Compared with two cotyledons and one true leaf of wild-type seedlings, the mutant had three cotyledons and a trifoliate form of the true leaf (Figure 9f). The mutant had a different type of leaflet, and the number varied from one to seven, with 30.5% of the leaves exhibiting four or five leaflets in adult plants (Figure 9g).

Discussion
Alfalfa is referred to as "the queen of forage crops" because of its high biomass, biological nitrogen fixation, choice nutritional profiles, stress-tolerance and reliable sources of protein and minerals for livestock [31,33]. Increasing the leaf/stem ratio is an effective way to improve its rumen digestibility. However, the regulatory mechanism controlling leaflet formation in this species remains limited. After four years of backcross breeding, we obtained a BC2 population with 27.04% multifoliate rate at budding period. The single plant with the highest main branch multifoliate rate was 88%, the pinnate pentafoliate was the most stable leaflet type in BC2 population. To acquire more accurate information on the transcript functions, we compared the transcriptome profiles of the multifoliate (pinnate pentafoliate type) and trifoliate alfalfa genotypes. The enrichment results showed that the DEGs were mainly annotated to major metabolic and biosynthesis processes such

Discussion
Alfalfa is referred to as "the queen of forage crops" because of its high biomass, biological nitrogen fixation, choice nutritional profiles, stress-tolerance and reliable sources of protein and minerals for livestock [31,33]. Increasing the leaf/stem ratio is an effective way to improve its rumen digestibility. However, the regulatory mechanism controlling leaflet formation in this species remains limited. After four years of backcross breeding, we obtained a BC 2 population with 27.04% multifoliate rate at budding period. The single plant with the highest main branch multifoliate rate was 88%, the pinnate pentafoliate was the most stable leaflet type in BC 2 population. To acquire more accurate information on the transcript functions, we compared the transcriptome profiles of the multifoliate (pinnate pentafoliate type) and trifoliate alfalfa genotypes. The enrichment results showed that the DEGs were mainly annotated to major metabolic and biosynthesis processes such as fructose and mannose metabolism, cysteine and methionine metabolism, ubiquinone, and another terpenoid-quinone biosynthesis, and amino acid biosynthesis.
The development of plant leaves is regulated by transcriptional regulators, plant hormones, and the mechanical properties of the tissue [34]. Among them, TFs are master regulators in response to various developmental processes and abiotic stresses in plants.
Here, 227 TFs were differentially expressed, and most of them were suppressed. Previous studies have found that manipulating the expression of TF often drastically changes plant phenotypes [35]. Among many types of TF families, SPL is unique to plants [2]. The role of some SPL genes is post-transcriptionally regulated by miR156, as their transcripts carry functional response elements to these highly related miRNAs, which have emerged as essential regulators of various biological processes, such as phase change, leaf development, tillering/branching, stem growth, and response to stresses [36]. The chromosome-level genome assembly was completed, which offers detailed information to comprehensively analyse the SPL genes in this major legume forage. Through genome-wide identification, 30 MsSPL genes were acquired from the alfalfa genome. A phylogenetic investigation showed that the relationship of MsSPL is closer to M. truncatula and Arabidopsis than to rice, which suggests that eudicot SPL genes may diverge from a common ancestor [37,38]. Numerous SPLs have been identified in green plants, and the highest number of 59 SPLs have been recognized in Gossypium hirsutum and Brassica juncea [13], while, only 15 SPLs were identified in Solanum lycopersicon [39]. The number of MsSPLs is greater than that in some species such as Citrus sinensis [15], Cucumis sativus [15], Arabidopsis [16] and O.sativa [19], and close to Populus trichocarpa [27], Gossypium raimondii [29], and M. truncatula [22], which indicates that more SPLs contained in these species were attributed to WGD events. Similarly, a WGD event played a significant role in the alfalfa genome~58 Mya ago, which allowed SPL gene expansion in alfalfa [6]. Motif composition and gene structure analyses indicated that MsSPL gene clusters in the same group shared similar motifs and exon/intron organization. Furthermore, a high divergence of motif was noticed among MsSPL proteins. For example, motifs 3 and 9 were unique to Group 1, whereas motifs 4 and 6 were only for Group 8 as compared to other SPLs, which suggests that the function of MsSPL genes may show diversity because of diversification [10,37,38]. In total, 16 identified MsSPLs identified have miR156 target sites, and the results showed that most of them are located in the coding region, except for MsSPL08 in the 3 -UTR. Similarly, these results were also found in Arabidopsis, Hordeum vulgare, and rice [36,40], which shows the conservation of miR156-mediated posttranscriptional regulation in different plant species [21,41,42].
In nature, leaves are an important agronomic trait, where photosynthesis converts carbon dioxide and water into carbohydrates; therefore, a high leaf/stem ratio breeding variety will increase its nutrient accumulation [43]. Leaf patterns are highly variable among different species, and even between different varieties. Among different leaf patterns and shapes, compound leaves are spectacularly diverse [44]. We analyzed the molecular mechanism by which these factors modulate leaf development to yield a substantial diversity of leaf forms in multifoliate (pinnate pentafoliate type). We found that miR156-targeted MsSPL08 was highly expressed in leaves, stems, and flowers, but was suppressed in nodules and roots. While, compared with trifoliate alfalfa, the expression of MsSPL08 was downregulate in multifoliate alfalfa. To further verify whether MsSPL08 functions in leaflet differentiation, we choose M. truncatula as a model to carry out further research. The compound leaf composition ranged from one to seven simple leaflets, and the plant height was decreased, but the stem number was increased in M. truncatula mutant plants. Importantly, the mutant exhibits multifoliate traits in the seedling stage, such as three cotyledons, the true leaf is composed of three leaflets, and the serrations of the leaf edge disappear. Compared with multifoliate alfalfa, MtSPL4 mutant plants displayed a more variable phenotype, we speculate that alfalfa is autotetraploid (2n = 4X = 32) plant, the genomes is more complex and high similarity of their subgenomes. Additionally, compound leaf development is also flexibly tuned among different species in a spatiotemporal manner. The precise regulation network of how MsSPL08 involved in leaf patterning is also an open question. miR156 has agronomic relevance for plant growth and stress tolerance among a wide variety of plant miRNAs. Previously studies have shown that miR156-targeted SPL play important role in improving biomass production and abiotic stress tolerance in legumes [36,45,46]. Similarly, we find that MsSPL08 was also significantly induced by drought and salt stress. These observations show that the miR156/SPL module may contribute to genetic variability through the regulation of plant development and abiotic stress responses. The phenotypic changes and stress response were similar to those of the miR156-overexpressing alfalfa genotypes, including delayed flowering, reduced stem length, and increased shoot branching, and played a positive role in abiotic stress toler-ance [14,[47][48][49]. This observation suggests that the miR156/SPL module is integrated with the plant developmental and stress response. So far, still not clear how does miR156 integrate biological stress and regulate growth. The detailed relationship between abiotic stress response and development needs further study.

Conclusions
In this study, RNA-Seq technology was utilized to generate a comprehensive transcriptome between multifoliate and trifoliate alfalfa. Then, the RNA-Seq results were used to recognize the sequences associated with the important regulating pant trait gene SPL. Overall, 30 MsSPL were identified in alfalfa. Phylogenetic, gene structure and evolutionary analyses suggest that the functions of MsSPL genes may show diversity after the diversification of dicots and monocots. Based on different tissue expression profiles, MsSPLs can be divided into widely expressed and tissue-specific expressed groups. In this case, 16 miR156-targeted MsSPLs strongly implied diversification and positive roles under drought and salt conditions. Tthe transcripts of MsSPL08 were significantly suppressed in multifoliate alfalfa. Furthermore, we used M. truncatula as a model to study the possible roles of miR156-targeted MsSPL08 in regulating alfalfa compound leaf development. The phenotypes of mutant plants revealed that miR156/MsSPL08 not only was involved in the development of multifoliate but also contributed to branches.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/ 10.3390/genes13020331/s1, Figure S1: GO enrichment analysis of the DEGs. Figure S2: The expression pattern distribution of DEGs identified as TFs between multifoliate and trifoliate alfalfa. Figure S3: Multiple alignments of the MsSPL protein sequences. (a) The highly conserved SBPdomain, including two Zn-finger-like structures (C2HC and C3H) and an NLS. (a) The alignment of miR156 sequences with the target sites in MsSPL genes. Figure S4: The venn diagram of syntenic SPL genes throughout diverse species. Figure S5: The relative expression ratios of 16 miR156 target MsSPL genes under drought (a) and salt conditions (b). * p < 0.05 and ** p < 0.01. Table S1: The qRT-PCR primers used in this study. Table S2: BLAST analysis of the non-redundant genes against public databases. Table S3: The Ka, Ks and dates of the duplication evens in the paralogous MsSPL gene pairs.