Genome-Wide Investigation of Spliceosomal SM/LSM Genes in Wheat (Triticum aestivum L.) and Its Progenitors

Abstract: The SSM/SLSM (spliceosomal Smith (SM)/SM-like (LSM)) genes encode central components of the spliceosome in eukaryotes; they play an important role in regulating RNA splicing and participate in diverse biological processes. Although the family has been characterized in Arabidopsis, rice, and other plants, its members and significance in wheat have not yet been reported. In this study, we identified the SSM/SLSM genes in wheat and its progenitors at the genome scale: 57 SSM/SLSM genes were identified in wheat, together with 41, 17, and 19 in Triticum dicoccoides, Triticum urartu, and Aegilops tauschii, respectively. Their phylogenetic relationships, gene structures, conserved motifs, and cis-regulatory elements were systematically analyzed. Synteny analysis revealed good collinearity of SSM/SLSM genes among the genomes of bread wheat and its progenitors, and the distribution of SMD2 genes on wheat chromosomes 5A, 4B, and 4D was consistent with the 4AL-5AL-7BS chromosome translocation model. Positively selected genes were then investigated based on analysis of the non-synonymous to synonymous substitution ratio (dN/dS) of the orthologous pairs. Finally, the expression profiles of the SSM/SLSM genes were examined using RNA-seq datasets, and eight stress-responsive candidate genes were selected for validation by qPCR (real-time quantitative polymerase chain reaction). Based on co-expression network analysis, the relationship between the LSM7-7A gene and its co-expressed genes was examined through Gene Ontology (GO) enrichment analysis. Furthermore, the LSM7-7A gene was associated with a homolog of the Arabidopsis salt tolerance gene RCY1. This investigation systematically identified the complete set of candidate SSM/SLSM genes and their characteristics in wheat and its progenitors, and provides clues for a better understanding of their contribution during wheat polyploidization.

In plants, the SM gene family, including the SSM/SLSM gene family, has been reported in Arabidopsis thaliana [15], longan [16], maize, and rice [17]. Spliceosome-associated proteins, including SSM/SLSM proteins, were identified in Arabidopsis thaliana [18] and nine other plant species (Glycine max, Lotus japonicus, Medicago truncatula, Oryza sativa, Physcomitrella patens, Populus trichocarpa, Sorghum bicolor, Vitis vinifera, and Zea mays) [19]; the identification of SSM/SLSM members in maize and rice was mainly based on a bioinformatics search for homologs of the Arabidopsis thaliana members [19]. However, the SM family in animals has not been specifically studied. The SMB protein was found in only a small number of rodent cell types, suggesting a role in the regulation of some cases of alternative RNA splicing [20]. Anne [21] found that arginine methylation of SMB is required for Drosophila germ cell development. Scruggs [22] implicated SmD3 as a critical determinant in the processing of intronic non-coding RNAs in general and as an upstream mediator of metabolic stress response pathways through the regulation of snoRNA expression. SME and SMG proteins have been associated with cancer [23,24].
The SSM/SLSM proteins in plants are not only associated with spliceosomes but are also related to the circadian rhythm [25], mRNA degradation [10], and stress resistance [26]. So far, the distinct functions of the SSM/SLSM genes have not been extensively deciphered in plants. Only the LSM1-7 and LSM2-8 complexes and the SME, SMD3, LSM5, and LSM4 genes have been studied in Arabidopsis [26][27][28][29][30]. The SSM/SLSM genes have not yet been identified and characterized in bread wheat.
Bread wheat is an allohexaploid species (Triticum aestivum L., AABBDD) originating from two major allopolyploidization events [31][32][33]. First, diploid Triticum urartu (AA) hybridized with an unknown diploid grass (related to Aegilops speltoides, BB) to produce wild tetraploid wheat Triticum dicoccoides (AABB). Then, wild tetraploid wheat hybridized with the diploid goat grass Aegilops tauschii (DD) to form hexaploid bread wheat [34]. Allopolyploidy can change the transcription and/or function of homologous genes [32]. Here, the SSM/SLSM genes were systematically identified at the genome scale in bread wheat and its progenitors. Their genomic organization, phylogenetic relationships, gene structures, conserved motifs, and gene expression patterns were then comprehensively investigated.

Identification of Smith (SM)/SM-Like (SSM/SLSM) Genes in Wheat and Its Progenitors
The Arabidopsis SSM/SLSM protein sequences were downloaded from the SRGD database (http://www.plantgdb.org/SRGD/, accessed on 1 June 2021) and used as queries in a BLASTP search against local protein databases of wheat (IWGSC_v1.1) and its progenitors, which were downloaded from Ensembl Plants (http://plants.ensembl.org/index.html, accessed on 1 June 2021) and MBKbase (http://www.mbkbase.org, accessed on 1 June 2021), with an expected value (E-value) threshold of 1 × 10^−20. Meanwhile, the SSM/SLSM domain (PF01423) was downloaded from the PFAM database (http://pfam.xfam.org/, accessed on 1 June 2021), and the hmmsearch tool implemented in HMMER 3.3.1 [35] was used to search the local wheat protein database for proteins containing this domain, with an E-value threshold of 1 × 10^−5. The protein sequences identified by the two methods were then merged and manually curated to remove redundant entries. The remaining proteins were considered candidate SSM/SLSM proteins. The candidates were finally submitted to PfamScan (https://www.ebi.ac.uk/Tools/pfa/pfamscan/, accessed on 1 June 2021) and NCBI-CDD (https://www.ncbi.nlm.nih.gov/cdd/, accessed on 1 June 2021) to verify the SSM/SLSM conserved domain. The same method was used to identify the SSM/SLSM genes in wild emmer wheat (T. dicoccoides), A. tauschii, and T. urartu. The relationship between gene names and gene IDs is given in Table 1. The physical properties of the proteins, including the number of amino acids, molecular weight, theoretical pI, and grand average of hydropathicity (GRAVY), were determined using ExPASy (https://web.expasy.org/protparam/, accessed on 1 June 2021). Conserved motifs of the SSM/SLSM proteins were analyzed with the MEME online tool (http://meme-suite.org/, accessed on 1 June 2021), and TBtools was used to draw the gene structures and motifs.
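The merging of the two search tracks described above can be illustrated with a minimal Python sketch. It assumes that the BLASTP results were saved in the standard tabular format (`-outfmt 6`, E-value in the eleventh column) and the hmmsearch results with `--tblout` (full-sequence E-value in the fifth column); the function names and the protein identifiers in the usage note are hypothetical, and the manual curation step is not modeled.

```python
def blast_hits(lines, max_evalue=1e-20):
    """Collect subject IDs from BLAST tabular (-outfmt 6) lines that pass the E-value cutoff."""
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.add(subject_id)
    return hits


def hmmsearch_hits(lines, max_evalue=1e-5):
    """Collect target IDs from hmmsearch --tblout lines that pass the full-sequence E-value cutoff."""
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # tblout files contain '#' header and footer lines
        fields = line.split()
        target_id, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.add(target_id)
    return hits


def candidate_proteins(blast_lines, hmm_lines):
    """Union of both hit sets; using a set removes redundant identifiers."""
    return sorted(blast_hits(blast_lines) | hmmsearch_hits(hmm_lines))
```

In practice the two arguments could be open file handles, since both functions only iterate over lines. The resulting list would still need domain verification (PfamScan, NCBI-CDD) as described in the text.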
The gene collinearity of wheat and its relatives was analyzed using the MCScanX [36] program and visualized with Circos [37]. To observe gene changes during wheat polyploidization, we calculated dN/dS using ParaAT [38] and PAML [39].

Analysis of Cis-Acting Elements of SSM/SLSM Genes Promoter
The 1500 bp promoter sequences were obtained with a Perl script and used for the prediction of plant cis-acting regulatory elements. The PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/, accessed on 1 June 2021) was used to identify the elements in the promoters. Perl scripts were then used to count the cis-acting elements in the promoters (Supplemental Table S1).
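As an illustration of the promoter extraction step, the following Python sketch returns the region immediately upstream of a gene in a strand-aware way. It is not the Perl script used in the study; it assumes that the promoter is taken as the 1500 bp directly upstream of the annotated gene start, that coordinates are 1-based and inclusive (as in GFF3), and that the chromosome sequence is already loaded as a string. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T/N, either case)."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_sequence(chrom_seq, gene_start, gene_end, strand, length=1500):
    """Return up to `length` bases upstream of a gene, read 5'->3' on the gene's strand.

    gene_start and gene_end are 1-based inclusive with gene_start <= gene_end.
    The region is clipped at the chromosome ends.
    """
    if strand == "+":
        upstream_end = gene_start - 1                # 0-based exclusive end
        upstream_start = max(0, upstream_end - length)
        return chrom_seq[upstream_start:upstream_end]
    if strand == "-":
        region_start = gene_end                      # 0-based index of the first base after the gene
        region_end = min(len(chrom_seq), region_start + length)
        return reverse_complement(chrom_seq[region_start:region_end])
    raise ValueError("strand must be '+' or '-'")
```

For a minus-strand gene the upstream region lies at higher coordinates, so the slice is taken after the gene end and reverse-complemented. The output could then be submitted to PlantCARE.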

Analysis of the Specific Expression of SSM/SLSM Genes
We downloaded the raw data from the NCBI database (detailed information is given in Supplemental Table S2) and processed them with a standard transcriptome analysis workflow. First, low-quality reads were filtered with Trimmomatic. Second, the clean reads were aligned to the reference genome with Hisat2 [40]. Third, the aligned reads were quantified with StringTie.
TPM values of transcripts in the five tissues (root, stem, leaf, spike, and grain) without stress treatment were calculated as mean values. Similarly, TPM values at four stages (booting, heading, flowering, and grain filling) were calculated as mean values. The same calculation was performed for the stress treatments, including heat, drought, and salt stress.
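The averaging step can be sketched as follows. This is a minimal Python example that assumes the mean is taken over all samples belonging to the same tissue, stage, or treatment group; the data layout (nested dictionaries) and the function name are illustrative choices, not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_tpm_by_group(sample_tpm, sample_group):
    """Average TPM per gene within each group of samples.

    sample_tpm:   {sample_id: {gene_id: tpm}}
    sample_group: {sample_id: group_label}, e.g. a tissue or treatment name
    Returns       {gene_id: {group_label: mean_tpm}}
    """
    collected = defaultdict(lambda: defaultdict(list))
    for sample_id, gene_values in sample_tpm.items():
        group = sample_group[sample_id]
        for gene_id, tpm in gene_values.items():
            collected[gene_id][group].append(tpm)
    return {
        gene_id: {group: mean(values) for group, values in groups.items()}
        for gene_id, groups in collected.items()
    }
```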
The seeds of wheat genotype Chinese Spring were germinated in Petri dishes and grown in a growth chamber under controlled conditions (23 ± 1 °C, 16 Table S3). The internal reference gene was wheat β-actin. qPCR was performed on the QuantStudio™ 7 Flex System (Thermo Fisher Scientific, Waltham, MA, USA) with SYBR® Premix Ex Taq™ II (TaKaRa, Dalian, China); the thermal cycling conditions were 95 °C for 30 s, followed by 40 cycles of 95 °C for 3 s and 60 °C for 30 s, then 95 °C for 15 s.
The qPCR for each primer pair was repeated three times for each treatment. Three technical replicates were used, and the expression level was calculated using the 2^−ΔΔCT method [41].
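For reference, the 2^−ΔΔCT calculation can be written out as a short Python sketch. It assumes that technical replicates are averaged before the subtraction and that the reference gene (here β-actin) is measured in the same samples; the function name and the Ct values in the usage note are hypothetical.

```python
from statistics import mean


def relative_expression(target_treated, ref_treated, target_control, ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Each argument is a list of Ct values from technical replicates.
    dCt  = Ct(target) - Ct(reference), computed per condition
    ddCt = dCt(treated) - dCt(control)
    """
    delta_ct_treated = mean(target_treated) - mean(ref_treated)
    delta_ct_control = mean(target_control) - mean(ref_control)
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2 ** -delta_delta_ct
```

With treated Ct values of 22 (target) and 18 (reference) and control Ct values of 24 and 18, ΔΔCT is −2 and the function returns a four-fold increase.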

Construction of the Co-Expression Network and Functional Search of Key Genes
The WGCNA package [42] was used to construct the co-expression network based on the wheat transcriptome data (Supplemental Table S2), with the soft-thresholding power set to 26. The genes connected to TaLSM7-7A with an edge weight greater than 0.3 were extracted from the co-expression network and displayed with Cytoscape 3.6.0 [43].
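The extraction of genes linked to TaLSM7-7A can be sketched in Python as a filter over a weighted edge list. The sketch assumes that the network has been exported as (gene, gene, weight) triples, for example from a Cytoscape edge file; the input layout, the function name, and the partner gene names in the test data are hypothetical.

```python
def neighbours_above_weight(edges, hub_gene, min_weight=0.3):
    """Return {partner_gene: weight} for edges that touch hub_gene with weight > min_weight.

    edges is an iterable of (gene_a, gene_b, weight) tuples from an undirected network.
    If a pair occurs more than once, the largest weight is kept.
    """
    partners = {}
    for gene_a, gene_b, weight in edges:
        if weight <= min_weight:
            continue  # keep only edges strictly above the threshold
        if gene_a == hub_gene:
            partner = gene_b
        elif gene_b == hub_gene:
            partner = gene_a
        else:
            continue  # edge does not involve the hub gene
        partners[partner] = max(weight, partners.get(partner, 0.0))
    return partners
```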
The genes associated with the LSM7 gene were then extracted with a Perl script, subjected to enrichment analysis in TBtools [44], and plotted with WEGO [45]. Among them, we identified genes related to salt stress and performed a BLAST comparison against the Arabidopsis genome to find their homologs, whose relation to salt stress was assessed by reference to the literature.
A diagram of the complete materials and methods workflow is provided in Supplemental Figure S2.

Identification of SSM/SLSM Gene Members and Their Physico-Chemical Properties in Wheat and Its Progenitors
Using BLASTP and HMMER searches, 57 SSM/SLSM genes were found in wheat (TaSSM/SLSMs). In addition, 41, 17, and 19 SSM/SLSM genes were identified in Triticum dicoccoides, Triticum urartu, and Aegilops tauschii, respectively. SSM/SLSM genes were named according to their chromosome location (Supplemental File 1). There was no significant sequence variation among the 19, 19, and 16 SSM/SLSM genes in the A, B, and D subgenomes of wheat. The physico-chemical properties of the identified SSM/SLSM proteins in wheat and its progenitors are listed in Table 1.  28.04 kDa for wheat, T. urartu, T. dicoccoides, and A. tauschii, respectively. The predicted isoelectric points (pI) of the wheat SSM/SLSM proteins varied from 4.42 to 11.3 with an average of 7.87, compared with T. dicoccoides (4.27 to 11.5, average 8.29), A. tauschii (4.42 to 11.53, average 7.82), and T. urartu (4.42 to 11.3, average 8.12), suggesting no significant difference among them.
Among the wheat subgenomes, the nucleotide sequence similarity of the CDS regions ranged from 98.6% to 99.3% for LSM2, 83.3% to 99.7% for LSM3, 96.1% to 97.5% for LSM4, 97.4% to 99.3% for LSM5, and 94.7% to 97.37% for LSM6. Similarly, the sequence similarity ranged from 77.5% to 99.01% for LSM7, 96.3% to 99.3% for LSM8, 97.6% to 98.2% for SMB, 87.1% to 99.4% for SMD1, 87.1% to 100% for SMD2, 97.30% to 97.8% for SMD3, 85.7% to 100% for SME, and 97.7% to 97.77% for SMF, and was 95.9% for SMG (Table 2). In Table 2, wheat minimum similarity (%) is the minimum similarity between different CDS nucleotide sequences of the same gene in wheat, calculated with MEGAX, and wheat maximum similarity (%) is the corresponding maximum.
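The minimum and maximum similarity values reported per gene can be illustrated with a small Python sketch. The source states only that the values were obtained with MEGAX; the sketch below assumes a simple percent identity over aligned columns in which neither sequence has a gap, which may differ from the measure MEGAX reports. Function names and the toy sequences in the test are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity of two aligned sequences, ignoring columns with a gap in either."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    compared = matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        matches += base_a == base_b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_range(aligned):
    """Minimum and maximum pairwise identity among the copies of one gene.

    aligned maps a copy name (e.g. an A, B, or D homoeolog) to its aligned CDS
    and must contain at least two sequences.
    """
    values = [
        percent_identity(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(sorted(aligned), 2)
    ]
    return min(values), max(values)
```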

Genome Distribution and Synteny Analysis of SSM/SLSM Genes among Bread Wheat and Its Progenitors
In wheat and its progenitors, the LSM2 genes were distributed on chromosome 7, the LSM3 genes on chromosomes 3 and 1, the LSM4 genes on chromosome 3, the LSM5 genes on chromosome 1, the LSM6 genes on chromosome 2, the LSM7 genes on chromosomes 7 and 2, and the LSM8 genes on chromosome 1. SMBs were distributed on chromosome 2; SMD1s were on chromosomes 2 and 6; SMD2s were on chromosomes 5, 3, and 4; SMD3s were on chromosomes 5, 7, and 4; SMEs were on chromosomes 6 and 7; SMFs were on chromosomes 6 and 7; and SMGs were on chromosome 2. Thus, each LSM gene was mainly concentrated on a single chromosome (except for LSM7 and LSM3), whereas the SM genes were relatively scattered: only the SMB and SMG genes were located on a single chromosome, and the rest were distributed on two or three chromosomes.
Synteny analysis of the SSM/SLSM genes among bread wheat and its progenitors was performed (Figure 1). A total of 94 pairs of orthologous genes were identified between bread wheat and wild emmer wheat. Similarly, 42, 38, and 48 orthologous pairs were identified between bread wheat and A. tauschii, bread wheat and T. urartu, and T. dicoccoides and T. urartu, respectively.
Most characterized domestication events are associated with primitive, extreme genetic mutations and selection pressures. These factors are predicted to increase the ratio of nonsynonymous to synonymous substitution rates (dN/dS), potentially resulting in the fixation of deleterious alleles. Therefore, to observe gene changes during wheat polyploidization, we calculated dN/dS (Figure 2). Between bread wheat and wild emmer wheat, the dN/dS of LSM2, LSM4, LSM6, LSM7, LSM8, and SMB was much greater than 1, indicating that these genes were positively selected under environmental pressure, whereas the other genes were under purifying selection (dN/dS << 1). Between bread wheat and T. urartu, dN/dS >> 1 for the LSM7 genes and dN/dS << 1 for all other genes. Between bread wheat and A. tauschii, LSM4 and LSM6 were positively selected, while the others were under purifying selection. Between T. dicoccoides and T. urartu, LSM7 and SMD1 were positively selected, while the others were under purifying selection. In general, the SMD1 gene was positively selected only during polyploidization from T. urartu to wild emmer wheat under environmental influence. The LSM7 gene was positively selected only during polyploidization from T. urartu to wild emmer wheat and then to bread wheat. From the dN/dS results, most SM genes were under purifying selection during polyploidization, except that SMD1 was positively selected in the first stage of polyploidization and SMB in the second stage, whereas the spliceosomal LSM genes mostly underwent positive selection, except for LSM3 and LSM7. The evolutionary tree constructed for the SMD2 genes is shown in Figure 3. Compared with TaSMD2-3D, TaSMD2-3B, and TaSMD2-3A, TaSMD2-5A is evolutionarily closer to TaSMD2-4D and TaSMD2-4B.

The Phylogenetic and Gene Structure of SSM/SLSM Proteins in Wheat and Its Progenitors
To further understand the evolutionary relationships of SSM/SLSM genes, their structural features and phylogenetic characters were analyzed. The phylogenetic tree was constructed using the full-length protein sequence alignments of the identified 57 TaSSM/SLSM, 41 TdSSM/SLSM, 17 TuSSM/SLSM, and 19 AetSSM/SLSM (Figure 4). The neighbor-joining (NJ) tree of SSM/SLSM genes can be clearly divided into 14 known groups. By contrast, within each SSM/SLSM gene, a strong amino acid sequence conservation was found, suggesting strong evolutionary relationships among all the members.
Moreover, other evidence, such as the motif compositions and gene structures described below, further supports this grouping. From the NJ tree, we found that the pairwise relationships between LSM and SM genes are as follows: LSM3-SMD2, LSM4-SMD3, LSM6-SMF, LSM7-SMG, and LSM8-SMB.
The SSM/SLSM gene structures were analyzed based on the arrangement of their exons (Figure 5). The number and distribution of exons in each SSM/SLSM gene were very similar between wheat and its progenitors. The SMB genes have only one exon in wheat and its progenitors. The SMD1 genes have three or four exons in wheat, four or five in T. urartu, two or four in T. dicoccoides, and three in A. tauschii. The SMD2 genes in wheat and its progenitors have four exons, except TdSMD2-4B, TdSMD2-3A, and TaSMD2-3B. The SMD3 genes have four exons in wheat and T. urartu; three, four, or five in T. dicoccoides; and three in A. tauschii. The SME genes have four or six exons in wheat; six in T. urartu; three, four, or six in T. dicoccoides; and five or six in A. tauschii. The SMF genes in wheat and its progenitors have five exons, except TdSMF-6A, AetSMF-6, TaSMF-6B, and TaSMF-6D. The SMG genes in wheat and its progenitors have four exons, except TdSMG-2B.

Figure caption: The maximum likelihood (ML) tree was constructed using MEGAX based on the full-length protein sequences. The exon-intron structures of these genes were graphically displayed by the Gene Structure Display Server using the CDS and genome sequences of the SSM/LSM genes. The protein sequences of the SSM/LSM genes were used to predict the conserved motifs using the MEME Suite web server.
The LSM2 gene has three exons in wheat, A. tauschii, and T. urartu, and four exons in T. dicoccoides. The LSM3 genes in wheat and its progenitors have three exons, except TdLSM3-3A. The LSM4 genes have seven exons, except TdLSM4-3B. The LSM5 genes have four exons, except TdLSM5-1A and AetLSM5-1. The LSM6 genes have three exons in wheat and all of its progenitors. The LSM7 genes have five exons, except TdLSM7-7B, TdLSM7-2B, TdLSM7-2A, and TaLSM7-U. The LSM8 genes have five exons, except AetLSM8-1.

Expression Profile Analysis of SSM/SLSM Genes in Wheat
We used high-throughput RNA-seq data to analyze expression patterns across wheat tissues, developmental stages, and stress treatments. We examined the spatial and temporal expression patterns of the SSM/SLSM genes in wheat; in general, most members were expressed at lower levels in the leaf and at higher levels in the spike, implying that SSM/SLSM genes may be preferentially involved in spike development in bread wheat. Analysis of SSM/SLSM gene expression in different tissues one week after germination showed that LSM2 and LSM4 expression was generally high in the stem. In the spike, SMD1, SMD2, and SME expression was high, whereas SMG expression was relatively high in the root. In the grain, LSM5 and LSM7 expression was relatively high, and the expression of these genes in the leaf was relatively low.
In general, the expression of the LSM4, SMD2, SME, SMF, and SMG genes was relatively high one week after germination among abiotic stress treatments. The expression levels of these genes in leaves at different stages were calculated, and we found that the expression levels of SME, SMD1, SMD2, SMD3, and LSM5 were generally high at the grain-filling stage, and the expression levels of LSM4 and SMB were high at the flowering stage. In addition, the expression levels of LSM6, LSM7, and SMG were relatively high at the booting stage, while the expression levels of these genes were relatively low at the heading stage.
RNA-seq data have confirmed previous indications that abiotic stress markedly alters alternative splicing in plants [48][49][50][51][52][53][54][55]. Because alternative splicing is regulated by the spliceosome, the expression pattern of the spliceosome core proteins may change under stress. Therefore, we also analyzed the expression of these genes under different abiotic stresses (Figure 6). TaSME-6B and TaLSM7-1D showed no detectable expression under any treatment. In general, compared with the control, most SSM/SLSM genes were expressed at lower levels.

Co-Expression Network Analysis of Wheat SSM/SLSM Genes
We constructed a gene co-expression network based on the wheat transcriptome in different tissues and developmental stages. The SSM/SLSM genes were widely distributed across different modules without any preference. Furthermore, we found that the LSM7-7A gene might be related to salt stress and drought stress. Subsequently, we screened the genes connected to LSM7-7A in this network (Figure 8) and annotated them (Figure 9), which showed that 47 of the genes associated with TaLSM7-7A were related to abiotic stress. Among these genes, TraesCS6D02G310800, which is highly homologous to the RCY1 gene of Arabidopsis, was found to be related to salt stress [55].

Discussion
In this study, we systematically identified 17 SSM/SLSM genes in diploid T. urartu, 19 in diploid A. tauschii, 41 in tetraploid emmer wheat (T. dicoccoides), and 57 in hexaploid wheat (T. aestivum) at the whole-genome scale. The number of SSM/SLSM genes did not increase in proportion to ploidy during wheat polyploidization, indicating that they may have undergone gene loss [56] and recombination between homologous chromosomes [57] during this process.
The wheat SSM/SLSM genes were distributed on all chromosomes except 4A, 5D, and 5B. There were no SSM/SLSM genes on chromosome 4A of T. urartu, and none were identified on chromosome 5 of Ae. tauschii or chromosome 5B of T. dicoccoides. The subsequent RNA-seq data analysis provided no evidence of TdSMD3-4A expression, which may indicate that it is a pseudogene. From the NJ tree, we found that the pairwise relationships between the LSM and SM genes were as follows: LSM3-SMD2, LSM4-SMD3, LSM6-SMF, LSM7-SMG, and LSM8-SMB. This was similar to Veretnik's results [9,16]. Based on the physical and chemical properties of these genes, the SSM/SLSM protein sequences of bread wheat were shorter than those of its relatives, and their theoretical pI values were lower.
Chromosome translocation plays an important role in wheat breeding [58]. The 4AL-5AL-7BS chromosome translocation model exists naturally in wheat and arose from two translocations. However, the specific function of the 4AL-5AL-7BS chromosome regions remains unclear. Compared with the other SMD2 genes in wheat and its relatives, the TaSMD2-5A gene on chromosome 5A was closely related to the TaSMD2-4D gene on chromosome 4D and the TaSMD2-4B gene on chromosome 4B. In the analysis of SMD2 sequence similarity in wheat, the similarity was 99.1% between TaSMD2-5A and TaSMD2-4B and 98.5% between TaSMD2-5A and TaSMD2-4D, higher than for the other pairs. This result suggests that the SMD2 gene may have been translocated between chromosomes 5A and 4A, which is consistent with Zhou's study [59]. We found that TaSMD2-5A was translocated to chr5A during the first translocation event [60]. The discovery of the TaSMD2-5A gene location contributes to the understanding of natural chromosome translocation models in wheat breeding.
The ratio of non-synonymous to synonymous substitution rates (dN/dS) reflects evolution under selection pressure and is commonly used to identify protein sites that experience purifying selection (dN/dS < 1), evolve neutrally (dN/dS ≈ 1), or experience positive selection (dN/dS > 1) [61][62][63][64]. The trend diagram suggests that the LSM/SM genes with dN/dS values greater than one underwent positive selection during the polyploidization of bread wheat. From the dN/dS results, we found that most of the SM genes were under purifying selection during polyploidization, except that SMD1 was positively selected in the first stage of polyploidization and SMB in the second stage, whereas the spliceosomal LSM genes mostly underwent positive selection, except for LSM3 and LSM7.
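The three dN/dS regimes named above can be expressed as a small Python helper. This is a sketch only: the tolerance used to decide when a ratio counts as approximately 1 is an arbitrary illustrative parameter, not a value from the study, and a real analysis would rely on the likelihood-based tests in PAML rather than a fixed cutoff.

```python
def selection_regime(dn, ds, neutral_tolerance=0.05):
    """Classify an orthologous pair by its dN/dS ratio.

    Returns 'purifying' (ratio < 1), 'neutral' (ratio within neutral_tolerance of 1),
    or 'positive' (ratio > 1). neutral_tolerance is a hypothetical illustrative cutoff.
    """
    if ds <= 0:
        raise ValueError("dS must be positive for the ratio to be defined")
    omega = dn / ds
    if abs(omega - 1.0) <= neutral_tolerance:
        return "neutral"
    return "positive" if omega > 1.0 else "purifying"
```

Pairs with dS equal to zero are rejected explicitly, because the ratio is undefined for them and they would otherwise appear as spuriously extreme values.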
There is evidence that knockout of SMD3 delays flowering time and completion of the life cycle [65]. There is also evidence that SMD1 is related to the formation of giant cells, is required for successful nematode infection, and facilitates posttranscriptional gene silencing (PTGS); SMD1 mutants are embryo-lethal [66,67]. Therefore, SMD1 and SMD3 may be related to spike development in bread wheat. The functions of the other genes in plants are still unclear and require further verification.
According to the qPCR validation of the selected genes, TaSMF-6B, TaLSM7-7A, TaSMD1-2A, and TaLSM8-1A were highly expressed under salt and drought stress.
Cis-regulatory elements are composed of DNA (typically non-coding DNA) containing binding sites for TFs and/or other regulatory molecules that are needed to activate and sustain transcription. Zhang indicated that ABREs are also determinant cis-elements for stress-related transcriptional regulation. TaSMF-6B, TaLSM7-7A, TaSMD1-2A, and TaLSM8-1A all contain ABREs, and TaSMD1-2A and TaLSM7-7A also carry the drought-inducibility cis-element MBS. Furthermore, gene co-expression network analysis showed that the TaLSM7-7A gene was associated with the salt stress gene (ATRCY1). We therefore speculate that TaLSM7-7A is a salt-tolerance and drought-resistance gene.

Conclusions
In summary, we systematically identified the members of the SSM/SLSM gene family in bread wheat and its three progenitors, and analyzed and compared their characteristics. Good collinearity of SSM/SLSM genes was found among bread wheat and its progenitors, and evolutionary pressures on the genes were further investigated based on dN/dS analysis of the orthologous pairs. Based on the RNA-seq data, the SSM/SLSM genes exhibited distinct tissue-specific expression patterns, and LSM7, LSM8, SMF, and SMD1 were induced by diverse abiotic stresses. This investigation provides a comprehensive set of SSM/SLSM genes in wheat and its progenitors for further functional analysis, and contributes to a better understanding of the evolutionary mechanism of the SSM/SLSM genes during wheat polyploidization.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agronomy11071429/s1, Supplemental Table S1: Details of the cis-acting elements in the promoters, Supplemental Table S2: The accession numbers and sample information of the RNA-seq data, Supplemental Table S3: Primers designed for qPCR experiments, Supplemental Figure S1
Informed Consent Statement: This study did not involve humans.
Data Availability Statement: All the sequence data used in this study were downloaded from public databases, as detailed in Materials and Methods.