Comprehensive In Silico Characterization and Expression Proﬁling of Nine Gene Families Associated with Calcium Transport in Soybean

: Calcium (Ca 2 + ) plays a critical role in the regulation of growth and development and environmental stress responses in plants. The membrane-associated Ca 2 + transport proteins are required to mediate Ca 2 + signaling and maintain Ca 2 + homeostasis. Ca 2 + channels, pumps (ATPases), and antiporters are three major classes of Ca 2 + transporters. Although the genome-wide analysis of Ca 2 + transporters in model plants Arabidopsis and rice have been well documented, the identiﬁcation, classiﬁcation, phylogenesis, expression proﬁles, and physiological functions of Ca 2 + transport proteins in soybean are largely unknown. In this study, a comprehensive in silico analysis of gene families associated with Ca 2 + transport was conducted, and a total of 207 putative Ca 2 + transporter genes have been identiﬁed in soybean. These genes belong to nine di ﬀ erent families, such as Ca 2 + -ATPase, Ca 2 + / cation antiporter, cyclic nucleotide-gated ion channel (CNGC), and hyperosmolality induced cytosolic Ca 2 + concentration channel (OSCA). Detailed analysis of these identiﬁed genes was performed, including their classiﬁcation, phylogenesis, protein domains, chromosomal distribution, and gene duplication. Expression proﬁling of these genes was conducted in di ﬀ erent tissues and developmental stages, as well as under stresses using publicly available RNA-seq data. Some genes were found to be predominantly expressed in speciﬁc tissues like ﬂowers and nodules, and some genes were found to be expressed strongly during seed development. Seventy-four genes were found to be signiﬁcantly and di ﬀ erentially expressed under abiotic and biotic stresses, such as salt, phosphorus deﬁciency, and fungal pathogen inoculation. In addition, hormonal signaling- and stress response-related cis-elements and potential microRNA target sites were analyzed. This study suggests the potential roles of soybean Ca 2 + transporters in stress responses and growth regulation, and provides a basis for further functional characterization of putative Ca 2 + transporters in soybean.


Introduction
Calcium (Ca 2+ ) is an essential macronutrient required for various physiological and biochemical processes in plants [1]. As the divalent cation, Ca 2+ is indispensable for plant growth and development by playing a structural role in the cell wall, acting as a counter-cation for inorganic and organic anions in the vacuole, and functioning as a universal intracellular second messenger in the cytosol [2]. A high ratio of total Ca 2+ in plant tissues is located in cell walls, which is required for cell wall stabilization [1].

CNGC Gene Family in Soybean
In this study, 39 CNGC gene members were identified in the soybean genome (Table S2 (Supplementary Materials)). CNGC protein sequences from soybean, Arabidopsis, and rice were aligned using ClustalW, and a phylogenetic tree was constructed using the neighbor-joining method to determine the similarity and homology between the CNGC families of these plants. The topology of the phylogenetic tree revealed that the GmCNGC gene family can be divided into four major groups (Groups I-IV) (Figure 2), which is based on the Arabidopsis groups [20]. Groups I-III are monophyletic, whereas Group IV is subdivided into two distinct clades (Groups IV-a and IV-b). Group IV contains 13 GmCNGC genes, while the other groups contain five to 12 members. Most of these proteins contained a cyclic nucleotide-binding domain (CNBD) and an ion transport domain ( Figure S1 (Supplementary Materials)). The sequence alignment of the 39 GmCNGC proteins indicated that the two most conserved regions within the CNBD domain are a phosphate-binding cassette (PBC), and an adjacent hinge region ( Figure S2 (Supplementary Materials)), which is the characteristics of CNBD [72]. Additionally, a relatively conserved IQ motif was shown in all the 39 GmCNGC proteins.

Ca 2+ -ATPase Gene Family in Soybean
The Ca 2+ -ATPases are P-type ATPases that can be classified in two distinct groups: Type IIA (ECA) and type IIB (ACA). Here, we identified eight ECA genes and 25 ACA genes in the soybean genome (Table S3 (Supplementary Materials)). The soybean Ca 2+ -ATPase family exhibited little sequence variation in terms of amino acid length (752 to 1103 aa), but presented high variation in terms of intron number (from 0 to 36) (Table S3 (Supplementary Materials)). All the soybean Ca 2+ -ATPase proteins were predicted to contain five to ten transmembrane domains (Table S3 (Supplementary Materials)). Domain analysis showed that five domains characteristic for Ca 2+ -ATPases were found to be conserved among most of the soybean Ca 2+ -ATPase proteins. These domains include the N-terminal autoinhibitory domain, N-terminus cation transporter/ATPase, E1-E2 ATPase, haloacid dehalogenase-like hydrolase, and C-terminus cation transporter/ATPase ( Figure  S3 (Supplementary Materials)). Most of the soybean ACAs were predicted to contain an N-terminal autoinhibitory domain, with the exception of six GmACAs ( Figure S3 (Supplementary Materials)). To analyze the evolutionary relatedness among the members of Ca 2+ -ATPases from soybean, rice, and Arabidopsis, a phylogenetic tree was constructed using the deduced amino acid sequences. The phylogenetic analysis revealed that Ca 2+ -ATPases formed two distinct groups, namely, type IIA and type IIB (Figure 3). Type IIB is divided into four subgroups, based on the groups of Arabidopsis. The Ca 2+ -ATPases of the same group seemed to have a high degree of evolutionary relatedness. The investigation for the evolution of Ca 2+ -ATPases among these plant species revealed that the gene

CNGC Gene Family in Soybean
In this study, 39 CNGC gene members were identified in the soybean genome (Table S2 (Supplementary Materials)). CNGC protein sequences from soybean, Arabidopsis, and rice were aligned using ClustalW, and a phylogenetic tree was constructed using the neighbor-joining method to determine the similarity and homology between the CNGC families of these plants. The topology of the phylogenetic tree revealed that the GmCNGC gene family can be divided into four major groups (Groups I-IV) (Figure 2), which is based on the Arabidopsis groups [20]. Groups I-III are monophyletic, whereas Group IV is subdivided into two distinct clades (Groups IV-a and IV-b). Group IV contains 13 GmCNGC genes, while the other groups contain five to 12 members. Most of these proteins contained a cyclic nucleotide-binding domain (CNBD) and an ion transport domain ( Figure S1 (Supplementary Materials)). The sequence alignment of the 39 GmCNGC proteins indicated that the two most conserved regions within the CNBD domain are a phosphate-binding cassette (PBC), and an adjacent hinge region ( Figure S2 (Supplementary Materials)), which is the characteristics of CNBD [72]. Additionally, a relatively conserved IQ motif was shown in all the 39 GmCNGC proteins.

Ca 2+ -ATPase Gene Family in Soybean
The Ca 2+ -ATPases are P-type ATPases that can be classified in two distinct groups: Type IIA (ECA) and type IIB (ACA). Here, we identified eight ECA genes and 25 ACA genes in the soybean genome (Table S3 (Supplementary Materials)). The soybean Ca 2+ -ATPase family exhibited little sequence variation in terms of amino acid length (752 to 1103 aa), but presented high variation in terms of intron number (from 0 to 36) (Table S3 (Supplementary Materials)). All the soybean Ca 2+ -ATPase proteins were predicted to contain five to ten transmembrane domains (Table S3 (Supplementary  Materials)). Domain analysis showed that five domains characteristic for Ca 2+ -ATPases were found to be conserved among most of the soybean Ca 2+ -ATPase proteins. These domains include the N-terminal autoinhibitory domain, N-terminus cation transporter/ATPase, E1-E2 ATPase, haloacid dehalogenase-like hydrolase, and C-terminus cation transporter/ATPase ( Figure S3 (Supplementary Materials)). Most of the soybean ACAs were predicted to contain an N-terminal autoinhibitory domain, with the exception of six GmACAs ( Figure S3 (Supplementary Materials)). To analyze the evolutionary relatedness among the members of Ca 2+ -ATPases from soybean, rice, and Arabidopsis, a phylogenetic tree was constructed using the deduced amino acid sequences. The phylogenetic analysis revealed that Ca 2+ -ATPases formed two distinct groups, namely, type IIA and type IIB ( Figure 3). Type IIB is divided into four subgroups, based on the groups of Arabidopsis. The Ca 2+ -ATPases of the same group seemed to have a high degree of evolutionary relatedness. The investigation for the evolution of Ca 2+ -ATPases among these plant species revealed that the gene members in the same clade have a high degree of similarity (Figure 3), suggesting the common ancestry of P-type II Ca 2+ -ATPases in these plant species.
Agronomy 2020, 10, 1539 6 of 28 Agronomy 2020, 10, x FOR PEER REVIEW 6 of 33 Figure 2. Unrooted phylogenetic tree of the cyclic nucleotide-gated ion channel (CNGC) family proteins from soybean (Glycine max), rice (Oryza sativa), and Arabidopsis thaliana. The alignment for the phylogenetic tree was performed with ClustalW using full-length amino acid sequences. The phylogenetic tree was created with the MEGA6 software and the neighbor-joining method with 1000 bootstrap replications. GmCNGCs are marked with red circles, AtCNGCs were marked with purple triangles, while OsCNGCs were marked with green diamonds. The locus name of each gene is shown in brackets. The bar indicates the relative divergence of the sequences examined, and bootstrap values are displayed next to the branch. Roman numerals designate the subfamilies.  . Unrooted phylogenetic tree of the cyclic nucleotide-gated ion channel (CNGC) family proteins from soybean (Glycine max), rice (Oryza sativa), and Arabidopsis thaliana. The alignment for the phylogenetic tree was performed with ClustalW using full-length amino acid sequences. The phylogenetic tree was created with the MEGA6 software and the neighbor-joining method with 1000 bootstrap replications. GmCNGCs are marked with red circles, AtCNGCs were marked with purple triangles, while OsCNGCs were marked with green diamonds. The locus name of each gene is shown in brackets. The bar indicates the relative divergence of the sequences examined, and bootstrap values are displayed next to the branch. Roman numerals designate the subfamilies. . Unrooted phylogenetic tree of the Ca 2+ -ATPase family proteins from soybean (Glycine max), rice (Oryza sativa), and Arabidopsis thaliana. The alignment for the phylogenetic tree was performed with ClustalW using full-length amino acid sequences. Soybean Ca 2+ -ATPases are marked with red circles, Arabidopsis Ca 2+ -ATPases were marked with purple triangles, while rice Ca 2+ -ATPases were marked with green diamonds. The locus name of each gene is shown in brackets. The bar indicates the relative divergence of the sequences examined, and bootstrap values are displayed next to the branch.  Figure 3. Unrooted phylogenetic tree of the Ca 2+ -ATPase family proteins from soybean (Glycine max), rice (Oryza sativa), and Arabidopsis thaliana. The alignment for the phylogenetic tree was performed with ClustalW using full-length amino acid sequences. Soybean Ca 2+ -ATPases are marked with red circles, Arabidopsis Ca 2+ -ATPases were marked with purple triangles, while rice Ca 2+ -ATPases were marked with green diamonds. The locus name of each gene is shown in brackets. The bar indicates the relative divergence of the sequences examined, and bootstrap values are displayed next to the branch.

CaCA Gene Family in Soybean
Here, a total of 30 genes encoding putative CaCAs were identified in the soybean genome, including 18 CAX genes, eight CCX genes, three NCL genes, and one MHX gene (Table S4 (Supplementary  Materials)). The intron number of GmCaCA family genes varied from 0 to 11. The protein length of the GmCaCA family ranged from 182 to 718 amino acids (Table S4 (Supplementary Materials)). All the GmCaCA proteins were predicted to contain three to 14 transmembrane domains (Table S4 (Supplementary Materials)). All the GmCaCA proteins contain NCX domains with the exception of GmNCL3, which has a high sequence and phylogenetic similarity to other NCL members (Figure 4, Figures S4 and S5 (Supplementary Materials)). In addition, all the GmNCLs were predicted to contain EF-hand motifs ( Figure S4 (Supplementary Materials)), which is similar to the NCL proteins in Arabidopsis and rice [73]. The presence of EF-hand motifs suggests that these proteins may bind Ca 2+ and regulate cellular Ca 2+ homeostasis. Phylogenetic analysis indicated that GmCaCA members are closely related to the CaCA proteins in Arabidopsis and rice ( Figure 4).

OSCA Gene Family in Soybean
A total of 21 putative OSCA genes were identified, and they were named GmOSCA1.1 to GmOSCA4.2 in accordance with Arabidopsis orthologues (Table S5 (Supplementary Materials)). A phylogenetic tree was generated to compare the evolutionary relationship of OSCA proteins among Arabidopsis, rice, and soybean. The OSCA family members were separated into four distinct clades, designated I, II, III, and IV ( Figure 5). Clade I, II, III, and IV in soybean contained nine, eight, two, and two members, respectively. The amino acid number of GmOSCAs varied from 500 to 803 (Table S5 (Supplementary Materials)). Seven to 11 transmembrane domains were predicted for the different GmOSCAs (Table S5 (Supplementary Materials)). In addition, Ca 2+ dependent channel domains, late exocytosis domains, and cytosolic domains were contained by most of the GmOSCAs ( Figure S6 (Supplementary Materials)).

GLR Gene Family in Soybean
Thirty-five putative GLR genes were identified in the soybean genome (Table S6 (Supplementary  Materials)). The encoded proteins were predicted to contain three to eight transmembrane domains (Table S6 (Supplementary Materials)). All the proteins were predicted to contain a ligand-gated ion channel, and most of them contain a receptor family ligand-binding domain ( Figure S7 (Supplementary Materials)). To detect the evolutionary relationships of GmGLRs, a phylogenetic tree was generated from alignments of all the GLRs in soybean, Arabidopsis, and rice ( Figure 6). These GLRs could be grouped into four clades, namely, I, II, III, and IV. There were 4, 16, and 15 GmGLR members in clade I, III, and IV. In clade II, there were nine Arabidopsis GLR members and four rice GLR members, but no soybean GLR member was found. In clade IV, only one rice GLR was shown, and no Arabidopsis GLR was found ( Figure 6). Therefore, the phylogenetic relationships among these plants are different in clade II and clade IV, suggesting a diverging trend in the evolution of GLR family across different species. The alignment for the phylogenetic tree was performed with ClustalW using full-length amino acid sequences. Soybean CaCAs are marked with red circles, Arabidopsis CaCAs were marked with purple triangles, while rice CaCAs were marked with green diamonds. The bar indicates the relative divergence of the sequences examined, and bootstrap values are displayed next to the branch.

OSCA Gene Family in Soybean
A total of 21 putative OSCA genes were identified, and they were named GmOSCA1.1 to GmOSCA4.2 in accordance with Arabidopsis orthologues (Table S5 (Supplementary Materials) Figure 4. Unrooted phylogenetic tree of the CaCA (Ca 2+ /Cation Antiporter) family proteins from soybean (Glycine max), rice (Oryza sativa), and Arabidopsis thaliana. The alignment for the phylogenetic tree was performed with ClustalW using full-length amino acid sequences. Soybean CaCAs are marked with red circles, Arabidopsis CaCAs were marked with purple triangles, while rice CaCAs were marked with green diamonds. The bar indicates the relative divergence of the sequences examined, and bootstrap values are displayed next to the branch. designated I, II, III, and IV ( Figure 5). Clade I, II, III, and IV in soybean contained nine, eight, two, and two members, respectively. The amino acid number of GmOSCAs varied from 500 to 803 (Table  S5 (Supplementary Materials)). Seven to 11 transmembrane domains were predicted for the different GmOSCAs (Table S5 (Supplementary Materials)). In addition, Ca 2+ dependent channel domains, late exocytosis domains, and cytosolic domains were contained by most of the GmOSCAs ( Figure S6 (Supplementary Materials)).

Figure 5.
Unrooted phylogenetic tree of the OSCA family proteins from soybean (Glycine max), rice (Oryza sativa), and Arabidopsis thaliana. Soybean OSCAs are marked with red circles, Arabidopsis OSCAs were marked with purple triangles, while rice OSCAs were marked with green diamonds. The bar indicates the relative divergence of the sequences examined, and bootstrap values are displayed next to the branch. Roman numerals designate the subfamilies.

ANN Gene Family in Soybean
A total of 26 ANN genes were identified in the soybean genome; they were designated as GmANN1 to GmANN26 (Table S7 (Supplementary Materials)). Three to seven introns were found in these genes, and their protein length ranged from 141 to 362 amino acids, with a typical length of 296 amino acids (Table S7 (Supplementary Materials)). All the GmANN proteins were predicted to contain annexin domains, and their sequences are highly conserved ( Figures S8 and S9 (Supplementary Materials)). A phylogenetic tree was constructed using full-length amino acid sequences of all the ANNs in soybean, Arabidopsis, and rice. These ANNs could be grouped into six clades (Figure 7). Clade I contains 11 GmANN members, clade V contains five GmANN members, while other clades contain one to four members. In each clade, proteins from soybean, Arabidopsis, and rice are included, with the exception that there is no Arabidopsis ANN in clade VI (Figure 7). These results suggest the similar clustering of ANN proteins in these three plants.

TPC Proteins in Soybean
Although there is only one TPC member in Arabidopsis and rice, two orthologues were identified in the soybean genome, and designated as GmTPC1 and GmTPC2 (Table S8 (Supplementary  Materials)). These two genes both contain 23 introns and encode proteins with an amino acid length of 738-739. Twelve to thirteen transmembrane domains were predicted for these two proteins ( Figure  8A). Similar to AtTPC1 and OsTPC1, ion transport protein domain and EF-hand pair were found in GmTPC1 and GmTPC2 ( Figure 8A). Phylogenetic analysis indicated that GmTPC1 and GmTPC2 are

TPC Proteins in Soybean
Although there is only one TPC member in Arabidopsis and rice, two orthologues were identified in the soybean genome, and designated as GmTPC1 and GmTPC2 (Table S8 (Supplementary Materials)). These two genes both contain 23 introns and encode proteins with an amino acid length of 738-739. Twelve to thirteen transmembrane domains were predicted for these two proteins ( Figure 8A). Similar to AtTPC1 and OsTPC1, ion transport protein domain and EF-hand pair were found in GmTPC1 and GmTPC2 ( Figure 8A). Phylogenetic analysis indicated that GmTPC1 and GmTPC2 are closely related to the AtTPC1 and OsTPC1 ( Figure 8B).

MSL Gene Family in Soybean
A total of 16 genes were identified to be putative MSL proteins in the soybean genome (Table S9  (Supplementary Materials)). These proteins have a length from 387 to 947 were predicted to contain three to seven transmembrane domains (Table S9 (Supplementary Materials)). All the soybean MSL proteins were predicted to contain a mechanosensitive ion channel domain ( Figure S10 (Supplementary Materials)). A phylogenetic tree constructed to analyze the evolutionary relationships among MSL proteins in soybean, Arabidopsis, and rice, showed that the MSLs could be classified into three clades. There were two, six, and eight GmMSL members in clade I, II, and III, respectively ( Figure 9).

MSL Gene Family in Soybean
A total of 16 genes were identified to be putative MSL proteins in the soybean genome (Table S9 (Supplementary Materials)). These proteins have a length from 387 to 947 were predicted to contain three to seven transmembrane domains (Table S9 (Supplementary Materials)). All the soybean MSL proteins were predicted to contain a mechanosensitive ion channel domain ( Figure S10 (Supplementary Materials)). A phylogenetic tree constructed to analyze the evolutionary relationships among MSL proteins in soybean, Arabidopsis, and rice, showed that the MSLs could be classified into three clades. There were two, six, and eight GmMSL members in clade I, II, and III, respectively (Figure 9).

MCA Gene Family in Soybean
Five loci were found to be potential MCA genes; these genes designated as GmMCA1 to GmMCA5 contain five to eight introns and encode proteins with a length of 403 to 418 amino acids (Table S10 (Supplementary Materials)). The protein sequences of MCAs are very conserved ( Figure  S11 (Supplementary Materials)). All the GmMCAs contain a putative transmembrane domain and a Cys-rich domain of the unknown function, called the PLAC8 motif ( Figure 10A). A phylogenetic tree indicated that GmMCAs are closely related to the three proteins in Arabidopsis and rice (AtMCA1, AtMCA2, and OsMCA1) ( Figure 10B).

MCA Gene Family in Soybean
Five loci were found to be potential MCA genes; these genes designated as GmMCA1 to GmMCA5 contain five to eight introns and encode proteins with a length of 403 to 418 amino acids (Table S10 (Supplementary Materials)). The protein sequences of MCAs are very conserved ( Figure S11 (Supplementary Materials)). All the GmMCAs contain a putative transmembrane domain and a Cys-rich domain of the unknown function, called the PLAC8 motif ( Figure 10A). A phylogenetic tree indicated that GmMCAs are closely related to the three proteins in Arabidopsis and rice (AtMCA1, AtMCA2, and OsMCA1) ( Figure 10B).

Chromosomal Distribution and Gene Duplication
The identified 207 genes encoding putative Ca 2+ transport proteins were variously distributed on all the 20 chromosomes of soybean ( Figure 11). The number of genes on each chromosome ranged from 3 to 24. A maximum of 24 genes was found on chromosome 13, while only three genes were found on chromosome 20. Meeting the criteria of distance less than 100 kb and separation less than five intervening genes, 20 groups of genes exhibited tandem duplication ( Figure 11). Out of the 20 groups, six groups had two genes, each duplicated; the remaining groups had three to five genes duplicated, thus forming clusters on the chromosomes. The GLR, CNGC, and ANN gene family members formed four, two, and two clusters, respectively.

Chromosomal Distribution and Gene Duplication
The identified 207 genes encoding putative Ca 2+ transport proteins were variously distributed on all the 20 chromosomes of soybean ( Figure 11). The number of genes on each chromosome ranged from 3 to 24. A maximum of 24 genes was found on chromosome 13, while only three genes were found on chromosome 20. Meeting the criteria of distance less than 100 kb and separation less than five intervening genes, 20 groups of genes exhibited tandem duplication ( Figure 11). Out of the 20 groups, six groups had two genes, each duplicated; the remaining groups had three to five genes duplicated, thus forming clusters on the chromosomes. The GLR, CNGC, and ANN gene family members formed four, two, and two clusters, respectively.
In this study, 79 duplicated gene pairs of putative Ca 2+ transporters were identified with higher bootstrap values (<90%) (Table S11 (Supplementary Materials)). The duplication of genes resulted in gene family expansion. The synonymous substitution rates (Ks), the non-synonymous substitution rates (Ka), and the Ka/Ks ratio for the 79 duplicated gene pairs revealed high similarities in their coding sequence alignments. The Ks values of these 79 genes ranged from 0.05 for gene pair GmCNGC13/GmCNGC14 to 0.48 for gene pair GmANN3/GmANN4 with an average Ks of 0.139 (Table S11 (Supplementary Materials)). According to the Ka/Ks ratio, the evolutionary history of selection acting on different genes could be measured [74]. The direction and magnitude of natural selection enforcing on the different protein-coding genes could be interpreted by the Ka/Ks ratio. A pair of sequences having Ka/Ks < 1 indicates purifying selection; Ka/Ks = 1 implies both sequences are drifting neutrally; and Ka/Ks > 1 indicates positive or Darwinian selection [62]. The Ka/Ks of 79 duplicated gene pairs was found to be less than 0.64 (Table S11 (Supplementary Materials))-which indicates the influence of purifying selection on the evolution of these gene pairs. Based on the divergence rate of λ = 6.1 × 10 −9 proposed for soybean [62], the duplication time for each gene pairs was calculated. It is observed that all the segmental duplicated pairs showed a time frame between 4.49 and 39.39 Mya, with an average of 11.38 Mya (Table S11 (Supplementary Materials)). In this study, 79 duplicated gene pairs of putative Ca 2+ transporters were identified with higher bootstrap values (<90%) (Table S11 (Supplementary Materials)). The duplication of genes resulted in gene family expansion. The synonymous substitution rates (Ks), the non-synonymous substitution rates (Ka), and the Ka/Ks ratio for the 79 duplicated gene pairs revealed high similarities in their coding sequence alignments. The Ks values of these 79 genes ranged from 0.05 for gene pair GmCNGC13/GmCNGC14 to 0.48 for gene pair GmANN3/GmANN4 with an average Ks of 0.139 (Table S11 (Supplementary Materials)). According to the Ka/Ks ratio, the evolutionary history of selection acting on different genes could be measured [74]. The direction and magnitude of natural selection enforcing on the different protein-coding genes could be interpreted by the Ka/Ks ratio. A pair of sequences having Ka/Ks < 1 indicates purifying selection; Ka/Ks = 1 implies both sequences are drifting neutrally; and Ka/Ks > 1 indicates positive or Darwinian selection [62]. The Ka/Ks of 79 duplicated gene pairs was found to be less than 0.64 (Table S11 (Supplementary Materials))-which indicates the influence of purifying selection on the evolution of these gene pairs. Based on the divergence rate of λ = 6.1 × 10 −9 proposed for soybean [62] Figure 11. Graphical representation of locations for all the Ca 2+ transport genes on each chromosome of soybean. Genes within the same family were marked in the same color. Genes within a light blue box are putatively tandemly duplicated.

Hormone and Stress-Related Cis-Elements in the Promoters
The presence of cis-elements associated with hormonal signal and stress responses were surveyed in the −1500 bp promoter regions upstream to the predicted transcription start sites of these putative Ca 2+ transport genes. A total of 13 cis-elements have been surveyed, such as dehydration and cold response element DRE/CRT, ABA response element ABRE, ethylene-responsive element GCC-box, SA-responsive element SARE, and WRKY-binding site W-box (Table S12 (Supplementary Materials)). The exact positions of cis-elements located in the promoters were shown in Table S13 (Supplementary Materials). All of the Ca 2+ transport genes contain at least one kind of cis-element; some genes contain up to nine kinds of cis-elements, like GmCNGC16, GmOSCA2.8, GmCAX18, GmANN4, GLR3.3, and GmMCA2 (Table S13 (Supplementary Materials)). Among the Ca 2+ transport genes, most of them contain W-box and/or sulfur-responsive element SURE, around half of them contain SARE, MYC2-binding site, and/or I-box, 36 of them contain ABRE, 68 of them contain DRE, while only 8 of them contain GCC-box ( Figure S13 (Supplementary Materials)). It is notable that some cis-elements have many copies in the promoter regions ( Figure 14 and Figure S14 (Supplementary Materials)). For example, three and four ABRE elements were found in the promoter of GmOSCA2.7 and GmOSCA2.8, respectively; five DRE elements were found in the promoter of GmCAX18; five W-boxes, two CG-boxes, and two I-boxes were found in the promoter of GmCNGC32 (Figure 14 and Figure S14 (Supplementary Materials)). It is interesting that the drought-responsive element DRE/CRT is more enriched in the drought/dehydration-responsive genes encoding putative Ca 2+ transporters ( Figure 13); the percentage of drought/dehydration-responsive genes containing DRE/CRT element is about 1.6 times more than that of the total Ca 2+ transporter genes and about 1.8 times more than that of drought/dehydration-irresponsive genes ( Figure S15

Hormone and Stress-Related Cis-Elements in the Promoters
The presence of cis-elements associated with hormonal signal and stress responses were surveyed in the −1500 bp promoter regions upstream to the predicted transcription start sites of these putative Ca 2+ transport genes. A total of 13 cis-elements have been surveyed, such as dehydration 3.15. Analysis of Potential MicroRNA Target Sites microRNAs (miRNAs) have been indicated to be involved in growth and development control, stress response, and nutrient homeostasis by mediating post-transcriptional regulation [75,76]. To detect potential miRNA target sites within the putative Ca 2+ transporter genes, the mRNA sequences of putative Ca 2+ transporter genes were acquired from SoyBase (http://www.soybase.org/) and analyzed with the psRNATarget server (http://plantgrn.noble.org/psRNATarget/) [77]. To decrease the number of false-positive predictions, miRNA/target site pairs with an expectation score, and cut-off threshold of 3.0 were considered. Consequently, 61 miRNA members belonging to 37 families were predicted to target 47 genes (Table S14 (Supplementary Materials)). Many of these miRNAs have multiple target genes. For example, gma-miR156b has potential target sites on GmACA22 and GmOSCA2.2; gma-miR164 has potential target sites on GmMSL13 and GmMSL14; gma-miR9750 has potential target sites on GmMCA3 and GmMCA4 (Table S14 (Supplementary Materials)). These results suggest that miRNAs are potentially involved in Ca 2+ signaling and Ca 2+ homeostasis regulation.

Discussion
The physiological and biochemical functions of Ca 2+ are enabled by its elaborate transportation across cell membranes, which is mediated by diverse Ca 2+ transporters, including Ca 2+ channels, pumps, and antiporters [3,5]. Ca 2+ transporters, which differ in their cellular distribution and mechanism of transport, execute the complex and tight regulation of Ca 2+ homeostasis [2]. Many proteins belonging to the families of CNGC, GLR, TPC, OSCA, and MCA have been identified to function as Ca 2+ channels [15][16][17]26,33]. Some proteins belonging to protein families of ANN and MSL were also suggested to be Ca 2+ -permeable channels [3,37]. Although the physiological roles of many Ca 2+ channels, pumps, and antiporters have been well documented in model plants like Arabidopsis and rice [12,18,21,34,78,79], there is still limited characterization of Ca 2+ transport gene families in soybean. In the present study, we have performed a genome-wide in silico analysis of soybean to identify diverse Ca 2+ transport gene families, including Ca 2+ -ATPase, CaCA, CNGC, OSCA, GLR, TPC, MCA, MSL, and ANN families. The chromosomal location, gene and protein structure, domain architecture, expression profiles, promoter cis-elements, and potential miRNA target sites of genes belonging to these families have been analyzed. The total number of putative Ca 2+ transport protein-coding genes is 104 in Arabidopsis and 89 in rice, while a total of 207 genes encoding putative Ca 2+ transport proteins were identified in soybean (Figure 1, Tables S1-S11 (Supplementary Materials)). The number of total Ca 2+ transport protein-coding genes in soybean is about 2.0 and 2.3 times more than that of Arabidopsis and rice, respectively. For each of the nine Ca 2+ transport protein families, the number of family members in soybeans was around 2.0 times more than that in Arabidopsis or in rice ( Figure 1). For example, the number of CNGC genes in Arabidopsis and rice is 20 and 16, respectively, while the number in soybean is 39, which includes 15 segmentally duplicated gene pairs and five tandem duplicated gene groups ( Figure 11, Table S11 (Supplementary Materials)). The possible reason behind this significant increase in gene number might be the two duplication events of soybean that has occurred after the monocot/dicot split approximately 59 and 13 million years ago [80]. In this study, phylogenetic trees were constructed to study the phylogenesis and the expansion of Ca 2+ transport protein gene families in the course of evolution (Figures 2-10). The Ca 2+ transport protein-coding genes from Arabidopsis, rice, and soybean form common sub-clades with high bootstrap values (Figures 2-10), suggesting the conserved evolution of these gene families in monocots and eudicots via common ancestors [4,55]. There are three principal patterns of gene duplications, such as tandem duplication, segmental duplication, and transposition. A total of 79 segmentally duplicated gene pairs were identified in all the putative Ca 2+ transporter families (Table S11 (Supplementary  Materials)). In addition, 20 gene groups showed tandem duplication ( Figure 11). The expansion of putative Ca 2+ transport protein-coding genes by gene duplication could be associated with the adaptation to various adverse environmental conditions during the evolution of soybean.
It is clear that various Ca 2+ transport proteins play important roles in plant growth and development by mediating the generation of Ca 2+ signaling. For example, Arabidopsis CNGC5, CNGC6, CNGC9, and CNGC14 function as central regulators of Ca 2+ oscillations in root hair tip growth [9,81]; Arabidopsis CNGC7, CNGC8, CNGC18, GLR1.2, and GLR3.3 are required for pollen tube growth [26,82]; GLR3.2 and GLR3.4 were indicated to form heteromeric channels to regulate lateral root development via Ca 2+ signaling in the phloem of Arabidopsis [83]; Ca 2+ -ATPase genes, ACA8, ACA10, ACA12, and ACA13, were suggested to be involved in vegetative growth, inflorescence growth, and seed setting [84,85]. In the present study, the tissue expression patterns of putative Ca 2+ transporter genes showed that these genes were differentially expressed in diverse tissues of soybean during growth and development ( Figure 12 and Figure S12 (Supplementary Materials)). Some ubiquitously expressed genes like GmCNGC12/13/14/25/26, GmACA5/9/11/13/17, GmCAX1/2/8, and GmOSCA3.1/3.2/4.1/4.2 may function in diver tissues during the whole life of soybean. Some genes showed strong expression during seed development, such as GmCNGC29 and GmANN5 (Figure 12 and Figure S12 (Supplementary Materials)), suggesting their potential roles in seed setting. Some genes expressed distinctively in the flowers, such as GmCNGC20, GmACA16, GmOSCA1.6, and GmGLR1.2/4.8/4.9/4.10, which may be associated with the reproductive growth of soybean. It is interesting that some genes, such as GmCNGC33, GmACA21/22, GmCAX4, GmCCX4, and GmGLR4.6, were predominantly expressed in root nodule. In addition, GmCNGC15, GmCNGC27, GmCNGC34, and GmANN24 were found to be induced by the inoculation of rhizobia (B. japonicum) ( Figure 13). These results suggest their putative roles in symbiosis establishment and nodule development. In Medicago truncatula, the Ca 2+ -ATPase MCA8 and Ca 2+ channel CNGC15 have been indicated to be required for the generation of Ca 2+ spiking in the nucleus, where the Ca 2+ spiking is induced by symbiosis [86,87].
In this study, 35.7% of the putative Ca 2+ transporter genes in the soybean genome were found to be significantly and differentially regulated under diverse abiotic and biotic stresses, such as short-term salt, short-term dehydration, flooding, drought, phosphorus deficiency, zinc deficiency, and fungal pathogen inoculation ( Figure 13). These differentially expressed genes included members from all groups of Ca 2+ transport proteins. All of the putative Ca 2+ transporter genes contain diverse stress-and/or hormone-related cis-elements in their promoter regions ( Figure 14 and Figure S14 (Supplementary Materials)). The stress-responsive expression patterns of the putative Ca 2+ transporter genes suggest the involvement of diverse Ca 2+ transporters in abiotic and biotic stress responses. It is well-known that numerous environmental and developmental cues can trigger fluctuations of [Ca 2+ ] cyt , and various Ca 2+ transporters are involved in these processes [13]. It is notable that 48 genes are responsive to short-term salt stress treatment and most of them are Ca 2+ -ATPase genes and GLR genes ( Figure 13). In soybean, eight Ca 2+ -ATPase genes (GmACA1/2/12/13/22/23/25/26) were found to be quickly induced by salt stress (Figure 13). This is similar to previous findings that some plant Ca 2+ -ATPase genes were up-regulated by salt stress [45,78,88]. AtACA2 and AtACA4 in Arabidopsis, PCA1 in the moss Physcomitrella patens, and OsACA6 in rice have been identified to be involved in salt stress tolerance [44,78,[89][90][91]. The enhanced capacity of Ca 2+ -ATPase could help in restoring [Ca 2+ ] cyt , which is transiently elevated by salt treatment, and thus, maintaining Ca 2+ homeostasis.
In addition to salt stress, various genes of Ca 2+ channels, pumps, and antiporters were also indicated to be involved in responses and tolerance to numerous stresses [12,15,21,79,92,93]. However, whether the stress-responsive Ca 2+ transporter genes are possibly involved in stress response and tolerance in soybean deserves further researches.

Conclusions
In this study, a total of 207 genes encoding putative Ca 2+ transporters have been identified in soybean genome, including 33 Ca 2+ -ATPase genes, 30 CaCA genes, 39 CNGC genes, 21 OSCA genes, 35 GLR genes, two TPC genes, 26 ANN genes, 16 MSL genes, and 5 MCA genes. A comprehensive analysis of these genes was preformed, including phylogenetic relationships, chromosomal distribution, protein domains, gene duplication, promoter cis-elements, potential miRNA target sites, and tissue and stress-responsive expression patterns. The comprehensive in silico characterization and expression analyses would provide preliminary evidence of the potential of role of Ca 2+ transporter genes in growth and development, symbiosis, and responses to multiple environmental stresses in soybean. Considering the important roles of diverse Ca 2+ transporters in developmental regulation and environmental stress adaptation that have been documented in model plants, further in-depth functional characterization of the orthologous genes in soybean through the combination of biochemical, molecular, physiological and genetic approaches would facilitate the development of high-yielding and stress-resistant crops.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4395/10/10/1539/s1. Figure S1: Schematic representation of functional domains of soybean cyclic nucleotide-gated ion channel (CNGC) family proteins. Figure S2: Multiple sequence alignment of GmCNGC-specific motifs. Figure S3: Schematic representation of functional domains of soybean Ca 2+ -ATPase family proteins. Figure S4: Schematic representation of functional domains of soybean Ca 2+ /cation antiporter (CaCA) superfamily proteins. Figure S5: Multiple sequence alignment of NCL proteins of Arabidopsis, rice and soybean. Figure S6: Schematic representation of functional domains of soybean OSCA family proteins. Figure S7: Schematic representation of functional domains of soybean GLR family proteins. Figure S8: Schematic representation of functional domains of soybean ANN family proteins. Figure S9: Multiple sequence alignment of AtANN1 and GmANNs. Figure S10: Schematic representation of functional domains of soybean MSL family proteins. Figure S11: Multiple sequence alignment of MCA proteins in soybean, Arabidopsis and rice. Figure S12: Heat map representation for tissue-specific expression patterns of the predicted calcium transport genes according to Illumina transcriptome data. Figure S13: Number of genes containing the corresponding cis-element in their promoter regions. Figure S14: The distribution of cis-elements in the promoters of soybean Ca 2+ transport genes. Figure S15: The occurrence of DRE/CRE cis-element in the promoter regions (1500 bp) of the total soybean Ca 2+ transporter genes, Ca 2+ transporter genes that are responsive to drought/dehydration, and Ca 2+ transporter genes that are irresponsive to drought/dehydration. Table S1: A list of all the putative genes encoding Ca 2+ transport proteins in soybean genome. Table S2: A list of putative CNGC genes and their sequence details in soybean. Table S3: A list of putative Ca 2+ -ATPase genes and their sequence details in soybean. Table S4: A list of putative CaCA genes and their sequence details in soybean. Table S5: A list of putative OSCA genes and their sequence details in soybean. Table S6: A list of putative GLR genes and their sequence details in soybean. Table S7: A list of putative ANN genes and their sequence details in soybean. Table S8: A list of putative TPC genes and their sequence details in soybean. Table S9: A list of putative MSL genes and their sequence details in soybean. Table S10: A list of putative MCA genes and their sequence details in soybean. Table S11: Identification of substitution rates for soybean Ca 2+ transport gene pairs. Table S12: Summary of cis-acting elements that are related to plant hormone and stress response. Table S13: Distribution of hormone-and stress-related cis-elements in promoters of putative Ca 2+ transporter genes in soybean. Table S14: Putative microRNA targets predicted in the putative Ca 2+ transporter genes.

Conflicts of Interest:
The authors declare no conflict of interest.