Genome-Wide Identification and Characterization of the Soybean Snf2 Gene Family and Expression Response to Rhizobia

Sucrose nonfermenting 2 (Snf2) family proteins are the core components of chromatin remodeling complexes, which use the energy of ATP to alter chromatin structure and nucleosome position, and they play a vital role in transcription regulation, DNA replication, and DNA damage repair. Snf2 family proteins have been characterized in various species including plants, and they have been found to regulate development and stress responses in Arabidopsis. Soybean (Glycine max) is an important food and economic crop worldwide. Unlike non-leguminous crops, soybean can form a symbiotic relationship with rhizobia for biological nitrogen fixation. However, little is known about Snf2 family proteins in soybean. In this study, we identified 66 Snf2 family genes in soybean that, as in Arabidopsis, could be classified into six groups and were unevenly distributed on the 20 soybean chromosomes. Phylogenetic analysis with Arabidopsis revealed that these 66 Snf2 family genes could be divided into 18 subfamilies. Collinearity analysis showed that segmental duplication, rather than tandem duplication, was the main mechanism for expansion of the Snf2 genes. Further evolutionary analysis indicated that the duplicated gene pairs had undergone purifying selection. Seven domains were identified across the Snf2 proteins, and each Snf2 protein had at least one SNF2_N domain and one Helicase_C domain. Promoter analysis revealed that most Snf2 genes had cis-elements associated with jasmonic acid, abscisic acid, and nodule specificity in their promoter regions. Microarray data and real-time quantitative PCR (qPCR) analysis revealed that most Snf2 family genes were expressed in both root and nodule tissues, and some of them were significantly downregulated after rhizobial infection. In this study, we conducted a comprehensive analysis of the soybean Snf2 family genes and demonstrated their responsiveness to rhizobial infection. 
This provides insight into the potential roles of Snf2 family genes in soybean symbiotic nodulation.


Introduction
Chromatin, which consists of nucleosomes as its basic units, is the eukaryotic DNA-protein complex that allows efficient packaging of large genomes within nuclei. Nucleosomes are composed of 147 base pairs of DNA tightly wrapped around an octamer of histone proteins, comprising two copies each of H2A, H2B, H3, and H4 [1]. However, chromatin also poses a barrier that prevents regulatory factors such as transcription factors from accessing DNA. Therefore, eukaryotes have evolved precise mechanisms to regulate chromatin structure and dynamics in a spatiotemporal manner, which is essential for gene expression control. These mechanisms include covalent modification of DNA and histones, replacement of histone variants, regulation of nucleosome assembly by histone chaperones, and ATP-dependent chromatin remodeling [2][3][4]. The Sucrose nonfermenting 2 (Snf2) [...] Love (TML), NIN, and other factors to regulate the expression of CLAVATA3/endosperm-surrounding region (CLE) and control root nodulation [49][50][51]. Soybean nodulation and nitrogen fixation involve a complex regulatory network. Although some regulatory genes have been discovered in the past 20 years, many aspects of this network remain unclear and many genes await discovery. Chromatin remodeling is an important component of epigenetic regulation that has rarely been reported in soybean nodulation research. Recent studies have found that histone deacetylases (HDACs) in Medicago truncatula affect primordium formation by regulating 3-hydroxy-3-methylglutaryl coenzyme A reductase 1 (MtHMGR1) gene expression [52]. The Snf2 family genes, an important gene family in chromatin remodeling, have rarely been studied in soybean. To better understand the role of chromatin remodeling in nodulation, this study provides a comprehensive analysis of the soybean Snf2 family genes and their expression. The results suggest that Snf2 family genes may participate in soybean symbiotic nodulation.

Identification of Soybean Snf2 Family Proteins
To comprehensively identify all Snf2 family proteins in soybean, we conducted a Hidden Markov Model (HMM) search using the SNF2_N (Pfam: PF00176) and Helicase_C (Pfam: PF00271) domains as queries against the soybean protein database available on Phytozome. This resulted in 502 genes encoding proteins with the conserved Helicase_C domain and 167 genes encoding proteins with the SNF2_N domain. The domain composition of these candidate proteins was further verified using CDD and SMART. Finally, we identified 66 high-confidence genes encoding proteins containing both the Helicase_C and SNF2_N domains as members of the soybean Snf2 family (Table S1). Following the nomenclature used for rice and Arabidopsis, we named the soybean Snf2 family proteins GmCHRs.
To examine the evolutionary relationship between Snf2 family proteins in soybean and Arabidopsis, the MEGAX software was used to analyze the Snf2 family members of both species with the maximum likelihood method. The results showed that the soybean Snf2 family proteins were classified into six groups and eighteen subfamilies (Figure 1). Most of the soybean Snf2 family members were closely related to their Arabidopsis counterparts and were distributed across all the subfamilies, with more members in the Snf2, DRD1, ERCC6, and Rad5/16 subfamilies (Figure 1). Among these subfamilies, the Snf2 subfamily had the most members (9), while the ALC1 and Rad54 subfamilies had only one member each (Figure 1 and Table S1).

Figure 1. Phylogenetic tree of Snf2 family proteins from soybean and Arabidopsis. The maximum likelihood tree was constructed using MEGAX software with 1000 bootstrap replications. Blue circles and red pentagrams represent Snf2 family proteins from soybean and Arabidopsis, respectively. Different groups of proteins are distinguished using different background colors.

Analysis of Chromosomal Distribution and Duplication of the Soybean Snf2 Family Genes
We used TBtools to draw a chromosome localization map of the soybean Snf2 family genes. Mapping the GmCHRs to the soybean genome revealed an uneven distribution of the 66 Snf2 family genes across the 20 chromosomes. The majority of these genes were located near the chromosomal ends, while a few were situated in the middle regions of the chromosomes (Figure 2). The number of Snf2 genes on each chromosome also varied: there was only 1 Snf2 gene on chr06, chr14, and chr19; 2 on chr03, chr04, chr05, chr15, and chr18; 3 on chr01 and chr11; 4 on chr02, chr08, chr09, and chr17; 5 on chr07 and chr20; 6 on chr10 and chr12; and 8 on chr13 (Figure 2).
Tandem and segmental duplication are widely acknowledged as crucial mechanisms for the expansion of gene families [53]. Therefore, we surveyed the contribution of both duplication types to the formation of the soybean Snf2 family genes. In general, tandem duplication is characterized by the presence of two paralogous genes located in close proximity to each other on the same chromosome, with typically no more than 5 intervening genes separating them. We detected no instances of tandem duplication, but the analysis revealed 34 gene pairs, involving 41 genes, that likely underwent segmental duplication (Figure 3). These results indicate that the origin of the Snf2 family genes in soybean is likely attributable to segmental duplication rather than tandem duplication. Similarly, Snf2 family genes in rice and Arabidopsis were found to have experienced only segmental duplication events [18]. Next, we computed the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions (Ka/Ks) to explore the possible selective pressure driving the duplication events of the GmCHRs (Table S2). A Ka/Ks ratio less than 1 is considered to indicate purifying selection, meaning that natural selection has removed deleterious mutations and maintained protein stability; a ratio greater than 1 indicates positive selection, suggesting that natural selection has acted on changes in the protein, causing the mutated sites to fix rapidly in the population and accelerating the evolution of the gene; a ratio equal to 1 indicates neutral evolution, suggesting that natural selection has no effect on the mutation [54]. The 34 pairs of genes that underwent segmental duplication all exhibited Ka/Ks ratios lower than 1 (0.056-0.506), indicating that the duplicated genes were subjected to purifying selection pressure (Table S2).
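The interpretation rule for Ka/Ks can be written down as a short sketch. The function name, the tolerance used to decide that a ratio equals 1, and the example Ka and Ks values are illustrative assumptions; they are not taken from Table S2 or from the TBtools calculator.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify the selection regime of a duplicated gene pair from Ka and Ks.

    Ka/Ks < 1 -> purifying selection, > 1 -> positive selection, = 1 -> neutral.
    The tolerance `tol` is an assumption made for this sketch.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical (Ka, Ks) values whose ratios sit at the ends of the reported range.
example_pairs = {"lowest_ratio": (0.056, 1.0), "highest_ratio": (0.506, 1.0)}
for label, (ka, ks) in example_pairs.items():
    print(label, classify_selection(ka, ks))
```

Both example ratios fall below 1, so both are classified as purifying selection, in line with the conclusion drawn for the 34 duplicated pairs.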
Int. J. Mol. Sci. 2023, 24, x FOR PEER REVIEW

Analysis of Gene Structure and Conserved Domains in the Soybean Snf2 Family
We analyzed the Snf2 family proteins and found that they range in length from 276 to 3789 amino acids, in molecular weight (Mw) from 32.1 to 410.7 kDa, and in theoretical isoelectric point (pI) from 4.96 to 9.3 (Table S3). To gain a better understanding of the functions of Snf2 family proteins, their conserved domains were analyzed using the Conserved Domain Database (CDD) and Pfam websites. The analysis revealed that the Chromo domain was found only in the Chd1 and Mi-2 subfamilies of the soybean Snf2 genes (Figure 4A). The PHD domain was present in all members of the Mi-2 subfamily except for GmCHR58 (Figure 4A). Interestingly, GmCHR5 from the SHPRH subfamily also contained the PHD domain (Figure 4A). The zf-C3HC4 domain was detected in almost all members of the SHPRH, Rad5/16 and Ris1 subfamilies, except for GmCHR36, GmCHR11 and GmCHR28 (Figure 4A). However, the F-box domain was observed only in GmCHR36 and GmCHR11 from the SHPRH subfamily (Figure 4A).
Analysis of the exon/intron structures of the 66 Snf2 family genes showed that the number of exons varied greatly among Snf2 family members, ranging from 2 (GmCHR29) to 36 (GmCHR55) (Figure 4B). Further analysis revealed that the members of the DRD1 subfamily had significantly fewer exons than those of other subfamilies (Figure 4B). The Snf2 family genes usually had very long sequences, with 13 genes exceeding 20 kb and 3 genes exceeding 30 kb (Figure 4B).

Analysis of Cis-Elements in the Soybean Snf2 Gene Promoters
Gene transcription regulation is typically achieved through the binding of different transcription factors to cis-elements in the promoter. To explore the transcription regulation of Snf2 genes in response to various environmental signals, we analyzed the 2 kb promoter regions of the 66 soybean Snf2 family genes using PlantCARE (Figure 5 and Table S4). A total of 23 cis-elements were discovered in the promoter regions of soybean Snf2 family genes. Five of these cis-elements are related to hormones: methyl jasmonate (MeJA), abscisic acid (ABA), gibberellin (GA), auxin, and salicylic acid (SA). These hormone-related cis-elements are widely distributed in the promoter regions of Snf2 family genes, especially the ABA- and MeJA-responsiveness cis-elements (Figure 5 and Table S4). Some cis-elements related to stress response (drought, low temperature, and wounding) and to tissue-specific expression (seed and root) are also widely distributed in the promoter regions of various genes. The distribution of these hormone- and stress-related cis-elements among different subfamilies does not seem to follow any specific pattern (Figure 5 and Table S4). In addition, 12 types of cis-elements were related to metabolism regulation and development (Figure 5 and Table S4). It is worth noting that the nodule specificity cis-element (5'-AAAGAT) [55] is the next most widely distributed cis-element after MeJA responsiveness (17%) and ABA responsiveness (16%). The nodule specificity cis-element is distributed in the promoter regions of 52 Snf2 family genes, which suggests that these genes may be related to symbiotic nodulation (Figures 5 and S1).
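As an illustration of how occurrences of the nodule specificity motif (AAAGAT) could be located in a promoter sequence, the following minimal sketch scans both strands of a DNA string. It is not the PlantCARE procedure; the function names, the choice to scan both strands, and the toy sequence are assumptions made for this example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(promoter: str, motif: str = "AAAGAT") -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every motif occurrence on either strand.

    Hits on the minus strand are reported at the position where the reverse
    complement of the motif starts on the plus strand.
    """
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Toy promoter fragment (hypothetical, not a soybean sequence).
toy_promoter = "GGAAAGATCCATCTTTGG"
print(find_motif(toy_promoter))
```

Applied to a real 2 kb upstream region, the number of hits per gene could then be tabulated across the gene family.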

Expression Profiles of the Snf2 Family Genes in Symbiotic Nitrogen Fixation
To explore the potential functions of the Snf2 genes in soybean, we obtained their expression patterns from the eFP Browser for soybean, an online transcriptome database. The expression patterns of 52 Snf2 family genes were analyzed in root and nodule (Figure 6A and Table S5); the database did not contain information on the remaining 14 genes. Figure 6A shows that, of the 52 Snf2 genes, more than 50% had higher expression levels in nodule than in root. To examine how Snf2 family genes respond to rhizobial infection, the expression profiles of the 52 Snf2 genes in root hairs at 12 and 24 h after inoculation (HAI) were also analyzed. The heatmap shows that these Snf2 genes were responsive at 12 HAI in infected root hairs, but they responded differently (Figure 6B and Table S5). Some genes (such as GmCHR16, GmCHR35 and GmCHR51) were upregulated after inoculation, while others (such as GmCHR30, GmCHR9 and GmCHR15) were downregulated (Figure 6B and Table S5). Overall, there were more upregulated than downregulated genes. Interestingly, most of the genes with large expression differences at 12 HAI showed smaller differences at 24 HAI. For example, GmCHR4, GmCHR24, GmCHR44 and other genes were downregulated at 24 HAI, whereas GmCHR26, GmCHR18, GmCHR27 and other genes were upregulated at 24 HAI. These genes had different expression patterns at 12 HAI and 24 HAI (Figure 6B and Table S5). Although some genes increased and some decreased in expression, almost all Snf2 genes responded to rhizobial infection, suggesting that Snf2 family genes may play important roles in symbiotic nitrogen fixation.
To further analyze the potential role of Snf2 family genes in symbiotic nodulation, we performed qPCR analysis on 26 Snf2 genes containing nodule specificity cis-element and with Reads Per Kilobase per Million mapped reads (RPKM) greater than 2.5 in nodules in the eFP Browser. These genes were detected in roots or nodules at 28 days after inoculation (DAI). The results showed that all genes except GmCHR5 were significantly more highly expressed in mature nodules than in roots ( Figure 7A). To verify the response of these genes to rhizobial infection, we analyzed their expression induced by rhizobia at 24 h after inoculation (HAI) in root hairs using GmNIN1a and GmENOD40.1 as positive controls. The results showed that both marker genes were upregulated after rhizobial infection. Nine genes did not show significant changes in expression, and the remaining genes were significantly downregulated ( Figure 7B). The discrepancy between our results and microarray data may be caused by differences in plant culture or rhizobial infection efficiency.
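The selection of qPCR candidates described above combines two criteria: presence of the nodule specificity cis-element and an RPKM above 2.5 in nodules. A minimal sketch of such a filter is shown below; the record layout, field names, and gene labels are hypothetical and do not reproduce the actual eFP Browser data.

```python
def select_qpcr_candidates(genes: dict, rpkm_cutoff: float = 2.5) -> list[str]:
    """Return sorted gene names that carry the nodule motif and exceed the RPKM cutoff."""
    return sorted(
        name
        for name, info in genes.items()
        if info["has_nodule_motif"] and info["nodule_rpkm"] > rpkm_cutoff
    )


# Hypothetical records for illustration only.
toy_genes = {
    "geneA": {"has_nodule_motif": True, "nodule_rpkm": 10.2},
    "geneB": {"has_nodule_motif": True, "nodule_rpkm": 1.1},
    "geneC": {"has_nodule_motif": False, "nodule_rpkm": 8.0},
    "geneD": {"has_nodule_motif": True, "nodule_rpkm": 2.5},
}
print(select_qpcr_candidates(toy_genes))
```

Note that the cutoff is applied strictly ("greater than 2.5"), so a gene with an RPKM of exactly 2.5 is excluded in this sketch.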
Figure 7. (A) The expression patterns of specific Snf2 genes in roots and nodules were analyzed using qRT-PCR. (B) Expression patterns of Snf2 genes in root hair were analyzed using qRT-PCR at 12/24 h after rhizobium inoculation (HAI). The color scale ranges from white to red, indicating low or high levels of gene expression. The term 'mock' refers to samples without rhizobia inoculation. Data are representative of three biological replicates, and the GmActin11 gene was chosen as the internal reference. A Student's t-test was applied to assess the significance of the difference between the two groups. * p < 0.05. ** p < 0.01. "ns" indicates that there is no significant difference.

Discussion
The Snf2 family proteins are essential for chromatin-remodeling complexes that regulate transcription, replication, homologous recombination, and DNA repair in all eukaryotes. These proteins have diverse functions in plant development and stress responses [6,20,56]. For example, Snf2 proteins are involved in flowering, organ formation, and stress response in Arabidopsis [20,21]. Previous studies on Snf2 family proteins have mainly focused on model plants such as Arabidopsis and rice. However, soybean is an important crop that differs from these non-leguminous crops due to its unique plant-rhizobia symbiosis. The role of Snf2 proteins in soybean, especially in nodulation, remains unknown. In this study, we identified 66 Snf2 proteins in soybean (Table S1) and performed a comprehensive analysis of their phylogenetic relationships, gene classification, chromosomal locations, conserved domains, gene structures, cis-elements, and expression profiles in different tissues and during nodulation. Our results provide valuable insights into the Snf2 family proteins and highlight their potential functions in the symbiotic interaction between soybean and rhizobia. Soybean contains 66 Snf2 family genes, which is more than Arabidopsis (41), rice (40), and tomato (45) [15,17,19]. This may be due to the partial diploidization of the tetraploid soybean genome, resulting in a higher number of Snf2 genes than in diploid species [57]. Phylogenetic analyses using Snf2 proteins from Arabidopsis and soybean classified the 66 soybean Snf2 proteins into 6 groups and 18 subfamilies (Figure 1). The number of Snf2 proteins in each subfamily ranged from 1 to 9 (Figure 1). Interestingly, the ALC1, Rad54, and SMARCAL1 subfamilies showed a 1:1 orthologous pattern between Arabidopsis and soybean, indicating that these subfamilies are more conserved than others (Figure 1). 
This is noteworthy given that genome duplications occurred around 59 and 13 million years ago, leading to a highly duplicated genome with nearly 75% of genes present in multiple copies [58]. We identified 34 pairs of segmental duplications in soybean, but no tandem duplications (Figure 3), which is similar to Arabidopsis and rice [18]. Notably, some segmentally duplicated pairs in Arabidopsis are functionally redundant (CHR12/23 and CHR11/17) [59,60], and their homologs in soybean also had segmental duplication. This implies that the gene pairs GmCHR18/56, GmCHR18/53, GmCHR32/65, and GmCHR34/65 may have similar functions. Therefore, segmental duplication, rather than tandem duplication, seems to be the main evolutionary mechanism for the expansion and functional diversification of the Snf2 gene family. The evolution of new gene functions usually results from the combined effects of duplication and selection. Our analysis found that all 34 segmental-duplication gene pairs had Ka/Ks ratios less than 1 (Table S2), indicating that they underwent purifying selection and reduced genetic diversity. This implies that the functional divergence of these duplicated genes might tend to be conservative.
The catalytic ATPase domain of Snf2 proteins is responsible for chromatin-remodeling activity. It consists of SNF2_N, which has ATP hydrolysis activity, and Helicase_C, which has ATP-dependent DNA or RNA unwinding activity. This structure is conserved in plants [15,16]. In soybean, besides these two conserved domains, each family also contains some unique domains. Proteins in the Mi-2 subfamily (except for GmCHR58) and SHPRH subfamily (only GmCHR5) have a PHD domain at their N-terminus (Figure 4). The PHD domain is a Zn2+-binding domain that can recognize and bind H3K4me3 in various proteins such as BPTF, YNG1 and ING2 [61][62][63][64]. In addition to H3K4me3, this domain can also recognize various other histone modifications, such as H3K9me3, which is recognized by the PHD domain of the human Mi-2 homolog CHD4 [65]. This suggests that soybean proteins with PHD domains may crosstalk with other histone modifications. In both Chd1 and Mi-2 subfamily proteins, there is a Chromo domain upstream of Helicase_C (Figure 4). The Chromo domain was first discovered in animals and has DNA-binding activity [66]. The Chromo domain of the rice Mi-2 subfamily protein OsCHR729 was also found to bind methylated H3K4, suggesting that these proteins with Chromo domains may also have the ability to bind methylated H3K4 [67]. GmCHR36 and GmCHR11 from the SHPRH subfamily contain an F-box domain (Figure 4). Proteins containing an F-box domain are usually subunits of SCF complexes, which are E3 ubiquitin ligases that mediate the proteasomal degradation of specific substrates. F-box proteins function as substrate recognition components in SCF complexes [68]. This suggests that GmCHR36 and GmCHR11 may have E3 ubiquitin ligase activity. Moreover, most members of the three subfamilies in the Rad5/16-like group (SHPRH subfamily, Rad5/16 subfamily, Ris1 subfamily) have a zf-C3HC4 domain at their N-terminus (except for GmCHR11 and GmCHR36) (Figure 4). 
The zf-C3HC4 domain is a zinc-finger domain that may be involved in both DNA-binding and protein-protein interaction functions [69]. This implies that soybean proteins with zf-C3HC4 domains may play important roles in connecting other soybean chromatin-remodeling complex subunits.
The promoters of almost all Snf2 genes were found to contain diverse cis-elements, which are involved in organ development, plant hormone (such as abscisic acid and jasmonic acid) response and stress (drought, low temperature) tolerance (Figure 5). Notably, the cis-elements associated with jasmonic acid, abscisic acid, and nodule specificity were the most common (Figures 5 and S1). Jasmonic acid and abscisic acid are known to be involved in the response to external biotic and abiotic stresses. The frequent occurrence of these three cis-elements in Snf2 gene promoters suggests a close relationship between Snf2 family genes and plant stress response. It also implies an essential role of Snf2 family genes in symbiotic nodulation. Soybean is a globally important food and economic crop, and the identification of these stress-related genes is crucial for developing new stress-resistant varieties.
Nitrogen is a crucial factor that limits crop growth and productivity. Soybean can form nodules by interacting with rhizobia in the soil to meet its own nitrogen demand [25]. The efficiency of soybean nodulation is determined by three main factors: nodule infection rate, nodule development process and nitrogen fixation capacity of mature nodules [26]. Chromatin remodeling is one of the major mechanisms of transcriptional regulation, and it involves Snf2 family proteins as the core components of ATP-dependent chromatin remodeling complexes. These proteins may be involved in various aspects of plant life, including nodulation. For instance, Arabidopsis BRM regulates root development by affecting PIN-FORMED (PIN) expression and auxin distribution. Recent studies have demonstrated that GmPINs also regulate nodulation [27]. This implies that Snf2 family genes may have a significant role in nodulation as well. We examined the expression patterns of Snf2 genes in roots and nodules (Figure 7A). Many genes exhibited remarkable expression changes before and after rhizobial inoculation (such as GmCHR34, GmCHR10, GmCHR66, GmCHR14), while some genes had striking expression differences between roots and nodules (such as GmCHR18, GmCHR19, GmCHR56 and GmCHR66) (Figure 7B). These findings suggest that these genes can respond to rhizobial infection and that Snf2 family proteins may participate in the regulation of nodulation. These genes could offer valuable resources for molecular breeding of high-efficiency nitrogen fixation.

Plant Materials and Growth Conditions
In this study, soybean Williams 82 (W82) was used as the plant material. The soybean seeds were germinated in sterile water for five days before being transferred to vermiculite supplemented with a low-nitrogen culture solution. The soybeans were then grown in a greenhouse under controlled conditions, including a temperature of 25 °C, 70% humidity, and a photoperiod of 16 h of light and 8 h of darkness, for three days. The rhizobial strain Bradyrhizobium diazoefficiens USDA110 was inoculated at an optical density (OD) of 0.08 using sterile water as a carrier.

Identification of the Soybean Snf2 Family Genes
To thoroughly identify all proteins within the soybean Snf2 family, the high-confidence soybean genome was searched in Phytozome 13 (https://phytozome.jgi.doe.gov/pz/portal.html, accessed on 23 November 2022). Because all Snf2 proteins contain two conserved domains, SNF2_N and Helicase_C, the Hidden Markov Model (HMM) files for both domains were downloaded from the Pfam database (SNF2_N, Pfam: PF00176; Helicase_C, Pfam: PF00271). The Simple HMM Search plugin from TBtools (a software suite for biological data analysis) was used to retrieve 66 high-confidence genes that contained both SNF2_N and Helicase_C domains in the W82 reference genome [70]. The candidate Snf2 proteins were then confirmed using SMART (http://smart.embl-heidelberg.de/, accessed on 23 November 2022) and CDD (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, accessed on 23 November 2022). The conserved motifs in Snf2 proteins were detected through CDD, and a visual representation of the motif map was generated using TBtools [70].
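The core of this identification step is keeping only proteins that are hit by both domain models. The sketch below illustrates that intersection, assuming the two searches were exported in a HMMER-style tabular format in which the first column is the target name and the fifth column is the full-sequence E-value. The study itself used the TBtools Simple HMM Search plugin; the E-value cutoff, function names, and toy tables here are assumptions for illustration.

```python
def hits_from_table(text: str, max_evalue: float = 1e-5) -> set[str]:
    """Collect target names from a whitespace-delimited hit table.

    Assumes column 1 is the target name and column 5 the full-sequence E-value;
    lines starting with '#' are treated as comments.
    """
    ids = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            ids.add(target)
    return ids


# Toy hit tables with hypothetical protein names and scores.
snf2n_table = """# target accession query accession E-value score bias
protA - SNF2_N PF00176 1e-50 170.0 0.1
protB - SNF2_N PF00176 2e-30 100.0 0.0
protD - SNF2_N PF00176 0.5 5.0 0.0
"""
helicase_table = """# target accession query accession E-value score bias
protA - Helicase_C PF00271 3e-20 70.0 0.0
protC - Helicase_C PF00271 1e-15 55.0 0.0
"""

# Candidate Snf2 proteins must carry both domains.
candidates = hits_from_table(snf2n_table) & hits_from_table(helicase_table)
print(sorted(candidates))
```

In the study, the same logic reduced 167 SNF2_N hits and 502 Helicase_C hits to the 66 genes carrying both domains, which were then checked with SMART and CDD.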

Phylogenetic Construction
A phylogenetic tree file was generated by comparing the amino acid sequences of soybean and Arabidopsis Snf2 family proteins with the Maximum Likelihood method in MEGAX software (Molecular Evolutionary Genetics Analysis) [71]. The generated phylogenetic tree was then refined and annotated using Evolview (http://www.evolgenius.info/evolview/, accessed on 23 November 2022).

Chromosome Localization, Duplication, and Evolution
The chromosomal location of each Snf2 gene was determined by using a GTF file of soybean genome and an ID list of CHR family members with TBtools [70]. Synteny analysis on internal CHR genes in soybean was conducted with One Step MCScanX plugin in TBtools, and visualization was performed with Advanced Circos plugin [70]. Nonsynonymous (Ka) and synonymous (Ks) substitution rates for each Snf2 gene pair were calculated with Simple Ka/Ks Calculator in TBtools [70].
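For reference, the tandem-duplication criterion stated in the Results (two paralogs on the same chromosome separated by no more than 5 intervening genes) can be expressed as a simple check. This sketch is not the MCScanX algorithm; the data layout, function name, and gene labels are hypothetical.

```python
def tandem_pairs(gene_order: dict, paralog_pairs: list, max_intervening: int = 5) -> list:
    """Return the paralog pairs that satisfy the tandem-duplication criterion.

    gene_order maps each chromosome to its gene IDs in positional order.
    """
    position = {
        gene: (chrom, idx)
        for chrom, genes in gene_order.items()
        for idx, gene in enumerate(genes)
    }
    result = []
    for a, b in paralog_pairs:
        (chrom_a, i), (chrom_b, j) = position[a], position[b]
        intervening = abs(i - j) - 1  # genes lying strictly between the pair
        if chrom_a == chrom_b and a != b and intervening <= max_intervening:
            result.append((a, b))
    return result


# Hypothetical gene order and paralog pairs.
toy_order = {"chr1": ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"], "chr2": ["h1"]}
toy_pairs = [("g1", "g3"), ("g1", "g8"), ("g2", "h1")]
print(tandem_pairs(toy_order, toy_pairs))
```

In this toy case only the first pair qualifies: the second pair has six intervening genes and the third spans two chromosomes.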

Characterization of Snf2 Family Proteins and Gene Structure
The theoretical grand average of hydropathicity (GRAVY), isoelectric point (pI) and molecular weight (MW) of soybean Snf2 proteins were analyzed using ProtParam on the ExPASy server (Expert Protein Analysis System). NCBI's CDD platform (Conserved Domain Database) was used to predict conserved domains in the polypeptide sequences of all identified soybean Snf2 proteins. Data were then visualized using TBtools [70]. The exon-intron architecture diagrams were also created using TBtools with its Gene Structure View (Advanced) [70].

Promoter Analysis
The 2 kb upstream of the translation initiation site of each Snf2 gene was defined as its promoter region, and PlantCARE was used to analyze these regions. The results were visualized with TBtools [70].

Expression Profile Analysis
Gene expression data of soybean Snf2 genes were retrieved from the Glycine max eFP Browser website and imported into TBtools to generate a heat map displaying the expression levels of soybean Snf2 genes in root and nodule, as well as in root hairs at 12/24 h after inoculation (HAI) [70].

RNA Isolation and Real-Time Quantitative RT-PCR Expression Analysis
Total RNA from root materials was extracted using FastPure Cell/Tissue Total RNA Isolation Kit V (No. RC112-01, Vazyme). mRNA was reverse transcribed using TransScript Uni All-in-One First-Strand cDNA Synthesis SuperMix (No. AU341-02, Transgen). Gene expression was detected using ChamQ Universal SYBR qPCR Master Mix (No. Q711-02, Vazyme) as the quantitative reagent. qPCR assays were performed as previously described [55]. The relative expression level of each gene was calculated using the 2^(−ΔΔCT) method with GmActin11 as the internal control [72]. The results were then normalized. Three independent replicates were performed for each treatment. The primers used are described in Table S6.
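The relative quantification described above can be illustrated with a minimal sketch. It assumes that ΔCT is the CT of the target gene minus the CT of the internal reference (GmActin11 in this study) and that ΔΔCT compares the treated sample with the mock control; the function name and the CT values are hypothetical.

```python
def relative_expression(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Return the fold change of a target gene relative to the control sample."""
    delta_treated = ct_target_treated - ct_ref_treated  # normalize to reference gene
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target amplifies two cycles later after inoculation.
fold_change = relative_expression(24.0, 18.0, 22.0, 18.0)
print(fold_change)  # 0.25, i.e. a four-fold downregulation
```

A value below 1 indicates downregulation relative to the mock control, and a value above 1 indicates upregulation.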

Conclusions
In this study, 66 soybean Snf2 family members were identified. These Snf2 family genes are unevenly distributed across all 20 chromosomes. Soybean is an important food and economic crop. Unlike non-leguminous crops, soybean can form a symbiotic relationship with rhizobia for biological nitrogen fixation. Nodule specificity cis-elements are widely distributed in the promoters of many soybean Snf2 family genes. Gene expression analysis showed that most of the tested genes changed significantly in expression after rhizobial inoculation. This indicates a potential role of Snf2 family proteins in regulating symbiotic nodulation and provides a basis for future research on soybean Snf2 family proteins.
Data Availability Statement: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. However, most of the data are shown in the Supplementary Files.