Article

Impacts of Natural Selection on Evolution of Core and Symbiotically Specialized (sym) Genes in the Polytypic Species Neorhizobium galegae

by Evgeny S. Karasev 1, Sergey L. Hosid 1, Tatiana S. Aksenova 1, Olga P. Onishchuk 1, Oksana N. Kurchak 1, Nikolay I. Dzyubenko 2, Evgeny E. Andronov 1,3,* and Nikolay A. Provorov 1
1 All-Russia Research Institute for Agricultural Microbiology, 196608 St. Petersburg, Russia
2 All-Russia Research Institute of Plant Genetic Resources, 190031 St. Petersburg, Russia
3 Dokuchaev Soil Science Institute, 119017 Moscow, Russia
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2023, 24(23), 16696; https://doi.org/10.3390/ijms242316696
Submission received: 8 September 2023 / Revised: 17 November 2023 / Accepted: 20 November 2023 / Published: 24 November 2023
(This article belongs to the Section Molecular Genetics and Genomics)

Abstract

Nodule bacteria (rhizobia) represent a suitable model to address a range of fundamental genetic problems, including the impacts of natural selection on the evolution of symbiotic microorganisms. Rhizobia possess multipartite genomes in which symbiotically specialized (sym) genes differ from core genes in their natural histories. Diversification of sym genes is responsible for rhizobia microevolution, which depends on host-induced natural selection. By contrast, diversification of core genes is responsible for rhizobia speciation, which occurs under the impacts of still unknown selective factors. In this paper, we demonstrate that in goat’s rue rhizobia (Neorhizobium galegae) populations collected in the North Caucasus, representing two host-specific biovars, orientalis and officinalis (N2-fixing symbionts of Galega orientalis and G. officinalis), the evolutionary mechanisms are different for core and sym genes. In both N. galegae biovars, core genes are more polymorphic than sym genes. In bv. orientalis, the evolution of core genes occurs under the impacts of driving selection (dN/dS > 1), while the evolution of sym genes is close to neutral (dN/dS ≈ 1). In bv. officinalis, the evolution of core genes is neutral, while for sym genes, it is dependent on purifying selection (dN/dS < 1). A marked phylogenetic congruence of core and sym genes revealed using ANI analysis may be due to a low intensity of gene transfer within and between N. galegae biovars. Polymorphism in both gene groups and the impacts of driving selection on core gene evolution are more pronounced in bv. orientalis than in bv. officinalis, reflecting the diversities of their respective host plant species. In bv. orientalis, a highly significant (P0 < 0.001) positive correlation is revealed between the p-distance and dN/dS values for core genes, while in bv. officinalis, this correlation is of low significance (0.05 < P0 < 0.10).
For sym genes, the correlation between p-distance and dN/dS values is negative in bv. officinalis but is not revealed in bv. orientalis. These data, along with the functional annotation of core genes implemented using Gene Ontology tools, suggest that the evolution of bv. officinalis is based mostly on adaptation to in planta niches, while in bv. orientalis, evolution presumably depends on adaptation to soil niches. New insights into the tradeoff between natural selection and genetic diversity are presented, suggesting that gene nucleotide polymorphism may be extended by driving selection only in ecologically versatile organisms capable of supporting a broad spectrum of gene alleles in their gene pools.

1. Introduction

Root nodule bacteria (rhizobia) represent the genetically most thoroughly studied group of symbiotic microorganisms fixing N2 in the nodules of legume and some non-legume plants. Being highly effective producers of N compounds for terrestrial ecosystems, these bacteria are of exclusive ecological and agronomic importance. A complicated system of symbiotically specialized (sym) genes, including those responsible for nodule development (nod) and N2 fixation (nif/fix), emerged in rhizobia during co-evolution with host plants, which represent the major factors shaping rhizobia natural history [1,2].
Based on intensive genetic research, rhizobia are used as a model to address a range of general evolutionary problems, including the role of natural selection in the evolution of beneficial symbioses. At present, this research is focused on factors restricting cheater rhizobia genotypes, which, according to theoretical/computer models, should outcompete beneficial genotypes in plant-associated populations [3,4]. However, it was demonstrated that, in many symbiotic systems, host-specific selective pressures provide a competitive advantage for mutualists, which resist cheater expansion [5,6], and are responsible for the evolution of N2-fixing symbiosis towards increased structural complexity and ecological efficiency [7].
Up to now, there has been little focus on the forms of natural selection induced by host plants in polymorphic rhizobia populations. Proceeding from computer and experimental simulations, we suggested that individual (Darwinian, frequency-dependent, disruptive) and group (inter-deme, kin) selection may be induced by hosts in associated rhizobia populations at different stages of symbiosis development [8,9]. These selective pressures are based on and responsible for the high sym gene diversity in rhizobia populations. However, the tradeoff between genetic polymorphism and natural selection operating in microbial populations remains obscure [10]. As was previously demonstrated, purifying selection usually results in decreased population/gene polymorphism [11], while disruptive and negative frequency-dependent selection results in extended polymorphism [12,13]. In rhizobia populations, balancing selection may be responsible for the coexistence of cheating and beneficial genotypes, which elicits the evolution of symbiosis for an improved efficiency that is compatible with stable polymorphism in symbiont populations [2]. Data on the influence of driving selection on genetic polymorphism are contradictory: it may be increased [14,15,16,17], conserved, or even decreased [18,19,20] by this selection.
Research on the impacts of natural selection on symbiosis evolution is based on the multipartite genomic structures of rhizobia in which the core parts encoding housekeeping functions differ in their natural histories from the accessory parts, including sym genes [21]. Convenient models for analyzing rhizobia genome dynamics are represented by the polytypic species Neorhizobium galegae, which is composed of host-specific biovars bv. orientalis and bv. officinalis (symbionts of Galega orientalis and G. officinalis), and Rhizobium leguminosarum, composed of bv. viciae (symbionts of plants from genera Lathyrus, Lens, Pisum, Vavilovia, and Vicia), bv. trifolii (symbionts of genus Trifolium), and bv. phaseoli (symbionts of Phaseolus vulgaris).
Rhizobia genomes are subjected to multilevel evolution based on the diversification of: (i) sym genes resulting in the formation of polytypic species composed of host-specific biovars; and (ii) core genes resulting in the formation of cryptic (twin) species [8]. It was demonstrated that in different rhizobia species, core genes are evolutionarily more conservative than accessory genes [22,23,24], resulting in a closed pangenome structure for core genes but in an open pangenome structure for accessory genes [25,26]. In R. leguminosarum, sym and core genes differ greatly in nucleotide polymorphism (p-distance) and are phylogenetically non-congruent, suggesting that the evolution of these genes is mostly independent [27]. This independence may result from intensive gene transfer in R. leguminosarum populations, wherein chromosomal core genes recombine randomly with sym genes located on mobile plasmids [21].
Previously, we demonstrated that the parameters of core and sym gene polymorphism are different in R. leguminosarum biovars viciae and trifolii [27]. Cross-inoculation between these biovars is limited and results in non-N2-fixing nodules, which are usually underdeveloped and morphologically abnormal [28]. In R. leguminosarum, genomes are composed of circular chromosomes and several plasmids, with one of them (pSym) having a size of 200–500 kb harboring the majority of sym genes [21]. We suggested that, in R. leguminosarum, evolution of sym genes is implemented under the impacts of host-induced natural selection [9,27], while the mechanisms for core gene evolution remain obscure.
In the present paper, we compare the evolutionary dynamics of sym and core genes in goat’s rue rhizobia (N. galegae), in which sym genes are located on chromids having sizes of over 1600 kb. These circular replicons have a plasmid-type repABC system combined with core genes that are typically located on bacterial chromosomes, e.g., rRNA and tRNA genes [29]. Previously, we demonstrated [30] that populations of N. galegae bv. orientalis collected in the North Caucasian region are more polymorphic for sym and core genes than N. galegae bv. officinalis populations. This difference presumably reflects the diversity of the host plant species, which is substantially higher in G. orientalis than in G. officinalis. The difference between the two N. galegae biovars for nif/fix genes is much more pronounced than for nod genes since the host specificity of the compared biovars pertains to N2 fixation, not nodulation activity [30].
The main objective of the present research was to address the relationship between nucleotide polymorphism and natural selection in the N. galegae population composed of biovars orientalis and officinalis. Their host species, G. orientalis and G. officinalis, grow in the North Caucasus region with different levels of diversity, which is mirrored in the polymorphism of their microsymbionts. The following methodology was used: (1) based on an analysis of the set of complete N. galegae genomes from the two biovars, a common gene pool was identified, and for each gene of this pool, nucleotide polymorphism (p-distance) and natural selection statistics (dN/dS) were analyzed; (2) correlation analysis was carried out between the p-distance and dN/dS values separately for core and sym genes; (3) to search for genes with either similar or contrastingly different polymorphism and selection patterns between the two rhizobia biovars, analysis of polymorphism and selection was carried out in four groups (p-distance or dN/dS values above average in both biovars, below average in both biovars, and two groups with traits differing in opposite directions); and (4) to address the gene functions presumably involved in differential ecological adaptation of each biovar, a similar analysis was carried out using Gene Ontology tools in accordance with the same four-group strategy that was used in the analysis of individual core genes.
In the present paper, we analyze the impacts of driving/purifying selection on the diversity of core and sym genes of N. galegae, which allows us to address the molecular and ecological factors of beneficial symbiosis evolution. The obtained results enable us to reveal the mechanisms shaping the evolution of different parts of multipartite rhizobia genomes and to fill some important gaps in our knowledge of the tradeoff between the polymorphism of genes and the impacts of natural selection on their evolution.

2. Results

In total, 14 complete genomes of N. galegae strains (8 bv. officinalis and 6 bv. orientalis) were studied. In Figure 1A, we present data from the phylogenetic analysis of the average nucleotide identity (ANI) for chromosomal and sym genes of this genome set. It should be noted that complete phylogenetic congruence between sym and core gene phylogenies reflects parallel processes of microevolution and speciation, probably occurring in the absence of horizontal gene transfer. Figure 1B demonstrates different levels of nucleotide polymorphism in individual core genes of bv. officinalis and bv. orientalis. From this panel it is clear that nucleotide polymorphism is substantially higher in bv. orientalis (see below for details). Figure 1C shows a scheme of dividing the total core gene pool into four clusters, depending on the similarity/difference in the levels of nucleotide polymorphism of genes in the two N. galegae biovars. This scheme was used further for analysis of nucleotide polymorphism and dN/dS in the Gene Ontology group clustering.

2.1. Gene Polymorphism and Natural Selection

We demonstrated that nucleotide polymorphism in polytypic species N. galegae depends on driving/purifying (dN/dS-measured) natural selection, which operates in a biovar-specific manner in core and sym genes (Table 1). The maximal impacts of driving selection (dN/dS > 1) were revealed in the high-polymorphic core genes of bv. orientalis, while the minimal impacts were revealed in the low-polymorphic sym genes of bv. officinalis.
Analysis of correlations between nucleotide polymorphism (p-distance) and driving/purifying selection (dN/dS) impacts suggested (Table 2) that this selection plays different roles in the evolution of core and sym genes, depending on the N. galegae biovar. For core genes, driving selection may result in a marked increase in polymorphism of bv. orientalis (indicated by a highly significant positive correlation between p-distance and dN/dS), but this increase was much less evident in bv. officinalis (indicated by a significantly lower although positive correlation between p-distance and dN/dS). Figure S1 in the Supplementary Materials demonstrates the corresponding differences in regression between dN/dS and p-distance values in the bv. orientalis and bv. officinalis core gene pools.
For sym genes, natural selection does not influence polymorphism in bv. orientalis (no correlation between p-distance and dN/dS values) but results in a decreased polymorphism in bv. officinalis (negative correlation between these values) (Table 2).
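The correlation analysis described above can be illustrated with a minimal sketch. It assumes a Pearson correlation coefficient computed over per-gene pairs of p-distance and dN/dS values; the paper does not name the coefficient used, and the function name and the toy values below are hypothetical, not data from Table 2.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)


# Illustrative toy values only (assumption), one entry per gene.
toy_p_distance = [0.01, 0.02, 0.03, 0.04]
toy_dn_ds = [0.5, 0.9, 1.2, 1.8]
r = pearson_r(toy_p_distance, toy_dn_ds)
print(f"r = {r:.3f}")
```

A positive r in such a calculation corresponds to the pattern reported for core genes, and a negative r to the pattern reported for sym genes of bv. officinalis.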
Importantly, the frequency of polymorphic sym genes in the total gene pools is higher in bv. orientalis than in bv. officinalis: 38.5 ± 7.8% and 7.7 ± 4.3%, respectively (tSt = 3.46; P0 < 0.01). For core genes, these frequencies do not differ: 74.6 ± 0.70% and 75.9 ± 0.82%, respectively.
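The comparison of sym gene frequencies can be reproduced with a short worked calculation. The sketch below assumes that the percentages correspond to 15 and 3 polymorphic genes out of the 39 sym genes listed in Section 4.3; these counts are inferred from the reported values and are not stated in the text. It also assumes the binomial standard error and a t statistic of the form (p1 − p2)/sqrt(SE1² + SE2²).

```python
from math import sqrt


def proportion_with_se(successes, total):
    """Return a proportion and its binomial standard error."""
    p = successes / total
    return p, sqrt(p * (1 - p) / total)


def t_statistic(p1, se1, p2, se2):
    """Student-type statistic for the difference of two proportions."""
    return (p1 - p2) / sqrt(se1 ** 2 + se2 ** 2)


# Assumed counts (inferred, not given in the paper): 15/39 and 3/39.
p_ori, se_ori = proportion_with_se(15, 39)
p_off, se_off = proportion_with_se(3, 39)
t = t_statistic(p_ori, se_ori, p_off, se_off)

print(f"bv. orientalis:  {100 * p_ori:.1f} +/- {100 * se_ori:.1f} %")
print(f"bv. officinalis: {100 * p_off:.1f} +/- {100 * se_off:.1f} %")
print(f"t = {t:.2f}")
```

Under these assumptions the calculation returns 38.5 ± 7.8%, 7.7 ± 4.3%, and t = 3.46, matching the values reported above.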

2.2. Gene Ontology Analysis

To address the factors responsible for the evolution of rhizobia core genes, we used Gene Ontology tools [31] to provide the functional annotation of genes that vary substantially for nucleotide polymorphism and correlate with the impacts of natural selection (Table 1 and Table 2). The set of 782 core genes that are polymorphic in both N. galegae biovars was distributed into 76 Gene Ontology Groups (GOGs), contrasting for p-distance or dN/dS, in which deviations of these parameters exceed the standard deviations of the average values of GO enrichment (1.509 for dN/dS and 1.519 for p-distance). These GOGs were assigned into clusters (as shown in Figure 1C) with contrasting p-distance and dN/dS values: (i) higher than average in both biovars, orientalis and officinalis (Ori+Off+); (ii) below average in bv. orientalis but higher in bv. officinalis (Ori–Off+); (iii) higher in bv. orientalis but below average in bv. officinalis (Ori+Off–); and (iv) below average in both biovars (Ori–Off–). For statistical analysis, four clusters were established independently for p-distance (Cpol-I…Cpol-IV) and dN/dS (Csel-I…Csel-IV) values.
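The four-cluster scheme can be sketched as a simple classification rule. This is a simplified illustration: it compares each value with a plain average for the respective biovar, whereas the analysis above additionally applies a standard-deviation threshold to GO enrichment. The function name is hypothetical, and treating values equal to the average as "below" is an assumption.

```python
def assign_cluster(ori_value, off_value, ori_mean, off_mean):
    """Return the cluster numeral (I-IV) for one gene or GOG.

    I   = Ori+Off+ (above average in both biovars)
    II  = Ori-Off+ (below in bv. orientalis, above in bv. officinalis)
    III = Ori+Off- (above in bv. orientalis, below in bv. officinalis)
    IV  = Ori-Off- (below average in both biovars)
    """
    high_ori = ori_value > ori_mean
    high_off = off_value > off_mean
    if high_ori and high_off:
        return "I"
    if not high_ori and high_off:
        return "II"
    if high_ori and not high_off:
        return "III"
    return "IV"


# Illustrative toy values only (assumption): p-distance of one gene in each biovar.
print("Cpol-" + assign_cluster(0.05, 0.01, ori_mean=0.03, off_mean=0.02))
```

The same rule can be applied independently to p-distance values (giving Cpol clusters) and to dN/dS values (giving Csel clusters).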
We demonstrated that clusters Cpol-IV and Csel-IV, in which p-distance and dN/dS values are below average in both N. galegae biovars (Ori–Off–), are most numerous, suggesting that purifying selection (dN/dS < 1 in Csel-IV cluster) resulting in decreased nucleotide polymorphism (low p-distance in Cpol-IV cluster) represents an important factor of core gene evolution (Table 3). However, the tradeoff between gene polymorphism and natural selection varies greatly in the analyzed genes: gene frequencies (%) in the total pool of 782 polymorphic genes for the Ori+Off+ and Ori–Off+ clusters are higher for p-distance than for dN/dS (Cpol-I > Csel-I; Cpol-II > Csel-II), while in the Ori+Off– and Ori–Off– clusters, gene frequencies are higher for dN/dS than for p-distance (Csel-III > Cpol-III; Csel-IV > Cpol-IV) (Figure 2).
Analysis of GOG composition enabled us to reveal several regularities in the functional segregation of the core genome concerning its operational (involved in cellular metabolism and development) and informational (involved in template processes) components. As expected, genes that are less variable in both biovars (Ori–Off– clusters with minimal p-distance or dN/dS values for both biovars) are associated with highly conservative template processes. Specifically, genes encoding for translation are revealed in the Cpol-IV cluster with minimal p-distance while genes for replication, transcription, translation, and DNA repair are in the Csel-IV cluster with minimal dN/dS (Tables S2–S4 in the Supplementary Materials). Interestingly, analysis of GOGs identified on the basis of p-distance demonstrated (Table S2 in Supplement) that low-polymorphic genes responsible for metabolism of N-compounds (nucleosides, amino acids) are assigned to Cpol-III, while low-polymorphic genes for lipid and oligosaccharide metabolism are in Cpol-II.
In order to address the tradeoff between nucleotide polymorphism and impacts of natural selection on core gene evolution, we analyzed the distributions of GOGs among the clusters identified using p-distance or dN/dS values (Table 3). To statistically assess the coincidence of these distributions, we separately calculated for the two N. galegae biovars the frequencies of GOGs with elevated values of dN/dS or p-distance within GOGs with elevated or decreased values of p-distance or dN/dS (addressed as “High-in-High” and “High-in-Low” frequencies, respectively). This calculation demonstrated that, in bv. orientalis, “High-in-High” exceeds “High-in-Low” frequencies for p-distance (Figure 3A) and for dN/dS (Figure 3B), suggesting that driving selection is responsible for elevated core gene polymorphism in this biovar. However, in bv. officinalis, no difference was revealed between “High-in-High” and “High-in-Low” frequencies, suggesting a negligible influence of driving selection on gene polymorphism in this biovar.

3. Discussion

The aim of our research was to use the N. galegae species, which possesses a multicomponent genome, to compare the evolutionary dynamics of core and symbiotically specialized parts. In order to reveal the impacts of natural selection on rhizobia gene polymorphism, we analyzed a set of N. galegae strains originating from the North Caucasus region. In accordance with previously published results [30], we demonstrated that the diversity of nucleotide sequences (measured as p-distance) in N. galegae is higher for core genes than for sym genes and is biovar-dependent: bv. orientalis is more polymorphic than bv. officinalis for both gene groups (Table 1). This difference may be due to contrasting levels of diversity in the respective host plant species. Specifically, North Caucasus is the longstanding center of G. orientalis origin while colonization of this area by G. officinalis is more recent [32]. Previously, we quantified the diversity of two Galega species in North Caucasus using AFLP fingerprinting, followed by nucleotide polymorphism analysis for a range of genes, followed by genomic fingerprinting, which confirmed the morphological data and suggested higher G. orientalis diversity with respect to G. officinalis [33,34]. An important source of genetic diversity in N. galegae may be represented by insertion sequences (IS), which are more abundant in bv. orientalis than in bv. officinalis [35].
In this paper, we demonstrate that in N. galegae, core and sym genes are phylogenetically congruent (Figure 1A), apparently due to their restricted recombination based on the location of sym genes on non-mobile chromids. Nevertheless, some evolutionarily important parameters of diversity differ in these genes. For example, the tradeoff between nucleotide polymorphism and the evolutionary impacts of natural selection depends on the gene group (core or sym) and on the N. galegae biovar (orientalis or officinalis) (Table 1). Differential impacts of natural selection on polymorphism of core and sym genes are evident in bv. officinalis (r values differ significantly) but not in bv. orientalis (r values do not differ), suggesting that the adaptive impacts of these genes are biovar-specific (Table 2).
Analysis of the total gene pools (Table 2) as well as Gene Ontology Groups (GOGs) (Table 3, Figure 3) suggested that driving selection (dN/dS > 1) results in increased polymorphism of core genes in bv. orientalis but not in bv. officinalis. We suggest that, in bv. orientalis, maintenance of newly emerged core gene alleles by driving selection may be combined with preservation of preexisting alleles; therefore, genetic polymorphism of this biovar is elevated. However, in bv. officinalis, the newly emerged gene alleles possibly substitute for preexisting ones due to a restricted ecological amplitude of this biovar; therefore, gene polymorphism in this biovar is not changed. This difference is in accordance with the contrasting ecological affinities of the compared N. galegae biovars [33]. Specifically, in the North Caucasian region, biovar officinalis persists under unfavorable conditions occupied by its host, G. officinalis, such as wetlands and saline and acid soils, and should survive mostly due to colonization of endosymbiotic niches. However, bv. orientalis persists under more favorable habitats occupied by G. orientalis, such as moderately moist and non-saline, neutral soils. Therefore, persistence in soil niches dependent on core gene operation may be more prolonged for bv. orientalis than for bv. officinalis.
In agreement with the contrasting ecological affinities of Galega species, a range of differences between their symbionts were revealed: (i) low polymorphic GOGs are affiliated with N metabolism (apparently responsible for symbiotic adaptations) in bv. officinalis and with the synthesis of surface polysaccharides (presumably responsible for adaptations to edaphic stresses) in bv. orientalis (Tables S2–S4 in the Supplementary Materials); (ii) sym genes evolve under purifying selection impacts in bv. officinalis, while neutral evolution was revealed for these genes in bv. orientalis; (iii) the evolution of core genes occurs mostly under the impacts of driving selection in bv. orientalis, while this evolution is neutral in bv. officinalis (Table 1, Table 2 and Table 3, Figure 3). These data are in agreement with the suggestion of the critical role of sym genes for bv. officinalis survival in the “plant–soil” system.
Our data suggest that different molecular and selective mechanisms are responsible for the evolution of sym and core genes in N. galegae. Different natural histories of core and sym genes were revealed also in Sinorhizobium meliloti species, which harbor sym genes on mega-plasmids [23,36], Mesorhizobium spp. harboring sym genes in highly mobile chromosomal islands [26,37,38], and Bradyrhizobium spp. harboring these genes in low-mobility chromosomal clusters [39,40]. A detailed comparison of N. galegae core and sym gene polymorphism is available with another polytypic (composed of host-specific biovars) species, R. leguminosarum. Basic differences between these species pertain to their host ranges, which are narrow in N. galegae and broad in R. leguminosarum (Table 4).
The striking differences between these polytypic species were revealed by comparing sym and core gene polymorphism. Specifically, these gene groups in N. galegae are much more similar in their diversity parameters than in R. leguminosarum, as was demonstrated by comparing gene phylogenies that coincide completely in N. galegae (Figure 1A) but are discordant in R. leguminosarum [27]. This difference may be due to a highly restricted recombination of sym and core genes in N. galegae, while in R. leguminosarum, this recombination is relaxed [21].
Different parameters of nucleotide polymorphism may be revealed by comparing two polytypic rhizobia species for gene variations within and between biovars. Within biovars, variation for sym genes is higher in R. leguminosarum than in N. galegae, reflecting the taxonomic diversities of the hosts, which for R. leguminosarum represent different genera (bv. viciae) or species (bv. trifolii), but N. galegae biovars are restricted to a single host species. This is why, in N. galegae, sym genes are less variable than core genes, while in R. leguminosarum biovars, sym genes are significantly more variable than core genes (Table S8).
Marked differences between the two polytypic rhizobia species were also revealed by the genetic comparison of strains representing different biovars (Table S9). Specifically, the divergence for sym genes is higher in R. leguminosarum than in N. galegae, clearly correlating with the divergences of the respective host plants. The other reason for this difference may be represented by intensive sym gene transfer in R. leguminosarum, which is responsible for the incongruence of the gene phylogenies in this species [27]. In both species, inter-biovar differences for sym genes are higher than for core genes. These differences are strongest in R. leguminosarum, suggesting pronounced disruptive selection in this species resulting from high taxonomic diversity of the host plants.
Collectively, the presented data suggest that gene polymorphism in the compared rhizobia species depends on: (i) the diversity of the hosts; and (ii) the intensity of sym gene transfer in rhizobia populations. An attempt to correlate the diversities of N. galegae sym genes with the diversity of its hosts was made earlier, but the relevant correlations were restricted to AFLP fingerprints [33]. In R. leguminosarum, we demonstrated a close correlation between the diversities of the nodA gene involved in Nod factor synthesis by rhizobia and the Nfr5 gene responsible for perception of Nod factors by hosts [8].
Importantly, the molecular mechanisms responsible for core gene polymorphism are less clear than for sym gene polymorphism, since the impacts of selective (presumably, edaphic) factors influencing core genes are poorly understood as compared to the impacts of symbiotically induced selective factors. The application of Gene Ontology tools suggested in our paper may represent a useful approach to reveal the functional basis for the adaptive evolution of core genes in rhizobia.
The other important issue of rhizobia evolutionary genetics pertains to the tradeoff between natural selection and gene polymorphism, which may be increased by driving selective pressures in an ecologically versatile organism (such as N. galegae bv. orientalis), enabling broad allelic diversity in the analyzed genes. However, in an ecologically restricted organism (such as N. galegae bv. officinalis), gene polymorphism is not changed or is even decreased under the impacts of driving selection since co-existence of different gene alleles is presumably blocked. Significantly, in the mutualistic legume–rhizobia system, the genotypic partners’ interactions responsible for plant fitness may stabilize gene variation under the impacts of either driving or purifying selection induced by hosts and environment in microbial populations [41]. Extended genetic and bioinformatics analyses are required to address the relationship between the adaptive potentials of different gene groups and the impacts of natural selection on the polymorphism expressed in diverse rhizobia species and in other symbiotic organisms.

4. Materials and Methods

4.1. Collection of Strains and DNA Sequencing

During an expedition to the North Caucasus in 2003, a number of soil samples were collected, from which rhizobia strains of bv. orientalis and bv. officinalis were isolated [30,33,34]. Fourteen rhizobia strains were isolated from soil samples in a microvegetation experiment using nodules of Galega orientalis and G. officinalis according to a standard protocol [42]. They included strains of bv. officinalis (NG_35_off (JANFGK000000000), NG_37_off (VYYB00000000), NG_46_off (JANFGL000000000), NG_47_off (JAMQCN000000000), NG_58_off (JANFGM000000000), NG_77_off (JANFGN000000000), NG_81_off (JANFGO000000000), and NG_110_off (VZUM00000000)) and of bv. orientalis (NG_35_ori (JANFGP000000000), NG_46_ori (JANFGQ000000000), NG_58_ori (VZUN00000000), NG_77_ori (JANFGR000000000), NG_87_ori (VZUL00000000), and NG_110_ori (JANEZU000000000)).
Isolates were cultivated at 28 °C and 220 rpm for 48 h in modified yeast mannitol broth (YMB) with 1% sucrose [43]. DNA was obtained by the lysozyme–SDS–phenol–chloroform extraction protocol, with minor modifications [44]. Sequencing of strains NG_37_off and NG_87_ori was performed on a PacBio RS II instrument with P6 in two SMRT cells (Pacific Biosciences of California, Inc., Menlo Park, CA, USA). PacBio sequencing, subsequent error correction analysis, and assembly were performed at the Arizona Genomics Institute (US). Genome assembly for the strains was carried out de novo using HGAP (https://github.com/jtchien0925/PacBio_HGAP_assembly, accessed on 19 November 2023). Sequencing of 12 other rhizobia strains, 7 strains of biovar officinalis and 5 strains of biovar orientalis, was performed on a MiSeq genomic sequencer (Illumina Inc., San Diego, CA, USA) according to the manufacturer’s protocol, using the MiSeq Reagent Kit 600 Cycles (Illumina, USA), at the Genomics Core Facility, Siberian Branch, Russian Academy of Sciences (Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia). Assembly of the sequences was carried out using the CLC Workbench (https://digitalinsights.qiagen.com/products-overview/discovery-insights-portfolio/analysis-and-visualization/qiagen-clc-genomics-workbench/, accessed on 19 November 2023) by mapping on reference genomes NG_37_off and NG_87_ori of each biovar, respectively.

4.2. Finding Core Genes by Global Alignment

All genes of each genome were matched with the genes of the other 13 genomes using the global alignment method. For this, BLAT was used (http://genome.ucsc.edu/cgi-bin/hgBlat, accessed on 19 November 2023). All paired genes were sorted in descending order of identity. Paired genes with a maximal identity of at least 70% in the DNA sequences were selected. After that, a table of all genes and their presence in each of the 14 strains was generated. Only genes that were found in all 14 strains were selected for the core genome.
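The final selection step, keeping only genes present in every strain, can be sketched as follows. The sketch does not run BLAT; it assumes the best-hit identities have already been collected into a nested dictionary, and the data layout, function name, and toy identifiers are hypothetical.

```python
MIN_IDENTITY = 70.0  # percent identity threshold stated in the text


def select_core_genes(best_hits, strains, min_identity=MIN_IDENTITY):
    """Return genes whose best hit reaches the threshold in every strain.

    best_hits maps a gene identifier to {strain name: best identity in %}.
    A strain missing from the inner dictionary counts as "gene absent".
    """
    core = []
    for gene, per_strain in best_hits.items():
        if all(per_strain.get(strain, 0.0) >= min_identity for strain in strains):
            core.append(gene)
    return sorted(core)


# Illustrative toy table with three strains (assumption, not real data).
toy_strains = ["strain_A", "strain_B", "strain_C"]
toy_hits = {
    "gene_1": {"strain_A": 100.0, "strain_B": 98.5, "strain_C": 97.0},
    "gene_2": {"strain_A": 100.0, "strain_B": 65.0, "strain_C": 99.0},
    "gene_3": {"strain_A": 100.0, "strain_C": 92.0},
}
print(select_core_genes(toy_hits, toy_strains))
```

In this toy table only gene_1 qualifies: gene_2 falls below the identity threshold in one strain, and gene_3 is absent from another.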
For bv. orientalis, the estimated number of core genes was approximately 5000; for bv. officinalis, it was 4200, while 3900 core genes were common to the two biovars. For analyzing the variability and selection indices, we used 3840 genes from bv. orientalis and 2734 genes from bv. officinalis (genes common to the two biovars with non-zero polymorphism and dN/dS).

4.3. Symbiotic Genes

For both biovars, we analyzed 16 nod genes (encoding for Nod factor synthesis), 8 nif genes (for nitrogenase synthesis), and 15 fix genes (for electron and energy supply of nitrogenase) (Table S7 in the Supplementary Materials).

4.4. Gene Alignment Using Muscle

The DNA sequences of 14 strain variants of each gene were aligned with MUSCLE (Multiple Sequence Comparison by Log-Expectation; https://www.ebi.ac.uk/Tools/msa/muscle/, accessed on 19 November 2023) using standard coding sequence alignment parameters.

4.5. Calculation of Nucleotide Polymorphism (p-Distance)

The DNA polymorphism of each gene was calculated from the number of nucleotide substitutions for each pair of strains using standard metrics, https://www.megasoftware.net/mega1_manual/Distance.html (accessed on 19 November 2023). DNA regions with undetermined sequences (N, non-detected) and gaps were not taken into account. The number of substitutions was normalized by dividing by the total length of the compared genes without gaps and undefined nucleotides. The matrices (sized 14 × 14 by the number of strains) of the p-distances of each gene were calculated. The average polymorphism of each gene was calculated as the average p-distance over all elements of the matrix, excluding diagonal elements (the distance of a gene to itself is zero). Scripts are available at https://github.com/sergeyhosid/P-distance (accessed on 19 November 2023).
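The p-distance calculation described above can be illustrated with a minimal sketch. It is not the authors' script (that is in the repository cited above); the function names are hypothetical, and the input is assumed to be sequences of equal length that are already aligned.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and undetermined bases."""
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue  # gap ('-') or undetermined nucleotide (e.g. 'N')
        compared += 1
        if a != b:
            differences += 1
    return differences / compared if compared else float("nan")

def mean_pairwise_p_distance(aligned):
    """Average over all off-diagonal elements of the pairwise distance matrix.

    The matrix is symmetric, so averaging the upper triangle gives the same value.
    """
    n = len(aligned)
    dists = [p_distance(aligned[i], aligned[j])
             for i in range(n) for j in range(i + 1, n)]
    return sum(dists) / len(dists)

print(p_distance("ACGT-N", "ACGA-T"))  # one difference over four comparable sites
```

For a real gene, `aligned` would hold the 14 aligned strain variants of that gene.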

4.6. Calculation of dN/dS Index

The dN/dS ratio of nonsynonymous (dN) to synonymous (dS) substitutions was calculated according to the Jukes–Cantor (JC) model, https://bioinformatics.cvr.ac.uk/calculating-dnds-for-ngs-datasets/ (accessed on 19 November 2023). In the JC model, dN/dS for each codon was calculated separately and compared with the theoretical ratio of substitutions. For example, each alanine codon has nine possible single-nucleotide substitutions, of which three (those at the third position) are synonymous; consequently, the theoretical proportion of synonymous substitutions is 3/9, or 1/3. The obtained dN/dS value of each gene was normalized by dividing by the number of coding codons of the compared sequences. Then, the matrices (sized 14 × 14 by the number of strains) of the dN/dS indexes of each gene were calculated. The average dN/dS index of each gene was calculated as the average dN/dS index over all elements of the matrix, excluding diagonal elements. Scripts are available at https://github.com/sergeyhosid/dNdS (accessed on 19 November 2023).
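The alanine example can be checked by enumerating the nine single-nucleotide substitutions of a codon under the standard genetic code. The sketch below only illustrates that counting step; it is not the authors' dN/dS script. The function name is hypothetical, and changes to a stop codon are counted as nonsynonymous, which is a simplifying assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def single_substitution_counts(codon):
    """Return (synonymous, nonsynonymous) counts over the 9 single-base changes."""
    syn = 0
    nonsyn = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                syn += 1
            else:
                nonsyn += 1  # includes changes to a stop codon (assumption)
    return syn, nonsyn

syn, nonsyn = single_substitution_counts("GCA")  # an alanine codon
print(syn, nonsyn, syn / (syn + nonsyn))
```

For GCA, the three synonymous changes are the third-position substitutions to GCT, GCC, and GCG.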

4.7. Functional Annotation of Core Genes, Gene Ontology (GO)

We used eggnog-mapper, https://github.com/eggnogdb/eggnog-mapper/issues/135 (accessed on 19 November 2023), to annotate the newly assembled genomes and assign genes to functional groups based on Gene Ontology. A detailed description of each group was obtained from the AmiGO 2 website [45]. AmiGO 2 is a project to create the next generation of AmiGO, the current official web-based toolkit for searching and browsing the Gene Ontology database. The Gene Ontology Consortium (GOC) provides computable knowledge regarding the functions of genes and gene products.

4.8. Determination of the Predominance of Functional Groups of Genes GO (Gene Ontology) and Statistical Significance

The prevalence of certain groups of genes was calculated as the ratio of the actual number of genes to their expected number based on the sample size and total number of genes of the selected group:
Penrich = Nobs/(Ssep*(Nsep/Ngenome))
where Penrich is the predominance of a given group of genes, Nobs is the number of genes of that group observed in the sample, Ssep is the number of genes in the given group (GO), Nsep is the sample size, and Ngenome is the total number of genes found in Gene Ontology.
The statistical significance of the predominance of certain groups of genes was obtained using a permutation test that simulated samples with the same group size and the same total number of genes. The permutation test was performed 10,000 times, which was sufficient to assess statistical significance at the 95% level (p-value < 0.05).
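The enrichment ratio and a permutation test of this kind can be sketched as follows. The text does not spell out the permutation scheme, so this sketch assumes one plausible reading: random samples of size Nsep are drawn from Ngenome genes, and the p-value is the fraction of samples containing at least Nobs genes of the group. The function and parameter names are hypothetical.

```python
import random

def enrichment(n_obs, group_size, sample_size, total_genes):
    """Penrich = Nobs / (Ssep * (Nsep / Ngenome))."""
    expected = group_size * (sample_size / total_genes)
    return n_obs / expected

def permutation_p_value(n_obs, group_size, sample_size, total_genes,
                        n_perm=10_000, seed=0):
    """One-sided p-value: share of random samples with at least n_obs group genes."""
    rng = random.Random(seed)
    at_least = 0
    for _ in range(n_perm):
        sample = rng.sample(range(total_genes), sample_size)
        # Genes 0 .. group_size-1 stand in for the members of the GO group.
        count = sum(1 for gene in sample if gene < group_size)
        if count >= n_obs:
            at_least += 1
    return at_least / n_perm

ratio = enrichment(n_obs=10, group_size=50, sample_size=100, total_genes=1000)
p = permutation_p_value(10, 50, 100, 1000, n_perm=2000)
print(ratio, p)
```

In this toy case, 5 group genes are expected in the sample by chance, so observing 10 gives an enrichment ratio of 2.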

Supplementary Materials

The supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms242316696/s1.

Author Contributions

Conceptualization, N.A.P. and E.E.A.; methodology, E.E.A., E.S.K. and S.L.H.; software, E.S.K. and S.L.H.; validation, N.A.P. and S.L.H.; formal analysis, N.A.P. and S.L.H.; investigation, E.E.A., E.S.K. and S.L.H.; resources T.S.A., O.N.K., O.P.O. and N.I.D.; data curation, E.S.K. and S.L.H.; writing—original draft preparation, N.A.P., E.E.A., E.S.K. and S.L.H.; writing—review and editing, N.A.P., E.E.A., E.S.K. and S.L.H.; visualization, E.E.A. and S.L.H.; supervision, E.E.A.; project administration, N.A.P. and E.E.A.; funding acquisition, N.A.P. and E.E.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Russian Science Foundation, grant number 19-16-00081-P.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All 14 genome sequences mentioned in this study can be downloaded by their accession number from https://www.ncbi.nlm.nih.gov (accessed on 19 November 2023).

Acknowledgments

We are grateful to Olga Hosid for editing the manuscript.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses and interpretation of data.

References

  1. Martínez-Romero, E. Coevolution in Rhizobium-legume symbiosis? DNA Cell Biol. 2009, 28, 361–370.
  2. Carlson, C.; Frederickson, M.E. An emerging view of coevolution in the legume-rhizobium mutualism. Mol. Ecol. 2023, 32, 3793–3797.
  3. Porter, S.S.; Simms, E.L. Selection for cheating across disparate environments in the legume-rhizobium mutualism. Ecol. Lett. 2014, 17, 1121–1129.
  4. Gano-Cohen, K.A.; Wendlandt, C.E.; Stokes, P.J.; Blanton, M.A.; Quides, K.W.; Zomorrodian, A.; Adinata, E.S.; Sachs, J.L. Interspecific conflict and the evolution of ineffective rhizobia. Ecol. Lett. 2019, 22, 914–924.
  5. Weyl, E.G.; Frederickson, M.E.; Yu, D.W.; Pierce, N.E. Economic contract theory tests models of mutualism. Proc. Natl. Acad. Sci. USA 2010, 107, 15712–15716.
  6. Quides, K.W.; Weisberg, A.J.; Trinh, J.; Salaheldine, F.; Cardenas, P.; Lee, H.-H.; Jariwala, R.; Chang, J.H.; Sachs, J.L. Experimental evolution can enhance benefits of rhizobia to novel legume hosts. Proc. R. Soc. B 2021, 288, 1951.
  7. Regus, J.U.; Quides, K.W.; O'Neill, M.R.; Suzuki, R.; Savory, E.A.; Chang, J.H.; Sachs, J.L. Cell autonomous sanctions in legumes target ineffective rhizobia in nodules with mixed infections. Am. J. Bot. 2017, 104, 1299–1312.
  8. Provorov, N.A.; Andronov, E.E.; Kimeklis, A.K.; Onishchuk, O.P.; Igolkina, A.A.; Karasev, E.S. Microevolution, speciation and macroevolution in rhizobia: Genomic mechanisms and selective patterns. Front. Plant Sci. 2022, 13, 1026943.
  9. Provorov, N.A.; Andronov, E.E.; Onishchuk, O.P. Forms of natural selection controlling the genomic evolution in nodule bacteria. Russ. J. Genet. 2017, 53, 411–419.
  10. Santangelo, J.S.; Johnson, M.T.J.; Ness, R.W. Modern spandrels: The roles of genetic drift, gene flow and natural selection in the evolution of parallel clines. Proc. Biol. Sci. 2018, 285, 20180230.
  11. Cheng, C.; Kirkpatrick, M. Molecular evolution and the decline of purifying selection with age. Nat. Commun. 2021, 12, 2657.
  12. Lee, C.-R.; Mitchell-Olds, T. Environmental adaptation contributes to gene polymorphism across the Arabidopsis thaliana genome. Mol. Biol. Evol. 2012, 29, 3721–3728.
  13. Marchinko, K.B.; Matthews, B.; Arnegard, M.E.; Rogers, S.M.; Schluter, D. Maintenance of a genetic polymorphism with disruptive natural selection in stickleback. Curr. Biol. 2014, 24, 1289–1292.
  14. Rahman, S.; Kosakovsky Pond, S.L.; Webb, A.; Hey, J. Weak selection on synonymous codons substantially inflates dN/dS estimates in bacteria. Proc. Natl. Acad. Sci. USA 2021, 118, e2023575118.
  15. Taub, D.R.; Page, J. Molecular signatures of natural selection for polymorphic genes of the human dopaminergic and serotonergic systems: A review. Front. Psychol. 2016, 8, 857.
  16. Moon, S.U.; Na, B.K.; Kang, J.M.; Kim, J.Y.; Cho, S.H.; Park, Y.K.; Sohn, W.M.; Lin, K.; Kim, T.S. Genetic polymorphism and effect of natural selection at domain I of apical membrane antigen-1 (AMA-1) in Plasmodium vivax isolates from Myanmar. Acta Trop. 2010, 114, 71–75.
  17. Kang, J.M.; Ju, H.L.; Kang, Y.M. Genetic polymorphism and natural selection in the C-terminal 42 kDa region of merozoite surface protein-1 among Plasmodium vivax Korean isolates. Malar. J. 2012, 11, 206.
  18. Barnard-Kubow, K.; Sloan, D.; Galloway, L. Correlation between sequence divergence and polymorphism reveals similar evolutionary mechanisms acting across multiple timescales in a rapidly evolving plastid genome. BMC Evol. Biol. 2014, 14, 1.
  19. Vigué, L.; Eyre-Walker, A. The comparative population genetics of Neisseria meningitidis and Neisseria gonorrhoeae. PeerJ 2019, 27, e7216.
  20. Sunyaev, S.; Kondrashov, F.A.; Bork, P.; Ramensky, V. Impact of selection, mutation rate and genetic drift on human genetic variation. Hum. Mol. Genet. 2003, 12, 3325–3330.
  21. Young, P.W.; Crossman, L.C.; Johnston, A.W.B.; Thomson, N.R.; Ghazoui, Z.F.; Hull, K.H.; Wexler, M.; Curson, A.; Todd, J.D.; Poole, P.S.; et al. The genome of Rhizobium leguminosarum has recognizable core and accessory components. Genome Biol. 2006, 7, R34.
  22. Lozano, L.; Hernández-González, I.; Bustos, P.; Santamaría, R.I.; Souza, V.; Young, J.P.W.; Dávila, G.; González, V. Evolutionary dynamics of insertion sequences in relation to the evolutionary histories of the chromosome and symbiotic plasmid genes of Rhizobium etli populations. Appl. Environ. Microbiol. 2010, 76, 6504–6513.
  23. Galardini, M.; Pini, F.; Bazzicalupo, M.; Biondi, E.G.; Mengoni, A. Replicon-dependent bacterial genome evolution: The case of Sinorhizobium meliloti. Genome Biol. Evol. 2013, 5, 542–558.
  24. Bailly, X.; Giuntini, E.; Sexton, M.; Lower, R.P.J.; Harrison, P.W.; Kumar, N.; Young, J.P.W. Population genomics of Sinorhizobium medicae based on low-coverage sequencing of sympatric isolates. ISME J. 2011, 5, 1722–1734.
  25. Nelson, M.; Guhlin, J.; Epstein, B.; Tiffin, P.; Sadowsky, M. The complete replicons of 16 Ensifer meliloti strains offer insights into intra- and inter-replicon gene transfer, transposon-associated loci, and repeat elements. Microb. Genom. 2018, 4, e000174.
  26. Greenlon, A.; Chang, P.L.; Damtew, Z.M.; Muleta, A.; Carrasquilla-Garcia, N.; Kim, D.; Nguyen, H.P.; Suryawanshi, V.; Krieg, C.P.; Yadav, S.K.; et al. Global-level population genomics reveals differential effects of geography and phylogeny on horizontal gene transfer in soil bacteria. Proc. Natl. Acad. Sci. USA 2019, 116, 15200–15209.
  27. Kimeklis, A.; Chirak, E.; Kuznetsova, I.; Sazanova, A.; Safronova, V.; Belimov, A.; Onishchuk, O.; Kurchak, O.; Aksenova, T.; Pinaev, A.; et al. Rhizobia isolated from the relict legume Vavilovia formosa represent a genetically specific group within Rhizobium leguminosarum biovar viciae. Genes 2019, 10, 991.
  28. Onischuk, O.P.; Kurchak, O.N.; Kimeklis, A.K.; Aksenova, T.S.; Andronov, E.E.; Provorov, N.A. Biodiversity of the symbiotic systems formed by nodule bacteria Rhizobium leguminosarum with the leguminous plants of galegoid complex. Sel'skokhozyaistvennaya Biol. (Agric. Biol.) 2023, 58, 87–99.
  29. Österman, J.; Marsh, J.; Laine, P.K.; Zeng, Z.; Alatalo, E.; Sullivan, J.T.; Young, P.W.; Thomas-Oates, J.; Paulin, L.; Lindström, K. Genome sequencing of two Neorhizobium galegae strains reveals a noeT gene responsible for the unusual acetylation of the nodulation factors. BMC Genom. 2014, 15, 500.
  30. Karasev, E.S.; Andronov, E.E.; Aksenova, T.S.; Tupikin, A.E.; Provorov, N.A. Evolution of goat’s rue rhizobia (Neorhizobium galegae): An analysis of the polymorphism of the nitrogen fixation genes and the genes of nodule formation. Russ. J. Genet. 2019, 55, 234–238.
  31. Xin, Z.; Cai, Y.; Dang, L.T.; Burke, H.M.S.; Revote, J.; Charitakis, N.; Bienroth, D.; Nim, H.T.; Li, H.Y.; Ramialison, M. MonaGO: A novel gene ontology enrichment analysis visualisation system. BMC Bioinform. 2022, 23, 69.
  32. Raig, H.; Nõmmsalu, H.; Meripõld, H.; Metlitskaja, J. Fodder Galega; Estonian Research Institute of Agriculture: Saku, Estonia, 2001; 141p.
  33. Andronov, E.; Terefework, Z.; Roumiantseva, M.; Dzyubenko, N.; Onishchuk, O.; Kurchak, O.; Dresler-Nurmi, A.; Young, J.P.; Simarov, B.; Lindstrom, K. Symbiotic and genetic diversity of Rhizobium galegae isolates collected from the Galega orientalis gene center in the Caucasus. Appl. Environ. Microbiol. 2003, 69, 1067–1074.
  34. Österman, J.; Chizhevskaya, E.; Andronov, E.; Fever, D.; Terefework, Z.; Roumiantseva, M.; Onishchuk, O.; Dresler-Nurmi, A.; Simarov, B.; Dzyubenko, N.; et al. Galega orientalis is more diverse than Galega officinalis in Caucasus – whole-genome AFLP analysis and phylogenetics of symbiosis-related genes. Mol. Ecol. 2011, 20, 4808–4821.
  35. Radeva, G.; Jurgens, G.; Niemi, M.; Nick, G.; Suominen, L.; Lindström, K. Description of two biovars in the Rhizobium galegae species: Biovar orientalis and biovar officinalis. Syst. Appl. Microbiol. 2001, 24, 192–205.
  36. Fagorzi, C.; Ilie, A.; Decorosi, F.; Cangioli, L.; Viti, C.; Mengoni, A.; diCenzo, G.C. Symbiotic and nonsymbiotic members of the genus Ensifer (syn. Sinorhizobium) are separated into two clades based on comparative genomics and high-throughput phenotyping. Genome Biol. Evol. 2020, 12, 2521–2534.
  37. Muleta, A.; Tesfaye, K.; Assefa, F.; Greenlon, A.; Riely, B.K.; Carrasquilla-Garcia, N.; Gai, Y.; Haileslassie, T.; Cook, D.R. Genomic diversity and distribution of Mesorhizobium nodulating chickpea (Cicer arietinum L.) from low pH soils of Ethiopia. Syst. Appl. Microbiol. 2022, 45, 126279.
  38. Laranjo, M.; Alexandre, A.; Oliveira, S. Legume growth-promoting rhizobia: An overview on the Mesorhizobium genus. Microbiol. Res. 2014, 169, 2–17.
  39. Perrineau, M.M.; Le Roux, C.; de Faria, S.M.; de Carvalho Balieiro, F.; Galiana, A.; Prin, Y.; Béna, G. Genetic diversity of symbiotic Bradyrhizobium elkanii populations recovered from inoculated and non-inoculated Acacia mangium field trials in Brazil. Syst. Appl. Microbiol. 2011, 34, 376–384.
  40. Arashida, H.; Odake, H.; Sugawara, M.; Noda, R.; Kakizaki, K.; Ohkubo, S.; Mitsui, H.; Sato, S.; Minamisawa, K. Evolution of rhizobial symbiosis islands through insertion sequence-mediated deletion and duplication. ISME J. 2022, 16, 112–121.
  41. Batstone, R.T.; Peters, M.A.E.; Simonsen, A.K.; Stinchcombe, J.R.; Frederickson, M.E. Environmental variation impacts trait expression and selection in the legume-rhizobium symbiosis. Am. J. Bot. 2020, 107, 195–208.
  42. Novikova, N.; Safronova, V. Transconjugants of Agrobacterium radiobacter harbouring sym genes of Rhizobium galegae can form an effective symbiosis with Medicago sativa. FEMS Microbiol. Lett. 1992, 93, 261–268.
  43. Allen, O.N. Experiments in Soil Bacteriology; Burgess Publishing Co.: Minneapolis, MN, USA, 1959; pp. 52–59.
  44. Somasegaran, P.; Hoben, H.J. Isolating and purifying genomic DNA of rhizobia using a large-scale method. In Handbook for Rhizobia; Garber, R.C., Ed.; Springer: New York, NY, USA, 1994; pp. 279–283.
  45. Ashburner, M.; Ball, C.A.; Blake, J.A.; Botstein, D.; Butler, H.; Cherry, J.M.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000, 25, 25–29.
Figure 1. Presentation of the Neorhizobium galegae genome and gene datasets studied. (A). Phylogenetic congruence of N. galegae core (left) and sym (right) genes according to average nucleotide identity (ANI) analysis. Strains of bv. officinalis (OF) are represented in red, bv. orientalis (OR) in blue. (B). Distribution of core gene polymorphism (p-distance) in biovars orientalis (horizontal axis) and officinalis (vertical axis) and division of gene dataset into clusters I–IV. The red lines correspond to the average p-distance value of each biovar. (C). The scheme for dividing core genes into four clusters: a column above the line represents a trait value that is above average, below the line represents below average.
Figure 2. Distribution of 782 polymorphic core genes into clusters (introduced in text) contrasting for p-distance (Cpol-I…Cpol-IV are in blue) and dN/dS (Csel-I…Csel-IV are in orange) values in Neorhizobium galegae biovars orientalis (Ori) and officinalis (Off). The vertical axis shows the representations (in %, with standard errors) of each cluster in the total pool of 782 analyzed genes (data from Table 3 are used, sizes of columns are given in Table S1 in the Supplementary Materials). Columns with significant differences are connected with square brackets.
Figure 3. Statistical analysis of clustering of 76 Gene Ontology Groups (GOGs) contrasting for p-distance and dN/dS values in Neorhizobium galegae biovars orientalis (Ori) and officinalis (Off) (data from Table 3 are used, sizes of columns are given in Tables S5 and S6 in the Supplementary Materials). (A) Frequencies (in % with standard errors) for GOGs with elevated dN/dS values among GOGs with elevated or decreased p-distance values (“High-in-High” and “High-in-Low” frequencies are presented in blue and orange, respectively). (B) The same for GOGs with elevated p-distance values among GOGs with elevated or decreased dN/dS values. Significant (P0 < 0.01) differences were revealed in the comparisons of “High-in-High” and “High-in-Low” frequencies for bv. orientalis, while for bv. officinalis, these differences were not significant. Columns with significant differences are connected with square brackets.
Table 1. Nucleotide polymorphism (p-distance) and driving/purifying selection impacts (dN/dS) on core and sym gene evolution in host-specific Neorhizobium galegae biovars.
| Genes * | Mean ± standard error, bv. orientalis | Mean ± standard error, bv. officinalis | tSt (P0) |
|---|---|---|---|
| p-distance | | | |
| core | 0.048 ± 0.001 | 0.010 ± 0.001 | 106.4 (<0.001) |
| sym | 0.028 ± 0.008 | 0.005 ± 0.001 | 2.84 (<0.05) |
| tSt (P0) | 2.47 (<0.05) | 3.57 (<0.01) | - |
| dN/dS ** | | | |
| core | 1.571 ± 0.050 (D) | 1.013 ± 0.026 (N) | 9.91 (<0.001) |
| sym | 1.009 ± 0.142 (N) | 0.272 ± 0.111 (P) | 4.09 (<0.001) |
| tSt (P0) | 3.72 (<0.01) | 6.50 (<0.001) | - |

* Numbers of studied core genes are 3840 for bv. orientalis and 2734 for bv. officinalis; number of studied sym genes for both biovars is 39 (16 nod, 8 nif, 15 fix genes are listed in Table S7 in the Supplementary Materials). The Student’s t-test (tSt) was used to assess the probability of the null hypothesis (P0) suggesting no difference between core and sym gene groups or between N. galegae biovars.
** Natural selection is: D, driving (dN > dS); P, purifying (dN < dS); N, no selection (dN ≈ dS; neutral evolution occurs).
Table 2. Correlations between nucleotide polymorphism (p-distance) and natural selection (dN/dS) impacts in core and sym genes of Neorhizobium galegae biovars.
| Genes | Pearson correlation (r) *, bv. orientalis | Pearson correlation (r) *, bv. officinalis | tSt (P0) |
|---|---|---|---|
| core | +0.346 (P0 < 0.001) | +0.066 (0.05 < P0 < 0.10) | 12.73 (<0.001) |
| sym ** | +0.078 (P0 > 0.10) | –0.991 (0.05 < P0 < 0.10) | 4.18 (<0.001) |
| tSt (P0) | 0.99 (>0.10) | 50.03 (<0.001) | - |

* Probabilities of the null hypothesis suggesting no correlation between p-distance and dN/dS are given in parentheses after r values; tSt (P0) used to compare the r values is introduced in Table 1.
** Numbers of studied polymorphic core genes are 2864 for biovar orientalis and 2076 for biovar officinalis; numbers of studied sym genes are 15 for bv. orientalis and 3 for bv. officinalis (only sym genes polymorphic in both biovars were analyzed).
Table 3. Distribution of 76 Gene Ontology Groups (GOGs) composed of 782 polymorphic Neorhizobium galegae core genes into clusters with contrasting p-distance (Cpol-I … Cpol-IV) or dN/dS (Csel-I … Csel-IV) values (clusters are introduced in the text).
Numbers of GOGs in gene clusters with contrasting values of polymorphism (p-distance, columns) or natural selection (dN/dS, rows) *

| Clusters for dN/dS * | Cpol-I (229) | Cpol-II (112) | Cpol-III (153) | Cpol-IV (288) | Total GOGs |
|---|---|---|---|---|---|
| Csel-I (148) | 3 | 0 | 0 | 0 | 3 |
| Csel-II (190) | 4 | 2 | 0 | 4 | 10 |
| Csel-III (85) | 5 | 1 | 3 | 0 | 9 |
| Csel-IV (359) | 12 | 20 | 4 | 18 | 54 |
| Total GOGs | 24 | 23 | 7 | 22 | 76 |

* Numbers of genes in each cluster are given in parentheses.
Table 4. Comparison of evolutionarily sufficient items in Neorhizobium galegae and Rhizobium leguminosarum.
| Items | Neorhizobium galegae * | Rhizobium leguminosarum ** |
|---|---|---|
| Compared biovars (their hosts) | bv. orientalis (Galega orientalis), bv. officinalis (G. officinalis) | bv. viciae (Lathyrus, Lens, Pisum, Vavilovia, Vicia), bv. trifolii (Trifolium) |
| Taxonomic diversity of host plants (host ranges of compared rhizobia species) | Different species of the same plant genus (narrow) | Different plant genera and tribes (broad) |
| Replicons harboring sym genes | Chromids (>1600 kb) | Plasmids (200–500 kb) |
| Phylogenetic congruence of core and sym genes | High or complete | Low or absent |
| Differences between biovars for nucleotide sequences of sym and core genes (Table S9) | Significant; for sym genes, polymorphism is more pronounced than for core genes | Highly significant for sym genes but absent or less significant for core genes |
| Variation within biovars (Table S8): for core genes | Highly significant | Significant |
| Variation within biovars (Table S8): for sym genes | Significant but less than for core genes | Much higher than for core genes |

* This research (Table 1, Tables S8 and S9); ** from [27].
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
