Article

Updated Sequence and Annotation of the Broad Host Range Rhizobial Symbiont Sinorhizobium fredii HH103 Genome

by Francisco Fuentes-Romero, Francisco-Javier López-Baena, José-María Vinardell * and Sebastián Acosta-Jurado
Department of Microbiology, Faculty of Biology, University of Seville, 41012 Sevilla, Spain
* Author to whom correspondence should be addressed.
Genes 2025, 16(9), 1094; https://doi.org/10.3390/genes16091094
Submission received: 31 July 2025 / Revised: 5 September 2025 / Accepted: 9 September 2025 / Published: 16 September 2025
(This article belongs to the Section Microbial Genetics and Genomics)

Abstract

Background: Sinorhizobium fredii HH103 is a fast-growing rhizobial strain capable of infecting a broad range of legumes, including plants forming determinate and indeterminate nodules, such as Glycine max (its natural host) and Glycyrrhiza uralensis, respectively. Previous studies reported the sequence and annotation of the genome of this strain (7.25 Mb), revealing the most complex S. fredii genome sequenced to date. It comprises seven replicons: one chromosome and six plasmids. Among these plasmids, pSfHH103d, also known as the symbiotic plasmid pSymA, harbors most of the genes involved in symbiosis. Due to limitations of the sequencing technology used at the time and the presence of a high number of clusters of transposable elements, this plasmid could only be partially assembled as four separate contigs. Methods: In this work, we have used a combination of PacBio and Illumina sequencing technologies to resolve these complex regions, obtaining an updated genome sequence (7.27 Mb). Results: This updated version includes an increase in size of the largest replicons (chromosome, pSfHH103d, and pSfHH103e) and a complete and closed symbiotic plasmid (pSfHH103d or pSymA). Additionally, we carried out a re-annotation of the updated genome, merging the previous annotation with the new one generated for the remaining gaps. Notably, we found a high number of transposable elements in the HH103 genome, especially in three plasmids (pSfHH103b, pSfHH103c, and pSymA), a feature that is common among S. fredii strains. Conclusions: The combination of PacBio and Illumina sequencing technologies has allowed us to obtain a complete version of the HH103 pSymA. The presence of a high number of mobile elements seems to be a general characteristic among S. fredii strains, a fact that might be related to a high genome plasticity.

1. Introduction

S. fredii is a rhizobial species able to establish a nitrogen-fixing symbiosis with dozens of legumes [1,2,3,4]. Most S. fredii strains have been isolated from Chinese soils and are able to nodulate soybeans (G. max) and wild soybeans (Glycine soja), their natural hosts [2,5]. However, a small number of S. fredii strains have been isolated in other geographical locations. This is the case for strains NGR234, isolated from nodules of Lablab purpureus in New Guinea [1], and SMH12, isolated from soybean nodules in Vietnam [6,7]. Interestingly, NGR234 is unable to nodulate soybean, but it can induce the formation of nitrogen-fixing nodules in different G. soja accessions from Central China [8]. The remaining S. fredii strains effectively nodulate Asiatic varieties of soybeans, and many strains are also able to nodulate American (commercial) varieties of this legume [2,9]. At least in the well-studied cases, this symbiotic (in)compatibility is caused by the recognition, through plant resistance (R) proteins, of effectors delivered into host cells by a symbiotic type 3 secretion system (T3SS) [10]. Among the best studied S. fredii strains, USDA257 and HH103 are the representative examples of strains nodulating only Asiatic soybeans and strains nodulating both Asiatic and American soybeans, respectively [2,10].
The genome of S. fredii HH103 was first sequenced in 2012 [11,12]. It was 7.25 Mb in size and consisted of one chromosome and six plasmids, making it, to our knowledge, the most complex S. fredii genome analyzed so far (https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=380, accessed on 31 July 2025). The number of plasmids present in other S. fredii strains varies between one (strain USDA257) and four (strain Sf1), with the presence of two megaplasmids being the most frequent case (for example, strains NGR234 and SMH12). An interesting study of the complexity of S. fredii genomes and the lineage-specific adaptations among soybean-nodulating rhizobia has been provided by Tian and collaborators [13]. Among HH103 plasmids, the symbiotic plasmid (plasmid d or pSymA), which harbors most of the genes involved in symbiosis, could only be partially assembled (4 contigs) due to limitations of the sequencing technology (454 Life Sciences) and the presence of a high number of clusters of repeated sequences [12].
In this work, we have used a combination of two sequencing technologies, PacBio and Illumina, to obtain a new genome sequence of HH103 that includes a complete version of the pSymA plasmid. The updated HH103 genome sequence (7.27 Mb) shows an increase in size of the largest replicons (chromosome, plasmid d, and plasmid e) in comparison to the previous version of the genome, and the newly detected genes have been manually annotated. Also, the annotation of the gene coding for the T3SS effector protein NopD has been updated. The HH103 genome harbors 6949 open reading frames (ORFs), including 340 related to transposable elements (110 of them located on the pSymA).

2. Materials and Methods

For DNA isolation, S. fredii HH103 was streaked from a glycerol stock conserved at −80 °C onto tryptone-yeast extract (TY) medium [14] and cultured at 28 °C for a week. Single colonies from the plate were inoculated into TY broth and grown for three days at 28 °C and 180 rpm. DNA from bacterial cultures was extracted using Monarch genomic DNA purification kits (New England Biolabs, Ipswich, MA, USA) following the protocol supplied by the manufacturer. Extracted DNA was sequenced at Novogene (Cambridge, UK) using Pacific Biosciences (PacBio, Menlo Park, CA, USA) Sequel II technology (CLR mode). Illumina sequencing was performed at Macrogen (Seoul, Republic of Korea) using a NovaSeq 6000 (150PE) to produce 2 × 150 bp reads. Illumina reads were assessed using FastQC v0.12.1; no residual adapters or low-quality regions were identified, so no trimming or filtering steps were applied.
The assembly and draft genome of HH103 were generated from the PacBio reads using Flye v2.9.3-b1797 [15]. Contigs shorter than 3000 bp were discarded. Draft genome assembly improvement was carried out with Illumina reads using Pilon v1.24 [16]; read mapping for this step was performed with bowtie2 v2.5.3 [17]. To reorient replicons to start at the dnaA or repA gene, Circlator v1.5.5 was used with the fixstart options enabled [18]. Finally, the assembly was annotated by extracting the coordinates from the previous annotation and their sequence by using bedtools v2.31.1 [19] with the option getfasta. Those sequences were then searched in the new assembly by using the blastn command [20], and the coordinates of the matching regions were used to make a new annotation. The new genome was larger than the previous one, so the remaining gaps were annotated with bakta v1.11.0 [21] and this information was manually added to the new annotation file. Default settings and twenty-four threads, when multithreading was possible, were used for all the software employed in this work (unless otherwise specified). Sequencing statistics are provided in Table 1.
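The annotation-transfer step described above (locating previously annotated sequences in the new assembly with blastn, then finding the regions left without annotation) can be illustrated with a minimal Python sketch. It is not the script used in this work: it assumes that the blastn results were saved in the standard 12-column tabular format (`-outfmt 6`), and the function names `best_hits` and `uncovered_gaps`, as well as the toy records, are hypothetical.

```python
from collections import namedtuple

# One transferred feature: coordinates are 1-based and inclusive on the new assembly.
Hit = namedtuple("Hit", "query subject start end strand bitscore")


def best_hits(blast_tab):
    """Keep the highest-bitscore hit per query from BLAST tabular (outfmt 6) text."""
    best = {}
    for line in blast_tab.strip().splitlines():
        f = line.split("\t")
        sstart, send = int(f[8]), int(f[9])
        # In tabular BLAST output, sstart > send marks a minus-strand match.
        strand = "+" if sstart <= send else "-"
        hit = Hit(f[0], f[1], min(sstart, send), max(sstart, send), strand, float(f[11]))
        if f[0] not in best or hit.bitscore > best[f[0]].bitscore:
            best[f[0]] = hit
    return best


def uncovered_gaps(intervals, replicon_length, min_gap=1):
    """Return 1-based inclusive regions of a replicon not covered by any interval."""
    gaps, cursor = [], 1
    for start, end in sorted(intervals):
        if start - cursor >= min_gap:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if replicon_length - cursor + 1 >= min_gap:
        gaps.append((cursor, replicon_length))
    return gaps


# Toy example (hypothetical gene names and coordinates, not data from this work).
toy = "\n".join([
    "geneA\trep1\t100.0\t300\t0\t0\t1\t300\t1001\t1300\t0.0\t555",
    "geneA\trep1\t95.0\t300\t15\t0\t1\t300\t5000\t5299\t1e-100\t400",
    "geneB\trep1\t100.0\t150\t0\t0\t1\t150\t900\t751\t1e-70\t278",
])
hits = best_hits(toy)
covered = [(h.start, h.end) for h in hits.values()]
print(uncovered_gaps(covered, replicon_length=2000))
```

In such a scheme, the regions returned by `uncovered_gaps` would be the candidates for de novo annotation; a real pipeline would also need to handle partial hits, duplicated genes (for example, repeated mobile elements), and circular replicons, which this sketch ignores.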
The automatic annotation provided by RefSeq (Locus_tag=“ACN6KE_RS…”) is available at https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_048585425.1/ (accessed on 31 July 2025). The manual annotation generated in this work (Locus_tag=“ACN6KE_…”) is available at https://github.com/ffuentesr97/08_25_HH103_genome.git (accessed on 1 September 2025).
A maximum likelihood phylogeny of 15 species from the genus Sinorhizobium was constructed as described previously [22]. Rhizobium leguminosarum SM52 was included as an outgroup.
Scripts to search mobile element genes in S. fredii strains and to plot the HH103 pSym are available at GitHub, software version 3.17.3 (https://github.com/ffuentesr97/08_25_HH103_genome.git, accessed on 1 September 2025). These scripts were run using the annotation and/or the nucleotide sequence, as indicated in the script.
The accession numbers of the other genomes analyzed in this work are as follows: R. leguminosarum SM52, GCF_004306555.1; Sinorhizobium americanum CCGM7, GCF_000705595.2; Sinorhizobium kummerowiae CCBAU71714, GCF_030064585.1; Sinorhizobium medicae WSM419, GCF_000017145.1; Sinorhizobium meliloti 1021, GCF_000006965.1; S. meliloti Rm41, GCF_002197045.1; Sinorhizobium sojae CCBAU05684, GCF_002288525.1; Sinorhizobium terangae CB3126, GCF_029714365.1; S. fredii CCBAU45436, GCF_003100575.1; S. fredii NGR234, GCF_000018545.1; S. fredii SMH12, GCF_024400375.1; S. fredii USDA192, GCF_041260365.1; S. fredii USDA193, GCF_041262265.1; S. fredii USDA205T, GCF_009601405.1; S. fredii USDA257, GCF_000265205.3.

3. Results and Discussion

3.1. Characteristics of the Updated Version of the S. fredii HH103 Genome

As described in the Materials and Methods Section, we have de novo sequenced the S. fredii HH103 genome by combining PacBio and Illumina technologies. This approach allowed us to obtain the pSymA sequence as a single contig (in the previous version, this sequence was split into 4 non-assembled contigs). As stated previously by our group [11,12], the HH103 genome consists of seven replicons: the chromosome and six plasmids (named e, d, c, b, a2, and a1), confirming that it is the most complex S. fredii genome sequenced so far. The GenBank accession numbers for the updated sequences of the seven replicons are as follows: CP183939.1 to CP183945.1. This new version of the HH103 genome is slightly larger (7.27 Mb) than the previous one (7.25 Mb), since it entails an increase in size of the largest replicons (chromosome and plasmids d and e). Details about the gene content of each replicon have been previously described [12]. The largest plasmids, pSfHH103e and pSfHH103d, correspond to the pSymB and pSymA plasmids, respectively, described in S. meliloti and in other S. fredii strains [23,24]. Hereafter, we will refer to them as HH103 pSymB and pSymA, respectively. Thus, HH103 pSymB carries genes related to exopolysaccharide production, one important rhizobial symbiotic signal [25], whereas HH103 pSymA harbors genes related to the production of Nod factors and the symbiotic T3SS as well as nif and fix genes required for nitrogen fixation inside nodules [12]. Table 2 contains the comparison between the previous and the updated versions of the HH103 genome, replicon by replicon, and Figure 1 shows a circular plot of HH103 pSymA (CP183945.1).
This complete sequence of the pSymA, which contains key symbiotic genes, should give researchers much better tools to probe the functions of these genes and further our understanding of the legume-rhizobia symbiosis in a strain that is noteworthy for its broad host range. As discussed below, the fully assembled HH103 pSymA plasmid contains a remarkably high number of mobile elements. Notably, the updated version of the HH103 genome confirmed the presence of a second copy of the nifHDK genes (ACN6KE_004317 to ACN6KE_004319), which, like the first copy (ACN6KE_004532 to ACN6KE_004530), is located on the pSymA (Figure 2). These two copies of nifHDK, which are separated by around 190 kb in the sequence of pSymA, are 100% identical and are both preceded by a NifA-box, suggesting a similar regulation of their expression. Since the nifHDK genes code for the structural units of the nitrogenase [26], the presence of two copies of these genes might be related to nitrogen fixation efficiency inside nodules. The presence of two copies of nifHDK is common among S. fredii strains, in contrast to S. meliloti, where a single copy of these genes is the usual situation. It is also relevant that the annotation of the nopD gene (ACN6KE_004365) has changed in the updated version of the HH103 genome. The nopD gene, also located on pSymA, codes for a T3SS effector protein [10,27] harboring a C48 protease domain that is involved in SUMOylation and de-SUMOylation of host proteins. The updated sequence of NopD is 1490 residues long, whereas the previous version had 1318 residues.
In addition to the presence of additional copies of previously described HH103 genes (especially mobile elements), we have found 16 new genes in the updated version of the genome (Table 3), most of them located on the chromosome, two in the pSymA, and one in the pSymB. Seven out of these 16 genes code for hypothetical proteins, whereas the rest encode highly diverse proteins.

3.2. Comparison of the Genome of HH103 with Other Rhizobia

We have compared the updated genome sequence of S. fredii HH103 with the previous one and with those of different S. fredii strains and Sinorhizobium species by using core-genome gene phylogeny. These comparisons gave rise to the phylogenetic tree shown in Figure 3. As expected, S. fredii genomic sequences clustered together. Both HH103 genome sequences clustered with a set of S. fredii strains: USDA192, SMH12, USDA193, and USDA205. All these genomic sequences were closer to those of CCBAU45436 and NGR234, whereas the genome of USDA257 was the most divergent from the rest of the S. fredii strains. Regarding the comparisons between S. fredii strains and the other Sinorhizobium species included in this study, interestingly, genomes of S. fredii strains were more similar to that of S. americanum (a species that was first isolated from Acacia nodules in Mexico but that is also able to nodulate Phaseolus) [28,29] than to that of S. sojae, which was isolated from soybean nodules in China [30]. All the mentioned genomes are closer to S. medicae, S. kummerowiae and S. meliloti than to S. terangae. Our results are consistent with those obtained by Kuzmanović and collaborators [31] by using the same methodology.

3.3. The HH103 Genome Is the Most Complex Among the Different S. fredii Strains Characterized So Far

The updated version of the HH103 genome confirms its previously described complexity [12]. As mentioned above, this genome is composed of one chromosome and six different plasmids. Besides comparing the genome sequences of the different S. fredii strains analyzed in this work, we analyzed their genome structures. As shown in Table 4, the sizes of the S. fredii genomes studied vary between 6.6 (strain USDA205T) and 7.3 Mb (strain HH103). Thus, among the different S. fredii strains analyzed, the genome of HH103 is not only the most complex but also the largest one. Except for USDA257, whose genome is described as composed of a chromosome and a single plasmid [32,33], all the strains harbor at least two plasmids, which, in all the strains for which this information is available, correspond to the two typical symbiotic megaplasmids (pSymB and pSymA) of about 2 and 0.5 Mb, respectively, also present in S. meliloti [12,23,24]. The single plasmid described in USDA257 corresponds to the megaplasmid that carries genes coding for Nod factor production and secretion, the symbiotic T3SS, and genes related to nitrogen fixation (pSymA) [32]. In conclusion, the S. fredii genome structure is highly variable among different strains.

3.4. S. fredii Strains Harbor Higher Numbers of Mobile Elements than Other Sinorhizobium Species

The S. fredii HH103 genome carries a high number of mobile elements. In fact, the high number of clusters of transposases and insertion sequences present in the pSymA prevented the full assembly of this replicon in the previous version of the HH103 genome [12]. In total, the HH103 genome contains 340 genes related to mobile elements. This high number prompted us to investigate whether the same situation occurs in other S. fredii strains and Sinorhizobium species. For this purpose, we carried out a search for mobile elements in the whole genomes of the rhizobial strains, and the results are shown in Table 5.
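A search of this kind can be sketched in Python as a keyword scan over the product descriptions of an annotation. The sketch below is only an illustration and not the script deposited in the GitHub repository of this work: it assumes a GFF3 annotation with a `product` attribute on CDS features, and the keyword list and the function name `count_mobile_genes` are hypothetical choices.

```python
import re
from collections import Counter

# Hypothetical keyword list; a real search may use a different or broader set.
KEYWORDS = re.compile(r"transposase|insertion sequence|mobile element", re.IGNORECASE)


def count_mobile_genes(gff_text):
    """Count CDS features per replicon whose product matches a mobile-element keyword."""
    counts = Counter()
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # only well-formed CDS records are considered
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if KEYWORDS.search(attrs.get("product", "")):
            counts[cols[0]] += 1
    return counts
```

Counting per replicon (the first GFF3 column) rather than per genome makes the same function usable both for strain-level totals and for a replicon-by-replicon breakdown. Keyword matching depends heavily on how each genome was annotated, so counts obtained this way are only comparable across genomes annotated with similar conventions.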
The number of genes related to mobile elements found in S. fredii strains was always higher than 200, ranging from 209 in USDA257 to 356 in USDA205. The genome of HH103 harbors 340 genes related to transposable elements. These numbers were higher than those found in other Sinorhizobium species, which varied between 108 in S. terangae CB3126 and 194 in S. sojae CCBAU05684. These data suggest a higher genome plasticity in S. fredii than in other Sinorhizobium species, since it is well known that mobile elements have an important impact on genome structure and function [34].
Finally, we decided to investigate the distribution of mobile elements among the different replicons of HH103. As shown in Table 6, the density of genes related to mobile elements (calculated as the number of these genes per 10 kb) varied enormously between the different replicons, being low (≤0.40) for the two largest replicons (chromosome and pSymB) and one of the two smallest plasmids (pSfHH103a2), medium (0.83) for pSfHH103a1, and high (>1.82) for plasmids pSfHH103b, pSfHH103c, and pSymA. Whether this higher presence of mobile elements in these three plasmids is related to a higher plasticity and/or a higher probability of horizontal gene transfer of these replicons in comparison with the rest of the HH103 genome remains to be investigated.
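The density measure used above (genes related to mobile elements per 10 kb) can be reproduced with a short Python sketch. The gene counts (340 in the whole genome, 110 on pSymA) and the sequence lengths (7,273,959 bp and 605,378 bp) are taken from the text and from Table 1; the function name `mobile_density_per_10kb` is hypothetical.

```python
def mobile_density_per_10kb(n_genes, length_bp):
    """Number of mobile-element genes per 10 kb of sequence."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return n_genes / length_bp * 10_000


# pSymA: 110 mobile-element genes in 605,378 bp.
psyma_density = mobile_density_per_10kb(110, 605_378)
# Whole genome: 340 mobile-element genes in 7,273,959 bp.
genome_density = mobile_density_per_10kb(340, 7_273_959)

print(round(psyma_density, 2), round(genome_density, 2))  # prints: 1.82 0.47
```

With these inputs, the pSymA density (about 1.82) is roughly four times the genome-wide average (about 0.47), which illustrates how unevenly the mobile elements are distributed.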

4. Conclusions

The use of two different technologies, one providing long reads (PacBio) and the other providing a very high number of short reads (Illumina), has allowed us to overcome the problem that the high number of clusters of transposable elements posed for the assembly of the pSymA plasmid. This approach might be adequate for the sequencing and assembling of other complex genomes. The de novo generated HH103 sequence confirmed that the HH103 genome, with one chromosome and six plasmids, is the most complex among the 34 different S. fredii strains whose genomes have been sequenced so far (https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=380, accessed on 31 July 2025). Having a complete sequence of the pSymA plasmid will facilitate further studies on S. fredii HH103 genes relevant for symbiosis. We have also shown that S. fredii genomes exhibit a higher presence of mobile elements than other Sinorhizobium species, a fact that might be related to a greater genomic plasticity and horizontal gene transfer probability. In fact, the role of insertion sequences in S. fredii strains’ adaptive evolution to symbiosis with their host plants has been previously proposed [35,36]. Also, a recent study highlights the importance of rhizobial mobile gene clusters in driving partner quality variation in symbiosis [37]. In the case of HH103, mobile elements are especially abundant in the pSymA, which opens the possibility that these transposable elements influence the adaptation of this strain to its different host partners.

Author Contributions

Conceptualization, F.F.-R., F.-J.L.-B., J.-M.V. and S.A.-J.; methodology, F.F.-R. and S.A.-J.; formal analysis, F.F.-R., J.-M.V. and S.A.-J.; investigation, F.F.-R., J.-M.V. and S.A.-J.; data curation, F.F.-R. and S.A.-J.; writing—original draft preparation, F.F.-R., J.-M.V. and S.A.-J.; writing—review and editing, F.F.-R., F.-J.L.-B., J.-M.V. and S.A.-J.; supervision, J.-M.V.; funding acquisition, F.-J.L.-B. and J.-M.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by FEDER/UNIVERSIDAD DE SEVILLA, grant number US-1250546, and MCIN/AEI/10.13039/501100011033, grant number PID2022-141156OB-I00. FFR is funded by JUNTA DE ANDALUCÍA PAIDI, grant number PREDOC_01119.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the data generated in this work have been deposited in public databases. All the scripts generated in this work, the annotation, and the nucleotide sequence of Sinorhizobium fredii HH103, as well as raw sequencing reads from PacBio and Illumina, are available at NCBI GenBank and the Sequence Read Archives, respectively, and are accessible via the accession numbers listed in Table 1.

Acknowledgments

We sincerely thank George diCenzo for hosting Francisco Fuentes-Romero for three months in his lab at Queen’s University (Kingston, Canada). During this period, diCenzo provided F.F.R. with high-quality training in bioinformatics.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Pueppke, S.G.; Broughton, W.J. Rhizobium sp. strain NGR234 and R. fredii USDA257 share exceptionally broad, nested host ranges. Mol. Plant-Microbe Interact. 1999, 12, 293–318.
  2. López-Baena, F.J.; Ruiz-Sainz, J.E.; Rodríguez-Carvajal, M.A.; Vinardell, J.M. Bacterial molecular signals in the Sinorhizobium fredii-soybean symbiosis. Int. J. Mol. Sci. 2016, 17, 755.
  3. El Idrissi, M.M.; Kaddouri, K.; Bouhnik, O.; Lamrabet, M.; Alami, S.; Abdelmoumen, H. Conventional and unconventional symbiotic nitrogen fixing bacteria associated with legumes. In Developments in Applied Microbiology and Biotechnology, Microbial Symbionts; Dharumadurai, D., Ed.; Academic Press: Cambridge, MA, USA, 2023; pp. 75–109. ISBN 9780323993340.
  4. Kawaka, F. Characterization of symbiotic and nitrogen fixing bacteria. AMB Express 2022, 12, 99.
  5. Margaret, I.; Becker, A.; Blom, J.; Bonilla, I.; Goesmann, A.; Göttfert, M.; Lloret, J.; Mittard-Runte, V.; Rückert, C.; Ruiz-Sainz, J.E.; et al. Symbiotic properties and first analyses of the genomic sequence of the fast-growing model strain Sinorhizobium fredii HH103 nodulating soybean. J. Biotechnol. 2011, 155, 11–19.
  6. Cleyet-Marel, J.C. Dynamique des Populations de Rhizobium et de Bradyrhizobium dans le Sol et la Rhizosphere. Ph.D. Thesis, University Claude Bernard, Lyon, France, 1987.
  7. Rodríguez-Navarro, D.N.; Bellogín, R.; Camacho, M.; Daza, A.; Medina, C.; Ollero, F.J.; Santamaría, C.; Ruíz-Saínz, J.E.; Vinardell, J.M.; Temprano, F. Field assessment and genetic stability of Sinorhizobium fredii strain SMH12 for commercial soybean inoculants. Eur. J. Agron. 2003, 19, 299–309.
  8. Temprano-Vera, F.; Rodríguez-Navarro, D.N.; Acosta-Jurado, S.; Perret, X.; Fossou, R.K.; Navarro-Gómez, P.; Zhen, T.; Yu, D.; An, Q.; Buendía-Clavería, A.M.; et al. Sinorhizobium fredii strains HH103 and NGR234 form nitrogen fixing nodules with diverse wild soybeans (Glycine soja) from Central China but are ineffective on Northern China accessions. Front. Microbiol. 2018, 9, 2843.
  9. Videira, L.B.; Pastorino, G.N.; Balatti, P.A. Incompatibility may not be the rule in the Sinorhizobium fredii–soybean interaction. Soil Biol. Biochem. 2001, 33, 837–840.
  10. Jiménez-Guerrero, I.; Medina, C.; Vinardell, J.M.; Ollero, F.J.; López-Baena, F.J. The rhizobial type 3 secretion system: The Dr. Jekyll and Mr. Hyde in the rhizobium–legume symbiosis. Int. J. Mol. Sci. 2022, 23, 11089.
  11. Weidner, S.; Becker, A.; Bonilla, I.; Jaenicke, S.; Lloret, J.; Margaret, I.; Pühler, A.; Ruiz-Sainz, J.E.; Schneiker-Bekel, S.; Szczepanowski, R.; et al. Genome sequence of the soybean symbiont Sinorhizobium fredii HH103. J. Bacteriol. 2012, 194, 1617–1618.
  12. Vinardell, J.M.; Acosta-Jurado, S.; Zehner, S.; Göttfert, M.; Becker, A.; Baena, I.; Blom, J.; Crespo-Rivas, J.C.; Goesmann, A.; Jaenicke, S.; et al. The Sinorhizobium fredii HH103 genome: A comparative analysis with S. fredii strains differing in their symbiotic behavior with soybean. Mol. Plant Microbe Interact. 2015, 28, 811–824.
  13. Tian, C.F.; Zhou, Y.J.; Zhang, Y.M.; Li, Q.Q.; Zhang, Y.Z.; Li, D.F.; Wang, S.; Wang, J.; Gilbert, L.B.; Li, Y.R.; et al. Comparative genomics of rhizobia nodulating soybean suggests extensive recruitment of lineage-specific genes in adaptations. Proc. Natl. Acad. Sci. USA 2012, 109, 8629–8634.
  14. Beringer, J.E. R factor transfer in Rhizobium leguminosarum. J. Gen. Microbiol. 1974, 84, 188–198.
  15. Kolmogorov, M.; Yuan, J.; Lin, Y.; Pevzner, P.A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 2019, 37, 540–546.
  16. Walker, B.J.; Abeel, T.; Shea, T.; Priest, M.; Abouelliel, A.; Sakthikumar, S.; Cuomo, C.A.; Zeng, Q.; Wortman, J.; Young, S.K.; et al. Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 2014, 9, e112963.
  17. Langmead, B.; Salzberg, S. Fast gapped-read alignment with Bowtie 2. Nat. Methods 2012, 9, 357–359.
  18. Hunt, M.; Silva, N.D.; Otto, T.D.; Parkhill, J.; Keane, J.A.; Harris, S.R. Circlator: Automated circularization of genome assemblies using long sequencing reads. Genome Biol. 2015, 16, 294.
  19. Bedtools: A Powerful Toolset for Genome Arithmetic. Available online: https://bedtools.readthedocs.io/en/latest/ (accessed on 15 March 2025).
  20. Altschul, S.F.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 1990, 215, 403–410.
  21. Schwengers, O.; Jelonek, L.; Dieckmann, M.A.; Beyvers, S.; Blom, J.; Goesmann, A. Bakta: Rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb. Genom. 2021, 7, 000685.
  22. Kaur, S.; Espinosa-Sáiz, D.; Velázquez, E.; Menéndez, E.; diCenzo, G.C. Complete genome sequences of the species type strains Sinorhizobium garamanticum LMG 24692 and Sinorhizobium numidicum LMG 27395 and CIP 109850. Microbiol. Resour. Announc. 2023, 12, e0025123.
  23. Galibert, F.; Finan, T.M.; Long, S.R.; Puhler, A.; Abola, P.; Ampe, F.; Barloy-Hubler, F.; Barnett, M.J.; Becker, A.; Boistard, P.; et al. The composite genome of the legume symbiont Sinorhizobium meliloti. Science 2001, 293, 668–672.
  24. Schmeisser, C.; Liesegang, H.; Krysciak, D.; Bakkou, N.; Le Quéré, A.; Wollherr, A.; Heinemeyer, I.; Morgenstern, B.; Pommerening-Röser, A.; Flores, M.; et al. Rhizobium sp. strain NGR234 possesses a remarkable number of secretion systems. Appl. Environm. Microbiol. 2009, 75, 4035–4045.
  25. Acosta-Jurado, S.; Fuentes-Romero, F.; Ruiz-Sainz, J.E.; Janczarek, M.; Vinardell, J.M. Rhizobial exopolysaccharides: Genetic regulation of their synthesis and relevance in symbiosis with legumes. Int. J. Mol. Sci. 2021, 22, 6233.
  26. Ausubel, F.M. Molecular genetics of symbiotic nitrogen fixation. Cell 1982, 29, 1–2.
  27. Rodrigues, J.A.; López-Baena, F.J.; Ollero, F.J.; Vinardell, J.M.; Espuny, R.; Bellogín, R.A.; Ruiz-Sainz, J.E.; Thomas, J.R.; Sumpton, D.; Ault, J.; et al. NopM and NopD are rhizobial nodulation outer proteins: Identification using LC-MALDI and LC-ESI with a monolithic capillary column. J. Proteome Res. 2007, 6, 1029–1037.
  28. Toledo, I.; Lloret, L.; Martínez-Romero, E. Sinorhizobium americanus sp. nov., a new Sinorhizobium species nodulating native Acacia spp. in Mexico. Syst. Appl. Microbiol. 2003, 26, 54–64.
  29. Mnasri, B.; Saïdi, S.; Chihaoui, S.A.; Mhamdi, R. Sinorhizobium americanum symbiovar mediterranense is a predominant symbiont that nodulates and fixes nitrogen with common bean (Phaseolus vulgaris L.) in a Northern Tunisian field. Syst. Appl. Microbiol. 2012, 35, 263–269.
  30. Li, Q.Q.; Wang, E.T.; Chang, Y.L.; Zhang, Y.Z.; Zhang, Y.M.; Sui, X.H.; Chen, W.F.; Chen, W.X. Ensifer sojae sp. nov., isolated from root nodules of Glycine max grown in saline-alkaline soils. Int. J. Syst. Evol. Microbiol. 2011, 61, 1981–1988.
  31. Kuzmanović, N.; Fagorzi, C.; Mengoni, A.; Lassalle, F.; diCenzo, G.C. Taxonomy of Rhizobiaceae revisited: Proposal of a new framework for genus delimitation. Int. J. Syst. Evol. Microbiol. 2022, 72, 005243.
  32. Schuldes, J.; Rodriguez Orbegoso, M.; Schmeisser, C.; Krishnan, H.B.; Daniel, R.; Streit, W.R. Complete genome sequence of the broad-host-range strain Sinorhizobium fredii USDA257. J. Bacteriol. 2012, 194, 4483.
  33. Cutiño, A.M.; Del Carmen Sánchez-Aguilar, M.; Ruiz-Sáinz, J.E.; Del Rosario Espuny, M.; Ollero, F.J.; Medina, C. A novel system to selective tagging of Sinorhizobium fredii symbiotic plasmids. Meth. Mol. Biol. 2024, 2751, 247–259.
  34. Siguier, P.; Gourbeyre, E.; Chandler, M. Bacterial insertion sequences: Their genomic impact and diversity. FEMS Microbiol. Rev. 2014, 38, 865–891.
  35. Zhao, R.; Liu, L.X.; Zhang, Y.Z.; Jiao, J.; Cui, W.J.; Zhang, B.; Wang, X.L.; Li, M.L.; Chen, Y.; Xiong, Z.Q.; et al. Adaptive evolution of rhizobial symbiotic compatibility mediated by co-evolved insertion sequences. ISME J. 2018, 12, 101–111.
  36. Liu, S.; Jiao, J.; Tian, C.-F. Adaptive evolution of rhizobial symbiosis beyond horizontal gene transfer: From genome innovation to regulation reconstruction. Genes 2023, 14, 274.
  37. Riaz, M.R.; Sosa Marquez, I.; Lindgren, H.; Levin, G.; Doyle, R.; Romero, M.C.; Paoli, J.C.; Drnevich, J.; Fields, C.J.; Geddes, B.A.; et al. Mobile gene clusters and coexpressed plant-rhizobium pathways drive partner quality variation in symbiosis. Proc. Natl. Acad. Sci. USA 2025, 122, e2411831122.
Figure 1. Circular plot of pSfHH103d (pSymA). The legends (see inside the plot) show what each circle represents.
Figure 2. Genetic context of the two copies of nifHDK present in the S. fredii HH103 genome. The locus tags of the manual annotation are shown in brackets.
Figure 3. Maximum-likelihood core-genome gene phylogeny of different Sinorhizobium representative species and strains. R. leguminosarum SM52 was included as an outgroup. The scale represents the mean number of nucleotide substitutions per site.
Table 1. Summary of sequencing statistics for the S. fredii HH103 genome.
BioProject accession no.PRJNA1233244
BioSample accession no.SAMN47263981
GenBank Assembly accession no.GCF_048585425.1
GenBank accession no.CP183939, CP183940, CP183941, CP183942, CP183943, CP183944, CP183945
SRA accession no.-
PacBio readsSRR34701253
Illumina readsSRR34710307
Total PacBio read length (nt)1,170,389,454
No. of PacBio reads98,229
PacBio N50 read length (nt)12,887
Total Illumina read length (nt)1,782,256,322
No. of Illumina paired reads11,803,022
Illumina read length (nt)2 × 151
Genome size (bp)7,273,959
No. of protein-coding genes6949
G+C content (%)62.14
Genome coverage147×
No. of replicons7
Replicon sizes (bp)24,038; 25,081; 61,874; 144,081; 605,378; 2,099,565; 4,313,942
Table 2. Comparison of the main characteristics of the previous and updated versions of the S. fredii HH103 genome.

| Feature | Version | Chromosome | pSfHH103e (p_e, pSymB) | pSfHH103d (p_d, pSymA) | pSfHH103c (p_c) | pSfHH103b (p_b) | pSfHH103a2 (p_a2) | pSfHH103a1 (p_a1) |
|---|---|---|---|---|---|---|---|---|
| Length (bp) | Previous | 4,305,723 | 2,096,125 | ca. 588,797 a | 144,082 | 61,880 | 25,081 | 24,036 |
| Length (bp) | Updated | 4,313,942 | 2,099,565 | 605,378 | 144,081 | 61,874 | 25,081 | 24,038 |
| GC content (%) | Previous | 62.61 | 62.38 | 59.59 | 58.68 | 58.47 | 58.02 | 58.21 |
| GC content (%) | Updated | 62.60 | 62.38 | 59.56 | 58.68 | 58.47 | 58.03 | 58.21 |
| CDS | Previous | 4008 | 1991 | 664 | 169 | 62 | 38 | 19 |
| CDS | Updated | 4013 | 1982 | 665 | 169 | 62 | 38 | 20 |
| tRNA genes | Previous | 53 | 0 | 1 | 0 | 0 | 0 | 0 |
| tRNA genes | Updated | 53 | 0 | 1 | 0 | 0 | 0 | 0 |
| rrn operons | Previous | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| rrn operons | Updated | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| GenBank accession number | Previous | HE616890 | HE616899 | CDSA010000001 to CDSA010000004 | HE616893 | HE616892 | LN735562 | HE616891 |
| GenBank accession number | Updated | CP183939.1 | CP183944.1 | CP183945.1 | CP183943.1 | CP183942.1 | CP183941.1 | CP183940.1 |

a Four concatenated contigs.
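The per-replicon size changes implied by Table 2 can be worked out directly. The following is a minimal sketch using only the lengths listed in the table; the dictionary and function names are illustrative, and the previous pSfHH103d length is the approximate ("ca.") value of the four concatenated contigs, so its difference is approximate as well.

```python
# Replicon lengths (bp) transcribed from Table 2.
# NOTE: the previous pSfHH103d value is approximate ("ca." in the table),
# because it was obtained from four concatenated contigs.
previous = {
    "chromosome": 4_305_723,
    "pSfHH103e": 2_096_125,
    "pSfHH103d": 588_797,
    "pSfHH103c": 144_082,
    "pSfHH103b": 61_880,
    "pSfHH103a2": 25_081,
    "pSfHH103a1": 24_036,
}
updated = {
    "chromosome": 4_313_942,
    "pSfHH103e": 2_099_565,
    "pSfHH103d": 605_378,
    "pSfHH103c": 144_081,
    "pSfHH103b": 61_874,
    "pSfHH103a2": 25_081,
    "pSfHH103a1": 24_038,
}


def length_changes(prev, new):
    """Return the change in length (updated minus previous) per replicon."""
    return {name: new[name] - prev[name] for name in prev}


changes = length_changes(previous, updated)
for name, delta in changes.items():
    print(f"{name:12s} {delta:+,d} bp")

print(f"updated genome size: {sum(updated.values()):,d} bp")
```

The three largest replicons account for essentially all of the increase, while the four smaller plasmids change by at most a few base pairs.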
Table 3. Genes previously not described in the S. fredii HH103 genome. The GeneID corresponds to the manual annotation performed in this work. The annotation generated automatically (RefSeq) is shown in brackets. Note that several genes are not detected by the automatic annotator.

| Locus_Tag a | Replicon | Description |
|---|---|---|
| ACN6KE_000391 | Chromosome | Zinc metalloendopeptidase |
| ACN6KE_001521 | Chromosome | Hypothetical protein |
| ACN6KE_001826 | Chromosome | Antifreeze protein |
| ACN6KE_001897 | Chromosome | ABC transporter ATP-binding protein |
| ACN6KE_001899 | Chromosome | Hypothetical protein |
| ACN6KE_001932 | Chromosome | Hypothetical protein |
| ACN6KE_002195 | Chromosome | Hypothetical protein |
| ACN6KE_002415 | Chromosome | RTX toxin hemolysin-type protein |
| ACN6KE_003412 | Chromosome | IS21 family transposase |
| ACN6KE_003587 | Chromosome | Peptidoglycan-binding protein LysM |
| ACN6KE_003588 | Chromosome | Hypothetical protein |
| ACN6KE_003589 | Chromosome | Hypothetical protein |
| ACN6KE_0035890 | Chromosome | Imidazole glycerol phosphate synthase subunit HisF |
| ACN6KE_004165 | pSymA | TIR domain-containing protein |
| ACN6KE_004181 | pSymA | Hypothetical protein |
| ACN6KE_006263 | pSymB | DUF1059 domain-containing protein |

a According to the manual annotation.
Table 4. Comparison of the genome structure of several relevant S. fredii strains. When possible, the plasmid containing nod and nif genes is denoted in bold.

| S. fredii Strain | Genome Accession Number a | Genome Size (Mb) | Genome Structure (Sizes in Mb in Brackets) |
|---|---|---|---|
| CCBAU45436 | GCF_003100575.1 | 6.9 | 1 chromosome (4.16), four plasmids: a (0.42), b (1.96), d (0.20), e (0.17) |
| HH103 | GCF_048585425.1 | 7.3 | 1 chromosome (4.31), six plasmids: a1 (0.024), a2 (0.025), b (0.062), c (0.14), d (0.61), e (2.10) |
| NGR234 | GCF_000018545.1 | 6.9 | 1 chromosome (3.92), two plasmids: a (0.54), b (2.43) |
| SMH12 | GCF_024400375.1 | 7.0 | 1 chromosome (4.02), two plasmids: a (0.56), b (2.39) |
| USDA192 | GCF_041260365.1 | 6.9 | 4 contigs, not fully assembled (1 chromosome, three plasmids) |
| USDA193 | GCF_041262265.1 | 6.8 | 3 contigs, not fully assembled (1 chromosome, two plasmids) |
| USDA205T | GCF_001461695.1 | 6.6 | 209 contigs, non-assembled |
| USDA257 | GCF_024400375.1 | 7.0 | 1 chromosome (6.48) and one plasmid (19 contigs, non-assembled, 0.56) |

a Available at https://www.ncbi.nlm.nih.gov/datasets/genome/ (accessed on 31 July 2025).
Table 5. Number of genes related to mobile elements that have been annotated in different rhizobial genomes.

| Rhizobial Strain | Number of Genes Related to Mobile Elements 1 |
|---|---|
| R. leguminosarum SM52 | 90 |
| S. americanum CCGM7 | 113 |
| S. kummerowiae CCBAU71714 | 172 |
| S. medicae WSM419 | 191 |
| S. meliloti 1021 | 158 |
| S. meliloti Rm41 | 118 |
| S. sojae CCBAU05684 | 194 |
| S. terangae CB3126 | 108 |
| S. fredii CCBAU45436 | 260 |
| S. fredii HH103 (updated) | 340 |
| S. fredii NGR234 | 240 |
| S. fredii SMH12 | 352 |
| S. fredii USDA192 | 286 |
| S. fredii USDA193 | 282 |
| S. fredii USDA205 | 356 |
| S. fredii USDA257 | 209 |

1 Search words: transposase, insertion sequence, mobile[_]element, mobile element[_]protein, IS[_], tnpA.
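A keyword count of the kind described in footnote 1 could be sketched as follows. This is an illustration only, not the authors' procedure: it assumes the annotation is available as a list of product-description strings, reads "[_]" as "a space or an underscore", and matches "IS[_]" case-sensitively so that the ordinary word "is" is not counted. The names MOBILE_PATTERNS, is_mobile, and count_mobile_genes are hypothetical.

```python
import re

# Search words from footnote 1 of Table 5, written as regular expressions.
# ASSUMPTION: "[_]" in the footnote is read as "a space or an underscore".
# "mobile element[_]protein" is already covered by the "mobile[ _]element" pattern.
MOBILE_PATTERNS = [
    re.compile(r"transposase", re.IGNORECASE),
    re.compile(r"insertion sequence", re.IGNORECASE),
    re.compile(r"mobile[ _]element", re.IGNORECASE),
    re.compile(r"\bIS[ _]"),  # case-sensitive on purpose
    re.compile(r"tnpA", re.IGNORECASE),
]


def is_mobile(product: str) -> bool:
    """Return True if a product description matches any search word."""
    return any(pattern.search(product) for pattern in MOBILE_PATTERNS)


def count_mobile_genes(products) -> int:
    """Count descriptions matching at least one search word (each gene counted once)."""
    return sum(1 for product in products if is_mobile(product))


# Example descriptions: the first three appear in Table 3, the last two are hypothetical.
example_products = [
    "IS21 family transposase",
    "Hypothetical protein",
    "ABC transporter ATP-binding protein",
    "mobile element protein",
    "TnpA-like protein",
]
print(count_mobile_genes(example_products))
```

Counting each gene once, regardless of how many search words it matches, avoids inflating the totals for descriptions such as "IS21 family transposase".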
Table 6. Distribution of genes related to mobile elements among the different replicons of S. fredii HH103.

| Replicon | Number of Genes Related to Mobile Elements | Replicon Size (bp) | Genes Related to Mobile Elements per 10 kb |
|---|---|---|---|
| chromosome | 135 | 4,313,942 | 0.31 |
| pSfHH103a1 | 2 | 24,038 | 0.83 |
| pSfHH103a2 | 1 | 25,081 | 0.40 |
| pSfHH103b | 14 | 61,874 | 2.26 |
| pSfHH103c | 27 | 144,081 | 1.87 |
| pSymA | 110 | 605,378 | 1.82 |
| pSymB | 51 | 2,099,565 | 0.24 |
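The density column of Table 6 is the gene count divided by the replicon size and scaled to 10 kb. A minimal sketch reproducing it from the counts and sizes in the table (the variable and function names are illustrative):

```python
# (number of mobile-element-related genes, replicon size in bp), from Table 6.
table6 = {
    "chromosome": (135, 4_313_942),
    "pSfHH103a1": (2, 24_038),
    "pSfHH103a2": (1, 25_081),
    "pSfHH103b": (14, 61_874),
    "pSfHH103c": (27, 144_081),
    "pSymA": (110, 605_378),
    "pSymB": (51, 2_099_565),
}


def per_10kb(n_genes: int, size_bp: int) -> float:
    """Genes per 10 kb of sequence."""
    return n_genes / size_bp * 10_000


for name, (n_genes, size_bp) in table6.items():
    print(f"{name:12s} {per_10kb(n_genes, size_bp):.2f}")

# The per-replicon counts add up to the HH103 total reported in Table 5.
total_genes = sum(n_genes for n_genes, _ in table6.values())
print(f"total mobile-element-related genes: {total_genes}")
```

Normalizing by replicon size makes the enrichment in pSfHH103b, pSfHH103c, and pSymA visible, even though the chromosome carries the largest absolute number of these genes.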
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
