Evolution of Predicted Acid Resistance Mechanisms in the Extremely Acidophilic Leptospirillum Genus.

Organisms that thrive in extremely acidic environments (≤pH 3.5) are of widespread importance in industrial applications, environmental issues, and evolutionary studies. Leptospirillum spp. constitute the only extremely acidophilic microbes in the phylogenetically deep-rooted bacterial phylum Nitrospirae. Leptospirilli are Gram-negative, obligatory chemolithoautotrophic, aerobic, ferrous iron oxidizers. This paper predicts genes that Leptospirilli use to survive at low pH and infers their evolutionary trajectory. Phylogenetic and other bioinformatic approaches suggest that these genes can be classified into (i) a "first line of defense", involved in the prevention of the entry of protons into the cell, and (ii) a "second line of defense", involved in the neutralization or expulsion of protons that enter the cell. The first line of defense includes potassium transporters, predicted to form an inside-positive membrane potential, spermidines, hopanoids, and Slps (starvation-inducible outer membrane proteins). The second line of defense includes proton pumps and enzymes that consume protons. Maximum parsimony, clustering methods, and gene alignments are used to infer the evolutionary trajectory that potentially enabled the ancestral Leptospirillum to transition from a postulated circum-neutral pH environment to an extremely acidic one. The hypothesized trajectory includes gene gain/loss events driven extensively by horizontal gene transfer, gene duplications, gene mutations, and genomic rearrangements.


Introduction
Although microorganisms such as Escherichia coli, Vibrio cholerae, and Salmonella spp. are neutrophiles, some strains survive acid shock during transient passage through low pH conditions such as in the stomach (<pH 3.5). In addition, the neutrophile Helicobacter pylori survives and grows in the stomach by creating its own near neutral pH environment via the hydrolysis of urea to produce CO2 and NH3 that buffer the acidity of the local environment (reviewed in [1]). Genes and mechanisms that these microorganisms use to survive acid shock and the regulatory networks that control their expression were reviewed in [2][3][4][5][6]. Due to the transitory nature of the acid shock response, these neutrophiles have been dubbed "amateur" acidophiles [5]. In contrast, extreme or "professional" acidophiles are organisms that thrive in extremely low pH environments (≤pH 3.5). They include Bacteria, Archaea, and Eukarya and are found widely distributed across the Tree of Life (reviewed in [1,7–10]).
Second line of defense mechanisms include proton antiporters such as NhaP-/NhaA-type Na+/H+ exchangers. These proteins constitute a large family of integral membrane proteins with roles related to homeostasis of sodium in high salt environments and regulation of intracellular pH [40]. In addition, voltage-gated ClC-type Cl−/H+ transporters [41] are essential for the survival of E. coli under extreme acid stress [42]. In addition to its proton exporting function, the ClC channel is also proposed to prevent inner-membrane hyperpolarization (inner positive) in E. coli at extremely acidic pH as a result of the action of other acid resistance systems such as arginine and glutamate decarboxylation systems [43]. A similar function for ClC in acid resistance has been proposed in the amateur acidophile Bacillus coagulans [44].
Another second line of defense mechanism is the use of amino acid decarboxylation systems such as the Gad glutamate decarboxylase system. These systems have two components: (i) an inner membrane amino acid antiporter that imports the amino acid in exchange for its decarboxylated form and (ii) a cytoplasmic decarboxylase that catalyzes the proton-consuming decarboxylation of an amino acid [45]. Glutamate decarboxylation is the major response in E. coli under extreme acid conditions [6]. H+-consuming reactions during acid stress have also been observed in Acidithiobacillus caldus [33]. Several amino acid decarboxylase genes are present in Leptospirillum, e.g., [13,46], and transcripts coding for glutamate, arginine, and lysine decarboxylation have been detected in an AMD community [34].
In this study, we predict and analyze both first and second line of defense acid resistance systems in the extremely acidophilic genus Leptospirillum. We create a global model of pH homeostasis mechanisms that suggests how this genus can survive in hyper-acidic conditions. We use parsimony to infer the evolutionary trajectories of the gene gains/losses and gene mutations that are hypothesized to be involved in acid resistance. The Leptospirillum genus was chosen as a model system for this study because it is one of the most extreme bacterial acidophiles known with a growth pH range between 0.7 and 2.2 [47,48]. In addition, fluorescent in situ hybridization and "omic" studies highlight Leptospirillum as one of the main active taxa in extremely low pH natural and man-made acid mine drainage environments [49][50][51][52][53][54][55][56][57][58] and in commercial copper and gold biorecovery operations [47,59].

Genomes and Quality Assessment
Fifteen complete and partial genome sequences from the Leptospirillum genus were downloaded from the RefSeq genomic database of the National Center for Biotechnology Information (NCBI) and from the Joint Genome Institute's IMG-M database (https://img.jgi.doe.gov/) in August 2018. In addition, the Nitrospira marina Nb-295 genome (IMG Genome ID 2596583682, https://img.jgi.doe.gov/) was downloaded as an outgroup for the Leptospirillum genus. Quality assessment of the 16 Nitrospirae genomes was carried out with CheckM [60], defining genomes with >90% completeness and <10% contamination as high quality, following the bioinformatics pipeline (Figure 1).
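The quality thresholds above can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `GenomeQC` record, the function name, and the first example genome are hypothetical, while the second example uses the completeness and contamination values reported in the Results for Leptospirillum sp. "UBA BS".

```python
from dataclasses import dataclass


@dataclass
class GenomeQC:
    """Hypothetical record holding CheckM-style summary values (in percent)."""
    name: str
    completeness: float
    contamination: float


def is_high_quality(genome, min_completeness=90.0, max_contamination=10.0):
    """Apply the stated criteria: completeness >90% and contamination <10%."""
    return (genome.completeness > min_completeness
            and genome.contamination < max_contamination)


genomes = [
    GenomeQC("hypothetical_complete_genome", 99.1, 0.9),  # illustrative values only
    GenomeQC("Leptospirillum sp. UBA BS", 41.0, 38.0),    # values quoted in the Results
]

passed = [g.name for g in genomes if is_high_quality(g)]
print(passed)
```

Both comparisons are strict, mirroring the ">" and "<" in the text; a genome sitting exactly on a threshold would be rejected under this reading.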

Phylogenetic Analysis
16S rRNA gene sequences from organisms within the Nitrospirae phylum were identified by a BLASTn comparison against the SILVA [61], RDP [62], and GREENGENES rRNA databases [63] with an E-value threshold of 1E−5. Gene sequences with a minimum length of 1400 nucleotides were selected [64]. Alignments of 16S rRNA sequences were generated with MAFFT using the L-INS-i iterative refinement option [65,66], MUSCLE [67], and T-Coffee [68]. A maximum-likelihood tree was constructed with IQTREE [69], and the best-suited evolutionary models were selected using the model test tool implemented in IQTREE [70] according to the Bayesian (BIC) and Akaike (AIC) information criteria. The robustness of the inferred tree was assessed using the nonparametric bootstrap procedure implemented in IQTREE (1000 replicates of the original datasets) with the ultrafast bootstrap option [71]. The final tree was visualized using Figtree (http://tree.bio.ed.ac.uk/software/figtree/).
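As a rough illustration of how the two information criteria rank candidate substitution models, the sketch below computes AIC and BIC from a log-likelihood and a parameter count. The model names, log-likelihoods, parameter counts, and the use of the alignment length as the sample size are all assumptions made for the example; they are not values from this study, and IQTREE performs this selection internally.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 lnL."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln(n) - 2 lnL."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


# Hypothetical candidates: name -> (log-likelihood, number of free parameters)
candidates = {
    "model_A": (-5230.0, 1),
    "model_B": (-5105.0, 5),
    "model_C": (-5098.0, 10),
}
n_sites = 1400  # assumed alignment length used as the sample size

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
print(best_by_aic, best_by_bic)
```

With these illustrative numbers the two criteria disagree, because BIC penalizes additional parameters more heavily than AIC; this is one reason for reporting both.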

Prediction of Mobile Genetic Elements and Genome Islands
Insertion elements and transposases were predicted and classified using TnpPred [72] and ISsaga [73,74]. Sigi-HMM [75] was used to predict genes obtained by Horizontal Gene Transfer (HGT). IslandViewer 4 [76] was used to predict genomic islands. Genome contexts of predicted acid resistance genes, associated hypothetical genes, and predicted mobile elements were analyzed using STRING [77] and by manual inspection using MAUVE [78] and Artemis [79].

Identification of Genes Related to Low pH Resistance
Genes reported to be involved in acid resistance were identified through an extensive literature search [4][5][6]38,[80][81][82][83][84][85][86][87]. A search for similar genes in Leptospirillum and Nitrospira marina Nb-295 genomes was carried out through BLASTp comparison [88] using an E-value cutoff of 1e−5. Synteny blocks and conservation of genetic context between Nitrospiraceae genomes were determined by MAUVE [78]. Genomic contexts were visualized by Artemis [79]. Conservation of sequences and domains within the Leptospirillum genus was analyzed and visualized with the WebLogo [89,90] and AliView [91] alignment tools. Selected Nitrospiraceae genes were compared against the UniProt and NCBI databases by BLASTp to identify orthologous proteins in other microorganisms. This collection of sequences was aligned with MAFFT using the L-INS-i iterative refinement option [65,66]. IQTREE was used to construct a maximum-likelihood tree with 1000 replicates by the ultrafast bootstrap option [71] and to identify the best-suited evolutionary model by the Bayesian (BIC) and Akaike (AIC) information criteria [70]. Phylogenetic trees were visualized with Figtree (http://tree.bio.ed.ac.uk/software/figtree/) and iTOL [92].
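The E-value cutoff described above can be applied to BLAST tabular output as in the sketch below. It assumes the default 12-column tabular format (`-outfmt 6`) and treats the cutoff as inclusive; the query and subject identifiers and the two example rows are invented for illustration.

```python
import csv
import io

# Column order of the default BLAST tabular output (-outfmt 6)
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, max_evalue=1e-5):
    """Return (query, subject, evalue) for hits with evalue <= max_evalue."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        record = dict(zip(FIELDS, row))
        evalue = float(record["evalue"])
        if evalue <= max_evalue:
            hits.append((record["qseqid"], record["sseqid"], evalue))
    return hits


# Two invented rows: one strong hit and one that fails the cutoff
example = (
    "kch_query\tsubject_1\t62.5\t320\t120\t0\t1\t320\t5\t324\t3e-40\t155\n"
    "kch_query\tsubject_2\t28.0\t90\t60\t5\t10\t99\t40\t129\t0.002\t35.1\n"
)
hits = filter_hits(io.StringIO(example))
print(hits)
```

In practice the same threshold can be passed directly to BLAST so that weaker hits are never reported; filtering afterwards, as here, keeps the raw output available for inspection.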

Evolutionary Pressures on Acid Resistance Genes
Selective pressure on genes was determined by calculating the ratio of non-synonymous DNA substitutions (Dn) to synonymous DNA substitutions (Ds) in the coding region [93]. Individual genes (DNA and amino acid sequences) from all protein families were extracted using custom Perl scripts. Amino acid alignments were constructed using the MAFFT L-INS-i iterative refinement option [65,66] and MUSCLE [67] and used as input for PAL2NAL [94] in conjunction with their nucleotide sequences to obtain the corresponding codon alignments for gene families. Dn/Ds ratios were calculated from the codon alignments for all possible pairwise comparisons within a protein family using the SeqinR package of the R project [95]. Mean Dn/Ds ratios were assigned to individual gene families by averaging all pairwise ratios within each family. Dn/Ds ratios >1 were taken to indicate positive selection for beneficial mutations, and ratios <1 to indicate purifying selection [96].
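The averaging and interpretation steps can be sketched as follows. This example does not estimate Dn and Ds from codon alignments (that step was done with SeqinR); it only assumes that pairwise Dn and Ds values are already available. The strain labels and numbers are invented, and skipping pairs with Ds = 0 is an assumption of this sketch rather than a documented choice of the study.

```python
from statistics import mean


def mean_dn_ds(pairwise):
    """Average Dn/Ds over all pairs; pairs with Ds == 0 are skipped."""
    ratios = [dn / ds for dn, ds in pairwise.values() if ds > 0]
    return mean(ratios) if ratios else None


def classify(ratio):
    """Interpret a mean Dn/Ds ratio using the thresholds given in the text."""
    if ratio is None:
        return "undetermined"
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "purifying selection"
    return "neutral"


# Invented pairwise (Dn, Ds) values for one hypothetical gene family
family = {
    ("strain_A", "strain_B"): (0.010, 0.20),
    ("strain_A", "strain_C"): (0.020, 0.25),
    ("strain_B", "strain_C"): (0.005, 0.25),
}

family_mean = mean_dn_ds(family)
print(round(family_mean, 3), classify(family_mean))
```

A family mean can hide heterogeneity between pairs, so the individual pairwise ratios are worth inspecting alongside the average.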

Mapping Evolutionary Events
The inference of branch-site-specific events was made using the 16S rRNA gene tree of Leptospirillum genomes with N. marina Nb-295 as the outgroup. The presence and absence of genes related to acid resistance and associated genes located within the same genomic context were mapped onto each branch of the phylogenetic tree to model gene gain, loss, and modification events. Inference of evolutionary events was made using maximum parsimony criteria [97,98].
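One common way to apply the maximum parsimony criterion to gene presence/absence data is the Fitch algorithm, sketched below for a rooted binary tree. The source does not state which parsimony algorithm or software was used, so this is an assumed stand-in; the tree topology and leaf labels are likewise hypothetical and chosen only to make the example readable.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a rooted binary tree.

    tree   : a leaf name (str) or a 2-tuple of subtrees
    states : dict mapping leaf name -> 0 (gene absent) or 1 (gene present)
    Returns (possible ancestral states at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here
        return shared, left_changes + right_changes
    # Children disagree: one gain or loss event is required at this node
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical topology: an outgroup plus three ingroup lineages
tree = ("outgroup", ("group_I", ("group_II", "group_III")))

# A gene present in every ingroup lineage but absent from the outgroup
in_all_groups = {"outgroup": 0, "group_I": 1, "group_II": 1, "group_III": 1}
root_states, n_events = fitch(tree, in_all_groups)
print(root_states, n_events)
```

For this pattern the root state is ambiguous with a single event, which corresponds to two equally parsimonious explanations: a gain on the ingroup branch or a loss on the outgroup branch. Additional evidence, such as the genomic context of the gene, is needed to choose between them.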

Genomic Features of Leptospirillum Genomes
Twelve publicly available genomes of the Leptospirillum genus were analyzed together with the genome of the neutrophile N. marina. N. marina is the closest phylogenetic relative of the Leptospirilli with available genomic data (Table 1). Five of the genomes were complete, while eight (including N. marina) were permanent drafts. "Leptospirillum rubarum", Leptospirillum "5-way CG", Leptospirillum "C-75", and "Leptospirillum ferrodiazotrophum" were from metagenomic samples. The G+C contents of Leptospirillum Group I, II, and III genomes were 50.1, 54.0, and 57.5%, respectively. Leptospirilli included both mesophiles and moderate thermophiles and grew between 40 and 43 °C. However, L. ferrooxidans C2-3, L. ferriphilum DSM 14647T, and L. ferriphilum Sp-Cl were mesophiles with growth temperatures between 30 °C and 37 °C. Leptospirillum sp. "UBA BS" (Group IV; NCBI Accession: PRJNA176861 [32]) was not included in this study because the genome did not meet the quality criteria of CheckM [60], exhibiting only 41% completeness with 38% contamination.

Phylogenetic Relatedness between Leptospirillum Species and Other Members of the Nitrospirae Phylum
A phylogenetic tree of the members of the Nitrospirae phylum was developed based on 16S rRNA gene sequences (Figure 2a). The tree was rooted using the validated Rubrobacter radiotolerans T DSM 5868 as an outgroup. Three species of Leptospirillum can be distinguished in the phylogenetic tree: L. ferrooxidans (Group I), L. ferriphilum (Group II), and "L. ferrodiazotrophum" (Group III) with bootstrap support ≥60% (Figure 2a). This tree was consistent with published trees of Leptospirillum [51,101,105]. The phylogenetic branching points of the members of Group II had insufficient resolution to be separated (Figure 2a). Therefore, the phylogenetic tree was also presented as a cladogram (Figure 2b), showing their predicted branching order with bootstrap values. Based on the measurement of genetic distance, N. marina Nb-295 was shown to be the closest extant relative of the Leptospirillum genus with a sequenced genome.

Gene Inventories
Genes with predicted or experimental evidence for functions related with first and second lines of defense to low pH environments were identified in the literature. A list of the genes used, their NCBI accession numbers, and their predicted features are provided in Supplementary Table S1.

Membrane Potential and Potassium Transporters
Kch, potentially encoding a K+ channel protein, was identified in all Leptospirillum genomes, but not in N. marina. One possible explanation for this is that kch was incorporated into the Leptospirillum lineage by HGT after its separation from the N. marina lineage. Alternatively, kch was present in the last common ancestor and was lost in the N. marina lineage. Examination of the NCBI database using BLASTp showed that the best hits of the Leptospirillum Kch were with proteins of the Acidithiobacillus genus together with several other known acidophiles (Supplementary Figure S1). These microorganisms are frequently found in extremely acidic environments populated by Leptospirillum (e.g., [34,56,106,107]), strongly suggesting that they have shared kch via HGT. In addition, this result was consistent with the contention that kch is associated with acid resistance.
In Leptospirillum Group III, kch was adjacent to, but divergent from, a gene potentially encoding a phage holin-like protein that is involved in stress response and other functions (reviewed in [108]). In Leptospirillum Groups I and II, additional copies of the phage holin were located close to predicted genes encoding DNA uptake competence functions (ComEC). These are thought to be one of the major components involved in HGT (reviewed in [109]). In Leptospirillum Groups I and II, kch was clustered with two other predicted acid resistance genes, slp8 and gadA (discussed in Sections 3.4.4 and 3.5.2, respectively). One possibility is that kch, slp8, and gadA entered the Leptospirillum genome by HGT, possibly via a phage mediated uptake mechanism.
A kdpABC gene cluster potentially encoding a potassium transporting Kdp P-type ATPase was found in all the Leptospirillum genomes (Figure 3). Downstream of the kdp cluster, there was a predicted BBP2 porin, a putative gadC2a permease, followed by another K+-sensing histidine kinase with a response regulator CitB. This cluster may be associated with K+ regulation. According to STRING analysis [77], this gene cluster was co-expressed in other species, suggesting that it was an operon. Some Group II genomes contained a predicted transposase (tnp3) associated with the insertion of two hypothetical genes. There was also an insertion of two hypothetical genes just upstream of gadC2a. The functions of the hypothetical genes remain unknown.
Figure 3. Phylogenetic distribution and genomic contexts of genes predicted for Kdp potassium uptake. N. marina, which lacks these genes, is included as an outgroup. Color coding of genes: red = Kdp genes, purple = orphan hypothetical genes, grey = additional genes whose genomic context is conserved, orange = predicted mobile elements and their remnants (* tnp3 inserted only in CF-1, ** hyp4 inserted only in the YSK strain), and black = gadC2a potentially involved in proton export (see Section 3.5.2).
TrkA was identified in all Leptospirillum genomes and in the genome of N. marina (data not shown). Comparative amino acid sequence analysis indicated that Leptospirillum trkA was found in a cluster that was separate from other Nitrospira (Supplementary Figure S2). A comparatively large Dn/Ds ratio (~1) was observed between trkA of Leptospirillum and that of other Nitrospira, suggesting that it could have been vertically inherited from a likely neutrophilic common ancestor of Leptospirillum and other Nitrospira and then subjected to intense selective pressure to adapt to an acidic environment. Once adapted, it underwent few additional changes, as shown by extremely low Dn/Ds ratios (~0.05) within the Leptospirillum genus (Supplementary Figure S2).

Spermidine Biosynthesis and Associated Genes
The Leptospirillum genomes were searched for genes potentially encoding aliphatic polycation polyamines. No genes encoding spermine or cadaverine biosynthesis were detected in any of the genomes. However, a conserved cluster of four genes potentially encoding spermidine biosynthesis was predicted in all three Leptospirillum groups, extending the earlier prediction of spermidine genes in L. ferriphilum [13]. This cluster was not detected in N. marina (Figure 4 and Supplementary Figure S3). Three of the genes in the cluster were predicted to encode the biosynthesis of spermidine from S-adenosyl-L-methionine (SAM): speH encoding S-adenosylmethionine decarboxylase, speE encoding spermidine synthase, and an odc-like gene predicted to produce putrescine from ornithine (Figure 4a,b). The fourth gene (hyp4) encoded a conserved hypothetical protein UPF0182 found in many organisms in the same genomic context, but whose function remains unknown. UPF0182 was predicted to have a signal peptide for protein export and six transmembrane regions and was most likely to be an inner membrane protein.

An additional upstream gene, termed dgc1 (for diguanylate cyclase), was found only in Group II (Figure 4). This gene contained a predicted GGDEF domain and three associated GAF superfamily domains. In other organisms, the GGDEF domain is involved in cyclic diguanosine monophosphate turnover and the production of the secondary messenger c-di-GMP (reviewed in [110]). A functional GGDEF gene was reported in the extremely acidophilic genus Acidithiobacillus, although it was associated with the EAL rather than the GAF domains [111,112]. The secondary messenger c-di-GMP was implicated in the regulation of biofilm formation and other functions in many bacteria [113]. Spermidine has also been associated with both the formation and inhibition of biofilms in other bacteria [30,31].
All three Leptospirillum groups contained a hypothetical orphan gene (hyp1) upstream of the spermidine cluster, and Group III contained, in addition, two predicted hypothetical orphan genes downstream (hyp2 and hyp3; Figure 4a). Although hyp1 remains of unknown function, it has been identified in AMD community proteomes along with the full spermidine "acid resistance cluster" [52,55]. Transcripts for the spermidine genes have been detected in AMD community meta-transcriptomes [34].
Genes potentially involved in HGT and/or genome rearrangement were detected in the neighborhood of the spermidine cluster. These included a predicted P-type conjugative transfer protein TrbG with a signal sequence and lipoprotein signal and a TnpIS5-like sequence (Figure 4a).
Heat maps derived from alignments of DNA sequences of speE and speH in Leptospirillum Group II illustrated an important aspect of their evolution (Figure 4c,d). The DNA sequence of speE was 100% conserved between some strains. For example, speE of "L. rubarum", L. sp. "C75", and L. sp. "CF-1" from Iron Mountain, USA, shared 100% nucleotide sequence identity (Figure 4c). Inspection of the position of these strains in the phylogenetic cladogram (Figure 2b) suggested that speE was inherited from their last common ancestor and its sequence subsequently maintained under strong selective pressure within the shared acidic environment of Iron Mountain. The close physical proximity of the strains could also facilitate genetic exchange and homologous recombination, contributing to the maintenance of DNA sequence similarity. The speE of L. ferriphilum ZJ, DX, and Sp-Cl formed another cluster with 100% DNA sequence identity different from the Iron Mountain cluster (Figure 4c). Strains ZJ and DX were from China, whereas Sp-Cl was from Chile. In this case, close geographic proximity could not explain the sequence identity, and selective pressure resulting from a similar environment seemed more likely to account for the maintenance of sequence identity. A comparison of the nucleotide sequences of speH showed 100% identity in strains of Group II Leptospirillum from Iron Mountain, China, and Chile, with two exceptions (Figure 4d). The exceptions were L. ferriphilum DSM 14647 (from Peru) and L. sp. '5-way CG' (from Iron Mountain) that formed a second cluster with 100% sequence identity.
It was hypothesized that geographical proximity could potentially explain some of the evolutionary trajectories of speE and speH, perhaps by maintenance of sequence identity via homologous recombination [104]. However, adaptation of vertically inherited genes to similar acidic econiches was a more likely explanation for those strains not geographically juxtaposed (e.g., L. ferriphilum Sp-Cl and L. ferriphilum DSM 14647).
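The pairwise identities behind heat maps such as those described for speE and speH can be computed from an alignment as sketched below. The strain labels and toy sequences are invented, and ignoring alignment columns that contain a gap is an assumption of this sketch; other definitions of percent identity exist and give different values for gapped alignments.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences, ignoring gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """All-against-all identity values keyed by (name_1, name_2)."""
    names = list(aligned)
    return {(p, q): percent_identity(aligned[p], aligned[q])
            for p in names for q in names}


# Toy alignment with invented strain labels
alignment = {
    "strain_A": "ATGGCTAAAC",
    "strain_B": "ATGGCTAAAC",
    "strain_C": "ATGGCAAAAC",
}
matrix = identity_matrix(alignment)
print(matrix[("strain_A", "strain_B")], matrix[("strain_A", "strain_C")])
```

Groups of strains that share 100% identity appear as uniform blocks when such a matrix is drawn as a heat map, which is how clusters like the Iron Mountain group become visible.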

Hopanoid Biosynthesis
HpnCDE, potentially encoding squalene biosynthesis, and a core set of hopanoid biosynthesis genes (hpnFGAHROP) were identified in Leptospirillum and N. marina (Figure 5). All these genes showed considerable syntenic conservation within all Leptospirillum Groups and N. marina, suggesting that they were inherited from a common ancestor by vertical descent. On the other hand, hpnIJ, encoding enzymes that modify hopanoids to bacteriohopanetetrol cyclitol ether, were predicted only in the genomes of Leptospirillum. It was hypothesized that hpnIJ entered the genome of the ancestral Leptospirillum after its divergence from the other Nitrospira. Mobile elements (tnp1-3) were detected in the neighborhood of the hpn gene cluster in Leptospirillum Group I (Figure 5b), suggesting that HGT of the cluster into Leptospirillum may have occurred. HpnIJ have been shown in Burkholderia to be involved in C35 extensions of hopanoids including bacteriohopanetetrol (BHT), BHT glucosamine, and BHT cyclitol ether, which are in turn involved in the response to environmental stress conditions including low pH [29].

Slp Starvation Lipoprotein
Four copies of slp were identified in N. marina (termed slps1-4) and an additional four copies were discovered in Leptospirillum (termed slps5-8). Phylogenetic analysis of their amino acid sequences suggested that all eight copies were distinct and displayed different evolutionary trajectories (Supplementary Figure S4). It is possible that one or more of the Leptospirillum slps were derived by vertical descent from the slps of the inferred ancestor with N. marina. However, the long branch lengths derived from the phylogeny made it difficult to pin-point unambiguously the Nitrospira ancestral slp that gave rise to an ancestral Leptospirillum slp.
The evolutionary trajectories of the slps in some clades of Leptospirillum could be explained by the similarity of geographic location. For example, slp6 and slp7 were found in clades belonging mainly to Iron Mountain, USA, and exhibited 100% amino acid sequence identity (Supplementary Figure S5). However, geographical proximity could not explain all trajectories. For example, slp5 and slp7 of L. ferriphilum DSM 14647 from Peru had 100% amino acid sequence identity with the Iron Mountain, USA, clade (Supplementary Figure S5). Furthermore, slp8 of L. ferriphilum DSM 14647 from Peru had 100% amino acid sequence identity with slp8 from a Chinese location (L. ferriphilum ML-04) and with Leptospirillum sp. "5-way CG" from Iron Mountain, USA. We concluded that the inheritance pattern of the slps could be explained either by geographic proximity or by adaptation to similar acidic econiches, as postulated for speE and speH in Section 3.4.2.
Several deductions could be inferred from an inspection of the genomic contexts of the four Leptospirillum slps (Figure 6): (i) All four slps in each group displayed a different genomic context and, with the exception of slp5, also between species of the same group. The genomic context of slp5 was strongly conserved between Groups I and II and slightly conserved with Group III (Figure 6a). (ii) All four slps were located near genes potentially encoding transposase-like functions, tRNAs, and phage-like genes, suggesting that they entered the genomes by HGT or underwent mobile element mediated rearrangement within these genomes. (iii) slp7 and slp8 were associated with hpnR and kch, respectively, genes potentially encoding other acid resistance functions (Figure 6b,d). (iv) All four slps were associated with a number (18 in total) of orphan hypothetical genes with no known function. Given the context of these genes, they may be related to acid resistance or other stress-related functions and can be highlighted for future experimental analysis. Alternatively, they could be unidentified phage remnants.
Although the functions of the Leptospirillum slps remain unknown, all contain a lipobox motif at the end of a predicted signal peptide (Supplementary Figure S6), as is characteristic of slps from other organisms [114]. The +2 position after the lipobox was proposed to be the main determinant for protein export such that, if it had an Asp amino acid residue, the protein was retained at the inner membrane (Supplementary Figure S6). An Asp was identified only in Slp5 from Group III, suggesting all the other proposed Slps might be exported to the outer membrane.
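The +2 rule described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study. It assumes that the signal peptide has already been removed, so that the input starts at the lipidated Cys (+1). The function name and the two example sequences are hypothetical and do not correspond to any Leptospirillum protein.

```python
def predict_slp_localization(mature_sequence):
    """Apply the +2 rule to a mature lipoprotein sequence.

    Assumption: mature_sequence starts at the lipidated Cys (position +1),
    i.e. the signal peptide has already been removed.
    """
    seq = mature_sequence.upper()
    if len(seq) < 2 or seq[0] != "C":
        raise ValueError("expected a mature sequence starting with Cys (+1)")
    # Asp at position +2 is taken as the inner-membrane retention signal;
    # any other residue is treated as permitting export to the outer membrane.
    return "inner membrane" if seq[1] == "D" else "outer membrane"


# Hypothetical toy sequences, for illustration only.
toy_sequences = {"toy_slp_a": "CDSGKT", "toy_slp_b": "CSSNQP"}
for name, seq in toy_sequences.items():
    print(name, predict_slp_localization(seq))
```

A real analysis would first locate the lipobox with a dedicated signal peptide predictor. The sketch only encodes the final decision step.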

Proton Antiporters
One gene copy of the putative voltage-gated ClC-type chloride/proton antiporter (ClcA) was identified in all Leptospirillum genomes, but not in N. marina. The genomic region around clcA was not conserved in any of the Leptospirillum Groups. However, each genomic context of clcA was associated with mobile elements or their remains. For example, a transposase DDE domain (cl26088) and a DNA recombinase Rci/bacteriophage Hp1-like integrase (cd00796) were in its neighborhood in Groups III and II. Cluster analysis of the ClcA amino acid sequences showed that the evolutionary trajectory of ClcA followed the pattern of the 16S rRNA phylogeny and was most likely inherited by vertical descent within the Leptospirillum groups (Supplementary Figure S7). We suggested that the mobile elements associated with clcA could have been involved in its chromosomal relocation in each group.
A second mechanism postulated to remove protons from the cytoplasm involved the NhaP sodium/proton antiporter, for which two copies, termed nhaP1 and nhaP2, were identified only in Leptospirillum Group II. A cluster analysis of their amino acid sequences suggested they were members of two different families (Supplementary Figure S8). nhaP1 was located in a conserved genomic context in all members of Group II. Remnants of a number of transposases and integrases together with tRNA-Arg were detected in the gene neighborhood, suggesting that nhaP1 was acquired by HGT in an ancestral Leptospirillum after its divergence from the last common ancestor. However, the conserved genomic context is consistent with the idea that it was inherited vertically within the different groups after the initial HGT event. A copy of nhaP2 was found only in L. ferriphilum ML-04 (Group II). However, its sequence was interrupted by a transposase insertion (tnp1; Figure 7a). It was unlikely that nhaP2 was functional because the transposase insertion introduced stop codons in its reading frame and split the functional NhaP domain (COG0025) into two parts.
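The observation that an insertion introduced stop codons into a reading frame can be checked with a simple in-frame scan. The following Python sketch is an illustration under stated assumptions, not the method used in the study. It assumes the standard stop codons (TAA, TAG, TGA), a coding sequence given in frame from its start codon, and a final codon that is the expected terminator. The function name and toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stop_positions(cds):
    """Return the 0-based codon indices of premature in-frame stop codons.

    Assumption: cds is in frame from its first nucleotide and its last
    complete codon is the expected terminator, which is therefore ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


# Hypothetical toy sequences, for illustration only.
print(internal_stop_positions("ATGAAAGGGTGA"))     # intact reading frame
print(internal_stop_positions("ATGAAATAAGGGTGA"))  # interrupted reading frame
```

A non-empty result flags a reading frame that is unlikely to yield a full-length product, which is the kind of evidence cited above for nhaP2.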

Gad Decarboxylase
Four copies of the acid resistance amino acid permease (gadC) and one copy of the acid resistance amino acid decarboxylase (gadA) were identified in all Leptospirillum. In contrast, N. marina contained only one putative amino acid permease. The glutamate decarboxylase gene (gadA) did not show synteny in its genome context between the three groups (Figure 7b), and a cluster analysis suggested that the gene was introduced by HGT from Archaea (in separate events, as the top hits for the three groups were different; Supplementary Figure S9). In addition, the four predicted gadC amino acid permease-encoding genes were identified in genomic locations separate from gadA, with one of them in the gene context of the potassium transporter Kdp system (Figure 3). Cluster analysis suggested one gadC copy was similar to the one found in the N. marina genome with three additional families present in the Leptospirillum genomes (Supplementary Figure S10).
GadA (glutamate decarboxylase) in Leptospirillum Groups II and III was associated with a cluster of trehalose biosynthesis genes (Figure 7b). It has been shown that potassium, glutamate, and trehalose form part of a response to osmotic shock and acid stress in E. coli (reviewed in [115]), suggesting that a similar response was possible in Leptospirillum Groups II and III.

Model of Leptospirillum Acid Resistance
A model of the Leptospirillum acid resistance systems, classified into first and second lines of defense mechanisms, is shown in Figure 8. Transcriptomic and proteomic analyses supported the relationship of the first line of defense genes with low pH adaptation in Leptospirillum [11,34,52,55,116]. Evidence also linked the expression of the KdpABC high-affinity potassium transport system and the HpnCDEFGIJHGRP and BamA hopanoid system to acid stress in AMD communities [34,52,55,116]. Environmental expression of genes involved in the second line of defense, such as the glutamate decarboxylase system gad and the Na+/H+ antiporter nhaP, has been detected [34,116,117]. Clearly, additional experiments are required to explore the validity of the model. However, we posit that it provides a platform for helping to circumscribe future experimental directions.
Three genomic regions of Leptospirillum contained potassium transport system kch genes in proximity with other acid resistance genes, for example, gadC2 (Figure 3), slp (Figure 6), and gadA (Figure 7). This could allow their coordinated regulation. Each of these systems was associated with multiple mobile elements, suggesting that they could have entered the genome by HGT. Acid resistance genes were also found in close proximity to other stress responsive genes such as those involved in biofilm formation (Figure 4) and trehalose biosynthesis (Figure 7), potentially allowing coordination of genes involved in acid resistance and osmotic stress.
Whereas the inventory of potential mechanisms involved in first and second lines of acid resistance in Leptospirillum was quite extensive, they were by no means the only ones used by organisms for acid resistance. For example, in Leptospirillum, protons were hypothesized to be exported by the glutamate decarboxylase system, but evolutionary and mechanistically related systems such as ornithine decarboxylases have been implicated in acid stress responses in amateur acidophiles [86]. The extreme acidophile Ferrovum, belonging to the Betaproteobacteria class, has been hypothesized to use the Kef-type K+ transport system to help in maintaining a positive inside membrane potential and to utilize urease activity to neutralize its immediate environment [118]. External cellular capsule formation has been speculated to be involved in acid resistance in the Acidithiobacillia class [119]. None of these systems were predicted in Leptospirillum. Given that multiple acid resistance mechanisms were found in different bacterial classes widely distributed in the Tree of Life, it was most likely that acid resistance evolved independently multiple times, perhaps aided by HGT. A similar conclusion has been made regarding the evolution of acid resistance in Archaea [120].
Genes encoding orphan hypothetical proteins located in genomic contexts associated with both first and second line of defense acid resistance genes may potentially encode unknown acid resistance mechanisms or functions that modify known acid resistance responses. Their bioinformatic prediction highlighted the need for experimental investigation into their functions.

Phylogenetic Distribution of Acid Resistance Genes and Their Inferred Evolutionary Trajectories
Evolutionary events leading to acid resistance in the Leptospirillum genus were inferred using parsimony bioinformatic methods [121] and were mapped onto the branches of the phylogenetic tree of Leptospirillum (Figure 9).
Figure 9. Inferred evolutionary reconstruction of the acid related genes of Leptospirillum with the main evolutionary events (gene gain/loss/fusion/duplication). Parsimony was used for the inferences. Names in red = first line of defense mechanisms and blue = second line of defense mechanisms. Black square = presence and white square = absence of mechanism. Black squares with white sections = mechanism is present in some, but not all, strains. LCA = last common ancestor.
Many of the predicted acid resistance genes were hypothesized to have been absent in the inferred last common ancestor of Leptospirillum and N. marina. These included genes encoding the K+ transporters Kch and Kdp, the four spermidine gene clusters, HpnIJ, ClcA, NhaP, and GadC2 and 3. It was proposed that they entered the Leptospirillum ancestral line by HGT via conjugation (e.g., the spermidine four-gene cluster), viruses (e.g., slps7 and 8, clcA), and multiple examples involving transposases. HGT has been suggested to be a prevalent mechanism in genome evolution in a wide range of microorganisms [122].
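The logic of a parsimony-based presence/absence reconstruction can be illustrated with Fitch's small-parsimony algorithm. The Python sketch below is an illustration only and is not the software or exact procedure cited in [121]. The tree topology, tip names, and gene profiles are hypothetical simplifications: one tip per Leptospirillum group plus N. marina as the outgroup.

```python
def fitch(tree, states):
    """Bottom-up pass of Fitch parsimony for one binary character.

    tree   : a tip name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping tip name -> 1 (gene present) or 0 (gene absent)
    Returns (candidate state set at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # The children disagree: one gain or loss must be placed on a branch.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical topology, for illustration only.
toy_tree = ((("Group_I", "Group_II"), "Group_III"), "N_marina")

# Gene present in all Leptospirillum tips but absent from the outgroup.
all_groups = {"Group_I": 1, "Group_II": 1, "Group_III": 1, "N_marina": 0}
# Gene present in a single group only.
one_group = {"Group_I": 0, "Group_II": 1, "Group_III": 0, "N_marina": 0}

print(fitch(toy_tree, all_groups))
print(fitch(toy_tree, one_group))
```

In the first profile a single event explains the distribution, but the root state is ambiguous, so the bottom-up pass alone cannot distinguish a gain in the Leptospirillum lineage from a loss in the outgroup. In the second profile the root state is resolved as absent, implying a single gain in one group.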
With two exceptions, the donors of the HGT genes were difficult to trace, perhaps because the events occurred so long ago that molecular signals of the donors have been erased with the passage of time. The Leptospirillum Kch potassium transporter had several top BLASTp hits with other extreme acidophiles, including the Acidithiobacillus genus, which shares its low pH environment. Leptospirillum is rooted deeper in the Tree of Life than Acidithiobacillus and therefore was probably ancestral to it, suggesting that the direction of transfer of Kch was from a Leptospirillum ancestor to an Acidithiobacillus ancestor. A second example of possible HGT donor identification lay in the comparison of the "L. ferrodiazotrophum" and "L. rubarum" glutamate decarboxylase acid resistance system with other members of the AMD community, suggesting that the genes were introduced by HGT into a Leptospirillum ancestor from an Archaeal Ferroplasma ancestor.
Gene duplication and gene diversification events were identified using a combination of phylogenetic inference based on alignments of families (cluster analysis) [123] and calculations of Dn/Ds [94][95][96]. Multiple examples of gene duplication events were detected. These included examples of potential vertical descent followed by gene duplication giving rise to paralogs (e.g., HpnJ1-3 in Group II). There were also many cases of gene duplications that were predicted to be xenologs, arising from HGT events. Xenologs were defined as a distinct form of horizontal gene transfer in which a gene was displaced by an ortholog from a different lineage [124], e.g., slps5-8 in Leptospirillum replaced slps present in the last common ancestor.
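As a rough illustration of what a Dn/Ds calculation involves, the Python sketch below computes the ratio from counts of nonsynonymous and synonymous differences and sites, applying the Jukes-Cantor correction used in the Nei-Gojobori approach. Whether this particular estimator matches the methods cited in [94][95][96] is an assumption. The function names and the counts are hypothetical, and the counting of sites and differences from a codon alignment is not shown.

```python
import math


def jukes_cantor(p):
    """Convert a proportion of differing sites into a corrected distance."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return Dn/Ds from difference and site counts (hypothetical inputs)."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("Ds is zero; the ratio is undefined")
    return dn / ds


# Hypothetical counts for one pairwise comparison, for illustration only.
ratio = dn_ds(nonsyn_diffs=5, nonsyn_sites=700, syn_diffs=20, syn_sites=230)
print(round(ratio, 3))
```

A ratio well below 1 is conventionally read as purifying selection, a ratio near 1 as neutral evolution, and a ratio above 1 as possible diversifying selection.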
The relative timing of the events leading to the hypothesized transition of the ancestral Leptospirillum from a circumneutral or mildly acidic environment to a hyper-acidic one was difficult to assess. Based on what was known about acid stress response mechanisms of moderate ("amateur") acidophiles (pH 3.5-6), it seemed likely that the first transition events involved the development of second line of defense mechanisms including proton expulsion mechanisms such as ClcA, NhaP, and Gad. However, it could not be ruled out that the mechanisms of the first line of defense were also involved in the early transition to very low pH environments. Some of these, such as hopanoids and spermidine, could have been gained initially to provide protection from other stresses such as oxidative stress or high temperature and subsequently adapted for low pH protection.
This paper focused on the potential mechanisms employed by Leptospirillum to thrive in extremely low pH environments. One of the major aspects of adaptation that was not investigated was how proteins fold and function in acid conditions. The cytoplasm was hypothesized to be around pH 6 or circumneutral, as was shown for other acidophiles, although this was not experimentally verified for Leptospirillum. If this assumption were correct, then only proteins or protein loops outside the periplasmic membrane would be exposed to low pH. Protein adaptations to low pH have been investigated in other acidophiles (e.g., [125,126]), but since no information was available for Leptospirillum, the model of its hypothesized transition from a neutral to an acid environment remains incomplete. Other major lacunae in our knowledge of the evolution of acidophilia in Leptospirillum were how changes in pH were sensed and transduced into gene regulation and how chaperones could be involved in maintaining protein integrity.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/11/4/389/s1. Figure S1. Unrooted phylogenetic tree constructed from the predicted amino acid sequences of Kch in the Leptospirillum genus. Figure S2. (a) Unrooted phylogenetic tree constructed from the predicted amino acid sequences from TrkA in the Leptospirillum genus. (b) Dn/Ds box plots for TrkA. Figure S3. Unrooted phylogenetic tree constructed from the predicted amino acid sequences of SpeE in the Leptospirillum genus. Figure S4. Phylogenetic tree for slp gene copies (in different background highlighted colors) from the Leptospirillum genus. Figure S5. Heatmap of amino acid identity sequences of slp genes for the Leptospirillum genus. Figure S6. Multiple sequence alignment of N. marina and Leptospirillum slp gene sequences including a WebLogos plot of the slp lipobox. Figure S7. Unrooted phylogenetic tree constructed from the predicted amino acid sequences from ClcA in the Leptospirillum genus. Figure S8. Unrooted phylogenetic tree constructed from the predicted amino acid sequences from NhaP1 and 2 in Leptospirillum Group II. Figure S9. Unrooted phylogenetic tree constructed from the predicted amino acid sequences from GadA in the Leptospirillum genus. Figure S10. Unrooted phylogenetic tree constructed from the predicted amino acid sequences from GadC in the Leptospirillum genus. Table S1. 16S rRNA gene sequences used for the phylogenetic tree for the 26 Nitrospirae used in the study; best hit IDs used in the clustering analysis; IDs for the acid resistance genes; predicted locations of the hypothetical genes included in the study and the names and accessions for acid resistant genes not found in Leptospirillum.

Conflicts of Interest:
The authors declare no conflict of interest.