Niche Preference of Escherichia coli in a Peri-Urban Pond Ecosystem

Escherichia coli comprises diverse strains with a large accessory genome, indicating functional diversity and the ability to adapt to a range of niches. Specific strains would display greatest fitness in niches matching their combination of phenotypic traits. Given this hypothesis, we sought to determine whether E. coli in a peri-urban pond and associated cattle pasture display niche preference. Samples were collected from water, sediment, aquatic plants, water snails associated with the pond, as well as bovine feces from cattle in an adjacent pasture. Isolates (120) were obtained after plating on Membrane Lactose Glucuronide Agar (MLGA). We used the uidA and mutS sequences for all isolates to determine phylogeny by maximum likelihood, and population structure through gene flow analysis. PCR was used to allocate isolates to phylogroups and to determine the presence of pathogenicity/virulence genes (stxI, stxII, eaeA, hlyA, ST, and LT). Antimicrobial resistance was determined using a disk diffusion assay for Tetracycline, Gentamicin, Ciprofloxacin, Meropenem, Ceftriaxone, and Azithromycin. Our results showed that isolates from water, sediment, and water plants were similar by phylogroup distribution, virulence gene distribution, and antibiotic resistance while both snail and feces populations were significantly different. Few of the feces isolates were significantly similar to aquatic ones, and most of the snail isolates were also different. Population structure analysis indicated three genetic backgrounds associated with bovine, snail, and aquatic environments. Collectively these data support niche preference of E. coli isolates occurring in this ecosystem.


Introduction
Escherichia coli is a commensal in the gastrointestinal tracts of humans and vertebrate animals, but is readily isolated from aquatic and terrestrial habitats. Some data suggest semi-permanent residence in extra-host habitats [1,2]. The species displays a broad range of genotypes and associated phenotypes [3,4] and has previously been classified into four phylogroups (A, B1, B2, and D) [5], and later eight phylogroups based on their genomic information [6]. Of these, seven (A, B1, B2, C, D, E, and F) belong to E. coli sensu stricto whereas the eighth is represented by cryptic Clade-I. Variation in genotype and phenotype among strains of different phylogroups is believed to support fitness in different ecological habitats, leading to niche preference. Phylogroups A and B1 occur more frequently in the environment [7]. Some strains of phylogroup B1 were reported to persist in water [7,8] and soil [9], with some believed to be naturalized members of their specific communities [1]. B2 and D strains are frequently isolated from extra-intestinal sites within host bodies [3]. Many studies have reported that phylogroup B2 and, to a lesser extent, D strains are more likely to carry virulence factors than other phylogroups [10][11][12]. Interestingly, virulence genes are more frequently present in phylogroup B1 isolates from environments where phylogroup
Life 2021, 11, 1020
then crushed in 10 mL sterile dH2O, and 100 µL plated directly on to MLGA plates. Feces samples were suspended in sterile water and serial tenfold dilutions plated onto MLGA. MLGA plates were incubated at 37 °C for 18 h. Green colonies indicated positive for β-Galactosidase (yellow) and β-Glucuronidase (blue) and were assumed to be E. coli. This protocol therefore excluded β-Glucuronidase negative O157:H7 strains. One colony was selected at random from the highest dilution showing growth, streaked onto MLGA to confirm purity, sub-cultured on LB agar, and stored at −80 °C in 50% glycerol.

Phylogroup Analysis
Genomic DNA was extracted from overnight LB agar cultures harvested and resuspended in 5 mL 10 mM phosphate buffer (pH 7.0) using the genomic DNA Quick Prep Kit (Zymo Research, Irvine, CA, USA), and stored at −20 °C. Isolates were assigned to phylogroups using the protocol of Clermont et al. [6]. To avoid ambiguity, PCR was performed separately for each primer set (Table S1). Phylogroup similarity among the five sample types was determined by UPGMA analysis using the constrained Jaccard coefficient in PAST version 3.14 (https://www.nhm.uio.no/english/research/infrastructure/past/, last accessed on 17 September 2021) [36]. To determine whether the distribution of phylogroups differed by source, we used multinomial log-linear regression models. The models were fitted using the nnet package in R (v.3.2.2) [37]. The response variable in this analysis was the phylogroup of each isolate (A, B1, B2, C, D, E, and Unknown), and the explanatory variables were the sample source and clusters associated with origin of the isolates. To visualize the effect of significant explanatory variables, we used regression trees fitted using the party package [38] in R.
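The similarity analysis described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the presence/absence matrix is hypothetical and is not the study's data, SciPy is used in place of PAST, and the plain Jaccard distance is computed rather than PAST's constrained variant. UPGMA corresponds to average-linkage hierarchical clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

# Hypothetical presence/absence of phylogroups per sample type
# (illustrative only, not the study's data). Columns: A, B1, B2, C, D, E.
sources = ["water", "sediment", "plants", "snails", "feces"]
presence = np.array([
    [0, 1, 1, 0, 0, 1],  # water
    [1, 1, 1, 0, 0, 1],  # sediment
    [0, 1, 1, 0, 0, 1],  # plants
    [0, 0, 1, 0, 1, 0],  # snails
    [0, 0, 0, 0, 0, 1],  # feces
], dtype=bool)

# Jaccard distance = 1 - |intersection| / |union| for each pair of sample types.
dist = pdist(presence, metric="jaccard")
dist_matrix = squareform(dist)

# UPGMA is average-linkage clustering on the condensed distance matrix.
tree = linkage(dist, method="average")

for i, name in enumerate(sources):
    print(name, np.round(dist_matrix[i], 2))
```

In this toy matrix, water and plants have identical profiles (distance 0) and therefore merge first, while sediment differs from water only by the extra phylogroup A (distance 0.25).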

uidA and mutS Sequence Analysis
To genotype isolates, we amplified the uidA and mutS genes by PCR using primers described previously [39] (Table S1). uidA, which encodes the β-Glucuronidase enzyme, is often used to differentiate E. coli sensu lato from other species in the genus, and the primer set used was originally designed to target the most phylogenetically informative region of the gene [39]. mutS forms part of the mismatch repair system of bacteria, was shown to have a high diversity among E. coli isolates [40], and was an ideal target for this diversity study. PCR reactions (25 µL) were set up as follows: 2.5 µL reaction buffer (10×) (New England Biolabs), 1.5 µL MgCl2 (25 mM), 0.5 µL dNTPs (40 mM), 0.1 µL forward primer and 0.1 µL reverse primer (100 µmol), 0.125 µL of Taq polymerase (New England Biolabs), 0.5 µL of DNA template, and 20.7 µL sterile nano pure water. The amplification cycle was initiated with 95 °C for 2 min, followed by 30 cycles of denaturing at 95 °C for 30 s, annealing at 56 °C for 30 s and extension at 72 °C for 1 min, with a final extension at 72 °C for 5 min. DNA sequences were determined by the dideoxy chain termination method (Beckman Coulter Genomic Center at Danvers, MA, USA). The uidA and mutS sequences were submitted to GenBank (http://www.ncbi.nlm.nih.gov/genbank/, accessed on 17 September 2021) under BankIt2031081: MF459726-MF459846 and BankIt2031086: MF459847-MF459967, respectively.
To infer the relationships among isolates, DNA sequences were aligned using ClustalW [41], and overhangs were trimmed using SeAl [42]. The uidA and mutS sequences for all isolates and reference strains [43] were concatenated using SeAl. A maximum likelihood analysis using model GTR+G+I with 1000 bootstrap replicates was performed in the program MEGA 6.06 [41]. The tree was then annotated and visualized using the iTOL online tool [44].

Population Genetic Analysis
To infer population structure and assign isolates to distinct populations, we employed a model-based clustering method using STRUCTURE [45]. More specifically the admixture model was applied using sample locations as prior (LOCPRIOR). By assuming mixed ancestry, individuals within a population were thought to have inherited a fraction of their genome from an ancestor in the population [46]. Ln probability values and the variance of Ln likelihood scores were estimated for the concatenated uidA-mutS sequences, assuming the presence of 2 populations (K = 2, with an adjusted alpha = 0.5) and performing twenty iterations for each K from K = 1 to K = 6. For these analyses a burn-in of 10,000 and a run length of 500,000 were used [47]. All other parameters in STRUCTURE were left as default. The resulting data from STRUCTURE were collated and visualised using the web-based program Structure Harvester [46] to assess which likelihood values across the multiple estimates of K best explained the data (in this case K = 3 was the best) using the Evanno method [48,49]. Furthermore, optimal alignments for the number of replicate cluster analyses were generated using the FullSearch algorithm in CLUMPP [50] and the corresponding output files were used directly for cluster visualization as plots in Excel and the program Distruct 1.1 [51].
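The selection of K by the Evanno method can be illustrated with a short Python sketch. It assumes the form of the statistic computed from replicate means, ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd[L(K)], with the sample standard deviation. The Ln P(D) values below are hypothetical (three replicates per K rather than the twenty used here) and the function name is ours, not part of STRUCTURE or Structure Harvester.

```python
import numpy as np


def evanno_delta_k(ln_prob):
    """Return {K: delta K} for interior K values.

    ln_prob maps each K to a list of Ln P(D) values from replicate runs.
    K values are assumed to be consecutive integers.
    """
    ks = sorted(ln_prob)
    mean = {k: float(np.mean(ln_prob[k])) for k in ks}
    sd = {k: float(np.std(ln_prob[k], ddof=1)) for k in ks}
    delta = {}
    # Delta K is undefined for the smallest and largest K tested.
    for k in ks[1:-1]:
        second_diff = abs(mean[k + 1] - 2 * mean[k] + mean[k - 1])
        delta[k] = second_diff / sd[k]
    return delta


# Hypothetical replicate likelihoods for K = 1..6 (illustrative only).
runs = {
    1: [-5000, -5002, -4998],
    2: [-4500, -4503, -4497],
    3: [-4100, -4102, -4098],
    4: [-4080, -4090, -4070],
    5: [-4070, -4085, -4055],
    6: [-4065, -4085, -4045],
}

delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

With these made-up values the likelihood stops improving sharply after K = 3, so ΔK peaks there; the peak, not the highest raw likelihood, is what identifies the preferred K.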

Virulence Gene Assays
PCR for detection of stx1, stx2, eaeA and hlyA genes was performed using primers as described by Fagan et al. [52] (Table S2), and for STa and LTb virulence genes as described by Osek [53] (Table S2). DNA samples for PCR were prepared by the boiling method. Stock cultures were recovered on LBA, two colonies suspended in 500 µL dH2O, washed by centrifugation and suspended in sterile dH2O, lysed by incubating at 100 °C for 10 min, and immediately chilled on ice for 5 min. Debris was removed by centrifugation for 1 min at 12,000× g and the supernatant was transferred to a new sterile tube and stored at −20 °C for further use as PCR template. PCR reactions were carried out in 25 µL volume containing 1 µL of DNA template, 2.5 µL reaction buffer (10×) (New England Biolabs, Ipswich, MA, USA), 1.5 µL MgCl2 (25 mM), 0.5 µL dNTPs (40 mM), 0.1 µL forward primer and 0.1 µL reverse primer (100 µmol), 0.1 µL of Taq polymerase (New England Biolabs), and 19.2 µL sterile nano pure water. PCR amplification for stx1, stx2, eaeA, and hlyA was performed under the following conditions: initial 95 °C denaturation step for 3 min followed by 35 cycles of 20 s denaturation at 95 °C, 40 s primer annealing at 58 °C, and 90 s extension at 72 °C. The final cycle was followed by a 72 °C incubation for 5 min [52]. LTb and STa were amplified under the following conditions: an initial DNA denaturation step at 94 °C for 5 min followed by 30 cycles of 1 min of denaturation at 94 °C, 1 min of primer annealing at 55 °C, and 2 min of extension at 72 °C. The final extension step was performed at 72 °C for 5 min [53].
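As a quick arithmetic check of the reaction set-up and cycling conditions above, the following Python sketch sums the component volumes and the programmed hold times. The function and variable names are ours and purely illustrative; ramp times between temperatures are ignored, so actual run times on a thermocycler would be longer.

```python
def program_seconds(initial, cycles, steps, final):
    """Total hold time in seconds: initial step + cycles * sum(steps) + final step."""
    return initial + cycles * sum(steps) + final


# Hold times transcribed from the text, in seconds.
stx_eae_hly = program_seconds(initial=180, cycles=35, steps=(20, 40, 90), final=300)
lt_st = program_seconds(initial=300, cycles=30, steps=(60, 60, 120), final=300)

# Component volumes (µL) of the virulence-gene reaction, as listed in the text.
mix_ul = {
    "template": 1.0,
    "buffer_10x": 2.5,
    "MgCl2": 1.5,
    "dNTPs": 0.5,
    "forward_primer": 0.1,
    "reverse_primer": 0.1,
    "Taq": 0.1,
    "water": 19.2,
}
total_ul = round(sum(mix_ul.values()), 3)

print(stx_eae_hly / 60, lt_st / 60, total_ul)
```

The listed components add up to the stated 25 µL, and the two programs correspond to 95.5 min and 130 min of hold time, respectively.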

Antibiotic Resistance Assays
Antibiotic susceptibility of the 120 E. coli isolates was determined using a disk diffusion assay following the CLSI standard [54]. Stock cultures were recovered in 5 mL Mueller Hinton (MH, Oxoid) broth at 37 °C for 16 h. Cells were harvested by centrifugation (10,000× g, 2 min), re-suspended in sterile tap water and the cell density adjusted to 0.5 on the McFarland turbidity standard. Cell suspensions were spread onto MH agar (Oxoid), and antibiotic disks (Oxoid) Ciprofloxacin (CIP, 5 µg), Meropenem (MEM, 10 µg), Ceftriaxone (CRO, 30 µg), Gentamicin (CN, 10 µg), Azithromycin (AZM, 15 µg), Tetracycline (TE, 30 µg), with Penicillin (10 µg) as control were placed on the surface. After 18 h incubation at 37 °C, zone diameters were measured, and isolates scored as intermediately or fully resistant. E. coli ATCC 25922 was included in each assay as a negative control as it is sensitive to all these antibiotics.
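The scoring step can be sketched as a simple comparison of each zone diameter against two breakpoints. The breakpoint values and zone diameters in this Python example are placeholders chosen for illustration only; they are not taken from the CLSI tables [54] or from our measurements, and the function name is hypothetical.

```python
def classify_zone(diameter_mm, resistant_max, susceptible_min):
    """Return 'R', 'I', or 'S' for a zone diameter given two breakpoints (mm)."""
    if diameter_mm <= resistant_max:
        return "R"  # fully resistant
    if diameter_mm >= susceptible_min:
        return "S"  # susceptible
    return "I"      # intermediate


# Placeholder breakpoints as (resistant_max, susceptible_min) in mm.
# These are NOT CLSI values; look up the current tables for real work.
PLACEHOLDER_BREAKPOINTS = {"TE": (11, 15), "CN": (12, 15), "CIP": (15, 21)}

# Hypothetical zone diameters (mm) for one isolate.
zones = {"TE": 9, "CN": 14, "CIP": 30}

calls = {
    drug: classify_zone(mm, *PLACEHOLDER_BREAKPOINTS[drug])
    for drug, mm in zones.items()
}
# Number of antibiotics to which the isolate is intermediately or fully resistant.
non_susceptible = sum(call != "S" for call in calls.values())

print(calls, non_susceptible)
```

Counting the non-susceptible calls per isolate gives the kind of per-isolate tally (resistance to 0, 1, 2, or 3 antibiotics) that is summarized by sample type in the Results.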

Bacterial Isolates
Strains of E. coli were obtained from water (31), sediment (27), water plants (35), and snails collected from the pond (20), and from fresh bovine feces (7) obtained from the adjacent pasture.

Phylogroup Distribution
Isolate collections obtained from the water and submerged water plants showed similar phylogroup distribution (Figure 1), predominated by phylogroups B1, E and some B2 isolates. Sediment was similar to water and water plants, but with the addition of phylogroup A strains. In contrast, snail isolates were mostly phylogroup B2, while those from bovine feces were phylogroup E. Multinomial log linear regression supported a significant difference (p < 0.001) between the isolate collections from water snails and those from water, sediment, and plants (Supplementary Materials, Figure S1), with the latter three collections not significantly different from one another. As our isolation method was based on MLGA (β-Glucuronidase and β-Galactosidase), phylogroup E strains lacking the uidA gene for β-Glucuronidase would have been excluded [55]. However, we obtained several green colonies from feces, all falling into phylogroup E but yielding the uidA gene by PCR.

Figure 1. Phylogroup distribution of isolates from the five sample types [6]. Similarity of phylogroup distribution was determined by UPGMA using the constrained Jaccard coefficient.

Phylogenetic Analysis
The concatenated mutS and uidA sequence phylogeny formed many well-separated clusters with strong bootstrap support (Figure 2). None of our isolates grouped with any of the Clade I, III, IV, or V strains and all belonged to E. coli sensu stricto. Most of the water, water plant, and sediment isolates fell into mixed clusters, some with reference strains. This indicated co-occurrence of diverse strains across the three niches. The majority of water snail isolates grouped into three unique clusters that contained no water, sediment, or water plant isolates, and also no reference strains, indicating that they are unique and potentially have a preference for snails over the surrounding water, sediment, or water plant niches. Three of the snail isolates did cluster with water plant, sediment, water, and reference strains. All bovine fecal isolates formed a separate cluster from aquatic and from snail isolates. Furthermore, no bovine isolates clustered with phylogroup E reference strains (Figure 2), indicating hitherto poorly studied diversity within cattle.

Figure 2. Maximum likelihood phylogeny of the concatenated uidA and mutS sequences, performed in the program MEGA 6. The phylogenetic tree was visualized using the Interactive Tree of Life, with isolates color-coded based on their sources. Grey circles on branches indicate a bootstrap value of >80% (1000 bootstraps).

Population Genetic Analysis
Population genetic analysis of concatenated uidA and mutS genes was performed assuming one aquatic and one fecal population (i.e., K = 2, alpha = 0.5). The result obtained from the Evanno table was K = 3, supporting the existence of three separate genetic backgrounds within the collection of isolates examined (Figure 3). The bovine fecal isolate collection was homogeneous, containing mainly one genetic background. Isolates from snails were associated with two backgrounds that were mostly homogeneous, one of which was identical to the fecal background. In contrast, water, sediment, and water plant isolates were associated with a mixture of three genetic backgrounds: one shared with the bovine fecal isolates, one shared with the second group of snail isolates, and a distinct third background (yellow in Figure 3) that was common in the aquatic populations but absent from snail isolates. Thus, the pond ecosystem comprised an admixture of strains representing three genetic backgrounds, one likely due to introduction of bovine-derived strains (blue in Figure 3), a second associated with snail populations (red), and a third unique to the aquatic environment (yellow). This indicates gene flow among the water, sediment, and water plant populations, but with some genetic input from the fecal and snail populations. No genetic input from the water, water plant, sediment, and snail populations to the bovine fecal population was observed.

Figure 3. Population structure analysis of E. coli isolates. Concatenated uidA and mutS sequences were analyzed assuming presence of two populations, but analysis using Structure Harvester showed that K = 3 best explained the data. The short color bars below the figure indicate the isolate source as defined in the legend.

Virulence Gene Distribution
To determine their pathogenic potential, isolates were screened for the presence of major virulence genes associated with diarrhoeagenic E. coli. EHEC represent a pathotype producing at least one of the two Shiga toxins, Stx1 and Stx2, encoded by stx1 and stx2 [56]. In addition, EHEC produce numerous other putative virulence factors including Intimin, responsible for attachment of the bacteria to intestinal epithelial cells, causing attaching and effacing lesions in the intestinal mucosa and aiding in the attachment and colonization of the bacteria at the intestinal wall [57]. Intimin, encoded by the chromosomal gene eaeA, is part of a pathogenicity island termed the locus of enterocyte effacement. Hemolysin, encoded by the hlyA gene, can lyse red blood cells and liberate iron to help support E. coli metabolism [58]. The intestinal tract of cattle is regarded as the primary reservoir of EHEC, which has also been recovered from other domestic animals, such as sheep, goats, pigs, cats, and dogs, as well as wild animals [56,59]. ETEC commonly express heat labile toxin encoded by LTb and heat stable toxin encoded by STa [53].
Out of six genes tested for, four (stx2, eaeA, hlyA and STa) were detected. We did not detect any isolates with the stx1 and LTb genes, although the control strains EDL933D and O157:K88 [60] yielded positive results, confirming reliability of the assay. Among the four detected genes, eaeA was the most frequently detected (36.13%), then stx2 (12.61%), STa (10.9%), and hlyA (3.36%). Distribution of the virulence genes in E. coli populations of water, sediment, and water plants was similar, supporting exchange of isolates among these niches (Figure 4). Yet the water population had a much higher prevalence of the STa gene and had no isolates with hlyA. Virulence gene distribution of snail populations was different, with more than half the isolates containing the eaeA gene. While all isolates from bovine feces belonged to phylogroup E, none contained any of the six virulence genes (Figure 4). Few phylogroup E isolates carried stx2, and none tested positive for stx1 (Figure S2). β-Glucuronidase negative strains would not have formed green colonies on MLGA, and would have been excluded, so some phylogroup E strains containing virulence genes may have been excluded in our study.

Antibiotic Resistance Profiling
One antibiotic from each of six target classes was chosen to evaluate the resistance of isolates: ceftriaxone (CRO, class cephalosporins), ciprofloxacin (CIP, class fluoroquinolones), gentamicin (CN, class aminoglycosides), azithromycin (AZM, class macrolides), meropenem (MEM, class carbapenems), and tetracycline (TE). Isolate collections from water and water plants showed a similar resistance distribution, with 60% of isolates resistant to gentamicin (Figure 5). Sediment antibiotic resistance distribution was different from water and water plant populations. Water, water plant, and sediment samples contained isolates resistant to three antibiotics, many of which also contained the eaeA gene as well as either STa or hlyA (Figure 6). The isolate collection from snails had a unique antibiotic resistance profile, with 80% sensitive to all antibiotics (Figure 6), whereas only 20% of the isolates from water, sediment, and water plants were not resistant to any of the antibiotics. However, most of the snail isolates displayed intermediate resistance to three or four antibiotics (Figure S3). Isolates from bovine feces also displayed a unique antibiotic resistance profile (Figure 5), with 80% of isolates displaying intermediate resistance to either two or three antibiotics (Figure S3).
Figure 5. Antibiotic resistance across isolates from the five sample types. The relatedness between resistance profiles was determined by UPGMA using the constrained Jaccard coefficient.
Figure 6. Sensitivity and resistance to 0, 1, 2, or 3 antibiotics across sample types expressed as percentage, compared to occurrence of virulence genes.

Discussion
We sought to determine whether environmental E. coli display niche preference by associating with specific environments. We chose a secluded peri-urban pond adjacent to a cattle pasture, isolating E. coli from water, sediment, submerged water plants, and water snails, as well as from freshly deposited bovine feces in the adjacent pasture. While the pond is secluded and therefore less subject to introduction of bacteria from outside, we cannot exclude introduction of E. coli through wild birds or small mammals [61].
To obtain evidence of niche partitioning, isolates were characterized genotypically by phylogrouping, analysis of their uidA and mutS sequences, and virulence gene distribution, and phenotypically for antibiotic resistance.
Snail E. coli populations were dominated by phylogroup B2. By multinomial log-linear regression analysis, the snail phylogroup distribution differed significantly (p < 0.001) from those of the water, sediment, and water plant populations. This indicated that strains display preference for either snail or aquatic niches but not both. Phylogroup distribution differed slightly between sediment and the water and water plant collections, but the difference was not significant by multinomial log-linear regression analysis, indicating indiscriminate distribution of specific strains among these three niches. The prevalence of B1 and E, and some B2 in water, water plant, and sediment was consistent with previous studies where B1 strains have been interpreted as generalists that harbor traits linked to plant association, whereas B2 strains are associated more with animals [62,63]. Phylogroup distribution within the E. coli population in both water and superficial sediments showed spatial variation [34]. It has also been reported that phylogenetic groups are adaptable and genotypically influenced by changes in environmental conditions; however, phylogroup B1 isolates seem to persist in water [8,64]. Our data indicated that the B2 populations occurring in the pond persisted mostly in water snails. Likewise, phylogroup E strains predominated in bovine feces deposited nearby and, despite run-off from the pasture to the pond, did not thrive in the pond environment. The differences in phylogroup composition among populations in different environments may be caused by differences in adaptability and genome plasticity of E. coli strains [64]. Such variation in phylogroup distribution suggests that E. coli phylogroups are affected by niche-specific selective pressures [63].
The phylogroup E strains isolated from the water, sediment, and water plants formed several clusters within the uidA mutS phylogeny. Importantly, snail phylogroup E isolates clustered separately, as did bovine fecal isolates, indicating three separate groups of isolates and supporting niche preference among various phylogroup E strains. The mutS and uidA phylogeny showed that some clusters were devoid of reference strains. None of our phylogroup E isolates clustered with any reference strains, suggesting these isolates are different to those typically associated with humans. In a recent study of cattle pasture we also found a higher percentage of phylogroup E in bovine fecal isolates compared to soil isolates, and none clustered with reference strains [9]. There appear to be diverse environmental β-Glucuronidase positive E. coli that are allocated to phylogroup E by the Clermont scheme [6], but that do not align with human isolates available in the databanks, warranting further investigation. The prevalence of E. coli in soils depends on specific conditions with phylogroup B1 and E associated with pasture lands while B2 and D phylogroups were associated with wooded areas [65]. Collectively, the phylogeny derived from uidA and mutS genes supported by phylogroup distribution analysis showed a preference of certain isolates with distinct backgrounds for specific niches.
Population genetic analysis of mutS and uidA supported the existence of three distinct genetic backgrounds within the collection of isolates analyzed. The bovine fecal isolates had a homogeneous background mostly lacking admixture. Some snail isolates shared this background, while others had their own background, also lacking admixture. Isolates from water, water plants, and sediment varied. Some had a pure bovine background, others a pure snail background, while the majority had an admixture of two or three backgrounds: bovine and/or snail plus a third, apparently aquatic one. This indicated directional gene flow from bovine fecal strains and, separately, from snail-associated strains to aquatic strains. In contrast, there was no or limited evidence for gene flow from aquatic to snail or cattle populations, indicating that none of these aquatic strains were able to persist in snail or bovine gastrointestinal environments. The neutral theory of molecular evolution makes a clear prediction of how genetic drift, in the absence of all other evolutionary forces, shapes genetic diversity [66]. To study genomic evolution and consider a more complex explanation for the pattern of molecular variation, the neutral theory must be rejected as a null hypothesis [67]. Genetic variation in E. coli combines aspects of recombination, selection, and population structure [68]. The gene flow model has some support in the literature. Retchless and Lawrence [69] proposed the fragment speciation model, in which different segments of bacterial chromosomes become genetically isolated at different times. Sheppard et al. [70] found evidence of increasing gene flow between previously distinct Campylobacter species. Luo et al. [71] described the genomes of environmental isolates of E. coli and found little evidence of gene exchange between these isolates and gut commensal E. coli, possibly due to ecological barriers, although they found transfer of core genes within the clades. Similarly, Karberg et al. [72] found that recently acquired genes in Salmonella and Escherichia genomes have similar codon usage frequencies, while core genes have noticeably diverged in codon usage. Therefore, it seems that Salmonella and Escherichia strains acquire genes from common pangenomes shared among enterobacterial species.
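The distinction drawn above between isolates with a pure background and isolates with an admixture of two or three backgrounds can be illustrated with a minimal sketch. It assumes that each isolate is summarized by three ancestry proportions (bovine, snail, aquatic) that sum to 1, and that an isolate counts as "pure" when one proportion reaches a cutoff. The cutoff of 0.9, the function name `classify`, and the isolate names and values are hypothetical illustrations and are not taken from the analysis reported here.

```python
from collections import Counter

# Order of the three genetic backgrounds in each proportion tuple.
BACKGROUNDS = ("bovine", "snail", "aquatic")


def classify(q, threshold=0.9):
    """Label an isolate by its dominant background, or as 'admixed'.

    q is a tuple of ancestry proportions in BACKGROUNDS order.
    threshold is an assumed cutoff for calling a background 'pure'.
    """
    if len(q) != len(BACKGROUNDS):
        raise ValueError("expected one proportion per background")
    if abs(sum(q) - 1.0) > 1e-6:
        raise ValueError("ancestry proportions must sum to 1")
    top = max(range(len(q)), key=lambda i: q[i])
    return BACKGROUNDS[top] if q[top] >= threshold else "admixed"


# Hypothetical toy values, for illustration only (not data from this study).
toy_isolates = {
    "feces_01": (0.98, 0.01, 0.01),
    "snail_01": (0.02, 0.97, 0.01),
    "water_01": (0.40, 0.15, 0.45),
    "sediment_01": (0.95, 0.03, 0.02),
}

labels = {name: classify(q) for name, q in toy_isolates.items()}
summary = Counter(labels.values())

for name, label in labels.items():
    print(name, label)
```

In this toy example an aquatic isolate carrying a pure bovine background (`sediment_01`) is labeled the same way as a fecal isolate, which mirrors the pattern interpreted above as gene flow or strain movement from bovine to aquatic populations. The choice of cutoff affects how many isolates are called admixed, so any real analysis would need to justify it.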
The presence of virulence genes eaeA, stx2, hlyA, and STa indicates potential pathogens, although it has been suggested that the occurrence of single or multiple virulence genes in E. coli does not confirm pathogenicity unless the strain carries the appropriate combination of virulence genes to cause disease in the host. Enteric pathogens exposed to vegetables express genes similar to those required to colonize the host intestine, indicating that enteric bacteria may colonize vegetables using mechanisms similar to those required for animal cells [73]. High prevalence of the intimin-encoding gene eaeA was observed in all four pond niches, but not in feces, indicating that eaeA may play a role in aquatic fitness that is distinct from virulence. Byappanahalli et al. [74] detected a high level of eaeA in isolates from algae and, to a lesser extent, in those from water and sand samples from Lake Michigan. eaeA is one of the most frequently detected E. coli pathogenicity genes in the environment [75][76][77]. It is not certain whether these isolates with virulence genes are pathogenic and persist in the environment, or whether they acquire these genes from these environments.
The similarities in patterns of antibiotic resistance in aquatic populations suggested a common source of resistant strains, with preference for these niches. Snail populations were almost devoid of resistance to the wide array of antibiotics evaluated, again supporting niche preference. E. coli isolated from various sampling sources have shown variation in antibiotic resistance patterns depending on antibiotic use and on the environments to which the isolates were exposed [78][79][80]. This pond was not being used for any human or domestic animal activities and there was no direct input of wastewater. It is unclear whether isolates acquired antibiotic resistance through antibiotic exposure, or whether they maintain these genes in the absence of antibiotics [81].
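The comparison of resistance patterns across niches described above can be sketched as a simple tally. The example assumes that each isolate has one disk diffusion call per antibiotic (S for sensitive, I for intermediate, R for resistant) for the six antibiotics tested, and it counts the number of antibiotics to which each isolate is not fully sensitive. The isolate calls, the variable names, and the function `non_susceptible_count` are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter, defaultdict

# The six antibiotics tested, in the order used for each call string.
ANTIBIOTICS = (
    "tetracycline",
    "gentamicin",
    "ciprofloxacin",
    "meropenem",
    "ceftriaxone",
    "azithromycin",
)


def non_susceptible_count(calls):
    """Count antibiotics with an intermediate (I) or resistant (R) call."""
    if len(calls) != len(ANTIBIOTICS):
        raise ValueError("expected one call per antibiotic")
    if any(c not in ("S", "I", "R") for c in calls):
        raise ValueError("calls must be S, I, or R")
    return sum(c in ("I", "R") for c in calls)


# Hypothetical toy calls, for illustration only (not data from this study).
toy_calls = [
    ("water", "SSSSSS"),
    ("water", "ISSSIS"),
    ("sediment", "ISSSIS"),
    ("snail", "SSSSSS"),
    ("snail", "SSSSSS"),
]

# For each niche, tally how many isolates are non-susceptible to 0, 1, 2, ... antibiotics.
profile = defaultdict(Counter)
for niche, calls in toy_calls:
    profile[niche][non_susceptible_count(calls)] += 1

for niche, counts in profile.items():
    print(niche, dict(sorted(counts.items())))
```

Tallies of this kind only describe the distribution per niche. Whether two niches differ would still need a formal statistical comparison.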
In conclusion, sediment, water, and water plant populations showed similarities in phylogroup distribution, occurrence of virulence genes, and antibiotic resistance patterns, indicating that individual strains of this population can associate with any of these three niches. Snail-associated populations were different, and contained several apparently novel E. coli strains, primarily belonging to phylogroup B2. Bovine fecal populations from the adjoining pasture were different based on phenotype and genotype, and not similar to any aquatic isolates. The distinct distribution patterns of E. coli strains indicate niche preference, with specific aquatic strains not associating with snails or cattle.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/life11101020/s1. Figure S1: Multinomial log-linear regression analysis of phylogroup distribution of isolates across sample types. Phylogrouping was performed according to the scheme of Clermont et al., 2013. The x axis denotes phylogroups and the y axis represents the proportion of isolates. Sed: sediment, W: water, WP: water plant, SN: snail. Figure S2: Virulence gene distribution across isolates allocated to phylogroups based on the scheme of Clermont et al., 2013. The number of isolates for each phylogroup is given in parentheses on the x axis. Figure S3: Distribution of sensitive isolates (0) and isolates displaying intermediate resistance to 1, 2, 3, 4, or 5 antibiotics from the five sampling sites. Table S1: Primers used for determining the uidA and mutS genes, and for phylogrouping. Table S2: Primers used for amplification of virulence genes.