Genomic Profiling of Antibiotic-Resistant Escherichia coli Isolates from Surface Water of Agricultural Drainage in North-Western Mexico: Detection of the International High-Risk Lineages ST410 and ST617

Aquatic environments are recognized as one of the main reservoirs for the emergence and dissemination of high-risk lineages of multidrug-resistant (MDR) bacteria of public health concern. However, genomic data on antibiotic-resistant Escherichia coli isolates of aquatic origin remain limited. Herein, we examined the antibiotic resistance and virulence genomic profiles of three E. coli isolates recovered from surface water in northwest Mexico. Antimicrobial susceptibility testing, whole-genome sequencing (WGS), and in-depth in silico analysis were performed. Two isolates exhibited MDR phenotypes. WGS-based typing revealed genetic diversity, and phylogenetic analysis corroborated a notable divergence among the studied isolates. One E. coli strain, harboring enterotoxigenic and extraintestinal pathogenic-associated virulence genes, was assigned to the ST4 lineage. The MDR E. coli strains, belonging to the international high-risk clones ST410 and ST617, carried genes and mutations conferring resistance to aminoglycosides, β-lactams, quinolones, sulfonamides, tetracyclines, and trimethoprim. This study describes, for the first time, the detection and genomic profiling of the high-risk E. coli lineages ST410 and ST617 from surface water in Mexico. Additionally, our results underscore the role of surface water as a reservoir for critical pathogenic and MDR E. coli clones and the need for the surveillance and monitoring of aquatic environments via WGS from the One Health perspective.


Introduction
Antimicrobial resistance (AMR) has been a primary global public health issue in recent decades. AMR hinders clinical effectiveness and limits the therapeutic options for infectious diseases, leading to treatment failure and increased healthcare costs, morbidity, and mortality. At present, AMR-related deaths are estimated at 700,000 annually, and this figure is projected to rise to 10 million deaths per year by 2050 unless action is taken [1]. The AMR phenomenon represents a complex and multifaceted threat of growing concern to human, animal, and environmental health due to the emergence, persistence, and spread of antibiotic resistance genes (ARGs) and their acquisition by clinically relevant bacteria in each interconnected sector [2].
The importance of the environment as a reservoir of antibiotic resistance and its role in the spread of potential ARGs has been widely recognized. Antibiotic resistance can arise either by mutational mechanisms or by resistance-conferring gene acquisition via horizontal gene transfer (HGT) events, that is, conjugation, transformation, or transduction. HGT is considered the most important factor in the recruitment and transmissibility of ARGs from the environmental gene pool to pathogenic bacteria [3].
Among all environmental compartments, aquatic ecosystems are constantly subjected to anthropogenic contamination with antibiotics through the direct or indirect discharge of agricultural, aquaculture, domestic, hospital, and industrial effluents. The selective pressure exerted by antibiotic contamination drives the resident bacterial communities towards the selection of antibiotic-resistant bacteria (ARB) and the enrichment of drug-resistant genetic determinants [4]. Furthermore, aquatic environments have dynamic and distinct microbial composition patterns influenced by temporal and spatial disparities in physicochemical and biotic factors, including environmental stresses and nutrient composition [5]. Consequently, aquatic milieus constitute an ideal reservoir for microbial genetic exchange, evolution, and dissemination of ARGs and ARB.
Among the ARB harboring transmissible resistance traits of greatest clinical relevance are extended-spectrum β-lactamase (ESBL)-producing Enterobacteriaceae, which have become a challenge for infection control because of their worldwide endemicity and multidrug-resistant (MDR) phenotypes [6,7]. Escherichia coli (E. coli) is a highly versatile member of the Enterobacteriaceae family that constitutes an important component of the commensal gastrointestinal microbiota of warm-blooded animals, including humans. However, its genome plasticity and HGT mechanisms have led to the evolution of this microorganism into highly adapted antibiotic-resistant and pathogenic clones [8].
Depending on the site of infection, pathogenic E. coli strains are classified as intestinal pathogenic E. coli (InPEC) or extraintestinal pathogenic E. coli (ExPEC) and further subcategorized into distinct pathotypes based on clinical disease manifestations, virulence factors (VFs), and genetic background [9,10]. The long-term survival of E. coli has been described in secondary habitats, including food products, sediments, soil, and water [11,12]. Thus, environmental E. coli strains could be a source of opportunistic and MDR-related diseases of public health significance in animals and humans; therefore, such strains must be monitored and tracked.
Several studies have identified antibiotic-resistant and pathogenic E. coli strains in environmental water samples from Mexico [13][14][15][16][17]. Previous studies in Sinaloa, Mexico, have reported E. coli as a major biological pollutant of diverse aquatic resources, including agricultural drainage, drinking, and irrigation water [18][19][20]. Canizalez-Roman et al. [21] demonstrated multidrug-resistant InPEC strains in surface water samples from irrigation channels and river water across Sinaloa State, where they exhibited resistance to β-lactams, tetracycline, and trimethoprim/sulfamethoxazole.
To the best of our knowledge, genomic data describing the mechanisms underlying antibiotic resistance and virulence properties of E. coli from environmental water samples in Mexico are extremely limited. Therefore, this study aimed to examine, from a genomic perspective, the antibiotic resistance patterns and virulence determinants of E. coli isolates recovered from surface water of an agricultural drainage ditch that constantly receives irrigation water and domestic wastewater. In this study, we describe the genomic characteristics and determine the genetic diversity regarding the virulence-associated and antimicrobial resistance genes, plasmids, and prophage content of three antibiotic-resistant E. coli strains from the aquatic environment, two of which are international high-risk clones.

DNA Extraction, Whole-Genome Sequencing (WGS), and Read Preprocessing
For DNA extraction, each E. coli isolate was grown in Luria-Bertani (LB) broth at 37 ± 2 °C for 24 h. Genomic DNA (gDNA) was extracted using the ZymoBIOMICS DNA Miniprep Kit (Zymo Research Corp., Irvine, CA, USA) according to the manufacturer's protocol. The purity, concentration, and integrity of gDNA were assessed by spectrophotometry (Nanodrop Lite, Thermo Fisher Scientific, Waltham, MA, USA) and electrophoresis on a 1.0% agarose gel. Paired-end DNA libraries were prepared using the Nextera XT Library Preparation Kit (Illumina, San Diego, CA, USA) following the manufacturer's instructions, and sequenced on the Illumina MiniSeq platform with a 2 × 150 bp approach. Sequence data quality was evaluated using the FastQC software v.0.11.9 [26]. Raw sequencing reads were quality trimmed and filtered using Cutadapt v2.4 [27] to remove adapters, low-quality bases (<Q30), and short reads (<50 bp). The processed reads were submitted to the KmerFinder v3.2 [28] web tool from the Center for Genomic Epidemiology (CGE) (http://www.genomicepidemiology.org/, accessed on 20 November 2021) for species confirmation.
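The trimming and length-filtering criteria stated above (Q30 cutoff, 50 bp minimum) can be illustrated with a minimal Python sketch. It uses a running-sum rule for trimming the 3' end, similar in spirit to the approach described in the Cutadapt documentation, but it is not the tool's code and is not the pipeline used in this study. The function names are hypothetical, qualities are assumed to be already decoded into integer Phred scores, and adapter removal is not shown.

```python
def quality_trim_index(qualities, cutoff=30):
    """Return how many bases to keep after trimming the 3' end.

    Running-sum rule: walk from the 3' end, accumulate (quality - cutoff),
    and cut where that sum is lowest. Stop once the sum becomes positive.
    """
    running_sum = 0
    lowest_sum = 0
    keep = len(qualities)
    for i in range(len(qualities) - 1, -1, -1):
        running_sum += qualities[i] - cutoff
        if running_sum > 0:
            break
        if running_sum < lowest_sum:
            lowest_sum = running_sum
            keep = i
    return keep


def preprocess_read(sequence, qualities, cutoff=30, min_length=50):
    """Trim low-quality 3' bases, then drop the read if it is too short.

    Returns (sequence, qualities) for a retained read, or None if discarded.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    keep = quality_trim_index(qualities, cutoff)
    if keep < min_length:
        return None
    return sequence[:keep], qualities[:keep]


# Hypothetical 60-base read whose last five bases have low quality.
read = preprocess_read("A" * 60, [35] * 55 + [10] * 5)
print(len(read[0]) if read else "discarded")
```

In this toy case the five low-quality bases are removed and the 55-base remainder passes the length filter; a 52-base read with the same tail would be discarded.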

Plasmid Replicons and Prophage Regions
Plasmid replicon identification was performed using PlasmidFinder v2.1 from the CGE with minimum thresholds of 90% nucleotide identity and 60% coverage [40]. Prophage sequences in the E. coli genomes were identified with the Phage Search Tool Enhanced Release (PHASTER) (https://phaster.ca/, accessed on 21 November 2021). Prophage regions are classified as intact, questionable, or incomplete when the prophage sequence score is >90, 70-90, or <70, respectively [41]. Only prophages identified as intact were used in the current analysis.
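As a minimal sketch of how these thresholds could be applied to tabulated tool output, the following Python functions filter replicon hits and classify prophage scores. The function names are hypothetical, and the handling of scores exactly equal to 70 or 90 (treated here as questionable) is an assumption of this sketch rather than a statement about how PHASTER resolves those boundaries.

```python
def passes_replicon_thresholds(identity, coverage,
                               min_identity=90.0, min_coverage=60.0):
    """True if a replicon hit meets the identity and coverage cutoffs (in %)."""
    return identity >= min_identity and coverage >= min_coverage


def classify_prophage(score):
    """Map a prophage completeness score to a category.

    Assumption: scores of exactly 70 or 90 are treated as 'questionable'.
    """
    if score > 90:
        return "intact"
    if score >= 70:
        return "questionable"
    return "incomplete"


# Hypothetical scores, for illustration only.
for score in (150, 80, 40):
    print(score, classify_prophage(score))
```

Keeping only regions for which `classify_prophage` returns "intact" mirrors the selection criterion described above.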

Phylogenetic Analysis
To place our three sequenced E. coli isolates into a global context, a collection of isolates with the same ST (i.e., ST4, ST410, and ST617) was selected from EnteroBase using the Achtman seven-gene MLST scheme as a query (https://enterobase.warwick.ac.uk/species/index/ecoli, accessed on 2 March 2022) [42]. Phylogenetic analysis was performed using GrapeTree to construct a minimum spanning tree based on the core-genome MLST (cgMLST) V1 + Hierarchical Clustering (HierCC) V1 scheme from EnteroBase through the rapid neighbor-joining (RapidNJ) algorithm [43].
Because the ST410, ST617, and ST4 phylogenetic trees each comprised more than 200 isolates, refined subtrees were generated from the clades containing E. coli ADD147, ADD167, and ADD183. Branch lengths were used to calculate the cgMLST allelic differences between closely related isolates. Metadata of the E. coli isolates used for ST410, ST617, and ST4 subtree reconstruction are described in Supplementary Tables S1-S3, respectively.
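The notion of a cgMLST allelic difference can be made concrete with a short Python sketch. Note that this is a different route from the one described above: it counts differing loci directly from two allele profiles instead of reading branch lengths from a tree. The function name, the use of 0 to encode a missing allele, and the toy five-locus profiles are all assumptions made for illustration.

```python
def allelic_differences(profile_a, profile_b, missing=0):
    """Count loci at which two allele profiles carry different alleles.

    Loci where either profile has the `missing` code are ignored.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci in the same order")
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a != missing and b != missing and a != b
    )


# Toy profiles over five hypothetical loci (not real cgMLST data).
isolate_x = [1, 2, 3, 4, 5]
isolate_y = [1, 7, 3, 0, 6]
print(allelic_differences(isolate_x, isolate_y))
```

Ignoring loci with missing calls avoids inflating distances for draft assemblies, at the cost of comparing slightly different locus sets for each pair.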

Phylogroups, Serotype, and Sequence Types
Based on the WGS data, in silico characterization was performed to gain insights into the genetic diversity of these E. coli isolates. In summary, two phylogenetic groups were identified. E. coli ADD167 and ADD183 were assigned to phylogroup A, whereas ADD147 was allocated to phylogroup C. The O:H typing predicted three different serotypes among the sequenced E. coli genomes, of which E. coli ADD147 was O-nontypable (ONT). Similarly, three distinct STs were detected, assigning E. coli ADD147, ADD167, and ADD183 to ST410, ST617, and ST4, respectively. Based on MLST data, both ST4 and ST617 were grouped within clonal complex 10 (CC10) (Table 1).
Analysis of the distribution of VAGs indicated that the strains within CC10 (ADD167 and ADD183) partially shared a virulence profile. However, E. coli ADD183 also displayed genes involved in capsular polysaccharide biosynthesis (kpsE, kpsM_K15), increased serum survival (traT), toxin production (ltcA, heat-labile enterotoxin A subunit), and protease production (eatA, encoding the serine protease autotransporter of Enterobacteriaceae (SPATE) EatA). A key feature of the enterotoxigenic E. coli (ETEC) pathotype, a member of the InPEC group, is the ability to express heat-labile (LT) and/or heat-stable (ST) enterotoxins [8]. Although further exploration is required, E. coli ADD183 was assigned to the ETEC pathotype according to the virulence profile predicted in this study.

Resistance Profile and Antimicrobial Resistance Genes (ARGs)
The phenotypic antimicrobial susceptibility was determined by disk diffusion according to CLSI standards. In summary, the phenotypic resistance profiles of E. coli ADD147 and ADD167 were comparable, and both strains were classified as MDR bacteria owing to their resistance to antibiotics from the β-lactam, fluoroquinolone, and folate pathway antagonist antimicrobial classes (Table 2) [25]. Both MDR E. coli strains showed resistance to ampicillin, carbenicillin, ciprofloxacin, norfloxacin, and trimethoprim/sulfamethoxazole. However, E. coli ADD147 further exhibited resistance to chloramphenicol, gentamicin, and netilmicin, whereas E. coli ADD167 was resistant to first- and third-generation cephalosporins (i.e., cephalothin and cefotaxime), suggesting a presumptive ESBL phenotype. In addition, E. coli ADD167 was not susceptible to gentamicin and nitrofurantoin. In contrast, ADD183 was sensitive to all antibiotics except ciprofloxacin, to which it was not susceptible (Table 2).
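The MDR classification can be sketched in Python as a count of antimicrobial classes with at least one non-susceptible result. The threshold of three or more classes is an assumption of this sketch, chosen because it is a commonly used MDR criterion; the source attributes its criterion to reference [25] without restating it here. The agent-to-class mapping follows the grouping used in the text (all β-lactams counted as one class), and the function names are hypothetical.

```python
# Agent-to-class mapping, following the grouping used in the text.
ANTIBIOTIC_CLASS = {
    "ampicillin": "beta-lactam",
    "carbenicillin": "beta-lactam",
    "cephalothin": "beta-lactam",
    "cefotaxime": "beta-lactam",
    "ciprofloxacin": "fluoroquinolone",
    "norfloxacin": "fluoroquinolone",
    "trimethoprim/sulfamethoxazole": "folate pathway antagonist",
    "chloramphenicol": "phenicol",
    "gentamicin": "aminoglycoside",
    "netilmicin": "aminoglycoside",
    "nitrofurantoin": "nitrofuran",
}


def resistant_classes(agents):
    """Return the set of antimicrobial classes covered by the given agents."""
    return {ANTIBIOTIC_CLASS[agent] for agent in agents}


def is_mdr(agents, min_classes=3):
    """Assumed criterion: non-susceptible in at least `min_classes` classes."""
    return len(resistant_classes(agents)) >= min_classes


# Non-susceptible agents per isolate, as reported in the paragraph above.
profiles = {
    "ADD147": ["ampicillin", "carbenicillin", "ciprofloxacin", "norfloxacin",
               "trimethoprim/sulfamethoxazole", "chloramphenicol",
               "gentamicin", "netilmicin"],
    "ADD167": ["ampicillin", "carbenicillin", "ciprofloxacin", "norfloxacin",
               "trimethoprim/sulfamethoxazole", "cephalothin", "cefotaxime",
               "gentamicin", "nitrofurantoin"],
    "ADD183": ["ciprofloxacin"],
}

for isolate, agents in profiles.items():
    print(isolate, len(resistant_classes(agents)), is_mdr(agents))
```

Under these assumptions ADD147 and ADD167 each span five classes and ADD183 one, which agrees with the classification reported above.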
With regard to (fluoro)quinolone resistance, no plasmid-mediated quinolone resistance (PMQR) genes, such as qnr, qepA, or oqxAB, were observed. However, nucleotide sequence analysis of the quinolone resistance-determining region (QRDR) revealed point mutations leading to non-synonymous substitutions in the DNA gyrase and topoisomerase IV targets. In particular, a double amino acid substitution was found in GyrA (S83L, D87N), whereas the S80I and S458A substitutions were identified in ParC and ParE, respectively (Table 2). No additional chromosomal point mutations conferring antibiotic resistance were identified.
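The substitution notation used above (for example, S83L means serine at position 83 replaced by leucine) can be derived from a pairwise comparison of protein sequences. The Python sketch below assumes the two sequences are already aligned without gaps; the function name is hypothetical, and the short peptide strings are illustrative placeholders, not sequences from the isolates or from any reference genome.

```python
def call_substitutions(reference, query, first_position=1):
    """List amino acid substitutions between two gap-free aligned peptides.

    `first_position` is the residue number of the first character, so that
    a fragment of a longer protein can be reported in full-length numbering.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref_aa}{first_position + i}{query_aa}"
        for i, (ref_aa, query_aa) in enumerate(zip(reference, query))
        if ref_aa != query_aa
    ]


# Illustrative eight-residue fragments, numbered from position 81.
print(call_substitutions("GDSAVYDT", "GDLAVYNT", first_position=81))
```

A real analysis would first translate and align the full gene against a wild-type reference; this sketch only covers the final comparison step.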
PHASTER analysis of the E. coli genomes identified 33 phage-related sequences. However, only seven prophage regions were predicted to be intact and considered for further analysis (Table 3); these showed homology to six different phages. Their lengths ranged from 11 to 53 kb, and the most frequently detected was Enterobacteria phage P88. Predicted prophage regions were manually inspected to identify putative active phage signatures, that is, attachment site (att) sequences (attL and attR) and genes encoding structural proteins, DNA packaging and regulation, insertion into the host, and lysis. Prophages similar to VT2phi_272, pro483, and SH2026Stx1 lacked attachment sites, suggesting defective phages. Conversely, the prophages BP-4795, P88, and P4 possessed att sequences; however, the P4-like prophage cargo contained only an integrase and a few structural-related proteins, and lacked a recombinase, terminase, transposase, and lysis module. Except for the iss gene encoded in the BP-4795 prophage, none of the predicted prophages carried further genetic elements implicated in virulence or antimicrobial resistance properties.

Phylogenetic Analysis
The EnteroBase search resulted in a total of 1522, 508, and 209 isolates available for phylogenetic analysis of E. coli ST410, ST617, and ST4 lineages, respectively, including ADD147, ADD167, and ADD183. All of them are derived from diverse isolation sources.
Overall, the global collections of the ST410, ST617, and ST4 lineages were dominated by human-derived isolates and, to a lesser extent, by environment-sourced isolates. Of note, a considerable number of isolates in each collection are of unknown origin (Supplementary Figures S1-S3). A similar trend was observed during the examination of refined subtrees. All three sequenced E. coli were clustered with closely related human-derived E. coli isolates (Figure 1a-c).
E. coli ADD147/ST410 was placed within a sub-cluster comprising companion animal, environmental, human, and poultry-sourced isolates that showed 60-84 cgMLST allelic differences to E. coli ADD147 (Figure 1a). Notably, the twelve ST410 isolates belonging to this sub-clade were predicted as serotype ONT:H9. E. coli ADD147 was most closely related to the unknown host source isolate ESC_EA7552AA from Vietnam (60 cgMLST allelic differences) and two human blood-sourced isolates ESC_OA5418AA and ESC_GA9568AA from Canada and Singapore (61-78 cgMLST allelic differences), respectively.

Figure 1. GrapeTree phylogenetic subtrees of the ST410, ST617, and ST4 lineages based on core-genome multilocus sequence typing (cgMLST) distances. The phylogeny was reconstructed using the rapid neighbor-joining (RapidNJ) algorithm and the core-genome MLST (cgMLST) V1 + Hierarchical Clustering (HierCC) V1 scheme from EnteroBase. E. coli isolates from this study are highlighted with a red circle. Isolate ID is indicated as the EnteroBase Barcode. The nodes are color-coded according to the isolation source niche as indicated in the key, and the numbers in brackets represent the number of isolates. Scale bar indicates the number of cgMLST allelic differences.
Similarly, E. coli ADD167/ST617 was located in a discrete cluster consisting of six isolates of animal, environmental, and human origin that showed 28-38 cgMLST allelic differences to E. coli ADD167 (Figure 1b). According to cgMLST analysis, the isolates most closely related to E. coli ADD167 were ESC_EA3834AA, recovered from a hospital environment in Germany (26 cgMLST allelic differences), and the human blood-sourced isolate ESC_LA4981AA from the USA (31 cgMLST allelic differences).
Notably, according to EnteroBase and GrapeTree phylogenetic analysis of the global collection of the ST4 lineage, E. coli ADD183 is the only water-sourced isolate reported. E. coli ADD183/ST4 was clustered with human-associated isolates that showed 2-31 cgMLST allelic differences to E. coli ADD183 (Figure 1c). Furthermore, it was placed into a sub-clade containing five unknown host source E. coli isolates from the USA. The isolates most closely related to E. coli ADD183 were ESC_HA3106AA, ESC_HA3105AA, and ESC_HA3103AA, differing by only two, three, and four cgMLST alleles, respectively.

Discussion
Aquatic environments are one of the main reservoirs and transmission sources for the dissemination of antibiotic resistance. This study constitutes the first genomic analysis of three antibiotic-resistant E. coli strains from surface water of agricultural drainage receiving raw irrigation water and domestic effluents in northwest Mexico (Sinaloa State). The objective of this genome interrogation was to determine the genetic diversity in terms of virulence features, antimicrobial resistance gene carriage, plasmids, and prophage content and to portray the potential risk that environmental E. coli could pose to animal or human health. Although the studied E. coli strains were isolated from the same area and period, they showed dissimilar genomic profiles. The differences observed in the phenotypic and genomic profiles could be partially explained by (1) the different evolutionary scenarios to which each E. coli strain was subjected [44]; (2) the accessory genome dynamics [45]; and (3) the environmental selective pressure imposed by the constant inflow of contaminated agricultural and domestic sewage on the studied agricultural drainage [22]. To address this knowledge gap, large-scale genomic epidemiological studies in the One Health framework are needed.
The main concern was the identification of international high-risk clones of E. coli carrying multiple acquired antibiotic resistance determinants in the agricultural drainage ditch in the Sinaloa Valley. In this area, intensive agriculture is practiced, and the drained water is subsequently reused for agricultural irrigation, livestock farming, and aquaculture purposes [20]. The "high-risk clone" designation has been used to describe bacterial lineages that enhance the dissemination of antibiotic resistance [46], and a prominent example of this is the ExPEC ST131, a successful globally disseminated clone associated with multiple antimicrobial resistance and enhanced pathogenicity and fitness features [47,48].
The MDR CTX-M-15-positive E. coli strain ADD167 belongs to the ST617 lineage, a high-risk clone member of the internationally widespread CC10, which is related to clinical strains found in animal, environmental, food, and human samples worldwide and is predominantly associated with the CTX-M-15 and OXA-1 β-lactamases, and to a lesser extent, with CMY-2 [57][58][59][60][61]. Recent studies have reported the mobile colistin-resistance gene mcr-1 in E. coli ST617 from humans and food-producing animals, which threatens the effectiveness of polymyxins, one of the last-resort drug options for treating infections caused by multidrug- and carbapenem-resistant Gram-negative bacteria [62,63]. In this study, no acquired resistance determinants for the last-resort treatment options were observed.
Based on the investigation of virulence factors, a low abundance of VAGs was predicted in the high-risk clones ADD147 (ST410) and ADD167 (ST617). In contrast, strain ADD183 (O6:H16) belongs to the ETEC O6 group, the most common ETEC serogroup, which has been involved in multiple outbreaks and sporadic infections internationally and typically carries both enterotoxins [64][65][66]. In contrast to what has been reported for ETEC O6 strains [65,67], E. coli ADD183 harbored additional nonclassical virulence factors found in ETEC (eatA) and ExPEC (iss, kpsE, and kpsM_K15) strains, implying the potential of E. coli ADD183 to cause both intestinal and extraintestinal diseases. However, further studies are needed to confirm this hypothesis.
Available information from previous studies shows that clinically sourced colistin-resistant E. coli belonging to the ST410, ST617, and ST4 lineages have been identified worldwide. In particular, ST410 isolates have been reported in Asia (China and Vietnam), Europe (Italy), and America (Brazil), while ST617 isolates have been described in Asia (China, Korea, and Taiwan) and Europe (France and Italy). ST4 isolates, in turn, have been reported in Saudi Arabia [68]. It is noteworthy that at present, there is no information regarding the prevalence or distribution of the E. coli ST410 and ST617 lineages in Mexico, and the only report of a human-derived ETEC strain belonging to the ST4 lineage is that of Saldaña-Ahuactzi et al. [69].
Although determining the origin or direct source of the isolates was not part of this study, the phylogenetic analysis based on cgMLST elucidated their genetic relatedness and gave partial insight into the clinical relevance of the three E. coli strains studied here. Indeed, according to this analysis, it is most likely that E. coli ADD147, ADD167, and ADD183 are human-derived strains that could have been introduced into the agricultural drainage by the inflow of untreated domestic wastewater [20,22].
This study had several limitations. The results of the current analysis of a small number of E. coli isolates from surface water resources of the selected area cannot be extrapolated to aquatic ecosystems dedicated to agricultural practices at the regional or national level. Nonetheless, despite the limited number of E. coli isolates analyzed, our work highlights the need for extensive epidemiological and genomic studies on E. coli from the aquatic environment to understand the genetic diversity and the ongoing circulation of multidrug-resistant high-risk clones in the region.

Conclusions
To date, genomic and molecular data describing the mechanisms underlying antibiotic resistance and virulence properties in E. coli from aquatic environments are limited. To the best of our knowledge, this is the first report of genomic profiling of antibiotic-resistant E. coli strains from the surface water of agricultural drainage in Mexico. Our study highlights the identification of the international high-risk E. coli clones ST410 and ST617 carrying multiple antibiotic resistance determinants, including the widespread CTX-M-15 ESBL, in drained water that is subsequently reused for different practices, such as agricultural irrigation, livestock farming, and aquaculture. Moreover, the presence of an enterotoxigenic E. coli strain harboring ExPEC-associated virulence factors was also evidenced. These results also underscore the role of surface water as a potential reservoir for critical pathogenic and multidrug-resistant E. coli clones and the urgent need for monitoring and tracking these bacterial populations to prevent their ongoing dissemination. Finally, the results of the present study will contribute to antibiotic resistance and virulence surveillance by employing WGS-based methodologies under the One Health concept.