Comprehensive Genome and Plasmidome Analysis of Antimicrobial Resistant Bacteria in Wastewater Treatment Plant Effluent of Tokyo

To characterize environmental antimicrobial resistance (AMR) in urban areas, extended-spectrum β-lactamase- (ESBL)/carbapenemase-producing bacteria (EPB/CPB, respectively) from urban wastewater treatment plant effluents in Tokyo were isolated on CHROMagar ESBL plate. Complete genome sequence analysis, including plasmids, indicated that 126 CTX-M-positive isolates (31%) were identified among the 404 obtained isolates. The CTX-M-9 group was predominant (n = 65, 52%), followed by the CTX-M-1 group (n = 44, 35%). Comparative genome analysis revealed that CTX-M-27-positive E. coli O16:H5-ST131-fimH41 exhibited a stable genome structure and clonal-global dissemination. Plasmidome network analysis revealed that 304 complete plasmid sequences among 85 isolates were grouped into 14 incompatibility (Inc) network communities (Co1 to Co14). Co10 consisted of primarily IncFIA/IncFIB plasmids harboring blaCTX-M in E. coli, whereas Co12 consisted primarily of IncFIA(HI1)/Inc FIB(K) plasmids harboring blaCTX-M, blaKPC, and blaGES in Klebsiella spp. Co11 was markedly located around Co10 and Co12. Co11 exhibited blaCTX-M, blaKPC, and blaNDM, and was mainly detected in E. coli and Klebsiella spp. from human and animal sources, suggesting a mutual role of Co11 in horizontal gene transfer between E. coli and Klebsiella spp. This comprehensive resistome analysis uncovers the mode of relational transfer among bacterial species, highlighting the potential source of AMR burden on public health in urban communities.


Introduction
The World Health Organization has endorsed a global action plan for antimicrobial resistance (AMR), which calls upon all nations to adopt mitigation strategies [1]. However, there is still a need to fully understand the ecology and evolution of AMR based on a one-health approach. In particular, the properties of the microbial resistome in ecosystems dominated by humans and how to monitor such environmental factors to evaluate their potential risk for promoting the evolution of AMR have to be sufficiently characterized. Instead of the limited nosocomial AMR surveillance, sewage AMR surveillance could highlight a broader picture of the global burden of AMR, including local-specific features, such as urban, suburban, industrial, and agricultural, based on the one-health approach.
There is a growing concern for sludge management due to the high levels of contaminants, and the design of current wastewater treatment plants (WWTPs) does not restrict the elimination of emerging contaminants and their metabolites [2]. These contaminants are released into rivers or streams as sewage effluents without interruption. The prevalence of AMR bacteria (ARB) and AMR genes (ARGs) in rivers would increase downstream of WWTPs because of effluents [3,4]. Indeed, the high density of bacteria in WWTPs could provide an optimum environment for horizontal gene transfer (HGT) between human pathogens and environmental bacteria [5]. Moreover, such environmental bacteria persist in the environment and later they might transfer back such genes to clinically relevant pathogens [6]. Therefore, wastewater and active sludge in WWTPs can act as reservoirs and environmental suppliers of antibiotic resistance, implying that they are hotspots for HGT under the selective pressure of antibiotics, disinfectants, and metals, even at low concentrations, enabling the dissemination of antibiotic resistance genes among different bacterial species. ARGs are located in multidrug-resistance (MDR) plasmids, which could have been transferred into broad recipient targets among different Proteobacteria strains, indicating a high possibility of HGT among bacteria in wastewater [7].
Thus far, the potential correlation between the integron intI1 and anthropogenic pollution in WWTPs has been documented among the main hotspots of ARGs in the environment [8]. The importance of the synergy between plasmids and ARGs with flanking insertion sequences (ISs) has also been documented to potentiate the emergence of MDR-hypervirulent clones between WWTP-and human/animal-associated bacteria [5]. Ekanzala et al. reported that hospital wastewater showed significantly higher environmental resistome risk scores than all other assessed matrices [9,10], suggesting that hospital wastewater, effluent, and sewage sludge should be subjected to stringent mitigation measures to minimize such dissemination. Therefore, urban WWTPs are among the main sources of ARB and ARGs released into the environment [11]. However, the evolution of resistance and the spread of ARGs in WWTPs have not been widely researched and clearly evidenced.
Regarding a comprehensive global plasmidome analysis, Redondo-Salvo et al. characterized the global map of all publicly available over 10,000 reference prokaryotic plasmid sequences, including Proteobacteria, Firmicutes, Actinobacteria, Spirochaetes, Cyanobacteria, and Archaea [12,13]. Global analysis showed that more than 60% of the plasmids were grouped with host ranges beyond the species barrier, although plasmid transmission was constrained by taxonomic boundaries. Another report using long-read sequencing and an improved bioinformatics workflow for global sewage samples revealed marked AMR transmission among plasmids in complex untreated domestic sewage [13].
In this study, we characterized the complete genome sequences of extended-spectrum β-lactamase (ESBL)-or carbapenemase-producing bacteria (EPB or CPB) isolated from urban WWTP effluents in Tokyo. We have analyzed a marked genomic feature of CTX-Mpositive Enterobacterales isolates that could be reflected in the urban areas of Tokyo as AMR surveillance. In addition, we determined the complete genome sequence of AMR plasmids in the EPB/CPB isolates to perform accurate gene network analysis (plasmidome analysis) to assess plasmid transmission among bacterial species.
Core-genome single-nucleotide variation (SNV) phylogeny (see the detailed procedures in Figure S1) of the ST131 isolates were compared with publicly available E. coli ST131 draft or complete genome sequences (315 strains; see Table S4), suggesting that the four subclonal types of ST131 were primarily identified from the WWTP effluents, two isolates of ST131-fimH41 with bla CTX-M-27 , seven isolates of ST131-fimH30 with bla CTX-M-27 , one isolate of ST131-fimH30 with bla CTX-M-3 , and one isolate of ST131-fimH30 with bla CTX-M-15 ( Figure 2).
One isolate (WP5-S18-ESBL-09) of O16:H5-ST131-fimH41 with bla CTX-M-27 exhibited marked clonality, showing less than 12 SNVs with clinical isolates from several countries (Southeast Asia, EU, and Oceania) ( Figure 3A). Genome recombination and pan-genome analysis suggested that the genome structure of these strains was very stable, except for some ARGs that were transferred with conjugative plasmids ( Figure 3B).
For instance, all E. coli ST38 isolates (n = 10) were ST38-fimH5-and CTX-M-14-positive. However, these were classified into different serotypes (O86:H18 and O50/O2:H30), suggesting that at least two major serotypes of E. coli clones could play a pivotal role as hosts for bla CTX-M-14 gene dissemination in the global environment ( Figure 4, panel ST38). Core-genome SNVs phylogeny of E. coli ST131 isolates using publicly available ST131 draft or complete genome sequences (total 315 strains, see Table S4). Isolates in this study was highlighted by red in the strain name (see detail in Tables S2 and S3).
One isolate (WP5-S18-ESBL-09) of O16:H5-ST131-fimH41 with blaCTX-M-27 exhibited marked clonality, showing less than 12 SNVs with clinical isolates from several countries (Southeast Asia, EU, and Oceania) ( Figure 3A). Genome recombination and pan-genome analysis suggested that the genome structure of these strains was very stable, except for some ARGs that were transferred with conjugative plasmids ( Figure 3B).

Figure 2.
Core-genome SNVs phylogeny of E. coli ST131 isolates using publicly available ST131 draft or complete genome sequences (total 315 strains, see Table S4). Isolates in this study was highlighted by red in the strain name (see detail in Tables S2 and S3).
Regarding E. coli ST602 (n = 5 isolates), four isolates appeared to be clones because they were isolated from the same sample (WP2-W18-CRE-xx stands for isolate ID with meropenem selection at the WWTP WP2 site in winter 2018). The inclusion of publicly available ST602 genomes indicated that the CTX-M-27-positive O9:H9-ST602-fimH86 isolate could be notable in Japan (Figure 4, panel ST602).

Other CTX-M Positive Bacteria
Among the 88 isolates of Aeromonas species (Table 1), three isolates of A. hydrophila, seven isolates of A. caviae, one isolate of A. media, and four isolates of A. veronii were identified as CTX-M producers (Table S2). Although four isolates of CTX-M-14 producing A. caviae were detected in different WWTP effluents (WP2, 4, and 7), core-genome SNV phylogeny revealed that they were clonal in 21 SNVs with the same recombination and pangenome regions ( Figure 5A), suggesting that the A. caviae clone may be successfully predominating in the general WWTP environment in Tokyo but not in WWTPs at specific locations.

Plasmidome Analysis of EPB and CPB Isolated from WWTP Effluents and Environment
ESBL and carbapenemase β-lactamase genes were acquired by conjugative plasmid transfer among variable strains from the Proteobacteria group. Genetic network analysis among AMR plasmids could be useful to uncover the modes of relational transfer and trace dissemination. Of the 404 EPB/CPB isolates (Table 1), 85 were identified as complete genome sequences in this study (Table S2). Moreover, out of the 304 complete plasmid Among the 48 isolates of Klebsiella species (Table 1), eight isolates of K. pneumoniae, seven isolates of K. quasipneumoniae, two isolates of K. variicola, and seven isolates of other Klebsiella spp. were identified as CTX-M producers (Table S2). Core-genome SNV phylogeny revealed that four isolates of CTX-M-3-producing K. quasipneumoniae ST668 exhibited clonality in 14 SNVs and shared similar recombination and pan-genome regions ( Figure 5B), although these isolates were obtained from the same place (WP5) but at different sampling times (summer 2017, summer 2018, and winter 2019), suggesting that the ST668 clone may remain in active sludge at WP5 for at least one year and more.

Plasmidome Analysis of EPB and CPB Isolated from WWTP Effluents and Environment
ESBL and carbapenemase β-lactamase genes were acquired by conjugative plasmid transfer among variable strains from the Proteobacteria group. Genetic network analysis among AMR plasmids could be useful to uncover the modes of relational transfer and trace dissemination. Of the 404 EPB/CPB isolates (Table 1), 85 were identified as complete genome sequences in this study (Table S2). Moreover, out of the 304 complete plasmid sequences from the 85 strains described above, 73 β-lactamase-positive plasmids (Table S5) were subjected to plasmidome network analysis based on orthologous analysis (see details in Figure S2). To characterize global ARG dissemination through plasmid transfer, 758 publicly available complete plasmid sequences showing a clear description of the isolation and source were selected from 19,904 complete plasmid sequences. A total of 831 complete plasmids (73 in this study and 758 in the public domain; see Table S6) were subjected to pangenome analysis using Roary and NMDS (vegan in the R package), and the results revealed that 14 communities were classified and clustered ( Figure 6).
Most network communities (Co) showed notable incompatibility (Inc) replicon-dependent distribution, suggesting that the network analysis was well-performed based on the genetic features of each Inc replicon. The major Co comprised Co10, Co11, and Co12 of the shared 423 plasmids among the tested 831 plasmids. Co10 was primarily IncFIA and IncFIB (AP001918) replicon plasmids harboring the bla CTX-M gene in E. coli from human sources, whereas Co12 was primarily IncFIA(HI1) and Inc FIB(K) replicon plasmids harboring bla CTX-M , bla KPC , and bla GES genes in Klebsiella spp. from human sources ( Figure 6). This network analysis highlighted the potential mutual role of Co11 plasmids between Co10 and Co12 plasmids because Co11 plasmids are placed around Co10 and next to Co12 (upper-left panel in Figure 6). Co11 is primarily rich in IncFII, Inc FII(pHN7A8), IncX1, and IncN replicon plasmids harboring bla CTX-M , bla KPC , and bla NDM genes, and are mainly detected in E. coli and Klebsiella spp. from human and animal sources. Therefore, Co11 plasmids may contribute to the horizontal ARG transfer between E. coli and Klebsiella spp.
Regarding other notable co-exhibiting specific replicon and host bacteria, Co3 of comprised IncHI2 and IncHI2A replicon plasmids harboring bla CTX-M and bla IMP genes and was mainly detected in Enterobacter species (E. cloacae, 8; E. hormaechei, 6; E. asburiae, 3; others, 1) from human sources. Co4 comprised IncB/O/K/Z and IncI1-gamma replicon plasmids harboring the bla CTX-M gene and were primarily detected in E. coli from human, environmental, and animal sources. Co6 comprised of an IncP6 replicon plasmid harboring bla GES and bla KPC genes and was primarily detected in Aeromonas, Enterobacter, and Klebsiella spp. from environmental sources.
The gene-specific distribution of carbapenemase genes was as follows: bla IMP in Co1 and Co3; bla KPC in Co6, Co8, and Co12 and bla NDM in Co5, Co11, and Co13. Such ARG distribution may be involved in the plasmid replicon and its bacterial hosts.
plasmids harboring the blaCTX-M gene and were primarily detected in E. coli from human, environmental, and animal sources. Co6 comprised of an IncP6 replicon plasmid harboring blaGES and blaKPC genes and was primarily detected in Aeromonas, Enterobacter, and Klebsiella spp. from environmental sources.
The gene-specific distribution of carbapenemase genes was as follows: blaIMP in Co1 and Co3; blaKPC in Co6, Co8, and Co12 and blaNDM in Co5, Co11, and Co13. Such ARG distribution may be involved in the plasmid replicon and its bacterial hosts. Figure 6. Plasmidome analysis of EPB and CPB isolates from WWTP effluents and environment. Nodes on the network are illustrated with different colors based on communities with sequence similarity, source, bla gene type, host bacteria genus, and its isolate location on the continent. The absolute counts for plasmids in plasmidome analysis are summarized in Table S7.

Discussion
Here, we characterized WWTP-related EPB and its plasmids based on the complete genome sequences. Urban WWTP effluents could be a potential source of AMR burden. Therefore, there is concern that hospital [19] and community [20] effluents include a considerable proportion of ARGs in the environment. Among potential healthy EPB carriers, globally disseminated E. coli ST131 clones should first be investigated. This study suggests that multiple ST131 clones in WWTP effluents harbor the currently prevalent CTX-M variants (Figure 2). E. coli O16:H5-ST131-fimH41 with bla CTX-M-27 was identified as a marked clone (Figure 3). Indeed, CTX-M-27-positive E. coli ST131 has been increasing in China [21], EU [22], Australia [23], New Zealand [23], and Japan [24], which has been speculated by the observation that CTX-M-27 might exhibit a higher hydrolyzing activity against ceftazidime compared with CTX-M-14 [25]. In contrast to ST131 with bla CTX-M-27 , other E. coli STs still harbor bla CTX-M-14 , which has been the predominant variant since the early 20th century in Japan [24], suggesting that the other STs have not shifted to acquire more potential CTX-M-27 thus far.
Plasmidome network analysis ( Figure 6) exhibited a good approach for illuminating bacterial communication through plasmid-based HGT. This study highlights bacterial species-dependent plasmid tropism and compatibility among distinct species. The bla CTX-M variants were identified in widely variable plasmids, whereas carbapenemase genes (bla NDM , bla KPC , and bla IMP ) showed a typical plasmid replicon with a specific distribution, suggesting that the region-specific distribution of carbapenemase [26] may be associated with its locality [27]. In particular, specific bla IMP -containing integrons are markedly circulating among different bacteria in countries, such as Japan, Australia, and Thailand [27]. It has been reported that IMP-6-positive CPB clinical isolates are predominantly disseminated among chromosomally distinct isolates through the pKPI-6-related plasmid (IncN replicon [28]) in Japan [29]. Although IMP is the most predominant type of carbapenemase in clinical CPB in Japan, apart from clinical settings, we have reported the KPC-2 or NDM-5-positive WWTP-effluent isolates, i.e., the complete genome sequences of KPC-2-positive Klebsiella pneumoniae GSU10-3 [15], KPC-2-positive Aeromonas hydrophila GSH8-2 [16], KPC-2-positive Aeromonas caviae GSH8M-1 [16], and NDM-5-and CTX-M-55-coproducing E. coli GSH8M-2 [17] were determined. Overall, WWTP monitoring is an ideal approach to identify non-clinical but marked ARBs for the prediction of future clonal expansion from local to global dissemination.
According to the European Centre for Disease Prevention and Control, a significant proportion of antibiotics consumed by humans are in the community rather than in healthcare settings [30], suggesting that outpatient therapy could be the most effective factor in increasing the proportion of antimicrobials, selected ARGs, and ARB in WWTP. Therefore, a nation (or region)-wide survey of the resistome in WWTP effluents could provide a rapid and efficient method for assessing the environmental AMR burden in urban populations. To date, the healthy carriage rate of EPB has been rising worldwide. In Japan, the detection rate of EPB was reported to be 12.2% in healthy adult volunteers [31] and 15.6% in healthy food handlers [32]. As the number of EPB-positive healthy carriers has been increasing worldwide, Japan is no exception to the issue of increasing AMR healthy carriers. According to the Japan Nosocomial Infections Surveillance (JANIS), clinical reports of antibiotic-resistant gram-negative bacteria are increasing (https://janis.mhlw.go.jp/english/index.asp (accessed on 29 August 2022)). As suggested in this study, the whole genome sequence of the obtained ARBs (Table 1) and its plasmidome profiles in WWTP ( Figure 6) could then reflect the structure and diversity of ARBs in the gastrointestinal tracts of urban residents within the WWTP catchment area.
The risk assessment of transmission of AMR vectors/reservoirs from the environment to humans cannot be adapted from the model for pathogens because most AMR vectors/reservoirs are supposed to be composed of non-or low-pathogenic bacterial species. As colonization may be asymptomatic in most humans, this may cause further dissemination under frequent abuse of antimicrobials in clinical use, leading to an underestimation of the extent of transmission of ARB from the environment to humans, as well as from humans to humans in the community. Therefore, they can colonize humans without notable symptoms, resulting in a healthy carrier [33]. Members of the family Enterobacterales and genera, such as Aeromonas, Acinetobacter, and Pseudomonas have been frequently documented as carriers of ESBL and carbapenemase genes in wastewater samples, and some of them may act as vectors for HGT [34]. Indeed, this study identified the clonal dissemination of CTX-M-14-positive Aeromonas caviae isolates at different WWTP sites ( Figure 5A), indicating that such low-pathogenic Aeromonas spp. could be a potential reservoir for AMR dissemination. Therefore, reservoir ARBs should be considered in the overall AMR risk assessment.

Sample Collection
Sewage effluent samples were collected from eight WWTPs (WP1-WP9 and WP6 were treated as missing numbers owing to incorrect sampling locations) along the Tama River and around Tokyo Bay (Table S1). Surface water from a recreational beach (BEC1) was used as the environmental control sample (Table S1). Sampling was conducted during the summer and winter seasons for 2 years between August 2017 and February 2019 (Table S1). The sampling procedure is summarized in Figure 1. A fresh 50 mL of effluent was centrifuged at 7000× g for 3 min, and the cell pellet was resuspended with residual water, followed by spreading all suspensions on CHROMagar ESBL plates (CHROMagar, Paris, France). The plates were then incubated at 36 • C for 24 h (Figure 1). Suspected colonies of Escherichia coli, Klebsiella, Enterobacter, Aeromonas, and Pseudomonas ( Figure 1) were isolated as single clones, followed by whole-genome sequencing. In addition to EPB, CPB was isolated as described previously [15]. Briefly, 500 mL of effluent was filtered, and the membrane was incubated in 20 mL of Luria-Bertani (LB) broth supplemented with 1 mg/L meropenem at 37 • C for 14 h, after which the culture was spread on CHROMagar ESBL plates (CHROMagar, Paris, France).

SNV Phylogenetic Analysis and Pan-Genome Analysis
A flowchart of the single nucleotide variation (SNV) phylogenetic analysis is summarized in Figure S1. Assembly data and/or Illumina raw reads of four species (E. coli, Aeromonas caviae, and Klebsiella quasipneumoniae) were retrieved from the NCBI database. If assembly data were not deposited in the NCBI assembly database, the raw sequence data downloaded from the SRA database were assembled using SKESA version 2.3.0 [44]. MinHash sketch for each genome sequence in this study (n = 53) and NCBI data (n = 76,449) were constructed using the sourmash program version 2.0.0a4 [45] with the following parameters: compute -k 21 -scaled 4000. The MinHash sketch database was built using the sourmash program (parameter: index -k 21) with the NCBI dataset, followed by a MinHash search against 53 genome sequences. The collected dataset was analyzed for core genome SNV and pan-genome analyses for each species or sequence type (ST). In SNV analysis, the longest chromosomal sequences from this study were selected as the reference sequences. Repeat and prophage regions of the reference sequences were analyzed using NUCmer (MUMmer 3.0) [46] and prophet [47] programs, respectively. If the available data were only contig sequences from the NCBI database, the SimSeq program (https://github.com/jstjohn/SimSeq (accessed on 1 April 2018)) was used to construct simulated 150-mer paired-end reads with a 200 bp insert length. Read mapping was performed using BWA-MEM [48] with default parameters against reference chromosomal sequences, followed by variant calling using VarScan version 2.3.4 [49]. SNVs on repeat and predicted prophage regions were removed, and recombination regions were predicted using the Gubbins software [50], followed by filtering SNVs on recombination regions. Core genome SNV phylogenetic analysis was performed by the approximate maximum likelihood phylogenetic method using FastTree v2.1.10 [51], followed by visualization using Fandango version 1.3.0 [52] and interactive tree of life (iTOL) version 3 [53]. SNV network analysis was performed using the median joining network method and TCS network method of PopART (http://popart.otago.ac.nz (accessed on 29 August 2022)). Pan-genome analysis of predicted open reading frames (ORFs) was performed using Roary version 3.12.0 with default parameters [54].

Plasmidome Analysis
The flowchart for the plasmidome phylogenetic analysis is summarized in Figure S2. Complete plasmid sequences were retrieved from the NCBI database as of November 2019, followed by re-annotation and ARG detection using DFAST with the ResFinder and BARRG databases. A plasmid database of the nucleotide and protein sequences was constructed using BLAST. For plasmidome analysis, complete plasmids in the public database were selected according to the following criteria: (i) ≥90% nucleotide sequence identity and ≤1 × e −100 e-value against β-lactamase gene-positive complete plasmids revealed in this study, (ii) sharing ≥10 ORFs (≥99% identity) against these plasmids, (iii) presence of ARGs, and (iv) existence of metadata related to the isolation source. Pan-genome analysis using the collected plasmids was performed using Roary version 3.12.0 [54] with a parameter of 99% BLASTP percentage identity cutoff, followed by the construction of a distance matrix using the R package "proxy" with edge weights of plasmids sharing ≥ 10 ORFs. NMDS and community structure analysis were performed by "vegan" and "igraph" of R package, respectively. Plasmid data were visualized using Cytoscape version 3.7.2.

Conclusions
Urban WWTP effluents, even with proper treatment, may cause AMR burden with a high frequency of acquired ARGs in the environment. The dissemination of ARB/Gs in the environment might increase the risk of infectious diseases [55], but there is little direct evidence regarding their epidemiological effects. This study revealed resistome analysis based on complete genome sequencing and subsequent plasmidome analysis of EPB/CPB isolated from WWTP effluents, suggesting that every urban community, including hospitals, healthy carriers, and travelers, can be a potential source of ARGs. WWTP is speculated to be hotspots where various bacteria can acquire ARGs via HGT. Therefore, resistome analysis of AMR plasmids and their specific bacterial hosts in WWTP effluent is expected to identify the presence of undetected nosocomial infections, leading to the detection of potential ongoing dissemination in the overall community.
Supplementary Materials: The following supporting information can be downloaded from: https: //www.mdpi.com/article/10.3390/antibiotics11101283/s1, Figure S1: Flowchart for verification of strain clonality using comparative genomic analysis; Figure S2: Flowchart of plasmidome analysis; Table S1: General information on wastewater treatment plants and sampling sites in this study; Table S2: Whole-genome sequence information for the strain grown on CHROMagar ESBL; Table S3: Summary of isolates on CHROMagar ESBL and their sequence type in this study; Table S4: The strain information of E. coli used in the core-genome SNV phylogenetic and pan-genome analyses; Table S5: Complete plasmid sequence information of the strain growing on CHROMagar ESBL; Table S6: Complete plasmid sequence list for plasmidome analysis; Table S7: Summary of the absolute counts of plasmids for each category in plasmidome analysis.

Informed Consent Statement: Not applicable.
Data Availability Statement: All complete sequences are available in the DDBJ/EMBL/GenBank database (accession numbers AP021908-AP022304; see Table S2). All raw read sequence files are available from the DRA/SRA database (accession numbers DRR199157-DRR199560 [whole-genome data, see Table S2]).