1. Introduction
Antimicrobial resistance (AMR) has become one of the most pressing global health challenges, driven largely by the rapid emergence and widespread dissemination of resistant bacterial lineages across human, animal, and environmental settings [1].
Escherichia coli (E. coli) plays a central role in this crisis. As both a ubiquitous commensal and a leading cause of extraintestinal infections, it serves as a key reservoir and vehicle for mobile resistance determinants [2]. Its ability to acquire, maintain, and disseminate antimicrobial resistance genes (ARGs) is tightly linked to its genomic plasticity, particularly to plasmids, which are self-replicating mobile elements that facilitate horizontal gene transfer at scales unmatched by chromosomal evolution [3,4].
Plasmids are well-established vectors for some of the most clinically significant resistance genes, including extended-spectrum β-lactamases (ESBLs), carbapenemases, and colistin resistance genes [5,6]. These include blaCTX-M variants, blaNDM carbapenemases, and the mcr family, whose global dissemination has been strongly linked to specific plasmid families, notably IncF, IncX, IncI, and related types [7,8]. Growing evidence indicates that plasmid–ARG associations are not random: certain plasmid backbones consistently co-localize with particular resistance genes, forming stable modules that occur across unrelated lineages and diverse ecological niches [9,10]. These stable associations complicate AMR surveillance, as they facilitate rapid cross-host and cross-clone transmission of high-risk genes.
Despite considerable research on AMR in E. coli, most studies have focused on narrow taxonomic or genomic scopes such as individual high-risk clones (e.g., ST131) [11,12], specific resistance genes like mcr-1 or blaCTX-M [13,14,15], or geographically confined clinical settings [16]. Recent long-read studies, including the landmark analysis by Arredondo-Alonso et al. [17], reconstructed complete plasmidomes and chromosomes from 2000 extra-intestinal pathogenic E. coli (ExPEC) bloodstream isolates collected over two decades. While their work provided key insights into plasmid–host evolutionary dynamics and demonstrated parallel plasmid-mediated and chromosomal strategies for clone success, it focused primarily on clinical clones within a single national cohort. As a result, broader questions about plasmidome diversity and resistome structure across the full phylogenetic and ecological spectrum of E. coli remain open.
In particular, the global architecture of the E. coli plasmidome, defined here as the complete repertoire of plasmids circulating within the species, including their diversity, distribution, and associated genetic cargo, remains poorly resolved. Plasmids are pivotal to AMR evolution because they mediate horizontal transfer of resistance genes across unrelated lineages, hosts, and environments, allowing E. coli to acquire AMR far more rapidly and flexibly than through chromosomal mutation alone. This plasmid-mediated exchange amplifies the complexity of the E. coli resistome by linking mobile ARGs, chromosomal integration events, and clonal background into highly dynamic, mosaic genomic architectures. Consequently, key questions remain: how different plasmid families vary in their capacity to mobilize priority ARGs, how plasmid–gene associations differ across phylogroups and sequence types, and how plasmid-driven AMR evolution compares with emerging trends of chromosomal stabilization, such as the increasing integration of ESBL genes within the chromosome [17,18,19,20,21].
Addressing these gaps requires analyses that integrate plasmidome reconstruction, genomic context, and lineage-level population structure at a scale large enough to capture the diversity and evolutionary dynamics of the species. By examining 9700 high-quality E. coli genomes, this study provides one of the most extensive plasmid-focused resistome analyses conducted to date. Through this approach, we offer a detailed and global view of how plasmid backbones, chromosomal integration events, and clonal lineages interact to shape AMR evolution in E. coli, establishing a foundation for future surveillance strategies centered on plasmid lineages rather than individual resistance genes.
2. Results
2.1. Plasmid Diversity Across the E. coli Collection
Among the 9700 E. coli genomes included in our dataset, 7758 (80%) contained at least one plasmid, representing a total of 24,201 plasmids. Plasmid carriage varied substantially across plasmid-positive isolates, ranging from 1 to 13 plasmids per genome, with a mean of 3.12 and a median of 3.0; most isolates carried between one and five plasmids (Table 1).
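The summary figures above follow from simple ratios. The sketch below recomputes them from the reported counts; it assumes the mean is taken over plasmid-positive genomes, which is an inference from the reported numbers rather than something stated explicitly, and the variable names are illustrative only.

```python
# Counts reported in the text
total_genomes = 9700
plasmid_positive = 7758
total_plasmids = 24201

# Share of genomes carrying at least one plasmid
prevalence = plasmid_positive / total_genomes

# Mean plasmids per genome.
# Assumption: the denominator is the set of plasmid-positive genomes.
mean_per_positive = total_plasmids / plasmid_positive

print(f"prevalence = {prevalence:.1%}")
print(f"mean plasmids per plasmid-positive genome = {mean_per_positive:.2f}")
```

Dividing by all 9700 genomes instead would give a mean of about 2.5, so the reported value of 3.12 is consistent with the plasmid-positive denominator.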
When examining plasmid replicon types, we observed that a relatively small set of plasmid families accounted for a large fraction of all plasmids identified. The most abundant replicon was IncFIB(AP001918), with 4311 copies detected, followed by ColRNAI (2387), Col156 (1768), IncFIA (1542), and IncFIC(FII) (1517). Several other families, including Col(MG828), IncI1_1_Alpha, IncFII, IncX1, Col440I, IncY, and Col8282, were also well represented (Table 2).
Plasmid burden showed clear differences across phylogroups. Phylogroup C harbored the highest number of plasmids on average (mean = 4.40), followed by G (3.52) and F (3.32). In contrast, groups such as B1, D and E showed lower plasmid loads, while the human-associated groups A and B2 displayed intermediate values (3.13 and 3.26, respectively). These patterns were supported statistically by a Kruskal–Wallis test (H = 363.97, p = 6.6 × 10^−73), and the Dunn post hoc comparisons identified several significant pairwise differences (Table 3).
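For readers unfamiliar with the test, the Kruskal–Wallis H statistic can be computed from pooled ranks. The following is a minimal pure-Python sketch with tie correction, not the pipeline used in this study; the function name and the toy plasmid counts are hypothetical, and the Dunn post hoc step and the p-value (from a chi-square distribution with k − 1 degrees of freedom) are not implemented. It assumes every group is non-empty and that the pooled data contain at least two distinct values.

```python
from collections import defaultdict

def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    groups: dict mapping a group label to a list of numeric observations.
    """
    # Pool all observations, remembering which group each came from
    pooled = sorted(
        (value, name) for name, values in groups.items() for value in values
    )
    n_total = len(pooled)

    rank_sums = defaultdict(float)
    tie_term = 0
    i = 0
    while i < n_total:
        # Find the run of tied values starting at position i
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j

    h = 12 / (n_total * (n_total + 1)) * sum(
        rank_sums[name] ** 2 / len(values) for name, values in groups.items()
    ) - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction

# Hypothetical plasmid counts per genome for three phylogroups (toy data)
toy_counts = {"C": [4, 5, 6, 4], "B2": [3, 3, 4, 2], "D": [1, 2, 2, 3]}
h_toy = kruskal_wallis_h(toy_counts)
print(f"H = {h_toy:.2f}")
```

A larger H indicates that the rank distributions of the groups differ more than expected by chance.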
At the sequence-type (ST) level, plasmid counts were even more variable. A small number of rare STs carried unusually high plasmid loads, often eight or nine plasmids per genome, despite being represented by single isolates. Examples include ST773, ST7236, ST6775 and ST772 (Table 4).
2.2. Phylogroup and ST Variation in Plasmid Load and Composition
Marked variation in plasmid composition was observed across phylogroups (Figure 1).
Phylogroup C carried the highest plasmid burden (mean = 4.40 per genome), followed by G and F, whereas D, E and unassigned isolates exhibited lower plasmid counts. Phylogroup B2, dominated by clinically important lineages such as ST131, displayed a strong enrichment in IncF-type plasmids. In contrast, phylogroups A and B1 showed increased frequencies of Col-type replicons, while phylogroups F and G were characterized by moderate but consistent enrichment in IncF plasmids (Figure 2A).
At the sequence-type level, ST131 showed a distinctive plasmid profile dominated by IncFIB(AP001918), IncFIA and IncFIC(FII). ST10 carried primarily Col-type plasmids, whereas ST38 and ST648 showed profiles enriched in IncX3/IncX4 and mixed IncF/IncI/Col elements, respectively (Figure 2B).
To refine lineage-level differences in plasmid distribution, we quantified the phylogroup and sequence-type representation of the most prevalent plasmid replicons. Col(BS512) was detected in 393 genomes (5.1% of the 7758 plasmid-positive isolates). Its distribution was dominated by phylogroup B2 (213 genomes; 54.2%), followed by A (29 genomes; 7.4%) and B1 (23 genomes; 5.9%). Across these groups, Col(BS512) was distributed across more than 20 STs, including ST10, ST46, ST167 and ST295 in phylogroups A and B1, and ST12, ST73, ST91, ST131 and ST405 in phylogroup B2; no single ST accounted for more than 6% of all occurrences.
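The phylogroup shares quoted for Col(BS512) can be reproduced from the reported counts. The sketch below uses only the numbers given above; the remainder it computes is simply what is left after the three itemized phylogroups, and its assignment to other or unassigned phylogroups is an assumption, since the text does not break it down.

```python
# Counts reported in the text for Col(BS512)
col_bs512_total = 393
phylogroup_counts = {"B2": 213, "A": 29, "B1": 23}

# Percentage of Col(BS512)-positive genomes in each itemized phylogroup
shares = {
    pg: round(100 * n / col_bs512_total, 1) for pg, n in phylogroup_counts.items()
}

# Genomes not covered by the itemized phylogroups
# (assumed to belong to other or unassigned phylogroups)
remainder = col_bs512_total - sum(phylogroup_counts.values())
remainder_share = round(100 * remainder / col_bs512_total, 1)

print(shares)
print(remainder, remainder_share)
```

The same calculation applies to any replicon once its per-phylogroup counts are available.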
In addition to Col(BS512), several other plasmid families exhibited broad lineage coverage and strong associations with specific sequence types. The IncFIB(AP001918) replicon, detected in 4311 genomes (55.6%), was the most widespread. Its distribution was enriched in phylogroups B2 (34%), B1 (21%), A (18%), F (5%) and C (4%), with the remaining 18% spread across D, E, G and unassigned isolates. At the ST level, this replicon was most common in ST131 (9.4% of all IncFIB-positive genomes), ST10 (7.1%), ST38 (5.5%), ST73 (4.2%), ST648 (3.8%), and ST12 (3.1%), with more than 50 additional STs represented at lower frequency.
The IncI1_1_Alpha plasmid (916 genomes; 11.8%) showed a distinct pattern centered on phylogroups A (27%), B1 (24%) and B2 (18%), followed by F (8%) and C (5%). This plasmid family was distributed across 30 STs, most commonly ST10 (8.5%), ST101 (5.9%), ST155 (4.3%), ST167 (3.7%) and B2-ST131 (3%), with numerous additional STs represented by few isolates.
The IncX1 replicon (668 genomes; 8.6%) was most frequent in phylogroups A (33%), C (20%), and B2 (16%), with additional representation in B1, F and G. Within these groups, IncX1 was primarily associated with ST10, ST744, ST224 and ST101, each contributing 6–11% of all IncX1 occurrences.
Together, these patterns highlight substantial heterogeneity in plasmid lineage coverage.
Some replicons such as IncFIB(AP001918) and Col156 showed broad phylogroup distribution and extensive ST diversification, whereas others such as Col(BS512) and IncI1_1_Alpha exhibited multi-lineage presence but more structured, lineage-dependent profiles.
2.3. Distribution of High-Impact ARGs Across Plasmid Replicons
Across the 9700 E. coli genomes, a total of 42,826 ARG hits corresponding to 274 distinct genes were identified, including 12,451 (29%) plasmid-borne and 30,375 (71%) chromosomal occurrences (Figure 3A).
At the genome level, 2764 isolates (28.5%) carried at least one plasmid-associated ARG, whereas 6936 genomes (71.5%) encoded their resistome exclusively on the chromosome. Of the 2764 genomes with plasmid-associated ARGs, 2637 (27.2% of all genomes) carried ARGs in both compartments, and 127 (1.3% of all genomes) harbored all detected ARGs on plasmids.
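The genome-level partition described above amounts to classifying each genome by the compartments in which its ARGs were found. The following sketch illustrates that logic on toy input; the function name, category labels and genome identifiers are hypothetical, and only the two consistency checks at the end use numbers from the text.

```python
from collections import Counter

def arg_compartment(locations):
    """Classify a genome from the set of compartments where its ARGs were found."""
    has_plasmid = "plasmid" in locations
    has_chromosome = "chromosome" in locations
    if has_plasmid and has_chromosome:
        return "both"
    if has_plasmid:
        return "plasmid_only"
    if has_chromosome:
        return "chromosome_only"
    return "no_arg"

# Hypothetical toy input: genome ID -> compartments of its ARG hits
genomes = {
    "g1": {"chromosome"},
    "g2": {"chromosome", "plasmid"},
    "g3": {"plasmid"},
    "g4": {"chromosome"},
}
summary = Counter(arg_compartment(loc) for loc in genomes.values())
print(summary)

# Consistency checks on the counts reported in the text
assert 2637 + 127 == 2764      # both compartments + plasmid only
assert 2764 + 6936 == 9700     # plasmid-associated + chromosome only
```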
To characterize the organization of plasmid-associated resistance determinants, we analyzed co-occurrence patterns between the 50 most frequent ARGs and the 50 most abundant plasmid replicons, representing more than 85% of all plasmid-borne ARG occurrences (Figure 3B).
Extended-spectrum β-lactamase (ESBL) genes showed strong and structured associations with IncF-type plasmids. Among plasmid-associated blaCTX-M occurrences, IncFIB(AP001918) carried approximately 30%, IncFIA 20%, and IncFIC(FII) 15%, together accounting for 65–70% of all plasmid-borne blaCTX-M genes. Within this group, blaCTX-M-15 represented the dominant allele and was most frequently associated with IncFIB(AP001918) and IncFIC(FII) replicons.
Carbapenemase and colistin resistance genes exhibited even tighter plasmid backbone specialization. Among plasmid-associated blaNDM-5 occurrences (n = 159), 147 (92.6%) were carried by IncX3 plasmids, with the remaining 12 occurrences distributed across a small number of IncF- and IncI-type replicons. Similarly, among plasmid-associated mcr-1.1 occurrences (n = 323), 271 (83.9%) were linked to IncX4, while 48 (14.7%) were carried by IncHI2/IncHI2A plasmids, together accounting for >98% of plasmid-borne mcr-1.1.
Other frequently mobilized resistance determinants formed broader MDR modules. Among plasmid-associated sul2 occurrences (n = 1259), Col156, IncFIA, IncFIB(AP001918) and IncI1_1_Alpha collectively carried 780 occurrences (62%). For tet(A), IncI1_1_Alpha alone accounted for 27% of plasmid-borne copies (200 of 728), followed by IncFII and several Col-type replicons. aadA-family genes showed a similar pattern, with IncFIB(AP001918), IncFIC(FII), IncI1_1_Alpha and IncI2 jointly carrying 65% of the 859 plasmid-associated aadA detections.
Plasmid-mediated quinolone resistance genes were less abundant but highly mobile. Among plasmid-associated qnrS-family genes (n = 663), IncFIA, IncFIB(AP001918) and IncI1-type plasmids together accounted for 70% of occurrences, indicating recurrent recruitment of qnrS into IncF- and IncI-associated MDR modules.
These data demonstrate that plasmid-borne ARGs are concentrated within a limited number of replicon families, which collectively account for most high-impact resistance genes.
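The replicon shares reported in this section are, in essence, normalized co-occurrence counts. A minimal sketch of that calculation is given below, assuming the input is a list of (replicon, gene) pairs with one pair per plasmid-borne ARG hit; the toy records and the function name are hypothetical and do not reproduce the dataset.

```python
from collections import Counter

# Hypothetical toy records: one (replicon, ARG) pair per plasmid-borne ARG hit
hits = [
    ("IncX3", "blaNDM-5"),
    ("IncX3", "blaNDM-5"),
    ("IncX4", "mcr-1.1"),
    ("IncFIB(AP001918)", "blaCTX-M-15"),
    ("IncFIB(AP001918)", "sul2"),
    ("IncI1_1_Alpha", "tet(A)"),
    ("IncFIB(AP001918)", "blaNDM-5"),
]

pair_counts = Counter(hits)                       # replicon-gene co-occurrences
gene_totals = Counter(gene for _, gene in hits)   # plasmid-borne hits per gene

def replicon_share(gene):
    """Fraction of a gene's plasmid-borne hits carried by each replicon."""
    return {
        rep: n / gene_totals[gene]
        for (rep, g), n in pair_counts.items()
        if g == gene
    }

ndm_share = replicon_share("blaNDM-5")
print(ndm_share)
```

Restricting `pair_counts` to the most frequent genes and replicons yields the kind of matrix typically shown as a co-occurrence heatmap.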
2.4. MDR-Associated Plasmid Profiles in Major Sequence Types
Marked heterogeneity in MDR-associated plasmid carriage was observed across phylogroups and sequence types (Figure 4A).
Among the six major MDR plasmid families, IncFIB(AP001918) was by far the most abundant, with 2460 occurrences, followed by IncFIC(FII) (758), IncI1_1_Alpha (728), IncFIA (512), IncX3 (212), and IncX4 (146). At the phylogroup level, B2 carried the largest MDR plasmid burden, including 892 IncFIB(AP001918) replicons (36.3% of all B2 plasmid observations), 430 IncFIC(FII) replicons (17.5%), and 282 IncI1_1_Alpha plasmids (11.5%). Phylogroup C also exhibited high MDR plasmid density, dominated by 615 IncFIB(AP001918) occurrences, together with 78 IncFIC(FII) and 52 IncI1_1_Alpha replicons. In contrast, phylogroups A and B1 carried fewer MDR plasmids but showed broad diversity, with A harboring 326 ColRNAI, 209 IncFIA, and 430 IncFIC(FII) plasmids, while B1 contained 380 IncFIB(AP001918), 247 IncI1_1_Alpha, and 187 IncFIC(FII) plasmids. These data illustrate that MDR plasmids are not restricted to a single phylogroup: although B2 remains the principal hotspot, substantial MDR plasmid reservoirs also exist within A, B1, and C.
At the sequence-type level, MDR plasmid families displayed strikingly unequal distributions (Figure 4B).
ST410 (phylogroup C) represented the most significant MDR hub, carrying 522 IncFIB(AP001918) plasmids (21.2% of all IncFIB detections worldwide), together with 70 IncFIC(FII) plasmids (9.2% of total IncFIC detections) and 59 IncFIA replicons (11.5%), confirming its central role in ESBL-associated plasmid dissemination. ST131 (phylogroup B2) constituted the second major MDR lineage, with 379 IncFIB(AP001918) replicons (15.4% of the global total), 12 IncFIC(FII) plasmids (1.6%), and 74 IncFIA replicons (14.5%), consistent with its well-established association with IncF-mediated resistance.
ST167 (phylogroup A) carried a substantial MDR plasmid load, including 151 IncFIB(AP001918) plasmids (6.1%), 107 IncFIC(FII) plasmids (14.1%), and 69 IncFIA replicons (13.5%), reflecting its importance as a broad-spectrum MDR lineage. ST410 further accumulated carbapenemase- and colistin-associated plasmids, including 48 IncX3 replicons (22.6% of all IncX3 detections) and 11 IncX4 plasmids (7.5%), highlighting its additional role in last-resort resistance dissemination.
ST10 (phylogroup A), although highly prevalent, displayed a more commensal-like profile with moderate MDR involvement (140 IncFIB(AP001918), 65 IncFIC(FII), and 48 IncFIA plasmids), while ST744 showed moderate IncFIB(AP001918) (57) and IncX4 (7) representation. Altogether, these quantitative patterns demonstrate that MDR plasmid dissemination is dominated by a limited number of plasmid–ST partnerships, with ST410 and ST131 representing the principal IncFIB-associated hubs and IncX3/IncX4 plasmids being largely concentrated in ST410 and selected additional lineages. This structured, non-random distribution underscores the presence of high-impact sequence types that act as central conduits for plasmid-mediated AMR dissemination across the E. coli population.
2.5. Co-Occurrence Patterns Between Plasmids and Resistance Genes
ARG–plasmid co-occurrence analyses confirmed that a small number of plasmid families act as major hubs of the plasmid-borne resistome (Figure 5).
IncFIB(AP001918) was the largest ARG carrier, with 2378 ARG occurrences mapped to this replicon (19.1% of all plasmid-borne ARG hits). The most frequent co-occurring genes included sul2 (448 occurrences; 18.8% of the IncFIB-associated resistome), tet(A) and its variants (358; 15.1%), aadA-type aminoglycoside-modifying enzymes (279; 11.7%), blaTEM-1B (210; 8.8%) and blaCTX-M-15 (146; 6.1%). This dense cluster of sulfonamide, tetracycline, aminoglycoside and ESBL determinants around IncFIB(AP001918) forms a core MDR module.
IncFIA and IncFIC(FII) plasmids displayed similar, though smaller, MDR modules. IncFIA carried 874 ARG occurrences, enriched in sul2 (173; 19.8%), tet(A) variants (121; 13.8%), aadA genes (98; 11.2%), blaTEM-1B (74; 8.5%) and blaCTX-M-15 (58; 6.6%). IncFIC(FII) harbored 477 ARG occurrences, again dominated by sul2 (92; 19.3%), tet(A) (63; 13.2%), aadA (54; 11.3%) and blaTEM-1B (39; 8.2%). Together, these three IncF families formed a large, interconnected module comprising ESBLs, aminoglycoside-modifying enzymes, sulfonamide and tetracycline resistance genes.
IncI1_1_Alpha formed a second major MDR module. This replicon carried 799 ARG occurrences, frequently including blaCTX-M-15 (118; 14.8%) and blaTEM-1B (97; 12.1%), together with sul2 (119; 14.9%), aadA variants (89; 11.1%) and tet(A) (68; 8.5%). This pattern supports a role for IncI1 in mobilizing ESBLs alongside aminoglycoside and tetracycline resistance.
By contrast, IncX-type plasmids were more specialized. IncX3 carried 158 ARG occurrences, dominated by blaNDM-5 (116; 73.4%), with additional co-carriage of other β-lactamase, aminoglycoside and sulfonamide genes at much lower frequencies. IncX4 (256 ARG occurrences) was strongly associated with mcr-1.1 (205; 80.1%), with remaining ARGs predominantly involving tet and sul genes. These two plasmid families formed a distinct module linking carbapenemase (IncX3–blaNDM-5) and colistin (IncX4–mcr-1.1) determinants to a narrower accessory resistance background.
Col-type plasmids contributed smaller, more peripheral clusters. Col156, ColRNAI and Col(MG828) were most frequently associated with tet(A), sul2 and aadA variants, but only rarely with ESBLs or carbapenemases. Overall, the network structure highlighted a few highly connected MDR hubs (IncF- and IncI-type plasmids) and more specialized vehicles for last-resort resistance genes (IncX3 and IncX4).
2.6. Comparative Contribution of Plasmids and Chromosomes to the Resistome
Plasmid and chromosomal compartments contributed unevenly to the carriage of individual resistance genes. While most ARGs remained primarily chromosomal, several clinically important determinants showed strong plasmid enrichment. For example, blaNDM-5 was detected 246 times, of which 159 detections (64.6%) were located on plasmids, mostly IncX3, while the remaining 87 (35.4%) were chromosomal. mcr-1 variants (mostly mcr-1.1) showed an even stronger plasmid bias, with 323 of 423 detections (76.4%) located on plasmids, predominantly IncX4 replicons.
In contrast, blaCTX-M-15 displayed a more mixed distribution, with 757 total occurrences split between plasmids (169; 22.3%) and chromosomes (588; 77.7%). Plasmid-mediated quinolone resistance genes of the qnrS family were predominantly plasmid-borne (663/809 detections; 82.0%), while sul2 (1259/2474; 50.9%) and aadA variants (859/1377; 62.4%) showed intermediate plasmid contributions. The efflux pump mdf(A) was largely chromosomal, with only 140 of 9675 detections (1.4%) found on plasmids.
These data highlight that, although the majority of ARGs are chromosomally encoded, plasmids disproportionately contribute to the dissemination of key acquired determinants, particularly blaNDM-5, mcr-1, qnrS and many aminoglycoside- and sulfonamide-resistance genes.
2.7. ARG Burden and Diversity Across Plasmid Families
Plasmid families differed not only in abundance but also in the size and diversity of their ARG cargo. IncFIB(AP001918) carried the largest ARG burden, with 2378 plasmid-borne ARG occurrences spanning 148 distinct genes, representing 19.1% of all plasmid-associated ARG hits in the dataset (Figure 6).
IncFIA and IncFIC(FII) collectively contributed an additional 1351 occurrences across 113 and 90 distinct genes, respectively, reinforcing the central role of IncF-type plasmids as broad-spectrum MDR backbones.
IncI1_1_Alpha harbored 799 ARG occurrences representing 104 distinct genes, many of which were ESBLs (notably blaCTX-M-15 and blaTEM-1B), together with sulfonamide, tetracycline and aminoglycoside resistance determinants. IncX3 and IncX4 carried smaller but highly focused ARG repertoires: 158 occurrences across 30 distinct genes for IncX3 (dominated by blaNDM-5), and 256 occurrences across 33 genes for IncX4 (dominated by mcr-1.1).
Col-type plasmids, although highly prevalent, carried more modest and specialized ARG payloads. Col156, ColRNAI and Col(MG828) each contributed several hundred ARG occurrences, largely restricted to tet(A), sul1/sul2 and streptomycin-resistance genes (e.g., aph(3″)-Ib, aph(6)-Id), and only rarely encoded β-lactamases or carbapenemases. Overall, this pattern indicates that IncF-, IncI- and IncX-type plasmids carry the bulk of clinically important MDR determinants, whereas Col-type plasmids primarily act as vehicles for older, non–β-lactam resistance genes.
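The two quantities compared in this section, ARG burden (number of occurrences) and ARG diversity (number of distinct genes), can be derived per replicon in a single pass. The sketch below shows one way to do so on hypothetical toy records; only the final check on the combined IncFIA and IncFIC(FII) burden uses numbers from the text.

```python
from collections import defaultdict

# Hypothetical toy records: one (replicon, ARG) pair per plasmid-borne ARG hit
hits = [
    ("IncFIB(AP001918)", "sul2"),
    ("IncFIB(AP001918)", "sul2"),
    ("IncFIB(AP001918)", "tet(A)"),
    ("IncX3", "blaNDM-5"),
    ("IncX3", "blaNDM-5"),
]

burden = defaultdict(int)      # ARG occurrences per replicon
distinct = defaultdict(set)    # distinct ARGs per replicon
for replicon, gene in hits:
    burden[replicon] += 1
    distinct[replicon].add(gene)

diversity = {rep: len(genes) for rep, genes in distinct.items()}
print(dict(burden), diversity)

# Check on the reported combined burden of IncFIA (874) and IncFIC(FII) (477)
assert 874 + 477 == 1351
```

A replicon with high burden but low diversity, such as IncX3 in the toy data, corresponds to the specialized profile described above.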
2.8. Identification and Quantitative Characterization of High-Risk Plasmid (HRP) Groups
Based on combined criteria of abundance, ARG burden and lineage breadth, six plasmid families were classified as High-Risk Plasmid (HRP) groups: IncFIB(AP001918), IncFIA, IncFIC(FII), IncI1_1_Alpha, IncX3 and IncX4 (Figure 7).
Together, these families were detected in 5027 of the 7758 plasmid-positive genomes (64.8%) and carried 4965 plasmid-borne ARG occurrences (39.9% of all plasmid-associated ARG hits).
IncFIB(AP001918) was the most widespread HRP, present in 4311 genomes (55.6% of plasmid-positive isolates), spanning eight phylogroups and 229 distinct STs. IncFIA and IncFIC(FII) were detected in 1542 (19.9%) and 1517 (19.6%) genomes, respectively, each spanning all major phylogroups and >150 STs. IncI1_1_Alpha occurred in 916 genomes (11.8%), across seven phylogroups and 146 STs, while IncX3 and IncX4 were detected in 222 (2.9%) and 301 (3.9%) genomes, respectively, yet remained widely distributed across phylogroups and sequence types.
In terms of ARG content, IncFIB(AP001918) alone carried 2378 ARG occurrences, followed by IncFIA (874), IncI1_1_Alpha (799), IncFIC(FII) (477), IncX4 (256) and IncX3 (158). Each HRP family thus combined (i) high prevalence, (ii) broad phylogroup and ST coverage and (iii) substantial and often clinically critical ARG cargo, justifying their designation as high-risk vehicles for MDR dissemination.
2.9. ARG Mobility Potential Index (MPI)
To quantify the propensity of individual ARGs to be carried on plasmids, we calculated a Mobility Potential Index (MPI) for each gene, defined as the proportion of its occurrences found on plasmids. MPI values ranged from near-zero (predominantly chromosomal) to >0.8 (strong plasmid bias).
Among high-impact determinants, blaNDM-5 displayed an MPI of 0.65 (159/246 occurrences on plasmids), reflecting frequent plasmid association but also a substantial chromosomal component. mcr-1 variants (mostly mcr-1.1) showed a higher MPI of 0.76 (323/423 plasmid-borne), while qnrS-family genes were strongly plasmid-associated with an MPI of 0.82 (663/809 occurrences on plasmids). In contrast, blaCTX-M-15 exhibited a low MPI of 0.22 (169/757 plasmid-borne), indicating that in this dataset a majority of blaCTX-M-15 copies are chromosomally encoded.
Intermediate MPI values were observed for widely distributed accessory genes such as sul2 (0.51; 1259/2474 plasmid-borne), aadA variants (0.62; 859/1377) and tet(A)-like genes (0.38; 909/2393). The efflux pump mdf(A), by contrast, had a very low MPI of 0.015, with only 140 of 9675 detections located on plasmids, consistent with its role as a core chromosomal determinant.
Taken together, these MPI profiles indicate that a subset of ARGs, particularly qnrS, mcr-1 and blaNDM-5, is strongly enriched on plasmids and therefore has high potential for horizontal dissemination, whereas others, such as mdf(A) and several native β-lactamase variants, remain largely chromosomal and are less mobilization-prone.
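Because the MPI is defined as the proportion of a gene's occurrences found on plasmids, the values above can be recomputed directly from the reported counts. The sketch below does this for the genes with explicit counts in the text; the function name and the rounding to two decimals are illustrative choices rather than the authors' implementation.

```python
# (plasmid-borne occurrences, total occurrences) as reported in the text
counts = {
    "blaNDM-5": (159, 246),
    "mcr-1": (323, 423),
    "qnrS": (663, 809),
    "blaCTX-M-15": (169, 757),
    "sul2": (1259, 2474),
    "aadA": (859, 1377),
    "tet(A)-like": (909, 2393),
}

def mobility_potential_index(plasmid_hits, total_hits):
    """MPI = plasmid-borne occurrences / all occurrences of the gene."""
    if total_hits == 0:
        raise ValueError("gene has no occurrences")
    return plasmid_hits / total_hits

mpi = {
    gene: round(mobility_potential_index(p, t), 2)
    for gene, (p, t) in counts.items()
}

# Genes ordered from most to least plasmid-biased
ranked = sorted(mpi, key=mpi.get, reverse=True)
for gene in ranked:
    print(f"{gene}: {mpi[gene]:.2f}")
```

Ranking genes by MPI places qnrS, mcr-1 and blaNDM-5 at the top and blaCTX-M-15 at the bottom of this subset, in line with the interpretation given above.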
3. Discussion
This study provides the first plasmidome-resolved analysis of AMR architecture across E. coli at a truly global scale, integrating 9700 high-quality genomes spanning all major phylogroups and dominant MDR sequence types. By unifying plasmid backbone diversity, ARG localization, lineage structure, and mobility potential, our work demonstrates that plasmid lineages, rather than bacterial clones or individual resistance genes, constitute the primary organizational units shaping AMR dissemination in E. coli. The global resistome is not diffuse or stochastic; rather, it is structured around a restricted set of hyper-successful plasmid families that repeatedly assemble with the same high-risk resistance determinants. These findings reveal a modular, evolutionarily conserved plasmid–gene architecture that redefines how AMR emerges, spreads, and stabilizes across ecological and geographical boundaries [22,23].
The recurrent associations observed between IncX3–blaNDM-5, IncX4–mcr-1.1, and multiple IncF backbones with blaCTX-M alleles provide strong evidence for co-adapted plasmid–gene compatibility modules. These modules persist across unrelated phylogroups and hundreds of sequence types, suggesting that plasmids impose structural constraints that shape the evolutionary landscape of AMR. Such constraints likely arise from backbone-specific replication systems, addiction modules, conjugation machinery, and compensatory evolution that collectively optimize the stability and transmissibility of particular gene combinations [24,25,26,27,28]. This restricted evolutionary “design space” helps explain why AMR dissemination converges on a small number of plasmid backbones despite the enormous genomic diversity of E. coli. Our findings extend and globalize observations from long-read clinical studies, showing that these modules are not confined to specific outbreaks or national cohorts; they represent globally persistent evolutionary units that underpin AMR flow across the One-Health continuum [17].
The marked contrast between plasmid-rich resistomes in generalist phylogroups (A, B1, D) and chromosomal stabilization of ESBL determinants in pandemic clones like ST131 and ST410 highlights distinct evolutionary strategies for long-term success. Generalist lineages appear to rely on highly mobile plasmidome architectures that facilitate horizontal exchange and ecological versatility, whereas globally dominant ExPEC clones increasingly stabilize key ESBL genes within the chromosome to reduce plasmid fitness costs and ensure vertical persistence. This duality parallels prior evolutionary models and population structure observations and is consistent with reports of extensive chromosomal fixation of blaCTX-M-15 in ST131 and ST410 [18,21,29,30,31].
The discovery of stable plasmid–gene modules has direct translational potential. Because these modules recur predictably across lineages and continents, they represent structurally coherent targets for plasmid-focused AMR mitigation strategies. CRISPR-based antimicrobials, for example, could be engineered to selectively disrupt high-risk plasmid families such as IncX3, IncX4, and IncFIB(AP001918), thereby eliminating blaNDM-5, mcr-1.1, or blaCTX-M reservoirs at their backbone source rather than targeting individual alleles. The consistency of these plasmid–gene architectures suggests that plasmid-targeted interventions may bypass the formidable genetic diversity of E. coli, focusing instead on the limited set of vehicles that sustain global AMR transmission [6].
The structured nature of the plasmidome revealed here provides a foundation for developing plasmid-centric AMR surveillance systems. Instead of tracking hundreds of resistance genes across thousands of clones, genomic surveillance programs can monitor a small set of high-risk plasmid backbones that account for the majority of clinically important ARGs. Plasmidome signatures such as the presence of IncX3, IncX4, or IncFIB(AP001918) could serve as early warning indicators of emerging threats, enabling predictive modeling of AMR dissemination across human, animal, and environmental reservoirs. This plasmid-centered perspective aligns with global sewage surveys [32] and environmental resistome studies [33,34], which similarly highlight recurrent plasmid-driven AMR modules as ecological sentinels.
Collectively, our results advance a new conceptual framework for AMR evolution in E. coli: plasmid backbones, not bacterial genomes, constitute the fundamental units driving the global dissemination of resistance. The resistome emerges through modular, lineage-agnostic plasmid–gene combinations that recur across phylogroups, enabling rapid cross-host and cross-ecosystem transmission. This represents a shift from traditional clone-centric models toward a plasmidome-centric understanding of AMR biology. By mapping this architecture at scale, we provide the first global reference framework capable of guiding surveillance, prediction, and intervention strategies targeting plasmid-mediated resistance.
Although replicon-based plasmid inference may overlook cryptic or highly divergent plasmids, the remarkably consistent plasmid–gene associations observed across this extensive dataset underscore the robustness of our conclusions. As long-read sequencing, metagenomics, and functional plasmid biology continue to expand, the plasmidome map generated here will serve as a foundational resource for decoding mobile genetic element evolution, validating plasmid–gene compatibility mechanisms, and designing plasmid-targeted AMR mitigation tools [32,35].