Genotypic Characteristics and Antimicrobial Resistance of Escherichia coli ST141 Clonal Group

Escherichia coli ST141 is one of the ExPEC lineages whose incidence is rising in France, although no epidemic situation involving multidrug-resistant isolates has been reported so far. Nonetheless, in a 2015–2017 monocentric study conducted in our French University hospital, ST141 was the most frequent lineage after ST131 in our collection of phylogroup B2 ESBL-producing E. coli. The genomes of 187 isolates representing the ST141 group, including 170 genomes from public databases and 17 from our local collection (13 of which produced ESBL), were analyzed to infer a recombination-free, SNP-based (single nucleotide polymorphism) maximum likelihood phylogenetic tree defining the ST141 population structure. Genomes were screened for genes encoding virulence factors (VFs) and antimicrobial resistance (AMR). We also evaluated the distribution of isolates according to their origin (host, disease, country) and the distribution of VF and AMR genes. The phylogenetic tree revealed that ST141 isolates clustered into two main sublineages, with low genetic diversity. In contrast with a highly virulent profile, as many isolates accumulated VFs, the prevalence of AMR was limited, with no evidence of an emerging multidrug-resistant lineage. However, our results suggest that surveillance of this clonal group, which has the potential to spread widely in the community, would be essential.


Introduction
Escherichia coli is the predominant aerobic bacterium in the normal gut microbiota of humans and vertebrates as well as a major human pathogen [1]. Indeed, E. coli strains can cause both extra-intestinal pathologies (urinary tract infection, intra-abdominal or pulmonary infection, as well as newborn meningitis or bacteraemia) and intestinal infections [2]. Currently, unlike the latter (Intestinal Pathogenic E. coli, InPEC), for which virulence factors have been clearly identified and for which pathovar classification seems easy, the classification of ExPEC (Extraintestinal Pathogenic E. coli) strains is still subject to discussion, as no disease-specific virulence genes have been identified. It is the combination of different virulence factors, which may be numerous in these strains, that may explain their pathogenicity. Indeed, most, if not all, strains that may cause extra-intestinal infections carry different genes encoding virulence factors, such as adhesins, toxins, protectins and iron capture systems. However, the conditions under which E. coli strains emerge from their intestinal reservoir to cause extraintestinal infections remain largely unknown, as do the reasons for the success of some pandemic lineages. Indeed, phylogenomic approaches have demonstrated that only four sequence types or sequence type complexes (STcs) responsible for extraintestinal infections (STc131, STc73 and STc95 belonging to phylogroup B2, and STc69 belonging to phylogroup D) were consistently observed in epidemiological studies and were thus named "the big four ExPEC clones". Although not part of the big four ExPEC clones, E. coli ST141 is regularly reported as one of the most represented ExPEC [3]. Furthermore, among ExPEC, E. coli ST141 is singular: it has recently been hypothesized that the genome of the ST141 E. coli lineage would be very susceptible to recombination, making it capable of acquiring and expressing genes specific to InPEC, thus conferring a heteropathogen status [4]. The appearance of such hybrid clones could reshuffle the deck within the pathotype classification widely used to describe E. coli populations.
Another matter of concern is the ability of E. coli to gain antibiotic resistance determinants, given that E. coli may easily spread, as occurred with the pandemic of the highly drug-resistant E. coli sequence type 131 (ST131) H30 sublineage [5]. To date, no epidemic situation involving multidrug-resistant ST141 isolates has been reported [6]. Nonetheless, in a 2015-2017 monocentric study conducted in our French University hospital, which consisted of collecting all non-duplicate extended-spectrum beta-lactamase-producing E. coli (ESBLEc), we identified ST141 as the second most frequent lineage in our collection of phylogroup B2 ESBLEc, accounting for 8.7% of ESBL-producing isolates, while the ST131 lineage, the most frequent lineage, accounted for 55.3%, and isolates belonging to other STs each accounted for less than 3% of isolates [7].
Here, we used a comprehensive data set of ST141 genomes available in databases to characterize the population structure of this clonal group, with the aim of identifying the emergence of a multidrug-resistant sublineage.
Figure 1. Representation of the two main E. coli ST141 sublineages obtained from the recombination-free, SNP-based maximum likelihood phylogenetic tree inferred from alignment to a fully sequenced reference strain of E. coli ST141.

Isolates from our local collection mostly belonged to S1 (15/17), and six of them grouped together exclusively. Isolates S86, S87, S80, S82, and S92 were separated from each other by a maximum of 14 SNPs, while S84 was more distant (26 SNPs). Finally, only two isolates, S89 and S96, did not belong to S1 (Table S1, Figure 2).
Of note, all 13 ESBL E. coli strains in our collection were tested in vitro (diffusion susceptibility testing according to the EUCAST/CA-SFM recommendations in force at the time the strain collection was established), with a perfect correlation with the results obtained in silico.
Overall, we did not find a strong association between the distribution of VFs and the geographical origin, the host origin (human or non-human), or the type of clinical sample (Tables 2 and S1). However, some differences in the distribution of VFs between S1 and S2 were statistically significant and are displayed in Table 2. Thus, the stx2 gene was more frequent in the S2 sublineage (due to the predominance of strains causing diarrhea in this sublineage (S1 = 5/118; S2 = 12/118, p = 0.004)), while the Enteroaggregative E. coli (EAEC) astA gene, previously described as belonging exclusively to strains of the L2 (i.e., S1 in this study) sublineage, was significantly predominant in the S1 sublineage [4]. However, the astA gene was also recovered in two isolates (Ec116 and C08) from our collection, classified as lineage S2, as they belonged to the fimH14 subtype. Of note, while this gene is supposed to be typical of EAEC (Enteroaggregative E. coli, responsible for diarrhea), we showed that the astA gene was statistically more frequent in isolates from urine (23/36, 63.9%) than in isolates from stool (5/18, 21.7%, p = 0.0028). Finally, the invasion gene ibeABC was exclusively present in the S2 sublineage (Table S1). However, in our study collection, we did not identify any strain with the EHEC-hly gene, except for the strain previously described in Gati's study (N011) [4].
Conversely, the frequencies of the salmochelin system, astA and the K1 capsule were higher in S1 (written in bold).

Discussion
We have demonstrated here that the population of E. coli ST141 is organized into two main sublineages, S1 and S2, one predominantly associated with urinary tract infection (S1) and the other more frequently associated with intestinal infections (S2). This confirms results generated from a smaller collection [4]. Indeed, for the 41 isolates of our collection previously studied by Gati et al., we noted a good correlation between Gati's L1/L2 lineages and our S1/S2 distribution, as all isolates belonging to L2 clustered in the S1 sublineage and possessed the fimH5 subtype (Figure 1). Conversely, all isolates belonging to L1 clustered in the S2 sublineage, with different fimH subtypes (seven of subtype fimH14, three of subtype fimH350, two of subtype fimH76 and one of subtype fimH674) [4] (Table S1, Figure 1). The ST141 lineage was highly conserved, as all the isolates belonged to the B2 phylogroup and had the O2:H6 serotype with a restricted number of fimH subtypes. This contrasts with other successful B2 ExPEC lineages such as ST95, ST117 and ST131, which display a diversity of O-serogroups [2]. Although E. coli ST141 isolates show various combinations of numerous VFs, as previously observed by Flament et al., we found no clear genomic signature that would indicate an ecological adaptation to a host species (i.e., humans and animals), an infection site, or a country of origin (Table S1, Figure 1) [9]. However, such diversity of VFs, consistent with the hypothesis that ST141 acts as one of the melting pots within the E. coli population, could be evidence of numerous recombination events and of the presence of different pathogenicity islands (PAIs), which could be interesting to explore in more detail [4].
Previously, ST141 had been described as a STEC/UPEC hybrid or an EAEC/UPEC hybrid, as some isolates carried stx2 or pic and astA genes, respectively [4]. However, the majority (13/17) of the strains in our ST141 collection (n = 187) whose genomes contained stx2 were those described by Gati et al., and only one of the four additional stx2-positive strains also possibly carried the PAI II 536-like UPEC virulence factor, which calls into question the systematic and specific heteropathogenicity of this clonal group. Moreover, Lindstedt et al. found, in a Norwegian collection, a high frequency (64.3%) of E. coli strains combining IPEC and ExPEC virulence-associated genes [10]. Another study, conducted in Germany, revealed that 10.6% of strains isolated from UTIs harbored at least one IPEC virulence factor [11]. Given these conflicting data, the frequency of heteropathogenicity in E. coli remains to be clarified, as well as the involvement of different lineages (notably ST141) in this trait.
The fact remains that the ST141 clone carries many virulence factors identified in silico. Taking into account the concepts of antagonistic pleiotropy and epistatic interactions, future work could aim to characterize their expression, in particular in vivo in an animal model, or in vitro through the capacity of the strains of our collection to produce biofilm [2].
Nevertheless, E. coli ST141 is commonly involved in human diseases and its incidence may reach a significant level. Indeed, the ST141 lineage was among the 12 most frequent STs involved in bacteremia in France in 2014 [3]. In 2020, ST141 was cited as one of the most common B2-ExPEC in France [9]. However, the incidence of E. coli ST141 may vary between countries, since a higher incidence of ST141-associated infections was found in France than in Spain [9]. In addition, E. coli ST141 has been found to be the most frequent E. coli lineage responsible for ventilator-associated pneumonia over the 2012-2014 study period [12]. Interestingly, in a previous study, Philipps-Houlbracq et al. identified antigen 43 (Ag43) as significantly associated with pneumonia pathogenesis [13]. It is important to note that Ag43 is widely represented in our collection (as in most B2 E. coli), as 97.5% (115/118) and 89.9% (62/69) of strains from sublineages 1 and 2, respectively, harbored this gene (Table 2).
Similarly, the panel of antibiotic resistance genes in the ST141 lineage was large, with many combinations (Table S1, Figure 3). However, we could not identify a multi-resistant epidemic sublineage such as that observed with ST131 (i.e., C2-H30 producing CTX-M-15) [5]. Indeed, our study depicted a situation where E. coli ST141 might only contribute to a limited local spread of ESBL-encoding genes in the community, since the majority of ESBL strains isolated in our hospital form a cluster of strains separated by fewer than 30 SNPs, suggesting a limited local dissemination of an ESBL clone, although these strains were isolated from patients without obvious epidemiological links (Figure 2) [7].
Likewise, although E. coli ST141 was capable of acquiring resistance genes to last-resort antibiotics such as carbapenems (blaOXA-48, blaVIM-1) and colistin (mcr-1), we did not find evidence of the spread of such MDR clones.

ST141 Genome Collection
The genomes of 187 isolates representing the ST141 E. coli group were obtained from various sources collected over a 30-year period (Table S1).
Firstly, 13 non-duplicate ESBL-producing ST141 E. coli isolates, collected from inpatients in our University hospital during a previous prospective observational cohort study between February 2015 and January 2017, were paired-end sequenced with Illumina NextSeq at 2 × 150 bp and included [7]. Then, four additional non-ESBL-producing ST141 E. coli isolates responsible for bloodstream infection, isolated in 2014 in our hospital, were also fully sequenced and added [3]. Finally, we collected all the ST141 E. coli genomes available in public databases (NCBI and ENA) in March 2020.

Genome Analysis
Raw reads were first trimmed using Sickle, and the 187 genomes were de novo assembled from reads using SPAdes, as previously described [14,15]. From these assemblies, we determined in silico the MultiLocus Sequence Type (MLST) according to the Achtman scheme, the O:H serotype, the fimH type, and the phylogroup [8,16]. In the same way, we searched for and identified antibiotic resistance genes with the ResFinder database, and putative virulence factor (VF) genes with the VFDB database, which compiles most E. coli VFs related to adhesion/invasion, autotransporter systems, fimbria or flagella expression, iron uptake, serum resistance and toxicity [17,18]. SNP (Single Nucleotide Polymorphism) calling was performed with BactSNP against a fully sequenced ST141 E. coli genome (NCBI biosample accession number SAMN10740161) used as the reference genome [19]. After recombination curation with Gubbins, a maximum likelihood phylogenetic tree was inferred from the resulting SNP-based pseudogenomes using RAxML [20,21]. A Cluster Picker analysis was performed to identify phylogenetic clusters (Table S1) [22]. The tree and corresponding metadata were visualized with iTOL [23].
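To make the pairwise SNP distances quoted in this work more concrete (for example, isolates separated by fewer than 30 SNPs), the following Python sketch counts differing positions between aligned pseudogenomes and groups isolates by single linkage under a SNP threshold. It is only an illustration under stated assumptions: the function names, the toy sequences and the threshold are hypothetical, ambiguous sites (N, gaps) are simply skipped, and this simple grouping is not the Cluster Picker procedure used in the study.

```python
from itertools import combinations

# Characters treated as missing/ambiguous and ignored when counting SNPs (assumption).
AMBIGUOUS = set("N-?")


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ, skipping ambiguous sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in AMBIGUOUS and b not in AMBIGUOUS
    )


def pairwise_distances(alignment):
    """Return {(name1, name2): SNP distance} for every pair of isolates."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


def single_linkage_groups(alignment, max_snps):
    """Group isolates linked by chains of pairs at most max_snps apart."""
    parent = {name: name for name in alignment}

    def find(x):
        # Follow parent links to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (n1, n2), dist in pairwise_distances(alignment).items():
        if dist <= max_snps:
            parent[find(n1)] = find(n2)

    groups = {}
    for name in alignment:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy alignment; real pseudogenomes are millions of bases long.
toy = {
    "iso_A": "ACGTACGTAC",
    "iso_B": "ACGTACGTAT",
    "iso_C": "ACGAACGTAT",
    "iso_D": "TTTTACGGGG",
}

if __name__ == "__main__":
    print(pairwise_distances(toy))
    print(single_linkage_groups(toy, max_snps=1))
```

Single linkage is used here only because it is easy to read; it can chain distant isolates together through intermediates, which is one reason dedicated tools combine genetic distance with branch support.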

Statistical Analysis
To evaluate the distribution of isolates according to their origin and the distribution of VF genes, variables were examined by univariate analysis using Fisher's exact test. All statistical tests were two-tailed, and a p-value < 0.01 was considered statistically significant.
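As an illustration of the test described above, the sketch below computes a two-tailed Fisher's exact p-value for a 2 × 2 table in plain Python, applied to the Ag43 counts reported in this article (115/118 in sublineage 1 and 62/69 in sublineage 2). It is a minimal sketch, not the software used in the study: the function name is hypothetical, and the two-tailed convention (summing all tables no more probable than the observed one) is an assumption, since the article does not specify it.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def table_prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are as probable as, or less probable than,
    # the observed table (small relative tolerance for floating-point ties).
    p_value = sum(
        table_prob(x)
        for x in range(lowest, highest + 1)
        if table_prob(x) <= observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


if __name__ == "__main__":
    # Ag43 carriage: sublineage 1 = 115 present / 3 absent,
    # sublineage 2 = 62 present / 7 absent (counts taken from the article).
    p = fisher_exact_two_sided(115, 3, 62, 7)
    alpha = 0.01  # significance threshold stated in the article
    print(f"p = {p:.4f}; significant at {alpha}: {p < alpha}")
```

Exact integer binomial coefficients are used so that the large margins (187 isolates) do not cause overflow or rounding problems before the final division.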

Conclusions
Using a genome-based methodology, we depicted the population structure of the E. coli ST141 group using a large collection of 187 genomes. Our findings did not confirm the initial hypothesis of an emerging ESBL-producing ST141 sublineage, but rather demonstrated the spread, on a regional scale, of a subgroup of closely related isolates. Nonetheless, we found that E. coli ST141 readily acquires and accumulates VF-encoding genes and that its prevalence has recently increased in both extra-intestinal and intestinal diseases. Considering this, there is a need to monitor the spread of this clonal group, which has the potential to spread widely in the community and whose involvement in human disease seems to be increasing.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/antibiotics12020382/s1, Table S1: Genome features (origin, VFs, AMR).
Institutional Review Board Statement: All information associated with the genome data was anonymized; thus, identification of individuals is not possible. Therefore, ethical approval was not required.

Informed Consent Statement: Not applicable.
Data Availability Statement: All ST141 genomes from strains recovered from Besançon (local collection, n = 17) are publicly available through the NCBI BioProject PRJNA667655.