Genomic Analysis of a Hybrid Enteroaggregative Hemorrhagic Escherichia coli O181:H4 Strain Causing Colitis with Hemolytic-Uremic Syndrome

Hybrid diarrheagenic E. coli strains combining genetic markers of different pathotypes have emerged worldwide and are recognized as a public health concern. The best-known hybrid enteroaggregative hemorrhagic E. coli strain is E. coli O104:H4, the agent of a serious outbreak of acute gastroenteritis and hemolytic uremic syndrome (HUS) in Germany in 2011. A case of intestinal infection with HUS occurred in St. Petersburg (Russian Federation) in July 2018. E. coli strain SCPM-O-B-9427 was obtained from a rectal swab of the patient with HUS. It was identified as O181:H4, stx2-positive, and aggR-positive and belonged to phylogenetic group B2. The complete genome assembly of strain SCPM-O-B-9427 contained one chromosome and five plasmids, including a plasmid encoding aggregative adherence fimbriae I. MLST analysis showed that strain SCPM-O-B-9427 belonged to ST678, like the E. coli O104:H4 strains 2011C-3493, which caused the German outbreak in 2011, and 2009EL-2050, which was isolated in the Republic of Georgia in 2009. Comparison of the three strains showed that their chromosomes have almost the same structure and that their pAA plasmids and stx2a phages are very similar, but the strains carry distinct sets of plasmids and some unique chromosomal regions.


Introduction
Escherichia coli is a bacterium that is widely distributed as a free-living organism in the environment and as an important member of the large-intestine microbiota of humans and warm-blooded animals [1,2]. Although most E. coli are harmless commensals, some strains of this species are pathogenic and can cause disease in humans and animals [3]. Human-pathogenic E. coli strains exhibit a wide spectrum of clinical manifestations, which depend on their virulence factors [4]. The E. coli genome is characterized by genetic mosaicism, high variability, and the ability to exchange genetic information [5]. In most cases, this exchange occurs through horizontal transfer of mobile genetic elements such as plasmids, transposons, pathogenicity islands, and bacteriophages [6,7]. According to their localization in the host and the pathological processes they cause, pathogenic E. coli are divided into diarrheagenic (DEC) and extraintestinal (ExPEC) pathogens [8,9]. Seven pathotypes have been described for the DEC group, including enteropathogenic E. coli (EPEC), enterohaemorrhagic E. coli (EHEC), enterotoxigenic E. coli (ETEC), enteroinvasive E. coli (EIEC), enteroaggregative E. coli (EAEC), diffusely adherent

Genomic Characteristics of EAHEC Strain SCPM-O-B-9427
The complete genome assembly of strain SCPM-O-B-9427 contained one chromosome and five plasmids, one of which is homologous to the virulence plasmid pAA-pSCPM-O-B-9427-2, encoding aggregative adherence fimbria I (AAF/I). The major genomic characteristics are summarized in Table 2 and are shown in comparison with two hybrid EAHEC strains of O104:H4: 2011C-3493, which caused the large German outbreak in 2011, and 2009EL-2050, which caused the group outbreak in Georgia in 2009. The chromosome sizes of the strains are comparable, differing only slightly, and the GC content is identical. The total numbers of genes, including RNA genes, are about the same, but the chromosome of strain SCPM-O-B-9427 contained more pseudogenes.
In silico multi-locus sequence typing (MLST) based on seven housekeeping gene loci of the Achtman MLST scheme (adk_6, fumC_6, gyrB_5, icd_136, mdh_9, purA_7, recA_7) assigned strain SCPM-O-B-9427 to ST678 [27]. The same sequence type was identified for E. coli O104:H4 strain 2011C-3493, which caused the large German outbreak in 2011, and for E. coli O104:H4 strain 2009EL-2050, which was isolated in the Republic of Georgia in 2009. Several chromosomally located virulence genes were identified in strain SCPM-O-B-9427 as well as in the genomes of the two above-named O104:H4 EAHEC strains. Strain SCPM-O-B-9427 was positive for aaiC (type VI secretion protein), capU (hexosyltransferase homolog), fyuA (siderophore receptor), gad (glutamate decarboxylase), iha (adherence protein), irp2 (high-molecular-weight protein 2 non-ribosomal peptide synthetase), iucC (aerobactin synthetase), iutA (ferric aerobactin receptor), lpfA (long polar fimbriae), pic (serine protease autotransporter of Enterobacteriaceae, SPATE), sigA (serine protease), and terC (tellurium ion resistance protein). However, differences among the three EAHEC strains were found: unlike the O104:H4 strains, strain SCPM-O-B-9427 is negative for the neuC gene, which encodes the polysialic acid capsule biosynthesis protein, and strain 2011C-3493 is negative for the bor gene, which encodes the serum resistance lipoprotein, whereas strains SCPM-O-B-9427 and 2009EL-2050 are positive. In addition to the aggDCBA cluster and the aggR regulator gene, other EAEC genetic determinants were detected in all three EAHEC strains: the sepA gene (Shigella extracellular protein A), the aap gene (dispersin), and the aat operon (factor of adhesion). Other E. coli virulence genes, such as fimH, sfa, papA, hlyA, cnf1, aer, and afaC, were not detected. The region contains the genes involved in lipopoly- (Table 3).
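The sequence-type assignment described above amounts to looking up a seven-allele profile in a scheme table. The following is a minimal sketch of that lookup, not the tool used in the study. The names `LOCI`, `PROFILE_TABLE`, and `assign_st` are hypothetical, and the table holds only the single ST678 profile reported in the text; a real analysis would use the full profile table of the MLST database.

```python
from typing import Dict, Optional, Tuple

# Locus order of the seven-gene scheme, as listed in the text.
LOCI = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")

# Hypothetical miniature profile table: it contains only the ST678
# profile reported in the text, not the full scheme.
PROFILE_TABLE: Dict[Tuple[int, ...], int] = {
    (6, 6, 5, 136, 9, 7, 7): 678,
}


def assign_st(alleles: Dict[str, int],
              table: Dict[Tuple[int, ...], int] = PROFILE_TABLE) -> Optional[int]:
    """Return the sequence type for an allele profile, or None if unknown."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return table.get(profile)


# Allele numbers reported for strain SCPM-O-B-9427.
strain_alleles = {"adk": 6, "fumC": 6, "gyrB": 5, "icd": 136,
                  "mdh": 9, "purA": 7, "recA": 7}
print(assign_st(strain_alleles))  # 678
```

A profile that differs at any single locus is not in this toy table, so the function returns None rather than guessing a nearby sequence type.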
Strains SCPM-O-B-9427 and 2009EL-2050 carried a bor gene, which may be involved in serum complement resistance, whereas strain 2011C-3493 lacks this gene. The second difference between strain 2011C-3493 and the other two strains is in the sequence of the putative tail fiber protein, which is 133 amino acids shorter. The third difference is a deletion that disrupted the homologous antirepressor protein genes (antA and antB); the anti-termination protein gene is absent in strain 2009EL-2050. The nucleotide changes did not affect the stx genes; most SNPs were detected in genes coding hypothetical proteins and the putative endolysin. The number of SNPs between the stx2a phages of strains SCPM-O-B-9427 and 2009EL-2050 is only five: one is intergenic, two are non-synonymous changes (one each in the hypothetical protein and putative tail fiber protein genes), and two are synonymous changes (one each in the repressor protein CI and Shiga toxin 2 subunit A genes).
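The distinction between synonymous and non-synonymous SNPs used above can be illustrated with a short sketch. It assumes the standard genetic code and compares a reference codon with its mutated form; the function name `classify_snp` and the example codons are hypothetical and are not taken from the phage sequences analyzed in the study.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-nucleotide change within one codon."""
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if len(ref) != 3 or len(alt) != 3:
        raise ValueError("codons must be three nucleotides long")
    if sum(a != b for a, b in zip(ref, alt)) != 1:
        raise ValueError("codons must differ at exactly one position")
    same_residue = CODON_TABLE[ref] == CODON_TABLE[alt]
    return "synonymous" if same_residue else "non-synonymous"


# Illustrative codons only (hypothetical, not from the study data).
print(classify_snp("GCT", "GCC"))  # alanine -> alanine
print(classify_snp("GCT", "GTT"))  # alanine -> valine
```

The first call reports a synonymous change and the second a non-synonymous one, because only the second alters the encoded amino acid.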

The Plasmids of the Strain SCPM-O-B-9427
Five plasmids were identified in strain SCPM-O-B-9427. Plasmid sequences were compared with the NCBI nucleotide database using BLASTn to identify their closest matches, which are given in Table 4. The plasmid pB-9427-2 (CP086261) is homologous to the pAA plasmid, which harbors several EAEC-specific virulence loci, including the aggDCBA cluster, the aggR gene, the aatPABCD operon, the sepA gene, and the aap gene. The importance of pAA for disease severity has been demonstrated: it is involved in the host-pathogen interaction [28]. We compared the plasmid pB-9427-2 with the homologous plasmids pAA-EA11 and pAA-09EL50 of strains 2011C-3493 and 2009EL-2050. The plasmids show no rearrangements; pB-9427-2 was 1331 bp larger owing to an additional insertion of an IS4-like element IS421 family transposase (Figure 4). The plasmids of strain SCPM-O-B-9427 did not carry any antibiotic resistance genes, and no significant virulence factors were identified on the plasmid pB-9427-5 except the celb gene (endonuclease colicin E2).

Phylogenetic Analysis for E. coli O181 and O104:H4
To clarify the relationship between the hybrid EAHEC strain SCPM-O-B-9427 and several E. coli O104:H4 and O181 strains found in the NCBI database, a phylogenetic tree based on core chromosomal SNPs was built using Wombac. The tree was based on 96,244 SNPs of 25 strains (Figure 5). The E. coli strains grouped according to their serotype and MLST sequence type. The hybrid strain SCPM-O-B-9427 was very closely related to the group of O181:H4 and O104:H4 strains but not to the O181 strains of non-H4 serotypes. For a better understanding of the relationships among the strains belonging to ST678, a second phylogenetic tree based on 2115 core SNPs of 12 strains was built. The E. coli strains formed three clusters: the first included the O181:H4 strains; the second included the O104:H4 strains that caused the outbreaks in Germany in 2011 and Georgia in 2009; and the third included the other O104:H4 strains (Figure 6). The strains of the second and third clades are hybrid, carrying both the stx2 gene and the aggDCBA cluster. The second and third clades differed in the carriage of the AAF/I or AAF/III fimbriae gene cluster, respectively.
On the phylogenetic tree, the O181:H4 strains are positioned away from the center of the clade and do not form a well-defined subclade. Comparison of the SCPM-O-B-9427 chromosome with the other E. coli O181:H4 genomes showed that the number of core SNPs varied from 133 to 159. Most strains carried genes coding AAF/I fimbriae; only three of them, including strain SCPM-O-B-9427, additionally carried the stx2 genes and were classified as EAHEC.
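Core SNP numbers such as those quoted above are pairwise counts of differing positions in an alignment of core genome sites. The sketch below shows that counting step on toy sequences. It is an illustration only and does not reproduce how Wombac works; the function names and the three short sequences are hypothetical. Positions where either sequence has a gap or an ambiguous base are skipped.

```python
from itertools import combinations
from typing import Dict, Tuple

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences differ (A/C/G/T only)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Pairwise SNP distances for every pair of named sequences."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Toy alignment (hypothetical sequences, not study data).
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACCTANGA"}
for pair, dist in distance_matrix(toy).items():
    print(pair, dist)
```

In the toy data, the N in s3 is ignored, so s1 and s3 differ at two positions rather than three.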
We compared the sequences of the stxA and stxB genes of the O181:H4 strains (Figure 6), which were attributed to the 2a type. Significant differences were revealed between these genes of strain SCPM-O-B-9427 and those of the other two strains of this clade, PNUSAE005891 and PNUSAE043602. Strains PNUSAE005891 and PNUSAE043602 have identical stx2 gene sequences. Although the StxA amino acid sequences are the same for the three strains, i.e., PNUSAE005891, PNUSAE043602, and SCPM-O-B-9427, seven synonymous SNPs were identified in the stxA gene of strain SCPM-O-B-9427. Moreover, two synonymous SNPs and one SNP that led to a change from alanine to valine were
We also compared the stx2a-carrying phage sequence of strain SCPM-O-B-9427 with the corresponding phage sequences of strains PNUSAE005891 and PNUSAE043602. In NCBI Genome, the whole-genome sequencing results for these strains are available as contig-level assemblies. In strain PNUSAE005891, the stxA and stxB genes were located in contig 51 (AASQNB010000051, 21,013 bp), and in strain PNUSAE043602, in contig 30 (AAQWXB010000030, 73,345 bp). Alignment of the stx2a phage of strain SCPM-O-B-9427 with contig 30 of strain PNUSAE043602 showed only 11% query cover (the region of the stx genes) and 96.24% identity within this region, indicating that the stx2a-carrying phages have different genetic structures. A BLAST search showed that the sequence closest to the prophage of strain PNUSAE043602 was the stx2a-carrying phage Stx2_12E129_yecE (LC567842, 46,100 bp), with 82% query cover and 94.64% identity. Comparison of contig 30 of strain PNUSAE043602 with the PNUSAE005891 assembly revealed that the stx2a-carrying phage of the latter strain is split across several contigs, but the two phages appear to be similar.
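The two alignment metrics quoted above measure different things: query cover is the fraction of the query spanned by aligned segments, whereas identity is the fraction of matching positions within the aligned region. A low cover combined with a high identity therefore points to a short shared region in otherwise different sequences. The sketch below shows one way to compute both values; the function names and the interval data are hypothetical, and the calculation is a simplified illustration rather than the exact procedure of the BLAST web service.

```python
from typing import Iterable, List, Tuple


def query_coverage(hsps: Iterable[Tuple[int, int]], query_length: int) -> float:
    """Percent of the query covered by aligned segments.

    Each segment is a (start, end) pair in 1-based inclusive query
    coordinates; overlapping or adjacent segments are merged first so
    that no query position is counted twice.
    """
    merged: List[List[int]] = []
    for start, end in sorted(hsps):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    covered = sum(end - start + 1 for start, end in merged)
    return 100.0 * covered / query_length


def percent_identity(matches: int, alignment_length: int) -> float:
    """Percent of aligned positions that are identical."""
    return 100.0 * matches / alignment_length


# Hypothetical segments on a 1000-bp query (not study data).
segments = [(1, 100), (50, 150), (301, 400)]
print(query_coverage(segments, 1000))  # 25.0
print(percent_identity(96, 100))       # 96.0
```

Merging the segments before summing matters: without it, the overlap between the first two segments would inflate the coverage.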

Discussion
Before the German outbreak in 2011, DEC E. coli were thought to fall into several well-defined pathotypes, each based on the presence of specific virulence factors directly related to disease development. This view collapsed when that severe outbreak proved to be caused by a hybrid strain bearing the virulence factors of both enteroaggregative and Shiga-toxin-producing E. coli [16,17]. Since then, many studies have shown that E. coli strains combining genetic markers of different pathotypes are more frequent than previously thought [12][13][14][15]. Hybrid DEC are associated with more serious disease that may frequently progress to HUS, which could be explained by the production of a set of proteins involved in intestinal colonization, leading to persistent diarrhea and facilitating Stx absorption [29].
We present the description of a new hybrid EAHEC O181:H4 strain obtained from the rectal swab of a HUS patient in St. Petersburg (Russian Federation) in July 2018. The strain was investigated by whole-genome sequence analysis in this study.
In silico genoserotyping confirmed that strain SCPM-O-B-9427 was attributed to the O181:H4 serogroup and to sequence type ST678. The same ST was identified for the strains that caused the large German outbreak in 2011 and the bloody diarrhea outbreak in the Republic of Georgia in 2009, but these strains belonged to the O104 serogroup [20]. Membership in the same ST and similar sets of virulence genes suggested a close genetic relationship between these strains of different serogroups. We found six draft genome assemblies of E. coli O181:H4 in the NCBI database, without detailed information about isolation sources and hosts. All of them belonged to ST678 and carried the aggR gene, and only two possessed the stxA and stxB genes. Considering these data, it was not surprising that, in the core-SNP phylogeny, the E. coli O181:H4 strains were very closely related to E. coli O104:H4.
Comparison of E. coli strain SCPM-O-B-9427 with E. coli strains 2011C-3493 and 2009EL-2050 showed almost the same chromosome structure and high similarity of the two elements carrying the major virulence factors, the pAA plasmids and the stx2a-bearing prophages. Few SNPs were identified in the homologous parts of the chromosomes. At the same time, the strains differed from one another: they carried distinct sets of plasmids and unique chromosomal regions. Thus, the compared strains are close relatives that probably share a common ancestor but underwent different genetic events during their evolution.
Unfortunately, the genetic characteristics of EAHEC strain SCPM-O-B-9427 do not allow us to answer the question about the origin and evolution of this pathogen. We can draw on the hypothesis about the genesis of the EAHEC O104:H4 strain that caused the severe foodborne infection in Germany in 2011 [16]. The proposed model for the origin of that strain is the acquisition of an stx2a-converting prophage by an EAEC strain through transduction. Stx-converting phages are considered highly mobile genetic elements capable of infecting a susceptible bacterium and leading to positive Stx conversion. However, after transduction into an EAEC cell, the stx2a phage must also integrate into the chromosome at an appropriate location, function properly, and be stably inherited. In some cases, infection of DEC strains by Stx-converting phages has been achieved experimentally. The emergence of hybrid EAHEC strains is not a frequent event, which can be explained by the differing susceptibility of EAEC strains to Stx-converting phages. Probably, the EAEC of O104:H4 and O181:H4 were competent recipients of Stx-converting phages; moreover, they are genetically very close to each other. Our study revealed that the stx2a prophages of strains SCPM-O-B-9427 and 2009EL-2050 are almost identical and differ somewhat from the stx2a prophage of strain 2011C-3493; nevertheless, all three stx2a prophages are very similar. Interestingly, the O181:H4 E. coli strains PNUSAE005891 and PNUSAE043602 carried stx2a prophages different from that of strain SCPM-O-B-9427. This fact may indicate an increased sensitivity of E. coli of different origins to Stx-phage infection.
The formation of new hybrid DEC strains poses a high potential threat to humans because it combines the damage that Stx inflicts on susceptible cells with aggregative adherence-mediating fimbriae that enable colonization of the gastrointestinal tract, which can worsen disease and lead to the development of HUS.
E. coli are characterized by a high degree of genetic heterogeneity, high genome plasticity, and the ability to exchange genetic information and thereby increase virulence, to acquire drug resistance, and eventually improve the adaptation to different environments, keeping E. coli as a successful bacterium in various ecological niches.
Although strain SCPM-O-B-9427 and the "Georgian strain" caused smaller foodborne outbreaks than the "German strain", some patients developed HUS. The emergence of an O181:H4 EAHEC strain, phylogenetically related to the Shiga-toxin-producing E. coli O104:H4, shows that new genetic variants are continuously formed in this bacterial species. Therefore, genetic research is essential for detecting and controlling the spread of new variants of pathogenic bacteria.

Bacterial Strains Used in This Study
Escherichia coli strain SCPM-O-B-9427 was isolated from the patient's rectal swab, which was received from The Center of Hygiene and Epidemiology in Saint Petersburg, and was deposited in The State Collection of Pathogenic Microbes of The State Research Center for Applied Microbiology and Biotechnology, Obolensk. Permission from the Ethics Committee was not required to conduct the study. The name of the strain does not contain personal data about the patient.

DNA Isolation and Pathotype and Phylogroup Identification
DNA extraction was performed using a nucleic acid extraction kit AmpliSens ® RIBOprep (InterLabService, Moscow, Russia). Pathotype detection was performed using commercial assay AmpliSens ® Escherichioses-FRT (InterLabService, Moscow, Russia  [32]. Short and long raw reads were used to obtain the hybrid assembly of the strain using Unicycler v. 0.4.7 software (The University of Melbourne, Victoria, Australia) with default settings that included primary filtering and quality control [33]. Annotation was carried out by NCBI Prokaryotic Genome Annotation Pipeline (PGAP) v. 5.3 (National Center for Biotechnology Information, Bethesda, MD, USA) [34].

Conclusions
This study revealed the genetic properties of the new hybrid enteroaggregative/Shiga-toxin-producing (EAHEC) strain E. coli Notably, the strains SCPM-O-B-9427 and 2009EL-2050 did not cause outbreaks as large as that caused by strain 2011C-3493. Nevertheless, hybrid strains producing Stx2 and aggregative adherence-mediating fimbriae simultaneously could pose a threat to humans. The combination of these virulence factors increases the pathogenic potential owing to the damaging effects on the intestinal epithelium and colonization of the gastrointestinal tract. Therefore, it is necessary to investigate new genetic variants of pathogenic E. coli to detect their spread among people and in the environment.

Institutional Review Board Statement:
The study did not require ethical approval because it did not involve humans or animals.