Identification and Genomic Characterization of Escherichia albertii in Migratory Birds from Poyang Lake, China

Escherichia albertii is an emerging zoonotic foodborne enteropathogen that causes outbreaks of human gastroenteritis. Although E. albertii has been isolated from birds, which are considered potential reservoirs of this bacterium, its prevalence in migratory birds has rarely been described. In this study, E. albertii in migratory birds from Poyang Lake was investigated and characterized using whole genome sequencing. Eighty-one fecal samples from nine species of migratory birds were collected, and 24/81 (29.6%) tested PCR-positive for E. albertii-specific genes. A total of 47 isolates were recovered from 18 of the 24 PCR-positive samples. All isolates carried the eae and cdtB genes. These isolates were classified into eight E. albertii O-genotypes (EAOgs) (including three novel EAOgs) and three E. albertii H-genotypes (EAHgs). Whole genome phylogeny separated the migratory bird-derived isolates into different lineages; some isolates in this study grouped closely with poultry-derived or patient-derived strains. Our findings show that migratory birds may serve as an important reservoir for heterogeneous E. albertii, thereby acting as potential transmission vehicles of E. albertii to humans.


Introduction
Escherichia albertii is a newly described Escherichia species and emerging foodborne pathogen causing watery diarrhea, abdominal distention, vomiting, fever, and even bacteremia in humans [1]. E. albertii was first identified in diarrheal children from Bangladesh and has been associated with several human gastroenteritis outbreaks in Japan [2,3]. Due to the lack of distinguishing biochemical characteristics, E. albertii strains were often misidentified as E. coli, Hafnia alvei, Salmonella enterica, or Shigella boydii serotype 13 [4]. Thus, the prevalence of E. albertii in different hosts may well have been underestimated.
E. albertii possesses an outer membrane protein, intimin, encoded by the eae gene, which is responsible for the formation of attaching and effacing (A/E) lesions on the host intestinal epithelium [1]. In addition, almost all E. albertii isolates harbor a cdtABC locus, which encodes the cytolethal distending toxin (CDT) [5]. The cdtB gene has been divided into five subtypes (cdtB-I to cdtB-V) in E. coli. A new subtype, cdtB-VI, was recently reported in E. albertii [6]. Shiga toxin genes (stx2a and stx2f) have been identified in some E. albertii isolates [7,8].
Other virulence factors reported in E. albertii, such as enteroaggregative E. coli heat-stable enterotoxin encoded by astA, may contribute to pathogenicity, but they have not been systematically investigated [6].

Whole Genome Sequencing and Assembling
Genomic DNA of all isolates was extracted from overnight culture using the Wizard Genomic DNA purification kit (Promega, Madison, WI, USA) according to the manufacturer's instructions. Bacterial genomes were sequenced and assembled as previously described [23]. One representative E. albertii isolate from each sample was sequenced using the combined methods of the PacBio Sequel (Pacific Biosciences, Menlo Park, CA, USA) and Illumina NovaSeq 6000 platform (Illumina, San Diego, CA, USA) to obtain complete genomes. The raw PacBio sequencing reads were first quality-controlled with the "RUN QC" module in SMRT Link version 5.1.0 (www.pacb.com/support/software-downloads accessed on 18 September 2022) and de novo assembled using the hierarchical genome assembly process (HGAP) pipeline [24], then corrected with the Illumina short reads. The remaining isolates were sequenced using the Illumina platform as described above. The paired-end reads were filtered by fastp v0.20.1 (https://github.com/OpenGene/fastp accessed on 18 September 2022) [25] and assembled using SKESA v2.4.0 [26].

Molecular Characterization of E. albertii Isolates
The E. albertii H-genotypes (EAHgs) were determined using BLAST+ searches against the four H-genotype sequences described by Nakae et al., with coverage ≥ 90% and identity ≥ 99% [27]. The E. albertii O-genotypes (EAOgs) were determined using BLAST+ searches against 42 primer pairs described by Ooka et al. [28]. For unmatched isolates, each O-antigen biosynthesis gene cluster (O-AGC) between the galF and gnd genes was extracted from the genome sequence. Open reading frames (ORFs) were predicted using Prokka v1.14.6 [29]. Functional annotation of the ORFs was based on the results of a homology search against the public, non-redundant protein database using BLASTP. The DeepTMHMM v1.0.10 analysis program (https://dtu.biolib.com/DeepTMHMM accessed on 10 October 2022) [30] was used to identify potential transmembrane segments in the amino acid sequences. The genetic structures of the O-AGCs were visualized using EasyFig v2.2.5 [31].
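The coverage and identity thresholds used for H-genotyping can be illustrated with a short filter over tabular BLAST+ results. This is a minimal sketch, not the pipeline used in this study: it assumes that the genome was the query, that the reference genotype sequences were the subjects, and that the results were written with the custom tabular format `-outfmt "6 qseqid sseqid pident length slen"`. The names `Hit`, `parse_hits`, and `call_genotype` are hypothetical, and coverage is taken here as alignment length divided by reference length, which is one of several possible definitions.

```python
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    query: str       # genome contig identifier (qseqid)
    subject: str     # reference genotype name, e.g. "EAHg1" (sseqid)
    identity: float  # percent identity (pident)
    aln_len: int     # alignment length (length)
    ref_len: int     # full length of the reference sequence (slen)


def parse_hits(tabular_text: str) -> List[Hit]:
    """Parse tab-separated lines with the five assumed columns."""
    hits = []
    for line in tabular_text.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        q, s, pident, length, slen = line.split("\t")
        hits.append(Hit(q, s, float(pident), int(length), int(slen)))
    return hits


def call_genotype(hits: List[Hit],
                  min_cov: float = 90.0,
                  min_ident: float = 99.0) -> Optional[str]:
    """Return the best reference passing both thresholds, or None."""
    passing = [
        h for h in hits
        if 100.0 * h.aln_len / h.ref_len >= min_cov and h.identity >= min_ident
    ]
    if not passing:
        return None  # unmatched isolate
    # Prefer the highest identity, then the longest alignment.
    best = max(passing, key=lambda h: (h.identity, h.aln_len))
    return best.subject


# Made-up example rows, for illustration only.
example = (
    "contig1\tEAHg1\t99.8\t1500\t1510\n"
    "contig1\tEAHg3\t92.1\t1490\t1500\n"
    "contig2\tEAHg4\t99.5\t700\t1520\n"
)
genotype = call_genotype(parse_hits(example))
print(genotype)
```

In the made-up example, only the first row passes both thresholds: the second fails on identity and the third on coverage. An isolate with no passing hit returns `None`, which corresponds to the unmatched case handled by O-AGC extraction in the text.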

Data Availability
All genomes in this work were submitted to GenBank under the accession numbers CP099868-CP099914 and JAMXMZ000000000-JAMXOB000000000. The annotated sequences of O-antigen biosynthesis gene clusters were submitted to GenBank under accession numbers OP019328-OP019330.

Prevalence of E. albertii in Fecal Samples of Different Migratory Bird Species
A total of 81 fecal samples from nine species of migratory birds were collected. Most samples came from Eurasian wigeon (Mareca penelope) (n = 31), Taiga bean goose (Anser fabalis) (n = 19), Greater white-fronted goose (Anser albifrons) (n = 11), and Northern pintail (Anas acuta) (n = 11). Of the 81 samples, 24 (29.6%) yielded a 393-bp PCR amplicon specific for E. albertii, and 18 (22.2%) PCR-positive samples were culture-positive for E. albertii (Table 1). Isolates were recovered from six species of migratory birds: Eurasian wigeon, Taiga bean goose, Greater white-fronted goose, Northern pintail, Lesser white-fronted goose, and Tundra swan. One isolate was obtained from each of three fecal samples, two isolates from each of four samples, three isolates from each of eight samples, and four isolates from each of three samples. A total of 47 E. albertii isolates were kept for further analysis (Table 1).
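The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in this paragraph; the variable names are illustrative.

```python
# Number of isolates recovered per sample -> number of samples with that yield
isolates_per_sample = {1: 3, 2: 4, 3: 8, 4: 3}

total_samples = 81
pcr_positive = 24

# Culture-positive samples are those that yielded at least one isolate.
culture_positive = sum(isolates_per_sample.values())
total_isolates = sum(k * n for k, n in isolates_per_sample.items())


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(culture_positive, total_isolates)      # samples, isolates
print(pct(pcr_positive, total_samples))      # PCR-positive rate
print(pct(culture_positive, total_samples))  # culture-positive rate
```

The per-sample yields sum to 18 culture-positive samples and 47 isolates, and the two rates round to 29.6% and 22.2%, consistent with the figures in the text.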

Antimicrobial Susceptibility and Antimicrobial Resistance (AMR) Genes
Twelve isolates (25.5%) showed resistance to tetracycline; all isolates, except one, carried the tetracycline resistance gene tetB. All isolates possessed the macrolide-associated resistance gene mdfA, although only six isolates showed resistance to azithromycin. Twelve isolates possessed the sulfonamide resistance gene sul2, yet all of them were susceptible to trimethoprim-sulfamethoxazole. All 47 isolates were susceptible to the other 15 antimicrobials tested (Table S1).

Genomic Variations among 47 E. albertii Isolates
The genome sizes of the 18 complete E. albertii genomes ranged from 4,657,64 to 5,122,791 bp, with gene numbers ranging from 4477 to 4942. Plasmids were identified in 15 of the 18 complete genomes. Among these 15 isolates, three possessed one plasmid, eleven possessed two plasmids, and one possessed four plasmids. The genome sizes of the 29 draft genomes ranged from 4,575,346 to 5,334,983 bp, with gene numbers ranging from 4461 to 5228. The GC content of all isolates was approximately 49% (Table S3).
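Summary statistics such as genome size and GC content can be derived directly from an assembly in FASTA format. The following is a minimal sketch under stated assumptions, not the procedure used to produce Table S3: the function names `read_fasta` and `genome_stats` are hypothetical, the input is FASTA text already loaded into memory, and GC content is computed over unambiguous A/C/G/T bases only.

```python
from typing import Dict


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {sequence name: uppercase sequence}."""
    parts: Dict[str, list] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # first token of the header line
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def genome_stats(records: Dict[str, str]) -> Dict[str, float]:
    """Total length, number of sequences, and GC percentage."""
    size = sum(len(seq) for seq in records.values())
    gc = sum(seq.count("G") + seq.count("C") for seq in records.values())
    acgt = sum(seq.count(base) for seq in records.values() for base in "ACGT")
    gc_percent = round(100 * gc / acgt, 2) if acgt else 0.0
    return {"size_bp": size, "n_sequences": len(records), "gc_percent": gc_percent}


# Toy input for illustration only (a "chromosome" and one "plasmid").
toy = ">chr\nATGCGC\nNNAT\n>plasmid1\nGGCC\n"
stats = genome_stats(read_fasta(toy))
print(stats)
```

For a complete genome, each record would correspond to the chromosome or a plasmid, so `n_sequences` minus one gives the plasmid count under the assumption of a single closed chromosome.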
A maximum-likelihood phylogenetic tree based on 83,912 SNPs identified among the 47 E. albertii isolates was constructed. Six main clades were observed in the phylogenetic tree (Figure 2). Three samples (sample IDs 121, 153, and 253) each yielded 2-3 isolates that fell into different clades. For example, isolates 121_1_EW_A and 121_2_EW_A were placed in Clade 2 and Clade 4.2, respectively. Isolates 153_1_TBG_A, 153_2_TBG_A, and 153_3_TBG_A were placed in Clade 3, Clade 4.2, and Clade 2, respectively. Isolates 253_1_EW_B, 253_2_EW_B, and 253_3_EW_B were placed in Clade 3, Clade 4.1, and Clade 4.2, respectively. This may indicate that various clones of E. albertii coexisted in the same individual. Notably, some isolates from different migratory birds showed high phylogenetic relatedness (e.g., in Clade 3 and Clade 4.2), although the migratory birds were generally captured at different sites around Poyang Lake (Figure 2A).

Global Comparative Analysis of E. albertii
Among the 15 samples that yielded more than one isolate each, eight samples (105, 121, 153, 205, 208, 222, 251, and 253) yielded different isolates based on their molecular characteristics and SNPs. Some isolates derived from the other seven samples (104, 110, 133, 147, 155, 233, and 247) showed similar antimicrobial phenotypes and molecular characteristics and clustered into the same clade in the phylogenetic tree (Tables S1 and S2, and Figure 2A). This may indicate that these isolates are derived from the same clone. One of the clonal strains from the same sample was used to understand the genetic relationship of E. albertii populations from birds, humans, and other sources. A phylogenetic tree was constructed based on 164 E. albertii genome sequences downloaded from the NCBI and Enterobase databases (Table S4) and 29 genome sequences from this study. A previous study showed that E. albertii strains were divided into eight lineages (L1-L8) [6]. The phylogenetic tree showed that 22 isolates from migratory birds fell into three different lineages (L5, L7, and L9), while the remaining seven isolates did not belong to any lineage (Figure 2B). Of the three lineages, L5 and L7 were proposed in the previous study [6], and L9 was expanded and first defined in this study. Some isolates were phylogenetically related to poultry- or patient-derived strains, especially in lineage L7 (Figure 2B).
To further define the genomic distance between migratory bird-derived isolates and phylogenetically related strains, pairwise core genome SNP (cgSNP) values were calculated and analyzed (Figure 3 and Table S5). In L5, the distances between three migratory bird-derived isolates and phylogenetically related strains ranged from 755 to 1096 cgSNPs. In L7, the distances between nineteen migratory bird-derived isolates and phylogenetically related strains (including one strain, SRR1999986, that fell outside the defined lineages) ranged from 97 to 8267 cgSNPs. In L9, the distances between fourteen migratory bird-derived isolates and phylogenetically related strains ranged from 458 to 7787 cgSNPs (Table S5). The strains showing phylogenetic relatedness to the migratory bird isolates were mainly isolated from humans (13 strains from the UK, Japan, Guinea, Poland, and China), poultry (9 strains from China), birds (2 strains from Japan and France), a dog (1 strain from Australia), and an unknown source (1 strain from the USA) between 1994 and 2019 (Figure 3). The thirteen human-derived strains were collected from clinical settings, and one of them (GCA_001515065.1) was identified as the causative bacterium of a human gastroenteritis outbreak in Japan in 2011 [3]. The nine phylogenetically related poultry-derived strains were collected in different provinces of China. Notably, isolate 105_3_LWG_A in L7 showed a close phylogenetic relationship with a poultry-derived strain (SRR13494886) and a human-derived strain (SRR12769693), with an average of 101 cgSNPs. Moreover, these two strains were collected in China in 2014 (SRR13494886) and in the UK in 2019 (SRR12769693), respectively (Figure 3).
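A pairwise cgSNP value is, in essence, the number of core-genome alignment positions at which two genomes differ. The sketch below illustrates that idea on toy sequences; it is not the tool or pipeline used to generate Table S5, which is not described in this section. The function names, the strain labels, and the choice to ignore positions containing gaps or ambiguous bases are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Tuple

VALID_BASES = set("ACGT")


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences carry different bases.

    Positions where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x in VALID_BASES and y in VALID_BASES
    )


def pairwise_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """SNP distance for every unordered pair of names, in sorted name order."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Toy core-genome alignment with hypothetical labels.
toy_alignment = {
    "bird_isolate": "ACGTACGTAC",
    "poultry_strain": "ACGTACGTAT",
    "human_strain": "ACGAACGTAN",
}
distances = pairwise_distances(toy_alignment)
for pair, d in distances.items():
    print(pair, d)
```

An average distance between one isolate and several related strains, such as the value reported for 105_3_LWG_A, would then be the mean of the corresponding entries in the resulting dictionary.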

Figure 2. (A) Phylogenetic tree of the E. albertii isolates in this study. The scale represents the evolutionary distances. Isolate details are summarized in their names: Sample ID_Number_Abbreviation of migratory birds_Sampling sites. Names sharing a sample ID and shown in red indicate isolates that could be the same clone from one sample. The asterisk marks the representative strain selected for further genomic analysis. (B) An SNP-based phylogenetic tree of 193 E. albertii genome sequences. From inside to outside, the rings show the country of isolation, the source, and the lineage. The leaves of the tree are annotated with strain names (this study) or accession numbers of reference strains downloaded from NCBI or Enterobase.

Discussion
E. albertii is known to be an emerging zoonotic foodborne pathogen and has been isolated from several species of wild birds (Redpoll finches, European wigeon, Pine siskins, Magpies, Pigeons, and others) worldwide, demonstrating its diverse reservoirs and global distribution [12,44]. The present study is the first to report E. albertii in different migratory birds in Poyang Lake, China. Besides Eurasian wigeon, E. albertii was identified for the first time in Taiga bean goose, Greater white-fronted goose, Northern pintail, Lesser white-fronted goose, and Tundra swan. The overall culture-positive rate was 22.2% (18/81) in this study, which was higher than that reported in birds from other countries (0.7-3.2%) [12,44].

Serotyping plays an important role in the diagnosis and epidemiological study of pathogens of public health importance. For example, most reported outbreaks of E. coli have been attributed to several serogroups (e.g., O26, O111, and O157) [45]. The diversity of O-antigen biosynthesis gene clusters (O-AGCs) provides the primary basis for serotyping. As an alternative to the conventional agglutination test, forty O-genotypes (named EAOg1-EAOg40) and four H-genotypes (EAHg1-EAHg4) unique to E. albertii have been proposed [27,28]. The O-antigen genotypes of E. albertii have been associated with virulence genes. For example, EAOg18 strains were predominant among human-derived strains and often harbored the stx2f gene [46]. In this study, 38 isolates belonged to five known EAOgs (EAOg1, EAOg2, EAOg6, EAOg8, and EAOg21) and three EAHgs (EAHg1, EAHg3, and EAHg4). Three novel O-AGCs, designated EAOg41-43, were identified among nine isolates in this study, indicating the high diversity of E. albertii in migratory birds.
The eae and cdtB genes are commonly considered the key virulence determinants of E. albertii [5]. Currently, at least 30 eae subtypes have been described in E. coli. Some subtypes, such as beta1, gamma1, and sigma, are common in E. albertii, but several novel subtypes have also been identified in E. albertii, implying a pathogenic difference between E. coli and E. albertii [6,13]. The CDT is encoded by the cdtABC genes, which are widely distributed in E. albertii [1]. The cdtB gene has been divided into six subtypes (cdtB-I to cdtB-VI), with cdtB-II and cdtB-VI being the most common subtypes [6,13]. Recently, several cdtB-II-positive E. coli isolates were reclassified as E. albertii, suggesting that previously identified cdtB-II-positive E. coli isolates might be E. albertii [47]. The E. albertii cdtB-II gene (Eacdt) was used to develop a PCR assay for the detection of E. albertii [20]. In this study, all isolates possessed eae and cdtB, but none carried stx2. The predominant eae and cdtB subtypes were sigma and cdtB-II, which are common in clinical strains of E. albertii (Table S3). Moreover, the heat-stable enterotoxin gene astA, an important virulence gene in diarrheagenic E. coli [48], was also present in several isolates from migratory birds. These findings indicate that the migratory bird-derived isolates may have pathogenic potential for humans.
The E. albertii isolates from migratory birds were classified into different phylogenetic clusters, indicating the genomic diversity of E. albertii in migratory birds in Poyang Lake. Several genotypes of E. albertii coexisted in a single individual, similar to other findings in raccoons and other birds [44,49]. Isolates belonging to the same clone were identified in two sampling sites (A and B) about 15 km apart, indicating clonal transmission in Poyang Lake.
Several gastroenteritis outbreaks caused by E. albertii have been reported [3]. Environmental water and vegetables were identified as transmission vehicles in previous outbreaks [3,50]. Migratory birds are known to be involved in the maintenance and dissemination of zoonotic pathogens such as viruses, tick-borne pathogens, Vibrio, Listeria monocytogenes, Salmonella enterica, Escherichia coli, Campylobacter jejuni, and Mycobacterium avium [15,51-53]. These pathogens can also be transmitted to humans, animals, and poultry through contaminated water. Humans may come into contact with contaminated water used for household or agricultural purposes. Therefore, migratory birds with a higher occurrence rate of E. albertii might contaminate environmental water, leading to human infections or outbreaks. In the present study, the isolates from migratory birds were genetically related to those isolated from humans, poultry, birds, and a dog from different regions or countries. Humans, animals, and poultry might be infected through contact with contaminated water, soil, and food. The strains that differed by an average of 101 cgSNPs were closely related to each other. Although the estimated number of cgSNPs per year for E. albertii or E. coli is unclear, a cutoff of ≤21 cgSNPs per genome per year has been proposed for Klebsiella pneumoniae [54]. These isolates with 101 cgSNP differences might share a recent common ancestor and reflect clonal transmission. These data further support the view that migratory birds might play a significant role in pathogen dissemination.
Our method of bacterial isolation and identification has several limitations that must be acknowledged. First, the E. albertii-specific primers proposed by Lindsey et al. [19], which target EAKF1_ch4033, correctly identified only 96.5% (305/316) of strains and missed 11 [9]. Second, selecting white or colorless colonies on MacConkey agar might exclude lactose-fermenting E. albertii isolates [9]. These limitations might lead to underestimation of the actual occurrence rate of the pathogen. We also acknowledge the limited sample size of this study; larger-scale and more in-depth epidemiological studies of migratory birds, other animal species, the environment, and humans around the Poyang Lake region are warranted to understand the role of birds as carriers of E. albertii and the potential for transmission to the environment and humans.
In conclusion, this study showed that migratory birds may serve as an important reservoir of heterogeneous E. albertii and as a potential source of transmission leading to human infections. Given the limited number of samples in this study, integrated, global, 'One Health' approaches are needed in the future to study E. albertii, an emerging zoonotic bacterial pathogen of public health importance.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/pathogens12010009/s1, Table S1. The antimicrobial resistance phenotypes and genes of 47 isolates of migratory birds; Table S2. The virulence genes of 47 migratory bird-derived isolates; Table S3. Genome profiles of the 47 isolates of migratory birds in this study; Table S4. Escherichia albertii reference strains from NCBI and Enterobase used in this study; Table S5. Pairwise cgSNP values between migratory bird isolates and phylogenetically related strains in L5, L7, L9, and other lineages.

Conflicts of Interest:
The authors declare no conflict of interest.