Characterization of the Escherichia coli Virulent Myophage ST32

The virulent phage ST32 that infects the Escherichia coli strain ST130 was isolated from a wastewater sample in China and analyzed. Morphological observations showed that phage ST32 belongs to the Myoviridae family, as it has an icosahedral capsid and long contractile tail. Host range analysis showed that it exhibits a broad range of hosts including non-pathogenic and pathogenic E. coli strains. Interestingly, phage ST32 had a much larger burst size when amplified at 20 °C as compared to 30 °C or 37 °C. Its double-stranded DNA genome was sequenced and found to contain 53,092 bp with a GC content of 44.14%. Seventy-nine open reading frames (ORFs) were identified and annotated as well as a tRNA-Arg. Only nineteen ORFs were assigned putative functions. A phylogenetic tree using the large terminase subunit revealed a close relatedness with four unclassified Myoviridae phages. A comparative genomic analysis of these phages showed that the Enterobacteria phage phiEcoM-GJ1 is the closest relative to ST32 and shares the same new branch in the phylogenetic tree. Still, these two phages share only 47 of 79 ORFs with more than 90% identity. Phage ST32 has unique characteristics that make it a potential biological control agent under specific conditions.


Introduction
Pathogenic Escherichia coli (E. coli) is a common zoonotic agent that poses a significant threat to public health and safety. Shiga-toxin-producing E. coli (STEC) strains are among the most important foodborne pathogens [1,2]. The Shiga toxin (Stx) cleaves ribosomal RNA, thereby disrupting protein synthesis and killing the intoxicated epithelial or endothelial cells [3]. STEC infection can result in diseases such as diarrhea, hemorrhagic colitis, and hemolytic-uremic syndrome (HUS) in humans and animals. These diseases are treated with various drugs, including antibiotics such as ampicillin, streptomycin, sulfonamides, and oxytetracycline [4,5].
It is well-known that the use of antibiotics can lead to the spread of antibiotic-resistant bacteria in the environment, which poses a risk to human health [6][7][8]. Antimicrobial resistance of E. coli is an issue of the utmost importance since it can affect both animals and humans [9]. This bacterial species has a great capacity to accumulate antibiotic resistance genes, mostly through horizontal gene transfer [10,11].
Viruses 2018, 10
For example, the intensive use of various antibiotics in aquaculture has had significant benefits to the fish industry but it has also led to serious negative effects on the environment, including the emergence of a pool of antibiotic-resistant bacteria and transferable resistance genes [6,12–14]. Some of those antibiotic-resistance genes can be transferred horizontally from bacteria in aquatic environments to pathogenic bacteria, affecting land animals and humans [13,14]. Moreover, the transmission of resistant clones and resistance plasmids of E. coli from poultry to humans has also been identified [15,16]. Of note, the highest rate of antibiotic-resistance genes was found in E. coli strains of a sewage treatment plant that treats both municipal and hospital sewage [17][18][19]. Although wastewater treatment processes reduce the number of bacteria in sewage by up to 99%, E. coli cells can still reach the receiving water and contribute to the dissemination of resistant bacteria into the environment [20]. As a result, antimicrobial resistance in E. coli is considered one of the major challenges for both humans and animals at a worldwide scale and it needs to be considered as a real public health concern.
Alternative strategies must be developed to reduce the risk associated with the dissemination of antimicrobial resistance and to control the risk of disease transmission. The use of phages as biocontrol agents has received increasing attention recently as a possible alternative or as a complement to antibiotics [21][22][23][24][25][26][27]. For example, bacteriophages have demonstrated efficacy in controlling pathogenic bacterial populations in, among others, poultry meat [28], aquaculture [23], wastewater, and minimally processed, ready-to-eat products and fresh fruits [25,29–31]. Phages can also help to remove bacteria on chicken skin [22] and on dairy cows at different lactation stages [26]. Interestingly, these bacterial viruses can be highly specific to a single bacterial species or to only a few strains within that species, or can productively infect a range of bacterial species [32,33].
In the present study, we used the host pathogenic E. coli ST130 (flagellin H21) carrying Shiga toxin (stx1, stx2) genes to isolate and characterize a new virulent coliphage, named ST32. This phage was isolated from sewage water and possesses appealing characteristics that could be of interest for specific biocontrol purposes.

Bacterial Strain
Escherichia coli ST130 was obtained from the Chinese Center for Disease Control and Prevention (China CDC). This bacterium was used as the phage host.

Phage Isolation and Purification
Phage ST32 was isolated from a wastewater sample of a sewage treatment plant in Beijing, and was propagated and titrated using methods described previously [34]. Samples were filtered with a 0.45 µm sterile PES syringe filter (Sarstedt, Nümbrecht, Germany, catalog number 83.1826), and then 2.5 mL of the filtered sample and 1 mL of an overnight E. coli ST130 culture were added to 7.5 mL of Luria broth (LB) (1% bacto-tryptone, 0.5% bacto-yeast extract, and 1% NaCl) and incubated overnight with agitation (200 rpm) at 37 °C. The resulting supernatant was filtered and serially diluted in order to isolate phage plaques using the double layer agar method. Briefly, 100 µL of serially diluted lysate and 100 µL of an overnight E. coli culture were added to 4 mL of LB supplemented with 0.75% agar. The inoculated soft agar was then poured onto LB plates (1.5% agar). The plates were incubated overnight at 37 °C, and single phage plaques were picked, propagated, and purified three times.

Phage Morphology
Phage ST32 was purified and concentrated by CsCl gradient as described previously [35]. Phage particles were stained with 2% (w/v) aqueous uranyl acetate on a carbon-coated grid and were observed using a JEM-1230 transmission electron microscope (JEOL, Tokyo, Japan) [36]. Over 10 specimens were observed and used for size determination.

Host Range
The host range of phage ST32 was tested on 73 bacterial strains from different genera, species, and serotypes using the spot test method and a diluted phage lysate. In brief, 200 µL of overnight culture of E. coli, Shigella, Salmonella, or Citrobacter was mixed with 3.5 mL of LB containing 0.75% (w/v) soft agar. The inoculated soft agar was then poured on LB (1.5% (w/v) agar) plates. Then, serial dilutions of phage lysate were made in buffer (50 mM Tris-HCl at pH 7.5, 100 mM NaCl, and 8 mM MgSO₄). Five microliters of each serial dilution (10⁰, 10⁻², 10⁻⁴, and 10⁻⁶) were spotted on the top agar. After overnight incubation at 37 °C, phage plaques or lysis zones were recorded.
Moreover, the propagation of phage ST32 on non-pathogenic host strains (E. coli HER1036, HER1155, HER1222, HER1315, HER1375, and HER1536) was compared to that on the pathogenic E. coli ST130 strain. In brief, the strains were grown at 37 °C in LB medium until an optical density at 600 nm (OD) of 0.25. Then, approximately 10⁶ PFU·mL⁻¹ of phage ST32 was added. The phage-infected cultures were incubated with agitation at 37 °C until complete bacterial lysis was achieved. The phage lysate was centrifuged to remove cell debris, and the supernatant was filtered using a 0.45 µm syringe filter. Then, the phage lysates were serially diluted in buffer and titered by spot test as described above. Of note, the pathogenic E. coli ST130 strain was used for phage titration after propagation.

One-Step Growth Curve Assay
The influence of the incubation temperature on phage ST32 plaque formation was investigated by spot test as described above. Following the spot test assay, the plates were incubated at various temperatures (ranging from 10 to 42 °C).
One-step growth curve assays were also performed in triplicate. Briefly, phages were mixed with 2 mL of a mid-exponential phase culture of E. coli ST130 (OD of 0.8) with a starting multiplicity of infection (MOI) of 0.05. ST32 phages were allowed to adsorb to E. coli ST130 cells for 5 min at various temperatures (20, 30, or 37 °C), and then the mixture was centrifuged for 1 min at 16,000× g. The pellet was resuspended, diluted, and added to 10 mL of LB. This suspension was incubated at the same three temperatures (20, 30, or 37 °C) without agitation, and samples were taken to determine the phage titers. The phage titer of each sample was determined using the double layer agar method. All plates were incubated overnight at 30 °C. The burst size was calculated by subtracting the initial titer from the final titer and then dividing by the initial titer. The latent period was taken as the time corresponding to the middle of the exponential phase of the curve [37]. The data were analyzed by one-way analysis of variance (ANOVA) followed by a Tukey test to correct the p-values for multiple comparisons. Significant differences were reported at an alpha level of 1%.
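The burst size formula described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the titers in the example are hypothetical and are not data from this study, and the ANOVA and Tukey steps are not covered.

```python
from statistics import mean, stdev


def burst_size(initial_titer: float, final_titer: float) -> float:
    """Burst size as described in the text: (final - initial) / initial."""
    if initial_titer <= 0:
        raise ValueError("initial titer must be positive")
    return (final_titer - initial_titer) / initial_titer


# Hypothetical titers (PFU/mL) for three replicates; NOT data from this study.
replicates = [(1.0e4, 6.1e6), (1.2e4, 7.0e6), (0.9e4, 5.6e6)]
bursts = [burst_size(initial, final) for initial, final in replicates]
print(f"burst size: {mean(bursts):.0f} ± {stdev(bursts):.0f} virions per infected cell")
```

Reporting the mean and standard deviation over replicates mirrors the "value ± spread" form used for the burst sizes in the Results.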

E. coli ST130 Growth
E. coli ST130 growth was also determined at various temperatures using OD and recorded in triplicate. In brief, 200 µL of ST130 overnight culture was added to 5 mL of LB medium. Then, inoculated samples were incubated with agitation (200 rpm) at 20, 30, and 37 °C. The OD was measured at intervals of 30 min.

Sequencing and Analysis
Phage DNA was extracted as described elsewhere [38]. DNA was sequenced using the Illumina Hiseq (PE250) platform at Beijing Fixgene Tech Co., Ltd. (Beijing, China). More than 5000-fold coverage of the phage genome was generated. The paired-end reads were assembled using ABySS v. 1.3.6. Open reading frames (ORFs) were predicted using PHASTER [39]. The identified ORFs were confirmed with GeneMark.hmm prokaryotic (http://exon.gatech.edu/GeneMark/gmhmmp.cgi) and ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/). ORFs were considered candidates for evaluation when they encoded 45 or more amino acids (aa) and possessed both a conserved Shine-Dalgarno sequence (5′-AGGAGGU-3′) and a start codon (AUG, UUG, or GUG). BLASTp was used to identify the putative functions of the proteins. Hits were considered valid when the E-value was lower than 10⁻³. The percent identity between proteins was calculated by dividing the number of identical residues by the length of the shorter protein. The theoretical molecular weights (MW) and isoelectric points (pI) of the proteins were obtained using tools available on the ExPASy webpage (http://web.expasy.org/compute_pi/). The bioinformatic tool tRNAscan-SE (http://lowelab.ucsc.edu//tRNAscan-SE/) was used for tRNA detection.
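Two of the criteria above are simple enough to sketch in Python: the ORF candidate filter (start codon and minimum length) and the percent identity formula. The function names are hypothetical, the start codons are written in their DNA form, the last codon is assumed to be a stop codon, and the Shine-Dalgarno check is deliberately left out; this is an illustration, not the pipeline used in the study.

```python
START_CODONS = {"ATG", "TTG", "GTG"}  # DNA equivalents of AUG, UUG, GUG
MIN_LENGTH_AA = 45  # minimum number of encoded amino acids


def is_candidate_orf(nucleotide_seq: str) -> bool:
    """Check the start codon and the minimum encoded length of an ORF."""
    seq = nucleotide_seq.upper()
    if len(seq) % 3 != 0 or seq[:3] not in START_CODONS:
        return False
    encoded_aa = len(seq) // 3 - 1  # assumption: the last codon is a stop codon
    return encoded_aa >= MIN_LENGTH_AA


def percent_identity(identical_residues: int, len_a: int, len_b: int) -> float:
    """Identical residues divided by the length of the shorter protein."""
    return 100.0 * identical_residues / min(len_a, len_b)


# Hypothetical example: 90 identical residues between proteins of 100 and 120 aa.
print(percent_identity(90, 100, 120))
```

Normalizing by the shorter protein means that a short protein fully contained in a longer homolog can still reach 100% identity.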

Terminase Tree
A phylogenetic tree was generated based on the large terminase subunit amino acid sequences of phage ST32 and multiple phages available in databases sharing sequence identity. The corresponding phage protein sequences were retrieved from GenBank (https://www.ncbi.nlm.nih.gov/). In constructing the terminase phylogenetic tree, these sequences were aligned with MAFFT [40] using the E-INS-i alignment algorithm. Thereafter, MAFFT-profile alignment was processed, as previously described [41], in order to generate the tree. Briefly, ProtTest 3.2 was applied to find an appropriate model of amino acid substitution and was implemented in PhyML 3.0 to calculate a maximum likelihood tree. Finally, the Shimodaira-Hasegawa-like procedure was used to determine the branch support values and the Newick utility package was used to render the trees.
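As a rough sketch of how the alignment and tree-building steps might be invoked, the Python snippet below only assembles the command lines; it does not run them. The file names are hypothetical, the substitution model "LG" is a placeholder (the model selected by ProtTest is not stated in the text), and the flags reflect the MAFFT and PhyML documentation as understood here, so they should be checked against the installed versions.

```python
import shlex


def mafft_einsi_command(fasta_in: str) -> list[str]:
    # In MAFFT, E-INS-i corresponds to --genafpair with iterative refinement.
    # MAFFT writes the alignment to standard output.
    return ["mafft", "--genafpair", "--maxiterate", "1000", fasta_in]


def phyml_command(phylip_in: str, model: str) -> list[str]:
    # -d aa: amino acid data; -m: substitution model;
    # -b -4: SH-like branch supports (per the PhyML manual).
    return ["phyml", "-i", phylip_in, "-d", "aa", "-m", model, "-b", "-4"]


# Hypothetical file names; "LG" is a placeholder model, not the one used in the study.
print(shlex.join(mafft_einsi_command("terminase_large.faa")))
print(shlex.join(phyml_command("terminase_large.phy", "LG")))
```

Building the argument lists separately from execution makes it easy to log or review the exact commands before passing them to a process runner.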

Nucleotide Sequence Accession Number
The complete genome sequence of phage ST32 was deposited in GenBank under the accession number MF044458.2.

Phage Morphology
The morphological characteristics of phage ST32 were examined by transmission electron microscopy. Electron micrographs (Figure 1) showed that phage ST32 has an icosahedral capsid with an apex diameter of 64 ± 6 nm and a long contractile tail with a length of 132 ± 9 nm. These morphological features [42] indicate that phage ST32 belongs to the Caudovirales order and the Myoviridae family.

Host Range
Currently, phages are tested for biocontrol purposes against E. coli strains that may cause infections [43,44] or used as indicators of coliform contamination [45]. The host range plays a key role in the selection of any given phage for therapy or biocontrol purposes, as a broad host range phage is likely to kill multiple strains of a given bacterial species and maybe even beyond the species or genus levels for enterophages [43,46].
To this end, the host range of phage ST32 was evaluated on 73 bacterial strains obtained from the Félix d'Hérelle Reference Center for Bacterial Viruses (Table 1). Phage ST32 was able to infect 10 strains (14%), including four pathogenic and six non-pathogenic strains. Pathogenic strains infected by phage ST32 included four E. coli strains of multiple serotypes. In order to reduce the risk of possible harmful substances from the pathogenic host strain in phage lysate, we evaluated the ability of phage ST32 to propagate on its sensitive, non-pathogenic host strains (++++; Table 1). The results showed that phage ST32 was propagated to a high titer (10⁹ PFU/mL) when using five (E. coli HER1036, HER1222, HER1315, HER1375, and HER1536) out of six of these strains.
Based on the above, phage ST32 has a broad host range, infecting both pathogenic and non-pathogenic E. coli strains. These features led us to consider phage ST32 as a potential biocontrol agent rather than a therapeutic agent. In order to use phage ST32 as a biocontrol agent, we further studied the influence of temperature on its lytic activity as well as on the growth of E. coli host strain ST130.
Table 1 notes: (−) does not infect; (+) lysis zone at dilution 10⁰ or "lysis from without" [47]; (++) infects at dilutions of 10⁰ to 10⁻²; (+++) infects at dilutions of 10⁰ to 10⁻⁴; (++++) infects at dilutions of 10⁰ to 10⁻⁶.
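The scoring scale in the Table 1 notes can be expressed as a small Python function. This is an illustrative sketch: the function name and input layout are hypothetical, an ASCII "-" stands in for the minus sign, and the function cannot distinguish true infection from "lysis from without" at the undiluted spot.

```python
def spot_score(lysis_by_exponent: dict[int, bool]) -> str:
    """Map spot-test results to the -/+ scale of the Table 1 notes.

    Keys are the dilution exponents that were spotted (0, -2, -4, -6);
    values say whether plaques or a lysis zone were observed.
    """
    score = "-"
    for exponent, symbol in [(0, "+"), (-2, "++"), (-4, "+++"), (-6, "++++")]:
        if not lysis_by_exponent.get(exponent, False):
            break  # the scale requires lysis at every less-diluted spot
        score = symbol
    return score


# Hypothetical strain showing lysis down to the 10^-2 dilution only.
print(spot_score({0: True, -2: True, -4: False, -6: False}))
```

A strain scored "++++" by this scheme supports plaque formation at every dilution tested, which is the criterion used above to pick non-pathogenic propagation hosts.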

One-Step Growth Curve
The influence of temperature on plaque formation was first analyzed by spot test at 10, 20, 30, 37, and 42 °C. The results showed that phage ST32 produced clear plaques at dilutions of 10⁻¹ to 10⁻⁷ when plates were incubated at 10, 20, 30, and 37 °C. Turbid plaques were seen only at 42 °C. A one-step growth curve was conducted at 20, 30, and 37 °C to determine the latent period and burst size of the phage at these temperatures. Moreover, the growth of the bacterial host strain was followed under the same conditions.
As indicated by the results of the one-step growth curve experiments (Figure 2a), the burst size of phage ST32 was very low at 37 °C: only 2 ± 0.1 new virions were released per infected cell, with an estimated latent period of 55 ± 6 min. When the phage-infected cells were incubated at 30 °C, the average burst size of phage ST32 increased to 64 ± 30 new virions per infected cell, and the latent period remained the same (54 ± 2 min). Interestingly, the burst size of phage ST32 was significantly higher when the infected cells were incubated at 20 °C, with an average of 602 ± 159 new virions released per infected cell. Conversely, the latent period increased to approximately 102 ± 10 min. Of note, the growth of the E. coli ST130 host strain was much faster at 30 °C and 37 °C than at 20 °C (Figure 2b). Nonetheless, phage ST32 could still kill its host at these temperatures.
Phage ST32 is evidently part of a low-temperature (LT) phage group with an optimum burst at 20 °C [48]. Of note, this phage was isolated from a wastewater sample of a sewage treatment plant in Beijing that has a temperature of about 20 °C. Therefore, it appears to be adapted to replicate at such ambient-like temperatures. These features make this phage a potential agent for the biocontrol of E. coli. For instance, it could be used to control pathogenic bacteria present in wastewater where physical conditions, such as temperature, are optimal for its lytic activity. Moreover, it may provide an effective intervention against foodborne pathogens and spoilage bacteria in minimally processed, ready-to-eat products and fresh fruits [29][30][31]. It could also help to remove bacteria from poultry meat, which is often found to be contaminated with potentially pathogenic micro-organisms [28]. In order to support its potential as a biocontrol agent, we further characterized phage ST32 at the genomic and phylogenetic levels.

Genomic Features of Phage ST32
The genome sequence of phage ST32 consists of a double-stranded DNA molecule of 53,092 bp with a GC content of 44.14% as well as 79 open reading frames (ORFs) and a tRNA (Table 2). The tRNA-Arg of 95 bp (from 15,909 bp to 16,003 bp), without an intron, found in the genome of phage ST32, shares 99% identity with that of phage phiEcoM-GJ1 [49]. tRNA-Arg is often found in phage genomes [50]. The 79 ORFs have the same transcriptional orientation, and ATG is the most common initiation codon (81.0%), followed by GTG (11.4%) and TTG (7.6%).
Based on the BLASTp analyses, 19 of the 79 ORFs (24.1%) were assigned a putative function, including lysis, capsid, and tail morphogenesis as well as transcription and DNA replication. The functions of the remaining sixty putative ORFs remained unknown, and they were annotated as hypothetical proteins. Besides the predicted protein functions, Table 2 shows the predicted size, the genomic position, the transcriptional orientation, and the closest phage protein homolog. In several cases, protein homologies were with proteins of phages belonging to the Podoviridae or Myoviridae families. The best matches for a large portion of these ORFs were with proteins of the Enterobacteria phage phiEcoM-GJ1 belonging to the Myoviridae family [49]. Thereafter, phylogenetic trees were constructed for further investigation of the relatedness of phage ST32 to other phages.
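The figures quoted in this section can be cross-checked with a few lines of arithmetic. In the Python sketch below, the start codon counts (64, 9, and 6 ORFs) are inferred here from the reported percentages and the total of 79 ORFs; they are not stated in the text.

```python
TOTAL_ORFS = 79


def percent(count: int, total: int = TOTAL_ORFS) -> float:
    """Share of ORFs as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Inferred counts per start codon (assumption: they must sum to 79 ORFs).
start_codon_counts = {"ATG": 64, "GTG": 9, "TTG": 6}
assert sum(start_codon_counts.values()) == TOTAL_ORFS

for codon, count in start_codon_counts.items():
    print(codon, percent(count))

# ORFs with a putative function, as reported in the text.
print("annotated", percent(19))

# tRNA-Arg length from its inclusive genome coordinates.
trna_length = 16_003 - 15_909 + 1
print("tRNA-Arg length (bp)", trna_length)
```

The inclusive coordinate arithmetic (end minus start plus one) is what gives the reported tRNA length.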


Phylogeny of Phage ST32
The conserved sequence of the large terminase subunit (ORF51) has been used previously to study the phylogeny of numerous phages [41,42]. As an ATP-driven protein motor, the phage terminase is generally a hetero-oligomer composed of two subunits (small and large) that translocates the phage genome into the preformed capsid. The large subunit usually possesses endonucleolytic and ATPase activities [51,52]. A phylogeny tree, based on the amino acid sequences of the large terminase subunit (ORF51), was constructed to examine the evolutionary relationships between phage ST32 and other phage genomes (Figure 3). The phylogeny tree supported the finding that phage ST32 belongs to the Myoviridae family. Moreover, phage ST32 was on the same branch as phage phiEcoM-GJ1 (EF460875.1), indicating a close relatedness between these two phages and suggesting that they belong to the same new cluster. Interestingly, phiEcoM-GJ1 phage currently belongs to an unclassified genus of the Myoviridae family [49]. Moreover, the tree indicated that the closest evolutionary relatives to both phages were the Pectobacterium virulent phages PM1 [53] and PP101 and the Erwinia virulent phage vB_EamM-Y2 [54]. This relatedness between the PM1, vB_EamM-Y2, and phiEcoM-GJ1 phages was revealed in a previous study [53].

Figure 3. Phylogenetic tree based on the amino acid sequences of the large terminase subunit (ORF51) of phage ST32 and the phages available in databases sharing sequence identity. The corresponding phage protein sequences were retrieved from GenBank (https://www.ncbi.nlm.nih.gov/). The colors in the internal and external circular layers categorize phages, genera, and families, respectively. When the genus or the family of a phage is not indicated, it was not available in the database or in the associated publication. Branches with branch support values greater than 90% are marked with a blue dot. The size of the dot is directly proportional to the branch support value.
Thereafter, we compared the percent identity between the genome sequences of these five phages. Our results showed that the percentage of nucleotide sequence identity between phages in the same branch was relatively high compared to phages in different branches. For example, the percent identity between phages ST32 and PM1 did not exceed 36% compared to 84.5% between the two Pectobacterium phages PM1 and PP101.

Comparative Genomic Analysis
The genomic sequences of the ST32, phiEcoM-GJ1, PM1, PP101, and vB_EamM-Y2 phages were further analyzed, compared, and aligned using the deduced amino acid sequences of all of the ORFs. A comparative analysis showed that when using a cut-off of 80% identity, phage ST32 shares 54 proteins with phage phiEcoM-GJ1, while the Pectobacterium phages PM1 and PP101 share 53 proteins. On the other hand, at the same cut-off, the Erwinia phage vB_EamM-Y2 shares only three proteins with the other four phages (Figure 4). Notably, at 70% identity, this number went up to 14 proteins, as indicated by the gray shading in Figure 4.

Figure 4. Schematic representation of the genomic organization of phage ST32 compared to phages phiEcoM-GJ1, vB_EamM-Y2, PM1, and PP101. Each line represents a different phage genome and each arrow represents an ORF. Arrows of the same color indicate ORFs that share more than 80% identity. White arrows indicate that the identity is less than 80% or that there is no homologous putative protein. Gray shading indicates vB_EamM-Y2 phage ORFs sharing more than 70% identity with those of the other aligned phages.
Interestingly, with more than 60% identity, 31 proteins were shown to be shared by the four phages ST32, phiEcoM-GJ1, PM1, and PP101. Based on this comparative analysis, these five phages can be separated into three distinct groups, which is consistent with their three-branch division in the phylogenetic tree (Figure 3). Based on the close relatedness between phages ST32 and phiEcoM-GJ1 shown in the above analysis, we compared them further.
The genomic organization of phage ST32 compared to phage phiEcoM-GJ1 (Figure 4) showed that all genes from both phage genomes have the same transcription orientation (5′ to 3′ from left to right in the figure). Moreover, 47 of 79 ORFs share more than 90% identity, of which eight (ORF42, ORF49, ORF50, ORF52, ORF60, ORF64, ORF68, and ORF69) are 100% identical (Table 2). The latter are proteins with hypothetical functions. Interestingly, six of these eight ORFs are found in very few phage genomes available in databases [49,53], including the ones closely related to phage ST32 that were used for the genomic comparison in Figure 4.
The global analysis of both phage genomes showed that they are organized into functional clusters to which different roles can be assigned. First, both phages share a cluster of a high number of small genes at the beginning of the genome (starting from ORF2), reminiscent of those of T4 coliphages that are involved in host takeover [42,49,55] (Figure 4). Most of the phage ST32 ORFs in this cluster share less than 90% identity with those of phage phiEcoM-GJ1 (Table 2). Then, further downstream in the genome, several putative replication-related genes were identified, encoding a single-stranded DNA-binding protein (ORF19), thymidylate synthase (ORF38), helicase/primase (ORF39), DNA polymerase (ORF40), 5′-3′ exonuclease (ORF43), DNA ligase (ORF45), deoxyuridine 5′-triphosphate nucleotidylhydrolase (ORF47), and ribonucleotide reductase beta subunit (ORF79). In addition to the replication-related genes, the last ORFs in the genome of phages ST32 and phiEcoM-GJ1 encode a ribonucleotide reductase beta subunit. In this regard, it is interesting to note that the ORF1 of both phages encodes a single-subunit RNA polymerase, which is a feature of phages of the T7 group of the Podoviridae [49]. These transcription-related ORFs share more than 90% identity (Table 2). Then, downstream of the replication-related genes, we identified a cluster of conserved DNA packaging, capsid, and tail morphogenesis genes sharing more than 90% identity, except for two ORFs, ORF66 and ORF76, which encode two putative tail fiber proteins and share 76% and 72% identity, respectively.
Finally, other major differences were identified between the two phages. For example, three ORFs encoding putative HNH endonucleases (ORF34, ORF36, and ORF47 of phiEcoM-GJ1) were found only in the genome of phage phiEcoM-GJ1 [49]. Moreover, five additional ORFs (ORF17, ORF18, ORF33, ORF35, and ORF56) encoding proteins with unknown functions were found in the genome of phage ST32 but not in that of phage phiEcoM-GJ1. Interestingly, the best match for one (ORF56) of these five ORFs was with that of the Erwinia phage vB_EamM-Y2, which is closely related to phage ST32, as shown in the phylogenetic tree (Figure 3).
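The identity cut-offs used throughout this comparison (more than 70%, 80%, or 90%) amount to counting ORFs above a threshold. The Python sketch below illustrates this with the four ST32 versus phiEcoM-GJ1 values that are quoted in the text; the function name is hypothetical, and the full per-ORF table (Table 2) is not reproduced here.

```python
def count_shared(identities: dict[str, float], cutoff: float) -> int:
    """Count ORFs whose percent identity is strictly above the cutoff."""
    return sum(1 for value in identities.values() if value > cutoff)


# Subset of ST32 vs. phiEcoM-GJ1 identities quoted in the text (percent).
identities = {"ORF42": 100.0, "ORF49": 100.0, "ORF66": 76.0, "ORF76": 72.0}

for cutoff in (70, 80, 90):
    print(f"> {cutoff}% identity:", count_shared(identities, cutoff))
```

Raising the cut-off from 70% to 80% drops the two tail fiber ORFs from the shared set, which is the same effect that changes the number of shared proteins between cut-offs in the comparison above.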

Conclusions
In this study, the virulent phage ST32 was isolated from wastewater using the pathogenic host E. coli ST130. Morphological and genomic characterization showed that phage ST32 belongs to the Myoviridae family. Host range analysis showed that it can infect a broad range of hosts including non-pathogenic and pathogenic bacteria. Moreover, phage ST32 has a very high burst size at 20 °C, which is far from the optimal growth temperature of its host. Phylogenetic analysis, based on the large terminase subunit (ORF51), revealed a close relatedness with the Enterobacteria phage phiEcoM-GJ1 belonging to an unclassified genus of the Myoviridae family. Interestingly, both phages are part of a new branch in the phylogeny. Moreover, neighboring branches carry unclassified Myoviridae relatives, among others, the Pectobacterium phages PM1 and PP101 and the Erwinia phage vB_EamM-Y2. A comparative genomic analysis of the five phages based on nucleotide and amino acid sequences showed that phage phiEcoM-GJ1 is by far the closest relative to phage ST32. A more detailed genomic comparison between these two phages showed that 47 of 79 ORFs in the phage ST32 genome have more than 90% identity with those of phage phiEcoM-GJ1. Many of these ORFs had few homologs in databases. Some striking differences were detected, including the absence in phage ST32 of three putative HNH endonuclease ORFs of phiEcoM-GJ1. On the other hand, five additional ORFs with unknown functions were detected in the phage ST32 genome. Taken together, the newly characterized phage ST32 has appealing and unique characteristics that make it a potential biological control agent under specific conditions.