PCR Assay for Rapid Taxonomic Differentiation of Virulent Staphylococcus aureus and Klebsiella pneumoniae Bacteriophages

Phage therapy is now seen as a promising way to overcome the current global crisis in the spread of multidrug-resistant bacteria. However, phages are highly strain-specific, and in most cases a new phage must be isolated, or a phage suitable for therapeutic application must be found in existing libraries. At an early stage of the isolation process, rapid screening techniques are needed to identify and type potential virulent phages. Here, we propose a simple PCR approach to differentiate between two families of virulent Staphylococcus phages (Herelleviridae and Rountreeviridae) and eleven genera of virulent Klebsiella phages (Przondovirus, Taipeivirus, Drulisvirus, Webervirus, Jiaodavirus, Sugarlandvirus, Slopekvirus, Jedunavirus, Marfavirus, Mydovirus and Yonseivirus). The assay was developed through a thorough search of a dataset comprising S. aureus (n = 269) and K. pneumoniae (n = 480) phage genomes available in the NCBI RefSeq/GenBank database for specific genes that are highly conserved at the taxonomic group level. The selected primers showed high sensitivity and specificity for both isolated DNA and crude phage lysates, which makes it possible to bypass DNA purification protocols. Our approach can be extended and applied to any group of phages, given the large number of available genomes in the databases.


Introduction
The discovery of antibiotics marked a revolution in healthcare, changing the treatment of infectious diseases for many years to come. However, the emergence and rapid spread of multidrug-resistant (MDR) bacteria has become one of the major threats to public health in the twenty-first century. Based on a comprehensive assessment of the global burden of antimicrobial resistance (AMR), in 2019 AMR was shown to be the third leading cause of death [1]. The usefulness of antibiotics is waning, and it is possible that by the year 2050, MDR infections will be the leading cause of death worldwide [2]. The greatest danger is posed by the following species, in which the proportion of MDR strains exceeds 50%: Escherichia coli, Klebsiella pneumoniae, Enterobacter cloacae, Staphylococcus spp., Enterococcus spp., Acinetobacter baumannii, and Pseudomonas aeruginosa [1].
One of the most promising approaches for overcoming the current global crisis of antimicrobial resistance is the usage of (bacterio)phages (viruses that kill bacteria) to treat infections caused by MDR strains. Phage therapy has been shown to be safe and effective against a wide range of bacterial pathogens in both preclinical and clinical studies [3][4][5][6]. However, due to the high specificity of phage-bacteria interactions, in most cases it is necessary to isolate a new phage or to search for a suitable one in existing phage libraries [7].
At the same time, it is important to emphasize that phages are living antimicrobials, which imposes particular requirements on their therapeutic use; specifically, only virulent phages (phages whose reproduction always leads to cell lysis and which are unable to integrate into the genome of the infected bacterium) can be used clinically [8,9]. Whole-genome sequencing (WGS) is gradually being introduced to confirm the virulent nature of a phage: knowing the entire phage nucleotide sequence makes it possible to check the genome for the presence of recombination or transposition gene modules (responsible for recombination and/or integration of the phage genome into the host cell genome), toxins, antibiotic resistance genes, and virulence factors [9,10]. Furthermore, the ongoing systematization of information about taxa suitable for therapeutic needs is a result of the active improvement of virus taxonomy, aided by the widespread use of WGS.
A universal hierarchical taxonomic system is being maintained and developed by the International Committee on Taxonomy of Viruses (ICTV). Genomic characteristics are now acknowledged to be the fundamental component of taxonomic classification [11,12]. Several approaches, such as computing genome-wide sequence similarities (ViPTree) [13], network-based genome taxonomy (vConTACT2) [14], and many others, have been proposed for viral taxonomic classification. Although WGS-based methods are becoming more and more efficient, they remain rather time-consuming and labor-intensive, especially in the case of emergency selection of individualized phage cocktails. Therefore, the development of a cheaper and faster approach for primary screening is an urgent task for typing existing and de novo isolated phages.
To date, several approaches for phage typing have been proposed. Although alternative methods have been described [15,16], the majority of the studies on phage typing rely on the presence of molecular markers (signature genes) [17][18][19][20]. Furthermore, a significant portion of existing methods is devoted to the typing of temperate bacteriophages [19][20][21][22], which is connected with the need to type bacterial strains. Significantly fewer studies are concerned with the typing of virulent phages; such typing schemes have been proposed only for Salmonella enterica, Escherichia coli and Erwinia amylovora [17].
Here, we propose a novel PCR-based typing scheme for S. aureus and K. pneumoniae phages. To develop the scheme, a search for taxon-specific orthologous gene families was conducted on a dataset comprising n = 749 phage genomes. This search led to the selection of n = 13 primer pairs that could identify two families of virulent Staphylococcus phages (Herelleviridae and Rountreeviridae) and 11 genera of virulent Klebsiella phages (Przondovirus, Taipeivirus, Drulisvirus, Webervirus, Jiaodavirus, Sugarlandvirus, Slopekvirus, Jedunavirus, Marfavirus, Mydovirus and Yonseivirus). Assessment of sensitivity and specificity using bacteriophages and bacteria from laboratory collections showed that the scheme is applicable both for isolated DNA and for crude phage lysates.

Sample Collection and Phylogenetic Analysis
To develop a typing scheme, all available S. aureus (n = 269) and K. pneumoniae (n = 480) phage genomes in the NCBI RefSeq/GenBank database were used (Table S1A,B). Out of these, ~53% of Staphylococcus phages (n = 142) and ~25% of Klebsiella phages (n = 123) have been classified by the ICTV (Figure 1).
Using taxonomy derived from the ICTV database and the vConTACT2 network-based viral classification tool [14], we were able to assign taxonomy to most of the Staphylococcus and Klebsiella phage genomes retrieved from the NCBI RefSeq/GenBank database (Figures S1 and S2). In total, Staphylococcus phages belonged to 2 families, 4 subfamilies, and 11 genera. Meanwhile, a greater diversity of taxonomic groups was found among Klebsiella phages: 8 families, 7 subfamilies, and 27 genera (Table 1). For the n = 18 Staphylococcus and n = 178 Klebsiella phage genomes, genus-level classification was not assigned.

Identification of Signature Genes Suitable for PCR Typing
We used the following criteria to search for taxon-specific groups of genes suitable for the development of a PCR typing scheme: ≥2 genomes at the genus level, the gene of interest is present only within members of the operational taxonomic unit (OTU), an amino acid sequence identity threshold of >60%, and an average length of >400 bp. From 0 to 207 specific genes were identified for each OTU (Tables 1 and S2). Phylogenetic trees inferred from protein distances between genomes and trees derived from pangenome analysis based on a pairwise distance matrix from gene presence/absence data were found to be largely concordant and showed a similar clustering of OTUs (Figures S3 and S4).
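The selection criteria above can be read as a simple filter over orthologous gene families. The following is a minimal sketch, not the pipeline used in this study: the `GeneFamily` record, the `is_signature` function, the toy genome IDs, and the `require_all_members` option (which assumes a signature gene must occur in every member of the OTU) are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneFamily:
    name: str
    genomes: frozenset      # IDs of genomes carrying a member of this family
    identity_pct: float     # amino acid identity within the family, in percent
    mean_length_bp: float   # average gene length in base pairs


def is_signature(family, otu_genomes, min_identity=60.0, min_length_bp=400.0,
                 min_otu_size=2, require_all_members=True):
    """Return True if a gene family meets the criteria listed in the text."""
    otu_genomes = frozenset(otu_genomes)
    if len(otu_genomes) < min_otu_size or not family.genomes:
        return False
    # The gene must occur only within members of the OTU.
    if not family.genomes <= otu_genomes:
        return False
    # Assumption (not stated in the text): the gene occurs in every OTU member.
    if require_all_members and not otu_genomes <= family.genomes:
        return False
    return (family.identity_pct > min_identity
            and family.mean_length_bp > min_length_bp)


# Toy data: three genomes in one OTU and one genome from another taxon.
otu = {"phageA", "phageB", "phageC"}
families = [
    GeneFamily("capsid", frozenset(otu), 85.0, 900.0),
    GeneFamily("helicase", frozenset(otu | {"outsider"}), 90.0, 1500.0),
    GeneFamily("short_orf", frozenset(otu), 95.0, 300.0),
]
signature_names = [f.name for f in families if is_signature(f, otu)]
print(signature_names)
```

In this toy case only `capsid` passes: `helicase` also occurs outside the OTU, and `short_orf` falls below the length threshold.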
Within the Staphylococcus phages, we were unable to identify specific genes that would allow us to accurately differentiate the Kayvirus, Biseptimavirus, Peeveelvirus and Fibralongavirus genera from the other most-closely related ones. However, at the Herelleviridae family level, we found n = 8 family-specific genes suitable for typing.
In contrast, for Klebsiella phages, we found specific genes for each genus, but not for families. Four genes were found to be orthologous for the Jiaodavirus and Slopekvirus genera of the Straboviridae family, but they were also found in members of the Marfavirus genus. Moreover, n = 12 genes were found to be common between the Jiaodavirus and Marfavirus genera, which indicates the close relationship between them.

PCR-Based Typing Scheme for Rapid Phage Classification
Genes were selected for typing if they allowed us to clearly distinguish among genera of virulent phages that have at least five representatives per genus (Table 1). Therefore, primers for K. pneumoniae phages were selected for the following genus-specific genes: internal virion protein D (Przondovirus), putative tail tube-associated baseplate protein (Taipeivirus), capsid protein (Drulisvirus), putative major head subunit precursor (Webervirus), alpha-glucosyl-transferase (Jiaodavirus), putative replicative DNA helicase (Sugarlandvirus), putative baseplate hub (Slopekvirus), virion structural protein (Jedunavirus), hypothetical protein (Marfavirus), putative helicase (Mydovirus) and DNA polymerase I (Yonseivirus). In Staphylococcus phages, for the Rountreeviridae and Herelleviridae families, primers were selected for the gene family encoding the resolvase and major capsid protein.
The sensitivity of the selected primer pairs was evaluated both on bacteriophage DNA isolated using phenol-chloroform extraction and directly on crude phage lysates. Serial dilutions of the template showed that the limit of detection of the PCR primers corresponds to a phage DNA concentration ranging from 150 to 1.5 pg/µL. Amplification from lysates was successful at a minimal concentration of 10^4 to 10^5 plaque-forming units per milliliter (PFU/mL) (Figure 2, Table S3).
Specificity of the primers was tested on non-target phage genera from the laboratory collection (n = 21) (Table S3). No nonspecific products were detected after the amplification of phages. Additionally, bacterial strains of S. aureus (n = 21) and K. pneumoniae (n = 16) were tested. These strains were either hosts of the studied phages or were effectively lysed by them, and were therefore considered potential hosts. Only a few reactions with K. pneumoniae strains yielded nonspecific products, which appeared as faint bands that differed in size from the positive controls. These results were regarded as negative.

Discussion
Phages are one of the most abundant and diverse entities of the biosphere; according to some estimates, their number reaches about 10^31 particles [23]. This wide diversity makes it possible to isolate a specific phage that can lyse almost any of the existing infectious bacterial strains. However, only virulent phages are suitable for phage therapy, and it is not possible to reliably determine the type of phage-bacteria interaction solely on the basis of plaque morphology. Thus, the suitability of a phage for therapy can only be established after acquiring its complete genome via WGS or other additional experiments [24].
In the case of personalized phage therapy, the most important aspect is the issue of time spent on the selection of a suitable phage, and therefore a pre-WGS screening that would be able to exclude temperate phages from the analysis would be greatly beneficial. It has been shown that phages belonging to the same taxonomic unit (genus, family) are characterized by the same type of interaction with the cell (virulent or temperate) [25], which makes it possible to take the taxonomic affiliation of phages as a basis for a rapid preliminary screening.
In addition to characterizing the type of phage-bacteria interaction, taxonomic affiliation makes it possible to determine the approximate size of the phage genome [26,27], which is also important for WGS analysis. Since the size of phage genomes varies greatly, this must be taken into account at the DNA library preparation step in order to achieve optimal coverage and high sequencing quality [27].
To create a PCR typing system, we limited ourselves to S. aureus and K. pneumoniae phages. Both bacterial species cause a wide range of infectious diseases and belong to the so-called ESKAPE (Enterococcus faecium, S. aureus, K. pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species) group of pathogens, characterized by the high resistance of clinical strains to antimicrobial agents, which makes them important potential targets for phage therapy [1,4]. On the other hand, phages of these species are often well-characterized and a sufficient number of cases of their successful application have been described [4].
Thirteen primer pairs were selected based on the gene alignments of the virulent S. aureus and K. pneumoniae phages (Table 2). In a number of cases, the genes chosen in this study, such as major capsid protein and DNA polymerase, match those previously proposed for phage typing and classification [17,29,30]. These genes play a key role in the physiology of phages and are widely distributed among various families. As previously established, their use as target genes for PCR typing schemes allows viruses to be reliably differentiated irrespective of the mosaic nature of their genome, presumably due to limited horizontal gene transfer [31]. This study made it possible to significantly expand the list of conserved genes specific to certain taxonomic units, which can be used in further phylogenetic studies (Table S2). For example, we have found eight specific genes for the family Herelleviridae, which are part of the previously described core genes of this family [32].
In silico, the selected primers showed the ability to detect all currently known S. aureus phages (Herelleviridae and Rountreeviridae families) associated with a virulent lifestyle. In the case of K. pneumoniae phages, the proposed PCR scheme is able to differentiate 11 of the 20 genera of virulent bacteriophages. The vast majority of the studied virulent K. pneumoniae phages (92.4%) belong to these eleven genera.
Testing of the thirteen PCR schemes was carried out on a laboratory collection of phages, the genomes of which were not used for the primer design. The results confirmed that the selected loci are conserved and that the primers have good specificity. Moreover, the sensitivity of the scheme was found to be rather high (that is, the limit of detection is low), which allows it to be used both with isolated DNA and with lysates when searching for a specific phage.
The proposed PCR assay has a number of limitations that must be taken into account. Although Sugarlandvirus, Slopekvirus, Jedunavirus, and Marfavirus primers showed high specificity in silico, they were not tested in vitro due to the absence of these phages in our laboratory collection. Moreover, for some primer systems, an experimental confirmation of sensitivity and specificity was carried out on a single phage, so sensitivity limits require further confirmation. Finally, it is necessary to consider the fact that several samples of K. pneumoniae showed nonspecific products, but we believe this issue can be overcome by introducing an additional control in the form of phage-free bacterial genomic DNA.

Phylogenetic Analysis
Trees based on pairwise distances between phage genomes were inferred using a standalone version of ViPTree v.1.1.2 [13]. For 265 genomes, the official taxonomy was derived from the ICTV Master Species List 2021.v1 (https://talk.ictvonline.org/files/master-species-lists/m/msl/13425/download; accessed on 7 June 2022). Phage genomes that were not present in the ICTV database were classified using vConTACT2 v.0.9.22 with default parameters using the "ProkaryoticViralRefSeq211-Merged" database [14] to obtain a taxonomic affiliation at the family or genus rank. Furthermore, a phage's lifestyle (virulent or temperate) was predicted from conserved protein domains using BACPHLIP v.0.9.6 [10].

Primer Design
Genes were selected for potential use in the typing system for each operational taxonomic unit. Orthologous genes with an identity of >60% and average length of >400 bp were used for further analysis. Orthologous gene family alignment was performed for each pair of genes using MAFFT v.7.490 [43] under default parameters. Primers were designed using BioEdit v.7.0.5.3 (http://www.mbio.ncsu.edu/BioEdit/bioedit.html; accessed on 7 June 2022) and OLIGO v.6.31 (Molecular Biology Insights Inc., Cascade, CO, USA). The NCBI Primer-BLAST online tool (https://www.ncbi.nlm.nih.gov/tools/primer-blast/; accessed on 7 June 2022) was used to assess primer pair specificity [44]. The primers used in this study and PCR product lengths are listed in Table 1.
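The idea behind an in silico specificity check can be illustrated with a deliberately simplified sketch. It is not a substitute for Primer-BLAST, which was the tool used here: the sketch below only looks for exact primer matches on one strand of a single template, and the function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Return the expected product length for exact primer matches, or None.

    Simplification: only the given strand of the template is searched and
    mismatches are not tolerated.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical sequences: an 8-nt forward site, 20 nt of spacer, an 8-nt reverse site.
target = "AAAA" + "ATGCGTAC" + "G" * 20 + "TTGACCAG" + "CCCC"
non_target = "G" * 50
forward_primer = "ATGCGTAC"
reverse_primer = "CTGGTCAA"

print(amplicon_length(target, forward_primer, reverse_primer))      # 36
print(amplicon_length(non_target, forward_primer, reverse_primer))  # None
```

A primer pair would be considered specific in this simplified sense if it yields a product of the expected length on target genomes and no product on non-target genomes.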

PCR Verification
Validation of the PCR typing assays was carried out on a laboratory collection of bacteriophages, including S. aureus (n = 6) bacteriophages from the families Herelleviridae and Rountreeviridae and K. pneumoniae (n = 15) bacteriophages from seven different genera (Przondovirus, Taipeivirus, Drulisvirus, Webervirus, Jiaodavirus, Mydovirus and Yonseivirus) (Table S3). A standard phenol-chloroform extraction protocol was used for phage DNA isolation [45]. DNA concentration was determined using a Qubit 2.0 fluorometer (Life Technologies, Darmstadt, Germany).
The standard PCR was carried out in 25 µL of the reaction mixture (66 mM Tris-HCl (pH 9.0), 16.6 mM (NH4)2SO4, 2.5 mM MgCl2, 250 µM of each dNTP, 1 U of Taq DNA polymerase (Promega, Madison, WI, USA) and 5 pmol of each primer). About 10 ng of phage DNA was used as a template for PCR. PCR was performed in a DNA Engine Tetrad (MJ Research, Inc., Saint-Bruno-de-Montarville, Quebec, Canada) at 96 °C for 2 min, followed by 35 cycles of 30 s at 95 °C, 20 s at 56 °C (Staphylococcus phage primers) or 60 °C (Klebsiella phage primers), and 40 s at 72 °C. Amplification products were analyzed by 2% agarose gel electrophoresis followed by ethidium bromide staining.
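For readers who want to translate the reaction setup into working concentrations, the arithmetic can be sketched as follows. The variable and function names are illustrative; the calculation counts only the programmed hold times given above and ignores ramping between temperatures and any final extension step, which the protocol does not specify.

```python
REACTION_VOLUME_UL = 25.0
PRIMER_PMOL = 5.0            # amount of each primer per reaction
TEMPLATE_NG = 10.0           # approximate phage DNA input per reaction

INITIAL_DENATURATION_S = 120  # 2 min at 96 °C
CYCLES = 35
STEP_SECONDS = {"denaturation": 30, "annealing": 20, "extension": 40}
ANNEALING_TEMP_C = {"Staphylococcus": 56, "Klebsiella": 60}


def primer_concentration_um(pmol, volume_ul):
    """pmol per µL is numerically equal to µmol per L (µM)."""
    return pmol / volume_ul


def programmed_hold_seconds(initial_s, cycles, step_seconds):
    """Sum of programmed hold times, excluding ramping between steps."""
    return initial_s + cycles * sum(step_seconds.values())


primer_um = primer_concentration_um(PRIMER_PMOL, REACTION_VOLUME_UL)
template_ng_per_ul = TEMPLATE_NG / REACTION_VOLUME_UL
total_seconds = programmed_hold_seconds(INITIAL_DENATURATION_S, CYCLES, STEP_SECONDS)

print(f"Primer concentration: {primer_um:.2f} µM each")
print(f"Template concentration: {template_ng_per_ul:.2f} ng/µL")
print(f"Programmed hold time: {total_seconds / 60:.1f} min")
```

Under these assumptions, each primer is present at 0.2 µM, the template at about 0.4 ng/µL, and the programmed hold time is 54.5 min for either annealing temperature.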
The limit of detection of the PCR system was determined by serial dilutions of DNA template (15 ng/µL to 0.15 pg/µL). In addition, crude phage lysates without preliminary purification, obtained by growing bacteriophages on their host strains in lysogeny broth (LB) medium, were used as a template. For PCR, 5 µL of lysates containing from 10^9 to 10^4 PFU/mL were taken. PCR specificity was evaluated by cross-testing the selected primers on bacteriophages from other genera (Table S3). DNA from bacterial host strains and strains from the laboratory collection (S. aureus (n = 21) belonging to 21 multilocus sequence typing (MLST) sequence types (STs) and K. pneumoniae (n = 16) belonging to 16 MLST STs) was also used for testing. MLST was performed based on standard schemes (https://pubmlst.org/; accessed on 7 June 2022). The primers for the arcC and rpoB housekeeping genes from S. aureus and K. pneumoniae, respectively, were used as positive controls.
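The dilution range and lysate input described above can be made concrete with a short calculation. This is a sketch under one stated assumption: the text gives only the endpoints of the DNA dilution series, so the tenfold step used below is assumed rather than taken from the protocol. The function names are illustrative.

```python
def dilution_series(start, stop, fold=10.0):
    """Concentrations from start down to stop in fold-wise steps (fold is assumed)."""
    values = []
    value = start
    # A small tolerance keeps the last point despite floating-point rounding.
    while value >= stop * (1 - 1e-9):
        values.append(value)
        value /= fold
    return values


def pfu_per_reaction(titer_pfu_per_ml, lysate_volume_ul=5.0):
    """Number of plaque-forming units added with the lysate volume used per PCR."""
    return titer_pfu_per_ml * lysate_volume_ul / 1000.0


# 15 ng/µL is 15,000 pg/µL; the series runs down to 0.15 pg/µL.
dna_series_pg_per_ul = dilution_series(15000.0, 0.15)
print(dna_series_pg_per_ul)

for titer in (1e9, 1e5, 1e4):
    print(f"{titer:.0e} PFU/mL -> {pfu_per_reaction(titer):.0f} PFU per reaction")
```

With a tenfold step the series has six points, and 5 µL of a lysate at 10^4 PFU/mL delivers about 50 PFU into the reaction, compared with 5 × 10^6 PFU at 10^9 PFU/mL.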

Conclusions
The introduction of phage therapy in clinical practice has led to the development of criteria that a potential phage preparation must meet. The initial stage in the creation of such preparations, however, is currently rather empirical in nature, with phages that effectively lyse a certain pathogenic strain of a certain bacterial species being selected during screening.
In this work, we used generalized genomic data to create a PCR typing scheme for S. aureus and K. pneumoniae phages. This scheme could be used for a preliminary rapid screening of phages. The approach described here can be applied to other bacterial species, which will ultimately facilitate the selection of phages for further study and their application for therapeutic purposes.