Whole-Genome Analysis Reveals That Bacteriophages Promote Environmental Adaptation of Staphylococcus aureus via Gene Exchange, Acquisition, and Loss

The study of bacteriophages is experiencing a resurgence owing to their antibacterial efficacy, lack of side effects, and low production cost. Nonetheless, the interactions between Staphylococcus aureus bacteriophages and their hosts remain unexplored. In this study, whole-genome sequences of 188 S. aureus bacteriophages (20 Podoviridae, 56 Herelleviridae, and 112 Siphoviridae) were obtained from the National Center for Biotechnology Information (NCBI, USA) genome database. A phylogenetic tree was constructed to estimate their genetic relatedness using single-nucleotide polymorphism analysis. Comparative analysis was performed to investigate the structural diversity and ortholog groups in the subdividing clusters. Mosaic structures and gene content were compared in relation to phylogeny. Phylogenetic analysis revealed that the bacteriophages could be distinguished into three lineages (I–III), including nine subdividing clusters and seven singletons. The subdividing clusters shared similar mosaic structures and core ortholog clusters, including the genes involved in bacteriophage morphogenesis and DNA packaging. Notably, several functional modules of bacteriophages 187 and 2638A shared more than 95% nucleotide sequence identity with prophages in the S. aureus strain RJ1267 and the Staphylococcus pseudintermedius strain SP_11306_4, whereas other modules exhibited little nucleotide sequence similarity. Moreover, the cluster phages shared similar types of holins, lysins, and DNA packaging genes and harbored diverse genes associated with DNA replication and virulence. The data suggested that the genetic diversity of S. aureus bacteriophages was likely due to gene replacement, acquisition, and loss among staphylococcal phages, which may have crossed species barriers. Moreover, frequent module exchanges likely occurred exclusively among the subdividing cluster phages. We hypothesize that during evolution, the S. aureus phages enhanced their DNA replication in host cells and promoted the environmental adaptation of their host.


Introduction
Bacteriophages (phages) are natural viral predators of bacteria that have been used therapeutically for over a century [1]. The increasing prevalence of antimicrobial resistance is leading to the resurgence of phage therapy [2]. Bacteriophages can replicate exponentially in the presence of susceptible bacteria and can kill the target bacteria irrespective of their antimicrobial resistance status [3]. Phages offer several advantages over antibiotics: (i) target specificity, which protects the microbiota of the host; (ii) the capacity to multiply at the site of infection; and (iii) low production costs [4]. Furthermore, phages and their proteins have other applications as vaccine adjuvants, vaccine nanocarriers, and anti-biofilm agents, as well as in bacterial biosensing, gene transfer, drug and therapeutic gene therapy, surface disinfection, bacteriophage display, and food bio-preservation [5]. The various applications

Orthogroup Clustering
Based on the phylogenetic tree, the mosaic structure of the S. aureus phages was aligned using progressive MAUVE [25]. Based on their structural similarity, S. aureus phages were classified into nine main clades and seven singletons. The protein sequences of the nine main clades of S. aureus phages were used for orthogroup clustering, as described previously [26]. These nine main clades comprised 20 Podoviridae phages in lineage I, 8 Herelleviridae phages in clade IIa, 2 Herelleviridae phages in clade IIb, 45 Herelleviridae phages in clade IIc, 29 Siphoviridae phages in clade IIIa, 16 Siphoviridae phages in clade IIIb, 11 Siphoviridae phages in clade IIIc, 22 Siphoviridae phages in clade IIId, and 28 Siphoviridae phages in clade IIIe.

Analysis and Comparison of Holins, Lysins, DNA Packaging Proteins, Antimicrobial Resistance, Transposase, and Virulence Genes
A subset of 99 genes associated with DNA replication (n = 13), host cell lysis (n = 39), DNA packaging (n = 33), lysogeny (n = 6), virulence (n = 7), and antimicrobial resistance (n = 1) among the 188 S. aureus phages was analyzed based on their amino acid sequence identity, with a cut-off of 80%. The 13 DNA replication-associated genes included genes encoding DNA synthesis proteins, DNA-binding proteins, DNA polymerase, DNA primase/helicase, DNA helicase, DNA primase, DNA modification proteins, DNA methylase, DNA repair proteins, DNA sliding clamp inhibitor, RNA ligase, RNA polymerase, and the type III restriction enzyme. The genes associated with host cell lysis included 13 holin and 26 lysin genes. The genes associated with DNA packaging comprised 23 genes encoding the large packaging subunits and 10 genes encoding the small packaging subunits. The six genes associated with lysogeny were those encoding recombinase (rec), transposase (tnp), integrase (int), repressor, anti-repressor, and Clp protease (clp). The seven virulence genes comprised the virulence E family protein (VirE), Panton-Valentine leukocidin (pvl), dUTP pyrophosphatase (dut), staphylococcal complement inhibitor (scn), staphylokinase (sak), beta hemolysin (hlb), and gamma hemolysin (hlg). Only one antimicrobial resistance gene, encoding beta-lactamase (bla), was found in the 188 genome sequences. These genes were assembled and aligned with the 188 genomes using a BLASTx search, as described elsewhere [9].
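The 80% identity cut-off described above can be illustrated with a minimal Python sketch. It assumes the BLASTx hits are available as tab-separated text whose first three columns are query (genome) identifier, subject (reference gene) identifier, and percent identity, as in BLAST tabular output; the function name, identifiers, and example rows are hypothetical and are not taken from the study.

```python
import csv
import io
from collections import defaultdict

IDENTITY_CUTOFF = 80.0  # percent amino acid identity, as stated in the text


def presence_by_genome(blast_tsv, cutoff=IDENTITY_CUTOFF):
    """Map each genome to the set of reference genes hit at >= cutoff identity.

    Assumes tab-separated rows whose first three columns are the query id
    (genome), the subject id (reference gene), and the percent identity.
    """
    hits = defaultdict(set)
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        genome, gene, pident = row[0], row[1], float(row[2])
        if pident >= cutoff:
            hits[genome].add(gene)
    return dict(hits)


# Hypothetical rows: genome id, reference gene id, percent identity.
example = "\n".join([
    "phage_A\tholin_1\t95.2",
    "phage_A\tlysin_3\t79.9",  # below the cut-off, so it is dropped
    "phage_B\tholin_1\t80.0",  # exactly at the cut-off, so it is kept
    "phage_B\tint\t88.4",
])
presence = presence_by_genome(example)
print(presence)
```

Whether a hit at exactly 80% counts as present is an assumption of this sketch; the text does not specify how boundary values were handled.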

Statistical Analyses
The SPSS software (version 19) was used for statistical analyses. Pearson's chi-square test (two-tailed) was performed to analyze the differences in the distribution of genes associated with DNA metabolism, host cell lysis, DNA packaging, lysogeny, virulence, and antimicrobial resistance among the subdividing clusters.
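As an illustration of the test named above, the following Python sketch computes the Pearson chi-square statistic and its degrees of freedom for a contingency table of gene counts. The study used SPSS; this standard-library version only shows the calculation, and the counts, function name, and table layout (rows as clusters, columns as gene present or absent) are hypothetical.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for an r x c contingency table.

    Assumes every row and column total is non-zero, so that all expected
    counts are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are two clusters, columns are
# (genomes carrying the gene, genomes lacking the gene).
counts = [
    [18, 2],
    [5, 15],
]
stat, dof = pearson_chi_square(counts)
print(round(stat, 3), dof)
```

The p-value is then obtained from the chi-square distribution with the returned degrees of freedom; a statistics package would normally be used for that step.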
As shown in Figures 3-5, three mosaic structures were found in clades IIa-IIc. Notably, the major difference between these clades and other S. aureus phages was the abundance of genes associated with DNA metabolism. Clade IIa contained two modules associated with DNA metabolism. The first DNA metabolism module was composed of 102 ORFs, which contained seven genes encoding DNA metabolism-related proteins, including DNA synthesis proteins, DNA polymerase I, DNA repair recombinase, and DNA-binding proteins. The second DNA metabolism module consisted of 44 ORFs, which harbored six genes encoding DNA metabolism-related proteins, including RNA polymerase, DNA helicase, a type III restriction enzyme, DNA methylase, DNA repair exonuclease, and DNA primase. Clade IIb contained two modules associated with DNA metabolism, which harbored eight genes encoding type III restriction enzymes, DNA helicase, DNA primase/helicase, DNA synthesis proteins, DNA polymerase I, and DNA modification proteins. Furthermore, clade IIc contained two modules associated with DNA metabolism, which contained 10 genes encoding RNA ligase, a type III restriction enzyme, DNA helicase, DNA primase, DNA synthesis proteins, DNA polymerase, RNA polymerase, and a DNA sliding clamp inhibitor.
As shown in Figure 6, five mosaic structures were found in clades IIIa-IIIe. Notably, the major difference between these clades and other S. aureus phages was the abundance of genes associated with lysogeny and virulence. Clade IIIa contained two genes encoding lysogeny proteins (Clp protease and repressor) and four virulence genes (hlg, pvl, dut, and virE). Clade IIIb contained three genes encoding lysogeny proteins (integrase, anti-repressor protein, and Clp protease) and four virulence genes (dut, pvl, scn, and sak). Clade IIIc contained three genes encoding lysogeny proteins (integrase, anti-repressor protein, and Clp protease) and three virulence genes (hlb, sak, and dut). Clade IIId contained four genes encoding lysogeny proteins (integrase, excisionase, repressor, and anti-repressor) and one virulence gene (dut). Clade IIIe contained two genes encoding lysogeny proteins (integrase and anti-repressor protein) and one virulence gene (dut).
Our phylogenetic analysis revealed nine main mosaic structures of S. aureus phages, indicating the structural diversity and high genetic mosaicism of these phages. These results were consistent with those of previous studies [13,14]. Although the genomes of lineage III phages displayed obvious functional modules, those of lineage I and lineage II were hybridized. A previous study indicated that genome mosaicism varies depending on the host, lifestyle, and genetic constitution of the phages [7]. The two modules of genes associated with DNA metabolism in clades IIa-IIc likely accelerate the synthesis of phage macromolecules and, hence, increase phage production. Moreover, the integrase and C repressor coding regions identified in clades IIIa-IIIc exhibited extensive diversity, which is consistent with the results of a study indicating that S. aureus integrase diversity has a minimum of 38% nucleotide identity [27]. These results revealed the distinct genetic features of S. aureus phages, suggesting diverse interactions between phages and their hosts. Although phage classification has historically been based on characteristics such as genome type (ssDNA, ssRNA, dsDNA, or dsRNA), viral morphology, and host range, it is currently undergoing a major overhaul, primarily using genome-based methods [8]. Therefore, our comprehensive exploration of structural diversity helps modernize the classification of S. aureus phages.

S. aureus Phages in Subdividing Clusters Shared Similar Ortholog Clusters
To explore the core genome of S. aureus phages, ortholog clusters were analyzed in the subdividing clusters (Table S2). BLASTx revealed 34 orthogroups and 16 ORFs in lineage I. These 16 ORFs included 6 genes associated with phage morphogenesis and 1 gene associated with DNA packaging. Clade IIa consisted of 8 Herelleviridae phages and 222 orthogroups. A total of 137 ORFs were found in clade IIa, which contained 8 genes associated with phage morphogenesis and 1 gene associated with DNA packaging. Clade IIb consisted of 2 Herelleviridae phages and 229 orthogroups. A total of 202 ORFs were observed in both phages, including 6 genes associated with phage morphogenesis and 1 gene associated with DNA packaging. Clade IIc contained 437 orthogroups. A total of 97 ORFs were observed, including 16 genes associated with phage morphogenesis and 1 gene associated with DNA packaging.
In clade IIIa, 29 Siphoviridae phages contained 177 orthogroups. A total of 27 ORFs were observed, including 7 genes associated with phage morphogenesis and 2 genes associated with DNA packaging. Clade IIIb contained 161 orthogroups and 23 ORFs, including 8 genes associated with phage morphogenesis and 1 gene associated with DNA packaging. Clade IIIc consisted of 11 Siphoviridae phages and 128 orthogroups. A total of 27 ORFs were observed, including 6 genes associated with phage morphogenesis and 2 genes associated with DNA packaging. In clade IIId, the 22 Siphoviridae phages contained 161 orthogroups and 32 ORFs, including 10 genes associated with phage morphogenesis and 2 genes associated with DNA packaging. Clade IIIe phages contained 218 orthogroups and 23 ORFs, including 6 genes associated with phage morphogenesis and 1 gene associated with DNA packaging.
Despite the genetic and structural diversity of S. aureus phages, it is notable that the cluster members share common ortholog groups. A previous study analyzed the genome sequences of 205 staphylococcal phages and found that the genomes have mosaic architectures and that individual genes with common ancestors are positioned in distinct genomic contexts in different clusters [13]. Consistently, our study revealed that each cluster yielded a pangenome size of 34-437 genes and shared 16-202 genes in the core genome. The absence of core ortholog groups shared by all the S. aureus phages indicates the frequent exchange, acquisition, and loss of genetic material. Nonetheless, genes associated with phage morphogenesis and DNA packaging were observed in each ortholog group of the subdividing clusters. Phage genomic diversity is difficult to establish because of the absence of a conserved genetic marker and the large number of phages in the biosphere [8]. However, genes associated with phage morphogenesis and DNA packaging may be genetic markers for subdividing cluster phages. A DNA packaging protein that assembles a motor complex may effectively pump DNA into tailed phage procapsids and accelerate phage assembly [28,29]. The disruption of DNA packaging genes completely abolished phage DNA packaging events, suggesting that these genes play a prominent role in the transfer of S. aureus phages [30]. Therefore, the conserved DNA packaging gene indicates a similar DNA packaging mechanism in the subdividing cluster phages. However, the present study was limited to the complete phage genomes deposited in GenBank, and an updated genetic analysis is thus necessary to provide accurate genetic markers for phage classification and identification.
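The pangenome and core genome sizes discussed above can be illustrated with a short Python sketch. It assumes that each phage in a cluster is represented by the set of orthogroups it carries, takes the pangenome as the union of those sets, and takes the core genome as their intersection. The phage names and orthogroup labels are hypothetical, and the clustering tool used in the study may define these quantities differently.

```python
def pan_and_core(orthogroups_by_phage):
    """Return (pangenome, core genome) as sets of orthogroup identifiers.

    The pangenome is the union of all orthogroups in the cluster; the core
    genome is the set of orthogroups present in every phage of the cluster.
    """
    sets = [set(groups) for groups in orthogroups_by_phage.values()]
    if not sets:
        return set(), set()
    pan = set().union(*sets)
    core = set.intersection(*sets)
    return pan, core


# Hypothetical cluster of three phages and their orthogroups.
cluster = {
    "phage_1": {"OG1", "OG2", "OG3"},
    "phage_2": {"OG1", "OG2", "OG4"},
    "phage_3": {"OG1", "OG2", "OG5"},
}
pan, core = pan_and_core(cluster)
print(len(pan), sorted(core))
```

Under this definition, adding a divergent phage to a cluster can only enlarge the pangenome and shrink the core genome, which is one reason why no core orthogroup would be expected across all 188 phages when lineages differ strongly.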

Exchange of Functional Modules and the Insertion/Deletion of Small DNA Segments Promote the Evolution of S. aureus Phages
To further understand the interaction between S. aureus phages and their hosts, the mosaic structures of singleton phages 187 and 2638A were analyzed. The phage-187 genome comprised four functional modules, as mentioned previously (Figure 7). The DNA packaging module was an 1818-base-pair-long structure and harbored two genes encoding the small and large terminase subunits. This region shared 98.0% nucleotide sequence identity with prophage 6 in the S. aureus strain RJ1267 (CP047321). The phage morphogenesis module was an 18,416-base-pair-long structure and contained 13 genes involved in phage morphogenesis and 1 lysin gene. This region shared 98.9% nucleotide sequence identity with that of RJ1267. The host cell lysis module was a 1021-base-pair-long structure and harbored one lysin and one holin gene. Notably, this module shared 99.3% and 99.0% nucleotide sequence identity with that of RJ1267. However, the DNA metabolism module of phage-187 shared little nucleotide sequence identity with the DNA metabolism module of strain RJ1267. This module was an 18,216-base-pair-long structure and contained 42 ORFs, including genes involved in DNA metabolism, lysogeny, virulence, and the toxin-antitoxin system. The phage-2638A genome was also composed of four functional modules (Figure 8). The DNA packaging module was a 2057-base-pair-long structure and harbored one gene encoding the large terminase subunit. This region shared 98.0% nucleotide sequence identity with prophage 3 from the Staphylococcus pseudintermedius strain SP_11306_4A (CP065919). The phage morphogenesis module was an 18,253-base-pair-long structure and contained six genes encoding phage morphogenesis proteins and one clp gene. Region A in this module was a 13,064-base-pair-long structure and shared 99.7% nucleotide sequence identity with that of strain SP_11306_4A. However, the remaining region shared less than 90% nucleotide sequence identity with that of SP_11306_4A.
The host cell lysis module of phage-2638A was a 1711-base-pair-long structure and harbored one lysin and one holin gene. The DNA metabolism module of phage-2638A was a 19,061-base-pair-long structure and contained 35 ORFs, including genes encoding DNA polymerase and integrase and genes associated with virulence. Regions B and C in this module were 6415- and 3510-base-pair-long structures and shared 96.2% and 96.9% nucleotide sequence identity with those of SP_11306_4A, respectively. Phages 187 and 2638A, isolated from S. aureus strains in Canada and the United States, respectively, shared little nucleotide sequence identity with the genome sequences in the NCBI database. However, the DNA packaging, phage morphogenesis, and host cell lysis modules of phage-187 shared a high nucleotide sequence identity with a prophage in the S. aureus strain RJ1267, which was isolated from a sputum sample in Shanghai, China. These results suggest that phage-187 and prophage 6 in the S. aureus strain RJ1267 probably shared a common ancestor, which subsequently underwent an exchange of the DNA metabolism module. Consistently, the DNA packaging module of phage-2638A was similar to that of a prophage in S. pseudintermedius strain SP_11306_4, which was isolated from a canine skin sample in the US. However, the host cell lysis module shared little nucleotide sequence identity with SP_11306_4. These results reveal that the exchange of functional modules among staphylococcal phages may cross species barriers, which is consistent with the results of a study indicating that gene exchange between staphylococcal phages may cross species barriers because they coexist in a common host [31]. Moreover, small DNA segment insertion/deletion events were observed in the DNA metabolism module and phage morphogenesis module of 2638A, which is consistent with previous findings that the transduction of phiSaBov was accompanied by the mobilization of the genomic islands vSaα, vSaβ, and vSaγ [30,32].
Our study indicates that the genetic diversity of S. aureus phages is likely due to the exchange of functional modules and the insertion/deletion of small DNA segments, which may cross species barriers. Therefore, gene exchange, acquisition, and loss resulting from the exchange of functional modules and the insertion/deletion of small DNA segments promote the evolution of S. aureus phages. Future research should, however, elucidate the exact mechanism of gene exchange between S. aureus phages and their hosts.
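The module-level nucleotide sequence identities reported above can be illustrated with a minimal Python sketch that computes percent identity from a pairwise alignment. It assumes the two module sequences have already been aligned to equal length with "-" as the gap character, and it counts a gap opposite a base as a mismatch; the study does not state which identity definition was used, so this convention, the function name, and the toy sequences are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    Columns where both sequences have a gap are ignored; a gap opposite a
    base is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: one substitution and one single-base gap.
module_phage = "ACGTACGT-A"
module_prophage = "ACGTTCGTAA"
identity = percent_identity(module_phage, module_prophage)
print(identity)
```

Applied per module rather than per genome, a calculation of this kind is what allows the contrast drawn in the text, with some modules near 98 to 99% identity and others showing little similarity to the same prophage.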
DNA replication is driven by multiple enzymes, including DNA helicase, which separates double-stranded template DNA; RNA polymerase, which synthesizes an RNA primer; DNA synthesis protein, which initiates Okazaki fragment synthesis; and DNA polymerase, which synthesizes leading and lagging daughter strands [33–35]. Therefore, the prevalence of DNA replication genes in S. aureus phages likely enhances phage DNA replication in host cells. It was surprising to observe the abundance of genes encoding type III DNA restriction and modification enzymes in S. aureus phages, which is inconsistent with previous results that 28.1% of Acinetobacter phages encoded type II restriction-modification systems [36]. Type III DNA restriction and modification enzymes are responsible for host-specific barriers and protect bacterial cells against bacteriophage infections [37]. Therefore, the presence of modified nucleosides in phage genomes may protect host cells against other bacteriophage infections.
No lysogeny-associated genes were found in lineage I (Table 2). However, genes encoding recombinase and transposase were exclusively found in clades IIa, IIb, and IIc (p < 0.001). Genes encoding integrase, repressor, and anti-repressor were predominantly detected in clades IIIa-IIIe. This result indicated that integration systems varied based on the subdividing clusters, which is inconsistent with the results of a study revealing no obvious link between the types of integrase, host species, or subclusters [13].

Conclusions
The abundance of virulence-determinant genes in the phage genomes was consistent with the results of a previous study [13]. Panton-Valentine leukocidin is a cytotoxin that forms pores in leukocyte cell membranes, which leads to a higher pathogenic potential and the recurrence of community-associated MRSA [38]. The dUTPase enzyme is essential for DNA integrity and viability in many prokaryotic and eukaryotic organisms, and it also controls the transfer of virulence genes via a proto-oncogenic G protein-like mechanism [39]. Furthermore, the staphylococcal complement inhibitor is an important protein associated with host defense that interferes with the activation of the human complement system [40]. Additionally, staphylokinase is a fibrinolytic agent that plays an important role in dissolving blood clots on fibrin surfaces [41]. β-Hemolysin is hemolytic to sheep erythrocytes, contributes to biofilm formation in rabbit endocarditis models, and enhances the ability of S. aureus to colonize murine skin [42]. The abundance of these virulence genes suggests that the evolutionary model of S. aureus phages promotes host pathogenicity. β-Lactamase, which hydrolyzes the β-lactam ring, is the primary mechanism of resistance to β-lactam antibiotics, driven by their extensive use [43]. Our results also indicated that S. aureus phage evolution contributes to the environmental adaptation of its host.
In conclusion, our study provides insight into the interaction between S. aureus phages and their hosts by exploring their genomic, structural, and genetic diversity. Our analysis suggests that the genes associated with phage morphogenesis and DNA packaging are conserved in the subdividing clusters, despite the mosaic structural diversity of S. aureus phages. The genetic diversity of S. aureus phages is likely due to gene exchange, acquisition, and loss resulting from the exchange of functional modules and the insertion/deletion of small DNA segments among staphylococcal phages, which may cross species barriers. Moreover, module exchange probably occurred exclusively among the subdividing cluster phages. Through these evolutionary strategies, S. aureus phages enhance phage DNA replication in host cells and contribute to the environmental adaptation of their host.
Author Contributions: Study conception and design: Z.Y. and W.Z.; acquisition of data: W.Z., H.W. and Y.L.; analysis and interpretation of data: W.Z., Y.G. and X.Z.; drafting of the manuscript: W.Z., L.Y. and Z.Y.; critical revision: Z.Y. and G.Z. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement:
The study results' data are included in the article or Supplementary Materials. Further specific information regarding the dataset analyzed during the study can be obtained from the corresponding author on reasonable request.

Conflicts of Interest:
The authors have no conflict of interest to declare.