Staphylococcus arlettae Genomics: Novel Insights on Candidate Antibiotic Resistance and Virulence Genes in an Emerging Opportunistic Pathogen

Coagulase Negative Staphylococci (CoNS) are becoming increasingly recognized as an important cause of human and animal infections. Notwithstanding their clinical relevance, annotation of genes potentially involved in pathogenicity and/or antibiotic resistance in the CoNS species Staphylococcus arlettae (SAR) is currently very limited. In the current work we describe the genome of a novel methicillin resistant isolate of SAR, which we named Bari, and present a comprehensive analysis of predicted antibiotic resistance profiles and virulence determinants for all the 22 currently available SAR genomes. By comparing predicted antibiotic resistance and virulence-associated genes with those obtained from a manual selection of 148 bacterial strains belonging to 14 different species of staphylococci and to two “outgroup” species, Bacillus subtilis (BS) and Macrococcus caseoliticus (MC), we derived some interesting observations concerning the types and number of antibiotic resistance-related and virulence-like genes in SAR. Interestingly, almost 50% of the putative antibiotic resistance determinants identified in this work, which include the clinically relevant mec, van, and cls genes, were shared among all the SAR strains herein considered (Bari included). Moreover, comparison of predicted antibiotic resistance profiles suggest that SAR is closely related to well-known pathogenic Staphylococcus species, such as Staphylococcus aureus (SA) and Staphylococcus epidermidis (SE). A similar analysis of predicted virulence factors, revealed that several genes associated with pathogenesis (including, for example, ica, nuc, and ssp), which are commonly found in the genomes of pathogenic staphylococci such as Staphylococcus haemolyticus (SH) and Staphylococcus saprophyticus (SS), are observed also in the SAR strains for which a genomic sequence is available. All in all, we believe that the analyses presented in the current study, by providing a consistent and comprehensive annotation of virulence and antibiotic resistance-related genes in SAR, can constitute a valuable resource for the study of molecular mechanisms of opportunistic pathogenicity in this species.


Introduction
The first strains of Staphylococcus arlettae (SAR) were reported in 1984, isolated from the skin and nares of poultry and goats [1]. Since then, SAR strains have been isolated from different animals (mainly mammals and birds) and environments including salt mines, estuaries, fermented foods, and biological safety cabinets [2][3][4][5][6]. Although SAR is normally considered a commensal species, it may also be associated to different types of infections or in contexts where a large use of antibiotics is applied. For example, SAR strains were isolated from bovine mastitis, pig exudative epidermidis, dairy goat intramammary infection, a human patient affected by rheumatic mitral stenosis, as well as from blood clinical samples [7][8][9]. Recently a plasmid encoding for nine antibiotic resistance genes, cfr, erm(C), tet(L), erm(T), aadD, fosD, fexB, aacA-aphD, and erm(B), was characterized in SA-01, a SAR strain isolated from a chicken farm. The plasmid contained three IS431 elements mediating intra-or inter-plasmid recombination, and was considered as a potential vector of antibiotic resistance genes with relevant implications on the effectiveness of clinical therapy based on antimicrobials [10]. More recently, an operon encoding for a novel functional β-lactamase (bla ARL ) was detected in SAN1670, a SAR strain isolated from bovine mastitis. Interestingly, bla ARL was located in a high-mobility genomic island, suggesting its potential for mobilization and lateral gene transfer [11]. Independent studies have also identified several multidrug efflux pumps (e.g., norA) coding genes as well as other genes related to resistance to antibiotics such as chloramphenicol (e.g., fexA), tetracycline (e.g., tetL), and erythromycin (e.g., msrA, mphC) in the genomes of SAR strains isolated from chicken farm and dairy herds affected by mastitis [10,12,13]. The fosfomycin resistant fosD gene was described in a novel plasmid of the SAR SA-01 strain [10]. Since fosD-resistance genes are typically located in mobile genetic elements, they may contribute to multi-resistant traits to other staphylococci [14][15][16].
Additionally, several virulence-associated genes, including fibronectin/fibrinogen binding protein, programmed cell death toxin ydcD, hemolysin III, autolysins (atl), and genes involved in the regulation of virulence accessory factors such as agrA, agrB, agrR, agrV, and agrZ, were identified in the SAR CVD059 strain isolated from the blood of a cardiovascular disease patient [8,17,18].
Although SAR is emerging as an important opportunistic pathogen, apart from the studies discussed above, to date there is no comprehensive information on antibiotic resistance and presence/absence of virulence-associated genes for most of the 22 SAR isolates from which genomic sequences are available.
The resistance to most β-lactam antibiotics is a typical trait of many pathogenic staphylococci, collectively defined as methicillin-resistant staphylococci, and currently represents a relevant problem in clinical treatment of Staphylococcus infections. Indeed, methicillin resistance is usually associated with resistance to additional antibacterial agents, producing a multi-resistant phenotype that may further compromise the therapy [19]. However, at present, information concerning the presence/absence of β-lactam resistance genes is currently lacking for several Coagulase Negative Staphylococci (CoNS) species including SAR.
In this study, we present the draft genome of a novel SAR methicillin resistant strain, which we named Bari, isolated from a disused biological safety cabinet, and perform systematic bioinformatics predictions of antibiotic resistance and virulence-associated genes for all 22 SAR genomes available in the NCBI database. By comparing the predicted profiles with equivalent profiles obtained from a manually curated selection of 148 Staphylococcus genomes belonging to fourteen species and two distantly related groups of Firmicutes, Bacillus subtilis (BS) and Macrococcus caseolyticus (MC), we highlight for all the SAR genomes considered, the presence of several antibiotic-resistance genes and virulence determinants never reported thus far, which are commonly detected in pathogenic staphylococci.
The analyses presented in the current study, by providing a consistent and comprehensive annotation of virulence and antibiotic resistance-related genes in SAR, can constitute a valuable resource for the study of molecular mechanisms of opportunistic pathogenicity in this species.

Isolation and Sequencing of SAR Bari
The SAR Bari strain was isolated from a Nunc bioassay plate, containing LB (Lauria Bertani) agar supplemented with hexavalent chromate Cr (VI), after an incubation at room temperature on the top of a disused biological safety cabinet. SAR Bari exhibited a high resistance to Cr (VI), up to 150 mM in LB and M9 minimal broth. Genomic DNA was extracted with the DNeasy blood and tissue kit (Qiagen, Hilden, Germany) and sequenced by an Illumina MiSeq instrument (San Diego, CA, USA) by producing 2 × 250 nucleotide bp paired-end reads.

Genome Assembly and Taxonomic Classification of SAR Bari
Raw sequencing data from the Illumina platform were processed using a modified version of the "Fosmid1" pipeline in the A-GAME Galaxy framework [20,21]. Quality trimming was executed using the sliding-window operation in Trimmomatic with default parameters [22]. Overlapping reads were merged using PEAR with standard parameters [23]. The final assembly was performed using the SPAdes assembler (version 3.50) using kmers of 33, 55, 77, 99, and 121 nt [24]. Annotation was performed with PROKKA with default parameters [25]. The draft genome of SAR Bari was deposited in NCBI under the accession number WEIN00000000, BioSample number SAMN12991358 and BioProject number ID PRJNA576354. Taxonomic classification of SAR Bari was performed by using Tetra Correlation Search (TCS) and ANI (Average Nucleotide Identity), as implemented by the JspeciesWS web server http://jspecies.ribohost.com/jspeciesws/) [26].

Comparative Genomics Dataset
In order to carry out a comparative analysis of predicted profiles of antibiotic resistance and virulence-associated genes in the genus Staphylococcus, 148 strains from 14 different species of Staphylococcus (SAR included) were selected, including two distantly related species (BS and MC) to provide an outgroup for comparative analyses.
Staphylococcus species included in these analyses were selected based on the criteria previously proposed [27]. Briefly, the Staphylococcus genus was subdivided in six major sub-clades. For each sub-clade one or more representative species were selected based on the availability of genomic sequences and annotation of protein coding genes in GenBank. Only species for which genome annotation for at least four distinct strains and for which the genome sequences of the representative types of strains were available were included in the study. A total of 148 strains belonging to 16 species were selected based on these criteria. These include: 14 Table S1. All genomes deposited before June 2019 were considered. The complete list of species included in the dataset for the comparative genomics analysis, with the corresponding GenBank accession numbers, are in Table S2.

Genomics of Antibiotic Resistance
Putative antibiotic resistance genes were detected by RGI v.5.1.0 (Resistance Gene Identifier, https://card.mcmaster.ca/analyze/rgi) using the CARD (Comprehensive Antibiotic Resistance Database, https://card.mcmaster.ca/home) reference collection on the predicted annotated CDS (CoDing Sequence) for all considered species [28]. All the analyses were performed using the RGI web portal with default parameters. The RGI output was imported in Cytoscape v.3.7.2. to generate a network of shared antibiotic resistance determinants, where the distance among species was proportional to the number of unique elements detected per species and connections among species [29].

Genomics and Taxonomy of SAR (Staphylococcus arlettae) Bari
The final assembly of the SAR Bari strain consisted of 75 scaffolds, for a total size of 2,547,240 bp and an N50 of 90.3 Kb. A total of 2530 genes and 2460 coding sequences (CDS) were annotated. Taxonomic characterization based on TCS suggested that SAR Bari was very closely related to the SAR type strain (NCTC 12413 T ), with a z-score of 0.99915, which is marginally above the cut-off value for species identification. Average Nucleotide Identity (ANI) between the draft genome assembly of SAR Bari and NCTC 12413 T was 98.68%, that is well above the value typically used for species delineation (95-96%). Based on this evidence, both TCS and ANI suggested the classification of the isolate as a novel SAR strain, which was then named Staphylococcus arlettae Bari.

Antibiotic Resistance of SAR Bari
The experimental results of the antimicrobial susceptibility test showed that SAR Bari was resistant to oxacillin and to all β-lactam antibiotics tested (ampicillin, cefoxitin, ceftaroline, imipenem, penicillin). Resistance to clindamycin, erythromycin, fosfomycin, and fusidic acid was also observed. SAR Bari was sensitive to ciprofloxacin, moxifloxacin, daptomycin, teicoplanin, vancomycin, tetracycline, tigecycline, gentamicin, linezolid, trimethoprim/sulfamethoxazole, mupirocin, nitrofurantoin, and rifampin. A full concordance between results obtained by the BD Phoenix and ETest was observed.
The majority of candidate antibiotic resistance proteins identified by RGI were associated with efflux-based mechanisms of antibiotic resistance (52%), with other hits correlated to resistance mechanisms involving antibiotic target alteration, protection and replacement (31%), and antibiotic inactivation (17%) (Figure 1).
The majority of candidate antibiotic resistance proteins identified by RGI were associated with efflux-based mechanisms of antibiotic resistance (52%), with other hits correlated to resistance mechanisms involving antibiotic target alteration, protection and replacement (31%), and antibiotic inactivation (17%) (Figure 1).

Antibiotic Resistance Genomics of SAR and Species Dataset
The number of putative antibiotic resistance genes identified by RGI in other SAR strains was substantially equivalent to that identified for SAR Bari (Table 2), ranging from 234 (AR15) to 212 (AR12). Consistent with our previous observations, 112-125 of these proteins were associated with efflux-based mechanisms of antibiotic resistance ( Table 2).
The RGI analysis of antibiotic resistance determinants in the extended dataset of 148 strains detected a total of 505 distinct antibiotic resistance-like genes (Table S4) with thirty-six shared between all species investigated. Unique determinants were identified for all 16 species of the dataset. While nine genes were found to be specific to SAR isolates (catA8, fosB3, emrA, gimA, spd, tetA, fosA6, mdtH, and otrC genes), a number of SAR genes were shared exclusively with SA (imp-35 gene), BS (vanSN, qepA4, cau-1, catB9 genes), and SK (soxS, qacB genes).
The RGI analysis of antibiotic resistance determinants in the extended dataset of 148 strains detected a total of 505 distinct antibiotic resistance-like genes (Table S4) with thirty-six shared between all species investigated. Unique determinants were identified for all 16 species of the dataset. While nine genes were found to be specific to SAR isolates (catA8, fosB3, emrA, gimA, spd, tetA, fosA6, mdtH, and otrC genes), a number of SAR genes were shared exclusively with SA (imp-35 gene), BS (vanSN, qepA4, cau-1, catB9 genes), and SK (soxS, qacB genes).
Similarity of global profiles of predicted antibiotic resistance patterns are represented in the form of a network graph in Figure 2, where nodes represent species and edges represent shared predicted antibiotic resistance genes. As outlined in the figure, predicted resistome genes identified in BS are remarkably different from predicted resistome genes of other staphylococci and MC. Reduced levels of diversity are observed for SH, SS, SSC, SAU, SE, and SC, with SAR more closely related to SA, SE, SC, and SCH.  Table S4 for details on the molecular elements of antibiotic resistance shared among the species in the dataset. Species abbreviations are in the Material and Methods Section (Section 2.4).

Analysis of the Distribution of Putative Virulence Elements
Prediction of virulence-associated genes in SAR Bari, carried out by VRprofile, identified a total of 440 putative virulence-related genes, showing significant levels of similarity with VF, AR, T3SE, T4SE, T6SE, T7SE, T3SS, T6SS, T7S, Prophage, Integrons, IS, PAI, and ARI virulence elements in the VFDB database (Table S5). The majority of the genes identified by the VRprofile were annotated as VF (135), ARI (66), PAI (84), Prophage (111), and AR (120), while T3SE, T4SE, T3SS, T6SS, Integrons,  Table S4 for details on the molecular elements of antibiotic resistance shared among the species in the dataset. Species abbreviations are in the Material and Methods Section (Section 2.4).
Comparative analysis did not evidence relevant differences between the virulence profile of SAR Bari and all other SAR strains (Table S6) with, for example, AR2 (657) and TSAR (644) showing a higher number of virulence elements identified with respect to other strains such as AR4 (521) or B (Bari, 542).
Clustering of virulence elements profiles performed on the 148 genomes dataset, which are represented in the heatmap in Figure 3 (data in Table S6), highlighted two major groups, one of them possibly split into four additional subgroups (G1-G4) and the other corresponding to BS. G1 and G2 include virulence profiles of SSC, SAG, SHY, SCH, SAU, SSI, SE, SH, and MC strains. G3 includes virulence profiles of SAR strains as well as other species (SC, SK, SH, and SS). In particular, SAR Bari (B) showed a high similarity with SAR B1, B2, and B3. G3 also included the SE strain (SE9). G4, well separated from the other groups, was instead represented by virulence profiles of SA strains.  Color scale corresponds to the number of predicted genes associated to specific virulence element, from green (lowest) to red (highest). Note: Abbreviations used are relative to single strains (see Table S2 for details).

Virulence Factors within SAR and Species Dataset
VFanalyzer identified a total of 23 putative virulence factors, related to five different classes (VFclass) in the genome of SAR Bari ( Table 3). Most of these (14) were related to genes (capsule undetermined, capB, capC, galE) coding for virulence factors involved in immune system evasion (VFclass 2). Additional predicted virulence factors were associated with allantoine utilization (4 hits, nutritional factors); lipase (1 hit, lip), protease (1 hit, sspA), and nuclease activity (1 hit, nuc); virulence-factor-related to the process of antiphagocytosis (1 hit, uge); and serum resistance and immune evasion (1 hit, wbtP).  Color scale corresponds to the number of predicted genes associated to specific virulence element, from green (lowest) to red (highest). Note: Abbreviations used are relative to single strains (see Table S2 for details).

Virulence Factors within SAR and Species Dataset
VFanalyzer identified a total of 23 putative virulence factors, related to five different classes (VFclass) in the genome of SAR Bari ( Table 3). Most of these (14) were related to genes (capsule undetermined, capB, capC, galE) coding for virulence factors involved in immune system evasion (VFclass 2). Additional predicted virulence factors were associated with allantoine utilization (4 hits, nutritional factors); lipase (1 hit, lip), protease (1 hit, sspA), and nuclease activity (1 hit, nuc); virulence-factor-related to the process of antiphagocytosis (1 hit, uge); and serum resistance and immune evasion (1 hit, wbtP). All the predicted virulence factors identified in SAR Bari were present in at least one of the SAR strains included in the study (Table 4). Table 4. Virulence-factor genes predicted within SAR.  In total, 39 distinct virulence-related genes were identified by VFanalyzer in SAR. Of these, only two genes, one related to VFclass-immune evasion (capsule undetermined) and one to the VFclass-enzyme (sspA gene related to a serine protease), were shared between all SAR strains. The groEL and lplA1 genes were found exclusively in the AR8 strain, while hemL and icaC genes were identified solely in the AR12 and AR17 strains.
Profiles of presence/absence of virulence-factor-related genes identified within SAR strains were compared with equivalent profiles derived from the analyses of the other 15 bacterial species considered in this study. The complete results of these analyses are reported in Table S7. Comparison of the presence/absence of virulence-related genes identified in SAR with the other species considered in the study are reported in Table 5. A total of 160 distinct virulence-factor-related genes were identified in the 148 genomes included in our dataset, of these, 37 were shared between SAR and different species in the dataset ( Table 5). Among these, SAR shared four genes uniquely with BS (plr/gapA, katA, acpXL, and gnd), one gene with SAU (lpeA), and one gene with SE (eno). Further, SAR had six unique genes (flmH, T6SS-II, tuf, groEL, slrA, and sigA/rpoV).

Discussion
Despite the considerable clinical and environmental relevance of CoNS and the availability of an increasing number of genomic sequences, at present, information concerning SAR, particularly the presence/absence of putative virulence factors and antibiotic resistance genes, is limited.
In this study we present a detailed comparative genomics analysis of predicted resistomes and virulence factors of a carefully selected collection of 124 representative staphylococci, including all the currently available SAR genomes.
Our analyses highlight the presence of numerous candidate antibiotic resistance genes in SAR. Interestingly, several of these genes are commonly observed in other Staphylococcus species, suggesting a widespread presence of putative antibiotic resistance genes in the genomes of staphylococci. On the other hand, some genes (e.g., fosB3 and tetA) are consistently observed only in SAR isolates, or are shared between SAR and a limited number of commensal (soxS with SK) and pathogenic species (e.g., imp-35 and catB9 with SA and BS, respectively).
The global analysis of predicted antibiotic resistance profiles, as shown in Figure 2, clearly indicated that the predicted resistome of SAR is closely related to the predicted resistomes of pathogenic species, including SE and SA.
Notably, we observe that the predicted antibiotic resistance profile of MC, which was included in our analyses to serve as a reasonably phylogenetically distant outgroup, was very closely related to that of several Staphylococcus species, suggesting that these bacteria have very similar antibiotic resistance profiles. This observation might reflect that Macrococcus and Staphylococcus can be found associated in different environments (e.g., animal infections, milk, meat products) were horizontal gene transfer could take place [33].
However, the observation that Macrococcus and Staphylococcus are very closely related from a phylogenetic point of view could provide an equally likely explanation [34]. Importantly, predicted antibiotic resistance profiles recovered for BS, which in this study was used as an alternative outgroup species, were considerably different from those of staphylococcal genomes.
All the SAR strains herein considered contained a similar number of predicted antibiotic resistance genes. Unsurprisingly these genes were consistently annotated to similar functional classes. The majority (~50%) of putative antibiotic resistance genes identified in SAR strains were consistently associated with the efflux-based mechanism of resistance ( Table 2). In this respect, it should be noted that several recent studies suggest an increasing relevance of efflux pumps in antibiotic resistance mechanisms [35]. Efflux pumps displayed a tendency to promote loss of substrate specificity, which could translate into multi-resistance [36,37]. Moreover, as evidenced in a recent large-scale genomics investigation of pathogenicity determinants, many efflux pumps do not exclusively export antimicrobials, but could be implicated in bacterial virulence, representing potent targets for adaptation to a pathogenic lifestyle [38,39].
Our genomic analysis also predicted some genes of clinical relevance for all SAR strains. These include methicillin resistance (mecD, mecR1, and mecI), vancomycin resistance (e.g., vanKI, vanHA, vanTG, and vanYM) as well as daptomycin resistance (cls, rpoB, and pgsA). Moreover, the SAR Bari strain, which was described and characterized for the first time in this study, contained a putative arl-1 gene encoding for a novel β-lactamase, which was recently identified by whole-genome sequencing of a penicillin-resistant SAR strain (SAN1670) [11]. Interestingly, this gene was observed only in the SAN1670 and Bari strains, while all the other SAR strains and Staphylococcus species considered in this study contained the blaZ gene, coding for the most common β-lactamase found in staphylococci.
Notwithstanding these very interesting considerations, we observe that the predicted profile of antibiotic resistance of the SAR Bari strain does not completely match the experimental data obtained in this study. Indeed, while we observe a perfect correspondence between candidate resistance genes and in vitro resistance to several antibiotics (e.g., all used β-lactam antibiotics, fosfomycin, fusidic acid, erythromycin), it is also true that SAR Bari was sensitive to some antimicrobials (e.g., tetracycline, gentamicin, teicoplanin) for which putative resistance genes were predicted. These observations could be explained by several possible reasons, including complex regulatory mechanisms of gene expression, reduced enzymatic activity of some predicted candidate genes, or involvement of these genes in the resistance to other types of antimicrobials. Notwithstanding these limitations, we believe that the availability of a large and comprehensive annotation of candidate antimicrobial resistance genes in CoNS will constitute an important resource for the functional characterization of these genes and a more detailed understanding of the molecular mechanisms involved in antibiotic resistance.
Bioinformatics prediction of virulence-related genes identified a consistent number of putative virulence elements and factors (Table 5 and Table S6) in the genomes of all the SAR strains. Comparative analyses suggested that the predicted virulence profiles of SAR isolates have a relevant similarity with other pathogenic staphylococci, including SS and SH (Figure 3, Table S6).
These observations are confirmed even when a specific set of virulence factors (e.g., ssp, nuc, cap) involved in immune evasion is considered.
Remarkably, the putative orthologs of the ssp, nuc, and cap, which have recently been described in the majority of opportunistic CoNS pathogenic species (SE, SH, and SS), were observed in all SAR strains for which genomic sequences are available [40].
The presence of a high number of predicted antibiotic resistance genes and virulence-related elements in SAR suggests that these genes might constitute an important genetic reservoir of genes of clinical relevance for other pathogenic bacterial strains and species.
In conclusion, we believe that by providing, for the first time, an extensive annotation of potentially clinically relevant pathogenic genes in SAR isolates, the analyses here presented can constitute a valuable resource for the study of molecular mechanisms of opportunistic pathogenicity in this species and the functional characterization of antibiotic resistance and virulence genes of CoNS in general.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2076-2607/7/11/580/s1, Table S1: Genome assembly of Staphylococcus arlettae, Table S2: DATASET: Species and Genome Assembly, Table S3: Antibiotic resistance determinants within SAR, Table S4: Antibiotic resistance determinants shared among SAR and other species in the dataset, Table S5: Virulence elements of the SAR Bari draft genome, Table S6: Number of predicted genes correlated to virulence elements identified within species in the dataset, Table S7: Distribution of putative genes related to virulence factors identified within species in the dataset.