Antimicrobial Resistance Genes Analysis of Publicly Available Staphylococcus aureus Genomes

Staphylococcus aureus is a pathogen that can cause severe illness and express resistance to multiple antimicrobial agents. It is part of the ESKAPE organisms and it has been included by the Centers for Disease Control and Prevention (CDC) of USA in the list of serious threats to humans. Many antimicrobial mechanisms have been identified, and, in particular, antimicrobial resistance genes (ARGs) can be determined by whole genome sequencing. Mobile genetic elements (MGEs) can determine the spread of these ARGs between strains and species and can be identified with bioinformatic analyses. The scope of this work was to analyse publicly available genomes of S. aureus to characterise the occurrence of ARGs present in chromosomes and plasmids in relation to their geographical distribution, isolation sources, clonal complexes, and changes over time. The results showed that from a total of 29,679 S. aureus genomes, 24,765 chromosomes containing 73 different ARGs, and 21,006 plasmidic contigs containing 47 different ARGs were identified. The most abundant ARG in chromosomes was mecA (84%), while blaZ was the most abundant in plasmidic contigs (30%), although it was also abundant in chromosomes (42%). A total of 13 clonal complexes were assigned and differences in ARGs and CC distribution were highlighted among continents. Temporal changes during the past 20 years (from 2001 to 2020) showed that, in plasmids, MRSA and macrolide resistance occurrence decreased, while the occurrence of ARGs associated with aminoglycosides resistance increased. Despite the lack of metadata information in around half of the genomes analysed, the results obtained enable an in-depth analysis of the distribution of ARGs and MGEs throughout different categories to be undertaken through the design and implementation of a relatively simple pipeline, which can be also applied in future works with other pathogens, for surveillance and screening purposes.


Introduction
Staphylococcus aureus is a bacterial pathogen that can cause cases of serious bacteraemia and other infections, such as endocarditis, prosthetic device infections (PDI), pleuropulmonary infections, abscesses, meningitis and urinary tract infections (UTIs), in humans [1]. It is also a major concern in the food production industry because some clonal lineages of this pathogen can be transferred from animals to humans, with possible associated health issues and economical losses [1][2][3]. Its capacity for adaptation and resistance to antimicrobials makes it a very dangerous microorganism. Indeed, it is included in the list of serious threats to humans by the Centers for Disease Control and Prevention (CDC) of USA and in the World Health Organisation (WHO) list of high priority pathogens for the development of new antibiotics [4,5]. Antimicrobial resistance (AMR) is a serious issue in both clinical and food production environments due to the possible spread of superbugs with multi-drug resistance (MDR) mechanisms [6]. The acquisition of AMR by S. aureus strains started with the extensive use of penicillin in the 1940s and the use of methicillin later, leading to the development of methicillin-resistant Staphylococcus aureus (MRSA) [7]. As a result, S. aureus has been included in the group of ESKAPE pathogens, a group of MDR organisms of major concern for humans [8].
Whole genome sequencing (WGS) is nowadays a commonly used and powerful tool for the characterisation of pathogenic bacteria, including S. aureus [9]. It enables the typing of strains (e.g., using core genome multi locus sequence typing), as well as the prediction, with a certain level of confidence, of their phenotypic characteristics, such as virulence and AMR potential, by searching for the genes which determine these phenotypic traits in the genomes or specifically in their mobile genetic elements (MGE) [10]. Some studies have found a high level of confidence between AMR phenotypes and genotypic predictions, although sometimes the correlations obtained are not absolute, meaning that the standard microbiological protocols for AMR assessment still need to be used [11,12]. WGS is also being used for outbreak investigations because it facilitates the fast discovery of the source of outbreaks [13]. Thanks to the diffusion of open science practices, WGS research data are being made freely available in public databases. When corresponding metadata are also incorporated, such as information on isolation source, country, time of isolation, etc., the investigation of trends in distribution of genetic determinants as associated with particular origins is facilitated. In this manner, the aim of this study was to provide an overview of the antimicrobial resistance genes (ARGs) of S. aureus in publicly available genomes, with a focus on their distribution through continents, isolation sources, clonal complexes (CCs), and the temporal changes along the past 20 years. The pipeline used in this study can be easily applied for further analyses, for example considering other pathogens, or for surveillance purposes, such as to monitor changes in ARGs distribution in source categories and countries, in relation to the antibiotic usage.
Significant differences (p < 0.05) were observed in the distribution of chromosomic ARGs among Continents ( Figure 3A). For example, aac(6 )-aph(2") and catpC221 were found significantly more abundant in Asia (44 and 3% of chromosomes from Asia, respectively) and in South America (38% and 3%, respectively) than in Europe (15% and 0.1%, respectively), even though catpC221 occurrence was higher in Africa, but with no significant differences if compared to its occurrence in other continents (p > 0.05); tet(K) and tet(M) were more represented in Europe (15% and 27%, respectively) than in South America (6%, in both cases); aadD had very small presence in Africa (0.3%), while it was much more prevalent in North America (53%) than in South America (12%), Europe (9%), or Asia (21%). The ARGs ant(6)-Ia and aph(3')-III were significantly more abundant in South America (46% in both cases) than in all the other continents, excluded North America (32% and 6%, respectively) and dfrG was found more abundant in Africa (39%) and Asia (32%) than in Europe (6%) and North America (1%) ( Figure 3A).

CCs and ARGs Distribution by Source of Isolation
In chromosomes, 35.9% of animal sources belonged to CC398 and 34.56% of human sources to CC5, followed by CC1 in animal (24.36%) and CC8 in human (24.49%) sources ( Figure 4A). CC398 was also found in abundance in the categories Environment and Food (21.41% and 17.05%, respectively), while CC8 was prevalent in the Environment category (15.11%), however, these differences were not significant (p > 0.05).
The analysis of plasmidic contigs showed a predominance of CC1 plasmidic contigs The distribution of plasmidic ARGs throughout the continents showed statistically significant differences for only four genes: erm(C) was found more abundant in plasmidic contigs from Asia (24%) than from North and South America together (7%), while mph(C), msr(A), and spc were more abundant in North/South America (15%, 15%, and 22%, respectively) than in Europe (3%, 4%, and 4%, respectively) ( Figure 3B).

CCs and ARGs Distribution by Source of Isolation
In chromosomes, 35.9% of animal sources belonged to CC398 and 34.56% of human sources to CC5, followed by CC1 in animal (24.36%) and CC8 in human (24.49%) sources ( Figure 4A). CC398 was also found in abundance in the categories Environment and Food (21.41% and 17.05%, respectively), while CC8 was prevalent in the Environment category (15.11%), however, these differences were not significant (p > 0.05). The analysis of plasmidic contigs showed a predominance of CC1 plasmidic contigs in isolates from animal sources (73%) and those from CC398 in environmental isolates (74%) ( Figure 4B). Furthermore, plasmidic contigs from human isolates seemed to be more represented in CC8 (21%), CC22 (22%), and CC5 (20%) ( Figure 4B). Additionally, in this case, the statistical analysis showed no significant differences.
Chromosomal mecA and blaZ were widespread throughout the isolation sources, although mecA was significantly (p < 0.05) more abundant in human (84% of human isolates were carrying mecA) than in animal isolates (74%), while blaZ was significantly more abundant in isolates obtained from environmental samples (70%) than from animals (32%) ( Figure 5A). Some other genes showed differences among isolation sources. Thus, aac(6 )-aph(2") and tet(M) were more abundant in the animal source (42% and 45%, respectively) than in all the other sources, although these differences were not significant, apart from aac(6 )-aph(2") in human isolates vs. other isolates (16% and 8%, respectively) ( Figure 5A). Eventually, 13 and 3% of environment and animal isolates, respectively, were carrying aph(3 )-III, and 15 and 3% of environment and animal isolates, respectively, were carrying dfrK (p < 0.05, Figure 5A). The ARGs found in plasmidic contigs from different sources showed a high abundance of tet(K) in the environment (60%) and blaZ throughout the different isolation sources, with no significant differences (p > 0.05, Figure 5B).
Some characteristic patterns of association of CCs with ARGs linked to resistance to particular antimicrobial families were observed (Figure 7). Globally, all the chromosomes analysed were carrying ARGs associated with resistance to Beta-Lactams and Aminoglycosides, with chromosomes belonging to CC8 and CC5 carrying ARGs associated with resistance to a maximum of 13 different antimicrobial families, while CC130 only to 2 (Supplementary Table S8). All the plasmidic contigs analysed carried at least one ARG associated with resistance to the macrolides, aminoglycosides, and tetracyclines classes, with plasmidic contigs belonging to CC8 and CC5 carrying ARGs associated with resistance to a maximum of 11 antimicrobial families, while CC130 and CC121 only to 4 (Supplementary Table S9). In chromosomes, some common patterns of (co)occurrence were observed in particular CCs ( Figure 7A). For example, the occurrence of beta-lactams ARGs was lower in CC121 (11%) than in other CCs, while tetracyclines and trimethoprim ARGs were more abundant in CC398 (85% and 38%, respectively) and CC121 (70% and 51%, respectively) than in other CCs ( Figure 7A). Another cluster, formed by four CCs (CC8, CC1, CC5, and CC59) showed a significantly higher occurrence of aminoglycosides ARGs (82%, 70%, 81%, and 48%, respectively), while CC1, CC8, CC5, and CC398 showed significantly higher abundance of macrolides ARGs (20%, 27%, 65%, and 30%, respectively) compared to CC121 (13%) ( Figure 7A).
In plasmidic contigs, CC398 and CC121 clustered together, with high occurrence of tetracyclines ARGs (45% and 67%, respectively) ( Figure 7B). Despite the many diversities observed in the occurrence of ARG families among the CCs, significant differences were found only for plasmidic contigs from CC1 and CC45, with a higher occurrence of aminoglycosides ARGs than CC93 (5%, 7% and 1%, respectively) ( Figure 7B).

Temporal Changes in ARGs
From the whole dataset, information regarding the isolation dates (from 2001 to 2020) were present only in 10,927 chromosomes (44%) and 2855 plasmidic contigs (14%), which were employed for the following analysis. The analysis of temporal changes in ARGs present on chromosomes showed that over the last 20 years the global occurrence of ARGs has changed, in some cases showing significant differences (p < 0.05, Figure 8, Supplementary Table S10). For example, mecA presence has declined significantly up to 2010, with no significant differences after that, while the reduction in blaZ presence in the past five years is not statistically significant compared to the previous years ( Figure 8). A reduction was also observed in other genes, such as erm(A) and aph(3 )-III, where the reduction from 2015 to 2020 was significant, or spc and erm(C), with a significant reduction between 2006 and 2020 ( Figure 8). The significant reduction in mecA, erm(A), aph(3 )-III, and spc occurrence was observed mainly in isolates from Asia and North/South America, while in Africa also erm(C) showed a reduction in occurrence (Supplementary Figure S1). In Asia, it was observed also a significant reduction in the occurrence of dfrG over the past two decades (Supplementary Figure S1).

Temporal Changes in ARGs
From the whole dataset, information regarding the isolation dates (from 2001 to 2020) were present only in 10,927 chromosomes (44%) and 2855 plasmidic contigs (14%), which were employed for the following analysis. The analysis of temporal changes in ARGs present on chromosomes showed that over the last 20 years the global occurrence of ARGs has changed, in some cases showing significant differences (p < 0.05, Figure 8, Supplementary Table S10). For example, mecA presence has declined significantly up to 2010, with no significant differences after that, while the reduction in blaZ presence in the past five years is not statistically significant compared to the previous years ( Figure 8). A reduction was also observed in other genes, such as erm(A) and aph(3′)-III, where the reduction from 2015 to 2020 was significant, or spc and erm(C), with a significant reduction between 2006 and 2020 ( Figure 8). The significant reduction in mecA, erm(A), aph(3′)-III, and spc occurrence was observed mainly in isolates from Asia and North/South America, while in Africa also erm(C) showed a reduction in occurrence (Supplementary Figure S1). In Asia, it was observed also a significant reduction in the occurrence of dfrG over the past two decades (Supplementary Figure S1). On the other hand, some ARGs increased their occurrence, such as erm(B), fexA, and lnuB, with a significant increase until 2015, and tet(K), tet(L) and tet(M), which increased sig-nificantly until 2010, with no significant differences thereafter (Figure 8 and Supplementary  Table S10). However, in Asia, tet(M) showed a significant reduction over time.
The temporal analysis performed for global ARGs occurrence on plasmidic contigs also showed some significant differences (Figure 9 and Supplementary Table S11). The presence of blaZ increased significantly from 2001-2010 to 2011-2020, although a reduction was observed after 2015, while erm(C) occurrence decreased between 2010 and 2015, with a statistically significant difference between 2010 and 2020 ( Figure 9). A significant decrease was also observed in tet(K) between 2005 and 2015 ( Figure 9). Other genes that significantly decreased their prevalence were erm(A) and spc, between 2005 and 2020 ( Figure 9). A similar reduction in erm(C), tet(K), erm(A), and spc was observed in isolates from Asia (Supplementary Figure S1). An increase up to 2015 was observed for tet(L), with a significant decrease thereafter, while mph(C) and msr(A) had a significant increase up to 2015, and ant(6)-Ia and aph(3')-III also increased significantly (Figure 9 and Supplementary Table S11). On the other hand, some ARGs increased their occurrence, such as erm(B), fexA lnuB, with a significant increase until 2015, and tet(K), tet(L) and tet(M), which incr significantly until 2010, with no significant differences thereafter (Figure 8 and Su mentary Table S10). However, in Asia, tet(M) showed a significant reduction over t The temporal analysis performed for global ARGs occurrence on plasmidic contig showed some significant differences (Figure 9 and Supplementary Table S11). The pr of blaZ increased significantly from 2001-2010 to 2011-2020, although a reduction w served after 2015, while erm(C) occurrence decreased between 2010 and 2015, with a s cally significant difference between 2010 and 2020 ( Figure 9). A significant decrease w observed in tet(K) between 2005 and 2015 ( Figure 9). Other genes that significantly dec their prevalence were erm (A) and spc, between 2005 and 2020 (Figure 9). A similar red in erm(C), tet(K), erm(A), and spc was observed in isolates from Asia (Supplementary S1). An increase up to 2015 was observed for tet(L), with a significant decrease ther while mph(C) and msr(A) had a significant increase up to 2015, and ant(6)-Ia and aph(3' so increased significantly (Figure 9 and Supplementary Table S11).

Discussion
In this study, we identified ARGs in the chromosomes and plasmidic contigs of publicly available genomes from Staphylococcus aureus. The performed analyses returned a global overview of the spread of ARGs and CCs worldwide according to isolation sources, and of temporal changes in ARGs occurrence. To our knowledge, this is the first study that includes S. aureus WGS data from worldwide, as previous studies were focused on specific regions or clonal complexes, for example CC5 in the western hemisphere or the strain USA300 (CC8) in USA [14,15]. Furthermore, the differentiation between plasmidic and chromosomic data provided a more precise perspective of the genomic location of ARGs.
Obviously, this study has some limitations, firstly because of the lack of metadata in many genomes and even associated with biases due to predominance of genomic studies from different sources and/or countries, which could cause an under-or over-estimation in terms of geographical distribution, isolation sources, and temporal analysis. Furthermore, as new genomes are submitted continuously in the databases, these results represent a picture of the moment this study was undertaken, which can easily change considerably with the addition of new genomes to the analysis. Nevertheless, changes in ARG distribution during the last 20 years are consistently demonstrated.
Most of the isolates were MRSA (84% of human isolates were carrying mecA, against 74% of animal isolates), as the main ARG found along S. aureus genomes was the mecA gene, which was found only in chromosomes but not in the plasmidic contigs. It is well known that this gene, which confers resistance to beta-lactam antibiotics, is located on the Staphylococcal Cassette Chromosome mec (SCCmec), which can be exchanged between bacteria thanks to the presence of genes encoding the cassette chromosome recombinases (ccr) within the SCCmec [16]. On the other hand, the main ARG found in plasmidic contigs, the blaZ gene, which confers resistance to penicillin, was also found in some chromosomic contigs. These findings are in line with other research works, where the origin of genes associated with resistance to beta-lactams in staphylococci was attributed to transposon Tn552-like genes, while the blaZ gene was found on various mobile genetic elements (MGEs), such as transposons and plasmids. In our study, the overall presence of beta-lactam resistance genes was higher in the chromosomes than in the plasmidic contigs. Some other ARGs were found almost exclusively on plasmidic contigs, such as ant(6)-Ia, aph(3 )-III, aac(6 )-aph(2"), tet(M), and dfrG. The presence of these genes on MGEs has been documented in the past and their location on plasmids and other MGEs, such as transposons and integrons, has been previously described [17,18].
The geographical distribution analysis showed that some CCs are almost exclusively circumscribed to some continents, such as CC5 in North and South America; CC22 and CC398 in Europe; and CC8 in Africa, Oceania, and Asia. However, CC8 and CC30 were also abundant in Argentina. CC30 is nowadays the most common hospital and community acquired clonal complex in Argentina, which has replaced the main abundance of CC5 on the past decade [19]. The sequence type ST80 is considered the most abundant in Europe [20], but only 30 European isolates belonged to ST80 in our dataset, probably due to the lack of metadata about geographical location in the analysed public repositories. However, our study highlighted the presence of other CCs in European countries, for example, the high prevalence of CC398 in the Netherlands, followed by Germany, Italy, and Denmark, remarks the emerging spread of this clone in these areas. Furthermore, CC1 and CC59 are most prevalent in China, as has been found in previous studies [21][22][23], while CC8, CC45, and CC121 were the most abundant in Singapore, Taiwan, and Thailand, respectively. Other studies also described differences among Asian countries, for example, Turner et al. (2019) [24] identified CC8 as a common CC in Korea, CC30 in Japan, and CC59 in Taiwan, while Song et al. (2011) [25] described CC8 as widespread throughout Asian Countries and CC59 mostly isolated in Taiwan [24,25]. Our findings slightly differ from these examples, probably because of the different number of genomes considered due to a wider temporal interval and, again, a possible lack of metadata associated with the genomes available in public repositories.
The distribution of plasmids among CCs throughout the continents was similar to that of the chromosomes, however, the overall number of CCs with plasmid contigs associated was lower. It is well documented that plasmids are transferred between specific lineages in S. aureus and this could explain the differences in the abundance of plasmidic contigs belonging to different CCs [26,27]. Our results showed that plasmids associated with CC8 are widespread worldwide and this agrees with other studies highlighting the presence of plasmids in this CC [28][29][30]. Additionally, the presence of plasmids in other common CCs, such as CC398, CC22, CC5, and CC1, has also been documented in the past [26,[31][32][33].
Our results showed that MRSA are still the most abundant S. aureus throughout the globe and that the genes conferring resistance to beta-lactams being both in chromosomes and plasmidic contigs. Chromosomes showing ARGs associated with resistance to aminoglycosides and trimethoprim and plasmidic contigs with ARGs for resistance to macrolides were found abundant in Asia. Liang et al. (2019) [34] showed that macrolide and aminoglycoside resistance genes were between the most represented on 103 MRSA clinical isolates from China, while aac(6 )-aph(2"), dfrG, and ermC genes were also present with high occurrence [34]. The high prevalence of tetracycline resistance genes in Europe is in agreement with its association with CC398, which in this study was mainly found in European isolates. Tetracycline resistance is indeed a typical trait in CC398 isolates, as shown by previous studies [35,36].
The dfrG gene, which confers resistance to trimethoprim, was found significantly more abundant in Africa than in other continents, which is consistent with the antibiotic prescription and antimicrobial resistance data described in the literature [37,38]. On the other hand, the significant abundance of aminoglycosides ARGs in South America does not reflect the global use of this class of antibiotics, which seems to be more common in Asian countries, according to the data summarized on ResistanceMap [39]. However, ResistanceMap only includes data up to 2015 and on antibiotics used for human consumption [40], while the information of their use on farm animals is missing.
In our study, CC398 was found mainly in the Animal category, followed by Environment and Food categories. S. aureus CC398 was first isolated in humans, but it acquired the ability to also infect livestock, then humans in contact with livestock, and eventually to spread between humans [22,41,42].
Finally, CC5 and CC8 isolates were mainly associated with the human source, in concordance with outbreak and pandemic reports worldwide [43,44].
Our results showed the global distribution of ARGs in the different CCs analysed. In chromosomes, in almost all CCs, mecA was carried (absent only in CC130), reflecting the high incidence of MRSA. CC398 also harboured erm(B), tet(M), tet(L), dfrG, dfrK, erm(T), fexA, lnuB, and str genes, at higher levels compared to other CCs, similarly to the findings of previous works where isolates belonging to CC398 harboured resistance genes against aminoglycosides, beta-lactams, macrolides, tetracycline, and trimethoprim [36,45]. CC8 genomes showed high levels of mecA, blaZ, tet(K), aph(3 )-III, aac(6 )-aph(2"), ant(6)-Ia, and tet(M). CC8 is known to carry resistance genes for beta-lactams, followed by a possible variety of other ARGs depending on the strain [46]. CC5 had high incidence of mecA, blaZ, erm(A), aadD, and spc genes, which is in concordance with the findings by Challagundla et al. (2018) [14], where a phylogenomic study on CC5-MRSA found that the most common ARGs were those conferring resistance to beta-lactams, fluoroquinolones, macrolides, aminoglycosides, and tetracyclines [14]. Only blaZ was highly abundant in the chromosomes of CC30 isolates in our results, but aadD, spc, and erm(A) also had a high occurrence, especially in the plasmidic contigs, partially confirming the description of Moneke et al., where the most predominant strain, ST36/39-MRSA-II, carried SCCmec II with aadD and erm(A) in the plasmid pUB110 and the transposon Tn554, respectively [46].
In our study, CC130 carried on the chromosome only mecC and str, while blaZ, erm(C), tet(K), tet(L), and str genes were in the plasmidic contigs. In a study by Gómez et al. (2021) [23], only mecC, blaZ, and tet(K) were found as the main ARGs in CC130, showing that this CC carries ARGs mainly on plasmids [23].
The analysis of temporal changes highlighted an overall reduction along time in mecA and erm(A) on chromosomes, among other genes, while blaZ prevalence increased within plasmidic contigs. The same trend was found in a study on clinical isolates from two hospitals in USA during the period 2000 to 2014, where a decline of MRSA and erythromycin resistance was observed, while penicillin-resistant S. aureus increased [47]. Another study, analysing MRSA infections in patients of the Veterans Affairs medical centres in the USA from 2005 to 2017, also showed a decreasing trend of MRSA infections [48]. The increase in blaZ presence in plasmids, while it is decreasing in chromosomic regions, needs further investigations focused on the ecological advantages of keeping this gene in mobile regions, such as in the case of plasmids. Another significant trend was observed in the decrease over time of erm(C) occurrence, which is similar to the decline in the presence of MGEs carrying the erm(C) gene over the period 2001-2010 in a pool of 1193 S. aureus clinical isolates from Ireland [32]. Interestingly, aph(3 )-III had a significant reduction over time in the chromosomes and a significant increase in the plasmidic contigs in the past five years.
Many efforts have been made by the governments in the past to reduce the threat of antimicrobial resistance by monitoring and reducing the use of antibiotics in various sectors, following the One Health strategy [49][50][51][52][53][54][55][56][57]. The data gathered in this work can complement future works focused on understanding if the reductions observed in this study in the ARGs over time reflected the strategies adopted by countries for AMR control.
The present study has some limitations, which can be addressed by future research. First of all, it is important to highlight that low-income countries do not possess the same availability of technologies of high-income countries, which results in less data released from these countries, being the countries where AMR burden is higher.
Another limit is the necessity of constantly updated databases and software for the identification, for example, of all the different alleles of ARGs and different MGEs. This would bring much more detail into the analysis, identifying more ARGs on chromosomes and different MGEs, not only on plasmidic contigs. Furthermore, it would be interesting to expand the temporal analysis also to CCs, isolation sources, and geographical location, however, more effort is needed when submitting metadata related to the isolates to increase the statistical significance of the studies.
In conclusion, the aim of this study was to explore the resistome of S. aureus through analysing publicly available WGS data. The analysis, conducted on 29,679 S. aureus genomes, allowed to identify different ARGs, to locate them on chromosomes or plasmidic contigs and to associate them with CCs, isolation sources, geographical distribution, and temporal changes. Our results confirmed a global temporal reduction in MRSA and macrolide resistance occurrence and remarked on the increase in the occurrence of ARGs associated with aminoglycosides resistance in plasmids in recent years. The workflow used for this analysis is available for future resistome analyses on other species and can be adapted for more targeted outputs.

Download of Genomes and Associated Metadata
Assembled genomes and associated metadata were downloaded on February 2021 from NCBI [58], PATRIC [59], and PubMLST [60] databases, adapting the available Ruby scripts (https://github.com/JoseCoboDiaz/download_genomes, accessed on 12 November 2021). All the metadata from each database were adapted as previously described [61] and gathered in one table. Other categories were added manually to the metadata table, where possible, based on the subcategories present in the databases: the subcategory "collection_date" was grouped into "Date_Range"; a "Continent" was assigned manually where necessary, based on the "country" column; and "Isolation_Source" was grouped into "Category".

Analysis of Genomes
The assembled genomes from each of the three databases were analysed with Staramr [62], which includes the ResFinder database [63] for ARGs identification. The scheme for MLST, from Pubmlst scheme [64], was also included in the Staramr pipeline. To automatise the analysis for the thousands of genomes available, the Ruby script 08.staramr_auto.rb (https://github.com/JoseCoboDiaz/download_genomes, accessed on 12 November 2021) was adapted and used. The clonal complexes (CCs) were assigned automatically, using the list of CCs associated with STs in the S. aureus PubMLST scheme [65].
PlasFlow [70] was used for the identification of plasmidic contigs. A table with all the plasmidic contigs was created from the PlasFlow output files, which was used to split the ResFinder results into two files, one corresponding to chromosomic contigs and the other to plasmidic contigs. Two datasets including both resistome data and associated metadata were generated: one for chromosomic data and one for plasmidic data.

Data Analysis and Statistics
Both chromosomic and plasmidic datasets were explored and analysed in RStudio (Version 1.4.1106). Occurrence and patterns of distribution of AMR genes were analysed by CC, isolation source, temporal and geographical categories. For the temporal analysis, the isolates collected before 2001 were excluded due to the small sample size. Heatmaps were generated using the pheatmap R package. Only those genes present in at least 100 genomes were included for further analysis. Before performing the statistical analysis, the categories and subcategories were trimmed (categories consisting of < 100 genomes were excluded; subcategories of <10 genomes were excluded; all chromosomes or plasmids with missing metadata were also excluded). Normal distribution of occurrence of ARGs in the categories was assessed with the Shapiro-Wilk normality test and, based on the result, Student's t-tests or Wilcoxon tests were performed to assess significant differences (p < 0.05) between each category of data. Boxplots were generated to visualise the results of the statistical analysis using the ggplot2 R package.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/antibiotics11111632/s1, Table S1: Geographical distribution of chromosomes belonging to different source categories; Table S2: Geographical distribution of plasmidic contigs belonging to different source categories; Table S3: Whole chromosomic dataset; Table S4: Whole plasmidic contigs dataset; Table S5: Correspondence between ARGs and antibiotic families; Table S6: Frequencies of ARGs in different CCs found in chromosomes; Table S7: Frequencies of ARGs in different CCs found in plasmidic contigs; Table S8: Number of chromosomes for each CC carrying ARGs conferring resistance to antibiotic families; Table S9: Number of plasmidic contigs for each CC carrying ARGs conferring resistance to antibiotic families; Table S10: Significant differences found in the occurrence of chromosomic ARGs since 2001; Table S11: Significant differences found in the occurrence of plasmidic contigs ARGs since 2001; Figure S1: Temporal changes on the occurrence of ARGs in the chromosomes of isolates from Africa (A), Asia (B), North/South America (C), and in the plasmidic contigs of isolates from Asia (D). Only the ARGs described in the text are shown.