Correlation between Phenotypic and In Silico Detection of Antimicrobial Resistance in Salmonella enterica in Canada Using Staramr

Whole genome sequencing (WGS) of Salmonella supports both molecular typing and detection of antimicrobial resistance (AMR). Here, we evaluated the correlation between phenotypic antimicrobial susceptibility testing (AST) and in silico prediction of AMR from WGS in Salmonella enterica (n = 1321) isolated from human infections in Canada. Phenotypic AMR results from broth microdilution testing were used as the gold standard. To facilitate high-throughput prediction of AMR from genome assemblies, we created a tool called Staramr, which incorporates the ResFinder and PointFinder databases and a custom gene-drug key for antibiogram prediction. Overall, there was 99% concordance between phenotypic and genotypic detection of categorical resistance for 14 antimicrobials in 1321 isolates (18,305 of 18,494 results in agreement). We observed an average sensitivity of 91.2% (range 80.5–100%), a specificity of 99.7% (98.6–100%), a positive predictive value of 95.4% (68.2–100%), and a negative predictive value of 99.1% (95.6–100%). The positive predictive value of gentamicin was 68%, due to seven isolates that carried aac(3)-IVa, which conferred MICs just below the breakpoint of resistance. Genetic mechanisms of resistance in these 1321 isolates included 64 unique acquired alleles and mutations in three chromosomal genes. In general, in silico prediction of AMR in Salmonella was reliable compared to the gold standard of broth microdilution. WGS can provide higher-resolution data on the epidemiology of resistance mechanisms and the emergence of new resistance alleles.


Introduction
Salmonella spp. are a major cause of foodborne illness that can produce symptoms ranging from mild gastrointestinal illness to more severe invasive infections such as bacteremia. Invasive infections with nontyphoidal Salmonella accounted for an estimated 535,000 human cases and 59,100 deaths in 2017 globally [1,2]. While most cases are self-limiting, antimicrobials may be prescribed for invasive infections and for serious gastrointestinal infections in young infants, the elderly, and immunocompromised individuals [3]. Recommended antimicrobial treatments include ceftriaxone, ciprofloxacin, trimethoprim/sulfamethoxazole, or amoxicillin [3].
Antimicrobial resistance (AMR) in Salmonella, including multidrug resistance, is increasing [4][5][6][7][8][9]. Surveillance of AMR can help inform treatment guidelines and policies on antimicrobial stewardship in human and veterinary medicine. Traditional antimicrobial susceptibility testing (AST) is performed with phenotypic methods, including broth microdilution, disc diffusion, and Etest strips. Advances in the speed and cost of whole-genome sequencing (WGS) are transforming microbiology [10]. Globally, countries are transitioning to WGS for epidemiological surveillance, including detection of outbreaks, and detection of AMR in Salmonella and other pathogens [10]. Genetic mechanisms of resistance are relatively well characterized in Salmonella, which facilitates in silico AMR prediction [11].
There are many databases and tools for AMR detection, such as CARD, AMRFinder, ARG-ANNOT, and ARIBA [12][13][14][15]. Each tool has its strengths and limitations, and different countries have adopted different approaches for genotypic AMR detection [10]. The Center for Genomic Epidemiology (CGE) in Denmark has published databases for acquired resistance genes (ResFinder), chromosomal mutations (PointFinder), and plasmids (PlasmidFinder) [16][17][18]. CGE's web-based tools to query these databases can help low- and middle-income countries to overcome challenges in the implementation of in silico AMR detection; however, the offline versions of these tools do not currently process samples in batches [19].
Here, we describe a new tool called Staramr, which was created to query the CGE databases in high throughput. Staramr uses a BLAST-based approach to scan bacterial genome contigs for antimicrobial resistance genes and mutations, and compiles a summary report of genetic mechanisms and a predicted antibiogram based on a gene-drug key developed by the United States Centers for Disease Control and Prevention (US CDC). We used Staramr to evaluate the reliability of in silico AMR detection from WGS for Salmonella enterica isolated from human infections in Canada.

Methods
Bacterial isolates. The Canadian Integrated Program for Antimicrobial Resistance Surveillance (CIPARS) collects human clinical isolates of Salmonella from all ten provincial public health laboratories in Canada. Further details on the methods used by CIPARS are described in the Design and Methods section of the annual report [20]. Our study included all isolates of Salmonella enterica collected from January to June 2017 that were tested by both broth microdilution and WGS (n = 1321).
Whole-genome sequencing and assembly. PulseNet Canada conducts short-read WGS on all Salmonella from human-source infections. DNA extractions were carried out with the Epicentre Complete DNA and RNA Extraction Kit (Illumina Inc, San Diego, CA, USA) or the DNeasy Blood and Tissue Kit (Qiagen, Germantown, MD, USA). Libraries were prepared with the Nextera XT kit and sequencing was carried out on the MiSeq platform with the MiSeq Reagent v3 600 cycle kit (Illumina Inc, San Diego, CA, USA). Isolates with coverage below 40× and an average Q-score ≤ 30 were re-sequenced. Genomes were assembled within BioNumerics v7.6.3 using SPAdes v3.7.1 [23] with a minimum contig length of 1000 bp. The quality of the assemblies was assessed within BioNumerics v7.6.3; isolates with ≥200 contigs or a genome size outside the range of 4.4 to 6.0 Mb were re-sequenced.
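The re-sequencing criteria above can be written as a simple rule check. The following Python sketch is illustrative only: the thresholds come from the text, but the record type, field names, and function name are hypothetical and are not part of BioNumerics or the PulseNet Canada pipeline.

```python
from dataclasses import dataclass

# Thresholds quoted in the Methods text; all names below are hypothetical.
MIN_COVERAGE = 40.0          # fold coverage; runs below this are flagged
LOW_Q_SCORE = 30.0           # an average Q-score at or below this is flagged
MAX_CONTIGS = 200            # assemblies with >= 200 contigs are flagged
GENOME_SIZE_MB = (4.4, 6.0)  # accepted genome size range in Mb


@dataclass
class SequencingRun:
    isolate_id: str
    coverage: float
    mean_q: float
    n_contigs: int
    genome_size_mb: float


def needs_resequencing(run: SequencingRun) -> bool:
    """Return True if the run fails the read-level or assembly-level criteria."""
    # Read-level rule as worded in the text: low coverage AND low average quality.
    poor_reads = run.coverage < MIN_COVERAGE and run.mean_q <= LOW_Q_SCORE
    # Assembly-level rule: too fragmented OR genome size outside the accepted range.
    low, high = GENOME_SIZE_MB
    poor_assembly = (
        run.n_contigs >= MAX_CONTIGS or not (low <= run.genome_size_mb <= high)
    )
    return poor_reads or poor_assembly


if __name__ == "__main__":
    run = SequencingRun("example-1", coverage=62.0, mean_q=34.5,
                        n_contigs=85, genome_size_mb=4.9)
    print(run.isolate_id, "re-sequence:", needs_resequencing(run))
```

The read-level condition follows the wording of the text literally (both low coverage and a low Q-score); a laboratory that treats either condition alone as a failure would replace `and` with `or`.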
Sequence accession. PulseNet Canada deposits sequence reads of Salmonella enterica to the National Center for Biotechnology Information in Bioproject PRJNA543337.

Results
Staramr tool for genotypic AMR prediction. Staramr can be used to (1) detect resistance genes and point mutations in assembled contigs, (2) apply a gene-drug key to produce a predicted antibiogram, and (3) provide assembly statistics, plasmids and the MLST type of the genome. The program utilizes CGE's ResFinder, PointFinder, and PlasmidFinder databases as well as PubMLST databases. Staramr (*amr) was named after the asterisk (star) character, which is often used as a wildcard when searches are performed within software. The gene aac(6')-Iaa was frequently detected but did not confer resistance to any antimicrobial; therefore, the script was revised to reject this gene at the antibiogram prediction step. This gene is reported to be cryptic [25]. For PointFinder, the minimum length was set to 95% to ensure that the correct gene was captured. For ResFinder, a truncated but functional sul1 was sometimes detected in S. Typhimurium; the threshold of minimum length of the BLAST hit was thus lowered to 52% (from the default of 60%) when querying Salmonella, which improved overall sensitivity but did not affect specificity. The nucleotide ID threshold for ResFinder was set to 98% ID, as recommended by Zankari et al. [16].
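The thresholds described above amount to a filter over BLAST hits followed by an exclusion list at the antibiogram step. The sketch below illustrates that logic in Python under stated assumptions: the `BlastHit` type, its fields, and the function names are hypothetical and do not reproduce Staramr's internal code. Because the text gives an identity threshold only for ResFinder, the sketch applies the 98% identity cut-off to ResFinder hits only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class BlastHit:
    gene: str
    database: str        # "resfinder" or "pointfinder" (hypothetical labels)
    pct_identity: float  # nucleotide identity of the hit, in percent
    pct_length: float    # hit length as a percentage of the database sequence


# Values taken from the text.
RESFINDER_MIN_IDENTITY = 98.0
MIN_LENGTH = {"resfinder": 52.0, "pointfinder": 95.0}
# Cryptic gene rejected at the antibiogram prediction step.
EXCLUDED_FROM_ANTIBIOGRAM = {"aac(6')-Iaa"}


def passes_thresholds(hit: BlastHit) -> bool:
    """Apply the per-database length threshold and the ResFinder identity threshold."""
    if hit.pct_length < MIN_LENGTH[hit.database]:
        return False
    if hit.database == "resfinder" and hit.pct_identity < RESFINDER_MIN_IDENTITY:
        return False
    return True


def genes_for_antibiogram(hits: Iterable[BlastHit]) -> List[str]:
    """Return genes that pass the filters and are not on the exclusion list."""
    return [
        hit.gene
        for hit in hits
        if passes_thresholds(hit) and hit.gene not in EXCLUDED_FROM_ANTIBIOGRAM
    ]


if __name__ == "__main__":
    hits = [
        BlastHit("sul1", "resfinder", 99.8, 55.0),         # truncated sul1, kept at 52%
        BlastHit("aac(6')-Iaa", "resfinder", 99.0, 100.0),  # detected but excluded
        BlastHit("gyrA", "pointfinder", 99.5, 90.0),        # too short for PointFinder
    ]
    print(genes_for_antibiogram(hits))
```

With a 60% minimum length, the truncated sul1 hit in this example would be discarded; lowering the threshold to 52% retains it, which is the effect the text describes.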
Comparison of phenotypic and genotypic AMR detection. Genotypic AMR prediction was conducted with Staramr, and MICs were produced by broth microdilution testing. We compared phenotypic and genotypic resistance for 14 antimicrobials belonging to ten classes. Staramr predicts categorical resistance (resistant or non-resistant) for all drugs except for ciprofloxacin, which was categorized as either intermediate/resistant or susceptible. Of 1321 isolates, 529 displayed resistance to one or more antimicrobials for a total of 1572 instances of resistance in the dataset.
The sensitivity of detection, which is the ability of the genotypic test to detect antimicrobial resistance (true positive rate), was >90% for 10 antimicrobials: ampicillin, chloramphenicol, ciprofloxacin, ceftriaxone, sulfisoxazole, gentamicin, meropenem, streptomycin, sulfamethoxazole/trimethoprim, and tetracycline. The antimicrobials with sensitivity <90% were amoxicillin/clavulanic acid, azithromycin, cefoxitin, and nalidixic acid. For amoxicillin/clavulanic acid, the sensitivity was 87.8% due to 5/41 phenotypically resistant isolates being missed by the genotypic method (three of the five missed isolates contained blaCARB-2). In our dataset, 41/49 (84%) of isolates containing blaCARB-2 had an amoxicillin/clavulanic acid MIC that was one two-fold dilution below the resistance breakpoint (intermediate category), three were resistant and five were susceptible. Thus, blaCARB-2 appears to confer reduced susceptibility to amoxicillin/clavulanic acid that is usually just below the resistance breakpoint. Azithromycin detection had a sensitivity of 83.3% due to 2/10 azithromycin-resistant isolates being missed by genotypic detection, while cefoxitin displayed a sensitivity of 80.5% due to 8/41 resistant isolates being missed. These exceptions may be due to porin mutations or plasmid loss. For nalidixic acid, 49/251 nalidixic acid-resistant isolates were missed by genotypic detection, resulting in a sensitivity of 80.5%. Of these 49 false negatives, 32 (65%) contained the quinolone resistance gene qnrB19, which Staramr currently predicts as conferring ciprofloxacin intermediate/resistance but not nalidixic acid resistance. In our dataset, of 35 isolates containing qnrB19, 32 isolates were nalidixic acid-resistant, two had an MIC one two-fold dilution below the resistance breakpoint and one was susceptible. If the interpretation of qnrB19 was changed from only "CIP-I/R" to both "CIP-I/R" and "NAL-R", the sensitivity of detection of nalidixic acid would increase from 80.5% to 93.2%.
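The nalidixic acid figures can be checked with a short calculation. This Python sketch uses only the counts reported above (251 resistant isolates, 49 missed, 32 of those carrying qnrB19); the function and variable names are illustrative.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity (true positive rate) = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Counts reported in the text for nalidixic acid.
resistant_isolates = 251
missed = 49                 # false negatives under the current gene-drug key
missed_with_qnrB19 = 32     # false negatives that carry qnrB19

current = sensitivity(resistant_isolates - missed, missed)
# If qnrB19 were also interpreted as NAL-R, those 32 isolates become true positives.
revised = sensitivity(
    resistant_isolates - missed + missed_with_qnrB19,
    missed - missed_with_qnrB19,
)

print(f"current key: {current:.1%}")   # 80.5%
print(f"revised key: {revised:.1%}")   # 93.2%
```

The calculation assumes that reinterpreting qnrB19 changes only the 32 false negatives; it does not account for the three qnrB19-positive isolates that were not phenotypically resistant, which would become false positives and affect specificity rather than sensitivity.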
The specificity of the genotypic detection of AMR, which is the ability of the genotypic test to detect susceptibility (true negative rate), was high for all drugs, ranging from 98.6% for streptomycin to 100% for azithromycin, meropenem, sulfisoxazole, sulfamethoxazole/trimethoprim, and tetracycline. The PPV is the probability that an isolate that is predicted to be resistant by the genotypic test is actually resistant. This metric differs from sensitivity because it takes into account the prevalence of resistance in a population. The PPVs were >90% for all drugs except gentamicin and cefoxitin. For gentamicin, the PPV of 68.2% was due to 7 genotypic false positive isolates out of 1305 phenotypically susceptible isolates. All seven false positives carried an aac(3)-IVa gene and were predicted to be gentamicin resistant but had MICs of one or two two-fold dilutions below the resistance breakpoint. Only two isolates carrying aac(3)-IVa were phenotypically resistant. Thus, aac(3)-IVa appears to confer MICs close to the CLSI breakpoint for gentamicin. Cefoxitin had a PPV of 89.2% due to 4 genotypic false positives out of 1281 phenotypically susceptible isolates. Three of the four phenotypically susceptible isolates carried blaCMY-2 and had MICs that were one two-fold dilution below the resistance breakpoint, while one isolate carried blaCMY-4. Thus, the lower PPVs for both antimicrobials were mainly due to genes that conferred MICs just below the CLSI resistance breakpoint.
The negative predictive value is the probability that an isolate that is predicted to be susceptible by the genotypic test is actually susceptible. This metric differs from specificity because it takes into account the prevalence of resistance in a population. The negative predictive values for genotypic AMR detection ranged from 95.6% for nalidixic acid to >98% for the other 13 drugs.
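The four metrics defined above all derive from the same two-by-two comparison of genotypic prediction against phenotypic result. The Python sketch below computes them for cefoxitin using the counts as reported in the text (41 resistant isolates with 8 missed, and 4 false positives among 1281 susceptible isolates); the function name is illustrative.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp: phenotypically resistant, predicted resistant
    fn: phenotypically resistant, predicted susceptible
    fp: phenotypically susceptible, predicted resistant
    tn: phenotypically susceptible, predicted susceptible
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Cefoxitin counts as reported in the text.
cefoxitin = diagnostic_metrics(tp=41 - 8, fn=8, fp=4, tn=1281 - 4)

for name, value in cefoxitin.items():
    print(f"{name}: {value:.1%}")
```

With these counts the sensitivity and PPV come out at 80.5% and 89.2%, matching the reported values. Because PPV and NPV have the false positives and false negatives in their denominators alongside the true calls, a handful of false positives has a large effect on PPV when few isolates are truly resistant, as seen for gentamicin.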
Isolates showing discrepancies for ≥2 drugs were retested by broth microdilution, but categorical results did not change significantly (data not shown). Resistance determinants were also detected for antimicrobials that were not routinely tested during the study period, such as fosfomycin (n = 156), kanamycin (n = 140), hygromycin (n = 9), lincomycin (n = 1), rifampin (n = 1) and colistin (n = 1). Colistin has since been added to the broth microdilution antimicrobial panel that is used for routine testing by NARMS and CIPARS due to renewed interest in this drug as a last-resort treatment for highly drug-resistant organisms.

Discussion
We developed and evaluated the use of an AMR prediction tool called Staramr and observed a strong correlation between phenotypic and genotypic detection of AMR in Salmonella. One advantage of Staramr over similar programs is that it evaluates quality metrics of the input data. Staramr, which uses genome assemblies as input, is not computationally intensive, so that it can be run on a local computer instead of a high-performance cluster. We recommend the use of a quick genome assembler such as Shovill/SPAdes (Shovill: https://github.com/tseemann/shovill, date accessed 18 January 2022; SPAdes: https://currentprotocols.onlinelibrary.wiley.com/doi/abs/10.1002/cpbi.102, date accessed 18 January 2022) [23]. Future work will include adding point mutations for additional bacterial species from the PointFinder database. Further, antibiogram predictions are an experimental feature which is continually being improved. The CARD database is not as yet validated in terms of which genes/mutations confer MICs above a clinical resistance breakpoint; however, CARD was useful for further analysis of discrepant results.
Genotypic AMR prediction is currently used for surveillance in some countries, but further validation of its reliability is needed before this test is approved by regulatory agencies for clinical use. Genotypic results, like phenotypic results, would need to be subjected to clinical interpretation of which antimicrobials are appropriate for treatment. The overall concordance between genotypic and phenotypic AMR detection was 99% in Salmonella isolates in Canada. Similar studies in Salmonella from Denmark and the United States have found concordances of 98.8% (slightly lower for quinolone) and 99% (slightly lower for aminoglycosides and beta-lactams), respectively [19,26]. The NCBI AMRFinder tool found a concordance of 98% for Salmonella [13].
The positive and negative predictive values of genotypic detection of AMR were >95% for all antimicrobials except gentamicin and cefoxitin. For these two drugs, a few false positive results were obtained from isolates with known resistance genes that conferred MICs just below the current CLSI breakpoints for resistance. A study by McDermott et al. found that the greatest number of discrepancies occurred for aminoglycosides and beta-lactams, notably streptomycin and cefoxitin [26]. Similar to our findings, cefoxitin false positives in their study carried resistance genes that conferred MICs just below the resistance breakpoint. If an infection was caused by an isolate known to carry a resistance gene, that drug may not be the first choice for treatment despite the MIC being slightly below the breakpoint in in vitro testing. The European Committee on Antimicrobial Susceptibility Testing (EUCAST) also provides clinical breakpoints for Salmonella, and, in some cases, the breakpoints are different from CLSI [27]. While phenotypic testing is considered to be the gold standard, there are several caveats for this method. Variability in phenotypic results occurs due to multiple factors such as the amount of inoculum used and bacterial growth conditions. Further, the conditions used in vitro do not perfectly mimic the site of infection.
The gene-drug key is provided with support from the US CDC and is continually being improved. There were 141 false negatives, in which isolates that were phenotypically resistant were not predicted to be resistant in the genotypic test. One fifth of these discrepancies might be eliminated by modifying the gene-drug key to interpret qnrB19 as both CIP-I/R and NAL-R instead of CIP-I/R only. Optimization of the key may be influenced by strain epidemiology in different geographic locations. In the United States, qnrB19 does not reliably confer nalidixic acid resistance, perhaps due to different serotype or strain epidemiology in Canada and the United States. In some cases, such as blaCARB-2, which usually conferred an MIC just below the resistance breakpoint, the current interpretation is appropriate for the CLSI breakpoint, but variation in phenotypic testing or strain dependence can cause occasional discrepancies. False negatives may also be caused when strains carry determinants of resistance that are currently undiscovered or fragmented in the assembly; however, large-scale WGS along with continued phenotypic testing of a subset of isolates offers opportunities to discover new mechanisms of resistance to strengthen the databases.
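A gene-drug key can be thought of as a lookup table from resistance determinants to predicted phenotype calls. The Python sketch below is a simplified illustration of how the qnrB19 change discussed above would alter a predicted antibiogram. It is not the actual US CDC key or Staramr's data format: the dictionary layout and function name are hypothetical, and the "GEN-R" label for aac(3)-IVa is an assumed label modelled on the "CIP-I/R" and "NAL-R" labels used in the text.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical two-entry key; the real key covers many more determinants.
CURRENT_KEY: Dict[str, Set[str]] = {
    "qnrB19": {"CIP-I/R"},      # interpretation described in the text
    "aac(3)-IVa": {"GEN-R"},    # assumed label for predicted gentamicin resistance
}

# Proposed revision: qnrB19 interpreted as both CIP-I/R and NAL-R.
REVISED_KEY: Dict[str, Set[str]] = {
    **CURRENT_KEY,
    "qnrB19": {"CIP-I/R", "NAL-R"},
}


def predict_antibiogram(genes: Iterable[str], key: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted union of phenotype calls for all detected determinants."""
    calls: Set[str] = set()
    for gene in genes:
        # Determinants absent from the key contribute no predicted resistance.
        calls |= key.get(gene, set())
    return sorted(calls)


if __name__ == "__main__":
    detected = ["qnrB19"]
    print("current:", predict_antibiogram(detected, CURRENT_KEY))
    print("revised:", predict_antibiogram(detected, REVISED_KEY))
```

Keeping the key as data rather than code makes it straightforward to maintain region-specific variants, which is relevant given that the text notes qnrB19 behaves differently in isolates from the United States.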
Mechanisms of resistance detected in our dataset included 64 unique acquired alleles and mutations in three genes (gyrA, gyrB, and parC) conferring resistance to all antimicrobials that were phenotypically tested except for meropenem. These resistance mechanisms were mostly similar to those reported in a similar study from the United States, which detected 65 unique mechanisms [26]. Some of the differences between the two studies may be due to the fact that Canadian isolates were all from human sources, whereas the majority of isolates in the US study were from food/animals. For ciprofloxacin, the gene-drug key interprets the presence of any determinant as CIP-I/R. In general, the presence of a single fluoroquinolone resistance determinant such as a gyrA mutation or a qnr allele conferred an MIC in the intermediate range, while the presence of multiple determinants conferred outright resistance. This pattern was highly variable, however; the intermediate and resistant categories were therefore combined. Studies using machine learning or deep learning approaches with large datasets are being conducted to produce algorithms for predicting antimicrobial MICs for Salmonella enterica, Neisseria gonorrhoeae, Klebsiella pneumoniae, and other pathogens [28][29][30].
One limitation of genotypic-based detection of AMR is the possibility of missing new resistance genes; thus, CIPARS continues to test 10% of human-source Salmonella using broth microdilution for continual validation and detection of emerging resistance. On the other hand, WGS will facilitate the detection of new variants of known resistance genes.
WGS also allows monitoring of the molecular epidemiology of resistance mechanisms in different reservoirs to enhance One Health studies of AMR. Genomic databases can be scanned retroactively when new mechanisms of resistance are discovered; for example, many institutions conducted retrospective analyses after the discovery of the first mobile colistin resistance (mcr-1) gene. Genotypic AMR detection also allows monitoring of trends in specific combinations of genes that produce different multidrug resistance patterns, as well as monitoring the genetic context of resistance genes, and potential for horizontal gene transfer.
In summary, in silico prediction of AMR from WGS for Salmonella from humans in Canada was reliable. This method can yield a wealth of additional data that is not routinely generated from phenotypic testing, such as the genetic context of resistance and surveillance of the molecular epidemiology of resistance determinants.