Across the US, foodborne pathogens cause illness in approximately 48 million people each year and impose an economic burden of over $15.5 billion annually [1]. Thirty-one pathogens are known to cause foodborne illness [2]. These 31 pathogens alone account for an estimated 9.4 million illnesses annually, leading to an estimated 55,961 hospitalizations and 1,351 deaths a year in the US (estimates reported with 90% credible intervals) [3]. These pathogens can acquire antimicrobial-resistance genes, which encode proteins with antimicrobial-resistance functions, and thereby become resistant to existing antimicrobials. Specifically, these proteins degrade antimicrobials, pump antimicrobials out of the cells, or change the active binding sites for antimicrobials [4]. Each year in the US, at least 2 million people become infected with antimicrobial-resistant bacteria and at least 23,000 people die as a direct result [5]. Bacteria carry mobile genetic elements, such as plasmids that contain antimicrobial-resistance genes, which can be transferred between bacteria or released into the environment and taken up by another bacterium. This process is known as horizontal gene transfer [6]. Foodborne pathogens generally obtain multiple antimicrobial-resistance genes, which equip the pathogens with multiple resistance functions (e.g., antimicrobial degradation, antimicrobial binding site alteration, and antimicrobial efflux pumps). This enables foodborne pathogens to resist multiple antimicrobials [7]. Antimicrobial-resistance genes are spread by pathogens that are carried in foods (e.g., meats). In particular, farm animals carry bacteria in their intestines and are given antimicrobials frequently. Orally administered antimicrobials destroy or inhibit part of the animals' intestinal bacteria, but the overuse of antimicrobials may give rise to mutations that enable bacteria to survive and multiply. These bacteria, which carry antimicrobial-resistance genes, can then contaminate meats and other animal products during slaughter and further processing. The bacteria may also contaminate animal feed and drinking water through infected bodily fluids. These antimicrobial-resistant bacteria, along with the genes they carry, are then passed to people through industrial animal food production. It is thus important to study the genes that are most closely related to antimicrobial resistance and the pathogens and foods that carry them.
Fortunately, antimicrobial resistance data in the US are actively collected through the National Database of Antibiotic Resistant Organisms (NDARO), the NCBI Pathogen Detection Isolates Browser (NPDIB), and the National Antimicrobial Resistance Monitoring System (NARMS). Among these databases, only the NPDIB shows the antimicrobial-resistance genes sampled from four types of meats (i.e., chicken, beef, pork, and turkey). Although these databases are available, little research has been done to systematically analyze the data, study how antimicrobial-resistance genes are carried by pathogens and meats throughout the US, and identify the set of the most common antimicrobial-resistance genes. The NPDIB lists antimicrobial-resistance genes carried by pathogens that were isolated from patients, food, and environmental samples in state and federal laboratories over time. Developed in collaboration with the Food and Drug Administration (FDA), the Centers for Disease Control and Prevention (CDC), the United States Department of Agriculture (USDA), and other institutions, the NPDIB allows users to search for pathogen isolates and identify pathogens with particular antimicrobial-resistance genes. Because foodborne pathogens are sequenced and submitted to the NPDIB in real time, it allows for quick diagnosis and detection of the pathogens that cause foodborne disease outbreaks.
While the NPDIB database itself contains a significant amount of important information on foodborne pathogens and antimicrobials, few studies have been conducted to extract meaningful information from its gene data. The NPDIB database is typically used to detect pathogens by comparing the genomic sequences in it with those of pathogens isolated from particular foods. In contrast, several papers have analyzed data from NARMS. For example, Sivapalasingam et al. in 2006 [8] utilized data from NARMS to study Shigella isolates in the US from 1999 to 2002. Since 1999, NARMS has tested every tenth Shigella isolate from 16 public health laboratories for susceptibility to 15 antimicrobials. That paper used the NARMS data to determine what percentage of Shigella isolates were resistant and in which geographic regions resistance to these antimicrobials was most prevalent. However, the paper did not address the meat industry. Another paper, by Zhao et al. in 2009 [9], did focus on the meat industry and analyzed data from NARMS. However, that paper focused only on Salmonella and its resistance to antimicrobial agents associated with five beta-lactamase gene families. Although the findings indicated a varied spectrum of resistance in Salmonella strains in the US meat supply chain, the paper did not analyze the geographical distribution of these meats and pathogens through the food industry.
As noted above, few analyses have used these existing databases to extract useful information. In this work, we perform the first multivariate statistical analysis of gene data from the NPDIB database for six states that are either geographically close to one another (i.e., PA, MD, and NY) or far apart (i.e., NM, MN, and CA). The specific antimicrobial resistance found in these six states may guide the choice of antimicrobials used in these geographic areas. We aim to identify the antimicrobials to which pathogens show the most resistance in these states, the genes that are most involved in antimicrobial resistance, and how antimicrobial-resistance genes are carried by the pathogens and meats in these states.
We study the impact of geographic location on the distribution of antimicrobial-resistance genes. Since each of the six states contains hundreds of samples of antimicrobial-resistant pathogens and over 100 antimicrobial-resistance genes, we apply principal component analysis (PCA) [10] to reduce the data dimensions so that we can visualize each dataset in a two-dimensional space. On the basis of the reduced data space characterized by PCA, hierarchical clustering is used to identify the antimicrobials, genes, pathogens, and meats that are most involved in antimicrobial resistance. Hierarchical clustering is one of the most commonly used approaches for separating data points while providing a similarity analysis between them [12].
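As a rough illustration of the workflow described above (PCA to two dimensions, followed by hierarchical clustering on the projected points), the following Python sketch uses a small, invented presence/absence matrix in which rows are isolates and columns are genes. The matrix values, the function names `pca_project` and `average_linkage_clusters`, and the choices of Euclidean distance, average linkage, and two clusters are all assumptions made for illustration; they are not taken from the NPDIB data or from the analysis reported in this work.

```python
import numpy as np

# Hypothetical toy data: rows are isolates, columns are resistance genes
# (1 = gene detected, 0 = not detected). The values are invented.
genes = ["aadA", "blaCMY", "tet(A)", "sul2", "aph(6)-Id"]
X = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)


def pca_project(data, n_components=2):
    """Center the columns and project the rows onto the leading principal components."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained


def average_linkage_clusters(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters
    with the smallest mean pairwise Euclidean distance."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)      # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = np.empty(len(points), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


scores, explained = pca_project(X, n_components=2)
labels = average_linkage_clusters(scores, n_clusters=2)
print("explained variance ratio:", explained)
print("cluster labels:", labels)
```

In this toy matrix the first three isolates share one gene profile and the last three another, so the two clusters recover those groups. The hand-written merge loop is kept only to make the procedure explicit; for real datasets a library routine such as `scipy.cluster.hierarchy.linkage` would usually be preferred.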
The NPDIB database collects antimicrobial-resistance data sampled from foodborne pathogens in animal meats across the US. In this work, we presented the first multivariate statistical analysis to project antimicrobial-resistance gene data sampled from four types of meats (i.e., chicken, turkey, pork, and beef) for six states (i.e., PA, NY, MD, NM, MN, and CA) onto a two-dimensional space, thereby identifying the major antimicrobials, foodborne pathogens, genes, and meats involved in antimicrobial resistance. The results indicate that: 1) aadA, aph(3’’), aph(3’’)-Ib, aph(6)-I, aph(6)-Id, bla, blaCMY, tet, tet(A), and sul2 are the ten genes most frequently found in antimicrobial-resistant foodborne pathogens; 2) these genes were mainly carried by Salmonella species in chicken and turkey; and 3) ampicillin, streptomycin, gentamicin, kanamycin, cefoxitin, sulfisoxazole, tetracycline, and ciprofloxacin are the major antimicrobials to which foodborne pathogens are resistant. While the geographically adjacent states PA, NY, and MD share more similar antimicrobial-resistance genes than the others (i.e., MN, NM, and CA), all six states share common antimicrobial-resistance genes. This is likely explained by the finding that chicken and turkey, the two major meats that carry antimicrobial-resistance genes, are delivered nationwide. Overuse of antimicrobials in chicken and turkey has been reported, which may explain why most antimicrobial-resistance genes were found in these two meats. Antimicrobial resistance is typically caused by the synergistic cooperation of multiple genes. The ten genes identified in this work (i.e., aadA, aph(3’’), aph(3’’)-Ib, aph(6)-I, aph(6)-Id, bla, blaCMY, tet, tet(A), and sul2) provide valuable insight into this issue that may be used for future investigation.
While the findings presented in this work are based mainly on the antimicrobial-resistance data of foodborne pathogens for six states, they will be further validated and updated when the data for other states are more complete in the NPDIB database.