Article

Machine Learning Prediction of Multidrug Resistance in Swine-Derived Campylobacter spp. Using United States Antimicrobial Resistance Surveillance Data (2013–2023)

1 Independent Researcher, Perth 6107, Australia
2 Department of Quality Assurance and Data Analytics, Jouya Behnood Company, Tehran, Iran
3 Department of Pathobiology, College of Veterinary Medicine, University of Illinois Urbana-Champaign, Urbana, IL 61802, USA
4 Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61802, USA
5 Veterinary Public Health Research Laboratory, Department of Veterinary Medicine, College of Agriculture and Veterinary Medicine, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
* Authors to whom correspondence should be addressed.
Vet. Sci. 2025, 12(10), 937; https://doi.org/10.3390/vetsci12100937
Submission received: 31 August 2025 / Revised: 22 September 2025 / Accepted: 24 September 2025 / Published: 26 September 2025

Simple Summary

Campylobacter spp. are among the most important causes of bacterial gastroenteritis around the world. In addition to poultry, pigs are also considered an important source of this pathogen. Antimicrobial resistance (AMR) in Campylobacter is a serious public health concern. A supervised machine learning model was developed and validated in this study to predict MDR status in Campylobacter isolates from swine, using publicly available phenotypic AMR data collected by NARMS from 2013 to 2023. Among five evaluated machine learning algorithms, Random Forest showed the highest performance (accuracy = 99.87%, Kappa = 0.9962), achieving high balanced accuracy, sensitivity, and specificity in both training and external validation. The feature importance analysis found that erythromycin, azithromycin, and clindamycin were the most influential predictors of MDR among Campylobacter isolates from swine. Our temporally validated, interpretable model offers a robust and cost-effective approach for predicting MDR in Campylobacter spp., facilitating surveillance and early detection in food animal production systems.

Abstract

Campylobacter spp. are leading causes of bacterial gastroenteritis globally. Swine are recognized as an important reservoir for this pathogen. The emergence of antimicrobial resistance (AMR) and multidrug resistance (MDR) in Campylobacter is a global health concern. Traditional methods for detecting AMR and MDR, such as phenotypic testing or whole-genome sequencing, are resource-intensive and time-consuming. In the present study, we developed and validated a supervised machine learning model to predict MDR status in Campylobacter isolates from swine, using publicly available phenotypic AMR data collected by NARMS from 2013 to 2023. Resistance profiles for seven antimicrobials were used as predictors, and MDR was defined as resistance to at least one agent in three or more antimicrobial classes. The model was trained on 2013–2019 isolates and externally validated using isolates from 2020, 2021, and 2023. Among five evaluated algorithms, Random Forest showed the highest performance (accuracy = 99.87%, Kappa = 0.9962) and achieved high balanced accuracy, sensitivity, and specificity in both training and external validation. Our feature importance analysis identified erythromycin, azithromycin, and clindamycin as the most influential predictors of MDR among Campylobacter isolates from swine. Our temporally validated, interpretable model provides a robust, cost-effective tool for predicting MDR in Campylobacter spp. and supports surveillance and early detection in food animal production systems.

1. Introduction

Campylobacter spp. have been recognized as one of the leading causes of bacterial gastroenteritis globally [1]. Campylobacteriosis in humans is mostly caused by Campylobacter jejuni and Campylobacter coli, which are commonly found in the gastrointestinal tract of poultry, livestock, and other animals [2]. Infections with these pathogens are often associated with the consumption of contaminated food products, especially undercooked or raw poultry, unpasteurized milk, and untreated water [3,4,5]. In addition to poultry, swine have also been recognized as a reservoir of Campylobacter spp. [6]. While most Campylobacter spp. infections are self-limiting, in rare cases they can cause reactive arthritis, Guillain–Barré syndrome, or post-infectious irritable bowel syndrome [7,8]. In severe Campylobacter infections, or in patients at increased risk of developing complications, macrolides and fluoroquinolones may be prescribed as the drugs of choice for treatment [9].
The emergence of antimicrobial resistance (AMR) and multidrug resistance (MDR) is a global health concern, influencing the effectiveness of treatment of infections in both human and veterinary medicine [10]. Several previous studies have consistently documented rising MDR trends in animal-derived Campylobacter spp. isolates, including poultry [11,12] and swine [13]. The emergence of resistance to macrolides (azithromycin, erythromycin) and fluoroquinolones (ciprofloxacin) is concerning, as they are commonly prescribed to treat severe Campylobacter spp. infections [14].
Detection of AMR and MDR relies on isolating bacterial pathogens from cases and then either testing the susceptibility of isolates to a panel of antimicrobials or conducting whole-genome sequencing (WGS), both of which are resource-intensive and time-consuming [15]. Consequently, there is growing interest in computational approaches that enhance the predictive capabilities of existing surveillance data [16]. Machine learning has emerged as a powerful approach for pattern recognition and classification in complex biological datasets [17]. Several recent investigations have demonstrated the application of machine learning models in predicting AMR in human pathogens using phenotypic, genomic, or clinical metadata [18,19,20,21]. Various supervised and unsupervised machine learning models have demonstrated AMR-predictive performance across different pathogens. For instance, machine learning models have predicted carbapenem resistance in Gram-negative bacteria among ICU patients with accuracies of 72–84% [22]. In another study, an open-source machine learning algorithm (XGBoost, Version 3.0.5) was applied to predict antibiotic resistance in three Gram-negative bacterial species (E. coli, Klebsiella pneumoniae, and Pseudomonas aeruginosa) isolated from patients’ blood and urine [23]. In a separate study, a Random Forest model was also used to accurately predict AMR in Pseudomonas aeruginosa [24]. Focusing on Campylobacter spp., a machine learning model (Support Vector Machine) trained on protein sequences successfully predicted resistance genes in Gram-negative bacteria, including Campylobacter spp., with 90% accuracy [25]. In another investigation, a combination of Matrix-Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) mass spectrometry and machine learning showed high performance in detecting susceptible, as well as ciprofloxacin- and tetracycline-resistant, Campylobacter spp. isolates from clinical and environmental sources [26].
Although machine learning-based AMR research is rapidly growing, few studies have investigated machine learning-based MDR prediction in pathogens derived from livestock using phenotypic data. Therefore, this study utilizes publicly available data on AMR in Campylobacter spp. isolated from cecal samples of swine at slaughter collected by the National Antimicrobial Resistance Monitoring System of Enteric Bacteria (NARMS) between 2013 and 2023 to: (i) identify the machine learning algorithm with the highest MDR predictive accuracy, (ii) quantify the contribution of individual AMR traits to MDR classification, and (iii) evaluate temporal robustness of the predictions by testing model performance using isolates not included in the model training, which were collected during the final three years of the study period. The value of this work could be extended to additional livestock sectors and might have global application.

2. Materials and Methods

2.1. Study Design

In this study, we developed and validated a supervised machine learning model to predict MDR status in US swine-derived Campylobacter spp. isolates. The investigation used publicly available AMR monitoring data collected by NARMS [27]. This dataset includes AMR data for Campylobacter spp. isolates from the cecal content of swine at slaughter establishments across the US regulated by the U.S. Department of Agriculture (USDA) Food Safety and Inspection Service, from 2013 to 2023 [27]. The antimicrobial susceptibility result of each isolate was recorded for the seven antimicrobials included in this study: azithromycin (AZI), clindamycin (CLI), erythromycin (ERY), ciprofloxacin (CIP), nalidixic acid (NAL), tetracycline (TET), and gentamicin (GEN). Resistance to each antimicrobial was coded as a binary variable (1 = resistant, 0 = susceptible). The outcome variable was MDR, defined as resistance to at least one agent in three or more antimicrobial classes [28], and was encoded as a binary class (YES, NO). Figure 1 shows the workflow of our machine learning pipeline, including data preprocessing, algorithm selection, model training, and validation steps.
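To make the MDR definition concrete, the following minimal Python sketch labels one isolate from its seven binary resistance results. The grouping of the seven agents into five classes (macrolides, lincosamides, quinolones, tetracyclines, aminoglycosides) is a standard pharmacological grouping that is assumed here for illustration, and the names `ANTIMICROBIAL_CLASS`, `resistant_classes`, and `is_mdr` are hypothetical. The study itself was carried out in R, so this is not the authors' code.

```python
# Assumed mapping of the seven agents to antimicrobial classes (illustrative).
ANTIMICROBIAL_CLASS = {
    "AZI": "macrolide",
    "ERY": "macrolide",
    "CLI": "lincosamide",
    "CIP": "quinolone",
    "NAL": "quinolone",
    "TET": "tetracycline",
    "GEN": "aminoglycoside",
}


def resistant_classes(profile):
    """Return the classes with at least one resistant agent (1 = resistant)."""
    return {ANTIMICROBIAL_CLASS[drug] for drug, result in profile.items() if result == 1}


def is_mdr(profile, min_classes=3):
    """MDR = resistance to at least one agent in three or more classes."""
    return len(resistant_classes(profile)) >= min_classes


# Hypothetical isolates, not taken from the NARMS data.
isolate_a = {"AZI": 1, "ERY": 1, "CLI": 1, "CIP": 0, "NAL": 0, "TET": 1, "GEN": 0}
isolate_b = {"AZI": 1, "ERY": 1, "CLI": 0, "CIP": 0, "NAL": 0, "TET": 1, "GEN": 0}

print(is_mdr(isolate_a))  # three classes: macrolide, lincosamide, tetracycline
print(is_mdr(isolate_b))  # two classes only: macrolide, tetracycline
```

Under this grouping, resistance to both macrolides counts as a single class, so an isolate needs resistance in at least two further classes to be labelled MDR.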
Our dataset was temporally partitioned into two phases (Phase 1 and Phase 2) to simulate a real-world application and evaluate the model’s generalizability over time.
Phase 1 included isolates collected from 2013 to 2019, which were used for model development and internal validation. Phase 2 consisted of isolates from the last three years of the study period (2020, 2021, 2023) and served as an external test dataset for independent validation (data for the year 2022 was not available). The “Year” variable was only used for temporal stratification and excluded from model training.
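The temporal partition described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's R code: the record layout (one dictionary per isolate with a `Year` key) and the function name `temporal_split` are hypothetical.

```python
TRAIN_YEARS = set(range(2013, 2020))   # Phase 1: 2013-2019
EXTERNAL_YEARS = {2020, 2021, 2023}    # Phase 2: 2022 was not available


def temporal_split(records):
    """Split isolates by year and drop 'Year' so it cannot act as a predictor."""
    phase1, phase2 = [], []
    for record in records:
        features = {key: value for key, value in record.items() if key != "Year"}
        if record["Year"] in TRAIN_YEARS:
            phase1.append(features)
        elif record["Year"] in EXTERNAL_YEARS:
            phase2.append(features)
    return phase1, phase2


# Hypothetical records for illustration only.
records = [
    {"Year": 2014, "ERY": 1, "TET": 1, "MDR": "NO"},
    {"Year": 2019, "ERY": 0, "TET": 1, "MDR": "NO"},
    {"Year": 2021, "ERY": 1, "TET": 1, "MDR": "NO"},
]
phase1, phase2 = temporal_split(records)
print(len(phase1), len(phase2))
```

Splitting by calendar year rather than at random means the external set contains only isolates collected after every training isolate, which is what allows Phase 2 to test stability over time.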

2.2. Phase 1

2.2.1. Data Preprocessing and Temporal Partitioning

Before model training, the dataset was cleaned by removing non-informative identifiers. The MDR labels were converted to a factor variable. While scaling and centering are typically required for distance-based algorithms such as Support Vector Machines (SVM) and k-Nearest Neighbors (KNN), these steps were not applied, as all predictors in our dataset were binary [29]. All analyses were performed using R software (Version 4.5.0, 11-04-2025) [30], within the RStudio platform (2024.09.0 Build 375 © 2009–2024 Posit Software, PBC, Boston, MA, USA).

2.2.2. Algorithm Selection and Internal Validation

Five supervised classification machine learning algorithms that have been extensively used in previous studies [31,32,33,34], namely Support Vector Machine (SVM), Random Forest, Decision Tree, Naive Bayes, and k-Nearest Neighbors (KNN), were evaluated for their ability to predict MDR status in Campylobacter spp. isolates from swine based on the resistance patterns of the predictor variables (the seven antimicrobials listed above). All models were trained using 5-fold cross-validation to ensure robust performance estimates. Cross-validation was performed using the caret R package, and model accuracy and Cohen’s Kappa were used for comparative evaluation. Resampling summaries were generated to evaluate the distribution of performance metrics across folds. Hyperparameter tuning for each algorithm was performed automatically using the default grid search provided by the caret package during 5-fold cross-validation.
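As a rough illustration of how 5-fold cross-validation partitions the data while keeping the MDR class proportions similar in every fold, the Python sketch below assigns each isolate to a fold. It does not reproduce the internals of the caret package; the function name `stratified_folds`, the seed, and the label counts are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=42):
    """Assign each index to one of k folds, balancing class counts across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            fold_of[index] = position % k
    return fold_of


# Hypothetical labels: 20 MDR and 80 non-MDR isolates.
labels = ["YES"] * 20 + ["NO"] * 80
folds = stratified_folds(labels)
for fold in range(5):
    members = [labels[i] for i, f in enumerate(folds) if f == fold]
    print(fold, members.count("YES"), members.count("NO"))
```

Each fold then serves once as the validation subset while the remaining four folds are used for training, so every isolate receives exactly one out-of-fold prediction.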

2.2.3. Model Development and Validation Using the Selected Machine Learning Algorithm

Among the evaluated supervised machine learning algorithms, the one with the highest cross-validated accuracy and Kappa scores was selected to develop the final predictive model. The dataset (2013–2019) was initially partitioned into training (80%) and testing (20%) subsets using stratified sampling to preserve the distribution of MDR classes. The selected algorithm was then trained and internally tested on the training subset (2013–2019), and the final trained model was saved for external validation.
A five-fold cross-validation was conducted on the entire dataset to assess model robustness and performance. In each fold, the data were split into training (80%) and validation (20%) subsets. The model was trained on the training portion and evaluated on the validation subset. During training, feature standardization techniques (centering and scaling) were applied to improve model performance.
Predictions from all folds were aggregated to produce out-of-fold predictions across the entire dataset. These predictions were compared to the true MDR labels to generate a single overall confusion matrix. Performance metrics, including accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen’s kappa, were computed to assess the developed model’s performance. The confusion matrix was then visualized using the ggplot2 package, allowing assessment of misclassification and class imbalance.
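For reference, the metrics named above have the following standard definitions in terms of the confusion matrix counts. The notation is introduced here for illustration: MDR is treated as the positive class (an assumption consistent with how misclassifications are reported in the Results), with TP, FN, FP, and TN denoting true positives, false negatives, false positives, and true negatives, and N = TP + FN + FP + TN.

```latex
\begin{align}
\text{Accuracy} &= \frac{TP + TN}{N}, &
\text{Balanced accuracy} &= \frac{\text{Sensitivity} + \text{Specificity}}{2}, \\
\text{Sensitivity} &= \frac{TP}{TP + FN}, &
\text{Specificity} &= \frac{TN}{TN + FP}, \\
\text{PPV} &= \frac{TP}{TP + FP}, &
\text{NPV} &= \frac{TN}{TN + FN}, \\
\kappa &= \frac{p_o - p_e}{1 - p_e}, &
p_o &= \frac{TP + TN}{N}, \\
p_e &= \frac{(TP + FN)(TP + FP) + (TN + FP)(TN + FN)}{N^{2}}. &&
\end{align}
```

Here p_o is the observed agreement between predicted and true labels and p_e is the agreement expected by chance from the row and column totals, so kappa discounts the accuracy that a classifier could reach on an imbalanced dataset simply by favouring the majority class.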

2.2.4. Feature Importance Analysis

To identify the most influential predictors of MDR, feature importance scores were extracted from the trained Random Forest model using the randomForest R package. These scores represent the relative contribution of each antimicrobial resistance feature (the seven antimicrobials included in this study) to the prediction of MDR status. Features were ranked by their mean decrease in Gini impurity and then visualized using the ggplot2 package.
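The idea behind a Gini-based importance score can be illustrated with a single split. The Python sketch below computes the size-weighted drop in Gini impurity when one tree node is split on a binary resistance feature; a forest-level score is commonly obtained by summing such drops over all splits on a feature and averaging over trees. The node counts are hypothetical, the function names are illustrative, and the exact weighting used inside the randomForest package is not reproduced here.

```python
def gini(counts):
    """Gini impurity of a node given its class counts, e.g. [n_mdr, n_non_mdr]."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in counts)


def split_decrease(parent, left, right):
    """Size-weighted drop in Gini impurity when a parent node is split in two."""
    return (
        sum(parent) * gini(parent)
        - sum(left) * gini(left)
        - sum(right) * gini(right)
    )


# Hypothetical node: 30 MDR and 70 non-MDR isolates, split on one feature.
parent = [30, 70]
resistant_child = [28, 4]     # isolates resistant to the feature's antimicrobial
susceptible_child = [2, 66]   # isolates susceptible to it

print(round(gini(parent), 4))
print(round(split_decrease(parent, resistant_child, susceptible_child), 4))
```

A feature that separates MDR from non-MDR isolates cleanly produces nearly pure child nodes and therefore a large decrease, whereas a feature whose resistant and susceptible groups have similar MDR proportions produces a decrease close to zero.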

2.3. Phase 2

External Validation of the Trained Model

To evaluate potential overfitting and assess temporal generalizability, an external validation was performed by splitting the dataset chronologically. Isolates from 2013 to 2019 had already been used to train the model (Phase 1), while isolates from the last three years of the study period (2020, 2021, and 2023), which had not been used to train the model, were reserved for external validation (Phase 2). The ‘Year’ column was excluded from the predictor variables, which comprised only the AMR status of the seven antimicrobials, to prevent data leakage. The trained Random Forest model was applied to the test set, and predictive performance metrics and a confusion matrix were generated as described in Phase 1.
Feature importance analysis was also performed using the method described in Phase 1 to identify the predictors contributing most to MDR classification during external validation. The results were compared with those from Phase 1 (model training step) to identify any shift in predictive patterns.

3. Results

3.1. Performance Evaluation of Classification Machine Learning Algorithms for MDR Prediction in Swine-Derived Campylobacter

We initially evaluated five classification machine learning algorithms (SVM, Decision Tree, Random Forest, Naive Bayes, and KNN) using 5-fold cross-validation. Among the evaluated algorithms, Random Forest exhibited the highest performance on our dataset (Accuracy = 99.87%, Kappa = 0.9962), followed by SVM (Accuracy = 99.82%, Kappa = 0.9950) and KNN (Accuracy = 99.55%, Kappa = 0.9876) (Figure 2, Supplementary Table S1). The lowest-performing algorithms were the Decision Tree (Accuracy = 98.31%, Kappa = 0.9520) and Naive Bayes (Accuracy = 97.96%, Kappa = 0.9423) (Figure 2, Supplementary Table S1). Based on this evaluation, the Random Forest algorithm was selected for developing the model (using the randomForest R package) and for further analysis due to its high accuracy and robustness across folds.

3.2. Development and Evaluation of a Random Forest Model to Predict Multidrug Resistance in Campylobacter from Swine

The developed Random Forest model was then evaluated using a 5-fold cross-validation strategy. The model showed excellent classification performance, with a balanced accuracy of 99.43% and a Kappa score of 0.9925 (Table 1). The model also achieved high sensitivity (98.86%), specificity (100%), PPV (100%), and NPV (99.65%) (Table 1). The confusion matrix showed that our trained model had high precision, with only 6 misclassified isolates out of 2551, all of which were MDR isolates incorrectly predicted as non-MDR (Figure 3a, Supplementary Table S2).

3.3. Important Features Predicting MDR in the Trained Random Forest Model

Feature importance scores for the predictors (resistance to a specific antimicrobial) obtained from the trained Random Forest model showed the relative contribution of each AMR feature to predicting MDR status (Figure 3b). Among the evaluated predictors, resistance to the macrolides erythromycin (importance score = 226.40) and azithromycin (importance score = 161.10) and to the lincosamide clindamycin (importance score = 115.14) were the most influential predictors of MDR in Campylobacter spp. isolates (Figure 3b). The other predictors, tetracycline, nalidixic acid, ciprofloxacin, and gentamicin, had lower importance scores of 30.13, 14.78, 12.70, and 0.45, respectively (Figure 3b).

3.4. External Validation of the Trained Random Forest Model (Phase 2)

To evaluate model generalizability and identify potential overfitting, the trained Random Forest model from Phase 1 was applied to predict MDR outcomes on a new test dataset (n = 603 isolates) from the last three years of the study period (2020, 2021, 2023), which was not used during model training. Our external validation showed that the model achieved a high balanced accuracy of 96.72% and a Kappa score of 0.9565 (Table 2). The model also showed high sensitivity (93.43%) and maximum specificity (100%) (Table 2), further supporting its validity and predictive power on temporally independent data. Despite a slight performance decline compared to Phase 1, the model misclassified only 9 of 603 isolates, all of which were MDR isolates incorrectly predicted as non-MDR (Figure 4a, Supplementary Table S3). Table 2 and Figure 4a show the external validation results, including performance statistics and a confusion matrix of our Random Forest model.
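The reported external validation figures can be checked for internal consistency with a short calculation. In the Python sketch below, the confusion matrix counts are not copied from Table 2 or Supplementary Table S3; they are back-calculated from the values stated in the text (603 isolates, 9 false negatives, 100% specificity, 93.43% sensitivity), which imply 137 MDR isolates, of which 128 were correctly predicted, and 466 non-MDR isolates. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary metrics, with MDR treated as the positive class."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts inferred from the reported values (an assumption, see lead-in).
metrics = classification_metrics(tp=128, fn=9, fp=0, tn=466)

print(f"sensitivity       = {metrics['sensitivity']:.2%}")        # 93.43%
print(f"balanced accuracy = {metrics['balanced_accuracy']:.2%}")  # 96.72%
print(f"kappa             = {metrics['kappa']:.4f}")              # 0.9565
```

With these inferred counts, the sensitivity, balanced accuracy, and Kappa score match the values reported for Phase 2, which suggests that the reported statistics are mutually consistent.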

3.5. Important Features Predicting MDR in the External Validation Phase of the Trained Random Forest Model

Feature importance scores of the predictors for the external validation dataset were consistent with Phase 1. Erythromycin (importance score = 226.40), azithromycin (importance score = 161.10), and clindamycin (importance score = 115.14) were the most influential predictors of MDR in Campylobacter spp. isolates (Figure 4b). Tetracycline, nalidixic acid, ciprofloxacin, and gentamicin had lower importance scores of 30.13, 14.78, 12.70, and 0.45, respectively (Figure 4b).

4. Discussion

In this study, using phenotypic AMR data collected by NARMS, we developed and validated a supervised machine learning model to predict MDR status in swine-derived Campylobacter spp. isolates, using as predictors the isolates’ resistance status to seven antimicrobials (erythromycin, azithromycin, clindamycin, tetracycline, nalidixic acid, ciprofloxacin, and gentamicin). Resistance to erythromycin, azithromycin, and clindamycin were the most influential predictors of MDR in Campylobacter spp. isolates, as these features had the highest importance scores among all predictors.
Among the five classification machine learning algorithms evaluated in the present study, the Random Forest algorithm showed the highest performance (99.87% accuracy) and was selected for model development. Our trained Random Forest model demonstrated strong performance, achieving a balanced accuracy of 99.43% on the training dataset from 2013 to 2019 (Phase 1) and maintaining a high balanced accuracy of 96.72% when tested on the external dataset from 2020, 2021, and 2023 (Phase 2). These findings suggest that supervised classification algorithms may provide a robust, data-driven approach to predict MDR status in Campylobacter spp. using phenotypic AMR surveillance data. Similar trends have been observed in previous studies employing machine learning for AMR prediction in Campylobacter. Chowdhury et al. (2019) applied an SVM model to predict antimicrobial resistance genes in Gram-negative bacteria, including Campylobacter, and reported an accuracy above 90% [25]. In another investigation, gradient boosting (XGBoost) regression was used to predict AMR in C. jejuni and other pathogens, achieving performance metrics comparable to our findings [35]. Several previous machine learning investigations focused primarily on other pathogens, such as M. tuberculosis, E. coli, S. aureus, and Klebsiella, and examined resistance to a single class of antimicrobial rather than predicting MDR [36,37,38,39].
The model in the present study achieved high sensitivity (98.86%) and maximum specificity (100%) in predicting MDR status among swine-derived Campylobacter. Misclassifying a resistant isolate as susceptible can result in ineffective treatment, potentially influencing infection management in patients [26]. Therefore, classifiers that inform antimicrobial therapy decisions must prioritize high sensitivity [40]. The high sensitivity achieved in both model training and external validation in this study highlights the robustness of the MDR predictive approach.
Many existing AMR and MDR prediction tools rely on genomic data. Although these approaches are powerful, they often require specialized infrastructure, bioinformatic expertise, and significant cost [15,41,42]. Our study highlights that phenotypic profiles, particularly resistance to macrolides like erythromycin and azithromycin, can predict MDR status in Campylobacter isolates. Since macrolides are considered the first-line antimicrobial class for the treatment of human Campylobacteriosis [43], early screening for macrolide resistance not only plays an essential role in guiding antimicrobial therapy but can also serve as an important indicator for MDR prediction in Campylobacter spp. isolates, at least those of swine origin.
Resistance to tetracyclines and fluoroquinolones is commonly observed in Campylobacter isolates [12]; however, based on our findings, these resistances do not always appear to be associated with the MDR phenotype, particularly when compared to macrolide resistance. Our results showed that, at least in US swine Campylobacter isolates, macrolide resistance may be more consistently associated with broader resistance patterns and may play a more significant role in the MDR phenotype. It is possible that some Campylobacter isolates are resistant only to tetracyclines or fluoroquinolones, without concurrent resistance to other antimicrobial classes, and therefore do not meet the criteria for MDR. Further studies are required to elucidate the underlying reasons for this phenomenon.
In contrast to our findings, a recent machine learning study on Campylobacter spp. reported that ciprofloxacin and tetracycline were among the highest performing classifiers in both C. coli and C. jejuni isolates [26], which underscores the potential role of the pathogen source and geographical region in determining influential resistance predictors among Campylobacter isolates. Comparisons between our findings and previous investigations should be treated with caution due to differences in study design, pathogen host, and geographical region. Our findings may also complement genomic research and agree with previous genomic and phenotypic concordance studies [44,45].
Our results also showed that clindamycin resistance (a lincosamide) was among the most important features in predicting MDR status in both training and external validation phases. This shows the importance of considering lincosamide resistance in addition to macrolide resistance in livestock production, as both antimicrobial classes target the bacterial ribosome, particularly the 50S subunit, to inhibit protein synthesis [46].
Compared to previous similar studies, the strength of our study is its two-phase temporal validation design, which addresses a common limitation in machine learning predictive AMR models. By training the model on data from 2013 to 2019 and validating it on data from 2020–2023 (excluding 2022), we evaluated real-world generalizability and showed that the model maintains high predictive accuracy across different time periods. Despite possible shifts in antimicrobial use, biosecurity practices at swine farms, and the emergence of new resistance mechanisms over time, our trained model correctly predicted MDR status in the majority of Campylobacter spp. isolates. Of 603 Campylobacter spp. isolates in the external validation dataset, only nine MDR isolates were misclassified as non-MDR (false negatives; 93.43% sensitivity).
Our model might have potential practical applications, beyond its performance metrics. In resource-limited settings, the model may detect potential MDR isolates based on partial antimicrobial susceptibility results, particularly when macrolide resistance is observed. Such early detection could support targeted interventions in mitigating AMR and MDR in livestock production systems.
In addition, integrating this type of machine learning model into laboratory data systems has the potential to simplify and speed up the detection of MDR isolates. Automated MDR prediction could be a valuable tool for public health authorities, helping them to detect emerging MDR clusters more quickly and allocate resources more efficiently [47], without the need for panel antimicrobial susceptibility testing or genomic sequencing. This approach can save time, money, and effort. Further machine learning research on AMR surveillance data over time is required to reinforce our findings and shed more light on the broader applicability and potential limitations of phenotypic-based machine learning MDR prediction.
In the present study, we did not conduct species-level modeling (C. jejuni vs. C. coli) because the dataset was largely composed of C. coli, which is the main Campylobacter species associated with swine. This imbalance made it difficult to perform meaningful comparisons, so our predictions mainly reflect resistance patterns in C. coli. Further studies are required to investigate species-level modeling in datasets where C. jejuni and C. coli are more evenly represented, such as Campylobacter isolates from poultry. This may provide more meaningful comparative insights.
This study has several limitations. Our model was trained and validated only on Campylobacter spp. isolates from swine in the United States, which may limit its generalizability for predicting MDR in Campylobacter spp. from other hosts or geographical regions. Moreover, our model was trained solely on phenotypic resistance data and did not include genomic information, which could influence predictive accuracy and mechanistic interpretability.

5. Conclusions

In conclusion, our study presented a robust, interpretable, and temporally validated supervised machine learning model (Random Forest) that achieved high accuracy in predicting MDR status in Campylobacter isolates from US swine, using the phenotypic resistance profiles of seven antimicrobials as predictors. This model may offer a scalable, cost-effective, and practical complement to traditional phenotypic testing, facilitating the rapid detection and monitoring of MDR in Campylobacter spp., particularly through key resistance predictors such as macrolide resistance. Such models or similar approaches may also have potential for real-time implementation in AMR surveillance efforts and AMR diagnostic laboratories. Further validation on different host species and geographic regions, as well as integration of genomic data, will be essential to enhance the model’s generalizability and predictive accuracy in the global battle against antimicrobial resistance in major foodborne pathogens.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/vetsci12100937/s1, Table S1. Performance comparison of five machine learning algorithms for predicting multidrug resistance in swine-derived Campylobacter spp. (Accuracy and Kappa); Table S2. Prediction results of the Random Forest model for multidrug resistance in Campylobacter isolates from swine (Actual vs. Predicted); Table S3. Prediction results of the external validation phase (Phase 2) of the Random Forest model for multidrug resistance in Campylobacter isolates from swine (Actual vs. Predicted).

Author Contributions

Conceptualization, H.R.S.; methodology, H.R.S. and M.G.; software, H.R.S.; validation, H.R.S. and M.G.; formal analysis, H.R.S.; visualization, H.R.S.; data curation, H.R.S. and M.G.; Investigation, H.R.S., M.G., C.V. and I.H.; writing—original draft preparation, H.R.S. and M.G.; writing—review and editing, H.R.S., M.G., C.V. and I.H.; supervision, H.R.S. and I.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no specific grant from funding agencies in the public or not-for-profit sectors.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are publicly available. The original contributions presented in this study are included in the article/Supplementary Material.

Acknowledgments

The authors thank the National Antimicrobial Resistance Monitoring System (NARMS) for generating the data and making it publicly available.

Conflicts of Interest

Author Maryam Ghasemi is employed by Jouya Behnood Company, Tehran, Iran. The remaining authors declare no conflict of interest.

References

  1. Ranasinghe, S.; Fhogartaigh, C.N. Bacterial Gastroenteritis. Medicine 2021, 49, 687–693. [Google Scholar] [CrossRef]
  2. Chlebicz, A.; Śliżewska, K. Campylobacteriosis, Salmonellosis, Yersiniosis, and Listeriosis as Zoonotic Foodborne Diseases: A Review. Int. J. Environ. Res. Public Health 2018, 15, 863. [Google Scholar] [CrossRef]
  3. Ben Romdhane, R.; Merle, R. The Data Behind Risk Analysis of Campylobacter Jejuni and Campylobacter Coli Infections. Curr. Top. Microbiol. Immunol. 2021, 431, 25–58. [Google Scholar] [CrossRef]
  4. Francois Watkins, L.K.; Laughlin, M.E.; Joseph, L.A.; Chen, J.C.; Nichols, M.; Basler, C.; Breazu, R.; Bennett, C.; Koski, L.; Montgomery, M.P.; et al. Ongoing Outbreak of Extensively Drug-Resistant Campylobacter Jejuni Infections Associated with US Pet Store Puppies, 2016–2020. JAMA. Netw. Open 2021, 4, e2125203. [Google Scholar] [CrossRef] [PubMed]
  5. Habib, I.; Mohamed, M.-Y.I.; Lakshmi, G.B.; Al Marzooqi, H.M.; Afifi, H.S.; Shehata, M.G.; Khan, M.; Ghazawi, A.; Abdalla, A.; Anes, F. Quantitative Assessment and Genomic Profiling of Campylobacter Dynamics in Poultry Processing: A Case Study in the United Arab Emirates Integrated Abattoir System. Front. Microbiol. 2024, 15, 1439424. [Google Scholar] [CrossRef] [PubMed]
  6. Boes, J.; Nersting, L.; Nielsen, E.M.; Kranker, S.; Enøe, C.; Wachmann, H.C.; Baggesen, D.L. Prevalence and Diversity of Campylobacter Jejuni in Pig Herds on Farms with and without Cattle or Poultry. J. Food Prot. 2005, 68, 722–727. [Google Scholar] [CrossRef]
  7. Papadopoulos, D.; Petridou, E.; Papageorgiou, K.; Giantsis, I.A.; Delis, G.; Economou, V.; Frydas, I.; Papadopoulos, G.; Hatzistylianou, M.; Kritas, S.K. Phenotypic and Molecular Patterns of Resistance among Campylobacter Coli and Campylobacter Jejuni Isolates, from Pig Farms. Animals 2021, 11, 2394. [Google Scholar] [CrossRef] [PubMed]
  8. Whitehouse, C.A.; Young, S.; Li, C.; Hsu, C.-H.; Martin, G.; Zhao, S. Use of Whole-Genome Sequencing for Campylobacter Surveillance from NARMS Retail Poultry in the United States in 2015. Food Microbiol. 2018, 73, 122–128. [Google Scholar] [CrossRef]
  9. Sodagari, H.R.; Sohail, M.N.; Varga, C. Temporal, Regional, and Demographic Differences among Antimicrobial-Resistant Domestic Campylobacter Jejuni Human Infections across the United States, 2013–2019. Int. J. Antimicrob. Agents 2025, 65, 107467. [Google Scholar] [CrossRef]
  10. Ahmed, S.K.; Hussein, S.; Qurbani, K.; Ibrahim, R.H.; Fareeq, A.; Mahmood, K.A.; Mohamed, M.G. Antimicrobial Resistance: Impacts, Challenges, and Future Prospects. J. Med. Surg. Public Health 2024, 2, 100081. [Google Scholar] [CrossRef]
  11. Habib, I.; Ibrahim Mohamed, M.-Y.; Ghazawi, A.; Lakshmi, G.B.; Khan, M.; Li, D.; Sahibzada, S. Genomic Characterization of Molecular Markers Associated with Antimicrobial Resistance and Virulence of the Prevalent Campylobacter Coli Isolated from Retail Chicken Meat in the United Arab Emirates. Curr. Res. Food Sci. 2023, 6, 100434. [Google Scholar] [CrossRef]
  12. Sodagari, H.R.; Agrawal, I.; Sohail, M.N.; Yudhanto, S.; Varga, C. Monitoring Antimicrobial Resistance in Campylobacter Isolates of Chickens and Turkeys at the Slaughter Establishment Level across the United States, 2013–2021. Epidemiol. Infect. 2024, 152, e41. [Google Scholar] [CrossRef]
  13. Marin, C.; Lorenzo-Rebenaque, L.; Moreno-Moliner, J.; Sevilla-Navarro, S.; Montero, E.; Chinillac, M.C.; Jordá, J.; Vega, S. Multidrug-Resistant Campylobacer Jejuni on Swine Processing at a Slaughterhouse in Eastern Spain. Animals 2021, 11, 1339. [Google Scholar] [CrossRef] [PubMed]
  14. Fernández-Palacios, P.; Galán-Sánchez, F.; Casimiro-Soriguer, C.S.; Jurado-Tarifa, E.; Arroyo, F.; Lara, M.; Chaves, J.A.; Dopazo, J.; Rodríguez-Iglesias, M.A. Genotypic Characterization and Antimicrobial Susceptibility of Human Campylobacter Jejuni Isolates in Southern Spain. Microbiol. Spectr. 2024, 12, e0102824. [Google Scholar] [CrossRef] [PubMed]
  15. Chavan, P.; Vashishth, R. Antimicrobial Resistance in Foodborne Pathogens: Consequences for Public Health and Future Approaches. Discov. Appl. Sci. 2025, 7, 623. [Google Scholar] [CrossRef]
  16. McClymont, H.; Lambert, S.B.; Barr, I.; Vardoulakis, S.; Bambrick, H.; Hu, W. Internet-Based Surveillance Systems and Infectious Diseases Prediction: An Updated Review of the Last 10 Years and Lessons from the COVID-19 Pandemic. J. Epidemiol. Glob. Health 2024, 14, 645–657. [Google Scholar] [CrossRef] [PubMed]
  17. Xu, C.; Jackson, S.A. Machine Learning and Complex Biological Data. Genome Biol. 2019, 20, 76. [Google Scholar] [CrossRef]
  18. Drouin, A.; Giguère, S.; Déraspe, M.; Marchand, M.; Tyers, M.; Loo, V.G.; Bourgault, A.-M.; Laviolette, F.; Corbeil, J. Predictive Computational Phenotyping and Biomarker Discovery Using Reference-Free Genome Comparisons. BMC Genom. 2016, 17, 754. [Google Scholar] [CrossRef]
  19. Hyun, J.C.; Kavvas, E.S.; Monk, J.M.; Palsson, B.O. Machine Learning with Random Subspace Ensembles Identifies Antimicrobial Resistance Determinants from Pan-Genomes of Three Pathogens. PLoS. Comput. Biol. 2020, 16, e1007608. [Google Scholar] [CrossRef]
  20. Naidenov, B.; Lim, A.; Willyerd, K.; Torres, N.J.; Johnson, W.L.; Hwang, H.J.; Hoyt, P.; Gustafson, J.E.; Chen, C. Pan-Genomic and Polymorphic Driven Prediction of Antibiotic Resistance in Elizabethkingia. Front. Microbiol. 2019, 10, 1446. [Google Scholar] [CrossRef]
  21. Parthasarathi, K.T.S.; Gaikwad, K.B.; Rajesh, S.; Rana, S.; Pandey, A.; Singh, H.; Sharma, J. A Machine Learning-Based Strategy to Elucidate the Identification of Antibiotic Resistance in Bacteria. Front. Antibiot. 2024, 3, 1405296. [Google Scholar] [CrossRef] [PubMed]
  22. Liang, Q.; Zhao, Q.; Xu, X.; Zhou, Y.; Huang, M. Early Prediction of Carbapenem-Resistant Gram-Negative Bacterial Carriage in Intensive Care Units Using Machine Learning. J. Glob. Antimicrob. Resist. 2022, 29, 225–231. [Google Scholar] [CrossRef]
  23. Moran, E.; Robinson, E.; Green, C.; Keeling, M.; Collyer, B. Towards Personalized Guidelines: Using Machine-Learning Algorithms to Guide Antimicrobial Selection. J. Antimicrob. Chemother. 2020, 75, 2677–2680. [Google Scholar] [CrossRef] [PubMed]
  24. Noman, S.M.; Zeeshan, M.; Arshad, J.; Deressa Amentie, M.; Shafiq, M.; Yuan, Y.; Zeng, M.; Li, X.; Xie, Q.; Jiao, X. Machine Learning Techniques for Antimicrobial Resistance Prediction of Pseudomonas Aeruginosa from Whole Genome Sequence Data. Comput. Intell. Neurosci. 2023, 2023, 5236168. [Google Scholar] [CrossRef]
  25. Chowdhury, A.S.; Call, D.R.; Broschat, S.L. Antimicrobial Resistance Prediction for Gram-Negative Bacteria via Game Theory-Based Feature Evaluation. Sci. Rep. 2019, 9, 14487. [Google Scholar] [CrossRef]
  26. Feucherolles, M.; Nennig, M.; Becker, S.L.; Martiny, D.; Losch, S.; Penny, C.; Cauchie, H.-M.; Ragimbeau, C. Combination of MALDI-TOF Mass Spectrometry and Machine Learning for Rapid Antimicrobial Resistance Screening: The Case of Campylobacter Spp. Front Microbiol. 2021, 12, 804484. [Google Scholar] [CrossRef]
  27. U.S. Department of Agriculture, Food Safety and Inspection Service. FSIS Laboratory Sampling Data: NARMS Cecal Sampling. Food Safety and Inspection Service. Available online: https://www.Fsis.Usda.Gov/Science-Data/Data-Sets-Visualizations/Laboratory-Sampling-Data (accessed on 1 August 2025).
  28. Magiorakos, A.-P.; Srinivasan, A.; Carey, R.B.; Carmeli, Y.; Falagas, M.E.; Giske, C.G.; Harbarth, S.; Hindler, J.F.; Kahlmeter, G.; Olsson-Liljequist, B.; et al. Multidrug-Resistant, Extensively Drug-Resistant and Pandrug-Resistant Bacteria: An International Expert Proposal for Interim Standard Definitions for Acquired Resistance. Clin. Microbiol. Infect. 2012, 18, 268–281. [Google Scholar] [CrossRef]
  29. Singh, D.; Singh, B. Investigating the Impact of Data Normalization on Classification Performance. Appl. Soft. Comput. 2020, 97, 105524. [Google Scholar] [CrossRef]
  30. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2025; Available online: https://www.r-project.org/ (accessed on 1 August 2025).
  31. Osisanwo, F.Y.; Akinsola, J.E.T.; Awodele, O.; Hinmikaiye, J.O.; Olakanmi, O.; Akinjobi, J. Supervised Machine Learning Algorithms: Classification and Comparison. Int. J. Comput. Trends Technol.-IJCTT 2017, 48. [Google Scholar] [CrossRef]
  32. Muhammad, I.; Yan, Z. Supervised Machine Learning Approaches: A Survey. ICTACT J. Soft Comput. 2015, 5, 946–952. [Google Scholar] [CrossRef]
  33. Radhoush, S.; Whitaker, B.M.; Nehrir, H. An Overview of Supervised Machine Learning Approaches for Applications in Active Distribution Networks. Energies 2023, 16, 5972. [Google Scholar] [CrossRef]
  34. Uddin, S.; Khan, A.; Hossain, M.E.; Moni, M.A. Comparing Different Supervised Machine Learning Algorithms for Disease Prediction. BMC Med. Inform. Decis. Mak. 2019, 19, 281. [Google Scholar] [CrossRef]
  35. ValizadehAslani, T.; Zhao, Z.; Sokhansanj, B.A.; Rosen, G.L. Amino Acid K-Mer Feature Extraction for Quantitative Antimicrobial Resistance (AMR) Prediction by Machine Learning and Model Interpretation for Biological Insights. Biology 2020, 9, 365. [Google Scholar] [CrossRef]
  36. Lu, J.; Chen, J.; Liu, C.; Zeng, Y.; Sun, Q.; Li, J.; Shen, Z.; Chen, S.; Zhang, R. Identification of Antibiotic Resistance and Virulence-Encoding Factors in Klebsiella Pneumoniae by Raman Spectroscopy and Deep Learning. Microb. Biotechnol. 2022, 15, 1270–1280. [Google Scholar] [CrossRef] [PubMed]
  37. Portelli, S.; Myung, Y.; Furnham, N.; Vedithi, S.C.; Pires, D.E.V.; Ascher, D.B. Prediction of Rifampicin Resistance beyond the RRDR Using Structure-Based Machine Learning Approaches. Sci. Rep. 2020, 10, 18120. [Google Scholar] [CrossRef] [PubMed]
  38. Tian, Y.; Zhang, D.; Chen, F.; Rao, G.; Zhang, Y. Machine Learning-Based Colistin Resistance Marker Screening and Phenotype Prediction in Escherichia Coli from Whole Genome Sequencing Data. J. Infect. 2024, 88, 191–193. [Google Scholar] [CrossRef]
  39. Wang, H.-Y.; Chen, C.-H.; Lee, T.-Y.; Horng, J.-T.; Liu, T.-P.; Tseng, Y.-J.; Lu, J.-J. Rapid Detection of Heterogeneous Vancomycin-Intermediate Staphylococcus Aureus Based on Matrix-Assisted Laser Desorption Ionization Time-of-Flight: Using a Machine Learning Approach and Unbiased Validation. Front. Microbiol. 2018, 9, 2393. [Google Scholar] [CrossRef]
  40. Weis, C.; Cuénod, A.; Rieck, B.; Dubuis, O.; Graf, S.; Lang, C.; Oberle, M.; Brackmann, M.; Søgaard, K.K.; Osthoff, M.; et al. Direct Antimicrobial Resistance Prediction from Clinical MALDI-TOF Mass Spectra Using Machine Learning. Nat. Med. 2022, 28, 164–174. [Google Scholar] [CrossRef]
  41. Babiker, A.; Mustapha, M.M.; Pacey, M.P.; Shutt, K.A.; Ezeonwuka, C.D.; Ohm, S.L.; Cooper, V.S.; Marsh, J.W.; Doi, Y.; Harrison, L.H. Use of Online Tools for Antimicrobial Resistance Prediction by Whole-Genome Sequencing in Methicillin-Resistant Staphylococcus Aureus (MRSA) and Vancomycin-Resistant Enterococci (VRE). J. Glob. Antimicrob. Resist. 2019, 19, 136–143. [Google Scholar] [CrossRef] [PubMed]
  42. Hendriksen, R.S.; Bortolaia, V.; Tate, H.; Tyson, G.H.; Aarestrup, F.M.; McDermott, P.F. Using Genomics to Track Global Antimicrobial Resistance. Front. Public Health 2019, 7, 242. [Google Scholar] [CrossRef]
  43. Dai, L.; Sahin, O.; Grover, M.; Zhang, Q. New and Alternative Strategies for the Prevention, Control, and Treatment of Antibiotic-Resistant Campylobacter. Transl. Res. 2020, 223, 76–88. [Google Scholar] [CrossRef] [PubMed]
  44. Gharbi, M.; Kamoun, S.; Hkimi, C.; Ghedira, K.; Béjaoui, A.; Maaroufi, A. Relationships between Virulence Genes and Antibiotic Resistance Phenotypes/Genotypes in Campylobacter Spp. Isolated from Layer Hens and Eggs in the North of Tunisia: Statistical and Computational Insights. Foods 2022, 11, 3554. [Google Scholar] [CrossRef] [PubMed]
  45. Painset, A.; Day, M.; Doumith, M.; Rigby, J.; Jenkins, C.; Grant, K.; Dallman, T.J.; Godbole, G.; Swift, C. Comparison of Phenotypic and WGS-Derived Antimicrobial Resistance Profiles of Campylobacter Jejuni and Campylobacter Coli Isolated from Cases of Diarrhoeal Disease in England and Wales, 2015-16. J. Antimicrob. Chemother. 2020, 75, 883–889. [Google Scholar] [CrossRef] [PubMed]
  46. Pyörälä, S.; Baptiste, K.E.; Catry, B.; van Duijkeren, E.; Greko, C.; Moreno, M.A.; Pomba, M.C.M.F.; Rantala, M.; Ružauskas, M.; Sanders, P.; et al. Macrolides and Lincosamides in Cattle and Pigs: Use and Development of Antimicrobial Resistance. Vet. J. 2014, 200, 230–239. [Google Scholar] [CrossRef]
  47. de la Lastra, J.M.P.; Wardell, S.J.T.; Pal, T.; de la Fuente-Nunez, C.; Pletzer, D. From Data to Decisions: Leveraging Artificial Intelligence and Machine Learning in Combating Antimicrobial Resistance—A Comprehensive Review. J. Med. Syst. 2024, 48, 71. [Google Scholar] [CrossRef]
Figure 1. Workflow of the machine learning pipeline to develop and validate a predictive model for multidrug resistance (MDR) in swine-derived Campylobacter isolates.
Figure 2. Cross-validation performance metrics (Accuracy and Cohen’s Kappa) of five supervised machine learning algorithms for predicting multidrug resistance in Campylobacter spp. isolates from swine.
Figure 3. (a) Confusion matrix showing predicted versus actual multidrug resistance classification results of the trained Random Forest model using 5-fold cross-validation. The matrix displays the number of correctly and incorrectly classified MDR (Y) and non-MDR (N) isolates. (b) Feature importance scores of predictors from the trained Random Forest model predicting MDR in swine-derived Campylobacter isolates. ERY: erythromycin, AZI: azithromycin, CLI: clindamycin, TET: tetracycline, NAL: nalidixic acid, CIP: ciprofloxacin, GEN: gentamicin.
Figure 4. (a) Confusion matrix showing predicted versus actual MDR classification results from the external validation of the trained Random Forest model. The matrix displays the number of correctly and incorrectly classified MDR (Y) and non-MDR (N) isolates. (b) Feature importance scores of predictors from the external validation dataset, evaluating the trained Random Forest model predicting MDR in swine-derived Campylobacter spp. isolates. ERY: erythromycin, AZI: azithromycin, CLI: clindamycin, TET: tetracycline, NAL: nalidixic acid, CIP: ciprofloxacin, GEN: gentamicin.
Table 1. Performance statistics of the trained Random Forest model for predicting MDR in Campylobacter isolates from swine.

| Metric | Value |
|---|---|
| Accuracy | 99.73% |
| 95% Confidence Interval | 99.42–99.90% |
| No Information Rate (NIR) | 76.54% |
| p-Value [Accuracy > NIR] | < 2 × 10⁻¹⁶ |
| Kappa | 0.9925 |
| McNemar’s Test p-Value | 0.0412 |
| Sensitivity | 98.86% |
| Specificity | 100.00% |
| Positive Predictive Value | 100.00% |
| Negative Predictive Value | 99.65% |
| Prevalence | 23.46% |
| Detection Rate | 23.19% |
| Detection Prevalence | 23.19% |
| Balanced Accuracy | 99.43% |
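Several of the statistics in Table 1 are arithmetic consequences of the others. The following is a minimal sketch, not the authors' code, that recomputes accuracy, predictive values, detection rate, and balanced accuracy from only the reported sensitivity, specificity, and prevalence. The function name `derived_metrics` and its dictionary keys are hypothetical, and the inputs are the rounded percentages from Table 1, so small rounding differences are possible.

```python
def derived_metrics(sensitivity, specificity, prevalence):
    """Recompute summary statistics from three base rates (all as fractions)."""
    # Cell proportions of the 2x2 confusion matrix, with MDR as the positive class.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return {
        "accuracy": tp + tn,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "detection_rate": tp,
        "detection_prevalence": tp + fp,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Rounded inputs taken from Table 1 (cross-validated Random Forest model).
table1 = derived_metrics(sensitivity=0.9886, specificity=1.0, prevalence=0.2346)
for name, value in table1.items():
    print(f"{name}: {value:.2%}")
```

Because specificity is 100%, the false-positive proportion is zero, which is why the detection rate and detection prevalence coincide and the positive predictive value is 100%.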
Table 2. Performance statistics of the external validation of the Random Forest model for predicting MDR in Campylobacter isolates from swine.

| Metric | Value |
|---|---|
| Accuracy | 98.51% |
| 95% Confidence Interval | 97.19–99.32% |
| No Information Rate (NIR) | 77.28% |
| p-Value [Accuracy > NIR] | < 2.2 × 10⁻¹⁶ |
| Kappa | 0.9565 |
| McNemar’s Test p-Value | 0.007661 |
| Sensitivity | 93.43% |
| Specificity | 100% |
| Positive Predictive Value | 100% |
| Negative Predictive Value | 98.11% |
| Prevalence | 22.72% |
| Detection Rate | 21.23% |
| Detection Prevalence | 21.23% |
| Balanced Accuracy | 96.72% |
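Cohen's Kappa and the No Information Rate in Table 2 can be cross-checked from the other reported values. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the helper names `cohens_kappa` and `no_information_rate` are hypothetical, the inputs are the rounded Table 2 percentages, and detection prevalence is taken as equal to the detection rate (21.23%) because the reported specificity of 100% implies no false positives.

```python
def cohens_kappa(accuracy, prevalence, detection_prevalence):
    """Cohen's Kappa for a binary classifier from observed and chance agreement."""
    # Chance agreement: both label "MDR" by chance, or both label "non-MDR" by chance.
    expected = (prevalence * detection_prevalence
                + (1 - prevalence) * (1 - detection_prevalence))
    return (accuracy - expected) / (1 - expected)


def no_information_rate(prevalence):
    """Accuracy obtained by always predicting the majority class."""
    return max(prevalence, 1 - prevalence)


# Rounded inputs taken from Table 2 (external validation).
kappa_external = cohens_kappa(
    accuracy=0.9851, prevalence=0.2272, detection_prevalence=0.2123
)
nir_external = no_information_rate(prevalence=0.2272)
print(f"kappa: {kappa_external:.3f}, NIR: {nir_external:.2%}")
```

With these rounded inputs the recomputed Kappa should agree with the reported 0.9565 to about three decimal places, and the NIR equals the share of non-MDR isolates.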
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Citation: Sodagari, H.R.; Ghasemi, M.; Varga, C.; Habib, I. Machine Learning Prediction of Multidrug Resistance in Swine-Derived Campylobacter spp. Using United States Antimicrobial Resistance Surveillance Data (2013–2023). Vet. Sci. 2025, 12, 937. https://doi.org/10.3390/vetsci12100937