Machine Learning Prediction Model for Inflammatory Bowel Disease Based on Laboratory Markers. Working Model in a Discovery Cohort Study
Abstract
1. Introduction
- building a machine learning model based on routinely performed laboratory blood, urine, and fecal tests to support differentiation between IBD patients and non-IBD patients,
- comparing the effectiveness of our model with that of a standard inflammatory serum marker, C-reactive protein (CRP), in the prediction of IBD,
- creating a web-based application supporting the prediction of the presence of IBD.
2. Materials and Methods
Evaluating the Effectiveness of the Model
3. Results
3.1. Data Filtering and Input Features
3.2. Machine Learning Classifiers
3.2.1. Logistic Regression
3.2.2. The k-Nearest Neighbor
3.2.3. Gradient Boosting Classifier
3.2.4. Random Forests
3.2.5. Support Vector Classifiers
3.2.6. Majority Voting
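The voting scheme used in the study is not described in this section. As a minimal sketch, assuming plain hard voting over the binary labels used in the result tables (0 = no disease, 1 = disease present), the combination step could look as follows. The function name and the toy predictions are illustrative only and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists by hard (majority) voting.

    predictions: list of lists, one inner list per classifier,
    all of the same length (one label per sample).
    With an odd number of binary classifiers no ties can occur.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        # Count the labels assigned to sample i by every classifier.
        votes = Counter(classifier[i] for classifier in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Toy example: three hypothetical classifiers, three samples.
toy_predictions = [
    [0, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
]
combined = majority_vote(toy_predictions)
print(combined)
```

With an even number of classifiers a tie-breaking rule would have to be chosen explicitly; this sketch simply keeps the label that was counted first.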
3.3. Best Classifiers and Most Important Predictors
3.4. Model Robustness
3.5. Comparison of the Machine Learning Model to C-Reactive Protein in the Prediction of the IBD Presence
3.6. Web Application Integrated Model
4. Discussion
- gastrointestinal infections, gastric and colonic malignancies, eosinophilic colitis, lymphocytic colitis, and coeliac disease,
- concomitant medical treatment with proton pump inhibitors, nonsteroidal anti-inflammatory drugs, and acetylsalicylic acid,
- age.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Characteristic | Ulcerative Colitis N = 319 | Crohn’s Disease N = 383 |
---|---|---|
Age at diagnosis, n (%) | ||
A1, below 16 years | | 38 (9.9) |
A2, between 17 and 40 years | | 234 (61.1) |
A3, above 40 years | | 111 (29.0) |
Behaviour, n (%) | ||
B1, non-stricturing, non-penetrating | | 298 (77.8) |
B2, stricturing | | 57 (14.9) |
B3, penetrating | | 28 (7.3) |
Extent, n (%) | ||
E1, proctitis | 149 (46.7) | |
E2, left-sided colitis | 100 (31.3) | |
E3, extended colitis | 70 (22) | |
Location, n (%) | ||
L1, ileal | | 85 (22.2) |
L2, colonic | | 107 (27.9) |
L3, ileocolonic | | 191 (49.9) |
Severity, n (%) | ||
S0, clinical remission | 162 (50.8) | |
S1, mild ulcerative colitis | 39 (12.2) | |
S2, moderate ulcerative colitis | 70 (22) | |
S3, severe ulcerative colitis | 48 (15.1) | |
Characteristic | Ulcerative Colitis N = 319 | Crohn’s Disease N = 383 | Control N = 315 |
---|---|---|---|
Age, y | 41.8 ± 14.7 | 36.3 ± 12.6 | 63.8 ± 13.4 |
Female gender, n (%) | 183 (57) | 159 (41) | 135 (43) |
C-reactive protein level, mg/L | 20.7 ± 50.8 | 25.1 ± 41.7 | 5.5 ± 15.4 |
Sodium, mmol/L | 139.2 ± 2.9 | 139.3 ± 2.3 | 140.0 ± 2.5 |
Potassium, mmol/L | 4.1 ± 0.4 | 4.3 ± 0.3 | 4.3 ± 0.5 |
eGFR, mL/min/1.73 m² | 90.5 ± 18.1 | 214.9 ± 526.8 | 76.7 ± 19.5 |
Random blood glucose, mg/dL | 104.9 ± 31.1 | 96.5 ± 11.0 | 108.4 ± 30.8 |
Complete blood count | |||
White blood cell count, 10³/µL | 8.1 ± 3.5 | 7.5 ± 2.6 | 6.8 ± 1.9 |
Hemoglobin, g/dL | 12.6 ± 2.1 | 12.6 ± 1.9 | 13.7 ± 1.8 |
Hematocrit, % | 38.6 ± 5.3 | 38.6 ± 4.7 | 41.0 ± 4.8 |
Activated partial thromboplastin time, sec. | 27.7 ± 6.0 | 27.3 ± 3.4 | 26.3 ± 3.8 |
Liver function test | |||
Alanine transaminase, U/L | 32.5 ± 43.0 | 20.8 ± 15.5 | 27.0 ± 15.1 |
Aspartate transaminase, U/L | 32.4 ± 35.6 | 23.4 ± 9.4 | 30.5 ± 17.1 |
Alkaline phosphatase, U/L | 124.9 ± 174.7 | 86.1 ± 36.4 | 82.1 ± 43.5 |
Total bilirubin, mg/dL | 0.87 ± 0.99 | 0.71 ± 0.48 | 0.96 ± 1.27 |
Numerical Value * | Converted Text Value 1 | Binary Value * |
---|---|---|
25-OH-Vitamin D | bilirubin in urine | disease diagnosis 2 |
age | blood in urine | gender 3 |
ALT | ketones in urine | HBeAg |
alkaline phosphatase | leukocytes in urine | stool ova and parasites microscopic test |
APTT | nitrites in urine | |
AST | protein in urine | |
basophils % | squamous epithelial cells in urine | |
basophils # | ||
bilirubin (total) | ||
cholesterol (HDL) | ||
cholesterol (LDL) | ||
cholesterol (total) | ||
eGFR | ||
eosinophils % | ||
eosinophils # | ||
erythroblasts % | ||
erythroblasts # | ||
erythrocytes % | ||
ferritin | ||
fecal calprotectin | ||
folic acid | ||
GGTP | ||
glucose | ||
haemoglobin level | ||
haemoglobin A1c | ||
HTC % | ||
immature granulocytes # 4 | ||
iron | ||
leukocytes % | ||
lipase | ||
lymphocytes % | ||
lymphocytes # | ||
MCH | ||
MCHC | ||
MCV | ||
monocytes % | ||
monocytes # | ||
MPV | ||
neutrophils % | ||
neutrophils # | ||
PCT | ||
potassium | ||
protein (total) | ||
PT (index) | ||
PT/INR | ||
amylase | ||
creatinine | ||
sodium | ||
TSH 3rd generation | ||
ultra-sensitive CRP | ||
urine pH | ||
urobilinogen in urine | ||
vitamin B12 |
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
no disease (0) | 0.86 | 0.73 | 0.79 | 94 |
UC present (1) | 0.77 | 0.89 | 0.83 | 96 |
accuracy | - | - | 0.81 | 190 |
macro average | 0.82 | 0.81 | 0.81 | 190 |
weighted average | 0.82 | 0.81 | 0.81 | 190 |
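The per-class F1-scores and the overall accuracy in the report above can be checked from the tabulated precision, recall, and support values. The sketch below is only a consistency check on the rounded figures, not the authors' evaluation code; the function names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def accuracy_from_recalls(rows):
    """Overall accuracy from (recall, support) pairs, one pair per class.

    recall * support approximates the number of correctly classified
    samples in a class, so the sum divided by the total is the accuracy.
    """
    correct = sum(recall * support for recall, support in rows)
    total = sum(support for _, support in rows)
    return correct / total


# Rounded values copied from the report above (UC vs. no disease).
f1_no_disease = f1_score(0.86, 0.73)
f1_uc = f1_score(0.77, 0.89)
accuracy = accuracy_from_recalls([(0.73, 94), (0.89, 96)])

print(round(f1_no_disease, 2), round(f1_uc, 2), round(accuracy, 2))
```

To two decimals these values agree with the tabulated F1-scores and accuracy. The macro and weighted averages are not recomputed here, because they were presumably derived from unrounded per-class values and may differ in the last digit when rebuilt from the rounded ones.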
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
no disease (0) | 0.92 | 0.76 | 0.83 | 94 |
CD present (1) | 0.83 | 0.95 | 0.88 | 115 |
accuracy | - | - | 0.86 | 209 |
macro average | 0.87 | 0.85 | 0.86 | 209 |
weighted average | 0.87 | 0.86 | 0.86 | 209 |
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
no disease (0) | 0.55 | 0.82 | 0.66 | 94 |
UC present (1) | 0.66 | 0.34 | 0.45 | 96 |
accuracy | - | - | 0.58 | 190 |
macro average | 0.60 | 0.58 | 0.56 | 190 |
weighted average | 0.61 | 0.58 | 0.55 | 190 |
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
no disease (0) | 0.59 | 0.77 | 0.66 | 94 |
CD present (1) | 0.74 | 0.56 | 0.64 | 115 |
accuracy | - | - | 0.65 | 209 |
macro average | 0.66 | 0.66 | 0.65 | 209 |
weighted average | 0.67 | 0.65 | 0.65 | 209 |
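A report of this kind can be translated back into approximate confusion-matrix counts, which makes the numbers of missed and falsely flagged patients explicit. The sketch below does this for the report directly above. It assumes that class 0 is the negative class and class 1 the positive class, and because the recalls are rounded to two decimals the reconstructed counts are approximate; the function name is illustrative.

```python
def approximate_confusion_matrix(recall_neg, support_neg, recall_pos, support_pos):
    """Rebuild approximate counts from per-class recall and support."""
    tn = round(recall_neg * support_neg)  # correctly classified negatives
    tp = round(recall_pos * support_pos)  # correctly classified positives
    return {
        "tn": tn,
        "fp": support_neg - tn,  # negatives predicted as positive
        "fn": support_pos - tp,  # positives predicted as negative
        "tp": tp,
    }


# Rounded recalls and supports copied from the report above (CD vs. no disease).
cm = approximate_confusion_matrix(0.77, 94, 0.56, 115)

precision_pos = cm["tp"] / (cm["tp"] + cm["fp"])
precision_neg = cm["tn"] / (cm["tn"] + cm["fn"])
accuracy = (cm["tp"] + cm["tn"]) / (94 + 115)

print(cm)
print(round(precision_pos, 2), round(precision_neg, 2), round(accuracy, 2))
```

The precisions and accuracy recomputed from these counts round to the values given in the report, which suggests the reconstruction is close to the underlying confusion matrix.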
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kraszewski, S.; Szczurek, W.; Szymczak, J.; Reguła, M.; Neubauer, K. Machine Learning Prediction Model for Inflammatory Bowel Disease Based on Laboratory Markers. Working Model in a Discovery Cohort Study. J. Clin. Med. 2021, 10, 4745. https://doi.org/10.3390/jcm10204745