Applying Data Mining Techniques for Predicting Prognosis in Patients with Rheumatoid Arthritis
Abstract
1. Introduction
- (1) Use data mining techniques to analyze patients with and without complications (cardiovascular diseases, hepatitis, and ESRD) and to predict whether the ESR value will fall within the normal range, which indicates an improvement in the degree of inflammation and erosion of the limbs.
- (2) Use data mining techniques and logistic regression analysis to predict the prognosis of RA patients.
2. Rheumatoid Arthritis (RA) Associated Complications and Prognosis
3. Research Method
3.1. Data
3.1.1. Data Cleanup and Sample Selection
3.1.2. Research Variables
3.2. Experimental Design
3.3. The Investigated Classification Techniques
4. Results
4.1. Descriptions of the Related Variables
4.2. Evaluation Results
4.3. Discussion
4.3.1. RA Group
- Sex: no significant difference was found between males and females. However, in this study the ESR of males fell below 12 more readily than that of females. McInnes et al. reported an annual incidence of RA of 40/100,000 in women, with the rate in men about half of that [1]. In this study, male patients nonetheless controlled the disease better, which differs from past reports, so further research and analysis are required.
- Day_d: based on experts’ clinical interpretations, the longer the treatment, the better its efficacy and the more easily the ESR is reduced to below 12. According to Mark et al., under the general treatment guidelines the full effect of DMARDs may not appear until around two to three months into the initial therapy [20]. The explanatory power of day_d is therefore high, which is consistent with the result of this study.
- RF-IgM: based on experts’ clinical interpretations, RA patients with an RF-IgM > 40 g/L had severe, poorly controlled disease, and their ESR was not easily reduced to below 12. According to Combe et al., positivity for rheumatoid factor (RF-IgM) can predict joint destruction and radiographic progression in early-stage RA [12]. Given that the odds ratio (OR) of the predictor variable RF-IgM is high (P < 0.001) [3], the explanatory power of RF-IgM is high, which is consistent with the result of this study.
- Steroids: based on experts’ clinical interpretations, patients with better-controlled disease require lower doses, and their ESR is more easily reduced to below 12. Gestel et al. reported that adjunctive treatment with low-dose (≤7.5 mg) steroids may improve disease prognosis [4,5]. The explanatory power of steroids is therefore high, which is consistent with the result of this study.
4.3.2. RA_COM Group
- RF_IgM: based on experts’ clinical interpretations, RA patients with an RF_IgM > 40 g/L had severe, poorly controlled disease, and their ESR was not easily reduced to below 12. According to Combe et al., positivity for rheumatoid factor (RF-IgM) can predict joint destruction and radiographic progression in early-stage RA. Given that the odds ratio (OR) of the predictor variable RF-IgM is high (P < 0.001) [12], the explanatory power of RF_IgM is high, which is consistent with the result of this study.
- Methotrexate: based on experts’ clinical interpretations, patients with complications had better-controlled disease after taking methotrexate, and their ESR was more easily reduced to below 12. According to Mark et al. [7], the immunosuppressant methotrexate acts quickly, starts working within a few weeks, and has excellent efficacy. Most patients respond well to it, so methotrexate is currently the most frequently used first-line drug in RA [20]. The explanatory power of Methotrexate is therefore high, which is consistent with the result of this study.
- Leflunomide: based on experts’ clinical interpretations, leflunomide is a second-line drug used only to treat severe patients later in the treatment course. These patients therefore frequently had poorly controlled disease, and their ESR could not easily be reduced to below 12. Mark et al. likewise describe leflunomide as a second-line drug specifically for treating severe RA patients [20]. The explanatory power of Leflunomide is therefore high, which is consistent with the result of this study.
- IF_NSAIDs: based on experts’ clinical interpretations, RA patients with acute symptoms and severe disease activity are treated with these drugs, so these patients had poorly controlled disease and their ESR could not easily be reduced to below 12. According to Belton et al. and Mark et al., NSAIDs are widely used to treat pain and inflammation, e.g., sore throat, fever, gout, sprains, and joint pain and inflammation, including RA and osteoarthritis [3,21]. The explanatory power of IF_NSAIDs is therefore high, which is consistent with the result of this study.
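The discussion above refers to the odds ratio (OR) of a predictor. As a reminder, the standard logistic regression formulation is sketched below; the symbols are generic notation rather than the fitted model of this study, with p denoting the probability that the ESR falls within the normal range and x_j the predictor variables.

```latex
\log\frac{p}{1-p} = \beta_0 + \sum_{j} \beta_j x_j,
\qquad
\mathrm{OR}_j = e^{\beta_j}
```

Under this formulation, an OR above 1 means that a one-unit increase in x_j raises the odds of the outcome, and an OR below 1 means that it lowers them.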
5. Conclusions
5.1. Results and Suggestions
5.2. Research Contribution
5.3. Research Restrictions
5.4. Future Research Direction
Author Contributions
Funding
Conflicts of Interest
References
- McInnes, I.B.; O’Dell, J.R. State-of-the-art: Rheumatoid arthritis. Review of pathophysiology and treatment. Ann. Rheum. Dis. 2010, 69, 1898–1906.
- Arnett, F.C.; Edworthy, S.M.; Bloch, D.A. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum. 1998, 31, 315–324.
- Mark, C.G. Treatment of Rheumatoid Arthritis. In Firestein: Kelley’s Textbook of Rheumatology, 9th ed.; Saunders: Philadelphia, PA, USA, 2012; Volume 67, pp. 1119–1143.
- Gestel, A.M.; Laan, R.F.; Haagsma, C.J.; Putte, L.B.; Riel, P.L. Oral steroids as bridge therapy in rheumatoid arthritis patients starting with parenteral gold. A randomized double-blind placebo-controlled trial. Br. J. Rheumatol. 1993, 34, 347–351.
- Keyser, F.D. Choice of biologic therapy for patients with rheumatoid arthritis: The infection perspective. Curr. Rheumatol. Rev. 2011, 7, 77–87.
- Clair, E.W.; Pisetsky, D.S.; Haynes, B.F. Rheumatoid Arthritis, 1st ed.; Lippincott Williams & Wilkins: Philadelphia, PA, USA, 2004.
- Mark, L.; Deborah, P.M.S.; Alan, J.S. An Evaluation of the Decision Tree Format of the American College of Rheumatology 1987 Classification Criteria for Rheumatoid Arthritis. Arthritis Rheumatol. 2005, 52, 2277–2283.
- Fautrel, B.; Guillemin, F.; Meyer, O.; Bandt, M.D.; Berthelot, J.M.; Flipo, R.M.; Maillefert, J.F.; Wendling, D.; Saraux, A.; Combe, B.; et al. Choice of Second-Line Disease-Modifying Antirheumatic Drugs After Failure of Methotrexate Therapy for Rheumatoid Arthritis: A Decision Tree for Clinical Practice Based on Rheumatologists’ Preferences. Arthritis Rheumatol. 2009, 61, 425–434.
- Boyesen, P.; Hoff, M.; Odegard, S.; Haugeberg, G.; Syversen, S.W.; Gaarder, P.I.; Okkenhaug, C.; Kvien, T.K. Antibodies to cyclic citrullinated protein and erythrocyte sedimentation rate predict hand bone loss in patients with rheumatoid arthritis of short duration: A longitudinal study. Arthritis Res. Ther. 2009, 11, R103.
- Weinblatt, M.E.; Keystone, E.C.; Cohen, M.D.; Freundlich, B.; Li, J. Factors associated with radiographic progression in patients with rheumatoid arthritis who were treated with methotrexate. J. Rheumatol. 2011, 38, 242–246.
- Drouin, J.; Haraoui, B. Predictors of clinical response and radiographic progression in patients with rheumatoid arthritis treated with methotrexate monotherapy. J. Rheumatol. 2010, 37, 1405–1410.
- Combe, B.; Dougados, M.; Goupille, P.; Cantagrel, A.; Eliaou, J.F.; Sibilia, J.; Meyer, O.; Sany, J.; Daures, J.P.; Dubois, A. Prognostic factors for radiographic damage in early rheumatoid arthritis: A multiparameter prospective study. Arthritis Rheumatol. 2001, 44, 1736–1743.
- Delen, D.; Walker, G.; Kadam, A. Predicting breast cancer survivability: A comparison of three data mining methods. Artif. Intell. Med. 2005, 34, 113–127.
- Kretschmann, E.; Apweiler, R. Automatic Rule Generation for Protein Annotation with the C4.5 Data-Mining Algorithm Applied on Peptides in Ensembl. In Proceedings of the German Conference on Bioinformatics, Braunschweig, Germany, 7–10 October 2001; pp. 53–57.
- Liao, S.H.; Wen, C.H. Data Mining and Business Intelligence; Yeh Yeh Book Gallery: Taipei, Taiwan, 2009.
- Richards, J.A.; Jia, X. Remote Sensing Digital Image Analysis: An Introduction, 3rd ed.; Springer: Berlin, Germany, 1999.
- Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth: Belmont, CA, USA, 1984.
- Vapnik, V. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995.
- Hosmer, D.W.; Lemeshow, S. Applied Logistic Regression; John Wiley & Sons: New York, NY, USA, 2013.
- Ozdamar, K. Paket Programlarla Istatistiksel veri Analizi—1; Kaan Kitabevi: Eskisehir, Turkey, 2004.
- Morel, J.; Combe, B. How to predict prognosis in early rheumatoid arthritis. Best Pract. Res. Clin. Rheumatol. 2005, 19, 137–146.
- Belton, O.; Byrne, D.; Kearney, D.; Leahy, A.; Desmond, J. Cyclooxygenase-1 and -2-dependent prostacyclin formation in patients with atherosclerosis. Circulation 2000, 102, 840–845.
Column Name | Column Description | Description of Definitions |
---|---|---|
Age | 9 to 88 | The date of diagnosis of RA minus the patient’s date of birth. |
Gender | Male or female | Male: M; Female: F |
RF-IgM | Rheumatoid Arthritis Index | The blood test results obtained before receiving treatments. |
GPT | Liver function index | The blood test results obtained before receiving treatments. |
IF_NSAIDs | Whether the patient uses non-steroidal anti-inflammatory drugs, classified into Cox-2 (Etodolac, Meloxicam, Nimesulide, Celecoxib, and Etoricoxib) and non-Cox-2 (Aceclofenac, Diclofenac, Flurbiprofen, Naproxen, and Sulindac) drugs. | If an NSAID is used, record as Y; otherwise, record as N. |
Steroids | Steroids (including prednisolone and methylprednisolone). | The average dose of the prescribed steroids, calculated from the patient’s medical record between the two ESR test dates of the current treatment. |
DMARDs | Whether the patient uses DMARDs (Methotrexate, Leflunomide, Hydroxychloroquine, Sulfasalazine, and Cyclosporine). | Each of the five DMARDs (Methotrexate, Sulfasalazine, Leflunomide, Hydroxychloroquine, and Cyclosporine) is a separate attribute: record as Y if the drug is used, otherwise record as N. |
Number | Counted over the five drugs listed in the DMARDs row above. | The total number of DMARDs the patient uses. |
IF_Biologics_i | Whether the patient uses biologics (Etanercept, Adalimumab, and Rituximab). | If the patient’s medical record shows a biologic used between the two ESR test dates of the current treatment, record as Y; otherwise record as N. |
day_d | Duration of medication. | The number of days between the two ESR test dates. |
Cardiovascular diseases | Whether the patient has cardiovascular disease. | ICD-9-CM: 401.9, 402.9, 405.99, 410.90, 412.0, 414.00, 414.9, 424.90, 428.0, 428.9, 429.2, 434.9, 434.91, 435.9, 436.0, 437.0, 437.2, 437.9, 438.9, 440.9. |
Hepatitis | Whether the patient has hepatitis. | ICD-9-CM: 070.30, 070.51, 070.9, 070.9. |
ESRD | Whether the patient has end-stage renal disease. | Secondary diagnosis ICD-9-CM: 585.6. |
ESR | Is the patient’s ESR normal? | The improved (reduced) ESR value, obtained from the patient’s medical record between the two ESR test dates of the current treatment (normal range: 0–12 mm/h). |
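The definitions above can be read as a small derivation step applied to each pair of consecutive ESR tests. The following is a minimal sketch under stated assumptions, not the authors' preprocessing code: the function name `derive_variables`, its parameters, and the choice to treat an ESR strictly below 12 mm/h as normal are all hypothetical, and drug names use their standard spellings.

```python
from datetime import date
from typing import Dict, Iterable

# The five DMARDs named in the table (standard drug-name spellings).
DMARDS = ("Methotrexate", "Sulfasalazine", "Leflunomide",
          "Hydroxychloroquine", "Cyclosporine")
ESR_NORMAL_UPPER = 12  # mm/h; treating "below 12" as normal is an assumption


def derive_variables(first_esr_date: date, second_esr_date: date,
                     second_esr_value: float,
                     drugs_prescribed: Iterable[str]) -> Dict[str, object]:
    """Build one record from a pair of consecutive ESR tests (hypothetical helper)."""
    prescribed = set(drugs_prescribed)
    # One Y/N attribute per DMARD.
    flags = {drug: ("Y" if drug in prescribed else "N") for drug in DMARDS}
    record: Dict[str, object] = dict(flags)
    # Number: how many of the five DMARDs the patient uses.
    record["Number"] = sum(1 for flag in flags.values() if flag == "Y")
    # day_d: days between the two ESR test dates.
    record["day_d"] = (second_esr_date - first_esr_date).days
    # ESR: whether the follow-up value is in the normal range.
    record["ESR"] = "Y" if second_esr_value < ESR_NORMAL_UPPER else "N"
    return record


# Illustrative input only; these values are not taken from the study data.
example = derive_variables(date(2020, 1, 1), date(2020, 3, 1), 10.0,
                           ["Methotrexate", "Leflunomide"])
print(example["day_d"], example["Number"], example["ESR"])
```

Keeping the threshold in a single named constant makes it easy to switch to an inclusive bound if the 0–12 mm/h range is meant to include 12.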
Research Variables | RA Patients Without Complications (n = 3,081) | RA Patients with Complications (n = 405) |
---|---|---|
Age (years) | 57.18 (9 to 88) [8.78] | 64.25 (41 to 86) [8.78] |
Sex | M: 475; F: 2,606 | M: 60; F: 345 |
RF_IgM (g/L) | 176.15 (0.57~3800) [300.33] | 188.71 (4.23~2860) [419.15] |
GPT (U/L) | 24.66 (2~557) [24.15] | 30.14 (0~40) [27.60] |
IF_NSAIDs | Y: 2,023; N: 1,058 | Y: 342; N: 63 |
Steroids (mg) | 5.72 (0~66) [6.01] | 3.70 (0~16.5) [3.21] |
DMARDs: Methotrexate | Y: 2,635; N: 446 | Y: 313; N: 92 |
DMARDs: Sulfasalazine | Y: 776; N: 2,305 | Y: 310; N: 95 |
DMARDs: Leflunomide | Y: 654; N: 2,427 | Y: 97; N: 308 |
DMARDs: Hydroxychloroquine | Y: 968; N: 2,113 | Y: 123; N: 282 |
DMARDs: Cyclosporine | Y: 174; N: 2,907 | Y: 16; N: 389 |
Number (combination of 1 to 3 drug names) | 1: 1,748; 2: 540; 3: 793 | 1: 50; 2: 256; 3: 99 |
IF_Biologics_i (Biologics) | Y: 59; N: 3,022 | Y: 24; N: 381 |
day_d (Medication time) (day) | 125 (60~2713) [150.20] | 114 (60~1321) [88.37] |
ESR < 12 mm/h | Y: 845; N: 2,236 | Y: 113; N: 292 |
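The ESR counts in the last row of the table give the class balance of the prediction target. As a rough aid for reading classifier accuracy, the sketch below computes the share of patients with ESR < 12 mm/h and the accuracy a trivial majority-class predictor would reach on the raw counts. The dictionary layout and the function name `class_balance` are hypothetical, and whether the classifiers were evaluated on this raw distribution or on resampled data is not stated here.

```python
from typing import Dict

# ESR < 12 mm/h counts copied from the table above.
ESR_COUNTS: Dict[str, Dict[str, int]] = {
    "RA": {"Y": 845, "N": 2236},       # patients without complications
    "RA_COM": {"Y": 113, "N": 292},    # patients with complications
}


def class_balance(group: str) -> Dict[str, float]:
    """Return group size, share of normal ESR, and majority-class baseline."""
    yes = ESR_COUNTS[group]["Y"]
    no = ESR_COUNTS[group]["N"]
    total = yes + no
    return {
        "total": total,
        "share_normal": yes / total,
        # Accuracy of always predicting the more frequent class.
        "majority_baseline": max(yes, no) / total,
    }


for name in ESR_COUNTS:
    stats = class_balance(name)
    print(name, stats["total"],
          round(stats["share_normal"], 4),
          round(stats["majority_baseline"], 4))
```

In both groups roughly 27% to 28% of patients reach a normal ESR, so a classifier's accuracy is more informative when read alongside its sensitivity and specificity.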
Single Classifier | Correctly Classified Instances, μ (σ) | Correctly Classified Instances, Max/Min | Sensitivity, μ (σ) | Sensitivity, Max/Min | Specificity, μ (σ) | Specificity, Max/Min |
---|---|---|---|---|---|---|
Simple Logistic | 0.7927 (0.019) | 0.8198/0.7604 | 0.792 (0.059) | 0.874/0.706 | 0.793 (0.054) | 0.866/0.705 |
SMO | 0.7829 (0.018) | 0.8162/0.7500 | 0.783 (0.076) | 0.901/0.679 | 0.783 (0.069) | 0.878/0.667 |
J48 | 0.9094 (0.042) | 0.9519/0.8098 | 0.908 (0.047) | 0.959/0.785 | 0.911 (0.039) | 0.961/0.829 |
Single Classifier | Correctly Classified Instances, μ (σ) | Correctly Classified Instances, Max/Min | Sensitivity, μ (σ) | Sensitivity, Max/Min | Specificity, μ (σ) | Specificity, Max/Min |
---|---|---|---|---|---|---|
Simple Logistic | 0.9393 (0.068) | 1.000/0.800 | 0.935 (0.090) | 1.000/0.667 | 0.936 (0.083) | 1.000/0.743 |
SMO | 0.9290 (0.072) | 1.000/0.7753 | 0.907 (0.109) | 1.000/0.602 | 0.920 (0.087) | 1.000/0.755 |
J48 | 0.9812 (0.024) | 1.000/0.9012 | 0.984 (0.024) | 1.000/0.892 | 0.978 (0.025) | 1.000/0.909 |
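The three measures reported in the evaluation tables follow directly from a confusion matrix, and the μ (σ) and Max/Min columns summarize them over repeated evaluations. The sketch below illustrates that arithmetic under stated assumptions: the function names are hypothetical, the confusion-matrix counts are invented for illustration and are not the study's results, and the use of the sample standard deviation is an assumption because the evaluation protocol is not described here.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # correctly classified instances
        "sensitivity": tp / (tp + fn),    # true positive rate
        "specificity": tn / (tn + fp),    # true negative rate
    }


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Mean, sample standard deviation, maximum, and minimum of a metric."""
    return {"mean": mean(values), "sd": stdev(values),
            "max": max(values), "min": min(values)}


# Hypothetical counts for three evaluation runs (not taken from the study).
runs = [
    classification_metrics(tp=80, fn=20, tn=90, fp=10),
    classification_metrics(tp=70, fn=30, tn=90, fp=10),
    classification_metrics(tp=90, fn=10, tn=90, fp=10),
]
accuracy_summary = summarize([run["accuracy"] for run in runs])
print(accuracy_summary)
```

Reporting sensitivity and specificity separately, as the tables do, shows whether a classifier's accuracy comes from both classes or mainly from the more frequent one.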
Variable | RA Group | RA-COM Group |
---|---|---|
GPT | 13 | 13 |
Number | 14 | 12 |
Age | 12 | 10 |
Leflunomide | 10 | 3 |
Hydroxychloroquine | 11 | 8 |
Cyclosporine | 7 | 7 |
IF_Biologics_i | 6 | 5 |
Methotrexate | 8 | 2 |
Sulfasalazine | 9 | 9 |
IF_NSAIDs | 5 | 4 |
Steroids | 4 | 14 |
RF_IgM | 3 | 1 |
Day_d (Medication Time) | 2 | 11 |
Sex | 1 | 6 |
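Reading the numbers in this table as importance ranks, with 1 as the most important variable, reproduces the four variables discussed for each group in Section 4.3. That reading is an assumption inferred from the discussion rather than something the table states. The sketch below selects the top four variables per group; the names `RANKS` and `top_variables` are hypothetical and drug names use their standard spellings.

```python
from typing import Dict, List, Tuple

# (RA Group rank, RA-COM Group rank) copied from the table above.
RANKS: Dict[str, Tuple[int, int]] = {
    "GPT": (13, 13),
    "Number": (14, 12),
    "Age": (12, 10),
    "Leflunomide": (10, 3),
    "Hydroxychloroquine": (11, 8),
    "Cyclosporine": (7, 7),
    "IF_Biologics_i": (6, 5),
    "Methotrexate": (8, 2),
    "Sulfasalazine": (9, 9),
    "IF_NSAIDs": (5, 4),
    "Steroids": (4, 14),
    "RF_IgM": (3, 1),
    "Day_d": (2, 11),
    "Sex": (1, 6),
}


def top_variables(group_index: int, k: int = 4) -> List[str]:
    """Return the k best-ranked variables; 0 = RA Group, 1 = RA-COM Group."""
    ordered = sorted(RANKS.items(), key=lambda item: item[1][group_index])
    return [name for name, _ in ordered[:k]]


print("RA Group:", top_variables(0))
print("RA-COM Group:", top_variables(1))
```

Only RF_IgM appears in the top four of both groups, which matches its treatment in both discussion subsections.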
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Wu, C.-T.; Lo, C.-L.; Tung, C.-H.; Cheng, H.-L. Applying Data Mining Techniques for Predicting Prognosis in Patients with Rheumatoid Arthritis. Healthcare 2020, 8, 85. https://doi.org/10.3390/healthcare8020085