Using an Artificial Intelligence Approach to Predict the Adverse Effects and Prognosis of Tuberculosis

Background: Tuberculosis (TB) is one of the leading causes of death worldwide and a major cause of ill health. Without treatment, the mortality rate of TB is approximately 50%; with treatment, most patients with TB can be cured. However, anti-TB drug treatments may result in many adverse effects. Therefore, it is important to detect and predict these adverse effects early. Our study aimed to build models using an artificial intelligence/machine learning approach to predict acute hepatitis, acute respiratory failure, and mortality after TB treatment. Materials and Methods: Adult patients (age ≥ 20 years) who had a TB diagnosis and received treatment from January 2004 to December 2021 were enrolled in the present study. Thirty-six feature variables were used to develop the predictive models with AI. The data were randomly stratified into a training dataset for model building (70%) and a testing dataset for model validation (30%). The algorithms used were XGBoost, random forest, MLP, LightGBM, logistic regression, and SVM. Results: A total of 2248 TB patients in Chi Mei Medical Center were included in the study; 71.7% were males, and the other 28.3% were females. The mean age was 67.7 ± 16.4 years. The models built with the six AI algorithms all had a high area under the receiver operating characteristic curve (AUC) in predicting acute hepatitis, respiratory failure, and mortality, with AUCs ranging from 0.766 to 0.920, 0.797 to 0.884, and 0.737 to 0.834, respectively. Conclusions: Our AI models were good predictors and can provide clinicians with a valuable tool to detect the adverse prognosis in TB patients early.


Introduction
Tuberculosis (TB) is an infectious disease that spreads directly from one person to another and is a major cause of morbidity and mortality worldwide. It is also one of the leading causes of death from a single infectious disease and is more prevalent than human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS) worldwide [1].
Patients infected with TB can be effectively treated with anti-TB medication, and the drug regimen, dosage, and length of treatment period depend on whether it is a drug-resistant strain, what comorbidities are present (diabetes, HIV, liver disease, renal disease, etc.), and where the infection is located in the body [2]. Most tuberculosis medications can be toxic to the liver and have the adverse effect of hepatitis. Therefore, when patients take these anti-TB medications, physicians need to monitor the patient's liver enzymes and be aware of the risk of hepatitis. For example, Ramappa and Aithal [3] found that the TB medications that can cause hepatitis include isoniazid, rifampicin, and pyrazinamide. Patients with TB who develop hepatitis during treatment may need to change TB medications if the hepatitis is severe. Regarding respiratory failure, Elhidsi et al. [4] found that most patients with TB and acute respiratory failure were newly diagnosed and had advanced lesions and hypoxemic-type respiratory failure. The independent risk factors for in-hospital mortality were severe hypoxemia and kidney injury. Another study [5] showed that advanced age and the presence of shock unrelated to sepsis were independently associated with mortality after multivariate analysis.
Artificial intelligence (AI)-based computer programs can assist hospitals in reading chest radiographs in a timely fashion. These programs perform similarly to expert physicians and radiologists, with high sensitivity in detecting TB disease, and help determine which patients need further examination [6].
Recently, most studies have used AI and machine learning (ML) models to diagnose TB and explore the data characteristics and features used for algorithm accuracy [7]. Few studies have focused on predicting adverse outcomes such as mortality and treatment failure [8][9][10]. Our study aimed to use AI/ML models to detect hepatitis, respiratory failure, and mortality early in patients with TB after receiving anti-TB medications.

Feature and Outcome Variables
We chose three outcome variables for the prediction models: (1) acute hepatitis, (2) acute respiratory failure, and (3) all-cause mortality during treatment.
The death certificate data were obtained through a formal application to Taiwan's Health and Welfare Data Science Center.
The diagnosis of acute hepatitis required at least one of the following criteria to be met:
Condition 1: Alanine aminotransferase (ALT, GPT) or aspartate aminotransferase (AST, GOT) is three times (or higher) the upper limit of the normal range during the treatment period.
Condition 2: ALT (GPT) or AST (GOT) is more than twice the original value during the treatment period.
The treatment period is from the date of starting the TB medication to the date of the completion of the TB treatment. The normal value of ALT (GPT) is 41 U/L; the normal value of AST (GOT) is 31 U/L; and the normal value of T-Bil is 1.2 mg/dL in our laboratory.
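As an illustration, the two diagnostic conditions can be written as a small rule. This is a minimal sketch and not the code used in the study. It assumes that the quoted normal values (41 U/L and 31 U/L) act as the upper limits of normal, and that the "original value" is the measurement taken before treatment started. The names `LiverPanel` and `meets_acute_hepatitis_criteria` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence

# Laboratory values quoted in the text, assumed here to be upper limits of normal.
ALT_UPPER_LIMIT = 41.0  # U/L
AST_UPPER_LIMIT = 31.0  # U/L


@dataclass
class LiverPanel:
    """One measurement of liver enzymes (hypothetical container)."""
    alt: float  # ALT (GPT), U/L
    ast: float  # AST (GOT), U/L


def meets_acute_hepatitis_criteria(
    baseline: LiverPanel, during_treatment: Sequence[LiverPanel]
) -> bool:
    """Return True if any on-treatment measurement satisfies Condition 1 or 2."""
    for panel in during_treatment:
        # Condition 1: ALT or AST at least three times the upper limit of normal.
        condition_1 = (
            panel.alt >= 3 * ALT_UPPER_LIMIT or panel.ast >= 3 * AST_UPPER_LIMIT
        )
        # Condition 2: ALT or AST more than twice the pre-treatment value.
        condition_2 = panel.alt > 2 * baseline.alt or panel.ast > 2 * baseline.ast
        if condition_1 or condition_2:
            return True
    return False
```

For example, a patient with a baseline ALT of 20 U/L whose ALT rises to 45 U/L during treatment would be flagged by Condition 2 in this sketch, although the value is far below three times the upper limit.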

Model Building and Evaluation
We used all the variables to build the prediction models to maximize model performance without performing any feature selection preprocessing. The data were randomly stratified into a training dataset (70%) and a testing dataset (30%). The SMOTE method (synthetic minority oversampling technique) [11] was used to fix the data imbalance due to the fewer related positive classes (outcomes to be predicted, such as mortality) in the training dataset. The model of each outcome was built with 6 machine learning algorithms, including (1) multilayer perceptron (MLP), (2) LightGBM, (3) random forest, (4) XGBoost, (5) logistic regression, and (6) support vector machine (SVM).
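The split-then-oversample order described above can be sketched as follows. The software used in the study is not stated in the text, so this example uses only the Python standard library and a simplified SMOTE-style interpolation as a stand-in. It assumes binary labels with class 1 as the minority class; the function names, the `k=5` neighbour count, and the seeds are illustrative assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], int]  # (feature vector, binary label)


def stratified_split(
    rows: List[Row], test_fraction: float = 0.3, seed: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Split rows so that each class keeps roughly the same train/test ratio."""
    rng = random.Random(seed)
    train: List[Row] = []
    test: List[Row] = []
    for label in (0, 1):
        group = [row for row in rows if row[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def smote_like_oversample(train: List[Row], k: int = 5, seed: int = 0) -> List[Row]:
    """Add synthetic minority rows (label 1) until both classes are the same size.

    Each synthetic row lies on the segment between a minority row and one of
    its k nearest minority neighbours, which is the core idea behind SMOTE.
    """
    rng = random.Random(seed)
    minority = [x for x, y in train if y == 1]
    majority_count = sum(1 for _, y in train if y == 0)
    if len(minority) < 2:
        return list(train)  # not enough minority rows to interpolate between
    synthetic = []
    while len(minority) + len(synthetic) < majority_count:
        base = rng.choice(minority)
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: math.dist(base, m),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return list(train) + [(s, 1) for s in synthetic]
```

Oversampling is applied only to the training rows returned by `stratified_split`, so the testing dataset keeps its original class balance and contains no synthetic patients.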
We used a grid search with 10-fold cross-validation to build the best models based on the training dataset. We then used the testing dataset to evaluate the models with the performance indicators of accuracy, sensitivity, specificity, and AUC (area under the receiver operating characteristic curve).
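For reference, the four performance indicators can be computed from the testing-set labels and model outputs as in the sketch below. It is a plain standard-library illustration, not the evaluation code of the study. It assumes binary labels (1 = outcome occurred) and that both classes are present in the data; the function names are hypothetical.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative case.

    Every (positive, negative) pair is compared; ties count as one half.
    This pairwise form equals the area under the ROC curve.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Accuracy, sensitivity, and specificity depend on the probability threshold used to turn scores into predictions, whereas the AUC is computed from the scores directly and is threshold-independent.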

Results
After removing data with missing values, 2248 patients were enrolled for model building. The data distribution and significance are summarized in Table 1, showing that the mean age of the patients was 67.7 years, 71.7% were males, and 28.3% were females. According to the Spearman correlation analysis (Figure 2), the features most relevant to acute hepatitis were S-GOT, S-GPT, and T-Bil before hepatitis; those most relevant to acute respiratory failure were WBC, BUN, and age; and those most relevant to mortality were BUN, WBC, and age. We used six machine learning algorithms to build the three outcome-predictive models of acute hepatitis, acute respiratory failure, and mortality. The results showed that the MLP algorithm obtained the highest AUC value (0.834) for the mortality prediction model (see Table 2 and Figure 3), the random forest algorithm had the highest AUC value (0.884) for acute respiratory failure (see Table 3 and Figure 4), and the XGBoost algorithm had the highest AUC value (0.920) for acute hepatitis (see Table 4 and Figure 5).
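The Spearman correlation used to rank feature relevance is the Pearson correlation of the ranked values. The sketch below shows one way to compute it with the standard library only; it is an illustration under that definition rather than the analysis code of the study, and the function names are hypothetical. Tied values receive the average of the ranks they span.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx)
    spread_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / (spread_x * spread_y) ** 0.5
```

Because only ranks are used, the coefficient captures monotonic rather than strictly linear association, which suits skewed laboratory values such as BUN or WBC counts.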

Discussion
To our knowledge, this is the first study to use AI and ML models for the simultaneous early detection of acute hepatitis, respiratory failure, and mortality in patients with TB after receiving anti-TB medications.
Our study included common clinical information and demographic data, such as age, sex, WBC, Hb, platelet count, BUN, creatinine, AST (GOT), ALT (GPT), bilirubin, comorbidities, and TB medication, to predict acute hepatitis, respiratory failure, and mortality in patients with TB after receiving TB medication. We also comprehensively included comorbidities, such as diabetes, hypertension, dyslipidemia, ESRD, CVA, dementia, CHF, COPD, asthma, malignancy, autoimmune disease, HIV, history of liver cirrhosis, hepatitis, old TB, and presence of pleural effusion in the models. With soft computing techniques, electronic medical record systems can retrieve this information, and a clinician is not required to survey and rearrange the examination data. Moreover, our study evaluated laboratory data and systemic diseases in predicting TB patients' prognosis.
We compared our work with previous related studies on predicting adverse outcomes of TB patients [8][9][10][12][13]. Our predictive models were built on features drawn from the literature and from practical availability, and the best model for each outcome achieved a high AUC (0.834 to 0.920), which makes them worth developing into a predictive tool to assist in clinical decision-making. We summarized the comparison of these works in Table 5.

Acute Hepatitis
Luo et al. [14] enrolled patients with active TB and latent TB infection in China and, based on multiple laboratory data, used different ML models to distinguish the patients' TB infection status. Nijiati et al. [15] used a three-dimensional model to detect lung field regions in CT images and ML methods to classify and differentiate active from nonactive pulmonary TB. With AI assistance, radiologists working in this field can truly help potential patients. Another study [16] used AI to train models to interpret chest X-ray images and achieved high accuracy. These recent studies used AI and ML to detect TB early and did not address how to detect hepatitis in patients with TB.
Risk factors for hepatitis after TB treatment have been assessed in an observational study [17]. Among the various risk factors assessed, extensive disease, old age, excessive alcohol use, and slow acetylator phenotype were risk factors for hepatitis in patients who received anti-TB drugs. A study enrolled 765 patients who received anti-TB treatment and found that the risk factors for hepatotoxicity included advanced age, female sex, extensive tuberculosis, and no alcohol consumption [18]. In our population, there was no significant difference in age between TB patients with and without acute hepatitis. Wang et al. showed that age and hepatitis B infection were important risk factors for hepatitis in patients with TB via a multiple logistic regression analysis [19]. Extra-pulmonary TB, advanced age, and comorbidities were found to be significant predictors of the development of hepatitis in studies using multivariable logistic regression analyses [20]. However, our study showed no significant difference between extra-pulmonary and intra-pulmonary TB among patients with acute hepatitis. Approximately 12% of the patients died after the development of anti-TB drug-induced hepatitis [21]. In our study, most of these factors, together with patient laboratory data taken before acute hepatitis had developed, were included in the models. However, alcohol intake was not included because this was a retrospective study and alcohol intake was not recorded in our electronic medical records. From our data, we found that a high proportion of patients with TB and acute hepatitis had a history of hepatitis and liver cirrhosis. Furthermore, with the aid of these variables and ML, the XGBoost model still had a high accuracy of 0.868, a sensitivity of 77.9%, a specificity of 92.5%, and an AUC of 0.920. In addition to achieving higher accuracy in detecting hepatitis during the TB treatment course, we can detect hepatitis early in patients with TB.
With the aid of AI, physicians can be aware of the risk of hepatitis and more frequently and intensively monitor liver function before this adverse effect occurs.

Acute Respiratory Failure
Despite the availability of effective anti-TB medications, TB, as a cause of respiratory failure requiring mechanical ventilation, is often associated with acute respiratory distress syndrome, which leads to a high mortality rate [22]. A study enrolled 41 patients with TB in Taiwan from January 1996 to April 2001; 27 (65.9%) died in the hospital, with a mean ± SD of 40.7 ± 35.4 admission days before death, and 14 survived. The mortality rate for the 180-day monitoring period was 79% [23]. The multivariate analysis found that old age, multiple organ failure, and shock unrelated to sepsis were related to poor outcomes [5]. Therefore, earlier identification of patients with TB who are at risk of acute respiratory failure, particularly those with complex diseases and multiple comorbidities, is important for clinical care. We found that our patients with TB and respiratory failure requiring mechanical ventilation had a higher proportion of CVA, dementia, CHF, COPD, and TB presenting with pleural effusion. These baseline comorbidities may reflect fragility in patients with TB, and the elderly with these comorbidities are more vulnerable to respiratory failure. TB pleural effusion is a common condition, and its treatment is the same as for pulmonary TB. Most TB patients with pleural effusion have a benign course after treatment, with mild-to-no residual effects. The literature on acute respiratory failure related to TB with pleural effusion is scarce. Further studies are required to determine whether pleural effusion is meaningful in TB patients with acute respiratory failure.
The early signs of acute respiratory failure may be uncertain in some laboratory test results. Predictive models using AI that integrate and leverage multiple variables could help identify areas of uncertainty, and this identification would likely occur before any noticeable physical symptoms appear. By incorporating ML into laboratory data, routine results can be merged with other relevant patient characteristics, such as age, sex, and comorbidities, for use within disease-specific AI models. By integrating patient characteristics and laboratory data, there is a potential to generate probability scores for acute respiratory failure to help alert clinicians. The accuracy, sensitivity, specificity, and AUC of our random forest model were 0.819, 0.812, 0.820, and 0.884, respectively. With more patient information and collaboration among healthcare institutions, ML and computerized reasoning can be used to develop AI-driven clinical decision support tools that can potentially aid clinicians in making prompt and correct decisions before TB patients experience acute respiratory failure.

Mortality
TB reduces patients' long-term survival even after successful treatment, and this reduction persists in long-term follow-up even after accounting for acute TB-related mortality [24]. The survival rate at 11 years was 70% after successful TB treatment, and the probability of survival was 46% in the age group of 55 years and older after 11 years of follow-up [25]. Another study also showed that mortality in the TB cohort was 2.3 times higher than in the general population after age matching. Most mortality occurred in the first year after completing treatment [26]. During our 17-year follow-up, we enrolled 2128 patients with TB, and there were 120 deaths during this period. The predictive model of MLP had an accuracy of 0.735, a sensitivity of 0.722, a specificity of 0.736, and an AUC of 0.834. Our data showed that extra-pulmonary TB had a low risk of mortality, and that the comorbidities of hypertension, CVA, dementia, CHF, and COPD were associated with mortality among patients with TB compared with those without these comorbidities. A previous study showed that advancing age and drug resistance were the features most associated with risk of death, whereas male sex, European origin, pulmonary site of TB infection, and previous history of anti-TB treatment were weaker predictors [27], but our data are inconsistent with their results. The median age in their study was 43 years, with 2% aged < 15 years and 24% aged ≥ 60 years; 5% of patients had multidrug resistance, and most cases were European (68%). The inconsistencies may result from differences between countries, presumably reflecting differences in patient characteristics and in the drug susceptibility of TB. These outcomes are important for patients with TB and their medical teams. Focusing on patient-centered care and the early prediction of adverse drug effects, respiratory failure, and mortality in patients receiving TB treatment could contribute to the optimal use of medical resources.
In addition to accurate and prompt diagnosis of TB, it is important to detect the possible risk of hepatitis in patients who receive TB treatment as early as possible so that the culprit medicine can be discontinued to improve the patient's outcome.
Our study also has some limitations that need to be addressed and explored. First, our patients were from southern Taiwan and may differ from TB patients in other countries, so our models may not be representative of other countries. Further studies in other areas with more hospitals may be needed for more representative results. Second, alcohol use in TB patients is considered an important risk factor for hepatitis. Because the study is retrospective, data on smoking and drinking are missing, and we could not include these factors because the information could not be accurately obtained from electronic medical records. Third, the model created in the current study lacks TB drug susceptibility data, and future studies should incorporate drug susceptibility testing. Fourth, the direct cause of death in patients with TB could not be determined. Thus, our results may not generalize to the broader population of patients with TB.

Conclusions
In conclusion, we created a model based on laboratory data and patient characteristics that has significant value in the early detection of hepatitis, respiratory failure, and mortality in patients with TB who received anti-TB treatment.

Informed Consent Statement:
Patient consent was waived due to the retrospective nature of the study.

Data Availability Statement:
The dataset used for this study is available on request to the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.