Comorbidity in Older Patients Hospitalized with Cancer in Northeast China based on Hospital Discharge Data

Patients with cancer often carry the dual burden of the cancer itself and other co-existing medical conditions. The problems associated with comorbidities are more prominent among elderly cancer patients than among younger patients. This study aimed to identify common cancer-related comorbidities in elderly patients through routinely collected hospital discharge data and to use association rules to analyze the prevalence and patterns of these comorbidities in elderly patients with cancer at different sites. We collected the discharge data of 80,574 patients who were diagnosed with cancers of the esophagus, stomach, colorectum, liver, lung, female breast, cervix, and thyroid between 2016 and 2018. The same number of non-cancer patients were randomly selected as the control group and matched with the case group by age and gender. The results showed that cardiovascular diseases, metabolic diseases, digestive diseases, and anemia were the most common comorbidities in elderly patients with cancer. The comorbidity patterns differed by cancer site. Elderly patients with liver cancer had the highest risk of comorbidities, followed by those with lung cancer, gastrointestinal cancer, thyroid cancer, and reproductive cancer. For example, elderly patients with liver cancer had a higher risk of comorbid infectious and digestive diseases, whereas patients with lung cancer had a higher risk of comorbid respiratory system diseases. The findings can assist clinicians in diagnosing comorbidities and contribute to the allocation of medical resources.


Introduction
The incidence of cancer (and cancer mortality rates) is rising rapidly worldwide due to the aging population [1]. Cancer is the leading cause of death and produces a heavy disease burden in China [2]. It has been reported that more than one-half of patients with cancer >65 years of age carry the dual burden of cancer itself and other co-existing chronic conditions [3,4]. The presence of multiple medical conditions in an individual is referred to as comorbidity [5]. Comorbidities potentially affect every stage of the cancer spectrum, from diagnosis, through treatment, to outcome [6]. Patients with comorbidity are substantially more likely to experience complicated treatment, increased cost of care, decreased quality of life, and lower survival probabilities than those without comorbidity [3,7]. Therefore, understanding cancer comorbidities can help to clarify the pathogenesis of comorbidities, promote their prevention and control, and assist health administration departments in evaluating the status of comorbidities so as to optimize the distribution and utilization of medical resources.
Many population-based surveys and clinical studies have attempted to explore the prevalence of comorbidities and the impact of comorbidities on health care or survival outcomes, such as identifying the comorbidity patterns of mental diseases [8,9], and assessing the impact of comorbidity on health care or outcomes of chronic diseases in the elderly [10,11]. Researchers have attempted to determine the risk of comorbidities in cancer patients [3,12], focusing on specific disorders related to cancers, such as cardiovascular and cerebrovascular diseases [13,14], hypertension [15,16], other associated complications, and/or specific populations with cancer, such as the elderly [17,18]. The overall pattern of cancer comorbidities has not been established to date because survey data are relatively small in size, usually focus on specific disorders, and sometimes include inadequate information on diagnosis and treatment. Therefore, there is a need for comprehensive information from large long-term datasets to improve our understanding of the prevalence of cancer-related comorbidities and analyze the comorbidity patterns.
With advances in information technology, the emergence of electronic medical record (EMR) systems has made it possible to use clinical information for disease relationship mining. Hospital discharge data, as a type of administrative data derived from EMR, are becoming one of the available data sources for assessing disease comorbidities [19,20], with the discharge diagnosis codes assigned by trained physicians following standard guidelines. The use of EMR data for comorbidity analysis has therefore gradually attracted the attention of researchers, for example in identifying important comorbidities among cancer patients [21], analyzing the impact of comorbidities on cancer care and outcomes [22], and assessing comorbidities of substance abuse [23,24]. Most previous studies used statistical analysis methods, such as relative risk and ϕ-correlation, to mine comorbidity patterns. However, both of these measures mainly consider pairwise relationships and cannot capture all of the comorbid associations [25]. To detect the co-occurrence relationships more completely, association rule mining (ARM) [26] was used in the current study to identify the comorbidity patterns of cancer patients. ARM is an important data mining technology that is used to mine associations between data items in a large amount of data [25,27]. ARM makes it possible to analyze associations not only between two diseases but also among three or more comorbidities, and these associations can be calculated from existing statistics.
In this study, we used hospital discharge data derived from 16 tertiary hospitals in northeast China between 2016 and 2018 to identify important cancer-related comorbidities and estimate the prevalence and patterns of these comorbidities among elderly cancer patients with diverse cancer sites.

Study Population and Data Source
The 10th revision of the International Classification of Diseases (ICD-10) [28] is used in the public hospital diagnosis system of Jilin Province. All categories of comorbidities in the current study followed the original categories of the ICD-10 system. The cancers selected for this study were cancers of the esophagus (ICD-10 code C15), stomach (ICD-10 code C16), colorectum (ICD-10 codes C18-C20), liver (ICD-10 code C22), lung (ICD-10 code C34), female breast (ICD-10 code C50), cervix (ICD-10 code C53), and thyroid (ICD-10 code C73), which are the predominant malignancies based on incidence and mortality [29]. Jilin Province is located in northeast China and had 27.04 million permanent residents in 2018. Because of environmental factors and lifestyle habits, such as serious air pollution, frequent consumption of pickled cabbage, smoking, and drinking, northeast China has a high incidence of cancer, and the incidence of respiratory, digestive, and reproductive tract cancers ranks among the highest in China. The data used in this study were obtained from the hospital discharge medical records of patients >60 years of age who were diagnosed with the above cancers in Jilin Province, China between 2016 and 2018. In addition, we randomly selected 80,574 elderly patients without cancer as the control group, which was matched to the case group by age and gender.
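As an illustration only, one way to draw an age- and gender-matched control group is stratified random sampling without replacement. The sketch below assumes exact-age matching and uses hypothetical record fields (`id`, `age`, `gender`) and a hypothetical helper `match_controls`; the text does not describe the matching procedure at this level of detail.

```python
import random
from collections import Counter, defaultdict


def match_controls(cases, candidates, seed=0):
    """Draw one control per case from the same (age, gender) stratum.

    `cases` and `candidates` are lists of dicts with hypothetical keys
    "id", "age", and "gender"; exact-age matching is an assumption.
    """
    rng = random.Random(seed)
    # Count how many cases fall in each (age, gender) stratum.
    needed = Counter((p["age"], p["gender"]) for p in cases)
    # Pool the non-cancer candidates by the same stratum key.
    pool = defaultdict(list)
    for p in candidates:
        pool[(p["age"], p["gender"])].append(p)
    controls = []
    for stratum, n in needed.items():
        available = pool[stratum]
        if len(available) < n:
            raise ValueError(f"not enough candidates in stratum {stratum}")
        # Sample without replacement within the stratum.
        controls.extend(rng.sample(available, n))
    return controls


# Made-up toy records, not study data.
cases = [
    {"id": "c1", "age": 65, "gender": "F"},
    {"id": "c2", "age": 65, "gender": "F"},
    {"id": "c3", "age": 72, "gender": "M"},
]
candidates = [
    {"id": f"n{i}", "age": age, "gender": gender}
    for i, (age, gender) in enumerate(
        [(65, "F"), (65, "F"), (65, "F"), (72, "M"), (72, "M"), (80, "M")]
    )
]
controls = match_controls(cases, candidates)
```

Sampling within strata keeps the joint age and gender distribution of the controls identical to that of the cases, which is the property the matched design relies on.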
The hospital discharge medical records included the following data: demographic data, such as age and gender; disease diagnoses; and medication. The diagnostic data consisted of one primary diagnosis and up to 15 secondary diagnoses, which were coded by trained coders using ICD-10. We used the demographic and diagnostic data to analyze the co-morbid relationships with cancers. To ensure data quality, we only included the medical records of patients from tertiary hospitals. Ethical approval to conduct this study was obtained from the Ethics Committee of the School of Public Health at Jilin University (Jilin, China) (grant number: ethical review 2020-02-01).

Statistical Analysis
The characteristics of patients were summarized using frequency distributions and proportions. We used quartiles to describe the number of comorbidities in patients, presented as median (interquartile range (IQR)). The prevalence ratios (PRs) and 95% confidence intervals (CIs) were calculated for categories of co-morbid diseases, which were based on the categories of ICD-10. Distributions of categories of co-morbid diseases were compared between cancer and non-cancer patients using chi-square tests. Differences were considered significant if the p-value was ≤ 0.01. We adopted the Apriori algorithm, which is the best-known ARM algorithm, to extract and analyze the patterns of cancer comorbidities. The Apriori algorithm is a frequent itemset algorithm for mining association rules. The association rules are evaluated by support (the proportion of all patients in whom disease X and disease Y co-occur) and confidence (the proportion of patients with disease X who also have disease Y). The formulas for support and confidence are presented below.

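\[
\mathrm{Support}(X \rightarrow Y) = \frac{\text{Number of patients with } X \text{ and } Y}{\text{Total number of patients}}
\]
\[
\mathrm{Confidence}(X \rightarrow Y) = \frac{\text{Number of patients with } X \text{ and } Y}{\text{Number of patients with } X}
\]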
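As a rough illustration, support, confidence, and the level-wise frequent-itemset search of the Apriori algorithm named above can be sketched in a few lines of Python. The records are made-up toy data that reuse ICD-10 codes mentioned in this article; the function names and the 0.4 threshold are illustrative assumptions, and this is not the implementation used in the study.

```python
from itertools import combinations


def support(itemset, records):
    """Fraction of records (patients) that contain every code in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= r for r in records) / len(records)


def confidence(x, y, records):
    """Support of X and Y together divided by the support of X."""
    return support(set(x) | set(y), records) / support(x, records)


def apriori(records, min_support):
    """Level-wise search for all itemsets with support >= min_support."""
    items = sorted({code for r in records for code in r})
    level = [frozenset([i]) for i in items
             if support([i], records) >= min_support]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s, records)
        k += 1
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        }
        level = [c for c in candidates if support(c, records) >= min_support]
    return frequent


# Made-up toy records: each set holds the diagnosis codes of one patient.
records = [
    frozenset({"I10", "I25", "I50"}),
    frozenset({"I25", "I50"}),
    frozenset({"I10", "J18"}),
    frozenset({"I10", "I25", "I50"}),
    frozenset({"K29"}),
]
pair_support = support({"I25", "I50"}, records)
pair_confidence = confidence({"I25"}, {"I50"}, records)
frequent = apriori(records, min_support=0.4)
```

In this toy example the pair I25 and I50 appears in three of five records, and every record with I25 also has I50. Rules would then be kept only if they also exceed the chosen confidence threshold.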
The basic premise of the algorithm is to first find all frequent itemsets, i.e., itemsets whose support is at least the pre-defined minimum support, and then to generate strong association rules from those itemsets that satisfy both the minimum support and the minimum confidence. The advantages of the Apriori algorithm are that its structure is simple and easy to understand and that it requires no complicated derivation. The result of the algorithm is a list of patterns between two sets of diseases in the form of "X→Y," which indicates that if disease X exists, disease Y co-exists. Although each pattern is directed with an arrow, the arrow does not imply causation between diseases; it only represents co-occurrence. To avoid confusion, we ignored the directions of the patterns and considered all diseases in sets X and Y to be associated. Herein, thresholds of support >0.01 and confidence >0.5 were used according to the performance on validation diseases.

Results
Figure 1 and Table 1 present the age, gender, and number of comorbidities in the cancer and non-cancer study groups, each of which comprised 80,574 patients. The 80,574 cancer patients were distributed across the specific cancers as follows: esophagus, 3088; gastric, 8220; colorectal, 16,961; liver, 8710; lung, 28,282; breast, 11,231; cervix, 2623; and thyroid, 1459. Setting aside malignant tumors of the female reproductive system (breast and cervix), patients with thyroid cancer were more likely to be female, whereas patients with the other cancers were more likely to be male; these distributions were consistent with expectations. The patients in the cancer and non-cancer study groups were stratified using the following age brackets (in years): 60-69; 70-79; and ≥ 80. The largest age group was 60-69 years, which accounted for > 60% of patients for each type of cancer. In addition, the same analysis was performed on the control group.
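The age stratification and the median (IQR) summaries used in this article could be computed along the following lines. This is a minimal sketch on made-up numbers; the helper names are hypothetical, and the "inclusive" quartile method is an assumption because the text does not state which quartile definition was used.

```python
from collections import defaultdict
from statistics import median, quantiles


def age_bracket(age):
    """Map an age in years to the brackets used in the text."""
    if age >= 80:
        return ">=80"
    if age >= 70:
        return "70-79"
    if age >= 60:
        return "60-69"
    raise ValueError("patient is younger than 60")


def median_iqr(counts):
    """Return (median, Q1, Q3) for a list of comorbidity counts."""
    q1, _, q3 = quantiles(counts, n=4, method="inclusive")
    return median(counts), q1, q3


# Made-up (age, number of comorbidities) pairs, not study data.
patients = [(62, 1), (65, 3), (68, 2), (64, 4), (69, 5), (71, 4), (75, 6), (83, 5)]
by_bracket = defaultdict(list)
for age, n_comorbid in patients:
    by_bracket[age_bracket(age)].append(n_comorbid)

# Quartiles need at least two observations, so singleton brackets are skipped.
summary = {b: median_iqr(c) for b, c in by_bracket.items() if len(c) >= 2}
```

Reporting the median with the first and third quartiles, rather than a mean, suits comorbidity counts because they are discrete and typically right-skewed.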
As a group, the cancer patients had a median of three comorbidities (IQR, 1-6), and the number of comorbidities tended to be higher in the cancer group than in the controls (median, 2; IQR, 1-4). Patients with liver (median, 4; IQR, 2-6) and lung cancers (median, 4; IQR, 2-7) were more likely to have higher numbers of comorbidities than patients with other cancers. Females with digestive system cancers had more comorbidities than males (median, 3; IQR, 1-5 vs. median, 2; IQR, 1-5). Overall, the number of comorbidities increased with age in patients with cancer. Specifically, the number of comorbidities in patients with thyroid cancer increased with age (from median, 2; IQR, 1-4 to median, 5; IQR, 2-8).
Table 2 lists the prevalence of comorbidities and PRs in elderly patients with and without cancer. Table 3 presents the PRs of co-morbid diseases for each type of cancer compared with the control group. We omitted cancer metastases because we considered metastases to represent an advanced stage of cancer rather than an independent comorbidity. The most prevalent comorbidities among elderly cancer patients were circulatory system (35.11%), digestive system (29.60%), and metabolic diseases (24.20%). However, the prevalence of circulatory system and metabolic diseases was lower in cancer patients than in non-cancer patients (PR < 1), and the same result was obtained for all types of cancer. Infectious diseases, blood system diseases, respiratory diseases, digestive diseases, and symptoms, signs and ill-defined conditions exhibited higher PRs when comparing elderly patients with and without cancer. Higher PRs of comorbid infectious diseases, blood system diseases, and digestive diseases were found in patients with cancer, especially esophageal, stomach, colorectal, liver, and lung cancers. Comorbid respiratory diseases showed higher PRs in esophageal cancer and lung cancer.
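For readers who wish to reproduce this kind of comparison, a prevalence ratio with a 95% CI and a chi-square test for a single comorbidity category can be sketched as follows. The counts are invented for illustration; the log-based (Katz) interval and the uncorrected Pearson statistic are assumptions, since the text does not specify which variants were used.

```python
import math


def prevalence_ratio(a, n1, b, n0, z=1.96):
    """PR of a comorbidity in cancer vs. non-cancer patients with a 95% CI.

    a / n1: patients with the comorbidity / all patients in the cancer group
    b / n0: the same counts in the control group
    The CI uses the log (Katz) method, which is an assumption.
    """
    pr = (a / n1) / (b / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    lower = math.exp(math.log(pr) - z * se)
    upper = math.exp(math.log(pr) + z * se)
    return pr, lower, upper


def chi_square_2x2(a, n1, b, n0):
    """Pearson chi-square statistic (1 df, no continuity correction) and p-value."""
    observed = [[a, n1 - a], [b, n0 - b]]
    total = n1 + n0
    row = [n1, n0]
    col = [a + b, total - a - b]
    chi2 = sum(
        (observed[i][j] - row[i] * col[j] / total) ** 2 / (row[i] * col[j] / total)
        for i in range(2)
        for j in range(2)
    )
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Invented counts: 300 of 1000 cancer patients vs. 200 of 1000 controls.
pr, lower, upper = prevalence_ratio(300, 1000, 200, 1000)
chi2, p_value = chi_square_2x2(300, 1000, 200, 1000)
significant = p_value <= 0.01  # threshold stated in the Statistical Analysis section
```

A PR above one with a CI that excludes one indicates that the comorbidity is more prevalent in the cancer group; a PR below one indicates the opposite.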
Figure 2 shows the number of rules and comorbidities for each cancer based on association rule analysis, which reflects the overall complexity of comorbidities in each type of cancer. The comorbidity pattern of liver cancer was the most complicated, with the largest number of comorbidities and association rules, followed by lung cancer. The numbers of comorbidities and association rules were similar for esophagus, stomach, and colorectum cancers. Cervix and breast cancers had the fewest comorbidities and association rules.
Figure 3 presents the heatmaps for association rule analyses of comorbidities co-occurring with cancer. Each row represents a different itemset of comorbidities. Each column represents a cancer-related comorbidity. Red represents the support of the itemset of co-morbid diseases; the darker the color, the higher the support. Using esophagus cancer as an example, there were seven dyads and five triads of comorbidities, with a total of 12 diseases. The most common dyad was chronic ischemic heart disease (I25) and heart failure (I50), and the most common triad was hypertension (I10), chronic ischemic heart disease (I25), and heart failure (I50).
Liver cancer involved the largest number of types of co-morbid diseases and association rules, with a total of 31 diseases and 96 rules. The categories of these diseases were as follows: infectious diseases; blood and hematopoietic organ diseases; metabolic diseases; cardiovascular diseases; respiratory diseases; digestive system diseases; genitourinary system diseases; and symptoms, signs, and ill-defined conditions. Across these categories, the rule with the highest support was chronic viral hepatitis (B18) and liver cirrhosis (K74) (support, 24.77%). Lung cancer involved fewer types of comorbid diseases than liver cancer but more than the other cancers (17 diseases and 24 rules). The categories of these diseases included metabolic, cardiovascular, respiratory, digestive, and reproductive system diseases. The rule with the highest support was chronic ischemic heart disease (I25) and heart failure (I50) (support, 6.45%).
In addition to cardiovascular and metabolic diseases, the common comorbidities of stomach and colorectal cancers were inflammatory conditions of the digestive tract, including esophagitis (K20) and gastritis and duodenitis (K29), while the common comorbidities of esophagus cancer were pneumonia (J18) and pleural effusion (J94). Cervix and breast cancers, cancers of the female reproductive system, had fewer types of comorbidities (mainly common cardiovascular and metabolic diseases).
Int. J. Environ. Res. Public Health 2020, 17, x 10 of 13

Discussion
This study was conducted to analyze the association between various types of cancers and comorbid diseases. Diagnostic data used in this study were collected from hospital discharge medical records distributed throughout Jilin province, providing data from a diverse population to comprehensively examine and characterize wide-ranging patterns of comorbidities. We used ARM to identify common comorbidities and comorbidity patterns in patients with cancer. These influential comorbidities could be targeted for specific intervention and/or screening.
The high PRs and support for blood system diseases and digestive system diseases indicated a strong association between these diseases and cancer, and the risk of these comorbidities was higher in cancer patients than in non-cancer patients. Anemia is a common issue in cancer patients and has several possible causes and contributing mechanisms. Anemia may be the result of the cancer itself, cancer treatment, blood loss, hemolysis, or inflammatory cytokines associated with chronic disease [30]. For example, some elements related to hemoglobin synthesis, such as iron, cannot be fully utilized, or endogenous erythropoietin (EPO) is relatively insufficient [31][32][33]. Inflammation is often associated with the development and progression of cancer [34]. Digestive system inflammation can induce carcinogenic mutations, increase the risk of cancer, and promote cancer initiation [35].
ARM analysis showed that cardiovascular and metabolic diseases had a high degree of support. In addition, ARM analysis yielded high-support rules between cardiovascular and metabolic diseases and each type of cancer, which is consistent with other research results [18,36,37]. However, the PRs of these diseases in cancer patients compared with non-cancer patients were significantly less than one, which suggests that the support scores of cardiovascular and metabolic diseases were highly correlated with prevalence. Diabetes, hypertension, and heart failure are the most common chronic diseases in elderly patients, and have a high prevalence and strong associations [38]. Because of the high incidence of diabetes, hypertension, and coronary heart disease, the possibility of these diseases co-occurring with other diseases in the population is increased, and the comorbid relationship between cardiovascular disease and cancer may be overestimated [25].
Our study also showed that different cancer sites have different comorbidity patterns. Liver and lung cancers had the most types of comorbidities and association rules, followed by gastrointestinal cancers, such as esophageal, gastric, and colorectal cancers. In contrast, cervical and breast cancers, cancers of the female reproductive system, had fewer comorbidities, which is in agreement with published results [21]. We therefore take the comorbidity pattern of liver cancer as an example to discuss the relationship between comorbidities and cancer. The results showed a strong association among liver cancer, cirrhosis, and chronic viral hepatitis. The most common causes of liver cancer are chronic hepatitis B and C virus infections [39], and cirrhosis is a strong risk factor for liver cancer [40]. Liver cancer is frequently accompanied by one or more metabolic diseases because metabolism is the most important function of the liver; the metabolism of sugar, protein, fat, vitamins, and electrolytes is closely related to the liver. Liver lesions occur in patients with liver cancer, leading to metabolic disorders [41,42]. These findings indicate that elderly patients with liver cancer face a higher risk of comorbid diseases. It is recommended that more attention be paid to co-morbid diseases in elderly patients with liver cancer and that the prevention and management of these diseases be strengthened.
There were several limitations to this study. We could only identify diseases that were coded during hospitalization, and there was limited information about the onset of comorbidities, so we could not determine the temporal order or causality of comorbidities. Moreover, the thresholds set for the association rules were empirical, and the results may vary depending on the selected thresholds. Due to space limitations, only rules and comorbidities with high reliability levels were analyzed. We will therefore explore the mechanisms underlying comorbidities and the occurrence and development of comorbidity patterns in more detail in subsequent studies.

Conclusions
This study applied ARM to hospital discharge data to identify an extensive list of important comorbidities in cancer patients. Our work demonstrates how clinically derived data can be used to identify cancer-related comorbidities and how the ARM algorithm can be used to analyze the comorbidities associated with cancer. This method may be widely applied to exploring the comorbidities of other chronic diseases. In the overall pattern of comorbidities, cardiovascular disease, metabolic disease, anemia, and digestive system disease were the most common comorbidities in elderly patients with cancer. The study also showed that comorbidity patterns differ among cancer sites. Elderly patients with liver cancer face the highest risk of comorbidities. These results can provide a reference for the clinical diagnosis and active prevention of cancer comorbidities, and may play a positive role in improving the quality of life of patients with cancer comorbidities.

Conflicts of Interest:
The authors declare no conflict of interest.