Relationships between Demographic Factors and Chronic Conditions with Disease Severities

Disease severities, comprising risk of mortality and severity of illness, are outcomes of an inpatient visit classification that assigns a diagnostic related group. Although these measures are widely used in healthcare, the factors affecting them have not been adequately studied. In this study, we analyze the relationships between demographics and chronic conditions and quantify their influence on disease severities. Descriptive statistics are used to investigate the relationships and the prevalence of chronic conditions. To evaluate the influence of demographic factors and chronic conditions on disease severities, we fit multinomial logistic regression models and build prediction models for disease severities using National Inpatient Sample data for 2016 provided by the Healthcare Cost and Utilization Project database in the United States. The proportion of patients with at least one chronic condition is 88.9%, and the proportion of patients with more than two chronic conditions is 67.6%; the latter proportion is 62.7% for females, 73.9% for males, and 90% for the elderly. A high level of disease severity commonly appears in patients with more than two chronic conditions, especially in the elderly. For patients without chronic conditions, disease severities show a lower or safe level, even in the elderly.


Introduction
Chronic conditions (CC) have attracted much interest in recent years because of their influence on the healthcare and treatment of patients. The term multiple chronic conditions, also referred to as comorbidity, describes patients who have at least two chronic conditions at the same time [1]. This concept has been widely used by healthcare professionals in clinical practice and health policy documents [2][3][4][5][6][7]. Multimorbidity has been used to refer to patients who have at least three chronic conditions at the same time [8]. The criterion of three chronic conditions has been considered a more valid cut-off in elderly patients treated in the ambulatory care setting than the usual criterion of two chronic conditions [8,9]. Patients with many chronic conditions often face difficulties and place a burden on health facilities; they are associated with high healthcare costs because having more than one disease requires complex disease management, including treatment and self-care [10][11][12][13][14][15][16][17]. In this study, the term chronic conditions is used simply to refer to the number of chronic conditions a patient has. This avoids confusion in implementation; moreover, the differences between patients without chronic conditions, patients with only one or two chronic conditions, and patients with three or more chronic conditions are clearly specified.
Disease severity measures, including the risk of mortality and severity of illness, are the outcomes of an inpatient visit classification system that assigns a diagnostic related group [18]. They are widely used to characterize the impact of a disease process on the utilization of resources, comorbidities, and mortality [18][19][20]. Thus, disease severity status has a high impact on mortality rates [18]. The early determination of a disease severity level helps many health facilities simultaneously examine and make the best treatment plan for patients, something that is especially important for patients with three or more chronic conditions [18,21].
Based on previous studies, the prevalence of people with multiple chronic conditions ranges from 16% to 58% in UK studies and is 26% in US studies and 9.4% in urban South Asian studies [2]. As average life expectancy increases, the number of people with many chronic conditions increases significantly [14,16,17,22]. The prevalence of chronic conditions also varies by demographic factors [13,17,23]. For instance, the proportion of black people with three or more chronic conditions is much higher than that of white people [23]. Healthcare providers have struggled to manage chronic conditions because they bring adverse effects such as severe disease, which is strongly positively correlated with disability and mortality rates [13]. Thus, it is necessary to classify and identify the characteristics of patients with each chronic condition in their epidemiological relationships with demographic factors, including age, sex, and race, and to analyze the impact of these factors on patients' disease severities. However, previous studies lack specific information about the relationship between demographic factors, chronic conditions, and patients' disease severities [18,19,21].
The purpose of this study is to identify whether there is a relationship between demographics and chronic conditions, and to provide a quantitative analysis of the influence of demographic factors and chronic conditions on patients' disease severities, including estimates and predictions of disease severities that may help in the provision of healthcare.

Data and Variable Definitions
The inpatient data used are from the National Inpatient Sample for 2016 (NIS 2016 data), provided by the Healthcare Cost and Utilization Project (HCUP) database of the United States, and include nine months of medical records (from 1 January 2015 to 30 September 2015). The NIS 2016 data contain information on all patients, whether insured or uninsured. The NIS 2016 sampling frame comprises 46 states and the District of Columbia, covering more than 97% of the United States population and including almost 96% of discharges from United States community hospitals.
Demographic factors used in this study included age, sex, and race. For research purposes, only patients aged 18 years or older were used, and to indicate differences by age, the AGE variable in the NIS 2016 data was separated into four age groups: early working age (18-24 years), prime working age (25-54 years), mature working age (55-64 years), and elderly (65 years or older), based on the United States age structure [24]. The RACE variable divided the patients into six racial and ethnic groups: White, Black, Hispanic, Asian and Pacific Islander, Native American, and Other [25]. The risk of mortality (ROM) and severity of illness (SOI) are two measures of disease severities [18]. Both ROM and SOI in the NIS 2016 data were divided into five levels corresponding to disease severity levels, numbered 0 through 4. Level 0 was considered the lowest, while level 4 referred to extreme severity level [25].
In this study, we used the chronic disease classification method introduced in previous literature [8] and ICD-10-CM [26] to code diseases. Specifically, chronic diseases were classified into 46 major disease groups called "chronic conditions" (CC). The results of previous studies showed that the list covered all chronic diseases with prevalence rates of at least 1% in elderly patients [8,10,11]. As an example, the chronic diseases with ICD-10-CM codes F00-03, F05.1, G30, G31, and R54 were grouped under the "dementia" chronic condition. To avoid repetition, the complete list of 46 chronic conditions is not included. A patient was considered chronically ill only if at least one of the conditions on the list was present [8]. The CC variable was determined by the number of chronic conditions found in the patient and was divided into four categories, corresponding to patients without chronic conditions, with only one chronic condition, with two chronic conditions, and with more than two chronic conditions. Categories of these variables are described in detail in Table 1.

Table 1 (CC variable). Code: category.
0: No chronic conditions
1: Having one chronic condition
2: Having two chronic conditions
3: Having more than two chronic conditions
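The AGE and CC coding described in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the code 0 for early working age is inferred from the codes 1-3 that the text assigns to the other age groups.

```python
def age_group(age_years: int) -> int:
    """Map age in years to the four AGE codes (0-3) described in the text."""
    if age_years < 18:
        raise ValueError("only patients aged 18 years or older are included")
    if age_years <= 24:
        return 0  # early working age (18-24 years); code 0 is an assumption
    if age_years <= 54:
        return 1  # prime working age (25-54 years)
    if age_years <= 64:
        return 2  # mature working age (55-64 years)
    return 3      # elderly (65 years or older)


def cc_category(n_conditions: int) -> int:
    """Map a count of chronic conditions to the CC codes 0-3."""
    if n_conditions < 0:
        raise ValueError("the number of chronic conditions cannot be negative")
    # 0, 1, and 2 map to themselves; more than two conditions map to 3.
    return min(n_conditions, 3)
```

For example, a 70-year-old patient with five chronic conditions would be coded AGE = 3 and CC = 3 under this sketch.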

Methods
This study provides several descriptive statistics to show the prevalence of CC and the differences in demographic characteristics between a patient without chronic conditions, with a single chronic condition, with two chronic conditions, or a patient with more than two chronic conditions. The differences in the expression of patients' disease severities for each demographic factor and CC are also indicated. The Chi-square test [27] is used to calculate p values for the differences across a patient's sex, CC categories, or disease severity levels. The obtained results are shown in Sections 3.1-3.3.
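As an illustration of the Chi-square test mentioned above, the following sketch computes the Pearson statistic and its degrees of freedom for a contingency table in plain Python. The function name is hypothetical. The example table uses the female counts and group totals that the Characteristics of Patients section reports for the prime working age and elderly groups; the male counts are obtained by subtraction.

```python
def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Rows: prime working age, elderly. Columns: female, male.
sex_by_age = [
    [188179, 300732 - 188179],
    [216047, 395274 - 216047],
]
stat, dof = chi_square_statistic(sex_by_age)
```

The p value is then the upper-tail probability of a chi-square distribution with `dof` degrees of freedom evaluated at `stat`.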
Multinomial logistic regression models [28] are fitted to evaluate the influence of demographics and CC on patients' disease severities. The dependent variables are the disease severities (ROM and SOI), and the independent variables are the demographic factors (AGE, SEX, and RACE) and CC. There are two multinomial logistic models, one for each of the dependent variables ROM and SOI. The values $p_i^{(R)}$ and $p_i^{(S)}$ denote the probabilities that a patient has the ROM and the SOI, respectively, at the i-th level, i = 0, 1, 2, 3, 4.
A reference category is specified for each of the two multinomial logistic regression models. Here, the category corresponding to level 0, considered the safest level, is used as the reference. The other categories are then separately regressed against the reference. The general multinomial logistic regression models are shown in the following equations.
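A standard baseline-category form consistent with this description is sketched below. It is an illustrative assumption rather than the exact specification: the coefficient symbol α follows the column header of Table 5, the superscripts (R) and (S) distinguish the ROM and SOI models, and each categorical predictor is written as a single term although in practice it would typically enter through indicator variables for its non-reference categories.

```latex
\ln\frac{p_i^{(R)}}{p_0^{(R)}}
  = \alpha_{i0}^{(R)} + \alpha_{i1}^{(R)}\,\mathrm{AGE} + \alpha_{i2}^{(R)}\,\mathrm{SEX}
  + \alpha_{i3}^{(R)}\,\mathrm{RACE} + \alpha_{i4}^{(R)}\,\mathrm{CC},
  \qquad i = 1, 2, 3, 4,

\ln\frac{p_i^{(S)}}{p_0^{(S)}}
  = \alpha_{i0}^{(S)} + \alpha_{i1}^{(S)}\,\mathrm{AGE} + \alpha_{i2}^{(S)}\,\mathrm{SEX}
  + \alpha_{i3}^{(S)}\,\mathrm{RACE} + \alpha_{i4}^{(S)}\,\mathrm{CC},
  \qquad i = 1, 2, 3, 4.
```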
The regression coefficients are typically jointly estimated by maximum a posteriori (MAP), an extension of the maximum likelihood method. The Wald test [27] is used to determine the statistical significance of estimates. The goodness of fit test [29] is used to test the suitability of models.
Each coefficient represents the change in the log-odds (the logarithm of the relative risk ratio) of the dependent variable being in a particular category rather than in the reference category, associated with a one-level change in the respective independent variable.
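The link between a coefficient, its odds ratio, and the Wald test can be illustrated with a short sketch. The function names are hypothetical, and the two-sided p value assumes the usual large-sample normal approximation for the Wald statistic.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Exponentiate a regression coefficient to obtain the odds ratio."""
    return math.exp(coefficient)


def wald_test(coefficient: float, std_error: float):
    """Return the Wald z statistic and its two-sided p value."""
    z = coefficient / std_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A coefficient of 0 gives an odds ratio of 1 (no influence), and a coefficient whose absolute value exceeds roughly 1.96 standard errors is significant at the 5% level under this approximation.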
The following formulas are used to obtain the predicted probabilities for each level of disease severity.
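Under a baseline-category model with level 0 as the reference, the predicted probabilities take the standard form sketched here. The symbol $\eta_i$, introduced only for this illustration, denotes the fitted linear predictor (the log-odds of level i against level 0) for a given patient; the same form applies to both the ROM and the SOI model.

```latex
p_0 = \frac{1}{1 + \sum_{j=1}^{4} \exp(\eta_j)},
\qquad
p_i = \frac{\exp(\eta_i)}{1 + \sum_{j=1}^{4} \exp(\eta_j)},
\quad i = 1, 2, 3, 4,
\qquad
\eta_i = \ln\frac{p_i}{p_0}.
```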
For the results fitted by the models, we report the estimates of the regression coefficients and the corresponding odds ratios with a 5% level of significance. For the purpose of prediction, all achievable possibilities are considered and calculated for each case, then an average with a standard error is reported. The obtained results are shown in Section 3.4.
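For a single, fully specified patient, the predicted probabilities can be computed from the fitted log-odds as in the following sketch. The function name and the input values are hypothetical; the input is assumed to be the list of log-odds of levels 1-4 against the reference level 0.

```python
import math


def predicted_probabilities(log_odds):
    """Convert log-odds of levels 1..K versus level 0 into probabilities for levels 0..K."""
    exps = [math.exp(value) for value in log_odds]
    denominator = 1.0 + sum(exps)  # the reference level contributes exp(0) = 1
    return [1.0 / denominator] + [e / denominator for e in exps]


# Hypothetical log-odds for levels 1-4, for illustration only.
probs = predicted_probabilities([6.0, 5.0, 4.5, 3.5])
```

The returned list always sums to 1, with the first entry corresponding to the reference level.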

Characteristics of Patients
The characteristics of patients are described in Table 2. There is a total of 893,967 patients, of whom 506,189 (56.62%) are female and 387,778 (43.38%) are male. By AGE, the elderly (coded 3) form the largest group with 44.22% (395,274 patients); among them, females account for a higher percentage than males with 54.66% (216,047 patients). Patients of prime working age (coded 1) rank second with 300,732 patients (33.64%), of whom females also account for a higher percentage than males with 62.57% (188,179 patients). Patients of mature working age (coded 2) rank third with 17.16% (153,446 patients); in this group, however, the proportion of males (53.78%) is higher than that of females (46.22%). Patients of early working age account for the lowest percentage in this study (4.98%), and females in this group outnumber males by more than two to one (69.74% vs. 30.26%). Regarding RACE, white patients (coded 1) account for the highest percentage with 69.79% (623,895 patients), of whom 55.53% are female and 44.47% are male. They are followed by black patients (coded 2) with 13.53% (120,984 patients), of whom females again account for a higher percentage than males with 58.10%. Hispanic patients (coded 3) rank third with 9.19% (82,115 patients), and females account for a higher percentage than males with 60.21%. The remaining racial groups (coded 4-6) are much smaller, with proportions of 4.83%, 0.16%, and 2.5%, respectively. Finally, in terms of CC, patients with at least one chronic condition account for a very high rate of 88.89%, comprising 10.63% with only one chronic condition (coded 1), 10.71% with two chronic conditions (coded 2), and 67.55% with three or more chronic conditions (coded 3).
A notable finding is that, although the proportion of females is consistently higher than that of males, the proportion of males tends to increase with the number of chronic conditions while the proportion of females tends to decrease. For patients without chronic conditions, females account for a much higher percentage than males (78.79% vs. 21.21%).

Prevalence of Chronic Conditions
The distribution of CC by demographic factors is shown in Table 3. By AGE, the proportion of patients with three or more chronic conditions is very high in the elderly (over 90%) and gradually decreases in the younger age groups: it is over 78% for patients of mature working age, 40.57% for prime working age, and nearly 14% for early working age. Meanwhile, the proportion of patients without chronic conditions in early working age (41.5%) is much higher than in the other age groups; for mature working age and the elderly, this proportion is very small (2.74% and 0.88%, respectively). The proportions of patients with only one or two chronic conditions also decline with increasing age; for mature working age and the elderly, however, these proportions are considerably higher than the proportions of patients without chronic conditions. By SEX, the proportion of patients with three or more chronic conditions is 73.87% in males, higher than in females (62.71%). However, the proportion of patients without chronic conditions is much higher in females than in males (15.46% vs. 5.43%). Regarding RACE, the proportion of patients without chronic conditions among Asian and Pacific Islander patients is 29.65%, higher than in all other racial groups. The proportion of patients with three or more chronic conditions in this group is also the lowest (44.95%). Meanwhile, for white patients, the proportion of patients without chronic conditions is only 8.73%, but the proportion with three or more chronic conditions is very high (71.91%). For black and Native American patients, the proportion of patients with three or more chronic conditions is about 64%. The prevalence of CC in patients is shown in detail in Figure 1. Visually, male and female patients differ. For patients of mature working age, the number of females in each CC category is always lower than that of males. Conversely, in the other age groups, females outnumber males in each CC category.

Disease Severity Measures
Disease severity measures, including ROM and SOI, are essential factors in the prognosis and treatment of disease. As shown in Table 4, for ROM, patients at level 1 account for the highest rate of all patients with 48.71%, followed by patients at level 2 with over 25%. Rates for patients at higher ROM levels are 19.15% at level 3 and 7.1% at level 4, while the rate of patients at level 0 is negligible (only 0.03%). By AGE, patients of early working age have the highest percentage at level 1 with over 88%, followed by level 2 with 7.45%, and the lowest at level 0 with a nominal proportion of 0.05%. As age increases, the rate at level 1 decreases and the rates at the more dangerous levels (2-4) increase, while the rate at level 0 remains negligible. As a result, the elderly have their highest rates at level 2 (over 34%) and level 3 (over 32%). The rate of the elderly at the extreme level (level 4) rises to 11.48%, while the rate at level 1 drops to 22.29%. Concerning SEX, the ordering of the ROM levels is the same for males and females (the highest rate is at level 1, followed by level 2, level 3, and level 4, and the lowest is at level 0). Nonetheless, the proportions of patients at each level differ: for levels 1-4 and 0, respectively, the rates are 54.04%, 22.84%, 17.17%, 5.93%, and 0.02% for females and 41.75%, 27.85%, 21.75%, 8.61%, and 0.03% for males. In terms of RACE, although the ordering of the ROM levels does not change across the six racial groups, coded 1 to 6 (the highest is still level 1, followed by levels 2-4, and 0), the rates at each level differ clearly between the groups. White patients are the largest racial group; however, their rate at ROM level 1 is the lowest (45.42%), whereas at the higher levels (2-4) they have the highest rates (26.03%, 20.77%, and 7.75%, respectively).
For patients who have fewer than three chronic conditions, ROM level 1 predominates: it is 93.82% for patients without chronic conditions, 81.71% for those with only one chronic condition, and 68.53% for those with two chronic conditions. However, the rates of patients at higher ROM levels increase rapidly as CC increases. Specifically, for patients without chronic conditions, the rate is only 3.67% at ROM level 2, 1.44% at level 3, and less than 1% at level 4. For patients with three or more chronic conditions, by contrast, the rate is 31.82% at level 2, approximately 26% at level 3, and 9.25% at the highest ROM level. This shows the extent of the danger posed by CC.
For SOI, also shown in Table 4, patients at level 2 account for the highest rate of all patients with 40.24%, followed by those at level 1 with 26.24% and those at level 3 with 25.92%. The rate for patients at the highest SOI level is 7.57%, while the rate of patients at level 0 is negligible (only 0.03%). For early working-age patients, the SOI at level 1 is the highest with over 47%, followed by level 2 with around 40% and level 3 with 10.4%. As with ROM, the rates of patients at high levels (3 and 4) increase rapidly as age increases. In terms of SEX, the SOI at level 2 is the highest and is similar for both males and females (about 40%); however, at higher levels (3 and 4), the rates are higher for males than for females. In terms of RACE, SOI at level 2 is the highest for all racial groups. White patients have slightly higher rates at SOI levels 3 and 4 than the other racial groups. The Hispanic and the Asian and Pacific Islander groups have higher rates at SOI level 1 than the rest. For patients without chronic conditions, SOI at level 1 predominates with over 60%, followed by level 2 with 32.45%, while SOI at level 4 is only 1.12%.
The rates of patients at high SOI levels (3 and 4) also increase rapidly as CC increases. Specifically, for patients with only one chronic condition, the rate is 11.37% at SOI level 3 and 3.18% at SOI level 4. These rates become 16.24% and 5.27%, respectively, for patients with two chronic conditions, and they increase sharply for patients with three or more chronic conditions (32.98% and 9.68%, respectively).
The disease severities of patients vary according to demographic factors and CC. The older the patient, the higher the disease severities, especially for patients with many chronic conditions. Figures A1 and A2 describe the differences in patients' disease severities among demographic factors and CC.

Multinomial Logistic Regression Analysis
The results of the multinomial logistic regression analyses are shown in Table 5. At the 5% significance level, all of the attributes are statistically significant. The extremely small p-value in the goodness of fit test (around $2.2 \times 10^{-16}$) means that the models are appropriate and consistent. In other words, the demographic factors and CC indeed affect disease severities. Odds ratios, which are obtained by exponentiation of the regression coefficients, show the association of the factors with disease severities as well as their influence. An odds ratio of 1 means there is no influence, and the further the odds ratio is from 1, the stronger the influence [21]. As shown in Table 5, for both ROM and SOI, being elderly (AGE = 3) is strongly associated with disease severities, particularly with high ROM and SOI levels. Sex is also a very important factor in determining disease severities. Among the racial groups, Hispanics have the strongest influence. The CC factor has an especially powerful influence on disease severities.

Table 5. Results of multinomial logistic regression models.

The biggest advantage of using multinomial logistic regression models is that they provide predictive results. They indicate which level of disease severity a patient with given demographic characteristics and CC is likely to have, and with what probability. This can then act as a guide for healthcare facilities in disease control and in developing treatment plans for patients. For example, consider a patient who is elderly, male, Hispanic, and without chronic conditions (AGE = 3, SEX = 0, RACE = 3, and CC = 0). The predicted probability for ROM is 0.13% at level 0; 69.62% at level 1; 14.67% at level 2; 9.69% at level 3; and 5.89% at level 4. For SOI, the predicted probabilities for the patient to fall into levels 0-4 are 0.09%; 45.19%; 38.42%; 13.25%; and 3.05%, respectively.
In this case, the age factor influences the disease severities even though there are no chronic conditions. The following examples show the dangers of CC. Consider a patient who is elderly, female, Asian or Pacific Islander, and has more than two chronic conditions (AGE = 3, SEX = 1, RACE = 4, CC = 3). The predicted probability for ROM is 0% at level 0; 22.09% at level 1; 33.90% at level 2; 32.52% at level 3; and 11.49% at level 4. For SOI, the predicted probabilities for this patient to fall into levels 0-4 are 0%; 14.95%; 39.81%; 34.37%; and 10.87%, respectively. For this patient, both ROM and SOI are clearly of great concern. As one more example, consider the patient characterized by AGE = 0, SEX = 1, RACE = 1, and CC = 3. The predicted probabilities for ROM are 0.01% at level 0; 70.61% at level 1; 17.70% at level 2; 8.82% at level 3; and 3.45% at level 4. ROM at level 1 is the most likely, but the higher levels are also of concern. Further, SOI is 0.01% at level 0; 24.67% at level 1; 48.56% at level 2; 22.19% at level 3; and 4.57% at level 4. The predicted probability of falling into level 2 is the highest, followed by level 1, level 3, and level 4.

Table 5 columns: Coefficient (α); Estimate (Standard Error); Odds Ratio (Exp(α)).
Nevertheless, a patient's information is not always complete; some demographic factors or CC may be missing. In such cases, we calculate the predicted probabilities for all achievable combinations of the missing factors and report, for each level of disease severity, an average with a standard error. For example, to provide a predictive result for a female patient without chronic conditions whose age and race are unknown, all possible values of AGE and RACE are considered, and only one average value is reported. To highlight the powerful influence of the CC factor on disease severities, the predictive results for patients with only one known demographic factor are presented and compared with those for patients whose CC is also known. The predictive accuracy is reflected by the standard error. For example, if the patient is of mature working age (AGE = 2) with three or more chronic conditions (CC = 3), and sex and race are unknown, the predicted probabilities are as follows: ROM is 45.35% at level 1; 30.68% at level 2; 17.42% at level 3; and 6.54% at level 4. The corresponding standard errors are 5.09%; 2.24%; 2.03%; and 1.15%. Similarly, SOI is 19.07% at level 1; 41.69% at level 2; 30.24% at level 3; and 8.99% at level 4, with corresponding standard errors of 2.71%; 1.11%; 2.33%; and 1.54%. The detailed results are shown in Table A1.
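The averaging over missing factors can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: `average_prediction` and `toy_predict` are hypothetical names, `toy_predict` is a stand-in for a fitted model, all combinations of the unknown factors are weighted equally, and the standard error is taken to be the sample standard deviation across combinations divided by the square root of their number, which the text does not specify.

```python
import math
from itertools import product


def average_prediction(predict, known, unknown_levels):
    """Average predicted probabilities over every combination of the unknown factors.

    predict: function mapping a profile dict to a list of probabilities.
    known: dict of known factor values, e.g. {"AGE": 2, "CC": 3}.
    unknown_levels: dict mapping each unknown factor to its possible codes.
    """
    names = list(unknown_levels)
    rows = []
    for combo in product(*(unknown_levels[name] for name in names)):
        profile = dict(known, **dict(zip(names, combo)))
        rows.append(predict(profile))
    n = len(rows)
    if n < 2:
        raise ValueError("at least two combinations are needed for a standard error")
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # Assumed definition: sample standard deviation divided by sqrt(n).
    std_errors = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1)) / math.sqrt(n)
        for col, m in zip(columns, means)
    ]
    return means, std_errors


def toy_predict(profile):
    """Hypothetical stand-in for a fitted model; returns two made-up probabilities."""
    p = 0.1 * profile["SEX"] + 0.05 * profile["RACE"]
    return [p, 1.0 - p]


means, std_errors = average_prediction(
    toy_predict,
    known={"AGE": 2, "CC": 3},
    unknown_levels={"SEX": [0, 1], "RACE": [1, 2, 3, 4, 5, 6]},
)
```

A weighting by the observed frequency of each combination would be a reasonable alternative to the equal weighting used here.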

Discussion
Disease severity measures, including risk of mortality (ROM) and severity of illness (SOI), are the criteria used to describe the impact of the disease process on the utilization of resources, comorbidities, and mortality [18][19][20]. Early identification of disease severity helps health facilities provide the best treatment plan for patients [21], which is especially important for patients with multiple chronic conditions. This study shows the relationships between demographic factors (AGE, SEX, and RACE) and chronic conditions (CC), and their influence on patients' disease severities, based on NIS 2016 data. The differences between patients without chronic conditions, with a single chronic condition, with two chronic conditions, and with more than two chronic conditions are specified. The prevalence of CC and the differences in patients' disease severities for each demographic factor and CC are also indicated.
ROM and SOI are often closely related, but the levels of ROM and SOI are not always the same for a given patient. As shown in Table 6, among patients with an SOI level of 1, 91.40% have a ROM level of 1; among patients with an SOI level of 2, only 37.35% have a ROM level of 2, while 55.28% have a ROM level of 1, and so on. The percentage of patients with ROM levels corresponding to each SOI level is indicated in the following matrix. This is why both ROM and SOI are studied simultaneously instead of focusing on only one of them.
This study uses a natural classification of CC, comprising patients without chronic conditions, with a single chronic condition, with two chronic conditions, and with three or more chronic conditions. We find that the proportion of patients with three or more chronic conditions among diagnosed patients is over 67%. It is over 62% for female patients and approximately 74% for male patients. For elderly patients, it is over 90%, while it is only around 14% for patients of early working age. These values differ from, but do not contradict, those of previous studies, because the prevalence of chronic conditions here is calculated from inpatients only instead of the whole US population [3,6]. The proportion of patients with more than two chronic conditions increases with age. This proportion also varies with the patient's race: white patients have the highest proportion (nearly 72%) and Asian and Pacific Islander patients the lowest (nearly 45%). This is consistent with previous studies [13,23].
The obtained results show that a patient's disease severities vary with demographic and CC factors. As age increases, the proportions of patients at low levels (levels 0 and 1) decrease, while the proportions at higher levels of disease severity (levels 2-4) increase rapidly. In the elderly especially, disease severities are at very high levels. High levels of disease severity account for greater proportions of patients with three or more chronic conditions, while for patients without chronic conditions these proportions are very small. These results are new and have not been reported in detail in previous studies. The results obtained by the multinomial logistic regression analysis quantify the influence of demographic and CC factors on disease severities. The older the patient is, the more pronounced the disease severities are. Whereas the relative risk ratios of patients of prime working age compared with early working age for ROM at levels 2-4 are only 0.95, 0.98, and 0.83, respectively, they are 1.77, 2.15, and 1.83 for patients of mature working age. For elderly patients, they reach 5.91, 12.03, and 9.44, respectively. In particular, the relative risk ratios of patients with three or more chronic conditions compared with those without chronic conditions for ROM at levels 2-4 are as high as 39.93, 54.87, and 31.25, respectively. The pattern is similar for SOI.
Furthermore, the increasing trend of CC, together with the high relative risk ratios for elderly patients, may significantly increase disease severities. Here, we provide prediction models based on multinomial logistic regression that can be used even if patient information is lacking or incomplete. The predictive results show that the models are appropriate and consistent with NIS 2016 data. This contributes to the provision of higher-quality information about diseases, helping healthcare facilities make appropriate plans for diagnosing and treating patients.
As a limitation, this study did not include other factors such as income, location, and hospital. Nevertheless, this study is useful for describing the influence of demographic factors on disease severities. The results demonstrate the relationship between disease severities, demographic factors (age, sex, and race), and CC. In future studies, other demographic factors as well as lifestyle factors will be included using the methods presented in this study.

Conclusions
This study shows that there is a significant relationship between demographic factors, chronic conditions and a patient's disease severity. The differences between patients corresponding to each chronic condition level in the levels of disease severities are revealed. A high level of disease severity commonly appears in patients with more than two chronic conditions, especially in the elderly. For patients without chronic conditions, disease severities are revealed to be at a lower or safe level, even in the elderly.
Author Contributions: Formal analysis, and writing-original draft preparation, writing, V.C.N.; conceptualization, methodology, investigation, resources, supervision, project administration, writing, review and editing, J.P. All authors have read and agreed to the published version of the manuscript.

Acknowledgments:
The study examined regional cost differences for congestive heart failure admissions using discharge data from the National Inpatient Sample (NIS), the Healthcare Cost and Utilization Project (HCUP), and the Agency for Healthcare Research and Quality.