Identifying Characteristics Associated with the Concentration and Persistence of Medical Expenses among Middle-Aged and Elderly Adults: Findings from the China Health and Retirement Longitudinal Survey

Medical expenses, especially among middle-aged and elderly people, have increased in China over recent decades. However, few studies have analyzed the concentration or persistence of medical expenses among Chinese residents or vulnerable groups with longitudinal survey data. Based on the data of CHARLS (China Health and Retirement Longitudinal Study), this study sought to identify characteristics associated with the concentration and persistence of medical expenses among Chinese middle-aged and elderly adults and to help alleviate medical spending and the operational risk of social medical insurance. Concentration was measured using the cumulative percentages of ranked annual medical expenses and descriptive statistics were used to define the characteristics of individuals with high medical expenses. The persistence of medical expenses and associated factors were estimated using transfer rate calculations and Heckman selection modeling. The results show that total medical expenses were concentrated among a few adults and the concentration increased over time. People in the high medical expense group were more likely to be older, live in urban areas, be less wealthy, have chronic diseases, and attend higher-ranking medical institutions. Lagged medical expenses had a persistent positive effect on current medical expenses and the effect of a one-period lag was strongest. Individuals with chronic diseases during the lagged period had a higher likelihood of experiencing persistent medical expenses. Policy efforts should focus on preventive management, more efficient care systems, improvement of serious illness insurance level, and strengthening the persistent protection effect of social medical insurance to reduce the high medical financial risk and long-term financial healthcare burden in China.


Background
The concentration and persistence of medical expenses are well documented [1][2][3]. Health care spending is overwhelmingly concentrated within a very small proportion of the population defined as high-cost users (HCUs) [4]. The persistence of medical expenditures generally refers to "long-term medical spending patterns" [3]. However, the concentration and persistence of medical costs have primarily been confirmed in the United States. Monheit et al. [5] found that 1% of patients account for 27% of annual medical spending in the US, and Riley et al. [6] reported that 5% account for 34.4% of total health expenses. Kohn et al. [7] revealed that medical costs have significant persistence, while Newhouse et al. [8] reported that 40% of current medical expenses are associated with past medical expenditures. Persistent high medical expenses have also been reported in many developed high-income countries, such as Germany, Canada, the Netherlands, and medical costs. Findings could provide information about the concentration and persistence of medical expenses. In addition, this study offers policy implications for reducing medical expenses and alleviating the operational risk of social medical insurance.
The remaining sections of this study are as follows. Section 2 introduces the approach of this study, including the data source, data analysis, and definitions of the variables. Section 3 shows the descriptive results and the regression results. Section 4 provides the discussion and policy implications. Section 5 presents conclusions.

Methods
Based on previous studies on the concentration and persistence of medical expenses, the following hypotheses in this study are proposed:
The survey used multi-stage stratified random sampling involving four steps. First, 150 county-level units covering 28 of the 30 provinces in mainland China, excluding Tibet, were sampled based on population size and stratified by GDP and district (urban or rural) using probability proportionate to size (PPS) sampling. Second, using the latest population data, village and community units within counties were chosen by referencing the National Bureau of Statistics. The administrative villages in rural areas and communities in urban areas were used as primary sampling units (PSUs). Three PSUs were selected within each county, and 450 PSUs were selected using the PPS. Third, approximately 25 household units were selected in each PSU based on sampling frames constructed using CHARLS GIS maps, considering the pre-investigation refusal rate. Fourth, one or two age-eligible respondents in each household were administered the survey and followed up every 2-3 years.
The CHARLS collects detailed information about each respondent and their spouse, including basic demographics, chronic disease status, health status and behaviors, health care utilization and insurance, income and retirement, and medical expenditures. Medical expenditure data include outpatient, hospitalization, and self-treatment expenses, and total expenditures are the sum of all three spending categories. Based on CHARLS responses, nine chronic diseases were chosen based on their high prevalence or mortality rate in 2020 chronic disease status reports across China: hypertension, diabetes (i.e., high blood sugar), cancer or malignant tumors (except minor skin cancers), chronic lung disease (i.e., chronic bronchitis, emphysema), liver disease (except fatty liver, tumors or cancer), heart disease (i.e., coronary heart disease, angina, congestive heart failure, and other heart problems), kidney disease (except tumors or cancer), stomach and other digestive diseases (except tumors or cancer), and arthritis or rheumatism. Cross-sectional data from 2013, 2015, and 2018 were matched to a three-period dynamic panel dataset using ID, restricting the sample to individuals with data in each year. After dropping patients who had migrated or died, had negative expenditures, or had incomplete information, 19,869
Consistent with other surveys, a large proportion of CHARLS survey respondents (47.2%) belonged to the "zero medical expenses" group. Participants with positive medical expenses were ranked from highest to lowest each year and divided into categories representing the top 1%, 10%, and 20%, and the bottom 50% of expenses. The medical expenses of each category were divided by total annual medical expenditures to estimate the cumulative percentage of each expense category, which was used to determine the concentration of total medical expenses as well as hospitalization, outpatient, and self-treatment-specific expenses from 2013 to 2018 [4][5][6][11].
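The concentration measure described above can be illustrated with a minimal Python sketch. The function name `concentration_shares`, the rounding of group sizes, and the toy expense values are assumptions made for illustration only; they are not taken from the study or from CHARLS.

```python
import math


def concentration_shares(expenses, top_fractions=(0.01, 0.10, 0.20, 0.50)):
    """Share of total spending accounted for by the top spenders.

    Only positive expenses are ranked, as in the text. Rounding the group
    size up (so that a top group is never empty) is an assumption of this
    sketch, not a rule stated in the paper.
    """
    positive = sorted((x for x in expenses if x > 0), reverse=True)
    shares = {}
    if not positive:
        return shares
    total = sum(positive)
    for frac in top_fractions:
        # Small tolerance guards against floating-point noise before ceil().
        n_top = max(1, math.ceil(frac * len(positive) - 1e-9))
        shares[frac] = sum(positive[:n_top]) / total
    return shares


# Hypothetical annual expenses for ten respondents (two with zero spending).
example = [0, 0, 100, 50, 30, 10, 5, 3, 1, 1]
shares = concentration_shares(example)
```

In this toy sample, the single largest spender alone accounts for half of all positive spending, which is the kind of pattern the cumulative percentages are meant to expose.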

Determining the Persistence of High Medical Expenses among Middle-Aged and Elderly Participants
The persistence of high medical expenditures was determined by calculating the transfer rate of high medical expense samples [2,3,12,16]. Rates were estimated by determining the number of spenders in the top 10%, 20%, and 50% categories in 2013 that transferred into other categories or remained in the original categories in 2015 and 2018 and then dividing these values by the number of spenders in the top 10%, 20%, and 50% categories in 2013. The top 1% category was excluded because its small sample size made it difficult to obtain scientifically representative transfer rates.
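As a rough illustration of the transfer rate calculation, the sketch below counts how respondents in each baseline category are distributed across categories in a later wave. The function name `transfer_rates`, the category labels, and the sample IDs are hypothetical; the sketch also assumes a balanced panel, so the denominator equals the number of baseline spenders in each category.

```python
from collections import Counter, defaultdict


def transfer_rates(origin, destination):
    """Transfer rates between spending categories across two waves.

    origin, destination: dicts mapping respondent ID -> category label
    (for example, the 2013 and 2015 categories). Respondents missing from
    the later wave are skipped, which has no effect in a balanced panel.
    """
    counts = defaultdict(Counter)
    for pid, start in origin.items():
        if pid in destination:
            counts[start][destination[pid]] += 1
    rates = {}
    for start, ends in counts.items():
        n_start = sum(ends.values())
        rates[start] = {end: n / n_start for end, n in ends.items()}
    return rates


# Hypothetical category assignments for four respondents.
cat_2013 = {1: "top10", 2: "top10", 3: "top10", 4: "top20"}
cat_2015 = {1: "top10", 2: "top20", 3: "top50", 4: "top20"}
rates = transfer_rates(cat_2013, cat_2015)
```

Here `rates["top10"]["top10"]` is the share of baseline top-10% spenders who stayed in that category, and the remaining entries give the shares that moved elsewhere.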

Descriptive Analysis
Descriptive statistics were used to analyze the characteristics of middle-aged and elderly people with high medical costs. While the top 1%, 10%, and 50% categories were used to estimate the concentration of medical expenses, individuals in the top 10% and 20% of annual total medical expenses were used for the descriptive analysis because the top 1% group was too small to calculate reliable rates for specific characteristics. Participants with the top 10% of total medical costs in 2013, 2015, and 2018 were combined as the "Top 10% high medical expense group," while those with the bottom 90% of total medical expenses in 2013, 2015, and 2018 were combined as the "Bottom 90% medical expense group" and defined as the control group. Similarly, a "Top 20% high medical expense group" and "Bottom 80% medical expense group" were also created using combined data from each of the survey years.

Regression Analysis of Medical Expenses among Middle-Aged and Elderly Participants
As a result of the large proportion of participants in the "zero medical expenses" group and the "long-tail distribution" of positive medical costs, the Heckman selection model was used to identify the contribution of related variables to medical spending incidence. Lagged medical expenditures were used as the independent variable to analyze the persistence of medical expenses. The analysis included a regression of medical expense incidence and a regression of total medical expenditures.
The first regression used the probit model for estimation. In this model, the dependent variable, "incidence of medical costs" ($I_{it}$), is a binary variable defined as "whether individual i had medical expenses in year t" and dichotomized as "0" = "Non-incurrence of medical expenses" or "1" = "Incurrence of medical expenses." $I_{i,t-k}$ denotes the lag term of "whether medical expenses occurred." $X_{it}$ is a series of control variables representing the demographic factors: age, sex, education level, marital status, retirement condition, and residence. $D_{ij,t-k}$ is a dummy indicator representing "whether individual i suffered from disease j in lagged period k." $I_{i,t-k} D_{ij,t-k}$ are interaction terms between "whether medical expenses were incurred in the lag period" and "the prevalence of disease j in lagged period k." In the current paper, k = 1 or 2; k does not refer to a specific calendar year. Instead, t−1 denotes a one-period lag and t−2 denotes a two-period lag. $\upsilon_{it}$ is the normally distributed error term, and $\gamma$ and $\lambda$ are the regression coefficients assessed in this study. While $\gamma$ assesses the persistence of incurring medical expenses, $\lambda$ represents the effect of diseases in the lagged period on the incidence of current medical expenses. Following Bai et al., we reported "dy/dx" as the marginal effects of the probit regression on the incidence of medical expenses [24].
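One plausible way to write the probit specification, reconstructed only from the variable definitions given in the text, is the latent-variable form below. The intercept $\alpha$, the coefficient vector $\beta$ on the controls, the subscripts on $\gamma$ and $\lambda$, and the unit-variance normalization are assumptions of this sketch; the text also does not say whether the disease dummies enter on their own in addition to the interaction terms.

```latex
\begin{aligned}
I_{it}^{*} &= \alpha + \sum_{k} \gamma_{k}\, I_{i,t-k}
            + \sum_{k} \sum_{j} \lambda_{jk}\, I_{i,t-k} D_{ij,t-k}
            + X_{it}\beta + \upsilon_{it}, \\
I_{it} &= \mathbf{1}\left[ I_{it}^{*} > 0 \right], \qquad
\upsilon_{it} \sim N(0, 1).
\end{aligned}
```

Under this form, the reported "dy/dx" values correspond to the change in $\Pr(I_{it} = 1)$ associated with a change in each regressor.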
The second model represented the total medical expenditures and was estimated by OLS regression. In this model, the dependent variable, $Y_{it}$, is the total medical expenditure of individual i in year t, log-transformed to normalize the skewed medical expense data. The key independent variables, $Y_{i,t-k}$, are "the lagged term of medical expenses for each individual i over a certain number of years." For example, if $Y_{i,t-k}$ includes $Y_{i,2015}$ and $Y_{i,2013}$ for 2018, then k = 2. Following Peng et al. [21], if $Y_{i,t-k} = 0$, 1 is added before taking the logarithm, i.e., $\ln(1 + Y_{i,t-k})$ is used; if $Y_{i,t-k}$ is not equal to 0, the logarithm of the original lagged expenditure is used to define Equation (1). $Z_{it}$ is a vector of covariates such as socioeconomic and other endogenous factors, $\phi$ and $\eta_{t-k}$ are the regression coefficients of the equation, $\eta_{t-k}$ gives the percentage change in $Y_{it}$ associated with a 1% change in $Y_{i,t-k}$, and $\varepsilon_{it}$ is a random error term. Statistical analyses were performed using Stata, version 16.0 (Stata Corp, Inc., Cary, TX, USA).
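A hedged reconstruction of the expenditure equation, based only on the symbols defined in the text, might read as follows. The notation $\tilde{Y}_{i,t-k}$ for the transformed lag is introduced here for illustration, and any intercept or selection-correction term is omitted because the text does not describe one.

```latex
\ln Y_{it} = \phi Z_{it} + \sum_{k} \eta_{t-k} \ln \tilde{Y}_{i,t-k} + \varepsilon_{it},
\qquad
\ln \tilde{Y}_{i,t-k} =
\begin{cases}
\ln\left(1 + Y_{i,t-k}\right) & \text{if } Y_{i,t-k} = 0, \\
\ln Y_{i,t-k} & \text{if } Y_{i,t-k} > 0.
\end{cases}
```

Because both sides are in logarithms, $\eta_{t-k}$ can be read as an elasticity, which matches the percentage interpretation given in the text.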

Variable Selection

Dependent Variables
The dependent variables included total medical expenditures (Y) and annual incidence of medical costs (I). Annual total medical costs were calculated by combining the outpatient, hospitalization (fees paid to the hospital, including ward fees but not wages paid to a hired nurse, transportation costs, and accommodation costs for oneself or family members), and self-treatment expenses (medicines purchased by the patient but not including prescription medications). I is a dummy binary variable set based on Y such that if the total medical expenses are not equal to zero, I is defined as "1", otherwise it is defined as "0."
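The construction of the two dependent variables can be sketched in a few lines of Python. The function name `total_and_incidence` and its argument names are illustrative assumptions rather than CHARLS variable names.

```python
def total_and_incidence(outpatient, hospitalization, self_treatment):
    """Return (Y, I) for one respondent-year.

    Y is the sum of the three expense components; I is 1 when Y is
    non-zero and 0 otherwise, following the definition in the text.
    """
    total = outpatient + hospitalization + self_treatment
    incidence = 1 if total != 0 else 0
    return total, incidence


# Hypothetical respondents: one with no spending, one with some spending.
no_spending = total_and_incidence(0, 0, 0)
some_spending = total_and_incidence(100, 0, 20)
```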

Independent Variables
The independent indicators from three dimensions (predisposing factors, enabling factors, and need factors) were selected using the Andersen health service utilization model [25], the mainstream model used for health service utilization research. The model divides factors affecting health service utilization into three categories: predisposing, enabling, and need factors. Predisposing factors include demographic and social structure factors, such as age, sex, marital status, retirement condition, education level, and residence. Enabling factors include financial and organizational variables such as household income per capita, health insurance, inpatient visit times, type of outpatient institution and visit times, and multi-type outpatient facility visits. Specifically, "multi-type outpatient facility visits" is a dummy variable based on the multi-choice question "Which types of medical facilities have you visited in the last 4 weeks for outpatient treatment?" The outpatient institutions were divided into two categories based on the grade classification of the medical facilities. General, specialized, and Chinese medicine hospitals were coded as 0, while community health centers, town hospitals, and village clinics were coded as 1. If the respondent received treatment from both high-ranking hospitals and primary medical institutions, "type of outpatient institution" = 1, and if the respondent only received medical services from primary medical institutions, the "type of outpatient institution" = 0. Need factors included self-reported health status and chronic diseases. To investigate the impact of several chronic diseases, a "comorbidity" dummy variable was developed. All independent variables were set as categorical indicators. Detailed descriptions of the samples are included in Table 1.

The Concentration of Medical Expenses among Middle-Aged and Elderly Participants
The cumulative percentages of medical expenditures spent by middle-aged and elderly people who were in the top 1%, 10%, 20%, and 50% annual medical expense categories in 2013, 2015, and 2018 are shown in Figures 1-4. Four types of expenses were reported: total medical expenses, and inpatient, outpatient, and self-treatment medical expenses.
In 2013, 2015, and 2018, middle-aged and elderly people with the top 1% of self-treatment expenses spent 21.6%, 20.4%, and 24.4% of the annual costs, respectively (Figure 4). Those with the top 10% spent 59.2%, 61.4%, and 59.2% of the annual self-treatment medical costs in 2013, 2015, and 2018, respectively. Those with the top 20% spent 74.3%, 75.8%, and 71.6%, respectively, and those with the top 50% spent 94.5%, 94.0%, and 92.3%, respectively. The shares of annual self-treatment medical expenses accounted for by the top 1%, 10%, 20%, and 50% of spenders fluctuated slightly from 2013 to 2018.

Persistence of High Medical Expenses
No middle-aged and elderly people with the top 10% medical expenses in 2013 remained in the top 10% in 2015 or 2018 (Table 4)

Incidence of Medical Expenses
Results of the two regression models, including the lagged one-period and two-period variables, are shown in Table 5. Once medical expenses occurred, the probability of incurring persistent medical expenditures in the next period increased significantly by 16.5% and 15.8%, respectively. Incurring medical expenses in the lagged two-period increased the incidence of current medical expenses by 11.7%. Females had a significantly higher likelihood of incurring medical expenses than males (dy/dx = 0.033, 0.027), and chronic diseases had a persistent effect on the incidence of medical expenses. The probability of incurring medical expenses in the current period increased significantly by 29.3% once middle-aged and elderly adults with cancer had medical expenses in the lagged one-period (dy/dx = 0.293). Meanwhile, suffering from hypertension (dy/dx = 0.102, 0.069), diabetes (dy/dx = 0.098), chronic lung diseases (dy/dx = 0.078, 0.081), liver diseases (dy/dx = 0.049), heart diseases (dy/dx = 0.074), kidney diseases (dy/dx = 0.089, 0.115), digestive diseases (dy/dx = 0.055), and arthritis or rheumatism (dy/dx = 0.060) in the lagged one-period significantly increased the probability of incurring medical expenses. In addition, suffering from hypertension, liver disease, or arthritis or rheumatism in the lagged two-period also increased the incidence of current medical costs (dy/dx = 0.103, 0.111, and 0.036, respectively).

Total Medical Expenses
Factors associated with total medical expenses among middle-aged and elderly participants in the second part of the Heckman selection model are shown in Table 6. The medical expenses were logarithmically transformed in the regression model. When other variables were controlled, the current level of medical expenses was strongly affected by past medical expenditures. For each 10% increase in medical costs during the lagged one-period, the current medical expenses increased by 2.25% and 1.65%, respectively, and for each 10% increase in medical expenses during the lagged two-period, current medical expenses increased by 1.13%. Middle-aged and elderly people who were ≥75 years of age generally had higher medical expenses than those <55 years of age (Coef. = 0.447). Females had higher medical expenses than males (Coef. = 0.328, 0.297), and married respondents had higher medical expenses than the unmarried (Coef. = 0.258). Respondents living in urban areas generally spent more on medical expenses than those living in rural areas (Coef. = 0.221).
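The "per 10% increase" figures above follow from reading the log-log coefficients as elasticities. The short Python sketch below contrasts the usual linear approximation with the exact power-law calculation. The elasticity value 0.225 is inferred from the reported 2.25% figure, not quoted from Table 6, and the function names are hypothetical.

```python
def pct_change_linear(elasticity, pct_increase):
    """Linear approximation: percentage change = elasticity * percentage increase."""
    return elasticity * pct_increase


def pct_change_exact(elasticity, pct_increase):
    """Exact change implied by a log-log model: (1 + x)^elasticity - 1, in percent."""
    return ((1 + pct_increase / 100) ** elasticity - 1) * 100


# Assumed elasticity of 0.225 for the one-period lag (inferred, not from Table 6).
approx = pct_change_linear(0.225, 10)
exact = pct_change_exact(0.225, 10)
```

For a 10% increase the two calculations differ only slightly, so the linear reading used in the text is a reasonable shorthand at this scale.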

Discussion
The medical expenses of middle-aged and elderly people account for a large proportion of all medical expenses because of the higher incidence of age-related chronic disease in this population. Thus, it is important to study the concentration and persistence of medical expenses in this group. To the best of our knowledge, this study of a nationally representative longitudinal Chinese household survey population is the first to analyze the extent and characteristics of the concentration and persistence of medical expenses among Chinese middle-aged and elderly adults [21][22][23].

The Concentration of Medical Expenses among Middle-Aged and Elderly Participants
This study found that 22.0-35.2% of total annual medical expenses were spent by the top 1% of middle-aged and elderly medical spenders, while 64.2-76.1% were spent by the top 10%, 79.0-85.0% were spent by the top 20%, and 95.7-97.2% were spent by the top 50%. These findings demonstrate that total medical expenses are concentrated in a small portion of middle-aged and elderly adults, supporting findings from previous studies [6,10]. Hypothesis 1 is therefore supported. These results indicate that there is an inequitable utilization of health resources among middle-aged and elderly adults. As a result, policy efforts should focus on optimizing the allocation of health service utilization by different groups to enhance social welfare [26]. This study suggests: (1) To achieve a similar financing and reimbursement level of social medical insurance between urban and rural regions, the Urban Employee Basic Medical Insurance and the New Rural Cooperation Medical Insurance should be further integrated. (2) To narrow the gap in allocation of medical resources among regions, the government should improve the financial transfer payment system. (3) The government should improve the security level of serious illness insurance and medical assistance among rural residents and the low-income population. (4) To integrate medical resources and reduce the excessive utilization of medical services, medical consortiums need to be further established, and standardization of the clinical care process should be continuously implemented.
The concentration of total medical expenses increased from 2013 to 2018, which contrasts with some studies [4,6]. This suggests that the total medical expenses of Chinese middle-aged and elderly people were more concentrated among a small number of individuals in recent years, which may be the result of advancements in medical treatment technologies and universal social health insurance coverage in China. Thus, differentiated measures of social health insurance for high medical expense spenders among these demographics should be a priority concern for public policy. Meanwhile, the concentration of outpatient medical expenses also increased from 2013 to 2018. Inpatient medical expenses in the top 10% decreased slightly from 2015 to 2018; this may be the result of recent reforms to social medical insurance and standardization of the clinical care process in China. The concentration of self-treatment medical expenses for participants in the top 1% increased from 2015 to 2018, likely as a result of worsening inequities in self-treatment expenses. We suggest that the fairness of self-treatment can be improved in the following ways. To reduce the excessive utilization of over-the-counter medications, general practitioners can guide the use of these medicines. Health administrative departments, price bureaus, and drug administrations should reasonably adjust the prices of self-medication drugs, health supplements, and health care equipment.

Persistence of High Medical Expenses among Middle-Aged and Elderly Participants
Some middle-aged and elderly participants with high medical expenses in 2013 remained in the same high medical expense categories in 2015 and 2018, supporting prior studies illustrating the persistence of high medical costs [7,22,27]. Hypothesis 2 is supported. This may be linked to the transitioning disease spectrum and high prevalence of chronic diseases among high medical expense groups. Policy efforts should focus on preventive management for individuals at risk of incurring high medical expenses [2,6,20].
Most middle-aged and elderly participants with the top 10% of medical expenses in 2013 transferred to the top 20% or 50% categories in 2015 and 2018. This is consistent with previous surveys [17,28] and indicates that high medical expenses may have been brought under control with reform to the medical system and the advancement of medical technology in China [12,15]. However, it should be noted that almost 20% of middle-aged and elderly adults in the top 50% spending category in 2013 transferred to the top 20% spending category in 2015 and 2018. Thus, policy efforts will need to better define the characteristics of this group to effectively control rising medical expenses.

Characteristics of Middle-Aged and Elderly Participants with High Medical Expenses
Middle-aged and elderly participants with high medical expenses were more likely to be female and older, which is in agreement with previous studies [5,29]. This may be the result of the increasing incidence of menopause and severe disease in older women [30,31]. A high education level was also associated with higher medical expenses, possibly resulting from better health awareness and a stronger motivation to seek medical services [32]. In addition, participants with high medical expenses were more likely to live in urban than rural areas. This may be attributed to socioeconomic differences between urban and rural areas in China [33]. Urban residents tend to have higher incomes and are thus able to incur higher medical expenses, and advanced medical technologies and resources tend to be more concentrated in urban areas. Thus, initiatives that optimize the allocation of health resources in urban and rural areas should be considered to narrow the gap between these regions. Participants with lower income were also more likely to be in the higher medical expenses group, which is consistent with some studies [11,34]. This may be because less-wealthy groups have lower health management awareness and riskier lifestyle behaviors that increase the prevalence of severe diseases [35]. Thus, health sectors should strengthen health management, disease prevention, and financial assistance for poorer individuals.
In this study, participants receiving medical services in general or specialized hospitals had high medical expenses, potentially as a result of excessive medical treatment. This should be addressed by reforming hierarchical health services and promoting rational utilization. Middle-aged and elderly participants who visited multi-type medical institutions also demonstrated high medical expenses. Social health insurance policies should focus on reducing the medical expenses of these at-risk groups.
Middle-aged and elderly people with hypertension, diabetes, cancer, chronic lung diseases, liver diseases, heart diseases, kidney diseases, digestive diseases, arthritis or rheumatism, and comorbidities were more likely to have high medical expenses, which supports previous survey findings [17,36]. Hypothesis 3 is supported. Prevention management and early screening and treatment for chronic diseases should help to reduce medical expenses in this population.

Factors Associated with the Persistence of Total Medical Expenses
This study found that incurring medical expenses in the previous period increased the probability of incurring current medical expenses by 15.8-16.5%, and incurring them two periods earlier increased it by 11.7%. Current medical expenses increased by 1.65-2.25% and 1.13% for every 10% increase in the lagged one-period and lagged two-period medical expenses, respectively, which is consistent with previous survey results [5,37]. These findings demonstrate that medical expenses are significantly persistent. While previous medical expenses had a persistently positive effect on current medical expenses, the effect of the lagged one-period was the strongest, which is also supported by other studies [15,18]. This effect may be attributed to the long-term treatment of chronic diseases. It is worth noting that middle-aged and elderly people with diabetes, cancer, chronic lung diseases, heart diseases, kidney diseases, and digestive diseases had a high likelihood of having persistent medical expenses for two periods, while those with hypertension, liver diseases, or arthritis or rheumatism were more likely to have persistent medical expenses for three periods. In addition, those with hypertension, diabetes, chronic lung diseases, heart diseases, arthritis or rheumatism, or comorbidities generally had higher medical expenses, supporting prior studies [16,38]. Hypothesis 4 is supported. To control persistent medical expenses and higher medical expenses, preventive measures should focus on reducing the incidence of chronic diseases. Health sectors should consider joining primary medical institutions to establish persistent integrated care systems for middle-aged and elderly people with chronic diseases to reduce the overuse of medical resources during long-term treatment regimens. In addition, the reimbursement of medical insurance should be improved for those with chronic diseases to alleviate persistent medical expense burdens.
This study found that being female, being aged ≥75 years, and living in an urban area were all positively associated with higher medical expenses, which is consistent with other survey results [21,39]. In addition, married individuals had higher medical costs than unmarried individuals, possibly because married people utilize more health services as a result of supervision from their spouses. Middle-aged and elderly people with high per capita income and health insurance generally incurred higher medical expenses, indicating that those with high income could afford higher medical expenses. Moreover, social health insurance may promote an increase in medical expenses to some extent. Participants who had more outpatient visits and hospitalizations or who visited higher-ranking hospitals generally had higher medical expenses, which is consistent with a previous study [40]. Thus, increasing the social medical insurance reimbursement level for hierarchical health services in China may help to reduce the financial healthcare burden of high-cost users.
There are some limitations to the current study. Because the CHARLS does not include information on chronic disease diagnoses, the chronic disease prevalence was obtained by asking adults whether they had been diagnosed with a chronic disease. Thus, individuals who might have had a chronic disease but were not diagnosed or did not recall being diagnosed may have been excluded, in which case the true prevalence of a particular condition may be underestimated. In addition, samples with missing data were removed from the study, potentially resulting in an undercount of individuals with persistent medical expenses. Finally, medical expenses are likely to be influenced by additional factors that are not included in the survey data. Future research could further explore the mechanisms underlying the persistence of high medical expenses using more complex models.

Conclusions
This study found that total medical expenses were concentrated among a few middle-aged and elderly individuals and the concentration increased over time. Some health and socioeconomic characteristics, such as chronic disease, age, education level, residence, employment status, income level, and medical service utilization condition, were significantly associated with high medical expenses. Thus, the government should address these factors to further improve security levels of serious illness insurance among middle-aged and elderly participants with high medical expenses to lower medical financial risk. High medical expenses also demonstrated strong persistence. Lagged medical expenses had a significantly persistent positive effect on current medical expenses and the effect of lagged one period was the strongest. Those having chronic diseases in the lagged periods were more likely to have persistent medical expenses. To address this, the government should establish a more efficient care system and strengthen the persistent protection effect of social health insurance policies to alleviate long-term financial healthcare burdens.
Author Contributions: L.J. led the analysis of the data and wrote the first draft of the manuscript; Z.W. contributed to the study design, interpretation of the data, and helped in the writing of the final draft of the manuscript; L.Z. and Q.Q. helped in data analysis and contributed to writing. All authors have read and agreed to the published version of the manuscript.
Funding: This study is funded by the major research project of philosophy and social sciences in colleges and universities in Jiangsu Province (No. 2021SJZDA148) and The Excellent Innovation Team of the Philosophy and Social Sciences in the Universities and Colleges of Jiangsu Province "The Public Health Policy and Management Innovation Research Team". The funding bodies were not involved in the design of the study, in data collection, analysis, and interpretation, or in writing the manuscript.

Institutional Review Board Statement: This study was approved by the Academic Research Ethics Committee of Nanjing Medical University; reference number: 2022460. All procedures were in accordance with the ethical standards of the Helsinki Declaration.
Informed Consent Statement: Participants provided informed consent prior to data collection.

Data Availability Statement: The datasets used in the current study are not publicly available due to the confidentiality policy but are available from the corresponding author on reasonable request.