Open Access Article

Peer-Review Record

Financial Relief and Health Effects of Urban–Rural Health Insurance Integration on Older Rural Adults: A Causal Analysis of Age-Based Heterogeneity

Healthcare 2026, 14(12), 1780; https://doi.org/10.3390/healthcare14121780 (registering DOI)

by Sirui Li, Xiangdong Liu, Xi Wang and Shufang Zhao *

Reviewer 1: Md. Abdur Rahman Forhad

Reviewer 2: Anonymous

Reviewer 3: Jorge Luis Tonetto

Submission received: 30 March 2026 / Revised: 11 May 2026 / Accepted: 13 May 2026 / Published: 19 June 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The study evaluates the effects of Urban-Rural Medical Insurance Integration (URMII) on health outcomes and financial burden by analyzing CHARLS panel data from 2013 to 2018 using a multi-period difference-in-differences (DID) methodology. The primary finding is that URMII led to a 35% increase in out-of-pocket (OOP) expenses without corresponding improvements in short-term health outcomes, suggesting a “welfare paradox” driven by demand release and supply-side distortions. While the research addresses a topic of significant importance in health economics and public policy, the current version requires substantial methodological and interpretational revisions prior to publication.

The paper employs a two-way fixed effects (TWFE) DID model despite staggered treatment timing. Although the authors acknowledge potential bias, the corrective measure of excluding late adopters is insufficient. It is recommended that the authors provide comprehensive event-study specifications with leads and lags.

The policy was implemented gradually, but the model averages the effects. This approach neglects dynamic and cohort-specific treatment effects. To address this, the authors could estimate dynamic effects using an event study with multiple leads and lags.

The rollout of URMII is influenced by variations in provinces' fiscal capacity, healthcare infrastructure, and demographic structure. This raises concerns about policy endogeneity and selection bias.

The study indicates no significant short-term health effects but notes that health capital accumulation is dynamic and that short panels may fail to capture true effects.

Author Response

Comments 1 and 2: Regarding the issue that “despite staggered treatment timing, the study still employed a two-way fixed effects model, and excluding late adopters alone is insufficient. This approach overlooks dynamic effects; we recommend providing a complete event-study design that includes both lead and lag periods.”

Response: We fully agree with the reviewer's concerns regarding potential bias in the TWFE estimator under a staggered DID design. The suggestion to incorporate multiple lead and lag periods was very helpful in refining our methodology. We did consider using all five waves of the CHARLS data (2011–2020). However, following rigorous econometric evaluation, we ultimately chose to restrict the sample window to the 2013, 2015, and 2018 waves, based on the following two considerations regarding causal identification. (1) To exclude confounding from the COVID-19 pandemic. The 2020 (Wave 5) CHARLS survey data were subject to the exogenous shock of the pandemic: lockdowns caused an abnormal contraction in rural residents' access to healthcare services and in their healthcare-seeking behavior. If 2020 were included in the assessment of dynamic effects, the policy's own lagged effects would be contaminated by the pandemic. (2) To control for long-term macroeconomic confounding factors. The 2011 (Wave 1) data are too distant from the full implementation of the policy (after 2016) to serve as a rigorous baseline period. Therefore, to keep the difference-in-differences estimates as clean as possible, we limited the event-study window to the three waves of 2013, 2015, and 2018, immediately before and after policy implementation. While this restricts our ability to present longer-term dynamic patterns, it is a necessary compromise to ensure cleaner causal identification.

To address this limitation and the concerns regarding bias, we adopted an alternative strategy: (1) we conducted the event study using 2015 as the baseline period, while transparently reporting the heterogeneous pre-treatment trends observed in 2013; (2) we moved away from relying solely on TWFE and formally introduced propensity score matching combined with difference-in-differences (PSM-DID). By strictly matching the treatment group and the control group during the baseline period, we stripped away the inherent heterogeneous time trends between urban and rural areas, mitigating selection bias; the corresponding results (Table 6, Panel E) support the robustness of our conclusions.
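As an illustration of the event-study logic described in this response, the following is a minimal sketch, not the authors' code. It assumes a simplified two-group, three-wave setting in which each coefficient is the treated-minus-control gap in a wave, measured relative to the same gap in the 2015 baseline wave. The function name `event_study` and the toy rows are hypothetical; staggered provincial timing and covariates are ignored.

```python
from statistics import mean


def event_study(rows, baseline=2015):
    """Return {wave: coefficient} relative to the baseline wave.

    rows: iterable of (wave, treated, outcome), with treated in {0, 1}.
    Each coefficient is the treated-control gap in that wave minus the
    same gap in the baseline wave (zero at the baseline by construction).
    """
    cells = {}
    for wave, treated, outcome in rows:
        cells.setdefault((wave, treated), []).append(outcome)
    waves = sorted({wave for wave, _ in cells})
    gap = {w: mean(cells[(w, 1)]) - mean(cells[(w, 0)]) for w in waves}
    return {w: gap[w] - gap[baseline] for w in waves}


# Hypothetical toy data: (wave, treated, outcome).
toy_rows = [
    (2013, 0, 10.0), (2013, 0, 12.0),
    (2015, 0, 12.0), (2015, 0, 14.0),
    (2018, 0, 15.0), (2018, 0, 17.0),
    (2013, 1, 8.0), (2013, 1, 10.0),
    (2015, 1, 10.0), (2015, 1, 12.0),
    (2018, 1, 11.0), (2018, 1, 13.0),
]
coefs = event_study(toy_rows)
```

In this toy example the 2013 coefficient is zero, which is what a flat pre-trend would look like; a non-zero 2013 coefficient would correspond to the pre-event heterogeneity the authors say they report.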

The changes are as follows: Section 3.1 (Model Specification) (Page 5), Section 4.3 (Parallel Trends and Endogeneity) (Pages 10–12), and Section 4.4.3 (PSM-DID) (Pages 15–16).

Comments 3: "The implementation of policies is influenced by provincial fiscal capacity, healthcare infrastructure, and demographic structure, which may lead to policy endogeneity and selection bias."

Response: The reviewer's concerns regarding provincial heterogeneity driving policy implementation are well-founded. To address this endogeneity, we included provincial fixed effects in our model to account for time-invariant differences in fiscal capacity and infrastructure across provinces. More importantly, in the revised manuscript, we have included 500 Monte Carlo placebo tests (Figure 2) to rule out the interference of other macroeconomic shocks during the same period from a spatial counterfactual perspective. Furthermore, a sensitivity analysis excluding provinces with later implementation (Table 5, Panel C) indicates that the burden-reduction effect remains valid in a sample with more uniform exposure timing.

The changes are as follows: Sections 4.4.1 and 4.4.2 (Pages 14–15).
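The placebo logic mentioned in this response can be sketched as follows. This is a hedged illustration rather than the authors' procedure: it assumes the placebo is built by randomly reassigning treatment labels across units and recomputing a simple DID estimate 500 times. The names `did_estimate` and `placebo_test`, the seed, and the toy data are all hypothetical.

```python
import random
from statistics import mean


def did_estimate(changes, treated):
    """DID on first differences: mean change of treated minus mean change of controls."""
    t = [c for c, d in zip(changes, treated) if d == 1]
    u = [c for c, d in zip(changes, treated) if d == 0]
    return mean(t) - mean(u)


def placebo_test(changes, treated, draws=500, seed=0):
    """Shuffle treatment labels `draws` times and compare with the actual estimate."""
    rng = random.Random(seed)
    actual = did_estimate(changes, treated)
    labels = list(treated)
    placebo = []
    for _ in range(draws):
        rng.shuffle(labels)  # fake assignment with the same number of treated units
        placebo.append(did_estimate(changes, labels))
    # Share of placebo estimates at least as extreme as the actual one.
    p_value = mean(1.0 if abs(b) >= abs(actual) else 0.0 for b in placebo)
    return actual, placebo, p_value


# Hypothetical post-minus-pre changes: treated units fall, controls drift up slightly.
changes = [-2.0 + 0.1 * i for i in range(10)] + [0.1 * i for i in range(10)]
treated = [1] * 10 + [0] * 10
actual, placebo, p_value = placebo_test(changes, treated)
```

If the actual estimate sits far in the tail of the placebo distribution, it is less likely to reflect a coincident shock unrelated to treatment assignment.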

Comments 4: Regarding the issue that “the reviewer noted there was no significant impact on short-term health and pointed out that the accumulation of health capital is dynamic, and that a short panel may fail to capture the true effects.”

Response: We thank the reviewer for acknowledging our explanation of health outcomes. We fully agree with your view that the accumulation of health capital is a long-term, dynamic process. Economic interventions (such as increasing reimbursement rates) can immediately alleviate the "financial toxicity" caused by illness, but transforming the utilization of healthcare services into tangible physiological and psychological improvements requires a longer observation window. In the revised manuscript, we have significantly expanded this discussion in Section 5.1 (Discussion). By introducing Grossman's (1972) "health capital model," we have explicitly articulated the inherent intertemporal lags in health capital accumulation and emphasized that the primary policy objective of basic health insurance is to provide financial protection, rather than to instantly extend healthy life expectancy. Additionally, we have explicitly included "the limitations of short-panel data" in the research outlook.

The changes are as follows: Section 5.1, Section 6.1 (Limitations of the Study) (Pages 18–19, 20).

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript presents a research topic that may be of importance to the audience, especially decision makers. The increase in costs for each individual without a visible benefit is a topic that can always be interesting. The paper is well structured and follows the usual academic format. On the other hand, there are elements that cause concern and as such need to be improved.

1. At the beginning of the Introduction section, the initial motivation for the research and the presented context are nicely explained. On the other hand, this section lacks a clear positioning of the research gap and the novelty brought by the study in relation to the same or similar research in the world. In the second part of this section, the contribution is positioned, but a clearly declared research goal is missing.

Already in this section, the design of the study is partially presented ("Based on this, this paper utilizes panel data from the China Health and Retirement Longitudinal Study (CHARLS) from 2013 to 2018. It employs a multi-period Difference-in-differences (DID) model to systematically evaluate the net effects of the health insurance integration on the multidimensional health and economic burdens of rural middle-aged and elderly individuals.") and this is not a common practice. In this context, it is recommended to clearly separate the parts of the text that refer to methodological aspects in relation to the introduction. A clear grouping of parts of the text that belong to the same thematic units is expected.

The key weakness of the study is also recognized in the mentioned part of the text. It is stated that data was used for the period 2013-2018, with the clear fact that today is already 2026. What is the significance of data from the period 2013-2018 for decision makers in 2026? It is known that making decisions in the current moment has an impact on the effects in the years to come. It additionally provides a warning and signals the question: what is the significance of outdated data for the future? In addition, it should be taken into account that there is a clear discontinuity for the period from 2018 to 2026. The circumstances of ten years ago can hardly be the same or even similar to those of today and that fact calls into question the purpose of the work. Taking into account that the main purpose of the manuscript is reflected in a practical rather than a theoretical or strictly academic contribution, it can be stated that this design of the study causes concern. The author is expected to provide a strong and complete argument for the relevance of the study based on very outdated data. This is a critical element of the work and special attention should be directed towards this.

2. It is necessary to explain more clearly why the division into treated and control groups based on hukou status is appropriate for evaluating the effect of health insurance integration. Since the reform refers to the institutional change of the insurance system, it would be useful to show more precisely the relationship between hukou status, type of insurance and actual exposure to the reform. This would facilitate the interpretation of the DID results as an effect of the policy itself, rather than a broader difference between rural and urban respondents.

3. Tests of parallel trends represent an important part of the work, but their results are not equally convincing for all outcomes. In particular, the findings for health outcomes, where there are indications of pretreatment differences, need to be more cautiously interpreted. And for the economic outcome, it would be advisable to soften the wording that suggests a fully confirmed causal mechanism. The results are interesting and relevant, but it would be more appropriate to present them as findings that support a certain interpretation, rather than as definitive proof.

4. The Discussion section is focused on the interpretation of the results of the current study, without comparison with the results of previous studies and discussion of similarities and differences between the obtained results. Also, theoretical implications were missing, thus neglecting the academic purpose of the work. Therefore, additional attention should be focused on these elements.

The significant effort invested by the authors in the study is evident. However, additional methodological clarifications are needed, as well as a more careful formulation of the conclusions in order to give the manuscript the necessary integrity.

Author Response

At the beginning of the Introduction section, the initial motivation for the research and the presented context are nicely explained. On the other hand, this section lacks a clear positioning of the research gap and the novelty brought by the study in relation to the same or similar research in the world. In the second part of this section, the contribution is positioned, but a clearly declared research goal is missing.

Already in this section, the design of the study is partially presented ("Based on this, this paper utilizes panel data from the China Health and Retirement Longitudinal Study (CHARLS) from 2013 to 2018. It employs a multi-period Difference-in-differences (DID) model to systematically evaluate the net effects of the health insurance integration on the multidimensional health and economic burdens of rural middle-aged and elderly individuals.") And this is not a common practice. In this context, it is recommended to clearly separate the parts of the text that refer to methodological aspects in relation to the introduction. A clear grouping of parts of the text that belong to the same thematic units is expected.

The key weakness of the study is also recognized in the mentioned part of the text. It is stated that data was used for the period 2013-2018, with the clear fact that today is already 2026. What is the significance of data from the period 2013-2018 for decision makers in 2026? It is known that making decisions in the current moment has an impact on the effects in the years to come. It additionally provides a warning and signals the question: What is the significance of outdated data for the future? In addition, it should be taken into account that there is a clear discontinuity for the period from 2018 to 2026. The circumstances of ten years ago can hardly be the same or even similar to those of today, and that fact calls into question the purpose of the work. Taking into account that the main purpose of the manuscript is reflected in a practical rather than a theoretical or strictly academic contribution, it can be stated that this design of the study causes concern. The author is expected to provide a strong and complete argument for the relevance of the study based on very outdated data. This is a critical element of the work, and special attention should be directed towards this.

Response: We fully agree with your critiques regarding the structure and logic of the Introduction. (1) We have revised the Introduction to clearly define the research gap (namely, the lack of evidence regarding the relationship between final health outcomes and intermediate healthcare utilization) and to clearly articulate the study objectives. (2) We have removed the detailed methodological description from the Introduction and placed it strictly in Section 3. (3) We greatly appreciate your critical question regarding the relevance of the 2013–2018 data to policymakers in 2026. We have provided a thorough justification in the paper. The 2013–2018 period serves as a clean quasi-natural experiment window capable of isolating the causal effects of the URMII policy (implemented in 2016). Extending the panel data beyond 2019 would introduce significant contamination into the estimates due to the massive exogenous shock caused by the COVID-19 pandemic, which severely distorted the utilization of healthcare services globally and in rural areas. Therefore, deriving uncontaminated causal parameters from this historical window is scientifically important, as it can provide reliable long-term policy insights for current policymakers committed to deepening healthcare integration.

Sections and content to be revised in the article: The Introduction. The revised text is as follows (Pages 2–3):

Introduction

China’s aging process is characterized by the coexistence of “growing old before growing rich” and an “urban-rural inversion” (Ren Zeping 2023; WHO 2024). Data from the Seventh National Population Census show that the proportion of the elderly population in rural areas is significantly higher than in urban areas, and rural seniors generally face the dual pressures of low pension levels and heavy medical expenses, making them vulnerable to falling into the health poverty trap (Jiang Lin 2024). To bridge the institutional gap in healthcare coverage, the reform to integrate the basic medical insurance systems for urban and rural residents was fully launched in 2016. This reform consolidated the NCMS with the urban resident basic medical insurance (URBMI) into a unified URRBMI system, standardizing funding criteria, reimbursement benefits, and drug formularies, with coverage extending to over 1 billion insured individuals. This institutional transformation provides a rare quasi-natural experiment window for evaluating the health outcomes of universal public medical insurance.

While existing research has examined the effects of health insurance integration, further work is needed in the following areas. First, most studies have focused on intermediate outcomes such as reimbursement rates and the level of care sought (Liu Weiwei 2020; Yang Lin & Miao Lizhong 2021). Health is not merely about curing diseases but also encompasses the maintenance of daily living functions and psychological well-being (Liu Chang & He Kun 2025); yet, causal evaluations targeting ultimate outcomes such as physical and mental health remain scarce. Furthermore, few studies have effectively disentangled inherent urban-rural health trends to perform causal identification of the net effect of the policy. Second, the net effect of integration on household out-of-pocket (OOP) expenditures faces a theoretical tension: while higher reimbursement rates reduce OOP expenditures (Du et al. 2022), the behavioral response of releasing pent-up medical demand may increase total expenditures (Qiu et al. 2026). Which of these forces prevails depends on the characteristics of existing rural healthcare demand and supply-side conditions; academia has yet to provide a clear answer. Third, the fairness of the distribution of policy benefits, particularly whether vulnerable groups such as the elderly and those with disabilities are effectively covered by the system, still lacks sufficient empirical testing (Sowden et al. 2025).

Based on this, the objective of this study is to systematically evaluate the causal net effects of the urban–rural health insurance integration on the multidimensional health and economic burden of older adults in rural areas, and to analyze the distribution patterns of policy benefits across different socioeconomic groups. The study utilizes panel data from the China Health and Retirement Longitudinal Study (CHARLS) for the period 2013–2018. The time window was defined based on quasi-natural experiment logic: the period around 2016 marked the shock phase of concentrated policy implementation, while the COVID-19 pandemic disrupted healthcare service data after 2019. Consequently, the years 2013–2018 constitute a relatively clean observation window for assessing the policy's effects prior to the pandemic. The causal parameters derived from this window can provide practical, cross-economic-cycle guidance for policymakers in the present and future (2026 and beyond) to deepen healthcare system reforms and improve long-term mechanisms for preventing medically induced poverty. The subsequent structure of this paper is as follows: Section 2 presents the literature review and hypotheses; Section 3 introduces the model and data; Section 4 presents the empirical results; Section 5 discusses the findings; and the paper concludes with recommendations.

It is necessary to explain more clearly why the division into treated and control groups based on hukou status is appropriate for evaluating the effect of health insurance integration. Since the reform refers to the institutional change of the insurance system, it would be useful to show more precisely the relationship between hukou status, type of insurance and actual exposure to the reform. This would facilitate the interpretation of the DID results as an effect of the policy itself, rather than a broader difference between rural and urban respondents.

Response: Thank you for your important methodological suggestion. As you pointed out, we do need to clarify the relationship between household registration status, insurance type, and policy coverage. In the revised Section 3.2.2, we have explicitly outlined the rationale for using household registration status based on the "Intention-to-Treat" (ITT) framework. Prior to integration, China's health insurance system was strictly dichotomized due to the rigid household registration system: residents with agricultural hukou were institutionally assigned to the new cooperative medical scheme (NCMS). In contrast, urban non-working residents were uniformly included in the urban resident basic medical insurance (URBMI) system. Household registration status serves as a strictly exogenous and rigid institutional barrier, capable of circumventing endogeneity (selection bias) arising from voluntary enrollment in different insurance schemes. Therefore, using rural household registration as the treatment group indicator perfectly captures participants' actual exposure to the institutional integration reform.

Sections and content to be revised in the article: 3.2.2 Core explanatory variables. The specific additions are as follows (Pages 6–7):

3.2.2 Core Explanatory Variables

The core explanatory variables were constructed based on the timing of policy implementation and the respondents’ household registration status. The temporal dimension (Post) defines the year of the policy shock based on the State Council’s “Opinions on Integrating the Urban and Rural Resident Basic Medical Insurance Systems” and the actual implementation progress in each province (Cheng et al. 2020). Among the 28 provinces covered by the CHARLS dataset, 25 completed the substantive integration by 2017 or earlier, while Beijing, Jiangsu, and Gansu did not complete the integration until 2018; the model thus defines the policy implementation status for each province accordingly (Liu et al. 2019). The definition of the treatment group (Treat) is based on an individual’s household registration status. Prior to integration, residents with agricultural hukou were enrolled in the new cooperative medical scheme (NCMS), while urban non-working residents were enrolled in the urban resident basic medical insurance; the two systems operated separately and in parallel. Household registration status (particularly for middle-aged and older adults aged 45 and above) exhibits strong temporal stability and exogeneity. Grouping based on household registration aligns with the intention-to-treat (ITT) framework (Nagel et al. 2020). It helps avoid sample selection bias that might arise from categorization based on self-reported insurance types.
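The construction of the Treat and Post indicators described above can be sketched in a few lines. This is an illustrative sketch only: the default integration year of 2017 is a placeholder (the text says only "2017 or earlier" for 25 provinces), the rule that a wave counts as post-policy when it falls in or after the integration year is an assumption, and the names `integration_year` and `did_indicator` and the hukou labels are hypothetical.

```python
# Provinces named in the text as completing integration only in 2018.
LATE_PROVINCES = {"Beijing", "Jiangsu", "Gansu"}


def integration_year(province, default_year=2017):
    """Year the province is treated as integrated.

    default_year is a placeholder: the source gives no province-level
    dates for the 25 provinces that integrated in 2017 or earlier.
    """
    return 2018 if province in LATE_PROVINCES else default_year


def did_indicator(hukou, province, wave):
    """Treat x Post under an intention-to-treat definition based on hukou."""
    treat = 1 if hukou == "agricultural" else 0
    # Assumption: a wave is post-policy if it is in or after the integration year.
    post = 1 if wave >= integration_year(province) else 0
    return treat * post


# Hypothetical respondents: (hukou, province, CHARLS wave).
examples = [
    ("agricultural", "Hebei", 2015),
    ("agricultural", "Hebei", 2018),
    ("non-agricultural", "Hebei", 2018),
    ("agricultural", "Gansu", 2015),
]
flags = [did_indicator(*e) for e in examples]
```

Because assignment depends on registration status rather than on self-reported insurance type, respondents do not select into treatment through their enrollment choice, which is the point of the ITT framing.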

Tests of parallel trends represent an important part of the work, but their results are not equally convincing for all outcomes. In particular, the findings for health outcomes, where there are indications of pretreatment differences, need to be more cautiously interpreted. And for the economic outcome, it would be advisable to soften the wording that suggests a fully confirmed causal mechanism. The results are interesting and relevant, but it would be more appropriate to present them as findings that support a certain interpretation, rather than as definitive proof.

Response: We completely agree with your highly professional assessment regarding the parallel trends and the need for cautious causal claims.

(1) Regarding Pre-treatment Differences: In the revised Section 4.3 (Parallel Trends and Endogeneity), we explicitly acknowledge that the simple TWFE model failed the strict parallel trend test for health outcomes due to the inherent heterogeneous depreciation rate of health capital between rural and urban populations. To address this, we introduced the PSM-DID model (Section 4.4.3). By strictly matching covariates at the 2013 baseline, we eliminated these pre-treatment structural differences and re-identified the robust policy effects. (2) Softening Causal Claims: We fully accept your advice. Throughout the revised manuscript (especially in Section 4.2 and Chapter 5), we have carefully softened our wording. We removed definitive phrases like "fully confirmed" or "definitive proof," and rephrased them as "the data supports the interpretation that..." or "empirical results indicate a robust association driven by the policy." We treat our findings as strong empirical evidence supporting theoretical interpretations rather than absolute proofs.
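The PSM-DID idea referred to in point (1) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes propensity scores have already been estimated from baseline covariates (for example by a logit model, which is not shown), uses 1:1 nearest-neighbor matching with replacement and a hypothetical caliper of 0.05, and then averages the matched difference-in-differences. The function name `psm_did` and the toy data are hypothetical.

```python
def psm_did(treated, controls, caliper=0.05):
    """Average matched DID effect and the number of matched treated units.

    treated, controls: lists of (pscore, y_pre, y_post).
    Each treated unit is matched, with replacement, to the control with
    the closest propensity score; pairs outside the caliper are dropped.
    """
    effects = []
    for p_t, pre_t, post_t in treated:
        p_c, pre_c, post_c = min(controls, key=lambda c: abs(c[0] - p_t))
        if abs(p_c - p_t) <= caliper:
            effects.append((post_t - pre_t) - (post_c - pre_c))
    if not effects:
        raise ValueError("no treated unit has a match within the caliper")
    return sum(effects) / len(effects), len(effects)


# Hypothetical units: (pre-estimated propensity score, outcome pre, outcome post).
treated_units = [(0.60, 10.0, 9.0), (0.40, 8.0, 7.5), (0.95, 5.0, 5.0)]
control_units = [(0.58, 11.0, 11.0), (0.42, 9.0, 9.5), (0.10, 4.0, 6.0)]
att, n_matched = psm_did(treated_units, control_units)
```

In this toy example the third treated unit has no control within the caliper and is dropped, which shows how matching trades sample size for comparability at baseline.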

Sections and content requiring revision in the article: Section 4.3 (Pages 10–12), Section 4.4.3 (Pages 15–16), and Chapter 5 (Pages 18–20).

The Discussion section is focused on the interpretation of the results of the current study, without comparison with the results of previous studies and discussion of similarities and differences between the obtained results. Also, theoretical implications were missing, thus neglecting the academic purpose of the work. Therefore, additional attention should be focused on these elements.

The significant effort invested by the authors in the study is evident. However, additional methodological clarifications are needed, as well as a more careful formulation of the conclusions, in order to give the manuscript the necessary integrity.

Response: Thank you for pointing out this critical deficiency in the original manuscript. We have completely rewritten the Discussion (Chapter 5) to fulfill its true academic purpose.

In the revised Chapter 5, we extensively engage in theoretical dialogue and literature comparison. (1) In Section 5.1, we compare our findings with the famous "Oregon Health Insurance Experiment" (Finkelstein et al., 2012) and employ Grossman's Health Capital Theory (1972) to explain the immediate synchronization of financial protection and health improvements. (2) In Section 5.2, we contrast our results with previous concerns regarding "Supplier-Induced Demand (SID)" and moral hazard (Yip & Mahal, 2008), providing a theoretical explanation for the achievement of "Pure Financial Protection" without excessive healthcare utilization. (3) In Section 5.3, we discuss the intergenerational boundaries of the policy dividends, contrasting our findings with the Fundamental Cause Theory (Link & Phelan, 1995) to highlight the limitations of current medical insurance in covering long-term care for the deeply aged and disabled population.

Sections and content to be revised in the article: Chapter 5. The revised content is as follows (Pages 20–22):

5 Discussion

5.1 Synergies Between Financial Protection and Health Improvements

Empirical results show that the urban–rural health insurance integration reduced actual OOP expenditure for middle-aged and elderly rural residents (by approximately 5.6%) while simultaneously leading to improvements in indicators of physical functioning (ADL) and psychological depression (CES-D). In existing evaluations of international health insurance systems, the classic “Oregon Health Insurance Experiment” (Finkelstein et al. 2012) observed that expanding insurance coverage can rapidly improve the psychological well-being of low-income groups. However, the short-term effects of such interventions on physiological indicators, such as functional impairment, were unclear. Compared to the asymmetric pattern observed in Western literature, where "psychological benefits are significant but physiological improvements lag," the dual improvement effect observed in this study stems from the unique baseline constraints and intensity of policy interventions among China's rural population. Analyzed through the lens of Grossman's (1972) health capital theory, rural elderly populations have long been subject to tight healthcare budget constraints, making their health investment behavior sensitive to service prices. Following the integration of the healthcare system, the elevation of the pooling level substantially reduced the relative prices of clinical treatments. This price leverage not only directly alleviated the anxiety and depression caused by medically induced poverty but also, by easing financial constraints, prompted more timely primary healthcare interventions, effectively counteracting the natural decline in physical function. These findings confirm the health spillover benefits of public health insurance beyond its role as a financial safety net, providing robust causal evidence for developing countries to narrow the urban-rural health welfare gap through institutional integration under resource-constrained conditions.

5.2 Mitigating Moral Hazard and Ensuring Financial Safety Nets

Empirical analysis reveals that the integration policy did not lead to a statistically significant increase in the probability of rural residents seeking outpatient or inpatient care. However, the risk of households incurring CHE decreased significantly by 1.9%. In traditional health economic discussions regarding the expansion of basic health insurance, supplier-induced demand and moral hazard are mechanisms of widespread concern in the academic community. Some previous empirical studies have argued that higher reimbursement levels stimulate overutilization of medical services, thereby driving up systemic total costs (Van Dijk et al. 2013; Zavras 2025). However, the micro-level pathways outlined by the data in this study diverge from these concerns: the stable trend in healthcare utilization rules out disorderly resource misuse. At the same time, the reduction in extreme financial risk reflects how the expansion of the fund pool has effectively alleviated the burden of severe illnesses. This phenomenon of "pure financial protection without overtreatment" is attributed to the institutional safeguards designed during China's health insurance pooling process. While policies have increased reimbursement rates, they have retained strict deductibles and copayment mechanisms (Wei et al. 2026) and strengthened regulatory constraints on the medical insurance formulary. This ensures that newly pooled funds primarily fulfill the essential insurance function of hedging against the tail risks of severe illnesses, rather than providing excessive subsidies for minor ailments. The empirical results support the competitive hypothesis that price compensation effects prevail, confirming that through the optimization of risk-sharing mechanisms, universal health insurance can fulfill its baseline function of preventing medically induced poverty without increasing the overall financial burden.

5.3 Policy Benefits Fail to Penetrate Non-Medical Barriers Facing the Elderly with Disabilities

Heterogeneity analysis indicates that the reduction in medical burdens exhibits universal benefits across groups with different educational levels; however, intergenerational heterogeneity in benefits reveals implicit boundaries in institutional coverage, as no substantial relief in burdens was observed among the elderly aged 75 and older. This divergence partially deviates from and complements the classic predictions of the Fundamental Cause Theory in sociology. This theory typically posits that socioeconomic status determines why disadvantaged groups struggle to access policy benefits equitably (Link & Phelan, 1995). The universal nature of the educational dimension in the research findings reflects that the integration policy demonstrates strong institutional inclusivity in process optimization and benefit disbursement, successfully overcoming barriers to information conversion. However, the limited benefits for the elderly expose the inherent blind spots in basic medical insurance, which is oriented toward clinical cure. As the aging process progresses, the core expenditure structure of the elderly gradually shifts from routine outpatient, emergency, and inpatient costs to long-term care and disability care costs that fall outside the scope of medical insurance reimbursement. Relying solely on health insurance for medical conditions cannot penetrate the non-medical financial barriers caused by physiological decline. These findings clearly highlight the limitations of the current health system reform and suggest that policy design should focus on transitioning from single-item expense reimbursement to comprehensive elderly care, providing crucial empirical support for accelerating the institutional integration of health insurance and long-term care insurance (LTCI).

Reviewer 3 Report

Comments and Suggestions for Authors

The paper presents well-constructed hypotheses and a clearly structured and well-written manuscript. The research question is highly relevant, and the empirical strategy is appropriate for the problem under investigation. The dataset appears robust, with a substantial number of observations, which strengthens the empirical analysis.

One possible interpretation of the approximately 35% increase in out-of-pocket health expenditures is that the reform effectively reached very poor households that previously spent little or nothing on formal healthcare, often relying on home remedies or informal treatment. The research data are consistent with this reading. In this sense, the insurance expansion may have enabled access to formal care and made treatment choices feasible for the first time. The policy may therefore have activated previously suppressed preferences for healthcare, reflecting latent demand that could not previously be realized.

The parallel trends test suggests that the treatment group had lower healthcare expenditures than the control group prior to the intervention, which is consistent with the idea that the treated population may have been more financially constrained before the policy.

The reported reduction in the probability of hospitalization is potentially an important result. Hospitalization represents one of the most economically significant components of healthcare spending. If the policy indeed reduces hospitalization rates, this could indicate a more efficient and preventive healthcare system overall.

However, it is important to emphasize that only the financial burden variable shows statistically significant results. The other two hypotheses do not yield statistically significant effects and therefore should be interpreted with greater caution. The discussion should clearly highlight this distinction.

In addition, the assumption of parallel trends for the ADL (physical health) dimension appears unstable. This weakens the identification strategy for this outcome and suggests that results related to ADL should be interpreted very cautiously. In fact, given the lack of parallel trends and the absence of statistically significant estimates, the manuscript should avoid drawing substantive conclusions regarding this outcome.

My reading of the conclusion is the following:

The empirical results indicate that the Urban–Rural Medical Insurance Integration (URMII) significantly increased out-of-pocket healthcare expenditures among rural residents. However, the study does not provide statistically significant evidence of short-term improvements in either physical or mental health outcomes.

Therefore, the findings do not confirm the traditional expectation that expanding insurance coverage would necessarily reduce the financial burden on households. Instead, the increase in expenditures may reflect the release of previously suppressed demand for healthcare services that had been constrained by limited financial capacity. In this sense, the policy may have enabled individuals to pursue healthcare needs that were previously unattainable.

That said, the study does not provide direct empirical evidence supporting some of the stronger interpretations presented in the discussion, such as the claim that:

“Under this system, Health insurance integration has enhanced patients' payment capacity, objectively incentivizing healthcare institutions to provide more high-tech, high-cost diagnostic and treatment services [50].”

The empirical model is not capable of detecting such mechanisms. The authors should therefore ensure that interpretations remain strictly grounded in statistically supported results. The absence of significant health outcomes in the model does not necessarily imply that no health improvements occurred; rather, it indicates that the study was not able to detect such effects within the available data and time horizon.

Citations from the literature should not be conflated with the empirical results obtained in the study. Doing so risks introducing interpretative bias and may inadvertently weaken what is otherwise a very well-conducted and valuable piece of research.

More generally, some sections of the discussion appear to overinterpret non-significant results as negative findings, while potentially underexploring the implications of the significant results.

The conclusions should remain strictly consistent with the methodological limitations and empirical results.

Overall, the policy evaluated in the study appears promising and potentially beneficial. However, the results suggest that the question should be revisited in future studies with additional data and a longer follow-up period in order to assess the policy's long-term effects on health outcomes more accurately.

Inequality is multidimensional and extends beyond income differences to include disparities in capabilities, access to services, opportunities, and social participation.

Author Response

Question 1: Regarding the statement that “the ‘increase in costs’ may reflect the release of pent-up demand,” while noting that the parallel trends indicate that the treatment group (rural areas) faced more severe financial constraints prior to the policy.

Response: We thank the reviewer for noting that the treatment group faced significant financial constraints during the baseline period. As shown in the revised Sections 4.1 (Descriptive Statistics) and 4.3 (Parallel Trends), the baseline data reveal a pronounced health deficit among the rural population at lower expenditure levels. This phenomenon is inherently consistent with the logic of financial constraints you mentioned. After correcting for missing values and coding errors in the earlier data, our empirical results show that the integration policy effectively alleviated this budget constraint by raising the level of coverage. Notably, this alleviation of constraints did not manifest as a surge in costs within the sample, but rather translated into significant improvements in physical and mental health indicators. This transformation from "financial constraints" to a "health benefit" validates the substantive role of health insurance pooling in risk protection and welfare enhancement.

Sections and content requiring revision in the article: Sections 4.1 (pages 8–9) and 4.3 (pages 10–12).

Question 2: Regarding the finding that “hospitalization rates have declined,” which suggests that the healthcare system has become more efficient and preventive.

Response: Thank you for highlighting the significant economic implications of hospitalization. In our revised dataset (see Table 6 in the updated Section 4.5), we found that the probability of hospitalization remained statistically stable (with no significant change), rather than declining. However, while the hospitalization rate remained stable, the probability of catastrophic health expenditure (CHE) decreased significantly by 1.9% (p < 0.01). We believe this latest finding also reflects the policy's positive role in enhancing system efficiency. While providing robust financial protection against catastrophic shocks, the policy has neither triggered inefficient overutilization nor placed pressure on hospital capacity.

Sections and content requiring revision in the article: Section 4.5 (pages 16–17).

Question 3: Regarding the issue that “only the economic burden was statistically significant in the data, while health outcomes were not significant at all; furthermore, the parallel trends for ADL were poor, strongly urging us ‘not to draw any substantive conclusions regarding health outcomes.’”

Response: Thank you for your rigorous scrutiny of the statistical standards. Your comment regarding the instability of parallel trends in the ADL dimension in the original manuscript is well-founded. (1) Regarding statistical significance: After correcting coding errors in the baseline data set, the study results have changed. As shown in the revised Table 3, improvements in physical health (ADL, p < 0.01) and mental health (CES-D, p < 0.01) have both reached statistical significance. (2) Regarding the lack of parallel trends in ADL: We fully agree that, due to baseline structural differences (as acknowledged in the revised Section 4.3), the simple TWFE model did not fully satisfy the parallel trends assumption for ADL. To address your specific concerns and strengthen our identification strategy, we have formally introduced a propensity score matching (PSM) difference-in-differences (PSM-DID) model (revised Section 4.4.3, Table 6, Panel E). By strictly matching covariates at baseline, we eliminated the pre-treatment differences you identified, thereby yielding robust and statistically significant results regarding health improvements.

Sections and content requiring revision in the article: Sections 4.2 (pages 9–10), 4.3 (pages 10–12), and 4.4.3 (pages 15–16).
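
For readers unfamiliar with the PSM-DID idea mentioned in the response to Question 3, the following is a minimal sketch only. It assumes a propensity score has already been estimated for each unit, uses 1:1 nearest-neighbour matching with replacement and a caliper, and runs on made-up toy values. The data, the caliper of 0.05, and the function names `match_nearest` and `psm_did` are all hypothetical; the record does not describe the authors' actual matching procedure, which may differ.

```python
# Hypothetical toy panel. All numbers are invented for illustration and do
# not come from CHARLS or from the manuscript.
# Fields: (id, treated, propensity score, outcome before, outcome after)
units = [
    ("t1", True, 0.62, 2.0, 1.5),
    ("t2", True, 0.48, 3.0, 2.6),
    ("t3", True, 0.81, 1.0, 0.4),
    ("c1", False, 0.60, 2.2, 2.1),
    ("c2", False, 0.50, 3.1, 3.0),
    ("c3", False, 0.20, 4.0, 4.3),
    ("c4", False, 0.78, 1.1, 1.0),
]


def match_nearest(units, caliper=0.05):
    """1:1 nearest-neighbour matching on the propensity score, with replacement.

    Treated units whose closest control lies outside the caliper are dropped.
    """
    treated = [u for u in units if u[1]]
    controls = [u for u in units if not u[1]]
    pairs = []
    for t in treated:
        best = min(controls, key=lambda c: abs(c[2] - t[2]))
        if abs(best[2] - t[2]) <= caliper:
            pairs.append((t, best))
    return pairs


def psm_did(pairs):
    """Average over matched pairs of (treated change minus control change)."""
    diffs = [(t[4] - t[3]) - (c[4] - c[3]) for t, c in pairs]
    return sum(diffs) / len(diffs)


pairs = match_nearest(units)
estimate = psm_did(pairs)
print("matched controls:", [c[0] for _, c in pairs])
print("PSM-DID estimate:", round(estimate, 3))
```

In this toy example the control unit with a very different propensity score is never used, which is the sense in which matching can remove baseline differences before the difference-in-differences comparison is made. A real application would also need standard errors and balance diagnostics, which the sketch omits.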

Question 4: Regarding the issue of “overinterpreting the notion that healthcare institutions create demand in the absence of direct statistical evidence to support it”.

Response: We accept this important methodological suggestion. As you pointed out, our previous draft suffered from interpretive bias by conflating references to the literature (such as "provider-induced demand") with our empirical limitations. In the completely rewritten Chapter 5 (Discussion), we have grounded each interpretation strictly in statistically supported findings. Given that our new empirical model (Table 6) shows no significant increase in the probability of healthcare service utilization, we have explicitly removed all speculative assertions regarding "hospitals having an incentive to provide high-cost services." Instead, we now firmly ground our discussion in the evidence of "pure financial protection," which has been verified and is free from moral hazard. We have carefully eliminated any overinterpretation of non-significant results to ensure the academic integrity of the paper.

Sections and content requiring revision in the paper: Chapter 5 (pages 18–20)

Question 5: Regarding the statement that “inequality is not merely a matter of income, but a multidimensional gap encompassing capabilities, access to services, and social participation.”

Response: We fully agree with your insightful observations regarding the multidimensional nature of inequality. The issue of inequality does indeed involve multiple dimensions. To reflect this broader perspective, in the revised Section 4.6 (Analysis of Heterogeneity), we have explicitly expanded the scope of our analysis from income to include disparities in capabilities and access to opportunities. Specifically, we examine how age-related disabilities (capabilities) and educational attainment (institutional adaptability) create multidimensional barriers that hinder older adults from equally sharing in the benefits of public policies.

Sections and content to be revised in the article: Section 4.6 (pages 17–18)

4.6 Heterogeneity Analysis

The distribution of the inclusive benefits resulting from institutional integration across different groups is a key factor in determining the fairness of the policy. This paper conducts a heterogeneous regression analysis based on educational attainment and age structure (Table 8 and Figure 5).

Table 8 Heterogeneity Analysis: Regression by Education Level and Age

Variables	Panel A: SES		Panel B: Age	
	(1) Low-education group	(2) Higher-education group	(3) Middle-aged group (45–74 years old)	(4) Elderly group (75 years and older)
Did (Treat × Post)	−0.046	−0.044	−0.067**	0.019
	(0.049)	(0.031)	(0.028)	(0.056)
Control Variables & Fixed Effects	YES	YES	YES	YES
Observations (N)	23,362	25,370	42,905	5,827
R²	0.019	0.024	0.020	0.032

The analysis results reveal distinct patterns in the scope of policy impact. On the one hand, regarding socioeconomic status (SES), the reduction in healthcare costs for the low-education group and the high-education group was highly consistent in magnitude (–0.046 and –0.044, respectively). The universal nature of the cost-reduction effect across different educational groups fails to support the hypothesis in H3a that individuals with absolute essential needs would exhibit a stronger behavioral response. On the other hand, intergenerational heterogeneity was significant. The policy benefits were primarily evident among middle-aged and older adults (45–74 years old, Did = −0.067, p < 0.05). In contrast, the limited benefits for those aged 75 and older corroborate the core logic of Hypothesis H3b. That is, universal health insurance struggles to fully penetrate the non-medical financial barriers arising from aging and disability, leading to a decline in the effectiveness of policy benefits among the most vulnerable elderly population.
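
The significance pattern described above can be checked by hand from the figures printed in Table 8. The sketch below assumes that the unlabeled row beneath the Did coefficients reports standard errors (0.049, 0.031, 0.028, and 0.056 in magnitude), uses a standard normal approximation for the p-values, and treats the two education subsamples as independent when comparing them. These are assumptions of this illustration, not statements about how the authors computed their tests, and the helper names are hypothetical.

```python
import math

# Point estimates and (assumed) standard errors as printed in Table 8.
table8 = {
    "low education": (-0.046, 0.049),
    "higher education": (-0.044, 0.031),
    "age 45-74": (-0.067, 0.028),
    "age 75+": (0.019, 0.056),
}


def two_sided_p(coef, se):
    """Two-sided p-value under a standard normal approximation."""
    z = coef / se
    return math.erfc(abs(z) / math.sqrt(2))


def difference_z(a, b):
    """z-statistic for the gap between two estimates, assuming independence."""
    (coef_a, se_a), (coef_b, se_b) = a, b
    return (coef_a - coef_b) / math.sqrt(se_a ** 2 + se_b ** 2)


p_values = {group: two_sided_p(*est) for group, est in table8.items()}
significant = [group for group, p in p_values.items() if p < 0.05]
z_education = difference_z(table8["low education"], table8["higher education"])

for group, p in p_values.items():
    print(f"{group}: p = {p:.3f}")
print("significant at 5%:", significant)
print(f"z for the education gap: {z_education:.3f}")
```

Under these assumptions, only the 45–74 age group clears the 5% threshold, and the gap between the two education groups is very small relative to its standard error, which matches the reading that the education estimates are close in magnitude.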

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The revised version can be accepted.

Author Response

Thank you for your time and expert review throughout the peer-review process. Your recognition is a great encouragement to us. We are deeply grateful for all the insights and suggestions you have provided to enhance the quality of this study.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors appropriately addressed the suggestions and revised the manuscript. The work has been improved in all its elements. No additional comments.

Author Response

Thank you very much for your detailed and constructive feedback during the previous review rounds. We are delighted to know that the revised manuscript and the improvements across all its elements have met your expectations. We sincerely appreciate your careful guidance and support for this manuscript.

Reviewer 3 Report

Comments and Suggestions for Authors

The new modeling and statistical corrections have greatly improved the work. The results appear more consistent and the problems of over-interpretation have been resolved.

I would only suggest improving the writing of the abstract for publication so that it reads more naturally and fluidly. This can be done in proofreading.

Author Response

We greatly appreciate your positive feedback regarding the statistical corrections, new modeling, and interpretation of results included in this revised version. Your earlier caution regarding “overinterpretation” has enhanced the scientific rigor and validity of our conclusions. We agree with your suggestion to further refine the abstract to make it read more naturally and fluently upon publication. In this final submission, we have thoroughly polished and streamlined the wording of the abstract to ensure it is more concise and consistent with native-language conventions. Thank you once again for your contributions to the refinement of this study.
