Article
Peer-Review Record

Will Delayed Retirement Affect the Health of Chinese Workers? A Study from the Perspective of Sustainability of Physical Health

Sustainability 2021, 13(6), 3187; https://doi.org/10.3390/su13063187
by Hanwei Li 1,*, Dongling Xu 2 and Xin Hao 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 2 January 2021 / Revised: 3 March 2021 / Accepted: 8 March 2021 / Published: 14 March 2021

Round 1

Reviewer 1 Report

Thank you for giving me the opportunity to review this paper. The paper deals with an important question, the relationship between retirement and health, and has its merits, particularly in terms of the data and analysis. However, the analysis is based on a flawed premise: it assumes that the causality goes from retirement to health.

Throughout the entire paper, the assumption is that retirement affects health. This assumption is reinforced by an experimental vocabulary: a treatment and a control group. The authors place a lot of trust in their method, propensity score matching (PSM), to account for any causality issues and note that 'In the non-randomized test conditions, the selecting bias and confounding bias can be eliminated to the maximum extent by the Propensity Score Matching method.' But PSM cannot deal with the issue of causality. It does not go beyond noting whether people who retired earlier and people who retired later, who otherwise have similar characteristics, have a different health score. It does not eliminate the possibility that people retired because of health.

Now, I can see how the direction of the effect might make this assumption reasonable. The study finds that health is worse in people delaying retirement than people who didn't above a certain age. It is reasonable to assume that people don't retire because they are in good health. But the indicator complicates matters: health is measured by medical expenses. There might be very good reasons for someone in employment to spend more on the same health issues as a retired person. For instance, for a person in employment, being ill means loss of income. It can capture an income effect beyond what the costs-per-day variable controls for. In sum, the paper does not provide convincing evidence that retirement affects health.

A second issue with the paper is the age groups. Even though the abstract suggests that the paper will answer the question from which age delaying retirement might be harmful for a specific type of worker, the paper provides no such evidence. The division of age categories seems to be based on a number of scatter plots, although scatter plots with thousands of cases are entirely impossible to interpret. If indeed there is a cut-off point in medical expenses based on age, then a simple plot with mean and 95% confidence interval for every age would show this. While it is a necessity to work with a graph with means and CIs, doing so does not mean that it is no longer problematic to suggest that the paper investigates from which age onwards retirement affects health - it still wouldn't. Moreover, it also seems to be a very binary idea that there is no relationship between retirement and health before a certain age and there is after that age. The reality is likely a more gradual transition. Maybe it is worth considering methods that could capture such a more gradual transition?
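As a minimal sketch of the numbers behind such a plot, the per-age mean and a normal-approximation 95% confidence interval could be computed as follows. The function name and the sample records are hypothetical and are not taken from the manuscript's data.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def mean_ci_by_age(records, z=1.96):
    """Return {age: (mean, lower, upper, n)} for (age, expense) records.

    Uses a normal approximation for the interval; ages with a single
    observation get NaN bounds because no standard deviation is defined.
    """
    by_age = defaultdict(list)
    for age, expense in records:
        by_age[age].append(expense)

    summary = {}
    for age in sorted(by_age):
        values = by_age[age]
        n = len(values)
        m = mean(values)
        half = z * stdev(values) / math.sqrt(n) if n > 1 else float("nan")
        summary[age] = (m, m - half, m + half, n)
    return summary


# Hypothetical annual medical expenses (RMB), for illustration only.
records = [(63, 900.0), (63, 1100.0), (63, 1000.0), (64, 1500.0), (64, 1700.0)]
summary = mean_ci_by_age(records)
for age, (m, lo, hi, n) in summary.items():
    print(f"age {age}: mean={m:.1f}, 95% CI=({lo:.1f}, {hi:.1f}), n={n}")
```

Plotting these means and intervals for the delayed-retirement and retired groups on one chart would show whether, and how gradually, the two series diverge with age.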

A third issue, connected to the method. It is unclear which variables were used for the PSM procedure, and it would be useful to get a table comparing these variables between both groups to check whether the procedure worked well.

Fourth, the tables are unclear. Tables 3 and 5 seem to contain the same type of information but have very different titles. Without the right titles, it is not clear what the numbers in the table mean. Moreover, in some places, stars are used to indicate significance where they aren't elsewhere.

Fifth, the paper mentions different statutory retirement ages for different occupational categories. However, I do miss some more contextual information about retirement and pension policies in China. How generous are the pensions? Do the same regulations apply to all sampled units? (e.g. if regulations differ depending on hukou status, does the sample include people of both urban and rural hukous?)

Further, I have a number of less fundamental issues:

Overall, the language is quite good, but there are some words that seem a bit off (e.g. 'the approach of the tool variable' - which presumably means 'the instrumental variable approach'; 'mental workers' - referring to non-manual workers) and some sentences are not constructed in a very fluent way. Language editing would greatly improve the text.

Tables 4 and 6 contain 3 different methods saying pretty much exactly the same. Reduce clutter by choosing 1 method only and add a simple note saying that the 2 other methods resulted in substantially the same conclusions.

The paper states that 'due to China's special national conditions, the delayed retirement age policy will meet considerable resistance for complex reasons.' Subsequently it mentions only one reason. One that is in no way China-specific. Delaying retirement meets resistance pretty much everywhere. If the policy would indeed be particularly difficult in China, then it would require concrete arguments stating why that is the case.

The paper states that 'The proportion of those people is very small' - as a factual statement, this either requires a reference or concrete data.

Line 553 talks about 'female non-manual workers', presumably that is supposed to be 'male non-manual workers'. Also, it is unclear where the number 1002.34 mentioned on this line comes from, as I can't find it in any of the tables.

Table 5 seems to show some area effects for women below age 66. Why? And what do they mean?

Author Response

Dear Referee,

Thank you for the opportunity to revise my paper. Your suggestions have certainly improved our paper. I have commented below on each of the points raised in your feedback.

First, we did not assume that there is a causal relationship between health and retirement. As you pointed out, PSM cannot deal with the issue of causality. We followed the general methodology used in other research that adopts PSM: it estimates the potential effect that our “treat” variable, i.e. delayed retirement, has on the physical health of the subject. In our research, health status is represented by medical expenses. We understand that this may not be a perfect indicator that captures the complete picture of one’s state of health. However, it has the following three advantages, which make it an excellent indicator. 1) Medical expenses are objective, compared with self-declared measures, where bias is unavoidable: people may over- or under-estimate their health for various reasons. 2) Medical expenses are quantitative at the ratio level of measurement. This allows more data analysis methods to be applied than with some commonly used health rating systems, which normally reach only the ordinal level of measurement. 3) Medical expenses are comparable between individuals. If we used medical records of specific illnesses as the indicator, how would we rank the impact of pneumonia and arthritis on different patients? In addition, because we use medical expenses obtained directly from the medical insurance fund, loss of income is not included.

Secondly, all our subjects are people who have passed the legal retirement age in China, as we are interested in the impact of delayed retirement. The critical age in the paper is not a sudden change stipulated by the authors in advance; rather, the results imply that the probability of an individual’s health starting to deteriorate is high around that critical age.

Please allow me to address your third and fourth comments together. Indeed, there are many endogenous factors that affect health, including some that are not easy to measure. PSM is an effective tool for addressing endogeneity problems. The PSM model itself can address endogeneity problems well without any control variables. In other words, the PSM model used in this paper can better study the impact of delayed retirement on health, that is, the impact of the variable delay on the variable sdcosts, without considering any control variables. During the research process, we collected a wide range of variables, such as education level, consumption level, location, gender, nature of work, age, and others. Of these variables, ‘Age’, ‘Gender’ and ‘Work’ are used for classification or segmentation, while education level, consumption level, and location are used as the control variables of PSM to make the PSM results more accurate. Factors affecting health that are not easy to measure and count, such as genetics and living habits, are not used as control variables in this article, but the absence of these control variables will not affect the basic effects of PSM.

In response to your fifth comment, the medical expenses in this article are derived from the payment data of the respondents’ medical insurance funds. Since workers with registered permanent residence in cities and towns in China pay into their medical insurance funds every month while they work, they all use the medical insurance fund to see a doctor. Until the balance runs out, medical expenses are deducted directly from the medical insurance fund, and the patient does not need to pay separately; once the balance is used up, the patient pays part of the medical expenses, and the government also subsidizes part of them through the medical insurance fund. Therefore, by checking the payment records of a respondent’s medical insurance fund, we can essentially know the cost of medical treatment and use it to measure health status.

In China, the concept of retirement applies to the urban population only. People living in rural areas continue to participate in the labour force as long as they are physically able to work. Therefore, this study limits the scope of Chinese workers to urban workers, which is in line with the traditional literature in this field (China’s retirement issues).

The above is our response to your major concerns. Regarding the less fundamental issues, thank you for suggesting a more conventional term for non-manual workers. We have adopted this term in this research and will use it in future research. Tables 3 and 4 are included for the purpose of robustness; we will add a footnote to clarify this. Many features of China make delayed retirement a special case. For example, retired Chinese workers often need to help look after their grandchildren; because China was less developed 40 years ago, many of the people who are now facing retirement have a relatively low level of education compared with both current Chinese workers and retired workers in developed countries; and, for the same reason, they started work at a very young age (around 16 years old), which is also rare in developed countries.

There are three main groups of people who retire after the age of 65: some national leaders in the government, academicians of the Chinese Academy of Sciences, and doctoral supervisors in universities. These groups account for the same proportion of the total population as in other countries. This proportion is very small; this is common knowledge, and it should not need to be proved by literature or data.

Finally, line 553 has been corrected; thank you for pointing that out. It was our great pleasure to read your comments, and the paper has been revised accordingly. We look forward to your opinion on the improved version.

In addition, a premise of implementing PSM is that the grouped data are balanced. The data of the control group and the treatment group should share a large common range of values; this is the overlap assumption, the test of which has been improved in the revised manuscript.

Best regards,

Hanwei Li, Dongling Xu and Xin Hao

Reviewer 2 Report

Referee report on “Will Delayed Retirement Affect the Health of Chinese Worker? From the perspective of sustainability of physical health”

Overall impression

The topic of the paper is interesting and important. However, I have a number of concerns. The treatment model is not specified and there is not sufficient information (actually no information) for the reader to observe whether the match was successful. These are basic requisites for using propensity score matching and estimating treatment effects, so these are quite significant shortcomings. Overall, I find that the idea of the article is interesting, and the results could be interesting. However, the quality (or at least the clarity) of execution of the analysis and the presentation of the result could be improved significantly. More detailed comments are given below.   

More detailed comments (not in any particular order)

Section 1

Line 76- “Then we believe the physical health is unsustainable, and delaying retirement has a significant impact on workers' health.” This is repeated in a few places. If delaying retirement has a significant impact on health, this can be positive or negative. I think you should say that delaying retirement has a significant NEGATIVE impact, which I assume is what you mean. Also, in the abstract you say that “After these chronological age stages, however, it has significant impact on their health.” The reader does not know whether the impact is positive or negative.

Section 3 and 4 and 5

Line 277- “Therefore, the research only uses those factors which can be accurately answered and quantified as the controlling factors, including education levels and the physical health conditions of individuals, and the areas in which they are living..” I don’t see you using physical health conditions in the analyses. Did you exclude all persons with physical health conditions from the sample? Or does this mean medical costs?

Medical expenses should be explained in more detail. What do they actually include? Are visits to a doctor, for example, included in these costs? It is important to be able to understand how these costs are calculated, since you use them to measure physical health.

Since a lot of effort has been put into collecting the data (face-to-face survey), it would be useful to know which questions were asked of the respondents. I would imagine that education and daily basic living costs (and medical costs) are not the only things that were asked. Perhaps the survey includes more variables that could be used; at present, the variables used are quite scarce.

Table 1. Row and column total should be given, now it is very difficult to get an idea of the total number of observations. In addition, %-shares should be reported.

Table 2. Variable ”Work” definition?

Figures 1-8. The scale should be similar in all figures.

Are all in the delayed retirement group working? Even those who are 70 years old?

Is it really feasible to pre-select age-groups for inspection according to the outcome? It is exactly the effect of continuing at work on the outcome that is the interest here. But you select your inspection groups based on the observed difference in the outcome? Somehow I don’t like this idea. If this is feasible, then it has to be justified clearly and well.

Section 6.

Line 413- “A very important requirement for using PSM is that the data of the treatment group and the control group meet the overlap requirements. As shown in Figures 1-8, the data (medical expenses) of the treatment group and the control group in this study are between 0 and 10000 Chinese Yuan and the data overlap area is large, which satisfies the basic conditions of using PSM.” As far as I understand, the overlap condition is related to the probability of the treatment; the overlap assumption is satisfied when there is a chance of seeing observations in both the control and the treatment groups at each combination of covariate values. In practice you need to check the density of propensity scores of treated and control units in order to make conclusions about the overlap condition.
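One simple way to inspect overlap on the propensity score rather than on the outcome is the min-max common-support rule, sketched below. The scores are hypothetical, the function names are illustrative, and this rule is only one of several diagnostics; it does not replace plotting the two score densities.

```python
def common_support(ps_treated, ps_control):
    """Return the (low, high) range of propensity scores shared by both groups."""
    low = max(min(ps_treated), min(ps_control))
    high = min(max(ps_treated), max(ps_control))
    return low, high


def share_on_support(scores, low, high):
    """Fraction of units whose propensity score lies inside [low, high]."""
    return sum(low <= p <= high for p in scores) / len(scores)


# Hypothetical estimated propensity scores, for illustration only.
ps_treated = [0.35, 0.50, 0.65, 0.80, 0.92]
ps_control = [0.10, 0.25, 0.40, 0.55, 0.70]

low, high = common_support(ps_treated, ps_control)
treated_share = share_on_support(ps_treated, low, high)
control_share = share_on_support(ps_control, low, high)
print(f"common support: [{low}, {high}]")
print(f"treated on support: {treated_share:.0%}, control on support: {control_share:.0%}")
```

If a large share of treated units falls outside the common range, matched comparisons for those units rest on extrapolation rather than on comparable controls.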

Line 402- ATT. “The average treatment effect of the treatment group (ATT), after Propensity Score Matching, can be used to observe and analyze the difference between the control group and the treatment group, and then to be used to analyze the policy effect.” I get the impression that you inspect Treatment effect on the treated: how the health of those who continue to work would have changed in case they did not continue to work (that is, retired). In this case the comparison should be Health costs of those who continue to work vs what would have been their health costs in case they did not continue to work. That is the whole idea of treatment effects. The above explanation that you give is not accurate enough. This should be explained more clearly.
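In standard potential-outcomes notation, the comparison described here can be written as follows. The symbols are introduced for illustration and are not taken from the manuscript.

```latex
\mathrm{ATT}
  = E\left[\, Y_i(1) - Y_i(0) \mid D_i = 1 \,\right]
  = E\left[\, Y_i(1) \mid D_i = 1 \,\right] - E\left[\, Y_i(0) \mid D_i = 1 \,\right]
```

Here $D_i = 1$ denotes continuing to work past the statutory retirement age, $Y_i(1)$ is the medical cost of person $i$ when continuing to work, and $Y_i(0)$ is the cost the same person would have had after retiring. The second expectation is never observed for those who continue to work, which is why matching uses comparable retirees to approximate it.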

Specification of the treatment (selection) model (=model for continuing at work/retiring)? This is the basis of PSM (propensity score is based on this) and now there is no mention about this? In addition to presenting the model, justification for the specification is needed.

Not only is there no information concerning the matching process; there is also not sufficient information for the reader to observe whether the match was successful. Presenting raw figures of the outcome is not enough. More descriptive statistics should be presented. In particular, covariate balance summary statistics should be reported: baseline characteristics of the control and treated groups (mean and variance), then the same measures for the matched sample, along with raw and matched standardised differences and raw and matched variance ratios (or, alternatively, baseline means for control and treated and the difference in means, and means for control and treated and the difference in means after matching). There should be practically no differences after matching. The number of matches should also be indicated somewhere.
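The two balance statistics named above can be sketched as follows, using one common definition of the standardised difference (difference in means divided by the square root of the average of the two group variances). The covariate values are hypothetical; in a report, the same functions would be applied to each covariate before and after matching.

```python
import math
from statistics import mean, variance


def standardized_difference(treated, control):
    """Difference in means scaled by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


def variance_ratio(treated, control):
    """Ratio of sample variances; values near 1 indicate similar spread."""
    return variance(treated) / variance(control)


# Hypothetical years of education in each group, for illustration only.
education_treated = [10.0, 12.0, 14.0]
education_control = [9.0, 11.0, 13.0]

smd = standardized_difference(education_treated, education_control)
vr = variance_ratio(education_treated, education_control)
print(f"standardised difference: {smd:.2f}, variance ratio: {vr:.2f}")
```

After a successful match, standardised differences should be close to 0 and variance ratios close to 1 for every covariate in the treatment model.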

Tables 3 and 5. It is totally unclear to me what you actually present in table 3 and table 5. Based on that you present LR Chi2 and Pseudo R2, it seems that this is some kind of discrete choice model but the table titles and the text indicate that you model medical expenses. I simply don’t understand these tables.

In table 3 it can be seen that for manual men aged 64 or over, region North is significant and positive, yet there is no mention about this in the text. I don’t know what this means since I don’t understand what you investigate in this table.

Could it be that those who continue working have higher incomes and therefore can afford to get more medical help, and that is why it seems the medical costs increase after certain age for those who are working? This naturally relates to the question of how medical costs are defined (see my earlier comment).

Differences between the different matching techniques should be explained more clearly. It is also unclear what the “k” in your k-nearest neighbour matching is. From Table 4 I assume it is 1. For radius matching, what is the caliper? For kernel matching, which function?
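To make the role of “k” and the caliper concrete, here is a minimal sketch of nearest-neighbour matching on the propensity score with replacement and an optional caliper. It is an illustration under stated assumptions, not the authors' procedure: the function name, the data, and the choice to drop unmatched treated units are all hypothetical. Kernel matching, which weights all controls by distance instead of selecting k of them, is not shown.

```python
def att_nearest_neighbor(treated, control, k=1, caliper=None):
    """Estimate the ATT by k-nearest-neighbour matching with replacement.

    treated, control: lists of (propensity_score, outcome) pairs.
    caliper: maximum allowed score distance; neighbours beyond it are discarded.
    Returns (att, number_of_matched_treated_units).
    """
    effects = []
    for score_t, outcome_t in treated:
        ranked = sorted(control, key=lambda unit: abs(unit[0] - score_t))
        neighbors = ranked[:k]
        if caliper is not None:
            neighbors = [u for u in neighbors if abs(u[0] - score_t) <= caliper]
        if not neighbors:
            continue  # treated unit has no acceptable match and is dropped
        counterfactual = sum(y for _, y in neighbors) / len(neighbors)
        effects.append(outcome_t - counterfactual)
    if not effects:
        raise ValueError("no treated unit could be matched")
    return sum(effects) / len(effects), len(effects)


# Hypothetical (propensity score, annual medical expenses in RMB) pairs.
treated = [(0.60, 1500.0), (0.80, 2000.0)]
control = [(0.58, 1200.0), (0.30, 900.0), (0.75, 1600.0)]

att_plain, n_plain = att_nearest_neighbor(treated, control, k=1)
att_caliper, n_caliper = att_nearest_neighbor(treated, control, k=1, caliper=0.03)
print(att_plain, n_plain)      # both treated units matched
print(att_caliper, n_caliper)  # second treated unit falls outside the caliper
```

The example shows why the caliper and k need to be reported: tightening the caliper changes both the estimate and the number of treated units it is based on.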

Line 432, 458, 474, 510, 527, 539, 556 “ In order to test the validity of this result, PSM analysis is carried out by using caliper matching and kernel matching.” You don’t have to repeat this same sentence every time you present a new result.

Line 435 “The estimated ATT of the nuclear matched PSM analysis is -5.63,…” What is this nuclear matched analysis?

Line 443- “The difference can be explained by the fact that the female manual workers who are older than 63 and delay their retirement pay 969.38 RMB more annual medical expenses than those who do not, which reflects the apparent impact of the delayed retirement on their health.” I don’t think the difference can be explained by the fact…There are likely to be completely different explanations for the difference. You also repeat this same sentence every time you present a new result. I think the result should be interpreted for example like this: This result indicates that female manual workers who are older than 63 and delay their retirement pay XXXX more annual medical expenses than comparable women who do not delay retirement.

Line 520- “As shown in Table 6, , it is clear that delayed retirement has a significant effect on the health differences of male non-manual workers who are older than 67, and their k-nearest neighbor matching PSM analysis ATT is 1183.73, with a corresponding the value of 9.49, meaning the difference is very significant. The difference can be explained by the fact that the female non-manual workers who delay their retirement beyond 67 pay 1002.34 RMB more annual medical expenses than the female non-manual workers who do not, which reflects the apparent impact of the delayed retirement age on their health difference.” Something wrong with this sentence? It seems that copy-paste is used much in this article.  

Author Response

Dear Referee,

Thank you for the opportunity to revise my paper. Your suggestions have certainly improved our paper. I have commented below on each of the points raised in your feedback.

For your comment on Section 1, we will be more specific by adding “negative” in the places you mentioned. With regard to Sections 3-5, indeed, there are many endogenous factors that affect health, including some that are not easy to measure. PSM is an effective tool for addressing endogeneity problems. The PSM model itself can address endogeneity problems well without any control variables. In other words, the PSM model used in this paper can better study the impact of delayed retirement on health, that is, the impact of the variable delay on the variable sdcosts, without considering any control variables. During the research process, we collected a wide range of variables, such as education level, consumption level, location, gender, nature of work, age, and others. Of these variables, gender, nature of work and age are used for classification or segmentation, while education level, consumption level, and location are used as the control variables of PSM to make the PSM results more accurate. Factors affecting health that are not easy to measure and count, such as genetics and living habits, are not used as control variables in this article, but the absence of these control variables will not affect the basic effects of PSM.

In the survey, it is true that some of the elderly people surveyed had hardly been to the hospital in several years, and some had suffered from major illnesses (such as cancer or hospitalization for serious injury) and incurred very large expenses in a year, but the proportion of these people is relatively small. This article treats these data as outliers and deletes the two extreme groups.

The medical expenses in this article are derived from the payment data of the respondents’ medical insurance funds. Since workers with registered permanent residence in cities and towns in China pay into their medical insurance funds every month while they work, they all use the medical insurance fund to see a doctor. Until the balance runs out, medical expenses are deducted directly from the medical insurance fund, and the patient does not need to pay separately; once the balance is used up, the patient pays part of the medical expenses, and the government also subsidizes part of them through the medical insurance fund. Therefore, by checking the payment records of a respondent’s medical insurance fund, we can essentially know the cost of medical treatment and use it to measure health status.

We have changed Table 1 as instructed. In Table 2, the variable Work indicates the nature of the work: manual or non-manual. All workers in the treatment group are still working past retirement age, including those who are over 70 years old. Finally, we did not pre-select the age groups; the age groups in the paper were not stipulated by the authors in advance. Rather, the results imply that the probability of an individual’s health starting to deteriorate is high around those boundary ages.

For Section 6, the overlap test is indeed a necessary condition for using PSM. There is a chance of seeing observations in both the control group and the treatment group at each combination of covariate values, so the overlap assumption is satisfied. In this revision, the results of the overlap test are presented, and they satisfy the overlap condition.

The PSM method has multiple applications. In the scenario you mentioned, survey data on the same set of samples before and after the implementation of a policy are used as the control group and the treatment group, respectively. This is of course a typical application scenario, but in that situation PSM must be used in conjunction with DID. However, delayed retirement in China is a special topic. China has not implemented delayed retirement so far; it is only at the policy discussion stage, so it is impossible to obtain real data on the same set of samples before and after delayed retirement. Therefore, it is impossible to use the DID+PSM model you mentioned. In this article, however, we apply PSM separately within classified subgroups, so that for the control group and the treatment group, gender, age, work attributes (mental or physical), spending power, education level, and area of residence are well controlled. Workers who still choose to work after reaching the statutory retirement age can be regarded as elderly people who delay retirement. They meet the conditions for a treatment effect, and this is also an application scenario of PSM. This is discussed in Morgan and Winship (2007), Rosenbaum (2010), Chen (2013) and other monographs.

As for the results of the matching, we made it clearer.

The previous Tables 3 and 5 contained control-variable output from the PSM procedure. They are not the focus of this article, which studies the impact of delayed retirement on the health of Chinese workers, and they do not need to be interpreted. In this revision, these two tables have been deleted.

K-nearest neighbor matching, caliper matching (radius matching), and kernel matching are three typical PSM methods. In this paper, each of the three methods is applied separately, which both tests the robustness of the research method and verifies the validity of the research conclusions. The choice of kernel function is also explained in the revised version.

For Line 435, it is a typo which has been corrected as Kernel matching.

For Line 443, we have changed the expression as you suggested.

For Line 520, we have mistyped the number which has now been corrected.

It was our great pleasure to read your comments, and the paper has been revised accordingly. We look forward to your opinion on the improved version.

Best regards,

Hanwei Li, Dongling Xu and Xin Hao

Reviewer 3 Report

This paper analyzes an important question: does working longer help or hurt health? The literature has not come to firm conclusions on this point, and thus additional study is certainly warranted.

The authors collected survey data in six regions of China on a variety of personal characteristics including age, work status, and health expenditures. Using these data, the authors conduct a propensity score matching analysis that compares individuals who worked past their statutory retirement age (the “treatment” group) and those who did not (the “control” group). The authors further split the analysis into a number of groups: males and females, in manual and non-manual jobs, before and after certain cutoff ages. They find that working past the statutory retirement age causes no increase in medical spending before the cutoff age for each group, but does lead to increases after the cutoff age.

The analysis left me with a number of questions, detailed below in no particular order. I also had some more minor suggestions listed at the end.

Major Comments:

  • Methodologically, the worry is that those who work past their statutory retirement age are unobservably different from those who do not (for instance, different in their underlying health in ways that would not be reflected in the existing control variables). Such differences would bias the estimated effect of working longer. I suspect the bias would be in the direction of making working longer seem less detrimental to health, since I expect that those who do work longer would be healthier than those who retire earlier. Since this is counter to the sign found by the authors, perhaps it can be argued that the estimates in the paper are a lower bound of the true effect. The propensity score matching may help alleviate this concern, but the matching can only be as good as the observable characteristics of individuals in the two groups. The authors should try to address this in some way, at the very least by describing which variables are used in the matching process.
  • The use of health expenditures as a measure of health is understandable but is also limited. For example, those who die young may have lower health expenditures than those who receive extended medical or long-term care. Furthermore, expenditures may also be influenced by factors beyond pure health: the availability and generosity of health insurance, and income, to name two. These factors may, in turn, be influenced by the decision to work longer. Thus, we might observe those who work longer to have higher expenditures simply because they can afford more medical services. This, too, would bias estimates relative to the effect on underlying health that we care about.
  • An indicator controlling for which one of the six regions an individual lives in would make assumption 1 (pg. 6) unnecessary.
  • It is not clear why we need assumption 2 (pg. 6). But if it is necessary, again, a control for region could replace the assumption.
  • Figures 1-8 would be easier to read if the statutory retirement age were marked for each group on the figure. Also, I think it would be easier to compare the treatment and control groups if they were plotted on the same graph. For example, the average of spending at each age could be shown instead of the full distribution of spending; then both groups could be displayed on a single figure.
  • My understanding is that the statutory retirement ages are much lower than the ages at which health spending seems to diverge between the treatment and control groups. What might explain the sudden difference at, for example, age 63, when the retirement age is 50? The authors should justify splitting the regression analysis at the observed ages at which the outcome variable differs. Without some theoretical motivation here, it would seem that these ages are chosen specifically to maximize the estimated differences between the groups in the later age groups, which is essentially selecting on the outcome ex-post, leading to upward-biased estimates.
  • I would have liked a discussion of the magnitudes found. Are the estimated increases in health spending due to working to older ages big or small? How do they compare to the mean of health spending for each of the groups, for example? What would the estimates imply for proposed increases in the statutory retirement ages?

Minor Comments:

  • English language editing would be helpful throughout the paper.
  • The literature review is already quite extensive, but I would suggest adding the following paper:

Fitzpatrick, Maria D. and Timothy J. Moore. 2018. “The Mortality Effects of Retirement: Evidence from Social Security Eligibility at Age 62.” Journal of Public Economics 157: 121-137.

Author Response

Dear Referee,

Thank you for the opportunity to revise our paper. Your suggestions have certainly improved it. We have commented below on each of the points raised in your feedback.

In response to your comment on our methodology: indeed, there are many endogenous factors that affect health, including some that are not easy to measure. PSM is an effective tool for addressing endogeneity problems. The PSM model itself can address endogeneity problems well without any control variables. In other words, the PSM model used in this paper can study the impact of delayed retirement on health, that is, the impact of the variable delay on the variable sdcosts, without considering any control variables. During the research process, we collected a wide range of variables, such as education level, consumption level, location, gender, nature of work, age, and others. Of these variables, gender, nature of work and age are used for classification or segmentation, while education level, consumption level, and location are used as the control variables of PSM to make the PSM results more accurate. Factors affecting health that are not easy to measure and count, such as genetics and living habits, are not used as control variables in this article, but the absence of these control variables will not affect the basic effects of PSM.

In our research, health status is represented by medical expenses. We understand that this may not be a perfect indicator that captures the complete picture of one's state of health. However, it has the following three advantages, which make it an excellent indicator. 1) Medical expenses are objective, compared with other types of self-declaration, where bias is unavoidable. People may over- or under-estimate their health for various reasons. 2) Medical expenses are quantitative at the ratio level of measurement. This allows more data analysis methods to be applied than with some commonly used health rating systems, which normally reach only the ordinal level of measurement. 3) Medical expenses are comparable between different individuals. Suppose we used the medical record of a specific illness as the indicator: how would we rank the impact of pneumonia and arthritis on different patients? In addition, because we use medical expenses obtained directly from the medical insurance fund, loss of income is not included.

Regarding assumption 1, medical expenses are used to represent the health status of the workers. If the medical expenses of the same medicine and the same operation in different regions of China are quite different, then there will be deviations in the use of different medical expenses to represent different health conditions.

Assumption 2 was made because the daily consumption level is used as a control variable to represent the economic status of the laborers. If the non-medical costs of food, clothing, transportation and other areas in China vary greatly, then this indicator will be biased as a control variable.

The statutory retirement age for different age groups will be added in the revised manuscript.

The sudden difference at, for example, age 63 was not stipulated by the authors in advance; the results imply that the probability of an individual's health starting to deteriorate is high around that critical age.

Delayed retirement to different ages corresponds to different health and medical expenditures. For female manual workers, there is no significant difference between the health expenditures of workers of any age who delay retirement until the age of 63 and those of the same age who do not delay retirement. Workers of any age after 63 have significantly different health expenditures from those of the same age who do not delay retirement. Similarly, this age is 66 for female non-manual workers, 64 for male manual workers, and 67 for male non-manual workers. Of course, these studies are based on the perspective of physical health. This provides a basis for China to formulate an appropriate retirement age.

In addition, the premise of implementing PSM is balance in the grouped data. The data of the control group and the treatment group should have a large common range of values, that is, the overlap assumption should hold; the test of this assumption has been improved in the revised manuscript.

Finally, thank you for pointing out Fitzpatrick and Moore (2018). It had not been published when we finished the first draft, and it has now been added to our reference list. We would like to use an English language editing service to improve the paper. It was our great pleasure to read your comments, and the paper has been revised accordingly. We look forward to your opinion on the improved version.

Best regards,

Hanwei Li, Dongling Xu and Xin Hao

Round 2

Reviewer 2 Report

Referee report n:o 2 on ”Will Delayed Retirement Affect the Health of Chinese Worker? From the perspective of sustainability of physical health”

The paper has improved somewhat. Now it is at least clear that the authors have balanced the covariates and the common support assumption is fulfilled. However, there still are a number of issues that must be addressed.

I indicated in my previous report: “Also in the abstract you say that “ After these chronological age stages, however, it has significant impact on their health.” Reader does not know whether the impact is positive or negative.” This sentence is still the same.

In their response to my earlier comment about medical expenses the authors explain: “the medical expenses in this article are derived from the payment data of the investigator's medical insurance fund. Since workers with registered permanent residence in cities and towns in China pay their medical insurance funds every month during their work, they all use the medical insurance fund to see a doctor. Before the balance runs out, the medical expenses are directly deducted from their medical insurance fund. The patient does not need to pay separately; if the balance is used up, then she has to pay part of the medical expenses by herself, and the government also subsidizes part of it through the medical insurance fund. Therefore, by checking the payment status of the investigated patient's medical insurance fund, we can basically know the cost of medical treatment and use it to measure the health status.” This more accurate description should be added to the paper, the current explanation is not clear enough. The authors should understand that if I as a referee don't understand the contents of medical expenses, the readers will not understand it either. Moreover, can different persons have different amounts in medical insurance fund? Is it possible that retired people have less money in their medical insurance fund (above you state that workers pay their medical insurance funds every month during their work, so how do retired persons pay to medical insurance funds)? This could explain why they have lower medical costs?

In my previous report I asked “I don’t see you using physical health conditions in the analyses. Did you exclude all persons with physical health conditions from the sample? “ Despite that, this sentence is still the same as before “Therefore, the research only uses those factors which can be accurately answered and quantified as the controlling factors, including education levels and the physical health conditions of individuals, and the areas in which they are living.” You clearly say you use physical health as controlling factor. I still don’t see you using physical health as controlling factor.

Relating to medical expenses it is stated in the text: “Among the responses to annual medical expenses, some of the figures are unusually high. The respondents who provided the figures could have suffered from very serious health problems such as cancer or other major diseases.” However, in your response relating to medical expenses (above), you state that they are derived from the payment data and that you check the payment status of the medical insurance fund, which gives an impression that you use some kind of register in obtaining this data. So are the medical expenses based on survey responses or on registers?

In my earlier report I specifically asked: “Table 1. Row and column total should be given, now it is very difficult to get an idea of the total number of observations. In addition, %-shares should be reported.” This is not done. Instead, a mention about retirement ages is added. I don’t understand why?

In my previous report I asked: “Table 2. Variable ”Work” definition?“ The authors response: “In Table 2, the variable work means what is the nature of the work: manual worker, or non-manual worker.” Yes, I can see that. What I meant is that there are 2 times Work=1. Anyone who reads the table carefully notices this.

Scales of medical expenses in Figures 1-8 are still different, even if in my previous report I made it clear that the scales should be similar in all figures.

In their response the authors state: “ Therefore, It is impossible to use the DID+PSM model you mentioned.” I have no idea what the authors are talking about. I did not mention DID+PSM model. I only wanted clarification of what is actually compared with treatment effects. Your text implies you inspect Treatment effect on the treated (ATT). Therefore in my first report I indicated how the results should be interpreted. “how the health of those who continue to work would have changed in case they did not continue to work (that is, retired). In this case the comparison should be Health costs of those who continue to work  vs what would have been their health costs in case they did not continue to work.” This has nothing to do with inspecting the same group before and after delayed retirement. This is only the principle of PSM and treatment effects, and when using PSM-method and ATT, this is actually what you compare. You can see the actual health costs of those who continue to work (treatment group), and based on your PSM-method, you can get the result what would have been their health costs in case they did not continue to work (counterfactual outcome obtained using matched observations). The difference is the effect of continuing at work to health.  (ATT=E(Y1-Y0|D=1)).
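The comparison described above, ATT = E(Y1 - Y0 | D = 1), can be illustrated with a minimal sketch. It assumes that propensity scores have already been estimated and uses one-to-one nearest-neighbour matching with replacement; the function name `att_nearest_neighbor` and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def att_nearest_neighbor(treated, controls):
    """Estimate ATT by 1-nearest-neighbour matching on the propensity score.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Each treated unit is matched, with replacement, to the control unit
    whose propensity score is closest; the matched control's outcome
    stands in for the treated unit's unobserved counterfactual outcome.
    """
    if not treated or not controls:
        raise ValueError("both groups must be non-empty")
    differences = []
    for score_t, outcome_t in treated:
        # Closest control by absolute propensity-score distance.
        _, outcome_c = min(controls, key=lambda c: abs(c[0] - score_t))
        differences.append(outcome_t - outcome_c)
    return mean(differences)


# Made-up illustration: annual medical expenses (RMB) for workers who
# delayed retirement (treated) and workers who retired (controls).
treated = [(0.6, 3000.0), (0.4, 2000.0)]
controls = [(0.58, 2500.0), (0.41, 1800.0), (0.1, 500.0)]
att = att_nearest_neighbor(treated, controls)
print(att)
```

The result is read as the average difference between what the delayed-retirement group actually spent and what matched retirees spent, which is the counterfactual comparison the referee describes.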

The paper still has this sentence: “A very important requirement for using PSM is that the data of the treatment group and the control group meet the overlap requirements. As shown in Figures 1-8, the data (medical expenses) of the treatment group and the control group in this study are between 0 and 10000 Chinese Yuan and the data overlap area is large, which satisfies the basic conditions of using PSM.” As I instructed in my previous report, this kind of inspection is not enough to verify the overlap condition.
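One common way to inspect overlap is on the estimated propensity scores rather than on the outcome variable. The sketch below assumes the scores are already available; it computes the common-support interval and the share of treated units that fall inside it. The helper names and the scores are hypothetical.

```python
def common_support(treated_scores, control_scores):
    """Return (low, high): the propensity-score range covered by both groups."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high


def share_on_support(scores, low, high):
    """Fraction of units whose propensity score lies within [low, high]."""
    inside = [s for s in scores if low <= s <= high]
    return len(inside) / len(scores)


# Made-up propensity scores for illustration only.
treated_scores = [0.3, 0.5, 0.7, 0.9]
control_scores = [0.1, 0.2, 0.4, 0.6, 0.8]

low, high = common_support(treated_scores, control_scores)
treated_share = share_on_support(treated_scores, low, high)
print(low, high, treated_share)
```

A low share of treated units on the common support would suggest that many of them have no comparable control, whatever the range of the outcome variable looks like.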

There is absolutely no need to add complicated formulas to the paper. The reader does not understand what the formula for ATT actually means.

It is not necessary to present the figures 9-12 and 17-20. The figures are unclear, it is hard to see where the data points are. Rather, the results of balancing should be presented in tables. This would save space and make the results clearer. Moreover, the interpretation of these figures is totally missing “As shown in the figure9, figure10, figure11, figure12, the grouped data of non-manual workers has a good balance form PSM covariates standardized bias.” Just adding figures is not enough, there is practically no interpretation of these figures, the content of the figures are totally unclear and should be explained in understandable way. Also if these results are presented in tables (as I suggest), the content of the tables must be explained in the text, so that a reader understands what is inspected and what the numbers tell and what conclusion can be drawn from the figures. Moreover, it must be clearly communicated why the inspections are performed in the selected way.
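A balance table of the kind suggested here could be produced along the following lines. This is a sketch only: it uses a commonly used definition of standardized bias (the difference in covariate means divided by the square root of the average of the two group variances, times 100), the covariate names follow those mentioned in the authors' response, and the data are invented.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(treated, control):
    """Standardized difference in means (percent) for one covariate."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has no variation in either group")
    return 100 * (mean(treated) - mean(control)) / pooled_sd


# Invented matched-sample values: (treated group, control group).
covariates = {
    "education": ([9.0, 12.0, 12.0, 16.0], [9.0, 12.0, 12.0, 15.0]),
    "consumption": ([40.0, 55.0, 60.0, 80.0], [42.0, 50.0, 63.0, 78.0]),
}

# One row per covariate: name and standardized bias after matching.
print(f"{'covariate':<12}{'bias (%)':>10}")
for name, (treated_values, control_values) in covariates.items():
    bias = standardized_bias(treated_values, control_values)
    print(f"{name:<12}{bias:>10.1f}")
```

Reporting the same statistic before and after matching, one row per covariate, would let a reader see directly whether matching reduced the imbalance.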

Figure 13-16 and 21-24 need not be presented, but the content of the figures should be presented and explained in the text.

In your response you indicate that you have changed the following expression, but this kind of sentences are still there “The difference can be explained by the fact that the female manual workers who are older than 63 and delay their retirement pay 969.38 RMB more annual medical expenses than those who do not, which reflects the apparent impact of the delayed retirement on their health.” As I mentioned in my previous report, I don’t think the difference can be explained by the fact…There are likely to be completely different explanations for the difference. Rather this should be interpreted as follows: This result indicates that female manual workers who are older than 63 and delay their retirement pay XXXX more annual medical expenses than they would pay in case they did not delay retirement.

In my previous report I indicated that you don’t have to repeat this sentence every time you present a new result “In order to test the validity of this result, PSM analysis is carried out by using caliper matching and kernel matching” Despite this, you present the same sentence every time.

Author Response

Dear Referee,

Thank you again for your new comments and advice. Our paper has certainly improved since our last communication. We have made some modifications according to the suggestions in your second letter.

First, we have made it clear throughout the paper that the impact is negative. Thank you for your advice; this makes the statement clearer than before.

Your second suggestion is very important. The explanation of the medical insurance fund has been added to the main text. In addition, in China, an employee's medical insurance fund contributions vary according to her salary. The amount in the medical insurance fund account consists of two parts: one part comes from individual contributions, and the other comes from the social pooling of corporate contributions. When the worker retires, the individual no longer pays into the medical insurance fund, and the fund is replenished only by the social pooling part of the corporate payment. Therefore, when an elderly person goes to see a doctor after retirement, if the medical insurance fund accumulated from individual payments is used up, she will use the socially coordinated medical reimbursement fund. The socially coordinated medical insurance fund can reimburse a large proportion of the expenses (but not all of them), depending on the illness; the shortfall is paid by the individual herself.

Regarding your point about medical expenses: in China, the medical insurance fund card and a detailed usage record are used together. This detailed usage record is kept by the individual and is brought along on every visit to a doctor. When medical expenses are paid, the details of each payment and the total amount are printed in this record book. Therefore, when we conducted this investigation, we could accurately determine the medical expenses by checking each respondent's record book.

The sum of the numbers in the rows and columns of Table 1 has been added according to your suggestion. The addition about the retirement age was requested by another referee of the paper.

There was a spelling mistake in Table 2 which has been corrected, thanks for pointing that out.

We tried using similar scales in an earlier version of the paper; however, that made the figures uneven. We received many suggestions to change the scales to improve the presentation.

There may be a misunderstanding regarding the use of DID+PSM. We did not say that you mentioned using this method; we only wanted to explain the point to you more clearly. The DID+PSM method can be used when the same group of surveyed people is observed before and after a treatment, for example, when the difference between performance before and after taking a certain drug is used to study the effectiveness of the drug. As for the delayed retirement problem studied in this article, China is still studying the delayed retirement policy and has not yet implemented it. Therefore, it is not possible to obtain medical expense data for the same group of surveyed people before and after delayed retirement. People who are still working after the statutory retirement age are treated as the delayed-retirement treatment group, controlling for gender, job attributes and education level, while those who have not continued to work after the statutory retirement age are used as the control group, and the PSM method is used for the research. In order to avoid possible misunderstandings, this revision has deleted some wording that could be misunderstood.

The statement about “the value between 0 and 10,000 is not enough to meet the requirement of PSM overlap” has been deleted in this revision.

The addition of the formula for ATT was requested by another referee.

Regarding Figures 9-12, 13-16, 17-20 and 21-24, we understand your concern and value your suggestions. However, for a similar reason to the one we mentioned earlier, other reviewers specifically proposed adding this part. They have discussed the importance of those figures with us, so we are in a difficult situation when you think they are not necessary. We will try to provide a more detailed explanation of these figures. In fact, these figures clearly reflect the requirements for good balance and overlap between the treatment group and the control group. Researchers who use PSM can understand them without a more detailed explanation.

Regarding our interpretation of the result, we have decided to use health expenditure as the measurement of one's health status, as the physical health of a person is difficult to quantify and hard to compare across individuals. No matter how health is measured and no matter what indicators are adopted, the measure will not be perfect, and there will be doubts. For a given individual, an increase in medical expenses may be caused by other reasons, but for a large surveyed sample of patients with the same conditions, a substantial increase in medical expenses can reflect their health status. In the process of writing the paper, we also consulted some health experts, and they all agreed with this view.

Finally, we are sorry that our poor language skills make reading our paper a tedious job. The sentence that we keep repeating is the best way we have found to express our point correctly. We will try to express it in other ways in the paper.

Author Response File: Author Response.pdf

Reviewer 3 Report

Please see attached.

Comments for author File: Comments.pdf

Author Response

Dear Referee,

Thank you again for your new comments and advice. Our paper has certainly improved since our last communication. We have made some modifications according to the suggestions in your second letter.

In this paper, the key age is not set in advance. Instead, we roughly judge a value range from the scatter diagram and then select a value in this range for PSM analysis and verification. If it passes the verification, we take this value as the key value. If it does not pass, we increase this age by one year and carry out the PSM verification again, repeating until the verification is passed and the key age is obtained. The initial critical ages selected in this paper from the scatter diagrams all passed the verification at the first attempt, which may give the impression that they were pre-set. We understand your concern, and this is also explained in the revised version.
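The search procedure described in this response can be sketched as a simple loop. The verification criterion itself is not specified in the response, so it is represented here by a hypothetical callable, `passes_check`; the function name `find_key_age` and the age bounds are likewise assumptions made for illustration.

```python
def find_key_age(start_age, max_age, passes_check):
    """Return the first age in [start_age, max_age] that passes the check.

    passes_check: hypothetical callable standing in for the PSM
    verification step; it takes an age and returns True or False.
    Returns None if no age in the range passes.
    """
    for age in range(start_age, max_age + 1):
        if passes_check(age):
            return age
    return None


# Placeholder check for illustration: pretend verification first
# succeeds at age 63. A real check would rerun the PSM analysis.
key_age = find_key_age(61, 70, lambda age: age >= 63)
print(key_age)
```

Because the starting value is read off the outcome data, a sketch like this also makes explicit that the selected age depends on the observed outcome, which is the concern raised in the reports.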

Your understanding is correct: the matching is done on gender, nature of work and age, while education, consumption, and location are introduced as controls in the regressions.

As for the regression model you mentioned, in China we call it threshold effect regression. It is also an effective regression analysis method for addressing the key age problem, and it points us to a good research method. Because we used the grouping method when collecting the research data for this paper, we used the PSM analysis method, which is also an effective method of effect analysis. Thank you for your sincere insight and sharing.

Thank you very much for your professional and high-level modification comments. Thank you again!!

Round 3

Reviewer 3 Report

All my concerns have been addressed.
