Will Delayed Retirement Affect the Health of Chinese Workers? A Study from the Perspective of Sustainability of Physical Health

Abstract: This article presents the main findings of our study on whether delayed retirement will affect the health of Chinese workers, from the perspective of the sustainability of physical health. The treatment group consists of those who continue to work after the current statutory retirement age, the control group consists of those who no longer work after the current statutory retirement age, and physical health is the dependent variable. The samples are drawn from six cities in China: Shanghai, Chengdu, Guangzhou, Beijing, Zhengzhou, and Shenyang. The physical health status of the samples is analyzed quantitatively. The study shows that for female manual workers younger than 63, female non-manual workers younger than 66, male manual workers younger than 64, and male non-manual workers younger than 67, delaying retirement and continuing to work has no effect on physical health. Beyond these ages, however, it has a significant impact on their health. These findings are of value for the design of China's delayed retirement policy.


Introduction
Economics and demographics are driving change in expectations about work beyond the traditional retirement age and in employers' need for older workers in China [1]. The Chinese government has proposed establishing a more equitable and sustainable social security system and studying the formulation of a delayed retirement age. This decision is driven by four key factors: (i) pension payment pressure due to the increasingly serious aging of the Chinese population; (ii) the unsustainable demographic dividend; (iii) the growing population in education as a result of the lengthened duration of education; and (iv) the fact that the average retirement age in developed countries is higher than that in China [2]. Delaying the retirement age has become inevitable for the Chinese government. At the same time, the life expectancy and education level of the Chinese population have increased significantly, while labor-intensive jobs have been reduced significantly. These changes make delaying the retirement age feasible. At the Fifth Plenary Session of the 19th Central Committee of the Communist Party of China, held in 2020, the Chinese government once again put forward a progressively delayed retirement program. On the other hand, due to China's special national conditions, the delayed retirement age policy will meet considerable resistance for complex reasons. One of the reasons is that many employees worry that delayed retirement may affect their health and well-being [3].
Does delayed retirement affect the health of workers and, if so, to what degree? Academic studies have reached no definite conclusion. There is therefore an urgent need for rigorous research that draws a convincing conclusion to better guide the design and implementation of China's delayed retirement age policy. To address this question, we need to look at the complex issues related to delayed retirement. First, the Chinese government's delayed retirement program has not yet been fully implemented, which means there are no real data for empirical analysis. Second, China's current retirement age policy is based on the retirement age program of 1978, in which different groups of people have different retirement ages [1]. Under the current policy, a very small number of people, such as academicians, PhD supervisors, and national special leading cadres, retire after 65; female workers retire at 50, male workers at 60, female cadres at 55, and male cadres at 60. Retirement policy is thus segmented across different types of workers. This segmentation complicates both the study and the design and implementation of the delayed retirement policy. The delayed retirement program may have different levels of impact on different social groups. To assess the potential impact of the policy, we need to take the segmentation into account so that the analysis is more rigorous and reliable.
Although no relevant macro or micro data can be collected and used to analyze the impact of the delayed retirement age policy [2], we found that in many major Chinese cities many workers find a new job after retiring. Whether for business needs or personal interest, many university lecturers and professors and many hospital doctors stay in work after the statutory retirement age. If we treat those who keep working after the current statutory retirement age as employees who must delay their retirement under the new policy, we can collect data, carry out quantitative analysis, and compare their physical health with that of those who have not worked after retirement.
The question of whether delaying retirement affects the health of workers is therefore framed in terms of the sustainability of physical health. If, when the retirement age is delayed to a specific age, there is a large difference in physical health between the people who delay retirement and those who do not, then we consider physical health to be unsustainable, and delaying retirement has a significant negative impact on workers' health. Conversely, if there is no significant difference in physical health between those who delay retirement and those who do not when the retirement age is extended to a certain age, then we consider physical health to be sustainable, and delayed retirement has no effect on workers' health. The age at which physical health ceases to be sustainable is the appropriate retirement age in China from the perspective of sustainable physical health.
The findings of the analysis should at least provide some indication, and reasonably close evidence, of whether delayed retirement affects the health of employees and to what extent. In collecting and analyzing the data, we need to consider differences in gender, job status, education level, and financial conditions, as well as the relevant causal relationships between control and constraint variables. We also need to select an appropriate data analysis method to ensure that the quantitative analysis of the impact of China's delayed retirement on the health of workers is feasible and reliable.

Background: Literature Review
As for how retirements would affect the health of the retired people in China, we need to interrogate and situate the existing comparative yet differential research literature in its broad context. These reveal important implications for understanding the relationship between retirement age and health of workers that informs an understanding of this research topic and its complexity.
Research from the USA by Neuman [4] used data from the HRS (Health and Retirement Study). Using retirement pensions as exogenous variables, this research found that retirement could positively affect the subjective self-assessed health of both male and female retirees. Using the same HRS data, Coe [5] showed that retirement barely affected the cognitive capability of older white-collar workers. The influential study by Ekerdt et al. [6], based on an extensive survey of retired veterans aged 55-73, indicated that retirement was unlikely to degrade health. However, Estes [7] has suggested that poverty in old age created the conditions for ill health among manual workers. Using mortality data covering the entire U.S. population, Fitzpatrick and Moore [8] found that male mortality is closely related to retirement age: after age 62 the mortality rate rises rapidly, whereas the change in female mortality is not obvious.
In addition, Bound and Waidmann [9] analyzed UK data from ELSA (the English Longitudinal Study of Ageing); they found that retirement had very little negative effect on either self-assessed or objective health, and had a tangibly positive effect on men's health. Davies et al. [10], using data from the LFS (Labour Force Survey), found that the LFS data were mainly collected from people in work, while data from people who had reached a certain age and were not employed were very rare. The analysis of LFS data was therefore biased, since it mainly compared the health of people currently in work with that of people who had worked in the past. This bias would lead the results to exaggerate the negative effects of aging on the health of people in work. Leaving this bias aside, a policy of prolonging the working years of older people would not significantly affect their health. Research by Powell and Taylor [11] showed that retirement in the UK could be extended to 67, mainly on the basis of demographic projections for the UK population from 2030. The reasons were better health, increased longevity, and a shortage of younger workers due to demographic shifts. However, Phillipson [12] suggested that poverty in retirement did affect aging experiences related to health.
In France, Westerlund et al. [13] investigated older people at retirement age, excluding those who retired for health reasons, and found that those who retired at age 55 or later had poorer self-perceived health than those who retired before 55. By comparison, Leinonen et al. [14] studied Finland. Their results showed that retirement at the statutory age would not necessarily have a detrimental impact on health. However, the loosening of retirement requirements encourages more individuals to retire earlier, despite the government advocating more working years and offering economic incentives to people willing to stay longer at work. The incentive measures do not work as expected. Researchers have pressed the government to tighten the requirements for retirement while reserving favorable early retirement conditions for people who need them because of underlying health conditions. In Norway, Hernaes et al. [15] made good use of a sample database from the national census. Using difference-in-differences models, they examined the links between retirement age and mortality among older people. The results indicate that retirement age has no obvious effect on mortality among older people.
Outside Europe, Butterworth et al. [16] used data from the Australian National Survey of Mental Health and Well-being and studied the survey responses of 10,641 Australian adults. The results showed that the health of retired people was worse than that of people still working at retirement age, particularly with respect to mental illness, which was increasing. Zhu [17] looked at 19,185 samples (including 3771 females) from surveys on Australian households, incomes, and employment changes. Using the retirement age and pension eligibility as exogenous input variables, the research focused on the impact of retirement on the health of women. The results showed that retirement had a tangible positive effect on women's self-perceived health, physical health, and mental health.
In Japan, Minami et al. [18] used survey data from 2008, 2010, and 2012 covering 1768 participants. The surveys focused on the deterioration of mental health and higher-level functional capacity (HLFC) among older people aged 65 and over. The data showed that mental health deteriorated quickly, while HLFC worsened gradually. The results also indicated no obvious differences in health between part-time and full-time older workers.
Synthesizing studies from various countries, Hessel [19] showed that retirement could degrade the health of both male and female workers. However, when an instrumental variables approach was used, the results indicated that retirement could lower the probability of poor self-perceived health for both women and men. Rohwedder and Willis [20] used survey data from 11 nations, including the US and some European nations. Their results showed that retirement clearly degraded the cognitive capabilities of older people.
There is relatively little literature on the effects of retirement on health in China. As with the research from other nations described above, the results also vary across studies. For example, Dong and Zang [21] analyzed survey data from CHARLS (the China Health and Retirement Longitudinal Study). Using the statutory retirement age as the instrumental variable, they found that retirement had negative effects on the health of retirees, even as they began going through the process and experience of retiring. Prolonging working years by delaying the retirement age may therefore improve health over the entire life-course. Wang and Zang [22] analyzed data from the 2010 CGSS (Chinese General Social Survey) and found that retirement had significant negative effects on both psychological and physical health. Furthermore, the negative effects were worse for men than for women.
The findings in the literature vary significantly because the studies draw on different data sources and adopt different approaches. In particular, studies differ in how far they account for factors inherent to health itself, which can lead to omitted variables. A more important issue, however, may be that retirement ages differ across nation states. It is also understandable that disparities in living environments and financial conditions lead to differing results.
The work of Michael [23] was based on personal savings and social security. The results from the research proved that the best age to retire would be 62 under the circumstances of regular markets or slow markets.
From three different perspectives of individual, enterprises, and governments, Forman and Chen [24] gave some suggestions about the best statutory age to retire based on the old-age pensions and retirement ages. Deng and Wang [25] considered the possibilities of deaths in Chinese population and established the model for the best statutory age to retire. Zhang and Ren [26] established an objective to maximize social welfare and built the economic model for finding the best flexible statutory age to retire based on the ever-changing structure of ages of population.
Tacchino [27] put forward 25 personal factors which could possibly affect the retirement age. Those factors contained individual psychological conditions, ability to continue working, and personal financial capabilities. Li and Wang [28] built a model of the best age to retire based on maximizing personal happiness. Using the models, the research simulated the data on the city of Shanghai. Lei and Yong [29] studied the best statutory age to retire based on the optimization of the benefits of personal old-age pensions.
In general, current research across nations on the relationship between retirement and health is fragmented and contradictory. For example, the differences in retirement among different groups of people are not sufficiently considered. There is also a lack of comprehensive discussion of whether different time spans of retirement in the samples may lead to different conclusions. In addition, previous studies are mostly based on comparing post-retirement and pre-retirement data for the same target population. Whether conclusions drawn from population profiles in Europe and North America can be applied directly to the Chinese population is an essential question for researchers and policy makers.
Through data collection and analysis, we assess whether there is a suitable retirement age before which retirement has no significant negative impact on people's physical health on average, but after which a delayed retirement has. In addition, in China under current retirement policy, different gender groups and different types of labor forces are retired at different ages. Therefore, the analysis will also be carried out for different groups. From the findings of the analysis, recommendations of suitable retirement ages can be made for different groups of people.

Research Method and Key Assumptions
In China, the concept of retirement applies to the urban population only. People living in rural areas continue to participate in the labor force as long as they are physically able to work. The scope of Chinese labor in this study is therefore limited to urban workers, in line with the traditional literature on China's retirement issues.
Will delaying retirement age affect workers' health? The basic idea of this paper is based on the sustainability of physical health. The specific ideas are as follows: Assuming that the current legal retirement age is postponed to a specific age, the difference between the physical health of the people who delay retirement and those who do not delay retirement is small, i.e., whether they delay retirement or not has no negative impact on their physical health; their physical health is considered sustainable. On the contrary, if the current legal retirement age is delayed to a specific age, the physical health of the people who delay retirement and those who do not delay retirement is quite different, i.e., whether the retirement is delayed or not has a negative impact on their physical health, it is considered that the physical health of the people who delay retirement cannot be sustained. When the delay of legal retirement age to a certain age leads to the unsustainable physical health, it is considered that this age is the appropriate retirement age from the perspective of sustainable physical health.
The method of the research is to treat the sample group who continue to work after the current statutory retirement age as a proxy for those who will delay their retirement in the future, and to use this group as the treatment group. The control group is the sample group of those who no longer work after the current statutory retirement age. The physical health of these people is the dependent variable. Firsthand data were collected through surveys, and the influence of delayed retirement on the health of residents is studied using an appropriate statistical analysis method while controlling for some important variables.
In this study, the survey is conducted in the following six major cities in China: Shanghai, Beijing, Guangzhou, Chengdu, Zhengzhou, and Shenyang. Annual medical expenses are surveyed and used as the indicator of the respondents' health. The choice of indicator follows the principle that an indicator should be effective, readily available, and easy to quantify. In China, urban residents normally pay their medical expenses with their personal health insurance card (Chen and Powell, 2012). The cost of medical treatment is recorded and can easily be found. The medical expenses in this article are derived from the payment data in the respondents' medical insurance fund usage records. Since workers with registered permanent residence in cities and towns in China pay into the medical insurance fund every month while they are working, they all use the medical insurance fund when they see a doctor. In addition, employees in China pay different medical insurance contributions according to their salaries. A medical insurance fund account consists of two parts: one from the individual's payments and the other from the socially pooled enterprise payments. When a worker retires, the individual no longer pays into the fund, and only the socially pooled enterprise payments continue. Therefore, when older people see a doctor after retirement, if the funds accumulated from individual payments are used up, they draw on the socially pooled medical reimbursement fund. Depending on the circumstances, the socially pooled fund can reimburse a large proportion of the expenses (but not all), and the remainder is paid by the individual in cash.
Many factors may affect the physical health of the respondents, such as genetic factors, lifestyle, exercise, and mental health. These factors are difficult to quantify and are not considered in the investigation. We assume that any resulting biases can be overcome through random sampling. The research therefore uses as control factors only those that respondents can answer accurately and that can be quantified, namely the education levels and living conditions of individuals and the areas in which they live. The number of years in education is used to characterize the education levels of the respondents. In general, non-manual workers have significantly more years of education than manual workers. The amount of money spent on basic daily life is used to represent living standards or conditions. The six selected cities cover a wide range of geographical areas, with Shanghai representing the eastern, Beijing the northern, Guangzhou the southern, Chengdu the western, Zhengzhou the central, and Shenyang the northeastern region of China.
To make the analysis of the survey data comparable and to give the research more rigor and validity, the following assumptions and explanations are made, because the samples in this paper come from six cities in different regions of China.

Assumption 1.
There is no significant difference in the medical expenses among different areas for the same treatment regimen.
The medical expenses of Chinese patients are composed of four parts: drug costs, hospitalization costs, surgery costs, and registration fees. China has a unified national price management mechanism for pharmaceutical drugs, so drug prices do not differ significantly among regions. The hospitalization and surgery costs for treating the same type of disease are also very similar. Registration fees do differ somewhat between regions, but because they account for a very small proportion of medical expenses, the regional differences in medical expenses caused by registration fees are small. Therefore, for this research, it is assumed that there is no significant difference in the medical expenses for the same kind of disease across the surveyed samples.

Assumption 2.
There is no significant difference in the daily non-medical expenses in the surveyed samples which come from different cities.
During the research, we have checked the Official Statistical Yearbooks to examine the daily consumption levels of residents in the six cities in the past five years and found that there is no significant difference among the regions. In addition, we have also compared the prices of 100 randomly selected commodities in the shopping malls and supermarkets of these six cities and found that the selling prices of similar commodities in the six cities are very close. As online shopping is becoming popular in China, the prices of products selling online are the same for buyers from different regions of China. Therefore, it is reasonable to assume that there is no significant difference in the daily non-medical expenses in the surveyed samples.

Assumption 3.
In this study, all the samples with the identity "cadres" are non-manual workers who are intellectuals with more than 10 years of education. The majority of those who were cadres before retirement in China are college graduates or were engaged in management work. Admittedly, some of them are not highly educated and were engaged in skilled physical labor; they were promoted to cadre status because of their work performance and thus enjoy the cadre retirement policy. The proportion of such people is very small and will become smaller in the future as the education levels of the Chinese population continue to improve.

Assumption 4.
In this study, all the samples with the identity "common workers" are engaged in manual labor. Not all workers covered by the common worker retirement policy are manual workers, but most of them are. Therefore, this is also a reasonable assumption.

Data Sources and Variable Settings
The survey was conducted in six cities using random sampling, and the ages of the surveyed samples ranged from 50 to 70. To ensure the quality of the data, we used face-to-face surveys, which took six months to complete. To balance the numbers of samples across regions and groups, we tried to survey approximately the same number of people in each of the following groups: manual labor control group (female), manual labor treatment group (female), manual labor control group (male), manual labor treatment group (male), non-manual labor control group (female), non-manual labor treatment group (female), non-manual labor control group (male), and non-manual labor treatment group (male). The detailed sample numbers in each group and each city are shown in Table 1. The non-manual workers surveyed are mainly doctors, university teachers, and experts working in scientific research institutes. The manual workers surveyed are mainly engaged in industrial manufacturing, supermarket sales, and hotel logistics. The survey collected 8123 questionnaire responses in total. After deleting 555 invalid responses, 7568 remained for further checking. Among the responses on annual medical expenses, some figures are unusually high. The respondents who provided these figures may have suffered from very serious health problems such as cancer or other major diseases. Although the number of such samples is small, they would seriously skew the outcome of the analysis. As the research is mainly concerned with the general level of public health, the samples with serious illness are excluded from further analysis. After this exclusion, 7200 survey responses are used in the analysis. The specific sample distributions are shown in Table 1.
The variables used for the data analysis are defined and described in Table 2. For both the control groups and the treatment groups in Table 1, China's current statutory retirement age is 50 for female manual workers, 55 for female non-manual workers, and 60 for male manual and non-manual workers.

Data Characteristics Analysis and Model Selection
In China, different groups of people are subject to different retirement policies. Accordingly, our data analysis is carried out separately for those groups. The scatter plots of annual medical expenses of the different groups are shown in Figures 1-4 for manual workers and Figures 5-8 for non-manual workers.
[Figure: Annual medical expenses scatter plot of delayed retirement group of male non-manual workers.]
Figures 1 and 2 illustrate that for the un-delayed retirement group of female manual workers, the changes in annual medical expenses between ages 50 and 63 are almost unnoticeable. For the women who continue to work after their statutory retirement age, there are also no noticeable changes in expenses before age 63, but expenses increase quite rapidly after age 63.
In Figures 3 and 4, similar patterns can be observed for male manual workers, except that age 64 is the dividing point.
We can also observe similar patterns in Figures 5 and 6 for female non-manual workers, and Figures 7 and 8 for male non-manual workers on their annual medical expenses, except that the dividing age is 66 for female and 67 for male.
Further regression analysis and statistical tests are necessary to determine whether the patterns observed above in the annual medical expenses of the various groups are significant. The uneven distributions of data points in the scatter plots show that the data analysis should be conducted on separate age groups rather than on all ages as one group; otherwise, the patterns may well be smoothed out.
In the next section, segmented regression with two segments is carried out for each of the eight groups shown in Figures 1-8. The breakpoint is age 63 for female manual workers, 64 for male manual workers, 66 for female non-manual workers, and 67 for male non-manual workers. The key age is not set in advance. Instead, we roughly judge a range of values from the scatter plot and then select a value in this range for Propensity Score Matching (PSM) analysis and verification. If the value passes the verification, we take it as the key age. If it does not, we increase the age by one year and repeat the PSM verification until it passes, which gives the key age.
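The search for the key age described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `find_key_age`, the `max_age` cap, and the `passes_verification` callback (a stand-in for the PSM verification at a candidate age) are all hypothetical.

```python
from typing import Callable, Optional


def find_key_age(
    start_age: int,
    max_age: int,
    passes_verification: Callable[[int], bool],
) -> Optional[int]:
    """Return the first candidate age that passes verification.

    start_age is the value read roughly from the scatter plot.
    passes_verification is a hypothetical stand-in for the PSM check
    at a candidate age. The candidate is increased by one year after
    each failed check; None is returned if no age up to max_age passes.
    """
    age = start_age
    while age <= max_age:
        if passes_verification(age):
            return age
        age += 1  # verification failed: try the next year
    return None
```

For example, starting the search at 61 with a check that first succeeds at 63 returns 63, mirroring the one-year increments described in the text.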

Introduction to PSM Analysis
This paper treats the sample group who continue to work after the current statutory retirement age as a proxy for those who will delay their retirement in the future, compares them with the sample group who no longer work after the current statutory retirement age, and analyzes the differences in their health. This is a typical treatment effect analysis. Because a dependent variable can be influenced by many factors, a simple study of the relationship between the dependent variable and the independent variables often gives rise to endogeneity problems. An important way to address endogeneity is the PSM method. Propensity score matching was first used by Rosenbaum and Rubin in the field of biostatistics for the analysis of experimental effects.
In the 1990s, the method began to be applied in health economics and other social sciences. PSM is an effective method for studying experimental treatment effects and the treatment effects of public policy. It compares two groups on multidimensional characteristics by reducing those dimensions to a single propensity score, and then, using one of several matching methods, matches each case with the cases whose propensity scores are closest. Matching methods include k-nearest neighbor matching, caliper and radius matching, kernel matching, and so on. Under non-randomized conditions, selection bias and confounding bias are likely to exaggerate or understate the effects of policy shocks, and PSM can eliminate these biases to the greatest extent. Eliminating the bias requires controlling for all covariates that may influence both selection and the outcome, so it is often necessary to control for as many variables as possible when matching. After PSM, the average treatment effect on the treated (ATT) can be used to observe and analyze the difference between the control group and the treatment group, and thus to analyze the policy effect.
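To make the matching step concrete, the following is a minimal sketch of k-nearest neighbor matching with an optional caliper, assuming the propensity scores have already been estimated. The names `Unit` and `att_nearest_neighbor`, and the representation of each respondent as a (propensity score, outcome) pair, are assumptions made for illustration; the study's own software and settings are not described in this section.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (propensity score, outcome), where the
# outcome could be, for example, annual medical expenses.
Unit = Tuple[float, float]


def att_nearest_neighbor(
    treated: List[Unit],
    control: List[Unit],
    k: int = 1,
    caliper: Optional[float] = None,
) -> float:
    """Estimate the ATT by k-nearest neighbor matching with replacement."""
    effects = []
    for p_i, y_i in treated:
        # Rank control units by propensity-score distance to this treated unit.
        ranked = sorted(control, key=lambda unit: abs(unit[0] - p_i))
        neighbors = ranked[:k]
        if caliper is not None:
            # Discard neighbors whose score is further away than the caliper.
            neighbors = [u for u in neighbors if abs(u[0] - p_i) <= caliper]
        if not neighbors:
            continue  # this treated unit has no acceptable match
        counterfactual = sum(y for _, y in neighbors) / len(neighbors)
        effects.append(y_i - counterfactual)
    if not effects:
        raise ValueError("no treated unit could be matched")
    return sum(effects) / len(effects)
```

A call such as `att_nearest_neighbor(treated, control, k=1, caliper=0.05)` averages, over the matched treated units, the gap between each unit's outcome and the mean outcome of its matched controls. Treated units without a match inside the caliper are dropped, which is one common design choice among several.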
Following the PSM method, this study sets the sample group who no longer work after the current statutory retirement age as the control group and the delayed retirement sample group as the treatment group. Generally, in PSM, the data for the control group are derived from a survey conducted before policy implementation or experimentation, and the data for the treatment group are derived from a survey conducted afterwards. In that setting PSM is usually combined with difference-in-differences (DID), and the data are analyzed with a difference-in-differences PSM estimator. As the data used in this study were collected in the same time period, difference-in-differences processing is not necessary, and PSM analysis alone is sufficient. A very important requirement for using PSM is that the data of the treatment group and the control group meet the overlap requirement.
PSM is an effective tool for addressing endogeneity problems, and the PSM model itself can address them well even without control variables. In other words, the PSM model used in this paper can study the impact of delayed retirement on health, i.e., the impact of the variable delay on the variable sdcosts, without considering any control variables. During the research process, we collected a wide range of variables such as education level, consumption level, location, gender, nature of work, age, and others. Of these variables, "Age", "Gender", and "Work" are used for classification or segmentation, while education level, consumption level, and location are used as the control variables of PSM to make the result of the PSM operation more accurate. Factors affecting health that are not easy to measure and count, such as genetics and living habits, are not used as control variables in this article, but the absence of these control variables will not affect the basic effects of PSM.
K-nearest neighbor matching, caliper matching (radius matching), and kernel matching are three typical PSM methods. In this paper, each of the three methods is applied separately, which not only tests the robustness of the research method but also verifies the validity of the research conclusions. In kernel matching, the estimator of ATT takes the standard form

$$\widehat{ATT} = \frac{1}{N_1} \sum_{i:\,D_i=1} \Big( y_i - \sum_{j:\,D_j=0} w(i,j)\, y_j \Big),$$

where $N_1$ is the number of treated individuals, $D$ is the treatment indicator, $y$ is the outcome, and $w(i,j)$ is the weight of the matching. The weight expression is

$$w(i,j) = \frac{K\big[(x_j - x_i)/h\big]}{\sum_{k:\,D_k=0} K\big[(x_k - x_i)/h\big]},$$

where $x$ is the propensity score, $h$ is the specified bandwidth, and $K(\cdot)$ is the kernel function.

Based on the previous breakpoint analysis of the scatter diagrams, this paper applies the PSM analysis to classified data. The premise of implementing PSM is the balance of the grouped data. The data of the control group and the treatment group should also share a large common range of values, i.e., the overlap hypothesis, which is verified in detail in the following quantitative analysis.
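As a rough illustration of kernel matching, the sketch below builds each treated unit's counterfactual outcome as a kernel-weighted average of all control outcomes, with weights that decline with distance in propensity score and sum to one. The Gaussian kernel, the bandwidth h = 0.06, and the function names are assumptions chosen for illustration; the kernel and bandwidth used in this study are not stated in the text.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel function K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def att_kernel(scores, treated, outcome, h=0.06, kernel=gaussian_kernel):
    """ATT by kernel matching on the propensity score.

    For each treated unit i, control j receives weight
    K((x_j - x_i) / h) divided by the sum of such terms over all controls.
    """
    controls = [(s, y) for s, d, y in zip(scores, treated, outcome) if not d]
    gaps = []
    for s, d, y in zip(scores, treated, outcome):
        if not d:
            continue
        raw = [kernel((c_s - s) / h) for c_s, _ in controls]
        total = sum(raw)
        if total == 0.0:
            continue  # no control carries weight for this treated unit
        counterfactual = sum(w * c_y for w, (_, c_y) in zip(raw, controls)) / total
        gaps.append(y - counterfactual)
    if not gaps:
        raise ValueError("no treated unit could be matched")
    return sum(gaps) / len(gaps)
```

A smaller bandwidth concentrates the weight on the closest controls, while a larger one averages over more of the control group; reporting results under several matching methods, as this paper does, is one way to check that the conclusion does not hinge on such choices.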

Analysis of the Manual Workers Group
As shown in Figures 9-12, the grouped data of manual workers are well balanced in terms of the standardized bias of the PSM covariates.
As shown in Figures 13-16, the treatment group and the control group share most of their value range, so the grouped data of manual workers overlap well and meet the overlap assumption required for PSM.
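For readers unfamiliar with the balance diagnostic, the standardized bias of a covariate can be computed as in the following sketch. It assumes the usual definition (difference in group means divided by the square root of the average of the two group variances, in percent); the function name and the input lists are hypothetical.

```python
import math


def standardized_bias(x_treated, x_control):
    """Standardized bias (%) of one covariate between two groups."""
    mean_t = sum(x_treated) / len(x_treated)
    mean_c = sum(x_control) / len(x_control)
    # Sample variances (n - 1 in the denominator).
    var_t = sum((x - mean_t) ** 2 for x in x_treated) / (len(x_treated) - 1)
    var_c = sum((x - mean_c) ** 2 for x in x_control) / (len(x_control) - 1)
    pooled_sd = math.sqrt((var_t + var_c) / 2.0)
    if pooled_sd == 0.0:
        raise ValueError("covariate has no variation in either group")
    return 100.0 * (mean_t - mean_c) / pooled_sd
```

The statistic is typically computed for each covariate before and after matching; values close to zero after matching indicate that the matched groups are comparable on that covariate.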
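A simple numerical counterpart to the overlap plots is the common support of the propensity scores. The sketch below uses the min/max rule, which is only one of several ways to define common support; the function names are hypothetical and the snippet is not the procedure used to produce the figures.

```python
def common_support(scores, treated):
    """Return (low, high): the score range covered by both groups."""
    t = [s for s, d in zip(scores, treated) if d]
    c = [s for s, d in zip(scores, treated) if not d]
    return max(min(t), min(c)), min(max(t), max(c))


def share_on_support(scores, treated):
    """Fraction of all units whose score lies inside the common range."""
    low, high = common_support(scores, treated)
    inside = sum(1 for s in scores if low <= s <= high)
    return inside / len(scores)
```

A share close to one corresponds to the situation described above, in which the treatment and control groups share most of their value range.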
For female manual workers aged between 50 and 63, it can be seen from Table 3 that whether or not they delay their retirement has almost no effect on their health differences. The ATT estimate from the k-nearest neighbor matching PSM analysis is −22.07, which is negligible in magnitude. The corresponding t value is −0.32, indicating that the difference is not significant. To test the validity of this result, PSM analysis is also carried out using caliper matching and kernel matching. The results are shown in Table 3. The estimated ATT from the caliper matching PSM analysis is −37.40, and the corresponding t value is −0.66, which is not significant. The estimated ATT from the kernel matching PSM analysis is −5.63, and its corresponding t value is −0.11, which is not significant. The results from the caliper matching and kernel matching methods are consistent with the conclusion obtained from the k-nearest neighbor matching PSM analysis, which also supports the robustness of the PSM analytical method.
Note to Table 3: the first bracket in each cell is the Std. Err.; the second bracket is the t value.
For female manual workers who are older than 63, the results in Table 3 show clearly that delayed retirement has a negative effect on their health differences. The ATT from the k-nearest neighbor matching PSM analysis is 969.38 with a corresponding t value of 8.26, which is highly significant. In other words, female manual workers who are older than 63 and delay their retirement pay 969.38 RMB more in annual medical expenses than those who do not, which reflects the apparent negative impact of delayed retirement on their health. To test the validity of this result, caliper matching and kernel matching PSM analyses are also carried out, and the results are given in Table 3. The estimated ATT from the caliper matching PSM analysis is 1022.85, and the corresponding t value is 10.53, which means the difference is significant. The estimated ATT from the kernel matching PSM analysis is 1025.90, and the corresponding t value is 10.80, indicating that the difference is also significant. The results from the three PSM analysis methods all indicate that delayed retirement has a negative effect on the health of female manual workers older than 63.
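The significance statements above rest on the t value, which is the ATT estimate divided by its standard error. The short sketch below makes that arithmetic explicit. The 1.96 threshold is the conventional large-sample critical value for a two-sided test at the 5% level and is an assumption here, since the text does not state the significance level used; the helper names are hypothetical, and the implied standard errors are back-calculated from rounded figures, so they are approximate.

```python
def implied_std_err(att, t):
    """Standard error implied by a reported ATT and its t value (t = ATT / SE)."""
    return att / t


def significant(t, critical=1.96):
    """Two-sided test: is |t| at least the assumed critical value?"""
    return abs(t) >= critical


# Caliper-matching estimates reported for female manual workers.
younger = {"att": -37.40, "t": -0.66}   # aged 50 to 63
older = {"att": 1022.85, "t": 10.53}    # older than 63

se_younger = implied_std_err(younger["att"], younger["t"])  # about 56.67
se_older = implied_std_err(older["att"], older["t"])        # about 97.14
```

With these figures the younger group's estimate is well inside one standard error of zero, while the older group's estimate is more than ten standard errors away from zero.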
It can be seen from Table 3 that, for male manual workers aged between 60 and 64, whether or not they delay their retirement has little effect on their health differences. The ATT estimate from the k-nearest neighbor matching PSM analysis is −85.53, which is negligible in magnitude. The corresponding t value is −0.94, indicating that the difference is not significant. To test the validity of this result, PSM analysis is also carried out using caliper matching and kernel matching. The results are shown in Table 3. The estimated ATT from the caliper matching PSM analysis is −77.50, and the corresponding t value is −1.00, which is not significant. The estimated ATT from the kernel matching PSM analysis is −70.98, and its corresponding t value is −0.95, which is not significant. These two analytical methods show that delaying retirement has no effect on the health of male manual workers who are younger than 64 years old, which is consistent with the conclusion of the k-nearest neighbor matching PSM analysis and supports the robustness of the PSM analytical method.
As shown in Table 3, delayed retirement clearly has a significant effect on the health differences of male manual workers who are older than 64 years old. The ATT from the k-nearest neighbor matching PSM analysis is 1002.34, with a corresponding t value of 10.81, which is highly significant. In other words, male manual workers who are older than 64 and delay their retirement pay 1002.34 RMB more in annual medical expenses than those who do not, which reflects the apparent negative impact of delayed retirement on their health. To test the validity of this result, PSM analysis is also carried out using caliper matching and kernel matching. The results are shown in Table 3. The estimated ATT from the caliper matching PSM analysis is 1047.38, and the corresponding t value is 13.54, indicating that the difference is significant. The estimated ATT from the kernel matching PSM analysis is 1044.33, and the corresponding t value is 14.01, indicating that the difference is significant. Therefore, the results of all three PSM analysis methods indicate that delayed retirement has a negative effect on the health of male manual workers older than 64, and support the robustness and effectiveness of the PSM analysis method used in this article.

Analysis of the Non-Manual Workers Group
As shown in Figures 17-20, the grouped data of non-manual workers are well balanced in terms of the standardized bias of the PSM covariates.
As shown in Figures 21-24, the treatment group and the control group share most of their value range, so the grouped data of non-manual workers overlap well and meet the overlap assumption required for PSM.
As for the variable "delay" (Delayed Retirement), it can be seen from Table 4 that, for female non-manual workers aged between 50 and 66, delayed retirement has little effect on their health differences. The ATT estimate from the k-nearest neighbor matching PSM analysis is −36.47, which is negligible in magnitude. The corresponding t value is −0.48, indicating that the difference is not significant. To test the validity of this result, PSM analysis is also carried out using caliper matching and kernel matching. The results are shown in Table 4. The estimated ATT from the caliper matching PSM analysis is 12.33, and its corresponding t value is 0.20, which is not significant. The estimated ATT from the kernel matching PSM analysis is −29.71, and its corresponding t value is −0.56, which is not significant. The results from these two analytical methods show that delayed retirement has no effect on the health of female non-manual workers who are younger than 66, which is consistent with the conclusion of the k-nearest neighbor matching PSM analysis. The consistency also supports the robustness of the PSM analytical method.
Note to Table 4: the first bracket in each cell is the Std. Err.; the second bracket is the t value.
As shown in Table 4, delayed retirement clearly has a significant effect on the health differences of female non-manual workers who are older than 66. The ATT from the k-nearest neighbor matching PSM analysis is 949.20, with a corresponding t value of 6.73, which means the difference is highly significant. In other words, female non-manual workers who delay their retirement beyond 66 pay 949.20 RMB more in annual medical expenses than those who do not, which reflects the apparent negative impact of delayed retirement on their health. To test the validity of this result, PSM analysis is also carried out using caliper matching and kernel matching. The results are shown in Table 4. The estimated ATT from the caliper matching PSM analysis is 1048.86, and the corresponding t value is 8.79, meaning the difference is significant. The estimated ATT from the kernel matching PSM analysis is 1000.61, and the corresponding t value is 8.67, meaning the difference is significant. These PSM analyses indicate that delayed retirement has a significant negative effect on the health of female non-manual workers who are older than 66, and the consistency of the results also supports the robustness and effectiveness of the PSM analysis method.
As can be seen from Table 4, delayed retirement has little effect on the health differences of male non-manual workers who are aged between 60 and 67. The ATT estimate from the k-nearest neighbor matching PSM analysis is 72.10, which is negligible in magnitude. The corresponding t value is 1.10, indicating that the difference is not significant. To test the validity of this result, PSM analysis is also carried out using caliper matching and kernel matching. The results are shown in Table 4. The estimated ATT from the caliper matching PSM analysis is 57.29, with a corresponding t value of 1.13, which means the difference is not significant. The estimated ATT from the kernel matching PSM analysis is 61.61, with a corresponding t value of 1.26, indicating the difference is not significant. The results from these two analytical methods show that delayed retirement has no effect on the health of male non-manual workers who are younger than 67, which is consistent with the conclusion of the k-nearest neighbor matching PSM analysis. The consistency also supports the robustness of the PSM analytical method.
As shown in Table 4, delayed retirement clearly has a significant effect on the health differences of male non-manual workers who are older than 67. The ATT from the k-nearest neighbor matching PSM analysis is 1183.73, with a corresponding t value of 9.49, meaning the difference is highly significant. In other words, male non-manual workers who delay their retirement beyond 67 pay 1183.73 RMB more in annual medical expenses than those who do not, which reflects the apparent negative impact of delayed retirement on their health. To test the validity of this result, PSM analysis is also carried out using caliper matching and kernel matching. The results are shown in Table 4. The estimated ATT from the caliper matching PSM analysis is 1185.39, and the corresponding t value is 11.76, indicating the difference is significant. The estimated ATT from the kernel matching PSM analysis is 1201.25, and the corresponding t value is 12.12, indicating the difference is significant. The results from the three PSM analysis methods indicate that delayed retirement has a significant negative effect on the health of male non-manual workers older than 67, and the consistency of the findings across the different PSM methods supports the robustness and effectiveness of the methods.

Discussion
The previous literature has found mixed evidence of a relationship between early retirement and health outcomes, but those studies tended to focus on smaller segments of a population, and few studies have examined the effects of delayed retirement on health from the perspective of the sustainability of physical health. This study contributes to the literature by exploring the effect of delayed retirement, rather than early retirement, on health. The findings suggest that different groups of people have different suitable retirement ages from the perspective of the sustainability of physical health. The study found that delayed retirement has no statistically significant effect on physical health for the following groups: female manual workers who are younger than 63, female non-manual workers who are younger than 66, male manual workers who are younger than 64, and male non-manual workers who are younger than 67.
The findings can be used as an important reference for the delayed retirement policy design in China. Furthermore, they could be used to increase public acceptance of a delayed retirement policy as health is a major concern of people approaching retirement ages.
The findings also suggest that the delayed retirement policy should still segment people into groups and set a different appropriate retirement age for each group. More specifically, it is recommended that the retirement age be delayed to 63 for female manual workers, 64 for male manual workers, 66 for female non-manual workers, and 67 for male non-manual workers.

Conclusions
In conclusion, following the experience of other countries, such as the UK, a delayed retirement policy should normally be phased in over several years, introducing a short period of delay at a time until the recommended delay for each group is reached. As China is a large country with a population of 1.4 billion (Chen and Powell, 2012) and diverse demographic structures and social and economic conditions, the design and introduction of retirement policy should be even more cautious and flexible. For example, the policy design should consider allowing different provinces to decide their own pace for phasing in the policy. It should also allow the retirement policies for women and men to be phased in at different paces.
Other issues related to delayed retirement policies include the lengthening of the population's working life, the possibility of longer life expectancy in the future, and the potential effects on benefit payouts. Further research on those issues is likely to be necessary to complement the findings of this paper for policy design.