Does Air Pollution Affect Health and Medical Insurance Cost in the Elderly : An Empirical Evidence from China

In recent years, the health conditions of the elderly (the middle-aged and the old, which for this study means people over 50 years of age) have deteriorated along with the aggravation of air pollution, which led to the change of medical insurance costs. This phenomenon is particularly prominent in developing countries, such as China. A total of 15,892 research subjects from 56 prefecture-level cities in 23 provinces were collected from the database of China Health and Retirement Longitudinal Survey (CHARLS). We investigated the effects of air pollution, physical health, and medical insurance costs among three mechanisms using logistics and Ordinary Least Squares (OLS) hybrid cross-sectional regression, and we conducted a robust test. Overall, two pollutants, namely, PM10 and NO2, respectively showed an “inverted U-shaped” and “positive U-shaped” influencing path to health. In addition, when we studied the mechanism of air pollution affecting medical insurance costs, we found that air pollution can affect medical insurance costs through affecting self-rated health, and that the impact path is related to different diseases to some extent. At the same time, there was a certain negative correlation between air pollution and medical insurance: The higher the degree of air pollution, the worse the self-rated health, and the fewer opportunities there are to purchase medical insurance. It can be seen that air pollution affects the physical health of middle-aged and elderly people, thus indirectly and negatively affecting the medical insurance cost. Further research also found that the types of air pollutants in southern and northern China showed some differences. Specifically, NO2 and SO2 were the pollutants that harm the health of the elderly in the south and north, respectively.


Introduction
In recent years, with the rapid development of emerging economies, such as in China, the world has experienced several problems including population explosion, over-consumption of resources, and ecological destruction, which have prevented economic development, as well as an improvement of life quality.Furthermore, these problems are threatening the sustainable development of humankind in the future.Among these problems of environmental pollution, air pollution accounts for a large proportion.
In fact, the impact of air pollution on human health has caused widespread concern.Globally, 3.3 million people die prematurely each year due to outdoor air pollution [1] Numerous studies have shown that air quality can affect public health [2][3][4][5][6][7].Pope [2] was first to find that the negative effect of air pollution on public health would gradually increase over time.Grant and Poursafa [3] examined the relationship among air quality index (AQI), cardiac risk factors, and cancer mortality in Iran and the United States.With the concern of air pollution from all communities, an increasing number of research has focused on the relationship between air pollution, health, and medical care.Li [4] analyzed the relationship between daily Air Pollution Index (API) and pollution indices, and between API and deaths using a distributed lag nonlinear model, which indicated that API can be used to inform people about the effects of air pollution on health in different populations.Geriatric people, females, and people with low educational attainment are the vulnerable groups affected by air pollution; so this information indicates who should be given more attention.Sun [5] collected data about residents' understanding and the increase of their medical expenses caused by air pollution through questionnaires.The author concluded that smog could directly affect the health status of residents, and increase their outpatient and emergency visits, as well as the incidence of chronic diseases.Xu [6] estimated the harmful effects of air pollutants, such as SO 2 , on human health based on inter-provincial panel data.Liu [7] assessed the effects of air pollution on health using a semi-parametric regression model in four municipalities, and found that a nonlinear relationship exists between API and respiratory diseases, and that the impact of air pollution on health is highly significant.Hamedian [8] utilized a total of 11 parameters (i.e., CO 2 , SO 2 , PM10, PM2.5, O 3 , NO 2 , benzene, toluene, ethylbenzene, xylene, and 1, 3-butadiene) in a fuzzy inference system and a fuzzy C-means clustering method in order to evaluate five monitoring stations in Tehran, Iran.However, several types of air pollution indices have not been measured in China.Therefore, in this study, we used the concentrations of PM10, SO 2 , and NO 2 to measure the degree of air pollution in the country.
Notably, although the economic growth in China has maintained its rapid growth since the 21st century, the problem of environmental pollution has become increasingly serious.As a result, widespread damage to human health has caused increasing concern.Particularly, air pollution and its health damage are prominent in the environment.According to the report on China's environmental status in 2015, up to 265 (78.4%) of the 338 prefecture-level cities exceed the standard of air quality.Current research has also shown that air pollution leads to respiratory and cardiovascular diseases, which increases the rate of morbidity, attendance, and mortality.Short-term exposure to air pollution can also increase the risk of respiratory and cardiovascular diseases in healthy populations.Poursafa and Rashidi [9] analyzed the relationship between the pollutant standards index and hematologic parameters, as well as respiratory diseases, in Iran.Pope [2] found that for each 10 µg/m 3 increase in the concentration of PM10, the risk of total deaths in the population increases by approximately 4.0%.Chen [10] found that reducing the concentration of PM2.5 can effectively reduce the loss of life expectancy.Hanne Krage Carlsen [11] used time series regression to investigate the association between the daily number of individuals who were dispensed anti-asthma medication and levels of the air pollutants particle matter.She found that there was a measurable effect on health, which could be attributed to the air pollution in Iceland.
Despite the theoretical and factual evidence that reveals that air pollution in China has become a major factor endangering people's physical and mental health, the existing literature on the impact of air pollution on health is mostly based on the investigation of a particular area, and the research on the true impact on micro-subjects (individuals) is lacking.In addition, the continuous rise in medical costs, including medical insurance, has become a global phenomenon.Maintaining a reasonable increase in medical expenses and coordinating the development of public health with the social economy has become the common goal of most countries [12].Moreover, numerous analyses of consumption and investment needs on health, reflect that there is some economic impact on individuals and the society from physical health [13][14][15][16][17][18].Some studies have examined the influencing factors and control strategies of medical expenses from the perspectives of supply and demand of medical services and prices.However, in developing countries where the current environmental pollution is becoming increasingly serious, such as in China, the existence of other conduction mechanisms for the significant increase in individual medical insurance costs due to the increase of air pollution remains unknown.
Therefore, based on the existing literature and the data from CHARLS of 2011, 2013, and 2015, this study attempts to systematically analyze the relationship between air pollution, physical health, and medical expenses in recent years.China Health and Elderly Tracking Investigation Database adopts a multistage sampling method.In the sampling stages of counties, districts, and village houses, a probability sampling method is used according to population scale.A total of 150 districts and 450 villages were randomly selected from the 30 provincial-level administrative units in China.After excluding the cities with missing data, 15,892 sample units were obtained from 56 prefecture-level cities in China, covering 23 provinces (e.g., Hebei, Zhejiang, Yunnan, and Jiangxi) and four municipalities (e.g., Beijing, Tianjin, Shanghai, and Chongqing), as well two autonomous regions (i.e., Guangxi and Ningxia), which cover nearly 80% of China's provincial units and reflects the current situation in the country.
We used the self-rated health, outpatient, and inpatient of micro-subjects as the basic measure of the health condition of an individual.China has only started the statistics of the current widely used indicator, PM2.5, in 2013; thus, to ensure the consistency of statistical indicators before and after 2013, we substituted PM2.5 with three other indicators, namely, PM10, SO 2 , and NO 2 , which also reflect the common air quality conditions.We also included the respondents' age, gender, total years of education, number of chronic diseases, local Gross Domestic Product (GDP) per capita of the previous year, and other factors that may affect health and medical expenses as control variables.
The present study found that the air pollutants PM10, SO 2 , and NO 2 affect the health status of the elderly.Among the pollutants, the impact of PM10 on health is the inverted U-shaped structure, which initially increases and then declines.Air pollution, physical health, and medical insurance costs have a significant conduction mechanism, wherein air pollution on physical health showed a highly significant positive effect; thus, more serious air pollution implies worse health status.Consequently, changes in health positively promote the increase in medical insurance costs.At the same time, to explore the difference of the effect on the elderly who have different diseases, this study also conducted an in-depth analysis on the impact of air pollution on health by dividing people into groups of those who have hypertension, heart disease, and other chronic diseases.The results show that PM10 and SO 2 have a significant effect on the health of patients with hypertension, whereas SO 2 has a greater impact on the health of those who have cardiovascular diseases.In terms of air pollution affecting the health of the population, further tests indicated that the main cause shows a distinct difference in north and south of China; hence, the pollutants NO 2 and SO 2 are harmful to the health of those in the south and north, respectively.
Based on the 15,892 sample units in China, we conducted a highly objective and quantitative assessment of the damage of air pollution on human health, which could help in understanding its true degree and provide a scientific basis, as well as serve as guidance for formulating suitable policies on balancing economic development and environmental protection in developing countries.

Theoretical Analysis and Research Hypotheses
In the past decade, air pollution caused by poor weather conditions, such as smog, gale, and sandstorm, has severely threatened the health of the middle-aged and old Chinese people increasingly.According to the public health data from the Ministry of Health (2016), respiratory diseases in middle-aged and old Chinese people have increased significantly over the past 10 years (2016 Ministry of Health data.).Health insurance costs reflect the seniors' awareness of physical and mental health insurance.In fact, an inherent mechanism of association and linkage is found among air pollution, people's health condition, and medical expenses.
Air is one of the environmental elements on which mankind depends, and it is directly involved in human life activities, including metabolism and thermoregulation.Once the severity of air pollution continues to increase, its impact on the human body will appear from recessive to dominant, which is a long accumulation process.For example, inhaled particles will accumulate and cause respiratory and pulmonary diseases through the lungs and trachea, which can also drastically stimulate eye conjunctiva, nasal cavity, and respiratory mucosa, causing lung inflammation or edema.NO 2 may also lead to severe pulmonary edema.These indications about air pollution are closely related to the increasing incidence of respiratory and circulatory diseases and mortality.Therefore, the cumulative effect of air pollution on human health will have a significant impact.Hence, we hypothesized the following: Hypothesis 1.A close positive correlation exists between air pollution and individual health in the area, that is, the more serious air pollution is, the worse the health condition will be.
Generally, health status can directly affect personal medical expenses.That is, the worse the health condition is, the greater the prevalence will be, which will then cause greater expenditure on body care, medicine, and medical insurance, and vice versa.Furthermore, air pollution can be regarded as a source of retrospective health and hygiene problems, because of the special effects on human health and its feature of long-term and large-scale effects.We speculated that the deterioration of health caused by air pollution would also lead to the change in medical insurance costs.Based on this situation, we expected the following hypothesis: Hypothesis 2. Air pollution is related to the cost of individual medical insurance costs, and the influencing path has a certain relationship with different diseases.

Measurement of Air Pollution and Health
AQI is used as an indicator of air pollution in most of the current research.However, the AQI index in China was officially released in 2013.Under this condition, we used the concentrations of PM10, SO 2 , and NO 2 to measure the degree of air pollution.The self-rated health indicators in CHARLS database can reflect the health status of the respondents, which is the result of evaluating the health status through the angle of the individual.The indicators are divided into the following five categories: Excellent, better, good, fair, and bad.This study reprocessed these indices as follows.If the respondent was "excellent," "better," or "good," then the respondent was set to 1, whereas "fair" or "bad" were set to 0. To avoid the possible bias of using a single indicator to measure health level, we also used the "outpatient times" in four weeks and "inpatient times" within a year in the CHARLS database for comparison.

Selection of Control Variables
In reality, several factors affect the health status of the elderly.Hence, we selected some characteristic variables of the respondents in the CHARLS database as control variables, including age, gender, education level, and quantity of chronic disease.Moreover, the proportion of medical expenses to GDP usually increases with the development of economy and citizens' living standards.However, the level of economic development often has a certain lag effect on the demand of health services [19].Therefore, we also selected local per capita GDP of the previous year as a control variable.The related variables are defined in Table 1.

Analysis of Descriptive Statistics
In this study, a total of 15,892 individual samples were obtained after excluding certain cities with missing relevant data of air pollution and other indicators.We first made a basic analysis of the statistical characteristics of the main variables above.The results are shown in Table 2.The descriptive analysis in Table 2 indicates that the average concentrations of the three main pollutants (i.e., PM10, SO 2 , and NO 2 ) in the atmosphere are between 0.03-0.10mg/m 3 , of which the average concentration of PM10 is the highest, whereas those of SO 2 and NO 2 are similar.For the self-rated health of the dummy variable, the mean value is 0.6425, which exceeds 0.5; thus, the physical health status of the study sample is slightly better than the general level.From the statistical results of the two variables containing inpatient and outpatient treatments, the means are 0.2312 and 0.5288, respectively; hence, the number of outpatient treatment is significantly higher than that of inpatient treatment, which accords with the actual situation based on experience.
In addition, for a total of 15,892 samples over a three-year period, the average age is 61.25 years, of which 7390 (46.5%) are male and 8502 (53.5%) are female.The distribution of education years ranges from 0 to 42, with a mean of 5.02 and a standard deviation of 5.27; these values indicate that the level of education significantly differs in this study.GDP per capita in US dollars can objectively reflect the level of social economy development and the degree of development.The average medical insurance cost is 983.1716yuan, ranging from 0 to 19,000, which has a large standard deviation.
Wooldridge put forward that, there are many advantages of using logarithms of strictly positive variables: (1) Interpretation of coefficients is easier: independent of the units of measurements of xs (elasticity or semi-elasticity); (2) when y > 0, log(y) often satisfies CLM assumptions more closely than y in levels.Strictly positive variables (prices, income, etc.) often have heteroscedastic or skewed distributions.Taking logs can mitigate these problems; (3) log transformation reduces or eliminates skewness and reduces variance; (4) taking logs narrows the range of the variable leading to less sensitive estimates to outliers (extreme observations) [20]; for better data visualization and easier statistical inference, we took the logarithmic treatment of the variable GDP per capita and quantified the level of education.

Air Pollution and Self-Rated Health
We built a logistic regression model (1) in order to analyze the relationship between air pollution and self-rated health.
The logistic regression model was established as follows: And in which F ij are the factors influencing self-rated health, β j is the coefficient to be determined.The function of p is s-type distributed and is an increasing function.p ∈ (0, 1).
For each sample i, (i = 1, 2, . . ., n), if p ≈ 0, it indicates that the health of the body is poor, if p ≈ 1, it indicates the health condition is good.
We took the maximum likelihood function method to find the parameters, n individuals are independent, then the joint density likelihood function of the sample is: Take the logarithm on both sides: in which Maximize the above function value, find the coefficient (j = 0, 1, 2, . . ., m), and find the partial derivative and equal to 0: The above equations are combined to find the estimated parameter values.
In our model, in all pentameters β i , we used β 1 and β 2 to represent the coefficients of the first term about the pollutant variable, and the quadratic term about the pollutant variable respectively. Thus, where the dependent variable Health represents self-rated health, which measures the health condition of each individual.As previously mentioned, if the indicator "self-rated health" is "excellent," "better," or "good," then we set it to 1; otherwise, it is 0. We used AP as air pollution variable to represent the concentrations of PM10, SO 2 , and NO 2 .In view of the possible nonlinear relationship between air pollution and physical health, we also added the quadratic term of AP into the model.The impact from air pollution to self-rated health can be divided into four categories according to the sign of β 1 and β 2 .(1) If β 1 > 0 and β 2 > 0, then the impact of air pollution on health is positive; (2) if β 1 < 0 and β 2 < 0, then the impact of air pollution on health is negative; (3) if β 1 < 0 and β 2 > 0, then the effect of air pollution on health is "initially negative and then positive"; (4) if β 1 > 0 and β 2 < 0, then the effect of air pollution on health will appear as "initially positive and then negative."Of course, this is only a theoretical way for air pollutants to affect human health, and it needs to be analyzed in combination with the air pollution concentration range.In addition, x represents a series of control variables that can control the population characteristics of a region.
The current micro-survey data in CHARLS database only include the years 2011, 2013, and 2015.Thus, we merged the three-year data to perform mixed logistics regression.The regression results are shown in Table 3.
We first analyzed the regression results of air pollution on self-rated health, focusing on the coefficients of the three main air pollutants in the models.
The first five columns in Table 3 show the results of regression with only one type of air pollutant concentration without control variables in the model.Let us consider the first column as an example, the first and square terms of PM10 pass the test at the significance level of 1%; the coefficients of the first and square terms are positive and negative respectively, indicating that the impact of air pollution on self-rated health seems to present an "inverted U-shaped" path, which implies an "initially positive and then negative" effect.That is, people's health condition become better at first, and then worsens with the increase in pollutant concentration, which reflects the path of impact caused by air pollution on health to some extent.However, the influence degree of various air pollutants indicates that the regression coefficients of SO 2 and NO 2 are relatively small, whereas the absolute value of PM10 coefficient is greater and has more impact on health.This impact may be due to the fact that China in recent years has weak restrictions on PM10 emission and inhalable particles compared with sulfides and nitrogen oxides, resulting in a large content of this pollutant in the air; moreover, PM10 is presently more harmful to health.
The fourth column in Table 3 shows the regression results of adding all the first terms about air pollutants into the model.The results show that the coefficients between health and NO 2 and SO 2 pass the test at the 1% significance level, whereas PM10 is significant at 10%.After adding the first and square terms of the three air pollutant concentrations, the regression results are shown in the fifth column of Table 3, in which only PM10 shows a statistical significance.The first and second coefficients are positive and negative respectively, which is consistent with the first regression result.NO 2 and SO 2 do not show significant effects.Table 3.Effect of air pollution on self-rated health (self-rated health as the dependent variable).

Variable
Regression Finally, to measure the impact of air pollution on health accurately and compare with the regression results above, we controlled the age, education years, number of chronic diseases, and regional GDP per capita.The related results are listed in the last five columns in Table 3.The results of adding the three air pollutants separately into the model indicate that PM10 and NO 2 pass the significance test at the levels of 1%, whereas SO 2 is not significant.Meanwhile, when the three pollutants are added into the model all at once, the coefficients of PM10, NO 2 and SO 2 are all significant at the level of 1%, indicating that the results are better after adding these three pollutants into the model together.Moreover, similar results are obtained in the nonlinear model with the first and second term additions of the three air pollutant concentrations.The primary and secondary terms of PM10 and NO 2 pass the test at the significance level of 1%, whereas SO 2 is not significant.
It is worth mentioning that: The coefficient of PM10 is 11.6392 in Regression (1) and 0.1050 in Regression (6) in Table 3, which is mainly due to the control of more variables, which reduces the explanatory power of explanatory variables and makes them closer to the real situation.Refer to the above for an explanation of the relevant situation later.
To sum up, Hypothesis 1 indicates that the air pollution status is negatively related to the health of individuals in the region to a certain extent.That is, the more severe air pollution is, the worse health condition will be.

Self-Rated Health and Medical Insurance Costs
To verify whether a significant change exists in medical expenses under different health conditions, and to make a reference and comparison with the following study about the impact of air pollution on medical insurance costs, we established an OLS regression Model (2) as where the dependent variable Medical is an indicator of individual medical insurance costs, and X represents a series of control variables in terms of the population characteristics of a region.We also performed a hybrid cross-sectional regression using the three-year data in the CHARLS database.
The estimated results are shown in Table 4.In view of other relevant factors that influence medical expenses, we added a series of variables that represent demographic characteristics, including age, gender, number of chronic diseases, local GDP and educational years.We selected local per capita GDP of the previous year as a control variable due to the lagged effect of economic level on the demand for health services.
From Table 4, the regression coefficient of self-rated health is −102.2384,which indicates that individuals will generally increase their medical insurance costs to obtain premium payments in the event of a medical condition to reduce personal medical expenses, because they are aware of the deterioration of their health status.Hence, the empirical results of self-rated health status are in line with expectations, and the result is statistically significant.Among all the control variables, the chronic disease index passes the test at a significance level of 1%, which shows that the individual health status (the index of chronic disease type) is closely related to the local medical insurance costs.

Mechanism of Air Pollutants Affecting Medical Insurance Cost
To further clarify the operated mechanism between air pollution, self-rated health, and medical insurance costs, we added the interaction of air pollution and self-rated health as an explanatory variable and made a comparative analysis with the self-rated health in Regression Model (2).Thus, we built the OLS regression Models (3), ( 4) and ( 5) as follows: where the independent variable AP represents the concentrations of air pollutants, PM10, SO 2 , and NO 2 .To verify the effect of air quality on medical insurance costs by indirectly influencing people's health, we added the interaction of air quality and health to test the indirect impact of air quality on medical insurance costs.The empirical results are shown in Table 5.
Table 5. Regression results of Models ( 3), ( 4) and ( 5) on air pollution, physical health, and medical insurance costs.The results show that when air pollution is added into the model and the interaction term between air pollution and self-rated health is not added, the concentration of air pollution and self-rated health are inversely related to the expenditure of medical insurance.When air pollution, the interaction between air pollution and self-rated health are both added into the model, the self-rated health has a positive impact on the expenditure of medical insurance, but air pollution still has a negative impact on the expenditure of medical insurance on the whole.It seems that self-rated health is inversely proportional to medical insurance expense, but after removing the influence of air pollutant concentration on medical insurance expense, self-rated health on medical insurance expense is positively correlated, that is, the higher the self-rated health level, the higher the medical insurance expense should be, which is reasonable, because medical insurance requires that the insured person is in good health.This proves that air pollution can affect medical insurance expense by affecting self-rated health, which is consistent with Hypothesis 2.

Variable
Only the number of chronic diseases negatively affects medical insurance expense at the significance level of 1%, and the parameters of other variables are not very significant.The reason for this result may lie in the fact that; (1) the major factors currently affecting the medical insurance expenses in China are mainly diseases, especially those with long treatment cycles.In this regard, the data released by the Chinese Ministry of Health also seems to support this point; (2) moreover, the effect of air pollution on self-rated health and medical insurance expenses of micro-entities may have a certain degree of lag; thus, the impact on medical insurance expenses in the current period will not be significant; (3) furthermore, the short-term effect is not evident due to the limited time available to obtain individual data.

Further Study
Air pollution affects the health of middle-aged and old people.However, is there any difference in the impact among people who have different diseases?Local scholars have shown that air pollution leads to respiratory and cardiovascular diseases, which increases the rate of system incidence, visiting, admission, and mortality [21][22][23][24].Short-term air pollution exposure increases the risk of respiratory and cardiovascular diseases in healthy people [25][26][27].
Heart disease and hypertension (high blood pressure) are among the three most common chronic diseases in the elderly (see Appendix A).We selected the similar prevalence of heart disease and hypertension as a sub-sample basis to explore the different populations affected by air pollution (Red as above).
A straightforward assumption is that the impact caused by air pollution on the health conditions of people with different diseases is different.We constructed OLS regression Models ( 6) and (7) to verify the effect of air pollution on health in different affected populations.
Table 6 shows that the rate of hypertension, age, education years, number of chronic diseases, and local GDP in the previous year all pass the test with a significance of 1%.Among the factors, self-rated health and age of the patients are statistically significant (p < 0.01), and the regression coefficient is −0.1299.Therefore, as age increases, the health condition of patients with hypertension continues to deteriorate.In addition, the health of patients with hypertension improves with the development of local GDP level, reflecting that the local economic level and medical services can effectively reflect the health of patients.The patients' health and education level also pass the test at a significance level of 1% with a regression coefficient of 0.1677, indicating that a higher education level indicates better health condition.Meanwhile, the results of patients with heart diseases show a certain similarity: Self-Rated health and the number of chronic diseases, the local GDP level as well as gender are statistically significant (p < 0.01).It is indicated that female patients with heart disease have better self-rated health than male patients.Statistical significance among the patients' age, education level, and health, which may be related to the influence from genetic factors to the incidence of heart disease, is not observed.As shown in Table 7 and Figure 1, the self-rated health of patients with hypertension and PM10 show a statistical significance (p < 0.01); moreover, linear regression coefficient of PM10 is 0.4164, whereas the square term regression coefficient is −0.0368, indicating that the impact of PM10 on the health of patients with hypertension shows an "initially positive and then negative" effect, whereas the opposite result occurs in SO 2 , that is, "initially negative and then positive" effect.Among the coefficients, the first order of SO 2 is 77.5688 and the coefficient of quadratic term is 1491.7310,both of which have statistical significance, that is, the positive U−shaped effect of "initially negative and then positive," and the regression results with the overall sample show some differences.However, no significant effect between NO 2 and the health of patients with hypertension is observed.The impact of air pollutants on the health of patients with heart diseases is not significantly different from those with hypertension.Among the impacts, the square and first terms of SO2 not pass the regression under the significance level of 10%, the quadratic coefficient is 412.9760, and the coefficient of first term is 10.2429.Therefore, the effect of SO2 on self-rated health may be "the positive The impact of air pollutants on the health of patients with heart diseases is not significantly different from those with hypertension.Among the impacts, the square and first terms of SO 2 not pass the regression under the significance level of 10%, the quadratic coefficient is 412.9760, and the coefficient of first term is 10.2429.Therefore, the effect of SO 2 on self-rated health may be "the positive U−shaped."The coefficient of PM10 is 0.3616, which is significant.A significant effect of the NO 2 coefficient is not observed.
The inflection point of air pollution on self-rated health is shown in Table 8.The collected NO 2 concentration range has no inflection point (not shown in the table above) because of the limitations of the sample selected in this study, that is, the effect on health becomes severe as its concentration increases.A meta−analysis of heart failure and air pollution shows a positive correlation between SO 2 concentration, and hospitalization and mortality in heart failure [24].Combined with the regression results of air pollutants and self-rated health of patients with hypertension and heart diseases, SO 2 has an inflection point on the health of patients with heart diseases.That is, as the SO 2 concentration increases, the impact of health is initially negative, but becomes positive when it reaches 0.014 mg/m 3 .For PM10, the effect on the health of patients with hypertension becomes negative when the level of 0.040 mg/m 3 is reached.When the level of 0.064 mg/m 3 is reached, the effect on the health of patients with heart diseases is negative and can be regarded as the harmful concentration of PM10.

Adding Outpatient Frequency and Hospitalization Times
To ensure the robustness of the empirical results, we used outpatient frequency and hospitalization times, which can reflect individual health indicators to replace the self-rated health as dependent variables in the regression model.The results are shown in Tables 9 and 10.
The regression results using outpatient frequency as dependent variable are shown in Table 9.The first three columns show the results with no control variables.PM10 and NO 2 are significant, and the sign is consistent with the expectations, whereas SO 2 is not significant.Meanwhile, regressions from 6 to 8 are the empirical results after adding the control variables on the basis of the regressions from 1 to 3. The regression results are almost identical to the first three regressions, that is, PM10 and NO 2 pass the significance test.One possible factor may be that PM10 and NO 2 are more harmful to health than the other one pollutant during this particular period.
Column 4 in Table 9 shows the result of regression, including all the first terms of PM10, SO 2 , and NO 2 as independent variables, in which SO 2 is more significant.Similarly, the result of regression 9 after adding all the control variables based on Regression Model 4 is also consistent, which indicates that SO 2 also has a significant positive effect on individual health; however, the effect is still less than that of PM10 and NO 2 , considering their value of regression coefficient.
Moreover, we regress the first and square terms of PM10, SO 2 , and NO 2 as independent variables and consider any significant change of the results without control variables.The empirical results are shown in regressions 5 and 10 of Table 9.The results show that PM10 and SO 2 still have significant positive effects on individual health, although their effects are diminishing according to the regression coefficients.
Finally, in most regression models, the results of PM10 show an "inverted U-shaped" phenomenon; thus, the coefficients of the first and square items are positive and negative, respectively, which is the same in Table 3.
The air pollutants represented by PM10 and SO 2 affect the health conditions of individuals, and PM10 has the greatest impact on health, which may be because the PM2.5 index has not yet been measured separately before 2013 in China.As a result, the impact of PM10 on the respiratory system of an individual includes the effects of PM2.5, which manifests as the obvious effects of PM10.As we separately add PM2.5 indicators in a later time, the impact of PM10 on health significantly decreases.The effect of SO 2 on health should be related to its strong toxicity.
Table 10 shows the regression results using hospitalization times as an independent variable.Generally, the conclusion that air pollution significantly affects the health conditions of an individual remains valid.Particularly, most of the results in Table 10 are almost identical to those in Table 9; however, the significance of indicators (i.e., PM10, SO 2 , and NO 2 ), which measure air pollution concentrations, are significantly high, and the effect is strong (with generally large regression coefficients).The NO 2 indicator, which is not significant in Table 9, also shows a significant positive effect on health.In addition, similar to Table 3, the effect of SO 2 on health is mostly manifested as "inverted U-shaped."No significant difference is found from the empirical results in Tables 3, 9 and 10, and the three air pollutants pass the significance test to different degrees.Hence, the impact of air pollution on health shows a significant positive effect.SO 2 and other indicators of air pollution influence physical health in an inverted U-shaped path as "initially positive and then negative."As shown in Tables 3-10, the sign of the quadratic coefficient of PM10 is from negative to positive, which has an "unstable" effect on self-rated health.

Difference between Northern and Southern China
Given the large geographical differences in China, we divided the overall sample into two, including the north and south of China according to the Qinling-Huaihe River (geographical boundary between south and north) to verify whether the empirical results will change as the sample changes.Several studies have shown that the south and north are relatively different in terms of natural conditions, agricultural production methods, geographical features, and local customs.
Table 11 shows the regression results of the samples containing 31 southern cities in China.Overall, the regression results of the southern samples are generally better than the overall sample, indicating that the regression effect has improved after excluding regional differences.Particularly, the first three columns show results without control variables, reflecting that NO 2 and PM10 pass the test at a significance level of 10%.NO 2 also exhibits a mechanism of U-shaped effect on self-rated health, whereas the PM10 coefficient shows the opposite influence compared with NO 2 .After adding the first term of three pollutants simultaneously, except for SO 2 , the other two pollutants pass the test at a significance level of 1%, indicating a difference from the result of the overall sample due to the regression of SO 2 .We initially deduced that the percentages of the three air pollutants in the north and south are relatively different, resulting in differences in the regression results between the overall and southern samples.This difference is reflected in SO 2 , whereas PM10 is significant in the overall and southern regression results.
The last five columns of Table 11 show the regression results after adding the control variables.Among the regressions to which the three pollutants are added, only the first and square terms of NO 2 and PM10 pass the test at a significance level of 1%, whereas SO 2 is not significant.The regression results of PM10 and NO 2 are consistent with the regression results of the overall sample.Moreover, PM10, NO 2 , and SO 2 pass the significance test when the first term of the three pollutants are added simultaneously, which is in accordance with the characteristics of the overall sample.
The difference between the regression results of 25 cities in northern China and 31 cities in the south are shown in Table 12.Overall, among the 10 regressions listed, the significance of the air pollutants, PM10, SO 2 , and NO 2 , on self-rated health indicators is enhanced, exceeding the regression results of the overall and southern samples.Specifically, in the separate regression of the three pollutants without control variables, only the first term of NO 2 does not show any significance, whereas the other indicators pass the regression at the significance level of 10%.After controlling other influencing variables, PM10 and SO 2 show a high significance (1%) on self-rated health, whereas NO 2 is less significant.
Thus, no difference is found between the regression results of samples from the north and south and the overall sample.However, between the north and south samples, the types of major pollutants that are hazardous to health show a difference.Among the northern samples, the significance of air pollutants on self-rated health indicators is better than that of the southern and overall samples; hence, the north has poor air quality due to burning coal for heating and other influence, resulting to the relatively poor health status of local residents.That is, the impact of air pollution on physical health is more significant.In terms of pollutant types, PM10 accounts for a larger proportion in the north and south, causing inevitable adverse effects on health conditions.NO 2 mainly exists in the air pollutants in the south, which should be their focus of air pollution control.SO 2 is one of the pollutants that is harmful to people's health in the north, which may also point the direction for key targets in effective air pollution control in northern and southern China in the future.

Conclusions
Since the turn of the 21st century, environmental pollution has become a significant hindrance for developing countries when solving developmental problems.The major air pollution problem that China is facing is compound pollution of the atmosphere represented by PM2.5 and ozone.Many types of pollutants in the atmosphere are in high concentration levels, which are the root cause of frequent air pollution in major urban agglomerations in China.In recent years, haze, winds, and other extreme weather occur frequently, which affect the cityscape and people's daily lives.Specifically, respiratory and lung diseases have become the consequences of air pollution.In this study, 2011, 2013, and 2015 data in CHARLS database are used to analyze the impact of air pollution on physical health and medical insurance expenses empirically among the middle-aged and old Chinese people.The conclusions are as follows.
First, air pollutants, PM10, SO 2 , and NO 2 , have an impact on the health of the elderly people.Among these pollutants, the impact of PM10 shows an inverted U-shaped structure that initially increases and then decreases.That is, as PM10 concentration increases, the health condition initially becomes better and then worsens; whereas SO 2 does not show a stable and significant effect; NO 2 shows a downward trend, that is, the health status continues to deteriorate with the increase in concentration.
Second, health and medical insurance expenses are closely related; a worse self-rated health indicates more medical insurance expenses.If the number of patients suffering from chronic diseases increases, the total medical insurance expenditure will be reduced, and the total medical expenses will increase.Age, self-rated health, and local GDP significantly affect medical insurance expenses.The local economic development also has a direct relationship with the expenditure on medical insurance expenses of local residents.In developed areas, people have more channels to receive medical services with high quality, their own economic conditions can support the demand, and insurance is strong.
Finally, the concentration of air pollutant and self-rated health indicators have an impact on the medical insurance expenses of the elderly people.The severity of air pollution indirectly promotes the increase of medical insurance expenses by affecting the health conditions, and the impact on medical insurance expenses caused by different pollutants is also different.The results show that in the main air pollutants, only the changes of PM10 and SO 2 have a significant impact on medical insurance expenses, which shows a reverse relationship.In addition, significant differences are found in the effects of air pollution on the health status among populations affected by different diseases, such as hypertension and heart diseases.
Moreover, based on the robustness test, we found a difference in the types of major pollutants that influence health conditions between the north and south samples by dividing the city into two sub−samples of north and south with Qinling Ridge-Huaihe River as a boundary.The air quality level is worse in the north than that in the south due to burning coal for heating in the north.From the perspective of pollutant categories, PM10 accounts for a larger proportion in the north, whereas NO 2 is mainly present in the south.Moreover, SO 2 is the main pollutant hazardous to health in the north.This finding can provide some guidance to help us focus on the main target of air pollution control activity in the next stage.
The conclusion of this study verifies the practical significance of protecting the environment and purifying the air for the citizens and development of our country.Compared with the medical expenses caused by pure physical deterioration without air pollution, air pollution does affect people's health in real terms, and this impact can be reflected in people's medical insurance expenses.This paper presents seven regression models, which were constructed in such a way that each of them examines and verifies the relationship among three different objects: Air pollutants, self-rated health and medical insurance costs.All tested models were characterized and based on a progressive way, which indicates the interaction among air pollutants, self-rated health as well as medical insurance costs.The robustness test, where two different perspectives including adding outpatient frequency and hospitalization times, and distinguishing the difference between northern and southern China, support the result more solidly.

Discussion
Curbing air pollution can help maintain a suitable environmental quality and have a good effect on people's health, as well as reducing the costs for medical treatment, thereby decreasing social costs.This study is of an enlightening significance for improving air pollution, protecting people's physical and mental health, saving social medical costs, and implementing targeted air pollution control.This study also provides a useful perspective for the debate in China and the world about the deterioration of air pollution.
At the same time, due to the limited knowledge of the author and time constraints, this paper has some unsolved problems: (1) Further study on air quality difference between the north and south of China, and the difference in health, medical expenses and life expectancy caused by air pollution.(2) The concentrations of SO 2 , NO 2 and PM10 are representative of air pollutants, but they are not comprehensive.For example, PM2.5, is small in particle size, rich in toxic substances, and has long residence time in the atmosphere and long transport distance.Therefore, the impact on human health and atmospheric environmental quality is greater, but PM2.5 monitoring indicators The data was only available in January 2013.(3) The data span is relatively short, the current CHARLS website only updated to 2015 data.
In view of the above problems, this paper puts forward the following ideas: (1) The coal-burning heating policy in the south and north of China is implemented with the Qinling-Huaihe River as the boundary, so we can consider the Regression Discontinuity with the latitude of the Qinling-Huaihe River as the breakpoint.The difference of air quality between the north and the south and its series influence are obtained.(2) Various important air pollutant concentrations can be incorporated into the calculation to construct a new "air quality composite index", similar to the air quality index "Air Quality Index" (AQI), which China began to monitor and publish in real time in May 2012.(3) To better measure the void, the national baseline survey data for 2017 will be updated in 2019, and the data will be updated to further verify the results of the article.Note: The two lines on the red icon indicate that the prevalence of hypertension and heart disease is close.We selected the similar prevalence of heart disease and hypertension as a sub-sample basis to explore the different populations affected by air pollution.

Appendix B
Take Table 4 as an example.After the regression, we used the White's (1980) test to test the heteroscedasticity: The prevalence of chronic diseases in the elderly (‰)

Appendix B
Take Table 4 as an example.After the regression, we used the White's (1980) test to test the heteroscedasticity: Prob > chi2 = 0.0000 From the above regression results, Prob > Chi2 = 0.0000, significantly less than 0.05, indicating that the null hypothesis of "no heteroscedasticity" is rejected, indicating that there is heteroscedasticity.Next, the weighted least squares method (WLS) is used to correct the heteroscedasticity.The corrected regression results are shown in Table 4 in the main text.
In this paper, we have tested the heteroscedasticity for each OLS regression.The results of heteroscedasticity in the test are as follows (Columns 1, 2, and 3 of Table 5), and the test results without heteroscedasticity are not listed White's test for H0: homoscedasticity Against Ha: unrestricted heteroscedasticity Chi2 (73) = 142.43 s Prob > chi2 = 0.0000 All the above regression with heteroscedasticity are corrected by the weighted least squares method (WLS), and is displayed in the corresponding position in the text.

Appendix C
For both logistic regression models and OLS regression models, we added a multicollinearity test.The final results were shown below.

Appendix C.1. Multicollinearity Test for Logistic Regression
We used coldaig2 to test the multicollinearity problems.Coldiag 2 first computes the condition number of the matrix.If this number is "large" (Belsley et al. suggest 30 or higher), then there may be collinearity problems.All rest results after modification are shown below.

1 )Figure 1 .
Figure 1.Effects of NO2, PM10 and SO2 concentration on self-rated health of hypertension and heart disease.

Figure 1 .
Figure 1.Effects of NO 2 , PM10 and SO 2 concentration on self-rated health of hypertension and heart disease.

Figure A1 .
Figure A1.Prevalence of chronic diseases in the elderly.

Figure A1 .
Figure A1.Prevalence of chronic diseases in the elderly.Note: The two lines on the red icon indicate that the prevalence of hypertension and heart disease is close.We selected the similar prevalence of heart disease and hypertension as a sub-sample basis to explore the different populations affected by air pollution.

Table 1 .
Variables and definition.

Table 2 .
Descriptive statistics of main variables.

Table 4 .
Effect of self-rated health on medical insurance costs.
Note: (1) "t value" is enclosed in parentheses; ** and *** denote the levels of significance test by 10%, 5%, and 1%, respectively; (2) we used mixed OLS regression to estimate Model (2); (3) we used White's (1980) test in the regression analysis to test possible heteroscedasticity problems.The detailed test process is shown in Appendix B, and the same as the following Tables; (4) we used VIF to test the multicollinearity problems.Results for each model are shown in Appendix C.2. Multicollinearity test for OLS regression; (5) we conducted residential analysis, and the residual plots are shown in Appendix D Residential Analysis.For those regressions which did not pass the test, we conducted modification one by one.
The detailed test process is shown in Appendix B, and the same as the following Tables; (4) we used VIF to test the multicollinearity problems.Results for each model are shown in Appendix C.2. Multicollinearity test for OLS regression; (5) we conducted residential analysis, and the residual plots are shown in Appendix D Residential Analysis.For those regressions which did not pass the test, we conducted modification one by one.

Table 6 .
Regression result between basic information and self-rated health in patients with hypertension and heart diseases.We used mixed logistics regression to estimate Model(1).We used the ordinary likelihood ratio test to select a model that is better suited to current data analysis; (2) "z value" is enclosed in parentheses; **, and *** denote the levels of significance by 10%, 5%, and 1%, respectively; (3) we used coldaig2 to test the multicollinearity problems.Results for each regression are shown in Appendix C.1.Multicollinearity test for logistic regression.For those regressions which did not pass the test, we conducted modification one by one.

Table 7 .
Regression result between air pollution and self-rated health in patients with hypertension and heart diseases.We used the ordinary likelihood ratio test to select a model that is better suited to current data analysis; (2) "z value" is enclosed in parentheses; *, **, and *** denote the levels of significance by 10%, 5%, and 1%, respectively; (3) we used coldaig2 to test the multicollinearity problems.Result for each regression are shown in Appendix C.1.Multicollinearity test for logistic regression.For those regressions which did not pass the test, we conducted modification one by one.

Table 8 .
Inflection point of the impact from air pollutants to Self-Rated health of patients with hypertension and heart diseases (Unit: mg/m 3 ).

Table 9 .
Effect of air pollution on self-rated health (hospitalization times as independent variable).We use White's (1980) test in the regression analysis to test possible heteroscedasticity problems.The detailed test process is shown in Appendix B, and the same as the following Tables.(4)WeuseVIF to test the multicollinearity problems.Result for each model is shown in Appendix C.2 Multicollinearity test for OLS regression.(5)Weconducted Residential analysis and the residual plots are shown in Appendix D Residential Analysis.For those regressions which didn't pass the test, we conducted modification one by one.

Table 10 .
Effect of air pollution on self-rated health (Outpatient frequency as independent variable).We use White's (1980) test in the regression analysis to test possible heteroscedasticity problems.The detailed test process is shown in Appendix B, and the same as the following Tables.(4)WeuseVIF to test the multicollinearity problems.Result for each model is shown in Appendix C.2 Multicollinearity test for OLS regression.(5)Weconducted Residential analysis and the residual plots are shown in Appendix D Residential Analysis.For those regressions which didn't pass the test, we conducted modification one by one.

Table 11 .
Effect of air pollution on self-rated health in 31 southern cities of China.and *** denote the levels of significance by 10%, 5%, and 1%, respectively.(3) We use coldaig2 to test the multicollinearity problems.Result for each regression are shown in Appendix C.1 Multicollinearity test for logistic regression.For those regressions which didn't pass the test, we conducted modification one by one.
Note:(1)We use mixed logistics regression to estimate Model(1).We use the ordinary likelihood ratio test to select a model that is better suited to current data analysis.(2) "z value" is enclosed in parentheses; *, **,

Table 12 .
Effect of air pollution on self-rated health in 25 northern cities of China.
We use the ordinary likelihood ratio test to select a model that is better suited to current data analysis.(2) "z value" is enclosed in parentheses; *, **, and *** denote the levels of significance by 10%, 5%, and 1%, respectively.(3) We use coldaig2 to test the multicollinearity problems.Result for each regression are shown in Appendix C.1 Multicollinearity test for logistic regression.For those regressions which didn't pass the test, we conducted modification one by one.

Table A1 .
Heterogeneous test and correction for regression results before the presentation of Table4.

Table A1 .
Heterogeneous test and correction for regression results before the presentation of Table4.

Table A4 .
Heteroscedasticity test.: Table5results of heteroscedasticity test in the second column of regression. Note

Table A5 .
Heteroscedasticity test.: Table5results of heteroscedasticity test in the third column of regression. Note