The Impact of Internet Medical Information Overflow on Residents’ Medical Expenditure Based on China’s Observations

Background: The rapid rise of medical expenditure is a common public health problem around the world, but the challenge is even greater for the Chinese government. Controlling this rise and reducing individuals’ economic burden when they receive medical treatment has become one of the core issues that the Chinese government urgently needs to solve. The aim of this study was to evaluate the impact of Internet use on individuals’ medical expenditure and to discuss the potential impact mechanism. Methods: The data used in this study were from the 2018 China Family Panel Studies (CFPS) conducted by Peking University. The Heckman sample selection model was used to analyse the impact of Internet use on individuals’ medical expenditure. Results: Internet use reduced individuals’ medical expenditure by 6.19%; high frequency Internet use reduced it by 15.1%, while low frequency Internet use had no impact. In addition, the impact of Internet use differed across hospital levels. Specifically, Internet use reduced the medical expenditure of individuals who received medical treatment at general hospitals by 9.63%, and high frequency Internet use reduced it by 22.2%. However, Internet use had no impact on the medical expenditure of individuals who received medical treatment at primary hospitals. Conclusions: The findings underscore the important role of Internet use in reducing individuals’ medical expenditure. Internet use can significantly reduce individuals’ medical expenditure, and high frequency Internet use has a greater effect. However, the reduction effect is mainly concentrated in general hospitals; no effect was found in primary hospitals.


Introduction
The rapid rise of health care costs is a common problem in the field of public health around the world, but the challenges for the Chinese government have been even greater in recent years. In 2009, the Chinese government launched a new round of medical and health system reform to comprehensively improve the health status of residents and reduce the financial burden of medical treatment; one of the main initiatives was a massive increase in public health funding. According to data released by the National Bureau of Statistics, from 2009 to 2017, government expenditure on health care rose from 481.626 billion RMB to 1520.587 billion RMB, about 3.16 times the 2009 level. However, the outcome of the reform appears to have been contrary to expectations, very common for residents to master health-related information through the Internet [8,30]. For this paper, we used the 2018 China Family Panel Studies (CFPS) data to explore the relationship between Internet use and individuals' medical costs in the context of informatization, and we further discussed the potential impact mechanism.

Data
The data used for this paper were retrieved from the 2018 CFPS survey, which aims to reflect changes among Chinese families by collecting data at three levels: individuals, families, and communities. The CFPS covers various research topics, including family finances, marital status, social security, health care costs, health status, and Internet use. The survey encompassed 25 provinces/cities/autonomous regions and adopted a multistage, implicitly stratified, probability-proportional-to-size (PPS) sampling method that integrated urban and rural areas. The population of the sampled areas accounts for 94.5% of the country's total population, so the CFPS is regarded as a representative large-scale, comprehensive micro-level social survey in China.
The 2018 CFPS database consists of a family database, an adult database, and a children's database. Due to research needs, we used data from the adult database and the family database for the study analysis. First, we matched the data from the adult database with those of the family database and deleted the duplicate samples to obtain a sample size of 32,669. Second, we removed the samples with responses to core variables of "I don't know", "I refuse to answer", or "not applicable", as well as those with missing values, and thus a final sample size of 27,890 was used to analyse the impact of Internet use on individuals' health care costs.
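The sample construction described above can be sketched in a few lines. This is only an illustration: the field names (`pid`, `fid`, `expenditure`, and so on), the numeric codes used for "I don't know", "I refuse to answer", and "not applicable", and the toy records are all assumptions, and the CFPS codebook remains the authority on the actual coding.

```python
# Hypothetical codes for "don't know", "refuse to answer", "not applicable".
INVALID_CODES = {-1, -2, -8}
# Illustrative subset of core variables; the study uses many more.
CORE_VARS = ["expenditure", "internet", "age"]


def build_sample(adults, families):
    """Match adults to their family record, drop duplicates, then drop invalid answers."""
    family_by_id = {fam["fid"]: fam for fam in families}
    seen, matched = set(), []
    for person in adults:
        fam = family_by_id.get(person["fid"])
        if fam is None or person["pid"] in seen:
            continue  # no family record, or a duplicate respondent
        seen.add(person["pid"])
        matched.append({**fam, **person})

    def is_valid(rec):
        return all(
            rec.get(var) is not None and rec.get(var) not in INVALID_CODES
            for var in CORE_VARS
        )

    final = [rec for rec in matched if is_valid(rec)]
    return matched, final


# Toy records, invented purely for illustration.
FAMILIES = [{"fid": 10, "income": 50000}, {"fid": 20, "income": 80000}]
ADULTS = [
    {"pid": 1, "fid": 10, "expenditure": 500, "internet": 1, "age": 30},
    {"pid": 2, "fid": 10, "expenditure": -1, "internet": 0, "age": 45},  # "don't know"
    {"pid": 2, "fid": 10, "expenditure": -1, "internet": 0, "age": 45},  # duplicate
    {"pid": 3, "fid": 20, "expenditure": 0, "internet": 1, "age": None},  # missing value
    {"pid": 4, "fid": 20, "expenditure": 1200, "internet": 0, "age": 62},
    {"pid": 5, "fid": 99, "expenditure": 300, "internet": 1, "age": 51},  # no family match
]

matched, final = build_sample(ADULTS, FAMILIES)
print(len(matched), "matched records;", len(final), "records kept for analysis")
```

In the study itself, the two counts correspond to the 32,669 matched observations and the 27,890 observations retained for estimation.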

Outcome Variable
The outcome variable of this study was the individual's medical expenditure. According to the design of the 2018 CFPS questionnaire, we chose the following question to measure it: "In the past year, how much money (including the amount reimbursed or to be reimbursed) has been spent on your medical care (including medicine, medical treatment, hospitalization, etc.)?" Table 1 shows that the mean of the logarithm of respondents' medical expenditure (in yuan) over the past year was 4.9.

Explanatory Variable
The core explanatory variable of this paper was the Internet use status of Chinese residents. We chose two indicators, namely, Internet access and the frequency of Internet use, to measure the Internet use status of respondents. First, we chose the following question from the 2018 CFPS questionnaire to measure the Internet access of respondents: "Do you use a computer to access the Internet?" The respondents were assigned a value of 1 if they chose "Yes", or a value of 0 otherwise. Second, we chose the question "In general, how frequently do you use the Internet?" to measure the frequency of Internet use, and the answer choices were as follows: (1) every day, (2) 3-4 times per week, (3) 1-2 times per week, (4) 2-3 times per month, (5) once per month, (6) once every few months, and (7) never. In the process of our empirical analysis, we generated three dummy variables, namely, "never", "sometimes" and "almost every day". Specifically, when respondents chose "never", they were assigned a value of 1, or a value of 0 otherwise; when they chose "1-2 times per week", "2-3 times per month", "once per month", or "once every few months", they were assigned a value of 1, or 0 otherwise; when they chose "every day" or "3-4 times per week", they were assigned a value of 1, or 0 otherwise. Table 1 shows that 8.63% of individuals used the Internet almost every day, 40.2% of individuals used the Internet sometimes, and 51.1% of individuals never utilized the Internet.
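The grouping of the seven answer choices into three dummy variables can be written out explicitly. The sketch below assumes the answers are stored as the integer codes 1 to 7 in the order listed above; the function and key names are illustrative and do not come from the CFPS files.

```python
# Answer codes follow the order listed in the text (1 = every day ... 7 = never).
HIGH = {1, 2}        # "every day", "3-4 times per week" -> "almost every day"
LOW = {3, 4, 5, 6}   # "1-2 times per week" ... "once every few months" -> "sometimes"
NEVER = {7}          # "never"


def frequency_dummies(code):
    """Return the three mutually exclusive 0/1 indicators for one respondent."""
    if code not in HIGH | LOW | NEVER:
        raise ValueError(f"unexpected answer code: {code}")
    return {
        "almost_every_day": int(code in HIGH),
        "sometimes": int(code in LOW),
        "never": int(code in NEVER),
    }


answers = [1, 3, 7, 2, 6]
coded = [frequency_dummies(a) for a in answers]
# Each respondent falls into exactly one category.
assert all(sum(row.values()) == 1 for row in coded)
```

Because the three indicators always sum to one, only two of them can enter a regression together with an intercept; the results below use "never" as the omitted baseline.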

Controlled Variables
To reduce possible bias in the statistical model due to omitted variables, four types of variables were controlled for in the empirical analysis: (1) individual characteristics, including age, gender, marital status, hukou (the household registration system in China, which distinguishes rural from urban household registration), religious belief, and family size; (2) health status, including self-rated health, chronic disease, smoking, drinking, and the frequency of exercise per week; (3) socioeconomic status indicators, including education level, annual household income, health insurance, and the hospital level at which the individuals receive medical treatment; and (4) province dummies, because differences in the level of economic development and Internet coverage among provinces and cities in China can affect the results. The specific definitions are presented in Table 1.

Heckman Sample Selection Model
While processing the data, we found that some individuals had not incurred any medical expenditure in the past year. If individuals chose not to receive medical treatment after becoming ill, we could not observe what their medical expenditure would have been, which would result in sample selection bias. Therefore, this study used the Heckman sample selection model to correct for possible selection bias in the sample.
The Heckman sample selection model consists of a selection equation and an expenditure equation. The selection equation estimates the probability of receiving medical treatment and is estimated with a probit model. In this equation, Prob(Z_i = 1|W_i) represents the probability of receiving medical treatment and Z_i is the dependent variable in the first stage (Z_i = 1 if the i-th individual received medical treatment and Z_i = 0 otherwise); β_0 is the intercept term; W_i represents a series of controlled variables, including individual characteristics, living habits, health status, and socioeconomic indicators; β_i indicates the coefficient of the impact of these controlled variables on the respondent's medical treatment; and ε is a random error term.
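For reference, the two stages of this model can be written in a standard textbook form that uses the symbols defined in this section. This is a hedged reconstruction, not a formula taken from the text: the standard normal distribution function Φ and density φ, the inverse Mills ratio λ_i with its coefficient γ, and the fitted first-stage index ŝ_i are assumptions added here because they are the usual ingredients of the Heckman two-step correction. As in the text, β_0 and β_i are reused in both stages but denote different parameters in each.

```latex
% Stage 1 (selection): probit for receiving medical treatment.
% Equivalent latent form: Z_i = 1 if \beta_0 + \beta_i W_i + \varepsilon > 0.
\[
  \mathrm{Prob}(Z_i = 1 \mid W_i) = \Phi\left(\beta_0 + \beta_i W_i\right)
\]

% Stage 2 (expenditure): observed only when Z_i = 1.
\[
  \mathrm{LnExpenditure}_i
    = \beta_0 + \beta_1 \mathrm{Internet}_i + \beta_i W_i
      + \gamma \lambda_i + \varepsilon,
  \qquad
  \lambda_i = \frac{\phi(\hat{s}_i)}{\Phi(\hat{s}_i)}
\]
% \hat{s}_i is the fitted index \beta_0 + \beta_i W_i from Stage 1.
```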
The second stage is the individual's medical expenditure equation, which examines the impact of Internet use on the individual's medical expenditure and is the focus of this paper. In this equation, LnExpenditure represents the logarithm of the respondent's medical expenditure and β_0 is the intercept term; Internet_i represents the Internet use status of the individual; β_1 indicates the coefficient of the impact of Internet use on the respondent's medical expenditure; W_i represents a series of controlled variables, including individual characteristics, living habits, health status, and socioeconomic indicators; β_i indicates the coefficient of the impact of these controlled variables on the respondent's medical expenditure; and ε is a random error term.
Table 2 reports the regression results of the impact of Internet use on individuals' medical expenditure. Columns (1)-(3) add individual characteristic variables, health status, and socioeconomic indicators, in order, and report the estimated impact of Internet use on individuals' medical expenditure. The regression results suggest that, after controlling for the individual characteristic variables, health status, and socioeconomic indicators, Internet use reduces individuals' medical expenditure by 6.19% as compared with individuals who never use the Internet. This result was significant at the 10% level.
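A two-step version of the estimator described in this section can be sketched on simulated data. Everything in the sketch is an assumption made for illustration: the data are synthetic, the variable `health_shock` is a hypothetical regressor that affects only the decision to seek treatment, the true Internet coefficient is set to -0.15 by construction, and the probit is fitted with a small hand-written Fisher scoring routine rather than the software used in the study.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)


def fit_probit(X, z, n_iter=50, tol=1e-10):
    """Probit coefficients by Fisher scoring; X must contain a constant column."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        xb = X @ b
        cdf = np.clip(norm_cdf(xb), 1e-10, 1 - 1e-10)
        pdf = norm_pdf(xb)
        score_w = pdf * (z - cdf) / (cdf * (1 - cdf))
        info_w = pdf**2 / (cdf * (1 - cdf))
        step = np.linalg.solve(X.T @ (X * info_w[:, None]), X.T @ score_w)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    return b


def heckman_two_step(X_sel, z, X_out, y):
    """Stage 1 probit, then OLS on treated cases with the inverse Mills ratio added last."""
    b_sel = fit_probit(X_sel, z)
    xb = X_sel @ b_sel
    mills = norm_pdf(xb) / np.clip(norm_cdf(xb), 1e-10, None)
    treated = z == 1
    X2 = np.column_stack([X_out[treated], mills[treated]])
    b_out, *_ = np.linalg.lstsq(X2, y[treated], rcond=None)
    return b_sel, b_out


# Synthetic data (all values invented for illustration).
rng = np.random.default_rng(0)
n = 4000
w = rng.normal(size=n)                  # one control variable
health_shock = rng.normal(size=n)       # hypothetical: affects treatment seeking only
internet = rng.integers(0, 2, size=n).astype(float)
u = rng.normal(size=n)                  # selection error
e = 0.5 * (0.5 * u + math.sqrt(0.75) * rng.normal(size=n))  # correlated with u
z = (0.5 + 0.8 * health_shock + 0.3 * w + u > 0).astype(float)
ln_exp = 5.0 - 0.15 * internet + 0.4 * w + e   # used only where z == 1

const = np.ones(n)
X_sel = np.column_stack([const, health_shock, w])
X_out = np.column_stack([const, internet, w])
b_sel, b_out = heckman_two_step(X_sel, z, X_out, ln_exp)
print("Estimated Internet coefficient:", round(float(b_out[1]), 3))
```

The last element of `b_out` is the coefficient on the inverse Mills ratio; a value different from zero indicates that selection into treatment is correlated with expenditure. Standard errors from the second-stage OLS would need a further correction and are omitted from this sketch.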

Results
Columns (4)-(6) add individual characteristic variables, health status, and socioeconomic indicators, in order, and report the estimated impact of Internet use frequency on individuals' medical expenditure. We categorized Internet use frequency into "almost every day" (high frequency), "sometimes" (low frequency), and "never". The baseline category was "never". The regression results show that, after controlling for the individual characteristic variables, health status, and socioeconomic indicators, using the Internet almost every day reduces medical expenditure by 15.1% as compared with never using the Internet, and this result was significant at the 1% level. High frequency Internet use can therefore significantly reduce individuals' medical expenditure. However, using the Internet sometimes has no significant impact, which implies that low frequency Internet use has no effect on individuals' medical spending.
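When the dependent variable is in logarithms, the coefficient on a dummy variable is only an approximation of the percentage change; the exact change is exp(β) - 1. The short sketch below compares the two readings, assuming that the percentages reported for Table 2 are the estimated coefficients multiplied by 100. The text does not state how the percentages were derived, so this is an assumption.

```python
import math


def pct_change_from_log_coef(beta):
    """Exact percentage change implied by a dummy's coefficient in a log-linear model."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients assumed from the percentages quoted for Table 2.
for label, beta in [("any Internet use", -0.0619), ("almost every day", -0.151)]:
    approx = beta * 100.0
    exact = pct_change_from_log_coef(beta)
    print(f"{label}: approximate {approx:.2f}%, exact {exact:.2f}%")
```

Under that assumption, the two readings differ by about a fifth of a percentage point for the smaller coefficient and by about one percentage point for the larger one, so the qualitative conclusions are unaffected.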
It is worth noting that there are large gaps in the distribution of health resources in China. China's medical and health resources are mainly concentrated in general hospitals (including general hospitals and specialized hospitals) and are relatively scarce in primary hospitals (including community health service centers/township health clinics, community health stations/village clinics, and private clinics). High technology, advanced equipment, and excellent experts in the field of health services are largely concentrated in general hospitals in cities, attracting crowds of patients. The rapid rise of medical expenditure has mainly occurred in general hospitals, where information asymmetry gives doctors an incentive to induce demand and provide excessive medical treatment. In contrast, the room for growth in medical expenditure is limited in primary hospitals, so, theoretically, the impact of Internet use on medical expenditure varies among hospital levels. Table 3 divides the hospital levels into general hospitals and primary hospitals and reports the impact of Internet use on individuals' medical expenditure at each level. Specifically, Column (1) in Table 3 reports the impact of Internet use on the medical expenditure of individuals treated in general hospitals. The results of the regression analysis suggested that, under the same conditions, Internet use reduced the medical expenditure of individuals treated in general hospitals by 9.63%, and this result was significant at the 10% level. Column (2) reports the impact of Internet use on the medical expenditure of individuals treated in primary hospitals. The regression analysis results showed that Internet use had no significant impact on the medical expenditure of individuals treated in primary hospitals.
Column (3) in Table 3 further reports the impact of the frequency of Internet use on the medical expenditure of individuals treated in general hospitals. We divided the Internet use frequency into "almost every day", "sometimes", and "never", and the baseline variable was "never". Results of the regression analysis showed that under the same conditions as compared with individuals who never used the Internet, utilizing the Internet almost every day reduced the medical expenditures of individuals treated in general hospitals by 22.2%, and this result was significant at the level of 1%. However, utilizing the Internet sometimes had no effect on the medical costs of individuals treated in general hospitals. Column (4) reports the impact of the frequency of Internet use on the medical expenditures of individuals treated in primary hospitals. The results showed that, under the same conditions, as compared with individuals who never used the Internet, utilizing the Internet almost every day or utilizing the Internet sometimes had no impact on the medical expenditure of individuals treated in primary hospitals, which means that the frequency of Internet use had no impact on the medical expenditure of individuals treated in primary hospitals.
Notes (Table 2): (1) Due to space limitations, we did not report the regression results of the selection equation; (2) The benchmark variable for marriage status, self-rated health, education, medical insurance, and hospital level in Columns (1)-(6) was "single", "excellent", "unschooled", "no medical insurance" and "general hospital", respectively; (3) The benchmark variable for Internet use in Columns (4)-(6) was "never"; (4) Due to space limitations, we did not report the dummy variables of province in detail; and (5) *** p < 0.01, ** p < 0.05, and * p < 0.1.
Notes (Table 3): (1) Due to space limitations, we did not report the regression results of the selection equation; (2) The benchmark variable for marriage status, self-rated health, education, and medical insurance in Columns (1)-(4) was "single", "excellent", "unschooled", and "no medical insurance", respectively; (3) The benchmark variable of the frequency of Internet use in Columns (3)-(4) was "never"; (4) Due to space limitations, the dummy variables of province were not reported; and (5) *** p < 0.01, ** p < 0.05, and * p < 0.1.

Robustness Test
To examine the robustness of the results above, we replaced the core explanatory variable with an alternative measure. According to the 2018 CFPS questionnaire design, we chose the item "Degree of importance of getting information by using the Internet" as the alternative variable for Internet access and frequency of Internet use. Respondents answer on a scale from 1 to 5, ranging from "very unimportant" to "very important". Generally speaking, the more important the Internet is to a respondent as a source of information, the greater the degree of medical information overflow from the Internet. Table 4 presents the regression results for the impact of the alternative variable on individuals' medical expenditure. Specifically, the results in Column (1) suggest that the higher the importance of using the Internet to obtain information, the lower the individuals' medical expenditure (a decrease of 1.82%). In Table 4, Columns (2) and (3) divide the hospital levels into general hospitals and primary hospitals, respectively, and report the impact of the importance of using the Internet to access information on individuals' medical expenditure at each level. The results in Columns (2) and (3) suggest that a higher importance of using the Internet to obtain information lowers the medical expenditure of individuals treated in general hospitals (a decrease of 3.12%) but has no impact on the medical expenditure of individuals treated in primary hospitals. The results in Table 4 are therefore consistent with those in Tables 2 and 3, which further supports the robustness of the above conclusions.

Discussion
This paper mainly used the Heckman sample selection model to examine the relationship between Internet use and individuals' medical expenditure. The model avoids the estimation bias caused by sample selection and is a mainstream model in current research on health expenditure. Following the relevant literature on the determinants of medical expenditure, we controlled for a broad set of variables to reduce the estimation bias caused by omitted variables and to ensure the credibility of the results. In addition, the multicollinearity analysis shows that the variance inflation factor (VIF) of each variable is less than 3 (see Table A1); the maximum value is 2.13 and the average value is 1.34, both far below the conventional threshold of 10. This indicates that there is no serious multicollinearity among the variables and that the design of the model is reasonable. We also used the substitution variable method to test robustness, which further improved the credibility of the results.
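The VIF check mentioned above can be illustrated with a short sketch. For each regressor j, the VIF equals 1 / (1 - R_j²), where R_j² comes from regressing that regressor on all the others. The data below are synthetic and the variable names are illustrative; the values reported in Table A1 come from the study's own data, not from this code.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (X has no constant column)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out


# Synthetic regressors, invented for illustration.
rng = np.random.default_rng(1)
age = rng.normal(50, 15, size=2000)
income = rng.normal(10, 1, size=2000)
age_noisy = age + rng.normal(0, 1.5, size=2000)   # nearly collinear with age

vif_independent = vif(np.column_stack([age, income]))
vif_collinear = vif(np.column_stack([age, income, age_noisy]))
print("independent regressors:", np.round(vif_independent, 2))
print("with a near-duplicate column:", np.round(vif_collinear, 2))
```

Values close to 1, as in the first case, indicate little overlap between regressors; the near-duplicate column in the second case pushes the VIF of both related columns well above the threshold of 10.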

The Overflow of Internet Medical Information Has Reduced Individuals' Medical Expenditure, and the High Frequency of Internet Use Has a Greater Effect
The rapid rise of medical expenditure in the field of health care has become a common problem faced by most countries around the world, and the failure of the micro medical service market is an important reason for this rise, which is mainly reflected in excessive medical treatment caused by information asymmetry [1,15,16]. Information asymmetry makes doctors tend to "induce" patients to overconsume medical services and drugs, which is, in turn, influenced by the supply of medical information [31,32]. However, with the advance of Internet technology, searching for health information online is becoming more and more popular among the general population, which could reduce the information asymmetry to some extent. The increasing utilization of the Internet has provided a better opportunity for people to seek health information online regardless of its credibility, accuracy, and reliability [20]. Several studies have found that online health information searching can reduce traditional health care service consumption [27,33,34], and this could further reduce individuals' health care costs.
We hold that Internet use can alleviate the information asymmetry between doctors and patients and thus reduce the level of individuals' medical expenditure. Compared with offline communication channels, the Internet environment has significant advantages in the dissemination of health information [35]. In the Internet society, individuals can gain targeted access to medical information through search engines; patients can match disease diagnosis information to their symptoms and obtain detailed drug treatment programs [8], which helps to increase their understanding of health and disease information. To some extent, the Internet can lower the professional knowledge barriers in doctor-patient communication [29], which can reduce the incentive for doctors to "induce demand" and further decrease individuals' medical expenditure. Previous studies have confirmed that information on disease diagnosis and treatment can substitute for hospital diagnosis and treatment services. For instance, through a randomized trial, Osman et al. [36] found that informationalized medical education could reduce the hospitalization rate of asthma patients by 54%. Wagner et al. (2001) reached similar conclusions using natural experiments and found that self-diagnosis books and Internet health information were negatively correlated with the utilization of paediatric services [37]. Meischke et al. (2005) found that older people with high blood pressure and heart disease could effectively reduce morbidity by mastering relevant knowledge of prevention and health care through the use of the Internet [30].
These studies support the conclusion of the current study to some extent. The overflow of Internet medical information can affect the use of medical services and further influence the level of individuals' medical expenditure by reducing the information asymmetry in medical services. Moreover, a high frequency of Internet use produces an even greater reduction in individuals' medical expenditure, which implies that high frequency Internet use does more to reduce the information asymmetry in doctor-patient relationships. In other words, the higher the frequency of residents' Internet use, the richer their information search experience, which generates a greater effect on reducing the information asymmetry in the process of medical services and further reduces the level of individuals' medical expenditure.

The Impact of Internet Use on Individuals' Medical Expenditure Varies among Different Levels of Hospitals
This study found that Internet use had different effects on individuals' medical expenditure among different levels of hospitals. The reduction effect of Internet use on individuals' medical expenditure is mainly concentrated in general hospitals but has no effect in primary hospitals. On the one hand, primary hospitals generally tend to accept patients with common diseases (e.g., a cold or fever) and chronic diseases (e.g., high blood pressure and hyperlipidaemia), which do not generate excessive medical expenditure, and doctors themselves have no motivation to induce demand.
Furthermore, the medical facilities and the qualifications of physicians in primary hospitals are far below those in general hospitals, so doctors there lack the objective conditions to induce demand, and therefore the use of the Internet has no impact on individuals' medical expenditure. On the other hand, general hospitals have advanced medical equipment and excellent professional physicians (almost all advanced medical resources are concentrated in general hospitals), but the hospitals' income-generating mechanism is imperfect; therefore, doctors have an incentive to induce demand and provide excessive medical treatment, resulting in an abnormal rise in medical expenditure. The development of Internet technology enables patients to search for health and medical information efficiently, which can alleviate the problem of "information asymmetry" between doctors and patients, weaken the motivation of doctors to induce demand, and reduce the medical expenditure level of individuals to some extent.
Our study had some limitations. First, Internet use can break through the barriers of physicians' professional knowledge and provide quick and easy access to information related to health and medical treatment. However, online knowledge dissemination can also have negative effects; false health information can mislead patients' choices of medical treatment, harm individuals' health status, and interfere with the research conclusions [38]. Different kinds of information can generate different impacts through related mechanisms, which is a direction for our future research. Second, in terms of the measurement indicators of Internet use, we adopted "Internet use" and "the frequency of Internet use" to measure individuals' use of the Internet; however, these two indicators may not be accurate enough to measure the impact of Internet use on individuals' medical expenditure. More accurate indicators were not available to us because of the limits of the CFPS data. We plan to examine the impact of Internet use on residents' medical expenditure in more detail once more appropriate measurement indicators become available.

Conclusions
This paper empirically analysed the impact of Internet use on the medical expenditure of Chinese residents based on 2018 CFPS data. The study found that Internet use can significantly reduce the level of individuals' medical expenditure, and high frequency Internet use has an even greater impact on the reduction of medical expenditure, but Internet use has different impacts among different levels of hospitals. The reduction in medical expenditure is mainly found in general hospitals, and no impact is found in primary hospitals, which has important implications for future research. Excessive medical treatment mainly exists in general hospitals, and primary hospitals have no incentive to induce demand. Future research should therefore focus on how to control the rapid rise of medical expenditure in general hospitals, and the use of the Internet offers a new direction for that work.
First, measures should be taken by the Chinese government to optimize the Internet information environment and promote the construction of "Internet + health care". On the one hand, the government should strengthen infrastructure construction and further expand the coverage of the Internet, which can provide a convenient information environment for citizens to obtain health information and improve the accessibility of medical information. On the other hand, the government should strongly support medical institutions in building Internet information platforms and encourage them to provide more Internet medical services such as online consultation, health consultation, communication among patients, and self-health management.
Second, the government can guide individuals' use of online information channels to strengthen their awareness of health management, and it can regulate the operation and maintenance of online consultation platforms to improve the accessibility of health management information related to disease and to reduce the information asymmetry between doctors and patients as soon as possible.
Third, Internet medical service providers must improve the timely feedback mechanism of online health information consultation. Throughout the whole process of disease prevention, diagnosis, treatment, and rehabilitation, the advantages of Internet health information overflow should be used to provide more channels for access to health information and to further improve individuals' health status without increasing their economic burden.
Author Contributions: All authors contributed to the study conception and design. The literary and survey sections were primarily written by Y.M.; the material preparation, data collection and analysis were performed by J.H.; X.Z. was responsible for data curation, methodology and formal analysis; and all authors commented on previous versions of the manuscript. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Funding: This research was funded by the National Social Science Foundation (projects 15CGL046 and 16CGL046), the Youth Foundation of Humanities and Social Sciences by the Ministry of Education of the People's Republic of China (project 15YJCZH118), the Scientific Research Projects by the State Ethnic Affairs Commission (projects 14ZNZ005 and MSQ16002), and the Fundamental Research Funds for the Central Universities (projects CSY16018 and 410500076).

Appendix A

Figure A2. Changes of Internet users and Internet penetration rate (series shown: Internet users; Internet penetration rate).