Exploring the Relationship between Sugar and Sugar Substitutes—Analysis of Income Level and Beverage Consumption Market Pattern Based on the Perspective of Healthy China

This paper estimates the impact of income level on household beverage consumption, analyzes the consumption trends of sugar-sweetened beverages and sugar-free beverages in households, explores the future changes in the beverage consumption market pattern, and predicts the possible impact of the sugar industry on the development of sugar substitutes based on the beverage consumption data of Kantar Consumer Index in China from 2015 to 2017. The research results show that, firstly, there is an “inverted U-shaped” relationship between income level and household consumption of sugar-sweetened beverages, which indicates that as income rises, household consumption of sugar-sweetened beverages tends to increase and then decrease. Secondly, income level has a positive effect on the household consumption of sugar-free beverages. Finally, in the future stage, with the further growth of income and the promotion of a healthy China, a large amount of sugar substitutes will be added to beverages instead of the original sugar, and the relationship between sugar and sugar substitute consumption will change from complementary to substitution. The findings of this paper have implications for encouraging food and beverage suppliers to produce “healthy”, “nutritious” and “innovative” low-sugar products to meet the health needs of residents and ensure the healthy and orderly development of the sugar industry.


Introduction
Since the 21st century, obesity has gradually become one of the most problematic public health issues worldwide. The excessive consumption of unhealthy foods high in sugar and calories, especially in the form of sugary drinks, is a major cause of obesity and has been confirmed in several countries with high levels of sugary drink consumption [1,2]. In contrast to these countries, sugary drink consumption in China is growing at a relatively fast rate, despite its low level of consumption. This has also attracted a great deal of attention from the Chinese health authorities. Considering the paradoxical relationship between sugary drink consumption and obesity, the health authorities have accelerated the development of interventions and programs, such as the "Healthy China Action (2019-2030)". These interventions all emphasize the importance of a healthy diet and recommend replacing sugary beverage consumption with low-sugar or sugar-free beverage consumption to reduce health problems such as obesity [3].
In the process of responding positively to the Healthy China Action, the beverage industry, as the supply side, began to make rapid adjustments to its product formulations and worked hard to innovate various beverage products in line with the concept of nutrition.
The data used in this paper come from the 2015-2017 China Household Beverage Consumption Dynamics Tracking Survey (Kantar Worldpanel) by the Kantar Consumer Index, which provides insight into changes in Chinese household beverage consumption behavior and a reliable source of data for academic research and public policy analysis, and which supports the analytical framework and content of this paper. The data sample covers 24 provinces, with a target sample size of 181,613 households, and the survey only includes respondents in the sample households. The data used in this paper, with 2015 as the baseline survey year, contain 31,175 households with a total of 31,175 respondents (annual sample rotation of approximately 25%), of which 3315 are male and 27,860 are female. The data contain daily and annual samples of total household beverage consumption, number of individual beverages consumed, and package size, and the beverage categories are divided into sugary and sugar-free beverages. Sugary beverages comprise six categories: sugary ready-to-drink coffee, sugary carbonated drinks, sugary fruit juices, sugary functional drinks, sugary Asian traditional drinks, and sugary ready-to-drink tea. Correspondingly, sugar-free beverages comprise six categories: sugar-free ready-to-drink coffee, sugar-free carbonated beverages, sugar-free fruit juices, sugar-free functional beverages, sugar-free Asian traditional beverages, and sugar-free ready-to-drink teas.
In addition, the data contain two further sections: basic household information (household income, household form, household type, presence of children in the household) and information on the demographic characteristics of the respondents (age, gender, education level) [16].

Variable Definition
The explained (dependent) variable in this paper is household beverage consumption, which is further subdivided into household sugar-sweetened beverage consumption and household sugar-free beverage consumption, and is used to measure the beverage consumption of residential households. Although there may be substitution effects between the various types of beverages, this paper ultimately aims to explore the possible changes in future consumption demand for total sugar-sweetened beverages and sugar-free beverages and the relationship between the two. Therefore, this paper defines household consumption of sugary beverages as the sum of the consumption of sugary ready-to-drink coffee, sugary carbonated beverages, sugary fruit juice, sugary functional beverages, sugary Asian traditional beverages, and sugary ready-to-drink tea, in order to capture the total consumption of sugary beverages in households. Household consumption of sugar-free beverages is defined analogously, as the sum over the six sugar-free categories.
The core explanatory variable in this paper is total household income. Following existing studies [17,18], the control variables are household size, the age and gender of the respondent, and the respondent's education level. In addition, the presence or absence of children in the household is included as a control variable, because children tend to eat the food that their parents eat and their eating habits and behaviors are influenced by family members. Families with children are therefore generally more health conscious and will try to reduce the consumption of unhealthy foods such as sugary drinks to avoid children being affected by them [19,20].

Empirical Test
Since the data used in this paper are unbalanced panel data, and the analysis needs to include variables that do not change over time, such as gender and province, the estimation method must be chosen between random effects estimation and pooled (mixed) estimation. The LM test supports the presence of individual random effects, so the random effects model is chosen, and the parameters are estimated by feasible generalized least squares (FGLS). The model is set as follows.
where $Y_{SSB}$ and $Y_{NSSB}$ are the household consumption of sugar-sweetened beverages and the household consumption of sugar-free beverages, respectively. $\ln Inc$ is the logarithm of household income; $Z$ is the vector of household and respondent characteristics, including household size, presence of children, respondent age, respondent gender, and respondent education level. $\theta_0, \dots, \theta_4$ and $\alpha_0, \dots, \alpha_4$ are unknown parameters. $\varepsilon_1$ and $\varepsilon_2$ are random errors. Price variables are not included in the model, mainly because the beverage consumption studied here covers many different sugar-sweetened and sugar-free beverages whose prices differ widely and are difficult to account for. It is therefore assumed that residents of the same province face the same beverage market prices during the survey period, so that a province dummy variable can control, to some extent, for the impact of price differences between provinces on consumption [21]. A province dummy variable $R$ is therefore introduced in the model.
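The model equations themselves did not survive in this text. A specification consistent with the symbol definitions above and with the squared income term described in the estimation section could look like the following. This is a reconstruction under assumptions, not the authors' confirmed equations: in particular, which parameter multiplies which term is inferred from the parameter count ($\theta_0$ to $\theta_4$, $\alpha_0$ to $\alpha_4$), and household and time subscripts are omitted.

```latex
\begin{aligned}
Y_{SSB}  &= \theta_0 + \theta_1 \ln Inc + \theta_2 (\ln Inc)^2 + \theta_3 Z + \theta_4 R + \varepsilon_1 \\
Y_{NSSB} &= \alpha_0 + \alpha_1 \ln Inc + \alpha_2 (\ln Inc)^2 + \alpha_3 Z + \alpha_4 R + \varepsilon_2
\end{aligned}
```

Under this form, an inverted U-shaped relationship for sugar-sweetened beverages corresponds to $\theta_1 > 0$ and $\theta_2 < 0$.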

Statistical Analysis
Models 1 and 3 include only the household income variable and the squared household income variable, while Models 2 and 4 add control variables and province dummy variables. Because the high correlation between lnInc and its squared term may cause multicollinearity in the model, this paper follows a previous method [22] and centers lnInc (subtracts its sample mean) before constructing the squared term, which eliminates or reduces the multicollinearity problem (variance inflation factor, VIF < 5). All analyses were performed using Stata 15, and p-values < 0.1 were considered statistically significant.
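The effect of centering on the collinearity between a variable and its square can be illustrated with a minimal sketch. The data below are synthetic (hypothetical log incomes drawn from a normal distribution, not the Kantar sample), and the helper `vif_two_regressors` is an illustrative name that relies on the fact that, with exactly two regressors, the VIF equals 1 / (1 - r^2).

```python
import numpy as np

def vif_two_regressors(x1, x2):
    """VIF when a model has exactly two regressors: 1 / (1 - r^2)."""
    r = np.corrcoef(x1, x2)[0, 1]
    return 1.0 / (1.0 - r ** 2)

# Synthetic log incomes (hypothetical values, not the survey data).
rng = np.random.default_rng(0)
ln_inc = rng.normal(loc=11.5, scale=0.6, size=5000)

# Squared term built from the raw variable.
vif_raw = vif_two_regressors(ln_inc, ln_inc ** 2)

# Squared term built after subtracting the sample mean (centering).
ln_inc_c = ln_inc - ln_inc.mean()
vif_centered = vif_two_regressors(ln_inc_c, ln_inc_c ** 2)

print(f"VIF raw: {vif_raw:.1f}, VIF centered: {vif_centered:.3f}")
```

In this synthetic case the raw squared term is almost perfectly correlated with the variable itself, while the centered version is not, which is the motivation for the centering step described above.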

Basic Characteristics of Sample
As shown in Table 1, the average annual household consumption of sugary drinks in the sample was 28.55 L, and the average household consumption of sugar-free drinks was 0.98 L. The average household income was 100,500 yuan per year, and they were generally families of three, with approximately 28% of the households in the sample having children. The average age of the respondents was 44 years old, and 44.23% were female and the average education level was above high school.

Basic Regression Analysis of Household Income on Beverages
Models 1 and 3 contain only the logarithm of household income and its squared term, while Models 2 and 4 add a series of control variables on top of Models 1 and 3. The resulting estimation results are shown in Tables 2 and 3.
Notes to Table 2: a Model 1 reports the results of regressing sugary drink consumption on household income using the FGLS approach without control variables. b Model 2 reports the results of the same regression using the FGLS method after adding control variables and province dummy variables. c The regressions were run in Stata 15, and p < 0.1 is considered statistically significant; "**" represents 0.01 < p < 0.05, and "***" represents p < 0.01. The smaller the p-value, the stronger the statistical significance.
First, from the regression results, the estimated coefficients of the log of household income in both Model 1 and Model 2 are significantly positive, and the coefficients of the squared term of log income are significantly negative, indicating that household income has a nonlinear effect on household consumption of sugary drinks, which is common in food category studies [23,24]. On the one hand, the marginal effect of household income on household consumption of sugary beverages is not constant, but varies with income. The comparison of coefficients in Model 2 shows that the absolute value of the coefficient of log income, i.e., the coefficient of the linear term, is 0.086, which is larger than that of the squared term, 0.034, indicating that the linear term dominates when income is small and that the squared term gradually dominates as income keeps increasing. On the other hand, the relationship between household income and household consumption of sugary drinks may be inverted U-shaped, but this depends on whether the turning point falls within the sample interval. This means that household consumption of sugary drinks does not always increase with income, but may fall after income reaches a certain level. Further verification is therefore needed.
Notes to Table 3: a Model 3 reports the results of regressing sugar-free drink consumption on household income using the FGLS approach without control variables. b Model 4 reports the results of the same regression using the FGLS method after adding control variables and province dummy variables. c The regressions were run in Stata 15, and p < 0.1 is considered statistically significant; "*" represents 0.05 < p < 0.1, "**" represents 0.01 < p < 0.05, and "***" represents p < 0.01. The smaller the p-value, the stronger the statistical significance.
In this paper, the marginal effect of household income on household consumption of sugary drinks is calculated to verify whether there is an "inverted U-shaped" relationship between the two, and the results are shown in Figure 1. When the logarithm of income reaches 12.9, which means when the household income is 400,300 yuan, the marginal effect is 0. At the same time, the marginal effect shows a positive to negative characteristic here. Thus, the existence of the "inverted U-shaped" relationship is confirmed, implying that household consumption of sugar-sweetened beverages increases with household income in the early stage, but then tends to fall with income growth after reaching the turning point (at the household income of 400,300 yuan).
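The turning-point logic can be sketched in a few lines. The sketch assumes the quadratic-in-centered-log-income form described in the estimation section; the function names and the `ln_inc_mean` parameter are illustrative, and the sample mean of log income is not reported in this text, so it is left as an input rather than filled in. The last lines only convert the reported turning point (log income of 12.9) back to yuan.

```python
import math

def marginal_effect(ln_inc, b1, b2, ln_inc_mean):
    """d(consumption)/d(lnInc) for Y = b1*c + b2*c**2, with c = lnInc - mean."""
    return b1 + 2.0 * b2 * (ln_inc - ln_inc_mean)

def turning_point(b1, b2, ln_inc_mean):
    """Log income at which the marginal effect equals zero."""
    return ln_inc_mean - b1 / (2.0 * b2)

# Converting the reported turning point from log income to yuan,
# rounded to the nearest hundred.
income_at_turning_point = math.exp(12.9)
print(round(income_at_turning_point, -2))
```

With a positive linear coefficient and a negative squared coefficient, the marginal effect is positive below the turning point and negative above it, which is the sign change described in the paragraph above.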
Second, the effect of log household income on household sugar-free beverage consumption is significantly positive in Models 3 and 4, while the coefficient of the squared term of log income is significantly positive only in Model 4. Overall, an increase in household income has a positive and increasing effect on household sugar-free beverage consumption: as household income increases, household sugar-free beverage consumption shows an exponential increase. Moreover, Figure 2 shows that the samples in this paper all fall within the interval where the marginal effect is positive, again indicating that an increase in household income boosts household sugar-free beverage consumption.
In terms of other variables, household size has a significant positive correlation with household consumption of sugary beverages and a significant negative correlation with household consumption of sugar-free beverages, i.e., a one unit increase in household size is associated with a 0.113 unit increase in household consumption of sugary beverages and a 0.20 unit decrease in household consumption of sugar-free beverages. There is a significant negative correlation between the presence of children in the household and the household consumption of sugary drinks, i.e., households with children consume 0.068 units less sugary drinks than households without children. The age of the respondent has a significant negative correlation with household consumption of sugary drinks and a significant positive correlation with household consumption of sugar-free drinks, i.e., an increase of one unit in the respondent's age is associated with a 0.294 unit decrease in household consumption of sugary drinks and a 0.273 unit increase in household consumption of sugar-free drinks. The gender of the respondent is positively correlated with household consumption of both sugary and sugar-free beverages, with men consuming more beverages. There is a significant negative correlation between the education level of the respondent and household consumption of sugary beverages, i.e., a one unit increase in the respondent's education level is associated with a 0.028 unit decrease in household consumption of sugary beverages.

Robustness Test
The above regressions suggest that household income has an "inverted U-shaped" relationship with household consumption of sugar-sweetened beverages and an exponential relationship with household consumption of sugar-free beverages, but the robustness of the findings needs to be further verified.
In fact, households make rational choices of beverages based on their own endowment characteristics, which means that choosing sugary or sugar-free beverages is a kind of "self-selection" behavior of households. Ignoring this self-selection problem and directly estimating the effect of household income on household beverage consumption may lead to biased results. Therefore, this paper uses the Generalized Propensity Score (GPS) matching method, which is suitable for multi-valued and continuous treatment variables, to address the self-selection bias of household beverage consumption and to check the robustness of the findings. The ordinary propensity score matching method is not chosen because it is only suitable for binary (0 or 1) treatment variables and can only yield the average treatment effect of household income on household beverage consumption. It cannot portray the dynamic changes of household beverage consumption at different income levels and may obscure or underestimate the actual effect between the two.
The GPS approach requires conditional independence between the treatment variable T and the outcome variable Y(t), given the covariates X. Here T is household income, the treatment variable, and Y(t) is the outcome variable, the magnitude of household beverage consumption when household income takes the value t. For the treatment variable to satisfy T ∈ [0, 1], this paper draws on a previous research method [25,26]: after excluding individual outliers based on the distribution of household income levels, the treatment variable is defined as the ratio of the ith household's income to the income of the highest-income household. The variables contained in the vector X are called "matching variables", also known as covariates, and are control variables that can affect both the outcome variable Y and the treatment variable T. The covariates selected in this paper are household size, the presence of children in the household, the age of the respondent, the gender of the respondent, the education level of the respondent, and province.
The GPS method is implemented in three steps. In the first step, the conditional probability density of the treatment variable T given the covariates X is estimated. Since household income in the sample is not normally distributed, this paper estimates the conditional density of household income with a Fractional Logit model. In the second step, the outcome variable Y is expressed as a function of the treatment variable T and the generalized propensity score, and its conditional expectation is estimated by OLS. In the third step, the average expected outcome Y at each treatment level t is estimated. The results of the first and second steps are shown in Table 4, and the overall results are as expected.
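The conditional independence condition is not reproduced in this text. In the GPS literature it is usually stated as weak unconfoundedness, which in the notation above could be written as follows. This is the conventional form and is offered as an assumption about what the missing expression contained; the symbols $r(\cdot,\cdot)$ and $f_{T \mid X}$ are introduced here for illustration, with $R$ denoting the GPS score as in the note to Table 4.

```latex
Y(t) \perp T \mid X \quad \text{for all } t \in [0, 1],
\qquad
r(t, x) = f_{T \mid X}(t \mid x),
\qquad
R = r(T, X)
```

The first expression says that, conditional on the covariates, the level of household income actually received is independent of the potential consumption at any given income level; the second and third define the generalized propensity score as the conditional density of the treatment evaluated at the observed treatment.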
Notes to Table 4: a Table 4 shows the test analysis using GPSM, where T is the treatment variable, R is the GPS score, and T * R is the interaction term between the two. b "*" represents 0.05 < p < 0.1, and "***" represents p < 0.01. The smaller the p-value, the stronger the statistical significance.
In the third step, the paper divides the values of the treatment variable into five regions and estimates the causal effect of household income on household beverage consumption for each of the five regions. The causal effects in the five regions are then linked to obtain the "dose-response" relationship between household income and household beverage consumption over the whole range of the treatment variable. The two panels in Figure 3 show the relationship between household income and the consumption of sugary and sugar-free beverages, respectively, and a significant difference can be observed between the two. As shown in the left panel of Figure 3, the relationship between household income and household consumption of sugary beverages is not linear: as household income increases, household consumption of sugary beverages follows an "inverted U-shaped" curve that rises first and then falls. As shown in the right panel of Figure 3, the relationship between household income and household consumption of sugar-free beverages is not linear either, and household consumption of sugar-free beverages grows logarithmically with the increase of household income. Therefore, the conclusions of this paper are robust.
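The three GPS steps can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' Stata procedure: the fractional logit is fitted by Newton-Raphson on the Bernoulli quasi-likelihood, the score is taken as that quasi-likelihood evaluated at the treatment level, the second-step regression is assumed to be quadratic in T and R with an interaction term, and the data and the five evaluation levels are synthetic. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_fractional_logit(X, t, max_iter=100, tol=1e-10):
    """Step 1: Bernoulli quasi-ML for a treatment t in [0, 1] (Newton-Raphson)."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = sigmoid(X @ beta)
        gradient = X.T @ (t - mu)
        hessian = (X * (mu * (1.0 - mu))[:, None]).T @ X
        step = np.linalg.solve(hessian, gradient)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def gps_score(mu, t):
    """Score from the Bernoulli quasi-likelihood (assumed form)."""
    return mu ** t * (1.0 - mu) ** (1.0 - t)

def outcome_design(t, r):
    """Step 2 regressors: constant, T, T^2, R, R^2, T*R (assumed form)."""
    return np.column_stack([np.ones_like(r), t, t ** 2, r, r ** 2, t * r])

def dose_response(X, t, y, grid):
    beta = fit_fractional_logit(X, t)
    mu = sigmoid(X @ beta)
    # Step 2: OLS of the outcome on the treatment and the score.
    coef, *_ = np.linalg.lstsq(outcome_design(t, gps_score(mu, t)), y, rcond=None)
    # Step 3: average the predicted outcome over all households at each level.
    curve = []
    for level in grid:
        t_fixed = np.full_like(mu, level)
        curve.append(np.mean(outcome_design(t_fixed, gps_score(mu, t_fixed)) @ coef))
    return np.array(curve)

# Synthetic households (hypothetical data, not the survey sample).
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])          # intercept plus one covariate
t = sigmoid(0.5 * x + rng.normal(scale=0.5, size=n))
y = 4.0 * t * (1.0 - t) + 0.3 * x + rng.normal(scale=0.1, size=n)

grid = np.array([0.1, 0.3, 0.5, 0.7, 0.9])    # five illustrative treatment levels
curve = dose_response(X, t, y, grid)
print(np.round(curve, 3))
```

Plotting `curve` against `grid` gives a dose-response curve of the kind shown in Figure 3; the shape obtained here reflects only the synthetic data-generating process.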

[Table 4, recoverable entries only: Constant −3.810 *** (0.035); F-value 92.520, 32.000; AIC 0.566; Decision factor 0.070, 0.077.]

Discussion
In the current study, we found that household income levels have a significant impact on household beverage consumption, and that further increases in income levels will lead more households to choose sugar-free beverages and gradually reduce their consumption of sugary beverages. The reason is that households with higher incomes tend to have healthier diets, lifestyles and eating habits, and therefore increase their consumption of nutrient-rich foods and decrease their intake of carbohydrate foods [27,28,29]. Because sugary beverages are a representative carbohydrate food, a reduction in their consumption as household income rises substantially has become an inevitable trend [30,31]. To a certain extent, this will change the landscape of the beverage consumption market, while the continued development of the sugar substitute industry is also highly likely to influence the future development of the sugar industry.
Overall, household consumption of both sugary and sugar-free beverages will grow with income when household income is less than 400,300 yuan. The average annual income of Chinese resident households in 2020 was about 96,600 yuan, an increase of 4.7% from 2019. Taking 2019 as the base year and assuming a constant annual income growth rate, the average annual income of Chinese resident households will exceed 400,300 yuan in 31 years. This means that the amount of sugar used in beverages will not decrease significantly over the next 30 years because of the advent of sugar substitutes, although its growth rate will be lower. During this period, sugar substitutes supplement the supply of table sugar, and the relationship between the two is closer to complementarity than to complete substitution. This is in line with the findings of Si Wei and Zhu Haiyan [14]. However, after household income exceeds 400,300 yuan, household consumption of sugar-sweetened beverages begins to decline, while sugar-free beverage consumption continues to have good room for growth. This means that in the coming period, the emerging sugar-free beverage industry will weaken the market competitiveness of the traditional sugar-containing beverage industry, the dominant position of sugar-containing beverages in the beverage market will be broken by sugar-free beverages, and the beverage consumption market pattern will be greatly changed or even reshaped. In addition, because the beverage industry is the industry that uses the most sugar, a continued decline in the consumption of sugar-sweetened beverages will lead to a significant reduction in sugar consumption, which means that the relationship between sugar and sugar substitutes will shift from complementary to substitution.
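The 31-year horizon can be checked with a short compound-growth calculation. The sketch below assumes that the 4.7% rate is applied every year starting from the 2020 figure of 96,600 yuan; the function name is illustrative. Starting instead from the implied 2019 base (96,600 / 1.047) takes one more year of growth and reaches the threshold in the same calendar year.

```python
def years_to_exceed(income, growth_rate, threshold):
    """Count whole years of compound growth until income exceeds the threshold."""
    years = 0
    while income <= threshold:
        income *= 1.0 + growth_rate
        years += 1
    return years

# Figures taken from the paragraph above: 2020 income, growth rate, turning point.
years = years_to_exceed(96_600, 0.047, 400_300)
print(years, 2020 + years)
```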
We also found that at least three of the control variables reflect the population's need for a healthy diet. First, the coefficient on the presence of children is significantly negative, implying that the consumption of sugary beverages is lower in households with children, which is in line with expectations. Some argue that diet has a memory function and that eating habits formed during childhood remain for life [32,33]. Healthy eating habits formed during childhood will therefore be maintained for a long time, and less exposure to and intake of unhealthy foods such as sugary drinks during childhood would mean lower consumption of sugary drinks in adulthood. Second, there is a significant negative relationship between respondents' age and the household consumption of sugary beverages, and a significant positive relationship between respondents' age and the household consumption of sugar-free beverages. This indicates to some extent that beverage consumption is stratified across age groups. For middle-aged and elderly people in particular, less sugar, less salt and less oil are important principles of food consumption [34], so reducing sugary beverage consumption and increasing sugar-free beverage consumption is a reasonable choice for this group. Third, the effect of respondents' education level on household beverage consumption is similar to that of respondents' age, and beverage consumption is stratified across education levels. In general, residents with higher education levels are more health conscious and have greater health needs [35,36]. Thus, this group will reduce sugary beverage consumption.
At the same time, based on the results of the study, we speculate that the intrinsic health needs of the population are an important factor in the increasing consumption of sugar-free beverages, which in turn leads to changes in the beverage consumption market pattern, but that these needs must be stimulated by certain conditions, such as an increase in income levels, to take full effect. Studies have shown that only those with higher income levels and sufficient economic support can purchase healthier and more nutritious food and make lifestyle choices that are beneficial to their health, thus satisfying their health needs and pursuing healthy behaviors [37,38]. As a representative health product, sugar-free beverages can both satisfy the demand for beverage taste and meet the health needs of residents, and as income levels increase they are well placed to become the best substitute for sugary beverages. This also indicates that, in the general environment of building a healthy China and with the growth of income, a shift in residents' demand for sugar and beverages from a simple demand for sweetness to a health-oriented demand, with sweeteners as a supplement, is likely to become a popular trend in the future.
Finally, considering that a sugary beverage tax, one of the most cost-effective public health strategies, is a common price instrument used in several countries to tackle obesity [39,40], it may also be one of the intervention policies to tackle rising sugary beverage consumption in China in the future. It has been argued that a tax on sugary beverages would induce a shift in the consumption of sugary beverages to sugar-free or low-sugar beverages among the population [41,42]. Then, with the gradual increase in income levels and nutritional health awareness of the population, coupled with the sugary beverage tax, we can imagine that the consumption of sugary beverages by the population will further decrease significantly, thus accelerating the change in the beverage consumption market pattern.
Our study has the following strengths: firstly, there is a lack of current research related to the emerging beverage consumption market in China, which this paper complements. Secondly, the paper uses the GPSM approach to deal with possible self-selection issues, while also addressing possible endogeneity issues and confirming the robustness of our results. However, there are some limitations to this study. Due to data limitations, there is a lack of variables on nutritional perceptions and food preferences, and we are unable to conduct mechanistic tests or better disentangle the direct and mediating effects of income. Overall, this study has served as a primer and will hopefully provide some inspiration for subsequent research in this area. We will also follow up on this study and organize research on the relevant variables to broaden the depth and breadth of the study on this basis and continuously improve it to make up for the shortcomings.

Conclusions
In conclusion, our findings show that there is an inverted U-shaped relationship between household income and household consumption of sugary beverages, and an exponential relationship with the household consumption of sugar-free beverages. This means that once household income reaches a certain point, further increases will drive household sugary beverage consumption down, while sugar-free beverage consumption will continue to rise. The findings of this paper have implications for the beverage consumption market and the sugar industry. With the growth of income and the continued promotion of a healthy China, consumers' own nutritional awareness has been strengthened, making it highly likely that the consumption demand for sugary foods, including sugary beverages, will decrease. This is not only related to the reshaping of the beverage consumption market pattern, but also has a real impact on the future development of sugar consumption in China. If the supply side of sugar is not adjusted quickly, it may cause some industrial shocks. Firstly, Chinese sugar enterprises and related departments should encourage the food and beverage industry to transition from concentrating resources on the production of high-sugar goods to the production of "healthy", "nutritious" and "innovative" low-sugar goods to meet the current and future needs of the population. Secondly, the food and beverage industry should be encouraged to make use of technological innovation to lower the cost of low-sugar goods and improve the competitiveness of prices.
Author Contributions: Methodology, Z.L.; formal analysis, Z.L., S.L. and J.P.; data curation, Z.L., S.L. and J.P.; writing-original draft preparation, Z.L.; writing-review and editing, J.P. All authors have read and agreed to the published version of the manuscript.