Health and Housing Energy Expenditures: A Two-Part Model Approach

Interest in the interaction between energy and health within the built environment has been increasing in recent years, in the context of sustainable development. However, in order to promote health and wellbeing across all ages it is necessary to have a better understanding of the association between health and energy at household level. This study contributes to this debate by addressing the case of Portugal using data from the Household Budget Survey (HBS) microdata database. A two-part model is applied to estimate health expenditures based on energy-related expenditures, as well as socioeconomic variables. Additional statistical methods are used to enhance the perception of relevant predictors for health expenditures. Our findings suggest that, given its high significance and coefficient value, energy expenditure is a relevant explanatory variable for health expenditures. This result is further validated by a dominance analysis ranking. Moreover, the results show that health gains and medical cost reductions can be a key factor to consider in the assessment of the economic viability of energy efficiency projects in buildings. This is particularly relevant for the older and low-income segments of the population.


Introduction
Interest in the interaction between energy and health within the built environment has been increasing in recent years, in the context of climate change and sustainable development. Recent studies focusing on overheating, cold homes and the inability of households to keep comfortable indoor temperatures [1][2][3][4][5][6] have emphasized the exposure to inadequate indoor temperatures. According to Daniel et al., the issue of cold homes has been increasingly recognized even in mild climate countries, with households in Adelaide registering indoor temperatures below health thresholds and low indoor thermal comfort perception [2]. A recent study by Guertler and Smith on cold homes reports that the majority of excess winter mortality in the United Kingdom (UK) is linked to "the coldest 25% of homes in the UK" and, to a lesser extent, to people experiencing fuel poverty [3]. A severe lack of thermal comfort in both heating and cooling seasons was also identified by Gouveia et al., affecting fuel-poor households in Portugal [6]. This has been considered a key challenge to establishing healthy housing guidelines [7]. Overheating has already been experienced by a significant number of dwellings in London, although its impact on wellbeing was not quantified [8]. This trend is expected to increase with climate change [9]. According to Santamouris (2020), climate change is expected to increase indoor temperatures in low-income housing, affecting energy as well as health [10]. In this context, the use of country-level microdata datasets is common for both health and energy purposes. The use of common databases possibly entails common explanatory variables. However, these areas are approached separately. Recent studies have either looked to understand the key drivers of household energy expenditures or to understand the key drivers and implications of health expenditures at household level.
The development of statistical models has emphasized the relevance of sociodemographic and building characteristic variables as household energy determinants in several countries. For instance, Salari and Javid's [32] study has shown that a combination of higher education level and newer buildings was associated with effective reductions in energy expenditures for metropolitan areas in the United States (US). In Italy, Besagni and Borgarello [33,34] have determined that sociodemographic characteristics seem to have greater explanatory power than building and appliance characteristics for electricity expenditures; meanwhile, building characteristics seem to be more relevant regarding heating energy needs. In the case of Portugal, Wiesmann et al. [35] have concluded that household and dwelling characteristics have a significant influence on residential electricity consumption compared with socioeconomic variables. However, future electricity demand is expected to be largely influenced by changes at both the socioeconomic and building stock level. These studies have resorted to common modelling approaches, such as ordinary least squares (OLS) or the odds ratio (OR) method, to establish the main explanatory variables for energy consumption and/or expenditures at household level.
The study of household health expenditures resorts to different modelling approaches. One of the most common is the use of two-part models, which better accommodate the nonnegative, skewed and "zero-inflated nature" of the data [36][37][38][39]. The relevance of demographic and socioeconomic conditions has also been emphasized using this model in several countries. For instance, Norton and Deb [40] assessed the impact of a change in insurance policy for young adults on total health expenditures in the US. As main explanatory variables, their model included socioeconomic characteristics of the population, such as age, gender, marital status, family size and health status. The study concluded that health insurance coverage expansion for young adults has contributed to a decrease in overall health spending. Household health expenditures and obesity were studied in Portugal by Veiga [41]. Taking into consideration socioeconomic conditions, health status, health risks and health care use as explanatory variables, the two-part model established that the prevalence of excessive weight and obesity contributed to an increase in the probability of household health expenditures [41]. More recently, a two-part model was also developed by Caballer-Tarazona et al. [42] to model the integrated healthcare expenditure for the entire population of a health district in Spain. The authors were able to develop improved expenditure models that account for multimorbidity, including variables such as age, gender, health status severity level and healthcare expenditures.
Therefore, there seems to be a connection between sociodemographic variables, such as age, and energy use and/or expenditure. There also seems to exist a connection between age, health and housing energy efficiency. However, there is a lack of empirical survey-based studies relating age, energy and health expenditures. This paper addresses this gap by proposing a two-part model to estimate health expenditures based on energy-related expenditures, as well as socioeconomic variables. The present study thus aims to extend previous research and seeks to answer the following research questions:
- How are health expenditures associated with energy-related expenditures?
- How does this association vary with other explanatory variables, such as age or income?
This study thus contributes to the analysis of the relationship between health and energy-related expenditures, based on microdata from Eurostat's Household Budget Surveys (HBS) [43], addressing the case of Portugal. This database may be made available for scientific purposes to recognized scientific research centers and university institutions, under a strict access and usage protocol. The HBS dataset contains disaggregated household-level data per country, enabling a detailed insight into household dynamics and a more in-depth analysis than other datasets [44].
Additionally, a dominance analysis (DA) is presented in order to understand the relative importance of energy and other explanatory variables for health expenditures. This statistical method has been considered suitable to identify the relative ranking of predictors' dominance for complex survey data [45].
In particular, we aim to show the relevance of the interconnection between energy and health expenditures at household level. Identified trends, based on the relationship between income, age, and electricity and health expenditures, indicate that vulnerable populations, such as the elderly and low-income groups, might already be experiencing household energy poverty. These findings open the way for policy makers to consider household-related policies that address high health and electricity expenditures, upon further research considering building and health characteristics. In summary, the research shows the importance of health care costs and energy costs, taking into account socioeconomic conditions and, in particular, the income and age of the householders as key explanatory factors. The research methodology adopted and the model specification are described in Section 2, followed by the results in Section 3 and concluding remarks in Section 4.

Dataset and Variables: Data Sources and Survey Description
The study of health and energy expenditures at household level required the use of the Household Budget Survey (HBS) microdata database, commissioned by the European Commission. The HBS database consists of a collection of nationally representative household surveys at EU Member State level, portraying living conditions based on household-related expenditures on goods and services [43,46].
Data is collected on a voluntary basis by Eurostat every five years [43,46]. The last collection round, dating from 2010, was made available for this study, upon request. The survey is based on national household surveys, which in the case of Portugal is carried out by Statistics of Portugal (INE). The structure of the dataset contains yearly records for different categories of household expenditures, such as food, clothing, energy and water supply, health or education and transportation expenditures. The values for expenditures are collected at national level by interviews and diaries and converted to annual amounts and corrected using coefficient factors by Eurostat [46]. For additional details on the survey itself, regarding sampling design or sampling weights, the following reports are publicly available [47][48][49].
This information is organized into the Household and the Household Members files. Within these files, information is grouped into basic variables at household level, which relate to identification, weighting, demographic characteristics of the households, and income and consumption expenditure data [46]. Similarly, basic variables at household member level account for: identification and weighting; gender, age, marital status, country of birth; education level, activity status and income [46]. The sample for the 2010 HBS edition included a total of N = 9489 household units and N = 24,383 householders in Portugal. This original sample underwent a filtering process, in order to match households as a whole with householders, based on the household reference person (HRP) concept, as well as to exclude missing data. The exclusion criteria followed are illustrated in Figure 1. From Figure 1, it is possible to see a substantial reduction in sample size, from an initial N = 24,383 observations to a sub-sample with N = 8947 observations, based on filtering steps 1-4. This size reduction is needed to establish a single person or householder (HRP) for each household. For Portugal, the reference person (HRP) corresponds to the person aged 16 or more with the highest annual income in a given family household [48,50]. Taking this definition into consideration, filter (1) excludes all household members below 16 years of age (legal working age not attained). Based on the reference person concept, filter (2) excludes household members whose incomes do not correspond to the highest annual income for a given household or "income profile". In turn, in order to establish one single household member as reference person (HRP), filter (3) excludes duplicate household members with the highest income. Filter (4) excludes households with zero reported income, since it is not possible to establish the HRP concept based on "income profile".
This way, the householder file becomes compatible with the household file, defining the sub-sample from the initial total number of observations. The advantage of using a single individual per household, as in the HRP concept, is that it avoids subjective bias (see [51]) and avoids the difficulties of specific equivalence scales as described by the Organization for Economic Co-operation and Development (OECD) (2013) [52]. Together, the filters for variables within the householder file accounted for 15,436 dropped observations. The final sample size (N = 8473) was established after excluding households that reported zero electricity expenditures. This filter was established given that access to residential electricity supply has 100% coverage at national level. According to the World Bank's Sustainable Energy for All (SE4ALL) database [53], a time series for the last ten years (2007–2017) showed that indicators for access to electricity were 100% for both rural and urban areas. This time series is relevant because it includes the most recent wave of HBS available, from 2010, used in the current research. Missing electricity costs, together with missing geographical and socioeconomic variables, namely education level, activity status and hours worked, led to the exclusion of 474 observations.
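The filtering steps described above can be sketched in a few lines of Python. This is only an illustration on invented records: the field layout, the `select_reference_persons` helper and, in particular, the tie-break rule used for filter (3) (lowest member identifier) are assumptions, since the text does not state how duplicates with the highest income are resolved.

```python
from collections import defaultdict

# Hypothetical toy records: (household_id, member_id, age, annual_income)
members = [
    ("H1", 1, 45, 18000.0),
    ("H1", 2, 43, 12000.0),
    ("H1", 3, 12, 0.0),
    ("H2", 1, 70, 9000.0),
    ("H2", 2, 68, 9000.0),   # tie on highest income
    ("H3", 1, 30, 0.0),      # household with zero reported income
]

def select_reference_persons(records, min_age=16):
    """Return one household reference person (HRP) per household."""
    by_household = defaultdict(list)
    for household, member, age, income in records:
        if age >= min_age:  # filter (1): drop members below 16 years of age
            by_household[household].append((member, age, income))

    reference = {}
    for household, people in by_household.items():
        top_income = max(income for _, _, income in people)
        if top_income <= 0:  # filter (4): drop zero-income households
            continue
        # filter (2): keep only members with the highest annual income
        top_earners = [p for p in people if p[2] == top_income]
        # filter (3): resolve duplicates (assumed rule: lowest member id)
        reference[household] = min(top_earners, key=lambda p: p[0])
    return reference

reference_persons = select_reference_persons(members)
```

Each remaining household then maps to exactly one individual, which is what allows the householder file to be matched one-to-one with the household file.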

Dataset and Variables: Dependent and Independent Variables
For the current assessment, data from the HBS on a variety of health care and household energy expenditures, as well as sociodemographic variables, were drawn. These variables are categorized according to the Classification of Individual Consumption by Purpose (COICOP). A list of the variables considered in this study, according to this classification, is given in Table 1. Each of these main categories can be further disaggregated into sub-categories for households that participated in the survey, such as detailed fuel costs (e.g., electricity and natural gas), hospital services or prescriptions for health costs. Among the main reasons for using HBS data are its free access and its prior use in studies of health and energy issues. It gathers in a single database highly disaggregated information on consumption expenditures, and also provides additional data regarding the household's sociodemographic context. Additionally, expenditures have been considered a more reliable measure of household resources in prior studies concerning household health costs [54]. However, a drawback of this database is that health status, considered one of the influencing factors of health expenses [40], is not included. For the HRP, variables such as age, gender, birthplace, marital status, education level, income and employment status have also been defined to characterize the observed population. Each household's geographic context (region of the country and population density) has been considered as well.
Income and energy expenditures are annualized and, similarly to prior studies for different countries [33,55,56], they have been organized into low, medium and high levels, as referred to in Table 2. This organization of income into levels has also been adopted for Portugal by Gouveia et al. [6,57]. This study adapts this concept to establish yearly income levels. Segmentation of energy according to low, medium and high consumption profiles has also been previously performed by [58,59]. Here, this concept has been adapted and applied to define energy cost profiles. The abovementioned studies are surveys conducted at household level, and given that HBS are also conducted directly on a sample of households, this approach was considered suitable for the dataset used. Dummy variables representing 'use' or 'no use' were developed to code other commodities (e.g., bottled gas). A similar procedure was followed for health insurance and rental status.
Table notes. Activity status: (a) unemployed and/or retired; (b) student and/or domestic. Income levels (1) based on [6,57]; electricity levels (2) and natural gas levels (3) based on [58,59]. Dummy coding: 0 = no use; 1 = use. * Reference category for Section 3.
Total health expenditures correspond to the sum of all household health expenditures categories featured in Table 1.
To briefly summarize, this study includes additional energy-related variables, such as electricity expenditures, relevant at household level, to complement variables traditionally featured in the estimation of healthcare such as demographic, socioeconomic, and health insurance [56,60], with the exception of health status.

Model Specification
The total health expenditures were modeled according to a two-part model, following a statistical model design proposed by Belotti et al. [61] and Deb et al. [62]. The prediction of the mean of the dependent variable, health expenditures (Y), given the explanatory variables (X), depends on both parts of the model, and can be written as in Equation (1):

$$E(Y \mid X) = \Pr(Y > 0 \mid X) \times E(Y \mid Y > 0, X) \qquad (1)$$
The main focus of Part 1 from the Equation (1) is to model the probability of the household having any health expenditures; in Part 2, the main concern is to predict health expenditures conditional on households having them. Both parts of Equation (1) are associated with several model choices.
The main modelling choices have been identified by Deb and Norton [40]. The first choice refers to Part 1 of Equation (1), between logit and probit options, to model the probability of zero vs. non-zero expenditures. The second and third choices are explained below and correspond to the GLM model family and link choices. Similar to prior studies [42,61,63], this study has opted for a logit alternative to model the probability of having expenditures greater than zero, as defined by Equation (2). This option was based on the qualities of simplicity and interpretability the logit model presents in comparison to probit alternative [64].
Equation (2) corresponds to a binary response model that takes the value 1 if health expenditure is positive and 0 otherwise [63]:

$$\Pr(Y > 0 \mid X) = \frac{\exp(X\alpha)}{1 + \exp(X\alpha)} \qquad (2)$$
where X stands for the vector of the independent variables and α for the vector of the regression coefficients for Part 1 of the model. The remaining choices refer to Part 2 of Equation (1): the choice of link function and distribution family of the generalized linear model (GLM) to model a positive (non-zero) outcome. Standard specification tests were performed to determine the final model specification, namely a Box-Cox test to identify the link or transformation for the dependent variable (Y). The coefficient obtained from this test is close to zero, which is indicative of a log link. In order to determine the distribution family, a modified Park test was computed following the steps of Norton and Deb [40]. With a coefficient on the expected value close to 2.0, a gamma distribution was selected. The flexibility of the gamma distribution in specifying shape and rate parameters provides a way to deal with issues such as skewness and kurtosis, as well as a range of positive values with outliers (a number of individual high expenditures) [65]. The GLM with gamma family and log link has accordingly been considered an increasingly popular option to model health costs, helping with a mass of zeros and skewed non-zero distributions [37,39].
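The logic of the family choice can be illustrated with a simplified Park-type regression: the log of the squared residual is regressed on the log of the predicted mean, and the slope indicates how the variance scales with the mean (a slope near 0 points to a Gaussian family, 1 to Poisson, 2 to gamma, 3 to inverse Gaussian). The sketch below uses plain OLS on logs rather than the GLM-based estimation usually used for the modified Park test, and the data and function names are invented for illustration.

```python
import math

def park_slope(observed, predicted):
    """OLS slope of ln((y - yhat)^2) on ln(yhat).

    Assumes strictly positive predictions and non-zero residuals.
    """
    xs = [math.log(p) for p in predicted]
    ys = [math.log((y - p) ** 2) for y, p in zip(observed, predicted)]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

def suggested_family(slope):
    """Map the slope to the nearest conventional GLM family."""
    families = {0: "gaussian", 1: "poisson", 2: "gamma", 3: "inverse gaussian"}
    nearest = min(families, key=lambda power: abs(power - slope))
    return families[nearest]

# Hypothetical data in which the residual is proportional to the mean,
# so the variance grows with the square of the mean.
predicted = [100.0, 200.0, 400.0, 800.0]
observed = [1.5 * p for p in predicted]

slope = park_slope(observed, predicted)
family = suggested_family(slope)
```

With residuals proportional to the predicted mean, the slope equals 2 up to rounding, which corresponds to the gamma choice reported in the text.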
Taking into consideration the specification tests, the Part 2 model can be written as in Equation (3):

$$E(Y \mid Y > 0, X) = \exp(X\beta) \qquad (3)$$
where β stands for the vector of regression coefficients for Part 2 of the model. As previously mentioned, each observation follows a gamma distribution (a member of the exponential family) with a log link function; therefore, each coefficient β is interpreted as a percentage change, given by (e^β − 1), as described by Norton et al. [66]. Based on Equations (2) and (3) from Kyriopoulos et al. [34], the final equation for the two-part model can be rewritten as presented in Equation (4):

$$E(Y \mid X) = \frac{\exp(X\alpha)}{1 + \exp(X\alpha)} \times \exp(X\beta) \qquad (4)$$
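As a minimal numerical sketch of how the two parts combine, the functions below compute the logit probability of Part 1, the log-link conditional mean of Part 2, and their product. The coefficient values and the two-element covariate vector are hypothetical and are not estimates from this study.

```python
import math

def prob_positive(x, alpha):
    """Part 1 (logit): probability of having any health expenditure."""
    linear = sum(a * v for a, v in zip(alpha, x))
    return 1.0 / (1.0 + math.exp(-linear))

def mean_if_positive(x, beta):
    """Part 2 (GLM, log link): expected expenditure given it is positive."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))

def two_part_mean(x, alpha, beta):
    """Overall expected expenditure: product of the two parts."""
    return prob_positive(x, alpha) * mean_if_positive(x, beta)

# Hypothetical covariates [intercept, indicator] and coefficients
x = [1.0, 1.0]
alpha = [1.0, 0.5]
beta = [6.0, 0.1]

expected = two_part_mean(x, alpha, beta)
```

Because the overall mean is a product, a covariate can shift it through the probability of spending, through the amount spent, or through both, which is why the two sets of coefficients are reported separately.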
Additionally, specification tests for the independent variables, such as the Pregibon link test and the Ramsey RESET test, were also conducted. This procedure has been applied previously by several authors (see [40,42,63]) to detect misspecification due to omitted variables. The tests performed were non-significant, which gives no evidence of misspecification and suggests that the model does not need additional variables.
All statistical analysis was performed using Stata software with the 'twopm' command [40,61,62]. Moreover, in the results section, the use of Average Adjusted Predictions (AAP) is favored to convey two-part model results. Adjusted predictions and marginal effects are alternative ways to present the results from a modelling approach, making them easier to understand and interpret [67,68]. Their use enables us to visually understand differences across variables of interest, such as age, income or electricity expenditure, giving the results obtained a "substantive and practical significance", as emphasized by Williams [67]. The Stata commands 'margins' and 'marginsplot', developed by Williams, were used [69].
To determine the relative importance of energy-related variables in the statistical model, a dominance analysis (DA) was undertaken. In order to compute it, the general dominance equation for each independent variable x (i.e., C_x), suggested by Luchman (2015) [45], was adopted: "where F_ij is the fit metric associated with model ij, p is the number of independent variables in the model, n_i stands for the number of possible combinations of size i given the p independent variables, and C(m,k) is the number of combinations of size k possible given set size m" [45] (p. 2). According to Azen and Traxel (2009), importance is defined by pairwise comparison of predictors (i and j) from a selected model (model ij) across all subsets of predictors to establish relative importance or dominance, requiring for this purpose only a measure of model fit, such as R². Therefore, dominance analysis enables us to define "the additional contribution of any given predictor to a given subset model as the change in R² when the predictor is added to the model" [70] (p. 324). Dominance analysis was performed using Stata software, resorting to the 'domin' command [71].
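The idea behind general dominance can be sketched as follows: for each predictor, the gain in fit from adding it is averaged over all subsets of the other predictors of a given size, and these size-specific averages are then averaged again. The fit values below are invented R² figures for three illustrative predictors, and `general_dominance` is a hypothetical helper, not the 'domin' implementation.

```python
from itertools import combinations

def general_dominance(predictors, fit):
    """General dominance statistics from a table of subset fit metrics.

    `fit` maps a frozenset of predictor names to a fit metric such as R^2.
    """
    p = len(predictors)
    stats = {}
    for x in predictors:
        others = [v for v in predictors if v != x]
        size_means = []
        for k in range(p):  # sizes 0 .. p-1 of subsets that exclude x
            gains = [
                fit[frozenset(subset) | {x}] - fit[frozenset(subset)]
                for subset in combinations(others, k)
            ]
            size_means.append(sum(gains) / len(gains))
        stats[x] = sum(size_means) / p
    return stats

# Hypothetical R^2 values for every subset of three predictors
fit = {
    frozenset(): 0.00,
    frozenset({"age"}): 0.10,
    frozenset({"income"}): 0.06,
    frozenset({"electricity"}): 0.04,
    frozenset({"age", "income"}): 0.14,
    frozenset({"age", "electricity"}): 0.13,
    frozenset({"income", "electricity"}): 0.09,
    frozenset({"age", "income", "electricity"}): 0.16,
}

dominance = general_dominance(["age", "income", "electricity"], fit)
ranking = sorted(dominance, key=dominance.get, reverse=True)
```

A useful property of these statistics is that they sum to the fit metric of the full model, so each value can be read as that predictor's share of the overall R².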

Sample Summary Statistics
Prior to the statistical analysis, brief summary statistics of the sample are provided in Table 2. Table 2 presents the summary statistics for the final sample, according to their groupings: geographic, socioeconomic, and energy expenditure. A few aspects regarding the sampled population are emphasized here. Regarding demographic characteristics, there is a prevalence in the sample of mainland households (81.53%) in comparison to Madeira and Azores. These household units are mostly located in sparsely populated areas (63.21%). About 60% of the population is male, from older age ranges and with lower education levels. Just over 90% of the population is national, working and to a large extent belongs to the medium income level. Household composition varies greatly, with a greater presence of adults without children, of different age ranges but with a high share of elderly population. Household composition and size (no. of people in each age range) follow the harmonized categories that classify household members into adults and children, according to Eurostat's HBS methodology [46,49].
Most householders working full time are homeowners without private health insurance, as the National Health Service provides coverage to the whole population. Most households have a medium electricity expenditure, and bottled gas expenses seem to prevail over natural gas ones, with low use of liquid and solid fuels.

Results
In this section, the results from the statistical analysis are provided. First, the outcome of the two-part model is presented to establish the associations between overall health expenditures, energy expenditures and sociodemographic variables. For significant associations, results are also reported as Average Adjusted Predictions (AAP) to provide a better understanding of the magnitude of the model coefficients and of the change in health expenditures produced by a one-unit change in the independent variable or in comparison to the reference level, depending on whether the variable is continuous or categorical.
Second, results from the conducted dominance analysis are presented to understand the explanatory power of independent variables. A rank of the relative importance of different independent variables included in the model is provided.

Main Findings
The focus of this section is to present the two-part model results. Parameter or coefficient estimates from the regression model are presented in Table 3. These coefficients reflect the association that the independent variables have with healthcare expenditures. The logit model (part 1) indicates the likelihood of having any expenditure versus not having any. Meanwhile, the GLM model (part 2) indicates the increase or decrease in spending, conditional on having any expenditure. According to Table 3, several model coefficients show different signs and levels of significance for each part of the model. The coefficients for the logit model (part 1) are in log-odds units and, depending on their sign, allow one to infer whether having health expenditures is a more likely (log-odds > 0) or less likely (log-odds < 0) event. In the GLM model (part 2) with a log link function, each coefficient can be interpreted as a percentage change relative to the reference level. The overall predicted health expenditure is obtained by multiplying the predictions from the logit model (part 1) and the GLM model (part 2), as shown in Equation (1). As a result, the combined two-part estimates may not all be jointly significant, even though the variables may be significant for each part of the model separately.
As illustrated in Table 3, there are highly significant explanatory variables for having health expenditures, among all the categories featured in the model, i.e., geographic, socioeconomic and energy expenditure. However, given that the main focus of the current study is to better understand the role of energy expenditures as an explanatory variable for health, other significant explanatory variables are focused on, taking into consideration their relationship with energy and particularly with electricity expenditures. The use of adjusted predictions and marginal effects is required to graphically illustrate how health expenditures change according to age range or income level and electricity expenditure, within the energy expenditure category. Average adjusted predictions (AAPs) are used to visually translate the association between the dependent variable and a given independent variable as the predicted estimate, while all other variables are left at their observed values [72]. They are used to improve the perception of the outcome of the two-part model, since the coefficients in both parts of the model are difficult to interpret, based only on the sign and statistical significance.
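The computation behind an average adjusted prediction can be sketched as follows: the variable of interest is set to a chosen value for every household, all other variables stay at their observed values, and the two-part predictions are averaged. The rows, coefficients and column layout below are hypothetical, and the helper functions only imitate what Stata's 'margins' reports after a two-part model.

```python
import math

def two_part_mean(row, alpha, beta):
    """Logit probability (part 1) times log-link mean (part 2)."""
    prob = 1.0 / (1.0 + math.exp(-sum(a * v for a, v in zip(alpha, row))))
    return prob * math.exp(sum(b * v for b, v in zip(beta, row)))

def average_adjusted_prediction(rows, column, value, alpha, beta):
    """Mean prediction with one column fixed and the rest as observed."""
    total = 0.0
    for row in rows:
        adjusted = list(row)      # copy, so the observed data stay intact
        adjusted[column] = value
        total += two_part_mean(adjusted, alpha, beta)
    return total / len(rows)

# Hypothetical rows: [intercept, age in decades, high-electricity dummy]
rows = [[1.0, 3.0, 0.0], [1.0, 5.0, 1.0], [1.0, 7.0, 0.0]]
alpha = [0.2, 0.1, 0.3]   # hypothetical part 1 coefficients
beta = [5.0, 0.05, 0.08]  # hypothetical part 2 coefficients

aap_low = average_adjusted_prediction(rows, 2, 0.0, alpha, beta)
aap_high = average_adjusted_prediction(rows, 2, 1.0, alpha, beta)
```

The difference between `aap_high` and `aap_low` is expressed in expenditure units, which is easier to read than the separate log-odds and log-scale coefficients.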
According to Table 3, electricity expenditures stand out within the energy category, as electricity is the only energy alternative that is statistically significant for both part 1 and part 2 of the two-part model. Electricity presents a positive coefficient for both the logit and the GLM model. This entails that both the probability of having health expenditures and their amount increase with electricity expenditures. The electricity expenditure variable is associated with an increased probability of having health expenditures (by 9.75%), according to Equation (2), for the logit model (part 1). With a higher level of significance (p < 0.001), the amount of health expenditures increases by about 8% (e^0.078 − 1 ≈ 0.08), from Equation (3), for the GLM model (part 2). Further analysis is conducted in the next section.
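The conversion from a log-link coefficient to a percentage change is a one-line calculation, shown here for the electricity coefficient of 0.078 quoted above; the function name is illustrative.

```python
import math

def percent_change(coefficient):
    """Percentage change in the conditional mean implied by a log-link coefficient."""
    return (math.exp(coefficient) - 1.0) * 100.0

electricity_glm_change = percent_change(0.078)
```

For small coefficients the result is close to the coefficient itself multiplied by 100, which is why 0.078 translates into a change of roughly 8%.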

Discussion
The main results from the two-part model are further discussed. The critical analysis takes into consideration significant variables of interest, such as age, income and electricity expenditures. This critical analysis is followed by a dominance analysis (DA).
In Figure 2, a more detailed look at the two-part model's health expenditure for all electricity expenditure categories is displayed. For each category of electricity expenditures, variation among different levels is revealed. For instance, higher health expenditures are expected for very low electricity expenditure level in comparison to low and medium electricity expenditure levels. Moreover, other socioeconomic variables are taken into consideration to contextualize obtained results. According to Table 3, it is also possible to see that other energy variables, particularly for liquid and solid fuels, present higher positive and significant coefficients, only for the logit model (part 1). Liquid fuels include domestic heating and lighting oil, which similarly to solid fuels might be more carbon intensive energy alternatives. Their use and exposure might imply higher probability of incurring health expenditures. This argument is plausible, as prior studies have emphasized the adverse health implications of the use of carbon intensive fuels, though mostly related to developing countries [72,73].
The stronger contribution of the logit model (part 1) in comparison to the GLM model (part 2) to the two-part combined estimate for health expenditure is also noticeable for natural and bottled gas (both with higher positive and significant coefficients). According to Figure 3, health expenditures tend to increase with increasing levels of natural gas expenditure (from level 1 to level 3) and then decrease abruptly at higher levels of expenditure (from level 3 to level 4). The estimated overall health expenditure reaches about 1450 €/year for people included in level 3 and then drops by about 150 € in comparison to the reference level (level 1), as illustrated in Figure 3.
In fact, Figure 3 shows that the reduction from level 3 to level 4 is outweighed by the increase from level 2 to level 3, which is consistent with the small and non-significant GLM coefficient in Table 3. A possible explanation should take into consideration the summary statistics of the dataset, presented in Table 2. According to Table 2, high natural gas expenditures represent a very small share of sampled households (1.56%), in comparison to other levels, such as low natural gas expenditures (84.70%). This share corresponds to a total of N = 132 units in a universe of N = 8 473 households. Therefore, any conclusions for the high natural gas expenditure segment may be difficult to justify.
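The sample share quoted for the high natural gas expenditure level can be cross-checked with simple arithmetic. The sketch below uses only the two counts reported in the text (132 households out of 8473); the function and variable names are illustrative and not taken from the study.

```python
def share(count, total):
    """Return count as a fraction of total."""
    return count / total


# Counts reported in the text for the high natural gas expenditure level.
high_gas_share = share(132, 8473)
print(f"High natural gas expenditure share: {high_gas_share:.2%}")
```

A cell this small is the reason the text treats any conclusion for this segment as tentative.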
The prevalence of lower natural gas expenditure levels could also be related to the location of the households, mostly in sparsely populated (rural) areas, as illustrated in Table 2. In Portugal, natural gas connection is typical for urban, densely populated areas, which have a lower share in the analyzed sample (29.87%). This information from the summary statistics might help contextualize and interpret the obtained results. Yet, as previously mentioned, statistical inferences are premature, given the nature and size of the sample for the case of gas. Two-part combined health expenditures are estimated to be higher for people who use bottled gas (1230 €/year) than for people who do not use it (1180 €/year). In contrast to natural gas, bottled gas is a common energy alternative for sampled households, supplying over 70% of households. This corresponds to about N = 5 947 households and is in accordance with the rural setting. The difference of close to 50 €/year in health expenditures occurs when there is a change from 0 (no use) to 1 (use). As for natural gas, this increase is driven by part 1 of the model, given the small and non-significant value of the GLM coefficient.
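To make explicit how part 1 alone can move the combined estimate, the following minimal sketch multiplies the probability of any health expenditure (logit) by the expected amount conditional on spending (GLM). It assumes a log link for the GLM, which is what the $e^{\beta} - 1$ conversion used in the text implies. All linear-predictor values are hypothetical illustration values, not estimates from Table 3.

```python
import math


def prob_positive(xb_logit):
    """Part 1: probability of any health expenditure from a logit linear predictor."""
    return 1.0 / (1.0 + math.exp(-xb_logit))


def conditional_mean(xb_glm):
    """Part 2: expected expenditure given spending, assuming a log link."""
    return math.exp(xb_glm)


def combined_expectation(xb_logit, xb_glm):
    """Two-part combined estimate: P(y > 0) times E[y | y > 0]."""
    return prob_positive(xb_logit) * conditional_mean(xb_glm)


# Hypothetical predictors: only the logit part shifts (by 0.3),
# while the GLM part stays fixed at 7.0.
base = combined_expectation(xb_logit=1.5, xb_glm=7.0)
shifted = combined_expectation(xb_logit=1.8, xb_glm=7.0)
print(f"combined estimate rises by {shifted / base - 1:.2%}")
```

With the conditional mean held fixed, the relative change in the combined estimate equals the relative change in the probability from part 1, which is the situation described for natural and bottled gas.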
For age-related variables, such as age range and the number of people above a given age, the results from the logit model (part 1) indicate that, ceteris paribus, the likelihood of having health expenditures increases by 0.112 (10.01%) in comparison to the reference category. The results for the GLM model (part 2) show that the model coefficient is positive, in alignment with the logit model (part 1). Therefore, in comparison to the reference category (15 to 29 years old), there is a 10.2% increase in health expenditures, when they exist ($0.102 = e^{0.097} - 1$, from Equation (3)). As the number of people between 25 and 65 years old included in the household composition increases, so does the likelihood of having increased health expenditures. This likelihood increases by 0.264 in comparison to the reference level. The reference level is defined by Stata software as the less frequent category (zero people between 25 and 65 years old). Therefore, in households with people between 25 and 65 years old, there is an increased probability of 20.89% of having additional health expenditures in comparison to households without any people between 25 and 65 years old.
Similarly, for each additional person above 65 in the household composition, the predicted log odds are expected to increase by 0.556 (35.73%) in comparison to no people above 65 (reference level). Therefore, a greater likelihood of having health expenditures is expected for higher levels of the coefficient. The changes in health expenditure by electricity expenditure and age range are presented in Figure 4.
In Figure 4, the average adjusted predictions (AAP) for electricity expenditure by age range show the incremental effect of age. In comparison to people in lower age ranges, people over 60 (≥60) are associated with higher electricity expenditures and higher health expenditures. For instance, for the low (2) electricity expenditure category, the estimated overall health expenditure is 800 €/year for people between 15 and 29 (15-29) and increases to over 1200 €/year for people over 60 (≥60). These results are plausible since, as one of the most vulnerable segments of the population, the elderly tend to incur increased health expenditures. Also, given their need to keep warm at home, they have been known to increase energy expenditures [74,75].
All age groups present higher health expenditures than the reference age level (15-29 years old). Low and medium electricity expenditure categories experience a decrease in health expenditure, in comparison to very low electricity expenditure, across all age ranges, as illustrated in Figure 4. However, this difference is particularly remarkable for people over 60 (≥60), for whom the difference in health expenditures relative to the reference level amounts to −110 €/year. It should also be noted in Figure 4 that people over 60 (≥60) with very low electricity expenditure are estimated to have health expenditures almost as high as the highest electricity expenditure groups (1400 €/year for the former versus 1600 €/year for the latter).
A similar interpretation is required for the increasing number of people above 65 living in the same house, i.e., they tend to increase health expenditures. Although for each electricity expenditure category, the higher number of people above 65 tends to lead to a higher health expenditure in Figure 5, low and medium electricity expenditure categories tend to have lower health expenditures in comparison to very low electricity expenditure.
Figure 5. AAP for health expenditures by electricity expenditure and number of people above 65.
In Figure 5, households with one household member over 65 years of age seem to have health expenditures close to the ones with two or more household members over 65 years. The decreasing impact of one additional household member in electricity expenditures has been reported in previous studies [55]. These results are aligned with recent European Union trends, that claim that a growing share of older people live alone, i.e., in households composed of a single person, particularly older women [76]. Further information regarding health status and living conditions is required to draw any conclusions regarding this result. Yet, even taking into consideration the increasing effect of age-related variables, these results are aligned with prior trends for electricity, in Figure 2.
Moreover, the relationship between other significant variables for health expenditures should also be considered to establish whether this result of (very) low electricity-high health expenditures takes place in a low-income context, possibly affecting vulnerable segments of the population. Low electricity-high health expenditure within a low-income context could configure a situation of energy insecurity/poverty at household level. In southern European countries, Castaño-Rosa et al. (2020) claim that people in cold and energy inefficient households run the risk of developing cold-related diseases while struggling to meet their energy needs [77]. Among the health impacts in household communities, energy insecurity has been identified as being significantly associated with poor respiratory, mental health, and sleep outcomes by Hernandez and Siegel (2019) [78]. This risk could lead to an increased need for healthcare services, which could contribute to increasing household health expenditure burden, that is aligned with the obtained low electricity-high health expenditure situation. Furthermore, despite the warm weather, Portugal has been repeatedly considered one of the countries most affected by this issue [3,79].
Income is also amongst the statistically significant variables for both parts of the model. Overall, according to Table 3, the likelihood of having health expenditures is increased by 0.119 in the logit model (part 1). This means that, in comparison to the reference level (low income), the probability of incurring health expenditures increases by 10.63%. This result is significant at the 5% level. The greater impact of this explanatory variable is felt in the GLM model (part 2), with a higher coefficient and significance level. Conditional on having health expenditures, the amount of health expenditures increases by 19.48% ($0.1948 = e^{0.178} - 1$) in comparison to the low-income level. This result is statistically significant at the 1% level.
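The percentage effects quoted for part 2 follow from exponentiating the GLM coefficient and subtracting one. As a minimal sketch (the function name is illustrative, and a log link is assumed, as implied by the conversion used in the text), the calculation for the income coefficient (0.178) and for the age-range coefficient reported earlier (0.097) is:

```python
import math


def percent_change_from_log_link(beta):
    """Relative change in the conditional mean for a one-unit change
    in a covariate, assuming a GLM with a log link: exp(beta) - 1."""
    return math.exp(beta) - 1.0


income_effect = percent_change_from_log_link(0.178)
age_effect = percent_change_from_log_link(0.097)
print(f"income: {income_effect:.2%}")  # income: 19.48%
print(f"age range: {age_effect:.1%}")  # age range: 10.2%
```

This conversion applies only to the amount of expenditure among households that spend; the probabilities reported for part 1 come from Equation (2) and are not reproduced here.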
Obtained results regarding the relationship between health expenditures-income-age and electricity expenditures are illustrated in Figure 6. As expected, health expenditures increase according to age and income level. This is visible for the adjusted prediction values in Figure 6. These results are in accordance with prior results for age-related variables.
This is a transversal pattern, across all income levels. It shows that regardless of the income level, the lowest electricity expenditure category tends to have higher health expenditures than the following category, as illustrated in Figure 7. These results point to the existence of a relationship that may be described as a (very) low electricity expenditure-high health expenditure situation across all income levels. The identified U-trend for health expenditures would be consistent with householders, particularly elderly, who are more likely to spend more time in hospital or day-care centers than in their home, decreasing their electricity expenditures and increasing health expenditures. Thus, this emphasizes the relevance of these expenditures in relation to income and age and the need to further investigate them. These results for (very) low electricity expenditures could imply differences from an energy efficiency perspective that require further research related to the household building characteristics for all income levels, which are not available in the database. However, this is particularly relevant for the vulnerable segments of the population, with low-income levels, since it configures the previously identified possibility of household energy poverty. According to Mohan (2021), despite this issue being recognized as a concern, its impacts are still understudied for vulnerable segments of the population [80].
These results draw the attention of policy makers to an issue that could easily be overlooked, given the lower share of low-income and very low electricity expenditure households in the sample. They also allow policy makers to target this vulnerable group as a priority candidate for efficiency improvements at household level. Efficiency improvements at envelope level, such as wall insulation and a change to energy-efficient doors and windows, have been known to improve indoor temperatures [81]. Cold homes associated with low energy efficiency have been identified as one of the causes of the inability to reconcile indoor comfort with increasing energy expenditures (see [75]). Therefore, based on the obtained results, this work somewhat supports the view that investment in energy efficiency could contribute to reducing both energy and health expenditures for vulnerable populations, such as low-income elderly households.
Furthermore, given the high differentiation of results by income, age and electricity expenditure categories and their statistical significance, it is plausible that the trends identified for vulnerable households could be used as a stepping stone to further explore housing conditions in order to address electricity and health expenditures. The information provided in this study on the relationship between income, age, electricity and health could then be crucial to identify and target priority action groups for the future development of policies at local community level, taking into consideration the insight from energy and health stakeholders.
This inclusive and multidisciplinary approach, suggested by obtained results, supports Kahouli's (2020) idea that, in the context of energy poverty, housing energy efficiency measures might constitute a solution to promote the reduction of public health care expenditure in the long term [51]. "Local action" has been considered by Bergman and Foxon (2020) as a potential aspect/driver for reframing housing energy efficiency policies, by engaging and promoting local authorities or local and regional partnerships [82]. Meanwhile, according to Mallabanda and Lipson (2020), health and thermal comfort have been identified within a range of householder needs that low carbon energy policies should meet in order to be successful [83]. In fact, a recent study by Ahn (2021) has emphasized the relevance of developing new modelling approaches to improve efficiency of energy use that determine thermal quality of the space [84]. These results are also reinforced by the study of Pais-Magalhães et al. (2020), that emphasized the challenges electricity consumption at household level faces with the aging population, requiring the development of energy efficiency policies that take into consideration the needs of the elderly [85].
A dominance analysis (DA) is also used to determine the rank of energy expenditure as an explanatory variable for health expenditure. It allows us to establish the relative importance, or explanatory power, of the independent variables considered, from most relevant to least relevant. According to the results, and for those with health expenditures, electricity expenditure emerges as a relevant explanatory variable for the value of health expenditures, though it has often been neglected in this context. It is ranked in fifth place out of 26 explanatory variables included in the two-part model. In the top 5, electricity expenditures are preceded by income, the number of people above 65, the age range and education level, as presented in Table 4. These values are based on changes in $R^2$, and the dominance standardized estimates represent the "proportion of each factor's explained variance of the variance explained by the whole model" [86]. In this particular case, DA was applied to those who present health expenditures. Therefore, it is not expected to be reflective of the entire model, but it gives an idea of the relative importance of the independent variables of interest in explaining the variance in the dependent variable. For instance, based on Table 4, electricity expenditures played a relevant role in DA, explaining 6.71% of the variance of health expenditure. This DA further emphasizes the need to consider energy-related variables as explanatory variables for health issues. Although all top-ranked explanatory variables show high statistical significance (1% level), this ranking presents some differences from the ordering suggested by the coefficients of the two-part model; this is because the coefficients are not directly comparable, as different scales are used for each variable. Dominance analysis also reinforced the connections explored between different explanatory variables.
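For readers unfamiliar with dominance analysis, the following sketch illustrates the general idea of ranking predictors by their average incremental contribution to $R^2$ across all subsets of the other predictors. It is not the Stata procedure used in this study: it assumes an ordinary least squares fit, and the variable names and data values are hypothetical toy values chosen only to make the example run.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on an intercept plus the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def general_dominance(data, y):
    """Average incremental R^2 of each predictor over all subsets of the others,
    averaged first within each subset size and then across sizes."""
    names = list(data)
    stats = {}
    for name in names:
        others = [n for n in names if n != name]
        size_means = []
        for size in range(len(names)):
            gains = []
            for subset in combinations(others, size):
                cols = [data[s] for s in subset]
                gains.append(r_squared(cols + [data[name]], y)
                             - r_squared(cols, y))
            size_means.append(sum(gains) / len(gains))
        stats[name] = sum(size_means) / len(names)
    return stats


# Hypothetical toy data (not from the HBS microdata).
data = {
    "income": [1, 2, 3, 4, 5, 6, 7, 8],
    "age": [3, 1, 4, 1, 5, 9, 2, 6],
    "electricity": [2, 7, 1, 8, 2, 8, 1, 8],
}
health = [5.1, 6.0, 9.2, 8.1, 11.9, 16.2, 10.8, 15.5]

gd = general_dominance(data, health)
full_r2 = r_squared(list(data.values()), health)
shares = {name: value / full_r2 for name, value in gd.items()}
for name in sorted(shares, key=shares.get, reverse=True):
    print(f"{name}: {shares[name]:.1%} of explained variance")
```

The general dominance statistics add up to the $R^2$ of the full model, so the standardized shares sum to one and can be read as each variable's portion of the explained variance, in the sense of the quotation from [86].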
The relationship between energy, age, income and health is also in accordance with other emerging concepts such as aging in place, and healthy aging that contribute towards a more efficient, greener and age-friendly city.

Conclusions and Future Research
Overall, the aim of the present study, using the two-part model and microdata from Eurostat Household Budget Survey (HBS), was to investigate the association between energy and health expenditures and determine the role of electricity expenditures as an explanatory variable for health expenditures.
Our findings suggest that given the high significance and coefficient value, electricity expenditure is a relevant explanatory variable for health expenditures. This result is further validated by the top 5 position in the dominance analysis ranking. The assessment with other explanatory variables of interest, such as income and age-related variables, enabled us to identify vulnerable target groups, and could act as triggers for further research in unison with housing and health conditions at the basis for the development of policies, with the aim of reducing health expenditures through the reduction of energy expenditures.
This study contributes to demonstrating the existence of a relationship that can be described as (very) low energy expenditure-high health expenditure in Portugal. This might be a pattern across all income levels; nonetheless, the obtained results call the attention of policy makers to the low-income-(very) low energy expenditure-high health expenditure relationship that could configure a household energy poverty situation. The (very) low energy expenditure-high health expenditure trend, common to all income levels, also enables some considerations as to how energy efficiency could be experienced across several segments of the population. In this particular case, the evidence suggests that, regardless of the income level, energy efficiency benefits could be experienced differently by households with very low energy expenditures. In households with very low electricity expenditures, energy efficiency measures could further reduce spending, although only to a limited extent. However, these energy efficiency measures are expected to have a major impact on improving living conditions, namely by ensuring an adequate indoor temperature. Therefore, the development of energy efficiency policies should contribute to improving indoor temperature without necessarily increasing energy expenditures. Especially for the abovementioned vulnerable segment of the population, these policies should focus on the improvement of living conditions, thereby enabling a decrease in health expenditures. Meanwhile, for higher income segments, energy efficiency policies could translate into both energy expenditure savings and higher indoor temperatures, potentially also leading to a health co-benefit and a reduction in health expenditures.
The translation of energy efficiency into health co-benefits requires additional information on building characteristics and health status. These findings support the need to assess investment in improving housing conditions under a social cost-benefit analysis. Recent studies consider new policy framings such as new business models to incorporate health and wellbeing into housing energy efficiency policies [82,87]. Meanwhile, Ezratty et al. (2018) have proposed a methodology to assess the health cost implications of fuel poverty [88]. The evaluation of the economic interest of energy efficiency projects must then go beyond the traditional estimation of energy cost reduction and must recognize also likely health gains and avoided medical costs. Therefore, in this sense, housing energy efficiency measures could improve and encourage the development of efficient public health policies regarding health expenditures, based on the assessment of energy efficiency's health gains and avoided medical costs.
The present study provides key information regarding the relevance of energy-related and health expenditures, which could be of use for vulnerable households that suffer added pressure in the context of climate change. However, a deeper understanding of the interaction between housing conditions and health status, such as chronic conditions under a changing climate, is required. Thus, the obtained results also call the attention of policy makers to the need to take into consideration not only economic and environmental impacts but also health impacts in the future design of energy efficiency policies. The implementation of energy efficiency measures that take into consideration the interaction between energy, other explanatory variables and health could contribute towards climate change targets by reducing energy expenditures while protecting health from inadequate heat or cold exposure. Other emerging concepts such as "ageing in place", which ultimately contribute towards the accomplishment of energy and health-related SDGs, could also be simultaneously promoted.
Yet, it is recognized that the results of the current research are limited by the nature of the data available in the HBS dataset, since they lack specific information regarding health (e.g., health outcomes), energy efficiency (e.g., level of thermal insulation) and appliances (e.g., heating and cooling systems ownership). Further efforts should, therefore, focus on the inclusion of building and health status information, to understand how living conditions influence health and how to develop synergies with building energy efficiency strategies.
Author Contributions: Conceptualization, F.L., P.F. and V.L.; methodology, F.L.; software, F.L.; supervision, P.F. and V.L.; validation, P.F. and V.L.; formal analysis, F.L., P.F. and V.L.; investigation, F.L., P.F. and V.L.; resources, F.L., P.F. and V.L.; data curation, F.L. and V.L.; writing-original draft preparation, F.L.; writing-review and editing, P.F. and V.L. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: 3rd Party Data Restrictions apply to the availability of data used and the STATA model developed. Data was obtained from Eurostat HBS database (https://ec.europa.eu/eurostat/web/microdata/overview), accessed on 23 October 2019. A confidentiality agreement was signed and the data cannot be publicly disclosed.