The Combined Effect of Indoor Air Quality and Socioeconomic Factors on Health in Northeast China

Abstract: Research has increasingly demonstrated that complex relationships exist between residential indoor air quality, health and socioeconomic factors. However, few studies have provided a comprehensive understanding of these relationships. The purpose of this paper, therefore, was to use structural equation modeling to identify the combined effect of residential indoor air quality and socioeconomic factors on occupants' health, based on field measurement data in Northeast China. The results showed that socioeconomic status had a direct impact on the occupants' health with a path coefficient of 0.413, whereas the effect of indoor air quality was 0.105. Socioeconomic status had a direct effect on indoor air quality with a path coefficient of 0.381. The weights of PM2.5, CO2, TVOC (Total Volatile Organic Compounds), and formaldehyde concentration in indoor air quality were 0.813, 0.385, 0.218, and 0.142, respectively. The relative contributions of income level, education level, and occupation prestige to socioeconomic status were 0.595, 0.551, and 0.508, respectively. Relationships between indoor air quality, socioeconomic factors and health were further confirmed by multiple group analysis. The study defines and quantifies complex relationships between residential indoor air quality, socioeconomic status and health, which will help improve knowledge of the impacts of the residential indoor environment on health.


Introduction
China's rapid economic growth, urbanization, and industrialization over the past four decades have increased concern about air pollution. It was estimated that air pollution in China led to approximately 1.6 million deaths in 2015 [1] and cost 1.4 trillion dollars in 2010 [2]. Moreover, millions of Chinese people in modern society spend about 83.3% of their time indoors (in residences, office buildings, schools, daycare centers and public spaces) [3]. Approximately 67% of the time spent indoors is spent in their residences [4], and the figure is even higher during the current novel coronavirus pneumonia epidemic. The indoor air quality in residential buildings has far-reaching implications for occupants' health and wellbeing.
Considerable scientific progress has been made in the understanding of health effects related to indoor air quality over the past decades. These efforts tend to address indoor air pollution sources and exposures and the associated health concerns, such as sulfur dioxide [5], nitrogen oxides [6], ozone [7], particulate matter [8], formaldehyde [9], radon [10], tobacco smoke [11], VOC [12], pesticides [13] and SVOC (Semi-Volatile Organic Compounds) [14]. More recently, research has increasingly shown the effect of socioeconomic factors (e.g., income, education, and job) on the relationship between indoor air quality and occupants' health. On the one hand, socioeconomic factors can influence indoor air pollution exposures directly. For example, low-income families were likely to live in dilapidated houses with deteriorating building structures and inferior building materials, which led to high [...]
The investigation of the residential indoor environment consisted of two parts: measurements of indoor air environment parameters and a self-administered questionnaire survey. CO2, PM2.5, TVOC, and formaldehyde concentration were selected as the criteria for indoor air quality, because CO2 concentration is a good indicator of contamination caused by persons indoors, while TVOC and formaldehyde (HCHO) concentration are indicators of indoor air pollution by building materials and fittings [29]. PM2.5 concentration represents indoor air contamination caused by particulate matter. Table 1 shows the characteristics of the measuring instruments. Indoor air parameters were monitored at the breathing zone height, i.e., 1.1 m above the floor, as recommended by the National Indoor Air Quality Standard (GB/T 18883-2002) [30]. Moreover, indoor environmental parameters were measured simultaneously in the living room, bedroom, and kitchen to represent the conditions of the indoor environment accurately. The measurement data were recorded every five or ten minutes.

[...] (Table S1). The reliability and validity of the Chinese version of the SF-8 Health Survey have been confirmed [31]. The SF-8 Health Survey includes physical and mental components, and each component covers four sub-scales. Physical health: physical functioning (PF), role physical (RP), bodily pain (BP) and general health (GH). Mental health: vitality (VT), social functioning (SF), role emotional (RE) and mental health (MH). For example, role emotional refers to 'have you had problems with your work or other regular daily activities (such as walking or climbing stairs) as a result of any emotional problems?' Bodily pain indicates 'how much bodily pain have you had?' Each sub-scale was scored on a scale of 1 to 4 (1 represents the worst and 4 the best health status).
-Socioeconomic status: Socioeconomic factors are typically determined by the level of education, income, and occupation prestige [32]. This survey used the combination of education level (1 = 'primary school', 2 = 'middle school', 3 = 'professional', 4 = 'university', 5 = 'master, PhD, or specialization'), income level (1 = 'low': <5000 yuan/month, 2 = 'middle': 5000~10,000 yuan/month, 3 = 'high': >10,000 yuan/month), and occupation prestige (1 = 'low', 2 = 'middle', 3 = 'high') as the measurement;
-Lifestyle: smoking status and alcohol consumption ('Frequently' = 0, 'Sometimes' = 1, 'Rarely' = 2, and 'Not at all' = 3);
-Personal characteristics: gender, age, and length of residence.
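The coding scheme above can be sketched in a few lines of Python. This is only an illustration: the function and dictionary names are hypothetical, and the handling of incomes exactly at 5000 or 10,000 yuan/month (placed in the 'middle' band here) is an assumption based on the stated ranges.

```python
# Illustrative coding of questionnaire answers (names are hypothetical).

EDUCATION_CODES = {
    "primary school": 1,
    "middle school": 2,
    "professional": 3,
    "university": 4,
    "master, PhD, or specialization": 5,
}

# Smoking status and alcohol consumption share the same frequency scale.
LIFESTYLE_CODES = {"Frequently": 0, "Sometimes": 1, "Rarely": 2, "Not at all": 3}


def income_level(yuan_per_month: float) -> int:
    """Return 1 (low), 2 (middle) or 3 (high) for a monthly income in yuan."""
    if yuan_per_month < 5000:
        return 1
    if yuan_per_month <= 10000:  # boundary values assumed to be 'middle'
        return 2
    return 3
```

For instance, `income_level(7500)` falls in the middle band, and `EDUCATION_CODES["university"]` gives the code used for a university education.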

Structural Equation Modeling
Structural equation modeling was used to define and quantify the complex relations between residential indoor air quality, socioeconomic status and health, as shown in Figure 2. The model consists of two main components: (1) the measurement model, which shows the relations between the latent variables (indoor air quality, socioeconomic status, and health, shown as ellipses) and their measurement indicators (shown as rectangles); and (2) the structural model, which specifies the relations between the latent variables (indoor air quality, socioeconomic status, and health).

The Measurement Model
There are two types of measurement models in Figure 2: the reflective model (direction of the arrow: latent variables→observed variables) and the formative model (direction of the arrow: observed variables→latent variables). The choice between them depends on the relationship between the latent variables and their indicators [33]. For example, the indoor air environment includes CO2, PM2.5, TVOC, and formaldehyde concentration, and it is difficult to use a single index that quantifies an individual's response to all these factors [34]. Hence, indoor air quality was measured by combining these factors to form an index in the formative model. In contrast, the measurement indicators of health, such as physical functioning (PF), role physical (RP), and bodily pain (BP), were manifestations of health and were therefore modeled reflectively. The details are as follows.
In formative models, the latent variable (LV) is defined in full by the corresponding observable variables (MVs) [35]. The LV is a linear function of the MVs,

$$\mathrm{LV} = \sum_{i=1}^{n} \lambda_i \, \mathrm{MV}_i \tag{1}$$

where λ determines the relative contribution of an observable variable to the corresponding latent variable, and is called the outer weight. The range of λ is [−1, 1]. Conversely, in the reflective model each MV is associated with the corresponding LV by a linear regression,

$$\mathrm{MV} = \omega \, \mathrm{LV} + \delta \tag{2}$$

where ω shows the absolute contribution of an observable variable to the corresponding latent variable, and is called the outer loading. The range of ω is also [−1, 1]. The δ represents the measurement error.
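As a rough numerical illustration of the two measurement models, the sketch below forms a formative score as a weighted sum of indicators and a reflective prediction as loading times latent score. The outer weights are the indoor air quality weights reported in this paper; the function names and the assumption that the indicator scores are already standardized are illustrative, and the sketch does not reproduce the PLS-SEM estimation itself.

```python
# Outer weights reported for the indoor air quality construct.
OUTER_WEIGHTS = {"PM2.5": 0.813, "CO2": 0.385, "TVOC": 0.218, "HCHO": 0.142}


def formative_score(standardized_indicators):
    """Weighted sum of indicator scores (formative model, no error term)."""
    return sum(
        weight * standardized_indicators[name]
        for name, weight in OUTER_WEIGHTS.items()
    )


def reflective_prediction(loading, lv_score):
    """Expected indicator value given the latent score (error term omitted)."""
    return loading * lv_score


# Hypothetical house whose four indicators all sit one unit above the mean.
example_house = {"PM2.5": 1.0, "CO2": 1.0, "TVOC": 1.0, "HCHO": 1.0}
iaq_score = formative_score(example_house)
```

The asymmetry is the point of the sketch: in the formative case the indicators build the latent score, whereas in the reflective case the latent score predicts each indicator.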

The Structural Model
The relationships between the latent variables, namely indoor air quality (LV_air), socioeconomic status (LV_ses) and health (LV_health), were described by the structural model (Figure 2) as follows:

$$\mathrm{LV}_{\mathrm{health}} = \eta_3 \, \mathrm{LV}_{\mathrm{air}} + \eta_2 \, \mathrm{LV}_{\mathrm{ses}} + \sigma_{\mathrm{health}} \tag{3}$$

$$\mathrm{LV}_{\mathrm{air}} = \eta_1 \, \mathrm{LV}_{\mathrm{ses}} + \sigma_{\mathrm{air}} \tag{4}$$

where η is equivalent to the standardized beta in a linear regression model, and is called the path coefficient. The σ represents the error term.
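One consequence of this two-equation structure is that socioeconomic status can reach health both directly (η2) and through indoor air quality (η1 × η3). The sketch below applies the usual path-tracing rule to the path coefficients reported in this paper. The indirect and total effects are not reported by the authors, so they are shown here only as an illustrative calculation, and the function name is hypothetical.

```python
def health_effects(eta1, eta2, eta3):
    """Direct, indirect and total effect of socioeconomic status on health.

    eta1: SES -> indoor air quality
    eta2: SES -> health (direct)
    eta3: indoor air quality -> health
    """
    direct = eta2
    indirect = eta1 * eta3  # path through indoor air quality
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Path coefficients reported for the full sample.
effects = health_effects(eta1=0.381, eta2=0.413, eta3=0.105)
```

With these inputs the indirect path is roughly one tenth of the direct one, so the total effect of socioeconomic status is dominated by the direct path.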
The model was estimated with the partial least squares SEM (PLS-SEM) algorithm using the software SmartPLS 3.0 [36]. The PLS-SEM algorithm is generally superior to others when formatively measured latent variables are part of the structural model and the sample size is small [37]. In the PLS-SEM algorithm, the measurement models and the structural model are combined, and the model parameters are estimated by maximizing the explained variance of the dependent latent variables. Henseler et al. provided a detailed description of the algorithm [38]. The model inputs were the average values of the indoor air environmental parameters during the measuring period, together with the corresponding questionnaire data (health variables and socioeconomic status). However, the average values of the indoor environmental parameters cannot be imported into the model directly, because the parameters are expressed in different units. To overcome this problem, we classified the indoor environment into four categories according to the EN 15251 Standard [39]. Categories I to III indicate the high level, normal level, and acceptable level of expectation, respectively. Category IV represents values outside the criteria for the above categories. Categories I to IV are scored on a scale of 4 to 1, whereby 1 represents the lowest and 4 the highest performance. [...] [41]. Finally, the mean values of the indoor air environment parameters were scored from 1 to 4 on the basis of Table 2.
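The category scoring step can be sketched as a simple lookup. The limits of Table 2 are not reproduced in this text, so the limits in the example below are placeholders chosen only to make the sketch runnable. The function name and the convention that a value equal to a limit still belongs to the better category are also assumptions.

```python
from bisect import bisect_left


def category_score(mean_value, upper_limits):
    """Score a mean pollutant concentration from 4 (Category I) to 1 (Category IV).

    upper_limits: ascending upper bounds of Categories I, II and III.
    A lower concentration is treated as better performance.
    """
    # bisect_left counts the limits strictly below the value, i.e. how many
    # categories the value has already exceeded.
    return 4 - bisect_left(upper_limits, mean_value)


# PLACEHOLDER limits, not the values of Table 2 or EN 15251.
PLACEHOLDER_LIMITS = (100.0, 200.0, 300.0)

scores = [category_score(v, PLACEHOLDER_LIMITS) for v in (50.0, 150.0, 250.0, 350.0)]
```

Scoring every parameter on the same 1 to 4 scale is what removes the unit mismatch between, for example, ppm for CO2 and µg/m³ for PM2.5.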

Model Evaluation
The model should be evaluated on the basis of the computed results. Hair et al. provided a detailed description of how to assess PLS-SEM results of reflective measurement models, formative measurement models, and the structural model [37]. Formative measurement models are assessed on the significance and relevance of their indicators (p value < 0.05) and on the presence of collinearity among indicators (variance inflation factor, VIF < 5). Reflective measurement models are evaluated by composite reliability (as a means of evaluating internal consistency reliability, >0.7), convergent validity (average variance extracted, AVE > 0.5), and discriminant validity (Fornell-Larcker criterion: the average variance extracted should exceed the squared correlation with any other latent variable). The evaluation of the structural model involves collinearity (variance inflation factor, VIF < 5), the significance and relevance of the structural model relationships (p value < 0.05), predictive accuracy (coefficient of determination, R² > 0.2), and predictive relevance (Stone-Geisser's Q² value, >0).

Table 3 presents information on the measured houses and the respondents' characteristics. About 69.2% of the residential buildings were completed in 2000 or after. In 81.5% of the investigated buildings, the number of people per unit floor area exceeded 0.02. A total of 151 respondents participated in the survey. Approximately 52.4% of the respondents were women. The age of most of the subjects (66.9%) ranged from 30 to 50 years. A total of 33.8% of respondents had a middle school education or less. Most respondents (74.8%) had resided in the surveyed buildings for more than five years. With regard to lifestyle, 70.8% of the subjects reported no smoking and 36.2% no drinking.

According to Figure 3, average CO2 and PM2.5 concentrations were high in the investigated houses. As can be seen from Figures S1 and S2, CO2 concentration remained high, especially at night. CO2 and PM2.5 exposures peaked during certain periods, such as cooking time. This was because ventilation was insufficient in most of the investigated houses. It was estimated that a large majority of residents opened windows for less than 10 min per day to avoid heat loss [42], and approximately 71% of homes had an air change rate lower than 0.5 h⁻¹ in winter [43]. Moreover, most residential buildings in Northeast China lacked mechanical ventilation systems. Furthermore, burning coal was still the main source of central heating, leading to indoor and outdoor environmental pollution [44].

Figure 4 shows the structural equation modeling results inferred from the field measurement data.

Model Result and Evaluation
The results indicate that indoor air quality had a direct impact on the occupants' health with a path coefficient of 0.105, whereas the effect of socioeconomic status was 0.413. The effect of socioeconomic status on indoor air quality was 0.381. As for the measurement model of indoor air quality, the greatest weight came from PM2.5 (0.813), followed by CO2 (0.385), TVOC (0.218), and formaldehyde (0.142). Income level (0.595) was the major indicator contributing to socioeconomic status, followed by education level (0.551) and occupation prestige (0.508). With respect to the measurement model of health status, physical functioning (0.849) had the greatest absolute contribution, followed by role emotional (0.773), vitality (0.736), role physical (0.677), mental health (0.565), general health (0.564), bodily pain (0.545), and social functioning (0.491).
Table 4 summarizes the results of the model evaluation. In the formative measurement models, the indicators' variance inflation factors (VIF) were uniformly below the threshold value of 5; hence, the level of collinearity was very low. Moreover, all indicators in the formative models were significant (p < 0.01). In the reflective model, the composite reliability for health (0.853) was larger than the threshold value of 0.7. The average variance extracted (AVE) for health (0.637) was greater than the threshold value of 0.5, confirming convergent validity. The square root of the AVE for health (0.798) was larger than its correlations with other variables, indicating well-established discriminant validity. Finally, all VIF values were lower than the threshold value of 5, suggesting that collinearity was not an issue in the structural model either. All relationships in the structural model were significant. The coefficients of determination for health (0.305) and indoor air quality (0.220) were above the threshold value of 0.2. The predictive relevance values for indoor air quality (0.110) and health (0.283) were greater than the threshold value of 0, supporting the predictive relevance of the model. In summary, the model met all requirements, indicating that it was reliable and valid.
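The evaluation above amounts to comparing each reported statistic with its threshold, which can be written out as a small check. The statistics and thresholds are the ones stated in this paper, while the dictionary keys and variable names are hypothetical. The last line recomputes the square root of the AVE used in the Fornell-Larcker comparison.

```python
import math

# Reported statistics paired with the thresholds they must exceed.
REPORTED = {
    "composite_reliability_health": (0.853, 0.7),
    "ave_health": (0.637, 0.5),
    "r2_health": (0.305, 0.2),
    "r2_indoor_air_quality": (0.220, 0.2),
    "q2_health": (0.283, 0.0),
    "q2_indoor_air_quality": (0.110, 0.0),
}

# True for every criterion whose reported value exceeds its threshold.
checks = {name: value > threshold for name, (value, threshold) in REPORTED.items()}

# Square root of the AVE, to be compared with the construct's correlations.
sqrt_ave_health = math.sqrt(REPORTED["ave_health"][0])
```

The collinearity (VIF) and significance (p value) criteria are left out because the individual values are not listed in this text.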

Link Between Socioeconomic Factors, Indoor Air Quality, and Health
This paper confirmed the complex relationship between socioeconomic factors, indoor air quality and occupants' health (Figure 4). Socioeconomic factors (0.413) had a stronger direct impact on health than indoor air quality (0.105). Although no literature specifically indicates the relative weights of socioeconomic factors and indoor air quality for health, research has increasingly pointed out that the contribution of socioeconomic factors (e.g., education, income, and work conditions) exceeds that of the physical environment (e.g., air quality and drinking water). For example, America's Health Rankings estimated that the weight of socioeconomic factors for health (0.27) was three times that of the physical environment (0.09) [45]. County Health Rankings established the relative contributions of socioeconomic factors (0.4) and physical environment (0.1) to health [46]. The University of Wisconsin Population Health Institute assigned weights to health as follows: socioeconomic factors, 0.4; physical environment, 0.05 [47]. There is thus no consensus on the relative contributions of socioeconomic factors and the physical environment to health, but the weight of the physical environment is consistently smaller than that of socioeconomic factors.
Income level (0.595 × 0.413 = 0.246), education level (0.551 × 0.413 = 0.228), and occupation prestige (0.508 × 0.413 = 0.209) carried similar weights for health. This may be due to the interplay among the three factors. A higher education level is usually connected to a higher income level and better employment options [48]. It is estimated that each additional year of schooling raises annual income by approximately 11% [49]. Higher income jobs usually provide not only a safe work environment, but also health insurance and paid sick leave [50].

As for indoor air quality, PM2.5 concentration (0.813 × 0.105 = 0.085) was the major indicator contributing to health, followed by CO2 (0.385 × 0.105 = 0.04), TVOC (0.218 × 0.105 = 0.02) and formaldehyde (0.142 × 0.105 = 0.01). The weighting distribution can be explained as follows. Approximately 61.7% of the average PM2.5 concentrations exceeded the 35 µg/m³ level specified by the WHO air quality guidelines [51], as shown in Figure 3. The average CO2 level was above 1000 ppm (National Indoor Air Quality Standard GB18883-2002 [30]) in nearly 40.7% of the investigated houses. Only 7.4% of homes failed to meet the TVOC concentration standard (0.6 mg/m³, National Indoor Air Quality Standard GB18883-2002). Furthermore, the formaldehyde concentration in all the houses was below 100 µg/m³ (National Indoor Air Quality Standard GB18883-2002). As can be seen, TVOC and formaldehyde, as the indicators of indoor air pollution by building materials and fittings [29], were not major indoor air pollutants in the investigated houses. The primary reason is that about 82.8% of the residential buildings were completed in 2010 or before (Table 3). Indoor air pollution from PM2.5 and CO2 (the indicator of contamination caused by persons) has presented a noteworthy challenge. Moreover, CO2 concentration was related to the density of people indoors (as shown in Figure S3). About 42% of the variance in the indoor CO2 concentration can be explained by the number of persons per floor area.

Socioeconomic factors, including income level (0.595 × 0.381 = 0.227), education (0.551 × 0.381 = 0.209), and occupation prestige (0.508 × 0.381 = 0.194), made similar contributions to indoor air quality. According to Table 5, PM2.5 and CO2 concentration both decreased steadily as education level increased. For example, PM2.5 and CO2 concentration dropped by 8.4% and 10.4%, respectively, from the middle school or less subgroup to the university or professional subgroup. Despite some minor fluctuations, TVOC and HCHO concentration showed a downward trend. For the income level and occupation groups, PM2.5 and CO2 concentration followed a similar declining pattern. TVOC concentration generally continued to decrease, while HCHO concentration fluctuated slightly. Furthermore, there were significant differences in smoking status among the education level subgroups. A total of 45.0% of respondents reported smoking in the middle school or less subgroup, but the proportion dropped sharply to just 15.3% in the master, PhD, or specialization group. A large body of research has demonstrated that smoking has a great influence on indoor pollutants [52]. It was estimated that increases in PM2.5 concentration in homes with smokers ranged from 25 to 45 µg/m³ [53]. Smoking also strongly influenced the indoor concentration of VOC (p < 0.05) [54]. Moreover, smoking is known to have significant effects on health, such as carcinogenic, reproductive, and/or acute or chronic respiratory effects. For example, the Canadian Tobacco, Alcohol and Drugs Survey revealed that around 52.5% of current smokers reported 'excellent' or 'very good' health, compared to 58% of former smokers, and significantly lower than the 69% of never smokers [55]. Arday et al. estimated the effect of cigarette smoking on health among 26,504 American high school seniors, and found that respondents who were regular smokers were significantly more likely to report shortness of breath, coughing spells, productive cough, and wheezing or gasping [56]. There were no significant differences in household size among the income level subgroups, i.e., high income level (2.21), middle income level (2.12), and low income level (2.21). However, the average floor area in the high income subgroup was 144.9 m², which was much larger than in the low income subgroup (68.7 m²). Hence, crowding may be common in the low income subgroup, leading to high indoor air pollutant levels.
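The indicator-level contributions quoted in this section are products of an outer weight and a path coefficient, which the sketch below recomputes from the rounded values given in this paper. The helper name is hypothetical. Because the inputs are already rounded to three decimals, the products can differ from the printed figures in the third decimal place.

```python
SES_WEIGHTS = {"income": 0.595, "education": 0.551, "occupation": 0.508}
IAQ_WEIGHTS = {"PM2.5": 0.813, "CO2": 0.385, "TVOC": 0.218, "HCHO": 0.142}

PATH_SES_TO_HEALTH = 0.413
PATH_IAQ_TO_HEALTH = 0.105
PATH_SES_TO_IAQ = 0.381


def scale(weights, path_coefficient):
    """Multiply each outer weight by the path coefficient it feeds into."""
    return {name: weight * path_coefficient for name, weight in weights.items()}


ses_to_health = scale(SES_WEIGHTS, PATH_SES_TO_HEALTH)
iaq_to_health = scale(IAQ_WEIGHTS, PATH_IAQ_TO_HEALTH)
ses_to_iaq = scale(SES_WEIGHTS, PATH_SES_TO_IAQ)

for label, table in [("SES -> health", ses_to_health),
                     ("IAQ -> health", iaq_to_health),
                     ("SES -> IAQ", ses_to_iaq)]:
    print(label, {name: round(value, 3) for name, value in table.items()})
```

The recomputed products agree with the printed values to within 0.001, which is consistent with the authors having multiplied unrounded estimates.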

Multiple Group Analysis
The initial model (Figure 4) was tested for variation in the model relationships by multi-group analysis based on gender, age, and length of residence. Table 6 shows the distributions of path coefficients for these subgroups. The mean path coefficients (± standard deviation) for the effects of socioeconomic status and indoor air quality on health were 0.373 ± 0.055 and 0.166 ± 0.052, respectively, for gender; 0.404 ± 0.023 and 0.156 ± 0.040, respectively, for age; and 0.408 ± 0.062 and 0.138 ± 0.029, respectively, for length of residence. The impacts of socioeconomic status on indoor air quality were 0.321 ± 0.017, 0.359 ± 0.060, and 0.347 ± 0.029 for gender, age, and length of residence, respectively. There was little difference between these results and the initial model (Figure 4), i.e., 0.413, 0.105, and 0.381, correspondingly. Additionally, the relationship between indoor air quality and health was stronger for shorter lengths of residence. Perhaps this was because CO2, VOC and formaldehyde concentration all decreased steadily as the length of residence increased. For example, VOC and formaldehyde concentration dropped by 36% and 4.1%, respectively, from the '<5 years' to the '5-10 years' subgroup.
Despite the fluctuation of path coefficients across the gender, age, and length of residence subgroups, the complex interactions among indoor air quality, health, and socioeconomic status were verified.
Note: SES, socioeconomic status; IAQ, indoor air quality; SD, standard deviation; "→", direction of the effect.
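The comparison between the subgroup means and the initial model can be made explicit by tabulating the differences. The sketch below uses only the mean path coefficients quoted in this section. The key names are hypothetical, and the individual subgroup values of Table 6 are not reproduced in this text, so they are not used.

```python
# Full-sample path coefficients (Figure 4).
INITIAL = {"SES->health": 0.413, "IAQ->health": 0.105, "SES->IAQ": 0.381}

# Mean path coefficients across the subgroups of each grouping variable.
GROUP_MEANS = {
    "gender": {"SES->health": 0.373, "IAQ->health": 0.166, "SES->IAQ": 0.321},
    "age": {"SES->health": 0.404, "IAQ->health": 0.156, "SES->IAQ": 0.359},
    "length of residence": {"SES->health": 0.408, "IAQ->health": 0.138, "SES->IAQ": 0.347},
}

# Difference between each subgroup mean and the full-sample coefficient.
deviations = {
    group: {path: means[path] - INITIAL[path] for path in INITIAL}
    for group, means in GROUP_MEANS.items()
}

largest_shift = max(
    abs(delta) for by_path in deviations.values() for delta in by_path.values()
)
```

In this tabulation the subgroup means for IAQ→health all lie above the full-sample coefficient, while those for the two SES paths lie below it.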

Strengths and Limitations
This study was successful in its approach, resulting in an association model between residential indoor air quality, socioeconomic factors, and health based on 'real-world' data (field measurement data). The model reflected the combined effect of residential indoor air quality and socioeconomic factors on health. Moreover, indoor air quality and socioeconomic factors were each measured by a combination of multiple factors.
However, some limitations of this study should be noted. First, the model results may not reflect all residential indoor environment conditions in Northeast China because of sampling error. In fact, it is difficult to carry out large-scale measurement of the residential indoor environment because of privacy concerns and funding constraints. Considering that the lifestyle of residents and the climate across Northeast China are partly similar, this study selected 81 residences in six densely populated cities. Second, occupants' health status was measured only by self-reporting, and there are differences between respondents' self-reported health and their actual health status. Nevertheless, self-reported health is a widely used measure to predict health outcomes [57]. It has been consistently shown to predict morbidity and mortality, even after accounting for objective health status, behavioral risk factors, and sociodemographic characteristics [58], and mortality is considered the most objective measurement of the general health of an individual [59]. Moreover, self-rated health has been found to be a reliable measurement of general health, since respondents gave the same general health assessment within a period in which their health was unlikely to change [60]. However, it should be noted that health effects are seldom instantaneous. Most effects do not appear at the first exposure and do not disappear immediately after the exposure stops. The same was true for the 'health effects' measured in this study. Hence, there is a risk that this study overestimated factors that do not change in a short time (such as socioeconomic factors) compared with those that change over time (such as indoor air pollution factors). This problem will be considered in future studies. In addition, it is necessary to emphasize that this study identified an association model between residential indoor air quality, socioeconomic factors and self-reported health using the SEM method. Association is not synonymous with causality, and the causality of the identified relations should be confirmed in future research.

Conclusions
This study reports the combined effect of residential indoor air quality and socioeconomic factors on health, using structural equation modeling, on the basis of field measurements in Northeast China. The following conclusions were derived:
(1) Indoor air quality and socioeconomic status had direct effects on the occupants' health, with path coefficients of 0.105 and 0.413, respectively. Socioeconomic status had a direct effect on indoor air quality, with a path coefficient of 0.381.
(2) The weights of PM2.5, CO2, TVOC, and formaldehyde concentration in indoor air quality were 0.813, 0.385, 0.218, and 0.142, respectively. The relative contributions of income level, education level, and occupation prestige to socioeconomic status were 0.595, 0.551, and 0.508, respectively.
(3) Relationships between indoor air quality, socioeconomic factors and health were further confirmed for selected data groups (gender, age, and length of residence) using multiple group analysis.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-3417/10/8/2827/s1, Figure S1: CO2 concentration in the selected residential buildings, Figure S2: PM2.5 concentration in the selected residential buildings, Figure S3: The relation between the number of persons per floor area and average CO2 concentration for all the investigated houses, Table S1: The Short Form 8 Health Survey.