Development of Hourly Indoor PM2.5 Concentration Prediction Model: The Role of Outdoor Air, Ventilation, Building Characteristic, and Human Activity

Exposure to indoor particulate matter less than 2.5 µm in diameter (PM2.5) is a critical health risk factor. Measuring indoor PM2.5 concentrations is therefore important for assessing health risks and for investigating sources and influential factors. However, installing monitoring instruments to collect indoor PM2.5 data is difficult and expensive, so several indoor PM2.5 concentration prediction models have been developed. These models only assess daily average PM2.5 concentrations in cold or temperate regions, and the factors that influence PM2.5 concentration differ according to climatic conditions. In this study, we developed a prediction model for hourly indoor PM2.5 concentrations in Taiwan (a tropical and subtropical region) by using a multiple linear regression model and investigated the influential factors. The sample comprised 93 study cases (1979 measurements) and 25 potential predictor variables. Cross-validation was performed to assess performance. The prediction model explained 74% of the variation and included outdoor PM2.5 concentrations, the difference between indoor and outdoor CO2 levels, building type, building floor level, bed sheet cleaning, bed sheet replacement, and mosquito coil burning. In cross-validation, the model explained 75% of the variation on average. The results also confirm that the prediction model can be used to estimate indoor PM2.5 concentrations across seasons and areas. In summary, we developed a prediction model of hourly indoor PM2.5 concentrations and suggest that outdoor PM2.5 concentrations, ventilation, building characteristics, and human activities should be considered. Moreover, occupants should consider outdoor air quality when opening or closing windows or doors to regulate ventilation, and changing certain human activities can also reduce indoor PM2.5 concentrations.


Introduction
Particulate matter with an aerodynamic diameter of less than 2.5 µm (PM2.5) is a critical risk factor for hospital admission for respiratory [1,2] and cardiovascular diseases [1,3]. PM2.5 exposure is

Study Area and Design
The study framework is presented in Figure 1. The data analyzed were drawn from the DBH study in Tainan, Taiwan. Tainan (120°38′ E; 120°01′ E; 22°53′ N; 23°24′ N; area, 2199 km²) is a city in southwest Taiwan. It is the sixth largest city in Taiwan, with a population of over 1.88 million people. The climate region is split between tropical and subtropical regions. The annual temperature, relative humidity, and rainfall are 24.3 °C, 77.2%, and 1698 mm [34], respectively. In brief, the sample comprised 93 children (the DBH study had 101 participants, but eight were excluded because of missing PM2.5 data) aged from 3 to 9 years who completed questionnaires regarding the building characteristics of children's bedrooms, human indoor activities, furniture materials, and disease history in Tainan, Taiwan, from August 2008 to September 2009. The number of study cases was 21, 39, 13, and 20 in the spring, summer, fall, and winter, respectively. The indoor air quality (including PM2.5, CO2, and temperature) of the children's bedrooms was recorded during visits. The samplers also simultaneously collected outdoor air quality data, and the hourly outdoor PM2.5 concentrations of each household were also modeled using the Kriging model. We used simple linear regression to select variables, and MLR was used to estimate the final model. This study also employed cross-validation to assess the prediction model.

Building Characteristics, Human Indoor Activities, and Furniture Materials
During each visit of the DBH study, questionnaires were used to survey building characteristics (e.g., building age and type), human indoor activities (e.g., smoking, incense burning, or planting), and furniture materials (e.g., wood or leather). All sources of potential predictors are shown in Table S1. The data are summarized in Tables 1 and 2.

Indoor and Outdoor Air Quality
During each visit of the DBH study, the indoor PM2.5 instruments were placed near the wall of the children's bedrooms so that the occupants' daily activities, such as walking, were not obstructed, and the instruments' inlets were directed toward the air in the bedrooms. Another PM2.5 instrument was simultaneously used to collect outdoor PM2.5 concentrations on the balconies of the children's bedrooms during the same visits. When the outdoor PM2.5 instrument could not be placed on a balcony, the sampler was attached to the window of the child's bedroom and its inlet was connected to a tube for outdoor sampling. The instruments' inlets were located 1.0 to 1.2 m above floor level for both indoor and outdoor sampling. The concentrations of indoor and outdoor PM2.5 were detected using DUST-TRAK aerosol monitors (Model 8520; TSI Corporation, Shoreview, MN, USA). The instruments automatically recorded data at 5-min intervals for 22-24 h. We then downloaded the data from the instruments and calculated the hourly average PM2.5 concentration. We recorded 1979 measurements of hourly indoor and outdoor PM2.5 concentrations. Before sampling, we compared the two DUST-TRAK aerosol monitors in the same space, and in each comparison, R² was required to exceed 0.995.
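The reduction from 5-min readings to hourly averages described above can be sketched as follows. This is only an illustration: the function name `hourly_means`, the timestamps, and the values are hypothetical and are not taken from the DBH data.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def hourly_means(readings):
    """Average (timestamp, value) pairs into bins keyed by the start of each hour."""
    bins = defaultdict(list)
    for ts, value in readings:
        bins[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: mean(values) for hour, values in sorted(bins.items())}


# Hypothetical 5-min log covering two hours (µg/m3, invented for illustration).
start = datetime(2009, 1, 15, 8, 0)
log = [(start + timedelta(minutes=5 * i), 20.0 if i < 12 else 30.0) for i in range(24)]
hourly = hourly_means(log)
```

Each hour here contains twelve 5-min readings; an incomplete first or last hour of a 22-24 h run would simply be averaged over the readings available, which is an assumption of this sketch rather than a rule stated in the study.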
The DUST-TRAK aerosol monitor is a light-scattering aerosol measuring device. Previous studies have indicated that data from the DUST-TRAK aerosol monitor can be overestimated when the relative humidity is greater than 60% [35][36][37]. The average relative humidity in Taiwan is greater than 75%. Thus, PM2.5 samples were simultaneously collected with personal environmental monitors (air flow of 10 L/min, 37-mm Teflon filters) and DUST-TRAK aerosol monitors. In total, indoor and outdoor measurements from 15 and 12 study households, respectively (three outdoor samples had missing data), were used to calibrate the data from the DUST-TRAK aerosol monitors. The filters were weighed before and after sampling and stored in a room with controlled temperature (23 ± 3 °C) and relative humidity (40 ± 5%) for at least 24 h. The sampling time for each sample was 24 h. The calibration equations for indoor and outdoor PM2.5, Equations (1) and (2), respectively, were obtained by using a linear regression model. To reduce the difficulty of collecting variables for a prediction model of indoor PM2.5 concentration, we collected outdoor PM2.5 concentrations near the study households from Taiwan's EPA and calculated the outdoor PM2.5 levels of study cases through the Kriging model [38]. Pearson correlation analysis revealed a significant association (r = 0.91, p < 0.0001) between the outdoor PM2.5 concentrations from the Kriging model and those measured at the study households.
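A minimal sketch of this kind of filter-based calibration is given below, assuming a straight-line relation between the light-scattering reading and the gravimetric mass. The paired values are invented so that the fit is easy to check by hand; the resulting slope and intercept are not the coefficients of Equations (1) and (2).

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical paired 24-h means (µg/m3): monitor reading vs. filter-based mass.
dusttrak = [30.0, 45.0, 60.0, 80.0, 100.0]
gravimetric = [16.0, 23.5, 31.0, 41.0, 51.0]  # constructed as 0.5 * x + 1
slope, intercept = fit_line(dusttrak, gravimetric)


def calibrate(reading):
    """Convert a raw monitor reading to a filter-equivalent concentration."""
    return slope * reading + intercept
```

In practice a separate line would be fitted for indoor and for outdoor samples, as the two calibration equations in the text suggest.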
To develop the prediction model for indoor PM2.5 concentrations, we also used two Q-TRAK air quality monitors (Model 7575; TSI Corporation, Shoreview, MN, USA) to measure indoor and outdoor CO2 and temperature. The instruments automatically recorded data at 5-min intervals for 22-24 h. We downloaded the data from the instruments and calculated the hourly averages. The sampling strategies were consistent with the descriptions provided in this subsection.

MLR Model Procedure
MLR was used to select variables and establish a prediction model for indoor PM2.5 concentrations. The process was as follows:

• Step 1: A simple regression analysis was used to analyze the association between the indoor PM2.5 level and each variable. Variables with p > 0.05 were excluded;
• Step 2: A simple linear regression model was used to assess the collinearity between variables. Variables with a variance inflation factor >3 were excluded from the prediction model;
• Step 3: MLR (stepwise) was used to analyze the association between all variables and indoor PM2.5 concentrations. We repeated this process until no more variables could be removed without statistically significant (p > 0.05) changes in the regression model.
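The collinearity screen in Step 2 can be illustrated with a small sketch. The variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the others. The code below is an assumption-laden illustration, not the SAS procedure used in the study: the predictor names and values are hypothetical, and dropping the predictor with the largest factor one at a time is a design choice of this sketch.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(predictors):
    """Variance inflation factor of each named predictor against all the others."""
    scores = {}
    for name, col in predictors.items():
        others = [c for other, c in predictors.items() if other != name]
        scores[name] = 1.0 / (1.0 - r_squared(others, col))
    return scores


def drop_collinear(predictors, threshold=3.0):
    """Drop the predictor with the largest VIF until every VIF is <= threshold."""
    kept = dict(predictors)
    while len(kept) > 1:
        scores = vif(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del kept[worst]
    return kept


# Hypothetical predictors: x2_near_copy almost duplicates x1; x3_indicator is unrelated.
candidates = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2_near_copy": [1.1, 1.9, 2.9, 4.1, 5.1, 5.9, 6.9, 8.1],
    "x3_indicator": [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
}
initial_vif = vif(candidates)
selected = drop_collinear(candidates)
```

With these invented columns, the two near-duplicate predictors have factors far above 3 while the indicator stays near 1, so one of the duplicates is removed and the other two variables are kept.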

Prediction Model Performance Evaluation
Prediction model performance was assessed using the coefficient of determination (R²) and root mean square error. We used 5-fold cross-validation (80% of the data for development, 20% for validation) to confirm model reliability. We also used the developed prediction model to calculate indoor PM2.5 concentrations and analyzed the correlations between predicted and measured PM2.5 concentrations to investigate the performance of the final prediction model in different seasons and areas. SAS (v9.4, SAS Institute Inc., Cary, NC, USA) statistical software was used to analyze all data.

Building Characteristics and Human Activity
Table 1 indicates that townhouses (57%) were the major building type and that window and single-split air conditioners were the major types of air conditioner (59%). Most buildings were 20-40 years old and had more than three floor levels. Furthermore, 45% of buildings were located near a main road and 84% of households had painted walls. Most of the furniture was made from wood (92%). As presented in Table 2, 20%, 39%, 48%, and 15% of occupants engaged in planting, smoking, incense burning, and indoor mosquito coil burning, respectively. Furthermore, the floors of 65% of children's rooms were cleaned daily. The furniture in 60% of children's rooms was cleaned every month. The bed sheets in children's rooms were cleaned and replaced every month in 58% and 56% of households, respectively. Other minor building characteristics and human indoor activities are presented in Tables 1 and 2.
Table 3 lists the distribution of indoor and outdoor PM2.5 levels. Average concentrations of indoor and outdoor PM2.5 were 19.5 ± 10.6 (DUST-TRAK) and 38.1 ± 20.6 µg/m³ (Kriging), respectively. The highest (27.4 ± 12.0 µg/m³, DUST-TRAK) and lowest (13.7 ± 6.5 µg/m³, DUST-TRAK) levels of indoor PM2.5 occurred during the winter and summer, respectively. The highest (53.0 ± 22.3 µg/m³, Kriging) and lowest (25.1 ± 11.3 µg/m³, Kriging) levels of outdoor PM2.5 also occurred during the winter and summer, respectively. Seasonal distributions of outdoor PM2.5 concentrations were consistent with those of indoor PM2.5. The levels of indoor and outdoor temperature and CO2 are presented in Table S2. The highest and lowest indoor temperatures were observed in the summer and winter, respectively, similar to the results for outdoor temperature. The highest and lowest levels of CO2 were observed during the winter and spring, respectively. However, the seasonal change was not obvious.

MLR Model Results
The results of the MLR model are presented in Table 4. Our model indicated that outdoor PM2.5 concentrations (Kriging), the difference between indoor and outdoor CO2 levels, building type, building floor level, the frequencies of bed sheet cleaning and replacing, and mosquito coil burning were associated with indoor PM2.5 concentrations. The overall R² was 74%. The indoor PM2.5 concentration prediction model is given as Equation (3).
Table 5 presents the results of cross-validation for the MLR model for hourly average indoor PM2.5 concentrations. With 20% of the data used to validate the prediction model, the adjusted R² ranged from 72% to 78%, with an average of 75%. Figure 2 displays the results of a Pearson correlation analysis, which revealed a significant relationship between predicted and measured indoor PM2.5 concentrations (R² = 0.74, p < 0.05). We also used Equation (3) to calculate the indoor PM2.5 concentrations and further analyzed the associations between predicted and measured values by season and area (Figures 3 and 4). The Pearson correlation analysis revealed a satisfactory association between predicted and measured indoor PM2.5 concentrations (p < 0.05) in different seasons and areas, thus indicating that the prediction model sufficiently estimated hourly average indoor PM2.5 concentrations.
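A 5-fold cross-validation of the kind summarized in Table 5 can be sketched as below. To stay short, the sketch assumes a single-predictor model (indoor PM2.5 regressed on outdoor PM2.5) rather than the full MLR, uses synthetic data generated with an arbitrary slope and noise level, and reports plain R² and root mean square error rather than the adjusted R² given in the paper.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r2_and_rmse(observed, predicted):
    """Coefficient of determination and root mean square error on one fold."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(observed))


def k_fold_cv(x, y, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, and repeat for every fold."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    results = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        predicted = [slope * x[i] + intercept for i in held_out]
        results.append(r2_and_rmse([y[i] for i in held_out], predicted))
    return results


# Synthetic data (µg/m3): the slope, intercept, and noise level are arbitrary.
rng = random.Random(1)
outdoor = [rng.uniform(10.0, 80.0) for _ in range(50)]
indoor = [0.4 * o + 3.0 + rng.gauss(0.0, 2.0) for o in outdoor]
scores = k_fold_cv(outdoor, indoor)
mean_r2 = sum(r2 for r2, _ in scores) / len(scores)
```

With k = 5, each round uses 80% of the observations for fitting and 20% for validation, which matches the split described in the performance evaluation.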

Discussion
We created a prediction model for hourly average indoor PM2.5 concentrations by using an MLR model, the results of which suggested that outdoor PM2.5, ventilation, building characteristics, and human activities were key predictors. This prediction model also sufficiently estimated hourly average PM2.5 concentrations in different seasons and areas. Based on the modeling results, we suggest that controlling outdoor PM2.5 emissions and changing certain human activities are beneficial for reducing indoor PM2.5 concentrations. Occupants should also consider the outdoor air quality when they open or close windows or doors to regulate ventilation. This prediction model can be used to assess variations in indoor PM2.5 concentrations across seasons and areas for future epidemiological research and to inform control policy.
The average age of buildings in Taiwan is 29.6 years, and 54% of buildings are 20-40 years old [39]. In this study, the average building age was 27.0 years, and 62% of buildings were 20-40 years old. Moreover, 50%, 18%, and 32% of people in Taiwan live in townhouses, single-family homes, and apartments, respectively, according to statistical data [39]. The households in our study were similarly distributed across building types (Table 1). These data indicate that the building type and age distributions of the studied households match the situation in Taiwan.
In this study, both indoor and outdoor PM2.5 concentrations were the highest in the winter (p < 0.05, based on analysis of variance), followed by the spring. This reveals that indoor and outdoor PM2.5 pollution is more serious in the winter and spring than in the summer and fall in Taiwan. Studies have indicated that dust storms, biomass burning, and long-range transport from China are notable contributors to outdoor PM2.5 in the winter or spring [40][41][42]. Kuo and Shen [41] found that indoor and outdoor PM2.5 concentrations were high during dust-storm periods. Moreover, our prediction model indicated that the outdoor PM2.5 concentration was a critical factor affecting indoor PM2.5 concentrations. This explains why high indoor and outdoor PM2.5 concentrations were both observed in the spring and winter and why they followed similar seasonal patterns.
Table 4 indicates that the outdoor PM2.5 concentration is a critical parameter in the indoor PM2.5 concentration prediction model. In Taiwan, opening windows is a universal behavior, and indoor PM2.5 concentration is associated with outdoor PM2.5 concentration in residential areas [41,43]. Previous studies have also indicated that outdoor PM2.5 is a major variable in indoor PM2.5 prediction models [22,44]. Therefore, outdoor PM2.5 concentration plays an important role in predicting indoor PM2.5 concentrations.
Some studies directly measured outdoor PM2.5 concentrations to predict indoor PM2.5 concentrations [13,16]; however, installing sampling instruments at studied households to measure outdoor PM2.5 concentrations is difficult. Other studies have collected outdoor PM2.5 data directly from monitoring stations [14,15]; however, given the distance between monitoring stations and studied households, these data can have a large error and do not accurately reflect the pollution levels near the study households. In this study, we used the Kriging model to estimate the outdoor PM2.5 concentrations surrounding studied households. Chiang used a Kriging model to estimate outdoor PM2.5 concentrations in Taiwan [45]. The data from Taiwan's EPA air quality monitoring stations were critical for simulating outdoor PM2.5 concentrations by using the Kriging model, and these data were easily collected online. Moreover, a satisfactory correlation was observed between the outdoor PM2.5 concentrations surrounding studied households determined using the Kriging model and the measurements in this study (r = 0.91, p < 0.0001). Therefore, outdoor PM2.5 concentrations can be collected directly from Taiwan's EPA air quality monitoring stations and used to estimate the outdoor PM2.5 concentrations of studied households by using the Kriging model. These Kriging estimates can then be entered into the prediction model to calculate indoor PM2.5 concentrations.
Our data indicate that indoor PM2.5 concentration is negatively associated with the difference between indoor and outdoor CO2 levels (Table 4), and building type was also a factor influencing indoor PM2.5 concentrations. A smaller indoor-outdoor CO2 difference generally reflects a higher ventilation rate. Langer and Bekö [30] found that the ventilation rate was positively associated with the indoor NO2 level. Outdoor air was an important contributor to indoor NO2 because a high ventilation rate increases the contribution of outdoor NO2 to indoor air. Hänninen and co-workers indicated that the ventilation rate was positively associated with the PM2.5 infiltration rate [46]. Our model also indicated that outdoor PM2.5 was an important factor influencing indoor PM2.5 concentrations. In this study, the average differences between indoor and outdoor CO2 concentrations were 359, 284, and 256 ppm in townhouses, single-family homes, and apartments, respectively. A high ventilation rate may increase the contribution of outdoor PM2.5 to indoor air. We determined that apartments contribute more to the indoor PM2.5 concentration than single-family houses and townhouses do. We speculate that the small space of apartments has a large cumulative effect on the PM2.5 concentration. Our data indicated that indoor PM2.5 concentrations were higher on the first floor than on other floors. We speculate that vehicular emissions were a critical factor influencing the indoor PM2.5 concentration [29]. Therefore, occupants on the first floor had a higher risk of exposure to elevated PM2.5. Taken together, building characteristics were important influential factors, and occupants should consider the outdoor air quality when they regulate ventilation.
Table 4 shows that the frequencies of bed sheet cleaning and replacing are also predictors of indoor PM2.5 concentrations. One study investigated the variations in particulate matter concentrations for different indoor human activities and found that folding a blanket also contributed particulate matter through resuspension [28]. We speculate that the particle load on the bed sheet was higher when the frequency of bed sheet cleaning or replacing was lower, and that occupants using or folding the bed sheet increased particle concentrations through resuspension. Previous studies have also demonstrated that mosquito coil burning is a contributor to particulate matter in the air [47,48]. Mosquito bites are common in tropical and subtropical regions, and mosquito coils are often used to prevent them. Therefore, we cannot ignore the effect of mosquito coil burning on indoor air quality.
Studies have indicated that other indoor human activities, such as smoking or incense stick burning, are major factors affecting the PM2.5 concentration [47][48][49]. Our data indicated that indoor PM2.5 concentrations were not associated with smoking or incense stick burning. We speculate that the burning time of cigarettes and incense sticks was short, so their effects on indoor PM2.5 concentrations were weak. Moreover, this study lacked detailed records of the occupants' behaviors and the exact time of each event. Applying feasible technology to collect the frequency and timing of occupant activities will be necessary to fully analyze the effects of occupant activities on hourly PM2.5 concentrations in the future.
Studies have completed cross-validation for prediction models [13,15,50]. In this study, we also conducted cross-validation, which yielded an average R² of 75%. Figures 3 and 4 illustrate significant correlations between predicted and measured indoor PM2.5 concentrations in different seasons and areas, respectively. The R² values ranged from moderate to high, which indicates that the prediction model in this study can be used to estimate indoor PM2.5 concentrations across seasons and areas in Taiwan. However, Figures 2-4 show that some measurements have larger relative errors, especially in winter. Sun's study [51] found that window opening/closing behavior significantly affected the error of an indoor PM2.5 concentration prediction model. Our study did not investigate the timing and frequency of window opening/closing behaviors. Thus, investigating the timing and frequency of window opening/closing behaviors will be necessary in the near future to reduce the relative error.
Some studies developed indoor aerosol models based on complicated mechanisms (such as coagulation or secondary formation) [26][27][28]. However, these models may ignore building characteristics or other human activities. Some studies developed prediction models for indoor PM2.5 concentration in households [13][14][15], but these studies were conducted in temperate or cold regions, such as Canada, Mongolia, and the United States. The predictors in our study differed from those in previous studies. For example, in the studies by Clark and Yuchi, the outdoor PM2.5 concentration was not a crucial predictor. We speculate that outdoor air does not critically influence indoor air in temperate or cold regions because doors and windows are kept closed, particularly in the winter. In Taiwan, the frequency of window-opening behavior is high due to its warm climate [21]; thus, outdoor air considerably influences indoor air in Taiwan. Moreover, Clark's study indicated that forced air heating is a key variable in prediction models of indoor PM2.5 concentrations. We did not find that heater use was associated with indoor PM2.5 concentration, possibly because only 8% of occupants had heaters; moreover, the heaters were only operated 0.3 h/day during the winter in our previous study [21]. This indicates that heater use is not a critical variable in tropical and subtropical regions. Altogether, human activity may differ across climate regions, so the predictors that influence indoor air quality may differ as well.

Conclusions
This study used the data on household indoor and outdoor pollutants, building characteristics, and human indoor activity from the DBH study in Taiwan to develop a prediction model for hourly indoor PM2.5 concentrations and to investigate the influential factors. Our results revealed that outdoor PM2.5 concentrations, the difference between indoor and outdoor CO2 levels, building type, building floor level, bed sheet cleaning, bed sheet replacing, and mosquito coil burning were key variables, and that these predictors differed from those reported for other climate regions. The prediction model of indoor PM2.5 concentrations can explain 74% of the variation and sufficiently predict indoor PM2.5 concentrations across seasons (55 to 69%) and areas (65 to 70%). PM2.5 concentrations from Taiwan's EPA air quality monitoring stations can be used to estimate the outdoor PM2.5 concentrations of studied households by using the Kriging model and thereby reduce the time and cost of installing PM2.5 monitoring instruments.
This study developed a prediction model for hourly indoor PM2.5 concentrations in tropical and subtropical regions and found that the major predictors differed across climate regions. Controlling outdoor PM2.5 emissions and changing certain human activities can reduce PM2.5 exposure. Occupants should assess the outdoor air quality when they open or close doors or windows to regulate ventilation. We assert that our prediction model can estimate indoor PM2.5 concentrations across seasons and areas in Taiwan and can be used for future epidemiological research on the relationship between indoor PM2.5 exposure and health effects and for obtaining rapid data for pollution control.
Supplementary Materials: The following are available online at http://www.mdpi.com/1660-4601/17/16/5906/s1, Table S1: Summary of predictors considered as candidates for development of the hourly indoor PM2.5 concentration model, Table S2: Summary of indoor and outdoor temperature and CO2.