# Estimation of Unmeasured Room Temperature, Relative Humidity, and CO_{2} Concentrations for a Smart Building Using Machine Learning and Exploratory Data Analysis


## Abstract


[…] CO_{2} concentrations. Our models accurately estimated temperature, humidity, and CO_{2} concentration under various case studies, with average root mean squared errors (RMSE) of 0.3 degrees, 2.6%, and 26.25 ppm, respectively. The results show that indoor environment measurements can be estimated accurately enough to support optimal HVAC system control in smart buildings with fewer sensors.

## 1. Introduction

[…] CO_{2} concentration. Optimally controlling the HVAC system’s operations reduces energy waste and also improves occupants’ thermal comfort. From the algorithm perspective, machine learning (ML) techniques that learn historical data trends and thermal behaviors have attracted significant research attention [14]. In [15], researchers compared the accuracy of various building energy consumption forecasting models, including linear regression, ridge regression, K-Nearest Neighbors regressor, Random Forest regressor, Gradient Boosting regressor, Extra Trees regressor, MLP regressor, and Artificial Neural Networks (ANNs). Their findings show that ANN models are the best alternative for short-term load forecasting (STLF). Other case studies [16,17] showed that 1D-CNNs outperformed LSTM, shallow ANN, and SVM models in terms of root mean squared error (RMSE). Ref. [18] went a step further, combining a 1D-CNN with an LSTM to form a CNN-LSTM hybrid model that outperformed the other ML models in short-term load forecasting.

[…] CO_{2} concentration within room areas without sensors for a building in Japan. The novelty of our methodology is the design of simple extreme gradient boosting (XGBoost) models that can be trained on limited data and still make accurate estimations. The two primary contributions of our methodology are:

- Reduced number of sensors required for optimal indoor environment variable measurements in a commercial building.
- Accurate indoor temperature and relative humidity estimation for HVAC system control to reduce energy waste while improving occupant thermal comfort.

## 2. Materials and Methods

#### 2.1. XGBoost Machine Learning Algorithm

[…] $x_i$ is the input variable, $f_t(x_i)$ is the learner at time step $t$, $f_i^{(t)}$ is the prediction at step $t$, and $f_i^{(t-1)}$ is the prediction at step $t-1$.
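The symbols defined here match the standard additive form of gradient tree boosting. Assuming the omitted equation follows the XGBoost formulation in [26], the prediction at step $t$ would be the previous prediction plus the newly fitted learner:

```latex
f_i^{(t)} = \sum_{k=1}^{t} f_k(x_i) = f_i^{(t-1)} + f_t(x_i)
```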

[…] $\hat{y}_i$ is the estimated value, $y_i$ is the actual value, and $\Omega$ is the regularization term, defined in Equation (3) as:
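For reference, the XGBoost paper [26] writes the regularized objective and the per-tree penalty as shown below, where $l$ is a differentiable loss, $T$ is the number of leaves in a tree, $w$ is the vector of leaf weights, and $\gamma$ and $\lambda$ are penalty coefficients. These symbols come from [26]; that they match the notation of the equations omitted here is an assumption.

```latex
\mathcal{L} = \sum_{i} l\left(\hat{y}_i, y_i\right) + \sum_{k} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}
```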

#### 2.2. Methodology

[…] CO_{2} concentrations of an office building used in our case study are described in detail. The building used in our research is a certified zero energy building (ZEB) [27] in Japan. Figure 1 illustrates an overview of the methodology.

#### 2.2.1. Data Collection and Pre-Processing

[…] CO_{2} concentration, and send them to a local server for storage. Our study used a six-month data set.

[…] CO_{2} concentration data from the large data set. We then combined all daily data files into one large file and evaluated different techniques for handling missing data. Missing data points were randomly scattered throughout the data set. We tried various interpolation methods and finally employed the spline interpolation method provided by the pandas Python library. The data set was then converted into hourly intervals and visualized.
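A minimal sketch of this cleaning step, assuming the combined readings sit in one pandas DataFrame with a hypothetical `timestamp` column and one column per sensor. The column names, the 10-minute sampling, and the cubic spline order are illustrative assumptions; the source does not state them. `DataFrame.interpolate(method="spline", ...)` relies on SciPy being installed.

```python
import numpy as np
import pandas as pd


def clean_and_resample(raw: pd.DataFrame, time_col: str = "timestamp") -> pd.DataFrame:
    """Fill scattered gaps by spline interpolation, then aggregate to hourly means."""
    df = raw.sort_values(time_col).reset_index(drop=True)
    value_cols = [c for c in df.columns if c != time_col]
    # With the default integer index, the spline is fitted against row position,
    # which assumes evenly spaced samples. order=3 (cubic) is an assumed choice.
    df[value_cols] = df[value_cols].interpolate(method="spline", order=3)
    # Convert to hourly intervals by averaging the samples inside each hour.
    return df.set_index(time_col)[value_cols].resample(pd.Timedelta(hours=1)).mean()


# Hypothetical 10-minute readings for one room, with two missing samples.
times = pd.date_range("2019-01-01 00:00", periods=18, freq="10min")
temps = 20.0 + np.sin(np.arange(18) / 3.0)
temps[[4, 11]] = np.nan
raw = pd.DataFrame({"timestamp": times, "room_temp": temps})

hourly = clean_and_resample(raw)
print(hourly)
```

Interpolating before resampling keeps the hourly means from being biased by hours that happen to contain fewer valid samples.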

#### 2.2.2. Data Analysis and Input Feature Selection

[…] CO_{2} concentration. Figure 2 depicts the cleaned temperature data for all building rooms from January 2019 to June 2019.

[…] CO_{2} concentration data sets, then carried out input feature engineering.

[…] CO_{2} concentration estimation modeling.
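The exact selection criterion is not recoverable from the text here. One common exploratory step, shown purely as an assumed illustration, is to rank candidate sensor series by their Pearson correlation with the target room and keep the strongest ones as input features. All room names and readings below are hypothetical.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_features(target: list[float], candidates: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Sort candidate sensors by absolute correlation with the target, strongest first."""
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical hourly temperatures (not data from the study).
target_room = [21.0, 21.4, 22.1, 22.8, 22.5, 21.9]
candidates = {
    "neighbor_room": [20.8, 21.3, 22.0, 22.6, 22.4, 21.7],
    "outside_air": [5.0, 6.5, 9.0, 11.0, 10.0, 7.5],
    "storage_room": [18.0, 18.1, 17.9, 18.0, 18.2, 17.9],
}

ranking = rank_features(target_room, candidates)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Weakly correlated series, such as the hypothetical storage room, would be natural candidates to drop from the input set.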

#### 2.2.3. XGBoost Model Design, Training, Testing, and Evaluation

[…] CO_{2} concentration data during the training stage. The model’s hyperparameters were initially set to their default values. However, default values do not always give the best accuracy, so the hyperparameters must be tuned for each case to obtain the best possible estimation accuracy. We did this with the grid search CV technique [28] from the scikit-learn library [29]. Grid search cross-validation performs an exhaustive search over specified parameter values of an estimator, scoring each combination by cross-validation.
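The tuning loop can be sketched with scikit-learn's `GridSearchCV` wrapped around the XGBoost regressor. The data here are synthetic, and the candidate values, the time-ordered split, and the scoring string are assumptions chosen for illustration, not the study's actual configuration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

try:
    from xgboost import XGBRegressor as Regressor
except ImportError:
    # Stand-in so the sketch still runs without xgboost; it accepts the same
    # parameter names used below but is a different implementation.
    from sklearn.ensemble import GradientBoostingRegressor as Regressor

# Synthetic stand-in data: three measured-room features and one target room.
rng = np.random.default_rng(0)
X = rng.normal(loc=22.0, scale=1.5, size=(240, 3))
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.2, size=240)

# Candidate values drawn from the ranges in the hyperparameter table.
param_grid = {"max_depth": [2, 3, 4], "n_estimators": [50, 200, 400]}

search = GridSearchCV(
    estimator=Regressor(learning_rate=0.3),
    param_grid=param_grid,
    scoring="neg_root_mean_squared_error",  # higher (closer to 0) is better
    cv=TimeSeriesSplit(n_splits=3),  # assumed; keeps validation folds after training folds
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("validation RMSE:", -search.best_score_)
```

Every combination in `param_grid` is fitted once per fold, so the cost grows multiplicatively with each added hyperparameter; a small grid per room keeps the search cheap.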

[…] $\hat{y}_i$ is the estimated value, $y_i$ is the actual value, and $n$ is the number of samples.
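Assuming the standard definitions, RMSE $= \sqrt{\frac{1}{n}\sum_{i}(y_i - \hat{y}_i)^2}$ and MAPE $= \frac{100}{n}\sum_{i}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$, both metrics can be computed in a few lines. The sample readings are hypothetical.

```python
import math


def rmse(actual: list[float], estimated: list[float]) -> float:
    """Root mean squared error, in the same units as the inputs."""
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, estimated)) / n)


def mape(actual: list[float], estimated: list[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, estimated))


# Hypothetical actual and estimated readings.
actual = [20.0, 25.0, 40.0]
estimated = [21.0, 24.0, 38.0]

print(f"RMSE: {rmse(actual, estimated):.4f}")
print(f"MAPE: {mape(actual, estimated):.4f} %")
```

RMSE carries the unit of the measured variable, while MAPE is scale-free, which is why the two are reported side by side.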

## 3. Results

[…] CO_{2} concentration were estimated and evaluated using the RMSE and MAPE performance metrics.

#### 3.1. Indoor Temperature Estimation Results

#### 3.2. Relative Humidity and CO_{2} Concentration Estimation Results

[…] CO_{2} concentration are slightly different; hence, each case requires a specific XGBoost model and a specific set of input features. We therefore applied exploratory data analysis techniques to the relative humidity and CO_{2} concentration data sets. Optimal hyperparameters for each model were carefully obtained, and essential input features were extracted from the data.

[…] CO_{2} concentration estimation. However, Figure 7 compares the estimated relative humidity of basement conference room 2 and the estimated CO_{2} concentration of the third-floor office with the actual test set. Again, the estimations fit well, indicating the accuracy of the designed models.

[…] CO_{2} concentration estimation because we only had data from four CO_{2} concentration sensors, installed in the two basement rooms and on the second and third floors of the building.

[…] CO_{2} concentration was accurate to an average RMSE of 26.25 ppm. This represents a good fit and shows the power of XGBoost algorithms for precisely estimating the indoor environment variables needed for occupant well-being and for control policies that keep a building’s heating and cooling systems running smoothly.

## 4. Discussion

[…] CO_{2} concentration. For the temperature estimation, we used data from seven data points to estimate the temperature for six rooms, as shown in Table 2. The proposed temperature XGBoost model therefore reduced the number of temperature sensors from 13 to 7, a reduction of about 46%. The relative humidity estimation XGBoost models used five data points to estimate relative humidity for six rooms (Table 3), while the CO_{2} concentration estimation XGBoost models used three data points to estimate CO_{2} concentration for two floors, as shown in Table 3.

[…] CO_{2} concentration models obtained an average RMSE of 26.25 ppm. These errors appear large in absolute value because RMSE is expressed on the scale of each estimated variable, and the variables have different ranges. However, the MAPE metric, which expresses the estimation error as a percentage, indicated errors of 2.2342% and 2.4581% for relative humidity and CO_{2} concentration estimation, respectively, representing a good estimation.
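As a quick check on these averages, the per-room test scores from the result tables can be averaged directly. The sketch assumes the reported figures are unweighted means over rooms; small differences from the quoted values may come from rounding in the tables.

```python
from statistics import mean

# Per-room test scores copied from the result tables.
temperature_rmse = [0.4632, 0.4340, 0.1467, 0.3134, 0.1736, 0.2436]
humidity_rmse = [1.0992, 2.9769, 2.9958, 3.1536, 2.9958, 2.7648]
humidity_mape = [1.1175, 2.2044, 2.5130, 2.696, 2.4096, 2.4707]
co2_rmse = [19.2314, 33.3331]
co2_mape = [1.5552, 3.3610]

averages = {
    "temperature RMSE": mean(temperature_rmse),
    "relative humidity RMSE (%)": mean(humidity_rmse),
    "relative humidity MAPE (%)": mean(humidity_mape),
    "CO2 RMSE (ppm)": mean(co2_rmse),
    "CO2 MAPE (%)": mean(co2_mape),
}

for name, value in averages.items():
    print(f"{name}: {value:.4f}")
```

Because MAPE divides each error by the actual value, it stays comparable across variables whose typical magnitudes differ by orders of magnitude, such as relative humidity and CO_{2} concentration.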

## 5. Conclusions

[…] CO_{2} concentration. As the discussion of results shows, the adopted models produced accurate estimates as measured by both RMSE and MAPE. Modeling and accurately estimating indoor environmental variables is an essential task for reducing a building’s overall energy consumption and improving occupant comfort.

## References

1. Frontczak, M.; Wargocki, P. Literature survey on how different factors influence human comfort in indoor environments. Build. Environ. **2011**, 46, 922–937.
2. Heinzerling, D.; Schiavon, S.; Webster, T.; Arens, E. Indoor environmental quality assessment models: A literature review and a proposed weighting and classification scheme. Build. Environ. **2013**, 70, 210–222.
3. Al Horr, Y.; Arif, M.; Kaushik, A.; Mazroei, A.; Katafygiotou, M.; Elsarrag, E. Occupant productivity and office indoor environment quality: A review of the literature. Build. Environ. **2016**, 105, 369–389.
4. Provins, K.A. Environmental heat, body temperature, and behavior: An hypothesis. Aust. J. Psychol. **2007**, 18, 118–129.
5. Rezaie, B.; Rosen, M.A. Department of Environment and Energy HVAC Energy Breakdown. HVAC Hess **2013**, 93, 36–37.
6. Manic, M.; Amarasinghe, K.; Rodriguez-Andina, J.J.; Rieger, C. Intelligent Buildings of the Future: Cyber aware, Deep Learning Powered, and Human Interacting. IEEE Ind. Electron. Mag. **2016**, 10, 32–49.
7. Weng, T.; Agarwal, Y. From buildings to smart buildings-sensing and actuation to improve energy efficiency. IEEE Des. Test Comput. **2012**, 29, 36–44.
8. Batov, E.I. The distinctive features of “smart” buildings. Procedia Eng. **2015**, 111, 103–107.
9. Crawley, D.B.; Lawrie, L.K.; Winkelmann, F.C.; Buhl, W.F.; Huang, Y.J.; Pedersen, C.O.; Strand, R.K.; Liesen, R.J.; Fisher, D.E.; Witte, M.J.; et al. EnergyPlus: Creating a new-generation building energy simulation program. Energy Build. **2001**, 33, 319–331.
10. Chen, X.; Li, X. Virtual temperature measurement for smart buildings via Bayesian model fusion. Proc.-IEEE Int. Symp. Circuits Syst. **2016**, 2016, 950–953.
11. Ghahramani, A.; Galicia, P.; Lehrer, D.; Varghese, Z.; Wang, Z.; Pandit, Y. Artificial Intelligence for Efficient Thermal Comfort Systems: Requirements, Current Applications, and Future Directions. Front. Built Environ. **2020**, 6, 109807.
12. Dong, B.; Prakash, V.; Feng, F.; O’Neill, Z. A review of a smart building sensing system for better indoor environment control. Energy Build. **2019**, 199, 29–46.
13. Han, Z.; Gao, R.X.; Fan, Z. Occupancy and indoor environment quality sensing for smart buildings. In Proceedings of the 2012 IEEE International Instrumentation and Measurement Technology Conference Proceedings, Graz, Austria, 13–16 May 2012; pp. 1–6.
14. Wei, Y.; Zhang, X.; Shi, Y.; Xia, L.; Pan, S.; Wu, J.; Han, M.; Zhao, X. A review of data-driven approaches for prediction and classification of building energy consumption. Renew. Sustain. Energy Rev. **2018**, 82, 1027–1047.
15. Bouktif, S.; Fiaz, A.; Ouni, A.; Serhani, M.A. Optimal deep learning LSTM model for electric load forecasting using feature selection and genetic algorithm: Comparison with machine learning approaches. Energies **2018**, 11, 1636.
16. Amarasinghe, K.; Marino, D.L.; Manic, M. Deep neural networks for energy load forecasting. In Proceedings of the 2017 IEEE 26th International Symposium on Industrial Electronics (ISIE), Edinburgh, UK, 19–21 June 2017; pp. 1483–1488.
17. Kaligambe, A.; Fujita, G. Short-Term Load Forecasting for Commercial Buildings Using 1D Convolutional Neural Networks. In Proceedings of the 2020 IEEE PES/IAS PowerAfrica, Nairobi, Kenya, 25–28 August 2020.
18. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034.
19. Zhang, X.; Pipattanasomporn, M.; Chen, T.; Rahman, S. An IoT-Based Thermal Model Learning Framework for Smart Buildings. IEEE Internet Things J. **2020**, 7, 518–527.
20. Aliberti, A.; Ugliotti, F.M.; Bottaccioli, L.; Cirrincione, G.; Osello, A.; Macii, E.; Patti, E.; Acquaviva, A. Indoor Air-Temperature Forecast for Energy-Efficient Management in Smart Buildings. In Proceedings of the 2018 IEEE International Conference on Environment and Electrical Engineering and 2018 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe), Palermo, Italy, 12–15 June 2018.
21. Chen, X.; Li, X.; Tan, S.X.D. Overview of cyber-physical temperature estimation in smart buildings: From modeling to measurements. In Proceedings of the 2016 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), San Francisco, CA, USA, 10–14 April 2016; pp. 251–256.
22. Attoue, N.; Shahrour, I.; Younes, R. Smart building: Use of the artificial neural network approach for indoor temperature forecasting. Energies **2018**, 11, 395.
23. Soleimani-Mohseni, M.; Thomas, B.; Fahlén, P. Estimation of operative temperature in buildings using artificial neural networks. Energy Build. **2006**, 38, 635–640.
24. Alawadi, S.; Mera, D.; Fernández-Delgado, M.; Alkhabbas, F.; Olsson, C.M.; Davidsson, P. A comparison of machine learning algorithms for forecasting indoor temperature in smart buildings. Energy Syst. **2020**, 1–17.
25. Ma, X.; Fang, C.; Ji, J. Prediction of outdoor air temperature and humidity using Xgboost. IOP Conf. Ser. Earth Environ. Sci. **2020**, 427, 012013.
26. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, USA, 14–18 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; Volume 13–17, pp. 785–794.
27. Enefice Kyushu | U.S. Green Building Council. Available online: https://www.usgbc.org/projects/enefice-kyushu (accessed on 8 January 2021).
28. Tuning the Hyper-Parameters of an Estimator—Scikit-Learn 0.24.1 Documentation. Available online: https://scikit-learn.org/stable/modules/grid_search.html#grid-search (accessed on 26 January 2021).
29. Scikit-Learn: Machine Learning in Python—Scikit-Learn 0.24.0 Documentation. Available online: https://scikit-learn.org/stable/ (accessed on 14 January 2021).
30. Yildiz, B.; Bilbao, J.I.; Sproul, A.B. A review and analysis of regression and machine learning models on commercial building electricity load forecasting. Renew. Sustain. Energy Rev. **2017**, 73, 1104–1122.
31. Gao, G.; Li, J.; Wen, Y. DeepComfort: Energy-Efficient Thermal Comfort Control in Buildings Via Reinforcement Learning. IEEE Internet Things J. **2020**, 7, 8472–8484.

**Figure 3.** (**a**) Time series plot for outside air temperature, B1F conference room 1 temperature, and B1F conference room 2 temperature; (**b**) scatter plot of B1F conference room 1 temperature and B1F conference room 2 temperature.

**Figure 6.** (**a**) Plot of actual 2F office room temperature and its estimation for June 2019; (**b**) plot of actual 3F office room temperature and its estimation for June 2019; (**c**) plot of actual 1F Men’s changing room temperature and its estimation for June 2019; (**d**) plot of actual B1F conference room 2 temperature and its estimation for June 2019.

**Figure 7.** (**a**) Plot of actual B1F conference room 2 relative humidity and its estimation for June 2019; (**b**) plot of actual 3F office room CO_{2} concentration and its estimation for June 2019.

Hyperparameter | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Description |
---|---|---|---|---|---|---|---|
max_depth | 4 | 2 | 4 | 3 | 2 | 2 | Maximum depth of each tree (1–10) |
n_estimators | 400 | 50 | 200 | 400 | 400 | 400 | Number of trees in the ensemble |
colsample_bytree | 1 | 1 | 1 | 1 | 1 | 1 | Fraction of features sampled for each tree |
min_child_weight | 1 | 1 | 1 | 1 | 1 | 1 | Minimum sum of instance weight needed in a child |
learning_rate | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | Learning rate used to weight each boosting step |

Selected Building Rooms | Validation RMSE | Test RMSE | Test MAPE (%) |
---|---|---|---|
B1F Conference room 2 | 0.2101 | 0.4632 | 1.0656 |
1F Men’s changing room | 0.3741 | 0.4340 | 0.9707 |
2F Office Room (West) | 0.2906 | 0.1467 | 0.4204 |
2F OA center room | 0.3312 | 0.3134 | 0.8392 |
3F Office Room (East) | 0.4236 | 0.1736 | 0.5060 |
3F OA center | 0.3155 | 0.2436 | 0.6374 |

Selected Building Rooms | Relative Humidity RMSE (%) | Relative Humidity MAPE (%) | CO_{2} Conc. RMSE (ppm) | CO_{2} Conc. MAPE (%) |
---|---|---|---|---|
B1F Conference Room 2 | 1.0992 | 1.1175 | 19.2314 | 1.5552 |
Cool Pit | 2.9769 | 2.2044 | N/A | N/A |
2F Office Room (East) | 2.9958 | 2.5130 | N/A | N/A |
2F OA center room | 3.1536 | 2.696 | N/A | N/A |
3F Office Room (West) | 2.9958 | 2.4096 | 33.3331 | 3.3610 |
3F OA center | 2.7648 | 2.4707 | N/A | N/A |

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kaligambe, A.; Fujita, G.; Keisuke, T.
Estimation of Unmeasured Room Temperature, Relative Humidity, and CO_{2} Concentrations for a Smart Building Using Machine Learning and Exploratory Data Analysis. *Energies* **2022**, *15*, 4213.
https://doi.org/10.3390/en15124213
