Integrated Method for Personal Thermal Comfort Assessment and Optimization through Users’ Feedback, IoT and Machine Learning: A Case Study

Thermal comfort has become a topical issue in building performance assessment as well as in energy efficiency. Three methods are mainly recognized for its assessment. Two of them, based on standardized methodologies, address the problem by considering the indoor environment in steady-state conditions (PMV and PPD) and users as active subjects whose thermal perception is influenced by outdoor climatic conditions (adaptive approach). The latter method is the starting point to investigate thermal comfort from an overall perspective by considering endogenous variables besides the traditional physical and environmental ones. Following this perspective, the paper describes the results of an in-field investigation of thermal conditions through the use of nearable and wearable solutions, parametric models and machine learning techniques. The aim of the research is to explore the reliability of IoT-based solutions combined with advanced algorithms, in order to create a replicable framework for the assessment and improvement of user thermal satisfaction. For this purpose, an experimental test in real offices was carried out involving eight workers. Parametric models are applied for the assessment of thermal comfort; IoT solutions are used to monitor the environmental variables and the users' parameters; the machine learning CART method makes it possible to predict the users' profile and the thermal comfort perception with respect to the indoor environment.


Introduction
Thermal Comfort (TC) is defined as the psychophysical satisfaction of an individual immersed in a thermal environment [1]. As described in the EN ISO 7730:2005 standard [2], TC is influenced by six factors [3], grouped into two categories: four objective variables, namely air temperature (Tair), relative humidity (RH), air velocity (Vair) and mean radiant temperature (Trad), and two subjective variables, namely metabolic activity and clothing. Over the years, several studies have highlighted how the thermal sensation of users further depends on factors linked to human characteristics, e.g., age, gender, pathologies, etc. [4][5][6][7][8][9][10][11]. Adaptive approaches, based on the subjective response of individuals to thermal stimuli, have been proposed in order to include these factors in the thermal assessment process.
The Thermal Sensation Votes (TSV) expressed by the users through a web-based survey are compared with the indices provided by the standards EN ISO 7730:2005 and ASHRAE 55:2017, namely PMV and the Graphic Comfort Zone Method (GCZM), respectively. The data collected during an experimental campaign in an office building are processed with ML techniques in order to identify advanced algorithms for the prediction of the user profile and the related optimal TC conditions.

Description of the Framework
The framework consists of the following parts:
• a monitoring system composed of: a nearable device (a term combining the words "near" and "wearable") for monitoring the environmental parameters near the user; and a wearable device for monitoring subjective variables;
• a web-based survey for collecting users' feedback in terms of TSV;
• a parametric model to assess the real TC conditions.

Nearable System
The nearable system applied in the framework is based on low-cost sensors and open-source hardware able to monitor indoor environmental parameters (air temperature, relative humidity, radiant temperature, air velocity, CO2 concentration, illuminance level) useful to assess different aspects of the Indoor Environmental Quality (IEQ) [31,32]. The accuracy of the sensors complies with the requirements of ISO 7726:1998 [33]. Table 1 reports the characteristics of the sensors used for TC assessment. The technical characteristics of the low-cost hot wire anemometer are not stated by the manufacturer but have been assessed by calibration as reported in [34]. Tests conducted in the 0 ÷ 6 m/s range with the methodology described in [31] verified the good behavior of the low-cost sensor through direct comparison with a professional one.

Wearable System
The wearable device is the Empatica E4 wristband (Empatica Inc., Cambridge, MA, USA). It is a class II medical device according to the FDA 21 CFR Part 860.3 regulations and is equipped with the following sensors:
• a photoplethysmography (PPG) sensor for the detection of the heart rate (HR) [35];
• an electrodermal activity (EDA) sensor;
• an infrared thermopile;
• a 3-axis accelerometer.
The Empatica E4 wristband was chosen for its verified accuracy [36] and because it is the most practical solution for the purpose of the research. Table 2 reports the characteristics of the sensors used for the acquisition of biometric data.

Web Based Survey
The web-based survey was defined according to the guidelines provided by the ASHRAE 55:2017 standard. It is implemented as a Google Forms model that gives the users free access. The data are automatically collected in a Google spreadsheet. The information requested from each user is: position occupied in the indoor environment, activity performed, clothing characteristics and thermal sensation on the 7-point scale of general comfort, where −3 is equivalent to "cold", +3 to "hot" and 0 to "neutral" sensation.
The analysis of the survey allowed the identification of the clothing insulation levels, the metabolic rate [37][38][39][40] and the thermal sensation of the individuals. The thermal resistance of the clothing is determined in compliance with Annex C of the EN ISO 7730:2005 standard. Each garment is characterized by a specific insulation value, expressed in clo (clothing unit, 1 clo = 0.155 m2 K/W); the overall thermal resistance is the sum of the thermal resistances of the individual garments [41]. The standard provides an additional thermal resistance for sedentary activities due to the type of chair, corresponding to an increment of 0.17 clo [42]. The standard also corrects the static insulation to account for a dynamic effect due to both air and body movements [43,44]. Table 3 reports the average clothing insulation values considered for each user. The common activity reported by the users in the surveys is "typing". As reported in Annex B of the EN ISO 7730:2005 standard, a metabolic rate of 1.2 met is considered for sedentary office activity.
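The summation of garment insulation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the garment values are hypothetical (they are not taken from Table 3), and the dynamic correction for air and body movement mentioned in the text is deliberately left out.

```python
CLO_TO_M2K_PER_W = 0.155  # 1 clo = 0.155 m2 K/W, as stated in the text
CHAIR_CLO = 0.17          # chair increment for sedentary activity quoted in the text


def ensemble_insulation_clo(garment_clo, seated=True):
    """Sum the insulation of the individual garments (clo).

    Adds the chair increment when the user is seated. The dynamic
    correction for air and body movement is NOT applied here.
    """
    total = sum(garment_clo)
    if seated:
        total += CHAIR_CLO
    return total


def clo_to_si(clo):
    """Convert an insulation value from clo to m2 K/W."""
    return clo * CLO_TO_M2K_PER_W


# Hypothetical garment values, for illustration only (not from Table 3).
garments = [0.03, 0.25, 0.25, 0.02, 0.04]
icl = ensemble_insulation_clo(garments)
print(f"{icl:.2f} clo = {clo_to_si(icl):.3f} m2 K/W")
```

Keeping the chair increment as an explicit option makes it easy to reuse the same function for standing users, for whom the increment would not apply.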

Parametric Model
The parametric model is realized using Grasshopper, a graphical algorithm editor tightly integrated with Rhino's 3-D modeling tools [45], and the following plugins:
• Ladybug tools, to assess the TC [46,47] based on PMV, PPD and GCZM;
• TT Toolbox, to import and export data from a generic database in .csv or .xls formats [48].

First Application and General Data
The system was installed on the desktop of eight workstations in a two-story office building located in San Giuliano Milanese, near Milan (Italy), and eight individuals were involved in the survey. The workstations are placed in five offices, three on the ground floor and two on the first floor of the building (Figure 1). Four offices (1, 3, 4, 5) have similar geometrical and morphological characteristics; office 2 is a double-size open space. Offices 1 and 3 have a double orientation, East and South-facing; offices 2, 4 and 5 have a single orientation, East-facing. The envelope of the building is characterized by: an external wall consisting of a non-insulated double layer of masonry brick with internal plaster finishing; a concrete basement for the floor and a mixed concrete-brick roof; single-pane glass windows with iron frames, all with the same dimensions. Offices 1 and 3 have two windows, one for each orientation; offices 4 and 5 have a single window each; office 2 has two windows. Each window is equipped with an internal manually-oriented curtain. The heating system consists of radiators placed below the windows. Furthermore, each office is equipped with a manually-controlled reversible heat pump.
The typical installation of the considered monitoring system on the office desktop is shown in Figure 2. Table 4 reports the areas of the offices, the personal data of the involved users and the periods of the tests. All subjects gave their informed consent for inclusion before they participated in the study. Table 5 reports the weather data related to the period of the test.

Objective Assessment of Thermal Comfort
The monitored environmental variables have been used along with the users' subjective variables for the calculation of PMV. The PMV values have been approximated to an integer number (PMVint) defined considering the ranges reported in Table 6.

Table 6. PMVint and related range of PMV.
PMVint | PMV
3 (hot) | >2.5
2 (warm) | 2.5:1.5
1 (slightly warm) |

The data collected by the survey are used to verify the difference between the calculated PMV and the TSV. Each TSV is compared with the equivalent PMVint, defined considering the average of the environmental values recorded in the previous 5 min. Table 7 shows the percentage difference between the sensation votes expressed in the feedback of each user and the related calculated PMVint. The greatest difference between the indices is recorded for users 2a and 2b, with a PMVint value higher than 60% of the TSV, and for users 4a and 4c, with a value greater than 40%.
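The mapping from a continuous PMV to PMVint can be illustrated as follows. Only the ">2.5" and "2.5:1.5" ranges are legible in Table 6; this sketch assumes that the remaining classes follow the same pattern symmetrically (breakpoints at ±0.5, ±1.5 and ±2.5) and that a value lying exactly on a breakpoint falls into the class closer to zero. Both are assumptions, and `pmv_to_int` is a hypothetical helper name.

```python
import math


def pmv_to_int(pmv):
    """Map a continuous PMV to the 7-point integer scale (-3 ... +3).

    Assumed breakpoints: 0.5, 1.5, 2.5 on each side of zero. A value
    exactly on a breakpoint is assigned to the class closer to zero,
    so that only PMV > 2.5 maps to 3 (hot).
    """
    magnitude = math.ceil(abs(pmv) - 0.5)  # e.g. 2.6 -> 3, 2.5 -> 2
    magnitude = max(0, min(magnitude, 3))  # clamp to the 7-point scale
    return magnitude if pmv >= 0 else -magnitude


for value in (2.6, 2.5, 0.3, -1.7):
    print(value, "->", pmv_to_int(value))
```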
Moreover, the analysis makes use of the psychrometric chart to identify the comfort zone according to the GCZM and the adapted Graphic Comfort Zone (GCZa), defined by the environmental variables corresponding to a TSV equal to 0. Figure 3 shows the comfort zone [49,50] according to the GCZM (black line) and the GCZa (pink line) for user 5a. The black line in Figure 3 is defined as a function of the thermal insulation, the metabolic rate, the radiant temperature and the air velocity. The pink line is based on the environmental data averaged over a period of one minute at the times when user 5a provided a TSV equal to 0. The differences in terms of comfort zone are quite evident, with a more restricted area in the approach based on the user's feedback. The new individual comfort zone could be considered in a hypothetical adaptive control and optimization strategy of the office thermal plant system. In this regard it is possible to consider, as a hypothesis, two users, 5a and 4a, located in the same office (Figure 4). An optimal TC level can then be identified from the intersection of the personalized comfort zones of the two users, each defined by the GCZa based on that user's feedback.
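The idea of intersecting two personalized comfort zones can be sketched in a strongly simplified form. The paper draws each GCZa as a region on the psychrometric chart; here, purely as an assumption for illustration, each zone is reduced to a rectangular bounding box in temperature and relative humidity built from the samples voted TSV = 0. The function names and the sample values are hypothetical.

```python
def neutral_box(samples):
    """Bounding box of the conditions a user voted as neutral (TSV = 0).

    samples: iterable of (temperature_C, rh_percent, tsv) tuples.
    Returns (t_min, t_max, rh_min, rh_max), or None without neutral votes.
    """
    neutral = [(t, rh) for t, rh, tsv in samples if tsv == 0]
    if not neutral:
        return None
    temps = [t for t, _ in neutral]
    rhs = [rh for _, rh in neutral]
    return (min(temps), max(temps), min(rhs), max(rhs))


def intersect_boxes(a, b):
    """Overlap of two boxes, or None when they do not overlap."""
    if a is None or b is None:
        return None
    t_lo, t_hi = max(a[0], b[0]), min(a[1], b[1])
    rh_lo, rh_hi = max(a[2], b[2]), min(a[3], b[3])
    if t_lo > t_hi or rh_lo > rh_hi:
        return None
    return (t_lo, t_hi, rh_lo, rh_hi)


# Hypothetical feedback for two users sharing an office (not measured data).
user_5a = [(22.0, 40.0, 0), (23.5, 45.0, 0), (25.0, 50.0, 1)]
user_4a = [(23.0, 42.0, 0), (24.5, 55.0, 0), (21.0, 38.0, -1)]
print(intersect_boxes(neutral_box(user_5a), neutral_box(user_4a)))
```

A rectangle is a coarse stand-in for the chart region; a polygon intersection on the psychrometric chart would be closer to what Figure 4 depicts.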
As shown in Figures 3 and 4, the parametric model makes it possible to calculate and display the differences between the standard and the personal comfort perception. The framework can also use the data collected by the wearable and nearable devices to investigate the interactions between the variables or to predict the users, as described in the following paragraphs.

Dataset Definition and Machine Learning Approach
All users were informed about how to use the nearable and wearable devices and how to fill in the web-based survey. The data recorded by the wearable device were verified and filtered with an ML algorithm for automatically detecting EDA artefacts [51], which are generated by electronic noise or by variations in the contact between the skin and the recording electrode caused by pressure, excessive movement, or adjustment of the device. The algorithm is available on a web page [52] or as a downloadable Python script. Applying the algorithm is essential to detect and filter noise and artefacts [52]. Through this algorithm the raw data, acquired with a sampling frequency of 4 Hz, are divided into periods of 5 s. Using a binary classification, each of these periods is assigned a noise classification number equal to −1 (noisy data, red background in Figure 5) or 1 (clean data, white background in Figure 5).
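The segmentation described above implies 4 Hz × 5 s = 20 samples per period. The sketch below shows only this bookkeeping and the subsequent filtering; it does not reproduce the artefact classifier of [51], whose labels are taken here as given. The function names are hypothetical, and dropping a trailing incomplete period is an assumption.

```python
SAMPLING_HZ = 4      # sampling frequency stated in the text
EPOCH_SECONDS = 5    # period length stated in the text
SAMPLES_PER_EPOCH = SAMPLING_HZ * EPOCH_SECONDS  # 20 samples


def split_into_epochs(signal):
    """Split a sequence of samples into consecutive 5-s periods.

    A trailing incomplete period is dropped (assumption).
    """
    n_full = len(signal) // SAMPLES_PER_EPOCH
    return [
        signal[i * SAMPLES_PER_EPOCH:(i + 1) * SAMPLES_PER_EPOCH]
        for i in range(n_full)
    ]


def keep_clean(epochs, labels):
    """Keep only the periods labelled 1 (clean); -1 marks noise."""
    if len(epochs) != len(labels):
        raise ValueError("one label per period is required")
    return [epoch for epoch, label in zip(epochs, labels) if label == 1]


# Synthetic signal and hypothetical labels, for illustration only.
signal = list(range(50))             # 12.5 s of data at 4 Hz
epochs = split_into_epochs(signal)   # two full periods
clean = keep_clean(epochs, [1, -1])  # the second period is discarded
print(len(epochs), len(clean))
```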
A first dataset is then defined, composed of the data from the nearable device merged with those from the wearable device processed with the EDA Explorer algorithm. The resulting dataset has 15,456 instances (rows) and 15 attributes (Time, Z-axis, Y-axis, X-axis, EDA_explorer_label, skin temperature (Tskin), EDA, HR, Tair, RH, Trad, air velocity, CO2 concentration, illuminance level and User). A new dataset was then built excluding: time dependencies; the subjective variables related to the accelerations along the three axes, which are scarcely significant given the sedentary activities of the involved users; and environmental variables such as CO2 concentration and illuminance level. At present, the experimentation focuses on the environmental and subjective parameters that directly describe the TC conditions and allow the identification of the user profile. Future developments will analyze, through ML techniques, the correlation between other environmental parameters, such as CO2 and illuminance level, and the thermal sensation. Finally, Tair and Trad are used to calculate the operative temperature (To), while the air velocity is neglected because the values monitored in these spaces are close to 0 m/s. As a result, only 6 attributes (Tskin, EDA, HR, To, RH and User) are considered, with 9022 instances obtained by keeping only the rows with a noise classification value equal to 1, as reported in the categorical column "EDA_explorer_label". Table 8 summarizes the number of instances for each user and related data.
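The reduction from 15 to 6 attributes can be sketched as a per-row transformation. The text does not state how To is computed; this sketch assumes the common simplification for near-still air, in which the operative temperature is the arithmetic mean of air and mean radiant temperature. The dictionary keys are illustrative spellings of the attribute names, and the sample row is invented.

```python
def operative_temperature(t_air, t_rad):
    """Operative temperature as the mean of air and radiant temperature.

    Assumption: valid as a simplification when air velocity is close
    to 0 m/s, as reported for the monitored offices.
    """
    return 0.5 * (t_air + t_rad)


def reduce_row(row):
    """Map a full 15-attribute record to the 6 retained attributes.

    Returns None for rows flagged as noise (EDA_explorer_label != 1).
    """
    if row["EDA_explorer_label"] != 1:
        return None
    return {
        "Tskin": row["Tskin"],
        "EDA": row["EDA"],
        "HR": row["HR"],
        "To": operative_temperature(row["Tair"], row["Trad"]),
        "RH": row["RH"],
        "User": row["User"],
    }


# Invented record, for illustration only.
row = {
    "Time": "10:00:00", "Z-axis": 1, "Y-axis": 2, "X-axis": 3,
    "EDA_explorer_label": 1, "Tskin": 32.1, "EDA": 0.4, "HR": 72.0,
    "Tair": 22.0, "RH": 45.0, "Trad": 24.0, "air velocity": 0.0,
    "CO2": 600.0, "illuminance": 350.0, "User": "5a",
}
print(reduce_row(row))
```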
Figure 6 reports the distribution of the input variables. The data related to HR, EDA and skin surface temperature (Tskin) have a pseudo-Gaussian distribution. Figure 7 reports the interaction between the variables. As reported in the legend, each color characterizes a specific user. Some pairs of attributes show a predictable relationship in some dimensions but, in general, it is not possible to identify in advance which algorithms would be best to predict the users from this dataset. For this purpose, a set of six algorithms is considered, two linear (Logistic Regression [53] and Linear Discriminant Analysis [54]) and four non-linear (K-Nearest Neighbors [55], Classification and Regression Trees [56], Gaussian Naive Bayes [57], Support Vector Machines [58]).
The dataset is divided into two subsets, containing 80% and 20% of the instances. The former is used to train the models and the latter for validation. Table 9 reports the average accuracy (Avg.) and the standard deviation (St. dev.) of the linear and non-linear algorithms on the training dataset. Depending on the scenario reported in Table 9, all instances of a different combination of attributes 0 to 4 (Tskin, EDA, HR, To and RH) are considered as the input variable "x" and the instances of attribute 5 (User) as the target variable "y". The models are evaluated with the 'accuracy' metric [59], defined as the number of correctly predicted instances divided by the total number of instances in the dataset. k-fold cross validation (k = 10) [60] is used to evaluate the performance of the different algorithms on the dataset; the Avg. and St. dev. values in Table 9 are therefore computed over the 10 folds.
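The evaluation protocol (80/20 split, 10-fold cross validation on the training part, mean and standard deviation of the accuracy) can be sketched with the standard library alone. This is an illustration of the procedure, not the code used in the study: the helper names are hypothetical, the shuffling seed is arbitrary, the population standard deviation is an assumed convention, and the majority-class predictor is only a stand-in for the six algorithms.

```python
import random
import statistics


def train_validation_split(n, validation_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, validation)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_val = int(round(n * validation_fraction))
    return indices[n_val:], indices[:n_val]


def k_fold_indices(indices, k=10):
    """Yield (train, test) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def accuracy(y_true, y_pred):
    """Correctly predicted instances divided by the total number."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_val_accuracy(fit_predict, X, y, indices, k=10):
    """Mean and population standard deviation of accuracy over k folds."""
    scores = []
    for train, test in k_fold_indices(indices, k):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        scores.append(accuracy([y[i] for i in test], preds))
    return statistics.mean(scores), statistics.pstdev(scores)


def majority_baseline(X_train, y_train, X_test):
    """Stand-in model: always predict the most frequent training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Synthetic data, for illustration only.
X = [[float(i)] for i in range(100)]
y = ["1a"] * 70 + ["5a"] * 30
train_idx, val_idx = train_validation_split(len(X))
print(cross_val_accuracy(majority_baseline, X, y, train_idx))
```

Any of the six algorithms could be plugged in through the `fit_predict` argument, which keeps the comparison protocol identical across models.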
Among the models, the Classification and Regression Trees (CART) algorithm has the highest estimation accuracy in all the considered scenarios (Table 9). CART is a non-parametric supervised learning method that predicts the value of a target variable by learning simple decision rules inferred from the data features [61]. Figure 8 reports the visual representation (truncated at the fourth level for better visualization) of the trained user classifier for scenario V (see Table 9), which gave the highest average accuracy score and uses as input variables all the instances of the Tskin, EDA, To and RH columns. Each node shows: the "gini" impurity of the node, which reaches zero when all cases in the node fall into a single target category; the "sample" variable, which indicates the number of samples at the node; and the "value" list of seven entries, which reports how many of the observations sorted into that node fall into each of the seven categories (1a, 2a, 2b, 4a, 4b, 4c, 5a). The complete tree can be displayed in [62], making it possible to visualize the rules extracted from the training dataset.
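The "gini" figure shown at each node can be computed directly from the node's "value" list as one minus the sum of the squared class proportions. The sketch below also shows the sample-weighted impurity of a candidate split, which is the quantity a Gini-based tree seeks to reduce; the function names and the class counts are hypothetical.

```python
def gini_impurity(value):
    """Gini impurity of a node from its per-class counts.

    Returns 0.0 when all samples belong to a single class.
    """
    total = sum(value)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in value)


def weighted_split_impurity(left, right):
    """Sample-weighted Gini impurity of the two children of a split."""
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    return (n_left / n) * gini_impurity(left) + (n_right / n) * gini_impurity(right)


# Hypothetical counts over the seven users (1a, 2a, 2b, 4a, 4b, 4c, 5a).
parent = [40, 40, 0, 0, 0, 0, 0]
left, right = [40, 5, 0, 0, 0, 0, 0], [0, 35, 0, 0, 0, 0, 0]
print(gini_impurity(parent), weighted_split_impurity(left, right))
```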
Having identified the best model on the training dataset and visualized all the classification rules, the accuracy of the selected CART model and of scenario V can now be checked on the validation set, which gives an independent final check of the selected model and of the input variables used to identify the users. Table 10 shows the classification report summarizing the results as a final accuracy score of the CART model directly on the validation set.
The classification report in Table 10 lists the following metrics:
• Precision, a measure of the classifier's exactness;
• Recall, a measure of the classifier's completeness;
• F1-score, the harmonic mean of precision and recall;
• Support, the number of occurrences of each label in y true.
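The four quantities of the classification report can be derived from the true and predicted labels as follows. This is a minimal sketch with hypothetical names and invented labels; it returns 0.0 where a ratio is undefined, which is a convention assumed here rather than one stated in the paper.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Per-label precision, recall, F1-score and support."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    support = Counter(y_true)
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": support[label]}
    return report


# Invented labels, for illustration only.
y_true = ["1a", "1a", "2a", "2a"]
y_pred = ["1a", "2a", "2a", "2a"]
for label, metrics in classification_report(y_true, y_pred).items():
    print(label, metrics)
```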
The ML approach made it possible to exclude the HR variable while still achieving the highest level of accuracy for the specific case of sedentary activity. The ML approach therefore makes it possible to identify the users and, consequently, their neutral TSV, considering objective and subjective variables.

Conclusions and Future Work
The standard approach for TC assessment is essentially based on a thermo-physical model that does not consider other factors (behavioral, physiological, and psychological) or the complex state of mind that could affect the TC perception. The developed framework combines the users' feedback and an IoT-based solution with the functionality of parametric models and ML, and makes it possible to overcome the limitations of the thermal energy balance equations. The comparison of all the information acquired by the survey highlights the differences between the individual perception of TC, expressed by TSV and GCZa, and that defined by the standards, PMV and GCZM, respectively. The wearable and nearable data, processed with ML, make it possible to look for dependencies among the different variables in order to identify the different subjects. The proposed framework has made it possible to detect the indoor environmental variables close to the users, in addition to the biometric parameters and the users' feedback, in order to:
• highlight differences among users and TC perception;
• define individual GCZa based on users' feedback in order to optimize the TC control strategy;
• identify the most relevant parameters for user recognition and, consequently, for the identification of their personal optimal TC perception.
The method is the basis for the development of an individual control strategy that combines environmental and biometric variables with the power of ML techniques. ML is applied to identify the users and their TC perception, thus overcoming the limit of an imbalanced dataset due to the small number of users' feedback entries in relation to the environmental and biometric data. A more balanced dataset, obtained through longer monitoring of both environmental parameters and users' feedback, will make it possible to directly predict the thermal sensation of the users. This is a goal of a future application of the framework.
The adopted flexible solution can be used in different contexts (hospitals, schools, gyms, nursing homes, etc.) and with different types of users (divided by age, gender, presence of pathologies, etc.), so as to identify patterns useful for the optimal management of personal thermal comfort. It can be upgraded with other features, including a more integrated approach in which a chat bot assists the user in the initial phase and in the collection of feedback.
In the current case study, the influence of air velocity was neglected. Air velocity could become an important variable for thermal comfort assessment that cannot be excluded, especially in indoor spaces with a forced-air cooling/heating system [64,65]. The spatial variability of the environmental parameters also deserves attention, as it can cause thermal discomfort for the users. The current framework can address this issue by exploiting the flexibility and reliability of the low-cost devices integrated in a Wireless Sensor Network. Finally, the interaction between environmental parameters, such as CO2 concentration and illuminance level, and the users' thermal sensation will be investigated by exploiting the potential of ML techniques, thus making it possible to include other aspects related to IEQ: Indoor Air Quality (IAQ), Indoor Lighting Quality (ILQ) [66][67][68], acoustic comfort [69][70][71] and their interaction with the energy performance of buildings [72][73][74][75].
Author Contributions: The work presented in this paper is a collaborative development by all of the authors under the coordination of F.S. who conceived the framework. In particular, F.S., L.B. and G.G. have performed the experimental campaign. F.S., L.D. and I.M. performed the environmental analyses. F.S., M.G. and the team of SCS have performed the machine learning approach. Writing of the paper was shared between the authors.
Acknowledgments: This work has been supported within the Framework Agreement between the National Research Council of Italy (CNR) and the Lombardy Region in the framework of the research project FHfFC (Future Home for Future Communities) [76]. Some of the parametric models used for the graphical representations are developed under the RIGERS project, "Regeneration of the city: smart buildings and grids". All plug-ins and libraries used are open source or BSD licensed: Thanks to the developer community for the great work and the commitment to keep them. Thanks also to the colleagues who lent themselves to the test.

Conflicts of Interest:
The authors declare no conflict of interest.