2.5.1. Decision-Tree Model
Thermal comfort depends on environmental factors, such as air temperature, air velocity, relative humidity and the uniformity of conditions, and on personal factors, such as clothing and metabolic heat. However, assessing indoor thermal comfort with all of these variables (see predicted mean vote) is complex, and a simpler measure can be more useful in practice. Operative temperature, derived from air temperature, mean radiant temperature and air speed, is widely used as a reasonable indicator of thermal comfort. Operative temperature is defined as:
$$T_{op} = \frac{T_{mr} + T_{a}\sqrt{10v}}{1 + \sqrt{10v}}$$
where $T_{a}$ is the air temperature; $T_{mr}$ is the mean radiant temperature; and $v$ is the air speed (m/s). In this study, the indoor operative temperature was selected as the indoor thermal comfort index for natural ventilation control.
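As a minimal sketch of how the operative temperature can be evaluated, the function below assumes the commonly used air-speed-weighted form, in which the air temperature is weighted by the square root of ten times the air speed (in m/s). The function and argument names are illustrative and do not come from the study.

```python
import math


def operative_temperature(t_air, t_mrt, air_speed):
    """Operative temperature from air temperature and mean radiant temperature.

    Assumed form: T_op = (T_mr + T_a * sqrt(10 v)) / (1 + sqrt(10 v)),
    with temperatures in deg C and air speed v in m/s.
    """
    if air_speed < 0:
        raise ValueError("air_speed must be non-negative")
    weight = math.sqrt(10.0 * air_speed)  # relative weight of the air temperature
    return (t_mrt + t_air * weight) / (1.0 + weight)


if __name__ == "__main__":
    # At 0.1 m/s the weight is 1, so the result is the plain average.
    print(operative_temperature(t_air=24.0, t_mrt=26.0, air_speed=0.1))
```

At an air speed of 0.1 m/s the two temperatures are weighted equally; in still air the expression reduces to the mean radiant temperature.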
In this paper, the decision-tree induction method was used to generate a rule-based window control algorithm that determines whether natural ventilation can be used under local climate conditions. A decision-tree model is an inverted tree-like structure built from nodes and branches. Each internal node represents a test condition on an attribute, each leaf node represents a classification prediction, and the branches represent the outcomes of the tests [32]. The process used to generate the decision-tree model in this study is presented in Figure 6.
The data used for the decision-tree induction and validation was first generated based on the hourly simulation with the multi-zone building model under different window opening percentages (i.e., 25%, 50%, 75% and 100%). Then, the simulated indoor operative temperatures under different window and weather conditions (including ambient air temperature, relative humidity, solar radiation, wind speed and wind direction) were prepared as the data sets for decision-tree induction and validation. Based on these data sets, the applicability of natural ventilation under different window and weather conditions was assessed by an adaptive thermal comfort model (the 80% thermal comfort band) developed by De Dear [33]. The equation is given below [34]:
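$$T_{upper} = 0.31\,T_{out} + 17.8 + \tfrac{1}{2}\,\Delta T_{80}$$
$$T_{lower} = 0.31\,T_{out} + 17.8 - \tfrac{1}{2}\,\Delta T_{80}$$
where $T_{upper}$ and $T_{lower}$ represent the upper and lower temperature thresholds, which vary by month, $T_{out}$ is the monthly average outdoor temperature, and $\Delta T_{80}$ is the mean comfort zone temperature band for 80% acceptability.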
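The threshold check can be sketched as follows. The coefficients 0.31 and 17.8 and the 7 °C band width are assumptions taken from the widely cited adaptive comfort relation associated with de Dear and the ASHRAE 55 adaptive model, not values confirmed by this section. The function names are hypothetical.

```python
def comfort_band(t_out_monthly_mean, band_width=7.0):
    """Return (lower, upper) acceptable operative temperatures in deg C.

    Assumed neutral temperature: 0.31 * T_out + 17.8 (adaptive model).
    band_width is the assumed full width of the 80% acceptability band.
    """
    t_neutral = 0.31 * t_out_monthly_mean + 17.8
    half_band = band_width / 2.0
    return t_neutral - half_band, t_neutral + half_band


def ventilation_label(t_operative, t_out_monthly_mean):
    """Label a record "ON" if the operative temperature lies inside the band."""
    lower, upper = comfort_band(t_out_monthly_mean)
    return "ON" if lower <= t_operative <= upper else "OFF"


if __name__ == "__main__":
    print(comfort_band(20.0))             # band for a 20 deg C monthly mean
    print(ventilation_label(25.0, 20.0))  # inside the band
    print(ventilation_label(29.0, 20.0))  # above the upper threshold
```

Records whose simulated operative temperature falls inside the band are the ones for which natural ventilation is treated as applicable.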
If the indoor operative temperature was within the 80% acceptable indoor operative temperature thresholds [35], natural ventilation was considered usable for that particular condition and the data set was labelled as “ON”. Otherwise, the data set was labelled as “OFF”. Half of the labelled data was randomly selected as the training data for the decision-tree induction using the C4.5 algorithm [36]. C4.5 induces the decision tree based on the concept of Shannon entropy, which measures the unpredictability or impurity of the information content [37]. The lower the Shannon entropy, the lower the impurity of the attribute partition. If a set of training data was allocated to a node S, and the probability distribution of the target attribute was $P = (p_1, p_2, \ldots, p_n)$, the Shannon entropy of the training data carried by this distribution is defined as Equation (4):
$$H(S) = -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (4)$$
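For illustration, the entropy of the labels held at a node can be computed as below. This is a sketch that assumes the usual base-2 definition of Shannon entropy, with class probabilities estimated from label counts. The function name is illustrative.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """Shannon entropy (bits) of the class labels carried by a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    # sum of p * log2(1/p), which equals -sum of p * log2(p)
    return sum((count / n) * log2(n / count) for count in Counter(labels).values())


if __name__ == "__main__":
    print(shannon_entropy(["ON", "ON", "OFF", "OFF"]))  # evenly mixed node
    print(shannon_entropy(["ON", "ON", "ON", "ON"]))    # pure node
```

An evenly mixed two-class node has an entropy of 1 bit, whereas a pure node has an entropy of 0, which is why a split that lowers entropy is preferred.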
In the decision-tree induction, rules including the Gini index, pre-pruning criteria, and the minimal expected predictive accuracy were defined first. To balance the decision-tree scale and splitting accuracy (i.e., the ratio between the correctly labelled training data sets and all the training data sets), the Gini index was used to measure the impurity of a node [38]. Meanwhile, pre-pruning criteria, including the minimal gain, minimal leaf size and minimal size, were adopted to avoid overfitting of the decision tree. Because the expected predictive accuracy represents the expected ratio between the correctly labelled testing data sets and all the testing data sets, it depends on the data quality. A reasonable expected predictive accuracy of 0.93 was therefore selected through trial and error. The details of the induction rule settings are given in the Appendix (Table A3).
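The Gini index mentioned above can be sketched in a few lines. The code assumes the standard definition (one minus the sum of squared class proportions) and a size-weighted average for a binary split. Function names are illustrative and are not RapidMiner settings.

```python
from collections import Counter


def gini_index(labels):
    """Gini impurity of the class labels at a node (0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left_labels, right_labels):
    """Size-weighted Gini impurity of a binary split."""
    n = len(left_labels) + len(right_labels)
    if n == 0:
        return 0.0
    return (len(left_labels) / n) * gini_index(left_labels) + (
        len(right_labels) / n
    ) * gini_index(right_labels)


if __name__ == "__main__":
    parent = ["ON", "ON", "OFF", "OFF"]
    print(gini_index(parent))                        # mixed parent node
    print(split_gini(["ON", "ON"], ["OFF", "OFF"]))  # perfectly separating split
```

A candidate split is attractive when its weighted impurity is well below that of the parent node; a minimal-gain criterion rejects splits whose improvement is too small.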
Given the defined Gini index, pre-pruning criteria and expected predictive accuracy, an initial maximum tree depth can be assigned for decision-tree induction and validation. To control the size of the decision tree, the depth should start with a relatively small value. The values used for the tree induction follow [39].
If the predictive accuracy of the decision tree validated by the testing data was larger than the minimum expected value, the decision-tree learning process would be terminated. Then, the generated decision tree would be used for ventilation control. Otherwise, a new tree would be generated by increasing the maximum tree depth so as to improve the predictive accuracy.
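The stopping rule described above can be sketched as a simple loop. The callables `train_fn` and `accuracy_fn`, the starting depth and the depth limit are hypothetical placeholders for whatever induction tool is used; only the 0.93 accuracy target comes from the text.

```python
def induce_with_growing_depth(train_fn, accuracy_fn, min_accuracy=0.93,
                              start_depth=2, depth_limit=30):
    """Increase the maximum tree depth until the validated accuracy is met.

    train_fn(max_depth) -> tree        (hypothetical: trains on training data)
    accuracy_fn(tree)   -> float       (hypothetical: scores on testing data)
    start_depth and depth_limit are assumed values, not taken from the study.
    """
    depth = start_depth
    while depth <= depth_limit:
        tree = train_fn(depth)
        accuracy = accuracy_fn(tree)
        if accuracy > min_accuracy:  # target exceeded: stop learning
            return tree, depth, accuracy
        depth += 1  # otherwise allow a deeper tree and try again
    raise RuntimeError("accuracy target not reached within depth_limit")


if __name__ == "__main__":
    # Toy stand-ins: the "tree" is just its depth, accuracy grows with depth.
    tree, depth, acc = induce_with_growing_depth(
        train_fn=lambda max_depth: max_depth,
        accuracy_fn=lambda t: 0.80 + 0.03 * t,
    )
    print(depth, acc)
```

Starting shallow and deepening only when needed keeps the final tree as small as the accuracy target allows.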
2.5.2. Natural Ventilation Strategy Based on the Decision-Tree Model
The decision tree for the case study building was induced using the open-source data mining software RapidMiner. A total of 3000 hourly data sets for each of the four window opening conditions (i.e., 25%, 50%, 75% and 100%) were obtained over the whole RMY for decision-tree generation and validation.
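The random half-and-half division into training and testing data described in Section 2.5.1 could look like the sketch below. The record layout (window opening percentage, hour index), the seed and the function name are assumptions made for illustration only.

```python
import random


def split_half(records, seed=0):
    """Randomly split records into two halves (training, testing)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Placeholder records: 3000 hourly entries per window opening percentage.
openings = (25, 50, 75, 100)
records = [(opening, hour) for opening in openings for hour in range(3000)]
train_set, test_set = split_half(records)

if __name__ == "__main__":
    print(len(records), len(train_set), len(test_set))
```

With 3000 records for each of the four opening percentages, this yields 6000 training and 6000 testing records.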
The final decision-tree model for the case study building in Sydney is depicted in Figure 7. The decision tree consisted of 55 nodes, of which 27 yellow rectangular nodes represented the categorical parameters, and 12 blue and 16 red ovals at the bottom denoted the classification results. The outdoor air temperature nodes accounted for 1/3 of the internal nodes (i.e., nodes between the input and the output), indicating that the outdoor air temperature was one of the most critical parameters for natural ventilation. Using this decision tree, each data record was assigned to a leaf node associated with a specific window condition, so that a window opening prediction could be made. In addition, no internal node related to the window opening percentage was found in this decision tree, implying that the window opening percentage had less influence on ventilation mode selection than the outdoor climate.