Hygrothermal Dynamic and Mould Growth Risk Predictions for Concrete Tiles by Using Least Squares Support Vector Machines

The hygrothermal analysis of roofs is relevant because large roof areas are exposed to a wide range of weather conditions, which directly affect the energy performance and thermal comfort of buildings. However, after a long service life, the solar absorptivity of roof coatings can be altered by mould accumulation. Based on two well-established mathematical models, one that adopts driving potentials to calculate temperature, moist air pressure and water vapor pressure gradients, and another that estimates the mould growth risk on surfaces, this research introduces an approach to predict mould growth with reduced computational effort and simulation time. Using multiple MISO (Multiple-Input, Single-Output) Nonlinear AutoRegressive with eXogenous inputs (NARX) models, a machine learning technique known as Least Squares Support Vector Machines (LS-SVM), a maximum-margin model based on structural risk minimization, was used to predict vapor flux, sensible heat flux, latent heat flux, and mould growth risk on roof surfaces. The proposed model was validated in terms of the Multiple Correlation Coefficient (R²), Mean Square Error (MSE) and Mean Absolute Error (MAE) performance indices, using as input the weather file for the city of Curitiba, Brazil, and showed consistent precision when compared to the results of a validated numerical model.


Introduction
Most recent studies in building physics focus on energy savings and/or thermal comfort. Taking into account that a considerable amount of the energy attributed to buildings is used to provide thermal comfort, and that in modern societies people spend over 90% of their time indoors [1], buildings have become responsible for a considerable share of energy demand worldwide [2,3]. In Brazil, residential and commercial buildings are responsible for almost 45% of the country's energy demand [4], which progressively motivates energy conservation studies aimed at building energy efficiency.
The design of energy-efficient buildings depends on a large number of variables, including the required indoor climate conditions, internal gains, the prevailing outdoor climatic conditions, and the choice of building construction materials and insulation. Accurate methods to predict the hygrothermal performance of building envelopes have become necessary, and software capable of performing whole-building energy simulation and analysis has become available [5][6][7][8][9]. However, accounting for moisture in this type of simulation requires complex and time-consuming computational codes [10][11][12][13]. An alternative way to evaluate the influence of coupled heat and moisture transfer in buildings is to adopt computational intelligence and machine learning techniques [14]. Moreover, this type of technology can also be used in the analysis of building energy demand and energy savings [15][16][17].
In warm climates, during clear sky conditions, up to about 1 kW/m² of solar radiation can be incident on a roof surface, and between 20% and 95% of this radiation is typically absorbed. The roof color, which is apparent from the reflected visible part of the solar radiation, usually gives an indication of the value of solar absorption [18]. In cases of high radiation incidence, the use of proper insulation or higher roof solar reflectance can reduce the solar energy absorbed by the roof, reducing air conditioning use in warm-climate countries. Additionally, in cold weather, proper insulation can avoid heat losses, increasing the energy efficiency of the whole building. For these reasons, the application of different types of thermal insulation and special building materials has increased significantly in recent years [19] and has become a valuable strategy for making buildings more sustainable [20][21][22].
In this context, reflective roof or insulation coatings have been used to increase the energy savings potential of building envelopes, and several studies can be found in the literature, such as [18,19][23][24][25][26][27]. Nevertheless, due to their low cost, concrete or ceramic tiles are widely used in roofs in Brazil. These tiles allow mould or algae to grow, and normally no paint layer or impermeable film is provided (Figure 1a). The same phenomenon is commonly observed in asphalt roofing shingles, which are found widely in European and North American buildings (Figure 1b). Mould growth decreases tile durability, worsens the aesthetic appearance of buildings and increases the solar absorptivity.
Taking into account that the cost of both heating and cooling in buildings is directly affected by roof performance, this article presents an approach combining a numerical computational code and an artificial intelligence method in order to predict the hygrothermal behavior of building roofs, focusing on the evaluation of mould growth risk on these structures. The main idea is to perform a nonlinear system identification using data obtained from the results of the numerical model. The main objectives are to reduce computational costs and to provide a consistent approximation of an already validated numerical model. SVMs (Support Vector Machines) have already proven to be a promising approach in nonlinear identification and modelling. The technique was developed based on statistical learning (details in [28]) and was originally created to solve classification problems. SVM refers to a kernel-based method, similar to artificial neural network (ANN) models, which constitutes an approximate implementation of the structural risk minimization principle [29]. By means of kernels, SVMs go beyond the hyperplanes generated initially, mapping the input data into a high-dimensional feature space, and have been widely applied in classification [30][31][32] and nonlinear regression [33][34][35].
A variation of SVMs known as LS-SVMs (Least Squares Support Vector Machines), adopted in this work, was proposed, evaluated, and compared to the classical version of SVMs [36,37] for a regression/identification task. It involves equality constraints only, so the solution is obtained by solving a system of linear equations. In the application presented in this work, sensible and latent heat flows, vapor flow and mould growth risk on concrete tiles, which are widely used in Brazilian buildings, were predicted using the LS-SVM approach with Multiple-Input, Single-Output (MISO) models. The technique was selected for the learning capability of the SVM, especially with respect to the nonlinearities that moisture introduces into the system.
The next section of this article presents the data acquisition procedures, where a validated mathematical model was adopted to generate data for the system identification procedures. The data set analysis is presented in Section 3, followed by a detailed description of the LS-SVM technique in Section 4. Section 5 presents and discusses both training and validation results found by the LS-SVM. Finally, Section 6 addresses the conclusions and future work of this type of research.

Data Acquisition Procedures
This section presents the mathematical model adopted for data acquisition. In order to obtain a consistent data set, the simulation procedures described here were used to generate the data for the system identification procedures presented in Section 4 of the present study.

Mathematical Model
The model for the porous media domain (Figure 2) has been elaborated considering the differential governing equations for moisture, air and energy balances [38]. The transient terms of each governing equation have been written in terms of the driving potentials to take more advantage of the MultiTriDiagonal-Matrix Algorithm (MTDMA) [39].

Porous Element Domain
The model is based on averages taken over a representative elementary volume (REV). The moisture transport has been divided into liquid and vapor flows, as shown in Equation (1):

$$ j = j_l + j_v \tag{1} $$

where $j$ is the density of moisture flow rate (kg/(m²·s)), $j_l$ the density of liquid flow rate (kg/(m²·s)) and $j_v$ the density of vapor flow rate (kg/(m²·s)). The liquid transport calculation is based on the Darcy equation:

$$ j_l = K \left( \nabla P_{suc} - \rho_l\, g \right) \tag{2} $$

where $K$ is the liquid water permeability (s), $P_{suc}$ the suction pressure (Pa), $\rho_l$ the liquid water density (kg/m³) and $g$ the gravity (m/s²).
The capillary suction pressure can be written as a function of temperature and moisture content (Equation (3)). Similar to the liquid flow, the vapor flow is calculated from the Fick equation considering the effects of both the vapor pressure and air pressure driving potentials:

$$ j_v = -\delta_v \nabla P_v - \rho_v \frac{k\, k_{rg}}{\mu_g} \nabla P_g \tag{4} $$

where $\delta_v$ is the vapor diffusive permeability (s), $P_v$ the partial vapor pressure (Pa), $\rho_v$ the vapor density (kg/m³), $k$ the absolute permeability (m²), $k_{rg}$ the gas relative permeability, $\mu_g$ the dynamic viscosity (Pa·s) and $P_g$ the gas pressure. The first term after the equality represents vapor diffusion, and the second one vapor advection. The water mass conservation equation can be described as:

$$ \frac{\partial w_m}{\partial t} = -\nabla \cdot j \tag{5} $$

where $w_m$ is the moisture content (kg/m³). This moisture content conservation equation, Equation (5), can be written in terms of the three driving potentials (Equation (6)).
In the proposed model, the air transport is considered individually through the dry-air mass balance. In this way, the dry-air conservation equation can be expressed as:

$$ \frac{\partial \rho_a}{\partial t} = -\nabla \cdot j_a \tag{7} $$

with the air flow $j_a$ calculated as the sum of an air diffusion term and an air convection term (Equation (8)), where $\rho_a$ is the dry-air density (kg/m³), $j_a$ the dry-air flow rate density (kg/(m²·s)) and $P_g$ the gas pressure (dry air pressure plus vapor pressure, in Pa). The dry air transport can therefore be described as a function of the partial gas and vapor pressure driving potentials, and the air balance can be written accordingly (Equation (9)).
Because temperature gradients are low, heat transfer has been attributed to conductive and convective effects only. The conductive transport is calculated by Fourier's law:

$$ q_{cond} = -\lambda \nabla T \tag{10} $$

while the convective transport can be written as:

$$ q_{conv} = j_l\, c_{pl}\, T + j_a\, c_{pa}\, T + j_v\, L + j_v\, c_{pv}\, T \tag{11} $$

where $\lambda$ is the thermal conductivity (W/(m·K)), $c_{pa}$ the dry-air specific heat at constant pressure (J/(kg·K)), $c_{pl}$ the liquid water specific heat (J/(kg·K)), $c_{pv}$ the vapor specific heat at constant pressure (J/(kg·K)) and $L$ the vaporization latent heat (J/kg). The terms after the equality represent liquid flow, dry air flow, phase change, and vapor flow, respectively.
The energy balance equation based on the first law of thermodynamics can be written in this case as:

$$ c_m\, \rho_0 \frac{\partial T}{\partial t} = -\nabla \cdot \left( q_{cond} + q_{conv} \right) \tag{12} $$

where $c_m$ is the specific heat of the structure (J/(kg·K)) and $\rho_0$ the dry-basis material density (kg/m³). In this way, assuming 0 °C as the reference temperature, the energy conservation equation can be rewritten in terms of the three driving potentials (Equation (13)).
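As a rough numerical illustration of the two heat transport contributions described above, the sketch below evaluates a one-dimensional conductive flux from Fourier's law, q_cond = -λ dT/dx, and a convective flux built from the four terms the text lists (liquid flow, dry air flow, phase change and vapor flow), with temperatures in °C because 0 °C is the reference temperature. The function names, the thermal conductivity, the surface temperatures, the vapor flow rate and the textbook fluid properties are illustrative assumptions, not values from the study; only the 1.5 cm tile thickness comes from the text.

```python
# Minimal sketch of the conductive and convective heat flux terms.
# All numerical values below are illustrative assumptions unless noted.

def conductive_flux(lam, dT_dx):
    """Fourier's law in 1-D: q = -lambda * dT/dx, in W/m2."""
    return -lam * dT_dx


def convective_flux(j_l, j_a, j_v, T,
                    c_pl=4186.0,   # liquid water specific heat, J/(kg.K) (typical value)
                    c_pa=1006.0,   # dry-air specific heat, J/(kg.K) (typical value)
                    c_pv=1860.0,   # vapor specific heat, J/(kg.K) (typical value)
                    L=2.5e6):      # latent heat of vaporization, J/kg (typical value)
    """Sum of liquid flow, dry air flow, phase change and vapor flow terms.

    Flow rate densities are in kg/(m2.s); T is in degrees Celsius,
    measured from the 0 degC reference temperature.
    """
    return j_l * c_pl * T + j_a * c_pa * T + j_v * L + j_v * c_pv * T


# x runs from the external to the internal surface of a 1.5 cm tile.
thickness = 0.015            # m (tile thickness from the text)
T_ext, T_int = 30.0, 24.0    # degC (hypothetical surface temperatures)
lam = 1.0                    # W/(m.K) (hypothetical conductivity)

q_cond = conductive_flux(lam, (T_int - T_ext) / thickness)
q_conv = convective_flux(j_l=0.0, j_a=0.0, j_v=1.0e-6, T=20.0)

print(f"conductive flux: {q_cond:.1f} W/m2")
print(f"convective flux: {q_conv:.4f} W/m2")
```

With a pure vapor flow, the phase-change term j_v·L dominates the convective flux, which is why even small vapor flows can matter for the surface energy balance.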

Mould Growth Model
The mould growth prediction has been verified through the model presented in [40]. Mould is quantified with an existing standard index based on the visual appearance of the surface under study. This model was initially presented in [41] for wood and was improved in a project carried out at VTT (Technical Research Center of Finland) and Tampere University of Technology, which included large sets of steady-state and dynamic laboratory experiments for typical building materials. The mould index assumes the values presented in Table 1.

Table 1. Mould growth index.
Index  Description of growth rate
0  No growth
1  Small amounts of mould on surface (microscopic), initial stages of local growth
2  Several local mould growth colonies on surface (microscopic)
3  Visual findings of mould on surface, <10% coverage, or <50% coverage of mould (microscopic)
4  Visual findings of mould on surface, 10-50% coverage, or >50% coverage of mould (microscopic)
5  Plenty of growth on surface, >50% coverage (visual)
6  Heavy and tight growth, coverage about 100%

The index M is presumed to increase linearly in time, at a rate given by Equation (14), where k1 represents the intensity of growth, k2 is the moderation of the growth intensity when the mould index (M) approaches its maximum peak value, W is the timber species (0 = pine and 1 = spruce), SQ is the term for surface quality (SQ = 0 for sawn surface, SQ = 1 for kiln-dried quality), T is the temperature (°C) and φ is the relative humidity (%). For materials other than wood, the terms k1 and k2 are modified and SQ = 0 is used. The idea is to compare the mould growth on other materials to that on pine sapwood (original work). More details can be seen in [40,42].
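To make the role of these variables concrete, the sketch below evaluates a growth rate of the form commonly quoted for the wood-based formulation of this model, dM/dt = k1·k2 / (7·exp(-0.68 ln T - 13.9 ln φ + 0.14 W - 0.33 SQ + 66.02)), with time in days, and integrates it with an explicit Euler step equal to the 6 h sample time. The coefficients are recalled from the general literature and are an assumption here, to be checked against [40,41]; k1 = k2 = 1 are placeholders rather than the material-specific values, and the critical-humidity threshold, growth delay and decline mechanisms of the full model are deliberately left out.

```python
import math

# Sketch of a mould index growth rate. The numeric coefficients are an
# assumption (commonly quoted wood-based form); verify against [40,41].


def mould_growth_rate(T, rh, k1=1.0, k2=1.0, W=0, SQ=0):
    """Return dM/dt in index units per day.

    T  : surface temperature in degC (no growth assumed for T <= 0)
    rh : surface relative humidity in %
    k1, k2 : growth intensity and moderation factors (placeholders)
    W  : timber species (0 = pine, 1 = spruce); SQ : surface quality
    """
    if T <= 0.0 or rh <= 0.0:
        return 0.0
    exponent = (-0.68 * math.log(T) - 13.9 * math.log(rh)
                + 0.14 * W - 0.33 * SQ + 66.02)
    return k1 * k2 / (7.0 * math.exp(exponent))


def integrate_mould_index(T_series, rh_series, dt_days=0.25, M0=0.0):
    """Explicit Euler integration of M with one step per 6 h sample."""
    M = M0
    for T, rh in zip(T_series, rh_series):
        M += mould_growth_rate(T, rh) * dt_days
    return M


# Ten days (40 samples of 6 h) at hypothetical constant conditions.
rate = mould_growth_rate(20.0, 97.0)
M_end = integrate_mould_index([20.0] * 40, [97.0] * 40)
print(f"growth rate: {rate:.4f} per day, index after 10 days: {M_end:.3f}")
```

The strong exponent on relative humidity is what makes cold, humid nights at the external surface so influential for the index.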

Simulation Using the Mathematical Model
This section describes the simulation procedures and the data set obtained and adopted for the system identification using the LS-SVM technique. The next subsection describes the roof composition and parameters, followed by an explanation of the data set.

Simulation Procedures
The roof analyzed is composed of concrete tile (1.5 cm) and two rifters (2.5 cm) as shown in Figure 3. The hygrothermal properties have been obtained from [43] for concrete and from the IEA Annex 14 Report [44] for timber (rifters). The internal emissivity was considered equal to 0.9 for both concrete tile and timber, and a solar absorptivity of 0.6 was adopted.
In order to evaluate extreme conditions, where a higher incidence of solar radiation is desired, a flat (horizontal) roof has been considered in the simulations. In terms of moisture transport, this inclination also represents an extreme transport level, as the flow is in the opposite direction of the roof normal. Additionally, thermal insulation is not commonly adopted in many regions of Brazil due to elevated costs. The building roof configuration presented in this work is typically adopted in residential buildings.
A regular 2-D mesh (2.5 mm²) for the discretization using the finite-volume method and a constant time step of 30 s have been applied for all simulations. The computational code was implemented in the C programming language in order to enable dynamic memory allocation, and the sample time for data generation was set to 6 h. Due to the high computational time required, the simulation using the hygrothermal model covered a period of almost 1 year and 8 months, until the mould index reached M = 3.
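The relation between the solver time step and the sample time can be checked with simple arithmetic. The sketch below assumes that one sample is stored every 6 h over the full 85-week period mentioned later in the text; the resulting data set size is an estimate under that assumption, not a figure reported by the study.

```python
# Back-of-the-envelope sizing of the simulation and of the generated data set.
TIME_STEP_S = 30      # solver time step (from the text)
SAMPLE_TIME_H = 6     # sample time for data generation (from the text)
SIM_WEEKS = 85        # simulated period (from the text)

steps_per_sample = SAMPLE_TIME_H * 3600 // TIME_STEP_S  # solver steps between samples
samples_per_day = 24 // SAMPLE_TIME_H
n_samples = SIM_WEEKS * 7 * samples_per_day             # assumes every sample is kept
n_steps = n_samples * steps_per_sample

print(f"solver steps per stored sample: {steps_per_sample}")
print(f"samples over {SIM_WEEKS} weeks: {n_samples}")
print(f"total solver time steps: {n_steps}")
```

Under these assumptions each stored sample summarizes 720 solver steps, which illustrates why a data-driven model trained on the sampled series can be much cheaper to evaluate than the numerical code.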
A temperature of 24 °C and a relative humidity of 50% (conditioned environment) were considered for indoor conditions. The outdoor climate conditions were represented by the TRY (Test Reference Year) weather data for the city of Curitiba, Brazil, which can be found in [45], and are presented in Figure 3 for the first week of January (summer period), and in Figure 4 for the first week of July (winter period). Constant convective heat transfer coefficients of 3 and 10 W/(m²·K) have been used at the internal and external surfaces, respectively. The external and internal convective water vapor transfer coefficients are calculated by Lewis's relation for each control volume. The other surfaces were considered adiabatic and impermeable. The sky temperature correlation presented in [46] has been adopted in this work. Gas (moist air) pressure has been considered constant at all surfaces.

Data Analysis
Heat fluxes at the internal roof surface are presented in Figure 5. The effect of mass transport was assessed by also simulating the roof without the moisture content and air conservation equations. In the summer period (January, Figure 5a), the latent effect is small when compared with the sensible heat flux. This effect increases in the winter (Figure 5b), when the sensible heat flux losses grow considerably in magnitude and the Sun does not dry out the roof as it does in the summer. However, a difference of 20% in the peak heat flux values is reported when mass transport is considered.
In Figure 6a, the low temperature at the external surface increases the relative humidity and, consequently, the mould index in this period, as described by Equation (14). Although the indoor latent heat flux is small when compared to the sensible flux, mass transport can play an important role in the heat exchanged at the external surface, mainly in the daytime, when the moisture adsorbed at nighttime is released and the outward evaporation decreases the concrete tile temperature.
In order to confirm the possibility of mould growth on concrete tile, the mould index evolution obtained with the numerical model presented in Section 3 is reported in Figure 7. Some mould growth can be detected visually after 76 weeks (Curitiba climate). A small decrease in the mould index is observed after the winter period. This work has been limited to 85 weeks due to the high computational time required and to the fact that, according to the mould growth risk model (Section 2.3), mould would visually appear in less than 85 weeks.
Beyond damaging the roof and its appearance, mould growth can change the solar absorption of the external roof surface, causing a high building thermal load. The results suggest that preventive cleaning maintenance should be carried out every two years. This problem can be mitigated by using an impermeable coating on the tiles.
To conclude this section, it is important to emphasize both the complexity and the time consumption of the previously mentioned computational code. It took almost one week to complete the simulation of the 1 year and 8 month period.
As can be seen in Figure 7, it takes almost 78 weeks to reach M = 3 in terms of mould growth, which means, according to Table 1, that there are visual findings of mould on the roof surface, or that 50% coverage of mould can be found through microscopic analysis. The graph also provides information about the dynamics of the mould growth model, in which a time delay under hygrothermal conditions acceptable for growth is considered. Only after this time delay does the mould index start to grow, exceeding the M = 1 and M = 3 values.

Least Squares Support Vector Machines (LS-SVMs)
This section introduces the machine learning technique adopted to identify the hygrothermal dynamics of building roofs. Computational intelligence has been widely applied to identify building thermal dynamics and their subsystems, especially in control applications. The LS-SVM belongs to a class of kernel-based pattern recognition models that use all or a subset of the training data in the prediction stage. These methods perform predictions from combinations of the outputs of functions centered on each of the available points. The functions used for weighting a given set of training data are called kernels.
At first, SVMs were used to train classifiers based on the concept of structural risk minimization [47]. SVMs were developed within the framework known as statistical learning theory, which was conceived for problems where only a small amount of data and little prior knowledge about the system are available, and in this respect differs from traditional methods.
The SVM technique is designed to adjust the vectors that support a hyperplane, which aims to separate the input data. The SVM estimates the relationship between an output $y_i$ and an input pattern $x_i$ by the following equation:
$$y_i = w^{T}\phi(x_i) + b \qquad (15)$$
where $b$ is a bias term, $w$ is a weighting vector and $\phi$ is a nonlinear function that maps the input pattern into a higher-dimensional feature space. The weighting vector $w$ and bias term $b$ are unknown and can be obtained by solving an optimization problem.
When LS-SVM is applied to system identification tasks, the following optimization problem, linked with the minimization of the risk function $J$, can be defined [29]:
$$\min_{W,\,b,\,e} \; J(W, e) = \frac{1}{2} W^{T} W + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2} \qquad (16)$$
subject to
$$y_i = W^{T}\phi(x_i) + b + e_i, \quad i = 1, \ldots, N \qquad (17)$$
where $W$ is the vector of weights, $e_i$ is the regression error of the $i$-th of the $N$ training samples, $\varepsilon$ is a given real number and $\gamma$ is a regularization parameter that provides balance between model complexity and training error. The first part of the objective function given by Equation (16) regulates the weights and penalizes those with higher values. Due to regularization, weights tend to converge to smaller values. This is necessary because large weights cause excessive variance in the model, deteriorating the generalization ability of the LS-SVM. The second part of Equation (16) represents the regression error of the training data. The equality constraint imposed by Equation (17) provides the definition of the regression error.
In the case of nonlinearly separable patterns, the model needs additional variables, introduced as the loss variables $\zeta_i$ and $\zeta_i^{*}$. In this case, it is possible to transform Equation (16) into a primal objective function given by:
$$\min_{W,\,b,\,\zeta,\,\zeta^{*}} \; \frac{1}{2} W^{T} W + \gamma \sum_{i=1}^{N} \left(\zeta_i + \zeta_i^{*}\right)$$
subject to
$$y_i - W^{T}\phi(x_i) - b \le \varepsilon + \zeta_i, \qquad W^{T}\phi(x_i) + b - y_i \le \varepsilon + \zeta_i^{*}, \qquad \zeta_i,\ \zeta_i^{*} \ge 0.$$
By introducing the Lagrange multipliers $\alpha_i$ and $\alpha_i^{*}$ (support vectors), the regression function given by Equation (15) can be written as:
$$y(x) = \sum_{i=1}^{N} \left(\alpha_i - \alpha_i^{*}\right) G(x_i, x) + b$$
where $G(x_i, x_j)$ is the kernel function, and the vectors $\alpha_i$ and $\alpha_i^{*}$ are obtained by solving the linear system of equations that follows from the Karush-Kuhn-Tucker conditions. The kernel $G(x_i, x_j)$ equals the inner product of the two vectors $x_i$ and $x_j$ in the feature space, $\phi(x_i)$ and $\phi(x_j)$; i.e., $G(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)$. Computing $\phi(x_i)$ and $\phi(x_j)$ explicitly is complex, so kernels are adopted to replace this calculation with a simpler function evaluation.
These kernels generate a mapping between the input space and a high-dimensional space, called the feature space. The SVM hyperplane generated in this feature space becomes a nonlinear surface when mapped back to the input space. Thus, the separating hyperplane is no longer a linear function of the input vectors, but a linear function of the feature space vectors [39].
In this work, a Radial Basis Function (RBF) kernel was adopted, which is given by:
$$G(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right)$$
where $\sigma$ is the spread of the Gaussian kernel. In this application, the SIMPLEX method was adopted to solve the linear programming training problem of the LS-SVM [48].
A Nonlinear AutoRegressive with eXogenous inputs (NARX) model structure was adopted in this work. Four inputs have been considered in the system identification procedures: external temperature (in K), external relative humidity (in %), direct solar radiation (W/m²), and diffuse solar radiation (W/m²). Four outputs have been identified in a one-step-ahead prediction considering a Multiple-Input, Single-Output (MISO) structure. The identified outputs are: sensible heat flux (W/m²), latent heat flux (W/m²), vapor flux (kg/(m²·s)), and the Mould Growth Risk index.
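The following is a minimal sketch of LS-SVM regression with an RBF kernel, given only to make the role of γ and σ concrete. It does not reproduce the authors' training procedure: instead of the linear programming problem solved with the SIMPLEX method, it assumes the standard LS-SVM dual, in which the bias and the multipliers are obtained from one linear system. The kernel scaling by 2σ², the function names and the sine-wave demo data are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix with G[i, j] = exp(-||A[i] - B[j]||^2 / (2 sigma^2))."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the standard LS-SVM dual system (assumed here, not the paper's LP):
        [0   1^T        ] [b    ]   [0]
        [1   K + I/gamma] [alpha] = [y]
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, multipliers alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate y(x) = sum_i alpha_i G(x_i, x) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Illustrative data only: a sine wave, not the roof simulation outputs.
X = np.linspace(0.0, 2.0 * np.pi, 60)[:, None]
y = np.sin(X[:, 0])
b, alpha = lssvm_fit(X, y, gamma=1e3, sigma=0.5)
y_hat = lssvm_predict(X, b, alpha, X, sigma=0.5)
max_err = float(np.max(np.abs(y_hat - y)))
```

In this formulation a larger γ puts more weight on the training error, while σ controls how far the influence of each training point extends.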
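A NARX model can be defined by writing the linear ARX predictor as a product (Equation (22)) and then replacing the weighted sum by a nonlinear function, which gives the nonlinear form presented in Equation (23):
$$\hat{y}(t) = \left[a_1, a_2, \ldots, a_{n_a}, b_1, b_2, \ldots, b_{n_b}\right] \left[y(t-1), \ldots, y(t-n_a), u(t-n_d), \ldots, u(t-n_d-n_b+1)\right]^{T} \qquad (22)$$
$$\hat{y}(t) = f\left(y(t-1), \ldots, y(t-n_a), u(t-n_d), \ldots, u(t-n_d-n_b+1)\right) \qquad (23)$$
where $t$ represents the current time, and $d$ the delayed sample.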
The nonlinear function $f$ can be expressed in terms of the model regressors, and the nonlinear mapping can be performed using nonlinear estimators. In Equations (21) and (22), $y(t)$ represents the current output of the model, $y(t-d)$ is a finite number of past outputs, $u(t-d)$ the inputs, $e(t)$ is a white-noise error that is introduced in the difference equation, and $\hat{y}(t)$ is the predicted output of the system. The model structure is entirely defined by three integers, where $n_a$ represents the number of poles, $(n_b - 1)$ is the number of zeros, and $n_d$ is the time delay of the system.
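As an illustration of how the three integers define the model structure, the sketch below builds the regressor vector of a MISO NARX model for one-step-ahead prediction. The function name `narx_regressors`, the ordering of the regressors and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def narx_regressors(y, U, n_a=2, n_b=2, n_d=0):
    """Return (Phi, target) for one-step-ahead prediction.

    The row for time t is
        [y(t-1), ..., y(t-n_a), U(t-n_d), ..., U(t-n_d-n_b+1)]
    where each U(.) holds all exogenous inputs at that instant.
    """
    y = np.asarray(y, dtype=float)
    U = np.asarray(U, dtype=float)
    if U.ndim == 1:
        U = U[:, None]
    # First time index for which every delayed sample exists.
    start = max(n_a, n_d + n_b - 1)
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, n_a + 1)]
        past_u = [U[t - n_d - k] for k in range(n_b)]
        rows.append(np.concatenate([past_y, np.concatenate(past_u)]))
    return np.array(rows), y[start:]

# Toy series with a single input, chosen so the rows are easy to read.
Phi, target = narx_regressors(np.arange(6), 10 * np.arange(6))

# Four inputs, as in the text: each row has n_a + 4 * n_b = 10 regressors.
Phi4, target4 = narx_regressors(np.zeros(100), np.zeros((100, 4)))
```

Each row of `Phi` can then be passed as the input pattern to any nonlinear estimator of $f$, one estimator per output in the MISO arrangement.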

Results
This section is divided into three parts. The first describes the simulation parameters adopted for the LS-SVM method in the system identification procedure. The second presents the analysis performed on the data set in order to define the percentage of data used for the training and test phases. Finally, Section 5.3 presents the prediction results for the four MISO models proposed in this study.

Simulation Parameters
At the beginning of this analysis, one year and six months of data were collected (2444 sample sets of inputs and outputs) considering the physical domain and the numerical model presented in Section 2.
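The training set was divided into 10 subsets following the l-fold cross validation method [49]. The classifier was trained 10 times, each time leaving out one of the subsets from training and using the omitted subset to compute the errors, with the Mean Absolute Error (MAE) presented in Equation (24) as the minimization criterion:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \qquad (24)$$
where $y_i$ is the reference output, $\hat{y}_i$ is the predicted output and $N$ is the number of samples in the omitted subset.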
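The cross-validation loop described above can be sketched as follows. This is an illustration under stated assumptions: the folds are taken as contiguous blocks in time order, and a least-squares line stands in for the LS-SVM so that the block is self-contained. The names `kfold_mae`, `fit_linear` and `predict_linear` are hypothetical.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error between reference and predicted outputs."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def kfold_mae(X, y, fit, predict, k=10):
    """Average MAE over k folds; each fold is left out of training once."""
    folds = np.array_split(np.arange(len(y)), k)   # contiguous blocks (assumption)
    scores = []
    for i in range(k):
        val = folds[i]
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[trn], y[trn])
        scores.append(mae(y[val], predict(model, X[val])))
    return float(np.mean(scores))

# Stand-in model (not the LS-SVM): ordinary least squares with an intercept.
def fit_linear(X, y):
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

X_demo = np.linspace(0.0, 1.0, 50)[:, None]
y_demo = 3.0 * X_demo[:, 0] + 1.0
cv_score = kfold_mae(X_demo, y_demo, fit_linear, predict_linear, k=10)
```

In a hyperparameter search, `kfold_mae` would be evaluated for each candidate pair (γ, σ) and the pair with the lowest average MAE retained.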
In terms of the model structure, $n_a$ and $n_b$ were set equal to 2, based on a previous analysis of the number of regressors using neighborhood component analysis for regression [50]. Additionally, no time delay was considered between the inputs and the outputs ($n_k = 0$).

Definition of Training and Test Data Sets
In order to define the percentage of data used for training, an analysis using 10% to 90% of the data for training was performed. In this case, besides MAE, the Multiple Correlation Coefficient (R²) and the Mean Square Error (MSE) were also adopted. Figure 8 presents R², MAE, and MSE as a function of the training percentage for the LS-SVM output prediction. As can be verified in Figure 8, the test set results are also presented in order to evaluate the balance between the quantity of data used for training and for validation.
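For reference, the three performance indices and a chronological split by training percentage can be sketched as below. The section does not give the formula used for R², so the usual definition, one minus the ratio of the residual to the total sum of squares, is assumed. The helper `split_by_fraction` and the four-point example are hypothetical.

```python
import numpy as np

def mse(y, y_hat):
    """Mean Square Error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

def mae(y, y_hat):
    """Mean Absolute Error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat)))

def r2(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot (assumed definition)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def split_by_fraction(X, y, train_fraction):
    """Use the first part of the record for training and the rest for testing."""
    n_train = int(round(train_fraction * len(y)))
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

# Small hand-checkable example.
y_ref = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
scores = (r2(y_ref, y_pred), mse(y_ref, y_pred), mae(y_ref, y_pred))

# Splitting a record of 2444 samples at 50%.
X_all, y_all = np.zeros((2444, 4)), np.zeros(2444)
X_tr, y_tr, X_te, y_te = split_by_fraction(X_all, y_all, 0.5)
```

Repeating the split for fractions from 0.1 to 0.9 and computing the three indices on both parts gives the kind of curves summarized in Figure 8.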

LS-SVM Prediction
According to the results presented in the previous subsection, 50% of data were selected for training, as they present reasonable approximation for all the outputs for both training and test procedures.
Figures 9 and 10 present the comparison between the LS-SVM model and the numerical method in terms of prediction. Figure 9 presents both training and test phases for outputs 1 (vapor flow) and 2 (sensible heat flow), while Figure 10 shows the results for outputs 3 (latent heat flow) and 4 (mould growth risk). The absolute error is also presented in these figures. Additionally, Table 2 reports the values of the Multiple Correlation Coefficient for both training and test phases.

As can be observed in Figures 9 and 10, the highest absolute error values are found in the mould growth index approximation, which can be justified by the difference in the dynamics of the mould growth model. As the M index exhibits different growth behavior in distinct stages of growth, each defined by distinct equations, all these stages should be used in the LS-SVM training stage. Although the LS-SVM training data set adopted in this work covered only the behavior in the range 0 ≤ M < 2 (Figure 9c), the model was still capable of reproducing mould growth with considerable precision. As can be seen in Table 2, the model presented consistent approximation for all four outputs. Higher values of R² for the mould growth risk can be justified by its prior behavior in exceeding the M = 2 limit, as the multiple correlation index is a cumulative measure.
In terms of computational effort, the whole simulation considering both training and test phases did not take more than 30 s, while the traditional method took about 120 h.
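Taking the two reported times at face value, the corresponding speed-up factor can be estimated as follows. Both figures are approximate, so the result indicates an order of magnitude only.

```latex
\frac{t_{\mathrm{numerical}}}{t_{\mathrm{LS\text{-}SVM}}}
  \approx \frac{120\ \mathrm{h} \times 3600\ \mathrm{s/h}}{30\ \mathrm{s}}
  = \frac{432\,000\ \mathrm{s}}{30\ \mathrm{s}}
  = 14\,400
```

This corresponds to a reduction of about four orders of magnitude in simulation time.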

Conclusions and Future Research
This article presented an approach to predict vapor flux, sensible heat flux, latent heat flux and mould growth risk for concrete tiles based on the external weather conditions. Four MISO (multiple-input, single-output) models were used, with external temperature, relative humidity, and direct and diffuse solar radiation as inputs; all inputs were obtained from a Test Reference Year (TRY) weather file.
Roofs are subjected to both thermal and moisture gradients, so that an accurate heat transfer determination requires a simultaneous calculation of both sensible and latent effects. Therefore, a mathematical model considering the combined two-dimensional heat, air and moisture transport through an unsaturated roof was presented, and the effects of moisture adsorption and desorption on the thermal performance of concrete tiles were shown. However, this type of numerical model is complex and demanding in both hardware and time when used for long simulation periods, which is the case for mould growth evaluation.
Besides its effect on heat transfer, moisture can cause damage to the building structure and can promote both mould and mildew growth. The mould growth on roof surfaces can increase the solar absorptivity, decreasing its hygrothermal performance.
Whole-building simulation is a trend in the building physics area, driven by energy policies and the search for thermal comfort in indoor environments. In this context, fast and precise techniques that can be coupled to building simulation software, such as the one proposed here, present consistent results, as shown in this work.
For future research, the authors intend to include the changes in roof solar absorptivity in the presented model, so that mould growth and its effects on the hygrothermal performance of the building can be verified more precisely. Moreover, an additional roof painting layer will be considered in the simulations in order to analyze its efficiency in reducing mould growth.