Nanomaterials’ Influence on the Performance of Thermal Insulating Mortars—A Statistical Analysis

This research provides a statistical analysis of the mechanical and physical properties of thermal insulating mortars developed in the laboratory and by the industry with and without the incorporation of nanomaterials. This was evaluated by carrying out a uni and bivariate analysis, principal components and factor analysis, cluster analysis, and the application of regression models. The results show that it is possible to find associations between these mortars’ properties, but also how these formulations’ development can be approached in the future to achieve better overall performance. They also show that the use of nanomaterials, namely silica aerogel, significantly improved the mortars’ thermal insulation capabilities, allowing us to obtain mortar formulations with thermal conductivities below the values presented by classic thermal insulating materials. Therefore, with this investigation, other researchers can support their product-development choices when incorporating nanomaterials to reduce mortars’ thermal conductivities, increasing their production efficiency, overall multifunctionality, and sustainability.

Thermal insulating mortars have been used as an alternative to conventional thermal insulating materials [8], making use of the introduction of EPS (expanded polystyrene) granules and other materials, such as cork granulate, perlite, or natural fibres [9][10][11][12], to lower their thermal conductivities [9,13]. Although their performance is still lower than that of ETICS (External Thermal Insulation Composite System) solutions incorporating traditional thermal insulation materials [14], they present several advantages over those materials, such as [15,16]: the possibility of smoothing complex areas; improved fire reaction; and ease of application.
To improve the performance of construction materials, several studies have included nanomaterials [17,18]. One of the nanomaterials recently used and studied as a thermal insulator is the silica aerogel [19,20]. This material is produced by a bottom-up approach through the sol-gel process, presenting only a small fraction of solid silica [21]. Although it can be classified as a 3-D nanomaterial (with all dimensions above the nanoscale), its inner structure is within the nanoscale [21,22].
The silica aerogel is nanostructured and presents a very high quantity of pores with a reduced dimension (~10 to 100 nm) [21,23], supported by a structure of interconnected silica (SiO₂) nanoparticles (with particle diameters of 2 to 5 nm), forming a porous grid. It has an open pore network, with densities between 1.9 and 500 kg·m⁻³ [24][25][26]. As a consequence of its nanostructure and highly tortuous paths, the silica aerogel presents high thermal insulation due to the limitation of heat transport by radiation, convection, and conduction [27][28][29]. Therefore, the silica aerogel presents thermal conductivities between 0.012 and 0.021 W·m⁻¹·K⁻¹, which is lower than that of air at ambient pressure (0.026 W·m⁻¹·K⁻¹) and significantly lower than those of EPS (0.032 W·m⁻¹·K⁻¹) [8,30] and cork granules (0.045 W·m⁻¹·K⁻¹) [14].
In order to analyse, compare, and optimise new products, different approaches have been used [40,41], with one of these approaches being related to the use of multivariate statistical techniques [42][43][44][45]. These techniques have also been used to study cement-based products [46,47], allowing for the simplification of the interpretation of their performance, with particular emphasis on cluster analysis and Principal Components Analysis (PCA) [43,44,48,49]. This approach allows us to reduce data dimension, visualise results, and identify possible correlations that would otherwise not be found [50].
Therefore, the scope of this paper is to present a multivariate statistical approach to study the current performance of thermal insulating mortars, and how the inclusion of nanomaterials influences their behaviour, trying to: (i) know the characteristics of these materials and to study the correlations between their properties (uni and bivariate analysis); (ii) classify them into distinct groups, quantifying the association between different parameters, and identifying factors that influence their variations (PCA and factor analysis); (iii) classify them according to the degree of statistical similitude and verify the existence of classification patterns (cluster analysis); (iv) evaluate the possibility of predicting their behaviour as a function of one, or more, independent variables (regression models); and, finally, (v) provide some insights into how these formulations' design can be automated, further reducing their costs [51], impacts [52], and simplifying the production and construction processes [53], while presenting the desired multifunctionality.
The objective of this work was not to directly compare formulations, from different manufacturers, but to provide a comprehensive evaluation of this class of products while studying the influence that nanomaterials can have on their performance. Therefore, fifteen characteristics from the fresh and hardened states, from thirty-five different formulations, were gathered. This integrated approach, not yet published in the literature, allowed us to analyse the collected data and understand the variables' relationships, identifying latent structures and their organisation in groups.
Other researchers can use the presented results but also this approach when developing new formulations of thermal insulating mortars. The possible correlations between the formulations' constituent materials and their performance, and the implications of nanomaterials use, were identified. With this knowledge, this study can potentially lead to the automation of the formulations' design and to multifunctional applications, improving buildings' energy efficiency, and indoor comfort.

General Considerations
To develop this research, a set of requirements was defined for data collection:
• The mortars had to be on the market, or a result of published research;
• The mortars had to present, according to EN 998-1 [7], a λ10°C,dry (following EN 1745 [54]) ≤ 0.200 W·m⁻¹·K⁻¹ (class T2);
• A complete set of characteristics had to be available for each formulation (Table 1).
Then, an exhaustive search was conducted using specific keywords. The following search databases were used: "Google"; "Google Scholar"; "ResearchGate"; "ScienceDirect"; "Scopus"; and "SpringerLink". In each of these, the following keywords were used, alone or together: "aerogel"; "compressive strength"; "mechanical performance"; "mortar"; "physical performance"; "render"; "thermal conductivity"; "thermal insulating mortar"; and "thermal insulation".
The thirty-five formulations that it was possible to gather came from various sources (Table 2), with distinct characteristics and representing the latest developments in this research field [74][75][76][77][78][79][80][81][82][83][84]. Although all calculation procedures are widely known and easily found in the statistics literature [85,86], their implementation using statistical software makes the analysis more complete. Therefore, all data were entered into an Excel data matrix. The data were then transformed into information using RStudio v1.1.463 [87], with ggplot2 [88], and Statistica v14.
In this research, a descriptive statistical analysis (uni and bivariate), PCA, factor analysis, and a cluster analysis were completed, and regression models were applied. For some of these analyses, it was also necessary to standardise the variables (Equation (1) and Table 3), allowing variables with different units and magnitudes to present equal (unitary) variance, with equal discriminative capacity. This standardisation prevented variables with greater dispersion from carrying excessive weight [89]. In Equation (1), the original value corresponds to the value presented by the specific formulation for a given property; the average and standard deviation are those obtained for the sample. In Table 3, for example, the S7 row presents negative values in almost all characteristics, indicating values lower than the sample's average.
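Equation (1) itself is not reproduced in this version of the text. From the description above (original value, sample average, and sample standard deviation), it presumably corresponds to the usual z-score standardisation. The symbols below are assumptions introduced for illustration: x_ij is the value of property j for formulation i, and x̄_j and s_j are the sample average and standard deviation of property j.

```latex
z_{ij} = \frac{x_{ij} - \bar{x}_{j}}{s_{j}}
```

Under this form, a negative z_ij (as in most of the S7 row of Table 3) indicates a value below the sample's average for that property.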

Uni and Bivariate Analysis
The univariate analysis tries to describe the sample as a whole, summarising data and evaluating the measures of central tendency, dispersion, and skewness [86].
In the measures of central tendency, the objective is to identify the distribution centre considering the average, the median, and the mode. For the measures of dispersion, the objective is to evaluate the values' degree of variation around the central point using the range of values, interquartile range, variance, and the coefficient of variation. Finally, the measures of skewness study the degree of (a)symmetry of the distribution around the central point, translating how the mass of the values' distribution is positioned: to the right (negative skewness) or to the left (positive skewness).
The bivariate analysis is linked with the construction of a correlation matrix between the different pairs of variables, where their degree of linear correlation is identified. Moreover, a matrix of scatterplots representing the values' scatter is built, helping to identify trends other than linear [90].
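The measures described above can be sketched in a few lines. This is only an illustrative example using NumPy, not the RStudio/Statistica workflow used in this research; the function name `describe` and the sample values are hypothetical and do not come from Table 5.

```python
import numpy as np

def describe(x):
    """Central tendency, dispersion, and skewness measures for a 1-D sample."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)                          # sample standard deviation
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    # Moment-based skewness coefficient: > 0 means a longer right tail
    skew = np.mean((x - mean) ** 3) / np.mean((x - mean) ** 2) ** 1.5
    return {
        "mean": mean,
        "median": median,
        "range": x.max() - x.min(),
        "iqr": q3 - q1,
        "variance": std ** 2,
        "cv_percent": 100 * std / mean,          # coefficient of variation
        "skewness": skew,
    }

# Hypothetical values (NOT the paper's data): compressive strength in MPa
# and hardened-state bulk density in kg/m3, each with one extreme value.
fc = np.array([0.4, 0.5, 0.6, 0.8, 1.0, 1.2, 4.8])
bdh = np.array([150.0, 180.0, 210.0, 260.0, 300.0, 340.0, 900.0])

stats_fc = describe(fc)
r = np.corrcoef(fc, bdh)[0, 1]                   # Pearson linear correlation
```

In this made-up sample, the single high value pulls the average above the median and yields positive skewness, which is the kind of extreme-value influence the univariate analysis is meant to reveal.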

Principal Components and Factor Analysis
Although the concepts of principal components (PC) and factor analysis are often mixed, they differ [91]. The PC analysis synthesises the initial information in a minimum number of dimensions, with minimum loss of information, considering the total variance of the original variables, which makes it possible to avoid a redundant system by identifying the independent components [92]. Each component explains a certain percentage of the variability of the variables, and the higher its explanatory power, the more interesting it is [42]. On the other hand, the factor analysis is focused on identifying latent factors (i.e., not directly observable) that reflect what the variables have in common, only considering the common variance [50].
For selecting the number of PCs, some criteria are considered [93]: (i) choose the components that explain a reasonable percentage of the total variance (> 70%); (ii) neglect components whose eigenvalue is less than unity (Kaiser criterion); (iii) based on the scree plot, the recommended number of components is limited by the existence of an 'elbow'.
With the definition of the number of PCs, it is possible to obtain the weight of the original variables in the corresponding factors, after which a Varimax normalised rotation is applied, as this is the most popular orthogonal factor rotation to simplify columns in a factor matrix, achieving a simplified identification of the factor structure [89]. With a new application of the previous selection procedure, it is verified which variables contributed the most to the factor formation (closer to ± 1). This way, it is possible to try to interpret the variables' latent factors.
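As an illustration of the first two selection criteria, the sketch below performs a PCA on the correlation matrix (equivalent to working with standardised variables) and counts the components retained by the Kaiser criterion and by the cumulative variance threshold. It is a minimal NumPy example under stated assumptions: the function names and the synthetic data are hypothetical, and the Varimax rotation step is not included.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA of the correlation matrix of X (rows: formulations, columns: variables)."""
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()          # share of total variance per PC
    # Unrotated loadings: correlations between variables and components
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, explained, loadings

def n_components(eigvals, explained, min_cum=0.70):
    """Number of PCs suggested by the Kaiser and cumulative-variance criteria."""
    kaiser = int(np.sum(eigvals > 1.0))
    # Smallest number of PCs whose cumulative explained variance reaches min_cum
    cum = int(np.searchsorted(np.cumsum(explained), min_cum) + 1)
    return kaiser, cum

# Synthetic data (NOT the paper's): 35 samples, 6 variables, 2 latent factors
rng = np.random.default_rng(42)
f1, f2 = rng.normal(size=35), rng.normal(size=35)
noise = rng.normal(scale=0.3, size=(35, 6))
X = np.column_stack([f1, f1, f1, f2, f2, f2]) + noise

eigvals, explained, loadings = pca_from_correlation(X)
kaiser, cum = n_components(eigvals, explained)
```

Plotting `eigvals` against the component number gives the scree plot used for the third criterion.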

Cluster Analysis
The main objective of this analysis is to assign each formulation to a group so that the members of each group are similar to one another (internally homogeneous), but the different groups stand apart (heterogeneous with each other) [89]. The two most common methods were considered: hierarchical agglomerative and non-hierarchical [94], as presented in Table 4. For the hierarchical agglomerative method, the agglomeration starts with n clusters, each with a single entity, which are merged successively until, in the last step, a single group including all entities is reached, creating a dendrogram where all aggregations can be observed [90]. To define the number of clusters, a linkage distance graph was used, which is a function of the chosen linking distance algorithm, and where a steeper increase indicates heterogeneity [95] and a potential cluster division.
Regarding the non-hierarchical method, the k-means algorithm was used. This is an iterative algorithm in which entities are grouped into a defined number of clusters (the same number as in the hierarchical method). The iterations performed maximise the variance between clusters while minimising their internal variance [94], with an allocation of entities to groups characterised by their centroids [96], hence the interest in applying both methods to improve the quality of the study.
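The iterative allocation of entities to centroids can be sketched as follows. This is a minimal NumPy implementation of the standard Lloyd's k-means procedure, given for illustration only; it is not the software routine used in this research, and the function name, the random initialisation, and the (already standardised) data points are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means (Lloyd's algorithm) with random initial centroids."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every entity to every centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Final assignment against the final centroids
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1), centroids

# Hypothetical standardised values (NOT the paper's data): two variables,
# three low-valued and three high-valued formulations.
X = np.array([[-1.2, -1.0], [-1.0, -1.1], [-0.9, -0.8],
              [1.0, 0.9], [1.1, 1.2], [0.8, 1.0]])
labels, centroids = kmeans(X, k=2)
```

Because the result depends on the initial centroids, comparing it with the groups read from the dendrogram, as done here with both methods, is a reasonable cross-check.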

Regression Models
Whether a mortar can be classified as thermally insulating depends, as previously seen, on its thermal conductivity value. Therefore, the thermal conductivity at 10 °C in the dry state (λ10°C,dry) was chosen as the dependent variable. Obtaining this variable involves a significant delay since, 28 days after production, the mortar must undergo a drying process at 60 °C until its mass stabilises (varies ± 0.20%); these models offer the possibility of shortening the time needed to obtain that result.
Both simple and multiple linear regression studies were conducted. In the simple linear regression, a linear relationship between two variables using the least-squares method [85] was established, where the scatterplot matrix created in the bivariate analysis to visualise the correlations is used. For the more complex multiple linear regression, there is an extended analysis of partial correlations between variables [97]. For the application of these analyses, some assumptions are needed [89]: the mean value of the residuals is zero; there is homoscedasticity; the residuals present normal distribution; and the remaining variables are independent.
The multiple linear regression makes it possible to examine and evaluate the influence of multiple independent variables on the variability of the dependent variable. It is calculated using Equation (2), where y represents the dependent variable, β0, ..., βk the regression coefficients, x1, ..., xk the independent variables, and ε the random errors of the model.
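Equation (2) is not reproduced in this version of the text. From the symbols defined above, it presumably takes the standard form of the multiple linear regression model:

```latex
y = \beta_{0} + \beta_{1} x_{1} + \dots + \beta_{k} x_{k} + \varepsilon
```

With k = 1, this reduces to the simple linear regression fitted by the least-squares method.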
For the multiple linear regression, a stepwise technique was used, allowing for the selection of the independent variables with the highest explanatory power to predict the dependent variable. As this is an automatic method, the software introduces the independent variables iteratively, starting with the variable with the highest correlation with the dependent variable. After this step, the remaining variables are introduced in the model one by one; at each step, a significance test evaluates whether the previously introduced independent variables continue to be statistically significant as predictors, removing them if not. At the same time, an ANOVA (Analysis of Variance) was performed, requiring the p-values of F and t to be < 0.05.
In the end, to compare the different models, the results of R², adjusted R², ANOVA, independence (Durbin-Watson test), normality, homoscedasticity, and residuals were evaluated.
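Some of these comparison measures can be computed directly from a least-squares fit, as in the sketch below. It is an illustrative NumPy example only: the function name `fit_ols`, the choice of predictors, and the synthetic data are assumptions, and the stepwise selection, ANOVA, normality, and homoscedasticity checks are not included.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk with basic diagnostics."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                            # simple regression: one predictor
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # add the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    # Durbin-Watson statistic: values near 2 suggest uncorrelated residuals
    dw = float((np.diff(resid) ** 2).sum() / ss_res)
    return {"beta": beta, "r2": r2, "adj_r2": adj_r2, "durbin_watson": dw}

# Synthetic data (NOT the paper's): thermal conductivity generated from a
# hypothetical bulk density and compressive strength, plus random noise.
rng = np.random.default_rng(1)
bd = rng.uniform(100.0, 900.0, size=35)           # kg/m3
fc = rng.uniform(0.4, 3.0, size=35)               # MPa
lam = 0.02 + 0.0003 * bd + 0.005 * fc + rng.normal(0.0, 0.005, size=35)

model = fit_ols(np.column_stack([bd, fc]), lam)
```

The adjusted R² penalises each added predictor, which is why it is reported alongside R² when models with different numbers of independent variables are compared.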

Univariate Analysis
The results of the univariate analysis are presented in Table 5. This analysis, although simple to carry out, reveals the characteristics and tendencies of the thermal insulating mortars considered in the sample.
It was seen that AC, Ed, and TK were influenced by the presence of extreme values, verified by the distance from the median to the average and confirmed by the difference between the 25% and 75% quartiles when compared with the minimum and maximum values, showing a significant variability of values in these materials. Regarding the mode, Cons, AC, w/p, Ed, fc, and ft presented multiple modes, indicating the probable influence of the classifications T1 and T2 of EN 998-1 [7] on the thermal performance that manufacturers declare.
The variance of Cons (Consistency), BDf (Bulk density fresh-state), and Λ was the lowest and thus less discriminant. Of all the variables, ten are characterised by low dispersion (coefficient of variation < 50%), the remaining showing more dispersion (fu, Ed, fc, ft, and TK), with TK having a high coefficient of variation, as expected [98].
For the asymmetry, the variables Bar (Powder bulk density), Cons, BDf, BDh (Bulk density hardened-state), ft (flexural strength), and λ10°C,28days presented negative values, meaning that current thermal insulating mortars tend to present a concentration towards higher values on those variables.

Bivariate Analysis
The matrix with the linear correlation coefficients between the different pairs of variables is presented in Table 6. The correlations considered significant (p-value < 0.05) are presented in bold, where cells with thick black lines indicate the highest positive correlations (>0.80), and cells with dashed-lines indicate the highest negative correlations.
It should be noted that the correlations marked in bold (p-value < 0.05) are the relevant ones: the further the p-value rises above that threshold, the less reliable the observed relationship between variables is as an indicator of their relation in the population [86,90].
In this analysis, a scatterplot matrix was also built for all pairs of variables; Figure 1 presents an extract of the full matrix. In that figure, the correlations between variables are shown in the upper-right portion, the values' distribution for each variable is shown on the diagonal, and the lower-left portion shows the values' scatter, with a line of best linear fit (in blue) and a best-fit line for identifying other possible types of regression (in orange). This allowed us to visualise the correlations and the scatter of the values (e.g., outliers).
From Table 5, it is seen that the average value of fc is ~1.33 MPa and of fu is ~0.16 MPa, with both properties presenting a higher concentration of formulations (positive asymmetry) on lower values. This indicates the influence that a small number of mortars with higher values can present. It is also possible to see that, as already known [25,69], these mortars tend to present lower mechanical performance than current cement-based mortars, which seems associated with the inclusion of the nanomaterial silica aerogel [99]. Something that is also noted is the need for an increased w/p ratio when compared to conventional cement-based mortars [100], which can also influence their physical and mechanical performance due to an increased porosity.
For the physical characteristics (λ10°C,dry, W, and Λ) it was found, as shown in Table 5, that the average values are 0.122 W·m⁻¹·K⁻¹, 0.479 kg/(m²·min^0.5), and 2.0 × 10⁻¹¹ kg·m⁻²·s⁻¹·Pa⁻¹, respectively. These are results with lower λ10°C,dry and Λ than conventional cement-based mortars, but with higher W [101,102].
Appl. Sci. 2020, 10, x FOR PEER REVIEW 7 of 21
Table 6. Matrix with the linear correlation coefficients between pairs of variables.
Note to Table 6: bold, significant correlations (p-value < 0.05); cells with thick lines, highest positive correlations; cells with dashed lines, highest negative correlations.
Both λ10°C,dry and Λ present interesting performance, but W can be further improved. All these characteristics present positive asymmetries, but, due to the wide range of magnitudes, the higher values influence the average performance. These results also show that, although the average thermal insulating mortar presents higher mechanical performance than required by EN 998-1 [7] (fc ≥ 0.40 MPa), its associated thermal conductivity is also high (classified as T2, according to the same standard), even more so when λ is compared with that of EPS (λ ~0.032 W·m⁻¹·K⁻¹ [30]).
From this analysis, it was verified that these mortars still have a high margin of development if the objective is to decrease their λ and W while maintaining high Λ and mechanical behaviour. However, the use of formulations incorporating nanomaterials significantly reduced their thermal conductivities, allowing us to obtain mortars with λ below classic thermal insulating materials.
In the bivariate analysis, using the matrices of linear correlation and of scatterplots, it was observed how the different variables relate to each other. The linear correlation coefficients ranged from ~0 (fu and Λ) to |1| (λ10°C,28days and λ10°C,dry) and, in most cases, presented a p-value < 0.05, increasing the confidence in accepting the observed result as representative of all thermal insulating mortars and indicating that, independently of the insulation aggregate (nanomaterial or not), the correlations were maintained.
The bivariate analysis showed interesting correlations: BDh presents a high correlation (0.91) with BDf, which was somewhat expected, as it was also verified by Soares [68]. However, some other significant correlations were also identified: for λ10°C,28days, high correlations (≥ 0.80) were found with BDf, BDh, Ed, and ft, which is an interesting behaviour. Although this was a known fact for the bulk density values [68,103,104], it is a new finding that Ed and ft also showed high correlations. This can be related to a less porous microstructure that better conducts the heat flux [105], increasing the thermal conductivity. This behaviour was also verified for other types of mortars by Badache et al. [106] and is something to consider when designing new formulations.
For the dry state (λ10°C,dry), the correlation with BDh and BDf was also high, but it was higher with λ10°C,28days. So, despite the drying process, in which the amount of free water lost is unpredictable, a very high correlation was found between λ10°C,dry and λ10°C,28days.
The behaviour of the variable w/p was expected (inverse with other variables). This agrees with what is known, since increasing w/p to improve the workability reduces fc, ft, and BDh, because of the increase in voids, capillaries, and pores in the microstructure when the free water evaporates [107,108]. This would be expected to correlate well with a decrease in the thermal conductivity values (due to the low thermal conductivity of the air), but that was not verified. Some correlation between W and Λ was also expected, due to the presence of capillaries, as well as between fc and fu, as a result of the internal cohesion, but that was not verified. As for fc, it was correlated with Ed and ft, as expected [109], since the increment of one increases the others. Figure 1 complemented the matrix of linear correlation, allowing us to visualise the values' dispersion for all formulations. This way, it was possible to see that no pair of variables showed distribution patterns other than linear, and that outlier behaviours were not significant.
With this analysis, it was verified that the use of nanomaterials in thermal insulating mortars leads to a significant lowering of BDf, BDh, λ10°C,dry, and λ10°C,28days; however, it also lowered fc and fu. This seems related to the higher amount of kneading water needed to achieve an adequate Cons (associated with workability), which in turn is related to increased porosity and microstructural fragility.

PCA Construction and Factor Analysis
Following the described selection procedures, Table 7 and Figure 2 were obtained. With the results from the various methods being around the value of three components, this was the number considered. Although iterations for two and four components were also performed, neither presented additional information.
Note to Table 7: cell with thick lines, cumulative variance requirement; cell with dashed lines, first eigenvalue above 1.00.
Table 8 presents the composition and weight distribution of each of the three PCs (Principal Components), showing the relative influence of the variables. In this division, it was possible to capture ~71% of the total variance. With the three components chosen, it was verified that PC1 and PC3 had associated positive loadings, and PC2 had negative (fu) and positive (Cons and w/p) loadings.

Figure 2. Scree plot
To better interpret the PCA results, biplots were built (Figure 3). These allowed us to see the correlations between the variables and the PCs. For the construction of the biplots, the colour scheme represents the following three types of formulations (Table 2): designed and produced in the laboratory (green); industrial but manipulated in the laboratory (blue); and industrial (orange). These colour codes were implemented after building the first graphics to see if there was any trend or cluster formation.

Further analysis allowed us to obtain the weight of the original variables in the corresponding components, with a Varimax rotation applied, making it possible to verify which variables contributed the most to the formation of each factor (loadings closer to ± 1) and thus facilitate their interpretation.


Discussion of the PCA and Factor Analysis
The PCA found that three PCs would be the ideal number of dimensions. Then, with the factor analysis, it was possible to identify their latent structure.
The first component, PC1, captured 47% of the total variance, displaying positive correlations between all variables. The higher loadings were associated with λ, BD, and mechanical properties. This behaviour was expected and already known [68,106]. An increment in any of these variables leads to an increase in the other variables included in the same PC, probably being associated with the mortars' compactness and microstructural porosity [43], influencing their behaviour.
The second component, PC2, captured 15% of the total variance, displaying a high negative loading of fu together with positive loadings of Cons and the w/p ratio. This behaviour was already reported as a consequence of a more porous microstructure [108,110], due to the increased need for kneading water, which leaves voids after evaporation [111], weakening the microstructure. However, it was unexpected that Cons and w/p would relate so well with the decrease in fu.

The third component, PC3, captured 9% of the total variance, presenting a positive correlation with both variables linked to water permeability, one to liquid (TK) and the other to vapour (Λ). This is an interesting behaviour, somehow expected [109], that can be related to interconnected voids and capillaries present in the mortars' microstructure [112]. Therefore, with the loadings' distribution and the biplots, it was possible to try to interpret the latent factors on each PC: Factor 1-thermo-mechanical behaviour; Factor 2-internal cohesion and microstructural porosity behaviour; Factor 3-water permeability behaviour.
Analysing the biplot of PC1 and PC2 (Figure 3a), it is observed that laboratory mortars (lab) tend to present lower thermo-mechanical properties and lower internal cohesion, being further apart from the other two types. For the mortars manipulated in the laboratory (semi), the PC1 behaviour is neutral, presenting values around the average. However, their cohesion presented higher variability, related to the insertion of thermal insulating nano aggregates (i.e., silica aerogel) in formulations initially designed considering other quantities or materials, which increases their need for kneading water [107] and limits their performance due to porosity increases. Finally, the industrial mortars (ind) present a distribution around the average of PC1 and PC2, which is an expected behaviour given the maturity of their development.
For the biplot of PC1 and PC3 (Figure 3b), it is seen that the decrease in thermo-mechanical factors is associated with permeability increases. This is in line with the fact that the increase in voids and cracks leads to a more fragile matrix, with those voids then being filled with water, while also allowing water vapour circulation [110,113].
The biplot representing PC2 and PC3 (Figure 3c) did not correspond to a significant improvement of the level of interpretation, due to a central distribution of the values.
From the integrated analysis of all the biplots, it is seen that the current thermal insulation mortars with lower λ lose their mechanical resistance and cohesion, with an increased water permeability. This fact can be related to the fragility of the thermal insulating nano aggregates [25,71] but also to the high need for kneading water that leaves capillaries and pores when it evaporates after the hydration reactions [114], indicating the need for further development. However, as shown by Pedroso et al. [115], this can be overcome using multilayer protective systems.
It was also possible to observe the formation of clearly defined groups, where the most recent formulations, containing silica aerogel as an insulating aggregate, are the ones needing further development. Already available mortars that incorporate additional thermal insulating aggregates exhibit an intermediate behaviour. Moreover, the industrial-based formulations, which present the highest mechanical properties, tend to present lower performance in terms of thermal behaviour. The insertion of silica aerogel is associated with the lowest λ, but at the same time strongly related to lower mechanical performance, probably due to an increased water need, to the fragility of the aerogel and to some interfacial weakness between this nanomaterial and the matrices [25,68–70].
An unusual behaviour of the industrial mortars is also seen, since their values do not follow the same pattern as the remaining mortars. This may indicate that some of the declared values carry measurement error. Therefore, some care must be taken when interpreting declared values.

The results for the hierarchical and non-hierarchical methods returned the distribution of formulations presented in Table 9. In this table, it is possible to see that L1, L2, L3, and L4 were always together. It was also found that, overall, the groups in cluster 1 and cluster 2 remained constant, only switching their cluster number in the k-means method. Thus, the strong correspondence between some of the formulations that remained together in the various methods can be seen. When analysing PCs, this behaviour was already anticipated, due to their performance.
Table 9. Mixes composition of each cluster, for 3 clusters: hierarchical and non-hierarchical.

Discussion of the Cluster Analysis
This research obtained equivalent results for different cluster methods. However, it should be noted that choosing only one method may lead to results that are not as optimised as desired.
From the various tested agglomeration and distance methods, a trend to obtain three different clusters was shown, following a division associated with the increase in the linkage distance, which is an indicator of a growth in the heterogeneity of groups, being a good indicator of separation. Moreover, when using the k-means method, the formulations' distribution remained stable.
From the beginning, with the hierarchical methods, an interesting agglomeration in the dendrograms started to appear. These methods clearly show a quick agglomeration of formulations L1, L2, L3, and L4, this group being near to another one (containing I1, S2, and others), showing some degree of similarity. The other group (S1, I2, and others) shows an evident heterogeneity from these two. All the groups show an early building of the clusters in all methods, and high stability in their agglomeration, independent of the agglomeration and distance methods applied, showing marked similarities within the same group.
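The agglomeration described above can be sketched with a small hierarchical clustering routine. The example below is only an illustration under stated assumptions: it uses average linkage with Euclidean distance (one of many possible agglomeration and distance choices, not necessarily those applied in this study), hypothetical formulation labels (A1 to C2), and synthetic standardized coordinates rather than the measured properties.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def average_linkage(points, labels, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    The distance between two clusters is the mean of all pairwise
    distances between their members (average linkage).
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        total = sum(euclidean(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        _, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(labels[i] for i in c) for c in clusters]

# Hypothetical labels and synthetic standardized scores (NOT study data).
labels = ["A1", "A2", "B1", "B2", "C1", "C2"]
points = [
    (-1.2, -1.0), (-1.0, -1.1),   # low conductivity, low strength
    (0.0, 0.1), (0.2, 0.0),       # intermediate behaviour
    (1.3, 1.2), (1.1, 1.4),       # high conductivity, high strength
]

groups = average_linkage(points, labels, n_clusters=3)
print(groups)
```

Formulations that sit close together merge at a small linkage distance, while the jump in distance required to merge well-separated groups is what suggests where to cut the dendrogram.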
Further analysis found that the L1 to L4 group corresponded to the most recently developed formulations, which include the nanomaterial silica aerogel. This made it possible to obtain low λ and high Λ and W, but also low mechanical performance, thus presenting a characteristic behaviour.
The group containing I1 and S2 comprised a mix of formulations. Although most of the members were related to industrial thermal mortars manipulated in the laboratory, with the inclusion of one or more thermal insulating aggregates, they also included some industrial formulations (e.g., I4, I8, and I15) despite only recently being introduced into the market. So, this cluster presented a mixture of behaviours and formulations, being a transitional group in terms of composition or performance. Figure 3 shows that the elements of this cluster, although presenting higher values of λ (still lower than the average), also presented better mechanical performance. This cluster is related to the use of nanomaterials inserted as a second (or third) insulating aggregate in formulations already produced industrially, to further decrease their thermal conductivities.
The third cluster is associated with industrial formulations. It is a group that has a more mature development, presenting both high mechanical performance and high values of λ.
When joining the information of PCA and clusters, a clear division between formulations as a function of their origin was found (as in Table 2): laboratory, industrial, or a mixture of both. One of the decisive factors in that division, as expected, was the use of specific lightweight thermal insulating aggregates (e.g., silica aerogel). The associated base formulations (the powder portion without insulating aggregates) also seem to present a significant effect depending on their capacity to adapt to the incorporation of those aggregates. Here, the use of aerogel seemed to have the greatest influence.

Regression Models Results
With the dependent variable chosen (λ10°C,dry), the simple and multiple linear regression models were applied. The simple linear regressions were first evaluated through the scatterplot matrix built in the bivariate analysis, and, from there, Table 10 was obtained. In that table, the highest coefficient of determination (which represents the fraction of the initial variation explained by the regression, thus being a measure of the explanatory capacity) was obtained with λ10°C,28days (in bold). Further analysis of this potential model, between λ10°C,dry and λ10°C,28days, gave the result of Model 1 in Table 11, from which Equation (3) was obtained.
When testing the multiple linear regression models (Models 2 to 6 in Table 11), the variables fc and λ10°C,28days (Model 6) allowed us to obtain an adjusted R² close to that of the other models with more variables, while reducing the amount of data needed, which made this model more readily applicable. Model 6, which led to Equation (4), showed p-values < 0.05 for all variables, and its ANOVA indicated that the independent variables were statistically significant in explaining the dependent variable. However, it presented an interesting behaviour: fc, which in the scatterplot matrix (Figure 1) showed a positive relationship with the dependent variable, here presented a negative one. This can be a sign of confounding (i.e., a variable that influences both dependent and independent variables) [116]; therefore, some care must be taken when applying this model.
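A sign reversal of this kind can be reproduced with a small synthetic example. The sketch below is not Model 6 and uses no data from this study: the predictors x1 and x2 are hypothetical, strongly correlated variables, and y is built from known coefficients so that the effect is visible. It shows one common way in which a variable that is positively related to the response on its own can receive a negative coefficient once a correlated predictor enters the model.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def two_predictor_ols(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2

# Synthetic data (assumption): x2 follows x1 closely, as two physically
# linked properties might; y is constructed with a NEGATIVE weight on x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1]
x2 = [a + e for a, e in zip(x1, noise)]
y = [2.0 * a - 0.5 * b for a, b in zip(x1, x2)]

# Simple regression of y on x2 alone: the slope is positive.
marginal_slope = cov(x2, y) / cov(x2, x2)

# Multiple regression with both predictors: the x2 coefficient is negative.
b0, b1, b2 = two_predictor_ols(x1, x2, y)

print("slope of y on x2 alone:", round(marginal_slope, 3))
print("x2 coefficient with x1 in the model:", round(b2, 3))
```

The positive slope in the one-variable fit reflects the influence of x1 carried by x2, which is why coefficients of correlated predictors should be interpreted with care.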
For both linear and multilinear regression models, a more in-depth analysis was made. The models were further tested, and it was possible to confirm the assumptions of independence, normality, homoscedasticity, and absence of multicollinearity.
The last task involved the validation of the regression models to ensure that the results were generalisable to the population and not just specific to the sample. For the present case, as the sample is as small as the market, the adjusted R² was evaluated against the value of R² [97]. The small difference between the two values indicated a lack of overfitting and, therefore, an application potential [89].
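The comparison between R² and adjusted R² relies on the standard correction for sample size and number of predictors. The short sketch below applies that textbook formula; the sample size and R² passed to it are placeholder values chosen for illustration, not figures from Table 11.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

# Placeholder inputs (assumptions): R^2 = 0.90, 21 formulations, 2 predictors.
r2 = 0.90
adj = adjusted_r2(r2, n_samples=21, n_predictors=2)
gap = r2 - adj

print("adjusted R^2:", round(adj, 4))
print("difference R^2 - adjusted R^2:", round(gap, 4))
```

The gap widens as predictors are added without a matching gain in explained variance, which is why a small gap is read as a sign that the model is not overfitted.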

Discussion of the Regression Models
Using all regression models, it was not only possible to obtain predictive equations of the dependent variable but also to verify the main advantages and disadvantages of each one. Here, there was not a marked contribution of the nanomaterials to the regression models, as the overall behaviour of the thermal mortars seems to follow the same overall trend.
The application of simple linear regression allowed us to predict the dependent variable (λ10°C,dry) using the independent variable λ10°C,28days. This model already presented a high coefficient of determination (> 0.90) with statistical significance.
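For readers who wish to repeat this kind of fit on their own measurements, a minimal least-squares sketch follows. The paired conductivity values are invented for illustration and the resulting slope and intercept are not the coefficients of Equation (3); the variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, with R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Invented values in W/(m.K) (NOT measurements from this study).
lambda_28days = [0.032, 0.047, 0.063, 0.084, 0.105]
lambda_dry = [0.028, 0.041, 0.056, 0.073, 0.093]

slope, intercept, r2 = fit_line(lambda_28days, lambda_dry)

def predict_dry(value_28days):
    """Estimate the dry-state conductivity from the 28-day value."""
    return intercept + slope * value_28days

print("slope:", round(slope, 3), "intercept:", round(intercept, 4))
print("R^2:", round(r2, 4))
print("predicted dry value for 0.070:", round(predict_dry(0.070), 4))
```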
For the multiple linear regression, using the stepwise method, the best model raised the adjusted R² by 0.05 (from 0.93 to 0.98) when compared with the simple linear regression, thus increasing the predictive capacity of the model while keeping the p-values of the F and t tests within the initially proposed limit (< 0.05). However, there were some problems associated with the sign of the variable fc, probably related to some confounding behaviour. Therefore, Equation (3) seems to be a quick, easy, and robust way to predict λ10°C,dry without confounding problems.
Moreover, with the tested regression models, it was possible to obtain a predictive equation that estimates λ10°C,dry many days earlier. Although BDh has been identified as a good predictor in previous works [68,102], which was confirmed (R² ≈ 0.70), it explained a lower proportion of the variance than λ10°C,28days.
Finally, a new regression model was obtained. This model met the requirements of independence, normality, homoscedasticity, and absence of multicollinearity, and was partially validated, allowing researchers to apply it readily.

Application of the Multivariate Statistical Methods
The application of these statistical methods allowed us to analyse the current state-of-the-art thermal mortars' formulations. However, each method can be applied by itself, providing additional information related to the data being analysed. This way, each method can be further applied in other research studies using the following considerations:
• Due to the ease of application, the uni and bivariate analysis allow us to obtain a general overview of the sample's characteristics, the scatter of the data, and the existence of outliers. This analysis is so simple to apply that it is considered a good starting point before applying any other statistical technique;
• The PCA and factor analysis proved to be the techniques that provided the most insights to help develop these products further. They allowed us to find the interdependence between several characteristics and to verify the influence that the use of nanomaterials has on the performance of these formulations. These techniques can be an interesting tool to develop new formulations in the future;
• The cluster analysis, although allowing us to separate formulations into different origins, only served as a verification for the results obtained in the PCA and factor analysis, reinforcing the influence that the use of nanomaterials has on the performance. However, when considering a large amount of data, it can help researchers to discover products that are closely related;
• The regression models allow us to find predictive equations, when possible, for a desired dependent variable. In this case, they allowed us to predict the thermal conductivity at 10 °C in the dry state several days earlier, from the thermal conductivity at 10 °C at 28 days.
These methods allow us to analyse substantial amounts of data and to transform it into structured knowledge, with their relevance and usefulness increasing when used together.

Conclusions
This statistical approach allowed us to better understand the characteristics of current thermal insulating mortars, transforming data into knowledge. With this analysis, it was possible to identify and evaluate the influence that formulations and thermal insulating aggregates have on the performance of these mortars, and the following findings can be highlighted:

• With the uni and bivariate analysis, it was possible to confirm: correlations between the water:powder ratio (w/p) and the remaining variables; the influence that kneading water has on the mortars' performance; correlations between the bulk density in the hardened state (BDh) and the bulk density in the fresh state (BDf); correlations between compressive strength (fc), the dynamic modulus of elasticity (Ed), and flexural strength (ft); and other more interesting ones, such as between BDh, λ10°C,28days, and λ10°C,dry;
• PCA and factor analysis allowed us to identify three main components, the latent structure of which allowed us to classify them into the following categories: thermo-mechanical behaviour, internal cohesion and microstructural porosity behaviour, and water permeability behaviour. Here, the most relevant finding was a factor showing a significant impact of the use of nanomaterials as thermal insulating aggregates, where the increase in mechanical performance decreases the thermal performance. So, both the formulation and the addition of thermal insulating aggregates, like the silica aerogel, influenced their behaviour;
• With the cluster analysis, the findings of PCA were enhanced and verified. Three groups of formulations were obtained, showing the importance that nano thermal insulating aggregates, in conjunction with the base formulations (laboratory, industrial, or industrial tuned in the laboratory), have on the overall performance;
• Although simple and multiple linear regression models were studied, the simplest model was the one with the best results. Therefore, it was possible to obtain a new model that can save researchers time by allowing them to predict λ10°C,dry earlier.
The extensive statistical analysis conducted allowed us to verify that the most significant influences on the performance of these mortars are related to the use of nanomaterials and their higher need for kneading water to reach the desired workability. This excess water, after the initial hydration reactions, leaves voids, pores, and capillaries in the microstructure. The conjunction of both factors leads to lower thermal conductivities, but also decreased mechanical performances, showing a need for further development.
Nevertheless, the introduction of thermal insulating nanomaterials allowed us to obtain thermal conductivities below those of classic thermal insulating materials. This was not possible before, using other types of thermal insulation aggregates; therefore, it is a great achievement in reducing the oil needs (e.g., for EPS manufacture) of these mortars, with significant potential to improve their sustainability.
As current formulations seem only to introduce additional thermal insulating aggregates into existing formulations, the results of this research suggest that it would be beneficial to approach these products from a distinct perspective, that is, to design entirely new formulations instead of only adding thermal insulating aggregates.
The methods and techniques applied in this research present a new critical view on current thermal mortars' performance, namely on how they can be further fine-tuned as multifunctional products. Here, the final application and performance needs will dictate the formulations' design.
Future thermal mortar formulations should preferably present low thermal conductivities and water absorption while having high water vapour permeability and mechanical properties. This seems to be partially achieved using nanomaterials, such as silica aerogel. Therefore, it would be interesting, as a further development, to apply this type of statistical analysis to relate each raw material included in a formulation, in its powder state, to its final mechanical and physical performance after application. This could potentially enable new formulations to be automatically designed as a function of the required multifunctionality. Another interesting study could be related to the testing of different formulations in the same conditions for inter-comparison, in order to determine the specific performance requirements for these innovative products.