A Method to Facilitate Uncertainty Analysis in LCAs of Buildings

Abstract: Life cycle assessment (LCA) is increasingly becoming a common technique to assess the embodied energy and carbon of buildings and their components over their life cycle. However, the vast majority of existing LCAs result in very definite, deterministic values which carry a false sense of certainty and can mislead decisions and judgments. This article tackles the lack of uncertainty analysis in LCAs of buildings by addressing the main causes for not undertaking this important activity. The research uses primary data for embodied energy collected from European manufacturers as a starting point. Such robust datasets are used as inputs for the stochastic modelling of uncertainty through Monte Carlo algorithms. Several groups of random samplings between $10^1$ and $10^7$ are tested under two scenarios: data are normally distributed (empirically verified) and data are uniformly distributed. Results show that the hypothesis on the data no longer influences the results after a high enough number of random samplings ($10^4$). This finding holds true both in terms of mean values and standard deviations and is also independent of the size of the life cycle inventory (LCI): it occurs in both large and small datasets. Findings from this research facilitate uncertainty analysis in LCA. By reducing significantly the amount of data necessary to infer information about uncertainty, a more widespread inclusion of uncertainty analysis in LCA can be encouraged in assessments from practitioners and academics alike.


Introduction and Theoretical Background
Life cycle assessment (LCA) is a common method whereby the whole life environmental impacts of products can be assessed. Applied within the built environment, the focus is on assessment of products [1], assemblies [2] and whole buildings [3,4], and is predominantly concerned with energy consumption and greenhouse gas emissions [2,4,5]. However, buildings and assemblies are complex entities with long and unpredictable lifespans; they are designed and built by a fragmented industry through temporary and shifting alliances and dispersed supply chains. LCA, therefore, while providing an indication of environmental impacts, includes inherent uncertainties.
It was more than 25 years ago that the US Environmental Protection Agency [6] brought to researchers' and practitioners' attention the role and impact of uncertainty and variability in LCA modelling. Nearly ten years have passed since the importance of the topic resurfaced with Lloyd and Ries [7], who revealed a general lack of appreciation of uncertainty modelling and lack of application in life cycle analysis. Even so, almost all life cycle assessment studies in the built environment continue to [...]
Energies 2017, 10, 524
[...] and the complexity of stochastic modelling for uncertainty analysis, offering an innovative solution for the construction sector. The next section introduces the theory and the algorithmic approach developed and used. Results follow, along with a discussion on the wider implication of the findings from this research. The final section concludes the article and presents future work that will be undertaken.

Theory and Methods
This research aims to investigate the impact of assumptions on data distribution on the results of uncertainty analysis and to provide an innovative, simple approach to reduce the complexity of the assessment and time-consuming data collection activities. It is based on robust primary data collected from European manufacturing plants related to the glass and construction industry [53]. (The quality of the data is demonstrated by the pedigree matrix [22], for the data are characterised by the following scores: (2), (2), (1), (2), (1) across the relevant five indicators, which are Reliability, Completeness, Temporal Correlation, Geographical Correlation, and Further Technological Correlation, where (1) is the highest possible score and (5) is the lowest, assumed as the default value.) The algorithm has also been tested on randomly generated data and it works equally well.
The fundamental role of uncertainty analysis in LCAs is easily understood by looking at Figure 1. The figure shows a mock example where the embodied energy (EE) of two alternatives, 1 and 2, is being compared. Most LCAs only provide deterministic, single-valued results which generally represent the most likely values resulting from the specific life cycle inventory (LCI) being considered. These values can be seen as $\mu_1$ and $\mu_2$ in the figure. If a decision were to be taken only based on such numbers, one would conclude that alternative 1 is preferable over alternative 2 since $\mu_1$ is lower than $\mu_2$. This, unfortunately, is often the case in most existing literature in LCAs of buildings.
However, when considering the relative uncertainty (which provides measures of the standard deviations $\sigma_1$ and $\sigma_2$ as well as the probability density function), it is evident that alternative 2 has a far narrower distribution and therefore a much smaller range within which values are likely to vary. This makes, all of a sudden, alternative 2 at least as appealing as alternative 1, if not more. The information on uncertainty and data distribution therefore clearly enables better and more informed decision making.
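The comparison in Figure 1 can be made concrete with a short sketch. The means, standard deviations and decision threshold below are hypothetical values chosen only for illustration (they are not taken from the article or its figure), and both alternatives are assumed to be normally distributed.

```python
from statistics import NormalDist

# Hypothetical embodied energy figures in MJ per functional unit (not from the article).
alt1 = NormalDist(mu=100.0, sigma=30.0)  # lower mean, wide spread
alt2 = NormalDist(mu=110.0, sigma=5.0)   # higher mean, narrow spread
threshold = 125.0                        # hypothetical upper limit for the decision


def central_interval(dist, coverage=0.95):
    """Return the symmetric interval containing `coverage` of the probability mass."""
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def probability_above(dist, limit):
    """Probability that a value drawn from `dist` exceeds `limit`."""
    return 1.0 - dist.cdf(limit)


for name, dist in (("alternative 1", alt1), ("alternative 2", alt2)):
    low, high = central_interval(dist)
    print(f"{name}: mean={dist.mean:.1f}, 95% interval=({low:.1f}, {high:.1f}), "
          f"P(EE > {threshold:.0f})={probability_above(dist, threshold):.4f}")
```

With these assumed numbers, the alternative with the lower mean also carries the larger chance of exceeding the threshold, which is the kind of information a single deterministic value cannot convey.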
In the specific case presented here, the processes of the life cycle inventory (LCI) are from the manufacture of one functional unit (FU) of double skin façade, a flat glass cladding system. In total, 200 items have been followed throughout the supply chain and therefore the dataset available for each entry of the LCI is constituted of 200 data points. For example, cutting the flat glass to the desired measure has been measured and monitored (in terms of energy inputs, outputs, and by-products) for all 200 glass panes and therefore a population of 200 data points is available for that specific process. However, the algorithm presented has also been tested successfully on randomly generated data to ensure its wide applicability. The robust primary data set served the purpose of comparing empirical vs. predicted results.
The data and related life cycle processes refer to embodied energy (MJ or kWh) and embodied carbon (kgCO2e) (for definitions on embodied carbon see, for instance, [8]) and both datasets have been used and tested in this research. Embodied energy has however been the main input to the calculation, since embodied carbon is the total carbon dioxide equivalent emissions of embodied energy given the energy mix and carbon conversion factors which are specific to the geographical context under consideration. Therefore, to avoid duplication of figures, only embodied energy results have been included in the article.
Figure 2 shows the specific embodied energy of three such processes against the frequency with which they occur.
The graphs in the figure present the cumulative distribution function (CDF) of the collected data plotted against the CDF of the data if they were perfectly normally distributed. The check on the approximation of the real distribution of the dataset versus a normal distribution has been conducted through the Z-test [54]. As the plots in the figure show, there is good agreement between the collected data and normally distributed data, and therefore the hypothesis that the collected data were normally distributed was adopted.
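The visual CDF comparison described for Figure 2 can be approximated numerically. The sketch below is not the Z-test used in the article: it swaps in a simpler measure, the largest gap between the empirical CDF of a sample and the CDF of a normal distribution fitted to it. The function name `max_cdf_gap` and the sample values are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def max_cdf_gap(values):
    """Largest absolute gap between the empirical CDF and a fitted normal CDF."""
    fitted = NormalDist(mean(values), stdev(values))
    ordered = sorted(values)
    n = len(ordered)
    gap = 0.0
    for rank, x in enumerate(ordered, start=1):
        model = fitted.cdf(x)
        # The empirical CDF steps from (rank - 1)/n to rank/n at x; check both sides.
        gap = max(gap, abs(rank / n - model), abs((rank - 1) / n - model))
    return gap


# Hypothetical dataset: 200 simulated measurements of one sub-process (MJ).
rng = random.Random(0)
sample = [rng.gauss(50.0, 4.0) for _ in range(200)]
print(f"largest CDF gap: {max_cdf_gap(sample):.3f}")
```

A small gap suggests the normal hypothesis is a reasonable description of the sample; how small is small enough is a judgement that a formal test would have to settle.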
The primary data collection represented a time-consuming and costly activity. Therefore, the research team wondered whether it was possible to infer information on the uncertainty of the result of the LCA as well as its probability distribution without embarking on such an extensive data collection. To do so, the empirical case of collected primary data characterised by a normal distribution has been used as a reference. This was then compared with a less demanding approach to determine variability in the data. The simplest alternative is generally the use of a maximum-minimum variation range, whereby the only known piece of information is that a value is likely to vary within that range. The likelihood of different values within that range remains, however, unknown. From a probabilistic point of view, the closest case to this maximum-minimum scenario is that of a uniform distribution, which is characterised by a defined data variation range within which all values have the same probability. The comparison between these two alternatives, and their influence on the uncertainty of the result of the LCA, was the underpinning idea for this research. This has been tested through the algorithm developed and explained in detail in the next section.

The Algorithm Developed for This Research
To explain the algorithm, let us assume the life cycle inventory (LCI) of the primary data collection is arranged as a vector $P$:

$$P = [s_1, s_2, \ldots, s_j, \ldots, s_m]$$

with $m$ representing the total number of sub-processes and the $j$th entry being, in turn, another vector $s_j$, containing the entire dataset of $n$ measures of embodied energy, $x$, associated with the $j$th manufacturing sub-process:

$$s_j = [x_1, x_2, \ldots, x_i, \ldots, x_n]$$

In order to perform a Monte Carlo analysis, an input domain is required from which $x_i$ variables can be randomly picked according to a given probability distribution. For each sub-process $j$, two sets of continuous input domains are derived here from the discrete data collection $P$. Namely, a vector $N$ containing $m$ pairs of values $\mu_j$ and $\sigma_j$, representing respectively the mean and standard deviation of the $x_i$ values of embodied energy associated with the $j$th sub-process:

$$N = [(\mu_1, \sigma_1), \ldots, (\mu_j, \sigma_j), \ldots, (\mu_m, \sigma_m)]$$

where $\mu_j = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\sigma_j$ is the corresponding standard deviation of the $n$ values in $s_j$, as well as a vector $U$ containing $m$ pairs of values $x_{j,\min}$ and $x_{j,\max}$, representing respectively the minimum and maximum embodied energy values, $x_i$, associated with the $j$th sub-process:

$$U = [(x_{1,\min}, x_{1,\max}), \ldots, (x_{j,\min}, x_{j,\max}), \ldots, (x_{m,\min}, x_{m,\max})]$$

Two sets of Monte Carlo simulations were then run, each with a prescribed increasing number of output samples to be generated (i.e., from $10^1$ to $10^7$). The inputs for the first set were randomly sampled, under the assumption that data were normally distributed (as from $N$), from a probability density function defined as:

$$f(x_j) = \frac{1}{\sigma_j \sqrt{2\pi}} \exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)$$

whereas the assumption of uniform distribution (as from $U$) was used for the second set, and data were randomly sampled from a probability density function defined as:

$$f(x_j) = \begin{cases} \dfrac{1}{x_{j,\max} - x_{j,\min}}, & \text{for } x_{j,\min} \le x_j \le x_{j,\max} \\ 0, & \text{for } x_j < x_{j,\min} \text{ or } x_j > x_{j,\max} \end{cases}$$

As explained, the normal distribution was the distribution that best fitted the primary data collected, whereas the uniform distribution was chosen because it is characterised by the lowest number of required inputs, where only the lower and upper bounds of the variation range are necessary. As such, it represents the least intensive option in terms of data collection to enable an uncertainty analysis.
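The construction of the two input domains can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' MATLAB or Python implementation: the function names and the inventory values are hypothetical, and the population standard deviation (`pstdev`) is an assumption, since the text does not state which estimator was used.

```python
import random
from statistics import mean, pstdev


def summarise_inventory(P):
    """Derive the two input domains from the inventory P (a list of datasets s_j)."""
    N = [(mean(s), pstdev(s)) for s in P]  # (mu_j, sigma_j) for each sub-process
    U = [(min(s), max(s)) for s in P]      # (x_j_min, x_j_max) for each sub-process
    return N, U


def draw_normal(N, rng):
    """Draw one value per sub-process under the normal hypothesis."""
    return [rng.gauss(mu, sigma) for mu, sigma in N]


def draw_uniform(U, rng):
    """Draw one value per sub-process under the uniform hypothesis."""
    return [rng.uniform(low, high) for low, high in U]


# Hypothetical inventory with m = 2 sub-processes and n = 5 measurements each (MJ).
P = [[11.2, 12.5, 12.0, 13.1, 11.8], [28.0, 31.5, 30.2, 29.4, 32.1]]
N, U = summarise_inventory(P)
rng = random.Random()  # unseeded, so every run samples differently
print("normal draw: ", draw_normal(N, rng))
print("uniform draw:", draw_uniform(U, rng))
```

Note that `U` needs only two numbers per sub-process, which is the point of the comparison: those bounds could in principle come from expert judgement rather than from the full dataset.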
The single output obtained at the end of each Monte Carlo iteration is a summation, over all sub-processes, of the (randomly picked) embodied energy values, $x_j$, associated with the $j$th sub-process. This represents an estimate of the embodied energy associated with the LCI, which is in this case the entire manufacturing process. This is because, mathematically, a life cycle impact assessment (LCIA) equals:

$$\mathrm{LCIA} = \sum_{j=1}^{m} x_j$$

which represents the summation of the impacts $x_j$ of the $j$th process $s_j$ across the life cycle inventory $P$ for the relevant impact category being considered (e.g., cumulative energy demand).
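The summation above translates directly into a Monte Carlo loop: each iteration draws one value per sub-process and adds them up. The following is a minimal sketch under stated assumptions. The helper `simulate_lcia` and the parameter pairs in `N` and `U` are hypothetical and are not the article's data, so the printed figures say nothing about its results.

```python
import random
from statistics import mean, pstdev


def simulate_lcia(domains, sampler, n_samples):
    """Return n_samples LCIA estimates, each the sum of one draw per sub-process."""
    return [sum(sampler(a, b) for a, b in domains) for _ in range(n_samples)]


rng = random.Random(42)  # seeded here only to make the illustration repeatable

# Hypothetical input domains for m = 3 sub-processes (MJ).
N = [(12.0, 1.5), (30.0, 4.0), (7.5, 0.8)]   # (mu_j, sigma_j)
U = [(8.0, 16.0), (19.0, 41.0), (5.5, 9.5)]  # (x_j_min, x_j_max)

lcia_normal = simulate_lcia(N, rng.gauss, 10_000)
lcia_uniform = simulate_lcia(U, rng.uniform, 10_000)

for label, values in (("normal", lcia_normal), ("uniform", lcia_uniform)):
    print(f"{label:8s} mean={mean(values):.2f}  std={pstdev(values):.2f}")
```

Passing the sampler as an argument keeps the summation identical for both hypotheses, so any difference in the output comes from the assumed distribution alone.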
The algorithmic approach is also shown visually in Figure 3.
As mentioned, the collection of data to characterise the data distribution is a significantly time-consuming and costly activity, which often severely limits or completely impedes undertaking uncertainty analysis [46]. The literature reviewed has shown that different approaches arose as alternative, more practical solutions. Out of those, one of the most often utilised in practice in the construction industry is the use of expert judgement to identify a range within which data are expected to vary [24,25]. This is then repeated for several or indeed all life cycle processes that constitute the life cycle inventory and it generally leads to calculating a minimum and maximum value for the overall impact assessment, which are then labelled as 'worst' and 'best' case scenario. It has been shown that expert judgements can provide an accurate overview of the variability range [26], and this characteristic held true in the case of the research underpinning this article. Practitioners and professionals involved in the life cycle processes for which primary data have been collected have indeed shown a remarkably accurate sensitivity to the processes' variation range. As a result, it would have been possible to identify the same (or significantly similar) data range which resulted from the extensive data collection by asking people in the industry: "What are the best and worst case scenarios in terms of energy input for this specific process?" However, the answer to such a question tells nothing about the way data are distributed between these scenarios, and both values represent the least probable events, as they are, statistically speaking, the farthest points from the mean (i.e., most likely value). If this approach were propagated throughout the whole LCI, the result would lead to an overall range for the impact assessment which is characterised by significant confidence in terms of inclusion of the true value, but the numbers it produces would likely be of very little use. As a consequence, the resulting decisions could be significantly biased, with the risk of invalidating the merit and efforts of the whole LCA [10].
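The limitation of propagating only best and worst cases can be illustrated numerically. The sketch below is a hypothetical example, not part of the article's method: it sums the bounds of ten assumed processes and then estimates how often a simulated total falls near either extreme, assuming purely for illustration that each process is uniformly distributed within its range.

```python
import random


def best_worst_range(ranges):
    """Propagate min-max ranges through a summation: returns (best, worst)."""
    best = sum(low for low, _ in ranges)
    worst = sum(high for _, high in ranges)
    return best, worst


def share_near_extremes(ranges, margin=0.1, n_samples=20_000, seed=0):
    """Fraction of simulated totals lying within `margin` of the overall
    range width from either the best or the worst case."""
    rng = random.Random(seed)
    best, worst = best_worst_range(ranges)
    band = margin * (worst - best)
    hits = 0
    for _ in range(n_samples):
        total = sum(rng.uniform(low, high) for low, high in ranges)
        if total <= best + band or total >= worst - band:
            hits += 1
    return hits / n_samples


# Ten hypothetical processes, each expected to vary between 8 and 12 MJ.
ranges = [(8.0, 12.0)] * 10
print("best/worst case:", best_worst_range(ranges))
print("share of totals near an extreme:", share_near_extremes(ranges))
```

Under these assumptions the overall range is wide, yet totals close to either end are very rare, which is why the two extreme figures alone are of limited use for decisions.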
Therefore, to combine the benefits of a lighter data collection with the benefit of an uncertainty analysis, the algorithmic approach developed for this research has tested whether and to what extent the knowledge of a data distribution within a known data variation range influences the outcome of an uncertainty analysis undertaken through Monte Carlo simulation. The algorithmic approach has been developed, implemented and tested in MATLAB R2015b (8.6.0). Once initial results were obtained, the robustness of the findings was further tested for validity by one of the co-authors, who independently implemented an algorithm in Python with the aim of addressing the same research problem. Findings were therefore confirmed across the two programming languages with different algorithms written by two of the authors to strengthen the reliability of our results. A comparison of the results produced in MATLAB and Python 3.5 is shown in Figures 4 and 5 for the mean values $\mu$ and the standard deviations $\sigma$ respectively.
To broaden the impact and applicability of the approach developed, and to strengthen the relevance of its findings, two extreme cases have been tested:

• LCI is constituted by as few as two entries (e.g., only two life cycle processes); an example could be a very simple construction product or material such as unfired clay;
• LCI is constituted by as many entries as those for which collected primary data were available.
In terms of number of samplings and iterations, each run of the Monte Carlo algorithm randomly and iteratively samples $10^i$ values (with $i = 1, \ldots, 7$, each step increasing the number of samplings tenfold to $10^{i+1}$) from within each range under the pertinent assumption regarding data distribution (that is, uniform in one case and normal in the other). This process repeats across all entries of the LCI and produces the final figures for the impact assessment. The random sampling mechanism is not pre-assigned or registered, and it varies at each new run of the algorithm. The algorithm stops after 10 different runs, a high enough number to ensure that potential biases in the variability with which the random sampling operates would emerge.
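The run structure described above (increasing sample sizes, repeated over several unseeded runs) can be sketched as follows. This is an assumption-laden illustration, not the authors' code: `convergence_table`, `percent_difference` and the input pairs are hypothetical, differences are expressed relative to the normal case (the text does not say which reference was used), and the default exponents stop at $10^4$ rather than $10^7$ to keep the run short.

```python
import random
from statistics import mean, pstdev


def lcia_samples(domains, sampler, n):
    """n LCIA estimates, each the sum of one random draw per LCI entry."""
    return [sum(sampler(a, b) for a, b in domains) for _ in range(n)]


def percent_difference(reference, other):
    """Absolute difference between two values as a percentage of the reference."""
    return abs(reference - other) / abs(reference) * 100.0


def convergence_table(N, U, exponents=range(1, 5), runs=10, seed=None):
    """Average percentage difference in mean and standard deviation between
    the normal and uniform hypotheses, for each sample size 10**i."""
    rng = random.Random(seed)  # seed=None gives a different sequence at every call
    table = {}
    for i in exponents:
        n = 10 ** i
        d_mu, d_sigma = [], []
        for _ in range(runs):
            normal = lcia_samples(N, rng.gauss, n)
            uniform = lcia_samples(U, rng.uniform, n)
            d_mu.append(percent_difference(mean(normal), mean(uniform)))
            d_sigma.append(percent_difference(pstdev(normal), pstdev(uniform)))
        table[n] = (mean(d_mu), mean(d_sigma))
    return table


# Hypothetical two-entry LCI (MJ): (mu, sigma) pairs and (min, max) pairs.
N = [(12.0, 1.5), (30.0, 4.0)]
U = [(8.0, 16.0), (19.0, 41.0)]
for n, (d_mu, d_sigma) in convergence_table(N, U).items():
    print(f"n={n:>6}: mean diff={d_mu:.3f}%  std diff={d_sigma:.3f}%")
```

How the differences behave as the sample size grows depends on how the ranges in `U` relate to the parameters in `N`, so the output of this sketch with made-up inputs should not be read as a check on the article's figures.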

Results and Discussion
Figure 6 shows the results of one of the ten runs of the algorithm for both hypotheses on data distribution: uniform ($S_1$) and normal ($S_2$).
The histograms refer to the LCIA (overall impact) and not to the LCI (individual inputs). All seven sampling cases are shown, from $10^1$ to $10^7$ random samplings. The figures show that for lower numbers of samplings the characteristics of the two distributions are still evident, whereas from $10^4$ random samples upwards the results of the two methods converge. The algorithm also calculates the difference between the mean values and standard deviations and normalises them to a percentage. This information is presented in the last (bottom-right) graph of Figure 6. It can be seen how the difference between $\mu$ and $\sigma$ under the two hypotheses is no longer influential after $10^3$ to $10^4$ samplings. If for the mean this could be expected as a consequence of the central limit theorem, this finding is noteworthy for the standard deviation.
Figure 7 shows the $\mu$ and $\sigma$ variation (percentage) between the two hypotheses across all 10 runs in the case of a simplistic LCI composed of two entries. In all runs, regardless of the initial difference for lower numbers of samplings, after $10^5$ random samplings the average differences stabilise around:
• 0.01% for $\mu$, and
• 1% for $\sigma$.
Figure 8 presents the same results but for the full LCI as explained in the research design section. In the case of the detailed LCI, after $10^4$ random samplings the average differences stabilise around:
• 0.01% for $\mu$, and
• 1% for $\sigma$.
This demonstrates the validity of the approach developed regardless of the size of the LCI (i.e., number of entries from which the algorithm samples randomly). It should be noted that although it takes $10^5$ samplings to achieve perfect convergence in the case of an LCI made of just two entries, the stabilisation of any variation around 0.01% for $\mu$ and 1% for $\sigma$ is already clearly identifiable from $10^4$ samplings.
In terms of computational costs, the algorithm is extremely light: the whole set of 10 runs, each of which has seven iterations, from $10^1$ to $10^7$ random samplings, takes only 61.594 s to run in the case of the full LCI on a MacBook Pro. Specifically, the core computations take 38 s.
To ensure that the LCI with as few as two entries would be representative of a generic case and not just of a fortunate combination of two processes that confirm the general case, we have further tested this simplified inventory through five combinations of two processes randomly picked from the database. These have been tested again on both embodied energy and embodied carbon data, and graphical results for the embodied carbon in all five cases are provided as supplementary material attached to this article (Figures S1-S5).
These findings address current challenges in uncertainty analysis in LCA as described in the introduction, and have implications for both theory and practice. Firstly, it has been shown that extensive data collection to characterise the data distribution is not necessary to undertake an uncertainty analysis. As few as two numbers (the upper and lower bounds of the data variation range) combined with a sensible use of the power of Monte Carlo simulation suffice to characterise and propagate uncertainty. Secondly, it has also been found that $10^4$ random samplings are sufficient to achieve convergence, thus establishing a reference number for random samplings within Monte Carlo simulation, at least for uncertainty analysis from LCIs similar to the one in question, i.e., built environment studies. Thirdly, the high computational costs often associated with Monte Carlo simulation have been reduced through the simple and innovative algorithm developed.

Conclusions and Future Work
The research presented in this article has addressed current challenges in uncertainty analysis in LCA of buildings. By means of an innovative approach based on Monte Carlo simulation, it has tested the influence of two different assumptions on data distribution on the final results of the uncertainty analysis. The assumptions tested were (1) data with a normal distribution and (2) data with a uniform distribution, both based on primary data collected as part of a robust dataset. Results have demonstrated that after 10^4 random samplings from within the data variation range, the initial assumption on whether the data were normally or uniformly distributed loses relevance, both in terms of mean values and standard deviations.
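The kind of comparison summarised above can be sketched in Python as follows. This is a hedged illustration rather than the algorithm used in this research: the inventory ranges are invented, and the normal case is parameterised by assumption (centred on the range, with sigma equal to one sixth of the range, and redrawn until the value falls within the bounds). The sketch tracks only the mean of the inventory total, because a comparison of standard deviations depends on how the normal case is parameterised, which is not restated here.

```python
import random
import statistics


def sample_total(bounds, n_samples, rng, distribution):
    """Return n_samples Monte Carlo totals of the inventory under one hypothesis."""
    def draw(lo, hi):
        if distribution == "uniform":
            return rng.uniform(lo, hi)
        # Assumed normal parameterisation: centred on the range, sigma = range / 6,
        # redrawn until the value falls within [lo, hi].
        mu, sigma = (lo + hi) / 2, (hi - lo) / 6
        while True:
            x = rng.gauss(mu, sigma)
            if lo <= x <= hi:
                return x

    return [sum(draw(lo, hi) for lo, hi in bounds) for _ in range(n_samples)]


def mean_variation(bounds, n_samples, seed=0):
    """Percentage difference between the mean totals of the two hypotheses."""
    rng = random.Random(seed)
    mu_uniform = statistics.fmean(sample_total(bounds, n_samples, rng, "uniform"))
    mu_normal = statistics.fmean(sample_total(bounds, n_samples, rng, "normal"))
    return 100 * abs(mu_uniform - mu_normal) / mu_normal


lci_bounds = [(10.0, 20.0), (50.0, 80.0), (5.0, 6.0)]  # hypothetical ranges
variations = {n: mean_variation(lci_bounds, n) for n in (10, 100, 1_000, 10_000)}
for n, v in variations.items():
    print(f"n = {n:>6}: mean variation = {v:.3f} %")
```

Extending the tuple of sample sizes towards 10^7 follows the same pattern, at the cost of a longer run time in pure Python.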
The consequence of these findings is that an initial characterisation and propagation of uncertainty within a life cycle inventory during an LCA of buildings can be carried out without the usually expensive and time-consuming primary data collection. The approach presented here therefore offers a simplified way to include uncertainty analysis within the LCA of complex construction products, assemblies and whole buildings. In turn, this should encourage increased confidence in, and therefore increased uptake of, LCA calculation in the construction industry, leading eventually to a meaningful and reliable reduction of environmental impacts.
Future work will test the influence of further data distributions used and found in LCA practice in the built environment, including triangular, trapezoidal, lognormal, and beta distributions. In its current form the approach requires the use of MATLAB, which could limit its widespread adoption. As a consequence, we intend to develop a free web-based application hosted on an academic website, where academics and practitioners alike can upload their data as Excel or .csv files and get results in both graphical and datasheet formats. In the meantime, interested users are encouraged to contact the authors to find out more about the algorithm or to have their LCI data processed for uncertainty analysis.

Figure 1. Importance of uncertainty analysis in life cycle analyses (LCAs).

Figure 2. Examples of three life cycle processes that are part of the primary data for this research.

Figure 3. Algorithmic approach developed for this research.

Figure 4. Difference between results produced in MATLAB and Python for the mean values of embodied energy of the LCI.

Figure 5. Difference between results produced in MATLAB and Python for the standard deviation values of embodied energy of the life cycle inventory (LCI).

Figure 6. Example of the results of the developed algorithm for one specific run (results refer to embodied energy (EE) (MJ) but show the same trend for embodied carbon (EC) (kgCO2e)).

Figure 7. μ and σ variation (percentage) between the two hypotheses across all 10 runs (only two LCI entries).

Figure 8. μ and σ variation (percentage) between the two hypotheses across all 10 runs (for the full LCI).
