Sacrificial Pseudoreplication in LEED Cross-Certification Strategy Assessment: Sampling Structures

The study aims to suggest sampling structures to avoid sacrificial pseudoreplication in the evaluation of Leadership in Energy and Environmental Design (LEED)-certified projects. The sampling includes two structures that exclude sacrificial pseudoreplication and one structure that leads to sacrificial pseudoreplication: (i) The state is the sampling frame in which LEED projects are treated as primary sampling units; (ii) The US is the sampling frame, the state is the primary sampling unit in which LEED projects are treated as evaluation units; and (iii) The US is the sampling frame in which LEED projects are pooled from different states and treated as primary sampling units. The three sampling structures are applied to the evaluation of the Silver-to-Gold cross-certification performances of LEEDv3 for new construction and LEEDv3 for existing buildings. The same cross-certification strategy was revealed if either structure (i) or structure (ii) was applied, while it was poorly estimated and misinterpreted if structure (iii) was applied, i.e., sacrificial pseudoreplication had occurred.


Introduction
We start with the statistical terminology necessary for explaining the study (Table 1).

Terminology (Reference)
The sampling frame is a collection of primary sampling units "accessible for sampling in the population of interest" [1].
The primary sampling unit is statistically independent of other primary sampling units within the same sampling frame [1].
Evaluation units are nested in the primary sampling unit and are statistically dependent on that primary sampling unit [1].
Simple pseudoreplication occurs when multiple measurements are performed on individual primary sampling units and treated statistically as if each represents a separate primary sampling unit [2].
Temporal pseudoreplication occurs when multiple measurements are performed over time on individual primary sampling units and treated statistically as if each represents a different primary sampling unit [2].
Sacrificial pseudoreplication occurs when the evaluation units from different primary sampling units are pooled and each evaluation unit is treated as a primary sampling unit [2].

"Pseudoreplication is a serious type of statistical error that is unfortunately common in all the sciences. It was originally defined in the context of manipulative experiments but can also occur in observational studies" [3]. The analysis of data on Leadership in Energy and Environmental Design (LEED)-certified projects is an observational study [4]. Therefore, the problem of pseudoreplication should be considered when analyzing LEED project data.
The term pseudoreplication was coined by Hurlbert [5] and defined as ". . . a particular combination of experimental design (or sampling) and statistical analysis which is inappropriate for testing the hypothesis of interest". Pseudoreplication typically occurs when observations or data points are inappropriately treated as independent replicates [5][6][7]. There are several types of pseudoreplication: simple, temporal, and sacrificial. One of the most common types is sacrificial pseudoreplication [6]. This study focuses on the sacrificial pseudoreplication that arises when the evaluation units from different primary sampling units are pooled, leading to "[but not universal] . . . exaggeration of both the strength of the evidence for a true difference between treatments and of the precision with which any difference that does exist has been estimated" [2]. Recently, when Wu et al. [8] and Wu et al. [9] analyzed the LEED project data, they utilized pooling procedures that achieved extremely low p-values. An evaluation of LEED project data under different sampling structures (without and with pooling) is discussed below.
The collection of Certified, Silver, Gold, and Platinum LEED-certified projects on the US Green Building Council (USGBC) website [10] is a unique database that has been used by researchers to perform retrospective analyses of LEED-certified building certification, category, and cross-certification performances [4,8,9,11,12]. Such analyses produce feedback that can be useful information for experts who oversee further improvements in the seven LEED categories: sustainable sites (SS), water efficiency (WE), energy and atmosphere (EA), material and resources (MR), indoor environmental quality (EQ), innovation in design (ID), and regional priority (RP).
In three recent cases [4,8,9], a single-unit design structure of the analysis, according to the term originally suggested by Hurlbert [7] (p. 652), was used. In these studies, all LEED projects were treated as statistically independent units using a non-parametric Wilcoxon-Mann-Whitney (WMW) test [13]. The WMW test was chosen because LEED data are measured on an ordinal scale.
Wu et al. [8] pooled all LEED projects within the US to determine cross-certification performances of LEED-NCv2.2-certified projects in 2007-2015. Sample sizes for Silver and Gold LEED projects were n1 = 1798 and n2 = 2469, respectively [8] (p. 174, Table 8). Based on the absolute effect size (mean credit increase [MCI]) and statistical difference (p-values) of the pooled data, the authors concluded that at least three categories, EA (MCI = 2.72 and p = 0.000), EQ (MCI = 1.50 and p = 0.000), and SS (MCI = 1.12 and p = 0.000), were propelling categories for moving the projects from Silver to Gold certification [8] (p. 173, Table 6).
Wu et al. [9] recently pooled all LEED projects within the US to determine cross-certification performances of LEED-NCv3 projects in 2007-2015. Sample sizes for Silver and Gold LEED-certified projects were n1 = 1310 and n2 = 1201, respectively [9] (p. 375, Table 9). Based on MCI and p-values of the pooled data, the authors concluded that at least three categories, EA (MCI = 5.46 and p = 0.000), SS (MCI = 2.57 and p = 0.000), and WE (MCI = 0.84 and p = 0.000), were propelling categories for moving from Silver to Gold certification [9] (p. 375, Table 7).
One possible reason for such contradictory results is that different US states have adopted different green and energy codes, and different versions of those codes [15,16]. For example, one of the main energy regulations, ASHRAE 90.1 Standard (Energy Standard for Buildings Except Low-Rise Residential Buildings), is not under national regulation. Hence, states are free to adopt any version of ASHRAE 90.1, and different ASHRAE 90.1 versions are adopted in different US states [15]. However, the different ASHRAE 90.1 versions adopted in different US states can have different influences on one of the most highly weighted LEED categories, EA [4]. In addition, the ASHRAE 90.1 [17] version adopted in a given state has usually changed over the years. For example, in CA: 2011-ASHRAE 90.[17]. Thus, the adopted ASHRAE 90.1 version can change once every three years in the same state, while the typical duration of building construction is approximately two years. For this reason, to minimize the evaluation of LEED projects in the same state under different ASHRAE 90.1 versions, we propose that the collection period of the LEED projects must not exceed one year. Therefore, due to both the spatial and temporal influences of the ASHRAE 90.1 energy code on the EA category, LEED projects certified in different years cannot be pooled in one US group as was done by Wu et al. [8] and Wu et al. [9].
In the context of statistical terminology, a single-unit design allows three methods of analysis: (i) If a sampling frame contains primary sampling units, and the primary sampling unit does not contain evaluation units, then primary sampling units are treated as statistically independent units; (ii) If a sampling frame contains primary sampling units, and a primary sampling unit contains evaluation units, then evaluation units must be averaged within each primary sampling unit, and then primary sampling units are treated as statistically independent units. However, (iii) if a sampling frame contains primary sampling units that contain evaluation units, and evaluation units are pooled from two or more primary sampling units and treated as primary sampling units, it will lead to the problem of sacrificial pseudoreplication [5]. Ignoring the sampling structure of the collected data leads to sacrificial pseudoreplication and can lead to "artificially inflated degrees of freedom, giving the illusion of having a more powerful test than the data support" [1].
In our case, pooling LEED projects from different US states and treating them as primary sampling units can lead to erroneous conclusions about cross-certification strategies [4]. Thus, based on the Kozlov and Hurlbert [18] study, to avoid erroneous conclusions about cross-certification strategies, the following logical structure should be accepted: any two LEED projects in the same state share more similar "green building policy" conditions than any two LEED projects from different states. It is suggested that states in the US are "natural" groups or clusters. If a single-unit design structure is used, as in the present study, then only the primary sampling units are used in the inferential statistical analysis.
The goal of the present study is to correct the statistical malaise in evaluating and interpreting the LEED certification process. The corrected statistical analysis included two possible evaluation structures: (i) individual state sampling frame analysis and (ii) the US sampling frame analysis. These two structures were contrasted with a third evaluation structure: (iii) sacrificial pseudoreplication analysis.
Application of the three evaluation structures was demonstrated on a case study of LEED-NCv3 and LEED-EBv3 projects certified in four US states in 2016. For both certification schemes, only the most popular certifications, Silver and Gold [4], were considered. The cross-certification analyses were performed to reveal which LEED categories are Silver-to-Gold propelling. This will serve as valuable information for understanding the strategies employed by green building practitioners in moving projects toward a higher certification level.

Design of the Study
Only Silver and Gold projects of LEED-NCv3 and LEED-EBv3 in the US in 2016 were considered for the analysis, which was performed in the following three consecutive steps:

1. In this analysis, each of the three structures was evaluated using either approximate or exact WMW tests: (i) individual state sampling frame analysis used an approximate WMW test, (ii) the US sampling frame analysis used an exact WMW test, and (iii) sacrificial pseudoreplication analysis used an approximate WMW test. To perform an approximate WMW test [11], balanced sample sizes of Silver (n1) and Gold (n2) projects with n1 = n2 ≥ 9 were required. An approximate WMW test was used for states with suitable sample sizes of LEED projects, and LEED projects from each such state were then randomly selected. To perform an exact WMW test, the minimum sample size was n1 = n2 = 4 [19] (p. 19). To obtain primary sampling units, the median Silver and Gold project in each state was computed and collected.

2. USGBC scorecards of the randomly selected projects were retrieved from the USGBC website new construction building directory [20] and existing building directory [21]. Then, information on the retrieved projects regarding the awarded points in the six main categories (SS, WE, EA, MR, EQ, and ID) was accumulated. Finally, the RP points were also accumulated and redistributed among the five relevant main categories.

3. For each of the three structures, (i) individual state sampling frame analysis (the state is the sampling frame in which LEED projects are treated as primary sampling units), (ii) the US sampling frame analysis (the US is the sampling frame, the state is the primary sampling unit in which LEED projects are treated as evaluation units), and (iii) sacrificial pseudoreplication analysis (the US is the sampling frame in which LEED projects are pooled from different states and treated as primary sampling units), robust statistical analysis was performed to compare the randomly selected Silver- and Gold-certified projects on an ordinal scale.
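The way the same selected projects feed the three structures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the point values are randomly generated placeholders rather than USGBC data, and the names `raw`, `balanced_sample`, `structure_i`, `structure_ii`, and `structure_iii` are hypothetical.

```python
import random
from statistics import median

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical EA points per project; invented placeholder data, NOT from the study.
STATES = ["CA", "WA", "VA", "TX"]
raw = {
    s: {
        "Silver": [rng.randint(5, 15) for _ in range(rng.randint(9, 20))],
        "Gold": [rng.randint(10, 25) for _ in range(rng.randint(9, 20))],
    }
    for s in STATES
}

def balanced_sample(silver, gold, rng):
    """Randomly select equally many Silver and Gold projects (n1 = n2)."""
    n = min(len(silver), len(gold))
    return rng.sample(silver, n), rng.sample(gold, n)

selected = {s: balanced_sample(raw[s]["Silver"], raw[s]["Gold"], rng) for s in STATES}

# (i) State as sampling frame: each state is analyzed separately,
#     and its projects are the primary sampling units.
structure_i = selected

# (ii) US as sampling frame: one median per state and certification level,
#      so the states are the primary sampling units (n1 = n2 = 4).
structure_ii = (
    [median(silver) for silver, _ in selected.values()],
    [median(gold) for _, gold in selected.values()],
)

# (iii) Sacrificial pseudoreplication: projects pooled across states and
#       (wrongly) treated as independent primary sampling units.
structure_iii = (
    [x for silver, _ in selected.values() for x in silver],
    [x for _, gold in selected.values() for x in gold],
)

print(len(structure_ii[0]), len(structure_iii[0]))
```

Structure (iii) hands the test many more "replicates" than structure (ii) from exactly the same projects, which is where the inflated degrees of freedom come from.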

Statistical Analysis
LEED data are presented on ordinal scales. Therefore, two nonparametric methods, Cliff's δ [22] and the WMW test [13], were used to compare the two unpaired groups. The data are presented as the median ± interquartile range (IQR, 25th-75th percentile).
Cliff's δ was used to measure the substantive significance (effect size) between two unpaired groups. Cliff's δ [22] (p. 495) is expressed as

$$\delta = \frac{\#(x_1 > x_2) - \#(x_1 < x_2)}{n_1 n_2}$$

in which $x_1$ and $x_2$ are scores within group 1 and group 2, respectively; $n_1$ and $n_2$ are the sizes of the sample groups, group 1 and group 2; and # indicates the number of times.
The effect size is considered (i) negligible if |δ| < 0.147, (ii) small if 0.147 ≤ |δ| < 0.33, (iii) medium if 0.33 ≤ |δ| < 0.474, or (iv) large if |δ| ≥ 0.474 [23]. According to Cohen [24] (p. 156), "a medium effect is visible to the naked eye of a careful observer. A small effect is noticeably smaller than medium but not so small as to be trivial. A large effect is the same distance above the medium as small is below it." It should be noted that these effect size thresholds are not "iron-clad criteria" [25] but only a general rule of thumb that might be followed in the absence of knowledge of the area [26].
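As an illustration, Cliff's δ and the magnitude labels above can be computed directly by counting pairs. The sketch below is a plain-Python reading of the standard definition; the function names `cliffs_delta` and `magnitude` and the example scores are hypothetical and are not taken from the study.

```python
def cliffs_delta(x1, x2):
    """Cliff's delta: (#(x1 > x2) - #(x1 < x2)) / (n1 * n2) over all pairs."""
    greater = sum(1 for a in x1 for b in x2 if a > b)
    less = sum(1 for a in x1 for b in x2 if a < b)
    return (greater - less) / (len(x1) * len(x2))

def magnitude(delta):
    """Label |delta| with the thresholds quoted in the text [23]."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"

# Invented example scores (points in one category), NOT data from the study.
gold = [12, 14, 15, 17]
silver = [9, 10, 12, 13]

delta = cliffs_delta(gold, silver)
print(delta, magnitude(delta))  # 0.8125 large
```

Here 14 of the 16 Gold-Silver pairs favor Gold, one favors Silver, and one is tied, so δ = (14 − 1)/16 = 0.8125.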
WMW test. The WMW test was used to determine the statistical difference (p-value) between two unpaired groups. It should be noted that the WMW test can be applied in two forms: approximate or exact [27]. If sample sizes were n1 = n2 ≥ 9, then an approximate WMW test was used [28] (p. 56). Mann and Whitney [13] (p. 50) noted that when sample sizes reach n1 = n2 = 8, then "at this point the distribution is almost normal". If sample sizes were n1 = n2 = 4, an exact WMW test was used [19] (p. 19). In both tests, a two-tailed p-value was applied. Only balanced sample size cases were studied.
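To make the exact form concrete, the sketch below computes a two-tailed exact p-value by enumerating every way of assigning the observed ranks to group 1. It is an illustrative permutation implementation on midranks, assumed here for clarity; it is not the tabulated procedure of [19], and the function names `midranks` and `exact_wmw_p` and the example data are hypothetical.

```python
from itertools import combinations

def midranks(values):
    """Assign 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def exact_wmw_p(x1, x2):
    """Two-tailed exact p-value from the permutation distribution of the rank sum."""
    n1 = len(x1)
    ranks = midranks(list(x1) + list(x2))
    expected = n1 * (len(ranks) + 1) / 2  # mean rank sum of group 1 under H0
    observed = abs(sum(ranks[:n1]) - expected)
    count = total = 0
    for idx in combinations(range(len(ranks)), n1):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-12:
            count += 1
    return count / total

# Invented state medians (n1 = n2 = 4), NOT data from the study.
silver_medians = [9, 10, 11, 12]
gold_medians = [13, 14, 15, 16]

p = exact_wmw_p(silver_medians, gold_medians)
print(round(p, 4))  # 0.0286
```

With n1 = n2 = 4 there are only C(8, 4) = 70 possible rank assignments, so even complete separation of the two groups gives a two-tailed p-value of 2/70 ≈ 0.029. This is the smallest value the exact test can return at this sample size, which is worth keeping in mind when comparing it with the very small p-values that pooled analyses produce.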
Neo-Fisherian significance assessments. In the current study, the standard type of significance assessment, the hybrid of the Paleo-Fisherian and Neyman-Pearsonian paradigms [i.e., null hypothesis significance tests (NHST)], is replaced by a neo-Fisherian assessment, as recommended by Hurlbert and Lombardi [14,29]. The neo-Fisherian paradigm (1) does not fix α, (2) does not describe p-values as 'significant' or 'nonsignificant', (3) does not accept null hypotheses based on high p-values but only suspends judgment, (4) interprets significance tests according to "three-valued logic", and (5) presents effect size information in conjunction with significance tests. Analyses conducted under this paradigm are termed neo-Fisherian significance assessments (NFSAs) [14]. NFSAs are used to interpret the signs and magnitudes of the statistical effects [14]. Based on NFSAs, precise p-values were evaluated and reported according to a three-valued logic as follows: "it seems to be positive" (i.e., there seems to be a difference between group 1 and group 2), "it seems to be negative" (i.e., there does not seem to be a difference between group 1 and group 2), and "judgment is suspended" regarding the difference between group 1 and group 2 [14,29].

Results
The selected states, with their total number of LEED projects and number of randomly selected LEED projects, are presented in Table 2. Numbers in regular font show the total number of LEED projects. Numbers in parentheses show the LEED projects randomly selected for the balanced non-parametric statistical analysis. The resulting number of states with Silver- and Gold-certified projects was n1,states = n2,states = 4 and included California (CA), Washington (WA), Virginia (VA), and Texas (TX) (Table 2). This was sufficient to perform an exact WMW test in the (ii) US sampling frame analysis [19]. The number of selected Silver and Gold projects, n1,projects = n2,projects, was different for each of the four states, but it was sufficient to perform an approximate WMW test in the (i) individual state sampling frame analysis and the (iii) sacrificial pseudoreplication analysis.

LEED-NC 2009 Certified Projects
The individual state sampling frame analysis. Strategies of each of the four states toward moving from Silver to Gold for LEED-NCv3 are presented in Table 3. According to these strategies, EA is the most accepted propelling category, as it was involved in all four states. These results confirm the popularity of the EA propelling category in the cross-certification performance of LEED-NCv3-certified projects, which was revealed by other studies [4,9]. In particular, Pushkar and Verbitsky [4], who analyzed LEED-NCv3 projects certified in 2016, revealed that EA was the most propelling category in Florida (FL), Illinois (IL), and Massachusetts (MA), while EQ was the most propelling category in New York (NY). SS, EQ, and ID are moderately accepted propelling categories (Table 3); these were involved in two of the four states. Again, the popularity of SS and EQ for moving from Silver to Gold was revealed earlier by Wu et al. [9], who outlined SS as one of the main propelling categories, and by Pushkar and Verbitsky [4], who named both SS and EQ as moderate propelling categories. ID was also reported as a somewhat moderate propelling category for Silver-Gold cross-certification performance [9].
According to the present research, WE and MR were not involved in Silver-Gold performance in any of the four states (Table 3). These two categories were noted as the worst-performing categories in LEED-NCv3 cross-certification performance [4,9]. As explained by Wu et al. [9], MR points are difficult to achieve because there is little scope for reducing construction material consumption.
The US sampling frame analysis. The overall tendency in the strategy employed by the four states toward moving from Silver to Gold for LEED-NCv3, obtained with the US sampling frame analysis, is presented in Table 4. According to the revealed strategies, EA and EQ were the most accepted propelling categories, SS and ID were the moderately accepted propelling categories, and WE and MR were not accepted as propelling categories at all. In general, the results of the US sampling frame analysis (Table 4) match the results of the individual state sampling frame analysis (Table 3). The only exception is the EQ category. In the individual state sampling frame analysis (Table 3), EQ was grouped with the moderately accepted propelling categories (SS, EQ, and ID), and in the US sampling frame analysis (Table 4), it was grouped with the most accepted propelling categories (EA and EQ).

The sacrificial pseudoreplication analysis. The overall tendency in the strategy employed by the four states for LEED-NCv3, obtained when projects from different states were pooled in one US sampling frame, is presented in Table 5. According to these revealed strategies, SS, EA, EQ, and ID were the most accepted propelling categories, while WE and MR were not accepted as propelling categories at all. In general, the results obtained when projects from different states were pooled in one US sampling frame (Table 5) differed from the results of the individual state sampling frame analysis (Table 3). The differences concerned the SS, EQ, and ID categories: in the individual state sampling frame analysis (Table 3), these three categories were grouped under the moderately accepted propelling categories (SS, EQ, and ID), and in the analysis where different states were pooled in one US sampling frame (Table 5), these categories were grouped under the most accepted propelling categories (SS, EA, EQ, and ID).

LEED-EB 2009 Certified Projects
The individual state sampling frame analysis. Strategies of each of the four states toward moving from Silver to Gold for LEED-EB 2009 are presented in Table 6. According to the results, all six categories were propelling: SS and EQ were the most accepted, and WE, EA, MR, and IO were moderately accepted propelling categories. Thus, in contrast to WE and MR, which were not employed in new projects certified under LEED-NCv3 (Table 3), these categories were active in renovated buildings certified under LEED-EBv3 (Table 6). It can be suggested that site and building restrictions are weaker in new building design and construction than in the renovation of existing buildings. Therefore, in selecting a preferred design strategy, a team has more flexibility in new buildings than in renovated buildings. As a result, in new building certification, difficult points such as MR points were not considered as a preferred Silver-Gold cross-certification strategy (Table 3), while in renovated buildings, all possible strategies were employed (Table 6).

The US sampling frame analysis. The overall tendency in the strategy employed by the four states toward moving from Silver to Gold for LEED-EBv3, obtained with the US sampling frame analysis, is presented in Table 7. According to the revealed strategies, SS and EQ were the most accepted propelling categories; WE, EA, MR, and ID were the moderately accepted propelling categories. The US sampling frame analysis results (Table 7) were the same as the results of the individual state sampling frame analysis (Table 6). It is interesting that EQ was one of the most propelling categories in both new buildings (certified under LEED-NCv3) and renovated buildings (certified under LEED-EBv3). However, the additional most propelling category differed between these schemes: EA in LEED-NCv3 and SS in LEED-EBv3. It should be noted that SS and EA have the same number of achievable points, 26 pt and 35 pt, respectively, in both LEED-NCv3 and LEED-EBv3 [30,31]. However, there are differences in some inherent credits between LEED-NCv3 and LEED-EBv3 [30,31]. For example, the SS category in LEED-NC 2009 [30] includes four separate rigid sub-credits for SSc4 Alternative Transportation (12 pt) (SSc4.1 Public Transportation Access, SSc4.2 Bicycle Storage & Changing Rooms, SSc4.3 Low-Emitting & Fuel-Efficient Vehicles, and SSc4.4 Parking Capacity). In contrast, the SS category in LEED-EB 2009 [31] includes one flexible SSc4 Alternative Commuting Transportation credit (15 pt), which covers the reduction in regular conventional commuting trips (when individual conventional automobiles are replaced with alternative means, for example, mass transit, walking, or bicycles). The SSc4 credit is more flexible and carries more points in LEED-EB 2009 [31], whereas it is more rigid and carries fewer points in LEED-NC 2009 [30]. This is possibly the reason why SS was one of the most propelling categories in LEED-EB 2009 [31] but not in LEED-NC 2009 [30].
The sacrificial pseudoreplication analysis. The overall tendency in the strategy employed by the four states for LEED-EBv3, obtained when projects from different states were pooled in one US sampling frame, is presented in Table 8. According to the revealed strategies, SS, EA, MR, EQ, and IO were the most accepted propelling categories, while WE was a moderately propelling category. The results for states pooled in one US sampling frame (Table 8) did not match the results of the individual state sampling frame analysis, in which only SS and EQ were the most propelling categories (Table 6). It is interesting that the most propelling categories in new buildings (SS, EA, EQ, and ID) and in renovated buildings (SS, EA, MR, EQ, and IO) were almost the same. There was only one exception: the MR category. MR was not revealed as a propelling category in LEED-NCv3-certified projects (Table 5), while it was revealed as a propelling category in LEED-EBv3-certified projects (Table 8). However, both of these results should be considered false, because they come from a deliberately applied incorrect analysis that leads to sacrificial pseudoreplication. Perhaps this is why these results are closer to the results reported by Wu et al. [8], who concluded that SS, EA, and EQ were propelling categories from Silver to Gold for LEED-NCv2.2-certified projects. It can be suggested that this similarity arose because the same method of statistical evaluation was used here, in the sacrificial pseudoreplication analysis, and in the study of Wu et al. [8].

Conclusions
Silver-to-Gold cross-certification in both the LEED-NCv3 and LEED-EBv3 projects in the four states (CA, VA, WA, and TX) was statistically evaluated by applying (i) the individual state sampling frame analysis, (ii) the US sampling frame analysis, and (iii) the sacrificial pseudoreplication analysis. The following was concluded:

• The individual state sampling frame analysis. Different cross-certification strategies were revealed in LEED-NCv3- and LEED-EBv3-certified projects. For newly constructed projects certified under LEED-NCv3, four of the six categories were employed, in which EA was the most popular propelling category; SS, EQ, and ID were the intermediately popular propelling categories; and WE and MR were completely unpopular categories. For renovated projects certified under LEED-EBv3, all categories were employed, in which SS and EQ were the most popular propelling categories; WE and EA were the intermediately popular propelling categories; and MR and IO were the least popular propelling categories. Thus, in new versions of LEED-NC and LEED-EB, experts should encourage building practitioners to also focus on currently less popular categories, toward more equal high achievements in all of the five main environmental categories.

• The US sampling frame analysis. For both LEED-NCv3- and LEED-EBv3-certified projects, the cross-certification strategy revealed in (ii) the US sampling frame analysis was the same as the strategy revealed in (i) the individual state sampling frame analysis. This means that the US sampling frame analysis (a median LEED project in each individual state) can be recommended as an appropriate test for revealing the overall tendency in the LEED strategy of both new and renovated projects.

• The sacrificial pseudoreplication analysis. For both the LEED-NCv3- and LEED-EBv3-certified projects, the cross-certification strategy revealed when projects from different states were pooled in one US sampling frame (the sacrificial pseudoreplication analysis) differed from the strategy revealed by the individual state sampling frame analysis. This means that the sacrificial pseudoreplication analysis cannot be recommended as an appropriate test for revealing the overall tendency in the LEED category strategy of both new and renovated projects.

Implications of This Study
This study outlines the importance of correctly applied statistical methods based on the individual state sampling frame analysis, as opposed to the incorrect method based on pooling projects into one US sampling frame, as revealed through building practice in certification under LEED-NCv3 and LEED-EBv3. The results of this individual state-prevailing trend can help LEED researchers to further correct "a serious type of statistical error" [2] in current evaluations of LEED-certified projects. As a result, more realistic feedback from LEED certification can help LEED experts, in further versions of the rating schemes, to move toward more sustainable building.
Funding: This research received no external funding.

Table 3. The individual state sampling frame analysis: median ± interquartile range (IQR, 25th-75th percentile) of six categories of LEED-NCv3 in four US states.

Table 4. The US sampling frame analysis: median ± interquartile range (IQR, 25th-75th percentile) of six categories of LEED-NCv3 in four US states.

Table 6. The individual state sampling frame analysis: median ± interquartile range (IQR, 25th-75th percentile) of six categories of LEED-EBv3 in four US states.

Table 7. The US sampling frame analysis: median ± interquartile range (IQR, 25th-75th percentile) of six categories of LEED-EBv3 in four US states.