Development and Application of Schema Based Occupant-Centric Building Performance Metrics

Abstract: Occupant behavior can significantly influence the operation and performance of buildings. Many occupant-centric key performance indicators (KPIs) rely on having accurate counts of the number of occupants in a building, which is very different from how occupancy information is currently collected in the majority of buildings today. To address this gap, the authors develop a standardized methodology for the calculation of percent space utilization for buildings, which is formulated with respect to two prevalent operational data schemas: the Brick Schema and Project Haystack. The methodology is scalable across different levels of spatial granularity and irrespective of sensor placement. Moreover, the methods are intended to make use of typical occupancy sensors that capture presence level occupancy and not counts of people. Since occupant-hours is a preferable metric to use in KPI calculations, a method to convert between percent space utilization and occupant-hours using the design occupancy for a space is also developed. The methodology is demonstrated on a small commercial office space in Boulder, Colorado using data collected between June 2018 and February 2019. A multiple linear regression is performed that shows strong evidence for a relationship between building energy consumption and percent space utilization.


Introduction
Climate change mitigation efforts require increasing the share of renewable energy sources, while simultaneously decreasing total energy use across all sectors. One of these sectors is the buildings sector, which consumes roughly one third of primary energy globally [1]. Buildings underpin many aspects of society by providing shelter and security, places for people to travel from and to, places to work, and places that we call home. They have always been designed and developed to support some aspect of our lives, yet formal analysis of how people use and interact with buildings has been lacking. In an effort to understand the driving forces behind energy consumption in buildings, the International Energy Agency (IEA) Energy in Buildings and Communities (EBC) program commissioned Annex 53 (2008-2013), which determined occupant behavior to be one of the key influencing factors on building energy consumption [2]. Recent research on UK buildings has also found that occupancy behavior can account for 10-80% of the discrepancies between designed and operational energy performance in buildings [3].
Two additional research programs have since been commissioned, Annex 66 (2013-2018) and Annex 79 (2018-2023), both of which have focused on understanding, simulating, and characterizing occupant behavior and its effects on building operation and overall building performance. They have developed metrics and key performance indicators (KPIs) to quantify these effects and demonstrated them using case studies. The case studies utilize multiple methods for determining how occupants use the building, such as occupancy surveys [4], human administered occupancy counting [5], cameras [6], movement detection and tracking devices [7], radio-frequency identification (RFID) devices and WiFi [8]. Using some of these different methods for measuring occupancy, case study 29 [8] summarizes and compares the annual energy consumption of buildings considering the designed occupant density and measured occupancy counts. The majority of the research regarding occupant-centric metrics and KPIs was recently summarized in [9]. An important observation is that many of these KPIs rely on obtaining an accurate count of people in a building.
Occupant-centric KPIs that relate building energy consumption to occupancy data have been demonstrated before, such as the energy per occupant-hour [9]. Most of these metrics have been annualized metrics and haven't considered granular occupancy data or energy data, which is now significantly more relevant. Measurement and verification 2.0 (MV 2.0) is used to refer to a set of methods for estimating temporally granular energy consumption data for buildings [10]. The time-of-week temperature (TOWT) model consists of a "time-of-week indicator variable and a piecewise linear and continuous outdoor air temperature dependence" [11]. It has been demonstrated to perform well on the Efficiency Valuation Organization (EVO) portal and has also been adopted as part of the CalTRACK hourly methods [12]. The ASHRAE Guideline 14 three parameter heating, three parameter cooling, and four parameter methods [13] additionally all perform well on the EVO portal, even though they don't take any temporal indicator variables into account. ENERGY STAR for offices, although not temporally granular, uses weekly operating hours, number of workers, an adjustment for heating degree days, and other explanatory variables to model annual energy use intensity (EUI) for office buildings [14]. The success of these previous results demonstrates the importance of using outdoor air temperature as an explanatory variable in estimating building energy consumption. Moreover, the time-of-week indicator variable used in the TOWT model is really just a proxy variable for what is actually desired, which is the occupancy state of the building. The same can be said for the weekly operating hours and number of workers used as explanatory variables in the ENERGY STAR method. Both models account for occupancy patterns in some way, although they are not based on true measured occupancy data, wherein lies another opportunity.
Accompanying the need to better understand how occupants use buildings are the advances in sensor technology development, specifically with regards to occupancy detection. Occupancy detection technologies are typically characterized by the granularity of occupancy they can detect, namely: presence occupancy detection (at least one person is present), occupancy counting (the exact number of people present), identity detection (who the people are), and activity detection (what the people are doing) [15]. The case studies mentioned previously describe some of these new technologies in greater detail and we refer the reader to recent reviews for comprehensive characterizations [16][17][18]. Moreover, the Advanced Research Project Agency-Energy (ARPA-E) put out a call to advance the state of occupancy detection technologies in buildings. One of the main pillars for achieving this is by advancing the state of occupancy counting technologies [19]. Along with the IEA EBC push, this has created a significant research agenda towards more accurately and cost effectively quantifying the number of people in buildings [16][17][18].
The thrust to advance the state of the art in terms of occupancy counting technologies will take time to make an impact on the existing building market. Even then, the politics of data privacy in buildings has not been well established and presents some risk [20,21], although the opportunity to use occupancy data for space utilization and optimization of facility usage is promising [22] and commercially available services already exist in this area [23][24][25]. However, it is our opinion that space utilization applications and occupancy counting systems will primarily be implemented in Class A commercial office spaces for the next decade or so. (Class A commercial real estate consists of the highest-quality and most sought-after buildings amongst "high-profile, white-collar companies", typically characterized by "top-of-the-line fixtures, amenities, and HVAC and technological systems" [26].) On the other hand, the precedent set by building codes mandates simple occupancy detection systems, such as passive infrared (PIR), sonar, and dual-tech sensors, which only capture presence level occupancy.

Developments in Data Schemas for Buildings
In addition to development in occupant-centric metrics and occupancy sensing technologies, there has been significant advancement in the standardized representation of operational building data [27][28][29]. Many different forms of digital information are produced over the lifespan of a building, capturing design, construction, commissioning, operations and controls, maintenance, and audit information. Standardized representations of this digital data (data schemas) are designed to improve interoperability between different applications at different points of digital handoff, with the ultimate goals of streamlining workflows and reducing duplication of work. Examples of data schemas include the Industry Foundation Classes (IFC), used to standardize building information model (BIM) data [30], green building XML (gbXML), used to standardize data exchange for BIM to building energy modeling (BEM) applications [31], and BuildingSync, used to standardize data captured by an energy audit [32]. While many of the occupant-centric KPIs were demonstrated using simulated buildings (BEMs) [9,33], it is our opinion that evaluating these KPIs during the operational phase of the building is more valuable. Therefore, we will focus on operations-specific building data schemas and only provide a brief introduction to the two of interest. For a more comprehensive review of the data schema landscape across all aspects of the building lifecycle, we refer the reader to [27].
Two prominent operations-oriented data schemas are Project Haystack (Haystack) [34] and the Brick Schema (Brick) [35]. Both of them were originally designed as an abstraction layer for data typically collected in building management systems (BMS), lighting control systems, and other operational control systems [34,35]. Typical types of entities one would find described by Haystack or Brick include control and sensing points, air and hydronic equipment, spatial information, energy producing or consuming equipment, and the relationships between these entities. While the domains of interest for Brick and Haystack overlap significantly, the design and implementation methodology is different.
Project Haystack 3 provides a standardized dictionary of terms (referred to as tags), which are used to annotate building data. Sets of tags (tagsets) are used to convey the full meaning of what an entity represents as well as its relationships to other entities. Until recently, neither the schema nor accompanying documentation were provided in a machine readable format. The newest version, Haystack 4 (currently in prerelease), has addressed this while also defining concepts in a taxonomic structure to address some previous critiques [28]. The Brick Schema was developed to provide a more formalized class hierarchy than Haystack 3 originally implemented. It is defined using open source semantic web standards, including the resource description framework (RDF) and web ontology language (OWL) [35]. For more detailed information on the usage of Brick and Haystack, we refer the reader to the websites for both projects [34,36].

Point of Departure
While it is well known that Haystack and Brick are used by energy management information systems (EMIS) to support energy analysis and fault detection and diagnostics applications (FDD) [28,37], documentation for the implementation of algorithms using standardized concepts defined by either schema has not been demonstrated. Although detailed descriptions of control and FDD algorithms are provided in many previous publications [38][39][40][41][42], they merely include point name descriptions and don't reference any data schema objects, which we believe is a serious opportunity missed and a novelty of the research performed by our study. Furthermore, relying on occupancy counts for occupant-centric KPI calculations can substantially limit the number of buildings for which the KPIs can be calculated, since most buildings today can't capture this information. A key contribution of this study is to enable buildings with code prescribed occupancy detection systems to calculate occupant-centric KPIs. These opportunities in the current state of practice inspired the work for this article, the goals of which are to:
• Focus on the existing state of practice for occupancy detection systems in commercial buildings, namely, that of presence level occupancy detection.
• Develop a consistent methodology for calculating the percent space utilization of a building at different spatial granularities and with different sensor configurations.
• Formalize the percent space utilization calculation using both Project Haystack and Brick Schema. An abstraction of an applicable point is introduced to decouple the algorithm development from the data modeling implementation to accommodate differences in data modeling implementations.
• Define a methodology to convert between percent space utilization and occupant-hours using the design occupancy so as to better bridge the gap between ideally observable metrics (occupant-hours) and the state of practice observable metrics (percent space utilization).
• Demonstrate the use of temporally granular space utilization as a regression variable in predicting energy consumption.

Definition of Space Utilization and Occupancy Concepts
The concept of occupant-hour is commonly used in [9] as part of a KPI calculation. Although this is an ideal metric to use in occupant-centric KPIs, it is impractical in the majority of buildings since occupancy counting is still an emerging technology. Passive infrared (PIR) or ultrasonic occupancy sensors that determine presence are more common, but no standard methodology exists to utilize data from these sensors in occupant-centric metric calculations.
In this section, we define methods to calculate the space utilization for both container and contained spaces. A contained space is characterized as a space physically contained by another space, whereas a container space is characterized by a space physically containing other spaces. Also note that spaces can be both contained and container spaces depending on one's vantage point.
The term applicable point is used throughout this section to identify a point that conveys information about an entity of interest. We allow an occupancy point to be applicable for a particular space in two ways: (1) an occupancy point is directly associated to the space of interest, or (2) an occupancy point is associated with a physical sensor or equipment that is associated to the space of interest. A visual representation of these two configurations is presented in Figure 1.
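The two configurations can be illustrated with a small lookup over subject-predicate-object triples. This is only a sketch: it assumes the Brick relationships brick:isPointOf (point to space or equipment) and brick:hasLocation (equipment to space) are used to encode the associations, which may differ from the modeling shown in Figure 1, and all entity names are hypothetical.

```python
# Toy metadata graph as (subject, predicate, object) triples.
# Entity names (PointA, SensorB, Room101, ...) are invented for illustration.
IS_POINT_OF = "brick:isPointOf"     # point -> space or equipment
HAS_LOCATION = "brick:hasLocation"  # equipment -> space

TRIPLES = {
    ("PointA", IS_POINT_OF, "Room101"),    # configuration (1): point on the space
    ("PointB", IS_POINT_OF, "SensorB"),    # configuration (2): point on a sensor ...
    ("SensorB", HAS_LOCATION, "Room101"),  # ... that is located in the space
    ("PointC", IS_POINT_OF, "Room102"),
}
OCCUPANCY_POINTS = {"PointA", "PointB", "PointC"}


def applicable_points(space, triples=TRIPLES, occupancy_points=OCCUPANCY_POINTS):
    """Return the occupancy points applicable to `space` under either configuration."""
    # (1) points attached directly to the space
    direct = {s for (s, p, o) in triples if p == IS_POINT_OF and o == space}
    # (2) points attached to equipment or sensors located in the space
    located = {s for (s, p, o) in triples if p == HAS_LOCATION and o == space}
    indirect = {s for (s, p, o) in triples if p == IS_POINT_OF and o in located}
    return (direct | indirect) & occupancy_points


print(sorted(applicable_points("Room101")))
```

Keeping the lookup behind a single function is what lets the utilization calculations stay independent of whether the underlying model is expressed in Brick or Haystack.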

Sampling Techniques
We begin by defining conventions regarding occupancy sampling techniques, as they are relevant to the calculation of the percent occupied time. Two prominent sampling methods are fixed interval sampling and change of value (COV) sampling. Simply put, fixed interval sampling consists of data being logged at a specified time interval ($\Delta t_i$), whereas COV sampling only logs data when a sensor value has changed by more than a specified amount. With occupancy sensors, COV sampling practically means that a value will be logged every time the occupancy state of the space changes, for example from unoccupied to occupied (this can be represented many ways, such as 0 or 1, "on/off", "occupied/unoccupied", etc.). This is important for the following reasons:
• Occupancy sensors set to record data at fixed intervals are backward-looking, meaning that the value recorded at time $t$ actually represents data for the interval $[t - \Delta t_i, t)$.
• Occupancy sensors set to record data in a COV manner are forward-looking, meaning that the value recorded at time $t$ represents data for the interval $[t, t + \Delta t_v)$, where $\Delta t_v$ is a variable length time interval and is only realized once the next COV occurs.

Occupied Time
The percent (or ratio) of occupied time for occupancy point $i$ is defined as the percent of time between two time points ($t_1$, $t_2$) that the point registers occupied, presented by Equation (1), where $Occ_t$ represents the recorded occupancy value at time $t$. We use $TO_i$ to represent both the ratio and percent occupied time. Although the functional notation does not change depending on the sampling methodology, the implementation of the calculation does need to account for the backward- or forward-looking convention, as demonstrated in Figure 2.
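A minimal sketch of how the two sampling conventions might be handled when computing the ratio of occupied time is given here. It is not the authors' implementation: the function names are hypothetical, timestamps are plain numbers in minutes, and it assumes that a fixed interval sample logged at time t covers the preceding interval while a COV value holds until the next logged event.

```python
def occupied_ratio_fixed(samples, dt, t1, t2):
    """Backward-looking fixed-interval samples.

    samples: iterable of (t, occ); the value logged at t covers [t - dt, t).
    """
    occupied = 0.0
    for t, occ in samples:
        start, end = max(t - dt, t1), min(t, t2)  # clip to the window
        if occ and end > start:
            occupied += end - start
    return occupied / (t2 - t1)


def occupied_ratio_cov(events, t1, t2):
    """Forward-looking change-of-value events.

    events: iterable of (t, occ); the value logged at t holds until the next
    event. Assumes the events include the state in effect at t1.
    """
    events = sorted(events)
    occupied = 0.0
    for k, (t, occ) in enumerate(events):
        nxt = events[k + 1][0] if k + 1 < len(events) else t2
        start, end = max(t, t1), min(nxt, t2)  # clip to the window
        if occ and end > start:
            occupied += end - start
    return occupied / (t2 - t1)


# The same hour (minutes 0 to 60), occupied from minute 15 to minute 45,
# logged both ways. Data are invented for illustration.
fixed = [(15, 0), (30, 1), (45, 1), (60, 0)]
cov = [(0, 0), (15, 1), (45, 0)]
print(occupied_ratio_fixed(fixed, dt=15, t1=0, t2=60))
print(occupied_ratio_cov(cov, t1=0, t2=60))
```

Both calls describe the same underlying occupancy and return the same ratio, which is the point of fixing the backward- and forward-looking conventions before aggregating.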

Space Utilization
After the occupied time for an individual sensor is determined, space utilization can be calculated. The only requirement for this methodology is that all spaces have a property conveying their area. Using Brick, this is accomplished with a brick:hasArea object property (see [43] for specifics), while Haystack uses a phIoT:area datatype property.

Contained Spaces
The percent space utilization for a contained space, $\alpha$, is defined as an average of the percent occupied time for the set of occupancy points applicable to $\alpha$, presented by Equation (2):
$$SU_\alpha(t_1, t_2) = \frac{1}{|I|} \sum_{i \in I} TO_i(t_1, t_2) \qquad (2)$$
where $I$ represents the set of occupancy points applicable to the space $\alpha$, $|I|$ represents the total number of occupancy points in the set $I$, and $SU_\alpha(t_1, t_2)$ represents the space utilization of space $\alpha$ evaluated from time $t_1$ to $t_2$.
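As a brief illustration of this averaging step, the following sketch takes the occupied-time ratios of the applicable points as a plain list. The function name and the example values are hypothetical.

```python
def space_utilization(occupied_ratios):
    """Mean occupied-time ratio over the occupancy points applicable to a space."""
    if not occupied_ratios:
        raise ValueError("space has no applicable occupancy points")
    return sum(occupied_ratios) / len(occupied_ratios)


# Two points in one room, occupied 50% and 25% of the window (invented values).
print(space_utilization([0.5, 0.25]))  # 0.375
```

Raising an error for an empty list, rather than returning zero, keeps spaces without sensors distinguishable from spaces that were simply unoccupied.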

Container Spaces
Container spaces build upon the previous methods for calculating space utilization; however, an area-weighted approach is utilized. There are multiple scenarios to account for with container spaces, their contained spaces, and their relationship to occupancy points, demonstrated by the example floor plan in Figure 3. In addition to contained spaces with one applicable occupancy point (Sp1-2) or multiple applicable occupancy points (Sp1-1), we note the following additional scenarios: (a) an occupancy point that is applicable to a container space (PointC in Sp1) instead of one of its contained spaces, or (b) a container space containing a space that does not have any applicable occupancy points (Sp1-3 within Sp1). With these considerations, we define the following conventions: $A_j$ represents the set of spaces contained within space $j$ that have at least one applicable occupancy point and $NA_j$ represents the set of spaces contained within space $j$ that have no applicable occupancy points. For example, in Figure 3, $A_{Sp1}$ = {Sp1-1, Sp1-2} and $NA_{Sp1}$ = {Sp1-3}. The total area for the set of spaces, $A_j$, contained within space $j$ is noted as $area_{A_j}$ and presented by Equation (3). Similarly, the total area for the set of spaces, $NA_j$, contained within space $j$ is noted as $area_{NA_j}$ and presented by Equation (4):
$$area_{A_j} = \sum_{k \in A_j} area_k \qquad (3)$$
$$area_{NA_j} = \sum_{k \in NA_j} area_k \qquad (4)$$
where $area_k$ is the area of an individual space $k$.
We define the area weighted space utilization for the set of spaces, $A_j$, contained by $j$ as $AWSU_{A_j}$, presented by Equation (5):
$$AWSU_{A_j}(t_1, t_2) = \frac{\sum_{k \in A_j} area_k \cdot SU_k(t_1, t_2)}{area_{A_j}} \qquad (5)$$
This metric multiplies the area of each space $k$ in $A_j$ ($area_k$) by its corresponding space utilization ($SU_k$) over a given time interval, sums them, and then normalizes by $area_{A_j}$. It specifically does not include the space utilization for the occupancy points directly applicable to space $j$.
The space utilization captured by the sensors directly applicable to $j$ is noted as $SU_j$ and uses the same method of calculation as defined in Equation (2), that is, it only considers the occupancy sensors directly applicable to it. To holistically consider space utilization for a container space, both the occupancy sensors directly applicable to that space and the space utilization of the spaces contained by that space must be considered. We use $AWSU_j$, presented by Equation (6), to capture the area weighted space utilization for a container space. Note that the area used for normalization in Equation (6) excludes $area_{NA_j}$. Similarly, the weighted term multiplied by $SU_j$ excludes $area_{NA_j}$. We specifically exclude these since no information is known, and adding any default assumptions would only impose bias.
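The container space calculation can be sketched as follows. Since the exact form of Equation (6) is not reproduced in this text, the weighting used here is an assumption: the directly measured utilization of the container is weighted by the area left over after removing both the sensed and unsensed contained spaces, and the result is normalized by the container area minus the unsensed area. The function name and all numbers are hypothetical.

```python
def area_weighted_su(area_j, su_j, contained):
    """Area weighted space utilization of container space j (assumed form).

    area_j:    total area of the container space
    su_j:      utilization from points directly applicable to j, or None
    contained: list of (area_k, su_k); su_k is None when space k has no
               applicable occupancy points
    """
    area_a = sum(a for a, su in contained if su is not None)         # sensed area
    area_na = sum(a for a, su in contained if su is None)            # unsensed area
    weighted = sum(a * su for a, su in contained if su is not None)  # sum of area_k * SU_k
    # Assumption: SU_j is weighted by the area not covered by contained spaces.
    # If j has no direct points, that leftover area is treated as unobserved.
    own_area = (area_j - area_a - area_na) if su_j is not None else 0.0
    denom = own_area + area_a
    if denom <= 0:
        raise ValueError("no observed area in this container space")
    return ((su_j or 0.0) * own_area + weighted) / denom


# Invented example: a 100 m2 container with three contained spaces,
# one of which (10 m2) has no occupancy points.
contained = [(40.0, 0.5), (20.0, 0.25), (10.0, None)]
result = area_weighted_su(100.0, 0.75, contained)
print(round(result, 3))  # (0.75 * 30 + 40 * 0.5 + 20 * 0.25) / 90, about 0.528
```

Under this reading, the unsensed 10 m2 space drops out of both the numerator and the denominator, so it neither raises nor lowers the result.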

Example
Using the topology provided by Figure 3, we create example data using a fixed interval and binary occupancy data in Table 1, which also includes the occupied time, $TO_i$, for each of the occupancy sensors. Moreover, values for calculations of the methods previously defined by Equations (2)-(6) are presented in Table 2 for reference.

Table 1. Time series data and percent occupied time calculation for the four occupancy points defined in Figure 3.

Relationship between Space Utilization and Occupant Hours
Our methodology for converting between occupant-hours and space utilization is fully dependent on the designed (maximum) occupancy level of a space. The designed occupancy level is determined on a room-by-room basis during building design in order to satisfy ventilation requirements as prescribed by ASHRAE 62.1 [44]. The limitation in this methodology is important to understand because the determination of occupant-hours is not based on true counts of occupants observed, but rather on assumed occupancy counts based on design information and observed presence level sensor data. This can both overestimate and underestimate occupant-hours depending on the situation. For example, for a space designed for 15 people in which only one occupant is present for a one-hour period, the conversion would calculate 15 occupant-hours, while the true value would only be 1 occupant-hour. This is one of the main reasons why occupancy counting technologies are so desirable, as they make no assumptions about how many people were designed to be in a space and report the actual measured value. Nevertheless, the conversion may still be useful, as it provides a mechanism to use data more commonly available. Converting between occupant-hours and space utilization can be done using Equation (7), where $OH_j(t_1, t_2)$ represents the occupant-hours for the set of spaces in $j$, $DO_j$ represents the designed occupancy level (with units of people) for the corresponding set of spaces, and $AWSU_j(t_1, t_2)$ represents the area weighted space utilization for $j$ as defined in Equation (6).
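The conversion can be illustrated with the 15-person example from the text. This sketch assumes that occupant-hours are obtained as the product of the design occupancy, the utilization expressed as a ratio between 0 and 1, and the length of the interval in hours; that reading is consistent with the worked example but is not a transcription of Equation (7). The function name is hypothetical.

```python
def occupant_hours(design_occupancy, awsu_ratio, hours):
    """Estimate occupant-hours from presence-level utilization (assumed form).

    design_occupancy: designed occupancy of the space(s), in people
    awsu_ratio:       area weighted space utilization as a ratio in [0, 1]
    hours:            length of the evaluation interval, in hours
    """
    return design_occupancy * awsu_ratio * hours


# One person sits in a room designed for 15 people for one hour. The presence
# sensor reports the room as occupied for the whole hour.
estimated = occupant_hours(design_occupancy=15, awsu_ratio=1.0, hours=1.0)
true_value = 1 * 1.0  # one actual occupant for one hour
print(estimated, true_value)  # 15.0 1.0
```

The gap between the two printed values is the overestimate discussed above, and it shrinks as actual headcount approaches the design occupancy.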

Description of Experimental Setup
To evaluate different occupant-centric technologies, the office of a consulting engineering firm is used as a living laboratory. The office is located in Boulder, Colorado and has been outfitted with different systems for sensing occupancy, monitoring power consumption, and monitoring indoor environmental conditions. Some of the systems were installed in the office as part of the tenant fit-out in the fall of 2016, while others were installed for testing purposes. The office space exists on the third story of a three-story building, with a floor area of approximately 595 m² (6400 ft²). It was designed with an occupant density of about 14 m² (150 ft²) per person, giving a design occupancy of 42 people. Data used in the analysis presented was recorded between June 2018 and January 2019.
We describe the office space and sensing technologies using two relevant metadata schemas: the Brick Schema (version 1.2) and Haystack (version 3.9.9). The relevant concepts from each of the schemas are outlined in Table 3. Unless indicated otherwise, all Brick terms use the typical Brick namespace (https://brickschema.org/schema/Brick#, mainly seen in Turtle documents with the brick namespace prefix), while all Haystack terms come from one of the four standardized library modules defined by the Haystack 4 prerelease [45]. The building data schema space can be difficult to understand, specifically for practitioners with limited data modeling background. One of the goals of summarizing the relevant concepts used by this building from Haystack and Brick into Table 3 is to demonstrate that the two technologies can achieve very similar outcomes, albeit in slightly different ways. Secondly, it demonstrates that reporting data regarding a specific study can be much more easily understood and replicated when standardized data schemas are used.

Figure 4a provides a layout of the office and the locations of different sensors used for the experiment. The office space is served by a shared air handling unit (AHU) located on the roof of the building, which is a single stage direct expansion unit with a natural gas furnace. Unfortunately, no metering infrastructure was installed into the AHU since it is part of the base building systems and was not upgraded during the tenant fit-out process. Therefore, all energy consumption information analyzed throughout this paper consists of electrical energy, namely, lighting, plug loads, and a single computer room air conditioner unit installed in the server room. Table 4 describes the different systems and sensing technologies used and also characterizes them according to relevant occupant-centric factors as outlined in [9].
System C demonstrates the installation density of occupancy sensors that is typical for office buildings designed in accordance with the IECC 2015 code cycle. This system design is especially important as it represents what will be installed in the majority of buildings (code minimum design). System A demonstrates the density of occupancy sensors achieved when installed integral with lighting fixtures. It provides more spatial granularity especially in larger rooms (open office spaces, kitchen) compared to smaller rooms (private offices), but is a higher-end product and we believe less likely to be installed. System B is an even higher-end product and would take some custom development to actually maintain occupancy counts for the floor, which we expect is inaccessible for the majority of buildings. Characterizing these technologies in accordance with [9] helps us identify which occupant-centric metrics can be calculated using the data available. While the goal of this article is only to provide a simple demonstration case, an implementation on a real building could use this characterization to downselect all applicable metrics for their building.

Figure 4b demonstrates the modeling of the primary spatial elements (building, floor, and room) using both the Brick and Haystack schemas. For simplicity, the Brick representation does not include inferred metadata (inference demonstrations can be found in [46], with details described in [28]). Per the Haystack RDF export specification [47], all tags are exported as objects of the ph:hasTag predicate. Notice that the only metadata required to be added to the Brick and Haystack entities, besides specifically declaring their type, is the metadata used to convey their floor area, which is already reported when submitting ventilation requirements on mechanical drawings and therefore should be easily accessible from design documents.
An example of adding area properties for Brick and Haystack spatial entities is shown in Figure 4b.

Table 4. Summary of systems installed and characterization of occupant-centric factors per [9].

This system is an integrated electrical panel and branch circuit monitoring system. Two electrical panels and branch circuit monitoring devices collect power consumption at 1 min intervals for each of the measured circuits. It is important to note that these panels do not provide power to the base building mechanical system.

System E. Occupancy Resolution: NA; Occupancy Object: NA; Spatial Resolution: NA; Temporal Resolution: Sub-hourly. This system consists of four RESET certified IAQ monitors. Per RESET standards, these sensors monitor CO2, TVOC, PM10, PM2.5, temperature, and relative humidity.

Space Utilization
The space utilization for System A and System C is calculated using the methodology described in Section 2.3, where the space utilization is specifically determined on an hourly basis. A two-week snapshot for System A is presented in Figure 5, and an average hourly space utilization for each system is presented in Figure 6. For context, it is also compared with the hourly occupancy profile used by the Department of Energy (DOE) Medium Office Prototype building [48]. The purpose of these two graphs is to demonstrate what space utilization data looks like in a transient manner (Figure 5) as well as on average over a long time period (Figure 6). As stated in Section 2.4, the methodology does not accurately capture the true occupancy state of the building and spaces, since an occupancy count cannot be determined using the data captured by Systems A and C. Many studies relying on occupancy counts will implement an additional sensing system or have people monitoring the building in order to capture the ground truth occupancy state of the building. Our study did not capture ground truth occupancy and therefore we don't declare an accuracy metric for the space utilization calculations. This methodology is designed to be used on real world, minimally code designed buildings, which are not going to incur additional costs simply to capture ground truth occupancy.

Energy Consumption
Our analysis uses the availability of granular spatial and temporal data for occupancy, temperature, and energy to perform a multiple linear regression (MLR) analysis. Space utilization and outdoor air temperature are used as the explanatory variables with energy consumption as the response variable. Separate MLR models are created for the weekend and weekday, although we suspect these to not diverge substantially as space utilization should capture differences in weekday vs. weekend patterns. We do not utilize a time-of-week indicator variable as the intention is to understand whether space utilization is a valuable explanatory variable.

A few power outages and server reboots caused some measurements to be corrupted during a week in September and another week in October. Therefore, before the MLR was performed, outliers were removed if they were outside 150% of the interquartile range (IQR). The results are presented in Table 5 and Figure 7. The top portion of Figure 7 displays the relationship between the energy consumption and the percent space utilization, while the bottom portion displays the relationship between the energy consumption and outdoor air temperature. Each point represents a one hour time interval. Table 5 captures the estimates for each of the MLR parameters, as well as the standard error, t-value, and the p-value.
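The workflow of outlier removal followed by an ordinary least squares fit can be sketched in plain Python. Several assumptions are made here: the outlier rule is read as the common 1.5 times IQR fence applied to the energy readings only, the data are synthetic and generated from invented coefficients, and the helper names are hypothetical. It does not reproduce the values in Table 5.

```python
import statistics


def iqr_filter(rows, col, k=1.5):
    """Keep rows whose value in column `col` lies within k * IQR of the quartiles."""
    values = [r[col] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if lo <= r[col] <= hi]


def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(rows):
    """rows: (temperature, space_utilization_percent, energy).

    Returns [intercept, temperature coefficient, utilization coefficient]
    from the normal equations (X'X) b = X'y.
    """
    X = [[1.0, t, su] for t, su, _ in rows]
    y = [e for _, _, e in rows]
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(3)]
    return solve(xtx, xty)


# Synthetic hourly data: energy = 5.0 + 0.01 * temperature + 0.04 * utilization.
rows = []
for h in range(24):
    temp = 10.0 + (7 * h) % 13
    util = float((5 * h) % 60)
    rows.append((temp, util, 5.0 + 0.01 * temp + 0.04 * util))
rows.append((15.0, 20.0, 50.0))  # a corrupted energy reading

clean = iqr_filter(rows, col=2)
coefficients = fit_mlr(clean)
print(len(rows), len(clean), [round(c, 4) for c in coefficients])
```

In practice a statistics package would also report the standard errors, t-values, and p-values; the hand-rolled solver is used here only to keep the sketch free of external dependencies.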

Discussion of Results
The information displayed in Figure 5 demonstrates that the space utilization calculation can capture fluctuations in occupancy patterns, a known limitation in existing occupancy modeling for buildings [49]. Besides being used as part of KPI calculations, space utilization could also be used by BEM professionals to assist in energy model calibration. Specifically, the fluctuations in occupancy throughout the course of a day can be seen. While ground truth measurements are not available, some simple heuristics about the occupancy patterns of the occupants are known. The schedule for the majority of workers in the office space was known to be 8 am-5 pm, with some workers starting slightly earlier and others staying later. It was typical for people to eat lunch in the office, so the noticeable lack of lunch time dip in occupancy status also aligns with expectations. It was rare for people to come into the office on the weekends (captured), and a cleaning crew came every weeknight after hours (also captured). Therefore, the shape of the space utilization calculation aligns with expectations. Figure 6 visually demonstrates the average space utilization captured by System A and System C. The calculations for both of the systems were performed using the space utilization algorithm defined in Section 2.3. It is immediately apparent that the occupancy data captured by System C is incorrect. A consistent bias of about 30% is present in the data. Upon further investigation, it was discovered that three of the occupancy sensors for System C, representing about 30% of the floor area, were always returning an occupied signal. Accounting for an approximate 30% bias in the data, the shape and magnitude of the data for System C closely mirrors that of System A. 
Although System A does not represent ground truth, this is a useful result to observe and means that, on average, space utilization calculations for this experimental setup are consistent when using underlying data captured from presence level occupancy sensors installed at higher or lower spatial granularities. In essence, occupancy sensors implemented to code minimum (System C) on average capture similar information as higher end occupancy detection systems (System A). The bias in the results from System C, however, does demonstrate the importance of ongoing commissioning for occupancy sensors.
The results of the MLR model are promising. The distribution of the data around the space utilization line is tighter than that around the temperature line for both the weekend and weekday models, which is numerically confirmed by the larger t-value for the space utilization parameter in Table 5. The weekend model appears to exhibit heteroscedastic behavior at low space utilization. The extremely small p-values for all parameters indicate that these parameter estimates are very likely useful in explaining the variance in the energy consumption, and the adjusted R-squared value reveals that much of that variance is explained by the models. The parameter estimates for the weekday vs. weekend models are very similar for the intercept (5.31 vs. 5.32, 0.2% difference), temperature (0.015 vs. 0.012, 20% difference), and space utilization (0.040 vs. 0.038, 5% difference), meaning that a single model built from all data (weekend and weekday) may not differ significantly from the two separate models. The purpose of this study is not to perform a holistic analysis, but merely to investigate whether space utilization as a data point could be useful in predicting energy consumption, which the results of the MLR support.
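For readers who wish to reproduce the form of the model, the following is a minimal sketch of an ordinary least squares fit with outdoor air temperature and percent space utilization as predictors. It is not the authors' code: the function name `fit_ols` is hypothetical, and the data are synthetic and noise-free, generated from made-up coefficients purely to show that the fit recovers them.

```python
def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of predictor tuples, without the intercept column
    y: list of responses
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0, *row] for row in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (A[i][k] - tail) / A[i][i]
    return b


# Synthetic predictors: (outdoor air temperature, percent space utilization)
rows = [(10.0, 0.0), (20.0, 50.0), (30.0, 20.0), (15.0, 80.0), (25.0, 10.0)]
# Synthetic response built from made-up coefficients 5.0, 0.01, and 0.04
y = [5.0 + 0.01 * t + 0.04 * u for t, u in rows]

intercept, beta_temp, beta_util = fit_ols(rows, y)
```

In practice a statistics package would be used instead, since it also reports the t-values, p-values, and adjusted R-squared discussed above.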

Challenges and Considerations for Adoption
The majority of the KPIs defined in [9] are formulated with respect to occupant-hours and rely on normalizing some value or metric by occupancy (e.g., energy per occupant-hour, lighting energy per occupant-hour). While this formulation can be useful, it suffers from the fact that the denominator can be zero, which is the case when either occupant-hours or space utilization is used as the normalizing factor. While it is unlikely that this would occur when calculating these metrics over longer time horizons (monthly, annually), it does occur when they are calculated over short time horizons and should be considered by other practitioners. The simplest workaround for the division by zero issue is to formulate the problem differently, namely, to use occupant-hours or percent space utilization as explanatory variables in a regression formulation of the problem, as demonstrated in Section 4.2. Although this moves away from simple KPI definitions, it is consistent with the majority of work currently going on in the M&V 2.0 space [11,50]. Instead of proxying occupied or unoccupied states with time of week [11], both occupant-hours and space utilization provide numerical measurements of occupancy. Moreover, either of these two values could be built into the regression formulation of a new or existing metric such as the ENERGY STAR score, which would be highly useful for peer comparison of buildings.
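The zero-denominator issue can be illustrated with a short sketch. The function name and the hourly values below are made up for illustration; returning `None` for an undefined interval is one possible convention, not one prescribed in [9].

```python
def energy_per_occupant_hour(energy_kwh, occupant_hours):
    """Return energy normalized by occupant-hours, or None when undefined."""
    if occupant_hours == 0:
        return None  # zero denominator: the KPI is undefined for this interval
    return energy_kwh / occupant_hours


# Made-up hourly samples: (energy in kWh, occupant-hours) for three hours.
hourly = [(2.0, 0.0), (6.0, 3.0), (4.0, 2.0)]

# Short time horizon: the unoccupied hour has no defined KPI value.
hourly_kpi = [energy_per_occupant_hour(e, oh) for e, oh in hourly]

# Longer time horizon: summing first yields a nonzero denominator.
total_energy = sum(e for e, _ in hourly)
total_occupant_hours = sum(oh for _, oh in hourly)
aggregate_kpi = energy_per_occupant_hour(total_energy, total_occupant_hours)
```

Here the first hour consumes energy while unoccupied, so its hourly KPI is undefined, whereas the aggregate over all three hours is well defined. A regression formulation avoids the problem entirely because occupancy enters as a predictor rather than a divisor.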
Additionally, multiple KPIs defined in [9] are difficult to quantify via measurements (e.g., the degree-hour criterion) since they require atypical sensors (mean radiant temperature sensors, air velocity sensors, etc.). As in our analysis, proposed metrics should seek to define the optimal sensing technology to use (occupancy counters), but also provide an alternate method that is designed to work with the existing state of practice established by code requirements (presence level occupancy sensors). In this way, KPI definitions can be practically implemented in existing buildings while also preparing for more advanced sensor technology to become mainstream.
Finally, it is always important to consider the interpretability of a metric when presenting information to others, specifically when the metric is intended to be understood at a glance. Miles per gallon and energy use intensity are examples of good KPIs because they can be understood with limited explanation, even by people who know little about cars or buildings. Occupant-hours, similar to kilowatt-hours, is a difficult metric to interpret. The timespan over which the measurement is performed must be considered, and many people will normalize it by some familiar timespan (day, hour) to get a better feel for what it means. Occupant-hours may be a better metric to report to more advanced users, but it is our opinion that space utilization is a more interpretable standalone metric, as its value will generally remain bounded between 0% and 100%.

Conclusions
This paper defines and implements a consistent methodology for calculating percent space utilization in buildings. The methodology is scalable across different levels of spatial granularity (building, floor, space) and irrespective of sensor configuration (zero or more sensors in each space analyzed). The methodology is demonstrated using two prevalent operations-oriented data schemas, Haystack and Brick. Moreover, the concept of an applicable point is introduced to overcome differences in data modeling implementations, which provides a useful abstraction when implementing the space utilization algorithm. The methodology is intended to make use of typical occupancy sensors that capture presence level occupancy and not counts of people. Since occupant-hours is a preferable metric to use in KPI calculations, a method to convert between percent space utilization and occupant-hours using the design occupancy for a space is also developed and demonstrated.
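The conversion itself is defined earlier in the paper and is not restated in this section. As a rough illustration only, the sketch below assumes a simple proportional relationship (occupant-hours = utilization fraction × design occupancy × elapsed hours). This formula, the function names, and the numbers are assumptions made for illustration and may differ from the formulation developed in the paper.

```python
def utilization_to_occupant_hours(utilization_pct, design_occupancy, hours):
    """Assumed proportional conversion from percent utilization to occupant-hours."""
    return (utilization_pct / 100.0) * design_occupancy * hours


def occupant_hours_to_utilization(occupant_hours, design_occupancy, hours):
    """Inverse of the assumed conversion, returning percent space utilization."""
    return 100.0 * occupant_hours / (design_occupancy * hours)


# Made-up example: 50% utilization of a space designed for 20 people over 8 hours.
occupant_hours = utilization_to_occupant_hours(50.0, 20, 8)
round_trip_pct = occupant_hours_to_utilization(occupant_hours, 20, 8)
```

Under this assumption the example yields 80 occupant-hours, and converting back returns the original 50%.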
The methodology is demonstrated on a 595 m² office building in Boulder, Colorado that is outfitted with different occupancy and energy sensing systems. The percent space utilization calculated at the floor level shows trends consistent with how the building is known to be used by the engineering consulting firm, which are typical 8 am-5 pm working hours with certain employees coming in early and others staying slightly later. A multiple linear regression is performed using energy as the dependent variable and space utilization and outdoor air temperature as the independent variables. There is strong evidence to support that a relationship exists between percent space utilization and energy consumption. This result indicates that even though the true state of occupancy is not known (i.e., exactly how many people are in the space), the methodology for determining percent space utilization is still useful in practice when used as an explanatory variable in a regression formulation for predicting energy consumption.

Future Work
Many additional occupant-centric metrics could be adapted from previous literature and formulated in a manner consistent with the standardized data schema efforts. Moreover, it would be useful to develop an open-source library of occupant-centric metrics, each of which defines the applicable points (in either Brick or Haystack terms) necessary to calculate the metric. This would be useful not only for occupant-centric KPIs, but also for generic building performance metrics. Additionally, the authors hope that a longitudinal study will be undertaken to understand the applicability and usefulness of the percent space utilization methodology across a variety of building vintages and typologies, specifically considering how space utilization used in regression formulations for predicting energy consumption performs against other state-of-the-art algorithms.