Metadata Schemas and Ontologies for Building Energy Applications: A Critical Review and Use Case Analysis

Abstract: Digital and intelligent buildings are critical to realizing efficient building energy operations and a smart grid. With the increasing digitalization of processes throughout the life cycle of buildings, data exchanged between stakeholders and between building systems have grown significantly. However, a lack of semantic interoperability between data in different systems is still prevalent and hinders the development of energy-oriented applications that can be reused across buildings, limiting the scalability of innovative solutions. Addressing this challenge, our review paper systematically reviews metadata schemas and ontologies that are at the foundation of semantic interoperability necessary to move toward improved building energy operations. The review finds 40 schemas that span different phases of the building life cycle, most of which cover commercial building operations and, in particular, control and monitoring systems. The paper’s deeper review and analysis of five popular schemas identify several gaps in their ability to fully facilitate the work of a building modeler attempting to support three use cases: energy audits, automated fault detection and diagnosis, and optimal control. Our findings demonstrate that building modelers focused on energy use cases will find it difficult, labor intensive, and costly to create, sustain, and use semantic models with existing ontologies. This underscores the significant work still to be done to enable interoperable, usable, and maintainable building models. We make three recommendations for future work by the building modeling and energy communities: a centralized repository with a search engine for relevant schemas, the development of more use cases, and better harmonization and standardization of schemas in collaboration with industry to facilitate their adoption by stakeholders addressing varied energy-focused use cases.


Introduction and Background
Buildings are major energy end users, electricity consumers, and carbon emitters. In the U.S., they are responsible for 39% of primary energy [1], 74% of electricity [2], and 38% of national carbon emissions [1], with similar impacts in other industrialized economies. A typical commercial building has also been shown to waste 30% of its energy consumption, attributable to suboptimal design, construction, and operational processes.
Intelligent, or smart, buildings are understood to feature advanced sensing, communication, and control technologies that support integrated system operations; they also provide the capability to react to and transfer information within and external to the building [3]. Intelligent buildings are critical to realizing efficient building operations and a smart grid [4]. Advanced building modeling, control, and analytics applications deliver grid interactivity [5] and continuous efficiency [6]; however, insufficient interoperability is a noted barrier to optimizing the performance and cost-effectiveness of grid-interactive efficient buildings [5].

Digitalization of Building Data
Intelligent buildings have been driven by the digital revolution and data digitization similar to the transformation seen in telecommunication, manufacturing, and transportation sectors [7]. At the design stage, new buildings are designed with computer-aided design (CAD) tools that can generate and export building information models (BIMs) [8] to exchange information between the planning, design, construction [9], and procurement [10] phases of a building or through an integrated project delivery process [11], although researchers have proposed extending them to other phases of the building life cycle [8].
Information about the operation of building systems (e.g., mechanical, lighting) is typically collected and stored by a building automation system (BAS) [12] or by a specialized monitoring and/or control system (e.g., submeters, networked lighting, or thermostats) [13]. In the past decade, the integration of intelligence (e.g., microcontrollers and microcomputers), sensors, and networking (i.e., Internet of Things (IoT) connectivity [14]) has become more common in all types of buildings [15]. These systems can generate significant amounts of data if they record operational parameters with time intervals on the order of seconds or minutes [16]. Accordingly, digitization has enabled energy-focused advances in building maintenance [17], commissioning practices [18], asset tracking [19], and energy audits [20].
While building data have grown substantially in the past several years, the industry has not adopted universal standards for data storage, exchange, and use. The scalability of building applications, such as energy audits, fault detection and diagnostics, and optimal control (see Section 4), is currently hindered by the lack of standardization in metadata schemas that represent the meaning of the data [21]. This issue is generally referred to as a lack of semantic interoperability.

Interoperability Frameworks and the Role of Semantic Interoperability
For full value, digitized information and systems must be interoperable, where interoperability is defined as "the capability of two or more networks, systems, devices, applications, or components to work together, and to exchange and readily use information securely, effectively, and with little or no inconvenience to the user" [22]. Several frameworks describe interoperability in terms of layers. For example, in the context of the smart grid, the GridWise Architecture Council defines three conceptual layers: organizational, informational, and technical (Figure 1) [23]. The technical layer focuses on the digital exchange of data between two systems, including the physical, network, and syntactic aspects of the messages. The informational layer deals with the semantics of the data, as well as its integration with business processes and procedures. Within this layer, the semantic sublayer tries to define a standardized way of describing the meaning of the data communicated. For example, a semantic model may capture the function and location of a sensor and its relationship with the rest of the system, independently from the protocol used to transmit that data. Its meaning should be preserved regardless of communication protocols and applications. The organizational layer relates to business objectives and policy context [23]. Hardin [24] adapted this framework for building operations and highlighted how different communication standards map to the interoperability layers.
Technical interoperability between devices can be achieved through use of the same communication protocol or by using communication gateways that allow messages to be translated between protocols [25]. However, a lack of interoperability in the semantic layer (sublayer 4 in Figure 1) is currently a significant problem that hinders the streamlined integration of interdependent software applications [21] and the development of applications that can be reused across buildings [26]. In large commercial buildings, BASs have progressively adopted standard communication protocols (e.g., BACnet [27], KNX [28]) during the past two decades [16]. This has facilitated greater integration of building systems to a certain degree, but mapping legacy BAS metadata to semantic information models requires extensive expertise [21] and significant time and cost [29]. This issue is caused, in part, by the complexity and heterogeneity of building systems typically controlled by the BAS. For simpler systems, such as the ones found in residential or light commercial buildings, modern smart home or IoT technologies have gained traction in the past decade [30]. In this domain, a multitude of communication protocols are used, each one with custom semantics [31], but the industry has recently realized the importance of using a common semantic layer to improve customer experience and general interoperability between software platforms [32].

Ontologies and the Semantic Web
Semantic models, also called semantic or metadata schemas, contain information that describes the meaning of the underlying data. They vary in complexity, from simple lists of terms (e.g., types of sensors), called glossaries or dictionaries, to taxonomies that encode a specific hierarchical relationship (e.g., family, genus, species in the animal kingdom, or the hierarchical control architecture of a heating, ventilation, and air conditioning (HVAC) system) to ontologies, which use graph data structures, whereby concepts are represented by graph nodes and their relationships are represented by graph edges [33]. While experts recognize that different artifacts may loosely be referred to as ontologies in different communities [34], in the field of information science, ontologies, sometimes referred to as vocabularies, comprise a formalized representation of knowledge for a given domain. Ontologies establish the domain's concepts and relationships, classes, and attributes. Studer defines an ontology as a "formal, explicit specification of a shared conceptualization," recognizing the need for an underlying shared understanding of the knowledge to be represented [35].
The World Wide Web Consortium (W3C) established standards that created the Semantic Web, an extension of the World Wide Web that aims to make internet data machine-readable [36]. Ontologies that comply with W3C standards use triples in the form of subject-predicate-object to encode knowledge, following the Resource Description Framework (RDF) data model [37]. When multiple triples are put together, they form a directed multigraph. The W3C also provides a set of fundamental languages that can be leveraged to define ontologies using classes and properties (i.e., Resource Description Framework Schema or RDFS) [38], description logics (i.e., Web Ontology Language or OWL) [39], and constraints (i.e., OWL and Shapes Constraint Language or SHACL [40]). Ontologies and Semantic Web technologies have experienced some adoption for internet services, providing interoperability of digitized data, for example, between search engines, web crawlers, and other web-based software [41].
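As a minimal illustration of the triple model described above, the sketch below stores subject-predicate-object statements as plain Python tuples and indexes them as a directed multigraph. Apart from `rdf:type`, all identifiers (the `ex:` names and properties) are hypothetical placeholders rather than terms from any published ontology, and a real application would typically rely on an RDF library instead of this hand-rolled structure.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object). All "ex:" names are
# hypothetical placeholders, not terms from a published ontology.
triples = [
    ("ex:TempSensor1", "rdf:type", "ex:TemperatureSensor"),
    ("ex:TempSensor1", "ex:isLocatedIn", "ex:Room101"),
    ("ex:TempSensor1", "ex:measures", "ex:Room101"),
    ("ex:Room101", "ex:isPartOf", "ex:Floor1"),
]


def build_multigraph(statements):
    """Index triples as node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return graph


def edges_between(graph, source, target):
    """Return every edge label that links source to target."""
    return [label for label, node in graph[source] if node == target]


graph = build_multigraph(triples)
parallel_edges = edges_between(graph, "ex:TempSensor1", "ex:Room101")
print(parallel_edges)
```

The two distinct edges from the sensor to the room show why the resulting structure is a multigraph rather than a simple graph: the same pair of nodes can be connected by more than one labeled relationship.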
Meanwhile, in multiple scientific domains, ontologies have demonstrated some success in standardizing representations of concepts and definitions. The Disease Ontology [42], Marine Ontology [43], and Chemistry Information Ontology [44] are all examples of domain-specific ontologies offering the ability to map common concepts across different resources and computational tools. A recent survey paper [45] documents several case studies in which the adoption of ontologies has enabled vocabulary standardization. These ontologies provide a formal basis for specific extensions that may explore a detailed subdomain, such as coronavirus [46] and influenza [47] research.

Contribution of This Review
In the building domain, several parallel initiatives are using Semantic Web technologies to develop metadata schemas [21]. While these are promising endeavors, there is growing confusion over their scope and overlap, and a review of such efforts would be beneficial to the building community. A few recent papers survey the application of ontologies to IoT devices. Wang et al. [48] provide a historical perspective on the ontologies developed for sensor networks. Li et al. [49] review standardized Web of Things ontologies and identify their coverage of different levels of abstraction. Honti and Abonyi [50] present sensor ontologies according to the semantic needs of the layers of IoT solutions based on an IoT standard. Bajaj et al. [51] survey existing IoT ontologies based on the fundamental ontological concepts required for an IoT-based application. Overall, these surveys are only marginally relevant to building applications.
Other review papers look more specifically at buildings. Esnaola-Gonzalez et al. [52] investigate ontologies for observations and actuations in buildings but do not provide concrete use cases for these ontologies. Bergmann et al. [21] summarize the purpose of several building metadata schemas in the context of proposing a pathway to drive semantic interoperability for grid-interactive energy-efficient buildings. Benndorf et al. [16] review semantic interoperability in building design and building automation, including new Semantic Web technologies. Bhattacharya et al. [53] compare the ability of three ontologies to describe concepts necessary to run a set of applications described in the academic literature. Butzin et al. [54] present a survey on information modeling and ontologies in building automation to provide contextual information to the applications. While these papers are valuable contributions, they do not conduct a systematic survey of existing metadata schemas. Gilani et al. [55] use a systematic approach but limit their scope to smart buildings and their ongoing commissioning.
This paper builds on previous research and provides a unique contribution by:
• combining a systematic approach to reviewing the academic and gray literature with a deeper analysis of a select number of ontologies and use cases that have high value for building control and analytics applications affecting energy concerns;
• surveying schemas that cover multiple stages of the building life cycle; and
• providing the reader with a comprehensive list of metadata schemas that are documented and publicly available.
Distinct from the review papers listed above, which focus on academic publications, the unit of analysis of this review is the metadata schema with publicly available code and documentation, not the academic articles written about it. With this approach, we attempt to identify schemas that can be used by researchers and practitioners in their use cases. This review is significant, since it addresses the barriers to scaling efficient grid-interactive buildings that are rooted in a lack of semantic interoperability. By surveying, summarizing, and comparing existing schemas, this work aims to answer the following research questions:

• RQ1: What is the landscape of building-related metadata schemas/ontologies in the academic/gray literature?
• RQ2: Given a selection of relevant ontologies, what are the overlaps and gaps among these metadata schemas that support building operational applications?
• RQ3: How does this subset of schemas support a building modeler targeting three use cases of high value to efficiency and grid interactivity?
The rest of the paper is organized as follows: Section 2 details the method used to identify metadata schemas, Section 3 presents the results of a broad review of existing metadata schemas in the building domain, Section 4 proposes three high-value building operation use cases and identifies their model needs, Section 5 presents a deeper analysis of five selected ontologies and assesses how well they support the three targeted use cases, and Section 6 discusses the results and describes the limitations and future work.

Method
To address the first research question and evaluate the landscape of schemas created for energy applications in buildings, we surveyed the academic and gray literature. To ensure a systematic approach, we used the previously developed preferred reporting items for systematic reviews and meta-analyses (PRISMA) approach [56]. This method defines a set of steps to identify, screen, and select papers for formal review. Since new semantic models are introduced and documented in both academic and technical literature (e.g., standards, technical reports), we coupled a search of papers using the Scopus search engine [57] with a survey of schemas based on informal ontology databases and expert searches (Figure 2). The identification of the articles used the following criteria, applied to the title, abstract, and keywords using the Scopus article search tool: Ontology AND building AND (intelligent OR smart OR controls OR architecture OR hvac OR facility OR occupant OR home).
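The Boolean structure of this query can be made concrete with a small local filter. The sketch below is only an illustration of the AND/OR logic applied to title, abstract, and keyword text; it does not call Scopus, its exact-word matching is a simplification that may differ from how Scopus interprets the query, and the sample records are hypothetical.

```python
import re

REQUIRED_TERMS = ("ontology", "building")
CONTEXT_TERMS = (
    "intelligent", "smart", "controls", "architecture",
    "hvac", "facility", "occupant", "home",
)


def words_in(record):
    """Collect lowercase words from the title, abstract, and keywords."""
    text = " ".join(
        [record.get("title", ""), record.get("abstract", "")]
        + list(record.get("keywords", []))
    )
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_query(record):
    """Ontology AND building AND (intelligent OR smart OR ... OR home)."""
    words = words_in(record)
    has_required = all(term in words for term in REQUIRED_TERMS)
    has_context = any(term in words for term in CONTEXT_TERMS)
    return has_required and has_context


# Hypothetical records, invented only to exercise the filter.
records = [
    {"title": "An ontology for smart building controls", "keywords": ["HVAC"]},
    {"title": "A disease ontology for clinical data", "keywords": ["health"]},
    {"title": "Building energy benchmarking", "abstract": "Smart meters."},
]
selected = [r["title"] for r in records if matches_query(r)]
print(selected)
```

Only the first record satisfies both mandatory terms and at least one of the context terms; the other two each lack one of the mandatory terms.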
The search was limited to papers published starting from 2011 and available in English. Metadata schemas were also collected using three dedicated ontology databases. These include Smartcity Ontologies [58], Linked Open Vocabularies (LOV) [59], and Linked Open Vocabularies for Internet of Things (LOV4IoT) [60]. We also leveraged knowledge of ongoing activities around semantic interoperability in buildings, including the ASHRAE Semantic Interoperability Working Group [61], the Project Haystack Forum [62], the American National Standards Institute (ANSI) C137 Lighting Systems Committee [63], the Project Connected Home over IP of the Zigbee Alliance [32], and the Brick Schema Working Groups [64]. For papers, the screening was conducted by manually analyzing the title and abstract and looking for papers presenting original ontologies. While this process was designed to be linear, a small number of schemas were identified from references in the eligibility phase, in an iterative process conventionally called snowball sampling [65]. For ontologies collected in databases or other sources, the screening excluded ontologies that did not relate specifically to buildings. This step removed a number of schemas, typically called top-level ontologies [66], that described generic concepts such as units of measure (e.g., QUDT [67]) or time (e.g., OWL-time [68]), but preserved a selection of schemas that described sensor networks or IoT devices. These schemas were kept since this technology is becoming more prevalent in building construction and operation [14] and digital building management [69]. The eligibility step focused on selecting schemas that had a public repository, accessible at the time of the analysis, and an associated paper, report, or fully published draft or online documentation, describing the concepts and structure of the schema. Figure 2 shows the number of papers and schemas at each phase. Eligible schemas were then classified based on 10 characteristics: (1) phase of the building life cycle they address; (2) type (dictionary, taxonomy, ontology); (3) syntax; (4) purpose; (5) year of origin; (6) development stage; (7) for ontologies, whether best practices were followed; (8) a link to the repository and online documentation; (9) for ontologies, whether they explicitly reuse concepts from other ontologies (modularity); and (10) adoption. The resulting ontologies were reviewed and clustered based on their phase of the life cycle and general topic. Results of this broad review of schemas are presented in Section 3.
To address the second research question, a subset of five ontologies that address building operation was selected based on their design, goal, and popularity, for a more detailed examination of the concepts they describe and evaluation of the gaps and overlaps between them. Building operation was selected from all building life cycle phases because it is clear from the literature that semantic interoperability is still a major issue in this phase. Previous work emphasizes the need to enable semantic interoperability for smart, grid-interactive buildings [21], to optimize energy performance in buildings [16], and to support the smart and ongoing commissioning of buildings [55]. Additionally, many initiatives are underway to support improved building operation, including efforts by Project Haystack [62], the ANSI C137 lighting systems committee [63], the ZigBee Alliance's Project Connected Home over IP [32], Brick Schema [64], the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) 223p committee [70], and the US Department of Energy Semantic Interoperability Initiative [71]. Details about the selected ontologies are presented at the end of Section 3.2.
To address the third research question, we cross-referenced concepts presented in each of the five ontologies with use cases that are core to efficient building system operation. Three use cases were prioritized from a larger set of previously developed candidates based on their potential to explore semantic sufficiency and impact on building performance: (1) energy audits, (2) automated fault detection and diagnostics (AFDD), and (3) optimal control of HVAC systems. These use cases were inspired by the literature outlining the need for enabling grid-interactive efficient buildings [21], energy performance optimization requirements [16], and the scope of the ASHRAE Standard 223p effort [70]. They span different levels of detail (e.g., audits collect high-level building system information, while advanced controls require much more detailed information), scope (e.g., audits typically analyze multiple building systems, while AFDD applications are typically focused on one system or subsystem), current level of automation (e.g., audit information is typically collected manually, while AFDD data is collected by software applications), and maturity (e.g., AFDD applications have been commercialized, while optimal controls are at present primarily the domain of academia). These use cases are presented in more detail in Section 4. The concepts required by the use cases and their mapping to the five ontologies are described in Section 5.

Definitions
As acknowledged by Gruninger [34], different communities use different terms to describe and represent knowledge; here we define how they are used in the rest of the document.

Information Stack (Knowledge Hierarchy)
Information science scholars have proposed to use a hierarchy of concepts, known as the DIKW pyramid, to distinguish between data and more abstract concepts, such as information, knowledge, and wisdom [72]. While there is disagreement on the specific meaning of each category, the stack suggests a difference between raw data and higher-level information that captures the meaning of the underlying data. The latter is the basic idea behind metadata schemas defined below.

Metadata
Metadata is the information used to describe, present, or link other information [73]. An example of metadata is the labels that describe the sensors in a BAS.

Schema, Taxonomy, Ontology, and Linked Data
A schema refers to a data model that represents the relationships between a set of concepts. Some types of schemas include relational database schemas, taxonomies, and ontologies [73]. A taxonomy is a formal representation of relationships between items in a hierarchical structure [73]. An ontology (sometimes called a vocabulary) is a formal model that allows knowledge to be represented for a specific domain. An ontology describes the types of things that exist (classes), the relationships between them (properties), and the logical ways those classes and properties can be used together (axioms) [73]. We call linked data a pattern for hyperlinking machine-readable data sets to each other using Semantic Web techniques (e.g., using Uniform Resource Identifiers).
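To make the hierarchical nature of a taxonomy concrete, the following sketch represents one as a child-to-parent map and walks it to find the broader terms of a concept. The category names are illustrative only and are not taken from any of the schemas reviewed here.

```python
# Child -> parent links; the category names are illustrative only.
PARENT = {
    "Sensor": "Point",
    "TemperatureSensor": "Sensor",
    "ZoneAirTemperatureSensor": "TemperatureSensor",
    "Setpoint": "Point",
}


def ancestors(term, parent=PARENT):
    """Walk up the hierarchy and return the chain of broader terms."""
    chain = []
    while term in parent:
        term = parent[term]
        chain.append(term)
    return chain


def is_a(term, broader, parent=PARENT):
    """True if `broader` is the term itself or one of its ancestors."""
    return term == broader or broader in ancestors(term, parent)


chain = ancestors("ZoneAirTemperatureSensor")
print(chain)
```

A single parent map of this kind can only express "is a kind of" relationships. An ontology goes further by adding typed relationships between classes and axioms that constrain how classes and properties may be combined, which this structure cannot capture.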

Tag, Tagging, and Folksonomy
A tag is an arbitrary text label associated with a resource or piece of data. Tagging is the process of annotating resources with tags by users (folks). A folksonomy (folk + taxonomy) can be seen as a data structure that is implemented in a collaborative tagging system [74].
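One possible way to picture a folksonomy is as a collection of (user, resource, tag) annotations that can be aggregated and queried. The sketch below assumes this simple representation; the user names, resource identifiers, and tags are all hypothetical.

```python
from collections import Counter

# (user, resource, tag) annotations; all names are hypothetical.
annotations = [
    ("alice", "point-17", "temp"),
    ("alice", "point-17", "sensor"),
    ("bob", "point-17", "temperature"),
    ("bob", "point-17", "sensor"),
    ("bob", "point-42", "sensor"),
]


def tags_for(resource, entries):
    """Count how many times each tag was applied to a resource."""
    return Counter(tag for _, res, tag in entries if res == resource)


def resources_with(required_tags, entries):
    """Return resources carrying every tag in required_tags, sorted by name."""
    tags_by_resource = {}
    for _, res, tag in entries:
        tags_by_resource.setdefault(res, set()).add(tag)
    return sorted(
        res for res, tags in tags_by_resource.items()
        if set(required_tags) <= tags
    )


point_tags = tags_for("point-17", annotations)
sensor_points = resources_with({"sensor"}, annotations)
print(point_tags, sensor_points)
```

Because tags are arbitrary labels, two users can describe the same resource with different words (here "temp" and "temperature"), so a query on one spelling misses annotations made with the other.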

Model and Instance
In the literature, the term model is sometimes used as a synonym of schema (e.g., an ontology is a model) and sometimes used to describe an instance of a schema (e.g., a Brick model of a specific building).In this review, we will use the former definition.

Knowledge Base
A knowledge base is a type of database that contains information instead of raw facts [75].

Metadata Schemas
The last step of the PRISMA process found 40 eligible metadata schemas, which are listed in Table 1. They span multiple phases of the building life cycle: design and/or energy modeling (n = 6) and building operation (n = 34), including audits (n = 1). They cover several applications, including sensor networks, IoT, and smart homes; commercial building automation and monitoring; grid-interactive efficient building (GEB) applications; occupants and behavior; and asset management and audits. Of the schemas, 60% were developed in the past five years, and they are at different stages of development: three are published standards, nine are published schemas supported by workgroups or standards organizations, and the others are either drafts or of unknown status. The majority (70%) are ontologies and are structured using the RDF syntax. Table 1 shows the list of schemas found during the review process. A more detailed table is included in Supplementary Materials (Table S1).
We found two schemas to semantically describe BIMs that are well established in the architecture, engineering, and construction (AEC) domain. The Industry Foundation Classes (IFC) is an ISO standard that enables designers to exchange models between different BIM applications during the design phase of the building [76]. The IFC standard is widely used in industry, and its latest version was published in 2018. Green Building XML (gbXML) is another popular schema aimed at enabling exchange of information between a BIM and architectural/engineering analysis software, including Building Energy Modeling (BEM) [77]. Both schemas are supported by several design and modeling software tools. A comparison of the two schemas is presented in Dong et al. [111]. To improve data interoperability and flexible data exchange of the IFC model, Pauwels et al. developed ifcOWL [78], which provides an OWL representation of the original EXPRESS language used to describe the IFC model. Tubes [79] is a lightweight ontology for providing a high-level description of building service systems (e.g., ducts) and their topology extracted from an IFC model using linked data principles. Two other schemas proposed to make energy simulations more interoperable are the SimModel Ontology [80] and EnergyADE [81], an extension of the CityGML schema [112], which is used to exchange energy simulation data and to store the corresponding results at the urban scale.

Schemas for Building Operations: Sensor Networks, IoT, and Smart Homes
The review found several schemas addressing sensor networks and IoT devices that have some relevance for energy applications. The Semantic Sensor Network/Sensor, Observation, Sample, and Actuator (SSN/SOSA) ontology [113] describes sensors and their observations, procedures involved with sensors and observations, as well as features of interest, the samples needed, observed properties, and actuators. This ontology is not specific to the building domain, but it can be used to describe sensors, actuators, and devices installed in a building.
The Web Thing Model (WoT) [83] aims to describe the virtual counterpart of physical objects in the Web of Things. It defines a model and Web API for things to be followed by anyone wanting to create a product, device, service, or application for the Web of Things. The goal of the oneM2M BaseOntology [84] is to provide a minimal number of concepts, relations, and restrictions that are necessary for semantic discovery of entities in the oneM2M system. oneM2M is a standard for the Internet of Things. One Data Model (OneDM) [85] seeks to support multiple interaction models among devices, applications, and services interacting with and communicating about the Internet of Things. It is based on examples from actual semantic models implemented in commercial IoT products.
A subset of IoT schemas describe metadata about smart appliances and other devices for the smart home, including devices that control major energy-consuming end uses (e.g., thermostats). The Smart Energy Aware Systems (SEAS) [86] are intended to design and develop a global ecosystem of services and smart things collectively capable of ensuring the stability and energy efficiency of the future energy grid. The ThinkHome ontology [87] aims at creating a comprehensive knowledge base that includes all the different concepts needed to realize energy-efficient, intelligent control mechanisms for homes. The purpose of the Building Ontology for Ambient Intelligence (BOnSAI) [88] is to enable the vision of ambient intelligence in large-scale service-oriented pervasive systems. The DogOnt ontology [89] describes different characteristics of smart environments such as location, capabilities, software interfaces, states, and composition. The Ontology of smart building (SBOnto) [90] intends to formalize knowledge in a smart building, including these main concepts: device, state, architecture, environment, furniture, and network. Finally, the Smart Applications REFerence (SAREF) [91] and its extensions aim to enable interoperability among IoT solutions developed by different appliance manufacturers.

Schemas for Building Operations: Commercial Building Automation and Monitoring
Several schemas have developed standardized descriptions of metadata typically stored in large commercial building BASs. Project Haystack 3 [92] is a popular building metadata schema based on the concept of tags and used in several commercial products. Its purpose is to provide a standard set of terms for describing sites, equipment, points, and the relationships between them. It is primarily focused on representing information from a BAS, an HVAC system, and a metering system. Project Haystack 3 uses a custom text format to encode the schema. BASont [90] was one of the first academic efforts to model building automation systems for various use cases, including design, commissioning, operation, and refurbishment, using modular ontologies. Project Haystack 4 [93], under development at the time of this review, builds on its previous version and adds formal mechanisms for defining terms, a taxonomy, and an ontology on top of Project Haystack 3. The Haystack Tagging Ontology (HTO) [94] is another effort to create an OWL ontology wrapper for Project Haystack 3.
Brick Schema [95] is a nascent schema that is gaining traction in academia and industry and aims to provide a standardized ontology for representing the physical, logical, and virtual assets in buildings and the relationships between them. The Google Digital Building Ontology [96] is a recent open-source project (2020) that aims to create a uniform schema and toolset for representing structured information about buildings and building-installed equipment. Inspired by both Project Haystack and Brick Schema, the project has developed a schema that diverges from both of them. This ontology and toolset are currently being used by Google to manage buildings in its portfolio. The Semantic BMS ontology (SBMS) [97] provides a BAS-protocol-independent schema for intelligent building systems. CTRLont [97] formalizes an explicit specification of the control logic in BASs. Green Button [98] is an XML schema and tool to provide utility customers (residential and commercial) access to their utility data in a standardized consumer-friendly and computer-friendly format. The schema has been adopted by several utility companies across the U.S., impacting millions of customers, and it is sometimes integrated with the BAS.
Other schemas related to commercial building operation have a broader scope. For example, RealEstateCore (REC) [114] is designed to enable building controls and the development of services around building and smart city data. It blends concepts from the IFC standard, Haystack, and the SSN into a single ontology and adds concepts of building ownership. The Building Topology Ontology (BOT) [100] intends to represent the core topological concepts of a building by defining relationships between subcomponents contained within a building. The Building Automation and Control Systems (BACS) [101] ontology aims to model the domain of building control and automation by reusing and combining several existing domain ontologies. The Knowledge Model for City (KM4City) [102] ontology seeks to enable the description of smart cities, including buildings, transportation, and other applications. The EM-KPI Ontology [103] describes the key performance indicators and master data domains (energy, building, utility, occupancy, observation, weather, location) in energy management at district and building levels.

Schemas for Building Operations: Grid-Interactive Efficient Building (GEB) Applications
The Facility Smart Grid Information Model [104] standard, developed by ASHRAE, aspires to define an abstract representation of the energy-consuming, energy-producing, and energy storage systems found in residential, commercial, and industrial facilities. The RESPOND [105] ontology, co-funded by the European Commission, intends to deploy an interoperable energy automation, monitoring, and control solution to deliver demand response programs at the dwelling, building, and district levels.

Schemas for Building Operations: Occupants and Behavior
A few schemas cover building occupant behavior. The DNAs Framework (obXML) [106] was the first schema to provide a standardized data model for occupant behavior, with a focus on energy simulation. The Occupancy Profile (OP) Ontology [107] also represents occupant behavior, with a focus on the energy impact the occupants' actions produce. It is based on obXML but uses RDF instead of XML as the underlying technology. Onto-SB: Human Profile Ontology for Energy Efficiency in Smart Building [115] aims at providing contextual information (human, environment, services, devices, places, context awareness, energy sources, profiles) for smart building systems. Finally, OnCom [108] describes the occupants' thermal comfort.

Schemas for Building Operations: Asset Management and Audits
The Building Energy Data Exchange Specification (BEDES) [109] provides a standard set of terms to facilitate the exchange of information about building characteristics and energy use. BEDES is supported by the US Department of Energy and adopted by several organizations. The Virtual Buildings Information System (VBIS) [19] is a schema and tool to classify and organize assets used in buildings using tags. The Ontology of Property Management (OPM) [110] seeks to describe temporal properties that are subject to changes as the building design evolves. BuildingSync seeks to standardize the reporting format of information obtained through commercial building energy audits defined in ASHRAE Standard 211 by building on the terms defined by BEDES.
From this list of schemas, we selected five ontologies to be further analyzed. These ontologies were selected based on their life cycle phase (i.e., operations), design (i.e., upper vs. application), purpose (summarized above), and adoption (based on expert opinion). Three upper ontologies were identified: (1) SAREF [91] covers smart appliances, as well as other concepts such as systems, meters, and spaces through its extensions. SAREF's scope is broad, and it is highly cited in the academic literature.
(2) SSN/SOSA [116] provides a model for the "entities, relations and activities involved in sensing, sampling, and actuation," and is the most frequently cited ontology in the IoT space.
(3) BOT [100] provides generic concepts of building components. BOT is relatively new, but it is supported by an active research community.
In addition, two application ontologies were chosen: (4) Brick [26], which is the most cited ontology describing building automation and monitoring, and (5) RealEstateCore [117], which covers concepts of real estate and structures. Other strong candidates such as BACS and Google Digital Buildings were not included for lack of space, but they should be considered for future reviews.

Use Cases
Use cases are an effective way of capturing business processes and functional requirements. They are particularly useful for prioritizing the scope of work for complex interdisciplinary undertakings with large numbers of possibilities, stakeholders, and challenges. While use cases can and have been described in many ways, one useful definition is "a sequence of actions that an actor (usually a person, but perhaps an external entity, such as another system) performs within a system to achieve a particular goal" [118]. Many textual, structural, and visual approaches for formally capturing use cases have been defined and put into practice by various industries. One such example is the use case diagram, one of many behavioral diagrams that are described in the Unified Modeling Language (UML) standard that was originally developed and published by the Object Management Group and has subsequently been published by the International Organization for Standardization. Diagrams are described in UML as a partial graphical representation of a system's model. Use case diagrams graphically describe the behavior of an entity, like a system or a subsystem, in relationship to one or more actors and a use case goal (e.g., business process, functional requirement).
To facilitate the analysis of the selected metadata schemas, we defined three use cases: (1) energy audits, (2) automated fault detection and diagnostics (AFDD), and (3) optimal control. A simple UML-like use case diagram describing the relationship between a key use case actor, a building system or subsystem, and the use case outcome is shown in Figure 3. Building systems or subsystems are functionally represented in this diagram as either components and equipment or control and monitoring systems. A metadata model (i.e., an instance of the metadata schema) operates at the interface between the actors and these components, equipment, and systems. This model can be used by different actors to more efficiently perform their use case tasks, as described in more detail in Sections 4.1 and 4.2. Given that all use cases describe the behavior of building systems or subsystems, we also defined a prototypical building comprising example systems and subsystems, as shown in Figure 4. In Figure 4, the logical relationships between floorplan (in red) spaces or zones and key (but not all) HVAC system components are illustrated. For example, while the Air Handling Unit (AHU) (in blue) and two Variable Air Volume (VAV) boxes (in green) that condition the spaces are shown, chiller and boiler plants that connect to the AHU via the heating and cooling coils are not shown for the sake of simplicity. The space is physically divided into four rooms and one long corridor. Walls are represented by thick lines, and two windows are shown on the north side of the floorplan. The bathroom has one fan exhausting its air to the outside. The AHU supplies air to two VAV boxes that serve two HVAC zones, and the air returns from the zones to the AHU via a return duct on the top right of the figure. The lower branch of the AHU is the supply duct with a sequence of components from left to right: outdoor air damper, filter, heating coil, cooling coil, and supply air fan. A set of abstract sensors (S) (e.g., temperature, airflow, and pressure) and actuators (A) (e.g., dampers) are also shown in the diagram. The upper branch of the AHU is the return duct with a return air fan and an exhaust damper, while the vertical branch represents the mixed air duct with its damper. The VAV boxes contain one damper and a reheat coil. A controller for the return fan, displayed in the center of the figure, is connected to sensors and actuators and thereby a set of virtual points that represent its non-physical inputs and outputs: setpoints, commands, and alarms.
The floorplan on the right shows that zones and rooms are not wholly contained by each other; there are overlaps. For example, two rooms and part of the corridor constitute the first thermal zone, while the remaining room and the other half of the corridor belong to the second thermal zone. While some lighting zones (dotted lines) are contained by a single space, the corridor lighting zone crosses the boundaries of two HVAC zones. For each lighting zone, lighting sources and their occupancy sensors are also marked (S). Different end-use appliances, such as computers, office equipment, and kitchen appliances, are also illustrated in the floorplan. All the electrical loads are metered by a building-level meter (M), and each fan has a submeter (M). While this diagram does not fully depict the described building systems, it is still clear that even a relatively small section of a building can contain tens to hundreds of devices and elements that need to be semantically related to each other to fully describe a use case.

Use Case 1: Energy Audits
In this use case, a human auditor with specialized training performs an audit of a building and its energy-consuming systems. To perform the audit, the auditor needs to collect data about the building components and systems (e.g., device efficiencies or efficacies, system architecture) and any control and monitoring systems (e.g., device configurations, cumulative end-use energy consumption). A metadata model facilitates the accurate and efficient collection of data over the course of the energy audit.
An energy audit is "an assessment of the energy needs and efficiency of a premises" [119]. In the commercial building space, it is performed at one of three levels (Levels 1, 2, and 3), formally defined by ASHRAE Standard 211 [120]. Recently, energy audits have gained traction as a policy mechanism (e.g., New York City's Local Law 87) used to educate building owners about their energy-consuming equipment and practices, with an ultimate goal of improving the energy efficiency of the existing building stock [121]. The scope of an ASHRAE Level 2 energy audit includes capturing overall facility characteristics, monthly and annual energy consumption and costs, and an asset registry of primary building systems and energy-consuming equipment and documenting undesirable operational characteristics. Finally, the job of the energy auditor is to recommend energy efficiency measures (EEMs) based on their evaluation of the building, as well as estimate the energy and cost savings were these EEMs to be implemented [120]. They provide a high-level snapshot of the building and low-/no-cost or capital EEM recommendations that are selected based on past experience with building systems. In reference to Figure 4, an ASHRAE Level 2 audit would identify the main heating and cooling sources and their efficiencies and capacities and may acquire BAS data on their operational schedule, setpoints, and whether setbacks are implemented. It would identify the lighting sources and lighting controls technologies, characterize the envelope, and analyze the utility data. Finally, it would characterize the overall square footage of the building and window-to-wall ratios and identify occupancy patterns in the different sections of the building. Table S2 (Supplementary Materials) summarizes this information, as well as the competency questions (interrogatives that help identify the requirements for an ontology based on users' needs) and related concepts required to describe this use case.

Use Case 2: Automated Fault Detection and Diagnostics
In this use case, a facility manager or a third-party technician configures and monitors the output of an automated fault detection and diagnostics (AFDD) tool for one or more building systems (e.g., HVAC, lighting) and performs preventative or reactive maintenance aimed at maximizing the performance and lifetime of the monitored systems [122]. To configure the AFDD tool, the facility manager or technician needs to access building component and system data (e.g., rated operational/environmental conditions or safe operating areas, rated lifetime, and reliability/failure rate) as well as control and monitoring system data (e.g., hours of use, threshold violations). A metadata model facilitates the efficient and accurate configuration of the tool, the analysis of faults, and the reporting of diagnostics.
AFDD tools analyze historic time series of data in combination with the knowledge of system capabilities (e.g., HVAC capacities, rated lighting parameters) and characteristics (e.g., safe operating ranges, failure mechanisms), schedules, and sequences of operation (SOO) to determine the presence of operational faults and control improvement opportunities [123]. Commercially available AFDD tools have primarily been developed to monitor HVAC BAS data. However, similar approaches may be applied to lighting and other systems as well [124].
For an AFDD tool to operate correctly, system configurations must be understood, as well as physical and logical relationships between system components. AFDD tools share/report to a human user some combination of the location of faults, their severity, potential root causes, and energy cost, equipment, or maintenance impacts. The output of the tool may be a series of tables and/or graphics [122]. These may overlay floor plans to indicate the location of the fault. AFDD tools may also integrate with the computerized maintenance management system (CMMS) to track the status of an action to fix or inspect a fault. Some tools include features to track these processes internally. This use case covers the metadata used by AFDD algorithms and the initial configuration step. For the HVAC systems shown in the Figure 4 example, the tool should know the relationships (i.e., the context) of data sources to related equipment, locations, control sequences, and other entities. The relationship between each system is also required; therefore, the metadata needs to specify that the AHU supplies air to the two VAV boxes and to the two thermal zones and that the bathroom fan exhausts air from HVAC Zone 1.
Further, the tool needs to link each fan with its submeter. Time series data are required for all the sensors, actuators, and virtual points, including their unit of measure (e.g., °C, kWh) and their expected reporting interval or interval configuration options. Nameplate capacity and efficiency values of the different systems are also useful to compare actual and expected performance. For the lighting system shown in Figure 4, the AFDD tool needs to know rated device inputs (e.g., 120 V, 120-277 V), outputs (e.g., 0.1-1 A, 12-120 W, 0.99-0.80 power factor), and safe operating conditions (e.g., −40 to 49 °C), as well as system architecture details. Notably, system input architecture and system output architecture can be different. For example, a group of lighting devices (e.g., integrated luminaires, lamp/fixture combinations) might be designed and installed to serve a particular space, but the electrical subsystems that power lighting devices (e.g., electrical panel circuits, junction boxes) might be designed and installed to serve devices in multiple spaces or portions of spaces. Time series data are needed for all metered parameters (e.g., input voltage, current, and power factor; output power or luminous flux; hours of use; internal operating or ambient temperature) including their unit of measure (e.g., A, V, W) and their expected reporting interval or interval configuration options. These data needs, competency questions, and related concepts are summarized in Table S3 in Supplementary Materials.
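As a minimal sketch of how such metadata could drive a simple AFDD rule, the Python snippet below links each fan to its submeter and flags fans whose average metered power exceeds the nameplate rating by more than a tolerance. The class `Fan`, the function `flag_overpower`, the 10% tolerance, and all identifiers and values are hypothetical illustrations and are not taken from any reviewed schema or commercial tool.

```python
from dataclasses import dataclass


@dataclass
class Fan:
    name: str
    submeter: str          # identifier of the submeter that meters this fan
    rated_power_kw: float  # nameplate power (hypothetical value)


def flag_overpower(fans, readings_kw, tolerance=0.10):
    """Return names of fans whose mean metered power exceeds the
    nameplate rating by more than `tolerance` (as a fraction)."""
    flagged = []
    for fan in fans:
        series = readings_kw.get(fan.submeter, [])
        if not series:
            continue  # no time series linked to this fan; nothing to check
        mean_kw = sum(series) / len(series)
        if mean_kw > fan.rated_power_kw * (1 + tolerance):
            flagged.append(fan.name)
    return flagged


# Hypothetical metadata (fan-to-submeter links) and time series in kW.
fans = [Fan("supply_fan", "M1", 5.0), Fan("return_fan", "M2", 3.0)]
readings_kw = {"M1": [4.8, 5.1, 4.9], "M2": [3.6, 3.5, 3.7]}
flagged = flag_overpower(fans, readings_kw)
```

The rule itself is trivial; the point is that it cannot run at all unless the metadata model states which meter belongs to which fan, what the rated value is, and in which unit the time series is reported.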

Use Case 3: Optimal Control of HVAC
In this use case, an engineer installs and configures a supervisory control system to optimally operate one or more building systems (e.g., HVAC, lighting). To configure the supervisory control system, the engineer needs to access both building system and component data (e.g., rated capacities, current operating state) and control and monitoring system data (e.g., in-use control strategy, setpoints, sensor data). A metadata model facilitates the accurate and efficient configuration of the control system/software.
Optimal control strategies, such as model-predictive control (MPC), have the potential to significantly reduce building energy consumption, demand, and cost and improve comfort compared to traditional rule-based control sequences [125]. MPC strategies compute optimal control inputs by minimizing an objective function, given a set of constraints, over a finite prediction horizon [125]. In the past two decades, MPC has received significant attention from the building research community, but it has not yet been implemented at scale, due, among other things, to the significant effort required to configure its models [126]. While different modeling approaches can be used, gray boxes (i.e., physics-inspired but simplified models) are considered the most promising option [127]. To configure such models, one needs to gather detailed information about the HVAC system, its components, and the relationships between those components and system performance. Time series data from sensors and actuators are often needed to train the model hyperparameters and run the optimization algorithm. In addition, information about the building envelope often needs to be collected to correctly characterize heat transfer through the different surfaces as well as thermal storage in the envelope and internal mass of the building [125]. Referring to the example in Figure 4, to configure the MPC model, one needs to know all the HVAC information listed in the AFDD use case, and more details about the envelope such as position and orientation of the building, size of the windows, estimated properties of the different envelope elements, as well as the internal mass of the building. To characterize the internal heat transfer between zones of the building, information about adjacency of the zones is also needed. MPC is typically implemented as a supervisory control strategy (i.e., it just determines setpoints), and it leaves direct control of the equipment to lower-level fast-reacting controllers. Since the actual behavior of the system depends on the details about the implementation of the lower-level control logic, more details about the control strategy need to be understood (e.g., what setpoints can be overridden, what is the interaction between these setpoints, and what is the existing control sequence). Further, MPC can benefit from having access to occupancy data, for example, from the lighting system. These data needs, competency questions, and related concepts are summarized in Table S4 in Supplementary Materials.
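For reference, a generic finite-horizon MPC problem of the kind described in this use case can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the cited works: $x_k$ is the model state (e.g., zone and envelope temperatures), $u_k$ the supervisory setpoints, $d_k$ the forecast disturbances (e.g., weather, occupancy), $\ell$ the stage cost (e.g., energy cost plus a comfort penalty), $f$ the gray-box model, $\hat{x}$ the current state estimate, and $N$ the prediction horizon.

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad & \sum_{k=0}^{N-1} \ell(x_k, u_k) \\
\text{s.t.} \quad & x_{k+1} = f(x_k, u_k, d_k), \qquad k = 0,\dots,N-1, \\
& x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \\
& x_0 = \hat{x}.
\end{aligned}
```

In this reading, the metadata model supplies the information needed to build $f$ (which equipment serves which zone, which zones are adjacent, envelope properties) and the admissible sets $\mathcal{X}$ and $\mathcal{U}$ (comfort ranges, which setpoints can be overridden and within what limits).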

Core Concepts
Table 2 summarizes the combination (i.e., union) of concepts, properties, and relationships identified in Sections 4.1-4.3. Concepts identify what a thing is (its type), properties are general information about that thing, and relationships define how that thing relates to other things (its function or role in a larger system) [21]. In the next section, we examine five ontologies from the perspective of a building modeler who would use a particular ontology to create an instance of a model (e.g., to represent the components of the building in Figure 4). The role of modelers in creating and updating models is an important topic and will be further discussed in Section 6. For the time being, we assume that the modeler has domain expertise and knowledge of Semantic Web technologies.

Results of the Review: Assessing Core Concepts in Five Ontologies
Modeling buildings to track their systems and energy usage, as required for activities such as those described in the previous section, is accomplished using ontologies to represent each element. We examine five ontologies designed to support building modeling, which energy applications can use, and assess how a modeler would use these schemas to prepare the resources necessary to accomplish the goals of the three described use cases.
Ontologies are a general tool that can fulfill a variety of data modeling needs. The five ontologies range from upper ontologies that provide general-purpose concepts to application ontologies designed to represent domain-specific concepts. Application ontologies (sometimes referred to as domain ontologies) are typically built on upper ontologies and define specific concepts, properties, and axioms that are relevant to a particular domain or application [66]. Both types of ontologies include core classes to describe things that exist in an instantiated model as well as properties that are then used to express relationships among instances of different classes. Core classes and properties available for modeling energy-related issues in buildings vary widely across these five schemas. These include, bolded throughout these sub-sections, Zones and Spaces, a building's Envelope, the Building Systems and Equipment, the Control Devices in a building, and importantly Sensors and Actuators (captured in the Category column in Table 2). The prototypical office building shown in Figure 4 was modeled using each ontology to facilitate a structured comparison, and model files are briefly discussed and linked to in Supplementary Materials (Table S5). Here we describe each of the ontologies and some of their core concepts and then summarize how successfully a modeler would be able to represent the concepts, properties, and relationships necessary to support the three described use cases to help us answer our second and third research questions.

Building Topology Ontology (BOT)
The Building Topology Ontology (BOT) upper ontology is designed so that modelers can represent the core topological concepts of a building by defining relationships between sub-components contained within [128]. BOT was started in 2019 and is developed and maintained by the W3C Linked Building Data Community Group to provide a minimal ontology for describing relationships between a building's sub-components; it is not a formal W3C standard [129]. BOT is designed to be an extensible baseline ontology that can be used with domain-specific schemas for more complex, specific use cases.
BOT is scoped around buildings and their particular topologies. BOT models are structured with Zones defining a particular 3D area that can contain particular Elements or be adjacent to Elements. BOT Elements represent physical parts of buildings, such as air conditioners, and can be composed of sub-elements. BOT also specifies Interfaces to capture the relationships between multiple elements, zones, or some combination of these items. BOT also encapsulates 3D models so that Zones and Elements can be assigned to a particular physical space. BOT provides basic alignments between this ontology and others, including the Brick and SAREF4BLDG schemas examined below. Unlike the latter, BOT does not directly include representations of HVAC or lighting systems that a building modeler would need, since it is not application specific.
A modeler using the BOT ontology can broadly represent many of our core concepts in a generic way using classes that are part of the schema. Zones and Spaces concepts can be modeled using the provided Zone class and its sub-classes. The Element class can generically represent parts of our remaining core concepts, including Envelope, Building Systems and Equipment, Control Devices, and Sensors and Actuators. This is in line with the schema's intentional design as a baseline ontology for the building domain oriented around topologies. There are, however, many gaps that a modeler would have to fill using external schemas, an approach encouraged by the BOT documentation.
For modeling Zones and Spaces, BOT explicitly includes a Zone class (bot:Zone). This conceptualization of Zones broadly starts at any defined 3D volume of space and nests more specific areas inside. BOT defines four sub-classes for representing different spaces related to a building. These vary in scope from the full site or piece of land (bot:Site) to a building as a whole (bot:Building), the particular floor or level within a building (bot:Storey), and the breakdown of particular spaces within a story (bot:Space). Turning to the Envelope, Building Systems and Equipment, Control Devices, and Sensors and Actuators concepts, a modeler would use the BOT Element class (bot:Element). Elements in BOT can be composed of sub-elements that allow a user to model sub-components, e.g., sensors within an air handling unit. BOT includes the Interface class (bot:Interface) for representing where zones and elements meet in space. Modeling these interfaces would support the work of energy auditors in our first use case or stakeholders configuring and optimizing advanced controls in our third use case, either of whom is concerned with characterizing and using properties of the envelope of a space or zone.
BOT can represent the relationships among instances of Zones and Elements using various properties defined in the schema. Physical, spatial relationships among Zones can be modeled using properties such as bot:containsZone, bot:adjacentZone, bot:hasBuilding, and so on. Items modeled using the Element class can draw upon the bot:hasElement, bot:hasSubElement, bot:containsElement, bot:adjacentElement, and bot:intersectingElement properties to represent relationships among items. The bot:interfaceOf property similarly allows a modeler to specify how an Interface class instance is related to adjacent zones or elements. An instance of a Zone can express the systems or equipment encapsulated in the space through the bot:hasElement property to convey where the equipment is and make the relationship queryable. The meters and sub-meters needed in our three use cases can likely be modeled using this concept, just like other equipment.
While BOT properties allow modelers to capture relationships among instantiated items, many of the characteristics that emerge as part of core concepts cannot be modeled using BOT. Instead, a modeler would have to leverage an external schema. An Envelope has many characteristics important to energy modeling that are not part of BOT. Similarly, Building Systems and Equipment as well as Sensors and Actuators have many intrinsic aspects (e.g., type of equipment, rated power draw, capacity, energy, etc.) that require using another schema when building a model. Detailed aspects of Control Devices, such as setpoints, commands, or alarms, are not directly within the scope of the schema. Particular units of measurement for values generated by equipment or systems are also not a feature of BOT. To handle any of these aspects, a modeler would need to rely upon another ontology (e.g., QUDT for units of measurement).
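To make the use of these properties concrete, the following Python sketch stores a few BOT-style statements as plain (subject, predicate, object) tuples and answers the question "which elements are located somewhere inside this zone?". It is an illustration under stated assumptions: the `ex:` identifiers are hypothetical names loosely based on Figure 4, the helper functions are invented here, and no RDF library or reasoner is used, so the nesting properties are followed by hand rather than inferred.

```python
# Each statement is a (subject, predicate, object) tuple of prefixed names.
# All ex: identifiers are hypothetical instances for illustration only.
TRIPLES = {
    ("ex:Site1", "rdf:type", "bot:Site"),
    ("ex:Site1", "bot:hasBuilding", "ex:Bldg1"),
    ("ex:Bldg1", "rdf:type", "bot:Building"),
    ("ex:Bldg1", "bot:containsZone", "ex:Floor1"),
    ("ex:Floor1", "rdf:type", "bot:Storey"),
    ("ex:Floor1", "bot:containsZone", "ex:Room1"),
    ("ex:Floor1", "bot:containsZone", "ex:Bathroom"),
    ("ex:Room1", "rdf:type", "bot:Space"),
    ("ex:Bathroom", "rdf:type", "bot:Space"),
    ("ex:Room1", "bot:adjacentZone", "ex:Bathroom"),
    ("ex:Room1", "bot:containsElement", "ex:VAV1"),
    ("ex:Bathroom", "bot:containsElement", "ex:ExhaustFan"),
    ("ex:VAV1", "rdf:type", "bot:Element"),
    ("ex:ExhaustFan", "rdf:type", "bot:Element"),
    ("ex:ExhaustFan", "bot:hasSubElement", "ex:FanSubmeter"),
    ("ex:FanSubmeter", "rdf:type", "bot:Element"),
}

# Properties that nest one zone inside another in this sketch.
NESTING = ("bot:hasBuilding", "bot:containsZone")


def objects(subject, predicate):
    """Sorted objects of all statements matching subject and predicate."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def elements_in(zone):
    """Elements contained in a zone or in any zone nested inside it."""
    found = set(objects(zone, "bot:containsElement"))
    for predicate in NESTING:
        for child in objects(zone, predicate):
            found.update(elements_in(child))
    return sorted(found)


site_elements = elements_in("ex:Site1")
fan_parts = objects("ex:ExhaustFan", "bot:hasSubElement")
```

Note that nothing in these statements says what kind of equipment `ex:VAV1` is or what the submeter measures; in line with the discussion in this section, that information would have to come from another schema.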

Semantic Sensor Network/Sensor, Observation, Sample, and Actuator (SSN/SOSA)
The Semantic Sensor Network (SSN) ontology is designed to help modelers describe sensors and their observations, procedures involved with sensors and observations, as well as features of interest, the samples needed, observed properties, and actuators [130]. The SSN was initially developed by the W3C Semantic Sensor Network Incubator Group [116], and now it is maintained and actively updated by the Spatial Data on the Web Working Group [131]. The most recent recommendation to the W3C was published in 2017, and a more recent editor's draft was published in November 2020 [82]. Recent versions of the SSN were developed in part to remove some of the complexity found in the ontology originally released in 2011-2012 [116]. The self-contained core ontology Sensor, Observation, Sample, and Actuator (SOSA) is included to contain the main SSN classes and properties [132]. This ontology can be used to model a variety of systems, beyond buildings.
The SSN/SOSA ontologies are designed for modelers trying to represent sensor-based systems. Building topologies are out of scope and meant to be modeled through the use of another ontology such as BOT, but we see that classes included in these ontologies can be used to represent many of the core concepts found across our three use cases, specifically Building Systems and Equipment, Control Devices, and Sensors and Actuators. Classes to represent the Zones and Spaces and Envelope concepts are not part of the SSN/SOSA ontologies, which lack location and space concepts. Modelers must incorporate them through the use of an external schema.
Modelers representing various Building Systems and Equipment can leverage the core System class in the SSN (ssn:System) to capture the different systems and equipment that may be in a building, such as the air handling unit and VAV boxes in Figure 4. The SSN also provides a Property class (ssn:Property) to capture the intrinsic characteristics of equipment (e.g., type, rated power draw). Control Devices can be represented in SSN/SOSA since Systems can implement Procedures, which could perform actuations based on observation inputs. These target a sosa:FeatureOfInterest, which could be a damper position, heating valve command, or other similar control point. Envelope elements like walls or windows may also be modeled using the Feature of Interest class. These schemas also include classes for the representation of time series data in the form of Observations, Actuations, or Samplings. These items produce a sosa:Result so that a modeler can represent readings of data from systems or equipment (and their component sensors and actuators). Finally, Sensors and Actuators are core classes defined in the SSN/SOSA ontologies. The Sensor class in SOSA (sosa:Sensor) is provided as a sub-class of the SSN System to represent a device, agent, or software that is involved in or implements a sosa:Procedure. Sensors are hosted by SOSA Platforms, which are themselves part of a SOSA Deployment and respond to either an environmental stimulus or input data from prior results to generate a result. Likewise, Actuators are SOSA (sosa:Actuator) devices that are used by, or implement, a SOSA Procedure to change the environment's state. SOSA includes a sosa:ActuatableProperty class to allow a modeler to characterize what aspect of a sosa:FeatureOfInterest can be acted upon in the environment (e.g., a window's ability to be opened and closed by an attached actuator). While SSN/SOSA does not directly represent zones and physical spaces, the relationship between Sensors or Actuators and part of a system can be at least partially determined by the FeatureOfInterest with which they are associated.
SSN/SOSA include many built-in properties necessary for modeling various concepts from our use cases. The sosa:observedProperty property links an Observation to the property that was observed. The sosa:hasFeatureOfInterest and sosa:isFeatureOfInterestOf properties link Observations and Actuations to the Features of Interest they concern. The ssn-system:SystemProperty class also enables a modeler to specify a characteristic that represents the particular system's ability to operate for a primary purpose, e.g., for a Sensor to produce Observations or an Actuator to make Actuations on the environment.
Representing core concepts of Zones and Spaces and part of Envelope requires the modeler to use an external schema. Modeling physical Zones and Spaces would require using a schema like BOT to capture spatial elements. Modeling properties of Envelope elements does not seem to be in scope with SSN/SOSA and likely requires using another ontology. SOSA does include a Platform class (sosa:Platform) for representing entities that host other entities, such as an aircraft hosting sensors. It is probably possible to leverage the Platform concept to represent a building, but the ontology is designed so that a modeler will rely upon another schema to represent these concepts in the clearest way. Missing from SSN/SOSA are units of measure, for which modelers are encouraged to use an outside schema, such as QUDT or Ontology of Units of Measure (OM), when building a model. With Control Devices, aspects like control strategy or schedules are missing from the ontology. Our three use cases also raise concepts of Schedules as part of Control Devices, and there is no mechanism to expose these aspects to building systems and applications.
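The last point can be illustrated with a small Python sketch: given observations expressed with SOSA-style terms, the sensors associated with a part of the system are recovered through the feature of interest of their observations. The tuples mimic RDF statements without using an RDF library; all `ex:` identifiers, the numeric results, and the helper `sensors_for` are hypothetical, and the results deliberately carry no unit because units would have to come from another ontology.

```python
# (subject, predicate, object) statements using SOSA-style terms.
# All ex: identifiers and values are hypothetical.
TRIPLES = [
    ("ex:SupplyAirTempSensor", "rdf:type", "sosa:Sensor"),
    ("ex:ZoneTempSensor", "rdf:type", "sosa:Sensor"),
    ("ex:Obs1", "rdf:type", "sosa:Observation"),
    ("ex:Obs1", "sosa:madeBySensor", "ex:SupplyAirTempSensor"),
    ("ex:Obs1", "sosa:hasFeatureOfInterest", "ex:AHU1SupplyAir"),
    ("ex:Obs1", "sosa:observedProperty", "ex:AirTemperature"),
    ("ex:Obs1", "sosa:hasSimpleResult", 13.2),  # unit not expressed here
    ("ex:Obs2", "rdf:type", "sosa:Observation"),
    ("ex:Obs2", "sosa:madeBySensor", "ex:ZoneTempSensor"),
    ("ex:Obs2", "sosa:hasFeatureOfInterest", "ex:Zone1Air"),
    ("ex:Obs2", "sosa:observedProperty", "ex:AirTemperature"),
    ("ex:Obs2", "sosa:hasSimpleResult", 22.5),  # unit not expressed here
]


def sensors_for(feature):
    """Sensors that made at least one observation of the given feature."""
    observations = {
        s for s, p, o in TRIPLES
        if p == "sosa:hasFeatureOfInterest" and o == feature
    }
    return sorted(
        o for s, p, o in TRIPLES
        if p == "sosa:madeBySensor" and s in observations
    )


ahu_sensors = sensors_for("ex:AHU1SupplyAir")
```

The association is only as good as the features of interest a modeler chooses: nothing here states where `ex:AHU1SupplyAir` is located or which zone it serves.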

Smart Applications REFerence Ontology (SAREF) and Extensions
The Smart Applications REFerence Ontology (SAREF) and extensions are designed to enable interoperability among IoT solutions developed by different appliance manufacturers [133].SAREF was initially developed by a team from the Netherlands Organization for Applied Scientific Research between 2014 and 2015 at the behest of the European Commission [134].SAREF and its extensions are maintained and supported by the European Telecommunications Standards Institute (ETSI), a European Standards Organization, with the most recent modification of the core in May 2020.SAREF focuses on the concept of devices as tangible objects that can accomplish a task (e.g., changing a room's tempera-ture) through a specified function or procedure when a measured value hits a specified threshold.Examples include light switches, temperature sensors, or more complex things like washing machines.Devices have Properties related to Measurements that can be produced while offering a particular Service.For instance, SAREF [133] defines a model for the services and functions provided by devices, which has been extended to cover concepts in several other domains: SAREF4BLDG [135] introduces a taxonomy of devices from the IFC standard [136] and spatial components inspired by BOT, SAREF4ENER [137] defines representations of energy use profiles for devices, and SAREF4SYST [138] defines a model for topologies of devices and how they connect.SAREF4SYST is an upper ontology, while SAREF core is application specific for devices, and the SAREF4BLDG extension is even more narrowly focused on building components.Between the core ontology and its extensions, SAREF spans the gulf between the upper ontology and application or domain-specific ontologies, covering many of the core concepts identified by our use cases.
Modeling buildings and their energy usage would mostly leverage the SAREF4BLDG extension. The SAREF4BLDG extension includes building and building spaces (s4bldg:Building and s4bldg:BuildingSpace) classes to represent Zones and Spaces such as the physical composition of a given building, but the focus in this extension is still primarily device oriented rather than space oriented. The Building class provides the container in which particular building spaces are described, and, for example, a zone can be modeled as an instance of the BuildingSpace class. The core Envelope concepts for a building such as walls or a roof are not represented in SAREF. However, some types of devices that impact the Envelope are included, such as s4bldg:ShadingDevice, which is used to model devices for protecting from light exposure. Building Systems and Equipment can be represented using a combination of the core ontology and its extensions. The SAREF4SYST extension provides a basic abstraction for Systems that can have sub-systems. The saref:Device class encapsulates tangible objects, and the SAREF4BLDG extension adds domain-specific sub-classes such as s4bldg:Boiler and s4bldg:CoolingTower. SAREF's Property class can be used to represent measurable qualities as well as intrinsic characteristics that a modeler needs to encapsulate.
SAREF includes a general Device class to represent tangible objects designed to accomplish a task. Tasks are represented as Functions, and each Device performs one or more Functions. This generic concept is used to represent Control Devices and Sensors and Actuators through varied sub-classes. The Control Devices concept is supported at least partially by SAREF4BLDG's Controller class. Instances of Controller monitor control inputs and outputs needed in building automation. These can be physical devices or logical devices implemented in software. SAREF includes a Commands class as well as Services that represent functions advertising the device on a network and its availability to perform actions. The SAREF4BLDG extension includes an Alarm class as a sub-class of Distribution control devices. SAREF4SYST includes a Connection Point class to describe the connections that may be leveraged for modeling this important aspect of Control Devices. SAREF directly represents Sensors and Actuators in the ontology. SAREF4BLDG includes an Actuator class to represent mechanical devices for adjusting or controlling a mechanism or system. Sensors have a SensingFunction for transmitting data such as measurement values, while Actuators have an ActuatingFunction to transmit data to actuators and change a state. For measurements from Sensors, SAREF includes Measurement and UnitOfMeasurement classes and draws from the Ontology of Units of Measure (OM) schema by default. Examples in the ontology include Illuminance, Power, and Temperature units.
SAREF and its extensions also include a variety of properties to define relationships among instances of the various classes. The SAREF4BLDG extension introduces properties useful when modeling Zones and Spaces to represent Location using the W3C's basic geo vocabulary [139], as well as the hasSpace and isContainedIn relations to define how spaces physically relate to each other. Properties necessary for modeling various types of devices, such as Building Systems and Equipment or Sensors and Actuators, include hasManufacturer and hasModel to let users capture these intrinsic properties. Incorporating concepts from the IFC ontology, SAREF4BLDG adds building-specific properties as subclasses of saref:Property. Meters are included as specific devices in SAREF designed to perform a Metering Function.
One element missing from the SAREF ontologies is the concept of Schedules that would be necessary when modeling a control strategy for various Control Devices in our energy use cases. It is also unclear how well a modeler could represent various aspects of an Envelope's properties. While much of Zones and Spaces can be modeled, the geometric elements are not aligned with the core ontology or its extensions and would require the modeler to extend the schemas further.
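The device-oriented structure described above can be illustrated with a small sketch that places devices in building spaces and then asks which devices sit anywhere inside a building. The class and relation names follow those mentioned in the text (Building, BuildingSpace, Boiler, ShadingDevice, hasSpace, isContainedIn); the `ex:` instances and the helper functions are hypothetical, and plain tuples stand in for an RDF store.

```python
# Prefixed names are used as plain strings; ex: instances are hypothetical.
triples = [
    ("ex:Building1", "rdf:type", "s4bldg:Building"),
    ("ex:Floor1", "rdf:type", "s4bldg:BuildingSpace"),
    ("ex:OpenOffice", "rdf:type", "s4bldg:BuildingSpace"),
    ("ex:Building1", "s4bldg:hasSpace", "ex:Floor1"),
    ("ex:Floor1", "s4bldg:hasSpace", "ex:OpenOffice"),
    ("ex:Boiler1", "rdf:type", "s4bldg:Boiler"),
    ("ex:Boiler1", "s4bldg:isContainedIn", "ex:Floor1"),
    ("ex:Shade1", "rdf:type", "s4bldg:ShadingDevice"),
    ("ex:Shade1", "s4bldg:isContainedIn", "ex:OpenOffice"),
]


def spaces_within(root, graph):
    """Follow hasSpace links transitively from a building or space."""
    found, stack = [], [root]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in graph:
            if subject == current and predicate == "s4bldg:hasSpace" and obj not in found:
                found.append(obj)
                stack.append(obj)
    return found


def devices_within(root, graph):
    """List devices contained in the root or in any space nested below it."""
    spaces = {root, *spaces_within(root, graph)}
    return sorted(
        subject
        for subject, predicate, obj in graph
        if predicate == "s4bldg:isContainedIn" and obj in spaces
    )


print(devices_within("ex:Building1", triples))
print(devices_within("ex:OpenOffice", triples))
```

Note that nothing in this sketch describes a schedule, wall construction, or room geometry; representing those would require the kind of extension or external schema discussed above.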

RealEstateCore (REC)
RealEstateCore (REC) is an ontology designed for enabling building control and the development of services around the data that can be generated from buildings in preparation for interactions with smart cities [140]. REC is produced and funded by a consortium involving many major real estate companies in Northern Europe, founded in 2017 [117]. REC is oriented around a RealEstate concept that has one or more BuildingStructures that encapsulate BuildingStructureComponents such as rooms. Devices in REC are electronic equipment made for a specific purpose and composed of Sensors or Actuators. Devices are related to BuildingStructureComponents through their Sensors or Actuators, which define their location through a property specification [141]. REC pulls together more than 10 schema namespaces and, as an application ontology, provides many building concepts within the ontology itself upon which a modeler can rely [114]. RealEstateCore's design encompasses many core concepts that building modelers require, in contrast with upper ontologies. Since the ontology designers have included these notions as classes or properties, end users modeling particular instances have fewer decisions to make about representing particular pieces of equipment and the relationships of spaces. REC incorporates various concepts through a series of modules, including the Core, Metadata, Building, Device, and Actuation modules among others.
The RealEstateCore (REC) ontology starts from a larger conceptualization by providing Land as well as Real Estate classes that have a location captured in latitude/longitude coordinates. Zones and Spaces concepts emerge in REC with classes for modeling buildings, such as BuildingStructure. The BuildingStructure class has sub-components to represent rooms of various types, building facades, and the roof. The ontology does not have a conceptualization of a Zone, but a modeler can use the Virtual Building Component class to represent non-physical components. REC includes some classes needed to model a building's Envelope. These include notions of the Roof with its inner and outer portions; the Facade; Walls, including the inner portion; and Floors. The Virtual Building Component class may possibly be used to represent additional aspects of a building's Envelope. RealEstateCore's Device module can represent Building Systems and Equipment of various types, while allowing for sub-devices within a larger encompassing system. REC defines devices as electronic equipment made or adapted for a particular purpose, and devices have at least a sensor or an actuator. As particular types of equipment, Control Devices can also be modeled in REC using the Device class. The ontology includes named individuals for varied control devices like Thermostats. Representing Sensors and Actuators, RealEstateCore provides a Sensor class in the core schema, and in the Device module, as something that detects or measures physical properties to be recorded, indicated, or responded to by some other entity. An Actuator class is included in the Device module to represent a device that "takes some control input and executes some real-world action based on this input" [142]. REC furthermore has an entire module for actuations and their actions [143]. The schema offers a PlacementContext class to specify the particular context or media that a sensor or actuator is functioning within, and this may align with the notion of control points.
Properties in REC allow the building modeler to define various relationships. With Zones and Spaces, the modeler can specify relationships such as hasBuildingComponents to define how spaces relate to each other. Instances of Building Systems and Equipment elements are described with a location through the isLocatedIn relationship and can denote a placement context (e.g., a sensor in a supply air location) through the hasPlacementContext relationship. RealEstateCore as an ontology includes multiple types of properties that can be used when modeling building systems and equipment to describe important information about a system or piece of equipment. Measured values coming from equipment are represented using the QuantityKind class that is inspired by the QUDT ontology. Modelers specifying Sensors and Actuators can indicate relationships between instances of these classes with the isPartOf property.
While many aspects of our use cases can be modeled with REC, several elements needed for our energy use cases are missing. It is unclear how a modeler would represent meters or submeters, but likely this would involve extending the Device class to encapsulate additional energy uses. Aspects such as control points, and elements like alarms or commands, are also missing at this time. Finally, there also does not appear to be a conceptualization of schedule or occupancy necessary for control strategies.

Brick Schema
Brick Schema is an application ontology standardizing the representation of physical, logical, and virtual assets in buildings, the relationships between them, and their associated telemetry such as sensors and actuators [95]. Brick was started by academic researchers in 2016 and continues as an open community developing and improving the schema. Brick Schema and its community are supported by grants from various US and international government funding bodies along with a few corporations. Brick was created to provide an expanded, extensible, and formalized vocabulary of common building assets and points, driven by an empirical analysis of the concepts and relationships actually required by real building applications [53]. Brick defines a class structure, which provides an organization of entities by their behavior, purpose, and context. Brick relationships describe perspectives on building assets and systems, which generalize across different building subsystems [144]. Broadly, these perspectives are composition (how entities are assembled together), topology (how entities are connected and how media flows between them), and telemetry (how configuration and time series data relate to entities). Brick is designed to integrate and interoperate with other linked data models such as BOT and REC (alignments with these already exist) as well as external standards such as the VBIS [19].
Brick currently models several categories of building concept classes that are largely compatible with the core concepts outlined above: Location, Equipment, Point, and Substance. Brick's set of Location classes directly represents many of the Zones and Spaces core concepts. Brick decomposes a Site into zero or more Buildings, which may contain Floors and Zones, which, in turn, can be made up of zero or more Spaces. Spaces, in turn, are also organized by use case, e.g., machine room, wet lab, dry lab, or office. The Building Systems and Equipment core concepts are covered by Brick's set of Equipment classes. Brick defines hundreds of Equipment classes, which cover many common types of equipment and devices found in HVAC, lighting, electrical, and water systems; the class structure organizing these is designed to be extensible. Brick contains a few System classes that use compositional relationships to denote the inclusion of equipment within one or more systems. Brick has limited support for different kinds of Control Devices but defines a comprehensive array of classes for data sources including Sensors, Actuators, Setpoints, and Alarms. Brick defines a single Controller class, which is a high-level black-box representation of any logical control sequence such as a schedule or embedded device logic. Data sources can be described as being the input to or the output of one or more controllers; controllers are related to equipment and devices through a specialized relationship. Brick defines hundreds of Point classes, broadly organized into Sensors, Commands, Setpoints, Alarms, and Parameters. Point instances have units, defined by the QUDT ontology, and are related to the substances and quantities that they measure or control. These data sources are contextualized by their relationships to equipment, controllers, and locations. Brick does not represent the current value of Points, instead recommending that these be stored in a specialized external store such as a time series database.
Brick includes many Relationships that can be used by modelers to specify property relationships. When modeling Zones and Spaces, as of version 1.1.0, Brick does not support modeling of additional properties of these location concepts, such as orientation, area, and volume. These features are slated for the upcoming 1.2.0 minor release. This also means that Brick does not model the Envelope core concepts as of the 1.1.0 release. With respect to Building Systems and Equipment, Brick also uses topological relationships between components to represent the flow of a given substance between a sequence of equipment within a given system. The topological and compositional relationships also capture the relationship between equipment and the spatial elements of a building, for example, the zones fed by a particular VAV or the room containing a particular luminaire. Equipment properties, such as rated efficiency or capacity, are being addressed in upcoming releases.
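The topological perspective can be sketched with a few tuples and a traversal that answers the question "which zones are fed by this piece of equipment?". This is a simplified illustration rather than the Brick tooling itself: the relationship and class names (feeds, hasPart, hasPoint, AHU, VAV, HVAC_Zone) follow our understanding of the Brick vocabulary, while the `ex:` instances and helper functions are hypothetical.

```python
# Plain tuples stand in for an RDF graph; ex: instances are hypothetical.
triples = [
    ("ex:AHU1", "rdf:type", "brick:AHU"),
    ("ex:VAV1", "rdf:type", "brick:VAV"),
    ("ex:Zone1", "rdf:type", "brick:HVAC_Zone"),
    ("ex:OpenOffice", "rdf:type", "brick:Room"),
    ("ex:ZoneTemp1", "rdf:type", "brick:Zone_Air_Temperature_Sensor"),
    ("ex:AHU1", "brick:feeds", "ex:VAV1"),       # topology
    ("ex:VAV1", "brick:feeds", "ex:Zone1"),      # topology
    ("ex:Zone1", "brick:hasPart", "ex:OpenOffice"),  # composition
    ("ex:VAV1", "brick:hasPoint", "ex:ZoneTemp1"),   # telemetry
]


def objects(graph, subject, predicate):
    """Return all objects linked to a subject by a predicate, in graph order."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def downstream(graph, start):
    """Follow feeds links transitively, breadth first."""
    visited, queue = [], [start]
    while queue:
        current = queue.pop(0)
        for target in objects(graph, current, "brick:feeds"):
            if target not in visited:
                visited.append(target)
                queue.append(target)
    return visited


def zones_fed_by(graph, equipment):
    """Keep only the downstream entities typed as HVAC zones."""
    return [
        entity
        for entity in downstream(graph, equipment)
        if "brick:HVAC_Zone" in objects(graph, entity, "rdf:type")
    ]


print(zones_fed_by(triples, "ex:AHU1"))
```

The same graph offers no place to record the area of the room or the rated capacity of the air handler, which corresponds to the property gaps noted for the 1.1.0 release.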

Discussion and Conclusions
This paper presented a systematic survey of metadata schemas for building energy applications across the building life cycle to address three research questions. Our review, which surveyed the landscape of metadata schemas, identified three building energy use cases, presented a model building, and analyzed five ontologies in light of these use cases and models, shows that building modelers and managers face a complex ecosystem of concerns.
Answers to the three research questions, based on the results illustrated above, are provided in Section 6.3.

The Landscape of Metadata Schemas for Building Energy Applications
Forty schemas were found that met the defined criteria. The schemas found were grouped into 7 categories, with the majority describing building operation and 14 focusing on the control and metering systems of commercial buildings. Among the 40 surveyed schemas, only 2 of them (i.e., IFC and gbXML) and 1 derived from them (ifcowl) address the design stage of the building life cycle. This may suggest that tools for the building design stage have achieved a high degree of standardization. While there has been convergence in BIM practice, compared to a few decades ago, several issues still remain. BIM commercial software packages implement some features of these standards differently, and some only support a specific subset of features (e.g., 3D architectural elements) [145]. This issue may cause loss of information when files are exchanged between different actors using different packages. Similar problems may also occur between different versions of the same software [145]. Further, extension of BIMs to building operations still remains far from being realized in practice [146]. However, some of the operation-focused schemas reviewed try to borrow concepts derived from the design phase (e.g., BACS [101]).
IoT schemas surveyed are just a small fraction of the total ontologies cited in other reviews. Gyrard et al. identify more than 200 ontologies relevant to IoT applications [147], but it is unclear how many of these can be directly related to building applications. For interested readers, other reviews provide more information about these schemas [48][49][50][51][52]. In addition to the smart home schemas described in Section 3.2.2, Connected Home over IP (CHIP) is a recent project supported by several IoT/smart home vendors and coordinated by the Zigbee Alliance. Its scope is to "develop and promote the adoption of a new, royalty-free connectivity standard to increase compatibility among smart home products, with security as a fundamental design tenet" [32].
Fourteen different schemas have been identified that describe metadata about controls and metering in commercial buildings, but from this high-level review, the overlaps between them and their gaps are unclear. An extensive comparison between Brick and Project Haystack is presented in [148], but no publication compares all these schemas in depth. Gilani et al. tried to evaluate the scope of several ontologies by identifying what type of data they describe (e.g., external conditions, energy use), but these categories are too broad to clearly identify the differences between schemas. For instance, Brick 1.1 and IFC are both listed under the physical building information category, but Brick cannot describe envelope characteristics or even the floor area, while IFC can. A more precise analysis is needed to compare all these schemas and harmonize them.

Challenges in Collecting and Identifying Schemas
Collecting and identifying these schemas proved to be a challenging task. The search using Scopus was ineffective, since the title and abstract were often insufficient to determine whether the paper was presenting a new ontology that met the criteria specified (i.e., having a public or published documentation and a report or paper describing them). The schemas found did not have a consistent and universal identifier (e.g., a digital object identifier: DOI) and were cited in the literature using a mix of references, such as the number of the standard defining them (if it existed), the paper(s) that introduced them, the web address of the public repository containing the schema, and sometimes the website containing the documentation. Some schemas were introduced in multiple papers by a different set of authors. Some schemas had no clear authors but rather were presented as the result of a standard process led by a committee (e.g., [104]) or were proposed by a supporting organization (e.g., [92] or [99]). The ratio of schemas selected using the article search to the total number of papers was 1.5%. Indeed, a large fraction of the papers screened were found to compare existing schemas, synthesize or extend them, or use them to demonstrate particular applications rather than presenting new ones (in agreement with [55]). Further, of the 24 ontologies found using this method, only 3 were not duplicated by the ontology search or expert search.
The search of ontologies using databases was also problematic. While centralized repositories of ontologies exist in other disciplines, this is not the case for building-related ontologies [45]. This is probably a reflection of the heterogeneity of the academic communities working on buildings (e.g., architects, computer scientists, electrical engineers, mechanical engineers). These groups may publish in different journals and go to different conferences and do not necessarily use the same tools to share knowledge. On the industrial side, metadata schemas may be developed by different trade organizations (e.g., construction, HVAC, lighting, controls) but not be well known from the outside. We found three active databases containing different lists of ontologies for IoT devices and smart buildings, but the submission of the content was voluntary and the information was often incomplete or out of date. One website listed 70 ontologies related to smart cities, but a deeper look reveals that 28 links expired, 7 were in foreign languages, and 42 had no linked documentation. The website is maintained by academic researchers with unclear funding mechanisms or promise of continuity.
Another challenge in collecting information about metadata schemas is that they may change over time. Similar to software packages, these schemas may go through different versions and the papers that described their original implementation may not be up to date. Figure 5 shows an example of the evolution and mutual influence of the SSN, SAREF, SEAS, IoT-O, and WoT in the past 10 years. The arrows show that some concepts and ideas influenced each other [149]. Maintenance and update mechanisms for these schemas vary. Official standards have slow update cycles, are typically maintained by a sponsoring organization, and have formal procedures based on community consensus. The W3C provides guidelines for implementing and maintaining ontologies, but these are not universally followed.

Comparing Use Cases across the Ontologies
We examined five ontologies and their ability to represent the core concepts necessary for enabling energy audits, automated fault detection and diagnostics, and optimal control of HVAC systems. These use cases illustrate diverse needs that a building modeler must be able to account for when supporting energy applications. Understanding this, we first discuss how the five ontologies support a building modeler tackling our use cases by reflecting on the ways the core concepts can be captured in models. We then examine challenges and gaps that emerge when applying the ontologies to the use cases.

Testing Ontologies vs. Core Concepts
The five ontologies we examined span a range of purposes from upper to application ontologies. Upper ontologies and domain ontologies address different layers of abstraction and consequently should not be directly compared. Seeing the differences in what can be modeled with each illustrates some of the decisions building energy modelers face. Rather than defining the array of terms required for a use case, upper ontologies like BOT or SSN/SOSA define a framework of generic terms that are common to many different domains. Domain experts can then build on these frameworks to define the concepts they require. This freedom to choose how to represent nuanced concepts creates complexity for a building systems modeler and raises the possibility of inconsistencies between how two modelers might represent the same concept, potentially making semantic interoperability more challenging. In contrast, application or domain ontologies such as RealEstateCore and Brick Schema present a lower-level and more opinionated representation of buildings. These ontologies reduce some of the complexity end users face, by making decisions to include domain-specific concepts in the ontologies themselves (e.g., an Air Handling Unit class).
To understand how the ontologies described in Sections 5.1-5.5 support a building modeler targeting energy use cases, we created an instance of the building in Figure 4 using only the classes immediately defined by each of the ontologies (see Supplementary Materials for a link to the model files). Modeling this toy building illustrates the similarities and gaps where energy modeling actors will have to decide the best approach for representing problems they are tackling. Figure 6 graphs a small subset of the model, focused on the Open Office room, while showing connections to HVAC Zone 1, which is itself connected to VAV Box 1, Window 1, and Window 2. Entities are represented by nodes of the graph, while relationships are annotated on the edges. White circles represent the elements of the model building (e.g., Window 1) using each ontology. These elements are modeled with classes and properties using BOT in red, Brick in blue, SAREF4BLDG in orange, RealEstateCore in green, and SSN/SOSA in purple. Examining Zones and Spaces, both Brick and RealEstateCore have a formal way of describing the difference between Space and Thermal Zone, while BOT and SAREF lack that capability. RealEstateCore also allows us to describe the space as an office. Brick can explicitly characterize the Zone as an HVAC zone. Windows can be modeled in SAREF4BLDG as generic BuildingObject but not in the other ontologies. SSN/SOSA do not natively represent topological aspects like rooms or zones; therefore, these components are not mapped to this ontology. BOT and SSN/SOSA provide generic, baseline classes that only map to certain components in Figure 5.
Creating a model that combines both of these ontologies is aligned with their purposes to capture topologies and networked sensors, respectively. A building modeler expressing Figure 5 using SAREF extensions, RealEstateCore, or Brick is overall able to draw from many included classes to represent a significant amount of our model building. While this simple example shows significant differences in the way building components are described, we underscore that an even-handed evaluation must take into account the differing nature and intent of particular ontologies. Upper ontologies do not contain concepts specific to a domain, and they should be evaluated in terms of how well their structure describes the concepts and properties required by the domain and how easily a user can extend the ontology for specific use cases. In essence, how easily can the required concepts and properties be expressed with the provided ontological structure? Application ontologies should be evaluated like upper ontologies as well as in terms of how many of the domain-specific concepts and properties they capture. Where application ontologies are incomplete, an evaluation should again consider how easily the required concepts or properties can be expressed in terms of the provided ontological structure (i.e., extensibility). For example, is there an existing concept in the application ontology that can be subclassed to express the missing concept?
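The extensibility question can be made concrete with a small sketch in which a missing concept is added by subclassing an existing one, and the new class then inherits its place in the hierarchy. The hierarchy below is a simplified, assumed excerpt of an equipment branch in the style of Brick; `ex:Displacement_Diffuser` is a hypothetical class invented for illustration, and a dictionary stands in for rdfs:subClassOf statements.

```python
# Simplified hierarchy: child -> parent (None marks the root).
# The brick: names are an assumed excerpt of an equipment branch;
# ex:Displacement_Diffuser is a hypothetical extension class.
SUBCLASS_OF = {
    "brick:Equipment": None,
    "brick:HVAC_Equipment": "brick:Equipment",
    "brick:Terminal_Unit": "brick:HVAC_Equipment",
    "brick:VAV": "brick:Terminal_Unit",
}


def extend(hierarchy, new_class, parent):
    """Return a copy of the hierarchy with new_class added under parent."""
    if parent not in hierarchy:
        # No suitable concept to subclass: the ontology is harder to extend.
        raise KeyError(f"unknown parent class: {parent}")
    extended = dict(hierarchy)
    extended[new_class] = parent
    return extended


def ancestors(hierarchy, cls):
    """List the superclasses of cls from nearest to most general."""
    chain = []
    parent = hierarchy[cls]
    while parent is not None:
        chain.append(parent)
        parent = hierarchy[parent]
    return chain


extended = extend(SUBCLASS_OF, "ex:Displacement_Diffuser", "brick:Terminal_Unit")
print(ancestors(extended, "ex:Displacement_Diffuser"))
```

An application that queries for all terminal units or all HVAC equipment would pick up the new class without modification, which is the practical benefit of extending by subclassing rather than introducing an unrelated concept.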

Challenges and Gaps Applying the Ontologies to Our Use Cases
The three use cases that emerged from our review of schemas, and the needs of building modelers supporting energy applications, illustrate a range of concerns that affect what needs to be modeled to enable different applications. Taking these core concerns and studying the five selected ontologies, we see that due to the design and purpose of any ontology, the ability of a modeler to represent buildings and their energy issues requires balancing varied opportunities and limitations. Even among just these five ontologies, the perspective and starting point for modeling and presenting building information can vary, from orientations around the topologies of buildings and their sub-components (BOT) to sensors (SSN/SOSA), devices (SAREF), and assets (Brick Schema). Each is able to capture and represent interrelated ideas, but the starting point affects the model representation produced as well as the data that can be generated, affecting what can be made interoperable and how this can be achieved. Gaps are summarized below in Table 3. We find that building modelers would generally be unable to represent key aspects of Zones and Spaces, in particular the geometry of these spaces. This is an area where an individual could develop their own extension, or pull in an external schema, but doing so would not necessarily lead to interoperability when using varied tools with different systems across use cases. Building Envelopes are also consistently not fully represented across these schemas, with many key properties left out of the scope of the ontology designs. A major limitation around Control Devices is the inability to represent aspects of control strategies, such as building schedules. Even this short list of gaps demonstrates how ontologies are not, by default, comprehensive enough to fully enable energy applications such as audits, AFDD, or optimal control. Comparing and contrasting upper and application ontologies, we better understand challenges resulting from such gaps as well as the opportunities individual modelers have to extend and customize ontologies to suit their particular needs.
Upper ontologies (BOT, SSN/SOSA, the core of SAREF and SAREF4SYST) are general-purpose ontologies that result in gaps that a modeler would need to fill in with extensions or external schemas. BOT's design, as a general way of representing building topologies, results in an ontology that can be used to express many high-level aspects of a building. Key gaps that emerged when modeling our use cases highlight how a building modeler would require the use of multiple external schemas if BOT was their primary ontology. BOT does not include representations of properties or units of measurement that affect multiple concepts related to systems, equipment, and devices, like sensors or actuators, making it difficult to capture the characteristics of a building's envelope necessary for energy auditing. BOT cannot represent a building's schedule or occupancy, which is needed by various aspects of our three use cases and their applications, and does not directly include a way to represent the floor area for Zones and Spaces or other properties of building components. In contrast to BOT, SSN/SOSA ontologies are designed to model sensor networks and systems. Gaps emerge here where a modeler cannot natively represent locations and concepts like Zones and Spaces or many aspects of a building's Envelope, such as properties of these components. Modeling Control Device schedules is also not within the scope of the ontology. An individual trying to model a building for use in energy applications would thus likely need to build such a representation by drawing from each of these ontologies. The gaps that emerge in each are distinct, and leveraging each schema would provide the flexibility to portray complex building environments, while providing an opportunity to make choices in extending each to suit the unique circumstances of a given building. This flexibility may be beneficial to an individual with significant experience in modeling buildings systems for a range of use cases who desires the ability to incorporate their nuanced point of view. This would be more challenging for modelers with less experience, time, or interest in deeply customizing a building model if their goal is to reach a point where data can be collected and energy applications undertaken.
The application ontologies (Brick, RealEstateCore, SAREF4BLDG), in contrast, have gaps but provide many relevant domain-specific features that would benefit modelers with less experience or interest in taming complexity themselves. SAREF and its many extensions cover a wide swath of our core concepts and yet still do not include ways to represent geometry and functions of spaces, aspects of building envelopes key to useful energy applications, or schedules and control strategies for control devices. RealEstateCore similarly does not natively support modeling concepts and properties around zones, even though the ontology's Virtual Building Component can represent such entities generally. Classes and properties for modeling meters and sub-meters for energy use cases appear to be missing from REC, and this would be a key area where a user would have to implement a customization. Brick at this point also does not include a mechanism to represent quantifiable static properties like room area or the rated power draw of motors. Brick's design is oriented around making queries easier for users with instantiated Brick models, and this results in gaps when trying to determine which zones or spaces are adjacent to each other. Brick also abstracts and simplifies some aspects of common building equipment, such as an air handling unit, and leaves out details about the internal device layout of such equipment. This may be a challenge for building modelers who are concerned with this type of information as an input to their energy applications.
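As a rough way to reason about combining ontologies, the gaps discussed in this section can be encoded as sets and intersected: a requirement remains unmet only if every selected ontology lacks it. The sketch below is a simplified reading of the prose in this section, not a reproduction of Table 3; the gap labels are our own shorthand, and the absence of a label for an ontology means only that the text above does not report that gap, not that support is complete.

```python
from collections import Counter

# Shorthand gap labels paraphrased from the discussion above (assumed
# simplification; see Table 3 in the paper for the authoritative summary).
GAPS = {
    "BOT": {"units", "static properties", "schedules", "occupancy", "envelope"},
    "SSN/SOSA": {"zones", "envelope", "schedules", "units"},
    "SAREF": {"geometry", "envelope", "schedules"},
    "RealEstateCore": {"zones", "meters", "control points", "schedules", "occupancy"},
    "Brick 1.1": {"static properties", "envelope", "adjacency", "equipment internals"},
}


def gap_frequency(gaps):
    """Count how many ontologies share each gap."""
    return Counter(gap for missing in gaps.values() for gap in missing)


def uncovered(requirements, chosen, gaps=GAPS):
    """Return requirements that every chosen ontology fails to cover.

    Expects at least one ontology name in chosen.
    """
    shared = set.intersection(*(gaps[name] for name in chosen))
    return set(requirements) & shared


frequency = gap_frequency(GAPS)
print(frequency["schedules"], frequency["envelope"])
print(sorted(uncovered({"zones", "schedules", "units"}, ["BOT", "SSN/SOSA"])))
```

Under these assumptions, pairing BOT with SSN/SOSA resolves the zone requirement but still leaves schedules and units to an external schema, which matches the qualitative conclusion drawn above for upper ontologies.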
The insights and challenges illuminated by our review and analysis underscore the complex decisions building modelers face when building the resources necessary to enable energy applications effectively and intuitively. Much of the decision making will be context specific, and the path toward semantic interoperability among applications and models will not be simple.

Answers to the Research Questions
The first question asked what the landscape of metadata schemas looks like for the building-related domain. Overall, our review illustrates that the landscape of ontologies relevant to buildings and energy applications is diverse, fragmented, and constantly evolving. No one schema enables the variety of applications that stakeholders focused on energy issues need to address, and many schemas are one-offs or fall out of use as their developers fail to support them sufficiently. Combining schemas in a rigorous way is generally challenging, with minimal guidance provided by developers, such as formal alignment between ontologies. Maintaining a model of a building that uses multiple schemas is likely to be challenging, given that each evolves at a different rate. Our review of five ontologies answers our second research question by identifying overlaps and gaps in their ability to support building operations applications, as well as our third question by examining how a subset of schemas can support building modelers targeting energy applications. We found several missing concepts that would make modeling use cases difficult, or at least incomplete, since facets such as control logic, geometry, and building envelope characteristics are not supported (Table 3). Customizations required to add these concepts are possible, but they are labor intensive, tend to cause interoperability issues between applications, and reduce scalability. We also found several overlapping concepts between ontologies that will need to be harmonized in the future to promote semantic interoperability in building applications (e.g., Figure 6). While progress was made in the past decade, the lack of complete, synergistic, and widely adopted metadata schemas is still hindering the development of energy-oriented applications that can be reused across buildings, limiting the scalability of innovative solutions.

Limitations of the Review
The two reviews presented in this paper have a few limitations. First, this review only considers schemas with published documentation, ignoring many models developed internally by companies and not shared publicly. Further, considering the fast pace at which new schemas and ontologies are introduced, especially in the IoT domain, this review should be considered a snapshot in time. Inevitably, new schemas will be introduced, and existing ones will be modified or will disappear. The building community should maintain an updated database of these schemas, as happens in other disciplines or sectors (e.g., [150]). Nevertheless, a look at existing schemas for building energy applications in 2021 is still a useful exercise, as the building industry is substantially fragmented and different communities do not typically share the knowledge that will be necessary to build models that are semantically interoperable.
Another limitation concerns the scope of the analysis in Section 5. We could only explore three use cases (Section 4) and apply them to one simple prototypical building (Figure 4), given space limitations. While the three use cases selected are often cited in the literature (e.g., [16]) and are central to recent standardization efforts (e.g., [70]), they only represent a fraction of the possible applications and they only describe building operations. Further, the example building in Figure 4 was arbitrarily defined, based on the authors' expertise (i.e., HVAC, lighting), and does not cover other energy-consuming systems (e.g., plug loads) or distributed energy resources. The decision was taken after searching in vain for reference buildings to use. The search led to examples that either were too complex for this application [151] or only described HVAC equipment [152]. We also considered using an actual building from a public database of metadata [153], but we realized that this solution was not necessarily more representative than a fictional building. Both use cases and examples should be further expanded and refined in the future to account for other perspectives.
While comparing the ontologies in Section 5, it also became evident that a quantitative comparison between upper and domain ontologies would be both difficult and unfair, due to the different purposes underlying their design. As shown in Section 5, upper ontologies provide a different level of abstraction and do not directly contain domain concepts such as an AHU or a thermostat, but they may provide the constructs and relationships to describe them indirectly. We also realized that the same concept can be represented in alternative ways within one ontology, and the modeler using the ontology has the freedom to choose one approach over the other, based on the level of granularity of the model or other considerations. In some cases, the boundaries of an ontology were unclear. Some ontologies such as SAREF have many formal extensions that add concepts to the main ontology, while RealEstateCore is composed of varied modules that a modeler must grasp. Some of these concepts may actually conflict or overlap, and they are difficult to analyze. Sometimes, ontologies indicate that they can be extended by using other ontologies, but they do not specify how, since there are generally minimal formal alignments or examples provided to prospective modelers. For these reasons, the comparison of the five ontologies is not quantitative, and a building modeler will face significant challenges accomplishing their work.

Future Work
Our review demonstrates that significant work remains to move beyond the status quo and improve semantic interoperability. It is important to recognize that in the building industry, the use of ontologies in commercial products is still low. One example of an open-source product is iot.mozilla.org, which uses the Web of Things (WoT) ontology for IoT applications [154]. However, at this point in time, its commercial success is unclear. In addition, other products may incorporate different ontologies for internal use, but they do not typically advertise it or provide details to outside stakeholders, potentially limiting their utility and reach. We see three key areas for future work to improve semantic interoperability in the building sector:
1. Create and maintain a public repository of schemas and ontologies for building energy applications. A centralized database and search engine will reduce the effort required to search and identify existing schemas, and hopefully promote the reuse of concepts, as demonstrated in other disciplines (e.g., medicine). Answering our first research question demonstrated the variability of this landscape, and fostering a community-driven repository will be necessary to keep track of the evolution of ontologies relevant for building modeling and energy use cases.

2. Develop and share additional use cases. We faced the challenge of identifying useful but tractable use cases for our review. Through our analysis of five schemas, and in particular our attempts to represent the model building, we noted the importance of a building modeler's role in producing a useful product that can support energy applications. Future endeavors should work to produce public use cases and reference models (both conceptual and instantiated using particular ontologies) that clearly examine and weigh the trade-offs modelers face when building these key resources. The decisions that individual modelers and communities make are nuanced: they have to balance leveraging standardized elements of schemas against the need to convey the most expressive depiction of a situation that can be used to illuminate meaningful problems. These examples should be of the appropriate complexity to test and evaluate the completeness, extensibility, and usability of a schema. Use cases should also investigate the role of different actors in creating, updating, and using metadata models.

3. Work with multiple stakeholders to harmonize and standardize schemas. Academia, industry, and other interested stakeholders (e.g., policymakers) should collaborate within a standards organization (e.g., ASHRAE) to create a standard schema addressing semantic interoperability for building applications. Such institutional frameworks allow communities to gather direct inputs from different parties and to create an informed, industry-relevant standard that is more likely to be adopted [21]. Furthermore, they provide a mechanism for updating and modifying the schema using formal procedures based on community consensus. From a technical perspective, the resulting schema should have the right level of detail to cover the target use cases but avoid overly complex solutions. The schema should follow best-practice guidelines by allowing reuse of concepts from existing ontologies (e.g., as in BACS [101]) and extensions for uncommon concepts or future applications that have not yet been identified. Further, tools and reference implementations should be developed by the standards organization to facilitate adoption.
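As a rough illustration of the first recommendation, the sketch below shows what a searchable schema registry could look like at its simplest. Everything in it is an assumption made for illustration: the record fields, the `search` function, and the placeholder entries ("Schema A" to "Schema C") are hypothetical and do not reproduce the contents of Table S1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaRecord:
    """One registry entry; the fields are illustrative assumptions."""
    name: str
    phases: tuple    # building life-cycle phases the schema covers
    keywords: tuple  # lowercase topical tags used for search


# Placeholder entries, not real schemas from the review.
REGISTRY = [
    SchemaRecord("Schema A", ("operations",), ("hvac", "sensor", "control")),
    SchemaRecord("Schema B", ("design", "construction"), ("geometry", "envelope")),
    SchemaRecord("Schema C", ("design", "operations"), ("hvac", "geometry")),
]


def search(registry, *terms, phase=None):
    """Return names of schemas tagged with every term, optionally by phase."""
    wanted = {t.lower() for t in terms}
    hits = []
    for record in registry:
        if phase is not None and phase not in record.phases:
            continue
        if wanted <= set(record.keywords):
            hits.append(record.name)
    return hits


if __name__ == "__main__":
    print(search(REGISTRY, "hvac"))
    print(search(REGISTRY, "HVAC", phase="design"))
```

A community-maintained repository would need far richer metadata (versioning, maintenance status, links to formal alignments), but even this minimal structure shows how tagging schemas by life-cycle phase and topic could shorten the search for reusable concepts.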

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/en14072024/s1: Table S1: The full table of our review of metadata schemas can be found in the following publicly accessible Google spreadsheet. This expanded table captures extensive details about the 40 schemas our review surfaced, including more information about their syntax and purpose. Links to publicly available repositories for each schema are also provided (https://docs.google.com/spreadsheets/d/1Ldx5jC0ua1Y55D3SwkQ5lmUbisGAFhXzB3RC4bCYJfU); Table S2: Concepts required by Use Case 1: Energy Audits; Table S3: Concepts required by Use Case 2: AFDD; Table S4: Concepts required by Use Case 3: Optimal Control; Table S5: Building models.

Figure 2. Systematic review process (preferred reporting items for systematic reviews and meta-analyses (PRISMA)) and how it relates to the three research questions.

Figure 3. Use cases developed and use of metadata models.

Figure 4. Section of an office building and a representation of the HVAC system that serves its spaces.

Figure 5. Evolution and mutual influence in the development of several ontologies [149].

Figure 6. Graph showing the Open Office segment of the Figure 4 model building, as represented by the five ontologies examined and modeled in this paper (orange: SAREF; purple: SSN/SOSA; blue: Brick; red: BOT; green: RealEstateCore).

Table 1. Metadata schemas resulting from the review process.

Table 2. Concepts needed by the use cases.

Table 3. Elements missing between use case concepts and five ontologies. Use cases affected by gaps are indicated in parentheses.