Advancing Urban Building Energy Modeling: Building Energy Simulations for Three Commercial Building Stocks through Archetype Development

Abstract: Urban building energy models (UBEMs), developed to understand the energy performance of building stocks of a region, can aid in key decisions related to energy policy and climate change solutions. However, creating a city-scale UBEM is challenging due to the requirements of diverse geometric and non-geometric datasets. Thus, we aimed to further elucidate the process of creating a UBEM with disparate and scarce data based on a bottom-up, physics-based approach. We focused on three typically overlooked but functionally important commercial building stocks, which are sales and shopping, healthcare facilities, and food sales and services, in the region of Pittsburgh, Pennsylvania. We harvested relevant local building information and employed photogrammetry and image processing. We created archetypes for key building types, designed 3D buildings with SketchUp, and performed an energy analysis using EnergyPlus. The average annual simulated energy use intensities (EUIs) were 528 kWh/m², 822 kWh/m², and 2894 kWh/m² for sales and shopping, healthcare facilities, and food sales and services, respectively. In addition to variations found in the simulated energy use pattern among the stocks, considerable variations were observed within buildings of the same stock. Errors of about 9% and 11% were observed for sales and shopping and healthcare facilities, respectively, when validating the simulated results with the actual data. The suggested energy conservation measures could reduce the annual EUI by 10–26% depending on the building use type. The UBEM results can assist in finding energy-efficient retrofit solutions with respect to the energy and carbon reduction goals for commercial building stocks at the city scale. The limitations highlighted may be considered for higher accuracy, and the UBEM has a high potential to integrate with urban climate and energy models, circular economy, and life cycle assessment for sustainable urban planning.


Introduction

Background
The building sector is one of the largest contributors to global greenhouse gas (GHG) emissions due to its energy-intensive nature [1]. Considering the necessity of decarbonizing buildings, substantial efforts have been made during the past decades, particularly for energy-efficient building retrofitting. To enable large-scale applications, building energy modeling has gradually shifted from individual buildings to the urban scale [2]. Urban building energy models (UBEMs) aim to understand urban building energy consumption more comprehensively, which can effectively be used in city energy planning, neighborhood design, and implementing carbon reduction strategies [3][4][5][6]. UBEMs can also help city managers to determine, evaluate, and select energy strategies in the urban context [7][8][9]; evaluate future climate change scenarios [10]; and establish an energy performance benchmark at the city level for future improvement [11].
The typical UBEM creation approach involves developing archetypes by segmenting the building stock, collecting geometric and non-geometric information, running a simulation engine, and validating the results against measured data. However, creating a UBEM can be challenging and highly uncertain [3,12,13]. Three main UBEM approaches have been commonly used in the literature: data-driven, physics-based, and reduced-order approaches [14][15][16]. The data-driven approach is useful for load prediction, energy pattern analysis, etc. [17], but has limited ability for strategic reduction scenario analysis. The physics-based approach is more useful as it allows for scenario analysis and produces results at a high spatial and temporal resolution without relying on historical energy consumption and socioeconomic factors. However, the approach depends on detailed physical and technical building information, which is often difficult to obtain, and it requires time-consuming modeling and simulation procedures [18]. The reduced-order approach is gradually being adopted to quickly simulate the building energy performance as it needs fewer input data than the other two [19]. As the characteristics and functions of each approach are different, the choice could be based on the aim of the study and the data and resources available [20,21].

Building Stock Energy Simulation and Modeling
Large-scale building energy simulation and modeling in both residential and commercial building stocks has gained increasing attention over the past decade. Krati et al. [22] analyzed the energy benefits of utilizing hybrid indirect evaporative and vapor compression systems for the existing residential building stock in Saudi Arabia. The potential of installing large-scale building-integrated photovoltaics (PVs) on building façades and rooftop-mounted PVs in the commercial building stock was demonstrated in Tokyo, Japan [23]. Kotarela et al. [24] studied the energy performance gap arising from the choice between dynamic and quasi-steady-state simulation tools. The study found about a 4.5% higher annual electricity consumption per conditioned area in the existing building when using the steady-state simulation tool. Ward et al. [25] proposed a modular framework to estimate the energy consumption of residential neighborhoods using drive-by image capture and mobile sensing data. Important parameters such as thermal properties were collected based on statistical archetypes, internal load, and scheduling data, which were extracted from the literature and industry guidelines, and uniform assumptions were made for all simulations. By analyzing multiple available building systems and energy conservation measures, Yamaguchi et al. [26] employed statistical models for commercial building stocks, especially to analyze the effect of technology deployment. The study identified system alternatives for heating, ventilation, and air conditioning (HVAC) systems, water heating systems, and a combination of energy conservation measures, where building stock segments were classified by building use types and floor sizes. A GIS-based hybrid approach for commercial building stock was introduced by Perwez et al. [27] for multi-scale simulation of energy consumption. The approach is feasible when important physical factors, such as the geometric, non-geometric, and typology data of buildings including the thermo-physical properties, footprints, heights, shape features, etc., are embedded as geo-referenced datasets.

UBEMs and Energy Simulation
Among other available tools such as building stock modeling, UBEMs can effectively support city-scale energy performance simulation and planning [28]. Borràs et al. [29] evaluated energy communities by integrating a UBEM and solar installation into the rooftops of residential and school buildings in Portugal. Similarly, a GIS-synthetic hybrid UBEM coupled with building-integrated photovoltaics was proposed by Perwez et al. [30] with the aim of understanding the carbon neutrality potential of the commercial building stock in Tokyo city. The study found that a 16-40% decarbonization potential can be attributed to implementing this measure. A UBEM was developed for 30 multifamily residential buildings in Uppsala, Sweden, through modeling by zoning requirements [31]. Carnieletto et al. [32] generated a UBEM for residential and office building stock by developing a prototype in northern Italy. Katal et al. [33] integrated a microclimate model into a UBEM to simulate the dynamic energy consumption of downtown Montreal, Canada. By considering the disparateness of HVAC systems, a UBEM was developed for simulating the energy consumption of Japanese office building stock [34]. By using data-driven graph neural network models, Hu et al. [35] studied solar-based building interdependency in their UBEM and modeled energy consumption at the hourly timescale of campus buildings in Atlanta.
Recently, web-based and digital approaches were integrated with UBEMs for data collection and simulation. For example, Wu et al. [36] used mobile position data to collect more representative occupancy-related information, compared it with the Department of Energy (DOE) reference models, and then evaluated the impacts on building energy consumption. The study observed a considerable difference in HVAC energy demand from variations in occupancy rate, although lighting and plug loads were not considered. A similar approach was adopted by Mosteiro-Romero et al. [37] in Zurich, Switzerland. The study found that a population-based approach yielded a 33% lower maximum number of occupants than the deterministic model, which could change the annual energy demand by around 10%. Pasichnyi et al. [21] analyzed building retrofit potential by developing a data-driven building archetype in Stockholm, Sweden. A similar approach was also adopted for residential buildings in Ireland [38]. Roth et al. [5] proposed an augmented UBEM by combining data-driven (open-source) and simulation methods for generating synthetic hourly load curves for city-scale buildings. Though statistical analyses were conducted, the simulated results were not validated. Prataviera et al. [39] integrated a physical UBEM with the sensitivity and uncertainty of the main input parameters obtained from the regional/national statistics in Verona, Italy. The study found that stochastic load profiles obtained from the analysis could significantly improve the simulations compared to the deterministic archetype data, especially the peak load and energy demands in residential buildings. Buckley et al. [40] evaluated the efficiencies of different energy retrofitting policies for residential buildings in Ireland by developing an archetype-based UBEM and running urban modeling interface simulations. Deng et al. [41] introduced automated Building Performance Simulation (AutoBPS) to estimate urban building energy use and analyze potential energy savings, including rooftop photovoltaics. Ang et al. [6] developed a web-based framework (UBEM.io) to quickly develop UBEMs for analyzing energy use and carbon scenarios. This tool requires a high level of technical expertise to accurately collect the archetype data and input the data into the model. In addition, it is not clear how to collect and model several critical parameters such as the window-to-wall ratio (WWR).
A lack of high-quality and open-source data has increased the uncertainty in input parameters (building elements, operation, and geometry) and thus hindered the effective application of UBEMs in many parts of the world, as UBEMs mostly depend on archetype data [39]. In addition, construction year, usage, and refurbishment states are important in energy performance simulations and are often excluded from open datasets [42]. As the data are nearly identical for all buildings of the same class (particularly for non-geometric data), the use of archetypes may reduce variations in modeled energy use compared to the actual data [43]. Cerezo et al. [44] developed UBEMs in several cities using a probabilistic approach for non-geometric parameters of archetypes. Similar to Ang et al. [6], the study did not specify the calculation method for several parameters such as the WWR.

Research Gap and Study Contribution
Developing UBEMs for commercial buildings can be important because they form an energy-intensive sub-segment with considerable energy-saving potential [30]. Several commercial building stocks are used in UBEM analysis including offices, educational buildings, lodgings, warehouses, garages, and public safety and assembly buildings [45]; offices, educational buildings, and hotels [37]; residential and office buildings [32]; office and retail buildings [9]; residential buildings, offices, and hotels [41]; and residential buildings [46]. Moreover, researchers have adopted physics-based approaches in developing UBEMs in different research contexts, in addition to the studies highlighted in Sections 1.1-1.3. For instance, Beltran-Velamazan et al. [47] proposed a national-scale physics-based UBEM based on an energy performance certificate to build a single GIS database in Spain. Kim et al. [48] proposed a novel framework for a physics-based UBEM to model electricity load profiles in commercial building stocks (e.g., offices, hotels, and medical buildings) by considering the building system composition and occupancy profile. Though UBEMs are used globally for commercial buildings, several functionally important commercial buildings such as healthcare facilities, food sales and service buildings, and sales and shopping buildings are typically overlooked in the existing literature. Therefore, this study developed a bottom-up physics-based UBEM for simulating these three commercial building stocks in Pittsburgh, Pennsylvania, as a case study to support potential decarbonization solutions. With this paper, we build on and expand our prior work by focusing on the commercial buildings that were missing from it [45]. In this work, we expanded our archetype library while further enhancing our semi-automatic process for collecting the building envelope properties (such as the WWR), as the WWR is an important geometric parameter for UBEMs [49]. We provided a comprehensive framework for estimating building heights, as these are not embedded in the existing GIS datasets of many regions, including the study location. As validation is a major challenge of UBEMs, our results were validated with actual energy data and/or with statistical analysis. Several energy conservation measures were explored in this study. The adopted methodological framework, including archetype development, data collection procedure, LiDAR analysis for estimating building heights, image collection and processing, 3D building design, simulation, and validation process, can effectively be used for UBEMs in other regions.

Methodology
Developing UBEMs is often complicated, especially when public data are not accessible or available. The availability and quality of these data affect the reliability of UBEM results [50]. To fill this existing gap, a multi-layer process was adopted for creating the UBEM (described in Section 2.5). An enhanced physics-based UBEM approach for commercial buildings developed by this research group was adopted in this study [45].

Selection of the Building Stocks
The three commercial building stocks selected for this study were healthcare facilities, food sales and services, and sales and shopping. The location, Pittsburgh, is in the western part of Pennsylvania within climate zone 5A (cold climate) in the United States (US) [51]. Pittsburgh is a part of the 2030 District Network, where building owners voluntarily commit to a 50% reduction in their energy, water, and transportation emissions by the year 2030 [52]. In our prior work, we modeled a number of commercial building types in Pittsburgh, such as educational buildings, offices, lodgings, warehouses, garages, and public safety and assembly buildings, but not the three stocks studied in this paper [45]. The Department of City Planning provided the latest actual annual energy use data for validating the simulated results. The geospatial data, in GIS format, were collected from the Western Pennsylvania Regional Data Center (WPRDC) [53] for spatial information and analysis of the studied buildings.
In this study, the archetypes were developed based on the construction year and function of the buildings. Three building stocks comprising 48 buildings were used in this UBEM: sales and shopping (40%), healthcare facilities (38%), and food sales and services (22%). Among the selected buildings, 42% of the buildings were constructed before 1980, 44% during 1980-2004, and 14% after 2004. The stocks were further classified based on their function, such as shopping center/mall, strip mall, grocery store, and supermarket within the sales and shopping stock; full- and quick-service restaurants within food sales and services; and out-patient services and hospitals within healthcare facilities (Figure 1). The reason for this subclassification was to have a more detailed and accurate archetype of the stocks, as this can influence the accuracy of the model [54].

Archetype Library
Archetype-based UBEMs are most commonly used due to the data requirements for a large-scale study [21,39,40,44]. In an archetype, buildings are classified according to their functionality and characteristics [32]. In this study, the selected buildings were broadly classified based on function and construction year, as these are the important criteria for developing archetypes in a UBEM study [2,55]. Twenty archetypes were developed for the three commercial building stocks (built during three construction periods: pre-1980, 1980-2004, and post-2004) (Figure 1). The selected building stocks were subclassified considering the differences in certain variables and characteristics (further discussed in Section 3.1), as detailed archetypes can improve the accuracy of the model [54]. Available resources such as standards, building codes, the literature, and surveys, including the Commercial Building Energy Consumption Survey (CBECS), were used for extracting non-geometric information. Depending on the data requirements for modeling and simulation, three sets of non-geometric parameters were collected: occupancy-related parameters, envelope properties, and electrical and mechanical systems. The input parameters corresponding to their sources are shown in Supplementary Figure S1. The collected data were checked and pre-processed (if needed) before being input into the simulation process. An archetype was developed for each subclass (Figure 1) of the studied building stocks. For instance, the occupancy-related parameters, envelope properties, and internal loads were the same for all buildings of a given subclass within each stock. We expected that archetypes based on the subclass (rather than a broader class or stock) would give more accurate results.
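The classification described above can be sketched as a simple lookup keyed by stock, subclass, and construction period. This is a minimal illustration only: the function and class names are hypothetical, the building records are invented for the example, and the period boundaries follow the three construction periods stated in the text.

```python
from dataclasses import dataclass


def construction_period(year_built: int) -> str:
    """Map a construction year to one of the three periods named in the text."""
    if year_built < 1980:
        return "pre-1980"
    if year_built <= 2004:
        return "1980-2004"
    return "post-2004"


@dataclass(frozen=True)
class ArchetypeKey:
    """Hypothetical key identifying one archetype in the library."""
    stock: str     # e.g. "food sales and services"
    subclass: str  # e.g. "quick-service restaurant"
    period: str    # "pre-1980", "1980-2004", or "post-2004"


def archetype_key(stock: str, subclass: str, year_built: int) -> ArchetypeKey:
    return ArchetypeKey(stock, subclass, construction_period(year_built))


# Invented building records (stock, subclass, year built), for illustration only.
buildings = [
    ("sales and shopping", "strip mall", 1975),
    ("sales and shopping", "strip mall", 1979),
    ("healthcare facilities", "hospital", 1992),
]

# Group buildings by archetype; buildings sharing a key share non-geometric inputs.
library = {}
for stock, subclass, year in buildings:
    library.setdefault(archetype_key(stock, subclass, year), []).append(year)

print(len(library), "archetypes for", len(buildings), "buildings")
```

Buildings that map to the same key would receive the same occupancy-related parameters, envelope properties, and internal loads, while geometry remains building-specific.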

Photogrammetry and Image Processing for Estimating the WWR
Though UBEM studies have flourished in recent years, challenges in collecting data still exist. For example, several key and influential envelope properties, notably the WWR, are unavailable in city databases and tedious to collect manually at an urban scale. To fill this gap, Szczesniak et al. [56] proposed an automated façade analysis for collecting the WWR and found around a 20% error between manual and automated methods in 90% of the studied buildings. Mohammadiziazi et al. [45] proposed a semi-automatic method for collecting the WWR. Both approaches used an image-processing technique based on Google's Street View Static (SVS) images. This study adopted the latter, semi-automatic approach, in which photogrammetry (acquiring façade images) and image processing (interpreting images) were used to obtain the WWR. Information on building façades can be collected by obtaining and analyzing aerial or street-level images. Because neighboring buildings often block the view of façades in aerial images of the city, street-level images of façades were used. SVS façade images of each building were collected through the application programming interface (API) designed by Google [57]. Using the SVS API, images can be downloaded in the format required for image processing (e.g., JPEG or PNG) and at the desired resolution, which is not possible through regular Google Street View. Building coordinates (found in the GIS analysis) were input into the SVS API to obtain façade images for every building. This semi-automatic API allows users to remotely control the attributes of an image by changing the vertical angle of the camera (pitch), the horizontal angle of the camera (heading), the field of view (fov), and the resolution (size) to find images with the desired quality. The images can be rotated to examine details, such as the façade's materials. First, the external wall material was identified from the images, and then the compositions and specifications were matched with ASHRAE standards [58][59][60]. Then, the number of floors of each façade was identified, as it is important for thermal zone definition. The obtained images were checked and verified and then transferred into SketchAndCalc, an online area calculator, to measure the total area of the windows and the gross wall area and then estimate the WWR. Following Equation (1), the WWR of building i with n façades was estimated. This process was replicated for all of the selected buildings in each stock.
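The two steps above, requesting a façade image and computing the WWR from measured areas, can be illustrated with the short sketch below. It rests on assumptions: the request URL uses the publicly documented Street View Static API parameters (`location`, `heading`, `pitch`, `fov`, `size`, `key`) with a placeholder key and invented coordinates, and the WWR is taken as the sum of window areas over the sum of gross wall areas across the n façades, which is one plausible reading of Equation (1) rather than a confirmed reproduction of it.

```python
from urllib.parse import urlencode

SVS_ENDPOINT = "https://maps.googleapis.com/maps/api/streetview"


def street_view_url(lat, lng, heading, pitch=0, fov=90,
                    size="640x640", key="YOUR_API_KEY"):
    """Build a Street View Static API request URL (no network call is made)."""
    params = {
        "size": size,                 # image resolution, width x height in pixels
        "location": f"{lat},{lng}",   # building coordinates from the GIS analysis
        "heading": heading,           # horizontal camera angle in degrees
        "pitch": pitch,               # vertical camera angle in degrees
        "fov": fov,                   # field of view in degrees
        "key": key,                   # placeholder; a real API key is required
    }
    return f"{SVS_ENDPOINT}?{urlencode(params)}"


def window_to_wall_ratio(facades):
    """Assumed form of Equation (1): total window area / total gross wall area.

    facades: iterable of (window_area, gross_wall_area) pairs, one per façade,
    in consistent units (for example, areas measured from façade images).
    """
    facades = list(facades)
    total_window = sum(window for window, _ in facades)
    total_wall = sum(wall for _, wall in facades)
    if total_wall <= 0:
        raise ValueError("gross wall area must be positive")
    return total_window / total_wall


# Invented example: one building photographed from two headings.
url = street_view_url(40.4406, -79.9959, heading=90, pitch=10)
wwr = window_to_wall_ratio([(10.0, 100.0), (30.0, 100.0)])
print(url)
print(round(wwr, 3))
```

An alternative definition would average the per-façade ratios instead; the two agree only when all façades have the same gross wall area.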

LiDAR Analysis for Estimating the Building Height
Another important geometric parameter for building energy simulation is the elevation, or building height, which is not embedded in GIS data in many cities, including Pittsburgh. Though the average floor height was used in several UBEM studies (e.g., [30]), it may affect the accuracy of the models [61]. We used a similar process to determine the building height as in [45], but we provide more specific information herein. The light detection and ranging (LiDAR) technique was used to determine the building height in this study [29,38]. The step-by-step procedure for estimating the building height is shown in Figure 2. In the first step, GIS-compatible airborne LiDAR data were collected from the U.S. Geological Survey (USGS) in .las format. In the second step, the collected raw LiDAR data were processed into an LAS dataset (.lasd format), and then the processed LAS datasets (different blocks) were combined to make a Pittsburgh city map. The building footprint shape file was then embedded into the combined blocks in the third step. The file was then further processed to create the two elevation models. In the fourth step, the digital elevation model (DEM) was created, which contains the elevation information of the earth's surface with reference to a specific datum. The digital surface model (DSM) was created in the fifth step, which contains the elevation information of different objects on the earth (i.e., buildings) with reference to the same datum. Thus, the height of objects can be obtained by subtracting the DEM's elevations from the DSM's elevations in the sixth step. The new height model was then filtered in relation to its building footprint. Several random points were assigned to the building footprint (in the seventh and eighth steps) and then averaged to estimate the final height of the building in the ninth step.
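Steps six to nine can be illustrated with a small raster example. The sketch below is not the GIS workflow used in the study: it assumes the DSM and DEM are already available as aligned grids on the same datum, represents the footprint as a list of grid cells, and uses hypothetical names and synthetic elevations throughout.

```python
import random


def building_height(dsm, dem, footprint_cells, n_points=20, seed=0):
    """Average (DSM - DEM) over random sample cells inside a building footprint.

    dsm, dem: 2D lists of elevations on the same grid and datum.
    footprint_cells: list of (row, col) cells that fall inside the footprint.
    n_points: number of random sample points (capped at the footprint size).
    """
    if not footprint_cells:
        raise ValueError("footprint contains no raster cells")
    rng = random.Random(seed)  # seeded for reproducible sampling
    k = min(n_points, len(footprint_cells))
    sample = rng.sample(footprint_cells, k)
    # Object height = surface elevation minus bare-earth elevation.
    heights = [dsm[r][c] - dem[r][c] for r, c in sample]
    return sum(heights) / len(heights)


# Synthetic 6 x 6 grid: flat ground at 300 m with one 12 m tall block.
dem = [[300.0] * 6 for _ in range(6)]
dsm = [row[:] for row in dem]
footprint = [(r, c) for r in range(1, 4) for c in range(2, 5)]
for r, c in footprint:
    dsm[r][c] += 12.0

height = building_height(dsm, dem, footprint)
print(height)
```

On real data, roofs are rarely flat, so the number of sample points and whether to use the mean or another statistic (for example, the median) are design choices that affect the estimate.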

Developing UBEM and the Energy Simulations
The methodological framework for developing the UBEM in this study is shown in Figure 3. Similar to estimating the building height, we provided more detailed information for replication purposes. In Step 1, the latest building footprint shape file of Pittsburgh was collected from the WPRDC. The scope of the study was defined, and the commercial building types were selected in Step 2. After developing an archetype library for the selected building stocks in Step 3 (described in Section 2.2), the envelope properties of each archetype of the studied building stocks were collected and/or identified (e.g., roof, infiltration, etc.) in Step 4. Then, the WWR of the selected buildings was measured, and the number of floors and external wall composition were identified in Step 5 (described in Section 2.3). After that, LiDAR analysis was conducted to estimate the building heights in Step 6 (described in Section 2.4). The 3D building models were designed using the platform SketchUp in Step 7, as 3D geometric models of buildings are fundamental for UBEMs [49]. GIS data (building footprints) were preprocessed and transferred from ArcGIS to SketchUp to accurately model the buildings, so that the models reflect the actual volumetric shapes and orientations. Then, the total heights and number of floors were assigned to model every building separately. The WWR was then assigned to the corresponding building once the 3D model was developed. Windows were evenly assigned to each façade of the model (Step 7 in Figure 3). One thermal zone on each floor was defined to reduce the model complexity and running time [14,45]. The OpenStudio platform (an add-in tool of SketchUp) was used to complete the boundary conditions of the roof, floors, and external walls. Examples of 3D models for the studied stocks are shown in Figure 4. After completing the 3D model and inputting the geometric information, the model was converted to EnergyPlus format (.idf) and imported into EnergyPlus, where non-geometric information was then input into the different thermal zones of the building in Step 8, based on the archetypes of the building stocks (from Step 3). After verifying all inputs (both geometric and non-geometric information) including the selection of weather data, the completed energy models were run to simulate the energy consumption of the selected buildings in each stock in Step 9. The simulated results were then analyzed to identify the pattern of energy consumption in each stock and to validate the UBEM against the collected actual energy consumption data.
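As a small illustration of the final analysis step, the sketch below computes an annual EUI from simulated energy and floor area and compares it with a metered value. The error metric shown (absolute percentage difference relative to the actual value) is an assumption for illustration; the exact validation metric is not stated in this section, and all numbers are invented.

```python
def eui(annual_energy_kwh, floor_area_m2):
    """Energy use intensity in kWh/m2 per year."""
    if floor_area_m2 <= 0:
        raise ValueError("floor area must be positive")
    return annual_energy_kwh / floor_area_m2


def percent_error(simulated, actual):
    """Absolute percentage difference of a simulated value from the actual one."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return 100.0 * abs(simulated - actual) / abs(actual)


# Invented example building: simulated vs. metered annual energy use.
simulated_eui = eui(annual_energy_kwh=550_000, floor_area_m2=1_000)
actual_eui = eui(annual_energy_kwh=500_000, floor_area_m2=1_000)
print(simulated_eui, actual_eui, percent_error(simulated_eui, actual_eui))
```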

Energy Conservation Strategies
The UBEM can effectively be used in calculating the EUIs for large-scale building stocks (for both base EUIs and for implementing the selected energy conservation strategies), as the accuracy of the UBEM results is high (see Tables 1 and 2), and the UBEM also involves straightforward calculations (using the already collected input parameters and developed 3D models) (Figure 3). Considering the EUIs of the existing commercial building sector, enhancing building energy efficiency is of paramount importance, as carbon emissions are directly proportional to energy consumption. For quantifying the energy efficiency potential, it is important to break down the energy consumption across building types and energy systems and also identify the relevant energy efficiency hotspots [62]. Some studies suggested that lighting upgrades are the most effective individual energy conservation measure for commercial buildings, while window upgrades are the most effective for residential buildings [41]. Therefore, this study adopted two energy conservation measures: (i) a lighting upgrade to LEDs, and (ii) plug and process load reduction. As most of the buildings were built before 2004, it is easier and more cost-effective to upgrade lighting systems and install energy-efficient appliances [63] than to implement other measures such as building envelope upgrades, ground-source heat pumps, or fault detection and diagnostics [64]. In addition, LEDs consume up to 90% less energy and last up to 25 times longer than traditional incandescent lighting [65]. Thus, this study assumed that the lighting upgrade would reduce the lighting inputs by 75% (for pre-1980 and 1980-2004 buildings) and by 50% (for post-2004 buildings), and that plug and process loads would be reduced by 15% (for all buildings), following Mohammadiziazi et al. [45]. After modifying the energy models accordingly, the simulations were run again for all the studied buildings.
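The assumed reductions can be expressed as multipliers on the archetype inputs before the models are re-run. The sketch below is illustrative: the reduction fractions come from the text, while the function name, the use of power densities in W/m2 as the modified inputs, and the example values are assumptions.

```python
# Fraction of lighting input removed by the LED upgrade, by construction period.
LIGHTING_REDUCTION = {
    "pre-1980": 0.75,
    "1980-2004": 0.75,
    "post-2004": 0.50,
}
# Fraction of plug and process load removed, applied to all buildings.
PLUG_PROCESS_REDUCTION = 0.15


def apply_ecms(lighting_w_m2, plug_w_m2, period):
    """Return (lighting, plug) power densities after both measures are applied."""
    if period not in LIGHTING_REDUCTION:
        raise ValueError(f"unknown construction period: {period!r}")
    new_lighting = lighting_w_m2 * (1.0 - LIGHTING_REDUCTION[period])
    new_plug = plug_w_m2 * (1.0 - PLUG_PROCESS_REDUCTION)
    return new_lighting, new_plug


# Invented baseline inputs for one pre-1980 and one post-2004 archetype.
print(apply_ecms(16.0, 8.0, "pre-1980"))
print(apply_ecms(10.0, 8.0, "post-2004"))
```

Because lighting and equipment also release heat into the zones, the resulting change in EUI is not simply proportional to these multipliers, which is why the simulations were re-run rather than scaled.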

Estimating the Window to Wall Ratio (WWR) for Individual Stock Is Essential for the UBEM
Because the WWR is important in UBEM studies but is often unavailable in public datasets, measured values were used in this study. The estimated WWRs for the studied building stocks were compared with the values derived from the CBECS 2018 data for the same commercial use of buildings in the cold region of the US [67]. The values were organized according to the CBECS classification for comparative analysis.
For sales and shopping, about 27% of the buildings had a WWR of less than 0.01 according to the CBECS data, compared to only 6% of the buildings measured in this study. However, about 20% of the buildings in the same category had a WWR of 0.02-0.10 according to the CBECS data, which was much lower than the share (39% of buildings) measured in this study. For healthcare facilities, about 23% of buildings had a WWR of less than 0.10 according to the CBECS data, compared to only 6% of the buildings measured in this study. The CBECS data and this study also differed for WWRs of 0.26-0.50: about 21% of the buildings had a WWR in this range according to the CBECS data, compared to 47% of the buildings measured in this study. The measured WWRs for food sales and services revealed that the values from the CBECS survey data are an underestimate (Figure 5). Although a similar trend of WWRs was found in the CBECS survey data and this study when all of the building stocks were combined (Figure 5d), clearly distinct values can be noted for the individual stocks. Therefore, it is suggested to use measured values for UBEM studies, as WWRs are city- or region-specific.
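The comparison above relies on grouping measured WWRs into classes and reporting the share of buildings in each class. A minimal Python sketch of that grouping is given below. The class edges for 0.01, 0.10, and 0.50 follow the ranges quoted in the text, whereas the 0.11-0.25 and above-0.50 classes, the function name `wwr_class_shares`, and the sample values are assumptions made for illustration.

```python
from bisect import bisect_left

# Upper edges of the WWR classes. 0.01, 0.10, and 0.50 come from the ranges
# quoted in the text; 0.25 and the open-ended top class are assumptions.
UPPER_EDGES = [0.01, 0.10, 0.25, 0.50]
LABELS = ["<=0.01", "0.02-0.10", "0.11-0.25", "0.26-0.50", ">0.50"]


def wwr_class_shares(wwrs):
    """Return the percentage of buildings falling in each WWR class."""
    counts = [0] * len(LABELS)
    for wwr in wwrs:
        # bisect_left puts a value equal to an edge into the lower class.
        counts[bisect_left(UPPER_EDGES, wwr)] += 1
    return {label: 100.0 * n / len(wwrs) for label, n in zip(LABELS, counts)}


# Hypothetical measured WWRs for one building stock.
measured = [0.005, 0.08, 0.09, 0.20, 0.30, 0.45, 0.60, 0.35]
shares = wwr_class_shares(measured)
for label, share in shares.items():
    print(f"{label}: {share:.1f}%")
```

Applying the same function to the survey-derived values would give the paired shares that are compared per stock.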

Energy Use Intensity Pattern of the Studied Buildings
This section presents the UBEM results (e.g., annual energy use intensity) for the selected building stocks, which were obtained from the developed archetype data (input parameters, highlighted in Supplementary Figure S1), measured data (e.g., WWRs, building heights), the designed 3D buildings, and modeling and simulation with the EnergyPlus 24.1.0 software. The annual energy use intensity (EUI) pattern for the studied building stocks is shown in Figure S2. The EUI is one of the most common measures for benchmarking building energy performance and is calculated by dividing the total energy consumed by the building in one year by the total gross floor area of the building (EUI, kWh/m²) [68,69]. The EUI is the sum of the energy consumption by the different end uses, such as heating and cooling systems, lighting, ventilation, plug and process systems (various appliances), water systems, and water heating systems, normalized by the total floor area. The estimated annual EUI was 282-676 kWh/m² for sales and shopping, 545-1298 kWh/m² for healthcare facilities, and 2727-3332 kWh/m² for food sales and services.
The EUI composition by end use of the studied building stocks is shown in Figure 6. The composition is defined by three main classes: HVAC, lighting, and equipment. On average, lighting, the HVAC system, and equipment contributed about 34%, 6%, and 59%, respectively, to the total EUI for sales and shopping. The plug load and processes (e.g., internal equipment and appliances), including refrigerators, were the predominant contributors in this category. Lighting also contributed considerably to the energy consumption due to the selection of lighting systems and the density of lighting. Similarly, equipment accounted for the largest share of energy consumption (about 81%) for food sales and services, which can be attributed to high natural gas consumption, the intensity of internal equipment (e.g., cooking appliances and refrigerators), and the corresponding operational schedules. For healthcare facilities, the share was 18% for lighting, 21% for the HVAC system, and 61% for the plug load and processes (e.g., internal equipment, appliances, refrigerators, etc.). Compared to the other two categories, the lighting and HVAC systems contributed to a higher EUI in healthcare facilities due to the operational schedules and the other factors described above.
In addition to the variations found in the simulated energy use pattern among the studied commercial building stocks (Figure 6), considerable variations were also observed within buildings of the same stock. For instance, the contributions of lighting, the HVAC system, and equipment were 18-58%, 1-27%, and 26-82%, respectively, in the category of sales and shopping; 4-7%, 12-14%, and 79-83%, respectively, in the category of food sales and services; and 7-52%, 9-43%, and 12-81%, respectively, in the category of healthcare facilities. Such large variations were mainly attributable to the different functions of buildings within the same category (e.g., outpatient clinics vs. hospitals in the category of healthcare facilities).
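The EUI definition and the end-use shares described above can be written as a short Python sketch. The function names and the example building are hypothetical; the end-use values were chosen only so that the shares resemble the order of magnitude reported for sales and shopping.

```python
def eui_kwh_per_m2(end_use_kwh, gross_floor_area_m2):
    """Annual EUI: total annual energy of all end uses divided by gross floor area."""
    if gross_floor_area_m2 <= 0:
        raise ValueError("gross floor area must be positive")
    return sum(end_use_kwh.values()) / gross_floor_area_m2


def end_use_shares(end_use_kwh):
    """Percentage contribution of each end-use class to the total energy use."""
    total = sum(end_use_kwh.values())
    return {name: 100.0 * kwh / total for name, kwh in end_use_kwh.items()}


# Hypothetical annual end-use energy (kWh) for a 1000 m2 building.
building_kwh = {"lighting": 170_000.0, "hvac": 30_000.0, "equipment": 300_000.0}
eui = eui_kwh_per_m2(building_kwh, gross_floor_area_m2=1_000.0)
shares = end_use_shares(building_kwh)

print(f"EUI = {eui:.0f} kWh/m2")
for name, share in shares.items():
    print(f"{name}: {share:.0f}%")
```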

Variations in the Simulated EUIs in the Studied Buildings
We calculated the frequency distributions of the actual and simulated annual EUIs and the probability distribution function (PDF) of the estimated EUIs (Figure 7). Although similar thermal zones were considered in each building, the simulated EUIs varied considerably among the buildings in each stock. In addition to the variations in lighting systems and equipment described in Section 3.2, solar heat gain and loss through the different WWRs, external wall composition, and orientation of the buildings can contribute to the differences (for HVAC systems) [45,70]. The PDF of the simulated EUIs for the studied building stocks followed a lognormal distribution (Figure 7B). Lower EUIs were more frequent in the food sales and services and healthcare facilities stocks, whereas higher EUIs were more frequent for sales and shopping. This finding is supported by more than 70% of the buildings in this category having annual EUIs within 500-700 kWh/m². The frequency distributions show that the simulated EUIs were concentrated, whereas the actual EUIs were scattered. This is because the UBEM uses archetype data, whereas the inputs of individual buildings vary in the real world. Nevertheless, the average annual simulated EUIs for the studied building stocks were compared with the actual annual EUIs or with the references for the same commercial use types (Table 1). It should be noted that the Department of City Planning of the City of Pittsburgh provided the most recent actual annual energy use data of several commercial buildings to the research group, which were used for comparing and validating the simulated results; data on food sales and services were not provided. The actual energy consumption data from the Department of City Planning were from 2018. Along with data for other building stocks (325 individual buildings in total), actual energy consumption data were provided for 9 healthcare facilities and 7 sales and shopping buildings. The average values were then used for the comparison (Table 1).
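As an illustration of how a lognormal PDF can be fitted to a set of simulated EUIs, the following Python sketch estimates the two parameters from the logarithms of the values (the maximum-likelihood estimates) and evaluates the density. The sample EUIs and the function names are hypothetical, and the fitting procedure used in the study is not described in the text, so this should be read as one possible approach under those assumptions.

```python
import math
from statistics import fmean, pstdev


def fit_lognormal(values):
    """Maximum-likelihood estimates (mu, sigma) of a lognormal distribution."""
    logs = [math.log(v) for v in values]
    return fmean(logs), pstdev(logs)  # pstdev divides by n, as the MLE does


def lognormal_pdf(x, mu, sigma):
    """Density of the lognormal distribution at x (zero for non-positive x)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


# Hypothetical simulated annual EUIs (kWh/m2) for one building stock.
simulated_euis = [310.0, 480.0, 520.0, 560.0, 610.0, 650.0, 670.0]
mu, sigma = fit_lognormal(simulated_euis)
median_eui = math.exp(mu)  # the median of a lognormal distribution is exp(mu)

print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, median = {median_eui:.0f} kWh/m2")
print(f"pdf at median = {lognormal_pdf(median_eui, mu, sigma):.5f}")
```

Plotting `lognormal_pdf` over the EUI range together with the histogram of simulated values gives the kind of overlay described for the frequency distributions.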

Validation of UBEM Results
The reliability of UBEM results can be established through validation against actual data, particularly measured energy data, which are often challenging to collect for large stocks [3,12,14,33]. A possible alternative way to validate UBEM results is to determine the modeling error, which can be estimated from the difference between the simulated and actual energy use arising from data input and simulation engine errors [71]. This difference, expressed as the percent error (PE), was estimated using Equation (2) [45]. The PE was estimated using the aggregated energy use of each use type. $\text{Mean EUI}_{aj}$ is the average annual actual EUI for use type $j$, and $\text{Mean EUI}_{sj}$ is the average annual EUI obtained from the UBEM for use type $j$.

$$\text{Mean PE}_{j} = \frac{\text{Mean EUI}_{aj} - \text{Mean EUI}_{sj}}{\text{Mean EUI}_{aj}} \times 100\% \qquad (2)$$

The UBEM validation results are given in Table 2. The PE for sales and shopping was 9%, compared to 11% for healthcare facilities. The overall acceptable EUI variation is 1-19%, as suggested by the existing literature [14,16]. As the PE for food sales and services could not be calculated because actual data were unavailable, we compared the results with the existing literature. The variation was 8% compared with Zhang et al. [66] for the same climate zone in the US. In addition to calculating the PE, a two-sample Kolmogorov-Smirnov (KS) test was performed to analyze the similarity of the distributions of the simulated and actual EUIs. The KS test is a non-parametric test that gives insight into the statistical difference between two samples [72]. It reports the maximum difference between two cumulative distributions and calculates a p-value from that difference and the sample sizes. Two datasets comprising the actual EUIs and simulated EUIs were used to conduct the KS test separately for sales and shopping and for healthcare facilities using the SPSS 28 software.
The null hypothesis is that the two distributions are not statistically different; it is not rejected when the p-value is greater than the chosen significance level (either 0.05 or 0.01). With a significance level of 0.05, the null hypothesis was not rejected for either sales and shopping or healthcare facilities (Table 2), implying that the distributions of the simulated and actual EUIs were not distinct. Both the PE and KS results support the validity of the UBEM-simulated results and indicate that they can be adopted by Pittsburgh for further energy planning and conservation strategies. The low PEs for both sales and shopping and healthcare facilities were likely due to the research team's experience in characterizing archetypes that closely resemble real-world operation and to the availability of representative input parameters, especially those found in [66,67]. The errors for the two building stocks were similar and could mainly be traced to various occupant-related inputs and internal equipment. Though the errors were already low, they can be further reduced by conducting occupant surveys and determining the inventory of internal equipment with case studies.
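The two validation checks can be sketched in Python as follows. The percent error follows Equation (2). The KS part is a plain re-implementation of the two-sample statistic with the standard large-sample critical value, not the SPSS procedure used in the study, and that approximation is rough for samples as small as those available here. The EUI values and function names are hypothetical.

```python
import math
from statistics import fmean


def percent_error(actual_euis, simulated_euis):
    """Mean PE for one use type, following Equation (2)."""
    mean_actual = fmean(actual_euis)
    mean_simulated = fmean(simulated_euis)
    return (mean_actual - mean_simulated) / mean_actual * 100.0


def ks_statistic(sample_a, sample_b):
    """Maximum distance between the two empirical cumulative distributions."""
    n, m = len(sample_a), len(sample_b)
    d = 0.0
    for x in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(v <= x for v in sample_a) / n
        cdf_b = sum(v <= x for v in sample_b) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value; an approximation, rough for small samples."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.358 for alpha = 0.05
    return c_alpha * math.sqrt((n + m) / (n * m))


# Hypothetical annual EUIs (kWh/m2) for one use type.
actual = [400.0, 500.0, 600.0, 700.0]
simulated = [450.0, 500.0, 550.0, 500.0]

pe = percent_error(actual, simulated)
d = ks_statistic(actual, simulated)
d_crit = ks_critical_value(len(actual), len(simulated))
reject_null = d > d_crit  # False means the distributions are not found to differ

print(f"PE = {pe:.1f}%, D = {d:.2f}, critical D = {d_crit:.2f}, reject = {reject_null}")
```

A positive PE in this formulation means that the model underestimates the mean actual EUI; a statistics package would additionally report a p-value for the same D.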

Energy Conservation Strategies of the Studied Commercial Building Stock
Considering the contribution of lighting and equipment to the total EUI composition of the studied commercial building stocks (Figure 6), this study adopted two energy conservation measures: lighting upgrades to LEDs and plug and process load reduction. The results of the energy conservation strategies considered in this study, including the average percentage of composition by end use (Section 2.6), are shown in Figure 8. About 18% of the total EUI can be saved by upgrading the lighting systems in sales and shopping buildings, whereas 8% can be saved by implementing the plug and process load reduction. The higher reduction in the EUI from upgrading lighting systems can be attributed to the significant energy consumption by lighting in this category (about 34% for the base case) (Figure 6). Combined, these strategies could reduce the total annual energy consumption by about 26%. For food sales and services, the savings from implementing these two strategies were relatively limited (about 17%), as fuel consumption for cooking contributed the greatest portion of the total energy consumption. The lighting system upgrade can reduce the total energy consumption in healthcare facilities by around 10%, whereas the plug and process load reduction saved only 4%. Both strategies combined could reduce the total energy consumption in this category by around 14%. The results suggest that upgrading lighting systems could be an important strategy for achieving the energy reduction goals for commercial buildings in Pittsburgh. Though not covered in this study, there is also potential for substantial economic and environmental benefits from implementing the suggested energy conservation measures.
This study enhanced a step-by-step framework for adopting a UBEM for energy simulation, which can help city planners develop policy and implement energy conservation strategies. The framework has high potential to be scaled up for the large-scale energy simulation of different building stocks, including residential and commercial, and it can be implemented in other cities, regions, and countries (depending on the collection of the required data).

Limitations
As diverse information is needed to develop UBEMs, several uncertainties are associated with both non-geometric and geometric parameters during archetype development. For non-geometric parameters, fixed values were mostly adopted for each stock (or subclass) in this study, although they may vary in practice. For instance, the operation schedules or occupancy rates may vary from building to building even within the same class or subclass. The same limitations apply to the geometric parameters. When estimating the WWR through image processing, images of various façades were collected using the SVS API. Full coverage of the façades of some buildings could not be obtained, and some façade images were not available (not covered by Google). In both cases, the total façade area and number of windows were calculated, and the WWR was estimated by multiplying the number of windows by a similar window size (taken from the accessible façades of the same building). Google Earth was used to check and verify the number and type of windows. For estimating the building height, random sample points were averaged in this study. This approach may yield inaccurate height information, as roofs may be pitched and height variations might exist in different blocks of the building. This problem can be minimized once this information is readily available in the city's GIS dataset. Moreover, inspecting reports on the detailed building design is often impractical due to unavailability or inaccessibility and its time-consuming nature. Calibration is also often very difficult, especially because detailed information on each building must be collected for a large and diverse stock [73]. Chen et al. [7] suggested using a range (upper and lower limits) for the key input parameters of each building for UBEM calibration.
For UBEMs, a detailed model significantly increases the time required for development, especially for data collection and for carrying out simulations. Thus, simplifications are preferred in 3D model development, data collection, and simulation. For instance, modeling and simulating a single floor to represent the other floors of a high-rise building would reduce the difficulty of creating the model and shorten the simulation time, while developing the building databases required for simulation would also considerably reduce the time [74-76].
Sensitivity analysis is important for evaluating the effect of input parameter uncertainties on UBEM simulation results. Prataviera et al. [39] proposed a three-phase methodology for uncertainty and sensitivity analysis, comprising (i) identifying the key uncertain input parameters (operational, geometrical, and physical parameters) and characterizing them with probability distribution functions, (ii) uncertainty analysis based on Monte Carlo sampling, and (iii) sensitivity analysis on the simulation outputs. Though sensitivity analysis was not conducted in this study, it is recommended to ensure the robustness of the simulation results.
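The three phases can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: `toy_eui` is a stand-in formula rather than an EnergyPlus simulation, the input distributions and their parameters are invented, and the sensitivity measure (Pearson correlation between each input and the output) is only one simple choice among many; it is not the procedure of Prataviera et al. [39].

```python
import random
from statistics import fmean, pstdev


def toy_eui(lighting_w_m2, equipment_w_m2, hours):
    """Stand-in model (NOT EnergyPlus): internal loads times operating hours, in kWh/m2."""
    return (lighting_w_m2 + equipment_w_m2) * hours / 1000.0


def pearson(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    mean_x, mean_y = fmean(x), fmean(y)
    cov = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
inputs = {"lighting_w_m2": [], "equipment_w_m2": [], "hours": []}
outputs = []

for _ in range(2000):
    # Phase (i): hypothetical probability distributions for the uncertain inputs.
    draw = {
        "lighting_w_m2": rng.uniform(8.0, 16.0),
        "equipment_w_m2": rng.gauss(20.0, 3.0),
        "hours": rng.triangular(3000.0, 6000.0, 4000.0),  # low, high, mode
    }
    for name, value in draw.items():
        inputs[name].append(value)
    # Phase (ii): Monte Carlo propagation through the (toy) model.
    outputs.append(toy_eui(**draw))

# Phase (iii): a simple correlation-based sensitivity measure per input.
sensitivity = {name: pearson(values, outputs) for name, values in inputs.items()}

print(f"EUI mean = {fmean(outputs):.0f} kWh/m2, std = {pstdev(outputs):.0f} kWh/m2")
for name, r in sensitivity.items():
    print(f"{name}: r = {r:.2f}")
```

In a real UBEM workflow, each draw would correspond to one simulation run, so the number of samples has to be balanced against the simulation time discussed above.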

Conclusions
Based on the bottom-up engineering approach, a UBEM was developed in this study for simulating three commercial building stocks in the City of Pittsburgh. The major conclusions of this study are as follows:

• Considerable variations were found between the measured WWRs of the individual building stocks and the latest CBECS survey data. For instance, WWRs of 0.02-0.50 accounted for 83% of buildings in the survey data, compared to 93% of the studied building stocks.
• The simulated annual EUI ranged from 282 to 3332 kWh/m² for the studied building stocks depending on the type of use. Lower EUIs were found for sales and shopping, while much higher ones were found for food sales and services.
• More than 70% of the buildings had annual EUIs within 500-700 kWh/m² for sales and shopping, about 70% within 2600-2900 kWh/m² for food sales and services, and about 65% within 600-1000 kWh/m² for healthcare facilities.
• Validating the simulated results with the actual data showed a 9% and 11% PE for sales and shopping and healthcare facilities, respectively. The KS results also demonstrated the validity of the UBEM-simulated results for the studied stocks.
• Lighting system upgrades together with energy-efficient appliances could reduce the annual EUI by 26%, 17%, and 14% for sales and shopping, food sales and services, and healthcare facilities, respectively.
The simulated results can be considered in city planning in the City of Pittsburgh, especially for achieving the energy and carbon reduction goals for commercial buildings. The limitations highlighted in this study may be addressed in future studies for higher simulation accuracy. In addition, the UBEM may be integrated with urban climate models, urban energy systems, thermal comfort models and urban mobility models [12], materials stock and flow analysis (for adopting a circular economy) [77], life cycle assessment [5], and future climate scenarios for better energy use prediction and management. With emerging data mining techniques, the data-driven approach is often preferred over the engineering- or physics-based approach because of the latter's higher variance, complex and time-consuming modeling, and lack of validation/calibration [19,78]. The integration of both approaches (e.g., a hybrid model) could significantly enhance UBEM performance [12,35]. Input parameters such as occupant behavior can significantly influence the UBEM simulation results. As fixed occupant-related inputs (e.g., occupancy rate, schedules, etc.) are mostly employed in existing studies, the uncertainty that these inputs introduce into the simulated results should be a focus of future studies [79]. Although fixed schedules may be sufficient for large-scale simulation, they may influence the results for small groups of buildings [80]. It would also be interesting to compare the simulation results of this UBEM with those of other methods.

Figure 1. Information on the three building stocks for this study: sales and shopping, healthcare facilities, and food sales and services. (A) Year of construction, (B) the percentage of studied buildings, and (C) further classification of the building stock.

Figure 2. LiDAR analysis for estimating the building height.

Figure 3. Framework adopted for developing the UBEM in this study.

Figure 4. Examples of 3D models for the studied stocks.

Figure 5. Comparison of the measured window-to-wall ratios (WWRs) of the studied commercial buildings with the Commercial Building Energy Consumption Survey (CBECS) data [59].

Figure 6. End-use energy use intensity (EUI) composition of the studied building stocks.

Figure 7. (A) Frequency distributions of simulated and actual annual energy use intensities (EUIs); (B) PDF and frequency distribution of simulated annual EUIs.

Figure 8. Energy reduction potential for the selected energy conservation strategies.

Table 1. Comparative annual EUIs for the studied building stocks and the references.
According to Zhang et al. [66]; actual data on food sales and services were not available.
* Not carried out due to unavailability of actual data for testing; 0: null hypothesis not rejected; 1: null hypothesis rejected.