Streamlining Building Energy Modelling Using Open Access Databases—A Methodology towards Decarbonisation of Residential Buildings in Sweden

The building sector is a major contributor to greenhouse gases, consuming significant energy and available resources. Energy renovation of buildings is an effective strategy for decarbonisation, as it lowers operational energy and avoids the embodied impact of new constructions. To be successful, the energy renovation process requires meaningful building models. However, the time and costs associated with obtaining accurate data on existing buildings make large-scale evaluations unrealistic. This study proposes a methodology to streamline building energy models from open-access datasets for urban scalability. The methodology was tested on six case study buildings representing different typologies of the Swedish post-war construction period. The most promising results were obtained by coupling OpenStreetMap-sourced footprints with energy performance declarations and segmented archetypes for building characterisation. These significantly reduced simulation time while retaining similar accuracy. The suggested methodology streamlines building energy modelling with a promising degree of automation and without the need for input from the user. The study concludes that municipalities and building owners could use such a methodology to develop roadmaps for cities to achieve carbon neutrality and evaluate energy renovation solutions. Future work includes achieving higher accuracy of the generated energy models through calibration, performing renovation analysis, and upscaling from individual buildings to neighbourhoods.


Introduction
Globally, the building sector accounts for 40% of energy use, corresponding to 36% of global greenhouse gas emissions in Europe [1]. Roughly three-quarters of buildings in the EU are energy inefficient, yet 85-95% of today's buildings will still be in use in 2050 [2], when the EU aims to achieve climate neutrality [3]. More than one-third of the EU's buildings are over 50 years old, but the renovation rate is lower than 1% per year [3,4]. Sweden aims to reach climate neutrality by 2045 [5]. Moreover, 23 Swedish pioneer municipalities, together accounting for 40% of Sweden's population, are even more ambitious and aim for that goal by 2030 [6]. As in many European cities, many municipalities in Sweden experienced the rapid growth of residential neighbourhoods during the post-war period. Those extensive neighbourhoods now need renovation. Thus, large-scale energy renovation opens a window of opportunity to reduce building operational energy and, therefore, help to decarbonise Swedish cities.
Making an informed decision on building renovation is challenging [4]. Stakeholder interests, combined with uncertainty regarding the final performance, often impede deep energy renovations [4,7,8,9]. Building energy modelling is advantageous in assessing the impact of different renovation scenarios through numerical simulations, thereby providing evidence to the debate among decision makers [4,10,11]. For energy renovations, energy models of large building stocks are particularly beneficial for both increasing the actual renovation rate of buildings and evaluating building impacts on a larger scale [12].
Numerous methodologies and tools have been developed to scale energy modelling from single buildings to urban models [13]. Urban building energy models (UBEM) provide users with the energy demand of the building stock, including baseline calibration [14], energy efficiency scenario evaluation, and other important analyses such as the economic and environmental impact during the future life cycle of the buildings [15][16][17][18].
Urban building energy modelling has been thoroughly reviewed [19][20][21] and is usually divided into two modelling approaches: top-down and bottom-up. Top-down urban models describe buildings at a general level, using large datasets that are not building-specific. These datasets contain factors that drive the energy performance of buildings [22,23], such as technical [9,24], socio-economic [25][26][27], or physical factors [28]. However, these datasets are based on reported data from the past, making top-down UBEMs useful for large-scale assessments but not ideal for predicting future scenarios. Further, top-down building modelling lacks the granularity needed to support detailed numerical simulations. Bottom-up building modelling, on the other hand, uses building-specific models that can support detailed numerical simulations to predict the future energy performance of buildings. The energy modelling can be accomplished through statistical methods or by actual thermal modelling of the buildings. Both methods are driven by databases on building constructions, such as envelope materials, window structure, HVAC systems and user behaviour. However, suitable databases are often non-existent or difficult to access, and energy modelling knowledge is needed to process them.
Organising the input for simulation is the key focus of this paper, as it is considered the major obstacle to obtaining bottom-up urban models. The urban modelling process is time-consuming and complex, even for an expert user. To obtain a reliable energy model, the process is commonly broken down into three main subtasks: (1) organise the input for the simulation, (2) generate and run the thermal model, and (3) validate the results (calibration) [29]. This paper pays little attention to the second and third subtasks, which are already well established and supported by numerous studies. Rather, it aims to simplify the first subtask, which currently limits large-scale assessments.
In recent years, bottom-up urban models have gained momentum [30]. Generally, a bottom-up UBEM is built by merging large datasets that describe buildings and user patterns. It is common to use available geographical information systems (GIS) datasets to generate building geometries, often referred to as 3D city models [31][32][33]. In Sweden, however, the standard for generating and managing 3D city models is still under development [31,34], which is a major barrier to automating UBEM generation. Moreover, due to the lack of a standard, access to 3D city models is not centralised, and each city decides on the level of detail and the tools used to generate these models [31,33,34].
Unlike 3D urban models, building footprints are freely accessible worldwide through Google Maps, OpenStreetMap or other web mapping platforms. The OpenStreetMap project is a global open GIS database that offers a free download service for building footprints in the OSM file format. In Sweden, Lantmäteriet, the Swedish mapping, cadastral and land registration authority, offers public geodata services that include building footprint datasets, typically as shapefiles (SHP) [35]. Those datasets can be partially obtained on the website. Full access is given for research through the Geodata Extraction Tool (GET) [36] offered by the Swedish University of Agricultural Sciences (SLU).
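As a minimal sketch of how building footprints could be read from an OSM XML extract, the example below uses only the Python standard library. The embedded sample data, the coordinates and the function name read_building_footprints are illustrative assumptions and are not part of the workflow described in this paper. Buildings mapped as multipolygon relations are ignored for brevity.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written OSM XML sample (illustrative only, not real map data).
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="55.7000" lon="13.2000"/>
  <node id="2" lat="55.7000" lon="13.2010"/>
  <node id="3" lat="55.7005" lon="13.2010"/>
  <node id="4" lat="55.7005" lon="13.2000"/>
  <way id="100">
    <nd ref="1"/><nd ref="2"/><nd ref="3"/><nd ref="4"/><nd ref="1"/>
    <tag k="building" v="apartments"/>
    <tag k="addr:street" v="Example Street"/>
  </way>
  <way id="200">
    <nd ref="1"/><nd ref="2"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>"""


def read_building_footprints(osm_xml):
    """Return closed ways tagged 'building' as rings of (lat, lon) tuples."""
    root = ET.fromstring(osm_xml)
    # Index all nodes by id so that way references can be resolved.
    nodes = {
        n.get("id"): (float(n.get("lat")), float(n.get("lon")))
        for n in root.findall("node")
    }
    footprints = []
    for way in root.findall("way"):
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        refs = [nd.get("ref") for nd in way.findall("nd")]
        # A footprint is a closed ring: first and last node are identical.
        is_closed = len(refs) >= 4 and refs[0] == refs[-1]
        if "building" in tags and is_closed:
            footprints.append(
                {"id": way.get("id"), "tags": tags,
                 "ring": [nodes[r] for r in refs]}
            )
    return footprints


footprints = read_building_footprints(SAMPLE_OSM)
for fp in footprints:
    print(fp["id"], fp["tags"].get("building"), len(fp["ring"]), "points")
```

The coordinates returned here are geographic (latitude, longitude); a projection to a metric coordinate system would be needed before computing areas or distances.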
Building geometry can be modelled by "slicing" a 3D solid or by extruding the building footprint. Figure 1 summarises the slicing (A) and extrusion (B) methods graphically. The first modelling method (A) employs existing 3D solid models, such as CityGML or CityJSON, to create building solids. These 3D models represent the geometry, topology, semantics, and appearance of real-world city objects, such as buildings, in a digital format. Building solids can also be constructed from other types of geospatial data, such as point clouds based on laser scanning (LIDAR) or raster datasets (aerial photographs). Point clouds are dense collections of 3D points that can be used to reconstruct the 3D geometry of buildings. Raster datasets, such as aerial photographs, can be used to extract building footprints and other features. With this method, if thermal zones are required for each level, the solid is sliced to define the different floors.
The second modelling method (B) uses the building footprint, which represents the outline of the building at ground level, to extrude each level and stack them on top of each other. This method can be used when the detailed 3D geometry of a building is not available. The choice between the two methods depends on the level of detail required for the simulation and the available dataset. The CityGML [37][38][39] modelling standard organises the level of detail (LOD) of the building geometry into four consecutive groups (LOD0-3). LOD0 represents the lowest level of detail, with only the building's footprint available. LOD1 represents a simplified exterior volume with a flat roof [34,40], and it is typically used for thermal modelling of buildings [21,38,41,42]. LOD2 and LOD3 represent increasingly detailed exterior and interior geometries, respectively. The use of LOD1 allows for a balance between computational efficiency and accuracy of the simulation results [40,43].
Energy models from building footprints are often more convenient, as they require less post-processing to become a simulation-friendly thermal model than models built with the first method. This is because the building footprint already represents the outline of the building at ground level and can be directly used to generate the thermal model. In contrast, the detailed 3D geometry of the building may require additional processing steps, such as simplification or cleaning, before it can be used for simulation.
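The extrusion method (B) can be illustrated with a short sketch, assuming a footprint given as planar (x, y) coordinates in metres, a uniform floor-to-floor height and a flat roof (LOD1). The function names and the example dimensions are hypothetical and serve only to show how stacked floors follow from a single footprint.

```python
def polygon_area(ring):
    """Shoelace area (m2) of a simple polygon given as (x, y) points."""
    n = len(ring)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def extrude_floors(footprint, storeys, floor_height):
    """Stack identical floors on top of each other (flat-roof LOD1 block)."""
    area = polygon_area(footprint)
    return [
        {
            "level": level,
            "z_base": level * floor_height,
            "z_top": (level + 1) * floor_height,
            "floor_area": area,
            "outline": footprint,
        }
        for level in range(storeys)
    ]


# Hypothetical 12 m x 30 m slab block with three storeys of 2.7 m each.
footprint = [(0.0, 0.0), (30.0, 0.0), (30.0, 12.0), (0.0, 12.0)]
floors = extrude_floors(footprint, storeys=3, floor_height=2.7)
total_floor_area = sum(f["floor_area"] for f in floors)
building_height = floors[-1]["z_top"]
print(total_floor_area, round(building_height, 2))
```

Each dictionary in the returned list corresponds to one potential thermal zone per level, which is the level of subdivision that the extrusion method provides without further input.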
For building energy modelling, the thermal characterisation of existing construction, active systems and usage patterns is often a complex task. At present, manual inspection or the use of databases is necessary to obtain the thermal model of the building. When moving to urban models, thermal characterisation becomes more complicated, as the amount of data to be filled in becomes too large.
At the European level, there are several databases that can provide valuable modelling input data. These include the EU Building Stock Observatory (BSO) [44], the TABULA and EPISCOPE research projects [45,46], and Energy Performance Certificates (EPC) [47][48][49]. Additionally, EPCs can be used for benchmarking the energy performance results of the UBEMs.
In Sweden, there are several databases that facilitate characterisation of the existing residential building inventory. These include the BETSI database, which was created following a survey on the technical status of 10,000 buildings by the Swedish National Board of Housing during the heating season in 2007-2008. The building selection process in the BETSI study was designed to be representative of the residential building stock in Sweden. Surveyed buildings were split between single-family houses and multi-family buildings. Within the latter group, a segmentation was made according to the year of construction: before 1960, 1961-1975, 1976-1985, 1986-1995 and 1996-2005. Additionally, each age group was subdivided into seven building typologies, resulting in 35 archetypes. BETSI provides valuable data on, among other things, building dimensions, construction details, HVAC systems, type and percentage of openings according to facade orientation, roof and basement typology, as well as the need for renovation and work performed to date. Additionally, the cross-industry program for standardisation and verification of the energy performance of buildings, SVEBY [50], has conducted multiple studies on Swedish users' patterns and input values [51] for accurate energy modelling of the Swedish building stock.
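The age and typology segmentation described above can be sketched as a simple lookup key. The typology labels T1 to T7 are placeholders, since the seven BETSI typologies are not named here, and assigning the year 1960 to the first age group is an assumption made only so that the bands are contiguous.

```python
# Upper bound (inclusive) of each age group, in chronological order.
# Assumption: buildings from 1960 fall in the first group.
AGE_GROUPS = [
    (1960, "before 1960"),
    (1975, "1961-1975"),
    (1985, "1976-1985"),
    (1995, "1986-1995"),
    (2005, "1996-2005"),
]

# Placeholder labels; the actual typology names are not given in the text.
TYPOLOGIES = ["T1", "T2", "T3", "T4", "T5", "T6", "T7"]


def age_group(construction_year):
    """Return the age group label for a construction year."""
    for upper_bound, label in AGE_GROUPS:
        if construction_year <= upper_bound:
            return label
    raise ValueError("construction year is outside the surveyed range")


def archetype_key(construction_year, typology):
    """Combine age group and typology into one archetype identifier."""
    if typology not in TYPOLOGIES:
        raise ValueError("unknown typology label")
    return (age_group(construction_year), typology)


all_keys = [(label, t) for _, label in AGE_GROUPS for t in TYPOLOGIES]
print(len(all_keys))              # five age groups times seven typologies
print(archetype_key(1968, "T3"))
```

A key of this kind could then index a table of average archetype values, such as floor-to-floor heights or envelope properties.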
However, current approaches for building energy modelling at both the building and neighbourhood level are not yet ready to provide fully automated energy modelling. This is due to limitations in data management and interoperability, and a lack of suitable methods for large-scale assessments that are technologically ready to be implemented. These limitations have been recognised by several studies [52][53][54][55][56].
This research aims to fill this gap by providing a new methodology to generate building energy models for multi-dwelling residential buildings or groups of buildings in Sweden. The target group of the methodology is users without prior knowledge of energy modelling, such as building owners and municipalities. To ensure ease of use, priority was given to technological and interoperability readiness, as well as free access to the tools and datasets employed. To demonstrate the effectiveness of the proposed methodology, six case study buildings were selected from different geographical locations in Sweden, built during the period 1961-1975, and representing different building typologies.

Methods
This section describes the proposed energy modelling methodology. Section 2.1 describes the overall modelling workflow and the flexible input process whereby the user can customise the default generated building energy model as needed. Section 2.2 describes the generation of the building geometry, which is then thermally and functionally characterised into a suitable energy model, as described in Section 2.3. Finally, the selected case study buildings are presented in Section 2.4.

Overall Modelling Workflow
Figure 2 summarises the entire modelling process, describing the script developed to automate a building energy model using the Grasshopper visual programming environment coupled with the CAD modeller Rhinoceros 3D. The process starts by asking the user to define specific data. When input data are unknown or unavailable, characterisation is made through a predefined age and typology archetype segmentation from a database pool, including geographic information systems (GIS), thermal properties and energy performance certificate (EPC) datasets.
The Grasshopper script automates the process of finding the closest building footprint to a selected address or building. The building footprint is defined as a closed polygon with embedded metadata for each point, such as geographical coordinates, building, block, neighbourhood and others, following predefined attribute categorisation of the map features [57]. Although the amount and quality of the metadata associated with each building footprint in OpenStreetMap may vary, points defining building entrance and address are generally available. The address for each staircase is used to access the building's corresponding energy performance certificates (EPC).
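The footprint lookup can be sketched as a nearest-polygon search. The example below is not the Grasshopper implementation; it is a stand-alone illustration that assumes the address point and the footprints are already projected to planar coordinates in metres. The function names and the two example footprints are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_footprint(p, ring):
    """Distance from p to the boundary of a closed polygon."""
    n = len(ring)
    return min(
        point_segment_distance(p, ring[i], ring[(i + 1) % n])
        for i in range(n)
    )


def closest_footprint(p, footprints):
    """Return the key of the footprint whose boundary is nearest to p."""
    return min(footprints, key=lambda k: distance_to_footprint(p, footprints[k]))


# Two hypothetical rectangular footprints and one address point.
footprints = {
    "A": [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)],
    "B": [(30.0, 0.0), (40.0, 0.0), (40.0, 5.0), (30.0, 5.0)],
}
address_point = (12.0, 2.0)
print(closest_footprint(address_point, footprints))
```

Measuring the distance to the polygon boundary rather than to its centroid is a design choice that suits entrance points, which typically lie on or near the building outline.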
The data obtained from the EPCs are used to adjust the geometrical model, which is built from either OSM or SHP footprints. If more precise data are not available from the user, the number of storeys is obtained from the EPC, and each level's height is an average value obtained from the BETSI database.
The thermal characterisation of the building is then performed using segmented archetypes related to the building's age and typology. Once the numerical model is ready, it can be stored as a JSON file or sent directly to EnergyPlus for simulation through OpenStudio, using different energy modelling plug-ins for Grasshopper 3D.

Building Geometry
For the building geometry construction, average floor-to-floor heights for each level (ground floor, intermediate floor(s) and last floor) and the window-to-wall ratio (WWR) for each façade were taken from BETSI's archetype. Once the whole building geometry was generated, the resulting total heated floor area was benchmarked against the heated area defined in the EPCs. The geometry model was then adjusted to match the EPC total heated floor area. The offset to be applied to the building footprint was obtained with a single-objective optimisation algorithm. This allowed the fitting to be automated even for complex building footprint geometries. Figure 3 provides an overview of the process for generating the building geometry.
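The area-fitting step can be sketched as a one-dimensional search for the footprint offset. The sketch below substitutes a plain bisection for the single-objective optimisation algorithm used in the study, and it relies on the closed-form area of a mitred polygon offset, A(d) = A + P·d + d²·Σ tan(θ_i/2), where A is the footprint area, P its perimeter and θ_i the signed turning angle at each vertex. This formula holds only for small offsets that do not change the polygon topology. The function names, the search bounds and the sign convention (positive offsets enlarge the footprint) are assumptions.

```python
import math


def footprint_terms(ring):
    """Area, perimeter and corner term of an open ring of (x, y) points."""
    n = len(ring)
    signed, perimeter = 0.0, 0.0
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        signed += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    orientation = 1.0 if signed > 0 else -1.0
    corner_sum = 0.0
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = ring[i - 1], ring[i], ring[(i + 1) % n]
        ax, ay, bx, by = x1 - x0, y1 - y0, x2 - x1, y2 - y1
        # Signed turning angle: positive at convex, negative at reflex corners.
        turn = orientation * math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        corner_sum += math.tan(turn / 2.0)
    return abs(signed) / 2.0, perimeter, corner_sum


def fit_offset(ring, storeys, epc_heated_area, lo=-2.0, hi=2.0, iterations=60):
    """Offset (m) for which storeys x offset footprint area matches the EPC."""
    area, perimeter, corner_sum = footprint_terms(ring)

    def residual(d):
        return storeys * (area + perimeter * d + corner_sum * d * d) - epc_heated_area

    if residual(lo) > 0 or residual(hi) < 0:
        raise ValueError("EPC area cannot be matched within the offset bounds")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical L-shaped footprint, two storeys, illustrative EPC area.
l_shape = [(0, 0), (20, 0), (20, 10), (10, 10), (10, 20), (0, 20)]
print(round(fit_offset(l_shape, storeys=2, epc_heated_area=552.72), 3))
```

In this illustrative case the modelled area (2 × 300 m²) exceeds the EPC value, so the fitted offset is negative, that is, the footprint is shrunk inwards.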
The datasets explored for generating building footprints included the cadastre building footprints from building permits (SHP files) and OpenStreetMap footprint points with embedded map features (OSM files); both are open access and freely available. Additionally, a surface model was used from Airborne Laser Scanning (LAS/LAZ files), although only for verification of significant deviations of volume and building height, not supporting the building geometry modelling, as further explained in the paragraph below.
Different tools and processes were explored to generate the building geometry from open access geodata. After an exhaustive search in the two main Grasshopper 3D tool libraries [58,59], 49 tools were identified, out of which 39 were open access and thus further investigated. After testing, the Elk, Urbano and Volvox Grasshopper plug-ins were incorporated. Table 1 describes the selected GIS tools. As listed in Table 1, Elk was selected for point and metadata extraction from OSM (OpenStreetMap) files due to its simplicity and good operability. Urbano adds automation features to the OSM download and supports both Shapefile and Lidar formats, which can be imported and positioned on the correct coordinates using a translation vector, referencing data previously sourced from OpenStreetMap. Volvox was employed to manually validate the volume and height of the geometries obtained. As shown in the graphical summary in Figure 4, the different building geometries (A) were overlaid on the "real" 3D surface model obtained from the Airborne Laser Scanning (LIDAR) dataset (B) to ensure, through manual inspection (C), that there were no significant discrepancies. Differences in height between the generated geometry and point-cloud dataset across the six neighbourhoods investigated did not exceed 1.3 m, while differences in volume were up to 46% greater in neighbourhood F (Ällingavägen), with L-shaped buildings. The building geometry generation using Elk and Urbano was considered reasonably accurate. Due to its complexity, the Volvox plug-in was not integrated into the modelling automation. It is reserved for verification of specific cases where the discrepancies between EPC surfaces and the model obtained are considerable. Therefore, Volvox and point-cloud generated surfaces are not evaluated in the Results and Discussion sections.

Building Thermal Model
The next step in the modelling process was to assign the envelope characteristics that influence the energy performance of the building. Following the previously mentioned process, average values from the archetype database were taken according to the year of construction of the building and its building typology. Table 2 summarises the envelope thermal properties assigned to the model. The affected building elements, type of input and source used are detailed. Once the static properties of the energy model are defined, dynamic characterisation of the model is needed to run an initial dynamic simulation. The building program and loads used to define the operation model of the building are summarised in Table 3.
The data sources used in this case were taken from official recommendations for the simulation input of residential buildings in Sweden (Boverket, Sveby). Data related to the operation of the building were based on several previous measurements [60]. These were considered independent of the age and typology of the building. Since there are no data from infiltration tests in the databases, an average value for the infiltration rate was used. The type of ventilation (natural exhaust, forced exhaust, supply and exhaust, or supply and exhaust with heat recovery) was taken from the EPC of each specific building.
The Grasshopper visual programming platform in the Rhinoceros 3D modeller was used to implement an all-encompassing script which automated the BEM generation within one environment. Moreover, this allowed for an optimisation process with reasonable time allocation per energy simulation iteration, as no manual post-processing was needed between different tools. The building model was obtained using Dragonfly (https://doi.org/10.3390/en14185931) and Honeybee from the Ladybug Tools Grasshopper plug-in, connecting with the above input and assigning the specific weather data. Several measures were evaluated to simplify the model and, thus, reduce simulation time for each iteration. Keeping reasonable time per iteration was needed to allow scalability of the model in future work, both for evaluating more buildings at the urban level and assessing many renovation scenarios for each building. Some of the simplification measures that were considered are illustrated in Figure 5:
A. The study evaluates the effect of using one thermal zone per floor instead of one per individual dwelling unit, as the necessary detailed plans for each floor are not available;
B. Windows obtained as a ratio of the exterior façade (window-to-wall ratio) could be modelled as a single window per level and façade or distributed evenly;
C. Depth or thickness of the existing façade and its relative position to the window was also evaluated;
D. Impact of considering different radius distances for the modelling of the context (e.g., other buildings) around the building.
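Simplification measure B can be illustrated with a short sketch that derives window dimensions from a window-to-wall ratio in two ways: one centred window scaled proportionally to the façade, or several equal windows of a chosen height. Both variants give the same glazed area. The function names, the proportional scaling rule and the example dimensions are assumptions for illustration and do not reproduce the Honeybee components used in the study.

```python
import math


def single_window(facade_width, floor_height, wwr):
    """One window per level and facade, with the facade's proportions."""
    if not 0.0 < wwr < 1.0:
        raise ValueError("window-to-wall ratio must be between 0 and 1")
    scale = math.sqrt(wwr)  # scaling both sides by sqrt(wwr) scales area by wwr
    return facade_width * scale, floor_height * scale


def distributed_windows(facade_width, floor_height, wwr, count, window_height):
    """Several equal windows whose total area matches the same ratio."""
    if not 0.0 < wwr < 1.0:
        raise ValueError("window-to-wall ratio must be between 0 and 1")
    target_area = wwr * facade_width * floor_height
    window_width = target_area / (count * window_height)
    if window_height > floor_height or count * window_width > facade_width:
        raise ValueError("windows do not fit on the facade")
    return [(window_width, window_height)] * count


# Hypothetical facade: 20 m wide, 2.7 m floor height, WWR of 25%.
w, h = single_window(20.0, 2.7, 0.25)
windows = distributed_windows(20.0, 2.7, 0.25, count=5, window_height=1.5)
print(round(w * h, 3), round(sum(a * b for a, b in windows), 3))
```

The single-window variant produces fewer surfaces per thermal zone, which is the reason such a simplification can reduce simulation time per iteration.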

Case Studies
The selected neighbourhoods are from the post-war construction period (1961-1975), since they contain a significant share of the building stock in Sweden and need renovation. For this period, three out of the seven different building typologies available were selected, including different numbers of storeys. Those typologies were selected based on their statistical representativeness within the construction period. However, the archetype characterisation seen in the previous Building Thermal Model section was averaged for the entire age group, disregarding building typologies. In order to limit uncertainty, the selected buildings are in the same climatic zone (south of Sweden), do not have a basement, and have no upgraded forced exhaust ventilation system. No further data were available for the buildings besides those obtained from the databases mentioned above. A synopsis of the selected case studies can be found in Table 4.

Results
The results of implementing the methodology for geometry and thermal modelling of the case studies are described in Sections 3.1 and 3.2, respectively.

Geometry Modelling of Case Studies
The generation of building geometry by using building footprints from OpenStreetMap (OSM) and cadastral shapefiles (SHP) was successful in all neighbourhoods. Table 5 displays the results obtained. When using OSM footprints, the resulting areas were generally larger, with differences ranging from 3.5% to 21% compared to the area obtained from the energy performance certificates (EPC). The offset needed to adjust building footprints and match the EPC's total heated floor area ranged from 32 to 123 centimetres.
Differences in the resulting heated floor area (HFA) were smaller when using SHP building footprints. No adjustment to the building footprint was necessary for neighbourhood B, as the area was practically the same as that found in the EPC. However, in two cases, neighbourhoods A and C, the area difference was 16.3% and 11.4%, respectively.
Figure 6 compares the distribution of area difference between the model and EPC, the offset applied to the building footprint to fit EPC area, and the relative compactness ratio of resulting geometries from OSM and SHP footprints.
The total area difference boxplots show that the use of shapefiles resulted in geometries with an area difference of less than 16% compared to EPC. Most results were clustered between 7% and 11%. With the OSM dataset, the area difference reached a maximum of 21%, with a wider dispersion of results. The offset required for area adjustment with the SHP dataset ranged from no adjustment to a maximum of 80 cm, with a concentration of results in the 30-60 cm range. In contrast, the offset with OSM ranged from 30 cm up to 1.2 m.
The different complexities of the buildings affected both datasets similarly, but a more compact range of the results of total area difference and offset required was observed using cadastral shapefiles.
Figure 6 compares the distribution of area difference between the model and EPC, the offset applied to the building footprint to fit EPC area, and the relative compactness ratio of resulting geometries from OSM and SHP footprints.
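As an illustration of the area-fitting step summarised above, the following minimal Python sketch computes the area difference against the EPC and the uniform footprint offset needed to close it. It is not the Grasshopper implementation used in the study: the function names, the example footprint, and the EPC value are hypothetical, and the closed-form offset assumes a rectilinear footprint, mitred corners, and identical storeys, for which the offset area equals A + P·d + 4·d².

```python
import math


def shoelace_area(points):
    """Area (m2) of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def perimeter(points):
    """Perimeter (m) of the closed polygon."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def area_difference_pct(model_hfa, epc_hfa):
    """Signed difference of the modelled heated floor area relative to the EPC."""
    return (model_hfa - epc_hfa) / epc_hfa * 100.0


def offset_to_match(points, storeys, epc_hfa):
    """Offset d (m, positive = outwards) so that storeys * area(d) == epc_hfa.

    Assumption: rectilinear footprint with mitred corners, so that
    area(d) = A + P*d + 4*d**2, solved here as a quadratic in d.
    """
    a = shoelace_area(points)
    p = perimeter(points)
    c = a - epc_hfa / storeys
    return (-p + math.sqrt(p * p - 16.0 * c)) / 8.0


# Hypothetical example: a 40 m x 10 m slab with three storeys.
footprint = [(0.0, 0.0), (40.0, 0.0), (40.0, 10.0), (0.0, 10.0)]
storeys = 3
epc_hfa = 1350.0  # placeholder EPC heated floor area, m2

model_hfa = storeys * shoelace_area(footprint)
diff = area_difference_pct(model_hfa, epc_hfa)
d = offset_to_match(footprint, storeys, epc_hfa)
print(f"area difference: {diff:.1f} %, required offset: {d * 100:.0f} cm")
```

For irregular footprints, a geometry library with a polygon offset operation and a numerical root search would be a more general choice than this closed form.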

Thermal Modelling of Case Studies
The thermal characterisation using the proposed methodology resulted in the satisfactory generation of the numerical building model and subsequent energy simulation in EnergyPlus. Table 6 summarises the energy performance accuracy and simulation times obtained during the optimisation process of the thermal modelling. The most significant simulation time savings were achieved when using a single thermal zone per level, with reductions of 625% and 728% using OSM and SHP, respectively, while heating energy increased by around 1%. Grouping windows had a negative impact of 0.6% on the energy performance with OSM and barely any impact on the SHP model, while simulation time was reduced by an average of 35.4% and 14.7%, respectively. Modelling the façade's depth resulted in simulations that were up to 500% slower and increased heating energy by 5%. Finally, the impact of modelling different contexts (adjacent buildings, vegetation, etc.) at different distances around the building was verified, and a maximum radius of 50 m (D50) was defined as a "common practice" value and used as a reference for comparison. The default context radius of 50 m was retained as the standard, but it should be increased where necessary, for instance, if the primary source of obstruction is beyond 50 m.
Once the above-described settings for the energy modelling were optimised, the procedure was tested on specific case studies with the following settings: a single thermal zone per storey, grouped windows per storey and façade orientation, no façade depth shading around the apertures, and context within 50 m. Table 7 displays the results of the energy performance and simulation time according to the dataset used for importing the building footprints. The measured heating energy in the EPCs was used to benchmark the percent difference between the energy models obtained from the two datasets before and after the area fitting of the building's footprint. As shown in Figure 7, the gap in energy performance was lower when using SHP (5% to 25% after area adjustment). Nevertheless, the simulation time when using SHP was up to eight times longer than with OSM.
The simulation time remained below one minute for OSM, except in neighbourhood F, where it was 72 s due to the increased complexity of the buildings. For SHP, the longest time was 657 s per building in the Siriusgatan neighbourhood (D) (see Figure 7). The high-rise slab buildings resulted in the most time-consuming energy simulations when using SHP, while the more geometrically complex neighbourhood performed similarly for OSM and shapefile energy models. The low-rise slab neighbourhoods had the fastest energy simulations and the closest results to the EPCs, with the best performance across all building typologies when using OSM-based energy models. Figure 8 shows that the modelled heating need had the lowest deviation compared to measurements from EPCs for low-rise buildings.
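The two indicators used in this comparison, the gap between simulated and measured heating energy and the ratio of simulation times between footprint sources, reduce to simple arithmetic. The sketch below shows one way to compute them; the function names and all numerical inputs are hypothetical placeholders, not values from Table 7.

```python
def heating_gap_pct(simulated, measured):
    """Absolute deviation of simulated heating energy from the EPC value, in %."""
    return abs(simulated - measured) / measured * 100.0


def time_ratio(t_shp_s, t_osm_s):
    """How many times longer the SHP-based simulation ran than the OSM-based one."""
    return t_shp_s / t_osm_s


# Placeholder inputs (kWh/m2 per year and seconds), chosen only for illustration.
measured_epc = 120.0
simulated_osm = 132.0
simulated_shp = 126.0

gap_osm = heating_gap_pct(simulated_osm, measured_epc)
gap_shp = heating_gap_pct(simulated_shp, measured_epc)
ratio = time_ratio(t_shp_s=240.0, t_osm_s=30.0)

print(f"OSM gap: {gap_osm:.1f} %, SHP gap: {gap_shp:.1f} %, time ratio: {ratio:.1f}x")
```

Reporting the gap as an absolute percentage matches the way ranges such as "5% to 25%" are quoted in the text; a signed difference would additionally show whether the model over- or under-predicts.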

Discussion
The results of the study contribute to quantifying the trade-off between the accuracy of energy results and simulation time when using different approaches to generate building energy models with no input data from users. The OpenStreetMap (OSM) GIS database was found to be suitable for modelling low-rise buildings. The accuracy of the generated heated floor area and heating need was found to be similar between OSM and the shapefile (SHP) cadastre GIS database, so OSM was preferred for this building typology due to the faster simulation time. The higher compactness and less complex geometry of low-rise buildings could also explain the lower deviation between the modelled heated area and resulting heating need compared to measurements from energy performance certificates (EPCs) for low-rise buildings. The good performance for low-rise buildings is further supported by the fact that this typology represents 58% of the BETSI dataset for the analysed period. This means that assumptions about the thermal properties of the building envelope and ventilation systems are more likely to be accurate when modelling low-rise buildings. Additionally, low-rise buildings are the most common typology for residential buildings in Sweden, so this result is relevant for scaling up future studies on national-level renovation scenarios where accuracy and low computational cost of thermal modelling are essential.
In contrast, other building typologies, such as high-rise and complex buildings, showed significant differences between the modelling procedures using OSM and SHP. These buildings have more complicated footprint geometries, which explains the larger deviation when modelled using a simpler data source such as OSM. Figure 9 shows a visual comparison of a high-rise building modelled using the OSM footprint on the left and the cadastral shapefile on the right.
The difference between the footprints is due to the fact that high-rise slab buildings often have buffered entrances attached to each staircase, which are not captured in the OSM data but are present in the cadastral shapefile database. When using the SHP data with the extrusion modelling procedure, it is important to target a higher level of detail (LOD), such as LOD 1.3, which provides geometric detail of the outer perimeter per level. Otherwise, the complexity can hinder the effectiveness of simplification measures such as grouping the windows per façade. This issue was seen during the optimisation process of the geometry model (Figure 5), as the implementation of grouping apertures proved to be twice as effective in reducing simulation times with the OSM-based models.
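The window-grouping simplification discussed above can be pictured as merging all apertures that share a storey and façade orientation into one equivalent aperture of the same total area. The following sketch shows only that idea under simplifying assumptions; the `Window` class and `group_windows` function are hypothetical and do not reproduce the Grasshopper and Ladybug Tools workflow used in the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    storey: int
    orientation: str  # façade orientation label, e.g. "N", "E", "S", "W"
    area_m2: float


def group_windows(windows):
    """Merge windows sharing storey and orientation into one equivalent window."""
    grouped = defaultdict(float)
    for w in windows:
        grouped[(w.storey, w.orientation)] += w.area_m2
    # One aperture per (storey, orientation) pair, total glazed area preserved.
    return [Window(s, o, a) for (s, o), a in sorted(grouped.items())]


# Hypothetical input: six individual windows on two storeys.
windows = [
    Window(0, "S", 1.5), Window(0, "S", 1.5), Window(0, "N", 1.0),
    Window(1, "S", 1.5), Window(1, "S", 1.5), Window(1, "N", 1.0),
]
merged = group_windows(windows)
print(len(windows), "windows reduced to", len(merged))
```

A footprint with many small façade segments, as in the cadastral shapefiles, produces more distinct façade groups, which is consistent with the smaller time savings reported for the SHP-based models.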
The results of the study showed that the difference between the modelled and measured heating energy was larger in high-rise buildings, which could be attributed to the generic archetype being more representative of low-rise buildings and their thermal properties and HVAC systems. However, it should be noted that the sample size is small, and the results may not be generalisable to all building types from the same construction period, pointing to the importance of introducing more archetypes to better model underrepresented building typologies in future studies. Although the authors are aware that comparing energy efficiency accuracy with EPC values is not ideal, the aim of this study was to establish the methodology and verify the proper semi-automated modelling process. Future work includes increasing the accuracy of the generated energy models, analysing renovation scenarios, and upscaling from individual buildings to neighbourhoods. To achieve this, the methodology should be calibrated using more data on real energy use and extended to other construction periods, incorporating more archetypes and "layers" of modelling, such as profitability, life cycle assessment, energy use, and indoor comfort, which may require access to additional databases such as material and labour costs and the climate impact of materials and energy use.

Conclusions
The integration of urban energy modelling into existing building renovation processes is vital for promoting building upgrades across a large building stock without the need for personal inspection or technical expertise. This study proposes a methodology for streamlining building energy modelling with a promising degree of automation and reduced input requirements from the user. The methodology provides a solid basis for important analyses of the existing building stock at the urban level, empowering non-technical stakeholders to make informed decisions in the early stages of the process.
The building energy modelling process was automated by coupling open access GIS databases, statistical analysis of the building stock's thermal performance, and available modelling tools. Several modelling approaches were tested in six case study buildings representing different post-war construction typologies in Sweden. The study found that OpenStreetMap (OSM) is a promising GIS database for the proposed methodology. Results showed six times faster simulation time, averaging 23 s, compared to the shapefile (SHP) cadastral GIS database, which had an average simulation time of 257 s. Despite the difference in simulation time, the accuracy of heating energy need predictions between the two databases was comparable, with differences between simulations and measurements ranging from 6% to 22% for OSM and from 5% to 15% for SHP.
The limited sample size and use of a generic archetype highlight the need for further research to increase the accuracy of the building energy models. This can be achieved by incorporating additional building archetypes and calibrating the methodology with real energy use data. The study lays the foundation for further development of a methodology for evaluating renovation scenarios toward decarbonisation in the Swedish building stock.

Figure 1. Building geometry generation methods using GIS datasets: (A) From a 3D surface or solid; (B) From a closed curve or a 2D surface.

Figure 2. Overall building modelling workflow proposed. The automated Grasshopper script highlighted in grey integrates different GIS and modelling tools (Ladybug Tools Dragonfly, available online: https://www.ladybug.tools/dragonfly.html, and Ladybug Tools Honeybee, available online: https://www.ladybug.tools/honeybee.html, both accessed on 1 February 2023) with access to different databases (online or Excel files) using an additional layer of Python code.

Figure 3. The geometry model workflow includes step (3) of area fit between the geometry model and input data from energy performance certificates.

Figure 4. The process for using LIDAR-based point-cloud datasets for volume and height verification of building geometries obtained from OSM and SHP footprints using Elk and Urbano, across the three subfigures. Subfigure (A) displays the building geometries obtained from OSM and SHP footprints, which are then compared to subfigure (B), which shows the building envelope mesh and point cloud elevations of the buildings obtained from LIDAR datasets. Finally, in subfigure (C), the volume and height of the buildings are verified, as indicated by the arrows taking the average elevation of points representing the roof of each building.

Figure 5. Four simplification measures were tested to optimise accuracy and computational cost of the thermal model. The 3D model shows the building with colour-coded features. Red is the roof, yellow is exterior walls, blue is apertures, and purple is the context. Subfigures (A-D) depict the effects of thermal zoning, window distribution, façade depth/thickness, and context distance modelling, respectively.

Figure 6. Boxplot results comparison of modelling with OpenStreetMap and shapefiles from cadastre for (A) Area difference of resulting models, (B) Offset needed to fit EPD area, and (C) Relative compactness ratio of resulting geometries.

Figure 7. Boxplot distribution of energy need difference and total simulation time results of the different GIS sources.

Figure 8. Normalised heating need and difference from EPC results using OSM and SHP segmented by building typologies.

Figure 9. Example of a high-rise building energy model analysed. The figure displays the 3D model of the building with colours representing different features. Red represents the roof of the building, yellow represents the exterior walls, and blue represents the apertures of the building.

Table 1. Selected tools supporting GIS integration within the Grasshopper environment.

Table 2. Input needed on the building envelope, predefined by BETSI [49] archetype datasets when unknown.

Table 3. Default input sourcing that defines the building operation model.

Table 4. Neighbourhoods selected as case studies.

Table 5. Building geometry benchmark of OpenStreetMap (OSM files) building footprints against energy performance certificates (EPC) for case studies A to F.

Table 6. Averaged results comparison of the different simplification measures tested using OpenStreetMap and shapefile sourced building footprints. (*) The default context distance of 50 m was retained for the comparison of results when considering other radius distances for modelling the building context.

Table 7. Energy performance for heating and simulation time results obtained across the studied neighbourhoods. The offset values refer to heating energy based on adjusted building footprints to match the EPC's total heated floor area.