1. Introduction
The expansion of urban areas and the resulting decline in natural spaces have led to new challenges, including ecosystem degradation, species loss, and increased vulnerability to climate hazards. Urban biodiversity is essential for sustainable cities, as it helps address challenges that negatively affect human health and well-being. ECOLOPES (ECOlogical building enveloPES), an H2020 FET Project, aims to develop a design approach that treats multiple species as stakeholders in order to achieve regenerative urban ecosystems [1]. By multi-species, we mean four different types of inhabitants that can co-occupy the envelope: plants, animals, microbiota, and humans.
Designing with biodiversity in mind involves addressing the challenge of accommodating the diverse requirements and constraints of multiple species while also considering architectural factors. These requirements and constraints must be clearly defined, formalised, and placed in the appropriate context to ensure effective integration within urban and architectural planning. Ontologies [
2] provide a structured approach to representing knowledge by defining concepts and relationships within and across various domains. Additionally, they enable axiomatic reasoning, allowing implicit facts to be inferred from explicitly stated information within a knowledge base [
3]. Implementing a knowledge base using knowledge graphs [
4] offers greater flexibility in data integration, as they are built upon well-established Semantic Web and Linked Data principles, such as IRIs for uniquely identifying entities and RDF for representing facts in the form of triples, ensuring interoperability and semantic richness in data integration processes.
Integrating the diverse data required for stakeholders and beyond—spanning life sciences, geography, and architecture—and utilising it for design presents a significant challenge. This paper introduces an ontology-driven approach that utilises ontology-based data management (OBDM) [
5] as a framework that integrates diverse data sources, enabling ecologists and architects to design sites and buildings that foster urban biodiversity. The proposed ontology-aided design methodology builds upon the concept of performance-oriented design [
6,
7] and the conceptual framework focused on data-driven approaches to understanding and designing environments [
8].
Data sources range from publicly available datasets, which can be easily accessed and retrieved via web APIs—such as GLoBI (Global Biotic Interactions) for species [
9]—to private datasets generated within ECOLOPES, including Plant Functional Groups (PFGs) [
10], Animal Functional Groups (AFGs), and voxel data [
11] that capture site-specific variables like coordinates, aspect, slope, or solar radiation. Plant and animal functional groups, defined by shared traits or ecological roles rather than taxonomic classification, play a crucial role in ecosystem processes, resilience, and biodiversity conservation. A voxel model contains useful aggregated information applicable to various use cases, such as computing the
Habitat Suitability Index (HSI), a model that evaluates how well a habitat suits a given species. Depending on the weighting scheme employed by the respective publication or study on the reported species (e.g., rabbit, fox), one can calculate, for each point in a CAD environment, how suitable its environmental conditions (aspect, slope, elevation, solar radiation, and so forth) are for the designated species, as captured in the voxel model. Another dimension of data emerges from the designer’s interaction with the CAD and Grasshopper environments. In these environments, based on the project brief requirements, the designer creates a
network configuration by using predefined ‘circle shapes’ of
EcoNodes (denoting species) and
ArchiNodes (denoting buildings or other architectural objects).
The resulting disparate and heterogeneous data requires integration, alignment, and consolidation into a unified RDF-based graph data model, with ontological terms used to define the mappings. The result is a knowledge graph [
4,
12] that contains curated and contextualised knowledge, enabling holistic querying and answer retrieval through the use of domain ontologies. In order to implement it, we have employed OBDM—an extension of ontology-based data access (OBDA) [
13] with support for updates [
14] in addition to queries—as a data integration framework. This approach led to some data being virtualised (e.g., the voxel model), with the remaining data materialised, i.e., stored and consequently indexed by the triple store. The approach is pragmatic, as the data are not moved and no data duplication is created, albeit with the general downside of reduced performance in query evaluation. Nonetheless, subject to performance indicators or the need to provide a seamless workflow for the designer, a subset of voxel data can be materialised in order to meet the required interaction criteria [
15]. For instance, in our case, the
Euclidean distance between two nodes is a candidate for pre-materialisation to increase query performance, which is typically viable in cases where the positions of nodes do not change frequently.
The scope of the knowledge graph is often determined by the set of competency questions (CQs) [
2] it is designed to answer, such as
“Provide me with the local species that are known to have the protection status threatened”. To achieve this, new data must be ingested, such as public data on local species provided by municipalities, citizen science (e.g., GBIF (
https://www.gbif.org, accessed on 17 March 2025)), and threatened species data from the International Union for Conservation of Nature (IUCN) (
https://iucn.org, accessed on 17 March 2025). Additionally, this data often requires alignment across datasets—for instance, by linking data referring to the same entity using
owl:sameAs assertions or custom relationships defined within the ontology. In this case, the species’ Latin name, alongside its taxonomic rank (if applicable), can be leveraged to disambiguate entities and, therefore, aid their alignment across datasets.
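As an illustrative sketch, such a CQ can be expressed as a SPARQL query of the following shape; the protection-status property and the namespace IRI are assumptions for illustration, not necessarily the exact EIM terms:
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI

# Local species whose protection status is "threatened".
SELECT ?species ?name WHERE {
  ?species dwc:scientificName ?name ;
           ecolopes-schema:hasProtectionStatus "threatened" . # assumed property
}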
Ontologies, particularly those built using OWL (
https://www.w3.org/TR/owl2-overview/, accessed on 17 March 2025) (Web Ontology Language), not only provide a comprehensive description of a domain but also enable the inference of new implicit facts (i.e., RDF triples) from the explicit facts asserted in the knowledge graph. Domain experts and ontologists often aim to reuse concepts from existing ontologies or establish alignments to create more expressive and granular structures, which can be leveraged on demand or to support reasoning capabilities under the Open World Assumption (OWA). In our case, we have modelled the ECOLOPES ontology using a top-down conceptualisation, a bottom-up approach informed by the data, and by reusing structures from well-established ontologies, namely OBO Relations [
16] or Darwin Core
(https://dwc.tdwg.org, accessed on 17 March 2025), following ontology construction best practices.
The proposed ontology, developed in collaboration with domain experts and adhering to Semantic Web and Linked Data best practices, serves as a mediator between life sciences data (e.g., species distribution and habitats) and geometric information (e.g., maps, voxel models of building structures). The ontology is an interface with CAD and aids the design of ecological building envelopes addressing the critical “data-to-design gap” [
17,
18] in computational architecture and multi-species architectural design.
Unlike ontologies, which are designed for logical, deductive reasoning, constraints adhere to the Closed World Assumption (CWA) and precisely define how data should be structured by imposing specific restrictions on the data model. In our case, constraints are represented as explicit facts in the knowledge graph, such as the solar requirements of plants or the minimum and maximum proximity distances between species. By using standards like SHACL
(https://www.w3.org/TR/shacl/, accessed on 17 March 2025), one can define and validate the expected “shapes” and structure of the data, ensuring data quality and completeness while also enabling the enforcement of consistency across the dataset. In our case, validation is currently performed using a set of SPARQL queries. The constraints in our problem setting can be translated into SPARQL queries, e.g.,
“Provide me with the locations on the site where I can place the plant Abies alba (EcoNode) such that its solar requirements are met”. Given the spatial nature of this query, one would typically employ a GeoSPARQL query to answer it; nonetheless, in our case, the voxel model containing x, y, and z coordinates boils down to
equality checking with respect to CAD coordinates.
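A minimal sketch of such a query is given below. The solarRequirementMin/solarRequirementMax attributes follow Section 4.4, whereas the voxel property names and namespace IRIs are assumptions for illustration:
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI
PREFIX ecolopes-inst: <https://w3id.org/ecolopes/inst#>     # assumed namespace IRI

# Voxel positions whose solar radiation satisfies the requirements of Abies alba.
SELECT ?x ?y ?z WHERE {
  ecolopes-inst:Abies_alba ecolopes-schema:solarRequirementMin ?min ;
                           ecolopes-schema:solarRequirementMax ?max .
  ?voxel ecolopes-schema:x ?x ;
         ecolopes-schema:y ?y ;
         ecolopes-schema:z ?z ;
         ecolopes-schema:solarRadiation ?rad . # assumed property
  FILTER(?rad >= ?min && ?rad <= ?max)
}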
The computational model used to generate design outcomes based on the designer’s input is rule-based: it computes the design outcomes from the selected rules, which can be run in a designated order. Our current implementation mimics rule-based reasoning by employing a set of SPARQL update queries [19] in the context of OBDM, run in a sequence orchestrated by the Grasshopper environment.
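As a sketch, one such rule can take the following form, using the nodeHasSpecies relationship from Section 4.4; the sunlight and flag properties are assumptions for illustration:
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI

# Rule sketch: flag nodes whose solar requirements are fulfilled.
INSERT { ?node ecolopes-schema:solarValid true . } # assumed flag property
WHERE {
  ?node ecolopes-schema:nodeHasSpecies ?sp ;
        ecolopes-schema:sunlightHours ?h . # assumed property
  ?sp ecolopes-schema:solarRequirementMin ?min ;
      ecolopes-schema:solarRequirementMax ?max .
  FILTER(?h >= ?min && ?h <= ?max)
}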
This integration enables the adaptation of sites, buildings, and geometries to create habitats that attract and support urban wildlife, contributing to ecological sustainability. The paper illustrates the practical utility of the ontology through a (case study) demonstrator, highlighting its role in guiding building designs that promote species attractiveness and urban biodiversity.
Our contributions are the following:
A new EIM (ECOLOPES Information Model) ontology and knowledge graph in the domains of architecture and ecology.
A method that utilises OBDM to map and integrate different data sources—stemming from different environments from the respective domains—through an ontology, effectively creating the knowledge graph. OBDM enables query and update operations to be performed directly on the integrated data.
A designer workflow, in which the designer obtains feedback (e.g., the fulfilment of solar radiation or proximity constraints) or asks CQs against the knowledge graph, enabled by OBDM.
The rest of the paper is structured as follows. In
Section 2, we provide the necessary background, motivation, and problem statement. In
Section 3, we discuss related work with respect to using ontologies in design and ecology. In
Section 4, we explain the ontology design, reuse, and alignment. In
Section 5, we discuss the data integration framework used for integrating heterogeneous data sources. In
Section 6, we describe the demonstrator, and we provide the results achieved in that context. Finally, in
Section 7, we conclude and outline future work.
2. Motivation and Problem Statement
The
ontology-aided generative computational design process facilitates design generation, namely the population of a search space with alternative design solutions that can be analysed and evaluated. The overall process of design generation entails synthesising heterogeneous ecological, environmental, and architectural data, structuring it in voxel-based form, incorporating context information at the regional/urban and local/architectural scales, and modelling information according to decision needs. A detailed description of the conceptual framework that informed the development of the research presented in this paper is published in a separate study [
20].
The role of ontologies is to model information regarding the entities and relationships that need to be represented and to aid the design of ecological building envelopes. We distinguish between the schema of an ontology (TBox) and its instance data (ABox). More formally, for a given data source and its variables x, we create the ABox by applying a set of mappings of the form φ(x) ⇝ E(t(x)), where E is a TBox class or property, and t is a set of IRI templates, with each template applied to the variable
x in order to create a literal or IRI. A knowledge graph KG is a union of ABox and TBox (herein, we also distinguish taxonomies or controlled vocabularies as a part of KG, sometimes referred to as CBox. They are essential to capture hierarchical representations, e.g., species taxonomic rank) represented in RDF and typically stored (materialised) in a triple store. To avoid the complexity of reification statements in RDF, we chose to model statements using RDF-star for convenience, namely for expressing proximity constraints. A part of ABox can be virtualised instead of being materialised, depending on the use case requirements. In this particular case, the mappings ⇝ that are represented in a specific mapping language are used to translate the queries on the fly back to the original data sources. TBox axioms are used to infer new implicit triples in ABox based on asserted (ABox) triples, which are either stored in the triple store or, alternatively, a query is rewritten during the query time with respect to TBox axioms to account for the same (query) answers. This proposition does not hold for expressive OWL ontology languages but rather for (minimal) RDFS, consisting of
rdfs:domain/
rdfs:range,
rdfs:subClassOf, and
rdfs:subPropertyOf axioms [
14,
21]. For such pragmatic reasons, we use (minimal) RDFS in conjunction with a few OWL axioms (e.g., for aligning properties via
owl:equivalentProperty) for the task of ontology modelling. In the case of OBDA and mappings, aside from query rewriting, an additional step of
query unfolding is executed, in which SPARQL queries are translated on the fly to SQL queries in order to obtain the so-called
certain query answers [
22]. OBDM [
5] extends OBDA with support for update queries.
The role of the voxel model within the
ontology-aided design process is to describe the real-world location in which the design process is taking place by structuring and spatialising datasets. An early definition of voxel models as “knowledge representation schemata” [
23] was identified in the course of an extensive literature review of voxel models’ role in the engineering fields [
11], and this was re-established in the context of computational design. The voxel model implementation developed for this study extends the Composite Voxel Model methodology [
24] by incorporating the explicit link with knowledge representation through the ontology-aided design process. The implemented ECOLOPES Voxel Model, described in detail in [
25], integrates geometric data describing the real-world site’s physical geometry and simulated environmental performance. Currently, voxel-based design approaches addressing anthropogenic landscape adaptations can be found [
26], and studies aiming to integrate voxel and ontological modelling are ongoing [
27]. A unique property of the presented ECOLOPES Voxel Model is the implementation within the relational database environment (PostgreSQL), which enables direct access to voxel data through the native “OBDA” mappings. As indicated in
Figure 1, the concept of voxel data layers is used to structure datasets provided by the ECOLOPES Voxel Model. By utilising the OBDA virtualisation technique, the EIM Ontologies can currently access the raw, aggregate, and filtered data layer, enabling the interactive temporal resolution change of the environmental simulation data contained in the ECOLOPES Voxel Model (cf.
Figure 2). For example, average sunlight exposure (sunlight hours) in summer, autumn, winter, spring, and the growing season can be aggregated from the raw voxel data representing monthly average sunlight exposure. Multi-scalar voxel data representing geometry and all available parameters stored in different voxel levels can be retrieved by designers and accessed through the SPARQL endpoint (see
Section 5).
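As a sketch under an assumed data-layer structure (one observation node per voxel and month), such an aggregation can be expressed declaratively as follows:
PREFIX voxel: <https://w3id.org/ecolopes/voxel#> # assumed namespace IRI

# Average the monthly sunlight hours of the summer months per voxel.
SELECT ?voxel (AVG(?hours) AS ?sunlightHoursSummer) WHERE {
  ?voxel voxel:hasMonthlyValue ?obs . # assumed structure
  ?obs voxel:month ?month ;
       voxel:sunlightHours ?hours .
  FILTER(?month IN (6, 7, 8)) # June to August
}
GROUP BY ?voxel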
Two processes are combined in the ontology-aided generative computational design process: (i) the translational and (ii) the generative processes.
In the
translational process, the designer analyses, correlates, and spatially locates architectural and ecological requirements contained in the design brief and design-specific determinations while taking into consideration relevant constraints (i.e., as given by planning regulations). To do so, the designer uses a set of predefined nodes and places them in the CAD environment in order to satisfy ecological and architectural requirements. Herein, the designer can query the knowledge graph in order to obtain feedback regarding the solution space for permissible design outputs in terms of satisfying the proximity distances between nodes (e.g., species) or solar radiation requirements. Nodes that fulfil the constraints are depicted as green circles, whereas those that do not are depicted as red crosses. The Grasshopper environment orchestrates the validation process. A knowledge graph also acts as a mediator with the voxel model (cf.
Figure 1) that contains prepared datasets regarding geometric and environmental variables. In this paper, we focus only on the translational process, whose intended goal is shown in
Figure 3.
The generative process consists of two distinct stages. In the first stage, the variations of spatial organisation are generated and evaluated. In the second one, specific surface geometries are generated for selected outputs of the previous stage and are evaluated according to defined criteria, i.e., key performance indicators. The generative process falls outside the scope of this work and is planned for future exploration.
3. Related Work
The use of Semantic Web technologies and OWL to represent ecological network specifications is explored in [
30]. Ecological networks capture the structure of existing ecosystems and provide a basis for planning their expansion, conservation, and enhancement. An OWL ontology is employed to model these networks, enabling reasoning about potential improvements and identifying violations in proposed network expansions. The approach incorporates a language called GeCoLan to define ecological network specifications, which are subsequently translated into GeoSPARQL queries. This methodology enhances the planning and management of ecological networks by supporting automated reasoning and integrating diverse data sources effectively.
Global Biotic Interactions (GLoBI) [
9] is an open-access platform designed for researchers, aggregating diverse datasets to provide comprehensive information on species interactions, such as predator–prey, pollinator–plant, and parasite–host relationships, among others. The platform serves as a valuable resource for understanding ecological networks, predicting the effects of environmental changes, and guiding conservation strategies. It consolidates and harmonises data on species and their occurrences and interactions from a variety of publications and observational databases. GLoBI offers both an API and a SPARQL endpoint (with partial data coverage) for accessing species interaction data in CSV or RDF formats. Species data are structured using the Darwin Core vocabulary, with equivalence between species established through
owl:sameAs relations. Additionally, the platform features a web-based frontend application that supports species searches via text and/or geographical boundaries. The OBO Relations (
http://obofoundry.org/ontology/ro.html, accessed on 17 March 2025) ontology can be leveraged for reasoning to infer implicit interactions based on explicitly recorded triples.
The integration of ontologies and knowledge graphs in urban ecology has gained significant attention due to their ability to model complex interactions between built environments and natural ecosystems [
18]. Existing research spans two major domains: Architecture, Engineering, and Construction (AEC(O)) and Urban Design, Planning, and Management, both of which are crucial for advancing sustainable urban development. Ontologies and knowledge graphs have been extensively used in AEC(O) to improve data interoperability, lifecycle management, and decision support systems. In Building Modelling, semantic models such as the Industry Foundation Classes (IFC) ontology enable structured data exchange across different software platforms [
31]. In Building Construction and Renovation, ontologies assist in process optimisation, cost estimation, and resource planning by integrating heterogeneous datasets [
32]. Additionally, Building Automation has benefited from ontologies like the Brick schema, which enables the semantic representation of building sensors and control systems. Ontologies also play a key role in Building Sustainability, facilitating the assessment of energy efficiency and carbon footprints [
33]. Semantic models allow for the integration of green building certification standards (e.g., LEED, BREEAM) with real-time building performance data, enhancing decision-making for sustainable design and renovation [
34]. Ontologies have been applied in City Modelling, enhancing urban digital twins with semantic interoperability [
35]. The CityGML standard, extended with ontologies, provides a structured representation of urban elements, supporting applications in Civil Construction and infrastructure management. Urban Sustainability ontologies enable the integration of environmental indicators with urban planning tools, aiding in decision-making for reducing ecological footprints [
36]. The role of Urban Green and Blue Infrastructure ontologies is particularly relevant for urban ecology. These semantic models capture the relationships between natural elements (e.g., vegetation and water bodies) and built environments, supporting biodiversity monitoring and ecosystem services assessment [
37]. While AEC(O) and urban planning ontologies provide structured models for specific domains, their integration with ecological and functional dimensions is essential for advancing sustainable urban ecosystems. Several key research areas highlight the necessity of interlinking these domains: Simulation, where ontologies enable multi-domain simulations for energy performance [
38], microclimate analysis, and species habitat suitability within urban environments [
39]; Garden Management, where knowledge graphs integrate urban vegetation data with soil conditions, climate factors, and biodiversity metrics, improving urban green space management [
40]; and Urban Key Performance Indicators (KPIs), where ontologies facilitate KPI-based urban sustainability assessments by structuring data around air quality, noise pollution, and resource efficiency [
41]. Despite advances in these areas, challenges remain in harmonising ontologies across domains, ensuring scalability, and developing reasoning mechanisms that enhance automated decision support.
Unlike the approaches reviewed in this section, the ontology we propose is designed to bridge the gap between two distinct domains, ecology and architecture, ensuring seamless integration of heterogeneous data sources. It is built on a strong foundation of Semantic Web best practices and adheres to FAIR principles, enhancing data findability, accessibility, interoperability, and reusability. Moreover, the ontology goes beyond pure modelling; it enables semantic reasoning and decision support, allowing stakeholders to derive new insights, infer implicit knowledge, and guide informed decision-making in sustainable urban planning and biodiversity-conscious architectural design.
4. Ontology Design and Development
In this section, we outline the ontology development process for EIM following W3C best practices and FAIR principles. The process begins with defining requirements through expert input and competency questions (CQs). Next, terms and axioms are created, ensuring the reuse of existing vocabularies for interoperability. The ontology is then structured and formalised, followed by validation and visualisation to enhance accessibility and usability.
4.1. Defining the Ontology Requirements
We organised a collaborative workshop with experts from the ECOLOPES consortium, comprising architects and ecologists, to define the ontology and knowledge graph requirements. Given that most participants were not ICT specialists or knowledge engineers, identifying and conceptualising ontology requirements posed a significant challenge. To facilitate this process, we structured the workshop around ontology construction best practices, ensuring that domain experts could systematically contribute their knowledge in a way that aligns with semantic modelling principles.
The participants were divided into four interdisciplinary teams, each consisting of six people, to encourage the “cross-pollination” of ideas, leveraging diverse expertise to bridge conceptual gaps. Each team was tasked with defining a comprehensive set of competency questions (CQs) spanning both domains, a fundamental step in ontology engineering that helps capture the intended scope and use cases. These CQs were written on post-it notes to foster discussion and iterative refinement, allowing experts to progressively articulate their knowledge in a structured manner.
In parallel, the participants identified key concepts and relationships relevant to answering the CQs, mapping out the foundational elements of the ontology. However, the abstraction required for formal ontological representation proved difficult for non-ontology experts. To mitigate this, they were guided to propose synonyms, group related concepts into broader categories, and outline hierarchical relationships, effectively shaping the initial class taxonomy. Recognising that defining structured properties and attributes requires a technical perspective, we encouraged participants to describe informal connections between concepts, which were later formalised by knowledge engineers. When gaps remained in structuring the ontology, participants suggested relevant datasets or external sources that could support answering the identified CQs, ensuring that the ontology remained grounded in real-world data and met FAIR principles for interoperability and reusability.
This bottom-up, expert-driven approach resulted in a draft skeleton ontology comprising initial classes, properties, and attributes.
Requirements were captured through CQs, a few of which are listed here:
Which species do we want/do not want to attract close to our building?
Which species are in PFG herbs_3?
Which species that colonise the building are protected?
Which species are invasive in this location?
Which species can reach or live on sloped surfaces?
What is the soil depth required for PFG herbs_3?
4.2. Ontology Requirements and Scope
One of the main ontology aims was to integrate diverse data sources to support sustainable building design decisions that enhance urban biodiversity. Its scope includes ecological data (e.g., species distribution and habitat preferences), architectural data (e.g., building geometry and materials), and geographical data (e.g., environmental conditions and maps). By facilitating data interoperability and enabling reasoning capabilities, the ontology seeks to inform ecologists and architects when designing buildings that foster ecosystem development.
Once the CQ definition phase was finished, the next steps were to group the CQs by similarity and prioritise them by importance. After the screening and processing of the CQs, we extracted general (i.e., upper-level) requirements from them. Processing all the CQs is beyond the scope of this paper; hence, herein, we focus only on general constraints and requirements that pertain to the following:
The proximity between architectural objects and species (and vice versa) or, likewise, between species only.
Solar radiation (regarding species).
The prey areas in the CAD environment, which denote areas in which species are allowed to prey on one another.
4.3. Ontology Creation Process
The ontology (
https://github.com/aahmeti/Ecolopes, accessed on 17 March 2025) was developed following Semantic Web and Linked Open Data (LOD) best practices, ensuring compliance with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. The methodology employed a structured approach incorporating the following:
Iterative design and refinement, integrating feedback from domain experts and ecological datasets on biodiversity.
Modular construction, enabling scalability, maintainability, and reuse across multiple domains.
Reuse of established vocabularies and ontologies, including GeoSPARQL for spatial representation and Darwin Core for biodiversity data integration.
Ontology development was conducted using Protégé for schema and taxonomy management, while GraphDB was employed for managing the knowledge graph instance data. Modularisation was applied to facilitate ontology structuring, leading to a well-defined prefix scheme that enhances accessibility and reusability. The ontology components are categorised into schemas (
networks-schema,
ecolopes-schema), taxonomies (
networks-taxo), and instance data (
ecolopes-inst,
networks-inst). This structured approach ensures a clear separation of entities originating from different environments, improving clarity and consistency, as shown in
Table 1.
Finally, the ontologies were imported into GraphDB and stored in separate named graphs. The GraphDB repository was configured with a reasoning ruleset that enables owl:sameAs support at the instance level and uses the ontology axioms for inferencing. To maintain governance and ensure robust deployment, development and production environments were distinguished as separate repositories. Additionally, GraphDB enables the querying of inferred triples using the implicit graph
FROM http://www.ontotext.com/implicit (
https://graphdb.ontotext.com/documentation/10.8/query-behavior.html#tuning-query-behavior, accessed on 17 March 2025), allowing for advanced semantic reasoning and enriched data retrieval.
4.4. Core Concepts and Structure
The ontology’s core concepts address the integration of ecological, architectural, and geographical data. The key classes include the following:
Classes: ArchiNode and EcoNode, Voxel, PlaneConstraint, SolarConstraint, and more.
Relationships: nodeHasDistanceToNode, nodeHasSpecies, nodeHasArch, nodeOverlapsPlane, and more.
Attributes: x, y, z, proximityRequirementMin, proximityRequirementMax, solarRequirementMin, solarRequirementMax, and more.
4.5. Alignment with Existing Standards
To ensure full compliance with FAIR principles, the ontology was designed to be highly interoperable, reusable, and findable by being aligned with well-established standards and vocabularies. This alignment enables seamless integration with external datasets, facilitates data interoperability, and enhances the ontology’s applicability across diverse domains. Specifically, the ontology incorporates the following:
GeoSPARQL: Used to represent geographical data and enable spatial reasoning and geospatial queries across datasets.
Darwin Core: Integrated for biodiversity data representation, ensuring compatibility with life sciences repositories such as GBIF.
OBO Relations: Employed for modelling ecological relationships, such as predator–prey interactions, ensuring consistency with existing biological ontologies.
To maintain semantic consistency and interoperability, OBO Relations were aligned using
owl:equivalentProperty, allowing for seamless integration with external data sources like GLoBI while preserving existing structures without additional data transformation (cf.
Figure 4). This approach follows Linked Open Data (LOD) best practices, ensuring that the ontology extends and interacts meaningfully with established ontologies. On the other hand, the mapping of instance GLoBI data was carried out using Ontotext Refine mappings. In conjunction with the
owl:equivalentProperty alignments presented above, the reasoning would infer triples with the OBO Relations predicates, hence enabling interoperability.
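Concretely, such an alignment axiom can be sketched as follows; the local property name and namespace IRI are illustrative, while obo:RO_0002434 denotes the OBO Relations “interacts with” relation:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI

# Align a local interaction property with its OBO Relations counterpart.
INSERT DATA {
  ecolopes-schema:interactsWith owl:equivalentProperty obo:RO_0002434 .
}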
Regarding geospatial data, the ontology aligns with GeoSPARQL ontology to ensure full compatibility with spatial datasets (e.g., GBIF or shapefiles, cf.
Table 2, Dataset No. 2 and 3). Shapefiles were translated to RDF, complying with GeoSPARQL vocabulary by using RML mappings generated by GeoTriples (ver. 1.1.6) [
42]. Likewise, latitude and longitude coordinates stemming from GBIF in RDF format were transformed into GeoSPARQL representations via a SPARQL update query, and the result is demonstrated in the following example.
Example 1. This example in RDF Turtle syntax represents a point close to the Nordbahnhof area using GeoSPARQL vocabulary:
ecolopes-inst:Location_48.230_16.390 a sf:Point ;
...
ecolopes-schema:hasExactGeometry [ a geo:Geometry ;
geo:asWKT "POINT(48.23012995681571 16.39000953797779)"^^geo:wktLiteral
] .
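The update query performing this transformation can be sketched as follows. The Darwin Core coordinate terms are standard, hasExactGeometry follows Example 1, and the coordinate order mirrors the example above; the namespace IRI is an assumption:
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI

# Derive GeoSPARQL WKT geometries from Darwin Core point coordinates.
INSERT {
  ?occ ecolopes-schema:hasExactGeometry [ a geo:Geometry ; geo:asWKT ?wkt ] .
}
WHERE {
  ?occ dwc:decimalLatitude ?lat ;
       dwc:decimalLongitude ?long .
  BIND(STRDT(CONCAT("POINT(", STR(?lat), " ", STR(?long), ")"),
             geo:wktLiteral) AS ?wkt)
}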
The ontology also integrates Darwin Core for structuring species taxonomic classifications. This allows for querying across multiple biological repositories using standardised taxonomic ranks, such as
dwc:genus,
dwc:order, and
dwc:family. By adhering to Darwin Core, the ontology supports the automated classification and retrieval of species information, ensuring interoperability with existing biodiversity knowledge graphs (e.g., query federation of GBIF backbone species). Whenever possible, the species have been modelled using the Darwin Core vocabulary, i.e., the use of
dwc:scientificName and
dwc:vernacularName to represent the scientific name and common name, respectively (cf.
Section 6).
Through these measures, the ontology ensures compliance with FAIR principles, promoting interoperability, structured data reuse, and seamless integration with the broader Linked Data ecosystem.
5. Ontology-Driven Data Integration Through OBDM
We used a triple store as an implementation for OBDM, specifically GraphDB, which supports modes of virtualisation via Ontop integration [
43] and materialisation (natively). This approach is versatile, as the virtualised data can also be materialised to enhance query performance. OBDM [
5] was chosen for its ability to provide semantic interoperability, reasoning, and data virtualisation, ensuring a flexible and scalable approach to integrating heterogeneous data sources. Unlike traditional ETL-based data integration, which requires extensive preprocessing and rigid schema mappings, OBDM allows for query federation, enabling real-time access to distributed data without duplication. For these reasons, we chose to virtualise the voxel model. By aligning with FAIR principles, OBDM enhances data findability, accessibility, interoperability, and reusability, making it particularly suited to complex domains like urban ecology and sustainable architecture. Additionally, GraphDB’s mature implementation of OBDM supports efficient query execution and reasoning, further strengthening its suitability over conventional integration techniques. In GraphDB, one can configure a repository (equivalent to a database) by choosing one of the modes. Then, one can combine the results of the two modes by using query federation in SPARQL via the
SERVICE keyword. As we have explained previously (cf.
Figure 2), the voxel model is virtualised using OBDA mappings. The implementation of OBDA mappings in RDF is provided in
Figure 5. The mappings for voxel data are one-to-one (1-1), meaning that each column from the voxel data is mapped to a unique property in RDF, i.e., the column aspect is mapped to the property aspect, the column slope to the property slope, and so on. Keeping the mapping expressivity lean makes the approach future-proof, namely for cases where voxel data need to be updated from the ontology via SPARQL update queries, without the need to resolve ambiguous mappings or the side-effect of running into the
view update problem, which is known in database theory [
14].
The system allows for defining the sources (the voxel model stored and managed in the relational database) and mapping definitions (OBDA or R2RML (
https://www.w3.org/TR/r2rml/, accessed on 17 March 2025)), thereby exposing it as a separate repository in which SPARQL queries (via
SERVICE keyword) are translated on the fly into SQL queries via query unfolding (cf.
Figure 6). For the time being, we have used the native OBDA mappings; in the future, we plan to replace this with the more future-proof R2RML mapping language.
The integration scenario in
Figure 7 showcases the OBDM, namely the combination of virtualisation and materialisation in the context of the CAD and Grasshopper environments. The federated query using the
SERVICE keyword issues the query to the repository where the mappings and virtualisation are applied. The bindings are joined with the variables in the current repository that contain knowledge of species’ solar requirements. The process is computed via the Grasshopper pipeline, and the visualisation is shown in the CAD environment. The permissible places for half-shady, shady, and sunny plant species, each with their respective requirements, are shown spatially using the blue colour. OBDM, in this case, is a decision support system for the designer to make informed (spatial) decisions that are orchestrated and issued by the Grasshopper workflow. This demo showcases the integration between two datasets, namely the voxel model and plants’ solar requirements. We can iteratively increase the number of datasets that are ingested, harmonised, and linked, and, therefore, iteratively increase the number of CQs we can ask the KG or use it for more complex decision support in the design process.
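In condensed form, such a federated query can be sketched as follows; the virtual repository name, namespace IRIs, and voxel property names are assumptions for illustration:
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI
PREFIX ecolopes-inst: <https://w3id.org/ecolopes/inst#>     # assumed namespace IRI
PREFIX voxel: <https://w3id.org/ecolopes/voxel#>            # assumed namespace IRI

# Join virtualised voxel data with materialised solar requirements.
SELECT ?x ?y ?z WHERE {
  ecolopes-inst:Abies_alba ecolopes-schema:solarRequirementMin ?min ;
                           ecolopes-schema:solarRequirementMax ?max .
  SERVICE <repository:voxel> { # GraphDB internal federation; repository name assumed
    ?voxel voxel:x ?x ; voxel:y ?y ; voxel:z ?z ;
           voxel:sunlightHoursGrowingSeason ?sun .
  }
  FILTER(?sun >= ?min && ?sun <= ?max)
}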
Currently, a number of datasets have been identified and integrated into the KG using the defined ontological classes, properties, or attributes; see
Table 2. The triple count depicts the size and complexity of the data that can be handled from the data management perspective. Datasets No. 1 and No. 2 are local datasets, which can be replaced by other (local) datasets when replicating the approach for a different city.
Vienna local plants (CSV): Plants, metadata, and solar requirements.
GBIF citizen science (API): Local species filtered based on (proximity to) geographical coordinates.
Plant Functional Groups (CSV): Internal dataset generated within ECOLOPES via the
Ecological Model [
10].
Animal Functional Groups (JSON): Internal dataset generated within ECOLOPES via the Ecological Model.
Animal-aided design (PDF): Specifying the constraints between species and various architectural objects, used as a placeholder.
GLoBI (API): Global biotic interactions between species described in OBO Relations.
GBIF backbone species (RDF): Query the federation of species, including metadata details such as taxonomic rank described in Darwin Core and GBIF backbone taxonomy (
http://graph.openbiodiv.net (accessed on 17 March 2025) [
44]).
Wikidata (RDF): Query the federation of species, including metadata details, synonyms, common names, crossovers (identifiers) to many datasets (DBpedia, Catalogue of Life, and so forth) (
https://query.wikidata.org (accessed on 17 March 2025)).
Nature FIRST (RDF) [45]: Query the federation of species and habitats, including IUCN protection status.
CAD environment: Rhino CAD and Grasshopper environments.
Voxel model (RDB-RDF): Query the federation of the voxel model, storing site-specific environment variables in RDB.
Heterogeneous data are mapped and converted into a graph representation in the KG (ABox) by using different mapping languages, such as RML [
29], Ontotext Refine, and OBDA (
https://graphdb.ontotext.com/documentation/10.8/virtualization.html (accessed on 17 March 2025)). Shapefiles are converted into RDF using RML mappings that are generated by GeoTriples (ver. 1.1.6) [
42]. For this dataset (including GBIF, which can be transformed in this representation; see Example 1), the triples are indexed by the triple store and can be queried and reasoned using GeoSPARQL. The user can query the KG using (Geo)SPARQL as a unified interface that encompasses and integrates different datasets combining the results, and, if required, the user can use reasoning on top using TBox statements. Therefore, the KG integrates heterogeneous data from different sources, providing unified access to the data by using the SPARQL (or SPARQL*) query language.
Figure 8 shows a snapshot of the core parts of the KG, which links knowledge from datasets originated in Grasshopper (left), global biotic interactions and proximity constraints (centre), and the voxel model (top) and PFGs (right). One can traverse from one context to another, given that the relations between datasets are explicitly encoded (e.g.,
owl:sameAs), or they can be inferred by SPARQL queries (e.g., if x,y,z coordinates in between datasets coincide). Whenever possible for the IRIs of species, we treat Wikidata identifiers as canonical IRIs (we mention [
46] as an approach that tackles canonical IRIs in the setting of virtual mappings and OBDM; in our case, this part of the data is materialised) and provide crossovers via
owl:sameAs to other linked datasets. This pragmatic approach allows us to start from a central entity directed towards other entities when browsing or querying species data, as well as for convenience reasons, i.e., when using reasoning support via
owl:sameAs to infer all the statements for the linked species during repository loading time (alternatively, the latter can be computed via query rewriting during query time by using path queries in SPARQL
owl:sameAs*).
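As a sketch, the query-rewriting alternative amounts to a property-path query of the following shape, where the inverse step accounts for the symmetry of owl:sameAs:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>

# Collect all statements about a species across its sameAs closure.
SELECT ?p ?o WHERE {
  ?canonical dwc:scientificName "Abies alba" .
  ?canonical (owl:sameAs|^owl:sameAs)* ?eq .
  ?eq ?p ?o .
}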
Example 2. In the following, we see an example of how the requirements for the proximity between the species Abies alba and Scopula floslactata have been modelled in RDF* to represent n-ary relations in a succinct way, which are stored in GraphDB. The predicates proximityRequirementMin and proximityRequirementMax have been used to set the minimum and maximum allowed value between Abies alba and Scopula floslactata. The visualisation of RDF* triples is depicted in Figure 8 (dashed line within the centre environment/container).
<<ecolopes-inst:Abies_alba ecolopes-schema:proximityDistance ecolopes-inst:Scopula_floslactata>>
    ecolopes-schema:proximityRequirementMin "20"^^xsd:integer ;
    ecolopes-schema:proximityRequirementMax "100"^^xsd:integer .
In the following, we describe the process used to harvest data from GLoBI to obtain all (biotic) interactions between designated species. The workflow begins by specifying a list of plant or animal species as a seed. This initial list can either be derived from geographic coordinates (e.g., by retrieving local species using the GBIF API) or manually selected based on species relevant to our context in Vienna. Once the species list is established, we utilise the GLoBI API to retrieve interaction data in the form of CSV files, with one file generated per species.
The next step involves transforming the CSV data into the RDF format. To achieve this, we use Ontotext Refine to create the necessary mappings via a graphical user interface (GUI). These mappings are subsequently exported as JSON files (cf.
Figure 9). Using the mapping file as input, along with the corresponding species CSV files, we invoke the Ontotext Refine API to generate RDF data in batch mode for each species. Given that the CSV schema returned by the GLoBI API remains consistent, we can reliably apply the same set of mappings to produce RDF representations of species interactions.
To enhance interoperability and enable reasoning over implicit triples, we aligned the interaction properties with the OBO Relations ontology (cf.
Figure 4). This alignment ensures that the RDF output adheres to established semantic standards, facilitating integration with other biodiversity and ecological datasets.
In
Figure 10, we see the visualisation of the node
Abies alba containing integrated data, along with its metadata panel on the right side, and the biotic interactions (
:hasHost,
:interactsWith) with respect to other species.
6. Demonstrator
In the following section, we demonstrate the application and utility of the ontology and KG in a design-centric case study in Vienna, located on the Nordbahnhof Freie Mitte site. More details on the case study and on the development of the respective network configuration are presented in a dedicated study [
20]. The main aim of the demonstrator is to showcase how the higher-level requirements defined in Section 4.2 can be expressed and captured by using a network configuration in the CAD environment (cf.
Figure 3). Local species around the site are curated and provided by the Vienna municipality, and in addition, they can be obtained from GBIF API (cf.
Figure 11), albeit with the data quality issues commonly encountered in citizen science approaches. The case study can be replicated at any other location, provided that the local data are available. In this demonstration experiment, the widely used McNeel Rhinoceros 3D and Grasshopper 3D platforms were used. The platform consists of a 3D CAD environment (Rhinoceros 3D) that is interactively linked with a domain-specific node-based programming interface (Grasshopper 3D). Dedicated ETL workflows can be implemented within this node-based programming interface, using both native Grasshopper components and custom functionalities, e.g., Python scripts encapsulated using the native
Grasshopper Python Script component. Since the Grasshopper 3D platform does not contain any components that can interact with ontologies or the GraphDB platform, the required functionalities were implemented as Python scripts in the Grasshopper environment. This implementation was based on the GraphDB REST API specification and the underlying RDF4J SPARQL endpoint.
The pipeline (cf.
Figure 12) that we describe in the demonstration section consists of various steps, starting with cleaning and mapping the designer network created by the end-users (designers) within the Rhinoceros 3D CAD environment. The designer network consists of nodes (cf.
Figure 3) represented both visually in the 3D CAD environment and mapped to the JSON-LD schema (
https://www.w3.org/TR/json-ld11/ (accessed on 17 March 2025)) within the Grasshopper platform. The implemented JSON-LD serialisation prepares the 3D CAD data to be pushed to GraphDB through the embedded RDF4J SPARQL endpoint. The implemented REST-based interface, consisting of the JSON-LD parser and GraphDB SPARQL endpoint, works bi-directionally, allowing the end-users to receive the data returned by the SPARQL endpoint, formatted in compliance with the native Grasshopper data typing and formatting. From the end-user perspective, the results of the SPARQL-based reasoning can be used in the Grasshopper-native ETL workflows. From the perspective of ontology engineering, data received from the Grasshopper platform can be validated and used for SPARQL-based reasoning. In the current implementation, before every run of the Grasshopper pipeline, the data were cleaned by calling
DROP GRAPH from the Grasshopper environment. This ensured that no stale data representing the past geometric condition of the 3D CAD environment were present in the knowledge graph. The rationale for choosing JSON-LD lies in the fact that JSON-LD is compatible with JSON, and the Grasshopper 3D platform features limited compatibility with JSON data streams through the jSwan Grasshopper plugin.
6.1. Mapping of Designer Network Data and Voxel Data Layers in EIM Ontology
The ECOLOPES Voxel Model contains environmental data describing the Viennese site of Nordbahnhof Freie Mitte. The sunlightHoursSummer and sunlightHoursGrowingSeason properties map Voxel Data Layers to the respective attributes in the EIM Ontology. For instance, the designer places an EcoNode, defines its subType as Plant, and sets dwc:vernacularName to “Silver fir”. The KG stores mappings between dwc:vernacularName (“Silver fir”) and dwc:scientificName (e.g., Abies alba), along with additional plant datasets and respective metadata. Consequently, the sunlight requirements (minSunHours and maxSunHours) are retrieved for the given EcoNode.
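This retrieval chain can be sketched as follows; minSunHours and maxSunHours are named as above, while the namespace IRI and the remaining structure are assumptions:
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI

# From a placed EcoNode's vernacular name to its sunlight requirements.
SELECT ?node ?minSun ?maxSun WHERE {
  ?node a ecolopes-schema:EcoNode ;
        dwc:vernacularName ?vn .
  ?species dwc:vernacularName ?vn ;
           ecolopes-schema:minSunHours ?minSun ;
           ecolopes-schema:maxSunHours ?maxSun .
}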
6.2. Validation Process for EcoNodes and ArchiNodes
The presented functionality, showcasing the interaction between end-users, the ECOLOPES Voxel Model, and the knowledge graph, utilises the GraphDB SPARQL endpoint to compare the
sunlightHoursGrowingSeason property retrieved from the ECOLOPES Voxel Model with the
minSunHours and
maxSunHours properties queried from the knowledge graph. This query returns a Boolean value (
?isValid) as the result of the comparison. In the designer networks workflow, the designer places an ArchiNode and defines its
subType as
Infrastructure or
Building. Similar validation and mapping processes are outlined for ArchiNodes, including the prey area that is predefined by the designer as a rectangular shape. The rule-based process is based on CONSTRUCT queries [
19] for the respective solar, proximity, and prey area constraints. In
Figure 13, one can see the query for the proximity constraint. The CONSTRUCT query is transformed into an INSERT query in order for the validation data to not only be computed but also materialised. The query takes into account multiple proximity constraints between different nodes
n, in the range of O(n²) computational complexity. The variable ?res computes the Euclidean distance between nodes using GraphDB’s math functions (
https://graphdb.ontotext.com/documentation/10.8/sparql-ext-functions-reference.html#mathematical-function-extensions (accessed on 17 March 2025)).
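A simplified sketch of the proximity rule is given below; to remain portable, it compares squared distances instead of calling the GraphDB-specific square-root function, and the coordinate and flag properties are assumptions:
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI

# Materialise a validity flag for each constrained pair of nodes.
INSERT {
  <<?a ecolopes-schema:proximityDistance ?b>> ecolopes-schema:isValid ?ok . # assumed flag
}
WHERE {
  <<?a ecolopes-schema:proximityDistance ?b>>
      ecolopes-schema:proximityRequirementMin ?min ;
      ecolopes-schema:proximityRequirementMax ?max .
  ?a ecolopes-schema:x ?ax ; ecolopes-schema:y ?ay . # assumed coordinates
  ?b ecolopes-schema:x ?bx ; ecolopes-schema:y ?by .
  BIND((?ax - ?bx) * (?ax - ?bx) + (?ay - ?by) * (?ay - ?by) AS ?sq)
  BIND((?sq >= ?min * ?min && ?sq <= ?max * ?max) AS ?ok)
}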
6.3. Preparing and Sending Network Data
At the beginning of the designer network workflow, the end-user inputs are read as data stored as key-value pairs, assigned to individual Rhinoceros 3D (ver. 7.3) objects representing network nodes. The 3D CAD environment provides the functionality to store key-value pairs as User Attributes attached to any native 3D CAD object, such as Rhino Point 3D. Moreover, the custom User Attributes are exposed in the Grasshopper 3D (ver. 7.3) interface and can be used by designers in the Grasshopper node-based programming interface. For this workflow, two custom Rhino commands,
PlaceArchiNode and
PlaceEcoNode (cf.
Figure 14), were implemented in Python (ver. 2.7), allowing the designers to place native Point 3D objects containing the required attributes and their values. The designers can construct networks in the 3D CAD environment, consisting of nodes, and the implemented Grasshopper pipeline converts and outputs data to JSON-LD, matching the ontology (TBox) schema. The developed Grasshopper components send the JSON-LD formatted Network Node data to the GraphDB (ver. 10.4) environment through the SPARQL endpoint.
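Conceptually, the payload for a single placed node corresponds to an update of the following shape; the IRI scheme, named graph, and coordinate values are illustrative:
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#>      # assumed namespace IRI
PREFIX networks-inst: <https://w3id.org/ecolopes/networks/inst#> # assumed namespace IRI

# One EcoNode as sent from Grasshopper to the SPARQL endpoint.
INSERT DATA {
  GRAPH networks-inst:designerNetwork {
    networks-inst:EcoNode_01 a ecolopes-schema:EcoNode ;
        ecolopes-schema:subType "Plant" ;
        dwc:vernacularName "Silver fir" ;
        ecolopes-schema:x 12.5 ;
        ecolopes-schema:y 48.0 ;
        ecolopes-schema:z 0.0 .
  }
}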
6.4. Define Evaluation Constraints
The last part of the demonstrator pipeline enables the designer to select constraints to validate the
network configuration created in the 3D CAD environment. From within the Grasshopper interface, end-users can choose the Solar and Proximity constraints to be validated against, cf.
Figure 15, for visualisation of the validation, which is stored in GraphDB. The final output is a Boolean answer that is computed in GraphDB by the
AND (“join”) intersection of constraints, which mimics using the
MIN operator in the SPARQL query. The results of this validation are returned to the Grasshopper pipeline and visualised in the Rhinoceros 3D CAD interface. This mechanism provides visual feedback to the end-user, indicating which nodes in the designer networks do not fulfil the constraints inferred from the ontologies using SPARQL-based reasoning.
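The conjunction can be sketched as an aggregation: with validity flags encoded as Booleans (where false orders before true), taking the minimum per node behaves like a logical AND; the flag property is an assumption for illustration:
PREFIX ecolopes-schema: <https://w3id.org/ecolopes/schema#> # assumed namespace IRI

# A node is valid overall iff all of its constraint flags are true.
SELECT ?node (MIN(?flag) AS ?isValid) WHERE {
  ?node ecolopes-schema:constraintResult ?flag . # assumed property
}
GROUP BY ?node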
7. Conclusions and Future Work
This paper presents an ontology-driven approach to integrate diverse datasets—ranging from public APIs and endpoints, through internal and local sources, to CAD environments—into a unified RDF-based knowledge graph, facilitating the design of sites as well as fostering ecological building design. By leveraging OBDM, the approach balances virtualisation and materialisation, ensuring efficient data integration while supporting advanced queries, updates, and reasoning. The ECOLOPES ontologies and knowledge graph bridge the life sciences, geography, and geometric data, enabling designers to address the ecological constraints of solar radiation, proximity, and prey areas through interactive workflows. The application of OBDM has been demonstrated in the case of computing the permissible positions on a site where one can place a plant species (EcoNode) based on solar radiation and the plant’s requirements. Furthermore, this was extended to a case study comprising a network of Archi- and EcoNodes, where additional requirement constraints are applied, namely proximity and prey areas (cf.
Section 4.2). Through the application of a network configuration of nodes and the respective expressivity of the higher-level constraints, the case study highlighted the ontology’s practical application in designing habitats that promote species attractiveness and urban biodiversity.
The advantage of the proposed approach is its flexibility, as different constraints have already been implemented and can be plugged in using Grasshopper components. Moreover, further constraints can be added to the extent that their implementation can be provided using SPARQL update queries. The latter limits the expressivity of the constraints to that of the SPARQL language [
19,
48] and its extensions for expressing those constraints. The next step of joining the constraints can be simply carried out using the implementation provided in this paper via the off-the-shelf Grasshopper components. The adoption of the workflow by the community is made possible due to the encapsulation of the complexity in Grasshopper components, through which different constraints can be enabled or disabled.
The problem setting is limited to the translational process of gathering the requirements from the design brief and creating the network of nodes. This translation (or mapping) is performed subjectively by the designer. The automatic generation of nodes, which is out of the scope of this work, could possibly employ generative AI in conjunction with the proposed ontologies and knowledge graph.
It is worth emphasising that the validation of the approach from the ecological point of view is left for future work. Furthermore, we plan to use SHACL to define constraints, as well as use a bespoke rule language (e.g., Datalog or SWRL (
https://www.w3.org/submissions/SWRL/ (accessed on 17 March 2025))) instead of SPARQL queries in order to adhere to Semantic Web standards. More importantly, we plan to extend the problem setting towards a generative process (geometric articulation) in ontology-aided generative computational design by employing more sophisticated approaches akin to answer set programming [
49,
50].