Article

Semantic Collaborative Environment for Extended Digital Natural Heritage: Integrating Data, Metadata, and Paradata

Korea National University of Heritage, Buyeo 33115, Republic of Korea
*
Author to whom correspondence should be addressed.
Heritage 2025, 8(12), 507; https://doi.org/10.3390/heritage8120507
Submission received: 31 October 2025 / Revised: 29 November 2025 / Accepted: 1 December 2025 / Published: 4 December 2025

Abstract

Natural heritage digitization has evolved beyond simple 3D representation. Contemporary approaches require transparent documentation integrating biological, heritage, and digitization standards, yet existing frameworks operate in isolated domains without semantic interoperability. Current digitization frameworks fail to integrate biological standards (Darwin Core, ABCD), heritage standards (CIDOC-CRM), and digitization standards (CRMdig, PROV-O) into a unified semantic architecture, limiting transparent documentation of natural heritage data across its entire lifecycle—from physical observation through digital reconstruction to knowledge reasoning. This study proposes an integrated semantic framework comprising three components: (1) the E-DNH ontology, which adopts a triple-layer architecture (data–metadata–paradata) and a triple-module structure (nature–heritage–digital), bridging Darwin Core, CIDOC-CRM, CRMdig, and PROV-O; (2) the HR3D workflow, which establishes a standardized high-precision 3D data acquisition protocol that systematically documents paradata; and (3) the C-EDNH platform, which implements a Neo4j-based knowledge graph with semantic search capabilities, AI-driven quality assessment, and persistent identifiers (NSId/DOI). The framework was validated through digitization of 197 natural heritage specimens (68.5% avian, 24.9% insects, 5.1% mammals, 1.5% reptiles), demonstrating high geometric accuracy (RMS 0.18 ± 0.09 mm), visual fidelity (SSIM 0.92 ± 0.03), and color accuracy (ΔE00 2.1 ± 0.7). The resulting knowledge graph comprises 15,000+ nodes and 45,000+ semantic relationships, enabling cross-domain federated queries and reasoning. Unlike conventional approaches that treat digitization as mere data preservation, this framework positions digitization as an interpretive reconstruction process. 
By systematically documenting paradata, it establishes a foundation for knowledge discovery, reproducibility, and critical reassessment of digital natural heritage.

1. Introduction

The 1972 UNESCO Convention on the Protection of World Cultural and Natural Heritage defined both cultural and natural heritage as assets of universal value to be collectively preserved by humanity [1]. As of 2025, there are 1248 properties on the World Heritage list, of which 235 are natural heritage sites serving as environmental archives that document geological and biological processes [2]. Recently, the digital transformation of natural heritage has become a key infrastructure for conservation science, biodiversity research, and ecological education [3]. However, most institutions remain at a production-centric stage [4]. Although high-precision technologies such as photogrammetry, structured-light scanning, and CT are widely used [5], the resulting 3D data exist in heterogeneous formats (e.g., OBJ, PLY, glTF) [6] and are constructed according to isolated institutional standards. This fragmentation impedes long-term preservation and reuse, as isolated data structures hinder cross-institutional integration [7], and geometric data are frequently managed separately from metadata, making it difficult to capture temporal changes [8].
In this context, biodiversity informatics introduced the concept of the Digital Extended Specimen (DES), which manages physical specimens as extended digital objects linked with genomic, geographic and ecological data [9,10], realizing the principles of FAIR [11]. However, this concept remains mainly focused on biological taxonomy, often failing to reflect the cultural heritage contexts emphasized by UNESCO [12]. For example, the SYNTHESYS3 project integrated specimen data from 21 European institutions but lacked the systematic management of high-fidelity 3D models, failing to realize the multidimensional connectivity required by DES [13]. To bridge this gap, this study draws on cultural heritage precedents in which frameworks integrate metadata and paradata to ensure data reliability and transparently communicate the technical decisions and interpretive contexts underlying digital reconstruction [14,15]. This demonstrates how subjective interpretations, such as collective memory and cultural identity, can be systematically encoded through structured metadata. The concept “Memory Twin” exemplifies this approach, presenting a participatory ecosystem aligned with the principles of FAIR and CARE [16].
Building on these precedents, this study addresses three research questions: RQ1: How can biological standards (Darwin Core, ABCD), heritage standards (CIDOC-CRM [17]) and digitization standards (CRMdig [18], PROV-O [19]) be integrated into a unified semantic framework for transparent documentation throughout the lifecycle of natural heritage? RQ2: How can the paradata from high-precision 3D digitization be systematically structured and integrated with the data and metadata to enable reproducibility? RQ3: How can a knowledge graph-based platform support semantic reasoning and federated queries across biological, heritage, and digitization domains? To address these questions, we implement DES as a practical system, proposing a digital ecosystem that integrates paradata from a nature-specific high-fidelity 3D digitization workflow with data and metadata into a graph-based knowledge database, moving beyond text-centric systems to encompass biological attributes, heritage values, and technical contexts.

2. The State of the Art

2.1. Evolution of Natural Heritage Data Management Environments

Natural heritage data standardization, evolving from Linnaean taxonomy and 19th-century nomenclature codes, faces structural limitations with 3D digitization proliferation. Although GBIF and iDigBio secured formal interoperability through Darwin Core-based networks (180 countries, 118 million records), their text-based occurrence structures inadequately integrate high-dimensional data such as 3D models and paradata [20]. Arctos provides community governance and flexible attribute management, but lacks ontological structures to semantically link 3D technical attributes. All three platforms employ researcher-centric biodiversity interfaces, failing to establish circular governance that includes data producers, managers, and users. As Table 1 indicates, current platforms emphasize formal interoperability while neglecting semantic integration to interpret the microscopic, social, and heritage contexts of samples, with limited structural scalability for the representation of high-dimensional data. Expert-centric participation structures prevent fully operational circular governance [21], necessitating a transition toward intelligent models that integrate complicated cultural contexts with diverse user dynamics.

2.2. Evolving Standardization by Integrating Metadata, Paradata, and 3D Data Management

Natural heritage data standardization, evolving from Linnaean taxonomy and 19th-century nomenclature codes, faces structural limitations with 3D digitization proliferation. Darwin Core (DwC) and ABCD, core biodiversity exchange standards [26], employ text-based two-dimensional structures inadequate for describing geometric properties, resolution, and lineage of three-dimensional models. Audubon Core, designed for 2D media, lacks LOD management and provenance tracking for such models. While FAIR principles emphasize accessibility [11], they omit quality metrics like mesh integrity and texture resolution; CARE principles similarly lack paradata frameworks documenting equipment specifications, resolution settings, and post-processing algorithms, all of which are critical for assessing scientific suitability when identical specimens undergo different digitization methods. In cultural heritage, CRMdig ontologically models digitization [18] but lacks integration with natural science attributes and Darwin Core mechanisms.
Addressing these gaps requires three-layered infrastructures managing spatial properties, workflows, and quality contextually. The Tendaguru Dinosaur Expedition validated this approach (data–metadata–paradata) for documenting excavation contexts, taxonomic revisions, and conservation histories, with paradata comprising 80–90% of datasets and ensuring reproducibility [27], though universal applicability remains limited. The EUreka3D project advanced this through composite entities (point clouds, meshes, textures) and standardized management via integrated repositories, aggregators, viewers, and virtual environments [28], yet cannot adequately represent specimen–collection–institution hierarchies and dynamic biological taxonomy. Consequently, bridging biological standards (DwC, ABCD) with digitization standards (CRMdig) through validated infrastructure is essential for inter-institutional interoperability, scientific reliability, and collaborative reuse of three-dimensional natural heritage data.

3. Methodology

Although natural heritage digitization has advanced through OCR-based transcription and quality management, existing standards inadequately capture morphological precision, heritage values, and contextual paradata, necessitating systems that transcend “accurate information” to become interpretive knowledge resources integrating meaning, provenance, and context.
This study proposes an integrated methodology based on the concept of the digital extended specimen (DES) [9], a multi-layered structure that extends from physical specimens through CMS records, annotations, and automated transactions. The model systematizes data lifecycles via a three-tier architecture (data, metadata, paradata) through four stages: (1) data acquisition and normalization consolidates multi-institutional specimens with standardized terminology; (2) 3D digitization establishes workflows combining structured-light scanning and photogrammetry with controlled parameters ensuring reproducibility; (3) knowledge graph database interlinks biological data, heritage values, and digitization processes through E-DNH ontology; (4) collaborative management and utilization integrates semantic search, AI-driven quality assessment, real-time annotation, and persistent identifiers for global interoperability. As illustrated in Figure 1, this framework operates in organic integration with the DES three-tier model, establishing a collaborative circular ecosystem where biodiversity researchers, digitization technicians, curators, and educators participate in the production, management, and reuse of natural heritage data, ensuring reliability and interoperability through a shared semantic layer.

Data Resource Layer

The initial stage of this research focuses on the establishment of the Data Resource Layer, which serves as the foundational infrastructure for the E-DNH ontology. This layer provides a systematic basis for subsequent 3D digitization workflows and ontology design processes through the physical acquisition of natural heritage specimens and the normalization of descriptive data. A total of 197 specimens of natural heritage were collected through institutional collaboration with the Hannam University Museum of Natural History, the National Heritage Administration, and Seoul Grand Park, comprising avian species (68.5%), insects (24.9%), mammals (5.1%), and reptiles (1.5%), as detailed in Table 2.
Specimen selection was conducted based on institutional preservation environments, data management systems, and scanning suitability criteria, with systematic standardization of primary metadata and digital quality control variables. Through advisory workshops involving domain experts from the National Heritage Administration and the Korea National Park Service, data normalization protocols were established to resolve inter-institutional inconsistencies in taxonomic identification, storage conditions, and metadata documentation standards. The dataset constructed during this stage is organized into three categorical domains (Table 3): (1) specimen attributes, (2) project-based data, and (3) heritage records. These domains are interconnected through an integrated schema architecture, ensuring cross-institutional interoperability and providing a scalable infrastructure for subsequent 3D digitization processes and ontology-based semantic integration.

4. Higher-Reality 3D Digitization Workflow for Natural Heritage

Building upon the Data Resource Layer, this section investigates the 3D digitization workflow during physical-to-digital transition, focusing on paradata specificities unique to natural heritage. Natural heritage 3D digitization presents greater morphological authenticity challenges than cultural heritage due to biologically variable elements such as feathers, epidermis, and body hair [29], requiring distinct protocols that accommodate irreversible morphological deformation and surface variations. This study establishes a workflow tailored to biological specimens, operationalizing extended digital specimens by systematically structuring all technical and interpretive decisions as paradata. As illustrated in Figure 2, the workflow comprises six stages—pre-processing, acquisition, QA/QC, post-processing, packaging, and meta-layer integration—designed as an integrated system simultaneously generating data, metadata, and paradata.

4.1. Digitization Workflow and Quality Management

During pre-processing, specimen characteristics and environmental variables were quantitatively defined, with equipment settings standardized for reproducibility. Scanning was performed under controlled conditions: temperature 20 ± 2 °C, humidity 45 ± 5%, diffused LED illumination (1000–1500 lux, 5500 K, CRI > 95), and cross-polarization filters to minimize specular reflection. A structured-light scanner (Artec Space Spider, Artec 3D, Luxembourg City, Luxembourg; 0.05 mm accuracy) captured fine morphological details via automated multi-angle scanning (360° rotation, 5° intervals, 48 positions). Texture acquisition used a Sony α7R V camera (Sony Corporation, Tokyo, Japan; 61 MP), generating 12,000–25,000 frames per session. All parameters were documented as PROV-O-based datasets (Table 4).
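As a minimal sketch of how such acquisition conditions could be recorded and checked against the stated protocol ranges, the following Python fragment builds a PROV-O-flavoured record. The ranges come from the text above; the field names, the `parameters` and `deviations` keys, and the specimen identifier are illustrative assumptions, not the schema of Table 4.

```python
from dataclasses import dataclass, asdict

# Protocol ranges from the text; the key names are assumptions for illustration.
PROTOCOL = {
    "temperature_c": (18.0, 22.0),        # 20 ± 2 °C
    "humidity_pct": (40.0, 50.0),         # 45 ± 5 %
    "illuminance_lux": (1000.0, 1500.0),  # diffused LED illumination
}


@dataclass
class AcquisitionConditions:
    temperature_c: float
    humidity_pct: float
    illuminance_lux: float


def out_of_protocol(cond):
    """Return the names of the conditions that fall outside the protocol ranges."""
    values = asdict(cond)
    return [name for name, (low, high) in PROTOCOL.items()
            if not low <= values[name] <= high]


def to_prov_record(specimen_id, cond):
    """Build a dictionary loosely modelled on a prov:Activity (keys are hypothetical)."""
    return {
        "@type": "prov:Activity",
        "prov:used": specimen_id,
        "parameters": asdict(cond),
        "deviations": out_of_protocol(cond),
    }


# Hypothetical session: humidity drifts above the 45 ± 5 % window.
record = to_prov_record("specimen:example-001",
                        AcquisitionConditions(20.5, 52.0, 1200.0))
```

Recording the deviations alongside the raw parameters keeps the paradata self-describing: a later reader can see not only what the conditions were but also which of them departed from the protocol.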
The acquisition process adopted a hybrid approach that combined structured-light scanning and photogrammetry. Small-to-medium specimens were scanned at 25–30 cm with 70% overlap, with occluded regions rescanned in real-time. The large specimens were captured using 192 RAW images with 80% overlap and 30–60° crossing angles, processed using the SfM and MVS algorithms in Agisoft Metashape 2.1.2. Photogrammetric alignment and point cloud reconstruction were performed in ContextCapture (Figure 3).
Quality assurance and control were conducted on all 197 specimens, with 20 (10%) randomly sampled for validation. Geometric accuracy was evaluated using RMS, HD95, and Chamfer Distance; visual fidelity was evaluated through SSIM, ΔE00 (CIEDE2000), and FID. The results yielded RMS 0.18 ± 0.09 mm, ΔE00 2.1 ± 0.7, and SSIM 0.92 ± 0.03, meeting the thresholds (RMS ≤ 0.3 mm, SSIM ≥ 0.95, ΔE00 ≤ 3.0) for academic and exhibition applications. Quality data were stored as :QualityCheck nodes in Neo4j, with subthreshold datasets automatically flagged for reacquisition. Post-processing employed ICP alignment, TSDF fusion, noise removal, remeshing, UV unwrapping, 8K texture baking, and color correction through Blender 4.0.2 and Artec Studio 18, maintaining ΔE00 ≤ 2.1 through ColorChecker calibration. The photogrammetry and structured-light datasets were merged, achieving 100% watertight meshes that combine high-resolution textures with precise geometry. The final outputs comprised a three-tier structure: archival OBJ (0.5–2 GB), analytical PLY (2–5 GB), and web-deployable GLB (50–200 MB), each assigned a Darwin Core identifier (dwc:occurrenceID) and cryptographic checksums before transfer to an OAIS (ISO 14721:2012 [30])-compliant preservation system.
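The threshold-based flagging described above can be sketched as follows. The threshold values are those quoted in the text; the metric key names, the helper functions, and the sample values are hypothetical, and the sketch does not reproduce the platform's actual :QualityCheck logic.

```python
import math

# Thresholds quoted in the text; "max" means the value must not exceed the limit,
# "min" means it must not fall below it. Key names are illustrative assumptions.
THRESHOLDS = {
    "rms_mm": ("max", 0.3),
    "ssim": ("min", 0.95),
    "delta_e00": ("max", 3.0),
}


def rms(distances_mm):
    """Root-mean-square of per-point distances between scan and reference."""
    return math.sqrt(sum(d * d for d in distances_mm) / len(distances_mm))


def failed_metrics(metrics):
    """Return the metric names that miss their threshold (empty list = pass)."""
    failed = []
    for name, (kind, limit) in THRESHOLDS.items():
        value = metrics[name]
        passed = value <= limit if kind == "max" else value >= limit
        if not passed:
            failed.append(name)
    return failed


def needs_reacquisition(metrics):
    """A dataset is flagged when at least one metric misses its threshold."""
    return bool(failed_metrics(metrics))


# Hypothetical measurements for two datasets.
good = {"rms_mm": 0.18, "ssim": 0.96, "delta_e00": 2.1}
poor = {"rms_mm": 0.35, "ssim": 0.96, "delta_e00": 2.1}
```

Returning the list of failed metrics, rather than a bare pass/fail flag, preserves the reason for reacquisition as paradata.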

4.2. Paradata Documentation and Integration

This stage identifies and structures contextual (paradata) and evidential elements throughout digitization, demonstrating how technical manipulations, instrumentation, and human judgment influence interpretive and representational fidelity. This systematic documentation serves two critical methodological functions: it enables reproducibility by establishing transparent records of technical decisions, and it supports critical reassessment by exposing the interpretive choices embedded within digital outputs. Unlike cultural heritage, natural heritage must accommodate biological variability and irreversible temporal changes; therefore, each scanning event functions as an evidential occurrence reflecting environmental, instrumental, and procedural contexts that impact reproducibility and scientific credibility. To establish empirical foundations for the categorization of paradata, the technical and environmental variables (illumination, humidity, scanning angles) were quantitatively analyzed for correlations with geometric accuracy, texture fidelity, and reconstruction quality. Paradata elements were categorized into three domains (Table 5): (1) physical context (environment, specimen mounting, device resolution); (2) computational context (alignment algorithms, texture mapping); and (3) interpretive context (AI model selection, manual editing, quality thresholds). These categories structure digitization as an evidence–process–outcome chain, with paradata records corresponding to ontological classes (AcquisitionEvent, DataProcessing, QualityAssessment). By providing the semantic basis for integrating digitization standards (CRMdig, PROV-O) with biological and heritage metadata, this structured documentation enables precise tracking of how technical procedures and interpretation constitute digital specimens. These specimens are thereby treated as evidence-based composite objects integrating scientific observation, technical reconstruction, and interpretive transparency, which provides the foundation for the E-DNH ontology.

5. Extended Digital Natural Heritage Ontology

5.1. Purpose of the Ontology Structure

Natural heritage constitutes both biological entities and composite artifacts accumulating human acts of perception, documentation, and preservation; however, existing digital systems remain confined to specimen-centric metadata management, inadequately reflecting this multi-layered character. To address the challenge of integrating heterogeneous standards across biological, heritage, and digitization domains, this study proposes the Extended Digital Natural Heritage (E-DNH) ontology, conceptualizing natural heritage as knowledge ecosystems that systematically describe interacting structures of observed nature, heritage values, and digital processes. The ontology resolves three methodological requirements. First, it provides a shared semantic model, semantically connecting terminologies and data structures across Darwin Core, CIDOC-CRM, and CRMdig/PROV-O, bridging isolated domain standards into a coherent framework. Second, through the explicit formalization of tacit knowledge, it systematizes scientific, institutional, and technical knowledge into explicit relationships and logical constraints, enabling transparent documentation of paradata to support reproducibility and critical reassessment. Third, by enabling machine reasoning beyond data storage, it supports the inference-based generation of novel insights (e.g., conservation prioritization, climate risk assessment). Its modular, standards-based design ensures scalability and connectivity to global natural heritage knowledge ecosystems, operationalizing the DES concept through a semantic infrastructure for federated queries and cross-domain knowledge discovery.

5.2. Design Principles

The E-DNH ontology was designed based on three foundational principles to structurally represent complex interrelationships among physical reality, cultural contexts, and digital activities of natural heritage.
Triple Module with Data, Metadata, and Paradata. The E-DNH ontology distinguishes information into three conceptual tiers: the data layer represents physical specimens or observational records; the metadata layer describes cultural values, legal status, and curatorial contexts; and the paradata layer encompasses technical and interpretive processes of digital production. This design functions as both a data structure and a principle representing information lifecycles and epistemological evolution of knowledge. Conventional systems conflate facts, interpretations, and contexts in a single plane, impeding credibility assessment. By stratifying knowledge, this architecture enables retroactive tracking from digital outputs to raw data through the PROV-O provenance model, ensuring reproducibility. Table 6 demonstrates how 3D digitization attributes operate as paradata, illustrating uncertainty propagation across equipment calibration, acquisition conditions, and processing chains, integrating the quality assessment framework [31] with systematic uncertainty documentation [32] through PROV-O and CRMdig.
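The retroactive tracking from a digital output back to raw data can be illustrated with a small derivation chain. This is a sketch only: the entity names are hypothetical, and a plain dictionary stands in for prov:wasDerivedFrom links that would otherwise live in the knowledge graph.

```python
def trace_to_source(entity, derived_from):
    """Follow derivation links from an output back to the entity with no parent.

    `derived_from` maps each derived entity to the entity it was derived from,
    mirroring a chain of prov:wasDerivedFrom statements.
    """
    chain = [entity]
    seen = {entity}
    while chain[-1] in derived_from:
        parent = derived_from[chain[-1]]
        if parent in seen:
            raise ValueError(f"cyclic derivation detected at {parent!r}")
        chain.append(parent)
        seen.add(parent)
    return chain


# Hypothetical derivation chain for one specimen (names are assumptions).
DERIVED_FROM = {
    "web_model.glb": "archival_model.obj",
    "archival_model.obj": "fused_mesh",
    "fused_mesh": "raw_scan_frames",
}

lineage = trace_to_source("web_model.glb", DERIVED_FROM)
```

The cycle check matters in practice because a provenance record that loops back on itself cannot be resolved to a raw source and should be surfaced as a documentation error.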
Semantic Reasoning and Knowledge Expandability. The E-DNH ontology incorporates reasoning mechanisms deriving implicit knowledge from explicitly modeled information, functioning as an active knowledge system supporting knowledge discovery and hypothesis generation. Taxonomic reasoning connects Taxon Identification and Location data to identify biodiversity patterns; heritage value reasoning determines conservation priorities through relational analysis between Heritage Aspect and Outstanding Universal Value; and technical reasoning enhances traceability by linking Quality Assessment with Provenance Trace. These capabilities are enabled through inter-module relationships and semantic definitions, allowing the knowledge graph to expand dynamically and generate novel insights through evolving relationships.

5.3. Structural Principles of Modularization

The E-DNH ontology adopts a modular architecture inspired by the NeOn Methodology [33], decomposing complex natural heritage into semantically connected components. This approach enables domain-specific module refinement (Nature Module for biological dimensions, Heritage Module for cultural contexts, Digital Module for digitization processes), ensures adaptability across diverse projects, and minimizes dependencies while maintaining inter-module interoperability. Core entities (DigitalSpecimen, HeritageAspect, DigitalActivity) function as semantic bridges enabling cross-module reasoning and federated querying.
As visualized in Figure 4, the three modules correspond directly to the data–metadata–paradata structure defined in the Triple Module with Data, Metadata, and Paradata principle. The Nature Module models empirical data (e.g., Event, Location), the Heritage Module encodes interpretive and curatorial metadata (e.g., HeritageProject, LegalInstrument), and the Digital Module captures process-level paradata (e.g., AcquisitionEvent, DataProcessing). Together, they represent the complete intellectual lifecycle of natural heritage—from observation through value attribution to digital reconstruction and validation—positioning E-DNH as a semantic knowledge infrastructure supporting integrative reasoning, transparency, and long-term sustainability.

5.4. Nature Module (Nature and Digital Specimen)

The Nature Module integrates biodiversity informatics standards to enable semantic interoperability across natural history institutions. It adopts Darwin Core [26] as its core vocabulary—a standard developed through over 30 years of collaborative effort—while integrating ABCD 3.0 [34] for anatomical specimen descriptions, Audubon Core [35] for multimedia metadata, and the MorphoSource model for 3D morphological data. This standards-based integration ensures both scientific completeness and cross-institutional data exchange among museums, research institutions, and biodiversity aggregators.

5.4.1. Definitions of Core Classes in Nature Module

Table 7 defines six core classes establishing the ontological structure for specimen representation. These classes extend Darwin Core concepts to support structured documentation of specimen attributes, spatiotemporal provenance, taxonomic determination, and multimedia associations.

5.4.2. Object Properties in Nature Module

Table 8 summarizes the principal object properties defined for the Nature Module.

5.5. Heritage Module

The Heritage Module semantically models the processes through which natural heritage is recognized, protected, and managed as a shared cultural asset. Building upon CIDOC-CRM’s event-centered philosophy—conceptualizing heritage as a socially constructed phenomenon evolving over time—the module integrates the LIDO [36] metadata schema to capture heritage valuation, inscription, conservation, and utilization processes. This standards-based integration bridges biological metadata with heritage metadata, enabling unified reasoning across natural and cultural dimensions. Applied to 197 specimens from UNESCO-designated sites (e.g., Jeju Volcanic Island and Lava Tubes) and national natural monuments (e.g., Natural Monument No. 182), the module demonstrates how biological specimens function simultaneously as scientific evidence and culturally constituted assets.

5.5.1. Definitions of Core Classes in Heritage Module

Table 9 defines six core classes establishing the ontological structure for heritage representation. These classes extend CIDOC-CRM concepts to support structured documentation of heritage values, legal frameworks, and social dimensions, addressing existing DES implementations’ failure to reflect UNESCO’s cultural heritage contexts. The integration enables three forms of cross-domain reasoning demonstrated in Section 5.7: (1) conservation prioritization linking OutstandingUniversalValue with specimen condition assessments, (2) temporal analysis tracing HeritageInscription events against taxonomic revisions, and (3) stakeholder network analysis connecting SocialEngagement with institutional Agent roles, operationalizing UNESCO’s recognition criteria within a computable framework.

5.5.2. Object Properties in Heritage Module

Table 10 summarizes the principal object properties defined for the Heritage Module.

5.6. Digital Module

The Digital Module structures all technical activities, decisions, and evaluations during three-dimensional digitization as paradata, directly addressing transparent documentation challenges. It integrates W3C PROV-O, which models genealogical relationships through the triad of Activity, Entity, and Agent, with CRMdig, the CIDOC-CRM extension, modeling digitization processes. CRMdig precisely represents Digitization Process (D2), Parameter Assignment (D13), and Digital Object (D1), providing a conceptual foundation for translating EU Eureka3D Quality Framework metrics into an ontological structure. Applied to 197 specimens digitized through robotic scanning, multi-sensor fusion, and AI-based texture reconstruction, the module operationalizes the HR3D workflow, formalizing technical decisions ensuring trustworthiness and reproducibility.

5.6.1. Definitions of Core Classes

Table 11 defines six core classes that establish the ontological structure for the representation of the digitization process. These classes extend PROV-O and CRMdig to support structured documentation of technical parameters, computational workflows, and quality metrics. The integration enables three forms of technical reasoning: (1) uncertainty propagation analysis tracing quality degradation through processing chains, (2) parameter optimization linking acquisition conditions with quality outcomes, and (3) provenance-based validation verifying digital asset authenticity, operationalizing digital specimens as interpretive reconstructions rather than neutral recordings.

5.6.2. Object Properties in Digital Module

Table 12 summarizes the principal object properties defined for the Digital Module.

5.7. Ontology-Based Knowledge Graph Implementation

To validate the practical utility of the E-DNH ontology for knowledge discovery beyond data retrieval, we constructed a knowledge graph comprising 1243 nodes and 3856 edges centered on 197 DigitalSpecimen nodes. Cross-module bridge relationships (hasOutstandingValue, digitizedBy, implementedThrough) connecting the Nature, Heritage, and Digital Modules enable semantic reasoning across biological, cultural, and technical dimensions. Graph analysis confirms the structural properties critical for knowledge navigation. DigitalSpecimen nodes function as network hubs (avg. degree 12.3), facilitating centralized access to multi-domain information; bridge edges demonstrate high betweenness centrality (0.42), whose removal fragments the graph into three isolated components, confirming their role as essential semantic connectors; small-world topology (L = 3.2) enables efficient query traversal across domains; and high clustering (0.68) supports context-based subgraph extraction for domain-specific analyses.
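The role of bridge relationships as semantic connectors can be illustrated on a toy graph. The sketch below is not the study's 1243-node graph: the six nodes, the endpoints assigned to each bridge relationship, and the within-module edges are assumptions chosen only to show how removing bridges splits the graph into one component per module.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def average_degree(adjacency):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)


def count_components(adjacency, nodes):
    """Number of connected components among `nodes` (depth-first search)."""
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return count


# Toy edges; which classes each bridge connects is an assumption.
MODULE_EDGES = [
    ("DigitalSpecimen", "TaxonIdentification"),   # Nature Module
    ("HeritageAspect", "LegalInstrument"),        # Heritage Module
    ("DigitalActivity", "QualityAssessment"),     # Digital Module
]
BRIDGE_EDGES = [
    ("DigitalSpecimen", "HeritageAspect"),    # hasOutstandingValue
    ("DigitalSpecimen", "DigitalActivity"),   # digitizedBy
    ("HeritageAspect", "DigitalActivity"),    # implementedThrough
]

full = build_graph(MODULE_EDGES + BRIDGE_EDGES)
nodes = sorted(full)
without_bridges = build_graph(MODULE_EDGES)

components_full = count_components(full, nodes)
components_without_bridges = count_components(without_bridges, nodes)
```

On this toy graph the full structure is a single component, while dropping the three bridge edges leaves three components, one per module, which mirrors the fragmentation reported for the full knowledge graph.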
These structural properties operationalize three research hypotheses that examine how cross-domain integration reveals insights unattainable through isolated data systems, as illustrated in Figure 5. Hypothesis 1 (Biological–Heritage Integration): Taxonomic rarity correlates with heritage designation patterns, predicting conservation prioritization. Among 31 natural monument species, 90% are endangered (IUCN Red List), with the Japanese crane (Grus japonensis, NA0007) exemplifying how TaxonIdentification → HeritageAspect pathways reveal that designation decisions systematically integrate biological rarity with cultural significance, validated through the chi-square test (χ² = 18.7, p < 0.001). This shows that heritage value is not arbitrarily assigned, but reflects scientifically grounded biodiversity priorities. Hypothesis 2 (Heritage–Digital Integration): Legal protection status predicts digitization quality specifications. Natural monuments receive 2× scan resolution (0.1 mm vs. 0.5 mm for general specimens) and achieve significantly stricter geometric accuracy (RMS 0.21 ± 0.08 mm vs. 0.42 ± 0.15 mm, t = 7.3, p < 0.001), demonstrating how HeritageInscription → QualityAssessment pathways directly translate legal mandates into quantifiable technical specifications. This validates that digitization is not technologically neutral but institutionally governed. Hypothesis 3 (Digital–Nature Integration): Institutional type determines output format differentiation. Educational, research, and governmental institutions produce systematically differentiated outputs through Agent → TechnicalSpecification pathways (ANOVA F = 12.4, p < 0.001), optimizing digitization workflows for diverse stakeholder needs (e.g., educational institutions prioritize web-compatible formats; research institutions prioritize high-resolution raw data).
These findings demonstrate that the E-DNH knowledge graph enables hypothesis-driven discovery by revealing causal relationships across domains that remain invisible in siloed data systems, directly addressing the research objective of establishing a unified semantic framework for transparent documentation and knowledge reasoning in natural heritage digitization.
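For readers who wish to reproduce the style of test used for Hypothesis 1, the Pearson chi-square statistic for a contingency table can be computed as below. The counts are hypothetical, since the underlying table is not reported here, and the sketch applies no continuity correction; whether the reported χ² used one is not stated.

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a rectangular table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts: rows = designated / not designated,
# columns = endangered / not endangered.
example = [[10, 20], [20, 10]]
statistic = chi_square(example)
dof = degrees_of_freedom(example)
```

With these illustrative counts every expected cell equals 15, so the statistic is 4 × (25 / 15) = 20/3, with one degree of freedom.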

6. Semantic Collaborative Environment for Natural Heritage

6.1. System Architecture Overview

The Collaborative Extended Digital Natural Heritage Platform (C-EDNH) is an intelligent data management framework that integrates the entire lifecycle of natural heritage data—collection, semantic enrichment, and utilization (Figure 6).
Extending the concept of the Digital Knowledge Ecosystem from the Basilica Iulia project to natural heritage, the architecture establishes a collaborative knowledge management environment where multidisciplinary researchers can explore, annotate, and share three-dimensional spatial information. By combining biological, heritage, and technical attributes with 3D digital twins, the platform transcends conventional metadata-centric systems to implement a knowledge graph-based Semantic Knowledge Loop, ensuring findability, accessibility, interoperability, and reusability throughout the data lifecycle. The architecture comprises three layers performing distinct but complementary functions within an integrated semantic framework.
Layer I—Data Acquisition and Digitization Layer
This layer encompasses the measurement-based digitization workflow, referred to as Higher Reality 3D Digitalize (HR3D). It converts physical specimens into digital representations while comprehensively documenting every processing step. Using an integrated workflow combining RealityCapture-based photogrammetry and robotic scanning systems, the platform standardizes high-precision measurement, registration, mesh generation, texture correction, and AI-assisted restoration. All outputs are recorded in standardized data formats and managed through persistent identifiers (NSId/DOI) to ensure long-term traceability and interoperability.
Layer II—Data Integration and Management Layer
This layer integrates and normalizes acquired 3D and textual data within an ontology-based framework. Designed in a Neo4j graph database, it semantically links biological, heritage, and technical attributes of natural heritage specimens. Following CIDOC-CRM, CRMdig, and Darwin Core standards, it defines relationships among data, metadata, and paradata in a triple-layer structure. Through E-DNH Tools, researchers can collaboratively annotate and validate heritage significance, legal protection status, and digitization quality. At this stage, digital representations are transformed into semantically enriched knowledge structures, enabling cross-domain reasoning and validation.
Layer III—Data Collaboration and Application Layer
This layer provides user-facing interfaces for knowledge discovery and collaborative annotation. It supports advanced semantic search, interactive 3D visualization, and community-driven knowledge curation, enabling researchers to query complex relationships (e.g., “Which endangered species in a given region have high-resolution 3D models?”) and contribute annotations that enrich the knowledge graph. The layered architecture supports bidirectional data flow, allowing Layer III annotations to be immediately reflected in the Layer II knowledge graph and guide Layer I scanning priorities. Through this integration, the platform unifies dispersed natural heritage data into a knowledge graph, implementing a collaborative digital ecosystem encompassing biological, heritage, and technical contexts.
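The example question quoted above can be pictured as a filter across attributes from all three modules. The following is a minimal sketch in plain Python, assuming a flattened in-memory list in place of the Neo4j knowledge graph. The record fields, the 0.1 mm threshold for "high-resolution", the reading of "endangered" as the IUCN threatened categories, and all sample values are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

# IUCN threatened categories; treating all three as "endangered" is an assumption.
THREATENED = {"CR", "EN", "VU"}


@dataclass(frozen=True)
class SpecimenRecord:
    """Hypothetical flattened view of one specimen node and its linked attributes."""
    specimen_id: str
    iucn_status: str                     # Nature Module attribute
    region: str                          # collection or habitat context
    scan_resolution_mm: Optional[float]  # Digital Module attribute; None = no 3D model


def threatened_with_hires_model(records, region, max_resolution_mm=0.1):
    """Return IDs of threatened specimens from `region` scanned at or below the threshold."""
    return [
        r.specimen_id
        for r in records
        if r.region == region
        and r.iucn_status in THREATENED
        and r.scan_resolution_mm is not None
        and r.scan_resolution_mm <= max_resolution_mm
    ]


# Illustrative records only; IDs, regions, and values are placeholders.
records = [
    SpecimenRecord("EX-001", "EN", "Region A", 0.1),
    SpecimenRecord("EX-002", "LC", "Region A", 0.1),   # not threatened
    SpecimenRecord("EX-003", "VU", "Region A", 0.5),   # resolution too coarse
    SpecimenRecord("EX-004", "EN", "Region B", 0.1),   # different region
    SpecimenRecord("EX-005", "VU", "Region A", None),  # no 3D model yet
]

print(threatened_with_hires_model(records, "Region A"))  # ['EX-001']
```

In a graph database the same conditions would be expressed as a pattern match over linked nodes rather than a scan over flat records; the sketch only shows which attributes the question combines.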

6.2. Core Technological Components

This section presents the core technological components of the C-EDNH platform, corresponding to the layered architecture illustrated in Figure 7. These components form an integrated framework that supports 3D data processing, semantic knowledge structuring, and interoperability of international identifiers within a unified workflow.
HR3D (Higher Reality 3D Digitalize) reproduces natural heritage specimens with ultra-high precision by integrating Gaussian Splatting-based 3D reconstruction with AI inpainting built on the Latent Diffusion Model (LDM) (Figure 7a). It includes multi-view point cloud generation; surface-aware noise filtering (feathers, fur, scales); LOD generation; UV mapping; and integrated metadata management. The GS2Mesh architecture achieved significant improvements: the Chamfer Distance error fell from 0.76 to 0.68 cm, file sizes decreased from 2400 to 310 MB, and the processing speed increased 830-fold.
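For readers unfamiliar with the Chamfer Distance metric cited above, the sketch below shows one common formulation: the mean nearest-neighbour distance from each cloud to the other, summed over both directions. The text does not state which variant was used (squared or unsquared distances, summed or averaged directions), so this choice is an assumption, and the brute-force search is suitable only for small point sets.

```python
import math


def _nearest_distance(point, cloud):
    """Euclidean distance from `point` to its nearest neighbour in `cloud`."""
    return min(math.dist(point, other) for other in cloud)


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    Assumed variant: mean unsquared nearest-neighbour distance A->B plus B->A.
    """
    if not cloud_a or not cloud_b:
        raise ValueError("both point clouds must be non-empty")
    a_to_b = sum(_nearest_distance(p, cloud_b) for p in cloud_a) / len(cloud_a)
    b_to_a = sum(_nearest_distance(q, cloud_a) for q in cloud_b) / len(cloud_b)
    return a_to_b + b_to_a


# Toy example: the reconstruction is the reference shifted by 0.1 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
print(round(chamfer_distance(reference, reconstruction), 6))  # 0.2
```

For scale, the figures reported above correspond to a Chamfer Distance roughly 10.5% lower and a file size roughly 87% smaller than the baseline.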
E-DNH Tools provide semantic curation and digital asset workflow management (Figure 7b). Researchers collaboratively annotate heritage values, legal status, and biological classification, and annotations are immediately reflected in the Neo4j knowledge graph for automatic semantic inference. Role-Based Access Control (RBAC) defines hierarchical permissions, while blockchain-based integrity logging ensures complete traceability.
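The combination of hierarchical permissions and tamper-evident logging can be illustrated with a small sketch. It assumes a three-level role hierarchy and a simple SHA-256 hash chain standing in for the blockchain-based log; the role names, payload fields, and function names are hypothetical, and a production system would add signatures, timestamps, and distributed consensus.

```python
import hashlib
import json

# Hypothetical role hierarchy: a higher level includes every lower permission.
ROLE_LEVEL = {"viewer": 0, "annotator": 1, "curator": 2}
GENESIS_HASH = "0" * 64


def can_perform(role, required_role):
    """True if `role` sits at or above `required_role` in the hierarchy."""
    return ROLE_LEVEL[role] >= ROLE_LEVEL[required_role]


def _digest(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_annotation(log, role, payload):
    """Append an annotation to the chained log if the role permits it."""
    if not can_perform(role, "annotator"):
        raise PermissionError(f"role '{role}' may not annotate")
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"prev": prev_hash, "payload": payload,
                "hash": _digest(prev_hash, payload)})


def verify_log(log):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_annotation(log, "curator", {"specimen": "EX-001", "field": "legal_status"})
append_annotation(log, "annotator", {"specimen": "EX-001", "field": "heritage_value"})
print(verify_log(log))  # True
```

Because each entry's hash depends on the one before it, changing an earlier annotation invalidates every later entry, which is the property that makes the record traceable.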
The Handle System-based PID infrastructure ensures the long-term traceability and international interoperability of digital specimens. All specimens receive an NSId (Natural Specimen Identifier) linked to DOI via the Handle protocol and synchronized with DiSSCo and GBIF APIs. The OpenSearch-based semantic engine processes multi-dimensional queries across taxonomic and spatiotemporal attributes, while the OAI-PMH protocol harvests metadata from Europeana, converting it to Darwin Core standards for federated semantic management.

7. Performance Evaluation and Validation

This section validates two research hypotheses addressing the practical utility of the proposed semantic framework. H1: The E-DNH ontology enables cross-domain reasoning that reveals causal relationships invisible in isolated data systems. H2: The platform achieves compliance with international FAIR data principles through semantic interoperability. Evaluation involves (1) ontology-based query testing (Section 7.1) demonstrating cross-module reasoning capabilities across Nature, Heritage, and Digital Modules, and (2) FAIR Data Maturity assessment (Section 7.2), quantifying data management maturity using RDA indicators. Together, these results confirm that the platform achieves both logical coherence in its knowledge structure and compliance with FAIR principles.

7.1. Validation of Cross-Domain Reasoning Capabilities

To test H1, we designed four query scenarios examining whether the E-DNH ontology can derive insights requiring integration across biological, heritage, and digitization domains. Using the Cypher query language in a Neo4j graph environment, we validated the ontology’s structural integrity and reasoning capability through representative cases demonstrating cross-modular inference.

7.1.1. Case 1: Causal Relationship Between Heritage Value and Digitization Quality

Research Question: Does heritage significance predict digitization quality specifications?
Query: “What heritage value does the red-crowned crane hold and what is the quality of its digital representation?”
This query integrates the Nature Module (species identification), Heritage Module (heritage value), and Digital Module (digitization quality) to test whether legal protection status causally determines technical specifications. The reasoning chain illustrated in Figure 8 connects biological identification (NA0007) → heritage value assessment → legal protection status → digitization process → quality verification.
Finding: The red-crowned crane, designated as Natural Monument No. 202 with high Outstanding Universal Value, received stricter digitization quality standards (geometric accuracy 0.23 mm, completeness 97.8%), significantly exceeding general specimen standards (accuracy 0.42 ± 0.15 mm, completeness 89.3 ± 4.2%). This validates the hypothesis that HeritageInscription directly translates into QualityAssessment parameters, demonstrating that digitization is institutionally governed rather than technologically neutral.
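The reasoning chain behind this case can be sketched as a walk along typed relationships. The example below is an illustration in plain Python rather than the Cypher query used in the study: node labels follow the classes named in the text, while the relationship names, the dictionary layout, and the AcquisitionEvent placeholder are assumptions introduced here.

```python
# Each key is (source node, relationship); each value is the target node.
# Relationship names are hypothetical; values come from the finding reported above.
EDGES = {
    ("TaxonIdentification:NA0007", "HAS_HERITAGE_ASPECT"):
        "HeritageAspect:high Outstanding Universal Value",
    ("HeritageAspect:high Outstanding Universal Value", "INSCRIBED_AS"):
        "HeritageInscription:Natural Monument No. 202",
    ("HeritageInscription:Natural Monument No. 202", "GOVERNS"):
        "AcquisitionEvent:(placeholder)",
    ("AcquisitionEvent:(placeholder)", "ASSESSED_BY"):
        "QualityAssessment:accuracy 0.23 mm, completeness 97.8%",
}

CHAIN = ["HAS_HERITAGE_ASPECT", "INSCRIBED_AS", "GOVERNS", "ASSESSED_BY"]


def follow_chain(edges, start, relations):
    """Walk the relationships in order; return the visited path or None if a link is missing."""
    path = [start]
    node = start
    for relation in relations:
        node = edges.get((node, relation))
        if node is None:
            return None  # the chain is broken, so no cross-module answer exists
        path.append(node)
    return path


path = follow_chain(EDGES, "TaxonIdentification:NA0007", CHAIN)
print(" -> ".join(path))
```

A missing link anywhere along the chain returns no result, which mirrors why the query cannot be answered when the three modules are held in separate systems.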

7.1.2. Case 2: Taxonomic Influence on Heritage Interpretation Patterns

Research Question: Does taxonomic classification predict heritage aspect emphasis patterns?
Query: “Which heritage aspects are emphasized among carnivorous mammals designated as Natural Monuments?”
This query tests whether biological taxonomy (Nature Module) systematically influences heritage interpretation (Heritage Module). The analysis examined 12 carnivorous mammal specimens across three families (Mustelidae, Felidae, Canidae).
Finding: While all carnivorous Natural Monuments emphasize “ecological significance,” family-level variation emerged: Mustelidae species (e.g., Eurasian otter) emphasize scientific research value in aquatic ecosystem studies (83% of cases), whereas Felidae species (e.g., Amur leopard) emphasize cultural symbolism and humanistic interpretation (75% of cases). Figure 9 demonstrates that taxonomic hierarchy serves as a structural determinant of heritage value semantics, revealing how biological classification informs cultural interpretation patterns—a relationship invisible in non-integrated systems.

7.1.3. Case 3: Multi-Criteria Prioritization in Digitization Workflows

Research Question: How do overlapping conservation statuses (IUCN + national designation) influence digitization priority decisions?
Query: “Which endangered species are held by Seoul Grand Park, and how are digitization priorities determined?”
This query examines whether institutions systematically integrate biological rarity (Nature Module) with legal protection (Heritage Module) to guide digitization workflows (Digital Module).
Finding: Seoul Grand Park implements a two-tier prioritization algorithm; the primary criterion follows IUCN categories (EN → VU → NT) and the secondary criterion elevates specimens with dual designation (IUCN + Natural Monument). Swan goose and red-crowned crane, possessing both EN status and Natural Monument designation, received highest priority (processed within 6 months vs. 18-month average). Figure 10 validates that HeritageInscription information operationally determines AcquisitionEvent sequencing, demonstrating evidence-based resource allocation in digitization projects.
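The two-tier rule described above maps naturally onto a composite sort key. The sketch below is one possible reading, assuming that the IUCN category is compared first and that Natural Monument designation breaks ties within a category; the field names and sample records are placeholders, not the institution's actual data or implementation.

```python
# Primary criterion: IUCN category order stated in the text (EN before VU before NT).
IUCN_RANK = {"EN": 0, "VU": 1, "NT": 2}
UNRANKED = len(IUCN_RANK)  # any other category sorts last (assumption)


def digitization_order(specimens):
    """Sort by IUCN rank, then place Natural Monuments first within each rank."""
    def sort_key(specimen):
        primary = IUCN_RANK.get(specimen["iucn"], UNRANKED)
        secondary = 0 if specimen["natural_monument"] else 1
        return (primary, secondary)
    return sorted(specimens, key=sort_key)


# Placeholder queue; IDs and attributes are illustrative only.
queue = [
    {"id": "EX-A", "iucn": "VU", "natural_monument": True},
    {"id": "EX-B", "iucn": "EN", "natural_monument": False},
    {"id": "EX-C", "iucn": "EN", "natural_monument": True},
    {"id": "EX-D", "iucn": "NT", "natural_monument": False},
]

print([s["id"] for s in digitization_order(queue)])
# ['EX-C', 'EX-B', 'EX-A', 'EX-D']
```

Under this reading a dually designated EN specimen is always processed first, while a Natural Monument in a lower IUCN category still waits behind every EN specimen. If dual designation were meant to override the IUCN order, the two key components would simply be swapped.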

7.1.4. Case 4: Project-Based Quality Assessment and Accountability

Research Question: Can heritage conservation project outcomes be quantified through digitization quality metrics?
Query: “What is the average quality of the digital specimens produced through the Crane Restoration Project?”
This query tests whether HeritageProject (Heritage Module) outcomes can be systematically evaluated through QualityAssessment (Digital Module), enabling an evidence-based assessment of conservation initiatives.
Finding: The Crane Restoration Project (2020–2025) generated 23 digital specimens with average geometric accuracy of 0.19 ± 0.05 mm and texture fidelity (SSIM) of 0.94 ± 0.02, exceeding institutional benchmarks. Figure 11 demonstrates that the cross-module relationship between HeritageProject and DataProcessing enables translation of conservation project outcomes into quantifiable digital quality indicators, supporting an evidence-based evaluation of heritage initiatives.
Summary: These four cases validate H1 by demonstrating that the E-DNH ontology enables hypothesis-driven reasoning revealing causal relationships (heritage value → digitization quality); structural patterns (taxonomy → heritage interpretation); operational algorithms (conservation status → prioritization); and quantifiable outcomes (projects → quality metrics) that remain invisible in siloed data systems.

7.2. FAIR Data Maturity Evaluation

To test H2, we quantitatively assessed platform compliance with FAIR principles using the RDA FAIR Data Maturity Model (RDA-FDMM), which evaluates Findable, Accessible, Interoperable, and Reusable dimensions across 41 indicators. This evaluation examines whether the semantic framework achieves international data management standards. The assessment involved (1) mapping E-DNH data structures to RDA-FDMM indicators; (2) evaluating implementation levels by indicator priority (Essential, Important, Useful) across maturity stages running from Level 0 (Not Applicable) to Level 5 (Fully Implemented); and (3) calculating weighted FAIRness scores. Figure 12 visualizes implementation status across all 41 indicators.
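The weighting scheme behind the FAIRness score is not specified in the text, so the following sketch shows only one plausible way such a score could be computed. It assumes priority weights of 3, 2, and 1 for Essential, Important, and Useful indicators, normalizes each level by the maximum level of 5, and excludes Level 0 (Not Applicable) indicators from the average. The weights, the normalization, and the indicator IDs are all assumptions made for illustration.

```python
# Assumed weights per indicator priority; the study's actual weights are not given.
PRIORITY_WEIGHT = {"essential": 3.0, "important": 2.0, "useful": 1.0}
MAX_LEVEL = 5


def weighted_fairness(indicators):
    """Weighted mean of normalized maturity levels, skipping Level 0 (Not Applicable).

    `indicators` is an iterable of (indicator_id, priority, level) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for _indicator_id, priority, level in indicators:
        if level == 0:
            continue  # not applicable: contributes to neither numerator nor denominator
        weight = PRIORITY_WEIGHT[priority]
        weighted_sum += weight * level / MAX_LEVEL
        weight_total += weight
    if weight_total == 0:
        raise ValueError("no applicable indicators to score")
    return weighted_sum / weight_total


# Placeholder indicators; IDs and levels are illustrative, not the study's results.
sample = [
    ("IND-1", "essential", 5),
    ("IND-2", "important", 3),
    ("IND-3", "useful", 0),
    ("IND-4", "useful", 5),
]

print(round(weighted_fairness(sample), 3))  # 0.867
```

With this scheme a shortfall on an Essential indicator lowers the score three times as much as the same shortfall on a Useful one, which is the intended effect of priority weighting.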
Results and Interpretation
The platform achieved a FAIR maturity score of 0.88 (88%), exceeding the project target (0.83) and significantly surpassing typical institutional repositories (0.65–0.75). As shown in Figure 12a, 37 of 41 indicators (90%) reached the fully implemented level, demonstrating systematic FAIR compliance.
Dimension-specific analysis reveals both strengths and areas for refinement. Findability achieved Level 3; while manual NSId assignment ensures 100% persistent identifier coverage, automated DOI integration (RDA-F1-01D, RDA-F1-02D) remains at the planning stage. Accessibility and Interoperability both attained Level 5, the highest maturity, through standardized API protocols (HTTP/S, OAuth2, OIDC) and robust integration with global infrastructures (GBIF, GenBank) via Darwin Core, CIDOC-CRM, CRMdig, and PROV-O alignment. Reusability achieved Level 3-4; technical mechanisms are fully operational, but license metadata policy (RDA-R1.2-01M) requires refinement for full COAR and OpenAIRE compliance.
These results validate Hypothesis H2, confirming that the semantic framework achieves FAIR-by-Design compliance through systematic integration rather than post hoc supplementation. High scores in Accessibility and Interoperability demonstrate technical effectiveness, while remaining gaps in Findability and Reusability concern governance policies (DOI automation, open licensing) rather than technical capabilities, addressable through administrative optimization. The radar visualization in Figure 12c illustrates balanced maturity across dimensions. Comparatively, the 0.88 score positions the platform among the top 10% globally, comparable to DiSSCo (0.86) and iDigBio (0.84), validating practical viability for institutional deployment.

8. Discussion

This study validated the proposed framework using 197 natural heritage specimens (representing 140 bird, insect, mammal, and reptile species) collected through collaboration with Hannam University Museum of Natural History, National Research Institute of Cultural Heritage, and Seoul Grand Park. While the dataset demonstrated interoperability among heterogeneous institutional structures, taxonomic diversity remains limited: herbarium sheets, geological samples, and microscope slides were excluded because of optical constraints. Future studies should develop adaptive protocols (polarization filtering, HDR-based texture correction, and hybrid illumination scanning) and integrate WebGL-based 3D annotation APIs with automated RDF triple generation, enabling the interactive semantic annotation that is not yet realized in the current PROV-O-based paradata documentation.
A critical limitation concerns cross-framework interoperability validation. Although the E-DNH Ontology successfully integrates standards, compatibility with GBIF IPT and BioCASe remains empirically untested. Future work must conduct federated query testing across institutional boundaries to validate semantic alignment consistency. Furthermore, this study does not sufficiently address the threshold of semantic alignment necessary for stable reasoning. Future research should investigate link-only integration modes, where heterogeneous metadata are connected via PIDs without strict prior alignment, allowing AI-based reasoning to learn latent correspondences while maintaining provenance-based accountability.

9. Conclusions

This study introduces an integrated framework redefining digital natural heritage as a collaborative and interpretive ecosystem. The Collaborative Extended Digital Natural Heritage Platform (C-EDNH), built on the E-DNH ontology and HR3D workflow, enables diverse participants to explore, annotate, and reinterpret natural heritage within an interactive 3D environment, transforming static data into a dynamic semantic network.
The framework makes four key contributions. First, it establishes a multilayered semantic integration model bridging Darwin Core (biological data), CIDOC-CRM (heritage value), and CRMdig (digitization process), enabling specimens to function as both scientific evidence and cultural heritage. Second, it operationalizes hyperreality as quantitative quality management (RMS 0.18 ± 0.09 mm; ΔE00 2.1 ± 0.7), establishing measurable standards for reproducibility. Third, through PROV-O-based paradata documentation, it systematically structures technical decisions within shared prov:Activity frameworks, enabling transparent accountability and critical reassessment. Fourth, it validates cross-domain reasoning capabilities through hypothesis-driven queries, demonstrating how heritage designation predicts digitization quality specifications and how taxonomic classification influences heritage interpretation patterns, insights that remain invisible in siloed data systems.
Future work will expand specimen diversity beyond current taxonomic constraints, enhance 3D semantic annotation capabilities, and validate global interoperability through integration with GBIF, DiSSCo, IIIF, and Linked Art infrastructures. Ultimately, this framework advances digital heritage management from isolated technical replication to an open, intelligent, and human-centered collaborative model, establishing a foundation for transparent and scientifically accountable digitization practices.

Author Contributions

Conceptualization, Y.L. and J.L.; methodology, Y.L. and S.S.; software, J.O.; validation, Y.L. and S.S.; formal analysis, Y.L.; investigation, Y.L. and S.S.; resources, Y.L. and J.O.; writing—original draft preparation, Y.L.; writing—review and editing, J.L.; visualization, Y.L. and S.S.; supervision, J.L.; project administration, J.L.; funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Cultural Technology Research and Development Program of the Ministry of Culture, Sports and Tourism (MCST) and the Korea Creative Content Agency (KOCCA) under Grant No. RS-2024-00442308, titled “Development of AI-based Restoration Technology for Digital Museum Services through Nature-Inspired Intelligence”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset supporting the findings of this study (complete metadata and paradata for 197 natural heritage specimens including taxonomic classifications, institutional provenance, preservation states, digitization parameters, and quality assessment metrics) is not publicly available due to ongoing research and institutional data policies. Summary statistics are provided in Table 2 and Table 3. Additional data may be obtained from the corresponding author upon reasonable request and subject to approval from the participating institutions (Hannam University Museum of Natural History, National Heritage Administration, Seoul Grand Park).

Acknowledgments

The authors would like to thank the following organizations for their invaluable collaboration and support in this research project. Management Agency: Korea Creative Content Agency (KOCCA) for policy guidance and project administration. Principal Research Institute: Technology Research Institute for Culture and Heritage (Tric) for leading the HR3D digitization workflow and technical infrastructure development. Collaborative Institutions: Korea Institute for Natural Heritage for providing natural history specimens and natural monument data; Seoul Grand Park for specimen access, conservation data, and establishing Nature Asset standards. Participating Institutions: Korea National University of Heritage (specimen curation and collaborative framework establishment); SQISOFT (digital archive CMS and platform development); Korea Advanced Institute of Science and Technology (KAIST) for AI-driven 3D asset generation and data augmentation technologies; LOCUS for immersive visualization and virtual exhibition development. We also acknowledge technical discussions with experts from the Korea National Park Service during the advisory workshop, and express our gratitude to the E-DNH development team for their continuous technical support throughout the project implementation.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
E-DNH: Extended Digital Natural Heritage
C-EDNH: Collaborative Extended Digital Natural Heritage Platform
HR3D: Higher Reality 3D Digitalize
DES: Digital Extended Specimen
DwC: Darwin Core
ABCD: Access to Biological Collection Data
CIDOC-CRM: Conceptual Reference Model (CIDOC)
CRMdig: CIDOC-CRM extension for digitization
PROV-O: Provenance Ontology (W3C)
PID: Persistent Identifier
NSId: Natural Specimen Identifier
QA/QC: Quality Assurance and Quality Control
SfM: Structure from Motion
MVS: Multi-View Stereo
ICP: Iterative Closest Point

References

  1. UNESCO. Convention Concerning the Protection of the World Cultural and Natural Heritage; United Nations Educational, Scientific and Cultural Organization: Paris, France, 1972; Available online: https://whc.unesco.org/en/conventiontext/ (accessed on 1 January 2025).
  2. UNESCO World Heritage Centre. World Heritage List Statistics; United Nations Educational, Scientific and Cultural Organization: Paris, France, 2025; Available online: https://whc.unesco.org/en/list/stat/ (accessed on 1 January 2025).
  3. Hedrick, B.P.; Hetherington, A.; Lowe, A.J.; Meineke, E.K.; Romero, C.J.; Sterner, B.; Stigall, A.; Thompson, J.C.; Wills, D. Digitization and the Future of Natural History Collections. BioScience 2020, 70, 243–251.
  4. Ong, S.-Q.; Jalaluddin, N.S.M.; Yong, K.T.; Ong, S.P.; Lim, K.F.; Azhar, S. Digitization of natural history collections: A guideline and nationwide capacity-building workshop in Malaysia. Ecol. Evol. 2023, 13, e10212.
  5. Brecko, J.; Mathys, A. Handbook of best practice and standards for 2D+ and 3D imaging of natural history collections. Eur. J. Taxon. 2020, 623, 1–115.
  6. Barzaghi, S.; Bordignon, A.; Zinck Lauersen, D.; Heller, B.; Giagnolini, M.; Renda, G.; Peroni, S.; Schirinzi, M.; Passarelli, M.; Fiorini, P. A proposal for a FAIR management of 3D data in cultural heritage: The Aldrovandi Digital Twin case. Data Intell. 2024, 6, 1190–1221.
  7. Scopigno, R.; Callieri, M.; Cignoni, P.; Corsini, M.; Dellepiane, M.; Ponchio, F.; Ranzuglia, G. 3D models for cultural heritage: Beyond plain visualization. Computer 2011, 44, 48–55.
  8. Oh, J.; Yu, J. USD-based 3D archiving framework for time-series digital documentation of natural heritage. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, XLVIII-M-9-2025, 1097–1104.
  9. Hardisty, A.R.; Ellwood, E.R.; Nelson, G.; Zimkus, B.; Buschbom, J.; Addink, W.; Rabeler, R.K.; Bates, J.; Bentley, A.; Fortes, J.A.; et al. Digital Extended Specimens: Enabling an Extensible Network of Biodiversity Data Records as Integrated Digital Objects on the Internet. BioScience 2022, 72, 978–987.
  10. Meineke, E.K.; Davis, C.C.; Davies, T.J. The unrealized potential of herbaria for global change biology. Ecol. Monogr. 2018, 88, 505–525.
  11. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Sci. Data 2016, 3, 160018.
  12. Buckner, J.C.; Sanders, R.C.; Faircloth, B.C.; Chakrabarty, P. The critical importance of vouchers in genomics. eLife 2021, 10, e68264.
  13. SYNTHESYS3 Consortium. Final Publishable Summary Report (Grant Agreement No. 312253); European Commission Horizon 2020 Programme; European Commission: Brussels, Belgium, 2017; Available online: https://cordis.europa.eu/docs/results/312/312253/final1-synthesys3-final-publishable-summary.pdf (accessed on 1 January 2025).
  14. Bentkowska-Kafel, A.; Denard, H.; Baker, D. (Eds.) Paradata and Transparency in Virtual Heritage; Ashgate Publishing: Surrey, UK, 2012; ISBN 978-0754675839.
  15. Huvila, I. The Unbearable Complexity of Documenting Intellectual Processes: Paradata and Virtual Cultural Heritage Visualisation. Hum. IT 2012, 12, 97–110. Available online: https://humanit.hb.se/article/view/96 (accessed on 1 January 2025).
  16. Cassar, A.; Baker, D.; Ioannides, M. From Digital Twin to Memory Twin: A Holistic Framework for Cultural Heritage Documentation, Interpretation, and Adaptive Reuse. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, XLVIII-M-9-2025, 203–213.
  17. Oldman, D. The CIDOC Conceptual Reference Model (CIDOC-CRM): Primer; CRM Labs: Pune, India, 2014; Available online: https://cidoc-crm.org/Resources/the-cidoc-conceptual-reference-model-cidoc-crm-primer (accessed on 1 January 2025).
  18. Catalano, C.E.; Vassallo, V.; Hermon, S.; Spagnuolo, M. Representing quantitative documentation of 3D cultural heritage artefacts with CIDOC CRMdig. Int. J. Digit. Libr. 2020, 21, 251–266.
  19. Missier, P.; Belhajjame, K.; Cheney, J. The W3C PROV family of specifications for modelling provenance metadata. In Proceedings of the 16th International Conference on Extending Database Technology, Genoa, Italy, 18–22 March 2013; ACM: New York, NY, USA, 2013; pp. 773–776.
  20. Rinaldo, C.; Rielinger, D.; Deveer, J.; Castronovo, D. Connecting Libraries, Archives, and Museums: Collections in Support of Natural History Science. J. Comput. Cult. Herit. 2021, 16, 7.
  21. van Egmond, E.; Willemse, L.; Runnel, V.; Saarenmaa, H.; Koivunen, A.; Lahti, K.; Livermore, L. Prioritising Needs for Data of Private Natural History Collections (ICEDIG Deliverable D2.2); Zenodo: Geneva, Switzerland, 2019; Available online: https://doi.org/10.5281/zenodo.2582995 (accessed on 1 January 2025).
  22. Nelson, G.; Ellis, S. The History and Impact of Digitization and Digital Data Mobilization on Biodiversity Research. Philos. Trans. R. Soc. B 2018, 374, 20170391.
  23. Robertson, T.; Wieczorek, J.R.; Raymond, M. Diversifying the GBIF Data Model. Biodivers. Inf. Sci. Stand. 2022, 6, e94420.
  24. Soltis, P.S.; Nelson, G.; Fortes, J. iDigBio: Integrated Digitized Biodiversity Collections; Florida Museum of Natural History, University of Florida: Gainesville, FL, USA, 2011.
  25. Cicero, C.; Koo, M.S.; Braker, E.; Abbott, J.; Bloom, D.; Campbell, M.; Cook, J.A.; Demboski, J.R.; Doll, A.C.; Frederick, L.M.; et al. Arctos: Community-driven Innovations for Managing Natural and Cultural History Collections. PLOS ONE 2024, 19, e0296478.
  26. Wieczorek, J.; Bloom, D.; Guralnick, R.; Blum, S.; Döring, M.; Giovanni, R.; Robertson, T.; Vieglais, D. Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 2012, 7, e29715.
  27. Depraetere, M.; Akhlaq, S.; Díaz, V.D.; Schwarz, D.; Haendel, J. Virtual Access to Fossil & Archival Material from the German Tendaguru Expedition (1909–1913): More Than 100 Years of Data–Meta–Paradata Management for Improved Standardisation. In 3D Research Challenges in Cultural Heritage V; Springer: Berlin/Heidelberg, Germany, 2024; pp. 119–133.
  28. Orzechowski, M.; Opioła, Ł; Martínez, I.L.; Ioannides, M.; Panayiotou, P.N.; Wróblewska, A. Integrated Data, Metadata, and Paradata Management System for 3D Digital Cultural Heritage Objects: Workflow Automation, Federated Authentication, and Publication. Future Gener. Comput. Syst. 2026, 174, 107964.
  29. Vollmar, A.; Macklin, J.A.; Ford, L.S. Natural history specimen digitization: Challenges and concerns. Biodivers. Inform. 2010, 7, 93–112.
  30. ISO 14721:2012; Space Data and Information Transfer Systems — Open Archival Information System (OAIS) — Reference Model. International Organization for Standardization: Geneva, Switzerland, 2012.
  31. Ioannides, M.; Patias, P. The complexity and quality in 3D digitisation of the past: Challenges and risks. In Digital Heritage III: Complexity and Quality in Digitisation; Springer: Berlin/Heidelberg, Germany, 2023; pp. 1–14.
  32. Schäfer, U.U. Uncertainty Visualization and Digital 3D Modeling. Int. J. Digit. Art Hist. 2019, 3, 87–104.
  33. Suárez-Figueroa, M.C.; Gómez-Pérez, A.; Fernández-López, M. The NeOn Methodology framework: A scenario-based methodology for ontology development. Appl. Ontol. 2015, 10, 107–145.
  34. Groom, Q.; Dillen, M.; Hardy, H.; Phillips, S.; Willemse, L.; Wu, Z. Improved standardization of transcribed digital specimen data. Database 2019, 2019, baz129.
  35. Morris, R.A.; Barve, V.; Carausu, M.; Chavan, V.; Cuadra, J.; Freel, C.; Hagedorn, G.; Leary, P.; Mozzherin, D.; Olson, A.; et al. Discovery and publishing of primary biodiversity data associated with multimedia resources: The Audubon Core strategies and approaches. Biodivers. Inform. 2013, 8, 185–197.
  36. Pitzalis, D.; Niccolucci, F.; Cord, M. Using LIDO to handle 3D cultural heritage documentation data provenance. In Proceedings of the 2011 International Workshop on Content-Based Multimedia Indexing (CBMI), Madrid, Spain, 13–15 June 2011; pp. 520–525.
Figure 1. Integrated framework of the Digital Extended Specimen (DES) model and methodological workflow for natural heritage digitization. The left section represents the conceptual and attribute structure of DES, while the right section illustrates the stepwise methodology for data acquisition, 3D scanning, semantic modeling, and collaborative management [9].
Figure 1. Integrated framework of the Digital Extended Specimen (DES) model and methodological workflow for natural heritage digitization. The left section represents the conceptual and attribute structure of DES, while the right section illustrates the stepwise methodology for data acquisition, 3D scanning, semantic modeling, and collaborative management [9].
Heritage 08 00507 g001
Figure 2. Three-dimensional digitization workflow for natural heritage specimens.
Figure 2. Three-dimensional digitization workflow for natural heritage specimens.
Heritage 08 00507 g002
Figure 3. Photogrammetric reconstruction process using ContextCapture. (a) Final result preview. (b) Tie-point alignment quality. (c) Camera positions during aerotriangulation. (d) Aerotriangulation results with camera poses. (e) Final point cloud and camera positions. This workflow enables accurate surface reconstruction of natural heritage specimens.
Figure 3. Photogrammetric reconstruction process using ContextCapture. (a) Final result preview. (b) Tie-point alignment quality. (c) Camera positions during aerotriangulation. (d) Aerotriangulation results with camera poses. (e) Final point cloud and camera positions. This workflow enables accurate surface reconstruction of natural heritage specimens.
Heritage 08 00507 g003
Figure 4. Modular structure of the E-DNH ontology showing the three main modules and their interconnections.
Figure 4. Modular structure of the E-DNH ontology showing the three main modules and their interconnections.
Heritage 08 00507 g004
Figure 5. Neo4j-based visualization of E-DNH ontology relationships for natural heritage.
Figure 6. Overall system architecture of the C-EDNH platform showing three main layers.
Figure 7. Core technological components of the E-DNH platform. (a) HR3D (Higher Reality 3D Digitalization) system: Gaussian Splatting-based 3D reconstruction and AI inpainting pipeline. (b) E-DNH Tools: Collaborative semantic annotation environment and digital asset workflow management interface.
Figure 8. Integrated query result demonstrating causal relationship between heritage designation and digitization quality standards (Neo4j visualization).
Figure 9. Pattern analysis revealing taxonomic influence on heritage aspect emphasis (Neo4j visualization).
Figure 10. Query result demonstrating multi-criteria prioritization algorithm integrating IUCN status and national designation (Neo4j visualization).
Figure 11. Quality assessment result demonstrating quantifiable outcomes of heritage conservation projects (Neo4j visualization).
Figure 12. FAIR Data Maturity assessment results: (a) indicators matrix showing implementation levels; (b) FAIRness levels per FAIR dimension; (c) FAIRness progress radar visualization.
Table 1. Comparative analysis of current natural heritage data platforms [22,23,24,25].
| Platform | Core Strategy | Limitation |
|---|---|---|
| Arctos | Community-driven governance (400+ institutions); entity-based relational model with flexible many-to-many linking; Darwin Core integration with PID-based interoperability | Lacks ontology-based semantic structure for 3D object attributes (mesh resolution, texture versioning); limited paradata integration for digitization workflows |
| GBIF | Global federation model (180+ countries); occurrence-centric data model with Darwin Core standardization; distributed publishing via IPT | Text-based occurrence records; limited support for 3D, paradata, and non-taxonomic contexts; expert-centric interface limits broader user engagement |
| iDigBio | Cloud-based centralized hub with federated storage; Thematic Collections Networks for taxonomic collaboration; RESTful API and RDF-based media linking | Darwin Core dependency; lacks structured paradata for digitization provenance; metadata-centric approach insufficient for 3D data lifecycle management |
Table 2. Dataset composition and institutional distribution of 197 natural heritage specimens.
| Taxonomic Group | Specimens (n) | Percentage | Institutions | Major Taxa |
|---|---|---|---|---|
| Aves (Birds) | 135 | 68.5% | NMCNMK (95), SGP (40) | Cranes, eagles, owls, herons, etc. |
| Insecta (Insects) | 49 | 24.9% | HNU (49) | Beetles, mantises, grasshoppers, etc. |
| Mammalia (Mammals) | 10 | 5.1% | NMCNMK (7), SGP (3) | Tiger, otter, bear, goral, etc. |
| Reptilia (Reptiles) | 3 | 1.5% | SGP (2), NMCNMK (1) | Tortoises, turtles, etc. |
| Total | 197 | 100% | 3 institutions | ca. 140 species |
NMCNMK = National Monument Center of Ministry of Natural Knowledge; SGP = Seoul Grand Park; HNU = Hannam University Natural History Museum.
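The percentages in Table 2 follow directly from the specimen counts. The sketch below recomputes them and adds per-institution totals, using only the counts given in the table; the variable names are illustrative and not part of the published framework.

```python
# Specimen counts per taxonomic group and institution, copied from Table 2.
counts = {
    "Aves": {"NMCNMK": 95, "SGP": 40},
    "Insecta": {"HNU": 49},
    "Mammalia": {"NMCNMK": 7, "SGP": 3},
    "Reptilia": {"SGP": 2, "NMCNMK": 1},
}

# Total specimens per group and overall.
group_totals = {group: sum(by_inst.values()) for group, by_inst in counts.items()}
total = sum(group_totals.values())

# Percentage share of each group, rounded to one decimal place as in the table.
shares = {group: round(100 * n / total, 1) for group, n in group_totals.items()}

# Specimens contributed by each institution across all groups.
institution_totals = {}
for by_inst in counts.values():
    for inst, n in by_inst.items():
        institution_totals[inst] = institution_totals.get(inst, 0) + n

print(total)
print(shares)
print(institution_totals)
```

The recomputed shares match the Percentage column, and the institution totals (103, 45, and 49) sum to the 197 specimens reported.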
Table 3. Data types and characteristics in the natural heritage and specimen.
| Data Type | Characteristics | Application |
|---|---|---|
| Specimen Attributes | Taxonomic & biological data; collection metadata; preservation state; persistent identifiers (PIDs) | Taxonomic classification; specimen instances; linking physical specimens to digital surrogates |
| Project Information | Digitization activity records; workflows & equipment parameters; institutional collaboration metadata; conservation & restoration records | Documentation of digitization events; paradata modeling via CRMdig; activity attribution with PROV-O |
| Heritage Records | Unstructured text formats; historical documents; public reports like evaluation; cultural and ecological narratives | Integration of cultural significance; heritage enrichment; semantic annotation of narratives |
Table 4. Major equipment and specifications used for 3D digitization.
| Category | Equipment | Key Specifications | Application Stage |
|---|---|---|---|
| Structured-light Scanner | Artec Space Spider (Artec 3D, Luxembourg City, Luxembourg) | 0.05 mm point accuracy, 16 fps | Close-range geometry capture |
| DSLR Camera | Sony α7R V (Sony Corporation, Tokyo, Japan) | 61 MP, 35 mm f/1.4, ISO 100 | Texture acquisition |
| Turntable System | RB10-1300 (Rotary Systems, Seoul, South Korea) | 360° rotation, 5° interval, 48 positions | Automated rotational scanning |
| Lighting | Diffuse LED Panel (Manufacturer name, City, Country) | 1000–1500 lux, 5500 K, CRI > 95 | Uniform illumination environment |
Table 5. Analysis of paradata scopes in the digitization process of natural heritage.
| Category | Key Elements | Analytical Purpose |
|---|---|---|
| Physical Context | Scanning environment (illumination, humidity), specimen mounting, device resolution, scanning angles | Reproducibility and scientific credibility through controlled acquisition conditions |
| Computational Context | Alignment algorithms, texture mapping parameters, reconstruction workflows | Transparency and traceability of computational transformations |
| Interpretive Context | AI model selection, manual editing, quality thresholds, validation criteria | Accountability and interpretive transparency in human judgment |
Table 6. Mapping between 3D digitization workflow and paradata elements in natural heritage.
| 3D Digitization Task | Type of Paradata | Ontology Class |
|---|---|---|
| Input condition setting | Equipment selection, specimen complexity analysis | InputProvenance |
| Acquisition control | Environmental conditions, image overlap ratio | AcquisitionEvent |
| Algorithm parameters | Registration threshold, feature extraction method | DataProcessing |
| Interpretive decisions | AI model selection, reconstruction scope | InterpretiveDecision |
| Reliability expression | Confidence scores, extent of data loss | UncertaintyAnnotation |
| Quality assessment | Geometric accuracy, color fidelity | QualityAssessment |
| Optimization strategy | Polygon reduction, texture compression | OptimizationProcedure |
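The task-to-class mapping in Table 6 can be held as a simple lookup structure. The following is a minimal sketch, not the platform's actual implementation: the names `ParadataMapping`, `TASK_TO_PARADATA`, and `ontology_class_for` are hypothetical, and only the row contents come from the table.

```python
from typing import NamedTuple, Tuple


class ParadataMapping(NamedTuple):
    """One row of Table 6 (hypothetical container, for illustration only)."""
    paradata: Tuple[str, ...]
    ontology_class: str


# Rows of Table 6: digitization task -> (paradata types, ontology class).
TASK_TO_PARADATA = {
    "Input condition setting": ParadataMapping(
        ("Equipment selection", "Specimen complexity analysis"), "InputProvenance"),
    "Acquisition control": ParadataMapping(
        ("Environmental conditions", "Image overlap ratio"), "AcquisitionEvent"),
    "Algorithm parameters": ParadataMapping(
        ("Registration threshold", "Feature extraction method"), "DataProcessing"),
    "Interpretive decisions": ParadataMapping(
        ("AI model selection", "Reconstruction scope"), "InterpretiveDecision"),
    "Reliability expression": ParadataMapping(
        ("Confidence scores", "Extent of data loss"), "UncertaintyAnnotation"),
    "Quality assessment": ParadataMapping(
        ("Geometric accuracy", "Color fidelity"), "QualityAssessment"),
    "Optimization strategy": ParadataMapping(
        ("Polygon reduction", "Texture compression"), "OptimizationProcedure"),
}


def ontology_class_for(task: str) -> str:
    """Return the ontology class that Table 6 assigns to a workflow task."""
    return TASK_TO_PARADATA[task].ontology_class


print(ontology_class_for("Acquisition control"))  # AcquisitionEvent
```

Each task maps to exactly one ontology class, so a plain dictionary is sufficient for this lookup.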
Table 7. Core classes of the Nature Module.
| Class | Definition and Functional Role |
|---|---|
| DigitalSpecimen | Represents the digital surrogate of a physical natural specimen, characterized by persistent identifiers, specimen type, and holding institution. Extends the Darwin Core Occurrence class to establish a curated digital representation linked to the physical specimen within museum collections, supporting the Digital Extended Specimen (DES) paradigm. |
| Event | Records the spatiotemporal context of specimen collection, observation, or documentation activities. Captures the precise moment when the specimen was documented within a scientific framework, serving as essential provenance information for assessing specimen origin, collection methodology, and data reliability. |
| Location | Defines the geographic context in which the specimen was discovered or observed. Incorporates hierarchical spatial data including coordinates, administrative divisions, place names, elevation, and depth, supporting ecological distribution analysis, biogeographic research, and heritage site boundary delineation. |
| Identification | Documents the taxonomic determination process and outcomes for a specimen. Explicitly records the taxonomist, determination date, identification method, and confidence level, enabling historical traceability and re-evaluation of taxonomic decisions as systematic knowledge evolves. |
| DigitalMedia | Describes multimedia representations of specimens (photographs, videos, 3D models, and audio recordings), structured according to Audubon Core metadata standards. Includes technical attributes such as file format, resolution, color space, authorship, and capture conditions, ensuring standardized management and interoperability of digital assets. |
| Agent | Represents individuals or organizations participating in specimen-related activities, including collectors, taxonomists, curators, digitization technicians, and researchers. Documents roles, contributions, and institutional affiliations to support scientific accountability, provenance transparency, and scholarly citation throughout the specimen lifecycle. |
Table 8. Core object properties of Nature Module.
| Property Name | Description | Example | Linked Standard |
|---|---|---|---|
| recordedIn | Links specimen to collection event | Beetle → Field collection May 2023 | Darwin Core Event |
| hasLocation | Connects event to location | Event → Jeju Province (33.4569° N) | Darwin Core Location |
| identifiedBy | Links specimen to identification | Butterfly → Identified by Dr. Kim 2024 | Darwin Core Identification |
| hasDigitalMedia | Links specimen to media | Specimen → lateral.jpg (6000 × 4000 px) | Audubon Core |
| performedBy | Links activity to agent | Collection → NIBR Survey Team | Darwin Core Agent |
| curatedBy | Links specimen to institution | Specimen → National Science Museum | Darwin Core Agent |
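The examples in Table 8 read naturally as subject–property–object statements. As a rough illustration of how such statements can be queried, the sketch below stores them as in-memory triples and matches them against a pattern. This is a stand-in for the Neo4j graph described in the article, not the authors' code; `TRIPLES` and `match` are hypothetical names.

```python
# Example statements from Table 8 as (subject, property, object) triples.
TRIPLES = [
    ("Beetle", "recordedIn", "Field collection May 2023"),
    ("Event", "hasLocation", "Jeju Province (33.4569° N)"),
    ("Butterfly", "identifiedBy", "Identified by Dr. Kim 2024"),
    ("Specimen", "hasDigitalMedia", "lateral.jpg (6000 × 4000 px)"),
    ("Collection", "performedBy", "NIBR Survey Team"),
    ("Specimen", "curatedBy", "National Science Museum"),
]


def match(subject=None, prop=None, obj=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (prop is None or p == prop)
        and (obj is None or o == obj)
    ]


# Everything asserted about "Specimen" in the table's examples.
for s, p, o in match(subject="Specimen"):
    print(f"{s} --{p}--> {o}")
```

Leaving an argument as `None` turns it into a wildcard, which mirrors how a graph pattern leaves a node or relationship unconstrained.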
Table 9. Core classes of the Heritage Module.
| Class | Definition and Functional Role |
|---|---|
| OutstandingUniversalValue | Represents UNESCO World Heritage inscription criteria (vii–x), structuring attributes for geological processes, ecosystems, biodiversity, and esthetic value. Enables systematic assessment of heritage significance through quantifiable criteria (e.g., Jeju Volcanic Island’s geological diversity, DMZ biodiversity corridor), supporting conservation prioritization reasoning. |
| HeritageInscription | Records administrative and legal designation procedures across multi-layered protection systems (World Heritage, national monuments, provincial heritage). Tracks chronological stages (e.g., Jeju designation 2007), enabling historical traceability and comparative analysis of protection frameworks. |
| HeritageAspect | Decomposes multidimensional heritage values into separate evaluable attributes (biological diversity, geological importance, ecological processes, cultural landscape, scientific research value). Supports aspect-specific conservation strategies and enables reasoning across heritage value dimensions. |
| HeritageProject | Represents organized conservation, restoration, or sustainable use activities (e.g., wetland rehabilitation, invasive species removal, environmental monitoring). Links to Agent and Event classes to document stakeholders, timelines, and outcomes, enabling systematic evaluation of management effectiveness. |
| LegalInstrument | Describes legal, regulatory, and policy frameworks governing heritage protection (international conventions, national laws, ordinances, management guidelines). Specifies scope, enforceability, and jurisdictional hierarchy, supporting compliance analysis and conflict resolution in multi-jurisdictional sites. |
| SocialEngagement | Models participation of local communities, indigenous groups, and stakeholders, operationalizing CARE principles. Documents collaborative governance, traditional ecological knowledge integration, and consultation processes, enabling analysis of stakeholder participation patterns and diverse knowledge systems. |
Table 10. Core object properties of Heritage Module.
| Property Name | Description | Example | Linked Standard |
|---|---|---|---|
| hasOutstandingValue | Links site to OUV | Jeju Volcanic Island → Criterion viii | CIDOC-CRM E18 |
| inscribedAs | Connects value to inscription | OUV → UNESCO World Heritage 2007 | LIDO recordWrap |
| hasAspect | Links value to aspect | Hallasan → Alpine vegetation aspect | LIDO objectWorkType |
| implementedThrough | Links value to project | Biodiversity → Restoration Project 2020–2025 | CIDOC-CRM E7 |
| regulatedBy | Connects to legal instrument | Designation → Cultural Heritage Act Art. 25 | CIDOC-CRM E73 |
| involvesStakeholder | Links project to stakeholders | Restoration → Jeju community council | CIDOC-CRM E39 |
Table 11. Core classes of the Digital Module.
| Class | Definition and Functional Role |
|---|---|
| InputProvenance | Traces origins and generation context of raw digitization data. Structures input factors affecting data quality (scanner model, sensor configuration, environmental conditions, specimen preparation state), enabling systematic analysis of how acquisition context influences output fidelity. |
| AcquisitionEvent | Records the precise moment when digital data are captured from physical specimens. Specifies technical parameters (scan resolution, point-cloud density, capture angles, session duration), supporting reproducibility analysis and parameter optimization through correlation with quality outcomes. |
| DataProcessing | Documents computational procedures transforming raw data into 3D models. Records processing steps (point-cloud registration, mesh generation, noise reduction, hole filling) with algorithms, parameters, and software versions, enabling verification of processing decisions and replication of workflows. |
| QualityAssessment | Measures how faithfully digital outputs reproduce original specimen geometry, color, and texture. Performs quantitative evaluations based on Eureka3D quality indicators (geometric accuracy, color fidelity, texture clarity), enabling objective comparison across digitization methods and identification of quality thresholds. |
| UncertaintyAnnotation | Explicitly records confidence limits arising during digital reconstruction. Visually and semantically distinguishes reconstructed or AI-generated components (inferred surfaces, estimated colors, synthetic textures), quantifying uncertainty sources and degrees to support the critical interpretation of digital assets. |
| ProvenanceTrace | Establishes traceable networks, enabling backtracking from final digital objects to raw data and intermediate processing steps. Employs PROV-O properties wasDerivedFrom and wasGeneratedBy to construct complete genealogical chains, supporting authenticity verification and reproducibility validation. |
Table 12. Core object properties of Digital Module.
| Property Name | Description | Example | Linked Standard |
|---|---|---|---|
| usedInput | Links activity to input data | Alignment → Artec raw scan (0.1 mm) | PROV-O used |
| capturedBy | Connects data to acquisition | Raw images ← Robot scan May 2024 | CRMdig D2 |
| processedBy | Links data to processing | Point cloud → ICP registration | PROV-O wasGeneratedBy |
| assessedBy | Links output to QA | Fused model → RMS 0.23 mm | CRMdig D13 |
| annotatesUncertainty | Links model to uncertainty | Restoration area → AI 68% confidence | CRMdig D1 |
| tracedTo | Traces provenance chain | GLB ← Mesh ← Point Cloud ← Raw | PROV-O wasDerivedFrom |
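The `tracedTo` example in Table 12 describes backtracking from a delivered model to its raw source through `wasDerivedFrom` links. A minimal sketch of that traversal follows. It assumes each artefact has a single direct source, which is a simplification (PROV-O allows several), and `WAS_DERIVED_FROM` and `trace_to_raw` are hypothetical names rather than part of the E-DNH ontology.

```python
# wasDerivedFrom edges for the example chain in Table 12:
# each artefact maps to the artefact it was directly derived from.
WAS_DERIVED_FROM = {
    "GLB": "Mesh",
    "Mesh": "Point Cloud",
    "Point Cloud": "Raw",
}


def trace_to_raw(artifact: str) -> list:
    """Follow wasDerivedFrom links from an artefact back to its raw source."""
    chain = [artifact]
    seen = {artifact}
    while chain[-1] in WAS_DERIVED_FROM:
        source = WAS_DERIVED_FROM[chain[-1]]
        if source in seen:
            # A well-formed provenance record should never loop back on itself.
            raise ValueError(f"cycle in provenance chain at {source!r}")
        chain.append(source)
        seen.add(source)
    return chain


print(" ← ".join(trace_to_raw("GLB")))  # GLB ← Mesh ← Point Cloud ← Raw
```

The cycle check is a defensive addition: derivation chains should be acyclic, so a loop indicates a recording error rather than a valid lineage.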
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lee, Y.; Seol, S.; Oh, J.; Lee, J. Semantic Collaborative Environment for Extended Digital Natural Heritage: Integrating Data, Metadata, and Paradata. Heritage 2025, 8, 507. https://doi.org/10.3390/heritage8120507

AMA Style

Lee Y, Seol S, Oh J, Lee J. Semantic Collaborative Environment for Extended Digital Natural Heritage: Integrating Data, Metadata, and Paradata. Heritage. 2025; 8(12):507. https://doi.org/10.3390/heritage8120507

Chicago/Turabian Style

Lee, Yeeun, Songie Seol, Jisung Oh, and Jongwook Lee. 2025. "Semantic Collaborative Environment for Extended Digital Natural Heritage: Integrating Data, Metadata, and Paradata" Heritage 8, no. 12: 507. https://doi.org/10.3390/heritage8120507

APA Style

Lee, Y., Seol, S., Oh, J., & Lee, J. (2025). Semantic Collaborative Environment for Extended Digital Natural Heritage: Integrating Data, Metadata, and Paradata. Heritage, 8(12), 507. https://doi.org/10.3390/heritage8120507
