1. Introduction
The exponential growth of unstructured textual data has positioned ontologies as a cornerstone of the Semantic Web and knowledge-driven Artificial Intelligence (AI). As formal, explicit specifications of conceptualizations [1], ontologies provide a structured, machine-interpretable framework that enables advanced data integration, semantic search, and reasoning tasks. They move beyond simple data representation by encoding knowledge in a logical format, which allows for automated reasoning. This reasoning capability underpins intelligent applications in domains such as semantic search, complex data integration, and knowledge discovery. The foundational logic underpinning the modern Web Ontology Language (OWL) is Description Logics (DL), a family of knowledge representation languages that offer a compelling balance between expressivity and computational tractability [2]. The manual construction of ontologies, however, is a notoriously knowledge-intensive, time-consuming, and expensive process, creating a significant bottleneck for the extension of knowledge systems to the vastness of modern Big Data corpora [3]. In response, the field of Ontology Learning from Text (OLT) has emerged, aiming to (semi-)automate the acquisition of ontological components from unstructured documents.
Recent years have witnessed remarkable progress in the initial stages of this pipeline, particularly with the advent of deep neural networks and Transformer-based models. Techniques for Named Entity Recognition (NER) and relation extraction (RE) have achieved state-of-the-art performance in identifying concepts and their binary relations within text [4,5]. Our previous work [6], which employed a Transformer-based Fusion model, falls squarely within this paradigm, successfully extracting a rich set of initial concepts and relational triples e.g.,
,
from corpora. Despite these advances, a critical gap persists between the extraction of flat, lexical triples and the construction of a formally rigorous ontology. The latter requires the generation of axioms, which are logical sentences that define the semantics of the ontology’s vocabulary [7]. These axioms populate the Terminological Box (TBox), which contains intensional knowledge like class hierarchies (
), domain/range restrictions, and property characteristics (functional and inverse functional), and the Assertional Box (ABox), which contains extensional knowledge about individuals [2]. The core problem this paper addresses is the axiom generation bottleneck. While we can extract
as a relation between
and
with high confidence, current automated methods struggle to formally assert that
is an
with
as its domain and
as its range. This gap leaves a chasm between the statistical outputs of Natural Language Processing (NLP) models and the logical requirements of ontology engineering. The initial approach to this problem was purely manual, relying on domain experts to author logical axioms [8]. While this ensures high semantic fidelity, it is fundamentally not scalable and lacks the agility required for dynamic, large-scale corpora [7,9].
Within ontological engineering, axioms constitute the formal underpinnings of a knowledge representation system [1,10]. They are expressed in a logical language, the Web Ontology Language (OWL) [11], and serve to define the semantics of the vocabulary by imposing constraints and declaring logical relationships. OWL, particularly OWL 2, is a W3C-standardized language for constructing ontologies on the Semantic Web, with its formal semantics grounded in the SROIQ(D) Description Logic (W3C OWL Working Group) [11]. It models domain knowledge using a structure of classes (concepts), properties (relations), and individuals (instances), and supports powerful reasoning through constructors that allow for the formation of complex class expressions. OWL 2 is characterized by its dual semantic frameworks [11], namely Direct Semantics for full expressivity and RDF-Based Semantics for broader web integration, as well as defined profiles that optimize the trade-off between logical expressiveness and computational tractability for practical applications [11].
While the prior steps of concept and relation extraction populate the ontology with its core elements, for example, defining classes like
and
, instances like
and properties like
and
, it is the axiom set that codifies their intended meaning and enables automated reasoning [2]. The critical function of axioms is twofold. First, they define the Terminological Box (TBox), which describes the conceptual schema of the domain. This includes axioms such as declaring domain and range restrictions (e.g.,
), stating subsumption hierarchies (e.g.,
), or defining property characteristics (e.g., asserting that
is transitive) [12]. Second, they populate the Assertional Box (ABox) with ground facts that instantiate the TBox schema (e.g.,
). The power of this formalization is that it transforms a static collection of terms and relations into a dynamic knowledge base. This enables foundational knowledge management tasks such as consistency checking, classification, and subsumption inference, thereby uncovering implicit knowledge that is not explicitly stated in the original text [12,13].
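To make the role of the axiom set concrete, the following minimal Python sketch shows how a handful of TBox and ABox statements can yield class memberships that were never stated explicitly. All class, property, and individual names (City, Location, hasMayor, Paris, and so on) are hypothetical and chosen only for illustration, and the two functions are a toy stand-in for a real Description Logic reasoner rather than part of the proposed pipeline.

```python
# Hypothetical toy vocabulary; none of these names come from the paper.
subclass_of = {("City", "Location"), ("Location", "Place")}  # TBox: subsumption
domain_of = {"hasMayor": "City"}                              # TBox: domain restriction
range_of = {"hasMayor": "Person"}                             # TBox: range restriction
class_assertions = {("Paris", "City")}                        # ABox: class assertion
property_assertions = {("Lyon", "hasMayor", "Alice")}         # ABox: property assertion


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following SubClassOf links."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def infer_types(subclass_of, domain_of, range_of, class_assertions, property_assertions):
    """Derive implicit class assertions from domain, range, and subsumption axioms."""
    types = set(class_assertions)
    # Domain and range axioms type the subject and object of each property assertion.
    for subj, prop, obj in property_assertions:
        if prop in domain_of:
            types.add((subj, domain_of[prop]))
        if prop in range_of:
            types.add((obj, range_of[prop]))
    # Subsumption propagates every membership up the class hierarchy.
    for individual, cls in list(types):
        for sup in superclasses(cls, subclass_of):
            types.add((individual, sup))
    return types


inferred = infer_types(subclass_of, domain_of, range_of,
                       class_assertions, property_assertions)
for individual, cls in sorted(inferred):
    print(f"{individual} : {cls}")
```

In this toy example, Lyon is never declared to be a City, yet the domain axiom on hasMayor entails it, and the subsumption axioms then entail that Lyon is also a Location and a Place.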
The first wave of automation employed traditional machine learning (ML) and pattern-based methods. Early works used association rule mining [14] and inductive logic programming [15,16,17] to infer hierarchical structures, e.g., is-a relations, and simple axioms from textual data. However, these methods were severely limited. They relied heavily on hand-crafted linguistic patterns or shallow statistical co-occurrence, making them brittle, domain-dependent, and incapable of capturing the complex, contextual semantics required for robust axiom induction (e.g., distinguishing between
as a geographical versus an administrative relation) [18]. The shift towards deep learning with models like Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory networks (LSTMs) marked a significant step forward [11,18]. These models could better capture contextual word representations and long-range dependencies, and hence improve the extraction of concepts and relations. However, their application to direct axiom generation was limited. RNNs and LSTMs, with their sequential processing nature, often struggled with capturing the complex, non-sequential logical structures of axioms and faced challenges with vanishing gradients over long, structured outputs. CNNs, adept at local feature extraction, were less effective at modeling the global dependencies within a sentence that are crucial for understanding relational semantics. Consequently, deep learning models of this era were primarily used as feature extractors for the initial NLP tasks, but the leap from these features to formal logic was still bridged by simplistic, non-neural post-processing rules, inheriting many of the same semantic limitations [7,19].
The advent of the Transformer architecture [20] and its subsequent evolution into large-scale pre-trained language models (LLMs) like BERT [4] and GPT has fundamentally revolutionized natural language processing (NLP). With their core self-attention mechanism [20], these models excel at capturing complex, contextual relationships in text, which has propelled them to state-of-the-art performance on fundamental tasks such as Named Entity Recognition (NER) and relation extraction (RE) [4]. Recently, LLMs have significantly advanced beyond these foundational tasks, demonstrating near-human performance across a wide spectrum of linguistic challenges and showing their utility as general-purpose foundation models [21,22]. This capability highlights their potential as powerful tools for high-level knowledge engineering tasks, including knowledge graph completion, ontology refinement, and question answering, with recent research even applying them directly to ontology-related tasks such as generating competency questions [23], concepts, relations [22], and axioms [10]. Building directly upon this transformative technology, our previous work employed a Multimodal Transformer-based Fusion model [6], where we leveraged its strengths to produce a high-quality set of concepts and relational triples that now serve as the foundational input to the current study.
However, a critical gap persists. While Transformer-based models are unparalleled extractors, they are not innate . The output of these models is typically a set of probabilistic, lexical triples (e.g., , , ). The transformation of these surface-level extractions into a formally rigorous ontological structure, populated with property assertion axioms like (, ) and (, ), remains an open problem because the field lacks a robust, scalable method to translate the statistical confidence of a neural extractor into the logical constraints of an ontology. Current state-of-the-art pipelines often treat extraction and axiomatization as disconnected phases, leaving the latter under-automated. This study addresses this final-mile challenge by proposing a novel pipeline for automated axiom generation via schema mapping, defined on Transformer-based extracted triples. We posit that the rich, contextual representations from models like ours provide a sufficiently stable foundation for statistical schema induction. Our approach does not attempt to build a new neural logic generator; rather, it develops a principled, data-driven mapping methodology to formalize the outputs of a state-of-the-art extractor into OWL axioms.
The contributions of this study are threefold:
We propose an end-to-end pipeline that bridges the gap between state-of-the-art Transformer-based knowledge extraction and formal axiom generation, directly targeting the axiom generation bottleneck.
We introduce a practical schema mapping and induction methodology that leverages the quality of Transformer-derived triples to infer TBox and ABox axioms, including class hierarchies, domain/range constraints, and class and property assertions.
We contribute a scalable, Transformer-augmented framework that advances the field toward fully automated ontology learning, providing a clear pathway from textual data to a reasoning-ready knowledge base.
It is important to note that the primary objective of this study is to bridge the critical gap between statistical relation extraction and the generation of foundational, logically consistent ontological axioms. Consequently, our pipeline is designed to produce an ontology comprising (1) class hierarchies (SubClassOf), (2) domain and range restrictions, and (3) class and property assertions (ABox). While this forms the essential terminological and assertional backbone of an OWL ontology, we consciously defer the automated generation of richer OWL constructs, such as property characteristics (e.g., symmetry, transitivity) and cardinality constraints, to future work. This strategic focus allows us to establish a reliable, scalable foundation for automated ontology learning, upon which more expressive layers can be subsequently built.
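As an illustration of the three axiom families listed above, the following minimal Python sketch serializes an already-induced schema and a set of typed triples into OWL 2 functional-style syntax. It is a sketch under stated assumptions, not the implementation used in this study: the `TypedTriple` record, the `to_functional_syntax` function, the default `:` prefix, and all entity names are hypothetical, and every property is assumed to be an object property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedTriple:
    """Hypothetical record for one extracted triple with entity types attached."""
    subject: str
    subject_type: str
    predicate: str
    obj: str
    object_type: str


def to_functional_syntax(triples, hierarchy, domains, ranges):
    """Render (1) SubClassOf, (2) domain/range, and (3) ABox axioms as strings."""
    # (1) Class hierarchy: one SubClassOf axiom per (subclass, superclass) pair.
    axioms = [f"SubClassOf(:{sub} :{sup})" for sub, sup in sorted(hierarchy)]
    # (2) Domain and range restrictions, assuming object properties throughout.
    axioms += [f"ObjectPropertyDomain(:{p} :{c})" for p, c in sorted(domains.items())]
    axioms += [f"ObjectPropertyRange(:{p} :{c})" for p, c in sorted(ranges.items())]
    # (3a) Class assertions, de-duplicated across all triples.
    memberships = set()
    for t in triples:
        memberships.add((t.subject_type, t.subject))
        memberships.add((t.object_type, t.obj))
    axioms += [f"ClassAssertion(:{c} :{i})" for c, i in sorted(memberships)]
    # (3b) Property assertions, one per extracted triple.
    ordered = sorted(triples, key=lambda t: (t.predicate, t.subject, t.obj))
    axioms += [f"ObjectPropertyAssertion(:{t.predicate} :{t.subject} :{t.obj})"
               for t in ordered]
    return axioms


# Hypothetical input; the names are illustrative only.
triples = [TypedTriple("Lyon", "City", "hasMayor", "Alice", "Person")]
axioms = to_functional_syntax(
    triples,
    hierarchy={("City", "Location")},
    domains={"hasMayor": "City"},
    ranges={"hasMayor": "Person"},
)
print("\n".join(axioms))
```

Emitting plain functional-syntax strings keeps the sketch free of external dependencies; in practice an ontology library would typically be used so that the result can be loaded directly into a reasoner.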
The remainder of this paper is structured as follows: Section 2 reviews related work in detail. Section 3 details our proposed framework. Section 4 describes the experimental setup. Section 5 presents and discusses the results. Finally, Section 6 concludes the paper.
2. Related Work
Several studies have addressed the problem of axiom generation from textual data [14,24,25,26,27,28], a critical sub-problem in the long-standing goal of automating ontology construction for the Semantic Web and knowledge engineering communities. This literature review traces the evolution of these methods, from early logic-based and statistical approaches to modern neural techniques.
Maedche and Volz [14] proposed Text-To-Onto, a system that relies on association rule mining and formal concept analysis to discover taxonomic relationships from text [14]. Similarly, Poon and Domingos [24] proposed an approach based on inductive logic programming (ILP) to learn Horn clauses from corpora that could be translated into ontological axioms [24]. However, these approaches are fundamentally brittle and domain-dependent. They operated on hand-crafted linguistic patterns and shallow syntactic parses, lacking the robust semantic understanding needed to handle paraphrases or discern precise logical semantics for complex relations. Our proposed approach addresses these limitations by building upon the rich, context-aware representations from a Transformer-based model, enabling a more nuanced and robust understanding of relational semantics without relying on predefined patterns.
Völker and Niepert [25] proposed a statistical schema induction method that uses hierarchical clustering on the argument pairs of relational phrases to hypothesize domain and range restrictions for properties [25]. This represented a shift towards more scalable, corpus-driven methods. However, its main limitation is its reliance on the distributional hypothesis alone. It can be misled by frequent but semantically invalid co-occurrences and struggles to differentiate between polysemous relations (e.g., the various senses of
), as it prioritizes empirical frequency over contextual logic. Our proposed approach addresses this limitation by using the high-confidence, typed entities and relations extracted by our Transformer-based Fusion model as a filtered and structured input. We perform schema induction on this refined set, mitigating the noise of raw corpus statistics and allowing for more precise axiom generation.
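The idea of inducing a schema from a filtered set of typed triples can be sketched as a simple majority vote. The Python example below is illustrative only: the function name `induce_domain_range`, the tuple layout, the thresholds `min_confidence` and `min_support`, and the sample data are all assumptions made for this sketch, and majority voting stands in for whatever induction criterion a full pipeline would apply.

```python
from collections import Counter, defaultdict


def induce_domain_range(typed_triples, min_confidence=0.8, min_support=2):
    """Vote on a domain and range class per property from typed, scored triples.

    typed_triples: iterable of (subject_type, predicate, object_type, confidence).
    Both thresholds are hypothetical defaults, not values from the paper.
    """
    domain_votes = defaultdict(Counter)
    range_votes = defaultdict(Counter)
    for subject_type, predicate, object_type, confidence in typed_triples:
        # Discard low-confidence extractions before counting any votes.
        if confidence < min_confidence:
            continue
        domain_votes[predicate][subject_type] += 1
        range_votes[predicate][object_type] += 1

    def winners(votes):
        # Keep the most frequent class only when it has enough supporting triples.
        result = {}
        for predicate, counter in votes.items():
            cls, count = counter.most_common(1)[0]
            if count >= min_support:
                result[predicate] = cls
        return result

    return winners(domain_votes), winners(range_votes)


# Hypothetical extractor output: (subject type, predicate, object type, confidence).
sample = [
    ("City", "locatedIn", "Country", 0.97),
    ("City", "locatedIn", "Country", 0.91),
    ("Person", "locatedIn", "Country", 0.55),  # dropped: below min_confidence
    ("City", "hasMayor", "Person", 0.93),      # kept, but below min_support
]
domains, ranges = induce_domain_range(sample)
print(domains)
print(ranges)
```

Filtering by extractor confidence before counting is what separates this from induction over raw corpus statistics: a frequent but unreliable co-occurrence contributes no votes at all.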
Chen et al. [26] utilized an LSTM-based encoder to learn representations for ontology alignment, a task requiring understanding of axiomatic structures [26]. Similarly, Javed et al. [27] employed deep neural networks to learn embeddings for ontology completion in the Existential Language (EL) Description Logic [27]. However, the limitations of these methods for direct axiom generation from text are significant. The sequential nature of RNNs/LSTMs made them suboptimal for modeling the complex, non-sequential structures of OWL axioms, often creating an information bottleneck. These models were often used as feature extractors within larger pipelines, where the final axiomatization still required manual, rule-based post-processing, failing to fully automate the leap from distributed representations to symbolic logic. Our proposed approach addresses this limitation by designing a deterministic, post hoc mapping pipeline that is decoupled from the neural extraction phase. This provides a transparent and scalable path from neural extracts to formal axioms, overcoming the architectural constraints of sequential models.
The advent of Transformer-based models such as BERT [4] revolutionized the initial stages of ontology learning, providing state-of-the-art performance in extracting concepts and relations. Recently, He [28] explored using prompt engineering with Large Language Models (LLMs) to generate OWL axioms directly from text [28]. However, the limitations of this approach are critical for ontology engineering: LLMs are stochastic and can “hallucinate,” generating logically inconsistent axioms with no formal guarantees. Their black-box nature makes debugging and verification difficult, and their computational cost is prohibitive for generating large-scale ontologies. Our proposed approach addresses these limitations by avoiding direct, generative use of LLMs for logic. Instead, we use a structured, deterministic schema mapping process applied to the outputs of a smaller, fine-tuned Transformer.
Our proposed method of axiom generation through schema mapping does not seek to replace deep learning but to complement it effectively. It provides a principled, automated bridge from the probabilistic, textual world captured by Transformers to the deterministic, logical world of Description Logics (DLs). This approach directly populates a DL knowledge base with both TBox and ABox axioms, thereby addressing the axiom generation bottleneck that has persisted through multiple eras of NLP research. Our proposed approach to automated axiom generation is presented next.