Article

From Tacit Knowledge Distillation to AI-Enabled Culture Revitalization: Modeling Knowledge Cycles in Indigenous Cultural Systems

1
Department of Computer Science and Information Engineering, National Taitung University, Taitung 950309, Taiwan
2
Department of Information Science and Management Systems, National Taitung University, Taitung 950309, Taiwan
3
Department of Cultural Resources and Leisure Industries, National Taitung University, Taitung 95092, Taiwan
*
Author to whom correspondence should be addressed.
Soc. Sci. 2026, 15(1), 7; https://doi.org/10.3390/socsci15010007
Submission received: 28 October 2025 / Revised: 11 December 2025 / Accepted: 18 December 2025 / Published: 23 December 2025
(This article belongs to the Topic Social Sciences and Intelligence Management, 2nd Volume)

Abstract

This study addresses the challenge of digitally modeling Indigenous Traditional Ecological Knowledge (TEK) in a manner that respects and preserves its epistemic integrity. Grounded in ethnographic inquiry and system design, the research introduces a four-tier knowledge typology that conceptualizes how tacit, explicit, tribal and cultural knowledge circulate within Indigenous communities. This cyclical model highlights recursive and embodied processes of knowledge internalization, transmission, and integration, offering a dynamic alternative to linear knowledge flow frameworks. Building upon this epistemological foundation, this study traces the transition from traditional data practices, which are centered on oral histories, ritual performances, and ecological observation, to a contemporary AI-assisted architecture that operationalizes these forms through structured semantic enrichment, modular knowledge storage, and culturally aligned reasoning systems. The proposed system integrates layered components, from data acquisition to multi-agent inference models, while embedding ethical protocols that affirm community sovereignty and relational authority. The findings suggest that TEK systems can be effectively encoded into modern digital infrastructures without erasing their socio-cultural contexts. By foregrounding Indigenous epistemologies within system design, the research advances a critical paradigm for culturally responsive knowledge technologies in sustainability, education, and heritage preservation.

1. Introduction

Indigenous Traditional Ecological Knowledge (TEK) represents a rich reservoir of cultural, linguistic, and environmental insights developed through generations of intimate interaction with local ecosystems. In many Indigenous communities, such knowledge is not merely a static repository of facts but a living, evolving system embedded in oral tradition, ritual practice, and intergenerational memory. For Taiwan’s Paiwan people, TEK encompasses sophisticated systems of plant classification, seasonal agricultural rhythms, hunting practices, and sustainable land stewardship embedded within myth, ritual and oral memory.
Paiwan knowledge is transmitted through ceremonial storytelling, intergenerational farming, and object-based learning, such as crafting hunting tools or preparing medicinal plants. These oral practices are frequently nonlinear and situational, and deeply tied to both the land and the social fabric of community life. One example is the Maljeveq (Paiwan Five-Year Ceremony), in which knowledge of setting up symbolic barricades to fend off evil spirits, piercing rattan balls pitched into the air in a contest, and singing and dancing is ritually enacted rather than explained. Capturing this kind of knowledge requires not only data collection but also context-sensitive models that reflect the structure of Indigenous memory and meaning-making.
However, with the encroachment of globalization, linguistic decline and the disconnection of younger generations from traditional lifeways, many of these knowledge systems face the risk of fragmentation or loss. This challenge is particularly acute in regions where oral transmission remains the primary mode of knowledge preservation. As elders pass away and youth become increasingly embedded in digital and globalized realities, mechanisms for sustaining TEK must evolve in both form and function. Traditional methods alone may no longer suffice to ensure the continued relevance and accessibility of Indigenous knowledge in contemporary contexts. Accordingly, there is an urgent need for hybrid approaches that bridge oral and digital domains, enabling the encoding, retrieval, and regeneration of TEK through culturally sensitive technological means.
Recent advances in artificial intelligence (AI, computer systems designed to perform tasks that typically require human intelligence, such as understanding language or making predictions), particularly large language models (LLMs, AI models trained on large amounts of text to generate and interpret human-like language), knowledge modeling (organizing knowledge into structured concepts and relationships so it can be stored, searched, and reasoned with), and agent-based systems (software agents that can take actions, follow rules or goals, and collaborate with other agents), have opened up new possibilities for revitalizing Indigenous knowledge systems. These technologies can support not only digitization and archiving (converting materials into digital form and storing them for long-term access) but also the simulation of cultural contexts, interactive transmission, and community-based knowledge engagement. When applied thoughtfully and collaboratively, such technologies have the potential to support rather than supplant Indigenous modes of learning and teaching.
This study proposes a conceptual and practical framework for sustaining Indigenous TEK by transforming oral and fragmented data into actionable knowledge systems. Through an interdisciplinary approach that integrates information science, digital humanities, and Indigenous epistemology, we explore how structured, semi-structured, and unstructured data, collected both through fieldwork and external repositories, can be processed using AI-assisted methods, including statistical analysis, machine learning, and knowledge modeling. Moreover, we investigate the role of semantic agents (software agents that use meaning-aware concepts to interpret and act on knowledge), retrieval-augmented generation (RAG, an approach where an AI first retrieves relevant documents from a collection and then generates an answer grounded in those sources), agentic AI (AI designed to plan and carry out multi-step tasks, often by calling tools or coordinating sub-tasks), and scenario-based systems (interactive setups that let users explore “what-if” situations and possible outcomes) in facilitating user-centered knowledge interaction.
The goals of this research are twofold: first, to demonstrate how data-to-knowledge transformation can support cultural revitalization and intergenerational transmission; and second, to design a participatory AI-enhanced knowledge system that reflects the lived realities and memory practices of Indigenous communities. The present study contributes to a broader discourse on digital heritage preservation, knowledge sovereignty, and the ethical application of AI in culturally embedded contexts.

2. Literature Review

TEK is widely recognized as a holistic and adaptive knowledge system that integrates ecological observations, spiritual beliefs, and social practices developed through generations of close interaction with the environment. Usher (2000) and Robinson et al. (2021) provided detailed discussions of TEK in their work. In contrast to positivist models, TEK is often encoded in oral narratives, ritual performances, and lived experiences. McGregor (2004) argued that this characteristic makes TEK both dynamic and place-based. In the context of Taiwan, Wu et al. (2014) and Fang et al. (2016) demonstrated that Indigenous groups such as the Paiwan and the Atayal developed elaborate ecological classifications, weather forecasting methods, and seasonal agricultural practices rooted in oral traditions.
However, TEK faces persistent threats arising from colonization, language loss, generational disconnection, and the dominance of Western scientific paradigms. Scholars such as Joubert and Biernacka (2015), McCann et al. (2016), and Tumbali (2025) argued that preserving TEK requires more than documentation. They maintained that it also necessitates active revitalization strategies that are culturally situated and technologically adaptive.
The transition from oral to digital forms of knowledge remains a central concern in digital humanities and Indigenous media studies. Oral knowledge systems are nonlinear, contextual, and often correlative, which makes them difficult to translate into rigid data structures. Attempts to digitize TEK through video archives, ethnographic databases, or mobile applications have often encountered epistemological mismatches between Indigenous worldviews and Western data logics. Christen (2012) offered several insightful perspectives on this issue, particularly by emphasizing the risks of applying extractive digital frameworks to relational knowledge systems.
Aotearoa New Zealand’s Māori-led research has been internationally recognized as a pioneering force in the digital development of TEK, where technological innovation is deeply intertwined with assertions of cultural sovereignty. Foundational efforts in natural language processing, such as those by Mohaghegh et al. (2014) and James et al. (2022), laid the groundwork for digitizing the Māori language while simultaneously foregrounding the core challenge of data scarcity in low-resource linguistic contexts. The digital archiving of Māori songs by Ka’ai-Mahuta (2012) further illustrated how cultural heritage can be preserved through innovative strategies that address such limitations. Hudson and Whaanga (2023) provided a comprehensive synthesis of three decades of Māori engagement with computational technologies aimed at enhancing cultural resilience, preserving language, fostering social regeneration, and resisting epistemic erosion. More recently, Keegan (2025) called for greater technological agency within Māori communities and proposed a “Sovereign AI Framework” that affirms self-determination not only in technological design and deployment but also in matters of data ownership, governance, and ethical use.
Emerging studies, such as those by Dutta and Das (2016) and Schmitt (2024), advocated decolonized design approaches in which Indigenous communities co-created the structure, interface, and functionality of digital knowledge systems. These approaches respect the ontological integrity of oral traditions while leveraging digital affordances for transmission and regeneration. Informed by these precedents, our research follows a similar trajectory in developing a Paiwan-specific knowledge sovereignty framework for AI. This framework aims to align with Indigenous epistemologies while ensuring culturally grounded governance over data, tools, and knowledge systems.
LLMs such as GPT (Generative Pre-Trained Transformer, a language model that can generate fluent text and answer questions by predicting what words should come next), BERT (Bidirectional Encoder Representations from Transformers, a language model designed to understand the meaning of text by reading the words in context, which is useful for tasks like classification and information extraction), and LLaMA (Large Language Model Meta AI, Meta’s family of language models that can be run and adapted by researchers and developers to build language-based applications) are reshaping natural language understanding and generation. Their application in knowledge distillation, which converts unstructured linguistic input into structured, queryable knowledge, provides novel avenues for encoding and interacting with TEK. Domazetoski (2024) demonstrated the feasibility of this approach through practical implementation in his master’s thesis.
However, applying LLMs to culturally sensitive domains raises critical concerns. Challenges such as hallucinations, cultural misrepresentation, and epistemic flattening require deliberate mitigation through fine-tuning (the process of further training a pre-trained model on task- or domain-specific data to better adapt its behavior and performance), RAG, and the incorporation of community-validated corpora. Zhao et al. (2024) discussed these issues in detail. Māori research also provides illustrative cases in which the impacts of AI hallucinations are documented. Hellmann (2025) observed that generative AI reproduces colonial visual conventions by depicting land as terra nullius, which undermines Māori assertions of sovereignty. Greaves et al. (2024) used the Integrated Data Infrastructure maintained by Statistics New Zealand to highlight structural imbalances in data governance. They argued that Māori communities exercise minimal control over how their data are managed, analyzed, and reported. This lack of authority can contribute to stigmatization when interpretations are detached from broader socio-historical contexts and do not support solution-oriented outcomes. Moreover, such government datasets often serve as foundational training material for LLMs, which can amplify epistemic risks. Without community agency in data governance, reusing these datasets in generative AI systems risks reinforcing marginalizing narratives and undermining Indigenous sovereignty over knowledge representation.
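To make the mitigation idea concrete, the following is a minimal sketch of retrieval restricted to community-validated passages, which is one plausible reading of how RAG and validated corpora could be combined. It is not drawn from any of the cited works. The `community_validated` flag, the word-overlap scoring, and the prompt wording are all illustrative assumptions, and the generation step is deliberately left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    community_validated: bool  # hypothetical flag set by a community review step


def tokens(text):
    """Lowercase word set; a stand-in for a real embedding or index."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, corpus, k=2):
    """Return up to k validated passages ranked by word overlap with the query."""
    q = tokens(query)
    scored = [
        (len(q & tokens(p.text)), p)
        for p in corpus
        if p.community_validated  # unvalidated passages are never retrieved
    ]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query, passages):
    """Assemble a grounded prompt; returns None when nothing was retrieved."""
    if not passages:
        return None  # caller should decline rather than let a model guess
    sources = "\n".join(f"- {p.text}" for p in passages)
    return f"Answer only from these sources:\n{sources}\n\nQuestion: {query}"
```

Returning `None` when no validated passage matches is a design choice in this sketch: it lets the calling system decline to answer instead of generating unsupported text, which is one way to limit hallucination in a culturally sensitive setting.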
Ethical concerns also arise because AI systems that operate without culturally grounded frameworks can cause systemic harm. Taiuru (2024) reported that findings from the New Zealand Artificial Intelligence Attitudes Study indicated widespread concern among businesses regarding AI ethics and algorithmic bias. The study reported that 53% of companies expressed doubts about the ethical implications of AI systems, and 50% believed that AI algorithms reflect human biases. Moreover, 82% of respondents agreed that AI should be subject to legislative oversight. Importantly, the study’s conclusions suggested that in areas such as algorithmic bias, lived experience may provide insights that exceed those derived from formal academic training. This finding underscores the need to incorporate community-based knowledge and experiential perspectives into AI governance frameworks, especially in contexts where cultural and social implications are profound. Hallucination-driven misinformation also poses risks to social trust, contributing to community anxiety, anger, and broader distrust, as Hannah et al. (2022) discussed.
In response to these challenges, the New Zealand government released the Responsible AI Guidance for the Public Service (New Zealand Government 2025). Māori scholars also developed the Māori AI Governance Framework (Cormack et al. 2025), which outlined core principles grounded in key values derived from a Māori worldview. Implementing this framework can help ensure that AI technologies contribute to collective wellbeing, protect Māori data rights, and prepare Aotearoa for a culturally grounded digital future. These developments inform our effort to articulate how LLMs may evolve, under explicit constraints and governance structures, toward LKMs that align with Paiwan epistemologies and knowledge sovereignty. In this context, TEK is not merely a dataset; it is a living system that necessitates ethical and participatory approaches to AI integration.
Semantic agents and scenario-based AI systems are gaining traction as tools for navigating complex, context-dependent knowledge domains. These agents integrate natural language processing, contextual memory, and external tool invocation to simulate human-like interactions and planning processes. Han et al. (2023) and Park et al. (2023) demonstrated that embedding semantic agents within culturally specific logic models, such as Paiwan ceremonial calendars or traditional land use cycles, enables AI systems to align more closely with Indigenous epistemologies. Agentic AI models also offer possibilities for interactive, intergenerational dialog that mirrors the dynamics of oral knowledge transmission.
At the core of any digital intervention involving Indigenous knowledge is the principle of knowledge sovereignty. This principle protects the rights of Indigenous communities to own, control, access, and benefit from their cultural knowledge and data. Carroll et al. (2019) and Hiraldo et al. (2020) provided comprehensive examinations of key perspectives in Indigenous data governance. Ethical frameworks such as OCAP® (Ownership, Control, Access, and Possession), discussed by Konczi and Bill (2024), and the CARE principles (Collective Benefit, Authority to Control, Responsibility, Ethics), outlined in Carroll et al. (2020), offered foundational guidance for designing TEK-related AI systems. Neglecting these frameworks risks perpetuating colonial extractivism under the guise of technological advancement. Accordingly, this study integrates ethical review and community consent as foundational elements in the system design process.
While previous research addressed TEK preservation, digital archiving, and AI applications in general knowledge systems, few studies integrated these domains into a unified framework. In particular, the intersection of agent-based reasoning, culturally grounded modeling, and Indigenous knowledge systems remains underexplored. This study addresses this gap by proposing a model that transforms TEK-related data into actionable knowledge through agentic processes. The proposed approach centers Indigenous memory, values, and epistemologies in the design and application of digital systems, thereby supporting cultural coherence and community relevance.

3. Methodology and System Design

The research adopts a longitudinal, participatory, and technology-integrated methodology to facilitate the transformation of orally transmitted Indigenous TEK into digitally operable and culturally respectful knowledge systems. Bridging ethnographic inquiry with computational design, we adopt a co-constructive research paradigm that underpins community participation, iterative knowledge modeling, and the responsible deployment of agentic artificial intelligence. Our methodology is shaped by two interdependent trajectories: the cultural logic and embodied practices of TEK observed through long-term fieldwork, and the affordances of semantic data processing and adaptive AI systems capable of dialogical, permission-aware interaction. By integrating these dimensions, we seek to construct a system in which Indigenous knowledge is not merely digitized, but meaningfully structured, contextualized, and sustained through culturally congruent interaction models. This section presents the stepwise transition from ethnographic engagement to technical implementation, including the modeling of knowledge layers, agent design strategies, and the ethical protocols that govern system development and use.

3.1. Ethnographic Engagement and Community Co-Creation

Over a six-year period, we conducted iterative fieldwork in partnership with Paiwan communities in southeastern Taiwan. Through cycles of observation, dialog and collaborative practice, the research team transitioned from external documentation to joint knowledge production. Ethnographic practices included oral history interviews, ritual observation, cultural mapping, and co-design workshops with elders, artisans, and youth. Emphasis was placed on relational epistemologies, privileging Indigenous ways of knowing and community-defined priorities. The conceptual framework from oral data to knowledge systems is illustrated in Figure 1.
In alignment with the CARE principles and Indigenous data sovereignty frameworks, all research activities were carried out with prior informed consent and under a dual structure of ethical oversight. At the national level, our protocol was reviewed and approved by the Institutional Review Board (IRB) and remains under its continuous monitoring. At the community level, all procedures were openly discussed in tribal councils, and recognized knowledge holders were consulted. To prevent the imposition of external legal frameworks, all governance instruments, including consent forms and authorization agreements, were developed in collaboration with a native Paiwan legal advisor.
The data collection and approval workflow, illustrated in Figure 2, begins when trained knowledge collectors identify appropriate knowledge holders and obtain written informed consent. Field data, collected through voice recordings, photographs, videos, and text, are generated during co-creative engagements. These raw data are then revised and coded by designated responsible editors who possess cultural fluency and community affiliations. Their role is to ensure semantic accuracy and minimize misinterpretation during the data transformation process.
Following this editorial phase, compliance-processed materials are submitted to a multi-stakeholder review panel composed of local collaborators and domain experts. The panel evaluates the content for epistemic accuracy, cultural appropriateness, and alignment with community protocols. If materials are deemed unsuitable for public circulation or digital inclusion, they are either discarded or returned for revision and resubmission.
Once approved, materials are categorized as either individual or community knowledge. Individual knowledge requires explicit authorization from the respective knowledge holder, while community knowledge is deliberated through tribal council meetings. Only after the appropriate authorization agreements are completed does the material proceed to be integrated into the digital knowledge system. If consensus cannot be reached, the process is terminated or restarted. All records, regardless of their final status, remain subject to revocation or restriction by community representatives should they be later deemed sensitive or inappropriate.
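The workflow described above can be read as a small state machine. The sketch below is one hedged way to express it, not the system's actual implementation. The status names, the `Record` class, and its method names are illustrative assumptions introduced here, and the restart option is modeled as a return to the collection stage, which the text does not specify.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Status(Enum):
    COLLECTED = auto()   # raw field data gathered with written informed consent
    EDITED = auto()      # revised and coded by a responsible editor
    APPROVED = auto()    # passed the multi-stakeholder review panel
    RETURNED = auto()    # sent back for revision and resubmission
    DISCARDED = auto()   # deemed unsuitable by the panel
    INTEGRATED = auto()  # authorized and added to the knowledge system
    TERMINATED = auto()  # no consensus on authorization
    REVOKED = auto()     # later withdrawn by community representatives


class Scope(Enum):
    INDIVIDUAL = "individual"  # authorized by the knowledge holder
    COMMUNITY = "community"    # deliberated in tribal council meetings


@dataclass
class Record:
    title: str
    scope: Scope
    has_written_consent: bool
    status: Status = Status.COLLECTED
    history: list = field(default_factory=list)

    def _move(self, allowed_from, new_status):
        if self.status not in allowed_from:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}"
            )
        self.history.append(self.status)
        self.status = new_status

    def edit(self):
        if not self.has_written_consent:
            raise PermissionError("written informed consent is required")
        self._move({Status.COLLECTED, Status.RETURNED}, Status.EDITED)

    def review(self, suitable, revisable=True):
        if suitable:
            self._move({Status.EDITED}, Status.APPROVED)
        elif revisable:
            self._move({Status.EDITED}, Status.RETURNED)
        else:
            self._move({Status.EDITED}, Status.DISCARDED)

    def authorizing_body(self):
        if self.scope is Scope.INDIVIDUAL:
            return "knowledge holder"
        return "tribal council"

    def authorize(self, granted, restart=False):
        if granted:
            self._move({Status.APPROVED}, Status.INTEGRATED)
        elif restart:
            self._move({Status.APPROVED}, Status.COLLECTED)  # assumed restart point
        else:
            self._move({Status.APPROVED}, Status.TERMINATED)

    def revoke(self):
        # Revocation is allowed from any status, mirroring the text.
        self.history.append(self.status)
        self.status = Status.REVOKED
```

A record would typically pass through `edit()`, `review(...)`, and `authorize(...)` in that order. Keeping `revoke()` free of any precondition reflects the statement that all records remain subject to revocation regardless of their final status.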
This layered protocol ensures that field-based data are not only collected ethically but also governed by systems of accountability that honor both Indigenous self-determination and formal research ethics. The architecture affirms that no cultural data, be it ecological, ceremonial, or linguistic, is incorporated without explicit community approval. It also safeguards against extractive practices by embedding oversight, consent, and cultural coherence into every stage of the research lifecycle.

3.2. Knowledge Accumulation and Outreach, Sharing and Internalization in Indigenous Knowledge Systems

Based on extensive data accumulation, this study proposes a bidirectional model of Indigenous knowledge dynamics. The model, as depicted in Figure 3, illustrates two interrelated processes: knowledge accumulation and outreach, represented in the lower-right quadrant, and knowledge sharing and internalization, depicted in the upper-left quadrant. It captures the reciprocal movement between individual and collective knowledge systems within TEK. This framework is empirically informed by observations of how knowledge is embodied, enacted, stored, and transmitted across generations in Paiwan social contexts.
Building upon the tacit–explicit knowledge framework proposed by Nonaka (1994) and further developed by Nonaka and von Krogh (2009), this study extends the model within the context of Paiwan TEK to delineate four interrelated domains: Individual Tacit Knowledge (K1), Individual Explicit Knowledge (K2), Community Tribal Knowledge (K3), and Community Cultural Knowledge (K4). Together, these domains outline the transition from a personal, embodied understanding to a culturally institutionalized system of knowledge, illustrating how Indigenous epistemologies evolve through articulation, collaboration, and collective validation.
K1 refers to the unarticulated, experiential knowledge that individuals internalize through direct engagement with their surroundings. It includes embodied practices, sensory familiarity with local ecologies, and intuitive understanding that is difficult to formalize. Among the Paiwan people, for example, elders may possess tacit knowledge of soil moisture conditions suitable for millet planting, acquired through decades of hands-on observation and interaction with the land. Such knowledge is not formally codified but remains embedded in practice, movement, and seasonal rhythms.
K2 arises when tacit knowledge is externalized through language, diagrams, rituals, or other communicable forms. This process often involves reflection and the codification of practical experience into teachable content. Paiwan farmers who document seasonal planting cycles and correlate them with lunar phases, for example, are transforming embodied knowledge into a structured knowledge system. Such explicit representations enable the transmission of traditional ecological wisdom across individuals and generations, supporting both continuity and reinterpretation within the community.
K3 encompasses collectively held, contextually grounded knowledge embedded in specific tribal groups. It is relational and socially regulated, often shared through communal rituals, apprenticeships and performances. In Paiwan communities, this might take the form of ceremonial protocols for land use, clan-based rules for forest resource gathering, or the narrative logic embedded in origin myths that guide environmental stewardship. These shared epistemic forms are not only practical but also symbolic, anchoring tribal identity and guiding collective interaction with the natural world through culturally sanctioned knowledge frameworks.
K4 represents a broader, systematized stratum of knowledge that has been maintained across generations. It includes shared symbols, collective memory, normative values, and ontological worldviews. This form of knowledge is often codified in language, ritual calendars, and oral literature. For the Paiwan, such cultural knowledge is not confined to a single locality but represents a trans-tribal cultural fabric. It is expressed through intergenerational songs, chieftain narratives, and symbolic systems that link ecological phenomena with social responsibilities. These elements do not simply preserve tradition; they actively shape collective identity, legitimate authority structures, and guide ethical engagement with both human and non-human worlds.
The diagonal line extending from the lower left to the upper right of the matrix, along with the lower-right quadrant, represents a process of knowledge accumulation and outreach. The transition from K1 to K2 occurs when embodied and experiential understanding is externalized through language, symbols, or other forms of representation. This stage is conceptualized as Discourse and Authorship, denoting the transformation of expert, experience-based knowledge into communicable and shareable formats. For example, the Paiwan people’s intuitive understanding of mountain weather patterns, shaped by decades of hunting and farming, is articulated through storytelling, sketching and pedagogical exchanges. Through narration, gestures and metaphors, tacit knowledge is rendered explicit. This discursive process not only documents personal expertise but also prompts epistemic reflection, transforming localized knowledge into transmissible form while preserving its cultural embeddedness.
K1 may also transition directly into K3 when tacit knowledge is collectively enacted, negotiated, and validated through shared practices. These transformations occur when multiple individuals co-experience a task, engage in dialog, and align their interpretations in situ. During communal activities such as millet planting or harvest preparation, Paiwan participants often exchange embodied techniques, reflect on their actions, and adapt to others’ cues. As mutual understanding crystallizes, a shared interpretive framework emerges, forming the basis of community tribal knowledge. This stage is conceptualized as Team Consensus (Resonant Knowledge). Within Paiwan society, such alignment is notably visible during the Masalut (Paiwan harvest rituals), where coordinated movements and timing are achieved not through formal instruction but through mutual observation, rhythm, and trust. These embodied synergies cultivate coherence and solidarity, reinforcing a communal epistemology grounded in lived practice.
As individualized knowledge is spread and systematized through collaborative effort, it transforms into functional tribal knowledge. This transformation from K2 to K3 is conceptualized as Organizational Co-Production (Operational Knowledge). In this phase, individual contributions, such as documented ancestral planting practices or ecological observations, are incorporated into collective initiatives. For example, a community may use such knowledge to collaboratively map traditional territorial boundaries or design projects to restore native plant and animal species. Through processes of dialog, consensus-building, and collaborative decision-making, these insights are formalized into communal routines, guidelines, or institutional memory. The resulting co-produced knowledge becomes embedded within collective practice, reinforcing the dynamic interdependence between individual agency and tribal organization.
Certain tacit insights that are repeatedly tested, ritualized, and validated across generations crystallize into paradigmatic knowledge that shapes cultural norms. This transformation from K1 to K4 is described as Paradigm Construction (Best Practice Knowledge). Within the Paiwan worldview, recurrent experiential lessons, such as the belief that overharvesting disrupts ancestral harmony, have evolved into enduring paradigms that emphasize ecological balance and interdependence. These paradigms not only inform local ethical codes and governance structures but also guide cosmological education across generations. The progression from embodied intuition to culturally embedded principle illustrates how deeply personal, experiential knowledge can scale into foundational moral and ecological frameworks that sustain collective identity and guide future action.
Cultural knowledge may also develop through the Knowledge Integration (Systemic Knowledge) process, which represents the transformation from K2 to K4. Within this trajectory, explicit knowledge becomes culturally significant when it is systematically embedded within institutions such as education, ritual practice, and local governance. Among Paiwan communities, there is a growing trend of incorporating documented oral histories, ecological knowledge and ceremonial protocols into bilingual archives and school curricula. Through such integrative efforts, individual knowledge gains cultural legitimacy and ensures continuity across generations. This process allows digital, textual, and performative knowledge to coexist, bridging generational and linguistic gaps while reinforcing shared epistemological frameworks. Knowledge integration thus serves as a circulatory system that connects codified documentation with the dynamic processes of living cultural practice.
Finally, K3 evolves into K4 through iterative enactment, ritual reinforcement, and intergenerational transmission. As ritual behavior, ceremonial songs, and oral narratives are performed collectively, they gradually move beyond their original contexts and become emblematic of a collective identity. A compelling example is the transformation of the Maljeveq, which has evolved from localized tribal knowledge into a pan-tribal cultural symbol. Historically, Maljeveq was not a standardized tradition but rather a constellation of ritual practices embedded within specific communities, transmitted orally and legitimized through sustained performance. Its recent public revival has catalyzed inter-communal engagement, fostering the codification of a shared ceremonial framework across diverse Paiwan territories. Through this process, Maljeveq has transitioned from an internally focused, clan-specific ritual into a widely recognized emblem of Paiwan cultural identity and resilience.
Together, these six pathways delineate a multidirectional ecology of knowledge transformation. The proposed model underscores that Indigenous TEK is not a unidirectional transmission of static facts but a dynamic epistemological system. It is sustained through practices of co-authorship, dialogic consensus, and ritual regeneration. Within this cyclical framework, personal experience is not isolated but continuously contributes to the renewal and evolution of collective wisdom. This ongoing interplay between individual insight and communal knowledge ensures that TEK remains adaptive, resilient, and culturally embedded.
By contrast, the diagonal trajectory from the upper right to the lower left of the matrix, together with the upper-left quadrant, illustrates a cyclical process of knowledge sharing and internalization, wherein culturally embedded knowledge is refracted back into individual and community contexts. The transition from K4 to K3 occurs through active engagement in ritual performance and everyday communal life. Within the Paiwan context, such a process is exemplified by events such as the Maljeveq ceremony, during which historical narratives, mytho-temporal structures, and hierarchical kinship systems are embodied through coordinated ritual acts. Notably, even Paiwan communities that have experienced cultural discontinuity may gradually reintegrate these traditional elements through intertribal ritual collaboration, using shared ceremonial spaces as vehicles for cultural revitalization. Through repeated enactment, abstract values such as relational ethics, ecological reverence, and ancestral accountability become crystallized into normative frameworks and embodied expectations. These frameworks, in turn, provide the foundation for communal coordination, identity formation, and enhanced collective intelligence. This process constitutes the stage of “Collective Behavior (Cultural Knowledge)”.
K4 also facilitates personal consciousness (K2) through intentional communication such as storytelling, instruction, or ceremonial discourse. For example, during the Kaumaqan (ancestral house) initiation, elders articulate genealogies and spatial metaphors to younger generations. These culturally embedded narratives transform diffuse collective memory into recognizable and structured knowledge components. As individuals engage with these narratives, they begin to internalize the underlying cultural values, historical lessons, and spatial ontologies encoded within them. This transformative process, referred to as “Knowledge Sharing (Informational Knowledge)”, bridges systemic cultural knowledge and individual cognitive frameworks, reinforcing identity formation and cultural continuity through intentional verbal transmission.
K3 is transmitted to individuals as K2 through structured educational mechanisms that scaffold learning and internalization. In the Paiwan context, this frequently involves apprenticeship-based instruction in cultural practices such as the preparation of cinavu (millet dumplings), wherein procedural knowledge is conveyed through demonstration and participatory observation. By engaging with these tasks, learners begin to formulate explicit representations, such as sequential methods, material patterns, or symbolic associations, that reflect and extend the community’s embodied knowledge system. This process marks the stage of Knowledge Transmission (Role-Based Knowledge), wherein community-held knowledge is personalized and formalized through guided socialization and role-appropriate learning.
K4 is internalized as K1 through experiential immersion, where individuals acquire knowledge not through formal instruction but through participation in embodied cultural practices. In the Paiwan context, rituals such as the Masalut transmit relational ethics, kinship hierarchies, and sacred spatial understandings through affective and somatic engagement rather than verbal articulation. Since participants repeatedly engage in culturally sanctioned behaviors, these behaviors are internalized as intuitive dispositions, guiding behavior, emotional orientation, and decision-making in ways that feel natural rather than taught. This stage is conceptualized as Embodied Learning (Lifeworld-Based Knowledge), highlighting how collective symbolic orders are absorbed into the pre-reflective realm of personal experience.
K3 is internalized into K1 at the tacit level when individuals engage in situated observation and imitation within context-rich communal settings. Rather than relying on formal instruction, knowledge is absorbed through proximity and participation. When youth observe elders interpreting animal behavior to predict weather conditions, they begin to internalize ecological patterns through multisensory experiencing. This transmission occurs via osmosis, which is usually unspoken, immersive, and grounded in a shared lifeworld, embedding TEK within embodied cognition. This process is referred to as Experiential Transmission (Proficiency-Based Knowledge), emphasizing the gradual acquisition of know-how through lived interaction and social attunement.
Finally, K2 is transformed into K1 through practice-based internalization, guided instruction, or even through verbal transmission alone. When Paiwan youth learn about the calendrical logic of planting cycles, whether through oral interpretation or written documentation, this knowledge remains abstract until it is enacted in the field. Working alongside elders during millet cultivation allows the learner to refine intuition and develop embodied sensitivity to seasonal rhythms and environmental signals. Representational knowledge becomes perceptual and responsive in this process. The procedural is reconstituted as lived experience, enabling real-time decision-making grounded in relational awareness. This transformation is described as Learning and Communication (Knowledge Internalization), emphasizing how culturally situated knowledge becomes personalized and actionable through mentorship and embodied practice.
In this epistemological cycle, tribal chiefs, shamans, and elders serve not only as transmitters of K3 procedural knowledge but also as gatekeepers of K1 intuitive and lifeworld knowledge. Their participation determines the legitimacy of knowledge flow between layers, particularly where spiritual, ritual, or taboo elements are involved. The transformation of abstract representational knowledge into embodied understanding often occurs only through elder-guided practice, such as through ritual chanting, spatial anchoring of stories, or situational permissions based on ancestral lineage.
This reversed epistemological flow illustrates the recursive vitality of TEK within Indigenous systems. Rather than treating knowledge as static content to be archived, the Paiwan epistemic tradition emphasizes relational animation where knowledge is lived, embodied, articulated, and reabsorbed through iterative cycles of practice and reflection. Such dynamism affirms that TEK is not merely preserved through documentation but sustained through continuous enactment and affective engagement within community lifeworlds.
By embedding our design within this bidirectional knowledge ecology, we move beyond linear knowledge transfer models. Instead, we cultivate a culturally resonant feedback loop that honors Indigenous logic, relational authority, and co-evolving agency. This ensures that the TEK system we model not only safeguards intergenerational knowledge but also actively reinforces the contexts in which that knowledge remains meaningful, adaptive, and alive.

3.3. From Data to Knowledge: A Transitional Pipeline for TEK Structuring

Building upon the ethnographically grounded epistemological matrix from Figure 3 and the DIKW (data, information, knowledge, wisdom) framework proposed by Baskarada and Koronios (2013), we construct a stepwise transition from raw Indigenous knowledge representations, ranging from oral narratives to field-collected multimedia, into structured digital knowledge assets. As shown in Figure 4, the process traverses both traditional and modern data paradigms, forming a pipeline from big data processing to multi-model integration. In the figure, the solid-colored text boxes with white text on blue and red backgrounds represent stages of data processing. The first three blue boxes correspond to conventional data-processing approaches, and the first vertical dashed line indicates the progression from data to digital knowledge within traditional paradigms. The second vertical dashed line represents the transition from conventional digitization to contemporary AI-based approaches. Within the AI domain, the first two blue boxes denote technologies that have already reached a relatively mature stage of development, whereas the final red box represents the large knowledge model framework proposed in this study.
The left portion of the figure focuses on traditional data processing stages. These include sourcing raw data from varied environments, performing data cleaning and classification, and developing structured schemas for information organization. At this stage, the transformation is largely syntactic, oriented toward producing valid, queryable datasets. The outcome is structured information rather than high-level semantic knowledge.
The central phase introduces a key inflection point: semantic parsing and meta-tagging. This represents a phase of knowledge inflection in which cultural context, environmental metadata, and ontological relationships are encoded into a dataset. The transition from data to knowledge is a vital step in enabling TEK systems to be both culturally grounded and computationally tractable.
On the right side of the diagram, focus shifts to modern AI-based modeling. There, machine learning techniques enable automated pattern discovery and multidimensional reclassification. These models lay the groundwork for embedding context-sensitive knowledge into larger-scale reasoning systems. Subsequently, LLMs are introduced to simulate human-like interpretation, albeit with risks of hallucinations or semantic drift. The culmination of this progress is the emergence of Large Knowledge Models (LKMs), which aim to construct coherent, context-aware, and verifiable knowledge repositories capable of supporting next-generation applications.
This figure thus traces the functional continuum from raw data to cognitive-level modeling, serving as a precursor to the modular, AI-enabled system implementation which will be discussed in the next section. It helps to clarify how epistemological complexity emerges from layered data engineering, setting the stage for more advanced features such as scenario agents, multimodal integration, and cultural validation protocols.

3.4. System Architecture: Technical Design for AI-Enabled TEK Modeling

To instantiate the conceptual pipeline introduced in Section 3.3, we present an operational system architecture designed to bridge TEK processing methods with contemporary artificial intelligence capabilities. As shown in Figure 5, this layered architecture supports both cultural specificity and computational scalability, ensuring that TEK remains epistemologically intact while benefiting from technological advancements. Extended from Figure 4 with additional elaboration, the hollow text boxes positioned above and below the solid boxes illustrate representative data or technical examples associated with each stage.
The system comprises five interdependent functional layers, each responsible for a critical phase in the knowledge modeling lifecycle. These layers work together to ingest, process, structure and interact with TEK in a manner that respects Indigenous epistemologies while enabling advanced reasoning and interaction modalities.

3.4.1. Layer 1: Data Source Integration with Data Governance

The foundational layer of the architecture begins with the acquisition of diverse data sources originating from both external repositories and internally conducted fieldwork. External data may include government records, ecological databases, and prior ethnographic studies, while internal data sources typically involve multimedia documentation, interviews, sensor data, and community-contributed content obtained through participatory methods. The dual sourcing strategy ensures both breadth and depth of informational input, accommodating the variability and multimodality of TEK. By embedding field-collected, lived-context data at the outset, the architecture preserves the ontological grounding of Indigenous knowledge, thereby maintaining fidelity to community-authenticated perspectives.
To address concerns regarding epistemic reliability, the architecture implements a governance-oriented validation and adjudication procedure prior to the incorporation of externally sourced materials. External repositories may contain errors, omissions, or interpretive framings that are incongruent with local meanings and community experience. Accordingly, candidate external inputs are reviewed through established mechanisms of knowledge governance, including domain expert assessment and, where appropriate, deliberation in tribal or community meetings that are authorized to determine the admissibility of information. Field-collected, community-authenticated materials, such as audio recordings, video documentation, transcripts, and field notes, are treated as primary evidence. These materials are curated jointly with local collaborators and subjected to iterative feedback from elders, ritual practitioners, and youth participants before being encoded in the knowledge schema, thereby anchoring the system in community-validated lived realities. In contrast, external sources, including government records, ecological databases, and prior ethnographic studies, are treated as secondary or contextual evidence. They are first examined by the research team for internal consistency and then systematically cross-checked against field data and community narratives rather than being presumed authoritative.
This procedural emphasis is particularly consequential in the Paiwan context, where the Paiwan language has been classified as “vulnerable” within the United Nations Educational, Scientific and Cultural Organization’s endangerment framework, which constrains the feasibility of constructing a comprehensive local knowledge model solely through field-based elicitation and documentation. At the same time, an approach that relies extensively on external repositories would reintroduce the reliability limitations and sovereignty-related contestations identified in the preceding literature review. When inconsistencies arise between external documents and Paiwan-authenticated field records, the architecture assigns higher evidentiary weight to community-authenticated data and explicitly prioritizes these first-hand sources. In such cases, external claims are flagged, down-weighted, or excluded from the operational corpus, rather than treated as neutral or default references, thereby reducing the risk of reproducing external misrepresentation and reinforcing alignment with locally validated realities.
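The weighting rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the `adjudicate` function, the `council_rejected` parameter, and all numeric weights are hypothetical choices made for illustration, and the comparison of statements by simple equality stands in for what would in practice be a deliberative review.

```python
from dataclasses import dataclass
from enum import Enum


class SourceKind(Enum):
    COMMUNITY = "community"  # field-collected, community-authenticated
    EXTERNAL = "external"    # government records, databases, prior studies


class Decision(Enum):
    ACCEPT = "accept"
    FLAG = "flag"
    DOWN_WEIGHT = "down_weight"
    EXCLUDE = "exclude"


@dataclass(frozen=True)
class Claim:
    topic: str
    statement: str
    kind: SourceKind


def adjudicate(claim, community_claims, council_rejected=frozenset()):
    """Return (decision, weight) for one candidate claim.

    community_claims maps a topic to the community-authenticated statement.
    council_rejected holds topics a community meeting ruled inadmissible.
    All weights are illustrative placeholders, not values from the study.
    """
    if claim.kind is SourceKind.COMMUNITY:
        # Primary evidence: full evidentiary weight.
        return Decision.ACCEPT, 1.0
    if claim.topic in council_rejected:
        # Excluded from the operational corpus by community decision.
        return Decision.EXCLUDE, 0.0
    community_view = community_claims.get(claim.topic)
    if community_view is None:
        # Nothing to cross-check against: keep as contextual, flag for review.
        return Decision.FLAG, 0.5
    if community_view == claim.statement:
        # External source agrees with field data: accept as secondary evidence.
        return Decision.ACCEPT, 0.8
    # External source conflicts with field data: community record prevails.
    return Decision.DOWN_WEIGHT, 0.2
```

The point of the sketch is the ordering of the checks: community-authenticated material is never overridden, and an external claim only gains weight after it has been compared against a field record.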

3.4.2. Layer 2: Data Cleaning and Classification

The second layer involves the preprocessing of raw data into structured, semi-structured and unstructured formats. Structured data includes tabular records and metadata from existing databases; semi-structured data comprises annotated texts, spreadsheets, and logs; while unstructured data spans images, videos, and oral narratives. Each category undergoes cleansing and classification operations to ensure consistency, resolve ambiguities, and eliminate redundancy. This process not only improves the technical quality of the data but also introduces a culturally conscious taxonomy that differentiates between ecological signs, cultural protocols, and contextual dependencies. The outcome is a refined corpus that is both machine-readable and semantically meaningful, serving as a bridge between Indigenous epistemologies and computational representations.

3.4.3. Layer 3: Systematic Processing and Knowledge Schema Development

At this stage, processed data are transformed into information architectures through the construction of structured relational databases, document and wide-column databases, and advanced indexing mechanisms such as vector, graph or hybrid search indices. These forms enable high-performance querying, semantic inference, and cross-referencing across knowledge modalities. The development of a knowledge schema at this layer introduces formal ontologies, meta-tagging standards, and logic models that encode TEK concepts such as seasonality, ritual timing, plant–animal symbiosis, and relational ethics. A dedicated knowledge center or data warehouse serves as the repository for these encoded relationships, allowing for versioned updates, lineage tracing, and participatory validation. This layer effectively scaffolds the transition from data to meaning, enabling culturally coherent knowledge structures to be computationally instantiated.
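As a rough illustration of versioned updates, lineage tracing, and participatory validation in such a repository, the following sketch stores each encoded relationship as an immutable assertion with provenance fields. The `Assertion` and `KnowledgeStore` names, the field layout, and the in-memory dictionary are assumptions made for this example; the paper does not specify a storage design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    obj: str
    source_id: str             # provenance: which field record this came from
    validated_by: tuple = ()   # who reviewed it (participatory validation)
    version: int = 1
    supersedes: Optional[int] = None  # version number this one replaces


class KnowledgeStore:
    """In-memory store that never overwrites: every update adds a version."""

    def __init__(self):
        # (subject, predicate) -> list of Assertion, oldest first
        self._history = {}

    def assert_fact(self, subject, predicate, obj, source_id, validated_by=()):
        versions = self._history.setdefault((subject, predicate), [])
        previous = versions[-1].version if versions else None
        new = Assertion(
            subject, predicate, obj, source_id, tuple(validated_by),
            version=(previous or 0) + 1, supersedes=previous,
        )
        versions.append(new)
        return new

    def current(self, subject, predicate):
        """Latest version, or None if nothing has been asserted."""
        versions = self._history.get((subject, predicate))
        return versions[-1] if versions else None

    def lineage(self, subject, predicate):
        """Full version history, oldest first, for audit and contestation."""
        return list(self._history.get((subject, predicate), []))
```

Keeping superseded versions rather than deleting them is what lets a community audit how an encoded relationship changed and who validated each revision.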

3.4.4. Layer 4: Machine Learning and LLM Integration

The fourth layer transitions from static knowledge structures to dynamic inference engines by integrating machine learning algorithms and LLMs. Classical models such as recurrent neural networks (RNNs, a type of artificial neural network designed to process sequential data by using loops that allow information to persist across time steps), convolutional neural networks (CNNs, a type of neural network that learns spatial patterns by applying sliding filters across images or other grid-structured data), and transformer-based architectures like BERT are employed to recognize temporal patterns, spatial relations, and contextual semantics. These models are complemented by frontier LLMs such as those developed by OpenAI, the LLaMA family, and Gemini (Google’s multimodal generative AI model family), offering scalable natural language understanding and generation. By fine-tuning these models with TEK-specific corpora, the architecture is capable of generating human-like outputs that mimic Indigenous reasoning styles. Nonetheless, attention is given to the epistemic risks of hallucination, misrepresentation, and delusion, which are mitigated through feedback loops and rule-based constraints established in the next layer.

3.4.5. Layer 5: Knowledge Modeling Agents and Practical TEK Deployment

The final layer introduces a suite of semantic and scenario agents designed to enact practical applications of TEK in real-world or simulated environments. They include RAG agents for context-specific querying, semantic agents for ontology-consistent reasoning, and scenario agents for role-based interaction and decision support. All agents are anchored by an LKM, which consolidates prior outputs, aligns them with Indigenous logic and enables emergent properties such as dialogic learning, situated adaptation, and multi-agent coordination. This layer marks the reintegration of knowledge into community-facing tools, decision-support systems, and educational interfaces. Rather than treating TEK as a static corpus, the architecture re-embeds it within relational, experiential, and iterative loops, thereby reinforcing its performative and regenerative nature.
Across all layers, the architecture is guided by principles of ethical AI, including data traceability, explainability, and Indigenous data sovereignty. Each transformation from data to knowledge is documented with provenance metadata, allowing communities to audit and contest outputs. Knowledge is never extracted or abstracted without communal validation, and the system explicitly supports iterative co-design with knowledge holders.
In summary, this layered framework establishes a technically robust yet culturally respectful platform for the modeling of TEK. It exemplifies how AI can serve not to replace Indigenous epistemologies, but to extend their resilience, interpretability, and translatability in the digital age.

3.5. Agentic AI in Next-Generation LKM Systems

Agentic AI is the most important component of the LKM: a dynamic reasoning engine that emulates user-centered cognition and decision-making in knowledge interaction. As shown in Figure 6, this agent-based structure responds to user queries not as a passive retrieval system, but through an iterative cycle of planning, acting, observing and adjusting, thereby simulating processes of cultural interpretation, contextual negotiation, and epistemic reflexivity.
The process begins with a user query, which is parsed by the agent’s “Plan/Think” module. The module formulates a retrieval strategy based on the user’s intent, cultural context, and system memory. The agent then proceeds to “Act”, querying internal knowledge graphs and invoking external tools (e.g., language models, rule-based engines, or TEK-specific ontologies) as necessary. The “Observe” module evaluates intermediate results, cross-validating them against cultural constraints and permission rules. When the output is inadequate, the agent iteratively refines its retrieval plan. Once assessment criteria are satisfied, the agent delivers a response in a form aligned with cultural tone, ritual appropriateness, and relational knowledge hierarchy.
The agent serves not only as a dialogic interface for end-users but also as part of a governance sub-system that includes tribal councils or designated knowledge authorities. All AI outputs, particularly those concerning ritual permissions, kinship roles, or spiritually sensitive interpretations, are subject to validation by these cultural gatekeepers. This preserves epistemic legitimacy and embeds the system within a relational, community-approved framework of truth authorization.
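The cycle just described can be sketched as a small control loop. This is a schematic under stated assumptions rather than the system's actual code: `run_agent`, `AgentResult`, the four callables, and the `max_iterations` cap are hypothetical names, and returning a "deferred" status when the cap is reached is an assumption about how an unresolved query might be handed to human reviewers.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class AgentResult:
    answer: Optional[str]
    status: str      # "answered" or "deferred"
    iterations: int


def run_agent(
    query: str,
    plan: Callable[[str], Any],
    act: Callable[[Any], str],
    observe: Callable[[str], Tuple[bool, str]],
    adjust: Callable[[Any, str], Any],
    max_iterations: int = 3,
) -> AgentResult:
    """Plan once, then loop Act -> Observe -> Adjust until a draft passes."""
    strategy = plan(query)                     # Plan/Think: build a retrieval strategy
    for i in range(1, max_iterations + 1):
        draft = act(strategy)                  # Act: query graphs, call tools
        ok, feedback = observe(draft)          # Observe: check constraints and permissions
        if ok:
            return AgentResult(draft, "answered", i)
        strategy = adjust(strategy, feedback)  # Adjust: refine the plan
    # No draft passed the checks: hand off instead of answering (assumption).
    return AgentResult(None, "deferred", max_iterations)


# Toy callables, for illustration only.
def toy_plan(q):
    return {"query": q, "scope": "narrow"}


def toy_act(s):
    return f"{s['scope']} answer to {s['query']}"


def toy_observe(draft):
    return draft.startswith("broad"), "widen scope"


def toy_adjust(s, feedback):
    return {**s, "scope": "broad"}
```

With the toy callables, the first draft fails the check, the strategy is widened, and the second draft is accepted. Bounding the loop keeps the agent from retrying indefinitely when no draft can satisfy the cultural constraints.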
This design contrasts with conventional LLM-based systems by embedding relational intelligence, epistemic humility, and response traceability into the interaction process. From the perspective of Indigenous users, especially elders, youth learners, and ritual practitioners, this approach better mirrors lived experience and knowledge flow, and reduces the risks of hallucinations or decontextualization often seen in generic AI outputs. Also, the design respects the principle that certain forms of knowledge are not universally accessible. Elders determine which knowledge can be surfaced in the system and under what conditions, ensuring that sacred or esoteric content is protected according to cultural protocols.
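One way to picture elder-determined access conditions is a release check that runs before any content is surfaced. The three access tiers, the `initiated` flag, and the returned action strings below are hypothetical simplifications introduced for this sketch; actual conditions would be defined by the community's knowledge authorities and are likely to be richer than a three-level scheme.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    OPEN = 1        # may be surfaced to any user
    INITIATED = 2   # only for users recorded as initiated
    RESTRICTED = 3  # never surfaced automatically; routed to elders


@dataclass(frozen=True)
class Item:
    item_id: str
    access: Access  # assigned by knowledge authorities, not by the system


@dataclass(frozen=True)
class User:
    user_id: str
    initiated: bool = False


def release_decision(item: Item, user: User) -> str:
    """Return 'answer', 'decline', or 'defer_to_elders' for one request."""
    if item.access is Access.OPEN:
        return "answer"
    if item.access is Access.INITIATED:
        # The condition set by elders is met only for initiated users.
        return "answer" if user.initiated else "decline"
    # RESTRICTED: the system does not decide; human authorities do.
    return "defer_to_elders"
```

The design choice worth noting is that the access level is data attached to each item by people, so changing what may be surfaced requires no change to the code.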

4. Case Study: Indigenous Communities and TEK Application

To evaluate the practical feasibility and cultural significance of the proposed system, we present a case-based analysis grounded in field interactions with Paiwan communities in southeastern Taiwan. Rather than treating the system as a fully deployed technical solution, we examine how the epistemological and architectural components described in Section 3 can support real-world TEK processes in situ. The cases emphasize the interplay between human-centered knowledge flow and AI-enabled reasoning, highlighting how Indigenous practices can be enriched, rather than displaced, by digital augmentation.

4.1. Case 1: Knowledge Accumulation and Outreach Through Ritual Observation and Co-Documentation

In long-term field collaborations with Paiwan communities, we engaged in seasonal agricultural and ritual cycles as co-participants rather than external observers. This relational positioning situated the research team within the social and ecological rhythms through which TEK was enacted. During millet planting, ritual preparation, and inter-clan gatherings, we documented embodied practices and social negotiations through audiovisual recordings, environmental notes, spatial mapping, and kinship-based role descriptions. These field materials represented multiple forms of knowledge, including tacit skills, procedural knowledge embedded in ritual sequences, and symbolic interpretation conveyed through chant, gesture, and ancestral narratives.
Within the workflow outlined in Section 3.3, these heterogeneous materials first entered a participatory curation process. Community collaborators assisted in sorting and validating content while identifying segments that required restricted access due to ritual sensitivity or kinship authority. The approved content then underwent transcript generation, segmentation, and culturally attuned annotation. Semantic markers were added to indicate ritual phases, lineage responsibilities, taboos, and ecological cues, which ensured that representational forms remained grounded in Indigenous ontological classifications rather than external schemas.
These curated and enriched data were subsequently incorporated into the layered system architecture depicted in Section 3.4. Structured components, such as lineage rules or planting sequences, were stored as knowledge graph entities, whereas less formalized narrative content entered a dialogic processing environment supported by retrieval-augmented reasoning. This dual placement allowed TEK to remain actionable through both procedural and interpretive dimensions. The resulting structure preserved relational authority, temporality, and spatial logic, enabling the system to model who could act, when, where, and under which ancestral mandate. Rather than reducing cultural practice to static information, this case demonstrated how digital systems could sustain the dynamic interplay of rules, memory, environment, and social accountability that defined Paiwan knowledge systems.
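To make the idea of modeling "who could act, when, where, and under which ancestral mandate" more concrete, the sketch below encodes one such rule as a structured entity and checks a request against it. The `RitualRule` fields, the `may_act` function, and every value used with them are placeholders invented for illustration; they do not describe any actual Paiwan rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RitualRule:
    action: str
    eligible_lineages: frozenset  # who may act
    season: str                   # when
    place: str                    # where
    mandate: str                  # reference to the authorizing record


def may_act(rule, lineage, season, place):
    """Return (allowed, unmet), where unmet lists the failed conditions."""
    unmet = []
    if lineage not in rule.eligible_lineages:
        unmet.append("lineage")
    if season != rule.season:
        unmet.append("season")
    if place != rule.place:
        unmet.append("place")
    return (not unmet, unmet)


# Placeholder rule; the identifiers are arbitrary labels, not real data.
example_rule = RitualRule(
    action="action_x",
    eligible_lineages=frozenset({"lineage_a", "lineage_b"}),
    season="season_1",
    place="site_1",
    mandate="record-001",
)
```

Returning the list of unmet conditions, rather than a bare yes or no, lets a dialogic interface explain which requirement is missing and cite the `mandate` record behind the rule.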

4.2. Case 2: Knowledge Transmission Using LKM for Education

To assess the pedagogical potential and cultural responsiveness of the TEK modeling system, the research team simulates and tests the prototype LKM interface. Participants submit context-rich, open-ended queries, such as “Can I lead the millet ritual if my mother’s clan used to do it?” or “Is Red-eyed Palji a good person or a bad person?” These queries activate the system’s culturally aligned reasoning cycle, which retrieves ontologically tagged knowledge from the Layer 3 knowledge center, processes permissions and roles via semantic agents, and generates responses that reflect both procedural accuracy and cultural nuance.
The system does not simply return literal answers; instead, it embeds each response within a narrative context, lineage protocol, or taboo logic. These responses often draw upon oral histories or ritual precedents archived in Layer 1 and semantically enriched in Layer 2. For example, when responding to the question about ritual leadership, the system references matrilineal inheritance patterns, ancestral locations, and clan-specific responsibilities. When interpreting questions about mythical figures like Red-eyed Palji, the system accesses canonical story arcs and acknowledges variations across multi-tribal traditions to produce responses grounded in collective memory.
This approach increases the motivation of youth to engage with their heritage, particularly through its conversational tone and the system’s recognition of kinship-based authority. The AI interface encourages youth to formulate additional questions, fostering dialogic learning that reinforces epistemic trust and narrative exploration. All outputs are subject to validation by local elders or cultural experts, who ensure the accuracy and appropriateness of the AI-generated content. In special circumstances, for instance, if a user asks a question that would traditionally be considered inappropriate without prior ceremonial initiation, the system is designed to either decline to answer or defer to elder review protocols. This feedback loop affirms the legitimacy of AI-facilitated interaction while preserving community control over cultural representation. The case demonstrates how agentic AI serves not only as a knowledge delivery mechanism but also as an intergenerational bridge that sustains traditional ecological knowledge through culturally meaningful learning experiences.

4.3. Implications for System Design and Community Governance

These case studies highlight the strengths of an epistemically aligned digital system, one that honors the recursive logic between tacit and explicit knowledge, individual and collective perspectives, and cultural sovereignty and AI reasoning. The mutual empowerment model described in Section 3.2, when implemented through modular architecture and agentic interfaces, offers a scalable approach for enabling Indigenous communities to retain authorship over their knowledge while engaging with modern AI systems.
At the same time, the cases reinforce the importance of co-governance structures for digital TEK. Knowledge must not only be protected through access control and ethical filtering, but also continuously validated by community-led protocols. In this way, the system becomes not just a technological artifact but a living part of the cultural knowledge ecosystem.

5. Discussion

This study proposes a shift from traditional knowledge to an artificial intelligence-based agent framework, not merely for digitizing Indigenous traditional knowledge but also for fostering a culturally coherent, context-sensitive, and ethically governed system of knowledge mediation. Drawing from longstanding field engagement and iterative co-design processes, we identify that agentic systems, when designed to reflect the ontological structures of TEK, can serve not as replacements but as collaborative agents that participate in cultural transmission.
To examine the epistemic and operational alignment between traditional knowledge systems and AI-enabled infrastructures, Table 1 presents a comparative analysis. It reveals both functional homologies and critical points of divergence. Whereas ritual knowledge is governed by kinship roles, relational timing, and embodied participation, agentic systems rely on semantic rules, role-based access control, and adaptive interaction cycles (e.g., Plan-Act-Observe-Adjust). These computational processes emulate, but do not replicate, the relational nuances of lived ritual practice. The implication is that meaningful cultural AI must reconstruct not only the informational content but also the social protocols of “who speaks, when, and to whom”.
The incorporation of semantic agents demonstrates the potential to mediate between Indigenous epistemologies and digital infrastructures. Rather than treating TEK as static information, the system defines TEK as a dynamic interplay of roles, permissions, temporalities, and relational responsibilities. The Plan-Act-Observe loop mirrors the dialogic and corrective dimensions of ritual engagement. But the success of such mediation depends on developing cultural consistency modules, ensuring that responses are not only linguistically accurate, but socially legitimate and ritually appropriate.
Beyond the computational domain, the long-term viability of AI-assisted TEK systems hinges on community trust and governance. This study argues that cultural AI must adhere to three core design logics:
  • Traceability of knowledge sources, to ensure epistemic transparency;
  • Permission-aware architecture, to respect the integrity of kinship- and ritual-based knowledge rights;
  • Negotiability of agentic outputs, to allow Indigenous communities to critique, reject, or modify responses generated by AI systems.
Such design principles align with broader movements in Indigenous data sovereignty and digital heritage ethics, positioning AI not as a neutral tool, but as a cultural co-agent whose legitimacy must be continually negotiated through participatory design and communal oversight.
While AI systems can assist in structuring and retrieving culturally grounded knowledge, they cannot fully represent the spiritual force, intergenerational resonance, or ontological gravity embedded in Indigenous cosmologies. The presence of elders ensures that the AI remains a support tool, not a substitute, and that the boundary between communicable knowledge and sacred silence is upheld.

6. Conclusions and Future Work

This study presents a conceptual and technical framework for the digital transmission of Indigenous TEK, situating it within the evolving interplay between community-based epistemologies and agentic artificial intelligence. Drawing from extended ethnographic engagement and community co-creation, we identify the recursive interdependence between personal and communal knowledge systems as a foundational logic of TEK. Through this understanding, we develop a methodology that transforms field-based oral narratives and multimodal data into structured and semantically enriched knowledge models. When embedded in agentic AI architectures, these models enable dynamic, dialogic and culturally responsive knowledge systems that preserve not only the content but also the procedural and relational dimensions of traditional knowledge practices.
Our work contributes to the growing field of AI for cultural sustainability by demonstrating how data-intensive technologies can be realigned with the epistemic values of Indigenous communities. Integrating semantic agents, RAG, and community-validated knowledge graphs enables scalable yet context-aware interaction, offering new pathways for intergenerational knowledge transfer and ecological stewardship.
Future research will focus on deepening the alignment between AI-driven inference mechanisms and Indigenous ethical frameworks. This includes the co-development of culturally grounded evaluation metrics, refinement of feedback and negotiation protocols within agentic systems, and the long-term governance of digital knowledge infrastructures under the principles of Indigenous data sovereignty. In parallel, technological advances will emphasize hybrid knowledge models capable of integrating symbolic, statistical, and relational learning across multilingual and multimodal data. Ultimately, we envision a future in which Indigenous knowledge is not merely preserved in digital forms, but actively revitalized through community-led AI systems that are both technologically robust and culturally legitimate.

Author Contributions

Conceptualization, all authors; methodology, R.-C.W.; software, R.-C.W.; validation, R.-C.W. and M.-C.H.; formal analysis, R.-C.W.; resources, M.-C.H. and L.-C.L.; data curation, R.-C.W.; writing—original draft, R.-C.W.; writing—review & editing, R.-C.W.; visualization, R.-C.W.; supervision, M.-C.H. and L.-C.L.; project administration, L.-C.L.; funding acquisition, L.-C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project “Flow (mavalidi) and Morphogenesis (sivaikan): Co-creating the Conditions for Sustainable Development of the South-Link Area in Taitung County” of the National Taitung University, sponsored by the National Science and Technology Council, Taiwan, R.O.C. under Grant no. NSTC 113-2420-H-143-003-HS3.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the National Cheng Kung University Governance Framework for Human Research Ethics (protocol code 108-376-2 on 5 December 2019; 111-468-2 on 25 October 2022; and 114-0779-2 on 2 November 2025).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Acknowledgments

The authors would like to thank all colleagues and students at the Center for Innovation in Humanities and Social Practices at National Taitung University who contributed to collecting ethnographic research data for this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Baskarada, Sasa, and Andy Koronios. 2013. Data, information, knowledge, wisdom (DIKW): A semiotic theoretical and empirical exploration of the hierarchy and its quality dimension. Australasian Journal of Information Systems 18: 1.
  2. Carroll, Stephanie, Ibrahim Garba, Oscar Figueroa-Rodríguez, Jarita Holbrook, Raymond Lovett, Simeon Materechera, Mark Parsons, Kay Raseroka, Desi Rodriguez-Lonebear, Robyn Rowe, et al. 2020. The CARE Principles for Indigenous Data Governance. Data Science Journal 19: 43.
  3. Carroll, Stephanie Russo, Desi Rodriguez-Lonebear, and Andrew Martinez. 2019. Indigenous data governance: Strategies from United States native nations. Data Science Journal 18: 31.
  4. Christen, Kimberly A. 2012. Does information really want to be free? Indigenous knowledge systems and the question of openness. International Journal of Communication 6: 24.
  5. Cormack, Chris, Erena Mikaere, and Te Taka Keegan. 2025. Māori AI Governance: A Framework for Ethical Governance of AI in Aotearoa. Te Kāhui Raraunga. Available online: https://www.kahuiraraunga.io/_files/ugd/b8e45c_79f4162207da4123861c50dec08cd0fb.pdf (accessed on 11 December 2025).
  6. Domazetoski, Viktor. 2024. Enhancing Ecological Knowledge Discovery Using Large Language Models. Master’s thesis, Georg-August Universität Göttingen, Göttingen, Germany.
  7. Dutta, Uttaran, and Swayang Das. 2016. The digital divide at the margins: Co-designing information solutions to address the needs of indigenous populations of rural India. Communication Design Quarterly Review 4: 36–48.
  8. Fang, Wei-Ta, Hsin-Wen Hu, and Chien-Shing Lee. 2016. Atayal’s identification of sustainability: Traditional ecological knowledge and indigenous science of a hunting culture. Sustainability Science 11: 33–43.
  9. Greaves, Lara M., Cinnamon Lindsay Latimer, Emerald Muriwai, Charlotte Moore, Eileen Li, Andrew Sporle, Terryann C. Clark, and Barry J. Milne. 2024. Māori and the Integrated Data Infrastructure: An assessment of the data system and suggestions to realise Māori data aspirations. Journal of the Royal Society of New Zealand 54: 190–206.
  10. Han, Barbara A., Kush R. Varshney, Shannon LaDeau, Ajit Subramaniam, Kathleen C. Weathers, and Jacob Zwart. 2023. A synergistic future for AI and ecology. Proceedings of the National Academy of Sciences 120: e2220283120.
  11. Hannah, Kate, Sanjana Hattotuwa, and Kayli Taylor. 2022. The murmuration of information disorders: Aotearoa New Zealand’s mis- and disinformation ecologies and the parliament protest. Pacific Journalism Review 28: 138–61.
  12. Hellmann, Olli. 2025. Colonial bias in AI training data: Prompting Sora to generate images of Aotearoa New Zealand’s historical past. Kōtuitui: New Zealand Journal of Social Sciences. Available online: https://www.researchgate.net/profile/Olli-Hellmann/publication/396746408_Colonial_bias_in_AI_training_data_Prompting_Sora_to_generate_images_of_Aotearoa_New_Zealand’s_historical_past/links/68f840427d9a4d4e870b6625/Colonial-bias-in-AI-training-data-Prompting-Sora-to-generate-images-of-Aotearoa-New-Zealands-historical-past.pdf (accessed on 11 December 2025).
  13. Hiraldo, Danielle, Stephanie Russo Carroll, Dominique M. David-Chavez, Mary Beth Jäger, and Miriam Jorgensen. 2020. Native Nation Rebuilding for Tribal Research and Data Governance. NNI Policy Brief Series; Tucson: Native Nations Institute, University of Arizona.
  14. Hudson, Petera, and Hēmi Whaanga. 2023. Computing Technologies for Resilience, Sustainability, and Resistance. IEEE Annals of the History of Computing 45: 27–38.
  15. James, Jesin, Vithya Yogarajan, Isabella Shields, Catherine I. Watson, Peter Keegan, Keoni Mahelona, and Peter-Lucas Jones. 2022. Language models for code-switch detection of te reo Māori and English in a low-resource setting. In Findings of the Association for Computational Linguistics: NAACL 2022. Stroudsburg: Association for Computational Linguistics, pp. 650–60.
  16. Joubert, Annekie, and Katarzyna Biernacka. 2015. Cultural heritage and new technologies: The role of technology in preserving, restoring and disseminating cultural knowledge. Southern African Journal for Folklore Studies 25: s21–s33.
  17. Ka’ai-Mahuta, Rachael. 2012. The use of digital technology in the preservation of Māori song. Te Kaharoa 5: 1.
  18. Keegan, Te Taka. 2025. A Voice for the Future: The First AI-Generated Waikato-Maniapoto Voice Project. University of Waikato AI Institute News/Technical Initiative. Available online: https://www.waikato.ac.nz/news-events/news/a-voice-for-the-future-the-first-ai-generated-waikato-maniapoto-voice/ (accessed on 11 December 2025).
  19. Konczi, Anita E., and Lea Bill. 2024. Advancing First Nations Principles of OCAP®. In Indigenous and Tribal Peoples and Cancer. Cham: Springer Nature, pp. 37–39.
  20. McCann, Heidi S., Peter L. Pulsifer, and Carolina Behe. 2016. Sharing and preserving Indigenous knowledge of the Arctic using information and communications technology. Indigenous Notions of Ownership and Libraries, Archives and Museums 166: 126.
  21. McGregor, Deborah. 2004. Coming full circle: Indigenous knowledge, environment, and our future. American Indian Quarterly 28: 385–410.
  22. Mohaghegh, Mahsa, Michael McCauley, and Mehdi Mohammadi. 2014. Maori-English Machine Translation. Paper presented at NZCSRSC New Zealand Computer Science Research Student Conference, Canterbury University, Christchurch, New Zealand, April 24. Available online: https://www.researchbank.ac.nz/items/587e76ed-bb0b-4a7d-bbaf-626c32e844e3 (accessed on 11 December 2025).
  23. New Zealand Government. 2025. Responsible AI Guidance for the Public Service–GenAI. Available online: https://www.digital.govt.nz/assets/Standards-guidance/Technology-and-architecture/Generative-AI/Responsible-AI-Guidance-for-the-Public-Service-GenAI-Print.pdf (accessed on 11 December 2025).
  24. Nonaka, Ikujiro. 1994. A dynamic theory of organizational knowledge creation. Organization Science 5: 14–37.
  25. Nonaka, Ikujiro, and Georg Von Krogh. 2009. Perspective—Tacit knowledge and knowledge conversion: Controversy and advancement in organizational knowledge creation theory. Organization Science 20: 635–52.
  26. Park, Joon Sung, Joseph O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. 2023. Generative agents: Interactive simulacra of human behavior. Paper presented at 36th Annual ACM Symposium on User Interface Software and Technology, San Francisco, CA, USA, October 29–November 1; pp. 1–22.
  27. Robinson, Jake M., Nick Gellie, Danielle MacCarthy, Jacob G. Mills, Kim O’Donnell, and Nicole Redvers. 2021. Traditional ecological knowledge in restoration ecology: A call to listen deeply, to engage with, and respect Indigenous voices. Restoration Ecology 29: e13381.
  28. Schmitt, Ulrich. 2024. Benefitting systemic citizens and sustainable knowledge heritage: Building a digital platform ecosystem and community for knowledge co-creation. Paper presented at DRS2024, Boston, MA, USA, June 23–28.
  29. Taiuru, Karaitiana T. 2024. Māori Voices in the Artificial Intelligence (AI) Landscape of Aotearoa New Zealand; An Interim Research Report. Available online: https://www.taiuru.co.nz/maori-voices-in-the-artificial-intelligence-ai-landscape-new-zealand/ (accessed on 11 December 2025).
  30. Tumbali, George Canilao. 2025. Exploring Traditional Ecological Knowledge (TEK) in Kalinga Province: Practices, Preservation, and Perspectives. Religion and Social Communication 23: 9–35.
  31. Usher, Peter J. 2000. Traditional ecological knowledge in environmental assessment and management. Arctic 53: 183–93.
  32. Wu, Chin-Shien, Meng-Shan Wu, and Yi-Honng Chen. 2014. Quantitative Analysis of Traditional Ecological Knowledge: A Case Study of Paiwan People in Jialan Village, Taiwan. Taiwan Journal of Forest Science 29: 205–19.
  33. Zhao, Jiajing, Cheng Huang, and Xian Li. 2024. A comparative study of cultural hallucination in large language models on culturally specific ethical questions. Research Square.
Figure 1. Conceptual Framework.
Figure 2. Data Collection and Preliminary Processing Procedures.
Figure 3. Bidirectional Model of Knowledge Cycle in TEK. The green boxes represent four distinct knowledge bodies, while the red arrows indicate the flow of knowledge during processes of accumulation and sharing.
Figure 4. The Evolutionary Pipeline from Data to Knowledge Modeling in Traditional and Modern Contexts.
Figure 5. Layered Architecture and Functional Components.
Figure 6. Agentic AI Architecture for User-Centered Knowledge Interaction in Next-Generation LKM Systems.
Table 1. Comparative Structures of Traditional Ritual Knowledge vs. Agentic AI-Enabled TEK Processing.
| Dimension | Traditional Ritual Knowledge Processing | Agentic AI-Enabled TEK Processing |
| --- | --- | --- |
| Knowledge Source | Oral narratives, ancestral instruction, embodied experience | Field-derived datasets, multimedia inputs, semantically annotated knowledge graphs |
| Initiating Conditions | Seasonal indicators, ancestral cues, elder-led rituals | User queries, simulated ritual contexts, external triggers |
| Epistemic Authority | Elders, ritual leaders, kinship-based transmission rights | Rule-based access layers, ontological inference models, role permission matrices |
| Interpretive Mode | Symbolic actions, ritual performances, relational heuristics | Semantic parsing, language generation, context-aware knowledge retrieval |
| Transmission Pathway | Apprenticeship, communal participation, embodied teaching | Dialog-based interaction, multimodal output (text, images, audio) |
| Execution and Adjustment | On-site elder intervention and improvisation | Iterative agentic planning (Plan–Act–Observe–Adjust) |
| Validation Standards | Ritual coherence, ancestral acceptance, communal feedback | Cultural filter modules, elder verification, semantic consistency checks |
| Error Recovery | Immediate correction by ritual experts | Multi-turn dialog correction, semantic re-planning, agentic recalibration |
| Governance and Sovereignty | Kin-based authority, oral licensing, ritual accountability | CARE principles, community-curated access governance, culturally negotiated permissions |
| Strengths | Deep relationality, cultural embodiment, context specificity | Scalability, modular inference, context adaptation |
| Risks | Intergenerational erosion, difficulty of preservation and scaling | Hallucination risks, cultural misalignment, epistemic decontextualization |

Share and Cite

MDPI and ACS Style

Wang, R.-C.; Hsieh, M.-C.; Lai, L.-C. From Tacit Knowledge Distillation to AI-Enabled Culture Revitalization: Modeling Knowledge Cycles in Indigenous Cultural Systems. Soc. Sci. 2026, 15, 7. https://doi.org/10.3390/socsci15010007
