Search Results (551)

Search Parameters:
Keywords = knowledge-base question answering

31 pages, 5529 KB  
Article
Geospatial Knowledge-Base Question Answering Using Multi-Agent Systems
by Jonghyeon Yang and Jiyoung Kim
ISPRS Int. J. Geo-Inf. 2026, 15(1), 35; https://doi.org/10.3390/ijgi15010035 - 8 Jan 2026
Abstract
Large language models (LLMs) have advanced geospatial artificial intelligence; however, geospatial knowledge-base question answering (GeoKBQA) remains underdeveloped. Prior systems have relied on handcrafted rules and have omitted the splitting of datasets into training, validation, and test sets, thereby hindering fair evaluation. To address these gaps, we propose a prompt-based multi-agent LLM framework (based on GPT-4o) that translates natural-language questions into executable GeoSPARQL. The architecture comprises an intent analyzer, multi-grained retrievers that ground concepts and properties in the OSM tagging schema and map geospatial relations to the GeoSPARQL/OGC operator inventory, an operator-aware intermediate representation aligned with SPARQL/GeoSPARQL 1.1, and a query generator. Our approach was evaluated on the GeoKBQA test set using 20 few-shot exemplars per agent. It achieved 85.49 EM (GPT-4o) with less supervision than fine-tuned baselines trained on 3574 instances and substantially outperformed a single-agent GPT-4o prompt. Additionally, we evaluated GPT-4o-mini, which achieved 66.74 EM in a multi-agent configuration versus 47.10 EM with a single agent. The observations showed that the multi-agent gain was higher for the larger model. Our results indicate that, beyond scale, the framework’s structure is important; thus, principled agentic decomposition yields a sample-efficient, execution-faithful path beyond template-centric GeoKBQA under a fair, hold-out evaluation protocol. Full article
(This article belongs to the Special Issue LLM4GIS: Large Language Models for GIS)
13 pages, 2220 KB  
Article
Evaluating Chat GPT-4o’s Comparative Performance over GPT-4 in Japanese Medical Licensing Examination and Its Clinical Partnership Potential
by Masatoshi Miyamura, Goro Fujiki, Yumiko Kanzaki, Kosuke Tsuda, Hironaka Asano, Hideaki Morita and Masaaki Hoshiga
Int. Med. Educ. 2026, 5(1), 9; https://doi.org/10.3390/ime5010009 - 7 Jan 2026
Viewed by 68
Abstract
Background: Recent advances in artificial intelligence (AI) have produced ChatGPT-4o, a multimodal large language model (LLM) capable of processing both text and image inputs. Although ChatGPT has demonstrated usefulness in medical examinations, few studies have evaluated its image analysis performance. Methods: This study compared GPT-4o and GPT-4 using public questions from the 116th–118th Japanese National Medical Licensing Examinations (JNMLE), each consisting of 400 questions. Both models answered in Japanese using simple prompts, including screenshots for image-based questions. Accuracy was analyzed across essential, general, and clinical questions, with statistical comparisons by chi-square tests. Results: GPT-4o consistently outperformed GPT-4, achieving passing scores in all three examinations. In the 118th JNMLE, GPT-4o scored 457 points versus 425 for GPT-4. GPT-4o demonstrated higher accuracy for image-based questions in the 117th and 116th exams, though the difference in the 118th was not significant. For text-based questions, GPT-4o showed superior medical knowledge, clinical reasoning, and ethical response behavior, notably avoiding prohibited options. Conclusion: Overall, GPT-4o exceeded GPT-4 in both text and image domains, suggesting strong potential as a diagnostic aid and educational resource. Its balanced performance across modalities highlights its promise for integration into future medical education and clinical decision support. Full article
21 pages, 1207 KB  
Article
Insights on the Pedagogical Abilities of AI-Powered Tutors in Math Dialogues
by Verónica Parra, Ana Corica and Daniela Godoy
Information 2026, 17(1), 51; https://doi.org/10.3390/info17010051 - 6 Jan 2026
Viewed by 188
Abstract
AI-powered tutors that interact with students in question-answering scenarios using large language models (LLMs) as foundational models for generating responses represent a potential scalable solution to the growing demand for one-to-one tutoring. In fields like mathematics, where students often face difficulties, sometimes leading to frustration, easy-to-use natural language interactions emerge as an alternative for enhancing engagement and providing personalized advice. Despite their promising potential, the challenges for LLM-based tutors in the math domain are twofold. First, the absence of genuine reasoning and generalization abilities in LLMs frequently results in mathematical errors, ranging from inaccurate calculations to flawed reasoning steps and even the appearance of contradictions. Second, the pedagogical capabilities of AI-powered tutors must be examined beyond simple question-answering scenarios since their effectiveness in math tutoring largely depends on their ability to guide students in building mathematical knowledge. In this paper, we present a study exploring the pedagogical aspects of LLM-based tutors through the analysis of their responses in math dialogues using feature extraction techniques applied to textual data. The use of natural language processing (NLP) techniques enables the quantification and characterization of several aspects of pedagogical strategies deployed in the answers, which the literature identifies as essential for engaging students and providing valuable guidance in mathematical problem-solving. The findings of this study have direct practical implications in the design of more effective math AI-powered tutors as they highlight the most salient characteristics of valuable responses and can thus inform the training of LLMs. Full article
(This article belongs to the Special Issue AI Technology-Enhanced Learning and Teaching)
21 pages, 1410 KB  
Article
Do Large Language Models Know When They Lack Knowledge?
by Shuai Qin, Lianke Zhou, Liu Sun and Nianbin Wang
Electronics 2026, 15(2), 253; https://doi.org/10.3390/electronics15020253 - 6 Jan 2026
Viewed by 141
Abstract
Although Large Language Models (LLMs) excel in language tasks, producing fluent and seemingly high-quality text, their outputs are essentially probabilistic predictions rather than verified facts, rendering reliability unguaranteed. This issue is particularly pronounced when models lack the required knowledge, which significantly increases the risk of fabrications and misleading content. Therefore, understanding whether LLMs know when they lack knowledge is of critical importance. This work systematically evaluates leading LLMs on their ability to recognize knowledge insufficiency and examines several training-free techniques to foster this metacognitive capability, referred to as “integrity” throughout this research. For rigorous evaluation, this study firstly develops a new Question-Answering (Q&A) dataset called Honesty. Specifically, events emerging after the model’s deployment are utilized to generate “unknown questions,” ensuring they fall outside LLMs’ knowledge boundaries, while “known questions” are drawn from existing Q&A datasets, together constituting the Honesty dataset. Subsequently, based on this dataset, systematic experiments are conducted using multiple representative LLMs (e.g., GPT-4o and DeepSeek-V3). The results reveal that semantic understanding and reasoning capabilities are the core factors influencing “integrity.” Furthermore, we find that well-crafted prompts markedly improve models’ integrity, and integrating them with probability- or consistency-based uncertainty evaluation methods yields even stronger performance. These findings highlight the considerable potential of LLMs to express uncertainty when they lack knowledge, and we hope these observations can lay the groundwork for developing more reliable models. Full article
(This article belongs to the Special Issue Trustworthy LLM: AIGC Detection, Alignment and Evaluation)
22 pages, 820 KB  
Article
CBR2: A Case-Based Reasoning Framework with Dual Retrieval Guidance for Few-Shot KBQA
by Xinyu Hu, Tong Li, Lingtao Xue, Zhipeng Du, Kai Huang, Gang Xiao and He Tang
Big Data Cogn. Comput. 2026, 10(1), 17; https://doi.org/10.3390/bdcc10010017 - 4 Jan 2026
Viewed by 202
Abstract
Recent advances in large language models (LLMs) have driven substantial progress in knowledge base question answering (KBQA), particularly under few-shot settings. However, symbolic program generation remains challenging due to its strict structural constraints and high sensitivity to generation errors. Existing few-shot methods often rely on multi-turn strategies, such as rule-based step-by-step reasoning or iterative self-correction, which introduce additional latency and exacerbate error propagation. We present CBR2, a case-based reasoning framework with dual retrieval guidance for single-pass symbolic program generation. Instead of generating programs interactively, CBR2 constructs a unified structure-aware prompt that integrates two complementary types of retrieval: (1) structured knowledge from ontologies and factual triples, and (2) reasoning exemplars retrieved via semantic and function-level similarity. A lightweight similarity model is trained to retrieve structurally aligned programs, enabling effective transfer of abstract reasoning patterns. Experiments on KQA Pro and MetaQA demonstrate that CBR2 achieves significant improvements in both accuracy and syntactic robustness. Specifically on KQA Pro, it boosts Hits@1 from 72.70% to 82.13% and reduces syntax errors by 25%, surpassing the previous few-shot state-of-the-art. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
17 pages, 1877 KB  
Article
BioChat: A Domain-Specific Biodiversity Question-Answering System to Support Sustainable Conservation Decision-Making
by Dong-Seok Jang, Jae-Sik Yi, Hyung-Bae Jeon and Youn-Sik Hong
Sustainability 2026, 18(1), 396; https://doi.org/10.3390/su18010396 - 31 Dec 2025
Viewed by 335
Abstract
Biodiversity knowledge is fundamental to conservation planning and sustainable environmental decision-making; however, general-purpose Large Language Models (LLMs) frequently produce hallucinations when responding to biodiversity-related queries. To address this challenge, we propose BioChat, a domain-specific question-answering system that integrates a Retrieval-Augmented Generation (RAG) framework with a Re-Ranker–based retrieval and routing mechanism. The system is built upon a verified biodiversity dataset curated by the National Institute of Biological Resources (NIBR), comprising 25,593 species and approximately 970,000 structured data points. We systematically evaluate the effects of embedding selection, routing strategy, and generative model choice on factual accuracy and hallucination mitigation. Experimental results show that the proposed Re-Ranker-based routing strategy significantly improves system reliability, increasing factual accuracy from 47.9% to 71.3% and reducing the hallucination rate from 34.0% to 24.4% compared with the Naive RAG baseline. Among the evaluated LLMs, Qwen2-7B-Instruct achieves the highest factual accuracy, while Gemma-2-9B-Instruct demonstrates superior hallucination control. By delivering transparent, verifiable, and context-grounded biodiversity information, BioChat supports environmental education, citizen science, and evidence-based conservation policy development. This work demonstrates how trustworthy AI systems can serve as sustainability-enabling infrastructure, facilitating reliable access to biodiversity knowledge for long-term ecological conservation and informed public decision-making. Full article
15 pages, 1308 KB  
Article
Evolution of Convolutional and Recurrent Artificial Neural Networks in the Context of BIM: Deep Insight and New Tool, Bimetria
by Andrzej Szymon Borkowski, Łukasz Kochański and Konrad Rukat
Infrastructures 2026, 11(1), 6; https://doi.org/10.3390/infrastructures11010006 - 22 Dec 2025
Viewed by 187
Abstract
This paper discusses the evolution of convolutional (CNN) and recurrent (RNN) artificial neural networks in applications for Building Information Modeling (BIM). The paper outlines the milestones reached in the last two decades. The article organizes the current state of knowledge and technology in terms of three aspects: (1) computer visualization coupled with BIM models (detection, segmentation, and quality verification in images, videos, and point clouds), (2) sequence and time series modeling (prediction of costs, energy, work progress, risk), and (3) integration of deep learning results with the semantics and topology of Industry Foundation Class (IFC) models. The paper identifies the most used architectures, typical data pipelines (synthetic data from BIM models, transfer learning, mapping results to IFC elements) and practical limitations: lack of standardized benchmarks, high annotation costs, a domain gap between synthetic and real data, and discontinuous interoperability. We indicate directions for development: combining CNN/RNN with graph models and transformers for wider use of synthetic data and semi-/supervised learning, as well as explainability methods that increase trust in AECOO (Architecture, Engineering, Construction, Owners & Operators) processes. A practical case study presents a new application, Bimetria, which uses a hybrid CNN/OCR (Optical Character Recognition) solution to generate 3D models with estimates based on two-dimensional drawings. A deep review shows that although the importance of attention-based and graph-based architectures is growing, CNNs and RNNs remain an important part of the BIM process, especially in engineering tasks, where, in our experience and in the Bimetria case study, mature convolutional architectures offer a good balance between accuracy, stability and low latency. The paper also raises some fundamental questions to which we are still seeking answers. 
Thus, the article not only presents the innovative new Bimetria tool but also aims to stimulate discussion about the dynamic development of AI (Artificial Intelligence) in BIM. Full article
(This article belongs to the Special Issue Modern Digital Technologies for the Built Environment of the Future)
25 pages, 2085 KB  
Article
SPR-RAG: Semantic Parsing Retriever-Enhanced Question Answering for Power Policy
by Yufang Wang, Tongtong Xu and Yihui Zhu
Algorithms 2025, 18(12), 802; https://doi.org/10.3390/a18120802 - 17 Dec 2025
Viewed by 278
Abstract
To address the limitations of Retrieval-Augmented Generation (RAG) systems in handling long policy documents, mitigating information dilution, and reducing hallucinations in engineering-oriented applications, this paper proposes SPR-RAG, a retrieval-augmented framework designed for knowledge-intensive vertical domains such as electric power policy analysis. With practicality and interpretability as core design goals, SPR-RAG introduces a Semantic Parsing Retriever (SPR), which integrates community detection–based entity disambiguation and transforms natural language queries into logical forms for structured querying over a domain knowledge graph, thereby retrieving verifiable triple-based evidence. To further resolve retrieval bias arising from diverse policy writing styles and inconsistencies between user queries and policy text expressions, a question-repository–based indirect retrieval mechanism is developed. By generating and matching latent questions, this module enables more robust retrieval of non-structured contextual evidence. The system then fuses structured and unstructured evidence into a unified dual-source context, providing the generator with an interpretable and reliable grounding signal. Experiments conducted on real electric power policy corpora demonstrate that SPR-RAG achieves 90.01% faithfulness—representing a 5.26% improvement over traditional RAG—and 76.77% context relevance, with a 5.96% gain. These results show that SPR-RAG effectively mitigates hallucinations caused by ambiguous entity names, textual redundancy, and irrelevant retrieved content, thereby improving the verifiability and factual grounding of generated answers. Overall, SPR-RAG demonstrates strong deployability and cross-domain transfer potential through its “Text → Knowledge Graph → RAG” engineering paradigm. 
The framework provides a practical and generalizable technical blueprint for building high-trust, industry-grade question–answering systems, offering substantial engineering value and real-world applicability. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
29 pages, 2363 KB  
Article
Fine-Tuning a Local LLM for Thermoelectric Generators with QLoRA: From Generalist to Specialist
by José Miguel Monzón-Verona, Santiago García-Alonso and Francisco Jorge Santana-Martín
Appl. Sci. 2025, 15(24), 13242; https://doi.org/10.3390/app152413242 - 17 Dec 2025
Viewed by 413
Abstract
This work establishes a large language model (LLM) specialized in the domain of thermoelectric generators (TEGs) for deployment on local hardware. Starting with the generalist JanV1-4B and Qwen3-4B-Thinking-2507 models, an efficient fine-tuning (FT) methodology using quantized low-rank adaptation (QLoRA) was employed, modifying only 3.18% of the total parameters of the base models. The key to the process is the use of a custom-designed dataset, which merges deep theoretical knowledge with rigorous instruction tuning to refine behavior and mitigate catastrophic forgetting. The dataset employed for FT contains 202 curated questions and answers (QAs), strategically balanced between domain-specific knowledge (48.5%) and instruction tuning for response behavior (51.5%). Performance of the models was evaluated using two complementary benchmarks: a 16-question multilevel cognitive benchmark (94% accuracy) and a specialized 42-question TEG benchmark (81% accuracy), scoring responses as excellent, correct with difficulties, or incorrect, based on technical accuracy and reasoning quality. The model's utility is demonstrated through experimental TEG design guidance, providing expert-level reasoning on thermal management strategies. This study validates the specialization of LLMs using QLoRA as an effective and accessible strategy for developing highly competent engineering support tools, eliminating dependence on large-scale computing infrastructure; specialization was achieved on a consumer-grade NVIDIA RTX 2070 SUPER GPU (8 GB VRAM) in 263 s. Full article
(This article belongs to the Special Issue Large Language Models and Knowledge Computing)
27 pages, 1212 KB  
Systematic Review
Enhancing Cybersecurity Readiness in Non-Profit Organizations Through Collaborative Research and Innovation—A Systematic Literature Review
by Maryam Roshanaei, Premkumar Krishnamurthy, Anivesh Sinha, Vikrant Gokhale, Faizan Muhammad Raza and Dušan Ramljak
Computers 2025, 14(12), 539; https://doi.org/10.3390/computers14120539 - 9 Dec 2025
Viewed by 462
Abstract
Non-profit organizations (NPOs) are crucial for building equitable and thriving communities. The majority of NPOs are small, community-based organizations that serve local needs. Despite their significance, NPOs often lack the resources to manage cybersecurity effectively, and information about them is usually found in nonacademic or practitioner sources rather than in the academic literature. The recent surge in cyberattacks on NPOs underscores the urgent need for investment in cybersecurity readiness. The absence of robust safeguards and cybersecurity preparedness not only exposes NPOs to risks and vulnerabilities but also erodes trust and diminishes the value donors and volunteers place on them. Through this systematic literature review (SLR) mapping framework, existing work on cyber threat assessment and mitigation is leveraged to develop a framework and data collection plan that address the significant cybersecurity vulnerabilities faced by NPOs. The research aims to offer actionable guidance that NPOs can implement within their resource constraints to enhance their cybersecurity posture. The review adheres to PRISMA 2020 guidelines to examine the state of cybersecurity readiness in NPOs. The initial 4650 records were examined on 6 March 2025. We excluded studies that did not answer our research questions or did not discuss cybersecurity readiness in NPOs. The quality of the selected studies was assessed on the basis of methodology, clarity, completeness, and transparency, resulting in 23 included studies. A further 37 studies were added by examining papers that cited, or were cited by, the relevant studies. Results were synthesized through quantitative topic analysis and qualitative analysis to identify key themes and patterns. 
This study makes the following contributions: (i) identify and synthesize the top cybersecurity risks for NPOs, their service impacts, and mitigation methods; (ii) summarize affordable cybersecurity practices, with an emphasis on employee training and sector-specific knowledge gaps; (iii) analyze organizational and contextual factors (e.g., geography, budget, IT skills, cyber insurance, vendor dependencies) that shape cybersecurity readiness; and (iv) review and integrate existing assessment and resilience frameworks applicable to NPOs. Full article
(This article belongs to the Section ICT Infrastructures for Cybersecurity)
32 pages, 611 KB  
Article
Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Biomedical Question Answering
by Larissa Pusch and Tim O. F. Conrad
BioMedInformatics 2025, 5(4), 70; https://doi.org/10.3390/biomedinformatics5040070 - 9 Dec 2025
Cited by 1 | Viewed by 952
Abstract
Advancements in natural language processing (NLP), particularly Large Language Models (LLMs), have greatly improved how we access knowledge. However, in critical domains like biomedicine, challenges like hallucinations—where language models generate information not grounded in data—can lead to dangerous misinformation. This paper presents a hybrid approach that combines LLMs with Knowledge Graphs (KGs) to improve the accuracy and reliability of question-answering systems in the biomedical field. Our method, implemented using the LangChain framework, includes a query-checking algorithm that checks and, where possible, corrects LLM-generated Cypher queries, which are then executed on the Knowledge Graph, grounding answers in the KG and reducing hallucinations in the evaluated cases. We evaluated several LLMs, including several GPT models and Llama 3.3:70b, on a custom benchmark dataset of 50 biomedical questions. GPT-4 Turbo achieved 90% query accuracy, outperforming most other models. We also evaluated prompt engineering, but found little statistically significant improvement compared to the standard prompt, except for Llama 3:70b, which improved with few-shot prompting. To enhance usability, we developed a web-based interface that allows users to input natural language queries, view generated and corrected Cypher queries, and inspect results for accuracy. This framework improves reliability and accessibility by accepting natural language questions and returning verifiable answers directly from the knowledge graph, enabling inspection and reproducibility. The source code for generating the results of this paper and for the user-interface can be found in our Git repository: https://git.zib.de/lpusch/cyphergenkg-gui, accessed on 1 November 2025. Full article
25 pages, 1910 KB  
Review
Natural Language Processing in Generating Industrial Documentation Within Industry 4.0/5.0
by Izabela Rojek, Olga Małolepsza, Mirosław Kozielski and Dariusz Mikołajewski
Appl. Sci. 2025, 15(23), 12662; https://doi.org/10.3390/app152312662 - 29 Nov 2025
Viewed by 876
Abstract
Deep learning (DL) methods have revolutionized natural language processing (NLP), enabling industrial documentation systems to process and generate text with high accuracy and fluency. Modern deep learning models, such as transformers and recurrent neural networks (RNNs), learn contextual relationships in text, making them ideal for analyzing and creating complex industrial documentation. Transformer-based architectures, such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer), are ideally suited for tasks such as text summarization, content generation, and question answering, which are crucial for documentation systems. Pre-trained language models, tuned to specific industrial datasets, support domain-specific vocabulary, ensuring the generated documentation complies with industry standards. Deep learning-based systems can use sequential models, such as those used in machine translation, to generate documentation in multiple languages, promoting accessibility, and global collaboration. Using attention mechanisms, these models identify and highlight critical sections of input data, resulting in the generation of accurate and concise documentation. Integration with optical character recognition (OCR) tools enables DL-based NLP systems to digitize and interpret legacy documents, streamlining the transition to automated workflows. Reinforcement learning and human feedback loops can enhance a system’s ability to generate consistent and contextually relevant text over time. These approaches are particularly effective in creating dynamic documentation that is automatically updated based on data from sensors, registers, or other sources in real time. The scalability of DL techniques enables industrial organizations to efficiently produce massive amounts of documentation, reducing manual effort and improving overall efficiency. 
NLP has become a fundamental technology for automating the generation, maintenance, and personalization of industrial documentation within the Industry 4.0, 5.0, and emerging Industry 6.0 paradigms. Recent advances in large language models, search-assisted generation, and multimodal architectures have significantly improved the accuracy and contextualization of technical manuals, maintenance reports, and compliance documents. However, persistent challenges such as domain-specific terminology, data scarcity, and the risk of hallucinations highlight the limitations of current approaches in safety-critical manufacturing environments. This review synthesizes state-of-the-art methods, comparing rule-based, neural, and hybrid systems while assessing their effectiveness in addressing industrial requirements for reliability, traceability, and real-time adaptation. Human–AI collaboration and the integration of knowledge graphs are transforming documentation workflows as factories evolve toward cognitive and autonomous systems. The review included 32 articles published between 2018 and 2025. These bibliometric findings suggest that the high share of conference papers (69.6%) may indicate a field still in its conceptual phase, which contextualizes the reviewed articles' emphasis on proposed architectures rather than their industrial validation. Most research was conducted in computer science, suggesting early stages of technological maturity. The leading countries were China and India, though neither had large publication counts, and no leading researchers or affiliations were observed, suggesting significant research dispersion. However, the most frequently observed SDGs indicate a clear health context, focusing on "industry innovation and infrastructure" and "good health and well-being". Full article
(This article belongs to the Special Issue Emerging and Exponential Technologies in Industry 4.0)

20 pages, 47355 KB  
Article
KA-RAG: Integrating Knowledge Graphs and Agentic Retrieval-Augmented Generation for an Intelligent Educational Question-Answering Model
by Fangqun Gao, Shu Xu, Weiyan Hao and Tao Lu
Appl. Sci. 2025, 15(23), 12547; https://doi.org/10.3390/app152312547 - 26 Nov 2025
Abstract
Generative artificial intelligence (AI) and large language models (LLMs) are reshaping the landscape of intelligent educational systems; however, existing solutions often suffer from unstructured resource organization, limited interpretability, and suboptimal retrieval precision. To address these challenges, this study introduces KA-RAG, a course-oriented question answering (QA) framework that integrates a structured Knowledge Graph (KG) with an Agentic Retrieval-Augmented Generation (Agentic-RAG) workflow. The system incorporates a responsive interface, a unified agent controller (ToolPlanner), a course knowledge graph, and a vector-based retrieval subsystem. By combining symbolic graph reasoning with dense semantic retrieval, the proposed dual-retrieval strategy supports interpretable, context-aware responses to course-related queries. Experiments conducted on a graduate-level Pattern Recognition course demonstrate that KA-RAG achieves a retrieval accuracy of 91.4%, semantic consistency of 87.6%, and an average response latency of 2.8 s. User surveys further reveal significant improvements in learning efficiency and satisfaction. The results validate the feasibility of integrating KG and Agentic-RAG techniques for knowledge-grounded educational applications, offering a practical pathway toward intelligent knowledge organization and interactive learning support. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
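The dual-retrieval strategy described in the KA-RAG abstract, combining symbolic lookup over a knowledge graph with dense semantic retrieval, can be sketched as follows. All data (the toy triples, passages, and hand-made embedding vectors) and function names here are illustrative assumptions, not the paper's actual implementation:

```python
import math

# Toy course knowledge graph: (subject, relation, object) triples.
KG = [
    ("SVM", "is_a", "classifier"),
    ("SVM", "uses", "kernel trick"),
    ("k-means", "is_a", "clustering algorithm"),
]

# Toy dense index: passage -> hand-made embedding vector.
PASSAGES = {
    "SVMs maximize the margin between classes.": [1.0, 0.2, 0.0],
    "k-means partitions data into k clusters.": [0.0, 0.1, 1.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def dual_retrieve(entity, query_vec, top_k=1):
    """Symbolic branch: exact triple lookup over the graph.
    Dense branch: passages ranked by cosine similarity to the query.
    The two result sets would then be merged into the LLM prompt."""
    facts = [t for t in KG if t[0] == entity]
    ranked = sorted(PASSAGES,
                    key=lambda p: cosine(PASSAGES[p], query_vec),
                    reverse=True)
    return facts, ranked[:top_k]

facts, passages = dual_retrieve("SVM", [0.9, 0.3, 0.1])
```

The graph branch yields interpretable, citable facts while the dense branch recovers passages the graph does not cover, which is the complementarity the abstract attributes to the approach.
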

29 pages, 756 KB  
Article
Progressive Knowledge Distillation and Numerical Reasoning Enhancement for Financial Report Question Answering
by Ruonan Fang, Chao Yang, Wei Li, Xin Lin, Pingping Li, Yiman Wu and Xinyan Liu
Electronics 2025, 14(23), 4653; https://doi.org/10.3390/electronics14234653 - 26 Nov 2025
Abstract
Financial report question answering (FRQA) presents unique challenges due to the need for precise numerical reasoning, complex table structures, and multi-table associations. Existing approaches often overlook the domain-specific complexities of financial reports and struggle with accurate numerical computation, leading to suboptimal performance in real-world financial intelligence applications. In this study, we propose FinQA-PKD, a framework designed to mitigate these challenges through a novel integration of progressive knowledge distillation and numerical reasoning enhancement. Our method introduces a difficulty-aware curriculum learning strategy that organizes training into two progressive stages, facilitating more effective and stable model learning. To address the limitations of large language models in numerical reasoning, we develop a numerical reasoning enhancement module that automatically decomposes calculation chains, augments numerical tokens, and validates results using a financial formula library. Furthermore, we implement a domain-adaptive selective knowledge distillation strategy, which evaluates teacher model outputs based on numerical accuracy, calculation correctness, and terminology precision, and selectively distills knowledge from high-quality samples. Experimental results on benchmark datasets demonstrate that FinQA-PKD improves numerical and calculation accuracy, achieving competitive performance with reduced computational resources. This framework provides a robust and efficient solution for answering financial report questions in practical financial analysis scenarios. Full article
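The validation step the FinQA-PKD abstract describes, recomputing a model's calculation chain against a financial formula library, can be sketched minimally. The formula names, arguments, and tolerance below are illustrative assumptions, not the paper's actual library:

```python
# Hypothetical financial formula library: name -> callable.
FORMULAS = {
    "gross_margin": lambda revenue, cogs: (revenue - cogs) / revenue,
    "yoy_growth":   lambda current, prior: (current - prior) / prior,
}

def validate_answer(formula, args, model_answer, tol=1e-4):
    """Recompute the decomposed calculation chain with the formula
    library and accept the model's answer only if it matches within
    a numerical tolerance. Returns (accepted, expected_value)."""
    expected = FORMULAS[formula](*args)
    return abs(expected - model_answer) <= tol, expected

# A model claims gross margin 0.4 for revenue 500 and COGS 300.
ok, expected = validate_answer("gross_margin", (500.0, 300.0), 0.4)
```

In a distillation pipeline of this kind, such a check could act as the filter that admits only numerically correct teacher outputs into the training set.
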

20 pages, 2254 KB  
Article
A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval
by Xiaonan Shi, Junhe Chen, Yumo Wang and Limei Fu
Appl. Sci. 2025, 15(23), 12381; https://doi.org/10.3390/app152312381 - 21 Nov 2025
Abstract
By retrieving and summarizing enterprise information, we can build knowledge maps, enrich the existing knowledge base, and support current experiments and subsequent algorithm improvements. Input text extracted from enterprise archives is processed via relation extraction and semantic analysis to improve the efficiency of archive retrieval and reduce communication costs. Building on an analysis of previous research, we construct a deep learning-based enterprise archive semantic retrieval algorithm, the BERT + BiGRU + CRF + HHO_improved model, to extract relevant enterprise information. In the model, the Bidirectional Encoder Representations from Transformers (BERT) model is used to preprocess the Chinese word embeddings, and the question-and-answer data are generated from an actual enterprise file database. Next, a Bidirectional Gated Recurrent Unit (BiGRU) is used with an attention mechanism to capture the contextual features of the sequence. A Conditional Random Field (CRF) classifier is subsequently used to classify the text related to the enterprise archives, and the obtained data are labeled in sequence. Moreover, a swarm intelligence algorithm is introduced to dynamically optimize the model parameters and data processing strategies, further improving the generalization ability and adaptability of the model. The Harris Hawks Optimizer Improved (HHO_improved) algorithm is used to optimize the parameters of the CRF module to increase the performance and efficiency of named entity recognition. On an independently constructed dataset, the advantages of our algorithm are verified via comparative experiments with a variety of semantic matching algorithms and via ablation experiments on the CRF and HHO_improved modules, both of which play essential roles in improving model performance.
The obtained knowledge extraction results are expected to supplement and enhance the existing knowledge base, simplify the workflow, assist the enterprise’s dynamic production task management, and improve the efficiency of enterprise operations. The proposed algorithm achieves an accuracy improvement of 36.33%, 43.88%, 15.24%, and 12.41% over the BERT, BiGRU, BERT + BiGRU, and BERT + BiGRU + CRF models, respectively. Full article
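The swarm-based parameter tuning described above can be illustrated with a greatly simplified population-based optimizer. This is a toy stand-in inspired by the general hawk-and-prey idea, not the HHO_improved algorithm from the paper; the function names, bounds, and objective are illustrative assumptions:

```python
import random

def toy_swarm_optimize(objective, dim, n_hawks=10, iters=50, seed=0):
    """Greatly simplified population-based search: each 'hawk' moves
    toward the best solution found so far (the 'prey') with random
    perturbations, and a move is kept only if it improves the score.
    A toy stand-in for Harris Hawks Optimization, not HHO_improved."""
    rng = random.Random(seed)
    hawks = [[rng.uniform(-5, 5) for _ in range(dim)]
             for _ in range(n_hawks)]
    best = min(hawks, key=objective)
    for _ in range(iters):
        for i, h in enumerate(hawks):
            # Jump toward the best position; step size shrinks as the
            # hawk approaches it, so the swarm contracts around optima.
            new = [b + rng.uniform(-1, 1) * (b - x)
                   for x, b in zip(h, best)]
            if objective(new) < objective(h):
                hawks[i] = new
        best = min(hawks + [best], key=objective)   # monotonic improvement
    return best

# Minimize a sphere function (optimum at the origin) as a placeholder
# for a real objective such as validation F1 of the CRF module.
best = toy_swarm_optimize(lambda v: sum(x * x for x in v), dim=2)
```

In the paper's setting the objective would instead score a CRF configuration on held-out named entity recognition performance, with each candidate position encoding the parameters being tuned.
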
