Advanced Retrieval-Augmented Generation Systems Based on Large Language Models

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 February 2026

Special Issue Editors


Dr. Marina Bagic Babac
Guest Editor
Department of Applied Computing, University of Zagreb Faculty of Electrical Engineering and Computing, Zagreb, Croatia
Interests: artificial intelligence; natural language processing; machine learning

Dr. Andrea Giovanni Nuzzolese
Guest Editor
Institute for Cognitive Sciences and Technologies, National Research Council, 00185 Rome, Italy
Interests: ontology engineering; linked data; semantic web; knowledge extraction; natural language understanding

Special Issue Information

Dear Colleagues,

We invite researchers, academics, and practitioners to contribute to this Special Issue focused on advanced retrieval-augmented generation (RAG) systems powered by large language models (LLMs). This issue aims to explore cutting-edge methodologies and solutions that leverage LLMs to enhance information retrieval, knowledge generation, and decision-making processes across various domains. Key topics include the following:

  • Advanced techniques for integrating LLMs into RAG systems to improve accuracy, efficiency, and scalability.
  • Cross-domain applications of RAG systems powered by LLMs, including healthcare, legal tech, education, and finance.
  • Novel approaches to fine-tuning LLMs for specialized retrieval tasks and domain-specific knowledge bases.
  • Comparative studies and evaluations of LLM-based RAG systems versus traditional or hybrid approaches.
  • Ethical, legal, and societal implications of deploying LLMs in RAG systems, including considerations of bias, fairness, and transparency.

We welcome submissions presenting original research, case studies, and theoretical advancements that address both technical and practical aspects of LLM-driven RAG systems.

Dr. Marina Bagic Babac
Dr. Andrea Giovanni Nuzzolese
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, authors can access the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • retrieval-augmented generation (RAG)
  • large language models (LLMs)
  • information retrieval
  • knowledge generation
  • decision making
  • cross-domain applications

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (3 papers)

Research

34 pages, 3333 KB  
Article
A Systematic Evaluation of Large Language Models and Retrieval-Augmented Generation for the Task of Kazakh Question Answering
by Aigerim Mansurova, Arailym Tleubayeva, Aliya Nugumanova, Adai Shomanov and Sadi Evren Seker
Information 2025, 16(11), 943; https://doi.org/10.3390/info16110943 - 30 Oct 2025
Abstract
This paper presents a systematic evaluation of large language models (LLMs) and retrieval-augmented generation (RAG) approaches for question answering (QA) in the low-resource Kazakh language. We assess the performance of existing proprietary (GPT-4o, Gemini 2.5-flash) and open-source Kazakh-oriented models (KazLLM-8B, Sherkala-8B, Irbis-7B) across closed-book and RAG settings. Within a three-stage evaluation framework, we benchmark retriever quality, examine LLM abilities such as knowledge-gap detection, external truth integration, and context grounding, and measure gains from realistic end-to-end RAG pipelines. Our results show a clear pattern: proprietary models lead in closed-book QA, but RAG narrows the gap substantially. Under the Ideal RAG setting, KazLLM-8B improves from its closed-book baseline of 0.427 to an answer correctness of 0.867, closely matching GPT-4o's score of 0.869. In the end-to-end RAG setup, KazLLM-8B paired with the Snowflake retriever achieves answer correctness of up to 0.754, surpassing GPT-4o's best score of 0.632. Despite these improvements, RAG outcomes reveal an inconsistency: high retrieval metrics do not guarantee high QA accuracy. The findings highlight the importance of retrievers and context-grounding strategies in enabling open-source Kazakh models to deliver competitive QA performance in a low-resource setting.
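
For readers who want a concrete picture of the closed-book versus retrieval-augmented comparison the paper carries out, the sketch below shows the general shape of such an evaluation loop. It is a minimal illustration under stated assumptions, not the authors' code: the prompt templates and the generate, retrieve, and score callables are placeholders standing in for the paper's actual models, retrievers, and answer-correctness metric.

    # Minimal sketch of a closed-book vs. RAG evaluation loop (illustrative only).
    from dataclasses import dataclass
    from typing import Callable, List


    @dataclass
    class QAExample:
        question: str
        reference_answer: str


    def closed_book_prompt(question: str) -> str:
        # The model must answer from its parametric knowledge alone.
        return f"Answer the following question.\nQuestion: {question}\nAnswer:"


    def rag_prompt(question: str, passages: List[str]) -> str:
        # The model is asked to ground its answer in the retrieved context.
        context = "\n\n".join(passages)
        return ("Answer the question using only the context below. "
                "If the context does not contain the answer, say you do not know.\n\n"
                f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")


    def evaluate(examples: List[QAExample],
                 generate: Callable[[str], str],        # wraps an LLM call (hypothetical)
                 retrieve: Callable[[str], List[str]],  # wraps a retriever (hypothetical)
                 score: Callable[[str, str], float]):   # answer-correctness metric in [0, 1]
        closed, rag = [], []
        for ex in examples:
            closed.append(score(generate(closed_book_prompt(ex.question)), ex.reference_answer))
            rag.append(score(generate(rag_prompt(ex.question, retrieve(ex.question))), ex.reference_answer))
        return {"closed_book": sum(closed) / len(closed), "rag": sum(rag) / len(rag)}

In this framing, the paper's Ideal RAG setting can be read as replacing the retriever with an oracle that always returns the gold passages, which isolates a model's context-grounding ability from retriever quality.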

16 pages, 2128 KB  
Article
Secure Multifaceted-RAG: Hybrid Knowledge Retrieval with Security Filtering
by Grace Byun, Shinsun Lee, Nayoung Choi and Jinho D. Choi
Information 2025, 16(9), 804; https://doi.org/10.3390/info16090804 - 16 Sep 2025
Abstract
Existing Retrieval-Augmented Generation (RAG) systems face challenges in enterprise settings due to limited retrieval scope and data security risks. When relevant internal documents are unavailable, the system struggles to generate accurate and complete responses. Additionally, using closed-source Large Language Models (LLMs) raises concerns about exposing proprietary information. To address these issues, we propose the Secure Multifaceted-RAG (SecMulti-RAG) framework, which retrieves not only from internal documents but also from two supplementary sources: pre-generated expert knowledge for anticipated queries and on-demand external LLM-generated knowledge. To mitigate security risks, we adopt a local open-source generator and selectively utilize external LLMs only when prompts are deemed safe by a filtering mechanism. This approach enhances completeness, prevents data leakage, and reduces costs. In our evaluation on a report generation task in the automotive industry, SecMulti-RAG significantly outperforms traditional RAG, achieving 79.3–91.9% win rates across correctness, richness, and helpfulness in LLM-based evaluation and 56.3–70.4% in human evaluation. This highlights SecMulti-RAG as a practical and secure solution for enterprise RAG.
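
The routing idea at the heart of the framework, retrieving unconditionally from sources that stay inside the enterprise while consulting an external LLM only when a filter judges the prompt safe, can be sketched roughly as follows. Every name here (is_prompt_safe, local_generate, and so on) is a hypothetical placeholder rather than the SecMulti-RAG API.

    # Rough sketch of security-filtered, multi-source retrieval (illustrative only).
    from typing import Callable, List


    def secure_multifaceted_answer(
        query: str,
        search_internal: Callable[[str], List[str]],      # internal enterprise documents
        search_expert_cache: Callable[[str], List[str]],  # pre-generated expert knowledge
        external_llm_generate: Callable[[str], str],      # closed-source external LLM
        local_generate: Callable[[str], str],             # local open-source generator
        is_prompt_safe: Callable[[str], bool],            # security filtering mechanism
    ) -> str:
        # 1) Always retrieve from sources that never leave the enterprise boundary.
        passages = search_internal(query) + search_expert_cache(query)

        # 2) Consult the external LLM only when the filter deems the prompt safe,
        #    so proprietary details are not exposed.
        if is_prompt_safe(query):
            passages.append(external_llm_generate(f"Provide background knowledge for: {query}"))

        # 3) The final answer is always produced by the local generator,
        #    grounded in the collected passages.
        context = "\n\n".join(passages)
        return local_generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")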

24 pages, 3421 KB  
Article
Cloud-Based Medical Named Entity Recognition: A FIT4NER-Based Approach
by Philippe Tamla, Florian Freund and Matthias Hemmje
Information 2025, 16(5), 395; https://doi.org/10.3390/info16050395 - 12 May 2025
Abstract
This paper presents a cloud-based system that builds upon the FIT4NER framework to support medical experts in training machine learning models for named entity recognition (NER) using Microsoft Azure. The system is designed to simplify complex cloud configurations while providing an intuitive interface for managing and converting large-scale training and evaluation datasets across formats such as PDF, DOCX, TXT, BioC, spaCyJSON, and CoNLL-2003. It also enables the configuration of transformer-based spaCy pipelines and orchestrates Azure cloud services for scalable and efficient NER model training. Following the structured Nunamaker research methodology, the paper introduces the research context, surveys the state of the art, and highlights key challenges faced by medical professionals in cloud-based NER. It then details the modeling, implementation, and integration of the system. Evaluation results, both qualitative and quantitative, demonstrate enhanced usability, scalability, and accessibility for non-technical users in medical domains. The paper concludes with insights gained and outlines directions for future work.
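
As a rough illustration of the kind of format conversion such a system automates, the snippet below turns character-offset annotations into spaCy's binary DocBin training format; a transformer-based pipeline would then typically be trained from a config file via the spaCy CLI. The example data, labels, and file names are illustrative and are not taken from FIT4NER.

    # Converting toy annotated examples into spaCy's binary training format.
    import spacy
    from spacy.tokens import DocBin

    # (text, [(char_start, char_end, label), ...]) pairs; illustrative data only.
    TRAIN_DATA = [
        ("Aspirin reduces the risk of myocardial infarction.",
         [(0, 7, "DRUG"), (28, 49, "DISEASE")]),
    ]

    nlp = spacy.blank("en")  # tokenizer only; the NER model itself is trained later
    db = DocBin()
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        spans = []
        for start, end, label in annotations:
            span = doc.char_span(start, end, label=label)
            if span is not None:  # skip annotations that do not align with token boundaries
                spans.append(span)
        doc.ents = spans
        db.add(doc)
    db.to_disk("train.spacy")

    # Training a transformer-based NER pipeline is then typically driven by the CLI, e.g.:
    #   python -m spacy init config config.cfg --lang en --pipeline ner
    #   python -m spacy train config.cfg --paths.train train.spacy --paths.dev dev.spacy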
