Search Results (119)

Search Parameters:
Keywords = retrieval-augmented generation (RAG)

24 pages, 3121 KiB  
Article
SG-RAG MOT: SubGraph Retrieval Augmented Generation with Merging and Ordering Triplets for Knowledge Graph Multi-Hop Question Answering
by Ahmmad O. M. Saleh, Gokhan Tur and Yucel Saygin
Mach. Learn. Knowl. Extr. 2025, 7(3), 74; https://doi.org/10.3390/make7030074 - 1 Aug 2025
Abstract
Large language models (LLMs) often tend to hallucinate, especially in domain-specific tasks and tasks that require reasoning. Previously, we introduced SubGraph Retrieval Augmented Generation (SG-RAG) as a novel Graph RAG method for multi-hop question answering. SG-RAG leverages Cypher queries to search a given knowledge graph and retrieve the subgraph necessary to answer the question. The results from our previous work showed the higher performance of our method compared to the traditional Retrieval Augmented Generation (RAG). In this work, we further enhanced SG-RAG by proposing an additional step called Merging and Ordering Triplets (MOT). The new MOT step seeks to decrease the redundancy in the retrieved triplets by applying hierarchical merging to the retrieved subgraphs. Moreover, it provides an ordering among the triplets using the Breadth-First Search (BFS) traversal algorithm. We conducted experiments on the MetaQA benchmark, which was proposed for multi-hop question-answering in the movies domain. Our experiments showed that SG-RAG MOT provided more accurate answers than Chain-of-Thought and Graph Chain-of-Thought. We also found that merging (up to a certain point) highly overlapping subgraphs and defining an order among the triplets helped the LLM to generate more precise answers. Full article
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)
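
As an illustration of the MOT step described above, the following minimal sketch (not the authors' implementation) merges overlapping retrieved subgraphs by deduplicating shared triplets and then orders the result with a BFS traversal from the question entity; the toy movie triplets and the networkx-based merging are assumptions for demonstration only.

```python
import networkx as nx

def merge_and_order_triplets(subgraphs, seed_entity):
    """subgraphs: lists of (head, relation, tail) triplets retrieved per Cypher query."""
    merged = nx.MultiDiGraph()
    for triplets in subgraphs:
        for head, rel, tail in triplets:
            # Hierarchical merging is reduced here to deduplicating triplets
            # shared by highly overlapping subgraphs.
            if not merged.has_edge(head, tail, key=rel):
                merged.add_edge(head, tail, key=rel)
    ordered, seen = [], set()
    # BFS from the question entity imposes an ordering over the merged triplets.
    for u, v in nx.bfs_edges(merged.to_undirected(), seed_entity):
        for a, b, rel in merged.edges(keys=True):
            if {a, b} == {u, v} and (a, rel, b) not in seen:
                seen.add((a, rel, b))
                ordered.append((a, rel, b))
    return ordered

subgraphs = [
    [("Inception", "directed_by", "Christopher Nolan"),
     ("Christopher Nolan", "directed", "Interstellar")],
    [("Inception", "directed_by", "Christopher Nolan"),
     ("Inception", "released_in", "2010")],
]
print(merge_and_order_triplets(subgraphs, "Inception"))
```

The ordered triplets can then be serialized one per line and placed in the LLM prompt as grounding context.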

13 pages, 564 KiB  
Article
Enhanced Semantic Retrieval with Structured Prompt and Dimensionality Reduction for Big Data
by Donghyeon Kim, Minki Park, Jungsun Lee, Inho Lee, Jeonghyeon Jin and Yunsick Sung
Mathematics 2025, 13(15), 2469; https://doi.org/10.3390/math13152469 - 31 Jul 2025
Abstract
The exponential increase in textual data generated across sectors such as healthcare, finance, and smart manufacturing has intensified the need for effective Big Data analytics. Large language models (LLMs) have become critical tools because of their advanced language processing capabilities. However, their static nature limits their ability to incorporate real-time and domain-specific knowledge. Retrieval-augmented generation (RAG) addresses these limitations by enriching LLM outputs through external content retrieval. Nevertheless, traditional RAG systems remain inefficient, often exhibiting high retrieval latency, redundancy, and diminished response quality when scaled to large datasets. This paper proposes an innovative structured RAG framework specifically designed for large-scale Big Data analytics. The framework transforms unstructured partial prompts into structured semantically coherent partial prompts, leveraging element-specific embedding models and dimensionality reduction techniques, such as principal component analysis. To further improve the retrieval accuracy and computational efficiency, we introduce a multi-level filtering approach integrating semantic constraints and redundancy elimination. In the experiments, the proposed method was compared with structured-format RAG. After generating prompts utilizing two methods, silhouette scores were computed to assess the quality of embedding clusters. The proposed method outperformed the baseline by improving the clustering quality by 32.3%. These results demonstrate the effectiveness of the framework in enhancing LLMs for accurate, diverse, and efficient decision-making in complex Big Data environments. Full article
(This article belongs to the Special Issue Big Data Analysis, Computing and Applications)
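
As a rough illustration of the dimensionality-reduction step described above, the sketch below (an assumed setup, not the paper's pipeline) compresses placeholder prompt-element embeddings with PCA and scores the resulting clusters with the silhouette coefficient used in the paper's evaluation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 768))   # stand-in for element-specific prompt embeddings

# Reduce embedding dimensionality before indexing and retrieval.
reduced = PCA(n_components=64, random_state=0).fit_transform(embeddings)

# Cluster the reduced vectors and measure clustering quality with the silhouette score.
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(reduced)
print("silhouette score:", silhouette_score(reduced, labels))
```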

19 pages, 6095 KiB  
Article
MERA: Medical Electronic Records Assistant
by Ahmed Ibrahim, Abdullah Khalili, Maryam Arabi, Aamenah Sattar, Abdullah Hosseini and Ahmed Serag
Mach. Learn. Knowl. Extr. 2025, 7(3), 73; https://doi.org/10.3390/make7030073 - 30 Jul 2025
Abstract
The increasing complexity and scale of electronic health records (EHRs) demand advanced tools for efficient data retrieval, summarization, and comparative analysis in clinical practice. MERA (Medical Electronic Records Assistant) is a Retrieval-Augmented Generation (RAG)-based AI system that addresses these needs by integrating domain-specific retrieval with large language models (LLMs) to deliver robust question answering, similarity search, and report summarization functionalities. MERA is designed to overcome key limitations of conventional LLMs in healthcare, such as hallucinations, outdated knowledge, and limited explainability. To ensure both privacy compliance and model robustness, we constructed a large synthetic dataset using state-of-the-art LLMs, including Mistral v0.3, Qwen 2.5, and Llama 3, and further validated MERA on de-identified real-world EHRs from the MIMIC-IV-Note dataset. Comprehensive evaluation demonstrates MERA’s high accuracy in medical question answering (correctness: 0.91; relevance: 0.98; groundedness: 0.89; retrieval relevance: 0.92), strong summarization performance (ROUGE-1 F1-score: 0.70; Jaccard similarity: 0.73), and effective similarity search (METEOR: 0.7–1.0 across diagnoses), with consistent results on real EHRs. The similarity search module empowers clinicians to efficiently identify and compare analogous patient cases, supporting differential diagnosis and personalized treatment planning. By generating concise, contextually relevant, and explainable insights, MERA reduces clinician workload and enhances decision-making. To our knowledge, this is the first system to integrate clinical question answering, summarization, and similarity search within a unified RAG-based framework. Full article
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
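
For readers unfamiliar with the summarization metrics quoted above, the snippet below shows a token-level Jaccard similarity of the kind reported; it is a simple illustrative helper, not MERA's evaluation code.

```python
def jaccard_similarity(reference: str, candidate: str) -> float:
    """Jaccard similarity between the token sets of two summaries."""
    ref, cand = set(reference.lower().split()), set(candidate.lower().split())
    return len(ref & cand) / len(ref | cand) if ref | cand else 0.0

print(jaccard_similarity("patient admitted with acute chest pain and dyspnea",
                         "the patient was admitted with chest pain and dyspnea"))
```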

17 pages, 1540 KiB  
Article
Evaluating a Nationally Localized AI Chatbot for Personalized Primary Care Guidance: Insights from the HomeDOCtor Deployment in Slovenia
by Matjaž Gams, Tadej Horvat, Žiga Kolar, Primož Kocuvan, Kostadin Mishev and Monika Simjanoska Misheva
Healthcare 2025, 13(15), 1843; https://doi.org/10.3390/healthcare13151843 - 29 Jul 2025
Abstract
Background/Objectives: The demand for accessible and reliable digital health services has increased significantly in recent years, particularly in regions facing physician shortages. HomeDOCtor, a conversational AI platform developed in Slovenia, addresses this need with a nationally adapted architecture that combines retrieval-augmented generation (RAG) and a Redis-based vector database of curated medical guidelines. The objective of this study was to assess the performance and impact of HomeDOCtor in providing AI-powered healthcare assistance. Methods: HomeDOCtor is designed for human-centered communication and clinical relevance, supporting multilingual and multimedia citizen inputs while being available 24/7. It was tested using a set of 100 international clinical vignettes and 150 internal medicine exam questions from the University of Ljubljana to validate its clinical performance. Results: During its six-month nationwide deployment, HomeDOCtor received overwhelmingly positive user feedback with minimal criticism, and exceeded initial expectations, especially in light of widespread media narratives warning about the risks of AI. HomeDOCtor autonomously delivered localized, evidence-based guidance, including self-care instructions and referral suggestions, with average response times under three seconds. On international benchmarks, the system achieved ≥95% Top-1 diagnostic accuracy, comparable to leading medical AI platforms, and significantly outperformed stand-alone ChatGPT-4o in the national context (90.7% vs. 80.7%, p = 0.0135). Conclusions: Practically, HomeDOCtor eases the burden on healthcare professionals by providing citizens with 24/7 autonomous, personalized triage and self-care guidance for less complex medical issues, ensuring that these cases are self-managed efficiently. The system also identifies more serious cases that might otherwise be neglected, directing them to professionals for appropriate care. Theoretically, HomeDOCtor demonstrates that domain-specific, nationally adapted large language models can outperform general-purpose models. Methodologically, it offers a framework for integrating GDPR-compliant AI solutions in healthcare. These findings emphasize the value of localization in conversational AI and telemedicine solutions across diverse national contexts. Full article
(This article belongs to the Special Issue Application of Digital Services to Improve Patient-Centered Care)

16 pages, 1170 KiB  
Article
LoRA-Tuned Multimodal RAG System for Technical Manual QA: A Case Study on Hyundai Staria
by Yerin Nam, Hansun Choi, Jonggeun Choi and Hyukjin Kwon
Appl. Sci. 2025, 15(15), 8387; https://doi.org/10.3390/app15158387 - 29 Jul 2025
Abstract
This study develops a domain-adaptive multimodal RAG (Retrieval-Augmented Generation) system to improve the accuracy and efficiency of technical question answering based on large-scale structured manuals. Using Hyundai Staria maintenance documents as a case study, we extracted text and images from PDF manuals and constructed QA, RAG, and Multi-Turn datasets to reflect realistic troubleshooting scenarios. To overcome limitations of baseline RAG models, we proposed an enhanced architecture that incorporates sentence-level similarity annotations and parameter-efficient fine-tuning via LoRA (Low-Rank Adaptation) using the bLLossom-8B language model and BAAI-bge-m3 embedding model. Experimental results show that the proposed system achieved improvements of 3.0%p in BERTScore, 3.0%p in cosine similarity, and 18.0%p in ROUGE-L compared to existing RAG systems, with notable gains in image-guided response accuracy. A qualitative evaluation by 20 domain experts yielded an average satisfaction score of 4.4 out of 5. This study presents a practical and extensible AI framework for multimodal document understanding, with broad applicability across automotive, industrial, and defense-related technical documentation. Full article
(This article belongs to the Special Issue Innovations in Artificial Neural Network Applications)
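
The LoRA fine-tuning mentioned above can be reproduced in outline with Hugging Face PEFT; the sketch below is a hedged example with a placeholder checkpoint and hyperparameters (the paper uses the bLLossom-8B language model, which is not assumed to be available here).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Meta-Llama-3-8B"     # placeholder causal LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

lora_cfg = LoraConfig(
    r=16,                                  # low-rank adapter dimension
    lora_alpha=32,                         # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections receiving adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()         # only the small LoRA matrices are trainable
```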

18 pages, 1363 KiB  
Article
FairRAG: A Privacy-Preserving Framework for Fair Financial Decision-Making
by Rashmi Nagpal, Unyimeabasi Usua, Rafael Palacios and Amar Gupta
Appl. Sci. 2025, 15(15), 8282; https://doi.org/10.3390/app15158282 - 25 Jul 2025
Abstract
Customer churn prediction has become crucial for businesses, yet it poses significant challenges regarding privacy preservation and prediction accuracy. In this paper, we address two fundamental questions: (1) How can customer churn be effectively predicted while ensuring robust privacy protection of sensitive data? (2) How can large language models enhance churn prediction accuracy while maintaining data privacy? To address these questions, we propose FairRAG, a robust architecture that combines differential privacy, retrieval-augmented generation, and LLMs. Our approach leverages OPT-125M as the core language model along with a sentence transformer for semantic similarity matching while incorporating differential privacy mechanisms to generate synthetic training data. We evaluate FairRAG on two diverse datasets: Bank Churn and Telco Churn. The results demonstrate significant improvements over both traditional machine learning approaches and standalone LLMs, achieving accuracy improvements of up to 11% on the Bank Churn dataset and 12% on the Telco Churn dataset. These improvements were maintained when using differentially private synthetic data, thus indicating robust privacy and accuracy trade-offs. Full article
(This article belongs to the Special Issue Soft Computing Methods and Applications for Decision Making)
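
As a hedged sketch of the differential-privacy ingredient described above (not FairRAG's actual synthetic-data generator), the snippet below perturbs numeric customer features with the standard Laplace mechanism.

```python
import numpy as np

def laplace_mechanism(values: np.ndarray, sensitivity: float, epsilon: float) -> np.ndarray:
    """Add Laplace noise with scale sensitivity/epsilon to numeric features."""
    scale = sensitivity / epsilon
    return values + np.random.laplace(loc=0.0, scale=scale, size=values.shape)

balances = np.array([52_000.0, 61_500.0, 47_250.0])   # hypothetical customer balances
print(laplace_mechanism(balances, sensitivity=1_000.0, epsilon=1.0))
```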

23 pages, 1127 KiB  
Article
NOVA: A Retrieval-Augmented Generation Assistant in Spanish for Parallel Computing Education with Large Language Models
by Gabriel A. León-Paredes, Luis A. Alba-Narváez and Kelly D. Paltin-Guzmán
Appl. Sci. 2025, 15(15), 8175; https://doi.org/10.3390/app15158175 - 23 Jul 2025
Abstract
This work presents the development of NOVA, an educational virtual assistant designed for the Parallel Computing course, built using a Retrieval-Augmented Generation (RAG) architecture combined with Large Language Models (LLMs). The assistant operates entirely in Spanish, supporting native-language learning and increasing accessibility for students in Latin American academic settings. It integrates vector and relational databases to provide an interactive, personalized learning experience that supports the understanding of complex technical concepts. Its core functionalities include the automatic generation of questions and answers, quizzes, and practical guides, all tailored to promote autonomous learning. NOVA was deployed in an academic setting at Universidad Politécnica Salesiana. Its modular architecture includes five components: a relational database for logging, a vector database for semantic retrieval, a FastAPI backend for managing logic, a Next.js frontend for user interaction, and an integration server for workflow automation. The system uses the GPT-4o mini model to generate context-aware, pedagogically aligned responses. To evaluate its effectiveness, a test suite of 100 academic tasks was executed—55 question-and-answer prompts, 25 practical guides, and 20 quizzes. NOVA achieved a 92% excellence rating, a 21-second average response time, and 72% retrieval coverage, confirming its potential as a reliable AI-driven tool for enhancing technical education. Full article
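
A minimal sketch of the kind of backend endpoint described above, assuming the OpenAI API for GPT-4o mini and a placeholder retrieve() helper standing in for the vector-database lookup; this is not the deployed NOVA code.

```python
from fastapi import FastAPI
from openai import OpenAI

app = FastAPI()
client = OpenAI()   # expects OPENAI_API_KEY in the environment

def retrieve(question: str) -> str:
    # Placeholder for the vector-database lookup over course material.
    return "Amdahl's law bounds the speedup of a parallel program by its serial fraction."

@app.post("/ask")
def ask(question: str) -> dict:
    context = retrieve(question)
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer in Spanish, using only the provided course context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return {"answer": completion.choices[0].message.content}
```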

26 pages, 663 KiB  
Article
An Information-Theoretic Framework for Retrieval-Augmented Generation Systems
by Semih Yumuşak
Electronics 2025, 14(15), 2925; https://doi.org/10.3390/electronics14152925 - 22 Jul 2025
Abstract
Retrieval-Augmented Generation (RAG) systems have emerged as a critical approach for enhancing large language models with external knowledge, yet the field lacks systematic theoretical analysis for understanding their fundamental characteristics and optimization principles. A novel information-theoretic approach for analyzing and optimizing RAG systems is introduced in this paper by modeling them as cascading information channel systems where each component (query encoding, retrieval, context integration, and generation) functions as a distinct information-theoretic channel with measurable capacity. Following established practices in information theory research, theoretical insights are evaluated through systematic experimentation on controlled synthetic datasets that enable precise manipulation of schema entropy and isolation of information flow dynamics. Through this controlled experimental approach, the following key theoretical insights are supported: (1) RAG performance is bounded by the minimum capacity across constituent channels, (2) the retrieval channel represents the primary information bottleneck, (3) errors propagate through channel-dependent mechanisms with specific interaction patterns, and (4) retrieval capacity is fundamentally limited by the minimum of embedding dimension and schema entropy. Both quantitative metrics for evaluating RAG systems and practical design principles for optimization are provided by the proposed approach. Retrieval improvements yield 58–85% performance gains and generation improvements yield 58–110% gains, substantially higher than context integration improvements (∼9%) and query encoding modifications, as shown by experimental results on controlled synthetic environments, supporting the theoretical approach. A systematic theoretical analysis for understanding RAG system dynamics is provided by this work, with real-world validation and practical implementation refinements representing natural next phases for this research. Full article
(This article belongs to the Special Issue Advanced Natural Language Processing Technology and Applications)
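
The cascade bound summarized above can be written compactly as follows (the notation is mine, not the paper's): end-to-end RAG capacity is limited by the weakest channel, and the retrieval channel is in turn limited by the embedding dimension and the schema entropy.

```latex
C_{\mathrm{RAG}} \;\le\; \min\left(C_{\mathrm{query}},\, C_{\mathrm{retrieval}},\, C_{\mathrm{context}},\, C_{\mathrm{generation}}\right),
\qquad
C_{\mathrm{retrieval}} \;\le\; \min\left(d_{\mathrm{embed}},\, H(\mathcal{S})\right).
```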

18 pages, 1554 KiB  
Article
ChatCVD: A Retrieval-Augmented Chatbot for Personalized Cardiovascular Risk Assessment with a Comparison of Medical-Specific and General-Purpose LLMs
by Wafa Lakhdhar, Maryam Arabi, Ahmed Ibrahim, Abdulrahman Arabi and Ahmed Serag
AI 2025, 6(8), 163; https://doi.org/10.3390/ai6080163 - 22 Jul 2025
Abstract
Large language models (LLMs) are increasingly being applied to clinical tasks, but it remains unclear whether medical-specific models consistently outperform smaller, general-purpose ones. This study investigates that assumption in the context of cardiovascular disease (CVD) risk assessment. We fine-tuned eight LLMs—both general-purpose and medical-specific—using textualized data from the Behavioral Risk Factor Surveillance System (BRFSS) to classify individuals as “High Risk” or “Low Risk”. To provide actionable insights, we integrated a Retrieval-Augmented Generation (RAG) framework for personalized recommendation generation and deployed the system within an interactive chatbot interface. Notably, Gemma2, a compact 2B-parameter general-purpose model, achieved a high recall (0.907) and F1-score (0.770), performing on par with larger or medical-specialized models such as Med42 and BioBERT. These findings challenge the common assumption that larger or specialized models always yield superior results, and highlight the potential of lightweight, efficiently fine-tuned LLMs for clinical decision support—especially in resource-constrained settings. Overall, our results demonstrate that general-purpose models, when fine-tuned appropriately, can offer interpretable, high-performing, and accessible solutions for CVD risk assessment and personalized healthcare delivery. Full article

8 pages, 1058 KiB  
Proceeding Paper
A Review of Global Microplastic (MP) Databases: A Study on the Challenges and Opportunities for Data Integration in the Context of MP Pollution
by Hussain Ahamed, Marwa Al-Ani, Ala Al-Ardah and Noora Al-Qahtani
Mater. Proc. 2025, 22(1), 6; https://doi.org/10.3390/materproc2025022006 - 21 Jul 2025
Abstract
Microplastic (MP) pollution is an escalating global environmental concern, with a growing body of research addressing diverse dimensions of this issue. Despite this progress, the field remains hindered by the generation of large, heterogeneous datasets that follow inconsistent reporting standards, resulting in fragmented and often incompatible databases. While various databases on MPs have been developed, they primarily operate in isolation, limiting the accessibility and cross-comparison of data. This study presents a foundational approach to aggregating and accessing existing MP pollution datasets. A comprehensive review of the currently available databases was conducted to evaluate their integration potential. It revealed key challenges such as non-standardized data formats, limited accessibility, and difficulty performing comparative analyses across sources. To address these barriers, a prototype web-based platform was developed that enables unified access to MP datasets. The architecture includes a smart standardization layer that harmonizes inputs from disparate sources. The integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) techniques was proposed to facilitate natural language querying. This enables researchers to interact with the platform intuitively and extract meaningful insights more efficiently. The proposed system aims to enhance data discoverability, promote interoperability, and support robust, data-driven environmental research, paving the way toward more informed policy-making and scientific collaboration in the fight against MP pollution. With this platform, there is potential for new discoveries and a future in which the tools to effectively combat this global issue are available. Full article

18 pages, 1332 KiB  
Article
SC-LKM: A Semantic Chunking and Large Language Model-Based Cybersecurity Knowledge Graph Construction Method
by Pu Wang, Yangsen Zhang, Zicheng Zhou and Yuqi Wang
Electronics 2025, 14(14), 2878; https://doi.org/10.3390/electronics14142878 - 18 Jul 2025
Abstract
In cybersecurity, constructing an accurate knowledge graph is vital for discovering key entities and relationships in security incidents buried in vast unstructured threat reports. Traditional knowledge-graph construction pipelines based on handcrafted rules or conventional machine learning models falter when the data scale and linguistic variety grow. GraphRAG, a retrieval-augmented generation (RAG) framework that splits documents into fixed-length chunks and then retrieves the most relevant ones for generation, offers a scalable alternative yet still suffers from fragmentation and semantic gaps that erode graph integrity. To resolve these issues, this paper proposes SC-LKM, a cybersecurity knowledge-graph construction method that couples the GraphRAG backbone with hierarchical semantic chunking. SC-LKM applies semantic chunking to build a cybersecurity knowledge graph that avoids the fragmentation and inconsistency seen in prior work. The semantic chunking method first respects the native document hierarchy and then refines boundaries with topic similarity and named-entity continuity, maintaining logical coherence while limiting information loss during the fine-grained processing of unstructured text. SC-LKM further integrates the semantic comprehension capacity of Qwen2.5-14B-Instruct, markedly boosting extraction accuracy and reasoning quality. Experimental results show that SC-LKM surpasses baseline systems in entity-recognition coverage, topology density, and semantic consistency. Full article
(This article belongs to the Section Artificial Intelligence)
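
The general semantic-chunking idea can be sketched as below, assuming sentence-transformers embeddings and a fixed similarity threshold; SC-LKM's hierarchy-aware and named-entity-continuity refinements are not reproduced here.

```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_chunks(sentences, threshold=0.5):
    """Close the current chunk whenever adjacent-sentence similarity drops below threshold."""
    embeddings = model.encode(sentences)
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        sim = cosine_similarity([embeddings[i - 1]], [embeddings[i]])[0, 0]
        if sim < threshold:          # topic shift detected
            chunks.append(" ".join(current))
            current = []
        current.append(sentences[i])
    chunks.append(" ".join(current))
    return chunks

report = [
    "The campaign delivered spear-phishing emails with weaponized attachments.",
    "The payload established persistence through a scheduled task.",
    "Affected organizations should rotate credentials after containment.",
]
print(semantic_chunks(report))
```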

20 pages, 4388 KiB  
Article
An Optimized Semantic Matching Method and RAG Testing Framework for Regulatory Texts
by Bingjie Li, Haolin Wen, Songyi Wang, Tao Hu, Xin Liang and Xing Luo
Electronics 2025, 14(14), 2856; https://doi.org/10.3390/electronics14142856 - 17 Jul 2025
Abstract
To enhance the accuracy and reliability of large language models (LLMs) in regulatory question-answering tasks, this study addresses the complexity and domain-specificity of regulatory texts by designing a retrieval-augmented generation (RAG) testing framework. It proposes a dimensionality reduction-based semantic similarity measurement method and a retrieval optimization approach leveraging information reasoning. By constructing the technical route of an intelligent knowledge management system, the semantic understanding capabilities of multiple mainstream embedding models in matching financial regulatory texts are systematically evaluated. The workflow encompasses data processing, knowledge base construction, embedding model selection, vectorization, recall parameter analysis, and retrieval performance benchmarking. Furthermore, the study innovatively introduces a multidimensional scaling (MDS)-based semantic similarity measurement method and a question-reasoning processing technique. Compared to traditional cosine similarity (CS) metrics, these methods significantly improved recall accuracy. Experimental results demonstrate that, under the RAG testing framework, the mxbai-embed-large embedding model combined with MDS similarity calculation, Top-k recall, and information reasoning effectively addresses core challenges such as the structuring of regulatory texts and the generalization of domain-specific terminology. This approach provides a reusable technical solution for optimizing semantic matching in vertical-domain RAG systems, particularly for specialized domains such as law and finance. Full article
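
To make the comparison above concrete, the sketch below ranks placeholder regulation-clause embeddings against a query both by cosine similarity and by Euclidean distance after a multidimensional scaling (MDS) projection; the data, dimensions, and scikit-learn MDS settings are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
passages = rng.normal(size=(60, 512))   # stand-in for regulation-clause embeddings
query = rng.normal(size=(1, 512))

# Baseline: cosine-similarity recall.
cosine_top5 = np.argsort(-cosine_similarity(query, passages)[0])[:5]

# Alternative: embed query and passages jointly with MDS, then rank by distance.
reduced = MDS(n_components=16, random_state=0).fit_transform(np.vstack([query, passages]))
distances = np.linalg.norm(reduced[1:] - reduced[0], axis=1)
mds_top5 = np.argsort(distances)[:5]

print("cosine top-5:", cosine_top5)
print("MDS top-5:   ", mds_top5)
```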

14 pages, 679 KiB  
Article
Enhancing Patient Outcomes in Head and Neck Cancer Radiotherapy: Integration of Electronic Patient-Reported Outcomes and Artificial Intelligence-Driven Oncology Care Using Large Language Models
by ChihYing Liao, ChinNan Chu, TingChun Lin, TzuYao Chou and MengHsiun Tsai
Cancers 2025, 17(14), 2345; https://doi.org/10.3390/cancers17142345 - 15 Jul 2025
Abstract
Background: Electronic patient-reported outcomes (ePROs) enable real-time symptom monitoring and early intervention in oncology. Large language models (LLMs), when combined with retrieval-augmented generation (RAG), offer scalable Artificial Intelligence (AI)-driven education tailored to individual patient needs. However, few studies have examined the feasibility and clinical impact of integrating ePRO with LLM-RAG feedback during radiotherapy in high-toxicity settings such as head and neck cancer. Methods: This prospective observational study enrolled 42 patients with head and neck cancer undergoing radiotherapy from January to December 2024. Patients completed ePRO entries twice weekly using a web-based platform. Following each entry, an LLM-RAG system (Gemini 1.5-based) generated real-time educational feedback using National Comprehensive Cancer Network (NCCN) guidelines and institutional resources. Primary outcomes included percentage weight loss and treatment interruption days. Statistical analyses included t-tests, linear regression, and receiver operating characteristic (ROC) analysis. A threshold of ≥6 ePRO entries was used for subgroup analysis. Results: Patients had a mean age of 53.6 years and submitted an average of 8.0 ePRO entries. Frequent ePRO users (≥6 entries) had significantly less weight loss (4.45% vs. 7.57%, p = 0.021) and fewer treatment interruptions (0.67 vs. 2.50 days, p = 0.002). Chemotherapy, moderate-to-severe pain, and lower ePRO submission frequency were associated with greater weight loss. ePRO submission frequency was negatively correlated with both weight loss and treatment interruption days. The most commonly reported symptoms were appetite loss, fatigue, and nausea. Conclusions: Integrating LLM-RAG feedback with ePRO systems is feasible and may enhance symptom control, treatment continuity, and patient engagement in head and neck cancer radiotherapy. Further studies are warranted to validate the clinical benefits of AI-supported ePRO platforms in routine care. Full article
(This article belongs to the Special Issue Personalized Radiotherapy in Cancer Care (2nd Edition))

11 pages, 1132 KiB  
Article
Custom-Tailored Radiology Research via Retrieval-Augmented Generation: A Secure Institutionally Deployed Large Language Model System
by Michael Welsh, Julian Lopez-Rippe, Dana Alkhulaifat, Vahid Khalkhali, Xinmeng Wang, Mario Sinti-Ycochea and Susan Sotardi
Inventions 2025, 10(4), 55; https://doi.org/10.3390/inventions10040055 - 8 Jul 2025
Abstract
Large language models (LLMs) show promise in enhancing medical research through domain-specific question answering. However, their clinical application is limited by hallucination risk, limited domain specialization, and privacy concerns. Public LLMs like GPT-4-Consensus pose challenges for use with institutional data, due to the inability to ensure patient data protection. In this work, we present a secure, custom-designed retrieval-augmented generation (RAG) LLM system deployed entirely within our institution and tailored for radiology research. Radiology researchers at our institution evaluated the system against GPT-4-Consensus through a blinded survey assessing factual accuracy (FA), citation relevance (CR), and perceived performance (PP) using 5-point Likert scales. Our system achieved mean ± SD scores of 4.15 ± 0.99 for FA, 3.70 ± 1.17 for CR, and 3.55 ± 1.39 for PP. In comparison, GPT-4-Consensus obtained 4.25 ± 0.72, 3.85 ± 1.23, and 3.90 ± 1.12 for the same metrics, respectively. No statistically significant differences were observed (p = 0.97, 0.65, 0.42), and 50% of participants preferred our system’s output. These results validate that secure, local RAG-based LLMs can match state-of-the-art performance while preserving privacy and adaptability, offering a scalable tool for medical research environments. Full article
(This article belongs to the Special Issue Machine Learning Applications in Healthcare and Disease Prediction)

21 pages, 561 KiB  
Article
Comparative Analysis of BERT and GPT for Classifying Crisis News with Sudan Conflict as an Example
by Yahya Masri, Zifu Wang, Anusha Srirenganathan Malarvizhi, Samir Ahmed, Tayven Stover, David W. S. Wong, Yongyao Jiang, Yun Li, Qian Liu, Mathieu Bere, Daniel Rothbart, Dieter Pfoser and Chaowei Yang
Algorithms 2025, 18(7), 420; https://doi.org/10.3390/a18070420 - 8 Jul 2025
Abstract
To obtain actionable information for humanitarian and other emergency responses, an accurate classification of news or events is critical. Daily news and social media are hard to classify based on conveyed information, especially when multiple categories of information are embedded. This research used large language models (LLMs) and traditional transformer-based models, such as BERT, to classify news and social media events using the example of the Sudan Conflict. A systematic evaluation framework was introduced to test GPT models using Zero-Shot prompting, Retrieval-Augmented Generation (RAG), and RAG with In-Context Learning (ICL) against standard and hyperparameter-tuned BERT-base and BERT-large models. BERT outperformed GPT in F1-score and accuracy for multi-label classification (MLC), while GPT outperformed BERT in accuracy for Single-Label classification from Multi-Label Ground Truth (SL-MLG). The results illustrate that a larger model size improves classification accuracy for both BERT and GPT, while BERT benefits from hyperparameter tuning and GPT benefits from its enhanced contextual comprehension capabilities. By addressing challenges such as overlapping semantic categories, task-specific adaptation, and a limited dataset, this study provides a deeper understanding of LLMs’ applicability in constrained, real-world scenarios, particularly in highlighting the potential for integrating NLP with other applications such as GIS in future conflict analyses. Full article
(This article belongs to the Special Issue Evolution of Algorithms in the Era of Generative AI)
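
For the multi-label setting referenced above, the following sketch (assumed labels and a vanilla bert-base-uncased checkpoint, not the study's tuned models) shows how a Transformers sequence classifier is configured for multi-label classification with independent sigmoid outputs.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["armed_clash", "displacement", "humanitarian_access"]   # hypothetical categories
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(labels),
    problem_type="multi_label_classification",   # BCE loss and per-label sigmoids
)

text = "Shelling near the capital forced thousands of residents to flee overnight."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]

print({label: round(float(p), 3) for label, p in zip(labels, probs)})
```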