This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Evaluating the Integrity of LLM-Generated Citations: Prevalence and Risks of Fabricated References in Scientific Literature

by Pablo Picazo-Sanchez * and Lara Ortiz-Martin
School of Information Technology, Halmstad University, 301 18 Halmstad, Sweden
* Author to whom correspondence should be addressed.
Data 2026, 11(5), 122; https://doi.org/10.3390/data11050122
Submission received: 30 March 2026 / Revised: 5 May 2026 / Accepted: 11 May 2026 / Published: 20 May 2026

Abstract

Large Language Models (LLMs) have become part of everyday life, and academia is no exception, with tools for tasks such as text rephrasing and summarisation. However, this integration raises significant concerns about the integrity of science. In this paper, we investigate LLM hallucinations in the generation of scientific references. Using nine LLMs, we generated a dataset of 74,196 Bib references to quantify and analyse fabricated references, distinguishing between intrinsic and extrinsic hallucinations. We also extracted and analysed 127,063 references from 3541 papers published in 2023 to assess the prevalence of fake bibliographic data. Our manual verification process identified eight instances of fabricated references. While the overall rate is low, the mere existence of fabricated content in the peer-reviewed literature is a critical integrity issue and demonstrates a vulnerability in current academic validation systems. The significance of our finding lies not in its statistical prevalence but in the need for rigorous, human-validated processes that prevent the injection of spurious citations, regardless of their source.
Keywords: hallucinations; scientific references; NLP; trust
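
To put the figures quoted in the abstract in proportion, the following minimal Python sketch computes the implied prevalence. It assumes that the eight fabricated references are counted against all 127,063 extracted references, which the abstract does not state explicitly. The function name `fabrication_rate` and the variable names are hypothetical and are not taken from the paper.

```python
def fabrication_rate(fabricated: int, total: int) -> float:
    """Return the fraction of references that were flagged as fabricated."""
    if total <= 0:
        raise ValueError("total must be a positive number of references")
    if not 0 <= fabricated <= total:
        raise ValueError("fabricated must lie between 0 and total")
    return fabricated / total


# Figures quoted in the abstract.
FABRICATED_REFERENCES = 8
EXTRACTED_REFERENCES = 127_063
PUBLISHED_PAPERS = 3_541

# Assumption: all extracted references form the denominator.
rate = fabrication_rate(FABRICATED_REFERENCES, EXTRACTED_REFERENCES)

# Average number of extracted references per analysed paper.
refs_per_paper = EXTRACTED_REFERENCES / PUBLISHED_PAPERS

print(f"Fabricated share: {rate:.4%}")
print(f"Roughly one fabricated reference per {1 / rate:,.0f} references")
print(f"Average references per paper: {refs_per_paper:.1f}")
```

Under that assumption the share is well below one hundredth of a percent, which is consistent with the abstract's point that the concern is the existence of fabricated entries rather than their frequency.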
