Evaluating the Integrity of LLM-Generated Citations: Prevalence and Risks of Fabricated References in Scientific Literature

Picazo-Sanchez, Pablo; Ortiz-Martin, Lara

doi:10.3390/data11050122

Open Access Article

by Pablo Picazo-Sanchez * and Lara Ortiz-Martin

School of Information Technology, Halmstad University, 301 18 Halmstad, Sweden

* Author to whom correspondence should be addressed.

Data 2026, 11(5), 122; https://doi.org/10.3390/data11050122

Submission received: 30 March 2026 / Revised: 5 May 2026 / Accepted: 11 May 2026 / Published: 20 May 2026

(This article belongs to the Special Issue Mining and Computational Intelligence for E-Learning and Education—4th Edition)

Abstract

Large Language Models (LLMs) have become an important part of daily life, and academia is no exception, with LLM-based tools now offered for tasks such as text rephrasing and summarisation. However, this integration raises significant concerns regarding the integrity of science. In this paper, we investigate LLM hallucinations in the generation of scientific references. Using nine LLMs, we generated a dataset of 74,196 Bib references to quantify and analyse fabricated references, distinguishing between intrinsic and extrinsic hallucinations. We also extracted and analysed 127,063 references from 3541 papers published in 2023 to assess the prevalence of fake bibliographic data. Our manual verification process identified eight instances of fabricated references. Although the overall rate is low, the mere existence of fabricated content in the peer-reviewed literature is a critical integrity issue and demonstrates a vulnerability in current academic validation systems. The significance of our finding lies not in its statistical prevalence but in the need for rigorous, human-validated processes that prevent the injection of spurious citations, regardless of their source.
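To make "low overall rate" concrete, the following minimal sketch works through the arithmetic implied by the counts in the abstract. It assumes the eight fabricated references are counted against all 127,063 extracted references, which the abstract suggests but does not state explicitly. The function and constant names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract.
REFERENCES_CHECKED = 127_063  # references extracted from published papers
PAPERS_CHECKED = 3_541        # papers published in 2023
FABRICATED_FOUND = 8          # fabricated references found by manual verification


def fabrication_rate(fabricated: int, total: int) -> float:
    """Return the fraction of checked references that were fabricated."""
    if total <= 0:
        raise ValueError("total must be positive")
    return fabricated / total


def per_100k(rate: float) -> float:
    """Express a fraction as a count per 100,000 items."""
    return rate * 100_000


if __name__ == "__main__":
    # Assumption: all eight cases come from the 127,063 extracted references.
    rate = fabrication_rate(FABRICATED_FOUND, REFERENCES_CHECKED)
    print(f"Fabrication rate: {rate:.6%}")
    print(f"Per 100,000 references: {per_100k(rate):.2f}")
    print(f"Mean references per paper: {REFERENCES_CHECKED / PAPERS_CHECKED:.1f}")
```

Under that assumption the rate comes to roughly 6.3 fabricated references per 100,000 checked, or about one in every 16,000. This is consistent with the abstract's point that the concern is the existence of such references rather than their frequency.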

Keywords: hallucinations; scientific references; NLP; trust
