Critical Challenges in Large Language Models and Data Analytics: Trustworthiness, Scalability, and Societal Impact

A special issue of Analytics (ISSN 2813-2203).

Deadline for manuscript submissions: 31 July 2026 | Viewed by 5083

Special Issue Editors


Guest Editor
Department of Computing and Mathematics, Manchester Metropolitan University, Manchester M1 5GD, UK
Interests: data science; artificial intelligence; data analytics; social media analysis

Guest Editor
School of Architecture, Technology and Engineering, University of Brighton, Brighton BN2 4GJ, UK
Interests: data analytics; artificial intelligence; generative AI; higher education

Guest Editor
School of Computing and Digital Technologies, Sheffield Hallam University, Sheffield S1 1WB, UK
Interests: NLP; software engineering; fuzzy logic; IoT; artificial intelligence; machine learning; data science; text mining

Special Issue Information

Dear Colleagues,

The rapid proliferation of large language models (LLMs) and advanced artificial intelligence (AI) systems has fundamentally transformed the landscape of data analytics, creating unprecedented opportunities alongside critical challenges that demand immediate scholarly attention. This Special Issue addresses the most pressing contemporary issues at the intersection of artificial intelligence, data science and analytics, and societal impact, focusing on three interconnected domains of critical importance.

  • Trustworthiness and Reliability represents our first major theme, examining the urgent need for robust evaluation frameworks that can assess LLM performance beyond traditional benchmarks. We seek contributions that address hallucination detection and mitigation, bias quantification across diverse populations, uncertainty quantification in model outputs, and the development of explainable AI techniques that can provide meaningful insights into model decision-making processes. The challenge of ensuring consistent performance across different domains, languages, and cultural contexts remains a fundamental barrier to widespread deployment.
  • Data Quality and Governance forms our second focus area, recognizing that the unprecedented scale of training data for modern LLMs introduces novel challenges in data curation, privacy preservation, and intellectual property considerations. We welcome research on automated data quality assessment techniques, privacy-preserving training methodologies, federated learning approaches for sensitive datasets, and frameworks for managing the complex ethical and legal implications of large-scale data utilization.
  • Scalability and Environmental Impact constitutes our third pillar, addressing the growing computational demands of increasingly sophisticated models and their environmental consequences. We encourage submissions that explore energy-efficient training algorithms, model compression techniques that maintain performance while reducing resource requirements, distributed computing strategies for democratized AI access, and comprehensive lifecycle assessments of AI systems from development through deployment.

This Special Issue aims to present recent advancements while providing practical frameworks for researchers, practitioners, and policymakers navigating these complex relationships.

We particularly encourage interdisciplinary submissions that bridge AI, data science and analytics, ethics, policy, and domain-specific applications.

The scope of this Special Issue includes, but is not limited to, the following topics:

  • Healthcare and Biomedical Analytics
  • Financial Technology and Risk Analytics
  • Educational Technology and Learning Analytics
  • Natural Language Processing and Computational Linguistics
  • Environmental and Climate Analytics
  • Social Media and Digital Humanities
  • Industrial and Manufacturing Analytics
  • Legal Technology and Compliance

Dr. Oluwaseun Ajao
Dr. Bayode Ogunleye
Dr. Hemlata Sharma
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Analytics is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1200 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • large language models (LLMs)
  • data analytics
  • trustworthiness
  • data quality
  • explainable AI
  • bias and fairness
  • privacy-preserving AI
  • scalability
  • model compression
  • environmental impact
  • ethical AI
  • data governance

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

17 pages, 519 KB  
Article
From Models to Metrics: A Governance Framework for Large Language Models in Enterprise AI and Analytics
by Darshan Desai and Ashish Desai
Analytics 2026, 5(1), 8; https://doi.org/10.3390/analytics5010008 - 11 Jan 2026
Viewed by 1649
Abstract
Large language models (LLMs) and other foundation models are rapidly being woven into enterprise analytics workflows, where they assist with data exploration, forecasting, decision support, and automation. These systems can feel like powerful new teammates: creative, scalable, and tireless. Yet they also introduce distinctive risks related to opacity, brittleness, bias, and misalignment with organizational goals. Existing work on AI ethics, alignment, and governance provides valuable principles and technical safeguards, but enterprises still lack practical frameworks that connect these ideas to the specific metrics, controls, and workflows by which analytics teams design, deploy, and monitor LLM-powered systems. This paper proposes a conceptual governance framework for enterprise AI and analytics that is explicitly centered on LLMs embedded in analytics pipelines. The framework adopts a three-layered perspective—model and data alignment, system and workflow alignment, and ecosystem and governance alignment—that links technical properties of models to enterprise analytics practices, performance indicators, and oversight mechanisms. In practical terms, the framework shows how model and workflow choices translate into concrete metrics and inform real deployment, monitoring, and scaling decisions for LLM-powered analytics. We also illustrate how this framework can guide the design of controls for metrics, monitoring, human-in-the-loop structures, and incident response in LLM-driven analytics. The paper concludes with implications for analytics leaders and governance teams seeking to operationalize responsible, scalable use of LLMs in enterprise settings. Full article

35 pages, 3498 KB  
Article
PSYCH—Psychometric Assessment of Large Language Model Characters: An Exploration of the German Language
by Nane Kratzke, Niklas Beuter, André Drews and Monique Janneck
Analytics 2026, 5(1), 5; https://doi.org/10.3390/analytics5010005 - 6 Jan 2026
Viewed by 1637
Abstract
Background: Existing evaluations of large language models (LLMs) largely emphasize linguistic and factual performance, while their psychometric characteristics and behavioral biases remain insufficiently examined, particularly beyond English-language contexts. This study presents a systematic psychometric screening of LLMs in German using the validated Big Five Inventory-2 (BFI-2). Methods: Thirty-two contemporary commercial and open-source LLMs completed all 60 BFI-2 items 60 times each (once with and once without having to justify their answers), yielding over 330,000 responses. Models answered independently, under male and female impersonation, and with and without required justifications. Responses were compared to German human reference data using Welch’s t-tests (p<0.01) to assess deviations, response stability, justification effects, and gender differences. Results: At the domain level, LLM personality profiles broadly align with human means. Facet-level analyses, however, reveal systematic deviations, including inflated agreement—especially in Agreeableness and Aesthetic Sensitivity—and reduced Negative Emotionality. Only a few models show minimal deviations. Justification prompts significantly altered responses in 56% of models, often increasing variability. Commercial models exhibited substantially higher response stability than open-source models. Gender impersonation affected up to 25% of BFI-2 items, reflecting and occasionally amplifying human gender differences. Conclusions: This study introduces a reproducible psychometric framework for benchmarking LLM behavior against validated human norms and shows that LLMs produce stable yet systematically biased personality-like response patterns. Psychometric screening could therefore complement traditional LLM evaluation in sensitive applications. Full article
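The statistical comparison the abstract describes can be sketched in a few lines. The following is a minimal illustration only: the model responses and human reference statistics are invented placeholders, not the study's BFI-2 data or German norm values, and the two-sided p-value uses a large-sample normal approximation rather than the exact t-distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def welch_t(sample, ref_mean, ref_sd, ref_n):
    """Welch's t-test of a sample against published summary statistics."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    se = sqrt(s**2 / n + ref_sd**2 / ref_n)
    t = (m - ref_mean) / se
    # Welch-Satterthwaite degrees of freedom
    df = se**4 / ((s**2 / n)**2 / (n - 1) + (ref_sd**2 / ref_n)**2 / (ref_n - 1))
    # Large-sample normal approximation to the two-sided p-value
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p

# Hypothetical data: one model's 60 repeated Likert answers (1-5) to a facet,
# compared against invented human reference statistics (not the German norms).
llm_scores = [4, 5, 4, 4, 5, 4] * 10
t, df, p = welch_t(llm_scores, ref_mean=3.6, ref_sd=0.8, ref_n=1000)
deviates = p < 0.01  # the study's significance level
```

Comparing against summary statistics (mean, SD, n) rather than raw human data is what makes Welch's unequal-variance form the natural choice here.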

20 pages, 602 KB  
Article
A Threshold Selection Method in Code Plagiarism Checking Function for Code Writing Problem in Java Programming Learning Assistant System Considering AI-Generated Codes
by Perwira Annissa Dyah Permatasari, Mustika Mentari, Safira Adine Kinari, Soe Thandar Aung, Nobuo Funabiki, Htoo Htoo Sandi Kyaw and Khaing Hsu Wai
Analytics 2026, 5(1), 2; https://doi.org/10.3390/analytics5010002 - 26 Dec 2025
Viewed by 1010
Abstract
To support novice learners, the Java programming learning assistant system (JPLAS) has been developed with various features. Among them, the code writing problem (CWP) asks students to write an answer code that passes a given test code. The correctness of an answer code is validated by running it on JUnit. In previous works, we implemented a code plagiarism checking function that calculates the similarity score for each pair of answer codes based on the Levenshtein distance. When the score is higher than a given threshold, the pair is regarded as plagiarism. However, a method for finding the proper threshold has not been studied. In addition, as generative AI has grown in popularity, AI-generated codes have become a plagiarism threat that should be investigated. In this paper, we propose a threshold selection method based on Tukey's IQR fences. It uses a custom upper threshold derived from the statistical distribution of similarity scores for each assignment. To better accommodate skewed similarity distributions, the method introduces a simple percentile-based adjustment for determining the upper threshold. We also design prompts to generate answer codes using generative AI and apply them to four AI models. For evaluation, we used a total of 745 source codes from two datasets. The first dataset consists of 420 answer codes across 12 CWP instances from 35 first-year undergraduate students at the State Polytechnic of Malang, Indonesia (POLINEMA). The second dataset includes 325 answer codes across five CWP assignments from 65 third-year undergraduate students at Okayama University, Japan. Applying our proposals, we found the following: (1) any pair of student codes whose score is higher than the selected threshold shows some evidence of plagiarism, (2) some student codes have a higher similarity than the threshold with AI-generated codes, indicating the use of generative AI, and (3) multiple AI models can generate code that resembles student-written code, despite adopting different implementations. These results confirm the validity of our proposal. Full article
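The core pipeline the abstract describes (pairwise Levenshtein similarity, then a per-assignment Tukey upper fence as the plagiarism threshold) can be sketched as below. This is a minimal illustration under stated assumptions: the answer-code snippets and the assignment's score distribution are made up, and the paper's percentile-based adjustment for skewed distributions is omitted.

```python
from statistics import quantiles

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def similarity(a, b):
    """Normalized similarity score in [0, 1]."""
    if not a and not b:
        return 1.0
    return 1 - levenshtein(a, b) / max(len(a), len(b))

def tukey_upper_fence(scores, k=1.5):
    """Per-assignment threshold: Q3 + k * IQR of the similarity scores."""
    q1, _, q3 = quantiles(scores, n=4)
    return q3 + k * (q3 - q1)

# Toy example: one near-identical pair of answer codes, judged against a
# hypothetical distribution of pairwise scores for the same assignment.
sim = similarity("int sum(int a,int b){return a+b;}",
                 "int sum(int a, int b){ return a+b; }")
scores = [0.35, 0.40, 0.42, 0.45, 0.48, 0.50, 0.52, 0.55]  # invented
fence = tukey_upper_fence(scores)
is_plagiarism = sim > fence
```

Deriving the fence from each assignment's own score distribution is what makes the threshold adaptive: assignments whose solutions naturally converge get a higher fence than open-ended ones.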
