  • Indexed in Scopus
  • Time to First Decision: 21 days

Analytics

Analytics is an international, peer-reviewed, open access journal on methodologies, technologies, and applications of analytics, published quarterly online by MDPI.

All Articles (132)

Large language models (LLMs) and other foundation models are rapidly being woven into enterprise analytics workflows, where they assist with data exploration, forecasting, decision support, and automation. These systems can feel like powerful new teammates: creative, scalable, and tireless. Yet they also introduce distinctive risks related to opacity, brittleness, bias, and misalignment with organizational goals. Existing work on AI ethics, alignment, and governance provides valuable principles and technical safeguards, but enterprises still lack practical frameworks that connect these ideas to the specific metrics, controls, and workflows by which analytics teams design, deploy, and monitor LLM-powered systems. This paper proposes a conceptual governance framework for enterprise AI and analytics that is explicitly centered on LLMs embedded in analytics pipelines. The framework adopts a three-layered perspective—model and data alignment, system and workflow alignment, and ecosystem and governance alignment—that links technical properties of models to enterprise analytics practices, performance indicators, and oversight mechanisms. In practical terms, the framework shows how model and workflow choices translate into concrete metrics and inform real deployment, monitoring, and scaling decisions for LLM-powered analytics. We also illustrate how this framework can guide the design of controls for metrics, monitoring, human-in-the-loop structures, and incident response in LLM-driven analytics. The paper concludes with implications for analytics leaders and governance teams seeking to operationalize responsible, scalable use of LLMs in enterprise settings.

11 January 2026

Three-Layer Governance Framework for LLM-Powered Enterprise Analytics.
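
The framework in this abstract is conceptual, but one of the controls it names, human-in-the-loop review backed by monitoring metrics, is easy to picture in code. Below is a minimal illustrative sketch in Python of a confidence-based review gate; the class, threshold, and confidence score are hypothetical assumptions for illustration, not anything specified by the paper.

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class LLMOutput:
    text: str
    confidence: float        # assumed verifier- or self-reported score

@dataclass
class ReviewGate:
    """Route low-confidence LLM analytics outputs to a human reviewer."""
    threshold: float = 0.8
    audit_log: List[dict] = field(default_factory=list)

    def route(self, output: LLMOutput,
              human_review: Callable[[str], str]) -> str:
        needs_review = output.confidence < self.threshold
        final = human_review(output.text) if needs_review else output.text
        # Log each decision so governance teams can monitor escalations.
        self.audit_log.append({"confidence": output.confidence,
                               "escalated": needs_review})
        return final

    def escalation_rate(self) -> float:
        if not self.audit_log:
            return 0.0
        return sum(e["escalated"] for e in self.audit_log) / len(self.audit_log)

gate = ReviewGate(threshold=0.8)
reviewer = lambda text: text + " [human-reviewed]"
print(gate.route(LLMOutput("Q3 demand forecast: +4%", 0.65), reviewer))
print(f"escalation rate: {gate.escalation_rate():.0%}")

An escalation rate tracked this way is one example of how a workflow-level control can turn into the kind of concrete governance metric the framework links to deployment and monitoring decisions.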

Environmental, social, and governance (ESG) metrics increasingly inform sustainable investment, yet they suffer from inter-rater heterogeneity and incomplete reporting, limiting their utility for forward-looking allocation. In this study, we developed and validated a two-level stacked-ensemble machine-learning framework to predict total ESG risk scores for S&P 500 firms using a comprehensive feature set comprising pillar sub-scores, controversy measures, firm financials, categorical descriptors, and geospatial environmental indicators. Data pre-processing combined median/mean imputation, one-hot encoding, normalization, and rigorous feature engineering; models were trained with an 80:20 train–test split, and hyperparameters were tuned by k-fold cross-validation. The stacked ensemble substantially outperformed single-model baselines (RMSE = 1.006, MAE = 0.664, MAPE = 3.13%, R² = 0.979, mean CV RMSE = 1.383, mean CV R² = 0.957), with LightGBM and gradient boosting as competitive comparators. Permutation importance and correlation analysis identified environmental and social components as the primary drivers (environmental importance = 0.41; social = 0.32), with potential multicollinearity between component and aggregate scores. This study concludes that ensemble-based predictive analytics can produce reliable, actionable ESG estimates to enhance screening and prioritization in sustainable investment, while recommending human review of extreme predictions and further work to harmonize cross-provider score divergence.

9 January 2026

ML-based ESG prediction framework.
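
As an illustration of the modeling pattern this abstract describes, here is a minimal two-level stacking sketch in Python with scikit-learn. The base learners, synthetic data, and hyperparameters are placeholders, not the authors' configuration; only the 80:20 split and k-fold cross-validation mirror the stated protocol.

from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Stand-in for the ESG feature matrix (pillar sub-scores, financials, etc.).
X, y = make_regression(n_samples=500, n_features=20, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)   # the paper's 80:20 split

# Level-1 base learners feed out-of-fold predictions to a level-2 meta-model.
stack = StackingRegressor(
    estimators=[
        ("gbm", GradientBoostingRegressor(random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
    ],
    final_estimator=Ridge(),
    cv=5,                                    # k-fold CV, as in the protocol
)
stack.fit(X_train, y_train)
pred = stack.predict(X_test)

rmse = mean_squared_error(y_test, pred) ** 0.5
print(f"RMSE={rmse:.3f}  MAE={mean_absolute_error(y_test, pred):.3f}  "
      f"R2={r2_score(y_test, pred):.3f}")

In the same spirit, sklearn.inspection.permutation_importance could be applied to the fitted stack to mirror the paper's driver analysis.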

Many fluid-injection sequences display burst-like seismicity with approximate power-law event-size distributions whose exponents drift between catalogs. Classical percolation models instead predict fixed, dimension-dependent exponents and do not specify which geometric mechanisms could underlie such b-value variability. We address this gap using two loopless invasion percolation variants, the constrained Leath invasion percolation (CLIP) and avalanche invasion percolation (AIP) models, to generate synthetic burst catalogs and quantify how burst geometry modifies size–frequency statistics. For each model we measure burst-size distributions and an interference fraction, defined as the proportion of attempted growth steps that terminate on previously activated bonds. Single-burst clusters recover the Fisher exponent of classical percolation, whereas multi-burst sequences show a systematic, dimension-dependent drift of the effective exponent with burst number; this drift is strongly correlated with the interference fraction. CLIP and AIP are indistinguishable under these diagnostics, indicating that interference-driven exponent drift is a generic feature of burst growth rather than a model-specific artifact. Mapping the size-distribution exponent to an equivalent Gutenberg–Richter b-value shows that increasing interference suppresses large bursts and produces b-value ranges comparable to those reported for injection-induced seismicity, supporting the interpretation of interference as a geometric proxy for mechanical inhibition that limits the growth of large events in real fracture networks.

6 January 2026

Example multi-burst cluster generated by the CLIP model on a two-dimensional lattice. Colored symbols indicate successive bursts produced as the invasion front repeatedly reaches a breakthrough site: the first burst (black), second burst (blue), and third burst (red). The yellow star marks the initial seed site. This construction defines the burst ensembles used in the subsequent scaling and interference analyses.
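
To make the interference fraction concrete, the toy Python sketch below grows successive invasion-percolation bursts on a small 2D site lattice and counts growth attempts that terminate on sites activated by earlier bursts. It is a simplified site-based stand-in, not the CLIP or AIP algorithms from the paper; the lattice size, burst count, and burst size are arbitrary assumptions.

import heapq
import random

L = 64                # lattice side length (arbitrary)
N_BURSTS = 3          # number of successive bursts, as in the figure
BURST_SIZE = 300      # target sites invaded per burst (arbitrary)

random.seed(1)
# Random invasion thresholds on lattice sites (site-based simplification).
threshold = {(x, y): random.random() for x in range(L) for y in range(L)}

def neighbors(x, y):
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < L and 0 <= ny < L:
            yield nx, ny

activated = set()     # sites claimed by all previous bursts
attempts = 0
interference = 0

for burst in range(N_BURSTS):
    if activated:
        # Restart from the hull of the activated cluster so growth is possible.
        hull = [s for s in sorted(activated)
                if any(n not in activated for n in neighbors(*s))]
        start = random.choice(hull)
    else:
        start = (L // 2, L // 2)            # seed site, like the yellow star
    cluster = {start}
    frontier = [(threshold[n], n) for n in neighbors(*start)]
    heapq.heapify(frontier)
    while len(cluster) < BURST_SIZE and frontier:
        _, site = heapq.heappop(frontier)   # invade the weakest frontier site
        if site in cluster:
            continue                        # stale duplicate frontier entry
        attempts += 1
        if site in activated:
            interference += 1               # step terminates on an earlier burst
            continue
        cluster.add(site)
        for n in neighbors(*site):
            if n not in cluster:
                heapq.heappush(frontier, (threshold[n], n))
    activated |= cluster
    print(f"burst {burst + 1}: size {len(cluster)}")

print(f"interference fraction: {interference / attempts:.3f}")

In this toy version the first burst grows unimpeded, while later bursts increasingly collide with earlier ones, which is the qualitative mechanism the abstract links to suppressed large bursts and drifting exponents.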

Background: Existing evaluations of large language models (LLMs) largely emphasize linguistic and factual performance, while their psychometric characteristics and behavioral biases remain insufficiently examined, particularly beyond English-language contexts. This study presents a systematic psychometric screening of LLMs in German using the validated Big Five Inventory-2 (BFI-2). Methods: Thirty-two contemporary commercial and open-source LLMs completed all 60 BFI-2 items 60 times each (both with and without having to justify their answers), yielding over 330,000 responses. Models answered independently, under male and female impersonation, and with and without required justifications. Responses were compared to German human reference data using Welch's t-tests (p < 0.01) to assess deviations, response stability, justification effects, and gender differences. Results: At the domain level, LLM personality profiles broadly align with human means. Facet-level analyses, however, reveal systematic deviations, including inflated agreement (especially in Agreeableness and Aesthetic Sensitivity) and reduced Negative Emotionality. Only a few models show minimal deviations. Justification prompts significantly altered responses in 56% of the models, often increasing variability. Commercial models exhibited substantially higher response stability than open-source models. Gender impersonation affected up to 25% of BFI-2 items, reflecting and occasionally amplifying human gender differences. Conclusions: This study introduces a reproducible psychometric framework for benchmarking LLM behavior against validated human norms and shows that LLMs produce stable yet systematically biased personality-like response patterns. Psychometric screening could therefore complement traditional LLM evaluation in sensitive applications.

6 January 2026

Comparison of personality profiles across language model providers based on the BFI-2 model. Each radar plot illustrates the average scores of models within each provider group. The red lines indicate female humans, while the blue lines represent male humans (Ground Truth).
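
The core statistical comparison in this study, Welch's t-test between repeated LLM responses and human reference data at p < 0.01, can be sketched in a few lines of Python. The data below are simulated for illustration only; they are not the study's responses or norms.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical 1-5 Likert responses: 60 repeated LLM answers to one BFI-2
# item versus a human reference sample for the same item.
llm_responses = rng.integers(4, 6, size=60)      # LLM skews toward agreement
human_responses = rng.integers(1, 6, size=1000)  # broader human spread

# Welch's t-test (unequal variances), as used in the paper, at p < 0.01.
t_stat, p_value = stats.ttest_ind(llm_responses, human_responses,
                                  equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.2e}")
if p_value < 0.01:
    print("systematic deviation from the human reference on this item")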



Analytics - ISSN 2813-2203