Journal Description
Analytics is an international, peer-reviewed, open access journal on methodologies, technologies, and applications of analytics, published quarterly online by MDPI.
- Open Access: free for readers, with article processing charges (APCs) paid by authors or their institutions.
- High Visibility: indexed within Scopus and other databases.
- Rapid Publication: manuscripts are peer-reviewed, with a first decision provided to authors approximately 20.6 days after submission; acceptance to publication takes 5.5 days (median values for papers published in this journal in the second half of 2025).
- Recognition of Reviewers: APC discount vouchers, optional signed peer review, and reviewer names published annually in the journal.
- Analytics is a companion journal of Mathematics.
- Journal Cluster of Information Systems and Technology: Analytics, Applied System Innovation, Cryptography, Data, Digital, Informatics, Information, Journal of Cybersecurity and Privacy, and Multimedia.
Latest Articles
Impacting Brand Awareness and Emotions in Retail Consumer Decision-Making Within a Digital Context
Analytics 2026, 5(2), 16; https://doi.org/10.3390/analytics5020016 - 30 Mar 2026
Abstract
This study explores the intricate behavioral and psychological dynamics of how certain elements—color, price, gender differences, and the concept of the frequency illusion—affect emotions, brand awareness, and consumer decision-making in a digital environment. Going beyond conventional analyses, this study also explores the intersection of sustainable business practices, elucidating the potential for ethical, environmentally conscious, and business-sustainable decision-making. Utilizing a quantitative method and survey data from 207 respondents, this research contributes to a more profound level of understanding of consumer decision-making in the Lebanese retail sector, offering strategic insights for organizations seeking to enhance brand recognition, while aligning with responsible and sustainable practices in today’s dynamic and competitive environment. The study found that psychological cues—color, price, gender differences, and frequency illusion—significantly influence emotions, brand awareness, and consumer decision-making in retail. Future research should examine the tensions in consumer decision-making, where brand awareness and emotional cues can simultaneously facilitate and bias choices, with effects contingent on exposure, demographic characteristics, digital fluency, and cultural context.
Full article
Open Access Article
Visualizing the Machine Learning Process in Multichannel Time Series Classification
by
Edgar Acuña and Roxana Aparicio
Analytics 2026, 5(1), 15; https://doi.org/10.3390/analytics5010015 - 12 Mar 2026
Abstract
This paper uses visualization techniques to analyze the learning process of six machine learning classifiers for multichannel time series classification (MTSC), including five deep learning models—1D CNN, CNN-LSTM, ResNet, InceptionTime, and Transformer—and one non-deep learning method, ROCKET. Sixteen datasets from the University of East Anglia (UEA) multivariate time series repository were employed to assess and compare classifier performance. To explore how data characteristics influence accuracy, we applied channel selection, feature selection, and similarity analysis between training and testing sets. Visualization techniques were used to examine the temporal and structural patterns of each dataset, offering insight into how feature relevance, channel informativeness, and group separability affect model performance. The experimental results show that ROCKET achieves the most consistent accuracy across datasets, although its performance decreases with a very large number of channels. Conversely, the Transformer model underperforms in datasets with limited training instances per class. Overall, the findings highlight the importance of visual exploration in understanding MTSC behavior and indicate that channel relevance and data separability have a greater impact on classification accuracy than feature-level patterns.
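For readers unfamiliar with ROCKET, the non-deep-learning classifier evaluated in this paper, its core idea can be sketched in a few lines: convolve the series with many random kernels and pool two simple statistics per kernel. The snippet below is an illustrative single-channel simplification with hypothetical parameters; the real ROCKET transform also randomizes dilation and padding and uses thousands of kernels.

```python
import numpy as np

rng = np.random.default_rng(0)

def rocket_features(x, n_kernels=100):
    """ROCKET-style features: random convolution kernels, then two pooled
    statistics per kernel (max and proportion of positive values)."""
    feats = []
    for _ in range(n_kernels):
        length = rng.choice([7, 9, 11])
        w = rng.normal(size=length)
        w -= w.mean()               # zero-mean kernel, as in ROCKET
        b = rng.uniform(-1, 1)      # random bias
        conv = np.convolve(x, w, mode="valid") + b
        feats += [conv.max(), (conv > 0).mean()]
    return np.array(feats)

x = np.sin(np.linspace(0, 6 * np.pi, 300))  # toy univariate series
f = rocket_features(x)                       # 2 features per kernel
```

In the full method, these features feed a simple linear classifier, which is why ROCKET stays fast even on datasets where deep models struggle.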
Full article
Open Access Article
A Decade of Evolution: Evaluating Student Preferences for Degree Selection in the Spanish Public University System Through Directional Community Analysis (2014–2023)
by
José-Miguel Montañana, Antonio Hervás and Pedro-Pablo Soriano-Jiménez
Analytics 2026, 5(1), 14; https://doi.org/10.3390/analytics5010014 - 11 Mar 2026
Abstract
The Spanish Public University System (SUPE) assigns student placements through a multi-step application process governed by legal criteria. Analyzing how students move between different degree programs during this process is crucial for universities to optimize and plan their academic offerings. This paper analyzes a decade of student pre-registration data (2014–2023) to track evolving preferences and mobility between degrees. We model this process as a directed graph, mapping student traffic and studying the formation of directional communities within the degree network. A significant challenge is the weakly connected and poorly conditioned nature of these graphs, which impedes standard community detection algorithms. Extending prior work that relied on manually set thresholds for pruning edges, we propose a novel adaptive pruning algorithm that requires no manual intervention. Applying this method to annual data improves community detection performance and reveals gradual shifts in student preferences and demand, particularly in response to new degrees. These insights provide a valuable decision-making tool for higher education institutions, helping them refine their degree offerings in response to evolving trends.
Full article
Open Access Article
Distributed Orders Management in Make-to-Order Supply Chain Networks Using Game-Based Alternating Direction Method of Multipliers
by
Amirhosein Gholami, Nasim Nezamoddini and Mohammad T. Khasawneh
Analytics 2026, 5(1), 13; https://doi.org/10.3390/analytics5010013 - 9 Mar 2026
Abstract
Operations scheduling of mass-customized products is vital in modern make-to-order (MTO) supply chains. In these systems, order acceptance decisions should be coordinated with the available capacity in different sections of the supply chain while considering their potential correlations and interactions. One of the fundamental challenges in optimizing these systems is the computation time required to solve models with multiple coupling constraints between supply chain units. This paper addresses this issue by proposing a game-based framework that decomposes the underlying mixed-integer programming model, which is then coordinated and solved using an integrated game-based Alternating Direction Method of Multipliers (ADMM). The proposed Stackelberg leader–follower game optimizes order acceptance decisions while considering the requirements of the supply, production planning, maintenance, inventory, and distribution units. To validate the efficiency of the proposed framework, the model is tested on a simulated four-layer supply chain. The experimental results show that decomposing the model into smaller subsections and solving it in a distributed manner not only optimizes the participating supply chain units but also coordinates their actions to reach the globally optimal solution. The proposed framework offers managers a practical decision layer that preserves the local autonomy of supply chain units and reduces their data-sharing and computation burdens and concerns.
Full article
Open Access Article
Operationalising CTT and IRT in Spreadsheets: A Methodological Demonstration for Classroom Assessment
by
António Faria and Guilhermina Lobato Miranda
Analytics 2026, 5(1), 12; https://doi.org/10.3390/analytics5010012 - 24 Feb 2026
Abstract
The evaluation of student performance often relies on basic spreadsheet outputs that provide limited insight into item functioning. This study presents a methodological demonstration showing how widely available spreadsheet software can be transformed into a practical environment for psychometric analysis. Using a simulated dataset of 40 students responding to 20 dichotomous items, spreadsheet formulas were developed to compute descriptive statistics and Classical Test Theory (CTT) indices, including item difficulty, discrimination, and corrected item–total correlations. The demonstration was extended to Item Response Theory (IRT) through the implementation of 1PL, 2PL, and 3PL logistic models using forward-calculated item parameters. A smaller dataset of 10 students and 10 items was used to illustrate the interpretability of the indices and the generation of Item Characteristic Curves (ICCs). Results show that spreadsheets can support teachers in interpreting test data beyond total scores, enabling the identification of weak items, refinement of distractors, and construction of small-scale item banks aligned with competence-based curricula. The approach contributes to Sustainable Development Goal 4 (SDG 4) by promoting accessible, equitable, and high-quality assessment practices. Limitations include the instability of IRT parameter estimation in small samples and the need for teacher training. Future research should apply the approach to real classroom data, explore automation within spreadsheet environments, and examine the integration of artificial intelligence for adaptive assessment.
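To make the CTT indices concrete, the sketch below computes item difficulty, an upper–lower discrimination index, and corrected item–total correlations in Python rather than spreadsheet formulas. The response matrix is simulated from a simple logistic model; all names and the data are illustrative, not the authors' dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated responses: 40 students x 20 dichotomous items (1 = correct),
# generated from a basic ability-minus-difficulty logistic model.
ability = rng.normal(size=(40, 1))
difficulty = rng.normal(size=(1, 20))
p_correct = 1 / (1 + np.exp(-(ability - difficulty)))
responses = (rng.random((40, 20)) < p_correct).astype(int)

# Item difficulty: proportion of students answering each item correctly.
p = responses.mean(axis=0)

# Discrimination index: difficulty gap between top and bottom 27% scorers.
totals = responses.sum(axis=1)
order = np.argsort(totals)
k = int(round(0.27 * len(totals)))  # 11 of 40 students per group
D = responses[order[-k:]].mean(axis=0) - responses[order[:k]].mean(axis=0)

# Corrected item-total correlation: correlate each item with the total
# score excluding that item, avoiding part-whole inflation.
r_corrected = np.array([
    np.corrcoef(responses[:, j], totals - responses[:, j])[0, 1]
    for j in range(responses.shape[1])
])
```

Each of these quantities maps directly onto a spreadsheet formula (AVERAGE for difficulty, CORREL against the corrected total), which is the translation the paper demonstrates.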
Full article
Open Access Article
Integrating Deep Learning Nodes into an Augmented Decision Tree for Automated Medical Coding
by
Spoorthi Bhat, Veda Sahaja Bandi, Haiping Xu and Joshua Carberry
Analytics 2026, 5(1), 11; https://doi.org/10.3390/analytics5010011 - 12 Feb 2026
Abstract
Accurate assignment of International Classification of Diseases (ICD) codes is essential for healthcare analytics, billing, and clinical research. However, manual coding remains time-consuming and error-prone due to the scale and complexity of the ICD taxonomy. While hierarchical deep learning approaches have improved automated coding, their deployment across large taxonomies raises scalability and efficiency concerns. To address these limitations, we introduce the Augmented Decision Tree (ADT) framework, which integrates deep learning with symbolic rule-based logic for automated medical coding. ADT employs an automated lexical screening mechanism to dynamically select the most appropriate modeling strategy for each decision node, thereby minimizing manual configuration. Nodes with high keyword distinctiveness are handled by symbolic rules, while semantically ambiguous nodes are assigned to deep contextual models fine-tuned from PubMedBERT. This selective design eliminates the need to train a deep learning model at every node, significantly reducing computational cost. A case study demonstrates that this hybrid and adaptive ADT approach supports scalable and efficient ICD coding. Experimental results show that ADT outperforms a pure decision tree baseline and achieves accuracy comparable to that of a full deep learning-based decision tree, while requiring substantially less training time and computational resources.
Full article
Open Access Article
Site Selection for Solar Photovoltaic Power Plant Using MCDM Method with New De-i-Fuzzification Technique
by
Kamal Hossain Gazi, Asesh Kumar Mukherjee, Shashi Bajaj Mukherjee, Sankar Prasad Mondal, Soheil Salahshour and Arijit Ghosh
Analytics 2026, 5(1), 10; https://doi.org/10.3390/analytics5010010 - 9 Feb 2026
Abstract
Choosing sites for solar photovoltaic (PV) power plants in developing countries like India is a crucial task while considering multiple conflicting factors and sub-factors simultaneously. Multi-criteria decision-making (MCDM) is an optimisation method that provides a framework for handling such situations in an intuitionistic fuzzy environment. The complexity and uncertainty associated with the site selection model are thus handled systematically. The Criteria Importance Through Intercriteria Correlation (CRITIC) method is applied to determine the relative importance of the criteria, identifying airflow speed as the most influential factor, followed by humidity ratio, level of dust haze, availability of labour and resources, and ecological effects. This shows that airflow speed plays an important role in the power plant’s efficiency and performance. The Vlse Kriterijumska Optimizacija I Kompromisno Rešenje (VIKOR) method is then used to prioritise the alternatives as potential locations for setting up a solar PV power plant in India. A new de-i-fuzzification method based on the relative difference between two real numbers is also proposed. Sensitivity analyses and comparative studies are conducted to assess the robustness and effectiveness of the framework. Overall, the results demonstrate that the proposed framework is useful and effective for optimising site selection for solar power plants in India.
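As an illustration of how CRITIC derives objective criteria weights, the sketch below applies the standard formulation (weight proportional to a criterion's standard deviation times its summed correlation distance from the other criteria) to a random decision matrix. The matrix, site count, and criteria stand-ins are hypothetical, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical normalized decision matrix: 6 candidate sites x 5 criteria
# (stand-ins for airflow speed, humidity ratio, dust haze, labour, ecology),
# with higher values assumed to be better.
X = rng.random((6, 5))

# CRITIC: weight each criterion by its contrast intensity (standard
# deviation) times its conflict with the other criteria (summed 1 - r).
std = X.std(axis=0, ddof=1)
corr = np.corrcoef(X, rowvar=False)
info = std * (1 - corr).sum(axis=0)
weights = info / info.sum()
```

A criterion that varies strongly across sites and carries information not duplicated by the other criteria thus receives a larger weight, which is how airflow speed could come to dominate the ranking.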
Full article
(This article belongs to the Topic Data Intelligence and Computational Analytics)
Open Access Article
Denoising Stock Price Time Series with Singular Spectrum Analysis for Enhanced Deep Learning Forecasting
by
Carol Anne Hargreaves and Zixian Fan
Analytics 2026, 5(1), 9; https://doi.org/10.3390/analytics5010009 - 27 Jan 2026
Abstract
Aim: Stock price prediction remains a highly challenging task due to the complex and nonlinear nature of financial time series data. While deep learning (DL) has shown promise in capturing these nonlinear patterns, its effectiveness is often hindered by the low signal-to-noise ratio inherent in market data. This study aims to enhance the stock predictive performance and trading outcomes by integrating Singular Spectrum Analysis (SSA) with deep learning models for stock price forecasting and strategy development on the Australian Securities Exchange (ASX)50 index. Method: The proposed framework begins by applying SSA to decompose raw stock price time series into interpretable components, effectively isolating meaningful trends and eliminating noise. The denoised sequences are then used to train a suite of deep learning architectures, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and hybrid CNN-LSTM models. These models are evaluated based on their forecasting accuracy and the profitability of the trading strategies derived from their predictions. Results: Experimental results demonstrated that the SSA-DL framework significantly improved the prediction accuracy and trading performance compared to baseline DL models trained on raw data. The best-performing model, SSA-CNN-LSTM, achieved a Sharpe Ratio of 1.88 and a return on investment (ROI) of 67%, indicating robust risk-adjusted returns and effective exploitation of the underlying market conditions. Conclusions: The integration of Singular Spectrum Analysis with deep learning offers a powerful approach to stock price prediction in noisy financial environments. By denoising input data prior to model training, the SSA-DL framework enhanced signal clarity, improved forecast reliability, and enabled the construction of profitable trading strategies. These findings suggested a strong potential for SSA-based preprocessing in financial time series modeling.
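The SSA denoising step can be sketched in its basic form: embed the series in a trajectory matrix, take the SVD, keep the leading components, and diagonal-average back to a series. The window length, rank, and synthetic noisy sine below are illustrative choices, not the values used in the paper.

```python
import numpy as np

def ssa_denoise(x, window, rank):
    """Basic SSA: embed, SVD, truncate to `rank`, diagonal-average back."""
    n = len(x)
    k = n - window + 1
    # Trajectory (Hankel) matrix: lagged copies of the series as columns.
    X = np.column_stack([x[i:i + window] for i in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Rank-r reconstruction keeps the dominant trend/oscillation components.
    Xr = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Diagonal averaging (Hankelization) maps the matrix back to a series.
    out = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        for j in range(k):
            out[i + j] += Xr[i, j]
            counts[i + j] += 1
    return out / counts

t = np.linspace(0, 8 * np.pi, 400)
noisy = np.sin(t) + 0.4 * np.random.default_rng(1).normal(size=t.size)
smooth = ssa_denoise(noisy, window=60, rank=2)
```

In the paper's pipeline the denoised series, rather than the raw prices, becomes the training input for the CNN, LSTM, and CNN-LSTM models.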
Full article
Open Access Article
From Models to Metrics: A Governance Framework for Large Language Models in Enterprise AI and Analytics
by
Darshan Desai and Ashish Desai
Analytics 2026, 5(1), 8; https://doi.org/10.3390/analytics5010008 - 11 Jan 2026
Abstract
Large language models (LLMs) and other foundation models are rapidly being woven into enterprise analytics workflows, where they assist with data exploration, forecasting, decision support, and automation. These systems can feel like powerful new teammates: creative, scalable, and tireless. Yet they also introduce distinctive risks related to opacity, brittleness, bias, and misalignment with organizational goals. Existing work on AI ethics, alignment, and governance provides valuable principles and technical safeguards, but enterprises still lack practical frameworks that connect these ideas to the specific metrics, controls, and workflows by which analytics teams design, deploy, and monitor LLM-powered systems. This paper proposes a conceptual governance framework for enterprise AI and analytics that is explicitly centered on LLMs embedded in analytics pipelines. The framework adopts a three-layered perspective—model and data alignment, system and workflow alignment, and ecosystem and governance alignment—that links technical properties of models to enterprise analytics practices, performance indicators, and oversight mechanisms. In practical terms, the framework shows how model and workflow choices translate into concrete metrics and inform real deployment, monitoring, and scaling decisions for LLM-powered analytics. We also illustrate how this framework can guide the design of controls for metrics, monitoring, human-in-the-loop structures, and incident response in LLM-driven analytics. The paper concludes with implications for analytics leaders and governance teams seeking to operationalize responsible, scalable use of LLMs in enterprise settings.
Full article
(This article belongs to the Special Issue Critical Challenges in Large Language Models and Data Analytics: Trustworthiness, Scalability, and Societal Impact)
Open Access Article
Predicting ESG Scores Using Machine Learning for Data-Driven Sustainable Investment
by
Sanskruti Patel, Abhay Nath and Pranav Desai
Analytics 2026, 5(1), 7; https://doi.org/10.3390/analytics5010007 - 9 Jan 2026
Abstract
Environmental, social and governance (ESG) metrics increasingly inform sustainable investment yet suffer from inter-rater heterogeneity and incomplete reporting, limiting their utility for forward-looking allocation. In this study, we developed and validated a two-level stacked-ensemble machine-learning framework to predict total ESG risk scores for S&P 500 firms using a comprehensive feature set comprising pillar sub-scores, controversy measures, firm financials, categorical descriptors and geospatial environmental indicators. Data pre-processing combined median/mean imputation, one-hot encoding, normalization and rigorous feature engineering; models were trained with an 80:20 train–test split and hyperparameters tuned by k-fold cross-validation. The stacked ensemble substantially outperformed single-model baselines (RMSE = 1.006, MAE = 0.664, MAPE = 3.13%, R² = 0.979, CV_RMSE_Mean = 1.383, CV_R2_Mean = 0.957), with LightGBM and gradient boosting as competitive comparators. Permutation importance and correlation analysis identified environmental and social components as primary drivers (environmental importance = 0.41; social = 0.32), with potential multicollinearity between component and aggregate scores. This study concludes that ensemble-based predictive analytics can produce reliable, actionable ESG estimates to enhance screening and prioritization in sustainable investment, while recommending human review for extreme predictions and further work to harmonize cross-provider score divergence.
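The two-level stacking idea can be sketched with out-of-fold base predictions feeding a linear meta-model. Everything below (the synthetic data, the ridge base learners on split feature subsets, the fold count) is a simplified stand-in for the authors' pipeline, not their actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 8
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.5 * rng.normal(size=n)  # linear signal + noise

def ridge_fit(A, b, lam=1.0):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

# Level 1: two base models on different feature subsets; out-of-fold
# predictions keep the meta-model from seeing leaked targets.
subsets = [slice(0, 4), slice(4, 8)]
meta_X = np.zeros((n, len(subsets)))
for fold in np.array_split(np.arange(n), 5):
    train = np.setdiff1d(np.arange(n), fold)
    for m, cols in enumerate(subsets):
        w = ridge_fit(X[train][:, cols], y[train])
        meta_X[fold, m] = X[fold][:, cols] @ w

# Level 2: a linear meta-model combines the base predictions.
w_meta = ridge_fit(meta_X, y, lam=1e-3)
pred = meta_X @ w_meta
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
```

The same structure scales to gradient-boosted or LightGBM base learners: only the level-1 fit/predict calls change, while the out-of-fold bookkeeping stays identical.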
Full article
Open Access Article
Interference-Driven Scaling Variability in Burst-Based Loopless Invasion Percolation Models of Induced Seismicity
by
Ian Baughman and John B. Rundle
Analytics 2026, 5(1), 6; https://doi.org/10.3390/analytics5010006 - 6 Jan 2026
Abstract
Many fluid-injection sequences display burst-like seismicity with approximate power-law event-size distributions whose exponents drift between catalogs. Classical percolation models instead predict fixed, dimension-dependent exponents and do not specify which geometric mechanisms could underlie such b-value variability. We address this gap using two loopless invasion percolation variants—the constrained Leath invasion percolation (CLIP) and avalanche invasion percolation (AIP) models—to generate synthetic burst catalogs and quantify how burst geometry modifies size–frequency statistics. For each model we measure burst-size distributions and an interference fraction, defined as the proportion of attempted growth steps that terminate on previously activated bonds. Single-burst clusters recover the Fisher exponent of classical percolation, whereas multi-burst sequences show systematic, dimension-dependent drift of the effective exponent with a burst number that is strongly correlated with the interference fraction. CLIP and AIP are indistinguishable under these diagnostics, indicating that interference-driven exponent drift is a generic feature of burst growth rather than a model-specific artifact. Mapping the size-distribution exponent to an equivalent Gutenberg–Richter b-value shows that increasing interference suppresses large bursts and produces b value ranges comparable to those reported for injection-induced seismicity, supporting the interpretation of interference as a geometric proxy for mechanical inhibition that limits the growth of large events in real fracture networks.
Full article
Open Access Article
PSYCH—Psychometric Assessment of Large Language Model Characters: An Exploration of the German Language
by
Nane Kratzke, Niklas Beuter, André Drews and Monique Janneck
Analytics 2026, 5(1), 5; https://doi.org/10.3390/analytics5010005 - 6 Jan 2026
Abstract
Background: Existing evaluations of large language models (LLMs) largely emphasize linguistic and factual performance, while their psychometric characteristics and behavioral biases remain insufficiently examined, particularly beyond English-language contexts. This study presents a systematic psychometric screening of LLMs in German using the validated Big Five Inventory-2 (BFI-2). Methods: Thirty-two contemporary commercial and open-source LLMs completed all 60 BFI-2 items 60 times each (once with and once without having to justify their answers), yielding over 330,000 responses. Models answered independently, under male and female impersonation, and with and without required justifications. Responses were compared to German human reference data using Welch’s t-tests to assess deviations, response stability, justification effects, and gender differences. Results: At the domain level, LLM personality profiles broadly align with human means. Facet-level analyses, however, reveal systematic deviations, including inflated agreement—especially in Agreeableness and Aesthetic Sensitivity—and reduced Negative Emotionality. Only a few models show minimal deviations. Justification prompts significantly altered responses in 56% of models, often increasing variability. Commercial models exhibited substantially higher response stability than open-source models. Gender impersonation affected up to 25% of BFI-2 items, reflecting and occasionally amplifying human gender differences. Conclusions: This study introduces a reproducible psychometric framework for benchmarking LLM behavior against validated human norms and shows that LLMs produce stable yet systematically biased personality-like response patterns. Psychometric screening could therefore complement traditional LLM evaluation in sensitive applications.
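The core statistical comparison, Welch's unequal-variance t-test between model responses and a human reference sample, can be sketched as follows. The sample sizes, means, and standard deviations below are invented for illustration and do not reflect the study's data.

```python
import numpy as np

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom (unequal variances)."""
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

rng = np.random.default_rng(0)
llm = rng.normal(4.1, 0.3, size=60)      # hypothetical facet scores from repeated runs
humans = rng.normal(3.6, 0.8, size=200)  # hypothetical human reference sample
t, df = welch_t(llm, humans)
```

Welch's variant is the appropriate choice here because, as the abstract notes, model response variances differ sharply from (and between) human samples, so pooling variances would be invalid.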
Full article
(This article belongs to the Special Issue Critical Challenges in Large Language Models and Data Analytics: Trustworthiness, Scalability, and Societal Impact)
Open Access Article
GSM: An Integrated GAM–SHAP–MCDA Framework for Stroke Risk Assessment
by
Rilwan Mustapha, Ashiribo Wusu, Olusola Olabanjo and Bamidele Adetunji
Analytics 2026, 5(1), 4; https://doi.org/10.3390/analytics5010004 - 29 Dec 2025
Abstract
This study proposes GSM, an interpretable and operational GAM-SHAP-MCDA framework for stroke risk stratification by integrating generalized additive models (GAMs), a point-based clinical scoring system, SHAP-based explainability, and multi-criteria decision analysis (MCDA). Using a publicly available dataset of individuals ( stroke prevalence), a GAM was fitted to capture nonlinear effects of key physiological predictors, including age, average blood glucose level, and body mass index (BMI), together with linear effects for hypertension, heart disease, and categorical covariates. The estimated smooth functions revealed strong age-related risk acceleration beyond 60 years, threshold behavior for glucose levels above approximately , and a non-monotonic BMI association with peak risk at moderate BMI ranges. In a comparative evaluation, the GAM achieved superior discrimination and calibration relative to classical logistic regression, with a mean AUC of versus and a lower Brier score ( vs. ). A calibration analysis yielded an intercept of and a slope of , indicating near-ideal agreement between the predicted and observed risks. While high-capacity ensemble models such as XGBoost achieved slightly higher AUC values ( ), the GAM attained near-upper-bound performance while retaining full interpretability. To enhance clinical usability, the GAM smooth effects were discretized into clinically interpretable bands and converted into an additive point-based risk score ranging from 0 to 42, which was subsequently calibrated to absolute stroke probability. The calibrated probabilities were incorporated into the TOPSIS and VIKOR MCDA frameworks, producing transparent and robust patient prioritization rankings. A SHAP analysis confirmed age, glucose, and cardiometabolic factors as dominant global contributors, aligning with the learned GAM structure. 
Overall, the proposed GAM–SHAP–MCDA framework demonstrates that near-state-of-the-art predictive performance can be achieved alongside transparency, calibration, and decision-oriented interpretability, supporting ethical and practical deployment of medical artificial intelligence for stroke risk assessment.
Full article
Open Access Article
Can Length Limit for App Titles Benefit Consumers?
by
Saori Chiba, Yu-Hsi Liu, Chien-Yuan Sher and Min-Hsueh Tsai
Analytics 2026, 5(1), 3; https://doi.org/10.3390/analytics5010003 - 29 Dec 2025
Abstract
The App Store introduced a title-length limit for mobile apps in 2016, and similar policies were later adopted across the industry. This issue drew considerable attention from industry practitioners in the 2010s. Using both empirical and theoretical approaches, this paper examines the effectiveness of this policy and its welfare implications. Title length became an issue because some sellers assemble meaningful keywords in the app title to convey information to consumers, while others combine irrelevant yet popular keywords in an attempt to increase their app’s downloads. We hypothesize that when titles are short, title length is positively associated with an app’s performance because both honest and opportunistic sellers coexist in the market. However, due to the presence of opportunistic sellers, once titles become too long, this positive relationship disappears. We examine this hypothesis using a random sample of 1998 apps from the App Store in 2015. Our results show that for apps with titles longer than 30 characters, title length remains positively associated with app performance. However, for titles exceeding 50 characters, we do not have sufficient evidence to conclude that further increases in length continue to generate additional downloads. To interpret our empirical findings, we construct communication games between an app seller and a consumer, in which the equilibrium is characterized by a threshold. Based on our model and empirical observations, the 30-character limit might hurt consumers.
Full article
Open Access Article
A Threshold Selection Method in Code Plagiarism Checking Function for Code Writing Problem in Java Programming Learning Assistant System Considering AI-Generated Codes
by
Perwira Annissa Dyah Permatasari, Mustika Mentari, Safira Adine Kinari, Soe Thandar Aung, Nobuo Funabiki, Htoo Htoo Sandi Kyaw and Khaing Hsu Wai
Analytics 2026, 5(1), 2; https://doi.org/10.3390/analytics5010002 - 26 Dec 2025
Abstract
To support novice learners, the Java programming learning assistant system (JPLAS) has been developed with various features. Among them, the code writing problem (CWP) asks students to write an answer code that passes a given test code. The correctness of an answer code is validated by running it on JUnit. In previous works, we implemented a code plagiarism checking function that calculates the similarity score for each pair of answer codes based on the Levenshtein distance. When the score is higher than a given threshold, the pair is regarded as plagiarism. However, a method for finding the proper threshold has not been studied. In addition, as generative AI has grown in popularity, AI-generated codes have become a plagiarism threat that should be investigated. In this paper, we propose a threshold selection method based on Tukey's IQR fences. It uses a custom upper threshold derived from the statistical distribution of similarity scores for each assignment. To better accommodate skewed similarity distributions, the method introduces a simple percentile-based adjustment for determining the upper threshold. We also design prompts to generate answer codes using generative AI and apply them to four AI models. For evaluation, we used a total of 745 source codes from two datasets. The first dataset consists of 420 answer codes across 12 CWP instances from 35 first-year undergraduate students at the State Polytechnic of Malang, Indonesia (POLINEMA). The second dataset includes 325 answer codes across five CWP assignments from 65 third-year undergraduate students at Okayama University, Japan. Applying our proposals, we found the following: (1) any pair of student codes whose score is higher than the selected threshold shows some evidence of plagiarism, (2) some student codes have a higher similarity than the threshold with AI-generated codes, indicating the use of generative AI, and (3) multiple AI models can generate code that resembles student-written code, despite adopting different implementations. These results confirm the validity of our proposal.
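The threshold-selection idea can be illustrated with a small sketch. Assuming pairwise similarity scores are already computed (e.g., from a normalized Levenshtein distance), Tukey's upper IQR fence flags outlying pairs; the function and the score data below are illustrative, not taken from the JPLAS implementation.

```python
# Illustrative sketch (not the paper's code): pick a plagiarism
# threshold as Tukey's upper IQR fence over the similarity scores
# of one assignment, then flag pairs scoring above it.
from statistics import quantiles

def iqr_upper_fence(scores, k=1.5):
    """Tukey's upper fence: Q3 + k * (Q3 - Q1)."""
    q1, _, q3 = quantiles(scores, n=4)
    return q3 + k * (q3 - q1)

# Made-up pairwise similarity scores for one assignment:
scores = [0.31, 0.35, 0.40, 0.42, 0.44, 0.47, 0.50, 0.52, 0.55, 0.93]
threshold = iqr_upper_fence(scores)
flagged = [s for s in scores if s > threshold]  # suspected plagiarism pairs
```

The paper additionally adjusts the fence with a percentile-based rule for skewed score distributions; that refinement is omitted here.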
Full article
(This article belongs to the Special Issue Critical Challenges in Large Language Models and Data Analytics: Trustworthiness, Scalability, and Societal Impact)
Open Access Article
A Novel Magnificent Frigatebird Optimization Algorithm with Proposed Movement Strategies for Enhanced Global Search
by
Glykeria Kyrou, Vasileios Charilogis and Ioannis G. Tsoulos
Analytics 2026, 5(1), 1; https://doi.org/10.3390/analytics5010001 - 23 Dec 2025
Abstract
Global optimization is a fundamental tool for addressing complex and nonlinear problems across scientific and technological domains. The primary objective of this work is to enhance the efficiency, stability, and convergence speed of the Magnificent Frigatebird Optimization (MFO) algorithm by introducing new strategies that strengthen both global exploration and local exploitation. To this end, we propose an improved version of MFO that incorporates three novel movement strategies (aggressive, conservative, and mixed), a BFGS-based local search procedure for more accurate solution refinement, and a dynamic termination criterion capable of detecting stagnation and reducing unnecessary function evaluations. The algorithm is extensively evaluated on a diverse set of benchmark functions, demonstrating substantially lower computational cost and higher reliability compared to classical evolutionary and swarm-based methods. The results confirm the effectiveness of the proposed modifications and highlight the potential of the enhanced MFO for application to demanding real-world optimization problems.
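One of the proposed ingredients, a dynamic termination criterion that detects stagnation, can be sketched as follows; the window size and tolerance are invented defaults, not the paper's settings.

```python
# Illustrative stagnation check for a dynamic termination criterion;
# `window` and `tol` below are made-up defaults, not the paper's.
def stagnated(best_history, window=20, tol=1e-8):
    """True if the best objective value recorded per iteration has not
    improved by more than tol over the last `window` iterations."""
    if len(best_history) <= window:
        return False  # not enough history yet
    return best_history[-window - 1] - min(best_history[-window:]) <= tol
```

An optimizer would call this once per generation and stop early when it returns True, saving the unnecessary function evaluations the abstract mentions.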
Full article
Open Access Article
Assessing the Impact of Capital Expenditure on Corporate Profitability in South Korea’s Electronics Industry: A Regression Analysis Approach
by
Bomee Park and Tetiana Paientko
Analytics 2025, 4(4), 36; https://doi.org/10.3390/analytics4040036 - 10 Dec 2025
Abstract
This study investigates the relationship between capital expenditure (CAPEX) and long-term corporate profitability in South Korea’s electronics industry. Using panel data from 126 listed electronics firms covering 2005–2019, the research applies fixed-effects regression analysis to examine how CAPEX influences profitability, measured by EBITDA/total assets. The results confirm that CAPEX exerts a positive and statistically significant long-term effect on profitability, with stronger but not significantly different impacts for large firms compared to SMEs. The findings contribute to empirical evidence on capital investment efficiency and the implications of economies and diseconomies of scale in capital-intensive industries.
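The fixed-effects ("within") estimator behind such panel regressions can be sketched for a single regressor: demean the variables inside each firm, then pool the demeaned data. The firms and numbers below are made up for illustration, not the study's dataset.

```python
# Illustrative one-regressor fixed-effects ("within") estimator.
from collections import defaultdict

def within_slope(panel):
    """panel: iterable of (firm_id, x, y) observations."""
    groups = defaultdict(list)
    for firm, x, y in panel:
        groups[firm].append((x, y))
    num = den = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - mean_x) * (y - mean_y)
            den += (x - mean_x) ** 2
    return num / den  # slope net of firm fixed effects

panel = [("A", 1, 2.0), ("A", 2, 2.9), ("A", 3, 4.1),
         ("B", 1, 5.1), ("B", 2, 6.0), ("B", 3, 7.0)]
beta = within_slope(panel)
```

Demeaning removes each firm's fixed effect, so the slope reflects only within-firm variation, the same logic the study applies to CAPEX and EBITDA/total assets.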
Full article
Open Access Article
Option Pricing in the Approach of Integrating Market Risk Premium: Application to OTM Options
by
David Liu
Analytics 2025, 4(4), 35; https://doi.org/10.3390/analytics4040035 - 21 Nov 2025
Abstract
In this research, we summarize the results of implementing the market risk premium into the option valuation formulas of the Black–Scholes–Merton model for out-of-the-money (OTM) options. We show that derivative prices can partly depend on systematic market risk, which the BSM model ignores by construction. Specifically, empirical studies are conducted using 50ETF options obtained from the Shanghai Stock Exchange, covering the periods from January 2018 to September 2022 and from December 2023 to October 2025. The pricing of the OTM options shows that the adjusted BSM formulas exhibit better pricing performance compared with the market prices of the OTM options tested. Furthermore, a framework for the empirical analysis of option prices based on the Capital Asset Pricing Model (CAPM) or factor models is discussed, which may lead to option formulas using non-homogeneous heat equations. The latter proposal requires further statistical testing using real market data but offers an alternative to the existing risk-neutral valuation of options.
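The paper's risk-premium adjustment is not reproduced here, but the baseline it modifies, the standard Black–Scholes–Merton European call price, can be written with only the standard library; the parameter values below are arbitrary examples of an OTM call.

```python
# Baseline (unadjusted) Black-Scholes-Merton European call price,
# standard library only; parameter values are arbitrary examples.
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bsm_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# An out-of-the-money call (strike above spot):
price = bsm_call(S=100.0, K=110.0, T=0.5, r=0.03, sigma=0.25)
```

Note that `r` here is the risk-free rate only; the paper's contribution is precisely to let a market risk premium enter formulas of this shape.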
Full article
Open Access Article
Fan Loyalty and Price Elasticity in Sport: Insights from Major League Baseball’s Post-Pandemic Recovery
by
Soojin Choi, Fang Zheng and Seung-Man Lee
Analytics 2025, 4(4), 34; https://doi.org/10.3390/analytics4040034 - 21 Nov 2025
Abstract
The COVID-19 pandemic disrupted traditional patterns of sport consumption, raising questions about whether fans would return to stadiums and how sensitive they would be to ticket prices in the recovery period. This study reconceptualizes ticket price elasticity as a market-based indicator of fan loyalty and applies it to Major League Baseball (MLB) during 2021–2023. Using team–season attendance data from Baseball-Reference, primary-market ticket prices from the Team Marketing Report Fan Cost Index, and secondary-market prices from TicketIQ, we estimate log–log fixed-effects panel models to separate causal price responses from popularity-driven correlations. The results show a strongly negative elasticity of attendance with respect to primary-market prices (β ≈ −7.93, p < 0.001), indicating that higher ticket prices substantially reduce attendance, while secondary-market prices are positively associated with attendance, reflecting demand shocks rather than causal effects. Heterogeneity analyses reveal that brand strength, team performance, and game salience significantly moderate elasticity, supporting the interpretation of inelastic demand as revealed loyalty. These findings highlight the potential of elasticity as a Fan Loyalty Index, providing a replicable framework for measuring consumer resilience. The study offers practical insights for pricing strategy, fan segmentation, and engagement, while emphasizing the broader social role of sport in restoring community identity during post-pandemic recovery.
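In a log–log demand model, the slope coefficient is the price elasticity: a slope of about −7.93 means a 1% price increase is associated with roughly a 7.93% drop in attendance. A minimal sketch of that reading, with invented numbers rather than the study's data:

```python
# The slope of a log-log demand model is the price elasticity.
# Numbers here are invented purely to illustrate how an elasticity
# of about -7.93 (as reported in the abstract) is read.
from math import log

def implied_elasticity(p1, q1, p2, q2):
    """Slope d(ln q) / d(ln p) between two (price, quantity) points."""
    return (log(q2) - log(q1)) / (log(p2) - log(p1))

# A 10% price rise paired with the attendance drop such a slope implies:
e = implied_elasticity(30.0, 25000.0, 33.0, 25000.0 * 1.1 ** -7.93)
```

The study's fixed-effects specification estimates this slope from team-season panel variation rather than from two raw points.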
Full article
Open Access Article
AI-Powered Chatbot for FDA Drug Labeling Information Retrieval: OpenAI GPT for Grounded Question Answering
by
Manasa Koppula, Fnu Madhulika, Navya Sreeramoju and Praveen Kolimi
Analytics 2025, 4(4), 33; https://doi.org/10.3390/analytics4040033 - 17 Nov 2025
Abstract
This study presents the development of an AI-powered chatbot designed to facilitate accurate and efficient retrieval of information from FDA drug labeling documents. Leveraging OpenAI’s GPT-3.5-turbo model within a controlled, document-grounded question–answering framework, the chatbot provides users with answers that are strictly limited to the content of the uploaded drug label, thereby minimizing hallucinations and enhancing traceability. A user-friendly interface built with Streamlit allows users to upload FDA labeling PDFs and pose natural language queries. The chatbot extracts relevant sections using PyMuPDF and regex-based segmentation and generates responses constrained to those sections. To evaluate performance, semantic similarity scores were computed between generated answers and ground truth text using Sentence Transformers. Results across 10 breast cancer drug labels demonstrate high semantic alignment, with most scores ranging from 0.7 to 0.9, indicating reliable summarization and contextual fidelity. The chatbot achieved high semantic similarity scores (≥0.95 for concise sections) and ROUGE scores, confirming strong semantic and textual alignment. Comparative analysis with GPT-5-chat and NotebookLM demonstrated that our approach maintains accuracy and section-specific fidelity across models. The current work is limited to a small dataset, focused on breast cancer drugs. Future work will expand to diverse therapeutic areas and incorporate BERTScore and expert-based validation.
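The evaluation step, semantic similarity between generated and ground-truth text, reduces to cosine similarity between embedding vectors once a sentence-embedding model (such as Sentence Transformers) has produced them; the toy 3-d vectors below stand in for real embeddings, which have hundreds of dimensions.

```python
# Cosine similarity between two embedding vectors, standard library
# only; the 3-d vectors are toys standing in for real sentence
# embeddings produced by a model such as Sentence Transformers.
from math import sqrt

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

score = cosine_similarity([0.2, 0.8, 0.1], [0.25, 0.75, 0.05])
```

A score near 1.0 indicates near-identical direction in embedding space, which is how thresholds like the paper's 0.7 to 0.9 range are interpreted.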
Full article
Topics
Topic in
Applied Sciences, Future Internet, AI, Analytics, BDCC
Data Intelligence and Computational Analytics
Topic Editors: Carson K. Leung, Fei Hao, Xiaokang Zhou
Deadline: 30 November 2026
Topic in
Analytics, Clean Technol., Economies, Energies, JMSE, Resources, Sustainability, Sci
Towards Green and Energy Transitions: Techno-Economic Analysis, Optimization, and Innovation Pathways for Sustainability
Topic Editors: Konstantinos Aravossis, Eleni Strantzali
Deadline: 30 April 2027
Topic in
Information, Healthcare, Informatics, Digital, Analytics, Platforms
Digital Platform Analytics for Societal Development Across Sectors
Topic Editors: Ivy Shiue, Timo Koivumäki
Deadline: 31 May 2027
Special Issues
Special Issue in
Analytics
Critical Challenges in Large Language Models and Data Analytics: Trustworthiness, Scalability, and Societal Impact
Guest Editors: Oluwaseun Ajao, Bayode Ogunleye, Hemlata Sharma
Deadline: 31 July 2026
Special Issue in
Analytics
Business Analytics and Applications, 2nd Edition
Guest Editor: Tatiana Ermakova
Deadline: 30 September 2026