Journal Description
Analytics is an international, peer-reviewed, open access journal on methodologies, technologies, and applications of analytics, published quarterly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus and other databases.
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 27.4 days after submission; acceptance to publication is undertaken in 7.7 days (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: APC discount vouchers, optional signed peer review, and reviewer names published annually in the journal.
- Analytics is a companion journal of Mathematics.
- Journal Cluster of Information Systems and Technology: Analytics, Applied System Innovation, Cryptography, Data, Digital, Informatics, Information, Journal of Cybersecurity and Privacy, and Multimedia.
Latest Articles
System Inertia Cost Forecasting Using Machine Learning: A Data-Driven Approach for Grid Energy Trading in Great Britain
Analytics 2025, 4(4), 30; https://doi.org/10.3390/analytics4040030 - 23 Oct 2025
Abstract
            As modern power systems integrate more renewable and decentralised generation, maintaining grid stability has become increasingly challenging. This study proposes a data-driven machine learning framework for forecasting system inertia service costs—a key yet underexplored variable influencing energy trading and frequency stability in Great Britain. Using eight years (2017–2024) of National Energy System Operator (NESO) data, four models—Long Short-Term Memory (LSTM), Residual LSTM, eXtreme Gradient Boosting (XGBoost), and Light Gradient-Boosting Machine (LightGBM)—are comparatively analysed. LSTM-based models capture temporal dependencies, while ensemble methods effectively handle nonlinear feature relationships. Results demonstrate that LightGBM achieves the highest predictive accuracy, offering a robust method for inertia cost estimation and market intelligence. The framework contributes to strategic procurement planning and supports market design for a more resilient, cost-effective grid.
            Full article
        
    
        
        
(This article belongs to the Special Issue Business Analytics and Applications)
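As an illustration of the gradient-boosted side of the comparison described in this abstract, the sketch below fits a LightGBM regressor to a synthetic daily cost series using simple calendar and lag features. It is not the authors' pipeline and uses no NESO data; all feature choices and hyperparameters are assumptions, and it assumes the lightgbm and scikit-learn packages.

```python
import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in for a daily inertia-service cost series (NOT NESO data).
rng = np.random.default_rng(0)
idx = pd.date_range("2017-01-01", "2024-12-31", freq="D")
cost = (10 + 3 * np.sin(2 * np.pi * idx.dayofyear / 365)
        + rng.normal(0, 1, len(idx)).cumsum() * 0.05)

df = pd.DataFrame({"cost": cost}, index=idx)
df["dow"] = df.index.dayofweek           # calendar features
df["month"] = df.index.month
for lag in (1, 7, 28):                   # autoregressive lag features
    df[f"lag_{lag}"] = df["cost"].shift(lag)
df = df.dropna()

# Chronological split: train on the past, evaluate on the most recent year.
train, test = df[:"2023-12-31"], df["2024-01-01":]
features = [c for c in df.columns if c != "cost"]

model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
model.fit(train[features], train["cost"])

pred = model.predict(test[features])
print("MAE on hold-out year:", mean_absolute_error(test["cost"], pred))
```

The chronological split mirrors how such a forecaster would be used in trading: the model never sees future observations during training.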
Open Access Article
Distributional CNN-LSTM, KDE, and Copula Approaches for Multimodal Multivariate Data: Assessing Conditional Treatment Effects
by Jong-Min Kim
Analytics 2025, 4(4), 29; https://doi.org/10.3390/analytics4040029 - 21 Oct 2025
Abstract
            We introduce a distributional CNN-LSTM framework for probabilistic multivariate modeling and heterogeneous treatment effect (HTE) estimation. The model jointly captures complex dependencies among multiple outcomes and enables precise estimation of individual-level conditional average treatment effects (CATEs). In simulation studies with multivariate Gaussian mixtures, the CNN-LSTM demonstrates robust density estimation and strong CATE recovery, particularly as mixture complexity increases, while classical methods such as Kernel Density Estimation (KDE) and Gaussian Copulas may achieve higher log-likelihood or coverage in simpler scenarios. On real-world datasets, including Iris and Criteo Uplift, the CNN-LSTM achieves the lowest CATE RMSE, confirming its practical utility for individualized prediction, although KDE and Gaussian Copula approaches may perform better on global likelihood or coverage metrics. These results indicate that the CNN-LSTM can be trained efficiently on moderate-sized datasets while maintaining stable predictive performance. Overall, the framework is particularly valuable in applications requiring accurate individual-level effect estimation and handling of multimodal heterogeneity—such as personalized medicine, economic policy evaluation, and environmental risk assessment—with its primary strength being superior CATE recovery under complex outcome distributions, even when likelihood-based metrics favor simpler baselines.
            Full article
        
    
Open Access Article
Reservoir Computation with Networks of Differentiating Neuron Ring Oscillators
by Alexander Yeung, Peter DelMastro, Arjun Karuvally, Hava Siegelmann, Edward Rietman and Hananel Hazan
Analytics 2025, 4(4), 28; https://doi.org/10.3390/analytics4040028 - 20 Oct 2025
Abstract
            Reservoir computing is an approach to machine learning that leverages the dynamics of a complex system alongside a simple, often linear, machine learning model for a designated task. While many efforts have previously focused their attention on integrating neurons, which produce an output in response to large, sustained inputs, we focus on using differentiating neurons, which produce an output in response to large changes in input. Here, we introduce a small-world graph built from rings of differentiating neurons as a Reservoir Computing substrate. We find the coupling strength and network topology that enable these small-world networks to function as an effective reservoir. The dynamics of differentiating neurons naturally give rise to oscillatory dynamics when arranged in rings, where we study their computational use in the Reservoir Computing setting. We demonstrate the efficacy of these networks in the MNIST digit recognition task, achieving comparable performance of 90.65% to existing Reservoir Computing approaches. Beyond accuracy, we conduct systematic analysis of our reservoir’s internal dynamics using three complementary complexity measures that quantify neuronal activity balance, input dependence, and effective dimensionality. Our analysis reveals that optimal performance emerges when the reservoir operates with intermediate levels of neural entropy and input sensitivity, consistent with the edge-of-chaos hypothesis, where the system balances stability and responsiveness. The findings suggest that differentiating neurons can be a potential alternative to integrating neurons and can provide a sustainable future alternative for power-hungry AI applications.
            Full article
        
    
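The reservoir-computing recipe summarized above (fixed nonlinear dynamics read out by a simple linear model) can be sketched generically. The toy below uses a ring-coupled tanh reservoir with a least-squares readout; it is not the authors' differentiating-neuron model, and the sizes, coupling strength, and task are all assumptions.

```python
import numpy as np
from numpy.linalg import lstsq

rng = np.random.default_rng(1)

# Toy task: reproduce the input signal delayed by 5 steps.
T, washout = 2000, 200
u = rng.uniform(-1, 1, T)
y_target = np.roll(u, 5)

# Reservoir: neurons coupled in a ring (each unit driven by its predecessor),
# a crude stand-in for the ring topologies studied in the paper.
N = 200
W = np.zeros((N, N))
W[np.arange(N), np.arange(-1, N - 1)] = 0.9     # assumed ring coupling strength
w_in = rng.uniform(-0.5, 0.5, N)

x = np.zeros(N)
states = np.zeros((T, N))
for t in range(T):
    x = np.tanh(W @ x + w_in * u[t])            # fixed, untrained dynamics
    states[t] = x

# Linear readout trained by least squares on post-washout states only.
A, b = states[washout:], y_target[washout:]
w_out, *_ = lstsq(A, b, rcond=None)
print("readout MSE:", np.mean((A @ w_out - b) ** 2))
```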
Open Access Article
Multiplicative Decomposition Model to Predict UK’s Long-Term Electricity Demand with Monthly and Hourly Resolution
by Marie Baillon, María Carmen Romano and Ekkehard Ullner
Analytics 2025, 4(4), 27; https://doi.org/10.3390/analytics4040027 - 6 Oct 2025
Abstract
            The UK electricity market is changing to adapt to Net Zero targets and respond to disruptions like the Russia–Ukraine war. This requires strategic planning to decide on the construction of new electricity generation plants for a resilient UK electricity grid. Such planning is based on forecasting the UK electricity demand long-term (from 1 year and beyond). In this paper, we propose a long-term predictive model by identifying the main components of the UK electricity demand, modelling each of these components, and combining them in a multiplicative manner to deliver a single long-term prediction. To the best of our knowledge, this study is the first to apply a multiplicative decomposition model for long-term predictions at both monthly and hourly resolutions, combining neural networks with Fourier analysis. This approach is extremely flexible and accurate, with a mean absolute percentage error of 4.16% and 8.62% in predicting the monthly and hourly electricity demand, respectively, from 2019 to 2021.
            Full article
        
    
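The central idea above is multiplicative decomposition: demand is modelled as a trend factor times a seasonal factor. The sketch below fits a linear trend and a one-harmonic Fourier seasonal factor to synthetic monthly data and recombines them multiplicatively; it is only illustrative and is not the authors' neural-network-plus-Fourier model.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic monthly demand: slow downward trend times an annual seasonal factor.
months = np.arange(120.0)                        # 10 years of monthly values
true = (30 - 0.02 * months) * (1 + 0.15 * np.cos(2 * np.pi * months / 12))
demand = true * rng.normal(1.0, 0.02, months.size)

# 1) Trend component: ordinary least-squares line through the raw series.
trend = np.polyval(np.polyfit(months, demand, 1), months)

# 2) Seasonal component: one-harmonic Fourier fit to the ratio demand / trend.
X = np.column_stack([np.ones_like(months),
                     np.cos(2 * np.pi * months / 12),
                     np.sin(2 * np.pi * months / 12)])
beta, *_ = np.linalg.lstsq(X, demand / trend, rcond=None)
seasonal = X @ beta

# 3) Multiplicative recombination and in-sample MAPE.
fitted = trend * seasonal
mape = 100 * np.mean(np.abs((demand - fitted) / demand))
print(f"in-sample MAPE: {mape:.2f}%")
```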
Open Access Article
Fairness in Predictive Marketing: Auditing and Mitigating Demographic Bias in Machine Learning for Customer Targeting
by Sayee Phaneendhar Pasupuleti, Jagadeesh Kola, Sai Phaneendra Manikantesh Kodete and Sree Harsha Palli
Analytics 2025, 4(4), 26; https://doi.org/10.3390/analytics4040026 - 1 Oct 2025
Abstract
            As organizations increasingly turn to machine learning for customer segmentation and targeted marketing, concerns about fairness and algorithmic bias have become more urgent. This study presents a comprehensive fairness audit and mitigation framework for predictive marketing models using the Bank Marketing dataset. We train logistic regression and random forest classifiers to predict customer subscription behavior and evaluate their performance across key demographic groups, including age, education, and job type. Using model explainability techniques such as SHAP and fairness metrics including disparate impact and true positive rate parity, we uncover notable disparities in model behavior that could result in discriminatory targeting. We implement three mitigation strategies—reweighing, threshold adjustment, and feature exclusion—and assess their effectiveness in improving fairness while preserving business-relevant performance metrics. Among these, reweighing produced the most balanced outcome, raising the Disparate Impact Ratio for older individuals from 0.65 to 0.82 and reducing the true positive rate parity gap by over 40%, with only a modest decline in precision (from 0.78 to 0.76). We propose a replicable workflow for embedding fairness auditing into enterprise BI systems and highlight the strategic importance of ethical AI practices in building accountable and inclusive marketing technologies.
            Full article
        
    
        
        
(This article belongs to the Special Issue Business Analytics and Applications)
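Of the three mitigations named in the abstract, reweighing is the simplest to illustrate in isolation. The sketch below computes Kamiran-and-Calders-style weights that make group and label statistically independent, passes them to a logistic regression, and reports a disparate impact ratio. It uses synthetic data, not the Bank Marketing dataset, and is not the authors' exact workflow.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Synthetic stand-in for the task: binary outcome, binary protected group flag.
n = 5000
group = rng.integers(0, 2, n)                      # e.g., 1 = "older" customers
x = rng.normal(size=(n, 3)) + group[:, None] * 0.3
y = (x[:, 0] + 0.5 * group + rng.normal(0, 1, n) > 0.8).astype(int)

# Reweighing: weight each (group, label) cell so group and label look independent.
w = np.empty(n)
for g in (0, 1):
    for lbl in (0, 1):
        mask = (group == g) & (y == lbl)
        expected = (group == g).mean() * (y == lbl).mean()
        w[mask] = expected / mask.mean()

clf = LogisticRegression(max_iter=1000)
clf.fit(x, y, sample_weight=w)                     # weights passed into training

# Disparate impact ratio: P(pred = 1 | group = 1) / P(pred = 1 | group = 0).
pred = clf.predict(x)
di = pred[group == 1].mean() / pred[group == 0].mean()
print("disparate impact ratio after reweighing:", round(di, 3))
```

Because reweighing only changes sample weights, the same model class and features are kept, which is why it tends to cost little predictive performance.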
Open Access Review
Evolution of Cybercrime—Key Trends, Cybersecurity Threats, and Mitigation Strategies from Historical Data
by Muhammad Abdullah, Muhammad Munib Nawaz, Bilal Saleem, Maila Zahra, Effa binte Ashfaq and Zia Muhammad
Analytics 2025, 4(3), 25; https://doi.org/10.3390/analytics4030025 - 18 Sep 2025
Abstract
            The landscape of cybercrime has undergone significant transformations over the past decade. Present-day threats include AI-generated attacks, deep fakes, 5G network vulnerabilities, cryptojacking, and supply chain attacks, among others. To remain resilient against contemporary threats, it is essential to examine historical data to gain insights that can inform cybersecurity strategies, policy decisions, and public awareness campaigns. This paper presents a comprehensive analysis of the evolution of cyber trends in state-sponsored attacks over the past 20 years, based on the Council on Foreign Relations state-sponsored cyber operations dataset (2005–present). The study explores the key trends, patterns, and demographic shifts in cybercrime victims, the evolution of complaints and losses, and the most prevalent cyber threats over the years. It also investigates the geographical distribution, the gender disparity in victimization, the temporal peaks of specific scams, and the most frequently reported internet crimes. The findings reveal a transitional cyber landscape, with cyber threats becoming more sophisticated and monetized. Finally, the article proposes areas for further exploration through a comprehensive analysis. It provides a detailed chronicle of the trajectory of cybercrime, offering insights into its past, present, and future.
            Full article
        
    
Open Access Article
Meta-Analysis of Artificial Intelligence’s Influence on Competitive Dynamics for Small- and Medium-Sized Financial Institutions
by Macy Cudmore and David Mattie
Analytics 2025, 4(3), 24; https://doi.org/10.3390/analytics4030024 - 18 Sep 2025
Abstract
            Artificial intelligence adoption in financial services presents uncertain implications for competitive dynamics, particularly for smaller institutions. The literature on AI in finance is growing, but there remains a notable absence regarding the impacts on small- and medium-sized financial services firms. We conduct a meta-analysis combining a systematic literature review, sentiment bibliometrics, and network analysis to examine how AI is transforming competition across different firm sizes in the financial sector. Our analysis of 160 publications reveals predominantly positive academic sentiment toward AI in finance (mean positive sentiment 0.725 versus negative 0.586, Cohen’s d = 0.790, p < 0.0001), with anticipatory sentiment increasing significantly over time [...]
    
        
        
(This article belongs to the Special Issue Business Analytics and Applications)
Open Access Article
Game-Theoretic Analysis of MEV Attacks and Mitigation Strategies in Decentralized Finance
by Benjamin Appiah, Daniel Commey, Winful Bagyl-Bac, Laurene Adjei and Ebenezer Owusu
Analytics 2025, 4(3), 23; https://doi.org/10.3390/analytics4030023 - 15 Sep 2025
Abstract
            Maximal Extractable Value (MEV) presents a significant challenge to the fairness and efficiency of decentralized finance (DeFi). This paper provides a game-theoretic analysis of the strategic interactions within the MEV supply chain, involving searchers, builders, and validators. A three-stage game of incomplete information is developed to model these interactions. The analysis derives the Perfect Bayesian Nash Equilibria for primary MEV attack vectors, such as sandwich attacks, and formally characterizes attacker behavior. The research demonstrates that the competitive dynamics of the current MEV market are best described as Bertrand-style competition, which compels rational actors to engage in aggressive extraction that reduces overall system welfare in a prisoner’s dilemma-like outcome. To address these issues, the paper proposes and evaluates mechanism design solutions, including commit–reveal schemes and threshold encryption. The potential of these solutions to mitigate harmful MEV is quantified. Theoretical models are validated against on-chain data from the Ethereum blockchain, showing a close alignment between theoretical predictions and empirically observed market behavior.
            Full article
        
    
Open Access Article
Bankruptcy Prediction Using Machine Learning and Data Preprocessing Techniques
by Kamil Samara and Apurva Shinde
Analytics 2025, 4(3), 22; https://doi.org/10.3390/analytics4030022 - 10 Sep 2025
Abstract
            Bankruptcy prediction is critical for financial risk management. This study demonstrates that machine learning models, particularly Random Forest, can substantially improve prediction accuracy compared to traditional approaches. Using data from 8262 U.S. firms (1999–2018), we evaluate Logistic Regression, SVM, Random Forest, ANN, and RNN in combination with robust data preprocessing steps. Random Forest achieved the highest prediction accuracy (~95%), far surpassing Logistic Regression (~57%). Key preprocessing steps included feature engineering of financial ratios, feature selection, class balancing using SMOTE, and scaling. The findings highlight that ensemble and deep learning models—particularly Random Forest and ANN—offer strong predictive performance, suggesting their suitability for early-warning financial distress systems.
            Full article
        
    
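The preprocessing-plus-model pattern described above (class balancing with SMOTE, scaling, then a Random Forest) can be sketched compactly. The example below runs on synthetic imbalanced data rather than the 8262-firm dataset, and the hyperparameters are assumptions; it assumes scikit-learn and imbalanced-learn are installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE

# Synthetic, heavily imbalanced stand-in for bankrupt vs. healthy firms.
X, y = make_classification(n_samples=8000, n_features=18, weights=[0.95, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Scale, then oversample ONLY the training split to avoid leakage into the test set.
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_te_s = scaler.transform(X_tr), scaler.transform(X_te)
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr_s, y_tr)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te_s), digits=3))
```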
Open Access Article
Accurate Analytical Forms of Heaviside and Ramp Function
by John Constantine Venetis
Analytics 2025, 4(3), 21; https://doi.org/10.3390/analytics4030021 - 26 Aug 2025
Abstract
            In this paper, explicit exact representations of the Unit Step Function and Ramp Function are obtained. These important functions constitute fundamental concepts of operational calculus together with digital signal processing theory and are also involved in many other areas of applied sciences and engineering practices. In particular, according to a rigorous process from the viewpoint of Mathematical Analysis, the Unit Step Function and the Ramp Function are equivalently performed as bi-parametric single-valued functions with only one constraint imposed on each parameter. The novelty of this work, when compared with other investigations concerning accurate and/or approximate forms of Unit Step Function and/or Ramp Function, is that the proposed exact formulae are not exhibited in terms of miscellaneous special functions, e.g., Gamma Function, Biexponential Function, or any other special functions, such as Error Function, Complementary Error Function, Hyperbolic Function, or Orthogonal Polynomials. In this framework, one may deduce that these formulae may be much more practical, flexible, and useful in the computational procedures that are inserted into operational calculus and digital signal processing techniques as well as other engineering practices.
            Full article
        
Open Access Article
LINEX Loss-Based Estimation of Expected Arrival Time of Next Event from HPP and NHPP Processes Past Truncated Time
by M. S. Aminzadeh
Analytics 2025, 4(3), 20; https://doi.org/10.3390/analytics4030020 - 26 Aug 2025
Abstract
            This article introduces a computational tool for Bayesian estimation of the expected time until the next event occurs in both homogeneous Poisson processes (HPPs) and non-homogeneous Poisson processes (NHPPs), following a truncated time. The estimation utilizes the linear exponential (LINEX) asymmetric loss function and incorporates both gamma and non-informative priors. Furthermore, it presents a minimax-type criterion to ascertain the optimal sample size required to achieve a specified percentage reduction in posterior risk. Simulation studies indicate that estimators employing gamma priors for both HPP and NHPP demonstrate greater accuracy compared to those based on non-informative priors and maximum likelihood estimates (MLE), provided that the proposed data-driven method for selecting hyperparameters is applied.
            Full article
        
    
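For context on the loss function named above: under the LINEX loss L(d) = exp(a·d) - a·d - 1, with d the estimation error, the Bayes estimator of a parameter is -(1/a)·ln E[exp(-a·parameter) | data]. The sketch below applies this to the rate of an HPP observed up to a truncation time with a conjugate gamma prior; all numbers are illustrative and this is only one simple instance of the setting studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated HPP: true rate lambda = 2 events per unit time, observed up to tau.
lam_true, tau = 2.0, 10.0
n_events = rng.poisson(lam_true * tau)

# Gamma prior (shape a0, rate b0) on lambda -> conjugate gamma posterior.
a0, b0 = 1.0, 1.0
a_post, b_post = a0 + n_events, b0 + tau

# LINEX Bayes estimator: -(1/a) * log E[exp(-a*lambda) | data]; the gamma MGF
# gives E[exp(-a*lambda)] = (b_post / (b_post + a)) ** a_post for a > -b_post,
# which simplifies to (a_post / a) * log(1 + a / b_post).
a = 1.5                                    # asymmetry parameter (assumed)
lam_linex = (a_post / a) * np.log(1.0 + a / b_post)
lam_posterior_mean = a_post / b_post       # squared-error estimate, for comparison

print("LINEX estimate of rate:", round(lam_linex, 3))
print("posterior-mean estimate:", round(lam_posterior_mean, 3))
print("implied expected wait to next event:", round(1.0 / lam_linex, 3))
```

With a > 0 the LINEX estimator sits below the posterior mean, reflecting that overestimation is penalized more heavily than underestimation.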
Open Access Article
A Bounded Sine Skewed Model for Hydrological Data Analysis
by Tassaddaq Hussain, Mohammad Shakil, Mohammad Ahsanullah and Bhuiyan Mohammad Golam Kibria
Analytics 2025, 4(3), 19; https://doi.org/10.3390/analytics4030019 - 13 Aug 2025
Abstract
            Hydrological time series frequently exhibit periodic trends with variables such as rainfall, runoff, and evaporation rates often following annual cycles. Seasonal variations further contribute to the complexity of these data sets. A critical aspect of analyzing such phenomena is estimating realistic return intervals, making the precise determination of these values essential. Given this importance, selecting an appropriate probability distribution is paramount. To address this need, we introduce a flexible probability model specifically designed to capture periodicity in hydrological data. We thoroughly examine its fundamental mathematical and statistical properties, including the asymptotic behavior of the probability density function (PDF) and hazard rate function (HRF), to enhance predictive accuracy. Our analysis reveals that the PDF exhibits polynomial decay as [...]
    
Open Access Article
Predictive Framework for Regional Patent Output Using Digital Economic Indicators: A Stacked Machine Learning and Geospatial Ensemble to Address R&D Disparities
by Amelia Zhao and Peng Wang
Analytics 2025, 4(3), 18; https://doi.org/10.3390/analytics4030018 - 8 Jul 2025
Abstract
            As digital transformation becomes an increasingly central focus of national and regional policy agendas, parallel efforts are intensifying to stimulate innovation as a critical driver of firm competitiveness and high-quality economic growth. However, regional disparities in innovation capacity persist. This study proposes an integrated framework in which regionally tracked digital economy indicators are leveraged to predict firm-level innovation performance, measured through patent activity, across China. Drawing on a comprehensive dataset covering 13 digital economic indicators from 2013 to 2022, this study spans core, broad, and narrow dimensions of digital development. Spatial dependencies among these indicators are assessed using global and local spatial autocorrelation measures, including Moran’s I and Geary’s C, to provide actionable insights for constructing innovation-conducive environments. To model the predictive relationship between digital metrics and innovation output, this study employs a suite of supervised machine learning techniques—Random Forest, Extreme Learning Machine (ELM), Support Vector Machine (SVM), XGBoost, and stacked ensemble approaches. Our findings demonstrate the potential of digital infrastructure metrics to serve as early indicators of regional innovation capacity, offering a data-driven foundation for targeted policymaking, strategic resource allocation, and the design of adaptive digital innovation ecosystems.
            Full article
        
    
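One concrete reading of the "stacked ensemble" mentioned above is scikit-learn's StackingRegressor over the named base learners. The sketch below substitutes scikit-learn's GradientBoostingRegressor for XGBoost and uses synthetic indicators, so it is illustrative only and not the authors' configuration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (RandomForestRegressor, GradientBoostingRegressor,
                              StackingRegressor)
from sklearn.svm import SVR
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 13 "digital economy indicators" -> a patent-output target.
X, y = make_regression(n_samples=600, n_features=13, noise=10.0, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("gb", GradientBoostingRegressor(random_state=0)),   # boosting stand-in
        ("svr", SVR(C=10.0)),
    ],
    final_estimator=Ridge(),          # meta-learner combining base predictions
    cv=5,
)

scores = cross_val_score(stack, X, y, cv=5, scoring="r2")
print("stacked ensemble R^2 (5-fold):", scores.mean().round(3))
```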
Open Access Article
Domestication of Source Text in Literary Translation Prevails over Foreignization
by Emilio Matricciani
Analytics 2025, 4(3), 17; https://doi.org/10.3390/analytics4030017 - 20 Jun 2025
Abstract
            Domestication is a translation theory in which the source text (to be translated) is matched to the foreign reader by erasing its original linguistic and cultural difference. This match aims at making the target text (translated text) more fluent. On the contrary, foreignization is a translation theory in which the foreign reader is matched to the source text. This paper mathematically explores the degree of domestication/foreignization in current translation practice of texts written in alphabetical languages. A geometrical representation of texts, based on linear combinations of deep–language parameters, allows us (a) to calculate a domestication index which measures how much domestication is applied to the source text and (b) to distinguish language families. An expansion index measures the relative spread around mean values. This paper reports statistics and results on translations of (a) Greek New Testament books in Latin and in 35 modern languages, belonging to diverse language families; and (b) English novels in Western languages. English and French, although attributed to different language families, mathematically almost coincide. The requirement of making the target text more fluent makes domestication, with varying degrees, universally adopted, so that a blind comparison of the same linguistic parameters of a text and its translation hardly indicates that they refer to each other.
            Full article
        
    
Open Access Article
The Classical Model of Type-Token Systems Compared with Items from the Standardized Project Gutenberg Corpus
by Martin Tunnicliffe and Gordon Hunter
Analytics 2025, 4(2), 16; https://doi.org/10.3390/analytics4020016 - 5 Jun 2025
Abstract
            We compare the “classical” equations of type-token systems, namely Zipf’s laws, Heaps’ law and the relationships between their indices, with data selected from the Standardized Project Gutenberg Corpus (SPGC). Selected items all exceed 100,000 word-tokens and are trimmed to 100,000 word-tokens each. With the most egregious anomalies removed, a dataset of 8432 items is examined in terms of the relationships between the Zipf and Heaps’ indices computed using the Maximum Likelihood algorithm. Zipf’s second (size) law indices suggest that the types vs. frequency distribution is log–log convex, with the high and low frequency indices showing weak but significant negative correlation. Under certain circumstances, the classical equations work tolerably well, though the level of agreement depends heavily on the type of literature and the language (Finnish being notably anomalous). The frequency vs. rank characteristics exhibit log–log linearity in the “middle range” (ranks 100–1000), as characterised by the Kolmogorov–Smirnov significance. For most items, the Heaps’ index correlates strongly with the low frequency Zipf index in a manner consistent with classical theory, while the high frequency indices are largely uncorrelated. This is consistent with a simple simulation.
            Full article
        
    
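As a reminder of the two laws referenced above: Zipf's rank-frequency law says frequency is roughly proportional to rank^(-alpha), and Heaps' law says the number of distinct types grows like n^beta with the number of tokens n. The toy below estimates both exponents from a synthetic token stream by log-log regression; it is not the Maximum Likelihood procedure used in the paper and the corpus is artificial.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(5)

# Toy "text": 100,000 tokens drawn from a power-law-like vocabulary distribution.
vocab = 5000
probs = 1.0 / np.arange(1, vocab + 1)
probs /= probs.sum()
tokens = rng.choice(vocab, size=100_000, p=probs)

# Zipf exponent: negative slope of log(frequency) vs. log(rank).
freqs = np.array(sorted(Counter(tokens).values(), reverse=True), dtype=float)
ranks = np.arange(1, len(freqs) + 1)
zipf_alpha = -np.polyfit(np.log(ranks), np.log(freqs), 1)[0]

# Heaps exponent: slope of log(types seen so far) vs. log(tokens seen so far).
seen, types_curve = set(), []
for t in tokens:
    seen.add(t)
    types_curve.append(len(seen))
sample = np.unique(np.geomspace(1, len(tokens), 50).astype(int))
heaps_beta = np.polyfit(np.log(sample),
                        np.log(np.array(types_curve)[sample - 1]), 1)[0]

print(f"Zipf alpha ~ {zipf_alpha:.2f}, Heaps beta ~ {heaps_beta:.2f}")
```

Classical theory relates the two exponents (roughly beta close to 1/alpha), which is the kind of relationship the paper tests against corpus data.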
Open Access Article
Multiplicity Adjustments for Differences in Proportion Parameters in Multiple-Sample Misclassified Binary Data
by Dewi Rahardja
Analytics 2025, 4(2), 15; https://doi.org/10.3390/analytics4020015 - 28 May 2025
Abstract
            Generally, following an omnibus (overall equality) test, multiple pairwise comparison (MPC) tests are typically conducted as the second step in a sequential testing procedure to identify which specific pairs (e.g., proportions) exhibit significant differences. In this manuscript, we develop maximum likelihood estimation (MLE) methods to construct three different types of confidence intervals (CIs) for multiple pairwise differences in proportions, specifically in contexts where both types of misclassifications (i.e., over-reporting and under-reporting) exist in multiple-sample binomial data. Our closed-form algorithm is straightforward to implement. Consequently, when dealing with multiple sample proportions, we can readily apply MPC adjustment procedures—such as Bonferroni, Šidák, and Dunn—to address the issue of multiplicity. This manuscript advances the existing literature by extending from scenarios with only one type of misclassification to those involving both. Furthermore, we demonstrate our methods using a real-world data example.
            Full article
        
    
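For the adjustments named above: with m simultaneous intervals at family-wise level alpha, Bonferroni uses a per-comparison level of alpha/m while Šidák uses 1 - (1 - alpha)^(1/m). The sketch below builds Wald-type confidence intervals for pairwise differences in proportions under both corrections, with made-up counts and without the misclassification modelling that is the paper's actual contribution.

```python
import numpy as np
from itertools import combinations
from scipy.stats import norm

# Observed successes and sample sizes for three groups (illustrative numbers).
x = np.array([45, 60, 52])
n = np.array([100, 110, 95])
p = x / n

alpha, m = 0.05, 3                         # three pairwise comparisons
levels = {
    "Bonferroni": alpha / m,
    "Sidak": 1 - (1 - alpha) ** (1 / m),
}

for name, a in levels.items():
    z = norm.ppf(1 - a / 2)                # per-comparison critical value
    print(name)
    for i, j in combinations(range(3), 2):
        diff = p[i] - p[j]
        se = np.sqrt(p[i] * (1 - p[i]) / n[i] + p[j] * (1 - p[j]) / n[j])
        print(f"  p{i+1} - p{j+1}: {diff:+.3f} +/- {z * se:.3f}")
```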
Open Access Review
Analytical Modeling of Ancillary Items
by John Wilson
Analytics 2025, 4(2), 14; https://doi.org/10.3390/analytics4020014 - 19 May 2025
Abstract
            Airline profitability increasingly depends on the sale of ancillary items such as seat selection, baggage fees, etc. The modeling of ancillary items is becoming more important in the analytics literature. Much of the modeling is stylized and not immediately applicable. This paper contains a review of the approaches and modeling assumptions made in the literature. The focus is on the assumptions made so that models may be evaluated for how effective they are for applications and to highlight gaps in the literature.
            Full article
        
    
Open Access Systematic Review
Artificial Intelligence Applied to the Analysis of Biblical Scriptures: A Systematic Review
by Bruno Cesar Lima, Nizam Omar, Israel Avansi and Leandro Nunes de Castro
Analytics 2025, 4(2), 13; https://doi.org/10.3390/analytics4020013 - 11 Apr 2025
Abstract
            The Holy Bible is the most read book in the world, originally written in Aramaic, Hebrew, and Greek over a time span in the order of centuries by many people, and formed by a combination of various literary styles, such as stories, prophecies, poetry, instructions, and others. As such, the Bible is a complex text to be analyzed by humans and machines. This paper provides a systematic survey of the application of Artificial Intelligence (AI) and some of its subareas to the analysis of the Biblical scriptures. Emphasis is given to what types of tasks are being solved, what are the main AI algorithms used, and their limitations. The findings deliver a general perspective on how this field is being developed, along with its limitations and gaps. This research follows a procedure based on three steps: planning (defining the review protocol), conducting (performing the survey), and reporting (formatting the report). The results obtained show there are seven main tasks solved by AI in the Bible analysis: machine translation, authorship identification, part of speech tagging (PoS tagging), semantic annotation, clustering, categorization, and Biblical interpretation. Also, the classes of AI techniques with better performance when applied to Biblical text research are machine learning, neural networks, and deep learning. The main challenges in the field involve the nature and style of the language used in the Bible, among others.
            Full article
        
    
Open Access Feature Paper Article
Traffic Prediction with Data Fusion and Machine Learning
by Juntao Qiu and Yaping Zhao
Analytics 2025, 4(2), 12; https://doi.org/10.3390/analytics4020012 - 9 Apr 2025
Cited by 3
Abstract
            Traffic prediction, as a core task to alleviate urban congestion and optimize the transport system, has limitations in the integration of multimodal data, making it difficult to comprehensively capture the complex spatio-temporal characteristics of the transport system. Although some studies have attempted to introduce multimodal data, they mostly rely on resource-intensive deep neural network architectures, which have difficulty meeting the demands of practical applications. To this end, we propose a traffic prediction framework based on simple machine learning techniques that effectively integrates property features, amenity features, and emotion features (PAE features). Validated with large-scale real datasets, the method demonstrates excellent prediction performance while significantly reducing computational complexity and deployment costs. This study demonstrates the great potential of simple machine learning techniques in multimodal data fusion, provides an efficient and practical solution for traffic prediction, and offers an effective alternative to resource-intensive deep learning methods, opening up new paths for building scalable traffic prediction systems.
            Full article
        
    
Open Access Article
Copula-Based Bayesian Model for Detecting Differential Gene Expression
by Prasansha Liyanaarachchi and N. Rao Chaganty
Analytics 2025, 4(2), 11; https://doi.org/10.3390/analytics4020011 - 3 Apr 2025
Abstract
            Deoxyribonucleic acid, more commonly known as DNA, is a fundamental genetic material in all living organisms, containing thousands of genes, but only a subset exhibit differential expression and play a crucial role in diseases. Microarray technology has revolutionized the study of gene expression, with two primary types available for expression analysis: spotted cDNA arrays and oligonucleotide arrays. This research focuses on the statistical analysis of data from spotted cDNA microarrays. Numerous models have been developed to identify differentially expressed genes based on the red and green fluorescence intensities measured using these arrays. We propose a novel approach using a Gaussian copula model to characterize the joint distribution of red and green intensities, effectively capturing their dependence structure. Given the right-skewed nature of the intensity distributions, we model the marginal distributions using gamma distributions. Differentially expressed genes are identified using the Bayes estimate under our proposed copula framework. To evaluate the performance of our model, we conduct simulation studies to assess parameter estimation accuracy. Our results demonstrate that the proposed approach outperforms existing methods reported in the literature. Finally, we apply our model to Escherichia coli microarray data, illustrating its practical utility in gene expression analysis.
            Full article
        
    
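The modelling ingredients described above, gamma marginals for the red and green intensities tied together by a Gaussian copula, can be sketched as a forward simulation. This is not the authors' Bayesian estimation procedure, and every parameter value below is an assumption; it only shows how the copula couples the two skewed channels.

```python
import numpy as np
from scipy.stats import gamma, norm

rng = np.random.default_rng(6)

# Assumed marginal parameters for "red" and "green" intensities and a copula
# correlation rho describing their dependence.
shape_r, scale_r = 2.0, 300.0
shape_g, scale_g = 2.5, 250.0
rho = 0.7

# Gaussian copula sampling: correlated normals -> uniforms -> gamma quantiles.
n = 10_000
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
u = norm.cdf(z)
red = gamma.ppf(u[:, 0], a=shape_r, scale=scale_r)
green = gamma.ppf(u[:, 1], a=shape_g, scale=scale_g)

# The dependence survives the nonlinear marginal transforms.
log_ratio = np.log2(red / green)            # the usual cDNA expression statistic
print("corr(red, green):", round(np.corrcoef(red, green)[0, 1], 3))
print("mean log2 ratio:", round(log_ratio.mean(), 3))
```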
Topics
Topic in Applied Sciences, Future Internet, AI, Analytics, BDCC
Data Intelligence and Computational Analytics
Topic Editors: Carson K. Leung, Fei Hao, Xiaokang Zhou
Deadline: 30 November 2026
Special Issues
Special Issue in Analytics
Reviews on Data Analytics and Its Applications
Guest Editor: Carson K. Leung
Deadline: 31 March 2026
Special Issue in Analytics
Critical Challenges in Large Language Models and Data Analytics: Trustworthiness, Scalability, and Societal Impact
Guest Editors: Oluwaseun Ajao, Bayode Ogunleye, Hemlata Sharma
Deadline: 31 July 2026