Advances in Statistical Approaches with Applications for Multivariate Data Analysis

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "D1: Probability and Statistics".

Deadline for manuscript submissions: 31 December 2026 | Viewed by 3992

Special Issue Editor

Dr. Md Erfanul Hoque

Guest Editor

Department of Community Health & Epidemiology, College of Medicine, University of Saskatchewan, SK S7N 5E5, Canada
Interests: longitudinal data analysis; statistical learning/machine learning; dynamic data science; spatial statistics; mixed models; statistical computing; time series analysis; biostatistics

Special Issue Information

Dear Colleagues,

Multivariate data analysis is a cornerstone of modern statistics and data science, enabling researchers to uncover complex relationships, patterns, and structures across multiple dimensions. With the exponential growth in data dimensions and complexity, traditional statistical methods often fall short in providing reliable and interpretable insights. This Special Issue, “Advances in Statistical Approaches with Applications for Multivariate Data Analysis”, brings together cutting-edge methodologies, theoretical developments, and innovative applications tailored to address challenges in multivariate data analysis across diverse domains.

In addition to methodological advancements, the Special Issue emphasizes practical applications. Papers showcase solutions to real-world problems in diverse domains, including genomics, image analysis, financial modeling, health sciences, and environmental sciences. This application-focused perspective demonstrates the relevance and adaptability of these advanced techniques in handling complex and heterogeneous data structures.

This Special Issue covers a wide range of topics, including, but not limited to: novel developments in dimension reduction methods; advances in clustering and classification for multivariate data; robust multivariate statistical models for high-dimensional data; time series and longitudinal data analysis; machine/statistical learning methods for multivariate data; applications in genomics, neuroimaging, and computational biology; multivariate approaches for spatial and environmental data; and statistical tools for dynamic systems and functional data.

Dr. Md Erfanul Hoque
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

multivariate data
biostatistical methods
time series analysis
dynamic data science
machine/statistical learning
computational statistics
Bayesian analysis
longitudinal data analysis
image analysis
spatial/spatio-temporal analysis

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)

Research

26 pages, 12712 KB

Open Access Article

Subsampling-Based Consensus Hierarchical Clustering for Robust Customer Segmentation with Mixed-Type Data

by Nooshin Marefat, Purificación Galindo-Villardón and Purificación Vicente-Galindo

Mathematics 2026, 14(8), 1294; https://doi.org/10.3390/math14081294 - 13 Apr 2026

Abstract

Hierarchical clustering is an unsupervised framework that organizes observations according to pairwise similarity relationships. In this study, an agglomerative hierarchical approach combined with Gower dissimilarity is employed to accommodate mixed-type customer data. To address data quality issues such as missing values and outliers, Multiple Imputation by Chained Equations (MICE) and Winsorization are incorporated into the preprocessing pipeline. To validate cluster stability and identify the optimal number of clusters, we employ silhouette analysis, the Davies–Bouldin Index (DBI), the Proportion of Ambiguous Clustering (PAC), and a subsampling-based consensus clustering framework. A consensus-based hierarchical tree derived from the consensus matrix is employed to assess the robustness of the segmentation structure. The resulting clusters are further evaluated through comparisons with baseline algorithms for mixed-type data, including Partitioning Around Medoids (PAM) based on Gower dissimilarity and the K-prototypes method, together with statistical tests confirming significant behavioral differences between the identified segments. From an application standpoint, these results provide a data-driven basis for customer targeting by identifying distinct behavioral patterns, thereby supporting more effective engagement strategies and optimized resource allocation. Full article

(This article belongs to the Special Issue Advances in Statistical Approaches with Applications for Multivariate Data Analysis)
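The Gower dissimilarity named in the abstract above can be illustrated with a minimal sketch. It assumes the textbook form of the measure (range-scaled absolute differences for numeric features, simple mismatch for categorical ones, averaged with equal weights); the function name `gower_distance`, the `numeric_ranges` argument, and the toy customer records are hypothetical and are not taken from the article.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower dissimilarity between two mixed-type records.

    a, b           : sequences of equal length (numbers or category labels)
    numeric_ranges : dict mapping the index of each numeric feature to its
                     observed range (max - min) in the data set; features
                     whose index is absent are treated as categorical.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("records must be non-empty and of equal length")
    total = 0.0
    for k, (u, v) in enumerate(zip(a, b)):
        if k in numeric_ranges:
            r = numeric_ranges[k]
            # Range-scaled absolute difference; a constant feature contributes 0.
            total += abs(u - v) / r if r > 0 else 0.0
        else:
            # Categorical feature: 0 if the labels match, 1 otherwise.
            total += 0.0 if u == v else 1.0
    # Equal-weight average over all features.
    return total / len(a)


# Hypothetical customers: (age, loyalty tier, yearly spend)
customer_1 = (30, "gold", 100.0)
customer_2 = (50, "silver", 100.0)
ranges = {0: 40.0, 2: 200.0}  # assumed observed ranges of age and spend
print(gower_distance(customer_1, customer_2, ranges))  # (0.5 + 1 + 0) / 3 = 0.5
```

A full pairwise matrix of such values is what an agglomerative hierarchical procedure would take as input; handling of missing values and feature weights is deliberately left out of this sketch.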

43 pages, 10109 KB

Open Access Article

Stabilizer Variables for Measurement Invariance–Induced Heterogeneity: Identification Theory and Testing in Multi-Group Models

by Salim Yilmaz and Erhan Cene

Mathematics 2026, 14(6), 1064; https://doi.org/10.3390/math14061064 - 21 Mar 2026

Viewed by 366

Abstract

When measurement invariance (MI) is violated in multi-group structural equation models, group-specific measurement artifacts inflate the between-group variance of structural parameters beyond their true values. Existing remedies (partial invariance, group-specific estimation, or moderation analysis) address the consequences of inflation but not its mechanism. This article introduces the stabilizer variable, a covariate that absorbs measurement-induced parameter heterogeneity while maintaining structural independence from the focal relationship. Two theoretical results are established: a variance decomposition theorem showing that MI violations inflate dispersion through an identifiable artifactual component, and a purification theorem proving that a stabilizer reduces this dispersion via Frisch–Waugh–Lovell projection. Two stabilization mechanisms are identified: variance purification (Type A) and directional alignment (Type B). We then develop the stabilizer variable test, a dual-criterion procedure combining nonparametric bootstrap testing for stabilization magnitude with binomial testing for directional consistency, incorporating adaptive MI severity scoring with calibrated fit-index weights. Simulations comprising 949,100 replications across varying group counts, sample sizes, and MI severity levels demonstrate 80–99% power with false-positive rates below 2%. Practical guidelines recommend $K \geq 10$ groups and $n \geq 100$ per group for conservative applications. The framework generalizes to any multi-group regression context where systematic measurement error induces spurious parameter heterogeneity. Full article

(This article belongs to the Special Issue Advances in Statistical Approaches with Applications for Multivariate Data Analysis)

33 pages, 1665 KB

Open Access Article

Modeling Healthcare Data with a Novel Flexible Three-Parameter Distribution

by Thamer Manshi, Ammar M. Sarhan and M. E. Sobh

Mathematics 2026, 14(2), 359; https://doi.org/10.3390/math14020359 - 21 Jan 2026

Viewed by 321

Abstract

Developing flexible lifetime distributions is essential for accurately modeling reliability and lifetime data across various scientific and engineering contexts. In this work, we introduce a new three-parameter lifetime distribution, which extends the well-known two-parameter Sarhan–Tadj–Hamilton model. We derive and discuss several of its important theoretical properties, including the reliability characteristics and moments. The parameter estimation is carried out using both maximum likelihood and Bayesian approaches, providing a comprehensive comparison of inferential techniques. To further examine the efficiency and robustness of the proposed estimators, a detailed Monte Carlo simulation study is conducted under different sample sizes and parameter settings. The practical usefulness of the distribution is illustrated through its application to three real-world datasets, namely cancer and COVID-19 data, where it demonstrates superior fit and flexibility compared to existing and nested lifetime models. These findings highlight the potential of the proposed model as a valuable addition to the toolbox of applied statisticians and reliability practitioners. Full article

(This article belongs to the Special Issue Advances in Statistical Approaches with Applications for Multivariate Data Analysis)

22 pages, 12979 KB

Open Access Article

A High-Breakdown MCD-Based Robust Concordance Correlation Coefficient

by Hasan Bulut, Müjgan Zobu and Vedat Sağlam

Mathematics 2026, 14(1), 196; https://doi.org/10.3390/math14010196 - 4 Jan 2026

Viewed by 632

Abstract

The concordance correlation coefficient (CCC) is a popular measure of agreement between two continuous variables but is highly sensitive to outliers and data contamination. In this study, we propose a robust reformulation of the CCC by replacing classical moment estimators with Minimum Covariance Determinant (MCD) estimators. The proposed robust CCC preserves the interpretability of the classical coefficient while providing substantially improved robustness. Comprehensive Monte Carlo simulations under normal and non-normal distributions, varying sample sizes, correlation levels, and contamination schemes compare the proposed coefficient with the classical CCC and existing robust alternatives. The results show that the proposed robust CCC achieves superior stability and accuracy in contaminated settings while remaining competitive under clean data. Theoretical properties of the estimator are discussed, and its practical usefulness is demonstrated using real glucose measurement and blood pressure data sets. The proposed method is implemented in the MVTests R package, enabling straightforward application to real-world data. Full article

(This article belongs to the Special Issue Advances in Statistical Approaches with Applications for Multivariate Data Analysis)
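For readers unfamiliar with the classical concordance correlation coefficient that the abstract above sets out to robustify, here is a minimal sketch of the standard definition (Lin's CCC), computed from ordinary sample moments with a 1/n divisor. The function name is illustrative. This is not the article's MCD-based estimator: as the abstract describes it, the robust version replaces these classical means, variances, and covariance with Minimum Covariance Determinant estimates.

```python
def concordance_correlation(x, y):
    """Classical concordance correlation coefficient between two samples.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    using moment estimators with a 1/n divisor.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both samples are constant and equal")
    return 2.0 * cov_xy / denom


method_a = [1, 2, 3, 4, 5]
method_b = [2, 3, 4, 5, 6]  # perfectly correlated, but shifted by one unit
print(concordance_correlation(method_a, method_a))  # 1.0 (perfect agreement)
print(concordance_correlation(method_a, method_b))  # 0.8 (the shift is penalised)
```

Because every term in the formula is a non-robust moment, a single extreme pair of measurements can change the value substantially, which is the sensitivity the abstract refers to.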

26 pages, 634 KB

Open Access Article

Time-Weighted Result-Based Strength Indicators from Head-to-Head Outcomes: An Application to Trotter (Harness) Racing

by Manuel Ligero-Acosta, Juan M. Muñoz-Pichardo, María Dolores Gómez, María Ripollés-Lobo and Mercedes Valera

Mathematics 2026, 14(1), 167; https://doi.org/10.3390/math14010167 - 1 Jan 2026

Viewed by 633

Abstract

We propose a general methodology for constructing dynamic performance indicators (or strength metrics) in any sport that relies on comparative outcomes among competitors, using chronological positional data. Specifically, we develop a family of strength indicators for harness trotting races based on time-weighted, head-to-head results. Using the official Balearic trotting records (1990–2023), we construct win, draw, and confrontation matrices up to each event and apply a triweight kernel to reduce the influence of older results. From these matrices, we derive a family of five bounded, interpretable indicators on the interval [0, 1]: an overall average win rate, a category-adjusted version, and three distance-specific versions (short, medium, and long). Indicator validation is performed via predictive validation, employing regularized logistic regression models (Elastic Net) based on indicator differences between horse pairs. Standard metrics (accuracy, calibration, discrimination, and Brier score) are used for the validation analysis. The results confirm that the indicators are coherent, stable, and interpretable, demonstrating that the generic construction procedure yields robust outcomes. We conclude that these indicators establish a solid and easily updatable foundation for developing dynamic ranking systems and practical selection/handicap procedures in trotting. Full article

(This article belongs to the Special Issue Advances in Statistical Approaches with Applications for Multivariate Data Analysis)
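The idea of down-weighting older head-to-head results with a triweight kernel, mentioned in the abstract above, can be sketched as follows. The kernel itself is the standard one, K(u) = (35/32)(1 - u^2)^3 for |u| <= 1 and 0 otherwise. Everything else is an assumption made for illustration and not the article's construction: the function names, the `bandwidth` parameter, scoring a draw as 0.5, and the toy data.

```python
def triweight(u):
    """Standard triweight kernel: (35/32) * (1 - u^2)^3 on [-1, 1], else 0."""
    if abs(u) >= 1.0:
        return 0.0
    return 35.0 / 32.0 * (1.0 - u * u) ** 3


def weighted_win_rate(results, now, bandwidth):
    """Time-weighted win rate of one competitor against its opponents.

    results   : iterable of (time, outcome) pairs, where outcome is
                1.0 for a win, 0.5 for a draw, 0.0 for a loss
                (the draw score is an assumption of this sketch)
    now       : time of the event being evaluated
    bandwidth : age at which a past result stops counting (hypothetical knob)

    Returns a value in [0, 1], or None if no past result carries weight.
    """
    num = 0.0
    den = 0.0
    for t, outcome in results:
        if t > now:
            continue  # only results up to the current event are used
        w = triweight((now - t) / bandwidth)
        num += w * outcome
        den += w
    return num / den if den > 0 else None


# An older win (age 2) and a more recent loss (age 1), bandwidth of 10 time units.
history = [(8, 1.0), (9, 0.0)]
rate = weighted_win_rate(history, now=10, bandwidth=10)
print(rate)  # below 0.5, because the recent loss outweighs the older win
```

Since the result is a weighted average of outcomes that each lie in [0, 1], it is bounded on the same interval, which matches the bounded indicators the abstract describes.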

25 pages, 1808 KB

Open Access Article

A Dependent Bivariate Burr XII Inverse Weibull Model: Application to Diabetic Retinopathy and Dependent Competing Risks Data

by Ammar M. Sarhan, Ahlam H. Tolba, Dina A. Ramadan and Thamer Manshi

Mathematics 2026, 14(1), 120; https://doi.org/10.3390/math14010120 - 28 Dec 2025

Viewed by 385

Abstract

This paper introduces a novel bivariate distribution, referred to as the Bivariate Burr XII Inverse Weibull (BBXII-IW) distribution, constructed via the Marshall–Olkin approach from the univariate Burr XII Inverse Weibull (BXII-IW) distribution. The proposed BBXII-IW model provides a flexible framework for modeling dependent bivariate data, including competing risk scenarios. The key statistical properties of the distribution are derived, and parameter estimation is conducted using the maximum likelihood method. The model’s performance is evaluated using two types of real-world datasets: (1) bivariate data and (2) dependent competing risk data related to diabetic retinopathy. The results demonstrate that the BBXII-IW distribution offers an improved fit compared to existing models, highlighting its flexibility and practical relevance in modeling complex dependent structures. Full article

(This article belongs to the Special Issue Advances in Statistical Approaches with Applications for Multivariate Data Analysis)

33 pages, 752 KB

Open Access Feature Paper Article

Flux and First-Passage Time Distributions in One-Dimensional Integrated Stochastic Processes with Arbitrary Temporal Correlation and Drift

by Holger Nobach and Stephan Eule

Mathematics 2025, 13(19), 3163; https://doi.org/10.3390/math13193163 - 2 Oct 2025

Viewed by 804

Abstract

The arrival of tracers at boundaries with defined distances from the origin of their motion in stochastically fluctuating advection processes is investigated. The advection model is a stationary one-dimensional integrated stochastic process with an arbitrary a priori known correlation and with possible mean drift. The current (direction-sensitive), the total flux (direction-insensitive) of tracers through a non-absorbing boundary, and the first-passage times of the tracers at an absorbing boundary are derived depending on the correlation function of the carrying flow velocity. While the general derivations are universal with respect to the distribution function of the advection’s increments, the current and the total flux are explicitly derived for a Gaussian distribution. The first-passage time is derived implicitly through an integral that is solved numerically in the present study. No approximations or restrictions to special cases of the advection process are used. One application is one-dimensional Gaussian turbulence, where the one-dimensional random velocity carries tracer particles through space. Finally, subdiffusive or superdiffusive behavior can temporarily be reached by such a stochastic process with an adequately designed correlation function. Full article

(This article belongs to the Special Issue Advances in Statistical Approaches with Applications for Multivariate Data Analysis)
