Journal Description
Stats
Stats is an international, peer-reviewed, open access journal on statistical science, published bimonthly online by MDPI. The journal focuses on methodological and theoretical papers in statistics, probability, stochastic processes, and innovative applications of statistics in all scientific disciplines, including the biological and biomedical sciences, medicine, business, economics and the social sciences, physics, data science, and engineering.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within ESCI (Web of Science), Scopus, RePEc, and other databases.
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 18.2 days after submission; the time from acceptance to publication is 2.9 days (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
- Journal Cluster of Artificial Intelligence: AI, AI in Medicine, Algorithms, BDCC, MAKE, MTI, Stats, Virtual Worlds and Computers.
Impact Factor: 1.0 (2024); 5-Year Impact Factor: 1.1 (2024)
Latest Articles
Robust Kibria Estimators for Mitigating Multicollinearity and Outliers in a Linear Regression Model
Stats 2025, 8(4), 119; https://doi.org/10.3390/stats8040119 - 17 Dec 2025
Abstract
In the presence of multicollinearity, the ordinary least squares (OLS) estimator, despite being the best linear unbiased estimator (BLUE), loses efficiency and exhibits inflated variances. In addition, OLS estimates are highly sensitive to outliers in the response direction. To overcome these limitations, robust estimation techniques are often integrated with shrinkage methods. This study proposes a new class of Kibria Ridge M-estimators specifically developed to address multicollinearity and outlier contamination simultaneously. A comprehensive Monte Carlo simulation study is conducted to evaluate the performance of the proposed and existing estimators. Based on the mean squared error criterion, the proposed Kibria Ridge M-estimators consistently outperform traditional ridge-type estimators under varying parameter settings. Furthermore, the practical applicability and superiority of the proposed estimators are validated using the Tobacco and Anthropometric datasets. Overall, the proposed estimators demonstrate good performance, offering robust and efficient alternatives for regression modeling in the presence of multicollinearity and outliers.
(This article belongs to the Special Issue Advances in Machine Learning, High-Dimensional Inference, Shrinkage Estimation, and Model Validation)
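As a rough companion to the abstract above, the sketch below fits a ridge-penalized Huber M-estimator (scikit-learn's HuberRegressor with an L2 penalty) to simulated collinear data containing response outliers. It is a generic ridge M-estimate for illustration only, not the Kibria Ridge M-estimators proposed in the paper; the simulated data and penalty value are arbitrary assumptions.

```python
# Sketch: ridge-penalized M-estimation on collinear data with response outliers.
# Generic illustration, not the Kibria Ridge M-estimators proposed in the paper.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, HuberRegressor

rng = np.random.default_rng(0)
n, p = 100, 4
z = rng.normal(size=n)
X = np.column_stack([z + 0.05 * rng.normal(size=n) for _ in range(p)])  # highly collinear predictors
y = X @ np.ones(p) + rng.normal(scale=0.5, size=n)
y[:5] += 15.0                                       # outliers in the response direction

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)                  # shrinkage only
huber_ridge = HuberRegressor(alpha=1.0).fit(X, y)   # Huber loss + L2 penalty (a "ridge M-estimator")

for name, est in [("OLS", ols), ("Ridge", ridge), ("Ridge-Huber", huber_ridge)]:
    print(name, np.round(est.coef_, 2))
```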
Open Access Article
Korovkin-Type Approximation Theorems for Statistical Gauge Integrable Functions of Two Variables
by
Hari Mohan Srivastava, Bidu Bhusan Jena, Susanta Kumar Paikray and Umakanta Misra
Stats 2025, 8(4), 118; https://doi.org/10.3390/stats8040118 - 15 Dec 2025
Abstract
In this work, we develop and investigate statistical extensions of gauge integrability and gauge summability for double sequences of functions of two real variables, formulated within the framework of deferred weighted means. We begin by establishing several fundamental limit theorems that serve to connect these generalized notions and provide a rigorous theoretical foundation. Based on these results, we establish Korovkin-type approximation theorems using the classical test function set in the underlying Banach space. To demonstrate the applicability of the proposed framework, we further present an example involving families of positive linear operators associated with the Meyer-König and Zeller (MKZ) operators. These findings not only extend classical Korovkin-type theorems to the setting of statistical deferred gauge integrability and summability but also underscore their robustness in addressing double sequences and the approximation of two-variable functions.
(This article belongs to the Section Statistical Methods)
Open Access Article
Still No Free Lunch: Failure of Stability in Regulated Systems of Interacting Cognitive Modules
by
Rodrick Wallace
Stats 2025, 8(4), 117; https://doi.org/10.3390/stats8040117 - 15 Dec 2025
Abstract
The asymptotic limit theorems of information and control theories, instantiated as the Rate Distortion Control Theory of bounded rationality, enable examination of stability across models of cognition based on a variety of fundamental, underlying probability distributions likely to characterize different forms of embodied ‘intelligent’ systems. Embodied cognition is inherently unstable, requiring the pairing of cognition with regulation at and across the various and varied scales and levels of organization. Like contemporary Large Language Model ‘hallucination,’ de facto ‘psychopathology’—the failure of regulation in systems of cognitive modules—is not a bug but an inherent feature of embodied cognition. What particularly emerges from this analysis, then, is the ubiquity of failure-under-stress even for ‘intelligent’ embodied cognition, where cognitive and regulatory modules are closely paired. There is still No Free Lunch, much in the classic sense of Wolpert and Macready. With some further effort, the probability models developed here can be transformed into robust statistical tools for the analysis of observational and experimental data regarding regulated and other cognitive phenomena.
Open Access Review
Mapping Research on the Birnbaum–Saunders Statistical Distribution: Patterns, Trends, and Scientometric Perspective
by
Víctor Leiva
Stats 2025, 8(4), 116; https://doi.org/10.3390/stats8040116 - 13 Dec 2025
Abstract
This article provides a critical assessment of the Birnbaum–Saunders (BS) distribution, a pivotal statistical model for lifetime data analysis and reliability estimation, particularly in fatigue contexts. The model has been successfully applied across diverse fields, including biological mortality, environmental sciences, medicine, and risk models. Moving beyond a basic scientometric review, this study synthesizes findings from 353 peer-reviewed articles, selected using PRISMA 2020 protocols, to specifically trace the evolution of estimation techniques, regression methods, and model extensions. Key findings reveal robust theoretical advances, such as Bayesian methods and bivariate/spatial adaptations, alongside practical progress in influence diagnostics and software development. The analysis highlights key research gaps, including the critical need for scalable, auditable software and structured reviews, and notes a peak in scholarly activity around 2019, driven in large part by the Brazil–Chile research alliance. This work offers a consolidated view of current BS model implementations and outlines clear future directions for enhancing their theoretical robustness and practical utility.
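For readers who want to experiment with the BS model itself, SciPy exposes it as the fatigue-life distribution (scipy.stats.fatiguelife). The sketch below simulates BS lifetimes and refits the parameters by maximum likelihood; it is independent of the article's scientometric corpus, and the parameter values are arbitrary assumptions.

```python
# Sketch: simulate Birnbaum-Saunders (fatigue-life) lifetimes and refit by ML.
# Illustrative only; unrelated to the article's scientometric dataset.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
shape_true, scale_true = 0.5, 2.0
lifetimes = stats.fatiguelife.rvs(shape_true, loc=0, scale=scale_true,
                                  size=500, random_state=rng)

# Fit with the location fixed at zero, as is common for lifetime data.
shape_hat, loc_hat, scale_hat = stats.fatiguelife.fit(lifetimes, floc=0)
print(f"shape: true={shape_true}, est={shape_hat:.3f}")
print(f"scale: true={scale_true}, est={scale_hat:.3f}")
```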
Open Access Article
Entropy and Minimax Risk Diversification: An Empirical and Simulation Study of Portfolio Optimization
by
Hongyu Yang and Zijian Luo
Stats 2025, 8(4), 115; https://doi.org/10.3390/stats8040115 - 11 Dec 2025
Abstract
The optimal allocation of funds within a portfolio is a central research focus in finance. Conventional mean-variance models often concentrate a significant portion of funds in a limited number of high-risk assets. To promote diversification, Shannon Entropy is widely applied. This paper develops a portfolio optimization model that incorporates Shannon Entropy alongside a risk diversification principle aimed at minimizing the maximum individual asset risk. The study combines empirical analysis with numerical simulations. First, empirical data are used to assess the theoretical model’s effectiveness and practicality. Second, numerical simulations are conducted to analyze portfolio performance under extreme market scenarios. Specifically, the numerical results indicate that for fixed values of the risk balance coefficient and minimum expected return, the optimal portfolios and their return distributions are similar when the risk is measured by standard deviation, absolute deviation, or standard lower semi-deviation. This suggests that the model exhibits robustness to variations in the risk function, providing a relatively stable investment strategy.
(This article belongs to the Special Issue Robust Statistics in Action II)
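A minimal sketch of the kind of optimization the abstract describes, assuming a toy return matrix: the objective trades the Shannon entropy of the weights against the largest single-asset risk, subject to a minimum expected return, and is solved with scipy.optimize.minimize. The objective weighting, risk measure, and constraint values are illustrative choices, not the authors' exact model.

```python
# Sketch: entropy-diversified portfolio with a minimax single-asset risk term.
# Toy data and an illustrative objective, not the paper's exact formulation.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
R = rng.normal(loc=0.001, scale=0.02, size=(250, 5))   # daily returns, 5 assets
mu, sigma = R.mean(axis=0), R.std(axis=0)
r_min, lam = 0.0005, 0.5                               # min expected return, risk-balance coefficient

def objective(w):
    entropy = -np.sum(w * np.log(w + 1e-12))           # Shannon entropy of the weights
    max_risk = np.max(w * sigma)                       # largest individual asset risk
    return -entropy + lam * max_risk                   # maximize entropy, penalize max risk

cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0},
        {"type": "ineq", "fun": lambda w: w @ mu - r_min}]
w0 = np.full(5, 0.2)
res = minimize(objective, w0, bounds=[(0.0, 1.0)] * 5, constraints=cons)
print(np.round(res.x, 3), "expected return:", round(res.x @ mu, 5))
```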
Open Access Article
Validated Transfer Learning Peters–Belson Methods for Survival Analysis: Ensemble Machine Learning Approaches with Overfitting Controls for Health Disparity Decomposition
by
Menglu Liang and Yan Li
Stats 2025, 8(4), 114; https://doi.org/10.3390/stats8040114 - 10 Dec 2025
Abstract
Background: Health disparities research increasingly relies on complex survey data to understand survival differences between population subgroups. While Peters–Belson decomposition provides a principled framework for distinguishing disparities explained by measured covariates from unexplained residual differences, traditional approaches face challenges with complex data patterns and model validation for counterfactual estimation. Objective: To develop validated Peters–Belson decomposition methods for survival analysis that integrate ensemble machine learning with transfer learning while ensuring logical validity of counterfactual estimates through comprehensive model validation. Methods: We extend the traditional Peters–Belson framework through ensemble machine learning that combines Cox proportional hazards models, cross-validated random survival forests, and regularized gradient boosting approaches. Our framework incorporates a transfer learning component via principal component analysis (PCA) to discover shared latent factors between majority and minority groups. We note that this “transfer learning” differs from the standard machine learning definition (pre-trained models or domain adaptation); here, we use the term in its statistical sense to describe the transfer of covariate structure information from the pooled population to identify group-level latent factors. We develop a comprehensive validation framework that ensures Peters–Belson logical bounds compliance, preventing mathematical violations in counterfactual estimates. The approach is evaluated through simulation studies across five realistic health disparity scenarios using stratified complex survey designs. Results: Simulation studies demonstrate that validated ensemble methods achieve superior performance compared to individual models (proportion explained: 0.352 vs. 0.310 for individual Cox, 0.325 for individual random forests), with the validation framework reducing logical violations from 34.7% to 2.1% of cases. Transfer learning provides an additional 16.1% average improvement in explanation of unexplained disparity when significant unmeasured confounding exists, with a 90.1% overall validation success rate. The validation framework ensures explanation proportions remain within realistic bounds while maintaining computational efficiency, with a 31% overhead for validation procedures. Conclusions: Validated ensemble machine learning provides substantial advantages for Peters–Belson decomposition when combined with proper model validation. Transfer learning offers conditional benefits for capturing unmeasured group-level factors while preventing mathematical violations common in standard approaches. The framework demonstrates that realistic health disparity patterns show 25–35% of differences explained by measured factors, providing actionable targets for reducing health inequities.
(This article belongs to the Special Issue Advances in Machine Learning, High-Dimensional Inference, Shrinkage Estimation, and Model Validation)
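The core Peters–Belson logic (fit an outcome model in the majority group, predict counterfactual outcomes for the minority group, and split the disparity into explained and unexplained parts) can be sketched in a few lines. The example below replaces the paper's survey-weighted survival ensembles with a plain linear model on a simulated continuous outcome, purely to show the decomposition arithmetic.

```python
# Sketch of the Peters-Belson decomposition with a simple linear outcome model.
# The paper uses survey-weighted survival ensembles; this only shows the arithmetic.
import numpy as np

rng = np.random.default_rng(3)
n_maj, n_min = 1000, 400
X_maj = rng.normal(1.0, 1.0, size=(n_maj, 2))          # majority covariates
X_min = rng.normal(0.5, 1.0, size=(n_min, 2))          # minority covariates (worse profile)
beta = np.array([2.0, 1.0])
y_maj = X_maj @ beta + rng.normal(size=n_maj)
y_min = X_min @ beta - 1.0 + rng.normal(size=n_min)    # extra unexplained deficit

# Fit the outcome model in the majority group only.
A = np.column_stack([np.ones(n_maj), X_maj])
coef, *_ = np.linalg.lstsq(A, y_maj, rcond=None)

# Counterfactual: minority outcomes predicted under the majority-group model.
y_min_cf = np.column_stack([np.ones(n_min), X_min]) @ coef

total = y_maj.mean() - y_min.mean()
explained = y_maj.mean() - y_min_cf.mean()             # due to measured covariates
unexplained = y_min_cf.mean() - y_min.mean()           # residual disparity
print(f"total={total:.2f} explained={explained:.2f} unexplained={unexplained:.2f} "
      f"proportion explained={explained / total:.2f}")
```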
Open Access Communication
Goodness of Chi-Square for Linearly Parameterized Fitting
by
George Livadiotis
Stats 2025, 8(4), 113; https://doi.org/10.3390/stats8040113 - 1 Dec 2025
Abstract
The paper presents an alternative perspective on the reduced chi-square as a measure of the goodness of fitting methods. The reduced chi-square is given by the ratio of the fitting error to the propagation error, a universal relationship that holds for any linearly parameterized fitting model, but not for a nonlinearly parameterized one. We begin by providing the proof for the traditional examples of the one-parametric fitting of a constant and the bi-parametric fitting of a linear model, and then for the general case of any linearly multi-parameterized model. We also show that this characterization is not generally true for nonlinearly parameterized fitting. Finally, we demonstrate these theoretical developments with an application to real data from the plasma protons in the heliosphere.
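As a concrete reference point, the sketch below computes the ordinary reduced chi-square for a weighted straight-line fit (the bi-parametric case mentioned above) on simulated data. It is the generic textbook statistic, not a reproduction of the paper's fitting-versus-propagation-error derivation or its heliospheric application.

```python
# Sketch: reduced chi-square of a weighted straight-line fit (generic textbook form,
# not the paper's fitting-vs-propagation-error derivation).
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 40)
sigma = np.full_like(x, 0.3)                           # known measurement errors
y = 1.5 + 0.8 * x + rng.normal(scale=sigma)

# Weighted least squares for the bi-parametric linear model y = a + b x.
W = 1.0 / sigma**2
A = np.column_stack([np.ones_like(x), x])
coef = np.linalg.solve((A.T * W) @ A, A.T @ (W * y))
resid = y - A @ coef

chi2_red = np.sum((resid / sigma) ** 2) / (len(x) - 2)  # N data points, M = 2 parameters
print("reduced chi-square:", round(chi2_red, 3))
```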
Open Access Article
Factor Analysis Biplots for Continuous, Binary and Ordinal Data
by
Marina Valdés-Rodríguez, Laura Vicente-González and José L. Vicente-Villardón
Stats 2025, 8(4), 112; https://doi.org/10.3390/stats8040112 - 25 Nov 2025
Abstract
This article presents biplots derived from factor analysis of correlation matrices for both continuous and ordinal data. It introduces biplots specifically designed for factor analysis, detailing the geometric interpretation for each data type and providing an algorithm to compute biplot coordinates from the factorization of correlation matrices. The theoretical developments are illustrated using a real dataset that explores the relationship between volunteering, political ideology, and civic engagement in Spain.
(This article belongs to the Section Multivariate Analysis)
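A minimal sketch of a factor-analysis biplot for the continuous case, assuming standardized data so that the analysis works on the correlation scale: scikit-learn's FactorAnalysis provides the loadings (variable coordinates) and scores (observation coordinates) that a biplot overlays. The binary and ordinal extensions and the specific coordinate algorithm in the article are not reproduced here.

```python
# Sketch: a basic factor-analysis biplot for continuous data (2 factors).
# The paper's algorithm also covers binary/ordinal data via correlation-matrix
# factorization; this generic sketch handles only the continuous case.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(load_iris().data)   # standardize -> correlation scale
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)

scores = fa.transform(X)        # row (observation) coordinates
loadings = fa.components_.T     # column (variable) coordinates: one arrow per variable

print("variable loadings (factor 1, factor 2):")
for name, load in zip(load_iris().feature_names, np.round(loadings, 2)):
    print(f"  {name}: {load}")
# A biplot overlays `scores` as points and `loadings` as arrows on the same axes.
```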
Open Access Article
A Copula-Based Model for Analyzing Bivariate Offense Data
by
Dimuthu Fernando and Wimarsha Jayanetti
Stats 2025, 8(4), 111; https://doi.org/10.3390/stats8040111 - 19 Nov 2025
Abstract
We developed a class of bivariate integer-valued time series models using copula theory. Each count time series is modeled as a Markov chain, with serial dependence characterized through copula-based transition probabilities for Poisson and Negative Binomial marginals. Cross-sectional dependence is modeled via a bivariate Gaussian copula, allowing for both positive and negative correlations and providing a flexible dependence structure. Model parameters are estimated using likelihood-based inference, where the bivariate Gaussian copula integral is evaluated through standard randomized Monte Carlo methods. The proposed approach is illustrated through an application to offense data from New South Wales, Australia, demonstrating its effectiveness in capturing complex dependence patterns.
(This article belongs to the Section Time Series Analysis)
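The cross-sectional dependence step can be illustrated with a short Gaussian-copula simulation: correlated standard normals are mapped to uniforms and then through Poisson quantile functions, giving correlated count marginals. The serial (Markov) dependence and the likelihood estimation with randomized Monte Carlo integration described in the abstract are not implemented in this sketch, and the parameter values are arbitrary.

```python
# Sketch: bivariate counts with Gaussian-copula cross-sectional dependence.
# Only the cross-dependence step; the paper also models serial (Markov) dependence
# and estimates parameters by likelihood with randomized Monte Carlo integration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
rho, lam1, lam2, T = 0.6, 3.0, 5.0, 1000

cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=T)
u = stats.norm.cdf(z)                                   # copula scale (uniforms)
y1 = stats.poisson.ppf(u[:, 0], mu=lam1).astype(int)    # Poisson marginal 1
y2 = stats.poisson.ppf(u[:, 1], mu=lam2).astype(int)    # Poisson marginal 2

print("sample means:", y1.mean(), y2.mean())
print("count correlation:", round(np.corrcoef(y1, y2)[0, 1], 2))
```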
Open Access Article
Prediction Inferences for Finite Population Totals Using Longitudinal Survey Data
by
Asokan M. Variyath and Brajendra C. Sutradhar
Stats 2025, 8(4), 110; https://doi.org/10.3390/stats8040110 - 18 Nov 2025
Abstract
In an infinite-/super-population (SP) setup, regression analysis of longitudinal data, which involves repeated responses and covariates collected from a sample of independent individuals or correlated individuals belonging to a cluster such as a household/family, has been intensively studied in the statistics literature over the last three decades. In general, a longitudinal correlation structure, such as an auto-correlation structure for the repeated responses of an individual or a two-way cluster–longitudinal correlation structure for repeated responses from individuals belonging to a cluster/household, is exploited to obtain consistent and efficient regression estimates. However, as opposed to the SP setup, a similar regression analysis for finite population (FP)-based longitudinal or clustered longitudinal data using a survey sample (SS) taken from the FP under a suitable sampling design becomes complex: it requires first defining the FP regression and correlation (both longitudinal and/or clustered) parameters and then estimating them using appropriate sampling weighted-design unbiased (SWDU) estimating equations. The finite sampling inferences, such as predictions of longitudinal changes in FP totals, become more complex still, as the non-sampled totals must be predicted after accommodating the longitudinal and/or clustered longitudinal correlation structures. Our objective in this paper is to deal with this complex FP prediction inference by developing a design cum model (DCM)-based estimation approach. Two competitive FP total predictors, namely the design-assisted model-based (DAMB) and design cum model-based (DCMB) predictors, are compared using an intensive simulation study. The regression and correlation parameters involved in these prediction functions are optimally estimated using the proposed DCM-based approach.
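The basic prediction idea behind such finite-population total predictors (add the observed sampled responses to model predictions for the non-sampled units) can be sketched for a single cross-section as below. The paper's DAMB and DCMB predictors additionally handle longitudinal/clustered correlation structures and sampling weights, none of which appear in this toy example.

```python
# Sketch: predicting a finite-population total as (sampled y's) + (model predictions
# for non-sampled units). A single-period toy; the paper's DAMB/DCMB predictors also
# accommodate longitudinal/clustered correlation and sampling weights.
import numpy as np

rng = np.random.default_rng(6)
N, n = 10000, 500                                   # finite population and sample sizes
x = rng.gamma(2.0, 2.0, size=N)                     # covariate known for the whole FP
y = 3.0 + 1.5 * x + rng.normal(scale=2.0, size=N)   # response observed only in the sample

sample = rng.choice(N, size=n, replace=False)
non_sample = np.setdiff1d(np.arange(N), sample)

# Fit a working regression model on the sample.
A = np.column_stack([np.ones(n), x[sample]])
coef, *_ = np.linalg.lstsq(A, y[sample], rcond=None)
y_hat_ns = np.column_stack([np.ones(len(non_sample)), x[non_sample]]) @ coef

total_pred = y[sample].sum() + y_hat_ns.sum()
print(f"predicted FP total: {total_pred:,.0f}   true FP total: {y.sum():,.0f}")
```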
Open Access Article
Maximum Likelihood and Calibrating Prior Prediction Reliability Bias Reference Charts
by
Stephen Jewson
Stats 2025, 8(4), 109; https://doi.org/10.3390/stats8040109 - 6 Nov 2025
Abstract
There are many studies in the scientific literature that present predictions from parametric statistical models based on maximum likelihood estimates of the unknown parameters. However, generating predictions from maximum likelihood parameter estimates ignores the uncertainty around the parameter estimates. As a result, predictive probability distributions based on maximum likelihood are typically too narrow, and simulation testing has shown that tail probabilities are underestimated compared to the relative frequencies of out-of-sample events. We refer to this underestimation as a reliability bias. Previous authors have shown that objective Bayesian methods can eliminate or reduce this bias if the prior is chosen appropriately. Such methods have been given the name calibrating prior prediction. We investigate maximum likelihood reliability bias in more detail. We then present reference charts that quantify the reliability bias for 18 commonly used statistical models, for both maximum likelihood prediction and calibrating prior prediction. The charts give results for a large number of combinations of sample size and nominal probability and contain orders of magnitude more information about the reliability biases in predictions from these methods than has previously been published. These charts serve two purposes. First, they can be used to evaluate the extent to which maximum likelihood predictions given in the scientific literature are affected by reliability bias. If the reliability bias is large, the predictions may need to be revised. Second, the charts can be used in the design of future studies to assess whether it is appropriate to use maximum likelihood prediction, whether it would be more appropriate to reduce the reliability bias by using calibrating prior prediction, or whether neither maximum likelihood prediction nor calibrating prior prediction gives an adequately low reliability bias.
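A tiny simulation conveys the reliability bias discussed above for one simple case, a normal model with a small sample: the maximum likelihood plug-in predictive quantile is exceeded out of sample more often than its nominal probability suggests. This is only a one-model illustration, not the paper's reference charts or the calibrating prior correction.

```python
# Sketch: maximum-likelihood "reliability bias" for a normal model. The ML plug-in
# predictive distribution ignores parameter uncertainty, so nominal tail probabilities
# are exceeded more often than claimed out of sample. (Not the paper's reference
# charts or the calibrating-prior correction.)
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, nominal, trials = 10, 0.99, 20000
exceed = 0
for _ in range(trials):
    sample = rng.normal(size=n)
    mu_hat, sd_hat = sample.mean(), sample.std(ddof=0)      # ML estimates
    q = stats.norm.ppf(nominal, loc=mu_hat, scale=sd_hat)   # plug-in 99% quantile
    exceed += rng.normal() > q                               # out-of-sample exceedance
print(f"nominal exceedance: {1 - nominal:.3f}  observed: {exceed / trials:.3f}")
```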
Open Access Article
Analysis of the Truncated XLindley Distribution Using Bayesian Robustness
by
Meriem Keddali, Hamida Talhi, Ali Slimani and Mohammed Amine Meraou
Stats 2025, 8(4), 108; https://doi.org/10.3390/stats8040108 - 5 Nov 2025
Abstract
In this work, we present a robust examination of the Bayesian estimators for the two-parameter upper truncated XLindley model, a unique Lindley model variant, based on the oscillation of posterior risks. We provide the model in a censored scheme along with its likelihood function. The topic of sensitivity and robustness analysis of Bayesian estimators has been covered by only a small number of authors, and as a result, very few applications have been developed in this field. The oscillation of the posterior risks of the Bayesian estimator is used to illustrate the method. Using a Monte Carlo simulation study, we show that, with the correct generalized loss function, a robust Bayesian estimator of the parameters corresponding to the smallest oscillation of the posterior risks may be obtained; robust estimators can be obtained when the parameter space is low-dimensional. The robustness and precision of Bayesian parameter estimation can be enhanced in regimes where the parameters of interest are of small magnitude.
(This article belongs to the Special Issue Robust Statistics in Action II)
Open Access Article
A High Dimensional Omnibus Regression Test
by
Ahlam M. Abid, Paul A. Quaye and David J. Olive
Stats 2025, 8(4), 107; https://doi.org/10.3390/stats8040107 - 5 Nov 2025
Cited by 1
Abstract
Consider regression models where the response variable Y only depends on the vector of predictors x through the sufficient predictor SP = α + βᵀx. Let the covariance vector η = Cov(x, Y). Assume the cases (x_i, Y_i) are independent and identically distributed random vectors for i = 1, ..., n. Then, for many such regression models, β = 0 if and only if η = 0, where 0 is the vector of zeroes. The test of H_0: β = 0 versus H_1: β ≠ 0 is thus equivalent to the high dimensional one sample test of H_0: η = 0 versus H_1: η ≠ 0 applied to (x_i − μ_x)(Y_i − μ_Y), where the expected values μ_x = E(x_i) and μ_Y = E(Y_i). Since μ_x and μ_Y are unknown, the test is implemented by applying the one sample test to (x_i − x̄)(Y_i − Ȳ) for i = 1, ..., n. This test has milder regularity conditions than its few competitors. For the multiple linear regression one component partial least squares and marginal maximum likelihood estimators, the test can be adapted to test whether the corresponding coefficient vector is zero.
(This article belongs to the Section Regression Models)
Open Access Article
A Multi-State Model for Lung Cancer Mortality in Survival Progression
by
Vinoth Raman, Sandra S. Ferreira, Dário Ferreira and Ayman Alzaatreh
Stats 2025, 8(4), 106; https://doi.org/10.3390/stats8040106 - 5 Nov 2025
Abstract
Lung cancer remains one of the leading causes of death worldwide due to its high rates of illness and mortality. In this study, we applied a continuous-time multi-state Markov model to examine how lung cancer progresses through six clinically defined stages, using retrospective data from 576 patients. The model describes movements between disease stages and the final stage (death), providing estimates of how long patients typically remain in each stage and how quickly they move to the next. It also considers important demographic and clinical factors such as age, smoking history, hypertension, asthma, and gender, which influence survival outcomes. Our findings show slower changes at the beginning of the disease but faster decline in later stages, with clear differences across patient groups. This approach highlights the dynamic course of the illness and can help guide tailored follow-up, personalized treatment, and health policy decisions. The study is based on a secondary analysis of publicly available data and therefore did not require clinical trial registration.
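A generic continuous-time multi-state Markov model can be sketched with a transition intensity matrix Q: transition probabilities over an interval t are given by the matrix exponential P(t) = exp(Qt), and mean sojourn times by the negative reciprocals of the diagonal of Q. The three-state intensity matrix below is made up for illustration; the study fits a six-stage model with demographic and clinical covariates to patient data.

```python
# Sketch: a generic continuous-time multi-state Markov model (3 states: stable,
# progressed, dead). The intensity matrix Q is made up; the study fits a six-stage
# model with covariates to patient data.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-0.10,  0.08, 0.02],   # rows sum to zero: transition intensities
              [ 0.00, -0.25, 0.25],
              [ 0.00,  0.00, 0.00]])  # death is absorbing

t = 12.0                               # months
P_t = expm(Q * t)                      # transition probability matrix P(t) = exp(Qt)
sojourn = -1.0 / np.diag(Q)[:2]        # mean time spent in each transient state

print("P(state at 12 months | start in state 1):", np.round(P_t[0], 3))
print("mean sojourn times (months):", np.round(sojourn, 1))
```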
Open Access Article
Silhouette-Based Evaluation of PCA, Isomap, and t-SNE on Linear and Nonlinear Data Structures
by
Mostafa Zahed and Maryam Skafyan
Stats 2025, 8(4), 105; https://doi.org/10.3390/stats8040105 - 3 Nov 2025
Abstract
Dimensionality reduction is fundamental for analyzing high-dimensional data, supporting visualization, denoising, and structure discovery. We present a systematic, large-scale benchmark of three widely used methods—Principal Component Analysis (PCA), Isometric Mapping (Isomap), and t-Distributed Stochastic Neighbor Embedding (t-SNE)—evaluated by average silhouette scores to quantify cluster preservation after embedding. Our full factorial simulation varies the sample size n, the noise variance, and the feature count p under four generative regimes: (1) a linear Gaussian mixture, (2) a linear Student-t mixture with heavy tails, (3) a nonlinear Swiss-roll manifold, and (4) a nonlinear concentric-spheres manifold, each replicated 1000 times per condition. Beyond empirical comparisons, we provide mathematical results that explain the observed rankings: under standard separation and sampling assumptions, PCA maximizes silhouettes for linear, low-rank structure, whereas Isomap dominates on smooth curved manifolds; t-SNE prioritizes local neighborhoods, yielding strong local separation but less reliable global geometry. Empirically, PCA consistently achieves the highest silhouettes for linear structure (Isomap second, t-SNE third); on manifolds the ordering reverses (Isomap > t-SNE > PCA). Increasing the noise variance and adding uninformative dimensions (larger p) degrade all methods, while larger n improves levels and stability. To our knowledge, this is the first integrated study combining a comprehensive factorial simulation across linear and nonlinear regimes with distribution-based summaries (density and violin plots) and supporting theory that predicts method orderings. The results offer clear, practice-oriented guidance: prefer PCA when structure is approximately linear; favor manifold learning—especially Isomap—when curvature is present; and use t-SNE for the exploratory visualization of local neighborhoods. Complete tables and replication materials are provided to facilitate method selection and reproducibility.
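A miniature of the comparison described above, using scikit-learn's PCA, Isomap, and TSNE together with silhouette_score on a single simulated Gaussian mixture; the full factorial design, the manifold regimes, and the supporting theory are not reproduced here.

```python
# Sketch: compare PCA, Isomap and t-SNE by silhouette score on one toy Gaussian
# mixture (a miniature of the paper's full factorial benchmark).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, TSNE
from sklearn.metrics import silhouette_score

X, labels = make_blobs(n_samples=600, n_features=20, centers=4,
                       cluster_std=2.0, random_state=0)

embeddings = {
    "PCA": PCA(n_components=2).fit_transform(X),
    "Isomap": Isomap(n_components=2, n_neighbors=10).fit_transform(X),
    "t-SNE": TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X),
}
for name, Z in embeddings.items():
    print(f"{name:7s} silhouette = {silhouette_score(Z, labels):.3f}")
```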
Open Access Article
Computational Testing Procedure for the Overall Lifetime Performance Index of Multi-Component Exponentially Distributed Products
by
Shu-Fei Wu and Chia-Chi Hsu
Stats 2025, 8(4), 104; https://doi.org/10.3390/stats8040104 - 2 Nov 2025
Abstract
In addition to products with a single component, this study examines products composed of multiple components whose lifetimes follow a one-parameter exponential distribution. An overall lifetime performance index is developed to assess products under the progressive type I interval censoring scheme. This study establishes the relationship between the overall and individual lifetime performance indices and derives the corresponding maximum likelihood estimators along with their asymptotic distributions. Based on the asymptotic distributions, the lower confidence bounds for all indices are also established. Furthermore, a hypothesis testing procedure is formulated to evaluate whether the overall lifetime performance index achieves the specified target level, utilizing the maximum likelihood estimator as the test statistic under a progressive type I interval censored sample. Moreover, a power analysis is carried out, and two numerical examples are presented to demonstrate the practical implementation for the overall lifetime performance index. This research can be applied to the fields of life testing and reliability analysis.
Open Access Article
A Nonparametric Monitoring Framework Based on Order Statistics and Multiple Scans: Advances and Applications in Ocean Engineering
by
Ioannis S. Triantafyllou
Stats 2025, 8(4), 103; https://doi.org/10.3390/stats8040103 - 1 Nov 2025
Abstract
In this work, we introduce a statistical framework for monitoring the performance of a breakwater structure in reducing wave impact. The proposed methodology aims to achieve diligent tracking of the underlying process and the swift detection of any potential malfunctions. The implementation of the new framework requires the construction of appropriate nonparametric Shewhart-type control charts, which rely on order statistics and scan-type decision criteria. The variance of the run length distribution of the proposed scheme is investigated, while the corresponding mean value is determined. For illustration purposes, we consider a real-life application, which aims at evaluating the effectiveness of a breakwater structure based on wave height reduction and wave energy dissipation.
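A minimal sketch of the general idea, assuming illustrative choices throughout: control limits taken as order statistics of a Phase I reference sample, and an alarm raised by a simple scan-type rule (at least k of the last m monitored points outside the limits). The specific order statistics, scan criterion, and run-length analysis of the proposed chart are not reproduced.

```python
# Sketch: a nonparametric Shewhart-type chart with order-statistic control limits and
# a simple "k of the last m" scan rule. Limits and scan parameters are illustrative,
# not the specific design proposed in the paper.
import numpy as np

rng = np.random.default_rng(8)
reference = np.sort(rng.normal(size=100))       # Phase I (in-control) wave-height data
lcl, ucl = reference[2], reference[-3]          # 3rd smallest / 3rd largest order statistics
k, m = 2, 4                                     # alarm if >= k of the last m points are outside

monitor = rng.normal(loc=1.0, size=50)          # Phase II data with a shift
outside = (monitor < lcl) | (monitor > ucl)
for t in range(m - 1, len(monitor)):
    if outside[t - m + 1 : t + 1].sum() >= k:
        print(f"alarm at observation {t + 1}")
        break
else:
    print("no alarm")
```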
Open Access Article
The Ridge-Hurdle Negative Binomial Regression Model: A Novel Solution for Zero-Inflated Counts in the Presence of Multicollinearity
by
HM Nayem and B. M. Golam Kibria
Stats 2025, 8(4), 102; https://doi.org/10.3390/stats8040102 - 1 Nov 2025
Abstract
Datasets with many zero outcomes are common in real-world studies and often exhibit overdispersion and strong correlations among predictors, creating challenges for standard count models. Traditional approaches such as the Zero-Inflated Poisson (ZIP), Zero-Inflated Negative Binomial (ZINB), and Hurdle models can handle extra zeros and overdispersion but struggle when multicollinearity is present. This study introduces the Ridge-Hurdle Negative Binomial model, which incorporates L2 regularization into the truncated count component of the hurdle framework to jointly address zero inflation, overdispersion, and multicollinearity. Monte Carlo simulations under varying sample sizes, predictor correlations, and levels of overdispersion and zero inflation show that Ridge-Hurdle NB consistently achieves the lowest mean squared error (MSE) compared to ZIP, ZINB, Hurdle Poisson, Hurdle Negative Binomial, Ridge ZIP, and Ridge ZINB models. Applications to the Wildlife Fish and Medical Care datasets further confirm its superior predictive performance, highlighting RHNB as a robust and efficient solution for complex count data modeling.
(This article belongs to the Section Statistical Methods)
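The two-part hurdle structure with L2 penalties can be sketched with off-the-shelf estimators: a penalized logistic regression for the zero hurdle, and a ridge-penalized Poisson fit on the positive counts standing in for the paper's L2-penalized truncated negative binomial component. The simulated data and penalty values are arbitrary assumptions.

```python
# Sketch: a two-part hurdle model with L2 penalties. The zero part is penalized
# logistic regression; the count part uses a ridge-penalized Poisson on the positive
# counts as a stand-in for the paper's L2-penalized truncated negative binomial.
import numpy as np
from sklearn.linear_model import LogisticRegression, PoissonRegressor

rng = np.random.default_rng(9)
n = 500
z = rng.normal(size=n)
X = np.column_stack([z, z + 0.05 * rng.normal(size=n), rng.normal(size=n)])  # collinear
p_zero = 1 / (1 + np.exp(-(0.5 - X[:, 0])))                     # probability of a structural zero
lam = np.exp(0.3 + 0.4 * X[:, 1])
y = np.where(rng.random(n) < p_zero, 0, rng.poisson(lam) + 1)   # positive part kept off zero

hurdle = LogisticRegression(C=1.0).fit(X, (y > 0).astype(int))  # L2-penalized zero hurdle
pos = y > 0
count = PoissonRegressor(alpha=1.0).fit(X[pos], y[pos])         # ridge-penalized count part

print("hurdle coefs:", np.round(hurdle.coef_.ravel(), 2))
print("count  coefs:", np.round(count.coef_, 2))
```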
Open Access Article
Robustness of the Trinormal ROC Surface Model: Formal Assessment via Goodness-of-Fit Testing
by
Christos Nakas
Stats 2025, 8(4), 101; https://doi.org/10.3390/stats8040101 - 17 Oct 2025
Abstract
Receiver operating characteristic (ROC) surfaces provide a natural extension of ROC curves to three-class diagnostic problems. A key summary index is the volume under the surface (VUS), representing the probability that a randomly chosen observation from each of the three ordered groups is correctly classified. Parametric estimation of the VUS typically assumes trinormality of the class distributions. However, a formal method for the verification of this composite assumption has not appeared in the literature. Our approach generalizes the two-class AUC-based GOF test of Zou et al. to the three-class setting by exploiting the parallel structure between empirical and trinormal VUS estimators. We propose a global goodness-of-fit (GOF) test for trinormal ROC models based on the difference between empirical and trinormal parametric estimates of the VUS. To improve stability, a probit transformation is applied and a bootstrap procedure is used to estimate the variance of the difference. The resulting test provides a formal diagnostic for assessing the adequacy of trinormal ROC modeling. Simulation studies illustrate the robustness of the assumption via the empirical size and power of the test under various distributional settings, including skewed and multimodal alternatives. The method’s application to COVID-19 antibody level data demonstrates its practical utility. Our findings suggest that the proposed GOF test is simple to implement, computationally feasible for moderate sample sizes, and a useful complement to existing ROC surface methodology.
(This article belongs to the Section Biostatistics)
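The empirical VUS referred to above is simply the proportion of triples, one observation from each ordered group, that are correctly ordered; its chance level is 1/6. The sketch below estimates it by brute force on simulated data; the trinormal parametric estimator and the bootstrap GOF test are not implemented.

```python
# Sketch: empirical volume under the ROC surface (VUS) for three ordered groups,
# i.e. the proportion of triples (x, y, z) with x < y < z. Chance level is 1/6.
# (The paper's trinormal estimator and bootstrap GOF test are not implemented here.)
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(0.0, 1.0, size=60)    # group 1 (e.g., healthy)
y = rng.normal(1.0, 1.0, size=60)    # group 2 (intermediate)
z = rng.normal(2.0, 1.0, size=60)    # group 3 (diseased)

# Count correctly ordered triples with broadcasting.
ordered = (x[:, None, None] < y[None, :, None]) & (y[None, :, None] < z[None, None, :])
vus = ordered.mean()
print(f"empirical VUS = {vus:.3f} (chance level = {1/6:.3f})")
```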
Open Access Technical Note
Synthetic Hydrograph Estimation for Ungauged Basins: Exploring the Role of Statistical Distributions
by
Dan Ianculescu and Cristian Gabriel Anghel
Stats 2025, 8(4), 100; https://doi.org/10.3390/stats8040100 - 17 Oct 2025
Abstract
The use of probability distribution functions in deriving synthetic hydrographs has become a robust method for modeling the response of watersheds to precipitation events. This approach leverages statistical distributions to capture the temporal structure of runoff processes, providing a flexible framework for estimating peak discharge, time to peak, and hydrograph shape. The present study explores the application of various probability distributions in constructing synthetic hydrographs. The research evaluates parameter estimation techniques, analyzing their influence on hydrograph accuracy. The results highlight the strengths and limitations of each distribution in capturing key hydrological characteristics, offering insights into the suitability of certain probability distribution functions under varying watershed conditions. The study concludes that the approach based on the Cadariu rational function enhances the adaptability and precision of synthetic hydrograph models, thereby supporting flood forecasting and watershed management.
(This article belongs to the Special Issue Robust Statistics in Action II)
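One common distribution-based construction can be sketched directly: use a gamma density as the dimensionless hydrograph shape and rescale it to a chosen time to peak and peak discharge. The shape parameter and scaling below are illustrative assumptions; the distributions compared in the study and the Cadariu rational-function approach are not reproduced.

```python
# Sketch: a synthetic unit hydrograph from a gamma density, rescaled to a target
# time to peak and peak discharge. A common distribution-based shape only; the
# study's distribution comparison and Cadariu rational-function approach are not shown.
import numpy as np
from scipy import stats

shape = 3.0                       # controls hydrograph skewness
t_peak, q_peak = 6.0, 120.0       # time to peak (h), peak discharge (m^3/s)

t = np.linspace(0, 48, 200)       # time axis (h)
pdf = stats.gamma.pdf(t, a=shape, scale=t_peak / (shape - 1))  # gamma mode = (a-1)*scale
q = q_peak * pdf / pdf.max()      # rescale so the maximum equals the peak discharge

print("time to peak (h):", round(t[np.argmax(q)], 1))
print("peak discharge (m^3/s):", round(q.max(), 1))
```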
Topics
Topic in JPM, Mathematics, Applied Sciences, Stats, Healthcare
Application of Biostatistics in Medical Sciences and Global Health
Topic Editors: Bogdan Oancea, Adrian Pană, Cǎtǎlina Liliana Andrei
Deadline: 31 October 2026
Special Issues
Special Issue in Stats
Advances in Machine Learning, High-Dimensional Inference, Shrinkage Estimation, and Model Validation
Guest Editor: B. M. Golam Kibria
Deadline: 25 March 2026
Special Issue in Stats
Nonparametric Inference: Methods and Applications
Guest Editor: Stefano Bonnini
Deadline: 28 April 2026
Special Issue in Stats
Benford's Law(s) and Applications (Second Edition)
Guest Editors: Roy Cerqueti, Claudio Lupi
Deadline: 30 June 2026
Special Issue in Stats
Extreme Weather Modeling and Forecasting
Guest Editor: Wei Zhu
Deadline: 30 July 2026