Search Results (86)

Search Parameters:
Keywords = zero-inflated Poisson model

18 pages, 1357 KB  
Article
Zero-Inflated Data Analysis Using Graph Neural Networks with Convolution
by Sunghae Jun
Computers 2026, 15(2), 104; https://doi.org/10.3390/computers15020104 - 2 Feb 2026
Viewed by 309
Abstract
Zero-inflated count data are characterized by an excessive frequency of zeros that cannot be adequately analyzed by a single distribution, such as Poisson or negative binomial. This problem is pervasive in many practical applications, including document–keyword matrices derived from text corpora, where most keyword frequencies are zero. Conventional statistical approaches, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, explicitly separate a structural zero component from a count component, but they typically assume independent observations and can be unstable when covariates are high-dimensional and sparse. To address these limitations, this paper proposes a graph-based zero-inflated learning framework that combines simple graph convolution (SGC) with zero-inflated count regression heads such as ZIP and ZINB. We first construct an observation graph by connecting similar samples, and then apply SGC to propagate and smooth features over the graph, producing convolutional representations that incorporate neighborhood information while remaining computationally lightweight. The resulting representations are used as covariates in ZIP and ZINB heads, which preserve probabilistic interpretability through maximum likelihood learning. Our experiments on simulated zero-inflated datasets with controlled zero ratios demonstrate that the proposed ZIP+SGC and ZINB+SGC consistently reduce prediction errors compared with their non-graph baselines, as measured by mean absolute error and root mean squared error. Overall, the proposed approach provides an efficient and interpretable way to integrate graph neural computation with zero-inflated modeling for sparse count prediction problems.
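Editorial note: the separation of a structural zero component from a count component, which the ZIP models in these results rely on, can be sketched as a log-likelihood. This is a minimal illustration under assumptions, not the paper's implementation: the names `zip_log_pmf` and `zip_log_likelihood` are hypothetical, and `pi` (structural-zero probability) and `lam` (Poisson mean) are treated as constants rather than functions of covariates.

```python
import math

def zip_log_pmf(y, pi, lam):
    """Log-probability of count y under a zero-inflated Poisson model.

    Assumes 0 <= pi < 1 (probability of a structural zero) and lam > 0.
    """
    if y == 0:
        # A zero is either structural or an ordinary Poisson zero.
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    # A positive count can only come from the Poisson component.
    return math.log(1.0 - pi) - lam + y * math.log(lam) - math.lgamma(y + 1)

def zip_log_likelihood(counts, pi, lam):
    """Sum of ZIP log-probabilities over independent observations."""
    return sum(zip_log_pmf(y, pi, lam) for y in counts)

# Illustrative data only: many zeros plus a few small counts.
print(zip_log_likelihood([0, 0, 0, 1, 2, 0, 3], pi=0.4, lam=1.5))
```

In a regression setting, `pi` and `lam` would typically be linked to covariates (for example through logit and log links) and the summed log-likelihood maximized numerically.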

16 pages, 336 KB  
Article
Bayesian Neural Networks with Regularization for Sparse Zero-Inflated Data Modeling
by Sunghae Jun
Information 2026, 17(1), 81; https://doi.org/10.3390/info17010081 - 13 Jan 2026
Viewed by 401
Abstract
Zero inflation is pervasive across text mining, event log, and sensor analytics, and it often degrades the predictive performance of analytical models. Classical approaches, most notably the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, address excess zeros but rely on rigid parametric assumptions and fixed model structures, which can limit flexibility in high-dimensional, sparse settings. We propose a Bayesian neural network (BNN) with regularization for sparse zero-inflated data modeling. The method separately parameterizes the zero inflation probability and the count intensity under ZIP/ZINB likelihoods, while employing Bayesian regularization to induce sparsity and control overfitting. Posterior inference is performed using variational inference. We evaluate the approach through controlled simulations with varying zero ratios and a real-world dataset, and we compare it against Poisson generalized linear models, ZIP, and ZINB baselines. The present study focuses on predictive performance measured by mean squared error (MSE). Across all settings, the proposed method achieves consistently lower prediction error and improved uncertainty problems, with ablation studies confirming the contribution of the regularization components. These results demonstrate that a regularized BNN provides a flexible and robust framework for sparse zero-inflated data analysis in information-rich environments.
(This article belongs to the Special Issue Feature Papers in Information in 2024–2025)

16 pages, 2700 KB  
Article
Spatio-Temporal Distribution of Setipinna taty Resources Using a Zero-Inflated Model in the Offshore Waters of Southern Zhejiang, China
by Xiaoxue Liu, Wen Ma, Jin Ma, Chunxia Gao, Weifeng Chen and Jing Zhao
J. Mar. Sci. Eng. 2026, 14(1), 96; https://doi.org/10.3390/jmse14010096 - 3 Jan 2026
Viewed by 347
Abstract
Effective fishery management in coastal waters requires accurate assessments of species–environment relationships, particularly in data-rich but zero-inflated contexts (i.e., datasets with an excess of zero catches). Here, we used fishery-independent trawl survey data collected from 2018 to 2019 in the offshore waters of southern Zhejiang Province of China to investigate the spatio-temporal distribution of Setipinna taty (scaly hairfin anchovy) and its environmental determinants. Given the high frequency of zero catches, we fitted both zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models and selected the best-performing approach using the Akaike information criterion (AIC). Cross-validation indicated that the ZINB model (RMSE: 199.1, R²: 0.25) outperformed the ZIP model (RMSE: 239.4, R²: 0.23). Temperature, depth, and salinity were key predictors of S. taty abundance, which generally occurred at depths of 20–40 m and salinities of 26–34 psu. We then applied the optimal ZINB model to predict S. taty distributions in spring, summer, and autumn of 2020. The predictions indicated a summer peak in abundance and a nearshore-to-offshore decreasing gradient, and were broadly consistent with the spatial distribution trends observed in the 2020 survey data. The highest predicted densities were located in nearshore areas off Wenzhou and Taizhou, west of 122° E. By clarifying the key environmental factors shaping S. taty distribution and applying zero-inflated count models to account for an excess of zero catches, which occur more frequently than expected under standard negative binomial models, this study provides an improved basis for effective conservation and sustainable utilization of S. taty resources in the southern offshore waters of Zhejiang; nevertheless, predictive performance could be further improved by incorporating additional environmental and biotic covariates together with extended spatio-temporal data.
(This article belongs to the Section Marine Ecology)
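Editorial note: the AIC-based selection between ZIP and ZINB mentioned in this abstract can be sketched as follows. The log-likelihood values and parameter counts below are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical fits: (maximized log-likelihood, number of parameters).
candidates = {"ZIP": (-1250.0, 8), "ZINB": (-1100.0, 9)}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores[best])
```

The extra dispersion parameter of the negative binomial costs one unit of `k`, so ZINB is preferred only when its log-likelihood gain outweighs that penalty.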

30 pages, 539 KB  
Article
Symmetric Discrete Distributions on the Integer Line: A Versatile Family and Applications
by Lamia Alyami, Hugo S. Salinas, Hassan S. Bakouch, Maher Kachour, Amira F. Daghestani and Sudeep R. Bapat
Symmetry 2025, 17(12), 2148; https://doi.org/10.3390/sym17122148 - 13 Dec 2025
Viewed by 389
Abstract
We introduce the Symmetric-Z (Sy-Z) family, a unified class of symmetric discrete distributions on the integers obtained by multiplying a three-point symmetric sign variable by an independent non-negative integer-valued magnitude. This sign-magnitude construction yields interpretable, zero-centered models with tunable mass at zero and dispersion balanced across signs, making them suitable for outcomes such as differences of counts or discretized return increments. We derive general distributional properties, including closed-form expressions for the probability mass and cumulative distribution functions, bilateral generating functions, and even moments, and show that the tail behavior is inherited from the magnitude component. A characterization by symmetry and sign–magnitude independence is established and a distinctive operational feature is proved: for independent members of the family, the sum and the difference have the same distribution. As a central example, we study the symmetric Poisson model, providing measures of skewness, kurtosis, and entropy, together with estimation via the method of moments and maximum likelihood. Simulation studies assess finite-sample performance of the estimators, and applications to datasets from finance and education show improved goodness-of-fit relative to established integer-valued competitors. Overall, the Sy-Z framework offers a mathematically tractable and interpretable basis for modeling symmetric integer-valued outcomes across diverse domains.

25 pages, 2764 KB  
Article
Integrated Quality Inspection and Production Run Optimization for Imperfect Production Systems with Zero-Inflated Non-Homogeneous Poisson Deterioration
by Chih-Chiang Fang and Ming-Nan Chen
Mathematics 2025, 13(24), 3901; https://doi.org/10.3390/math13243901 - 5 Dec 2025
Cited by 1 | Viewed by 450
Abstract
This study develops an integrated quality inspection and production optimization framework for an imperfect production system, where system deterioration follows a zero-inflated non-homogeneous Poisson process (ZI-NHPP) characterized by a power-law intensity function. Parameters are estimated from historical data using the Expectation-Maximization (EM) algorithm, with a zero-inflation parameter π modeling scenarios in which the system remains defect-free. Operating in either an in-control or out-of-control state, the system produces products with Weibull hazard rates, exhibiting higher failure rates in the out-of-control state. The proposed model integrates system status, defect rates, employee efficiency, and market demand to jointly optimize the number of conforming items inspected and the production run length, thereby minimizing total costs, including production, inspection, correction, inventory, and warranty expenses. Numerical analyses, supported by sensitivity studies, validate the effectiveness of this integrated approach in achieving cost-efficient quality control. This framework enhances quality assurance and production management, offering practical insights for manufacturing across diverse industries.
(This article belongs to the Section C: Mathematical Analysis)

16 pages, 522 KB  
Article
Zero-Inflated Text Data Analysis Using Imbalanced Data Sampling and Statistical Models
by Sunghae Jun
Computers 2025, 14(12), 527; https://doi.org/10.3390/computers14120527 - 2 Dec 2025
Viewed by 528
Abstract
Text data often exhibits high sparsity and zero inflation, where a substantial proportion of entries in the document–keyword matrix are zeros. This characteristic presents challenges to traditional count-based models, which may suffer from reduced predictive accuracy and interpretability in the presence of excessive zeros and overdispersion. To overcome this issue, we propose an effective analytical framework that integrates imbalanced data handling by undersampling with classical probabilistic count models. Specifically, we apply Poisson generalized linear models, zero-inflated Poisson, and zero-inflated negative binomial models to analyze zero-inflated text data while preserving the statistical interpretability of term-level counts. The framework is evaluated using both real-world patent documents and simulated datasets. Empirical results demonstrate that our undersampling-based approach improves the model fit without modifying the downstream models. This study contributes a practical preprocessing strategy for enhancing zero-inflated text analysis and offers insights into model selection and data balancing techniques for sparse count data.

30 pages, 1354 KB  
Article
Driving Behavior and Insurance Pricing: A Framework for Analysis and Some Evidence from Italian Data Using Zero-Inflated Poisson (ZIP) Models
by Paola Fersini, Michele Longo and Giuseppe Melisi
Risks 2025, 13(11), 214; https://doi.org/10.3390/risks13110214 - 3 Nov 2025
Viewed by 2737
Abstract
Usage-Based Insurance (UBI), also referred to as telematics-based insurance, has been experiencing a growing global diffusion. In addition to being well established in countries such as Italy, the United States, and the United Kingdom, UBI adoption is also accelerating in emerging markets such as Japan, South Africa, and Brazil. In Japan, telematics insurance has shown significant growth in recent years, with a steadily increasing subscription rate. In South Africa, UBI adoption ranks among the highest worldwide, with market penetration placing the country among the top three globally, just after the United States and Italy. In Brazil, UBI adoption is expanding, supported by government initiatives promoting road safety and innovation in the insurance sector. According to a MarketsandMarkets report of February 2025, the global UBI market is expected to grow from USD 43.38 billion in 2023 to USD 70.46 billion by 2030, with a compound annual growth rate (CAGR) of 7.2% over the forecast period. This growth is driven by the increasing adoption of both electric and internal combustion vehicles equipped with integrated telematics systems, which enable insurers to collect data on driving behavior and to tailor insurance premiums accordingly. In this paper, we analyze a large dataset consisting of trips recorded over five years from 100,000 policyholders across the Italian territory through the installation of black-box devices. Using univariate and multivariate statistical analyses, as well as Generalized Linear Models (GLMs) with Zero-Inflated Poisson distribution, we examine claims frequency and assess the relevance of various synthetic indicators of driving behavior, with the aim of identifying those that are most significant for insurance pricing.
(This article belongs to the Special Issue Innovations in Non-Life Insurance Pricing and Reserving)
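Editorial note: the growth figures quoted in this abstract can be checked with the standard compound annual growth rate formula. The sketch assumes seven compounding years (2023 to 2030), which is an interpretation of the forecast period rather than something the abstract states; the function name `cagr` is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# USD billions, as quoted in the abstract; 7 compounding years is an assumption.
rate = cagr(43.38, 70.46, 2030 - 2023)
print(f"{rate:.1%}")
```

Under that assumption the computed rate rounds to the 7.2% reported in the abstract.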

21 pages, 1332 KB  
Article
The Ridge-Hurdle Negative Binomial Regression Model: A Novel Solution for Zero-Inflated Counts in the Presence of Multicollinearity
by HM Nayem and B. M. Golam Kibria
Stats 2025, 8(4), 102; https://doi.org/10.3390/stats8040102 - 1 Nov 2025
Viewed by 1590
Abstract
Datasets with many zero outcomes are common in real-world studies and often exhibit overdispersion and strong correlations among predictors, creating challenges for standard count models. Traditional approaches such as the Zero-Inflated Poisson (ZIP), Zero-Inflated Negative Binomial (ZINB), and Hurdle models can handle extra zeros and overdispersion but struggle when multicollinearity is present. This study introduces the Ridge-Hurdle Negative Binomial model, which incorporates L2 regularization into the truncated count component of the hurdle framework to jointly address zero inflation, overdispersion, and multicollinearity. Monte Carlo simulations under varying sample sizes, predictor correlations, and levels of overdispersion and zero inflation show that Ridge-Hurdle NB consistently achieves the lowest mean squared error (MSE) compared to ZIP, ZINB, Hurdle Poisson, Hurdle Negative Binomial, Ridge ZIP, and Ridge ZINB models. Applications to the Wildlife Fish and Medical Care datasets further confirm its superior predictive performance, highlighting RHNB as a robust and efficient solution for complex count data modeling.
(This article belongs to the Section Statistical Methods)
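Editorial note: the hurdle structure referred to in this abstract differs from a zero-inflated model in that all zeros come from a single binary component, while positive counts follow a zero-truncated distribution. The sketch below uses a Poisson count component for brevity, whereas the paper uses a negative binomial; `hurdle_poisson_pmf` and `ridge_penalty` are hypothetical names, and the penalty is a generic L2 term, not the authors' estimator.

```python
import math

def poisson_pmf(y, lam):
    """Ordinary Poisson probability mass at y, for lam > 0."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def hurdle_poisson_pmf(y, p_zero, lam):
    """Hurdle model: zeros from a binary part, positives from a truncated Poisson."""
    if y == 0:
        return p_zero
    # Renormalize the Poisson over y >= 1 (zero-truncated count component).
    return (1.0 - p_zero) * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))

def ridge_penalty(coefficients, k):
    """Generic L2 penalty k * sum(beta_j ** 2), added to a negative log-likelihood."""
    return k * sum(b * b for b in coefficients)

print(hurdle_poisson_pmf(0, 0.6, 2.0), hurdle_poisson_pmf(1, 0.6, 2.0))
```

Because the zero and count parts factorize, a penalty can be applied to the count-component coefficients alone, which matches the abstract's description of regularizing only the truncated count component.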

17 pages, 3465 KB  
Article
Longitudinal Gut Microbiome Changes Associated with Transitions from C. difficile Negative to C. difficile Positive on Surveillance Tests
by L. Silvia Munoz-Price, Samantha N. Atkinson, Vy Lam, Blake Buchan, Nathan Ledeboer, Nita H. Salzman and Amy Y. Pan
Microorganisms 2025, 13(10), 2277; https://doi.org/10.3390/microorganisms13102277 - 29 Sep 2025
Viewed by 810
Abstract
Clostridioides difficile is an obligate anaerobe and is primarily transmitted via the fecal–oral route. Data characterizing the microbiome changes accompanying transitions from non-colonized to C. difficile colonized subjects are currently lacking. In this retrospective cohort study, we examined 16S rRNA gene sequencing data in a total of 481 fecal samples belonging to 107 patients. Based on C. difficile status over time, patients were categorized as Negative-to-Positive, Negative Control, and Positive Control. A linear mixed effects model was fitted to investigate the changes in the Shannon α-diversity index over time. Zero-inflated negative binomial/Poisson mixed effects models or generalized linear mixed models with negative binomial/Poisson distribution were used to investigate the changes in taxon counts over time among different groups. A total of 107 patients were eligible for the study. The median number of stool samples per patient was 3 (IQR 2–4). A total of 42 patients transitioned from C. difficile negative to positive (Negative-to-Positive), 47 patients remained negative throughout their tests (Negative Control) and 18 were always C. difficile positive (Positive Control). A significant difference in microbiome composition between the last negative samples and the first positive samples was shown in Negative-to-Positive patients, ANOSIM p = 0.022. In Negative-to-Positive patients, the phylum Pseudomonadota and family Enterobacteriaceae increased significantly in the first positive samples compared to the last negative samples, p = 0.0075 and p = 0.0094, respectively. Within the first 21 days, Actinomycetota decreased significantly over time in the Positive Control group compared to the other two groups (p < 0.001) while Bacillota decreased in both the Negative-to-Positive group and Positive Control. These results demonstrate that the transition from C. difficile negative to C. difficile positive is associated with alterations in gut microbial communities and their compositional patterns over time. Moreover, these changes play an important role in both the emergence and intensification of the gut microbiome dysbiosis in patients who transitioned from C. difficile negative to positive and those who always tested positive.
(This article belongs to the Special Issue The Microbiome in Ecosystems)

32 pages, 1288 KB  
Article
Random Forest Adaptation for High-Dimensional Count Regression
by Oyebayo Ridwan Olaniran, Saidat Fehintola Olaniran, Ali Rashash R. Alzahrani, Nada MohammedSaeed Alharbi and Asma Ahmad Alzahrani
Mathematics 2025, 13(18), 3041; https://doi.org/10.3390/math13183041 - 21 Sep 2025
Cited by 2 | Viewed by 1484
Abstract
The analysis of high-dimensional count data presents a unique set of challenges, including overdispersion, zero-inflation, and complex nonlinear relationships that traditional generalized linear models and standard machine learning approaches often fail to adequately address. This study introduces and validates a novel Random Forest framework specifically developed for high-dimensional Poisson and Negative Binomial regression, designed to overcome the limitations of existing methods. Through comprehensive simulations and a real-world genomic application to the Norwegian Mother and Child Cohort Study, we demonstrate that the proposed methods achieve superior predictive accuracy, quantified by lower root mean squared error and deviance, and, critically, produce exceptionally stable and interpretable feature selections. Our theoretical and empirical results show that these distribution-optimized ensembles significantly outperform both penalized-likelihood techniques and naive-transformation-based ensembles in balancing statistical robustness with biological interpretability. The study concludes that the proposed frameworks provide a crucial methodological advancement, offering a powerful and reliable tool for extracting meaningful insights from complex count data in fields ranging from genomics to public health.
(This article belongs to the Special Issue Statistics for High-Dimensional Data)

23 pages, 575 KB  
Article
A Comparison of the Robust Zero-Inflated and Hurdle Models with an Application to Maternal Mortality
by Phelo Pitsha, Raymond T. Chiruka and Chioneso S. Marange
Math. Comput. Appl. 2025, 30(5), 95; https://doi.org/10.3390/mca30050095 - 2 Sep 2025
Cited by 1 | Viewed by 2544
Abstract
This study evaluates the performance of count regression models in the presence of zero inflation, outliers, and overdispersion using both simulated data and a real-world maternal mortality dataset. Traditional Poisson and negative binomial regression models often struggle to account for the complexities introduced by excess zeros and outliers. To address these limitations, this study compares the performance of robust zero-inflated (RZI) and robust hurdle (RH) models against conventional models using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) to determine the best-fitting model. Results indicate that the robust zero-inflated Poisson (RZIP) model performs best overall. The simulation study considers various scenarios, including different levels of zero inflation (50%, 70%, and 80%), outlier proportions (0%, 5%, 10%, and 15%), dispersion values (1, 3, and 5), and sample sizes (50, 200, and 500). Based on AIC comparisons, the RZIP and robust hurdle Poisson (RHP) models demonstrate superior performance when outliers are absent or limited to 5%, particularly when dispersion is low (5). However, as outlier levels and dispersion increase, the robust zero-inflated negative binomial (RZINB) and robust hurdle negative binomial (RHNB) models outperform RZIP and RHP across all levels of zero inflation and sample sizes considered in the study.

15 pages, 358 KB  
Article
Multi-Task CNN-LSTM Modeling of Zero-Inflated Count and Time-to-Event Outcomes for Causal Inference with Functional Representation of Features
by Jong-Min Kim
Axioms 2025, 14(8), 626; https://doi.org/10.3390/axioms14080626 - 11 Aug 2025
Cited by 1 | Viewed by 1393
Abstract
We propose a novel deep learning framework for counterfactual inference on the COMPAS dataset, utilizing a multi-task CNN-LSTM architecture. The model jointly predicts multiple outcome types: (i) count outcomes with zero inflation, modeled using zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), and negative binomial (NB) distributions; (ii) time-to-event outcomes, modeled via the Cox proportional hazards model. To effectively leverage the structure in high-dimensional tabular data, we integrate functional data analysis (FDA) techniques by transforming covariates into smooth functional representations using B-spline basis expansions. Specifically, we construct a pseudo-temporal index over predictor variables and fit basis expansions to each subject’s feature vector, yielding a low-dimensional set of coefficients that preserve smooth variation while reducing noise. This functional representation enables the CNN-LSTM model to capture both local and global temporal patterns in the data, including treatment-covariate interactions. Our approach estimates both population-average and individual-level treatment effects (ATE and CATE) for each outcome and evaluates predictive performance using metrics such as Poisson deviance, root mean squared error (RMSE), and the concordance index (C-index). Statistical inference on treatment effects is supported via bootstrap-based confidence intervals and hypothesis testing. Overall, this comprehensive framework facilitates flexible modeling of heterogeneous treatment effects in structured, high-dimensional data, advancing causal inference methodologies in criminal justice and related domains.
(This article belongs to the Special Issue Functional Data Analysis and Its Application)

19 pages, 539 KB  
Article
Maximum-Likelihood Estimation for the Zero-Inflated Polynomial-Adjusted Poisson Distribution
by Jong-Seung Lee and Hyung-Tae Ha
Mathematics 2025, 13(15), 2383; https://doi.org/10.3390/math13152383 - 24 Jul 2025
Viewed by 879
Abstract
We propose the zero-inflated Polynomially Adjusted Poisson (zPAP) model. It extends the usual zero-inflated Poisson by multiplying the Poisson kernel with a nonnegative polynomial, enabling the model to handle extra zeros, overdispersion, skewness, and even multimodal counts. We derive the maximum-likelihood framework, including the log-likelihood and score equations under both general and regression settings, and fit zPAP to the zero-inflated, highly dispersed Fish Catch data as well as a synthetic bimodal mixture. In both cases, zPAP not only outperforms the standard zero-inflated Poisson model but also yields reliable inference via parametric bootstrap confidence intervals. Overall, zPAP is a clear and tractable tool for real-world count data with complex features.
(This article belongs to the Special Issue Statistical Theory and Application, 2nd Edition)

17 pages, 343 KB  
Article
On the Conflation of Poisson and Logarithmic Distributions with Applications
by Abdulhamid A. Alzaid, Anfal A. Alqefari and Najla Qarmalah
Axioms 2025, 14(7), 518; https://doi.org/10.3390/axioms14070518 - 6 Jul 2025
Cited by 1 | Viewed by 915
Abstract
Real-life count data frequently show inflation at low values; however, most of the well-known count distributions cannot capture such a feature. The present paper introduces a new distribution for modeling count data inflated at small values, based on a conflation of distributions approach. The new distribution inherits some properties from the Poisson distribution (PD) and the logarithmic distribution (LD), making it a powerful modeling tool. It can serve as an alternative to PD, LD, and zero-truncated distributions. The new distribution is worth considering theoretically, as it belongs to the weighted PD family. With zero as a support point, two additional models are suggested for the new distribution. These modifications yield distributions that exhibit overdispersion comparable to the negative binomial distribution (NBD) while retaining essential PD properties, making them suitable for accurately representing count data with frequent events of low frequency and high variance. Furthermore, we discuss the superior performance of the three new distributions in modeling real count data compared to traditional count distributions such as PD and NBD, as well as other discrete distributions. This paper examines the key statistical properties of the proposed distributions. A comparison of the novel distributions with others in the literature is shown using real-life data from several domains. All of the computations shown in this study are generated using the R programming language.
(This article belongs to the Special Issue Advances in the Theory and Applications of Statistical Distributions)

24 pages, 347 KB  
Article
Estimating the Ratio of Means in a Zero-Inflated Poisson Mixture Model
by Michael Pearce and Michael D. Perlman
Stats 2025, 8(3), 55; https://doi.org/10.3390/stats8030055 - 5 Jul 2025
Viewed by 626
Abstract
The problem of estimating the ratio of the means of a two-component Poisson mixture model is considered, when each component is subject to zero-inflation, i.e., excess zero counts. The resulting zero-inflated Poisson mixture (ZIPM) model can be viewed as a three-component Poisson mixture model with one degenerate component. The EM algorithm is applied to obtain frequentist estimators and their standard errors, the latter determined via an explicit expression for the observed information matrix. As an intermediate step, we derive an explicit expression for standard errors in the two-component Poisson mixture model (without zero-inflation), a new result. The ZIPM model is applied to simulated data and real ecological count data of frigatebirds on the Coral Sea Islands off the coast of Northeast Australia.
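Editorial note: the EM approach named in this abstract can be illustrated on the simplest related case, a single zero-inflated Poisson with no covariates. This is a deliberately reduced sketch, not the ZIPM estimator from the paper: `fit_zip_em` is a hypothetical name, the starting values are arbitrary, and standard errors are not computed. It assumes at least one positive count in the data.

```python
import math

def fit_zip_em(counts, n_iter=500, tol=1e-10):
    """EM estimates (pi, lam) for an intercept-only zero-inflated Poisson."""
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    pi, lam = 0.5 * n_zero / n, max(total / n, 1e-8)  # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        z = pi / (pi + (1.0 - pi) * math.exp(-lam))
        expected_structural = n_zero * z
        # M-step: closed-form updates given the expected memberships.
        new_pi = expected_structural / n
        new_lam = total / (n - expected_structural)
        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam

# Illustrative data only: 60 zeros and 40 small positive counts.
data = [0] * 60 + [1] * 10 + [2] * 15 + [3] * 10 + [4] * 5
pi_hat, lam_hat = fit_zip_em(data)
print(pi_hat, lam_hat)
```

At convergence the fitted mean `(1 - pi_hat) * lam_hat` equals the sample mean and the fitted zero probability equals the observed zero fraction, which gives a quick sanity check on the estimates.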
