Special Issue "Probability, Statistics and Their Applications"

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Probability and Statistics Theory".

Deadline for manuscript submissions: 31 July 2020.

Special Issue Editor

Prof. Dr. Vasile Preda

Guest Editor
“Gheorghe Mihoc-Caius Iacob” Institute of Mathematical Statistics and Applied Mathematics, Bucharest, Romania
“Costin C. Kiritescu” National Institute of Economic Research, Bucharest, Romania
Faculty of Mathematics and Computer Science, University of Bucharest, Romania
Interests: statistics; decision theory; operational research; variational inequalities; equilibrium theory; generalized convexity; information theory; biostatistics; actuarial statistics

Special Issue Information

Dear Colleagues,

Statistics and probability are important domains in the scientific world, with applications in many fields, such as engineering, reliability, medicine, biology, economics, and physics, among others. Probability laws provide an estimated image of the world we live in. This Special Issue targets certain directions within the two domains, as described below.

Some applications of statistics are the clustering of random variables based on simulated and real data, and scan statistics, the latter introduced in 1963 by Joseph Naus. In reliability theory, some important statistical tools are hazard rate and survival functions, order statistics, and stochastic orders. In physics, the concept of entropy is central, and specialized fields such as statistical mechanics and Tsallis statistics have been introduced and developed.

Statistics, mathematics, and economics together formed a distinct domain called econometrics. ARMA models, linear regressions, income analysis, and stochastic processes are discussed and analyzed in the context of real economic processes. Other important tools are Lorenz curves and broken stick models.

Theoretical results, such as the modeling of the discretization of random variables and the estimation of parameters of new and existing statistical models, are welcome; heavy-tailed distributions are an important class of probability laws in this respect. In recent years, many distributions, along with their properties, have been introduced in order to better fit the growing volume of available data.

The purpose of this Special Issue is to provide a collection of articles that reflect the importance of statistics and probability in applied scientific domains. Papers providing theoretical methodologies and applications in statistics are welcome.

Prof. Vasile Preda
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1200 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Applied and theoretical statistics
  • New probability distributions and estimation methods
  • Broken stick models
  • Lorenz curve
  • Scan statistics
  • Discretization of random variables
  • Clustering of random variables

Published Papers (13 papers)

Research

Open Access Article
Estimation of Beta-Pareto Distribution Based on Several Optimization Methods
Mathematics 2020, 8(7), 1055; https://doi.org/10.3390/math8071055 - 01 Jul 2020
Abstract
This paper is concerned with the maximum likelihood estimators of the Beta-Pareto distribution introduced in Akinsete et al. (2008), which comes from the mixing of two probability distributions, Beta and Pareto. Since these estimators cannot be obtained explicitly, we use nonlinear optimization methods that provide them numerically. The methods we investigate are the Newton-Raphson method, the gradient method, and the conjugate gradient method; for the latter we use the Fletcher-Reeves variant. The corresponding algorithms are developed, and the performance of the methods is confirmed by an extensive simulation study. In order to compare several competing models, namely generalized Beta-Pareto, Beta, Pareto, Gamma and Beta-Pareto, model selection criteria are used. We first consider completely observed data; we then assume the observations to be right censored and derive the same type of results.
(This article belongs to the Special Issue Probability, Statistics and Their Applications)
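
The Newton-Raphson idea mentioned in the abstract above can be illustrated on a simpler model. The sketch below is not the authors' algorithm and does not fit the Beta-Pareto distribution; as a stand-in it uses the two-parameter Weibull distribution, whose shape parameter also has no closed-form maximum likelihood estimator. The function names, starting value, tolerance, and simulated data are all assumptions made for illustration.

```python
import math
import random


def weibull_shape_mle(xs, k0=1.0, tol=1e-10, max_iter=100):
    """Newton-Raphson for the Weibull shape k (illustrative stand-in model).

    Solves the profile score equation
        g(k) = sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x) = 0.
    """
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / len(xs)
    k = k0
    for _ in range(max_iter):
        w = [x ** k for x in xs]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        g = s1 / s0 - 1.0 / k - mean_log
        # g'(k) = weighted variance of ln x plus 1/k^2, always positive
        dg = s2 / s0 - (s1 / s0) ** 2 + 1.0 / k ** 2
        k_new = k - g / dg
        if k_new <= 0:
            # safeguard: keep the iterate inside the parameter space
            k_new = k / 2
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    raise RuntimeError("Newton-Raphson did not converge")


def weibull_scale_mle(xs, k):
    """Closed-form scale estimate once the shape k is known."""
    return (sum(x ** k for x in xs) / len(xs)) ** (1.0 / k)


# Simulated data with assumed true scale 2.0 and shape 1.5
rng = random.Random(42)
sample = [rng.weibullvariate(2.0, 1.5) for _ in range(5000)]
k_hat = weibull_shape_mle(sample)
scale_hat = weibull_scale_mle(sample, k_hat)
print(k_hat, scale_hat)
```

For a multi-parameter model such as the Beta-Pareto, the same scheme would use the gradient and Hessian of the full log-likelihood in place of the scalar `g` and `dg`.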

Open Access Article
Estimation of Uncertainty in Mortality Projections Using State-Space Lee-Carter Model
Mathematics 2020, 8(7), 1053; https://doi.org/10.3390/math8071053 - 30 Jun 2020
Abstract
The study develops alternatives to the classical Lee-Carter stochastic mortality model for assessing the uncertainty of mortality rate forecasts. We use the Lee-Carter model expressed as a linear Gaussian state-space model or a state-space model with Markovian regime-switching to derive coherent estimates of parameters and to introduce the additional flexibility required to capture changes in trend and non-Gaussian volatility of mortality improvements. For model-fitting, we use a Bayesian Gibbs sampler. We illustrate the application of the models by deriving the confidence intervals of mortality projections using Lithuanian and Swedish data. The results show that the state-space model with Markovian regime-switching adequately captures the effect of the pandemic that is present in the Swedish data. However, it is less suitable for modeling the less sharp but more prolonged fluctuations of mortality trends in Lithuania.

Open Access Article
On the Consecutive k1 and k2-out-of-n Reliability Systems
Mathematics 2020, 8(4), 630; https://doi.org/10.3390/math8040630 - 19 Apr 2020
Abstract
In this paper we carry out a reliability study of the consecutive-k1 and k2-out-of-n systems with independent and identically distributed components ordered in a line. More precisely, we obtain the generating function of the structure’s reliability, while recurrence relations for determining its signature vector and reliability function are also provided. For illustration purposes, some numerical results and figures are presented and several concluding remarks are deduced.

Open Access Article
Asymptotic Results in Broken Stick Models: The Approach via Lorenz Curves
Mathematics 2020, 8(4), 625; https://doi.org/10.3390/math8040625 - 18 Apr 2020
Abstract
A stick of length 1 is broken at random into n smaller sticks. How much inequality does this procedure produce? What happens if, instead of breaking a stick, we break a square? What happens asymptotically? Which is the most egalitarian distribution of the smaller sticks (or rectangles)? Usually, when studying inequality, one uses a Lorenz curve. The more egalitarian a distribution, the closer the Lorenz curve is to the first diagonal of $[0,1]^2$. This is why in the first section we study the space of Lorenz curves. What is the limit of a convergent sequence of Lorenz curves? We try to answer these questions first in the deterministic case and then, based on the results obtained there, in the stochastic one.
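
The questions raised in this abstract can be explored numerically. The following is a minimal sketch, assuming uniform break points and a piecewise-linear empirical Lorenz curve; the function names and the sample size are illustrative choices, not taken from the paper. For uniform breaks and large n, the normalized piece lengths are approximately exponential, whose Gini coefficient is 1/2, so the simulated value should land near that figure.

```python
import random


def break_stick(n, rng):
    """Break a unit stick at n-1 i.i.d. uniform points; return the n piece lengths."""
    cuts = sorted(rng.random() for _ in range(n - 1))
    points = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(points, points[1:])]


def lorenz_curve(values):
    """Empirical Lorenz curve points (i/n, share of total held by the i smallest)."""
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_curve(values)
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1.0 - 2.0 * area


rng = random.Random(0)
pieces = break_stick(10_000, rng)
g = gini(pieces)
print(g)
```

A perfectly egalitarian split (all pieces equal) puts the Lorenz curve on the diagonal and gives a Gini coefficient of 0; the further the curve sags below the diagonal, the larger the coefficient.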
Open Access Article
Multi-Partitions Subspace Clustering
Mathematics 2020, 8(4), 597; https://doi.org/10.3390/math8040597 - 15 Apr 2020
Abstract
In model-based clustering, it is often supposed that only one clustering latent variable explains the heterogeneity of the whole dataset. However, in many cases several latent variables could explain the heterogeneity of the data at hand. Finding such class variables could result in a richer interpretation of the data. In the continuous data setting, a multi-partition model-based clustering is proposed. It assumes the existence of several latent clustering variables, each one explaining the heterogeneity of the data with respect to some clustering subspace. It allows the multi-partitions and the related subspaces to be found simultaneously. Parameters of the model are estimated through an EM algorithm relying on a probabilistic reinterpretation of the factorial discriminant analysis. A model choice strategy relying on the BIC criterion is proposed to select the number of subspaces and the number of clusters by subspace. The obtained results are thus several projections of the data, each one conveying its own clustering of the data. The model’s behavior is illustrated on simulated and real data.

Open Access Article
Theoretical Aspects on Measures of Directed Information with Simulations
Mathematics 2020, 8(4), 587; https://doi.org/10.3390/math8040587 - 15 Apr 2020
Abstract
Measures of directed information are obtained through classical measures of information by taking into account specific qualitative characteristics of each event. These measures are classified into two main categories, the entropic and the divergence measures. Many times in statistics we wish to emphasize not only the quantitative characteristics but also the qualitative ones. For example, in financial risk analysis it is common to take into consideration the existence of fat tails in the distribution of returns of an asset (especially the left tail) and in biostatistics to use robust statistical methods to trim extreme values. Motivated by these needs, in this work we present, study and provide simulations for measures of directed information. These measures quantify the information with emphasis on specific parts (or events) of their probability distribution, without losing the whole information of the less significant parts and at the same time by concentrating on the information of the parts we care about the most.

Open Access Article
One Dimensional Discrete Scan Statistics for Dependent Models and Some Related Problems
Mathematics 2020, 8(4), 576; https://doi.org/10.3390/math8040576 - 13 Apr 2020
Abstract
The one-dimensional discrete scan statistic is considered over sequences of random variables generated by block factor dependence models. Viewed as a maximum of a 1-dependent stationary sequence, the distribution of the scan statistic is approximated accurately and sharp bounds are provided. The longest increasing run statistic is related to the scan statistic and its distribution is studied. The moving average process is a particular case of a block factor, and the distribution of the associated scan statistic is approximated. Numerical results are presented.
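
For readers unfamiliar with the object studied here, the one-dimensional discrete scan statistic is the largest sum observed in any window of m consecutive terms. The sketch below computes it with a sliding window and applies it to a simple 2-block factor of i.i.d. Bernoulli variables, which yields a 1-dependent sequence. The function names, the particular block factor (a sum of two neighbours, i.e. a moving-average-type construction), and the parameter values are assumptions for illustration; the sketch computes the statistic only and does not reproduce the approximations or bounds of the paper.

```python
import random


def scan_statistic(xs, m):
    """Maximum sum over all windows of m consecutive terms of xs."""
    if not 1 <= m <= len(xs):
        raise ValueError("window length m must satisfy 1 <= m <= len(xs)")
    window = sum(xs[:m])
    best = window
    for i in range(m, len(xs)):
        # slide the window one step: add the new term, drop the oldest
        window += xs[i] - xs[i - m]
        best = max(best, window)
    return best


def two_block_factor(n, p, rng):
    """Hypothetical 1-dependent sequence X_i = xi_i + xi_{i+1}, xi i.i.d. Bernoulli(p)."""
    xi = [1 if rng.random() < p else 0 for _ in range(n + 1)]
    return [xi[i] + xi[i + 1] for i in range(n)]


rng = random.Random(1)
xs = two_block_factor(200, 0.3, rng)
s = scan_statistic(xs, 10)
print(s)
```

Repeating the last three lines over many seeds gives a Monte Carlo estimate of the distribution of the scan statistic under this dependence model.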

Open Access Article
On a Periodic Capital Injection and Barrier Dividend Strategy in the Compound Poisson Risk Model
Mathematics 2020, 8(4), 511; https://doi.org/10.3390/math8040511 - 02 Apr 2020
Cited by 12
Abstract
In this paper, we assume that the reserve level of an insurance company can only be observed at discrete time points; a new risk model is then proposed by introducing a periodic capital injection strategy and a barrier dividend strategy into the classical risk model. We derive the equations and the boundary conditions satisfied by the Gerber-Shiu function, the expected discounted capital injection function and the expected discounted dividend function by assuming that the observation interval and claim amount are each exponentially distributed. Numerical examples are also given to further analyze the influence of relevant parameters on the actuarial functions of the risk model.

Open Access Article
Treating Nonresponse in Probability-Based Online Panels through Calibration: Empirical Evidence from a Survey of Political Decision-Making Procedures
Mathematics 2020, 8(3), 423; https://doi.org/10.3390/math8030423 - 15 Mar 2020
Abstract
The use of probability-based panels that collect data via online or mixed-mode surveys has increased in the last few years as an answer to the growing concern with the quality of the data obtained with traditional survey modes. However, in order to adequately represent the general population, these tools must address the same sources of bias that affect other survey-based designs: namely undercoverage and non-response. In this work, we test several approaches to produce calibration estimators that are suitable for survey data affected by non-response where auxiliary information exists at both the panel level and the population level. The first approach adjusts the results obtained in the cross-sectional survey to the population totals, while, in the second, the weights are the result of a two-step process in which different adjustments on the sample, panel, and population are made. A simulation study of the properties of these estimators is performed. In light of theory and simulation results, we conclude that weighting by calibration is an effective technique for the treatment of non-response bias when the response mechanism is missing at random. These techniques have also been applied to real data from the survey Andalusian Citizen Preferences for Political Decision-Making Procedures.

Open Access Article
Asymptotic Approximations of Ratio Moments Based on Dependent Sequences
Mathematics 2020, 8(3), 361; https://doi.org/10.3390/math8030361 - 06 Mar 2020
Abstract
The widely orthant dependent (WOD) sequences are very weakly dependent sequences of random variables. For the weighted sums of non-negative m-WOD random variables, we provide asymptotic expressions for their appropriate inverse moments which are easy to calculate. As applications, we also obtain asymptotic expressions for the moments of random ratios. It is pointed out that our random ratios can include some models such as change-point detection. Last, some simulations are presented to test our results.

Open Access Article
Pointwise Optimality of Wavelet Density Estimation for Negatively Associated Biased Sample
Mathematics 2020, 8(2), 176; https://doi.org/10.3390/math8020176 - 02 Feb 2020
Cited by 1
Abstract
This paper focuses on the density estimation problem that occurs when the sample is negatively associated and biased. We constructed a block thresholding wavelet estimator to recover the density function from the negatively associated biased sample. The pointwise optimality of this wavelet density estimation is shown as $L^p$ ($1 \le p < \infty$) risks over Besov space. To validate the effectiveness of the block thresholding wavelet method, we provide some examples and implement the numerical simulations. The results indicate that our block thresholding wavelet density estimator is superior in terms of the mean squared error (MSE) when compared with the nonlinear wavelet density estimator.

Open Access Article
Optimal Designs for Carry Over Effects the Case of Two Treatment and Four Periods
Mathematics 2019, 7(12), 1179; https://doi.org/10.3390/math7121179 - 03 Dec 2019
Abstract
The optimal cross-over experimental designs are derived in experiments with two treatments, four periods, and n experimental units. The results are given for $n \equiv 0, 1, 2$ and $3 \pmod 4$, the criterion being the minimization of the variance of the estimated carry-over effect.
Open Access Article
Statistical Properties and Different Methods of Estimation for Type I Half Logistic Inverted Kumaraswamy Distribution
Mathematics 2019, 7(10), 1002; https://doi.org/10.3390/math7101002 - 22 Oct 2019
Abstract
In this paper, we introduce and study a new three-parameter lifetime distribution constructed from the so-called type I half-logistic-G family and the inverted Kumaraswamy distribution, naturally called the type I half-logistic inverted Kumaraswamy distribution. The main feature of this new distribution is to add a new tuning parameter to the inverted Kumaraswamy (according to the type I half-logistic structure), with the aim of increasing the flexibility of the related inverted Kumaraswamy model and thus offering more precise diagnostics in data analyses. The new distribution is discussed in detail, exhibiting various mathematical and statistical properties, with related graphics and numerical results. An exhaustive simulation was conducted to investigate the estimation of the model parameters via several well-established methods, including the method of maximum likelihood estimation, methods of least squares and weighted least squares estimation, and the method of Cramér-von Mises minimum distance estimation, showing their numerical efficiency. Finally, by considering the method of maximum likelihood estimation, we apply the new model to fit two practical data sets. In this regard, it is shown to be better than recent models also derived from the inverted Kumaraswamy distribution.

Planned Papers

The list below contains only planned manuscripts. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.

1. Title: Asymptotic results in broken stick models. Lorenz curves approach
Author: Gheorghiță Zbăganu
Affiliation: Institute for Mathematical Statistics and Applied Mathematics
Abstract: We study the following problem: a stick of length 1 is broken at random into n smaller sticks using i.i.d. random variables $(X_k)_{k=1,\dots,n-1}$ having a continuous probability distribution on [0,1] denoted by F. Normalize these small sticks in such a way that their average length is equal to 1. This is a broken stick model. Consider the empirical distribution of these n smaller sticks. What happens if n is large? Is there a limit distribution of these empirical distributions as n tends to infinity? It is known that if F is the uniform distribution then the limit distribution exists and is the exponential one. How much inequality among the small sticks does this procedure produce? Usually, when studying inequality, one uses the Lorenz curve and the Gini coefficient. The more egalitarian a distribution, the closer the Lorenz curve is to the first diagonal of $[0,1]^2$. We conjecture that the limit exists for any absolutely continuous distribution F and, moreover, that it produces more inequality than the uniform distribution. This seems to be an optimality property of the uniform distribution: it is the most egalitarian. We prove the conjecture when the density of F is a step function. In order to prepare the proof, we study the empirical Lorenz and pre-Lorenz curves of a sequence of non-negative real numbers.

2. Title: Discretization of Random Variables in Lp(Ω,F,P)
Author: George-Jason Siouris and Alex Karagrigoriou
Affiliation: Lab of Statistics and Data Analysis, Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, Karlovasi, 83200 Samos, Greece
Abstract: This work deals with the modelling of the discrete nature of returns by applying the discretization of the tail density function. Motivated by the discrete nature of returns, and since it seems to be present in a number of scientific fields, we focus in this work on a general framework for the discretization of random variables in Lp(Ω,F,P). More specifically, we concentrate on finite moments after the discretization is applied and introduce two new families of random variables, namely C(r), which contains all strictly continuous random variables with r finite moments, and D(r), which contains all discrete random variables with support of equidistant values. Hence, lp(Ω,F,P) is a subset of D(r). The extraordinary behavior that the different discretizations exhibit, as well as the new possibilities they provide, are presented in this work.

3. Title: Estimation of Generalized Beta-Pareto distribution based on several optimization methods
Author: Badredine BOUMARAF, Nacira SEDDIK-AMEUR, Vlad Stefan BARBU
Abstract: In this paper, we are interested in the four-parameter generalized Beta-Pareto distribution introduced in Akinsete et al. (2008). Note that the family of the Pareto distribution and its generalizations has been found useful in the literature for providing appropriate models for many types of heavy-tailed data, like income series, city population sizes or the size of firms. This is one of the reasons it has been used successfully in many applications in physics, biology, hydrology, finance, social sciences, engineering, etc. Since the maximum likelihood estimators of the parameters of the generalized Beta-Pareto distribution cannot be obtained explicitly, we use several nonlinear optimization methods for numerically obtaining these estimators: the Fletcher-Reeves method, the Polak-Ribière method and a variant of the Fletcher-Reeves method called the Hestenes-Stiefel method. In order to evaluate an approximation of the Hessian matrix involved in the computation, several methods exist in the literature. Among these methods, three are retained in our work: the BFGS method (from Broyden - Fletcher - Goldfarb - Shanno), the DFP method (from Davidon - Fletcher - Powell) and the SR1 method (Symmetric Rank 1). We derive the corresponding algorithms for obtaining the estimators for the parameters of interest and investigate through simulations the accuracy of these methods.


4. Title: Simultaneous dimension reduction and multi-objective clustering
Author: Vincent Vandewalle
Abstract: In model-based clustering of quantitative data, it is often supposed that only one clustering latent variable explains the heterogeneity of all the other variables. However, when considering variables coming from different sources, it is often unrealistic to suppose that the heterogeneity of the data can be explained by only one categorical variable. If such an assumption is made, this could lead to a high number of clusters which could be difficult to interpret. A model-based multi-objective clustering is proposed; it assumes the existence of several latent clustering variables, each one explaining the heterogeneity of the data on some clustering projection. In order to estimate the parameters of the model, an EM algorithm is proposed; it relies on a probabilistic reinterpretation of the standard factorial discriminant analysis. The obtained results are projections of the data on some "principal clustering components", allowing some synthetic interpretation of the principal clusters raised by the data.

5. Title: Some dependent models for the one-dimensional discrete scan statistics
Author: Alexandru Amarioarei, Cristian Preda
Abstract: The distribution of the one-dimensional discrete scan statistic is approximated under k-dependence models obtained as block factors of i.i.d. sequences of real random variables. For binary trials, the distribution of the scan statistic associated with 1-dependent sequences is compared with those obtained under a Markovian dependence model with the same bivariate distribution of two consecutive trials. In a more general framework (arbitrary distributions), the distribution of the scan statistic is related to the length of the longest increasing run...