Article

New Closed Form Estimators for the Beta Distribution

by Victor Mooto Nawa 1 and Saralees Nadarajah 2,*
1 Department of Mathematics and Statistics, University of Zambia, Lusaka 10101, Zambia
2 Department of Mathematics, University of Manchester, Manchester M13 9PL, UK
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(13), 2799; https://doi.org/10.3390/math11132799
Submission received: 24 May 2023 / Revised: 15 June 2023 / Accepted: 19 June 2023 / Published: 21 June 2023

Abstract: In this paper, we detail closed form estimators for the beta distribution that are simpler than those proposed by Tamae, Irie and Kubokawa. The proposed estimators are shown to have smaller asymptotic variances and smaller asymptotic covariances than Tamae et al.'s estimators and maximum likelihood estimators for certain parameter values. The proposed estimators are also shown to perform better in a real data application.

1. Introduction

The most popular model for data in a finite interval is that based on the beta distribution. It is a two-parameter distribution with probability density function given by
$$f(x; \alpha, \beta) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}$$
for $0 < x < 1$, $\alpha > 0$ and $\beta > 0$, where $B(\alpha, \beta)$ denotes the beta function defined by
$$B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} \, dt,$$
which can be equivalently expressed as $B(\alpha, \beta) = \Gamma(\alpha) \Gamma(\beta) / \Gamma(\alpha + \beta)$, where $\Gamma(\cdot)$ denotes the gamma function. We shall write $X \sim \mathrm{Beta}(\alpha, \beta)$ to mean that a random variable $X$ has the beta distribution. The mean and variance of $X \sim \mathrm{Beta}(\alpha, \beta)$ are
$$E(X) = \frac{\alpha}{\alpha + \beta} \quad \text{and} \quad \mathrm{Var}(X) = \sigma^2 = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}, \tag{1}$$
respectively. There are hundreds if not thousands of papers on the theory and applications of the beta distribution. It is impossible to cite all of these papers. Comprehensive accounts of the theory and applications of the beta distribution can be found in [1,2,3,4,5]. See also [6].
For a long time, estimating the parameters of the beta distribution and other related distributions, such as the gamma distribution, was only possible through iterative methods. For instance, if $x_1, \ldots, x_n$ are observations on $X \sim \mathrm{Beta}(\alpha, \beta)$, then the maximum likelihood estimators of $\alpha$ and $\beta$, say $\hat{\alpha}$ and $\hat{\beta}$, respectively, can be obtained by solving
$$\overline{\log x} = A, \quad \overline{\log (1 - x)} = B,$$
where a bar denotes a sample mean (for example, $\overline{\log x} = \frac{1}{n} \sum_{i = 1}^{n} \log x_i$), $A = \psi(\alpha) - \psi(\alpha + \beta)$ and $B = \psi(\beta) - \psi(\alpha + \beta)$ are evaluated at $(\hat{\alpha}, \hat{\beta})$, and $\psi(\alpha) = \frac{d \log \Gamma(\alpha)}{d \alpha}$ denotes the digamma function. Furthermore, the corresponding asymptotic variances and asymptotic covariance are given by
$$\sqrt{n} \begin{pmatrix} \hat{\alpha} - \alpha \\ \hat{\beta} - \beta \end{pmatrix} \xrightarrow{d} N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \frac{1}{G} \begin{pmatrix} D & \psi_1(C) \\ \psi_1(C) & E \end{pmatrix} \right)$$
as $n \to \infty$, where $\psi_1(\alpha) = \frac{d \psi(\alpha)}{d \alpha}$ denotes the trigamma function, $C = \alpha + \beta$, $D = \psi_1(\beta) - \psi_1(C)$, $E = \psi_1(\alpha) - \psi_1(C)$ and $G = \psi_1(\alpha) \psi_1(\beta) - \psi_1(\alpha) \psi_1(C) - \psi_1(\beta) \psi_1(C)$. Furthermore, let $F = \psi_1(\alpha) + \psi_1(\beta)$.
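To make the iterative nature concrete, here is a minimal R sketch (our own illustration, not code from the paper) that solves the two likelihood equations numerically by minimizing the sum of squared score residuals; x is assumed to be a vector of observations in (0, 1):
# Minimal sketch: solve the likelihood equations numerically.
# digamma() is the psi function; the function name mle_beta is ours.
mle_beta=function(x)
{score_gap=function(p)
{if (p[1]<=0|p[2]<=0) return(1.0e20)
r1=mean(log(x))-(digamma(p[1])-digamma(p[1]+p[2]))
r2=mean(log(1-x))-(digamma(p[2])-digamma(p[1]+p[2]))
r1^2+r2^2}
optim(par=c(1,1),fn=score_gap)$par}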
Recently, however, closed form estimators for the gamma and beta distributions have been proposed. Ref. [7] proposed closed form estimators for the gamma distribution by considering the likelihood equations of the generalized gamma distribution and taking the gamma distribution as a special case. Ref. [8] proposed closed form estimators for the gamma and beta distributions using the "score adjusted approach". The estimators for the gamma distribution in [8] turned out to be the same as those obtained by [7].
Following the method applied by [8], two simpler closed form estimators for the beta distribution are proposed in this paper. The two estimators appear to have smaller asymptotic variances and covariances than Tamae et al.'s estimators for certain values of the parameters of the beta distribution.
The remainder of this paper is organized as follows. Section 2 re-derives the closed form estimators for the beta distribution due to [8]. Section 3 derives two new closed form estimators for the beta distribution. Section 4 establishes asymptotic normality of the new estimators and derives expressions for their asymptotic variances and asymptotic covariances. Section 5 conducts a numerical comparison to show that the new estimators can be better than the estimators due to [8] as well as maximum likelihood estimators. A simulation study is conducted in Section 6 to check the finite sample performance of all the estimators. Section 7 shows that the new estimators can provide a better fit to a real dataset. Finally, some conclusions are given in Section 8. All computations in the paper were performed using the R software [9]. Sample code is given in Appendix A.

2. Tamae et al.’s [8] Closed Form Estimators

Let $X \sim \mathrm{Beta}(\alpha, \beta)$. Using the facts
$$E(\log X) = \psi(\alpha) - \psi(C) \tag{2}$$
and
$$E[\log (1 - X)] = \psi(\beta) - \psi(C), \tag{3}$$
we can show that
$$E(X \log X) = \frac{\alpha}{C} \left[ \psi(\alpha) - \psi(C) + \frac{1}{\alpha} - \frac{1}{C} \right] \tag{4}$$
and
$$E[X \log (1 - X)] = \frac{\alpha}{C} \left[ \psi(\beta) - \psi(C) - \frac{1}{C} \right]. \tag{5}$$
Subtracting (5) from (4) and using (1), (2) and (3) gives
$$E(X \log X) - E[X \log (1 - X)] = E(X) \left\{ E(\log X) - E[\log (1 - X)] + \frac{1}{\alpha} \right\},$$
which implies
$$\alpha = \frac{E(X)}{E(X \log X) - E[X \log (1 - X)] - E(X) E(\log X) + E(X) E[\log (1 - X)]}. \tag{6}$$
Another way of expressing (6) is
$$E(X \log X) - E[X \log (1 - X)] = \frac{\alpha}{C} \left\{ E(\log X) - E[\log (1 - X)] + \frac{1}{\alpha} \right\}. \tag{7}$$
Substituting α from (6) into (7) and solving for β yields
$$\beta = \frac{1 - E(X)}{E(X \log X) - E[X \log (1 - X)] - E(X) E(\log X) + E(X) E[\log (1 - X)]}. \tag{8}$$
By the weak law of large numbers, we can replace the expectations in (6) and (8) by their sample versions yielding the closed form estimators proposed by [8] as
$$\hat{\alpha} = \frac{\bar{X}}{\overline{X \log X} - \overline{X \log (1 - X)} - \bar{X} \, \overline{\log X} + \bar{X} \, \overline{\log (1 - X)}} = \frac{\bar{X}}{\overline{X \log \frac{X}{1 - X}} - \bar{X} \, \overline{\log \frac{X}{1 - X}}},$$
$$\hat{\beta} = \frac{1 - \bar{X}}{\overline{X \log X} - \overline{X \log (1 - X)} - \bar{X} \, \overline{\log X} + \bar{X} \, \overline{\log (1 - X)}} = \frac{1 - \bar{X}}{\overline{X \log \frac{X}{1 - X}} - \bar{X} \, \overline{\log \frac{X}{1 - X}}}.$$
The corresponding asymptotic variances and asymptotic covariance are given by
$$\sqrt{n} \begin{pmatrix} \hat{\alpha} - \alpha \\ \hat{\beta} - \beta \end{pmatrix} \xrightarrow{d} N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \left( \sigma^2 C^2 F + 1 \right) \begin{pmatrix} \alpha^2 & \alpha \beta \\ \alpha \beta & \beta^2 \end{pmatrix} - \frac{1}{C + 1} \begin{pmatrix} \alpha \beta & C^2 - \alpha \beta \\ C^2 - \alpha \beta & \alpha \beta \end{pmatrix} \right)$$
as $n \to \infty$.
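For illustration, a minimal R sketch of these closed form estimators follows (it mirrors the a2 and b2 lines of Appendix A; the function name tik_beta is ours):
tik_beta=function(x)
{w=log(x/(1-x))   # log odds
denom=mean(x*w)-mean(x)*mean(w)   # common denominator of (6) and (8)
c(alpha=mean(x)/denom,beta=(1-mean(x))/denom)}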

3. New Closed Form Estimators

In this section, we propose two new estimators for α and β . Throughout, we suppose that X B e t a ( α , β ) .
Substituting (3) into (5) and rearranging the terms gives
$$E[X \log (1 - X)] = E(X) \left\{ E[\log (1 - X)] - \frac{1}{C} \right\},$$
which implies
$$\frac{1}{C} = \frac{E(X) E[\log (1 - X)] - E[X \log (1 - X)]}{E(X)}. \tag{9}$$
Multiplying (9) by α , using (1) and solving for α gives
$$\alpha = \frac{[E(X)]^2}{E(X) E[\log (1 - X)] - E[X \log (1 - X)]}. \tag{10}$$
Substituting (10) in (9) and solving for β gives
$$\beta = \frac{E(X) - [E(X)]^2}{E(X) E[\log (1 - X)] - E[X \log (1 - X)]}. \tag{11}$$
By the weak law of large numbers, we can replace the expectations in (10) and (11) by sample versions to obtain the estimators:
$$\hat{\alpha} = \frac{\bar{X}^2}{\bar{X} \, \overline{\log (1 - X)} - \overline{X \log (1 - X)}} \quad \text{and} \quad \hat{\beta} = \frac{\bar{X} - \bar{X}^2}{\bar{X} \, \overline{\log (1 - X)} - \overline{X \log (1 - X)}}. \tag{12}$$
Substituting (1) and (2) in (4) and rearranging yields
$$E(X \log X) = E(X) \left\{ E(\log X) + \frac{1}{\alpha} - \frac{1}{C} \right\},$$
which implies
$$\frac{1}{\alpha} - \frac{1}{C} = \frac{E(X \log X) - E(X) E(\log X)}{E(X)}. \tag{13}$$
Multiplying (13) by α , using (1) and solving for α gives
$$\alpha = \frac{E(X) [1 - E(X)]}{E(X \log X) - E(X) E(\log X)}. \tag{14}$$
Substituting (14) in (13) and solving for β gives
$$\beta = \frac{[1 - E(X)]^2}{E(X \log X) - E(X) E(\log X)}. \tag{15}$$
By the weak law of large numbers, we can replace the expectations in (14) and (15) by their sample versions to obtain
$$\hat{\alpha} = \frac{\bar{X} (1 - \bar{X})}{\overline{X \log X} - \bar{X} \, \overline{\log X}} \quad \text{and} \quad \hat{\beta} = \frac{(1 - \bar{X})^2}{\overline{X \log X} - \bar{X} \, \overline{\log X}}. \tag{16}$$
Note that $\hat{\beta} / \hat{\alpha} = 1 / \bar{X} - 1$. This relationship can be useful in deriving the large sample properties given in Section 4.
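For illustration, the following R sketch computes both pairs of new estimators (mirroring the a3, b3, a4 and b4 lines of Appendix A; the function names are ours) and checks the ratio identity on simulated data:
# Estimators (12)
new_beta1=function(x)
{d=mean(x)*mean(log(1-x))-mean(x*log(1-x))
c(alpha=mean(x)^2/d,beta=(mean(x)-mean(x)^2)/d)}
# Estimators (16)
new_beta2=function(x)
{d=mean(x*log(x))-mean(x)*mean(log(x))
c(alpha=mean(x)*(1-mean(x))/d,beta=(1-mean(x))^2/d)}
x=rbeta(500,2,3)
est=new_beta2(x)
est[2]/est[1]-(1/mean(x)-1)   # zero up to rounding error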

4. Large Sample Properties

Large sample properties of the estimators given by (12) and (16) are derived in this section. Theorem 1 proves asymptotic normality of the estimators given by (12). Theorem 2 proves asymptotic normality of the estimators given by (16).
Theorem 1.
The estimators given in (12) satisfy
$$\sqrt{n} \begin{pmatrix} \hat{\alpha} - \alpha \\ \hat{\beta} - \beta \end{pmatrix} \xrightarrow{d} N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12} & \Sigma_{22} \end{pmatrix} \right)$$
as $n \to \infty$, where
$$\Sigma_{11} = \frac{\alpha \beta \left( 1 + C^2 D \right)}{C + 1} + \frac{\alpha (\alpha + 1) C \left( C^2 + 1 \right)}{(C + 1)^3},$$
$$\Sigma_{12} = \frac{\beta^2 C^2 D - \alpha (\beta + C)}{C + 1} + \frac{\beta C \left[ (\alpha + 2) C^2 + C + \alpha + 1 \right]}{(C + 1)^3}$$
and
$$\Sigma_{22} = \frac{\alpha \beta}{C + 1} + \frac{\beta^2 C (\beta C D - \alpha + 1)}{\alpha (C + 1)} + \frac{2 \beta C (\alpha + 1) \left( \beta C^2 - \alpha C - \alpha \right)}{\alpha (C + 1)^3}.$$
Proof. 
Let the empirical means of $X$, $\log (1 - X)$ and $X \log (1 - X)$ be denoted by $\bar{X}$, $\bar{Y}$ and $\bar{Z}$, respectively. We can easily show that
$$E(\bar{X}) = \frac{\alpha}{C},$$
$$E(\bar{Y}) = B$$
and
$$E(\bar{Z}) = \frac{\alpha}{C} \left( B - \frac{1}{C} \right).$$
We can also show that
$$\mathrm{Var} \left( \sqrt{n} \, \bar{Y} \right) = D. \tag{17}$$
Using the fact that
$$E \left[ X^2 \log^2 (1 - X) \right] = \frac{\alpha (\alpha + 1)}{C (C + 1)} \left\{ \left[ \psi(\beta) - \psi(C + 2) \right]^2 + \psi_1(\beta) - \psi_1(C + 2) \right\},$$
we can show that
$$\mathrm{Var} \left( \sqrt{n} \, \bar{Z} \right) = \sigma^2 \left( B - \frac{1}{C} \right)^2 + \frac{\alpha (\alpha + 1)}{C (C + 1)} \left[ \frac{2}{(C + 1)^2} - \frac{2}{C + 1} \left( B - \frac{1}{C} \right) + \frac{1}{C^2} + D \right]. \tag{18}$$
Similarly, we can show that
$$\mathrm{Cov} \left( \sqrt{n} \, \bar{X}, \sqrt{n} \, \bar{Y} \right) = -\frac{\alpha}{C^2}, \tag{19}$$
$$\mathrm{Cov} \left( \sqrt{n} \, \bar{X}, \sqrt{n} \, \bar{Z} \right) = \sigma^2 \left( B - \frac{1}{C} \right) - \frac{\alpha (\alpha + 1)}{C (C + 1)^2} \tag{20}$$
and
$$\mathrm{Cov} \left( \sqrt{n} \, \bar{Y}, \sqrt{n} \, \bar{Z} \right) = \frac{\alpha}{C^2} \left( \frac{2}{C} - B + C D \right). \tag{21}$$
By the central limit theorem,
$$\sqrt{n} \left[ \left( \bar{X}, \bar{Y}, \bar{Z} \right) - \left( \frac{\alpha}{C}, \; B, \; \frac{\alpha}{C} \left[ \psi(\beta) - \psi(C + 1) \right] \right) \right]^T \xrightarrow{d} N \left( \mathbf{0}_3, \Sigma \right)$$
as $n \to \infty$, where $\mathbf{0}_3 = (0, 0, 0)^T$ and
$$\Sigma = \begin{pmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, Y) & \mathrm{Cov}(X, Z) \\ \mathrm{Cov}(X, Y) & \mathrm{Var}(Y) & \mathrm{Cov}(Y, Z) \\ \mathrm{Cov}(X, Z) & \mathrm{Cov}(Y, Z) & \mathrm{Var}(Z) \end{pmatrix}$$
with $Y = \log (1 - X)$ and $Z = X \log (1 - X)$.
The entries of this matrix are given by (1) and (17)-(21) as
$$\mathrm{Var}(X) = \frac{\alpha \beta}{C^2 (C + 1)} = \sigma^2,$$
$$\mathrm{Cov}(X, Y) = -\frac{\alpha}{C^2},$$
$$\mathrm{Cov}(X, Z) = \sigma^2 \left( B - \frac{1}{C} \right) - \frac{\alpha (\alpha + 1)}{C (C + 1)^2},$$
$$\mathrm{Var}(Y) = D,$$
$$\mathrm{Cov}(Y, Z) = \frac{\alpha}{C^2} \left( \frac{2}{C} - B + C D \right)$$
and
$$\mathrm{Var}(Z) = \sigma^2 \left( B - \frac{1}{C} \right)^2 + \frac{\alpha (\alpha + 1)}{C (C + 1)} \left[ \frac{2}{(C + 1)^2} - \frac{2}{C + 1} \left( B - \frac{1}{C} \right) + \frac{1}{C^2} + D \right].$$
Let $g_1(x, y, z) = \frac{x^2}{x y - z}$, $g_2(x, y, z) = \frac{x - x^2}{x y - z}$ and $(x_0, y_0, z_0) = \left( \frac{\alpha}{C}, \; B, \; \frac{\alpha}{C} \left[ \psi(\beta) - \psi(C + 1) \right] \right)$. Then
$$\left. \frac{\partial g_1}{\partial x} \right|_{(x_0, y_0, z_0)} = C (2 - B C),$$
$$\left. \frac{\partial g_1}{\partial y} \right|_{(x_0, y_0, z_0)} = -\alpha C,$$
$$\left. \frac{\partial g_1}{\partial z} \right|_{(x_0, y_0, z_0)} = C^2,$$
$$\left. \frac{\partial g_2}{\partial x} \right|_{(x_0, y_0, z_0)} = C^2 \left( \frac{\beta - \alpha}{\alpha C} - \frac{\beta B}{\alpha} \right),$$
$$\left. \frac{\partial g_2}{\partial y} \right|_{(x_0, y_0, z_0)} = -\beta C$$
and
$$\left. \frac{\partial g_2}{\partial z} \right|_{(x_0, y_0, z_0)} = \frac{\beta C^2}{\alpha}.$$
Let
$$M = \begin{pmatrix} C (2 - B C) & -\alpha C & C^2 \\ C^2 \left( \frac{\beta - \alpha}{\alpha C} - \frac{\beta B}{\alpha} \right) & -\beta C & \frac{\beta C^2}{\alpha} \end{pmatrix}.$$
Using the delta method,
$$\sqrt{n} \begin{pmatrix} \hat{\alpha} - \alpha \\ \hat{\beta} - \beta \end{pmatrix} \xrightarrow{d} N \left( \mathbf{0}_2, M \Sigma M^T \right)$$
as $n \to \infty$, where $\mathbf{0}_2 = (0, 0)^T$ and
$$M \Sigma M^T = \begin{pmatrix} \left( M \Sigma M^T \right)_{11} & \left( M \Sigma M^T \right)_{12} \\ \left( M \Sigma M^T \right)_{12} & \left( M \Sigma M^T \right)_{22} \end{pmatrix},$$
where
$$\left( M \Sigma M^T \right)_{11} = \sigma^2 C^2 (2 - B C)^2 + 2 C^3 (2 - B C) \, \mathrm{Cov}(X, Z) + C^4 \, \mathrm{Var}(Z) - \alpha^2 C^2 D,$$
$$\left( M \Sigma M^T \right)_{12} = \sigma^2 C^3 (2 - B C) \left( \frac{\beta - \alpha}{\alpha C} - \frac{\beta B}{\alpha} \right) + \frac{C^3}{\alpha} \left( 3 \beta - 2 \beta B C - \alpha \right) \mathrm{Cov}(X, Z) + \frac{\beta C^4}{\alpha} \, \mathrm{Var}(Z) - \alpha \left( C + \beta C^2 D \right)$$
and
$$\left( M \Sigma M^T \right)_{22} = \sigma^2 C^4 \left( \frac{\beta - \alpha}{\alpha C} - \frac{\beta B}{\alpha} \right)^2 + \frac{2 \beta C^4}{\alpha} \left( \frac{\beta - \alpha}{\alpha C} - \frac{\beta B}{\alpha} \right) \mathrm{Cov}(X, Z) + \frac{\beta^2 C^4}{\alpha^2} \, \mathrm{Var}(Z) - \beta C (2 + \beta C D).$$
The theorem follows by simplification of these expressions.    □
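As a quick sanity check of Theorem 1 (our own sketch, not part of the paper), the scaled sampling variance of the estimator of alpha in (12) should be close to Sigma11 for large n:
set.seed(1)
a=2; b=3; C=a+b
D=trigamma(b)-trigamma(C)
Sigma11=a*b*(1+C^2*D)/(C+1)+a*(a+1)*C*(C^2+1)/(C+1)^3
n=5000; R=2000
ahat=replicate(R,{x=rbeta(n,a,b)
d=mean(x)*mean(log(1-x))-mean(x*log(1-x))
mean(x)^2/d})
c(n*var(ahat),Sigma11)   # the two values should be close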
Theorem 2.
The estimators given in (16) satisfy
$$\sqrt{n} \begin{pmatrix} \hat{\alpha} - \alpha \\ \hat{\beta} - \beta \end{pmatrix} \xrightarrow{d} N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12} & \Sigma_{22} \end{pmatrix} \right)$$
as $n \to \infty$, where
$$\Sigma_{11} = \frac{\sigma^2 \alpha^2 C^2}{\beta^2} \left[ 1 + \frac{2 C^2}{C + 1} + \frac{\beta C^3}{(\alpha + 1)(C + 1)^2} \right] + \frac{\alpha^2 C}{\beta} \left[ 2 + \frac{\alpha C E}{C + 1} \right] + \frac{\alpha^3 C^3 (\alpha + 1)}{\beta^2 (C + 1)} \left[ \frac{1}{C^2} + \frac{1}{(C + 1)^2} - \frac{1}{\alpha^2} - \frac{1}{(\alpha + 1)^2} \right],$$
$$\Sigma_{12} = \frac{\sigma^2 \alpha C^2}{\beta} \left[ \frac{\beta}{\alpha} \left( 1 + \frac{C^2}{C + 1} \right) + \frac{\beta C^3}{(\alpha + 1)(C + 1)^2} + 2 + \frac{3 C^2}{C + 1} \right] + \alpha C \left[ 1 + \frac{\alpha C E}{C + 1} \right] + \frac{\alpha^2 C^3 (\alpha + 1)}{\beta (C + 1)} \left[ \frac{1}{C^2} + \frac{1}{(C + 1)^2} - \frac{1}{\alpha^2} - \frac{1}{(\alpha + 1)^2} \right]$$
and
$$\Sigma_{22} = \sigma^2 C^2 \left[ 4 \left( 1 + \frac{\beta}{\alpha} + \frac{C^2}{C + 1} \right) + \frac{\beta}{\alpha} \left( \frac{\beta}{\alpha} + \frac{2 C^2}{C + 1} \right) + \frac{\beta C^3}{(\alpha + 1)(C + 1)^2} \right] + \frac{\alpha \beta C^2 E}{C + 1} + \frac{\alpha (\alpha + 1) C^3}{C + 1} \left[ \frac{1}{C^2} + \frac{1}{(C + 1)^2} - \frac{1}{\alpha^2} - \frac{1}{(\alpha + 1)^2} \right].$$
Proof. 
Let the empirical means of $X$, $\log X$ and $X \log X$ be denoted by $\bar{X}$, $\bar{U}$ and $\bar{V}$, respectively. We can easily show that
$$E(\bar{U}) = A$$
and
$$E(\bar{V}) = \frac{\alpha}{C} \left( A + \frac{1}{\alpha} - \frac{1}{C} \right).$$
We can also show that
$$\mathrm{Var} \left( \sqrt{n} \, \bar{U} \right) = E. \tag{22}$$
Using the fact that
$$E \left( X^2 \log^2 X \right) = \frac{\alpha (\alpha + 1)}{C (C + 1)} \left\{ \left[ \psi(\alpha + 2) - \psi(C + 2) \right]^2 + \psi_1(\alpha + 2) - \psi_1(C + 2) \right\},$$
we have
$$\mathrm{Var} \left( \sqrt{n} \, \bar{V} \right) = \sigma^2 \left( A + \frac{\beta}{\alpha C} \right)^2 + \frac{\sigma^2 C}{C + 1} \left[ \frac{\beta}{(\alpha + 1)(C + 1)} + 2 A + \frac{2 \beta}{\alpha C} \right] + \frac{\alpha (\alpha + 1)}{C (C + 1)} \left[ \frac{1}{C^2} + \frac{1}{(C + 1)^2} - \frac{1}{\alpha^2} - \frac{1}{(\alpha + 1)^2} + E \right]. \tag{23}$$
Similarly, we can show that
$$\mathrm{Cov} \left( \sqrt{n} \, \bar{X}, \sqrt{n} \, \bar{U} \right) = \frac{\beta}{C^2}, \tag{24}$$
$$\mathrm{Cov} \left( \sqrt{n} \, \bar{X}, \sqrt{n} \, \bar{V} \right) = \sigma^2 \left( A + \frac{\beta}{\alpha C} + \frac{C}{C + 1} \right) \tag{25}$$
and
$$\mathrm{Cov} \left( \sqrt{n} \, \bar{U}, \sqrt{n} \, \bar{V} \right) = \frac{\alpha}{C} \left( \frac{A \beta}{\alpha C} + E - \frac{2 \beta}{\alpha C^2} \right). \tag{26}$$
By the central limit theorem,
$$\sqrt{n} \left[ \left( \bar{X}, \bar{U}, \bar{V} \right) - \left( \frac{\alpha}{C}, \; A, \; \frac{\alpha}{C} \left[ \psi(\alpha + 1) - \psi(C + 1) \right] \right) \right]^T \xrightarrow{d} N \left( \mathbf{0}_3, \Sigma \right)$$
as $n \to \infty$, where
$$\Sigma = \begin{pmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, U) & \mathrm{Cov}(X, V) \\ \mathrm{Cov}(X, U) & \mathrm{Var}(U) & \mathrm{Cov}(U, V) \\ \mathrm{Cov}(X, V) & \mathrm{Cov}(U, V) & \mathrm{Var}(V) \end{pmatrix}$$
with $U = \log X$ and $V = X \log X$.
The entries of this matrix are given by (1) and (22)-(26) as
$$\mathrm{Var}(X) = \frac{\alpha \beta}{C^2 (C + 1)} = \sigma^2,$$
$$\mathrm{Cov}(X, U) = \frac{\beta}{C^2},$$
$$\mathrm{Cov}(X, V) = \sigma^2 \left( A + \frac{\beta}{\alpha C} + \frac{C}{C + 1} \right),$$
$$\mathrm{Var}(U) = E,$$
$$\mathrm{Cov}(U, V) = \frac{\alpha}{C} \left( \frac{A \beta}{\alpha C} + E - \frac{2 \beta}{\alpha C^2} \right)$$
and
$$\mathrm{Var}(V) = \sigma^2 \left( A + \frac{\beta}{\alpha C} \right)^2 + \frac{\sigma^2 C}{C + 1} \left[ \frac{\beta}{(\alpha + 1)(C + 1)} + 2 A + \frac{2 \beta}{\alpha C} \right] + \frac{\alpha (\alpha + 1)}{C (C + 1)} \left[ \frac{1}{C^2} + \frac{1}{(C + 1)^2} - \frac{1}{\alpha^2} - \frac{1}{(\alpha + 1)^2} + E \right].$$
Let $h_1(x, y, z) = \frac{x - x^2}{z - x y}$, $h_2(x, y, z) = \frac{1 - 2 x + x^2}{z - x y}$ and $(x_0, y_0, z_0) = \left( \frac{\alpha}{C}, \; A, \; \frac{\alpha}{C} \left[ \psi(\alpha + 1) - \psi(C + 1) \right] \right)$. Then,
$$\left. \frac{\partial h_1}{\partial x} \right|_{(x_0, y_0, z_0)} = \frac{C}{\beta} \left( \alpha A C + \beta - \alpha \right),$$
$$\left. \frac{\partial h_1}{\partial y} \right|_{(x_0, y_0, z_0)} = \frac{\alpha^2 C}{\beta},$$
$$\left. \frac{\partial h_1}{\partial z} \right|_{(x_0, y_0, z_0)} = -\frac{\alpha C^2}{\beta},$$
$$\left. \frac{\partial h_2}{\partial x} \right|_{(x_0, y_0, z_0)} = A C^2 - 2 C,$$
$$\left. \frac{\partial h_2}{\partial y} \right|_{(x_0, y_0, z_0)} = \alpha C$$
and
$$\left. \frac{\partial h_2}{\partial z} \right|_{(x_0, y_0, z_0)} = -C^2.$$
Let
$$W = \begin{pmatrix} \frac{C}{\beta} \left( \alpha A C + \beta - \alpha \right) & \frac{\alpha^2 C}{\beta} & -\frac{\alpha C^2}{\beta} \\ A C^2 - 2 C & \alpha C & -C^2 \end{pmatrix}.$$
Using the delta method,
$$\sqrt{n} \begin{pmatrix} \hat{\alpha} - \alpha \\ \hat{\beta} - \beta \end{pmatrix} \xrightarrow{d} N \left( \mathbf{0}_2, W \Sigma W^T \right)$$
as $n \to \infty$, where
$$W \Sigma W^T = \begin{pmatrix} \left( W \Sigma W^T \right)_{11} & \left( W \Sigma W^T \right)_{12} \\ \left( W \Sigma W^T \right)_{12} & \left( W \Sigma W^T \right)_{22} \end{pmatrix},$$
where
$$\left( W \Sigma W^T \right)_{11} = \frac{2 \alpha^2 C}{\beta} - \frac{\sigma^2 \alpha C^2}{\beta^2} \left( \alpha A C + \beta - \alpha \right) \left( 1 + A C + \frac{\beta}{\alpha} + \frac{2 C^2}{C + 1} \right) - \frac{\alpha^4 C^2 E}{\beta^2} + \frac{\alpha^2 C^4}{\beta^2} \, \mathrm{Var}(V),$$
$$\left( W \Sigma W^T \right)_{12} = -\frac{\sigma^2 \alpha C^2}{\beta} \left[ A C \left( A C + \frac{2 \beta}{\alpha} + \frac{2 C^2}{C + 1} \right) + \frac{\beta}{\alpha} \left( \frac{\beta}{\alpha} + \frac{C^2}{C + 1} - 1 \right) - 2 - \frac{3 C^2}{C + 1} \right] + \alpha C - \frac{\alpha^3 C^2 E}{\beta} + \frac{\alpha C^4}{\beta} \, \mathrm{Var}(V)$$
and
$$\left( W \Sigma W^T \right)_{22} = -\sigma^2 C^2 (A C - 2) \left( 2 + \frac{2 \beta}{\alpha} + \frac{2 C^2}{C + 1} + A C \right) - \alpha^2 C^2 E + C^4 \, \mathrm{Var}(V).$$
The theorem follows by simplification of these expressions.    □
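The same Monte Carlo device (again our own sketch) can be used to check Theorem 2; the Sigma11 line below transcribes the vic211 expression of Appendix A:
set.seed(1)
a=2; b=3; C=a+b
E=trigamma(a)-trigamma(C)
s2=a*b/(C^2*(C+1))
Sigma11=s2*a^2*C^2*(1+2*C^2/(C+1)+b*C^3/((a+1)*(C+1)^2))/b^2+
a^2*C*(2+a*C*E/(C+1))/b+a^3*C^3*(a+1)*(1/C^2+1/(C+1)^2-
1/a^2-1/(a+1)^2)/(b^2*(C+1))
n=5000; R=2000
ahat=replicate(R,{x=rbeta(n,a,b)
mean(x)*(1-mean(x))/(mean(x*log(x))-mean(x)*mean(log(x)))})
c(n*var(ahat),Sigma11)   # the two values should be close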

5. Numerical Comparison

In this section, we compare the asymptotic variances and asymptotic covariances of the estimators given by Theorems 1 and 2, Tamae et al.'s estimators and maximum likelihood estimators. We shall refer to Tamae et al.'s estimators as TIK estimators. Figure 1, Figure 2 and Figure 3 show how the asymptotic variances and asymptotic covariances of these estimators vary for α = 0.1, 0.2, …, 10 and β = 0.1, 0.2, …, 10. Figure 4, Figure 5 and Figure 6 show the differences in asymptotic variances and the differences in asymptotic covariances for two of the estimators at a time over the same grid of α and β.
We can observe the following from Figure 1, Figure 2 and Figure 3. The asymptotic variances of all of the estimators of α increase with respect to α and decrease with respect to β; the asymptotic variances of all of the estimators of β decrease with respect to α and increase with respect to β; and the asymptotic covariances of all of the estimators increase with respect to both α and β. All four estimators in each of Figure 1, Figure 2 and Figure 3 appear to behave similarly.
We can observe the following from Figure 4, Figure 5 and Figure 6. With respect to the asymptotic variances of the estimators of α: the estimator in Theorem 1 is more efficient than the corresponding estimator in Theorem 2 for all α > β; the estimator in Theorem 1 is more efficient than the corresponding TIK estimator in a parabolic region containing large values of α and small values of β; the maximum likelihood estimator is more efficient than the corresponding estimator in Theorem 1 for all values of α and β; the estimator in Theorem 2 is more efficient than the corresponding TIK estimator for all β > 2α; and the maximum likelihood estimator is more efficient than the corresponding estimator in Theorem 2 and the corresponding TIK estimator for all values of α and β. With respect to the asymptotic variances of the estimators of β: the estimator in Theorem 1 is more efficient than the corresponding estimator in Theorem 2 for all β < α; the estimator in Theorem 1 is more efficient than the corresponding TIK estimator for all 2β < α; the estimator in Theorem 2 is more efficient than the corresponding TIK estimator in a parabolic region containing small values of α and large values of β; and the maximum likelihood estimator is more efficient than the three remaining estimators for all values of α and β. With respect to the asymptotic covariances: the estimators in Theorem 1 are more efficient than the corresponding estimators in Theorem 2 for all β < α; the estimators in Theorem 1 are more efficient than the corresponding TIK estimators in a parabolic region containing large values of α and small values of β; the estimators in Theorem 1 are more efficient than the corresponding maximum likelihood estimators for all small values of β; the estimators in Theorem 2 are more efficient than the corresponding TIK estimators in a parabolic region containing small values of α and large values of β; the estimators in Theorem 2 are more efficient than the corresponding maximum likelihood estimators for all small values of β; and the TIK estimators are more efficient than the corresponding maximum likelihood estimators in a hyperbolic region containing either small values of both α and β, small values of α and large values of β, or large values of α and small values of β.
In summary, we can see that the estimators in Theorem 1 can be more efficient than the TIK estimators either when 2β < α or in a parabolic region containing large values of α and small values of β. The estimators in Theorem 1 can be more efficient than the maximum likelihood estimators for small values of β. The estimators in Theorem 2 can be more efficient than the TIK estimators either when β > 2α or in a parabolic region containing large values of β and small values of α. The estimators in Theorem 2 can be more efficient than the maximum likelihood estimators for small values of β. The estimators in Theorem 1 can be more efficient than the estimators in Theorem 2 when α > β.
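The regions described above can be reproduced with a short grid computation; the sketch below (ours) compares the asymptotic variances of the estimators of alpha from Theorem 1 and the TIK estimator, using the expressions from Appendix A:
grid=expand.grid(a=seq(0.1,10,0.1),b=seq(0.1,10,0.1))
sgn=with(grid,{C=a+b
D=trigamma(b)-trigamma(C)
F=trigamma(a)+trigamma(b)
s2=a*b/(C^2*(C+1))
vic111=a*b*(1+C^2*D)/(C+1)+a*(a+1)*C*(C^2+1)/(C+1)^3
TIK11=(s2*C^2*F+1)*a^2-a*b/(C+1)
sign(vic111-TIK11)})
table(sgn)   # -1: Theorem 1 estimator more efficient; 1: TIK estimator more efficient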

6. Simulation Study

Finite sample performance of the estimators given by Theorems 1 and 2, Tamae et al.'s estimators and maximum likelihood estimators is compared in this section. We use the following simulation scheme (a condensed R sketch is given after the list):
(i)
Simulate a random sample of size n from a beta distribution with parameters a = 2 and b = 2 ;
(ii)
Compute the estimators given by Theorems 1 and 2, Tamae et al.’s estimators and maximum likelihood estimators;
(iii)
Repeat steps (i) and (ii) 1000 times;
(iv)
Compute the bias and mean squared error of the estimators;
(v)
Compute also the p-value for [10]’s test of bivariate normality of the estimators;
(vi)
Repeat steps (i) to (v) for n = 10, 20, …, 1000.
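The following condensed R sketch (our reconstruction) implements steps (i)-(iv) for the Theorem 2 estimators and a single n; step (v) additionally requires an implementation of the test of [10] and is omitted here:
set.seed(1)
a=2; b=2; n=100; R=1000
est=t(replicate(R,{x=rbeta(n,a,b)
d=mean(x*log(x))-mean(x)*mean(log(x))
c(mean(x)*(1-mean(x))/d,(1-mean(x))^2/d)}))
bias=colMeans(est)-c(a,b)
mse=colMeans((est-matrix(c(a,b),R,2,byrow=TRUE))^2)
rbind(bias,mse)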
Biases versus n for all four estimators are shown in Figure 7. Mean squared errors versus n for all four estimators are shown in Figure 8. p-values versus n for all four estimators are shown in Figure 9.
We observe the following from the figures: increasing n leads to the biases generally decreasing to zero; the relative performance of the estimators with respect to bias appears similar; the biases appear reasonably small for all n ≥ 200; increasing n leads to the mean squared errors generally decreasing to zero; the mean squared errors appear largest for Tamae et al.'s estimators and smallest for maximum likelihood estimators; the mean squared errors appear reasonably small for all n ≥ 200; and asymptotic normality of all estimators appears to have been achieved for all n ≥ 400.
These observations are for a = 2 and b = 2 . However, the same observations held for a wide range of other values of a and b. In particular, increasing n always led to the biases decreasing to zero, increasing n always led to the mean squared errors decreasing to zero, and the mean squared errors always appeared largest for Tamae et al.’s estimators and smallest for maximum likelihood estimators.

7. Real Data Illustration

In this section, we compare the performance of the estimators given by Theorems 1 and 2, Tamae et al.'s estimators and maximum likelihood estimators using a real dataset. The data are the proportions voting Remain in Brexit (EU referendum) poll outcomes for 127 polls from January 2016 to the referendum in June 2016. The actual data values are 0.52, 0.55, 0.51, 0.49, 0.44, 0.54, 0.48, 0.41, 0.45, 0.42, 0.53, 0.45, 0.44, 0.44, 0.42, 0.42, 0.37, 0.46, 0.43, 0.39, 0.45, 0.44, 0.46, 0.40, 0.48, 0.42, 0.44, 0.45, 0.43, 0.43, 0.48, 0.41, 0.43, 0.40, 0.41, 0.42, 0.44, 0.51, 0.44, 0.44, 0.41, 0.41, 0.45, 0.55, 0.44, 0.44, 0.52, 0.55, 0.47, 0.43, 0.55, 0.38, 0.36, 0.38, 0.44, 0.42, 0.44, 0.43, 0.42, 0.49, 0.39, 0.41, 0.45, 0.43, 0.44, 0.51, 0.51, 0.49, 0.48, 0.43, 0.53, 0.38, 0.40, 0.39, 0.35, 0.45, 0.42, 0.40, 0.39, 0.44, 0.51, 0.39, 0.35, 0.41, 0.51, 0.45, 0.49, 0.40, 0.48, 0.41, 0.46, 0.47, 0.43, 0.45, 0.48, 0.49, 0.40, 0.40, 0.40, 0.39, 0.41, 0.39, 0.48, 0.48, 0.37, 0.38, 0.42, 0.51, 0.45, 0.40, 0.54, 0.36, 0.43, 0.49, 0.41, 0.36, 0.42, 0.38, 0.55, 0.44, 0.54, 0.41, 0.52, 0.42, 0.38, 0.42, 0.44.
We fitted the beta distribution using the four estimators. The estimates of ( α , β ) of the beta distribution were (45.315, 57.107), (45.196, 56.974), (44.061, 55.543) and (46.139, 58.163) for the maximum likelihood estimators, TIK estimators, the estimators given by Theorem 1 and the estimators given by Theorem 2, respectively.
The standard errors of the estimates of (α, β) can be obtained using Theorems 1 and 2. However, the theorems are asymptotic and may not reflect the finite sample variability of the estimators. We used the following leave-one-out procedure to check variability (an R sketch is given after the list):
(i)
Remove the ith observation from the data;
(ii)
Refit the beta distribution to the modified data using the four estimators;
(iii)
Repeat steps (i) and (ii) for i = 1, 2, …, 127.
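A short R sketch of this procedure (ours) for the Theorem 1 estimator of alpha, where x is the vector of 127 poll proportions above:
loo=sapply(seq_along(x),function(i)
{xi=x[-i]
d=mean(xi)*mean(log(1-xi))-mean(xi*log(1-xi))
mean(xi)^2/d})
sd(loo)   # compare with the value 0.435 reported below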
The histograms (not shown here) looked similar for all four estimators. However, the standard deviations of the histograms for the estimate of α were 0.448, 0.448, 0.435 and 0.459 for the maximum likelihood estimator, the TIK estimator, the estimator given by Theorem 1 and the estimator given by Theorem 2, respectively. The standard deviations of the histograms for the estimate of β were 0.600, 0.599, 0.580 and 0.617 for the same four estimators, respectively. Hence, the estimators given by Theorem 1 provide the best performance with respect to accuracy of fit.
The probability and quantile plots (not shown here) looked similar for all four estimators. However, the sums of squares of the deviations between expected and observed probabilities were 0.2970, 0.2932, 0.2884 and 0.2981 for the maximum likelihood estimators, the TIK estimators, the estimators given by Theorem 1 and the estimators given by Theorem 2, respectively. The sums of squares of the deviations between expected and observed quantiles were 0.0101, 0.0101, 0.0100 and 0.0102 for the same four estimators, respectively. Hence, the estimators given by Theorem 1 provide the best performance with respect to goodness of fit.

8. Conclusions

Motivated by [8], we have proposed two closed form estimators for the parameters of the beta distribution. We have proved their asymptotic normality and derived expressions for their asymptotic variances and asymptotic covariances. Through a numerical comparison and a real data application, we have shown that the proposed estimators can be more efficient than Tamae et al.'s estimators as well as maximum likelihood estimators for certain parameter values. We have also performed a simulation study to check the finite sample behavior of all the estimators.
Future work includes a theoretical comparison of the performance of the proposed estimators, Tamae et al.'s estimators and maximum likelihood estimators for finite n. Another direction is to see whether closed form estimators can be derived for known extensions of the beta distribution, including bivariate, multivariate, matrix variate and complex variate beta distributions [1,11].

Author Contributions

Methodology, V.M.N.; Software, S.N.; Formal analysis, V.M.N.; Writing—original draft, V.M.N.; Supervision, S.N. All authors have read and agreed to the published version of the manuscript.

Funding

This paper has received no external funding.

Data Availability Statement

Data and code are given as part of the manuscript.

Acknowledgments

The authors would like to thank the Editor and the four referees for careful reading and comments which greatly improved the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. R Code

The following code computes the estimators given by Theorems 1 and 2, Tamae et al.’s estimators and maximum likelihood estimators.
# x: vector of observations in (0,1)
# Negative log-likelihood of the beta distribution
f=function(p)
{tt=1.0e20
if (p[1]>0&p[2]>0) tt=-sum(dbeta(x,shape1=p[1],shape2=p[2],log=TRUE))
return(tt)}
# Maximum likelihood estimates
est=optim(par=c(1,1),fn=f)
a=est$par[1]
b=est$par[2]
# TIK estimators of Tamae et al. [8]
a2=mean(x)/(mean(x*log(x/(1-x)))-mean(x)*mean(log(x/(1-x))))
b2=(1-mean(x))/(mean(x*log(x/(1-x)))-mean(x)*mean(log(x/(1-x))))
# Estimators given by (12) (Theorem 1)
a3=(mean(x))**2/(mean(x)*mean(log(1-x))-mean(x*log(1-x)))
b3=(mean(x)-(mean(x))**2)/(mean(x)*mean(log(1-x))-mean(x*log(1-x)))
# Estimators given by (16) (Theorem 2)
a4=(mean(x))*(1-mean(x))/(mean(x*log(x))-mean(x)*mean(log(x)))
b4=(1-mean(x))**2/(mean(x*log(x))-mean(x)*mean(log(x)))
The following code computes the asymptotic variances and asymptotic covariances of the estimators given by Theorems 1 and 2, Tamae et al.’s estimators and maximum likelihood estimators.
# Quantities A-G and sigma^2 as defined in Section 1 (a, b play the roles of alpha, beta)
A=digamma(a)-digamma(a+b)
B=digamma(b)-digamma(a+b)
C=a+b
D=trigamma(b)-trigamma(a+b)
E=trigamma(a)-trigamma(a+b)
F=trigamma(a)+trigamma(b)
G=trigamma(a)*trigamma(b)-trigamma(a)*trigamma(a+b)-trigamma(b)*trigamma(a+b)
s2=a*b/((a+b)**2*(a+b+1))
# Theorem 1: asymptotic variances and covariance
vic111=a*b*(1+C*C*D)/(C+1)+a*(a+1)*C*(C*C+1)/(C+1)**3
vic112=(b*b*C*C*D-a*(b+C))/(C+1)+(b*C)*((a+2)*C*C+C+a+1)/(C+1)**3
vic122=a*b/(C+1)+b*b*C*(b*C*D-a+1)/(a*(C+1))+
2*(a+1)*b*C*(b*C*C-a*C-a)/(a*(C+1)**3)
# Theorem 2: asymptotic variances and covariance
vic211=s2*a*a*C*C*(1+2*C*C/(C+1)+b*C**3/((a+1)*(C+1)**2))/(b*b)+
a*a*C*(2+a*C*E/(C+1))/b+a**3*C**3*(a+1)*(1/C**2+1/(C+1)**2-
1/(a*a)-1/(a+1)**2)/(b*b*(C+1))
vic212=s2*a*C*C*((b/a)*(1+C*C/(C+1))+b*C**3/((a+1)*(C+1)**2)+
2+3*C*C/(C+1))/b+a*C*(1+a*C*E/(C+1))+a*a*C**3*(a+1)*(1/C**2+
1/(C+1)**2-1/a**2-1/(a+1)**2)/(b*(C+1))
vic222=s2*C*C*(4*(1+b/a+C**2/(C+1))+(b/a)*(b/a+2*C**2/(C+1))+
b*C**3/((a+1)*(C+1)**2))+a*b*C*C*E/(C+1)+a*(a+1)*C**3*(1/C**2+
1/(C+1)**2-1/a**2-1/(a+1)**2)/(C+1)
# TIK estimators: asymptotic variances and covariance
TIK11=(s2*C*C*F+1)*a**2-a*b/(C+1)
TIK22=(s2*C*C*F+1)*b**2-a*b/(C+1)
TIK12=(s2*C*C*F+1)*a*b-(C*C-a*b)/(C+1)
# Maximum likelihood estimators: asymptotic variances and covariance
MLE11=D/G
MLE22=E/G
MLE12=(trigamma(a+b))/G

References

  1. Gupta, A.K.; Nadarajah, S. Handbook of Beta Distribution and Its Applications; CRC Press: New York, NY, USA, 2004.
  2. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions; John Wiley and Sons: New York, NY, USA, 1995; Volume 2.
  3. Kotz, S.; van Dorp, J.R. Beyond Beta: Other Continuous Families of Distributions with Bounded Support and Applications; World Scientific Publishing Company: Hackensack, NJ, USA, 2004.
  4. Larson, H.J. Introduction to Probability Theory and Statistical Inference; John Wiley and Sons: New York, NY, USA, 1982.
  5. Seber, G.A.F. The Linear Model and Hypothesis: A General Unifying Theory; Springer: Cham, Switzerland, 2015.
  6. Ferrari, S.L.P.; Cribari-Neto, F. Beta regression for modelling rates and proportions. J. Appl. Stat. 2004, 31, 799-815.
  7. Ye, Z.-S.; Chen, N. Closed-form estimators for the gamma distribution derived from likelihood equations. Am. Stat. 2017, 71, 177-181.
  8. Tamae, H.; Irie, K.; Kubokawa, T. A score-adjusted approach to closed form estimators for the gamma and beta distributions. Jpn. J. Stat. Data Sci. 2020, 3, 543-561.
  9. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023.
  10. Henze, N.; Zirkler, B. A class of invariant consistent tests for multivariate normality. Commun. Stat.-Theory Methods 1990, 19, 3595-3617.
  11. Balakrishnan, N.; Lai, C.D. Continuous Bivariate Distributions; Springer Verlag: New York, NY, USA, 2009.
Figure 1. Asymptotic variance of the estimator of α versus α and β for the estimator given by Theorem 1 (top left), the estimator given by Theorem 2 (top right), TIK estimator (bottom left), and maximum likelihood estimator (bottom right).
Figure 2. Asymptotic variance of the estimator of β versus α and β for the estimator given by Theorem 1 (top left), the estimator given by Theorem 2 (top right), TIK estimator (bottom left), and maximum likelihood estimator (bottom right).
Figure 3. Asymptotic covariance between the estimators of α and β versus α and β for the estimators given by Theorem 1 (top left), the estimators given by Theorem 2 (top right), TIK estimators (bottom left), and maximum likelihood estimators (bottom right).
Figure 4. Sign of the difference between the asymptotic variances of the estimators of α given by Theorems 1 and 2 (top left); sign of the difference between the asymptotic variances of the estimator of α given by Theorem 1 and TIK estimator (top right); sign of the difference between the asymptotic variances of the estimator of α given by Theorem 1 and maximum likelihood estimator (middle left); sign of the difference between the asymptotic variances of the estimator of α given by Theorem 2 and TIK estimator (middle right); sign of the difference between the asymptotic variances of the estimator of α given by Theorem 2 and maximum likelihood estimator (bottom left); sign of the difference between the asymptotic variances of TIK estimator of α and maximum likelihood estimator (bottom right).
Figure 5. Sign of the difference between the asymptotic variances of the estimators of β given by Theorems 1 and 2 (top left); sign of the difference between the asymptotic variances of the estimator of β given by Theorem 1 and TIK estimator (top right); sign of the difference between the asymptotic variances of the estimator of β given by Theorem 1 and maximum likelihood estimator (middle left); sign of the difference between the asymptotic variances of the estimator of β given by Theorem 2 and TIK estimator (middle right); sign of the difference between the asymptotic variances of the estimator of β given by Theorem 2 and maximum likelihood estimator (bottom left); sign of the difference between the asymptotic variances of TIK estimator of β and maximum likelihood estimator (bottom right).
Figure 6. Sign of the difference between the asymptotic covariances of the estimators given by Theorems 1 and 2 (top left); sign of the difference between the asymptotic covariances of the estimators given by Theorem 1 and TIK estimator (top right); sign of the difference between the asymptotic covariances of the estimators given by Theorem 1 and maximum likelihood estimator (middle left); sign of the difference between the asymptotic covariances of the estimators given by Theorem 2 and TIK estimator (middle right); sign of the difference between the asymptotic covariances of the estimators given by Theorem 2 and maximum likelihood estimator (bottom left); sign of the difference between the asymptotic covariances of TIK estimator and maximum likelihood estimator (bottom right).
Figure 7. Biases of the estimators of a (left) and b (right) versus n.
Figure 8. Mean squared errors of the estimators of a (left) and b (right) versus n.
Figure 9. P-values for [10]'s test of bivariate normality of the estimators versus n. The horizontal line corresponds to 0.05.