Multistage Estimation of the Rayleigh Distribution Variance

Yousef, Ali; Amin, Ayman A.; Hassan, Emad E.; Hamdy, Hosny I.

doi:10.3390/sym12122084

¹ Department of Mathematics, Kuwait College of Science and Technology, Kuwait City 27235, Kuwait

² Faculty of Commerce, Menoufia University, Menoufia 32952, Egypt

³ Faculty of Management Sciences, October University for Modern Sciences and Arts, Cairo 11435, Egypt

* Author to whom correspondence should be addressed.

Symmetry 2020, 12(12), 2084; https://doi.org/10.3390/sym12122084

Submission received: 28 November 2020 / Revised: 14 December 2020 / Accepted: 14 December 2020 / Published: 15 December 2020

(This article belongs to the Special Issue Skewed (Asymmetrical) Probability Distributions and Applications across Disciplines)

Abstract:

In this paper, we discuss multistage sequential estimation of the variance of the Rayleigh distribution using the three-stage procedure presented by Hall (Ann. Stat. 9(6):1229–1238, 1981). Since the variance of the Rayleigh distribution is a linear function of the square of the distribution’s scale parameter, it suffices to estimate the square of the scale parameter. We tackle two estimation problems under a unified optimal stopping rule: first, minimum risk point estimation under a squared-error loss function plus linear sampling cost, and second, fixed-width confidence interval estimation. Such estimation cannot be performed using classical fixed-sample-size procedures, because no fixed sample size achieves both objectives simultaneously. We derive the asymptotic results needed to obtain the three-stage regret as well as the three-stage fixed-width confidence interval for the desired parameter. The procedure attains asymptotic second-order efficiency and asymptotic consistency. A series of Monte Carlo simulations was conducted to study the procedure’s performance as the optimal sample size increases. The simulation results agree with the asymptotic results.

Keywords:

asymptotic regret; coverage probability; loss function; Monte Carlo simulation; optimal stopping rule; three-stage procedure

1. Introduction

The Rayleigh distribution was introduced by Rayleigh [1] in 1880, originally in the context of a problem in acoustics and optics. For the history of the distribution, see Johnson et al. [2]. The distribution is extensively used in communication theory to describe the hourly median and instantaneous peak power of received radio signals. It plays a crucial role in survival analysis, reliability analysis, physical sciences, engineering, medical imaging science, applied statistics, and clinical studies. For more details on its applications, see Palovko [3], Gross and Clark [4], Lee and Wang [5], Rosen et al. [6], and Siddiqui [7]; for more information about the distribution and inference on its parameters, see Siddiqui [7], Hirano [8], Dyer and Whisenand [9], Howlader and Hossain [10], and Johnson et al. [2].

Several forms of the Rayleigh distribution have been proposed to provide flexibility in modeling data. Vode [11,12] proposed a generalized form of the Rayleigh distribution and discussed its statistical and inferential properties. The probability density function of the generalized form with scale parameter

σ

and shape parameter

λ

is given by:

f (x; σ, λ) = \frac{σ^{- 2 λ - 2} x^{2 λ + 1}}{2^{λ} Γ (λ + 1)} e^{- x^{2} / (2 σ^{2})}, x > 0, σ > 0, λ > - 1

where

Γ (a) = \int_{0}^{\infty} t^{a - 1} e^{- t} d t

is the gamma function.

At

λ = 0

we obtain the standard Rayleigh distribution. The probability density function of the standard Rayleigh distribution with scale parameter

σ

is:

f (x; σ) = \frac{x}{σ^{2}} e^{- x^{2} / (2 σ^{2})}, x > 0, σ > 0 .

Now, let

X_{1}, X_{2}, \dots

be independent and identically distributed random variables following a standard Rayleigh distribution with unknown scale parameter

σ

. It can be shown from Johnson et al. [2] that the population mean and population variance of the distribution are, respectively,

E (X) = \sqrt{π / 2} σ

and

V a r (X) = (2 - \frac{π}{2}) σ^{2}

.

Recently, Yousef et al. [13] discussed the Rayleigh distribution scale parameter’s multistage estimation using Hall’s [14] three-stage procedure. They tackled two estimation problems, point and confidence interval estimation, under a unified optimal stopping rule. They obtained the three-stage regret while they discussed the coverage probability through Monte Carlo simulation. They proved that the procedure attains asymptotic second-order efficiency and asymptotic consistency in the sense of Chow and Robbins [15] and Ghosh and Mukhopadhyay [16]. Tahir [17] proposed a purely sequential procedure to tackle the point estimation problem for the square of the scale parameter of the Rayleigh distribution, using a weighted squared-error loss function plus the cost of sampling. He found a second-order asymptotic expansion for the incurred regret and proved that the asymptotic regret is negative for a range of parameter values.

In this paper, the aim is to estimate the population variance

V a r (X) = (2 - \frac{π}{2}) σ^{2}

or the population second moment

E (X^{2}) = 2 σ^{2}

of the Rayleigh distribution through estimating the scale parameter’s square of the Rayleigh distribution. We do so because both the variance and the second moment are linear functions of

σ^{2}

. We use Hall’s [14] procedure to carry out the study. Sequential estimation is needed because no fixed-sample-size procedure solves the problem analytically. For more details, see Mukhopadhyay and de Silva [18] (Chapters 6–13 and 16) and Ghosh et al. [19] (Theorem 3.7.1). Since our focus is on the sequential estimation of the square of the scale parameter, we use the following transformation to simplify the calculations in the subsequent sections. Let

W = X^{2} / 2

, and

θ = σ^{2}

. A change of variables (with the corresponding Jacobian) then yields:

f (w; θ) = \frac{1}{θ} e^{- w / θ}, 0 \leq w < \infty, θ > 0,

(1)

which is the probability density function of the exponential distribution with mean

θ

. It is readily known that

2 W / θ

is distributed according to the chi-squared distribution with two degrees of freedom

χ_{2}^{2}

. Let

W_{1}, W_{2}, \dots, W_{n}

be a sequence of independent and identically distributed random variables following the exponential distribution in (1), then the

r^{\text{th}}

raw moment is given by:

E (W^{r}) = θ^{r} Γ (r + 1) .

Hence, for any

W_{i}

and

i = 1, 2, 3, \dots, n

we have:

E (W_{i} - θ) = 0, E {(W_{i} - θ)}^{2} = θ^{2}, E {(W_{i} - θ)}^{3} = 2 θ^{3}, E {(W_{i} - θ)}^{4} = 9 θ^{4} .

Moreover, let

{\bar{W}}_{n} = \frac{\sum_{i = 1}^{n} W_{i}}{n}

be the sample average of a random sample of size

n (\geq 2)

, then:

E ({\bar{W}}_{n} - θ) = 0, E {({\bar{W}}_{n} - θ)}^{2} = \frac{θ^{2}}{n}, E {({\bar{W}}_{n} - θ)}^{3} = \frac{2 θ^{3}}{n^{2}}, E {({\bar{W}}_{n} - θ)}^{4} = \frac{3 θ^{4}}{n^{2}} + \frac{6 θ^{4}}{n^{3}} .
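As an informal numerical check of the transformation, the sketch below simulates Rayleigh variates by inversion, forms W = X²/2, and compares the sample mean and variance of W with θ and θ². The function names, the seed, and the inversion formula X = σ√(−2 ln(1 − U)), chosen so that E(X²) = 2σ² as stated above, are illustrative assumptions and not part of the original study.

```python
import math
import random


def rayleigh_variate(sigma, rng):
    """Draw one Rayleigh variate by inversion, scaled so that E(X^2) = 2 * sigma^2."""
    u = rng.random()                      # uniform on [0, 1)
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))


def transformed_moments(sigma, n, seed=0):
    """Return the sample mean and variance of W = X^2 / 2 over n simulated draws."""
    rng = random.Random(seed)
    w = [rayleigh_variate(sigma, rng) ** 2 / 2.0 for _ in range(n)]
    mean = sum(w) / n
    var = sum((x - mean) ** 2 for x in w) / (n - 1)
    return mean, var


if __name__ == "__main__":
    sigma = 2.0                           # hypothetical scale; theta = sigma^2 = 4
    mean, var = transformed_moments(sigma, 200_000)
    print(f"mean of W: {mean:.3f} (theta = {sigma ** 2})")
    print(f"variance of W: {var:.3f} (theta^2 = {sigma ** 4})")
```

With a large number of draws, the two sample moments should settle near θ and θ², which is what the exponential density in (1) predicts.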

2. Estimation Problems

2.1. Minimum Risk Point Estimation for the Parameter $θ$

It is a common practice in optimal decision structures to assume that the cost incurred in estimating

θ

by the corresponding sample measure

{\bar{W}}_{n}

takes the form of a squared-error loss function with linear sampling cost. For more details, see Degroot [20], Chow and Yu [21], and Martinsek [22].

L_{n} (A) = A^{2} {({\bar{W}}_{n} - θ)}^{2} + c n .

(2)

The first term in (2) is known as the cost of estimation and the second term is the cost of sampling. The constant

c

is the cost per unit sample, and

A

is the estimation unit’s cost. Details regarding the interpretation of

A

are given in the following sections. The risk associated with (2) is:

R_{n} (A) = E L_{n} (A) = A^{2} \frac{θ^{2}}{n} + c n .

(3)

Considering

n

as a continuous variable, we differentiate (3) with respect to

n

and set the result equal to zero to obtain the optimal sample size:

n^{*}_{p o i n t} = A θ / \sqrt{c} .

(4)

Note that (4) is the optimal fixed-sample size required to minimize the risk had

θ

been known.
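For completeness, the minimization behind (4) can be written out step by step. The second-derivative check and the final line, the risk at the optimum, are added here as a worked step that follows directly from (3) and (4).

```latex
\begin{aligned}
\frac{\partial R_{n}(A)}{\partial n} &= -\frac{A^{2}\theta^{2}}{n^{2}} + c = 0
  \;\Longrightarrow\; n^{*}_{point} = \frac{A\theta}{\sqrt{c}}, \\
\frac{\partial^{2} R_{n}(A)}{\partial n^{2}} &= \frac{2A^{2}\theta^{2}}{n^{3}} > 0, \\
R_{n^{*}_{point}}(A) &= \frac{A^{2}\theta^{2}}{n^{*}_{point}} + c\,n^{*}_{point}
  = 2c\,n^{*}_{point}.
\end{aligned}
```

At the optimum, the cost of estimation and the cost of sampling are therefore equal, each contributing c n*.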

2.2. Fixed-Width Confidence Interval Estimation for the Parameter $θ$

Assume further that a confidence interval of fixed width

2 d (d > 0)

for

θ

is required, whose coverage probability is at least the nominal value

100 (1 - α) %

. We use the central limit theorem and the normal approximation of the distribution of the sample average

{\bar{W}}_{n}

to propose the interval

I_{n} = ({\bar{W}}_{n} - d \leq θ \leq {\bar{W}}_{n} + d)

for

θ

. For large

n

, the central limit theorem implies that the quantity

Q = \frac{\sqrt{n} ({\bar{W}}_{n} - θ)}{θ}

is approximately distributed as a standard normal random variable. Therefore, we require

P (| Q | \leq \frac{d \sqrt{n}}{θ}) \geq 1 - α = 2 Φ (a) - 1,

where

Φ (u) = {(\sqrt{2 π})}^{- 1} \int_{- \infty}^{u} e^{- t^{2} / 2} d t

. Moreover,

a

is the upper

α / 2

percentage point of the standard normal distribution. It follows that the optimal sample size required to satisfy the above objectives takes the form:

n^{*}_{c o n f} = \frac{a^{2} θ^{2}}{d^{2}} .

(5)

Since

θ

is numerically unknown, then

n^{*}_{c o n f}

is unknown. It was shown by Dantzig [23] that there exists no fixed sample size that can achieve the above objectives uniformly for all

θ > 0

except sequentially.

In the following, we combine both the point estimation and the confidence interval estimation in one decision framework to make maximum use of the available sample to achieve several objectives in performing inference.

2.3. A Unified Decision Framework for Point and Interval Estimation

To determine the optimal sample size required to achieve both types of estimation, we equate both Equations (4) and (5) to obtain

\frac{a^{2} θ^{2}}{d^{2}} = A θ / \sqrt{c}

, which results in

A = \frac{a^{2} θ \sqrt{c}}{d^{2}}

and

A^{2} = \frac{a^{2}}{d^{2}} c n^{*}

, contrary to the common treatment in the literature of

A^{2}

as a known constant. In fact,

A^{2}

is partially known. The term

\frac{a^{2}}{d^{2}}

is a Fisher information and

c n^{*}

represents the optimal cost of sampling, which depends on the unknown

n^{*}

. Therefore,

A^{2}

represents the cost of estimation measured relative to the optimal cost of sampling. Clearly,

A^{2} \to \infty

as

d \to 0

. However, we continue to use the optimal sample size in a general form as

n^{*} = λ^{2} g (θ)

to define the three-stage sampling procedure in the following section. The function

g (> 0)

is a real-valued and continuously differentiable bounded function in a neighborhood around the parameter

θ

such that

\sup_{n > m} |g'' (n)| = O (|g'' (n^{*})|)

.
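As a small numerical illustration of how (4) and (5) are tied together, the sketch below computes the interval-based sample size, the value of A implied by equating the two sample sizes, and then confirms that the point-estimation sample size coincides with it. The function names and the numerical inputs (θ = 5, a = 1.96, d = 1, c = 0.01) are hypothetical choices made only for illustration.

```python
import math


def n_conf(theta, a, d):
    """Optimal fixed-width sample size from (5): a^2 * theta^2 / d^2."""
    return (a * theta / d) ** 2


def implied_A(theta, a, d, c):
    """Value of A obtained by equating (4) and (5): a^2 * theta * sqrt(c) / d^2."""
    return a ** 2 * theta * math.sqrt(c) / d ** 2


def n_point(theta, A, c):
    """Optimal minimum-risk sample size from (4): A * theta / sqrt(c)."""
    return A * theta / math.sqrt(c)


if __name__ == "__main__":
    theta, a, d, c = 5.0, 1.96, 1.0, 0.01   # hypothetical inputs
    A = implied_A(theta, a, d, c)
    print(n_conf(theta, a, d), n_point(theta, A, c))
```

In practice θ is unknown, so neither sample size can be computed this way; the sketch only shows that a single n* serves both objectives once A is chosen as above.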

3. Three-Stage Sequential Procedure for Inference

Hall [14] introduced the three-stage sampling procedure. The objective was to obtain a fixed-width confidence interval for the mean of a normal distribution when the variance is finite but unknown. It was designed to overcome several technical problems in both one-by-one purely sequential schemes that were introduced by Anscombe [24], Robbins [25], and Chow and Robbins [15] and the two-stage bulk sample that was introduced by Stein [26,27] and Cox [28]. The procedure showed asymptotic second-order efficiency and asymptotic consistency in the sense of Chow and Robbins [15]. Mukhopadhyay [29] developed a unified framework for the three-stage procedure and laid out the theory associated with asymptotic second-order properties. As suggested by the name, the procedure is carried out in three consecutive sampling phases, the pilot phase, the main-study phase, and the fine-tuning phase.

The pilot-study phase: We start the process by observing a random pilot sample of size

m (> 2)

W_{1}, W_{2}, \dots, W_{m}

to initiate the sampling procedure and calculate the sample estimate

{\bar{W}}_{m} = \frac{\sum_{i = 1}^{m} W_{i}}{m}

.

The main-study phase: In this phase, only a portion

0 < δ < 1

of the optimal sample size

n^{*}

is estimated, to guard against both early stopping and oversampling. Let

[x]

denote the largest integer not exceeding x. The sample size for this phase is then determined by the following stopping rule:

N_{1} = \max \{m, [δ λ^{2} g ({\bar{W}}_{m})] + 1\} .

(6)

If

m \geq [δ λ^{2} g ({\bar{W}}_{m})] + 1

, we terminate the process at this phase; otherwise, we continue to observe an additional sample of size

[δ λ^{2} g ({\bar{W}}_{m})] + 1 - m,

say

(W_{m + 1}, W_{m + 2}, \dots, W_{N_{1}})

, combine the two samples, and update the estimate

{\bar{W}}_{N_{1}} = \frac{\sum_{i = 1}^{N_{1}} W_{i}}{N_{1}}

.

The fine-tuning phase: We define the fine-tuning stopping rule as:

N = \max \{N_{1}, [λ^{2} g ({\bar{W}}_{N_{1}})] + 1\} .

(7)

If

N_{1} \geq [λ^{2} g ({\bar{W}}_{N_{1}})] + 1

, sampling is terminated; otherwise, we observe an additional sample of size

[λ^{2} g ({\bar{W}}_{N_{1}})] + 1 - N_{1}

, say

(W_{N_{1} + 1}, W_{N_{1} + 2}, W_{N_{1} + 3}, \dots, W_{N})

, and then terminate sampling. We then propose the point estimate

{\bar{W}}_{N}

and the confidence interval

I_{N} = {\bar{W}}_{N} \pm d

for the unknown parameter

θ

. As a result, the three-stage point estimate for the variance of the Rayleigh distribution is

(2 - \frac{π}{2}) \hat{θ}

, where

\hat{θ} = {\bar{W}}_{N}

.

The asymptotic characteristics of each phase are given in the following section.
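The three sampling phases described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes g(θ) = θ² and λ² = a²/d², as suggested by (5), reads [x] as the floor function, and draws exponential observations directly. The function name `three_stage` and its default arguments are hypothetical.

```python
import math
import random


def three_stage(theta, d, a=1.96, delta=0.5, m=10, rng=None):
    """Run one three-stage sample; return (N1, N, W_bar_N).

    Assumes g(theta) = theta^2 and lambda^2 = a^2 / d^2 (read off from (5)).
    """
    rng = rng or random.Random()
    lam2 = (a / d) ** 2

    def g(t):
        return t * t

    def draw(k):
        # Exponential observations with mean theta (rate 1 / theta).
        return [rng.expovariate(1.0 / theta) for _ in range(k)]

    # Pilot phase: m observations.
    w = draw(m)
    w_bar_m = sum(w) / m

    # Main-study phase, rule (6): aim at a fraction delta of the estimated n*.
    n1 = max(m, math.floor(delta * lam2 * g(w_bar_m)) + 1)
    w += draw(n1 - m)
    w_bar_n1 = sum(w) / n1

    # Fine-tuning phase, rule (7): top up to the full estimated n*.
    n = max(n1, math.floor(lam2 * g(w_bar_n1)) + 1)
    w += draw(n - n1)
    return n1, n, sum(w) / n


if __name__ == "__main__":
    n1, n, w_bar = three_stage(theta=5.0, d=1.0, rng=random.Random(2020))
    rayleigh_var = (2.0 - math.pi / 2.0) * w_bar   # variance estimate
    print(n1, n, round(w_bar, 3), round(rayleigh_var, 3))
```

The true θ is passed in only to generate data; the stopping rules themselves use nothing but the running sample means, which is the point of the procedure.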

The following asymptotic results were developed under the general regularity assumptions set forward by Hall [14] to develop a theory for the three-stage sampling procedure, which states:

Assumption A.

Let

0 < δ < 1, k > 1

and

λ^{2} = λ^{2} (m)

,

m \geq 1

be constants, with

λ^{2} (m) \to \infty

,

\limsup_{m \to \infty} \frac{m}{λ^{2} (m)} < δ θ

and

λ^{2} (m) = O (m^{k})

,

k > 1

.

3.1. Asymptotic Characteristics for the Main Study Phase

Theorem 1.

Under assumption (A), for the stopping rules (6) and (7), we have as

d \to 0,

(i)

E ({\bar{W}}_{N_{1}}) = θ - \frac{θ^{2}}{δ n^{*}} \frac{d \ln (g (θ))}{d θ} + o (d^{2})

(ii)

E ({\bar{W}}_{N_{1}}^{2}) = θ^{2} + \frac{θ^{2}}{δ n^{*}} (1 - 2 θ \frac{d \ln (g (θ))}{d θ}) + o (d^{2})

(iii)

E {({\bar{W}}_{N_{1}} - θ)}^{2} = \frac{θ^{2}}{δ n^{*}} + o (d^{2})

Proof.

To prove

(i),

we consider

E ({\bar{W}}_{N_{1}}) = θ + E \{N_{1}^{- 1} \sum_{i = 1}^{N_{1}} (W_{i} - θ)\}

.

Conditioning on the

σ-field

generated by

W_{1}

,

W_{2}

,

W_{3}

, …,

W_{m}

we get

E ({\bar{W}}_{N_{1}}) = θ + E \{E \{N_{1}^{- 1} (\sum_{i = 1}^{m} (W_{i} - θ) + \sum_{i = m + 1}^{N_{1}} (W_{i} - θ)) | W_{1}, W_{2}, \dots, W_{m}\}\}

.

E ({\bar{W}}_{N_{1}}) = θ + E \{N_{1}^{- 1} (\sum_{i = 1}^{m} (W_{i} - θ) + (N_{1} - m) E (W_{i} - θ))\}

, by Wald’s [30] first equation. Therefore,

E ({\bar{W}}_{N_{1}}) = θ + E \{N_{1}^{- 1} \sum_{i = 1}^{m} (W_{i} - θ)\}

. We then expand

N_{1}^{- 1}

in a Taylor series expansion around

g (θ)

to obtain:

N_{1}^{- 1} = {(δ n^{*})}^{- 1} - δ λ^{2} (g ({\bar{W}}_{m}) - g (θ)) {(δ n^{*})}^{- 2} + {(δ λ^{2})}^{2} {(g ({\bar{W}}_{m}) - g (θ))}^{2} ν^{- 3},

where

ν

is a random variable that lies between

N_{1}

and

δ n^{*}

.

\begin{array}{l} E ({\bar{W}}_{N_{1}}) = θ - δ λ^{2} m E {({\bar{W}}_{m} - θ)}^{2} g' (θ) {(δ n^{*})}^{- 2} + (δ λ^{2}) m E \{{({\bar{W}}_{m} - θ)}^{3} g'' (ν)\} {(δ n^{*})}^{- 2} \\ + {(δ λ^{2})}^{2} m E \{({\bar{W}}_{m} - θ) {(g ({\bar{W}}_{m}) - g (θ))}^{2} ν^{- 3}\} = I - II + III + IV . \end{array}

where we have expanded

g ({\bar{W}}_{m})

in a second-order Taylor series around

θ

. The fact that

g (> 0)

and its derivatives are bounded gives:

E (II) = \frac{θ^{2}}{δ n^{*}} \frac{d \ln (g (θ))}{d θ}, E (III) = (δ λ^{2}) m E \{{({\bar{W}}_{m} - θ)}^{3} g'' (ν)\} {(δ n^{*})}^{- 2} \leq M (δ λ^{2}) m {(δ n^{*})}^{- 2} \frac{2 θ^{3}}{m^{2}} = o (d^{2})

as

m \to \infty

, by the regularity condition in Assumption (A), where

M

is a constant independent of both

ν

and

N_{1}

.

Next,

E (IV) = {(δ λ^{2})}^{2} m E \{({\bar{W}}_{m} - θ) {(g ({\bar{W}}_{m}) - g (θ))}^{2} ν^{- 3}\};

consider the following two cases:

If $m \leq N_{1} < ν < δ n^{*}$, we have:

$E (IV) = {(δ λ^{2})}^{2} m E \{({\bar{W}}_{m} - θ) {(g ({\bar{W}}_{m}) - g (θ))}^{2} ν^{- 3}\} \leq M \frac{δ λ^{2} m E ({\bar{W}}_{m} - θ)}{m^{3}} = 0 .$

We have used the assumption that

g (\cdot)

and its derivatives are bounded and the fact that

E ({\bar{W}}_{m} - θ) = 0

.

If $δ n^{*} < ν < N_{1}$, we have:

$E (IV) = {(δ λ^{2})}^{2} m E \{({\bar{W}}_{m} - θ) {(g ({\bar{W}}_{m}) - g (θ))}^{2} ν^{- 3}\} \leq M {(δ λ^{2})}^{2} m E ({\bar{W}}_{m} - θ) / {(δ n^{*})}^{3} = 0 .$

Therefore,

E ({\bar{W}}_{N_{1}}) = θ - \frac{θ^{2}}{δ n^{*}} \frac{d \ln (g (θ))}{d θ} + o (d^{2}) .

This proves

(i)

of Theorem 1. □

To prove

(ii)

, we write

E ({\bar{W}}_{N_{1}}^{2}) = E \{N_{1}^{- 2} ({(\sum_{i = 1}^{N_{1}} (W_{i} - θ + θ))}^{2})\} .

E ({\bar{W}}_{N_{1}}^{2}) = θ^{2} + E N_{1}^{- 2} {(\sum_{i = 1}^{N_{1}} (W_{i} - θ))}^{2} + 2 θ E N_{1}^{- 1} (\sum_{i = 1}^{N_{1}} (W_{i} - θ))

and by

(i)

of Theorem 1, we get

E ({\bar{W}}_{N_{1}}^{2}) = θ^{2} - \frac{2 θ^{3}}{δ n^{*}} \frac{d \ln (g (θ))}{d θ} + E \{N_{1}^{- 2} {(\sum_{i = 1}^{N_{1}} (W_{i} - θ))}^{2}\}

. Arguments similar to those used above provide

E \{N_{1}^{- 2} {(\sum_{i = 1}^{N_{1}} (W_{i} - θ))}^{2}\} = \frac{θ^{2}}{δ n^{*}} + o (d^{2})

. Combining terms, part (ii) of Theorem 1 is immediate. □

Part (iii) of Theorem 1 follows directly from parts (i) and (ii). □

The following Theorem 2 provides second-order asymptotic expansion of a real-valued, continuously differentiable and bounded function

g (> 0)

of

{\bar{W}}_{N_{1}}

.

Theorem 2.

Under assumption (A), for the three-stage stopping rule (6) and (7) and as

d \to 0,

we have:

E (g ({\bar{W}}_{N_{1}})) = g (θ) + \frac{θ^{2}}{2 δ n^{*}} \{g'' (θ) - 2 g' (θ) \frac{d \ln (g (θ))}{d θ}\} + o (d^{2}) .

Proof.

The proof follows immediately from the second-order Taylor expansion of

g ({\bar{W}}_{N_{1}})

around

θ

, together with parts (i) and (ii) of Theorem 1 and the assumption that the real-valued continuously differentiable function

g > 0

and its derivatives are bounded. The proof is complete. □

In the following section, we present the asymptotic theory for the stopping variable

N .

3.2. Asymptotic Characteristics for the Fine-Tuning Phase

Theorem 3.

Under assumption (A), for the stopping variable

N,

and as

d \to 0,

we have:

(i)

E (N) = n^{*} + \frac{θ^{2} δ^{- 1}}{2} (\frac{g^{″} (θ)}{g (θ)} - 2 {(\frac{g' (θ)}{g (θ)})}^{2}) + \frac{1}{2} + o (1)

(ii)

E (N^{2}) = {(n^{*})}^{2} + \frac{θ^{2} n^{*}}{δ} (\frac{g^{″} (θ)}{g (θ)} - {(\frac{g' (θ)}{g (θ)})}^{2}) + n^{*} + o (d^{- 2})

(iii)

E {(N - n^{*})}^{2} = \frac{θ^{2} n^{*}}{δ} {(\frac{g' (θ)}{g (θ)})}^{2} + o (d^{- 2})

Proof.

First, write the random variable

N

as

N = λ^{2} g ({\bar{W}}_{N_{1}}) + φ_{N_{1}}

,

a . s .

(almost surely) except possibly on a set

ζ = \{N_{1} < m\} \cup \{λ^{2} g ({\bar{W}}_{N_{1}}) < δ λ^{2} g ({\bar{W}}_{N_{1}}) + 1\}

, whose probability is asymptotically negligible, that is,

\int_{ζ} d P = o (1)

, and where the random variable

φ_{N_{1}} = 1 - \{λ^{2} g ({\bar{W}}_{N_{1}}) - [λ^{2} g ({\bar{W}}_{N_{1}})]\}

is distributed uniformly over

(0, 1)

; see Hall [14] for details. Therefore,

E (N) = λ^{2} E (g ({\bar{W}}_{N_{1}})) + E (φ_{N_{1}}) + o (1)

. By using Theorem 2 and

E (φ_{N_{1}}) = 1 / 2

the proof is complete. □

To prove (ii), write

N^{2} = λ^{4} (g^{2} ({\bar{W}}_{N_{1}})) + 2 φ_{N_{1}} λ^{2} g ({\bar{W}}_{N_{1}}) + φ_{N_{1}}^{2} + o (1)

. By taking the expectation we have

E (N^{2}) = λ^{4} E (g^{2} ({\bar{W}}_{N_{1}})) + 2 λ^{2} E \{φ_{N_{1}} g ({\bar{W}}_{N_{1}})\} + E (φ_{N_{1}}^{2}) + o (1) .

It can be shown that as

d \to 0

,

φ_{N_{1}}

and

g ({\bar{W}}_{N_{1}})

are asymptotically uncorrelated.

Hence,

E (N^{2}) = λ^{4} E (g^{2} ({\bar{W}}_{N_{1}})) + λ^{2} E \{g ({\bar{W}}_{N_{1}})\} + o (1)

. By using Taylor expansion for

g^{2} ({\bar{W}}_{N_{1}})

and utilizing Theorem 2, we have:

E (λ^{4} g^{2} ({\bar{W}}_{N_{1}})) = {(n^{*})}^{2} + \frac{θ^{2} n^{*}}{δ g^{2} (θ)} \{g (θ) g^{''} (θ) - {(g' (θ))}^{2}\} + o (d^{2}) .

Combining these expansions completes the proof. □

Part (iii) follows directly from (i) and (ii) of Theorem 3. □

The first part of Theorem 3 shows that

\lim_{d \to 0} E (N / n^{*}) = 1

(first-order asymptotic efficiency) and that

\lim_{d \to 0} E (N - n^{*})

is finite and does not depend on

n^{*}

. Such a property is called second-order asymptotic efficiency in the sense of Chow and Robbins [15]. Part (iii) shows that the variance increases as

n^{*}

increases.
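To make Theorem 3 concrete, consider the choice g(θ) = θ² with λ² = a²/d², which is the form suggested by (5). This specialization is a worked illustration and is not stated explicitly above. Then g'(θ)/g(θ) = 2/θ and g''(θ)/g(θ) = 2/θ², so parts (i) and (iii) reduce to:

```latex
\begin{aligned}
E(N) &= n^{*} + \frac{\theta^{2}}{2\delta}\left(\frac{2}{\theta^{2}} - \frac{8}{\theta^{2}}\right) + \frac{1}{2} + o(1)
      = n^{*} - \frac{3}{\delta} + \frac{1}{2} + o(1), \\
E{(N - n^{*})}^{2} &= \frac{\theta^{2} n^{*}}{\delta}\cdot\frac{4}{\theta^{2}} + o(d^{-2})
      = \frac{4 n^{*}}{\delta} + o(d^{-2}).
\end{aligned}
```

With δ = 0.5, for instance, the expansion gives E(N) ≈ n* − 5.5, so under this assumption the procedure is expected to stop slightly short of the optimal sample size.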

Theorem 4 below provides a second-order asymptotic expansion of the moments of a real-valued function

h (> 0)

that is a continuously differentiable and bounded function of

N

.

Theorem 4.

Let assumption (A) hold, and let

h

(> 0) be a real-valued continuously differentiable function in a neighborhood around

n^{*}

such that

\limsup_{n > m} |h'' (n)| = O (|h''' (n^{*})|)

. Then as

m \to \infty

,

E (h (N)) = h (n^{*}) + \{\frac{θ^{2} δ^{- 1}}{2} (\frac{g'' (θ)}{g (θ)} - 2 {(\frac{g' (θ)}{g (θ)})}^{2}) + \frac{1}{2}\} h' (n^{*}) + \frac{1}{2} \{\frac{θ^{2} n^{*}}{δ} {(\frac{g' (θ)}{g (θ)})}^{2}\} h'' (n^{*}) + o (d^{- 2} |h''' (n^{*})|) .

Proof.

The proof is a direct substitution of

(i)

and

(iii)

of Theorem 3 in the Taylor series expansion of the function

h (N)

. We omit any further details for brevity. The proof is complete. □

Lemma 1.

As

d \to 0

,

N

, suitably standardized, has an asymptotically standard normal distribution.

Proof.

According to Anscombe’s central limit theorem [31], as

d \to 0

,

Q = \sqrt{\frac{δ}{n^{*}}} (\frac{(N - n^{*})}{θ (\frac{g' (θ)}{g (θ)})})

has an asymptotically standard normal distribution. By computing the moment generating function of

Q

,

E (e^{t Q})

and using Theorem 4 we get the result. The proof is complete. □

Theorem 5.

Under assumption (A), for the stopping variable

N,

and as

d \to 0,

we have:

(i)

E ({\bar{W}}_{N}) = θ - \frac{θ^{2} δ}{n^{*}} (\frac{g^{'} (θ)}{g (θ)}) + o (d^{2})

(ii)

E {({\bar{W}}_{N})}^{2} = θ^{2} - \frac{θ^{3} δ}{n^{*}} (1 - 2 θ \frac{d \ln (g (θ))}{d θ}) + o (d^{2})

(iii)

E {({\bar{W}}_{N} - θ)}^{2} = \frac{θ^{2} δ}{n^{*}} + o (d^{2})

Proof.

To prove (i) write

E ({\bar{W}}_{N}) = θ + E \{N^{- 1} \sum_{i = 1}^{N} (W_{i} - θ)\} .

Then, condition on the

σ-field

generated by

W_{1}

,

W_{2}, W_{3}, \dots, W_{N_{1}}

to get:

E ({\bar{W}}_{N}) = θ + E \{N^{- 1} E \{\sum_{i = 1}^{N_{1}} (W_{i} - θ) + \sum_{i = N_{1} + 1}^{N} (W_{i} - θ) | W_{1}, W_{2}, \dots, W_{N_{1}}\}\} .

Given

W_{1}, W_{2}, \dots, W_{N_{1}}

the term

E (\sum_{i = 1}^{N_{1}} (W_{i} - θ))

is constant. Moreover, by Wald’s first equation [30] we get:

E \{\sum_{i = N_{1} + 1}^{N} (W_{i} - θ) | W_{1}, W_{2}, \dots, W_{N_{1}}\} = (N - N_{1}) E (W_{i} - θ) = 0 .

Therefore,

E ({\bar{W}}_{N}) = θ + E \{N^{- 1} (\sum_{i = 1}^{N_{1}} (W_{i} - θ))\} .

Consider the second-order expansion of

N^{- 1}

in the Taylor series around

n^{*}

; we have:

N^{- 1} = {(n^{*})}^{- 1} - λ^{2} (g ({\bar{W}}_{N_{1}}) - g (θ)) {(n^{*})}^{- 2} + λ^{4} {(g ({\bar{W}}_{N_{1}}) - g (θ))}^{2} ν^{- 3},

where random variable

ν

is between

N

and

n^{*}

. The assumption that

g (> 0)

and its derivatives are bounded can be used to prove that

λ^{4} E \{\sum_{i = 1}^{N_{1}} (W_{i} - θ) {(g ({\bar{W}}_{N_{1}}) - g (θ))}^{2} ν^{- 3}\} = 0 .

E ({\bar{W}}_{N}) = θ + E \{(\sum_{i = 1}^{N_{1}} (W_{i} - θ)) ({(n^{*})}^{- 1} - λ^{2} (g ({\bar{W}}_{N_{1}}) - g (θ)) {(n^{*})}^{- 2})\} + o (d^{2}) .

E ({\bar{W}}_{N}) = θ - λ^{2} E \{(\sum_{i = 1}^{N_{1}} (W_{i} - θ)) (g ({\bar{W}}_{N_{1}}) - g (θ)) {(n^{*})}^{- 2}\} + o (d^{2}) .

The first-order Taylor expansion of

g ({\bar{W}}_{N_{1}})

gives

E ({\bar{W}}_{N}) = θ - λ^{2} E \{N_{1}^{- 1} {(\sum_{i = 1}^{N_{1}} (W_{i} - θ))}^{2} g' (θ) {(n^{*})}^{- 2}\}

. Again, we condition on the σ-field generated by

W_{1}, W_{2}, \dots, W_{m}

. We get:

E ({\bar{W}}_{N}) = θ - λ^{2} g' (θ) {(n^{*})}^{- 2} E \{N_{1}^{- 1} E \{{(\sum_{i = 1}^{m} (W_{i} - θ) + \sum_{i = m + 1}^{N_{1}} (W_{i} - θ))}^{2} | W_{1}, W_{2}, W_{3}, \dots, W_{m}\}\} .

Arguments similar to those used above and the fact that

N_{1}^{- 1} \approx {(n^{*})}^{- 1}

yield the statement of (ii) of Theorem 5. The proof is complete. □

Part (iii) of Theorem 5 follows directly from (i) and (ii) of Theorem 5; we omit the details. □

Part (i) of Theorem 5 shows that

{\bar{W}}_{N}

is an asymptotically unbiased estimator of

θ

whereas the variance decreases as

n^{*}

increases.

3.3. The Asymptotic Regret

The regret associated with the quadratic loss function with linear sampling cost given by (2) is:

ω (d) = E (L_{N} (d)) - L_{n^{*}} (d),

where

L_{N} (d)

is the loss associated with the three-stage sampling estimation procedure, and

L_{n^{*}} (d)

is the optimal loss had the parameter

θ

been known.

E (L_{N} (d)) = \frac{a^{2}}{d^{2}} c n^{*} E {({\bar{W}}_{N} - θ)}^{2} + c E (N)

and by

(iii)

of Theorem 5 and

(i)

of Theorem 3, we get:

\begin{array}{l} E (L_{N} (d)) = \frac{c a^{2} δ θ^{2}}{d^{2}} + c \{n^{*} + \frac{θ^{2} δ^{- 1}}{2} (\frac{g'' (θ)}{g (θ)} - 2 {(\frac{g' (θ)}{g (θ)})}^{2}) + \frac{1}{2}\} + o (1) \\ = c n^{*} (1 + δ) + c \frac{θ^{2} δ^{- 1}}{2} (\frac{g'' (θ)}{g (θ)} - 2 {(\frac{g' (θ)}{g (θ)})}^{2}) + \frac{c}{2} + o (1) . \end{array}

and the optimal risk

L_{n^{*}} (d) = 2 c n^{*}

. Therefore, the asymptotic regret is given as:

ω (d) = c n^{*} (δ - 1) + \frac{c θ^{2} δ^{- 1}}{2} (\frac{g^{″} (θ)}{g (θ)} - 2 {(\frac{g' (θ)}{g (θ)})}^{2}) + \frac{c}{2} + o (1) .

As shown above, negative regret is expected for sufficiently small d, since the leading term satisfies

c n^{*} (δ - 1) < 0

. The issue of negative regret was also addressed by Martinsek [22], Yousef [32], and Hamdy [33]. This phenomenon deserves in-depth investigation in future work.
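As a worked illustration (an added specialization, assuming g(θ) = θ² as suggested by (5)), the bracketed term in the regret equals 2/θ² − 8/θ² = −6/θ², and the expansion becomes:

```latex
\omega(d) = c\,n^{*}(\delta - 1) - \frac{3c}{\delta} + \frac{c}{2} + o(1).
```

Under this assumption, 3c/δ exceeds c/2 for every 0 < δ < 1, so both the leading term and the bounded correction are negative, up to the o(1) remainder.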

3.4. Three-Stage Asymptotic Coverage Probability for the Parameter $θ$

The three-stage coverage probability is defined as:

\begin{matrix} P (θ \in I_{N}) = \sum_{n = m}^{\infty} P (|{\bar{W}}_{N} - θ| \leq d, N = n) \\ = \sum_{n = m}^{\infty} P (|{\bar{W}}_{N} - θ| \leq d | N = n) P (N = n) . \end{matrix}

Meanwhile, the event

\{|{\bar{W}}_{N} - θ| \leq d\}

and the event

\{N = n\}

are dependent for

n = m, m + 1, \dots

. Consequently, we cannot obtain a closed-form mathematical expression for the coverage probability like those of Hall [14], Hamdy et al. [34], and Hamdy [35]. Instead, we conducted a Monte Carlo simulation using Microsoft Developer Studio software to study the performance of the three-stage fixed-width confidence interval for

θ

when the optimal sample size varies from small to moderate and to large.

4. Simulation Study

We conducted a Monte Carlo simulation [36] to study the performance of the fixed-width confidence interval for the parameter

θ

. A series of 50,000 replications was generated from the exponential distribution with mean

θ = 5

using Microsoft Developer Studio software with the IMSL (International Mathematical and Statistical Library). The optimal sample sizes were chosen as recommended by Hall [14]:

n^{*} = 24, 43, 61, 76, 96, 125, 171, 246,

and 500. We took the design factor

δ = 0.5

and the pilot sample

m = 10

. For brevity, we will consider

α = 5 %,

which gives

a = 1.96

. Let

\bar{N}

be the simulated estimate of the optimal sample size

n^{*}

with standard error

S (\bar{N})

. Let

\hat{θ}

be the simulated estimate of the scale parameter’s square of the Rayleigh distribution with standard error

S (\hat{θ})

.

\widehat{Var} (X)

is the simulated estimate for the variance of the Rayleigh distribution and

1 - \hat{α}

is the simulated estimate of the coverage probability. Table 1 below demonstrates the simulation results as the optimal sample size increases. We noticed that the simulation results agree with our findings while the coverage probability improves as the optimal sample size increases. It is evident from the simulation that the procedure provides coverage probabilities that are less than the prescribed nominal value, that is,

P (θ \in I_{N}) < 1 - α

. At the same time, as

d \to 0

,

P (θ \in I_{N}) \to 1 - α

. Collectively, all estimates improve as the optimal sample size increases. For the simulation methodology, see Yousef [37].
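The simulation design described above can be imitated with a short script. The sketch below is not the authors' IMSL-based program: it uses Python's standard library, assumes g(θ) = θ² and λ² = a²/d² with d chosen so that a²θ²/d² equals the target n*, reads [x] as the floor function, and uses far fewer replications than the 50,000 reported. All function names are hypothetical, and no particular numerical output is claimed.

```python
import math
import random


def three_stage(theta, d, a, delta, m, rng):
    """One three-stage run with g(theta) = theta^2 (assumed); returns (N, W_bar_N)."""
    lam2 = (a / d) ** 2
    w = [rng.expovariate(1.0 / theta) for _ in range(m)]           # pilot phase
    n1 = max(m, math.floor(delta * lam2 * (sum(w) / m) ** 2) + 1)  # rule (6)
    w += [rng.expovariate(1.0 / theta) for _ in range(n1 - m)]
    n = max(n1, math.floor(lam2 * (sum(w) / n1) ** 2) + 1)         # rule (7)
    w += [rng.expovariate(1.0 / theta) for _ in range(n - n1)]
    return n, sum(w) / n


def simulate(n_star, theta=5.0, a=1.96, delta=0.5, m=10, reps=2000, seed=0):
    """Estimate mean N, mean theta-hat, the Rayleigh variance, and coverage."""
    rng = random.Random(seed)
    d = a * theta / math.sqrt(n_star)       # half-width giving the target n*
    total_n = total_w = covered = 0.0
    for _ in range(reps):
        n, w_bar = three_stage(theta, d, a, delta, m, rng)
        total_n += n
        total_w += w_bar
        covered += abs(w_bar - theta) <= d
    theta_hat = total_w / reps
    return {
        "mean_N": total_n / reps,
        "theta_hat": theta_hat,
        "var_hat": (2.0 - math.pi / 2.0) * theta_hat,
        "coverage": covered / reps,
    }


if __name__ == "__main__":
    for n_star in (24, 96, 246):
        print(n_star, simulate(n_star))
```

Running it over increasing values of n* gives a rough way to see how the average stopping time and the empirical coverage behave as the optimal sample size grows.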

5. Conclusions

We have proposed a three-stage sequential procedure for estimating the Rayleigh distribution variance by estimating the Rayleigh distribution scale parameter’s square. We proposed a unified decision framework for estimation and found all asymptotic results that led to asymptotic regret. The procedure attained negative regret, which shows that the three-stage procedure provides estimates better than the classical fixed-sample size procedures. Monte Carlo simulation agreed with our findings and revealed that the procedure provides coverage probabilities that are always less than the desired nominal value.

Author Contributions

Conceptualization, A.Y., and H.I.H.; methodology, A.Y.; A.A.A.; software, A.Y.; validation, A.Y., A.A.A., E.E.H.; investigation, A.Y., and H.I.H.; resources, A.Y.; data curation, A.Y., A.A.A., E.E.H. and H.I.H.; writing—original draft preparation, A.Y.; writing—review and editing, A.Y., A.A.A.; visualization, A.Y. and H.I.H.; supervision, A.Y. and H.I.H.; project administration, A.Y.; funding acquisition, A.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

Rayleigh, L. On the stability or instability of certain fluid motions. Proc. Lond. Math. Soc. 1880, 1, 57–70.
Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; John Wiley & Sons: New York, NY, USA, 1994.
Palovko, A.M. Fundamentals of Reliability Theory; Academic Press: New York, NY, USA, 1968.
Gross, A.J.; Clark, V.A. Survival Distributions, Reliability Application in Biomedical Science; John Wiley & Sons: New York, NY, USA, 1975.
Lee, E.T.; Wang, J.W. Statistical Methods for Survival Data Analysis, 3rd ed.; John Wiley & Sons Inc.: Hoboken, NJ, USA, 2003.
Rosen, K.; Buskirk Van, R.; Garbesi, K. Wind energy potential of coastal Eritrea: An Analysis of sparse wind data. Sol. Energy 1999, 66, 201–213.
Siddiqui, M.M. Some problems connected with Rayleigh distributions. J. Res. Nat. Bur. Stand. D 1962, 66, 167–174.
Hirano, K. Rayleigh Distributions; John Wiley & Sons: New York, NY, USA, 1986.
Dyer, D.D.; Whisenand, C.W. Best Linear Unbiased estimator of the parameter of the Rayleigh distribution: Part-II optimum theory for selected order statistics. IEEE Trans. Reliab. 1965, 60, 229–231.
Howlader, H.A.; Hossain, A. On Bayesian estimation and prediction from Rayleigh distribution based on type-II censored data. Commun. Stat. Theory Methods 1995, 24, 2249–2259.
Vode, V.G. Inferential procedures on a generalized Rayleigh variate (I). Apl. Mat. 1976, 21, 395–412.
Vode, V.G. Inferential procedures on a generalized Rayleigh variate (II). Apl. Mat. 1976, 21, 413–419.
Yousef, A.; Hassan, E.E.H.; Amin, A.A.; Hamdy, H.I. Multistage Estimation of the Scale Parameter of Rayleigh Distribution with Simulation. Symmetry 2020, 12, 1925.
Hall, P. Asymptotic theory and triple sampling of sequential estimation of a mean. Ann. Stat. 1981, 9, 1229–1238.
Chow, Y.S.; Robbins, H. On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Stat. 1965, 36, 1203–1212.
Ghosh, M.; Mukhopadhyay, N. Consistency and asymptotic efficiency of two-stage and sequential estimation procedures. Sankhya Indian J. Stat. Ser. A 1981, 43, 220–227.
Tahir, M. Sequential estimation of the square of the Rayleigh parameter. J. Math. Stat. 2014, 10, 275–280.
Mukhopadhyay, N.; de Silva, B.M. Sequential Methods and Their Applications; CRC Press: Boca Raton, FL, USA, 2009.
Ghosh, M.; Mukhopadhyay, N.; Sen, P.K. Sequential Estimation; John Wiley & Sons: New York, NY, USA, 1997.
Degroot, M.H. Optimal Statistical Decisions; McGraw-Hill: New York, NY, USA, 1970.
Chow, Y.S.; Yu, K.F. The performance of a sequential procedure for the estimation of the mean. Ann. Stat. 1981, 9, 189–198.
Martinsek, A.T. Negative regret, optional stopping and the elimination of outliers. J. Am. Stat. Assoc. 1988, 83, 160–163.
Dantzig, G.B. On the Non-existence of tests of student’s hypothesis having power function independent of σ. Ann. Math. Stat. 1940, 11, 186–192.
Anscombe, F.J. Sequential estimation. J. R. Stat. Soc. Ser. B 1953, 15, 1–21.
Robbins, H. Sequential Estimation of the Mean of a Normal Population; Probability and Statistics (Harald Cramer Volume); Almquist and Wiksell: Uppsala, Sweden, 1959; pp. 235–245.
Stein, C. A two-sample test for a linear hypothesis whose power is independent of the variance. Ann. Math. Stat. 1945, 16, 243–258.
Stein, C. Some problems in sequential estimation (abstract). Econometrica 1949, 17, 77–78.
Cox, D.R. Estimation by double sampling. Biometrika 1952, 39, 217–227.
Mukhopadhyay, N. A note on three-stage and sequential point estimation procedures for a normal mean. Seq. Anal. 1985, 4, 311–319.
Wald, A. Sequential Analysis; John Wiley & Sons: New York, NY, USA, 1947. [Google Scholar]
Anscombe, F.J. Large sample theory of sequential estimation. In Mathematical Proceedings of the Cambridge Philosophical Society; Cambridge University Press: Cambridge, UK, 1952; Volume 48, pp. 600–607. [Google Scholar]
Yousef, A.; Hamdy, H. Three-stage estimation for the mean and variance of the normal distribution with application to inverse coefficient of variation. Mathematics 2019, 7, 831. [Google Scholar] [CrossRef] [Green Version]
Yousef, A.; Hamdy, H. Three-Stage Sequential Estimation of the Inverse Coefficient of Variation of the Normal Distribution. Computation 2019, 7, 69. [Google Scholar] [CrossRef] [Green Version]
Hamdy, H.I.; Mukhopadhyay, N.; Costanza, M.C.; Son, M.S. Triple stage point estimation for the exponential location parameter. Ann. Inst. Stat. Math. 1988, 40, 785–797. [Google Scholar] [CrossRef]
Hamdy, H.I. Performance of fixed width confidence intervals under Type II errors: The exponential case. S. Afr. Stat. J. 1997, 31, 259–269. [Google Scholar]
Dunn, W.; Shultis, J.K. Exploring Monte Carlo Methods, 1st ed.; Elsevier Science: Amsterdam, The Netherlands, 2011. [Google Scholar]
Yousef, A. Performance of three-stage sequential estimation of the inverse coefficient of variation under type II error probability: A Monte Carlo simulation study. Front. Phys. 2020, 8, 71. [Google Scholar] [CrossRef]

Table 1. Three-stage simulation results at $m = 10$, $δ = 0.5$, and $α = 5\%$.

$n^{*}$	$\bar{N}$	$S(\bar{N})$	$\hat{θ}$	$S(\hat{θ})$	$\widehat{\mathrm{var}}(x)$	$1 - \hat{α}$
24	21.42	0.055	4.5220	0.0048	1.94086	0.9096
43	39.26	0.088	4.5909	0.0046	1.97043	0.8066
61	57.14	0.113	4.6970	0.0040	2.01597	0.8618
76	72.36	0.130	4.7639	0.0036	2.04468	0.8830
96	92.66	0.154	4.8196	0.0031	2.06859	0.8977
125	122.50	0.183	4.8749	0.0026	2.09232	0.9138
171	169.57	0.223	4.9197	0.0021	2.11155	0.9241
246	246.55	0.284	4.9503	0.0016	2.12469	0.9324
500	509.36	0.468	4.9790	0.0010	2.13701	0.9428
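
The last two estimate columns of Table 1 are linked by the linear relation mentioned in the abstract. The sketch below is a minimal consistency check, assuming that $θ$ is the square of the scale parameter in the standard Rayleigh parametrization, so that the variance equals $(4 - π)θ/2$. The names `rayleigh_variance` and `TABLE_1` are illustrative and do not come from the paper.

```python
import math

# Assumption: theta is the squared scale parameter of the standard Rayleigh
# parametrization, for which Var(X) = (4 - pi) / 2 * theta.
VARIANCE_FACTOR = (4.0 - math.pi) / 2.0


def rayleigh_variance(theta: float) -> float:
    """Map the squared scale parameter theta to the Rayleigh variance."""
    return VARIANCE_FACTOR * theta


# (theta_hat, var_hat) pairs copied from Table 1.
TABLE_1 = [
    (4.5220, 1.94086),
    (4.5909, 1.97043),
    (4.6970, 2.01597),
    (4.7639, 2.04468),
    (4.8196, 2.06859),
    (4.8749, 2.09232),
    (4.9197, 2.11155),
    (4.9503, 2.12469),
    (4.9790, 2.13701),
]

# Largest gap between the tabulated variance estimate and the value implied
# by theta_hat; any gap should be explained by rounding in the table.
max_abs_diff = max(abs(rayleigh_variance(t) - v) for t, v in TABLE_1)
print(f"largest absolute difference: {max_abs_diff:.2e}")
```

Under this assumption the variance column is a fixed multiple of the $\hat{θ}$ column, so the two columns carry the same information up to rounding.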

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yousef, A.; Amin, A.A.; Hassan, E.E.; Hamdy, H.I. Multistage Estimation of the Rayleigh Distribution Variance. Symmetry 2020, 12, 2084. https://doi.org/10.3390/sym12122084
