Multistage Estimation of the Scale Parameter of Rayleigh Distribution with Simulation

Abstract: This paper discusses the sequential estimation of the scale parameter of the Rayleigh distribution using the three-stage sequential sampling procedure proposed by Hall (Ann. Stat. 1981, 9, 1229–1238). Both point and confidence interval estimation are considered via a unified optimal decision framework, which makes maximum use of the available data and, at the same time, reduces the number of sampling operations by using bulk samples. The asymptotic characteristics of the proposed sampling procedure are fully discussed for both point and confidence interval estimation. Since the results are asymptotic, Monte Carlo simulation studies are conducted, using the Microsoft Developer Studio software, to give a feel for the small, moderate, and large sample size performance in typical situations. The procedure enjoys several interesting asymptotic characteristics, illustrated by the asymptotic results and supported by simulation.

The survival or reliability function is $e^{-x^2/(2\sigma^2)}$ and the hazard function is $x/\sigma^2$ for all $x > 0$ and $\sigma > 0$. An important characteristic of the Rayleigh distribution is that its failure rate is a linear function of time. The reliability function decreases at a much higher rate than the exponential distribution's reliability function, whose hazard rate is constant (see Kodlin [1]). The Rayleigh distribution is related to several distributions, such as the generalized extreme value, Weibull, and Chi-square distributions, and hence its applicability in real-life situations is significant. It is a particular case of the two-parameter Weibull distribution with shape parameter equal to 2 and scale parameter $\sigma\sqrt{2}$.
The Rayleigh distribution was introduced by Lord Rayleigh [2] and plays an important role in various research areas, such as acoustics, communication engineering, clinical studies, applied statistics, life-testing experiments, reliability analysis, and survival analysis. For instance, Palovko [3] discussed its application in life testing, especially in electro-vacuum devices. Gross and Clark [4] and Lee and Wang [5] discussed its usage in clinical studies dealing with cancer patients. Dyer and Whisenand [6] used it in communication engineering. Siddiqui [7] discussed its usage in electromagnetic wave propagation through a scattering medium. Others, such as Siddiqui [7], Hirano [8], and Howlader and Hossian [9], discussed several aspects of the Rayleigh distribution.
It was shown in [10][11][12][13] that both the Rayleigh and Weibull distributions are suitable probability distributions for evaluating wind energy potential. They provide accurate and adequate fits for analyzing and interpreting actual wind speed data and for predicting the prevailing wind profile. Several authors have contributed to this model, such as Sinha and Howlader [14], Ariyawansa and Templeton [15], Howlader [16], Lalitha and Mishra [17], and Abd Elfattah et al. [18].
Regarding estimation, [19][20][21] have carried out extensive studies concerning estimation, prediction, and several other inferences for the Rayleigh distribution. From [22], the $r$th moment around the origin is:
$$E(X^r) = \sigma^r\, 2^{r/2}\, \Gamma\!\left(1 + \frac{r}{2}\right).$$
Substituting $r = 1$ and $r = 2$ yields the population mean $E(X) = \sigma\sqrt{\pi/2}$ and the population variance $\mathrm{Var}(X) = \frac{4-\pi}{2}\sigma^2$. The mode equals $\sigma$, and the median is $\sigma\sqrt{2\ln 2}$. All the moments are unknown but finite. Moreover, the differential entropy of the random variable $X$, the Shannon [23] entropy, measures the amount of uncertainty or missing information and is defined by means of its underlying distribution $f(x)$ as $h(X) = -\int_S f(x)\log f(x)\,dx$, where $S$ is the support of $f(x)$. The entropy of the Rayleigh distribution is:
$$h(X) = 1 + \ln\!\left(\frac{\sigma}{\sqrt{2}}\right) + \frac{\gamma}{2},$$
where $\gamma \approx 0.5772$ is Euler's constant. This quantity will be of interest in this research.
Assume that a preliminary random sample $X_1, X_2, \ldots, X_n$ of size $n$ becomes available, from which we calculate the sample mean $\bar{X}_n = \sum_{i=1}^{n} X_i/n$ for $n \geq 1$ and propose the estimate $T_n = \sqrt{2/\pi}\,\bar{X}_n$ as an unbiased point estimate of the unknown parameter $\sigma$. For the convenience of calculation, we continue to use:
This paper aims to estimate sequentially the scale parameter $\sigma$ of the Rayleigh distribution using a multistage sampling procedure, namely the three-stage procedure presented by Hall [24]. For more details regarding the three-stage procedure and its properties, see Section 3. We tackle two types of estimation problems: point estimation under a squared-error loss function plus linear sampling cost, and confidence interval estimation. In the following section, we set up the estimation problems.
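The moment formulas and the unbiasedness of $T_n$ can be checked numerically. The following Python sketch is illustrative only: the function names are hypothetical, the moment formula used is the standard Rayleigh result $E(X^r) = \sigma^r 2^{r/2}\Gamma(1 + r/2)$, and samples are drawn by inverse-transform sampling, which is an assumption and not necessarily the generator used in the paper.

```python
import math
import random


def rayleigh_moment(sigma, r):
    """r-th raw moment of a Rayleigh(sigma) variable (standard closed form)."""
    return sigma ** r * 2 ** (r / 2) * math.gamma(1 + r / 2)


def sample_rayleigh(sigma, n, rng):
    """Draw n Rayleigh(sigma) values by inverting F(x) = 1 - exp(-x^2 / (2 sigma^2))."""
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def scale_estimate(xs):
    """T_n = sqrt(2/pi) * sample mean, the point estimate of sigma discussed in the text."""
    return math.sqrt(2 / math.pi) * sum(xs) / len(xs)
```

With a large sample, `scale_estimate(sample_rayleigh(2.0, 20000, random.Random(0)))` should land close to 2, and `rayleigh_moment(sigma, 2) - rayleigh_moment(sigma, 1) ** 2` reproduces the variance $(4-\pi)\sigma^2/2$.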

Minimum Risk Point Estimation
To obtain a point estimate for $\sigma$, we assume that the cost incurred in estimating the scale parameter $\sigma$ by the corresponding sample measure $T_n$ is given by the following squared-error loss function with a linear sampling cost:
$$L_n(A) = A^2 (T_n - \sigma)^2 + n, \quad (1)$$
where $A^2$ represents the cost per estimation unit and $A > 0$ will be determined shortly. The cost function in (1) is similar to those considered by Degroot [25], Chow and Yu [26], Martinsek [27], and Hamdy [28]. The risk associated with the cost function in (1) is given by
$$R_n(A) = E\left(L_n(A)\right) = A^2\,\frac{(4-\pi)\sigma^2}{\pi n} + n. \quad (2)$$
Minimizing the risk in (2) with respect to the sample size $n$ yields the optimal sample size:
$$n^*_{point} = A\sigma\sqrt{\frac{4-\pi}{\pi}}, \quad (3)$$
where $n^*_{point} \to \infty$ as $A \to \infty$. We elaborate more on the physical meaning of $A$ in the following subsections. The optimal sample size in (3) is unknown because $\sigma$ is unknown. Therefore, we resort to multistage sampling procedures, developed over the last 50 years, to estimate the unknown scale parameter $\sigma$ via the estimation of $n^*$.
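The minimization step can be written out explicitly. The following is a worked sketch under stated assumptions: the loss is taken to be $A^2(T_n-\sigma)^2 + n$ with unit cost per observation, the symbol $R_n(A)$ for the risk is notation introduced here for illustration, and $\mathrm{Var}(T_n)$ follows from $\mathrm{Var}(X) = (4-\pi)\sigma^2/2$.

```latex
\begin{align*}
\mathrm{Var}(T_n) &= \frac{2}{\pi}\cdot\frac{\mathrm{Var}(X)}{n}
                   = \frac{(4-\pi)\sigma^2}{\pi n}, \\
R_n(A) &= A^2\,\mathrm{Var}(T_n) + n
        = \frac{A^2(4-\pi)\sigma^2}{\pi n} + n, \\
\frac{\partial R_n(A)}{\partial n} &= -\frac{A^2(4-\pi)\sigma^2}{\pi n^2} + 1 = 0
   \;\Longrightarrow\; n^* = A\sigma\sqrt{\frac{4-\pi}{\pi}}, \\
\frac{\partial^2 R_n(A)}{\partial n^2} &= \frac{2A^2(4-\pi)\sigma^2}{\pi n^3} > 0,
   \qquad R_{n^*}(A) = 2n^*.
\end{align*}
```

Treating $n$ as continuous, the second derivative confirms a minimum, and the minimum risk $2n^*$ is the natural benchmark against which a sequential procedure's risk can be compared.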

Fixed-Width Confidence Interval
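Assume further that a fixed-width confidence interval for $\sigma$ of width $2d$, of the form $I_n = (T_n - d,\; T_n + d)$, is required, such that its coverage probability is at least $100(1 - \alpha)\%$ uniformly over $\sigma > 0$. By the normal approximation,
$$P(\sigma \in I_n) = P\left(|T_n - \sigma| \leq d\right) \approx 2\Phi\!\left(\frac{d\sqrt{\pi n}}{\sigma\sqrt{4-\pi}}\right) - 1 \geq 1 - \alpha,$$
which requires $d\sqrt{\pi n}/(\sigma\sqrt{4-\pi}) \geq a$,
where $a$ is the upper $\alpha/2$ percentage point, that is, the cutoff point, of the standard normal distribution. It follows that the optimal sample size required to achieve the above objectives must satisfy:
$$n \geq \frac{a^2(4-\pi)\sigma^2}{\pi d^2} = n^*_{conf}. \quad (4)$$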

A Unified Decision Framework
When sequential sampling is used to perform inference, authors usually specify one decision rule for each research objective. The objective could be point estimation with a specified cost function, a fixed-width confidence interval whose coverage probability is at least the nominal value, or hypothesis testing regarding the population parameters. If the interest is in defining one decision rule that achieves more than one objective and, at the same time, makes maximum use of the available data, we must have $n^*_{point} = n^*_{conf}$, which implies that:
$$A = \frac{a^2}{d^2}\sqrt{\frac{4-\pi}{\pi}}\;\sigma = k^2\sigma\sqrt{\frac{\pi}{4-\pi}}, \quad (5)$$
where $k^2 = \frac{a^2}{d^2}\cdot\frac{4-\pi}{\pi}$. As $A \to \infty$, $k \to \infty$. Therefore, the constant $A$ is chosen as in (5) to perform inference through a unified framework. In sequential point estimation problems where cost functions are used to assess the encountered risk, the constant $A$ is usually assumed to be known and is allowed to go to infinity in order to check whether the estimation risk remains finite and bounded. However, by assuming $A$ known we restrict the sampling population; in other words, we assume that $\sigma$ is not entirely unknown. The constant $A$ in (5) is only partially known because it depends on the unknown parameter $\sigma$. Letting $A \to \infty$ as the half-width $d \to 0$ is common practice in sequential estimation when studying the asymptotic characteristics of the fixed-width interval. The constant $A^2$ can be thought of as $A^2 = \frac{a^2}{d^2} \times n^*_{point}$, where $A^2$ = Fisher's information × optimal sampling cost.
Meanwhile, we continue to use the representation $n^* = k^2\sigma^2$ to define the three-stage stopping rules in the following subsections. Therefore, we proceed to use the following optimal sample size,
$$n^* = k^2\sigma^2, \quad (6)$$
to perform the necessary inference. Since the parameter $\sigma^2$ in (6) is unknown, no fixed sample size procedure can estimate the scale parameter uniformly over the parameter space; see Dantzig [29]. Therefore, we resort to a three-stage sampling procedure to achieve the required objectives.
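To make the unified framework concrete, the design constants can be computed for given $\sigma$, $d$, and $\alpha$. The sketch below is a minimal illustration, not the authors' code: the function name `unified_design` is hypothetical, and it assumes $k^2 = (a^2/d^2)(4-\pi)/\pi$, $n^* = k^2\sigma^2$, and $A = (a^2/d^2)\sigma\sqrt{(4-\pi)/\pi}$, the value of $A$ that equates the point-estimation and confidence-interval sample sizes.

```python
import math
from statistics import NormalDist


def unified_design(sigma, d, alpha=0.05):
    """Design constants of the unified framework (hypothetical helper)."""
    a = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal cutoff
    c = (4 - math.pi) / math.pi              # Var(T_n) * n / sigma^2
    k2 = a ** 2 * c / d ** 2                 # k^2
    n_star = k2 * sigma ** 2                 # optimal sample size n* = k^2 sigma^2
    A = a ** 2 * sigma * math.sqrt(c) / d ** 2  # assumed form of the constant A
    return {"a": a, "k2": k2, "n_star": n_star, "A": A}
```

For example, `unified_design(2.0, 0.25)` gives an optimal sample size of roughly 67, and the returned values satisfy $A^2 = (a^2/d^2)\,n^*$ up to rounding. In practice $\sigma$ is unknown, so these quantities serve only as the target that the sequential procedure tries to reach.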
Henceforth, we continue to use the asymptotic sample size defined in (6) to propose the following three-stage sampling procedure to estimate the unknown scale parameter σ via the estimation of n * .

Multistage Sampling
Multistage sequential sampling procedures have been developed over the past few decades to achieve several popular characteristics lacking in classical inference theory. This goes back to Abraham Wald in 1947, who introduced the idea of one-by-one sequential sampling through the sequential probability ratio test (SPRT) to minimize the cost of inspection and transportation. Since the publication of the one-by-one sequential sampling procedure, attention was mainly directed to multistage sampling under optimal decision frames. The aim is to achieve several optimal objectives, including minimizing the risk associated with point estimation, maintaining the coverage probability of at least the desired nominal value, or controlling the type I and type II error probabilities. This was not the case in classical inference.
Multistage procedures motivated researchers to perform inference through different sampling techniques. Stein [30,31] laid the foundation of two-stage sampling, also referred to as double sampling, which led to an exact solution for a fundamental statistical inference problem. Additionally, Seelbinder [32] and Cox [33] introduced the idea of group sampling in two stages. Although the two-stage procedure enjoys many desirable asymptotic properties, it lacks asymptotic efficiency: it can lead to oversampling, mostly when the initial sample is much smaller than the optimal sample size. Anscombe [34], Robbins [35], and Chow and Robbins [36] devised purely one-by-one sequential sampling procedures to perform inference subject to some optimality criteria. The one-by-one sequential sampling procedure surpasses two-stage sampling in achieving all asymptotic characteristics. In practice, however, it is inefficient, since it can take a long time to terminate the sampling course.
Hall [24], in his influential work, introduced the idea of sampling in three stages to overcome the deficiencies of both two-stage and purely one-by-one sequential sampling. By doing so, he combined the asymptotic characteristics of the purely one-by-one sequential sampling of Anscombe [34], Robbins [35], and Chow and Robbins [36] with the operational savings made possible by the bulk sampling of Stein [30] and Cox [33].
Hall's results were extended to hypothesis testing problems for the normal mean by Liu [58]. Son et al. [59] proposed a three-stage sequential sampling procedure that yields both a fixed-width confidence interval and a hypothesis test for the normal mean while controlling the type II error probability. Their procedure also provided second-order approximations to the operating characteristic curves of the inference.
Tahir [60] addressed a sequential procedure for point estimation of the square of the Rayleigh distribution parameter, subject to a weighted squared-error loss plus cost of sampling. He found a second-order asymptotic expansion for the incurred regret and showed that the asymptotic regret is negative for a range of parameter values.
The main objective of this paper is the estimation of the unknown scale parameter $\sigma$. We tackle two estimation problems: point estimation under a squared-error loss function with linear sampling cost, and confidence interval estimation, where we seek a fixed-width confidence interval with a coverage probability of at least $100(1 - \alpha)\%$. We use the three-stage procedure to derive the asymptotic results that lead to the asymptotic regret and the asymptotic confidence interval, and we use Monte Carlo simulation to verify these asymptotic results. To the best of our knowledge, no existing paper in the literature on sequential estimation has conducted this study.
We state the three-stage procedure as follows.
Pilot Study Phase: The pilot study phase starts by selecting an initial random sample $T_1, T_2, T_3, \ldots, T_m$ of size $m\ (\geq 2)$ from the Rayleigh distribution and calculating the sample average $\bar{T}_m$ to initiate the process. We propose to estimate $\sigma$ by the corresponding sample measure $\bar{T}_m$.
Main Study Phase: During the main study phase, we only estimate a portion $\delta$, $0 < \delta < 1$, of $n^*$ to avoid the possibility of oversampling. The required stopping rule is:
$$N_1 = \max\left\{m,\; \left[\delta k^2 \bar{T}_m^2\right] + 1\right\}, \quad (7)$$
where $[x]$ denotes the integer part of $x$.
If $m \geq [\delta k^2 \bar{T}_m^2] + 1$, then we stop at this stage. Otherwise, we continue to observe an additional sample of size $[\delta k^2 \bar{T}_m^2] + 1 - m$, say $T_{m+1}, T_{m+2}, T_{m+3}, \ldots, T_{N_1}$. We then update the estimate to $\bar{T}_{N_1}$, based on the $N_1$ observations collected, to complete the main study phase. Note that at this stage, $\hat{\sigma} = \bar{T}_{N_1}$.
Fine-Tuning Phase: The fine-tuning phase is determined through the following stopping rule:
$$N = \max\left\{N_1,\; \left[k^2 \bar{T}_{N_1}^2\right] + 1\right\}, \quad (8)$$
where any additional observations are denoted by $T_{N_1+1}, \ldots, T_N$. Upon the realization of $N$, we terminate the sampling course and propose the estimate $T_N = \sqrt{2/\pi}\,\bar{X}_N$ for the unknown scale parameter $\sigma$.
In the following subsection, we present the asymptotic results for the stopping rules (7) and (8). These results were developed under the following assumption, set forward by Hall [24] to develop the theory of three-stage sequential sampling.
Assumption A: Let $\xi(m) > 0$ be such that $\limsup\, m/\xi(m) < \delta$ as $\xi(m) \to \infty$, and $\xi(m) = O(m^r)$ for $r > 1$.
Theorem 1 below provides the asymptotic results of the main study phase.
Theorem 1. Under Assumption A, for the three-stage sampling procedure (7) and (8), as $d \to 0$, we have: σ)), then, conditional on the σ-field generated by the random variables $X_1, X_2, X_3, \ldots, X_m$, we have: [61].
Therefore, we have $E(\bar{T}_{N_1}) = \sigma + m\,E\!\left(\frac{\bar{T}_m - \sigma}{N_1}\right)$. Next, expand $N_1^{-1}$ around $\delta n^*$ in a stochastic Taylor series to obtain: where $\nu$ is a random variable between $N_1$ and $\delta n^*$.
It follows that: It follows that: By Assumption A, $m/n^* \approx \delta$. Then, as $m \to \infty$, I = 0, and II = mδk 2 (δn * ) −2 2(π−3)σ 3 . Next, recall III: . Next, recall IV: where we consider the two cases $\nu \leq \delta n^*$, then We have also used Assumption A. The proof of (i) is complete. Similar arguments can be used to justify (ii) and (iv). Part (iii) follows from (i) and (ii). Part (v) follows from (ii) and (iv). We omit the details for brevity. The proof is complete.
Theorem 2 below provides the asymptotic mean and variance for the final random sample size.

Theorem 2.
Under Assumption A, for the three-stage procedure (7) and (8) 1 ) has a standard uniform distribution (see Hall [24]; for large $m$, see Anscombe [62]). The central limit theorem suggests that $\bar{T}_{N_1}^2$ is asymptotically normally distributed. Hence, $E(N) = E(k^2 \bar{T}_{N_1}^2) + 1/2 + o(1)$. Using Theorem 1, part (ii), we get the result. The proof of part (i) is complete.
Theorem 2 shows that the average random sample size is always less than the optimal sample size; that is, $E(N) < n^*$ for all values of $n^*$. Moreover, $\lim_{d \to 0} E(N/n^*) = 1$, which means that the procedure attains first-order asymptotic efficiency, and $\lim_{d \to 0} E(N - n^*) < \infty$, which indicates that the procedure attains second-order asymptotic efficiency in the sense of [63]. Part (ii) shows that the variance increases as $n^*$ increases.
The following Theorem 3 gives the second-order asymptotic expansion of the moments of a real-valued continuously differentiable function of the stopping time random variable N.

Theorem 3.
Let $h\ (> 0)$ be a real-valued, continuously differentiable, and bounded function, such that sup
Proof. The proof is a direct substitution of Theorem 2, parts (i) and (ii), in the Taylor expansion of $h(N)$, where we use the assumption that $h$ is bounded. The proof is complete.
Theorem 4 below gives the asymptotic characteristics of the fine-tuning phase under the Assumption A.

Theorem 4.
For the three-stage rules (7) and (8), and as $d \to 0$, Next, condition on the σ-field generated by $T_1, T_2, \ldots, T_{N_1}$. It follows that: then expand $N^{-1}$ in a Taylor series around $n^*$ as: where $\nu$ is a random variable between $N$ and $n^*$.
Therefore, [61]. Then, recall II, The first term of II, Condition on the σ-field generated by $T_1, T_2, \ldots, T_m$ and expand $\left(\sum_{i=1}^{N_1}(T_{N_1} - \sigma)\right)^3$. We have, We have used Wald's first equation [61] to prove that the second term in the expansion is zero.
The second term in II, Here, we have used the facts that $N_1^{-1} \approx (\delta n^*)^{-1}$ and $m/n^* \approx \delta$ under Assumption A. Similar arguments prove that $E = o(d^2)$, where we have used the fact that $\frac{1}{N_1(N_1 - m)} \leq \frac{1}{m^2}$. It remains to evaluate the remainder term in III, which is: . Arguments similar to those used above, together with the fact that the random variable $\nu$ lies between $N$ and $n^*$, can be used to justify the rate of convergence of III. We omit further details for brevity.
This completes the proof of (i).
Likewise, (ii) can be asserted along the above lines if we write: π n * . Therefore, Likewise, condition on the σ-field generated by $T_1, T_2, \ldots, T_{N_1}$ to obtain: The term , while, by using Wald's second equation [61]: This completes the proof.
Part (i) of Theorem 4 shows that $T_N$ is an asymptotically unbiased estimator of $\sigma$. Meanwhile, part (iii) shows that the variance decreases as $n^*$ increases.
Proof. The proof is immediate if we expand $g(T_N)$ in a Taylor series around $\sigma$ and substitute (i) and (iii) of Theorem 4, together with the assumption that the function $g\ (> 0)$ and its derivatives are bounded. The proof is complete.

Three-Stage Minimum Risk Point Estimation
The asymptotic regret $\omega(d)$ encountered in the estimation of $\sigma$ by the corresponding three-stage point estimate $T_N$ is given by: By using Theorems 2 and 4, as $d \to 0$, we get: The asymptotic regret satisfies $\omega(d) < 0$ (negative regret), which reflects that the three-stage procedure produces better estimates of the Rayleigh scale parameter than the fixed sample size technique. Additionally, the regret of using the three-stage procedure to estimate the scale parameter, compared to using a fixed sample size (classical inference), is less than the non-vanishing finite quantity $\frac{3(4-\pi)}{\pi}\delta^{-1} + \frac{1}{2}$, $0 < \delta < 1$. Simon [64] called this quantity the cost of ignorance, that is, the cost of not knowing the scale parameter. The issue of negative regret was discussed by Martinsek [27]. Table 1 below shows the mathematical representation of the Rayleigh distribution characteristics and the three-stage estimates for the mode, the median, the reliability, the hazard function at a specific time, and the entropy.

Table 1. Point estimation of other distribution parameters.
Distribution Characteristic | Mathematical Representation | Three-Stage Point Estimate
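A plug-in reading of these point estimates can be sketched in code. The block below is an assumption-laden illustration: it simply substitutes the three-stage estimate $T_N$ for $\sigma$ in the standard closed-form expressions for the mode, median, reliability, hazard, and entropy. The function name `plug_in_estimates` and its argument names are hypothetical, and the exact estimates tabulated by the authors may differ.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def plug_in_estimates(t_n, x):
    """Plug-in estimates of Rayleigh characteristics at time x, with sigma replaced by t_n."""
    return {
        "mode": t_n,
        "median": t_n * math.sqrt(2 * math.log(2)),
        "reliability": math.exp(-x ** 2 / (2 * t_n ** 2)),
        "hazard": x / t_n ** 2,
        "entropy": 1 + math.log(t_n / math.sqrt(2)) + EULER_GAMMA / 2,
    }
```

For instance, `plug_in_estimates(1.0, 1.0)` returns a reliability of $e^{-1/2}$ and a hazard of 1. Because each quantity is a smooth function of $\sigma$, the plug-in approach is the simplest choice, although it does not correct for the bias that a nonlinear transformation of $T_N$ introduces.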

Three-Stage Fixed-Width Confidence Interval
Once the sampling procedure is terminated, we propose the fixed-width three-stage confidence interval $I_N = (T_N - d,\; T_N + d)$, of width $2d$, for the scale parameter $\sigma$.
The coverage probability of the interval is calculated as: Since the stopping variable $N$ depends on the scale parameter estimate $T_N$, $N$ and $T_N$ are not stochastically independent. Therefore, we use Monte Carlo simulation to study the behavior of $P(\sigma \in I_N)$ as the sample size ranges from small to moderate to large.

Simulation Study
Monte Carlo simulation is conducted to evaluate the performance of the three-stage procedure as the sample size ranges from small to moderate to large. A FORTRAN program was coded using the Microsoft Developer Studio software to generate a series of simulations. For each experimental situation, 50,000 replicate samples were used. Random samples from the Rayleigh distribution were generated, and the three-stage sampling rules (7) and (8) were implemented to estimate all the parameters of concern: $\sigma$ and its standard error; $N$, the estimated value of $n^*$, and its standard error; the mean and the variance of the Rayleigh distribution and their standard errors; the regret; and, finally, the estimated value of the coverage probability. The optimal sample sizes are chosen as $n^*$ = 25, 50, 100, 150, 200, 250, 300, 400, and 500.
For constructing a fixed-width confidence interval for the scale parameter $\sigma$, we take $\alpha = 0.05$ and, accordingly, $a = 1.96$. Additionally, we consider different values for the initial sample size, $m$ = 5, 10, and 15, and for the design factor, that is, the portion of $n^*$ estimated in the main study phase, $\delta$ = 0.3, 0.5, and 0.8. Mukhopadhyay [41] noted that if the design factor $\delta$ is chosen near zero or one, then a three-stage procedure behaves more like Stein's two-stage procedure. Therefore, a three-stage procedure is better implemented with $\delta$ = 0.4, 0.5, or 0.6. Hall [24] mentioned that, in practice, $\delta = 0.5$ seems a reasonable compromise.
The simulation process is performed as follows. For the $i$-th sample generated for a particular combination of $\sigma$, $m$, $\delta$, $n^*$, and $a$, we have:
First. Generate an initial sample of size $m\ (\geq 2)$, say $T_{1,i}, T_{2,i}, \ldots, T_{m,i}$, from the Rayleigh distribution with scale parameter $\sigma$ and calculate $\bar{T}_m$ as an initial estimate of $\sigma$.
Second. Apply the three-stage sampling procedure as presented in (7) and (8).
Fourth. The simulated regret is
Fifth. The simulated coverage probability is:
For brevity, Table 2 below reports the simulation results evaluated at $m = 10$, $\delta = 0.5$, and $1 - \alpha = 0.95$ for each respective $n^*$, as follows: $\bar{N}$ is the simulated estimate of the optimal sample size, with standard error $S(\bar{N})$; $\hat{\sigma}$ is the simulated estimate of the scale parameter $\sigma$, with standard error $S(\hat{\sigma})$; $\hat{\mu}$ is the simulated estimate of the population mean of the distribution, with standard error $S(\hat{\mu})$; var(x) is the simulated estimate of the variance of the distribution, with standard error S var(x); med(x) is the simulated estimate of the population median, with standard error S med(x); Ent is the simulated estimate of the population entropy, with standard error S Ent; $\hat{\omega}$ is the simulated estimate of the asymptotic regret; and, finally, $1 - \hat{\alpha}$ is the simulated estimate of the asymptotic coverage probability.
From these results, we observe that the final random sample size $\bar{N}$ is very close to the optimal sample size $n^*$, that is, $\bar{N}/n^* \approx 1$ (first-order asymptotic efficiency), and that $\bar{N}$ is less than $n^*$, which indicates early stopping, with a standard error that increases as $n^*$ increases. Additionally, $\bar{N} - n^*$ is bounded by a finite number that is unrelated to $n^*$ (second-order asymptotic efficiency). Moreover, as $n^*$ increases, the estimate of the scale parameter gets closer to the actual value, with decreasing standard errors. The simulated coverage probability is always less than the desired nominal value (asymptotic consistency in the sense of [30,36,63]), which might be due to early stopping. The regret is a non-vanishing finite quantity with negative values. The negativity of the regret is due to the dependency between the final random sample size $N$ and the estimate of the scale parameter $T_N$. It may also be related to early stopping.
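The simulation steps above can be sketched in a few lines of Python. This is an illustrative sketch under several assumptions, not the authors' FORTRAN program: observations are taken as $T_i = \sqrt{2/\pi}\,X_i$ so that $E(T_i) = \sigma$; the stopping rules are read as $N_1 = \max\{m, [\delta k^2 \bar{T}_m^2] + 1\}$ and $N = \max\{N_1, [k^2 \bar{T}_{N_1}^2] + 1\}$; the loss is assumed to be $A^2(T_N - \sigma)^2 + N$, and the regret is computed as the average loss minus the fixed-sample minimum risk $2n^*$. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

C = (4 - math.pi) / math.pi  # Var(T_i) / sigma^2 for T_i = sqrt(2/pi) * X_i


def draw_t(sigma, n, rng):
    """n scaled Rayleigh draws T_i = sqrt(2/pi) * X_i (inverse-transform sampling)."""
    scale = math.sqrt(2 / math.pi) * sigma
    return [scale * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def three_stage(sigma, k2, m, delta, rng):
    """One run of the assumed three-stage rule; returns (N, T_N)."""
    ts = draw_t(sigma, m, rng)                              # pilot phase
    t_bar = sum(ts) / len(ts)
    n1 = max(m, math.floor(delta * k2 * t_bar ** 2) + 1)    # assumed rule (7)
    ts += draw_t(sigma, n1 - m, rng)                        # main study phase
    t_bar = sum(ts) / len(ts)
    n = max(n1, math.floor(k2 * t_bar ** 2) + 1)            # assumed rule (8)
    ts += draw_t(sigma, n - n1, rng)                        # fine-tuning phase
    return n, sum(ts) / len(ts)


def simulate(sigma, n_star, m=10, delta=0.5, alpha=0.05, reps=2000, seed=1):
    """Monte Carlo summary: mean N, mean T_N, coverage, and simulated regret."""
    rng = random.Random(seed)
    a = NormalDist().inv_cdf(1 - alpha / 2)
    k2 = n_star / sigma ** 2                     # from n* = k^2 sigma^2
    d = a * sigma * math.sqrt(C / n_star)        # half-width implied by n*
    a2 = n_star ** 2 / (C * sigma ** 2)          # A^2 implied by n* = A sigma sqrt(C)
    sum_n = sum_t = sum_loss = 0.0
    covered = 0
    for _ in range(reps):
        n, t = three_stage(sigma, k2, m, delta, rng)
        sum_n += n
        sum_t += t
        covered += abs(t - sigma) <= d
        sum_loss += a2 * (t - sigma) ** 2 + n
    return {
        "mean_N": sum_n / reps,
        "mean_T": sum_t / reps,
        "coverage": covered / reps,
        "regret": sum_loss / reps - 2 * n_star,  # assumed definition of regret
    }
```

Calling `simulate(sigma=1.0, n_star=100)` gives an average $N$ near 100 and a coverage close to the nominal 0.95. The simulated regret is noisy at a few thousand replications, so a much larger `reps` (the paper uses 50,000) is needed before its sign can be read with any confidence.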

Conclusions
This paper proposes a unified decision framework to estimate the scale parameter of the Rayleigh distribution and several related parameters. Within this optimal decision structure, a three-stage sampling procedure with a bona fide stopping rule is defined to determine the optimal sample size required to perform inference. The procedure enjoys the asymptotic characteristics set forward by Chow and Robbins [36] and Anscombe [34], as well as the operational savings made possible by sampling in batches, as in Stein [30] and Cox [33]. Asymptotic characteristics of the three-stage scale parameter estimate and its higher-order moments are presented in Theorems 1-5. The asymptotic regret associated with minimizing the expected cost under the squared-error loss function with linear sampling cost is also discussed. Monte Carlo simulation was performed to give a feel for the performance of the inference in typical real-life situations. The current problem differs from those considered previously for the normal and exponential distributions. In those cases, the independence between the stopping variable $N$ and the nuisance parameter estimates is apparent. In the Rayleigh case, the stopping variable $N$ depends on the scale parameter estimate, and thus the proofs took different directions.

Conflicts of Interest:
The authors declare no conflict of interest.