Confidence Intervals for the Signal-to-Noise Ratio and Difference of Signal-to-Noise Ratios of Log-Normal Distributions

In this article, we propose approaches for constructing confidence intervals for the single signal-to-noise ratio (SNR) of a log-normal distribution and the difference in the SNRs of two log-normal distributions. The performances of all of the approaches were compared, in terms of the coverage probability and average length, using Monte Carlo simulations for varying values of the SNRs and sample sizes. The simulation studies demonstrate that the generalized confidence interval (GCI) approach performed well, in terms of coverage probability and average length. As a result, the GCI approach is recommended for the confidence interval estimation for the SNR and the difference in SNRs of two log-normal distributions.


Introduction
In statistics, the standard deviation and the variance are the usual measures of dispersion. Although the standard deviation is easier to interpret than the variance, it is not an appropriate indicator for comparing the dispersion of several variables. Therefore, the coefficient of variation (CV), which is defined as the ratio of the standard deviation to the mean, is used to measure the relative dispersion. The CV is free from the unit of measurement and is useful in comparing the variability between groups of observations. Many authors have proposed confidence intervals for the CV. For instance, Niwitpong [1] constructed the confidence intervals for the CV of a log-normal distribution with restricted parameter space, while Ng [2] studied the confidence interval for the common CV of log-normal distributions. Furthermore, Thangjai [3] proposed the simultaneous fiducial generalized confidence intervals (SFGCIs) for the differences in the CVs of log-normal distributions.
The signal-to-noise ratio (SNR) is the inverse of the CV; that is, it is the ratio of the mean to the standard deviation. The SNR has been used in many fields, such as finance, quality control, medicine, imaging, economics, marketing, and biology. In finance, the SNR measures the relationship between excess return and the risk of financial assets. In analog and digital communications, the SNR is a measure of the signal strength relative to the background noise. In quality control, the SNR represents the magnitude of the mean of a process compared to its variation. In medicine, the SNR can be used to analyze the blood pressure of patients in a longitudinal study. In image processing, the SNR of an image is calculated as the ratio of the mean pixel value to the standard deviation of the pixel values over a given neighborhood. Furthermore, the SNR is used for the analysis of portfolio selection models and market risk (see [4,5]).
A log-normal distribution is right-skewed and is used in models for various applications, such as medicine, economics, biology, agriculture, entomology, and finance. Applications of the log-normal distribution can be found in [6–8].
Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a probability distribution with CV $\gamma > 0$. The lower and the upper limits of the confidence interval for the CV are denoted by $L(X)$ and $U(X)$, respectively. By definition, the confidence interval for $\gamma$ with nominal confidence level $1 - \alpha$ is the interval with limits $L(X)$ and $U(X)$ determined by the property $P(L(X) \le \gamma \le U(X)) = 1 - \alpha$. Provided that $L(X) > 0$, taking the inverse values of $L(X)$, $U(X)$, and $\gamma$ gives $1/U(X) \le 1/\gamma \le 1/L(X)$. That is to say, the confidence interval for the inverse of the CV ($1/\gamma$) with nominal confidence level $1 - \alpha$ is the interval with limits $1/U(X)$ and $1/L(X)$. Confidence interval estimation in terms of the SNR has received attention in the literature; see [9–15]. In this article, we propose two approaches for constructing the confidence intervals for the SNR of a log-normal distribution, using the GCI and the large sample approaches. Furthermore, three confidence intervals for the difference between the SNRs of log-normal distributions are constructed based on the GCI, large sample, and method of variance estimates recovery (MOVER) approaches.
The rest of this article is organized as follows. In Section 2, the confidence intervals for the SNR of a log-normal distribution are presented, and the confidence intervals for the difference between the SNRs of log-normal distributions are given in Section 3. In Section 4, the results of simulation studies to assess the coverage probabilities and the average lengths of all of the proposed confidence intervals are presented. Next, two examples are given to illustrate the proposed approaches in Section 5, and the concluding remarks are presented in Section 6.

The Confidence Intervals for a Single SNR
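Suppose that a random variable $X = \log(Y)$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$. Then, the random variable $Y$ follows a log-normal distribution with mean $E(Y) = \exp(\mu + \sigma^2/2)$ and variance $\mathrm{Var}(Y) = (\exp(\sigma^2) - 1)\exp(2\mu + \sigma^2)$. The SNR of $Y$ is the ratio of its mean to its standard deviation:
$$\theta = \frac{E(Y)}{\sqrt{\mathrm{Var}(Y)}} = \frac{1}{\sqrt{\exp(\sigma^2) - 1}}. \tag{1}$$
We are interested in constructing the confidence interval for the SNR $\theta$.
Let $\bar{X} = \sum_{i=1}^{n} X_i / n$ and $S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)$ be the sample mean and sample variance for log-transformed data $X_i = \log(Y_i)$, where $i = 1, 2, \ldots, n$; and let $\bar{x}$ and $s^2$ be the observed sample mean and observed sample variance, respectively. The estimator of $\theta$ is
$$\hat{\theta} = \frac{1}{\sqrt{\exp(S^2) - 1}}. \tag{2}$$
The variance of $\exp(S^2) - 1$, given in [3], is, to first order, of the form
$$\mathrm{Var}\left(\exp(S^2) - 1\right) \approx \frac{2\sigma^4 \exp(2\sigma^2)}{n - 1}.$$
Therefore, applying the delta method once more, the variance of $\hat{\theta}$ is
$$\mathrm{Var}(\hat{\theta}) \approx \frac{\sigma^4 \exp(2\sigma^2)}{2(n - 1)\left(\exp(\sigma^2) - 1\right)^3}. \tag{3}$$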
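As a rough illustration of the estimator and its variance, the following Python sketch computes both from the summary statistics of the log-transformed data. The function names `snr_hat` and `snr_var` are hypothetical, and the variance expression coded here, $s^4 \exp(2s^2) / [2(n-1)(\exp(s^2)-1)^3]$, is the first-order (delta method) form assumed for Equation (3) with $\sigma$ replaced by $s$.

```python
import math

def snr_hat(s2: float) -> float:
    """Plug-in SNR estimate, 1 / sqrt(exp(s2) - 1), from the log-scale sample variance."""
    return 1.0 / math.sqrt(math.expm1(s2))

def snr_var(s2: float, n: int) -> float:
    """Assumed first-order variance of the SNR estimator (sigma^2 replaced by s2)."""
    return s2 ** 2 * math.exp(2.0 * s2) / (2.0 * (n - 1) * math.expm1(s2) ** 3)

# Summary statistics reported for the hemoglobin data in Example 1.
theta = snr_hat(0.0665)
std_err = math.sqrt(snr_var(0.0665, 65))
print(f"SNR estimate: {theta:.4f}, approximate standard error: {std_err:.4f}")
```

With $s^2 = 0.0665$, the estimate agrees with the value 3.8135 reported for the hemoglobin data.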

The GCI Approach for a Single SNR
The concept of the GCI was introduced by Weerahandi [16]. Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample having a density function $f(X \mid \theta, \nu)$, where $\theta$ is the parameter of interest and $\nu$ is a nuisance parameter. Let $x$ be the observed sample of $X$. A generalized pivotal quantity $R(X; x, \theta, \nu)$ is considered and satisfies the following conditions: (i) the probability distribution of $R(X; x, \theta, \nu)$ is free of unknown parameters; and (ii) the observed value of $R(X; x, \theta, \nu)$ at $X = x$ does not depend on the nuisance parameter $\nu$. Condition (i) is imposed to guarantee that a subset of the sample space of the possible values of $R(X; x, \theta, \nu)$ can be found at a given value of the confidence coefficient, with no knowledge of the parameters. Condition (ii) is imposed to ensure that such probability statements, based on a generalized pivotal quantity, lead to confidence regions involving the observed data $x$ only. The GCI for $\theta$ is computed using the percentiles of the generalized pivotal quantity. Let $[R(\alpha/2), R(1 - \alpha/2)]$ be a $100(1 - \alpha)\%$ two-sided GCI for the parameter of interest, where $R(\alpha/2)$ and $R(1 - \alpha/2)$ denote the $100(\alpha/2)$-th and the $100(1 - \alpha/2)$-th percentiles of $R(X; x, \theta, \nu)$, respectively.
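Suppose that $\bar{X}$ and $S^2$ are the mean and variance of the log-transformed sample from a log-normal distribution. Furthermore, let $\bar{x}$ and $s^2$ be the observed values of $\bar{X}$ and $S^2$, respectively. Since $(n - 1)S^2/\sigma^2$ has a chi-squared distribution with $n - 1$ degrees of freedom,
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2_{n-1},$$
we define the generalized pivotal quantity for $\sigma^2$ as
$$R_{\sigma^2} = \frac{(n - 1)s^2}{\chi^2_{n-1}}, \tag{4}$$
where $\chi^2_{n-1}$ denotes a chi-squared random variable with $n - 1$ degrees of freedom. From Equations (1) and (4), the generalized pivotal quantity for $\theta$, based on the generalized pivotal quantity for $\sigma^2$, is given by
$$R_{\theta} = \frac{1}{\sqrt{\exp(R_{\sigma^2}) - 1}}. \tag{5}$$
The $100(1 - \alpha)\%$ two-sided confidence interval for the SNR of the log-normal distribution $\theta$, based on the GCI approach, is given by
$$\left[R_{\theta}(\alpha/2),\; R_{\theta}(1 - \alpha/2)\right], \tag{6}$$
where $R_{\theta}(\alpha/2)$ and $R_{\theta}(1 - \alpha/2)$ denote the $(\alpha/2)$-th and $(1 - \alpha/2)$-th quantiles of $R_{\theta}$, respectively.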
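The following algorithm is used to construct the GCI for the SNR of a log-normal distribution (Algorithm 1):
Algorithm 1: The GCI for the SNR.
For a given $\bar{x}$ and $s^2$:
For g = 1 to h:
    Generate $\chi^2_{n-1}$;
    Compute $R_{\sigma^2}$ from Equation (4);
    Compute $R_{\theta}$ from Equation (5);
End g loop.
Compute $R_{\theta}(\alpha/2)$ and $R_{\theta}(1 - \alpha/2)$ from the h values of $R_{\theta}$.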
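The Monte Carlo procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `gci_snr` is hypothetical, the pivotal quantities are assumed to be $R_{\sigma^2} = (n-1)s^2/\chi^2_{n-1}$ and $R_\theta = 1/\sqrt{\exp(R_{\sigma^2}) - 1}$, the chi-squared draws are generated as gamma variates with shape $(n-1)/2$ and scale 2, and the percentiles are taken as simple order statistics.

```python
import math
import random

def gci_snr(s2, n, h=2500, alpha=0.05, seed=None):
    """Monte Carlo GCI for the SNR from the log-scale sample variance s2."""
    rng = random.Random(seed)
    df = n - 1
    draws = []
    for _ in range(h):
        chi2 = rng.gammavariate(df / 2.0, 2.0)   # chi-squared(df) as Gamma(df/2, scale 2)
        r_sigma2 = df * s2 / chi2                # generalized pivot for sigma^2
        draws.append(1.0 / math.sqrt(math.expm1(r_sigma2)))  # pivot for theta
    draws.sort()
    # Order-statistic approximations to the alpha/2 and 1 - alpha/2 quantiles.
    lower = draws[math.floor((alpha / 2) * (h - 1))]
    upper = draws[math.ceil((1 - alpha / 2) * (h - 1))]
    return lower, upper

# Hemoglobin summary statistics from Example 1 (n = 65, s^2 = 0.0665).
print(gci_snr(0.0665, 65, h=2500, seed=1))
```

Because the limits are simulated, they vary slightly from run to run and with the choice of h; fixing `seed` makes a run reproducible.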

The Large Sample Approach for a Single SNR
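From Equations (2) and (3), the $100(1 - \alpha)\%$ two-sided confidence interval for the SNR of the log-normal distribution $\theta$, based on the large sample approach, is given by
$$\left[\hat{\theta} - z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta})},\; \hat{\theta} + z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta})}\right], \tag{7}$$
where $z_{1-\alpha/2}$ denotes the $(1 - \alpha/2)$-th quantile of a standard normal distribution and $\mathrm{Var}(\hat{\theta})$ is defined as in Equation (3), with $\sigma$ replaced by $s$.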
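A short Python sketch of the large sample interval is given below. The function name `large_sample_ci_snr` is hypothetical, and the variance is assumed to take the first-order form $s^4 \exp(2s^2) / [2(n-1)(\exp(s^2)-1)^3]$ for Equation (3) with $\sigma$ replaced by $s$.

```python
import math
from statistics import NormalDist

def large_sample_ci_snr(s2, n, alpha=0.05):
    """Wald-type interval: estimate +/- z * sqrt(estimated variance)."""
    theta = 1.0 / math.sqrt(math.expm1(s2))
    # Assumed delta-method variance of the SNR estimator, evaluated at s2.
    var = s2 ** 2 * math.exp(2.0 * s2) / (2.0 * (n - 1) * math.expm1(s2) ** 3)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(var)
    return theta - half_width, theta + half_width

# Hemoglobin summary statistics from Example 1 (n = 65, s^2 = 0.0665).
lower, upper = large_sample_ci_snr(0.0665, 65)
print(f"[{lower:.4f}, {upper:.4f}]")
```

For the hemoglobin summary statistics, this agrees with the large sample interval [3.1307, 4.4964] reported in Example 1.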

The Confidence Intervals for the Difference between SNRs
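Suppose that $X = \log(Y)$ follows a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$. Similarly, let $T = \log(W)$ follow a normal distribution with mean $\mu_T$ and variance $\sigma_T^2$. Moreover, $X$ and $T$ are independent, and $n$ and $m$ denote the sample sizes of $X$ and $T$, respectively. The single SNRs of $Y$ and $W$ are, respectively, given by
$$\theta_X = \frac{1}{\sqrt{\exp(\sigma_X^2) - 1}} \quad \text{and} \quad \theta_T = \frac{1}{\sqrt{\exp(\sigma_T^2) - 1}}. \tag{8}$$
The estimators of $\theta_X$ and $\theta_T$ are
$$\hat{\theta}_X = \frac{1}{\sqrt{\exp(S_X^2) - 1}} \quad \text{and} \quad \hat{\theta}_T = \frac{1}{\sqrt{\exp(S_T^2) - 1}}. \tag{9}$$
The variances of $\hat{\theta}_X$ and $\hat{\theta}_T$ are, respectively,
$$\mathrm{Var}(\hat{\theta}_X) \approx \frac{\sigma_X^4 \exp(2\sigma_X^2)}{2(n - 1)\left(\exp(\sigma_X^2) - 1\right)^3} \quad \text{and} \quad \mathrm{Var}(\hat{\theta}_T) \approx \frac{\sigma_T^4 \exp(2\sigma_T^2)}{2(m - 1)\left(\exp(\sigma_T^2) - 1\right)^3}. \tag{10}$$
The parameter of interest is the difference $\delta = \theta_X - \theta_T$, which is estimated by
$$\hat{\delta} = \hat{\theta}_X - \hat{\theta}_T. \tag{11}$$
By the Bienaymé formula, the variance of a sum (or difference) of uncorrelated random variables is the sum of their variances. Since $X$ and $T$ are independent, the variance of $\hat{\theta}_X - \hat{\theta}_T$ is obtained as
$$\mathrm{Var}(\hat{\delta}) = \mathrm{Var}(\hat{\theta}_X) + \mathrm{Var}(\hat{\theta}_T). \tag{12}$$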

The GCI Approach for the Difference between SNRs
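Suppose that $S_X^2$ and $S_T^2$ denote the variances of the log-transformed samples, and let $s_X^2$ and $s_T^2$ be the observed values of $S_X^2$ and $S_T^2$, respectively. The generalized pivotal quantities for $\sigma_X^2$ and $\sigma_T^2$ are obtained from
$$R_{\sigma_X^2} = \frac{(n - 1)s_X^2}{\chi^2_{n-1}} \quad \text{and} \quad R_{\sigma_T^2} = \frac{(m - 1)s_T^2}{\chi^2_{m-1}}, \tag{13}$$
where $\chi^2_{n-1}$ and $\chi^2_{m-1}$ denote independent chi-squared random variables with $n - 1$ and $m - 1$ degrees of freedom, respectively.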
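Therefore, the generalized pivotal quantity for the difference, $R_{\delta} = R_{\theta_X} - R_{\theta_T}$, based on the generalized pivotal quantities for $\sigma_X^2$ and $\sigma_T^2$, can be written as
$$R_{\delta} = \frac{1}{\sqrt{\exp(R_{\sigma_X^2}) - 1}} - \frac{1}{\sqrt{\exp(R_{\sigma_T^2}) - 1}}. \tag{14}$$
The $100(1 - \alpha)\%$ two-sided confidence interval for the difference between the SNRs of log-normal distributions $\delta$, based on the GCI approach, is given by
$$\left[R_{\delta}(\alpha/2),\; R_{\delta}(1 - \alpha/2)\right], \tag{15}$$
where $R_{\delta}(\alpha/2)$ and $R_{\delta}(1 - \alpha/2)$ denote the $(\alpha/2)$-th and $(1 - \alpha/2)$-th quantiles of $R_{\delta}$, respectively.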
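The two-sample GCI can be simulated in the same way as in the single-sample case. The sketch below is illustrative only: the function name `gci_snr_diff` is hypothetical, each pivotal quantity is assumed to be $1/\sqrt{\exp((k-1)s^2/\chi^2_{k-1}) - 1}$ with independent chi-squared draws for the two samples, and the percentiles are taken as order statistics.

```python
import math
import random

def gci_snr_diff(s2_x, n, s2_t, m, h=2500, alpha=0.05, seed=None):
    """Monte Carlo GCI for the difference between two log-normal SNRs."""
    rng = random.Random(seed)

    def r_theta(s2, df):
        # One draw of the generalized pivot for a single SNR.
        chi2 = rng.gammavariate(df / 2.0, 2.0)   # chi-squared(df)
        return 1.0 / math.sqrt(math.expm1(df * s2 / chi2))

    # Independent draws for the two samples, differenced pairwise.
    draws = sorted(r_theta(s2_x, n - 1) - r_theta(s2_t, m - 1) for _ in range(h))
    lower = draws[math.floor((alpha / 2) * (h - 1))]
    upper = draws[math.ceil((1 - alpha / 2) * (h - 1))]
    return lower, upper

# Medical charges summary statistics from Example 2.
print(gci_snr_diff(1.8240, 119, 2.6920, 106, h=2500, seed=1))
```

As with any simulated interval, the limits change slightly between runs unless the seed is fixed.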

The Large Sample Approach for the Difference between SNRs
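Using the central limit theorem, the $100(1 - \alpha)\%$ two-sided confidence interval for the difference between the SNRs of log-normal distributions $\delta$, based on the large sample approach, is given by
$$\left[\hat{\delta} - z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\delta})},\; \hat{\delta} + z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\delta})}\right], \tag{16}$$
where $z_{1-\alpha/2}$ is the $(1 - \alpha/2)$-th quantile of the standard normal distribution, and $\hat{\delta}$ and $\mathrm{Var}(\hat{\delta})$ are defined as in Equations (11) and (12), respectively, with $\sigma_X$ and $\sigma_T$ replaced by $s_X$ and $s_T$.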
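The large sample interval for the difference can be sketched in Python as follows. The function names are hypothetical, and each single-sample variance is assumed to take the first-order form $s^4 \exp(2s^2) / [2(k-1)(\exp(s^2)-1)^3]$, with the two variances added because the samples are independent.

```python
import math
from statistics import NormalDist

def snr_and_var(s2, k):
    """SNR estimate and its assumed delta-method variance for a sample of size k."""
    theta = 1.0 / math.sqrt(math.expm1(s2))
    var = s2 ** 2 * math.exp(2.0 * s2) / (2.0 * (k - 1) * math.expm1(s2) ** 3)
    return theta, var

def large_sample_ci_diff(s2_x, n, s2_t, m, alpha=0.05):
    """Wald-type interval for theta_X - theta_T with independent samples."""
    theta_x, var_x = snr_and_var(s2_x, n)
    theta_t, var_t = snr_and_var(s2_t, m)
    delta = theta_x - theta_t
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * math.sqrt(var_x + var_t)
    return delta - half_width, delta + half_width

# Medical charges summary statistics from Example 2.
lower, upper = large_sample_ci_diff(1.8240, 119, 2.6920, 106)
print(f"[{lower:.4f}, {upper:.4f}]")
```

For the medical charges data, this agrees with the large sample interval [0.0082, 0.3300] reported in Example 2.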

The MOVER Approach for the Difference between SNRs
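Let $l_X$ and $u_X$ be the lower and upper limits of the confidence interval for the SNR of $X$, respectively. They can be defined by
$$l_X = \hat{\theta}_X - t_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_X)} \quad \text{and} \quad u_X = \hat{\theta}_X + t_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_X)}, \tag{17}$$
where $t_{1-\alpha/2}$ is the $(1 - \alpha/2)$-th quantile of a Student's t distribution, and $\hat{\theta}_X$ and $\mathrm{Var}(\hat{\theta}_X)$ are defined as in Equations (9) and (10), respectively, with $\sigma_X$ replaced by $s_X$.
Similarly, let $l_T$ and $u_T$ be the lower and upper limits of the confidence interval for the SNR of $T$, respectively. They can be written as
$$l_T = \hat{\theta}_T - t_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_T)} \quad \text{and} \quad u_T = \hat{\theta}_T + t_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_T)}, \tag{18}$$
where $t_{1-\alpha/2}$ is the $(1 - \alpha/2)$-th quantile of a Student's t distribution, and $\hat{\theta}_T$ and $\mathrm{Var}(\hat{\theta}_T)$ are defined as in Equations (9) and (10), respectively, with $\sigma_T$ replaced by $s_T$. Following Zou and Donner [17] and Zou et al. [18], the $100(1 - \alpha)\%$ two-sided confidence interval for the difference between the SNRs of log-normal distributions $\delta$, based on the MOVER approach, is given by
$$\left[\hat{\theta}_X - \hat{\theta}_T - \sqrt{(\hat{\theta}_X - l_X)^2 + (u_T - \hat{\theta}_T)^2},\; \hat{\theta}_X - \hat{\theta}_T + \sqrt{(u_X - \hat{\theta}_X)^2 + (\hat{\theta}_T - l_T)^2}\right], \tag{19}$$
where $l_X$ and $u_X$ are defined as in Equation (17), and $l_T$ and $u_T$ are defined as in Equation (18).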
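The MOVER construction can be sketched in Python as follows. Several details here are assumptions rather than statements from the text: the function name `mover_ci_diff` is hypothetical; the single-sample limits are taken to be $\hat{\theta} \pm t\sqrt{\mathrm{Var}(\hat{\theta})}$ with the first-order variance $s^4 \exp(2s^2) / [2(k-1)(\exp(s^2)-1)^3]$; the limits are combined with the standard Zou and Donner formula for a difference; and the Student's t quantiles are passed in by the caller, since the Python standard library has no t quantile function. The quantiles used in the example (1.9803 and 1.9828) correspond to 118 and 105 degrees of freedom, which is also an assumption, as the text does not state the degrees of freedom.

```python
import math

def snr_limits(s2, k, t_quantile):
    """Estimate and symmetric t-based limits for a single SNR (sample size k)."""
    theta = 1.0 / math.sqrt(math.expm1(s2))
    var = s2 ** 2 * math.exp(2.0 * s2) / (2.0 * (k - 1) * math.expm1(s2) ** 3)
    half_width = t_quantile * math.sqrt(var)
    return theta, theta - half_width, theta + half_width

def mover_ci_diff(s2_x, n, s2_t, m, t_x, t_t):
    """MOVER interval for theta_X - theta_T from the two single-sample intervals."""
    th_x, l_x, u_x = snr_limits(s2_x, n, t_x)
    th_t, l_t, u_t = snr_limits(s2_t, m, t_t)
    delta = th_x - th_t
    lower = delta - math.sqrt((th_x - l_x) ** 2 + (u_t - th_t) ** 2)
    upper = delta + math.sqrt((u_x - th_x) ** 2 + (th_t - l_t) ** 2)
    return lower, upper

# Medical charges summary statistics from Example 2; t quantiles assumed
# for 118 and 105 degrees of freedom at the 0.975 level.
lower, upper = mover_ci_diff(1.8240, 119, 2.6920, 106, t_x=1.9803, t_t=1.9828)
print(f"[{lower:.4f}, {upper:.4f}]")
```

Under these assumptions, the result agrees with the MOVER interval [0.0064, 0.3318] reported in Example 2.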

Simulation Studies
Two simulation studies were conducted to evaluate the coverage probabilities and average lengths of the proposed confidence intervals. The aim of the first simulation was to assess the performance of the GCI approach, in comparison with the large sample approach, for the confidence interval estimation for the single SNR of a log-normal distribution. The aim of the second simulation was to examine the performance of the GCI approach, in comparison with the large sample and MOVER approaches, for the difference between the SNRs of two log-normal distributions.
In the single SNR simulation study, the sample sizes were n = 10, 20, 30, 50, 100, and 200; the population mean of the normally distributed data was $\mu = 1$; the SNR was $\theta$ = 1, 3, 5, and 10; and the population standard deviation of the normally distributed data was computed as $\sigma = \sqrt{\log((1/\theta^2) + 1)}$, which follows from solving Equation (1) for $\sigma$. A total of 5000 random samples were generated for each set of parameters. For the GCI approach, 2500 values of $R_{\theta}$ were obtained for each of the random samples. Table 1 reports the coverage probabilities and average lengths of the 95% two-sided confidence intervals for the SNR of the log-normal distribution. The results show that the coverage probabilities of both approaches were close to the nominal confidence level of 0.95. Moreover, the average lengths of the GCI approach were shorter than those of the large sample approach when the sample size was small. For a large sample size (n ≥ 100), the GCI approach performed as well as the large sample approach, in terms of the average length, when the SNR was small; otherwise, the average lengths of the GCI approach were shorter than those of the large sample approach.
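A simulation of this kind can be sketched in a few lines of Python. The sketch below is not the authors' code: it estimates the coverage probability and average length for the large sample interval only, the function name `coverage_large_sample` is hypothetical, the interval is assumed to be $\hat{\theta} \pm z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta})}$ with the first-order variance $s^4 \exp(2s^2) / [2(n-1)(\exp(s^2)-1)^3]$, and the log-scale data are drawn directly from the normal distribution, which is equivalent to generating log-normal data and taking logarithms.

```python
import math
import random
from statistics import NormalDist, variance

def coverage_large_sample(theta, n, reps=5000, alpha=0.05, mu=1.0, seed=None):
    """Estimate coverage probability and average length of the large sample interval."""
    rng = random.Random(seed)
    sigma = math.sqrt(math.log(1.0 / theta ** 2 + 1.0))  # sigma implied by the target SNR
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits, total_length = 0, 0.0
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]      # log-transformed sample
        s2 = variance(x)                                  # sample variance (n - 1 divisor)
        est = 1.0 / math.sqrt(math.expm1(s2))
        var = s2 ** 2 * math.exp(2.0 * s2) / (2.0 * (n - 1) * math.expm1(s2) ** 3)
        half_width = z * math.sqrt(var)
        hits += (est - half_width <= theta <= est + half_width)
        total_length += 2.0 * half_width
    return hits / reps, total_length / reps

cp, al = coverage_large_sample(theta=3.0, n=50, reps=2000, seed=1)
print(f"coverage probability: {cp:.3f}, average length: {al:.3f}")
```

The same loop can be wrapped around any of the other interval constructions to compare their coverage probabilities and average lengths on a common set of simulated samples.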
In the simulation study of the difference of SNRs, the sample sizes were (n, m) = (10, 10), (10, 20), (20, 20), (20, 30), (30, 30), (30, 50), (50, 50), (50, 100), (100, 100), (100, 200), and (200, 200); the population means of the normally distributed data were $\mu_X = \mu_T = 1$; and the population SNRs were $(\theta_X, \theta_T)$ = (10, 1), (10, 2), (10, 5), and (10, 10). The population standard deviations of the normally distributed data were therefore computed as $\sigma_X = \sqrt{\log((1/\theta_X^2) + 1)}$ and $\sigma_T = \sqrt{\log((1/\theta_T^2) + 1)}$. The coverage probabilities and average lengths of the 95% two-sided confidence intervals for the difference between the SNRs of the log-normal distributions were evaluated, based on 5000 replications, and 2500 values of $R_{\delta}$ were obtained for the GCI approach. The results are given in Table 2, in which it can be seen that the GCI approach and the large sample approach were preferable for all cases. However, the average lengths of the GCI approach were shorter than those of the large sample approach. Furthermore, the coverage probabilities of the MOVER approach exceeded 0.97 for (n, m) = (10, 10) and (10, 20); thus, the MOVER confidence interval was conservative for those two sample sizes. For large sample sizes, the coverage probabilities of the MOVER approach were close to the nominal confidence level of 0.95, although its average lengths were longer than those of the GCI and large sample approaches.

Empirical Applications
Two examples are given to illustrate our proposed approaches for confidence intervals for the SNR of a log-normal distribution and the difference between the SNRs of log-normal distributions. The GCIs are computed using Algorithm 1, with h = 2500.
Example 1. The data are from Fung and Tsang [19] and Ng [2]. The dataset contains hemoglobin values from one normal and one abnormal blood sample of Hb1995. The summary statistics are n = 65, $\bar{x} = 14.64$, and $s^2 = 0.0665$. Therefore, the estimated SNR of the log-normal distribution is 3.8135. The procedures in Section 2 are applied to compute the 95% two-sided confidence intervals for the SNR of the log-normal distribution. The 95% GCI and large sample confidence interval for the SNR are [3.1365, 4.4748], with an interval length of 1.3383, and [3.1307, 4.4964], with an interval length of 1.3657, respectively. Note that both the GCI and the large sample confidence interval contain the estimated value of the SNR. However, the GCI is shorter than the large sample confidence interval, which is in line with the simulation results (Table 1).
Example 2. The data are from the Regenstrief Medical Record System, as reported in McDonald et al. [20], Zhou et al. [21], and Jafari and Abdollahnezhad [8]. The data represent the effects of race on medical charges for patients with type I diabetes who received inpatient or outpatient care, on at least two occasions, during the period from 1 January 1993 to 30 June 1994. The dataset consists of African American and white patients. For the African American patients, the summary statistics are n = 119, $\bar{x} = 9.0670$, and $s_X^2 = 1.8240$ and, for the white patients, the summary statistics are m = 106, $\bar{t} = 8.6930$, and $s_T^2 = 2.6920$. The estimated difference between the SNRs is 0.1691. Zhou et al. [21] showed that both datasets come from log-normal distributions. The 95% two-sided confidence intervals for the difference between the SNRs of the log-normal distributions were constructed, using the three approaches given in Section 3. The 95% GCI, large sample, and MOVER confidence intervals for the difference between SNRs are [0.0101, 0.3258], with an interval length of 0.3157; [0.0082, 0.3300], with an interval length of 0.3218; and [0.0064, 0.3318], with an interval length of 0.3254, respectively. The results indicate that all of the confidence intervals contain the estimated difference between the SNRs, and that the GCI is the shortest of the three and is therefore preferred.

Discussion and Conclusions
In this article, we considered the confidence intervals for the single SNR of a log-normal distribution and for the difference between the SNRs of two log-normal distributions. First, we used the GCI approach and the large sample approach to construct the confidence intervals for the SNR, and then we used the GCI, large sample, and MOVER approaches to construct the confidence interval for the difference between the SNRs.
For the confidence interval for the SNR, the coverage probabilities of both approaches were satisfactory. However, the GCI approach was better than the large sample approach, in terms of the average length. For the difference between the SNRs, the GCI approach and the large sample approach were preferable to MOVER, and the average lengths of the GCI approach were shorter than those of the large sample approach. As a result, of the GCI and large sample approaches, the former is preferable in terms of the average length.

Table 1. Coverage probabilities (CP) and average lengths (AL) of the 95% two-sided confidence intervals for the signal-to-noise ratio (SNR) of the log-normal distribution.

Table 2. CP and AL of the 95% two-sided confidence intervals for the difference between the SNRs of the log-normal distributions.