We will show two ways of estimating the parameters. The first one can be found in [1], but we review it and add some more details. Both are essentially maximum likelihood estimation procedures; in the first case, we perform maximization, whereas in the second, we seek the root of an equation.
The log-likelihood of Equation (2) can be written in the following way:

$$\ell(\mu,\sigma^2) = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \sum_{i=1}^{n}\frac{x_i^2+\mu^2}{2\sigma^2} + \sum_{i=1}^{n}\log\left[2\cosh\left(\frac{\mu x_i}{\sigma^2}\right)\right] \quad (41)$$

where $n$ is the sample size of the $x_i$ values. The partial derivatives of Equation (41) are:

$$\frac{\partial \ell}{\partial \mu} = -\frac{n\mu}{\sigma^2} + \frac{1}{\sigma^2}\sum_{i=1}^{n} x_i \tanh\left(\frac{\mu x_i}{\sigma^2}\right) \quad \text{and} \quad \frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \sum_{i=1}^{n}\frac{x_i^2+\mu^2}{2\sigma^4} - \frac{\mu}{\sigma^4}\sum_{i=1}^{n} x_i \tanh\left(\frac{\mu x_i}{\sigma^2}\right).$$

By equating the first derivative of the log-likelihood to zero, we obtain a nice relationship:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i \tanh\left(\frac{\mu x_i}{\sigma^2}\right) \quad (42)$$
Note that Equation (42) has three solutions: one at zero and a pair of nonzero solutions of equal magnitude and opposite signs. The example in Section 4.1 will show the three solutions graphically. By substituting Equation (42) into the derivative of the log-likelihood w.r.t. $\sigma^2$ and equating it to zero, we get the following expression for the variance:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \mu^2 \quad (43)$$
The above relationships, Equations (42) and (43), can be used to obtain maximum likelihood estimates in an efficient, recursive way. We start with an initial value for $\sigma^2$ and find the positive root of Equation (42). Then, we insert this value of $\mu$ into Equation (43) and get an updated value of $\sigma^2$. The procedure is repeated until the change in the log-likelihood value is negligible.
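The recursion just described can be sketched in code. The paper's computations were performed in R; the following is a stdlib-only Python sketch under the assumption that Equation (42) reads μ = (1/n) Σ xᵢ tanh(μxᵢ/σ²) and Equation (43) reads σ² = (1/n) Σ xᵢ² − μ² (the standard folded-normal likelihood equations). The bisection bracket (0, max xᵢ] and all names are illustrative.

```python
import math
import random

def mle_folded_normal(x, tol=1e-8, max_iter=200):
    """Recursive folded-normal MLE: alternate between the positive root
    of Equation (42) for mu (found by bisection) and the closed-form
    update of Equation (43) for sigma^2, stopping once the
    log-likelihood changes by less than tol."""
    n = len(x)
    mu = sum(x) / n                                  # initial guess for mu
    s2 = sum(xi ** 2 for xi in x) / n - mu ** 2      # initial guess for sigma^2

    def loglik(mu, s2):
        return (-0.5 * n * math.log(2 * math.pi * s2)
                - sum((xi ** 2 + mu ** 2) / (2 * s2) for xi in x)
                + sum(math.log(2 * math.cosh(mu * xi / s2)) for xi in x))

    prev = loglik(mu, s2)
    for _ in range(max_iter):
        # g(m) = sum_i x_i tanh(m x_i / s2) - n m; find its positive root
        def g(m):
            return sum(xi * math.tanh(m * xi / s2) for xi in x) - n * m
        lo, hi = 1e-12, max(x)       # assumes g(lo) > 0 > g(hi)
        g_lo = g(lo)
        for _ in range(60):          # plain bisection
            mid = 0.5 * (lo + hi)
            g_mid = g(mid)
            if g_lo * g_mid <= 0:
                hi = mid
            else:
                lo, g_lo = mid, g_mid
        mu = 0.5 * (lo + hi)
        s2 = sum(xi ** 2 for xi in x) / n - mu ** 2  # Equation (43)
        cur = loglik(mu, s2)
        if abs(cur - prev) < tol:
            break
        prev = cur
    return mu, s2

# example: simulated folded-normal data with mu = 2, sigma = 1
random.seed(42)
sample = [abs(random.gauss(2.0, 1.0)) for _ in range(1000)]
mu_hat, s2_hat = mle_folded_normal(sample)
```

On simulated data with a large mean-to-standard-deviation ratio, the alternation typically settles within a handful of outer iterations, since each pass solves Equation (42) exactly for the current σ².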
Another, easier and more efficient, way is to perform a search algorithm. Let us write Equation (42) in a more elegant way:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i \tanh\left(\frac{\mu x_i}{\sigma^2(\mu)}\right),$$

where $\sigma^2(\mu)$ is defined in Equation (43). It becomes clear that the optimization of the log-likelihood in Equation (41) with respect to the two parameters has turned into a root search for a function of one parameter only. We tried to perform maximization via the E-M algorithm, treating the sign as the missing information, but it did not prove very good in this case.
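A minimal sketch of this one-parameter root search, again in Python rather than the R used for the paper's computations, and again assuming Equations (42) and (43) take the standard folded-normal forms μ = (1/n) Σ xᵢ tanh(μxᵢ/σ²) and σ² = (1/n) Σ xᵢ² − μ²:

```python
import math
import random

def folded_mle_by_root_search(x, iters=100):
    """Substituting sigma^2(mu) = (1/n) sum x_i^2 - mu^2 from Equation
    (43) into Equation (42) leaves a single equation in mu,
        f(mu) = sum_i x_i tanh(mu x_i / sigma^2(mu)) - n mu = 0,
    whose positive root (located here by bisection) is the MLE of mu."""
    n = len(x)
    m2 = sum(xi ** 2 for xi in x) / n

    def f(mu):
        s2 = m2 - mu ** 2
        return sum(xi * math.tanh(mu * xi / s2) for xi in x) - n * mu

    # sigma^2(mu) > 0 forces mu < sqrt(m2); assumes f > 0 just above zero
    lo, hi = 1e-9, math.sqrt(m2) - 1e-9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return mu, m2 - mu ** 2

# example: the same kind of simulated data as before
random.seed(7)
sample = [abs(random.gauss(2.0, 1.0)) for _ in range(1000)]
mu_root, s2_root = folded_mle_by_root_search(sample)
```

The design advantage over the recursion is that only one scalar function is bracketed and bisected, with no inner/outer loop.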
4.2. Simulation Studies
Simulation studies were implemented to examine the accuracy of the estimates using numerical optimization based on the simplex method [10]. Numerical optimization was performed in R [15], using the optim function. The term accuracy refers to interval estimation rather than point estimation, since the interest was in constructing confidence intervals for the parameters. The number of simulations was set equal to R = 1000. The sample sizes ranged from 20 to 100 for a range of values of the parameter vector. The R package VGAM [16] offers algorithms for obtaining maximum likelihood estimates of the folded normal, but we have not used it here.
For every simulation, we calculated confidence intervals using the normal approximation, where the variance was estimated from the inverse of the observed information matrix. The maximum likelihood estimates are asymptotically normal with variance equal to the inverse of Fisher's information. The sample estimate of this information is given by the negative second derivative (Hessian matrix) of the log-likelihood with respect to the parameters. These are therefore asymptotic confidence intervals.
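As a concrete sketch of this construction (Python instead of the R used in the paper; the folded-normal log-likelihood is the standard one, and the step size h and the 95% level are assumptions), the observed information can be approximated by central finite differences of the log-likelihood at the estimates and inverted to give Wald-type intervals:

```python
import math
import random

def folded_loglik(mu, s2, x):
    """Standard folded-normal log-likelihood."""
    n = len(x)
    return (-0.5 * n * math.log(2 * math.pi * s2)
            - sum((xi ** 2 + mu ** 2) / (2 * s2) for xi in x)
            + sum(math.log(2 * math.cosh(mu * xi / s2)) for xi in x))

def wald_ci(mu_hat, s2_hat, x, z=1.96, h=1e-4):
    """95% asymptotic CIs: approximate the Hessian of the log-likelihood
    at the estimates by central differences, negate it to obtain the
    observed information, invert the 2x2 matrix, and take the estimates
    +/- z standard errors."""
    f = lambda a, b: folded_loglik(a, b, x)
    haa = (f(mu_hat + h, s2_hat) - 2 * f(mu_hat, s2_hat)
           + f(mu_hat - h, s2_hat)) / h ** 2
    hbb = (f(mu_hat, s2_hat + h) - 2 * f(mu_hat, s2_hat)
           + f(mu_hat, s2_hat - h)) / h ** 2
    hab = (f(mu_hat + h, s2_hat + h) - f(mu_hat + h, s2_hat - h)
           - f(mu_hat - h, s2_hat + h) + f(mu_hat - h, s2_hat - h)) / (4 * h ** 2)
    det = haa * hbb - hab ** 2                  # det of the observed information
    var_mu, var_s2 = -hbb / det, -haa / det     # diagonal of its inverse
    se_mu, se_s2 = math.sqrt(var_mu), math.sqrt(var_s2)
    return ((mu_hat - z * se_mu, mu_hat + z * se_mu),
            (s2_hat - z * se_s2, s2_hat + z * se_s2))

# example with the true values standing in for the MLE on a large sample
random.seed(1)
sample = [abs(random.gauss(2.0, 1.0)) for _ in range(2000)]
ci_mu, ci_s2 = wald_ci(2.0, 1.0, sample)
```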
Bootstrap confidence intervals were also calculated using the percentile method [17]. For every simulation, we produced the bootstrap distribution of the estimates over repeated bootstrap resamples and calculated the lower and upper quantiles for each of the parameters. In addition, we calculated the correlations for every pair of parameters.
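A sketch of the percentile construction (Python, not the paper's R; the resampling count B and the use of the sample mean as the plug-in estimator are purely illustrative, since the paper re-estimates the folded-normal parameters on each resample):

```python
import math
import random

def percentile_ci(x, estimator, B=500, alpha=0.05, rng=random):
    """Percentile-method bootstrap CI: resample the data with
    replacement B times, apply the estimator to each resample, and
    report the alpha/2 and 1 - alpha/2 empirical quantiles."""
    n = len(x)
    stats = sorted(estimator([x[rng.randrange(n)] for _ in range(n)])
                   for _ in range(B))
    lo = stats[int(math.floor(alpha / 2 * B))]
    hi = stats[min(B - 1, int(math.ceil((1 - alpha / 2) * B)) - 1)]
    return lo, hi

# illustrative run: bootstrap CI for the mean of folded-normal data
random.seed(3)
sample = [abs(random.gauss(2.0, 1.0)) for _ in range(200)]
ci = percentile_ci(sample, lambda s: sum(s) / len(s), B=400)
```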
Table 1, Table 2, Table 3 and Table 4 present the coverage of the confidence intervals for the two parameters at different pairs of sample size and mean. The rows correspond to the sample size, whereas the columns correspond to the ratio θ = μ/σ, with σ held fixed.
Table 1.
Estimated coverage probability of the confidence intervals for the mean parameter, μ, using the observed information matrix.
Sample size | θ = 0.5 | θ = 1 | θ = 1.5 | θ = 2 | θ = 2.5 | θ = 3 | θ = 3.5 | θ = 4
---|---|---|---|---|---|---|---|---
20 | 0.689 | 0.930 | 0.955 | 0.931 | 0.926 | 0.940 | 0.930 | 0.948 |
30 | 0.679 | 0.921 | 0.949 | 0.943 | 0.925 | 0.926 | 0.941 | 0.915 |
40 | 0.690 | 0.916 | 0.936 | 0.933 | 0.941 | 0.948 | 0.944 | 0.928 |
50 | 0.718 | 0.944 | 0.955 | 0.938 | 0.933 | 0.948 | 0.946 | 0.946 |
60 | 0.699 | 0.950 | 0.968 | 0.948 | 0.949 | 0.941 | 0.942 | 0.946 |
70 | 0.721 | 0.931 | 0.956 | 0.939 | 0.939 | 0.939 | 0.949 | 0.945 |
80 | 0.691 | 0.930 | 0.950 | 0.940 | 0.946 | 0.936 | 0.945 | 0.939 |
90 | 0.720 | 0.932 | 0.960 | 0.949 | 0.949 | 0.939 | 0.954 | 0.944 |
100 | 0.738 | 0.945 | 0.949 | 0.938 | 0.943 | 0.926 | 0.946 | 0.952 |
What can be seen from Table 1 and Table 2 is that whilst the sample size is important, the value of θ, the mean to standard deviation ratio, is more important. As this ratio increases, the coverage probability increases as well and reaches the desired nominal level. This is also true for the bootstrap confidence intervals, but their coverage is, in general, higher and increases faster with the sample size than that of the asymptotic confidence intervals. What is more, when the value of θ is less than one, the bootstrap confidence interval is to be preferred. When the value of θ is equal to or greater than one, both the bootstrap and the asymptotic confidence intervals produce similar coverages.
The results regarding the variance are presented in Table 3 and Table 4. When the value of θ is small, both ways of obtaining confidence intervals for this parameter fall well short of the nominal coverage. The bootstrap intervals tend to perform better, but not up to expectations. Even when the value of θ is large, if the sample sizes are not large enough, the nominal coverage is not attained.
Table 2.
Estimated coverage probability of the bootstrap confidence intervals for the mean parameter, μ, using the percentile method.
Sample size | θ = 0.5 | θ = 1 | θ = 1.5 | θ = 2 | θ = 2.5 | θ = 3 | θ = 3.5 | θ = 4
---|---|---|---|---|---|---|---|---
20 | 0.890 | 0.925 | 0.939 | 0.921 | 0.918 | 0.940 | 0.929 | 0.942 |
30 | 0.894 | 0.931 | 0.933 | 0.943 | 0.926 | 0.922 | 0.942 | 0.910 |
40 | 0.910 | 0.925 | 0.927 | 0.933 | 0.941 | 0.947 | 0.946 | 0.928 |
50 | 0.914 | 0.943 | 0.942 | 0.934 | 0.934 | 0.945 | 0.946 | 0.943 |
60 | 0.904 | 0.949 | 0.953 | 0.950 | 0.941 | 0.938 | 0.943 | 0.944 |
70 | 0.893 | 0.934 | 0.943 | 0.936 | 0.937 | 0.938 | 0.949 | 0.939 |
80 | 0.918 | 0.940 | 0.939 | 0.939 | 0.944 | 0.935 | 0.946 | 0.938 |
90 | 0.920 | 0.934 | 0.952 | 0.948 | 0.946 | 0.939 | 0.951 | 0.947 |
100 | 0.918 | 0.940 | 0.936 | 0.932 | 0.946 | 0.925 | 0.945 | 0.949 |
Table 3.
Estimated coverage probability of the confidence intervals for the variance parameter, σ², using the observed information matrix.
Sample size | θ = 0.5 | θ = 1 | θ = 1.5 | θ = 2 | θ = 2.5 | θ = 3 | θ = 3.5 | θ = 4
---|---|---|---|---|---|---|---|---
20 | 0.649 | 0.765 | 0.854 | 0.853 | 0.876 | 0.870 | 0.862 | 0.885 |
30 | 0.697 | 0.794 | 0.870 | 0.898 | 0.892 | 0.898 | 0.894 | 0.896 |
40 | 0.723 | 0.849 | 0.893 | 0.914 | 0.919 | 0.913 | 0.909 | 0.902 |
50 | 0.751 | 0.867 | 0.916 | 0.907 | 0.911 | 0.924 | 0.899 | 0.912 |
60 | 0.745 | 0.865 | 0.911 | 0.913 | 0.916 | 0.906 | 0.920 | 0.933 |
70 | 0.769 | 0.874 | 0.928 | 0.928 | 0.912 | 0.930 | 0.926 | 0.935 |
80 | 0.776 | 0.883 | 0.927 | 0.919 | 0.934 | 0.936 | 0.916 | 0.924 |
90 | 0.795 | 0.901 | 0.931 | 0.932 | 0.925 | 0.930 | 0.940 | 0.941 |
100 | 0.824 | 0.904 | 0.927 | 0.933 | 0.925 | 0.936 | 0.932 | 0.942 |
The correlation between the two parameters was also estimated for every simulation from the observed information matrix. The results are displayed in Table 5. The correlation between the two parameters is always negative, irrespective of the sample size or the value of θ, except for the largest value, θ = 4, where the reported correlation is zero, as expected. As the value of θ grows larger, the probability mass of the normal distribution lying on the negative axis becomes smaller, until it becomes negligible. In this case, the distribution equals the classical normal distribution, for which the two parameters are known to be orthogonal.
Table 4.
Estimated coverage probability of the bootstrap confidence intervals for the variance parameter, σ², using the percentile method.
Sample size | θ = 0.5 | θ = 1 | θ = 1.5 | θ = 2 | θ = 2.5 | θ = 3 | θ = 3.5 | θ = 4
---|---|---|---|---|---|---|---|---
20 | 0.657 | 0.814 | 0.862 | 0.842 | 0.840 | 0.832 | 0.818 | 0.824 |
30 | 0.701 | 0.850 | 0.885 | 0.891 | 0.882 | 0.867 | 0.869 | 0.866 |
40 | 0.743 | 0.881 | 0.896 | 0.913 | 0.912 | 0.886 | 0.881 | 0.878 |
50 | 0.772 | 0.895 | 0.921 | 0.916 | 0.897 | 0.901 | 0.885 | 0.892 |
60 | 0.797 | 0.907 | 0.912 | 0.910 | 0.906 | 0.897 | 0.907 | 0.916 |
70 | 0.807 | 0.904 | 0.925 | 0.915 | 0.909 | 0.918 | 0.908 | 0.924 |
80 | 0.822 | 0.895 | 0.925 | 0.914 | 0.925 | 0.917 | 0.909 | 0.909 |
90 | 0.869 | 0.916 | 0.932 | 0.922 | 0.919 | 0.915 | 0.934 | 0.929 |
100 | 0.873 | 0.915 | 0.918 | 0.925 | 0.906 | 0.931 | 0.920 | 0.939 |
Table 5.
Estimated correlations between the two parameters obtained from the observed information matrix.
Sample size | θ = 0.5 | θ = 1 | θ = 1.5 | θ = 2 | θ = 2.5 | θ = 3 | θ = 3.5 | θ = 4
---|---|---|---|---|---|---|---|---
20 | −0.600 | −0.495 | −0.272 | −0.086 | −0.025 | −0.006 | −0.001 | 0.000 |
30 | −0.638 | −0.537 | −0.262 | −0.089 | −0.022 | −0.005 | −0.001 | 0.000 |
40 | −0.695 | −0.548 | −0.251 | −0.081 | −0.021 | −0.005 | −0.001 | 0.000 |
50 | −0.723 | −0.580 | −0.259 | −0.076 | −0.020 | −0.005 | −0.001 | 0.000 |
60 | −0.750 | −0.597 | −0.251 | −0.075 | −0.019 | −0.004 | −0.001 | 0.000 |
70 | −0.771 | −0.588 | −0.256 | −0.073 | −0.019 | −0.004 | −0.001 | 0.000 |
80 | −0.774 | −0.604 | −0.253 | −0.074 | −0.019 | −0.004 | −0.001 | 0.000 |
90 | −0.796 | −0.599 | −0.245 | −0.073 | −0.018 | −0.004 | −0.001 | 0.000 |
100 | −0.804 | −0.611 | −0.252 | −0.072 | −0.019 | −0.004 | −0.001 | 0.000 |
Table 6 shows the probability of a normal random variable being less than zero for the same values of θ as in the simulation studies.
Table 6.
Probability of a normal variable having negative values.
θ | 0.5 | 1 | 1.5 | 2 | 2.5 | 3 | 3.5 | 4
---|---|---|---|---|---|---|---|---
P(X < 0) | 0.309 | 0.159 | 0.067 | 0.023 | 0.006 | 0.001 | 0.000 | 0.000
When the ratio of the mean to the standard deviation is small, the area of the normal distribution on the negative side is large; as the value of this ratio increases, this probability decreases until it becomes negligible. In that case, the folded normal is simply the normal distribution, since there are no negative values to fold onto the positive side. This, of course, is in accordance with all the previous observations and results.
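The entries of Table 6 follow from standardization: if X ~ N(μ, σ²) and θ = μ/σ, then P(X < 0) = P(Z < −θ) = Φ(−θ). A short Python check (the function name is illustrative):

```python
import math

def prob_negative(theta):
    """P(X < 0) for X ~ N(mu, sigma^2) with theta = mu / sigma.
    Standardizing gives Phi(-theta), computed from the error
    function: Phi(t) = (1 + erf(t / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(-theta / math.sqrt(2.0)))

# reproduce the row of Table 6 to three decimals
row = [round(prob_negative(t), 3) for t in (0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4)]
```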