New weighted $L^2$-type tests for the inverse Gaussian distribution

We propose a new class of goodness-of-fit tests for the inverse Gaussian distribution. The proposed tests are weighted $L^2$-type tests depending on a tuning parameter. We develop the asymptotic theory under the null hypothesis and under a broad class of alternative distributions. These results are used to show that the parametric bootstrap procedure, which we employ to implement the test, is asymptotically valid and that the whole test procedure is consistent. A comparative simulation study for finite sample sizes shows that the new procedure is competitive with classical and recent tests, outperforming these other methods almost uniformly over a large set of alternative distributions. The use of the newly proposed test is illustrated with two observed data sets.


Introduction
The inverse Gaussian distribution (also known as the Wald distribution) was first heuristically observed by Bachelier (1900), and derived by Schroedinger (1915) as the distribution of the first passage time of Brownian motion with drift; see Seshadri (1993) for a historical summary. In the statistical literature the usual parametrization of the inverse Gaussian law IG(µ, λ), µ, λ > 0, follows the representation of Tweedie (1957a,b), namely the density is given by
$$f(x; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^{3}}}\,\exp\left(-\frac{\lambda (x-\mu)^{2}}{2\mu^{2} x}\right), \quad x > 0. \qquad (1)$$
Let X, X_1, X_2, \ldots be independent and identically distributed (iid.) positive random variables defined on a common probability space (Ω, A, P). Writing P^X for the distribution of X, we intend to test the composite hypothesis
$$H_0 : P^X \in IG \qquad (2)$$
against general alternatives. This testing problem has been considered in the statistical literature. The methods by Baringhaus and Gaigall (2015) and Mudholkar et al. (2001) are based on a characterization of the IG family by an independence property, Ducharme (2001) uses a connection to the so-called Random Walk distribution, and Nguyen and Dinh (2003) propose exact tests based on the empirical distribution function of transformations characterizing the inverse Gaussian law, which are commented on and corrected in Gracia-Medrano and O'Reilly (2005). Henze and Klar (2002) use a differential equation that characterizes the Laplace transform of the IG family as well as an $L^2$-distance test, both using the empirical Laplace transform. Koutrouvelis and Karagrigoriou (2012) use an empirical version of the standardized form of the cumulant generating function, Vexler et al. (2011) propose an empirical likelihood test based on densities for IG, and Villaseñor and González-Estrada (2015) consider a variance ratio test of fit. Finally, Koudou and Ley (2014b) tackle the testing problem for the generalized inverse Gaussian family, exploiting the ULAN property in connection with Le Cam theory.
Although their testing problem is formulated for a wider class of distributions, it still applies to testing (2) when restricted to the special case $p_0 = -\tfrac{1}{2}$ in the authors' notation; see Section 4 of the cited article. A comparative simulation study is provided by Noughabi (2017).
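For concreteness, the IG(µ, λ) density from the parametrization above, together with a standard sampler, can be sketched in a few lines. This is a minimal illustration under our own function names; the sampler uses the well-known transformation-with-rejection method of Michael, Schucany and Haas (1976), which is not part of this paper.

```python
import math
import random

def ig_pdf(x, mu, lam):
    """Density of IG(mu, lam) in the Tweedie parametrization, for x > 0."""
    return math.sqrt(lam / (2.0 * math.pi * x**3)) * \
        math.exp(-lam * (x - mu)**2 / (2.0 * mu**2 * x))

def ig_sample(mu, lam, rng=random):
    """One draw from IG(mu, lam) via the Michael-Schucany-Haas method."""
    y = rng.gauss(0.0, 1.0) ** 2                      # chi-square(1) variate
    x = mu + mu**2 * y / (2.0 * lam) - \
        mu / (2.0 * lam) * math.sqrt(4.0 * mu * lam * y + mu**2 * y**2)
    # accept the smaller root with probability mu / (mu + x), else take mu^2 / x
    if rng.random() <= mu / (mu + x):
        return x
    return mu**2 / x
```

With µ = 1 the sample mean of many draws approaches µ, and the sample variance approaches µ³/λ, matching the first two moments used for the moment estimators in Section 2.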
It is a common approach to exploit distributional characterizations to propose goodness-of-fit testing procedures; for an overview see Nikitin (2017). As evidenced by the list above, there are numerous characterizations of the inverse Gaussian distribution, including characterizing properties based on independence, constancy of regression of suitable functions on the sum of identically distributed random variables, random continued fractions, or the relation between E[X] and E[X^{-1}]. For details see Seshadri (1993), Section 3, and Koudou and Ley (2014a), Section 2.5; for an introduction to characterizations for other distributions, see Ahsanullah (2017) and Kagan et al. (1973). A very recent characterization identity is given in Example 5.10 of Betsch and Ebner (2019a), which reads as follows.
Theorem 1.1. Let X : Ω → (0, ∞) be a random variable with distribution function F, E[X] < ∞ and E[X^{-1}] < ∞. Then X has the inverse Gaussian distribution IG(1, ϕ) if, and only if, the characterization identity (3) holds.

Note that X ∼ IG(µ, λ) if, and only if, X/µ ∼ IG(1, λ/µ), and therefore the family IG is closed under scale transformations. Since the characterization is directly related to the theory of Stein characterizations [for details on Stein operators, see Ley et al. (2017)], we refer to the corresponding characterization of the generalized inverse Gaussian distribution, see Koudou and Ley (2014a), Theorem 3.2, and to the connection with the Stein operator for the special case (p, a, b) = (−1/2, λ/µ², λ), using the authors' notation.
Our novel testing procedure is motivated by Theorem 1.1: We estimate both sides of (3) by their empirical counterparts and then calculate the weighted $L^2$-distance of the difference. We choose this distance since $L^2$-type statistics are widely used in goodness-of-fit testing, see Baringhaus et al. (2017). In this spirit, we propose the statistic T_n, with Y_{n,j} = X_j/µ_n, where µ_n, λ_n are consistent estimators of µ, λ and ϕ_n = λ_n/µ_n. The function w(t) is a positive weight function satisfying the conditions in (4), where $\xrightarrow{P}$ denotes convergence in probability (as n → ∞). Note that, when suitable weight functions are chosen, we have numerically stable versions for the calculation of T_n avoiding numerical integration, see Section 7.1. In particular, we use the weights $w_a(t) = e^{-at}$ and $\widetilde{w}_a(t) = e^{-at^2}$, t > 0, with a tuning parameter a > 0. Both satisfy the conditions in (4). A proof of this fact is given by Betsch and Ebner (2019b) for the first weight function, and the argument for the second weight is very similar, so we do not discuss it here. Since the family IG is closed under scale transformations, the test should reflect this property. Thus we only consider scale equivariant estimators µ_n of µ, i.e. we have µ_n(βX_1, \ldots, βX_n) = β µ_n(X_1, \ldots, X_n), β > 0.
With this type of estimators it is straightforward to show that T_n is invariant under scale transformations of the data, as it depends only on the (scale invariant) Y_{n,j}, j = 1, \ldots, n, and on ϕ_n. H_0 in (2) is rejected for large values of T_n.
The paper is structured as follows. In Section 2 we present two estimation procedures in conjunction with the IG family as well as asymptotic representations of the estimators needed in the subsequent theory. Section 3 gives theoretical derivations of asymptotic results under the null hypothesis, and Section 4 summarizes the behavior under contiguous alternatives. A limit result under a large class of alternatives is derived in Section 5. In Section 6 we explain the implementation of the method via a parametric bootstrap procedure and prove consistency of this bootstrap-based test. We finalize the article with a Monte Carlo power study in Section 7, an application to observed data examples in Section 8, and we draw conclusions and indicate open questions in Section 9.

Estimation of the parameters and asymptotic representations
In this section we consider two suitable estimation methods which satisfy the requirement of scale equivariance, namely the maximum likelihood (ML) and the moment (MO) estimators. For details about the estimation procedures, we refer to Johnson et al. (1995), Chapter 15, and Seshadri (1993), Chapter 6.
To account for the bootstrap procedure used to obtain critical values, we later on study the asymptotic behavior of T_n under a triangular array X_{n,1}, \ldots, X_{n,n} of rowwise iid. random variables, where X_{n,1} ∼ IG(1, ϕ_n) for a sequence of positive numbers (ϕ_n)_{n ∈ N} with lim_{n→∞} ϕ_n = ϕ > 0. Notice that in the following we assume, without loss of generality and in view of the scale invariance of the test statistic, that µ = 1. We write o_P(1) and O_P(1) for (real-valued) random variables that converge to 0 in probability or that are bounded in probability, respectively. For both methods we need expansions of the form (5), where ε_{n,j} = o_P(1), j = 1, 2, and Ψ_j are measurable functions such that the random variables Ψ_1(X_{n,1}) and Ψ_2(X_{n,1}, ϕ_n) are centered with existing second moment, and where X ∼ IG(1, ϕ).
1. Maximum likelihood estimators: Standard calculations show that
$$\mu_n = \frac{1}{n}\sum_{j=1}^{n} X_{n,j} = \bar X_n \quad\text{and}\quad \lambda_n = \left(\frac{1}{n}\sum_{j=1}^{n}\left(\frac{1}{X_{n,j}} - \frac{1}{\bar X_n}\right)\right)^{-1}$$
are the ML estimators of µ and λ. The asymptotic expansions as in (5) are derived as
$$\sqrt{n}\,(\mu_n - 1) = \frac{1}{\sqrt n}\sum_{j=1}^{n}\left(X_{n,j} - 1\right) + o_P(1), \quad \sqrt{n}\,(\lambda_n - \varphi_n) = \frac{1}{\sqrt n}\sum_{j=1}^{n}\left(\varphi_n + 2\varphi_n^{2} - \varphi_n^{2}\big(X_{n,j} + X_{n,j}^{-1}\big)\right) + o_P(1). \qquad (6)$$
2. Moment estimators: The moment estimators based on the first two moments of the inverse Gaussian distribution are $\mu_n = \bar X_n$ and $\lambda_n = \mu_n^{3} \big/ \big(\tfrac1n\sum_{j=1}^{n} X_{n,j}^{2} - \mu_n^{2}\big)$, with the same asymptotic expansion for µ_n as in the case of ML estimation, and
$$\sqrt{n}\,(\lambda_n - \varphi_n) = \frac{1}{\sqrt n}\sum_{j=1}^{n}\left(-\varphi_n^{2} X_{n,j}^{2} + \big(3\varphi_n + 2\varphi_n^{2}\big) X_{n,j} - \varphi_n(2 + \varphi_n)\right) + o_P(1). \qquad (7)$$
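The two estimators can be sketched as follows (a minimal illustration with our own function names; bias corrections of the $\frac{n-1}{n}$ type discussed below are omitted):

```python
def ig_ml_estimates(x):
    """Maximum likelihood estimators (mu_n, lambda_n) for the IG family."""
    n = len(x)
    mu = sum(x) / n
    # lambda_n = ( (1/n) * sum( 1/X_j - 1/mean(X) ) )^{-1}
    lam = n / sum(1.0 / xj - 1.0 / mu for xj in x)
    return mu, lam

def ig_mo_estimates(x):
    """Moment estimators based on the first two moments of IG."""
    n = len(x)
    mu = sum(x) / n
    m2 = sum(xj**2 for xj in x) / n
    lam = mu**3 / (m2 - mu**2)   # since Var(IG(mu, lam)) = mu^3 / lam
    return mu, lam
```

Both estimators are scale equivariant: rescaling the data by β > 0 rescales µ_n and λ_n by β, which is the property required for the scale invariance of T_n.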
Remark 2.1. Note that if X, X_1, X_2, \ldots are any iid. positive random variables such that E[X + X^{-1}] < ∞ (and E[X²] < ∞ in case of the moment estimators), the definitions and asymptotic expansions above remain essentially the same, only X_{n,j} is replaced by X_j, and ϕ_n by ϕ (with ϕ defined as the almost sure limit of λ_n). To see that the asymptotic expansions continue to hold, notice that by the scale invariance of our test statistic, we may assume that E[X] = 1, hence µ_n → 1 P-almost surely (a.s.) and Ψ_1(X_j) = X_j − 1 is centered. Similarly, using that λ_n → ϕ P-a.s., as n → ∞, for the ML or MO estimator, respectively, the expansions (6) and (7) are seen to remain valid with the mentioned replacements.
For the ML estimators also note that, since the IG family is a 2-parameter general exponential family, the statistic $\big(\sum_{j=1}^{n} X_j, \sum_{j=1}^{n} X_j^{-1}\big)$ is minimal sufficient and complete. An application of the Lehmann–Scheffé theorem shows that µ_n and $\frac{n-1}{n}\lambda_n$ are uniformly minimum variance unbiased estimators, see Seshadri (1993). Note that $\frac{n-1}{n}\lambda_n$ is the same estimator as the one considered by Bourguignon and Saulo (2018). In principle, any other estimation method which yields scale equivariant estimators that admit asymptotic expansions as above can be considered as well.

The limit null distribution
A suitable setting to derive asymptotic results for the test is $L^2 = L^2\big((0, \infty), \mathcal{B}(0,\infty), w(t)\,dt\big)$, the Hilbert space of (equivalence classes of) measurable functions $g : (0,\infty) \to \mathbb{R}$ with $\int_0^\infty g(t)^2\, w(t)\,dt < \infty$. The inner product and norm in $L^2$ are denoted by $\langle g_1, g_2\rangle_{L^2} = \int_0^\infty g_1(t)\, g_2(t)\, w(t)\,dt$ and $\|g\|_{L^2} = \langle g, g\rangle_{L^2}^{1/2}$, for $g, g_1, g_2 \in L^2$. We assume the triangular array X_{n,1}, \ldots, X_{n,n} from Section 2. In particular, recall that X_{n,1} ∼ IG(1, ϕ_n) with ϕ_n → ϕ > 0, and define the empirical process V_n for one of the two types of estimators µ_n and λ_n as in Section 2. Note that V_n(·) is a random element of $L^2$, and after a simple change of variable we have $T_n = \|V_n\|_{L^2}^2$. The first step of the proof of Theorem 3 by Betsch and Ebner (2019b) yields the following result [using assumption (4)].
Tedious but straightforward approximations yield the following relation.
As these calculations provide no essential insights, we omit them here and refer to Betsch and Ebner (2019b), where similar arguments are used to prove Theorem 3 of their work. It is an easy corollary of Lemma 3.1 that the test statistic can be written as follows.
Next, we write V*_n as a sum of iid. random elements of $L^2$, and use a central limit theorem for triangular arrays to conclude that V*_n is a tight sequence in $L^2$ (thus rendering Corollary 3.3 applicable) and to derive the limit distribution. To this end, notice that we obtain from (5) an asymptotic representation of V*_n as $n^{-1/2}\sum_{j=1}^{n} W_{n,j}$ plus a negligible remainder, with the W_{n,j} defined accordingly. Notice that W_{n,1}, \ldots, W_{n,n} are iid. random elements of $L^2$ with E[W_{n,1}] = 0 (by Theorem 1.1) and $E\|W_{n,1}\|_{L^2}^2 < \infty$ (using the assumptions on Ψ_1, Ψ_2, and w). This asymptotic representation leads to the following theorem, which gives the limit null distribution of T_n.
Theorem 3.4. Under the standing assumptions there exists a centered Gaussian element W of $L^2$ with covariance kernel $K_\varphi(s, t)$ such that $T_n \xrightarrow{D} \|W\|_{L^2}^2$.

Proof. We verify condition (a) from Lemma 3.1 and condition (ii) from Remark 3.3 of Chen and White (1998) to apply the corresponding central limit theorem for triangular arrays in Hilbert spaces to the array $\{W_{n,j} : j \in \{1, \ldots, n\},\ n \in \mathbb{N}\}$. As argued above, these random elements are centered and have finite second moment. Moreover, the limit exists and is finite. The boundedness is seen by checking that the 'lim sup' of the sequence is finite, using the existence of moments of inverse Gaussian random variables, the assumptions on the weight function, and the asymptotic expansions of the estimators, as well as Hölder's inequality. That very calculation also justifies the exchange of the limit 'lim_{n→∞}' with the first integral. Then, to see that the limit in (8) exists, the convergence of the term $\lim_{n→∞} E\big[W_{n,1}(s)^2\big]$ is shown by proving that the separate terms in the expectation (after carrying out the square) are uniformly integrable, utilizing the explicit form of Ψ_1, Ψ_2 for the ML or MO estimators, and the fact that $\sup_{n \in \mathbb{N}} E[X_{n,1}^{8}] < \infty$, $\sup_{n \in \mathbb{N}} E[X_{n,1}^{-4}] < \infty$. The classical Lindeberg central limit theorem for arrays implies asymptotic normality of $n^{-1/2}\sum_{j=1}^{n} \langle W_{n,j}, g\rangle_{L^2}$ for any $g \in L^2 \setminus \{0\}$, with limiting variance $\sigma^2_\varphi(g) = \lim_{n→∞} E\big[\langle W_{n,1}, g\rangle_{L^2}^2\big]$, which is finite by the finiteness of the term in (8). From the limit theorem by Chen and White (1998) we obtain the convergence $V^{**}_n = n^{-1/2}\sum_{j=1}^{n} W_{n,j} \xrightarrow{D} W$, where W is as in the statement of the theorem. In particular, $V^{**}_n$ is a tight sequence in $L^2$, and Corollary 3.3 implies the claim.

Contiguous alternatives
In this section we summarize the asymptotic behavior under contiguous alternatives to a fixed representative of IG. As the proof of the subsequent results is almost identical to the arguments used by Betsch and Ebner (2019b) in Theorem 4, we omit the details and only state the results. Let f_0 denote the density of the IG(1, ϕ) distribution for a fixed ϕ > 0, and let $c : (0, \infty) \to \mathbb{R}$ be a bounded measurable function with $\int_0^\infty c(x)\, f_0(x)\,dx = 0$. Further, let Z_{n,1}, \ldots, Z_{n,n}, n ∈ N, be a triangular array of rowwise iid. random variables with Lebesgue density $f_n(x) = f_0(x)\big(1 + c(x)/\sqrt{n}\big)$, x > 0, where we assume n to be large enough to ensure the non-negativity of f_n. The n-fold product measure of f_n(x)\,dx is contiguous to the n-fold product measure of f_0(x)\,dx in the sense of Le Cam (1960).
Theorem 4.1. Under the stated assumptions, T_n, computed from Z_{n,1}, \ldots, Z_{n,n}, converges in distribution to the squared $L^2$-norm of W shifted by a deterministic drift determined by c, where W is the centered Gaussian element of $L^2$ from Theorem 3.4 and the function g_ϕ(·, ·) entering the drift is related to the covariance kernel of W as in the previous section.
We thus conclude that our test has non-trivial power against contiguous alternatives of the considered kind.

A functional law of large numbers
In the next section we explain how the test is implemented using a parametric bootstrap procedure. We also show that Theorem 3.4, that is, the result on the limit null distribution from Section 3, can be used to prove that the bootstrap is asymptotically valid. However, we would also like to show that the bootstrap-based test is consistent against fixed alternatives. To achieve this goal, we provide another limit result, this time considering the setting under alternative distributions. Let X be any positive random variable with E[X] < ∞ and E[X^{-1}] < ∞ (and E[X²] < ∞ if moment estimators are used), and let X_1, X_2, \ldots be iid. copies of X. We assume that either the ML or the moment estimators are considered, and we thus have, from Remark 2.1, that (µ_n, λ_n) → (µ, λ) P-a.s., as n → ∞, for some µ, λ > 0 (where we can specify µ = E[X]). In view of the scale invariance of T_n, we assume µ = E[X] = 1, and have ϕ_n = λ_n/µ_n → λ/µ = λ =: ϕ P-a.s., as n → ∞, where ϕ may be specified as in Remark 2.1.
With multiple use of the triangle inequality, $\big(\|V_n\|_{L^2}^2\big)_{n \in \mathbb{N}}$ is easily seen to be a tight sequence of random variables.
A statement identical to Lemma 3.1 yields a product bound from which Theorem 5.1 follows. The first term in the product on the right-hand side of this inequality is tight, while the second term is seen to converge to 0 almost surely by an easy calculation and standard Glivenko–Cantelli arguments (see Betsch and Ebner (2019b), Theorem 5, for some further insight).
Because of Theorem 1.1, the limit ∆_ϕ in Theorem 5.1 is 0 if, and only if, X follows the inverse Gaussian distribution IG(1, ϕ), and it is strictly greater than 0 otherwise.

The bootstrap procedure and a consistency result
In practice, to carry out the test at some level α ∈ (0, 1), we suggest a parametric bootstrap procedure, as the distribution of the test statistic depends on an unknown parameter and is very complicated. The approach is as follows. Given a sample X_1, \ldots, X_n of iid. positive random variables with E[X_1 + X_1^{-1}] < ∞ (and E[X_1²] < ∞ when moment estimation is used), calculate T_n(X_1, \ldots, X_n), using for instance the explicit formulae from Section 7.1. Also calculate the estimators µ_n(X_1, \ldots, X_n) and λ_n(X_1, \ldots, X_n), and put ϕ_n(X_1, \ldots, X_n) = λ_n(X_1, \ldots, X_n)/µ_n(X_1, \ldots, X_n). Conditional on this value of ϕ_n, generate b samples of size n from the IG(1, ϕ_n)-distribution, and calculate the test statistic for each of them, with the parameters estimated from the bootstrap sample in every instance. This yields the values T*_{n,1}, \ldots, T*_{n,b}. Define the empirical distribution function of T*_{n,1}, \ldots, T*_{n,b}, and let c*_{n,b}(α) denote the corresponding empirical (1 − α)-quantile, which serves as the bootstrap critical value. Denote by F_ϕ the distribution function of $\|W\|_{L^2}^2$ under the IG(1, ϕ)-distribution as in Theorem 3.4, where ϕ is the (almost sure) limit of ϕ_n which exists by Remark 2.1. It is straightforward to adapt the methods from Henze (1996), applying Theorem 3.4, to prove that the bootstrap critical values converge to the corresponding quantile of F_ϕ. We conclude that a given level of significance is attained in the limit, as both the sample size and the chosen bootstrap size approach infinity. The bootstrap procedure is thus asymptotically valid.

Now suppose that X_1 from above does not follow any inverse Gaussian law. Then the limit ∆_ϕ figuring in Theorem 5.1 is strictly positive. Consequently, Theorem 5.1 and the results on the bootstrap critical values above imply P(T_n(X_1, \ldots, X_n) > c*_{n,b}(α)) → 1, as n, b → ∞, that is, our test is consistent (in the bootstrap setting) against any such alternative distribution. We suggest using the above bootstrap procedure when the test is applied in practice.
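The bootstrap procedure described above can be sketched as follows. This is an illustrative skeleton only: the function `statistic` is a placeholder for the scale-invariant statistic T_n of the paper (here, in the test below, a simple moment-based stand-in of our own), the IG sampler follows the standard Michael-Schucany-Haas method, and all names are our own.

```python
import math
import random

def ig_sample(mu, lam, rng):
    """One draw from IG(mu, lam) (Michael-Schucany-Haas method)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + mu**2 * y / (2.0 * lam) - \
        mu / (2.0 * lam) * math.sqrt(4.0 * mu * lam * y + mu**2 * y**2)
    return x if rng.random() <= mu / (mu + x) else mu**2 / x

def ml_estimates(x):
    """ML estimators (mu_n, lambda_n) for the IG family."""
    mu = sum(x) / len(x)
    lam = len(x) / sum(1.0 / v - 1.0 / mu for v in x)
    return mu, lam

def bootstrap_test(x, statistic, alpha=0.1, b=500, seed=1):
    """Parametric bootstrap test of H0: the data follow some IG law.

    `statistic` maps a scaled sample (Y_1, ..., Y_n) and phi_n to a real
    value; it stands in for the statistic T_n of the paper.
    """
    rng = random.Random(seed)
    n = len(x)
    mu, lam = ml_estimates(x)
    phi = lam / mu
    t_obs = statistic([v / mu for v in x], phi)
    t_star = []
    for _ in range(b):
        # bootstrap sample from IG(1, phi_n), parameters re-estimated each time
        xs = [ig_sample(1.0, phi, rng) for _ in range(n)]
        mu_s, lam_s = ml_estimates(xs)
        t_star.append(statistic([v / mu_s for v in xs], lam_s / mu_s))
    # empirical (1 - alpha)-quantile of the bootstrap statistics
    crit = sorted(t_star)[int(math.ceil((1.0 - alpha) * b)) - 1]
    return t_obs, crit, t_obs > crit
```

The null hypothesis is rejected when the observed statistic exceeds the bootstrap critical value, exactly as in the procedure above.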
For the extensive power study in the following section, this procedure becomes very demanding due to the high number of Monte Carlo runs. To accelerate the computations, we employ in that simulation study the so-called warp-speed bootstrap, see Giacomini et al. (2013), as explained in Section 7.

Monte Carlo study
This section compares the finite sample power performance of the newly proposed test to that of competing tests for the inverse Gaussian distribution. Below, we consider the computational form of our test statistic. Then various existing goodness-of-fit tests for the inverse Gaussian distribution are discussed. Finally, the power calculations, including the considered alternative distributions and the warp-speed bootstrap methodology, are detailed. Sample sizes of 30 and 50 are considered throughout.

Computational form
We start with computationally stable representations of the test statistic for the weight functions $w_a(t) = \exp(-at)$ and $\widetilde w_a(t) = \exp(-at^2)$, $t > 0$. For $w_a$ we define
$$h_{1,a}(s, t) = \begin{cases} \dfrac{2 - e^{-as}(as+1)^2}{a^3} + \dfrac{1+s}{a^2}\left(e^{-as}(as+1) - e^{-at}(at+1)\right) + \dfrac{st}{a}\, e^{-at}, & s \le t,\\[6pt] \dfrac{2 - e^{-at}(at+1)^2}{a^3} + \dfrac{1+t}{a^2}\left(e^{-at}(at+1) - e^{-as}(as+1)\right) + \dfrac{st}{a}\, e^{-as}, & s > t, \end{cases}$$
together with a second kernel whose branch for $s > t$ reads $\frac{1}{a^2}\left(e^{-at}(at+1) - e^{-as}(as+1)\right) + \frac{s}{a}\, e^{-as}$, and get a numerically stable version of the test statistic, $T_{n,a} = \frac{1}{4n} \sum_{j,k=1}^{n} \cdots$, where Y_{n,1}, \ldots, Y_{n,n} and ϕ_n are as in Section 1. For the second weight, $\widetilde w_a$, we define analogous kernels in terms of Φ, the distribution function of the standard normal distribution. Then we have a numerically stable version of the corresponding test statistic, namely $\widetilde T_{n,a} = \frac{1}{4n} \sum_{j,k=1}^{n} \cdots$.

Existing tests of fit for the inverse Gaussian distribution
The class of competing tests we consider comprises several classical tests as well as more recent tests. We choose the following procedures:
1. the Kolmogorov–Smirnov test,
2. the Cramér–von Mises test,
3. the Anderson–Darling test,
4. two tests proposed by Henze and Klar (2002),
5. a recent test by Villaseñor and González-Estrada (2015).
Below, we briefly provide the details of these tests. The first three tests are well-known and we only provide the computational form of the test statistic in each case. The remaining three tests are considered in more detail. The test by Villaseñor and González-Estrada (2015) is very recent, while the tests by Henze and Klar (2002) are included due to their impressive power performance in previous empirical studies. For a recent literature overview concerning the existing tests for the inverse Gaussian distribution, see Koutrouvelis and Karagrigoriou (2012).

Classical tests
Let X_{(j)} denote the j-th order statistic of X_1, \ldots, X_n, and let $\hat F(x) = F(x; \mu_n, \lambda_n)$, where F is the distribution function of the inverse Gaussian distribution. For each of the following tests, the null hypothesis is rejected for large values of the test statistic.
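The three classical EDF statistics can be sketched in their usual computational forms, applied to the fitted values $\hat F(X_{(j)})$. The sketch below is our own illustration (not the authors' code); it uses the well-known closed form of the IG distribution function in terms of the standard normal cdf.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ig_cdf(x, mu, lam):
    """Closed form of the IG(mu, lam) distribution function."""
    a = math.sqrt(lam / x)
    return norm_cdf(a * (x / mu - 1.0)) + \
        math.exp(2.0 * lam / mu) * norm_cdf(-a * (x / mu + 1.0))

def edf_statistics(x, mu, lam):
    """Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling statistics
    for the fitted IG(mu, lam), in their usual computational forms."""
    n = len(x)
    u = sorted(ig_cdf(v, mu, lam) for v in x)          # fitted F(X_(j))
    ks = max(max((j + 1) / n - u[j], u[j] - j / n) for j in range(n))
    cvm = 1.0 / (12 * n) + sum((u[j] - (2 * j + 1) / (2 * n)) ** 2
                               for j in range(n))
    ad = -n - sum((2 * j + 1) * (math.log(u[j]) + math.log(1.0 - u[n - 1 - j]))
                  for j in range(n)) / n
    return ks, cvm, ad
```

In practice µ_n and λ_n are plugged in for (mu, lam), and critical values are obtained by the parametric bootstrap of Section 6.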

Tests proposed by Henze and Klar (2002)
Henze and Klar (2002) proposed two classes of tests for the inverse Gaussian distribution based on the empirical Laplace transform. The Laplace transform of the IG(µ, λ) distribution is given by
$$L(t) = \exp\left(\frac{\lambda}{\mu}\left(1 - \sqrt{1 + \frac{2\mu^2 t}{\lambda}}\right)\right), \quad t \ge 0.$$
As a result, the Laplace transform of the inverse Gaussian distribution satisfies the characteristic differential equation
$$\mu L(t) + \sqrt{1 + \frac{2\mu^2 t}{\lambda}}\; L'(t) = 0, \qquad (9)$$
subject to the initial condition L(0) = 1. The empirical Laplace transform is given by $L_n(t) = \frac{1}{n}\sum_{j=1}^{n} e^{-tX_j}$. Under the assumption that X_1, \ldots, X_n are realized from an inverse Gaussian distribution, (9) suggests that
$$\varepsilon_n(t) = \mu_n L_n(t) + \sqrt{1 + \frac{2\mu_n^2 t}{\lambda_n}}\; L_n'(t)$$
is close to zero for each value of t. The proposed class of test statistics is thus a weighted integral of $\varepsilon_n(t)^2$, where µ_n and λ_n denote the maximum likelihood estimates of µ and λ, respectively, and a ≥ 0 is a tuning parameter. Due to the intractability of some of the calculations required for the implementation of the test statistic, the authors recommend the use of the exponentially scaled complementary error function, $\mathrm{erfce}(x) = \exp(x^2)\,\mathrm{erfc}(x)$, rather than the distribution function of a Gaussian random variable in the computation of the test statistic. Note that erfce(x) tends to ∞ as x → −∞. Furthermore, for sufficiently large values of x, the naive numerical evaluation of erfce(x) breaks down, since the value of the integral and the exponential function are calculated to be 0 and ∞, respectively, by all standard statistical software packages. The latter difficulty can be overcome by noting that lim_{x→∞} erfce(x) = 0.
As a result, the use of the erfce function reduces the numerical problems encountered when implementing the tests proposed in Henze and Klar (2002) without removing these difficulties altogether.
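A numerically stable evaluation of erfce(x) = exp(x²) erfc(x) along these lines can be sketched as follows. The sketch is our own: it switches to the leading terms of the standard asymptotic series erfce(x) ≈ (1/(x√π))(1 − 1/(2x²) + 3/(4x⁴)) once the naive product would become unreliable; the cutoff 25 is our choice, not taken from the paper.

```python
import math

def erfce(x):
    """Exponentially scaled complementary error function exp(x^2) * erfc(x).

    For large x the naive product overflows (exp) and underflows (erfc),
    so we switch to the asymptotic series erfce(x) ~ 1 / (x * sqrt(pi)).
    """
    if x < 25.0:
        return math.exp(x * x) * math.erfc(x)
    inv2 = 1.0 / (x * x)
    return (1.0 - 0.5 * inv2 + 0.75 * inv2 * inv2) / (x * math.sqrt(math.pi))
```

This respects the limit erfce(x) → 0 as x → ∞ mentioned above and avoids the 0 · ∞ breakdown of the naive evaluation.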
Letting ϕ_n = λ_n/µ_n and Y_{n,j} = X_j/µ_n as in Section 1, and Z_{jk} = ϕ_n (Y_{n,j} + Y_{n,k} + a), the test statistic can be expressed in a tractable form.

The null hypothesis is rejected for large values of $HK^{(1)}_{n,a}$. Based on the power performance of this class of tests, the authors recommend the use of a = 0; we follow this recommendation for the numerical results shown below. Henze and Klar (2002) also proposed a second, more immediate, class of tests based on the empirical Laplace transform. This second class of tests is based on the difference between the Laplace transform of the IG(µ_n, λ_n) distribution (denoted by L) and the empirical Laplace transform, for some tuning parameter a ≥ 0. Two distinct computational forms for the test statistic are obtained, distinguishing the cases a = 0 and a > 0. Again, the recommended value of the tuning parameter is a = 0. In this case, the test statistic can be expressed as

$HK^{(2)}_{n,0} = \frac{1}{n} \sum_{j,k=1}^{n} \cdots$, where Z_{jk} = Y_{n,j} + Y_{n,k}. The hypothesis that the data is realized from an inverse Gaussian distribution is rejected for large values of the test statistic.

The test of Villaseñor and González-Estrada (2015)
Villaseñor and González-Estrada (2015) introduced three goodness-of-fit tests for the inverse Gaussian distribution, the most powerful of which is discussed below. Using the same notation as the mentioned authors, it can be shown that if X ∼ IG(µ, λ), then Z(µ) = (X − µ)²/X follows a Gamma distribution with shape parameter 1/2 and scale parameter 2µ²/λ. A goodness-of-fit test for the inverse Gaussian distribution can be constructed using the moment estimator of Cov(X, Z(µ)), where µ is estimated by µ_n = X̄_n. The suggested test statistic VG_n can be written in terms of S²_n, the sample variance of X_1, \ldots, X_n, and the ML estimator λ_n of λ. The asymptotic distribution of VG_n is standard normal, and the null hypothesis is rejected for large values of |VG_n|.
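The distributional identity behind this test, namely that λ(X − µ)²/(µ²X) is chi-square with one degree of freedom (equivalently, (X − µ)²/X ~ Gamma(1/2, 2µ²/λ)), can be checked by simulation. The sketch below is our own (sampler included for self-containment), not the authors' code.

```python
import math
import random

def ig_sample(mu, lam, rng):
    """One draw from IG(mu, lam) (Michael-Schucany-Haas method)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + mu**2 * y / (2.0 * lam) - \
        mu / (2.0 * lam) * math.sqrt(4.0 * mu * lam * y + mu**2 * y**2)
    return x if rng.random() <= mu / (mu + x) else mu**2 / x

rng = random.Random(42)
mu, lam = 2.0, 3.0
# lam * (X - mu)^2 / (mu^2 * X) should be chi-square(1): mean 1, variance 2
z = [lam * (x - mu) ** 2 / (mu**2 * x)
     for x in (ig_sample(mu, lam, rng) for _ in range(50000))]
m = sum(z) / len(z)
v = sum((u - m) ** 2 for u in z) / len(z)
```

The empirical mean and variance of the transformed sample should settle near 1 and 2, the moments of the chi-square distribution with one degree of freedom.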

Table 1 shows the alternative distributions considered in the empirical study below. Each of the listed distributions is investigated for various parameter values. The powers of the tests against the inverse Gaussian distribution with mean parameter 1 and shape parameter θ, denoted in Tables 6 and 7 by IG(θ), are also calculated for several values of θ in order to evaluate the empirical size of all competing tests.

Power calculations
Since the null distribution of the test statistic depends on the unknown value of the shape parameter, we use a parametric bootstrap, as explained in Section 6, to calculate critical values for the test statistics under consideration. Given that the parametric bootstrap is computationally demanding, we employ the warp-speed method proposed by Giacomini et al. (2013) to approximate the power of all the considered tests. Denote the number of Monte Carlo replications by MC, and recall that this method capitalizes on the repetition inherent in the Monte Carlo simulation to produce bootstrap replications, rather than relying on a separate 'bootstrap loop'. The procedure can be summarized as follows.
1. Obtain a sample X_1, \ldots, X_n from a distribution, say F, and estimate µ and λ by µ_n and λ_n, respectively.
2. Compute ϕ_n = λ_n/µ_n, scale the sample via Y_{n,j} = X_j/µ_n, j = 1, \ldots, n, and calculate the test statistic S = S(Y_{n,1}, \ldots, Y_{n,n}).
3. Generate a bootstrap sample X*_1, \ldots, X*_n by independently sampling from IG(1, ϕ_n). Also calculate µ*_n = µ_n(X*_1, \ldots, X*_n) and λ*_n = λ_n(X*_1, \ldots, X*_n).
4. Scale the values in the bootstrap sample using Y*_{n,j} = X*_j/µ*_n, j = 1, \ldots, n, and determine the value of the test statistic for these scaled bootstrap values, that is, S* = S(Y*_{n,1}, \ldots, Y*_{n,n}).
5. Repeat steps 1 to 4 MC times to obtain S_1, \ldots, S_{MC} and S*_1, \ldots, S*_{MC}, where S_m denotes the value of the test statistic calculated from the m-th scaled sample generated in step 1, and S*_m denotes the value of the bootstrap test statistic calculated from the single scaled bootstrap sample obtained in the m-th iteration of the Monte Carlo simulation.

The warp-speed methodology described above is a computationally efficient alternative to the classical parametric bootstrap procedure (with a separate 'bootstrap loop') usually employed in power calculations in the presence of a shape parameter. The latter method is detailed and implemented in Section 5 of Henze and Klar (2002). Interestingly, extensive Monte Carlo simulations indicate that the power estimates obtained using the two bootstrap methods are almost identical for all but two of the test statistics used. The tests for which the power estimates obtained using the two methods differ are those proposed in Henze and Klar (2002). For these tests, the classical parametric bootstrap approach typically provides power estimates that are higher than those obtained using the warp-speed methodology. The differences observed between the two sets of estimated powers are ascribed to the different ways in which the numerical difficulties pointed out in Section 7.2.2 manifest in the two methods.
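The warp-speed loop above (one bootstrap statistic per Monte Carlo replication, with rejections judged against a common quantile of the bootstrap statistics) can be sketched as follows. This is a toy illustration: the statistic S supplied to it is a placeholder of our own, not T_n, and all names are ours.

```python
import math
import random

def ig_sample(mu, lam, rng):
    """One draw from IG(mu, lam) (Michael-Schucany-Haas method)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + mu**2 * y / (2.0 * lam) - \
        mu / (2.0 * lam) * math.sqrt(4.0 * mu * lam * y + mu**2 * y**2)
    return x if rng.random() <= mu / (mu + x) else mu**2 / x

def mo_estimates(x):
    """Moment estimators (mu_n, lambda_n) for the IG family."""
    n = len(x)
    mu = sum(x) / n
    m2 = sum(v * v for v in x) / n
    return mu, mu**3 / (m2 - mu**2)

def warp_speed_power(sample_h1, stat, n, mc=2000, alpha=0.1, seed=0):
    """Rejection-rate estimate via the warp-speed bootstrap of
    Giacomini et al. (2013): one bootstrap replication per MC run."""
    rng = random.Random(seed)
    s, s_star = [], []
    for _ in range(mc):
        x = [sample_h1(rng) for _ in range(n)]              # step 1
        mu, lam = mo_estimates(x)
        phi = lam / mu
        s.append(stat([v / mu for v in x], phi))            # statistic S
        xb = [ig_sample(1.0, phi, rng) for _ in range(n)]   # step 3
        mub, lamb = mo_estimates(xb)
        s_star.append(stat([v / mub for v in xb], lamb / mub))  # step 4
    crit = sorted(s_star)[int(math.ceil((1.0 - alpha) * mc)) - 1]
    return sum(v > crit for v in s) / mc                    # rejection rate
```

When `sample_h1` draws from an inverse Gaussian law, the returned rejection rate should be close to the nominal level α; under an alternative it estimates the power.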
Tables 6 and 7 at the end of the paper show the estimated powers obtained by the warp-speed bootstrap methodology with 50 000 replications for sample sizes 30 and 50. The entries show the percentage of samples for which the null hypothesis was rejected, rounded to the closest integer. The nominal significance level is set to 10% throughout. In order to ease comparison, the highest power against each alternative distribution is printed in bold in the tables. We divide our new tests into four categories, depending on the weight function and estimation technique. More precisely, we implement the two weight functions $w_a$ and $\widetilde w_a$ as in Sections 1 and 7.1, distinguishing the resulting tests by the tilde notation, and use upper indices ML and MO to indicate the use of maximum likelihood or moment estimators, respectively. For each of the resulting four classes, we consider three different values of the tuning parameter, namely a = 0.1, 1, 10. Several other values were also investigated, but due to the remarkable insensitivity of the tests with regard to the tuning parameter, we present only these three values in the numerical results. The mentioned insensitivity is particularly noticeable for the tests that employ the moment estimators.

Table 3: Analysis of the data example from Table 2.
When comparing the power results presented in Tables 6 and 7, some remarks are in order. Notice that each of the tests closely keeps the nominal level of 10% when the null hypothesis is true. Observing the powers associated with the existing tests for the inverse Gaussian distribution, it is clear that the power of the test by Villaseñor and González-Estrada (2015) is generally lower than the power of the remaining tests. Among the three classical procedures, the Anderson–Darling test performs best, while the HK_{n,0} test proposed by Henze and Klar (2002) produces the highest power among the more recent tests.
Turning our attention to a global comparison between the tests, we see that our newly proposed method outperforms the existing tests uniformly for sample size n = 30, and when n = 50, the new tests produce higher powers in 19 out of 20 cases. The only exception is the LN(3) distribution. Interestingly, the tests using the moment estimators outperform those based on the maximum likelihood estimates. Although the choice of the weight function has a smaller influence on the power than the estimation method, the test statistics which employ $w_a(t) = \exp(-at)$ as a weight outperform those that use $\widetilde w_a(t) = \exp(-at^2)$. Based on the numerical results, we recommend the statistic $T^{MO}_{n,10}$ for practical applications.

Practical application
We apply all tests from the simulation study in Section 7 to two real-world data examples. The first data set is from von Alven (1964). It was also analyzed by Gunes et al. (1997) and Henze and Klar (2002) with regard to the inverse Gaussianity hypothesis stated by Chhikara and Folks (1977). The data is recalled in Table 2, where n = 46 active repair times (in hours) for an airborne transceiver are provided. Table 3 shows the calculated value of each test statistic as well as the associated p-value. Each of the p-values is calculated using the classical parametric bootstrap approach from Section 5 of Henze and Klar (2002). Observing the p-values, none of the tests rejects the null hypothesis at a nominal significance level of 10%.
We now turn our attention to a second example, with data taken from Ang and Tang (1975), as analyzed by O'Reilly and Rueda (1992) and Henze and Klar (2002), where the inverse Gaussian distribution as an underlying model was again suggested by Folks and Chhikara (1978). The n = 25 recorded measurements correspond to precipitation (in inches) at Jug Bridge, Maryland. Table 4 provides the data itself, while Table 5 shows the values of the different test statistics as well as the associated p-value for each test.

Table 5: Analysis of the data example from Table 4.

In contrast to the previous example, four of the tests reject the null hypothesis at a nominal significance level of 10%. This casts some doubt on whether the data in question was realized from an inverse Gaussian law. Note that the p-values associated with the newly proposed tests are substantially lower than those associated with the existing tests.

Conclusions
Starting with the characterization of the inverse Gaussian distribution which results as a special case of the general identities provided by Betsch and Ebner (2019a), we set out to construct a goodness-of-fit test for inverse Gaussianity. The test is based on a weighted $L^2$-statistic, a class of statistics which is extensively studied and extraordinarily well understood. After introducing our new testing procedure, we recalled the maximum likelihood and moment estimators for the parameters of the inverse Gaussian distribution, focusing on the asymptotic expansions needed in the subsequent section to derive the limit null distribution. We briefly summarized the behavior of the novel statistic under contiguous alternatives to the hypothesis, and then turned to proving consistency. The main point of Section 6 was the proof of the consistency of the test in the bootstrap setting, thus taking into account precisely how the test is implemented in practice. In this endeavor, we also argued that the test based on the bootstrap critical values keeps its nominal level asymptotically. Monte Carlo simulations further indicate that the level is also kept for moderate, finite sample sizes (n = 30). Moreover, the power simulation study puts our method in strong favor, as it beats all competing tests virtually uniformly. In particular, the new test is more powerful than the other quite recent tests we considered, and it also outperforms the classical empirical distribution function based tests, except for one alternative distribution, for which the Cramér–von Mises and Anderson–Darling tests have higher power. We concluded the paper by considering a data example from reliability engineering and one data set from meteorology, for which we tested the hypothesis that the data sets were realized from some inverse Gaussian law.