A Multi-Aspect Permutation Test for Goodness-of-Fit Problems

Rosa Arboretti; Elena Barzizza; Nicolò Biasetton; Riccardo Ceccato; Livio Corain; Luigi Salmaso

doi:10.3390/stats5020035

¹ Department of Civil, Environmental and Architectural Engineering, University of Padova, 35131 Padova, Italy

² Department of Management and Engineering, University of Padova, 36100 Vicenza, Italy

* Author to whom correspondence should be addressed.

Stats 2022, 5(2), 572–582; https://doi.org/10.3390/stats5020035

This article belongs to the Special Issue Re-sampling Methods for Statistical Inference of the 2020s

Abstract

Parametric techniques commonly rely on specific distributional assumptions. It is therefore fundamental to identify any violations of such assumptions beforehand, and appropriate testing procedures are required to deal with this goodness-of-fit (GoF) problem. This task can be quite challenging, especially with small sample sizes and multivariate data. Previous studies showed how a GoF problem can be easily represented through a traditional two-sample system of hypotheses. Following this idea, in this paper, we propose a multi-aspect permutation-based test to deal with multivariate goodness-of-fit, taking advantage of the nonparametric combination (NPC) methodology. A simulation study is then conducted to evaluate the performance of our proposal and to identify any critical scenarios. Finally, a real data application is considered.

Keywords: multi-aspect; NPC; goodness-of-fit

1. Introduction

In this study, we propose a permutation-based methodology, based on multi-aspect testing, to address the goodness-of-fit problems when the sample size is small and numeric multivariate data are available.

Parametric techniques rely on specific assumptions about the distribution of the population from which the sample is drawn. When such assumptions are violated, inference can be highly unreliable. For this reason, appropriate tests need to be conducted beforehand to detect any departure from the required distribution.

Considering the multivariate scenario, a few solutions have been proposed in the literature to evaluate multivariate normality.

For example, Mardia [1] proposed a pair of solutions based on multivariate versions of skewness and kurtosis measures. Let $x = \{x_{i j}, i = 1, \dots, n, j = 1, \dots, V\}$ be the sample of interest, where n is the sample size and V is the number of multivariate components, and let

$$Skew = \frac{1}{n^{2}} \sum_{i = 1}^{n} \sum_{j = 1}^{n} m_{i j}^{3}, \qquad Kurt = \frac{1}{n} \sum_{i = 1}^{n} m_{i i}^{2},$$

$$m_{i j} = (x_{i} - \bar{x})^{\prime} S^{- 1} (x_{j} - \bar{x}), \qquad S = \frac{1}{n} \sum_{i = 1}^{n} (x_{i} - \bar{x}) (x_{i} - \bar{x})^{\prime}, \qquad df = \frac{V (V + 1) (V + 2)}{6}.$$

Equation (1) displays the proposed skewness test, while the kurtosis test is reported in Equation (2).

$$\frac{n}{6} Skew \mathrel{\dot{\sim}} \chi_{df}^{2} \qquad (1)$$

$$Kurt \mathrel{\dot{\sim}} N \left( V (V + 2), \frac{8 V (V + 2)}{n} \right) \qquad (2)$$

Given that, for small samples, the asymptotic approximation can distort the type I error rate and reduce power, the author afterwards proposed a corrected version of the skewness test, using $\frac{n c}{6} Skew$ as the test statistic, where $c = \frac{(n + 1) (n + 3) (V + 1)}{n (n + 1) (V + 1) - 6}$.
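To make the definitions above concrete, the following is a minimal Python sketch of the uncorrected skewness and kurtosis statistics of Equations (1) and (2). It is only an illustration under stated assumptions: the function names are hypothetical, the matrix inverse is a plain Gauss-Jordan routine suitable for small V, and the small-sample correction is deliberately left out.

```python
import math

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def covariance_mle(rows):
    """S = (1/n) * sum (x_i - xbar)(x_i - xbar)', i.e. the divisor-n estimate."""
    n, v = len(rows), len(rows[0])
    xbar = mean_vector(rows)
    s = [[0.0] * v for _ in range(v)]
    for r in rows:
        d = [r[j] - xbar[j] for j in range(v)]
        for a in range(v):
            for b in range(v):
                s[a][b] += d[a] * d[b] / n
    return s

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small V)."""
    v = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(v)]
           for i, row in enumerate(matrix)]
    for col in range(v):
        pivot = max(range(col, v), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(v):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[v:] for row in aug]

def mardia_statistics(rows):
    """Return Skew, Kurt and the two standardized quantities of Equations (1)-(2)."""
    n, v = len(rows), len(rows[0])
    xbar = mean_vector(rows)
    s_inv = invert(covariance_mle(rows))
    centred = [[r[j] - xbar[j] for j in range(v)] for r in rows]
    # half[j] = S^{-1} (x_j - xbar), so m_ij is a plain dot product
    half = [[sum(s_inv[a][b] * d[b] for b in range(v)) for a in range(v)]
            for d in centred]

    def m(i, j):
        return sum(centred[i][a] * half[j][a] for a in range(v))

    skew = sum(m(i, j) ** 3 for i in range(n) for j in range(n)) / n ** 2
    kurt = sum(m(i, i) ** 2 for i in range(n)) / n
    return {
        "skew": skew,
        "kurt": kurt,
        "skew_stat": n * skew / 6,                 # compare with chi-square, df below
        "df": v * (v + 1) * (v + 2) // 6,
        "kurt_z": (kurt - v * (v + 2)) / math.sqrt(8 * v * (v + 2) / n),
    }
```

The value `skew_stat` would be referred to a chi-square distribution with `df` degrees of freedom and `kurt_z` to a standard normal; both approximations are asymptotic, which is the limitation discussed in the text.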

A small sample size indeed makes multivariate goodness-of-fit problems quite challenging, given that we cannot rely on asymptotic properties. To deal with such circumstances, Arboretti et al. [2] proposed a permutation-based method relying on the nonparametric combination (NPC) methodology [3]. Given a random sample $x$ from a population with distribution $f$ and a theoretical distribution $f_{0}$, the authors followed the approach suggested by Friedman [4], addressing a goodness-of-fit problem as a two-sample equality in distribution problem. They drew an additional sample $x_{0}$ from the theoretical distribution $f_{0}$, and used $x$ and $x_{0}$ to test whether $f = f_{0}$. Arboretti et al. [2] also recommended the use of permutation tests, highlighting their non-parametric, distribution-free nature and their power.
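The recasting of a goodness-of-fit problem as a two-sample problem can be sketched in a few lines of Python. This is an illustrative assumption of how the pooled data set might be assembled, not the authors' code; `build_pooled_sample` and `standard_normal_3d` are hypothetical names, and the standard normal reference distribution is chosen only for the example.

```python
import random

def build_pooled_sample(x, draw_from_f0, m, seed=None):
    """Draw m reference rows from the theoretical distribution and stack them under x.

    Returns the pooled rows and group labels (1 = observed, 0 = theoretical).
    """
    rng = random.Random(seed)
    x0 = [draw_from_f0(rng) for _ in range(m)]
    pooled = [list(r) for r in x] + [list(r) for r in x0]
    labels = [1] * len(x) + [0] * m
    return pooled, labels

def standard_normal_3d(rng):
    # Hypothetical theoretical distribution f0: three independent N(0, 1) components
    return [rng.gauss(0.0, 1.0) for _ in range(3)]

observed = [[0.2, -1.1, 0.5], [1.3, 0.4, -0.7], [-0.6, 0.9, 0.1]]
pooled, labels = build_pooled_sample(observed, standard_normal_3d, m=5, seed=1)
```

Any two-sample test of equality in distribution can then be applied to `pooled` with `labels` as the grouping variable.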

Considering a generic equality in distribution problem, there are many aspects that can determine a difference and need to be monitored. We can have completely different distribution functions, but also differences in location, in variability or in shape parameters. For this reason, multi-aspect permutation tests represent a solution worth considering. These tests follow the idea proposed by Fisher in 1947, who stated that different tests can be adopted to evaluate different aspects of the same null hypothesis [5,6]. Multi-aspect testing procedures are indeed aimed at simultaneously testing multiple features of the same $H_{0}$.

The NPC methodology allows us to easily extend such procedures to the multivariate scenario and thereafter to multivariate goodness-of-fit problems. In Section 2, we propose a possible extension, providing a detailed description of the underlying algorithm and of a possible competing technique. Then, Section 3 is devoted to the investigation of its performance through a simulation study. In Section 4, a real data application is proposed. Finally, in Section 5, we draw conclusions from the study.

2. Multi-Aspect Permutation Solution

Arboretti et al. [2] showed that goodness-of-fit problems can be easily converted into two-sample equality-in-distribution problems. For this reason, the nonparametric combination methodology can provide suitable and quite powerful solutions.

The NPC essentially requires three steps to be undertaken:

(i) The decomposition of the global system of hypotheses (see System (3)) into multiple sub-problems:

$$\begin{cases} H_{0} : X \overset{d}{=} X_{0} \\ H_{1} : X \overset{d}{\neq} X_{0} \end{cases} \qquad (3)$$

(ii) The application of partial test statistics to address each sub-problem and computation of partial p-values.

(iii) The combination of partial p-values using appropriate combining functions to finally compute a global p-value to test System (3).

In a multivariate scenario, the first step implies the following decomposition:

$$\begin{cases} H_{0} : \bigcap_{v = 1}^{V} H_{0 v} \\ H_{1} : \bigcup_{v = 1}^{V} H_{1 v} \end{cases} \quad \text{where} \quad \begin{cases} H_{0 v} : x_{v} \overset{d}{=} x_{0 v} \\ H_{1 v} : x_{v} \overset{d}{\neq} x_{0 v} \end{cases}$$

Here, we create a sub-system of hypotheses for each of the V components $x_{v}$ of the multivariate outcome.

On the other hand, NPC-based solutions for multi-aspect testing [7,8,9,10] also require an initial decomposition of the system of hypotheses, defining a sub-problem for each aspect to be considered. For the sake of simplicity, in this study, we focus on three aspects and report the related sub-systems:

– location:

$$\begin{cases} H_{0 μ} : μ = μ_{0} \\ H_{1 μ} : μ \neq μ_{0} \end{cases}$$

– variability:

$$\begin{cases} H_{0 σ} : σ = σ_{0} \\ H_{1 σ} : σ \neq σ_{0} \end{cases}$$

– cumulative distribution function:

$$\begin{cases} H_{0 F} : F = F_{0} \\ H_{1 F} : F \neq F_{0} \end{cases}$$

To address the aforementioned multivariate goodness-of-fit problem, we propose a multivariate multi-aspect test, and therefore we need to combine the two different decompositions:

$$\begin{cases} H_{0 μ} : \bigcap_{v = 1}^{V} H_{0 μ v} = \bigcap_{v = 1}^{V} \{ μ_{v} = μ_{0 v} \} \\ H_{1 μ} : \bigcup_{v = 1}^{V} H_{1 μ v} = \bigcup_{v = 1}^{V} \{ μ_{v} \neq μ_{0 v} \} \end{cases} \qquad (4)$$

$$\begin{cases} H_{0 σ} : \bigcap_{v = 1}^{V} H_{0 σ v} = \bigcap_{v = 1}^{V} \{ σ_{v} = σ_{0 v} \} \\ H_{1 σ} : \bigcup_{v = 1}^{V} H_{1 σ v} = \bigcup_{v = 1}^{V} \{ σ_{v} \neq σ_{0 v} \} \end{cases} \qquad (5)$$

$$\begin{cases} H_{0 F} : \bigcap_{v = 1}^{V} H_{0 F v} = \bigcap_{v = 1}^{V} \{ F_{v} = F_{0 v} \} \\ H_{1 F} : \bigcup_{v = 1}^{V} H_{1 F v} = \bigcup_{v = 1}^{V} \{ F_{v} \neq F_{0 v} \} \end{cases} \qquad (6)$$

For each individual aspect, we then identify a suitable test statistic:

– For location, we use the absolute difference between sample means:

$$T_{μ v} = | \bar{x}_{v} - \bar{x}_{0 v} |$$

– For variability, we adopt the ratio of the two estimated variances $s_{1}^{2}$ and $s_{0}^{2}$:

$$T_{σ v} = \max \left( \frac{s_{1}^{2}}{s_{0}^{2}}, \frac{s_{0}^{2}}{s_{1}^{2}} \right)$$

– For the cumulative distribution function, we use the Anderson–Darling test statistic [3]:

$$T_{F v} = \sum_{i = 1}^{N} \frac{[\hat{F} (z_{v i}) - \hat{F}_{0} (z_{v i})]^{2}}{\bar{F} (z_{v i}) [1 - \bar{F} (z_{v i})]}$$

where $z = \{x, x_{0}\}$ is the pooled sample, n and m are the individual sample sizes, $N = n + m$, $\hat{F} (t) = \sum_{i = 1}^{n} I (x_{v i} \leq t) / n$, $\hat{F}_{0} (t) = \sum_{i = 1}^{m} I (x_{0 v i} \leq t) / m$, $\bar{F} (t) = \sum_{i = 1}^{N} I (z_{v i} \leq t) / N$, $I (\cdot)$ is the indicator function (equal to 1 if its argument is true and 0 otherwise), and $t \in R^{1}$.
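The three partial statistics can be illustrated for a single component v with the short Python sketch below. It is not the authors' R implementation; the function names are hypothetical, and two details are assumptions made here: variances use the divisor n - 1, and terms of the Anderson–Darling sum whose denominator is zero (the pooled maximum, where the pooled distribution function equals 1) are skipped.

```python
from statistics import mean, variance

def t_location(x, x0):
    """Absolute difference between the two sample means."""
    return abs(mean(x) - mean(x0))

def t_variability(x, x0):
    """Larger of the two variance ratios, so that values far above 1 signal a difference."""
    s1, s0 = variance(x), variance(x0)   # assumption: sample variances, divisor n - 1
    return max(s1 / s0, s0 / s1)

def t_cdf(x, x0):
    """Anderson-Darling type distance between the two empirical distribution functions."""
    z = list(x) + list(x0)
    n, m, big_n = len(x), len(x0), len(z)
    total = 0.0
    for t in z:
        f_hat = sum(v <= t for v in x) / n
        f0_hat = sum(v <= t for v in x0) / m
        f_bar = sum(v <= t for v in z) / big_n
        denom = f_bar * (1.0 - f_bar)
        if denom > 0.0:                  # assumption: skip the point where F_bar = 1
            total += (f_hat - f0_hat) ** 2 / denom
    return total
```

All three functions return larger values when the two samples differ more, which is the convention needed by the permutation p-values that follow.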

The second step of the NPC methodology can thereafter be undertaken. We apply each test statistic to each univariate component of the outcome and compute the related partial p-values via multivariate permutation.

The adopted algorithm is as follows:

(i) Apply the three test statistics to the original pooled data set $z = \{x, x_{0}\}$, obtaining the observed values $T_{μ v}^{o}$, $T_{σ v}^{o}$, and $T_{F v}^{o}$.

(ii) For $b = 1, \dots, B$:

– Shuffle the rows of $z$ (i.e., the same permutation scheme is applied to each component), implicitly taking into account the existing correlation among variables.

– Apply the three test statistics and retrieve $T_{μ v}^{b}$, $T_{σ v}^{b}$, and $T_{F v}^{b}$.

(iii) Compute the partial p-values $λ_{μ v}^{\prime}$, $λ_{σ v}^{\prime}$, and $λ_{F v}^{\prime}$, comparing the values of the observed test statistics to those of the permuted test statistics (e.g., $λ_{μ v}^{\prime} = \frac{\sum_{b} I (T_{μ v}^{b} \geq T_{μ v}^{o})}{B + 1}$, $v = 1, \dots, V$), and their permutation distributions $λ_{μ v}^{\prime b}$, $λ_{σ v}^{\prime b}$, and $λ_{F v}^{\prime b}$, $b = 1, \dots, B$ (for further details, see Pesarin and Salmaso [3]).
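The permutation step can be sketched as follows. This is a minimal illustration rather than the authors' code: `partial_p_values` and `abs_mean_diff` are hypothetical names, only the location statistic is plugged in by default, and the p-value uses the divisor B + 1 as in the formula above (many implementations also add 1 to the numerator so that a p-value is never exactly zero).

```python
import random
from statistics import mean

def abs_mean_diff(a, b):
    return abs(mean(a) - mean(b))

def partial_p_values(x, x0, stat=abs_mean_diff, n_perm=999, seed=0):
    """Partial p-values for each of the V components via synchronized row permutation.

    Returns (p_values, observed_statistics, permuted_statistics).
    """
    rng = random.Random(seed)
    z = [list(r) for r in x] + [list(r) for r in x0]   # pooled data set
    n, v = len(x), len(x[0])

    def stats_for(rows):
        first, second = rows[:n], rows[n:]
        return [stat([r[j] for r in first], [r[j] for r in second]) for j in range(v)]

    observed = stats_for(z)
    permuted = []
    for _ in range(n_perm):
        shuffled = z[:]
        rng.shuffle(shuffled)        # whole rows move together, preserving dependence
        permuted.append(stats_for(shuffled))
    p = [sum(t[j] >= observed[j] for t in permuted) / (n_perm + 1) for j in range(v)]
    return p, observed, permuted
```

Because entire rows are shuffled, every component sees the same relabelling of units, which is what allows the later combination step to account for the correlation among variables without modelling it.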

The last step consists of the combination of the partial p-values and the computation of the global p-value. To do that, the following procedure needs to be followed:

(i) For each aspect, apply a combining function $ϕ (\cdot)$ to the V vectors of partial p-values and their permutation distributions to achieve second-order test statistics $T_{μ}^{\prime\prime o} = ϕ (λ_{μ 1}^{\prime}, \dots, λ_{μ V}^{\prime})$, $T_{σ}^{\prime\prime o} = ϕ (λ_{σ 1}^{\prime}, \dots, λ_{σ V}^{\prime})$, $T_{F}^{\prime\prime o} = ϕ (λ_{F 1}^{\prime}, \dots, λ_{F V}^{\prime})$ and their estimated distributions $T_{μ}^{\prime\prime b} = ϕ (λ_{μ 1}^{\prime b}, \dots, λ_{μ V}^{\prime b})$, $T_{σ}^{\prime\prime b} = ϕ (λ_{σ 1}^{\prime b}, \dots, λ_{σ V}^{\prime b})$, $T_{F}^{\prime\prime b} = ϕ (λ_{F 1}^{\prime b}, \dots, λ_{F V}^{\prime b})$, $b = 1, \dots, B$.

(ii) Compute second-order p-values $λ_{μ}^{\prime\prime}$, $λ_{σ}^{\prime\prime}$, and $λ_{F}^{\prime\prime}$ (and the related distributions $λ_{μ}^{\prime\prime b}$, $λ_{σ}^{\prime\prime b}$, $λ_{F}^{\prime\prime b}$, $b = 1, \dots, B$) comparing $T_{a}^{\prime\prime o}$ to the permuted values $T_{a}^{\prime\prime b}$, $b = 1, \dots, B$, with $a \in \{μ, σ, F\}$.

(iii) Apply a combining function $θ (\cdot)$ to the second-order p-values and their permutation distributions to achieve a third-order test statistic $T^{\prime\prime\prime o} = θ (λ_{μ}^{\prime\prime}, λ_{σ}^{\prime\prime}, λ_{F}^{\prime\prime})$ and its estimated distribution $T^{\prime\prime\prime b} = θ (λ_{μ}^{\prime\prime b}, λ_{σ}^{\prime\prime b}, λ_{F}^{\prime\prime b})$, $b = 1, \dots, B$.

(iv) Compute the global p-value $λ^{\prime\prime\prime}$ comparing $T^{\prime\prime\prime o}$ to the permuted values $T^{\prime\prime\prime b}$, $b = 1, \dots, B$.
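The two-level combination can be sketched in Python as below. The code is a hedged illustration, not the authors' implementation: `npc_global_p` and `to_p_values` are hypothetical names, and p-values are computed against all B + 1 rows (observed row included), an assumption made here so that no p-value is exactly zero before taking logarithms.

```python
import math

def to_p_values(column):
    """p-value of every entry against the whole column (row 0 is the observed value)."""
    total = len(column)
    return [sum(t >= s for t in column) / total for s in column]

def fisher(ps):
    return -2.0 * sum(math.log(p) for p in ps)

def tippett(ps):
    return 1.0 - min(ps)

def npc_global_p(stats_by_aspect, phi=fisher, theta=tippett):
    """stats_by_aspect[a][r][v]: statistic for aspect a, row r (0 = observed), component v.

    Returns the global p-value and the observed second-order p-value of each aspect.
    """
    second_order_p = []
    for rows in stats_by_aspect:
        v = len(rows[0])
        # partial p-values and their permutation distributions, one list per component
        cols = [to_p_values([r[j] for r in rows]) for j in range(v)]
        # second-order statistics: combine the V components row by row with phi
        combined = [phi([cols[j][r] for j in range(v)]) for r in range(len(rows))]
        second_order_p.append(to_p_values(combined))
    n_rows = len(second_order_p[0])
    # third-order statistic: combine the aspects row by row with theta
    third = [theta([p[r] for p in second_order_p]) for r in range(n_rows)]
    return to_p_values(third)[0], [p[0] for p in second_order_p]
```

The same permuted rows are reused at every level, so no additional resampling is needed after the partial statistics have been stored.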

The choice of the combining function can represent a key factor in determining the power of the proposed test. According to Pesarin and Salmaso [3], a combining function should satisfy four fundamental properties, which are as follows:

It should be a non-increasing and possibly symmetric function;
It should reach its supremum value even when only a single partial p-value attains 0;
For each significance level $α$ , the related critical value should be finite and lower than the supremum value;
The rejection region of the resulting combined test should be convex.

Among the functions satisfying these requirements, we have the following:

– Fisher’s [11]:

$$ψ_{Fisher} : R^{K} \to R, \quad (λ_{1}, \dots, λ_{K}) \mapsto - 2 \sum_{k = 1}^{K} \log (λ_{k})$$

– The truncated product method [12], a modification of Fisher’s combining function which generally helps in gaining power with highly dependent data [13]:

$$ψ_{TPM} : R^{K} \to R, \quad (λ_{1}, \dots, λ_{K}) \mapsto - 2 \sum_{k = 1}^{K} \log (λ_{k}) \cdot 1_{[0, τ]} (λ_{k})$$

– Tippett’s [14]:

$$ψ_{Tippett} : R^{K} \to R, \quad (λ_{1}, \dots, λ_{K}) \mapsto 1 - \min \{λ_{1}, \dots, λ_{K}\}$$
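The three combining functions translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the default truncation point follows the value τ = 0.2 used in the simulation study, and all p-values are assumed to be strictly positive so that the logarithm is defined.

```python
import math

def psi_fisher(ps):
    """Fisher: -2 times the sum of log p-values."""
    return -2.0 * sum(math.log(p) for p in ps)

def psi_tpm(ps, tau=0.2):
    """Truncated product method: Fisher's sum restricted to p-values not above tau."""
    return -2.0 * sum(math.log(p) for p in ps if p <= tau)

def psi_tippett(ps):
    """Tippett: driven only by the smallest p-value."""
    return 1.0 - min(ps)
```

All three are non-increasing in each p-value, so larger combined values always indicate stronger evidence against the null hypothesis.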

In this study, we decided to set $θ (\cdot) = ψ_{Tippett} (\cdot)$ and to investigate the impact of choosing the truncated product method over Fisher’s $ψ_{Fisher} (\cdot)$ as the first combining function $ϕ (\cdot)$, used to combine the V vectors of p-values related to the V components of the multivariate outcome. A simulation study was conducted to evaluate the power of our proposal, implementing two different versions of the test, one using $ϕ (\cdot) = ψ_{Fisher} (\cdot)$ (indicated as NPC-Fisher) and one using $ϕ (\cdot) = ψ_{TPM} (\cdot)$ (indicated as NPC-Truncated).

A Competing Method

For a better evaluation of the performance of the proposed method, we decided to consider a possible competing method. In particular, we focused on the two-sample energy test introduced by Székely and Rizzo [15] to deal with equality in distribution in high-dimensional problems. The test statistic they suggested is

$$ϵ = \frac{n m}{n + m} \left( \frac{2}{n m} \sum_{i = 1}^{n} \sum_{i^{\prime} = 1}^{m} \| x_{i} - x_{0 i^{\prime}} \| - \frac{1}{n^{2}} \sum_{i = 1}^{n} \sum_{i^{\prime} = 1}^{n} \| x_{i} - x_{i^{\prime}} \| - \frac{1}{m^{2}} \sum_{i = 1}^{m} \sum_{i^{\prime} = 1}^{m} \| x_{0 i} - x_{0 i^{\prime}} \| \right)$$

where $\| \cdot \|$ is the Euclidean norm. Large values of this statistic lead to rejection, and the p-value is obtained through a permutation approach.
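For illustration, the two-sample energy statistic in its usual form (twice the mean between-sample distance minus the two mean within-sample distances, scaled by nm/(n + m)) can be computed as in the following sketch. The function names are hypothetical; in practice the energy package in R would be used, as done in this study.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def mean_pairwise(a, b):
    """Average Euclidean distance over all pairs (one row from a, one row from b)."""
    return sum(euclid(p, q) for p in a for q in b) / (len(a) * len(b))

def energy_statistic(x, x0):
    n, m = len(x), len(x0)
    between = mean_pairwise(x, x0)
    within_x = mean_pairwise(x, x)
    within_x0 = mean_pairwise(x0, x0)
    return n * m / (n + m) * (2.0 * between - within_x - within_x0)
```

A permutation p-value is then obtained by recomputing `energy_statistic` after shuffling the rows of the pooled sample, in the same way as for the NPC-based test.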

3. Simulation Study

In this study, we considered several different scenarios to accurately evaluate the performance of the proposed NPC-based approach.

Firstly, we decided to consider three different multivariate distributions:

Multivariate normal distribution with mean $μ$ and variance-covariance matrix $Σ$ ;
Multivariate log-normal distribution with the mean vector of the log of the distribution equal to $μ$ and variance–covariance matrix of the log of the distribution equal to $Σ$ ;
Multivariate Student’s t distribution with 3 degrees of freedom, location parameter $μ$ and scale matrix $Σ$ .

Data generation was conducted, taking advantage of the mvtnorm [16], LaplacesDemon [17] and compositions [18] packages implemented in R. The energy package [19] was adopted to apply the competing method, while R codes implementing the two versions of the multi-aspect test are available upon request.

Initially, the sizes of the observed and the theoretical samples were both fixed to 20, while we decided to consider two possible numbers of variables V, namely 6 and 10.

Under the null hypothesis, the observed and the theoretical distributions are expected to be the same. Therefore, under $H_{0}$, we set the V-dimensional vector $μ = [10, \dots, 10]$ and the $(V \times V)$-dimensional matrix

$$Σ = \begin{bmatrix} 1 & ω & \dots & ω \\ ω & 1 & \dots & ω \\ \vdots & \vdots & \ddots & \vdots \\ ω & ω & \dots & 1 \end{bmatrix}$$

for both samples. We considered three possible values of $ω$, i.e., 0, 0.25, and 0.5, to introduce different degrees of correlation.

Under the alternative hypothesis, we focused on scenarios where the observed and the theoretical distributions were different in terms of both the location and scale parameters. For the theoretical sample $X_{0}$, the aforementioned $μ$ and $Σ$ were adopted. On the other hand, to generate the observed sample $X$, we used $μ_{1} = [9.5, \dots, 9.5]$ and

$$Σ_{1} = \begin{bmatrix} 3 & ω & \dots & ω \\ ω & 3 & \dots & ω \\ \vdots & \vdots & \ddots & \vdots \\ ω & ω & \dots & 3 \end{bmatrix}.$$

Again, three possible values of $ω$, i.e., 0, 0.25, and 0.5, were considered, using each time the same value for both distributions.
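As an illustration of this design, the sketch below builds the two covariance matrices and draws one pair of multivariate normal samples of size 20 under the alternative. It is not the code used in the study (which relied on R packages); the function names are hypothetical, sampling is done with a hand-written Cholesky factor, and only the normal case is shown (exponentiating each entry of a row would give the log-normal case).

```python
import math
import random

def compound_symmetric(v, diag, omega):
    """V x V matrix with `diag` on the diagonal and `omega` elsewhere."""
    return [[diag if i == j else omega for j in range(v)] for i in range(v)]

def cholesky(a):
    """Lower-triangular factor L with L L' = a (a must be positive definite)."""
    v = len(a)
    low = [[0.0] * v for _ in range(v)]
    for i in range(v):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def draw_mvnorm(size, mu, sigma, rng):
    """Draw `size` rows from N(mu, sigma) as mu + L e, with e standard normal."""
    low = cholesky(sigma)
    v = len(mu)
    out = []
    for _ in range(size):
        e = [rng.gauss(0.0, 1.0) for _ in range(v)]
        out.append([mu[i] + sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(v)])
    return out

rng = random.Random(2022)
V, omega = 6, 0.25
x0 = draw_mvnorm(20, [10.0] * V, compound_symmetric(V, 1.0, omega), rng)  # theoretical sample
x = draw_mvnorm(20, [9.5] * V, compound_symmetric(V, 3.0, omega), rng)    # observed sample under H1
```

Setting the mean to 10 and the diagonal to 1 for both samples reproduces the null scenario instead.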

Given the well-known properties of the NPC methodology, increasing the size m of the theoretical sample could allow us to increase the power of our procedure. To better illustrate such a phenomenon, we also tried to vary m in a final scenario, where for the theoretical sample, we use $μ = [10, \dots, 10]$ and

$$Σ = \begin{bmatrix} 1 & ω & \dots & ω \\ ω & 1 & \dots & ω \\ \vdots & \vdots & \ddots & \vdots \\ ω & ω & \dots & 1 \end{bmatrix},$$

and for the observed sample, we have $μ_{1} = [9.5, \dots, 9.5]$ and

$$Σ_{1} = \begin{bmatrix} 3 & ω & \dots & ω \\ ω & 3 & \dots & ω \\ \vdots & \vdots & \ddots & \vdots \\ ω & ω & \dots & 3 \end{bmatrix},$$

with $ω = 0.25$. In particular, we considered three possible values for m (i.e., 20, 30, and 40), while keeping n fixed to 20.

It is worth noting that the current choice of test statistics is not ideal for situations where the observed and the theoretical samples differ in terms of correlation structure. This shortcoming is illustrated and further discussed through an additional simulation scenario, where for the theoretical sample we use $V = 6$, $μ = [10, \dots, 10]$ and

$$Σ = \begin{bmatrix} 3 & ω & \dots & ω \\ ω & 3 & \dots & ω \\ \vdots & \vdots & \ddots & \vdots \\ ω & ω & \dots & 3 \end{bmatrix}$$

with $ω = 0.5$, while for the observed sample we have $μ_{1} = [10, \dots, 10]$ and

$$Σ_{1} = \begin{bmatrix} 3 & ω_{1} & \dots & ω_{1} \\ ω_{1} & 3 & \dots & ω_{1} \\ \vdots & \vdots & \ddots & \vdots \\ ω_{1} & ω_{1} & \dots & 3 \end{bmatrix}$$

with $ω_{1} \in \{1.5, 2.0, 2.5\}$.
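To give a sense of the size of these differences, the covariances can be converted into pairwise correlations. The short computation below is only indicative and assumes the multivariate normal case, where the matrices above are the covariance matrices of the data; the symbols ρ, ρ₀ and ρ₁ are introduced here for illustration. Since every diagonal entry equals 3, the correlation is the off-diagonal value divided by 3.

```latex
\rho = \frac{\omega}{\sqrt{3 \cdot 3}} = \frac{\omega}{3}, \qquad
\rho_{0} = \frac{0.5}{3} \approx 0.17, \qquad
\rho_{1} \in \left\{ \frac{1.5}{3}, \frac{2.0}{3}, \frac{2.5}{3} \right\}
\approx \{ 0.50, 0.67, 0.83 \}
```

In this scenario the two samples share the same means and the same marginal variances, so statistics computed one component at a time have no marginal difference to detect.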

The number of simulation runs was set equal to 5000, while the number of permutations was equal to 2000.

Results and Discussion

Under the null hypothesis, both versions of the NPC-based test (using Fisher’s combining function and the truncated product method with $τ = 0.2$) and the energy test kept the nominal level. Having fixed the significance level $α$ to 1%, the rejection rates (i.e., the proportion of p-values less than or equal to $α$) are always quite close to 1%, with some slight random fluctuations (see Table 1).

Table 1. Rejection rates with significance level $α = 1\%$ under the null hypothesis.
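The size of the random fluctuations that can be expected with 5000 simulation runs is easy to quantify. The sketch below, with hypothetical function names, computes a rejection rate as defined above and the binomial Monte Carlo standard error of an estimated rate when the true rate equals the nominal level.

```python
import math

def rejection_rate(p_values, alpha=0.01):
    """Proportion of p-values less than or equal to alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)

def monte_carlo_se(alpha=0.01, runs=5000):
    """Standard error of an estimated rejection rate when the true rate is alpha."""
    return math.sqrt(alpha * (1.0 - alpha) / runs)
```

With α = 1% and 5000 runs the standard error is about 0.0014, so the rejection rates reported in Table 1 (between 0.007 and 0.014) all lie within roughly three standard errors of the nominal level.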

Under the alternative hypothesis, we can appreciate some differences between the considered methods (see Table 2).

Table 2. Rejection rates with significance level $α = 1\%$ under the alternative hypothesis varying $ω$.

First of all, it appears that the choice of the combining function does not considerably affect the performance of the NPC-based multi-aspect test. This is probably due to the fact that two sequential combination steps are undertaken, mitigating the impact of the use of a specific function during the first step. However, when a larger number of variables and a multivariate log-normal distribution are considered, the solution adopting Fisher’s combining function appears to perform slightly better.

Both the Fisher and truncated product methods show reasonably high rejection rates and outperform the energy test under the vast majority of scenarios. However, this is not true for the multivariate log-normal distribution. When considering this asymmetric distribution, the methods show similar performance when $V = 6$, and the energy test even has the highest rejection rates when a larger number of variables is considered (i.e., when $V = 10$).

It is then worth noting how a high correlation among variables appears to be detrimental to the power of the methods. All the considered tests indeed show considerably lower rejection rates when $ω = 0.5$ than when $ω = 0$. The performance of the NPC-based solutions, however, remains reasonably good, even for $ω = 0.5$, when the symmetrical distributions are considered.

Additionally, a higher number of informative variables appears to lead to an increase in power. This is a well-known property of NPC-based tests, called finite-sample consistency [20]: whatever the sample size, a reasonably good power can be reached if a sufficiently high number of informative variables is available. This is a useful property that offers a potential way to deal with the shortcomings posed by small-sample scenarios.

Table 3 allows us to appreciate the positive effect on power of an increase in the theoretical sample size m. In particular, we can see that the permutation-based solutions with $m = 40$ are able to outperform the competing method even for $V = 10$ and a multivariate log-normal distribution, i.e., the only case where the energy test was performing the best for smaller sample sizes. To further enhance the power of these tests, the user could therefore consider increasing the size of the theoretical sample, given that it can be freely chosen. However, when adopting such an approach, we should be aware that it could lead to a substantial increase in the computational burden.

Table 3. Rejection rates with significance level $α = 1\%$ under the alternative hypothesis varying m.

Investigating a potential shortcoming due to the current choice of the test statistics, we noticed that the methods indeed fail to detect differences in the correlation structure. This is true for both the NPC-based solutions and the energy test. Looking at Table 4, we can see that the rejection rates are very close to the nominal level expected under $H_{0}$. However, by including an additional test statistic specifically designed to detect differences in correlation, it could be possible to address even such a scenario [3].

Table 4. Rejection rates with significance level $α = 1\%$ with differences in the correlation structure.

4. Real Data Application

We decided to consider a real data application in order to better show the usefulness of our proposed procedure. In particular, we applied our approach to an industrial problem, where an operator is interested in checking the quality of a production process in terms of multiple key performance indicators. Initially, 25 different bottles (i.e., the output of the process) were randomly selected. Their diameters, measured at three key positions, were expected to be, on average, equal to 2.5 cm (Diameter A), 5 cm (Diameter B) and 7 cm (Diameter C), respectively (i.e., $μ_{s} = [2.5, 5.0, 7.0]$). Additionally, after an application of the Six Sigma methodology, we knew in advance that the expected variance–covariance matrix was as follows:

$$Σ_{s} = \begin{bmatrix} 1.02 \times 10^{- 4} & 1.04 \times 10^{- 5} & 2.01 \times 10^{- 5} \\ 1.04 \times 10^{- 5} & 1.12 \times 10^{- 4} & 2.03 \times 10^{- 5} \\ 2.01 \times 10^{- 5} & 2.03 \times 10^{- 5} & 2.05 \times 10^{- 4} \end{bmatrix},$$

with the diameter values following a multivariate normal distribution.

The gathered sample showed a potential shift from the expected mean value in Diameter A (i.e., the diameter of the neck of the bottle), as we can see in Table 5. We therefore applied both versions of the NPC-based multi-aspect test and the energy test to further investigate this hypothesis.

Table 5. Descriptive statistics.

Table 6 reports the achieved global p-values. We can see that all the considered methods allow us to reject the null hypothesis with a significance level equal to 5%, which means that the gathered sample does not follow a multivariate normal distribution with mean $μ_{s}$ and variance–covariance matrix $Σ_{s}$. Looking at the adjusted partial p-values (see Table 7), we can also identify which aspects lead to this rejection. In particular, we can see that a significant shift in mean did happen.

Table 6. Global p-values.

Table 7. Partial p-values.

5. Conclusions

In this paper, we introduced a multi-aspect permutation test to deal with the multivariate goodness-of-fit (GoF). First of all, we adopted the approach already proposed by Arboretti et al. [2], transforming the GoF problem into a traditional two-sample one. Then, we simply introduced an extension of the nonparametric combination (NPC) methodology [3], which is able to detect differences in location, scale and cumulative distribution function between the observed sample distribution and the theoretical distribution.

To evaluate the performance of this solution, we carried out a simulation study, which showed that our proposal performs well, even when compared to a possible competing testing procedure (i.e., the energy test proposed by Székely and Rizzo [15]). Its power appears to be negatively affected by high correlation among variables, but at the same time, it tends to substantially increase when the number of informative variables increases. It also emerged that the choice of the combining function adopted in the first combination step required by the NPC methodology does not appear to significantly affect the performance of the proposed test.

The simulation study also showed the benefits of choosing a large size m for the sample drawn from the theoretical distribution, which appears to lead to an increase in power. Future studies could therefore provide guidelines about the appropriate ratio between the observed and the theoretical sample sizes.

On the other hand, a shortcoming of the current configuration of the proposed approach also emerged: it fails to detect differences in the correlation structure. For this reason, future studies could focus on the introduction of a further test statistic, specifically designed to detect such differences, which could allow us to improve the performance under such scenarios.

A real data application was also proposed, which allowed us to show the usefulness of our approach.

Overall, our proposal proved to be a quite powerful solution to goodness-of-fit problems, showing high flexibility and leaving room for further improvement and investigation.

Author Contributions

Conceptualization, R.C., N.B. and E.B.; Methodology, R.C., N.B. and E.B.; Software, R.C., N.B. and E.B.; Validation, R.C., N.B. and E.B.; Formal Analysis, R.C., N.B. and E.B.; Investigation, R.C., N.B. and E.B.; Resources, R.C., N.B. and E.B.; Data Curation, L.C., L.S. and R.A.; Writing—Original Draft Preparation, L.C., L.S. and R.A.; Writing—Review and Editing, L.C., L.S. and R.A.; Visualization, L.C., L.S. and R.A.; Supervision, L.C., L.S. and R.A.; Project Administration, L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Mardia, K.V. Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies. Sankhyā Indian J. Stat. Ser. B 1974, 36, 115–128.
2. Arboretti, R.; Ceccato, R.; Salmaso, L. Permutation testing for goodness-of-fit and stochastic ordering with multivariate mixed variables. J. Stat. Comput. Simul. 2021, 91, 876–896.
3. Pesarin, F.; Salmaso, L. Permutation Tests for Complex Data: Theory, Applications and Software; John Wiley & Sons: Hoboken, NJ, USA, 2010.
4. Friedman, J. On Multivariate Goodness-of-Fit and Two-Sample Testing; SLAC National Accelerator Lab.: Menlo Park, CA, USA, 2004.
5. Fisher, R.A. The Design of Experiments, 4th ed.; Hafner Press: New York, NY, USA, 1947.
6. Lehmann, E.L. The Fisher, Neyman-Pearson theories of testing hypotheses: One theory or two? J. Am. Stat. Assoc. 1993, 88, 1242–1249.
7. Salmaso, L.; Solari, A. Multiple aspect testing for case-control designs. Metrika 2005, 62, 331–340.
8. Brombin, C.; Salmaso, L. Multi-aspect permutation tests in shape analysis with small sample size. Comput. Stat. Data Anal. 2009, 53, 3921–3931.
9. Brombin, C.; Salmaso, L.; Ferronato, G.; Galzignato, P.F. Multi-aspect procedures for paired data with application to biometric morphing. Commun. Stat. Comput. 2010, 40, 1–12.
10. Corain, L.; Salmaso, L. Improving power of multivariate combination-based permutation tests. Stat. Comput. 2015, 25, 203–214.
11. Fisher, R. Statistical Methods for Research Workers; Oliver and Boyd: Edinburgh, UK, 1932.
12. Zaykin, D.V.; Zhivotovsky, L.A.; Westfall, P.H.; Weir, B.S. Truncated product method for combining P-values. Genet. Epidemiol. Off. Publ. Int. Genet. Epidemiol. Soc. 2002, 22, 170–185.
13. Arboretti Giancristofaro, R.; Bonnini, S.; Corain, L.; Salmaso, L. Dependency and truncated forms of combinations in multivariate combination-based permutation tests and ordered categorical variables. J. Stat. Comput. Simul. 2016, 86, 3608–3619.
14. Tippett, L.H.C. The Methods of Statistics. An Introduction Mainly for Workers in the Biological Sciences; Williams & Norgate: London, UK, 1931.
15. Székely, G.J.; Rizzo, M.L. Testing for equal distributions in high dimension. InterStat 2004, 5, 1249–1272.
16. Genz, A.; Bretz, F.; Miwa, T.; Mi, X.; Leisch, F.; Scheipl, F.; Hothorn, T. Mvtnorm: Multivariate Normal and t Distributions; R Package Version 1.1-3; R Foundation for Statistical Computing: Vienna, Austria, 2021.
17. Statisticat LLC. Bayesian Inference; R Package Version 16.1.6; R Foundation for Statistical Computing: Vienna, Austria, 2021.
18. van den Boogaart, K.G.; Tolosana-Delgado, R.; Bren, M. Compositions: Compositional Data Analysis; R Package Version 2.0-4; R Foundation for Statistical Computing: Vienna, Austria, 2022.
19. Rizzo, M.; Szekely, G. Energy: E-Statistics: Multivariate Inference via the Energy of Data; R Package Version 1.7-8; R Foundation for Statistical Computing: Vienna, Austria, 2021.
20. Pesarin, F.; Salmaso, L. Finite-sample consistency of combination-based permutation tests with application to repeated measures designs. J. Nonparametric Stat. 2010, 22, 669–684.

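Table 1. Rejection rates with significance level $α = 1\%$ under the null hypothesis.

| $ω$ | V | Method | Multivariate Normal | Multivariate Log-Normal | Multivariate Student’s t |
|---|---|---|---|---|---|
| 0 | 6 | Energy | 0.010 | 0.009 | 0.010 |
| 0 | 6 | NPC-Fisher | 0.012 | 0.010 | 0.011 |
| 0 | 6 | NPC-Truncated | 0.012 | 0.009 | 0.010 |
| 0 | 10 | Energy | 0.007 | 0.011 | 0.010 |
| 0 | 10 | NPC-Fisher | 0.008 | 0.009 | 0.011 |
| 0 | 10 | NPC-Truncated | 0.007 | 0.009 | 0.013 |
| 0.25 | 6 | Energy | 0.010 | 0.010 | 0.009 |
| 0.25 | 6 | NPC-Fisher | 0.010 | 0.008 | 0.009 |
| 0.25 | 6 | NPC-Truncated | 0.009 | 0.009 | 0.011 |
| 0.25 | 10 | Energy | 0.008 | 0.012 | 0.009 |
| 0.25 | 10 | NPC-Fisher | 0.009 | 0.013 | 0.009 |
| 0.25 | 10 | NPC-Truncated | 0.008 | 0.014 | 0.009 |
| 0.5 | 6 | Energy | 0.009 | 0.010 | 0.011 |
| 0.5 | 6 | NPC-Fisher | 0.009 | 0.010 | 0.009 |
| 0.5 | 6 | NPC-Truncated | 0.009 | 0.009 | 0.009 |
| 0.5 | 10 | Energy | 0.012 | 0.011 | 0.011 |
| 0.5 | 10 | NPC-Fisher | 0.009 | 0.012 | 0.011 |
| 0.5 | 10 | NPC-Truncated | 0.009 | 0.012 | 0.011 |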

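Table 2. Rejection rates with significance level $α = 1\%$ under the alternative hypothesis varying $ω$.

| $ω$ | V | Method | Multivariate Normal | Multivariate Log-Normal | Multivariate Student’s t |
|---|---|---|---|---|---|
| 0 | 6 | Energy | 0.757 | 0.448 | 0.807 |
| 0 | 6 | NPC-Fisher | 0.965 | 0.453 | 0.976 |
| 0 | 6 | NPC-Truncated | 0.958 | 0.450 | 0.972 |
| 0 | 10 | Energy | 0.949 | 0.677 | 0.970 |
| 0 | 10 | NPC-Fisher | 0.999 | 0.582 | 1.000 |
| 0 | 10 | NPC-Truncated | 0.999 | 0.558 | 1.000 |
| 0.25 | 6 | Energy | 0.647 | 0.425 | 0.734 |
| 0.25 | 6 | NPC-Fisher | 0.931 | 0.435 | 0.955 |
| 0.25 | 6 | NPC-Truncated | 0.926 | 0.430 | 0.946 |
| 0.25 | 10 | Energy | 0.854 | 0.595 | 0.891 |
| 0.25 | 10 | NPC-Fisher | 0.990 | 0.551 | 0.997 |
| 0.25 | 10 | NPC-Truncated | 0.987 | 0.535 | 0.995 |
| 0.5 | 6 | Energy | 0.563 | 0.378 | 0.645 |
| 0.5 | 6 | NPC-Fisher | 0.877 | 0.380 | 0.908 |
| 0.5 | 6 | NPC-Truncated | 0.876 | 0.374 | 0.901 |
| 0.5 | 10 | Energy | 0.709 | 0.479 | 0.770 |
| 0.5 | 10 | NPC-Fisher | 0.961 | 0.461 | 0.959 |
| 0.5 | 10 | NPC-Truncated | 0.953 | 0.455 | 0.955 |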

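Table 3. Rejection rates with significance level $α = 1\%$ under the alternative hypothesis varying m.

| m | V | Method | Multivariate Normal | Multivariate Log-Normal | Multivariate Student’s t |
|---|---|---|---|---|---|
| 20 | 6 | Energy | 0.667 | 0.421 | 0.731 |
| 20 | 6 | NPC-Fisher | 0.941 | 0.434 | 0.958 |
| 20 | 6 | NPC-Truncated | 0.935 | 0.444 | 0.952 |
| 20 | 10 | Energy | 0.847 | 0.572 | 0.911 |
| 20 | 10 | NPC-Fisher | 0.994 | 0.506 | 0.996 |
| 20 | 10 | NPC-Truncated | 0.990 | 0.492 | 0.995 |
| 30 | 6 | Energy | 0.711 | 0.432 | 0.800 |
| 30 | 6 | NPC-Fisher | 0.985 | 0.467 | 0.986 |
| 30 | 6 | NPC-Truncated | 0.984 | 0.456 | 0.982 |
| 30 | 10 | Energy | 0.902 | 0.578 | 0.934 |
| 30 | 10 | NPC-Fisher | 0.998 | 0.523 | 1.000 |
| 30 | 10 | NPC-Truncated | 0.997 | 0.515 | 1.000 |
| 40 | 6 | Energy | 0.753 | 0.433 | 0.825 |
| 40 | 6 | NPC-Fisher | 0.993 | 0.524 | 0.993 |
| 40 | 6 | NPC-Truncated | 0.992 | 0.518 | 0.992 |
| 40 | 10 | Energy | 0.929 | 0.580 | 0.959 |
| 40 | 10 | NPC-Fisher | 1.000 | 0.595 | 1.000 |
| 40 | 10 | NPC-Truncated | 1.000 | 0.582 | 1.000 |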

Table 4. Rejection rates with significance level $α = 1\%$ with differences in the correlation structure.

| $ω_{1}$ | V | Method | Multivariate Normal | Multivariate Log-Normal | Multivariate Student’s t |
|---|---|---|---|---|---|
| 1.5 | 6 | Energy | 0.013 | 0.012 | 0.014 |
| 1.5 | 6 | NPC-Fisher | 0.010 | 0.013 | 0.012 |
| 1.5 | 6 | NPC-Truncated | 0.010 | 0.015 | 0.010 |
| 2.0 | 6 | Energy | 0.019 | 0.014 | 0.018 |
| 2.0 | 6 | NPC-Fisher | 0.012 | 0.009 | 0.010 |
| 2.0 | 6 | NPC-Truncated | 0.015 | 0.013 | 0.009 |
| 2.5 | 6 | Energy | 0.021 | 0.022 | 0.028 |
| 2.5 | 6 | NPC-Fisher | 0.012 | 0.009 | 0.014 |
| 2.5 | 6 | NPC-Truncated | 0.013 | 0.009 | 0.014 |

Table 5. Descriptive statistics.

| Value | Diameter A | Diameter B | Diameter C |
|---|---|---|---|
| Average | 2.516 | 4.999 | 7.002 |
| Variance | $1.04 \times 10^{- 4}$ | $1.17 \times 10^{- 4}$ | $2.03 \times 10^{- 4}$ |

Table 6. Global p-values.

| Energy | NPC-Fisher | NPC-Truncated |
|---|---|---|
| $4.99 \times 10^{- 4}$ | $1.14 \times 10^{- 2}$ | $2.79 \times 10^{- 2}$ |

Table 7. Partial p-values.

| Test Statistics | NPC-Fisher | NPC-Truncated |
|---|---|---|
| $T_{μ}$ | 0.014 | 0.019 |
| $T_{σ}$ | 0.095 | 0.147 |
| $T_{F}$ | 0.014 | 0.019 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
