Article

Objective Priors for Invariant e-Values in the Presence of Nuisance Parameters

Department of Statistical Sciences, University of Padova, 35121 Padova, Italy
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Entropy 2024, 26(1), 58; https://doi.org/10.3390/e26010058
Submission received: 29 November 2023 / Revised: 2 January 2024 / Accepted: 6 January 2024 / Published: 9 January 2024
(This article belongs to the Special Issue Bayesianism)

Abstract

This paper aims to contribute to refining the e-values for testing precise hypotheses, especially when dealing with nuisance parameters, leveraging the effectiveness of asymptotic expansions of the posterior. The proposed approach offers the advantage of bypassing the need for elicitation of priors and reference functions for the nuisance parameters, as well as the multidimensional integration step. For this purpose, starting from a Laplace approximation, only a posterior distribution for the parameter of interest is considered, and then a suitable objective matching prior is introduced, ensuring that the posterior mode aligns with an equivariant frequentist estimator. Consequently, both Highest Probability Density credible sets and the e-value remain invariant. Some targeted and challenging examples are discussed.

1. Introduction

In this article, we discuss an objective matching prior that maintains the invariance of the posterior mode when testing precise hypotheses for parametric models, especially in the presence of nuisance parameters. These parameters are often introduced to establish flexible and realistic models, although the primary focus of inference is typically limited to a parameter of interest. The proposed approach offers the advantage of eliminating the need for eliciting information on the nuisance components and for conducting multidimensional integration, and it produces invariant e-values in the presence of nuisance parameters.
The parametric framework that we consider can be described as follows. Consider a random sample $y = (y_1, \ldots, y_n)$ of size $n$ from a random variable $Y$ with parametric model $f(y; \theta)$, indexed by a parameter $\theta$, with $\theta \in \Theta \subseteq \mathbb{R}^d$. Given a prior $\pi(\theta)$ on $\theta$, Bayesian inference for $\theta$ is based on the posterior density
$$\pi(\theta \mid y) \propto \pi(\theta)\, L(\theta), \tag{1}$$
where $L(\theta)$ represents the likelihood function based on $f(y; \theta)$. Interest is, in particular, in the situation in which $\theta = (\psi, \lambda)$, where $\psi$ is a scalar parameter for which inference is required, and $\lambda$ represents the remaining $(d-1)$-dimensional nuisance parameter. In such a case, Bayesian inference for $\psi$ is based on the marginal posterior density
$$\pi_m(\psi \mid y) = \int \pi(\psi, \lambda \mid y)\, d\lambda \propto \int \pi(\psi, \lambda)\, L(\psi, \lambda)\, d\lambda, \tag{2}$$
whose computation requires both elicitation of a prior on the nuisance parameter $\lambda$ and multidimensional integration.
Asymptotic arguments are widely used in Bayesian inference through (1) and (2), based on developments of so-called higher-order asymptotics (see, e.g., [1,2,3]). Indeed, the theory of asymptotic expansions provides very accurate approximations to posterior distributions and to various summary quantities of interest, including tail areas, credible regions and the Full Bayesian Significance Test (see, e.g., [4,5]). Moreover, such expansions are particularly useful for sensitivity analyses (see [6,7]) and for the derivation of matching priors (see [8], and references therein). For instance, focusing on the presence of nuisance parameters, the Laplace approximation to (2) provides
$$\pi_m(\psi \mid y) \,\ddot{=}\, \frac{1}{\sqrt{2\pi}}\, |j_p(\hat{\psi})|^{1/2} \exp\{\ell_p(\psi) - \ell_p(\hat{\psi})\}\, \frac{|j_{\lambda\lambda}(\hat{\psi}, \hat{\lambda})|^{1/2}}{|j_{\lambda\lambda}(\psi, \hat{\lambda}_\psi)|^{1/2}}\, \frac{\pi(\psi, \hat{\lambda}_\psi)}{\pi(\hat{\psi}, \hat{\lambda})}, \tag{3}$$
where $\ell_p(\psi) = \log L_p(\psi) = \log L(\psi, \hat{\lambda}_\psi)$ is the profile log-likelihood for $\psi$, with $\hat{\lambda}_\psi$ the constrained maximum likelihood estimate (MLE) of $\lambda$ given $\psi$, $(\hat{\psi}, \hat{\lambda})$ is the full MLE, and $j_p(\psi) = -\partial^2 \ell_p(\psi)/\partial \psi^2$ is the profile observed information. Moreover, $j_{\lambda\lambda}(\psi, \lambda)$ is the $(\lambda, \lambda)$-block of the observed information from the full log-likelihood $\ell(\psi, \lambda) = \log L(\psi, \lambda)$, and the notation $\ddot{=}$ indicates that the approximation is accurate to order $O(n^{-3/2})$ in moderate deviation regions (see, e.g., [9], Chapter 2). One appealing feature of higher-order approximations like (3) is that they may routinely be applied in practical Bayesian inference, since they require little more than standard likelihood quantities for their implementation, and hence, they may be available at little additional computational cost over simple first-order approximations.
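As a worked check of (3), the sketch below (our own illustration, with simulated data and a flat prior, not an example from the paper) uses the normal model with the mean as parameter of interest and the variance as nuisance, for which all quantities in (3) have standard closed forms; in this particular model the Laplace approximation reproduces the exact marginal posterior up to a normalizing constant.

```python
import numpy as np

# Normal model, mean mu of interest, variance nuisance, flat prior
# pi(mu, sigma^2) ∝ 1; data simulated for illustration.
rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=2.0, size=30)
n, ybar = y.size, y.mean()

def s2(mu):                                   # constrained MLE of sigma^2 given mu
    return np.mean((y - mu) ** 2)

def laplace_marginal(mu):
    """Unnormalized version of (3) with flat prior, constants dropped."""
    lp_diff = -0.5 * n * (np.log(s2(mu)) - np.log(s2(ybar)))
    # |j_ll| ratio in (3), with j_ll(mu, s2) = n / (2 s2^2) at the
    # constrained MLE of the variance
    jratio = s2(mu) / s2(ybar)
    return np.exp(lp_diff) * jratio

def exact_marginal(mu):
    """Exact unnormalized marginal posterior, sigma^2 integrated out analytically."""
    return s2(mu) ** (-0.5 * (n - 2))

grid = np.linspace(ybar - 2.0, ybar + 2.0, 9)
ratio = np.array([laplace_marginal(m) / exact_marginal(m) for m in grid])
# In this model the approximation is exact up to a constant: the ratio is flat.
print(np.allclose(ratio, ratio[0]))
```

The flat ratio over the grid confirms that, for this model, (3) matches the analytically integrated marginal posterior exactly up to normalization.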
In the presence of nuisance parameters, starting from approximation (3), it is possible to define a general posterior distribution for ψ of the form
$$\pi^*(\psi \mid y) \propto \pi^*(\psi)\, L_p(\psi), \tag{4}$$
where $\pi^*(\psi)$ is now a prior distribution on $\psi$ only. Bayesian inference based on pseudo-likelihood functions—i.e., functions of $\psi$ only and of the data $y$ with properties similar to those of a genuine likelihood function, such as the profile likelihood—has been widely used and discussed in the recent statistical literature, and it has been theoretically motivated in several papers (see, for instance, [10,11,12], and references therein), also focusing on the derivation of suitable objective priors. Especially when the dimension of $\lambda$ is large, there are two advantages in using (4) instead of the marginal posterior distribution (2). First, the elicitation over $\lambda$ is not necessary, and second, the computation of the integrals in (2) is circumvented.
Focusing on (4), in this paper, it is of interest to test the precise (or sharp) null hypothesis
$$H_0: \psi = \psi_0 \quad \text{against} \quad H_1: \psi \neq \psi_0, \tag{5}$$
using the measure of evidence of the Full Bayesian Significance Test (FBST) (see, e.g., [4,5]). The FBST quantifies evidence by considering the posterior probability associated with the least probable points in the parameter space under $H_0$. Higher-order asymptotic computation of the FBST for precise null hypotheses in the presence of nuisance parameters has been discussed in [13].
The original measure of evidence for the FBST is not invariant under suitable transformations of the parameter, a property which has, however, been attained in the more recent definition of the e-value (see [14,15], and references therein). Nevertheless, when working on a scalar parameter of interest in the presence of nuisance parameters, the e-value is not invariant with respect to marginalizations of the nuisance parameter, and it must be used in the full dimensionality of the parameter space. This requires elicitation on the complete parameter, numerical optimization and numerical integration, which can be computationally heavy, especially when the dimension of $\lambda$ is large.
The aim of this paper is to consider the e-value in the context of the pseudo-posterior distribution $\pi^*(\psi \mid y)$, suggesting in this respect a suitable objective prior $\pi^*(\psi)$ to be used in (4). More precisely, focus is on a particular matching prior, which ensures the invariance of the posterior mode of the pseudo-posterior distribution. As a consequence, Highest Probability Density (HPD) credible sets are also invariant, as well as the e-value.
This paper is organized as follows. Section 2 provides a short review of the FBST for testing precise null hypotheses and also illustrates asymptotic approximations for the e-value, extending results of [13]. Section 3 discusses the derivation of the objective matching prior for the parameter of interest only, called the median matching prior, which produces invariant e-values; several targeted and challenging examples are also discussed. Finally, Section 4 closes with some concluding remarks.

2. The FBST Measure of Evidence

Suppose that we need to decide between two hypotheses: the null $H_0$ and the alternative $H_1$. The usual Bayesian testing procedure is based on the well-known Bayes factor (BF), defined as the ratio of the posterior odds to the prior odds in favor of the null hypothesis. A high BF, or a high value of its logarithm, suggests evidence in favor of $H_0$. However, it is well known that, when improper priors are used, the BF can be undetermined, and, when the null hypothesis is precise (as specified in (5)), the BF can lead to the so-called Jeffreys–Lindley paradox (see, e.g., [16]). Moreover, the BF is not calibrated, i.e., its finite sampling distribution is unknown and it may depend on the nuisance parameter.
To avoid these drawbacks, in recent years, an alternative Bayesian procedure, called the FBST, has been introduced by [5] in the case of a sharp hypothesis $H_0$ identified by the null set $\Theta_0$, a submanifold of $\Theta$ of lower dimension. The FBST quantifies evidence by considering the posterior probability associated with the least probable points in the null set $\Theta_0$. When this probability is high, it favors the null hypothesis, providing a clear and interpretable measure of support for $H_0$ (see, e.g., [4,15,17], and references therein). The FBST is based on a specific loss function, and thus, the decision made under this procedure is the action that minimizes the corresponding posterior risk.
The FBST operates by determining the e-value, a representation of the Bayesian evidence associated with $H_0$. To construct the e-value, the authors introduced the posterior surprise function and its supremum, given, respectively, by
$$\pi_s(\theta \mid y) = \frac{\pi(\theta \mid y)}{r(\theta)} \qquad \text{and} \qquad s^* = \pi_s(\theta^* \mid y) = \sup_{\theta \in \Theta_0} \pi_s(\theta \mid y),$$
where $r(\theta)$ is a suitable reference function to be chosen. The surprise function was also introduced in the context of statistical inference by [18] (see [15], and references therein). Then, they introduce the tangential set $T_y(\theta^*)$, defined as the set of parameter values for which the posterior surprise function exceeds the supremum $s^*$, that is
$$T_y(\theta^*) = \{\theta \in \Theta : \pi_s(\theta \mid y) > s^*\}.$$
This set, often referred to as the Highest Relative Surprise Set, includes parameter values with higher surprise than those within the null set $\Theta_0$. The e-value is then computed as
$$\text{ev} = 1 - \int_{T_y(\theta^*)} \pi_s(\theta \mid y)\, d\theta,$$
and $H_0$ is rejected for small values of ev.
The original FBST, as proposed by [5,19], relies on a flat reference function $r(\theta) \equiv 1$, so that this first version involved the determination of the tangential set $T_y(\theta^*)$ starting only from the posterior distribution $\pi(\theta \mid y)$. However, this initial version lacked invariance under reparameterizations. Subsequent refinements of the FBST introduced the use of reference density functions, making the e-value explicitly invariant under appropriate transformations of the parameter. Common choices for the reference function include uninformative priors, like the uniform distribution, maximum entropy densities, or Jeffreys' invariant prior. In [20], the use of Jeffreys' prior, $\pi(\theta) \propto |i(\theta)|^{1/2}$, where $i(\theta)$ is the Fisher information derived from $L(\theta)$, is discussed as the reference function to derive invariant HPD sets and Maximum A Posteriori (MAP) estimators that are invariant under reparameterizations. Note that ev uses the full dimensionality of the parameter space. Moreover, this measure is not invariant with respect to transformations of the nuisance parameters, and the use of high posterior densities to construct credible sets may produce inconsistencies.
Concerning the asymptotic behavior of ev, it can be proven that, under suitable regularity conditions, as the sample size increases, with $\theta_0$ representing the true parameter value (see [15]), the following holds:
  • If $H_0$ is false, i.e., $\theta_0 \notin \Theta_0$, then ev converges in probability to 0.
  • If $H_0$ is true, i.e., $\theta_0 \in \Theta_0$, then, denoting by $V(c) = Pr(\text{ev} \leq c)$ the cumulative distribution function of ev, we have that $V(c) \to Q(d - h, Q^{-1}(d, c))$, with $d = \dim(\Theta)$, $h = \dim(\Theta_0)$ and $Q(k, x)$ the cumulative chi-square distribution function with $k$ degrees of freedom.
In practice, the computation of ev is performed in two steps: (a) a numerical optimization and (b) a numerical integration. The numerical optimization step consists of finding the maximizer $\theta^*$ of $\pi_s(\theta \mid y)$ under the null hypothesis. The numerical integration step consists of integrating the posterior surprise function over the region where it is greater than $\pi_s(\theta^* \mid y)$, to obtain the e-value. These computational steps make the FBST a computationally intensive procedure. Despite efficient computational algorithms for local and global optimization, as well as numerical integration, obtaining precise results for hypotheses like (5) is highly demanding, especially with large nuisance parameter dimensions. Numerical integration can be tackled by resorting to higher-order tail area approximations, as reviewed in the Bayesian framework in [3]. An application of asymptotic approximation to the FBST in its first formulation, i.e., with reference function $r(\theta) \equiv 1$, has been discussed in [13].
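To fix ideas, the two steps above can be sketched in a one-parameter example where everything can be done on a grid. The binomial model, flat prior and null value below are illustrative assumptions of ours; with a flat reference function the surprise function coincides with the posterior density, and for a point null the optimization step is trivial.

```python
import numpy as np
from scipy import stats

# Brute-force FBST sketch (illustrative assumptions: binomial data, flat
# prior, flat reference r = 1, so the surprise function equals the
# posterior, here a Beta(y + 1, n - y + 1) density).
n, y = 20, 14
post = stats.beta(y + 1, n - y + 1)
theta0 = 0.5                         # precise null H0: theta = theta0

# Step (a): with a point null, the supremum of the surprise over Theta_0
# is just its value at theta0.
s_star = post.pdf(theta0)

# Step (b): integrate the posterior over the tangential set
# T = {theta : pi(theta | y) > s_star}, here by a simple Riemann sum.
grid = np.linspace(1e-6, 1 - 1e-6, 200_001)
dens = post.pdf(grid)
step = grid[1] - grid[0]
ev = 1.0 - np.sum(dens[dens > s_star]) * step
print(0.0 < ev < 1.0)
```

Even in one dimension the procedure needs an optimization and an integration; with a multidimensional nuisance parameter both steps become the computational bottleneck described above.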

Asymptotic Approximations for the e-Value

A first-order approximation for the e-value, when testing (5), is simply given by (see, e.g., [21,22])
$$\text{ev} \,\dot{=}\, 2\left[1 - \Phi\left(|\psi_0 - \hat{\psi}|\, j_p(\hat{\psi})^{1/2}\right)\right], \tag{6}$$
where the symbol "$\dot{=}$" indicates that the approximation is accurate to order $O(n^{-1/2})$, and $\Phi(\cdot)$ is the standard normal distribution function. Thus, to first order, ev agrees with the $p$-value based on the profile Wald statistic
$$w_p(\psi_0) = (\hat{\psi} - \psi_0)\, j_p(\hat{\psi})^{1/2}. \tag{7}$$
In practice, the approximation (6) of ev may be inaccurate, in particular when the dimension of $\lambda$ is large with respect to the sample size, because it forces the marginal posterior distribution to be symmetric.
The practical computation of ev requires the evaluation of integrals of the marginal posterior distribution. In order to obtain more accurate evaluations of ev, it may be useful to resort to higher-order asymptotics based on tail area approximations (see, e.g., [2,3], and references therein). Indeed, the measure of evidence involves integrals of the marginal surprise posterior density $\pi_{ms}(\psi \mid y)$. In particular, extending the application of the tail area argument to the marginal surprise posterior density, we can derive an $O(n^{-3/2})$ approximation to the marginal surprise posterior tail area probability, given by
$$\int_{\psi_0}^{+\infty} \pi_{ms}(\psi \mid y)\, d\psi \,\ddot{=}\, \Phi(r_B^*(\psi_0)), \tag{8}$$
where
$$r_B^*(\psi) = r_p(\psi) + \frac{1}{r_p(\psi)} \log \frac{q_B(\psi)}{r_p(\psi)},$$
with
$$r_p(\psi) = \mathrm{sign}(\hat{\psi} - \psi)\left[2\left(\ell_p(\hat{\psi}) - \ell_p(\psi)\right)\right]^{1/2}$$
the profile likelihood root, and
$$q_B(\psi) = \ell_p'(\psi)\, |j_p(\hat{\psi})|^{-1/2}\, \frac{|j_{\lambda\lambda}(\psi, \hat{\lambda}_\psi)|^{1/2}}{|j_{\lambda\lambda}(\hat{\psi}, \hat{\lambda})|^{1/2}}\, \frac{\pi(\hat{\psi}, \hat{\lambda})}{\pi(\psi, \hat{\lambda}_\psi)}\, \frac{r(\psi, \hat{\lambda}_\psi)}{r(\hat{\psi}, \hat{\lambda})}.$$
In the expression of $q_B(\psi)$, $\ell_p'(\psi) = \partial \ell_p(\psi)/\partial \psi$ is the profile score function.
Using the tail area approximation (8), a third-order approximation of the measure of evidence ev can be derived. The approximation, assuming without loss of generality that $\psi_0$ is smaller than the MAP of $\pi_{ms}(\psi \mid y)$, is given by
$$\text{ev} \,\ddot{=}\, 1 - \Phi(r_B^*(\psi_0)) + \Phi(r_B^*(\psi_0^*)), \tag{9}$$
with $\psi_0^*$ the value of the parameter such that $\pi_{ms}(\psi_0^* \mid y) = \pi_{ms}(\psi_0 \mid y)$. Note that
$$\Phi(r_B^*(\psi_0)) - \Phi(r_B^*(\psi_0^*)) \,\ddot{=}\, \int_{\psi_0}^{\psi_0^*} \pi_{ms}(\psi \mid y)\, d\psi = 1 - \text{ev}$$
in (9) gives the posterior probability of the HPD credible interval $(\psi_0, \psi_0^*)$. Note also that the higher-order approximation (9) does not call for any condition on the prior $\pi(\psi, \lambda)$, i.e., it can also be improper. Finally, when $\pi_{ms}(\psi \mid y)$ is symmetric, Equation (9) reduces to $\text{ev} \,\ddot{=}\, 2(1 - \Phi(r_B^*(\psi_0)))$.
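As a concrete sketch of the tail area approximations (8) and (9), consider again a normal model with the mean of interest, the variance as nuisance, and flat prior and reference function, for which all the ingredients of $r_B^*(\psi)$ are available in closed form. The data, prior choice and null values below are illustrative assumptions of ours, not an example from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.normal(loc=0.3, scale=1.0, size=25)
n, ybar = y.size, y.mean()

def s2(mu):                          # constrained MLE of the variance given mu
    return np.mean((y - mu) ** 2)

def lp(mu):                          # profile log-likelihood, constants dropped
    return -0.5 * n * np.log(s2(mu))

def r_star(mu):
    """r*_B(mu) with flat prior and flat reference function."""
    rp = np.sign(ybar - mu) * np.sqrt(2.0 * (lp(ybar) - lp(mu)))
    lp_prime = n * (ybar - mu) / s2(mu)              # profile score
    jp_hat = n / s2(ybar)                            # profile information at the MLE
    # |j_ll(mu, s2_mu)|^{1/2} / |j_ll(mu_hat, s2_hat)|^{1/2},
    # with j_ll(mu, s2) = n / (2 s2^2)
    jratio = s2(ybar) / s2(mu)
    qb = lp_prime * jp_hat ** (-0.5) * jratio
    return rp + np.log(qb / rp) / rp

def ev(psi0):
    # by symmetry of the marginal posterior about ybar, the e-value
    # reduces to 2 * (1 - Phi(r*_B(psi0))) for psi0 below the MAP
    return 2.0 * (1.0 - stats.norm.cdf(r_star(psi0)))

# Evidence decreases as the null value moves away from the MAP.
print(ev(ybar - 0.1) > ev(ybar - 0.5) > ev(ybar - 1.0))
```

Only profile likelihood quantities are needed, which is what makes the tail area route cheap relative to multidimensional numerical integration.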
While tail area approximations require little more than standard likelihood quantities for their implementation and, in this respect, are available at little additional computational cost over the first-order approximation, they require elicitation on the complete parameter $\theta$ and the choice of the reference function $r(\theta)$.

3. An Invariant Objective Prior

The aim of this section is to derive a default prior $\pi^*(\psi)$ to be used in (4). To this end, following [8], we use the shrinkage argument, which is a crucial tool in the development of matching priors, i.e., priors that ensure, up to the desired order of asymptotics, an agreement between Bayesian and frequentist procedures. Examples of matching priors (see [8]) include priors matching posterior quantiles, credible regions and predictive distributions. Here, we focus on a specific matching prior that ensures the invariance of the posterior mode in the posterior distribution (4). As a consequence, the invariance extends to HPD sets, as well as to the e-value, and is achieved by incorporating the reference function within the prior.
The proposed choice of the prior $\pi^*(\psi)$, which makes the MAP, and thus also HPD sets and the e-value, invariant under one-to-one reparameterizations, will depend on the log-likelihood $\ell(\theta)$ and on its derivatives. In regular parametric estimation problems, both the MLE and the score estimating function exhibit an asymptotically symmetric distribution, centered at the true parameter value and at zero, respectively. However, these asymptotic behaviors may poorly reflect exact sampling distributions, particularly in cases with small or moderate sample information, sparse data, or complex models. Several proposals have been developed to correct the estimate or the estimating function. Most available methods are aimed at approximate bias adjustment, either of the MLE or of the profile score function, also when nuisance parameters are present (see [23] for a review of bias reduction for the MLE and [24] and subsequent literature for bias correction of the profile score). Lack of equivariance impacts the so-called implicit bias reduction methods, which achieve first-order bias correction by modifying the score equation (see [23,25]). To avoid this drawback, in this paper we focus on the median modification of the score, or profile score, equation, whose solution respects equivariance under monotone reparameterizations ([26]). Similar to Firth's implicit method ([25]), the median modification of the score, or profile score, does not rely on finiteness of the MLE, thereby effectively preventing infinite estimates.
In practice, to derive the median matching prior π * ( ψ ) , we impose that the MAP of π * ( ψ | y ) coincides with a refined version of the MLE, obtained as the solution of the median modified score function ([26]). To introduce this new invariant prior, we initially explore the scenario without nuisance parameters and then the situation in which nuisance parameters are present.

3.1. No Nuisance Parameters

Let us explore first the scenario where θ is scalar. In order to obtain median bias reduction of the MLE, it is possible to resort to a modified version of the score function of the form
$$t(\theta) = \ell_\theta(\theta) + m(\theta), \tag{10}$$
where $\ell_\theta(\theta) = \ell_\theta(\theta; y) = \partial \ell(\theta; y)/\partial \theta$ is the score function, and $m(\theta)$ is a suitable correction term of order $O(1)$. In particular, the median modified score function assumes for $m(\theta)$ the expression
$$m(\theta) = \frac{E(\ell_\theta(\theta)^3)}{6\, i(\theta)}.$$
The solution $\tilde{\theta}$ to the equation $t(\theta) = 0$ not only upholds equivariance under componentwise monotone reparameterizations, but is also approximately median unbiased ([26]). Note that likelihood inference based on (10) does not depend explicitly on the MLE; indeed, the modified score function has been found to overcome problems of infinite estimates. Like the MLE, $\tilde{\theta}$ is asymptotically $N(\theta, i(\theta)^{-1})$, so that the corresponding Wald-type statistics differ only in location.
Since Bayes' theorem is a statement of additivity on the log scale, $\log \pi(\theta \mid y) = \log \pi(\theta) + \log L(\theta) + \text{constant}$, we observe that, in the Bayesian framework, $m(\theta)$ can be interpreted as the derivative of the logarithm of a prior, that is, $m(\theta) = \partial \log \pi(\theta)/\partial \theta$. We are thus looking for a matching prior $\pi^*(\theta)$ such that
$$\frac{\partial \log \pi^*(\theta)}{\partial \theta} = \frac{E(\ell_\theta(\theta)^3)}{6\, i(\theta)}.$$
In the scalar parameter case, it is straightforward to show that the proposed median matching prior takes the form
$$\pi^*(\theta) \propto \exp\left\{\frac{1}{6}\int i(\theta)^{-1}\, E(\ell_\theta(\theta)^3)\, d\theta\right\} \propto \exp\left\{-\frac{1}{6}\int i(\theta)^{-1}\left(3\, E(\ell_{\theta\theta}(\theta)\,\ell_\theta(\theta)) + E(\ell_{\theta\theta\theta}(\theta))\right) d\theta\right\},$$
with $\ell_{\theta\theta}(\theta) = \partial \ell_\theta(\theta)/\partial \theta$ and $\ell_{\theta\theta\theta}(\theta) = \partial \ell_{\theta\theta}(\theta)/\partial \theta$, where the second expression for $\pi^*(\theta)$ follows from the Bartlett identities. The posterior based on the median matching prior is thus
$$\pi^*(\theta \mid y) \propto \exp\left\{\ell(\theta) + \frac{1}{6}\int i(\theta)^{-1}\, E(\ell_\theta(\theta)^3)\, d\theta\right\}.$$
A first-order approximation for the e-value, when testing $H_0: \theta = \theta_0$, is simply given by
$$\text{ev} \,\dot{=}\, 2\left[1 - \Phi\left(|\theta_0 - \tilde{\theta}|\, i(\theta_0)^{1/2}\right)\right], \tag{11}$$
which differs in location with respect to the classical first-order approximation for the e-value based on the MLE. A second approximation for the e-value, when testing $H_0: \theta = \theta_0$, can be obtained from the asymptotic distribution of the modified score function (10), that is
$$\text{ev} \,\dot{=}\, 2\left[1 - \Phi\left(\frac{|t(\theta_0)|}{i(\theta_0)^{1/2}}\right)\right]. \tag{12}$$
Despite the first-order equivalence between (11) and (12), note that (11) is based on an easily interpretable comparison between the estimated value and the hypothesized value, taking estimation error into account, and is widely used in applications, but it does not satisfy the principle of parameterization invariance. On the other hand, $t(\theta)/i(\theta)^{1/2}$ is parameterization invariant.
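The contrast between (11) and (12) can be made concrete in a small worked example of ours (not taken from the paper): for i.i.d. Poisson counts, the median modified score can be derived from $E(\ell_\theta(\theta)^3)$ and $i(\theta)$ in closed form both in the mean parameterization $\mu$ and in the canonical one $\theta = \log \mu$, and the score-type e-value is numerically identical in the two parameterizations, while the Wald-type one is not.

```python
import numpy as np
from scipy import stats

# Poisson counts; derived (our computation) modified scores:
#   mean scale:      t(mu)    = (s - n*mu)/mu + 1/(6*mu),  s = sum(y)
#   canonical scale: t(theta) = s - n*exp(theta) + 1/6
# Both give the equivariant root mu_tilde = (s + 1/6)/n.
y = np.array([0, 2, 1, 0, 3, 1, 0, 1])       # illustrative counts
n, s = y.size, y.sum()
mu_t = (s + 1 / 6) / n                        # median modified estimate

def ev_score_mean(mu0):
    t = (s - n * mu0) / mu0 + 1 / (6 * mu0)   # modified score, mean scale
    i = n / mu0                               # Fisher information i(mu)
    return 2 * (1 - stats.norm.cdf(abs(t) / np.sqrt(i)))

def ev_score_canonical(theta0):
    t = s - n * np.exp(theta0) + 1 / 6        # modified score, canonical scale
    i = n * np.exp(theta0)                    # Fisher information i(theta)
    return 2 * (1 - stats.norm.cdf(abs(t) / np.sqrt(i)))

def ev_wald_mean(mu0):
    return 2 * (1 - stats.norm.cdf(abs(mu0 - mu_t) * np.sqrt(n / mu0)))

def ev_wald_canonical(theta0):
    th_t = np.log(mu_t)                       # equivariance of the estimator
    return 2 * (1 - stats.norm.cdf(abs(theta0 - th_t) * np.sqrt(n * np.exp(theta0))))

mu0 = 1.5
print(np.isclose(ev_score_mean(mu0), ev_score_canonical(np.log(mu0))))   # invariant
print(np.isclose(ev_wald_mean(mu0), ev_wald_canonical(np.log(mu0))))     # generally not
```

The estimator itself transforms exactly under the reparameterization, but only the score-type pivot, and hence the score-type e-value, is unchanged.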
Note that, when using a predictive matching prior, i.e., a prior ensuring asymptotic equivalence of higher-order frequentist and Bayesian predictive densities (see, e.g., [8]), the term $m(\theta)$ in (10) corresponds to Firth's adjustment ([25])
$$m_F(\theta) = \frac{E(\ell_\theta(\theta)^3) + E(\ell_{\theta\theta}(\theta)\,\ell_\theta(\theta))}{2\, i(\theta)}.$$
In view of this, for general regular models, Firth’s estimate coincides with the mode of the posterior distribution obtained using the default predictive matching prior. However, lack of invariance affects this kind of adjustment ([23]), unless dealing with linear transformations.
Example 1
(One-parameter exponential family). For a one-parameter exponential family with canonical parameter $\theta$, i.e., with density
$$f(y; \theta) = \exp\{\theta a(y) - K(\theta)\}\, b(y),$$
the median modified score function has the form
$$t(\theta) = \ell_\theta(\theta) + \frac{K_{\theta\theta\theta}}{6\, K_{\theta\theta}},$$
where $K_{\theta\theta\theta} = \partial^3 K(\theta)/\partial \theta^3$ and $K_{\theta\theta} = \partial^2 K(\theta)/\partial \theta^2 = i(\theta)$. In this parameterization, $t(\theta)$ can be seen as the first derivative of the log-posterior
$$\log \pi(\theta \mid y) = \ell(\theta) + \frac{1}{6} \log i(\theta).$$
On the other hand, Firth's modified score takes the form $t_F(\theta) = \ell_\theta(\theta) + K_{\theta\theta\theta}/(2 K_{\theta\theta})$. The effect of the median modification is to consider the median matching prior $\pi^*(\theta) \propto i(\theta)^{1/6}$, while $t_F(\theta)$ implies Jeffreys' prior $\pi_J(\theta) \propto i(\theta)^{1/2}$. Note that, for a one-parameter exponential family with canonical parameter $\theta$, both $\pi^*(\theta)$ and $\pi_J(\theta)$ belong to the family of invariant priors discussed in [27,28].
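A quick numerical check of the canonical-parameterization formulas above, using an illustrative binomial instance of our choosing: with $m$ trials and $y = 0$ successes the MLE of the logit parameter is $-\infty$, while the median modified score has a finite root (solving $t(\theta) = 0$ here gives $p = (y + 1/6)/(m + 1/3)$, a closed form we derived for this sketch).

```python
import numpy as np
from scipy import optimize

# Binomial model in the canonical (logit) parameterization:
# K(theta) = m*log(1 + exp(theta)), so K_ttt/(6 K_tt) = (1 - 2p)/6 with
# p = expit(theta). With y = 0 successes the MLE is -infinity, but the
# median modified score still has a finite root.
m, y = 10, 0

def t(theta):
    p = 1.0 / (1.0 + np.exp(-theta))
    return y - m * p + (1.0 - 2.0 * p) / 6.0

# t is decreasing in theta; bracket the root and solve.
theta_tilde = optimize.brentq(t, -20.0, 20.0)
p_tilde = 1.0 / (1.0 + np.exp(-theta_tilde))

# The root satisfies p_tilde = (y + 1/6) / (m + 1/3): here 1/62.
print(np.isclose(p_tilde, (y + 1 / 6) / (m + 1 / 3)))
```

This is the "infinite estimate" protection mentioned above: the modified score equation remains solvable even on boundary samples.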
Example 2
(Scale model). Consider the scale model $f(y; \theta) = (1/\theta)\, p_0(y/\theta)$, where $p_0(\cdot)$ is a given density function. Let $g(\cdot) = \log p_0(\cdot)$. We have $E(\ell_\theta(\theta)^3) = c_1/\theta^3$, $E(\ell_{\theta\theta}(\theta)\,\ell_\theta(\theta)) = c_2/\theta^3$ and $i(\theta) = c_3/\theta^2$, where the constants $c_1 = -\int (1 + y\, g'(y))^3\, p_0(y)\, dy$, $c_2 = -\int (1 + y\, g'(y))(1 + 2 y\, g'(y) + y^2 g''(y))\, p_0(y)\, dy$ and $c_3 = \int (1 + y\, g'(y))^2\, p_0(y)\, dy$ depend only on $p_0$. The median matching prior is thus $\pi^*(\theta) \propto \theta^{c_1/(6 c_3)}$, while the Jeffreys' prior for a one-parameter scale model is $\pi_J(\theta) \propto \theta^{-1}$ and the prior associated to the Firth adjustment is $\pi_F(\theta) \propto \theta^{(c_1 + c_2)/(2 c_3)}$.
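For instance, when $p_0$ is the standard normal density, the constants can be evaluated by quadrature from their defining relations $E(\ell_\theta(\theta)^3) = c_1/\theta^3$ and $i(\theta) = c_3/\theta^2$ evaluated at $\theta = 1$, where the score of the scale model reduces to $\ell_\theta = y^2 - 1$ (a numeric check we add for illustration):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Numeric check for p0 = standard normal. At theta = 1 the score of the
# scale model is l_theta = y^2 - 1, so c1 = E[(Y^2 - 1)^3] and
# c3 = E[(Y^2 - 1)^2] with Y distributed according to p0.
p0 = stats.norm.pdf
c1 = quad(lambda y: (y**2 - 1) ** 3 * p0(y), -np.inf, np.inf)[0]
c3 = quad(lambda y: (y**2 - 1) ** 2 * p0(y), -np.inf, np.inf)[0]
print(round(c1, 6), round(c3, 6))   # 8.0 and 2.0
```

With these values, the median matching prior exponent for the normal scale parameter is $c_1/(6 c_3) = 2/3$.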
Example 3
(Skew-normal distribution). Consider a skew-normal distribution with shape parameter $\theta \in \mathbb{R}$, with density $f(y; \theta) = 2\, \phi(y)\, \Phi(y\theta)$, where $\phi(\cdot)$ is the standard normal density function. The median correction term for the score function associated to the median matching prior is (see [26,29])
$$m(\theta) = \frac{E\!\left(Y^3\, \phi(Y\theta)^3 / \Phi(Y\theta)^3\right)}{6\, E\!\left(Y^2\, \phi(Y\theta)^2 / \Phi(Y\theta)^2\right)}.$$
Numerical integration must be performed to obtain the expected values involved in $m(\theta)$.
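Such an integration can be sketched as follows (our implementation, with the expectations taken under $f(y; \theta) = 2\phi(y)\Phi(y\theta)$; truncating the integration range to $[-12, 12]$, where the integrands are numerically negligible outside, is an implementation choice of ours):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

phi, Phi = stats.norm.pdf, stats.norm.cdf

def m(theta):
    # expectations under the skew-normal density 2 phi(y) Phi(theta y)
    dens = lambda y: 2.0 * phi(y) * Phi(theta * y)
    num = quad(lambda y: y**3 * (phi(theta * y) / Phi(theta * y))**3 * dens(y),
               -12.0, 12.0)[0]
    den = quad(lambda y: y**2 * (phi(theta * y) / Phi(theta * y))**2 * dens(y),
               -12.0, 12.0)[0]
    return num / (6.0 * den)

# At theta = 0 the numerator integrand is odd, so m(0) = 0.
print(abs(m(0.0)) < 1e-6, np.isfinite(m(1.0)))
```

Since $m(\theta)$ is only available through quadrature, it is not in closed form, which is the regularity issue for this example noted later in Section 3.2.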
In order to illustrate the proposed prior, we consider draws from the skew-normal model with true parameter $\theta_0 = 3$ and increasing sample sizes $n = 20, 30, 50, 200$ (Figure 1). The posterior distributions are obtained with the method by [30], i.e., drawing $10^5$ values and accepting the best 5%. The e-values associated to the null (true) hypothesis $H_0: \theta = 3$ and the (false) hypothesis $H_0: \theta = 4$ are reported in Table 1. For comparison, Jeffreys' prior ([31]), the predictive matching prior ([29]) and the flat prior, with uniform reference function, are also considered. Progressive agreement among the evidence values obtained with the proposed median matching prior and those obtained with the other priors is observed as the sample size increases. Also, as expected, when progressively increasing $n$, the evidence values indicate agreement with the true hypothesis $H_0: \theta = 3$ and disagreement with $H_0: \theta = 4$ for all the priors used. Note, however, that the posterior distribution obtained with a flat prior and a uniform reference function is proportional to the likelihood function, which can be monotone. In view of this, while the MAPs of the posteriors based on the default priors are always finite, in some samples the MAP of the posterior with the non-informative prior may be infinite. An example of this effect is illustrated in Figure 2.
The properties of the first-order approximations of the e-values have been investigated by a simulation study, with sample sizes $n = 20, 30, 50, 200$. Results are displayed in Figure 3. The distributions of the e-value from the posterior based on the median matching prior behave better, both for small and moderate sample sizes, in terms of convergence to the uniform distribution. Moreover, score-type e-values (12) are also preferable over Wald-type e-values (11). For the results with the posterior distribution obtained with a flat prior, we found 4.3%, 4.2%, 0.9% and 0% of infinite estimates for the sample sizes considered in the simulation study, and in these cases the e-value was set to 0.

3.2. Presence of Nuisance Parameters

In the presence of nuisance parameters, in order to obtain median bias reduction of the MLE, it is possible to resort to a modified version of the profile score function of the form
$$t_p(\psi) = \ell_p'(\psi) + m(\psi, \hat{\lambda}_\psi), \tag{13}$$
where m ( ψ , λ ) is a suitable correction term of order O ( 1 ) . In particular, for the median modified profile score function, the adjustment m ( ψ , λ ) assumes the expression
$$m(\psi, \lambda) = -\kappa_{1\psi} + \frac{\kappa_{3\psi}}{6\, \kappa_{2\psi}},$$
where $\kappa_{1\psi}$, $\kappa_{2\psi}$ and $\kappa_{3\psi}$ are the first three cumulants of the profile score $\ell_p'(\psi)$ (see [26], Section 2.2, for their expressions). For the estimator $\tilde{\psi}_p$, defined as the solution of $t_p(\psi) = 0$, parameterization equivariance holds under interest-respecting reparameterizations ([26]).
Note that, also in the context of nuisance parameters, we are in the situation in which the proposed prior π * ( ψ ) is known through its first derivative; this is typically the situation with matching priors (see, e.g., [8]). Since the parameter of interest is scalar, the posterior based on the median matching prior can be written as
$$\pi^*(\psi \mid y) \propto \exp\left\{\ell_p(\psi) + \int m(\psi, \hat{\lambda}_\psi)\, d\psi\right\}. \tag{14}$$
A simple analytical way of approximating the posterior distribution (14) based on the median matching prior to first order is to resort to a quadratic form in $t_p(\psi)$. In particular, the approximate posterior distribution for $\psi$ takes the form
$$\pi^*(\psi \mid y) \,\dot{\propto}\, \exp\left\{-\frac{1}{2}\, s_p(\psi)\right\},$$
where $s_p(\psi) = t_p(\psi)^2\, j_p(\psi)^{-1}$ is a Rao score-type statistic based on (13), and the symbol "$\dot{\propto}$" means asymptotic proportionality to first order. In this case, a first-order approximation of the e-value, when testing $H_0: \psi = \psi_0$, is given by
$$\text{ev} \,\dot{=}\, 2\left[1 - \Phi\left(\frac{|t_p(\psi_0)|}{j_p(\psi_0)^{1/2}}\right)\right].$$
In this case, a higher-order approximation via (3) would be impractical, since a closed-form prior is not available. As an alternative, simulation-based approaches may be used to derive the implied posterior distribution (14) based on the median matching prior. The first relies on Approximate Bayesian Computation (ABC) techniques, using $\tilde{\psi}_p$ or the modified profile score function $t_p(\psi)$ as summary statistics; see [32] for the modification of the algorithm of [30] obtained by using a profile estimating equation. This first method introduces an approximation at the level of the posterior estimation. The second still relies on (13), but considers the use of Manifold MCMC methods (see, e.g., [33]) to condition exactly on the profile equation, and not up to a tolerance level as in ABC (see [34,35]). The algorithm moves on the constrained space $\{(y, \psi) \in \mathcal{Y} \times \Psi : t_p(\tilde{\psi}_p) = 0\}$, where $\tilde{\psi}_p$ is the solution of the estimating equation on the original data. For the latter method, we need minimal regularity assumptions on $m(\psi, \lambda)$, which is assumed to be continuous, differentiable and available in closed form. Note, for instance, that in the skew-normal example in Section 3.1 these conditions are not met.
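A minimal worked illustration of the score-type first-order e-value based on $t_p(\psi)$ (our sketch, not an example from the paper): in the normal model with the mean of interest and the variance as nuisance, the odd cumulants of the profile score vanish by symmetry, so, under this assumption, the median modification leaves the profile score unchanged and all quantities are available in closed form.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.normal(loc=0.0, scale=1.5, size=15)
n, ybar = y.size, y.mean()

def s2(mu):                                  # constrained MLE of sigma^2 given mu
    return np.mean((y - mu) ** 2)

def t_p(mu):
    # profile score; for this model the median correction term vanishes
    # by symmetry, so t_p reduces to the profile score itself
    return n * (ybar - mu) / s2(mu)

def j_p(mu):                                 # profile observed information
    return n / s2(mu) - 2.0 * n * (ybar - mu) ** 2 / s2(mu) ** 2

def ev(mu0):
    return 2.0 * (1.0 - stats.norm.cdf(abs(t_p(mu0)) / np.sqrt(j_p(mu0))))

print(np.isclose(ev(ybar), 1.0), 0.0 < ev(ybar - 0.3) < 1.0)
```

At the MAP the e-value equals one, and it decreases away from it; in models where the correction $m(\psi, \lambda)$ does not vanish, $t_p$ simply acquires the additional closed-form term of (13).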
Example 4
(Exponential family). If $f(y; \theta)$ is an exponential family of order $d$ with canonical parameter $(\psi, \lambda)$, i.e., $f(y; \psi, \lambda) = \exp\{\psi\, t(y) + \lambda^T s(y) - K(\psi, \lambda)\}\, h(y)$, the quantities involved in $m(\psi, \lambda)$ are simply obtained from derivatives of $K(\psi, \lambda)$ ([26]). Note that, in this framework, $\ell_p'(\psi) - \kappa_{1\psi}$ is an approximation, with error of order $O(n^{-1})$, of the score for $\psi$ in the conditional model given $s(y)$ (see, e.g., [36], Section 10.2). Then, in the continuous case, the MAP $\tilde{\psi}_p$ is an approximation of the optimal conditional median unbiased estimator, and $\pi^*(\psi \mid y)$ is related to the conditional likelihood for $\psi$, given by $L_c(\psi) = \exp\{\psi\, t(y) - K_s(\psi)\}$; see [37] for a Bayesian interpretation of such pseudo-likelihoods.
Example 5
(Multivariate regression model). Consider a regression model of the form
$$Y_{ij} = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, 2,$$
where it is assumed that $\epsilon_i = (\epsilon_{i1}, \epsilon_{i2})^T \sim N_2(0, \Sigma)$, with $\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ a positive definite matrix, and $(\beta_0, \beta_1, \beta_2, \sigma^2, \rho)$ are unknown parameters. This model is widely used, for instance, in time series analysis, in which the past values of the observables $y$ are used as regression covariates. We focus on the problem of testing hypotheses on the correlation coefficient $\rho$.
Consider a draw with true parameter $\rho_0 = 0.95$ and $n = 20$. For obtaining the proposed posterior (14) for $\rho$, we first compute the MAPs with the median matching prior and also, for comparison, with the predictive matching prior, which are, respectively, 0.953 and 0.92. Note that the expression of the predictive matching prior for $(\psi, \lambda)$ corresponds to the Firth adjustment to the score function. The expressions of the modified profile estimating functions $t_p(\psi)$ and $t_F(\psi)$ are obtained from [38] and are available in closed form. Hence, the Manifold MCMC method can be used to obtain the implied posteriors, whose approximation is comparable to that of any MCMC sampler. In particular, we used 20,000 iterations.
We compare the posterior distribution based on the proposed median matching prior with those obtained with the predictive matching prior and with an inverse-Wishart prior for the covariance matrix $\Sigma$, with one degree of freedom and identity scale matrix, together with a uniform prior on the regression parameters. The posterior distributions are displayed in Figure 4. The hypothesis of interest is $H_0: \rho = 0.9$, and a smaller e-value, indicating disagreement with the hypothesis, is preferable here. The e-values are 0.25 with the median matching prior, 0.36 with the predictive matching prior and 0.60 with the inverse-Wishart prior. Note that the e-value based on the inverse-Wishart prior involves the constrained maximization and multidimensional integration, and thus it is not directly readable from Figure 4. Indeed, one crucial difference is that the original e-value formulation links the evidence of the null hypothesis to the evidence of a more refined hypothesis, choosing the MAP under the null hypothesis for all the nuisance parameters, while in the alternative (tangential) set all values are used, and integration is performed on the full dimensionality of the space. On the contrary, in the proposed posterior based on the median matching prior, the nuisance parameters are maximized out both in the null and in the non-null sets.
Finally, for the posterior based on the inverse-Wishart prior, we also computed the e -value based on the higher-order tail area approximation (9) of the marginal surprise posterior, which equals 0.27. This procedure still avoids the multidimensional integration, but the result is not invariant to changes in parametrization.
Example 6
(Logistic regression model). Let y i , i = 1 , … , n , be independent realizations of binary random variables with probability π i , where log ( π i / ( 1 − π i ) ) = η i = x i β and x i = ( x i 1 , … , x i p ) is a row vector of covariates. We assume that a generic scalar component of β is of interest, and we treat the remaining components as nuisance parameters.
As an example, we consider the endometrial cancer grade dataset analyzed, among others, in [39]. The aim of the clinical study was to evaluate the relationship between the histology of the endometrium (HG), the binary response variable, of n = 79 patients and three risk factors: 1. Neovascularization (NV), which indicates the presence or extent of new blood vessel formation; 2. Pulsatility Index (PI), which measures blood flow resistance in the endometrium; 3. Endometrium Height (EH), which indicates the thickness or height of the endometrium. A logistic model for HG, including an intercept and using all the covariates (NV, PI, EH), has been fitted, but maximum likelihood estimation leads to an infinite MLE for the coefficient β 2 related to NV, due to quasi-complete separation. This phenomenon prohibits the use of diffuse priors for β 2 , since the corresponding posterior would not concentrate. Moreover, with non-informative priors the e -value cannot be obtained even for hypotheses concerning parameters other than β 2 .
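The effect of quasi-complete separation can be reproduced on a toy dataset, unrelated to the endometrial data: when every observation at one level of a binary covariate is a success, each Fisher-scoring step pushes the corresponding coefficient further toward infinity instead of converging. A minimal sketch:

```python
# Hedged sketch: under quasi-complete separation the logistic MLE diverges.
# Toy data only; this is not the endometrial cancer dataset.
import numpy as np

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def newton_logistic(X, y, n_iter):
    """Plain Fisher-scoring iterations for logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        W = p * (1.0 - p)
        info = X.T @ (W[:, None] * X)               # Fisher information
        beta = beta + np.linalg.solve(info, X.T @ (y - p))
    return beta

# The covariate quasi-separates the data: every case with x = 1 has y = 1.
X = np.column_stack([np.ones(6), [0, 0, 0, 1, 1, 1]])
y = np.array([0, 0, 1, 1, 1, 1], dtype=float)

slope_after_8 = newton_logistic(X, y, 8)[1]
slope_after_20 = newton_logistic(X, y, 20)[1]
# The slope keeps growing with the iteration count: there is no finite MLE.
```

The estimate never stabilizes because the log-likelihood is monotone in the slope, which is exactly the monotonicity regularized by the priors discussed below.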
If we consider β 2 as the parameter of interest, while the remaining regression coefficients are treated as nuisance parameters, the analysis with the median matching prior allows us to obtain a globally proper posterior, with MAP equal to 3.86, open to interpretation both on the original scale and in terms of odds ratios. Similarly, the posterior based on the predictive matching prior, which in this model coincides with Jeffreys’ prior π ( β ) ∝ | i ( β ) | 1 / 2 , is proper, with the MAP set at 2.92. The latter suffers from a lack of interpretability on different scales, since a different parametrization in the estimation phase would affect the results.
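For intuition on why the Jeffreys-prior penalization yields a finite mode under separation, the sketch below applies Firth’s modified score for logistic regression, U*(β) = X'(y − p + h ⊙ (1/2 − p)) with h the hat values, to a separated toy dataset: unlike the unpenalized iterations, these converge to a finite estimate. This is an illustrative sketch, not the paper’s computation, and the toy data are not the endometrial dataset.

```python
# Hedged sketch: Firth's bias-reducing adjustment (equivalently, a Jeffreys
# prior penalty) keeps the logistic estimate finite under separation.
# Toy data only; not the endometrial cancer dataset.
import numpy as np

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def firth_logistic(X, y, n_iter=100):
    """Fisher scoring with Firth's modified score
    U*(beta) = X' (y - p + h * (1/2 - p)), h = hat values."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        W = p * (1.0 - p)
        info = X.T @ (W[:, None] * X)
        info_inv = np.linalg.inv(info)
        h = W * np.sum((X @ info_inv) * X, axis=1)   # leverages
        score = X.T @ (y - p + h * (0.5 - p))
        beta = beta + np.linalg.solve(info, score)
    return beta

# Quasi-separated toy data: all observations with x = 1 are successes.
X = np.column_stack([np.ones(6), [0, 0, 0, 1, 1, 1]])
y = np.array([0, 0, 1, 1, 1, 1], dtype=float)

beta_firth = firth_logistic(X, y)   # finite, unlike the unpenalized MLE
```

The median-matching route plays an analogous regularizing role for the scalar parameter of interest, while additionally guaranteeing parametrization invariance of the MAP.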
If we consider β 3 as the parameter of interest, related to the risk factor PI, the MAPs are −0.038 when using the median matching prior and −0.035 when using the predictive matching prior. The e -values for the hypothesis H 0 : β 3 = 0 are 0.60 and 0.55, respectively (see Figure 5). As before, the interpretation of the e-values remains consistent and independent of the parametrization only in the first case.

4. Conclusions

Although (14) cannot always be considered orthodox in a Bayesian setting, the use of alternative likelihoods is nowadays widely accepted, and several papers focus on the Bayesian application of some well-known pseudo-likelihoods. In particular, the proposed posterior π * ( ψ | y ) has the advantages of avoiding the elicitation on the nuisance parameter λ and the computation of multidimensional integrations. Moreover, it provides invariant MAPs, HPDs and e -values, without the adoption of a reference function. Finally, we remark that frequentist properties of the MAP of the posterior based on the proposed median matching prior, in comparison with the MAP of the posterior based on the predictive matching prior, have been investigated in [26,38] for some of the examples discussed in this paper.
For inference on a full vector parameter θ , with d > 1 components, a direct extension of the rationale leading to (10) does not seem practicable due to the lack of a manageable definition of the multivariate median. Actually, in [26,40], it is shown how the method can be extended to a vector parameter of interest in the presence of nuisance parameters by simultaneously solving median bias-corrected score equations for all parameter components. This leads to componentwise third-order median unbiasedness and parameterization equivariance. Moreover, the use of default priors involving all parameter components, including the nuisance ones, becomes necessary to regularize likelihoods in case of monotonicity. We note that, among the possible objective priors that ensure invariance of the posterior, we did not focus on the Jeffreys’ prior in the multidimensional case, since it often exhibits poor convergence properties. Conversely, the default matching priors considered in this paper are easily generalizable to the multidimensional case [40], preserving good convergence properties.
As a final remark, we highlight that this paper opens several topics of future research. In particular, from a computational point of view, it could be of interest:
  • To develop a library of computational routines exploring the methods proposed in this paper for a wide range of statistical models of interest;
  • To develop semi-automated procedures for further expanding this library, as is done for point estimation for Generalized Linear Models in the R package brglm2 [41].
Moreover, from a theoretical point of view, it could be of interest:
  • To further explore the theoretical connections between the e -value invariance properties and matching priors;
  • To explore the existence of similar connections in other classes of pseudo-likelihoods, in particular in the context of empirical and profile empirical likelihoods, with a large number of nuisance parameters [42];
  • To apply and extend the methodology to consider other objective priors used in Bayesian inference, such as those obtained from scoring rules, as proposed in [10], that are expressed as solutions of differential equations.

Author Contributions

Conceptualization, E.B. and L.V.; methodology, E.B. and L.V.; software, E.B.; validation, E.B. and L.V.; formal analysis, E.B. and L.V.; investigation, E.B. and L.V.; resources, E.B. and L.V.; data curation, E.B.; writing—original draft preparation, E.B. and L.V.; writing—review and editing, E.B. and L.V.; visualization, E.B. and L.V.; supervision, E.B. and L.V.; project administration, E.B. and L.V.; funding acquisition, L.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data and code used in the examples are available at https://github.com/elenabortolato/invariant_evalues (accessed on 28 November 2023).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ABC	Approximate Bayesian Computation
BF	Bayes Factor
FBST	Full Bayesian Significance Test
HPD	Highest Probability Density
MAP	Maximum A Posteriori
MLE	Maximum Likelihood Estimator

References

  1. Brazzale, A.R.; Davison, A.C.; Reid, N. Applied Asymptotics. Case-Studies in Small Sample Statistics; Cambridge University Press: Cambridge, UK, 2007. [Google Scholar]
  2. Reid, N. The 2000 Wald memorial lectures: Asymptotics and the theory of inference. Ann. Stat. 2003, 31, 1695–1731. [Google Scholar] [CrossRef]
  3. Ventura, L.; Reid, N. Approximate Bayesian computation with modified loglikelihood ratios. Metron 2014, 7, 231–245. [Google Scholar] [CrossRef]
  4. Madruga, M.; Pereira, C.; Stern, J. Bayesian evidence test for precise hypotheses. J. Stat. Plan. Inference 2003, 117, 185–198. [Google Scholar] [CrossRef]
  5. Pereira, C.; Stern, J.M. Evidence and Credibility: Full Bayesian Significance Test for Precise Hypotheses. Entropy 1999, 1, 99–110. [Google Scholar] [CrossRef]
  6. Kass, R.E.; Tierney, L.; Kadane, J. Approximate methods for assessing influence and sensitivity in Bayesian analysis. Biometrika 1989, 76, 663–674. [Google Scholar] [CrossRef]
  7. Reid, N.; Sun, Y. Assessing sensitivity to priors using higher order approximations. Commun. Stat. Theory Methods 2010, 39, 1373–1386. [Google Scholar] [CrossRef]
  8. Datta, G.S.; Mukerjee, R. Probability Matching Priors: Higher-Order Asymptotics; Lecture Notes in Statistics; Springer: Berlin/Heidelberg, Germany, 2004. [Google Scholar]
  9. Severini, T.A. Likelihood Methods in Statistics; Oxford University Press: Oxford, UK, 2000. [Google Scholar]
  10. Leisen, F.; Villa, C.; Walker, S.G. On a class of objective priors from scoring rules (with discussion). Bayesian Anal. 2020, 15, 1345–1423. [Google Scholar] [CrossRef]
  11. Miller, J.W. Asymptotic normality, concentration, and coverage of generalized posteriors. J. Mach. Learn. Res. 2021, 22, 1. [Google Scholar]
  12. Ventura, L.; Racugno, W. Pseudo-likelihoods for Bayesian inference. In Topics on Methodological and Applied Statistical Inference; Studies in Theoretical and Applied Statistics; Springer: Cham, Switzerland, 2016; pp. 205–220. ISBN 978-3-319-44093-4. [Google Scholar]
  13. Cabras, S.; Racugno, W.; Ventura, L. Higher-order asymptotic computation of Bayesian significance tests for precise null hypotheses in the presence of nuisance parameters. J. Stat. Comput. Simul. 2015, 85, 2989–3001. [Google Scholar] [CrossRef]
  14. Diniz, M.B.; Pereira, C.; Stern, J.M. Cointegration and unit root tests: A fully Bayesian approach. Entropy 2020, 22, 968. [Google Scholar] [CrossRef]
  15. Pereira, C.; Stern, J.M. The e-value: A fully Bayesian significance measure for precise statistical hypotheses and its research program. Sao Paulo J. Math. Sci. 2022, 16, 566–584. [Google Scholar] [CrossRef]
  16. Robert, C.P. On the Jeffreys Lindley paradox. Philos. Sci. 2014, 81, 216–232. [Google Scholar] [CrossRef]
  17. Madruga, M.R.; Esteves, L.G.; Wechsler, S. On the Bayesianity of Pereira–Stern tests. Test 2001, 10, 291–299. [Google Scholar] [CrossRef]
  18. Evans, M. Bayesian inference procedures derived via the concept of relative surprise. Commun. Stat. Theory Methods 1997, 26, 1125–1143. [Google Scholar] [CrossRef]
  19. Pereira, C.; Stern, J.M. Model Selection: Full Bayesian Approach. Environmetrics 2001, 12, 559–568. [Google Scholar] [CrossRef]
  20. Druilhet, P.; Marin, J.M. Invariant HPD credible sets and MAP estimators. Bayesian Anal. 2007, 2, 681–692. [Google Scholar] [CrossRef]
  21. Pereira, C.; Stern, J.M.; Wechsler, S. Can a significance test be genuinely Bayesian? Bayesian Anal. 2008, 79–100. [Google Scholar] [CrossRef]
  22. Diniz, M.A.; Pereira, C.; Polpo, A.; Stern, J.M.; Wechsler, S. Relationship between Bayesian and frequentist significance indices. Int. J. Uncertain. Quantif. 2012, 2, 161–172. [Google Scholar] [CrossRef]
  23. Kosmidis, I. Bias in parametric estimation: Reduction and useful side effects. Wiley Interdiscip. Rev. Comput. Stat. 2014, 6, 185–196. [Google Scholar] [CrossRef]
  24. Stern, S.E. A second-order adjustment to the profile likelihood in the case of a multidimensional parameter of interest. J. R. Stat. Soc. Ser. B 1997, 59, 653–665. [Google Scholar] [CrossRef]
  25. Firth, D. Bias reduction of maximum likelihood estimates. Biometrika 1993, 80, 27–38. [Google Scholar] [CrossRef]
  26. Kenne Pagui, E.C.; Salvan, A.; Sartori, N. Median bias reduction of maximum likelihood estimates. Biometrika 2017, 104, 923–938. [Google Scholar] [CrossRef]
  27. Hartigan, J.A. Invariant prior densities. Ann. Math. Statist. 1964, 35, 836–845. [Google Scholar] [CrossRef]
  28. Hartigan, J.A. The asymptotically unbiased density. Ann. Math. Statist. 1965, 36, 1137–1152. [Google Scholar] [CrossRef]
  29. Sartori, N. Bias prevention of maximum likelihood estimates for scalar skew normal and skew t distributions. J. Stat. Plan. Inference 2006, 136, 4259–4275. [Google Scholar] [CrossRef]
  30. Ruli, E.; Sartori, N.; Ventura, L. Robust approximate Bayesian inference. J. Stat. Plan. Inference 2020, 205, 10–22. [Google Scholar] [CrossRef]
  31. Liseo, B.; Loperfido, N. A note on reference priors for the scalar skew-normal distribution. J. Stat. Plan. Inference 2006, 136, 373–389. [Google Scholar] [CrossRef]
  32. Bortolato, E.; Ventura, L. On approximate robust confidence distributions. Econom. Stat. 2023. [Google Scholar] [CrossRef]
  33. Brubaker, M.; Salzmann, M.; Urtasun, R. A family of MCMC methods on implicitly defined manifolds. Artif. Intell. Stat. 2012, 22, 161–172. [Google Scholar]
  34. Lewis, J.R.; MacEachern, S.N.; Lee, Y. Bayesian restricted likelihood methods: Conditioning on insufficient statistics in Bayesian regression (with discussion). Bayesian Anal. 2021, 104, 1393–1462. [Google Scholar] [CrossRef]
  35. Graham, M.; Storkey, A. Asymptotically exact inference in differentiable generative models. Artif. Intell. Stat. 2017, 499–508. [Google Scholar] [CrossRef]
  36. Pace, L.; Salvan, A. Principles of Statistical Inference; World Scientific: Singapore, 1997. [Google Scholar]
  37. Severini, T.A. On the relationship between Bayesian and non-Bayesian elimination of nuisance parameters. Stat. Sin. 1999, 9, 713–724. [Google Scholar]
  38. Bortolato, E.; Kenne Pagui, E.C. Bias reduction and robustness in Gaussian longitudinal data analysis. J. Stat. Comput. Simul. 2023, 1–16. [Google Scholar] [CrossRef]
  39. Agresti, A. Foundations of Linear and Generalized Linear Models; John Wiley & Sons: New York, NY, USA, 2015. [Google Scholar]
  40. Kosmidis, I.; Kenne Pagui, E.C.; Sartori, N. Mean and median bias reduction in generalized linear models. Stat. Comput. 2020, 30, 43–59. [Google Scholar] [CrossRef]
  41. Kosmidis, I. brglm2: Bias Reduction in Generalized Linear Models. R Package Version 0.9.2. 2023. Available online: https://CRAN.R-project.org/package=brglm2 (accessed on 28 November 2023).
  42. Bedoui, A.; Lazar, N.A. Bayesian empirical likelihood for ridge and lasso regressions. Comput. Stat. Data Anal. 2020, 145, 106917. [Google Scholar] [CrossRef]
Figure 1. Inference for the scalar parameter θ of the skew-normal model with sample sizes n = 20 , 30 , 50 , 200 (top-left, top-right, bottom-left and bottom-right panels, respectively). The red line is used for the posterior obtained from the median matching prior, the green one for the predictive matching prior, the violet one for the Jeffreys’ prior and the blue one from an improper flat prior. The horizontal lines identify the corresponding tangential sets associated to the hypothesis H 0 : θ = 3 .
Figure 2. Skew-normal model: An example of log π ( θ | y ) / θ (estimating equation) with a flat prior (blue line), the median matching prior (red line) the predictive matching prior (green line) and the Jeffreys’ prior (violet line) in a sample where all the observations are positive.
Figure 3. Skew-normal model: Distributions of e -values from a simulation study under the null hypothesis H 0 : θ = 3 , using a flat prior (blue line), the median matching prior (red line), the predictive matching prior (green line), and the Jeffreys’ prior (violet line). The darker line is used for the approximation (11), while the lighter is for that based on (12).
Figure 4. Posterior distributions for the correlation parameter ρ of the bivariate regression model obtained from MCMC draws and the three different priors.
Figure 5. Median matching posterior distribution and predictive matching posterior distributions for β 3 in the logistic regression model.
Table 1. Skew-normal: e -values associated to the hypotheses H 0 : θ = 3 and H 0 : θ = 4 .

| Hypothesis H 0 | n   | Flat Prior | Median Matching Prior | Predictive Matching Prior | Jeffreys' Prior |
|----------------|-----|------------|-----------------------|---------------------------|-----------------|
| θ = 3          | 20  | 0.59       | 0.62                  | 0.56                      | 0.65            |
|                | 30  | 0.65       | 0.70                  | 0.73                      | 0.64            |
|                | 50  | 0.81       | 0.84                  | 0.82                      | 0.91            |
|                | 200 | 0.79       | 0.79                  | 0.81                      | 0.82            |
| θ = 4          | 20  | 0.82       | 0.91                  | 0.98                      | 0.84            |
|                | 30  | 0.17       | 0.22                  | 0.22                      | 0.22            |
|                | 50  | 0.20       | 0.20                  | 0.20                      | 0.21            |
|                | 200 | 0.07       | 0.08                  | 0.08                      | 0.09            |


