Nonparametric FBST for Validating Linear Models

Rodrigo F. L. Lassance; Julio M. Stern; Rafael B. Stern

doi:10.3390/psf2025012002

¹ Department of Statistics, Federal University of São Carlos, São Paulo 13565-905, Brazil

² Institute of Mathematics and Computer Sciences, University of São Paulo, São Paulo 13566-590, Brazil

³ Institute of Mathematics and Statistics, University of São Paulo, São Paulo 05508-090, Brazil

* Author to whom correspondence should be addressed.

Phys. Sci. Forum 2025, 12(1), 2; https://doi.org/10.3390/psf2025012002

This article belongs to the Proceedings The 43rd International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering


Abstract

In Bayesian analysis, testing for linearity requires placing a prior on the entire space of potential regression functions. This poses a problem for many standard tests, as assigning positive prior probability to such a hypothesis is challenging. The Full Bayesian Significance Test (FBST) sidesteps this issue, standing out for also being logically coherent and offering a measure of evidence against $H_{0}$, although its application to nonparametric settings is still limited. In this work, we use Gaussian process priors to derive FBST procedures that evaluate general linearity assumptions, such as testing the adherence of data to linear models and performing variable selection. We also make use of pragmatic hypotheses to verify if the data might be compatible with a linear model when factors such as measurement errors or utility judgments are accounted for. This contribution extends the theory of the FBST, allowing for its application in nonparametric settings and requiring, at most, simple optimization procedures to reach the desired conclusion.

Keywords:

FBST; HPD; Bayesian nonparametrics; linear model; Gaussian process; pragmatic hypothesis

1. Introduction

Although linear models are widespread in the scientific literature, their validity is rarely tested in its full complexity. Generally, linearity is tested as a particular case of a more general parametric model [1] or compared to a finite selection of models, each with its own prior specification, through measures such as the Deviance Information Criterion [2]. In fact, testing the adherence of linear models to data requires (i) assigning a nonparametric prior to the set of regression functions and (ii) devising a procedure that highlights the evidence against the linear model hypothesis based on the data and the prior. A test based solely on the posterior probability of the hypothesis is seldom advisable in this case, as it requires assigning positive prior probability to the set of linear models when there are countless nonlinear functions arbitrarily close to any element of it.

The Full Bayesian Significance Test (FBST, [3]) is the testing framework used throughout this work. The FBST does not violate the likelihood principle, does not require setting positive prior probabilities to hypotheses, and provides a measure of evidence against

H_{0}

, along with other desirable characteristics. With the exceptions of Corrêa Filho [4] and Liu et al. [5], the FBST has not been applied to nonparametrics, still requiring new theoretical developments to systematically embrace such settings.

Bridging the gaps above, this paper provides a nonparametric FBST formulation that tests the adherence of linear models to data. By using a Gaussian Process (GP, [6]) as a prior to model the regression function, we propose FBST procedures that depend on whether the covariates’ domain

X

is finite or infinite. Furthermore, we lay out FBST procedures for hypotheses that include negligible deviations from

H_{0}

, known as pragmatic hypotheses [7], useful to evaluate if

H_{0}

is approximately instead of precisely compatible with the data.

In Figure 1, we illustrate how the FBST operates when applied to

H_{0}

and its pragmatic version,

P g (H_{0})

. For

α \in (0, 1)

, the posterior is used to obtain the

(1 - α) 100 %

Highest Posterior Density (HPD) region, the smallest credible region with probability

(1 - α)

of containing the quantity of interest. The hypothesis is rejected if it does not intersect with the HPD. This procedure is meaningful even when

H_{0}

is precise, that is, when

P (H_{0}) = 0

.

Figure 1. Illustration of the FBST for a precise

H_{0}

and its pragmatic version,

P g (H_{0})

, in the hypothesis space

H

. Each panel presents a possible configuration of the hypotheses and the HPD, with the text above the panels indicating the conclusion.

Even though our contribution makes exclusive use of the FBST, this does not imply that it is the only valid framework for the problem. While no other testing procedure as general as ours has been proposed, Mulder [8] uses Bayes factors to test if a single covariate may be nonlinearly related to the response variable and Lassance et al. [9] (Section 3.1) test the pragmatic version of the linear model hypothesis through its posterior probability.

This work is organized as follows. In Section 2, the required background knowledge is provided. Our findings are presented in Section 3, leading in Section 4 to an application that puts all the FBST procedures to use. Lastly, Section 5 describes how to enhance the FBST further and establishes potential future research. All proofs can be found in Appendix A.

2. Materials and Methods

2.1. Full Bayesian Significance Test (FBST)

The FBST is composed of three steps [3]. For

H_{0} : θ \in Θ_{0} \subset Θ

, where

Θ

is the parameter space, these steps are as follows:

1. Delimit the set of elements in $Θ$ that are more likely than those in $Θ_{0}$. That is, if $f (θ | D)$ is the posterior density of $θ$ given the data $D$,

$T := \{θ \in Θ : f (θ | D) \geq \sup_{θ \in Θ_{0}} f (θ | D)\} .$
2. Obtain $\text{e-value} := 1 - P (θ \in T | D) = 1 - \int_{T} f (θ | D) \, d θ,$ the Bayesian evidence value.
3. Reject $H_{0}$ if $\text{e-value} \leq α$ for a previously specified significance level $α \in (0, 1)$.
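The three steps can be followed by hand in the simplest possible case. The sketch below assumes a univariate normal posterior and a point null $H_{0} : θ = θ_{0}$ (both are illustrative choices, and the function names are hypothetical). In that case the set $T$ is the interval centred at the posterior mean with half-width $|θ_{0} - μ|$, so the e-value has a closed form.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def e_value_point_null(theta0: float, post_mean: float, post_sd: float) -> float:
    """e-value of H0: theta = theta0 when theta | D ~ N(post_mean, post_sd^2).

    Step 1: T = {theta : f(theta | D) >= f(theta0 | D)} is the interval
            centred at post_mean with half-width |theta0 - post_mean|.
    Step 2: e-value = 1 - P(theta in T | D).
    """
    z = abs(theta0 - post_mean) / post_sd
    prob_T = 2.0 * normal_cdf(z) - 1.0
    return 1.0 - prob_T

def fbst_rejects(e_value: float, alpha: float = 0.05) -> bool:
    """Step 3: reject H0 when the e-value is at most alpha."""
    return e_value <= alpha

# Illustrative numbers: the null value sits two posterior sds from the mean.
ev = e_value_point_null(theta0=0.0, post_mean=1.0, post_sd=0.5)
print(round(ev, 4), fbst_rejects(ev))
```

When the null value coincides with the posterior mean, $T$ has zero posterior probability and the e-value equals 1, which is the largest support $H_{0}$ can receive.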

In this paper, we use a procedure equivalent to the FBST: reject $H_{0}$ if the Highest Posterior Density (HPD) region is such that $HPD \cap Θ_{0} = \emptyset$. The HPD is the smallest region with posterior probability $1 - α$, obtained by finding the value $f^{*}$ such that

P (θ \in HPD | D) = 1 - α, \quad HPD := \{θ \in Θ : f (θ | D) \geq f^{*}\} .

When

θ | D

is normally distributed, the HPD region is equivalent to the credible interval symmetric around the posterior mean. For its multivariate counterpart,

θ | D \sim N_{k} (μ, Σ)

, we have that

Σ^{- 1 / 2} (θ - μ) \sim N_{k} (0, I)

, and thus

{(θ - μ)}^{'} Σ^{- 1} (θ - μ) \sim χ_{k}^{2}

, where

χ_{k}^{2}

stands for the chi-squared distribution with k degrees of freedom. Therefore, if $q_{(1 - α)} (\cdot)$ is the $(1 - α) 100\%$ quantile function, the HPD is given by the following ellipsoid ([10], Result 4.7):

\{θ \in R^{k} : {(θ - μ)}^{'} Σ^{- 1} (θ - μ) \leq q_{(1 - α)} (χ_{k}^{2})\} .

(1)
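As a minimal sketch of Equation (1), the code below checks whether a point lies in the HPD ellipsoid for $k = 2$. The restriction to two dimensions is an assumption made only to keep the example dependency-free: for 2 degrees of freedom the chi-squared CDF is $1 - e^{- x / 2}$, so the quantile is available in closed form. All function names are hypothetical.

```python
import math

def chi2_quantile_2df(prob: float) -> float:
    """Quantile of the chi-squared distribution with 2 degrees of freedom.

    For k = 2 the CDF is 1 - exp(-x / 2), which inverts in closed form.
    """
    return -2.0 * math.log(1.0 - prob)

def mahalanobis_sq_2d(theta, mu, sigma) -> float:
    """(theta - mu)' Sigma^{-1} (theta - mu) for a 2 x 2 covariance matrix."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    u, v = theta[0] - mu[0], theta[1] - mu[1]
    # Sigma^{-1} = (1 / det) * [[d, -b], [-c, a]]
    return (d * u * u - (b + c) * u * v + a * v * v) / det

def in_hpd(theta, mu, sigma, alpha: float = 0.05) -> bool:
    """True when theta lies inside the (1 - alpha) HPD ellipsoid."""
    return mahalanobis_sq_2d(theta, mu, sigma) <= chi2_quantile_2df(1.0 - alpha)

# Illustrative posterior N_2(mu, Sigma).
mu = (0.0, 0.0)
Sigma = ((1.0, 0.5), (0.5, 2.0))
print(in_hpd((1.0, 1.0), mu, Sigma), in_hpd((3.0, -3.0), mu, Sigma))
```

A precise hypothesis is then rejected when none of its points passes `in_hpd`, which is the intersection criterion used in the rest of the paper.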

2.2. Gaussian Processes (GPs)

A GP is a nonparametric family of priors used to model functions in regression settings. The random function

g : X \to R

behaves according to a GP if

g (X) \sim N (m (X), K (X, X)) \text{ for every finite set of points } X \text{ in the covariates' domain,}

where

m (\cdot)

and

K (\cdot, \cdot)

, respectively, determine the mean and covariance of the process. When the response variable Y is such that

Y = g (x) + ϵ

for

ϵ \sim N (0, σ^{2})

, that is,

L (g, σ^{2} | y, X) = {(2 π σ^{2})}^{- n / 2} exp \{- \frac{1}{2 σ^{2}} \sum_{i = 1}^{n} {(y_{i} - g (x_{i}))}^{2}\},

then the GP is conjugate and its posterior is such that

\begin{matrix} g (X^{'}) & | y, X, σ^{2} \sim N (μ (X^{'}), Σ (X^{'}, X^{'})), \\ μ (X^{'}) & : = m (X^{'}) + K (X, X^{'}) {(K (X, X) + σ^{2} I)}^{- 1} (y - m (X)), \\ Σ (X^{'}, X^{'}) & : = K (X^{'}, X^{'}) - K (X, X^{'}) {(K (X, X) + σ^{2} I)}^{- 1} K (X^{'}, X) . \end{matrix}

The choices of m, K, and $σ^{2}$ reflect positions on the mean, smoothness, and variation surrounding the GP. In Section 4, we use the specifics of the application to choose them. For more general settings, one may assume that the uncertainty of m and K is reducible to a finite number of parameters. Then, one can either set priors on such parameters directly [11] or plug in point estimates for them based on the maximum partial likelihood [12].
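The conjugate update above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are made up, the mean and kernel are those later chosen in Section 4, and `K_cross` denotes the $m \times n$ cross-covariance between the new points and the observed ones.

```python
import numpy as np

def exp_kernel(a, b):
    """K(t1, t2) = exp(-|t1 - t2| / 2) for one-dimensional inputs."""
    return np.exp(-0.5 * np.abs(a[:, None] - b[None, :]))

def constant_mean(t, level=6.0):
    """Constant prior mean m(t) = level."""
    return np.full(len(t), level)

def gp_posterior(X, y, X_new, mean_fn, kernel_fn, sigma2):
    """Posterior mean vector and covariance matrix of g(X_new) given (y, X)."""
    A = kernel_fn(X, X) + sigma2 * np.eye(len(X))   # K(X, X) + sigma^2 I
    K_cross = kernel_fn(X_new, X)                   # m x n cross-covariance
    mu = mean_fn(X_new) + K_cross @ np.linalg.solve(A, y - mean_fn(X))
    Sigma = kernel_fn(X_new, X_new) - K_cross @ np.linalg.solve(A, K_cross.T)
    return mu, Sigma

# Hypothetical observations (not the data analysed in Section 4).
X = np.array([1.0, 2.0, 3.0])
y = np.array([5.5, 5.0, 4.4])
X_new = np.array([1.5, 2.5])
mu, Sigma = gp_posterior(X, y, X_new, constant_mean, exp_kernel, sigma2=0.01)
print(mu)
print(Sigma)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical convenience; the resulting quantities are the same $μ (X^{'})$ and $Σ (X^{'}, X^{'})$.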

Conditionally on

σ^{2}

, the HPD region of the GP can be analytically obtained for any finite set

X^{'} = {(x_{1}, x_{2}, \dots, x_{m})}^{'}

. Since the marginals of the posterior GP are also normally distributed, Equation (1) entails that the

(1 - α) 100 %

HPD region for

g (X^{'}) | y, X

is

\{h \in H : {(h (X^{'}) - μ (X^{'}))}^{'} Σ {(X^{'}, X^{'})}^{- 1} (h (X^{'}) - μ (X^{'})) \leq q_{(1 - α)} (χ_{m}^{2})\} .

(2)

It is also possible to obtain an HPD set for the GP without setting $X^{'}$. Let $P_{g}$ and $P_{g | y, X, σ^{2}}$, respectively, be the prior and posterior probability measures of the GP defined on a measurable space $(G, \mathcal{G})$. Hence,

P_{g | y, X, σ^{2}} (A) = \frac{\int_{A} L (h, σ^{2} | y, X) d P_{g} (h)}{\int_{G} L (h, σ^{2} | y, X) d P_{g} (h)}, \forall A \in \mathcal{G} .

Since

P_{g | y, X, σ^{2}} ≪ P_{g}

, the Radon–Nikodym derivative of the GP for

h \in G

is such that

\frac{d P_{g | y, X, σ^{2}}}{d P_{g}} (h) \propto exp (- \frac{1}{2 σ^{2}} \sum_{i = 1}^{n} {(y_{i} - h (x_{i}))}^{2}) = exp (- \frac{1}{2 σ^{2}} {(y - h (X))}^{'} (y - h (X))),

(3)

i.e., it suffices to evaluate h only on the values of $X$ in the sample. To account for repeated rows in $X$, let $X^{*}$ be the matrix with all unique observations from $X$ and $n^{*}$ be the number of rows of $X^{*}$. Defining $D_{n_{X^{*}}}$ as a diagonal matrix that counts how many times each $x \in X^{*}$ appears in $X$ and ${\bar{y}}_{X^{*}}$ as the vector of means of all elements of $y$ related to each $x$,

\frac{d P_{g | y, X, σ^{2}}}{d P_{g}} (h) \propto exp (- \frac{1}{2 σ^{2}} {(h (X^{*}) - {\bar{y}}_{X^{*}})}^{'} D_{n_{X^{*}}} (h (X^{*}) - {\bar{y}}_{X^{*}})) .

(4)

Thus, for a constant $c_{α}$, a Weighted Residual Sum of Squares (WRSS) defines the HPD:

HPD_{(1 - α)} = \{h \in G : WRSS (h) \leq c_{α}\}, \quad WRSS (h) := {(h (X^{*}) - {\bar{y}}_{X^{*}})}^{'} D_{n_{X^{*}}} (h (X^{*}) - {\bar{y}}_{X^{*}}) .

(5)
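The step from Equation (3) to Equation (4) rests on the fact that the residual sum of squares and the WRSS differ only by a term that does not depend on $h$ (the within-group variation of $y$). The sketch below, with hypothetical data and function names, builds $X^{*}$, the counts and the group means, and checks that fact numerically for two candidate functions.

```python
from collections import defaultdict

def unique_summary(x, y):
    """Return the unique covariate values, their counts and group means of y."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    x_star = sorted(groups)
    counts = [len(groups[v]) for v in x_star]           # diagonal of D_{n_{X*}}
    y_bar = [sum(groups[v]) / len(groups[v]) for v in x_star]
    return x_star, counts, y_bar

def wrss(h, x_star, counts, y_bar):
    """Weighted residual sum of squares of Equation (5)."""
    return sum(n * (h(v) - m) ** 2 for v, n, m in zip(x_star, counts, y_bar))

def rss(h, x, y):
    """Plain residual sum of squares appearing in Equation (3)."""
    return sum((yi - h(xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical sample with repeated covariate values.
x = [0.5, 0.5, 1.0, 1.5, 1.5, 1.5]
y = [5.9, 6.1, 5.6, 5.2, 5.4, 5.3]
x_star, counts, y_bar = unique_summary(x, y)

h1 = lambda t: 6.2 - 0.6 * t      # a linear candidate
h2 = lambda t: 6.0 - 0.2 * t * t  # a nonlinear candidate
gap1 = rss(h1, x, y) - wrss(h1, x_star, counts, y_bar)
gap2 = rss(h2, x, y) - wrss(h2, x_star, counts, y_bar)
print(gap1, gap2)  # the two gaps agree up to rounding
```

Because the gap is constant in $h$, ranking functions by the Radon–Nikodym derivative is the same as ranking them by WRSS, which is why the HPD takes the form in Equation (5).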

2.3. Pragmatic Hypotheses

The pragmatic hypothesis enlarges

H_{0}

to a set deemed as practically equivalent. The implementation uses the notion of negligible deviations from

H_{0}

. The degree to which the hypothesis is enlarged depends on the choice of a threshold

ε

, and factors such as the scale of measurement errors or expert’s utility judgments could help set it (see Section 4 for a practical example and (Lassance et al. [9], Section 4) for suggestions). Formally, for a hypothesis space

H

, let

d (\cdot, \cdot)

be the dissimilarity function from which one can express how much of a departure from

H_{0}

is reasonable. Then, the pragmatic hypothesis is given by

P g (H_{0}, d, ε) : = ⋃_{h_{0} \in H_{0}} \{h \in H : d (h_{0}, h) \leq ε\} = \{h \in H : inf_{h_{0} \in H_{0}} d (h_{0}, h) \leq ε\},

(6)

that is, the pragmatic hypothesis contains all elements

h \in H

such that, for some element

h_{0} \in H_{0}

,

d (h_{0}, h) \leq ε

. In this work, we assume that

H = G

is a space of functions of the type

h : X \to R

. Further specifications on

H

are presented in Section 3. When

d (\cdot, \cdot)

and

ε

are implicit, we use

P g (H_{0})

to denote the pragmatic hypothesis.

3. Results

Throughout this work, we use the modeling assumptions in Section 2.2 for the data

(y, X)

and the regression function

g (\cdot)

, and assume that the hypothesis of interest is

H_{0} : g (x) = b (x) β, \forall x \in X, β \in R^{k},

(7)

where

b (x) = (b_{1} (x), b_{2} (x), \dots, b_{k} (x)) \subset H

is a linearly independent set of linear functions and

X

is the covariates’ domain. The choice of

b

determines the test performed, such as evaluating linear models (

b (x) : = x

) or doing variable selection (

b (x) : = x_{- i}

).

Our findings are divided into two settings: those applicable to

H_{0}

and those to

P g (H_{0})

. In both cases, we explore when

X

is a finite or an infinite set. The finite case provides a closed-form solution for the FBST of

H_{0}

and a solution for the pragmatic hypothesis that requires a univariate optimization procedure. When

X

is infinite, testing

H_{0}

or

P g (H_{0})

also requires determining

c_{α}

in the HPD of Equation (5), which is achieved by noting that the WRSS can be expressed as a linear combination of noncentral chi-squared random variables; therefore, $c_{α}$ is the $(1 - α)$ quantile of a generalized chi-squared distribution [13].
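The constant $c_{α}$ is the $(1 - α)$ posterior quantile of $WRSS (g)$ when $g (X^{*}) | y, X \sim N (μ (X^{*}), Σ (X^{*}, X^{*}))$. The sketch below approximates it by plain Monte Carlo, which is a swapped-in technique chosen for brevity and not the algorithm of [13]; the function name, the inputs and the number of draws are all assumptions.

```python
import numpy as np

def wrss_quantile_mc(mu, Sigma, y_bar, counts, alpha=0.05,
                     n_draws=100_000, seed=0):
    """Monte Carlo estimate of c_alpha, the (1 - alpha) quantile of WRSS(g)."""
    rng = np.random.default_rng(seed)
    g = rng.multivariate_normal(mu, Sigma, size=n_draws)  # draws of g(X*)
    resid = g - y_bar
    wrss = (resid ** 2 * counts).sum(axis=1)               # resid' D resid
    return float(np.quantile(wrss, 1.0 - alpha))

# Hypothetical posterior summaries at three unique covariate values.
mu = np.array([5.9, 5.5, 5.2])
Sigma = np.array([[0.020, 0.010, 0.005],
                  [0.010, 0.020, 0.010],
                  [0.005, 0.010, 0.020]])
y_bar = np.array([6.0, 5.6, 5.3])
counts = np.array([2.0, 1.0, 3.0])
c_alpha = wrss_quantile_mc(mu, Sigma, y_bar, counts)
print(c_alpha)
```

As a sanity check, when the posterior mean equals ${\bar{y}}_{X^{*}}$, the covariance is the identity and all counts are 1, the WRSS is a central chi-squared variable and the estimate should be close to the usual chi-squared quantile.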

Theorem 1 (FBST of the linear model hypothesis).

Let

H_{0}

be the hypothesis in Equation (7) and

g (\cdot) | y, X \sim G P (μ (\cdot), Σ (\cdot, \cdot))

. Then,

1. When $X$ is a finite set, the FBST does not reject $H_{0}$ if and only if

${(b (X) \hat{β} - μ (X))}^{'} Σ {(X, X)}^{- 1} (b (X) \hat{β} - μ (X)) \leq q_{(1 - α)} (χ_{| X |}^{2}),$

where $\hat{β} = {(b {(X)}^{'} Σ {(X, X)}^{- 1} b (X))}^{- 1} b {(X)}^{'} Σ {(X, X)}^{- 1} μ (X)$ and $| X |$ is the size of $X$.
2. When $X$ is an infinite set, the FBST does not reject $H_{0}$ if and only if

${\bar{y}}_{X^{*}}^{'} M {\bar{y}}_{X^{*}} \leq c_{α}, \quad M = D_{n_{X^{*}}} - D_{n_{X^{*}}} b (X^{*}) {(b {(X^{*})}^{'} D_{n_{X^{*}}} b (X^{*}))}^{- 1} b {(X^{*})}^{'} D_{n_{X^{*}}} .$
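Both test statistics in Theorem 1 are short matrix computations. The sketch below is an illustration with hypothetical inputs: `B` stands for the design matrix $b (X)$ or $b (X^{*})$, here assumed to contain an intercept column and the covariate, and the resulting statistic still has to be compared with the chi-squared quantile (finite case) or with $c_{α}$ (infinite case).

```python
import numpy as np

def finite_statistic(B, mu, Sigma):
    """Left-hand side of the finite-case condition and the minimiser beta_hat."""
    Si_B = np.linalg.solve(Sigma, B)                       # Sigma^{-1} b(X)
    beta_hat = np.linalg.solve(B.T @ Si_B, Si_B.T @ mu)    # generalised LS
    r = B @ beta_hat - mu
    return float(r @ np.linalg.solve(Sigma, r)), beta_hat

def infinite_statistic(B, y_bar, counts):
    """y_bar' M y_bar with M built from the diagonal matrix of counts."""
    D = np.diag(counts)
    M = D - D @ B @ np.linalg.solve(B.T @ D @ B, B.T @ D)
    return float(y_bar @ M @ y_bar)

# Hypothetical inputs on four covariate values.
t = np.array([0.5, 1.0, 1.5, 2.0])
B = np.column_stack([np.ones_like(t), t])
Sigma = 0.05 * np.exp(-0.5 * np.abs(t[:, None] - t[None, :]))
mu_curved = 6.0 - 0.4 * t + 0.1 * t ** 2
counts = np.array([2.0, 1.0, 3.0, 1.0])

stat_finite, beta_hat = finite_statistic(B, mu_curved, Sigma)
stat_infinite = infinite_statistic(B, mu_curved, counts)
print(stat_finite, beta_hat, stat_infinite)
```

If the posterior mean (or the vector of group means) is exactly linear in the columns of `B`, both statistics are zero up to rounding, so the hypothesis intersects the HPD for any $α$.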

Before presenting the FBST for the pragmatic version of Equation (7), we specify

H

and provide the infimum when the dissimilarity function in Equation (6) is the

L^{2}

distance in the probability space of

X

. The hypothesis space

H

is such that

h \in H ⟺ E_{X} (h^{2}) = \int_{X} h {(x)}^{2} d P (x) < \infty .

(8)

As for the infimum, it is described in the following Lemma:

Lemma 1 (Infimum of the dissimilarity on the linear model set).

Let Equation (8) denote the hypothesis space and

H_{0}

be the hypothesis in Equation (7). If

d (h_{0}, h) : = \sqrt{E_{X} [{(h_{0} - h)}^{2}]}

, then

d (H_{0}, h) = d (b \times {\tilde{β}}_{h}, h), \forall h \in H

, where

\begin{matrix} {\tilde{β}}_{h} = A_{b}^{- 1} \times h_{b}, A_{b} & = (\begin{matrix} E [b_{1}^{2} (X)] & E [b_{2} (X) b_{1} (X)] & \dots & E [b_{k} (X) b_{1} (X)] \\ E [b_{1} (X) b_{2} (X)] & E [b_{2}^{2} (X)] & \dots & E [b_{k} (X) b_{2} (X)] \\ ⋮ & ⋮ & ⋱ & ⋮ \\ E [b_{1} (X) b_{k} (X)] & E [b_{2} (X) b_{k} (X)] & \dots & E [b_{k}^{2} (X)] \end{matrix}), \\ h_{b}^{'} & = (E [h (X) b_{1} (X)], E [h (X) b_{2} (X)], \dots, E [h (X) b_{k} (X)]) . \end{matrix}
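When the distribution of $X$ has finite support, the expectations in Lemma 1 become weighted sums and ${\tilde{β}}_{h}$ is a weighted least squares projection. The sketch below assumes a discrete uniform distribution on 15 time points and the basis $b (t) = (1, t)$; the candidate function and all names are hypothetical.

```python
import numpy as np

def l2_projection(B, h_vals, probs):
    """Return beta_tilde = A_b^{-1} h_b and the distance d(H0, h)."""
    D = np.diag(probs)
    A_b = B.T @ D @ B          # entries E[b_i(X) b_j(X)]
    h_b = B.T @ D @ h_vals     # entries E[h(X) b_i(X)]
    beta = np.linalg.solve(A_b, h_b)
    resid = h_vals - B @ beta
    return beta, float(np.sqrt(resid @ D @ resid))

t = np.arange(0.0, 7.5, 0.5)                 # 0, 0.5, ..., 7 (15 points)
probs = np.full(len(t), 1.0 / len(t))        # discrete uniform P(x)
B = np.column_stack([np.ones_like(t), t])    # b(t) = (1, t)

h_vals = 6.0 - 0.4 * t + 0.05 * np.sin(t)    # hypothetical nonlinear h
beta_tilde, dist = l2_projection(B, h_vals, probs)
print(beta_tilde, dist)
```

A function $h$ belongs to the pragmatic hypothesis of Equation (6) exactly when the returned distance is at most $ε$.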

Theorem 2 (FBST of the pragmatic version of

H_{0}

).

Let

H

be given by Equation (8) and define

d (h_{0}, h) : = \sqrt{E_{X} [{(h_{0} - h)}^{2}]}

. Assume that

\sum_{x \in X^{*}} P (x) > 0

. Then,

1. When $X$ is a finite set, the FBST does not reject $P g (H_{0})$ if and only if

$\exists s \in (0, 1) : 1 - μ {(X)}^{'} (\frac{ε^{2}}{1 - s} N^{- 1} + \frac{1}{s} Σ (X, X) q_{(1 - α)} (χ_{| X |}^{2})) μ (X) < 0,$

where $N : = D_{P (X)} [I - b (X) {(b {(X)}^{'} D_{P (X)} b (X))}^{- 1} b {(X)}^{'} D_{P (X)}]$ and $D_{P (X)}$ is a diagonal matrix formed by the vector $P (x), x \in X$.
2. When $X$ is an infinite set, the FBST does not reject $P g (H_{0})$ if and only if

$\exists s \in (0, 1) : 1 - {\bar{y}}_{X^{*}}^{'} (\frac{ε^{2}}{1 - s} M^{- 1} + \frac{1}{s} D_{n_{X^{*}}} c_{α}) {\bar{y}}_{X^{*}} < 0,$

where $M : = D_{P (X^{*})} [I - b (X^{*}) {(b {(X^{*})}^{'} D_{P (X^{*})} b (X^{*}))}^{- 1} b {(X^{*})}^{'} D_{P (X^{*})}]$ and $D_{P (X^{*})}$ is a diagonal matrix formed by the vector $P (x), x \in X^{*}$.

In the infinite case of

X

in Theorem 2, an appealing choice for the distribution of

X

is based on the posterior of a Dirichlet process [14]. For a concentration parameter

τ

and a centering distribution

π

, set a Dirichlet process

P \sim D P (τ, π)

such that

X | P \sim P

. Then,

P | x \sim D P (τ + n, \frac{τ π + \sum_{i = 1}^{n} δ_{x_{i}}}{τ + n}) ⟹ P (X_{n e w} \in A | X = x) = \frac{τ π (A)}{τ + n} + \frac{\sum_{i = 1}^{n} I (x_{i} \in A)}{τ + n},

where

δ_{x_{i}} (A) = I (x_{i} \in A)

for

A \subseteq X

. With this choice, one can ensure that positive probability will always be assigned to all

x \in X^{*}

. Moreover,

τ

can leverage the weight of the prior on the FBST, with higher values of

τ

leading to a higher chance of not rejecting

P g (H_{0})

.
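The predictive probability above is a weighted mix of the centering distribution and the empirical distribution of the sample. The sketch below evaluates it for intervals, assuming a continuous uniform centering distribution (as in the application of Section 4); the function name and the sample are hypothetical.

```python
def dp_predictive_prob(a_lo, a_hi, x_obs, tau=1.0, lo=0.0, hi=7.0):
    """P(X_new in [a_lo, a_hi] | x) when P ~ DP(tau, U(lo, hi))."""
    n = len(x_obs)
    overlap = max(0.0, min(a_hi, hi) - max(a_lo, lo))
    pi_A = overlap / (hi - lo)                         # pi(A) under U(lo, hi)
    count = sum(a_lo <= xi <= a_hi for xi in x_obs)    # sum of I(x_i in A)
    return (tau * pi_A + count) / (tau + n)

# Hypothetical sample: one observation at each of 0.5, 1.0, ..., 7.0.
x_obs = [0.5 * i for i in range(1, 15)]

p_point = dp_predictive_prob(0.5, 0.5, x_obs)   # a single observed value
p_all = dp_predictive_prob(0.0, 7.0, x_obs)     # the whole domain
print(p_point, p_all)
```

Every observed value receives mass at least $1 / (τ + n)$, while raising `tau` shifts weight from the observed points toward the centering distribution.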

4. Application: Water Droplet Experiment

The dataset from Duguid [15] provides a setting where small water droplets (ranging from 3 to 9 μm) fall freely through a tube that keeps factors such as temperature and humidity constant. As a droplet falls, a camera takes a picture every 0.5 s, ceasing activity after 7 s. One of the objectives of the study was to evaluate Fick’s law, which in this setting implies that, when time is a covariate, the decrease in radius of the droplet can be described through a linear model. The two hypotheses of interest are

\begin{cases} H_{0}^{1} : g (t) = β_{0} + β_{1} t, \ \forall t \in \{0 \text{ s}, 0.5 \text{ s}, \dots, 7 \text{ s}\}, \ (β_{0}, β_{1}) \in R^{2}; \\ H_{0}^{2} : g (t) = β_{0}, \ \forall t \in \{0 \text{ s}, 0.5 \text{ s}, \dots, 7 \text{ s}\}, \ β_{0} \in R; \end{cases}

with the first hypothesis testing the validity of Fick’s law for this case and the second one verifying if time can be removed as a covariate.

We use the GP in Section 2.2 to model the data, with the following prior settings:

σ^{2} = 0.01, \quad m (t) = \frac{3 + 9}{2} = 6, \ \forall t \in X, \quad K (t_{1}, t_{2}) = \exp \{- \frac{1}{2} \| t_{1} - t_{2} \|_{2}\}, \ (t_{1}, t_{2}) \subset X .

As shown in Figure 2a, this choice leads to functions that obey the 3–9 μm restriction without becoming too restrictive as a consequence. In Figure 2b, we observe that the posterior draws resemble a linear model except at $t = 0$, due to the missing observation.
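One way to read the prior settings numerically: since $K (t, t) = 1$, the prior marginal of $g (t)$ is $N (6, 1)$ at every $t$, so three prior standard deviations around the mean span exactly the 3–9 μm range. The sketch below builds the kernel matrix on the time grid and reports that band; the variable names are hypothetical and the three-standard-deviation reading is our interpretation, not a statement from the original analysis.

```python
import math

PRIOR_MEAN = (3 + 9) / 2          # m(t) = 6 for every t

def kernel(t1: float, t2: float) -> float:
    """K(t1, t2) = exp(-|t1 - t2| / 2), the covariance chosen above."""
    return math.exp(-0.5 * abs(t1 - t2))

times = [0.5 * i for i in range(15)]                  # 0, 0.5, ..., 7 s
K = [[kernel(a, b) for b in times] for a in times]    # 15 x 15 prior covariance

prior_sd = math.sqrt(K[0][0])                         # equals 1 at every t
band = (PRIOR_MEAN - 3 * prior_sd, PRIOR_MEAN + 3 * prior_sd)
adjacent_corr = K[0][1]                               # correlation 0.5 s apart
print(band, round(adjacent_corr, 4))
```

The correlation between radii half a second apart is $e^{- 0.25}$, which keeps prior draws fairly smooth while still allowing clear departures from a straight line.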

Figure 2. GP draws of the (a) prior and (b) posterior for the water droplet data. The colored curves represent each draw, the black dots are the observed data and the dashed line is the least squares estimate of the linear model.

In Table 1, we present the e-value for both hypotheses of interest, assuming that

X

is either finite or infinite. Since small e-values provide strong evidence against

H_{0}

, with

α = 0.05

we conclude that both

H_{0}^{1}

and

H_{0}^{2}

should be rejected, i.e., Fick’s law would fail.

Table 1. e-value of

H_{0}

under finite and infinite

X

for the water droplet experiment.

While this analysis shows that Fick’s law is not exactly valid, it might still provide an adequate approximation, motivating the use of pragmatic hypotheses. This requires setting the threshold

ε

, which is detailed below.

In the original experiment, the radius of the droplets was obtained indirectly through Stokes’ law, that is,

V_{T} (t) = \frac{g {(t)}^{2}}{K_{s}} ⟹ g (t) = \sqrt{V_{T} (t) \times K_{s}},

(9)

where

V_{T}

is the terminal velocity and

K_{s} = 8.446

. Since the mean velocity (

V_{M}

) was used in (9) instead of

V_{T}

, there are two sources of measurement error: the estimate of

V_{M}

(maximum error of

δ = 0.14

) and switching

V_{T}

for

V_{M}

in (9) (maximum error of

η = 0.3555

([9], Example 1.3)). We conclude that the margin of error of the radius is

\begin{aligned} ϵ :&= \max_{t \in T} | g (t) - y (t) | = \max_{t \in T} \{|\sqrt{K_{s} V_{T} (t)} - y (t)|\} \\ &= \max_{t \in T} \{|\sqrt{K_{s} (V_{M} (t) - δ - η)} - y (t)|, |\sqrt{K_{s} (V_{M} (t) + δ + η)} - y (t)|\} \approx 0.6218 . \end{aligned}

While $ϵ$ relates to the $l_{\infty}$ distance, Lemma 1 uses the $l_{2}$ distance. To obtain an estimate of the latter from the former, we use Proposition 6.11 of Folland [16], which implies that

\sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y (x_{i}) - g (x_{i}))}^{2}} \leq \sqrt{\frac{1}{n}} ϵ ⟹ \max_{i \in \{1, 2, \dots, n\}} | y (x_{i}) - g (x_{i}) | \leq ϵ,

thus $ε \approx 0.6218 / \sqrt{15} \approx 0.1606$.
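The two computations in this passage, the radius from Equation (9) and the conversion of the sup-norm margin into an $l_{2}$ threshold, are simple enough to sketch directly. The function names are hypothetical; the constants $K_{s} = 8.446$, $ϵ \approx 0.6218$ and $n = 15$ are those stated in the text.

```python
import math

def stokes_radius(velocity: float, k_s: float = 8.446) -> float:
    """Radius implied by Equation (9): g = sqrt(V * K_s)."""
    return math.sqrt(velocity * k_s)

def l2_threshold(sup_margin: float, n: int) -> float:
    """Threshold epsilon = margin / sqrt(n) used for the pragmatic hypothesis."""
    return sup_margin / math.sqrt(n)

eps = l2_threshold(0.6218, 15)
print(eps)   # close to the value 0.1606 reported in the text

# Round trip on a hypothetical velocity: V = g^2 / K_s recovers the input.
v = 4.0
print(stokes_radius(v) ** 2 / 8.446)
```

Dividing by $\sqrt{n}$ makes the $l_{2}$ threshold conservative: any function within $ε$ of a linear model in root-mean-square terms deviates from it by at most $ϵ$ at every observed time.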

Table 2 presents the e-values for the pragmatic hypotheses

P g (H_{0}^{i}, d, 0.1606)

,

i \in {1, 2}

. We assume either that

X = {0, 0.5, 1, \dots, 7}

(original setting, discrete uniform) or that

t | P \sim P

, with

P \sim D P (1, U (0, 7))

(continuous uniform as centering distribution). Contrary to Table 1, the first hypothesis is not rejected, demonstrating that Fick’s law provides a good approximation of the phenomenon.

Table 2. e-value of

P g (H_{0}, d, 0.1606)

under finite and infinite

X

for the water droplet experiment.

5. Discussion

Regarding the results of the application (Section 4), we believe we have demonstrated the importance of using pragmatic hypotheses whenever reasonable. While choosing

ε

is not a simple task in nonparametric settings, there are strategies available for deriving it [9]. Furthermore, while the e-value is not a measure of evidence against

H_{0}^{c}

[3], combining it with a pragmatic hypothesis allows one to perform the Generalized FBST (GFBST, [17]), which can discriminate “evidence of absence” from “absence of evidence”, along with many other desirable properties.

One of the main limitations of this work is in the strategy of performing variable selection. While the aforementioned GFBST allows for multiple testing without the necessity of correcting

α

, variable selection is only possible through Equation (7) if the linear model hypothesis is not rejected. Therefore, one future research direction is developing tests that evaluate conditional independence without assuming a specific functional form for the relationship between variables.

Author Contributions

Conceptualization, all authors; methodology, R.F.L.L. and R.B.S.; software, R.F.L.L.; validation, R.F.L.L.; formal analysis, R.F.L.L.; investigation, J.M.S.; resources, R.F.L.L.; data curation, R.F.L.L.; writing—original draft preparation, R.F.L.L.; writing—review and editing, all authors; visualization, R.F.L.L.; supervision, R.B.S. and J.M.S.; project administration, R.B.S.; funding acquisition, Interinstitutional Graduate Program in Statistics UFSCar-USP. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001. This research was funded by FAPESP (grants 2019/11321-9 and CEPID CeMEAI 2013/07375-0) and CNPq (grants 309607/2020-5, 422705/2021-7 and PQ 303290/2021-8).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data and the functions used in this study are available in GitHub at https://github.com/rflassance/lmFBST (accessed on 24 June 2025). These data were derived from the following resource available in the public domain: https://scholarsmine.mst.edu/cgi/viewcontent.cgi?params=/context/masters_theses/article/6294 (page 42, accessed on 15 May 2024).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

FBST	Full Bayesian Significance Test
GFBST	Generalized Full Bayesian Significance Test
GP	Gaussian Process
HPD	Highest Posterior Density
WRSS	Weighted Residual Sum of Squares

Appendix A. Proofs

Proof of Theorem 1.

The proof is done in parts:

Finite

X

.

Since

X

is finite, Equation (2) is the HPD. Therefore, if

\exists β \in R^{k}

such that

{(b (X) β - μ (X))}^{'} Σ {(X, X)}^{- 1} (b (X) β - μ (X)) \leq q_{(1 - α)} (χ_{| X |}^{2}),

(A1)

then the FBST does not reject

H_{0}

. Differentiating the left-hand side of (A1) with respect to

β

, we observe that

\hat{β}

minimizes such expression. ■

Infinite

X

.

In this case, the FBST does not reject the hypothesis iff

\exists β \in R^{k}

such that

W R S S (b \times β) \leq c_{α}

. If

\tilde{β} = {(b {(X^{*})}^{'} D_{n_{X^{*}}} b (X^{*}))}^{- 1} b {(X^{*})}^{'} D_{n_{X^{*}}} {\bar{y}}_{X^{*}}

, this is equivalent to not rejecting

H_{0}

iff

{({\bar{y}}_{X^{*}} - b (X^{*}) \tilde{β})}^{'} D_{n_{X^{*}}} ({\bar{y}}_{X^{*}} - b (X^{*}) \tilde{β}) = {\bar{y}}_{X^{*}}^{'} M {\bar{y}}_{X^{*}} \leq c_{α},

since

\tilde{β}

is the weighted least squares estimate of

β

. ■

□

Proof of Lemma 1.

The proof is found in (Lassance et al. [9], Theorem 2). □

Proof of Theorem 2.

The proof is done in parts:

Finite

X

.

Lemma 1 implies that

{\tilde{β}}_{h} = {(b {(X)}^{'} D_{P (X)} b (X))}^{- 1} b {(X)}^{'} D_{P (X)} h (X),

and thus

\begin{matrix} d (H_{0}, h (X)) & = \sqrt{\sum_{x \in X} P (x) {(h (x) - b {(x)}^{'} {\tilde{β}}_{h})}^{2}} \\ = \sqrt{{(h (X) - b (X) {\tilde{β}}_{h})}^{'} D_{P (X)} (h (X) - b (X) {\tilde{β}}_{h})} = \sqrt{h {(X)}^{'} N h (X)} \leq ε . \end{matrix}

Since the HPD is given by Equation (2), the FBST does not reject

P g (H_{0})

if and only if

\{\begin{matrix} h \in H : d (H_{0}, h (X)) = \sqrt{h {(X)}^{'} N h (X)} \leq ε \\ h \in H : {(h (X) - μ (X))}^{'} Σ {(X, X)}^{- 1} (h (X) - μ (X)) \leq q_{1 - α} (χ_{| X |}^{2}) \end{matrix}

(A2)

are intersecting ellipsoids. From Proposition 2 of Gilitschenski and Hanebeck [18], the ellipsoids intersect if and only if

\exists s \in (0, 1) : 1 - μ {(X)}^{'} (\frac{ε^{2}}{1 - s} N^{- 1} + \frac{1}{s} Σ (X, X) q_{(1 - α)} (χ_{| X |}^{2})) μ (X) < 0 .

■

Infinite

X

.

The FBST does not reject

P g (H_{0})

if

inf_{h \in H P D} d {(H_{0}, h)}^{2} = inf_{h \in H P D} inf_{h_{0} \in H_{0}} d {(h_{0}, h)}^{2} = inf_{h_{0} \in H_{0}} inf_{h \in H P D} \int_{X} {(h_{0} (x) - h (x))}^{2} d P_{X} (x) \leq ε^{2} .

(A3)

The ellipsoid

G : = {z \in R^{n^{*}} : {(z - {\bar{y}}_{X^{*}})}^{'} D_{n_{X^{*}}} (z - {\bar{y}}_{X^{*}}) \leq c_{α}}

is such that, for any function

h (\cdot)

where

\exists z \in G : h (X^{*}) = z

, we can conclude that

h \in H P D

. Therefore, the

H P D

contains functions that are linear outside of

X^{*}

, and thus

\begin{aligned} \inf_{h_{0} \in H_{0}} \inf_{h \in HPD} \int_{X} {(h_{0} (x) - h (x))}^{2} d P_{X} (x) &= \inf_{h_{0} \in H_{0}} \inf_{z \in G} \sum_{i = 1}^{n^{*}} {(z_{i} - h_{0} (X_{i, .}^{*}))}^{2} P (X_{i, .}^{*}) \\ &= \inf_{z \in G} \inf_{h_{0} \in H_{0}} {(z - h_{0} (X^{*}))}^{'} D_{P (X^{*})} (z - h_{0} (X^{*})) \\ &= \inf_{z \in G} {(z - b (X^{*}) {\hat{β}}_{z})}^{'} D_{P (X^{*})} (z - b (X^{*}) {\hat{β}}_{z}), \end{aligned}

where

{\hat{β}}_{z} = {(b {(X^{*})}^{'} D_{P (X^{*})} b (X^{*}))}^{- 1} b {(X^{*})}^{'} D_{P (X^{*})} z

, thus

{inf}_{h \in H P D} d {(H_{0}, h)}^{2} = {inf}_{z \in G} z^{'} M z

. Therefore, the FBST does not reject $P g (H_{0})$ if the ellipsoids $G$ and $\{z \in R^{n^{*}} : z^{'} M z \leq ε^{2}\}$ intersect, which can be verified through Proposition 2 of Gilitschenski and Hanebeck [18]. ■

□

References

1. Kershaw, J.; Kashikura, K.; Zhang, X.; Abe, S.; Kanno, I. Bayesian technique for investigating linearity in event-related BOLD fMRI. Magn. Reson. Med. 2001, 45, 1081–1094.
2. Spiegelhalter, D.J.; Best, N.G.; Carlin, B.P.; Van Der Linde, A. Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2002, 64, 583–639.
3. de Bragança Pereira, C.A.; Stern, J.M.; Wechsler, S. Can a significance test be genuinely Bayesian? Bayesian Anal. 2008, 3, 79–100.
4. Corrêa Filho, F.P.T. Nonparametric Tests for Pólya Trees: Non-Parametric Versions of the FBST. Master’s Thesis, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil, 2024.
5. Liu, Z.; Li, Z.; Wang, J.; He, Y. Full Bayesian Significance Testing for Neural Networks. arXiv 2024, arXiv:2401.13335.
6. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; The MIT Press: Cambridge, MA, USA, 2006.
7. Esteves, L.G.; Izbicki, R.; Stern, J.M.; Stern, R.B. Pragmatic Hypotheses in the Evolution of Science. Entropy 2019, 21, 883.
8. Mulder, J. Bayesian Testing of Linear Versus Nonlinear Effects Using Gaussian Process Priors. Am. Stat. 2023, 77, 1–11.
9. Lassance, R.F.; Izbicki, R.; Stern, R.B. Adding imprecision to hypotheses: A Bayesian framework for testing practical significance in nonparametric settings. Int. J. Approx. Reason. 2025, 178, 109332.
10. Johnson, R.; Wichern, D. Applied Multivariate Statistical Analysis, 6th ed.; Prentice Hall India Learning Private Limited: Delhi, India, 2012; p. 163.
11. Oakley, J. Eliciting Gaussian process priors for complex computer codes. J. R. Stat. Soc. Ser. D (Stat.) 2002, 51, 81–97.
12. Wang, X.; Berger, J.O. Estimating Shape Constrained Functions Using Gaussian Processes. SIAM/ASA J. Uncertain. Quantif. 2016, 4, 1–25.
13. Davies, R.B. Algorithm AS 155: The Distribution of a Linear Combination of χ² Random Variables. Appl. Stat. 1980, 29, 323.
14. Ferguson, T.S. A Bayesian Analysis of Some Nonparametric Problems. Ann. Stat. 1973, 1, 209–230.
15. Duguid, H.A. A Study of the Evaporation Rates of Small Freely Falling Water Droplets. Master’s Thesis, Missouri University of Science and Technology, Rolla, MO, USA, 1969.
16. Folland, G. Real Analysis: Modern Techniques and Their Applications, 2nd ed.; Pure and Applied Mathematics: A Wiley Series of Texts, Monographs and Tracts; Wiley: Hoboken, NJ, USA, 1999; p. 186.
17. Esteves, L.G.; Izbicki, R.; Stern, J.M.; Stern, R.B. Logical coherence in Bayesian simultaneous three-way hypothesis tests. Int. J. Approx. Reason. 2023, 152, 297–309.
18. Gilitschenski, I.; Hanebeck, U.D. A robust computational test for overlap of two arbitrary-dimensional ellipsoids in fault-detection of Kalman filters. In Proceedings of the 2012 15th International Conference on Information Fusion, Singapore, 9–12 July 2012; pp. 396–401.


Table 1. e-value of $H_{0}$ under finite and infinite $X$ for the water droplet experiment.

	Hypothesis
Assumption on $t$	$H_{0}^{1} : g (t) = β_{0} + β_{1} t$	$H_{0}^{2} : g (t) = β_{0}$
t is discrete and finite	0.0446	0
t is continuous	0.0068	0

Table 2. e-value of $P g (H_{0}, d, 0.1606)$ under finite and infinite $X$ for the water droplet experiment.

	Original Hypothesis
Assumption on $t$	$H_{0}^{1} : g (t) = β_{0} + β_{1} t$	$H_{0}^{2} : g (t) = β_{0}$
$t \in \{0, 0.5, 1, \dots, 7\}$	1	0
$t | P \sim P, \ P \sim D P (1, U (0, 7))$	1	0

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
