Open Access
This article is

- freely available
- re-usable

*Entropy*
**2019**,
*21*(9),
883;
https://doi.org/10.3390/e21090883

Article

Pragmatic Hypotheses in the Evolution of Science

^{1}

Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo 05508-090, Brazil

^{2}

Departamento de Estatística, Universidade Federal de São Carlos, São Paulo 13565-905, Brazil

^{*}

Author to whom correspondence should be addressed.

Received: 1 August 2019 / Accepted: 10 September 2019 / Published: 11 September 2019

## Abstract

**:**

This paper introduces pragmatic hypotheses and relates this concept to the spiral of scientific evolution. Previous works determined a characterization of logically consistent statistical hypothesis tests and showed that the modal operators obtained from this test can be represented in the hexagon of oppositions. However, despite the importance of precise hypothesis in science, they cannot be accepted by logically consistent tests. Here, we show that this dilemma can be overcome by the use of pragmatic versions of precise hypotheses. These pragmatic versions allow a level of imprecision in the hypothesis that is small relative to other experimental conditions. The introduction of pragmatic hypotheses allows the evolution of scientific theories based on statistical hypothesis testing to be interpreted using the narratological structure of hexagonal spirals, as defined by Pierre Gallais.

Keywords:

hypothesis tests; precise hypotheses; pragmatic hypotheses## 1. Introduction

Standard hypothesis tests can lead to immediate logical incoherence, which makes their conclusions hard to interpret. This incoherence occurs because such tests have only two possible outcomes. Indeed, Izbicki and Esteves [1] shows that there exists no two-valued test that satisfies desirable statistical properties and is also logically coherent.

In order to overcome such an impossibility result, Esteves et al. [2] propose agnostic hypothesis tests, which have three possible outputs: (A) accept the hypothesis, say H, (E) reject H, or (Y) remain agnostic about H. These tests can be made logically coherent while preserving desirable statistical properties. For instance, both conditions are satisfied by the Generalized Full Bayesian Significance Test (GFBST). Furthermore, Stern et al. [3] show that the GFBST’s modal operators and their respective negations can be represented by vertices of the hexagon of oppositions [4,5,6,7,8,9], which is depicted in Figure 1.

This paper complements the above static representation with an analysis of the GFBST in the dynamic evolution of scientific theories. The analysis is based on the metaphor of evolutive hexagonal spirals [10,11], in which the logical modalities associated to scientific theories change over time, as in Figure 2. Our key point in this paradigm is reconciling two apparently contradictory facts. On the one hand, precise or sharp hypotheses, that is, hypotheses that have a priori zero probability are central in scientific theories [12,13]. On the other hand, the GFBST never accepts (A) precise hypotheses. These observations lead to the apparent paradox that, if the GFBST were used to test scientific theories, then the acceptance step in the spiral of scientific theories would be forfeited.

We overcome this paradox by proposing the concept of a “pragmatic hypothesis” associated to a precise hypothesis. Although precise hypotheses are commonly obtained from mathematical theories used in areas of science and technology [12,13], the associated pragmatic hypothesis is an imprecise hypothesis that is sufficiently good from the practical purpose of an end-user of the theories. For instance, Newtonian theory assumes a gravitational force of magnitude given by the equation $F=G\phantom{\rule{0.166667em}{0ex}}{m}_{1}\phantom{\rule{0.166667em}{0ex}}{m}_{2}\phantom{\rule{0.166667em}{0ex}}{d}^{-2}$, where the gravitational constant G has a precise value. However, the current Committee on Data for Science and Technology (CODATA) value for the gravitational constant is $G=6.67408\left(31\right)\times {10}^{-11}{\mathrm{m}}^{3}\phantom{\rule{4pt}{0ex}}{\mathrm{kg}}^{-1}\phantom{\rule{4pt}{0ex}}{s}^{-2}$, which includes a standard deviation for the last significant digits, $408\pm 31$. Thus, it may be reasonable for a given end-user to assume that the theoretical form of the last equation is exact, but that, pragmatically, the constant G can only be known up to a chosen precision. As a result, one might wish to test an imprecise hypothesis associated to the scientific hypothesis of interest [14,15].

This article advocates for the conceptual distinction between a precise scientific theory and an associated pragmatic hypotheses. The alternate use of precise and pragmatic versions of corresponding statistical hypotheses enables the GFBST to (pragmatically) accept scientific hypotheses. Moreover, this alternate use allows the GFBST to track the evolution of scientific theories, as interpreted in the context of Gallais’ hexagonal spirals.

Our main goal in this paper is to formalize testing procedures for a theory taking into consideration the level of precision that is appropriate for a given end-user. To handle this problem, we consider the end-user’s predictions about an experiment of his interest. The variation in these predictions can be explained by a combination of the level of imprecision in the theory and by properties of the end-user’s experiment. For instance, the latter source of variation is influenced by properties of the equipment, including the precision, accuracy, and resolution of measuring devices [16,17], and also error bounds for fundamental constants and calibration factors [18,19,20,21,22,23,24,25,26]. We propose to choose a pragmatic hypothesis in such a way that the imprecision in the end-user’s predictions is mostly due to his experimental conditions and not due to the level of imprecision in the theory that he uses.

In order to develop this argument, Section 2 first adapts Gallais’s metaphor of hexagonal spirals to the evolution of science. Next, Section 3 proposes three methods of decomposing the variability in an end-user’s predictions into the level of precision of the theory he uses and his experimental conditions. Section 3.1 and Section 3.2 use these decompositions in order to build pragmatic hypothesis. They build pragmatic hypotheses for simple hypotheses and then prove that there exists a single way of extending this construction to composite hypotheses while preserving logical coherence in simultaneous hypothesis testing. This methodology is illustrated in Section 4. All proofs are found in Appendix A.

## 2. Gallais’ Hexagonal Spirals and the Evolution of Science

Following a well-established tradition in structural semantics and narratology [27,28], Gallais and Pollina [10] propose that many classical medieval tales follow the same organizational pattern. More precisely, these narratives exhibit an underlying “intellectual structure” and are organized according to an underlying archetypal format or prototypical pattern. This pattern includes both static and a dynamical aspects. From a static perspective, the logical structure of the narrative is such that each arch is represented by a vertex of the “hexagon of oppositions” [4]. The static hexagon of oppositions is depicted in Figure 1 and represents in each vertex a modal operator among necessity (□), possibility (⋄), contingency ($\Delta $), and their negations (¬). These modal operators are structured according to three axes of opposition, $(===\phantom{\rule{0.166667em}{0ex}})$; a triangle of contrariety, $(---\phantom{\rule{0.166667em}{0ex}})$; another triangle of sub-contrariety, $(\cdots \phantom{\rule{0.166667em}{0ex}})$; and several edges of subalteration $(\u27f6$ ). From a dynamical perspective, the temporal evolution of the narrative follows a spiral (Figure 2) that unwinds (se déroule) around concentric and expanding hexagons of opposition [10,11].

Because the evolution of science can also be conceived as following a spiral pattern [29], its analysis can benefit from the structure in the works by the authors of [10,11]. From a static perspective, the logical modalities induced by agnostic hypothesis tests [3] can be represented in the hexagon of oppositions. From a dynamic perspective, scientific theories evolve as a spiral which unwinds around the following states:

- ${A}_{1}$- Extant thesis: This vertex represents a standing paradigm, an accepted theory using well-known formalisms and familiar concepts, relying on accredited experimental means and methods, etc. In fact, the concepts of a current paradigm may become so familiar and look so natural that they become part of a reified ontology. That is, there is a perceived correspondence between concepts of the theory and “dinge-an-sich” (things-in-themselves) as seen in nature [29,30].
- ${U}_{1}$- Analysis: This vertex represents the moment when some hypotheses of the standing theory are put in question. At this moment, possible alternatives to the standing hypotheses may still be only vaguely defined.
- ${E}_{1}$- Antithesis: This vertex represents the moment when some laws of the standing theory have to be rejected. Such a rejection of old laws may put in question the entire world-view of the current paradigm, opening the way for revolutionary ideas, as described in the next vertex.
- ${O}_{2}$- Apothesis/ Prosthesis: This vertex is the locus of revolutionary freedom. Alternative models are considered, and specific (precise) forms investigated. There is intellectual freedom to set aside and dispose of (apothesis) old preconceptions, prejudices and stereotypes, and also to explore and investigate new paths, to put together (prosthesis) and try out new concepts and ideas.
- ${Y}_{2}$- Synthesis: It is at this vertex that new laws are formulated; this is the point of Eureka moment(s). A selection of old and and new concepts seem to click into place, fitting together in the form of new laws, laws that are able to explain new phenomena and incorporate objects of an expanded reality.
- ${I}_{2}$- Enthesis: At this vertex new laws, concepts and methods must enter and be integrated into a consistent and coherent system. At this stage many tasks are performed in order to combine novel and traditional pieces or to accommodate original and conventional components into an well-integrated framework. Finally, new experimental means and methods are developed and perfected, allowing the new laws to be corroborated.
- ${A}_{2}$- New Thesis: At this vertex, the new theory is accepted as the standard paradigm that succeeds the preceding one (${A}_{1}$). Acceptance occurs after careful determination of fundamental constants and calibration factors (including their known precision), metrological and instrumentational error bounds, etc. At later stages of maturity, equivalent theoretical frameworks may be developed using alternative formalisms and ontologies. For example, analytical mechanics offers variational alternatives that are (almost) equivalent to the classical formulation of Newtonian mechanics [31]. Usually, these alternative worldviews reinforce the trust and confidence on the underlying laws. Nevertheless, the existence of such alternative perspectives may also foster exploratory efforts and investigative works in the next cycle in evolution.

Table 1 applies this spiral structure to the evolution of the theories of orbital astronomy and chemical affinity. The evolution of orbital astronomy has been widely studied [32]. The evolution of chemical affinity is presented in greater detail in Stern [29], Stern and Nakano [33].

The above spiral structure highlights that a statistical methodology should be able to obtain each of the six modalities in the hexagon of oppositions. Before an acceptance vertex (A) in the hexagon is reached by the spiral of scientific evolution, theoretically precise or sharp hypotheses must be formulated. However, a logically coherent hypothesis test, such as the GFBST, can choose solely between rejecting or remaining agnostic (i.e., corroborating) such sharp hypotheses. Once the evolving theory becomes (part of) a well-established paradigm, the GFBST can be used with the goal of accepting non-sharp hypotheses in the context of the same paradigm, a context that includes fundamental constants and calibration factors (and their respective uncertainties), metrological error bounds, specified accuracies of scientific intrumentation, etc. The non-sharp versions of sharp hypotheses used in such tests are called pragmatic, and their formulation is developed in the following sections.

## 3. Pragmatic Hypotheses

In order to derive pragmatic hypotheses from precise ones, it is necessary to define an idealized future experiment. Let $\theta $ be an unknown parameter of interest, which is used to express scientific hypotheses and that takes values in the parameter space, $\Theta $. A scientific hypothesis takes the form ${H}_{0}:\theta \in {\Theta}_{0}$, where ${\Theta}_{0}\subset \Theta $. Whenever there is no ambiguity, ${H}_{0}$ and ${\Theta}_{0}$ are used interchangeably. Also, the determination of $\theta $ is useful for predicting an idealized future experiment, $\mathbf{Z}$, which takes values in $\mathcal{Z}$. The uncertainty about Z depends on $\theta $ by means of ${P}_{{\theta}^{*}}$, the probability measure over $\mathcal{Z}$ when it is known that $\theta ={\theta}^{*}$, ${\theta}^{*}\in \Theta $.

Often, it is sufficient for an end-user to determine a pragmatic hypothesis, that is, that the parameter lies in a set of plausible values, which is larger than the null hypothesis. This set can be chosen in such a way that the variation over predictions about a future experiment is mostly due to experimental conditions rather than to the imprecision in the value of the parameter. This section formally develops a methodology for determining these pragmatic hypotheses.

In order to compare two parameter values, we use a “predictive dissimilarity”, ${d}_{\mathbf{Z}}$, which is a function, ${d}_{\mathbf{Z}}:\Theta \times \Theta \to {\mathbb{R}}^{+}$, such that ${d}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})$ measures how much the predictions made for $\mathbf{Z}$ based on ${\theta}^{*}$ diverge from the ones made based on ${\theta}_{0}$. We define and compare three possible choices for such a dissimilarity.

**Definition**

**1.**

The Kullback–Leibler predictive dissimilarity, ${KL}_{\mathbf{Z}}$, is
that is, ${KL}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})$ is the relative entropy between ${\mathbb{P}}_{{\theta}^{*}}$ and ${\mathbb{P}}_{{\theta}_{0}}$.

$$\begin{array}{cc}\hfill {KL}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =KL({\mathbb{P}}_{{\theta}^{*}},{\mathbb{P}}_{{\theta}_{0}})={\int}_{\mathcal{Z}}log\left(\frac{d{\mathbb{P}}_{{\theta}^{*}}}{d{\mathbb{P}}_{{\theta}_{0}}}\right)d{\mathbb{P}}_{{\theta}^{*}},\hfill \end{array}$$

**Example**

**1**

(Gaussian with known variance.)

**.**Let $\mathbf{Z}=({Z}_{1}\dots ,{Z}_{d})\sim N(\theta ,{\Sigma}_{0})$ be a random vector with a multivariate Gaussian distribution:
$$\begin{array}{cc}\hfill \frac{d{\mathbb{P}}_{\theta}\left(\mathbf{z}\right)}{d\mathbf{z}}\phantom{\rule{1.em}{0ex}}& =\parallel 2\pi {\Sigma}_{0}{\parallel}^{-0.5}exp\left(-0.5{(\mathbf{z}-\theta )}^{t}{\Sigma}_{0}^{-1}d(\mathbf{z}-\theta )\right)\hfill \\ \hfill {KL}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& ={\int}_{{\mathbb{R}}^{d}}log\left(\frac{d{\mathbb{P}}_{{\theta}^{*}}\left(\mathbf{z}\right)}{d{\mathbb{P}}_{{\theta}_{0}}\left(\mathbf{z}\right)}\right)d{\mathbb{P}}_{{\theta}^{*}}\left(\mathbf{z}\right)=0.5{({\theta}_{0}-{\theta}^{*})}^{t}{\Sigma}_{0}^{-1}({\theta}_{0}-{\theta}^{*}),\hfill \end{array}$$

When $d=1$ and ${\Sigma}_{0}={\sigma}_{0}^{2}$,

$$\begin{array}{cc}\hfill K{L}_{\mathbf{z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =\frac{{({\theta}_{0}-{\theta}^{*})}^{2}}{2{\sigma}_{0}^{2}}\hfill \end{array}$$

The KL dissimilarity evaluates the distance between the predictive probability distributions for the future experiment under two parameter values, ${\theta}_{0}$ and ${\theta}^{*}$. Although the KL dissimilarity is general, it can be challenging to interpret. In particular, it can be hard to establish the quality of the predictions for $\mathbf{Z}$ based on ${\theta}^{*}$ when $\mathbf{Z}$ is actually generated from ${\theta}_{0}$ and $K{L}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\le \u03f5$. A more interpretable dissimilarity is obtained by taking ${d}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})$ to measure how far are the best predictions for $\mathbf{Z}$ based on ${\theta}^{*}$ and ${\theta}_{0}$. In this case, if one makes a prediction for $\mathbf{Z}$ based on ${\theta}^{*}$, ${\mathbf{z}}_{*}$, and $\mathbf{Z}$ was actually generated using ${\theta}_{0}$, then ${d}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\le \u03f5$ guarantees that ${\mathbf{z}}_{*}$ will be at most $\u03f5$ apart from the best possible prediction. Such a dissimilarity is discussed in the following definition.

**Definition**

**2**

(Best prediction dissimilarity—BP.)
where ${\delta}_{\mathbf{Z},{\theta}_{0}}:\mathcal{Z}\to \mathbb{R}$ is such that ${\delta}_{\mathbf{Z},{\theta}_{0}}\left(\mathbf{z}\right)$ measures how bad $\mathbf{z}$ predicts $\mathbf{Z}$ when $\theta ={\theta}_{0}$. The “best prediction dissimilarity”, ${BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})$, measures how badly $\widehat{\mathbf{Z}}\left({\theta}^{*}\right)$ predicts $\mathbf{Z}$ relatively to $\widehat{\mathbf{Z}}\left({\theta}_{0}\right)$ when $\theta ={\theta}_{0}$. Formally,
where $g:\mathbb{R}\u27f6\mathbb{R}$ is a motononic function. The choice of g in a particular setting aims at improving the interpretation of the best prediction dissimilarity criterion.

**.**Let $\widehat{\mathbf{Z}}:\Theta \to \mathcal{Z}$ be such that $\widehat{\mathbf{Z}}\left({\theta}_{0}\right)$ is the best prediction for $\mathbf{Z}$ given that $\theta ={\theta}_{0}$. For example, one can take
$$\begin{array}{cc}\hfill \widehat{\mathbf{Z}}\left({\theta}_{0}\right)\phantom{\rule{1.em}{0ex}}& =arg\underset{\mathbf{z}\in \mathcal{Z}}{min}{\delta}_{\mathbf{Z},{\theta}_{0}}\left(\mathbf{z}\right),\hfill \end{array}$$

$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =g\left(\frac{{\delta}_{\mathbf{Z},{\theta}_{0}}\left(\widehat{\mathbf{Z}}\left({\theta}^{*}\right)\right)-{\delta}_{\mathbf{Z},{\theta}_{0}}\left(\widehat{\mathbf{Z}}\left({\theta}_{0}\right)\right)}{{\delta}_{\mathbf{Z},{\theta}_{0}}\left(\widehat{\mathbf{Z}}\left({\theta}_{0}\right)\right)}\right),\hfill \end{array}$$

**Example**

**2**

(BP under quadratic form.)
The optimal prediction under ${\theta}^{*}$ is $\widehat{\mathbf{Z}}\left({\theta}^{*}\right)={\mu}_{\mathbf{Z},{\theta}^{*}}$. It follows that
In particular, ${\delta}_{\mathbf{Z},{\theta}_{0}}\left(\widehat{\mathbf{Z}}\left({\theta}_{0}\right)\right)=\mathbf{E}\left[\parallel \mathbf{Z}-{\mu}_{\mathbf{Z},{\theta}_{0}}{\parallel}_{\mathbf{S}}^{2}|\theta ={\theta}_{0}\right]$. Therefore,
In this example, BP${}_{\mathbf{Z}}$ can be put in the same scale as $\mathbf{Z}$ by taking $g\left(x\right)=\sqrt{x}$. Also, two choices of $\mathbf{S}$ are of particular interest. When $\mathbf{S}=\mathbb{V}{\left[\mathbf{Z}\right|\theta ={\theta}_{0}]}^{-1}$, Equation (2) simplifies to
Similarly, when $\mathbf{S}$ is the identity matrix, Equation (2) simplifies to
Equation (4) admits an intuitive interpretation. The larger the value of $tr\left(\mathbb{V}\left[\mathbf{Z}\right|\theta ={\theta}_{0}]\right)$, the more $\mathbf{Z}$ is dispersed and the harder it is to predict its value. Also, $\parallel \mathbf{E}\left[\mathbf{Z}\right|\theta ={\theta}_{0}]-\mathbf{E}\left[\mathbf{Z}\right|\theta ={\theta}^{*}]{\parallel}_{2}^{2}$ measures how far apart are the best prediction for $\mathbf{Z}$ under $\theta ={\theta}_{0}$ and $\theta ={\theta}^{*}$. That is, ${\mathrm{BP}}_{Z}({\theta}_{0},{\theta}^{*})$ captures that, if one predicts $\mathbf{Z}$ assuming that $\theta ={\theta}^{*}$ when it is actually $\theta ={\theta}_{0}$, then the error with respect to the best prediction is increased as a function of the distance between the predictions over the dispersion of $\mathbf{Z}$.

**.**Let $\mathcal{Z}={\mathbb{R}}^{d}$, ${\mu}_{\mathbf{Z},\theta}=\mathbf{E}\left[\mathbf{Z}\right|\theta ]$, ${\Sigma}_{\mathbf{Z},\theta}=\mathbb{V}\left[\mathbf{Z}\right|\theta ]$ and $\mathbf{S}$ be a positive definite matrix. Define the quadratic form induced by $\mathbf{S}$ to be ${\parallel \mathbf{z}\parallel}_{\mathbf{S}}^{2}={\mathbf{z}}^{T}\mathbf{S}\mathbf{z}$ and
$$\begin{array}{c}\hfill {\delta}_{\mathbf{Z},{\theta}_{0}}\left(\mathbf{z}\right)=\mathbf{E}\left[{\parallel \mathbf{Z}-\mathbf{z}\parallel}_{\mathbf{S}}^{2}|\theta ={\theta}_{0}\right]\end{array}$$

$$\begin{array}{cc}\hfill {\delta}_{\mathbf{Z},{\theta}_{0}}\left(\widehat{\mathbf{Z}}\left({\theta}^{*}\right)\right)\phantom{\rule{1.em}{0ex}}& =\mathbf{E}\left[\parallel \mathbf{Z}-{\mu}_{\mathbf{Z},{\theta}^{*}}{\parallel}_{\mathbf{S}}^{2}|\theta ={\theta}_{0}\right]\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =\parallel {\mu}_{\mathbf{Z},{\theta}_{0}}-{\mu}_{\mathbf{Z},{\theta}^{*}}{\parallel}_{\mathbf{S}}^{2}+\mathbf{E}\left[\parallel \mathbf{Z}-{\mu}_{\mathbf{Z},{\theta}_{0}}{\parallel}_{\mathbf{S}}^{2}|\theta ={\theta}_{0}\right]\hfill \end{array}$$

$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =g\left(\frac{\parallel {\mu}_{\mathbf{Z},{\theta}_{0}}-{\mu}_{\mathbf{Z},{\theta}^{*}}{\parallel}_{\mathbf{S}}^{2}}{\mathbf{E}\left[\parallel \mathbf{Z}-{\mu}_{\mathbf{Z},{\theta}_{0}}{\parallel}_{\mathbf{S}}^{2}|\theta ={\theta}_{0}\right]}\right)\hfill \end{array}$$

$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =g\left({d}^{-1}{\parallel {\mu}_{\mathbf{Z},{\theta}_{0}}-{\mu}_{\mathbf{Z},{\theta}^{*}}\parallel}_{{\Sigma}_{\mathbf{Z},{\theta}_{0}}^{-1}}^{2}\right)\hfill \end{array}$$

$$\begin{array}{c}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})=g\left(\frac{\parallel \mathbf{E}\left[\mathbf{Z}\right|\theta ={\theta}_{0}]-\mathbf{E}\left[\mathbf{Z}\right|\theta ={\theta}^{*}]{\parallel}_{2}^{2}}{tr\left(\mathbb{V}\left[\mathbf{Z}\right|\theta ={\theta}_{0}]\right)}\right)\end{array}$$

**Example**

**3**

(Gaussian with known variance.)
Similarly, it follows from Equation (3) that when $\mathbf{S}={\Sigma}_{0}^{-1}$,
Conclude from Equation (6) that, if $\mathbf{S}={\Sigma}_{0}^{-1}$ and $g\left(x\right)=x$, then ${BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})=2{d}^{-1}{KL}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})$. Also, when $d=1$, ${\Sigma}_{0}={\sigma}_{0}^{2}$ and $g\left(x\right)=\sqrt{x}$, both Equations (5) and (6) simplify to
In some situations, $\mathbf{Z}$ is the average of m independent observations distributed as $N(\theta ,{\Sigma}_{0})$. In this case, $\mathbf{Z}\sim N(\theta ,{m}^{-1}{\Sigma}_{0})$. It follows from Equation (5) that $B{P}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})=g\left(\frac{m\parallel {\theta}_{0}-{\theta}^{*}{\parallel}_{2}^{2}}{tr\left({\Sigma}_{0}\right)}\right)$ when $\mathbf{S}$ is the identity, and ${BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})=g\left(m{d}^{-1}{({\theta}_{0}-{\theta}^{*})}^{t}{\Sigma}_{0}^{-1}({\theta}_{0}-{\theta}^{*})\right)$ when $\mathbf{S}={\Sigma}_{0}^{-1}$.

**.**Consider Example 1 and let ${\delta}_{\mathbf{Z},{\theta}_{0}}\left(\mathbf{z}\right)$ be as in Example 2. It follows from Equation (4) that when $\mathbf{S}$ is the identity matrix,
$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =g\left(\frac{\parallel {\theta}_{0}-{\theta}^{*}{\parallel}_{2}^{2}}{tr\left({\Sigma}_{0}\right)}\right)\hfill \end{array}$$

$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =g\left({d}^{-1}{({\theta}_{0}-{\theta}^{*})}^{t}{\Sigma}_{0}^{-1}({\theta}_{0}-{\theta}^{*})\right)\hfill \end{array}$$

$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& ={\sigma}_{0}^{-1}|{\theta}_{0}-{\theta}^{*}|\hfill \end{array}$$

Although ${\mathrm{BP}}_{\mathbf{Z}}$ is more interpretable then ${\mathrm{KL}}_{\mathbf{Z}}$, it also relies on more tuning variables, such as $\delta $, $\widehat{Z}$, and g. A balance between these features is obtained by a third predictive dissimilarity, which evaluates how easy it is to recover the value of $\theta $ between ${\theta}_{0}$ or ${\theta}^{*}$ based on $\mathbf{Z}$.

**Definition**

**3**

(Classification distance—CD.)
${\widehat{\theta}}_{{\theta}_{0},{\theta}^{*}}$ assigns to each possible outcome of the future experiment $\mathbf{z}$, in which the values of θ, ${\theta}_{0}$, or ${\theta}^{*}$ make the experimental result more likely. The classification distance between ${\theta}_{0}$, ${\theta}^{*}$, and $CD({\theta}_{0},{\theta}^{*})$ is defined as
$CD({\theta}_{0},{\theta}^{*})+0.5$ is the best Bayes utility in an hypothesis test of ${\theta}_{0}$ against ${\theta}^{*}$ using a uniform prior for θ and the 0/1 utility [15]. By subtracting $0.5$ from this quantity, $CD({\theta}_{0},{\theta}^{*})$ varies between 0 and $0.5$ and is a distance. Also,
where $TV({\mathbb{P}}_{{\theta}_{0}},{\mathbb{P}}_{{\theta}^{*}})={sup}_{A}|{\mathbb{P}}_{{\theta}_{0}}\left(A\right)-{\mathbb{P}}_{{\theta}^{*}}\left(A\right)|$ and $\parallel {\mathbb{P}}_{{\theta}_{0}}-{\mathbb{P}}_{{\theta}^{*}}{\parallel}_{1}={\int}_{\mathcal{Z}}|{\mathbb{P}}_{{\theta}_{0}}\left(z\right)-{\mathbb{P}}_{{\theta}^{*}}\left(z\right)|dz$ is the ${\mathcal{L}}_{1}$-distance between probability measures.

**.**Let ${\widehat{\theta}}_{{\theta}_{0},{\theta}^{*}}:\mathcal{Z}\to \Theta $ be such that
$$\begin{array}{cc}\hfill {\widehat{\theta}}_{{\theta}_{0},{\theta}^{*}}\left(\mathbf{z}\right)& =arg\underset{\theta \in \{{\theta}_{0},{\theta}^{*}\}}{max}{f}_{\mathbf{Z}}\left(\mathbf{z}\right|\theta )\hfill \end{array}$$

$$\begin{array}{cc}\hfill CD({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =0.5\mathbb{P}\left({\widehat{\theta}}_{{\theta}_{0},{\theta}^{*}}\left(\mathbf{Z}\right)={\theta}_{0}|{\theta}_{0}\right)+0.5\mathbb{P}\left({\widehat{\theta}}_{{\theta}_{0},{\theta}^{*}}\left(\mathbf{Z}\right)={\theta}^{*}|{\theta}^{*}\right)-0.5\hfill \end{array}$$

$$\begin{array}{cc}\hfill CD({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =0.5TV({\mathbb{P}}_{{\theta}_{0}},{\mathbb{P}}_{{\theta}^{*}})=0.25{\parallel {\mathbb{P}}_{{\theta}_{0}}-{\mathbb{P}}_{{\theta}^{*}}\parallel}_{1},\hfill \end{array}$$

**Example**

**4**

(Gaussian with known variance.)
Note that, in this case, $CD$ would be the same as $BP$ if, instead of taking $g\left(x\right)=\sqrt{x}$, one chose $g\left(x\right)=\Phi \left(0.5\sqrt{x}\right)-0.5$.

**.**Consider Examples 1 and 3, when $d=1$, ${\Sigma}_{0}={\sigma}_{0}^{2}$, obtain
$$\begin{array}{cc}\hfill {CD}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =\Phi \left(\frac{|{\theta}_{0}-{\theta}^{*}|}{2{\sigma}_{0}}\right)-\frac{1}{2}\hfill \end{array}$$

Although analytical expressions for CD are generally not available, it is possible to approximate it via numerical integration methods.

The choice between predictive dissimilarity functions depends on the type of guarantee the end-user wishes to obtain. In particular, $BP$, $KL$, and $CD$ is not an exhaustive list of dissimilarities. However, some of their properties can be useful in obtaining a choice. For instance, although $BP$ yields a metric on the parameter space, $KL$ and $CD$ are dissimilarities between probability functions. That is, although $BP$ will generate pragmatic hypothesis that have parameter values numerically close to a given ${\theta}_{0}$, $KL$ and $CD$ will yield pragmatic hypothesis that have parameter values lead to similar prediction about $\mathbf{Z}$. Also, although $KL$ evaluates similarity between predictions from an information theoretic perspective, $CD$ evaluates them from a perspective of hypothesis tests.

#### 3.1. Singleton Hypotheses

We start by defining the pragmatic hypothesis associated to a singleton hypothesis. A singleton hypothesis is one in which the parameter assumes a single value, such as ${H}_{0}:\theta ={\theta}_{0}$. In this case, the pragmatic hypothesis associated to ${H}_{0}$ is the set of points whose dissimilarity to ${\theta}_{0}$ is at most $\u03f5$, as formalized below.

**Definition**

**4**

(Pragmatic hypothesis for a singleton.)

**.**Let ${H}_{0}:\theta ={\theta}_{0}$, ${d}_{\mathbf{Z}}$ be a predictive dissimilarity function and $\u03f5>0$. The pragmatic hypothesis for ${H}_{0}$, $Pg(\left\{{\theta}_{0}\right\},{d}_{\mathbf{Z}},\u03f5)$, is
$$\begin{array}{cc}\hfill Pg(\left\{{\theta}_{0}\right\},{d}_{\mathbf{Z}},\u03f5)\phantom{\rule{1.em}{0ex}}& =\{{\theta}^{*}\in \Theta :{d}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\le \u03f5\}\hfill \end{array}$$

Note that for ${\u03f5}_{1}<{\u03f5}_{2}$, $Pg(\left\{{\theta}_{0}\right\},{d}_{\mathbf{Z}},{\u03f5}_{1})\subseteq Pg(\left\{{\theta}_{0}\right\},{d}_{\mathbf{Z}},{\u03f5}_{2})$.

**Example**

**5**

(Gaussian with known variance)

**.**Consider Examples 1 and 3 when $d=1$, ${\Sigma}_{0}={\sigma}_{0}^{2}$ and $g\left(x\right)=\sqrt{x}$. It follows from Equations (1), (7) and (8) that
$$\begin{array}{cc}\hfill Pg(\left\{{\theta}_{0}\right\},B{P}_{\mathbf{Z}},\u03f5)\phantom{\rule{1.em}{0ex}}& =\left[{\theta}_{0}-\u03f5{\sigma}_{0},{\theta}_{0}+\u03f5{\sigma}_{0}\right]\hfill \\ \hfill Pg(\left\{{\theta}_{0}\right\},K{L}_{\mathbf{Z}},\u03f5)\phantom{\rule{1.em}{0ex}}& =\left[{\theta}_{0}-\sqrt{2\u03f5}{\sigma}_{0},{\theta}_{0}+\sqrt{2\u03f5}{\sigma}_{0}\right]\hfill \\ \hfill Pg(\left\{{\theta}_{0}\right\},C{D}_{\mathbf{Z}},\u03f5)\phantom{\rule{1.em}{0ex}}& =\left[{\theta}_{0}-2{\Phi}^{-1}(0.5+\u03f5){\sigma}_{0},{\theta}_{0}+2{\Phi}^{-1}(0.5+\u03f5){\sigma}_{0}\right]\hfill \end{array}$$

Note that the size of each of the pragmatic hypothesis is proportional to ${\sigma}_{0}$. This occurs because each predictive dissimilarity functions makes the prediction error due to the unknown parameter value small with respect to that due to the data variability, ${\sigma}_{0}^{2}$.

#### 3.2. Composite Hypotheses

Next, we consider pragmatic hypotheses for general hypotheses ${H}_{0}:\theta \in {\Theta}_{0}$, where ${\Theta}_{0}\subset \Theta $.

**Definition**

**5.**

For each hypothesis ${\Theta}_{0}\subseteq \Theta $, predictive dissimilarity ${d}_{\mathbf{Z}}$ and $\u03f5>0$, $Pg({\Theta}_{0},{d}_{\mathbf{Z}},\u03f5)$ is the pragmatic hypothesis associated to ${\Theta}_{0}$ induced by ${d}_{\mathbf{Z}}$ and ϵ. Whenever ${d}_{\mathbf{Z}}$ and ϵ are clear or not relevant to the result, we write $Pg\left({\Theta}_{0}\right)$ instead of $Pg({\Theta}_{0},{d}_{\mathbf{Z}},\u03f5)$.

In order to construct these pragmatic hypotheses, we use logically coherent agnostic hypothesis tests. For each hypothesis, an agnostic hypothesis test can either reject it (1), accept it (0), or remain agnostic (1/2) [34]. Esteves et al. [2] shows that an agnostic hypothesis test is logically coherent if and only if it is based on a region estimator. Such tests are presented in Definition 7 and illustrated in Figure 3.

**Definition**

**6.**

Let $\mathcal{X}$ denote the sample space of the data used to test a hypothesis. A region estimator is a function, $R:\mathcal{X}\u27f6\mathcal{P}(\Theta )$, where $\mathcal{P}(\Theta )$ is the power set of Θ.

**Definition**

**7**

(Agnostic test based on a region estimator.)

**.**The agnostic test based on the region estimator R for testing ${H}_{0}$, ${\varphi}_{{H}_{0}}^{R}$, such that
$$\begin{array}{cc}\hfill {\varphi}_{{H}_{0}}^{R}\left(x\right)& =\left\{\begin{array}{cc}0\hfill & ,\phantom{\rule{4.pt}{0ex}}if\phantom{\rule{4.pt}{0ex}}R\left(x\right)\subseteq {H}_{0}\hfill \\ 1\hfill & ,\phantom{\rule{4.pt}{0ex}}if\phantom{\rule{4.pt}{0ex}}R\left(x\right)\subseteq {H}_{0}^{c}\hfill \\ \frac{1}{2}\hfill & ,\phantom{\rule{4.pt}{0ex}}otherwise.\hfill \end{array}\right.\hfill \end{array}$$

Besides the logical conditions on the hypothesis test, one might also impose logical restraints on how pragmatic hypotheses are constructed. For instance, let A and B be two hypothesis, such that B logically entails A, that is, $B\subseteq A$. If a logically coherent test accepts B, then it also accepts A. This property is called monotonocity [1,35,36,37]. One might also impose that $Pg$, such that if a logically coherent hypothesis test accepts $Pg\left(B\right)$, then it should also accept $Pg\left(A\right)$. Similarly, let ${\left({A}_{i}\right)}_{i\in I}$ be a collection of hypothesis which cover A, that is, $A\subseteq {\cup}_{i\in I}{A}_{i}$. If a logically coherent hypothesis test rejects every ${A}_{i}$, then it rejects A. This property is called union consonance. One might also impose that $Pg$ is such that, if a logically coherent hypothesis test rejects $Pg\left({A}_{i}\right)$ for every i, then it should also reject $Pg\left(A\right)$. The above conditions define the logical coherence of a procedure for constructing pragmatic hypotheses.

**Definition**

**8.**

A procedure for constructing pragmatic hypothesis, $Pg$, is logically coherent if, for every logically coherent hypothesis test ϕ and sample point x:

- If ${\varphi}_{Pg\left(B\right)}\left(x\right)=0$ for some $B\subseteq A$, then ${\varphi}_{Pg\left(A\right)}\left(x\right)=0$.
- If ${\varphi}_{Pg\left({A}_{i}\right)}\left(x\right)=1$ for every $i\in I$ and $A\subseteq {\cup}_{i\in I}{A}_{i}$, then ${\varphi}_{Pg\left(A\right)}\left(x\right)=1$.

In order to motivate the above definition, consider that the frequencies of $AA$, $AB$, and $BB$ in a given population are ${\theta}_{1}$, ${\theta}_{2}$, and ${\theta}_{3}$, respectively. Note that $B:=\{0.25,0.5,0.25\}$ is a subset of $A=\{({p}^{2},2p(1-p),{(1-p)}^{2}):p\in [0,1]\}$, which denotes the Hardy–Weinberg equilibrium. That is, if the frequencies $AA$, $AB$, and $BB$ are, respectively, 0.25, 0.5, and 0.25, then the population follows the Hardy–Weinberg equilibrium. As a result, if one pragmatically accepts that the population satisfies the specified proportions, then one might also wish to pragmatically accept that the population follows the Hardy–Weinberg. Similarly, if one pragmatically rejects for every $p\in [0,1]$ that the frequencies of $AA$, $AB$, and $BB$ are, respectively, ${p}^{2}$, $2p(1-p)$, and ${(1-p)}^{2}$, then one might also wish to pragmatically reject that the population follows the Hardy–Weinberg equilibrium. These conditions are assured in Definition 8.

In a logically coherent procedure for constructing pragmatic hypotheses, the pragmatic hypothesis associated to a composite hypothesis is completely determined by the pragmatic hypotheses associated to simple hypotheses. This result is presented in Theorem 1.

**Theorem**

**1.**

A procedure for constructing pragmatic hypothesis, $Pg$, is logically coherent if and only if, for every hypothesis, ${\Theta}_{0}$, $Pg\left({\Theta}_{0}\right)={\bigcup}_{\theta \in {\Theta}_{0}}Pg\left(\left\{\theta \right\}\right)$.

Using Theorem 1, it is possible to determine a logically coherent procedure for constructing pragmatic hypotheses by determining only the pragmatic hypothesis associated to simple hypothesis, such as in Section 3.1. Theorem 1 is illustrated in Section 4. One can also obtain the following general relation between predictive dissimilarities.

**Lemma**

**1.**

$Pg({\Theta}_{0},K{L}_{\mathbf{Z}},8{\u03f5}^{2})\subseteq Pg({\Theta}_{0},C{D}_{\mathbf{Z}},\u03f5)$.

Besides being logically coherent, it is often desirable in statistics [38,39] and in science [12,13] for a procedure to be invariant to reparametrization, so as to ensure that the procedure reaches the same conclusions whatever the coordinate system is used to specify both the sample and the parameter spaces. For instance, the pragmatic hypothesis that is obtained using the International metric system should be compatible to the one that is obtained using the English metric system. Invariance to reparametrization is formally presented in Definition 10.

**Definition**

**9.**

${\left({\mathbb{P}}_{{\theta}^{*}}^{*}\right)}_{{\theta}^{*}\in {\Theta}^{*}}$ is a reparameterization of ${\left({\mathbb{P}}_{\theta}\right)}_{\theta \in \Theta}$ if there exists a bijective function, $f:\Theta \to {\Theta}^{*}$, such that for every $\theta \in \Theta $, ${\mathbb{P}}_{\theta}={\mathbb{P}}_{f\left(\theta \right)}^{*}$.

**Definition**

**10.**

Let ${\left({\mathbb{P}}_{{\theta}^{*}}^{*}\right)}_{{\theta}^{*}\in {\Theta}^{*}}$ be a reparametrization of ${\left({\mathbb{P}}_{\theta}\right)}_{\theta \in \Theta}$ by a bijective function, $f:\Theta \to {\Theta}^{*}$. Also, let ${d}_{\mathbf{Z}}$ and ${d}_{\mathbf{Z}}^{*}$ be predictive dissimilarity functions. The functions ${d}_{\mathbf{Z}}$ and ${d}_{\mathbf{Z}}^{*}$ are invariant to the reparametrization if for every logically coherent procedure for constructing pragmatic hypotheses, $Pg$,

$$\begin{array}{cc}\hfill f\left[Pg({\Theta}_{0},{d}_{\mathbf{Z}},\u03f5)\right]\phantom{\rule{1.em}{0ex}}& =Pg(f\left[{\Theta}_{0}\right],{d}_{\mathbf{Z}}^{*},\u03f5),\hfill \end{array}$$

Definition 10 states that, if ${\Theta}_{0}$ is an hypothesis and invariance to reparametrization holds, then the pragmatic hypothesis obtained in a reparametrization of ${\Theta}_{0}$, say $Pg\left(f\left[{\Theta}_{0}\right]\right)$, is the same as the transformed pragmatic hypothesis associated to ${\Theta}_{0}$, $f\left[Pg\left({\Theta}_{0}\right)\right]$. Theorem 2 presents a sufficient condition for obtaining invariance to reparametrization.

**Theorem**

**2.**

Let ${\left({\mathbb{P}}_{{\theta}^{*}}^{*}\right)}_{{\theta}^{*}\in {\Theta}^{*}}$ be a reparameterization of ${\left({\mathbb{P}}_{\theta}\right)}_{\theta \in \Theta}$ given by a bijective function, f. If ${d}_{\mathbf{Z}}$ and ${d}_{\mathbf{Z}}^{*}$ satisfy ${d}_{\mathbf{Z}}({\theta}_{0},\theta )={d}_{\mathbf{Z}}^{*}(f\left({\theta}_{0}\right),f\left(\theta \right))$, then ${d}_{\mathbf{Z}}$ and ${d}_{\mathbf{Z}}^{*}$ are invariant to this reparametrization.

**Corollary**

**1.**

If ${d}_{\mathbf{Z}}$ and ${d}_{\mathbf{Z}}^{*}$ are the same choice between KL, BP, or CD, then ${d}_{\mathbf{Z}}$ and ${d}_{\mathbf{Z}}^{*}$ are invariant to every reparametrization.

The procedures for constructing pragmatic hypotheses induced by $KL$ and $CD$ also satisfy an additional property given by Theorem 3.

**Theorem**

**3.**

Let ${\mathbf{Z}}_{m}=({Z}_{1},\dots ,{Z}_{m})$, where ${Z}_{i}$’s are i.i.d. ${F}_{\theta}$ and ${\left({F}_{\theta}\right)}_{\theta \in \Theta}$ is identifiable [40,41]. Also, let $K{L}_{m}$ and $C{D}_{m}$ be the dissimilarities calculated using ${\mathbf{Z}}_{m}$. If $Pg$ is logically coherent, then, for every ${\Theta}_{0}\subseteq \Theta $ and $\u03f5>0$,

- (i)
- ${\left(Pg({\Theta}_{0},K{L}_{m},\u03f5)\right)}_{m\ge 1}$ and ${\left(Pg({\Theta}_{0},C{D}_{m},\u03f5)\right)}_{m\ge 1}$ are non-increasing sequences of sets
- (ii)
- $Pg({\Theta}_{0},K{L}_{m},\u03f5)\stackrel{m\to \infty}{\to}{\Theta}_{0}$ and $Pg({\Theta}_{0},C{D}_{m},\u03f5)\stackrel{m\to \infty}{\to}{\Theta}_{0}$.

Theorem 3 states that the sequence of pragmatic hypotheses for ${\Theta}_{0}$ induced by ${d}_{{\mathbf{Z}}_{m}}$ is non-increasing if the dissimilarity is evaluated by either KL or CD. The greater the number of observable quantities ${\mathbf{Z}}_{m}$, the easier it is to distinguish two parameter values ${\theta}_{0}$ and ${\theta}^{*}$, and therefore the smaller the amount of parameters that are taken as close to ${\theta}_{0}$. Also, as the sample size goes to infinity, the pragmatic hypothesis associated to ${\Theta}_{0}$ converges to to ${\Theta}_{0}$. In other words, for each ${\theta}_{0}\in {\Theta}_{0}$, no other parameter value can predict infinitely many observable quantities with a precision sufficiently close to that of ${\theta}_{0}$.

## 4. Applications

In the following, pragmatic hypotheses for standard statistical problems are derived. Coscrato et al. [42] provide additional examples and methods for obtaining pragmatic hypotheses.

**Example**

**6**

(Gaussian with unknown variance.)
The rectangular shape of these pragmatic hypotheses seems to be unreasonable, as, for instance, whether a point $(\mu ,{\sigma}^{2})$ is close to $({\mu}_{0},{\sigma}_{0}^{2})$ does not depend on ${\sigma}_{0}^{2}$. This is a consequence of the choice of δ in Example 5.

**.**Consider the setting from Example 5, but with ${\sigma}^{2}$ unknown and $0<{\sigma}^{2}\le {M}^{2}$. In this case, the parameter is $\theta =(\mu ,{\sigma}^{2})$. Consider the composite hypothesis ${H}_{0}:\left\{{\mu}_{0}\right\}\times (0,{M}^{2}]$, which is often written as ${H}_{0}:\mu ={\mu}_{0}$. In this case, let ${\theta}_{0}=({\mu}_{0},{\sigma}_{0}^{2})$ and ${\Theta}_{0}=\left\{{\mu}_{0}\right\}\times (0,{M}^{2}]$. Proceeding as in Example 5, it follows that
$$\begin{array}{ccc}\hfill Pg(\left\{{\theta}_{0}\right\},B{P}_{\mathbf{Z}},\u03f5)\phantom{\rule{1.em}{0ex}}& =[{\mu}_{0}-\u03f5{\sigma}_{0},{\mu}_{0}+\u03f5{\sigma}_{0}]\times (0,{M}^{2}]\hfill & \\ \hfill Pg({\Theta}_{0},B{P}_{\mathbf{Z}},\u03f5)\phantom{\rule{1.em}{0ex}}& =[{\mu}_{0}-\u03f5M,{\mu}_{0}+\u03f5M]\times (0,{M}^{2}]\hfill & Theorem1\end{array}$$

Figure 4 presents the pragmatic hypotheses for ${H}_{0}:\mu =0,{\sigma}^{2}=1$ and ${H}_{0}:\mu =0$ when $\u03f5=0.1$ and ${M}^{2}=2$, and using the KL and CD dissimilarities. Contrary to BP, the hypotheses obtained from these dissimilarities do not have a rectangular shape. In particular, the triangular shape of the pragmatic hypotheses for ${H}_{0}:\mu =0$ is such that the closer ${\sigma}^{2}$ is to 0, the smaller the range of values for μ that are included in the pragmatic hypothesis. This behavior might be desirable, as when ${\sigma}^{2}$ is small, there is little uncertainty about the value of $\mathbf{Z}$, and consequently a narrow interval of values of μ can predict Z with precision ϵ.

**Example**

**7**

(Hardy–Weinberg equilibrium)
If ${\mathit{\theta}}_{0}^{p}=({p}^{2},2p(1-p),{(1-p)}^{2})$, ${\delta}_{\mathbf{Z}}\left(\mathbf{z}\right)={\mathit{E}[\parallel \mathbf{Z}-\mathbf{z}\parallel}_{2}^{2}|\mathit{\theta}={\mathit{\theta}}_{0}^{p}]$ and $g\left(x\right)=\sqrt{x}$, then it follows from Example 2 that
The pragmatic hypotheses that are obtained using $KL$, $BP$, and $CD$ for the HW hypothesis are depicted in Figure 5. The choice between BP or KL and CD has a large impact over the shape of the pragmatic hypotheses. Although, for BP, the width of the pragmatic hypothesis is approximately uniform along the HW curve, the width of the pragmatic hypotheses obtained using $KL$ and $CD$ is smaller towards the edges of the HW curve. This behavior could be expected, as towards the edges of the HW curve, $\mathbf{Z}$ has the smallest variability. The figure also depicts the challenge in calibrating $KL$. Although the pragmatic hypotheses for $BP$ and $CD$ have similar sizes when using $\u03f5=0.1$, this result was obtained for $KL$ while using $\u03f5=0.01$.

**.**Let $\mathbf{Z}\sim Multinomial(m,\mathit{\theta})$, where $\mathit{\theta}=({\theta}_{1},{\theta}_{2},{\theta}_{3})$, ${\theta}_{i}\ge 0$, and ${\sum}_{i=1}^{3}{\theta}_{i}=1$. The Hardy–Weinberg (HW) hypothesis [43], ${H}_{0}$, which is depicted in the red curve in Figure 5 satisfies
$$\begin{array}{ccc}\hfill {H}_{0}:\theta \in {\Theta}_{0},& \phantom{\rule{1.em}{0ex}}\hfill & \hfill {\Theta}_{0}=\left\{\left({p}^{2},2p(1-p),{(1-p)}^{2}\right):0\le p\le 1\right\}\end{array}$$

$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\mathit{\theta}}_{0}^{p},{\mathit{\theta}}^{*})\phantom{\rule{1.em}{0ex}}& ={\left(m\times \frac{{({\theta}_{1}-{p}^{2})}^{2}+{({\theta}_{2}-2p(1-p))}^{2}+{({\theta}_{3}-{(1-p)}^{2})}^{2}}{{p}^{2}(1-{p}^{2})+2p(1-p)(1-2p(1-p))+{(1-p)}^{2}(1-{(1-p)}^{2})}\right)}^{0.5}\hfill \end{array}$$

The pragmatic hypotheses in Figure 5 are further tested using data from Brentani et al. [44], which is presented in Table 2. This study had the goal of verifying association between the APOE-ϵ4 gene and Alzheimer disease. The lower panels of Figure 5 present the 80% HPD regions for the distribution of this gene in each of the eight groups observed in the study. Additionally, they present two simulated datasets, 9 and 10. Groups 9 and 10 were generated by populations that were, respectively, not under and under the HW equilibrium. Group 9 and 10 fall, respectively, outside and inside of the pragmatic hypothesis.

**Example**

**8**

(Bioequivalence)

**.**Assume that $\mathbf{Z}=(X,Y)\sim N(({\mu}_{1},{\mu}_{2}),{\sigma}^{2}{\mathbb{I}}_{2})$, with σ known. We derive the pragmatic hypothesis for ${H}_{0}:{\mu}_{1}={\mu}_{2}$, that is, for $\{({\mu}_{1},{\mu}_{2})\in {\mathbb{R}}^{2}:{\mu}_{1}={\mu}_{2}\}$. Such a test might be used in a bioequivalence study, where X and Y are the concentrations of an active ingredient in a generic (test) drug medication and in the brand name (reference) medication [45], respectively. As ${H}_{0}$ is composite, it helps to derive the pragmatic hypothesis of its constituents.In order to do so, let ${\theta}_{0}=({\mu}_{0},{\mu}_{0})$, ${\mu}_{0}\in \mathbb{R}$, ${\theta}^{*}=({\mu}_{1}^{*},{\mu}_{2}^{*})$, and ${H}^{{\theta}_{0}}:\theta ={\theta}_{0}$. If ${\delta}_{\mathbf{Z},{\theta}^{*}}\left(\mathbf{z}\right)=\mathbf{E}\left[{(X-{\mathbf{z}}_{1})}^{2}+{(Y-{\mathbf{z}}_{2})}^{2}|\theta ={\theta}^{*}\right]$ and $g\left(x\right)=\sqrt{x}$, then
Hence, $Pg(\left\{{\theta}_{0}\right\},B{P}_{\mathbf{Z}},\u03f5)=\left\{({\mu}_{1}^{*},{\mu}_{2}^{*}):{({\mu}_{1}^{*}-{\mu}_{0})}^{2}+{({\mu}_{2}^{*}-{\mu}_{0})}^{2}\le 2{\u03f5}^{2}{\sigma}^{2}\right\}$, which is a circle with center $({\mu}_{0},{\mu}_{0})$ and radius $\sqrt{2}\u03f5\sigma $, as depicted on the left panel of Figure 6. In this case, the pragmatic hypothesis is the Tier 1 Equivalence Test hypothesis suggested by the US Food and Drug Administration [45]. The pragmatic hypothesis for ${H}_{0}:{\mu}_{1}={\mu}_{2}$ is obtained by taking the union of the pragmatic hypotheses associated to its constituents, as illustrated in the right panel of Figure 6. Specifically,

$$\begin{array}{cc}\hfill {BP}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})\phantom{\rule{1.em}{0ex}}& =\sqrt{\frac{{({\mu}_{1}^{*}-{\mu}_{0})}^{2}+{({\mu}_{2}^{*}-{\mu}_{0})}^{2}}{2{\sigma}^{2}}}\hfill \end{array}$$

$$\begin{array}{c}\hfill Pg({H}_{0},B{P}_{\mathbf{Z}},\u03f5)=\left\{({\mu}_{1}^{*},{\mu}_{2}^{*}):|{\mu}_{2}^{*}-{\mu}_{1}^{*}|\le \u03f5\sigma \right\}\end{array}$$

The pragmatic hypothesis for ${H}_{0}$ using KL is obtained similarly. Note that
Therefore, $Pg(\left\{{\theta}_{0}\right\},K{L}_{\mathbf{Z}},\u03f5)=\left\{({\mu}_{1}^{*},{\mu}_{2}^{*}):{({\mu}_{1}^{*}-{\mu}_{0})}^{2}+{({\mu}_{2}^{*}-{\mu}_{0})}^{2}\le 2\u03f5{\sigma}^{2}\right\}$ and

$$\begin{array}{c}\hfill {KL}_{\mathbf{Z}}({\theta}_{0},{\theta}^{*})=0.5{BP}_{\mathbf{Z}}^{2}({\theta}_{0},{\theta}^{*})\end{array}$$

$$\begin{array}{cc}\hfill Pg({H}_{0},K{L}_{\mathbf{Z}},\u03f5)\phantom{\rule{1.em}{0ex}}& =Pg({H}_{0},K{L}_{\mathbf{Z}},0.5{\u03f5}^{2})\hfill \end{array}$$

The pragmatic hypothesis for ${H}_{0}$ that is obtained using CD has no analytic expression. However, by observing that $N(\mu ,{\sigma}^{2})=\mu +\sigma N(0,1)$, it is possible to show that there exists a monotonically increasing function, $h:\mathbb{R}\u27f6\mathbb{R}$, such that

$$\begin{array}{cc}\hfill Pg({H}_{0},C{D}_{\mathbf{Z}},0.5{\u03f5}^{2})\phantom{\rule{1.em}{0ex}}& =\left\{({\mu}_{1}^{*},{\mu}_{2}^{*}):|{\mu}_{2}^{*}-{\mu}_{1}^{*}|\le h\left(\u03f5\right)\sigma \right\}\hfill \end{array}$$

That is, the pragmatic hypothesis associated to ${H}_{0}$ have the same shape as in the right panel of Figure 6. They differ solely on how many standard deviations correspond to the width of the pragmatic hypothesis.

## 5. Final Remarks

The spiral structure studied in the work by the authors of [10] can be used to describe scientific evolution. However, in order for the analogy to be complete, it is necessary to indicate what types of scientific theories or hypotheses are effectively tested in the acceptance vertex of the hexagon of oppositions. We defend that these are pragmatic hypotheses, which are sufficiently precise for the end-user of the theory.

In order to make this statement formal, we introduce three methods for constructing a pragmatic hypothesis associated to a precise hypothesis. These methods are based on three predictive dissimilarity functions: KL, BP and CD. Each of these methods have different advantages. For instance, the scale of BP and CD is more interpretable than KL, making it easier to determine whether the former are large or small. On the other hand, BP relies on the definition of more functions than KL and CD, such as ${\delta}_{\mathbf{Z},{\theta}_{0}}\left(\mathbf{z}\right)$ in Definition 2. If these function are chosen inadequately, then the shape of the resultant pragmatic hypothesis might be counterintuitive or meaningless. Finally, CD often does not have an analytic expression. It relies on numerical integration over the sample space, which can be taxing in high dimensions.

The applications at Section 4 present adequate choices of metrics and bounds (defining pragmatic hypotheses) for some given theoretical and experimental setups. Nevertheless, the authors did not propose a general or automated recipe for making these choices, nor do they think this to be a feasible goal. In future research, the authors intend to explore a variety of application cases, some using historical data of important experiments, and discuss possible choices of metrics and bounds for each case. The authors hope that, in time, the accumulation of such examples will provide useful guidelines for the good use of methods developed in this paper, in the same way that the statistical literature provides useful guidelines for choosing good statistical models for practical applications.

## Author Contributions

All authors contributed equally to this paper.

## Funding

This work was partially supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) grants PQ 306943-2017-4, 301206-2011-2, and 301892-2015-6; and Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) grants 2019/11321-9, 2017/03363-8, 2014/25302-2, CEPID-2013/07375-0, and CEPID-2014/50279-4. This study was also financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Finance Code 001.

## Acknowledgments

The authors are grateful for the support of the Institute of Mathematics and Statistics of the University of São Paulo (IME-USP), and the Department of Statistics of UFSCar—The Federal University of São Carlos. Finally, the authors are grateful for advice and comments received from anonymous referees; from participants of the 6th World Congress on the Square of Opposition, held on November 1 to 5, 2018, at Chania, Crete, having as main organizers Jean-Yves Béziau and Ioannis Vandoulakis; and from participants of the 39th International Workshop on Bayesian Inference and Maximum Entropy Methods, held on June 30 to July 5, 2019, at Garching bei München, Bavaria, having as main organizers Udo von Toussaint and Roland Preuss.

## Conflicts of Interest

The authors declare no conflicts of interest.

## Appendix A. Proofs

**Proof**

**of**

**Lemma**

**1.**

Let $\theta \in Pg({\Theta}_{0},K{L}_{\mathbf{Z}},8{\u03f5}^{2})$. It follows from Theorem 1 that there exist ${\theta}_{0}\in {\Theta}_{0}$, such that $K{L}_{\mathbf{Z}}({\theta}_{0},\theta )\le 0.5{\u03f5}^{2}$. Conclude from Pinsker’s inequality that $C{D}_{\mathbf{Z}}({\theta}_{0},\theta )\le \sqrt{{8}^{-1}K{L}_{\mathbf{Z}}({\theta}_{0},\theta )}=\u03f5$, that is, $\theta \in Pg({\Theta}_{0},C{D}_{\mathbf{Z}},\u03f5)$. □

**Proof**

**of**

**Theorem**

**1.**

Let $Pg$ be logically coherent. Pick an arbitrary ${\theta}_{0}\in {\Theta}_{0}$ and note that, if $R\left(x\right)\equiv Pg\left(\left\{{\theta}_{0}\right\}\right)$, then ${\varphi}_{Pg\left(\left\{{\theta}_{0}\right\}\right)}^{R}\left(x\right)=0$. Since $Pg$ is logically coherent, conclude that ${\varphi}_{Pg\left({\Theta}_{0}\right)}^{R}\left(x\right)\equiv 0$, that is, $Pg\left(\left\{{\theta}_{0}\right\}\right)\subseteq Pg\left({\Theta}_{0}\right)$. Since ${\theta}_{0}\in {\Theta}_{0}$ was arbitrary, conclude that
Next, let $R\left(x\right)\equiv {\bigcap}_{{\theta}_{0}\in {\Theta}_{0}}Pg{\left(\left\{{\theta}_{0}\right\}\right)}^{c}$. For every ${\theta}_{0}\in {\Theta}_{0}$, ${\varphi}_{Pg\left(\left\{{\theta}_{0}\right\}\right)}^{R}\left(x\right)=1$. Since $Pg$ is logically coherent, ${\varphi}_{Pg\left({\Theta}_{0}\right)}^{R}\equiv 1$, that is, $Pg\left({\Theta}_{0}\right)\subseteq {R}^{c}\equiv {\bigcup}_{{\theta}_{0}\in {\Theta}_{0}}Pg\left(\left\{{\theta}_{0}\right\}\right)$. Conclude that
It follows from Equations (A1) and (A2) that $Pg\left({\Theta}_{0}\right)={\bigcup}_{{\theta}_{0}\in {\Theta}_{0}}Pg\left(\left\{{\theta}_{0}\right\}\right)$. It also follows from direct calculation that, if $Pg\left({\Theta}_{0}\right)={\bigcup}_{{\theta}_{0}\in {\Theta}_{0}}Pg\left(\left\{{\theta}_{0}\right\}\right)$, then $Pg$ is logically coherent. □

$$\begin{array}{cc}\hfill \bigcup _{{\theta}_{0}\in {\Theta}_{0}}Pg\left(\left\{{\theta}_{0}\right\}\right)\phantom{\rule{1.em}{0ex}}& \subseteq Pg\left({\Theta}_{0}\right)\hfill \end{array}$$

$$\begin{array}{cc}\hfill Pg\left({\Theta}_{0}\right)& \subseteq \bigcup _{{\theta}_{0}\in {\Theta}_{0}}Pg\left(\left\{{\theta}_{0}\right\}\right)\hfill \end{array}$$

**Proof**

**of**

**Theorem**

**2.**

Let ${\Theta}_{0}\subseteq \Theta $
□

$$\begin{array}{cc}\hfill Pg(f\left[{\Theta}_{0}\right],{d}_{\mathbf{Z}}^{*},\u03f5)\phantom{\rule{1.em}{0ex}}& =\{{\theta}^{*}\in {\Theta}^{*}:\exists {\theta}_{0}^{*}\in f\left[{\Theta}_{0}\right]\phantom{\rule{4.pt}{0ex}}\mathrm{s}.\mathrm{t}.\phantom{\rule{4.pt}{0ex}}{d}_{\mathbf{Z}}^{*}({\theta}^{*},{\theta}_{0}^{*})\le \u03f5\}\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =\{{\theta}^{*}\in {\Theta}^{*}:\exists {\theta}_{0}^{*}\in f\left[{\Theta}_{0}\right]\phantom{\rule{4.pt}{0ex}}\mathrm{s}.\mathrm{t}.\phantom{\rule{4.pt}{0ex}}{d}_{\mathbf{Z}}({f}^{-1}\left({\theta}^{*}\right),{f}^{-1}\left({\theta}_{0}^{*}\right))\le \u03f5\}\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =f\left[\{\theta \in \Theta :\exists {\theta}_{0}\in {\Theta}_{0}\phantom{\rule{4.pt}{0ex}}\mathrm{s}.\mathrm{t}.\phantom{\rule{4.pt}{0ex}}{d}_{\mathbf{Z}}(\theta ,{\theta}_{0})\le \u03f5\}\right]\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =f\left[Pg({\Theta}_{0},{d}_{\mathbf{Z}},\u03f5)\right]\hfill \end{array}$$

**Proof**

**of**

**Theorem**

**3.**

Since the ${Z}_{i}$’s are i.i.d., ${\mathrm{KL}}_{m}({\theta}_{0},{\theta}^{*})=m{\mathrm{KL}}_{{Z}_{1}}({\theta}_{0},{\theta}^{*})$. It follows that
Thus, ${\left(Pg({\Theta}_{0},K{L}_{m},\u03f5)\right)}_{m\ge 1}$ is a non-increasing sequence of sets. It follows that
where the next-to-last equality follows from the assumption that ${\left({F}_{\theta}\right)}_{\theta \in \Theta}$ is identifiable. The proofs for the $CD$ divergence follows from the fact that $\mathrm{TV}({\mathbb{P}}_{{\theta}_{0}},{\mathbb{P}}_{{\theta}^{*}})\le \sqrt{\mathrm{KL}({\mathbb{P}}_{{\theta}_{0}},{\mathbb{P}}_{{\theta}^{*}})}$. □

$$\begin{array}{cc}\hfill Pg({\Theta}_{0},K{L}_{m},\u03f5)\phantom{\rule{1.em}{0ex}}& =\bigcup _{{\theta}_{0}\in {\Theta}_{0}}Pg(\left\{{\theta}_{0}\right\},{\mathrm{KL}}_{m},\u03f5)\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =\bigcup _{{\theta}_{0}\in {\Theta}_{0}}Pg(\left\{{\theta}_{0}\right\},m{\mathrm{KL}}_{{Z}_{1}},\u03f5)=\bigcup _{{\theta}_{0}\in {\Theta}_{0}}\left\{{\theta}^{*}\in \Theta :K{L}_{{Z}_{1}}({\theta}_{0},{\theta}^{*})\le {m}^{-1}\u03f5\right\}\hfill \end{array}$$

$$\begin{array}{cc}\hfill \underset{m\to \infty}{lim}Pg({\Theta}_{0},K{L}_{m},\u03f5)\phantom{\rule{1.em}{0ex}}& =\bigcap _{m\ge 1}\bigcup _{{\theta}_{0}\in {\Theta}_{0}}\left\{{\theta}^{*}\in \Theta :{\mathrm{KL}}_{{Z}_{1}}({\theta}_{0},{\theta}^{*})\le {m}^{-1}\u03f5\right\}\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =\bigcup _{{\theta}_{0}\in {\Theta}_{0}}\bigcap _{m\ge 1}\left\{{\theta}^{*}\in \Theta :{\mathrm{KL}}_{{Z}_{1}}({\theta}_{0},{\theta}^{*})\le {m}^{-1}\u03f5\right\}\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =\bigcup _{{\theta}_{0}\in {\Theta}_{0}}\left\{{\theta}^{*}\in \Theta :{\mathrm{KL}}_{{Z}_{1}}({\theta}_{0},{\theta}^{*})=0\right\}\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& =\bigcup _{{\theta}_{0}\in {\Theta}_{0}}\left\{{\theta}_{0}\right\}={\Theta}_{0}\hfill \end{array}$$

## References

- Izbicki, R.; Esteves, L.G. Logical consistency in simultaneous statistical test procedures. Log. J. IGPL.
**2015**, 23, 732–758. [Google Scholar] [CrossRef] - Esteves, L.G.; Izbicki, R.; Stern, J.M.; Stern, R.B. The logical consistency of simultaneous agnostic hypothesis tests. Entropy
**2016**, 18, 256. [Google Scholar] [CrossRef] - Stern, J.; Izbicki, R.; Esteves, L.; Stern, R. Logically-Consistent Hypothesis Testing and the Hexagon of Oppositions. Log. J. IGPL
**2017**, 25, 741–757. [Google Scholar] [CrossRef] - Blanché, R. Structures Intellectuelles: Essai sur l’Organisation Systématique des Concepts; Vrin: Paris, France, 1966. (In French) [Google Scholar]
- Béziau, J.Y. The power of the hexagon. Log. Univers.
**2012**, 6, 1–43. [Google Scholar] [CrossRef] - Béziau, J.Y. Opposition and order. In New Dimensions of the Square of Opposition; Béziau, J.Y., Gan-Krzywoszynska, K., Eds.; Philosophia Verlag: Munich, Germany, 2015; pp. 1–11. [Google Scholar]
- Carnielli, W.; Pizzi, C. Modalities and Multimodalities (Logic, Epistemology, and the Unity of Science); Springer: Berlin, Germany, 2008. [Google Scholar]
- Dubois, D.; Prade, H. On several representations of an uncertain body of evidence. In Fuzzy Information and Decision Processes; Gupta, M., Sanchez, E., Eds.; Elsevier: Amsterdam, The Netherlands, 1982; pp. 167–181. [Google Scholar]
- Dubois, D.; Prade, H. From Blanché’s Hexagonal Organization of Concepts to Formal Concept Analysis and Possibility Theory. Log. Univers.
**2012**, 6, 149–169. [Google Scholar] [CrossRef] - Gallais, P.; Pollina, V. Hegaxonal and Spiral Structure in Medieval Narrative. Yale Fr. Stud.
**1974**, 51, 115–132. [Google Scholar] [CrossRef] - Gallais, P. Dialectique Du Récit Mediéval: Chrétien de Troyes et l’Hexagone Logique; Rodopi: Amsterdam, The Netherlands, 1982. (In French) [Google Scholar]
- Stern, J.M. Symmetry, Invariance and Ontology in Physics and Statistics. Symmetry
**2011**, 3, 611–635. [Google Scholar] [CrossRef] - Stern, J.M. Continuous versions of Haack’s Puzzles: Equilibria, Eigen-States and Ontologies. Log. J. IGPL
**2017**, 25, 604–631. [Google Scholar] [CrossRef] - DeGroot, M.H.; Schervish, M.J. Probability and Statistics; Pearson Education: London, UK, 2012. [Google Scholar]
- Berger, J.O. Statistical Decision Theory and Bayesian Analysis; Springer Science & Business Media: Berlin, Germany, 2013. [Google Scholar]
- Bucher, J.L. The Metrology Handbook, 2nd ed.; ASQ Quality Press: Milwaukee, WI, USA, 2012. [Google Scholar]
- Czichos, H.; Saito, T.; Smith, L. Springer Handbook of Metrology and Testing, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
- Cohen, R.; Crowe, K.; DuMond, J. The Fundamental Constants of Physics; CODATA/Interscience Publishers: Geneva, Switzerland, 1957. [Google Scholar]
- Cohen, R. Mathematical Analysis of the Universal Physical Constants. Il Nuovo Cimento
**1957**, 6, 187–214. [Google Scholar] [CrossRef] - Lévy-Leblond, J. On the conceptual nature of the physical constants. Il Nuovo Cimento
**1977**, 7, 187–214. [Google Scholar] - Pakkan, M.; Akman, V. Hypersolver: A graphical tool for commonsense set theory. Inform. Sci.
**1995**, 85, 43–61. [Google Scholar] [CrossRef] - Akman, V.; Pakkan, M. Nonstandard set theories and information management. J. Intell. Inf. Syst.
**1996**, 6, 5–31. [Google Scholar] [CrossRef] - Wainwright, M.J. Stochastic Processes on Graphs with Cycles: Geometric and Variational Approaches. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, April 2002. [Google Scholar]
- Bishop, C.M. Pattern Recognition and Machine Learning (Information Science and Statistics); Springer: Heidelberg, Germany, 2006. [Google Scholar]
- Iordanov, B. HyperGraphDB: A generalized graph database. In International Conference on Web-Age Information Management; Springer: Berlin, Germany, 2010; pp. 25–36. [Google Scholar]
- Gelman, A.; Vehtari, A.; Jylänki, P.; Sivula, T.; Tran, D.; Sahai, S.; Blomstedt, P.; Cunningham, J.P.; Schiminovich, D.; Robert, C. Expectation propagation as a way of life: A framework for Bayesian inference on partitioned data. arXiv
**2014**, arXiv:1412.4869. [Google Scholar] - Greimas, A. Structural Semantics: An Attempt at a Method; University of Nebraska Press: Lincoln, NE, USA, 1983. [Google Scholar]
- Propp, V. Morphology of the Folktale; University of Texas Press: Austin, TX, USA, 2000. [Google Scholar]
- Stern, J.M. Jacob’s Ladder and Scientific Ontologies. Cybern. Human Knowing
**2014**, 21, 9–43. [Google Scholar] - Stern, J.M. Constructive Verification, Empirical Induction, and Falibilist Deduction: A Threefold Contrast. Information
**2011**, 2, 635–650. [Google Scholar] [CrossRef] - Abraham, R.; Marsden, J.E. Foundations of Mechanics; Addison-Wesley: Boston, MA, USA, 2013. [Google Scholar]
- Hawking, S. The Illustrated On the Shoulders of Giants: The Great Works of Physics and Astronomy; Running Press: Philadelphia, PA, USA, 2004. [Google Scholar]
- Stern, J.M.; Nakano, F. Optimization Models for Reaction Networks: Information Divergence, Quadratic Programming and Kirchhoff’s Laws. Axioms
**2014**, 3, 109–118. [Google Scholar] [CrossRef] - Coscrato, V.; Izbicki, R.; Stern, R.B. Agnostic tests can control the type I and type II errors simultaneously. Braz. J. Probab. Stat. 2019. Available online: https://www.imstat.org/wp-content/uploads/2019/01/BJPS431.pdf (accessed on 9 September 2019).
- Izbicki, R.; Fossaluza, V.; Hounie, A.G.; Nakano, E.Y.; de Braganca Pereira, C.A. Testing allele homogeneity: The problem of nested hypotheses. BMC Genet.
**2012**, 13, 103. [Google Scholar] [CrossRef] [PubMed] - Da Silva, G.M.; Esteves, L.G.; Fossaluza, V.; Izbicki, R.; Wechsler, S. A bayesian decision-theoretic approach to logically-consistent hypothesis testing. Entropy
**2015**, 17, 6534–6559. [Google Scholar] [CrossRef] - Fossaluza, V.; Izbicki, R.; da Silva, G.M.; Esteves, L.G. Coherent hypothesis testing. Am. Statist.
**2017**, 71, 242–248. [Google Scholar] [CrossRef] - Pereira, C.A.B.; Stern, J.M.; Wechsler, S. Can a Signicance Test be Genuinely Bayesian? Bayesian Anal.
**2008**, 3, 79–100. [Google Scholar] [CrossRef] - Stern, J.M.; Pereira, C.A.D.B. Bayesian Epistemic Values: Focus on Surprise, Measure Probability! Log. J. IGPL.
**2014**, 22, 236–254. [Google Scholar] [CrossRef] - Casella, G.; Berger, R.L. Statistical Inference; Duxbury Press: Pacific Grove, CA, USA, 2002. [Google Scholar]
- Wechsler, S.; Izbicki, R.; Esteves, L.G. A Bayesian look at nonidentifiability: A simple example. Am. Statist
**2013**, 67, 90–93. [Google Scholar] [CrossRef] - Coscrato, V.; Esteves, L.G.; Izbicki, R.; Stern, R.B. Interpretable hypothesis tests. arXiv
**2019**, arXiv:1904.06605. [Google Scholar] - Hardy, G. Mendelian proportions in a mixed population. 1908. Yale J. Biol. Med.
**2003**, 76, 79. [Google Scholar] [PubMed] - Brentani, H.; Nakano, E.Y.; Martins, C.B.; Izbicki, R.; Pereira, C.A.d.B. Disequilibrium coefficient: A Bayesian perspective. Stat. Appl. Genet. Mol.
**2011**, 10. [Google Scholar] [CrossRef] - Chow, S.C.; Song, F.; Bai, H. Analytical similarity assessment in biosimilar studies. AAPS J.
**2016**, 18, 670–677. [Google Scholar] [CrossRef]

**Figure 3.**$\varphi \left(x\right)$ is an agnostic test based on the region estimator $R\left(x\right)$ for testing ${H}_{0}$.

**Figure 4.**Pragmatic hypotheses in Example 6 for ${H}_{0}:\mu =0$ with KL (upper), CD (lower), $\u03f5=0.1$, and ${M}^{2}=2$. ${H}_{0}$ is represented by a red line in all figures.

**Figure 5.**Pragmatic hypotheses obtained for the HW equilibrium, depicted in red, using $m=20$, $\u03f5=0.1$ for BP and CD and $\u03f5=0.01$ for KL. The blue regions indicate the pragmatic hypothesis for HW and $p=\frac{1}{3}$ (top) and for HW (bottom). The lower, middle, and right panels were obtained, respectively, with BP, KL, and CD. The green regions in the right panels represents 80% HPD regions for the genotype distribution of each of the eight groups collected by Brentani et al. [44] and two simulated datasets.

Vertex | Orbital Astronomy | Chemical Affinity |
---|---|---|

${I}_{1}$- Enthesis/ ${A}_{1}$- Thesis | Ptolemaic/Copernican cycles and epicycles | Geoffroy affinity table and highest rank substitution |

${U}_{1}$- Analysis | Circular or oval orbits? | Ordinal or numeric affinity? |

${E}_{1}$- Antithesis | Non-circular orbits | Non-ordinal affinity |

${O}_{2}$- Apothesis /Prosthesis | Elliptic planetary orbits, focal centering of sun | Integer affinity values, for arithmetic recombination |

${Y}_{2}$- Synthesis | Kepler laws! | Morveau rules and tables! |

${I}_{2}$- Enthesis ${A}_{2}$- Thesis | Vortex physics theories, Keplerian astronomy | Affinity + stoichiometry substitution reactions |

${U}_{2}$- Analysis | Tangential or radial forces? | Total or partial reaction? |

${E}_{2}$- Antithesis | Non-tangential forces | Non-total substitutions |

${O}_{3}$- Apothesis/Prosthesis | Radial attraction forces, inverse square of distance | Reversible reactions, equilibrium conditions |

${Y}_{3}$- Synthesis | Newton laws! | Mass-Action kinetics! |

${I}_{3}$- Enthesis/ ${A}_{3}$- Thesis | Newtonian mechanics & variational equivalents | Thermodynamic theories for reaction networks |

AA | AD | DD | Decision | |
---|---|---|---|---|

1 | 4 | 18 | 94 | Agnostic |

2 | 6 | 53 | 74 | Accept |

3 | 57 | 118 | 100 | Agnostic |

4 | 58 | 97 | 48 | Agnostic |

5 | 120 | 361 | 194 | Agnostic |

6 | 206 | 309 | 142 | Accept |

7 | 110 | 148 | 44 | Accept |

8 | 34 | 22 | 12 | Agnostic |

9 | 198 | 282 | 520 | Reject |

10 | 641 | 314 | 45 | Accept |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).