Article

Symmetry and Evidential Support

by
Michael G. Titelbaum
Department of Philosophy, University of Wisconsin–Madison, 600 North Park Street, Madison, WI 53706, USA
Symmetry 2011, 3(3), 680-698; https://doi.org/10.3390/sym3030680
Submission received: 27 April 2011 / Revised: 25 August 2011 / Accepted: 30 August 2011 / Published: 16 September 2011
(This article belongs to the Special Issue Symmetry in Probability and Inference)

Abstract

This article proves that formal theories of evidential favoring must fail because they are inevitably language dependent. I begin by describing Carnap’s early confirmation theories to show how language dependence problems (like Goodman’s grue problem) arise. I then generalize to show that any formal favoring theory satisfying minimal plausible conditions will yield different judgments about the same evidence and hypothesis when they are expressed in alternate languages. This does not just indict formal theories of favoring; it also shows that something beyond our evidence must be invoked to substantively favor one hypothesis over another.

1. Introduction

We are constantly judging hypotheses in light of evidence, to see whether the evidence supports one hypothesis over another, is better explained by one hypothesis, makes one hypothesis more likely than the other, etc. In doing so we invoke a general evidential favoring relation, according to which an evidential proposition may favor one hypothesis proposition over another. When that relation holds among such relata, it reveals an important asymmetry between the hypotheses relative to the evidence—an asymmetry we rely on in making predictions, designing experiments, and deciding among courses of action. Little wonder, then, that philosophers since Hempel ([1] and [2]) have tried to formalize the relation of evidential favoring. I begin this article by explaining and evaluating Carnap’s [3] probabilistic theory of evidential favoring. Among other problems, that theory has trouble with language dependence: it yields different favoring judgments when the same hypotheses and evidence are re-expressed in a different language. In another article [4], I have proven that these language dependence problems generalize beyond the particulars of Carnap’s theory, and even beyond probabilistic approaches to evidential favoring. Here I will explain how the proof proceeds, what assumptions it makes about favoring, and how it shows that an accurate formal theory of evidential favoring is impossible. Finally, I will sketch some philosophical consequences of this result, in particular that hypotheses are more symmetrically related to evidence than we think and that choosing among hypotheses almost always requires a contribution from agents that goes beyond their evidence.

2. Carnap’s Early Theories of Favoring

Carnap’s central idea is that a body of evidence favors one hypothesis over another just in case it renders the former more probable than the latter. To make this precise, he first represents evidence and hypothesis propositions in a first-order formal language L. Such a language has predicates (F, G, H, …) that can be applied to constants (a, b, c, …) to form atomic sentences (Fa, Gb, Hc, …). By combining atomic sentences with sentential connectives, we build more complex sentences. Carnap’s languages are interpreted: each constant represents a particular object in the world; each predicate represents a property such objects may display; each sentence represents a particular proposition.
With his interpreted language in place, Carnap defines a real-valued function 𝔠 on ordered pairs of sentences in L. 𝔠(h, e) measures how strongly the evidence represented by e confirms the hypothesis represented by h. The proposition represented by e favors the proposition represented by h₁ over the proposition represented by h₂ just in case 𝔠(h₁, e) > 𝔠(h₂, e).¹ The substance of Carnap’s theory comes from the particular values of 𝔠. 𝔠 is defined in terms of a real-valued one-place function m on L. For any x, y ∈ L (with y non-contradictory),
𝔠(x, y) = m(x & y) / m(y)   (1)
Carnap requires that m be a probability function in the sense of Kolmogorov [5]: non-negative, normal, and finitely additive. He also requires that m be regular, in the sense that it assigns positive values to all non-contradictions. Combined with Equation (1), these requirements make 𝔠(·, e) a probability function for any non-contradictory e ∈ L. This squares with Carnap’s central idea that 𝔠(h, e)—the degree to which e confirms h—is just the probability of h on e.
Carnap then needs to specify values for m. He proves that a probability distribution over L can be fully specified by assigning values to L’s state descriptions. A state description is a maximal consistent conjunction of literals of L. (A literal is either an atomic sentence or its negation.) Any non-contradictory sentence x ∈ L is logically equivalent to a disjunction of state descriptions; that disjunction is called x’s disjunctive normal form.
For example, take a simple language L_G with two constants a and b and one predicate G. Table 1 lays out a set of state descriptions for L_G. (The table’s leftmost column provides a name for each state description.)² Suppose we assign m-values to each of the four state descriptions in Table 1. Carnap proves that the state descriptions’ m-values generate a probabilistic distribution over L_G just in case they are non-negative and sum to 1. If that condition is met, the m-value of any non-contradictory x ∈ L_G can be calculated by summing the m-values of the state descriptions in its disjunctive normal form. (A contradictory x receives an m-value of 0.)
Take the m† column of Table 1, for instance. Setting aside for the moment where m† comes from, the m†-values specified for the state descriptions of L_G are clearly non-negative and sum to 1. This induces a probability function m† over the entire L_G language. For example, since the disjunctive normal form of sentence Ga is a disjunction of s₁ and s₂, m†(Ga) is the sum of the m†-values in Table 1’s first two rows. In other words, m†(Ga) = 1/2.
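Carnap’s definitions here are entirely combinatorial, so they are easy to compute with. The following is a minimal Python sketch, with a representation of my own choosing (not Carnap’s): a state description of L_G is a pair of truth-values for Ga and Gb, a sentence is the set of state descriptions in its disjunctive normal form, and m† gives every state description equal weight.

    from itertools import product
    from fractions import Fraction

    # State descriptions of L_G as (Ga, Gb) truth-value pairs:
    # s1 = (T, T), s2 = (T, F), s3 = (F, T), s4 = (F, F).
    states = list(product([True, False], repeat=2))

    # m-dagger treats all four state descriptions symmetrically.
    m_dagger = {s: Fraction(1, 4) for s in states}

    def m(x):
        """m-value of a sentence x, where x is given as the set of
        state descriptions in its disjunctive normal form."""
        return sum(m_dagger[s] for s in x)

    def c(h, e):
        """Equation (1): c(h, e) = m(h & e) / m(e); conjunction of
        sentences is intersection of their state-description sets."""
        return m(h & e) / m(e)

    Ga = {s for s in states if s[0]}   # DNF: s1 v s2
    Gb = {s for s in states if s[1]}   # DNF: s1 v s3

    print(m(Ga))                     # 1/2, matching the text
    print(c(Gb, Ga))                 # 1/2
    print(c(set(states) - Gb, Ga))   # 1/2 as well: no learning from experience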
Carnap has now moved from the problem of specifying 𝔠-values to the problem of specifying m-values, and from the problem of specifying an entire m-distribution to the problem of specifying m over state descriptions. He notes that for any x, m(x) = 𝔠(x, T), where T is an arbitrary tautology in L_G. So he thinks of m(x) as the probability of the proposition represented by x relative to a tautology, or relative to an evidence set containing no empirical information.³ The state descriptions of L_G describe the basic states of the world L_G is capable of discriminating among, and it is natural to think that lacking any empirical information, each of these states should be treated symmetrically. That is, we should assign each state description the same m-value. This gives us distribution m† in Table 1, which (applying Equation (1)) yields a confirmation function Carnap calls c†.
c† captures some intuitive favoring relations. For example, a bit of calculation with Table 1 and Equation (1) reveals that
1 = c†(Ga, Ga & Gb) > c†(¬Ga, Ga & Gb) = 0   (2)
Recall that evidence favors one hypothesis over another just in case the former receives a higher 𝔠-value on the evidence than the latter. So c† is suggesting a sensible favoring relation: Since the evidence represented by Ga & Gb entails the hypothesis represented by Ga while refuting the hypothesis represented by ¬Ga, that evidence favors the former over the latter.
But broadening our view reveals clear flaws in c† as an explication of evidential favoring. Suppose, for example, that a and b represent two emeralds to be sampled, and G represents the property of being green. We might think that the evidence represented by Ga favors the hypothesis represented by Gb over the hypothesis represented by ¬Gb. Yet from Table 1 we can calculate
c†(Gb, Ga) = c†(¬Gb, Ga) = 1/2   (3)
The problem generalizes. When we construct c† for larger languages, we find that even when many objects are represented, the evidence that all but one of them have the property G does not confirm the hypothesis that the last object has G over the hypothesis that it does not. As Carnap puts it, c† does not allow learning from experience.
The trouble is that m† is too symmetrical. To get learning from experience, we need a state in which all the emeralds are green to be more probable than a state in which the first run of emeralds are green but the last one is not. By treating all state descriptions symmetrically, m† renders our evidence incapable of discriminating among full states of the world consistent with that evidence.
So Carnap constructs a new confirmation function c* from the probability function m*. Instead of treating state descriptions symmetrically, m* treats structure descriptions symmetrically. Intuitively, a structure description describes the distribution of properties over objects at an abstract level, without specifying which objects occupy which places in the property distribution. A structure description might say “one sampled emerald is green while another is not,” without telling us which of the emeralds is the green one.
At the technical level, a structure description is a disjunction of state descriptions obtainable from each other by permuting constants. Our simple language L_G has three structure descriptions: s₁ by itself (“both emeralds are green”), the disjunction of s₂ and s₃ (“exactly one of the emeralds is green”), and s₄ by itself (“neither emerald is green”). m* assigns each structure description the same value; within a given structure description, m*-value is divided up equally among the state descriptions. The resulting m*-values are displayed in Table 1.⁴
c* solves the learning from experience problem, because
2/3 = c*(Gb, Ga) > c*(¬Gb, Ga) = 1/3   (4)
This evidential favoring is possible because s₁ has a higher m*-value than s₂. There’s only one way to arrange the world such that both emeralds are green, so there’s only one state description in s₁’s structure description and that state description gets the full 1/3 probability. Yet there are two possible arrangements in which exactly one emerald is green (first emerald green and the second not, or the second green and the first not), so s₂ has to share a structure description with s₃ and gets an m*-value of only 1/6. Symmetric treatment of structure descriptions yields asymmetric treatment of state descriptions, making the sought-after favoring relations possible.⁵
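The same sketch extends to m* and c*: each state description’s m*-value comes from its structure description, i.e., its orbit under permutation of the constants a and b. (Representation mine, as before; the printed values match Equation (4).)

    from itertools import product
    from fractions import Fraction

    # State descriptions of L_G as (Ga, Gb) truth-value pairs.
    states = list(product([True, False], repeat=2))

    # A structure description is an orbit of state descriptions under
    # permutation of the constants a and b; L_G has three of them.
    m_star = {}
    for s in states:
        orbit = {s, (s[1], s[0])}                 # swap a and b
        m_star[s] = Fraction(1, 3) / len(orbit)   # 1/3 per orbit, split evenly

    def c_star(h, e):
        return sum(m_star[s] for s in h & e) / sum(m_star[s] for s in e)

    Ga = {s for s in states if s[0]}
    Gb = {s for s in states if s[1]}

    print(c_star(Gb, Ga))                 # 2/3
    print(c_star(set(states) - Gb, Ga))   # 1/3, as in Equation (4)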
But c* has a new problem: language dependence. Even with a language as simple as L_G, we can construct a version of Goodman’s [6] “grue” problem. Suppose we define a language L_H with the same constants as L_G representing the same objects. But L_H has a predicate H related to G as follows: for any constant x, Hx has the same truth-value as Gx ≡ (x = a).⁶ Table 2 displays a set of state descriptions of L_H and their m*-values, calculated just as before. But it also reveals a further fact: each state description of L_H expresses the same proposition as a state description of L_G. That means L_H can express every proposition expressible in L_G: given any L_G sentence, we find its disjunctive normal form in L_G, replace its L_G state descriptions with the corresponding L_H state descriptions from Table 2, and are left with an L_H synonym for the L_G original. Each L_G sentence has a synonym in L_H: an L_H sentence that expresses the same proposition as the L_G original.
Consulting Table 2, we find that
1/3 = c*(¬Hb, Ha) < c*(Hb, Ha) = 2/3   (5)
But Ha is a synonym for Ga and ¬Hb is a synonym for Gb. Equation (5) indicates exactly the opposite favoring relation—relative to the same evidence—as Equation (4)!
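The reversal can be checked computationally by rerunning the very same m* construction with the worlds carved up by L_H’s predicate H instead of G. This sketch (same representational assumptions as above) uses the fact that Ha is a synonym for Ga while ¬Hb is a synonym for Gb:

    from itertools import product
    from fractions import Fraction

    # State descriptions of L_H as (Ha, Hb) truth-value pairs; m* comes
    # from L_H's structure descriptions exactly as it did for L_G.
    states = list(product([True, False], repeat=2))
    m_star = {}
    for s in states:
        orbit = {s, (s[1], s[0])}
        m_star[s] = Fraction(1, 3) / len(orbit)

    def c_star(h, e):
        return sum(m_star[s] for s in h & e) / sum(m_star[s] for s in e)

    Ha = {s for s in states if s[0]}        # the synonym of Ga
    Gb = {s for s in states if not s[1]}    # not-Hb, the synonym of Gb

    print(c_star(Gb, Ha))                 # 1/3
    print(c_star(set(states) - Gb, Ha))   # 2/3: Equation (5)'s reversal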
Goodman reverses the confirmation relations among propositions indicated by a theory of evidential favoring by re-expressing the same propositions in a new language. But we can also make those favoring relations disappear entirely. Consider a new language L_S. L_S has one constant, o, which names the ordered pair consisting of first the emerald named by a and then the emerald named by b in L_G. L_S has two predicates. G₁x obtains when the first element of the ordered pair named by x is green. Sx obtains when the two objects in the ordered pair are the same with respect to greenness—they are either both green or both not. Since there is only one constant in this new language, permuting constants does not turn any state description into another; each state description has its own structure description. Table 3 displays the resulting m*-values for L_S.
Again, each state description of L_G has a synonym in L_S, so every proposition expressible in L_G is expressible in L_S. And now we have
c*(G₁o ≡ So, G₁o) = c*(¬[G₁o ≡ So], G₁o) = 1/2   (6)
G₁o expresses the same proposition as Ga did in language L_G, while G₁o ≡ So is a synonym for Gb. In L_S, the favoring relations we found earlier disappear. L_H and L_S reveal that the facts c* relies upon—facts about which propositions are expressed by sentences that share structure descriptions with others—are artifacts of language choice.

3. Alternative Approaches to Favoring

Evidential favoring is a relation among propositions—evidence and hypotheses. With only a few exceptions (which we’ll discuss later), favoring relations among propositions should turn out the same regardless of which language the propositions are expressed in. Yet Carnap’s formal analyses of confirmation yield different favoring judgments for the same propositions when those propositions are expressed in different languages.
Once the possibility of a language like L_S—a language in which all the objects are referred to via a single n-tuple name—has come up, it may seem like any formal Carnap-style theory of favoring is doomed. But some features of evidence and hypotheses are invariant among all the languages we have considered, even L_S. Consider again the situation in which our evidence is that emerald a is green, our first hypothesis is that emerald b is green, and our second hypothesis is that emerald b is not. In each of the three languages we have seen, that evidence and those hypotheses are each expressed by a disjunction of two state descriptions. For example, in L_G their disjunctive normal forms are:⁷
h₁: s₁ ∨ s₃
h₂: s₂ ∨ s₄
e: s₁ ∨ s₂
Notice also that each hypothesis shares one state description with the evidence but no state descriptions with the other hypothesis. That remains true across L_G, L_H, and L_S.
We could imagine a formal confirmation theory that works with these facts about state-description counts and state-description sharing. For instance, we can invent a “Proportional Theory” that counts what proportion of its state descriptions a hypothesis shares with the evidence, then favors a hypothesis with a higher proportion of shared state descriptions over one with a lower proportion. In the example at hand each hypothesis would have a shared proportion of 1/2, so neither h₁ nor h₂ would be favored by e over the other. So the Proportional Theory fails to yield intuitively plausible favoring judgments about learning from experience—but it can still serve as our toy example of a theory that gives consistent favoring results across the transition from L_G to L_H and L_S. It appears at least possible that with a bit of work something in the Proportional Theory’s neighborhood could yield plausible favoring results.
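Here is a sketch of the toy Proportional Theory under the same set-based representation (the function name is mine):

    from fractions import Fraction

    # Sentences as sets of state-description indices, per the forms above.
    h1 = {1, 3}   # s1 v s3
    h2 = {2, 4}   # s2 v s4
    e  = {1, 2}   # s1 v s2

    def shared_proportion(h, e):
        """Proportion of h's disjunctive normal form shared with e."""
        return Fraction(len(h & e), len(h))

    print(shared_proportion(h1, e))   # 1/2
    print(shared_proportion(h2, e))   # 1/2: e favors neither hypothesis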
But this appearance is illusory. In Section 5 and Section 6 I will describe a general proof that rules out all formal theories of evidential favoring. Starting with very general conditions on what any formal favoring theory would have to achieve, I will show that even if a theory yields consistent indications of favoring for a set of evidence and hypotheses across L_G through L_S, those indications disappear when the relata are expressed in yet another language. After describing the proof, I explain in Section 7 what we can learn from it about the underlying nature of evidential favoring.
Possessing such a general proof is also important for historical reasons. Despite the positive results c* yields for simple cases of enumerative induction, Carnap was dissatisfied with that confirmation function’s inability to properly model what he called “arguments by analogy”. So Carnap produced and refined a number of successors to c*. Meanwhile, Jaynes ([7] and [8]) developed a rival confirmation approach based on maximizing entropy in probability distributions. Other authors, such as Maher [9], have since tried to develop further formal theories of evidential favoring.⁸
While presenting Carnap’s first steps is useful for illustrative purposes, there’s no need to work through these further proposals, because they all run into the same problem Carnap did: language dependence.⁹ The result I present in the next few sections reveals that this is not a coincidence—any formal theory of favoring meeting very general conditions will have language dependence problems.

4. General Conditions on Evidential Favoring

What, in general, do we know about the evidential favoring relation? It is a relation among propositions, but we must express those propositions as sentences in a language to work with them. So if h₁, h₂, e ∈ L represent two hypotheses and a body of evidence in interpreted first-order language L (with a finite number of constants and predicates, and no quantifiers or identity symbol), we will write f(h₁, h₂, e) when the evidence represented by e favors the hypothesis represented by h₁ over the hypothesis represented by h₂. Unlike Carnap, we will not assume that f has anything to do with probabilities or numerical functions, though what we do assume about f will be compatible with that possibility. For example, we will assume that f is antisymmetric relative to a given e—that is, we cannot have both f(h₁, h₂, e) and f(h₂, h₁, e).
We might also think that f should obtain in some clear cases involving entailment relations. We saw one such case in Equation (2): the evidential proposition represented in L_G by Ga & Gb favors the hypothesis represented by Ga over the hypothesis represented by ¬Ga. But if we want a theory of f to detect these entailment-based favorings, we cannot expect the theory to yield correct results for every conceivable language, because some languages will hide the relevant entailments. For example, a language with a single constant (like the ordered-pair constant in L_S) might express the evidence and pair of hypotheses from Equation (2) with the atomic sentences To, Uo, and Vo. In that language there would be no way to formally recover the entailment facts in virtue of which the favoring holds.
This example shows that we cannot demand invariance of a confirmation theory’s results over every language capable of expressing the evidence and hypotheses of interest. As we go along, we will consider which kinds of language-independence we want and which are inessential. We will judge this by asking what it would reveal about the underlying evidential favoring relation if a correct theory of that relation were invariant across a particular language type. Since it seems plausible that some evidential favorings arise from entailments, we will require a formal theory to detect evidential favoring only when propositions are represented in faithful languages. While faithfulness is defined precisely in [4], the key condition is that the state descriptions of a faithful language express a set of mutually exclusive, exhaustive propositions. Because of this, two sentences x and y in a faithful language will have x ⊢ y just in case the proposition represented by x entails the proposition represented by y.¹⁰ A faithful language captures in its syntax all the entailment relations among the propositions it represents.
Unfaithful languages go wrong by failing to capture entailment relations among the propositions they represent. But a language may also go wrong by failing to represent some propositions entirely. When we work with scientific hypotheses, a body of evidence may favor one hypothesis over another because it reveals the truth of a prediction made by the former but not the latter.¹¹ A language that faithfully represents the entailment relations among evidence and hypotheses but lacks a sentence representing that prediction may leave formal theories incapable of detecting the favoring relation that obtains.
We will call a language adequate for a particular set of evidence and hypotheses if it contains sentences representing not only those three propositions but also any other propositions necessary for detecting favoring relations among the three. I have no precise characterization of the conditions a language must meet to be adequate for a particular set of evidence and hypotheses. Luckily, all we need for our result is that the concern for adequacy is a concern about representational paucity. We will suppose that for any evidence and two hypotheses, there is a set of languages adequate for those three relata. We will require formal theories of evidential favoring to yield correct results when applied to adequate languages. And we will assume that if language L is adequate for a set of evidence and hypotheses, and language L′ contains synonyms for every sentence in L, then L′ is adequate for those relata as well.
To this point we have required that the f relation be antisymmetric for a given e and that formal favoring theories detect the relation’s presence when its relata are represented in a language that is adequate for them and faithful. Although I believe various favorings based on entailments (such as the one represented in Equation (2)) hold, we will be able to prove our result without assuming there are any such favorings in the extension of f. We will, however, assume that the extension of f contains something besides entailment-based favorings. This follows Hume’s point in [13, Book I] that if favoring underwrites the inductive inferences that get us through our days, it cannot be restricted to cases in which the evidence either entails or refutes one of the hypotheses.
To be precise: We will say that h₁, h₂, and e in faithful language L are logically independent just in case any conjunction obtainable by inserting a non-negative number of negation symbols into “h₁ & h₂ & e” represents a non-empty (non-contradictory) proposition. If there exists at least one faithful L with logically independent h₁, h₂, e ∈ L such that f(h₁, h₂, e), we will say that f is substantive. I take it Hume showed us evidential favoring is substantive.¹²
Finally, we need a condition capturing what it is to say that a theory of the evidential favoring relation is “formal”. Typically, formality means that a theory operates on the structure of sentences without noticing which particular items play which roles in that structure. For example, suppose a theory of evidential favoring said that in L_G, Ga favored Gb over ¬Gb, but Gb did not favor Ga over ¬Ga. These two triples are structurally identical; such a theory would be differentiating between them strictly on the grounds that a appeared in the evidence in the first case while b appeared in the evidence in the second. A formal theory treats constants in a language as interchangeable. It also treats predicates as interchangeable, which is the condition that will play a central role in our proof. We will require f to treat predicate permutations identically, by which we mean that for any language L that is faithful and adequate for h₁, h₂, e ∈ L and any permutation π of the predicates of L, f(h₁, h₂, e) entails f(π(h₁), π(h₂), π(e)). (Where π(x) is the sentence that results from replacing each predicate occurrence in x with its image under the permutation π.)
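Stated operationally, the condition is simple. In the following sketch (representation mine), a state description of a one-constant language is a set of (predicate, truth-value) pairs, and π merely relabels predicates; identical treatment then demands that whenever f(h₁, h₂, e) holds, f also hold of the triple’s images under π.

    # A state description: a frozenset of (predicate, truth-value) pairs.
    # A sentence: the set of state descriptions in its DNF.

    def permute_sd(pi, sd):
        """Apply a predicate permutation pi (given as a dict) to one
        state description."""
        return frozenset((pi.get(pred, pred), val) for pred, val in sd)

    def permute(pi, sentence):
        return {permute_sd(pi, sd) for sd in sentence}

    pi = {"F": "G", "G": "F"}   # interchange F and G; fix everything else

    sd = frozenset({("F", True), ("G", False), ("B1", True)})
    print(permute(pi, {sd}))
    # {frozenset({('G', True), ('F', False), ('B1', True)})}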

5. First Stage of the Proof

With these notions in place, our general result can be stated simply:
General Result: If the evidential favoring relation is antisymmetric and substantive, it does not treat predicate permutations identically.
For the full details of the proof, I refer the reader to [4]. Here I will simply explain how it works, starting with an overview of the proof strategy: Suppose for reductio that evidential favoring is substantive and antisymmetric and treats predicate permutations identically. By f’s substantivity there exist an h₁, h₂, and e in faithful, adequate language L such that these three relata are logically independent and f(h₁, h₂, e). We will construct another faithful, adequate language L* with h₁*, h₂*, and e* representing the same propositions as h₁, h₂, and e respectively. Since f concerns a relation among the propositions expressed by sentences, we will have f(h₁*, h₂*, e*). Moreover, L* will be constructed so as to make available a predicate permutation π such that π(h₁*) = h₂*, π(h₂*) = h₁*, and π(e*) = e*. Since f treats predicate permutations identically, f(h₂*, h₁*, e*). But that violates f’s antisymmetry, yielding a contradiction.
The difficult part of this proof is demonstrating that the required L*, h₁*, h₂*, e*, and π can be constructed in the general case. That generalization proceeds in two stages. The first stage begins by noting that while L might contain multi-place predicates and any (finite) number of constants, given any faithful, adequate L we can always find another language that is faithful and adequate for the propositions expressed by h₁, h₂, and e but contains a single constant and only single-place predicates. We do this in much the same way we moved from language L_G to language L_S earlier: We make the new language’s constant represent a tuple of the objects represented by the constants of L, then make the new language’s predicates represent properties of objects at particular places in the tuple (“the first object in the tuple is green”) or relations among those objects (“the objects in the tuple match with respect to greenness”). Since whenever there is a faithful, adequate language representing h₁, h₂, and e there is also a faithful, adequate language representing the same propositions with only one constant and single-place predicates, we will assume without loss of generality that our original language L has only one constant, a, and only single-place predicates.
Having made this assumption, let’s fix a case in our minds by imagining that besides its one constant L has five predicates—it does not matter which particular predicates they are. We can then refer to the 32 state descriptions of L as s₁ through s₃₂. And to further fix a particular case, we can imagine that the h₁, h₂, and e in question have these disjunctive normal form equivalents:¹³
h₁: s₁ ∨ s₄ ∨ s₅ ∨ s₇
h₂: s₂ ∨ s₄ ∨ s₆ ∨ s₇
e: s₃ ∨ s₅ ∨ s₆ ∨ s₇
The reader may establish that this particular h₁, h₂, and e are logically independent as required.
We will now construct language L* with one constant, a, representing the same tuple as it represents in L. Like L, L* will have five predicates—they will be F, G, B₁, B₂, and B₃. Like L, L* will be an interpreted language; the key to our proof will be how we assign meanings to the sentences of L*. L* will be faithful, and will be capable of expressing all the propositions expressed by sentences of L. So, for instance, L* will contain an h₁*, h₂*, and e* that are synonyms for h₁, h₂, and e respectively. Because L* expresses every proposition expressible in L, and because we have assumed L is adequate for h₁, h₂, and e, L* will be adequate for h₁*, h₂*, and e*. But L* is designed so that a predicate permutation π that maps each B-predicate to itself and interchanges F with G will map h₁* to h₂*, h₂* to h₁*, and e* to itself.
To construct L*, we begin with the disjunctive normal form equivalents of h₁, h₂, and e in L. These sentences have some state descriptions in common and some distinct. For instance, s₄ appears in the disjunctive normal forms of h₁ and h₂ but not of e. So we will refer to s₄ as an “h₁h₂-sd”. (A state description that does not appear in any of our three disjunctive normal forms will be called a “ϕ-sd”.) The basic strategy for setting up L* will be to make each state description of L* the synonym of a different state description of L. (Once meanings are assigned to the state descriptions of L*, the meanings of other L* sentences will be built up by interpreting logical connectives in the usual way.)¹⁴ This means that s₄ ∈ L, for instance, will have a synonym state description in L* occurring in the disjunctive normal forms of h₁* and h₂* but not e*. We will label that state description of L* an “h₁*h₂*-sd”.
We will achieve the permutations we want by carefully selecting which propositions are expressed by which state descriptions in L*. For example, a proposition expressed by an h₁-sd in L will be expressed by an h₁*-sd in L*. But we get to select which L* state description plays that role, so we will select one that is mapped by π to an h₂*-sd. More generally, we will assign state descriptions of L* to propositions so that π maps state description types to each other as described in Table 4. (The rows of Table 4 have been numbered for reference later.)
Notice how these mappings have been set up. Once we know which state descriptions express which propositions, we can determine the disjunctive normal form equivalent of h₁*. When π is applied to that disjunctive normal form equivalent, it will replace each h₁*-sd with an h₂*-sd and each e*h₁*-sd with an e*h₂*-sd. In other words, it will replace each state description that appears in h₁* but not h₂* with a state description that appears in h₂* but not h₁*. As a result, applying π converts h₁* into h₂*. Similarly, if the mappings in Table 4 hold, π will convert h₂* to h₁* while leaving e* unchanged.¹⁵
How do we assign state descriptions to propositions so as to achieve the mappings described in Table 4? For the specific example we have been working with, the state description assignments in Table 5 will do the job. There I have indicated state descriptions schematically, leaving out the “a”s and replacing negations with overbars. The table indicates, for example, that the proposition expressed by s₄ in L is expressed in L* by the state description Fa & Ga & B₁a & B₂a & B₃a. Notice that in our example s₄ is an h₁h₂-sd, and our permutation π maps the assigned L* state description to itself. More generally, the positions in Table 5 match the positions in Table 4: s₁ is an h₁-sd, s₂ is an h₂-sd, s₃ is an e-sd, etc. (Table 5 does not assign state descriptions for ϕ-sds because those assignments turn out not to matter, as long as each ϕ-sd receives an L* equivalent that has not been assigned to any other state description of L.) Assignments in the rows of Table 5 ensure that the mappings in the same-numbered rows of Table 4 hold. In row (i), for example, when π swaps F and G the h₁*-sd synonym of s₁ is exchanged with the h₂*-sd synonym of s₂.¹⁶
The assignments in Table 5 employ a system that can be extended to the general case. First, using guidance from Table 4 we have divided the L state descriptions appearing in h₁, h₂, and/or e into pairs and singletons, depending on whether π will map the relevant state description’s synonym to another state description or to itself. The pairs and singletons have been aligned on the individual rows. Each row is assigned a unique binary code using the B-predicates. Row (i) is assigned the binary code B₁B₂B₃; row (ii) gets B₁B₂B̄₃, etc. We then determine the Fs and Gs. If one state description is to be mapped onto another (and vice versa) by π, the state description belonging to h₁* affirms F and denies G, while the description it’ll be mapped onto denies F and affirms G. The singletons, meanwhile, affirm both F and G so that π will map them to themselves.¹⁷
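This assignment scheme can be sketched mechanically. In the toy code below, the row order and helper names are mine and merely illustrative (they are not the paper’s actual Table 5): paired state descriptions share a row’s binary code and differ by swapping the roles of F and G, while singletons affirm both.

    from itertools import product

    def sd(f, g, code):
        """Build an L*-style state description from F/G truth-values and
        a binary code on the B-predicates."""
        return frozenset([("F", f), ("G", g)] + list(zip(("B1", "B2", "B3"), code)))

    codes = list(product((True, False), repeat=3))   # eight available row codes

    # A paired row: the synonyms of s1 (an h1-sd) and s2 (an h2-sd).
    s1_star = sd(True, False, codes[0])
    s2_star = sd(False, True, codes[0])
    # A singleton row: the synonym of s4 (an h1h2-sd) affirms both F and G.
    s4_star = sd(True, True, codes[1])

    def pi(state):   # the permutation interchanging F and G
        swap = {"F": "G", "G": "F"}
        return frozenset((swap.get(p, p), v) for p, v in state)

    print(pi(s1_star) == s2_star)   # True: the h1*-sd maps to the h2*-sd
    print(pi(s4_star) == s4_star)   # True: the singleton maps to itself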
With all this work done, π maps h₁* to h₂* and vice versa while leaving e* intact. And this was our goal: We started with f(h₁, h₂, e), so we have f(h₁*, h₂*, e*). We also have π(h₁*) = h₂*, π(h₂*) = h₁*, and π(e*) = e* in faithful, adequate language L*. So by the identical treatment of predicate permutations, f(h₂*, h₁*, e*). Together with f’s antisymmetry this yields our contradiction.

6. Second Stage of the Proof

Hopefully I have described the strategy for this single case in enough detail that the reader can generalize it to similar cases. Unfortunately, though, there is a class of cases to which the strategy (as so far described) will not generalize. Suppose our h₁, h₂, and e in faithful, adequate L (with five predicates) have the following disjunctive normal form equivalents:
h₁: s₁ ∨ s₄ ∨ s₅ ∨ s₇
h₂: s₀ ∨ s₂ ∨ s₄ ∨ s₆ ∨ s₇
e: s₃ ∨ s₅ ∨ s₆ ∨ s₇
These relata are exactly as before, except that I have added an extra disjunct s₀ to h₂.
If we pursue the strategy of the previous section, each of L’s state descriptions will have a state-description synonym in L*. But even if we are clever and set things up so that the L* state-description synonym of s₂ is mapped by π to the L* state-description synonym of s₁, there will be nothing left in h₁* for π to map the synonym of s₀ onto. So π will not map h₂* onto h₁* as desired.
The problem is that in our new case the numbers of each type of state description do not line up in the right way. Looking at Table 4, we can see that in order for our mapping scheme in L* to work, we need the following equalities met:
number of h₁*-sds = number of h₂*-sds
number of e*h₁*-sds = number of e*h₂*-sds
(Notice from Table 4 that the number of e*-, h₁*h₂*-, e*h₁*h₂*-, and ϕ-sds is irrelevant, as they are mapped by π to themselves.) As our strategy stands, the state descriptions of L* match up one-to-one with state descriptions of L, so unstarred versions of those equalities must be met as well. But the first (unstarred) equality is not met by the h₁ and h₂ under consideration; in this case there are more h₂-sds than h₁-sds.
And this case is important, because it represents a favoring instance under the Proportional Theory discussed in Section 3. Recall that under the Proportional Theory, one hypothesis is favored over another if a greater proportion of its state descriptions is shared with the evidence. In the case at hand, h₁ shares half the state descriptions in its disjunctive normal form with e, while h₂ shares only 2/5. So according to the Proportional Theory f(h₁, h₂, e). Moreover, every case in which the Proportional Theory indicates that one hypothesis is favored over another by evidence will violate the unstarred version of at least one of the equalities above.¹⁸ So for our proof to rule out the Proportional Theory as a viable theory of evidential favoring, it must apply to cases in which the unstarred version of at least one of these equalities fails.
In the present case there is one more h₂-sd than there are h₁-sds, but for our permutation mapping in L* to work out we need the number of h₂*-sds to match the number of h₁*-sds. To make this happen, we must break the one-to-one mapping between propositions expressed by state descriptions of L and propositions expressed by state descriptions of L*. (Breaking the one-to-one mapping allows the starred equalities to hold when their unstarred analogs do not.) In particular, if the proposition expressed by s₁—a member of h₁’s disjunctive normal form but not h₂’s—were expressed by the disjunction of two state descriptions in L*, the first of these state descriptions could be mapped by π to the synonym of s₂ while the other could be mapped to the synonym of s₀. “Splitting” s₁—an h₁-sd—into two state descriptions in L* would increase the number of h₁*-sds by one while leaving the number of h₂*-sds unchanged. This would bring the number of h₁*-sds and h₂*-sds into the required alignment.
The trick is to take the proposition expressed by s₁ in L and express it as the disjunction of two state descriptions in L*, while leaving L* adequate and faithful and matching all the other L state descriptions that participate in h₁, h₂, or e to state descriptions of L* one-to-one (so as not to alter the count of any non-ϕ-sd types besides the h₁*-sds). We will achieve this using a technique I call “explode and gather”. This technique involves introducing two intermediary languages, L′ and L″, through which we move from L to L*. Instead of assigning the state descriptions of L synonyms directly in L*, we will assign them synonyms in L′. Sentences in L′ will in turn receive synonyms in L″, which will finally receive synonyms in L*. The net result will be the representation in L* of every proposition expressed by a state description of L, but the number of h₁″-sds in L″ will be one greater than the number of h₁-sds in L.
To get a rough idea of how “explode and gather” works, imagine we had a language B whose one constant p referred to a pig and whose one predicate N indicated that the pig was north of the barn. Our h₁ might be Np and our h₂ might be ¬Np. We would then introduce a second language B′ that contained the constant p, the predicate N, and a predicate W indicating that the pig was west of the barn. This is the “explosion” phase; while each of our hypotheses was represented by a single state description in B, each is now represented by a disjunction of two state descriptions in B′. (¬Np, for instance, has become (¬Np & Wp) ∨ (¬Np & ¬Wp).) Notice that just like the state descriptions of B, the state descriptions of B′ express a set of mutually exclusive, exhaustive propositions; B′ is faithful just like B.
Now we “gather” with a new language B″. B″ has three state descriptions. One is a synonym for ¬Np, one is a synonym for Np & Wp, and the last is a synonym for Np & ¬Wp. B″ cannot express every proposition expressible in B′, but it can express every proposition expressible in B, so it inherits its adequacy from that language. Also, the state descriptions of B″ express a set of mutually exclusive, exhaustive propositions, so B″ will be faithful as well. And in B″, the synonym of h₂ is a single state description while the synonym of h₁ is a disjunction of two. There is one more h₁″-sd than there are h₁-sds, but the number of h₂″-sds equals the number of h₂-sds.
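The pig example can be checked mechanically. In this sketch (representation mine), a world is a pair of truth-values for Np and Wp, a sentence is a set of worlds, and each language is given by the sets of worlds its state descriptions pick out; the disjunct counts come out just as described.

    from itertools import product

    worlds = set(product([True, False], repeat=2))   # (Np, Wp) pairs

    h1 = {w for w in worlds if w[0]}   # Np
    h2 = worlds - h1                   # not-Np

    # Each language's state descriptions, as the sets of worlds they pick out.
    B        = [h1, h2]                                # N only
    B_prime  = [{w} for w in worlds]                   # explode: add W
    B_dprime = [{(True, True)}, {(True, False)}, h2]   # gather

    def disjunct_count(sentence, language):
        """How many of the language's state descriptions appear in the
        sentence's disjunctive normal form."""
        return sum(1 for s in language if s <= sentence)

    for lang in (B, B_prime, B_dprime):
        print(disjunct_count(h1, lang), disjunct_count(h2, lang))
    # B: 1 1   B': 2 2   B'': 2 1 -- one extra h1-sd; h2-sd count unchanged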
This example, while hopefully suggestive, cannot actually be carried through in its details—for one thing, a standard first-order language cannot contain exactly three state descriptions. But now that we have got the broad idea down, we can implement the technical details of “explode and gather” for the five-predicate example in L we have been considering.
We begin by constructing L′. L′ has the same single constant, a, as L, representing the same tuple as in L. L′ contains every predicate in L, representing the same properties as they represent in L. But L′ has two additional predicates, which (for the sake of definiteness) we will say are D and E. D and E represent properties not already represented by any of the predicates of L. (Their introduction is analogous to the introduction of W in the pig example.)¹⁹ Notice that under this construction each state description of L will have a synonym in L′. But the L′ synonym of an L state description sᵢ will not be a state description of L′. Instead, it will be a disjunction of four state descriptions: one that looks just like sᵢ but also affirms Da & Ea, another that looks just like sᵢ but appends Da & ¬Ea, etc. Thus we have “exploded” each state description of L into multiple state descriptions of L′.
Now we “gather” the state descriptions of L′ into the language L″. L″ has the same constant as L and L′, representing the same tuple. But L″ has only one more predicate than L. Instead of saying what the predicates of L″ represent, I will describe how we assign L″ state descriptions as synonyms of sentences in L′. First, assign distinct L″ state descriptions as synonyms for the L′ sentences expressing state descriptions of L—except for the L′ sentences expressing s₁ and the ϕ-sds of L. (Recall that s₁ is the state description we’re trying to “split” in two.) s₁, like all the state descriptions of L, has been exploded into four “parts” in L′. Take one of those parts and give it a synonym that is a state description in L″. Then take the other three parts and give their disjunction another state description synonym in L″.²⁰
The resulting L″ will not be able to express every proposition expressible in L′. But it will be able to express every proposition expressible in L, because each L state description has a synonym in L″. And since L is adequate for h₁, h₂, and e, L″ will be adequate for the propositions expressed by those relata as well.²¹ Moreover, each state description of L appearing in h₁, h₂, or e has a state-description synonym in L″—except for s₁. The synonym of s₁ is a disjunction of two state descriptions of L″. Since they compose the synonym of s₁, and s₁ is a disjunct of h₁, both of these state descriptions will be disjuncts of the L″ synonym of h₁. So while there was only one h₁-sd in L, there will be two h₁″-sds in L″. Thus the disjunct-type counts in L″ will satisfy our earlier equalities (with primes taking the place of stars). That means we can construct L* from L″ with a one-to-one mapping between state descriptions, cleverly assigning meanings to state descriptions of L* as described in Section 5. L* will be adequate and faithful, and the permutation π that exchanges F with G while leaving the B-predicates intact will effect exactly the mappings described in Table 4. So our proof will go through as before.
Let’s take a step back and assess. The Proportional Theory and theories like it work by counting the number of disjuncts in hypotheses’ disjunctive normal forms and the number of disjuncts those hypotheses share with the evidence. Yet these count facts are artifacts of the language in which we choose to express our evidence and hypotheses. In this section I have worked through a particular example in which “explode and gather” takes us from one faithful, adequate language to another faithful, adequate language in which the disjunct counts of sentences expressing the same propositions have changed. No matter what logically independent evidence and hypotheses we are presented with, the explode and gather technique can be applied (multiple times if necessary) to bring disjunct counts into the balance we need to apply the proof maneuver from Section 5. Accurate theories of evidential favoring yield the same favoring judgments across all faithful, adequate languages, and if they are formal they do so while treating predicate permutations identically. But given logically independent relata we will always be able to construct an L* in which the hypotheses are predicate permutation variants, so that a formal theory will detect no favoring between them. This means that a formal theory will never find that a body of evidence differentially favors logically independent hypotheses. So if it can be captured by a formal theory, evidential favoring is not substantive.
One final note about the proof: One might wonder why h₁, h₂, and e have to be logically independent for our argument to go through. The “explode and gather” technique increases the number of h₁-sds (say) by “splitting” an existing h₁-sd into two. This works only if there is an h₁-sd to begin with. If h₁, h₂, and e were arranged so that there were some h₂-sds but no h₁-sds, we would not be able to split h₁-sds to make their number equal the h₂-sds’. Logical independence of h₁, h₂, and e guarantees that there is at least one state description in L of each type, avoiding situations that would derail the proof. But once we have seen that proof, it is clear that the logical independence requirement could be relaxed. For example, we could run the argument for an h₁, h₂, and e with no h₁-sds as long as there were no h₂-sds either. More generally, the relata we work with have to make some ϕ-sds available, and if they make one sd-type on a row of Table 4 available they have to make any other types on that row available as well. This means that we could, for instance, accommodate positions on which evidential favoring only ever occurs between mutually exclusive hypotheses.²² In a faithful language two mutually exclusive hypotheses h₁ and h₂ have no h₁h₂-sds and no eh₁h₂-sds. But eliminating those rows from Table 4 would not disrupt our mapping scheme. So while I will continue to assume logical independence among the relata for simplicity’s sake, it is significant that this requirement could be relaxed.

7. The Favoring Relation Itself

Even if the logical independence requirement were relaxed, our general result would still have shortcomings. For example, it applies only to evidence and hypotheses adequately expressible in a first-order language with no quantifiers or identity symbol. In [4, Appendix B] I argue that this is not as serious a limitation as it might seem. Certainly most of the evidential favoring on which we rely in daily life, scientific research, etc. involves more complicated logical structures than this. But notice that all we need to get our proof going is one instance of logically independent evidence and hypotheses that are first-order expressible. Moreover, these syntactical limitations apply only to the evidence and hypotheses, not to the processes used by theories that detect evidential favoring. Even if an entropy calculation or something more mathematically complex is used to determine the extension of f, our result will apply as long as at least one triple in that extension consists of sentences as simple as “This emerald is green” and “The next one will be too.”
Our result may also seem aimed at a project most philosophers have abandoned; very few people still maintain that evidential favoring can be ferreted out by formal means. So let’s stop thinking about formal theories of favoring, and start thinking about the evidential favoring relation itself. Most of the conditions we imposed in Section 4 concerned that relation directly; the only one motivated by the prospect of formal theorizing was identical treatment of predicate permutations. But the identical treatment of predicate permutations is significant even when we set formal theories aside. If f fails to treat predicate permutations identically, there exist hypotheses and evidence in adequate, faithful languages whose favoring relations disappear when their predicates are swapped around. Since predicates represent properties, a favoring relation that treats predicate permutations non-identically behaves differently towards propositions containing some properties than it does towards otherwise-identical propositions containing different properties. A favoring relation that does not treat predicate permutations identically plays favorites among properties. Our general result says that if evidential favoring is antisymmetric and substantive, it must play favorites.
The standard lesson drawn from language dependence results such as Goodman’s “grue” problem is that evidential favoring privileges some properties over others. Goodman himself [6, Lecture IV] tried to distinguish “projectible” properties from nonprojectible ones; later philosophers such as Quine [17] attempted to identify “natural properties” that play a special role in evidential favoring. If there are such special properties, favoring theories need not yield consistent results across faithful, adequate languages; such theories (formal or not) may restrict their attention to languages whose predicates represent the privileged properties. If greenness, for example, is a natural property, then Carnap’s c* function should be applied only to language L_G—not L_H or L_S—and the intuitive confirmation result of Equation (4) stands.
This standard response leaves an epistemic problem: How can agents determine which properties are natural? The obvious response is that natural properties are revealed by evidence from the natural world. But here the generality of our result kicks in. Imagine some multi-stage process agents could apply: first, the process would use an agent’s empirical evidence to determine a list of natural properties. Next, the process would sort languages whose predicates represented natural properties from those whose predicates did not. Finally, the process would work within the set of preferred languages to determine which hypotheses were favored over others by the agent’s evidence. Now consider a relation np that captures the net effect of this entire process: np(h₁, h₂, e) holds just in case the natural property list generated by the evidence represented by e yields a favoring relation on which e favors the hypothesis represented by h₁ over the hypothesis represented by h₂.
Now np is just a relation, so the process that takes agents from the inputted e to an ultimate favoring judgment between h₁ and h₂ need not proceed in the sequence just described. The key point is that however it is generated, np should be antisymmetric and had better be substantive, so that favoring relations may obtain among hypotheses neither entailed nor refuted by the evidence. And so (substituting np for f) our general result applies: np will fail to treat predicate permutations identically. That is, antecedent to the introduction of a particular body of evidence e, np will already prefer some properties over others. Anyone who tries to work out the natural properties from a body of empirical evidence will need a preferred property list before that evidence is even consulted.
Apparently substantive favoring among hypotheses arises from something more than just evidence; it requires an extra-evidential element to select among properties. As far as the evidence and hypotheses themselves are concerned—considered alone as propositions, without any further information or influences—any two logically independent hypotheses are related to a given body of evidence in the same way. The appearance of asymmetry among these propositions, the sense that the evidence has more in common with one hypothesis or favors it over the other, is an artifact of the language in which the propositions are expressed. And the propositions themselves cannot tell you which language is best.²³
This leaves a few options in the theory of evidential favoring. First, one can be an externalist: One can insist that there are facts in the world about which properties are natural, or projectible—it is just that those facts cannot be discerned by agents from their evidence. While there are favoring relations among hypotheses and evidence, agents cannot ever know that they have got those relations right. [4] presents arguments against this option; I will simply say here that I find favoring facts inaccessible to agents highly unattractive. Second, one can hold that the preferred property list need not be discerned from empirical evidence because it can be determined a priori. Besides requiring a very strong view of our a priori faculties, this response contradicts the positive theories philosophers have offered of what makes projectible properties projectible. The standard theory (from [18], among others) is that projectible properties recur regularly because they play a particular role in the natural laws of the universe, but such laws must be gleaned from empirical data.²⁴
The remaining option maintains that the element privileging some properties over others comes from agents, and so is accessible to them and allows them to determine when favoring holds. This “subjectivist” option denies that there is a three-place evidential favoring relation (among two hypotheses and a body of evidence) at all; it relativizes favoring to a fourth relatum that is a feature of subjects. Perhaps agents grow up speaking a language whose predicates express certain properties; perhaps agents evolved to think using certain categories; perhaps for some reason agents have a prior disposition to project some properties more readily than others. Wherever their preferred property lists come from, subjects with different lists may not be able to adjudicate disputes between those lists (and the favoring relations that attend them) by citing evidence or a priori considerations. While this approach avoids the problems of the other options—it permits substantive evidential favoring, posits no unrealistic a priori faculties, and allows agents access to favoring facts—it may require us to radically rethink what we are doing when we ask which of two hypotheses is favored by our evidence.

Acknowledgements

I am grateful to Hayley Clatterbuck, Casey Helgeson, Marlos Viana, Paul Bartha, and Brandon Rdzak for discussion of earlier drafts of this article.

References

  1. Hempel, C.G. Studies in the logic of confirmation (I). Mind 1945, 54, 1–26. [Google Scholar] [CrossRef]
  2. Hempel, C.G. Studies in the logic of confirmation (II). Mind 1945, 54, 97–121. [Google Scholar] [CrossRef]
  3. Carnap, R. Logical Foundations of Probability; University of Chicago Press: Chicago, IL, USA, 1950. [Google Scholar]
  4. Titelbaum, M.G. Not enough there there: Evidence, reasons, and language independence. Philos. Perspect. 2010, 24, 477–528. [Google Scholar] [CrossRef]
  5. Kolmogorov, A.N. Foundations of the Theory of Probability; Chelsea Publishing Company: New York, NY, USA, 1950; Translated by Nathan Morrison. [Google Scholar]
  6. Goodman, N. Fact, Fiction, and Forecast; Harvard University Press: Cambridge, MA, USA, 1979. [Google Scholar]
  7. Jaynes, E.T. Information theory and statistical mechanics I. Phys. Rev. 1957, 106, 620–630. [Google Scholar] [CrossRef]
  8. Jaynes, E.T. Information theory and statistical mechanics II. Phys. Rev. 1957, 108, 171–190. [Google Scholar] [CrossRef]
  9. Maher, P. Probabilities for two properties. Erkenntnis 2000, 52, 63–91. [Google Scholar] [CrossRef]
  10. Maher, P. Probabilities for multiple properties: The models of Hesse and Carnap and Kemeny. Erkenntnis 2001, 55, 183–216. [Google Scholar] [CrossRef]
  11. Seidenfeld, T. Entropy and uncertainty. Philos. Sci. 1986, 53, 467–491. [Google Scholar] [CrossRef]
  12. Glymour, C. Theory and Evidence; Princeton University Press: Princeton, NJ, USA, 1980. [Google Scholar]
  13. Hume, D. A Treatise of Human Nature, 2nd ed.; Oxford University Press: Oxford, UK, 1978. [Google Scholar]
  14. Popper, K.R. The Logic of Scientific Discovery; Science Editions, Inc.: New York, NY, USA, 1961. [Google Scholar]
  15. Miller, D. Out of Error: Further Essays on Critical Rationalism; Ashgate: Burlington, VT, USA, 2005. [Google Scholar]
  16. Chandler, J. Contrastive Confirmation: Some Competing Accounts. Forthcoming in Synthese. [CrossRef]
  17. Quine, W. Natural kinds. In Ontological Relativity, and Other Essays; Columbia University Press: New York, NY, USA, 1969; pp. 114–138. [Google Scholar]
  18. Lewis, D. New work for a theory of universals. Australas. J. Philos. 1983, 61, 343–377. [Google Scholar] [CrossRef]
1
This is Carnap’s “firmness” explication of confirmation, as opposed to the “increase in firmness” explication he distinguishes from it in the preface to the second edition of [3]. We will stick with the firmness explication because it is easier to work with, but my criticisms below apply equally well to the increase in firmness explication.
2
To prevent logical redundancy in a language’s set of state descriptions, we assume that no literal appears more than once in a state description and that the literals appear in alphabetical order. So Ga & Gb is a state description of L_G, but Ga & Ga & Gb and Gb & Ga are not.
3
On a probabilistic approach like Carnap’s, an evidence set is represented by a conjunction each of whose conjuncts represents one of the set’s members. It makes no difference to the probability calculations if an extra, tautologous conjunct is added to this evidential conjunction. So we can imagine that if we removed empirical facts from an evidence set one at a time, in the end our “evidential conjunction” would be just a tautology.
4
Marlos Viana points out to me (in correspondence) that the transition from distributions over state descriptions to distributions over structure descriptions is familiar from Bose–Einstein, Fermi–Dirac, and Maxwell–Boltzmann statistics.
5. c† can also retrieve the favoring judgment represented in Equation (2)—in fact, any c-function derived from a regular, probabilistic m-function will retrieve that judgment.
6. This is a metalinguistic statement about how the truth-values of propositions represented by atomic L_H sentences relate to the truth-values of propositions represented by atomic L_G sentences. None of our object languages contain both G and H as predicates, nor do any of our languages represent the identity relation.
7. Since s1 through s4 are names we have given to state descriptions of L_G—and not sentences appearing in L_G—the disjunctive normal forms written here for h1, h2, and e are metalinguistic indications of what those disjunctive normal forms actually look like in L_G.
8. See Maher [10] for discussion of arguments by analogy and Carnap’s later views, as well as favoring theories by Hesse and others. Maher freely admits the language dependence of his own systems.
9. For example, [11] develops language dependence problems for Jaynes’s maximum entropy approach.
10. The full definition of faithfulness also ensures that if we think of propositions as sets of worlds, a faithful language will have ∼x represent the complement of the proposition represented by x, x & y represent the intersection of the propositions represented by x and y, and x ∨ y represent the union of those propositions.
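A toy possible-worlds model can make the set-theoretic reading concrete. In the sketch below (Python), the eight-world space and the particular propositions assigned to x and y are hypothetical illustrations, not anything fixed by the article:

```python
# Toy possible-worlds model of the faithfulness conditions: negation
# as complement, conjunction as intersection, disjunction as union.
WORLDS = frozenset(range(8))      # a hypothetical eight-world space

x = frozenset({0, 1, 2, 3})       # proposition expressed by x
y = frozenset({2, 3, 4, 5})       # proposition expressed by y

neg_x = WORLDS - x                # proposition expressed by ~x
x_and_y = x & y                   # proposition expressed by x & y
x_or_y = x | y                    # proposition expressed by x v y

assert x_and_y == frozenset({2, 3})
assert x_or_y == frozenset({0, 1, 2, 3, 4, 5})
assert neg_x | x == WORLDS        # x and its negation exhaust the worlds
```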
11. This idea is familiar from the hypothetico-deductivist theory of confirmation; see [12] for discussion.
12. Evidential favoring will not be substantive if evidence can only discriminate among hypotheses by logically ruling some out, as suggested by the falsificationism of [14]. Miller [15] is moved to falsificationism in part by language dependence issues related to those discussed in this article. Nevertheless, falsificationism remains a minority view of evidential favoring.
13. Again, these are metalinguistic indications of what the disjunctive normal forms look like, not actual object-language expressions of the evidence and hypotheses.
14. Because L is faithful, its state descriptions express a set of mutually exclusive, exhaustive propositions. Since the state descriptions of L′ express these same propositions, L′ will wind up faithful as well.
15. Technically, the order of some conjuncts in the disjunctive normal form of e may be changed. But here and elsewhere in the argument that follows, I treat a sentence of a faithful language as interchangeable with other sentences in that language logically equivalent to it. If we think of propositions as sets of possible worlds, then any two logically equivalent sentences in a faithful language express the same proposition. Since f relates the propositions expressed by sentences, any two equivalent sentences in a faithful language will enter into f-relations in exactly the same ways.
16. One might wonder what the predicates of L′ mean—for instance, what property of the a-tuple is represented by the predicate G′? We can construct the meaning of G′ from the meanings of the (presumably well-understood) predicates of L. Each state description of L says something about a, and there are tuples in the world that make that state description true when referred to by a. Each state description of L′ is a synonym of an L state description; an L′ state description says the same thing as its L counterpart. G′a is equivalent to a disjunction of state descriptions of L′, so we can determine which tuples make G′a true when referred to by a. G′ expresses the property of belonging to the set containing just those tuples.
17. Depending on the size of L and the particular constitution of h1, h2, and e, this assignment strategy may require L′ to have more predicates, and thus more state descriptions, than L. The relevant calculations and a general recipe for making this work are described in [4, Appendix A].
18. Quick proof: If we let #h1 represent the number of h1-sds and so forth, the proportion of its state descriptions that h1 shares with e is (#h1e + #h1h2e)/(#h1 + #h1h2 + #h1e + #h1h2e), and the proportion of its state descriptions that h2 shares with e is (#h2e + #h1h2e)/(#h2 + #h1h2 + #h2e + #h1h2e). If the unstarred versions of the two equalities are met, these proportions are equal. So if the unstarred equalities are both met, the Proportional Theory indicates no favoring. Contraposing, if the Proportional Theory indicates a favoring relation, at least one of the unstarred equalities is violated.
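A quick numerical check of this proof is easy to run. In the sketch below (Python), the state-description counts are hypothetical values chosen only to satisfy the unstarred equalities, #h1 = #h2 and #h1e = #h2e:

```python
from fractions import Fraction

def shared_proportion(n_h, n_h1h2, n_he, n_h1h2e):
    """Proportion of a hypothesis's state descriptions that it shares
    with e, per the footnote's formula."""
    return Fraction(n_he + n_h1h2e, n_h + n_h1h2 + n_he + n_h1h2e)

# Hypothetical counts with #h1 = #h2 = 5 and #h1e = #h2e = 3:
n_h1, n_h2, n_h1h2 = 5, 5, 2
n_h1e, n_h2e, n_h1h2e = 3, 3, 1

p1 = shared_proportion(n_h1, n_h1h2, n_h1e, n_h1h2e)
p2 = shared_proportion(n_h2, n_h1h2, n_h2e, n_h1h2e)
assert p1 == p2 == Fraction(4, 11)  # equal proportions: no favoring
```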
19. In fact, D and E must be chosen so that it is logically possible for the tuple represented by a to have any combination of the properties represented by D, E, and the predicates of L. This is required so that the state descriptions of the language containing them will express a set of mutually exclusive, exhaustive propositions and that language can be faithful. The possibility of finding properties of the tuple expressible by D and E that are independent in the relevant sense of each other and of the properties represented in L is guaranteed by an assumption I call the Availability of Independent Properties. For a defense of that assumption see [4, p. 502].
20. What about the sentences in L′ that are synonyms of ϕ-sds? For our particular h1, h2, and e, L has 32 state descriptions, 24 of which are ϕ-sds. Each ϕ-sd has a synonym that is a disjunction of L′ state descriptions. L′ has one more predicate than L, so it has 64 state descriptions. We have already assigned synonyms to 9 of them, leaving 55. Of the 24 ϕ-sds in L, split 17 into disjunctions of 2 state descriptions in L′, and split the other 7 into disjunctions of 3 state descriptions in L′. This will assign meanings to the remaining 55 state descriptions of L′ and ensure that each ϕ-sd has a synonym in L′. (For a generalization of the math involved here, including how big L and L′ will typically have to be, see [4, Appendix A], especially notes 68 through 70.)
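The bookkeeping here balances exactly: the 9 state descriptions of L′ already assigned, plus those consumed by the two- and three-way splits, exhaust all 64:

$$9 + (17 \times 2) + (7 \times 3) = 9 + 34 + 21 = 64.$$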
21. A bit of thought will also reveal that because the state descriptions of L express a mutually exclusive, exhaustive set of propositions, the state descriptions of L′ do so as well. This makes L′ faithful.
22. For one such position, see [16].
23. Notice how much stronger this result is than standard “underdetermination of theory by evidence” arguments. Underdetermination of theory by evidence typically argues that while some e may favor h1 over h2, we can always manufacture an h3 that is just as well supported by e as h1. The present result shows that even if e, h1, and h2 are selected for us, there will be no favoring of h1 over h2 by e unless some logical entailment holds among them. It is not just that an outlandish theory can be made up that accounts for the data; it is that no theory under serious consideration accounts for the data better than any of the others.
24. In correspondence Paul Bartha asks me to consider whether the set of languages to which a favoring theory should apply can be restricted further than I have suggested on an a priori basis. His example is a language whose single constant represents a tuple of 10 emeralds and whose three predicates represent the properties “tuple elements 1 through 6 are green,” “tuple elements 7 through 9 are green,” and “tuple element 10 is green.” Bartha suggests that this language can be ruled out (and in particular that a formal favoring theory need not treat permutations of its predicates identically) on the grounds that some predicates talk about more objects than others, or alternatively that some atomic sentences convey more information than others.
My response is that the proper individuation of objects for purposes of projection should be determined by our empirical evidence. For example, if we are making predictions about emeralds, is evidence more significant if it is about more emeralds, or if the emeralds it talks about have a greater total mass? (What if emeralds 7 through 9 in Bartha’s example together weigh more than emeralds 1 through 6?) Put another way, should a scientific language for predicting features of emeralds have constants that pick out individual emeralds or individual 1-gram chunks of emerald mass? I do not see how this kind of question could be answered a priori, but if we try to answer it using empirical evidence and then develop our evidential favoring relation from languages that individuate objects properly, we will wind up in the same kind of circle as we saw with n_p. Once more something beyond our evidence will have to play a decisive role.
Table 1. m†- and m*-values for language L_G.

Name | State description | m† | m*
s1 | Ga & Gb | 1/4 | 1/3
s2 | Ga & ∼Gb | 1/4 | 1/6
s3 | ∼Ga & Gb | 1/4 | 1/6
s4 | ∼Ga & ∼Gb | 1/4 | 1/3
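As a check on these values, both measures can be computed mechanically. The sketch below (Python; the data representation is mine, and the labels m† and m* follow the table) assigns m† uniformly over state descriptions, and assigns m* by dividing probability equally among structure descriptions and then equally among the state descriptions within each:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

predicates, constants = ["G"], ["a", "b"]

# A state description is a tuple of truth values, one per atom.
atoms = [(p, c) for c in constants for p in predicates]
sds = list(product([True, False], repeat=len(atoms)))

def structure(sd):
    # Structure descriptions identify state descriptions that differ
    # only by a permutation of constants: for monadic predicates this
    # is the sorted multiset of per-constant predicate profiles.
    assignment = dict(zip(atoms, sd))
    profiles = [tuple(assignment[(p, c)] for p in predicates)
                for c in constants]
    return tuple(sorted(profiles))

# m†: equal weight to every state description.
m_dagger = [Fraction(1, len(sds))] * len(sds)

# m*: equal weight to every structure description, split equally
# among the state descriptions it contains.
sizes = Counter(structure(sd) for sd in sds)
m_star = [Fraction(1, len(sizes)) / sizes[structure(sd)] for sd in sds]

print(m_dagger)  # [1/4, 1/4, 1/4, 1/4]
print(m_star)    # [1/3, 1/6, 1/6, 1/3]
```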
Table 2. m*-values for language L_H.

State description | m* | Expresses same proposition as
Ha & Hb | 1/3 | s2
Ha & ∼Hb | 1/6 | s1
∼Ha & Hb | 1/6 | s4
∼Ha & ∼Hb | 1/3 | s3
Table 3. m*-values for language L_S.

State description | m* | Expresses same proposition as
G1o & So | 1/4 | s1
G1o & ∼So | 1/4 | s2
∼G1o & So | 1/4 | s4
∼G1o & ∼So | 1/4 | s3
Table 4. In L′, π maps…

(i) each h1-sd to an h2-sd, and vice versa;
(ii) each e-sd to itself;
(iii) each h1h2-sd to itself;
(iv) each eh1-sd to an eh2-sd, and vice versa;
(v) each eh1h2-sd to itself; and
(vi) each ϕ-sd to a ϕ-sd.
Table 5. L′ synonyms for some L state descriptions.

(i) s1: F G̅ B1 B2 B3;  s2: F̅ G B1 B2 B3
(ii) s3: F G B1 B2 B̅3
(iii) s4: F G B1 B̅2 B3
(iv) s5: F G̅ B1 B̅2 B̅3;  s6: F̅ G B1 B̅2 B̅3
(v) s7: F G B̅1 B2 B3
