Quod facile est in re, id probabile est in mente.
—Gottfried Wilhelm Leibniz [1]
1. Consistency versus Non-Contradictoriness
As is widely recognized, paraconsistency is the investigation of logic systems endowed with a negation ¬ such that not every contradiction of the form p and ¬p entails everything. In other terms, a paraconsistent logic does not suffer from trivialism, in the sense that a contradiction does not necessarily explode or trivialize the deductive machinery of the system. In strict terms, even an irrelevant contradiction in traditional logic obliges a reasoner who follows such a logic to derive anything from a pair of contradictory statements α, ¬α, as a result of the so-called Principle of Explosion (PEx):
α, ¬α ⊢ β, for every β.
On the other hand, a paraconsistent logician, by using a more cautious way of reasoning, is free of the burden of PEx and could pause to investigate the causes for the contradiction, instead of injudiciously deriving unwanted consequences from it.
Common sense, however, recognizes that some contradictions may be intolerable, and those would destroy the very act of reasoning (that is, lead to trivialization). This amounts to admitting that not all contradictions are equivalent. The Logics of Formal Inconsistency (LFIs), a family of paraconsistent logics designed to express the notion of consistency (and inconsistency as well) within the object language by employing a connective ∘ (where ∘α reads as “α is consistent”), realize such an intuition.
As defended in [2], LFIs can be regarded as theories of logical consequence, epistemic in character, that tell us how to make sensible inferences in the presence of contradictions. From the purely mathematical viewpoint, LFIs are subsystems of classical logic, although they extend classical logic in the sense that classicality may be recovered in the presence of consistency: contradictions involving consistent sentences will lead to explosive triviality. Consistency in the LFIs is not regarded as synonymous with freedom from contradiction (as happens with the traditional notion of the consistency of a theory T, where consistency means that there is no sentence α such that T ⊢ α and T ⊢ ¬α, where ⊢ is a specified consequence relation in the language of T). The usual notion of consistency, totally depending on negation, is perhaps sufficient for certain mathematical purposes, but not for the whole enterprise of reasoning, as argued in [3]. In the LFIs, however, the notions of consistency and non-contradiction are not coincident, nor are the notions of inconsistency and contradiction the same. For more details, conceptual motivations and the main results about LFIs, the reader is referred to [4].
The distinguishing feature of the LFIs is that the principle of explosion is not valid in general, although this law is not abolished, but restricted to consistent sentences. Therefore, a contradictory theory may be non-trivial unless the contradiction refers to something consistent.
Such features of the LFIs are condensed in the following law, which is referred to as the “principle of Gentle Explosion” (PGE):
∘α, α, ¬α ⊢ β, for every β, although α, ¬α ⊬ β in general.
More than three decades ago, J. N. Williams [6] argued that it is a mistake to suppose that inconsistency is the same as contradiction. The LFIs fully formalize this intuition, and starting from this perspective, it is possible to build a number of logical systems with different assumptions that not only encode classical reasoning, but also (at the price of adding new principles) converge to classical logic. We have chosen a particular logic endowed with adequate principles to deal with our paraconsistent probability measures, without obscuring the fact that several other logics would give rise to specific (weaker or stronger) notions of paraconsistent probability measures.
2. Ci, a Logic of Formal Inconsistency
The system Ci is a member of the hierarchy of the LFIs with some features that make it reasonably close to classical logic; it is thus appropriate for defining a generalized notion of probability strong enough to enjoy useful properties. Consider the following stock of propositional axioms and rules:
Let Σ be a propositional signature closed under the unary connectives ¬ and ∘ and under the binary connectives ∧, ∨ and →. The logic Ci (over Σ) is defined by the following Hilbert calculus:
Modus Ponens (MP): from α and α → β, infer β.
As investigated in detail in [7], Ci can be extended to the first-order logic QCi (over a convenient extension of Σ) by adding appropriate axioms and rules.
It is worth noting that the positive axioms of Ci (those involving neither ¬ nor ∘), together with Modus Ponens (MP), define a Hilbert calculus for positive propositional classical logic (see [4]), and therefore, all of the laws concerning positive logic (such as the distribution of ∧ over ∨) are valid.
It is instructive to show, as an example, that the distribution of conjunction over disjunction holds just as in classical logic (not a surprise, since positive classical logic is incorporated into our paraconsistent logic). However, as the validity of such laws may raise some doubts, we provide a quick proof of them. The symbol ⊢_Ci stands for the derivability relation of Ci (when there is no danger of confusion, we shall drop the index in ⊢_Ci in order to simplify notation).
Theorem 1. Distributing conjunctions and disjunctions:
(1) α ∧ (β ∨ γ) ⊣⊢ (α ∧ β) ∨ (α ∧ γ)
(2) α ∨ (β ∧ γ) ⊣⊢ (α ∨ β) ∧ (α ∨ γ)
We just prove the first item (the second item is analogous). From right to left, axioms
, easily show the intended implication. From left to right, just notice that:
As proven in [4], the logic Ci cannot be semantically characterized by finite matrices, but it can be characterized in terms of valuations over {0, 1} (also called bivaluations):
Definition 2. A function v from the collection of sentences of Ci into {0, 1} is a valuation for Ci if it satisfies the following clauses:
The semantical consequence relation ⊨_Ci w.r.t. bivaluations for Ci is defined as expected: Γ ⊨_Ci φ iff, for every valuation v, if v(γ) = 1 for every γ ∈ Γ, then v(φ) = 1.
Theorem 2. For every set of sentences Γ ∪ {φ}, Γ ⊢_Ci φ iff Γ ⊨_Ci φ.
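The contrast between PEx and PGE can be illustrated semantically with a small brute-force check. The following is a minimal sketch, not the paper's own procedure: it enumerates 0/1 assignments to just four formulas (p, ¬p, ∘p and an unrelated q) and keeps those respecting two constraints that are assumed here from the usual bivaluation clauses of the LFI literature, namely that v(p) = 0 forces v(¬p) = 1 and that v(∘p) = 1 forbids v(p) = v(¬p) = 1. The names `admissible`, `valuations` and `entails` are hypothetical helpers.

```python
from itertools import product

# The four formulas tracked by this toy check: p, its negation,
# its consistency statement, and an unrelated sentence q.
FORMULAS = ("p", "not_p", "circ_p", "q")


def admissible(v):
    """Keep only assignments that respect the two assumed clauses."""
    # Assumed clause: if v(p) = 0 then v(not p) = 1.
    if v["p"] == 0 and v["not_p"] == 0:
        return False
    # Assumed clause: if v(circ p) = 1 then p and not p are not both true.
    if v["circ_p"] == 1 and v["p"] == 1 and v["not_p"] == 1:
        return False
    return True


def valuations():
    """Yield every admissible 0/1 assignment to the four formulas."""
    for bits in product((0, 1), repeat=len(FORMULAS)):
        v = dict(zip(FORMULAS, bits))
        if admissible(v):
            yield v


def entails(premises, conclusion):
    """True when every admissible assignment satisfying the premises satisfies the conclusion."""
    return all(
        v[conclusion] == 1
        for v in valuations()
        if all(v[premise] == 1 for premise in premises)
    )


explosion_holds = entails(["p", "not_p"], "q")
gentle_explosion_holds = entails(["circ_p", "p", "not_p"], "q")
print("p, not p entail q:", explosion_holds)
print("circ p, p, not p entail q:", gentle_explosion_holds)
```

In this toy setting the second entailment holds vacuously, because no admissible assignment makes ∘p, p and ¬p true together, which is the semantic counterpart of restricting explosion to consistent sentences.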
As shown in [4], a strong (classical) negation can be defined in Ci as ∼_β α := α → ⊥_β, where ⊥_β := β ∧ ¬β ∧ ∘β is a bottom formula (that is: ⊥_β ⊢ ψ for every ψ) for any sentence β. In order to simplify matters, a privileged β will be chosen, and the subscript will be omitted in ∼_β from now on.
Theorem 3. Properties of strong negation:
The strong negation ∼ satisfies the following properties in Ci :
for every α and ψ
If and , then
, and so,
Detailed proofs can be found in [4].
Additionally, several metatheorems, such as the deduction metatheorem, can be proven in Ci.
Now, by defining ‘α is inconsistent’ as •α := ¬∘α, the double-negation axioms (which permit adding and eliminating double negations) convey the meaning that ‘α is not inconsistent if and only if it is consistent’. Of course, by the very definition and the same axioms on double negations, it also holds that ‘α is not consistent if and only if it is inconsistent’.
Some other relevant results in Ci hold as follows.
Theorem 4. Properties of consistency:
, but the converse does not hold
, but the converse does not hold
Again, detailed proofs can be found in [4].
Theorem 5. The following are bottom particles in Ci :
, for any β
, for any β
, for any β
, for any β
We refer the reader once again to [4] for the proofs. ☐
3. Consistency, Inconsistency and Paraconsistent Probability
As hinted in the previous section (see [2] for details), the formal notion of consistency considered here does not necessarily depend on negation. Indeed, the logical machinery of the LFIs shows that consistency may be conceived of as a primitive concept, whose meaning can be thought of as “conclusively established as true (or false)” by extra-logical means, depending on the subject matter. Consistency, in this sense, is a notion independent of model-theoretical and proof-theoretical methods and is closer to the idea of regularity, or something contrary to change (an elaboration on this view is offered in [8]).
However, consistency is also connected to complying with the laws of traditional probability, as put by F. Ramsey (see [9]), who argues that degrees of belief should satisfy the probability axioms and maintains that this is connected to a notion of consistency or coherence. In this way, the notion of consistency (at least for degrees of belief) can be regarded as the satisfaction of the probability axioms.
This paper aims to investigate deeper relationships between logic and probability, emphasizing a new way to define a paraconsistent theory of probability. A previous approach has been developed in [10], where variations of paraconsistent Bayesianism based on a four-valued paraconsistent logic are discussed. A still earlier attempt has been briefly sketched in [11]. An entirely different view is taken in [12], where probabilistic semantics is given for a couple of many-valued paraconsistent logics. The connections between non-classical logics and probability are not limited to paraconsistent logic; see, e.g., [13] for the case of infinite-valued logic and in particular [14] for a broader and more philosophical perspective.
Probability functions are usually defined for a σ-algebra of subsets of a given universe set Ω, but it is also natural to define probability functions directly for sentences in the object language. We will refer to them, respectively, as probability on sets versus probability on sentences (see the discussion in Section 5).
Although these two approaches are equivalent in classical logic in view of the representation theorems for Boolean algebras, this is not so for probability based on other logics, since the algebraic kinship may be lost for non-classical logics or be much less immediate. Furthermore, in algebraic terms, probability functions in set-theoretical settings are required to satisfy countable additivity, but since propositional language is compact, for probability on sentences, it suffices to require finite additivity.
Our first definition of paraconsistent probability will be directly concerned with probability on sentences, with the primary intention to emphasize the role of a new, more cautious logic behind the probabilistic reasoning, and the effects of logical machinery on a corresponding version of Bayes’ rule. In Section 5, however, a notion of paraconsistent probability spaces (concerning probability on sets) will be presented and discussed.
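Definition 3. A probability function for the language of a logic L, or an L-probability function, is a function P from the sentences of L into the real numbers satisfying the following conditions, where ⊢_L stands for the syntactic derivability relation of L:
Non-negativity: 0 ≤ P(φ) ≤ 1 for all sentences φ
Tautologicity: If ⊢_L φ, then P(φ) = 1
Anti-tautologicity: If φ ⊢_L, then P(φ) = 0
Comparison: If ψ ⊢_L φ, then P(ψ) ≤ P(φ)
Finite additivity: P(φ ∨ ψ) = P(φ) + P(ψ) − P(φ ∧ ψ)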
It should be remarked that the same meta-axioms, taking ⊢ appropriately as the classical, intuitionistic or paraconsistent derivability relation (in the present case, ⊢_Ci), define, respectively, the classical, the intuitionistic and the paraconsistent probability measures. The intuitionistic case is treated in [15]. This is clear evidence that the concept of probability can be regarded as entirely logic dependent and that the choice of the underlying logic is a matter of interest and convenience. However, once a choice is made, the consequences are radically different, as we intend to illustrate below.
Two events α and β are said to be independent if P(α ∧ β) = P(α) · P(β). Two events can be independent concerning one probability measure and dependent concerning another.
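That independence is relative to the measure can be seen in a small classical illustration. The sketch below is an assumption-laden toy, not part of the formal development: it uses four classical worlds for two events a and b, takes independence to mean P(a ∧ b) = P(a) · P(b), and compares a uniform measure with a hypothetical correlated one. The names `prob` and `independent` are illustrative.

```python
import math
from itertools import product

# Classical worlds as pairs (a, b), in the order (0,0), (0,1), (1,0), (1,1).
WORLDS = list(product((0, 1), repeat=2))


def prob(event, weights):
    """Probability of an event (a predicate on worlds) under the given world weights."""
    return sum(w for world, w in zip(WORLDS, weights) if event(world))


def independent(weights):
    """Check P(a and b) = P(a) * P(b) for the measure given by the weights."""
    p_a = prob(lambda world: world[0] == 1, weights)
    p_b = prob(lambda world: world[1] == 1, weights)
    p_ab = prob(lambda world: world[0] == 1 and world[1] == 1, weights)
    return math.isclose(p_ab, p_a * p_b)


uniform = [0.25, 0.25, 0.25, 0.25]
correlated = [0.4, 0.1, 0.1, 0.4]  # hypothetical measure favouring agreement of a and b

print("independent under uniform measure:", independent(uniform))
print("independent under correlated measure:", independent(correlated))
```

Both measures give P(a) = P(b) = 0.5, so the difference lies only in how the mass is spread over the joint outcomes.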
Some immediate consequences of such axioms are the following:
Theorem 6. Regularity of L-probability measures.
(1) If φ is a bottom particle in L, then P(φ) = 0.
(2) If φ and ψ are logically equivalent in L in the sense that φ ⊢_L ψ and ψ ⊢_L φ, then P(φ) = P(ψ).
(1) Immediate, in view of the axioms of anti-tautologicity and comparison above; (2) Immediate, in view of the axiom of comparison above. ☐
As a consequence of Theorem 6, , , and , for any probability function P.
Two sentences α and β are said to be logically incompatible if α, β ⊢ φ, for any φ (or equivalently, if α ∧ β acts as a bottom particle). Some simple calculation rules follow:
Theorem 7. Calculation rules of Ci-probability measures.
Let P be a Ci-probability measure; then:
(1) P(α ∨ β) = P(α) + P(β), if α and β are logically incompatible.
Only Items (1) and (2) will be proven (the rest is routine): (1): Since α and β are logically incompatible, α ∧ β acts as a bottom particle, and the result is immediate by Theorem 6 and finite additivity; (2): Use finite additivity in the sentences and . ☐
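Calculation rules of this kind can be checked numerically on a toy model. The following is a minimal sketch under explicit assumptions: for a single sentence a, only three patterns of values for (a, ¬a, ∘a) are taken to be admissible (a alone, ¬a alone, or both with ∘a false), and a probability function is built as a weighted mixture of these three "worlds". Under finite additivity in the form P(φ ∨ ψ) = P(φ) + P(ψ) − P(φ ∧ ψ), one would then expect P(a ∧ ¬a) = P(a) + P(¬a) − 1 and P(∘a) = 2 − P(a) − P(¬a); these identities are derived here for illustration rather than quoted. All names and weights are hypothetical.

```python
import math

# Assumption: these are the only admissible value patterns
# (v(a), v(not a), v(a is consistent)) for a single sentence a.
WORLDS = {
    "only_a": (1, 0, 1),
    "only_not_a": (0, 1, 1),
    "both": (1, 1, 0),
}


def make_probability(weights):
    """Build P(event) as a mixture of the three worlds; weights must sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")

    def probability(event):
        return sum(w for name, w in weights.items() if event(WORLDS[name]))

    return probability


def holds_a(world):
    return world[0] == 1


def holds_not_a(world):
    return world[1] == 1


def holds_consistent(world):
    return world[2] == 1


def holds_contradiction(world):
    return holds_a(world) and holds_not_a(world)


# Hypothetical weights chosen only for illustration.
P = make_probability({"only_a": 0.5, "only_not_a": 0.3, "both": 0.2})
p_a = P(holds_a)
p_not_a = P(holds_not_a)
p_contradiction = P(holds_contradiction)
p_consistent = P(holds_consistent)

# The two identities expected from finite additivity in this toy model.
rule_contradiction = math.isclose(p_contradiction, p_a + p_not_a - 1.0)
rule_consistency = math.isclose(p_consistent, 2.0 - p_a - p_not_a)
print(rule_contradiction, rule_consistency)
```

Note that P(a) + P(¬a) exceeds 1 exactly by the weight of the contradictory world, which is what distinguishes such a measure from a classical one.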
Probabilities are sometimes seen as generalized truth values. In this way, in the classical case, the so-called probabilistic semantics may be thought of as extending the bivaluations of classical propositional logic with the probability functions ranging on the real unit interval [0, 1]. The other way round, bivaluations can be regarded as degenerate probability functions; in this sense, classical logic is to be regarded as a special case of probability logic. We show below that an analogous property holds for Ci and for the above defined notion of paraconsistent probability.
Probabilistic semantics aims to interpret logic systems (viz., logic entailment) with no appeal to truth conditions. In this way, it differs from standard truth-valued semantics: probabilistic semantics is much more general, notwithstanding both being equivalent in the classical case. What happens is that each semantics expresses logical truth and logical consequence in its own way, and we show in this section that, quite surprisingly, the equivalence between these two distinct semantics also holds for the paraconsistent probability theory based on Ci.
Define ⊩_P as a probabilistic semantic relation whose meaning is: Γ ⊩_P φ if and only if, for every probability function P, if P(ψ) = 1 for every ψ ∈ Γ, then P(φ) = 1. It can be shown that Ci is (strongly) sound and complete with respect to such probabilistic semantics:
Theorem 8. Completeness of Ci with respect to probabilistic semantics:
Γ ⊢_Ci φ if and only if Γ ⊩_P φ.
The left-to-right direction follows directly from the axioms of probability, namely, tautologicity and comparison, plus the compactness property of Ci proofs. For the right-to-left direction, notice that if Γ ⊩_P φ, then in particular, this holds for the probability functions P such that P(ψ) ∈ {0, 1} for every sentence ψ. It suffices, then, by appealing to Theorem 2, to show that such mappings P satisfy all of the conditions for bivaluations of Definition 2:
If , since and by comparison (Definition 3) then .
On the other hand, if , since again by comparison , then , and the result follows by finite additivity, i.e., .
Analogous to the previous item, mutatis mutandis.
By finite additivity . Since , tautologicity implies , and thus, as , .
Now, suppose ; then, either or, if , then .
Reciprocally, suppose that either or . Since
if , then , hence .
If , since , it follows immediately .
Suppose that . If , then by Theorem 7, Item (3), it follows that , absurd.
Suppose that . If , then Theorem 7, Item (2) leads to being absurd.
This item follows directly from axioms and and tautologicity.
Suppose that . By Theorem 7, Item (5), . Hence, by the same Theorem 7, Item (2), and thus .
Probabilistic semantics may be seen as an alternative to truth-valued semantics, with the intention to explain semantic notions, such as truth and consequence, in terms of probability functions. It has been shown in [16] that a semantics for standard logic (regarding soundness and completeness) can be provided without any appeal to model-theoretic or proof-theoretic concepts. The idea was later extended in [17], proving that a probabilistic semantics can be given to any extension of classical propositional logic. However, Theorem 8 is somewhat surprising, because the logic Ci is not an extension of classical propositional logic, just the contrary: although its language extends the language of classical logic, deductively, it is a contraction. Our (paraconsistent) probabilistic semantics for Ci shows, therefore, that an alternative to truth-valued semantics in terms of non-standard probability functions can be provided even in cases of contractions of classical logic.
4. Conditional Probabilities and Paraconsistent Updating
Perhaps the most interesting use of probability in paraconsistent logic is to help the so-called Bayesian epistemology or the formal representation of belief degrees in philosophy. The well-known Bayes’ rule permits one to update probabilities as new information is acquired and, in the paraconsistent case, even when such new information involves some degree of contradictoriness.
We define, as usual, the conditional probability of α given β, for P(β) ≠ 0, as:
P(α/β) = P(α ∧ β) / P(β)
The traditional Bayes’ theorem for conditionalization says, for P(α) ≠ 0 and P(β) ≠ 0:
P(α/β) = P(β/α) · P(α) / P(β)
As usual, here P(α) denotes the prior probability, i.e., the probability of α before β has been observed. P(α/β) denotes the posterior probability, i.e., the probability of α after β is observed. P(β/α) is the likelihood, or the probability of observing β given α, and P(β) is called the marginal likelihood or “model evidence”.
We can now set up a paraconsistent version of Bayes’ theorem, as suggested in [18], by analyzing the marginal likelihood P(β) in terms of its components relative to α, ¬α and α ∧ ¬α.
It is convenient to show at this point a simple, yet pivotal generalization of the classical theorem of total probability:
Theorem 9. Theorem of total paraconsistent probability:
P(β) = P(β ∧ α) + P(β ∧ ¬α) − P(β ∧ α ∧ ¬α)
Since , then , and clearly (by ) ; hence, by Theorem 6:
On the other hand, by the axiom of tautologicity (Definition 3):
it follows that:
Theorem 10. Paraconsistent Bayes’ Conditionalization Rule (PBCR): If
First notice that
in view of Definition 3 and
, so the quotient is well defined. Suppose we have two contradictory hypotheses, α and ¬α, and wish to compute the probability of α based on evidence β. Since the definition of conditional probability gives:
P(α/β) = P(α ∧ β) / P(β) = P(β/α) · P(α) / P(β)
it remains to compute the marginal likelihood P(β). This follows immediately by Theorem 9, dividing and multiplying each term by, respectively, P(α), P(¬α) and P(α ∧ ¬α) (which are not zero). ☐
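The chain of steps from total probability to the conditionalization rule can be spelled out as a worked derivation. This is a sketch under stated assumptions rather than a transcription of the original proof: finite additivity is taken in the form P(φ ∨ ψ) = P(φ) + P(ψ) − P(φ ∧ ψ), the law α ∨ ¬α and the distributivity of ∧ over ∨ are assumed to hold in Ci, the notation P(·/·) is used for conditional probability, and P(β), P(α), P(¬α) and P(α ∧ ¬α) are all assumed to be non-zero.

```latex
\begin{align*}
P(\beta)
  &= P\bigl(\beta \wedge (\alpha \vee \neg\alpha)\bigr)
     && \text{since } \vdash \alpha \vee \neg\alpha \\
  &= P\bigl((\beta \wedge \alpha) \vee (\beta \wedge \neg\alpha)\bigr)
     && \text{distributivity, logical equivalence} \\
  &= P(\beta \wedge \alpha) + P(\beta \wedge \neg\alpha)
     - P(\beta \wedge \alpha \wedge \neg\alpha)
     && \text{finite additivity} \\
  &= P(\beta/\alpha)\,P(\alpha) + P(\beta/\neg\alpha)\,P(\neg\alpha)
     - P(\beta/\alpha \wedge \neg\alpha)\,P(\alpha \wedge \neg\alpha)
     && \text{definition of conditional probability} \\[6pt]
P(\alpha/\beta)
  &= \frac{P(\beta/\alpha)\,P(\alpha)}
          {P(\beta/\alpha)\,P(\alpha) + P(\beta/\neg\alpha)\,P(\neg\alpha)
           - P(\beta/\alpha \wedge \neg\alpha)\,P(\alpha \wedge \neg\alpha)}
\end{align*}
```

The subtracted term corrects for the mass counted twice when α and ¬α are both taken into account, and it vanishes when P(α ∧ ¬α) = 0.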
It is clear that this rule generalizes the classical conditionalization rule, as it reduces to the classical case if P(α ∧ ¬α) = 0 or if α is consistent: indeed, in the last case, P(α ∧ ¬α) = 0 since P(∘α) = 1.
As a slogan, we could summarize PBCR as saying: “Posterior probability is proportional to likelihood times prior probability and inversely proportional to the marginal likelihood analyzed in terms of its components”. It is possible, however, to formulate other kinds of conditionalization rules by combining the notions of conditional probability, contradictoriness, consistency and inconsistency.
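The slogan can be turned into a small computation. The sketch below assumes that the rule divides likelihood times prior by a marginal likelihood expanded as P(β/α)·P(α) + P(β/¬α)·P(¬α) − P(β/α ∧ ¬α)·P(α ∧ ¬α); the function names and the sample numbers are hypothetical and do not come from the worked example in the text.

```python
def paraconsistent_posterior(lik_a, prior_a, lik_not_a, prior_not_a, lik_both, prior_both):
    """Posterior of a given the evidence, with the marginal likelihood split into components.

    lik_* are the likelihoods of the evidence given a, not-a and (a and not-a);
    prior_* are the corresponding prior probabilities.
    """
    marginal = lik_a * prior_a + lik_not_a * prior_not_a - lik_both * prior_both
    if marginal <= 0.0:
        raise ValueError("marginal likelihood must be positive")
    return lik_a * prior_a / marginal


def classical_posterior(lik_a, prior_a, lik_not_a):
    """Classical special case: no overlap between a and not-a."""
    return paraconsistent_posterior(lik_a, prior_a, lik_not_a, 1.0 - prior_a, 0.0, 0.0)


# Hypothetical inputs: priors overlap by 0.2, i.e. P(a) + P(not a) = 1.2.
posterior = paraconsistent_posterior(
    lik_a=0.57 / 0.7, prior_a=0.7,
    lik_not_a=0.36, prior_not_a=0.5,
    lik_both=0.6, prior_both=0.2,
)
print(round(posterior, 4))
```

Setting the overlap prior to zero recovers the familiar two-hypothesis form of Bayes' rule, which is why `classical_posterior` is written as a thin wrapper.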
As an example, suppose that a doping test for an illegal drug is such that it is 98% accurate in the case of a regular user of that drug (i.e., it produces a positive result, showing “doping”, with probability 0.98 in the case that the tested individual often uses the drug), and 90% accurate in the case of a non-user of the drug (i.e., it produces a negative result, showing “no doping”, with probability 0.9 in the case that the tested individual has never used the drug or does not often use the drug).
Suppose, additionally, that: (i) it is known that 10% of the entire population of all athletes often uses this drug; (ii) that 95% of the entire population of all athletes does not often use the drug or has never used it; and (iii) that the test produces a positive result, showing “doping”, with probability 0.11 for the whole population, independent of the tested individual.
Let us use the following abbreviations mnemonically:
D: the event that the drug test has declared “doping” (positive) for an individual;
C: the event that the drug test has declared “clear” or “no doping” (negative) for an individual;
A: the event that the person tested often uses the drug;
¬A: the event that the person tested does not often use the drug or has never used it.
We know that P(A) = 10% and P(¬A) = 95%. However, the situation is contradictory with respect to the events A and ¬A, as they are not mutually exclusive. Therefore, by finite additivity, P(A ∨ ¬A) = P(A) + P(¬A) − P(A ∧ ¬A) = 1, and thus, P(A ∧ ¬A) = 5%.
Furthermore, as given in the problem, P(D/A) = 98%, P(C/¬A) = 90% and P(D) = 11%. The results of the test have no paraconsistent character, since the events D (‘doping’) and C (‘no doping’) exclude each other. Thus, P(C/A) = 2% and P(D/¬A) = 10%.
Suppose someone has been tested, and the test is positive (“doping”). What is the probability that the tested individual regularly uses this illegal drug, that is, what is P(A/D)?
By applying the paraconsistent Bayes’ rule
All of the values are known, with the exception of
it remains to compute
. It follows directly from Theorem 9 that
. Therefore, by plugging in all of the values, it follows that
The probability that a tested individual never uses the drug or just uses it sporadically, given that the test has been positive, analogously computed via the paraconsistent Bayes’ rule, is 34.4%.
As shown below, this example suggests that the paraconsistent Bayes’ conditionalization rule is more sensitive to the test parameters than traditional conditionalization. The following tables compare the paraconsistent results with the results obtained by trying to remove the contradiction involving the events A (the event that the person tested often uses the drug) and ¬A (the event that the person tested does not often use the drug or has never used it), that is, by trying to make them “classical”. The two tables refer to two kinds of tests: a less reliable test, with 10% of false positives (Table 1), and a more reliable test, with 2% of false positives (Table 2).
Since A and ¬A overlap by 5%, we might think, thus, about revising the values by removing the overlap according to three hypothetical scenarios: an alarming scenario, by lowering the value of ¬A by 5%; a happy scenario, by lowering the value of A by 5%; and a cautious scenario, by dividing the surplus equally between A and ¬A, and then computing the probability that the tested individual regularly uses this illegal drug.
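The three "classicalized" scenarios can be computed with the traditional Bayes' rule. The sketch below is only an illustration of that classical side of the comparison: it assumes the sensitivity stays at 0.98 for both tests, uses the stated false-positive rates of 10% and 2%, and derives the three priors from P(A) = 0.10 and P(¬A) = 0.95. The function and variable names are hypothetical, and the paraconsistent values themselves are not computed here.

```python
def classical_posterior(prior_user, sensitivity, false_positive_rate):
    """Traditional Bayes' rule for P(user | positive test)."""
    prior_non_user = 1.0 - prior_user
    true_positive_mass = sensitivity * prior_user
    false_positive_mass = false_positive_rate * prior_non_user
    return true_positive_mass / (true_positive_mass + false_positive_mass)


P_A = 0.10       # stated share of regular users
P_NOT_A = 0.95   # stated share of non-users or occasional users
OVERLAP = P_A + P_NOT_A - 1.0  # surplus to be removed, about 0.05

# Prior P(A) after removing the overlap in each hypothetical scenario.
SCENARIO_PRIORS = {
    "alarming": P_A,                 # overlap taken entirely from not-A
    "happy": P_A - OVERLAP,          # overlap taken entirely from A
    "cautious": P_A - OVERLAP / 2,   # overlap split equally
}


def scenario_table(false_positive_rate, sensitivity=0.98):
    """Classical posterior for each scenario (sensitivity assumed fixed at 0.98)."""
    return {
        name: classical_posterior(prior, sensitivity, false_positive_rate)
        for name, prior in SCENARIO_PRIORS.items()
    }


less_reliable = scenario_table(false_positive_rate=0.10)
more_reliable = scenario_table(false_positive_rate=0.02)
for label, table in (("10% false positives", less_reliable), ("2% false positives", more_reliable)):
    for name, value in table.items():
        print(f"{label:>20} | {name:<8} | {value:.1%}")
```

In each test the happy scenario gives the lowest posterior and the alarming one the highest, with the cautious scenario in between, which is the ordering against which the paraconsistent values are compared.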
Using paraconsistent probabilities, however, one obtains a lower value concerning this less reliable test, namely, , tending towards the “happy” hypothetical scenario where fewer people use the drug.
In this second, more reliable test, by computing the result directly via paraconsistent probabilities, one obtains a higher value, namely , tending towards the “cautious” hypothetical scenario.
The values P(D/A) and P(C/¬A) are known, respectively, as sensitivity and specificity, and the positive likelihood ratio is defined as LR+ = sensitivity / (1 − specificity) (similarly, the negative likelihood ratio is defined as LR− = (1 − sensitivity) / specificity). Defining such measures for paraconsistent probabilities would help to assess their meaning, but this kind of approach is not the intention of this paper.
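For the classical measures just mentioned, the computation is short. This is a minimal sketch using the standard definitions of the likelihood ratios, applied to the sensitivity (0.98) and specificity (0.90) of the less reliable test; the function name is hypothetical, and no paraconsistent counterpart is attempted.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from the standard classical definitions."""
    if not (0.0 < specificity < 1.0):
        raise ValueError("specificity must lie strictly between 0 and 1")
    if not (0.0 <= sensitivity <= 1.0):
        raise ValueError("sensitivity must lie between 0 and 1")
    positive_ratio = sensitivity / (1.0 - specificity)
    negative_ratio = (1.0 - sensitivity) / specificity
    return positive_ratio, negative_ratio


lr_positive, lr_negative = likelihood_ratios(sensitivity=0.98, specificity=0.90)
print(f"LR+ = {lr_positive:.2f}, LR- = {lr_negative:.4f}")
```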
These simple examples suggest the following interpretation about Bayesian paraconsistent updates: When the test is less reliable, paraconsistent probabilities tend toward cautious optimism: values tend to reflect the most favorable outcome. On the other hand, as the test gets more reliable, paraconsistent probabilities tend toward cautious realism, in the sense of favoring more realistic expectations of undesirable outcomes. This notwithstanding, the test is cautious in all cases.
Such numerical results are very much connected to the philosophical profile of paraconsistent logics, which naturally endorses cautious reasoning about contradictions: when you find a contradiction, it is better to carefully analyze its causes, instead of risking throwing the baby out with the bath water.
It might seem that our comparisons above are futile, since, as an unwary reader could think, standard probabilistic merging techniques could be directly applied to define a belief function consensus (as studied, e.g., in [19] and further extended in several ways). This is, however, highly debatable, as discussed in [20] by one of the founders of Dempster–Shafer theory, who argues against global consensus methods and in favor of direct reassessment of the items of evidence that contribute to a belief function.
Besides the debate on whether or not a consensus of belief functions can be directly computed, it is not at all obvious how to understand beliefs in terms of probability theory. As put by J. Y. Halpern and R. Fagin in [21], page 3:
If we view belief as generalized probability, then it makes sense to update beliefs but not combine them. If we view beliefs as a representation of evidence, then it makes sense to combine them, but not update them. This suggests that the rule of combination is appropriate only when we view beliefs as representations of evidence
The belief functions defined by our probability functions, of course, are generalized probabilities, and we align with Halpern and Fagin, among others, in that it makes sense to update beliefs instead of combining them.
Although this first example is just a suggestive sample of what can be done with a robust calculus of probabilities based on a well-founded paraconsistent logic, it is convenient to recall that false positives themselves can be regarded as contradictory when false positive results are more probable than true positive results; this occurs when the incidence of a certain condition (for instance, a disease) in a population is below the false positive rate. In this case, the traditional Bayes’ conditionalization rule already incorporates a mechanism for rationally handling some types of contradictory data without falling into trivialization. Indeed, in [22], the paraconsistent paradigm is invoked as a tool to evaluate the sensitivity of the traditional Full Bayesian Significance Test (FBST) value regarding changes in the prior or reference density. That paper argues that such an intuitive measure of inconsistency can be made rigorous in the context of paraconsistent logic and bilattice structures. Our intention is different, as we start from a logical-paraconsistent approach, but both views can be, in principle, combined.
Furthermore, in [23], a conviction measure and a loss function are defined, intended to be used for evaluating financial operations strategies. A classification tree learning algorithm is thus defined over such a conviction measure, with the advantage of outputting more cautious decisions. It is not implausible that our generalization of Bayes’ updating procedure could be used with similar applications in mind, but this is a subject of further investigation.
5. Paraconsistent Probability Spaces
As is historically recognized (see, e.g., [24]), two main competing schools of probability emerged in Europe in the 17th century, leading to different methods of statistical inference and estimation: frequentist (based on the laws of large numbers and concerned with stochastic laws of chance) and epistemological (referring to credence or reasonable degrees of belief, somehow connected to modern Bayesianism). However, there is also a second competition, represented by the theories of logical probability (in contemporary times, Keynes/Carnap) versus measure-theoretical approaches to probability (Kolmogorov). These two bipartite traditions are not unrelated, but nobody knows for sure how all this is related to the debate between logic and probability (see below).
The reference to probability on sentences is old and comes from Gottfried W. Leibniz (1646–1716), Jacobus Bernoulli (1654–1705), Johann Heinrich Lambert (1728–1777), Bernard Bolzano (1781–1848), Augustus De Morgan (1806–1871) and George Boole (1815–1864) and, in the 19th and 20th centuries, John Venn (1834–1923), Hugh MacColl (1837–1909), Charles S. Peirce (1839–1914), Hans Reichenbach (1891–1953), John M. Keynes (1883–1946) and Rudolf Carnap (1891–1970), just to stay with some famous names.
On the other hand, the reference to probability on sets is more recent. Kolmogorov introduced in his classic book in German of 1933, translated as [25], what he called the ‘elementary theory of probability’, which nowadays is widely used by mathematicians, statisticians and engineers and connected to measure theory. Probability on sentences is more common among philosophers and logicians.
Probability can, alternatively, be based on game theory rather than measure theory, as developed in [26], viewing probability as a perfect information game between two players. The so-called axiom of continuity, considered not to be well motivated, is not assumed in [26]. As put in [27]:
Countable additivity for probability has always been controversial. Émile Borel, who introduced it, and Andrei Kolmogorov, who confirmed its role in measure-theoretic probability, were both ambivalent about it. They saw no conceptual argument for requiring probabilities to be countably additive. It is merely mathematically convenient to assume they are. As Kolmogorov explained in his Grundbegriffe, countable additivity has no meaning for empirical experience, which is always finite, but it is mathematically useful. We can elaborate Kolmogorov’s explanation by pointing out that infinities enter into applied mathematics not as representation but as simplification.
Probability on sets (more properly called probability functions) assigns probabilities directly to sets. Although events, statements or predicates can in many cases be expressed as sets, these two accounts are not the same, even though they are inherently equivalent in the classical scenario. In a paper that surveys and discusses K. Popper’s contributions to the theory of probability, H. Leblanc in [28] recalls that Popper reacted against the dependency of probability theory on the Boolean algebra of sets and proposed his notion of absolute probability motivated by this criticism.
The equivalence between probability on sentences and probability on sets is far from obvious under the paraconsistent paradigm, and our proposal would not be complete if we could not offer a similar approach from the paraconsistent perspective.
Definition 4. A paraconsistent probability space is a structure
Ω is the sample space composed of all possible outcomes
is a set of events, such that
Σ is a
∅ , and ;
Σ is closed under ∪, ∩ and countable unions;
Σ is closed under the following two binary operations:
, such that , where is the usual complement.
, such that
Π is the set of all consistent outcomes;
The map is a probability measure satisfying the following conditions:
and (∅) = 0
If are pairwise disjoint, then
The point of assigning a probability interpretation to logic systems is how legitimate it is to regard probability as attached to logical statements, instead of directly to events. This duality causes perplexity to some, since statisticians and engineers may find confusing or unnecessary this passage to logic and would prefer to deal with events directly (in this case, the notion of a random variable is essential). Philosophers and logicians, on the other hand, may find it natural to attach probabilities to statements, not to events (and in this case, the notion of consequence relation is essential).
A (classical) probability space is just a particular case of paraconsistent probability space 〈Ω, Σ, P〉 where Π is empty and the operation is the identity on Σ (and consequently ).
In the classical case, a definite link between probability on events and probability on sentences (in the cases of finite additivity) can be defined so that the probability measures of information in both cases are the same. For the cases that demand infinite additivity, an infinitary propositional language or a first-order logic is required (see [29], pages 401–404). For de Finetti (see [30]), as much as for Borel and Kolmogorov, infinite additivity is just a question of mathematical convenience, not strictly justified by the concept of probability:
Its success owes much to the mathematical convenience of making the calculus of probability merely a translation of modern measure theory. [...] No-one has given a real justification of countable additivity (other than just taking it as a ‘natural extension’ of finite additivity).
In the cases where Σ is closed under ∪, ∩ and finite unions, a -algebra is referred to as a σ-algebra.
The connection between probability on events and probability on sentences in the classical case is granted through algebraic maneuvers à la Lindenbaum algebra, by showing that probability measures on sentences turn out to be individual cases of probability measures on events. The argument is essentially the following: denote the set of classical bivaluations in its language L as , and assign to each sentence a subset of , defined by .
It can be inductively proven that:
Therefore, the universe acquires a σ-algebra structure. Now, given any probability distribution as in Definition 3, one defines a probability measure over sets , as in Definition 4, by , for any . Clearly is well-defined in view of Theorem 6, Item (2).
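As an illustration of the classical construction above, the following is a minimal sketch, assuming a hypothetical language with only two atoms p and q. Sentences are encoded as Python predicates on bivaluations; the helper names (models, measure, prob) and the distribution WEIGHTS are our own assumptions, chosen only to have concrete numbers. For simplicity, the sketch starts from a measure on sets of bivaluations and reads off the probability of sentences, which suffices to see that the two assignments agree and that the connectives correspond to set operations.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("p", "q")  # hypothetical two-atom language

# All classical bivaluations of the atoms, each as a dict atom -> bool.
VALUATIONS = [dict(zip(ATOMS, bits))
              for bits in product([False, True], repeat=len(ATOMS))]
ALL = frozenset(range(len(VALUATIONS)))


def models(sentence):
    """Event attached to a sentence: indices of the bivaluations satisfying it."""
    return frozenset(i for i, v in enumerate(VALUATIONS) if sentence(v))


# Sentences are represented as predicates on a bivaluation.
p = lambda v: v["p"]
q = lambda v: v["q"]


def neg(a):
    return lambda v: not a(v)


def conj(a, b):
    return lambda v: a(v) and b(v)


def disj(a, b):
    return lambda v: a(v) or b(v)


# Assumed weights, one per bivaluation, summing to 1.
WEIGHTS = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]


def measure(event):
    """Probability measure on sets of bivaluations."""
    return sum((WEIGHTS[i] for i in event), Fraction(0))


def prob(sentence):
    """Probability of a sentence, defined as the measure of its set of models."""
    return measure(models(sentence))


# Connectives correspond to set operations on events.
assert models(conj(p, q)) == models(p) & models(q)
assert models(disj(p, q)) == models(p) | models(q)
assert models(neg(p)) == ALL - models(p)

# Finite additivity holds for sentences because it holds for sets.
assert prob(disj(p, q)) == prob(p) + prob(q) - prob(conj(p, q))
```

Since logically equivalent sentences have the same set of models, prob cannot distinguish them; this is the well-definedness requirement mentioned in the text, seen here in the opposite direction.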
The converse can be easily established for finite probability spaces representing the outcomes of a certain stochastic experiment (say, throwing dice). In such cases, one can define a corresponding probabilistic logic by introducing atomic variables, each stating that a given outcome is the outcome of the experiment, together with sentences stating that a given event occurs, and by defining the probability of any other sentence in these atomic variables accordingly. It is easy to see that the result is a propositional probability model. Some more details can be found, for the classical case, in [31], Chapter 4.7.
A concrete example of a paraconsistent probability space is given by the structure:
Ω is any set (representing all possible outcomes)
is the set of all events, such that
Σ is a σ-algebra, i.e.:
∅ , and ;
Σ is closed under ∪ and ∩;
Σ is closed under the binary operations:
In this case, Π represents the consistent events (from the point of view of the logic Ci
, this kind of paraconsistent probability space turns out to be a classical probability space. The structure
is a particular case of a paraconsistent algebra of sets, as investigated in [32], and the results therein (particularly Theorems 4 and 5) can be adapted to give a precise connection between the concepts of paraconsistent probability on events and probability on sentences in the Ci logic (again, in the cases of finite additivity).
This kind of structure, of interest for mathematical statistics, would permit dealing with paraconsistent probability distributions in the way that is familiar for applications.
The main differences between ‘paraconsistent probability logics’ and ‘paraconsistent probability spaces’ are connected to the fact that the former are concerned with the transmission of probability through valid inferences, while the latter concentrate on measures of sets, the distribution of probabilities and their consequences, permitting one to treat random variables, expectation, central limit theorem, etc. Of course, as much as in the classical case, philosophical issues concern probabilistic logics, not spaces.
Probability spaces and probability logic, as we have seen, make for equivalent views in the classical case (at least for the finite situations), and a similar proof can be adapted for our paraconsistent probability logic and paraconsistent probability spaces (keeping in mind that in both cases, the definitions reflect the inherent nature of our underlying logic Ci). A deeper investigation of paraconsistent probability spaces, in the direction of a more sophisticated treatment involving random variables, measure theory and similar topics, is, however, beyond the scope of the present paper.
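The finite classical equivalence mentioned above can be illustrated in the direction from spaces to sentences. The following is only a sketch under stated assumptions: the experiment is a fair six-sided die, atom(i) is a hypothetical atomic sentence read as "outcome i occurs", and the names OMEGA, P_OUTCOME and prob are ours. It checks, on a few sentences, the principles one expects of a propositional probability model in the classical case.

```python
from fractions import Fraction

# Finite probability space for a fair six-sided die (an assumed example).
OMEGA = range(1, 7)
P_OUTCOME = {w: Fraction(1, 6) for w in OMEGA}


def atom(i):
    """Atomic sentence 'outcome i occurs', as a predicate on outcomes."""
    return lambda w: w == i


def neg(a):
    return lambda w: not a(w)


def conj(a, b):
    return lambda w: a(w) and b(w)


def disj(a, b):
    return lambda w: a(w) or b(w)


def prob(sentence):
    """Probability of a sentence: total weight of the outcomes where it holds."""
    return sum((P_OUTCOME[w] for w in OMEGA if sentence(w)), Fraction(0))


even = disj(atom(2), disj(atom(4), atom(6)))
high = disj(atom(5), atom(6))

# Tautologies get the maximum degree, classical contradictions the minimum.
assert prob(disj(even, neg(even))) == 1
assert prob(conj(even, neg(even))) == 0

# Probabilities respect logical consequence: (even and high) entails even.
assert prob(conj(even, high)) <= prob(even)

# Finite additivity.
assert prob(disj(even, high)) == prob(even) + prob(high) - prob(conj(even, high))
```

The second assertion is specifically classical: it is exactly the kind of clause that the paraconsistent setting relaxes for contradictions that are not consistent.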
6. Discussion: From Paraconsistent Probability to Paraconsistent Possibility
Possibility theory is a theory of uncertainty used in areas such as artificial intelligence, non-monotonic reasoning, belief revision and similar domains, to express uncertain knowledge in scenarios of incomplete information. It is regarded as a kind of imprecise probability theory, at times seen as a generalization of probability, sometimes as its rival (cf., e.g., [33]). Possibility theory uses a pair of dual set functions (possibility and necessity measures) instead of only one measure, in this way differing from probability.
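The duality between possibility and necessity measures can be made concrete in the standard (classical, finite) setting. The sketch below is not the paraconsistent version discussed in this section; it assumes a hypothetical possibility distribution PI on three outcomes, normalized so that its maximum is 1, and uses the usual definitions: the possibility of an event is the maximum of the distribution over it, and the necessity of an event is one minus the possibility of its complement.

```python
from fractions import Fraction

# Assumed possibility distribution on a finite set of outcomes (max value 1).
PI = {
    "sunny": Fraction(1),
    "cloudy": Fraction(7, 10),
    "rainy": Fraction(3, 10),
}
OMEGA = frozenset(PI)


def possibility(event):
    """Possibility of an event: the largest degree among its outcomes."""
    return max((PI[w] for w in event), default=Fraction(0))


def necessity(event):
    """Necessity of an event: dual of the possibility of its complement."""
    return 1 - possibility(OMEGA - frozenset(event))


a = {"sunny"}
b = {"cloudy", "rainy"}

# Possibility is maxitive rather than additive.
assert possibility(a | b) == max(possibility(a), possibility(b))

# Necessity never exceeds possibility for a normalized distribution.
assert necessity(a) <= possibility(a)
```

Unlike a probability, the pair of values (necessity, possibility) attached to an event brackets the uncertainty about it, which is one reason the theory is regarded as a kind of imprecise probability.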
As a formalism to model and reason with uncertain information, possibility theory is also connected to the Dempster–Shafer theory of evidence (see [34]). Probability, possibility and other credal calculi are alternative formalisms, with distinct motivations, and sometimes, it may be important to combine different belief measures for better solving certain problems, as explained in [35].
Possibility theory can be easily and naturally extended over LFIs, by defining possibility and necessity functions (as, e.g., in [36] and especially in [37], where the notion of degrees of support is investigated as a generalization of degrees of belief) simulating our paraconsistent probability measures over LFIs. A deeper study of the meaning and applications of paraconsistent possibility theory, as well as of its inter-relations with paraconsistent probability, is postponed to further work.
7. Summary, Comments and Conclusions
We have reviewed some basic points about paraconsistency, characterizing a logic of formal inconsistency as a paraconsistent logic endowed with a notion of consistency ∘ and a negation ¬, which is free from trivialism. This means that a contradiction expressed by means of the negation ¬ does not necessarily trivialize the underlying consequence relation, although consistent contradictions do explode. We then defined a probability measure for one such logic, the system Ci, taking advantage of the underlying notion of consistency and taking the first steps towards paraconsistent Bayesian updating.
One of the most important topics in statistics is how to ensure reliable uncertain inferences. If we agree with this, the notion of probability and its connection to logic are of fundamental importance. Of course, it is also possible to propose a modal paraconsistent approach to probability based on paraconsistent modalities (see [38]), following the lines of [39], but this has to wait for a better motivation.
How can we interpret paraconsistent probabilities? One possible viewpoint is to interpret paraconsistent probabilities as degrees of belief that a rational agent attaches to events, in such a way that such degrees respect the following principles: the necessary events (for instance, tautologies) get the maximum degrees; impossible events (for instance, bottom particles) get the lowest degrees; probabilities respect logical consequence; and finite additivity is guaranteed (that is, conjunction and disjunction retain their classical interpretation). The last condition seems to be less obvious, but the Dutch book argument provides, at least in the classical case, a line of justification for keeping finite additivity. A Dutch book is a sequence of bets, each regarded by the agent as fair, that in the long run guarantees the agent's loss. De Finetti proved in [40] that a Dutch book can be constructed against an agent whose degrees of belief do not respect finite additivity.
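A toy version of the classical Dutch book argument can be sketched as follows. The agent and its degrees of belief are hypothetical; we assume the usual betting reading, in which a degree of belief b in a sentence is the price the agent considers fair for a bet paying 1 if the sentence is true and 0 otherwise. We also assume the classical situation in which exactly one of A and not-A holds; this is precisely the assumption that a paraconsistent setting may drop.

```python
from fractions import Fraction

# Hypothetical agent whose degrees of belief in A and not-A sum to more than 1,
# violating finite additivity in the classical case.
BELIEF = {"A": Fraction(3, 5), "not A": Fraction(3, 5)}


def agent_net_gain(a_is_true):
    """Net gain of the agent after buying a unit-stake bet on each sentence.

    The agent pays its own 'fair' price for each bet and receives 1 for every
    bet on a sentence that turns out true (classically, exactly one of them).
    """
    cost = BELIEF["A"] + BELIEF["not A"]
    payoff = (1 if a_is_true else 0) + (0 if a_is_true else 1)
    return payoff - cost


# The agent loses the same amount whatever the outcome: a sure loss.
assert agent_net_gain(True) == agent_net_gain(False) == Fraction(-1, 5)
```

If the two degrees of belief summed to less than 1, the symmetric book would have the agent sell both bets at its own prices, again with a guaranteed loss; only when they sum to exactly 1 is no such book available for this pair of sentences.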
J. B. Paris in [41] has shown that a generalization of the standard Dutch book theorem (his Theorem 5) applies to several non-classical logics, among them to paraconsistent logics where the meaning of conjunction and disjunction satisfies the clause for finite additivity in Definition 3. This generalization encompasses our notion of paraconsistent probability based on the logic Ci as well as paraconsistent probability theories based on several other LFIs. More recently, it has been shown in [42] that the Dutch book argument can be further extended to the domain of MV-algebras, providing a logical characterization of coherence for imprecise probability.
It should be taken into account that our underlying logic Ci enlarges the classical scenarios in significant ways: for instance, even if impossible events should be assigned degree zero by a rational agent, neither are such events necessarily unlikely, nor is a contradiction an impossible event (although a consistent contradiction, as commented above, is impossible).
The question of how intimate the relationship between logic and probability is has been a controversial issue for more than three centuries. For instance, F. Ramsey considered in [9] the theory of probability as a branch of logic, in which we find echoes of Leibniz and his defense of a “new kind of logic” devoted to questions of jurisprudence. Leibniz believed that mathematical calculations could help the legal standpoint, but he lacked the relevant combinatorial tools. Boole also presented a calculus of probabilities, based on his logical calculus, developed in the earlier part of his book The Laws of Thought. All this led to a tradition of regarding probabilities as a generalization of two-valued logic, as R. Jeffrey argues in [43].
Perhaps the two most relevant questions concerning the relationship between logic and probability are: (i) whether the laws of probability should be classified as laws of logic; and (ii) how logic and probability could be combined to refine reasoning. The interpretation of probability has been a hotly debated question. If we see consequence relations as preserving degrees of probability instead of truth (as in standard logic), then we gain an interpretation that sees probability as part of logic. We do not intend to enter into this thorny issue in this paper. What concerns us here is that, independent of the chosen interpretation, logic and probability can be combined in a mathematically and philosophically well-reasoned way.
In this respect, considering that probability theory differs from classical logic in various aspects, that paraconsistent logic differs from the classical stance, as well, and that both are tolerant to contradictions, inexactness, and so on, their combination offers a new and exciting reasoning paradigm.
Some specific problems connected to probabilities in non-classical logics are pointed out by J. R. G. Williams in [14]:
Studying axiomatizations of non-classical probabilities is an open-ended task. Can we extend the results of Paris, Mundici et al., and get more general sense of what set of axioms are sufficient to characterize expectations of truth value? A major obstacle here is the appeal throughout to the additivity principle (P3) and its variants, which is the only one to turn essentially on the behavior of particular connectives. Is there a way of capturing its content in terms of logical relations between sentences rather such hardwired constraints?
Our axiomatics does not suffer from such problems and naturally generalizes the additivity principle. Williams also warns (on p. 17) that the generalization of conditional probability in some cases does not guarantee that . Again, this is not a problem in our setting, and yet, another possible source of problems, the fact that should hold for all logically-incompatible α and β, for instance, is treated in our Theorem 7.
As concerns axiomatizations of non-classical probabilities, some other LFIs are natural candidates to be used in connection with probabilities. One of the most promising is the three-valued paraconsistent logic LFI1
]): it is maximal with respect to classical logic (thus, in a certain sense closer to classical logic), enjoys some forms of De Morgan laws, and, above all, is algebraizable. This task is beyond the scope of this paper and will be postponed to future work.
De Finetti provocatively said, in [30], that probability does not exist, meaning that probability exists only subjectively, within individual minds, with which Leibniz would certainly agree. For de Finetti, probability does not need to exist as a property of the outside world and, thus, is a purely epistemic concept. In this sense, paraconsistent probability as attached to the logics of formal inconsistency can be seen as a measure of the persistence of belief under contradictions within individual minds and has epistemological roots with strong connections with the notion of evidence, as defended in [2].
Many arguments maintain that credences should conform to the probability calculus, but since an agent can hold contradictory credences without losing its status of being ideally rational (and, hopefully, while remaining immune to a Dutch book argument), its rationality should be coherent with a paraconsistent probability calculus. It seems to us that what we have in hand is a promising tool to enlarge the notion of probability.