# Causal Confirmation Measures: From Simpson’s Paradox to COVID-19


## Abstract


P_{d} = max(0, (R − 1)/R) (R denotes the risk ratio) as the probability of causation. In contrast, the philosopher Fitelson uses confirmation measure D (posterior probability minus prior probability) to measure the strength of causation. Fitelson concludes that, from the perspective of Bayesian confirmation, we should directly accept the overall conclusion without considering the paradox. The author previously proposed a Bayesian confirmation measure b* similar to P_{d}. To overcome the contradiction between the ECIT and Bayesian confirmation, the author uses the semantic information method with the minimum cross-entropy criterion to deduce causal confirmation measure Cc = (R − 1)/max(R, 1). Cc is like P_{d} but has the normalizing property (between −1 and 1) and cause symmetry. It especially fits cases where a cause restrains an outcome, such as the COVID-19 vaccine controlling the infection. Some examples (about kidney stone treatments and COVID-19) reveal that P_{d} and Cc are more reasonable than D; Cc is more useful than P_{d}.

## 1. Introduction

e_{1} and its negation e_{0}. Variable h takes one of two possible values h_{1} and its negation h_{0}. Then a sample includes four examples (e_{1}, h_{1}), (e_{1}, h_{0}), (e_{0}, h_{1}), and (e_{0}, h_{0}) with different proportions. The inductive school’s researchers often use positive examples and counterexamples’ proportions (P(e_{1}|h_{1}) and P(e_{1}|h_{0})) or likelihood ratio (P(e_{1}|h_{1})/P(e_{1}|h_{0})) to express confirmation measures.

x^{2} because (x + 1)(x − 1) = x^{2} − 1. We know that Kant distinguishes analytic judgments and synthetic judgments. Although causal inference is a mathematical method, it is used for synthetic judgments to obtain uncertain rules in biology, psychology, economics, etc. In addition, causal confirmation only deals with binary causality.

P_{d} (used by Rubin and Greenland [13]) or the probability of necessity PN (used by Pearl [3]). There is:

P_{d} = max(0, (R − 1)/R), where R denotes the risk ratio.

P_{d} is also called Relative Risk Reduction (RRR) [12]. In the above formula, max(0, ∙) means its minimum is 0. This function is to make P_{d} more like a probability. Measure b* proposed by the author in [8] is like P_{d}, but b* changes between −1 and 1. The above risk measures can measure not only risk or relative risk but also success or relative success raised by the cause.

**Example 1.** The admission data of the graduate school of the University of California, Berkeley (UCB), for the fall of 1973 showed that 44% of male applicants were accepted, whereas only 35% of female applicants were accepted. There was probably gender bias present. However, in most departments, female applicants’ acceptance rates were higher than male applicants’.

x_{1} to denote a new cause (or treatment) and x_{0} to denote a default cause or no cause. If we need to compare two causes, we may use x_{1} and x_{2}, or x_{i} and x_{j}, to represent them. In these cases, we may assume that one is default like x_{0}.

**Example 2.** Suppose there are two treatments, x_{1} and x_{2}, for patients with kidney stones. Patients are divided into two groups according to the size of their stones. Group g_{1} includes patients with small stones, and group g_{2} includes those with large ones. Outcome y_{1} represents the treatment’s success. Success rates shown in Figure 1 are possible. In each group, the success rate of x_{2} is higher than that of x_{1}; however, the overall conclusion is the opposite.

x_{2} is better than x_{1}. The reason is that the stones’ size is a confounder, and the overall conclusion is affected by the confounder. We should eliminate this influence. The method is to imagine the patients’ numbers in each group are unchanged whether we use x_{1} or x_{2}. Then we replace weighting coefficients P(g_{i}|x_{1}) and P(g_{i}|x_{2}) with P(g_{i}) (i = 1, 2) to obtain two new overall success rates. Rubin [1] expresses them as P(y_{1}^{x_{1}}) and P(y_{1}^{x_{2}}); whereas Pearl [3] expresses them as P(y_{1}|do(x_{1})) and P(y_{1}|do(x_{2})). Then, the overall conclusion is consistent with the grouping conclusion.

**Example 3.** Treatment x_{1} denotes taking a kind of antihypertensive drug, and treatment x_{0} means taking nothing. Outcome y_{1} denotes recovering health, and y_{0} means not. Patients are divided into group g_{1} (with high blood pressure) and group g_{0} (with low blood pressure). It is very possible that in each group g, P(y_{1}|g, x_{1}) < P(y_{1}|g, x_{0}) (which means x_{0} is better than x_{1}); whereas the overall result is P(y_{1}|x_{1}) > P(y_{1}|x_{0}) (which means x_{1} is better than x_{0}).

x_{1} is better than x_{0} because blood pressure is a mediator, which is also affected by x_{1}. We expect that x_{1} can move a patient from g_{1} to g_{0}; hence we need not change the weighting coefficients from P(g|x) to P(g). The grouping conclusion, P(y_{1}|g, x_{1}) < P(y_{1}|g, x_{0}), exists because the drug has a side effect.

**Example 4.** The United States statistical data about COVID-19 in June 2020 show that COVID-19 led to a higher Case Fatality Rate (CFR) of non-Hispanic whites than others (overall conclusion). We can find that only 35.3% of the infected people were non-Hispanic whites, whereas 49.5% of the infected people who died from COVID-19 were non-Hispanic whites. It seems that COVID-19 is more dangerous to non-Hispanic whites. However, Dana Mackenzie pointed out [19] that we will obtain the opposite conclusion from every age group because the CFR of non-Hispanic whites is lower than that of other people in every age group. So, there exists Simpson’s Paradox. The reason is that non-Hispanic whites have longer lifespans and a relatively large proportion of the elderly, while COVID-19 is more dangerous to the elderly.

i = P(y_{1}|x_{1}) − P(y_{1})

P(y_{1}|x_{1}, g_{i}) > P(y_{1}|x_{2}, g_{i}), i = 1, 2,

P(y_{1}|x_{1}) > P(y_{1}). The result is the same when “>” is replaced with “<”. Therefore, Fitelson affirms that, unlike RD and P_{d}, measure i does not result in the paradox.

- For Example 2 about kidney stones, is it reasonable to accept the overall conclusion without considering the difficulties of treatments?
- Is it necessary to extend or apply a Bayesian confirmation measure incompatible with the ECIT and medical practices to causal confirmation?
- Except for the incompatible confirmation measures, are there no compatible confirmation measures?

- combining the ECIT to deduce causal confirmation measure Cc(x_{1} => y_{1}) (“C” stands for confirmation and “c” for the cause), which is similar to P_{d} but can measure negative causal relationships, such as “vaccine => infection”;
- explaining that measures Cc and P_{d} are more suitable for causal confirmation than measure i by using some examples with Simpson’s Paradox;
- supporting the inductive school of Bayesian confirmation in turn.

Ce(x_{1} => y_{1}), which indicates the outcome’s inevitability or the cause’s sufficiency.

## 2. Background

#### 2.1. Bayesian Confirmation: Incremental School and Inductive School

f. There is [5,6]:

f = f(e, h) = P(h|e) (Carnap, 1962; Fitelson, 2017).

D(e_{1}, h_{1}) = P(h_{1}|e_{1}) − P(h_{1}) (Carnap, 1962),

M(e_{1}, h_{1}) = P(e_{1}|h_{1}) − P(e_{1}) (Mortimer, 1988),

R(e_{1}, h_{1}) = log[P(h_{1}|e_{1})/P(h_{1})] (Horwich, 1982),

C(e_{1}, h_{1}) = P(h_{1}, e_{1}) − P(e_{1})P(h_{1}) (Carnap, 1962),

D(e_{1}, h_{1}) is measure i recommended by Fitelson in [5]. R(e_{1}, h_{1}) is an information measure. It can be written as log P(h_{1}|e_{1}) − log P(h_{1}). Since log P(h_{1}|e_{1}) − log P(h_{1}) = log P(e_{1}|h_{1}) − log P(e_{1}) = log P(h_{1}, e_{1}) − log[P(h_{1})P(e_{1})], D, M, and C increase with R and hence can be replaced with each other. Z is the normalization of D for having the two desired properties [10]. Therefore, we can also call the incremental school the information school.

S(e_{1}, h_{1}) = P(h_{1}|e_{1}) − P(h_{1}|e_{0}) (Christensen, 1999),

N(e_{1}, h_{1}) = P(e_{1}|h_{1}) − P(e_{1}|h_{0}) (Nozick, 1981),

L(e_{1}, h_{1}) = log[P(e_{1}|h_{1})/P(e_{1}|h_{0})] (Good, 1984),

F(e_{1}, h_{1}) = [P(e_{1}|h_{1}) − P(e_{1}|h_{0})]/[P(e_{1}|h_{1}) + P(e_{1}|h_{0})] (Kemeny and Oppenheim, 1952),

b*(e_{1}, h_{1}) = [P(e_{1}|h_{1}) − P(e_{1}|h_{0})]/max(P(e_{1}|h_{1}), P(e_{1}|h_{0})) (Lu, 2020).
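The likelihood-based measures listed above can be computed directly from the two conditional probabilities P(e_{1}|h_{1}) and P(e_{1}|h_{0}). The following is a minimal Python sketch under stated assumptions: the function name `inductive_measures` and the input values 0.9 and 0.1 are hypothetical and chosen only for illustration. Measure S is omitted because it needs P(h_{1}|e_{1}) and P(h_{1}|e_{0}) instead.

```python
import math

def inductive_measures(p_e1_h1: float, p_e1_h0: float) -> dict:
    """Return N, L, F, and b* computed from P(e1|h1) and P(e1|h0).

    Both probabilities must be positive so that the ratio and its log exist.
    """
    diff = p_e1_h1 - p_e1_h0
    return {
        "N": diff,                            # Nozick's difference
        "L": math.log(p_e1_h1 / p_e1_h0),     # Good's log likelihood ratio
        "F": diff / (p_e1_h1 + p_e1_h0),      # Kemeny and Oppenheim
        "b*": diff / max(p_e1_h1, p_e1_h0),   # channels' confirmation measure
    }

# Hypothetical inputs, not taken from the paper.
measures = inductive_measures(0.9, 0.1)

# F written through the likelihood ratio gives the same value.
lr_plus = 0.9 / 0.1
f_from_lr = (lr_plus - 1) / (lr_plus + 1)
```

With these inputs, N and F both come out near 0.8, while b* is about 0.89 because it divides by the larger of the two probabilities only.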

(LR^{+} = P(e_{1}|h_{1})/P(e_{1}|h_{0})). For example, L = log LR^{+} and F = (LR^{+} − 1)/(LR^{+} + 1) [7]. Therefore, these measures are compatible with risk (or reliability) measures, such as P_{d}, used in medical tests and disease control. Although the author has studied semantic information theory for a long time [26,27,28] and believes both schools have made important contributions to Bayesian confirmation, he is on the side of the inductive school. The reason is that information evaluation occurs before classification, whereas confirmation is needed after classification [8].

(e_{1}, h_{1}), (e_{0}, h_{1}), (e_{1}, h_{0}), and (e_{0}, h_{0}) with different proportions as the evidence to construct confirmation measures [8,10]. The main problem with the incremental school is that they do not distinguish well between the evidence of a major premise and that of the consequent of the major premise. When they use the four examples’ proportions to construct confirmation measures, e is regarded as the major premise’s antecedent, whose negation e_{0} is meaningful. However, when they say “to evaluate the supporting strength of e to h”, e is understood as a sample, whose negation e_{0} is meaningless. It is even less meaningful to put a sample e or e_{0} in an example (e_{1}, h_{1}) or (e_{0}, h_{1}).

i) and S to show the main difference between the two schools’ measures. Since:

D(e_{1}, h_{1}) = P(h_{1}|e_{1}) − P(h_{1}) = P(h_{1}|e_{1}) − [P(e_{1})P(h_{1}|e_{1}) + P(e_{0})P(h_{1}|e_{0})]

= [1 − P(e_{1})]P(h_{1}|e_{1}) − P(e_{0})P(h_{1}|e_{0}) = P(e_{0})S(e_{1}, h_{1}),

P(e_{0}) or P(e_{1}), but S does not. P(e) means the source and P(h|e) means the channel. D is related to the source and the channel, but S is only related to the channel. Measures F and b* are also only related to channel P(e|h). Therefore, the author calls b* the channels’ confirmation measure.
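The identity D(e_{1}, h_{1}) = P(e_{0})S(e_{1}, h_{1}) can be checked numerically. The sketch below is only an illustration: the function names and the probabilities (P(h_{1}|e_{1}) = 0.8, P(h_{1}|e_{0}) = 0.2, and two choices of P(e_{1})) are hypothetical values, not data from the paper.

```python
def measure_d(p_e1: float, p_h1_e1: float, p_h1_e0: float) -> float:
    """D(e1, h1) = P(h1|e1) - P(h1), with P(h1) from the total probability law."""
    p_h1 = p_e1 * p_h1_e1 + (1 - p_e1) * p_h1_e0
    return p_h1_e1 - p_h1

def measure_s(p_h1_e1: float, p_h1_e0: float) -> float:
    """S(e1, h1) = P(h1|e1) - P(h1|e0); it does not involve the source P(e)."""
    return p_h1_e1 - p_h1_e0

# Same channel P(h|e), two different sources P(e1) (hypothetical values).
s_value = measure_s(0.8, 0.2)
d_source_a = measure_d(0.3, 0.8, 0.2)   # P(e0) = 0.7
d_source_b = measure_d(0.6, 0.8, 0.2)   # P(e0) = 0.4
```

Here S stays at 0.6 for both sources, whereas D moves from about 0.42 to about 0.24, which is P(e_{0}) times S in each case.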

#### 2.2. The P-T Probability Framework and the Methods of Semantic Information and Cross-Entropy for Channels’ Confirmation Measure b*(e→h)

{x_{0}, x_{1}, …}, and Y be a random variable representing a label or hypothesis, taking a value y∈B = {y_{0}, y_{1}, …}. The Shannon channel is a conditional probability matrix P(y_{j}|x_{i}) (i = 1,2,…; j = 1,2,…) or a set of transition probability functions P(y_{j}|x) (j = 1,2,…). The semantic channel is a truth value matrix T(y_{j}|x_{i}) (i = 1,2,…; j = 1,2,…) or a set of truth functions T(y_{j}|x) (j = 0,1,…). Let the elements in A that make y_{j} true form a fuzzy subset θ_{j}. The membership function T(θ_{j}|x) of θ_{j} is also the truth function T(y_{j}|x) of y_{j}, i.e., T(θ_{j}|x) = T(y_{j}|x).

y_{j} is:

y_{j} is true, the conditional probability of x is:

θ_{j} can also be understood as a model parameter; hence P(x|θ_{j}) is a likelihood function.

- The statistical probability is normalized (the sum is 1), whereas the logical probability is not. Generally, we have T(θ_{0}) + T(θ_{1}) + … > 1.
- The maximum value of T(θ_{j}|x) is 1 for different x, whereas P(y_{0}|x) + P(y_{1}|x) + … = 1 for a given x.

x_{i} conveyed by y_{j} is:

y_{j} is:

H(X|θ_{j}) is a cross-entropy:

P(x|θ_{j}) so that P(x|θ_{j}) = P(x|y_{j}), H(X|θ_{j}) reaches its minimum. It is easy to find from Equation (10) that I(X; θ_{j}) reaches its maximum as H(X|θ_{j}) reaches its minimum. The author has proved that if P(x|θ_{j}) = P(x|y_{j}), then T(θ_{j}|x)∝P(y_{j}|x) [27]. If for all j, T(θ_{j}|x)∝P(y_{j}|x), we say that the semantic channel matches the Shannon channel.

h∈{h_{0}, h_{1}} = {uninfected, infected} and e∈{e_{0}, e_{1}} = {negative, positive}. The Shannon channel is P(e|h), and the semantic channel is T(e|h). The major premise to be confirmed is e_{1}→h_{1}, which means “If one’s test is positive, then he is infected”.

e_{1}(h) as the linear combination of a clear predicate (whose truth value is 0 or 1) and a tautology (whose truth value is always 1). Let the tautology’s proportion be b_{1}′ and the clear predicate’s proportion be 1 − b_{1}′. Then we have:

T(e_{1}|h_{0}) = b_{1}′; T(e_{1}|h_{1}) = b_{1}′ + b_{1} = b_{1}′ + (1 − b_{1}′) = 1.

b_{1}′ is also called the degree of disbelief of rule e_{1}→h_{1}. The degree of disbelief optimized by a sample, denoted by b_{1}′*, is the degree of disconfirmation. Let b_{1}* denote the degree of confirmation; we have b_{1}′* = 1 − |b_{1}*|. By maximizing average semantic information I(H; θ_{1}) or minimizing cross-entropy H(H|θ_{j}), we can deduce (see Section 3.2 in [8]):

_{1}) is decomposed into an equiprobable part and a part with 0 and 1. Then, we can deduce the predictions’ confirmation measure c*:

#### 2.3. Causal Inference: Talking from Simpson’s Paradox

P(y_{1}|x_{1}) and P(y_{1}|x_{0}) may not reflect causality well; in addition to the observed data or joint probability distribution P(y, x, g), we also need to suppose the causal structure behind the data [3].

P(y_{1}|do(x)) will also differ. In all cases, we should replace P(y|x) with P(y|do(x)) (if they are different) to get RD, RR, and P_{d}.

x_{1} and x_{2}, we should compare the two outcomes in the same background. However, there is often no situation where other conditions remain unchanged except for the cause. For this reason, we need to replace x_{1} with x_{2} in our imagination and see the shift in y_{1} or its probability. If u is a confounder and not affected by x, the number of members in g_{1} and g_{2} should be unchanged with x, as shown in Figure 3. The solution is to use P(g) instead of P(g|x) for the weighting operation so that the overall conclusion is consistent with the grouping conclusion. Hence, the paradox no longer exists.

P(x_{0}) + P(x_{1}) = 1 is tenable, P(do(x_{1})) + P(do(x_{0})) = 1 is meaningless. That is why Rubin emphasizes that P(y^{x}), i.e., P(y|do(x)), is still a marginal probability instead of a conditional probability, in essence.

g_{1}, the two subgroups’ members (patients) treated by x_{1} and x_{2} are interchangeable (i.e., Pearl’s causal independence assumption mentioned in [5]). If a member is divided into the subgroup with x_{1}, its success rate should be P(y_{1}|g, x_{1}); if it is divided into the subgroup with x_{2}, the success rate should be P(y_{1}|g, x_{2}). P(g|x_{1}) and P(g|x_{2}) are different only because half of the data are missing. However, we can fill in the missing data using our imagination.

g_{1} may enter g_{2} because of x, and vice versa. P(g|x_{0}) and P(g|x_{1}) are hence different without needing to be replaced with P(g). We can let P(y_{1}|do(x)) = P(y_{1}|x) directly and accept the overall conclusion.

#### 2.4. Probability Measures for Causation

y_{1} stand for the infection, x_{1} for the exposure, and x_{0} for no exposure. Then there is R(t) = P(y_{1}|do(x_{1}), t)/P(y_{1}|do(x_{0}), t). Its lower limit is 0 because the probability cannot be negative. When the change of t is neglected, considering the lower limit, we can write the probability of causation as:

P_{d} = max(0, (R − 1)/R).

P_{d} and explains PN as the probability of necessity [3]. P_{d} is very similar to confirmation measure b* [8]. The main difference is that b* changes between −1 and 1.

P_{d} and Δ*P_{x}^{y} is that P_{d}, like b*, is sensitive to counterexamples’ proportion P(y_{1}|x_{0}), whereas Δ*P_{x}^{y} is not. Table 1 shows their differences.

f in [5].

x_{0}; the other is for the inevitability of y. P(y|x) may be good for the latter but not for the former. The former should be independent of P(x) and P(y). P_{d} is such a one.

P_{d}. If P_{d} is 0 when y is uncorrelated to x, then P_{d} should be negative instead of 0 when x inversely affects y (e.g., vaccine affects infection). Therefore, we need a confirmation measure between −1 and 1 instead of a probability measure between 0 and 1.
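The contrast between a measure clipped at 0 and one ranging over [−1, 1] can be illustrated with the two formulas given in the abstract, P_{d} = max(0, (R − 1)/R) and Cc = (R − 1)/max(R, 1). The sketch below is a hedged illustration: the function names and the risk values 0.10 and 0.02 are hypothetical and not taken from the paper.

```python
def p_d(r: float) -> float:
    """Probability of causation: max(0, (R - 1)/R), for a risk ratio R > 0."""
    return max(0.0, (r - 1) / r)

def cc(r: float) -> float:
    """Causal confirmation measure: (R - 1)/max(R, 1), for a risk ratio R >= 0."""
    return (r - 1) / max(r, 1.0)

# Hypothetical risks: 0.10 under one condition and 0.02 under the other.
r_harmful = 0.10 / 0.02      # the cause raises the outcome, R = 5
r_protective = 0.02 / 0.10   # the cause restrains the outcome, R = 0.2

pd_harmful, cc_harmful = p_d(r_harmful), cc(r_harmful)
pd_protective, cc_protective = p_d(r_protective), cc(r_protective)
```

For R = 5 both measures are about 0.8. For R = 0.2, P_{d} is clipped to 0, while Cc is about −0.8 and so still records the strength of the restraining effect.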

## 3. Methods

#### 3.1. Defining Causal Posterior Probability

P(y^{x}) is not conditional; it is still marginal. To distinguish P(y^{x}) and marginal probability P(y), we call P(y^{x}), i.e., P(y|do(x)), the Causal Posterior Probability (CPP). What posterior probability is the CPP? We use the following example to explain.

P(y_{1}|z) = 1 for z ≥ z_{0}. The label of an elderly person is y_{1}, and the label of a non-elderly person is y_{0}. The probability of the elderly is:

x_{1} denote the improved medical condition. After a period, p(z) becomes p(z^{x_{1}}) = p(z|do(x_{1})) and P(y_{1}) becomes:

x_{0} be the medical condition existing already. We have:

- About whether a drug (x_{1}) can lower blood pressure, blood sugar, blood lipid, or uric acid (z) or not, if z drops to a certain level z_{0}, we say that the drug is effective (y_{1}).
- About whether a fertilizer (x_{1}) can increase grain yield (z), if z increases to a certain extent z_{0}, the grain yield is regarded as a bumper harvest (y_{1}).
- Can a process x_{1} reduce the deviation z of a product’s size? If the deviation is smaller than the tolerance (z_{0}), we consider the product qualified (y_{1}).

z_{0}. For example, if the dividing boundary of the elderly changes from z_{0} = 60 to z_{0}′ = 65, the posterior probability P(y_{1}|z_{0}′) of y_{1} will become smaller. This change seemingly also reflects causality. However, the author thinks this change is due to a mathematical cause, which does not reflect the causal relationship we want to study. Therefore, we need to define the CPP more specifically.

**Definition 1.**

{z_{1}, z_{2}, …} and p(z) is the probability distribution of the objective result. Random variable Y takes a value y∈{y_{0}, y_{1}} and represents the outcome, i.e., the classification label of z. The cause or treatment is x ∈ {x_{0}, x_{1}} or {x_{1}, x_{2}}. If replacing x_{0} with x_{1} (or x_{1} with x_{2}) can cause the change of probability distribution p(z), we call x the cause, p(z|x) or p(z^{x}) the CPP distribution, and P(y^{x}) = P(y|do(x)) the CPP.

y_{1}, the conditional probability distribution p(z|y_{1}) is not the CPP distribution because the probability distribution of z does not change with y.

x_{1} is the vaccine for COVID-19, y_{1} is the infection, and e_{1} is the test-positive. Then P(y_{1}|x_{1}) or P(y_{1}|do(x_{1})) is the CPP, whereas P(y_{1}|e_{1}) is not. We may regard y_{1} as the conclusion obtained by the best test, e_{1} is from a common test, and P(y_{1}|e_{1}) is the probability prediction of y_{1}. P(y_{1}|e_{1}) is not a CPP because e_{1} does not change p(z) and the conclusion from the best test.

#### 3.2. Using x_{2}/x_{1} => y_{1} to Compare the Influences of Two Causes on an Outcome

x_{0} is the negation of x_{1}; they are complementary. However, in causal relationships, x_{1} is the substitute for x_{0}. For example, consider taking medicines to cure the disease. Let x_{0} denote taking nothing, and x_{1} and x_{2} represent taking two different medicines. Each of x_{1} and x_{2} is a possible alternative to x_{0} instead of the negation of x_{0}. Furthermore, in some cases, x_{1} may include x_{0} (see Section 4.3).

x_{2} and x_{1}, it is unclear to use “x_{2} => y_{1}” to indicate the causal relationship. Therefore, the author suggests that we had better replace “x_{2} => y_{1}” with “x_{2}/x_{1} => y_{1}”, which means “replacing x_{1} with x_{2} will give rise to or increase y_{1}”.

“x_{2}/x_{1}”:

- One is to express symmetry (Cc(x_{2}/x_{1} => y_{1}) = −Cc(x_{1}/x_{2} => y_{1})) conveniently.
- Another is to emphasize that x_{1} and x_{2} are not complementary but alternatives for eliminating Simpson’s Paradox easily.

x_{1} with x_{0}, we may selectively use “x_{1}/x_{0} => y_{1}” or “x_{1} => y_{1}”.

x_{2} with x_{1} in our imagination, we can easily understand why the number of patients in each group should be unchanged, that is, P(g|x_{1}) = P(g|x_{2}) = P(g). The reason is that the replacement will not change everyone’s kidney stone size.

x_{1}. When we replace x_{0} with x_{1}, P(g|x_{1}) ≠ P(g|x_{0}) ≠ P(g) is reasonable, and hence the weighting coefficients need not be adjusted. In this case, we can directly let P(y_{1}|do(x)) = P(y_{1}|x).

#### 3.3. Deducing Causal Confirmation Measure Cc by the Methods of Semantic Information and Cross-Entropy

x_{1} => y_{1} as an example to deduce the causal confirmation measure Cc. If we need to compare any two causes, x_{i} and x_{k}, we may assume that one is default as x_{0}.

s_{1} = “x_{1} => y_{1}” and s_{0} = “x_{0} => y_{0}”. We suppose that s_{1} includes a believable part with proportion b_{1} and a disbelievable part with proportion b_{1}′. Their relation is b_{1}′ + |b_{1}| = 1. First, we assume b_{1} > 0; hence b_{1} = 1 − b_{1}′. The two truth values of s_{1} are T(s_{1}|x_{1}) and T(s_{1}|x_{0}), as shown in the last row of Table 2.

T(s_{1}|x) is related to b_{1} and b_{1}′ for b_{1} > 0. T(s_{1}|x_{1}) = 1 means that example (x_{1}, y_{1}) makes s_{1} fully true; T(s_{1}|x_{0}) = b_{1}′ is the truth value and the degree of disbelief of s_{1} for given counterexample (x_{0}, y_{1}).

Cc_{1} = Cc(x_{1}/x_{0} => y_{1}) = b_{1}*.

s_{1} is (see Equation (7)):

T(θ_{1}) = P(x_{1}) + P(x_{0})b_{1}′,

x_{1} by y_{1} and s_{1} is:

θ_{j} can be regarded as the parameter of truth function T(s_{j}|x).

y_{1} and s_{1} about x is:

H(X|θ_{1}) is a cross-entropy. We suppose that sampling distribution P(x, y) has been modified so that P(y|x) = P(y|do(x)). According to the property of cross-entropy, H(X|θ_{1}) reaches its minimum so that I(X; θ_{j}) reaches its maximum as P(x|θ_{1}) = P(x|y_{1}), i.e.,

x_{i} and y_{j} and may be independent of P(x) and P(y), unlike P(x_{i}, y_{j}). From Equations (25) and (26), we obtain the optimized degree of disbelief, i.e., the degree of disconfirmation:

b_{1}′* = m(x_{0}, y_{1})/m(x_{1}, y_{1}).

_{1}:

b_{1}* > 0 and hence m(x_{1}, y_{1}) ≥ m(x_{0}, y_{1}). If m(x_{1}, y_{1}) < m(x_{0}, y_{1}), b_{1}* should be negative, and b_{1}′* should be m(x_{1}, y_{1})/m(x_{0}, y_{1}). Then we have:

b_{1}′* = P(x_{0}|y_{1})/P(x_{1}|y_{1}),

P(y_{1}|x_{1})/P(y_{1}|x_{0}) is the relative risk or the likelihood ratio used for P_{d}.

m(x_{0}, y_{1}) = 0 and the minimum is −1 as m(x_{1}, y_{1}) = 0. It has cause symmetry since:

_{1}) be the linear combination of a uniform probability distribution and a 0–1 distribution, we can obtain another causal confirmation measure:

c*(e_{1}→h_{1}) [8]. It increases monotonically with the Bayesian confirmation measure f(h_{1}, e_{1}) = P(h_{1}|e_{1}), which is used by Fitelson et al. [5,32]. However, Ce has the normalizing property and the outcome symmetry:

Ce(x_{1} => y_{1}) = −Ce(x_{1} => y_{0}).

#### 3.4. Causal Confirmation Measures Cc and Ce for Probability Predictions

y_{1}, b_{1}*, and P(x), we can make the probability prediction about x:

b_{1}* > 0, θ_{1} represents y_{1} with b_{1}′*, and θ_{0} means y_{0} with b_{0}′*. If b_{1}* < 0, we let T(s_{1}|x_{1}) = b_{1}′ and T(s_{1}|x_{0}) = 1, and then use the above formula.

x_{1} and Ce_{1}. For example, when Ce_{1} is greater than 0, there is:

θ_{x_{1}} denotes x_{1} and Ce_{1}.

b_{1} > 0 and b_{0} > 0, as shown in Table 2, we can obtain the corresponding Shannon channel P(y|x). According to Equation (32), we can deduce:

## 4. Results

#### 4.1. A Real Example of Kidney Stone Treatments

x_{2} (i.e., treatment A in [15]) is better than treatment x_{1} (i.e., treatment B in [15]); whereas, according to the average success rates, P(y_{1}|x_{2}) = 0.78 and P(y_{1}|x_{1}) = 0.83, treatment x_{1} is better than treatment x_{2}. There seems to be a paradox.

P(g|x_{1}) or P(g|x_{2}) as the weighting coefficient for P(y_{1}|do(x_{1})) and P(y_{1}|do(x_{2})). After replacing P(y_{1}|x_{1}) with P(y_{1}|do(x_{1})) and P(y_{1}|x_{2}) with P(y_{1}|do(x_{2})), we derived Cc_{1} = Cc(x_{2}/x_{1} => y_{1}) = 0.06 (see Table 3), which means that the overall conclusion is that treatment x_{2} is better than treatment x_{1}.
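The adjustment and the resulting Cc can be reproduced with a short calculation. The sketch below is an illustration under a stated assumption: it uses the widely reported counts of the kidney stone study (treatment A: 81/87 small and 192/263 large; treatment B: 234/270 small and 55/80 large), which match the success rates quoted in this section but may differ in detail from the data behind Table 3. The helper names are hypothetical.

```python
# (successes, patients) per (treatment, group); x2 = treatment A, x1 = treatment B.
# Counts are an assumption (see the lead-in), g1 = small stones, g2 = large stones.
counts = {
    ("x2", "g1"): (81, 87),
    ("x2", "g2"): (192, 263),
    ("x1", "g1"): (234, 270),
    ("x1", "g2"): (55, 80),
}
groups = ("g1", "g2")
treatments = ("x1", "x2")
total = sum(n for _, n in counts.values())

# P(g): group proportions over all patients, used instead of P(g|x).
p_g = {g: sum(counts[(x, g)][1] for x in treatments) / total for g in groups}

def p_obs(x: str) -> float:
    """Observed overall success rate P(y1|x), i.e., weighted by P(g|x)."""
    return sum(counts[(x, g)][0] for g in groups) / sum(counts[(x, g)][1] for g in groups)

def p_do(x: str) -> float:
    """Adjusted success rate P(y1|do(x)), weighting group rates by P(g)."""
    return sum(p_g[g] * counts[(x, g)][0] / counts[(x, g)][1] for g in groups)

def cc(p_new: float, p_default: float) -> float:
    """Cc = (R - 1)/max(R, 1) with R = p_new / p_default."""
    r = p_new / p_default
    return (r - 1) / max(r, 1.0)

cc_x2_over_x1 = cc(p_do("x2"), p_do("x1"))
cc_x1_over_x2 = cc(p_do("x1"), p_do("x2"))
```

With these counts, the observed rates round to 0.83 for x_{1} and 0.78 for x_{2}, while the adjusted rates reverse the order and give Cc(x_{2}/x_{1} => y_{1}) of about 0.06 and Cc(x_{1}/x_{2} => y_{1}) of about −0.06.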

Cc_{1} in Table 3, we used treatment x_{1} as the default; the degree of causal confirmation Cc_{1} = Cc(x_{2}/x_{1} => y_{1}) is 0.06. If we used x_{2} as the default, Cc_{1} = Cc(x_{1}/x_{2} => y_{1}) = −0.06. Using measure Cc, we need not worry about which of P(y_{1}|do(x_{1})) and P(y_{1}|do(x_{2})) is larger, whereas, using P_{d}, we have to consider that before calculating P_{d}.

D(x_{1}, y_{1}) to compare x_{1} and x_{2}. We obtained:

- P(y_{1}) = P(x_{1})P(y_{1}|x_{1}) + P(x_{2})P(y_{1}|x_{2}) = 0.805,
- P(y_{1}|x_{2}, g_{1}) − P(y_{1}) = 0.93 − 0.805 > 0,
- P(y_{1}|x_{2}, g_{2}) − P(y_{1}) = 0.73 − 0.805 < 0,
- D(x_{1}, y_{1}) = P(y_{1}|x_{1}) − P(y_{1}) = 0.83 − 0.805 > 0, and
- D(x_{2}, y_{1}) = P(y_{1}|x_{2}) − P(y_{1}) = 0.78 − 0.805 < 0.

x_{1} is better than x_{2}. There seems to be no paradox only because the paradox is avoided rather than eliminated when we use D(x_{1}, y_{1}).
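The D values above follow from the quoted success rates once P(x) is fixed. The sketch below assumes P(x_{1}) = P(x_{2}) = 0.5, which is the weighting implied by P(y_{1}) = 0.805 being the midpoint of 0.83 and 0.78; the variable names are hypothetical.

```python
# Observed (unadjusted) success rates quoted in the text.
p_y1_given_x = {"x1": 0.83, "x2": 0.78}

# Assumption: equal-sized treatment arms, implied by P(y1) = 0.805.
p_x = {"x1": 0.5, "x2": 0.5}

# Prior P(y1) by the total probability law, then D(x, y1) = P(y1|x) - P(y1).
p_y1 = sum(p_x[x] * p_y1_given_x[x] for x in p_x)
d = {x: p_y1_given_x[x] - p_y1 for x in p_x}
```

Because P(y_{1}) is a weighted mean of the two rates, the two D values always have opposite signs, so D simply ranks the treatments by their unadjusted overall rates.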

b_{1}′* and b_{0}′* is the same as P(y|do(x)) shown in the last two rows of Table 3.

#### 4.2. An Example of Eliminating Simpson’s Paradox with COVID-19

x_{1} represents the non-Hispanic white and x_{2} means the other races. P(y_{1}|x_{1}, g) and P(y_{1}|x_{2}, g) are the CFRs of x_{1} and x_{2} in an age group g. See Appendix A for the original data and median results.

D(x_{1}, y_{1}) to assess the risk. The average CFR is 0.97 (found on the same website [33]). We obtained:

P(y_{1}|x_{1}) − P(y_{1}) = 1.04 − 0.97 = 0.07,

P(y_{1}|x_{2}) − P(y_{1}) = 0.73 − 0.97 = −0.14,

#### 4.3. COVID-19: Vaccine’s Negative Influences on the CFR and Mortality

P_{d} is not convenient to measure the “probability” of “vaccine => infection” or “vaccine => death”, since P_{d} is regarded as the probability, whose minimum value is 0, while the vaccine’s influence is negative. However, there is no problem using Cc because Cc can be negative.

(x_{0}) with the new mortality rate due to common reasons plus COVID-19 (x_{1}) during the same period (such as one year). Since the average lifespan of people in the United States is 79 years, the annual mortality rate is about 1/79 = 0.013. From Table 6, we can derive that the yearly mortality rate caused by COVID-19 is 0.001 (for unvaccinated people) or 0.00018 (for vaccinated people).

x_{0} and x_{1} are independent of each other. Then the new mortality rate P(y_{1}|x_{1}) should be 0.013 + 0.001 − 0.013 × 0.001 ≈ 0.014 (for unvaccinated people) or 0.013 + 0.00018 − 0.013 × 0.00018 ≈ 0.01318 (for vaccinated people). Table 7 shows the degree of causal confirmation of COVID-19 leading to mortality, for which we assume P(y_{1}|x) = P(y_{1}|do(x)).

Cc_{1} = 0.07 means that among unvaccinated people who die, 7% are due to COVID-19. Moreover, Cc = 0.014 means that among the vaccinated people who die, 1.4% are due to COVID-19.
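The two Cc values can be recomputed from the rates quoted above. The sketch below is an illustration with hypothetical helper names; it combines the common-cause rate 0.013 with each COVID-19 rate under the independence assumption stated in the text and then applies Cc = (R − 1)/max(R, 1).

```python
def combined_rate(p_common: float, p_covid: float) -> float:
    """Mortality rate from either cause, assuming the two causes are independent."""
    return p_common + p_covid - p_common * p_covid

def cc(p_new: float, p_default: float) -> float:
    """Cc = (R - 1)/max(R, 1) with R = p_new / p_default."""
    r = p_new / p_default
    return (r - 1) / max(r, 1.0)

p_common = 0.013  # annual mortality rate from common reasons (x0)

# x1 = common reasons plus COVID-19, for the two populations.
cc_unvaccinated = cc(combined_rate(p_common, 0.001), p_common)
cc_vaccinated = cc(combined_rate(p_common, 0.00018), p_common)
```

Without intermediate rounding, the results are roughly 0.071 and 0.0135, in line with the 0.07 and 0.014 reported in the text, where the small gap for the vaccinated case comes from rounding the mortality rate to 0.01318 first.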

x_{1} = COVID-19 instead of x_{1} = x_{0} + COVID-19, we would get a strange conclusion that COVID-19 could reduce deaths.

## 5. Discussion

#### 5.1. Why Can P_{d} and Cc Better Indicate the Strength of Causation Than D in Theory?

m(x_{i}, y_{j}) (i = 0,1; j = 0,1) the probability correlation matrix, which is not symmetrical. Although there exists P(x, y) first and then m(x, y) from the perspective of calculation, there exists m(x, y) first and then P(x, y) from the perspective of existence. That is, given P(x), m(x, y) only allows specific P(y) to happen.

P_{d} and Cc only depend on m(x, y) and are independent of P(x) and P(y). The two degrees of disconfirmation, b_{1}′* and b_{0}′*, ascertain a semantic channel and a Shannon channel. Therefore, the two degrees of causal confirmation, Cc_{1} = b_{1}* and Cc_{0} = b_{0}*, indicate the strength of the constraint relationship (causality) from x to y. Like Cc, measure P_{d} is also only related to m(x, y). D and Δ*P_{x}^{y} are different; they are related to P(x), so they do not indicate the strength of causation well.

P_{d} or Cc are unrelated to vaccination coverage rate P(x_{1}), whereas measure Δ*P_{x}^{y} is related to P(x_{1}). Measure D is associated with P(y) and is also related to P(x_{1}). P_{d} and Cc_{1} obtained from one region also fit other areas for the same variant of COVID-19. In contrast, Δ*P_{x}^{y} and D are not universal because the vaccination coverage rate P(x_{1}) differs in different areas.
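The dependence of D on the coverage rate can be seen in a small numerical experiment. The sketch below is hedged: the infection rates 0.02 (vaccinated, x_{1}) and 0.10 (unvaccinated, x_{0}) and the two coverage rates are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
P_Y1_VACCINATED = 0.02     # hypothetical P(y1|x1)
P_Y1_UNVACCINATED = 0.10   # hypothetical P(y1|x0)

def measure_d(coverage: float) -> float:
    """D(x1, y1) = P(y1|x1) - P(y1), where P(y1) depends on coverage P(x1)."""
    p_y1 = coverage * P_Y1_VACCINATED + (1 - coverage) * P_Y1_UNVACCINATED
    return P_Y1_VACCINATED - p_y1

def cc(p_new: float, p_default: float) -> float:
    """Cc = (R - 1)/max(R, 1); it uses only the two conditional probabilities."""
    r = p_new / p_default
    return (r - 1) / max(r, 1.0)

d_low_coverage = measure_d(0.2)    # region with 20% coverage
d_high_coverage = measure_d(0.8)   # region with 80% coverage
cc_value = cc(P_Y1_VACCINATED, P_Y1_UNVACCINATED)
```

With the same conditional probabilities, D changes from about −0.064 to about −0.016 as coverage rises from 20% to 80%, whereas Cc stays at about −0.8 in both regions.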

P(y_{1}) is a prior probability, and P(y_{1}|x) − P(y_{1}) is its increment. However, when measure D is used for causal confirmation, P(y_{1}) is obtained from P(x) and P(y_{1}|x) after the treatment, so P(y) is no longer a prior probability, which is also a fatal problem with the incremental school.

P_{d} can indicate the degree of belief of a fuzzy major premise and can be used for probability predictions, whereas D and Δ*P_{x}^{y} cannot.

#### 5.2. Why Are P_{d} and Cc Better than D in Practice?

P_{d} and Cc are better than D in practice. The reasons are as follows.

#### 5.2.1. P_{d} and Cc Have Precise Meanings in Comparison with D

Cc_{1} = Cc(x_{1}/x_{0} => y_{1}) indicates what percentage of the result y is due to x_{1} instead of x_{0}. For example, Table 6 shows that according to the virulence of the virus, COVID-19 will increase the mortality rate of vaccinated people from 1.3% to 1.318%. Therefore, the degree of causal confirmation is Cc_{1} = P_{d} = 0.014, which means that 1.4% of the deaths will be due to COVID-19. However, the meanings of D and Δ*P_{x}^{y} are not precise.

P_{d} and Cc indicate relative risk or the relative change of the outcome. Many people think COVID-19 is very dangerous because it can kill millions in a country. However, the mortality rate it brings is much lower than that caused by common reasons. P_{d} and Cc can reveal the relative change in the mortality rate (see Table 7). Although it is essential to reduce or delay deaths, it is also vital to decrease the economic loss due to the fierce fight against the pandemic. Therefore, P_{d} and Cc can help decision-makers balance between reducing or delaying deaths and reducing financial losses.

#### 5.2.2. The Confounder’s Effect Is Removed from P_{d} and Cc

P_{d} or Cc, we can eliminate Simpson’s Paradox and make the overall conclusion consistent with the grouping conclusion: treatment x_{2} is better than treatment x_{1}. In contrast, if we use D to compare the success rates of two treatments, although we can avoid Simpson’s Paradox, the conclusion is unreasonable. The reason is that we neglect the difficulties of treatments for different sizes of kidney stones. If a hospital only accepts patients who are easy to treat, its overall success rate must be high; however, such a hospital may not be a good one.

#### 5.2.3. P_{d} and Cc Allow Us to View the Third Factor, u, from Different Perspectives

P_{d} and Cc.

#### 5.3. Why Is It Better to Replace P_{d} with Cc?

When a cause restrains the outcome (R < 1), if we use P_{d} as the probability of causation, P_{d} can only take its lower limit 0. Although we can replace P_{d}(vaccinated => death) with P_{d}(unvaccinated => death) to ensure P_{d} > 0, it does not conform to our thinking habits to take being vaccinated as the default cause. In addition, Cc has cause symmetry, whereas P_{d} does not.

When we used P_{d} to compare two causes x_{1} and x_{2}, such as two treatments for kidney stones (see Section 4.1), we had to consider which of P(y_{1}|x_{2}) and P(y_{1}|x_{1}) was larger. However, using Cc, we did not need to consider that, because there is no need to worry about whether (R − 1)/R < 0.
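The contrast between the two measures can be made concrete with a minimal Python sketch, assuming the case rates listed in Table 6 (512.6 for the unvaccinated, 189.5 for the vaccinated; only their ratio matters). The helper names `p_d` and `cc` are illustrative and not part of the paper.

```python
def p_d(p1: float, p0: float) -> float:
    """P_d = max(0, (R - 1)/R) with R = p1/p0; clipped at 0 when the cause lowers risk."""
    return max(0.0, (p1 - p0) / p1)


def cc(p1: float, p0: float) -> float:
    """Cc = (R - 1)/max(R, 1) with R = p1/p0; ranges over [-1, 1]."""
    return (p1 - p0) / max(p1, p0)


# Case rates as listed in Table 6.
unvaccinated, vaccinated = 512.6, 189.5

# Cause: vaccinated (x1); baseline: unvaccinated (x0).
pd_vaccine = p_d(vaccinated, unvaccinated)
cc_vaccine = cc(vaccinated, unvaccinated)

# Swapping cause and baseline only flips the sign of Cc (cause symmetry).
cc_reverse = cc(unvaccinated, vaccinated)

print(pd_vaccine)            # P_d is stuck at its lower limit
print(round(cc_vaccine, 2))  # Cc reports the inhibition as a negative degree
print(round(cc_reverse, 2))
```

With Cc there is no need to decide in advance which rate is larger: the sign of the result carries that information.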

… Cc_{1} = Cc(x_{1} => y_{1}) and Cc_{0} = Cc(x_{0} => y_{0}).

#### 5.4. Necessity and Sufficiency in Causality

P_{d} and Cc only indicate the necessity of cause x to outcome y; they do not reflect the sufficiency of x or the inevitability of y. On the other hand, measure f = P(y|x) and Ce can indicate the outcome’s inevitability.

… x_{0} => y_{0} and x_{1} => y_{1} for the same purpose. However, OR_{N} has the normalizing property and symmetry.

#### 5.5. The Relationship between Bayesian Confirmation Measures b* and c*, and Causal Confirmation Measures Cc and Ce

… P(y_{1}|x) has been modified so that P(y_{1}|x) = P(y_{1}|do(x)). Causal confirmation measure Cc is equal in value to the channels’ confirmation measure b* [8], i.e.,

Cc(x_{1} => y_{1}) = [P(y_{1}|x_{1}) − P(y_{1}|x_{0})]/max(P(y_{1}|x_{1}), P(y_{1}|x_{0})) = b*(y_{1}→x_{1}).
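The link between this ratio form and Cc = (R − 1)/max(R, 1) can be spelled out in a short derivation. The shorthand p_1 = P(y_{1}|x_{1}) and p_0 = P(y_{1}|x_{0}) is introduced here only for brevity, and p_0 > 0 is assumed so that R = p_1/p_0 is defined.

```latex
\begin{align*}
Cc(x_1 \Rightarrow y_1)
  &= \frac{R - 1}{\max(R,\, 1)}
   = \frac{p_1/p_0 - 1}{\max(p_1/p_0,\, 1)}
   = \frac{p_1 - p_0}{\max(p_1,\, p_0)} \\
  &= \begin{cases}
       (p_1 - p_0)/p_1 = (R - 1)/R, & p_1 \ge p_0 \quad (R \ge 1),\\[2pt]
       (p_1 - p_0)/p_0 = R - 1,     & p_1 < p_0 \quad (R < 1).
     \end{cases}
\end{align*}
```

The first case coincides with P_{d}; the second is the branch that P_{d} truncates to 0 and that gives Cc its range down to −1.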

If x_{1} is the cause of y_{1}, then y_{1} is the evidence of x_{1}. For example, if COVID-19 infection is the cause of a positive test, then the positive test is the evidence of the infection.

Similarly, causal confirmation measure Ce is equal to the confirmation measure c*(x_{1}→y_{1}) in value, i.e.,

Ce(x_{1} => y_{1}) = [P(y_{1}|x_{1}) − P(y_{0}|x_{1})]/max(P(y_{1}|x_{1}), P(y_{0}|x_{1})) = c*(x_{1}→y_{1}).
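The difference between necessity (Cc) and inevitability (Ce) can be seen with a minimal Python sketch. It assumes a binary outcome, so that P(y_{0}|x_{1}) = 1 − P(y_{1}|x_{1}), and uses the "No counterexample" row of the comparison table (P(y_{1}|x_{1}) = 0.2, P(y_{1}|x_{0}) = 0). The function names are illustrative.

```python
def cc(p_y1_x1: float, p_y1_x0: float) -> float:
    """Cc(x1 => y1) = [P(y1|x1) - P(y1|x0)] / max(P(y1|x1), P(y1|x0))."""
    return (p_y1_x1 - p_y1_x0) / max(p_y1_x1, p_y1_x0)


def ce(p_y1_x1: float) -> float:
    """Ce(x1 => y1) = [P(y1|x1) - P(y0|x1)] / max(P(y1|x1), P(y0|x1)).

    Assumes a binary outcome, so P(y0|x1) = 1 - P(y1|x1).
    """
    p_y0_x1 = 1.0 - p_y1_x1
    return (p_y1_x1 - p_y0_x1) / max(p_y1_x1, p_y0_x1)


# "No counterexample" case: y1 never occurs without x1, but is rare even with x1.
p_y1_x1, p_y1_x0 = 0.2, 0.0

cc_value = cc(p_y1_x1, p_y1_x0)
ce_value = ce(p_y1_x1)

print(cc_value)            # x1 is fully necessary for y1
print(round(ce_value, 2))  # yet y1 is far from inevitable given x1
```

A cause can therefore be completely necessary (Cc = 1) while the outcome remains unlikely, which is why the two measures answer different questions.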

## 6. Conclusions

Fitelson used the confirmation measure D(x_{1}, y_{1}) = P(y_{1}|x_{1}) − P(y_{1}) to denote the supporting strength of the evidence to the consequence and extended this measure for causal confirmation without considering the confounder. This paper has shown that measure D is incompatible with the ECIT and popular risk measures, such as P_{d} = max(0, (R − 1)/R). Using D, one can only avoid Simpson’s Paradox; one cannot eliminate it or provide a reasonable explanation as the ECIT does.

The ECIT uses P_{d} as the probability of causation. P_{d} is better than D, but it is improper to call P_{d} a probability measure and use the probability measure to measure causation. If we use P_{d} as a causal confirmation measure, it lacks the normalizing property and symmetry that an ideal confirmation measure should have.

This paper has deduced the causal confirmation measure Cc(x_{1} => y_{1}) = (R − 1)/max(R, 1) by the semantic information method with the minimum cross-entropy criterion. Cc is similar to the inductive school’s confirmation measure b* proposed by the author earlier. However, the positive examples’ proportion P(y_{1}|x_{1}) and the counterexamples’ proportion P(y_{1}|x_{0}) are replaced with P(y_{1}|do(x_{1})) and P(y_{1}|do(x_{0})) so that Cc is an improved P_{d}. Compared with P_{d}, Cc has the normalizing property (it varies between −1 and 1) and cause symmetry (Cc(x_{0}/x_{1} => y_{1}) = −Cc(x_{1}/x_{0} => y_{1})). Since Cc may be negative, it is also suitable for evaluating the inhibition relationship between cause and outcome, such as between vaccine and infection.

The examples (kidney stone treatments and COVID-19) show that P_{d} and Cc are more reasonable and meaningful than D, and Cc is better than P_{d} mainly because Cc may be less than zero. In addition, this paper has also provided a causal confirmation measure Ce(x_{1} => y_{1}) that indicates the inevitability of the outcome y_{1}.


## Appendix A. Data and Calculations for Comparing the CFRs of Non-Hispanic Whites and Other People in the USA

## References

1. Rubin, D. Causal inference using potential outcomes. J. Amer. Statist. Assoc. **2005**, 100, 322–331.
2. Hernán, M.A.; Robins, J.M. Causal Inference: What If; Chapman & Hall/CRC: Boca Raton, FL, USA, 2020.
3. Pearl, J. Causal inference in statistics: An overview. Stat. Surv. **2009**, 3, 96–146.
4. Geffner, H.; Dechter, R.; Halpern, J.Y. (Eds.) Probabilistic and Causal Inference: The Works of Judea Pearl; Association for Computing Machinery: New York, NY, USA, 2021.
5. Fitelson, B. Confirmation, Causation, and Simpson’s Paradox. Episteme **2017**, 14, 297–309.
6. Carnap, R. Logical Foundations of Probability, 2nd ed.; University of Chicago Press: Chicago, IL, USA, 1962.
7. Kemeny, J.; Oppenheim, P. Degrees of factual support. Philos. Sci. **1952**, 19, 307–324.
8. Lu, C. Channels’ Confirmation and Predictions’ Confirmation: From the Medical Test to the Raven Paradox. Entropy **2020**, 22, 384.
9. Greco, S.; Slowiński, R.; Szczęch, I. Properties of rule interestingness measures and alternative approaches to normalization of measures. Inf. Sci. **2012**, 216, 1–16.
10. Crupi, V.; Tentori, K.; Gonzalez, M. On Bayesian measures of evidential support: Theoretical and empirical issues. Philos. Sci. **2007**, 74, 229–252.
11. Eells, E.; Fitelson, B. Symmetries and asymmetries in evidential support. Philos. Stud. **2002**, 107, 129–142.
12. Relative Risk, Wikipedia the Free Encyclopedia. Available online: https://en.wikipedia.org/wiki/Relative_risk (accessed on 15 August 2022).
13. Robins, J.; Greenland, S. The probability of causation under a stochastic model for individual risk. Biometrics **1989**, 45, 1125–1138.
14. Simpson, E.H. The interpretation of interaction in contingency tables. J. R. Stat. Soc. Ser. B **1951**, 13, 238–241.
15. Simpson’s Paradox. Wikipedia the Free Encyclopedia. Available online: https://en.wikipedia.org/wiki/Simpson%27s_paradox (accessed on 20 August 2022).
16. Charig, C.R.; Webb, D.R.; Payne, S.R.; Wickham, J.E. Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy. Br. Med. J. (Clin. Res. Ed.) **1986**, 292, 879–882.
17. Julious, S.A.; Mullee, M.A. Confounding and Simpson’s paradox. BMJ **1994**, 309, 1480–1481.
18. Pedagogy, W. Simpson’s Paradox. Available online: https://weapedagogy.wordpress.com/2020/01/15/5-simpsons-paradox/ (accessed on 21 August 2022).
19. Mackenzie, D. Race, COVID Mortality, and Simpson’s Paradox. Available online: http://causality.cs.ucla.edu/blog/index.php/category/simpsons-paradox/ (accessed on 22 August 2022).
20. Kügelgen, J.V.; Gresele, L.; Schölkopf, B. Simpson’s Paradox in COVID-19 case fatality rates: A mediation analysis of age-related causal effects. IEEE Trans. Artif. Intell. **2021**, 2, 18–27.
21. Mortimer, H. The Logic of Induction; Prentice Hall: Paramus, NJ, USA, 1988.
22. Horwich, P. Probability and Evidence; Cambridge University Press: Cambridge, UK, 1982.
23. Christensen, D. Measuring confirmation. J. Philos. **1999**, 96, 437–461.
24. Nozick, R. Philosophical Explanations; Clarendon: Oxford, UK, 1981.
25. Good, I.J. The best explicatum for weight of evidence. J. Stat. Comput. Simul. **1984**, 19, 294–299.
26. Lu, C. A generalization of Shannon’s information theory. Int. J. Gen. Syst. **1999**, 28, 453–490.
27. Lu, C. Semantic Information G Theory and Logical Bayesian Inference for Machine Learning. Information **2019**, 10, 261.
28. Lu, C. The P–T Probability Framework for Semantic Communication, Falsification, Confirmation, and Bayesian Reasoning. Philosophies **2020**, 5, 25.
29. Zadeh, L.A. Fuzzy Sets. Inf. Control **1965**, 8, 338–353.
30. Zadeh, L.A. Probability measures of fuzzy events. J. Math. Anal. Appl. **1968**, 23, 421–427.
31. Rooij, R.V.; Schulz, K. Conditionals, causality and conditional probability. J. Log. Lang. Inf. **2019**, 28, 55–71.
32. Over, D.E.; Hadjichristidis, C.; Evans, J.S.B.T.; Handley, D.J.; Sloman, S.A. The probability of causal conditionals. Cogn. Psychol. **2007**, 54, 62–97.
33. Demographic Trends of COVID-19 Cases and Deaths in the US Reported to CDC. The Website of the US CDC. Available online: https://covid.cdc.gov/covid-data-tracker/#demographics (accessed on 10 September 2022).
34. Rates of COVID-19 Cases and Deaths by Vaccination Status. The Website of the US CDC. Available online: https://covid.cdc.gov/covid-data-tracker/#rates-by-vaccine-status (accessed on 8 September 2022).

**Figure 1.** Illustrating Simpson’s Paradox. In each group, the success rate of x_{2}, P(y_{1}|x_{2}, g), is higher than that of x_{1}, P(y_{1}|x_{1}, g); however, using the method of finding the center of gravity, we can see that the overall success rate of x_{2}, P(y_{1}|x_{2}) = 0.65, is lower than that of x_{1}, P(y_{1}|x_{1}) = 0.7.

**Figure 3.** Eliminating Simpson’s Paradox as the confounder exists by modifying the weighting coefficients. After replacing P(g_{k}|x_{i}) with P(g_{k}) (k = 1,2; i = 1,2), the overall conclusion is consistent with the grouping conclusion; the average success rate of x_{2}, P(y_{1}|do(x_{2})) = 0.7, is higher than that of x_{1}, P(y_{1}|do(x_{1})) = 0.65.

**Figure 4.** The truth function of s_{1} includes a believable part with proportion b_{1} and a disbelievable part with proportion b_{1}′.

| | P(y_{1}\|x_{1}) | P(y_{1}\|x_{0}) | P_{d} | Δ*P_{x}^{y} | Comparison |
|---|---|---|---|---|---|
| No big difference | 0.9 | 0.8 | 0.11 | 0.5 | P_{d} << Δ*P_{x}^{y} |
| No counterexample | 0.2 | 0 | 1 | 0.2 | P_{d} >> Δ*P_{x}^{y} |

| | T(s\|x_{0}) | T(s\|x_{1}) |
|---|---|---|
| s_{0} = “x_{0} => y_{0}” | 1 | b_{0}′ |
| s_{1} = “x_{1} => y_{1}” | b_{1}′ | 1 |

| | Treat. x_{1} | Treat. x_{2} | Number | P(g) or Cc |
|---|---|---|---|---|
| Small stones (g_{1}) | 87%/270 | 93%/87 * | 357 | 0.51 |
| Large stones (g_{2}) | 69%/80 | 73%/263 | 343 | 0.49 |
| Overall | 83%/350 | 78%/350 | 700 | |
| P(y_{1}\|x) | 0.83 | 0.78 | | [P(y_{1}\|x_{2}) − P(y_{1}\|x_{1})]/P(y_{1}\|x_{2}) = −0.064 |
| P(y_{1}\|do(x)) | 0.78 | 0.83 | | Cc_{1} = Cc(x_{2}/x_{1} => y_{1}) = 0.06 |
| P(y_{0}\|do(x)) | 0.22 | 0.17 | | Cc_{0} = Cc(x_{1}/x_{2} => y_{0}) = 0.23 |

**Table 4.** The CFRs of COVID-19 of non-Hispanic white (x_{1}) and other people (x_{2}) from different age groups.

| Age Group (g) | P(x_{1}\|g) (%) | P(g) | P(y_{1}\|x_{1}, g) | P(g\|x_{1}) | P(y_{1}\|x_{2}, g) | P(g\|x_{2}) |
|---|---|---|---|---|---|---|
| 0–4 Years | 44.200 | 0.041 | 0.0002 | 0.0349 | 0.0002 | 0.0480 |
| 5–11 Years | 44.200 | 0.078 | 0.0001 | 0.0659 | 0.0001 | 0.0907 |
| 12–15 Years | 46.300 | 0.052 | 0.0001 | 0.0458 | 0.0001 | 0.0578 |
| 16–17 Years | 48.700 | 0.029 | 0.0001 | 0.0268 | 0.0002 | 0.0307 |
| 18–29 Years | 48.700 | 0.223 | 0.0004 | 0.2081 | 0.0006 | 0.2388 |
| 30–39 Years | 49.300 | 0.178 | 0.0011 | 0.1681 | 0.0019 | 0.1883 |
| 40–49 Years | 51.000 | 0.146 | 0.0030 | 0.1427 | 0.0048 | 0.1493 |
| 50–64 Years | 59.100 | 0.163 | 0.0102 | 0.1843 | 0.0144 | 0.1389 |
| 65–74 Years | 67.300 | 0.055 | 0.0333 | 0.0704 | 0.0457 | 0.0373 |
| 75–84 Years | 72.900 | 0.025 | 0.0762 | 0.0356 | 0.0938 | 0.0144 |
| 85+ Years | 76.300 | 0.012 | 0.1606 | 0.0173 | 0.1751 | 0.0059 |
| Sum | | 1 | | 1 | | 1 |

| | The CFR of Non-Hispanic Whites (x_{1}) (%) | The CFR of Other People (x_{2}) (%) | Risk Measure * |
|---|---|---|---|
| P(y_{1}\|x) | 1.04 | 0.73 | P_{d} = (R − 1)/R = 0.30 |
| P(y_{1}\|do(x)) | 0.80 | 1.05 | P_{d} = 0; Cc(x_{1}/x_{2} => y_{1}) = −0.28 |

* R = P(y_{1}|x_{1})/P(y_{1}|x_{2}).

**Table 6.** The negative degrees of causal confirmation for assessing how the vaccine affects infections and deaths.

| | Unvaccinated (x_{0}) | Vaccinated (x_{1}) | Cc |
|---|---|---|---|
| Cases | 512.6 | 189.5 | Cc(x_{1}/x_{0} => y_{1}) = −0.63 |
| Deaths | 1.89 | 0.34 | Cc(x_{1}/x_{0} => y_{1}) = −0.79 |
| Mortality rate | 0.001 | 0.00018 | |

| Mortality Rate P(y_{1}\|x) | | Unvaccinated | Vaccinated |
|---|---|---|---|
| x_{0}: common reasons | P(y_{1}\|x_{0}) | 0.013 | 0.013 |
| x_{1}: x_{0} plus COVID-19 | P(y_{1}\|x_{1}) | 0.014 | 0.01318 |
| Cc_{1} = Cc(x_{1}/x_{0} => y_{1}) | | 0.07 | 0.014 |


© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Lu, C. Causal Confirmation Measures: From Simpson’s Paradox to COVID-19. *Entropy* **2023**, *25*, 143.
https://doi.org/10.3390/e25010143
