Law of Total Probability in Quantum Theory and Its Application in Wigner’s Friend Scenario

It is well-known that the law of total probability does not generally hold in quantum theory. However, recent arguments on some of the fundamental assumptions in quantum theory based on the extended Wigner’s friend scenario show a need to clarify how the law of total probability should be formulated in quantum theory and under what conditions it still holds. In this work, the definition of conditional probability in quantum theory is extended to POVM measurements. A rule to assign two-time conditional probability is proposed for incompatible POVM operators, which leads to a more general and precise formulation of the law of total probability. Sufficient conditions under which the law of total probability holds are identified. Applying the theory developed here to analyze several quantum no-go theorems related to the extended Wigner’s friend scenario reveals logical loopholes in these no-go theorems. The loopholes exist as a consequence of taking for granted the validity of the law of total probability without verifying the sufficient conditions. Consequently, the contradictions in these no-go theorems only reconfirm the invalidity of the law of total probability in quantum theory rather than invalidating the physical statements that the no-go theorems attempt to refute.


Introduction
In his seminal paper on the path integral formulation of quantum mechanics [1], Feynman started the introduction of his new theory by pointing out that the law of total probability in classical probability theory must be replaced by a new form of rule. Specifically, in a slightly different notation, the classical law of probability, p(c|a) = ∑ b p(c|b)p(b|a) where p(y|x) is the probability of obtaining measurement result y given measurement result x, is no longer true in quantum theory and must be replaced by ϕ(c|a) = ∑ b ϕ(c|b)ϕ(b|a), where ϕ is a complex number called probability amplitude and related to classical probability by Born's rule p(y|x) = |ϕ(y|x)| 2 . From this key idea, Feynman continued to expand the theory that led to the path integral formulation of quantum mechanics. He also discussed when the new rule of summation over probability amplitude can fall back to the classical law of probability. This is when one "attempts to perform" intermediate measurements that obtain results of all b. In modern terms, what Feynman means by "attempting to perform measurement" can be understood as the decoherence phenomenon [2].
The above example shows that it has been long known that the law of total probability cannot be taken for granted in quantum theory. Indeed, many other classical probability rules are only upheld in specific conditions. For instance, a joint probability can be definitely assigned only when the two measurement operators are commutative [3][4][5][6]. There are many variants of definitions of the conditional probability in quantum theory (for a review, see [7]). However, a family of no-go theorems recently published [8][9][10][11] appears to rely on the total law of probability one way or another without considering the sufficient conditions. These no-go theorems are related to the extensively discussed Wigner's friend experiments. In quantum mechanics, the Wigner's friend [12,13] thought experiment has been widely discussed, as it tests the validity of many quantum interpretation theories. The significance of such experiments is that Wigner and his friend give two different descriptions of the same physical process happening inside the lab. Deutsch further extended the thought experiment to be applicable to macroscopic system such as the lab system [14] itself. Based on that, a more sophisticated extended Wigner's friend experiment is put forwarded by Brukner [8,15]. Such an experimental setup involves two remotely separated labs. Each lab contains half of an entangled pair of spins and a local observer. Outside each lab there is a super-observer who can choose to perform different types of measurements on the lab as a whole. The intention of such an experimental setup is to prove, through a no-go theorem, that measured facts are observer-dependent in quantum theory. A subsequent experiment [16] has been carried out to confirm the inequality developed in [8]. A stronger version of the no-go theorem is further proposed for reaching a similar conclusion [9]. The statement that measured facts are observer-dependent was considered important for the quantum foundation and deserved rigorous theoretical proving and experimental testing. However, proving the no-go theorems by taking the law of total probability for granted casts doubt on their theoretical rigorousness.
The fact that there is still ambiguity in using the total law of probability in quantum theory-though it has long been recognized as not being upheld in quantum mechanicsshows the need to provide a rigorous formulation of the law of total probability in quantum theory and to clarify under what conditions it holds true. This is indeed the motivation behind the present work. Formulation of the law of total probability depends on a clear definition of conditional probability in quantum theory. There is already extensive research on how conditional probability is defined [7,[17][18][19][20][21][22][23][24][25]. However, these formulations are either based on projection measurements or only consider simultaneous measurements with commutative operators. In this work, I extend a two-time conditional probability formulation from projection measurement to more generic POVM measurements. Generalization for POVM measurement is needed because some of the no-go theorems choose POVM operators in their proofs. I then give several sufficient conditions for the law of total probability to become true. The theory is applied to analyze several no-go theorems related to the extended Wigner's friend scenario. Logical loopholes are shown in these no-go theorems because their proofs rely on the law of total probability one way or another, but the conditions to validate the law are not met. Thus, these no-go theorems do not really prove the results they expect, such as "measured facts are observer-dependent". Instead, they just indirectly confirm that the law of total probability does not not hold in quantum theory.
It is worth mentioning that other concerns regarding these no-go theorems have already been pointed out [26,27]. In particular, only when a measurement is completed should a probability distribution be assigned. Assigning probability distribution for premeasurement without results leads to contradiction [26]. The analysis in this work will go one step further by showing that even assigning a probability distribution for completed measurements still leaves logical loopholes in the no-go theorem. This is because the law of total probability that the proofs rely on does not hold true with the specific measurement operators and initial quantum state being chosen. Lastly, it is important to emphasize that I do not take a stand on the assertions of the no-go theorems themselves. For instance, it could still be a valid statement that "measured facts are observer-dependent". What I only show here is that there are logical loopholes in the proof of the no-go theorems.
In summary, this paper extends the formulation of conditional probability to generic POVM measurements and clarifies the conditions under which the law of total probability can be valid in quantum theory. Applying the theory developed in this work to the extended Wigner friend scenario reveals logical loopholes in several no-go theorems that take for granted the validity of the law of total probability. The contradictions in these no-go theorems only reconfirm the invalidity of the law of total probability in quantum theorem rather than invalidating the physical statements that the no-go theorems are intended to refute, such as "measured facts are independent of the observer". I hope the results presented here inspire further research to find more convincing proof and experimental testing. This is important because the implications of the extended Wigner's friend scenario are conceptually fundamental in quantum theory.

The Law of Total Probability in Quantum Theory
First, I briefly review classical probability theory. Suppose there are two random variables X and Y. Without loss of generality, I assume X and Y are discrete random variables. Measuring X (or Y) will obtain one of the values in {a i : i = 1, 2, 3 . . . } (or in {b j : j = 1, 2, 3 . . . }), which is finite or countable infinite. Denote the joint probability of measuring X with result X = a i , measuring Y with result Y = b j as p(a i , b j ), and the conditional probability of obtaining X = a i given that Y = b j as p(a i |b j ). They are related by the following axioms: where p(a i ) is the marginal probability of measuring X with result X = a i , and similarly for p(b j ). Axiom (3) ensures the joint probability is defined uniquely regardless if it is defined by (1) or (2). We explicitly call out (3) since it is not always true in quantum theory. The law of total probability can be derived (Axioms (1)- (3) give p(b j |a i )p(a i ) = p(a i |b j )p(b j ), which is Bayes' law. Summing over i on both sides and using identity ∑ i p(a i |b j )p(b j ) = p(b j ), one obtains (4)) from axioms (1)-(3), expressed as following, What I want to investigate here is how the equivalent version of (4) in quantum theory can be formulated.
To start with, I need to examine how conditional probability is constructed in quantum theory. The subtlety of constructing conditional probability in quantum theory has been investigated long ago. G. Bobo gives an extensive review and discussion [7]. The generally accepted formulation of conditional probability in quantum theory is provided by Lüders rule [18], where the measurements are associated with projection operators. Lüders rule is based on Gleason's theorem, which mathematically justifies Born's rule. Here I wish to follow a similar approach to generalize the formulation for conditional probability when the measurements are associated with POVM operators.
Mathematical proofs for generalizing Gleason's theorem to POVM measurements are given by [28,29], which is our starting point. Suppose a quantum system S is prepared such that its state is described by density operator ρ. S could be a composite system, which I will discuss later. Let A = {A i } be a POVM for S. The probability of measurement with element A i resulting in value a i is [28,29].
p(a i |ρ) = Tr(ρA i ), (5) and the post-measurement density operators ρ i are given by [4] Let B = {B j } be another POVM for S. Given post measurement state ρ i , the probability of measurement with element B j resulting in value b j is, by recursively applying (5), p(b j |ρ i ) = Tr(ρ i B j ). Substituting the expression for ρ i in (6), I obtain the conditional probability There is an underlying assumption in this definition that the probability is assigned only after the measurements are completed. In particular, the first POVM measurement A i must be completed in order to be qualified as a condition. We strictly follow this assumption as opposed to assigning a probability with only "pre-measurement". Pre-measurement refers only to the unitary process that entangles the measured system and measuring apparatus [30] but without the projection process to single out a particular outcome. Given the same initial state ρ, if I swap the order of measurements such that B j goes first, followed by A i , I obtain a conditional probability Note p(a i |ρ) = Tr(ρA i ) and p(b i |ρ) = Tr(ρB j ); Equations (7) and (8) can be rewritten as Equations (9) and (10) are not necessarily equal, which indicate that the quantum version of Bayes' theorem, does not hold in general in quantum theory. This posts a difficulty to define a joint probability as either p( depends on the order of measurement events. Another consequence is that the laws of total probability, i.e., the quantum version of (4) does not hold in general either. This is because from (9), , while p(b j |ρ) = Tr(ρB j ), and these are not equal in general (Note that on the other hand, given (7) and the completeness of POVM elements, ∑ i A i = I, where I is the identity operator, it is straightforward to verify that ∑ j p(b j |a i , ρ)p(a i |ρ) = p(a i |ρ)). We are interested in finding the conditions under which (12) becomes true.
It is well-known that when [A i , B j ] = 0, i.e., A i and B j commute, from (7) and (8), one gets p(b j |a i , ρ)p(a i |ρ) = p(a i |b j , ρ)p(b j |ρ) = Tr(ρA i B j ). Consequently, the law of total probability (12) becomes true and a joint probability can be well-defined. However, the situation becomes much complicated when [A i , B j ] = 0.
Strictly speaking, due to the uncertainty principle, when A i and B j are non-commutative, the two measurements cannot be performed to obtain definite outcomes at the same time. The conditional probability defined in (7) or (8) needs to be extended to a twotime formulation of conditional probability in order to be applicable when [A i , B j ] = 0. There is extensive research on how to construct two-time conditional probability in quantum theory [7,[17][18][19][20][21][22][23][24][25]. One noticeable approach is based on the Page-Wootters timeless formulation [21][22][23][24][25]. However, this work will continue to be based on the generalized Gleason theorem for POVM [28,29] to derive the two-time conditional probability, and will leave discussion of the Page-Wootters mechanism for Section IV.
For conceptual clarity, I start the analysis by considering that there is finite nonzero duration for each measurement. After I construct the conditional probability formulation, for practical purpose of calculation, I can approximate the measurement duration to zero. Suppose the first measurement starts at t − a and completes at t + a . Here t + a − t − a covers the time duration for both the pre-measurement unitary phase that entangles the measured system and the measuring apparatus, and the projection phase. The measurement process (Theorem 5.2 of [4] gives a detailed account on how this POVM measurement is physically realized through indirect measurement) is represented by a POVM element A i associated with outcome a i . Similarly, the second measurement starts at t − b and completes at t + b . Between t + a and t − b there is a free time evolution for the measured system S, described by operator U(∆t Since it is only meaningful to assign a probability distribution after a measurement is completed, the two-time conditional probability I want to construct is "given the measurement outcome of Mathematically, this two-time conditional probability can be written is the initial density operator of S when the first measurement starts. After the first measurement with POVM element A i , the postmeasurement state is For practical purposes of calculation, I can assume the measurement duration is very small compared to the free evolution time, i.e., (t + . Suppose the two POVM elements A i and B j are projection measurements, A i = |φ i φ i | and B j = |ϕ j ϕ j |; one can verify that the conditional probability defined in (13) gives the correct transition probability in standard quantum mechanics: However, Equation (13) is more generic as it is defined with general POVM operators. Note that the denominator in (13) Tr(ρ 0 A i ) = p(a i at t 1 |ρ 0 ); Equation (13) can be rewritten as To analyze the two-time version of the total law of probability, which can be expressed as I consider a series of two-time measurements {A i at t a , B at t b , i = 1 . . . N} on N copies of measured system S with the same initial state ρ 0 . Each two-time measurement consists a first measurement from one possible POVM element from the complete set {A i , i = 1 . . . N} at time t a and the same second measurement B j at time However, by definition, p(b j at t 2 |ρ 0 ) = Tr(B j Uρ 0 U † ). We can see (16) is not true in general. The Theorem next attempts to address the question of under what conditions (16) is valid.
Theorem 1. Let ρ 0 be the density operator for a quantum system S before the measurements. Let A i and B j be two POVM elements to measure S at time t a and t b , respectively, and U(t b , t a ) is the unitary time evolution operator from t a to t b . Select t 1 and t 2 such that t a < t 1 < t b < t 2 . The law of total probability (16) is true if one of the following conditions is met.
The proof of Theorem 1 is in Appendix A, but a few comments are in order here. First, The sequence of operations for U † B j U A i means performing measurement A i at t a , time evolving the post-measurement state from t a to t b , performing measurement B j at t b , and reversing time evolution of the postmeasurement state back to t a . The sequence of operations A i U † B j U means time evolving the state from t a to t b , performing measurement B j at t b , then reversing time evolution of the state back to t a , and performing measurement A i at t a . Condition C1 says that if these two sequences of operations are equivalent, then the law of total probability (16) holds true.
Second, if the post-measurement state ρ i (t a ) after the first measurement does not change during free time evolution, such as the case of a spin state in free space, one will Then, Equation (13) can be written as Equation (18) appears the same as (7), but the precise meaning is different in that the two measurements A i and B j in (18) are taken at two different times. With such a special post-measurement quantum state, the sufficient conditions in Theorem 1 become A couple of comments are in order before closing this section. First, when two measurement operations are not commutative, the conditional probability needs to be defined in the two-time formulation. Second, I can give an intuitive explanation of why (16) does not hold in general in quantum theory. As shown in (17), the right-hand side of (16) refers to the summation of traces of multiplication of operators from a series of experiments where two measurements are carried out in a sequence. In the case of a special postmeasurement state where (18) Measurement of A i changes the initial quantum state such that it affects the probability of outcome for a subsequent measurement B j . However, the term on the left-hand side of (16) refers to the probability of an experiment where only measurement B j is carried out with the same initial quantum state. There is no reason to assume both sides are equal. Equation (16) holds only in special conditions such as those specified in Theorem 1.
The conclusion here is that one should not take for granted that the law of total probability holds true in general. Instead, sufficient conditions, such as those provided in Theorem 1, need to be clearly called out. Failing to do so may leave a loophole in logical deduction when applying the law of total probability.

Application to Composite Systems
In this subsection, I will apply the conditional probability definition to composite quantum systems and reexamine Theorem 1 when measuring composite systems. Suppose the measured system S consists of two subsystems S 1 and S 2 that are space-like separated. Define A i = P i ⊗ I 2 , where P i = |φ i φ i | is a local POVM element on subsystem S 1 , and I 2 is an identity operator on subsystem S 2 . Similarly, define B j = I 1 ⊗ Q j , where Q j is a local POVM element on subsystem S 2 . By the principle of locality, a local measurement on a subsystem should not impact the other remote subsystem. Therefore, [A i , B j ] = 0. For measurement outcomes of two such local measurements, Equations (7) and (8) are correct formulations for conditional probability; the joint probability is well-defined. Consequently, Equations (11) and (12) hold true. There is no need to use the two-time formulation of conditional probability. This is the case for typical Bell tests and has been used to derive the Bell-CHSH inequalities (On the other hand, in the derivation of Bell-CHSH inequalities, identity (1) is further expressed as where λ is a hidden variable. This is known as the outcome independence assumption [31,32]). However, suppose B j = Q j ⊗ I 2 , where Q j is another local POVM element on subsystem S 1 , and [P i , Q j ] = 0. In this case, Equation (7) is incorrect for conditional probability. The two-time conditional probability formulation is needed and can be calculated as where U(∆t) = U S 1 (∆t) ⊗ U S 2 (∆t). Next, I wish to apply the two-time conditional probability to the extended Wigner's friend (EWF) scenario introduced in [8]. As shown in Figure 1, the EWF scenario consists of two space-like separated laboratories L 1 and L 2 . Each laboratory contains half of an entangled pair of systems s 1 and s 2 . L 1 also contains a friend Charlie who can perform measurements on s 1 . Outside L 1 there is a super-observer Alice who can perform different types of measurements on L 1 as a whole. Similarly, there is a friend Debbie in L 2 and a superobserver Bob outside Then, the conditional probability for t a < t 1 < t b < t 2 is given by p(ab at t 2 |cd at t 1 , ρ, xy) where U = U L 1 ⊗ U L 2 is the time evolution operator from t a to t b . The law of total probability I am interested in is p(ab at t 2 |ρ, xy) = ∑ cd p(ab at t 2 |cd at t 1 , ρ, xy)p(cd at t 1 |ρ, xy).
From (21), the R.H.S. of (22) becomes ∑ cd p(ab at t 2 |cd at t 1 , ρ, xy)p(cd at t 1 |ρ, xy) The summation is over POVM element sets for {C} and {D}. Since [A ⊗ I L 2 , I L 1 ⊗ B] = 0, the L.H.S. of (22) is p(ab at t 2 |ρ, xy) = Tr(ρU † ABU). Both sides are not equal in general. In the case that the post-measurement state after the first measurement is unchanged during free time evolution (this is indeed the assumption in the no-go theorems I will analyze in the next section), Equation (21) becomes Equation (23) is simplified to ∑ cd p(ab at t 2 |cd at t 1 , ρ, xy)p(cd at t 1 |ρ, xy) and p(ab at t 2 |ρ, xy) = Tr(ρAB). In this case, one can derive the following corollary based on Theorem 1.

Corollary 1.
In the Extended Wigner's Friend scenario setup, suppose the post-measurement state is unchanged during free time evolution from t a to t b . Select t 1 and t 2 such that t a < t 1 < t b < t 2 .
The law of total probability (22) is true if one of the following conditions is met.
Condition C4 is quite obvious. Proof of condition (26) is given in Appendix B.

Logical Loopholes in No-Go Theorems Related to the Wigner's Friend Scenario
A no-go theorem usually starts from the conventional probability theory, which is widely regarded as the true representation of logical deduction, and assumes certain additional plausible physical premises: realism, locality, no superdeterminism, observer independence, etc. One then shows that such a model leads to prediction, which is contradicted by quantum mechanics. Hence, one concludes that at least one of the assumptions or the rules of conventional probability must be violated by quantum mechanics. Let us denote the physical assumption that a no-go theorem tries to prove to be violated by quantum theory as W. For instance, W could be "measured facts are observer independent". The no-go theorem may be constructed independent of the underlying physical theory. But if the logical deduction in the proof of theorem utilizes the law of total probability in one of the forms of (12), (16), or (22) without calling out the appropriate sufficient condition C, then I know the resulting statement (could be in the form of an inequality) will not hold in quantum theory. This leaves a loophole in the logical deduction. Because the contradiction shown in the no-go theorem could be just due to the fact that C is not met in the experiment setup instead of the intended conclusion that W is violated by quantum theory. Thus, the no-go theorem does not reach the conclusion as desired. I will examine several such no-go theorems in this section (I do not include the no-go theorem [33] widely discussed in the literature since its proof does not invoke the law of total probability.)

A Strong No-Go Theorem on the Wigner's Friend Paradox
Bong et al. introduce a no-go theorem that if one assumes that quantum mechanics is applicable to the scale of an observer, then one of the three assumptions, 'Locality', 'No-superdeterminism', or 'Absoluteness of Observed Events (AOE)' must be false [9]. Here AOE means that "every observed event exists absolutely, not relative to anything or anyone". The no-go theorem is supposed to be independent of underlying physical theory and is proved in the context of extended Wigner's friend (EWF) scenario [8], as shown in Figure 1. The measurement results from the friend in the lab can be correlated with the super-observer's subsequent measurement results. Suppose the measurement outcomes from Alice, Bob, Charlie, and Debbie are a, b, c, d, respectively. Alice can have three different measurement settings, labeled by parameter x ∈ {1, 2, 3}. When x = 1, Alice opens L 1 and asks Charlie's measurement outcome, while when x = 2, 3, Alice performs a different measurement on L 1 . Similar measurement settings for Bob are labeled as y ∈ {1, 2, 3}.
Among the three assumptions, it is well accepted that 'Locality' and 'No-superdeterminism' cannot be violated by any physical theory. The focus is on the assumption of AOE, which is defined mathematically as following [9]. There exists a joint probability distribution p(abcd|xy) such that i p(ab|xy) = ∑ cd p(abcd|xy)∀a, b, x, y ii p(a|cd, x = 1, y) = δ a,c ∀a, c, d, y iii p(b|cd, x, y = 1) = δ b,d ∀b, c, d, x.
With the three assumptions, [9] derives a number of inequalities and experimentally confirms that quantum theory violates these inequalities when proper measurement settings x and y and the initial quantum state are chosen. Therefore, the AOE assumption should be refuted.
However, as discussed in Section 2, the law of total probability does not hold true in quantum theory unless a certain condition is met. We can apply Corollary 1 to analyze the validity of (27). In Appendix C, I show that the operators chosen in [9] are not all commutative with each other. Specifically, [A, C] = 0 and [B, D] = 0, while [C ⊗ I L 2 , I L 1 ⊗ D] = 0 and [A ⊗ I L 2 , I L 1 ⊗ B] = 0. With these choices of operators, the corresponding two-time version of the law of total probability is given by (22) and (25). However, I already see condition C4 is not satisfied. The choice of initial quantum state, i.e., Equation (1) in [9] and the forms of operator A, B, C, D do not satisfy conditions C5 in Corollary 1 either. Therefore, in general, (27) does not hold with the conditions specified in [9]. Inequalities derived based on (27) will be violated by quantum theory with the choice of initial quantum state and measurement operators described in [9]. However, this raises the question of exactly what the no-go theorem refutes. I agree with the authors that violation of the inequalities by quantum theory points to the validity of AOE in quantum theory. However, the definition of AOE and the derivation of the theorem implicitly assume the validity of the law of total probability. The root cause of the violation of the inequalities is due to the fact that the experimental setup does not satisfy the conditions to make the law of total probability hold true, not because of the AOE statement that "an observed event is not relative to anything or anyone". One may argue that the AOE statement is equivalent to invalidity of the law of total probability. However, as discussed earlier, the invalidity of the law of total probability is due to the fact that measurement C (or D) alters the initial quantum state that impacts the probability of outcome for measurement A (or B) since [A, C] = 0 (or [B, D] = 0). There is a logic gap to equate this reason with the statement "an observed event is not relative to anything or anyone". Therefore, it appears that the violation of inequalities in [9] just reconfirms the consequence of non-commutative measurements and, therefore, the invalidity of the law of total probability in quantum theory, rather than confirming the invalidity of the AOE statement.

A No-Go Theorem for Observer-Independent Facts
The no-go theorem for observer-independent facts by Brukner [8] was actually introduced earlier than [9] in a similar effort to prove that measured facts are observer-dependent. The experimental setup is shown in Figure 1. In [8], there are only two different measurement setups either Alice or Bob will perform, compared to three different measurement setups in [9]. Furthermore, the way the no-go theorem is proven is different. The proof in [8] leverages the well-known Bell-CHSH inequality, and thus inherits all the assumptions associated with the Bell-CHSH theorem, while the no-go theorem in [9] is proven independent of the Bell-CHSH theorem. As I will explain next, the proof in [8] is more subtle, as it carefully chooses a quantum state and a set of measurement operators such that the law of total probability holds true if only considering Alice's (or Bob's) measurements.
The initial wave function is chosen such that after Charlie and Debbie each perform a measurement of their respective half of entangled spin along the z axis, from Alice's or Bob's perspective, it becomes (see Equations (5)-(9) in [8]): where |00 L 2 represents that s 1 is in the spin up state and Charlie's pointer variable is associated with the up state, and |11 L 1 corresponds to the spin down state for s 1 and Charlie's pointer variable. There are similar meanings for |00 L 2 and |11 L 2 for s 2 and Debbie's pointer variable. The key point of [8] is to assume there exists a joint probability p(a 1 a 2 b 1 b 2 ), where a 1 , a 2 ∈ {0, 1} are the measurement results corresponding to Alice's choice of two types of measurement operations, and b 1 , b 2 ∈ {0, 1} are the measurement results of Bob. Alice can choose to either measure L 1 with projection operator A 1 or with projection operator A 2 . Here A 1 is represented (in [8], the two types of operations for Alice are defined as Here I use the spectral decomposition theorem to decompose A 1 into projection operators and represent it by A 1 , with a similar approach for the definition of A 2 . There is an important difference here compared to the setup in [9]. Here a 1 , a 2 are results for Alice from completed measurement A 1 , A 2 , respectively, whereas in [9], a, c are measurement results for Alice and Charlie, respectively. The issue of c as a result of Charlie's "pre-measurement" in [9] does not exist here in [8]) by |φ i φ i | L 1 , and |φ 0 = |00 for a 1 = 0 or |φ 1 = |11 for a 1 = 1. A 2 is chosen to be A 2 = |χ 0 χ 0 | for a 2 = 0 or A 2 = |χ 1 χ 1 | for a 2 = 1, where From these definitions of A 1 and A 2 , one can verify that with similar definitions for operators B 1 and B 2 . The problem in [8] is that it assumes the law of marginal probability holds true, for instance, p(a 2 b 2 ) = ∑ a 1 ,b 1 ={0,1} p(a 1 a 2 b 1 b 2 ). Ref. [8] does not provide details on how the joint probability is defined. As discussed earlier, the joint probability cannot be well defined unless the measurement operators are commutative. If I further assume the validity of the classical probability axiom in Eq. (1) and apply it recursively, I have Then, the law of marginal probability is equivalent to the law of total probability such that Let us analyze if Equations (31) and (32) (28). The conditional probability is similar to (24), where I drop the subscript of ρ 0 . The desired law of total probability is p(a 2 b 2 at t 2 |ρ) = ∑ a 1 ,b 1 p(a 2 b 2 at t 2 |a 1 b 1 at t 1 , ρ)p(a 1 b 1 at t 1 |ρ).  (28), Condition C3 is satisfied such that p(a 2 |ρ) = ∑ a 1 p(a 2 |a 1 , ρ)p(a 1 |ρ) and p(b 2 |ρ) = ∑ b 1 p(b 2 |b 1 , ρ)p(b 1 |ρ). Unfortunately, I also show that (34) and the law of marginal probability p(a 2 b 2 ) = ∑ a 1 ,b 1 ={0,1} p(a 1 a 2 b 1 b 2 ) are still not valid.
Since proof of the no-go theorem in [8] depends on the law of marginal probability, and the law of marginal probability does not hold true by the choice of quantum state and measurement operators, there is a logical loophole in the no-go theorem. The violation of the inequality in [8] in quantum theory does not necessarily imply that measured facts are observer dependent. Instead, the violation just reconfirms that the law of marginal probability does not hold for the choice of quantum state and the measurement operators. The logical gap of equating the statement "measured facts are observer dependent" to the invalidity of the law of marginal probability in quantum theory is similar to what I discussed in the last paragraph in Section 4.1.
Note that besides depending on the law of marginal probability, the proof of no-go theorem in [8] also inherits the assumptions for the proof of the Bell-CHSH inequality [31,32], particularly dependency on the outcome independence assumption (19). The no-go theorem in [9], on the other hand, does not depend on the outcome independence assumption.

A No-Go Theorem for the Persistent Reality of Wigner's Friend's Perception
In [11], another no-go theorem is introduced to show that in the extended Wigner's friend scenario, Wigner's friend cannot "treat her perceived measurement outcome as having reality across multiple times" without contradicting one of the following assumptions in quantum mechanics [11]. P1 Let f 1 and f 2 be perceived measurement records of the friend at time t 1 and t 2 , respectively. A joint probability distribution p( f 1 , f 2 ) can be assigned that also satisfies the law of total probability p( f 1 ) = ∑ f 2 p( f 1 , f 2 ) and p( f 2 ) = ∑ f 1 p( f 1 , f 2 ); P2 One time probability is assigned according to p( f i ) = Tr(| f i f i |ρ) using unitary quantum theory where no state collapse is considered to have occurred; P3 The joint probability of the friend's perceived outcomes p( f 1 , f 2 ) has a convex linear dependence on the initial state ρ.
In traditional quantum measurement theory [30], the unitary process is considered to entangle the measured system with the measuring apparatus before the projection process. The projection process gives a definite final outcome. P2 essentially assumes the unitary process itself can have a measurement result and can be assigned a (one-time) probability. Zukowski and Markiewicz have already pointed out that such an assumption leads to a contradiction. However, there is another problem with P2 [26]. The derivation in [11] assumes that the joint probability p( f 1 , f 2 ) is derived through the standard probability axiom p( f 1 , f 2 ) = p( f 2 | f 1 )p( f 1 ), but it does not give details on how the conditional probability is calculated in quantum theory. It is not clear how the unitary formulation presented in [11] can be applied to derive the conditional probability p( f 2 | f 1 ) because P2 assumes there is no "collapse" after the first measurement. It is not a problem to compute the one-time probability p( f 1 ) and p( f 2 ). However, in order to be able to calculate a two-time probability such as p( f 2 | f 1 ), one will have to apply the state update rule after the first measurement at time t 1 , as shown in (13) for the two-time conditional probability.
More crucially, even if I am able to calculate the conditional probability, there is still a problem with P1, as P1 assumes the law of total probability p( f 1 ) = ∑ f 2 p( f 1 , f 2 ) is always true. We have shown in Theorem 1 and subsequent corollaries that the law of total probability is true in quantum theory only with certain conditions. The two POVM elements chosen in [11] are non-commutative, as shown in Equation (17) in [11]. Thus, p( f 1 ) = ∑ f 2 p( f 1 , f 2 ) does not necessarily hold. The proof in [11] assumes that p( f 1 ) = ∑ f 2 p( f 1 , f 2 ) always holds based on P1, then eventually deduces that the two POVM elements should be commutative and claims there is a contradiction. However, such a contradiction is due to the invalid assumption of p( f 1 ) = ∑ f 2 p( f 1 , f 2 ) in P1, which in turn is due to the fact that the two POVM elements are non-commutative. Since P1 is invalid, the contradiction does not lead to the desired conclusion that Wigner's friend cannot "treat her perceived measurement outcome as having reality across multiple times".

Relative Facts, Stable Facts
In relational interpretation of quantum mechanics (RQM) [34][35][36][37][38][39], a measurement result is considered meaningful only relative to the system that interacts with the measured system. A definite measurement result is referred to as a fact. Quantum theory is about conditional probability for facts, given other facts. Recently, Biagio and Rovelli introduced the concept of a stable fact in the following sense [10]. If, given the probability p(a i ) for N mutual exclusive facts a i (i = 1 . . . N) and the conditional probability of another fact b, p(b|a i ), the probability p(b) (dropping index j for b j ) is given by (4), then facts a i are considered stable.
RQM states that fact is relative. Formally, if two systems S and F interact such that variable L F of F depends on the value of variable L S of S, then the value of L S is said to be relative to F [10]. However, not all relative facts are stable. The main thesis of [10] consists two claims. First, the law of total probability (4) is satisfied only if b and a i are facts relative to the same system, say relative to system F. If b is relative to another system W = F, (4) is not true in general. Mathematically, these can be expressed as Here, it is important to label the reference system the fact is relative to. Second, if system F goes through a decoherence process by interacting with an environmental system E, the resulting density matrix for F is approximately given by ρ = ∑ i λ i |Fa i Fa i |, where Fa i is the eigenvalues of L F . Then, Equation (36) can be rewritten as In such a case, facts Fa i relative to E are stable for W. Now let us examine the two claims more carefully. For the first claim, from Theorem 1, Equation (35) is not necessarily true even if both b and a i are facts relative to a same system. That facts b and a i are both relative to a same system means both facts are obtained through interactions with the same system, and the interactions can be represented by measurement operators B and A i , respectively. If [B, A i ] = 0, Equation (35) is true. But there is no reason that B and A i have to be commutative. If [B, A i ] = 0, Equation (35) is not true in general, unless other conditions such as condition C2 or C3 in Theorem 1 are satisfied. Indeed, the second claim (37) is precisely the case where condition C2 is met. Note that the reasoning from (35) to (37) is also applicable when facts b and a i are relative to the same system but the corresponding measurement operators are non-commutative, [B, Therefore, it is not clear that one can use the validity of the law of total probability (35) and (37) to distinguish stable facts from non-stable facts. Again, I am not opposed to the idea that facts are relative. What I am questioning here is the validity of (35) without specifying the conditions, and the rigorousness of reasoning from (36) to (37). It appears that more careful investigation is needed in order to search for the criteria to define a "stable" fact.

The Page-Wootters Timeless Formulation
In the timeless formulation of quantum theory developed by Page and Wootters [21], time evolution is naturally emerged from quantum correlation between a clock and a system whose dynamics are tracked by the clock. Ref. [25] proposed several two-time formulations of conditional probability based on the Page-Wootters timeless mechanism. The advantage of such formulation is that from a timeless quantum state one can derive probability of a measurement event conditional on another event regardless of the temporary order of the two events.
Although the formulation in the present work is based on the regular time evolution dynamics in the Schrodinger picture, the definition of two-time conditional probability (13) is consistent with the definitions in [25]. For instance, for the case of two projection measurements A i and B j at t a and t b , respectively, (13) gives the same transition probability (14) as that in Equation (29) of [25].
However, the timeless formulations of conditional probability in [25] are applicable only to projection measurements, while the theory developed here is more general in the sense that it is applicable to POVM measurements. A two-time conditional probability formulation for projection measurements is insufficient to analyze the no-go theorems in [9]. Moreover, my focus here is the validity of the law of total probability that is built on the definition of two-time conditional probability, which is missing in [25], as the focus there is only on the rules for two-time conditional probability.
It will be interesting to generalize the timeless Page-Wootters formulation of two-time conditional probability in [25] to be able to handle POVM measurements, although I expect such generalization should produce results similar to those presented in this work.

Limitations
One limitation of the present work is that in Theorem 1, I am only able to derive three sufficient conditions for the law of total probability to hold true. In theory, there can be many other sufficient conditions. It is desirable to find the sufficient and necessary condition for the law of total probability to hold true in quantum theory. This remains a future investigation topic. Nevertheless, for the purpose of analyzing the EWF scenario and identifying the loopholes of the relevant no-go theorems, the conditions specified in Theorem 1 and subsequent corollaries are sufficient.

Conclusions
In this paper, the standard rule to assign conditional probability in quantum theory, i.e., Lüders rule, is extended to include two-time POVM measurements. The extension is strictly based on the recursive application of the POVM measurement theory as shown in (5) and (6) and the assumption that probability distribution can be assigned only for completed quantum measurement. The resulting definition (13) is consistent with other works based on Page-Wootters formulation [25], but with advantage of being able to apply to POVM measurements instead of just projection measurements.
More importantly, with the generalized two-time conditional probability formulation, I analyze the validity of the law of total probability. It is shown that the quantum version of the law of total probability does not hold true in general. Certain conditions related to the choice of measurement operators and the initial quantum state must be met in order for the law of total probability to hold. Specifically, such sufficient conditions are derived in Theorem 1 and Corollary 1.
Application of the theory developed here to the extended Wigner's friend scenario reveals logical loopholes in several no-go theorems. These no-go theorems take for granted the validity of the law of total probability (or the law of marginal probability) in quantum theory. However, this is not the case, as shown in Theorem 1 and Corollary 1. Thus, the no-go theorems do not lead to the desired conclusions. For instance, the violation of the inequalities developed in [8,9] in quantum theory does not necessarily lead to the desired statement that "measured facts are observer-dependent". Instead, it just reconfirms the invalidity of the law of total probability or the law of marginal probability in quantum theory. I do not take a stand on the assertions themselves of the no-go theorems. It could be still a valid statement that "measured facts are observer-dependent". What I show here is that there are logical loopholes to reach such a statement. It is desirable to find more convincing proof and experimental testing because the implications of the extended Wigner's friend scenario are conceptually fundamental in quantum theory.
Funding: This research received no external funding.

Data Availability Statement:
The data that support the findings of this study are available within the article.

Acknowledgments:
The author sincerely thanks the anonymous reviewers of this paper for their careful reviews. The valuable comments provided help to improve the clarity of the presentations and discussions.

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A. Proof of Theorem 1
The right-hand side of (15) becomes Tr(U † B j U A i ρ). Given the completeness of POVM operators, Then, the right-hand side of (15) becomes Tr(U † B j Uλ i |φ i φ i |) = λ i φ i |U † B j U|φ i . The right-hand side of (16) becomes ∑ i λ i φ i |U † B j U|φ i . On the other hand, p(b j at t 2 |ρ) = Tr(U † B j Uρ) = ∑ k Tr(U † B j Uλ k |φ k φ k |) = ∑ i λ i φ i |U † B j U|φ i , same as the right-hand side of (16).
Given ρ = |Ψ Ψ| and A i is a projection operator, in Condition C3, I have A 2 i = A i , and the right-hand side of (16) becomes ∑ i Ψ|A i U † B j U A i |Ψ . The left-hand side p(b j at t 2 |ρ) = Ψ|U † B j U|Ψ . Again, by the completeness of POVM operators, ∑ i A i = I, and I get To have both sides of (16) equal, I need This can be rearranged to ∑ i Ψ|[A i , U † B j U]A i |Ψ = 0, and Condition C3 ensures this is the case.

Appendix B. Proof of (26)
To avoid confusion, I need to restore the subscripts of operators as C i and D j . Since C i and D j are projection operators, and ρ = |Ψ Ψ|, the R.H.S. of (23) becomes ∑ ij Ψ|C i D j ABC i D j |Ψ . Given the completeness of {C i } and {D j }, I have ∑ ij C i D j = I. Since C 2 i = C i and D 2 j = D j , I further obtain ∑ ij C 2 i D 2 j = I. Then, p(ab at t 2 |ρ, xy) = Tr(ρAB) = ∑ ij Ψ|ABC 2 i D 2 j |Ψ . To make this equal to the R.H.S. of (23), one condition is Omitting the subscripts of C i and D j again gives (26).

Appendix C. Non-Commutation of Operators in [9]
The key characteristic of the EWF experiment is that the super-observer Alice (or Bob) performs measurements on the laboratory L 1 (or L 2 ) as a whole. Thus, the measurement operator acts on both the observed system and the friend in the lab. Ref. [9] carefully chooses the operators as following. When x = 1, Alice's measurement is represented as A(x = 1) = |c c| F 1 ⊗ I s 1 , where c is the outcome Charlie obtains from his measurement on s 1 , and F 1 refers to Charlie himself. For x ∈ {2, 3}, Alice's measurement operator is A(x) = U L 1 (I F 1 ⊗ E x s 1 )U −1 L 1 , where U −1 L 1 is a unitary evolution that reverses the entanglement between F 1 and s 1 , and E x s 1 is a positive operator on s 1 associated with outcome a for measurement setting x. The operator associated with Charlie's measurement on s 1 , according to [9], is described by a unitary operator C(x) = U L 1 from Alice's perspective. U L 1 acts on the same Hilbert space H F 1 ⊗ H s 1 and entangles s 1 and F 1 . For x = 1, [A(x), C(x)] = |c c| F 1 ⊗ I s 1 U L 1 − U L 1 |c c| F 1 ⊗ I s 1 = 0, and for x ∈ {2, 3}, [A(x), C(x)] = U L 1 (I F 1 ⊗ E x s 1 ) − U 2 L 1 (I F 1 ⊗ E x s 1 )U −1 L 1 = 0. As already pointed out [26], defining C(x) as U L 1 implies pre-measurement only with no measurement result and leads to contradictions. An alternative choice of operation is that Charles performs a projection operation after the pre-measurement. This refines the definition of C(x) to include both U L 1 and a projection operation on s 1 , i.e., C(x) = U † L 1 (I F 1 ⊗ |c c|)U L 1 . With this refined definition, one can verify that [A(x), C(x)] = 0 is still true. Choosing C(x) = U † L 1 (I F 1 ⊗ |c c|)U L 1 implies operator C(x) is from Charlie's point of view. This may not be the original intention in [9]. However, the key point here is that with either choice of C(x), [A(x), C(x)] = 0. The same analysis goes for operators B(y) and D(y), and the conclusion is that [B(y), D(y)] = 0 for y ∈ {1, 2, 3}.

Appendix D. Proof That (34) Does Not Hold
First, I consider a simpler case that only Alice performs the two types of measurements and Bob does nothing. The law of total probability in this case can take the form of p(a 2 |ρ) = ∑ a 1 ={0,1} p(a 2 |a 1 , ρ)p(a 1 |ρ). This is true due to the fact that the selected operators A 1 and A 2 and wave function (28) together meet condition C3 . To see this, substitute A 1 = |φ i φ i | L 1 into condition C3 ; C3 becomes Now consider the case a 1 = 0, where |φ 0 = |00 L 1 . From (28), one can calculate Then, from (30) and dropping the unimportant factor of 1/2 for |ϕ 0 , I have for the case of a 1 = 0 For the case of a 2 = 1, I can verify that |ϕ 1 = |11 L 1 and Therefore, condition C3 is met with the choices of wavefunction and Alice's measurement operation. Similarly, if only Bob performs the two types of measurements and Alice does not perform any measurement, and B 1 = |φ i φ i | L 2 , I can verify that Thus, p(b 2 |ρ) = ∑ b 1 ={0,1} p(b 2 |b 1 , ρ)p(b 1 |ρ) holds true per Theorem 1. However, when I consider both Alice and Bob performing the measurements {A 1 , A 2 } and {B 1 , B 2 }, respectively, the situation is different. By replacing operators A, B, C, D in (26) with operators A 2 , B 2 , A 1 , B 1 , (26) reads Ψ|[A 1 B 1 , A 2 B 2 ]A 1 B 1 |Ψ = 0.