Contextual Inferences, Nonlocality, and the Incompleteness of Quantum Mechanics

It is known that “quantum non locality”, leading to the violation of Bell’s inequality and more generally of classical local realism, can be attributed to the conjunction of two properties, which we call here elementary locality and predictive completeness. Taking this point of view, we show again that quantum mechanics violates predictive completeness, allowing the making of contextual inferences, which can, in turn, explain why quantum non locality does not contradict relativistic causality. An important question remains: if the usual quantum state ψ is predictively incomplete, how do we complete it? We give here a set of new arguments to show that ψ should be completed indeed, not by looking for any “hidden variables”, but rather by specifying the measurement context, which is required to define actual probabilities over a set of mutually exclusive physical events.


Introduction
After many years of theoretical and experimental research, it can be now said that the door has been closed on the historical Einstein and Bohr's quantum debate [1][2][3]. On the way, this research opened the door to many new ideas and experiments, leading ultimately to the development of quantum technologies. As a reflection on these evolutions, our point view here is to go back to the Einstein-Bohr debate and to propose answers to the initial questions: Is the "wave function" a complete description of physical reality? What is the role of locality? What about relativistic causality?
We will see that, contrary to what is often said, Einstein, Podolsky, and Rosen were perhaps not so wrong, and Bohr not so right-and that some lessons may be learned regarding what quantum mechanics is telling us about physical reality.
Our reasoning uses the idea of contextuality, which is currently an extremely active field of research, connected with many foundational issues [4][5][6][7][8][9][10][11]. However, rather than pursuing these interesting lines of research, we step back to discussions from the 1980s [12][13][14], which were perhaps too quickly dismissed. This is because fully exploiting them amounts to admitting that the usual |ψ is incomplete, which is a shocking statement that was rejected by Bohr himself in 1935 [3].
However, many things have happened since then, especially with regards to contextuality and nonlocality. Thus, we propose a "not so shocking" way to complete |ψ : very schematically, it tells that a usual state vector is incomplete as long as a complete set of commuting operators admitting this vector as an eigenstate has not been specified. More details are given below as well as how to use this idea for our purpose.

Probabilistic Framework
We use a general framework for conditional probabilities as presented, for instance, by E.T. Jaynes in [12], and also related to the analysis in [13,14]. We emphasize that these calculations are quite general and do not imply any commitment to a specific view on probabilities-Bayesian or otherwise. The equations we write apply both to usual quantum mechanics and to local hidden variable theories (LHVT), and the main interest of this calculation is to show explicitly where these two descriptions split, and why. We will also indicate when some (reasonable but not mandatory) hypotheses will be made.
We consider the well-known EPR-Bohm-Bell scheme [1,15], where polarization measurements are carried out on entangled photon pairs, described by some quantity λ in a variable space Λ. Alice and Bob carry out measurements defined by the respective polarizers' orientations x and y and obtain binary results a = ±1 and b = ±1.
According to the usual rules of probabilities, and with some care but without loss of generality [12], one can write the following relation between conditional probabilities, by conditioning on λ in some a priori unknown hidden variable space Λ P(ab|xy) = ∑ λ∈Λ P(ab|xyλ)P(λ|xy) In addition to this purely probabilistic relation, we introduce some requirements about the physics we want to describe, and we do it in the most general way: we assume that usual Quantum Mechanics (QM) and special relativity in the form of Relativistic Causality (RC) are true. We note that being true does not necessarily mean being complete [15][16][17], and we will come back to that issue later on. It should also be clear that theories where a and b are deterministic functions of λ, x, y do fit in this probabilistic framework as special cases; however, determinism has important consequences to be discussed below.

Enforcing Relativistic Causality
A first consequence of RC, sometimes called "freedom of choice", consists in requiring that λ does not depend on the variables (x, y) representing Alice and Bob's choices of measurement settings. In other words, the choices of measurements (x, y) should not act on the way photons are emitted (λ), since these events are space-like separated. This boils down to the independence condition P(λ|xy) = P(λ), or equivalently P(xy|λ) = P(xy), which is fulfilled by all the theories we are interested in. We note that one may reject freedom of choice by arguing that the random events λ, x and y are correlated from their distant past, see e.g., [18]. However such "superdeterministic" theories are highly speculative and here we do choose to keep free will. We have thus For a given initial state λ of the pair, a relevant theory should provide P(ab|xyλ), and thus we now focus on this conditional probability. For the sake of clarity, λ is a generic notation to specify whatever may be specified about the emission of the photon pair, in a given shot. This may include variables that fluctuate from shot to shot and other variables that do not. On the other hand, x, y and λ are not causally related as written above.
We note that Equation (2) is true also for QM, where the variable space Λ contains only one λ corresponding to the initial state of the entangled pair (e.g., a singlet state). It is standard in recent demonstrations of Bell's inequalities [19,20] to assume that P(ab|xyλ) is a probability for a given λ, and thus there is no restriction of generality here. Now, without any further assumptions, one can write from basic rules of inference P(ab|xyλ) = P(a|xyλ)P(b|xyλa) = P(a|xyλb)P(b|xyλ) where the two decompositions refer, respectively, to Alice and Bob, and where on Alice's side • P(a|xyλ) = probability of Alice obtaining result a for input x, and • P(b|xyλa) = probability of Bob obtaining result b for input y, calculated by Alice who knows x and a; whereas, on Bob's side, • P(b|xyλ) = probability of Bob obtaining result b for input y, and • P(a|xyλb) = probability of Alice obtaining result a for input x, calculated by Bob who knows y and b.
Clearly, a meaningful requirement in Equation (3), again related to RC, is that the choice of measurement by Alice (resp. Bob) should not have an influence on the result by Bob (resp. Alice). This implies that P(a|xyλ) = P(a|xλ) and P(b|xyλ) = P(b|yλ), and we call this condition "elementary locality" (EL), meaning that it is fulfilled for each given λ. As a consequence, one has P(ab|xyλ) = P(a|xλ)P(b|yλ, xa) = P(a|xλ, yb)P(b|yλ) where, in general, one cannot remove xa from P(b|yλ, xa), nor yb from P(a|xλ, yb). Let us emphasize that, so far, we have respected QM and RC at each step, and it can easily be checked that Bell's inequalities cannot be obtained from Equation (4). Correspondingly, if interpreted "à la Bell", keeping xa and yb in Equation (4) looks like an influence of one measurement on the other side. Yet, this conclusion is not warranted since Alice calculates a probability for Bob's result by using only what is locally available to her (resp. him by switching Alice and Bob); this does not influence in any way what is happening on the other side.

Contextual Inferences vs. Bell's Hypotheses
We conclude that, given Equation (4), there is still a missing step to reach Bell's theorem. In order to identify it, let us recall that locality "à la Bell" can be seen as a conjunction of two conditions : - The first condition is "elementary locality" (EL), already spelled out above: (EL) P(a|xyλ) = P(a|xλ) and P(b|xyλ) = P(b|yλ), taken as true as explained before.
These two conditions have been given various names, e.g., parameter independence for (EL) and outcome independence for (PC), see [21]; here, we stay closer to Jarrett's definitions [13,14], though some of our conclusions are different. We note also that (EL) is not accepted by Bohmian mechanics (BM), which agrees with relativistic causality only after averaging on λ. Thus our initial requirement of fulfilling RC at every step is not compatible with BM; see also the Conclusion section.
In order to justify the hypothesis (PC) and the wording "predictive completeness", one must emphasize that Bell's factorization condition P(ab|xyλ) = P(a|xλ)P(b|yλ) relies on the idea that λ specifies everything that can be known about the pair of particles; given this assumption, condition (PC) should be obvious, because knowing xa cannot bring anything more to Alice's probability calculation; hence, the name of predictive completeness. For instance, theories where a and b are deterministic functions of λ, x, y must satisfy (PC).
On the other hand, (λxa) occurring in the probability P(b|yλ, xa) = P(b|y, λxa) is not a property carried by Bob's particle, but it involves both the properties of Bob's particle (included in λ) and the result of Alice's measurement (described by xa). In other words, (λxa) refers to a property of Bob's particle, not in and by itself, but within a context defined by Alice's result. In the CSM language [22][23][24][25][26][27][28], (λxa) defines a modality for Bob's particle, in Alice's context. The correspondance between the respective modalities (λxa) and (λyb) can only be probabilistic, with probability 1 if a = b [24]. Alice's context and result cannot have any influence on Bob's particle, and they do not, since (λxa) is only used locally by Alice according, again, to Equation (4).
Given this situation, we take a major new step beyond the previous discussions by admitting that the description given by λ (or ψ in the quantum case) is incomplete indeed, and that knowing (xa) does bring something new to Alice. Then, condition (PC) can be violated, by Alice making a "contextual inference" about Bob's result. In order to make sense of this idea, it is essential to realize that (i) contextual inference is a non-classical phenomenon, and (ii) it agrees with relativistic causality, as we explain now.
(i) In classical physics, condition (PC) as defined above is verified, and Bell's factorization condition follows. However, in quantum physics, knowing Alice's measurement and result allows her to predict more, without invoking any action at a distance. This is because λ ≡ ψ does not tell which measurements will be actually carried out by Alice and Bob-in other words, λ ≡ ψ is predictively incomplete [13,17].
Adding this information where and when it is locally available improves Alice's prediction about Bob's result, and Bob's about Alice's, in agreement with Equation (4), showing the suitability of the concept of contextual inference. This effect does not show up in classical physics, because a classical λ is complete; however, it does show up in QM because a quantum ψ is (predictively) incomplete, as long as a measurement context has not been specified (for more details, see [17] and the last sections below).
(ii) Since contextual inference only applies to probabilities appearing in Equation (4), it does not involve any physical interaction outside light cones; therefore it obeys relativistic causality. A typical wrong line of thinking would be to say: if Alice can predict with certainty some results by Bob (perfect correlations, obtained when a = b), then either Bob's result is predetermined, or there are instantaneous actions at a distance. However, this dilemma only applies in a classical framework, where particles' properties are defined in an absolute way, and Bell's inequalities do apply. In a quantum framework, Alice can make local inferences by using additional information that is available to her, e.g., (λxa) in the above example; and these predictions can only be checked by accessing Bob's results afterwards, in a local and ordinary causal way.

Discussion
It is also interesting to draw a standard light-cone picture (see Figure 1), in order to show explicitly how contextual inference may be used when the relevant information is locally available. More precisely, this diagram allows us to separate on the one hand the localized events in space time (first the production of λ, x and y, and then the separated read-out of a by Alice, and b by Bob), and on the other hand the corresponding predictions, that are inferences, not influences, and thus no "action at a distance" is involved.  Another remark may be useful: as suggested by the light cones pictures in Figure 1, one may consider that x and y are also issued from independent random processes in variable spaces X and Y, as it is done in loophole-free Bell tests [1]. Then, the global probability writes P(xa, yb) = ∑ λ∈Λ,x∈X,y∈Y P(ab|xyλ)P(λ)P(x)P(y) where P(ab|xyλ) is given by Equation (4) as before. Taking Λ = {λ}, X = {x 1 , x 2 } with P(x 1 ) = P(x 2 ) = 1/2, Y = {y 1 , y 2 } with P(y 1 ) = P(y 2 ) = 1/2 as in a usual Bell test, one finds P(xa, yb) = P(ab|xyλ)/4. Correspondingly, the random variable (xa, yb) may take 16 mutually exclusive values, not 4, and Bell's inequalities cannot be written anymore. Bell's reasoning requires calculating the correlation functions E(x, y) = ab x,y by using P(ab|xy), not P(xa, yb), and thus the four different measurements apply to the same sample space Λ. This means implicitly that λ completely carries the pairs' properties (and the measurement results can be predicted from the knowledge of λ alone), as it would be the case in classical physics.
Completing ψ?-A second major new step is to answer the question: If ψ is not complete, does it tell anything concrete by itself? It does, because it indicates a set of contexts, corresponding to all the observables, including ψ as an eigenvector, where the associated measurement result (eigenvalue) is predictable with certainty. In recent papers [23][24][25][26][27], we introduced a framework that makes a careful distinction between the usual ψ without a context and the physical state within a context-called a modality (see also Appendix A).
In this language, ψ is associated with an equivalence class of modalities, called an extravalence class [25]; whereas the modalities are complete because they are properties of a system within a context; ψ is not, because the context is missing by construction. This gives a nice outcome to the Einstein-Bohr debate, by confirming the incompleteness of ψ [2], and by telling also how to complete it: one should add the context-this actually fits with the "very conditions" required by Bohr's answer [3,15,28].

Conclusions
In summary, the violation of Bell's inequalities by quantum theory and experiments [1] can be explained if one takes into consideration contextual inferences, and these, in turn, are ultimately allowed by the predictive incompleteness of the quantum state: finding actual probabilities for measurement results requires specifying a measurement context. In practice, contextual inferences correspond to what is usually called "quantum non locality"; however, they are not related to locality in a relativistic sense, but rather to the specifically quantum condition that requires attributing physical properties to systems within contexts. The implications on the (in)completeness of QM are discussed in more detail in [17]; however, a few comments are in order:

•
In the above, we argue that ψ is predictively incomplete, but not that QM is incomplete in the sense of being erroneous. There are many practical ways to complete it, by reintroducing the context either "by hand" (like in usual textbook QM) or in a more formal way using algebraic methods [17]. • The predictive incompleteness of ψ is general, and not limited to entangled states. This is because the measurement context is required to find actual probabilities, or said otherwise, one cannot define a full consistent set of classical probabilities applicable to any result in any context. In the language of [29], ψ provides mathematical qprobabilities without interpretation, whereas ψ completed by the specification of the measurement context provides true probabilities for mutually exclusive events.
• In this article, we enforced (EL) at the beginning and explained how (PC) can be violated by a non-deterministic theory, without any conflict with RC. On the other hand, deterministic theories must agree with (PC), and therefore have to violate (EL) to be compatible with the observed violation of Bell's inequalities; an example of such a theory is Bohmian mechanics. Generally speaking, if (EL) is rejected, more care must be taken in order to avoid an explicit violation of special relativity [30]. • Here, we considered the standard version of Bell's theorem, but many other inequalities may be obtained in the general framework of "local realism". It would be interesting to look whether the violation of such inequalities is generally due to a violation of (PC); this may be the topic for further work (see Appendix B for three-particle entanglement).
Finally, it is interesting to note that, in [12], Jaynes did not spell out either the "nonlocal" or the "incompleteness" option, though he did all the calculations above. In our opinion, this is because he could not give up the classical idea that particles should be described independently of their contexts. In order to admit the idea of contextual inference, an intellectual quantum jump is required to accept that, in quantum mechanics, one has to take into consideration both the systems and the contexts in which they evolve. A simple way not to forget this requirement is to postulate that the "object" carrying well-defined properties is a composite: a (quantum) system within a (classical) context [17,[22][23][24][25][26][27][28]31].
Funding: This research received no external funding.

Acknowledgments:
The author deeply thanks Franck Laloë, Roger Balian and Nayla Farouki for many useful discussions, without indicating that they endorse all what is written above.

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A. Summary of the Main Arguments
In this Appendix A, we summarize our main arguments, in either the standard QM or the CSM point of view, and in Appendix B, we discuss three-particle entanglement with respect to Predictive Incompleteness.
In the text above, we presented a general probabilistic framework inspired from [12][13][14], and we saw that, assuming the causal independence of the pair emission and measurements, the hypothesis of Bell's theorem can be split in two different sub-hypotheses: Elementary Locality (EL): At the most elementary level, "λ" allowed by the theory under consideration, the probability distribution of the result a of a measurement x by Alice cannot depend on the choice of a measurement y done by Bob in a remote place (the same with Bob's result b): P(a|xyλ) = P(a|xλ) and P(b|xyλ) = P(b|yλ).

Predictive Completeness (PC):
Again, at the "λ" level, Alice knowing the result a of her measurement x cannot help her to obtain a better inference about Bob's result b (and same thing by exchanging Alice and Bob): P(a|xyλb) = P(a|xyλ) and P(b|xyλa) = P(b|xyλ).
Taken together, the conditions (EL) and (PC) entail the factorization P(ab|xyλ) = P(a|xλ)P(b|yλ), and therefore lead to Bell's inequalities [19,20]. As a consequence, a theoretical framework violating Bell's inequalities must reject at least one of the two hypotheses. For instance, quantum mechanics (QM) agrees with (EL), which can be seen as a consequence of relativistic causality but rejects (PC). This means that the description given by λ (or ψ in the quantum case) is not complete, and that knowing (xa) does bring something new to Alice: then, condition (PC) can be violated, and Alice can make a "contextual inference" about Bob's result.
The predictive incompleteness of ψ is actually not a big surprise, and it appears implicitly or explicitly in many presentations of QM, either based on the standard formalism [29], or using other approaches [31]. The respective lines of arguments are : Arguments based on the QM formalism [29].
• QM is a fundamentally probabilistic theory: this is a consequence of the non-commutation of observables. • The "quantum state" (pure state ψ or mixture ρ) is predictively incomplete, because by itself it does not provide a normalized probability distribution over a set of mutually exclusive events. • From a physical point of view, ψ or ρ can be completed by specifying a measurement context, i.e. a macroscopic apparatus, in order to define a set of mutually exclusive events given by the apparatus outcomes. • Once a context is given, ψ or ρ provides the relevant set of probabilities; this applies in any possible context, but only one at a time (predictive incompleteness).
Arguments based on the CSM formalism [31].
• QM is a fundamentally probabilistic theory: this is a consequence of contextual quantization [23,27]. • Nevertheless, QM allows measurements results to be predicted with certainty, either by repeating them in the same measurement context (this defines a modality), or by observing fully connected results betwen different contexts (this defines an equivalence classe of modalities, called an extravalence class). • Associating ψ (a mathematical object) to an extravalence class yields Born's law from Gleason's theorem. • By construction, ψ is predictively incomplete because it is associated with an extravalence class and not with a modality, and thus the context is missing.
Then, the violation of Bell's inequalities and all similar effects (resulting from "local realism") appear as a consequence of the predictive incompleteness of ψ and have no conflict whatsoever with relativistic causality. There is no influence at a distance, but only inference at a distance, within a non-classical framework that is fundamentally probabilistic (this is the starting point of both lines of arguments above).
In the CSM point of view, QM relies on a non-classical ontology where physical properties are attributed to physical objects consisting of a system within a context, i.e., an idealized measurement apparatus. Such physical properties are called modalities, and a modality belongs to a specified system within a specified context, which is described classically.
Loosely speaking, the mathematical description of a modality includes both a usual state vector |ψ and a complete set of commuting operators admitting this vector as an eigenstate. Though it may appear heavier at first sight, this point of view eliminates a great deal of troubles about QM, and (in some sense) it can be seen as a reconciliation between Bohr and Einstein in their famous 1935 debate [28].

Appendix B. Discussion of Three-Particle Entanglement
In the main text, the argument was based on Bell's theorem, but it is also interesting to consider three-particle entangled states, where the conflict between QM and local realism is even more straightforward, as we will see now. The basic argument was introduced by Greenberger, Horne and Zeilinger [32] and has been developed by many authors, both theoretically and experimentally [33,34]; here, we follow mostly Mermin's presentation in [35,36].
Thus, let us consider measurements carried out on three spin 1/2 particles, each one with a spin operator S =¯h 2 σ, where the three Pauli matrices (σ x , σ y , σ z ) are denoted as (X, Y, Z) for simplicity. The 2 × 2 unitary matrix is denoted as I, and ordered products such as XYZ mean the measurements of σ x , σ y , σ z on the first, second, and third particle, respectively given to Alice, Bob, and Charlie.
A basic feature is that the operators XYY, YXY, YYX all commute together, and have eight distinct eigenvalues (±1, ±1, ±1). Therefore, they define a complete set of commuting operators (CSCO), where the eight orthogonal eigenvectors are defined by the eight eigenvalues. Another feature is that XXX also commute with the three previous operators, and one has the algebraic relation XXX = −(XYY).(YXY).(YYX).
Correspondingly, the eigenvalue of XXX is (−1) for the (+1, +1, +1) eigenvector in the previous basis, which we take as the initial state and denote as ψ in , the usual GHZ state defined as ψ in = (| + ++ − | − −− )/ √ 2, where |± are the eigenstates of σ z [32]. On the other hand, denoting x = ±1 (resp. y = ±1) the result of measuring X (resp. Y), on each spin, one has xyy = yxy = yyx = +1 in state ψ in , and since y 2 = 1 the product xxx is +1 also. However, assuming (EL) and (PC), the value of each x should not depend on the other measurements being X or Y, and one finds a full contradiction for ψ in , between measuring XXX (and thus obtaining −1) and deducing xxx from the separate measurements of XYY, YXY and YYX (and thus obtaining +1). Figure A1. Illustration of the complete set of commuting observables considered here. The operators X, Y, Z correspond to the Pauli matrices σ x , σ y , σ z , and I to the identity. Three operators in a set are enough, but a fourth commuting one is added in each group. The possible results are indicated for the initial state ψ in (GHZ state), which is an eigenstate of the CSCO at the center. Note that the missing operators XXY, XYX, YYX, and YYY all give random results in the state ψ in .
For a better understanding, Figure A1 displays the relevant CSCO for the considered situation. The previous operators are at the center, whereas XYY, for instance, is commuting with the CSCO (XI I, IYI, I IY), which is incompatible with (YI I, IXI, I IY) and (YI I, IYI, I IX). In this situation, (PC) requires that, given the previous initial joint eigenvector ψ in ≡ (+1, +1, +1), no further measurements by Alice, Bob or Charlie can improve the inferences they can make about the results to be obtained by others. However, this is clearly not the case in QM, since for instance Alice and Bob can predict Charlie's result with certainty by sharing their measurement results.
This issue can be further illustrated by deriving inequalities in a local realistic framework, as it was done in [36]. The hypothesis needed from local realism is P(abc|uvwλ) = P(a|uλ)P(b|vλ)P(c|wλ) whereas the rules of inference together with elementary locality (EL) for three particles provide P(abc|uvwλ) = P(a|uλ)P(b|uvλa)P(c|uvwλab) = P(a|uλ)P(b|uvwλac)P(c|uwλa) = P(a|uvwλbc)P(b|vλ)P(c|vwλb) = P(a|uvλb)P(b|vλ)P(c|uvwλab) = P(a|uvwλbc)P(b|vwλc)P(c|wλ) = P(a|uwλc)P(b|uvwλac)P(c|wλ) These relations are fulfilled by QM, and each pair of equations applies respectively to Alice, Bob, and Charlie. More precisely, with a, b, c = ±1 and u, v, w = 0 for X and π/2 for Y, one has P(abc|uvwλ) = (1 + abc cos(u + v + w))/8 and all other probabilities are 1/2 except P(a|uvwλbc) = P(b|uvwλac) = P(c|uvwλab) = (1 + abc cos(u + v + w))/2. This means that, if u + v + w = 0 or π, given their results, two partners can predict with certainty the result of the third one as can be expected from the definition of ψ in . On the other hand, in order to obtain Equation (A1) one needs to add Predictive Completeness (PC), which reads by assuming (EL) is fulfilled: P(a|uλ) = P(a|uvλb) = P(a|uwλc) = P(a|uvwλbc) P(b|vλ) = P(b|uvλa) = P(b|vwλc) = P(b|uvwλac) P(c|wλ) = P(c|uwλa) = P(c|vwλb) = P(c|uvwλab) For instance, the first line refers to predicting Alice's result, from the initial state, then by Bob, then by Charlie, and then jointly by Bob and Charlie. According to (PC), all these predictions should be the same, whereas this is clearly not the case for QM, because λ ≡ ψ in is predictively incomplete: a new measurement in a new context provides a better contextual inference. Again, this does not imply any influence at a distance, because (EL) is always fulfilled, but only an inference at a distance, in a non-classical probabilistic framework.