The Relativity Principle at the Foundation of Quantum Mechanics

Quantum information theorists have created axiomatic reconstructions of quantum mechanics (QM) that are very successful at identifying precisely what distinguishes quantum probability theory from classical and more general probability theories in terms of information-theoretic principles. Herein, we show how one such principle, Information Invariance & Continuity, at the foundation of those axiomatic reconstructions maps to "no preferred reference frame" (NPRF, aka "the relativity principle") as it pertains to the invariant measurement of Planck's constant h for Stern-Gerlach (SG) spin measurements. This is in exact analogy to the relativity principle as it pertains to the invariant measurement of the speed of light c at the foundation of special relativity (SR). Essentially, quantum information theorists have extended Einstein's use of NPRF from the boost invariance of measurements of c to include the SO(3) invariance of measurements of h between different reference frames of mutually complementary spin measurements via the principle of Information Invariance & Continuity. Consequently, the "mystery" of the Bell states that is responsible for the Tsirelson bound and the exclusion of the no-signalling, "superquantum" Popescu-Rohrlich joint probabilities is understood to result from conservation per Information Invariance & Continuity between different reference frames of mutually complementary qubit measurements, and this maps to conservation per NPRF in spacetime. If one falsely conflates the relativity principle with the classical theory of SR, then it may seem impossible that the relativity principle resides at the foundation of non-relativistic QM. In fact, there is nothing inherently classical or quantum about NPRF. Thus, the axiomatic reconstructions of QM have succeeded in producing a principle account of QM that reveals as much about Nature as the postulates of SR.


Introduction
Feynman famously said, "I think I can safely say that nobody understands quantum mechanics" [1]. Despite the fact that quantum mechanics "has survived all tests" and "we all know how to use it and apply it to problems," Gell-Mann agreed with Feynman saying, "we have learned to live with the fact that nobody can understand it" [2]. As a result, there are many programs designed to interpret quantum mechanics (QM), i.e., reveal what QM is telling us about Nature. We will not review such attempts here (the interested reader is referred to Drummond's 2019 overview of QM interpretations [3]). Rather, in this paper we will explain how axiomatic reconstructions of QM based on information-theoretic principles (e.g., see [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23] or the review by Jaeger [24]) contain a surprising advance in the understanding of QM. Specifically, we will show how the principle of Information Invariance & Continuity [8], "The total information of one bit is invariant under a continuous change between different complete sets of mutually complementary measurements," at the basis of information-theoretic reconstructions of QM already implies the relativity principle (aka "no preferred reference frame (NPRF)") as it pertains to the invariant measurement of Planck's constant h when applied to spin-$\frac{1}{2}$ measurements in spacetime. This is in total analogy to the Lorentz transformations of special relativity (SR) being based on the relativity principle as it pertains to the invariant measurement of the speed of light c (light postulate). Thus, the information-theoretic reconstructions of QM (hereafter, "reconstructions of QM") provide a "principle" account of QM in total analogy to that of SR [23,25,26,27], revealing a deep unity between these pillars of modern physics where others have perceived tension [28][29][30][31].
Before proceeding further, some caveats are in order. First, the relativity principle, i.e., "The laws of physics must be the same in all inertial reference frames" or NPRF for short, is not restricted to "The laws of classical physics," it applies to all of physics. Second, that it resides at the foundation of a theory does not mean the theory is "relativistic." For example, NPRF is at the foundation of Newtonian mechanics with its Galilean transformations and Newtonian mechanics is certainly "non-relativistic." Thus, there is no reason a priori to exclude the possibility that NPRF resides at the foundation of non-relativistic QM.
And, our use of NPRF deals exclusively with the kinematic structure underlying QM, i.e., denumerable-dimensional Hilbert space (of arbitrarily large, but finite, dimension). This is in total analogy to the relativity principle underwriting the kinematic structure of SR, i.e., Minkowski spacetime (M4). In both cases the kinematic structure constrains but does not dictate the dynamics. Bub writes [32]: The information-theoretic interpretation is the proposal to take Hilbert space as the kinematic framework for the physics of an indeterministic universe, just as Minkowski space provides the kinematic framework for the physics of a non-Newtonian, relativistic universe. In special relativity, the geometry of Minkowski space imposes spatio-temporal constraints on events to which the relativistic dynamics is required to conform. In quantum mechanics, the non-Boolean projective geometry of Hilbert space imposes objective kinematic (i.e., pre-dynamic) probabilistic constraints on correlations between events to which a quantum dynamics of matter and fields is required to conform.
For example, the relativity principle is responsible for the light postulate and together they give M4 at the foundation of SR, but M4 does not dictate the contents of the 4-momentum vector. Likewise, reconstructions of QM give the qubit Hilbert space structure at the foundation of QM, but that qubit Hilbert space structure does not dictate the Hamiltonian for the propagator.
Finally, we should also point out that our result is not related to quantum reference frames [33][34][35], Lorentz invariance of entangled states [36], the relativity principle in QM per Davis [37] or Dragan & Ekert [38], or the relativity of simultaneity applied to quantum experiments [39,40]. The relativity principle will be applied herein to reference frames related by spatial rotations SO(3). More specifically, these spatial reference frames will be those established by mutually complementary qubit measurements [8] per the "closeness requirement" between quantum theory and classical measurement devices [41,42] (Section 2). While spatial rotations plus Lorentz boosts constitute the restricted Lorentz group, spatial rotations plus Galilean boosts constitute the homogeneous Galilean group, so the use of the relativity principle here does not imply Lorentz invariance. Again, conformity to NPRF does not mean a theory is "relativistic" in that sense.
The relativity principle has a long history in physics, e.g., Galileo used the relativity of motion principle to argue against geocentricism and Newtonian mechanics is invariant under Galilean transformations. Einstein generalized Galileo's version of the relativity principle, "The laws of mechanics must be the same in all inertial reference frames" to "The laws of physics must be the same in all inertial reference frames," so he could apply it to the value of c from Maxwell's equations [43][44][45]. As Norton points out, Maxwell's discovery of c plus NPRF makes M4 of SR inevitable [43]. Here we will see that Planck's discovery of h plus NPRF makes the denumerable-dimensional Hilbert space of QM inevitable as well.
Quantum information theorists engaged in reconstructions of QM have stated specifically their desire to discover principles at the foundation of QM analogous to NPRF and light postulate at the foundation of SR [9,13,15,19,22,46,47,48]. If not the relativity principle specifically, at least principles that can "be translated back into language of physics" [11]. Of course, different reconstructions of QM contain different information-theoretic principles precisely because quantum information scientists "design algorithms and protocols at an abstract level, without considering whether they will be implemented with light, atoms or any other type of physical substrate" [19]. Nonetheless, they all reveal directly or indirectly that the key difference between classical and quantum probability theories resides in the continuity of reversible transformations between pure states (Section 2). In what is considered the first axiomatic reconstruction of QM [24], Hardy notes that by adding the single word "continuous" to his reversibility axiom one obtains quantum probability theory instead of classical probability theory [4]. Indeed, many authors emphasize this point [12,19,22,42], e.g., Koberinski & Müller write [23]: We suggest that (continuous) reversibility may be the postulate which comes closest to being a candidate for a glimpse on the genuinely physical kernel of "quantum reality". Even though Fuchs may want to set a higher threshold for a "glimpse of quantum reality", this postulate is quite surprising from the point of view of classical physics: when we have a discrete system that can be in a finite number of perfectly distinguishable alternatives, then one would classically expect that reversible evolution must be discrete too. For example, a single bit can only ever be flipped, which is a discrete indivisible operation. Not so in quantum theory: the state $|0\rangle$ of a qubit can be continuously-reversibly "moved over" to the state $|1\rangle$.
For people without knowledge of quantum theory (but of classical information theory), this may appear as surprising or "paradoxical" as Einstein's light postulate sounds to people without knowledge of relativity.
Our goal here is to show how this key difference between classical and quantum probability theories per the principle of Information Invariance & Continuity relates directly to an application of NPRF in spacetime.
Of course, as a "principle" account of QM the information-theoretic reconstructions do not provide "a constructive account of ontological structure" that many deem necessary for a true interpretation of QM [23,49]. Einstein noted the difference between "principle" and "constructive" theories in this famous passage [50]: We can distinguish various kinds of theories in physics. Most of them are constructive. They attempt to build up a picture of the more complex phenomena out of the materials of a relatively simple formal scheme from which they start out. [The kinetic theory of gases is an example.] ... Along with this most important class of theories there exists a second, which I will call "principle-theories." These employ the analytic, not the synthetic, method. The elements which form their basis and starting point are not hypothetically constructed but empirically discovered ones, general characteristics of natural processes, principles that give rise to mathematically formulated criteria which the separate processes or the theoretical representations of them have to satisfy. [Thermodynamics is an example.] ... The advantages of the constructive theory are completeness, adaptability, and clearness, those of the principle theory are logical perfection and security of the foundations. The theory of relativity belongs to the latter class.
Nearly every introductory physics textbook introduces SR via the relativity principle and light postulate without qualifying that introduction as somehow lacking an "interpretation." With few exceptions, physicists have come to accept the principles of SR without worrying about a constructive counterpart. Thus, a principle account of QM based on NPRF as with SR certainly constitutes an important advance in our understanding of QM. Perhaps prophetically, Bell said [51, p. 85]: I think the problems and puzzles we are dealing with here will be cleared up, and ... our descendants will look back on us with the same kind of superiority as we now are tempted to feel when we look at people in the late nineteenth century who worried about the aether. And Michelson-Morley ..., the puzzles seemed insoluble to them. And came Einstein in nineteen five, and now every schoolboy learns it and feels ... superior to those old guys. Now, it's my feeling that all this action at a distance and no action at a distance business will go the same way. But someone will come up with the answer, with a reasonable way of looking at these things. If we are lucky it will be to some big new development like the theory of relativity.
By revealing the relativity principle's role at the foundation of QM, information-theoretic reconstructions of QM have revealed what QM is telling us about Nature to no less an extent than SR. And, SR's principle explanation of Nature certainly constituted a "big new development" for physics in 1905. As emphasized by Fuchs, "Where present-day quantum-foundation studies have stagnated in the stream of history is not so unlike where the physics of length contraction and time dilation stood before Einstein's 1905 paper on special relativity" [5]. At that time, "Maxwellian physicists were ready to abandon the relativity of motion principle" [45] and even "Einstein was willing to sacrifice the greatest success of 19th century physics, Maxwell's theory, seeking to replace it by one conforming to an emission theory of light, as the classical, Galilean kinematics demanded" before realizing that such an emission theory would not work [43]. Thus, concerning his decision to produce a principle explanation instead of a constructive explanation for time dilation and length contraction, Einstein writes [52]: By and by I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. The longer and the more despairingly I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results. Therefore, being in a similar situation today with QM, it is not unreasonable to seek a compelling principle account of QM along the lines of SR. Again, a principle account of QM that maps to NPRF applied to h at its foundation would be as valuable to understanding QM as NPRF applied to c is to understanding SR and, as we will show, the information-theoretic reconstructions of QM entail exactly that understanding.
We start in Section 2 with an introduction to the relevant information-theoretic formalism on the qubit at the basis of the reconstructions of QM. This introduction is not a mathematically detailed exposition on the reconstructions of QM (for that see [53] and as related to this paper [54]). Rather, in this section we introduce only the key information-theoretic concepts associated with the qubit in the reconstructions of QM, as required to make our argument to the physicist interested in foundations of QM (but not necessarily familiar with quantum information theory). Virtually all undergraduate physics textbooks introduce the counterintuitive concepts of time dilation, length contraction, and the relativity of simultaneity using the relativity principle and the invariant measurement of c at the foundation of SR. Our goal here is to present an equally accessible introduction to the counterintuitive concept of the qubit using the relativity principle and the invariant measurement of h at the foundation of QM, as implied by the reconstructions of QM. Here we will see that the information-theoretic principles of Existence of an Information Unit and Continuous Reversibility [19], or in combined form Information Invariance & Continuity, already reveal a role for NPRF at the foundation of QM.
In Section 3, we finish our argument by looking at the role played by Planck's constant h in QM and its relation to the Existence of an Information Unit. In particular, we focus on three facts: QM obtains because $h \neq 0$, h is a universal constant of Nature, and Stern-Gerlach (SG) spin measurements constitute the invariant measurement of h. We then show why continuous reversibility in SG spin measurements of h is "quite surprising from the point of view of classical physics" [23], i.e., there is no constructive classical model for it and it leads to "average-only" projection of spin angular momentum. Most generally, Information Invariance & Continuity leads to "average-only" projection/transmission/... between the different reference frames of mutually complementary qubit measurements [8,55]. In Section 4, we review how this result leads to the counterintuitive "average-only" conservation characterizing quantum entanglement per the Bell states [56][57][58]. Thus, "average-only" conservation responsible for the Tsirelson bound is explained by conservation per Information Invariance & Continuity (conservation per NPRF in spacetime). In Section 5, we show how conservation per the relativity principle can be used to rule out the no-signalling, "superquantum" joint probabilities of Popescu & Rohrlich [31] in spacetime. We conclude in Section 6.

The Qubit and the Relativity Principle
We start by noting the term "quantum state" can refer to the probability amplitude vector, e.g., $|u\rangle$, or to the probability (density) matrix $\rho$. It will be clear which is meant by the context. Next, we review the difference between the classical bit and the qubit per Hardy [4], starting with the qubit.
In a 2-dimensional (2D) Hilbert space spanned by $|u\rangle$ and $|d\rangle$, a general state $|\psi\rangle$ is given by $|\psi\rangle = c_1|u\rangle + c_2|d\rangle$ with $c_1$ and $c_2$ complex and $|c_1|^2 + |c_2|^2 = 1$. In general, such 2D states are called qubits and the density matrix is given by $\rho = |\psi\rangle\langle\psi|$. In quantum information theory, these qubits represent an elementary piece of information for quantum systems; a quantum system is probed and one of two possible outcomes obtains, e.g., yes/no, up/down, pass/no pass, etc. The structure of all such binary systems from an information-theoretic perspective is identical.
A general Hermitian measurement operator in 2D Hilbert space has outcomes given by its (real) eigenvalues and can be written $\lambda_1|1\rangle\langle 1| + \lambda_2|2\rangle\langle 2|$, where $|1\rangle$ and $|2\rangle$ are the eigenstates for the eigenvalues $\lambda_1$ and $\lambda_2$, respectively. Any such Hermitian matrix $M$ can be expanded in the Pauli matrices

$$M = m_0 I + m_x\sigma_x + m_y\sigma_y + m_z\sigma_z$$

where $(m_0, m_x, m_y, m_z)$ are real. The eigenvalues of $M$ are given by

$$\lambda_{1,2} = m_0 \pm \sqrt{m_x^2 + m_y^2 + m_z^2}$$

We see that $m_x$, $m_y$, $m_z$ give two eigenvalues centered about $m_0$. All measurement operators with the same eigenvalues are related by SU(2) transformations given by some combination of $e^{i\Theta\sigma_j}$, where $j = \{x, y, z\}$ and $\Theta$ is an angle in Hilbert space. Any density matrix can be expanded in the same fashion

$$\rho = \frac{1}{2}\left(I + \rho_x\sigma_x + \rho_y\sigma_y + \rho_z\sigma_z\right)$$

where $(\rho_x, \rho_y, \rho_z)$ are real. The Bloch sphere is defined by $\rho_x^2 + \rho_y^2 + \rho_z^2 = 1$ with pure states residing on the surface of the sphere and mixed states residing inside the sphere, consistent with the fact that mixed states contain less information than pure states. Since we will be referring to spin-$\frac{1}{2}$ measurements and states later, we will denote our eigenstates $|u\rangle$ for spin up with eigenvalue +1 and $|d\rangle$ for spin down with eigenvalue −1. [See Figures 1 and 2 associated with the state space corresponding to $|u\rangle$ in the $\sigma_z$ basis.] The transformations relating various pure states on the sphere are continuously reversible so that in going from a pure state to a pure state one always passes through other pure states. This is distinctly different from the reversibility axiom between pure states for classical probability theory's fundamental unit of information, the classical bit, as noted above by Koberinski & Müller [23].
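The Pauli expansion and the Bloch-sphere picture described above can be checked numerically. The following is a minimal sketch, not part of the reconstructions themselves: the helper names (`hermitian_from_pauli`, `bloch_vector`) and the coefficient values are illustrative assumptions, and the rotation uses the common half-angle convention $e^{-i\theta\sigma_y/2}$, which may differ from the angle $\Theta$ used in the text by a factor and a sign.

```python
import numpy as np

# Pauli matrices and the 2x2 identity
I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def hermitian_from_pauli(m0, mx, my, mz):
    """Build M = m0*I + mx*sx + my*sy + mz*sz from four real coefficients."""
    return m0 * I2 + mx * SX + my * SY + mz * SZ

def bloch_vector(rho):
    """Return (rho_x, rho_y, rho_z) with rho_i = Tr(rho sigma_i)."""
    return np.array([np.trace(rho @ s).real for s in (SX, SY, SZ)])

# Illustrative coefficients (arbitrary values chosen for this sketch)
m0, mx, my, mz = 0.5, 1.0, -2.0, 2.0
M = hermitian_from_pauli(m0, mx, my, mz)
eigenvalues = np.linalg.eigvalsh(M)              # returned in ascending order
radius = np.sqrt(mx**2 + my**2 + mz**2)
expected = np.array([m0 - radius, m0 + radius])  # two values centered on m0

# Rotate |u> continuously toward |d> and record the purity Tr(rho^2)
# and the Bloch-vector length at each intermediate step.
u = np.array([1, 0], dtype=complex)
purities, lengths = [], []
for theta in np.linspace(0.0, np.pi, 7):
    U = np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * SY  # exp(-i theta sy / 2)
    psi = U @ u
    rho = np.outer(psi, psi.conj())
    purities.append(np.trace(rho @ rho).real)
    lengths.append(np.linalg.norm(bloch_vector(rho)))

print(eigenvalues, expected)
print(np.round(purities, 12), np.round(lengths, 12))
```

Every intermediate state in the loop has purity 1 and a unit-length Bloch vector, which is the numerical counterpart of passing from one pure state to another only through pure states.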
In classical probability theory, the only continuous way to get from one pure discrete state for a classical bit to the other pure state is through mixed states. For example, suppose we place a single ball in one of two boxes labeled 1 and 2 with probabilities $p_1$ and $p_2$, respectively. The probability state is given by the vector $p = (p_1, p_2)^T$.
[Figure 1 caption: Since this state space is isomorphic to 3-dimensional real space, the Bloch sphere is shown in a real space reference frame with its related Stern-Gerlach (SG) magnet orientations (see Knight [59, p. 1307] for an explanation of the SG experiment). The probability is given for a +1 outcome at the measurement direction shown [55]. Compare this with Figure 2.]
Notice that the state space for the classical bit is 1-dimensional and represented by a $2 \times 1$ matrix while that for the qubit is 3-dimensional and represented by a $2 \times 2$ matrix. In general, the dimension of the probability space for the generalized bit (gbit) of a general probability theory is $d = 2^s - 1$ with $s = 1, 2, 3, \ldots$ and the gbit is represented by a $2 \times 2 \times \cdots \times 2$ tensor (d equals the number of 2's) [60]. Having seen the fundamental difference between classical probability theory and quantum probability theory per their fundamental units of information, we now review how higher-dimensional Hermitian operators in Hilbert space are related to the qubit. The structure of the qubit is important because any higher-dimensional Hermitian matrix with the same eigenvalues can be constructed via SU(2) and the qubit from the diagonal version, as explained by Hardy [4].
For example, suppose you want to construct the $L_x$ measurement operator with eigenvalues +1, 0, −1 in the $L_z$ eigenbasis. To obtain the third member of the mutually complementary measurements, $L_y$, simply use the SU(2) transformation $e^{i\Theta\sigma_x}$ about the $|d\rangle$ axis with $\Theta = -90^\circ$, then again use the SU(2) transformation $e^{i\Theta\sigma_x}$ about the post-transformed $|u\rangle$ axis with $\Theta = 45^\circ$, and finally use the SU(2) transformation $e^{i\Theta\sigma_y}$ about the post-transformed $|0\rangle$ axis with $\Theta = 45^\circ$.
In this example, the Pauli matrices (σ x , σ y , σ z ) are clearly visible in the matrices (L x , L y , L z ), respectively. The completion of the reconstruction uses the tensor product ⊗ to add particles for any given dimension in accord with the "Locality" axiom [9] and the relevant dynamical transformation of the state for the Schrödinger equation [4]. [For more information on this mathematical structure and its extension to the probability states for a continuous random variable see [54].] But, we don't need to proceed further in the reconstruction process, as we have what we need to show the reconstruction of QM via Information Invariance & Continuity maps to NPRF giving rise to the invariant measurement of h in spacetime.
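As a hedged illustration of the 3-dimensional example, the sketch below writes down the standard textbook spin-1 matrices in the $L_z$ eigenbasis (with $\hbar = 1$) and checks that they share the eigenvalues +1, 0, −1 and are related by a continuous unitary rotation. It does not reproduce the step-by-step SU(2) construction per Hardy [4]; the matrices and the helper `rotation_about_y` are assumptions introduced only for this check.

```python
import numpy as np

# Standard spin-1 matrices in the L_z eigenbasis, in units of hbar = 1
# (textbook forms, assumed here for illustration).
s = 1 / np.sqrt(2)
LZ = np.diag([1, 0, -1]).astype(complex)
LX = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
LY = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    return a @ b - b @ a

def rotation_about_y(theta):
    """Return exp(-i theta L_y), built from the eigendecomposition of L_y."""
    vals, vecs = np.linalg.eigh(LY)
    return vecs @ np.diag(np.exp(-1j * theta * vals)) @ vecs.conj().T

# All three operators have the same outcomes (+1, 0, -1).
spectra = {name: np.linalg.eigvalsh(L)
           for name, L in (("Lx", LX), ("Ly", LY), ("Lz", LZ))}

# A continuous rotation by 90 degrees about y carries L_z into L_x.
U = rotation_about_y(np.pi / 2)
rotated = U @ LZ @ U.conj().T

print(spectra)
print(np.allclose(rotated, LX), np.allclose(commutator(LX, LY), 1j * LZ))
```

Varying the angle passed to `rotation_about_y` between 0 and $\pi/2$ gives a continuous family of operators with the same eigenvalues, interpolating between $L_z$ and $L_x$.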
To summarize, we see that the reconstruction of QM builds the denumerable-dimensional Hilbert space structure of QM foundationally upon a fundamental 2D object, the qubit. The difference between classical probability and quantum probability then resides in the fact that a pure state can be transformed into another pure state through other pure states in continuous fashion for the qubit, while that is not possible for the classical bit. Of course, this continuously reversible transformation property would also hold for a gbit with s > 2, so why does Nature prefer the qubit?
In order to argue for the qubit from the gbit, Masanes et al. employed "Continuous Reversibility, Tomographic Locality and Existence of an Information Unit, No Simultaneous Encoding, All Effects Are Observable, and Gbits Can Interact" [19,48]. In other words, arguing for the qubit while keeping to the very general information-theoretic principles is highly non-trivial. Dakic & Brukner presented an argument based on their "closeness requirement: the dynamics of a single elementary system can be generated by the invariant interaction between the system and a 'macroscopic transformation device' that is itself described within the theory in the macroscopic (classical) limit" [41,42]. This is due to the fact that the measuring devices used to measure quantum systems are themselves made from quantum systems. For example, the classical magnetic field of an SG magnet is used to measure the spin of spin-$\frac{1}{2}$ particles and that classical magnetic field "can be seen as a limit of a large coherent state, where a large number of spin-$\frac{1}{2}$ particles are all prepared in the same quantum state" [61].
Per Brukner & Zeilinger [8,55], if we identify the preparation state $|\psi\rangle = |u\rangle$ at $\hat{z}$ with the reference frame of mutually complementary spin measurements $[J_x, J_y, J_z]$ ($J_i = \frac{\hbar}{2}\sigma_i$), then the closeness requirement means our reference frame of mutually complementary measurements is $[\hat{x}, \hat{y}, \hat{z}]$ in real space. Thus, they depict the Bloch sphere in that real space reference frame with associated SG magnet orientations à la Figure 1 [55]. Unless otherwise noted, we also make this association throughout, so that "the reference frame of a complete set of mutually complementary measurements" is simply "the reference frame." While it may prove necessary to consider generalizations of QM that require higher-dimensional space in order to produce a theory of quantum gravity [41,62], the reconstruction of QM clearly shows that one may consider the qubit to reside at the foundation of the denumerable-dimensional Hilbert space structure of QM. And, as we will now see, the qubit structure already reveals a role for the relativity principle at the foundation of QM.
Again, by "relativity principle" we mean "The laws of physics must be the same in all inertial reference frames," aka NPRF. In SR, we are concerned with the fact that everyone measures the same speed of light c, regardless of their motion relative to the source (light postulate). Here the inertial reference frames are characterized by motion at constant velocity relative to the source and different reference frames are related by Lorentz boosts. Since c is a constant of Nature per Maxwell's equations, NPRF implies the light postulate (or "NPRF + c" for short) [44,59]. Thus, we see that NPRF + c resides at the foundation of SR.
Likewise, we have seen that different 2D Hilbert space measurement operators with the same outcomes are related by SU(2) transformations and that SU(2) transformations in Hilbert space map to SO(3) rotations between different reference frames in 3-dimensional real space (Information Invariance & Continuity). In information-theoretic terms, the total knowledge one has about the elementary system must be independent of how they choose to represent that knowledge [8]. Since spatial rotations, like Lorentz boosts, relate inertial reference frames, the information-theoretic qubit structure reveals a role for NPRF at the foundation of QM.
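The map from SU(2) transformations in Hilbert space to SO(3) rotations in real space can be made concrete with a short numerical sketch. It assumes the standard relation $R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i U \sigma_j U^\dagger)$ and the half-angle convention $U = e^{-i\theta\,\hat{n}\cdot\vec{\sigma}/2}$; the function names `su2` and `so3_from_su2` are hypothetical and chosen only for this illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = (SX, SY, SZ)

def su2(axis, angle):
    """Return U = exp(-i angle/2 n.sigma) for a (not necessarily unit) axis n."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * SX + n[1] * SY + n[2] * SZ
    return np.cos(angle / 2) * I2 - 1j * np.sin(angle / 2) * n_sigma

def so3_from_su2(U):
    """Return the 3x3 real matrix R with R_ij = 1/2 Tr(sigma_i U sigma_j U^dagger)."""
    Ud = U.conj().T
    return np.array([[0.5 * np.trace(si @ U @ sj @ Ud).real for sj in PAULI]
                     for si in PAULI])

# A 90-degree SU(2) rotation about z maps to the familiar real-space rotation.
R_z = so3_from_su2(su2([0, 0, 1], np.pi / 2))

# A generic axis and angle (arbitrary illustrative values).
U_generic = su2([1, 2, 2], 0.7)
R_generic = so3_from_su2(U_generic)

print(np.round(R_z, 12))
print(np.round(np.linalg.det(R_generic), 12))
```

Both `U_generic` and `-U_generic` give the same `R_generic`, reflecting the two-to-one character of the SU(2) to SO(3) map; the measurement outcomes (eigenvalues) are unchanged by either.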
Since we have so far only reviewed the very general structure of the qubit per information-theoretic principles, we don't as yet have a fundamental constant of Nature in play. However, we can already see that the qubit implies a role for the relativity principle (NPRF) in QM. To complete our analogy with SR and its NPRF + c, we need to relate all of this to the fundamental constant of Nature at the foundation of QM, i.e., Planck's constant h.

Planck's Constant and Spin
Planck introduced h in his explanation of blackbody radiation and we now understand that electromagnetic radiation with frequency f is comprised of indivisible quanta (photons) of energy hf . One difference between the classical view of a continuous electromagnetic field and the quantum reality of photons is manifested in polarization measurements. According to classical electromagnetism, there is no non-zero lower limit to the energy of polarized electromagnetic radiation that can be transmitted by a polarizing filter. However, given that the radiation is actually composed of indivisible photons, there is a non-zero lower limit to the energy passed by a polarizing filter, i.e., each quantum of energy hf either passes or it doesn't. Thus, we understand that the classical "expectation" of fractional amounts of quanta can only obtain on average per the quantum reality, so we expect the corresponding quantum theory will be probabilistic. In information-theoretic terms, a system is composed fundamentally of discrete units of finite information (the qubit). Since the qubit contains finite information, it cannot contain enough information to account for the outcomes of every possible measurement done on it. Thus, a theory of qubits must be probabilistic [61,63,64]. Of course, the relationship between classical and quantum mechanics per its expectation values is another textbook result, e.g., the Ehrenfest theorem.
And, the fact that classical results are obtained from quantum results for h → 0 is common knowledge. In information-theoretic terms, h represents "a universal limit on how much simultaneous information is accessible to an observer" [17]. For example, $[X, P] = i\hbar$ means there is a trade-off between what one can know simultaneously about the position and momentum of a quantum system. If h = 0, as in classical mechanics, there is no such limit to this simultaneous knowledge. For spin-$\frac{1}{2}$ measurements $J_i$, we have $[J_x, J_y] = i\hbar J_z$, cyclic. So we see that $h \neq 0$ in this case corresponds to the existence of a set of mutually complementary spin measurements associated with the reference frame shown in Figure 1 (more on this in Section 6).
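The cyclic commutation relations for the spin-$\frac{1}{2}$ operators can be verified directly. This is a small sketch assuming $J_i = \frac{\hbar}{2}\sigma_i$ and working in units where $\hbar = 1$ by default; the function `spin_operators` and its `hbar` parameter are illustrative names, not anything defined in the text.

```python
import numpy as np

def spin_operators(hbar=1.0):
    """Return (Jx, Jy, Jz) with J_i = (hbar/2) sigma_i."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    return tuple(0.5 * hbar * s for s in (sx, sy, sz))

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    return a @ b - b @ a

hbar = 1.0  # units with hbar = 1 (an assumption of this sketch)
Jx, Jy, Jz = spin_operators(hbar)

# [Jx, Jy] = i hbar Jz and its cyclic permutations
cyclic_ok = all(
    np.allclose(commutator(a, b), 1j * hbar * c)
    for a, b, c in ((Jx, Jy, Jz), (Jy, Jz, Jx), (Jz, Jx, Jy))
)

# Each J_i has the same two outcomes, -hbar/2 and +hbar/2
outcomes = [np.linalg.eigvalsh(J) for J in (Jx, Jy, Jz)]

print(cyclic_ok)
print(outcomes)
```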
Given that h is a constant of Nature, NPRF dictates that everyone measure the same value for it ("Planck postulate" in analogy with the light postulate) and the measurement of spin via SG magnets constitutes a measurement of h [65]. Again, the ±1 eigenvalues of the Pauli matrices correspond to $\pm\frac{\hbar}{2}$ for spin-$\frac{1}{2}$ measurement outcomes. Thus, the general result from Information Invariance & Continuity concerning the SO(3) invariance of measurement outcomes for a qubit implies NPRF + h (relativity principle → Planck postulate) for QM in total analogy to NPRF + c (relativity principle → light postulate) for SR. One consequence of the continuously reversible movement of one qubit state to another when referring to an SG spin measurement is "average-only" projection.
Suppose we create a preparation state oriented along the positive z axis as in Figure 1, i.e., $|\psi\rangle = |u\rangle$, so that our "intrinsic" angular momentum is $\vec{S} = +1\hat{z}$ (in units of $\frac{\hbar}{2} = 1$). Now proceed to make a measurement with the SG magnets oriented at $\hat{b}$ making an angle $\theta$ with respect to $\hat{z}$ (Figure 2). According to the constructive account of classical physics [59,66] (Figure 4), we expect to measure $\vec{S}\cdot\hat{b} = \cos(\theta)$ (Figure 5), but we cannot measure anything other than ±1 due to NPRF (contra the prediction by classical physics). As a consequence, we can only recover $\cos(\theta)$ on average, i.e., NPRF dictates "average-only" projection

$$(+1)P(+1 \mid \theta) + (-1)P(-1 \mid \theta) = \cos(\theta) \qquad (3)$$

Of course, this is precisely $\langle\sigma\rangle$ per QM. Eq. (3) with our normalization condition $P(+1 \mid \theta) + P(-1 \mid \theta) = 1$ then gives

$$P(+1 \mid \theta) = \cos^2\left(\frac{\theta}{2}\right)$$

and

$$P(-1 \mid \theta) = \sin^2\left(\frac{\theta}{2}\right)$$

again, precisely in accord with QM. And, if we identify the preparation state $|\psi\rangle = |u\rangle$ at $\hat{z}$ with the reference frame of mutually complementary spin measurements $[J_x, J_y, J_z]$, then the SG measurement at $\hat{b}$ constitutes a reference frame of mutually complementary measurements rotated by $\theta$ in real space relative to the reference frame of the preparation state (Figure 6). Thus, "average-only" projection follows from Information Invariance & Continuity when applied to SG measurements in real space. The fact that one obtains ±1 outcomes at some SG magnet orientation is not mysterious per se; it can be accounted for by the classical constructive model in Figure 4. The constructive account of the ±1 outcomes would be one of particles with "intrinsic" angular momenta and therefore "intrinsic" magnetic moments [59] oriented in two opposite directions in space, parallel or anti-parallel to the magnetic field.
Given this constructive account of the ±1 outcomes at this particular SG magnet orientation, we would then expect that the varying orientation of the SG magnetic field with respect to the magnetic moments, created as we rotate our SG magnets, would cause the degree of deflection to vary. Indeed, this is precisely the constructive account that led some physicists to expect all possible deflections for the particles as they passed through the SG magnets, having assumed that these particles would be entering the SG magnetic field with random orientations of their "intrinsic" magnetic moments [66] (Figure 4). But according to this constructive account, if the ±1 outcomes constitute a measurement of h in accord with the rest of quantum physics, then our rotated orientations would not be giving us the value for h required by quantum physics otherwise. Indeed, a rotation of $90^\circ$ would yield absolutely no deflection at all (akin to measuring the speed of a light wave as zero when moving through the aether at speed c). That would mean our original SG magnet orientation would constitute a preferred frame in violation of the relativity principle, NPRF. Essentially, as Michelson and Morley rotated their interferometer the constructive model predicted they would see a change in the interference pattern [67], but instead they saw no change in the interference pattern in accord with NPRF. Likewise, as Stern and Gerlach rotated their magnets the constructive model predicted they would see a change in the deflection pattern, but instead they saw no change in the deflection pattern in accord with NPRF.
[Figure 4 caption: If the atoms enter with random orientations of their "intrinsic" magnetic moments (due to their "intrinsic" angular momenta), the SG magnets should produce all possible deflections, not just the two that are observed [59,66].]
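The "average-only" projection discussed above lends itself to a short numerical illustration. The sketch below solves the averaging condition together with normalization for the two probabilities, compares the result with the Born rule for $|u\rangle$ measured along a direction $\hat{b}$ in the xz plane, and then simulates individual ±1 outcomes. The function names, the angle, the sample size, and the random seed are all assumptions made for this example.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def outcome_probabilities(theta):
    """Solve (+1)P+ + (-1)P- = cos(theta) with P+ + P- = 1."""
    p_plus = 0.5 * (1.0 + np.cos(theta))
    return p_plus, 1.0 - p_plus

def born_probability_plus(theta):
    """Born-rule probability of +1 for |u> measured along b = (sin theta, 0, cos theta)."""
    sigma_b = np.sin(theta) * SX + np.cos(theta) * SZ
    vals, vecs = np.linalg.eigh(sigma_b)
    plus = vecs[:, np.argmax(vals)]   # eigenvector with eigenvalue +1
    return abs(plus[0]) ** 2          # |<+b|u>|^2 with |u> = (1, 0)

theta = np.deg2rad(60.0)              # illustrative SG magnet angle
p_plus, p_minus = outcome_probabilities(theta)

# Every individual outcome is +1 or -1; only the average recovers cos(theta).
rng = np.random.default_rng(0)
samples = rng.choice([1, -1], size=200_000, p=[p_plus, p_minus])
sample_mean = samples.mean()

print(p_plus, np.cos(theta / 2) ** 2, born_probability_plus(theta))
print(sample_mean, np.cos(theta))
```

No single trial ever yields the classical projection $\cos(\theta)$; it appears only in `sample_mean`, which is the sense in which the projection holds on average only.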
We next review what this "average-only" projection per NPRF + h (or more generally, per Information Invariance & Continuity) tells us about entanglement via the Bell states.

Implication for Entanglement via the Bell States
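Since the qubit forms the foundation of all (finite) denumerable-dimensional QM built in composite fashion, the most fundamental entangled states are the Bell states given by
$$
\begin{aligned}
|\psi^-\rangle &= \frac{|u\rangle \otimes |d\rangle - |d\rangle \otimes |u\rangle}{\sqrt{2}} \\
|\psi^+\rangle &= \frac{|u\rangle \otimes |d\rangle + |d\rangle \otimes |u\rangle}{\sqrt{2}} \\
|\phi^-\rangle &= \frac{|u\rangle \otimes |u\rangle - |d\rangle \otimes |d\rangle}{\sqrt{2}} \\
|\phi^+\rangle &= \frac{|u\rangle \otimes |u\rangle + |d\rangle \otimes |d\rangle}{\sqrt{2}}
\end{aligned}
$$
in the $\sigma_z$ eigenbasis. These correspond to the following density matrices
$$
\begin{aligned}
\rho_{\text{Sing}} = |\psi^-\rangle\langle\psi^-| &= \frac{1}{4}\left(I - \sigma_x \otimes \sigma_x - \sigma_y \otimes \sigma_y - \sigma_z \otimes \sigma_z\right) \\
|\psi^+\rangle\langle\psi^+| &= \frac{1}{4}\left(I + \sigma_x \otimes \sigma_x + \sigma_y \otimes \sigma_y - \sigma_z \otimes \sigma_z\right) \\
|\phi^-\rangle\langle\phi^-| &= \frac{1}{4}\left(I - \sigma_x \otimes \sigma_x + \sigma_y \otimes \sigma_y + \sigma_z \otimes \sigma_z\right) \\
|\phi^+\rangle\langle\phi^+| &= \frac{1}{4}\left(I + \sigma_x \otimes \sigma_x - \sigma_y \otimes \sigma_y + \sigma_z \otimes \sigma_z\right)
\end{aligned}
\tag{7}
$$
As always, Alice and Bob are making their measurements on each of the two Bell state particles. If Alice makes her spin measurement $\sigma_1$ with her SG magnets oriented in the $\hat{a}$ direction and Bob makes his spin measurement $\sigma_2$ with his SG magnets oriented in the $\hat{b}$ direction, then
$$\sigma_1 = \hat{a}\cdot\vec{\sigma} = a_x\sigma_x + a_y\sigma_y + a_z\sigma_z, \qquad \sigma_2 = \hat{b}\cdot\vec{\sigma} = b_x\sigma_x + b_y\sigma_y + b_z\sigma_z$$
The first state $|\psi^-\rangle$ is called the "singlet state" and it is invariant under any of the SU(2) transformations $e^{i\Theta\sigma_x}$, $e^{i\Theta\sigma_y}$, or $e^{i\Theta\sigma_z}$, corresponding to rotations of the SG magnets about the x, y, or z axes, respectively (for computational details applicable to this section, see [57,58]). This fact aligns with the signature of $\rho_{\text{Sing}}$ in Eq. (7). $|\psi^-\rangle$ represents a total conserved spin angular momentum of zero (S = 0) for the two particles involved, i.e., Alice and Bob always obtain opposite outcomes (ud or du) when making the same measurement. The other three states are called the "triplet states" and they are invariant under the SU(2) transformations $e^{i\Theta\sigma_z}$, $e^{i\Theta\sigma_x}$, and $e^{i\Theta\sigma_y}$, respectively. Again, note the correspondence with the signatures in Eq. (7), respectively. They represent a total conserved spin angular momentum of one (S = 1, in units of $\hbar = 1$) in each of the spatial planes xy ($|\psi^+\rangle$), yz ($|\phi^-\rangle$), and xz ($|\phi^+\rangle$). [To see this for $|\psi^+\rangle$, you have to transform the state to either the $\sigma_x$ or $\sigma_y$ eigenbasis where it has the same form as $|\phi^-\rangle$ or $|\phi^+\rangle$, respectively [57].] Thus, Alice and Bob always obtain the same outcomes (uu or dd) when measuring at the same angle in the symmetry plane of the relevant triplet state, i.e., when they share the same reference frame. In all four cases, the entanglement represents the conservation of spin angular momentum for the process creating the state.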
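The link between each Bell state and the signs of its same-axis correlations can be checked with a few lines of linear algebra. The sketch below is only illustrative: it assumes the standard form of the four Bell states, the basis ordering |uu>, |ud>, |du>, |dd>, and helper names (`kron`, `expectation`, `signature`) that are ours.

```python
import math

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def expectation(state, op):
    """Real part of <state|op|state> for a 4-component state vector."""
    op_state = [sum(op[r][c] * state[c] for c in range(4)) for r in range(4)]
    return sum(complex(state[r]).conjugate() * op_state[r] for r in range(4)).real


def signature(state):
    """Rounded correlations (<sx sx>, <sy sy>, <sz sz>) for a two-qubit state."""
    return tuple(round(expectation(state, kron(s, s))) for s in (SX, SY, SZ))


S = 1.0 / math.sqrt(2.0)
# Assumed standard Bell states in the basis order |uu>, |ud>, |du>, |dd>.
BELL_STATES = {
    "psi-": [0.0, S, -S, 0.0],
    "psi+": [0.0, S, S, 0.0],
    "phi-": [S, 0.0, 0.0, -S],
    "phi+": [S, 0.0, 0.0, S],
}

if __name__ == "__main__":
    for name, state in BELL_STATES.items():
        print(name, signature(state))
```

Under these assumptions the singlet is anti-correlated along all three axes, while each triplet state is correlated (+1) along the two axes spanning its symmetry plane and anti-correlated along the axis normal to it.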
Suppose Alice and Bob are making measurements on their particles in the symmetry plane of a triplet state such that $\hat{a}\cdot\hat{b} = \cos(\theta)$. Partition the data according to Alice's equivalence relation (her ±1 outcomes) and look at her +1 outcomes. Since we know Bob would have also measured +1 if θ had been zero (i.e., if Bob was in the same reference frame), we have exactly the same classical expectation depicted in Figures 2 & 5 for the single qubit measurement, i.e., Bob's outcomes for those trials must average to cos(θ). NPRF also requires that Alice and Bob each observe +1 half of the time and −1 half of the time, and that $P(-1, +1 \mid \theta) = P(+1, -1 \mid \theta)$, so we have
$$2P(+1, +1 \mid \theta)(+1) + 2P(+1, -1 \mid \theta)(-1) = \cos(\theta)$$
$$P(+1, +1 \mid \theta) + P(+1, -1 \mid \theta) = \frac{1}{2}$$
$$P(-1, +1 \mid \theta) + P(-1, -1 \mid \theta) = \frac{1}{2}$$
We can now solve these for the joint probabilities
$$P(+1, +1 \mid \theta) = P(-1, -1 \mid \theta) = \frac{1}{2}\cos^2\left(\frac{\theta}{2}\right) \tag{11}$$
and
$$P(+1, -1 \mid \theta) = P(-1, +1 \mid \theta) = \frac{1}{2}\sin^2\left(\frac{\theta}{2}\right) \tag{12}$$
in accord with QM.
In other words, while spin angular momentum is conserved exactly when Alice and Bob are making measurements in the same reference frame, it is conserved only on average when they are making measurements in different reference frames (related by SO(3) as shown in Figure 6). This "surprising" result is a direct consequence of NPRF + h, exactly as length contraction and time dilation are a direct consequence of NPRF + c. And, we could also partition the data according to Bob's equivalence relation (his ±1 results), so that it is Bob who claims Alice must average her results to satisfy "average-only" conservation. This is totally analogous to the relativity of simultaneity in SR. There, Alice partitions spacetime per her equivalence relation (her surfaces of simultaneity) and says Bob's meter sticks are short and his clocks run slow, while Bob can say the same thing about Alice's meter sticks and clocks per his surfaces of simultaneity.
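The symmetry between Alice's and Bob's partitions of the data can be made concrete with the joint probabilities of Eqs. (11) and (12). The following is a minimal sketch; the function names are hypothetical and chosen only for illustration.

```python
import math


def joint_probabilities(theta):
    """Joint probabilities of Eqs. (11) and (12), keyed by (Alice outcome, Bob outcome)."""
    same = 0.5 * math.cos(theta / 2.0) ** 2
    diff = 0.5 * math.sin(theta / 2.0) ** 2
    return {(+1, +1): same, (-1, -1): same, (+1, -1): diff, (-1, +1): diff}


def bob_average_given_alice(probs, alice_outcome):
    """Average of Bob's outcomes over the trials where Alice obtained alice_outcome."""
    weight = sum(p for (a, _), p in probs.items() if a == alice_outcome)
    total = sum(b * p for (a, b), p in probs.items() if a == alice_outcome)
    return total / weight


def alice_average_given_bob(probs, bob_outcome):
    """Average of Alice's outcomes over the trials where Bob obtained bob_outcome."""
    weight = sum(p for (_, b), p in probs.items() if b == bob_outcome)
    total = sum(a * p for (a, b), p in probs.items() if b == bob_outcome)
    return total / weight


if __name__ == "__main__":
    theta = math.radians(60)
    probs = joint_probabilities(theta)
    print("cos(theta)             =", round(math.cos(theta), 6))
    print("Bob's mean | Alice +1  =", round(bob_average_given_alice(probs, +1), 6))
    print("Bob's mean | Alice -1  =", round(bob_average_given_alice(probs, -1), 6))
    print("Alice's mean | Bob +1  =", round(alice_average_given_bob(probs, +1), 6))
```

Whichever observer's outcomes are used to partition the trials, the other observer's outcomes average to ±cos(θ), which is the sense in which neither reference frame is preferred.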
Of course, there is nothing unique about SG spin measurements except they can be considered direct measurements of h. The more general instantiation of NPRF per Information Invariance & Continuity described in Section 2 applies to any qubit. So, for example, if we are again talking about photons passing or not passing through a polarizing filter, we would have "average-only" transmission for photons instead of "average-only" projection for spin-$\frac{1}{2}$ particles, both of which give "average-only" conservation of spin angular momentum between the reference frames of different mutually complementary measurements. So, most generally, the information-theoretic principle of Information Invariance & Continuity leads to "average-only" (fill in the blank) giving "average-only" conservation of the measured quantity for its Bell states.
We should point out that the trial-by-trial outcomes for this "average-only" conservation can deviate substantially from the target value required for explicit conservation per Alice or Bob's reference frame. For example, we might have Bob's +1 and −1 outcomes averaging to zero as required for the conservation of spin angular momentum per Alice's reference frame. Thus, Alice says Bob's measurement outcomes are violating the conservation of spin angular momentum as egregiously as possible on a trial-by-trial basis. However, from the perspective of Bob's reference frame, it is Alice's +1 and −1 outcomes averaging to zero that violate the conservation of spin angular momentum as egregiously as possible on a trial-by-trial basis. In classical physics, our conservation laws hold on average because they hold explicitly for each and every trial of the experiment (within experimental limits). But here, that would require a preferred reference frame. Thus, "average-only" conservation distinguishes classical mechanics and QM just as the relativity of simultaneity distinguishes Newtonian mechanics and SR. Consequently, we see that "average-only" conservation does not resolve the mystery of quantum entanglement, it is the mystery, i.e., it is what needs to be explained.
What we've seen here is that we can explain "average-only" conservation as conservation that results necessarily from Information Invariance & Continuity, which is conservation per NPRF in spacetime, precisely as the relativity of simultaneity results necessarily from NPRF + c. Of course, this then explains why quantum joint probabilities for the Bell states violate the Clauser-Horne-Shimony-Holt (CHSH) inequality precisely to the Tsirelson bound [56,68,69,70]. That fact obtains because the quantum joint probabilities for the Bell states are precisely those that satisfy conservation in accord with NPRF. In contrast, classical probability theory would satisfy the CHSH inequality by requiring a preferred reference frame, thereby violating the invariant measurement of a fundamental constant of Nature h. Thus, the reconstructions of QM reveal the relativity principle at the foundation of QM precisely as it exists at the foundation of SR, i.e., demanding the invariant measurement of a fundamental constant of Nature. We now show how this bears on the no-signalling, "superquantum" joint probabilities of Popescu & Rohrlich.
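To see the Tsirelson bound emerge from the joint probabilities of Eqs. (11) and (12), one can evaluate the CHSH combination numerically. The sketch below assumes one common sign convention for the CHSH quantity, E(a,b) + E(a,b') + E(a',b) - E(a',b'), and measurement directions specified by angles in the symmetry plane of a triplet state; both choices, and all function names, are ours.

```python
import math
from itertools import product


def correlation(theta):
    """E = P(same) - P(different) from Eqs. (11)-(12); this equals cos(theta)."""
    p_same = math.cos(theta / 2.0) ** 2
    p_diff = math.sin(theta / 2.0) ** 2
    return p_same - p_diff


def chsh(a, a_prime, b, b_prime):
    """CHSH combination (assumed convention) for setting angles in the symmetry plane."""
    return (correlation(a - b) + correlation(a - b_prime)
            + correlation(a_prime - b) - correlation(a_prime - b_prime))


def grid_maximum(steps=24):
    """Largest |CHSH| on a grid of angles, fixing a = 0 by rotational symmetry."""
    angles = [2.0 * math.pi * k / steps for k in range(steps)]
    return max(abs(chsh(0.0, ap, b, bp)) for ap, b, bp in product(angles, repeat=3))


if __name__ == "__main__":
    value = chsh(0.0, math.pi / 2, math.pi / 4, -math.pi / 4)
    print("CHSH at the chosen settings:", value)
    print("Tsirelson bound 2*sqrt(2): ", 2.0 * math.sqrt(2.0))
    print("Grid maximum of |CHSH|:    ", grid_maximum())
```

With these settings the quantum probabilities exceed the classical limit of 2 and reach, but on the sampled grid never exceed, 2√2.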

Ruling Out Superquantum PR-Box Probabilities
In this section, we apply conservation per NPRF in spacetime (implied by conservation per Information Invariance & Continuity) to answer an important question in quantum information theory, i.e., why don't we find any so-called "superquantum" joint probabilities in Nature? That is, assuming only that joint probabilities cannot permit faster-than-light signalling, Popescu & Rohrlich introduced these "no-signalling" joint probabilities (the "PR-box") [31] that violate the CHSH inequality beyond the Tsirelson bound of QM
$$
\begin{aligned}
P(+1, +1 \mid a, b) &= P(-1, -1 \mid a, b) = \frac{1}{2} \\
P(+1, +1 \mid a, b') &= P(-1, -1 \mid a, b') = \frac{1}{2} \\
P(+1, +1 \mid a', b) &= P(-1, -1 \mid a', b) = \frac{1}{2} \\
P(+1, -1 \mid a', b') &= P(-1, +1 \mid a', b') = \frac{1}{2}
\end{aligned}
\tag{13}
$$
Here, each of a, a', b, and b' is a measurement setting (H or T in what follows). "No-signalling" means that, for a given experimental setting, one observer's outcome is independent of the other observer's setting. For example, suppose that Alice chooses setting a. Then no-signalling implies that her observed outcome (+1 or −1) is independent of whether Bob chooses setting b or b'. Specifically, the probability that Alice observes +1 given that she chooses setting a and Bob chooses setting b is $P(+1, +1 \mid a, b) + P(+1, -1 \mid a, b) = 1/2$ and this must equal the probability that she observes +1 if Bob chooses b' instead, $P(+1, +1 \mid a, b') + P(+1, -1 \mid a, b') = 1/2$. Quantum information theorists quickly realized that if the joint probabilities in Eq. (13) could be instantiated physically, they would provide an "unreasonably effective" means of communication. Let us follow Bub & Bub's example [71] of a "quantum guessing game" to illustrate this point.
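Both properties of the PR-box mentioned above (no-signalling marginals and a CHSH value beyond the Tsirelson bound) can be verified by direct enumeration. This is a minimal sketch assuming the standard form of the PR-box, in which the outcomes agree for every pair of settings except (a', b'), and the CHSH convention E(a,b) + E(a,b') + E(a',b) - E(a',b'); the encoding of settings as 0 (unprimed) and 1 (primed) is ours.

```python
def pr_box(x, y):
    """Joint distribution over (Alice, Bob) outcomes for settings x, y in {0, 1}.

    Assumed encoding: 0 is the unprimed setting (a or b), 1 the primed one (a' or b').
    """
    if x == 1 and y == 1:
        return {(+1, -1): 0.5, (-1, +1): 0.5}  # opposite outcomes only for (a', b')
    return {(+1, +1): 0.5, (-1, -1): 0.5}      # equal outcomes otherwise


def alice_marginal(x, y, outcome):
    """Probability that Alice observes `outcome` when the settings are (x, y)."""
    return sum(p for (a, _), p in pr_box(x, y).items() if a == outcome)


def bob_marginal(x, y, outcome):
    """Probability that Bob observes `outcome` when the settings are (x, y)."""
    return sum(p for (_, b), p in pr_box(x, y).items() if b == outcome)


def correlation(x, y):
    """Expectation value of the product of the two outcomes."""
    return sum(a * b * p for (a, b), p in pr_box(x, y).items())


def chsh():
    """CHSH combination under the assumed sign convention."""
    return correlation(0, 0) + correlation(0, 1) + correlation(1, 0) - correlation(1, 1)


if __name__ == "__main__":
    for x in (0, 1):
        print("Alice setting", x, "P(+1) for Bob's settings 0, 1:",
              alice_marginal(x, 0, +1), alice_marginal(x, 1, +1))
    print("CHSH value:", chsh())
```

Each observer's marginal is 1/2 regardless of the other's setting, yet the CHSH value is 4, the algebraic maximum, which exceeds 2√2.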
[Keep in mind at this point we are simply trying to understand how each quoin might behave deterministically to explain the Quoin Mechanics of Table 1.] To show this, they note there are only four ways to "rig" a quoin:
(i) No matter how it starts, it ends up heads (H).
(ii) No matter how it starts, it ends up tails (T).
(iii) It stays the same way it starts (S for same).
(iv) It changes from the way it starts (O for other).
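With only four riggings per quoin, the search for a deterministic account is small enough to run exhaustively. The sketch below assumes that Quoin Mechanics means unequal outcomes for an HH start and equal outcomes for HT, TH, and TT starts, and that each quoin's rigging depends only on its own starting face; the dictionary and function names are ours.

```python
from itertools import product

# The four riggings, each mapping a starting face to a final face.
RIGGINGS = {
    "H": lambda start: "H",                           # (i) always ends up heads
    "T": lambda start: "T",                           # (ii) always ends up tails
    "S": lambda start: start,                         # (iii) stays the same
    "O": lambda start: "T" if start == "H" else "H",  # (iv) changes to the other face
}


def obeys_quoin_mechanics(rig_alice, rig_bob):
    """True if this pair of riggings gives unequal outcomes for HH and equal outcomes otherwise."""
    for start_alice, start_bob in product("HT", repeat=2):
        end_alice = RIGGINGS[rig_alice](start_alice)
        end_bob = RIGGINGS[rig_bob](start_bob)
        must_differ = (start_alice == "H" and start_bob == "H")
        if (end_alice != end_bob) != must_differ:
            return False
    return True


def working_riggings():
    """All (Alice rigging, Bob rigging) pairs that reproduce the assumed Quoin Mechanics."""
    return [(ra, rb) for ra, rb in product(RIGGINGS, repeat=2)
            if obeys_quoin_mechanics(ra, rb)]


if __name__ == "__main__":
    print("Pairs checked:", len(RIGGINGS) ** 2)
    print("Pairs that work:", working_riggings())
```

None of the sixteen pairs of riggings survives the check, in line with the argument that follows.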
They then list all the possible riggings for a heads-heads (HH) start that yield non-equal outcomes and all the possible riggings for a heads-tails (HT) or tails-heads (TH) or a tails-tails (TT) start that yield equal outcomes. As a result, it is apparent that there is no rigging that can yield the Quoin Mechanics of Table 1. Therefore, we simply have to accept Quoin Mechanics without knowing how they are instantiated in order to explore their implications for the Bubs' quantum guessing game. Note, Mermin showed something similar in trying to explain quantum joint probabilities for the spin triplet state using "instruction sets" [57,72,73]. In that case for quantum probabilities, as with these superquantum probabilities, the instruction sets didn't work and we were left with no "constructive account" of those quantum probabilities. Of course, the difference is that while we don't have any (consensus) "causal mechanism" to explain the quantum probabilities for the Bell states, we do instantiate them in the lab. And, as we just showed in Section 4, those quantum probabilities do obey a very reasonable conservation principle (conservation per NPRF) while we will see that the PR-box probabilities violate that principle. Now we show how these conservation-violating PR-box probabilities are "miraculous" in their information exchanging capability. We start with the game itself.
As you can see in Table 2, there are 5 lanes on each side of the barrier, which does not allow Alice to see Bob's values and vice-versa. The lanes are numbered 1 through 5 and the game is started by the dealer setting values of 1 or 0 for each player in each of the five lanes (Table 2). After the dealer has set these ten values, we see that there are some lanes that have a 1 on both sides of the barrier (lanes 1 and 4 in Table 2). Alice is the "guesser" and because of the barrier she doesn't know what Bob has in his lanes. Her job is to guess whether there are an even number of lanes that have a 1 on both sides (as is the case in Table 2) or an odd number of such lanes. The only way she can know the answer with certainty is if the dealer were to set all five of her lanes to 0 in which case she knows the answer is even, i.e., there are zero lanes that have a 1 on both sides. Of course, the dealer isn't going to do that, so she will always have at least one 1 on her side. For all such cases, it is not difficult to convince yourself that the probability of the right answer being odd(even) is 50%(50%). Alice and Bob buy six poker chips to play and the House doubles their chips for a correct guess. [They turn their poker chips back into money when they leave the Quasino of course.] If Alice guesses wrong, they lose all their chips to the House. Since the odds are 50-50 of guessing right, Alice and Bob end up breaking even if they continue to play, meaning they'll not win or lose money overall on average.
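The 50-50 claim can be confirmed by brute force. The following sketch assumes the dealer is equally likely to choose any of the 32 configurations for Bob's lanes (an assumption on our part) and counts, for each of Alice's configurations, how many of them make the answer "odd".

```python
from itertools import product


def shared_ones_parity(alice, bob):
    """0 if an even number of lanes hold a 1 on both sides, 1 if odd."""
    return sum(a & b for a, b in zip(alice, bob)) % 2


def odd_count(alice):
    """Number of Bob configurations for which the correct answer is 'odd'."""
    return sum(shared_ones_parity(alice, bob)
               for bob in product((0, 1), repeat=len(alice)))


if __name__ == "__main__":
    for alice in product((0, 1), repeat=5):
        print(alice, "odd in", odd_count(alice), "of 32 Bob configurations")
```

Every configuration in which Alice holds at least one 1 splits Bob's possibilities evenly between "odd" and "even"; only the all-zero configuration lets her answer with certainty.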
But, there is another aspect of the game that we have not told you: Bob can spend one poker chip to send one bit of information to Alice. In that case, Alice and Bob have five chips remaining and if they win, the House pays them five chips, i.e., the House doubles their remaining poker chips. Likewise, Bob can spend two chips to send two bits of information to Alice which means the House would pay four chips if they win. Once they have to spend three chips to win, the House pays three chips and they break even monetarily by winning the game. For example, Alice could ask Bob to send her the values in his lanes 1, 4, and 5 of Table 2, so she can see if he has any 1's in those lanes. That's all the information she needs, since she has 0's in lanes 2 and 3. Bob sends the bits 1, 1, and 0 which tell her they have just two lanes (lanes 1 and 4) where they both have a 1, so the answer is "even." They win the game and three chips, but they spent three chips to send the three bits of information, so they broke even monetarily. You can easily convince yourself that this strategy would actually end up losing money in the long run.
Now it's time to show that if Alice and Bob use quoins, then they can win this game every time by merely passing one bit of information each game, thereby winning five chips every game, i.e., netting four chips every game. Here is the Bubs' strategy. Alice and Bob start with five pairs of quoins. [Don't confuse these with the six chips they bought from the House to play the game; they brought these five pairs of quoins with them to the Quasino.] Again, Alice is the guesser, so it is Bob who will be sending the one bit of information. Alice's quoins are labeled 1A, 2A, 3A, 4A, 5A and Bob's are labeled 1B, 2B, 3B, 4B, 5B. Quoins 1A and 1B are entangled per Quoin Mechanics, as are 2A and 2B, etc. The number on each quoin corresponds to each lane of the game. For all lanes in which Alice has a 1, she flips the corresponding quoin starting with H. In all lanes with a 0, she flips the corresponding quoin starting with T. Bob does likewise with his quoins for his values and lanes. Bob then pays one chip to send Alice one bit of information, i.e., 1 if he has an odd number of H's and 0 if he has an even number of H's. From this one bit of information, Alice now knows with certainty whether they have an even or odd number of lanes with two 1's. The strategy is simple although the reasoning behind it is not trivial.
The key is to observe that the individual outcomes of Alice and Bob's quoin tosses do not really matter, but their pairs do matter. In any lane with two 1's, Alice and Bob together observe a total of one H after they flip their quoins for that lane (Quoin Mechanics), which is odd. In the other lanes, they observe an even number (0 or 2) of H's (Quoin Mechanics). Therefore, their combined count of H's is even if and only if there are an even number of lanes with two 1's. Neither player by themselves knows whether the combined number of H's is even or odd because each person can only see the outcomes of his/her own quoins. But, all Alice needs to know after flipping her five quoins is whether or not Bob has an even number of H's or an odd number of H's, and he can send her that one bit of information (1 for odd and 0 for even, for example) by spending just one chip. Thus, Quoin Mechanics guarantees they will win five chips every game (netting four).
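The parity argument above is easy to simulate. The sketch below models a quoin pair by drawing Alice's outcome at random (an assumption consistent with no-signalling marginals of 1/2) and forcing Bob's outcome to differ from hers only for an HH start; the function names and the use of 0 for "even" and 1 for "odd" are ours.

```python
import random
from itertools import product


def flip_quoin_pair(alice_starts_heads, bob_starts_heads, rng):
    """One entangled quoin pair; returns (Alice lands H, Bob lands H) as booleans."""
    alice_heads = rng.random() < 0.5
    differ = alice_starts_heads and bob_starts_heads  # unequal outcomes only for an HH start
    bob_heads = alice_heads != differ
    return alice_heads, bob_heads


def play_round(alice_lanes, bob_lanes, rng):
    """Return (Alice's guess, true answer), each 0 for 'even' or 1 for 'odd'."""
    alice_h_count = 0
    bob_h_count = 0
    for a, b in zip(alice_lanes, bob_lanes):
        # A lane value of 1 means the quoin is flipped starting with H, 0 means starting with T.
        alice_heads, bob_heads = flip_quoin_pair(a == 1, b == 1, rng)
        alice_h_count += alice_heads
        bob_h_count += bob_heads
    bob_bit = bob_h_count % 2                 # the single bit Bob pays one chip to send
    guess = (alice_h_count + bob_bit) % 2     # parity of the combined number of H's
    truth = sum(a & b for a, b in zip(alice_lanes, bob_lanes)) % 2
    return guess, truth


if __name__ == "__main__":
    rng = random.Random(2024)
    rounds = [play_round(alice, bob, rng)
              for alice in product((0, 1), repeat=5)
              for bob in product((0, 1), repeat=5)]
    wins = sum(guess == truth for guess, truth in rounds)
    print(f"Correct guesses: {wins} of {len(rounds)} dealer settings")
```

Because the combined parity of H's equals the parity of the number of lanes with two 1's in every run, Alice's guess is correct for every dealer setting, whatever the individual quoin outcomes happen to be.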
How egregious is this advantage? In other words, can quantum joint probabilities achieve anywhere near this success rate? Let's look at the PR-box. We see that the fourth PR-box probability corresponds to the HH case producing unequal outcomes in Quoin Mechanics. That is, using a = b = T and a' = b' = H with outcomes +1 = H and −1 = T, the PR-box aligns with Quoin Mechanics (Table 1). Since it is the HH case that allows us to discern even or odd pairs of 1's in our guessing game using only one bit of information, let's scrutinize the fourth PR-box probability using our conservation principle.
According to QM, the joint probability of measuring like results for a triplet state in its symmetry plane is $\cos^2\left(\frac{\theta}{2}\right)$ and the joint probability of measuring unlike results is $\sin^2\left(\frac{\theta}{2}\right)$, where θ is the angle between $\hat{a}$ and $\hat{b}$. The first PR-box probability says that $\hat{a} = \hat{b}$ (same results), the second PR-box probability says $\hat{a} = \hat{b}'$ (same results), and the third PR-box probability says $\hat{a}' = \hat{b}$ (same results). Thus, these three PR-box joint probabilities in total say $\hat{a} = \hat{a}' = \hat{b} = \hat{b}'$. So, we need the fourth PR-box probability to say $\hat{a}' = \hat{b}'$, i.e., same results, but of course it says we must get opposite results, which means $\hat{a}' = -\hat{b}'$. Therefore, the PR-box joint probabilities violate our conservation principle in a maximal fashion for the triplet states. [A similar argument can be made using the singlet state.] Consequently, in order to satisfy our conservation principle we need the outcomes of a HH start to be equal just as the outcomes of a TH, HT, or TT start. But, if we make the outcomes of a HH start equal, we lose the advantage of Quoin Mechanics. In fact, using quantum coins (the outcomes of a HH start are equal) instead of superquantum quoins (the outcomes of a HH start are not equal) puts us right back to a 50-50 chance of winning the guessing game (recall the reasoning behind the quoin strategy). So, we see that the superquantum PR-box probabilities are not just a little bit better than quantum probabilities for the quantum guessing game, they are "unreasonably effective." But, again, they violate our conservation principle, so it is probably the case that a physical instantiation of the PR-box probabilities is a pipe dream akin to a perpetual motion machine.

Conclusion
Quantum information theorists have produced a principle account of denumerable-dimensional QM whereby "quantum reality" is characterized most succinctly by the information-theoretic principle of Information Invariance & Continuity. Accordingly, reality is composed fundamentally of discrete bits of irreducible, finite information that can be instantiated and measured physically in 3-dimensional space by measurement devices which are themselves composed of such bits ("closeness requirement"). For the Bloch sphere, Information Invariance & Continuity reflects the fact that it is always possible to create a path from one pure state to another by passing through pure states only, i.e., the surface of the Bloch sphere is composed of pure states. This is quite a "surprising" fact from the point of view of classical probability theory where the only path in probability state space between pure states is through mixed states with lower information content. The higher-dimensional and multi-particle Hilbert space structure of QM can all be built from this fundamental, "surprising" qubit structure characterized by Information Invariance & Continuity.
To help clarify the significance of this information-theoretic result, we applied it to the Stern-Gerlach (SG) measurements of a spin-$\frac{1}{2}$ particle. In that case, the Pauli matrices are used to represent spin-$\frac{1}{2}$ measurements $J_i = \frac{\hbar}{2}\sigma_i$ with $[J_x, J_y] = i\hbar J_z$, cyclic, which are responsible for the reference frames associated with complete sets of mutually complementary spin measurements. Information Invariance & Continuity means the spin measurement operators are related by SU(2) in Hilbert space, which means the corresponding reference frames are related by SO(3) in real space. Thus, invariance of the eigenvalues under SU(2) means invariance of measurement outcomes in real space under SO(3), which is a transformation subgroup of both the Lorentz and Galilean transformation groups between inertial reference frames. Since SG spin measurements constitute a measurement of Planck's constant h, Information Invariance & Continuity entails NPRF + h in exact analogy to NPRF giving rise to the invariant measurement of the speed of light c at the foundation of SR. Of course, it must be the case that NPRF + h is responsible for the non-commutative algebra of the spin measurement operators to begin with.
To see that, suppose $|\psi\rangle = |u\rangle$ along $\hat{z}$. Then classically, we know the exact measurement outcome along $\hat{b}$ will be $\hat{b}\cdot\hat{z} = \cos\theta$. In other words, the measurement outcome in one reference frame (+1 in the $\hat{z}$ frame) determines the exact measurement outcome in another reference frame (cos θ in the $\hat{b}$ frame) in the classical case. However, this violates NPRF because there is only one reference frame with the "right" eigenvalue (+1 in the $\hat{z}$ frame) and therefore only one frame that measures the correct value for h. So in the classical case, the different measurement operators commute, e.g., $[J_x, J_y] = 0$. In contrast, NPRF (and Information Invariance & Continuity) says we must obtain ±1 along $\hat{b}$ just like any other direction. Thus, NPRF does not allow us to deduce the exact measurement outcome in the $\hat{b}$ reference frame using our +1 measurement outcome in the $\hat{z}$ reference frame. Again, this is what Brukner & Zeilinger meant when they said the qubit does not contain enough information to account for the outcomes of every possible measurement done on it, so a theory of qubits must be probabilistic [61,63,64]. Thus in the quantum case, the different measurement operators do not commute, i.e., $[J_x, J_y] = i\hbar J_z$. Again, just like elsewhere in QM, letting Planck's constant go to zero recovers classical physics. Here we see that NPRF + h is responsible for the non-commutative algebraic structure of QM, in contrast to the commutative algebraic structure of classical mechanics.
Since NPRF (and Information Invariance & Continuity) requires we obtain ±1 along $\hat{b}$ just like any other direction, the classically expected result of $\hat{b}\cdot\hat{z} = \cos\theta$ can obtain only on average. Thus, NPRF + h gives us "average-only" projection for measurement outcomes in different reference frames. When applied to Bell state entanglement, we showed that "average-only" projection for one particle leads to "average-only" conservation between two entangled particles in different reference frames. Thus, according to the reconstructions of QM the mysterious "average-only" conservation of Bell state entanglement, and therefore the Tsirelson bound, follow from conservation per Information Invariance & Continuity, which is conservation per NPRF in spacetime. Finally, we showed that a hypothetical instantiation of the superquantum PR-box joint probabilities in spacetime violates conservation per the relativity principle, thereby allowing one to communicate with "unreasonable" effectiveness per Bub & Bub [71]. Thus, it seems unlikely that the PR-box can be realized in Nature, even though these joint probabilities do not violate the no-signalling condition. How conservation per Information Invariance & Continuity bears on other information-theoretic phenomena, e.g., macroscopic entanglement witnesses [74,75,76], is left for future work.
Whether or not one believes principle accounts are explanatory is irrelevant here.
No one disputes what the postulates of SR are telling us about Nature, even though there is still today no (consensus) constructive account of time dilation and length contraction, i.e., there is no "interpretation" of SR. While Lorentz complained [77, p. 230]:
"Einstein simply postulates what we have deduced, with some difficulty and not altogether satisfactorily, from the fundamental equations of the electromagnetic field."
he nonetheless acknowledged [77, p. 230]:
"By doing so, [Einstein] may certainly take credit for making us see in the negative result of experiments like those of Michelson, Rayleigh, and Brace, not a fortuitous compensation of opposing effects but the manifestation of a general and fundamental principle."