The Exact Theory of the Stern–Gerlach Experiment and Why it Does Not Imply that a Fermion Can Only Have Its Spin Up or Down

Coddens, Gerrit

doi:10.3390/sym13010134

by Gerrit Coddens

Laboratoire des Solides Irradiés, Institut Polytechnique de Paris, UMR 7642, CNRS-CEA- Ecole Polytechnique, Route de Saclay, F-91128 Palaiseau CEDEX, France

Symmetry 2021, 13(1), 134; https://doi.org/10.3390/sym13010134

Submission received: 11 December 2020 / Revised: 29 December 2020 / Accepted: 12 January 2021 / Published: 14 January 2021

(This article belongs to the Special Issue Symmetry in the Foundations of Physics)

Abstract

The Stern–Gerlach experiment is notoriously counter-intuitive. The official theory is that the spin of a fermion remains always aligned with the magnetic field. Its directions are thus quantized: It can only be spin-up or spin-down. However, that theory is based on mathematical errors in the way it (mis)treats spinors and group theory. We present here a mathematically rigorous theory for a fermion in a magnetic field, which is no longer counter-intuitive. It is based on an understanding of spinors in SU(2) which is only Euclidean geometry. Contrary to what Pauli has been reading into the Stern–Gerlach experiment, the spin directions are not quantized. The new corrected paradigm, which solves all conceptual problems, is that the fermions precess around the magnetic field just as Einstein and Ehrenfest had conjectured. Surprisingly, this leads to only two energy states, which should be qualified as precession-up and precession-down rather than spin-up and spin-down. Indeed, despite the presence of the many different possible angles $\theta$ between the spin axis $\mathbf{s}$ and the magnetic field $\mathbf{B}$, the fermions can only have two possible energies $m_{0} c^{2} \pm \mu B$. The values $\pm \mu B$ thus do not correspond to the continuum of values $-\boldsymbol{\mu} \cdot \mathbf{B}$ Einstein and Ehrenfest had conjectured. The energy term $V = -\boldsymbol{\mu} \cdot \mathbf{B}$ is a macroscopic quantity. It is a statistical average over a large ensemble of fermions distributed over the two microscopic states with energies $\pm \mu B$, and as such not valid for individual fermions. The two fermion states with energy $\pm \mu B$ are not potential-energy states. We also explain the mathematically rigorous meaning of the up and down spinors. They represent left-handed and right-handed reference frames, such that now everything is intuitively clear and understandable in simple geometrical terms. The paradigm shift does not affect the Pauli principle.

Keywords:

quantum mechanics; SU(2); Stern–Gerlach experiment; spinors

PACS:

02.20.-a; 03.65.Ta; 03.65.Ca

1. Preliminaries: Understanding Spinors and a New Approach to Quantum Mechanics

1.1. Clifford Algebra

The present paper is based on previous work of the author [1]. For the convenience of the reader, we provide in this section a minimum of information about that work, which is a formulation of a new approach to quantum mechanics (QM) based on the geometrical meaning of spinors in group representation theory. This geometrical meaning of spinors is explained in [2]. I cannot stress enough that reference [2] contains information the reader is probably not aware of, such that he should read at least pp. 3–16 of it if he wants to make sense of the present paper and of the short introduction presented in Section 1. The basic underlying idea is that we generate the rotation group and the homogeneous Lorentz group from reflections. That was also Hamilton’s idea when he developed the quaternions.

From the reflections in $\mathbb{R}^{n}$, $n \in \mathbb{N}$, $n \geq 3$, we generate a group that contains not only the rotations of $\mathbb{R}^{n}$ but also reversals and reflections. By a reversal, we understand an operation obtained from an odd number of reflections. The rotations are obtained from an even number of reflections and form a subgroup of the group generated by the reflections. An analogous statement applies for the homogeneous Lorentz group, where we also call a reversal an operation obtained from an odd number of reflections. The operations obtained from an even number of reflections constitute the homogeneous Lorentz group which contains the group of rotations of $\mathbb{R}^{3}$ as a subgroup. How the quest to find the representation matrices of the reflections leads to the definition of the Pauli matrices and the Dirac matrices is explained in Section 2.2 of [2]. Note that the development is algebraically equivalent to the way Dirac defined the Dirac matrices for the homogeneous Lorentz group. It is however conceptually completely different. Rather than trying to find some jaw-dropping square root of the Klein–Gordon equation, the true and geometrically clear issue is to define the reflection operators which can be used to generate the group. This gives an entirely different, geometrical meaning to the algebra, which is absent from Dirac’s approach and renders this algebra much more clear and intuitive. The group is mathematically defined prior to any use of it in physics, and as we show below its representation theory contains only group elements. It does not contain vectors, four-vectors, four-gradients or knock-out square roots of d’Alembert operators. The development should be devoid of physical quantities such as $\hbar$, the electron rest mass $m_{0}$ and energy-momentum four-vectors, which do not have their place in a purely mathematical development of the group representation theory. For reasons of enhanced clarity and readability, we consistently use in our approach [1] a choice for the $4 \times 4$ Dirac matrices that differs from Dirac’s and was introduced by Cartan in his monograph on spinors [3]:

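$$\gamma_{x} = \begin{pmatrix} 0 & \sigma_{x} \\ -\sigma_{x} & 0 \end{pmatrix}, \quad \gamma_{y} = \begin{pmatrix} 0 & \sigma_{y} \\ -\sigma_{y} & 0 \end{pmatrix}, \quad \gamma_{z} = \begin{pmatrix} 0 & \sigma_{z} \\ -\sigma_{z} & 0 \end{pmatrix}, \quad \gamma_{t} = \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}.$$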

(1)
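As a quick sanity check, the matrices of Equation (1) can be tested numerically. The sketch below assumes that the $2 \times 2$ blocks sit on the secondary diagonal, which is our reading of the remark that reversals have their block structure there; the helper names (`off_diagonal`, `has_diagonal_blocks`, and so on) are ours and purely illustrative. It checks that the four matrices anticommute, that $\gamma_{t}^{2} = \mathbb{1}$ and $\gamma_{x}^{2} = \gamma_{y}^{2} = \gamma_{z}^{2} = -\mathbb{1}$, and that a product of two reflections has its blocks on the main diagonal.

```python
import numpy as np

# Pauli matrices and the 2x2 unit and zero blocks.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)


def off_diagonal(upper_right, lower_left):
    """4x4 matrix whose only non-zero 2x2 blocks lie on the secondary diagonal."""
    return np.block([[zero, upper_right], [lower_left, zero]])


# Assumed block placement (see the lead-in): a single reflection is a
# reversal, so its blocks are put on the secondary diagonal.
gamma = {
    "x": off_diagonal(sx, -sx),
    "y": off_diagonal(sy, -sy),
    "z": off_diagonal(sz, -sz),
    "t": off_diagonal(one, one),
}
signature = {"x": -1.0, "y": -1.0, "z": -1.0, "t": 1.0}


def satisfies_clifford_relations():
    """Check gamma_m gamma_n + gamma_n gamma_m = 2 g_mn * identity."""
    for m in gamma:
        for n in gamma:
            anticommutator = gamma[m] @ gamma[n] + gamma[n] @ gamma[m]
            expected = 2 * signature[m] * np.eye(4) if m == n else np.zeros((4, 4))
            if not np.allclose(anticommutator, expected):
                return False
    return True


def has_diagonal_blocks(matrix):
    """True if both off-diagonal 2x2 blocks of a 4x4 matrix vanish."""
    return bool(np.allclose(matrix[:2, 2:], 0) and np.allclose(matrix[2:, :2], 0))


print(satisfies_clifford_relations())
print(has_diagonal_blocks(gamma["x"] @ gamma["y"]))  # two reflections
print(has_diagonal_blocks(gamma["x"]))               # one reflection
```

With this block placement, the parity of the number of reflections can be read off directly from which diagonal carries the non-zero blocks.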

This choice has the convenience that we can immediately spot if the group element is obtained from an odd or an even number of reflections. The true Lorentz transformations have a block structure along the main diagonal, while the reversals have a block structure along the secondary diagonal. It will also allow us to spot immediately when we use a superposition state of a true Lorentz transformation and a reversal (see below), because then all four $2 \times 2$ blocks will be non-zero. In Dirac’s choice, making such distinctions is thwarted by the fact that not all four gamma matrices have their block structure along the same diagonal. We can use this alternative choice because it has been shown (by Pauli) that all valid choices for the gamma matrices are equivalent. Note that the information content of a column of a Lorentz transformation matrix corresponds to only four real parameters, while the definition of a general element of the Lorentz group requires six real parameters. In Chapter 4, p. 96 of [1], we discuss two $2 \times 2$ matrix representations SL(2,$\mathbb{C}$) of the Lorentz group which are based on the blocks and contain all six real parameters. In Section 2.2 of [2], we explain starting from its Figure 1 how the product of two reflections in the rotation group defines a rotation and how this leads to the Rodrigues equation for a rotation matrix $\mathbf{R}(\mathbf{s}, \varphi)$ in SU(2):

$$\mathbf{R}(\mathbf{s}, \varphi) = \cos(\varphi / 2)\, \mathbb{1} - \imath \sin(\varphi / 2)\, [\mathbf{s} \cdot \boldsymbol{\sigma}] .$$

(2)

Here, $\mathbf{s}$ is a unit vector along the axis of the rotation, $\varphi$ is the rotation angle, $\mathbb{1}$ is the $2 \times 2$ unit matrix and $\boldsymbol{\sigma} = (\sigma_{x}, \sigma_{y}, \sigma_{z})$ is a shorthand for the three Pauli matrices. The construction of the representation theory for the rotation groups in $\mathbb{R}^{n}$, with $n \in \mathbb{N}$, $n > 3$ and the homogeneous Lorentz group is obtained by simply generalizing this idea, but in the present paper we only focus our attention on the group of the rotations in $\mathbb{R}^{3}$ and on the homogeneous Lorentz group. As the reflections are defined by unit vectors $\mathbf{a}$ that are orthogonal to their reflection planes, we can use both $\mathbf{a}$ and $-\mathbf{a}$ to characterize the reflection, which is thus represented by both matrices $\pm [\mathbf{a} \cdot \boldsymbol{\sigma}]$. The result is that each group element is algebraically represented by two representation matrices. Hence, $\mathbf{R}(\mathbf{s}, \varphi)$ and $-\mathbf{R}(\mathbf{s}, \varphi)$ are representing the same rotation: SU(2) is a double covering of SO(3). This is also true for the constructions in the Lorentz group. Here, the reflections are defined with respect to three-dimensional hyperplanes. They are the three canonical parity transformations P and the time reversal operation T. Using the same geometrical derivation within the Dirac representation as used in SU(2), it is easy to see that the rotation which corresponds to the SU(2) matrices $\pm \mathbf{R}(\mathbf{s}, \varphi)$ is now represented by the two $4 \times 4$ matrices with the $2 \times 2$ block structure:

$$\pm \begin{pmatrix} \mathbf{R}(\mathbf{s}, \varphi) & 0 \\ 0 & \mathbf{R}(\mathbf{s}, \varphi) \end{pmatrix} .$$

(3)

When we want to describe a spinning object at rest within the Dirac representation, it suffices to present the calculations in SU(2) because in the Dirac representation it would only imply writing the same SU(2) matrix twice on the diagonal. (It is therefore a misconception to claim that the electron spin can only be correctly described within the relativistic framework of the Dirac representation.) As already mentioned, this way all the matrices we define represent group elements.
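The properties of the Rodrigues matrix of Equation (2) stated above can be verified numerically. The following is a minimal sketch, not part of the original derivation: the axis, the angle and the helper names (`dot_sigma`, `rodrigues`, `so3_from_su2`) are illustrative choices of ours. It checks that the matrix is unitary with unit determinant, that a full turn of $2\pi$ yields $-\mathbb{1}$, that the two matrices $\pm \mathbf{R}$ correspond to the same rotation of $\mathbb{R}^{3}$, and that the product of two reflections whose normals enclose an angle $\varphi / 2$ is a rotation over $\varphi$.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)


def dot_sigma(a):
    """Return the 2x2 matrix [a.sigma] = a_x sx + a_y sy + a_z sz."""
    return a[0] * sx + a[1] * sy + a[2] * sz


def rodrigues(s, phi):
    """SU(2) matrix of Equation (2) for a unit axis s and an angle phi."""
    return np.cos(phi / 2) * one - 1j * np.sin(phi / 2) * dot_sigma(s)


def so3_from_su2(U):
    """3x3 matrix M defined by U [v.sigma] U^dagger = [(M v).sigma]."""
    paulis = (sx, sy, sz)
    return np.array([[0.5 * np.trace(paulis[i] @ U @ paulis[j] @ U.conj().T).real
                      for j in range(3)] for i in range(3)])


s = np.array([1.0, 2.0, 2.0]) / 3.0  # arbitrary unit axis (illustrative)
phi = 0.7                            # arbitrary angle in radians (illustrative)
R = rodrigues(s, phi)

is_unitary = bool(np.allclose(R.conj().T @ R, one))
has_unit_determinant = bool(np.isclose(np.linalg.det(R), 1.0))
full_turn_is_minus_one = bool(np.allclose(rodrigues(s, 2 * np.pi), -one))
same_rotation = bool(np.allclose(so3_from_su2(R), so3_from_su2(-R)))

# Reflection with normal a followed by reflection with normal b, where the
# angle from a to b is phi/2: the product is the rotation over phi about e_z.
a = np.array([1.0, 0.0, 0.0])
b = np.array([np.cos(phi / 2), np.sin(phi / 2), 0.0])
e_z = np.array([0.0, 0.0, 1.0])
two_reflections_give_rotation = bool(
    np.allclose(dot_sigma(b) @ dot_sigma(a), rodrigues(e_z, phi)))

print(is_unitary, has_unit_determinant, full_turn_is_minus_one,
      same_rotation, two_reflections_give_rotation)
```

The map `so3_from_su2` is quadratic in the SU(2) matrix, which is why the overall sign drops out and both $\pm \mathbf{R}$ give the same $3 \times 3$ rotation matrix.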

As explained in Section 2.4 of [2], we can now also introduce a second algebra. This is a parallel formalism for vectors and multivectors. This formalism exploits the fact that we use the unit vector $\mathbf{a}$ to define the reflection operator. The matrices $[\mathbf{a} \cdot \boldsymbol{\sigma}]$ then occur in both algebras, such that the same algebraic quantity is representing two geometrically completely different things: reflection operators in the first, pristine algebra of group elements and vectors in the second algebra. The formalism of the second algebra is then easily extended to vectors which are not of unit length, and it is within this second algebra that Dirac’s approach is defined, as finding an algebra that linearizes the square root of a quadratic form. This misses a crucial point because this second algebra does not refer to the group elements of the first algebra, such as rotations, which, as we show below, are essential for understanding the real meaning of the equation, which is that it expresses spinning motion. The consequence of this is that within this second algebra one obtains the Dirac equation without knowing what it means. It mystifies us by hiding what is really going on behind the scenes in the first algebra. Most of the time, the fact that there are two different formalisms, such that the same algebraic expression can represent two different geometric objects, is not clearly pointed out. In the first, pristine algebra of group elements, the reflection operators are defined up to a sign, while in the second algebra the vectors are represented unambiguously.

By multiplying the matrices representing vectors, we obtain new quantities in the second algebra, which contain multivectors of the type $\mathbf{a}_{1} \wedge \mathbf{a}_{2} \wedge \dots \wedge \mathbf{a}_{m}$. The second algebra is thus an algebra of multivectors. The expressions we obtain in carrying out the matrix products can contain sums with terms of different types of multivectors, such that one has the impression that this is an algebra wherein we sum objects that we are not supposed to sum. It looks like summing kiwis and bananas. The definition of these awkward sums is in general the starting point of most texts about Clifford algebra. The stunning definition is introduced without any justification or discussion, which raises the question of whether this is really legitimate and makes one wonder what this might mean. It just descends from heaven. Our approach to the group representation theory makes it possible to understand where this all comes from. The strange definition relies on the algebraic feasibility of carrying out these operations within one and the same general matrix formalism. For example, the Rodrigues equation seems to be the sum of a scalar and an axial vector, which looks a priori absurd. In reality, we express the algebraic representations of geometrical objects of the first type (the group elements) in terms of algebraic representations of geometrical objects of the second type (the multivectors). It is all the consequence of the initial fact that reflections and vectors are represented by the same algebraic expression. The same is true mutatis mutandis in the Lorentz group.
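A short worked identity may make the "kiwis and bananas" concrete. It uses only the standard multiplication rules of the Pauli matrices; the symbols $\mathbf{a}$, $\mathbf{b}$ denote arbitrary vectors of $\mathbb{R}^{3}$ and are our illustrative choice.

```latex
% Product of two vectors in the second algebra: a scalar part plus a bivector part
[\mathbf{a} \cdot \boldsymbol{\sigma}]\,[\mathbf{b} \cdot \boldsymbol{\sigma}]
  = (\mathbf{a} \cdot \mathbf{b})\, \mathbb{1}
  + \imath\, [(\mathbf{a} \times \mathbf{b}) \cdot \boldsymbol{\sigma}] .

% Special case b = a: a vector squares to a scalar
[\mathbf{a} \cdot \boldsymbol{\sigma}]^{2} = |\mathbf{a}|^{2}\, \mathbb{1} .

% Unit vectors a, b enclosing the angle \varphi/2, with s = (a x b)/|a x b|:
% the product of the two reflections is the Rodrigues form of Equation (2)
[\mathbf{b} \cdot \boldsymbol{\sigma}]\,[\mathbf{a} \cdot \boldsymbol{\sigma}]
  = \cos(\varphi / 2)\, \mathbb{1}
  - \imath \sin(\varphi / 2)\, [\mathbf{s} \cdot \boldsymbol{\sigma}] .
```

Read in the second algebra, the first line is a sum of a scalar and a term representing the bivector $\mathbf{a} \wedge \mathbf{b}$ through its axial vector; read in the first algebra, the last line is simply one group element, a rotation.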

Note that the notation $[\mathbf{a} \cdot \boldsymbol{\sigma}]$, which is used in the algebra, is misleading. It is a shorthand for $a_{x} \sigma_{x} + a_{y} \sigma_{y} + a_{z} \sigma_{z}$, which represents the vector $a_{x} \mathbf{e}_{x} + a_{y} \mathbf{e}_{y} + a_{z} \mathbf{e}_{z} = \mathbf{a}$ while the analogy with the notation for a scalar product it thrives on might make you think that it is a scalar, viz. the scalar product of a vector $\mathbf{a}$ with some “vector” $\boldsymbol{\sigma}$. However, the shorthand $\boldsymbol{\sigma} = (\sigma_{x}, \sigma_{y}, \sigma_{z})$ is not a vector of $\mathbb{R}^{3}$, it represents the trivector $(\mathbf{e}_{x}, \mathbf{e}_{y}, \mathbf{e}_{z}) \in \mathbb{R}^{9}$, i.e., the triad of the three basis vectors of $\mathbb{R}^{3}$. As already stated, $[\mathbf{a} \cdot \boldsymbol{\sigma}]$ stands for $(a_{x}, a_{y}, a_{z}) \cdot (\mathbf{e}_{x}, \mathbf{e}_{y}, \mathbf{e}_{z}) = \mathbf{a}$. Similar remarks apply mutatis mutandis in the Lorentz group, e.g., $\mathbf{B} \cdot \boldsymbol{\gamma}$ does not represent a scalar but the vector $\mathbf{B}$. This becomes very important in Section 2.2.

In SU(2), the $2 \times 2$ rotation matrix can be represented by its first column without any loss of information, as explained in Equation (4) of [2]. This $2 \times 1$ column matrix is a spinor. In SU(2), a spinor thus represents a rotation. A spinor is a rotation. The second column of the SU(2) rotation matrix is called the conjugated spinor, and as we show below it corresponds to a reversal (see Appendix A). That a spinor in SU(2) is just a rotation is much easier to understand than the textbook narrative that a spinor would be the square root of a vector. That relation is nevertheless also explained in Section 2.5 of [2]. We explain there that this idea cannot be fully generalized to $\mathbb{R}^{n}$. The fact that in the Dirac equation we use superpositions of states (see below) has as a consequence that the column vectors still represent the complete information about the six parameters which define a general group element, as discussed on pp. 163–166 of [1]. For this reason, these column matrices are called bi-spinors.

We can take advantage of this remark about superpositions of states to address an important issue. Group theory is based on products of group elements. Sums of group elements are in general not defined. Spinors are not elements of a vector space but of a curved manifold. For this reason, summing spinors is not a defined operation. However, in QM, we are making linear combinations of spinors all the time. We must thus justify this use because QM leads to meaningful results. It is explained in Section 2.3 of [2] that we can give such sums a meaning in terms of sets of group elements. In QM, these sets become statistical ensembles of physical states. This leads us naturally to a statistical interpretation of QM as proposed by Ballentine [4]. The strong point of our approach is that it underpins Ballentine’s interpretation because his rules are now mathematically derived from the group theory (and the construction of the Dirac equation from scratch in [1] sketched briefly in Section 1.2). Another strong point is that we can read within the geometry that corresponds to the algebra of the equations what is happening in the physics. Spinors offer us a key to understanding QM. In fact, we show below that the free-space Dirac equation just describes a statistical ensemble of spinning electrons in uniform motion. This is an insight the traditional approach just emphatically denies us (because it is based on the second algebra and lacks the insight provided by the first algebra). The discussion in terms of sets can also be used to derive a Born rule (see [5] pp. 1–2, [6], p. 25). It is based on associating each electron with a spinor that describes its state. The spinors $\chi$ of SU(2) satisfy the identity $\chi^{\dagger} \chi = 1$. If we have to count electrons in some formalism based on SU(2), we should thus use $\chi^{\dagger} \chi$.
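The statement that a spinor carries the full information of its rotation matrix, together with the normalization $\chi^{\dagger} \chi = 1$, can be illustrated with a few lines of code. This is a sketch under an assumption: the reconstruction in `matrix_from_spinor` uses the standard form of an SU(2) matrix, whose second column is fixed by the first; the source refers to Equation (4) of [2] for this step, which is not reproduced here. All function names and numerical values are ours.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)


def rodrigues(s, phi):
    """SU(2) matrix of Equation (2) for a unit axis s and an angle phi."""
    s_dot_sigma = s[0] * sx + s[1] * sy + s[2] * sz
    return np.cos(phi / 2) * one - 1j * np.sin(phi / 2) * s_dot_sigma


def spinor(s, phi):
    """The spinor is the first column of the rotation matrix."""
    return rodrigues(s, phi)[:, 0]


def matrix_from_spinor(chi):
    """Rebuild the SU(2) matrix from its first column (xi0, xi1).

    Assumption: standard SU(2) form [[xi0, -conj(xi1)], [xi1, conj(xi0)]].
    """
    xi0, xi1 = chi
    return np.array([[xi0, -np.conj(xi1)], [xi1, np.conj(xi0)]])


s = np.array([2.0, -1.0, 2.0]) / 3.0  # arbitrary unit axis (illustrative)
phi = 1.9                             # arbitrary angle in radians (illustrative)

chi = spinor(s, phi)
norm = np.vdot(chi, chi).real         # chi^dagger chi
no_information_lost = bool(np.allclose(matrix_from_spinor(chi), rodrigues(s, phi)))

print(round(norm, 12), no_information_lost)
```

Because every spinor contributes exactly one unit to $\chi^{\dagger} \chi$, summing this quantity over an ensemble counts its members, which is the role the text assigns to it.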

1.2. Use of the Clifford Algebra to Derive the Dirac Equation from Scratch

We can use SU(2) and its spinors to represent spinning motion. Throughout the paper, we use the notation $F(A, B)$ for the set of functions whose definition domain is the set A and which take values in the set B. In the same way as we use vector functions $\mathbf{r} \in F(\mathbb{R}, \mathbb{R}^{3})$: $t \to \mathbf{r}(t)$ to describe orbits in classical mechanics, we can describe the spinning motion of a spinning object such as a top or a particle in its rest frame by a spinor function $\psi \in F(\mathbb{R}, \mathbb{C}^{2})$: $\tau \to \psi(\tau)$, where $\tau$ is the proper time. In classical mechanics, we use $m \frac{d^{2} \mathbf{r}}{d t^{2}} = \mathbf{F}$ to make the link between the geometrical parameters and the physical parameters. In relativity, we rather use $\frac{d \mathbf{p}}{d t} = \mathbf{F}$. In QM, we need to describe spinning motion. The mathematical tool to do this is the spinor. Now, we use an equation for $\frac{d \psi}{d \tau}$ to make the link between the geometrical and the physical parameters. This link is provided completely at the end by introducing the minimal substitution in order to make the step from the free-space Dirac equation to the Dirac equation for an electron that moves in an electromagnetic field. The minimal substitution is not entirely rigorous because it only addresses the boost part of the spinor, but we cannot discuss this here.

As explained above, with Equation (3), for a spinning object at rest, we can derive the equations in SU(2) first. When we have obtained a feeling for the formalism in SU(2), we can then first lift it to the Dirac representation and then generalize it for a moving electron by covariance. The last step consists in introducing and attempting to justify the minimal substitution. Note that, in our derivation of the Dirac equation from scratch, we do not explain why the electron spins. Perhaps in the future somebody will be able to explain why the electron spins on the basis of a dynamical model for the electron. However, we do not know anything about this issue and in traditional QM we do not even know that the issue exists. We are in a position of total ignorance similar to that of Newton who introduced the expression for the gravitational force ex nihilo and could only lament that he did not understand how this force could act at a distance. Similarly, we introduce ex nihilo the ansatz that the electron spins and then show that one can derive the Dirac equation just starting from this basic assumption, which we laconically introduce without any further justification. The starting ansatz thus fulfills the same role in our approach as an axiom in mathematics. Historically, the intuition that the electron may spin has been around from the beginning but this has been firmly denied by the standard dogma. Our approach, which contradicts the standard dogma, cannot be criticized from the standpoint of the traditional approach to QM, because it is a competing theory whose algebraic results are identical to those of the traditional approach.

The following derivation of the free-space Dirac equation from scratch is discussed in [1], especially in pp. 153–168, with additions scattered over various papers (the Appendix of [5], pp. 1–2 of [6]). For this reason, we provide here a synopsis that can serve as a guide for further study. This synopsis can only be presented under the form of a mere sketch. It is impossible to present the full argument in the present paper because it would require incorporating a large part of the monograph [1]. Nobody would like to read such a very long and technical paper. Moreover, the scope of the paper is not deriving the Dirac equation. In what follows, there are thus some gaps that can be filled by reading [1].

We start from the Rodrigues equation, Equation (2), and substitute $\varphi = \omega_{0} \tau$. This now describes the spinning motion of an object, e.g., a particle or a top. The time derivative of $\mathbf{R}(\mathbf{s}, \omega_{0} \tau)$ yields:

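$$\frac{d \mathbf{R}}{d \tau} = - \imath\, (\omega_{0} / 2)\, [\mathbf{s} \cdot \boldsymbol{\sigma}]\, \mathbf{R}, \quad \text{and:} \quad \frac{d \chi}{d \tau} = - \imath\, (\omega_{0} / 2)\, [\mathbf{s} \cdot \boldsymbol{\sigma}]\, \chi,$$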

(4)

where the $2 \times 1$ spinor $\chi$ is the first column of $\mathbf{R}(\mathbf{s}, \omega_{0} \tau)$. Note that, to derive Equation (4) from Equation (2), we must assume that $\frac{d \mathbf{s}}{d \tau} = 0$, else the equation will contain extra terms and the equation will become considerably more complicated. In other words, we introduce the underlying assumption that the orientation of the spin axis remains fixed. We must thus remember in the further derivation of the Dirac equation which follows that it is only valid for a spinning electron with a fixed orientation of the axis of its spinning motion. The case where the spin axis precesses is a priori not covered by this derivation. We thus cannot use the Dirac equation to study precession, a limitation one cannot become aware of if one just follows Dirac’s derivation of his equation.
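The role of the assumption of a fixed spin axis can be made visible with a finite-difference check of Equation (4). The sketch below is only an illustration: the frequency, the step size and the precessing axis `precessing_axis` are hypothetical choices of ours and do not come from the source. For a fixed axis the numerical derivative of the Rodrigues matrix agrees with the right-hand side of Equation (4); for an axis that moves in time it does not, because the extra terms mentioned above appear.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)


def dot_sigma(a):
    """Return the 2x2 matrix [a.sigma]."""
    return a[0] * sx + a[1] * sy + a[2] * sz


def rodrigues(s, phi):
    """SU(2) matrix of Equation (2)."""
    return np.cos(phi / 2) * one - 1j * np.sin(phi / 2) * dot_sigma(s)


omega0 = 2.0        # spin frequency, arbitrary units (illustrative)
tau, h = 0.3, 1e-6  # evaluation time and finite-difference step


def fixed_axis(_tau):
    return np.array([1.0, 2.0, 2.0]) / 3.0


def precessing_axis(t, theta=0.4, big_omega=1.3):
    """Hypothetical axis precessing about e_z (not covered by Equation (4))."""
    return np.array([np.sin(theta) * np.cos(big_omega * t),
                     np.sin(theta) * np.sin(big_omega * t),
                     np.cos(theta)])


def equation_4_holds(axis):
    """Compare a central-difference dR/dtau with -i (omega0/2) [s.sigma] R."""
    def R(t):
        return rodrigues(axis(t), omega0 * t)
    numerical = (R(tau + h) - R(tau - h)) / (2 * h)
    right_hand_side = -1j * (omega0 / 2) * dot_sigma(axis(tau)) @ R(tau)
    return bool(np.allclose(numerical, right_hand_side, atol=1e-8))


print(equation_4_holds(fixed_axis))       # fixed spin axis
print(equation_4_holds(precessing_axis))  # precessing spin axis
```

The mismatch in the second case stems from the term proportional to $\sin(\omega_{0} \tau / 2)\, [\frac{d \mathbf{s}}{d \tau} \cdot \boldsymbol{\sigma}]$ that the product rule generates when the axis is time dependent.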

Equations (5)–(8) present a first intuition about a possible roadmap for deriving the equation. However, the whole rationale would fall apart in the face of mathematical rigor. Nevertheless, the intuition is right and we show how we can repair the mistakes in order to obtain a rigorous mathematical proof. For $\mathbf{s} = \mathbf{e}_{z}$ we have $[\mathbf{s} \cdot \boldsymbol{\sigma}]\, \chi = \chi$. We obtain then:

$$\frac{d \chi}{d \tau} = - \imath\, (\omega_{0} / 2)\, \chi .$$

(5)

In general, $[\mathbf{s} \cdot \boldsymbol{\sigma}]\, \mathbf{R} \neq \mathbf{R}$, because a reversal (obtained by an odd number of reflections) can never be equal to a rotation (obtained by an even number of reflections). In general, we also have $[\mathbf{s} \cdot \boldsymbol{\sigma}]\, \chi \neq \chi$, such that Equation (5) is simply wrong in general. It is an isolated coincidence that we discuss in more detail in Appendix A. However, let us imagine that we can obtain an equation $[\mathbf{s} \cdot \boldsymbol{\sigma}]\, \chi = \chi$ that is generally true in SU(2) anyway, such that Equation (5) also becomes true in general. If we now postulate $\hbar \omega_{0} / 2 = m_{0} c^{2}$, we then obtain:

$$- \frac{\hbar}{\imath} \frac{d \chi}{d \tau} = m_{0} c^{2} \chi .$$

(6)

We can now lift this result to the Dirac representation. Now,

$$\frac{1}{c^{2}} \frac{d^{2}}{d \tau^{2}} = \frac{1}{c^{2}} \frac{\partial^{2}}{\partial t^{2}} - \frac{\partial^{2}}{\partial x^{2}} - \frac{\partial^{2}}{\partial y^{2}} - \frac{\partial^{2}}{\partial z^{2}} .$$

(7)

Hence, the meaning of the “square root” of the d’Alembert operator is:

$$\frac{1}{c} \frac{d}{d \tau} \gamma_{t} = \frac{1}{c} \frac{\partial}{\partial t} \gamma_{t} - \nabla \cdot \boldsymbol{\gamma} .$$

(8)

The right-hand side of this equation corresponds thus to the partial derivative with respect to the proper time expressed in a frame wherein the electron is no longer at rest. We see that combining this with Equation (6) may lead to a derivation of the Dirac equation from scratch, where all we assume is that the electron spins around a fixed axis with a frequency $\omega_{0}$ and that $\hbar \omega_{0} / 2 = m_{0} c^{2}$. The free-space Dirac equation could be obtained this way by covariance from the equation of the electron in its rest frame.
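The sense in which Equation (8) is a "square root" of Equation (7) can be checked by squaring both sides. The following worked step is a sketch that assumes only that the gamma matrices anticommute and satisfy $\gamma_{t}^{2} = \mathbb{1}$ and $\gamma_{x}^{2} = \gamma_{y}^{2} = \gamma_{z}^{2} = -\mathbb{1}$; the index notation with $j, k \in \{x, y, z\}$ is ours.

```latex
% Left-hand side of Equation (8), squared:
\left( \frac{1}{c} \frac{d}{d\tau}\, \gamma_{t} \right)^{2}
  = \frac{1}{c^{2}} \frac{d^{2}}{d\tau^{2}}\, \gamma_{t}^{2}
  = \frac{1}{c^{2}} \frac{d^{2}}{d\tau^{2}}\, \mathbb{1} .

% Right-hand side of Equation (8), squared:
\left( \frac{1}{c} \frac{\partial}{\partial t}\, \gamma_{t}
       - \sum_{j} \gamma_{j} \frac{\partial}{\partial j} \right)^{2}
  = \frac{1}{c^{2}} \frac{\partial^{2}}{\partial t^{2}}\, \gamma_{t}^{2}
  - \frac{1}{c} \sum_{j} \left( \gamma_{t} \gamma_{j} + \gamma_{j} \gamma_{t} \right)
      \frac{\partial^{2}}{\partial t\, \partial j}
  + \frac{1}{2} \sum_{j,k} \left( \gamma_{j} \gamma_{k} + \gamma_{k} \gamma_{j} \right)
      \frac{\partial^{2}}{\partial j\, \partial k}

  = \left( \frac{1}{c^{2}} \frac{\partial^{2}}{\partial t^{2}}
         - \frac{\partial^{2}}{\partial x^{2}}
         - \frac{\partial^{2}}{\partial y^{2}}
         - \frac{\partial^{2}}{\partial z^{2}} \right) \mathbb{1} .
```

The mixed terms vanish because $\gamma_{t} \gamma_{j} + \gamma_{j} \gamma_{t} = 0$ and $\gamma_{j} \gamma_{k} + \gamma_{k} \gamma_{j} = 0$ for $j \neq k$, so that equating the two squares reproduces Equation (7) multiplied by the unit matrix.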

This appears great but we show below that it contains a hidden error. Let us first repair the fact that, in general, $[\mathbf{s} \cdot \boldsymbol{\sigma}]\, \chi \neq \chi$. We see that the cheat of taking $\mathbf{s} = \mathbf{e}_{z}$ combined with the use of the spinor $\chi$ has produced the miracle that we can use $- \frac{\hbar}{\imath} \frac{d}{d \tau}$ and more generally $- \frac{\hbar}{\imath} \frac{\partial}{\partial t}$ as an energy operator. We could not have defined this energy operator if we had kept working with $\mathbf{R}$ even for the case $\mathbf{s} = \mathbf{e}_{z}$. That all this lacks generality is also obvious from the fact that in general both exponentials $e^{\imath \omega_{0} \tau / 2}$ and $e^{- \imath \omega_{0} \tau / 2}$ will occur in the first column of the rotation matrix, as can be seen, e.g., in the example of Equation (13) below. Nevertheless, we can still satisfy Equation (6) if we replace the pure state $\chi$ by a superposition state $\psi$ defined by:

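$$\psi = \chi + [\mathbf{s} \cdot \boldsymbol{\sigma}]\, \chi \quad \Rightarrow \quad [\mathbf{s} \cdot \boldsymbol{\sigma}]\, \psi = \psi \quad \Rightarrow \quad \frac{d \psi}{d \tau} = - \imath\, (\omega_{0} / 2)\, \psi .$$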

(9)

As explained above, this superposition transforms immediately the theory into a statistical theory, where $\psi$ represents a statistical ensemble wherein half of the electrons are in the state $\chi$ (which is a rotation) and half of them in the state $[\mathbf{s} \cdot \boldsymbol{\sigma}]\, \chi$ (which is a reversal). By an abuse of language, we also call the mixed state $\psi$ a spinor function. The ensemble is defined by its energy and its rotation axis, whereby the state can be a rotation or a reversal. On the new states $\psi$, the operator $- \frac{\hbar}{\imath} \frac{\partial}{\partial t}$ can function again as an energy operator. All this can also be developed in the Dirac representation and be generalized by covariance. We have done this in [1] on pp. 153–168. As mentioned above towards the end of Section 1.1 where we discuss bi-spinors, it also requires introducing a superposition state, as explained on pp. 162–166 of [1].
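The algebra behind Equation (9) rests on the identity $[\mathbf{s} \cdot \boldsymbol{\sigma}]^{2} = \mathbb{1}$ for a unit vector $\mathbf{s}$, and it can be checked numerically. The sketch below uses an arbitrary axis and frequency of our own choosing, and the function names are illustrative. It verifies that $\chi$ itself is in general not an eigenvector of $[\mathbf{s} \cdot \boldsymbol{\sigma}]$, that it is one in the special case $\mathbf{s} = \mathbf{e}_{z}$, and that the superposition $\psi$ is one for a general axis and obeys the differential equation of Equation (9).

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)


def dot_sigma(a):
    """Return the 2x2 matrix [a.sigma]."""
    return a[0] * sx + a[1] * sy + a[2] * sz


def chi(s, omega0, tau):
    """Pure state: first column of the Rodrigues matrix R(s, omega0 * tau)."""
    phi = omega0 * tau
    R = np.cos(phi / 2) * one - 1j * np.sin(phi / 2) * dot_sigma(s)
    return R[:, 0]


def psi(s, omega0, tau):
    """Superposition state of Equation (9): chi + [s.sigma] chi."""
    c = chi(s, omega0, tau)
    return c + dot_sigma(s) @ c


s = np.array([1.0, 2.0, 2.0]) / 3.0  # arbitrary unit axis (illustrative)
e_z = np.array([0.0, 0.0, 1.0])
omega0, tau, h = 2.0, 0.3, 1e-6      # illustrative frequency, time, step

chi_is_eigenvector = bool(np.allclose(dot_sigma(s) @ chi(s, omega0, tau),
                                      chi(s, omega0, tau)))
chi_is_eigenvector_for_ez = bool(np.allclose(sz @ chi(e_z, omega0, tau),
                                             chi(e_z, omega0, tau)))
psi_is_eigenvector = bool(np.allclose(dot_sigma(s) @ psi(s, omega0, tau),
                                      psi(s, omega0, tau)))

# Central-difference derivative of psi compared with -i (omega0/2) psi.
dpsi = (psi(s, omega0, tau + h) - psi(s, omega0, tau - h)) / (2 * h)
psi_obeys_equation_9 = bool(np.allclose(dpsi, -1j * (omega0 / 2) * psi(s, omega0, tau),
                                        atol=1e-8))

print(chi_is_eigenvector, chi_is_eigenvector_for_ez,
      psi_is_eigenvector, psi_obeys_equation_9)
```

The check says nothing about the statistical reading of $\psi$; it only confirms that the algebraic implications written in Equation (9) hold for an axis other than $\mathbf{e}_{z}$.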

The hidden error mentioned above is a much more surreptitious problem. Its treatment is also tedious due to its technicality. We therefore relegate it to Appendix B. Preferably the reader should read Appendix B immediately and then come back here to the main text. Appendix B leads further to the insight that the Dirac equation describes a superposition state that must be interpreted statistically, as proposed by Ballentine [4]. Equation (A8) in Appendix B can be combined with lifting the steps in going from Equation (4) to Equation (6) to the Dirac representation such as to yield the Dirac equation for an electron at rest. This in turn can then be transformed into the general free-space Dirac equation by covariance, as fully described in [1]. It is only after publishing [1] that we explicated the rigorous steps that are needed to extend the definition domain of the differential equation in Equation (9) from $S_{(x_{0}, y_{0}, z_{0})} \times \mathbb{R}$ to $\mathbb{R}^{4}$ (as described in Appendix B and discussed in [6]). In [1], the analog within the Dirac representation of the superposition defined in Equation (9) is also discussed on pp. 162–166. Our derivation shows this way clearly that the free-space Dirac equation describes a statistical ensemble of spinning electrons in uniform motion. The minimal substitution required to study electrons in an electromagnetic field is also discussed in [1].

Let us now think of tops that are spinning clockwise or counterclockwise around an axis with angular frequencies $\pm \omega_{0}$. They obviously have the same energy. We can see from this that the energy must be $E = | \hbar \omega_{0} / 2 |$ rather than $E = \hbar \omega_{0} / 2$, which settles the riddle of the negative frequencies. As we can extrapolate the equations from SU(2) to the Dirac representation, we see that this must also be true within the context of the Dirac equation. The net energy needed to make the transition between the two states with algebraic energies $E = \pm \hbar \omega_{0} / 2$ is not $2 m_{0} c^{2}$. First, the state must lose its energy $m_{0} c^{2}$ to grind its spinning motion to a halt. Then, we can start to make it spin in the opposite sense. It must then regain the same energy $m_{0} c^{2}$ to recover the spinning motion with the opposite algebraic angular frequency. The net change of energy required is thus zero. On the other hand, when a positron and electron annihilate, we do not obtain a zero energy but two gamma rays of 511 keV each, which shows that the identification of negative energies with antiparticles is not justified.
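To put numbers on the postulate $\hbar \omega_{0} / 2 = m_{0} c^{2}$, the short calculation below evaluates the rest energy of the electron and the corresponding spin frequency. It is a sketch only: the constants are CODATA 2018 values that we supply ourselves (they do not appear in the source), and the function `energy` is our illustrative rendering of $E = | \hbar \omega_{0} / 2 |$.

```python
# CODATA 2018 values, supplied here as assumptions (not taken from the source).
HBAR = 1.054571817e-34            # reduced Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31  # electron rest mass, kg
LIGHT_SPEED = 299792458.0         # speed of light, m/s
ELEMENTARY_CHARGE = 1.602176634e-19  # C, used to convert J to eV


def energy(omega):
    """Energy E = |hbar * omega / 2| of a state spinning with frequency omega."""
    return abs(HBAR * omega / 2)


rest_energy_joule = ELECTRON_MASS * LIGHT_SPEED ** 2
rest_energy_kev = rest_energy_joule / ELEMENTARY_CHARGE / 1e3

# Postulate of the text: hbar * omega0 / 2 = m0 c^2.
omega0 = 2 * rest_energy_joule / HBAR

print(f"m0 c^2 = {rest_energy_kev:.1f} keV")
print(f"omega0 = {omega0:.3e} rad/s")
print(energy(omega0) == energy(-omega0))  # both senses of rotation, same energy
```

The rest energy comes out at about 511 keV, the energy of each annihilation gamma ray quoted above, and the spin frequency at roughly $1.55 \times 10^{21}$ rad/s.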

We may observe further that SU(2) does not contain antiparticles. Our whole derivation is based on SU(2) and what does not go into a mathematical formalism cannot come out of it by magic. Similarly, the gauge symmetry used to justify the identification with antiparticles is not used in the derivation. Hence, once again, what does not come in cannot come out by magic. In our approach, we do not introduce the notion that negative frequencies correspond to antiparticles. Nor did Dirac originally introduce that notion. We could of course introduce antiparticles and associate them with negative frequencies a posteriori. However, a negative frequency would then correspond to two different physical states. Such ambiguity is the last thing we need, and we can therefore forget about the whole idea of negative energies. In our approach, everything becomes more logical and clear.

1.3. Consequences

Even if we cannot give all the details about it in the present paper, we have derived in [1,6] the Dirac equation meticulously from scratch with the absolute rigour of a mathematical proof. The Schrödinger equation can be derived from the Dirac equation. Hence, the spinor approach contains the basis for a lot of QM. The derivation of the Schrödinger equation introduces approximations that break the symmetry of the Dirac equation, rendering QM actually more difficult to understand. It is somewhat analogous to replacing $e^{x} e^{y} = e^{x + y}$ by announcing the less accurate identity $(1 + x + x^{2} / 2) (1 + y + y^{2} / 2) \approx 1 + (x + y) + {(x + y)}^{2} / 2$, with the effect that some people would no longer make the connection to the foundational idea with its perfect symmetry. Therefore, it is important to base our approach on the derivation of the Dirac equation to make the all-important role of the symmetry completely shine through in all its dazzling beauty. Rather than on some incredible, arcane “intuition”, our derivation sketched above is based on very simple ideas. Instead of deriving the energy and momentum operators from substitutions $E \to - \frac{\hbar}{\imath} \frac{\partial}{\partial t}$, $\mathbf{p} \to \frac{\hbar}{\imath} \nabla$, which extrapolate a result obtained by educated guessing from the de Broglie ansatz for a scalar wave function, which itself was also guessed, we obtain it here by a very different, much more logical derivation. There is no guessing in our derivation, it just rolls out from the combination of Equations (6) and (8). In fact, $\frac{1}{m_{0} c^{2}} (E, c \mathbf{p})$ are the parameters $(\gamma, \gamma \mathbf{v} / c)$ which define a boost. Any four-vector can this way be used to define a boost, which explains why there exists a special relation in QM between the energy-momentum four-vector and the four-gradient.
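The identification of $\frac{1}{m_{0} c^{2}} (E, c \mathbf{p})$ with the boost parameters $(\gamma, \gamma \mathbf{v} / c)$ can be illustrated with a one-dimensional example. This is a minimal sketch: the velocity $v / c = 0.6$, the rest energy of 511 keV and the function names are illustrative choices of ours, and the relations $E = \gamma m_{0} c^{2}$ and $c p = \gamma (v / c)\, m_{0} c^{2}$ are the standard relativistic ones.

```python
import math


def energy_momentum(beta, rest_energy):
    """Return (E, c*p) for a particle with velocity v = beta * c (1D motion)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * rest_energy, gamma * beta * rest_energy


def boost_parameters(energy, c_times_p, rest_energy):
    """Return (gamma, gamma * v / c) = (E, c*p) / (m0 c^2)."""
    return energy / rest_energy, c_times_p / rest_energy


rest_energy = 511.0  # m0 c^2 in keV (illustrative value for the electron)
beta = 0.6           # v / c (illustrative)

E, cp = energy_momentum(beta, rest_energy)
gamma, gamma_beta = boost_parameters(E, cp, rest_energy)

# A boost is characterized by gamma^2 - (gamma v / c)^2 = 1, which is the
# mass-shell relation E^2 - (c p)^2 = (m0 c^2)^2 in disguise.
invariant = gamma ** 2 - gamma_beta ** 2

print(gamma, gamma_beta, invariant)
```

Dividing the energy-momentum four-vector by the rest energy thus yields a unit four-vector, and it is this unit four-vector that fixes the boost.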

There have been many attempts to make sense of QM, e.g., the many-worlds interpretation [7], Bohm’s approach [8] and Cramer’s transactional interpretation [9], just to name a few of them. Such attempts often introduce some new physical idea with classically forbidden traits. According to one’s personal taste, one will consider these transgressions of the classical common sense as credible or otherwise. When this physical idea is hard to verify, in the end, one is still left wondering if it is true or otherwise. We are walking on eggshells. The approach described here tries to avoid at all costs introducing physical ideas whose truth is hard to decide upon. The starting point is figuring out the geometrical meaning of spinors. This is pure mathematics and not open to discussion. It can only be right or wrong, and the reader can figure this out for himself by reading at least pp. 3–16 of [2], which gives all the details. He would see that it gives a very clear intuition for what spinors are. Once he has picked this up, the meaning of QM will just unfold itself. This is because the geometrical meaning of the algebra used in QM, which is Clifford algebra, is already given by the mathematics of the group theory itself prior to any application of the algebraic part of it to physics. Understanding the geometrical meaning of the algebra boils down to understanding the group theory and spinors. The reader can acquire this understanding by reading [2]. This will provide him with the key to make sense of the algebra of QM. There is thus no need for introducing puzzling additional physical assumptions in order to unravel the mysteries of QM. All we need is already contained in the mathematics. Understanding spinors even makes it possible to spot and correct flaws in the traditional theory.

What we gain in our new approach is that we know exactly which ingredients are used to derive the equation. In Dirac’s approach, one is left free to imagine that some very special quantum axioms may be needed to derive it, because one just does not know what the underlying axioms are and the experimental results it describes are baffling. This of course leaves the door open for introducing the destabilizing speculative ideas mentioned above. In Dirac’s approach, we also remain in the dark as to the geometrical meaning of the equation. The reader may be stunned by the fact that the derivation of this eminently quantum mechanical equation is purely classical. It may leave him incredulous. Where does the quantum magic then come from? This is discussed in great detail in [1,6]. The way we are able to derive the Dirac equation calls for caution. If the equation can be derived from such simple assumptions, several deductions drawn from the traditional approach may be overinterpretations that are just not warranted.

It turns out that, if one masters the group theory, one can derive many results of QM by just classical reasoning. The quantum mysteries disappear and the theory becomes intuitive and intelligible. We therefore undertake the quest to spot a phenomenon where we become obliged to introduce some quantum magic anyway. Some salient examples of our results are the derivation of this Dirac equation from scratch (with full details in [1] and an addition in [6]), the solution of the particle-wave duality in [6,10], the solution of the paradox of Schrödinger’s cat in Section 2.3.2 of [2] and in [5] and an explanation for the double-slit experiment in [6,10] (which can be further enriched by using the Appendix of [5] to deal with incoherent sources), but there are many more. I have never addressed tunneling because the work of Hansen and Ravndal [11] already explains it perfectly. These successes are obtained without the counterintuitive physical assumptions that are introduced in some other approaches. The latter assumptions are thus introducing mystery and magic without a valid reason and are therefore misleading. In general, there is much less magic than we are used to thinking and that is what makes our approach so interesting. The present paper shows how our approach also permits us to make sense of the Stern–Gerlach experiment.

Traditional QM has been discovered with rather stunning serendipity. Dirac just guessed his equation and many other rules were introduced ad hoc. Despite the shifting grounds of these shaky foundations, QM has proved extremely successful. However, I must insist on warning the over-sceptical reader that he cannot attack my work by using the traditional textbook wisdom as the ultimate touchstone for the truth, e.g., when my work flies in the face of accepted notions or if it draws him out of his comfort zone. That is because my approach, which is a reconstruction of QM from scratch based on the geometrical meaning of spinors, should be considered as a competing theory which leads to the same algebraic results. Competing theories cannot be compared by considering one of them as the absolute truth. The comparison must be based on other merits. Here, these merits are not the agreement of the algebra with the experimental results, because the algebra remains the same. The merit of my approach is that it is based on an already pre-existing clear geometrical meaning of that algebra, provided by the group theory, whereby the results are mathematically derived and proved. It must therefore a priori be considered as superior to the traditional approach, as a viewpoint that is developed from guesses and rather uses the algebra as a black box (under the motto “Shut up and calculate!”) cannot seriously pretend to prevail with authority over an approach based on mathematical derivations and proofs.

In this paper, I have to continue pointing out errors in the traditional theory as already done in the preceding subsections. The fact that I insist on pointing out errors may upset some readers. It may appear as a rant based on sheer arrogance or a lack of respect for Dirac. However, the true issue can only be that my work is an alternative approach to QM, which makes a lot of things that appeared mysterious intelligible. It is absolutely crucial to delineate what is wrong and what is right in this approach when it contradicts accepted notions of the traditional approach or else this would lead to confusion. Solving paradoxes requires pinpointing and neutralizing subliminal logical errors with surgical precision. Nobody is served by keeping such errors concealed, especially since QM is fraught with paradoxes. I cannot lie for reasons of respect. If people want to understand QM, they will have to accept that it may take correcting mistakes. You cannot ask for a better understanding of QM and postulate at the same time that everything that is different from what you have learned must be wrong. What you have learned is not sacred and it is responsible for the conceptual impasse we are in. Pointing out errors and the differences between the new approach and the traditional approach is a necessary part of comparing them, especially in situations where we encounter conceptual difficulties in the traditional interpretation that can be solved in the new one. Yes, the new approach is non-canonical and it is easy to pooh-pooh it for that reason, although the development in this paper shows that this is in reality its strength. However, is it not madness to still think after a hundred years that one would be able to break away from the conceptual difficulties we have in making sense of QM by just sticking to the traditional canonical approach? As identical causes ought to produce identical effects, the breakthrough may just have to come from a non-canonical approach.

1.4. Breakdown of the Standard Dirac Formalism in the Case of Precession

As noted above, we make all the derivations above assuming that the orientation of the spin axis remains fixed. Hence, a priori the Dirac equation cannot be used to describe more complicated motions such as precession. This is something one cannot become aware of in Dirac’s approach. On pp. 313–316 of [1], we have expressed such a precessing motion in SU(2). We have taken the expression for a spinning motion with an angular frequency

ω_{0}

around a general spin axis defined by the unit vector

s

with spherical coordinates

(θ, ϕ)

. Note that we are using

ϕ

and

φ

(defined in Equation (2)) as two different symbols in this article. Then, we have considered what we get by rotating this spinning object bodily around the z-axis with a frequency

Ω

. At the moment

τ

, a non-precessing spinning top is represented by the rotation matrix

R (s, ω_{0} τ)

. As what we get with precession has during the time

τ

bodily been rotated around the z-axis with a frequency

Ω

, we obtain for the precessing spinning top:

\begin{matrix} S (e_{z}, Ω τ, s, ω_{0} τ) = R (e_{z}, Ω τ) R (s, ω_{0} τ), \\ \text{where: } R (e_{z}, Ω τ) = [\begin{matrix} e^{- ı Ω τ / 2} & 0 \\ 0 & e^{ı Ω τ / 2} \end{matrix}] . \end{matrix}

(10)

The matrix

R (e_{z}, Ω τ)

describes the rotational motion around the z-axis. The detailed expression for

R (s, ω_{0} τ)

is given by Equation (13) in Section 3. If the reader has doubts about the correctness of Equation (10), he should think about two identical spinning tops, one that stays at a fixed position in space with respect to the center of the Earth and one that co-rotates with the Earth around its axis. Now, we can differentiate Equation (10) with respect to

τ

. We have made this calculation on pp. 313–316 of [1] and it yields:

\frac{d S}{d τ} = - ı [(Ω + ω_{0} (τ)) / 2] S,

(11)

where

Ω = Ω e_{z}

and

ω_{0} (τ) = ω_{0} s (τ)

. A first observation is here that we no longer obtain a scalar in front of

S

(or its spinor

χ

) by using

- \frac{ℏ}{ı} \frac{\partial}{\partial τ}

but a vector, while the energy definitely should be a scalar. There is thus definitely something wrong with the traditional energy operator in the new context which goes beyond the domain of applicability of the formalism of the Dirac equation and can therefore only be treated by a non-canonical approach. The prescription

- \frac{ℏ}{ı} \frac{\partial}{\partial τ}

for the energy operator ceases to be valid in the extended setting.
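Equations (10) and (11) can be checked numerically. The sketch below is only an illustration, not the calculation of [1]: it assumes the SU(2) Rodrigues form R(n, φ) = cos(φ/2) 𝟙 − ı sin(φ/2) [n·σ], and it reads the bracket in Equation (11) as the matrix [(Ω e_z + ω_0 s(τ))·σ]/2, where s(τ) is the initial axis turned around the z-axis over the angle Ωτ. The helper names (`rot`, `precessing_top`, `generator`) and the numerical values are hypothetical.

```python
import numpy as np

# Pauli matrices and the 2x2 unit matrix
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
EZ = np.array([0.0, 0.0, 1.0])

def dot_sigma(v):
    """Matrix v . sigma that represents the vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ

def rot(axis, angle):
    """SU(2) rotation: cos(angle/2) 1 - i sin(angle/2) [axis . sigma] (assumed form)."""
    return np.cos(angle / 2) * I2 - 1j * np.sin(angle / 2) * dot_sigma(axis)

def precessing_top(tau, s0, omega0, Omega):
    """S of Equation (10): spin around s0, then bodily rotation around z."""
    return rot(EZ, Omega * tau) @ rot(s0, omega0 * tau)

def generator(tau, s0, omega0, Omega):
    """Matrix -i [(Omega e_z + omega0 s(tau)) . sigma] / 2 multiplying S in Equation (11)."""
    c, s = np.cos(Omega * tau), np.sin(Omega * tau)
    s_tau = np.array([c * s0[0] - s * s0[1], s * s0[0] + c * s0[1], s0[2]])
    return -0.5j * (Omega * SZ + omega0 * dot_sigma(s_tau))

# Arbitrary illustrative values (not taken from the paper)
theta, phi = 0.7, 0.3
s0 = np.array([np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)])
omega0, Omega, tau, h = 5.0, 0.8, 1.3, 1e-6

S = precessing_top(tau, s0, omega0, Omega)
# Central finite difference for dS/dtau
dS = (precessing_top(tau + h, s0, omega0, Omega)
      - precessing_top(tau - h, s0, omega0, Omega)) / (2 * h)
residual = np.max(np.abs(dS - generator(tau, s0, omega0, Omega) @ S))
print("max |dS/dtau - G S| =", residual)
```

The residual is at the level of the finite-difference error, and the prefactor is a 2×2 matrix proportional to a vector, not a multiple of the unit matrix, which is the point made in the text.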

We are confronted with an analogous situation in Equation (4) with its wave function

χ

. We can recover the energy operator by replacing

χ

by

ψ

as defined in Equation (9). When in this formalism, we detect an electron with an angular frequency

ω_{0}

in the state

ψ

the theory cannot tell us with certainty if it is in the state

χ

or

[s \cdot σ] χ

. It is crucial to acknowledge that Equation (4) is also a correct equation. It just does not yield the Dirac equation of QM, while our aim is validating our approach to the meaning of QM by deriving the Dirac equation. It is only to achieve this goal that we introduce

ψ

by Equation (9).

We cannot apply the traditional energy operator on the wave function

χ

of Equation (4) because it does not yield the nice result

ℏ ω_{0} / 2

but

(ℏ ω_{0} / 2) [s \cdot σ]

. The correct energy operator to be used with Equation (4) would be

- \frac{ℏ}{ı} [s \cdot σ] \frac{\partial}{\partial τ}

. This is feasible because

[s \cdot σ]

is constant anyway. However, there is no necessity to obtain

- \frac{ℏ}{ı} \frac{\partial}{\partial τ}

as the energy operator for a wave equation apart from the desire to stay within the formalism of the Dirac equation. When we want to describe precession, we are beyond the scope of the Dirac equation and there is no longer a gimmick that can help us to preserve the definition of the energy operator under the form

- \frac{ℏ}{ı} \frac{\partial}{\partial τ}

and drag the equation back into the field of applications of the Dirac equation. This is because there are now in any case angular frequencies with two different absolute values

| ω_{0} + Ω |

or

| ω_{0} - Ω |

occurring within a single column of

S

(see Section 3), such that the energy operator can no longer project out a scalar energy eigenvalue in front of

S

or its spinor

χ

. This is all fair enough. There is nothing wrong with it. It just signals that we are outside the scope of the Dirac equation, in the same way as Equation (4) was outside the scope of the Dirac equation. However, we are not outside the broader scope of the group theory, which is our conceptual basis to formulate QM. The matrix

S

describes now a superposition state that contains in total four different algebraic angular frequencies.
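The frequency content of a single column of S can be made explicit. The following sketch assumes the forms of Equations (10) and (13) and multiplies out the first column of S = R(e_z, Ωτ) R(s, ω_0 τ); the function names and parameter values are illustrative only. The result contains the four algebraic frequencies ±(ω_0 + Ω)/2 and ±(ω_0 − Ω)/2, i.e., two different absolute values.

```python
import numpy as np

def first_column_of_S(tau, theta, phi, omega0, Omega):
    """First column of S, built directly from Equations (10) and (13)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    # First column of R(s, omega0 tau) according to Equation (13)
    r00 = c * c * np.exp(-1j * omega0 * tau / 2) + s * s * np.exp(1j * omega0 * tau / 2)
    r10 = np.exp(1j * phi) * s * c * (np.exp(-1j * omega0 * tau / 2)
                                      - np.exp(1j * omega0 * tau / 2))
    # Left multiplication by the diagonal matrix R(e_z, Omega tau) of Equation (10)
    return np.array([np.exp(-1j * Omega * tau / 2) * r00,
                     np.exp(1j * Omega * tau / 2) * r10])

def four_frequency_form(tau, theta, phi, omega0, Omega):
    """Same column written as a sum of exponentials with frequencies +/-wp and +/-wm."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    wp = (omega0 + Omega) / 2
    wm = (omega0 - Omega) / 2
    top = c * c * np.exp(-1j * wp * tau) + s * s * np.exp(1j * wm * tau)
    bottom = np.exp(1j * phi) * s * c * (np.exp(-1j * wm * tau) - np.exp(1j * wp * tau))
    return np.array([top, bottom])

# Illustrative values (assumptions, not taken from the paper)
params = dict(theta=0.9, phi=0.4, omega0=7.0, Omega=1.5)
taus = np.linspace(0.0, 10.0, 50)
max_diff = max(np.max(np.abs(first_column_of_S(t, **params) - four_frequency_form(t, **params)))
               for t in taus)
print("max difference between the two forms:", max_diff)
```

For θ = 0 the terms in sin(θ/2) vanish and only the single frequency (ω_0 + Ω)/2 survives, so it is the tilt of the spin axis that brings in the second absolute value.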

Since in our non-canonical approach outlined in Section 1 we gain a complete geometrical understanding of the ingredients that are needed to derive the Dirac equation, we can now derive a completely novel formalism within the same framework of group representation theory to deal with this new situation. Our hands are not tied to the canonical formalism of QM and its energy operator because our framework has a larger domain of applications than the Dirac equation for a fixed spin axis. Our framework is the group representation theory and its geometrical meaning, which we have already validated as a basis for a new and more intelligible approach to QM by showing that it can be used to derive the Dirac equation. The non-canonical approach can now outrun the canonical approach in its power to deal with novel situations, because we show that we can deal with precession. In our approach, we have to split the four-frequency superposition state into its two different energy components

| ℏ (ω_{0} + Ω) / 2 |

and

| ℏ (ω_{0} - Ω) / 2 |

, because it does not make sense to make a brute-force calculation of the energy of a superposition state that involves pure states of different energies. It is this brute-force calculation which gives rise to the unphysical feature of a varying energy in the QM treatment of precession, which is discussed in Section 2.3. We must first calculate the energies of the pure states, and, if we want to do so, we can calculate the average energy by making statistical averages afterwards. We have now all the prerequisites to understand how we can tackle the Stern–Gerlach experiment in our new approach.

2. The Stern–Gerlach Experiment: Confusion Reigns

2.1. Preamble

In a Stern–Gerlach experiment, neutral spin-

1 / 2

particles are used, e.g., Ag atoms. In our description, we always focus our attention on electrons, even if a Stern–Gerlach experiment on electrons might be extremely difficult to perform. The real problem we want to discuss is the case of an electron with spin

1 / 2

in a magnetic field (the anomalous Zeeman effect), for which we have been taught that the electron spin can only be up or down and never tilted as we assume in the attempt to describe precession, reported in Section 2.3.

In this section, we want to pin down the total lack of intuition and the total lack of theory which prevail in the traditional presentation of the Stern–Gerlach experiment [12]. This experiment is a choice example of what happens all the time in QM. The algebra of the theory agrees perfectly with the experimental data but we cannot possibly make sense of what this algebra means. I often use the analogy of the correspondence between algebra and geometry in algebraic geometry to explain that the calculus of QM, its algebra, is exact but we do not know what its correct intuitive interpretation, i.e., its “geometry”, should be. In this respect, Villani uses the qualifiers “analytic” for what we call algebraic and “synthetic” for what we call “geometric” [13]. Perhaps this terminology is more accurate than ours. The purpose of making this difference between algebra and “geometry” is to make very clear right from the start that in general I am not questioning the algebra because it is correct, as it is always perfectly in agreement with the observed experimental data. All I want to do (and all I still can do) is find an intelligible corresponding “geometry”, as I described in Section 1.

In view of all that has been said, when you know the algebraic part of the spinor formalism and you know that the corresponding synthetic part must be the group theory of the rotation and Lorentz groups, then you might expect that explaining the Stern–Gerlach experiment synthetically should not be too difficult. However, lo and behold, this is certainly not the case here. One reason for this is that, for this specific exceptional case, I have to attack the textbook algebra because it is egregiously wrong.

2.2. Total Absence of Theory

Indeed, as pointed out in Section 1.1 and many times before, especially in [1,2,14], the shorthand notation

B \cdot σ

or

B \cdot γ

that occurs in the equations is not the scalar product of the magnetic field

B

with some “vector”

σ

or

γ

, where

\frac{ℏ}{2} σ

or

\frac{ℏ}{2} γ

would be the “spin vector”. As a matter of fact,

B \cdot σ

or

B \cdot γ

just expresses the magnetic field

B

.

Furthermore, the (non-relativistic) unit vector

s

which is parallel to the spin axis is not represented by

σ

or

γ

but by

s \cdot σ

or

s \cdot γ

, which often remains hidden inside the notation for the spinor

ψ

. When

s \cdot σ

or

s \cdot γ

do not explicitly occur in the equations, there cannot be any form of algebraic chemistry, e.g., in the form of a multiplication, between

s \cdot σ

and

B \cdot σ

, in those equations. Similar remarks apply for

s \cdot γ

and

B \cdot γ

in the Dirac formalism, but from now on we only formulate things in the SU(2) formalism.

The textbook theory exploits the mathematical errors mentioned to claim that the “spin vector”

\frac{ℏ}{2} σ

, after multiplication by

\frac{q}{m_{0}}

, defines the “magnetic dipole”

μ = \frac{ℏ q}{2 m_{0}} σ

. This sleight of hand transforms the axial vector

\frac{ℏ q}{2 m_{0}} B \cdot σ

by magic into a scalar

B \cdot μ

, where

μ

is now considered to be a magnetic dipole and

V = - B \cdot μ

becomes a “potential energy”. The expression for this “potential energy” corresponds conveniently to our classical intuition, which might convince you to “wisely ignore” the mathematical errors I am pointing out here. However, it is absolutely essential that the reader gets the point that he cannot override or talk his way out of this mathematical verdict by belittling it as inconsequential, which is a frequent attitude of physicists when they are confronted with criticism of a formalism they strongly believe to work. Because they find agreement with experiment, they reckon the theory must be right. However, the fact that the bottom line of a child’s homework is right, does not imply that it is entirely flawless. A mathematical error must always be taken seriously because it can be a warning sign that something is wrong such that it cannot always be overruled based on intuition. The development of the paper further confirms that imperviously ignoring the errors leads here to wrong intuition and a conceptual impasse. Note that the expression of the trivector

μ = \frac{ℏ q}{2 m_{0}} σ

, which is interpreted as a vector

μ

, does not even contain the spin vector

\frac{ℏ}{2} s

. That, in the Clifford algebra, a matrix

a \cdot σ

represents the vector

a

can also be checked in [2], p. 12 and [3], p. 43.
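The statement that a·σ stands for the vector a, and not for a scalar product, can be illustrated with the standard Pauli-matrix identities. The sketch below is a generic check and is not taken from [2] or [3]; the vectors `a` and `b` are arbitrary illustrative values. It verifies that (a·σ)² = |a|² 𝟙, that the product of two such matrices is (a·b) 𝟙 + ı (a ∧ b)·σ, and that the components of a can be read back from the matrix.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def dot_sigma(v):
    """Shorthand v . sigma = v_x sigma_x + v_y sigma_y + v_z sigma_z (a 2x2 matrix)."""
    return v[0] * SX + v[1] * SY + v[2] * SZ

# Arbitrary illustrative vectors
a = np.array([0.3, -1.2, 0.5])
b = np.array([2.0, 0.1, -0.7])
A, B = dot_sigma(a), dot_sigma(b)

# The square of the matrix gives the squared length of the vector
square_ok = np.allclose(A @ A, np.dot(a, a) * I2)

# The product splits into a scalar part (a . b) and a part built on the cross product
product_ok = np.allclose(A @ B, np.dot(a, b) * I2 + 1j * dot_sigma(np.cross(a, b)))

# The vector components are recovered as a_j = Tr(A sigma_j) / 2
recovered = np.array([0.5 * np.trace(A @ S).real for S in (SX, SY, SZ)])

print(square_ok, product_ok, recovered)
```

In this check the true scalar product a·b only appears as the coefficient of 𝟙 in the product of two vector matrices, whereas a·σ itself is traceless and carries all three components of a.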

Even if we persisted in ignoring the error, it would remain very difficult to understand within this picture why the spin should select two orientations in order to align with

B

, rather than just one, viz. the one that would minimize its energy within the picture of a potential. Can the spin then also maximize its potential energy?

Despite its appeal, the ansatz

V = - μ \cdot B

is also truly problematic. There is no dipole in mathematics. The idea of a dipole is based on the picture of a current loop. However, as Lorentz pointed out, if all the charge of an electron were put on its equator, even at a velocity c it would not be large enough to account for the hypothetical dipole moment. The algebraic expressions

- \frac{q}{2 m_{0}} (B \cdot \hat{L}) 𝟙

for the normal and

- \frac{ℏ q}{2 m_{0}} [B \cdot σ]

for the anomalous Zeeman effect have completely different symmetries in the Clifford algebra, because the dot product in the normal Zeeman effect is a true scalar product, while the dot product in the anomalous Zeeman effect is a shorthand, which expresses a vector. (The anomalous Zeeman effect is often written as

- \frac{q}{m_{0}} (B \cdot \hat{S})

, whereby one calls

\hat{S} = \frac{ℏ}{2} σ

“the spin”. This is wrong as the spin is

\frac{ℏ}{2} [s \cdot σ]

, and it creates the illusion that the term

- \frac{q}{m_{0}} (B \cdot \hat{S})

would be a true scalar product. It is actually often qualified as a true scalar product. Combining it with

- \frac{q}{2 m_{0}} (B \cdot \hat{L})

one obtains then the expression

- \frac{q}{2 m_{0}} B \cdot (\hat{L} + 2 \hat{S})

, which further reinforces the impression that the two types of Zeeman effect have the same symmetry). It is therefore wrong to think about the anomalous Zeeman effect in terms of a dipole moment. Furthermore, we are pretending to talk here about the hypothetical potential energy of a charged spinning point particle in a field

B

, but this field

B

is not a force like the gravitational force

m g

exerted on a spinning top. Any analogy with the potential energy of a spinning top in a gravitational field is a priori potentially misleading and conceptually wanting, as a magnetic field just cannot do any work on a charge. It can exert a force

F = q (v \land B)

, but this force is always perpendicular to the displacement

d r = v d t

and therefore the work

- F \cdot d r = 0

. The situation of an electron in a magnetic field is fundamentally different from that of a spinning top. The energy of the spinning top consists of a potential energy in the gravitational field and two kinetic energy terms, corresponding to the spinning motion and to the precession. The energy of the spinning electron in a magnetic field consists only of the two kinetic-energy terms, whereby the one related to the precession must be treated algebraically. There is no potential-energy contribution.
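The argument that a magnetic field cannot do work on a charge is easy to verify numerically. The sketch below uses an arbitrary velocity and field (illustrative assumptions) and evaluates the power F·v for F = q (v ∧ B).

```python
import numpy as np

rng = np.random.default_rng(0)

q = -1.602176634e-19           # electron charge in C
v = rng.normal(size=3) * 1e5   # arbitrary velocity in m/s (illustrative)
B = rng.normal(size=3)         # arbitrary magnetic field in T (illustrative)

F = q * np.cross(v, B)         # magnetic part of the Lorentz force
power = np.dot(F, v)           # work per unit time, F . dr/dt

# Compare with the natural scale |F| |v| to see that the power vanishes up to rounding
scale = np.linalg.norm(F) * np.linalg.norm(v)
print("F . v =", power, " relative to |F||v|:", power / scale)
```

The force is perpendicular to v for every choice of v and B, so the result does not depend on the random values drawn here.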

2.3. Total Absence of Intuition

For an idealized top which is precessing without friction in a gravitational field, the energy of the top remains constant. However, if you describe a precessing top within the spinor formalism of QM, then the formalism says that the energy is not constant and oscillates between two extreme values (see, e.g., [1], p. 307; [15]). We are referring here of course to the description of an electron in a magnetic field. That the energy could oscillate is really incomprehensible. We could imagine that the electron loses energy by, e.g., radiation, but not how it could regain the energy lost, and, what is more, exactly by the same amount. In our approach, the culprit for this contradiction is easily found: we point out in Section 1.4 that the derivation of the Dirac equation in [1] relies on the assumption that the spin axis remains fixed and that beyond the scope of that assumption the energy operator will no longer be given by

- \frac{ℏ}{ı} \frac{\partial}{\partial t}

. Therefore, a priori the motion of a precessing top cannot be studied with the traditional Dirac equation and the calculations that lead to the varying energies are wrong.

If we dare to be heretical by capitalizing on this remark and assuming that the energy is constant as for a spinning top anyway, we may get a constant-energy term that does not have the correct value, because it will contain an extra factor

cos θ

, where

θ

is the tilt of the spin axis with respect to the magnetic field, at least if you follow the common-sense arguments you have been taught (e.g., by considering a current loop). None of these speculations leads to a calculation that agrees with the startling experimental result, which seems to indicate that the spin of a fermion can only point up or down.

The traditional way out of these puzzling contradictions is the textbook dogma that directions of space would be quantized, and that this would be a quantum mystery. Whereas I fully agree that I do not understand the first word of it, such that calling this a mystery could be appropriate, I nevertheless think that this is logically and mathematically completely ramshackle. First, we should refuse dogmatic mysteries. However, there is something far worse at work than just a weird paradox. In fact, there is a fierce contradiction hidden within that statement. The contradiction at stake here is that the formalism is completely based on the use of SU(2), wherein the allowed axes of rotation explore all directions of

R^{3}

while it claims that the directions would be quantized in the sense that QM would only allow for two directions, spin-up and spin-down! Such a claim is not compatible with the geometry of SU(2).

The wrong images create even more puzzles in the light of the way we could derive the Dirac equation from the assumption that the electron spins in [1]. In developing the Dirac equation by expressing the rotational motion of a spinning electron with the aid of spinors, at a certain stage, we must put

m_{0} c^{2} = ℏ ω_{0} / 2

as done in Equation (6) to obtain the Dirac equation. Here, the electron spins with angular frequency

ω_{0}

around the spin axis

s

, and

m_{0}

is its rest mass. This means that the complete rest energy of the electron is rotational energy. Consider now the statement that in a magnetic field the spin axis aligns with the magnetic field, because the spin can only be up or down. We could, e.g., imagine that the spin axis

s

is pointing in a given direction and that we turn on the magnetic field in a completely different direction. (Note that this implies also the temporary presence of an electric field. An interesting idea would be to consider that the spin of the electron is coupled to the magnetic field becoming part of it such that it automatically turns with the magnetic field as a part of the field when we switch it, but this idea cannot work for the Stern–Gerlach experiment, where the magnetic fields are not being switched.) There must then exist a really fast mechanism for the spin to align. This is puzzling, because the magnetic energy

\frac{ℏ q B}{2 m_{0}}

is dwarfed by the energy

m_{0} c^{2}

. How could this small magnetic energy possibly succeed in imposing alignment on the much larger energy

m_{0} c^{2}

? It does not comply with our daily-life experience and the conservation of angular momentum. In addition, the transition from spin-up to spin-down becomes problematic. Do we really have to turn the whole state with its energy close to

m_{0} c^{2}

bodily upside down to literally “flip the spin”? Perhaps it requires only changing the rotational frequency, but how does this work when the spin is not aligned? We also do not understand how in general the alignment process is supposed to work. Is there some radiation emitted, and if so should this have been observed? Einstein and Ehrenfest, who had anticipated that the spin would precess around the magnetic field (Larmor precession), have even calculated that the realignment would take more than a hundred years [16].
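To put a number on how strongly the magnetic energy is dwarfed by the rest energy, the following sketch evaluates both for an electron. The field strength of 1 T is an assumption chosen only for illustration; the constants are the CODATA values.

```python
# Physical constants (SI units)
HBAR = 1.054571817e-34   # reduced Planck constant, J s
Q = 1.602176634e-19      # elementary charge, C
M0 = 9.1093837015e-31    # electron rest mass, kg
C = 299792458.0          # speed of light, m/s

B = 1.0                  # assumed field strength in tesla (illustrative)

magnetic_energy = HBAR * Q * B / (2 * M0)   # hbar q B / (2 m0), in J
rest_energy = M0 * C ** 2                   # m0 c^2, in J
ratio = magnetic_energy / rest_energy

print(f"magnetic energy: {magnetic_energy:.3e} J")
print(f"rest energy:     {rest_energy:.3e} J")
print(f"ratio:           {ratio:.3e}")
```

For a field of this order, the ratio is roughly 10⁻¹⁰, which is the disproportion the text refers to.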

A final stark example illustrating the ambient confusion and ambivalence is the following. In the Dirac theory, the electron spin is always taken as perpendicular to the plane of motion. When there is a magnetic field, then it is always chosen to be parallel to the z-axis such that the spin can only have the values up or down along the z-axis, in conformity with the theoretical interpretation of the Stern–Gerlach experiment. However, in the explanation of the neutron spin echo technique in solid-state physics [17], you are told that the natural state of affairs is that the neutron spin is always initially aligned with its direction of motion. To make this spin perpendicular to its direction of motion, one has to apply a magnetic field at an angle of 45 degrees with respect to this direction of motion. After a Larmor precession over an angle of 180 degrees around this applied field, the neutron spin will then have become perpendicular to its direction of motion. The rest of the explanation of the method is also entirely based on further Larmor precession of the neutron spin around a guide field. However, as pointed out by Einstein and Ehrenfest, this precession scenario is in contradiction with the results of the Stern–Gerlach experiment. At the very end of the spin echo protocol, the polarization of the spin is measured, and then it is assumed again that the spin can be only up or down. To make the puzzle complete, neutron spin echo has been tried and proved. It works! This reveals how the literature is rife with mutually contradictory scenarios about the way spin behaves. These contradictions are tacitly swept under the rug. Sometimes one assumes that the spin just aligns and one often then invokes the paradigm of a torque exerted on a current loop to explain how this can happen. Sometimes one assumes that the spin must precess and then one often wonders about the mechanism that might eventually align it. 
Enjoy the paradox: the two mental representations are mutually exclusive. They cannot possibly be both right at the same time. How can we possibly sort this out? We do not understand the behavior of the spin.
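The geometry invoked in the spin echo narrative can be checked with an ordinary rotation in three dimensions. The sketch below assumes a neutron moving along the z-axis with its spin initially along z, and a field in the xz-plane at 45 degrees to the direction of motion; the choice of axes and the function name `rotate` are illustrative assumptions.

```python
import numpy as np

def rotate(v, axis, angle):
    """Rodrigues rotation of the vector v around a unit axis over the given angle."""
    axis = axis / np.linalg.norm(axis)
    return (v * np.cos(angle)
            + np.cross(axis, v) * np.sin(angle)
            + axis * np.dot(axis, v) * (1 - np.cos(angle)))

direction = np.array([0.0, 0.0, 1.0])   # direction of motion (assumed along z)
spin = direction.copy()                 # spin initially aligned with the motion
field_axis = np.array([1.0, 0.0, 1.0])  # field at 45 degrees to the motion

# Larmor precession over 180 degrees around the applied field
spin_after = rotate(spin, field_axis, np.pi)

print("spin after precession:", np.round(spin_after, 12))
print("component along the motion:", np.dot(spin_after, direction))
```

Within this classical picture the spin ends up along the x-axis, i.e., perpendicular to the direction of motion, as the spin echo explanation requires.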

3. Tabula Rasa Approach Based on Spinors

In view of all this confusion, typical of a wobbly theory, we must rebuild a theory from scratch and try to solve the paradox within the framework of our new approach. It will therefore be mathematically rigorous and based on a good understanding of spinors [2]. Despite the fact that the author understands spinors quite well, the many contradicting images that are living on in the intuitive folklore about the spin in a magnetic field amount to a formidable conceptual obstacle. They are a smoke screen that kept me in the dark for a very long time and rendered it extremely difficult to find the correct solution. I am confident that I am not the only one who has been running in circles for years in trying to make sense of this spin-up/spin-down doctrine. As we show, it is focusing the attention on the supposed aligning of the spin axis with the magnetic field

B

that sends us irrevocably down the rabbit hole. It is the unshakable belief that the experiment unmistakably tells us that the spin must be aligned which keeps us in the total impossibility of breaking away from the conceptual death trap of space quantization. The fact that this enigma has remained unsolved for almost a century illustrates how difficult it was.

We must thus repeat our warning to the reader that he is in for a rough ride whereby a lot of what he has become used to taking for granted will be ripped apart. Such a statement may cause irritation, as already discussed at the end of Section 1.3, but I think that if you pick up the basics about spinors from [2] and then read the present paper, you will feel rewarded for your efforts. Just as in our derivation of the Dirac equation from scratch in [1] and in Section 1.2, we start from the well-known Rodrigues formula (Equation (2)) in SU(2) for a rotation over an angle

φ

around the axis

s

and put

φ = ω_{0} τ

, where

τ

is the proper time. The resulting equation models then an object that spins at the frequency

ω_{0}

around the axis

s

. For an electron at rest, it suffices to make the calculations in SU(2), as explained above with the aid of Equation (3). From the viewpoint of the traditional approach to QM (based on guessed equations), this starting point may appear to be an extraneous development that is completely out of context and has nothing to do with the formalism of QM, but, as explained in Section 1, the whole formalism of QM is in our approach derived from this Rodrigues formula with the substitution

φ = ω_{0} τ

, such that the development fits completely into the context of our approach.

As easily checked and also derived in [1] (see, e.g., [1], p. 142), one can write the spinning motion in SU(2) in terms of a sum of two frequency components:

R (τ) = \frac{1}{2} [[𝟙 + s \cdot σ] e^{- ı ω_{0} τ / 2} + [𝟙 - s \cdot σ] e^{+ ı ω_{0} τ / 2}] .

(12)

This is a simultaneous description of the mixed state

ψ

defined in Equation (9) and another mixed state

ξ

we define in Equation (A5) of Appendix A. We work all of this out in full detail in Appendix A. Both mixed states are characterized by the fact that they have a well-defined energy. Within the framework of QM, we can consider the two components of the matrix in Equation (12) as two (mixed) beams. In fact, using Ehrenfest’s interpretation of superposition states (see [2], p. 10, complemented by [5], p. 2, for a group-theoretical justification), the presence of the two frequencies in Equation (12) means that we are describing two mixed states simultaneously. Writing the two mixed states that occur in

R (τ)

simultaneously can be considered as just another way of writing a superposition state. We are not forced to consider such a (doubly) mixed beam but the geometrical equation Equation (12) offers us the possibility to do so. Let us now write Equation (12) for a rotation with an axis

s

that is different from the z-axis:

\begin{matrix} R (τ) = & [\begin{matrix} {cos}^{2} (θ / 2) & e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {sin}^{2} (θ / 2) \end{matrix}] e^{- ı ω_{0} τ / 2} \\ + [\begin{matrix} {sin}^{2} (θ / 2) & - e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ - e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {cos}^{2} (θ / 2) \end{matrix}] e^{ı ω_{0} τ / 2} . \end{matrix}

(13)

Here,

(θ, ϕ)

are the spherical coordinates of the spin axis

s

. As already pointed out in Section 1.4, we use

ϕ

and

φ

as two different symbols in this article. Let us now inspect the two components. The

e^{- ı ω_{0} τ / 2}

component is:

\begin{matrix} [\begin{matrix} {cos}^{2} (θ / 2) & e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {sin}^{2} (θ / 2) \end{matrix}] \\ = & [\begin{matrix} cos (θ / 2) e^{- ı ϕ / 2} \\ sin (θ / 2) e^{+ ı ϕ / 2} \end{matrix}] \otimes [\begin{matrix} cos (θ / 2) e^{ı ϕ / 2} & sin (θ / 2) e^{- ı ϕ / 2} \end{matrix}] . \end{matrix}

(14)

We recover here the result

𝟙 + s \cdot σ = 2 ψ_{1} \otimes ψ_{1}^{†}

from [1] (see Equations (3.28) and (5.25)), where

ψ_{1}

is the spinor that corresponds to

R

. The algebraic expression that occurs in Equation (14) is in reality not

ψ_{1} \otimes ψ_{1}^{†}

, but rather its value

ψ_{s 1} \otimes ψ_{s 1}^{†}

at the starting time

τ = 0

, whereby

ψ_{s 1}

is defined by

ψ_{1} = e^{- ı ω_{0} τ / 2} ψ_{s 1}

. The

e^{+ ı ω_{0} τ / 2}

component is:

\begin{matrix} [\begin{matrix} {sin}^{2} (θ / 2) & - e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ - e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {cos}^{2} (θ / 2) \end{matrix}] \\ = & [\begin{matrix} sin (θ / 2) e^{- ı ϕ / 2} \\ - cos (θ / 2) e^{+ ı ϕ / 2} \end{matrix}] \otimes [\begin{matrix} sin (θ / 2) e^{ı ϕ / 2} & - cos (θ / 2) e^{- ı ϕ / 2} \end{matrix}] . \end{matrix}

(15)

This corresponds to

𝟙 - s \cdot σ = 2 ψ_{2} \otimes ψ_{2}^{†}

, where

ψ_{2}

is the conjugated spinor corresponding to

R

, i.e., the second column of

R

. Again, the quantity that occurs in Equation (15) is rather

ψ_{s 2} \otimes ψ_{s 2}^{†}

. Note that

ψ_{1}

and

ψ_{2}

are orthogonal.
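The identities in Equations (12), (14) and (15) lend themselves to a quick numerical check. The following is a minimal sketch in Python with NumPy, assuming the standard SU(2) Rodrigues form cos(φ/2) 𝟙 − ı sin(φ/2) s·σ for the rotation. The function names and the numerical values of the angles are illustrative choices, not notation from [1].

```python
import numpy as np


def spinors_at_start(theta, phi):
    """Return the tau = 0 spinors psi_s1 and psi_s2 of Equations (14) and (15)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    psi_s1 = np.array([c * np.exp(-1j * phi / 2), s * np.exp(+1j * phi / 2)])
    psi_s2 = np.array([s * np.exp(-1j * phi / 2), -c * np.exp(+1j * phi / 2)])
    return psi_s1, psi_s2


def s_dot_sigma(theta, phi):
    """Return s.sigma for the unit vector s with spherical coordinates (theta, phi)."""
    sx = np.sin(theta) * np.cos(phi)
    sy = np.sin(theta) * np.sin(phi)
    sz = np.cos(theta)
    return np.array([[sz, sx - 1j * sy], [sx + 1j * sy, -sz]])


theta, phi, angle = 0.9, 2.2, 1.7  # arbitrary angles; angle plays the role of omega_0 * tau
identity = np.eye(2)
s_sigma = s_dot_sigma(theta, phi)
psi_s1, psi_s2 = spinors_at_start(theta, phi)

# psi (x) psi^dagger is the outer product of psi with its complex conjugate.
plus = np.outer(psi_s1, psi_s1.conj())   # right-hand side of Equation (14)
minus = np.outer(psi_s2, psi_s2.conj())  # right-hand side of Equation (15)

error_plus = np.max(np.abs(2 * plus - (identity + s_sigma)))
error_minus = np.max(np.abs(2 * minus - (identity - s_sigma)))
overlap = abs(np.vdot(psi_s1, psi_s2))  # |psi_s1^dagger psi_s2|

# Equation (12) versus the assumed Rodrigues form.
two_frequency = plus * np.exp(-1j * angle / 2) + minus * np.exp(+1j * angle / 2)
rodrigues = np.cos(angle / 2) * identity - 1j * np.sin(angle / 2) * s_sigma
error_rodrigues = np.max(np.abs(two_frequency - rodrigues))

print(error_plus, error_minus, overlap, error_rodrigues)
```

All four printed numbers should vanish up to rounding errors for any choice of the three angles.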

Up to now, all calculations have been pure geometry. To introduce the physics, we rely on just one single idea (we first introduced this in [14]), viz. that a magnetic field would make the spin vector precess, based on the following heuristics. For different radii of the circular motion within a magnetic field, the cyclotron frequency remains the same in the non-relativistic limit. Every local co-traveling frame will spin at the same frequency, just in the same way as your horse on a merry-go-round not only moves along a circle but also spins around its own axis with respect to the frame of the observers on the ground. If you shrink the circular orbit in the magnetic field to a point, the spinning motion with the cyclotron frequency around the axis remains. Therefore, a pointlike charged particle at rest in a magnetic field would be spinning even if it were initially spinless. However, if it initially already spins and its spin axis is tilted, then this axis is precessing, which corresponds to the intuitive narrative based on the analogy with a spinning top. We encounter this merry-go-round scenario also in Purcell’s explanation of the Thomas precession [18]. It provides us with some classical intuition for the anomalous Zeeman effect. However, in the Bohr–Sommerfeld imagery of QM, these heuristics are thwarted by the fact that the orbits are quantized. For matters of rigor, we must therefore consider all these ideas as mere heuristics and we have absolutely no cogent a priori knowledge that would help us in deciding if these heuristics are correct or otherwise. We can only acknowledge that spin precession is a popular intuitive scenario. The final test of this merry-go-round scenario will be whether it reproduces the experimental results. For a magnetic field

B

aligned with the z-axis, we obtain then an electron whose spin axis is precessing according to Equation (10). Here,

Ω = \frac{q B}{m_{0}}

is now the cyclotron frequency. Let us write the effect of this precession on both components of

R (τ)

. For the

e^{+ ı ω_{0} τ / 2}

component:

\begin{matrix} [\begin{matrix} e^{- ı Ω τ / 2} & 0 \\ 0 & e^{+ ı Ω τ / 2} \end{matrix}] [\begin{matrix} {sin}^{2} (θ / 2) & - e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ - e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {cos}^{2} (θ / 2) \end{matrix}] e^{+ ı ω_{0} τ / 2} = \\ [\begin{matrix} {sin}^{2} (θ / 2) & - e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ 0 & 0 \end{matrix}] e^{ı (ω_{0} - Ω) τ / 2} + \\ [\begin{matrix} 0 & 0 \\ - e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {cos}^{2} (θ / 2) \end{matrix}] e^{ı (ω_{0} + Ω) τ / 2} . \end{matrix}

(16)

The matrices are here again tensor products. However, they are now of a novel type

χ \otimes ψ^{†}

, which no longer provides a familiar link with some rotation axis as in the equation

𝟙 + s \cdot σ = 2 ψ_{1} \otimes ψ_{1}^{†}

. This is quite normal because a precession has no fixed rotation axis. We are working all the time with matrices that can be written as tensor products because they have determinant zero. That a matrix with zero determinant can be written as a tensor product is a specificity of

2 \times 2

matrices. The result of multiplying such a matrix with determinant zero with another matrix will lead to a new matrix that still has determinant zero, such that it can be written again as a tensor product, but it will no longer have the structure

ψ \otimes ψ^{†}

.
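The statement that a 2 × 2 matrix with determinant zero can always be written as a tensor product of a column and a row can be illustrated numerically. The sketch below is only an illustration under our own assumptions: the helper `factor_rank_one` and the matrix `arbitrary` are hypothetical and do not occur in the formalism.

```python
import numpy as np


def factor_rank_one(matrix):
    """Split a nonzero singular 2x2 matrix into (column, row) with matrix = column (x) row."""
    pivot = int(np.argmax(np.linalg.norm(matrix, axis=1)))  # row with the largest norm
    row = matrix[pivot]
    # Each row of a singular 2x2 matrix is a multiple of `row`; project to find the multiples.
    column = matrix @ row.conj() / np.vdot(row, row)
    return column, row


theta, phi = 0.8, 0.5  # arbitrary illustrative angles
c, s = np.cos(theta / 2), np.sin(theta / 2)

# The matrix (1 - s.sigma)/2 of Equation (15), which has determinant zero.
projector = np.array([[s * s, -np.exp(-1j * phi) * s * c],
                      [-np.exp(1j * phi) * s * c, c * c]])

# Multiplying by an arbitrary matrix keeps the determinant zero.
arbitrary = np.array([[0.3 + 1.0j, -2.0], [0.7, 1.5 - 0.4j]])
product = arbitrary @ projector

column, row = factor_rank_one(product)
determinant = abs(np.linalg.det(product))
reconstruction_error = np.max(np.abs(np.outer(column, row) - product))
print(determinant, reconstruction_error)
```

The product is still a tensor product of a column and a row, but the column is in general no longer the Hermitian conjugate of the row.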

We can actually trace back how such hybrid terms come about. Let us call the spinor of the rotation around the z-axis

χ_{1}

and its conjugated spinor

χ_{2}

. The first term in Equation (16), the one that goes with

e^{ı (ω_{0} - Ω) τ / 2}

, is obtained from multiplying:

[\begin{matrix} 1 \\ 0 \end{matrix}] \otimes \underbrace{[\begin{matrix} 1 & 0 \end{matrix}] [\begin{matrix} sin (θ / 2) e^{- ı ϕ / 2} \\ - cos (θ / 2) e^{+ ı ϕ / 2} \end{matrix}]}_{sin (θ / 2) e^{- ı ϕ / 2}} \otimes [\begin{matrix} sin (θ / 2) e^{ı ϕ / 2} & - cos (θ / 2) e^{- ı ϕ / 2} \end{matrix}] .

(17)

It thus corresponds to

[χ_{s 1} \otimes χ_{s 1}^{†}] [ψ_{s 2} \otimes ψ_{s 2}^{†}]

. We can multiply the underbraced matrices in the middle, which can be shown to be a correct procedure. We obtain then the scalar

sin (θ / 2) e^{- ı ϕ / 2}

and

χ_{s 1} \otimes ψ_{s 2}^{†}

. In this way, we obtain again the first term of Equation (16). We see that it is obtained by combining

χ_{1}

and

ψ_{2}

, which is why

Ω

occurs with a minus sign and

ω_{0}

with a plus sign. The other component yields:

\begin{matrix} [\begin{matrix} e^{- ı Ω τ / 2} & 0 \\ 0 & e^{+ ı Ω τ / 2} \end{matrix}] [\begin{matrix} {cos}^{2} (θ / 2) & e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {sin}^{2} (θ / 2) \end{matrix}] e^{- ı ω_{0} τ / 2} = \\ [\begin{matrix} {cos}^{2} (θ / 2) & e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ 0 & 0 \end{matrix}] e^{- ı (ω_{0} + Ω) τ / 2} + \\ [\begin{matrix} 0 & 0 \\ e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {sin}^{2} (θ / 2) \end{matrix}] e^{- ı (ω_{0} - Ω) τ / 2} . \end{matrix}

(18)

As justified in Section 1.2 and discussed in [1], we can consider that the two signs of the frequency

\pm ω

correspond both to the same energy

E = | \frac{ℏ ω}{2} |

. We can then rearrange the terms according to their energies:

\begin{matrix} [\begin{matrix} {cos}^{2} (θ / 2) & e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ 0 & 0 \end{matrix}] e^{- ı (ω_{0} + Ω) τ / 2} + \\ [\begin{matrix} 0 & 0 \\ - e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {cos}^{2} (θ / 2) \end{matrix}] e^{+ ı (ω_{0} + Ω) τ / 2}, \end{matrix}

(19)

where we can factorize out the probability amplitude

cos (θ / 2)

, and:

\begin{matrix} [\begin{matrix} 0 & 0 \\ e^{ı ϕ} sin (θ / 2) cos (θ / 2) & {sin}^{2} (θ / 2) \end{matrix}] e^{- ı (ω_{0} - Ω) τ / 2} + \\ [\begin{matrix} {sin}^{2} (θ / 2) & - e^{- ı ϕ} sin (θ / 2) cos (θ / 2) \\ 0 & 0 \end{matrix}] e^{+ ı (ω_{0} - Ω) τ / 2}, \end{matrix}

(20)

where we can factorize out the probability amplitude

sin (θ / 2)

. Equations (19) and (20) describe the energy states if we send a mixed electron beam into a Stern–Gerlach filter. When the beam is not mixed, each energy state will only have one component.
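The regrouping of Equations (16) and (18) into Equations (19) and (20) can be verified numerically. The following sketch multiplies the rotation of Equation (13) from the left by the diagonal precession matrix and compares the result with the sum of the two energy groups. The function names and parameter values are illustrative assumptions.

```python
import numpy as np


def rotation_components(theta, phi):
    """Return the matrices (1 + s.sigma)/2 and (1 - s.sigma)/2 of Equation (13)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    plus = np.array([[c * c, np.exp(-1j * phi) * s * c],
                     [np.exp(1j * phi) * s * c, s * s]])
    minus = np.array([[s * s, -np.exp(-1j * phi) * s * c],
                      [-np.exp(1j * phi) * s * c, c * c]])
    return plus, minus


def precessing_rotation(theta, phi, omega0, Omega, tau):
    """Left-multiply R(tau) of Equation (13) by diag(e^{-i Omega tau/2}, e^{+i Omega tau/2})."""
    plus, minus = rotation_components(theta, phi)
    spin = plus * np.exp(-1j * omega0 * tau / 2) + minus * np.exp(+1j * omega0 * tau / 2)
    precession = np.diag([np.exp(-1j * Omega * tau / 2), np.exp(+1j * Omega * tau / 2)])
    return precession @ spin


def regrouped_by_energy(theta, phi, omega0, Omega, tau):
    """Return the sums written in Equations (19) and (20)."""
    plus, minus = rotation_components(theta, phi)
    top = np.diag([1.0, 0.0])     # keeps only the first row of a matrix
    bottom = np.diag([0.0, 1.0])  # keeps only the second row of a matrix
    high = (top @ plus * np.exp(-1j * (omega0 + Omega) * tau / 2)
            + bottom @ minus * np.exp(+1j * (omega0 + Omega) * tau / 2))
    low = (bottom @ plus * np.exp(-1j * (omega0 - Omega) * tau / 2)
           + top @ minus * np.exp(+1j * (omega0 - Omega) * tau / 2))
    return high, low


theta, phi, omega0, Omega, tau = 1.0, 0.6, 7.0, 0.9, 0.45  # arbitrary illustrative values
total = precessing_rotation(theta, phi, omega0, Omega, tau)
high, low = regrouped_by_energy(theta, phi, omega0, Omega, tau)
regrouping_error = np.max(np.abs(total - (high + low)))
print(regrouping_error)
```

Only the four frequencies ±(ω_0 + Ω)/2 and ±(ω_0 − Ω)/2 occur, whatever the values of θ and ϕ.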

It transpires from the calculations that there are two possible energies for the electron within the magnetic field, according to the criterion

E = | ℏ ω / 2 |

outlined above. Here,

ω

takes the values

ω_{0} \pm Ω

. We avoid in this way using

- \frac{ℏ}{ı} \frac{\partial}{\partial τ}

as an energy operator in a context where it is no longer valid, as discussed in Section 1.4. Now, we have found an analysis that yields the correct observed energies. It also explains the whole Stern–Gerlach experiment, provided we can still explain how these two energies lead to different trajectories (see below). Let us note that we present the effect of the magnetic field on the charge by Equation (10). This is not something we find in textbooks, but is based on our heuristics (first developed in [14] in terms of vorticity). The algebra does not contain a current loop or a magnetic dipole. It just contains a rotating point charge. The intuition about a magnetic dipole is a wrong intuition. The fact that the magnetism produced by the spin does not need to be of the dipole type is shown by the exchange mechanism proposed by Heisenberg and Majorana, which is based on the Coulomb interaction and the exclusion principle.

The whole puzzle why the magnetic moment would have to align with the field has now disappeared. We find the right energy without having to invoke alignments of axes with the magnetic field. Such alignments are just no longer part of the story. Furthermore, there is simply no longer a well-defined single fixed axis as transpires from the weird terms

χ_{j} \otimes ψ_{k}^{†}

in the formalism. Equation (19) describes a motion with energy

ℏ (ω_{0} + Ω) / 2 = m_{0} c^{2} + \frac{ℏ q B}{2 m_{0}}

and which occurs with probability

{cos}^{2} (θ / 2)

, while Equation (20) describes a motion with energy

ℏ (ω_{0} - Ω) / 2 = m_{0} c^{2} - \frac{ℏ q B}{2 m_{0}}

and which occurs with probability

{sin}^{2} (θ / 2)

, in agreement with the experimental results. These are both complex motions that we cannot describe in simple terms as we do for a rotation around some axis. We can safely assume that these two components just describe precession (see Section 4). The Stern–Gerlach filter separates these two energies into two different beams. It is one of those two rearranged combinations that in general would be fed into a next Stern–Gerlach apparatus if we performed an experiment with a sequence of Stern–Gerlach filters. The precession just adapts all the time to the magnetic field present and it stops when there is no magnetic field. There are never quantum jumps in the motion of the spin vector, while in the traditional paradigm such jumps appear inevitable. Note that the average energy is

ℏ (ω_{0} + Ω cos θ) / 2

, such that

V = - μ \cdot B

is a macroscopic energy term, which is not applicable to individual fermions. It is not a potential energy. This average energy is no longer varying with time as in the brute-force QM calculation discussed in Section 2.3.
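The expression for the average energy follows from weighting the two energies with their probabilities. A short numerical sketch of this average is given below, in units in which the common factor ℏ/2 is dropped. The function name and the numerical values are illustrative assumptions.

```python
import numpy as np


def average_frequency(theta, omega0, Omega):
    """Probability-weighted mean of the two frequencies omega0 + Omega and omega0 - Omega."""
    p_up = np.cos(theta / 2) ** 2    # probability of the energy hbar (omega0 + Omega) / 2
    p_down = np.sin(theta / 2) ** 2  # probability of the energy hbar (omega0 - Omega) / 2
    return p_up * (omega0 + Omega) + p_down * (omega0 - Omega)


omega0, Omega = 10.0, 0.3              # arbitrary illustrative frequencies
thetas = np.linspace(0.0, np.pi, 7)    # a few tilt angles of the spin axis
weighted = average_frequency(thetas, omega0, Omega)
closed_form = omega0 + Omega * np.cos(thetas)
print(np.max(np.abs(weighted - closed_form)))
```

The two arrays agree because cos²(θ/2) − sin²(θ/2) = cos θ, which is how the factor cos θ of the macroscopic expression arises from only two microscopic energies.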

The fact that we made our calculation on a mixed beam may raise the question whether this is justified. We may interpret it in terms of clockwise and counterclockwise motion, but must be aware of the fact that we are talking about two mixed states. We perform the calculations on these two states simultaneously to be as general as possible. However, we can see that we could have excluded one state, e.g., by only considering the

e^{- ı ω_{0} τ / 2}

component of Equation (12). Both components lead to the same energies; the results only differ in the algebraic signs.

Most textbooks calculate the force exerted on the fermion starting from an equation for a “potential energy”

V = - μ \cdot B

and then using

F = - \nabla V

. However, the physical existence of such a potential energy is doubtful, because a magnetic field cannot do any work. The equation

V = - μ \cdot B

suggests that all directions of space are allowed, which is actually what, according to the traditional theory, the experiment would prove to be conceptually wrong. This traditional calculation for the trajectories is classical because the aim is to show that our classical notions are wrong. In principle, from the traditional point of view one must then still make a quantum mechanical calculation to render the theoretical approach correct. To avoid talking about chimerical potential energies, it is better to base the analysis on the expression

F = - \nabla E

. The force

F = - \nabla E

is the force responsible for the motion of the center of mass of the fermion through the Stern–Gerlach apparatus when the fermion is no longer at rest. One can imagine that it enters the device in uniform motion and then starts to feel a force. For an electron this would be (predominantly) the Lorentz force, but, for the Ag atoms, which are neutral, it will be this gradient force, just as in the original calculation of Stern and Gerlach. Instead of the expression

- B \cdot μ

, which is wrong, we must use here

E = ℏ (ω_{0} \pm Ω) / 2 = m_{0} c^{2} \pm μ B

, which is correct. Using

F = - \nabla E = \mp μ (\nabla B)

will lead then to the same result as in the textbook analysis of the trajectories, after postulating that

μ

is quantized. In being built on group theory, our calculation is entirely classical, and this suffices to explain the experimental results entirely correctly.
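To make the role of the gradient force concrete, the sketch below evaluates F = −∇E by a central finite difference for the two energies m_0 c² ± μB along one coordinate z. The linear field profile, the step size and all numerical values are purely hypothetical and dimensionless; they are not taken from the experiment.

```python
def energy(z, branch, rest_energy, mu, field):
    """Energy m0 c^2 + branch * mu * B(z), with branch = +1 or -1."""
    return rest_energy + branch * mu * field(z)


def force(z, branch, rest_energy, mu, field, step=1e-4):
    """Force -dE/dz, estimated with a central finite difference."""
    upper = energy(z + step, branch, rest_energy, mu, field)
    lower = energy(z - step, branch, rest_energy, mu, field)
    return -(upper - lower) / (2 * step)


def linear_field(z):
    """Hypothetical inhomogeneous field B(z) = 1.0 + 0.5 z (dimensionless)."""
    return 1.0 + 0.5 * z


rest_energy, mu = 100.0, 2.0  # hypothetical dimensionless values
force_up = force(0.0, +1, rest_energy, mu, linear_field)
force_down = force(0.0, -1, rest_energy, mu, linear_field)
print(force_up, force_down)
```

The two energies feel forces of equal magnitude μ |∇B| and opposite sign, independently of the tilt angle θ, which is what separates the beam into two components.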

We may finally remark that the mathematical difficulties related to the errors described in Section 2.2 are solved by the introduction of

ψ

defined in Equation (9) and

ξ

defined in Equation (A5) of Appendix A. From these definitions, it follows that, for the special case

s = e_{z}

, which corresponds to

s ‖ B = B e_{z}

, we have

[B \cdot σ] ψ = B [e_{z} \cdot σ] ψ = B ψ

and

[B \cdot σ] ξ = B [e_{z} \cdot σ] ξ = - B ξ

. This special case is the only case which can be treated by the Dirac equation because it does not give rise to precession. In Section 2.2, which treats this special case, we encounter the riddle of what to do with the vector term

[B \cdot σ]

in an equation that is supposed to define an energy. Without knowing that the spinors in the Dirac equation describe superposition states of the type

ψ

or

ξ

, solving the puzzle of how we can replace the vector term

[B \cdot σ]

in the equation by a scalar

\pm B

and obtain a true energy term

\pm B μ

is just impossible because, for a pure state

χ

, we have

[B \cdot σ] χ \neq B χ

. Dirac “solved” the problem by brute force using the error described in Section 2.2. It can be hoped that this will convince the reader that this error cannot be hushed up or ignored. The mathematical truth must prevail and carrying out correctly the admittedly intricate algebra helps us in figuring out the physical truth.

4. More Traditional Formulation in Terms of a Differential Equation

In this section, we reformulate everything again in the more familiar differential calculus of standard textbook QM. In the calculations, we encounter tensor products of the type

χ_{s j} \otimes ψ_{s k}^{†}

. We know that they correspond to precession by construction. The tensor product

χ_{1} \otimes ψ_{2}^{†}

can be understood as a simultaneous description of the motions

e^{- ı Ω τ / 2} χ_{s 1}

and

e^{+ ı ω_{0} τ / 2} ψ_{s 2}

and in this sense describes the precession, but it does not contain the correct time dependence

χ_{s 1} \otimes ψ_{s 2}^{†} e^{ı (ω_{0} - Ω) τ / 2}

because

χ_{1} \otimes ψ_{2}^{†} = χ_{s 1} \otimes ψ_{s 2}^{†} e^{- ı (ω_{0} + Ω) τ / 2}

. If we accept the rule that we must replace

χ_{1} \otimes ψ_{2}^{†}

by

χ_{1} \otimes ψ_{2}

or

χ_{1} \otimes ψ_{2}^{⊤}

, we obtain a correct simultaneous description of

e^{- ı Ω τ / 2} χ_{s 1}

and

e^{+ ı ω_{0} τ / 2} ψ_{s 2}

. There are two such terms in Equation (19), and they are coming from the two beams we consider in Equation (12). The motion described by Equation (19) can be condensed into the form:

P (τ) = [\begin{matrix} cos (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} & e^{- ı ϕ} sin (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} \\ - e^{ı ϕ} sin (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} & cos (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} \end{matrix}], with det (P (τ)) = 1 .

(21)

This equation describes a mixture of two states that occur when we are using a mixed beam. This energy state is thus a set that contains both mixed states. As easily seen,

(\forall τ \in ℝ) (P (τ) \in SU(2))

, such that this set is in some way interpretable as a rotation. We can compare this with the situation in projective geometry, where we can define a straight line as a set of all points which are incident with the line, but we can also define a point as a set of all lines which are incident with the point. A rotation can thus also be seen as a set, and other geometrical objects as well. For example, the quantities

𝟙 + s \cdot σ

and

𝟙 - s \cdot σ

are the eigenvectors of the reflection operator

s \cdot σ

and correspond to the sets

{𝟙, s \cdot σ}

and

{𝟙, - s \cdot σ}

, respectively. We can thus consider the algebra in Equation (21) as the construction of a mixed state (a set of states) with a constant energy that can be interpreted as a rotation. With the rotation in Equation (21), we are now again on more familiar geometrical grounds. We know how to analyze such a matrix and we apply the spinor formalism on it. Derivation with respect to

τ

yields:

\frac{d}{d τ} P (τ) = - ı ((ω_{0} + Ω) / 2) [\begin{matrix} cos (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} & e^{- ı ϕ} sin (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} \\ + e^{ı ϕ} sin (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} & - cos (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} \end{matrix}] .

(22)

The inverse matrix of

P (τ)

is:

{[P (τ)]}^{- 1} = [\begin{matrix} cos (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} & - e^{- ı ϕ} sin (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} \\ e^{ı ϕ} sin (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} & cos (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} \end{matrix}] .

(23)

Hence,

[\frac{d}{d τ} P (τ)] {[P (τ)]}^{- 1}

is given by

- ı ((ω_{0} + Ω) / 2) V (τ)

where

V (τ)

is given by:

\begin{matrix} [\begin{matrix} cos (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} & e^{- ı ϕ} sin (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} \\ + e^{ı ϕ} sin (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} & - cos (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} \end{matrix}] \times \\ [\begin{matrix} cos (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} & - e^{- ı ϕ} sin (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} \\ e^{ı ϕ} sin (θ / 2) e^{+ ı (ω_{0} + Ω) τ / 2} & cos (θ / 2) e^{- ı (ω_{0} + Ω) τ / 2} \end{matrix}] = [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}] = [e_{z} \cdot σ] . \end{matrix}

(24)

We have thus:

\begin{matrix} \frac{d}{d τ} P (τ) = [\frac{d}{d τ} P (τ)] {[P (τ)]}^{- 1} [P (τ)] = - ı [(ω_{0} + Ω) / 2] V (τ) P (τ) = \\ - ı [(ω_{0} + Ω) / 2] [e_{z} \cdot σ] P (τ) . \end{matrix}

(25)

We could treat this geometrical object with a single energy within the scope of the Dirac equation if we introduced again a mixed state (as

ψ

instead of

χ

in Equation (9) within Section 1.2). Indeed, this mixed state again yields a fixed energy

ℏ (ω_{0} + Ω) / 2

, i.e., we obtain

\frac{d}{d τ} ψ = - ı ((ω_{0} + Ω) / 2) ψ

when we use the traditional energy operator on it. A legitimate question is then if nature will provide the additional components that must enter the mixture. However, it is not at all our purpose here to bring the calculations back into the scope of the Dirac equation and its energy operator. There is no reason this energy operator should be valid within the context of precession. What interests us here is not calculating the energy which we know already. It is the fact that the set

P (τ)

can be interpreted as a rotation around the z-axis when the original beam is mixed.
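The chain of Equations (21)–(25) can be checked numerically. The sketch below builds P(τ), tests that it belongs to SU(2) and compares a finite-difference derivative with the right-hand side of Equation (25). The function name, the step size and the parameter values are illustrative assumptions.

```python
import numpy as np

SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)  # the matrix e_z.sigma


def precession_up(tau, theta, phi, omega0, Omega):
    """Matrix P(tau) of Equation (21)."""
    a = (omega0 + Omega) / 2
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [c * np.exp(-1j * a * tau), np.exp(-1j * phi) * s * np.exp(-1j * a * tau)],
        [-np.exp(1j * phi) * s * np.exp(+1j * a * tau), c * np.exp(+1j * a * tau)],
    ])


theta, phi, omega0, Omega, tau = 1.1, 0.4, 5.0, 0.7, 0.3  # arbitrary illustrative values
step = 1e-6  # finite-difference step

P = precession_up(tau, theta, phi, omega0, Omega)

# Membership of SU(2): unitary with determinant one.
unitarity_error = np.max(np.abs(P @ P.conj().T - np.eye(2)))
determinant_error = abs(np.linalg.det(P) - 1)

# Equation (25): dP/dtau = -i [(omega0 + Omega)/2] [e_z.sigma] P.
derivative = (precession_up(tau + step, theta, phi, omega0, Omega)
              - precession_up(tau - step, theta, phi, omega0, Omega)) / (2 * step)
right_hand_side = -1j * (omega0 + Omega) / 2 * SIGMA_Z @ P
equation_error = np.max(np.abs(derivative - right_hand_side))

print(unitarity_error, determinant_error, equation_error)
```

The last number is limited by the accuracy of the finite difference rather than by the algebra.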

The result is rather amazing, because we have obtained in Equation (25) the same type of differential equation as

\frac{d}{d τ} R (τ) = - ı (ω_{0} / 2) [e_{z} \cdot σ] R (τ)

for the Rodrigues formula expressing a simple spinning motion around the z-axis, although the form of

P (τ)

is different from the form of

R (τ)

because it is not a diagonal matrix, whereas the matrix

R (τ)

that describes a spinning motion around the z-axis is diagonal. With hindsight, we can see that we could have anticipated all this. The equations

\frac{d}{d τ} χ = - ı (ω / 2) [e_{z} \cdot σ] χ

or

\frac{d}{d τ} R = - ı (ω / 2) [e_{z} \cdot σ] R

describe any type of object that rotates with an angular frequency

ω

around the z-axis. In the usual approach, the object is a spinless electron that we rotate with a frequency

ω = ω_{0}

around the z-axis to give the electron its spin. In the new situation, the object is an electron which is already spinning with a frequency

ω_{0}

around an axis

s

, and we rotate this object bodily with a frequency

ω = Ω

around the z-axis, to describe the precession of the spinning electron within a magnetic field. That the new object is different from the initial one can be seen from the expression of the intervening matrix, which is different from the diagonal form we had before. This result shows that, whatever the level of complication in some hierarchy of precessions, we are always able to treat a fixed-energy component this way. We could have reached these conclusions also by observing that:

P (τ) = [\begin{matrix} e^{- ı (ω_{0} + Ω) τ / 2} & 0 \\ 0 & e^{+ ı (ω_{0} + Ω) τ / 2} \end{matrix}] [\begin{matrix} cos (θ / 2) & e^{- ı ϕ} sin (θ / 2) \\ - e^{ı ϕ} sin (θ / 2) & cos (θ / 2) \end{matrix}] .

(26)
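The factorization in Equation (26) can be confirmed with a few lines of code. In the sketch below, the helper names `z_rotation` and `initial_orientation` are our own illustrative labels for the two factors, and the parameter values are arbitrary.

```python
import numpy as np


def precession_up(tau, theta, phi, omega0, Omega):
    """Matrix P(tau) of Equation (21)."""
    a = (omega0 + Omega) / 2
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [c * np.exp(-1j * a * tau), np.exp(-1j * phi) * s * np.exp(-1j * a * tau)],
        [-np.exp(1j * phi) * s * np.exp(+1j * a * tau), c * np.exp(+1j * a * tau)],
    ])


def z_rotation(angle):
    """Diagonal SU(2) matrix of a rotation over `angle` around the z-axis."""
    return np.diag([np.exp(-1j * angle / 2), np.exp(+1j * angle / 2)])


def initial_orientation(theta, phi):
    """Time-independent right-hand factor of Equation (26), equal to P(0)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, np.exp(-1j * phi) * s],
                     [-np.exp(1j * phi) * s, c]])


theta, phi, omega0, Omega, tau = 0.7, 1.9, 4.0, 0.5, 0.8  # arbitrary illustrative values
factorised = z_rotation((omega0 + Omega) * tau) @ initial_orientation(theta, phi)
factorisation_error = np.max(np.abs(
    factorised - precession_up(tau, theta, phi, omega0, Omega)))
print(factorisation_error)
```

The whole time dependence sits in the diagonal factor, which is a rotation over the angle (ω_0 + Ω)τ around the z-axis.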

A surprising fact is that the whole energy is attributed to a rotation around the precession axis. However, this illustrates what is noted above, viz. that the energy is not a vector. We have an object that bodily rotates around the precession axis and its energy is

ℏ (ω_{0} + Ω) / 2

. The development for the equation of motion in Equation (20) is analogous. It can be condensed in the form:

M (τ) = [\begin{matrix} sin (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} & - e^{- ı ϕ} cos (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} \\ e^{ı ϕ} cos (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} & sin (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} \end{matrix}], det (M (τ)) = 1 .

(27)

Derivation yields:

\frac{d}{d τ} M (τ) = - ı ((ω_{0} - Ω) / 2) [\begin{matrix} - sin (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} & e^{- ı ϕ} cos (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} \\ e^{ı ϕ} cos (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} & sin (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} \end{matrix}] .

(28)

The inverse matrix of

M (τ)

is:

M^{- 1} (τ) = [\begin{matrix} sin (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} & e^{- ı ϕ} cos (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} \\ - e^{ı ϕ} cos (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} & sin (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} \end{matrix}] .

(29)

We can again write

[\frac{d}{d τ} M (τ)] [M^{- 1} (τ)] = - ı ((ω_{0} - Ω) / 2) W (τ)

, where the matrix

W (τ)

is now given by:

\begin{matrix} [\begin{matrix} - sin (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} & e^{- ı ϕ} cos (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} \\ e^{ı ϕ} cos (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} & sin (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} \end{matrix}] \times \\ [\begin{matrix} sin (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} & e^{- ı ϕ} cos (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} \\ - e^{ı ϕ} cos (θ / 2) e^{- ı (ω_{0} - Ω) τ / 2} & sin (θ / 2) e^{+ ı (ω_{0} - Ω) τ / 2} \end{matrix}] = [\begin{matrix} - 1 & 0 \\ 0 & + 1 \end{matrix}] . \end{matrix}

(30)

We have thus:

\frac{d}{d τ} M (τ) = - ı [(ω_{0} - Ω) / 2] [- e_{z} \cdot σ] M (τ) .

(31)

This is now the equation for a down state. The situation in the Equations (25) and (31) thus actually corresponds exactly to a physical picture of up and down states, but these states are different from what we have been told. It is no longer the same type of object, viz. the spin, that has its rotation axis aligned up or down. In the old context, we start from a spinless electron and make it spin around an axis, while, in the new context, we start from an already spinning electron whose axis is not aligned and we make the whole thing bodily spin around a precession axis. It is this precession axis which can now be up or down, not the spin axis. We should therefore have qualified the states as precession-up and precession-down rather than as spin-up and spin-down. Pauli [19] just introduced pragmatically the experimental result of the Stern–Gerlach experiment into the theory under the form of an ad hoc postulate, without any true justification. He replaced explaining by describing. The spin-up/spin-down narrative was so highly counter-intuitive that it could only provoke intense bewilderment, as described in Section 2. After almost a century, we have now the theoretical justification for Pauli’s ad hoc postulate, and we can appreciate that the directions in space are absolutely not “quantized”.
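Equations (27)–(31) for the precession-down state can be checked in the same numerical way as the precession-up case. The sketch below tests that M(τ) belongs to SU(2) and satisfies Equation (31). The function name, the step size and the parameter values are illustrative assumptions.

```python
import numpy as np

SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)  # the matrix e_z.sigma


def precession_down(tau, theta, phi, omega0, Omega):
    """Matrix M(tau) of Equation (27)."""
    b = (omega0 - Omega) / 2
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [s * np.exp(+1j * b * tau), -np.exp(-1j * phi) * c * np.exp(+1j * b * tau)],
        [np.exp(1j * phi) * c * np.exp(-1j * b * tau), s * np.exp(-1j * b * tau)],
    ])


theta, phi, omega0, Omega, tau = 1.3, 0.2, 6.0, 0.8, 0.25  # arbitrary illustrative values
step = 1e-6  # finite-difference step

M = precession_down(tau, theta, phi, omega0, Omega)

# Membership of SU(2): unitary with determinant one.
unitarity_error = np.max(np.abs(M @ M.conj().T - np.eye(2)))
determinant_error = abs(np.linalg.det(M) - 1)

# Equation (31): dM/dtau = -i [(omega0 - Omega)/2] [-e_z.sigma] M.
derivative = (precession_down(tau + step, theta, phi, omega0, Omega)
              - precession_down(tau - step, theta, phi, omega0, Omega)) / (2 * step)
right_hand_side = -1j * (omega0 - Omega) / 2 * (-SIGMA_Z) @ M
equation_error = np.max(np.abs(derivative - right_hand_side))

print(unitarity_error, determinant_error, equation_error)
```

The only differences with the precession-up case are the frequency ω_0 − Ω and the sign of the generator, i.e., the precession axis is reversed while the tilt angle θ of the spin axis remains arbitrary.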

This solution of a real conceptual difficulty perfectly illustrates the philosophy of our alternative approach to QM. We must obtain the same correct algebraic results, but we can change the corresponding geometrical explanation which must be clear, devoid of mysteries and contradictions and in agreement with the meaning of the spinors. This result further validates our alternative approach. The reason for the confusion within the traditional approach is that the geometrical meaning of the spinors was not understood. It remained hidden due to the fact that Dirac’s derivation has been based on the second algebra rather than the first one. Meanwhile, it often remains very hard to find an explanation for the algebraic results. It requires a lot of mathematical creativity and the mental pictures inherited from the traditional interpretation, which are deeply engraved in our minds, can really make it difficult to break away from them. They can also trigger fierce resistance to the new approach. We want a perfect mathematical system, made of a geometry, an algebra and a dictionary that translates one into the other. The interplay between the algebra and the geometry turns such a system into a very powerful method that allows gaining deep insight if we carry out the mathematics meticulously, as pointed out at the end of Section 3. Analytical Newtonian mechanics reaches this ideal to the point that it almost appears as a purely mathematical theory. With our spinor approach to the few sample cases selected, we seem to come close to this ideal as well.

5. The Pauli Exclusion Principle Remains Valid

Feynman [20] gave an intuitive explanation for the Pauli principle. However, he did not write down his idea in algebraic form, so that a detailed proof is lacking. In the French translation of [20], there is a footnote by Lévy-Leblond, which shows that the argument can lead to some confusion. Intuitively, when you exchange two electrons, each of them makes a turn over an angle of

π

. You may think that this will multiply their spinors by ı and therefore the tensor product of the two spinors by

- 1

. However, the moves involved in the exchange are, at least in appearance, taking place in space rather than inside the electron. They are of the position type such that they and the angle

ζ

which characterizes them (see below) should in principle not intervene in the argument, because the position coordinates do not belong to the set of parameters that define a spin state. The real exchange is thus not the swap of the positions but that of the spin states. However, these moves are accompanied by the rotation of the co-moving Fresnel frame, which is also characterized by

ζ

. This is a merry-go-round type of scenario. This rotational motion is of the spin type. In our development below, the phase

ζ

which intervenes is obtained by Lorentz transformation of the spin variable

ω_{0} τ

, and therefore really of the spin type.

Due to its historical context, one may suspect that the Pauli principle relies on the assumption that the spins can only be up and down, i.e., on parallelism. Now that we have discovered that the energy states must rather be characterized in terms of precession-up and precession-down, one may be concerned whether the Pauli principle remains valid. As the spins are no longer parallel, we might have just destroyed the Pauli principle. Certainly, there are still only two possible states for the energy, but there are now many more possible states of motion. The motion is no longer characterized by

Ω

but by

(Ω, θ)

. In fact, the spins no longer need to be parallel in order to belong to the same energy state. Could the change of paradigm cause the meltdown of the Pauli principle?

We show that the Pauli principle is not under fire, but let us first try to write Feynman’s argument algebraically (in the non-relativistic limit), rendering our proof open to a detailed scrutiny of the effects of the change. Let us take for the spin-up and spin-down functions the wave functions for non-relativistic electrons moving on a circle:

\begin{matrix} ψ_{↑} = [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - k ℓ) / 2]} = [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - ζ) / 2]}, \\ ψ_{↓} = [\begin{matrix} 0 \\ 1 \end{matrix}] e^{+ ı [(ω_{0} t - k ℓ) / 2]} = [\begin{matrix} 0 \\ 1 \end{matrix}] e^{+ ı [(ω_{0} t - ζ) / 2]} . \end{matrix}

(32)

The expressions in the exponentials come from integrating

\int ω_{0} d t - k \cdot d r = \int ω_{0} d t - k d ℓ

along the circle. The expression

ω_{0} d t - k \cdot d r

is the Lorentz invariant

ω d t - k \cdot d r = ω_{0} d τ

, whereby we drop the factor

γ \approx 1

in

ω = γ ω_{0}

in the non-relativistic limit. Here, ℓ is the curvilinear distance travelled along the circle and

k = 1 / r

. In fact, by noting

w = c^{2} / v

for the superluminal phase velocity w and putting

w = ω_{0} r

, we obtain

ω_{0} v d ℓ / c^{2} = ω_{0} d ℓ / w = d ℓ / r

, which must be

k d ℓ

. Therefore,

k = 1 / r

. The tangent vector

k

permits following the Thomas precession of the Fresnel basis on the merry-go-round, which embodies the true rigid-body rotation of the whole two-electron configuration. We also note

φ = ω_{0} t

for the spin angle, in contrast with

ϕ

which is the precession angle (and does not intervene here). In fact, the electron does not have to move. When we freeze time, we can still move around the circle geometrically. We introduce the angle

ζ

to specify the position of the electron on the circle. We have

ℓ = ζ r

, such that

k ℓ = ζ

. The angle

ζ

is related to k and we can understand the value of

ζ

also as the rotation angle of the co-moving Fresnel basis. Consider now two spin-up electrons at diametrically opposed positions on a circle of radius r, i.e., positioned at

r_{1} = r

,

r_{2} = - r

,

ζ_{1} = 0

,

ζ_{2} = π

. The phase difference

ζ_{2} = ζ_{1} + π

simply expresses the different positions on the circle.

We have then:

\begin{matrix} ψ_{1} \otimes ψ_{2} = [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - ζ_{1}) / 2]} \otimes [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - ζ_{2}) / 2]} = \\ [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [ω_{0} t / 2]} \otimes [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - π) / 2]} . \end{matrix}

(33)

The expressions are pure spin functions. We consider Equation (33) as the canonical situation. We treat other situations later on. An exchange of the two electrons can be obtained by a rotation over an angle of

π

around the centre of the circle. Under such a rotation

R

over

π

, we obtain

r_{1} \to r_{2}

,

r_{2} \to r_{1}

,

ζ_{j} \to ζ_{j} + π

.

\begin{matrix} R (ψ_{1} \otimes ψ_{2}) = [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - π) / 2]} \otimes [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - 2 π) / 2]} = \\ [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - π) / 2]} \otimes (- 1) [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [ω_{0} t / 2]} . \end{matrix}

(34)
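The sign in Equation (34) can be reproduced numerically. In the sketch below, the tensor product of two spinors is represented by `numpy.kron`, which is a representation choice of ours, and the value of the spin angle ω_0 t is arbitrary.

```python
import numpy as np


def spin_up(zeta, spin_angle):
    """Spin-up wave function of Equation (32) at position angle zeta, with spin_angle = omega_0 t."""
    return np.array([1.0, 0.0]) * np.exp(-1j * (spin_angle - zeta) / 2)


spin_angle = 0.37            # arbitrary value of omega_0 * t
zeta_1, zeta_2 = 0.0, np.pi  # diametrically opposed positions on the circle

psi_1 = spin_up(zeta_1, spin_angle)
psi_2 = spin_up(zeta_2, spin_angle)

# The rotation over pi shifts both position angles: zeta_j -> zeta_j + pi.
rotated_1 = spin_up(zeta_1 + np.pi, spin_angle)
rotated_2 = spin_up(zeta_2 + np.pi, spin_angle)

rotated_product = np.kron(rotated_1, rotated_2)  # left-hand side of Equation (34)
expected = -np.kron(psi_2, psi_1)                # minus psi_2 (x) psi_1
exchange_error = np.max(np.abs(rotated_product - expected))
print(exchange_error)
```

The first rotated spinor coincides with ψ_2 and the second one with −ψ_1, because the position angle enters the exponent as ζ/2.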

Hence, the rotation induces the substitutions

ψ_{1} \to ψ_{2}

,

ψ_{2} \to - ψ_{1}

and

ψ_{1} \otimes ψ_{2} \to - ψ_{2} \otimes ψ_{1}

. We can see that the cause for the minus sign is the fact that position angles

ζ

occur under the form

ζ / 2

in the spinor calculus. After this rotation

R

, the physical situation is indistinguishable from the situation before, because

R

transforms Electron 1 into Electron 2 and vice versa. This means that the wave function must be invariant under the rotation. This implies that

ψ_{1} \otimes ψ_{2}

cannot be the wave function

Ψ

. In fact,

R (Ψ)

would lead to

R (Ψ) = R (ψ_{1} \otimes ψ_{2}) = ψ_{2} \otimes ψ_{1}

, where we express the exchange

ψ_{1} \leftrightarrow ψ_{2}

. However, we have also calculated in Equation (34) that

R (Ψ) = - ψ_{2} \otimes ψ_{1}

. This leads to

R (Ψ) = - R (Ψ)

, such that

R (Ψ) = 0

and

Ψ = 0

. Similarly, if we take:

Ψ = \underbrace{ψ_{1} \otimes ψ_{2}}_{p_{1}} + \underbrace{ψ_{2} \otimes ψ_{1}}_{p_{2}},

(35)

then we obtain also

R (Ψ) = - Ψ

because

R

transforms

p_{1}

into

- p_{2}

and

p_{2}

into

- p_{1}

, while we also have

R (Ψ) = Ψ

because

R

is an exchange. It follows then again that

Ψ = 0

. However, if we rather take:

Ψ = \underbrace{ψ_{1} \otimes ψ_{2}}_{p_{1}} \underbrace{- ψ_{2} \otimes ψ_{1}}_{p_{2}},

(36)

we obtain

R (Ψ) = Ψ

because now

R

transforms

p_{1}

into

p_{2}

and

p_{2}

into

p_{1}

. This is now consistent with the fact that

R

is an exchange. Hence,

Ψ

in Equation (36) is a wave function that takes into account the exchange correctly. The wave function has to be antisymmetric. The configuration of two electrons with parallel spins in the same place can be obtained by considering the special case

r = 0

. When the spins are parallel, we have then

ψ_{1} = - ψ_{2}

and

Ψ = 0

. We are thus obliged to take the spins antiparallel if we want the two electrons to occupy the same place. This is the Pauli exclusion principle for spin-up and spin-down states.

Let us now investigate what this becomes with the new paradigm of precession-up and precession-down states. We can consider this as the non-canonical counterpart of the canonical state described above. We start from

\exists (R_{1}, R_{2}) :

\begin{matrix} χ_{1} = [\begin{matrix} ξ_{0} \\ ξ_{1} \end{matrix}] e^{- ı [(ω_{0} t - ζ_{1}) / 2]} = R_{1} [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı [(ω_{0} t - ζ_{1}) / 2]}, \\ χ_{2} = [\begin{matrix} η_{0} \\ η_{1} \end{matrix}] e^{- ı ((ω_{0} t - ζ_{2}) / 2)} = R_{2} [\begin{matrix} 1 \\ 0 \end{matrix}] e^{- ı ((ω_{0} t - ζ_{2}) / 2)} . \end{matrix}

(37)

We are thus considering the rotations

R_{1}

and

R_{2}

that relate the wave functions

χ_{j} = R_{j} (ψ_{j})

to the wave functions

ψ_{j}

of the canonical configuration. We have again

ζ_{2} = ζ_{1} + π

, where we can take

ζ_{1} = 0

. Then,

ψ_{1} = R_{1}^{- 1} χ_{1}, ψ_{2} = R_{2}^{- 1} χ_{2} .

(38)

The two exponentials still exhibit a phase difference

π

leading to a factor

- 1

such that

ψ_{1} \to ψ_{2}, ψ_{2} \to - ψ_{1}, and therefore : R_{1}^{- 1} χ_{1} \to R_{2}^{- 1} χ_{2}, R_{2}^{- 1} χ_{2} \to - R_{1}^{- 1} χ_{1},

(39)

or

χ_{1} \to R_{1} R_{2}^{- 1} χ_{2}, R_{1} R_{2}^{- 1} χ_{2} \to - χ_{1} .

(40)

In other words,

\exists R = R_{2} R_{1}^{- 1} : χ_{2} = R χ_{1} & χ_{1} = - R^{- 1} χ_{2}

. Combining these two identities leads to

χ_{1} = - χ_{1}

. Therefore, the wave function must still be antisymmetrical. Hence, Pauli’s principle remains valid even when the two spins are not parallel.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. The Missing Link: the Geometrical Meaning of the Spin-Up and Spin-Down States

Let us note the up and down spinors as:

χ_{↑} = [\begin{matrix} 1 \\ 0 \end{matrix}], χ_{↓} = [\begin{matrix} 0 \\ 1 \end{matrix}] .

(A1)

The tensor products

χ_{↑} \otimes χ_{↑}^{†}

,

χ_{↓} \otimes χ_{↑}^{†}

,

χ_{↑} \otimes χ_{↓}^{†}

and

χ_{↓} \otimes χ_{↓}^{†}

are a basis for the vector space M

(2, C)

of complex

2 \times 2

matrices. These four basis vectors are

2 \times 2

matrices, each of which has only one non-zero entry, which is equal to 1. The manifold

A (G, G) \subset F (G, G)

of group automorphisms of the rotation group G, which is isomorphic to the rotation group G, is embedded in M

(2, C)

. The vector space of linear mappings M

(2, C)

thus contains the representation of the group G by automorphisms. When we extend G to a larger group, such as to contain also reflections and reversals, it is also embedded in M

(2, C)

. However, M

(2, C)

can also account for more complicated motions such as precession, nutation, etc. The four basis vectors of M

(2, C)

,

χ_{α} \otimes χ_{β}^{†}

, with

(α, β) \in {↑, ↓}^{2}

, transform

β

into

α

. The matrices

\frac{1}{2} (𝟙 \pm e_{z} \cdot σ)

are then the basis vectors

χ_{↑} \otimes χ_{↑}^{†}

and

χ_{↓} \otimes χ_{↓}^{†}

. In fact,

\frac{1}{2} (𝟙 + e_{z} \cdot σ) = χ_{↑} \otimes χ_{↑}^{†}

transforms then

χ_{↑}

into

χ_{↑}

, while

\frac{1}{2} (𝟙 - e_{z} \cdot σ) = χ_{↓} \otimes χ_{↓}^{†}

transforms then

χ_{↓}

into

χ_{↓}

. The spinning motion

[\begin{matrix} e^{- ı ω_{0} τ / 2} & 0 \\ 0 & e^{ı ω_{0} τ / 2} \end{matrix}],

(A2)

has two components in this basis because it transforms simultaneously

χ_{↑}

into

e^{- ı ω_{0} τ / 2} χ_{↑}

and

χ_{↓}

into

e^{+ ı ω_{0} τ / 2} χ_{↓}

. The reflection operator

[e_{x} \cdot σ]

transforms simultaneously

χ_{↑}

into

χ_{↓}

and

χ_{↓}

into

χ_{↑}

.
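
The algebraic statements of this paragraph can be verified with a few lines of NumPy. This is a sketch under assumptions rather than part of the argument: Equation (A2) is read as the diagonal matrix with entries e^{∓ ı ω_{0} τ / 2}, the value chosen for ω_{0} τ is arbitrary, and the helper `outer` is a name introduced here for χ_{α} \otimes χ_{β}^{†}.

```python
import numpy as np

up = np.array([[1], [0]], dtype=complex)    # χ↑ as a 2×1 column
down = np.array([[0], [1]], dtype=complex)  # χ↓ as a 2×1 column
identity = np.eye(2, dtype=complex)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)


def outer(a, b):
    """Tensor product a ⊗ b† of two column spinors, i.e. a 2×2 matrix."""
    return a @ b.conj().T


spinors = (up, down)
# Each basis matrix χ_α ⊗ χ_β† has exactly one non-zero entry, equal to 1.
single_entry_ok = all(
    np.count_nonzero(outer(a, b)) == 1 and np.isclose(outer(a, b).sum(), 1)
    for a in spinors
    for b in spinors
)
# χ_α ⊗ χ_β† maps χ_β onto χ_α.
maps_beta_to_alpha_ok = all(
    np.allclose(outer(a, b) @ b, a) for a in spinors for b in spinors
)
# (1 ± σ_z)/2 coincide with χ↑ ⊗ χ↑† and χ↓ ⊗ χ↓†.
projectors_ok = np.allclose((identity + sigma_z) / 2, outer(up, up)) and np.allclose(
    (identity - sigma_z) / 2, outer(down, down)
)

# Assumed reading of Equation (A2): a diagonal spinning matrix.
phase = 0.9  # illustrative value of ω0 τ
spin = np.diag([np.exp(-1j * phase / 2), np.exp(1j * phase / 2)])
spinning_ok = np.allclose(spin @ up, np.exp(-1j * phase / 2) * up) and np.allclose(
    spin @ down, np.exp(1j * phase / 2) * down
)
# The reflection operator σ_x swaps χ↑ and χ↓.
reflection_ok = np.allclose(sigma_x @ up, down) and np.allclose(sigma_x @ down, up)

print(single_entry_ok, maps_beta_to_alpha_ok, projectors_ok, spinning_ok, reflection_ok)
```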

Using the fact that a spinor is the first column of a rotation matrix,

χ_{↑}

can be considered as representing

𝟙

and thus the identity element. Following the same logic,

χ_{↓}

could be considered as representing the reflection

σ_{x}

. The spinor

χ_{↑}

would then correspond to a right-handed reference frame with triad

(e_{x}, e_{y}, e_{z})

. The spinor

χ_{↓}

would correspond to a left-handed reference frame with triad

(- e_{x}, e_{y}, e_{z})

. Applying the spinning motion around the z-axis, with the representation matrix given by Equation (A2), to

χ_{↑}

and

χ_{↓}

, would then yield the spinor

χ_{↑} e^{- ı ω_{0} τ / 2}

and the conjugated spinor

χ_{↓} e^{ı ω_{0} τ / 2}

. The conjugated spinors correspond thus to reversals.

For a

2 \times 2

matrix

M

operating on the vector space

R^{2}

with basis vectors

e_{1}

and

e_{2}

, the first column of

M

corresponds to

M (e_{1})

, and the second column to

M (e_{2})

. The spinors of SU(2) do not constitute a vector space but a curved manifold, such that what is written for the matrix

M

cannot literally apply. We see now that the first column of a rotation matrix

R

of SU(2) corresponds to

R (χ_{↑})

, where

χ_{↑}

represents

𝟙

or the canonical right-handed reference frame, while the second column corresponds to

R (χ_{↓})

, where

χ_{↓}

represents

σ_{x}

or the canonical left-handed reference frame.

To justify this further, we show that

χ_{↓}

cannot be identified with the first column of a simple rotation matrix, such that the set of spinors

R (χ_{↑})

and the set of conjugated spinors

R (χ_{↓})

are disjoint. The general expression for a rotation by an angle

φ = ω_{0} τ

around the axis

s

with spherical coordinates

(θ, ϕ)

is according to the Rodrigues formula:

[\begin{matrix} cos (φ / 2) - ı cos θ sin (φ / 2) & - ı sin (θ) e^{- ı ϕ} sin (φ / 2) \\ - ı sin (θ) e^{ı ϕ} sin (φ / 2) & cos (φ / 2) + ı cos θ sin (φ / 2) \end{matrix}] .

(A3)

Therefore, obtaining

χ_{↓}

as the first column of this rotation matrix would require

cos (φ / 2) = 0

, implying

sin (φ / 2) = 1

. This would then further require

cos θ = 0

, such that

sin θ = 1

. All these conditions are necessary just to make sure that the first entry of the spinor is zero. This leaves us with:

[\begin{matrix} 0 & - ı e^{- ı ϕ} \\ - ı e^{ı ϕ} & 0 \end{matrix}] .

(A4)

Because

φ / 2

must have the fixed value

π / 2

, we cannot have dynamical spinning motion associated with

χ_{↓}

. Let us now check what follows from the condition that the second entry of the spinor must be 1. We must then have

- ı e^{ı ϕ} = 1

, such that

ϕ = π / 2

. We thus have

(θ, ϕ) = (π / 2, π / 2)

and

φ / 2 = ω_{0} τ / 2 = π / 2

. An illicit out-of-the-box solution would be

ϕ = ω τ + π / 2

. We would obtain then the “spinor”

e^{ı ω τ} χ_{↓}

. This pseudo-solution would then represent the rotation of a non-spinning electron whose rotation axis would be in the

O x y

plane and precessing around the z-axis with an angular frequency

ω

. The net result would be similar to the description of an electron spinning around the z-axis. However, this is a cheat because it transgresses the domain of the original definitions, and we can represent such a motion already by means of

χ_{↑}

. Hence, the two sets of “spinors” generated by the rotation group by operating on

χ_{↑} e^{- ı ω_{0} τ / 2}

and

χ_{↓} e^{ı ω_{0} τ / 2}

are physically disjoint. The quantity

χ_{↓} e^{ı ω_{0} τ / 2}

is not a spinor that corresponds to a spinning motion.
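
The parameter values derived above can be checked by coding the matrix of Equation (A3) directly. The sketch below is only an illustration: the function name `rodrigues_su2`, the generic test angles and the size of the perturbation are assumptions introduced here.

```python
import numpy as np


def rodrigues_su2(rot_angle, theta, phi):
    """Matrix of Equation (A3): rotation by rot_angle around the axis (θ, ϕ)."""
    c, s = np.cos(rot_angle / 2), np.sin(rot_angle / 2)
    return np.array(
        [
            [c - 1j * np.cos(theta) * s, -1j * np.sin(theta) * np.exp(-1j * phi) * s],
            [-1j * np.sin(theta) * np.exp(1j * phi) * s, c + 1j * np.cos(theta) * s],
        ]
    )


down = np.array([0, 1], dtype=complex)  # χ↓

# A generic choice of angles gives a unitary matrix with unit determinant.
generic = rodrigues_su2(0.7, 1.2, 2.5)
su2_ok = np.allclose(generic.conj().T @ generic, np.eye(2)) and np.isclose(
    np.linalg.det(generic), 1
)

# Parameter choice singled out in the text: φ = π and (θ, ϕ) = (π/2, π/2).
special = rodrigues_su2(np.pi, np.pi / 2, np.pi / 2)
first_column_is_down = np.allclose(special[:, 0], down)

# Moving the rotation angle away from π makes the first entry non-zero again.
perturbed = rodrigues_su2(np.pi + 0.1, np.pi / 2, np.pi / 2)
perturbed_first_entry = abs(perturbed[0, 0])

print(su2_ok, first_column_is_down, perturbed_first_entry > 1e-3)
```

The last check illustrates why no dynamical spinning motion can be attached to χ_{↓} within this parametrization: as soon as φ departs from π, the first column is no longer proportional to χ_{↓}.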

We can therefore adopt without ambiguity the convention that

χ_{↓} e^{ı ω_{0} τ / 2}

are reversals, which are rotations of left-handed frames. Note that in a left-handed frame

a \land b

is now defined according to the left-hand rule, such that

ω \mapsto - ω

if we stick to the right-hand rule.

We must now discuss a possible confusion. Let us compare Equation (A2) with Equation (12). We see that the part

\frac{1}{2} [𝟙 + s \cdot σ] e^{- ı ω_{0} τ / 2}

in Equation (12) is up to normalization just equal to

[𝟙 + s \cdot σ] R

. One could thus argue that it corresponds to the superposition state

ψ

defined in Equation (9), which permits us to write

[s \cdot σ] ψ = ψ

. The other part

\frac{1}{2} [𝟙 - s \cdot σ] e^{ı ω_{0} τ / 2}

in Equation (12) is then up to normalization equal to

[𝟙 - s \cdot σ] R

. It corresponds then to a superposition state

ξ

defined as:

ξ = χ - [s \cdot σ] χ \Rightarrow [s \cdot σ] ξ = - ξ \Rightarrow \frac{d ξ}{d τ} = + ı (ω_{0} / 2) ξ .

(A5)

For the special case of spinning motion around the z-axis in Equation (A2), the first column of the matrix in Equation (A2) thus also corresponds to

\frac{1}{2} (𝟙 + e_{z} \cdot σ) e^{- ı ω_{0} τ / 2}

such that it seems as though

χ_{↑}

must correspond to

\frac{1}{2} (𝟙 + e_{z} \cdot σ)

. Similarly, it seems as though

χ_{↓}

must correspond to

\frac{1}{2} (𝟙 - e_{z} \cdot σ)

. This is a different interpretation of the spin-up and spin-down spinors from the one we established above. Which one of the two interpretations is right?

This confusion is related to a number of coincidences that occur when

s = e_{z}

. These are the same coincidences which permit us to write

[s \cdot σ] χ = χ

in the main text just before Equation (5). In fact, both the unit matrix and

[e_{z} \cdot σ]

are in the special case

s = e_{z}

algebraically equal to

χ_{↑}

. However, this is a coincidence and not general, such that we cannot attribute geometrical meaning to it. It is so to say frame-dependent. We can decide very quickly which one of the two interpretations is right by remembering that

𝟙 + s \cdot σ = 2 χ \otimes χ^{†}

. It thus transforms as a vector under rotations

R

, i.e.,

𝟙 + s \cdot σ = 2 χ \otimes χ^{†} \to R (𝟙 + s \cdot σ) R^{†} = R (2 χ \otimes χ^{†}) R^{†}

, while the spinors transform as group elements

χ \to R χ

. The quantity

\frac{1}{2} (𝟙 + e_{z} \cdot σ)

is thus the

2 \times 2

matrix

χ_{↑} \otimes χ_{↑}^{†}

and we cannot identify it with the

2 \times 1

matrix

χ_{↑}

. Other differences are also immediately visible. A general spinor

R χ_{↑}

can contain the two frequencies

\pm ω_{0}

and remains always a

2 \times 1

single-column matrix. On the other hand,

(𝟙 + s \cdot σ) e^{- ı ω_{0} τ / 2}

will under transformation continue to contain the single frequency

- ω_{0}

, and its non-zero entries will in general be spread over two columns.

Let us now call

Q

the rotation around the axis parallel to

e_{z} \land s

that rotates

e_{z}

to

s

(this is actually the second matrix in Equation (26)). Under this rotation, vectors are transformed “quadratically” according to:

[s \cdot σ] = Q [e_{z} \cdot σ] Q^{†}

. This transforms

\frac{1}{2} (𝟙 - e_{z} \cdot σ)

into

\frac{1}{2} (𝟙 - s \cdot σ)

and

\frac{1}{2} (𝟙 + e_{z} \cdot σ)

into

\frac{1}{2} (𝟙 + s \cdot σ)

. The operators

\frac{1}{2} (𝟙 \pm s \cdot σ)

play thus locally the same role for rotations around

s

as

\frac{1}{2} (𝟙 \pm e_{z} \cdot σ)

for rotations around

e_{z}

.
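
The "quadratic" transformation law can be illustrated numerically. The sketch below assumes that Q has the form cos (θ / 2) 𝟙 - ı sin (θ / 2) [n \cdot σ] of Equation (A3), with n the unit vector along e_{z} \land s and θ the angle between e_{z} and s; the helper names `pauli_dot` and `rotation` and the numerical angles are illustrative assumptions.

```python
import numpy as np

IDENTITY = np.eye(2, dtype=complex)
SIGMA = np.array(
    [
        [[0, 1], [1, 0]],
        [[0, -1j], [1j, 0]],
        [[1, 0], [0, -1]],
    ],
    dtype=complex,
)


def pauli_dot(v):
    """Matrix v·σ for a real 3-vector v."""
    return np.tensordot(v, SIGMA, axes=1)


def rotation(axis, angle):
    """SU(2) matrix cos(angle/2) 1 - i sin(angle/2) axis·σ, as in Equation (A3)."""
    return np.cos(angle / 2) * IDENTITY - 1j * np.sin(angle / 2) * pauli_dot(axis)


theta, phi = 0.8, 2.1  # illustrative spherical angles of s
s = np.array(
    [np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)]
)
e_z = np.array([0.0, 0.0, 1.0])

axis = np.cross(e_z, s)        # direction of e_z ∧ s
axis /= np.linalg.norm(axis)   # normalized rotation axis
Q = rotation(axis, theta)      # rotates e_z to s

# Vectors transform quadratically: Q [e_z·σ] Q† = [s·σ].
vector_ok = np.allclose(Q @ pauli_dot(e_z) @ Q.conj().T, pauli_dot(s))
# The operators (1 ± e_z·σ)/2 are carried into (1 ± s·σ)/2.
projectors_ok = all(
    np.allclose(
        Q @ ((IDENTITY + sign * pauli_dot(e_z)) / 2) @ Q.conj().T,
        (IDENTITY + sign * pauli_dot(s)) / 2,
    )
    for sign in (+1, -1)
)
print(vector_ok, projectors_ok)
```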

The mixed states

ψ

and

ξ

can also be encountered in Pauli’s theory for the spin, but it has never been realized that they were mixed states. In fact, the matrices

[s \cdot σ]

are reflection matrices. This is something one may not expect from the physical intuition behind the concept of spin. Their non-normalized eigenvectors are

{[1 + s_{z}, s_{x} + ı s_{y}]}^{⊤}

for the eigenvalue

λ = 1

and

{[1 - s_{z}, - s_{x} - ı s_{y}]}^{⊤}

for the eigenvalue

λ = - 1

, clearly revealing their relation with the mixed states

\frac{1}{2} (𝟙 + s \cdot σ)

and

\frac{1}{2} (𝟙 - s \cdot σ)

. It can be seen even more clearly by constructing the eigenvectors as sets:

[s \cdot σ] {𝟙, s \cdot σ} = {𝟙, s \cdot σ}

and

[s \cdot σ] {𝟙, - s \cdot σ} = - {𝟙, - s \cdot σ}

. This shows that it is not correct to interpret the up and down spinors as eigenvectors of the Pauli matrices. We can only operate with Pauli matrices on the up and down states, with the effect to transform a rotation into a reversal and vice versa. This is of course a rather subtle issue. Again, the confusion is due to the coincidence which occurs when

s = e_{z}

, as discussed above.
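
The eigenvector statements above are easy to confirm numerically. The following sketch uses arbitrary illustrative angles for the unit vector s; the variable names are introduced here only for the check.

```python
import numpy as np

theta, phi = 1.1, 0.4  # illustrative spherical angles of the unit vector s
sx = np.sin(theta) * np.cos(phi)
sy = np.sin(theta) * np.sin(phi)
sz = np.cos(theta)

identity = np.eye(2, dtype=complex)
s_dot_sigma = np.array([[sz, sx - 1j * sy], [sx + 1j * sy, -sz]])

v_plus = np.array([1 + sz, sx + 1j * sy])     # claimed eigenvector for λ = +1
v_minus = np.array([1 - sz, -sx - 1j * sy])   # claimed eigenvector for λ = -1

eigen_plus_ok = np.allclose(s_dot_sigma @ v_plus, v_plus)
eigen_minus_ok = np.allclose(s_dot_sigma @ v_minus, -v_minus)

# The eigenvectors are the first columns of 1 + s·σ and 1 - s·σ.
columns_ok = np.allclose((identity + s_dot_sigma)[:, 0], v_plus) and np.allclose(
    (identity - s_dot_sigma)[:, 0], v_minus
)
# [s·σ] squares to the unit matrix, as expected for a reflection matrix.
reflection_ok = np.allclose(s_dot_sigma @ s_dot_sigma, identity)

print(eigen_plus_ok, eigen_minus_ok, columns_ok, reflection_ok)
```

The third check makes the relation with the matrices 𝟙 \pm s \cdot σ explicit: the eigenvectors are simply their first columns.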

We may note finally that the basis vectors

χ_{α} \otimes χ_{β}^{†}

, with

(α, β) \in {↑, ↓}^{2}

, of M

(2, C)

acquire a second meaning within the multivector formalism of the Clifford algebra discussed in [2], i.e., what we call the second algebra in the present article. Here, two of the basis vectors are isotropic vectors, which can be considered as representing oriented planes and defining complete triads. These interpretations are of no use here because our spinors

χ_{↑}

and

χ_{↓}

must represent states of spinning motion, such that we need the interpretation of the vector space M

(2, C)

in terms of rotations, reflections and reversals rather than multivectors.

Appendix B. The Dirac Equation Does not Describe a Single Electron but a Superposition State that Must Be Interpreted as Corresponding to a Statistical Ensemble

We can imagine that the electron in Equation (5) is at rest at position

(x_{0}, y_{0}, z_{0}) \in R^{3}

. Up to now, we have defined a spinor function

ψ \in F (R, C^{2})

whose temporal behavior

ψ (τ)

is a wave. From now on, we consider this as a function

ψ_{(x_{0}, y_{0}, z_{0})} \in F (S_{(x_{0}, y_{0}, z_{0})} \times R, C^{2})

with space-time definition domain

S_{(x_{0}, y_{0}, z_{0})} \times R

, where the space part

S_{(x_{0}, y_{0}, z_{0})}

is the one-element set

S_{(x_{0}, y_{0}, z_{0})} = {(x_{0}, y_{0}, z_{0})}

and the Cartesian product with

R

adds the time parameter

τ

. The partial derivatives

\frac{\partial}{\partial x}

,

\frac{\partial}{\partial y}

and

\frac{\partial}{\partial z}

of this function

ψ_{(x_{0}, y_{0}, z_{0})}

are not defined, such that the identity in Equation (8) cannot be applied to

ψ_{(x_{0}, y_{0}, z_{0})}

.

It does not take a brilliant quantum mechanic to repair the situation. Obviously, we must generalize our equation for the mixed state

ψ_{(x_{0}, y_{0}, z_{0})}

to an equation for a spinor wave function

Ψ \in F (R^{4}, C^{4})

. This is a wave function for a statistical ensemble of electrons in a yet broader sense than the mixed state

ψ_{(x_{0}, y_{0}, z_{0})}

. The ensemble accounts for not only various rotational states but also for all possible positions

(x_{0}, y_{0}, z_{0}) \in R^{3}

of the electron. We discussed this in [6], but we think the following explanation is tidier. If we want to describe the spinning motion of an electron at rest also at another point

(x_{1}, y_{1}, z_{1}) \in R^{3}

, we can consider the function

ψ_{(x_{1}, y_{1}, z_{1})} \in F (S_{(x_{1}, y_{1}, z_{1})} \times R, C^{2})

. Describing the two possible electron positions simultaneously requires introducing a superposition state. This superposition state corresponds to the statistical ensemble of the electrons whose positions can with equal probability be one of the two members of the set

{(x_{0}, y_{0}, z_{0}), (x_{1}, y_{1}, z_{1})}

. As the wave function we use in QM is a plane wave defined at all

(x, y, z) \in R^{3}

, we must thus define a superposition state that incorporates the uncountable number of electrons at all

(x, y, z) \in R^{3}

. However, with the definitions adopted

ψ_{(x_{0}, y_{0}, z_{0})}

and

ψ_{(x_{1}, y_{1}, z_{1})}

cannot be added because they have different definition domains. To render the summing of the wave functions possible, we may introduce extensions

ϕ_{(x_{0}, y_{0}, z_{0})} \in F (R^{4}, C^{2})

which we could define by

ϕ_{(x_{0}, y_{0}, z_{0})} (τ, x, y, z) = ψ (τ) δ (x - x_{0}) δ (y - y_{0}) δ (z - z_{0})

. Here, the function

δ

is not Dirac’s “delta function” because Dirac “delta functions” are mathematical nonsense and do not exist. The function

δ

is here rather defined by:

(δ (0) = 1) & (\forall x \neq 0) (δ (x) = 0)

. Hence, the weight we put in

x = 0

is 1. Furthermore,

\int_{R} δ (x) d x = 0

rather than

\int_{R} δ (x) d x = 1

as in Dirac’s delta which he wrongly thought he could satisfy by stipulating

δ (0) = \infty

. By defining

δ (0) = 1

, we avoid the use of singular Dirac measures (with “infinite weight”), which is important for what we are going to do afterwards. This definition permits us to add up

ϕ_{(x_{0}, y_{0}, z_{0})}

and

ϕ_{(x_{1}, y_{1}, z_{1})}

to define the mixed state. By dropping the deltas we obtain a true function

Ψ

defined by

Ψ (τ, x, y, z) = ψ (τ)

,

\forall (τ, x, y, z) \in

R^{4}

, which is the wave function we use for an electron at rest. We talk about it in terms of a wave function for a single electron at rest but it involves considering an infinite statistical ensemble of electrons at rest, whereby the electrons can now be anywhere in

R^{3}

with equal probability. We can consider this function intuitively as a symbolic sum of spinors:

Ψ = \sum_{(x_{0}, y_{0}, z_{0}) \in R^{3}} ϕ_{(x_{0}, y_{0}, z_{0})} or : Ψ (τ, x, y, z) = \sum_{(x_{0}, y_{0}, z_{0}) \in R^{3}} ψ (τ) δ (x - x_{0}) δ (y - y_{0}) δ (z - z_{0}) .

(A6)

which confirms the idea that

Ψ

is a superposition state and therefore corresponds to a statistical ensemble. In reality, such sums over a non-countable set are a priori not defined, although a physicist might consider this remark as esoteric mathematical faultfinding because it seems obvious what it means in this special case. We can formulate the idea completely rigorously by falling back again onto sets, because the sums were introduced in order to represent sets in the first place. We can define the function

Ψ

and its definition domain

S

according to:

Ψ = ⋃_{(x_{0}, y_{0}, z_{0}) \in R^{3}} ψ_{(x_{0}, y_{0}, z_{0})}, S = ⋃_{(x_{0}, y_{0}, z_{0}) \in R^{3}} S_{(x_{0}, y_{0}, z_{0})} \times R = R^{4},

(A7)

which confirms the status of

Ψ \in F (R^{4}, C^{2})

as a superposition state equally well. This may look very arcane and intimidating but it is just based on the idea that a function

f \in F (A, B)

is nothing else than a set of couples

(x, f (x)) \in A \times B

. This is much more rigorous than the tentative approach of the pseudo-equation (A6). We see that, by twice using superposition states, we have transformed the deterministic equation for a spinning electron in SU(2) into a probabilistic wave equation over

R^{4}

that can be lifted to the Dirac representation of the homogeneous Lorentz group. Both interventions we need to keep on track correspond to introductions of superposition states. This highlights that the wave function must really be interpreted statistically as proposed by Ballentine. The final superposition state has been obtained by considering a non-countable infinity of electrons. These are all the electrons we would need to measure one by one in order to obtain the perfect experimental statistics described by the wave function

Ψ

. By Equation (A7),

Ψ

is now well-defined as a superposition state, even if it still has a normalization problem, because the integral

\int_{R^{3}} Ψ^{†} Ψ d r

diverges. If we had used Dirac measures, the mathematical normalization problems would have become far worse. The resulting superposition state corresponds to all possible meaningful histories for single electrons at rest in Ballentine’s interpretation of QM. One can ask here why we give all these electrons the same phase. This question is never asked in the traditional approach, but it is answered in the Appendix of [5]. For the state

Ψ

, the partial derivatives

\frac{\partial}{\partial x}

,

\frac{\partial}{\partial y}

and

\frac{\partial}{\partial z}

are now well-defined operations. Working on

Ψ

, they yield 0, which was not true for

ϕ_{(x_{0}, y_{0}, z_{0})}

or

ψ_{(x_{0}, y_{0}, z_{0})}

. Hence, in the electron’s rest frame, one can after lifting

Ψ

to the Dirac representation apply the identity Equation (8) to the state

Ψ

to obtain the rigorous identity:

\frac{1}{c} \frac{d}{d τ} γ_{t} Ψ \equiv [\frac{1}{c} \frac{\partial}{\partial τ} γ_{t} - \nabla \cdot γ] Ψ .

(A8)

References

1. Coddens, G. From Spinors to Quantum Mechanics; Imperial College Press: London, UK, 2015.
2. Coddens, G. Spinors for Everyone. Available online: https://hal.archives-ouvertes.fr/cea-01572342v1 (accessed on 29 August 2020).
3. Cartan, E. The Theory of Spinors; Dover: New York, NY, USA, 1981.
4. Ballentine, L.E. Quantum Mechanics, A Modern Development, 2nd ed.; World Scientific: Singapore, 1998.
5. Coddens, G. A Linearly Polarized Electromagnetic Wave as a Swarm of Photons Half of Which Have Spin −1 and Half of Which Have Spin +1. Available online: https://hal.archives-ouvertes.fr/hal-02636464v3 (accessed on 29 August 2020).
6. Coddens, G. A Proposal to Get Some Common-Sense Intuition for the Paradox of the Double-Slit Experiment. Available online: https://hal.archives-ouvertes.fr/cea-01383609v5 (accessed on 2 June 2020).
7. Everett, H. “Relative State” Formulation of Quantum Mechanics. Rev. Mod. Phys. 1957, 29, 454–462.
8. Bohm, D. A Suggested Interpretation of Quantum Theory in Terms of “Hidden” Variables I. Phys. Rev. 1952, 85, 166–179.
9. Cramer, J. The transactional interpretation of quantum mechanics. Rev. Mod. Phys. 2009, 58, 795–798.
10. Coddens, G. A Solution of the Paradox of the Double-Slit Experiment. Available online: https://hal.archives-ouvertes.fr/cea-01459890v3 (accessed on 9 July 2020).
11. Hansen, A.; Ravndal, F. Klein’s Paradox and Its Resolution. Phys. Scripta 1981, 23, 1036.
12. Gerlach, W.; Stern, O. Der experimentelle Nachweis des magnetischen Moments des Silberatoms. Z. Physik 1921, 8, 110.
13. Villani, C. La théorie Synthétique de la Courbure de Ricci, 1/7. Available online: https://www.youtube.com/watch?v=xzVk56EKBUI (accessed on 19 August 2020).
14. Coddens, G. On Magnetic Monopoles, the Anomalous G-factor of the Electron and the Spin-Orbit Coupling in the Dirac Theory. Available online: https://hal.archives-ouvertes.fr/cea-01269569v2 (accessed on 29 August 2020).
15. Torre, C.G. Quantum Mechanics; Lecture 13; Utah State University: Logan, UT, USA, 2008; Available online: http://www.physics.usu.edu/torre/QuantumMechanics/6210_Spring_2008/ (accessed on 29 August 2020).
16. Einstein, A.; Ehrenfest, P. Quantentheoretische Bemerkungen zum Experiment von Stern und Gerlach. Z. Physik 1922, 11, 31.
17. Mezei, F. Neutron Spin Echo. In Proceedings of a Laue-Langevin Institut Workshop Grenoble, Grenoble, France, 15–16 October 1979; Springer Lecture Notes in Physics 128; Springer: Berlin, Germany, 1980.
18. Purcell, E.M. The Thomas Precession. 1975. Unpublished Note. Available online: https://aapt.scitation.org/doi/10.1119/1.1987061 (accessed on 9 August 2020).
19. Pauli, W. Zur Quanten-mechanik des Magnetischen Elektrons. Z. Physik 1926, 37, 263.
20. Feynman, R.P.; Weinberg, S. Elementary Particles and the Laws of Physics; Cambridge University Press: Cambridge, MA, USA, 1987.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

