1. Introduction
In the traditional approach to quantum mechanics (QM), the Hilbert space plays a central, dominant role and probabilities are introduced, almost as an afterthought, in order to provide the phenomenological link for handling measurements. The uneasy coexistence of the Hilbert and the probabilistic structures is reflected in the two separate modes of wave-function evolution; one is the linear and deterministic Schrödinger evolution and the other is the discontinuous and stochastic wave function collapse. It has given rise to longstanding problems in the interpretation of the quantum state itself [1,2,3,4,5].
These difficulties have motivated alternative approaches in which, rather than postulating Hilbert spaces as the starting point, one recognizes that probabilities play the dominant role; probabilities are not just an accidental feature peculiar to quantum measurements. The goal there is to derive or “reconstruct” the mathematical formalism of QM from more basic considerations of probability theory and geometry. (See, e.g., [6,7,8,9,10,11] and references therein.)
In the entropic dynamics (ED) approach the central object is the epistemic configuration space, which is a statistical manifold, that is, a space in which each point represents a probability distribution [11]. In this paper, our goal is to discuss those special curves that could potentially play the role of trajectories. What makes those curves special is that they are adapted to the natural geometric structures on the statistical manifold.
Two such structures are of central importance. The first is familiar from statistics, i.e., all statistical manifolds have an intrinsic metric structure given by the information metric [12,13]. The second is familiar from classical mechanics [14,15,16]. Since we are interested in trajectories, we are naturally led to consider the vectors that are tangent to such curves, as well as the dual vectors, or covectors; it is these objects that are used to represent the analogues of the velocities of probabilities and their momenta. Vectors and covectors live in the so-called tangent and cotangent spaces, respectively. It turns out that the statistical manifold plus all its cotangent spaces is itself a manifold, the cotangent bundle, that can be endowed with a second natural structure called symplectic. In mechanics, the cotangent bundle is known as phase space and the symplectic transformations are known as canonical transformations.
There is extensive literature on the symplectic and metric structures inherent to QM. They have been discovered, independently rediscovered, and extensively studied by many authors [17,18,19,20,21,22,23,24,25,26]. Their crucial insight is that those structures, being of purely geometrical nature, are not just central to classical mechanics, they are also central to quantum mechanics. Furthermore, the potential connection and relevance of information geometry to various aspects of QM, including its metric structure, has also been studied [9,10,11,27,28,29,30].
To characterize congruences of curves in the epistemic phase space (or, equivalently, the flows on the cotangent bundle) we must address two problems. First, we must characterize the particular cotangent space and the symplectic structure that is relevant to QM. This amounts to establishing the correct conjugate momenta to be paired to the coordinates, which, in our case, are probabilities. In classical mechanics, this pairing is accomplished with the help of a Lagrangian $L(q,\dot q)$ and the prescription $p = \partial L/\partial\dot q$. In the present problem, we have no access to a Lagrangian and a different criterion is adopted [11]. The second problem is to provide the cotangent bundle with a metric structure that is compatible with the information metric of the underlying statistical manifold. The issue is that cotangent bundles are not statistical manifolds and the challenge is to identify the natural set of assumptions that leads to the right metric structure.
We show that the flows that are relevant to quantum mechanics are those that preserve (in the sense of vanishing Lie derivatives) both the symplectic structure (a Hamilton flow) and the metric structure (a Killing flow). The characterization of these Hamilton–Killing (HK) flows results in a formalism that includes states described by rays, a geometry given by the Fubini–Study metric, flows that obey a linear Schrödinger equation, the emergence of a complex structure, the Born rule, and Hilbert spaces. All these elements are derived rather than postulated.
The present discussion includes two new developments. First, our focus is on isolating the essential geometrical aspects of the problem (a discussion of the physical aspects is given in [11]) and the main ideas are presented in the simpler context of a finite-dimensional manifold, a simplex. Thus, what we derive here is the geometrical framework that applies to a toy model, an $n$-sided quantum die. Second, the metric structure of the cotangent bundle is found by a new argument involving the minimal assumption that the metric of phase space is determined by the only metric structure at our disposal, namely, the information metric of the simplex.
Is this all there is to quantum mechanics? We conclude with a word of caution. The framework developed here takes us a long way towards justifying the mathematical formalism that underlies quantum mechanics, but it is only a kinematical prelude to the true dynamics. The point is that not every HK curve is a trajectory and not every parameter that labels points along a curve is time. All changes of probabilities, including the changes we call dynamics, must be compatible with the entropic and Bayesian rules that have been found to be of universal applicability in inference. It is this additional requirement that further restricts the HK flows to an entropic dynamics that describes an evolution in a suitably constructed entropic concept of time [7,11].
This paper focuses on deriving the mathematical formalism of quantum mechanics, but the ED approach has been applied to a variety of other topics in quantum theory. These include the quantum measurement problem [31,32]; momentum and uncertainty relations [33,34]; the Bohmian limit [34,35] and the classical limit [36]; extensions to curved spaces [37]; to relativistic fields [38,39,40]; and the ED of spin [41].
2. Some Background
We deal with several distinct spaces. One is the ontic configuration space of microstates labeled by $i = 1 \ldots n$, which are the unknown variables we are trying to predict. Another is the space of probability distributions $\rho = (\rho(1) \ldots \rho(n))$, which is the epistemic configuration space or, to use a shorter name, the e-configuration space. This $(n-1)$-dimensional statistical manifold is a simplex $S$,
$$S = \Big\{\rho \,\Big|\, \rho(i) \geq 0,\ \sum_{i=1}^{n}\rho(i) = 1\Big\}. \tag{1}$$
As coordinates for a generic point $\rho$ on $S$, we shall use the probabilities $\rho^i = \rho(i)$ themselves.
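As a concrete illustration of the e-configuration space, the following minimal sketch treats a probability distribution for an $n$-sided die as a list of $n$ numbers and checks the two conditions that define the simplex. The function name `on_simplex` and the tolerance `tol` are illustrative choices made here, not part of the formalism.

```python
import math

def on_simplex(rho, tol=1e-12):
    """Return True if rho has non-negative entries that sum to 1."""
    nonnegative = all(r >= 0.0 for r in rho)
    normalized = math.isclose(sum(rho), 1.0, abs_tol=tol)
    return nonnegative and normalized

# A fair three-sided die: n = 3 coordinates, of which only n - 1 = 2 are independent.
fair_die = [1.0 / 3.0] * 3

# Doubling every entry breaks normalization, so the point leaves the simplex
# even though all of its coordinates remain positive.
unnormalized = [2.0 * r for r in fair_die]
```

The second example hints at why unconstrained coordinates are convenient: the entries of `unnormalized` can be varied independently, while those of `fair_die` cannot.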
Given the manifold $S$, we can construct two other special manifolds that will turn out to be useful, the tangent bundle $TS$ and the cotangent bundle $T^*S$. These are fiber bundles; the base manifold is $S$ and the fibers at each point $\rho$ are respectively the tangent and cotangent spaces at $\rho$. The tangent space at $\rho$, $TS_\rho$, is the vector space composed of all vectors that are tangent to curves through the point $\rho$. While this space is obviously important (it is the space of “velocities” of probabilities), in what follows, we will not have much to say about it. Much more central to our discussion is the cotangent space at $\rho$, $T^*S_\rho$, which is the vector space of all covectors at $\rho$.
As already mentioned, the reason we care about vectors and covectors is that these are the objects that are used to represent velocities and momenta. The cotangent bundle $T^*S$ plays the central role of the epistemic phase space, or e-phase space.
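A point $X \in T^*S$ is represented as $X = (\rho, \pi)$, where $\rho = (\rho^1 \ldots \rho^n)$ are coordinates on the base manifold $S$ and $\pi = (\pi_1 \ldots \pi_n)$ are some generic coordinates on the cotangent space $T^*S_\rho$ at $\rho$. Curves on $T^*S$ allow us to define vectors on the tangent spaces $T(T^*S)_X$. Let $X = X(\lambda)$ be a curve parameterized by $\lambda$; then, the vector $\bar V$ tangent to the curve at $X = (\rho, \pi)$ has components $d\rho^i/d\lambda$ and $d\pi_i/d\lambda$ and is written as
$$\bar V = \frac{d}{d\lambda} = \frac{d\rho^i}{d\lambda}\frac{\partial}{\partial\rho^i} + \frac{d\pi_i}{d\lambda}\frac{\partial}{\partial\pi_i}, \tag{2}$$
where $\partial/\partial\rho^i$ and $\partial/\partial\pi_i$ are the basis vectors, the index $i = 1 \ldots n$ is summed over and we adopt the standard notation in differential geometry, in which vectors are written as the derivative operators $d/d\lambda$ and $\partial/\partial\rho^i$. The directional derivative of a function $F(X)$ along the curve $X(\lambda)$ is
$$\frac{dF}{d\lambda} = \frac{\partial F}{\partial\rho^i}\frac{d\rho^i}{d\lambda} + \frac{\partial F}{\partial\pi_i}\frac{d\pi_i}{d\lambda} = \tilde\nabla F[\bar V], \tag{3}$$
where $\tilde\nabla$ is the gradient in $T^*S$, that is, the gradient of a generic function $F(X) = F(\rho, \pi)$ is
$$\tilde\nabla F = \frac{\partial F}{\partial\rho^i}\tilde\nabla\rho^i + \frac{\partial F}{\partial\pi_i}\tilde\nabla\pi_i, \tag{4}$$
where $\tilde\nabla\rho^i$ and $\tilde\nabla\pi_i$ are the basis covectors and the tilde ‘˜’ serves to distinguish the gradient $\tilde\nabla$ on the bundle $T^*S$ from the gradient $\nabla$ on the simplex $S$.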
Here, unfortunately, we encounter a technical difficulty due to the fact that the space $S$ is constrained to normalized probabilities so that the coordinates $\rho^i$ cannot be varied independently. This problem is handled, without loss of generality, by embedding the $(n-1)$-dimensional manifold $S$ into a manifold of one dimension higher, the so-called positive-cone, denoted $S^+$, where the coordinates $\rho^i$ are unconstrained.
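To simplify the notation, a point $X = (\rho, \pi)$ in the $2n$-dimensional $T^*S^+$ is labeled by its coordinates $X^{\alpha i} = (X^{1i}, X^{2i}) = (\rho^i, \pi_i)$, where $\alpha i$ is a composite index. The first index $\alpha$ (chosen from the beginning of the Greek alphabet) takes two values, $\alpha = 1, 2$. Since $\alpha$ keeps track of whether $i$ is an upper $\rho^i$ index ($\alpha = 1$) or a lower $\pi_i$ index ($\alpha = 2$), from now on we can set $\rho^i = \rho_i$. Then, Equations (2) and (4) are written as
$$\bar V = V^{\alpha i}\frac{\partial}{\partial X^{\alpha i}}, \quad\text{with}\quad V^{\alpha i} = \frac{dX^{\alpha i}}{d\lambda}, \quad\text{and}\quad \tilde\nabla F = \frac{\partial F}{\partial X^{\alpha i}}\tilde\nabla X^{\alpha i}. \tag{5}$$
The repeated indices indicate a double summation over $\alpha$ and $i$. The action of the basis covectors $\tilde\nabla X^{\alpha i}$ on the basis vectors, $\partial/\partial X^{\beta j} = \partial_{\beta j}$, is given by
$$\tilde\nabla X^{\alpha i}[\partial_{\beta j}] = \frac{\partial X^{\alpha i}}{\partial X^{\beta j}} = \delta^{\alpha i}_{\beta j}, \tag{6}$$
so that, by linearity,
$$\tilde\nabla F[\bar V] = \frac{\partial F}{\partial X^{\alpha i}}V^{\alpha i} \tag{7}$$
is the directional derivative of $F$ along the vector $\bar V$.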
3. Hamiltonian Flows
Just as a manifold can be supplied with a symmetric bilinear form, the metric tensor, which gives it the fairly rigid structure described as its metric geometry, cotangent bundles can be supplied with an antisymmetric bilinear form, the symplectic form, which gives them the somewhat floppier structure called symplectic geometry (Arnold 1997 [15,16]).
A vector field defines a space-filling congruence of curves that are tangent to the field at every point X. We seek those special congruences or flows that reflect the symplectic geometry.
3.1. The Symplectic Form
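Once the local coordinates $(\rho^i, \pi_i)$ on $T^*S^+$ are established there is a natural choice of symplectic form
$$\Omega = \tilde\nabla\rho^i \otimes \tilde\nabla\pi_i - \tilde\nabla\pi_i \otimes \tilde\nabla\rho^i. \tag{8}$$
The question of how to choose those local coordinates, which are Darboux coordinates for the cotangent bundle, remains open. The answer is not to be found in mathematics but in physics. In classical mechanics, the criterion for choosing a canonical momentum is provided by a Lagrangian; however, here, we do not have a Lagrangian. An alternative criterion more closely tailored to the framework presented here is provided by entropic dynamics [11]. From now on, we assume that the correct coordinates $(\rho^i, \pi_i)$ have been identified.
The action of $\Omega$ on two vectors $\bar V = d/d\lambda$ and $\bar U = d/d\mu$ is obtained using (6). The result is
$$\Omega[\bar V, \bar U] = V^{1i}U^{2i} - V^{2i}U^{1i} = \Omega_{\alpha i,\beta j}V^{\alpha i}U^{\beta j}, \quad\text{with}\quad \Omega_{\alpha i,\beta j} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}\delta_{ij}. \tag{9}$$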
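The action of the symplectic form on a pair of tangent vectors can be sketched in a few lines. The sketch below assumes the canonical (Darboux) expression in which the $\rho$-components of one vector are paired with the $\pi$-components of the other, with a relative minus sign; the function name `omega` and the sample vectors are hypothetical and chosen only for illustration.

```python
def omega(V, U):
    """Canonical symplectic form acting on two tangent vectors.

    Each vector is a pair (rho_part, pi_part) of equal-length lists, i.e. the
    components with first composite index 1 and 2 respectively.
    """
    V1, V2 = V
    U1, U2 = U
    return sum(v1 * u2 - v2 * u1 for v1, v2, u1, u2 in zip(V1, V2, U1, U2))

# Two arbitrary tangent vectors on the phase space of a three-sided die.
V = ([1.0, 0.0, -1.0], [0.5, 2.0, 0.0])
U = ([0.0, 3.0, 1.0], [1.0, -1.0, 4.0])
```

Antisymmetry is the defining feature: swapping the arguments flips the sign, and the form vanishes when both arguments are the same vector.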
3.2. Hamilton’s Equations and Poisson Brackets
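Next, we derive the $2n$-dimensional $T^*S^+$ analogues of the results that are standard in classical mechanics [14,15,16]. We seek those vector fields $\bar V(X)$ that generate flows (the congruence of integral curves) that preserve the symplectic structure in the sense that
$$\mathcal{L}_{\bar V}\Omega = 0, \tag{10}$$
where the Lie derivative [16] is
$$(\mathcal{L}_{\bar V}\Omega)_{\alpha i,\beta j} = V^{\gamma k}\partial_{\gamma k}\Omega_{\alpha i,\beta j} + \Omega_{\gamma k,\beta j}\partial_{\alpha i}V^{\gamma k} + \Omega_{\alpha i,\gamma k}\partial_{\beta j}V^{\gamma k}. \tag{11}$$
Since, by Equation (9), the components $\Omega_{\alpha i,\beta j}$ are constant, $\partial_{\gamma k}\Omega_{\alpha i,\beta j} = 0$, we can rewrite $\mathcal{L}_{\bar V}\Omega$ as
$$(\mathcal{L}_{\bar V}\Omega)_{\alpha i,\beta j} = \partial_{\alpha i}\big(\Omega_{\gamma k,\beta j}V^{\gamma k}\big) - \partial_{\beta j}\big(\Omega_{\gamma k,\alpha i}V^{\gamma k}\big), \tag{12}$$
which is the exterior derivative (roughly, the curl) of the covector $\Omega_{\gamma k,\alpha i}V^{\gamma k}$. By Poincaré's lemma, requiring $\mathcal{L}_{\bar V}\Omega = 0$ (a vanishing curl) implies that $\Omega_{\gamma k,\alpha i}V^{\gamma k}$ is the gradient of a scalar function, which we denote by $\tilde V(X)$,
$$\Omega_{\gamma k,\alpha i}V^{\gamma k} = \partial_{\alpha i}\tilde V. \tag{13}$$
In the opposite direction, we can easily check that (13) implies $\mathcal{L}_{\bar V}\Omega = 0$. Using (9), Equation (13) is more explicitly written as
$$\frac{d\rho^i}{d\lambda}\tilde\nabla\pi_i - \frac{d\pi_i}{d\lambda}\tilde\nabla\rho^i = \frac{\partial\tilde V}{\partial\rho^i}\tilde\nabla\rho^i + \frac{\partial\tilde V}{\partial\pi_i}\tilde\nabla\pi_i, \tag{14}$$
or
$$\frac{d\rho^i}{d\lambda} = \frac{\partial\tilde V}{\partial\pi_i} \quad\text{and}\quad \frac{d\pi_i}{d\lambda} = -\frac{\partial\tilde V}{\partial\rho^i}, \tag{15}$$
which we recognize as Hamilton's equations for a Hamiltonian function $\tilde V$. This justifies calling $\bar V$ the Hamiltonian vector field associated to the Hamiltonian function $\tilde V$. In other words, the flows that preserve the symplectic structure, $\mathcal{L}_{\bar V}\Omega = 0$, are generated by Hamiltonian vector fields $\bar V$ associated to Hamiltonian functions $\tilde V$.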
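The converse statement, that a vector field obtained from a scalar function automatically preserves the symplectic form, can be checked in three lines. The following is a sketch under assumed notation: $a$, $b$, $c$ are shorthand for composite indices, $\Omega_{ab}$ are the constant antisymmetric components of the symplectic form, $V^c$ are the components of the vector field and $\tilde V$ is the scalar function that generates it.

```latex
% Shorthand (an assumption of this sketch): a, b, c stand for composite
% indices and the components Omega_{ab} are constant and antisymmetric.
\begin{align*}
(\mathcal{L}_{\bar V}\Omega)_{ab}
  &= \partial_a\!\left(\Omega_{cb}V^{c}\right) - \partial_b\!\left(\Omega_{ca}V^{c}\right)
  && \text{(constant, antisymmetric } \Omega_{ab}\text{)} \\
  &= \partial_a\partial_b\tilde V - \partial_b\partial_a\tilde V
  && \text{(substituting } \Omega_{ca}V^{c} = \partial_a\tilde V\text{)} \\
  &= 0
  && \text{(partial derivatives commute).}
\end{align*}
```

The only analytic input is that $\tilde V$ is smooth enough for its mixed second derivatives to be symmetric.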
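From (9) and (15) the action of the symplectic form $\Omega$ on two Hamiltonian vector fields $\bar V = d/d\lambda$ and $\bar U = d/d\mu$ generated, respectively, by $\tilde V$ and $\tilde U$, is
$$\Omega[\bar V, \bar U] = \frac{d\rho^i}{d\lambda}\frac{d\pi_i}{d\mu} - \frac{d\pi_i}{d\lambda}\frac{d\rho^i}{d\mu} = \frac{\partial\tilde V}{\partial\rho^i}\frac{\partial\tilde U}{\partial\pi_i} - \frac{\partial\tilde V}{\partial\pi_i}\frac{\partial\tilde U}{\partial\rho^i} \equiv \{\tilde V, \tilde U\}, \tag{16}$$
where, on the right hand side, we have introduced the Poisson bracket notation. In other words, the action of $\Omega$ on two Hamiltonian vector fields is the Poisson bracket of the associated Hamiltonian functions. We can also check that the derivative of an arbitrary function $F(X)$ along the vector field $\bar V$ is
$$\frac{dF}{d\lambda} = \{F, \tilde V\}. \tag{17}$$
Thus, the Hamiltonian formalism that is so familiar in physics emerges from purely geometrical considerations. It might be desirable to adopt a more suggestive notation; instead of $\tilde V$ let us write $\tilde H$. Then, the flow generated by a Hamiltonian function $\tilde H$ and parameterized by “time” $\tau$ is given by Hamilton's equations in the standard form,
$$\frac{d\rho^i}{d\tau} = \frac{\partial\tilde H}{\partial\pi_i} \quad\text{and}\quad \frac{d\pi_i}{d\tau} = -\frac{\partial\tilde H}{\partial\rho^i}, \tag{18}$$
and the $\tau$ evolution of any well-behaved function $f(X)$ is given by
$$\frac{df}{d\tau} = \{f, \tilde H\}. \tag{19}$$
The difference with classical mechanics is that, here, the degrees of freedom are probabilities and not ontic variables such as, for example, the positions of particles.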
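The statement that the rate of change of any function along a Hamiltonian flow equals its Poisson bracket with the generator can be checked numerically. The sketch below is illustrative only: the functions `H` and `f` are made-up polynomials on the phase space of a 2-sided die, derivatives are estimated by central differences, and the flow is advanced by a single small Euler step, so agreement holds only up to discretization error.

```python
def partial(F, x, k, h=1e-6):
    """Central-difference estimate of dF/dx_k at the point x (a flat list)."""
    up = list(x); up[k] += h
    dn = list(x); dn[k] -= h
    return (F(up) - F(dn)) / (2.0 * h)

def poisson_bracket(F, G, x, n):
    """{F, G} with x = [rho^1..rho^n, pi_1..pi_n]."""
    return sum(
        partial(F, x, i) * partial(G, x, n + i)
        - partial(F, x, n + i) * partial(G, x, i)
        for i in range(n)
    )

def hamiltonian_velocity(H, x, n):
    """Hamilton's equations: d rho^i/d tau = dH/d pi_i, d pi_i/d tau = -dH/d rho^i."""
    drho = [partial(H, x, n + i) for i in range(n)]
    dpi = [-partial(H, x, i) for i in range(n)]
    return drho + dpi

# Hypothetical functions for n = 2; x = [rho^1, rho^2, pi_1, pi_2].
n = 2
H = lambda x: x[0] * x[3] ** 2 + x[1] * x[2]
f = lambda x: x[0] * x[1] + x[2] - x[3]

x0 = [0.3, 0.7, 0.2, -0.4]
dt = 1e-5
v = hamiltonian_velocity(H, x0, n)
x1 = [a + dt * b for a, b in zip(x0, v)]

rate_along_flow = (f(x1) - f(x0)) / dt      # finite-difference df/dtau
bracket = poisson_bracket(f, H, x0, n)      # {f, H} at the starting point
```

For these sample functions the bracket can also be evaluated by hand, which gives 0.458 at `x0`; the two numerical estimates agree with that value to within the step sizes used.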
3.3. The Normalization Constraint
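Since our actual interest is not in flows on $T^*S^+$ but on the bundle $T^*S$ of normalized probabilities, we shall restrict ourselves to flows that preserve the normalization of probabilities. Let
$$|\rho| = \sum_{i=1}^{n}\rho^i \quad\text{and}\quad \tilde N = 1 - |\rho|. \tag{20}$$
We seek those special Hamiltonians $\tilde H$ such that the initial condition $\tilde N = 0$ is preserved by the flow, that is,
$$\frac{d\tilde N}{d\tau} = \{\tilde N, \tilde H\} = 0 \quad\text{or}\quad \sum_i\frac{\partial\tilde H}{\partial\pi_i} = 0. \tag{21}$$
Indeed, the actual quantum Hamiltonians will preserve $\tilde N = \text{const}$ even when the constant does not vanish [11]. Since the probabilities $\rho^i$ must remain positive, we further require that $d\rho^i/d\tau \geq 0$ when $\rho^i = 0$.
We can also consider the Hamiltonian flow generated by $\tilde N$ and parameterized by $\nu$. From Equation (15) the corresponding Hamiltonian vector field $\bar N$ is given by
$$\bar N = N^{\alpha i}\frac{\partial}{\partial X^{\alpha i}} \quad\text{with}\quad N^{\alpha i} = \frac{dX^{\alpha i}}{d\nu} = \{X^{\alpha i}, \tilde N\}, \tag{22}$$
or, more explicitly,
$$N^{1i} = \frac{d\rho^i}{d\nu} = 0 \quad\text{and}\quad N^{2i} = \frac{d\pi_i}{d\nu} = 1. \tag{23}$$
The integral curves generated by $\tilde N$ are found by integrating (23). The result is
$$\rho^i(\nu) = \rho^i(0) \quad\text{and}\quad \pi_i(\nu) = \pi_i(0) + \nu, \tag{24}$$
which amounts to shifting all momenta by the $i$-independent parameter $\nu$. We can also see that, if $\tilde N$ is conserved along $\bar H$, then $\tilde H$ is conserved along $\bar N$,
$$\frac{d\tilde H}{d\nu} = \{\tilde H, \tilde N\} = 0, \tag{25}$$
which implies that the conserved quantity $\tilde N$ is the generator of a symmetry transformation.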
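The link between conserved normalization and the momentum shift can be checked numerically. The following minimal sketch assumes a made-up Hamiltonian for a 2-sided die, `H`, that depends on the momenta only through the difference of its two entries; it is not the Hamiltonian of any system discussed in the text. Under that assumption the sketch checks (a) that the sum of the partial derivatives of `H` with respect to the momenta vanishes, so that the total probability does not change along the flow, and (b) that `H` is unchanged when every momentum is shifted by the same amount.

```python
import math

def H(rho, pi):
    """Hypothetical Hamiltonian; depends on pi only through pi[0] - pi[1]."""
    return 2.0 * math.sqrt(rho[0] * rho[1]) * math.cos(pi[0] - pi[1])

def shift(pi, nu):
    """Shift every momentum by the same i-independent amount nu."""
    return [p + nu for p in pi]

def total_rho_rate(rho, pi, h=1e-6):
    """Rate of change of the total probability, sum_i dH/dpi_i (central differences)."""
    total = 0.0
    for i in range(len(pi)):
        up = list(pi); up[i] += h
        dn = list(pi); dn[i] -= h
        total += (H(rho, up) - H(rho, dn)) / (2.0 * h)
    return total

rho = [0.25, 0.75]
pi = [0.4, -1.1]
```

A Hamiltonian that depended on the momenta individually, rather than on their differences, would fail both checks.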
To summarize: the phase space of interest is $T^*S$, but the description is simplified by using the unnormalized coordinates $\rho^i$ of the larger embedding space $T^*S^+$. The introduction of one superfluous coordinate forces us to also introduce one superfluous momentum. We eliminate the extra coordinate by imposing the constraint $\tilde N = 0$. We eliminate the extra momentum by declaring it unphysical; the shifted point $(\rho^i, \pi_i + \nu)$ is declared to be equivalent to $(\rho^i, \pi_i)$. This equivalence is described as a global “gauge” symmetry which, as we shall see later in the paper, is the reason why quantum mechanical states are represented by rays rather than vectors in a Hilbert space.
5. Hamilton–Killing Flows
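In the previous sections we studied those Hamiltonian flows $\bar H$ that, in addition to preserving the symplectic form, are generated by a gauge invariant $\tilde H$ so they also preserve the normalization constraint $\tilde N$. Our next goal is to find those flows that also happen to preserve the metric $G$ of $T^*S^+$, that is, we want $\bar H$ to be a Killing vector. The vector field $\bar H$ is determined by the Killing equation [16], $\mathcal{L}_{\bar H}G = 0$, or
$$(\mathcal{L}_{\bar H}G)_{\mu j,\nu k} = H^{\lambda l}\partial_{\lambda l}G_{\mu j,\nu k} + G_{\lambda l,\nu k}\partial_{\mu j}H^{\lambda l} + G_{\mu j,\lambda l}\partial_{\nu k}H^{\lambda l} = 0. \tag{40}$$
Since Equation (38) gives $\partial_{\lambda l}G_{\mu j,\nu k} = 0$, the Killing equation simplifies to
$$(\mathcal{L}_{\bar H}G)_{\mu j,\nu k} = -i\begin{bmatrix} \partial_{1j}H^{2k} + \partial_{1k}H^{2j} & \partial_{1j}H^{1k} + \partial_{2k}H^{2j} \\ \partial_{2j}H^{2k} + \partial_{1k}H^{1j} & \partial_{2j}H^{1k} + \partial_{2k}H^{1j} \end{bmatrix} = 0, \tag{41}$$
where $\partial_{1j} = \partial/\partial\psi_j$ and $\partial_{2j} = \partial/\partial(i\hbar\psi^*_j)$. If we further require that $\bar H$ is a Hamiltonian flow, $\mathcal{L}_{\bar H}\Omega = 0$, then $H^{\mu j}$ satisfies Hamilton's equations,
$$H^{1j} = \frac{\partial\tilde H}{\partial(i\hbar\psi^*_j)} \quad\text{and}\quad H^{2j} = -\frac{\partial\tilde H}{\partial\psi_j}. \tag{42}$$
Substituting into (41), we find
$$\frac{\partial^2\tilde H}{\partial\psi_j\,\partial\psi_k} = 0 \quad\text{and}\quad \frac{\partial^2\tilde H}{\partial\psi^*_j\,\partial\psi^*_k} = 0. \tag{43}$$
Therefore, in order to generate a flow that preserves both $G$ and $\Omega$, the function $\tilde H$ must be linear in both $\psi$ and $\psi^*$,
$$\tilde H(\psi, \psi^*) = \sum_{jk}\psi^*_j\hat H_{jk}\psi_k + \sum_j\big(\hat L_j\psi^*_j + \hat M_j\psi_j\big) + \text{const}. \tag{44}$$
The kernels $\hat H_{jk}$, $\hat L_j$ and $\hat M_j$ are independent of $\psi$ and $\psi^*$. Imposing that the flow preserves the normalization constraint $\tilde N = \text{const}$, Equation (21), implies that $\tilde H$ must be invariant under the phase shift $\psi_j \to \psi_j e^{i\nu/\hbar}$. Therefore, $\hat L_j = \hat M_j = 0$ and we conclude that
$$\tilde H(\psi, \psi^*) = \sum_{jk}\psi^*_j\hat H_{jk}\psi_k + \text{const}. \tag{45}$$
The corresponding HK flow is given by Hamilton's equations
$$\frac{d\psi_j}{d\tau} = H^{1j} = \frac{\partial\tilde H}{\partial(i\hbar\psi^*_j)} = \frac{1}{i\hbar}\sum_k\hat H_{jk}\psi_k, \tag{46}$$
$$\frac{d(i\hbar\psi^*_j)}{d\tau} = H^{2j} = -\frac{\partial\tilde H}{\partial\psi_j} = -\sum_k\psi^*_k\hat H_{kj}. \tag{47}$$
The constant in (45) can be dropped, because it has no effect on the flow. Taking the complex conjugate of (46) and comparing with (47) shows that the kernel $\hat H_{jk}$ is Hermitian, $\hat H^*_{jk} = \hat H_{kj}$, and that the corresponding Hamiltonian functionals $\tilde H$ are real.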
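A flow generated by a bilinear Hamiltonian with a Hermitian kernel should conserve both the normalization and the value of the Hamiltonian function itself. The sketch below illustrates this for a two-component case under stated assumptions: units with $\hbar = 1$, complex coordinates `psi` evolving by the linear equation $i\hbar\, d\psi_j/d\tau = \sum_k \hat H_{jk}\psi_k$, a made-up kernel `H_kernel`, and a fourth-order Runge-Kutta integrator that is only a numerical convenience and not part of the derivation.

```python
def apply(Hk, psi):
    """Matrix-vector product sum_k H_jk psi_k."""
    n = len(psi)
    return [sum(Hk[j][k] * psi[k] for k in range(n)) for j in range(n)]

def velocity(Hk, psi, hbar=1.0):
    """d psi/d tau obtained from i*hbar d psi/d tau = H psi."""
    return [hp / (1j * hbar) for hp in apply(Hk, psi)]

def rk4_step(Hk, psi, dt):
    """One fourth-order Runge-Kutta step of the linear flow."""
    k1 = velocity(Hk, psi)
    k2 = velocity(Hk, [p + 0.5 * dt * k for p, k in zip(psi, k1)])
    k3 = velocity(Hk, [p + 0.5 * dt * k for p, k in zip(psi, k2)])
    k4 = velocity(Hk, [p + dt * k for p, k in zip(psi, k3)])
    return [p + dt * (a + 2 * b + 2 * c + d) / 6.0
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

def norm_sq(psi):
    """Sum of |psi_j|^2, the total probability."""
    return sum(abs(p) ** 2 for p in psi)

def energy(Hk, psi):
    """Bilinear Hamiltonian function sum_jk conj(psi_j) H_jk psi_k."""
    return sum(p.conjugate() * hp for p, hp in zip(psi, apply(Hk, psi))).real

H_kernel = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -1.0]]  # hypothetical Hermitian kernel
psi0 = [0.6 + 0.0j, 0.8j]                           # 0.36 + 0.64 = 1

psi_t = psi0
for _ in range(1000):                               # evolve to tau = 1
    psi_t = rk4_step(H_kernel, psi_t, 1e-3)
```

Replacing `H_kernel` by a non-Hermitian matrix makes `norm_sq(psi_t)` drift away from 1, which is one way to see why Hermiticity is required.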
To summarize, the preservation of the symplectic structure, the metric structure and the normalization constraint leads to Hamiltonian functions that are bilinear in $\psi$ and $\psi^*$, Equation (45). This is the main result of this paper. To appreciate its significance, once again, we adopt a more suggestive notation, i.e., the flow generated by the Hamiltonian function
$$\tilde H(\psi, \psi^*) = \sum_{jk}\psi^*_j\hat H_{jk}\psi_k \tag{48}$$
is
$$i\hbar\frac{d\psi_j}{d\tau} = \sum_k\hat H_{jk}\psi_k, \tag{49}$$
which is recognized as the Schrödinger equation. Beyond being Hermitian, the actual form of the kernel $\hat H_{jk}$ remains undetermined.
The central feature of Hamilton's Equations (46) or of the Schrödinger Equation (49) is that they are linear. Given two solutions $\Psi_1$ and $\Psi_2$ and arbitrary constants $c_1$ and $c_2$, the linear combination $\Psi_3 = c_1\Psi_1 + c_2\Psi_2$ is also a solution and this is extremely useful in calculations. Unfortunately, this is an HK flow on the embedding space $T^*S^+$ and, when the flow is projected onto the e-phase space $T^*S$, the linearity is severely restricted by normalization. If $\Psi_1$ and $\Psi_2$ are normalized points on $T^*S$, the superposition $\Psi_3$ is not in general a normalized point on $T^*S$, unless the constants $c_1$ and $c_2$ are chosen appropriately. Furthermore, the states $e^{i\nu_1/\hbar}\Psi_1$ and $e^{i\nu_2/\hbar}\Psi_2$ are supposed to be “physically” equivalent to the original $\Psi_1$ and $\Psi_2$, but, in general, the superposition $c_1 e^{i\nu_1/\hbar}\Psi_1 + c_2 e^{i\nu_2/\hbar}\Psi_2$ is not equivalent to $\Psi_3$. In other words, the mathematical linearity of (46) or (49) does not extend to a full-blown superposition principle for physically equivalent states. On the other hand, any point $\Psi \in T^*S^+$ deserves to be called a “state” in the limited sense that it may serve as the initial condition for a curve in $T^*S^+$. Since, given two states $\Psi_1$ and $\Psi_2$, their superposition $\Psi_3$ is also a state, we see that the set of states $\{\Psi\}$ forms a linear vector space. This is a structure that turns out to be very useful.
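Both limitations of the superposition principle can be seen in a two-component example. The sketch below assumes the usual identification of probabilities with squared moduli, $\rho_j \propto |\psi_j|^2$, and uses made-up states `psi1` and `psi2`; the helper names are hypothetical. It shows that the sum of two normalized states is not normalized, and that multiplying one of them by a phase, which should be physically irrelevant, changes the probabilities of the superposition.

```python
import cmath
import math

def superpose(a, psi, b, phi):
    """Linear combination a*psi + b*phi, component by component."""
    return [a * p + b * q for p, q in zip(psi, phi)]

def norm_sq(psi):
    return sum(abs(p) ** 2 for p in psi)

def probabilities(psi):
    """Probabilities proportional to |psi_j|^2, rescaled to sum to 1."""
    total = norm_sq(psi)
    return [abs(p) ** 2 / total for p in psi]

s = 1.0 / math.sqrt(2.0)
psi1 = [1.0 + 0j, 0.0 + 0j]   # normalized
psi2 = [s + 0j, s + 0j]       # normalized

plain = superpose(1.0, psi1, 1.0, psi2)

# Same two rays, but psi1 is first multiplied by the phase exp(i*pi) = -1.
phase = cmath.exp(1j * math.pi)
shifted = superpose(1.0, [phase * p for p in psi1], 1.0, psi2)
```

Here `plain` and `shifted` are built from physically equivalent ingredients, yet after rescaling they assign visibly different probabilities to the first outcome.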
6. Hilbert Space
Above we saw that the possible initial conditions for an HK flow, the
points of
, form a linear vector space. To take full advantage of linearity we would like to endow this vector space with the additional structure of an inner product and turn it into a Hilbert space—a term which we loosely use to describe any complex vector space with a Hermitian inner product. The metric tensor
G (Equation (
38)) and the symplectic form
(Equation (
37)) are supposed to act on
vectors ; their action on the
points or
is not defined. However, the choice of inner product for the points
is natural, in the sense that the necessary ingredients,
G and
, are already available.
We adopt the familiar Dirac notation to represent the states
as vectors
. In order that the inner product
be preserved it is defined in terms of the preserved tensors
G and
,
where
is a constant and, to follow convention, the overall constant is set to
. Using Equation (
37) and (
38), we obtain
To fix
, we impose that
, which implies that
. In order to comply with the standard convention that the inner product
is anti-linear in the first factor and linear in the second factor, we select
. The result is the familiar expression for the positive definite inner product,
Here we see that the choice of
as the overall constant leads to the standard relation
. The map between points and vectors,
, is defined by
, where
and the vectors
form a basis that is orthogonal and complete.
The bilinear Hamilton function
with kernel
can now be written as the expected value,
, of the Hamiltonian operator
with matrix elements
. The corresponding HK flows are given by
which are described by unitary transformations
where
. Finally, the Poisson bracket of two Hamiltonian functions
and
can be written in terms of the commutator of the associated operators,
. Thus, the Poisson bracket is the expectation of the commutator. This
identity is much sharper than Dirac’s pioneering discovery that the quantum commutator of two quantum variables is
analogous to the Poisson bracket of the corresponding classical variables.
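The identity between the Poisson bracket and the expectation of the commutator can be checked numerically in a small example. The sketch below is illustrative and rests on assumptions that are labeled here rather than taken from the text above: units with $\hbar = 1$, a canonical pair of coordinates $(\psi_j, i\hbar\psi^*_j)$, bilinear functions $\tilde U = \sum_{jk}\psi^*_j\hat U_{jk}\psi_k$, and the Pauli matrices as sample kernels. Under those assumptions it compares $\{\tilde U, \tilde V\}$ with $\langle\Psi|[\hat U, \hat V]|\Psi\rangle/(i\hbar)$.

```python
def matvec(A, x):
    """Column action: sum_k A_jk x_k."""
    n = len(x)
    return [sum(A[j][k] * x[k] for k in range(n)) for j in range(n)]

def vecmat_conj(x, A):
    """Row action: sum_k conj(x_k) A_kj."""
    n = len(x)
    return [sum(x[k].conjugate() * A[k][j] for k in range(n)) for j in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(A, psi):
    return sum(p.conjugate() * a for p, a in zip(psi, matvec(A, psi)))

def poisson_bracket(U, V, psi, hbar=1.0):
    """{U~, V~} in the assumed canonical pair (psi_j, i*hbar*conj(psi_j))."""
    dU_dq = vecmat_conj(psi, U)                         # dU~/d psi_j
    dV_dq = vecmat_conj(psi, V)
    dU_dp = [u / (1j * hbar) for u in matvec(U, psi)]   # dU~/d(i*hbar*conj(psi_j))
    dV_dp = [v / (1j * hbar) for v in matvec(V, psi)]
    return sum(a * d - b * c for a, b, c, d in zip(dU_dq, dU_dp, dV_dq, dV_dp))

def commutator_expectation(U, V, psi, hbar=1.0):
    """Expectation of [U, V] divided by i*hbar."""
    UV, VU = matmul(U, V), matmul(V, U)
    n = len(U)
    comm = [[UV[i][j] - VU[i][j] for j in range(n)] for i in range(n)]
    return expectation(comm, psi) / (1j * hbar)

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
psi = [0.6 + 0j, 0.8j]
```

For this state both expressions evaluate to $-0.56$, which matches the hand calculation $2\langle\sigma_z\rangle = 2(0.36 - 0.64)$.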
7. Conclusions
There have been numerous attempts to derive or construct the mathematical formalism of quantum mechanics by adapting the symplectic geometry of classical mechanics. Such phase space methods invariably start from a classical phase space of positions and momenta and, through some series of “quantization rules,” posit a correspondence to self-adjoint operators which no longer constitute a phase space. The connection to classical mechanics is lost. The interpretation of and and even the answer to the question of what is ontic and what is epistemic become highly controversial. Probabilities play a secondary role in such formulations.
In this paper, we take a different starting point that places probabilities at the forefront. We discuss special families of curves, the Hamilton–Killing flows, that promise to be useful for the study of quantum mechanics. We show that the HK flows that preserve the symplectic and the metric structures of the e-phase space reproduce much of the mathematical formalism of quantum theory. It clarifies how the linearity of the Schrödinger equation, complex numbers and the Born rule (the Born rule for generic observables is discussed in [31,32]) follow from the symplectic and metric structures, while the normalization constraint leads to the equivalence of states along rays in a Hilbert vector space.