Entropic Dynamics

Entropic Dynamics is a framework in which dynamical laws are derived as an application of entropic methods of inference. No underlying action principle is postulated. Instead, the dynamics is driven by entropy subject to the constraints appropriate to the problem at hand. In this paper we review three examples of entropic dynamics. First we tackle the simpler case of a standard diffusion process which allows us to address the central issue of the nature of time. Then we show that imposing the additional constraint that the dynamics be non-dissipative leads to Hamiltonian dynamics. Finally, considerations from information geometry naturally lead to the type of Hamiltonian that describes quantum theory.


Introduction
The laws of physics, and in particular the laws of dynamics, have traditionally been seen as laws of nature. It is usually believed that such laws are discovered and that they are useful because they reflect reality. The reflection, imperfect though it may be, represents a very direct relation between physics and nature. Here we explore an alternative view in which the relation is considerably more indirect: The laws of physics provide a framework for processing information about nature. From this perspective physical models are mere tools that are partly discovered and partly designed with our own very human purposes in mind. This approach is decidedly pragmatic: when tools happen to be successful we do not say that they are true; we say that they are useful.
Is there any evidence in support of such an unorthodox view? The answer is yes. Indeed, if physics is an exercise in inference then we should expect it to include both ontic and epistemic concepts. The ontic concepts are meant to represent those entities in nature that are the subject of our interest.
They include the quantities, such as particle positions and field strengths, that we want to predict, to explain, and to control. The epistemic concepts, on the other hand, are the tools-the probabilities, the entropies, and the (information) geometries-that are used to carry out our inferences. The prominence of these epistemic elements strongly suggests that physics is not a mere mirror of nature; instead physics is an inference framework designed by humans for the purpose of facilitating their interactions with nature. The founders of quantum theory-Bohr, Heisenberg, Born, etc.-were quite aware of the epistemological and pragmatic elements in quantum mechanics (see e.g., [1]) but they wrote at a time when the tools of quantitative epistemology-the Bayesian and entropic methods of inference-had not yet been sufficiently developed.
Entropic Dynamics (ED) provides a framework for deriving dynamical laws as an application of entropic methods. (The principle of maximum entropy as a method for inference can be traced to E. T. Jaynes. For a pedagogical overview of Bayesian and entropic inference and further references see [2].) In ED, the dynamics is driven by entropy subject to the constraints appropriate to the problem at hand. It is through these constraints that the "physics" is introduced. Such a framework is extremely restrictive. For example, in order to adopt an epistemic view of the quantum state ψ it is not sufficient to merely assert that the probability |ψ|² represents a state of knowledge; this is a good start but it is not nearly enough. It is also necessary that the changes or updates of the epistemic ψ, which include both the unitary time evolution described by the Schrödinger equation and the collapse of the wave function during measurement, be derived according to the established rules of inference. Therefore, in a truly entropic dynamics we are not allowed to postulate action principles that operate at some deeper sub-quantum level. Instead, the goal is to derive such action principles from entropy principles with suitably chosen constraints.
In this paper we collect and streamline results that have appeared in several publications (see [3][4][5][6] and references therein) to provide a self-contained overview of three types of ED: (1) standard diffusion; (2) Hamiltonian dynamics; and (3) quantum mechanics. First we tackle the case of a diffusion process which serves to address the central concern with the nature of time. In ED "entropic" time is a relational concept introduced as a book-keeping device designed to keep track of the accumulation of change. Time is constructed by (a) introducing the notion of "instants"; (b) showing that these instants are "ordered"; and (c) defining a measure of the interval that separates successive instants. The welcome new feature is that an arrow of time is generated automatically; entropic time is intrinsically directional.
The early formulation of ED [3] involved assumptions about auxiliary variables, the metric of configuration space, and the form of the quantum potential. All these assumptions have been subsequently removed. In [6] it was shown how the constraint that the dynamics be non-dissipative leads to a generic form of Hamiltonian dynamics, with its corresponding symplectic structure and action principle. Thus, in the context of ED action principles are derived; they are useful tools but they are not fundamental.
Different Hamiltonians lead to different dynamical laws. We show how considerations of information geometry provide the natural path to Hamiltonians that include the correct form of "quantum potential" and lead to the Schrödinger equation, and we also identify the constraints that describe motion in an external electromagnetic field.
Here, we focus on the derivation of the Schrödinger equation but the ED approach has been applied to several other topics in quantum mechanics that will not be reviewed here. These include the quantum measurement problem [5,7,8]; momentum, angular momentum, their uncertainty relations, and spin [9,10]; relativistic scalar fields [11]; the Bohmian limit [12]; and the extension to curved spaces [13].
There is a vast literature on attempts to reconstruct quantum mechanics, and it is inevitable that the ED approach resembles them in one aspect or another; after all, in order to claim success all these approaches must sooner or later converge to the same Schrödinger equation. However, there are important differences. For example, the central concern with the notion of time makes ED significantly different from other approaches that are also based on information theory (see e.g., [14][15][16][17][18][19][20][21][22][23][24]). ED also differs from those approaches that attempt to explain the emergence of quantum behavior as the effective statistical mechanics of some underlying sub-quantum dynamics, which might possibly include some additional stochastic element (see e.g., [25][26][27][28][29][30][31][32][33][34]). Indeed, ED makes no reference to any sub-quantum dynamics whether classical, deterministic, or stochastic.

Entropic Dynamics
As with other applications of entropic methods, to derive dynamical laws we must first specify the microstates that are the subject of our inference (the subject matter) and then we must specify the prior probabilities and the constraints that represent the information that is relevant to our problem. (See e.g., [2].) We consider N particles living in a flat Euclidean space X with metric δ_ab. We assume that the particles have definite positions x_n^a, and it is their unknown values that we wish to infer. (The index n = 1, ..., N denotes the particle and a = 1, 2, 3 the spatial coordinates.) For N particles the configuration space is X^N = X × ... × X.
In this work ED is developed as a model for the quantum mechanics of particles. The same framework can be deployed to construct models for the quantum mechanics of fields, in which case it is the fields that are objectively "real" and have well-defined, albeit unknown, values [11].
The assumption that the particles have definite positions is in flat contradiction with the standard Copenhagen notion that quantum particles acquire definite positions only as a result of a measurement. For example, in the ED description of the double slit experiment we do not know which slit the quantum particle goes through but it most definitely goes through either one or the other.
We do not explain why motion happens but, given the information that it does, our task is to produce an estimate of what short steps we can reasonably expect. The next assumption is dynamical: we assume that the particles follow trajectories that are continuous. This means that displacements over finite distances can be analyzed as the accumulation of many infinitesimally short steps, and our first task is to find the transition probability density P(x'|x) for a single short step from a given initial x ∈ X^N to an unknown neighboring x' ∈ X^N. Later we will determine how such short steps accumulate to yield a finite displacement.
To find P(x'|x) we maximize the (relative) entropy,

S[P, Q] = -∫ dx' P(x'|x) log [ P(x'|x) / Q(x'|x) ] .    (1)

(To simplify the notation, in all configuration space integrals we write d^{3N}x = dx, d^{3N}x' = dx', and so on.) Q(x'|x) is the prior probability. It expresses our beliefs, or more precisely, the beliefs of an ideally rational agent, before any information about the motion is taken into account. The physically relevant information about the step is expressed in the form of constraints imposed on P(x'|x); this is the stage at which the physics is introduced.
The prior: We adopt a prior Q(x'|x) that represents a state of extreme ignorance: knowledge of the initial position x tells us nothing about x'. Such ignorance is expressed by assuming that Q(x'|x) dx' is proportional to the volume element in X^N. Since the space X^N is flat and a mere proportionality constant has no effect on the entropy maximization, we can set Q(x'|x) = 1. The generalization to curved spaces is straightforward [13]. Another possible matter of concern is that the uniform prior is not normalizable, and it is known that improper priors can sometimes be mathematically problematic. Fortunately, in our case no such difficulties arise. For microscopic particles any prior that is sufficiently flat over macroscopic scales turns out to lead to exactly the same physical predictions. We can, for example, use a Gaussian centered at x with a macroscopically large standard deviation, and this leads to exactly the same transition probability.
The constraints: The first piece of information is that motion is continuous; motion consists of a succession of infinitesimally short steps. Each individual particle n will take a short step from x_n^a to x_n^a + Δx_n^a, and we require that for each particle the expected squared displacement,

⟨Δx_n^a Δx_n^b⟩ δ_ab = κ_n ,   (n = 1, ..., N)    (2)

takes some small value κ_n. Infinitesimally short steps are obtained by taking the limit κ_n → 0. We will assume each κ_n to be independent of x to reflect the translational symmetry of X. In order to describe non-identical particles, we assume that the value of κ_n depends on the particle index n.

The N constraints in Equation (2) treat the particles as statistically independent and their accumulation eventually leads to a completely isotropic diffusion. But we know that particles can become correlated and even entangled. We also know that motion is not normally isotropic; once particles are set in motion they tend to persist in it. This information is introduced through one additional constraint involving a "drift" potential φ(x) that is a function in configuration space, x ∈ X^N. We impose that the expected displacements Δx_n^a along the direction of the gradient of φ satisfy

⟨Δx^A⟩ ∂_A φ = κ' ,    (3)

where capitalized indices such as A = (n, a) include both the particle index and its spatial coordinate; ∂_A = ∂/∂x^A = ∂/∂x_n^a; and κ' is another small, but for now unspecified, position-independent constant. The introduction of the drift potential φ will not be justified at this point. The idea is that we can make progress by identifying the constraints even when their physical origin remains unexplained. This situation is not unlike classical mechanics, where identifying the forces is useful even in situations where their microscopic origin is not understood. We do, however, make two brief comments. First, in the section on ED in an external electromagnetic field we shall see that such motion is described by constraints that are formally similar to Equation (3).
There we shall show that the effects of the drift potential φ and the electromagnetic vector potential A a are intimately related-a manifestation of gauge symmetry-suggesting that whatever φ might be, it is as "real" as A a . The second comment is that elsewhere, in the context of a particle with spin, we will see that the drift potential can be given a natural geometric interpretation as an angular variable. This imposes the additional condition that the integral of φ over any closed loop is quantized, dφ = 2πν where ν is an integer.
Maximizing S[P, Q] in Equation (1) subject to the N + 2 constraints, Equations (2) and (3) plus normalization, yields a Gaussian distribution,

P(x'|x) = (1/ζ) exp[ -Σ_n (α_n/2) δ_ab Δx_n^a Δx_n^b + α' Δx^A ∂_A φ ] ,    (4)

where ζ = ζ(x, α_n, α') is a normalization constant and the Lagrange multipliers α_n and α' are determined from

∂ log ζ / ∂α_n = -κ_n/2   and   ∂ log ζ / ∂α' = κ' .    (5)

The distribution P(x'|x) is conveniently rewritten as

P(x'|x) = (1/Z) exp[ -Σ_n (α_n/2) δ_ab (Δx_n^a - ⟨Δx_n^a⟩)(Δx_n^b - ⟨Δx_n^b⟩) ] ,    (6)

where Z is a new normalization constant. A generic displacement Δx_n^a = x_n^a' - x_n^a can be expressed as an expected drift plus a fluctuation, Δx_n^a = ⟨Δx_n^a⟩ + Δw_n^a, where

⟨Δx_n^a⟩ = (α'/α_n) δ^ab ∂φ/∂x_n^b ,    (7)

⟨Δw_n^a⟩ = 0   and   ⟨Δw_n^a Δw_n^b⟩ = (1/α_n) δ^ab .    (8)

From these equations we can get a first glimpse into the meaning of the multipliers α_n and α'. For very short steps, as α_n → ∞, the fluctuations become dominant: the drift is of order ⟨Δx_n^a⟩ ~ O(α_n^{-1}) while the fluctuations are of order Δw_n^a ~ O(α_n^{-1/2}). This implies that, as in Brownian motion, the trajectory is continuous but not differentiable. In the ED approach a particle has a definite position, but its velocity, the tangent to the trajectory, is completely undefined. We can also see that the effect of α' is to enhance or suppress the magnitude of the drift relative to the fluctuations, a subject that is discussed in detail in [12]. However, for our current purposes we can absorb α' into the so far unspecified drift potential, α'φ → φ, which amounts to setting α' = 1.
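The structure of the transition probability can be illustrated numerically. The sketch below (an illustration of ours, not part of the derivation) samples short steps in 1D for a single particle: the mean is the drift (α'/α_n)∂φ and the variance is 1/α_n, so for large α_n the O(α_n^{-1/2}) fluctuations dominate the O(α_n^{-1}) drift.

```python
import numpy as np

# Illustrative sketch: sample short entropic steps for one particle in 1D.
# The transition probability (6) is a Gaussian whose mean is the drift
# (alpha'/alpha_n) * grad(phi) and whose variance is 1/alpha_n.

def sample_step(x, grad_phi, alpha_n, alpha_prime=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    drift = (alpha_prime / alpha_n) * grad_phi       # expected drift <dx>
    dw = rng.normal(0.0, np.sqrt(1.0 / alpha_n))     # fluctuation dw
    return x + drift + dw

rng = np.random.default_rng(0)
alpha_n = 1.0e4
steps = np.array([sample_step(0.0, grad_phi=2.0, alpha_n=alpha_n, rng=rng)
                  for _ in range(20000)])
print(steps.mean())   # close to the drift 2/alpha_n = 2e-4
print(steps.var())    # close to 1/alpha_n = 1e-4
```

Note that the sampled variance (1e-4) is much larger than the squared drift (4e-8), which is the numerical face of the non-differentiable, Brownian character of the trajectories.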

Entropic Time
The goal is to derive dynamical laws as an application of inference methods, although the latter make no reference to time. Therefore, additional assumptions are needed to specify what constitutes "time" in our framework.
The foundation of any notion of time is dynamics. We must first identify a notion of "instant" that properly takes into account the inferential nature of entropic dynamics. Time is then constructed as a device to keep track of change or, more explicitly, of the accumulation of repeated small changes.

Time as an Ordered Sequence of Instants
Entropic dynamics is generated by the short-step transition probability P(x'|x), Equation (6). In the ith step, which takes the system from x = x_{i-1} to x' = x_i, both x_{i-1} and x_i are unknown. Integrating the joint probability P(x_i, x_{i-1}) over x_{i-1} gives

P(x_i) = ∫ dx_{i-1} P(x_i, x_{i-1}) = ∫ dx_{i-1} P(x_i | x_{i-1}) P(x_{i-1}) .    (9)

These equations involve no assumptions; they are true by virtue of the laws of probability. However, if P(x_{i-1}) happens to be the probability of different values of x_{i-1} at an "instant" labelled t, then we can interpret P(x_i) as the probability of values of x_i at the next "instant", which we will label t'. Accordingly, we write P(x_{i-1}) = ρ(x, t) and P(x_i) = ρ(x', t'), so that

ρ(x', t') = ∫ dx P(x'|x) ρ(x, t) .    (10)

Nothing in the laws of probability leading to Equation (9) forces the interpretation Equation (10) on us; this is an independent assumption about what constitutes time in our model. We use Equation (10) to define what we mean by an instant: if the distribution ρ(x, t) refers to one instant t, then the distribution ρ(x', t') generated by P(x'|x) defines what we mean by the "next" instant t'. The iteration of this process defines the dynamics: entropic time is constructed instant by instant: ρ(x', t') is constructed from ρ(x, t), ρ(x'', t'') is constructed from ρ(x', t'), and so on. The inferential nature of the construction can be phrased more explicitly. Once we have decided on the relevant information necessary for predicting future behavior, namely the distributions ρ(x, t) and P(x'|x), we can say that this information defines what we mean by an "instant". Furthermore, Equation (10) shows that: time is designed in such a way that given the present the future is independent of the past. An equation such as Equation (10) is commonly employed to define Markovian behavior, in which case it is sometimes known as the Chapman-Kolmogorov equation. Markovian processes are such that once an external notion of time is given, defined perhaps by an external clock, the specification of the state of the system at time t is sufficient to fully determine its state after time t; no additional information about times prior to t is needed. It should be emphasized that we do not make a Markovian assumption.
We are concerned with a different problem: we do not use Equation (10) to define a Markovian process in an already existing background time; we use it to construct time itself.
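The construction of time by iterating Equation (10) can be made concrete on a grid. The sketch below (an assumed 1D discretization with illustrative parameter values, not from the paper) applies a zero-drift Gaussian transition kernel repeatedly; each application defines the "next" instant, probability is conserved, and the variance grows step by step.

```python
import numpy as np

# Sketch: build "entropic time" by iterating Equation (10),
#   rho(x', t') = ∫ dx P(x'|x) rho(x, t),
# on a 1D grid with a zero-drift Gaussian kernel.

x = np.linspace(-5.0, 5.0, 401)
dx = x[1] - x[0]
var = 0.01                                   # one-step variance, 1/alpha_n

# transition kernel P(x'|x), normalized over x' for each x
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * var))
K /= K.sum(axis=0, keepdims=True) * dx

rho = np.exp(-x**2 / 0.5) / np.sqrt(0.5 * np.pi)   # initial instant, variance 0.25
for _ in range(50):                                # fifty successive "instants"
    rho = K @ rho * dx

print(np.sum(rho) * dx)          # total probability stays 1
print(np.sum(rho * x**2) * dx)   # variance grows by ~50 * var
```

The ordering of the iteration is what supplies the arrow discussed next: the kernel maps the distribution at one instant to the next, never the reverse.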

The Arrow of Entropic Time
The notion of time constructed according to Equation (10) incorporates an intrinsic directionality: there is an absolute sense in which ρ(x, t) is prior and ρ(x', t') is posterior. To construct the time-reversed evolution we just write

ρ(x, t) = ∫ dx' P(x|x') ρ(x', t') ,    (11)

where according to the rules of probability theory P(x|x') is related to P(x'|x) in Equation (6) by Bayes' theorem,

P(x|x') = [ ρ(x, t) / ρ(x', t') ] P(x'|x) .    (12)

Note that this is not a mere exchange of primed and unprimed quantities: P(x'|x) is the Gaussian distribution, Equation (6), obtained using the maximum entropy method; in contrast, the time-reversed P(x|x') obtained using Bayes' theorem, Equation (12), will not in general be Gaussian. The asymmetry between the inferential past and the inferential future is traced to the asymmetry between priors and posteriors.
The subject of the arrow of time has a vast literature (see e.g., [35,36]). The puzzle has been how to explain the asymmetric arrow from underlying symmetric laws of nature. The solution offered by ED is that there are no symmetric underlying laws and the asymmetry is the inevitable consequence of entropic inference. From the point of view of ED, the challenge is not to explain the arrow of time, but rather the reverse, how to explain the emergence of symmetric laws within an intrinsically asymmetric entropic framework. As we shall see below the derived laws of physics-e.g., the Schrödinger equation-are time-reversible but entropic time itself only flows forward.

Duration: A Convenient Time Scale
To complete the model of entropic time we need to specify the interval Δt between successive instants. The basic criterion is that duration is defined so that motion looks simple. From Equations (7) and (8), for short steps (large α_n) the motion is dominated by the fluctuations Δw_n^a. Therefore specifying Δt amounts to specifying the multipliers α_n in terms of Δt.
To start, we appeal to symmetry: in order that the fluctuations ⟨Δw_n^a Δw_n^b⟩ reflect the symmetry of translations in space and time (a time that, just like Newton's, flows "equably everywhere and everywhen") we choose α_n to be independent of x and t, and we choose Δt so that α_n ∝ 1/Δt. More explicitly, we write

α_n = m_n / (η Δt) ,    (13)

where the proportionality constants have been written in terms of some particle-specific constants m_n, which will eventually be identified with the particle masses, and an overall constant η that fixes the units of the m_n's relative to the units of time and will later be regraduated into ħ. Before discussing the implications of the choice Equation (13) it is useful to consider the geometry of the N-particle configuration space, X^N.

The Information Metric of Configuration Space
We have assumed that the geometry of the single particle spaces X is described by the Euclidean metric δ ab . We can expect that the N -particle configuration space, X N = X × . . . × X, will also have a flat geometry, but the relative contribution of different particles to the metric remains undetermined. Should very massive particles contribute the same as very light particles? The answer is provided by information geometry.
To each point x ∈ X^N there corresponds a probability distribution P(x'|x). Therefore X^N is a statistical manifold and, up to an arbitrary global scale factor, its geometry is uniquely determined by the information metric,

γ_AB = C ∫ dx' P(x'|x) [∂_A log P(x'|x)] [∂_B log P(x'|x)] ,    (14)

where C is an arbitrary positive constant (see e.g., [2]). A straightforward substitution of Equations (6) and (13) into Equation (14) in the limit of short steps (α_n → ∞) yields

γ_AB = (C m_n / η Δt) δ_AB .    (15)

We see that γ_AB diverges as Δt → 0. The reason for this is not hard to find. As Δt → 0 the Gaussian distributions P(x'|x) and P(x'|x + Δx) become more sharply peaked and it is easier to distinguish one from the other, which translates into a greater information distance, γ_AB → ∞. In order to define a distance that remains meaningful for arbitrarily small Δt it is convenient to choose C ∝ Δt.
In what follows the metric tensor will always appear in combinations such as γ_AB Δt/C. It is therefore convenient to define the "mass" tensor,

m_AB = (η Δt / C) γ_AB = m_n δ_AB .    (16)

Its inverse,

m^AB = (C / η Δt) γ^AB = (1/m_n) δ^AB ,    (17)

is called the "diffusion" tensor.
We can now summarize our results so far. The choice Equation (13) of the multipliers α_n simplifies the dynamics: P(x'|x) in Equation (6) is a standard Wiener process. A generic displacement, Equation (7), is

Δx^A = b^A Δt + Δw^A ,    (18)

where b^A(x) is the drift velocity,

b^A = η m^AB ∂_B φ ,    (19)

and the fluctuations Δw^A are such that

⟨Δw^A⟩ = 0   and   ⟨Δw^A Δw^B⟩ = η Δt m^AB .    (20)

We are now ready to comment on the implications of the choice of time scale Δt and of multipliers α_n, Equation (13). The first remark is on the nature of clocks: in Newtonian mechanics the prototype of a clock is the free particle, and time is defined so as to simplify the motion of free particles: they move equal distances in equal times. In ED the prototype of a clock is a free particle too (for sufficiently short times all particles are free) and time here is also defined to simplify the description of their motion: the particle undergoes equal fluctuations in equal times.
The second remark is on the nature of mass. The particle-specific constants m n will, in due course, be called "mass" and Equation (20) provides their interpretation: mass is an inverse measure of fluctuations. Thus, up to overall constants the metric of configuration space is the mass tensor and its inverse is the diffusion tensor. In standard QM there are two mysteries: "Why quantum fluctuations?" and "What is mass?" ED offers some progress in that instead of two mysteries there is only one. Fluctuations and mass are two sides of the same coin.
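The statement that mass is an inverse measure of fluctuations can be seen directly in simulation. The sketch below (1D, illustrative units of ours, η = 1) draws the fluctuations of Equation (20) for two masses and confirms that ⟨Δw²⟩/Δt ≈ η/m, so in equal times the heavier particle fluctuates less.

```python
import numpy as np

# Sketch: Equation (20) says <dw^2> = (eta/m) dt for each particle, so the
# sampled fluctuation variance per unit time should scale as 1/m.

rng = np.random.default_rng(1)
eta, dt, nsteps = 1.0, 1.0e-3, 200000
for m in (1.0, 4.0):
    dw = rng.normal(0.0, np.sqrt(eta * dt / m), size=nsteps)
    print(m, dw.var() / dt)          # close to eta/m
```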
Finally we note the formal similarity to Nelson's stochastic mechanics [25]. The similarity is to be expected-all theories that converge on the Schrödinger equation must at some point become formally similar-but our epistemic interpretation differs radically from Nelson's ontic interpretation and avoids the difficulties discussed in [37].

Diffusive Dynamics
Equation (10) is the dynamical equation for the evolution of ρ(x, t). It is written in integral form but it can be written in differential form as a Fokker-Planck (FP) equation (see e.g., [2]),

∂_t ρ = -∂_A (b^A ρ) + (η/2) m^AB ∂_A ∂_B ρ ,    (21)

or equivalently as a continuity equation,

∂_t ρ = -∂_A (ρ v^A) ,    (22)

where

v^A = b^A + u^A    (23)

is the velocity of the probability flow or current velocity and

u^A = -η m^AB ∂_B log ρ^{1/2}    (24)

is called the osmotic velocity; it represents the tendency for probability to flow down the density gradient. Since both b^A and u^A are gradients, it follows that the current velocity is a gradient too,

v^A = m^AB ∂_B Φ   with   Φ = η (φ - log ρ^{1/2}) .    (25)

The FP equation,

∂_t ρ = -∂_A (ρ m^AB ∂_B Φ) ,    (26)

can be conveniently rewritten in the alternative form

∂_t ρ = δH̃[ρ, Φ] / δΦ ,    (27)

for some suitably chosen functional H̃[ρ, Φ]. It is easy to check that the appropriate functional H̃ is

H̃[ρ, Φ] = ∫ dx (1/2) ρ m^AB ∂_A Φ ∂_B Φ + F[ρ] ,    (28)

where the integration constant F[ρ] is some unspecified functional of ρ.
With these results we have demonstrated that a specific form of dynamics, a standard diffusion process, can be derived from principles of entropic inference. This diffusive dynamics can be written in different but equivalent ways: Equations (21), (22), (26) and (27) are all equivalent. Next we turn our attention to other forms of dynamics, such as quantum or classical mechanics, which require a somewhat different choice of constraints.
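The claim that the continuity equation can be written as ∂_t ρ = δH̃/δΦ can be checked numerically. The sketch below (a 1D single-particle discretization with illustrative values; the grid and fields are our assumptions, not the paper's) compares the right-hand side -∂(ρ ∂Φ/m)/∂x of Equation (26) with a finite-difference functional derivative of the kinetic term of H̃; since F[ρ] does not depend on Φ it drops out of the comparison.

```python
import numpy as np

# Numerical check: -d/dx( rho * (dPhi/dx) / m ) equals the functional
# derivative dH/dPhi of H[rho, Phi] = ∫ dx (1/2m) rho (dPhi/dx)^2.

x = np.linspace(-3.0, 3.0, 601)
dx = x[1] - x[0]
m = 2.0
rho = np.exp(-x**2) / np.sqrt(np.pi)
Phi = np.sin(x)

def H(Phi):
    return np.sum(0.5 * rho * np.gradient(Phi, dx) ** 2 / m) * dx

# FP right-hand side at each grid point
fp_rhs = -np.gradient(rho * np.gradient(Phi, dx) / m, dx)

# finite-difference functional derivative at one interior point
i, eps = 250, 1e-6
Pp = Phi.copy(); Pp[i] += eps
Pm = Phi.copy(); Pm[i] -= eps
func_deriv = (H(Pp) - H(Pm)) / (2 * eps * dx)

print(func_deriv, fp_rhs[i])   # the two agree
```

The agreement is exact for this discretization because H is quadratic in Φ, so the central difference reproduces the discrete functional derivative up to roundoff.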

Hamiltonian Dynamics
The previous discussion has led us to a standard diffusion, in which the density ρ evolves under the influence of some externally fixed drift potential φ. However, in quantum dynamics we require a second degree of freedom, the phase of the wave function. The extra degree of freedom is introduced into ED by replacing the constraint of a fixed drift potential φ by an evolving constraint in which at each time step the potential φ is readjusted in response to the evolving ρ.
To find the appropriate readjustment of φ we borrow an idea of Nelson's [38] and impose that the potential φ be updated in such a way that a certain functional, later called "energy", remains constant. The next challenge is to identify the appropriate functional form of this energy, but before this we make two remarks.
The standard procedure in mechanics is to derive the conservation of energy from the invariance of the action under time translations, but here we do not have an action yet. The logic of our derivation runs in the opposite direction: we first identify the conservation of an energy as the piece of information that is relevant to our inferences, and from it we derive Hamilton's equations and their associated action principle.
Imposing energy conservation appears to be natural because it agrees with our classical preconceptions of what mechanics is like. But ED is not at all like classical mechanics. Indeed, Equation (18) is the kind of equation (a Langevin equation) that characterizes a Brownian motion in the limit of infinite friction. Therefore in the ED approach to quantum theory particles seem to be subject to infinite friction while suffering zero dissipation. Such a strange dynamics can hardly be called "mechanics", much less "classical".
The Ensemble Hamiltonian: The energy functional that codifies the correct constraint is of the form of Equation (28). We therefore impose that, irrespective of the initial conditions, the potential φ will be updated in such a way that the functional H̃[ρ, Φ] in Equation (28) is always conserved,

dH̃/dt = 0 .    (29)

Using Equation (27) we get

dH̃/dt = ∫ dx [ (δH̃/δΦ) ∂_t Φ + (δH̃/δρ) ∂_t ρ ] = ∫ dx [ ∂_t Φ + δH̃/δρ ] ∂_t ρ = 0 .    (30)

We require that H̃ be conserved for arbitrary choices of the initial values of ρ and Φ. From Equation (26) we see that this amounts to imposing dH̃/dt = 0 for arbitrary choices of ∂_t ρ. Therefore the requirement that H̃ be conserved for arbitrary initial conditions amounts to imposing

∂_t Φ = -δH̃/δρ .    (31)

Equations (27) and (31) have the form of a canonically conjugate pair of Hamilton's equations. The field ρ is a generalized coordinate and Φ is its canonical momentum. The conserved functional H̃[ρ, Φ] in Equation (28) will be called the ensemble Hamiltonian. We conclude that non-dissipative ED leads to Hamiltonian dynamics.

Equation (31) leads to a generalized Hamilton-Jacobi equation,

∂_t Φ + (1/2) m^AB ∂_A Φ ∂_B Φ + δF/δρ = 0 .    (32)
The Action, Poisson Brackets, etc.: Now that we have Hamilton's equations, (27) and (31), we can invert the usual procedure and construct an action principle from which they can be derived. Define the differential

δA = ∫ dt dx [ (∂_t ρ - δH̃/δΦ) δΦ - (∂_t Φ + δH̃/δρ) δρ ] ,    (33)

and then integrate to get the action

A[ρ, Φ] = ∫ dt [ ∫ dx Φ ∂_t ρ - H̃[ρ, Φ] ] .    (34)

By construction, imposing δA = 0 leads to Equations (27) and (31). The time evolution of any arbitrary functional f[ρ, Φ] is given by a Poisson bracket,

df/dt = {f, H̃} = ∫ dx [ (δf/δρ)(δH̃/δΦ) - (δf/δΦ)(δH̃/δρ) ] ,    (35)

which shows that the ensemble Hamiltonian H̃ is the generator of time evolution. Similarly, under a spatial displacement ε^a the change in f[ρ, Φ] is

δf = {f, P̃_a} ε^a ,    (36)

where

P̃_a = ∫ dx ρ ∂Φ/∂X^a    (37)

is interpreted as the expectation of the total momentum, and

X^a = (1/M) Σ_n m_n x_n^a   with   M = Σ_n m_n    (38)

are the coordinates of the center of mass.

A Schrödinger-Like Equation: We can always combine ρ and Φ to define the family of complex functions,

Ψ_k = ρ^{1/2} exp(i k Φ / η) ,    (39)

where k is some arbitrary positive constant. Then the two coupled Equations (27) and (31) can be written as a single complex Schrödinger-like equation,

i (η/k) ∂_t Ψ_k = -(η²/2k²) m^AB ∂_A ∂_B Ψ_k + (η²/2k²) m^AB [ (∂_A ∂_B |Ψ_k|) / |Ψ_k| ] Ψ_k + (δF/δρ) Ψ_k .    (40)

The reason for the parameter k will become clear shortly, but even at this stage we can already anticipate that η/k will play the role of ħ.
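The packaging of Equation (39) is a change of variables, not new physics, and it is invertible for any k. The sketch below (illustrative 1D fields of our choosing, with kΦ/η kept within (-π, π) so the phase unwinds cleanly) shows that (ρ, Φ) can be recovered from Ψ_k for different values of k.

```python
import numpy as np

# Sketch: Psi_k = sqrt(rho) * exp(i k Phi / eta) packages the pair (rho, Phi)
# into one complex function; different k describe the same (rho, Phi).

eta = 1.0
x = np.linspace(-2.0, 2.0, 101)
rho = np.exp(-x**2) / np.sqrt(np.pi)
Phi = 0.3 * x        # keep k*Phi/eta inside (-pi, pi) so np.angle inverts it

for k in (0.5, 2.0):
    Psi = np.sqrt(rho) * np.exp(1j * k * Phi / eta)
    rho_back = np.abs(Psi) ** 2              # recover the density
    Phi_back = (eta / k) * np.angle(Psi)     # recover the phase
    print(np.max(np.abs(rho_back - rho)), np.max(np.abs(Phi_back - Phi)))
```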

Information Geometry and the Quantum Potential
Different choices of the functional F [ρ] in Equation (28) lead to different dynamics. Earlier we invoked information geometry, Equation (14), to define the metric m AB induced in configuration space by the transition probabilities P (x |x). To motivate the particular choice of the functional F [ρ] that leads to quantum theory we appeal to information geometry once again.
Consider the family of distributions ρ(x|θ) that are generated from a distribution ρ(x) by pure translations by a vector θ^A, ρ(x|θ) = ρ(x - θ). The extent to which ρ(x|θ) can be distinguished from the slightly displaced ρ(x|θ + dθ) or, equivalently, the information distance between θ^A and θ^A + dθ^A, is given by

dλ² = g_AB dθ^A dθ^B ,    (41)

where

g_AB(θ) = ∫ dx ρ(x - θ) [∂_A log ρ(x - θ)] [∂_B log ρ(x - θ)] .    (42)

Changing variables x - θ → x yields

g_AB = ∫ dx (1/ρ) ∂_A ρ ∂_B ρ = I_AB[ρ] ,    (43)

which is the Fisher information associated with translations. Since ED aims to derive the laws of physics from a framework for inference, it is natural to expect that the Hamiltonian might also contain terms that are of a purely informational nature. We have identified two such tensors: one is the information metric of configuration space γ_AB ∝ m_AB, another is I_AB[ρ]. The simplest nontrivial scalar that can be constructed from them is the trace m^AB I_AB. This suggests

F[ρ] = ξ m^AB I_AB[ρ] + ∫ dx ρ V(x) ,    (44)

where ξ > 0 is a constant that controls the relative strength of the two contributions and V(x) is some function that will be recognized as the familiar scalar potential. The term m^AB I_AB is sometimes called the "quantum" or the "osmotic" potential. This relation between the quantum potential and the Fisher information was pointed out in [39]. From Equation (43) we see that m^AB I_AB is a contribution to the energy such that those states that are more smoothly spread out tend to have lower energy. The case ξ < 0 leads to instabilities and is therefore excluded; the case ξ = 0 leads to a qualitatively different theory and will be discussed elsewhere [12].
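The statement that smoother distributions carry lower Fisher information can be checked against the standard analytic result. For a 1D Gaussian of variance σ², the translation Fisher information ∫ dx (ρ')²/ρ equals 1/σ². The sketch below (a numerical illustration of ours) verifies this for two widths.

```python
import numpy as np

# Numerical check: Fisher information of the translation family for a
# Gaussian of variance sigma^2 is 1/sigma^2, so wider (smoother) densities
# contribute less "quantum potential" energy via Equation (44).

x = np.linspace(-12.0, 12.0, 4801)
dx = x[1] - x[0]
for sigma in (1.0, 2.0):
    rho = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    drho = np.gradient(rho, dx)
    fisher = np.sum(drho**2 / rho) * dx
    print(sigma, fisher)     # close to 1/sigma^2
```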
With this choice of F[ρ] the generalized Hamilton-Jacobi Equation (32) becomes

∂_t Φ + (1/2) m^AB ∂_A Φ ∂_B Φ + V - 4ξ m^AB (∂_A ∂_B ρ^{1/2}) / ρ^{1/2} = 0 .    (45)

The Schrödinger Equation
Substituting Equation (44) into Equation (40) gives a Schrödinger-like equation,

i (η/k) ∂_t Ψ_k = -(η²/2k²) m^AB ∂_A ∂_B Ψ_k + V Ψ_k + (η²/2k² - 4ξ) m^AB [ (∂_A ∂_B |Ψ_k|) / |Ψ_k| ] Ψ_k ,    (46)

the beauty of which is severely marred by the non-linear last term.

Regraduation: We can now make good use of the freedom afforded by the arbitrary constant k. Since the physics is fully described by ρ and Φ, the different choices of k in Ψ_k all describe the same theory. Among all these equivalent descriptions, it is clearly to our advantage to pick the k that is most convenient, a process usually known as "regraduation". Other notable examples of regraduation include the Kelvin choice of absolute temperature, the Cox derivation of the sum and product rules for probabilities, and the derivation of the sum and product rules for quantum amplitudes [2,15].
A quick examination of Equation (46) shows that the optimal k is such that the non-linear term drops out. The optimal choice, which we denote k̂, is

η²/2k̂² = 4ξ   or   k̂ = η / (8ξ)^{1/2} .    (47)

We can now identify the optimal regraduated η/k̂ with Planck's constant,

η/k̂ = (8ξ)^{1/2} = ħ ,    (48)

and Equation (46) becomes the linear Schrödinger equation,

iħ ∂_t Ψ = -(ħ²/2) m^AB ∂_A ∂_B Ψ + V Ψ = Σ_n [ -(ħ²/2m_n) ∇_n² Ψ ] + V Ψ ,    (49)

where the wave function is Ψ = ρ^{1/2} e^{iΦ/ħ}. The constant ξ = ħ²/8 in Equation (44) turns out to be crucial: it defines the numerical value of what we call Planck's constant and sets the scale that separates quantum from classical regimes. The conclusion is that for any positive value of the constant ξ it is always possible to regraduate Ψ_k to a physically equivalent but more convenient description where the Schrödinger equation is linear. From the ED perspective the linear superposition principle and the complex Hilbert spaces are important because they are convenient, but not because they are fundamental.
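The practical payoff of regraduation, linearity, can be illustrated with a standard solver. The sketch below (free particle, illustrative units ħ = m = 1, an assumed grid discretization of ours) evolves a Gaussian packet with a Crank-Nicolson step; the Cayley form (1 + iδtH/2ħ)^{-1}(1 - iδtH/2ħ) is unitary, so the total probability ∫|Ψ|² dx is conserved, exactly as linearity promises.

```python
import numpy as np

# Sketch: Crank-Nicolson evolution of the free linear Schrödinger equation
#   i hbar dPsi/dt = -(hbar^2 / 2m) d^2 Psi / dx^2.
# The Cayley-transform propagator is unitary, so the norm is conserved.

hbar, m, dt = 1.0, 1.0, 0.01
x = np.linspace(-15.0, 15.0, 400)
dx = x[1] - x[0]
n = len(x)

Psi = np.pi ** -0.25 * np.exp(-x**2 / 2 + 1j * x)   # Gaussian packet

lap = (np.diag(np.ones(n - 1), -1) - 2 * np.eye(n)
       + np.diag(np.ones(n - 1), 1)) / dx**2        # discrete Laplacian
H = -(hbar**2 / (2 * m)) * lap
U = np.linalg.solve(np.eye(n) + 1j * dt / (2 * hbar) * H,
                    np.eye(n) - 1j * dt / (2 * hbar) * H)

for _ in range(50):
    Psi = U @ Psi

print(np.sum(np.abs(Psi) ** 2) * dx)   # stays close to 1
```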

ED in an External Electromagnetic Field
In ED the information that is physically relevant for prediction is codified into constraints that reflect that motion is (a) continuous, (b) correlated and directional, and (c) non-dissipative. These constraints are expressed by Equations (2), (3), and (31) respectively. In this section we show that interactions can be introduced by imposing additional constraints.
As an explicit illustration, we show that the effect of an external electromagnetic field is modelled by a constraint on the component of the displacements along a certain direction represented by the vector potential A_a(x). For each particle n we impose the constraint

⟨Δx_n^a⟩ A_a(x_n) = κ''_n ,   (n = 1, ..., N)    (50)

where κ''_n is a particle-dependent constant that reflects the strength of the coupling to A_a. The resemblance between Equation (50) and the drift potential constraint, Equation (3), is very significant; as we shall see shortly it leads to gauge symmetries. But there are also significant differences. Note that Equation (3) is a single constraint acting in the N-particle configuration space: it involves the drift potential φ(x) with x ∈ X^N. In contrast, Equation (50) is N constraints acting in the 1-particle space: the vector potential A_a(x_n) is a function in 3D space, x_n ∈ X.
Except for minor changes the development of ED proceeds as before. The transition probability $P(x'|x)$ that maximizes the entropy $S[P,Q]$, Equation (1), subject to Equations (2), (3) and (50) and normalization now includes an additional set of Lagrange multipliers $\alpha_n$. Next use Equations (13) and (16), absorb $\alpha'$ into $\phi$, $\alpha'\phi \to \phi$, and write the vector potential as a vector in configuration space. As in Equation (18), a generic displacement $\Delta x^A$ can be expressed as an expected drift plus a fluctuation, $\Delta x^A = b^A \Delta t + \Delta w^A$, but the drift velocity $b^A$ now includes a new term involving the vector potential. The fluctuations $\Delta w^A$, Equation (20), remain unchanged. A very significant feature of the transition probability $P(x'|x)$ is its invariance under the gauge transformations

$$A_a(x_n) \to A'_a(x_n) = A_a(x_n) + \frac{\partial \chi(x_n)}{\partial x_n^a}\,, \qquad \frac{\partial \phi}{\partial x_n^a} \to \frac{\partial \phi'}{\partial x_n^a} = \frac{\partial \phi}{\partial x_n^a} + \alpha_n \frac{\partial \chi(x_n)}{\partial x_n^a}\,. \quad (55)$$

Note that these transformations are local in space. (The vector potential $A_a(x_n)$ and the gauge function $\chi(x_n)$ are functions in 3D space.) They can also be written in the $N$-particle configuration space in terms of a corresponding configuration-space gauge function $\tilde\chi$. The accumulation of many small steps is described by a Fokker-Planck equation, which can be written either as a continuity equation, Equation (22), or in its Hamiltonian form, Equation (27). As might be expected, the current velocity $v^A$, Equation (25), and the ensemble Hamiltonian $\tilde H$, Equation (28), must be suitably modified. As a shortcut we have again adopted the functional $F[\rho]$ motivated by information geometry, Equation (44), and set $\xi = \hbar^2/8$. The requirement that $\tilde H$ be conserved for arbitrary initial conditions amounts to imposing the second Hamilton equation, Equation (31), which leads to the modified Hamilton-Jacobi equation. Finally, we combine $\rho$ and $\Phi$ into a single wave function, $\Psi = \rho^{1/2} e^{i\Phi/\hbar}$, to obtain the Schrödinger equation. We conclude with two comments.
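The gauge invariance can be verified numerically. The sketch below assumes (our reconstruction, not stated explicitly above) that the transition probability depends on the fields only through the combination $\partial_a\phi - \alpha_n A_a$; shifting $A_a$ by a gradient while compensating $\phi$ as in Equation (55) leaves this combination, and hence $P(x'|x)$, unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# One particle in 3D: gradient of the drift potential and the vector
# potential evaluated at x_n (random values stand in for actual fields).
grad_phi = rng.normal(size=3)   # d(phi)/dx^a at x_n
A = rng.normal(size=3)          # A_a(x_n)
alpha = 1.7                     # Lagrange multiplier alpha_n (arbitrary)

# Assumed gauge-covariant combination entering the drift:
drift = grad_phi - alpha * A

# Gauge transformation: A_a -> A_a + d(chi)/dx^a, compensated by
# d(phi)/dx^a -> d(phi)/dx^a + alpha_n d(chi)/dx^a, as in Equation (55).
grad_chi = rng.normal(size=3)
A_new = A + grad_chi
grad_phi_new = grad_phi + alpha * grad_chi

drift_new = grad_phi_new - alpha * A_new

# The transition probability depends on the fields only through this
# combination, so it is gauge invariant.
print("gauge invariant:", np.allclose(drift, drift_new))
```

The cancellation is an algebraic identity, so it holds for any fields and any $\alpha_n$; the numerical check simply makes the mechanism of Equation (55) concrete.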
First, the covariant derivative in Equation (64) can be written in the standard notation in terms of $e_n$, the electric charge of particle $n$, in units where $c$ is the speed of light. Comparing Equation (65) with Equation (66) allows us to interpret the Lagrange multipliers in terms of the electric charges $e_n$ and the speed of light $c$. Thus, in ED the electric charge $e_n$ is essentially a Lagrange multiplier $\alpha_n$ that regulates the response to the external electromagnetic potential $A_a$. The second comment is that the derivation above is limited to static external potentials, $\partial_t V = 0$ and $\partial_t A_a = 0$, so that energy is conserved. This limitation is easily lifted. For time-dependent potentials the relevant energy condition must take into account the work done by the external sources: we require that the energy increase at the rate $d\tilde H/dt = \partial \tilde H/\partial t$. The net result is that Equations (62)-(64) remain valid for time-dependent external potentials.
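For reference, the comparison in the first comment above can be displayed as follows; this is our hedged reconstruction of the missing Equations (65) and (66), inferred from the surrounding prose rather than taken from the source:

```latex
% ED covariant derivative (reconstructed) vs. standard minimal coupling:
D_{na} = \frac{\partial}{\partial x_n^a} - i\,\alpha_n A_a(x_n)
\qquad \text{vs.} \qquad
D_{na} = \frac{\partial}{\partial x_n^a} - \frac{i\,e_n}{\hbar c}\, A_a(x_n)
\quad \Longrightarrow \quad
\alpha_n = \frac{e_n}{\hbar c}\,.
```

On this reading the identification of charge with a Lagrange multiplier is immediate: $e_n$ enters the dynamics only through $\alpha_n$, the multiplier conjugate to the constraint (50).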

Some Remarks and Conclusions
Are there any new predictions?-Our goal has been to derive dynamical laws, and in particular quantum theory, as an example of entropic inference. This means that, to the extent that we succeed and derive quantum mechanics and not some other theory, we should not expect predictions that deviate from those of standard quantum theory, at least not in the nonrelativistic regime discussed here. However, the motivation behind the ED program lies in the conviction that the framework can eventually be extended to other realms, such as gravity or cosmology, where the status of quantum theory is more questionable.
The Wallstrom objection-An important remaining question is whether the Fokker-Planck and the generalized Hamilton-Jacobi equations, Equations (27) and (31), are fully equivalent to the Schrödinger equation. This point was first raised by Wallstrom [40,41] in the context of Nelson's stochastic mechanics [25] and concerns the single- or multi-valuedness of phases and wave functions. Briefly, the objection is that stochastic mechanics leads to phases $\Phi$ and wave functions $\Psi$ that are either both multi-valued or both single-valued. Both alternatives are unsatisfactory: quantum mechanics forbids multi-valued wave functions, while single-valued phases can exclude physically relevant states (e.g., states with non-zero angular momentum). Here we do not discuss this issue in any detail except to note that the objection does not apply once particle spin is incorporated into ED. As shown by Takabayasi [42], a similar result holds for the hydrodynamical formalism. The basic idea is that, as mentioned earlier, the drift potential $\phi$ should be interpreted as an angle. Then the integral of $d\Phi$ over a closed path gives precisely the quantization condition that guarantees that wave functions remain single-valued even for multi-valued phases.
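Schematically, with $\Psi = \rho^{1/2} e^{i\Phi/\hbar}$, single-valuedness of the wave function amounts to the familiar condition (a sketch of the quantization condition referred to above):

```latex
\oint d\Phi = 2\pi\hbar\, n = n\,h , \qquad n \in \mathbb{Z} ,
```

and it is the interpretation of the drift potential $\phi$ as an angle that guarantees this condition holds even when the phase $\Phi$ itself is multi-valued.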
Epistemology vs. ontology-Dynamical laws have been derived as an example of entropic dynamics. In this model "reality" is reflected in the positions of the particles and our "limited information about reality" is represented in the probabilities as they are updated to reflect the physically relevant constraints.
Quantum non-locality-ED may appear classical because no quantum probabilities were introduced. But this is not so. Probabilities, in this approach, are neither classical nor quantum; they are tools for inference. Phenomena that would normally be considered non-classical, such as non-local correlations, are the natural result of including the quantum potential term in the ensemble Hamiltonian.
On dynamical laws-Action principles are not fundamental; they are convenient ways to summarize the dynamical laws derived from the deeper principles of entropic inference. The requirement that an energy be conserved is an important piece of information (i.e., a constraint) which will probably receive its full justification once a completely relativistic extension of entropic dynamics to gravity is developed.
On entropic vs. physical time-The derivation of laws of physics as examples of inference led us to introduce the informationally motivated notion of entropic time, which includes assumptions about the concepts of instant, simultaneity, ordering, and duration. It is clear that entropic time is useful, but is this the actual, real, "physical" time? The answer is yes. By deriving the Schrödinger equation (from which we can obtain the classical limit) we have shown that the t that appears in the laws of physics is entropic time. Since these are the equations that we routinely use to design and calibrate our clocks, we conclude that what clocks measure is entropic time. No notion of time that is in any way deeper or more "physical" is needed. Most interestingly, the entropic model automatically includes an arrow of time.