Stationary Schrödinger Equation and Darwin Term from Maximal Entropy Random Walk

Abstract: We describe particles in a potential by a special diffusion process, the maximal entropy random walk (MERW) on a lattice. Since MERW originates in a variational problem, it shares the linear algebra of Hilbert spaces with quantum mechanics. The Born rule appears from measurements between equilibrium states in the past and the same equilibrium states in the future. Potentials are introduced by generalising the observation that, in a gravitational field, time runs at different speeds at different heights; MERW then respects the rule that all trajectories of the same duration are counted with equal probability. In this way, MERW allows us to derive the Schrödinger equation for a particle in a potential and the Darwin term of the nonrelativistic expansion of the Dirac equation. Finally, we discuss why quantum mechanics cannot be simply a result of MERW, but, due to the many analogies, MERW may pave the way for further understanding.


I. INTRODUCTION
The phenomena at atomic and subatomic scales are very successfully described by quantum mechanics and the Schrödinger equation [1]. Despite existing now for almost 100 years, their interpretation is still a topic of intensive discussion [2][3][4][5]. We know how to use the equations and to make predictions, but there is no agreement yet on how nature produces the effects described by quantum mechanics. Since we have no satisfactory answer to this question, we have to collect the available arguments for a future consistent picture which explains the appearance of quantum phenomena.
Most of the community is satisfied with wave-particle duality, based on the experience that quantum phenomena can be explained by waves, but detection at low intensities is registered locally as being produced by particles. The question is raised whether this is related to the atomic structure of detectors. On the other hand, some authors are inclined to the explanation by wave phenomena. They derive the Schrödinger equation from the wave equation as an approximation for potentials much smaller than the rest mass; see, e.g., Section 6.5 of Ref. [4]. They explain the Born rule by the quadratic Hamiltonian for electromagnetism and the linear superposition principle from the linear equations, which are valid for waves.
Wave and particle approaches do not take into account that there are phenomena between waves and particles, such as topological solitons, which are formulated as field models. By their topological properties, special waves characterised by topological quantum numbers become spatially concentrated. These properties of topological solitons may explain why we observe both wave and particle properties. An interesting experiment which demonstrates how wave and particle properties may cooperate is the oil drop experiment invented by Couder [6]. A numerical simulation of travelling waves, solitons, and radiating excess energy as dispersive waves is described in Sect. VI.5 of the book by Komech [7].
In this article, I intend to offer another insight into quantum mechanical processes by employing the maximal entropy random walk (MERW), which is one of the approaches in studying diffusion processes in complex systems [8][9][10][11]. It models processes where the stationary probability of finding a particle localises in the largest nearly spherical region that is free of defects. Thus, MERW differs on irregular lattices from the generic random walk (GRW), where, at each time step, particles go with equal probabilities to any one of their nearest neighbours. This allows us to derive the stationary Schrödinger equation, to explain the Born rule [12][13][14] and, as shown in this article for the first time, to obtain the Darwin term, which usually appears in a relativistic formulation only [15,16]. The rules of linear algebra follow as a consequence of the solution of a variational problem.
MERW is a Markov chain: the time evolution of its trajectories depends on the present state only and not on the past. As discussed in Section II, it is defined by a step matrix M, acting on vectors of position distributions, known as amplitudes. In distinction from other random walks, it is also defined by the additional requirement that all paths taking the same time have the same probability. Such a uniform distribution of trajectories has maximal entropy, which justifies the choice of the name [8]. For particles in a potential, the solutions of the variational problem are given by the eigenfunctions of M. They provide the ensemble of paths and its equilibrium distribution. Measurements are conducted between the equilibrium distributions in the past and future. From the identity of these distributions follow the time symmetry of the ensemble of trajectories forward and backward in time and the Born rule. In Section III, we apply MERW to free motion in a box and to motion in a potential. As long as the free motion is not inhibited by boundaries or a potential, and if the diffusion constant is determined under the assumption that, in one reduced Compton time, particles may move a distance of one reduced Compton wavelength, we find that the amplitudes obey the time-dependent Schrödinger equation continued to imaginary time. Under the influence of a potential, we find that the infinite-time limit, the stationary distribution, corresponds to the solution of the time-independent Schrödinger equation. Expanding the solution of the variational problem up to $\alpha_f^4$, we obtain the Darwin term. In Section IV, we discuss the problems with the probability interpretation of excited states and their technical solution.
We formulate MERW on a cubic homogeneous lattice with spacing δ. Expanding to the second order in δ, we can compare the terms with the three terms which we obtain in the nonrelativistic expansion of the Dirac equation. Despite the nonrelativistic formulation, MERW can describe Zitterbewegung and obtain the correct form of the Darwin term. From the nonrelativistic treatment, it is understandable that the other two terms differ.

II. METHOD
In Refs. [8,12], MERW was introduced as a stochastic process in discrete space and time on general graphs, which are structures made of vertices (nodes) x and edges xy. In a single time step of a Markov chain, the particle may hop along an edge from x to a neighbouring node y with a probability described by the stochastic (transition; Markov) matrix $S_{xy}$ independent of the past history, leaving the probability distribution $\rho_x$ invariant; see Equation (9). Equal probabilities $S_{xy}$ outgoing from a vertex x to all neighbours y define the commonly used generic random walk (GRW). Due to this equality, GRW maximises the entropy locally, whereas we will define MERW such that it maximises the entropy of the ensemble of paths of equal length.
Starting with an initial distribution |i⟩ of positions, MERW counts with the amplitudes the number of paths starting at x. MERW is based on a step matrix M with non-negative matrix elements M(y, x) := ⟨y|M|x⟩, which is defined by the number of paths ending after k time steps at |y⟩. The elements of M are defined in Equation (15) for particles in a box and in Equation (33) for particles in a potential V. Constructing the possible paths with this step matrix M guarantees that all trajectories which take the same time have equal probability. This uniform distribution of paths has maximal entropy and justifies calling the method "Maximal Entropy Random Walk". The principle of maximum entropy [17] states that the probability distribution which best represents the current state of knowledge about a system is the one with the largest entropy. For time-reversal invariant processes, we can assume that steps away and back are equally likely and thus that the matrix M is symmetric.
Our interest concentrates on the stationary states of this diffusion process. We obtain them from the eigenvalue equation of the symmetric step matrix, $M|i\rangle = \lambda_i|i\rangle$ (3). For any finite number of time steps, these states are normalisable and define orthogonal rays in a real vector space, $\langle i|j\rangle = \delta_{ij}$. The set of orthonormal eigenvectors defines a complete basis, $\sum_i |i\rangle\langle i| = \mathbb{1}$. If the limit for an infinite number of steps corresponds to a bound state, this limit is also normalisable. Due to the requirement of the highest entropy of the ensemble of paths, the state |0⟩ with the highest eigenvalue $\lambda_0$ is stable; the other eigenstates are metastable. The Perron-Frobenius theorem of matrix theory states for which matrices there exists a highest real eigenvalue $\lambda_0$ whose eigenvector has non-negative components ⟨x|0⟩ ≥ 0, i.e., is without nodes. Due to the iterability we are able to find the stochastic matrix $S_{xy}$. Stochastic matrices are also called transition or Markov matrices and determine the transition probabilities of Markov chains. As the matrix elements are probabilities, their values satisfy $0 \le S_{xy} \le 1$. The matrices S are left-stochastic: their columns form probability vectors and thus add up to 1; see Equation (10). The 1-norm is $\|S\|_1 := \max_y \sum_x |S_{xy}| = 1$. Since the spectral radius of a matrix cannot exceed its norm, all eigenvalues of S have a modulus smaller than or equal to 1. S leaves a probability distribution $\rho_x$ invariant, Equation (9), and the sum over final states from a given initial site has the value 1, Equation (10). Equations (9) and (10) can be fulfilled by the choice of Equation (11), where the denominators normalise the number of initial positions in the ground state to 1. For the iterated stochastic matrix, we obtain a sum of value 1, Equation (12), and the condition $\sum_x \rho_x = 1$, which is requested for a probability distribution. The entropy growth in MERW processes is explicitly defined in Appendix A. There are also examples to show that the entropy is maximal for the states with the highest eigenvalue $\lambda_0$.
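The construction of the stochastic matrix from the step matrix can be illustrated numerically. The following is only a sketch under stated assumptions: it uses the form commonly given in the MERW literature, S(y, x) = M(y, x) ⟨y|0⟩ / (λ_0 ⟨x|0⟩) with ρ_x = ⟨x|0⟩² / ⟨0|0⟩, and takes the column index of S as the initial site. The helper name `merw_from_step_matrix` and the 7-site chain are hypothetical choices made for illustration.

```python
import numpy as np

def merw_from_step_matrix(M):
    """Ground state and stochastic matrix of a symmetric, non-negative step matrix M.

    Assumed convention: S[y, x] is the probability of a step from x to y,
    so each column of S is a probability vector.
    """
    eigvals, eigvecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    lam0 = eigvals[-1]                          # highest eigenvalue
    psi0 = eigvecs[:, -1]
    psi0 = psi0 * np.sign(psi0.sum())           # node-free ground state, chosen positive
    S = M * psi0[:, None] / (lam0 * psi0[None, :])
    rho = psi0**2 / np.dot(psi0, psi0)          # squared amplitudes, normalised to 1
    return lam0, psi0, S, rho

# Hypothetical example: nearest-neighbour hopping on a chain of 7 sites.
n_sites = 7
M = np.zeros((n_sites, n_sites))
for x in range(n_sites - 1):
    M[x, x + 1] = M[x + 1, x] = 1.0

lam0, psi0, S, rho = merw_from_step_matrix(M)
print("column sums of S:", S.sum(axis=0))
print("max |S rho - rho|:", np.abs(S @ rho - rho).max())
```

In this sketch the columns of S add up to 1 because M is symmetric and |0⟩ is its eigenvector, and ρ is left invariant for the same reason; the squared amplitude appears without being put in by hand.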
An interesting discussion about the interpretation of the wave function in quantum mechanics, as the probability amplitude that the system is in a certain configuration, can be found in the works of Mara Beller [18].

III. APPLICATIONS OF MERW
A. Free Motion in a Box
We obtain a first impression of the method from free motion in a box of size $L^D$ with infinitely high walls in D dimensions. We use the sites $\vec{x}$, with coordinates $x_d$, unit vectors $\hat{d}$ and lattice spacing δ, for $(N+1)^D$ sites. We describe free motion with a step matrix M, e.g., the one of Equation (15). We can start the time evolution with some small initial distribution around $\vec{\mu}$, somewhere in the middle of the lattice, and evolve the amplitudes according to Equation (17). According to the central limit theorem, the amplitudes tend asymptotically to Gaussian distributions as long as the distribution does not reach the boundaries. Due to the additivity of variances, the variance of a coordinate increases in every time step by the same amount $\delta\sigma^2$, leading after τ time steps to the variance $\sigma_\tau^2 = \tau\,\delta\sigma^2$. The value of $\delta\sigma^2$ follows from the step matrix M of Equation (15). Normal diffusion, obeying Fick's law, would approach a constant distribution, whereas MERW ends in the state of the highest entropy, with an amplitude proportional to a product of cosine functions. To check this behaviour, we have evolved, in Figure 1, amplitudes in one dimension for L = 1, N = 32 and µ = 0, according to Equation (17). In the left diagram, we compare the amplitudes ψ(x, 16)/ψ(0, 16) (blue dots), after 16 time steps, before the distribution arrives at the boundary, with the corresponding Gaussian $\phi_{16}(x)/\phi_{16}(0)$ of free diffusion (red line). We find similarly good agreement in the right diagram after 256 time steps between ψ(x, 256)/ψ(0, 256), with red dots, and the quantum mechanical ground state cos(πx), the blue line, in a box of size 1. According to Equation (15), there are no paths leading into the walls. This is why we observe in the diagram on the right that particles cannot accumulate near the walls, where the freedom of movement is restricted. There are many more paths that lead away from the walls, into areas where particles can move freely.
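The two regimes described above, Gaussian spreading before the walls are reached and the cosine-shaped state of highest entropy, can be checked with a short numerical sketch. It assumes a one-dimensional step matrix that hops to either nearest neighbour with weight 1/2 and assigns no paths to the two wall sites; this particular M, the delta-like initial distribution, and the choice of 12 steps are assumptions made for illustration. The stationary state is taken directly from the eigenvector with the highest eigenvalue rather than from a long iteration.

```python
import numpy as np

N = 32                                   # N + 1 sites on [-1/2, 1/2], as for L = 1
delta = 1.0 / N                          # lattice spacing
x = np.linspace(-0.5, 0.5, N + 1)

# Assumed step matrix: weight 1/2 for a hop to either nearest neighbour.
# The wall sites x = -1/2 and x = +1/2 are not connected, so no path enters them.
M = np.zeros((N + 1, N + 1))
for j in range(1, N - 1):
    M[j, j + 1] = M[j + 1, j] = 0.5

def evolve(psi, steps):
    """Apply the step matrix `steps` times to the amplitudes psi."""
    for _ in range(steps):
        psi = M @ psi
    return psi

# Free diffusion: start at x = 0 and stop before the walls are reached.
psi_init = np.zeros(N + 1)
psi_init[N // 2] = 1.0
tau = 12
psi_tau = evolve(psi_init, tau)
variance = np.sum(x**2 * psi_tau) / np.sum(psi_tau)
print("variance after tau steps:", variance, " tau * delta^2:", tau * delta**2)

# Stationary state: eigenvector of M with the highest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(M)
ground = eigvecs[:, -1] / eigvecs[N // 2, -1]    # normalised to 1 at x = 0
print("max deviation from cos(pi x):", np.abs(ground - np.cos(np.pi * x)).max())
```

With this choice of M the variance grows by δ² per step, and the eigenvector with the highest eigenvalue coincides with cos(πx) on the lattice sites.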
The normalised time-dependent Gaussian is a solution of the diffusion Equation (23) with the diffusion constant $D_d$. Introducing the imaginary time $\tau = i\,t/\delta t$ and the time step δt, the diffusion equation agrees with the time-dependent Schrödinger equation for a free particle [19], if we choose the diffusion constant according to Equation (25), which involves the fraction ℏ/m. Inserting Equation (26), we obtain a relation between the time step δt and the reduced Compton time $t_c$. In $n^2$ reduced Compton times, the diffusion therefore propagates n reduced Compton wavelengths in every direction; a single step is conducted with velocity $c_0/n$. What is appropriate for quantum physics could be $n = 1/\alpha_f = 137.036$, where $\alpha_f$ is Sommerfeld's fine structure constant. Since the further results are not influenced by the factor n, we choose n = 1 for simplicity. After reaching the stationary solution, depicted in the right diagram of Figure 1, the time derivative in the diffusion Equation (23) vanishes, and the equation agrees with the time-independent Schrödinger equation. The solution is now influenced by boundary conditions. The problem has turned into a boundary value problem, characterised by eigenvalues and eigenfunctions, and by rays in a vector space.
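The orders of magnitude involved can be made concrete for the electron. The sketch below assumes CODATA 2018 values for ℏ, the electron mass and the fine structure constant, and the standard free-particle identification D = ℏ/(2m); the function name `diffusion_distance` is hypothetical. It uses the relation ℏ/m = λ_c²/t_c between the reduced Compton wavelength λ_c = ℏ/(mc) and the reduced Compton time t_c = ℏ/(mc²).

```python
import math

# CODATA 2018 values (assumed here for illustration)
hbar = 1.054571817e-34        # J s
m_e = 9.1093837015e-31        # kg
c = 299792458.0               # m / s
alpha_f = 1.0 / 137.035999084

lambda_c = hbar / (m_e * c)          # reduced Compton wavelength
t_c = hbar / (m_e * c**2)            # reduced Compton time
D_free = hbar / (2.0 * m_e)          # diffusion constant matching the free Schrödinger equation

def diffusion_distance(n):
    """Distance covered in n^2 reduced Compton times when lambda_c^2 is gained per t_c."""
    return math.sqrt(n**2 * t_c * lambda_c**2 / t_c)

print("lambda_c [m]:", lambda_c)
print("t_c [s]:", t_c)
print("hbar/m [m^2/s]:", hbar / m_e, " lambda_c^2/t_c:", lambda_c**2 / t_c)
print("distance for n = 1/alpha_f [m]:", diffusion_distance(1.0 / alpha_f))
```

For n = 1/α_f the distance n λ_c equals the Bohr radius, which is the sense in which the width of the hydrogen ground state points to this value of n.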

B. Motion in a Potential
From free motion, we had the idea that the solutions of MERW may agree with solutions of the stationary Schrödinger equation.In this section, we will investigate under which conditions we can obtain from MERW the stationary Schrödinger equation for a particle in a potential.
In classical and quantum mechanics, we use time as a universal parameter, despite knowing that time runs at different speeds in different gravitational potentials [20]. With the concept of energy as a factor multiplying this universal time [21], we describe moving particles by measuring the action $S = \int dt\,L$ in units of ℏ. In a potential V and for vanishing kinetic energy T, the Lagrangian reads $L = -V$. Since in MERW all paths that take the same amount of time are weighted equally, we have to take into account that the time parameter changes at different rates in different environments. The time evolution for a time step δt is described by a formal Taylor series. According to classical mechanics, the action S generates the canonical transformations between initial coordinates and momenta and their time-dependent values. Using this relation, we obtain the step matrix of Equation (33). We work in an infinite cubic lattice in $\mathbb{R}^D$ with lattice spacing δ. The variation of the potential along the path in direction d is taken into account by the mid-point rule, well known from the path integral formulation of quantum mechanics [22].
In every time step, determined by the Compton frequency, the particle has to move to a nearest neighbour. Under this assumption, the eigenvalue Equation (3) for the D-dimensional problem takes the form of Equation (34). To obtain potential differences on the right side of this equation, we multiplied Equation (3) by the factor $e^{V(\vec{x})/mc^2}$. We expand Equation (34) in increasing orders of δ and obtain at zeroth order an equation for the approximate value of the eigenvalue, which is not really solvable before knowing the effective potential for the eigenvalue $\lambda_i$. We will obtain this value at order $\delta^2$ only. Expanding the expressions on the right side of Equation (34) up to order $\delta^2$, we obtain an equation in which two physical scales appear, the energy scale $mc^2$ and the length scale δ. Their relation depends on the physical situation. For hydrogen, we expect the well-known relations of a dimensional analysis, where $\alpha_f$ is Sommerfeld's fine structure constant. We conclude that the dimensionless ratios are both of the order $\alpha_f^2$, allowing us to expand Equation (37) in increasing orders of $\alpha_f^2$. At order $\alpha_f^2$, after multiplication with $mc^2/(2D)$, we obtain the stationary Schrödinger Equation (41) for a particle in a potential $V(\vec{x})$ and can read off the relation between the energy eigenvalues $E_i$ and the eigenvalues $\lambda_i$ of the step matrix M. The higher the entropy, the lower the energy. Expanding Equation (37) up to $\alpha_f^4$, i.e., the exponential function up to the second order in $V/mc^2$, we obtain corrections $H_{corr}$ to the Schrödinger equation, Equation (42). It is interesting to compare the correction terms $H_{corr}$ with the correction terms derived from the Dirac equation [23]. The first term $H_{ck}$ is the relativistic correction to the kinetic energy, the second term $H_{\ell s}$ is the spin-orbit term. The third term $H_D$ is the Darwin term, usually attributed to the Zitterbewegung of the electron.
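The correspondence between the highest eigenvalue of the step matrix and the ground-state energy can be tested numerically. The sketch below is an illustration under several assumptions that are not taken from the equations of this article: a one-dimensional lattice with spacing equal to one reduced Compton wavelength, units ℏ = m = c = 1, a mid-point step matrix M(y, x) = exp(−[V(x) + V(y)]/(2mc²)) for nearest neighbours, the leading-order relation E ≈ mc²(1 − λ/(2D)), and a weak harmonic potential chosen as a test case.

```python
import numpy as np

# Units hbar = m = c = 1; lattice spacing = 1 reduced Compton wavelength (assumption).
omega = 0.01                                  # weak oscillator, hbar*omega << m c^2
half_width = 200                              # lattice sites -200 ... 200
x = np.arange(-half_width, half_width + 1, dtype=float)
V = 0.5 * omega**2 * x**2                     # potential in units of m c^2

# Assumed mid-point step matrix: nearest-neighbour hops weighted by
# exp(-(V(x) + V(y)) / 2), i.e. the mean potential along the step.
weights = np.exp(-0.5 * (V[:-1] + V[1:]))
M = np.diag(weights, k=1) + np.diag(weights, k=-1)

lam0 = np.linalg.eigvalsh(M)[-1]              # highest eigenvalue of the step matrix

D = 1                                         # number of dimensions
E0 = 1.0 - lam0 / (2 * D)                     # leading-order energy in units of m c^2
print("E0 from MERW:", E0, " hbar*omega/2:", 0.5 * omega)
```

The small remaining difference between the two numbers is of the order ω², as expected from the higher-order terms discussed in the text; the highest eigenvalue corresponds to the lowest energy.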
We do not expect a relativistic correction to the kinetic energy in MERW, as it is a nonrelativistic model. Indeed, the corresponding first two terms in $H_{corr}$ of Equation (42) cancel. This can be seen by using the virial theorem, which is valid for the hydrogen atom already in the classical description, and by further using the quantum mechanical operator for the kinetic energy. This shows that the relativistic correction to the kinetic energy vanishes in MERW, as it should. The spin-orbit term has some similarity to the third term in $H_{corr}$, as both contain the gradient of the potential. But, as expected, they cannot really agree since MERW does not include spin degrees of freedom.
The last term of $H_{corr}$ in Equation (42) has exactly the form of the Darwin term, $H_D = \frac{\hbar^2}{8m^2c^2}\,\Delta V$. This term was first derived by Charles Galton Darwin when he solved the Dirac equation for the motion of an electron in a central force field [15]. It can be derived in a general form in a nonrelativistic expansion of the Dirac equation [16]. Usually, it is assumed that this term has no classical explanation [24]; it is interpreted as an electron-positron pair creation with the subsequent annihilation of the electron of hydrogen with the positron; see Figure 2. In MERW, it does not need annihilation processes because it is a consequence of the diffusive motion. About the origin of this diffusive motion, one can only speculate.

IV. EXCITED STATES
As mentioned above, amplitudes of states with the highest entropy, quantum mechanical ground states, are nonnegative and allow for an interpretation as numbers of trajectories. The case of a particle in a box with infinite walls shows that increasing energy corresponds to increasing values of the negative Laplacian, see Equation (41), and to an increasing number of nodes, where the amplitudes change their sign. Since amplitudes count numbers of trajectories, this poses a problem for the probability interpretation of trajectories in MERW. Analogous problems can already be identified in the first paper by Schrödinger on wave mechanics [25] and in experiments on bouncing neutrons [26]. Schrödinger, after around two years of intensive deliberation (his notebooks from this time are available), decided to start from the Hamilton-Jacobi equation. He substituted the action S by the exponential of S. Comparing with experiment, he realised that the natural unit of S is the reduced Planck constant ℏ. Therefore, he set $\psi = e^{S/\hbar}$. Observe that in this first paper he did not introduce the imaginary unit i. There is a unique map of S to ψ, but the inverse is not true. The nodes of ψ and negative values of ψ cannot be mapped to S.
The existence of nodes also inhibits the particle interpretation of quantum mechanics, as one can nicely observe in the experiment with bouncing neutrons [26]. In this experiment, ultracold neutrons are observed which are bouncing on a horizontal mirror with heights of roughly 30 µm and a horizontal velocity of 6 m/s. The phase space calculations [27] in this experiment clearly show that in an eigenstate of the corresponding Schrödinger equation, the neutrons cannot cross the nodes of the wave function. The general excuse is that the neutrons are never observed in an eigenstate, but always in a superposition. One has to conclude that true eigenstates are never realised in an experiment. On the other hand, it is interesting to observe that in the momentum representation, one can clearly see how the neutrons fall with a vertical velocity, increasing proportionally in time towards the reflecting mirror and bouncing back as expected with a constant deceleration; see Figure 20 of Ref. [27].
Since amplitudes in MERW are normalised numbers of trajectories, they should not be negative. At first sight, negative values for amplitudes seem to contradict Kolmogorov's axioms of probability theory. However, there is a way to avoid this contradiction. One can assume two species of particles with opposite signs of amplitudes. In the time evolution, both types of trajectories evolve towards the nodes and may cross the nodes. In numerical calculations, applying the step matrix M according to Equation (2), one observes that the amplitudes of opposite signs annihilate each other. This nicely corresponds to trajectories of particles and antiparticles that can annihilate, as we observe them in high-energy physics experiments. With the particle-antiparticle interpretation, one can retain Kolmogorov's axioms and obtain a consistent probability interpretation of excited states as particle and antiparticle trajectories.
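The cancellation of amplitudes of opposite sign under repeated application of the step matrix can be observed in a small numerical experiment. The sketch below assumes the same one-dimensional box and nearest-neighbour step matrix with weight 1/2 as a hypothetical test case, starts from the first excited state with a 1% admixture of the ground state, and measures the share of negative amplitude; the function name `negative_share`, the admixture and the number of steps are illustrative choices.

```python
import numpy as np

N = 32
x = np.linspace(-0.5, 0.5, N + 1)

# Assumed step matrix of the box: weight 1/2 per nearest-neighbour hop, no paths into the walls.
M = np.zeros((N + 1, N + 1))
for j in range(1, N - 1):
    M[j, j + 1] = M[j + 1, j] = 0.5

ground = np.cos(np.pi * x)            # node-free state, highest eigenvalue
excited = np.sin(2.0 * np.pi * x)     # first excited state, one node at x = 0
ground[[0, -1]] = 0.0                 # wall sites carry no amplitude
excited[[0, -1]] = 0.0

def negative_share(psi):
    """Fraction of the total absolute amplitude that has negative sign."""
    return np.abs(psi[psi < 0]).sum() / np.abs(psi).sum()

psi = excited + 0.01 * ground         # slightly disturbed excited state
share_start = negative_share(psi)

for _ in range(1000):
    psi = M @ psi                     # one time step
    psi /= np.abs(psi).sum()          # rescale to avoid underflow

share_end = negative_share(psi)
print("negative share at start:", share_start)
print("negative share after 1000 steps:", share_end)
```

In this sketch the negative amplitudes disappear completely while the distribution approaches the ground state, in line with the metastability of excited states discussed above.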
Of course it looks rather weird that we describe the spatial trajectories of bouncing neutrons in excited states by the oscillation between particle and antiparticle trajectories, whereas the momentum trajectories allow for a homogeneous interpretation.In a soliton interpretation of particles, this change in the interpretation may be related to the fact that one does not know what type of particles is present, as long as the particles do not react with the surroundings [28].

V. CONCLUSIONS
By its construction as a Markov chain with a step matrix M, Maximal Entropy Random Walk (MERW) determines the distribution of particle paths $\psi(\vec{y},\tau) = \sum_{\vec{x}} M^{\tau}(\vec{y},\vec{x})\,\psi(\vec{x})$ starting from an initial distribution $\psi(\vec{x})$ and paths of the same duration τ. The final distributions approach an equilibrium distribution, the time derivative term of the diffusion equation vanishes, and there is no difference between evolution forward and backward in time; thus, we arrive at the eigenvalue Equation (42).
We observed many common properties between MERW and the time-independent Schrödinger equation:
1. MERW treats stationary states of a diffusion process resulting from the eigenvalue Equation (3) of the step matrix M. These states can be represented by rays in a Hilbert space.
2. MERW distinguishes between amplitudes ⟨0|x⟩ and probabilities $\rho_x$. The amplitudes count the number of trajectories arriving in the equilibrium distribution at a certain position. The probabilities indicate the share of trajectories which, in the equilibrium distribution, pass through x. They are not modified by the Markov process.
3. MERW respects the Born rule. It derives this rule from the relation (11) between the step matrix M and the stochastic matrix S, and the request that S leaves a probability distribution invariant. MERW impressively shows how the Born rule appears from a measurement between an equilibrium state in the past and the same equilibrium state in the future. This symmetry between the evolution from past to present and the evolution from present to future was already investigated by Erwin Schrödinger in his article [19] on the reversal of the laws of nature. The translation of this article is available in [29].
4. With the appropriate choice of the diffusion constant (25), MERW derives the Schrödinger equation for a free particle.
5. Generalising to arbitrary potentials the observation that time in a gravitational field runs at different speeds at different heights, MERW respects the rule that all trajectories of the same duration are counted with equal probability. In this way, MERW allows us to derive the Schrödinger equation for a particle in a potential.
6. From diffusion, MERW derives both the stationary Schrödinger equation and the Darwin term of the nonrelativistic expansion of the Dirac equation.
If we want to use MERW to describe the properties of nature or to explain the background of the validity of the Schrödinger equation, we realise that MERW leaves several questions open or answers them in the wrong way:
1. MERW is a diffusion process which propagates, in $n^2$ reduced Compton times $t_c = \hbar/(mc^2)$, a distance of n reduced Compton wavelengths $\lambda_c = \hbar/(mc)$. The width of the ground state of atomic hydrogen may indicate $n = 1/\alpha_f$ as a realistic order of magnitude. MERW does not explain the mechanism driving this diffusion process. The pair creation with a necessary energy of around 1 MeV may not be the origin of the Darwin term. After observing that in elastic scattering processes the position of the particles is not conserved, the author is rather inclined to think of some yet unknown scattering processes. In this respect, Ref. [2] argues in favour of a zero-point field.

2. The kinetic term $-\frac{\hbar^2}{2m}\Delta$ in Equation (41) is of a statistical origin only. The velocity seems to correspond to the diffusive velocity, as defined in [2] p. 125. In MERW, there is no "classical" kinetic energy, which is characteristic for the classical motion in Bohr's model of the hydrogen atom. In this sense, MERW cannot describe states with a non-vanishing angular momentum or oscillations in a classical potential. Such excited states appear in MERW as metastable states only, in distinction to the Schrödinger equation, where these states are stable.
3. Amplitudes are numbers of trajectories and should be positive.This is true in the state with the highest entropy, corresponding to the ground state.Amplitudes for states with lower entropy have nodes and therefore regions with negative amplitudes.Trajectories traversing the nodes in opposite directions cancel and thus never cross the nodes.The particle-antiparticle picture has to be introduced to explain these cancellations.
4. MERW can explain the subtraction of trajectory numbers by a particle-antiparticle picture only.Therefore, the explanation of interference in Young's double slit experiment remains very questionable.
5. MERW does not explain the violation of Bell's inequalities, as quantum mechanics nicely does.
For these reasons, quantum mechanics cannot be simply a result of MERW, but MERW may provide some hints on how to proceed, in order to arrive at a better understanding of quantum mechanics and, like Mermin [30], not be satisfied with "shut up and calculate". With MERW, it is possible to derive the ground-state wave functions of particles in a potential and the exact form of the Darwin term. This could indicate that the ground state of quantum systems is determined by diffusive motion only.
Concerning the history of these ideas, we would like to mention that after a footnote of Eddington [31] concerning the probability in Schrödinger's wave mechanics, which is obtained by two symmetric systems of ψ-waves travelling in opposite directions in time, Schrödinger [19,32] tried to find a probabilistic equation close to his wave equation.A probabilistic formulation of processes with initial and final densities was proposed by Bernstein [33] in 1932.Yasue suggested, in 1981, a time-symmetric variational approach with an action functional [34].In the formalism of stochastic processes, Zambrini studied Markovian processes of the Bernstein type in two papers and compared them to quantum mechanics [35][36][37].
According to Equation (41), the ground state |0⟩ of a particle in a potential V belongs to the highest eigenvalue $\lambda_0$. As discussed in Section II, the eigenvector |0⟩ of M belonging to this highest eigenvalue $\lambda_0$ is also characterised by the largest entropy production H per step of Equation (A8). The value of H for a free particle in a linear box [−0.5, 0.5] with 65 sites is depicted in the left diagram of Figure 4, in the neighbourhood of the state |0⟩ for $|\psi\rangle = \sqrt{1-\alpha^2-\beta^2}\,|0\rangle + \alpha|1\rangle + \beta|2\rangle$, $\alpha^2+\beta^2 \le 0.06$ (A11), where |1⟩ and |2⟩ are the first and the second excited state. As expected, the maximum of H is at α = β = 0. With increasing values of α and β, the mixed amplitude (A11) can obtain nodes. This is the reason why, in the left diagram of Figure 4, the values of α and β are restricted to $\alpha^2+\beta^2 \le 0.06$ and to values where no nodes appear. This result suggests that we can determine the entropy of excited states from the stochastic matrix which results from blocking the nodes. With this method, we can show for the mixed amplitude of Equation (A12) that excited states are saddle points of H. The highest point at the boundary of the saddle-shaped surface corresponds to an admixture of the ground state and the lowest point corresponds to an admixture of the state |2⟩. As excited states are metastable, a small disturbance is sufficient to let them move towards the ground state, which has the highest entropy.
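That the state with the highest eigenvalue carries the largest entropy per step can be illustrated by comparing MERW with the generic random walk on the same lattice. The sketch below assumes the standard entropy rate of a Markov chain, H = −Σ_x ρ_x Σ_y S(y, x) ln S(y, x), as the measure of entropy production per step (an assumption, not necessarily the definition used in this appendix), a chain of 31 connected sites, and the convention that the column index of S is the initial site.

```python
import numpy as np

def entropy_rate(S, rho):
    """Entropy per step, -sum_x rho_x sum_y S[y, x] ln S[y, x], skipping zero entries."""
    logS = np.zeros_like(S)
    mask = S > 0
    logS[mask] = np.log(S[mask])
    return -np.sum(rho[None, :] * S * logS)

# Hypothetical lattice: 31 connected sites with nearest-neighbour steps.
n_sites = 31
A = np.zeros((n_sites, n_sites))
for x in range(n_sites - 1):
    A[x, x + 1] = A[x + 1, x] = 1.0

# MERW: stochastic matrix built from the eigenvector with the highest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(A)
lam0 = eigvals[-1]
psi0 = np.abs(eigvecs[:, -1])                     # node-free, normalised eigenvector
S_merw = A * psi0[:, None] / (lam0 * psi0[None, :])
rho_merw = psi0**2

# GRW: equal probabilities to all neighbours of the initial site.
degree = A.sum(axis=0)
S_grw = A / degree[None, :]
rho_grw = degree / degree.sum()                   # stationary distribution of GRW

H_merw = entropy_rate(S_merw, rho_merw)
H_grw = entropy_rate(S_grw, rho_grw)
print("H (MERW):", H_merw, " ln(lambda_0):", np.log(lam0))
print("H (GRW): ", H_grw)
```

In this sketch the entropy rate of MERW equals ln λ_0 and exceeds that of GRW, which maximises the entropy only locally.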
fraction can be expressed by the ratio of the square of the reduced Compton wavelength $\lambda_c$ and the reduced Compton time $t_c$: $\frac{\hbar}{m} = \frac{\lambda_c^2}{t_c}$ with $\lambda_c := \frac{\hbar}{mc}$, $t_c := \frac{\hbar}{mc^2}$.

FIG. 2: Schematic picture of Zitterbewegung of electrons. An electron $e^-$ propagates in x-direction. Then, after some time, at E, an electron-positron pair is created. Electrons $e^-$ are represented by arrows in time direction and the positron $e^+$ by arrows backward in time t. At A, the original electron annihilates with the positron. By this process, the position of the original electron appears to be shifted by some accidental distance to a new position.

FIG. 4: The entropy production H of Equation (A10) per step. In a Markov process, the states move towards higher entropy. Left: For |ψ⟩ of Equation (A11), we observe that the state |0⟩ in the middle of the diagram has maximal entropy. The highest value on the boundary corresponds to an admixture of the state |1⟩ and the lowest point corresponds to that of state |2⟩. Right: For |ψ⟩ of Equation (A12), the state |1⟩ is located at the saddle point in the middle of the diagram. The highest point of the boundary corresponds to the state |0⟩ and the lowest point to the state |2⟩.