Quantum Probabilities and Maximum Entropy

Probabilities in quantum physics can be shown to originate from a maximum entropy principle.


Introduction
Ever since quantum physics was discovered at the beginning of the 20th century, there has been a debate about its interpretation. The mathematical formalism does not allow a direct, descriptive interpretation of quantum objects in space and time, and it seems to be underdetermined [1] with regard to its ontological structure. As a consequence, multiple interpretations and ontologies are available today. A particular topic has always been the role and nature of the probabilities in quantum physics. A cornerstone of quantum theory is the fact that the world is empirically and epistemically probabilistic. This means that agents are able to assign probabilities to future events, which are then tested empirically by repeated trials of experiments on identical systems. We include the term "epistemic" because, although there are deterministic models of the quantum realm today [2], it seems that their values are knowable only modulo randomly distributed initial conditions. The fact that nature shows probabilistic patterns, and that agents can predict them theoretically and then find them empirically in experiments, is by no means self-evident. We will see that, once we have defined how physical properties are represented in the theory, we need only two additional, plausible assumptions on how agents empirically gather data and draw conclusions in order to define the theory uniquely. The fact that experimental (statistical) frequencies coincide with the probabilities is built into the theory and is hence, logically, no surprise. What remains astounding is that nature plays the game and allows such a theory in the first place.
There has been a long debate on how to interpret randomness in quantum physics [3]. At first sight, there are three different kinds of probabilities. The first category consists of the probabilities that arise from pure quantum states through the Born rule [4]. These are sometimes considered the "true" quantum probabilities. The second category consists of the weights in mixed states, and the third of the frequencies found in multiple trials of an experiment on identical quantum systems. This naturally raises the quest for an underlying principle common to all categories. Since agents can choose to do a single experiment only, the frequency definition seems to fall short as the common principle. On the other hand, and in view of what we said earlier, it also seems a bold standpoint to say that the probabilities are merely subjective [5]. The fact that nature allows a theory in which agents can test, by experiment, the probabilities arising from that theory does say something about nature itself, of which agents, admittedly, are a part. A further interpretation, which works for the single trial, is that of propensities [6]. There is a vast literature on the philosophical question of the true nature of probabilities, which we are unable to cover here [7]. We will add a specific unifying view and argue in this paper that the probabilities in quantum theory can be understood as the result of symmetry under permutations in combination with Laplace's principle of indifference, i.e., a maximum entropy principle.

Quantum Physics
The ansatz for the mathematical theory of quantum physics is to represent a physical quantity as a self-adjoint operator $A \in L(H)$ in the space of linear operators over a state space $H$, which carries the structure of a complex Hilbert space of some dimension $d \in \mathbb{N}$. The values that this quantity can assume in an experiment are the corresponding eigenvalues $\lambda_k \in \mathbb{R}$, $k \le d$, of $A$. We have to find a way to assign probabilities to these eigenvalues, a task which is equivalent to assigning probabilities to the orthogonal projection operators $\{\Pi_k\}_{k \le d} \subset L(H)$, which project states in $H$ onto the eigenstates corresponding to the $\lambda_k$. If agents assign a probability $p_0$ to a projection operator $\Pi_0$ that is common to two families of orthogonal projectors, $\{\Pi_k\}_{k \le d}$ and $\{\Gamma_k\}_{k \le d}$, and do the corresponding experiments, they would like to be sure that, if they find the frequency $p_0$, the results describe the same event $\Pi_0$. This is a non-contextuality condition (note that the set of (conditional) probabilities over realized values is contextual, as a theorem by Kochen and Specker [8] and an example by Hardy [9] show). A famous theorem by Gleason [10] says that for any non-contextual measure $\mu$ on the set of projectors $\{\Pi\}_H$ over a Hilbert space $H$ of dimension $d \ge 3$, there exists a unique positive semi-definite, self-adjoint operator $\rho$ of trace one, called the density operator, such that $\mu(\Pi) = \mathrm{tr}(\rho\Pi)$ for all $\Pi$. This is the Born rule. The theorem defines the appropriate measures as well as the quantum states, which are identified with the density operators $\rho$. There is a special class $S(H)$ of density operators, called pure states. Every normalized vector $|\psi\rangle \in H$ defines a corresponding pure state $\rho = |\psi\rangle\langle\psi|$, which is the projection operator onto $|\psi\rangle$. The set of density operators, $D(H) \subset L(H)$, is the set of all convex combinations of pure states $S(H) \subset D(H)$. The weights corresponding to the convex combinations are the second category of probabilities mentioned above. Important as Gleason's theorem is, because it technically defines the right quantum probabilities, it does not say much about their nature or interpretation.
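As an illustration of the Born rule and of the non-contextuality condition, the following minimal sketch evaluates tr(ρΠ) exactly for a small example. The density operator, the two projector families, and all function names are assumptions chosen for illustration; they are not taken from the text. The projector onto the first basis vector is shared by both families and receives the same probability in each.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def born_probability(rho, projector):
    """Born rule: mu(Pi) = tr(rho Pi)."""
    product = matmul(rho, projector)
    return sum(product[i][i] for i in range(len(product)))

# Hypothetical density operator on a 3-dimensional space:
# rho = 1/2 |e0><e0| + 1/2 |+><+|, with |+> = (|e1> + |e2>)/sqrt(2).
rho = [[F(1, 2), 0, 0],
       [0, F(1, 4), F(1, 4)],
       [0, F(1, 4), F(1, 4)]]

# Projectors onto the standard basis vectors e0, e1, e2.
p0 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
p1 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
p2 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]

# Projectors onto (e1 + e2)/sqrt(2) and (e1 - e2)/sqrt(2).
p_plus = [[0, 0, 0], [0, F(1, 2), F(1, 2)], [0, F(1, 2), F(1, 2)]]
p_minus = [[0, 0, 0], [0, F(1, 2), F(-1, 2)], [0, F(-1, 2), F(1, 2)]]

# Two orthogonal families that share the projector p0.
family_standard = [p0, p1, p2]
family_rotated = [p0, p_plus, p_minus]

probs_standard = [born_probability(rho, p) for p in family_standard]
probs_rotated = [born_probability(rho, p) for p in family_rotated]
```

Both lists sum to one, and the shared projector gets the same value in either family, which is the non-contextuality property that Gleason's theorem presupposes.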

Probabilities of Mixed States
The simplest case is the second category, namely the probability weights of mixed states. In the mixed case, there are ex-ante probabilities $p_a \ge 0$, $a \le M$, $\sum_{a \le M} p_a = 1$, which generate states of the form

$$\rho = \sum_{a \le M} p_a \rho_a, \tag{1}$$

where the $\{\rho_a = |\psi_a\rangle\langle\psi_a|\}_{a \le M}$ are (not necessarily orthogonal) pure states. These probabilities are considered to be of classical type, i.e., uncertainties about a possible set of preparations. Since the rational numbers $\mathbb{Q}$ are dense in $\mathbb{R}$, we may assume that, with an arbitrarily small error, $p_a = r_a/q_a$ with $r_a, q_a \in \mathbb{N}^+$. Let $Q = \prod_{a \le M} q_a$ and $Q_a = \prod_{a' \le M,\, a' \ne a} q_{a'}$, respectively. We can then set $N = Q$ and $m_a = Q_a r_a$ to get $N$ states $\{\rho_{a_k}\}_{a \le M,\, k \le m_a}$, consisting of $m_a$ copies of each $\rho_a$, with probabilities $p_{a_k} = 1/N$ and aggregate probabilities $p_a = m_a/N$. In this way, the probabilities $p_a$ clearly reflect the indifference principle resulting from the permutation symmetry of (1).
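The counting construction above can be sketched in a few lines. The function name and the example weights are assumptions for illustration. The code follows the text in taking N as the product of the denominators, although the least common multiple would give a smaller N with the same aggregate probabilities.

```python
from fractions import Fraction
from math import prod

def equal_weight_ensemble(weights):
    """Return (N, [m_a]) such that p_a = m_a / N for rational weights p_a.

    N is the product Q of all denominators q_a, and
    m_a = r_a * Q_a, where Q_a = Q / q_a is the product of the other denominators.
    """
    if sum(weights) != 1:
        raise ValueError("weights must sum to one")
    q_total = prod(w.denominator for w in weights)
    multiplicities = [w.numerator * (q_total // w.denominator) for w in weights]
    return q_total, multiplicities

# Hypothetical mixed-state weights p_a = r_a / q_a.
example_weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n_states, m = equal_weight_ensemble(example_weights)

# Each of the N states carries probability 1/N; aggregating m_a of them
# recovers the original weight p_a.
recovered = [Fraction(m_a, n_states) for m_a in m]
```

Note that `Fraction` stores each weight in lowest terms, so the denominators used here are the reduced ones.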

Probabilities of Pure States
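We now consider a system $S$ represented by a pure state $|\psi\rangle \in H$ with resolution in the eigenbasis $|e_a\rangle$, $1 \le a \le M$, of a self-adjoint operator $A \in L(H)$: $|\psi\rangle = \sum_{a \le M} \psi_a |e_a\rangle$, $\psi_a \in \mathbb{C}$. We can form the corresponding pure state $\rho_\psi = |\psi\rangle\langle\psi|$ with matrix entries $(\rho_\psi)_{aa'} = \psi_a \psi^*_{a'}$. The Born rule then assigns probabilities to the projectors $\rho_a = |e_a\rangle\langle e_a|$, $a \le M$. Assume that there is an additional system $E$ with orthonormal basis states $\{|n\rangle\}_{n \le N}$, which is initially in the base state $\rho_0 = |0\rangle\langle 0|$. A measurement of some state $\rho$ by the probe $E$ is an operation $U$ on the joint system $\rho_{joint} = |0\rangle\langle 0| \otimes \rho$,

$$\rho_{joint} \mapsto U \left(|0\rangle\langle 0| \otimes \rho\right) U^*, \tag{2}$$

where $U$ is unitary, $UU^* = \mathrm{Id}$ (this follows from the fact that a general interaction evolution $U(t) = e^{-(i/\hbar)Ht}$ is unitary). A general unitary transformation on a tensor product, expressed in the respective bases, can be written as a block matrix

$$U = \begin{pmatrix} A_{00} & A_{01} & \cdots \\ A_{10} & A_{11} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}, \tag{3}$$

where the operators $A_{nn'}$ are given by

$$A_{nn'} = \sum_{a,a'} u_{na,\,n'a'}\, |a\rangle\langle a'|. \tag{4}$$

We denote the sub-blocks $A_{n0}$ of the first block column simply by $A_n$. Since $U$ is unitary, we have

$$\sum_{n} A_n^* A_n = \mathrm{Id}. \tag{5}$$

Conversely, we can choose any set of operators $A_n$ satisfying the resolution-of-the-identity condition (5) to define a measurement on an initial joint state $\rho_{joint} = |0\rangle\langle 0| \otimes \rho$. We now have the necessary elements in place to give the main argument.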
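Assume there is a second system $E$ (the probe) with basis $\{|n\rangle\}_{n \le N}$ and an observer who would like to know which of the states $\rho_a = |e_a\rangle\langle e_a|$ the system $S$ is in, by making an appropriate measurement $U$ on the joint system $\rho_{joint} = |0\rangle\langle 0| \otimes \rho_\psi$. If that is possible in the first place, then, having no additional knowledge, the observer does not know a priori in which state $|n\rangle$, $n \le N$, the probe will be after the measurement and before observation, which leads to permutation symmetry. Let the underlying pure state $|\psi\rangle \in H$ have coefficients $\psi_a = \sqrt{m_a}\, e^{i\varphi_a}$, $m_a \in \mathbb{N}$, $\varphi_a \in \mathbb{R}$ (since the rational numbers $\mathbb{Q}$ are dense in $\mathbb{R}$, the choice of $m_a \in \mathbb{N}$ is general enough; the vector is left unnormalized, with $\langle\psi|\psi\rangle = \sum_{a \le M} m_a$). The probe $E$ can be chosen appropriately coarse-grained (this coarse-graining was first introduced in [11] in the context of many-worlds) such that $N = \sum_{a \le M} m_a$. After the measurement and before observation, the observer is in a situation where, for lack of further information, she will by Laplace's principle of indifference attribute a priori to each outcome $\langle n|U\left(|0\rangle\langle 0| \otimes \rho_\psi\right)U^*|n\rangle$ the equal probability $p_n = 1/N$, $n \le N$. This attribution is equivalent to maximizing the entropy function $H(p) = -\sum_{n=1}^{N} p_n \log(p_n)$. The observer can therefore write down, in the spirit of (1), an average of outcomes

$$\bar{\rho} = \frac{1}{N} \sum_{n \le N} \langle n|U\left(|0\rangle\langle 0| \otimes \rho_\psi\right)U^*|n\rangle = \frac{1}{N} \sum_{n \le N} A_n \rho_\psi A_n^*. \tag{6}$$

For our purpose, we now choose the operators $A_n$ to be the scaled projectors onto the basis states $|e_a\rangle$, $a \le M$:

$$P_{a_k} = \frac{1}{\sqrt{m_a}}\, |e_a\rangle\langle e_a|, \quad a \le M, \; k \le m_a. \tag{7}$$

Note that we have replaced the simple index $n$ by the double index $a_k$. This choice is consistent with the demands of a measurement, since the $P_{a_k}$ satisfy (5):

$$\sum_{a \le M} \sum_{k \le m_a} P_{a_k}^* P_{a_k} = \sum_{a \le M} |e_a\rangle\langle e_a| = \mathrm{Id}.$$

Since $P_{a_k} \rho_\psi P_{a_k}^* = (|\psi_a|^2/m_a)\, \rho_a = \rho_a$, we can write (6) in the following form:

$$\bar{\rho} = \frac{1}{N} \sum_{a \le M} \sum_{k \le m_a} P_{a_k} \rho_\psi P_{a_k}^* = \sum_{a \le M} \frac{m_a}{N}\, \rho_a. \tag{8}$$

Comparing Equation (8) with Equation (1), we see that $\bar{\rho}$ can be viewed as a mixed state with probabilities

$$p_a = \frac{m_a}{N} = \frac{|\psi_a|^2}{\langle\psi|\psi\rangle}, \tag{9}$$

which is the Born rule.

Before we turn to the frequencies, let us have a look at composite systems $\rho_{12} \in D(H \otimes H)$. (To show the principle, it is sufficient to look at binary systems.) The state $\rho_{12}$ may be mixed or pure, and we can apply the findings in a straightforward way. The single components are given by the partial traces $\rho_1 = \mathrm{tr}_2(\rho_{12})$ and $\rho_2 = \mathrm{tr}_1(\rho_{12})$. If the state has the form $\rho_{12} = \rho_1 \otimes \rho_2$, then we are in the separable case and can apply the results for mixed and pure states (Sections 2.1 and 2.2) to each individual component, which can be pure or mixed. If $\rho_{12}$ is entangled, then the partial trace always produces a mixed state. When we now consider frequencies, the individual quantum systems may be single or composite; what is important is that they are temporally separable in order to allow for statistics.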
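The pure-state argument above, with scaled projectors P a k and equal weights 1/N, can be checked numerically. The following is a minimal sketch under stated assumptions: the multiplicities, the phases, and all function names are hypothetical, and the scaled projector is taken to be 1/sqrt(m_a) times the projector onto e_a, which is the choice that makes the operators resolve the identity.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def scaled_projector(index, multiplicity, dim):
    """Assumed form of the scaled projector: |e_a><e_a| / sqrt(m_a)."""
    p = [[0.0] * dim for _ in range(dim)]
    p[index][index] = 1.0 / math.sqrt(multiplicity)
    return p

def average_outcome(m, phases):
    """Return (sum of P* P, equal-weight average of P rho_psi P*)."""
    dim, n_total = len(m), sum(m)
    # Unnormalized coefficients psi_a = sqrt(m_a) * exp(i * phi_a).
    psi = [math.sqrt(m_a) * cmath.exp(1j * phi) for m_a, phi in zip(m, phases)]
    rho_psi = [[psi[a] * psi[b].conjugate() for b in range(dim)]
               for a in range(dim)]
    # One operator per double index (a, k) with k <= m_a, so N operators in total.
    operators = [scaled_projector(a, m[a], dim)
                 for a in range(dim) for _ in range(m[a])]
    resolution = [[0j] * dim for _ in range(dim)]
    rho_bar = [[0j] * dim for _ in range(dim)]
    for p in operators:
        p_dag = dagger(p)
        identity_part = matmul(p_dag, p)
        outcome = matmul(matmul(p, rho_psi), p_dag)
        for i in range(dim):
            for j in range(dim):
                resolution[i][j] += identity_part[i][j]
                # Principle of indifference: every outcome has weight 1/N.
                rho_bar[i][j] += outcome[i][j] / n_total
    return resolution, rho_bar

m = [1, 2, 3]             # hypothetical multiplicities m_a
phases = [0.3, 1.1, 2.0]  # hypothetical phases phi_a
resolution, rho_bar = average_outcome(m, phases)
weights = [rho_bar[a][a].real for a in range(len(m))]
born = [m_a / sum(m) for m_a in m]  # |psi_a|^2 / <psi|psi>
```

Up to floating-point error, `resolution` is the identity matrix, `rho_bar` is diagonal, and `weights` agrees with `born`. The phases drop out of the result, as expected, because the scaled projectors remove the off-diagonal entries of the pure state.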

Frequencies
So far, the theory covers only single trials. Assume there is a density operator $\rho \in D(H)$ and a complete set of projectors $\{\Pi_k\}_{k \le M}$. To find probabilities for a sequence of outcomes $k_1, \ldots, k_N$ of $N$ experiments on $\rho$ (performed on $N$ identically prepared systems), we can apply Gleason's theorem to the tensor product [5] to get

$$\rho^N = \rho \otimes \cdots \otimes \rho, \tag{10}$$

$$p(k_1, \ldots, k_N) = \mathrm{tr}\left(\rho^N\, \Pi_{k_1} \otimes \cdots \otimes \Pi_{k_N}\right) = p_{k_1} \cdots p_{k_N}, \tag{11}$$

with

$$p_k = \mathrm{tr}(\rho\, \Pi_k). \tag{12}$$

So the outcomes of repeated measurements are independently and identically distributed (i.i.d.). The probability for outcome $k$ to occur $n_k$ times, $k \le M$, $\sum_k n_k = N$, is given by the multinomial distribution

$$p(n_1, \ldots, n_M) = \frac{N!}{n_1! \cdots n_M!}\, p_1^{n_1} \cdots p_M^{n_M}. \tag{13}$$

The individual counting functions $n_k$ are binomially distributed, and hence $E(n_k) = N p_k$. For large $N$, the averages $\bar{n}_k$ of the statistical counting functions approach the expectation values, and therefore

$$p_k \approx \frac{\bar{n}_k}{N}. \tag{14}$$

The fact that $\bar{n}_k \to E(n_k)$ as $N \to \infty$ is due to the law of large numbers. The frequencies, with their implied principle of indifference (14), indeed replicate the probabilities. This is achieved by a strong assumption in the theory, reflected in Equations (11), (13), and (14): the independence condition for the multi-trial states $\rho^N$, $N \in \mathbb{N}$. Actually, it is itself a consequence of the assumption that agents have maximal information about a system of $N$ copies of a quantum state [5]. Independence implies serial permutation symmetry, i.e., the fact that it does not matter in which order the results occur. So, in the case of multiple trials, the theory uses a stronger assumption than permutation symmetry to obtain the compatible frequency probabilities (14).

Can we weaken the assumption? It is remarkable that, due to the (infinite) de Finetti theorem [12], the assumption of independence can be weakened to that of exchangeability and still allow reasonable statistics. Exchangeability stands for permutation symmetry of the joint distribution of $N$ trials $X_n$, $n \le N$,

$$\rho^N(X_1, \ldots, X_N) = \rho^N(X_{\pi(1)}, \ldots, X_{\pi(N)}), \quad \pi \in \mathrm{Per}(N),$$

and for consistency from step $N$ to $N-1$, $\rho^{N-1} = \mathrm{tr}_N\, \rho^N$. If exchangeability is satisfied, an $N$-trial state $\rho^N$ can be uniquely represented by an integral over product states of form (10),

$$\rho^N = \int_{S(H)} \rho \otimes \cdots \otimes \rho \; d\mu(\rho), \tag{15}$$

by means of a measure $\mu$ on $S(H)$. (The measure $\mu$ belongs to the second category of probabilities.) For states $\rho^N$ of form (15), the statistical approach (14) works with some suitable adjustments (while the distributions are directly integral averages over the product-state distributions, a law of large numbers holds only conditional on a suitable $\sigma$-algebra [13]). Whether we work with states of maximal information (10) or with states of form (15), permutation symmetry and the principle of indifference are in any case key features of the frequencies (14) derived in multiple trials.
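The statistics of the independent case can be verified exactly for small N. The sketch below is an illustration, not part of the original argument: the function names and the example probabilities are assumptions. It enumerates all count vectors, weights them with the multinomial distribution, and checks that the expected counts equal N p k and that the variance of the frequency n k / N equals p k (1 - p k) / N, which is the quantity that shrinks in the law of large numbers.

```python
from fractions import Fraction
from math import factorial

def multinomial_pmf(counts, probs):
    """Probability of observing the given counts in sum(counts) i.i.d. trials."""
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)
    value = Fraction(coefficient)
    for c, p in zip(counts, probs):
        value *= p ** c
    return value

def count_vectors(n, parts):
    """Yield all tuples of `parts` non-negative integers that sum to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in count_vectors(n - first, parts - 1):
            yield (first,) + rest

def count_statistics(n, probs):
    """Return (total probability, expected counts, variances of n_k / n)."""
    total = Fraction(0)
    first_moment = [Fraction(0)] * len(probs)
    second_moment = [Fraction(0)] * len(probs)
    for counts in count_vectors(n, len(probs)):
        weight = multinomial_pmf(counts, probs)
        total += weight
        for k, c in enumerate(counts):
            first_moment[k] += weight * c
            second_moment[k] += weight * c * c
    variances = [(s - f * f) / (n * n)
                 for f, s in zip(first_moment, second_moment)]
    return total, first_moment, variances

# Hypothetical single-trial probabilities p_k and number of trials N.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n_trials = 6
total, expected, freq_variance = count_statistics(n_trials, p)
```

Because the arithmetic uses `Fraction`, the identities hold exactly rather than up to rounding. The enumeration grows quickly with N and M, so this is suitable only for small examples.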

Conclusions
In this exposition, we have not made use of any specific interpretation of quantum mechanics but have relied on the original formalism only. We have seen that it is possible to interpret all the probabilities in quantum physics as arising from permutation symmetry and the principle of indifference, which amounts to maximum entropy. For the single trial, this simply means that at any single point in time we have a number of equiprobable states that can occur. In the case of multiple trials, any single state can occur equiprobably at a number of different points in time. Since permutation symmetry is very natural and inherent in statistics, and since Laplace's principle is a basic rational intuition that underlies elementary combinatorics, we feel that both together are acceptable principles on which to base a theory of nature. So, given the projectors on Hilbert space as the model structure, the assumptions of non-contextuality and independence or exchangeability, together with the principle of indifference, lead directly to both a formal and an interpretative specification of the probabilities in quantum physics. The probabilities thus specified are real inasmuch as they belong to a theory in which agents and systems enter into a testable relationship. They are hence as much features of agents as they are of the physical systems.