Maximum Geometric Quantum Entropy

Any given density matrix can be represented by an infinite number of ensembles of pure states. This leads to the natural question of how to uniquely select one out of the many, apparently equally suitable, possibilities. Following Jaynes' information-theoretic perspective, this can be framed as an inference problem. We propose the Maximum Geometric Quantum Entropy Principle to exploit the notions of Quantum Information Dimension and Geometric Quantum Entropy. These allow us to quantify the entropy of fully arbitrary ensembles and select the one that maximizes it. After formulating the principle mathematically, we give the analytical solution to the maximization problem in a number of cases and discuss the physical mechanism behind the emergence of such maximum entropy ensembles.


I. INTRODUCTION
Background. Quantum mechanics defines a system's state |ψ⟩ as an element of a Hilbert space H. These are the pure states. To account for uncertainties in a system's actual state |ψ⟩, one extends the definition to density operators ρ that act on H. These operators are linear, positive semidefinite ρ ≥ 0, self-adjoint ρ = ρ†, and normalized Tr ρ = 1. ρ is then a pure state when it is also a projector: ρ² = ρ.
The spectral theorem guarantees that one can always decompose a density operator as ρ = Σ_i λ_i |λ_i⟩⟨λ_i|, where λ_i ∈ [0, 1] are its eigenvalues and |λ_i⟩ its eigenvectors. Ensemble theory [1, 2] gives the decomposition's statistical meaning: λ_i is the probability that the system is in the pure state |λ_i⟩. Together, they form ρ's eigenensemble L(ρ) := {λ_j, |λ_j⟩}_j which, putting degeneracies aside for a moment, is unique. L(ρ), however, is not the only ensemble compatible with the measurement statistics given by ρ. Indeed, there is an infinite number of different ensembles that give the same density matrix: {p_k, |ψ_k⟩}_k such that Σ_k p_k |ψ_k⟩⟨ψ_k| = Σ_j λ_j |λ_j⟩⟨λ_j|. Throughout the following, E(ρ) identifies the set of all ensembles of pure states consistent with a given density matrix.
Motivation. Since the association ρ → E(ρ) is one-to-many, it is natural to ask whether a meaningful criterion to uniquely select an element of E(ρ) exists. This is a typical inference problem, and a principled answer is given by the maximum entropy principle (MEP) [3-5]. Indeed, when addressing inference given only partial knowledge, maximum entropy methods have enjoyed marked empirical successes. They are broadly exploited in science and engineering.
Following this lead, the following answers the question of uniquely selecting an ensemble for a given density matrix by adapting the maximum entropy principle. We also argue in favor of this choice by studying the dynamical emergence of these ensembles in a number of cases.
The development is organized as follows. Section II discusses the relevant literature on this problem. It also sets up language and notation. Section III gives a brief summary of Geometric Quantum Mechanics: a differential-geometric language to describe the states and dynamics of quantum systems [6-23]. Then, Section IV introduces the technically pertinent version of MEP, the Maximum Geometric Entropy Principle (MaxGEP). Section V discusses two mechanisms that can lead to the MaxGEP and identifies different physical situations in which the ensemble can emerge. Finally, Section VI summarizes what this accomplishes and draws several forward-looking conclusions.

II. EXISTING RESULTS
The properties and characteristics of pure-state ensembles form a vast and rich research area, whose results are useful across a large number of fields, from quantum information and quantum optics to quantum thermodynamics and quantum computing, to mention only a few. This section discusses four sets of results relevant to our purposes. This also allows introducing language and notation. First, recall Ref. [24], where Hughston, Jozsa, and Wootters gave a constructive characterization of all possible ensembles behind a given density matrix, assuming an ensemble with a finite number of elements. Second, Wiseman and Vaccaro in Ref. [25] then argued for a preferred ensemble via the dynamically-motivated criterion of a Physically Realizable ensemble. Third, Goldstein, Lebowitz, Tumulka, and Zanghì singled out the Gaussian Adjusted Projected (GAP) measure as a preferred ensemble behind a density matrix in a thermodynamic and statistical mechanics setting [26]. Fourth, Brody and Hughston used one form of maximum entropy within geometric quantum mechanics [27].

HJW Theorem. At the technical level, one of the most important results for our purposes is the Hughston-Jozsa-Wootters (HJW) theorem, proved in Ref.
[24], which we now summarize. Consider a system with finite-dimensional Hilbert space H_S described by a density matrix ρ with rank r: ρ = Σ_{j=1}^{r} λ_j |λ_j⟩⟨λ_j|. We assume dim H_S := d_S = r, since the case in which d_S > r is easily handled by restricting H_S to the r-dimensional subspace defined by the image of ρ. Then, a generic ensemble e_ρ ∈ E(ρ) with d ≥ d_S elements can be generated from L(ρ) via linear remixing with a d × d_S matrix M whose columns are d_S orthonormal vectors. Then, e_ρ = {p_k, |ψ_k⟩} is given by the following:

√(p_k) |ψ_k⟩ = Σ_{j=1}^{d_S} M_kj √(λ_j) |λ_j⟩ ,  k = 1, …, d .

Here, we must remember that M is not an operator acting on H_S, but a matrix (the first d_S columns of a d × d unitary U) mixing weighted eigenvectors into d non-normalized vectors. The power of the HJW theorem is not only that it introduces a constructive way to build E(ρ) ensembles, but that this way is complete: all ensembles can be built in this way. This is a remarkable fact, on which the following sections rely heavily.

Physically Realizable Ensembles. For our purposes, a particularly relevant result is that of Wiseman and Vaccaro [25]. (See also subsequent results by Wiseman and collaborators on the same topic [28].) The authors argue for a Physically Realizable ensemble, implicitly selected by the requirement that, if a system is in a stationary state ρ_ss, the ensemble should be stable under the action of the dynamics generated by monitoring the environment. This is clearly desirable in experiments in which one monitors an environment to infer properties of the system. While this is an interesting way to answer the same question we tackle here, their answer is based on dynamics and limited to stationary states. The approach we propose here is very different, being based on an inference principle. This opens interesting questions related to understanding the conditions under which the two approaches provide compatible answers. Work in this direction is ongoing and will be reported elsewhere.
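As a concrete illustration, the HJW remixing can be checked numerically. The sketch below (plain NumPy; the matrix sizes and the sample ρ are our own illustrative choices, not from the text) builds a random d × d_S isometry M, forms √(p_k)|ψ_k⟩ = Σ_j M_kj √(λ_j)|λ_j⟩, and verifies that the resulting ensemble reproduces ρ:

```python
import numpy as np
rng = np.random.default_rng(0)

def random_isometry(d, d_s):
    """d x d_S matrix with orthonormal columns (first d_S columns of a random unitary)."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q[:, :d_s]

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])   # sample density matrix
lam, vecs = np.linalg.eigh(rho)                          # eigenensemble L(rho)

d = 5                                    # target ensemble size, d >= d_S
M = random_isometry(d, 2)
unnorm = (vecs * np.sqrt(lam)) @ M.T     # column k = sqrt(p_k) |psi_k>
p = np.linalg.norm(unnorm, axis=0) ** 2  # probabilities p_k
psi = unnorm / np.sqrt(p)                # normalized states |psi_k>

# Completeness check: the remixed ensemble reproduces rho exactly.
rho_rec = (psi * p) @ psi.conj().T
assert np.allclose(rho_rec, rho)
assert np.isclose(p.sum(), 1.0)
```

The check works for any isometry M, which is the content of the theorem: orthonormality of the columns is exactly what guarantees Σ_k p_k |ψ_k⟩⟨ψ_k| = ρ.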
Gaussian Adjusted Projected Measure. Reference [26] asks a question similar to ours, but in a statistical mechanics and thermodynamics context. Namely, viewing pure states as points on a high-dimensional sphere ψ ∈ S^{2d_S−1}: which probability measure µ on S^{2d_S−1}, interpreted as a smooth ensemble on S^{2d_S−1}, leads to a thermal density matrix

ρ_th = ∫_{S^{2d_S−1}} |ψ⟩⟨ψ| µ(dψ) ?

Here, ρ_th could be the microcanonical or the canonical density matrix. Starting with Schrödinger's [29, 30] and Bloch's [31] early work, the authors argue in favor of the Gaussian Adjusted Projected (GAP) measure. This is essentially a Gaussian measure G(σ), adjusted by the density ⟨ψ|ψ⟩ and projected to live on ψ ∈ S^{2d_S−1}:

GAP(σ) = P_*[ ⟨ψ|ψ⟩ G(σ)(dψ) ] ,  with P : ψ ↦ ψ/∥ψ∥ .

Written explicitly in terms of complex coordinates ψ_j, it is clear that G(σ) is a Gaussian measure with vanishing average E[ψ_j] = 0 and covariance specified by E[ψ*_j ψ_k] = σ_jk. In particular, σ = ρ guarantees that GAP(ρ) has ρ as density matrix.
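A quick Monte Carlo check of the GAP construction is possible if the adjustment factor ⟨ψ|ψ⟩ is read as an importance weight: sample the Gaussian G(ρ), weight each sample by ∥ψ∥², project onto the sphere, and the weighted mixture should reproduce ρ. The sketch below assumes this weighted-sampling reading; the sample ρ and sample size are arbitrary:

```python
import numpy as np
rng = np.random.default_rng(1)

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
A = np.linalg.cholesky(rho)                      # A @ A.conj().T == rho

# Sample the Gaussian G(rho): rows psi with E[psi_j psi_k*] = rho_jk.
n = 200_000
g = (rng.normal(size=(n, 2)) + 1j * rng.normal(size=(n, 2))) / np.sqrt(2)
psi = g @ A.T

w = np.einsum('ij,ij->i', psi.conj(), psi).real  # adjustment weight <psi|psi>
psi_hat = psi / np.sqrt(w)[:, None]              # projection onto the unit sphere

# Weighted mixture of the projected pure states reproduces rho (up to MC error).
rho_mc = np.einsum('i,ij,ik->jk', w, psi_hat, psi_hat.conj()) / w.sum()
assert np.allclose(rho_mc, rho, atol=0.01)
```

The identity behind the check is elementary: the weight cancels the normalization, so the weighted average of |ψ̂⟩⟨ψ̂| reduces to E_G[ψψ†] = ρ.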
The GAP measure has some interesting properties [26, 32, 33] and, as we see in Section IV, it is also closely related to one of our results in a particular case. Our results can therefore be understood as a generalization of the GAP measure. We will not delve deeper into this matter now, but comment on it later.

Geometric Approach. In 2000, Brody and Hughston performed the first maximum entropy analysis for the ensemble behind the density matrix [27], in a language and spirit quite close to those we use here. Their result came before the definition of the GAP measure, but it is essentially identical to it: µ(ψ) ∝ exp(−Σ_{j,k} L_jk ψ*_j ψ_k). Their perspective, however, is very different from that of Ref. [26], which is focused on thermal equilibrium phenomenology. The work we do here, and our results, can also be understood as a generalization of Ref. [27]. Indeed, as we argued in Ref. [34] (and will show again in Section IV), the definition of entropy used there (see Eq. (10) in Ref. [27]) is meaningful only in certain cases: in particular, when the ensemble has support with dimension equal to the dimension of the state space of the system of interest. In general, more care is required.

Summary. We summarized four relevant sets of results on selecting one ensemble among the infinitely many that are generally compatible with a density matrix. Our work relies heavily on the HJW theorem [24], and it is quite different from the approach of Wiseman and Vaccaro [25]. Moreover, it constitutes a strong generalization of the results on the GAP measure [26] in a thermal equilibrium context, and of the analysis by Brody and Hughston in Ref. [27].

III. GEOMETRIC QUANTUM STATES
Our maximum geometric entropy principle relies on a differential-geometric approach to quantum mechanics called Geometric Quantum Mechanics (GQM). The following gives a quick summary of GQM and how its notion of Geometric Quantum State (GQS) [6, 7, 34] can be elegantly used to study physical and information-theoretic aspects of ensembles. More complete discussions are found in the relevant literature [8-23].

Quantum State Space. The state space of a finite-dimensional quantum system with Hilbert space H_S is a projective Hilbert space P(H_S), which is isomorphic to a complex projective space: P(H_S) ∼ CP^{d_S−1} := {Z ∈ C^{d_S} : Z ∼ λZ, λ ∈ C∖{0}}. Pure states are thus in one-to-one correspondence with points Z ∈ CP^{d_S−1}. Using a computational basis {|j⟩}_{j=1}^{d_S} as reference basis, Z has homogeneous coordinates Z = (Z_1, …, Z_{d_S}), where |Z⟩ = Σ_{j=1}^{d_S} Z_j |j⟩ ∈ H_S. One of the advantages of the geometric approach is that one can exploit the symplectic character of the state space. Indeed, this implies that the quantum state space CP^{d_S−1} can essentially be considered a classical, although curved, phase space, with probabilities and phases as canonically conjugate coordinates: writing Z_j = √(p_j) e^{iϕ_j}, we have {p_j, ϕ_k} = δ_jk. Intuition from classical mechanics can then be used to understand the phenomenology of quantum systems.
Observables. Within GQM, observables are Hermitian functions from CP^{d_S−1} to the reals:

O(Z) = ⟨Z|O|Z⟩ / ⟨Z|Z⟩ ,  for O a Hermitian operator on H_S .

Conditional ensembles. Given a purification of ρ on H_S ⊗ H_E, |ψ⟩ = Σ_{j=1}^{d_S} √(λ_j) |λ_j⟩ ⊗ |SP_j⟩, the environment vectors {|SP_j⟩} are the Schmidt partners of the eigenvectors of ρ. Measuring the environment in a basis {|v_α⟩} leaves the system, when the result α occurs, in a conditionally pure state |χ_α⟩ ∝ ⟨v_α|ψ⟩. Starting from the Schmidt partners, these bases are in one-to-one correspondence with d_E × d_E unitary matrices acting on H_E: |v_α⟩ := U|SP_α⟩. And these, in turn, are in one-to-one correspondence with the unitary matrices in the HJW theorem. Therefore, they provide an analogously complete classification of ensembles. The reason for this slight rearrangement with respect to the HJW theorem is that we now have an interpretation of |χ_α⟩ as the conditionally pure state of the system, conditioned on the fact that we make a projective measurement {|v_α⟩} on the environment in which the result α occurs with probability p_α.

Quantifying quantum entropy. To develop entropy, the following uses the setup in Refs. [6, 7, 34] to study the physics of ensembles using geometric quantum mechanics. Since the focus here is a maximum entropy approach to select an ensemble behind a density matrix, it is important to have a proper understanding of how to quantify the entropy of an ensemble or, equivalently, of the GQS. First, we look at the statistical interpretation of the pure states which participate in the conditional ensembles {p_α, |χ_α⟩}. The corresponding kets {|χ_α⟩} are not necessarily orthogonal, ⟨χ_α|χ_β⟩ ≠ δ_αβ, so the states are not mutually exclusive, or distinguishable in the Fubini-Study sense. However, these states come with the classical labels α → |χ_α⟩ associated to the outcomes of projective measurements on the environment. In this sense, if α ≠ β we have a classical way to distinguish them, and thus we can understand how to interpret expressions like −Σ_α p_α log p_α. Then, we highlight that the correct functional to use to evaluate the entropy of µ is not always the same. It depends on another feature of the ensemble, the quantum information dimension, which is conceptually related to the dimension of its support in quantum state space. To illustrate the concept, consider
the following four GQSs of a qubit:

µ_1 = δ_ψ ,  µ_2 = Σ_{k=1}^{N} p_k δ_{ψ_k} ,  µ_3 = lim_{T→∞} (1/T) ∫_0^T δ_{ψ_t} dt ,  µ_4 = µ_Haar .

Naturally, the entropy of µ_1 vanishes, since there is no uncertainty: the system inhabits only one pure state, ψ. The entropy of µ_2 is already nontrivial to evaluate. Indeed, while one obvious route is to use the functional −Σ_k p_k log p_k, it is also very clear that this notion of entropy does not take into account the location of the points ψ_k ∈ P(H_S). Intuitively, if all these points are close to each other, we would like our entropy to be smaller than in the case in which the points are distributed uniformly over P(H_S).
The entropy of µ_3 is perhaps the most peculiar, but it illustrates the point best. Let's assume that our qubit evolves with a Hamiltonian H generating a uniform precession of the phase at frequency ω. If we aggregate the time average and look at the statistics we obtain, it is clear that the variable p is a conserved quantity: p(t) = p_0. Meanwhile, ϕ(t) = ϕ_0 − ωt is an angular variable that, over a long time, becomes uniformly distributed in [0, 2π]. This means lim_{T→∞} µ_3 = (1/2π) δ^p_{p_0}, where δ^p_{p_0} is a Dirac measure over the first variable p with support on p = p_0. How do we evaluate the entropy of µ_3?
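The µ_3 example can be simulated directly: under the precession, the coordinate p never moves, while the empirical distribution of ϕ flattens onto the circle. A minimal sketch (the initial condition and frequency are arbitrary choices):

```python
import numpy as np

p0, phi0, omega = 0.3, 0.8, 1.0              # initial (p, phi) and precession frequency
t = np.linspace(0.0, 5_000.0, 500_001)       # long-time sampling of the trajectory
phi_t = (phi0 - omega * t) % (2 * np.pi)     # p(t) = p0 is conserved along the flow

# Time averaging flattens the phase onto the circle: the first Fourier mode of
# its empirical distribution vanishes, consistent with mu_3 -> (1/2pi) delta_{p0}.
assert np.abs(np.mean(np.exp(1j * phi_t))) < 1e-2
```

The support of the limiting ensemble is the one-dimensional circle {p = p_0}: a measure-zero subset of the two-dimensional state space, which is exactly what makes its entropy delicate to define.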
While to evaluate the entropy of µ_4 = µ_Haar we simply integrate over the whole state space, obtaining log Vol(CP^1), this does not work for µ_3. Indeed, with respect to the full, 2D quantum state space (p, ϕ) ∈ [0, 1] × [0, 2π], the distribution clearly lives on a 1D line, which is a measure-zero subset.
To properly address all these different cases, a more general approach is needed. Reference [34] adapted previous work by Rényi to probability measures on a quantum state space. This led to the notions of Quantum Information Dimension D and Geometric Quantum Entropy h_D, which address these issues and properly evaluate the entropy in all these cases. We now give a quick summary of the results in Ref. [34].

Quantum Information Dimension and Geometric Entropy.
Thanks to the symplectic nature of P(H_S), the quantum state space is essentially a curved, compact, classical phase space. We can therefore apply classical statistical mechanics to it, using {(p_j, ϕ_j)}_{j=1}^{d_S} as canonical coordinates. Since the Fubini-Study volume is dV_FS ∝ Π_j dp_j dϕ_j, we can coarse-grain P(H_S) by partitioning it into phase-space cells C_{a⃗,b⃗} of linear size ϵ = 1/N. The coarse-graining procedure produces a discrete probability distribution q_{a⃗b⃗} := µ[C_{a⃗b⃗}], for which we can compute the Shannon entropy:

H[ϵ] = − Σ_{a⃗,b⃗} q_{a⃗b⃗} log q_{a⃗b⃗} .

As we change ϵ = 1/N → 0, the degree of coarse-graining changes accordingly. The scaling behavior of H[ϵ] provides structural information about the underlying ensemble. Indeed, one can prove that for ϵ → 0, H[ϵ] has asymptotics

H[ϵ] = D log(1/ϵ) + h_D + o(1) ,

so two quantities define its scaling behavior: D is the quantum information dimension and h_D is the geometric quantum entropy. Their explicit definitions are:

D := lim_{ϵ→0} H[ϵ] / log(1/ϵ) ,  h_D := lim_{ϵ→0} ( H[ϵ] − D log(1/ϵ) ) .

Note how this keeps the dependence of the entropy on the information dimension explicit. This clarifies why, only in certain cases, one can use the continuous counterpart of Shannon's discrete entropy. In general, its exact form depends on the value of D and it cannot be written as an integral on the full quantum state space with the Fubini-Study volume form.
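The scaling definition of D can be tested numerically by comparing coarse-grained entropies at two cell sizes, since H[ϵ/2] − H[ϵ] ≈ D log 2. The sketch below (helper names are ours) recovers D ≈ 1 for a circle-supported qubit ensemble like µ_3 and D ≈ 2 for the Haar measure, which is uniform in the (p, ϕ) coordinates:

```python
import numpy as np
rng = np.random.default_rng(3)

def coarse_entropy(samples, n_cells):
    """Shannon entropy of the empirical measure on an n_cells x n_cells partition
    of the phase-space rectangle (p, phi) in [0,1] x [0, 2*pi]."""
    i = np.minimum((samples[:, 0] * n_cells).astype(int), n_cells - 1)
    j = np.minimum((samples[:, 1] / (2 * np.pi) * n_cells).astype(int), n_cells - 1)
    q = np.bincount(i * n_cells + j, minlength=n_cells ** 2) / len(samples)
    q = q[q > 0]
    return -(q * np.log(q)).sum()

n = 500_000
phi = rng.uniform(0.0, 2 * np.pi, n)
circle = np.column_stack([np.full(n, 0.3), phi])         # mu_3-like: support on {p = 0.3}
haar = np.column_stack([rng.uniform(0.0, 1.0, n), phi])  # Haar measure: uniform in (p, phi)

for samples, expected in [(circle, 1.0), (haar, 2.0)]:
    D_est = (coarse_entropy(samples, 32) - coarse_entropy(samples, 16)) / np.log(2)
    assert abs(D_est - expected) < 0.2
```

The estimator is just the finite-ϵ version of the limit defining D; refining the grid further (while keeping enough samples per cell) sharpens both D and the offset h_D.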

IV. PRINCIPLE OF MAXIMUM GEOMETRIC QUANTUM ENTROPY
This section presents a fine-grained characterization of selecting an ensemble behind a given density matrix, leveraging both the HJW theorem and previous results by the authors. First, we note that D foliates E(ρ) into non-overlapping subsets E_D(ρ), collecting all ensembles µ with density matrix ρ and information dimension D:

E_D(ρ) := {µ ∈ E(ρ) : D[µ] = D} .

As argued above, ensembles with different D pertain to different physical situations, which can be wildly different. Therefore, we often want to first select the D of the ensemble we will end up with, and then choose the one with maximum geometric entropy. Thus, here we introduce the principle of maximum geometric entropy at fixed information dimension.
Proposition 1 (Maximum Geometric Entropy Principle). Given a system with density matrix ρ, the ensemble µ^D_ME that makes the fewest possible assumptions about our knowledge of the ensemble, among all elements of E(ρ) with fixed information dimension D, is given by:

µ^D_ME := argmax_{µ ∈ E_D(ρ)} h_D[µ] .

Several general comments are in order. First, we note that µ^D_ME might not be unique. This should not come as a surprise. For example, with degeneracies, even the eigenensemble is not unique. Second, the optimization problem defined above is clearly constrained: the resulting ensemble has to be normalized, and the average of (Z_j)* Z_k must be ρ_jk. Calling E_µ[A] the state-space average of a function A done with the GQS µ, these two constraints can be written as

C_1 := E_µ[1] − 1 = 0 ,  C^ρ_jk := E_µ[(Z_j)* Z_k] − ρ_jk = 0 ,

and enforced with Lagrange multipliers γ_1 and γ_jk in a functional Λ. While the vanishing of Λ's derivatives with respect to the Lagrange multipliers γ_1, γ_jk enforces the constraints C_1 = C^ρ_jk = 0, derivatives with respect to µ give the equation whose solution is the desired ensemble µ^D_ME. We also note that the {γ_jk} are not all independent. This is due to the fact that ρ is not an arbitrary matrix: Tr ρ = 1, ρ ≥ 0, and ρ† = ρ. A similar relation holds for the γ_jk. To illustrate its use, we now solve this optimization problem in a number of relevant cases. In discussing them, we often use the canonically conjugated coordinates {(p_j, ϕ_j)} introduced above.

The case D = 0. We start by noting that the number of ensemble elements N also foliates E_{D=0}(ρ) into non-overlapping sets in which the ensemble consists of exactly N elements. We call this set E_{0,N}(ρ). For such ensembles the geometric entropy reduces to h_0 = −Σ_α p_α log p_α ≤ log N, with equality if and only if p_α = 1/N. This is achieved by measuring the environment in a basis that is unbiased with respect to the Schmidt partner basis:

|⟨v_α|SP_j⟩|² = 1/N for all α, j .

One such basis can always be built starting from {|SP_α⟩}_α by exploiting the properties of the Weyl-Heisenberg matrices via the clock-and-shift construction [37]. This is true for all N ∈ N.
When N = Π_k n_k^{N_k}, with n_k primes and N_k some integers, the finite-field algorithm [38, 39] can be used to build a whole suite of bases that are unbiased with respect to the Schmidt partner basis. This leads to conditional states |χ_α⟩ = Σ_j √(λ_j) e^{iθ_αj} |λ_j⟩ and to:

µ^{D=0}_ME = (1/N) Σ_{α=1}^{N} δ_{χ(λ⃗, θ⃗_α)} ,

with λ⃗ = (λ_1, …, λ_{d_S}) and θ⃗_α = (θ_α1, …, θ_α d_S).
To conclude this subsection, we simply have to show that this ensemble satisfies the constraints: C_1 = 0, and that the density matrix given by µ^{D=0}_ME is ρ, giving C^ρ_jk = 0:

Σ_α p_α |χ_α⟩⟨χ_α| = (1/N) Σ_α Σ_{j,k} √(λ_j λ_k) e^{i(θ_αj − θ_αk)} |λ_j⟩⟨λ_k| = Σ_j λ_j |λ_j⟩⟨λ_j| = ρ .

Here, the key property used is (1/N) Σ_α e^{i(θ_αj − θ_αk)} = δ_jk, which follows from the unbiasedness condition and the fact that {|v_α⟩}_α is a basis.
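For prime N, the discrete Fourier matrix is the simplest instance of the clock-and-shift construction and is unbiased with respect to the computational basis, which stands in here for the Schmidt partners. A sketch verifying that the resulting conditional ensemble has p_α = 1/N and density matrix ρ (the sample eigenvalues are arbitrary):

```python
import numpy as np

N = 5                                             # prime environment dimension
F = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N) / np.sqrt(N)
assert np.allclose(np.abs(F) ** 2, 1.0 / N)       # unbiasedness: |<v_a|SP_j>|^2 = 1/N

# Conditional ensemble for a qutrit rho = diag(lam) purified into this environment.
lam = np.array([0.5, 0.3, 0.2])
C = np.sqrt(lam)[:, None] * F.conj()[:3, :]       # C[j, a] = sqrt(lam_j) <v_a|SP_j>
p = (np.abs(C) ** 2).sum(axis=0)                  # outcome probabilities
chi = C / np.sqrt(p)                              # conditional states |chi_a>

assert np.allclose(p, 1.0 / N)                    # maximum-entropy weights: h_0 = log N
rho_rec = (chi * p) @ chi.conj().T
assert np.allclose(rho_rec, np.diag(lam))         # C^rho_jk = 0
```

The second assertion is exactly the key property above: unitarity of F gives (1/N) Σ_α e^{i(θ_αj − θ_αk)} = δ_jk.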
The second case of interest is the one in which the quantum information dimension takes the maximum possible value, namely D = 2(d_S − 1). Then, the GQS's support has the same dimension as the full quantum state space and the optimization problem is also tractable. This is indeed the case solved by Brody and Hughston [27]. We do not reproduce their treatment here, which is almost identical in the language of GQM. Rather, we discuss some of its physical aspects from the perspective of conditional ensembles.
If D = 2(d_S − 1) and there are no constraints other than C_1 and C^ρ_jk, the measure µ can be expressed via a density q_ME with respect to the uniform, normalized Fubini-Study measure dV_FS:

µ_ME(dZ) = q_ME(Z) dV_FS ,

and its geometric entropy h_{2(d_S−1)} is the continuous counterpart of Shannon's functional on the quantum state space:

h_{2(d_S−1)}[µ] = − ∫ dV_FS q(Z) log q(Z) .

This was proven in Ref. [34]. Hereafter, with a slight abuse of language, we refer to both µ and the density q_ME(Z) as an ensemble or the GQS. The maximization problem leads to:

q_ME(Z) = exp( − Σ_{j,k} γ_jk (Z_j)* Z_k ) / Q ,

and the Lagrange multipliers {γ_jk} are the solution of the nonlinear equations −∂ log Q/∂γ_jk = ρ_jk. We note how, using as reference basis the eigenbasis {|λ_j⟩}_{j=1}^{d_S} of ρ, and Z ↔ (p⃗, ϕ⃗) as coordinate system, reveals that ∂ log Q/∂γ_jk = 0 when j ≠ k and −∂ log Q/∂γ_jj = λ_j. Thus, in this coordinate system the dependence of µ on the off-diagonal Lagrange multipliers disappears and we retain only the diagonal ones, γ_jj. Moving to a single label γ_jj → τ_j and using vector notation:

q_ME(p⃗, ϕ⃗) = e^{−τ⃗ · p⃗} / Q(τ⃗) .

Here, Q(τ⃗) is the normalization function (a partition function). Its exact expression can be derived analytically and it is given in Appendix A.
We can see that µ is the product of an exponential measure on the probability simplex ∆^{d_S−1} and the uniform measure on the high-dimensional torus of the phases T^{d_S−1}. This leads to the following geometric entropy:

h_{2(d_S−1)} = log Q(τ⃗) + τ⃗ · λ⃗ .

In this case the explicit expression of the Lagrange multipliers τ⃗ satisfying the constraints, which was previously unknown, can be found analytically. This is reported in Appendix B. We note that this exponential distribution on the probability simplex was recently proposed within the context of statistics and data analysis in Ref. [40]. Moreover, the exponential form associated with the maximum entropy principle is reminiscent of thermal behavior. Indeed, the shape of this distribution is closely related to the geometric canonical ensemble; see Refs. [7, 14, 27]. However, the value of the Lagrange multipliers is set by a different constraint, in which we fix the average energy rather than the whole density matrix.
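For a qubit, the multiplier-fixing equations can be handled with one-dimensional quadrature: setting τ_2 = 0 (only differences of the τ_j matter on the simplex), one checks −∂ log Q/∂τ = E[p_1] and then solves it for a target eigenvalue by bisection. A numerical sketch (grid size and the target λ_1 are arbitrary choices):

```python
import numpy as np

p_grid = np.linspace(0.0, 1.0, 20_001)   # the simplex Delta^1, parametrized by p_1

def logQ(tau):
    """log of the partition function Q(tau), up to a tau-independent constant."""
    return np.log(np.sum(np.exp(-tau * p_grid)))

def mean_p(tau):
    """E[p_1] under the density q(p) = exp(-tau * p) / Q(tau)."""
    w = np.exp(-tau * p_grid)
    return np.sum(p_grid * w) / np.sum(w)

# Consistency of the multiplier equation: -d log Q / d tau = E[p_1].
tau = 2.0
fd = -(logQ(tau + 1e-4) - logQ(tau - 1e-4)) / 2e-4
assert abs(fd - mean_p(tau)) < 1e-6

# Fixing the multiplier so that E[p_1] = lambda_1 (E[p_1] is decreasing in tau).
lam1, lo, hi = 0.3, -50.0, 50.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mean_p(mid) > lam1 else (lo, mid)
tau_star = 0.5 * (lo + hi)
assert abs(mean_p(tau_star) - lam1) < 1e-6
```

The same scheme extends to higher d_S by replacing the grid with quadrature over the simplex; the closed forms mentioned in the text (Appendices A and B) make this unnecessary, but the numerics provide an independent check.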
Integer, but otherwise arbitrary, D

While one expects D to be an integer, there are GQSs that have fractal support, thus exhibiting a noninteger D. This was shown in Ref. [34]. This section discusses the generic case in which D ≠ 0, 2(d_S − 1), but it is still an integer. Within E_D(ρ), our ensemble µ^D_ME has support on a D-dimensional submanifold of the full 2(d_S − 1)-dimensional quantum state space, where it has a density. Reference [34] discusses, in detail, the case in which D = 1 and d_S = 2. Here, we generalize the procedure to arbitrary D and d_S.

If the support of µ^D_ME is contained in a submanifold of dimension D < 2(d_S − 1), which we call S_D, we can pull the Fubini-Study metric g_FS of P(H_S) back to S_D to get the induced metric g_S. Let's call X_j : ξ_a ∈ S_D → X_j(ξ_a) ∈ P(H_S) the functions which embed S_D into the full quantum state space P(H_S). Then the metric induced on S_D is

(g_S)_ab = Σ_{j,k} (g_FS)_jk ∂_a X_j ∂_b X_k ,  where ∂_a := ∂/∂ξ_a .

Note that here we are using the "real index" notation even for coordinates X_j on P(H_S). While P(H_S) is a complex manifold, admitting complex homogeneous coordinates, we can always use real coordinates on it. Then g_S induces a volume form dω_S(ξ) = √(det g_S) dξ, where dξ is the Lebesgue measure on the R^D which coordinatizes S_D. Then, µ^D_ME can be written as:

µ^D_ME(dξ) = q(ξ) dω_S(ξ) .

Eventually, this leads to the geometric entropy

h_D[µ] = − ∫_{S_D} dω_S(ξ) q(ξ) log q(ξ) .

This allows rewriting the constraints explicitly in a form that involves only probability densities on S_D:

∫_{S_D} dω_S(ξ) q(ξ) = 1 ,  ∫_{S_D} dω_S(ξ) q(ξ) (Z_j(ξ))* Z_k(ξ) = ρ_jk ,

where Z(ξ) : ξ ∈ S_D → Z(ξ) ∈ P(H_S) is the homogeneous-coordinate representation of the embedding functions X_j of S_D into P(H_S).
The solution of the optimization problem again leads to a Gaussian form, in homogeneous coordinates, with support on S_D:

q_ME(ξ) = exp( − Σ_{j,k} γ_jk (Z_j(ξ))* Z_k(ξ) ) / Q_S .

Again, we can move from the homogeneous representation to a symplectic one, Z(ξ) ↔ (p⃗(ξ), ϕ⃗(ξ)), in which the reference basis is the eigenbasis of ρ. This gives ρ_jk = λ_j δ_jk which, in turn, means we only need the diagonal Lagrange multipliers γ_jj. As in the previous case, we move to a single-label notation γ_jj → τ_j:

q_ME(ξ) = e^{−τ⃗ · p⃗(ξ)} / Q_S(τ⃗) ,

with an analytical expression for the entropy:

h_D = log Q_S(τ⃗) + τ⃗ · λ⃗ .

While this solution appears to have much in common with the D = 2(d_S − 1) case, there are profound differences. Indeed, the functions p⃗(ξ) can be highly degenerate, since we are embedding a low-dimensional manifold S_D into a higher-dimensional one, P(H_S). The coordinates ξ emerge from coordinatizing a submanifold of dimension D within one of dimension 2(d_S − 1). This means that S_D is characterized by 2(d_S − 1) − D independent equations of the type {K_n(Z) = 0}. In general, we expect these to be highly nonlinear functions of their arguments. While choosing an appropriate coordinate system allows simplifications, this choice has to be made on a case-by-case basis. In specific cases, discussed in the next section, several exact solutions can be found analytically.

As Ref. [34] showed, even measuring the environment in a local basis can lead to GQSs with noninteger D. For example, if we explicitly break the translational invariance of the spin-1/2 Heisenberg model in 1D by changing the local magnetic field on one spin, the GQS of one of its spins is described, in the thermodynamic limit of an infinite environment, by a fractal resembling the Cantor set. Its quantum information dimension and geometric entropy have been estimated numerically to be D ≈ 0.83 ± 0.02, with h_0.83 growing linearly with N_E, the size of the environment: h_0.83 ∝ 0.66 N_E. The existence of such states gives physical meaning to the question of finding the maximum geometric entropy ensemble with noninteger D.
Providing concrete solutions to this problem is quite complex, as it requires a generic parametrization for an ensemble with an arbitrary fractional D. As far as we know, this is currently not possible. While we do know that certain ensembles have a noninteger D, there is no guarantee that fixing the value of the information dimension, e.g. D = N/M with N, M ∈ N relatively prime, translates into an explicit way of parametrizing the ensemble. We leave this problem open for future work.

V. HOW DOES µ_ME EMERGE?
While the previous section gave the technical details regarding ensembles resulting from the proposed maximum geometric quantum entropy principle, the following identifies the mechanisms for their emergence in a number of cases of physical interest.
As partly discussed in the previous section, µ^0_ME can emerge naturally as a conditional ensemble, when our system of interest interacts with a finite-dimensional environment (dimension N). If the environment is probed with projective measurements in a basis that is unbiased with respect to the Schmidt-partner basis {|SP_α⟩}_{α=1}^{N}, we reach the absolute maximum of the geometric entropy, log N. The resulting GQS is µ^0_ME = (1/N) Σ_{α=1}^{N} δ_{χ_α}, with members of the ensemble |χ_α⟩ = Σ_j √(λ_j) e^{iθ_αj} |λ_j⟩ and p_α = 1/N.
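The conditional-ensemble mechanism itself is easy to simulate: purify ρ with random Schmidt partners, measure the environment in an independent random basis, and collect {p_α, |χ_α⟩}. Whatever basis is chosen, the resulting ensemble is a valid element of E(ρ); the sketch below (dimensions and ρ are arbitrary choices) checks exactly this consistency condition:

```python
import numpy as np
rng = np.random.default_rng(2)

def haar_unitary(d):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
lam, vecs = np.linalg.eigh(rho)
d_e = 4                                    # environment dimension

SP = haar_unitary(d_e)                     # columns 0,1: random Schmidt partners |SP_j>
V = haar_unitary(d_e)                      # measurement basis {|v_a>}, independent of SP

Psi = np.sqrt(lam)[:, None] * SP[:, :2].T  # purification: Psi[j, m] = sqrt(lam_j) <m|SP_j>
C = Psi @ V.conj()                         # C[j, a] = sqrt(lam_j) <v_a|SP_j>
p = (np.abs(C) ** 2).sum(axis=0)           # outcome probabilities p_a
chi = vecs @ (C / np.sqrt(p))              # conditional states |chi_a>

assert np.isclose(p.sum(), 1.0)
assert np.allclose((chi * p) @ chi.conj().T, rho)   # any basis yields an ensemble for rho
```

What singles out µ^0_ME is not validity but typicality: for large environments, a generically chosen basis is close to unbiased, pushing the p_α toward 1/N.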
As argued in Ref. [41], the notion of unbiasedness is typical. Physically, this is interpreted as follows. Imagine someone hands us |ψ(ρ)⟩, a purification of ρ, without telling us anything about the way the purification was done. This means we know nothing about the way ρ has been encoded into |ψ(ρ)⟩. Equivalently, we do not know what the {|SP_j⟩}_{j=1}^{d_S} are. If we then probe the environment in a generic basis, there is a very high chance that this basis is very close to unbiased with respect to the Schmidt partners. The mathematically rigorous version of "very high chance" and "very close" is given in Ref. [41] and is not relevant here. The only thing we need is that this behavior is usually exponential in the size of the environment, ∼ 2^N. Somewhat more accurately, the fraction of bases for which |⟨v_α|SP_j⟩|² ≈ 1/N is ∼ 1 − 2^{−N}. Therefore, statistically speaking, in the absence of meaningful information about what the {|SP_j⟩}_{j=1}^{d_S} are, it is extremely likely that the conditional ensemble we will see is µ^0_ME.

For µ^{2(d_S−1)}_ME to emerge as a conditional ensemble, our d_S-dimensional quantum system must instead interact with an environment that is being probed with measurements whose outcomes are parametrized by 2(d_S − 1) continuous variables, each with the cardinality of the reals. This is because we have to guarantee that D = 2(d_S − 1). Therefore, conditioning on projective measurements on a finite environment is insufficient. One possibility is to have a finite environment that we measure in an overcomplete basis, like coherent states. A second possibility is to have a genuinely infinite-dimensional environment, on which we perform projective measurements. For example, we could have 2(d_S − 1)/3 quantum particles in 3D that we measure in the position basis ⊗_k |x⃗_k⟩. All the needed details were given in Ref. [6], where we studied the properties of a GQS emerging from a finite-dimensional quantum system interacting with one with continuous variables.
We stress here that this is only a necessary condition, not a sufficient one. Indeed, we can have an infinite environment that is probed with projective measurements on variables with the right properties, but still obtain an ensemble that is not µ^{2(d_S−1)}_ME. An interesting example of this is given by the continuous generalization of the notion of unbiased basis. We illustrate this in a simple example of a purification obtained with a set of 2(d_S − 1) real continuous variables, realized by 2(d_S − 1) non-interacting particles in a 1D box [0, L].
In this setting, the notion of an unbiased basis is satisfied by position and momentum eigenstates:

|⟨x|k⟩|² = 1/L for all x, k .

Thus, if our Schmidt partners are momentum eigenstates and we measure the environment in the position basis, we do not obtain a GQS with the required D = 2(d_S − 1). Indeed, while we do get conditional states |χ(x⃗)⟩ = Σ_j √(λ_j) e^{i k⃗_j · x⃗} |λ_j⟩, these are not distributed in the appropriate way. This clarifies why, in order to have D = 2(d_S − 1), using an environmental basis that is unbiased with respect to the Schmidt partners is not enough. Specifically, the probabilities p_j(x⃗) = |⟨λ_j|χ(x⃗)⟩|² = λ_j do not depend on x⃗. They do not get redistributed by the unbiasedness condition and are always equal to the eigenvalues of ρ.
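This failure mode can be made explicit in a two-line computation: with plane-wave Schmidt partners, conditioning on position only attaches phases, so p_j(x⃗) stays pinned at λ_j. A minimal sketch (momenta and eigenvalues are arbitrary choices):

```python
import numpy as np

# Schmidt partners as plane waves e^{i k_j x}/sqrt(L) on [0, L]: conditioning on a
# position measurement at x multiplies sqrt(lam_j) by a pure phase only.
L = 2 * np.pi
lam = np.array([0.7, 0.3])
k = np.array([1.0, 4.0])                  # distinct momenta for the two partners

for x in np.linspace(0.0, L, 7):
    amp = np.sqrt(lam) * np.exp(1j * k * x) / np.sqrt(L)   # unnormalized <lam_j| components
    p_x = np.abs(amp) ** 2 / np.sum(np.abs(amp) ** 2)      # conditional probabilities p_j(x)
    assert np.allclose(p_x, lam)          # never redistributed: always the eigenvalues
```

The phases e^{i k_j x} do vary with x, but a maximum-entropy GQS with D = 2(d_S − 1) requires the probabilities to spread over the simplex as well, which this measurement cannot achieve.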

If we measure in a different basis {|l⃗⟩}, Equation (8) gives the functions p_j(l⃗) and ϕ_j(l⃗). This, together with the density q(l⃗) of the outcomes, specifies the ensemble.
Finding the exact conditions that lead to µ = µ^{2(d_S−1)}_ME involves solving a complex inverse problem. However, what we have done so far allows us to understand the real mechanism behind its emergence. First, the ϕ_j(l⃗) must be uniformly distributed: they must be random phases. Second, the distribution of p⃗ must be of exponential form. The first condition can always be ensured by choosing some change-of-basis kernel u_l⃗(x⃗) and then multiplying it by pseudorandom phases, generated in a way that is completely independent of p⃗. This can always be done without breaking unitarity, via u_l⃗(x⃗) → u_l⃗(x⃗) e^{iθ_l⃗}. This guarantees that the marginal distribution over the phases is uniform and that the density q(p⃗, ϕ⃗) becomes a product of its marginals, since the distribution of the ϕ⃗ has been built to be independent of everything else:

q(p⃗, ϕ⃗) = f(p⃗) · unif_ϕ⃗(T^{d_S−1}) .

Then, in order for q(p⃗, ϕ⃗) to be the maximum entropy one, we need

f(p⃗) = q(l⃗(p⃗)) |det J|⁻¹ = e^{−τ⃗ · p⃗} / Q(τ⃗) ,

where J is the Jacobian matrix of the coordinate change l⃗ → p⃗(l⃗). Checking that this form leads to the right distribution is simply a matter of coordinate changes. Alternatively, it can be seen by repeated use of the Laplace transform on the simplex. We now see the mechanism at play in a concrete way and how it leads to the maximum entropy GQS µ^{2(d_S−1)}_ME. Splitting the outcome variables as l⃗ = (a⃗, b⃗), we can let a⃗ determine the probabilities, p_j = p_j(a⃗), and b⃗ determine the phases, ϕ_j = B_j(b⃗), for example by choosing the B_j(b⃗) to be linear functions. Then, choosing A_j(a⃗) such that ∫ da⃗ A_j(a⃗) = 1/M guarantees completeness. With this choice, the probability density q(a⃗, b⃗) can be written as a product of two probability densities, one for a⃗ and one for b⃗, and the GQS becomes a product of two densities: one over the probability simplex (for p⃗) and another over the phases (for ϕ⃗). These are, basically, the formulas for two changes of variables in integrals: a⃗ → p⃗(a⃗) and b⃗ → ϕ⃗(b⃗). Since these are invertible, we can confirm what we understood before: the exponential form is obtained when q_a(a⃗(p⃗)) |det J_ap|⁻¹ = e^{−τ⃗ · p⃗}/Q(τ⃗), with J_ap being the Jacobian matrix of the change of variables a⃗ → p⃗(a⃗).

Stationary distribution of some dynamics. A second mechanism that can lead to the emergence of an ensemble with D = 2(d_S − 1) is time averaging. Indeed, if we are in a nonequilibrium dynamical situation in which the system and its environment jointly evolve with a dynamical (possibly unitary) law, the conditional ensembles µ(t) depend on time.
To study stationary behavior from dynamics one looks at the time-averaged ensemble µ̄ = lim_{T→∞} (1/T) ∫_0^T µ(t) dt which, in this case, has a stationary density matrix ρ_ss = ρ̄(t). Unless something peculiar happens, we expect the ensemble to cover large regions of the full state space, leading to a stationary GQS with D = 2(d_S − 1) and a given density matrix ρ_ss. Intuitively, we expect dynamics that are chaotic in quantum state space to lead to ensembles described by µ_ME. This is because the ensemble that emerges must be compatible with the density matrix ρ_ss, while still exhibiting nontrivial dynamics due to the action of the environment. We now give a simple example of how this happens. Borrowing from geometric quantum thermodynamics, in Ref. [7] we studied a qubit coupled to a Caldeira-Leggett-like environment. The resulting evolution of the qubit can be described by a stochastic Schrödinger equation which, as shown in Ref. [7], leads to a maximum entropy ensemble (see Eq. (3)) of the required type.
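The time-averaging mechanism can be illustrated with a minimal sketch of ours, using a hypothetical random system-environment Hamiltonian rather than the Caldeira-Leggett model of Ref. [7]: the time-averaged reduced state of a jointly evolving system converges to the dephased stationary state ρ_ss.

```python
import numpy as np

rng = np.random.default_rng(1)

# Joint unitary dynamics of a qubit system + small environment, generated by
# a random Hermitian Hamiltonian (a stand-in for generic interacting dynamics).
d_S, d_E = 2, 8
A = rng.normal(size=(d_S*d_E, d_S*d_E)) + 1j*rng.normal(size=(d_S*d_E, d_S*d_E))
H = (A + A.conj().T) / 2
E, V = np.linalg.eigh(H)

psi0 = np.zeros(d_S*d_E, complex)
psi0[0] = 1.0
c = V.conj().T @ psi0                        # amplitudes in the energy eigenbasis

# Time-average the reduced state of the system over a long trajectory.
ts = rng.uniform(0.0, 2000.0, size=4000)
rho_bar = np.zeros((d_S, d_S), complex)
for t in ts:
    psi_t = V @ (np.exp(-1j*E*t) * c)
    m = psi_t.reshape(d_S, d_E)
    rho_bar += (m @ m.conj().T) / len(ts)    # partial trace over the environment

# Dephasing prediction: rho_ss = sum_n |c_n|^2 Tr_E |E_n><E_n|.
rho_pred = np.zeros((d_S, d_S), complex)
for n in range(d_S*d_E):
    m = V[:, n].reshape(d_S, d_E)
    rho_pred += np.abs(c[n])**2 * (m @ m.conj().T)

print(np.round(np.abs(rho_bar - rho_pred), 3))
```

For a nondegenerate spectrum the cross terms e^{-i(E_n−E_m)t} average away, so the finite-time average approaches ρ_ss at a rate controlled by the smallest energy gap.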

Emergence of µ^{d_S−1}_{ME}
Among all possible values of D, a third one that is particularly relevant is D = d_S − 1, half the maximum value. Its importance comes from the symplectic nature of the quantum state space and, ultimately, from dynamics. One physical situation in which µ^{d_S−1}_{ME} emerges naturally is the dynamics of pure, isolated quantum systems. The phenomenology we discuss here is known, being intimately related to thermalization and equilibration studies; we discuss it only in connection with the maximum geometric entropy principle introduced in Section IV.

Imagine an isolated quantum system in a pure state |ψ_0⟩ evolving unitarily under a time-independent Hamiltonian H = Σ_n E_n |E_n⟩⟨E_n|. Assuming a nondegenerate energy spectrum, the dynamics is |ψ_t⟩ = Σ_{n=1}^{D} √(p^0_n) e^{i(ϕ^0_n − E_n t)} |E_n⟩, where √(p^0_n) e^{iϕ^0_n} := ⟨E_n|ψ_0⟩ and we have used symplectic coordinates in the energy eigenbasis. Since the ⃗p ∈ Δ^{D−1} are conserved quantities, p^t_n = p^0_n, while the ⃗ϕ ∈ T^{D−1} evolve independently and linearly, ϕ^t_n = ϕ^0_n − E_n t, on a high-dimensional torus, a sufficient condition for the emergence of ergodicity on T^{D−1} is the so-called non-resonance condition: the energy gaps have to be nondegenerate, namely E_n − E_m = E_j − E_k only if (n, m) = (j, k) or (n = m and j = k). This condition is usually true for interacting many-body quantum systems. If it holds, the evolution of the phases is ergodic on T^{D−1}; this was first proven by von Neumann [42]. Calling δ_{⃗p^0} the corresponding Dirac measure on the quantum state space, we have µ̄ = δ_{⃗p^0} ⊗ unif_{⃗ϕ}(T^{D−1}), where unif_{⃗ϕ}(T^{D−1}) is the uniform measure on T^{D−1}, in which all the ϕ_n are uniformly and independently distributed on the circle. It is not too hard to see that this is the maximum geometric entropy ensemble with D = d_S − 1, compatible with the fact that the occupations of the energy eigenstates are all conserved quantities: p^t_n = p^0_n. Indeed, these d_S − 1 constraints provide d_S − 1 independent equations, reducing the region of state space that the system explores to the high-dimensional torus T^{D−1}. On this torus, however, the dynamics is ergodic and the resulting stationary measure is the uniform one. By definition, this is the measure with the highest possible value of geometric entropy, since its density is uniform and equal to the inverse of the torus volume.
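The ergodicity of the linear phase flow under the non-resonance condition can be checked numerically. This minimal sketch is ours; the spectrum is an arbitrary choice with nondegenerate gaps. It samples ϕ_n(t) = ϕ^0_n − E_n t over a long time window and verifies that the empirical phase distribution is uniform on the torus, i.e. that its first Fourier modes vanish:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical nondegenerate spectrum with nondegenerate gaps (non-resonance).
E = np.array([1.0, 1.37, 1.91, 2.53])
D = len(E)
phi0 = rng.uniform(0.0, 2*np.pi, size=D)

# Sample the linear flow phi_n(t) = phi_n^0 - E_n t at random times.
ts = rng.uniform(0.0, 1e5, size=100_000)
phases = (phi0[None, :] - np.outer(ts, E)) % (2*np.pi)

# Uniformity on the torus: first Fourier modes of the empirical distribution
# vanish, both for each phase and for phase differences.
print(np.abs(np.exp(1j*phases).mean(axis=0)))
print(abs(np.exp(1j*(phases[:, 0] - phases[:, 1])).mean()))
```

With degenerate gaps (e.g. an equally spaced spectrum) the phase differences would not equidistribute, which is exactly what the non-resonance condition rules out.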

Comment on the generic µ^{D}_{ME}
To have a GQS with generic information dimension D result from a conditional measurement on an environment, we must condition on at least D continuous variables with the cardinality of the reals. This can be achieved either via measurements in an overcomplete basis, such as coherent states, or via projective measurements on an infinite-dimensional environment with at least D real coordinates. This condition is necessary, but not sufficient, to guarantee the emergence of the corresponding maximum entropy ensemble µ^{D}_{ME}. While we have seen that the notion of an unbiased basis is relevant when D = 0, we also saw how it falls short in the generic D > 0 case. Understanding this general condition is a nontrivial task that requires a much deeper understanding of how systems encode quantum information in their environment and how it is extracted by means of quantum measurements. Further work in this direction is ongoing and will be reported elsewhere.
For a GQS with arbitrary dimension D to emerge as a stationary distribution on a quantum state space of dimension 2(d_S − 1), we likely need 2(d_S − 1) − D independent equations constraining the dynamics. That is, provided D is an integer: due to the continuity of time and the smoothness of the time evolution in quantum state space, we expect D ∈ N in the vast majority of cases. If the equations constraining the motion on the quantum state space are linear, then having 2(d_S − 1) − D independent equations is both necessary and sufficient for D to be the quantum information dimension. This, however, says virtually nothing about the maximization of the relevant geometric entropy h_D. Moreover, constraints on an open quantum system can take very generic forms, and the relevant equations will not always be linear. An explicit example where such an ensemble can be found constructively is the case of an isolated quantum system: the conditions for the emergence of a maximum entropy µ^{d_S−1}_{ME} are known, being equivalent to the known conditions for the ergodicity of periodic dynamics on a high-dimensional torus.

VI. CONCLUSION
While a density matrix encodes all the statistics available from performing measurements on a system, it does not reveal how those statistics were generated, and infinitely many possibilities are available. A natural way to select a unique ensemble behind a given density matrix is to approach the problem from the perspective of information theory. The issue then becomes a standard inference problem, to which we can apply the maximum entropy principle. Properly formulating the problem in this way requires a proper way to compute the entropy of an ensemble. While this is trivial for ensembles with a finite number of elements, it is not for continuous ensembles. The correct answer, the notion of Geometric Quantum Entropy h_D, was given in Ref. [34]. This, however, depends strongly on another quantity that characterizes the ensemble: the quantum information dimension D. Consequently, we formulated the maximum geometric entropy principle at fixed quantum information dimension. This is a one-parameter class of maximum entropy principles, labeled by D, that can be used to explore the various ways ensembles give rise to a specific density matrix.
As often happens with inference principles, the generic optimization problem can be hard to solve. Here, however, we solved a number of cases in which the ensemble can be found analytically. We also explored the physical mechanisms responsible for the emergence of µ^{D}_{ME}. Two different classes of situations were considered: (i) conditional ensembles, resulting from measuring the environment of the system of interest, and (ii) stationary distributions, in which the statistics arise from aggregating data over time. We also identified and discussed various instances where both mechanisms lead to a maximum entropy ensemble.

Maximum Geometric Quantum Entropy
Supplemental Material

Fabio Anza and James P. Crutchfield

Since the λ_αβ are the Lagrange multipliers of the constraints C_αβ, we choose them to be Hermitian, as they are not all independent. Thus, we can always diagonalize them with a unitary matrix U. This allows us to define auxiliary integration variables X_γ = Σ_α U_γα Z_α with which, using (U†U)_αβ = δ_αβ, we can express the quadratic form in the exponent of the integrand in diagonal form. Moreover, recalling that the Fubini-Study volume element is invariant under unitary transformations, we can simply adapt our coordinate system to the X_a, writing X_a = √(q_a) e^{iν_a}. This gives dV_FS = Π_{a=1}^{D−1} (dq_a dν_a)/2, and we arrive at a simpler functional. Exploiting the linearity of the inverse Laplace transform, together with a basic transform result, and recalling that the l_a are linear functions of the true matrix elements, we arrive at the final expression for the partition function.

Appendix B: Calculating Lagrange Multipliers
Given the expression of the partition function, we now show that the value of the Lagrange multipliers γ_jk can be obtained analytically, by extending the Laplace transform technique exploited in Appendix A.
The nonlinear equation fixing the γ_jk is treated as follows. We use the eigenbasis of ρ as reference basis and (⃗p, ⃗ϕ) as coordinate system. This means that only the diagonal elements γ_jj enter the equation. To compute it we use the same Laplace transform technique as before, with a minor adaptation: first we pass to the single-label notation γ_jj → l_j, and then define the relevant integrals, obtaining an expression in which z_n = −l_n.
This can be written again as a sum, using a partial fraction decomposition with residues R_j. Exploiting the basic Laplace transform result for simple poles, we can then invert the relation and compute J_{D−1}(r = 1) as a combination of terms of the form R_j e^{z_j r}, together with a contribution linear in r coming from a repeated pole.
The Lagrange multipliers γ_j can then be fixed by solving the resulting system of equations, where the λ_k are the eigenvalues of the density matrix: ρ_jk = δ_jk λ_k.
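The Laplace-transform and partial-fraction machinery can be checked numerically. Below is a minimal sketch of ours, with an arbitrary choice of distinct l_n: the simplex integral J(1) = ∫ δ(Σ_n p_n − 1) e^{−Σ_n l_n p_n} d^D p has Laplace transform Π_n (l_n + z)^{−1}, whose partial-fraction inversion gives J(1) = Σ_j e^{−l_j} Π_{k≠j} (l_k − l_j)^{−1}. We compare the closed form against Monte Carlo sampling of the simplex:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(3)

l = np.array([0.7, 1.9, 3.2, 5.5])   # distinct "rates" l_n (assumed nondegenerate)
D = len(l)

# Closed form from inverse Laplace + partial fractions:
# J(1) = sum_j e^{-l_j} * prod_{k != j} 1/(l_k - l_j)
J_exact = sum(np.exp(-l[j]) / np.prod([l[k] - l[j] for k in range(D) if k != j])
              for j in range(D))

# Monte Carlo over the simplex: J(1) = (1/(D-1)!) * E_unif[e^{-l.p}],
# since the delta-constrained volume of the simplex is 1/(D-1)!.
p = rng.dirichlet(np.ones(D), size=400_000)
J_mc = np.exp(-(p @ l)).mean() / factorial(D - 1)

print(J_exact, J_mc)
```

The partial-fraction formula is valid only for nondegenerate l_n; degenerate values produce repeated poles and the extra terms polynomial in r mentioned above.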

√(p_k) |χ_k⟩ = Σ_j M_kj √(λ_j) |λ_j⟩. Equivalently, one can generate ensembles by applying a generic d × d unitary matrix U to a list of d nonnormalized d_S-dimensional states in which the first d_S, {√(λ_j) |λ_j⟩}_{j=1}^{d_S}, are proportional to the eigenvectors of ρ, while the remaining d − d_S are simply null vectors. For the resulting states we introduce vector notation (⃗p, ⃗ϕ), with ⃗p ∈ Δ^{d_S−1} and ⃗ϕ ∈ T^{d_S−1}, where Δ^{d_S−1} is the (d_S − 1)-dimensional probability simplex and T^{d_S−1} is the (d_S − 1)-dimensional torus. Analogously, we introduce the Dirac measures δ^{⃗x}_{⃗p} and δ^{⃗φ}_{⃗ϕ} with support on ⃗x ∈ Δ^{d_S−1} and ⃗φ ∈ T^{d_S−1}, respectively.
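This generation procedure is easy to verify numerically. The following minimal sketch (our illustration, not code from the paper) applies a random d × d unitary to the padded list of states {√(λ_j) |λ_j⟩} plus null vectors, and checks that the resulting weighted ensemble always reproduces ρ:

```python
import numpy as np

rng = np.random.default_rng(5)

lam = np.array([0.6, 0.4])                  # eigenvalues of a qubit rho
d_S, d = len(lam), 6                        # d >= d_S: size of the generated ensemble

# d nonnormalized d_S-dimensional states (rows of W): the first d_S are
# sqrt(lam_j)|lam_j>, the remaining d - d_S are null vectors.
W = np.zeros((d, d_S), complex)
W[np.arange(d_S), np.arange(d_S)] = np.sqrt(lam)

# Apply a generic d x d unitary to the list (QR of a random complex matrix).
A = rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d))
U, _ = np.linalg.qr(A)
tilde = U @ W                               # rows: unnormalized ensemble members

p = np.sum(np.abs(tilde)**2, axis=1)        # ensemble weights p_k
rho_rec = np.einsum('ka,kb->ab', tilde, tilde.conj())
print(np.round(rho_rec.real, 3))            # -> diag(lam), for any unitary U
```

Because the columns of U are orthonormal, Σ_k |ψ̃_k⟩⟨ψ̃_k| = W†W = diag(λ) holds exactly, independently of which unitary is drawn.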

Finite Environments: D = 0
If D = 0, the support of the ensemble consists of a finite number of points: there exists N ∈ N such that µ^{D=0}_{ME} = Σ_{α=1}^{N} p_α δ_{χ_α}, with h_0 = −Σ_{α=1}^{N} p_α log p_α. Note that this is the HJW theorem's domain of applicability, which allows us to give a constructive solution.
We can use the HJW theorem with the interpretation in which the ensemble is the conditional ensemble. Here, the p_α and χ_α are generated by creating a purification of dimension N, in which the first d_S elements of the basis {|SP_j⟩}_{j=1}^{d_S} are fixed and the remaining N − d_S are free. We denote the entire basis of this type with the same symbol but a different label: {|SP_α⟩}_{α=1}^{N}. Measuring in this basis yields the eigenensemble L(ρ). Measuring in a different basis {|v_α⟩}, however, yields a general ensemble, with probabilities p_α = ⟨ψ(ρ)| I_S ⊗ |v_α⟩⟨v_α| |ψ(ρ)⟩ = Σ_{j=1}^{d_S} λ_j |⟨SP_j|v_α⟩|² and states |χ_α⟩ = Σ_{j=1}^{d_S} √(λ_j) (⟨v_α|SP_j⟩/√(p_α)) |λ_j⟩. With h_0 = −Σ_α p_α log p_α, the absolute maximum is attained at p_α = 1/N. We now show, constructively, that this is always achievable while still satisfying the constraints C_1 = 0 and C^ρ_{ij} = 0, thus solving the maximization problem.
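The conditional-ensemble construction can be checked directly. In this minimal sketch (our illustration; the basis choices are hypothetical), we purify ρ into an N-dimensional environment, measure the environment in a random basis to obtain {p_α, |χ_α⟩}, verify that the ensemble reproduces ρ, and confirm that an unbiased (Fourier) basis attains the flat distribution p_α = 1/N:

```python
import numpy as np

rng = np.random.default_rng(4)

lam = np.array([0.5, 0.3, 0.2])           # eigenvalues of rho
d_S, N = len(lam), 5                      # purification dimension N >= d_S

# Purification |psi> = sum_j sqrt(lam_j) |j>_S |j>_E, in the eigenbasis of rho.
psi = np.zeros((d_S, N), complex)
psi[np.arange(d_S), np.arange(d_S)] = np.sqrt(lam)

# Measure the environment in a basis {|v_a>} given by a random unitary.
A = rng.normal(size=(N, N)) + 1j*rng.normal(size=(N, N))
V, _ = np.linalg.qr(A)                    # columns: measurement basis vectors

chi = psi @ V.conj()                      # unnormalized conditional states
p = np.sum(np.abs(chi)**2, axis=0)        # outcome probabilities p_a

# Every such ensemble reproduces rho: sum_a |chi_a><chi_a| = rho.
rho_rec = chi @ chi.conj().T
print(np.round(rho_rec.real, 3))          # -> diag(lam)

# An unbiased basis (|<SP_j|v_a>|^2 = 1/N, e.g. Fourier) gives p_a = 1/N.
F = np.exp(2j*np.pi*np.outer(np.arange(N), np.arange(N))/N) / np.sqrt(N)
p_flat = np.sum(np.abs(psi @ F.conj())**2, axis=0)
print(np.round(p_flat, 3))                # -> [0.2 0.2 0.2 0.2 0.2]
```

The reconstruction Σ_α |χ_α⟩⟨χ_α| = ρ holds for any environment basis; only the weights p_α depend on the basis, and the unbiased choice flattens them completely.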
If we now choose a basis of the environment, {|v_α⟩}_{α=1}^{N}, to study the conditional ensemble, and this basis carries very little information about the {|SP_j⟩}_{j=1}^{d_S}, there is a very high chance that we will end up very close to the unbiasedness condition.

∂J̃/∂l_k = [Π_{n=1}^{D} ∫_0^∞ dp_n e^{−(l_n+z)p_n}] × (∫_0^∞ dp_k (−p_k) e^{−(l_k+z)p_k}) / (∫_0^∞ dp_k e^{−(l_k+z)p_k}) = [Π_{n=1}^{D} G_n(z)] ∂ log G_k(z)/∂l_k