Quantum State Assignment Flows

This paper introduces assignment flows for density matrices as state spaces for representation and analysis of data associated with vertices of an underlying weighted graph. Determining an assignment flow by geometric integration of the defining dynamical system causes an interaction of the non-commuting states across the graph, and the assignment of a pure (rank-one) state to each vertex after convergence. Adopting the Riemannian Bogoliubov-Kubo-Mori (BKM) metric from information geometry leads to closed-form local expressions that can be computed efficiently and implemented in a fine-grained parallel manner. Restriction to the submanifold of commuting density matrices recovers the assignment flows for categorical probability distributions, which merely assign labels from a finite set to each data point. As shown for these flows in our prior work, the novel class of quantum state assignment flows can also be characterized as Riemannian gradient flows with respect to a non-local, non-convex potential, after proper reparametrization and under mild conditions on the underlying weight function. This weight function generates the parameters of the layers of a neural network corresponding to, and generated by, each step of the geometric integration scheme. Numerical results indicate and illustrate the potential of the novel approach for data representation and analysis, including the representation of correlations of data across the graph by entanglement and tensorization.


INTRODUCTION
1.1. Overview and Motivation. A basic task of data analysis is the categorization of observed data. We consider the following scenario: on a given undirected, weighted graph G = (V, E, w), data D_i ∈ X are observed as points in a metric space (X, d_X) at each vertex i ∈ V. Categorization means to determine an assignment
D_i → j ∈ {1, . . . , c} =: [c]    (1.1)
of a class label j out of a finite set of labels to each data point D_i. Depending on the application, labels carry a specific meaning, e.g. the type of tissue in medical image data, the object type in computer vision, or land use in remote sensing data. The decision at any vertex typically depends on decisions at other vertices. Thus the overall task of labeling data on a graph constitutes a particular form of structured prediction in the field of machine learning [BHS+07].
Assignment flows denote a particular class of approaches for data labeling on graphs [ÅPSS17, Sch20]. The basic idea is to represent each possible label assignment at vertex i ∈ V by an assignment vector S_i ∈ ∆_c in the standard probability simplex, whose vertices encode the unique assignment of every label by the corresponding unit vector e_j, j ∈ [c]. Data labeling is accomplished by computing the flow S(t) of the dynamical system
Ṡ = R_S[ΩS],  S(0) = S_0,    (1.2)
with the row-stochastic matrix S(t), with row vectors S_i(t), as state; under mild conditions, the flow converges to unique label assignment vectors (unit vectors) at every vertex i ∈ V [ZZS21]. The vector field on the right-hand side of (1.2) is parametrized by parameters collected in a matrix Ω. These parameters strongly affect the contextual label assignments. They can be learned from data in order to take into account typical relations of the data in the field of application at hand [HSPS21]. For a demonstration of this approach on a challenging medical imaging problem, we refer to [SBS21].
From a geometric viewpoint, the system (1.2) can be characterized as a collection of individual flows S_i(t) at each vertex which are coupled by the parameters Ω. Each individual flow is determined by a replicator equation, a basic class of dynamical systems known from evolutionary game theory [HS03, San10]. By restricting each vector S_i(t) to the relative interior of the probability simplex ∆_c (i.e. the set of strictly positive discrete probability vectors), and by turning this convex set into a statistical manifold equipped with the Fisher-Rao geometry [AN00], the assignment flow (1.2) becomes a Riemannian ascent flow on the corresponding product manifold. The underlying information geometry is not only important for making the flow converge to unique label assignments, but also for the design of efficient algorithms that actually determine the assignments [ZSPS20]. For extensions of the basic assignment flow approach to unsupervised scenarios of machine learning, and for an in-depth discussion of connections to other closely related work on structured prediction on graphs, we refer to [ZZPS20a, ZZPS20b] and [SBS23b], respectively.
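The coupled replicator dynamics (1.2) can be sketched numerically in a few lines. The following is a minimal illustration only: the uniform coupling matrix Ω, the toy dimensions, the Euler scheme and the step size are illustrative assumptions, not part of the approach itself.

```python
import numpy as np

def replicator(S, V):
    # Row-wise replicator map: (R_S[V])_i = S_i * V_i - <S_i, V_i> S_i
    return S * V - (S * V).sum(axis=1, keepdims=True) * S

rng = np.random.default_rng(0)
Omega = np.full((4, 4), 0.25)               # uniform averaging weights (toy choice)
S = rng.dirichlet(np.ones(3), size=4)       # random interior initialization
for _ in range(5000):
    S += 0.05 * replicator(S, Omega @ S)    # explicit Euler step for S' = R_S[Omega S]

# Rows stay on the simplex and converge to (near-)unit label assignment vectors.
print(S.round(3))
```

Note that the Euler step preserves the row sums exactly, since the replicator map is tangent to the simplex; dedicated geometric integration schemes (cf. [ZSPS20]) are preferable in practice.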
In this paper, we study a novel and substantial generalization of assignment flows from a different point of view: assignment of labels to metric data where the labels are elements of a continuous set. This requires replacing the simplex ∆_c as state space, since it can only represent assignments of labels from a finite set. The substitutes for the assignment vectors S_i, i ∈ V, are Hermitian positive definite density matrices ρ_i, i ∈ V, with unit trace,
D_c = {ρ ∈ C^{c×c} : ρ = ρ*, ρ ≻ 0, tr ρ = 1}.    (1.3)
Accordingly, the finite set of unit vectors e_j, j ∈ [c] (vertices of ∆_c) is replaced by rank-one density matrices ρ_∞, a.k.a. pure states in quantum mechanics [BZ17]. The resulting quantum state assignment flow (QSAF) has a form similar to (1.2), due to adopting the same design strategy: the system (1.4) couples the individual evolutions ρ_i(t) at each vertex i ∈ V through parameters Ω, and the underlying information geometry causes convergence of each ρ_i(t) towards a pure state. Using the different state space D_c (rather than ∆_c in (1.2)) requires adopting a different Riemannian metric, which results in a corresponding definition of the operator R_ρ.
Our approach is natural in that restricting (1.4) to diagonal density matrices results in (1.2), after identifying each vector diag(ρ_i) of diagonal entries of the density matrix ρ_i with an assignment vector S_i ∈ ∆_c. Conversely, (1.4) considerably generalizes (1.2) and enhances modelling expressivity, due to the noncommutative interaction of the states ρ_i, i ∈ V, across the underlying graph G when the quantum state assignment flow is computed by applying geometric numerical integration to (1.4).
We regard our approach merely as a method for data representation and analysis, rather than as a contribution to quantum mechanics. For example, the dynamics (1.4) clearly differs from the Hamiltonian evolution of quantum systems. Yet we adopt the term 'quantum state', since not only density matrices as state spaces, but also the related information geometry, have been largely motivated by quantum mechanics and quantum information theory [AN00, Pet08].

1.2. Contribution and Organization. Section 2 summarizes the information geometry of both the statistical manifold of categorical distributions and the manifold of strictly positive definite density matrices. Section 3 summarizes the assignment flow approach (1.2), as a reference for the subsequent generalization to (1.4). This generalization is the main contribution of this paper and is presented in Section 4. Each row of the table below specifies the section where an increasingly general version of the original assignment flow (left column) is generalized to the corresponding quantum state assignment flow (right column, same row).

Assignment Flow (AF)                      | Quantum State AF (QSAF)
single-vertex AF (Section 3.1)            | single-vertex QSAF (Section 4.2)
AF approach (Section 3.2)                 | QSAF approach (Section 4.3)
Riemannian gradient AF (Section 3.3)      | Riemannian gradient QSAF (Section 4.4)
recovery of the AF from the QSAF by restriction (Section 4.5)

Alternative metrics on the positive definite matrix manifold which have been used in the literature are reviewed in Section 2.3, in order to position our approach also from this point of view. A few academic experiments illustrate properties of the novel approach in Section 5. Working out a particular scenario of data analysis is beyond the scope of this paper. We conclude and indicate directions of further work in Section 6. In order not to compromise the reading flow, proofs are collected in Section 7.
This paper considerably elaborates the short preliminary conference version [SBS + 23a].
1.3. Basic Notation. For the reader's convenience, we specify below the basic notation and notational conventions used in this paper.
Diag(v)   the diagonal matrix with vector v as its diagonal entries
diag(V)   the vector of the diagonal entries of a square matrix V
exp_m     the matrix exponential
S_c       the relative interior of ∆_c, i.e. the set of strictly positive probability vectors
P_c       the set of symmetric positive definite c × c matrices (cf. (2.12))
D_c       the subset of matrices in P_c whose trace is equal to one

INFORMATION GEOMETRY
Information geometry [Ama85, Lau87] is concerned with the representation of parametric probability distributions from a geometric viewpoint, e.g. the exponential family of distributions [Bro86]. Specifically, an open convex set M of parameters of a probability distribution becomes a Riemannian manifold (M, g) when equipped with a Riemannian metric g. The Fisher-Rao metric is the canonical choice, due to its invariance properties with respect to reparametrization [Čen81]. A closely related scenario concerns the representation of the interiors of compact convex bodies as Riemannian manifolds (M, g), due to the correspondence between compactly supported Borel probability measures and affine equivalence classes of convex bodies [BGVV14].
A key ingredient of information geometry is the so-called α-family of affine connections introduced by Amari [Ama85], which comprises the e-connection ∇ and the m-connection ∇* as special cases. These connections are torsion-free and dual to each other in the sense that they jointly satisfy the equation
Z g(X, Y) = g(∇_Z X, Y) + g(X, ∇*_Z Y),
which, for ∇ = ∇*, uniquely characterizes the Levi-Civita connection as the metric connection [Ama85, Def. 3.1, Thm. 3.1]. Regarding numerical computations, working with the exponential map induced by the e-connection is particularly convenient, since its domain is the entire tangent space. We refer to [AN00, CU14, AJLS17] for further reading, and to [Pet94], [AN00, Ch. 7] for the specific case of quantum state spaces.
In this paper, we are concerned with two classes of convex sets:
• the relative interior of probability simplices, each of which represents the categorical (discrete) distributions of the corresponding dimension, and
• the set of positive definite symmetric matrices with trace one.
Sections 2.1 and 2.2 introduce the information geometry for the former and the latter class of sets, respectively.
2.1. Categorical Distributions. We set 1_c := (1, . . . , 1)^T ∈ R^c and denote the probability simplex of distributions on [c] by
∆_c := {p ∈ R^c : p ≥ 0, ⟨1_c, p⟩ = 1}.
Its relative interior S_c, equipped with the Fisher-Rao metric
g_p(u, v) := Σ_{j∈[c]} (u_j v_j)/p_j,  p ∈ S_c,  u, v ∈ T_{c,0},    (2.4)
becomes the Riemannian manifold (S_c, g), with trivial tangent bundle T S_c = S_c × T_{c,0} and the tangent space
T_{c,0} := {v ∈ R^c : ⟨1_c, v⟩ = 0}.    (2.6)
The orthogonal projection onto T_{c,0} is denoted by
π_{c,0} : R^c → T_{c,0},  π_{c,0} v := v − (1/c)⟨1_c, v⟩ 1_c.    (2.7)
The mapping defined next plays a major role in all dynamical systems under consideration in this paper.
Definition 2.1 (replicator operator). The replicator operator is the linear mapping of the tangent space
R_p : T_{c,0} → T_{c,0},  R_p := Diag(p) − p p^T,    (2.8)
parametrized by p ∈ S_c.
The name 'replicator' is due to the role of this mapping in evolutionary game theory; see Remark 3.1 on page 9. The replicator operator satisfies
R_p 1_c = 0,  π_{c,0} ∘ R_p = R_p = R_p ∘ π_{c,0}.    (2.9)

Proposition 2.2 (Riemannian gradient). Let f : S_c → R be a smooth function with ordinary gradient ∂f in the ambient coordinates. Then the Riemannian gradient of f with respect to the Fisher-Rao metric (2.4) is given by
grad f(p) = R_p ∂f(p).    (2.10)
Proof. Appendix 7.1.

Remark 2.3. Equations (2.10) and (7.12), respectively, show that the replicator operator R_p represents the inverse metric tensor with respect to the Fisher-Rao metric (2.4), expressed in the ambient coordinates.
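As a quick numerical illustration of Definition 2.1 (a sketch only; the dimension and test vectors below are arbitrary), the replicator operator R_p = Diag(p) − p p^T maps any vector into the tangent space T_{c,0}, annihilates constant vectors, and is unaffected by the tangent-space projection:

```python
import numpy as np

def R(p, v):
    # Replicator operator: R_p[v] = Diag(p) v - <p, v> p
    return p * v - np.dot(p, v) * p

rng = np.random.default_rng(1)
p = rng.dirichlet(np.ones(5))     # a point in the relative interior of the simplex
v = rng.normal(size=5)            # an arbitrary ambient vector

u = R(p, v)
print(u.sum())                    # ~ 0: u lies in the tangent space T_{5,0}
print(R(p, np.ones(5)))           # ~ 0: constant vectors are annihilated
```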
The exponential map induced by the e-connection is defined on the entire space T_{c,0} and reads [AJLS17]
Exp : S_c × T_{c,0} → S_c,  Exp_p(v) = (p ⊙ e^{v/p}) / ⟨p, e^{v/p}⟩,    (2.11)
where multiplication, division and exponentiation apply componentwise.

2.2. Density Matrices. We denote the open convex cone of positive definite matrices by
P_c := {A ∈ C^{c×c} : A = A*, A ≻ 0}    (2.12)
and the manifold of strictly positive definite density matrices by
D_c := {ρ ∈ P_c : tr ρ = 1}.    (2.13)
D_c is the intersection of P_c and the hyperplane defined by the trace-one constraint. Its closure is convex and compact. Without loss of generality, we can identify D_c with the space of invertible density operators, in the sense of quantum mechanics, on the finite-dimensional Hilbert space C^c. Any matrix ensemble of the form
M = {M_1, . . . , M_n},  M_i ⪰ 0,  Σ_{i∈[n]} M_i = I,    (2.14)
induces a probability distribution on [n] via the Born rule
p_i = tr(ρ M_i),  i ∈ [n];    (2.15)
(2.14) is called a positive operator valued measure (POVM). We refer to [BZ17] for the physical background, and to [BH20] and references therein for the mathematical background. The analog of (2.6) is the tangent space which, at any point ρ ∈ D_c, is equal to the space of trace-less Hermitian matrices
H_{c,0} := {X ∈ C^{c×c} : X = X*, tr X = 0}.    (2.16a)
The manifold D_c therefore has the trivial tangent bundle T D_c = D_c × H_{c,0}, with the tangent space H_{c,0} = T_{1_{D_c}} D_c defined in equation (2.16a). The corresponding orthogonal projection onto the tangent space H_{c,0} reads
π_{c,0}[X] := X − (tr X / c) I.
Equipping the manifold D_c as defined in equation (2.13) with the Bogoliubov-Kubo-Mori (BKM) metric [PT93] results in a Riemannian manifold (D_c, g). Using T_ρ D_c = H_{c,0}, this metric can be expressed by
g_ρ(X, Y) := ∫_0^∞ tr[X (ρ + λ I)^{−1} Y (ρ + λ I)^{−1}] dλ,  X, Y ∈ H_{c,0}.    (2.19)
This metric uniquely ensures the existence of a symmetric e-connection ∇ on D_c that is mutually dual to its m-connection ∇* in the sense of information geometry, leading to the dually flat structure (g, ∇, ∇*) [GS01], [AN00, Thm. 7.1].
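The integral representation of the BKM metric can be evaluated in closed form in the eigenbasis of ρ, using the divided difference c(a, b) = (log a − log b)/(a − b) (with c(a, a) = 1/a). The sketch below cross-checks this closed form against direct numerical quadrature; all concrete matrices are random test data, not quantities from the paper.

```python
import numpy as np
from scipy.integrate import quad

def bkm(rho, X, Y):
    # Closed form: g_rho(X, Y) = sum_ij X'_ij Y'_ji c(l_i, l_j) in the eigenbasis of rho
    lam, U = np.linalg.eigh(rho)
    Xp, Yp = U.T @ X @ U, U.T @ Y @ U
    n = len(lam)
    c = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            c[i, j] = (1.0 / lam[i] if np.isclose(lam[i], lam[j])
                       else (np.log(lam[i]) - np.log(lam[j])) / (lam[i] - lam[j]))
    return float(np.sum(Xp * Yp.T * c))

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
rho = A @ A.T + 0.1 * np.eye(3)
rho /= np.trace(rho)                          # random density matrix

def tangent():
    B = rng.normal(size=(3, 3)); B = B + B.T
    return B - (np.trace(B) / 3) * np.eye(3)  # symmetric, trace-less
X, Y = tangent(), tangent()

# Numerical quadrature of the defining integral over [0, inf)
val, _ = quad(lambda s: np.trace(
    X @ np.linalg.inv(rho + s * np.eye(3)) @ Y @ np.linalg.inv(rho + s * np.eye(3))),
    0, np.inf)
print(bkm(rho, X, Y), val)                    # the two values agree
```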
The following map and its inverse, defined in terms of the matrix exponential exp_m and its inverse log_m = exp_m^{−1}, will be convenient:
T_ρ[X] := ∫_0^1 ρ^s X ρ^{1−s} ds,    (2.20a)
with inverse
T_ρ^{−1}[Y] = ∫_0^∞ (ρ + λ I)^{−1} Y (ρ + λ I)^{−1} dλ.    (2.20c)
The inner product (2.19) may now be written in the form
g_ρ(X, Y) = tr(X T_ρ^{−1}[Y]),
since the trace is invariant with respect to cyclic permutations of a matrix product as argument. Likewise, g_ρ(X, Y) = tr(T_ρ^{−1}[X] Y). We also consider two subspaces of the tangent space T_ρ D_c, which yield the direct-sum decomposition (2.24) [AN00]. In Section 4.5, we will use this decomposition to recover the assignment flow for categorical distributions from the quantum state assignment flow, by restriction to a submanifold of commuting matrices.

2.3. Alternative Metrics and Geometries. The positive definite matrix manifold P_c has become a tool for data modelling and analysis during the last two decades. Accordingly, a range of Riemannian metrics with varying properties exist. A major subclass is formed by the O(n)-invariant metrics, including the log-Euclidean, affine-invariant, Bures-Wasserstein and Bogoliubov-Kubo-Mori (BKM) metrics. We refer to [TP23] for a comprehensive recent survey.
This section provides a brief comparison of the BKM metric (2.19), adopted in this paper, with two metrics often employed in the literature, the affine-invariant metric and the log-Euclidean metric, which may be regarded as 'antipodal points' in the space of metrics from the geometric and the computational viewpoint, respectively.

2.3.1. Affine-Invariant Metric. The affine-invariant metric has been derived in various ways, e.g. based on the canonical matrix inner product on the tangent space [Bha06, Section 6], or as the Fisher-Rao metric on the statistical manifold of centered multivariate Gaussian densities [Sko84]. The metric is given by
g_ρ(X, Y) = tr(ρ^{−1} X ρ^{−1} Y).    (2.25)
The exponential map with respect to the Levi-Civita connection reads
Exp_ρ(X) = ρ^{1/2} exp_m(ρ^{−1/2} X ρ^{−1/2}) ρ^{1/2}.    (2.27)

2.3.2. Log-Euclidean Metric. The log-Euclidean metric is obtained by pulling back the Euclidean structure via the matrix logarithm log_m. Its exponential map reads
Exp_ρ(X) = exp_m(log_m ρ + d log_m(ρ)[X])    (2.28)
and is much more convenient from the computational viewpoint. Endowed with this metric, the space P_c is isometric to a Euclidean space. The log-Euclidean metric is flat and merely invariant under orthogonal transformations and dilations [TP23].
as will be shown below in Remark 4.11. Here, the left-hand side of (2.31a) is the exponential map (2.28) induced by the log-Euclidean metric, and Exp^{(e)}_ρ is the exponential map with respect to the affine e-connection of information geometry, as detailed below in Proposition 4.6. This close relationship of the e-exponential map Exp^{(e)}_ρ to the exponential map of the log-Euclidean metric highlights the computational efficiency of the BKM metric, which we adopt for our approach. This choice is also motivated by the lack of an explicit formula for the exponential map with respect to the Levi-Civita connection [MPA00]. To date, not even the sign of the corresponding curvature is known.
We note that, to the best of our knowledge, the introduction of the affine connections of information geometry, as surrogates of the Riemannian connection for any statistical manifold, predates the introduction of the log-Euclidean metric for the specific space P_c.
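The contrast between the two metrics of this section can be checked numerically. The sketch below (random test matrices; eigendecomposition-based matrix functions) verifies that both distances are invariant under orthogonal congruence, while only the affine-invariant distance is invariant under arbitrary congruence transforms.

```python
import numpy as np

def matfun(A, f):
    # apply f to a symmetric positive definite matrix via its eigendecomposition
    lam, U = np.linalg.eigh(A)
    return (U * f(lam)) @ U.T

def d_affine(A, B):
    # affine-invariant distance: || log_m(A^{-1/2} B A^{-1/2}) ||_F
    s = matfun(A, lambda l: l ** -0.5)
    return np.linalg.norm(matfun(s @ B @ s, np.log), "fro")

def d_logeuc(A, B):
    # log-Euclidean distance: || log_m A - log_m B ||_F
    return np.linalg.norm(matfun(A, np.log) - matfun(B, np.log), "fro")

rng = np.random.default_rng(3)
M = rng.normal(size=(3, 3)); A = M @ M.T + np.eye(3)
M = rng.normal(size=(3, 3)); B = M @ M.T + np.eye(3)
Q = np.linalg.qr(rng.normal(size=(3, 3)))[0]       # orthogonal transform
G = rng.normal(size=(3, 3)) + 3 * np.eye(3)        # generic invertible transform

print(d_affine(Q @ A @ Q.T, Q @ B @ Q.T) - d_affine(A, B))    # ~ 0
print(d_logeuc(Q @ A @ Q.T, Q @ B @ Q.T) - d_logeuc(A, B))    # ~ 0
print(d_affine(G @ A @ G.T, G @ B @ G.T) - d_affine(A, B))    # ~ 0
print(d_logeuc(G @ A @ G.T, G @ B @ G.T) - d_logeuc(A, B))    # generally not 0
```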

ASSIGNMENT FLOWS
The assignment flow approach has been informally introduced in Section 1. In this section, we summarize the mathematical ingredients of this approach, as a reference for the subsequent generalization to quantum states (density matrices) in Section 4. Sections 3.1 and 3.2 introduce the assignment flow on a single vertex and on an arbitrary graph, respectively. A reparametrization turns the latter into a Riemannian gradient flow (Section 3.3). Throughout this section, we refer to definitions and notions introduced in Section 2.1.

3.1. Single-Vertex Assignment Flow. Label assignment at a single vertex amounts to the maximization
max_{p ∈ ∆_c} ⟨p, −D⟩.    (3.1)
In practice, the vector D represents real-valued noisy measurements at some vertex i ∈ V of an underlying graph G = (V, E) and hence will be in 'general position', that is, the minimal component will be unique: if j* ∈ [c] indexes the minimal component D_{j*}, then the corresponding unit vector p* = e_{j*} maximizes the right-hand side of (3.1). We call vectors which assign a label (index) to observed data vectors assignment vectors.
If D varies, however, the operation (3.1) is non-smooth. In view of a desired interaction of label assignments across the graph (cf. Section 3.2), we therefore replace this operation by a smooth dynamical system whose solution converges to the desired assignment vector. To this end, the vector D is represented on S_c by the likelihood vector
L_p(D) := exp_p(−π_{c,0} D),    (3.2)
where
exp_p := Exp_p ∘ R_p,  exp_p(v) = (p ⊙ e^v)/⟨p, e^v⟩.    (3.3)
The single-vertex assignment flow equation reads
ṗ = R_p L_p(D),    (3.4)
and its solution p(t) converges to the vector that solves the label assignment problem (3.1); see Corollary 3.4 below.
Remark 3.1 (replicator equation). Differential equations of the form (3.4), with some R^c-valued function F(p) in place of L_p(D), are known as replicator equations in evolutionary game theory [HS03].
Lemma 3.2. Let p ∈ S_c. Then the differentials of the mapping (3.3) with respect to p and v are given by (3.5a) and (3.5b).
Proof. Appendix 7.2.
Theorem 3.3 (single-vertex assignment flow). The single-vertex assignment flow equation (3.4) is equivalent to a system whose solution admits the closed form (3.7). In particular, if D has a unique minimal component D_{j*}, then p(t) → e_{j*} as t → ∞.
Proof. Appendix 7.2.
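The convergence statement of Theorem 3.3 can be illustrated by a direct numerical integration of (3.4); the data vector, barycenter initialization, Euler scheme and step size below are arbitrary test choices for this sketch.

```python
import numpy as np

def exp_p(p, v):
    # lifting map exp_p(v) = (p * e^v) / <p, e^v>, cf. (3.3)
    q = p * np.exp(v)
    return q / q.sum()

D = np.array([0.7, 0.2, 0.9, 0.5])              # minimal component at index 1
p = np.full(4, 0.25)                            # barycenter initialization
for _ in range(3000):
    L = exp_p(p, -(D - D.mean()))               # likelihood vector L_p(D), cf. (3.2)
    p += 0.05 * (p * L - np.dot(p, L) * p)      # Euler step of p' = R_p L_p(D)

print(np.argmax(p))                             # -> 1 = argmin_j D_j
```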

3.2. Assignment Flows. The assignment flow approach consists of the weighted interaction, as defined below, of single-vertex assignment flows associated with the vertices i ∈ V of a weighted graph G = (V, E, ω), with nonnegative weight function ω given by (3.8). The assignment vectors are denoted by W_i, i ∈ V, and form the row vectors of a row-stochastic matrix W. The product space W_c, equipped with the metric g defined by applying (2.4) row-wise, is called the assignment manifold (W_c, g). The assignment flow equation generalizing (3.4) reads
Ẇ = R_W[S(W)],    (3.11)
where the similarity vectors
S_i(W) := Exp_{W_i}( Σ_{k∈N_i} ω_ik Exp^{−1}_{W_i}( L_{W_k}(D_k) ) ),  i ∈ V,    (3.12)
form the row vectors of the matrix S(W) ∈ W_c. The neighborhoods N_i are defined by the adjacency relation of the underlying graph G, and R_W[·] in (3.11) applies (2.8) row-wise. Note that the similarity vectors S_i(W) given by (3.12) result from geometric weighted averaging of the velocity vectors Exp^{−1}_{W_i}(L_{W_k}(D_k)); these velocities represent the given data D_i, i ∈ V, via the likelihood vectors L_{W_i}(D_i) given by (3.2). Each choice of the weights ω_ik in (3.12), associated with every edge ik ∈ E, defines an assignment flow W(t) solving (3.11). These weight parameters thus determine how individual label assignments by (3.2) and (3.4) are regularized.
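The geometric weighted averaging in the similarity map can be sketched as follows; all inputs are random stand-ins for the likelihood vectors, and the formulas for Exp and its inverse are the e-geodesic expressions (2.11) restated in code.

```python
import numpy as np

def Exp(p, v):
    # e-connection exponential map: Exp_p(v) = p*exp(v/p) / <p, exp(v/p)>, cf. (2.11)
    q = p * np.exp(v / p)
    return q / q.sum()

def Exp_inv(p, q):
    # its inverse on S_c: v = p * (log(q/p) - <p, log(q/p)> 1)
    r = np.log(q / p)
    return p * (r - np.dot(p, r))

rng = np.random.default_rng(4)
W_i = rng.dirichlet(np.ones(3))                  # current assignment vector at vertex i
Ls = rng.dirichlet(np.ones(3), size=4)           # stand-ins for L_{W_k}(D_k), k in N_i
w = np.full(4, 0.25)                             # uniform weights omega_ik

V = sum(wk * Exp_inv(W_i, L) for wk, L in zip(w, Ls))
S_i = Exp(W_i, V)                                # similarity vector: a point on S_3
print(S_i, S_i.sum())
```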
Well-posedness, stability and quantitative estimates of basins of attraction to integral label assignment vectors have been established in [ZZS21].Reliable and efficient algorithms for computing numerically the assignment flow have been devised by [ZSPS20].
3.3. Reparametrization, Riemannian Gradient Flow. The assignment flow (3.11) can be reparametrized as the system (3.15), with the nonnegative weight matrix Ω corresponding to the weight function (3.8). This formulation reveals (3.15b) as the 'essential' part of the assignment flow equation, since (3.15a) depends on (3.15b), but not vice versa. Furthermore, the data and the weights show up only in the initial point and in the vector field on the right-hand side of (3.15b), respectively. Henceforth, we solely focus on (3.15b), rewritten for convenience as
Ṡ = R_S[ΩS],  S(0) = S_0,    (3.17)
where S_0 comprises the similarity vectors (3.12) evaluated at the barycenter W = 1_{W_c}.

QUANTUM STATE ASSIGNMENT FLOWS
In this section, we generalize the assignment flow equations (3.11) and (3.17) to the product manifold Q_c of density matrices as state space. The resulting equations have a similar mathematical form. Their derivation requires
• determining the Riemannian gradient of functions f : D_c → R with respect to the BKM metric (2.19), the corresponding replicator operator, and the exponential mappings Exp and exp together with their differentials (Section 4.1);
• defining the single-vertex quantum state assignment flow (Section 4.2);
• devising the general quantum state assignment flow equation for an arbitrary graph (Section 4.3);
• and devising its alternative parametrization (Section 4.4), which generalizes formulation (3.17) of the assignment flow accordingly.
A natural question is: what does 'label' mean for a generalized assignment flow evolving on the product manifold Q_c of density matrices? For the single-vertex quantum state assignment flow, i.e. without interaction of these flows on a graph, it turns out that the pure state corresponding to the minimal eigenvalue of the initial density matrix is assigned to the given data point (Proposition 4.13). Coupling non-commuting density matrices over the graph through the novel quantum state assignment flow therefore generates interesting complex dynamics, as we illustrate in Section 5. It is shown in Section 4.5 that the restriction of the novel quantum state assignment flow to commuting density matrices recovers the original assignment flow for discrete labels. Throughout this section, we refer to definitions and notions introduced in Section 2.2.
where T_ρ^{−1} is given by (2.20c), and ∂f is the ordinary gradient with respect to the Euclidean structure of the ambient space C^{c×c}.
Comparing the result (4.1) with (2.10) motivates the corresponding definition (4.2) of the replicator operator R_ρ. The following lemma shows that the properties (2.9) extend to (4.2).
Next, using the tangent space H_{c,0}, we define a parametrization of the manifold D_c in terms of the mapping
Γ : H_{c,0} → D_c,  Γ(X) := exp_m(X − ψ(X) I),    (4.4a)
where
ψ(X) := log tr exp_m(X).    (4.4b)
The following lemma and proposition show that the domain of Γ extends to C^{c×c}.
Lemma 4.3 (extension of Γ). The extension of the mapping Γ defined by (4.4) to C^{c×c} is well-defined and given by (4.5).
Proof. Appendix 7.3.
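For Hermitian arguments, Γ acts as a matrix analogue of the softmax map: assuming the normalized matrix-exponential form exp_m(X)/tr exp_m(X), which is equivalent to (4.4a)-(4.4b), it produces a strictly positive definite, trace-one matrix and is invariant under shifts by multiples of the identity. A minimal sketch with a random Hermitian input:

```python
import numpy as np

def hfun(X, f):
    # apply f to a Hermitian matrix via its eigendecomposition
    lam, U = np.linalg.eigh(X)
    return (U * f(lam)) @ U.conj().T

def Gamma(X):
    # Gamma(X) = exp_m(X) / tr exp_m(X) = exp_m(X - psi(X) I), psi = log tr exp_m
    E = hfun(X, np.exp)
    return E / np.trace(E).real

rng = np.random.default_rng(5)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
X = (A + A.conj().T) / 2                  # Hermitian input
rho = Gamma(X)
print(np.trace(rho).real)                 # -> 1.0: rho is a density matrix
```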
The following lemma provides the differentials of the mapping Γ and of its inverse.
Proof. Appendix 7.3.
We finally compute a closed-form expression of the e-geodesic, i.e. the geodesic resp. exponential map induced by the e-connection on the manifold (D_c, g).
Analogous to (3.3), we define the mapping exp_ρ in (4.10), where both the subscript and the argument disambiguate the meaning of 'exp'.
The following lemma provides the explicit form of the differential of the mapping (4.10b), which resembles the corresponding formula (3.5a) of the assignment flow.
Lemma 4.9 (differential d exp_ρ). The differential of the mapping (4.10) admits a closed-form expression, with ρ ∈ D_c and X ∈ H_{c,0}.
Proof. Appendix 7.3.
Defining, in view of (3.2), the likelihood matrix
L_ρ(D) := exp_ρ(−π_{c,0}[D]),    (4.17)
the corresponding single-vertex quantum state assignment flow (SQSAF) equation reads
ρ̇ = R_ρ[L_ρ(D)].
Proposition 4.13 below specifies its properties after a preparatory lemma.

We adopt the notation (3.13) for the neighborhoods N_i, i ∈ V. Analogous to (3.9), we define the product manifold Q_c := D_c × · · · × D_c (one factor per vertex i ∈ V), with D_c given by (2.13). The corresponding factors of ρ ∈ Q_c are denoted by ρ_i, i ∈ V. Q_c becomes a Riemannian manifold when equipped with the metric defined by applying g_{ρ_i}, given by (2.19), factor-wise for each i ∈ V. We set 1_{Q_c} := (1_{D_c}, . . . , 1_{D_c}), with 1_{D_c} given by (4.18b). Our next step is to define a similarity mapping analogous to (3.12),
S_i(ρ) := Exp^{(e)}_{ρ_i}( Σ_{k∈N_i} ω_ik (Exp^{(e)}_{ρ_i})^{−1}( L_{ρ_k}(D_k) ) ),  i ∈ V,
based on the mappings (4.8b) and (4.17). Thanks to using the exponential map of the e-connection, the matrix S_i(ρ) can be rewritten and computed in a simpler, more explicit form.
Expression (4.27), which defines the similarity map, looks like a single iterative step for computing the Riemannian center of mass of the likelihood matrices {L_{ρ_k}(D_k) : k ∈ N_i} — if(!) the exponential map of the Riemannian (Levi-Civita) connection were used. Since the exponential map Exp^{(e)} is used instead, S_i(ρ) may be interpreted as carrying out a single iterative step towards the corresponding geometric mean on the manifold D_c.
A natural idea, therefore, is to define the similarity map as this geometric mean itself, rather than by a single iterative step. Surprisingly, analogous to the similarity map (3.12) for categorical distributions (cf. [Sch20]), both definitions are identical, as shown next.
Proposition 4.15 (geometric mean property). Assume that ρ ∈ D_c solves the equation which corresponds to the optimality condition for Riemannian centers of mass [Jos17, Lemma 6.9.4], except for using a different exponential map. Then ρ coincides with the right-hand side given by (4.27).
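A plausible reading of the explicit form (4.27), consistent with the geometric mean property above (but stated here as a hypothesis, with toy weights and inputs), is the trace-normalized log-Euclidean weighted mean Γ(Σ_k ω_ik log_m L_k). On commuting (diagonal) inputs, this must reduce to the normalized entrywise geometric mean, which the sketch below checks:

```python
import numpy as np

def hfun(X, f):
    # matrix function of a Hermitian matrix via its eigendecomposition
    lam, U = np.linalg.eigh(X)
    return (U * f(lam)) @ U.conj().T

def similarity(rhos, w):
    # hypothesized explicit form: Gamma( sum_k w_k log_m rho_k )
    M = sum(wk * hfun(r, np.log) for wk, r in zip(w, rhos))
    E = hfun(M, np.exp)
    return E / np.trace(E).real

d1, d2 = np.array([0.7, 0.3]), np.array([0.2, 0.8])
S = similarity([np.diag(d1), np.diag(d2)], [0.5, 0.5])

expected = np.sqrt(d1 * d2)                  # entrywise geometric mean ...
expected /= expected.sum()                   # ... normalized to the simplex
print(np.diag(S).real, expected)             # the two agree
```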
We are now in the position to define the quantum state assignment flow along the lines of the original assignment flow (3.11),
ρ̇ = R_ρ[S(ρ)],    (4.31)
where both the replicator map R_ρ and the similarity map S(·) apply factor-wise, with the mappings S_i given by (4.28) and R_{ρ_i} by (4.2).
4.4. Reparametrization, Riemannian Gradient Flow. The reparametrization of the assignment flow (3.15) for categorical distributions described in Section 3.3 has proven useful for characterizing and analyzing assignment flows. Under suitable conditions on the parameter matrix Ω, the flow is a Riemannian descent flow with respect to a non-convex potential [SS21, Prop. 3.9] and has convenient stability and convergence properties [ZZS21].
In this section, we derive a similar reparametrization of the quantum state assignment flow (4.31).
Proposition 4.16 (reparametrization). Define the linear mapping Ω[·] by (4.33). Then the density matrix assignment flow equation (4.31) is equivalent to the system (4.34).
Proof. Appendix 7.3.
For the following, we adopt the symmetry assumption ω_ik = ω_ki, i, k ∈ V. As a consequence, the mapping (4.33) is self-adjoint.

Proposition 4.17 (Riemannian gradient QSAF). Suppose the mapping Ω[·] given by (4.33) is self-adjoint with respect to the canonical matrix inner product. Then the solution µ(t) to (4.34b) also solves a Riemannian gradient flow with respect to a corresponding potential.
Proof. Appendix 7.3.
We conclude this section by rewriting the potential in a more explicit, informative form.

4.5. Recovering the Assignment Flow for Categorical Distributions. In the following, we show how the assignment flow (3.17) for categorical distributions arises as a special case of the quantum state assignment flow, under suitable conditions detailed below.
Definition 4.19 (commutative submanifold). Let Π = {Π_1, . . . , Π_l} denote a set of operators which orthogonally project onto disjoint subspaces of C^c (4.40), and which are complete in the sense that Σ_{j∈[l]} Π_j = I. Given a family Π of operators, we define by D_Π ⊂ D_c the submanifold of commuting Hermitian matrices which can be diagonalized simultaneously.
A typical example of a family (4.40) is Π = {u_1 u_1^*, . . . , u_c u_c^*}, where U = {u_1, . . . , u_c} is an orthonormal basis of C^c. The following lemma elaborates the bijection D_Π ↔ S_l.
where S ∈ W_c is determined by µ_i = U Diag(S_i) U*, i ∈ V. In particular, the submanifold D_{Π,c} is preserved by the quantum state assignment flow.
It remains to check that, under suitable conditions on the data matrices D_i, i ∈ V, which define the initial point of (4.34b) through the similarity mapping (Lemma 4.14), the quantum state assignment flow becomes the ordinary assignment flow.
Corollary 4.22 (recovery of the AF by restriction). In the situation of Proposition 4.21, assume that all data matrices D_i, i ∈ V, become diagonal in the same basis U, i.e.
D_i = U Diag(λ_i) U*,  i ∈ V.    (4.48)
Then the solution of the QSAF corresponds to a solution of the AF (3.17), and the initial point is determined by the similarity map (3.12) evaluated at the barycenter W = 1_{W_c}, with the vectors λ_i, i ∈ V, as data points.

EXPERIMENTS AND DISCUSSION
We report in this section a few academic experiments in order to illustrate the novel approach. In comparison to the original formulation, it enables continuous assignments without the need to specify prototypical labels explicitly beforehand. The experiments highlight the following properties of the novel approach, which extend the expressivity of the original assignment flow approach:
• geometric adaptive feature vector averaging, even when uniform weights are used (Section 5.2);
• structure-preserving feature patch smoothing without accessing data at individual pixels (Section 5.3);
• seamless incorporation of feature encoding using finite frames (Section 5.3).
In Section 6, we indicate the potential for representing spatial feature context via entanglement. Working out the potential for various applications more thoroughly is beyond the scope of this paper, however.

5.1. Geometric Integration. In this section, we focus on the geometric integration of the reparametrized flow (4.34b). For a reasonable choice of a single stepsize parameter, the scheme is accurate, stable and amenable to highly parallel implementations.
Consequently, the iterative step for updating µ^t ∈ Q_c, t ∈ N_0, with stepsize ε > 0 is given by (5.1) for all i ∈ V. Using (4.10b) and assuming µ^t = Γ(A^t) with A^t ∈ T_c, we obtain the update in terms of A^t and conclude, in view of (4.5) and (5.2), the explicit form (5.4).

Remark 5.1. We note that the numerical evaluation of the replicator operator (4.2) is not required. This makes the geometric integration scheme, summarized by Algorithm 1, quite efficient.

(Displaced fragment of the caption of Figure 5.1: panels (f)-(h) are based on (e) instead of (a), using the same noise level in (g). The colors in (f)-(h) merely visualize the Bloch vectors by RGB vectors that result from translating the sphere of (e) to the center (1/2)(1, 1, 1) of the RGB cube and scaling it by 1/2. We refer to the text for a discussion.)
Algorithm 1: Geometric Integration Scheme. Initialization: determine an initial A^0 ∈ T_{c,0} and compute µ^0 from A^0 via (5.2). We list a few further implementation details below.
• A reasonable convergence criterion measures how close the states are to a rank-one matrix.
• A reasonable range for the stepsize parameter is ε ≤ 0.1.
• In order to remove spurious non-Hermitian numerical rounding errors, we replace each matrix by its Hermitian part (A + A*)/2 after each update.
• The constraint tr ρ = 1 of (2.13) can be replaced by tr ρ = τ with any constant τ > 1. For larger matrix dimensions c, this ensures that the entries of ρ vary in a reasonable numerical range, and it stabilizes the iterative updates.
Up to moderate matrix dimensions, say c ≤ 100, the matrix exponential in (4.4a) can be computed using any of the basic established algorithms [Hig08, Ch. 10] or available solvers. In addition, depending on the size of the neighborhoods N_i induced by the weighted adjacency relation of the underlying graph in (4.22), Algorithm 1 can be implemented in a fine-grained parallel fashion.

For c = 2, each density matrix admits the Bloch parametrization
ρ(d) = (1/2)(I + d_1 σ_1 + d_2 σ_2 + d_3 σ_3),  d ∈ R^3,  ‖d‖ ≤ 1,    (5.5)
with the Pauli matrices σ_1, σ_2, σ_3. Pure states ρ correspond to unit vectors d, ‖d‖ = 1, whereas vectors d with ‖d‖ < 1 parametrize mixed states ρ.
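The Bloch parametrization (5.5) is easy to verify numerically: the eigenvalues of ρ(d) are (1 ± ‖d‖)/2, so ρ(d) is pure (rank one) exactly when ‖d‖ = 1. A short sketch:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_state(d):
    # rho(d) = (I + d1*sx + d2*sy + d3*sz) / 2, cf. (5.5)
    return (np.eye(2) + d[0] * sx + d[1] * sy + d[2] * sz) / 2

pure = bloch_state([0.6, 0.0, 0.8])          # ||d|| = 1
mixed = bloch_state([0.3, 0.0, 0.4])         # ||d|| = 0.5
print(np.linalg.eigvalsh(pure))              # eigenvalues ~ [0, 1]: a pure state
print(np.linalg.eigvalsh(mixed))             # eigenvalues ~ [0.25, 0.75]: mixed
```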
The given data are described in the caption of Figure 5.1. We point out that the two experiments discussed next are meant to illustrate the behaviour of the QSAF and the impact of the underlying geometry, rather than to contribute to the literature on color image processing. Figure 5.1(c) shows a noisy version of the image (b) used to initialize the quantum state assignment flow (QSAF). Panel (d) shows the labeled image, i.e. the assignment of a pure state (depicted as a Bloch vector) to each pixel of the input data (c). Although uniform weights were used and any prior information was absent, the result (d) demonstrates that the QSAF removes the noise and preserves the signal transitions fairly well, both for large-scale local image structure (away from the image center) and for small-scale local image structure (close to the image center). This behaviour is quite unusual in comparison to traditional image denoising methods, which inevitably require adapting the regularization to the scale of the local image structure. In addition, we note that noise removal is 'perfect' for the three extreme points red, green and blue of panel (a), and suboptimal only for the remaining non-extreme points.
Panels (f)-(h) show the same results when the data are encoded in a better way, as depicted by (e), using unit vectors not only on the positive orthant but on the whole unit sphere. These data are illustrated by RGB vectors that result from translating the unit sphere (e) to the center $\frac{1}{2}(1,1,1)$ of the RGB color cube $[0,1]^3$ and scaling it by $\frac{1}{2}$. This improved data encoding is clearly visible in panel (g), which displays the same noise level as shown in panel (c). Accordingly, noise removal while preserving signal structure at all local scales is more effectively achieved by the QSAF in (h), in comparison to (d).
5.3. Basic Image Patch Smoothing. Figure 5.2 shows an application of the QSAF to a random spatial arrangement (grid graph) of normalized patches, where each vertex represents a patch, not a pixel. Applying vectorization and taking the tensor product with itself, each patch is represented as a pure state in terms of a rank-one matrix $D_i$ at the corresponding vertex $i \in V$; these matrices constitute the input data in the similarity mapping (4.27). Integrating the flow causes the non-commutative interaction of the associated state spaces $\rho_i$, $i \in V$, through geometric averaging, here with uniform weights (4.22), until convergence towards pure states. The resulting patches are then simply given by the corresponding eigenvector, possibly after reversing the arbitrary sign of the eigenvector, depending on the distance to the input patch.
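The patch encoding and decoding described above can be sketched as follows. This is a hedged reading of the text, not the paper's code: the function names are ours, and resolving the sign ambiguity via a reference patch is our interpretation of "depending on the distance to the input patch".

```python
import numpy as np

def encode_patch(P):
    """Vectorize a normalized patch and take the tensor product with itself,
    yielding a rank-one pure state (trace-one density matrix)."""
    p = P.flatten()
    p = p / np.linalg.norm(p)          # unit vector -> pure state
    return np.outer(p, p)              # rank-one matrix D_i, tr D_i = 1

def decode_patch(D, shape, reference=None):
    """Recover the patch as the leading eigenvector of a (near-)pure state,
    resolving the eigenvector's arbitrary sign against a reference patch."""
    w, V = np.linalg.eigh(D)
    v = V[:, -1]                       # eigenvector of the largest eigenvalue
    if reference is not None and v @ reference.flatten() < 0:
        v = -v                         # flip sign toward the reference
    return v.reshape(shape)
```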
The result shown in Figure 5.2 reveals an interesting behaviour: structure-preserving patch smoothing without explicitly accessing individual pixels. In particular, the flow induces a partition of the patches without any prior assumption on the data.
Figure 5.3 shows a variant of the scenario of Figure 5.2, demonstrating in another way the ability to separate local image structure by geometric smoothing at the patch level.
Figure 5.4 generalizes the set-up in two ways. Firstly, patches were encoded using the harmonic frame given by the two-dimensional discrete Fourier matrix. Secondly, non-uniform weights $\omega_{ik} = e^{-\tau\|P_i - P_k\|_F^2}$, $\tau > 0$, were used, depending on the distance between adjacent patches $P_i, P_k$.
Specifically, let $P_i$ denote the patch at vertex $i \in V$ after removing the global mean and normalizing by the Frobenius norm. Then, applying the FFT to each patch and vectorizing, formally with the discrete two-dimensional Fourier matrix $F_2 = F \otimes F$ (Kronecker product) followed by stacking the rows, $p_i = F_2 \operatorname{vec}(P_i)$, the input data were defined in terms of the squared magnitudes $|p_i|^2$, computed componentwise. Integrating the flow again yields pure states, which were interpreted and decoded accordingly: the eigenvector was used as a multiplicative filter of the magnitude of the Fourier-transformed patch (keeping its phase), followed by rescaling the norm and adding back the mean, thereby approximating the original patch in terms of these two parameters.
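A minimal sketch of the Fourier encoding and the non-uniform weights, assuming `np.fft.fft2` realizes $F_2 = F \otimes F$ acting on the vectorized patch (up to DFT scaling and row-ordering conventions); normalization details may differ from the paper, and the function names are ours.

```python
import numpy as np

def fourier_encode(P):
    """Remove the mean, normalize by the Frobenius norm, apply the 2D DFT,
    and take componentwise squared magnitudes."""
    P0 = P - P.mean()
    P0 = P0 / np.linalg.norm(P0)       # Frobenius normalization
    p = np.fft.fft2(P0).flatten()      # corresponds to (F ⊗ F) vec(P0)
    return np.abs(p) ** 2              # componentwise squared magnitudes

def weight(P_i, P_k, tau=1.0):
    """Non-uniform weights omega_ik = exp(-tau * ||P_i - P_k||_F^2)."""
    return np.exp(-tau * np.linalg.norm(P_i - P_k, 'fro') ** 2)
```

By Parseval's identity (with NumPy's unnormalized DFT), the encoded vector of a normalized $n \times n$ patch sums to $n^2$.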
The results shown in panels (b) and (c) of Figure 5.4 illustrate the effect of 'geometric diffusion' at the patch level through integrating the flow, and how the input data are approximated depending on the chosen spatial scale (patch size), subject to significant data reduction.

CONCLUSION
We generalized the assignment flow approach for categorical distributions [ÅPSS17] to density matrices on weighted graphs. While the former flows assign to each data point a label selected from a finite set, the latter assign to each data point a generalized 'label' from the uncountable submanifold of pure states.
Various further directions of research are indicated by the numerical experiments. These include the unusual smoothing behavior of feature vectors which parametrize complex-valued non-commutative state spaces (Figure 5.1), the structure-preserving interaction of spatially indexed feature patches without accessing individual pixels (Figures 5.2 and 5.3), the use of frames for signal representation and as observables whose expected values are governed by a quantum state assignment flow (Figure 5.4), and the representation of spatial correlations by entanglement and tensorization (Figure 5.5). Extending the representation of the original assignment flow within the broader framework of geometric mechanics, as developed recently by [SAS21], to the novel quantum state assignment flow defines another promising research project, spurred by established concepts of mathematics and physics.
From these viewpoints, this paper adds a novel concrete approach based on information theory to the emerging literature on network design based on concepts from quantum mechanics; cf., e.g., [LYCS18] and references therein. Our main motivation is the definition of a novel class of 'neural ODEs' [CRBD18] in terms of the dynamical systems which generate a quantum state assignment flow. The layered architecture of a corresponding 'neural network' is implicitly given by geometric integration. The inherent smoothness of the parametrization enables learning the weight parameters from data. This will be explored in our future work along the various lines of research indicated above.

Proofs of Section 2.

Proof of Proposition 2.2. We verify (2.9) by direct computation. For any $p \in \mathcal{S}_c$ and any $v \in T_{c,0}$, one has, where $B^{\dagger} := (B^{\top}B)^{-1}B^{\top}$ denotes the Moore–Penrose generalized inverse of $B$. Substituting this parametrization and evaluating the metric (2.4) gives, applying the Sherman–Morrison–Woodbury matrix inversion formula [HJ13, p. 9], the stated expression. Now consider any smooth function $f : \mathcal{S}_c \to \mathbb{R}$. Comparing the last equation with (7.10) shows the claim.
Proof of Corollary 3.4. The solution $p(t)$ to (3.4) is given by (3.6). Proposition 2.2 and Eq. (2.10) show that (3.6b) is the Riemannian ascent flow of the function $\mathcal{S}_c \ni q \mapsto \frac{1}{2}\|q\|^2$. The stationary points satisfy $R_q q = (q - \|q\|^2\mathbb{1})\cdot q = 0$ (7.18) and form the set of points supported on an index set $J_*$. The case $J_* = [c]$, i.e. $q_* = \mathbb{1}_{\mathcal{S}_c}$, can be ruled out if $D$ is not a constant vector, which will always be the case in practice where $D$ corresponds to real data (measurements, observations). The global maxima correspond to the vertices of $\Delta_c = \overline{\mathcal{S}}_c$, i.e. $|J_*| = 1$. The remaining stationary points are local maxima and degenerate, since vectors $D$ with a non-unique minimal component form a negligible null set. In any case, $\lim_{t\to\infty} p(t) = \lim_{t\to\infty} q(t) = q_*$, depending on the index set $J_*$ determined by $D$.

Proofs of Section 4.
Proof of Proposition 4.1. The Riemannian gradient is defined by [KN69, p. 337]. Choosing the parametrization $X = Y - \frac{\operatorname{tr}(Y)}{c}I \in \mathcal{H}_{c,0}$ with $Y \in \mathcal{H}_c$, we further obtain an equation whose left factor must vanish. Applying the linear mapping $T_{\rho}^{-1}$ and solving for $\operatorname{grad}_\rho f$ gives the stated expression. Since $\operatorname{grad}_\rho f \in \mathcal{H}_{c,0}$, taking the trace on both sides and substituting the last two summands in the preceding equation gives the result, where the last equation follows from (2.22).
Proof of Lemma 4.2. The equation follows by direct computation.

Proof of Lemma 4.3. Using (2.18) we compute, where the last equation holds since $Z$ and $I$ commute. Substitution into (4.4a) cancels the scalar factor $e^{\operatorname{tr} Z / c}$ and shows (4.5).
Proof of Proposition 4.6. The e-geodesic connecting the two points $Q, R \in \mathcal{D}_c$ is given by [Pet94, Section V]; the orthogonal projections $\Pi_{c,0}$ onto $\mathcal{H}_{c,0}$ are implicitly carried out in (7.33) as well, due to Lemma 4.3. The expression (4.8b) is equal to (4.8c) due to (4.7b). It remains to check that the geodesic emanates from $\rho$ in the direction $X$. We compute and solve for $X$, which shows (4.9), where $d\Gamma(\Gamma^{-1}(\rho))^{-1} = d\Gamma^{-1}(\rho)$ was used to obtain the last equation.
Proof of Lemma 4.8. We compute and omit the projection map $\Pi_{c,0}$ in the last equation, due to Lemma 4.2 or Lemma 4.3.
Proof of Lemma 4.9. We compute directly.

Proof of Lemma 4.12. We compute and compare the resulting equation to the single-vertex flow (3.4) at time $t = 0$.

Proof of Lemma 4.14. Put the stated expression and substitute it into (4.27). Substituting (7.44) and omitting the projection map $\Pi_{c,0}$, due to Lemma 4.3, yields (4.28).
Proof of Proposition 4.15. Substituting as in the proof of Lemma 4.14, we get an equation in which, since $d\Gamma$ is one-to-one, the expression inside the brackets must vanish. Solving for $\rho$ and omitting the projection map $\Pi_{c,0}$, due to Lemma 4.3, yields the claim. The initial condition for $\rho$ is given by (4.31). The initial condition for $\mu$ follows from (7.48).
Proof of Proposition 4.17. We compute, using (4.35), $\sum_i \frac{p_i(0)}{\operatorname{tr}\pi_i}\,\pi_i$ (7.55). Consequently, if $U = \{u_1, \dots, u_c\}$ is a basis of $\mathbb{C}^c$ that diagonalizes $\mu$, then the tangent vector $X$ is also diagonal in the basis $U$ and $X$ commutes with $\mu$, i.e. $X = \sum_r \frac{x_r}{\operatorname{tr}\pi_r}\,\pi_r$.
(7.30) with a diagonal matrix $D = \operatorname{Diag}(D_{11}, \dots, D_{cc})$ and a bivariate function $\varphi(x,y) = a\,m(x,y)^{\theta}$, $a > 0$, in terms of a symmetric homogeneous mean $m : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$. Regarding the log-Euclidean metric, one has $\varphi(x,y) = \big(\frac{x-y}{\log x - \log y}\big)^2$, whereas for the BKM metric one has $\varphi(x,y) = \frac{x-y}{\log x - \log y}$. Taking also the restriction to density matrices $\mathcal{D}_c \subset \mathcal{P}_c$ into account, one has the relation
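The BKM kernel above is the logarithmic mean, which can be checked numerically to behave as a symmetric homogeneous mean lying between the geometric and arithmetic means; `log_mean` below is our illustrative helper, not code from the paper.

```python
import numpy as np

def log_mean(x, y):
    """Logarithmic mean (x - y) / (log x - log y) for x, y > 0,
    with the limiting value x on the diagonal x = y."""
    if np.isclose(x, y):
        return float(x)
    return (x - y) / (np.log(x) - np.log(y))

x, y = 2.0, 8.0
m = log_mean(x, y)
assert np.isclose(log_mean(x, y), log_mean(y, x))    # symmetric
assert np.isclose(log_mean(3 * x, 3 * y), 3 * m)     # homogeneous
assert np.sqrt(x * y) < m < 0.5 * (x + y)            # mean inequality
```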

3.1. Single-Vertex Assignment Flow. Let $D = (D_1, \dots, D_c)^{\top} \in \mathbb{R}^c$ and consider the task to pick the smallest component of $D$. Formulating this operation as an optimization problem amounts to evaluating the support function (in the sense of convex analysis [Roc70, p. 28]) of the probability simplex $\Delta_c$ at $-D$,
$$\min_{j\in[c]}\{D_1, \dots, D_c\} = -\max_{p\in\Delta_c}\langle -D, p\rangle.$$
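The support-function identity can be checked numerically: since a linear functional attains its maximum over the simplex at a vertex $e_j$, the support function reduces to a componentwise maximum. A small self-contained check:

```python
import numpy as np

def support_fn_simplex(u):
    """Support function of the probability simplex:
    sigma(u) = max_{p in Delta_c} <u, p> = max_j u_j,
    since the maximum of a linear functional is attained at a vertex."""
    return float(np.max(u))

# min_j D_j equals minus the support function of the simplex evaluated at -D
D = np.array([3.0, -1.5, 2.0, 0.5])
assert np.min(D) == -support_fn_simplex(-D)
```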

Lemma 4.20 (properties of $\mathcal{D}_\Pi$). Let $\mathcal{D}_\Pi \subset \mathcal{D}_c$ be given by (4.43) and denote the corresponding inclusion map by $\iota : \mathcal{D}_\Pi \to \mathcal{D}_c$. Then
(a) the submanifold $(\mathcal{D}_\Pi, \iota^{*} g^{BKM})$ with the induced BKM metric is isometric to $(\mathcal{S}_l, g^{FR})$;
(b) if $\mu \in \mathcal{D}_\Pi$, then the tangent subspace $T_\mu \mathcal{D}_\Pi$ is contained in the subspace $T^c_\mu \mathcal{D}_c \subseteq T_\mu \mathcal{D}_c$ defined by (2.23b);
(c) let $U = \{u_1, \dots, u_c\}$ denote an orthonormal basis of $\mathbb{C}^c$ such that for every $\pi_i \in \Pi$, $i \in [l]$, there are $u_{i_1}, \dots, u_{i_k} \in U$ that form a basis of $\operatorname{range}(\pi_i)$; then there is an inclusion of commutative subsets $\mathcal{D}_\Pi \to \mathcal{D}_{\Pi_U}$ that corresponds to an inclusion $\mathcal{S}_l \to \mathcal{S}_c$.
Proof. Appendix 7.3.
Now we establish that a restriction of the QSAF equation (4.34b) to the commutative product submanifold can be expressed in terms of the AF equation (3.17). Analogous to the definition (4.23) of the product manifold $\mathcal{Q}_c$, we set
$$\mathcal{D}_{\Pi,c} = \underbrace{\mathcal{D}_\Pi \times \cdots \times \mathcal{D}_\Pi}_{|V| \text{ factors}}. \quad (4.45)$$
If $\Pi$ is given by an orthonormal basis as in (4.44), we define the unitary matrices
$$U = (u_1, \dots, u_c) \in \operatorname{Un}(c), \quad (4.46a)$$
$$\mathbb{U}_c = \underbrace{\operatorname{Diag}(U, \dots, U)}_{|V| \text{ block-diagonal entries}}. \quad (4.46b)$$
Proposition 4.21 (invariance of $\mathcal{D}_{\Pi,c}$). Let $\Pi$ and $\mathcal{D}_\Pi$ be given according to Definition 4.19. Then the following holds.
(4.50), where $S(t)$ satisfies the ordinary AF equation $\dot{S} = R_S[\Omega S]$, $S(0) = S(\mathbb{1}_{\mathcal{W}_c})$ (4.51). FIGURE 5.1. (a) A range of RGB unit color vectors in the positive orthant. (b) An image with data according to (a). (c) A noisy version of (b), constituting the initial points $\rho_i(0)$, $i \in V$, of the QSAF. (d) The labels (pure states) generated by integrating the quantum state assignment flow using uniform weights. (e) The vectors depicted in (a) are replaced by the unit vectors corresponding to the vertices of the icosahedron, centered at 0. (f)-(h) Analogous to (b)-(d), based on (e) instead of (a) and using the same noise level in (g). The colors in (f)-(h) merely visualize the Bloch vectors by RGB vectors that result from translating the sphere of (e) to the center $\frac{1}{2}(1,1,1)$ of the RGB cube and scaling it by $\frac{1}{2}$. We refer to the text for a discussion.

FIGURE 5.2. Left pair: A random collection of patches with oriented image structure. The colored image displays for each patch its orientation, using the color code depicted by the rightmost panel. Each patch is represented by a rank-one matrix $D$ in (4.17), obtained by vectorizing the patch and taking the tensor product. Center pair: The final state of the QSAF obtained by geometric integration with uniform weighting $\omega_{ik} = \frac{1}{|\mathcal{N}_i|}$, $\forall k \in \mathcal{N}_i$, $\forall i \in V$, of the nearest-neighbor states. It represents an image partition but preserves image structure, due to geometric smoothing of patches encoded by non-commutative state spaces.
FIGURE 5.3. (a) A random collection of patches with oriented image structure. (b) A collection of patches with the same oriented image structure. (c) Pixelwise mean of the patches (a) and (b) at each location. (d) The QSAF recovers a close approximation of (b) (color code: see Figure 5.2) by iteratively smoothing the states $\rho_k$, $k \in \mathcal{N}_i$, corresponding to (c) through geometric integration.

5.2. Labeling 3D Data on Bloch Spheres. For the purpose of visual illustration, we consider the smoothing of 3D color vectors $d = (d_1, d_2, d_3)^{\top}$, interpreted as Bloch vectors which parametrize density matrices [BZ17, Section 5.2],
$$\rho(d) = \frac{1}{2}\big(I_2 + d_1\sigma_1 + d_2\sigma_2 + d_3\sigma_3\big), \quad (5.5)$$
with the Pauli matrices $\sigma_1, \sigma_2, \sigma_3$.
FIGURE 5.4. (a) A real image, partitioned into patches of size 8 × 8 and 4 × 4 pixels, respectively. Each patch is represented as a pure state with respect to a Fourier frame (see text). Instead of the nearest-neighbor adjacency on a regular grid, each patch is adjacent to its 8 closest patches in the entire collection. Integrating the QSAF and decoding the resulting states (see text) yields the results (b) (8 × 8 patches) and (c) (4 × 4 patches), respectively. Result (b) illustrates the effect of smoothing at the patch level in the Fourier domain, whereas the smaller spatial scale used to compute (c) represents the input data fairly accurately, after significant data reduction.
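The Bloch parametrization (5.5) can be sketched directly; `bloch_state` is our illustrative helper name. A unit Bloch vector yields a pure state ($\rho^2 = \rho$), a shorter one a mixed state ($\operatorname{tr}\rho^2 < 1$).

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
])

def bloch_state(d):
    """Density matrix rho(d) = (I + d . sigma)/2 for a Bloch vector d
    in the closed unit ball of R^3."""
    d = np.asarray(d, dtype=float)
    assert np.linalg.norm(d) <= 1 + 1e-12   # valid Bloch vectors
    return 0.5 * (np.eye(2) + np.tensordot(d, SIGMA, axes=1))
```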

FIGURE 5.5. (a) A 5 × 5 grid graph. (b) Random Bloch vectors $d_i \in S^2 \subset \mathbb{R}^3$ (visualized using pseudo-color) defining states $\rho_i$ by Eq. (5.5) for each vertex of a 32 × 32 grid graph. (c) The line graph corresponding to (a). Each vertex corresponds to an edge $ij$ of the graph (a) and an initially separable state $\rho_{ij} = \rho_i \otimes \rho_j$. This defines a simple shallow tensor network. The histograms display the norms of the Bloch vectors of the states $\operatorname{tr}_j(\rho_{ij})$ and $\operatorname{tr}_i(\rho_{ij})$, obtained by partially tracing out one factor, for each state $\rho_{ij}$ indexed by a vertex $ij$ of the line graph of the grid graph (b). (d) The histogram shows that in the initial state all states are indeed separable, while (e), (f) both display the histogram of the norms of all Bloch vectors after convergence of the quantum state assignment flow with uniform weights towards pure states. (g) Using the center coordinates of each edge of the grid graph (b), the entanglement represented by $\rho_{ij}$ is visualized by a disk and 'heat map' colors (blue: low entanglement, red: large entanglement). For visual clarity, (h) and (i) again display the same information after thresholding, using two colors only: entangled states are marked red when the norm of the Bloch vectors dropped below the thresholds 0.95 and 0.99, respectively, and blue otherwise.
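The entanglement indicator used in the caption, the Bloch-vector norm of a reduced state after a partial trace, can be sketched for two qubits; the function names are ours, not the paper's. For a pure two-qubit state, the reduced state has Bloch norm 1 exactly when the state is a separable product, and a smaller norm signals entanglement.

```python
import numpy as np

def partial_trace_second(rho4):
    """tr_j of a 4x4 two-qubit state rho_ij, returning the 2x2 reduced state."""
    r = rho4.reshape(2, 2, 2, 2)       # indices (i, k, j, l) of rho_{(i,k),(j,l)}
    return np.einsum('ikjk->ij', r)    # sum over the second factor's indices

def bloch_norm(rho2):
    """Norm of the Bloch vector d with d_m = tr(rho sigma_m)."""
    paulis = (np.array([[0, 1], [1, 0]]),
              np.array([[0, -1j], [1j, 0]]),
              np.array([[1, 0], [0, -1]]))
    d = np.array([np.trace(rho2 @ s).real for s in paulis])
    return float(np.linalg.norm(d))
```

Tracing out one factor of a separable product of pure states returns a pure state (Bloch norm 1), whereas a maximally entangled Bell state reduces to the maximally mixed state (Bloch norm 0).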