The Thermomajorization Polytope and Its Degeneracies

Drawing inspiration from transportation theory, in this work, we introduce the notions of “well-structured” and “stable” Gibbs states and we investigate their implications for quantum thermodynamics and its resource theory approach via thermal operations. It is found that, in the quasi-classical realm, global cyclic state transfers are impossible if and only if the Gibbs state is stable. Moreover, using a geometric approach by studying the so-called thermomajorization polytope, we prove that any subspace in equilibrium can be brought out of equilibrium via thermal operations. Interestingly, the case of some subsystem being in equilibrium can be witnessed via the degenerate extreme points of the thermomajorization polytope, assuming that the Gibbs state of the system is well structured. These physical considerations are complemented by simple new constructions for the polytope’s extreme points, as well as for an important class of extremal Gibbs-stochastic matrices.


Introduction
The core idea of quantum thermodynamics, a field which has gained significant traction within the last decade, is to apply thermodynamic ideas to (ensembles of) systems of finite size instead of "thermodynamically large" setups [1]. While this discipline comes with a number of principal questions and concepts (such as fluctuation theorems, thermal machines, the fundamental laws, thermalization, and many more, cf. [2]), an all-round successful approach has been to take the open systems perspective [3]: There one models changes to a system of interest as the reduced action of a larger, closed system (i.e. system plus some form of environment, such as a bath) through some total Hamiltonian$^1$ $H_{\rm tot} = H_S + H_B + H_I$ consisting of a system term, an environment term, and an interaction term. Typical thermodynamic constraints imposed on top are that the environment (bath) starts out in equilibrium, i.e. in a Gibbs state, or that the total energy is conserved$^2$.

This perspective ties into the information-theoretic and more specifically into the resource theory approach to quantum thermodynamics. There one attempts to formalize which operations can be carried out at no cost (with respect to some resource, e.g., work), and one of the main aspects of this approach is to characterize when state transfers under thermodynamic constraints are possible. This comes with a subset of quantum channels called thermal operations, which can be carried out without having to expend any resource.
They arise from the above open systems construction as follows [5-7]: Given an $n$-level system described by some system Hamiltonian $H_S \in i\mathfrak{u}(n)$ as well as some fixed background temperature $T > 0$ which every bath needs to adopt, the set $\mathsf{TO}(H_S, T)$ of all thermal operations with respect to $H_S$, $T$ is defined as
$$\Big\{ \operatorname{tr}_B\Big( e^{iH_{\rm tot}} \Big( (\cdot) \otimes \frac{e^{-H_B/T}}{\operatorname{tr}(e^{-H_B/T})} \Big) e^{-iH_{\rm tot}} \Big) \,:\, m \in \mathbb{N},\ H_B \in i\mathfrak{u}(m),\ H_{\rm tot} \in i\mathfrak{u}(mn) \Big\}.$$
Here $\mathfrak{u}(m)$ is the unitary Lie algebra in $m$ dimensions, so $i\mathfrak{u}(m)$ is the collection of Hermitian $m \times m$ matrices. As hinted at before, one of the central questions of this framework is the following: Given an initial and a target state of some quantum system, can the former be transformed into the latter by means of a thermal operation?
The resource theory approach to quantum thermodynamics has led to a number of structural insights, ranging from optimal protocols for work extraction [4] and cooling [8] to the so-called second laws of quantum thermodynamics [9,10]. The latter precisely relate to the above state interconversion problem via a generalization of classical majorization called "thermomajorization". First described by Ruch et al. [11] in the 70s, thermomajorization has gained widespread popularity over the last decade due to the influential works of Brandão et al. [10], Horodecki & Oppenheim [9], Renes [12], as well as many more [8,13-17]; there, among other things, it has been used to solve the state interconversion problem in the quasi-classical realm, more on this below. Yet, one can tackle this problem from another, more geometric perspective: for this one abstractly defines the collection of all states which can (approximately) be generated via thermal operations starting from some initial state $\rho$,
$$M_{H_S,T}(\rho) := \{ \Phi(\rho) : \Phi \in \mathsf{TO}(H_S, T) \},$$
and then one studies the geometric properties of $M_{H_S,T}(\rho)$. This set has been called (future) thermal cone$^3$ [18-20] or, in the case of quasi-classical states $\rho$, thermal polytope [8] before, and the elements of $M_{H_S,T}(\rho)$ are precisely those states which are said to be thermomajorized by $\rho$. In the quasi-classical case $[\rho, H_S] = 0$ the structure of the thermal cone is known to simplify considerably. This is due to the following crucial facts:

• Every thermal operation leaves the set of quasi-classical states invariant [21]: if $[\rho, H_S] = 0$, then $[\Phi(\rho), H_S] = 0$ for all $\Phi \in \mathsf{TO}(H_S, T)$.

• Thermal operations and general Gibbs-preserving quantum channels (approximately) coincide on quasi-classical states when considering only the diagonal [22, Sec. 3.4].
Combining these facts shows that for any state $\rho$ with $[\rho, H_S] = 0$ where $H_S$ is non-degenerate (i.e. $\rho$ is diagonal in "the" eigenbasis of $H_S$), $M_{H_S,T}(\rho)$ equals the set of all diagonal states $\Phi(\rho)$ where $\Phi$ is a quantum channel that preserves the Gibbs state $e^{-H_S/T}/Z$. But for diagonal states, where we write $x, y \in \mathbb{R}^n$ for the vectors of their respective diagonal entries, the existence of such a channel is equivalent to the existence of a Gibbs-stochastic matrix$^4$ $A$ such that $Ay = x$ [23, Coro. 4.5]. Thus $M_{H_S,T}(\mathrm{diag}(y))$ equals $\{\mathrm{diag}(Ay) : A \in \mathbb{R}^{n\times n} \text{ Gibbs-stochastic}\}$ where one, instead and equivalently, can focus solely on the diagonal by considering the so-called thermomajorization polytope
$$M_d(y) := \{ Ay : A \in \mathbb{R}^{n\times n} \text{ Gibbs-stochastic} \}. \qquad (1)$$

At first glance focusing on the quasi-classical case may appear fruitless for understanding the behavior of quantum systems. However, we point out that optimal cooling protocols rely on two-level Gibbs-stochastic matrices [8] which can be realized within the Jaynes-Cummings model [15]. Moreover, taking the quasi-classical perspective allows one to identify thermal operations which are simple to implement experimentally [24], and tools from the quasi-classical case have been used in quantum control theory to find non-trivial upper bounds on reachable states for dissipative bath-couplings [25,26].

$^4$ A matrix $A \in \mathbb{R}^{n\times n}$ is called Gibbs-stochastic if the entries of $A$ are non-negative, every column of $A$ sums up to one, and $Ad = d$. Here $d := (e^{-E_j/T})_{j=1}^n$ is the (unnormalized, but can also be normalized) vector of Gibbs weights. Moreover, we write $s_d(n)$ for the collection of all Gibbs-stochastic matrices w.r.t. the Gibbs vector $d$.
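The three defining conditions of a Gibbs-stochastic matrix (non-negative entries, unit column sums, $Ad = d$) are easy to check numerically. The following is a minimal sketch, not part of the original work; the helper names `gibbs_vector` and `is_gibbs_stochastic`, the two-level energies, and the tolerance are illustrative assumptions.

```python
import numpy as np

def gibbs_vector(energies, T):
    """Unnormalized Gibbs weights d_j = exp(-E_j / T), in units with k_B = 1."""
    return np.exp(-np.asarray(energies, dtype=float) / T)

def is_gibbs_stochastic(A, d, tol=1e-12):
    """True if A has non-negative entries, unit column sums, and A d = d."""
    A = np.asarray(A, dtype=float)
    d = np.asarray(d, dtype=float)
    nonnegative = np.all(A >= -tol)
    columns_sum_to_one = np.allclose(A.sum(axis=0), 1.0, atol=tol)
    preserves_gibbs = np.allclose(A @ d, d, atol=tol)
    return bool(nonnegative and columns_sum_to_one and preserves_gibbs)

# Hypothetical two-level example with energies (0, 1) at temperature T = 1.
d = gibbs_vector([0.0, 1.0], T=1.0)
thermalize = np.outer(d / d.sum(), np.ones(2))  # every column is the Gibbs state
swap = np.array([[0.0, 1.0], [1.0, 0.0]])       # column-stochastic, but moves d

print(is_gibbs_stochastic(thermalize, d))
print(is_gibbs_stochastic(swap, d))
```

The full thermalization map passes the check, whereas the level swap is column-stochastic but fails $Ad = d$ as soon as the two energies differ.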
In this work we will combine this geometric approach to quantum thermodynamics with the established field of transportation theory. While this is not the first time these fields are being linked (cf. Sec. 2.2 for a short review and Sec. 3.2 for some results this perspective has led to already), this paper's main idea is not to take tools, but rather (not as obvious, yet key) concepts from transportation theory and to investigate their implications in quantum thermodynamics. Thus, this work is structured as follows: We begin by recapping characterizations of thermomajorization in the quasi-classical realm in Sec. 2.1, followed by a review of the basics of transportation theory in Sec. 2.2. Therein we also introduce (or rather, translate) the key concepts of "stable" and "well-structured" Gibbs states, which will turn out to be quite intuitive. The implications of these notions will be the topic of Sec. 3 which contains the main results. More precisely, stable Gibbs states turn out to be in one-to-one correspondence to the impossibility of (global) cyclic state transfers, which will also lead us to the notion of a "subspace in equilibrium", cf. Sec. 3.1. The latter notion turns out to be closely connected to the geometry of $M_d(y)$ and is reflected in the (number of) extreme points of $M_d(y)$, assuming the Gibbs state is well structured (Sec. 3.2). The point of view taken in this work also leads to simple, intuitive ways to construct extreme points as well as Gibbs-stochastic matrices which realize the corresponding state transfer; this is what the example and visualization Sec. 4 will be about.

Thermomajorization for Quasi-Classical States
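The object related to thermomajorization most commonly found in the literature is the following: let any $d \in \mathbb{R}^n_{++}$ (that is, $d \in \mathbb{R}^n$ with $d > 0$, as $d$ plays the role of the vector of Gibbs weights) as well as $y \in \mathbb{R}^n$ be given. One defines the thermomajorization curve of $y$ (with respect to $d$), denoted by $\mathrm{th}_{d,y}$, as the piecewise linear, continuous curve fully characterized by the elbow points
$$\Big\{ \Big( \sum_{i=1}^j d_{\tau(i)},\ \sum_{i=1}^j y_{\tau(i)} \Big) \Big\}_{j=0}^n,$$
where $\tau \in S_n$ is any permutation such that $\frac{y_{\tau(1)}}{d_{\tau(1)}} \geq \ldots \geq \frac{y_{\tau(n)}}{d_{\tau(n)}}$. Equivalently, $\mathrm{th}_{d,y} : [0, e^\top d] \to \mathbb{R}$ is given by
$$\mathrm{th}_{d,y}(c) = \min_{i = 1, \ldots, n,\ d_i > 0} \Big( \tfrac{1}{2} \Big( \Big\| y - \tfrac{y_i}{d_i} d \Big\|_1 + e^\top y - \tfrac{y_i}{d_i} e^\top d \Big) + \tfrac{y_i}{d_i} c \Big), \qquad (2)$$
where, here and henceforth, $e := (1, \ldots, 1)^\top$ and $\| \cdot \|_1$ is the usual vector 1-norm. Note that this curve is a generalization of the notion of Lorenz curves from majorization theory [28, p. 5]. While the condition $d_i > 0$ from the minimum in (2) is redundant for now, it allows us to generalize the definition of $\mathrm{th}_{d,y}$ to the zero-temperature case where some of the entries of $d$ vanish, cf. Remark 2 below.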
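The elbow-point construction can be sketched in a few lines. The snippet below is illustrative only: the function names and the example vectors are assumptions, and `th_via_lines` uses the standard fact that a concave piecewise linear function is the pointwise minimum of its supporting lines (here the lines of slope $y_i/d_i$ with intercept $\sum_j \max\{y_j - \tfrac{y_i}{d_i} d_j, 0\}$) as a cross-check.

```python
import numpy as np

def elbow_points(d, y):
    """Elbow points of th_{d,y}, assuming all entries of d are positive."""
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    tau = np.argsort(-(y / d), kind="stable")  # sort y_i/d_i non-increasingly
    xs = np.concatenate(([0.0], np.cumsum(d[tau])))
    ys = np.concatenate(([0.0], np.cumsum(y[tau])))
    return xs, ys

def th(d, y, c):
    """Evaluate the curve at c in [0, sum(d)] by linear interpolation."""
    xs, ys = elbow_points(d, y)
    return np.interp(c, xs, ys)

def th_via_lines(d, y, c):
    """Same curve as the minimum over supporting lines of slope y_i/d_i."""
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = y / d
    intercepts = np.array([np.maximum(y - s * d, 0.0).sum() for s in slopes])
    return float(np.min(intercepts + slopes * c))

# Hypothetical example data.
d = [1.0, 2.0, 1.0]
y = [0.5, 0.3, 0.2]
print(elbow_points(d, y))
print(th(d, y, 1.5), th_via_lines(d, y, 1.5))
```

Both evaluations agree on the whole interval $[0, e^\top d]$, which is a quick sanity check that the sorting step produced a concave curve.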

Remark 1. It is clear from the definition that the thermomajorization curve is invariant under permutations in the sense that $\mathrm{th}_{\underline{\sigma} d, \underline{\sigma} y} \equiv \mathrm{th}_{d,y}$ for all $\sigma \in S_n$. Here, given some permutation $\sigma \in S_n$ we write $\underline{\sigma}$ for the corresponding permutation matrix $\sum_{i=1}^n e_i e_{\sigma(i)}^\top$. In particular the identities $\underline{\sigma \circ \tau} = \underline{\tau} \cdot \underline{\sigma}$, $(\underline{\sigma} x)_j = x_{\sigma(j)}$, and $(\underline{\sigma} A \underline{\tau})_{jk} = A_{\sigma(j) \tau^{-1}(k)}$ hold for all $A \in \mathbb{R}^{n\times n}$, $x \in \mathbb{R}^n$, $j, k = 1, \ldots, n$, and all $\sigma, \tau \in S_n$.

Now the precise connection between thermomajorization and (quasi-classical) state transfers is summarized in the following well-known result: Given any $x, y \in \mathbb{R}^n$ the following statements are equivalent [27, Prop. 1].

• There exists a Gibbs-stochastic matrix $A$ (recall footnote 4) such that $Ay = x$. We denote this by $x \prec_d y$.

• $e^\top x = e^\top y$ and $\mathrm{th}_{d,x}\big(\sum_{i=1}^j d_{\tau(i)}\big) \leq \mathrm{th}_{d,y}\big(\sum_{i=1}^j d_{\tau(i)}\big)$ for all $j = 1, \ldots, n-1$ where $\tau \in S_n$ is any permutation such that $\frac{x_{\tau(1)}}{d_{\tau(1)}} \geq \ldots \geq \frac{x_{\tau(n)}}{d_{\tau(n)}}$.

• $\| x - td \|_1 \leq \| y - td \|_1$ for all $t \in \mathbb{R}$.

• $e^\top x = e^\top y$ and $\| d_i x - y_i d \|_1 \leq \| d_i y - y_i d \|_1$ for all $i = 1, \ldots, n$.

These criteria slightly simplify for probability vectors $x, y \in \mathbb{R}^n$ (e.g., containing the spectrum of any two quantum states), that is, for vectors $x, y \geq 0$ such that $e^\top x = e^\top y = 1$: in this case the "thermomajorization curve"-criterion reduces to $\mathrm{th}_{d,x}(c) \leq \mathrm{th}_{d,y}(c)$ for all $c \in [0, e^\top d]$ (resp. all $c \in \{\sum_{i=1}^j d_{\tau(i)} : j = 1, \ldots, n-1\}$ where $\tau$ sorts $x/d$ non-increasingly). Starting from thermomajorization curves there even exists an algorithm to find a Gibbs-stochastic matrix which implements the state transition in question [22].

Either way, the reason thermomajorization curves are equivalent to conditions using the 1-norm is their fundamental link by means of the Legendre transformation$^5$, denoted by $(\cdot)^*$: for all non-zero $d \geq 0$, all $y \in \mathbb{R}^n$, and all $t \in \mathbb{R}$ one has
$$\max_{c \in [0, e^\top d]} \big( \mathrm{th}_{d,y}(c) - tc \big) = \tfrac{1}{2} \big( \| y - td \|_1 + e^\top (y - td) \big),$$
where the left-hand side is, up to sign conventions, the Legendre transform of $\mathrm{th}_{d,y}$ at $t$. This readily follows from the definition of the Legendre transformation together with the fact that $\mathrm{th}_{d,y+td}(c) = \mathrm{th}_{d,y}(c) + ct$ for all $c \in [0, e^\top d]$ as well as the fact that the maximum of any thermomajorization curve equals the sum over all non-negative entries of the initial state [27, Lemma 13 (iii)]. With this the equivalence of the corresponding characterizations of $d$-majorization is due to the fact that the Legendre transformation is an involution which respects (more precisely: reverses) order [29, Thm. 4.2.1]. While these conditions are mathematically equivalent, in practice it is often easier to prove things using the curves $\mathrm{th}_{d,y}$. This empirical observation will also be supported by the main part of this paper.
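To make the equivalence of the curve criterion and the 1-norm criterion concrete, the sketch below tests $x \prec_d y$ in both ways and compares the answers. It is an illustration under stated assumptions: all $d_i > 0$, a small numerical tolerance, and hypothetical function names and example vectors.

```python
import numpy as np

def elbow_points(d, y):
    """Elbow points of th_{d,y} for strictly positive d."""
    tau = np.argsort(-(y / d), kind="stable")
    xs = np.concatenate(([0.0], np.cumsum(d[tau])))
    ys = np.concatenate(([0.0], np.cumsum(y[tau])))
    return xs, ys

def curve_criterion(d, x, y, tol=1e-9):
    """x <_d y: equal sums and th_{d,x} <= th_{d,y} at the elbows of th_{d,x}."""
    d, x, y = (np.asarray(v, dtype=float) for v in (d, x, y))
    if abs(x.sum() - y.sum()) > tol:
        return False
    cx, vx = elbow_points(d, x)
    cy, vy = elbow_points(d, y)
    return bool(np.all(vx <= np.interp(cx, cy, vy) + tol))

def norm_criterion(d, x, y, tol=1e-9):
    """x <_d y: equal sums and |d_i x - y_i d|_1 <= |d_i y - y_i d|_1 for all i."""
    d, x, y = (np.asarray(v, dtype=float) for v in (d, x, y))
    if abs(x.sum() - y.sum()) > tol:
        return False
    return all(
        np.abs(d[i] * x - y[i] * d).sum() <= np.abs(d[i] * y - y[i] * d).sum() + tol
        for i in range(len(d))
    )

# Hypothetical example: a sharp state versus the normalized Gibbs state.
d = np.array([4.0, 2.0, 1.0])
sharp = np.array([1.0, 0.0, 0.0])
gibbs = d / d.sum()
print(curve_criterion(d, gibbs, sharp), norm_criterion(d, gibbs, sharp))
print(curve_criterion(d, sharp, gibbs), norm_criterion(d, sharp, gibbs))
```

In this example the Gibbs state is reachable from the sharp state but not vice versa, and both criteria report the same verdict in each direction.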
Remark 2 (Thermomajorization for Zero Temperature). The above conditions for thermomajorization remain well defined even if some entries of $d$ vanish, assuming $\mathrm{th}_{d,y}$ is defined via (2). Physically, this relates to the temperature being zero, in which case all entries of $d$ that do not correspond to the zero-point energy of the system vanish. If this lowest energy level with corresponding index $j$ is non-degenerate, then $d = d(T) \to e_j$ as $T \to 0^+$. Interestingly, the above characterization of thermomajorization stays valid in this regime. For a proof we refer to App. A. This explains known continuity problems of the associated polytope (cf. Example 3 in [27]): If some of the $d_i$ vanish, so do the corresponding inequalities, meaning there are fewer restrictions. Hence the polytope can only become larger in this case.
The importance of the 1-norm conditions is that they lead to a beautiful characterization of $M_d(y)$ from (1), cf. [27, Thm. 10]:
$$M_d(y) = \big\{ x \in \mathbb{R}^n : e^\top x = e^\top y \text{ and } m^\top x \leq \mathrm{th}_{d,y}(m^\top d) \text{ for all } m \in \{0,1\}^n \setminus \{0, e\} \big\}. \qquad (4)$$
The geometric interpretation of each of these inequalities is that every binary vector $m \in \{0, 1\}^n \setminus \{0, e\}$ is the normal vector to a halfspace which limits $M_d(y)$. The location of said halfspace is determined by the value $\mathrm{th}_{d,y}(m^\top d)$ and thus by the thermomajorization curve. Note that the orientation of these halfspaces is universal, i.e. they are independent of any of the system parameters; subsequently, $y$, $d$ only influence the location of the faces. Another way of expressing this is to say that thermomajorization polytopes are obtained by shifting the faces of a classical majorization polytope. Note that this can lead to the situation where some of these halfspace conditions become redundant. This description of $M_d(y)$ has been used to prove continuity of the map $(d, P) \mapsto M_d(P)$ where $d > 0$ and $P$ is from the collection of non-empty compact sets in $\mathbb{R}^n$ equipped with the Hausdorff metric [27, Thm. 12 (ii)]. Alternatively, this result can be obtained from continuity of the set of Gibbs-stochastic matrices in $H_0$ and $T > 0$ [30, Thm. 5.1].

Now, writing the bounded set $M_d(y)$ as the solution to finitely many inequalities shows that it is a convex polytope, i.e. it ultimately can be written as the convex hull of finitely many points [31]. Rather than just being an abstract result, these extreme points have an analytic form. Moreover, halfspaces becoming redundant leads to the coalescence of extreme points, more on this in Sec. 3.2.
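The halfspace description translates directly into a brute-force membership test. The sketch below assumes the description takes the form "$e^\top x = e^\top y$ and $m^\top x \leq \mathrm{th}_{d,y}(m^\top d)$ for every binary $m \neq 0, e$", enumerates all $2^n - 2$ such vectors, and is therefore only meant for small $n$; the function names and example data are hypothetical.

```python
import itertools
import numpy as np

def th(d, y, c):
    """Thermomajorization curve th_{d,y} at c, for strictly positive d."""
    tau = np.argsort(-(y / d), kind="stable")
    xs = np.concatenate(([0.0], np.cumsum(d[tau])))
    ys = np.concatenate(([0.0], np.cumsum(y[tau])))
    return np.interp(c, xs, ys)

def in_thermo_polytope(d, y, x, tol=1e-9):
    """Check x in M_d(y) via the binary-normal halfspace conditions."""
    d, y, x = (np.asarray(v, dtype=float) for v in (d, y, x))
    if abs(x.sum() - y.sum()) > tol:
        return False
    n = len(d)
    for bits in itertools.product((0, 1), repeat=n):
        m = np.array(bits, dtype=float)
        # Skip m = 0 and m = e; every other m is the normal of one halfspace.
        if 0 < m.sum() < n and m @ x > th(d, y, m @ d) + tol:
            return False
    return True

# Hypothetical example: sharp initial state for d = (4, 2, 1).
d = np.array([4.0, 2.0, 1.0])
sharp = np.array([1.0, 0.0, 0.0])
gibbs = d / d.sum()
print(in_thermo_polytope(d, sharp, gibbs))
print(in_thermo_polytope(d, gibbs, sharp))
```

Since the normals do not depend on $d$ or $y$, only the right-hand sides $\mathrm{th}_{d,y}(m^\top d)$ have to be recomputed when the system parameters change.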

Transportation Polytopes
It turns out that Gibbs-stochastic matrices are closely related to transportation matrices which sit at the heart of the well-studied field of transportation theory [32][33][34][35].It appears that, so far, this has been very much overlooked: this notion does not even appear in the standard work on majorization by Marshall et al. [28], and the only papers from the quantum thermodynamics literature (that we are aware of) which used results from transportation theory are ones by Mazurek et al. [17,36].
Assuming finite domains, transportation matrices are non-negative matrices with fixed column and row sums. More precisely, these are matrices $A \in \mathbb{R}^{m\times n}_+$ such that $Ae = r$ and $c^\top = e^\top A$ for some $r \in \mathbb{R}^m$, $c \in \mathbb{R}^n$ with $e^\top r = e^\top c$. The collection of all such matrices is called transportation polytope and is often denoted by $T(r, c)$. As already observed by Hartfiel [30], the connection to our setting then is the following: For non-zero temperatures (i.e. $d > 0$) there is a one-to-one correspondence between (the extreme points of) the Gibbs-stochastic matrices and (the extreme points of) the symmetric transportation polytope $T(d, d)$ by means of the isomorphism $X \mapsto X\,\mathrm{diag}(d)$. From our point of view, pursuing this approach may seem counter-intuitive because the geometry of the Gibbs-stochastic matrices is known to be more complicated than the thermomajorization polytope. Already in three dimensions the number of extreme points of the Gibbs-stochastic matrices depends on the temperature of the bath [36, Fig. 1], that is, on certain relations between the entries of $d$ [27, App. A]. However, drawing this connection grants access to powerful tools from combinatorics and graph theory. Roughly speaking there is a relation between the extreme points of $T(d, d)$ and spanning trees of associated bipartite graphs. We need not go into too much detail on the underlying techniques (instead, we refer to [17]); rather we will adopt useful notions from this field and adapt them to thermomajorization as well as the associated polytope. For this our starting point is a paper by Loewy et al. [37] where conditions on the vector $d$ that classify certain features of (the polytope of) Gibbs-stochastic matrices were identified. We present these conditions, which Loewy et al. simply called "property (a)" and "property (b)"$^6$, in the following definition:

Definition 1. Given $d \in \mathbb{R}^n_{++}$ define a map $D : \mathcal{P}(\{1, \ldots, n\}) \to [0, \infty)$ on the power set of $\{1, \ldots, n\}$ via $D(I) := \sum_{i \in I} d_i$.
(i) We call $d$ well structured if $D(I) < D(J)$ for all $I, J \subseteq \{1, \ldots, n\}$ with $|I| < |J|$.
(ii) We call $d$ stable if $D$ is injective.

Some remarks on these notions are in order: By definition $d$ is stable if summing up two sets of entries of $d$ only yields the same result if the entries coincided in the first place. In particular, stability implies that $d$ is non-degenerate. Note that for non-degenerate systems stability is a generic property as only finitely many temperatures give rise to unstable Gibbs states. On the other hand, $d$ is well structured if summing up $k - 1$ arbitrary entries of $d$ always yields less than summing up any $k$ entries of $d$. Interestingly, this notion is fully captured by the inequality
$$\sum_{i=1}^{\lceil n/2 \rceil - 1} d_i^\downarrow < \sum_{i=\lfloor n/2 \rfloor + 1}^{n} d_i^\downarrow, \qquad (5)$$
where $d^\downarrow$ denotes the vector $d$ sorted non-increasingly. To see this one first notes that it suffices to compare the $\alpha$ largest entries of $d$ with the $\alpha + 1$ smallest ones (where $\alpha$ plays the role of $|I|$ from Def. 1), and in a second step one realizes that the inequality corresponding to $\alpha = \lceil \frac{n}{2} \rceil - 1$ implies all other ones. Either way, from (5) one sees that well-structuredness is a high-temperature phenomenon: Given energies of the system $E_1 \leq \ldots \leq E_n$ (w.l.o.g. $E_1 < E_n$ to avoid the trivial case of full degeneracy) there exists a unique critical temperature $T_c \geq 0$ at which both sides of (5) coincide for the entries $e^{-E_i/T_c}$, and the corresponding Gibbs vector is well structured if and only if $T > T_c$. One way to prove this is to examine the auxiliary function
$$\varphi(T) := \frac{\sum_{i=\lfloor n/2 \rfloor + 1}^{n} e^{-E_i/T}}{\sum_{j=1}^{\lceil n/2 \rceil - 1} e^{-E_j/T}}$$
and to see that $\lim_{T \to 0^+} \varphi(T) \in [0, 1]$, $\lim_{T \to \infty} \varphi(T) > 1$, and $\varphi'(T) > 0$ for all$^7$ $T > 0$. Thus by the intermediate value theorem there exists a unique $T_c \geq 0$ such that $\varphi(T_c) = 1$, and $\varphi(T) > 1$ (which is equivalent to (5)) holds if and only if $T > T_c$.
Either way the notion of stable Gibbs states as well as the fact that well-structuredness of the Gibbs state appears if (and only if) the temperature exceeds a critical value can be nicely visualized via the standard simplex and the ordered Weyl chamber, cf. Figure 1. Now based on these notions Loewy et al. were able to prove the following: Given any $d \in \mathbb{R}^n_{++}$, well-structuredness of $d$ is equivalent to every extreme point of $s_d(n)$ (i.e. the Gibbs-stochastic matrices, recall footnote 4) being invertible [37, Thm. 3.1], the number of extreme points of $s_d(n)$ is maximal (when $d$ is taken as a parameter) if and only if $d$ is well structured and stable [37, Thm. 5.1 & 6.1], and as a lower and upper bound to the number of extreme points of $s_d(n)$ they found $(n-1)!\,n^{n-2}$ and $n!\,n^{n-2}$, respectively$^8$. The latter was a substantial improvement over the lower bound $n!$ as first proven by Perfect and Mirsky [38].
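Both notions can be tested directly from their verbal descriptions: stability means that all $2^n$ subset sums of $d$ are distinct, and well-structuredness means that the $k - 1$ largest entries always sum to less than the $k$ smallest ones. The sketch below does exactly this and locates the critical temperature by bisection. The function names, the tolerance, the bracketing interval, and the example energies are assumptions; the bisection relies on the monotonicity in $T$ stated above.

```python
import numpy as np

def is_stable(d, tol=1e-12):
    """d is stable if all 2^n subset sums are pairwise distinct."""
    sums = [0.0]
    for di in d:
        sums += [s + di for s in sums]  # doubles the list: adds sums containing di
    return bool(np.all(np.diff(np.sort(sums)) > tol))

def is_well_structured(d):
    """Any k-1 entries sum to strictly less than any k entries."""
    s = np.sort(np.asarray(d, dtype=float))
    n = len(s)
    for k in range(1, n + 1):
        largest = s[n - k + 1:].sum()  # the k-1 largest entries
        smallest = s[:k].sum()         # the k smallest entries
        if not largest < smallest:
            return False
    return True

def critical_temperature(energies, lo=1e-3, hi=1e3, iters=200):
    """Bisection for T_c, assuming not well structured at lo but at hi."""
    E = np.asarray(energies, dtype=float)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_well_structured(np.exp(-E / mid)):
            hi = mid
        else:
            lo = mid
    return hi

print(is_stable([1.0, 2.0, 3.0]), is_well_structured([1.0, 2.0, 3.0]))
print(is_stable([3.0, 4.0, 5.0]), is_well_structured([3.0, 4.0, 5.0]))
print(critical_temperature([0.0, 1.0, 2.0]))
```

For the equidistant three-level example the condition reads $1 < q + q^2$ with $q = e^{-1/T}$, so the bisection can be compared against the closed form $T_c = -1/\ln\big((\sqrt{5} - 1)/2\big)$.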

Results
As we will see the notions of well-structured and stable Gibbs states are key to answering fundamental questions in quantum thermodynamics. Not only do these definitions and the associated, already known results (e.g., on extreme points of the Gibbs-stochastic matrices) carry over to our setting, this language is even suited to solve seemingly unrelated problems in quantum thermodynamics and, subsequently, lets us uncover new connections. Consequently, this section will feature two types of results: first, ones which are at most superficially concerned with the geometry of the thermomajorization polytope (Sec. 3.1), followed by some results on geometric quantities (e.g., extreme points) of said polytope, cf. Sec. 3.2. The former can be seen as general principles underlying quantum thermodynamics while the latter are more state-dependent and more explicit in nature.

$^7$ This follows at once from the readily verified expression for $\varphi'(T)$ together with the observation that $E_i - E_j \geq 0$ because $i > j$, and even $E_n - E_1 > 0$ by assumption.

$^8$ Actually they explained how to calculate an even better lower bound which grows asymptotically as $(n-1)!\,n^{n-2} \log n$ but cannot be written down as nicely as the bound $(n-1)!\,n^{n-2}$.

Cyclic State Transfers and Subspaces in Equilibrium
An overarching framework for this section is given by the notion of catalysis.There, in the most constrained form one calls a state transition ρ → ω strictly catalytic if there exists an ancilla in state ω C as well as an "allowed" operation Φ on the new overall system such that Φ(ρ ⊗ ω C ) = ω ⊗ ω C (although there are also "weaker" versions of catalytic transformations, cf. the review article [39]).For thermal operations in the quasi-classical realm strict catalysis boils down to state transfers x ⊗ z ≺ d y ⊗ z.
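The idea behind such processes is of course that the catalyst $\omega_C$, resp. $z$, undergoes a cyclic process in order to be returned (uncorrelated and) unchanged. This raises fundamental questions such as, e.g., what cyclic processes are even possible in our thermodynamic framework, which properties such processes have, et cetera. This is what our first result is about: in physical terms it states that global cyclic thermodynamic processes are impossible for almost all temperatures (at least without access to external resources). While Thm. 1 below is concerned with two-step processes, as an immediate corollary one obtains an analogous result for time-continuous cyclic processes. The precise statement here is that the impossibility of non-trivial cyclic processes is in one-to-one correspondence to the notion of stable Gibbs vectors:

Theorem 1. Given $d \in \mathbb{R}^n_{++}$ the following statements are equivalent.
(i) $d$ is stable.
(ii) If $x, y \in \mathbb{R}^n$ satisfy $x \prec_d y \prec_d x$, then $x = y$.

Proof. "(ii) ⇒ (i)": We will prove this direction by contraposition, that is, given $d$ not stable we construct $x, y$ with $x \neq y$ such that $x \prec_d y \prec_d x$. Indeed, if $d$ is not stable, then there exist $I, J \subseteq \{1, \ldots, n\}$, $I \neq J$ such that $\sum_{i \in I} d_i = \sum_{j \in J} d_j$. Define $x := \sum_{i \in I} d_i e_i$, $y := \sum_{j \in J} d_j e_j$ and note that $I \neq J$ implies $x \neq y$. We claim that $x \prec_d y \prec_d x$. Recalling Sec. 2.1 we prove this, equivalently, by showing that $\| x - td \|_1 = \| y - td \|_1$ for all $t \in \mathbb{R}$:
$$\| x - td \|_1 = \sum_{i \in I} |d_i - t d_i| + \sum_{i \notin I} |t d_i| = |1 - t| \sum_{i \in I} d_i + |t| \Big( e^\top d - \sum_{i \in I} d_i \Big),$$
and the same computation for $y$ yields the same value because $\sum_{i \in I} d_i = \sum_{j \in J} d_j$.

"(i) ⇒ (ii)": The idea of this part of the proof is that any vectors $x \prec_d y \prec_d x$ induce the same thermomajorization curve, and, because $d$ is stable, applying this to the points where the curves change slope lets us conclude $x = y$. More precisely, assume $d$ is stable and let $x, y \in \mathbb{R}^n$ be given such that $x \prec_d y \prec_d x$. This implies $e^\top x = e^\top y$ and, more importantly, $\mathrm{th}_{d,x}(c) = \mathrm{th}_{d,y}(c)$ for all $c \in [0, e^\top d]$. But these are piecewise linear functions characterized by their elbow points, so in particular the points where $\mathrm{th}_{d,x}$, $\mathrm{th}_{d,y}$ have a (non-trivial) change in slope coincide (recall Sec. 2.1). To be more precise: there exist $k \in \{1, \ldots, n\}$ and sets $\emptyset = I^x_0 \subsetneq I^x_1 \subsetneq \ldots \subsetneq I^x_k = \{1, \ldots, n\}$ as well as $\emptyset = I^y_0 \subsetneq I^y_1 \subsetneq \ldots \subsetneq I^y_k = \{1, \ldots, n\}$ such that the changes in slope of $\mathrm{th}_{d,x}$ and $\mathrm{th}_{d,y}$ occur at the inputs $\sum_{i \in I^x_l} d_i$ and $\sum_{i \in I^y_l} d_i$, respectively. But because the changes in slope coincide, this (due to $d > 0$) shows $\sum_{i \in I^x_l} d_i = \sum_{i \in I^y_l} d_i$ and hence, by stability of $d$, $I^x_l = I^y_l$ for all $l = 1, \ldots, k$. Moreover, because $\mathrm{th}_{d,y}$ is affine linear between two consecutive changes in slope and the slope of $\mathrm{th}_{d,y}$ at "the" increment $d_i$ is given by $y_i/d_i$, there exists $c_{y,l} \in \mathbb{R}$ such that $y_i = c_{y,l} \cdot d_i$ for all $i \in I^x_l \setminus I^x_{l-1}$; one argues analogously for $\mathrm{th}_{d,x}$ and obtains a constant $c_{x,l}$. Be aware that we used stability of $d$ here: equal sums alone would not force the index sets to agree. Because $\mathrm{th}_{d,x}$ and $\mathrm{th}_{d,y}$ coincide, by assumption, we find that $c_{x,l} = c_{y,l}$ (=: $c_l$) and thus $x_i = c_l d_i = y_i$ for all $i \in I^x_l \setminus I^x_{l-1}$ and all $l = 1, \ldots, k$, that is, $x = y$.

Of course this result does not prohibit local cyclic processes, that is, thermodynamic processes where only a subsystem returns to its original state at the end (so precisely, catalysis). What Thm. 1 does assert, however, is that in general a quasi-classical cyclic process (modeled by thermal operations) which is not local has to use up some external resource along the way.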
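The construction in the "(ii) ⇒ (i)" direction can be replayed numerically. The sketch below uses the hypothetical unstable vector $d = (1, 2, 3)$ with $d_3 = d_1 + d_2$, builds $x$ and $y$ from the two index sets, and checks that they are distinct yet have identical 1-norm profiles and identical thermomajorization curves; all names and sample grids are illustrative choices.

```python
import numpy as np

def th(d, y, c):
    """Thermomajorization curve th_{d,y} at c, for strictly positive d."""
    tau = np.argsort(-(y / d), kind="stable")
    xs = np.concatenate(([0.0], np.cumsum(d[tau])))
    ys = np.concatenate(([0.0], np.cumsum(y[tau])))
    return np.interp(c, xs, ys)

# Unstable example: the subsets I = {3} and J = {1, 2} have the same d-sum.
d = np.array([1.0, 2.0, 3.0])
I, J = [2], [0, 1]  # 0-based indices

x = np.zeros(3)
x[I] = d[I]  # x = sum_{i in I} d_i e_i
y = np.zeros(3)
y[J] = d[J]  # y = sum_{j in J} d_j e_j

ts = np.linspace(-3.0, 3.0, 61)
norms_x = np.array([np.abs(x - t * d).sum() for t in ts])
norms_y = np.array([np.abs(y - t * d).sum() for t in ts])

cs = np.linspace(0.0, d.sum(), 61)
same_curve = np.allclose(th(d, x, cs), th(d, y, cs))

print(x, y)
print(np.allclose(norms_x, norms_y), same_curve)
```

Here $x \neq y$ although both induce the same curve, so each thermomajorizes the other: a non-trivial cyclic transfer that only exists because $d$ is not stable.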
For the remainder of this section our focus lies on (states on) subspaces which are "in equilibrium". The motivation behind this notion is that if a state restricted to some subspace is a multiple of the Gibbs state, then all thermal operations act trivially on it: Indeed given a state $x$ and a subspace $P$ such that $x|_P$ is a multiple of the Gibbs vector, then any Gibbs-stochastic matrix which acts non-trivially only on $P$ (i.e. it is of the form $A = A_P \oplus \mathbb{1}_{P^\perp}$) necessarily leaves $x$ invariant. This of course generalizes to subsystems of coupled systems by choosing $P$ appropriately. The precise definition reads as follows:

Definition 2. We say that a subset $P \subseteq \{1, \ldots, n\}$, $|P| > 1$ of the system's energy levels is in equilibrium if $\frac{y_i}{d_i} = \frac{y_j}{d_j}$ for all $i, j \in P$. On the other hand if no such subset $P$ satisfies this condition we say that the system is in total non-equilibrium.
In this language our next result states that regardless of whether there is a subspace in equilibrium (as long as the full system is not) every such subspace can be brought out of equilibrium by means of $d$-stochastic matrices. This in particular applies to catalytic state transfers: if, for example, a system is in equilibrium, then any catalyst (which itself is not in the Gibbs state) allows for bringing arbitrary energy levels of the original system out of equilibrium. The precise statement is derived via the dimension of the thermomajorization polytope and reads as follows:

Theorem 2. The following statements are equivalent.
(i) $M_d(y)$ is not singular, that is, $M_d(y)$ consists of more than just $y$.
(ii) $y$ is not a multiple of $d$.
(iii) The dimension of the convex polytope $M_d(y)$ is maximal, i.e. its dimension is $n - 1$ which is equal to the dimension of the standard simplex $\Delta^{n-1}$ of all $n$-dimensional probability vectors.
In particular, if there exists a subset $P \subsetneq \{1, \ldots, n\}$, $|P| > 1$ in equilibrium (i.e. $y_i/d_i = y_j/d_j$ for all $i, j \in P$), then there exists $z \in M_d(y)$ such that $z$ is in total non-equilibrium.
Proof. "(iii) ⇒ (i)": Trivial. "(i) ⇒ (ii)": Obvious via contrapositive. "(ii) ⇒ (iii)": The dimension of $M_d(y)$ is trivially upper bounded by $n - 1$ as it is a subset of the $(n-1)$-dimensional standard simplex [27, Coro. 17]. For the converse we argue by contraposition: if the dimension is strictly less than $n - 1$, then there must exist a condition in (4) which is an equality [40, Ch. 8.2]. More precisely, there must exist $m \in \{0, 1\}^n$, $0 < e^\top m < n$ and $c \in \mathbb{R}$ such that $m^\top x = c$ for all $x \in M_d(y)$. First we determine $c$. Let $\sigma \in S_n$ be any permutation such that $\underline{\sigma} m$ is sorted non-increasingly, i.e. $\underline{\sigma} m = (1, \ldots, 1, 0, \ldots, 0)^\top$. We know $E_{d,y}(\sigma) \in M_d(y)$ (cf. Sec. 3.2) and $m^\top E_{d,y}(\sigma) = \mathrm{th}_{d,y}(m^\top d)$, hence $c = \mathrm{th}_{d,y}(m^\top d)$. The final step is to evaluate the condition $m^\top x = \mathrm{th}_{d,y}(m^\top d)$ at a multiple of $d$: because $\frac{e^\top y}{e^\top d} d \in M_d(y)$ we obtain
$$\frac{m^\top d}{e^\top d}\, e^\top y = \mathrm{th}_{d,y}(m^\top d). \qquad (6)$$
Because $\mathrm{th}_{d,y}$ is concave and because $m^\top d \in (0, e^\top d)$ by the assumptions on $m$ and $d$, Lemma A3 (iv) (App. B) shows that (6) can only hold if $\mathrm{th}_{d,y}$ is linear. But the latter is equivalent to $y$ being a multiple of $d$ as, by definition, the slopes of $\mathrm{th}_{d,y}$ are given by $y_j/d_j$.
Finally, the additional statement follows at once from the following two facts: (a) The collection of all vectors in total non-equilibrium is dense in $\Delta^{n-1}$, and (b) $M_d(y)$ contains an interior point w.r.t. the hyperplane $W_e := \{z \in \mathbb{R}^n : e^\top z = 1\}$. While (b) is due to (iii), for (a) note that given any $P \subsetneq \{1, \ldots, n\}$, $|P| > 1$ the set of vectors $y \in W_e$ in equilibrium (w.r.t. $P$) forms a lower-dimensional subspace of $W_e$. In particular this set is nowhere dense$^9$, which continues to hold when taking the union over all (finitely many!) such $P$. But this, in particular, implies that the complement of this set (that is, the collection of all vectors in total non-equilibrium) is dense in $W_e$. This concludes the proof.

$^9$ Recall that a subset of a topological space is called nowhere dense if its closure has empty interior [41].
Returning to the language of subspaces in equilibrium note that this result is an existence result and does not infer anything about the potential "amplitude" of such transfers.Mathematically, Thm. 2 complements the old result of Hartfiel that the dimension of s d (n) for all d > 0 is (n − 1) 2 [30, Thm.

Extreme Points of the Thermomajorization Polytope
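Let us stress that, until now, the concept of subspaces in equilibrium (and hence Thm. 2) is logically independent from the notions of stability and well-structuredness. This missing connection will be established below, where well-structured states and subspaces in equilibrium will be linked via geometric properties of the thermomajorization polytope, in particular the number of its extreme points. The key mathematical object for doing so is the extreme point map $E_{d,y}$ which is defined as follows:

Definition 3. Given $d \in \mathbb{R}^n_{++}$ and $y \in \mathbb{R}^n$ define $E_{d,y} : S_n \to \mathbb{R}^n$ via
$$E_{d,y}(\sigma) := \underline{\sigma}^{-1} \Big( \mathrm{th}_{d,y}\Big( \sum_{i=1}^{j} d_{\sigma(i)} \Big) - \mathrm{th}_{d,y}\Big( \sum_{i=1}^{j-1} d_{\sigma(i)} \Big) \Big)_{j=1}^n$$
$^{10}$, where $\underline{\sigma}$ is the permutation matrix corresponding to $\sigma$ as defined above (Remark 1).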
As the name suggests, for all d ∈ R n ++ and y ∈ R n the image of E d,y equals the set ext(M d (y)) of extreme points of M d (y), and thus M d (y) = conv{E d,y (σ) : σ ∈ S n } [27,Thm. 16].It should be noted that the extreme point property also manifests in the thermomajorization curves, relating to the concept of tight thermomajorization: Given any y, z ∈ R n + , the point z is an extreme point of M d (y) if and only if all elbow points of th d,z lie on the curve th d,y [17,Thm. 2].Also there is a nice connection between the extreme points of M d (y) and a special class of extreme points of the Gibbs-stochastic matrices which we will elaborate on at the end of this section.Moreover, Sec.4.1 below presents a step-by-step calculation of the extreme point map and explains how it, equivalently, can be computed graphically using the thermomajorization curve, cf. Figure 3.
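A small numerical sketch of the extreme point map may help here. It assumes the usual increment form: going through the levels in the order $\sigma(1), \sigma(2), \ldots$, the entry at position $\sigma(i)$ is $\mathrm{th}_{d,y}\big(\sum_{k \leq i} d_{\sigma(k)}\big) - \mathrm{th}_{d,y}\big(\sum_{k < i} d_{\sigma(k)}\big)$. Permutations are represented as 0-based tuples; this representation and all function names are assumptions of the sketch.

```python
import itertools
import numpy as np

def elbow_points(d, y):
    """Elbow points of th_{d,y} for strictly positive d."""
    tau = np.argsort(-(y / d), kind="stable")
    xs = np.concatenate(([0.0], np.cumsum(d[tau])))
    ys = np.concatenate(([0.0], np.cumsum(y[tau])))
    return xs, ys

def extreme_point(d, y, sigma):
    """E_{d,y}(sigma): curve increments along the ordering sigma (0-based)."""
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    order = list(sigma)
    xs, ys = elbow_points(d, y)
    partial = np.concatenate(([0.0], np.cumsum(d[order])))
    increments = np.diff(np.interp(partial, xs, ys))
    z = np.empty(len(d))
    z[order] = increments  # the i-th increment belongs to level sigma(i)
    return z

# Hypothetical example data.
d = np.array([4.0, 2.0, 1.0])
y = np.array([0.5, 0.3, 0.2])
for sigma in itertools.permutations(range(3)):
    print(sigma, extreme_point(d, y, sigma))
```

For the ordering that sorts $y/d$ non-increasingly the map returns $y$ itself, and for $d = e$ every image is a permutation of $y$, as in classical majorization.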
Our focus for now, however, lies on properties of the map $E_{d,y}$ and geometric aspects of $M_d(y)$. From Def. 3 it is clear that the maximal number of extreme points of $M_d(y)$ is $n! = |S_n|$; if there are strictly fewer than $n!$ extreme points we say that the polytope is degenerate. The goal of this section is to prove the following result which states that degeneracy of the polytope is a "witness" for subspaces in equilibrium, assuming $d$ is well structured (equivalently: assuming large enough temperatures, cf. Sec. 2.2).

Theorem 3. If $M_d(y)$ is degenerate, then at least one of the following holds:
(i) $y$ has a subspace which is in equilibrium with respect to $d$.
(ii) $d$ is not well structured, i.e. the temperature does not exceed the critical value, $T \leq T_c$, cf. Sec. 2.2.

Note that in the example of Sec. 4.1 degenerate extreme points occur, and the degeneracy stems from the fact that there exists a subspace in equilibrium. This can be shown generally as specifying the preimage of $y$ under the extreme point map $E_{d,y}$ turns out to be straightforward:

Lemma 1. For all $\sigma \in S_n$ one has $E_{d,y}(\sigma) = y$ if and only if $\frac{y_{\sigma(1)}}{d_{\sigma(1)}} \geq \ldots \geq \frac{y_{\sigma(n)}}{d_{\sigma(n)}}$.

Clearly then the extreme point $y$ is degenerate (in the sense that it has multiple preimages under $E_{d,y}$) if and only if there exists a subspace which is in equilibrium, cf. Definition 2. The converse, however, is not true. Indeed, the example in Sec. 4.2 shows that degenerate extreme points can occur even if the system is in total non-equilibrium. Note however that in this example the vector $d$ is not well structured, indicating a low temperature, as required by Thm. 3.

$^{10}$ An analytic form of the extreme points of $M_d(y)$ has appeared independently in the physics [8,15] and the mathematics [27] literature.
Note that Lemma 1 relates to the concept of virtual temperatures [6,42-44] as multiple permutations are mapped to $y$ under $E_{d,y}$ if and only if there exist $i, j \in \{1, \ldots, n\}$ such that the background temperature equals
$$T = T_{ij} := \frac{E_i - E_j}{\ln y_j - \ln y_i}.$$
In other words virtual temperatures characterize when another corner of the polytope "crosses" the initial state. This also relates to the notion of passivity: The degeneracy of $M_d(y)$ at temperature $T_{ij}$ corresponding to the transition between $E_i$ and $E_j$ is physical (i.e. $T_{ij} > 0$) only if the initial state $\mathrm{diag}(y)$ is passive, meaning no work can be extracted via unitary transformations [43, Sec. III].
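The virtual temperature of a pair of levels follows from solving $y_i/d_i = y_j/d_j$ with $d_k = e^{-E_k/T}$ for $T$, which gives $T_{ij} = (E_i - E_j)/(\ln y_j - \ln y_i)$ in units with $k_B = 1$. The sketch below evaluates this expression and confirms that at $T = T_{ij}$ the two levels are in equilibrium; the function name and the example populations are hypothetical.

```python
import math

def virtual_temperature(E_i, E_j, y_i, y_j):
    """T_ij solving y_i * exp(E_i / T) = y_j * exp(E_j / T), for y_i != y_j > 0."""
    return (E_i - E_j) / (math.log(y_j) - math.log(y_i))

# Hypothetical three-level system with populations y and energies E.
E = [0.0, 1.0, 2.5]
y = [0.6, 0.3, 0.1]

T_01 = virtual_temperature(E[0], E[1], y[0], y[1])
d = [math.exp(-Ek / T_01) for Ek in E]
ratios = [yk / dk for yk, dk in zip(y, d)]

print(T_01)
print(ratios)
```

At the background temperature $T_{01}$ the first two ratios $y_k/d_k$ coincide, so the levels $\{1, 2\}$ form a subspace in equilibrium; a negative value of $T_{ij}$ signals a population inversion between the two levels.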
Example 1. It is worth addressing the case of classical majorization, i.e. $d = e$ (up to a factor which is of no consequence). We find that $\mathrm{th}_{d,y}\big(\sum_{i=1}^j d_{\sigma(i)}\big) = \mathrm{th}_{e,y}(j) = \sum_{i=1}^j y_i^\downarrow$ for any $j = 1, \ldots, n$, $\sigma \in S_n$; this recovers $E_{d,y}(\sigma) = \underline{\sigma}^{-1} y^\downarrow$ [28, Ch. 4, Prop. C.1]. Therefore degeneracies of $M_e(y)$ correspond to some entries of the initial state coinciding$^{11}$. This, in fact, implies "uniformity" of the classical majorization polytope's degeneracy in the sense that the preimage of each extreme point under $E_{d,y}$ has the same size, which is certainly false for general $d \in \mathbb{R}^n_{++}$, cf. Sec. 4.1 and in particular Table 1. This uniformity has to do with the fact that $s_e(n)$ contains all permutation matrices, which yields a group action of $S_n$ on $s_e(n)$. This group action is transitive on the vertices, and hence all vertices are equivalent. This does not hold for general $d \in \mathbb{R}^n_{++}$: the inverse of some invertible element of $s_d(n)$ is again in $s_d(n)$ if and only if it is a permutation matrix [45, Remark 4.1].
The importance of this definition comes from its ability to characterize the image of the extreme point map as the following result shows.

Proof. "⇒": Assume
for all $j = 1, \ldots, n$ by Definition 3. Now given $j \in \{1, \ldots, n\}$ there are two (non-exclusive) pos- $k$ if and only if $j \in I^\tau_k$, or the two intervals do not coincide. The latter implies that $\mathrm{th}_{d,y}$ is affine linear on conv([∑ ) by our argument from above. But this means that both these intervals have to be contained in the same interval $[\Delta_{k-1}, \Delta_k]$, hence $j \in I^\sigma_k$ and $j \in I^\tau_k$.

"⇐": Assume by contraposition that $E_{d,y}(\sigma) \neq E_{d,y}(\tau)$, so there exists $j \in \{1, \ldots, n\}$ such that (8) does not hold. Thus because the intervals have the same length). Moreover, $\mathrm{th}_{d,y}$ cannot be affine linear on $\mathrm{conv}(J_\sigma \cup J_\tau)$, so there exists $k \in \{1, \ldots, m\}$ such that the change of slope $\Delta_k$ lies in the interior of $\mathrm{conv}(J_\sigma \cup J_\tau)$.

Figure 2. Possible combinations of whether $J_\sigma$ and $J_\tau$ intersect, and where $\Delta_k$ lies relative to $J_\sigma$, $J_\tau$.

Now there are two possible cases (cf. Figure 2 ) is to the left or to the right of $\Delta_k$. Either way there exists $k \in \{1, \ldots, m + 1\}$ such that $j \in I^\tau_k$ but $j \notin I^\sigma_k$, so we are done. For the second case assume that $\Delta_k \in J_\sigma \cap J_\tau$ and w.l.o.g. that min

This means that $\sigma$ and $\tau$ yield the same extreme point if and only if they differ exactly by a permutation of the intervals of length $d_i$ such that each interval remains in the same region where $\mathrm{th}_{d,y}$ is affine linear. Note that the case of classical majorization follows from this since in that case, such permutations differ exactly by elements of the stabilizer of $y$. Similarly, the result about preimages of $y$ (Lemma 1) follows from this.
Either way Prop. 1 gives a simple criterion to check whether the images of different permutations under the extreme point map $E_{d,y}$ coincide or not. In particular, a bound on the degeneracy of any extreme point can be given by how many of the $d_i$ intervals fit into the same $[\Delta_{k-1}, \Delta_k]$ interval. This is related to the bin packing problem in computer science which (is strongly NP-complete but) admits some reasonable approximations. Another way to look at the above results is that |ext( Note that the proof of Thm. 3 actually shows that $d_i \ge d_j + d_k$ for some distinct $i, j, k$, which is a property stronger than the lack of well-structuredness. We want to stress that, while condition (i) of Thm. 3 ensures degeneracy of $M_d(y)$ (Lemma 1), lack of well-structuredness of $d$ is not sufficient for $M_d(y)$ to be degenerate. This phenomenon is easy to understand in the graphical representation: Even if it holds that $d_i \ge d_j + d_k$ for some distinct $i, j, k$ (and hence $d$ is not well structured), it might happen that there is no permutation of the intervals which achieves the degeneracy. An example of this is given in Sec. 4.3.
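The condition $d_i \ge d_j + d_k$ extracted from the proof of Thm. 3 is easy to test numerically. The following is a minimal sketch; the function name `fitting_triples` and the 1-based index convention are our own illustrative choices. Note that it checks only this stronger condition, not well-structuredness itself, and that a non-empty result is necessary but not sufficient for degeneracy when $y/d$ is non-degenerate.

```python
def fitting_triples(d):
    """List all 1-based triples (i, j, k) with j < k, i distinct from j and k,
    and d_i >= d_j + d_k, i.e. the intervals d_j and d_k "fit inside" d_i.

    This is only the condition appearing in the proof of Thm. 3; it is an
    illustrative helper, not a test for well-structuredness.
    """
    n = len(d)
    return [
        (i + 1, j + 1, k + 1)
        for i in range(n)
        for j in range(n)
        for k in range(j + 1, n)
        if i != j and i != k and d[i] >= d[j] + d[k]
    ]


# The Gibbs vectors used in the examples of Sec. 4 both admit such a triple,
# whereas the uniform vector does not.
print(fitting_triples((4, 2, 1)))
print(fitting_triples((1, 2, 10)))
print(fitting_triples((1, 1, 1)))
```

The brute-force enumeration costs $O(n^3)$ operations, which is unproblematic for the small dimensions considered in the examples.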
Let us conclude this section by having a look at the operator lift. More precisely, due to $M_d(y) = \mathrm{conv}\{Ay : A \in \mathrm{ext}(s_d(n))\}$ [28, Ch. 14, Obs. C.2.(iii)], Minkowski's theorem [46, Thm. 5.10] shows that given any extreme point $z$ of $M_d(y)$ there exists an extreme point $A$ of $s_d(n)$ such that $z = Ay$. Now the obvious question is whether, given some extreme point of $M_d(y)$, there is an easy way to recover one (or every) process which maps the initial state to the point in question. While given any initial and any final state there already exists an algorithm to construct a Gibbs-stochastic matrix mapping the former to the latter [22], it turns out that if the final state is an extreme point then this procedure simplifies considerably:

Definition 4. Given $d \in \mathbb{R}^n_{++}$ and permutations $\sigma, \tau \in S_n$ there exists, for all $j = 1, \ldots, n - 1$, a unique $\alpha_j \in \{1, \ldots, n\}$ such that for all $j, k = 1, \ldots, n$.
This object has already appeared in the literature as "β-permutation" [8] and it coincides with the concept of a "biplanar extremal transportation matrix" [17] (up to the isomorphism $X \mapsto X\,\mathrm{diag}(d)$ from Sec. 2.2). The latter name, rightly, suggests that the matrix $A_{\sigma\tau}$ is, for all $y \in \mathbb{R}^n$, $d \in \mathbb{R}^n_{++}$, and all $\sigma, \tau \in S_n$, an extreme point of the Gibbs-stochastic matrices [17, Thm. 1 ff.]. Yet, due to the lower bound $(n-1)!\,n^{n-2}$ on the number of extreme points of $s_d(n)$ from Sec. 2.2, for $n \ge 4$ there must exist extreme points of $s_d(n)$ which are not of the form $A_{\sigma\tau}$ (i.e. which are not a β-permutation)¹² [17, Sec. IV.B].
Moreover, and more importantly, if $\tau$ is chosen such that $\frac{y_{\tau(1)}}{d_{\tau(1)}} \ge \ldots \ge \frac{y_{\tau(n)}}{d_{\tau(n)}}$, then $A_{\sigma\tau}$ maps the initial state $y$ to the extreme point $E_{d,y}(\sigma)$, and if $y/d$ is non-degenerate, then $A_{\sigma\tau}$ is the unique Gibbs-stochastic matrix which maps $y$ to $E_{d,y}(\sigma)$, cf. [8], [17, Lemma 3]. Not only does this yield a simple way to reverse-engineer a process which generates an extreme point in question, it also constitutes an alternative way to evaluate the extreme point map $E_{d,y}$ from Def. 3.
Remark 4. Given a permutation $\sigma$, a matrix $A_{\sigma\tau}$ (stored in a sparse matrix format) can be constructed algorithmically in at most $O(n \log n)$ steps using Def. 4, as the limiting step is to find an appropriate permutation $\tau$. Any algorithm which computes a process matrix $A$ for an arbitrary state transfer, including the one given in [22], must have worst-case time complexity at least $\Omega(n^2)$, since $A$ is dense in general. Hence the structure of the $A_{\sigma\tau}$ leads to an improved runtime for the special case where the final state is extremal.
Now our main contribution to this concept reads as follows. While Definition 4 (which matches the definition given by [8]) as well as Mazurek's construction for biplanar extremal transportation matrices appear rather convoluted, we will present a remarkably simple construction of this matrix in Sec. 4.5 below. All one has to do there is to compare the sets $\{\sum_{i=1}^{j} d_{\sigma(i)} : j = 1, \ldots, n\}$ and $\{\sum_{i=1}^{j} d_{\tau(i)} : j = 1, \ldots, n\}$ which, en passant, reaffirms the observation made by Alhambra et al. that Def. 4 is independent of the initial state $y$. Note that these ideas are closely related to the calculation of extreme points given in Sec. 4.1 and to the index sets $I^\sigma_k$ defined in (7), which are the main concept in the proof of Prop. 1.

Detailed Examples
The objects introduced in Sec. 3.2 can be computed explicitly and they have simple graphical interpretations, e.g., via thermomajorization curves. The following examples show this in detail.

Extreme Point Map
Definition 3 contains a simple algorithm to evaluate the extreme point map $E_{d,y}(\sigma)$: Given some permutation $\sigma \in S_n$, find the values of the thermomajorization curve at $\sum_{i=1}^{j} d_{\sigma(i)}$ for $j = 1, \ldots, n$, take the difference of consecutive values, and arrange them into a vector which is ordered according to $\sigma$. Let us go through a detailed example.
Let $y = (4, 0, 1)^\top$ and $d = (4, 2, 1)^\top$. One verifies $\mathrm{th}_{d,y}(c) = \min\{c, 5\}$ for all $c \in [0, 7]$ by direct computation, cf. Figure 3 below. Now let $\sigma \in S_3$ be the permutation $\sigma(1) = 2$, $\sigma(2) = 3$, and $\sigma(3) = 1$; in two-line notation this reads $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$, henceforth $\sigma = (2\ 3\ 1)$ for short. Our goal is to compute the extreme point $E_{d,y}(\sigma)$ of $M_d(y)$ which "corresponds" to $\sigma$. We will use the second formulation provided in Definition 3. First $(E_{d,y}(\sigma))_2 = (E_{d,y}(\sigma))_{\sigma(1)} = \mathrm{th}_{d,y}(d_{\sigma(1)}) = \min\{2, 5\} = 2$. Next $(E_{d,y}(\sigma))_3 = (E_{d,y}(\sigma))_{\sigma(2)} = \mathrm{th}_{d,y}(d_{\sigma(1)} + d_{\sigma(2)}) - \mathrm{th}_{d,y}(d_{\sigma(1)}) = \min\{3, 5\} - \min\{2, 5\} = 1$, and analogously for $(E_{d,y}(\sigma))_1$. Thus $E_{d,y}(\sigma) = (2, 2, 1)^\top$. This procedure can be nicely visualized, cf. Figure 3. The full image of $E_{d,y}$ is computed analogously: one finds (cf. Table 1)

$$\mathrm{ext}(M_d(y)) = \big\{ (4, 1, 0)^\top,\ (4, 0, 1)^\top,\ (3, 2, 0)^\top,\ (2, 2, 1)^\top \big\}.$$

With this in mind let us reformulate the definition of the map $E_{d,y}(\sigma)$ to get an even better understanding:

Remark 5. Given $d \in \mathbb{R}^n_{++}$ and $\sigma \in S_n$, the permutation $\sigma$ tells us how to order the segments of length $d_i$. These we can visualize as lying head to tail on the x-axis, i.e. as a tiling of the interval $[0, e^\top d]$. The contact points between the intervals are then $d_{\sigma(1)}$, $d_{\sigma(1)} + d_{\sigma(2)}$, and so on. Now we evaluate $\mathrm{th}_{d,y}$ at these points and look at the corresponding increments $\mathrm{th}_{d,y}\big(\sum_{i=1}^{j} d_{\sigma(i)}\big) - \mathrm{th}_{d,y}\big(\sum_{i=1}^{j-1} d_{\sigma(i)}\big)$ for $j = 1, \ldots, n$ as visualized in Figure 3. This construction relates to the previously mentioned notion of tight thermomajorization as these contact points, in turn, are the elbow points of the thermomajorization curve of $E_{d,y}(\sigma)$.

¹² Actually, the lower bound in question shows that for large $n$ "almost no" extreme point of …
Note that by Definition 3, $(E_{d,y}(\sigma))_{\sigma(j)}$ is the increment of $\mathrm{th}_{d,y}$ over a distance of length $d_{\sigma(j)}$. In particular for any $j = 1, \ldots, n$ the entry $(E_{d,y}(\sigma))_j$ corresponds to the increment over the interval $d_j$, no matter where it is in our tiling. Hence two permutations $\sigma, \tau$ give the same extreme point $E_{d,y}(\sigma) = E_{d,y}(\tau)$ if and only if each interval $d_j$ yields the same increment of $\mathrm{th}_{d,y}$ for both permutations. In the example above this happens because the vector $y/d$ is degenerate. The following example shows however that this is not necessary.
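The procedure of Definition 3 (tile the x-axis according to $\sigma$, read off the increments of the thermomajorization curve) can be sketched in a few lines of Python. This is a minimal illustration under our own naming: `thermo_curve` and `extreme_point` are hypothetical helpers, permutations are passed as tuples of 1-based indices $(\sigma(1), \ldots, \sigma(n))$, and exact rational arithmetic is used so that coinciding extreme points can be compared reliably.

```python
from fractions import Fraction
from itertools import permutations


def thermo_curve(d, y):
    """Return th_{d,y} as a function on [0, sum(d)].

    The curve is the piecewise-linear interpolation of the points
    (sum of d_i, sum of y_i) taken in order of non-increasing y_i / d_i.
    """
    d = [Fraction(v) for v in d]
    y = [Fraction(v) for v in y]
    order = sorted(range(len(d)), key=lambda i: y[i] / d[i], reverse=True)
    xs, ys = [Fraction(0)], [Fraction(0)]
    for i in order:
        xs.append(xs[-1] + d[i])
        ys.append(ys[-1] + y[i])

    def th(c):
        c = Fraction(c)
        if c < 0 or c > xs[-1]:
            raise ValueError("argument outside [0, sum(d)]")
        for k in range(1, len(xs)):
            if c <= xs[k]:
                t = (c - xs[k - 1]) / (xs[k] - xs[k - 1])
                return ys[k - 1] + t * (ys[k] - ys[k - 1])

    return th


def extreme_point(d, y, sigma):
    """Evaluate E_{d,y}(sigma): entry sigma(j) is the increment of th_{d,y}
    over the j-th interval of the tiling induced by sigma."""
    th = thermo_curve(d, y)
    point = [None] * len(d)
    pos = Fraction(0)
    for s in sigma:
        nxt = pos + Fraction(d[s - 1])
        point[s - 1] = th(nxt) - th(pos)
        pos = nxt
    return tuple(point)


# Example of Sec. 4.1: y = (4, 0, 1), d = (4, 2, 1), sigma = (2 3 1).
d, y = (4, 2, 1), (4, 0, 1)
print([int(v) for v in extreme_point(d, y, (2, 3, 1))])  # [2, 2, 1]
print(len({extreme_point(d, y, s) for s in permutations((1, 2, 3))}))  # 4
```

Enumerating all $n!$ permutations is of course only feasible for small $n$; the sketch is meant to mirror the definition, not to be efficient.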

Degeneracy
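Now for a different example: consider $d = (1, 2, 10)^\top$, $y = (1, 4, 5)^\top$. The key insight is that even though $\frac{y}{d} = (1, 2, 0.5)^\top$ is non-degenerate, the polytope $M_d(y)$ turns out to be degenerate. The reason this is allowed to happen is that $d$ is not well structured. Indeed, as in Sec. 4.1 one computes $\mathrm{th}_{d,y}(c) = \min\{2c,\ c + 2,\ \tfrac{c + 7}{2}\}$ for all $c \in [0, 13]$ and hence one finds

$$\mathrm{ext}(M_d(y)) = \big\{ (2, 3, 5)^\top,\ (2, 1, 7)^\top,\ (1, 4, 5)^\top,\ (\tfrac12, 4, \tfrac{11}{2})^\top,\ (\tfrac12, 1, \tfrac{17}{2})^\top \big\},$$

so that $|\mathrm{ext}(M_d(y))| = 5 < 3!$.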
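To see which permutations collide in this example, one can group all $\sigma \in S_3$ by their image under the extreme point map. The sketch below is illustrative only: the helper names are ours, inputs are assumed to be integers, permutations are tuples of 1-based indices, and the concave curve $\mathrm{th}_{d,y}$ is represented as a pointwise minimum of affine functions (one per segment), which is valid on $[0, e^\top d]$.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations


def curve_as_min_of_lines(d, y):
    """Return (slope, intercept) pairs whose pointwise minimum equals
    th_{d,y} on [0, sum(d)]. Assumes integer entries and d > 0."""
    order = sorted(range(len(d)), key=lambda i: Fraction(y[i], d[i]), reverse=True)
    lines, x0, y0 = [], Fraction(0), Fraction(0)
    for i in order:
        slope = Fraction(y[i], d[i])
        lines.append((slope, y0 - slope * x0))  # line through the elbow (x0, y0)
        x0, y0 = x0 + d[i], y0 + y[i]
    return lines


def preimages(d, y):
    """Map each extreme point of M_d(y) to the list of permutations
    (tuples of 1-based indices) that the extreme point map sends to it."""
    lines = curve_as_min_of_lines(d, y)

    def th(c):
        return min(a * c + b for a, b in lines)

    groups = defaultdict(list)
    for sigma in permutations(range(1, len(d) + 1)):
        point, pos = [None] * len(d), Fraction(0)
        for s in sigma:
            point[s - 1] = th(pos + d[s - 1]) - th(pos)
            pos += d[s - 1]
        groups[tuple(point)].append(sigma)
    return dict(groups)


groups = preimages((1, 2, 10), (1, 4, 5))
for point, sigmas in groups.items():
    print([str(v) for v in point], sigmas)
```

For $d = (1, 2, 10)^\top$, $y = (1, 4, 5)^\top$ the affine pieces are $2c$, $c + 2$ and $\tfrac{c}{2} + \tfrac72$, and the only collision is between $(3\ 1\ 2)$ and $(3\ 2\ 1)$: once the long interval $d_3$ is placed first, both short intervals lie in the last affine-linear region regardless of their order.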

Non-degeneracy
The following example shows that even if d fails to be well structured, this does not guarantee that M d (y) is degenerate.Indeed, choosing d = (4, 2, 1) ⊤ , y = being non-increasing) there is no way to permute these subintervals such that the two small intervals are contained in the big one.
This figure lets us easily build $\sigma A_{\sigma\tau} \tau^{-1}$ because the non-zero entries of this matrix correspond to how much of the interval $\big(\sum_{i=1}^{j-1} d_{\tau(i)}, \sum_{i=1}^{j} d_{\tau(i)}\big]$ is overlapped by a given element of the partition $\{d_{\sigma(i)} : i = 1, \ldots, n\}$ (by slight abuse of notation we identify $d_{\sigma(i)}$ with the interval of corresponding length). For example $(0, d_{\sigma(1)}]$ overlaps with $d_{\tau(1)}$ (covering $\tfrac24$, i.e. half of it) but not with $d_{\tau(2)}$, $d_{\tau(3)}$. This means that the first row of $\sigma A_{\sigma\tau} \tau^{-1}$ is given by $(\tfrac12, 0, 0)^\top$. Similarly, the second row reads $(\tfrac14, 0, 0)^\top$. On the other hand $d_{\sigma(3)}$ in Fig. 4 overlaps with all three sections $d_{\tau(1)}$, $d_{\tau(2)}$, $d_{\tau(3)}$, and the corresponding overlap ratios are given by $\tfrac14$, $\tfrac11$, and $\tfrac22$. Hence the third row of $\sigma A_{\sigma\tau} \tau^{-1}$ is given by $(\tfrac14, 1, 1)^\top$ which altogether yields

$$\sigma A_{\sigma\tau} \tau^{-1} = \begin{pmatrix} \tfrac12 & 0 & 0 \\ \tfrac14 & 0 & 0 \\ \tfrac14 & 1 & 1 \end{pmatrix}, \qquad\text{i.e.}\qquad A_{\sigma\tau} = \begin{pmatrix} \tfrac14 & 1 & 1 \\ \tfrac12 & 0 & 0 \\ \tfrac14 & 0 & 0 \end{pmatrix}.$$

One readily verifies that $A_{\sigma\tau} \in s_d(3)$ maps $y$ to $E_{d,y}(\sigma) = (2, 2, 1)^\top$. Another observation to make here is that $(A_{\sigma\tau})_{ij}$ is always given by what portion of $d_j = d_{\tau(\tau^{-1}(j))}$ is covered by $d_i = d_{\sigma(\sigma^{-1}(i))}$.

And there is more to uncover here: on the one hand the $\tau$ we chose is not the only permutation which orders $\frac{y}{d}$ non-increasingly in this example, and on the other hand we saw in Sec. 4.1 that there exists a permutation $\sigma' \in S_3$ other than the $\sigma$ we chose which is mapped to $(2, 2, 1)^\top$ under $E_{d,y}$. Thus we can apply the above procedure to $2 \cdot 2 = 4$ combinations of permutations $(\sigma, \tau)$ which all yield extremal Gibbs-stochastic matrices mapping $y$ to $(2, 2, 1)^\top$. These are given in Table 2 below.
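The overlap rule just described translates directly into a merge of the two tilings. The following sketch follows that graphical description rather than Def. 4 literally; the names `beta_permutation`, `ordering_permutation` and `apply_sparse` are hypothetical, permutations are tuples of 1-based indices, entries of $d$ and $y$ are assumed to be integers, and the matrix is stored sparsely as a dictionary with at most $2n - 1$ entries.

```python
from fractions import Fraction


def ordering_permutation(d, y):
    """One permutation tau (1-based tuple) sorting y_i / d_i non-increasingly."""
    n = len(d)
    return tuple(sorted(range(1, n + 1),
                        key=lambda i: Fraction(y[i - 1], d[i - 1]),
                        reverse=True))


def beta_permutation(d, sigma, tau):
    """Sparse matrix {(i, j): value}: entry (i, j) is the fraction of the
    interval d_j (placed according to tau) that is covered by the interval
    d_i (placed according to sigma). Both tilings are merged in one pass."""
    d = [Fraction(v) for v in d]
    n = len(d)
    entries = {}
    a = b = 0                      # current positions in the two tilings
    left = Fraction(0)             # left end of the current overlap
    end_s, end_t = d[sigma[0] - 1], d[tau[0] - 1]
    while a < n and b < n:
        i, j = sigma[a], tau[b]
        right = min(end_s, end_t)
        entries[(i, j)] = (right - left) / d[j - 1]
        left = right
        if end_s == right:         # sigma-interval exhausted, move on
            a += 1
            if a < n:
                end_s += d[sigma[a] - 1]
        if end_t == right:         # tau-interval exhausted, move on
            b += 1
            if b < n:
                end_t += d[tau[b] - 1]
    return entries


def apply_sparse(entries, v):
    """Multiply the sparse matrix with the vector v."""
    out = [Fraction(0)] * len(v)
    for (i, j), value in entries.items():
        out[i - 1] += value * v[j - 1]
    return out


d, y = (4, 2, 1), (4, 0, 1)
tau = ordering_permutation(d, y)            # (1, 3, 2) for this example
A = beta_permutation(d, (2, 3, 1), tau)
print([int(v) for v in apply_sparse(A, y)])  # [2, 2, 1]
print([int(v) for v in apply_sparse(A, d)])  # [4, 2, 1]
```

After $\tau$ has been found by sorting, the merge itself is a single linear pass, in line with the complexity discussion of Remark 4. Applying the same function to the other combinations of $\sigma \in \{(2\ 3\ 1), (3\ 2\ 1)\}$ and $\tau \in \{(1\ 3\ 2), (3\ 1\ 2)\}$ again yields matrices mapping $y$ to $(2, 2, 1)^\top$.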

Conclusions
In this work, building upon [8,15,17–20] we further explored thermomajorization in the quasi-classical realm as well as the rich geometry of the associated polytope. The former notion comes from the resource theory approach to quantum thermodynamics, and in particular the corresponding set of allowed operations, known as thermal operations.
Inspired by transportation theory the core notions of this work were "stable" and "well-structured" Gibbs vectors which are simple conditions on the spectrum of a Gibbs state. We found that these concepts relate to and give rise to conceptual insights and unexpected results regarding system properties and state transfers. On the one hand, quasi-classical cyclic state transfers w.r.t. thermomajorization are impossible if and only if the Gibbs vector is stable (which is the generic case). Put differently, within the model of (quasi-classical) thermal operations performing cyclic state transfers in general comes with a non-zero work cost. On the other hand, we uncovered two connections to the notion of subspaces in equilibrium:

1. Thermal operations can bring any subspace in equilibrium out of equilibrium without having to expend work. This generalizes the intuitive fact that a system in a Gibbs state can be brought out of equilibrium by coupling it to a non-equilibrium system. Note that for the latter scenario, while any out-of-equilibrium system is necessarily a resource, our result shows that this is the only price one has to pay, i.e. once the systems are coupled there is some process on the composite system which takes the original system out of the Gibbs state and which can be implemented at no work cost.
2. The existence of subspaces in equilibrium is reflected in the geometry of the thermomajorization polytope. Indeed, assuming well-structuredness of the Gibbs state (which is equivalent to the system's temperature exceeding a critical value) the existence of a subspace in equilibrium corresponds to the polytope having degenerate corners.

While we explored the case of quasi-classical states in all detail, as usual in quantum thermodynamics the general case is a lot more difficult. Most notably, the number of extreme points of the thermal operations as well as the number of extreme points of the future thermal cone is uncountable for non-classical initial states [7,21]. Hence one loses access to tools from the theory of convex polytopes. One of the few things that pertain to general systems are the notions of well-structured and stable Gibbs states (Def. 1); investigating whether these notions encode any properties of general thermal cones (such as, e.g., the impossibility of cyclic processes) could be an interesting topic for future research.

"(iii) ⇒ (i)": The idea is as follows: partition $y = \binom{y_1}{y'}$, $x = \binom{x_1}{x'}$ where $x', y' \in \mathbb{R}^{n-1}$, and perform the following chain of transformations: Now let us go over the above instructions step by step:

(1) Define row vectors $w_+, w_- \in \mathbb{R}^{n-1}$ as follows: given $i = 1, \ldots, n - 1$ set $(w_+)_i := 1$ if $y'_i \ge 0$, else $(w_+)_i := 0$. Then set $w_- := e^\top - w_+$. Thus $w_+$ indicates the location of the non-negative entries of $y'$, and $w_-$ tells us where the negative entries of $y'$ are. Now the column-stochastic matrix collects the positive entries of $y'$ in the second position, and the negative entries of $y'$ in the third position¹³. We write $y_+ := w_+ y'$ and $y_- := -(w_- y')$.
(3) Similarly to (1) define $A_3$ such that the bottom right $(n - 1) \times (n - 1)$ block of $A_3$ redistributes the accumulated positive and negative entries of $x'$ back to their original places.
Based on this result it seems reasonable to conjecture that for all $d \in \mathbb{R}^n_+$, $x \prec_d y$ is equivalent to $e^\top x = e^\top y$ and $\|d_i x - y_i d\|_1 \le \|d_i y - y_i d\|_1$, $i = 1, \ldots, n$. However, proving this seems to not be as straightforward as the above non-degenerate temperature-zero case, and it would be beside the point of this article either way.

¹³ We may w.l.o.g. assume $n \ge 3$ because we can append as many zeros to the original vectors as we want without altering the problem: one readily verifies that $x \prec_d y$ is equivalent to …

Appendix B Basic Properties of Concave Functions
We start with the following simple observation: Given a compact interval I ⊆ R, a concave function f : I → R, and r, s, t ∈ I with r < s < t one finds which in turn is equivalent to The following now is a direct consequence of this: • e ⊤ x = e ⊤ y and th d,x (c) ≤ th d,y (c) for all c ∈ [0, e ⊤ d].

Figure 1. Illustration of stability and well-structuredness. Consider $d \in \mathbb{R}^4_{++}$ where, w.l.o.g., $e^\top d = 1$. Then $d$ lies in the relative interior of the standard simplex shown on the left. By reordering its entries in a non-increasing fashion, cf. Remark 1, we can assume that $d$ lies in the ordered Weyl chamber shown in the middle. The unstable points are composed of the walls of the Weyl chamber as well as five planes intersecting the Weyl chamber. These planes cut the Weyl chamber into nine subchambers, and the one which includes the maximally mixed state $e/4$ (highlighted in red on the right) contains exactly the well-structured Gibbs vectors.
i∈I x l d i = ∑ i∈I y l d i .By assumption d is stable so we obtain I x l = I y l for all l = 1, . . ., k − 1.In particular ∑ i∈I x l x i = th d,x ∑ i∈I x l d i = th d,y ∑ i∈I x l d i = th d,y ∑ all l = 0, . . ., k when defining I x 0 := ∅ =: I y 0 and I x k := {1, . . ., n} =: I y k .Now that we took care of the points where the curves change slope all that remains are the points in between.Consider any l = 1, . . ., k.Because th d,y is affine linear on

Remark 3. The assumption $d > 0$ in Thm. 2 is necessary, as hinted at by the fact that $\dim(s_d(n))$ for general $d \in \mathbb{R}^n_+$ depends on the number of zeros in $d$ [30, Thm. 3.2]. This is nicely illustrated by the simple example d

Theorem 3. Let $d \in \mathbb{R}^n_{++}$, $y \in \mathbb{R}^n$. If $E_{d,y}$ is not injective, i.e. $|\mathrm{ext}(M_d(y))| < n!$, then at least one of the following holds:

Proposition 1. Let $d \in \mathbb{R}^n_{++}$, $y \in \mathbb{R}^n$ and let $m \in \{1, \ldots, n - 1\}$ be the number of changes in slope of $\mathrm{th}_{d,y}$. Given any $\sigma, \tau \in S_n$ one has $E_{d,y}(\sigma) = E_{d,y}(\tau)$ if and only if $I^\sigma_k = I^\tau_k$ for all $k = 1, \ldots, m + 1$.

Figure 3. Visualization of the detailed calculation of $E_{d,y}(\sigma)$ for $\sigma = (2\ 3\ 1)$. First we extract the value of $\mathrm{th}_{d,y}$ at $d_{\sigma(1)} = d_2 = 2$ which, because we considered the second entry of $d$, becomes the second entry of $E_{d,y}(\sigma)$. Next we add $d_{\sigma(2)} = d_3 = 1$ to the previous x-axis value; then the third entry of $E_{d,y}(\sigma)$ is the corresponding increment $\mathrm{th}_{d,y}(3) - \mathrm{th}_{d,y}(2) = 1$. Finally we add $d_{\sigma(3)} = d_1$ to our position on the x-axis to arrive at $e^\top d = 7$, so the first entry of $E_{d,y}(\sigma)$ is given by the final increment $\mathrm{th}_{d,y}(7) - \mathrm{th}_{d,y}(3) = 2$.

→ x 1 x
′ = x Before explaining the undefined objects let us make the following key observation: If some entries of d are zero then any d-stochastic matrix has to be block triangular.More precisely given A ∈ R n×n + and d = d 0 n−m where d ∈ R m , d > 0, the matrix A is d-stochastic if and only if there exist a d-stochastic matrix Ã ∈ R m×m + and probability vectors v 1 , . . ., v k ∈ R n + (i.e. e ⊤ v 1 = . . .= e ⊤ v k = 1) such that all x, y, d ∈ R n , d ≥ 0, and k ∈ N.
in the sense that d is well structured if and only if (5) holds.One way to see this is to first reduce well-structuredness of d to a family of inequalities ∑ (1) σ ∈ S n be given.One has E d,y (σ) = y if and only if Assume σ ∈ S n satisfies E d,y (σ) = y so (E d,y (σ)) σ(j) = y σ(j) for all j = 1, . .., n.By definition of E d,y this means th d,y (∑ for all j = 1, . .., n − 1.But by Lemma A2 (App.B) this holds if and only if y σ(1) /d σ(1) .Proof.
by Lemma 1 where $(S_n)_{y/d}$ is the stabilizer of the vector $y/d = (y_i/d_i)_{i=1}^n$ in $S_n$, and the examples in Sec. 4.1 & 4.2 show that this bound is not tight. These examples suggest that improving this bound via an analytic expression is a non-trivial task.

Finally, Prop. 1 lets us prove Thm. 3. Assume there exist distinct $\sigma, \tau \in S_n$ such that $E_{d,y}(\sigma) = E_{d,y}(\tau)$. Then, by Prop. 1, $\sigma$ and $\tau$ differ by a permutation such that each $d$-interval remains in the same affine-linear region of the thermomajorization curve. If $i \mapsto \frac{y_i}{d_i}$ is injective, this means that there have to exist pairwise distinct $i, j, k \in \{1, \ldots, n\}$ such that both the intervals corresponding to $d_j$, $d_k$ "fit inside" $d_i$. Therefore $d_i \ge d_j + d_k$, meaning $d$ cannot be well structured.