Arrow Contraction and Expansion in Tropical Diagrams

Arrow contraction, applied to a tropical diagram of probability spaces, is a modification of the diagram that replaces one of the morphisms with an isomorphism while preserving other parts of the diagram. It is related to the rate regions introduced by Ahlswede and Körner. In a companion article, we use arrow contraction to derive information about the shape of the entropic cone. Arrow expansion is the inverse operation to arrow contraction.


Introduction
In [MP18] we initiated the theory of tropical probability spaces, and in [MP19c] we applied these techniques to derive a dimension-reduction result for the entropic cone of four random variables.
Two of the main tools used for the latter are what we call arrow contraction and arrow expansion. They are formulated for tropical commutative diagrams of probability spaces. Tropical diagrams are points in the asymptotic cone of the metric space of commutative diagrams of probability spaces, endowed with the asymptotic entropy distance. Arrows in diagrams of probability spaces are (equivalence classes of) measure-preserving maps.
Arrow contraction and expansion take a commutative diagram of probability spaces as input and modify it, while preserving important properties of the diagram. The precise results are formulated as Theorems 3.1 and 3.3 in the main text. Their formulation requires language, notation and definitions that we review in Section 2.
However, to give an idea of the results in this paper, we now present two examples.

1.1. Two examples.
1.1.1. Arrow contraction and expansion in a two-fan. Suppose we are given a fan Z = (X ← Z → U) and we would like to complete it to a diamond (1.1) such that the entropy of V, denoted by [V], equals the mutual information [X ∶ U] between X and U. That is, we would like to realize the mutual information between X and U by a pair of reductions X → V and U → V. This is not always possible, not even approximately. Arrow contraction instead produces another fan Z′ = (X ← Z′ → V), such that the reduction Z′ → X is an isomorphism and the relative entropy [X|V] of X given V equals [X|U]. By collapsing this reduction we obtain as a diagram just the reduction X → V. If need be, we can still keep the original spaces Z and U in the modified diagram, obtaining a "broken diamond" diagram. Of course, no special technique is necessary to achieve this result, since it is easy to find a reduction from a tropical space [X] to another tropical probability space with prespecified entropy, as long as no Shannon inequality is violated.
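In classical (pre-tropical) terms, the quantities appearing here can be computed directly from a joint distribution. The following sketch uses a made-up joint distribution of a pair (X, U) for illustration, and computes the mutual information [X ∶ U] = [X] + [U] − [Z] and the conditional entropy [X|U] = [Z] − [U], where Z is the initial space of the two-fan:

```python
import math

def entropy(p):
    """Shannon entropy, in nats, of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

# A made-up joint distribution of (X, U) on a 2x2 square of atoms;
# rows index atoms of X, columns index atoms of U.
joint = [[0.4, 0.1],
         [0.1, 0.4]]

p_x = [sum(row) for row in joint]            # marginal of X
p_u = [sum(col) for col in zip(*joint)]      # marginal of U
p_z = [w for row in joint for w in row]      # the initial space Z = (X, U)

h_x, h_u, h_z = entropy(p_x), entropy(p_u), entropy(p_z)
mutual_info = h_x + h_u - h_z                # [X : U]
cond_entropy = h_z - h_u                     # [X | U]
```

The diamond-completion problem asks for a single space V realizing the number `mutual_info` as the entropy of a common reduction, which the text explains is not always possible exactly.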
However, a similar operation becomes non-trivial and in fact impossible without passing to the tropical limit, if instead of a single space X there is a more complex sub-diagram as in the example in the next subsection.
To explain how arrow expansion works, let us start with the chain of reductions Z → X → V. Can we extend it to a diamond, as in (1.1), so that [X ∶ Y|V] = 0? This is again not possible in general. However, if we pass to tropical diagrams, then such an extension always exists.
1.1.2. One More Example of Arrow Expansion and Contraction. Consider the diagram presented in Figure 1. Such a diagram is called a Λ₃-diagram. We would like to find a reduction X → V so that [X|U] = [X|V]. This is not possible to achieve within the realm of diagrams of classical probability spaces. But once we pass to the tropical limit, the reduction [X] → [V] can be found by contracting and then collapsing the arrow [Z] → [X], as shown in Figure 1.
Arrow contraction is closely related to the Shannon channel coding theorem; this is perhaps most obvious from the proof. Furthermore, arrow contraction has connections with the rate regions introduced by Ahlswede and Körner. The main contribution of our work is that we prove a much stronger preservation of properties of the diagram under arrow contraction.

Preliminaries
2.1. Probability spaces and their diagrams. Our main objects of study are commutative diagrams of probability spaces. A finite probability space X is a set with a probability measure on it, supported on a finite set. We denote by |X| the cardinality of the support of the measure. The statement x ∈ X means that the point x is an atom with positive weight in X. For details see [MP18, MP19b, MP19a].
Examples of commutative diagrams of probability spaces are shown in Figure 2. The objects in such diagrams are finite probability spaces and the morphisms are equivalence classes of measure-preserving maps. Two such maps are considered to be equivalent if they coincide on a set of full measure. To record the combinatorial structure of a commutative diagram, i.e. the arrangement of spaces and morphisms, we use indexing categories, which are poset categories satisfying an additional property that we describe below.
2.1.1. Indexing Categories. A poset category is a finite category with at most one morphism between any two objects, in either direction.
For a pair of objects k, l in a poset category G = {i; γ_ij}, such that there is a morphism γ_kl in G, we call k an ancestor of l and l a descendant of k. The set of all ancestors of an object k, together with all the morphisms between them, is itself a poset category; it will be called the co-ideal generated by k and denoted by ⌊k⌋. Similarly, the poset category consisting of all descendants of k ∈ G and the morphisms between them will be called the ideal generated by k and denoted ⌈k⌉.
An indexing category G = {i; γ_ij}, used for indexing diagrams, is a poset category satisfying the following additional property: for any pair of objects i₁, i₂ ∈ G the intersection of co-ideals ⌊i₁⌋ ∩ ⌊i₂⌋ is itself a co-ideal, generated by some object i₃ ∈ G. In other words, for any pair of objects i₁, i₂ ∈ G there exists a least common ancestor i₃; that is, i₃ is an ancestor of both i₁ and i₂, and any other common ancestor is also an ancestor of i₃. Any indexing category has a (necessarily unique) initial object î, which is an ancestor of every other object in G; in other words, G = ⌈î⌉.
A fan in a category is a pair of morphisms with the same domain. A fan (i ← k → j) is called minimal if for any other fan (i ← l → j) included with it in a commutative diagram via a morphism (k → l), the morphism (k → l) must be an isomorphism. Any indexing category also satisfies the property that for any pair of objects in it there exists a minimal fan with the given pair as its target objects.
This terminology will also be applied to diagrams of probability spaces indexed by G. Thus, given a space X in a G-diagram, we can talk about its ancestors, descendants, co-ideal ⌊X⌋ and ideal ⌈X⌉. We use square brackets to denote tropical diagrams and the spaces in them; for the (co-)ideals in tropical diagrams we use analogous, slightly abbreviated notation in order to unclutter formulas. A constant G-diagram, denoted X_G, is a diagram in which all the objects equal X and all morphisms are identities.
Important examples of indexing categories are the two-fan, the diamond category, the full category Λ_n on n spaces, and the chains C_n. For a detailed description and more examples, the reader is referred to the articles cited at the beginning of this section.

2.2. Tropical Diagrams.
2.2.1. Intrinsic Entropy Distance. For a fixed indexing category G, the space of commutative G-diagrams will be denoted by Prob⟨G⟩. Evaluating entropy on every space in a G-diagram gives a map

Ent∗ ∶ Prob⟨G⟩ → R^G,

where the target space R^G is the space of real-valued functions on the objects of G. We endow this space with the ℓ¹-norm. For a fan F = (X ← Z → Y) of G-diagrams we define the entropy distance between its terminal objects by

kd(F) ∶= ‖Ent∗(Z) − Ent∗(X)‖₁ + ‖Ent∗(Z) − Ent∗(Y)‖₁

and the intrinsic entropy distance between two arbitrary G-diagrams by

k(X, Y) ∶= inf {kd(F) ∶ F = (X ← Z → Y)}.

The triangle inequality for k, and various other of its properties, are discussed in [MP18].
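Each fan F = (X ← Z → Y) gives an upper bound kd(F) for the intrinsic distance k(X, Y). For single probability spaces (G the one-object category) a fan is just a coupling, and kd(F) reduces to (H(Z) − H(X)) + (H(Z) − H(Y)). A minimal numeric sketch, assuming this specialization of the ℓ¹-distance of entropy vectors:

```python
import math

def entropy(p):
    """Shannon entropy, in nats, of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

def fan_entropy_distance(coupling):
    """kd(F) for a fan (X <- Z -> Y) of single probability spaces,
    where Z is a joint distribution (matrix) and the reductions are
    the coordinate projections:
    kd(F) = (H(Z) - H(X)) + (H(Z) - H(Y))."""
    p_x = [sum(row) for row in coupling]
    p_y = [sum(col) for col in zip(*coupling)]
    h_z = entropy([v for row in coupling for v in row])
    return (h_z - entropy(p_x)) + (h_z - entropy(p_y))

# Independent coupling of two fair coins: kd = H(X) + H(Y) = 2 ln 2,
# the crudest upper bound for the intrinsic distance k(X, Y).
indep = [[0.25, 0.25], [0.25, 0.25]]
# Diagonal coupling of two identical coins: kd = 0, witnessing k(X, X) = 0.
diag = [[0.5, 0.0], [0.0, 0.5]]
```

The intrinsic distance is then the infimum of such bounds over all couplings, so the diagonal coupling shows that identical spaces are at distance zero.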
In the same article, a useful estimate for the intrinsic entropy distance, called the Slicing Lemma, is also proven. The following corollary of the Slicing Lemma, [MP18, Corollary 3.10(1)], will be used in the next section.
Proposition 2.1. Let G be an indexing category, X, Y ∈ Prob⟨G⟩ and U ∈ Prob be included into a pair of two-fans.

Points in the asymptotic cone of (Prob⟨G⟩, k) are called tropical G-diagrams, and the space of all tropical G-diagrams, denoted Prob[G], is endowed with the asymptotic entropy distance. We explain this now in more detail; a more extensive description can be found in [MP19b].
To describe points in Prob[G] we consider certain sequences X̄ ∶= (X(n) ∶ n ∈ N) of G-diagrams that grow almost linearly, and we endow the space of all such sequences with the asymptotic entropy distance defined by

κ(X̄, Ȳ) ∶= lim_{n→∞} (1/n) · k(X(n), Y(n)).

A tropical G-diagram is defined to be an equivalence class of such sequences, where two sequences X̄ and Ȳ are equivalent if κ(X̄, Ȳ) = 0. The space Prob[G] carries the asymptotic entropy distance and has the structure of an R≥0-semi-module: one can take linear combinations with non-negative coefficients of tropical diagrams, and the entropy functional extends linearly. A detailed discussion of tropical diagrams can be found in [MP19b]. In the cited article we show that the space Prob[G] is metrically complete and isometrically isomorphic to a closed convex cone in some Banach space.
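The model case of an almost linearly growing sequence is the sequence of tensor powers of a fixed diagram: entropy is additive under tensor products, so every entropy coordinate grows exactly linearly. A small check for a single space with a made-up distribution:

```python
import math

def entropy(p):
    """Shannon entropy, in nats, of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

def tensor_power(p, n):
    """Distribution of n independent copies of a finite probability space."""
    out = [1.0]
    for _ in range(n):
        out = [a * b for a in out for b in p]
    return out

# (X^n) is the model case of a linearly growing sequence:
# every entropy coordinate satisfies H(X^n) = n * H(X).
p = [0.7, 0.2, 0.1]
h = entropy(p)
growth = [entropy(tensor_power(p, n)) for n in range(1, 5)]
```

Sequences of this exactly linear kind are the "linear sequences" L(Prob⟨G⟩) referred to later in the text.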
For G = C_k, the chain category containing k objects {1, . . . , k} and a unique morphism i → j for every pair i ≥ j, we have shown that the space Prob[C_k] is isomorphic to a closed convex cone in R^k; the isomorphism is given by the entropy functional. Thus we can identify tropical probability spaces (elements in Prob[C₁]) with non-negative numbers via entropy. We will simply write [X] to mean the entropy of the space [X].
Along these lines we also adopt notations for relative entropy and mutual information of tropical spaces. The approximating sequence of homogeneous diagrams is evidently quasi-linear, with the defect bounded by an admissible function. Thus, Theorem 2.2 above states that L(Prob⟨G⟩) ⊂ Prob[G]_h. On the other hand, we have shown in [MP19b] that the space of linear sequences L(Prob⟨G⟩) is dense in Prob[G]. Combining the two statements we get the following theorem. The conditioned diagram can be understood as the tropical limit of the sequence (X(n)|u_n), where (X(n)) is the homogeneous approximation of [X], U(n) is the space in X(n) that corresponds to [U] under the combinatorial isomorphism, and u_n is any atom in U(n).
We have shown in [MP19a] that the operation of conditioning is Lipschitz-continuous with respect to the asymptotic entropy distance.

3.1. Arrow Collapse, Arrow Contraction and Arrow Expansion.
3.1.1. Prime Morphisms. A morphism γ_ij ∶ i → j in an indexing category G = {i; γ_ij} will be called prime if it cannot be factored into a composition of two non-identity morphisms in G. A morphism in a G-diagram indexed by a prime morphism in G will also be called prime.
3.1.2. Arrow Collapse. If a prime morphism Z_i → Z_j in a G-diagram Z is an isomorphism, it can be collapsed: the spaces Z_i and Z_j are replaced by a single space, and the new space inherits all the morphisms in Z with targets and domains Z_i and Z_j. An admissible fan will be called reduced if the morphism Z → X is an isomorphism.
3.2.The Contraction Theorem.Our aim is to prove the following theorem.
), corresponding to the original admissible fan through the combinatorial isomorphism, such that, with the notations X ∶= ⌈X₀⌉ and X′ ∶= ⌈X′₀⌉.

It is not clear whether, by constructing diagrams [Z′] as in the theorem above for a sequence of values of the parameter ε decreasing to 0, one can obtain a convergent sequence in Prob[G] whose limiting diagram satisfies the conclusions of the theorem with ε = 0.
The proof of Theorem 3.1 is based on the following proposition, which will be proven in Section 5.
Proposition 3.2.↓ Let (X₀ ← Z₀ → U) be an admissible fan in some homogeneous G-diagram of probability spaces Z. Then there exists a G-diagram Z′ containing the admissible fan (X′₀ ← Z′₀ → U′).

Proof (of Theorem 3.1): First we assume that [Z] is a homogeneous tropical diagram. This means that it can be represented by a quasi-linear sequence (Z(n))_{n∈N₀} of homogeneous diagrams, with the defect of the sequence bounded by the function ϕ(t) ∶= C · t^{3/4} for some C > 0; that is, for any m, n ∈ N the deviation from additivity is bounded in terms of ϕ, with a constant D_ϕ depending on ϕ, see [MP19b].
Fix a number n ∈ N and apply Proposition 3.2 to the homogeneous diagram Z(n), containing the admissible fan. Since X′′|u′′ does not depend on u′′ and X(n)|u does not depend on u, we obtain the estimate (3.2). The distance between [Z̃] and [Z] can be bounded as in (3.3), which also implies (3.4). Since conditioning is a Lipschitz-continuous operation with Lipschitz constant 2, we also have (3.5). Combining the estimates in (3.2), (3.3), (3.4) and (3.5) we obtain the required bound. Note that |X₀(n)| grows at most exponentially (it is bounded by e^{n([X₀]+C)} for some C > 0) and ϕ is a strictly sub-linear function. Thus, by choosing n sufficiently large depending on the given ε > 0, we obtain [Z′] satisfying the conclusions of the theorem for homogeneous [Z].
To prove the theorem in full generality, observe that all the quantities on the right-hand side of the inequalities are Lipschitz-continuous. Since Prob[G]_h is dense in Prob[G], the theorem extends to any [Z] by first approximating it to any precision by a homogeneous diagram and then applying the argument above. ⊠

3.3. The Expansion Theorem.
The following theorem is complementary to Theorem 3.1. The expansion applied to a diagram containing a reduced admissible fan produces a diagram with an admissible fan whose contraction is the original diagram. Thus, arrow expansion is a right inverse of the arrow contraction operation.
In general, contraction erases some information stored in the diagram, so there are many right inverses.We prove the theorem below by providing a simple construction of one such right inverse.
be a tropical probability space with entropy equal to λ. For any reduction of tropical spaces [A] → [B], there are natural reductions. We construct the diagram [Z] by replacing every space

And any morphism from
Clearly the resulting diagram satisfies the conclusion of the theorem. ⊠

The rest of the article is devoted to the development of the necessary tools and the proof of Proposition 3.2.

Local Estimate
In this section we derive a bound, very similar to Fano's inequality, on the intrinsic entropy distance between two diagrams of probability spaces with the same underlying diagram of sets. The bound will be in terms of the total variation distance between the two distributions corresponding to the diagrams of probability spaces. It will be used in the next section to prove the arrow contraction theorem.

4.1. Distributions.
4.1.1. Distributions on sets. For a finite set S we denote by ∆S the collection of all probability distributions on S, and by ‖π₁ − π₂‖₁ we denote the total variation distance between π₁, π₂ ∈ ∆S.

4.1.2. Distributions on Diagrams of Sets. Let Set denote the category of finite sets and surjective maps. For an indexing category G, we denote by Set⟨G⟩ the category of G-diagrams in Set. That is, objects in Set⟨G⟩ are commutative diagrams of sets indexed by the category G: the spaces in such a diagram are finite sets and the arrows represent surjective maps, subject to commutativity relations.
For a diagram of sets S = {S_i; σ_ij} we define the space of distributions on the diagram S as the space of families (π_i ∈ ∆S_i) compatible with the induced maps, where f_* ∶ ∆S → ∆S′ is the affine map induced by a surjective map f ∶ S → S′. If S₀ is the initial space of S, then there is an isomorphism

(4.1) ∆S ≅ ∆S₀.

Using the isomorphism (4.1) we define the total variation distance between two distributions π, π′ ∈ ∆S as ‖π − π′‖₁ ∶= ‖π₀ − π₀′‖₁. Given a G-diagram of sets S = {S_i; σ_ij} and an element π ∈ ∆S, we can construct a G-diagram of probability spaces (S, π) ∶= {(S_i, π_i); σ_ij}.
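Since a distribution on a diagram is determined by its value on the initial set via pushforwards (this is the content of isomorphism (4.1)), measuring total variation on S₀ controls it on every level: pushforwards can only decrease the ℓ¹-distance. A sketch with a hypothetical two-object diagram of sets:

```python
def pushforward(p, f, target_size):
    """Push a distribution on {0,...,len(p)-1} forward along the map f."""
    q = [0.0] * target_size
    for x, w in enumerate(p):
        q[f(x)] += w
    return q

def tv(p, q):
    """Total variation distance ||p - q||_1."""
    return sum(abs(a - b) for a, b in zip(p, q))

def f(x):
    # A surjection S0 -> S1 for a hypothetical two-object diagram:
    # S0 = {0, 1, 2, 3}, S1 = {0, 1}.
    return x // 2

pi_1 = [0.4, 0.1, 0.3, 0.2]      # a distribution on the initial set S0
pi_2 = [0.25, 0.25, 0.25, 0.25]  # another one

tv_initial = tv(pi_1, pi_2)
tv_pushed = tv(pushforward(pi_1, f, 2), pushforward(pi_2, f, 2))
```

Here the two distinct distributions even become equal after pushing forward to S1, illustrating why the distance is measured on the initial space.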
Below we give an estimate of the entropy distance between two G-diagrams of probability spaces (S, π) and (S, π′) in terms of the total variation distance ‖π − π′‖₁ between the distributions.

4.2. The estimate. The upper bound on the entropy distance that we derive below has two summands. One is linear in the total variation distance, with slope proportional to the log-cardinality of S₀. The second is super-linear in the total variation distance, but does not depend on S. So we have the following interesting observation: of course, the super-linear summand always dominates the linear one locally; however, as the cardinality of S becomes large, it is the linear summand that starts playing the main role. This will be the case when we apply the bound in the next section.
For α ∈ [0, 1] consider a binary probability space with the weight of one of the atoms equal to α. To prove the local estimate we decompose both π and π′ into a convex combination of a common part π̄ and rests π⁺ and π′⁺. The coupling between the common parts gives no contribution to the distance, and the worst possible estimate on the other parts is still enough to obtain the bound in the lemma, by using Proposition 2.1.
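The decomposition into a common part and rests can be sketched concretely: with α half the total variation distance, the pointwise minimum of the two distributions, rescaled, serves as the common part, and the rescaled excesses are the rests. A sketch with made-up distributions; it assumes the two distributions differ, so that α > 0:

```python
def decompose(p, q):
    """Write p = (1-a)*common + a*p_rest and q = (1-a)*common + a*q_rest,
    where a = ||p - q||_1 / 2 is the halved total variation distance.
    Assumes p != q, so that a > 0."""
    m = [min(x, y) for x, y in zip(p, q)]   # pointwise common mass
    mass = sum(m)                           # equals 1 - a
    a = 1.0 - mass
    common = [x / mass for x in m]
    p_rest = [(x - y) / a for x, y in zip(p, m)]
    q_rest = [(x - y) / a for x, y in zip(q, m)]
    return a, common, p_rest, q_rest

# Made-up distributions on a three-point set.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
a, common, p_rest, q_rest = decompose(p, q)
```

Coupling the common parts diagonally contributes nothing to the distance, which is the mechanism exploited in the proof.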
Let S₀ be the initial set in the diagram S. We will need the following obvious rough estimate of the entropy distance, which holds for any π, π′ ∈ ∆S. It can be obtained by taking the tensor product as the coupling between X and Y.
Our goal now is to write π and π′ as convex combinations of three other distributions π̄, π⁺ and π′⁺. We could do it the following way. Let π₀ and π₀′ be the distributions on S₀ that correspond to π and π′ under the isomorphism (4.1), and let α be the halved total variation distance between them. If α = 1 then the proposition follows from the rough estimate (4.2), so from now on we assume that α < 1. Define three probability distributions π̄₀, π₀⁺ and π₀′⁺ on S₀ by setting the appropriate weights for every x ∈ S₀. Denote by π̄, π⁺, π′⁺ ∈ ∆S the distributions corresponding to π̄₀, π₀⁺, π₀′⁺ ∈ ∆S₀ under the isomorphism (4.1). Thus we have the decompositions in (4.3); the reductions in the fans in (4.3) are given by coordinate projections, and the corresponding spaces are isomorphic. Now we apply Proposition 2.1 along with the rough estimate (4.2) to obtain the desired inequality. ⊠

In this section we prove Proposition 3.2, which is stated again below verbatim.

Proposition 3.2.↑ Let (X₀ ← Z₀ → U) be an admissible fan in some homogeneous G-diagram of probability spaces Z. Then there exists a G-diagram Z′ containing the admissible fan (X′₀ ← Z′₀ → U′) such that, with the notations X ∶= ⌈X₀⌉ and X′ ∶= ⌈X′₀⌉, it holds that (1) X|u = X′|u′ for any u ∈ U and u′ ∈ U′.

The proof consists of the construction in Section 5.1 and the estimates in Propositions 5.4 and 5.5.

5.1. The construction. In this section we fix an indexing category G, a minimal G-diagram of probability spaces Z with an admissible sub-fan X₀ ← Z₀ → U. We denote X ∶= ⌈X₀⌉, and by H we denote the combinatorial type of X = {X_i; χ_ij}.
Instead of the diagram Z we consider an extended diagram, which is a two-fan of H-diagrams (5.1), where Y = {Y_i; υ_ij} consists of those spaces in Z that are initial spaces of two-fans with feet in U and in some space in X. That is, for every i ∈ H the space Y_i is defined to be the initial space in the minimal fan with feet X_i and U. It may happen that for some pair of indices i₁, i₂ ∈ H the initial spaces of the fans with one foot U and the other X_{i₁} or X_{i₂} coincide in Z. In Y, however, they will be treated as separate spaces, so that the combinatorial type of Y is H. Starting with the diagram in (5.1), one can recover Z by collapsing all the isomorphism arrows. The initial space of Y will be denoted Y₀.
We would like to construct a new fan of H-diagrams X′ ← Y′ → V. Once this goal is achieved, we collapse all the isomorphisms to obtain a G-diagram satisfying the conditions in the conclusion of Proposition 3.2.
We start with a general description of the idea behind the construction, followed by the detailed argument. To introduce the new space V, we take its points to be N atoms u₁, . . . , u_N ∈ U. Ideally we would like to choose the atoms in such a way that the slices X₀|u_n are disjoint and cover the whole of X₀. It is not always possible to achieve this exactly. However, when X₀ is large, N is taken slightly larger than e^{[X₀∶U]}, and u₁, . . . , u_N are chosen at random, then with high probability the spaces X₀|u_n will overlap only a little and will cover most of X₀. The details of the construction follow.
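Before the formal details, the covering idea admits a quick Monte Carlo sketch. The model below is hypothetical: X₀ is uniform on K·B points and U labels K blocks of size B, so [X₀ ∶ U] = ln K and each slice X₀|u has B points; the constants are illustrative, not those used in the proof.

```python
import math
import random

random.seed(0)  # deterministic sketch

# Hypothetical homogeneous model: X0 uniform on K*B points, U labels
# K blocks of size B, so [X0 : U] = ln K and each slice X0|u has B points.
K, B = 64, 16

# A logarithmic factor more than e^{[X0 : U]} = K samples
# (coupon-collector margin).
N = int(2 * K * math.log(K))

# Choose u_1, ..., u_N at random; the union of the slices X0|u_n then
# covers the fraction `coverage` of X0.
samples = [random.randrange(K) for _ in range(N)]
coverage = len(set(samples)) / K
```

With these parameters the expected number of uncovered blocks is far below one, so the union of slices covers essentially all of X₀, as the heuristic above predicts.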
We fix N ∈ N and construct several new diagrams.For each of the new diagrams we provide a verbal and formal description.
• The space U^N. Points in it are independent samples of length N of points in U.
• The space V_N = ({1, . . . , N}, unif). A point n ∈ V_N should be interpreted as a choice of an index in a sample ū ∈ U^N.
• The H-diagram A. A point (x, n, ū) in A_i corresponds to the choice of a sample ū ∈ U^N, an independent choice of a member u_n of the sample, and a point x ∈ X_i|u_n. Recall that the original diagram Z was assumed to be homogeneous and, in particular, the distribution on X_i|u_n is uniform. Due to the homogeneity of Z, the space X_i|u does not depend on u ∈ U. Since V_N is also equipped with the uniform distribution, it follows that the distribution on A_i is also uniform.
(2) The diagram obtained by conditioning the top fan in the diagram in (5.3).
A very important observation is that the diagrams X′|ū,n and X|u are isomorphic for any choice of n ∈ V_N and u ∈ U. The isomorphism is the composition of a sequence of isomorphisms, where the first isomorphism follows from the definition of X′|ū, the second from the minimality of the fan B ← A → V_N, the third from the definition of A, and the fourth from the homogeneity of Z.
5.2. The estimates. We now claim and prove that one can choose a number N and a point ū ∈ U^N with the required properties, where Y′_{ū,0} and X′_{ū,0} are the initial spaces in Y′|ū and X′|ū, respectively.

5.2.1. Total Variation and Entropy Distance estimates. If we fix some x₀ ∈ X₀, then ν = ν(x₀, ·) is a scaled binomially distributed random variable with parameters N and ρ, which means that N · ν ∼ Bin(N, ρ).
First we state the following bounds on the tails of a binomial distribution.
Lemma 5.2. Let ν be a scaled binomial random variable with parameters N and ρ. Then (i) for any t ∈ [0, 1] the following holds. The proof of Lemma 5.2 can be found at the end of this section. Below we use the notation P ∶= p_{U^N} for the probability distribution on U^N. For a pair of complete diagrams C, C′ with the same underlying diagram of sets and with initial spaces C₀, C′₀, we will write α(C, C′) for the halved total variation distance between their distributions.

Proposition 5.3. In the settings above, for t ∈ [0, 1], the following inequality holds. Recall that by definition X′|ū = B|ū. We use equation (5.4) to expand the left-hand side of the inequality. Since by homogeneity of the original diagram all the summands are the same, we can fix some x₀ ∈ X₀ and estimate further. Applying Lemma 5.2(i) we obtain the required inequality. ⊠

In the propositions below we assume that X₀ is sufficiently large (larger than e^{20}).
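Bounds of this multiplicative Chernoff type can be sanity-checked numerically against the exact binomial tail. The sketch below assumes the bound has the standard two-sided form 2·exp(−N·ρ·t²/3), consistent with the exponent appearing later; the precise constants of Lemma 5.2 may differ.

```python
import math

def binom_pmf(N, rho, k):
    """Exact probability P(Bin(N, rho) = k)."""
    return math.comb(N, k) * rho**k * (1 - rho)**(N - k)

def two_sided_tail(N, rho, t):
    """P(|Bin(N, rho)/N - rho| >= t*rho), computed by direct summation."""
    return sum(binom_pmf(N, rho, k) for k in range(N + 1)
               if k <= (1 - t) * N * rho or k >= (1 + t) * N * rho)

# Illustrative parameters, not the ones used in Section 5.
N, rho, t = 200, 0.3, 0.5
exact = two_sided_tail(N, rho, t)
chernoff = 2 * math.exp(-N * rho * t**2 / 3)  # assumed shape of the bound
```

At these parameters the exact tail is several orders of magnitude below the Chernoff-type bound, which is what allows the union bound over the exponentially many atoms of X₀ in the proof.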
Proposition 5.4. In the settings above and for any 10/ln|X₀| ≤ t ≤ 1 the following holds. We will use the local estimate to bound the entropy distance and then apply Proposition 5.3. To simplify notation we will write simply α for α(X′|ū, X) = α(B|ū, X).
Note that in the chosen regime, t ≥ 10/ln|X₀|, the first summand on the left-hand side of the inequality is larger than the second, and thus the bound is controlled by a multiple of e^{−N·ρ·t²/3}. ⊠

5.2.2. The "height" estimate. Recall that for given N ∈ N and ū ∈ U^N we have constructed a two-fan of H-diagrams. We will now estimate the length of the arrow Y′_{ū,0} → X′_{ū,0}.

Proposition 5.5. In the settings above and for t ∈ [0, 2] the following holds. ⊠

Proof: First we observe that the fiber of the reduction Y′_{ū,0} → X′_{ū,0} over a point x ∈ X′_{ū,0} is a homogeneous probability space of cardinality equal to N(x, ū); therefore its entropy is ln N(x, ū).
The last inequality above follows from Lemma 5.2(ii). ⊠

5.3. Proof of Proposition 3.2. Let X′|ū ← Y′|ū → V_N be the fan constructed in Section 5.1. The construction is parameterized by the number N and the atom ū ∈ U^N. Below we will choose a particular value for N, and apply the estimates in Propositions 5.4 and 5.5 with a particular choice of the parameter t, to show that there is ū ∈ U^N such that the fan satisfies the conclusions of Proposition 3.2. Let

N ∶= ln³|X₀| · ρ⁻¹ = ln³|X₀| · e^{[X₀∶U]},   t ∶= 10/ln|X₀|.

With these choices of N and t, Proposition 5.4 implies
Theorem 2.3. For any indexing category G, the space Prob[G]_h is dense in Prob[G]. Similarly, the space Prob[G]_{h,m} is dense in Prob[G]_m. ⊠

It is possible that the spaces Prob[G]_h and Prob[G] coincide. At this time we have neither a proof nor a counterexample to this conjecture.

2.4. Conditioning in Tropical Diagrams. For a tropical G-diagram [X] containing a space [U] we defined the conditioned diagram [X|U].
3.1.3. Arrow Contraction and Expansion. Arrow contraction and expansion are two operations on tropical G-diagrams. Roughly speaking, arrow contraction applied to a tropical G-diagram [Z] results in another tropical G-diagram [Z′] such that one of the arrows becomes an isomorphism, while some parts of the diagram are not modified. Arrow expansion is an inverse operation to arrow contraction.

3.1.4. Admissible and Reduced Sub-fans. An admissible fan in a G-diagram Z is a minimal fan X ← Z → U, such that Z is the initial space of Z and any space in Z belongs either to the ideal ⌈X⌉ or to the co-ideal ⌊U⌋.
Results by Ahlswede and Körner were applied in [MMRV02], resulting in a new non-Shannon information inequality. Moreover, in [MMRV02] a new proof of these results was given; this new proof is similar to the proof of the arrow contraction result in the present paper.
In [MP18] we defined relative entropy and mutual information for the tropical spaces included in some diagram.

2.3. Asymptotic Equipartition Property for Diagrams. 2.3.1. Homogeneous diagrams. A G-diagram X is called homogeneous if the automorphism group Aut(X) acts transitively on every space in X. Homogeneous probability spaces are uniform; for more complex indexing categories this simple description is not sufficient.

2.3.2. Tropical Homogeneous Diagrams. The subcategory of all homogeneous G-diagrams will be denoted Prob⟨G⟩_h, and we write Prob⟨G⟩_{h,m} for the category of minimal homogeneous G-diagrams. These spaces are invariant under the tensor product, thus they are metric Abelian monoids. Passing to the tropical limit we obtain the spaces of tropical (minimal) homogeneous diagrams, which we denote Prob[G]_h and Prob[G]_{h,m}.

2.3.3. Asymptotic Equipartition Property. In [MP18] the following theorem is proven.

Theorem 2.2. Suppose X ∈ Prob⟨G⟩ is a G-diagram of probability spaces for some fixed indexing category G. Then there exists a sequence H