Information Inequalities for Five Random Variables

Csirmaz, Laszlo; Csirmaz, Elod P.

doi:10.3390/computation14020042


by Laszlo Csirmaz ^1,2,* and Elod P. Csirmaz

¹ Alfréd Rényi Institute of Mathematics, 1053 Budapest, Hungary
² Institute of Information Theory and Automation, CZ-182 00 Prague, Czech Republic
^* Author to whom correspondence should be addressed.

Computation 2026, 14(2), 42; https://doi.org/10.3390/computation14020042

Submission received: 29 December 2025 / Revised: 22 January 2026 / Accepted: 27 January 2026 / Published: 2 February 2026

(This article belongs to the Section Computational Engineering)


Abstract

The entropic region is formed by the collection of the Shannon entropies of all subvectors of finitely many jointly distributed discrete random variables. For four or more variables, the structure of the entropic region is mostly unknown. We utilize a variant of the Maximum Entropy Method to obtain five-variable non-Shannon entropy inequalities, which delimit the five-variable entropy region. This method adds copies of some of the random variables in generations. A significant reduction in computational complexity, achieved through theoretical considerations and by harnessing the inherent symmetries, allowed us to calculate all five-variable non-Shannon inequalities provided by the first nine generations. Based on the results, we define two infinite collections of such inequalities and prove them to be entropy inequalities. We investigate downward-closed subsets of non-negative lattice points that parameterize these collections, and based on this, we develop an algorithm to enumerate all extremal inequalities. The discovered set of entropy inequalities is conjectured to characterize the applied method completely.

Keywords: entropic region; information inequalities; polymatroid; polyhedral geometry

MSC: 05B35; 26A12; 52B12; 90C29; 94A17; 52B40; 90C27

1. Introduction

Many important mathematical problems can be reduced to the following question: does a collection of finite random variables exist such that the entropies of the variable subsets satisfy certain linear constraints? Examples include, but are not limited to, channel coding [1] and network coding in particular [2], estimating the efficiency of secret sharing schemes [3,4,5], questions about matroid representations [6], guessing games [7], extracting information from common strings in cryptography [8], additive combinatorics [9], and finding conditional independence inference rules [10].

The entropy function of finitely many discrete random variables $〈 ξ_{i} : i \in N 〉$ indexed by the fixed finite set N maps the non-empty subsets $I \subseteq N$ to the Shannon entropy $H (ξ_{I})$ of the variable set $ξ_{I} = 〈 ξ_{i} : i \in I 〉$, see [11]. The entropy region, denoted by $Γ_{N}^{*}$, is the range of the entropy function; it is a part of the $(2^{| N |} - 1)$-dimensional Euclidean space where the coordinates are labeled by non-empty subsets of N. Entropies are non-negative real numbers, and thus the entropy region lies in the non-negative orthant of this Euclidean space. It is delimited by a collection of homogeneous linear inequalities corresponding to the non-negativity of basic Shannon information measures [11]. Points satisfying all these inequalities form the Shannon bound, denoted by $Γ_{N}$.

N. Pippenger argued in [12] that linear inequalities bounding the entropic region $Γ_{N}^{*}$ encode the fundamental laws of Information Theory and determine the limits of information transmission and data compression. The long-standing problem of whether a linear information inequality can properly cut into the Shannon bound was settled in 1998 by Zhang and Yeung [13], who exhibited the first example of such a non-Shannon information inequality. Their discovery initiated intensive research. The phrase Copy Lemma was coined by Dougherty et al. [14] to describe the general method distilled from the original Zhang–Yeung construction. The Copy Lemma has been applied successfully to generate several hundred sporadic and a couple of infinite families of non-Shannon entropy inequalities for $Γ_{4}^{*}$, see [14,15,16]. A different method, utilizing an information-theoretic lemma attributed to Ahlswede and Körner [17], was proposed in [18]; later it was shown to be equivalent to a special case of the Copy Lemma [19].

Our method to obtain five-variable non-Shannon entropy inequalities is based on a more general paradigm of which the Copy Lemma is a special case [20]. Derived from the principle of maximum entropy [21], it is called MEM, short for Maximum Entropy Method. For more details, see Section 3.

Previous works on generating and applying non-Shannon entropy inequalities, such as [4,10,14,15,22,23], focused on the four-variable case, and only a few sporadic five-variable non-Shannon inequalities have been discovered, such as the MMRV inequality from [18]. This is the first work that provides a method that generates an infinite collection of non-Shannon bounds on the five-variable entropy region $Γ_{5}^{*}$. Compared to the four-variable case, there are significant challenges, both theoretical and computational. The four-variable entropy region $Γ_{4}^{*}$ sits in the 15-dimensional Euclidean space, while the five-variable region $Γ_{5}^{*}$ is 31-dimensional. The structure of the Shannon bound $Γ_{4}$ is well understood: it has 41 extremal directions, and only 6 of them have no entropic points. The entropy region $Γ_{4}^{*}$ has an inner polyhedral cone where it fills its Shannon bound, and has six isomorphic “protrusions” towards the six exceptional extremal directions, each protrusion surrounded by 15 hyperplanes of which 14 come from the Shannon bound [24]. Only the protrusions contribute to new entropy inequalities, and their dimension can be reduced to 10. Computational results about $Γ_{4}^{*}$ can be obtained by computing vertices and facets of numerous implicitly defined 10-dimensional polyhedra [22]. In contrast, the Shannon bound $Γ_{5}$ of the five-variable entropy region has 117,983 extremal directions [25], and for a few of them it is not even known whether they contain an entropic point. No structural reduction similar to the four-variable case is available, and it is not known whether such a reduction exists. Computations about $Γ_{5}^{*}$ can still be reduced to 25-dimensional polyhedral enumeration problems (although with a significantly larger number of constraints than in the four-variable case). The complexity of enumeration problems typically doubles when the dimension increases by one, making such high-dimensional enumeration problems practically intractable.

We overcome this computational difficulty by applying a particular variant of the Maximum Entropy Method. This variant, working in generations, first reduces the problem dimension from 31 to 19, and then, at each generation, adds extra copies of some of the random variables, increasing the problem dimension again. Theoretical considerations and harnessing the inherent symmetry allowed us to complete the associated polyhedral computations up to nine generations. The output was the complete list of five-variable non-Shannon inequalities provided by the first nine generations. Based on the experimental results, we define an infinite collection of five-variable inequalities that we prove are provided by this MEM variant (in particular, they are valid non-Shannon entropy inequalities) and conjecture this collection to be complete; that is, no additional inequalities are yielded by this MEM variant. The collection of the inequalities is parametrized by finite, downward-closed subsets of the non-negative lattice points of the plane. Some of the inequalities in our collection are consequences of the others; those that are not are called extremal. We developed an incremental algorithm that enumerates, from generation to generation, the parameters yielding the extremal inequalities, in complete agreement with the computational results. The algorithm allowed us to significantly exceed the capabilities of polyhedral computation. While numerical instability prevented the completion of the polyhedral computation for the tenth generation, all extremal entropy inequalities were enumerated up to generation 60. Finally, we have examined the large-scale behavior of the extremal inequalities, and depicted how these inequalities delimit a three-dimensional cross-section of $Γ_{5}^{*}$.

The new five-variable non-Shannon inequalities can be applied to real-world problems. The most immediate application is in network coding. The new inequalities tighten the boundaries; they provide stricter and more accurate bounds on network capacity. In a network protocol they can assist in proving whether a targeted data rate is achievable or not [2].

Cloud storage services (like Google Drive or AWS S3) distribute data fragments across many nodes [26]. In case of failure, the node has to download the missing data from other nodes. The new five-variable inequalities can be used to determine the theoretical limits of storage efficiency for systems with more complex failure models or larger clusters.

In the realm of secret sharing, entropy inequalities provide lower bounds on the size of secrets [3,4,5]. To explore another facet of these problems, the new inequalities can prove that certain efficient schemes are impossible to realize. In complex datasets, it is important to distinguish between correlation and actual causation [27]. When an AI model analyzes data to build a causal graph, it can use entropy inequalities to rule out models that are information-theoretically impossible, narrowing the search space and improving accuracy.

In this paper lemmas, claims and theorems are arranged so that each is used in the same section only, typically right after they are stated and proved. Section 4 proves structural properties of the entropy region that are used to reduce the computational complexity of the polyhedral algorithms. The main theoretical results are stated and proved in Section 7. In Theorem 1 we prove a large collection of entropy inequalities parametrized by the downward-closed subsets of the non-negative lattice points. Lemmas estimating different entropy expressions are used in this section only. Unfortunately, many inequalities provided by Theorem 1 are consequences of the others. Claims and lemmas in Section 8 provide the theoretical foundation for our algorithm that selects and enumerates the extremal inequalities among them.

The remaining part of the paper is organized as follows. Notations are recalled in Section 2. Section 3 describes the special variant of the Maximum Entropy Method we apply to $Γ_{5}^{*}$. Section 4 discusses possible simplifications, including how symmetry can be utilized and how the MEM parameters were chosen. Section 5 describes the chosen coordinate systems, polyhedral computations, and their results. Section 6 presents the five-variable inequalities we obtained, paving the way for the definition of two infinite families of such inequalities in Section 7. Additional theoretical results, including the proof that inequalities in these families are indeed generated by the MEM method, are presented in Section 7. Section 8 discusses methods that can recognize extremal inequalities, discusses the incremental algorithm that enumerates the extremal inequalities for each MEM generation, describes the large-scale behavior of the new inequalities, and investigates the delimited part of the five-variable entropy region. Finally, Section 9 summarizes our work, lists open questions, and provides directions for further work.

2. Preliminaries

In this paper all sets are finite. Capital letters, such as A, J, N, etc., denote (finite) sets; elements of these sets are denoted by lower case letters. The union sign and the curly brackets around singletons are frequently omitted; thus, $N i j$ denotes the set $N \cup \{i, j\}$. The difference of two sets is written as $A - B$, or $A - b$ if the second set is a singleton. The star in the union $A \cup^{*} B$ emphasizes that A and B are disjoint sets. A partition of N is a collection of non-empty disjoint subsets of N whose union equals N.

A discrete random variable ξ takes its values from a finite set $X$, called the alphabet. The probability that $ξ$ takes the value $x \in X$ is denoted by $\Pr (ξ = x)$, or simply by $\Pr (x)$ when the random variable $ξ$ is clear from the context. Suppose $ξ$ is defined on the direct product $X = \prod_{i \in N} X_{i}$ for some finite set N, called the base set. For a non-empty $A \subseteq N$ the marginal $ξ_{A}$ is defined on the product alphabet $X_{A} = \prod_{i \in A} X_{i}$ so that the probability of $y \in X_{A}$ is the sum of the probabilities of those $x \in X$ whose projection to $X_{A}$ equals y:

\Pr (y) = \sum \{\Pr (x) : x ↾ A = y\} .

(1)

To emphasize that $ξ$ is defined on a product space, we write $ξ = (ξ_{i} : i \in N)$, and say that the random variables $ξ_{i}$ are distributed jointly. The Shannon entropy of the distribution $ξ$ is defined as

H (ξ) = \sum_{x \in X} - \Pr (x) \log \Pr (x)

(2)

with the convention that $0 \log 0 = 0$. If $ξ = (ξ_{i} : i \in N)$ is a joint distribution, then we write $H_{ξ} (A)$ for $H (ξ_{A})$. The index $ξ$ is also dropped when it is clear from the context. By convention, $H (\emptyset) = 0$. The entropies $H_{ξ} (A)$ are arranged into a vector indexed by the non-empty subsets A of N. This vector is the entropy profile of the distribution $ξ$. The collection of these $(2^{| N |} - 1)$-dimensional vectors forms the entropy region, denoted by $Γ_{N}^{*}$. Elements of $Γ_{N}^{*}$ are considered interchangeably as vectors, as points in this Euclidean space, and as functions assigning non-negative real numbers to non-empty subsets of the base set N. For a gentle introduction to these notions of Information Theory, please consult [11].
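To make definitions (1) and (2) concrete, the following minimal Python sketch computes the entropy profile of a small joint distribution. The dictionary format of the distribution, the function name `entropy_profile`, and the use of base-2 logarithms are assumptions made for illustration only; the paper does not fix a logarithm base or prescribe an implementation.

```python
from itertools import combinations
from math import log2


def entropy_profile(pmf, n):
    """Map every non-empty subset A of range(n) to H(xi_A).

    pmf maps outcome tuples of length n to probabilities
    (an illustrative input format, not taken from the paper).
    """
    profile = {}
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            # Marginal on `subset`: add up masses with equal projection, as in (1).
            marginal = {}
            for outcome, p in pmf.items():
                key = tuple(outcome[i] for i in subset)
                marginal[key] = marginal.get(key, 0.0) + p
            # Shannon entropy as in (2), with the convention 0 log 0 = 0.
            profile[frozenset(subset)] = -sum(
                p * log2(p) for p in marginal.values() if p > 0
            )
    return profile


# Example: two independent fair bits and their XOR.
xor_pmf = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
xor_profile = entropy_profile(xor_pmf, 3)
```

In this example every single variable has entropy 1 bit, while every pair and the full triple have entropy 2 bits, so the profile is a point of the 7-dimensional space for three variables.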

Notions of conditional entropy, mutual information, and conditional mutual information from Information Theory are formally extended to the functional form of these vectors. If f is any function on subsets of N, then for subsets $A, B, C, D$ of N the following forms will be used as abbreviations:

\begin{aligned} f (A | B) & \overset{def}{=} f (A B) - f (B), \\ f (A, B) & \overset{def}{=} f (A) + f (B) - f (A B), \\ f (A, B | C) & \overset{def}{=} f (A C) + f (B C) - f (A B C) - f (C), \text{ and} \\ f [A, B, C, D] & \overset{def}{=} - f (A, B) + f (A, B | C) + f (A, B | D) + f (C, D) . \end{aligned}

The first three expressions are called conditional entropy, mutual information, and conditional mutual information, respectively. The last line defines the Ingleton expression. An entropy function is not defined on the empty set; nevertheless, $f (\emptyset) = 0$ will be assumed whenever convenient. In particular, $f (A, B | \emptyset)$ and $f (A, B)$ are the same expressions. Frequently, when clear from the context, the function f is omitted before the parenthesized expression. Additionally, if applied to singletons, the Ingleton expression is written without commas. An example is the inequality

[a b c d] + (a, b | z) + (b, z | a) + (a, z | b) + 3 (z | a b) ⩾ 0 .

(3)
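The abbreviations above translate directly into code. The sketch below is illustrative only: the helper names (`val`, `cond_entropy`, `mutual_info`, `ingleton`), the representation of a function f as a dictionary keyed by frozensets, and the test function f(A) = min(|A|, 2) are all assumptions, not part of the paper. It evaluates the left-hand side of (3) for that test function.

```python
from itertools import combinations


def val(f, *parts):
    """f on the union of the given sets, using the convention f(empty set) = 0."""
    union = frozenset().union(*parts)
    return f[union] if union else 0.0


def cond_entropy(f, A, B):
    """f(A | B) = f(AB) - f(B)."""
    return val(f, A, B) - val(f, B)


def mutual_info(f, A, B, C=""):
    """f(A, B | C); an empty C gives the unconditional f(A, B)."""
    return val(f, A, C) + val(f, B, C) - val(f, A, B, C) - val(f, C)


def ingleton(f, A, B, C, D):
    """The Ingleton expression f[A, B, C, D]."""
    return (-mutual_info(f, A, B) + mutual_info(f, A, B, C)
            + mutual_info(f, A, B, D) + mutual_info(f, C, D))


# Hypothetical test function on {a, b, c, d, z}: f(A) = min(|A|, 2).
ground = "abcdz"
f = {frozenset(s): float(min(len(s), 2))
     for k in range(1, len(ground) + 1) for s in combinations(ground, k)}

# Left-hand side of inequality (3); sets are written as strings of elements.
lhs = (ingleton(f, "a", "b", "c", "d") + mutual_info(f, "a", "b", "z")
       + mutual_info(f, "b", "z", "a") + mutual_info(f, "a", "z", "b")
       + 3 * cond_entropy(f, "z", "ab"))
```

For this test function the Ingleton term equals 2 and the whole left-hand side equals 5, consistent with (3).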

Shannon inequalities state the non-negativity of the conditional entropy, mutual information, and conditional mutual information for all subsets A, B, C of the base set N. They are consequences of the unique minimal set of such inequalities, called basic Shannon inequalities, see [11], listed in (B1) and (B2) below:

(B1): $f (i | N - i) ⩾ 0$ for all $i \in N$ ;
(B2): $f (a, b | K) ⩾ 0$ for all $K \subseteq N$ and different $a, b \in N - K$ , including $K = \emptyset$ .
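As a hedged illustration of (B1) and (B2), the following sketch enumerates the basic Shannon inequalities for a given base set and uses them to test whether a function is a polymatroid. The function names, the dictionary representation, and the tolerance parameter are assumptions introduced here for illustration.

```python
from itertools import combinations


def subsets(items):
    """All subsets (including the empty one) of `items` as frozensets."""
    items = list(items)
    for k in range(len(items) + 1):
        for s in combinations(items, k):
            yield frozenset(s)


def basic_shannon(ground):
    """Yield each basic inequality as {subset: coefficient}, meaning sum >= 0."""
    N = frozenset(ground)
    for i in sorted(N):                      # (B1): f(N) - f(N - i) >= 0
        row = {N: 1}
        if N - {i}:
            row[N - {i}] = -1
        yield row
    for a, b in combinations(sorted(N), 2):  # (B2): f(aK) + f(bK) - f(abK) - f(K) >= 0
        for K in subsets(N - {a, b}):
            row = {K | {a}: 1, K | {b}: 1, K | {a, b}: -1}
            if K:                            # f(empty set) = 0 contributes nothing
                row[K] = -1
            yield row


def is_polymatroid(f, ground, tol=1e-9):
    """True if f satisfies every basic Shannon inequality up to `tol`."""
    return all(sum(c * f[s] for s, c in row.items()) >= -tol
               for row in basic_shannon(ground))


count4 = sum(1 for _ in basic_shannon("abcd"))    # 4 + 6 * 4 = 28
count5 = sum(1 for _ in basic_shannon("abcde"))   # 5 + 10 * 8 = 85

# Two hypothetical test functions on {a, b, c, d}.
rank2 = {s: min(len(s), 2) for s in subsets("abcd") if s}
square = {s: len(s) ** 2 for s in subsets("abcd") if s}
```

Here `rank2` passes the test, while `square` fails it because $f (a, b) = 1 + 1 - 4 < 0$.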

The collection of all $(2^{| N |} - 1)$-dimensional vectors (or points, or functions) that satisfy the Shannon inequalities is denoted by $Γ_{N}$. It is a natural outer bound for the entropy region $Γ_{N}^{*}$. $Γ_{N}$ is a pointed polyhedral cone [28]; its facets are the hyperplanes specified by the basic Shannon inequalities. Polymatroids are elements of $Γ_{N}$ written in functional form. A polymatroid is usually written as $(f, N)$, or just f, when we say that f is on N. The polymatroid f is entropic if it is in $Γ_{N}^{*}$, and almost entropic, or aent for short, if it is in the closure (in the usual Euclidean topology) of $Γ_{N}^{*}$. Linear inequalities valid for all polymatroids are consequences of the basic Shannon inequalities; an example is the inequality (3). A non-Shannon inequality is a homogeneous linear inequality that is valid for points of the entropic region but not for all points of the Shannon bound. Equivalently, the non-negative side of the hyperplane corresponding to such an inequality contains the complete entropy region, while it cuts properly into $Γ_{N}$.

The closure of the entropic region is a pointed convex full-dimensional cone [11], and only its boundary points can be non-entropic [29].

The polymatroid $(f, N)$ on the base set N is linearly representable over the field $F$, or $F$-representable for short, if there is a finite-dimensional vector space V over $F$, and linear subspaces $V_{i} \subseteq V$ for $i \in N$, such that for all $I \subseteq N$, $f (I)$ is the dimension of the linear subspace spanned by $⋃_{i \in I} V_{i}$. Clearly, if both $(f, N)$ and $(g, N)$ are representable over the same field $F$, then so is their sum $f + g$. The polymatroid f is $F$-linear if it is in the closure of the multiples of $F$-representable polymatroids. By the previous remark, $F$-linear polymatroids form a closed cone. Finally, f is linear if it is $F$-linear for some field $F$.

Following a compactness argument, if f is $F$-representable, then it is representable over some finite field as well, see [30], meaning that the vector space V is also finite. Taking the uniform distribution on V provides the entropic polymatroid $(\log | V |) f$. Thus, linear polymatroids are also almost entropic.

Linear polymatroids on the base set N with $| N | ⩽ 5$ are $F$-linear for every field $F$, see [24,31]; this statement is not true in general. For $| N | ⩽ 3$ every polymatroid is linear. For $N = \{a, b, c, d\}$ a polymatroid f on N is linear if and only if it satisfies the following six instances of the Ingleton inequality:

\begin{matrix} f [a b c d] ⩾ 0, & f [a c b d] ⩾ 0, & f [a d b c] ⩾ 0, \\ f [b c a d] ⩾ 0, & f [b d a c] ⩾ 0, & f [c d a b] ⩾ 0, \end{matrix}

(4)

see [24]. Since the Ingleton expression is symmetric in the first two and in the last two arguments, these expressions cover all 24 permutations of N.
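The criterion in (4) is easy to evaluate mechanically. The sketch below checks the six Ingleton instances on one hypothetical four-element polymatroid, chosen here for illustration: singletons have value 2, pairs 3 except $f (c d) = 4$, and triples and the full set have value 4. One can verify by hand that this function satisfies (B1) and (B2). The helper names are assumptions.

```python
from itertools import combinations


def val(f, *parts):
    """f on the union of the given sets, with f(empty set) = 0."""
    union = frozenset().union(*parts)
    return f[union] if union else 0


def mi(f, A, B, C=""):
    """Conditional mutual information f(A, B | C)."""
    return val(f, A, C) + val(f, B, C) - val(f, A, B, C) - val(f, C)


def ingleton(f, A, B, C, D):
    """The Ingleton expression f[A, B, C, D]."""
    return -mi(f, A, B) + mi(f, A, B, C) + mi(f, A, B, D) + mi(f, C, D)


# Hypothetical polymatroid on {a, b, c, d} (values chosen for illustration).
f = {}
for k in range(1, 5):
    for s in combinations("abcd", k):
        f[frozenset(s)] = {1: 2, 2: 3, 3: 4, 4: 4}[k]
f[frozenset("cd")] = 4

# The six argument orders appearing in (4).
instances = ["abcd", "acbd", "adbc", "bcad", "bdac", "cdab"]
values = {w: ingleton(f, *w) for w in instances}
is_linear = all(v >= 0 for v in values.values())
```

For this function $f [a b c d] = -1$ while the other five instances equal 1, so by the criterion above it is a polymatroid that is not linear.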

Finally, we recall notions of independence. Let $(f, N)$ be a polymatroid, and X, $Y_{1}$, …, $Y_{k}$ be disjoint subsets of N. The subsets $Y_{1}$ and $Y_{2}$ are independent in f if $f (Y_{1}, Y_{2}) = 0$. The collection $Y_{1}, \dots, Y_{k}$ is completely independent in f if for any two disjoint subsets I and J of the indices $\{1, 2, \dots, k\}$, $Y_{I} = ⋃_{i \in I} Y_{i}$ and $Y_{J}$ are independent, or, equivalently, if

f (Y_{1} \dots Y_{k}) = f (Y_{1}) + \dots + f (Y_{k}) .

(5)

In this case we also have $f (Y_{I}) = \sum_{i \in I} f (Y_{i})$ for every subset I of the indices. The disjoint subsets $Y_{1}$ and $Y_{2}$ are conditionally independent over X if $f (Y_{1}, Y_{2} | X) = 0$; and $Y_{1}, \dots, Y_{k}$ are completely conditionally independent over X if $Y_{I}$ and $Y_{J}$ are conditionally independent over X for arbitrary disjoint subsets I and J of the indices. An equivalent condition is

f (Y_{1} \dots Y_{k} | X) = f (Y_{1} | X) + \dots + f (Y_{k} | X),

(6)

which similarly implies $f (Y_{I} | X) = \sum_{i \in I} f (Y_{i} | X)$ for every index set I.

3. The Maximum Entropy Method

In general terms, the principle of maximum entropy is easy to formulate: if a probability distribution is specified only partially, take the one with the largest entropy, see, e.g., [21]. In the particular case applied here “partial specification” means fixing some, but not all, marginal distributions. To be more concrete, suppose $ξ$ is distributed jointly on the base set N. Partition N into three non-empty subsets as $N = Y \cup^{*} X \cup^{*} Z$. Take $n ⩾ 1$ disjoint copies of Y and $m ⩾ 1$ disjoint copies of Z to form the enlarged base set

N^{*} = Y_{1} \cup^{*} \dots \cup^{*} Y_{n} \cup^{*} X \cup^{*} Z_{1} \cup^{*} \dots \cup^{*} Z_{m} .

(7)

Consider the collection of those distributions $ξ^{*}$ on $N^{*}$ whose marginals on $Y_{i} X$ are equal to $ξ_{Y X}$, and whose marginals on $X Z_{j}$ are equal to $ξ_{X Z}$. That is, the marginal of $ξ$ on $Y X$ coincides with the marginals of $ξ^{*}$ on all $Y_{i} X$, and the marginal of $ξ$ on $X Z$ coincides with the marginals of $ξ^{*}$ on all $X Z_{j}$. This collection of distributions is not empty, as one can take each $Y_{i}$ to be the same as Y, and each $Z_{j}$ to be the same as Z. The total entropy is a strictly concave function of the probability masses, and fixing certain marginals imposes linear constraints on those masses. Consequently, there is a unique optimal distribution $ξ^{*}$ with maximum total entropy, see [32]. Although structural properties of the maximum entropy distributions are largely unknown, they are known to satisfy numerous conditional independencies. For this particular case, these are stated as Lemma 1 below.

Lemma 1.

In the distribution with maximum total entropy, the subsets $Y_{1}, \dots, Y_{n}$ and $Z_{1}, \dots, Z_{m}$ are completely conditionally independent over X.

Proof.

If some of the conditional independence statements do not hold, then one can redefine the distribution keeping the specified marginals while increasing the total entropy. For details, see [20]. □

Since identical distributions have identical entropy profiles, Lemma 1 immediately implies that an entropic polymatroid has an $n, m$-copy as defined below:

Definition 1.

Let f be a polymatroid on N, and partition N into three non-empty subsets as $N = Y \cup^{*} X \cup^{*} Z$. Let $Y_{1}, \dots, Y_{n}$ and $Z_{1}, \dots, Z_{m}$ be disjoint copies of Y and Z, respectively. The polymatroid $f^{*}$ on the base set $N^{*} = Y_{1} \dots Y_{n} X Z_{1} \dots Z_{m}$ is an $n, m$-copy of f if

(i): $f^{*}$ restricted to $Y_{i} X$ is isomorphic to $f ↾ Y X$ for every $i ⩽ n$ ,
(ii): $f^{*}$ restricted to $X Z_{j}$ is isomorphic to $f ↾ X Z$ for every $j ⩽ m$ ,
(iii): the $n + m$ subsets $Y_{1}, \dots, Y_{n}, Z_{1}, \dots, Z_{m}$ are completely conditionally independent over X in $f^{*}$ .
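For an entropic f, a distribution realizing conditions (i)–(iii) can be written down explicitly: complete conditional independence over X together with the prescribed marginals forces the product form $\Pr (x) \prod_{i} \Pr (y_{i} | x) \prod_{j} \Pr (z_{j} | x)$. The sketch below builds this distribution and checks the marginals and equation (6) numerically. It is a minimal illustration under simplifying assumptions: each of Y, X, Z is collapsed into a single variable, the function names are hypothetical, and base-2 logarithms are used.

```python
from itertools import product
from math import isclose, log2


def marginal(pmf, idx):
    """Marginal of pmf on the coordinate positions listed in idx."""
    out = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def entropy(pmf):
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def nm_copy(pmf, n, m):
    """pmf on (y, x, z) triples -> pmf on (y_1, ..., y_n, x, z_1, ..., z_m)."""
    p_x, p_yx, p_xz = marginal(pmf, (1,)), marginal(pmf, (0, 1)), marginal(pmf, (1, 2))
    ys = sorted({y for y, _ in p_yx})
    zs = sorted({z for _, z in p_xz})
    copy = {}
    for (x,), px in p_x.items():
        if px == 0:
            continue
        for y_tuple in product(ys, repeat=n):
            for z_tuple in product(zs, repeat=m):
                p = px
                for y in y_tuple:          # factor Pr(y_i | x)
                    p *= p_yx.get((y, x), 0.0) / px
                for z in z_tuple:          # factor Pr(z_j | x)
                    p *= p_xz.get((x, z), 0.0) / px
                if p > 0:
                    copy[y_tuple + (x,) + z_tuple] = p
    return copy


# Example: y, z independent fair bits, x = y AND z; build a 2,1-copy.
pmf = {(y, y & z, z): 0.25 for y in (0, 1) for z in (0, 1)}
star = nm_copy(pmf, 2, 1)                  # coordinates (y_1, y_2, x, z_1)
h_x = entropy(marginal(pmf, (1,)))
lhs = entropy(star) - h_x                  # H(Y_1 Y_2 Z_1 | X)
rhs = (2 * (entropy(marginal(pmf, (0, 1))) - h_x)
       + (entropy(marginal(pmf, (1, 2))) - h_x))
```

The values `lhs` and `rhs` agree up to rounding, which is condition (6) for the three copies, and the marginals of `star` on $(y_{1}, x)$, $(y_{2}, x)$ and $(x, z_{1})$ reproduce those of the original distribution.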

The special version of the Maximum Entropy Method used in this paper is based on the fact that entropic polymatroids have $n, m$-copies. For fixed integers n and m, polymatroids on $Y X Z$ that have an $n, m$-copy form a polyhedral cone $C_{n, m}$. This is proved as Claim 1 below. The cone $C_{n, m}$ contains the complete entropy region $Γ_{Y X Z}^{*}$, and is contained in the Shannon cone $Γ_{Y X Z}$. Consequently, bounding facets of the cone $C_{n, m}$ that are not facets of the Shannon cone provide new entropy inequalities. This method is summarized as follows.

Maximum Entropy Method (special case). Fix the base set N and the partition $N = Y \cup^{*} X \cup^{*} Z$. For $n, m ⩾ 1$ let $C_{n, m}$ be the polyhedral cone of those polymatroids on N that have an $n, m$-copy. Compute all bounding facets of $C_{n, m}$ as homogeneous linear inequalities, and delete those which are consequences of the basic Shannon inequalities. The remaining inequalities form the maximal set of non-Shannon inequalities provided by the partition $Y X Z$ and the numbers n and m.

Let us remark that while the maximum entropy extension is unique, the $n, m$-copy in Definition 1 is typically not, as the definition captures only a small part of the properties of the maximum entropy extension. The obtained entropy inequalities form the facets of a convex polyhedral cone; consequently, they are independent in the sense that none of them is a consequence of the others or the Shannon inequalities.

Next we prove that $C_{n, m}$ is indeed a polyhedral cone.

Claim 1.

Polymatroids $(f, N)$ with an $n, m$-copy form a polyhedral cone.

Proof.

Consider the polymatroid f as a $(2^{| N |} - 1)$-dimensional vector indexed by the non-empty subsets of N. Write this vector as $(x, u)$ where $x$ of dimension $d_{1}$ contains those coordinates where the index I is a subset of either $Y X$ or $X Z$, and $u$ of dimension $d_{2}$ contains the rest, namely those subsets that intersect both Y and Z. Clearly, $d_{1} + d_{2} = 2^{| N |} - 1$. Similarly, let $y$ be the vector formed from the values of the $n, m$-copy polymatroid $f^{*}$ as indexed by the subsets of $N^{*}$. The vector $y$ has dimension $d_{3} = 2^{| N^{*} |} - 1$. Now, $(f^{*}, N^{*})$ is a polymatroid if the vector $y$ satisfies all linear inequality constraints imposed by the basic Shannon inequalities in (B1) and (B2); and it is an $n, m$-copy of f if, additionally, the composed vector $(x, y)$ satisfies the equality constraints corresponding to conditions (i)–(iii) in Definition 1. Consequently, there exists a matrix M with $d_{1} + d_{3}$ columns, depending only on the partition $Y X Z$ and the numbers n and m, so that f has an $n, m$-copy if and only if there is a vector $y$ satisfying $M \cdot {(x, y)}^{⊤} ⩾ 0$. Similarly, $(f, N)$ is a polymatroid if, for another matrix B with $d_{1} + d_{2}$ columns expressing the basic Shannon inequalities for $Y X Z$, we have $B \cdot {(x, u)}^{⊤} ⩾ 0$. Thus the collection of polymatroids on N that have an $n, m$-copy is the set

Q = \{(x, u) \in R^{d_{1} + d_{2}} : B \cdot {(x, u)}^{⊤} ⩾ 0, \text{ and } M \cdot {(x, y)}^{⊤} ⩾ 0 \text{ for some } y \in R^{d_{3}}\} .

Here M and B are matrices with integer entries; these matrices depend only on $Y X Z$, n, and m. Since $Q$ is the intersection of a polyhedral cone and the projection of a polyhedral cone, it is also a polyhedral cone, as claimed. □
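The dimensions $d_{1}$, $d_{2}$ and $d_{3}$ in the proof follow from simple counting, and tabulating them shows how quickly the number of auxiliary coordinates grows with n and m. The sketch below is only an arithmetic illustration; the function name and the block sizes used in the example are hypothetical choices and are not claimed to be the partition used in the paper.

```python
def copy_dimensions(size_y, size_x, size_z, n, m):
    """Return (d1, d2, d3) for blocks of the given sizes and an n,m-copy."""
    total = 2 ** (size_y + size_x + size_z) - 1
    # Non-empty subsets of YX or of XZ; subsets of X are counted only once.
    d1 = 2 ** (size_y + size_x) + 2 ** (size_x + size_z) - 2 ** size_x - 1
    # Subsets meeting both Y and Z, combined with an arbitrary part of X.
    d2 = (2 ** size_y - 1) * (2 ** size_z - 1) * 2 ** size_x
    assert d1 + d2 == total
    # Non-empty subsets of the enlarged base set N*.
    d3 = 2 ** (n * size_y + size_x + m * size_z) - 1
    return d1, d2, d3


# Illustrative sizes |Y| = 2, |X| = 2, |Z| = 1 (an assumption for this example).
first = copy_dimensions(2, 2, 1, 1, 1)    # (19, 12, 31)
later = copy_dimensions(2, 2, 1, 3, 3)    # (19, 12, 2047)
```

While $d_{1}$ and $d_{2}$ stay fixed, $d_{3}$ doubles with every element added to $N^{*}$, which is why the reductions discussed in Section 4 matter.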

From the proof it is clear that the $u$-part of $Q$ is constrained only by the basic Shannon inequalities encoded in the matrix B. Furthermore, constraints on $x$ imposed by the first condition are contained in the second one. Thus, it suffices to consider the bounding facets of

Q^{*} = \{x \in R^{d_{1}} : M \cdot {(x, y)}^{⊤} ⩾ 0 \text{ for some } y \in R^{d_{3}}\}

(8)

for new entropy inequalities. This is because, due to the duality theorem of linear programming [28], facets of $Q$ are convex linear combinations of facets of $Q^{*}$ and facets corresponding to the basic Shannon inequalities for the base set $Y X Z$.

Coordinates in $x$ are indexed by subsets of $Y X$ and $X Z$, so the inequalities provided by the bounding facets of $Q^{*}$ contain only elements of the restrictions $f ↾ Y X$ and $f ↾ X Z$. We emphasize that these restrictions are not arbitrary polymatroids on $Y X$ and $X Z$ with a common restriction on X, as they also have a common extension, namely f. Conditions ensuring the existence of such a common extension are assumed to hold, see [33], and they do not contribute towards the non-Shannon entropy inequalities we are searching for.

4. What to Compute? How to Compute?

As discussed in Section 3, the task of finding new non-Shannon entropy inequalities implied by the existence of an $n, m$-copy reduces to enumerating all facets of the polyhedral cone $Q^{*}$ defined in (8). However, without further reduction, this polyhedral computation is intractable even for small parameter values. Therefore, in this section we look at some general methods to reduce the complexity of the computation, and then discuss how the number of elements in the $Y X Z$ partition was chosen.

4.1. Tight and Modular Parts

Both the polyhedral region $Γ_{N}$ and the closure of the entropy region $\bar{Γ_{N}^{*}}$ decompose naturally into direct sums of modular and tight parts, see [23]. To discuss this result, let us first introduce some notation. For $i \in N$ define the function $r_{i}$ on the non-empty subsets A of N as

r_{i} : A \mapsto \begin{cases} 1 & \text{if } i \in A, \\ 0 & \text{otherwise} . \end{cases}

(9)

Non-negative multiples of $r_{i}$ are clearly entropic polymatroids; modular polymatroids are, by definition, the conic combinations of the vectors $\{r_{i} : i \in N\}$. For a polymatroid $(f, N)$, a singleton $i \in N$ and a real number $α ⩾ 0$, the function $f ↓_{α}^{i}$ is defined on the non-empty subsets of N as follows:

f ↓_{α}^{i} : A \mapsto \min \{f (A i) - α, f (A)\} .

(10)

When $α$ is set to $f (i | N - i)$, $f ↓_{α}^{i}$ is denoted simply by $f ↓^{i}$. Note that for $i \notin A$ we have $f (A i) - f (A) ⩾ f (N) - f (N - i) = f (i | N - i)$ by submodularity. Consequently, $f ↓^{i}$ can be written explicitly as

f ↓^{i} (A) = \begin{cases} f (A) - f (i | N - i) & \text{if } i \in A, \\ f (A) & \text{if } i \notin A . \end{cases}

(11)

Therefore, $f = f ↓^{i} + f (i | N - i) r_{i}$, where $r_{i}$ is the polymatroid defined in (9). The result of tightening f at i is the function $f ↓^{i}$. The tight part of f, denoted by $f ↓$, is the result of tightening f at every element of its base set $N = \{i_{1}, \dots, i_{n}\}$:

f ↓ = (\dots (f ↓^{i_{1}}) ↓^{i_{2}} \dots) ↓^{i_{n}} .

(12)

This result is independent of the order in which the reductions are applied, which is also shown by the decomposition formula

f = f ↓ + \sum_{i \in N} f (i | N - i) r_{i} .

(13)
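The tightening operation and the decomposition (13) can be checked on a small example. The sketch below implements (10) with $α = f (i | N - i)$ and applies it at every element; the function names and the test polymatroid (the tight function min(|A|, 2) on three elements plus a modular part) are illustrative assumptions.

```python
from itertools import combinations


def all_subsets(ground):
    """Non-empty subsets of the ground set as frozensets."""
    return [frozenset(s) for k in range(1, len(ground) + 1)
            for s in combinations(sorted(ground), k)]


def private_info(f, ground, i):
    """f(i | N - i) = f(N) - f(N - i), with f(empty set) = 0."""
    N = frozenset(ground)
    rest = N - {i}
    return f[N] - (f[rest] if rest else 0)


def tighten_at(f, ground, i):
    """Tighten f at i: formula (10) with alpha = f(i | N - i)."""
    alpha = private_info(f, ground, i)
    return {A: min(f[A | {i}] - alpha, f[A]) for A in all_subsets(ground)}


def tight_part(f, ground):
    """Apply the tightening at every element, as in (12)."""
    g = dict(f)
    for i in sorted(ground):
        g = tighten_at(g, ground, i)
    return g


# Hypothetical example: tight function h plus the modular part 3 r_a + 0.5 r_c.
ground = "abc"
h = {A: float(min(len(A), 2)) for A in all_subsets(ground)}
f = {A: h[A] + 3.0 * ("a" in A) + 0.5 * ("c" in A) for A in all_subsets(ground)}

tight = tight_part(f, ground)
alphas = {i: private_info(f, ground, i) for i in ground}
# Right-hand side of the decomposition formula (13).
rebuilt = {A: tight[A] + sum(alphas[i] for i in A) for A in all_subsets(ground)}
```

In this example the tight part recovers `h`, the coefficients are 3, 0 and 0.5 for a, b and c, and `rebuilt` coincides with `f`, as (13) predicts.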

The proof of the following lemma can be found in [20] or [34]. In this paper only the first part of the lemma is needed, which can be verified by direct computation.

Lemma 2.

Let $0 ⩽ α ⩽ f (i)$. If f is a polymatroid, then $f ↓_{α}^{i}$ is also a polymatroid. If, in addition, f is almost entropic, then so is $f ↓_{α}^{i}$. □

Accordingly, $f ↓$ (the tight part of f) is a polymatroid, and it is also almost entropic (aent) whenever f is aent. The difference $f - f ↓$ is the modular part, and it is a modular polymatroid. This decomposition of f into a tight and a modular part is unique, and both parts are aent if f is aent.

The cone formed by the modular polymatroids over N is $| N |$-dimensional, and is generated by the linearly independent vectors $\{r_{i} : i \in N\}$. The cone of tight polymatroids is orthogonal to this (modular) cone, and so to every vector $r_{i}$, and is bounded by the hyperplanes corresponding to the basic Shannon inequalities in (B2). The cone of tight, almost entropic polymatroids is similarly orthogonal to the modular cone. A consequence of this decomposition is that linear bounds on the entropic cone also decompose into bounds on the tight part and bounds on the modular part, the latter being trivial, that is, a Shannon inequality. The normal $n$ of a supporting hyperplane of the tight part is necessarily orthogonal to all vectors $r_{i}$, that is, the scalar products $n \cdot r_{i}$ are zero. Consequently, if the normal has the coordinates $n = 〈 t_{I} : I \subseteq N 〉$, then the sum $\sum \{t_{I} : i \in I\}$ is zero for every $i \in N$. For this reason, these hyperplanes are called balanced. The tight component of any entropy inequality is balanced, and it is also an entropy inequality. This fact is equivalent to saying that every entropy inequality can be strengthened to become a balanced one, see [35].
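The balance condition is a simple test on the coefficient vector of an inequality. As a minimal sketch (the function name and the string encoding of subsets are assumptions made for illustration), the following code checks it for the expanded Ingleton expression $[a b c d]$ and for the conditional entropy $(a | b)$.

```python
def is_balanced(coeffs, ground):
    """True if, for every element i, the coefficients t_I with i in I sum to zero.

    coeffs maps subsets, written as strings of single-letter elements,
    to their coefficients t_I.
    """
    return all(sum(t for I, t in coeffs.items() if i in I) == 0 for i in ground)


# [a b c d] = -(a, b) + (a, b | c) + (a, b | d) + (c, d), expanded term by term;
# the coefficients of f(c) and f(d) cancel.
ingleton_coeffs = {"ab": 1, "ac": 1, "ad": 1, "bc": 1, "bd": 1,
                   "a": -1, "b": -1, "abc": -1, "abd": -1, "cd": -1}

# Conditional entropy (a | b) = f(ab) - f(b), a Shannon expression that is not balanced.
cond_entropy_coeffs = {"ab": 1, "b": -1}

ingleton_ok = is_balanced(ingleton_coeffs, "abcd")
cond_entropy_ok = is_balanced(cond_entropy_coeffs, "ab")
```

The Ingleton expression passes the test, while the conditional entropy fails it at the element a, in line with the fact that (B1) bounds the modular part only.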

From the above it follows that the facets of the cone $Q^{*}$ belong to two disjoint groups. There are $| N |$ (trivial, Shannon) facets that bound the modular part of $Q^{*}$, and the rest bound the tight part. The normal vectors of the facets in the second group are balanced, and only they can provide non-Shannon inequalities. Therefore, it suffices to consider only the tight part of $Q^{*}$. This part is generated by a smaller collection of polymatroids, has fewer dimensions, and so can be handled more efficiently.

Claim 2.

The tight part of $Q^{*}$ is generated by the $n, m$-copies of the polymatroids f on $N = Y X Z$ that (i) are tight; (ii) satisfy $f (Y, Z | X) = 0$; and (iii) satisfy $f (y | Y X - y) = 0$ for all $y \in Y$, and $f (z | X Z - z) = 0$ for all $z \in Z$.

Observe that the tightness of f at the elements of Y and Z follows from condition (iii) and submodularity; thus, (i) is relevant only for elements of X.

Proof.

Let

f^{*}

be an

n, m

-copy of f. In the definition of

Q^{*}

only the values of

f ↾ Y X

and the values of

f ↾ X Z

are used. Therefore, f can be replaced with any other polymatroid that has the same restrictions. Such a polymatroid is

f^{*} ↾ Y_{1} X Z_{1}

by part (i) of Definition 1, which gives (ii). For (iii) let

y \in Y

, and

α = f (y | Y X - y)

. Apply Lemma 2 to

f^{*}

and all instances of y in the copies

Y_{i}

to get the new polymatroid

g^{*}

. Denoting the instance of y in

Y_{1}

by

y_{1}

, the lemma provides

g^{*} (y_{1} | Y_{1} X - y_{1}) = 0

. In addition,

g^{*}

is an

n, m

-copy of its restriction to

Y_{1} X Z_{1}

. Since this restriction and the polymatroid f differ only by a modular shift on subsets of

Y X

and

X Z

, their tight parts are the same. A similar reduction on elements of Z, and finally on elements of X, provides the statement. □

Using Claim 2, the number of columns in the constraint matrix M in (8) can be significantly reduced. This is because, by the tightness of f,

f^{*} (A i) = f^{*} (A)

holds for many subsets A of

N^{*}

with few elements, and this equality implies

f^{*} (B i) = f^{*} (B)

for every

A \subseteq B \subset N^{*}

.

4.2. Symmetry

The inherent symmetry in the

n, m

-copy allows for another significant complexity reduction. Let

π

be one of the

(n! m!)

permutations of the base set

N^{*}

that permutes the subsets

Y_{i}

and the subsets

Z_{j}

independently. This permutation naturally extends to the subsets of

N^{*}

, and then to the polymatroids on

N^{*}

. The

n, m

-copy

f^{*}

of f is symmetric if it is invariant under each such permutation

π

, that is,

f^{*} (A) = (π f^{*}) (A) = f^{*} (π A)

for all

A \subseteq N^{*}

.

Claim 3.

f has an

n, m

-copy if and only if it has a symmetric

n, m

-copy.

Proof.

If

f^{*}

is an

n, m

-copy of f, then clearly so is

π f^{*}

. Since conditions (ii) and (iii) in Definition 1 are linear, they are also satisfied by the average of all such permutations of

f^{*}

, that is, by the polymatroid

g^{*} = {(n! m!)}^{- 1} \sum_{π} π f^{*}

. Clearly,

g^{*}

is a symmetric

n, m

-copy of f. □

Symmetry alone reduces the number of auxiliary variables in the definition of

Q^{*}

from exponential in n and m to polynomial in these parameters.
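To get a feeling for the size of this reduction, one can count the orbits of subsets of the base set under the permutations of the copies, since a symmetric copy takes a single value on each orbit. The sketch below is an illustration under stated assumptions, not the paper's bookkeeping: the orbit count is only a proxy for the number of auxiliary variables, the function names are hypothetical, and the sizes |Y| = 2, |X| = 2, |Z| = 1 are example values.

```python
from itertools import product
from math import comb


def orbit_count_formula(y, x, z, n, m):
    """Number of subset orbits when the n copies of Y and the m copies of Z
    are permuted independently; y, x, z are the sizes of Y, X, Z."""
    return 2 ** x * comb(n + 2 ** y - 1, n) * comb(m + 2 ** z - 1, m)


def orbit_count_brute(y, x, z, n, m):
    """Same count by brute force. A subset is described by its trace on X and
    on every copy; an orbit forgets the order of the copies."""
    orbits = set()
    for tx in range(2 ** x):
        for ty in product(range(2 ** y), repeat=n):
            for tz in product(range(2 ** z), repeat=m):
                orbits.add((tx, tuple(sorted(ty)), tuple(sorted(tz))))
    return len(orbits)


def subset_count(y, x, z, n, m):
    """Total number of subsets of the base set of an n, m-copy."""
    return 2 ** (x + n * y + m * z)


for n in range(1, 10):
    print(n, subset_count(2, 2, 1, n, 1), orbit_count_formula(2, 2, 1, n, 1))
```

For fixed sizes of Y and Z the orbit count is a polynomial in n and m, while the number of subsets grows exponentially.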

4.3. No New Inequality

In some cases, the computations required by the Maximum Entropy Method, as defined in Section 3, can be simplified further, or even completely avoided. The first claim of this subsection states that certain polymatroids do not contribute to new entropy inequalities.

Claim 4.

Suppose f is a polymatroid on

N = Y X Z

, and f restricted to X is modular. Then f has an

n, m

-copy for every n and m.

Proof.

The statement follows from the following lemma by induction. □

Lemma 3.

Suppose that the polymatroids

(f_{1}, Y X)

and

(f_{2}, X Z)

have a common restriction on X which is modular. Then there is a polymatroid

(g, Y X Z)

that extends both

f_{1}

and

f_{2}

such that

g (Y, Z | X) = 0

.

Proof.

For

I \subseteq Y

,

J \subseteq X

and

K \subseteq Z

define

g (I J K) \overset{def}{=} \min_{L} {f_{1} (I L) + f_{2} (L K) - f_{1} (L) : J \subseteq L \subseteq X} .

(14)

Using the fact that

f_{1} ↾ X

and

f_{2} ↾ X

are isomorphic and modular, a simple calculation shows that g is a polymatroid and satisfies the requirements. For details, consult [20,33] or [29]. □
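Formula (14) can be evaluated directly on small examples. The sketch below is an illustration only: polymatroids are stored as dictionaries from subsets to values, the function names are hypothetical, and the toy data (two independent bits a and b, with c = a XOR b and z = a) is an assumption made here, not an example from the paper.

```python
from itertools import combinations


def subsets(elements):
    """All subsets of the given elements, as frozensets."""
    elems = sorted(elements)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]


def glue(f1, f2, Y, X, Z):
    """Formula (14): g(IJK) = min over J <= L <= X of f1(IL) + f2(LK) - f1(L)."""
    g = {}
    for I in subsets(Y):
        for J in subsets(X):
            for K in subsets(Z):
                g[I | J | K] = min(
                    f1[I | L] + f2[L | K] - f1[L]
                    for L in subsets(X) if J <= L
                )
    return g


# Toy data: rank functions of a, b independent bits, c = a XOR b, z = a.
f1 = {S: min(len(S), 2) for S in subsets("abc")}
f2 = {S: len({"a" if e == "z" else e for e in S}) for S in subsets("abz")}
g = glue(f1, f2, Y="c", X="ab", Z="z")


def val(s):
    return g[frozenset(s)]


# g(Y, Z | X) = g(YX) + g(XZ) - g(YXZ) - g(X)
print(val("abc") + val("abz") - val("abcz") - val("ab"))  # 0
```

In this example the restriction to X = ab is modular, so the lemma applies; the computed g agrees with f1 on subsets of abc and with f2 on subsets of abz.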

If either Y or Z has a single element, then one does not need to look beyond

n, 1

-copies.

Claim 5.

Suppose

| Z | = 1

. Entropy inequalities generated by

n, m

-copies of polymatroids on

Y X Z

are also generated by

n, 1

-copies.

Proof.

We claim that the cone generated by the tight part of

n, m

-copies is the same as the cone generated by the

n, 1

-copies. To prove this, let f be a polymatroid on

Y X Z

that satisfies the conditions of Claim 2, and let

f^{*}

be an

n, 1

-copy of f so that f is identified with

f^{*} ↾ Y_{1} X Z_{1}

where

Z_{1}

has a single element z. Let

g^{*}

be the polymatroid when

m - 1

identical copies of z are added to

f^{*}

. We claim that

g^{*}

is an

n, m

-copy. The only non-trivially satisfied condition is that the copies of z,

z_{1}

and

z_{2}

, are independent over X. Since

f (z | X) = 0

by (iii) of Claim 2, we have

g^{*} (X z_{1}) = g^{*} (X z_{2}) = g^{*} (X z_{1} z_{2}) = g^{*} (X)

, thus

g^{*} (z_{1}, z_{2} | X) = 0

. Since

g^{*} ↾ Y_{1} X Z_{1}

and

f^{*} ↾ Y_{1} X Z_{1}

are the same polymatroids, the

n, m

-cone is part of the

n, 1

-cone, as claimed. □

4.4. Problem Parameters

By Claim 4, the Maximum Entropy Method does not yield new inequalities when

f ↾ X

is modular. This is certainly the case when

| X | = 1

, so we must have

| X | ⩾ 2

. By Claim 5, if

| Y | = | Z | = 1

, then beyond the

1, 1

-copy, no additional inequalities are generated. The smallest parameter setting when new entropy inequalities are expected as the number of copies grows is

| Y | = 2

,

| X | = 2

, and

| Z | = 1

. We fix these sizes, as well as the labels of the members of each set as

X = {a, b}, Y = {c, d}, and Z = {z} .

(15)

Since

| Z | = 1

, according to Claim 5, it suffices to consider

n, 1

-copies only. To simplify the notation, the extra 1 will be dropped and we write n-copy instead. We also explicitly state the definition of the n-copy for this particular partition.

Definition 2.

Let f be a polymatroid on

N = {a b c d z}

, and let

n ⩾ 1

. The polymatroid

f^{*}

on the base set

N^{*} = a b z \cup {c_{i} d_{i} : 1 ⩽ i ⩽ n}

is an n-copy of f, if

(i): $f^{*} ↾ a b z$ is isomorphic to $f ↾ a b z$ , and, for each $i ⩽ n$ , with the $c_{i} \leftrightarrow c$ , $d_{i} \leftrightarrow d$ correspondences, $f^{*} ↾ a b c_{i} d_{i}$ is isomorphic to $f ↾ a b c d$ ;
(ii): ${c_{i} d_{i} : i ⩽ n}$ and z are completely conditionally independent over $a b$ .

This special case of the Maximum Entropy Method provides new non-Shannon entropy inequalities based on the fact that entropic polymatroids on the 5-element base set

a b c d z

have an n-copy for each

n ⩾ 1

. The steps we will follow are as below:

(1): Fix the number of copies n, called a generation. Determine the generating matrix M of the cone $Q^{*}$ as specified in Claim 1 using only polymatroids that satisfy the conditions of Claim 2.
(2): The new inequalities are provided by the non-Shannon facets of the tight part of $Q^{*}$ ; these facets can be computed using some polyhedral algorithm from the generating matrix M.

5. Computation

The cone

Q^{*}

whose non-Shannon bounding facets provide the new entropy inequalities sits in the

d_{1} = 19

-dimensional Euclidean space with coordinates indexed by the non-empty subsets of

Y X = {a b c d}

and

X Z = {a b z}

. Fix the number of copies to

n ⩾ 1

. This choice also fixes the dimension

d_{3}

of the vector

y

. The generating matrix M of the polyhedral cone

Q^{*}

from (8) is repeated here:

Q^{*} = \{x \in R^{19} : M \cdot {(x, y)}^{⊤} ⩾ 0 for some y \in R^{d_{3}}\} .

(16)

The modular part of

Q^{*}

is 5-dimensional, and so its tight part sits in a 14-dimensional subspace of

R^{19}

.
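The dimension counts can be reproduced by enumerating the coordinates. This is a small illustrative sketch, not part of the paper's software; the function name is hypothetical.

```python
from itertools import combinations


def nonempty_subsets(elements):
    """All non-empty subsets of the given elements, as a set of frozensets."""
    elems = sorted(elements)
    return {frozenset(c) for r in range(1, len(elems) + 1)
            for c in combinations(elems, r)}


# Coordinates are indexed by the non-empty subsets of abcd and of abz;
# the subsets a, b and ab are shared, so 15 + 7 - 3 coordinates remain.
coords = nonempty_subsets("abcd") | nonempty_subsets("abz")
d1 = len(coords)
modular_dim = len("abcdz")      # one modular direction per base element
tight_dim = d1 - modular_dim

print(d1, modular_dim, tight_dim)  # 19 5 14
```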

A structural property of the polymatroid region on the four-element set

a b c d

allows us to further reduce the complexity of the polyhedral computation required in step (2) above. This region has a central part and six permutationally equivalent “protrusions,” depending on the signs of the Ingleton expressions

f [a b c d], f [a c b d], f [a d b c], f [b c a d], f [b d a c], and f [c d a b] .

(17)

If all of them are non-negative, then the restriction

f ↾ a b c d

is a linear polymatroid; otherwise exactly one of these Ingleton expressions is negative, see e.g., [24]. Accordingly, the cone

Q^{*}

is cut into seven parts by these Ingleton hyperplanes: the central part where all Ingleton values are non-negative, and six other parts where exactly one of the expressions is negative. The facets of each part can be computed separately.

Parts of

Q^{*}

on the negative side of

[a c b d]

,

[a d b c]

,

[b c a d]

, and

[b d a c]

are isomorphic because swapping

a \leftrightarrow b

and/or

c \leftrightarrow d

are symmetries of

Q^{*}

. Therefore, it suffices to consider only one of them. The central part, where every Ingleton expression is non-negative, does not yield new inequalities. This follows from Lemma 4 below, as the elements of the central part are linear.

Lemma 4.

If f restricted to

a b c d

is linear, then f has an n-copy for all

n ⩾ 1

.

Proof.

Since every polymatroid on three elements is linear, and linearly representable polymatroids on three or four elements are representable over any field, we can assume, after scaling and using continuity, that both

f ↾ a b c d

and

f ↾ a b z

are

F

-linearly representable over the same finite field

F

. Denote the two representing vector spaces by

V^{1}

and

V^{2}

, and consider the subspace arrangements

(V_{a}^{1}, V_{b}^{1})

and

(V_{a}^{2}, V_{b}^{2})

in the two vector spaces. Now

V_{a}^{i}

and

V_{b}^{i}

have dimensions

f (a)

and

f (b)

, respectively, and their linear span has dimension

f (a b)

. Therefore, these arrangements are isomorphic, and

V^{1}

and

V^{2}

can be glued along the linear span of

(V_{a}^{1}, V_{b}^{1})

and

(V_{a}^{2}, V_{b}^{2})

. This gluing yields an

F

-linear polymatroid g that has the same restrictions on

a b c d

and on

a b z

as f does. Since this g is entropic, it has an n-copy for every

n ⩾ 1

. This n-copy is also an n-copy of f, as required. □

Consequently, up to the

a \leftrightarrow b

and

c \leftrightarrow d

symmetries, three mutually exclusive cases are left:

f [a b c d] < 0

,

f [a c b d] < 0

, and

f [c d a b] < 0

. Using the homogeneity of

Q^{*}

, the Ingleton value can be set to

- 1

, in effect taking a cross-section of

Q^{*}

that has one fewer dimension. Facets of the part of

Q^{*}

we are considering are also facets of these cross-sections; consequently, only facets of the cross-sections need to be computed. We consider these three cases separately in the subsections below.

The definition (16) of the cone

Q^{*}

uses the 19-dimensional coordinate system where the coordinates of the vector

x

are labeled by the non-empty subsets of

a b c d

and

a b z

. In all three cases we perform calculations in different coordinate systems that are chosen so that

The first coordinate is the Ingleton expression defining the cross-section;
The tight and modular parts of the cross-section have disjoint coordinates;
Apart from the Ingleton coordinate, other coordinates have non-negative values.

The first property allows the Ingleton value to be set explicitly. Based on the second property, the tight part of the cross-section can be separated by dropping some coordinates, and the third property potentially reduces the complexity of the polyhedral enumeration algorithm.

5.1. Case I

The cone

Q^{*}

is intersected with the hyperplane

[a b c d] = - 1

. In this case we use the coordinate system

\begin{matrix} C_{1} : & [a b c d], \\ C_{2} - C_{4} : & (a, b | c), & (a, c | b), & (b, c | a), \\ C_{5} - C_{7} : & (a, b | d), & (a, d | b), & (b, d | a), \\ C_{8} - C_{11} : & (c, d | a), & (c, d | b), & (c, d), & (a, b | c d), \\ C_{12} - C_{14} : & (a, b | z), & (a, z | b), & (b, z | a), \\ C_{15} - C_{19} : & (a | b c d), & (b | a c d), & (c | a b d), & (d | a b c), & (z | a b) . \end{matrix}

(18)

Coordinates

C_{15} - C_{19}

cover the modular part of

Q^{*}

. The tight part is spanned by the coordinate vectors

C_{1} - C_{14}

, and each of these vectors is orthogonal to the modular part. Let

{\tilde{P}}_{1}

be the inverse of the matrix of this coordinate transformation, and the vector

p_{1}

be the first row of

{\tilde{P}}_{1}

. Let

P_{1}

be the submatrix formed from rows 2 to 14 of

{\tilde{P}}_{1}

. Coordinates of the vector

x \in R^{19}

in this coordinate system are

{\tilde{P}}_{1} x^{⊤}

, and, in particular, the Ingleton value

f [a b c d]

is the scalar product

p_{1} \cdot x

. Consequently, the tight part of the intersection of

Q^{*}

and the hyperplane

[a b c d] = - 1

in this coordinate system is

Q_{1}^{*} = \{P_{1} x^{⊤} : p_{1} \cdot x = - 1, and M \cdot {(x, y)}^{⊤} ⩾ 0 for some y \in R^{d_{3}}\} .

(19)

Finding all facets of

Q_{1}^{*}

determined by the matrices M and

{\tilde{P}}_{1}

is closely related to linear multiobjective optimization [36], and benefits significantly from working in the 13-dimensional target space [37] instead of the much larger,

d_{3}

-dimensional problem space. We have developed a variant of Benson’s inner approximation algorithm [22,38] which takes advantage of the additional special property that

Q_{1}^{*}

is in the non-negative orthant of the target space. Version 1.3 of the program is available on GitHub at https://github.com/csirmaz/information-inequalities-5 (accessed on 5 January 2026).

Table 1 shows the sizes of the generating matrix M, the total number of facets and vertices (including extremal directions) of the cross-section

Q_{1}^{*}

, and the running time of the vertex enumeration algorithm on a single-core desktop computer with an Intel® Core™ i5-4590 CPU @ 3.30 GHz and 8 GB of memory. The running time was taken up almost exclusively by the underlying LP solver. While the number of facets grows quite moderately with n, the number of vertices more than doubles at each generation. The matrix M, despite the numerous improvements, is highly degenerate, and numerical instability, originating from both the LP solver and the applied polyhedral algorithm, prevented the completion of the computation for larger values of n. The results of the computation are presented in Section 6.

5.2. Case II

The cone

Q^{*}

is intersected with the hyperplane

[a c b d] = - 1

. The coordinate system is similar to the one used in Section 5.1. Base elements b and c are swapped in coordinates

C_{1} - C_{11}

, while the other coordinates remain unchanged. The tight part of the intersection, denoted by

Q_{2}^{*}

, is defined similarly with the same matrix M but a different coordinate transformation matrix

{\tilde{P}}_{2}

, vector

p_{2}

, and submatrix

P_{2}

as

Q_{2}^{*} = \{P_{2} x^{⊤} : p_{2} \cdot x = - 1, and M \cdot {(x, y)}^{⊤} ⩾ 0 for some y \in R^{d_{3}}\} .

(20)

The problem size, number of facets and vertices, and the running time in seconds are summarized in Table 2. Both the number of facets and the number of vertices grow moderately. A plausible conjecture is that, in general, the number of facets is

2 n + 14

, and the number of vertices is

2 n^{2} + 17

.

The running time is significantly shorter than in Section 5.1. This is because the polyhedral algorithm solves an LP instance for each vertex and each facet in the result, and those numbers are significantly smaller here. The generating matrix M is the same in both cases, so the problem size is the same. Numerical instability prevented completing the computation for

n = 10

even in this case.

5.3. Case III

No new inequality is generated when the cone

Q^{*}

is intersected with the hyperplane

[c d a b] = - 1

. This can be proved as follows. Since this intersection, denoted by

Q_{3}^{*}

, is an (unbounded) polyhedron, every polymatroid in

Q_{3}^{*}

is a conic combination of its vertices and extremal directions. These vertices and extremal directions can be represented by certain extremal polymatroids. Conic combinations of polymatroids that have an n-copy also have an n-copy. Consequently, it suffices to show that these extremal polymatroids have an n-copy for all

n ⩾ 1

.

Changing the first 11 coordinates of the coordinate system used in Section 5.1 to

\begin{matrix} C_{1} : & [c d a b], \\ C_{2} - C_{4} : & (c, d | a), & (a, c | d), & (a, d | c), \\ C_{5} - C_{7} : & (b, c | d), & (c, d | b), & (b, d | c), \\ C_{8} - C_{11} : & (a, b | c), & (a, b | d), & (a, b), & (c, d | a b), \end{matrix}

(21)

and keeping the rest, the vertex enumeration algorithm used in the previous cases generated the vertices and extremal directions of the 13-dimensional tight part of

Q_{3}^{*}

. The computation showed that it is a pointed cone with a single vertex that has coordinates

C_{2} - C_{14}

equal to zero (while

C_{1} = - 1

) and has 14 extremal directions, 12 of which are coordinate axes. Polymatroids representing the extremal directions are linear when restricted to the base set

a b c d

(they satisfy

f [c d a b] = 0

; therefore, the other Ingleton values are also non-negative). Consequently, these polymatroids have an n-copy for all

n ⩾ 1

. Finally, the remaining polymatroid at the single vertex has

f (a, b) = 0

(as the coordinate

C_{10}

is zero), which means that

f ↾ a b

is modular. By Claim 4 it also has an n-copy for all

n ⩾ 1

. This concludes the proof that no non-Shannon inequality is generated in this case.

6. Experimental Information Inequalities

For a fixed

n ⩾ 1

, the problem of extracting the set of non-Shannon inequalities that form the necessary and sufficient conditions for the existence of an n-copy of a polymatroid on the base set

a b c d z

was shown to be equivalent to determining all facets of a 13-dimensional polyhedral cone. The cone was cut into several pieces and the facets of each piece were computed for

n ⩽ 9

. In this section we take a quick look at the computational results. Below, the symbols

Z

,

C

,

D

denote the following entropy expressions:

\begin{matrix} Z & \overset{def}{=} (a, z | b) + (b, z | a), \\ C & \overset{def}{=} (a, c | b) + (b, c | a), \\ D & \overset{def}{=} (a, d | b) + (b, d | a) . \end{matrix}

6.1. Case I

In the

[a b c d] < 0

case, facets of the polyhedron

Q_{1}^{*}

from (19) include all 13 coordinate planes orthogonal to the coordinate axes

C_{2}

–

C_{14}

. These facets correspond to the non-negativity of the expression defining the coordinate.

Q_{1}^{*}

has two additional Shannon facets, corresponding to the Shannon inequalities

(a, z) ⩾ 0

and

(b, z) ⩾ 0

. The remaining facets determine the non-Shannon inequalities we are interested in. They come in three flavors:

\begin{matrix} (a, b | z) + α_{s} [a b c d] + α_{s} Z + β_{s} C + γ_{s} D & ⩾ 0, \end{matrix}

(22)

\begin{matrix} (a, b | c) + α_{s} [a b c d] + (α_{s} + β_{s}) C + γ_{s} D & ⩾ 0, \end{matrix}

(23)

\begin{matrix} (a, b | d) + α_{s} [a b c d] + β_{s} C + (α_{s} + γ_{s}) D & ⩾ 0, \end{matrix}

(24)

where

〈 α_{s}, β_{s}, γ_{s} 〉

are certain triplets of non-negative integers. For illustration, consider the

n = 3

case. As reported in Table 1, for

n = 3

the polyhedron

Q_{1}^{*}

has 34 facets. These facets determine 15 Shannon inequalities, 11 inequalities of the form (22), and four inequalities each of the forms (23) and (24). The

〈 α, β, γ 〉

triplets appearing in (22) are listed in three columns in Table 3. Inequalities in (23) and (24) use triplets from the first column only; these are the triplets that also appear in the

n = 2

generation.

In general, inequalities in (23) and (24) are consequences of (22) via replacing z with c and d, respectively. Since the copy

c_{n}

of c in an n-copy polymatroid

f^{*}

can be considered to be the variable z in the

(n - 1)

-copy when

f^{*}

is restricted to

N^{*} - {d_{n} z}

, inequalities valid for

(n - 1)

-copy instances must hold in an n-copy with z replaced by c, and, similarly, when z is replaced by d. This property is confirmed by the computational results. Additionally, all inequalities not containing the variable z turned out to be derived from the previous generation via the above substitutions. The main goal in Section 7 and Section 8 is to obtain a general description of the triplets

〈 α_{s}, β_{s}, γ_{s} 〉

occurring in (22).

6.2. Case II

Inequalities in the

[a c b d] < 0

case have a similar but significantly simpler structure. Facets of the n-copy cone

Q_{2}^{*}

from (20) include the coordinate planes, the two Shannon facets

(a, z) ⩾ 0

and

(b, z) ⩾ 0

as above, and additional facets generating the inequalities

\begin{matrix} (a, b | z) + k [a c b d] + k Z + \frac{k (k - 1)}{2} C & ⩾ 0, \\ (a, b | c) + k [a c b d] + \frac{(k + 1) k}{2} C & ⩾ 0 \end{matrix}

for

1 ⩽ k ⩽ n

for the first, and

1 ⩽ k ⩽ n - 1

for the second set of inequalities. As noted in the

[a b c d] < 0

case, inequalities in the second set are instances of inequalities in the first set from the previous generation, with z replaced by c. When z is replaced by d, the resulting inequality

(a, b | d) + k [a c b d] + k D + \frac{k (k - 1)}{2} C ⩾ 0

(25)

is Shannon as

[a c b d] + D ⩾ 0

holds in every polymatroid.

7. New Inequalities

In this section we define a set of

〈 α, β, γ 〉

triplets, and prove that each of them gives rise to a non-Shannon inequality that must hold in polymatroids having an n-copy. These inequalities cover those that were discovered experimentally for

n ⩽ 9

. We conjecture this set to be complete, that is, the applied Maximum Entropy Method yields no additional non-Shannon inequalities; in other words, if a polymatroid on 5 elements satisfies all these inequalities, then it has an n-copy for all n.

7.1. Case I

For notational convenience,

b (x, y)

, for binomial, denotes the function defined on

N \times N

that satisfies the following recursive definition for positive integers x and y:

\begin{matrix} b (0, 0) & = b (x, 0) = b (0, y) = 1, and \\ b (x, y) & = b (x - 1, y) + b (x, y - 1) . \end{matrix}

Clearly,

b (x, y) = b (y, x) = (\begin{matrix} x + y \\ x \end{matrix})

. The following summation formulas will be used later.

Lemma 5.

For

x, y \in N

the following summation formulas hold:

\begin{matrix} \sum_{i ⩽ x, j ⩽ y} b (i, j) & = b (x + 1, y + 1) - 1, \\ \sum_{i ⩽ x, j ⩽ y} i b (i, j) & = x b (x + 1, y + 1) - b (x, y + 2) + 1 . \end{matrix}

Proof.

Induction on x shows that

\sum_{i ⩽ x} b (i, y) = b (x, y + 1)

, and also that

\sum_{i ⩽ x} i b (i, y) = x b (x, y + 1) - b (x - 1, y + 2) .

(26)

Following this, induction on y gives the desired results. □
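The two summation formulas of Lemma 5 are easy to spot-check numerically. The sketch below is only a sanity check on small arguments, not a replacement for the induction; it implements b by its recursive definition, and the function name `lemma5_holds` is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def b(x, y):
    """b(x, 0) = b(0, y) = 1 and b(x, y) = b(x - 1, y) + b(x, y - 1)."""
    if x == 0 or y == 0:
        return 1
    return b(x - 1, y) + b(x, y - 1)


def lemma5_holds(x, y):
    """Check both summation formulas of Lemma 5 for the given x and y."""
    pairs = [(i, j) for i in range(x + 1) for j in range(y + 1)]
    first = sum(b(i, j) for i, j in pairs) == b(x + 1, y + 1) - 1
    second = (sum(i * b(i, j) for i, j in pairs)
              == x * b(x + 1, y + 1) - b(x, y + 2) + 1)
    return first and second


print(b(3, 2), comb(5, 3))  # 10 10
print(all(lemma5_holds(x, y) for x in range(8) for y in range(8)))  # True
```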

Definition 3.

The set

S \subset N \times N

of pairs of non-negative integers is downward closed if

(i, j) \in S

implies

(i^{'}, j^{'}) \in S

for every non-negative

i^{'} ⩽ i

and

j^{'} ⩽ j

. For

n ⩾ 1

the diagonal set

D_{n}

is

D_{n} \overset{def}{=} {(i, j) \in N \times N : i + j < n} .

(27)

Clearly,

D_{n}

is downward closed.

Definition 4.

For a finite, downward closed set

S \subset N \times N

, define the three-dimensional vector

v_{S}

as

v_{S} = 〈 α_{S}, β_{S}, γ_{S} 〉 = \sum_{(i, j) \in S} b (i, j) 〈 1, i, j 〉 .

(28)

When S is the empty set, define

v_{\emptyset} = 〈 0, 0, 0 〉

.

The diagonal

D_{1}

has a single point, the origin, and the corresponding vector is

v_{D_{1}} = 〈 1, 0, 0 〉

. In general, the diagonal

D_{n}

has

n (n + 1) / 2

points, and the vector associated with

D_{n}

is

v_{D_{n}} = 〈 2^{n} - 1, (n - 2) 2^{n - 1} + 1, (n - 2) 2^{n - 1} + 1 〉 .

(29)
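Definition 4 and formula (29) can be checked with a few lines of code. This is a minimal sketch with hypothetical function names; it uses the closed form b(i, j) = C(i + j, i) noted above and verifies (29) only for small n.

```python
from math import comb


def is_downward_closed(S):
    """True if (i, j) in S implies (i2, j2) in S for all i2 <= i, j2 <= j."""
    return all((i2, j2) in S
               for (i, j) in S
               for i2 in range(i + 1)
               for j2 in range(j + 1))


def v(S):
    """The vector v_S = (alpha_S, beta_S, gamma_S) of formula (28)."""
    alpha = sum(comb(i + j, i) for i, j in S)
    beta = sum(i * comb(i + j, i) for i, j in S)
    gamma = sum(j * comb(i + j, i) for i, j in S)
    return (alpha, beta, gamma)


def diagonal(n):
    """The diagonal set D_n of formula (27)."""
    return {(i, j) for i in range(n) for j in range(n) if i + j < n}


print(v(diagonal(1)))          # (1, 0, 0)
print(v({(0, 0), (1, 0)}))     # (2, 1, 0)
print(v(diagonal(3)))          # (7, 5, 5)
```

For example, the downward closed set {(0, 0), (1, 0)} gives the triplet 〈 2, 1, 0 〉, and the empty set gives 〈 0, 0, 0 〉 as required.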

The following theorem provides a family of non-Shannon inequalities that covers all inequalities that were found experimentally in Section 6.1.

Theorem 1.

Let f be a polymatroid on the base set

a b c d z

that has an n-copy over the

{c d} {a b} {z}

partition. Then, for every downward closed set

S \subseteq D_{n}

, f satisfies the inequality

(a, b | z) + α_{S} ([a b c d] + Z) + β_{S} C + γ_{S} D ⩾ 0 .

(30)

Proof.

Let

f^{*}

be an n-copy of f on the base set

N^{*} = {a b z} \cup {c_{i} d_{i} : i ⩽ n}

. Using Claims 2 and 3, we can assume that

(i): f is isomorphic to $f^{*} ↾ a b c_{1} d_{1} z$ , and
(ii): $f^{*}$ is symmetric for all $n!$ permutations of the pairs $c_{i} d_{i}$ .

By (i) it suffices to show that

f^{*}

satisfies all inequalities in (30). By (ii), permutationally equivalent subsets of

N^{*}

have the same

f^{*}

-value. Below,

c^{k}

, etc., stands for k elements chosen from

c_{1}, \dots, c_{n}

. Occasionally,

c_{1}

,

d_{1}

will also be denoted by c and d, and

c^{k + 1}

will be written as

c c^{k}

, letting

c = c_{1}

be one of the chosen elements.

A representative element for the subset

A \subseteq N^{*}

will be written as

B c^{k} d^{l} {(c d)}^{m},

(31)

with

k + l + m ⩽ n

, where B is a (possibly empty) subset of

a b z

; and from the

c_{i} d_{i}

pairs there are k that intersect A in

c_{i}

, there are ℓ pairs that intersect A in

d_{i}

, and there are m pairs that intersect A in

c_{i} d_{i}

. Only non-zero exponents will be presented.

The following inequality is denoted by

I (k, l)

:

\begin{matrix} [a b c d] & + Z + k C + l D ⩾ \\ - (a, b | c^{k} d^{l} z) + (a, b | c^{k + 1} d^{l} z) + (a, b | c^{k} d^{l + 1} z) . \end{matrix}

By Lemma 9 below, this inequality holds for

f^{*}

when

k + l < n

. Let

S \subseteq D_{n}

be a downward closed set, and consider the following combination of the inequalities

I (k, l)

:

\sum_{(k, l) \in S} b (k, l) I (k, l) .

(32)

On the left-hand side we have

α_{S}

many copies of

[a b c d]

and

Z

,

β_{S}

many copies of

C

, and

γ_{S}

copies of

D

. On the right-hand side the only remaining negative term is

(a, b | z)

, all others cancel out as

b (k - 1, l) + b (k, l - 1) = b (k, l)

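. The terms left on the right-hand side with positive coefficients are conditional mutual information values, hence non-negative, and can be dropped. Consequently, inequality (30) holds in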

f^{*}

, as claimed. □

The rest of this section is devoted to the proof of Lemma 9 stating that

I (k, l)

holds in

f^{*}

. We start with some simple inequalities about the copy polymatroid

f^{*}

. For ease of reading, we omit both the parentheses and the function symbol

f^{*}

.

Lemma 6.

For non-negative integers k, ℓ with

k + l < n

we have

\begin{matrix} a c^{k} d^{l + 1} z & ⩽ a d z + k (a c - a) + l (a d - a), \\ b c^{k + 1} d^{l} z & ⩽ b c z + k (b c - b) + l (b d - b) . \end{matrix}

Proof.

The claims are clearly true for

k = l = 0

. Otherwise, use induction on k and ℓ using

\begin{matrix} a c^{k + 1} d^{l + 1} z - a c^{k} d^{l + 1} z & = a c X - a X ⩽ a c - a, \\ a c^{k} d^{l + 2} z - a c^{k} d^{l + 1} z & = a d Y - a Y ⩽ a d - a, \end{matrix}

for some subsets X and Y of

N^{*}

. The second inequality can be proved similarly. □

Lemma 7.

(i): If $k + l ⩽ n$ then $a b c^{k} d^{l} z = a b z + k (a b c - a b) + l (a b d - a b)$ .
(ii): If $k + l < n$ then $a b (c d) c^{k} d^{l} z = a b c d + (a b z - a b) + k (a b c - a b) + l (a b d - a b)$ .

Proof.

Both statements follow from the fact that under the given conditions

c d

,

c^{k}

,

d^{l}

and z are conditionally independent over

a b

. □

Lemma 8.

(i): $(c d) c^{k} d^{l} z - (c d) ⩾ (a b z - a b) + k (a b c - a b) + l (a b d - a b)$ .
(ii): $b d c^{k} z - b d ⩾ (a b z - a b) + k (a b c - a b)$ .
(iii): $a (c d) c^{k} z - a (c d) ⩾ (a b z - a b) + k (a b c - a b)$ .

Proof.

For the first inequality,

(c d) c^{k} d^{l} z - (c d) ⩾ a b (c d) c^{k} d^{l} z - a b c d

by submodularity. From here, apply Lemma 7 to get the required inequality. The other two inequalities can be proved in a similar way. □

Lemma 9.

For non-negative integers k, ℓ with

k + l < n

the inequality

I (k, l)

holds in

f^{*}

.

Proof.

Recall that the inequality

I (k, l)

is

\begin{matrix} [a b c d] & + Z + k C + l D ⩾ \\ - (a, b | c^{k} d^{l} z) + (a, b | c^{k + 1} d^{l} z) + (a, b | c^{k} d^{l + 1} z) . \end{matrix}

Write the right-hand side as the sum

T_{1} + T_{2} + T_{3} + T_{4}

where the four terms are

\begin{matrix} T_{1} & = c^{k} d^{l} z - c^{k + 1} d^{l} z - c^{k} d^{l + 1} z, \end{matrix}

(33)

\begin{matrix} T_{2} & = a b c^{k} d^{l} z - a b c^{k + 1} d^{l} z - a b c^{k} d^{l + 1} z, \end{matrix}

(34)

\begin{matrix} T_{3} & = a c^{k + 1} d^{l} z - a c^{k} d^{l} z + a c^{k} d^{l + 1} z, \end{matrix}

(35)

\begin{matrix} T_{4} & = b c^{k + 1} d^{l} z - b c^{k} d^{l} z + b c^{k} d^{l + 1} z . \end{matrix}

(36)

We estimate each term separately. For (33) we have

T_{1} = - (c, d | c^{k} d^{l} z) - (c d) c^{k} d^{l} z .

(37)

Here the first term is

⩽ 0

, and the second term can be bounded using part (i) of Lemma 8. Therefore,

T_{1} ⩽ - c d - (a b z - a b) - k (a b c - a b) - l (a b d - a b) .

(38)

Using part (i) of Lemma 7, the exact value of (34) can be computed as

T_{2} = - a b z - (k + 1) (a b c - a b) - (l + 1) (a b d - a b) .

(39)

For (35) use

a c^{k + 1} d^{l} z - a c^{k} d^{l} z = a c X - a X ⩽ a c - a

and Lemma 6 to get

T_{3} ⩽ a d z + (k + 1) (a c - a) + l (a d - a) .

(40)

Finally, to estimate (36) use the similar inequality

b c^{k} d^{l + 1} z - b c^{k} d^{l} z ⩽ b d - b

and the second statement of Lemma 6 to get

T_{4} ⩽ b c z + k (b c - b) + (l + 1) (b d - b) .

(41)

The sum of the right-hand sides in the estimates (38)–(41) is

[a b c d] + Z + k C + l D - (c, z | b) - (d, z | a) .

(42)

This amount is at most the left-hand side of

I (k, l)

, proving the lemma. □
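The claim that the right-hand sides of (38)–(41) add up to (42) is a linear identity and can be verified mechanically. The sketch below is an illustration, not the authors' code: linear expressions are stored as dictionaries from subsets to coefficients, all helper names are hypothetical, and the Ingleton expression is assumed to have the usual form [a b c d] = - (a, b) + (a, b | c) + (a, b | d) + (c, d), which this section uses without restating.

```python
from collections import Counter


def H(s, coef=1):
    """Expression coef * f(s); the empty set contributes nothing."""
    return Counter({frozenset(s): coef}) if s else Counter()


def add(*terms):
    """Sum of linear expressions, with zero coefficients removed."""
    out = Counter()
    for t in terms:
        for key, val in t.items():
            out[key] += val
    return {k: v for k, v in out.items() if v != 0}


def scale(t, c):
    return {k: c * v for k, v in t.items()}


def cmi(x, y, w=""):
    """(x, y | w) = f(xw) + f(yw) - f(xyw) - f(w)."""
    return add(H(x + w), H(y + w), H(x + y + w, -1), H(w, -1))


def ingleton(a, b, c, d):
    """Assumed form: [a b c d] = -(a,b) + (a,b|c) + (a,b|d) + (c,d)."""
    return add(scale(cmi(a, b), -1), cmi(a, b, c), cmi(a, b, d), cmi(c, d))


Z = add(cmi("a", "z", "b"), cmi("b", "z", "a"))
C = add(cmi("a", "c", "b"), cmi("b", "c", "a"))
D = add(cmi("a", "d", "b"), cmi("b", "d", "a"))


def bounds_sum(k, l):
    """Sum of the right-hand sides of the estimates (38)-(41)."""
    A = add(H("abc"), H("ab", -1))
    B = add(H("abd"), H("ab", -1))
    t1 = add(H("cd", -1), H("abz", -1), H("ab"), scale(A, -k), scale(B, -l))
    t2 = add(H("abz", -1), scale(A, -(k + 1)), scale(B, -(l + 1)))
    t3 = add(H("adz"), scale(add(H("ac"), H("a", -1)), k + 1),
             scale(add(H("ad"), H("a", -1)), l))
    t4 = add(H("bcz"), scale(add(H("bc"), H("b", -1)), k),
             scale(add(H("bd"), H("b", -1)), l + 1))
    return add(t1, t2, t3, t4)


def expr42(k, l):
    """Expression (42)."""
    return add(ingleton("a", "b", "c", "d"), Z, scale(C, k), scale(D, l),
               scale(cmi("c", "z", "b"), -1), scale(cmi("d", "z", "a"), -1))


print(all(bounds_sum(k, l) == expr42(k, l)
          for k in range(4) for l in range(4)))  # True
```

Since both sides are linear in k and ℓ, agreement on a few small values already implies the identity for all k and ℓ.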

7.2. Case II

The following theorem claims that in the case of

[a c b d] < 0

inequalities experimentally found in Section 6.2 indeed hold for every n.

Theorem 2.

Let f be a polymatroid on

a b c d z

that has an n-copy for the partition

{c d} {a b} {z}

. Then f satisfies the following inequality for every

k ⩽ n

:

(a, b | z) + k [a c b d] + k Z + \frac{k (k - 1)}{2} C ⩾ 0 .

(43)

We give two proofs. The first one is similar to the proof of Theorem 1 and uses an inequality mimicking Lemma 9. The second proof is by induction and uses a technique that also recovers some of the inequalities covered in Theorem 1.

Proof 1 of Theorem 2.

The following inequality holds in

f^{*}

for every

0 ⩽ k < n

:

[a c b d] + Z + k C ⩾ - (a, b | c^{k} z) + (a, b | c^{k + 1} z) .

(44)

Summing up this inequality from zero to

k - 1

gives the claim of Theorem 2, thus it suffices to prove (44). The natural approach of using induction on k does not work. The reason for this is that the inequality

C ⩾ (a, b | c^{k} z) - 2 (a, b | c^{k + 1} z) + (a, b | c^{k + 2} z),

(45)

required by the induction does not hold in general. Instead, we give a more involved reasoning, resembling the proof of Lemma 9. In (44) write

c^{k + 1}

as

c c^{k}

, and let d be the element that forms a pair with this c. Adding

(b, d | c^{k} z) + (a, c | c^{k} d z)

to the right-hand side of (44) and rearranging, we obtain the upper bound

\begin{matrix} (c d c^{k} z - c c^{k} z) + (a c c^{k} z - a c^{k} z) - (a b c c^{k} z - a b c^{k} z) + \\ + b c c^{k} z + a d c^{k} z - b d c^{k} z - a c d c^{k} z . \end{matrix}

(46)

Each of the seven terms is estimated separately as follows.

\begin{matrix} c d c^{k} z - c c^{k} z & ⩽ c d - c, \end{matrix}

(47)

\begin{matrix} a c c^{k} z - a c^{k} z & ⩽ a c - a, \end{matrix}

(48)

\begin{matrix} a b c c^{k} z - a b c^{k} z & = a b c - a b, \end{matrix}

(49)

\begin{matrix} b c c^{k} z & ⩽ b c z + k (b c - b), \end{matrix}

(50)

\begin{matrix} a d c^{k} z & ⩽ a d z + k (a c - a), \end{matrix}

(51)

\begin{matrix} - b d c^{k} z & ⩽ - b d - k (a b c - a b) - (a b z - a b), \end{matrix}

(52)

\begin{matrix} - a c d c^{k} z & ⩽ - a c d - k (a b c - a b) - (a b z - a b) . \end{matrix}

(53)

Inequalities (47) and (48) follow from submodularity. Equation (49) expresses that c and

c^{k} z

are independent over

a b

. Inequalities (50) and (51) are instances of Lemma 6 with ℓ = 0, while (52) and (53) follow from parts (ii) and (iii) of Lemma 8. The sum of the right-hand sides of (47)–(53) is

[a c b d] + Z + k C - (c, z | b) - (d, z | a),

(54)

proving (44). □

To describe the technique used in the second proof of Theorem 2, let

E_{n}

denote the collection of all linear five-variable inequalities that are valid in every polymatroid on

a b c d z

that has an n-copy. Let

f^{*}

be such an n-copy. Having n instances of

c d

, one of the

c_{i} d_{i}

pairs can be singled out, and one of its elements can be renamed

z^{'}

. Restricting

f^{*}

to these elements is an

n - 1

-copy of

a b c d z^{'}

since

z^{'}

and the remaining

n - 1

pairs are independent over

a b

. Therefore,

a b c d z^{'}

satisfies the inequalities in

E_{n - 1}

. Let

E (a, b, c, d, z) \in E_{n - 1}

be such an inequality, marking the base elements explicitly. Then we have

E (a, b, c, d, c) \in E_{n}

, and also

E (a, b, c, d, d) \in E_{n}

. This fact has been observed and used in Section 6 to explain the coefficients in the obtained inequalities that do not contain the variable z.

Similarly to the above, the pairs

{(c_{i} z, d_{i} z) : i ⩽ n}

are isomorphic and are independent over the pair

(a z, b z)

. Therefore, they form an

n - 1

-copy of the polymatroid with base elements

{a z, b z, c z, d z, c z}

. This means that the inequality

E (a z, b z, c z, d z, c z)

is also in

E_{n}

, and so is the inequality with

d z

in the last position. For non-negative integers

α

and

β

denote the following inequality by

J (α, β)

:

J (α, β) \overset{def}{=} (a, b | z) + α [a c b d] + α Z + β C ⩾ 0 .

(55)

Lemma 10.

Suppose

J (α, β) \in E_{n - 1}

. Then

J (α + 1, α + β) \in E_{n}

.

Proof.

The following Shannon inequalities hold in every polymatroid:

\begin{matrix} (a, b | z) + [a c b d] + Z & ⩾ (a z, b z | c z) - 3 (c d, z | a b), \\ [a c b d] + Z & ⩾ [a z, c z, b z, d z] - 3 (c d, z | a b), \\ (a, c | b) & ⩾ (a z, c z | b z) - (c, z | a b), \\ (b, c | a) & ⩾ (b z, c z | a z) - (c, z | a b) . \end{matrix}

Since

c d

and z are conditionally independent over

a b

in

f^{*}

, the last terms are zero. Taking the first inequality once, the second one

α

times, and the last two

(α + β)

times, the sum of the left-hand sides is

J (α + 1, α + β)

, while the right-hand side is just

J (α, β)

for the

(a z, b z, c z, d z, c z)

base. Since this inequality is in

E_{n - 1}

by assumption, we have

J (α + 1, α + β) \in E_{n}

, as claimed. □

Proof 2 of Theorem 2.

The inequality to be proved is

J (k, \binom{k}{2})

. Use induction on k. For

k = 0

it is a Shannon inequality, thus it holds in every polymatroid. For other values of k, Lemma 10 says that

J (k, \binom{k}{2}) \in E_{n - 1}

implies

J (k + 1, \binom{k + 1}{2}) \in E_{n}

, concluding the induction step. □
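The coefficient bookkeeping in this induction can be checked mechanically. The following Python sketch iterates the map (α, β) ↦ (α + 1, α + β) of Lemma 10, starting from (0, 0), and records the pairs it reaches. The function name `lemma10_iterates` is illustrative and not taken from the paper; the sketch tracks coefficients only and does not verify the underlying Shannon inequalities.

```python
from math import comb

def lemma10_iterates(steps):
    """Pairs (alpha, beta) reached by iterating the Lemma 10 map
    (alpha, beta) -> (alpha + 1, alpha + beta), starting from (0, 0)."""
    alpha, beta = 0, 0
    pairs = [(alpha, beta)]
    for _ in range(steps):
        # Both right-hand sides use the old value of alpha.
        alpha, beta = alpha + 1, alpha + beta
        pairs.append((alpha, beta))
    return pairs

pairs = lemma10_iterates(10)
# The k-th pair should be (k, C(k, 2)), the parameters of Theorem 2.
matches_theorem2 = all(p == (k, comb(k, 2)) for k, p in enumerate(pairs))
```

The agreement rests on the identity C(k + 1, 2) = C(k, 2) + k, which is exactly what the second coordinate of the map encodes.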

Lemma 10 remains valid if in the definition of

J (α, β)

the Ingleton expression

[a c b d]

is replaced by

[a b c d]

. Consequently, some, but not all, of the inequalities covered in Theorem 1 can be obtained by similar inductive reasoning.

8. The Minimal Set of Inequalities

Experimental results reported in Section 5 and discussed in Section 6 provided the complete list of five-variable non-Shannon entropy inequalities implied by the existence of an n-copy for

n ⩽ 9

. Two families of non-Shannon inequalities, generalizing the ones found experimentally, were proven, in Theorem 1 and Theorem 2, respectively, to hold in every polymatroid with an n-copy. We conjecture that these families actually characterize those five-variable polymatroids that have an n-copy, so no further non-Shannon inequalities can be discovered by the version of the Maximum Entropy Method utilized in this paper.

In the

[a c b d] < 0

case the family of non-Shannon inequalities provided by Theorem 2 matches exactly the inequalities obtained experimentally for

n ⩽ 9

.

In the

[a b c d] < 0

case the family provided by Theorem 1 is parametrized by the downward closed subsets S of the diagonal set

D_{n} \subset N \times N

. Not all of the generated inequalities correspond to facets of the cone

Q_{1}^{*}

. While they are valid non-Shannon inequalities, some of them are consequences of others. Table 4 shows the downward closed subsets of

D_{3}

as well as the corresponding

v_{S} = 〈 α_{S}, β_{S}, γ_{S} 〉

triplets from Definition 4. Two triplets, marked by ∗, are not in Table 3. The corresponding inequality

(a, b | z) + α ([a b c d] + Z) + β C + γ D ⩾ 0

(56)

with

α = 5

and

β = γ = 3

is the average of the inequalities obtained from the triplets numbered 6, 10 and 13; thus, it is a consequence of them. The main goal of this Section is to obtain a description of those downward closed subsets of

D_{n}

that generate facets of

Q_{1}^{*}

, that is, inequalities that are not consequences of the others.
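The averaging claim for the triplet 〈5, 3, 3〉 can be reproduced by brute force. The Python sketch below rests on two assumptions that are not restated in this section: that D_n = {(i, j) : i, j ⩾ 0, i + j < n}, and that v_S is the sum of b(i, j)〈1, i, j〉 over (i, j) ∈ S with b(i, j) the binomial coefficient, as used later in the proof of Lemma 11. All function names are illustrative. The search lists triples of triplets whose average is 〈5, 3, 3〉; it may find more than one, and no attempt is made to match them to the numbering of Table 4.

```python
from itertools import combinations
from math import comb

def diagonal_set(n):
    # Assumption: D_n = {(i, j) : i, j >= 0 and i + j < n}.
    return [(i, j) for i in range(n) for j in range(n - i)]

def is_downward_closed(points):
    s = set(points)
    return all((i == 0 or (i - 1, j) in s) and (j == 0 or (i, j - 1) in s)
               for i, j in s)

def triplet(points):
    # Assumption: v_S = sum of b(i, j) * <1, i, j>, with b(i, j) = C(i + j, i).
    alpha = sum(comb(i + j, i) for i, j in points)
    beta = sum(i * comb(i + j, i) for i, j in points)
    gamma = sum(j * comb(i + j, i) for i, j in points)
    return (alpha, beta, gamma)

def downward_closed_subsets(n):
    """All non-empty downward closed subsets of D_n, by exhaustive search."""
    base = diagonal_set(n)
    return [frozenset(sub)
            for size in range(1, len(base) + 1)
            for sub in combinations(base, size)
            if is_downward_closed(sub)]

subsets = downward_closed_subsets(3)
triplets = {triplet(s) for s in subsets}

target = (5, 3, 3)
# Triples of distinct triplets whose coordinate-wise sum is 3 * target.
witnesses = [trio for trio in combinations(sorted(triplets), 3)
             if tuple(sum(c) for c in zip(*trio)) == tuple(3 * t for t in target)]
```

Exhaustive enumeration is only practical for very small n; it is meant as a sanity check, not as a replacement for the structural results that follow.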

Since the inequality (56) contains the fixed term

(a, b | z)

, and trivially holds true when

[a b c d] + Z ⩾ 0

, it is a consequence of the inequalities obtained from the triplets

{〈 α_{i}, β_{i}, γ_{i} 〉 : i \in I}

if there is a convex combination

〈 α^{'}, β^{'}, γ^{'} 〉 = \sum_{i \in I} λ_{i} 〈 α_{i}, β_{i}, γ_{i} 〉, with λ_{i} ⩾ 0, and \sum_{i \in I} λ_{i} = 1,

(57)

such that

α ⩽ α^{'}

,

β ⩾ β^{'}

, and

γ ⩾ γ^{'}

. In this case we say that

〈 α, β, γ 〉

is superseded by the set

{〈 α_{i}, β_{i}, γ_{i} 〉 : i \in I}

. If

v_{S} = 〈 α_{S}, β_{S}, γ_{S} 〉

is not superseded by other elements of this family, then

v_{S}

is called extremal. Actually, by the above observation, extremal vectors are the vertices of the convex hull of the set of triplets

v_{S}

as S runs over the downward closed subsets of

D_{n}

. By Carathéodory’s theorem, see [28],

v_{S}

is superseded if and only if it is (also) superseded by a set with at most three elements.

Lemma 11 below gives a necessary and sufficient condition for the vector

v_{S}

to be superseded by a special three-element set. For a subset S of

N \times N

we write

S + (i, j)

for adding the point

(i, j)

to S, and

S - (i, j)

to remove

(i, j)

from S. In the first case it is tacitly assumed that

(i, j)

is not in S, and in the second case that

(i, j) \in S

.

Lemma 11.

Let

i_{1} < i_{2} < i_{3}

, and

j_{1} > j_{2} > j_{3}

.

(i): $v_{S}$ is superseded by the vectors ${v_{S + (i_{1}, j_{1})}, v_{S - (i_{2}, j_{2})}, v_{S + (i_{3}, j_{3})}}$ if and only if

$\frac{j_{2} - j_{3}}{i_{3} - i_{2}} ⩾ \frac{j_{1} - j_{3}}{i_{3} - i_{1}} .$

(58)
(ii): $v_{S}$ is superseded by ${v_{S - (i_{1}, j_{1})}, v_{S + (i_{2}, j_{2})}, v_{S - (i_{3}, j_{3})}}$ if and only if

$\frac{j_{2} - j_{3}}{i_{3} - i_{2}} ⩽ \frac{j_{1} - j_{3}}{i_{3} - i_{1}} .$

(59)

Proof.

We prove (i) only; the proof of (ii) is similar. Let

b_{1} = b (i_{1}, j_{1})

,

b_{2} = b (i_{2}, j_{2})

and

b_{3} = b (i_{3}, j_{3})

. Then, according to Definition 4,

\begin{matrix} v_{S + (i_{1}, j_{1})} & = 〈 α_{S} + b_{1}, β_{S} + i_{1} b_{1}, γ_{S} + j_{1} b_{1} 〉, \\ v_{S - (i_{2}, j_{2})} & = 〈 α_{S} - b_{2}, β_{S} - i_{2} b_{2}, γ_{S} - j_{2} b_{2} 〉, \\ v_{S + (i_{3}, j_{3})} & = 〈 α_{S} + b_{3}, β_{S} + i_{3} b_{3}, γ_{S} + j_{3} b_{3} 〉 . \end{matrix}

v_{S} = 〈 α_{S}, β_{S}, γ_{S} 〉

is superseded by these vectors if there are non-negative numbers

λ_{1}

,

λ_{2}

,

λ_{3}

with

λ_{1} + λ_{2} + λ_{3} = 1

such that

\begin{matrix} α_{S} & ⩽ λ_{1} (α_{S} + b_{1}) + λ_{2} (α_{S} - b_{2}) + λ_{3} (α_{S} + b_{3}), \\ β_{S} & ⩾ λ_{1} (β_{S} + i_{1} b_{1}) + λ_{2} (β_{S} - i_{2} b_{2}) + λ_{3} (β_{S} + i_{3} b_{3}), \\ γ_{S} & ⩾ λ_{1} (γ_{S} + j_{1} b_{1}) + λ_{2} (γ_{S} - j_{2} b_{2}) + λ_{3} (γ_{S} + j_{3} b_{3}) . \end{matrix}

Since the sum of the

λ_{i}

’s is 1, this system is equivalent to

\begin{matrix} λ_{2} b_{2} & ⩽ λ_{1} b_{1} + λ_{3} b_{3}, \\ i_{2} λ_{2} b_{2} & ⩾ i_{1} λ_{1} b_{1} + i_{3} λ_{3} b_{3}, \\ j_{2} λ_{2} b_{2} & ⩾ j_{1} λ_{1} b_{1} + j_{3} λ_{3} b_{3} . \end{matrix}

Clearly,

λ_{2}

must be strictly positive as

b_{1}

,

b_{2}

, and

b_{3}

are positive. Introducing

μ_{1} = (λ_{1} b_{1}) / (λ_{2} b_{2})

and

μ_{3} = (λ_{3} b_{3}) / (λ_{2} b_{2})

, this system is equivalent to

\begin{matrix} 1 & ⩽ μ_{1} + μ_{3}, \\ i_{2} & ⩾ i_{1} μ_{1} + i_{3} μ_{3}, \\ j_{2} & ⩾ j_{1} μ_{1} + j_{3} μ_{3} . \end{matrix}

One can assume that the first inequality holds with equality. Since

i_{1} < i_{2} < i_{3}

, the second inequality holds when

i_{2}

is above the point which splits the interval

[i_{1}, i_{3}]

in ratio

μ_{3}

to

μ_{1}

. Similarly,

- j_{1} < - j_{2} < - j_{3}

implies that the third inequality holds when

- j_{2}

is below the point that splits

[- j_{1}, - j_{3}]

in the same ratio. Thus, non-negative numbers

μ_{1}

and

μ_{3}

satisfying these three inequalities exist if and only if the proportion of

[i_{2}, i_{3}]

in

[i_{1}, i_{3}]

is not larger than the proportion of

[- j_{2}, - j_{3}]

in the interval

[- j_{1}, - j_{3}]

, that is,

\frac{(- j_{3}) - (- j_{2})}{(- j_{3}) - (- j_{1})} ⩾ \frac{i_{3} - i_{2}}{i_{3} - i_{1}} .

(60)

This condition is equivalent to the one given in the claim. □
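The argument above is constructive, which the following Python sketch tries to make explicit for part (i). It assumes b(i, j) is the binomial coefficient C(i + j, i); the function names are illustrative. With μ_1 + μ_3 = 1, the second inequality of the reduced system gives the lower bound (i_3 - i_2)/(i_3 - i_1) on μ_1 and the third gives the upper bound (j_2 - j_3)/(j_1 - j_3); the sketch picks the lower bound and rebuilds the weights λ_1, λ_2, λ_3 from it.

```python
from fractions import Fraction
from itertools import product
from math import comb

def b(i, j):
    # Assumption: b(i, j) is the binomial coefficient C(i + j, i).
    return comb(i + j, i)

def lemma11_i_condition(p1, p2, p3):
    """Condition (58), cross-multiplied so that only integers are compared."""
    (i1, j1), (i2, j2), (i3, j3) = p1, p2, p3
    return (j2 - j3) * (i3 - i1) >= (j1 - j3) * (i3 - i2)

def pnp_witness(p1, p2, p3):
    """Weights (lambda_1, lambda_2, lambda_3) built as in the proof, or None."""
    (i1, j1), (i2, j2), (i3, j3) = p1, p2, p3
    lower = Fraction(i3 - i2, i3 - i1)   # from the second inequality
    upper = Fraction(j2 - j3, j1 - j3)   # from the third inequality
    if lower > upper:
        return None
    mu1, mu3 = lower, 1 - lower
    b1, b2, b3 = b(i1, j1), b(i2, j2), b(i3, j3)
    t = 1 / (mu1 / b1 + Fraction(1, b2) + mu3 / b3)  # t = lambda_2 * b_2
    return (mu1 * t / b1, t / b2, mu3 * t / b3)

def satisfies_system(p1, p2, p3, lambdas):
    """Check the three inequalities obtained after cancelling v_S."""
    (i1, j1), (i2, j2), (i3, j3) = p1, p2, p3
    x1, x2, x3 = (lam * b(*p) for lam, p in zip(lambdas, (p1, p2, p3)))
    return (x2 <= x1 + x3
            and i2 * x2 >= i1 * x1 + i3 * x3
            and j2 * x2 >= j1 * x1 + j3 * x3)

def check_grid(size):
    """Compare the witness construction with (58) on a small grid of points."""
    pts = list(product(range(size), repeat=2))
    for p1, p2, p3 in product(pts, repeat=3):
        if not (p1[0] < p2[0] < p3[0] and p1[1] > p2[1] > p3[1]):
            continue
        w = pnp_witness(p1, p2, p3)
        if (w is not None) != lemma11_i_condition(p1, p2, p3):
            return False
        if w is not None and not (sum(w) == 1 and satisfies_system(p1, p2, p3, w)):
            return False
    return True

example = pnp_witness((0, 2), (1, 1), (2, 0))
```

For the points (0, 2), (1, 1), (2, 0) both sides of (58) equal 1, and the construction returns equal weights 1/3. Exact rational arithmetic is used so that boundary cases of (58) are not blurred by rounding.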

Corollary 1.

Let

i_{1} < i_{2} < i_{3}

, and

j_{1} > j_{2} > j_{3}

. Assume both

(i_{k}, j_{k})

and

(i_{k} + 1, j_{k} - 1)

are in S, while

(i_{k}, j_{k} + 1) \notin S

and

(i_{k} + 1, j_{k}) \notin S

for

k = 1, 2, 3

. When

j_{3} = 0

the condition with negative values is assumed to hold. If

(j_{1} - j_{3}) / (i_{3} - i_{1})

is not in the open interval

(\frac{j_{2} - j_{3}}{(i_{3} - i_{2}) + 1}, \frac{j_{2} - j_{3}}{(i_{3} - i_{2}) - 1}),

(61)

then

v_{S}

is superseded by vectors generated by one of the following two triplets:

\begin{matrix} {S + (i_{1} + 1, j_{1}), & S - (i_{2}, j_{2}), & S + (i_{3} + 1, j_{3})}, \\ {S - (i_{1}, j_{1}), & S + (i_{2} + 1, j_{2}), & S - (i_{3}, j_{3})} . \end{matrix}

(62)

Proof.

If the slope

(j_{1} - j_{3}) / (i_{3} - i_{1})

is less than or equal to the lower limit, then part (i) of Lemma 11 applies to the first triplet. When the slope is at or above the upper limit, part (ii) of that lemma applies to the second triplet. □

A downward closed set

S \subset N \times N

can be specified in two ways: either by a non-increasing sequence

S^{col} = (c_{0}, c_{1}, \dots, c_{k})

specifying the maximal values in columns 0, …, k, or by a non-increasing sequence

S^{row} = (r_{0}, r_{1}, \dots, r_{ℓ})

specifying the maximal values in rows 0, …, ℓ. It is easy to see that

(x, y) \in S \Leftrightarrow 0 ⩽ y ⩽ c_{x} \Leftrightarrow 0 ⩽ x ⩽ r_{y} .

(63)
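The two descriptions in (63) are transposes of each other, which a few lines of Python can illustrate. This is a minimal sketch with illustrative function names; it assumes the input sequence is non-increasing with non-negative integer entries.

```python
def rows_from_columns(cols):
    """Given S^col = (c_0, ..., c_k), return S^row = (r_0, ..., r_l),
    where r_y is the largest x with c_x >= y."""
    height = cols[0]  # the sequence is non-increasing, so column 0 is tallest
    return tuple(max(x for x, c in enumerate(cols) if c >= y)
                 for y in range(height + 1))

def points_from_columns(cols):
    """The lattice points (x, y) with 0 <= y <= c_x."""
    return {(x, y) for x, c in enumerate(cols) for y in range(c + 1)}

def points_from_rows(rows):
    """The lattice points (x, y) with 0 <= x <= r_y."""
    return {(x, y) for y, r in enumerate(rows) for x in range(r + 1)}

cols = (2, 1, 1, 0)
rows = rows_from_columns(cols)
```

Applying `rows_from_columns` to the row sequence returns the column sequence again, reflecting the symmetry between the two representations.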

Corollary 2.

If the vector

v_{S}

for some

S \subseteq D_{n}

is not superseded by other vectors generated by subsets of

D_{n}

, then either the sequence

S^{col}

is strictly decreasing, or the sequence

S^{row}

is strictly decreasing.

Proof.

If

S^{col}

is not strictly decreasing, then the upper boundary of S contains a horizontal segment of length at least 2. Similarly, if

S^{row}

is not strictly decreasing, then the right boundary of S contains a vertical segment of length at least 2; see Figure 1. If neither sequence is strictly decreasing, take such a horizontal and a vertical segment whose distance is minimal. Let the horizontal segment be in row r between columns

c_{1}

and

c_{2}

, and the vertical segment be in column c between rows

r_{1}

and

r_{2}

. The horizontal and vertical segments are connected by a (possibly empty) diagonal staircase. Depending on which segment comes first, there are two possible arrangements, as depicted in Figure 1.

In the first case

c_{1} < c_{2} < c

, and

r_{1} < r_{2} < r

; in the second case

c < c_{1} < c_{2}

and

r < r_{1} < r_{2}

. Apply Lemma 11 to the marked points and observe that the modified downward closed sets are always subsets of

D_{n}

. In the first case

\frac{c_{2} - c_{1}}{(r + 1) - r} ⩾ 1 ⩾ \frac{c + 1 - c_{2}}{r_{1} - r},

(64)

and in the second case

\frac{(c + 1) - c}{r_{1} - r_{2}} ⩽ 1 ⩽ \frac{c_{2} - (c + 1)}{r - r_{1}} .

(65)

Therefore, by Lemma 11,

v_{S}

is superseded by the vectors generated by the indicated sets, proving the claim. □

By Corollary 2, the downward closed set corresponding to an extremal vertex is either a staircase with steps of height 1 (when

S^{row}

is strictly decreasing), which we call horizontal, or the mirror image of such a staircase. The only configuration that belongs to both cases is the diagonal

D_{n}

. It will be more convenient to use the column-sequence

(c_{0}, c_{1}, \dots, c_{k})

to represent horizontal staircases. Here

k ⩾ 0

is the length of the staircase, also denoted by

len (S)

. The last column size (height) is necessarily

c_{k} = 0

, and

c_{i}

equals either

c_{i + 1}

or

c_{i + 1} + 1

for every

0 ⩽ i < k

. In the rest of this section, all staircases, if not mentioned otherwise, are horizontal ones.

Definition 5.

The staircase S is Positive-Negative-Positive (PNP)-reducible in

D_{n}

if there are

i_{1} < i_{2} < i_{3}

and

j_{1} > j_{2} > j_{3}

such that

S_{1} = S + (i_{1}, j_{1})

,

S_{2} = S - (i_{2}, j_{2})

, and

S_{3} = S + (i_{3}, j_{3})

are staircases in

D_{n}

and

v_{S}

is superseded by

{v_{S_{1}}, v_{S_{2}}, v_{S_{3}}}

. S is PNP-irreducible if it is not PNP-reducible.

Negative-Positive-Negative (NPN)-reducibility and NPN-irreducibility are defined analogously, using staircases

S - (i_{1}, j_{1})

,

S + (i_{2}, j_{2})

, and

S - (i_{3}, j_{3})

, assuming that they are also subsets of

D_{n}

. S is irreducible in

D_{n}

if it is both PNP- and NPN-irreducible. Finally, let

S_{n}

be the collection of the irreducible staircases that are subsets of

D_{n}

.

By the remark at the beginning of this section, by Lemma 11, and by Corollary 2, extremal vertices are generated by elements of

S_{n}

and by their mirror images. We describe an incremental algorithm that generates the elements of the collection

S_{n}

.

A horizontal staircase S of length n can be recovered from a unique horizontal staircase

S^{'}

of length

n - 1

as follows. If

S^{'}

has the column sequence

(c_{0}^{'}, c_{1}^{'}, \dots, c_{n - 1}^{'})

, then S is defined by one of the column sequences

\begin{matrix} (c_{0}^{'}, & c_{1}^{'}, & \dots, & c_{n - 1}^{'}, & 0) or \\ (c_{0}^{'} + 1, & c_{1}^{'} + 1, & \dots, & c_{n - 1}^{'} + 1, & 0), \end{matrix}

(66)

depending on whether the last two elements of the column sequence of S are equal.
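The parent and child relation in (66) can be sketched in Python as follows. The names `extend`, `parent`, and `staircases` are illustrative and not from the paper, and the sketch enumerates all horizontal staircases without any irreducibility test. Staircases are represented by their column sequences, with (0,) as the unique staircase of length 0.

```python
def extend(parent_cols):
    """The two column sequences of (66) built from a staircase of length n - 1."""
    flat = tuple(parent_cols) + (0,)                    # append a column of height 0
    raised = tuple(c + 1 for c in parent_cols) + (0,)   # lift every column, then append 0
    return [flat, raised]

def parent(cols):
    """Recover S' from the column sequence of S (length at least 1)."""
    if cols[-2] == cols[-1]:
        return tuple(cols[:-1])
    return tuple(c - 1 for c in cols[:-1])

def is_horizontal_staircase(cols):
    """Last column has height 0 and consecutive heights drop by 0 or 1."""
    return cols[-1] == 0 and all(cols[i] - cols[i + 1] in (0, 1)
                                 for i in range(len(cols) - 1))

def staircases(n):
    """All horizontal staircases of length n, generated level by level."""
    level = [(0,)]
    for _ in range(n):
        level = [child for s in level for child in extend(s)]
    return level

level4 = staircases(4)
```

Each staircase has exactly two children in this scheme, so the unpruned enumeration produces 2^n sequences of length n; an incremental algorithm that discards reducible staircases early visits far fewer.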

Claim 6.

(i): Suppose S has length n. S is irreducible in $D_{n + 1}$ if and only if it is irreducible in $D_{m}$ for any $m ⩾ n + 1$ .
(ii): If $len (S) = n$ and S is irreducible in $D_{n}$ , then $S^{'}$ is irreducible in $D_{n - 1}$ .
(iii): If $S \in S_{n}$ but $S \notin S_{n + 1}$ , then $len (S) = n$ and S is PNP-reducible in $D_{n + 1}$ with $i_{3} = n + 1$ and $j_{3} = 0$ .
(iv): If $S^{'} \in S_{n - 1}$ and $S \notin S_{n}$ , then either S is NPN-reducible with $i_{3} = n$ and $j_{3} = 0$ , or it is PNP-reducible with $i_{3} = n - 1$ and $j_{3} = 1$ .

Proof.

(i) is immediate from the definition as the staircases

S \pm (i, j)

must be subsets of

D_{n + 1}

.

(ii) Assume

S^{'}

is reducible in

D_{n - 1}

shown by the staircases

S_{1}^{'}

,

S_{2}^{'}

and

S_{3}^{'}

. Since they are in

D_{n - 1}

, they can be lifted back to

S_{1}

,

S_{2}

,

S_{3}

in

D_{n}

. According to Lemma 11 these staircases witness the reducibility of S.

(iii) If S is reducible in

D_{n + 1}

but not in

D_{n}

, then

S + (i_{3}, j_{3})

is not in

D_{n}

, leading to the stated condition.

(iv) If S is reducible in

D_{n}

while

S^{'}

is not reducible in

D_{n - 1}

, then the reduction must use

(i_{3}, j_{3})

, which is in

D_{n}

but not in

D_{n - 1}

. If it is an NPN-reduction then it must use the newly added point

(n, 0)

; in other cases the reduction can be shifted back to

S^{'}

. In the case of a PNP-reduction this additional point is

(n - 1, 1)

(when extending the staircase by a column of height zero), or can be shifted back to

S^{'}

again. □

Based on Claim 6, the incremental algorithm, sketched as Algorithm 1, generates all horizontal irreducible staircases. The PNP- and NPN-irreducibility can be checked based on Lemma 11. The last point

(i_{3}, j_{3})

is fixed, and the naïve implementation requires quadratic running time in

len (S)

. With some simple bookkeeping it can be reduced to a backward scanning of the column sequence, resulting in linear running time.

Using the algorithm we have computed the complete set of irreducible staircases up to

n = 60

. The number of new staircases that remained irreducible in each subsequent generation matches the sequence A103116 in the On-Line Encyclopedia of Integer Sequences (OEIS) [39]:

{remains}_{n} = \sum_{i ⩽ n} (n - i + 1) φ (i),

(67)

where

φ

is Euler’s totient function, which suggests that the connection is based on the number of different slopes determined by the lattice points in a rectangle. Proving the equivalence of these two sequences is an intriguing open problem.
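For reference, the right-hand side of (67) is easy to tabulate. The Python sketch below assumes the summation index i starts at 1 and uses a naive gcd-based totient, which is adequate for small n; the function names are illustrative. It evaluates the formula only and does not generate staircases.

```python
from math import gcd

def totient(i):
    """Euler's totient: how many k in 1..i are coprime to i."""
    return sum(1 for k in range(1, i + 1) if gcd(k, i) == 1)

def remains(n):
    """Right-hand side of (67), assuming the sum runs over i = 1, ..., n."""
    return sum((n - i + 1) * totient(i) for i in range(1, n + 1))

first_values = [remains(n) for n in range(1, 9)]
```

Equivalently, remains(n) is the sum of the partial sums φ(1) + … + φ(m) for m = 1, …, n, which may be a more convenient form when comparing against counts produced generation by generation.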

For better visualization, triplets

〈 α_{S}, β_{S}, γ_{S} 〉

corresponding to these irreducible staircases are plotted as the three-dimensional points

〈 β / α, γ / α, α 〉

using logarithmic scale for the third

α

coordinate. The plot in Figure 2 contains all 126,981 extremal triplets in the range

β, γ ⩽ 20 α

. Some of the plotted triplets appear as late as generation

n = 80

; later generations do not contribute to this part of the complete set. For comparison, some triplets in the 80th generation have values larger than

2^{85}

.

Algorithm 1: Generating irreducible staircases

To explain the shape of the surface of extremal triplets plotted in Figure 2, we provide some heuristic reasoning. A consequence of Corollary 1 is that if the extremal triplet

v_{S}

is computed from the staircase S, then the slopes determined by the step edges

(i, j) \in S

(namely, points of S where neither

(i + 1, j)

nor

(i, j + 1)

is in S) are almost equal. Consequently, on a large scale, extremal

v_{S}

vectors are generated by the set of lattice points in right-angled triangles defined by the inequality

S (a, b) = {(x, y) \in N \times N : \frac{x}{a} + \frac{y}{b} ⩽ 1}

(68)

for some positive values of a and b. Since, by Lemma 5,

\begin{matrix} \sum_{i ⩽ x, j ⩽ y} b (i, j) & \approx b (x + 1, y + 1), and \\ \sum_{i ⩽ x, j ⩽ y} i b (i, j) & \approx x b (x + 1, y + 1), \end{matrix}

the vector

v_{S (a, b)}

is well approximated by

b (x + 1, y + 1) 〈 1, x, y 〉

, where

(x, y) \in S (a, b)

is the point where

b (x + 1, y + 1)

takes its maximal value. As the function

b (x, y)

strictly increases in both coordinates, this maximum is taken on the boundary diagonal of the right-angled triangle

S (a, b)

that has endpoints

(0, b)

and

(a, 0)

. Using the Stirling formula

n! \approx \sqrt{2 π n} {(n / e)}^{n}

, we have

b (x, y) = \frac{(x + y)!}{x! y!} \approx \frac{1}{\sqrt{2 π}} \frac{{(x + y)}^{x + y + 1 / 2}}{x^{x + 1 / 2} y^{y + 1 / 2}} .

(69)

Introducing

φ (x) = (x + \frac{1}{2}) \log x

, we see that the logarithm of

b (x, y)

is well approximated by the function

θ (x, y) = φ (x + y) - φ (x) - φ (y) - \log \sqrt{2 π} .

(70)
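The quality of the approximation (70) can be gauged numerically. The Python sketch below compares θ(x, y) with the exact logarithm of the binomial coefficient b(x, y) at a few sample points; the sample points and function names are chosen here for illustration only.

```python
from math import comb, log, pi, sqrt

def phi(x):
    """phi(x) = (x + 1/2) * log(x), as introduced before (70)."""
    return (x + 0.5) * log(x)

def theta(x, y):
    """The approximation (70) of log b(x, y)."""
    return phi(x + y) - phi(x) - phi(y) - log(sqrt(2 * pi))

def log_b(x, y):
    """Exact log of b(x, y) = (x + y)! / (x! y!) for positive integers x, y."""
    return log(comb(x + y, x))

samples = [(5, 5), (20, 10), (50, 50)]
errors = {p: theta(*p) - log_b(*p) for p in samples}
```

At these sample points the discrepancy stays below a few hundredths and shrinks as x and y grow, as one would expect from the error term of Stirling's formula.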

Using this approximation, the point

(u, v)

is extremal in the triangle

S (a, b)

if

(u, v)

is on the boundary diagonal and

θ (u + 1, v + 1)

has zero derivative along this diagonal. For fixed u and v, such positive a and b exist exactly when the partial derivatives

θ_{x}

and

θ_{y}

are positive at

(u + 1, v + 1)

. By inspection, this condition is satisfied for every

(u, v)

. Consequently, if

〈 α, β, γ 〉

is an extremal triplet, then choosing

u = β / α

,

v = γ / α

, we expect

\log α \approx θ (u + 1, v + 1),

(71)

and, conversely, for each u, v, with the choice

\log α = θ (u + 1, v + 1)

,

β = u α

, and

γ = v α

, we expect the triplet

〈 α, β, γ 〉

to be extremal. For comparison, Figure 3 plots these triplets over the same range that was used in Figure 2.

This approximation seems to slightly underestimate the real value of

\log α

. For example, the extremal triplet obtained from the diagonal staircase

D_{n + 1}

is

α = 2^{n + 1} - 1, β = (n - 1) 2^{n} + 1, γ = (n - 1) 2^{n} + 1,

(72)

thus,

u = v = β / α \approx (n - 1) / 2

, and

\log α \approx (n + 1) \log 2

. At the same time,

θ (u + 1, v + 1) = (n + 1) \log 2 - \log \sqrt{n + 1} + O (1) .

(73)

Extremal triplets on the two edges of the surface are specified by the totally flat, stairless staircases. These triplets are

α = n, β = n (n - 1) / 2, γ = 0

(74)

on one axis, and

β

,

γ

swapped on the other. In this case, the

(u, v)

pair is

((n - 1) / 2, 0)

, and

θ (u + 1, v + 1) = \log n + (1 - \log \sqrt{8 π} + O (1 / n)),

(75)

which differs from the correct value by a constant only.
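The closed forms (72) and (74) can be cross-checked against a direct summation. The Python sketch below assumes, as elsewhere in this section, that v_S is the sum of b(i, j)〈1, i, j〉 over S with b(i, j) = C(i + j, i), and that D_m = {(i, j) : i + j < m}; the function names are illustrative.

```python
from math import comb

def triplet(points):
    # Assumption: v_S = sum of b(i, j) * <1, i, j>, with b(i, j) = C(i + j, i).
    alpha = sum(comb(i + j, i) for i, j in points)
    beta = sum(i * comb(i + j, i) for i, j in points)
    gamma = sum(j * comb(i + j, i) for i, j in points)
    return (alpha, beta, gamma)

def diagonal(m):
    # Assumption: D_m = {(i, j) : i, j >= 0 and i + j < m}.
    return [(i, j) for i in range(m) for j in range(m - i)]

def flat(n):
    """The flat staircase consisting of the n points (0, 0), ..., (n - 1, 0)."""
    return [(i, 0) for i in range(n)]

def closed_form_diagonal(n):
    """Formula (72) for the triplet of D_{n+1}."""
    return (2 ** (n + 1) - 1, (n - 1) * 2 ** n + 1, (n - 1) * 2 ** n + 1)

def closed_form_flat(n):
    """Formula (74)."""
    return (n, n * (n - 1) // 2, 0)

agree = all(triplet(diagonal(n + 1)) == closed_form_diagonal(n)
            and triplet(flat(n)) == closed_form_flat(n)
            for n in range(1, 11))
```

The diagonal case reduces to the identity that the sum of s·2^(s-1) for s = 0, …, n equals (n - 1)·2^n + 1, since the points with i + j = s contribute s·2^(s-1) to β.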

We have also looked at how the newly discovered entropy inequalities delimit the five-variable entropy region. The triplet

〈 α_{S}, β_{S}, γ_{S} 〉

yields the inequality

(a, b | z) + α_{S} ([a b c d] + Z) + β_{S} C + γ_{S} D ⩾ 0 .

(76)

Since the closure of the five-variable entropy region is a pointed convex cone, one can normalize it by assuming

(a, b | z) = 1

. An equivalent view is to take the cross-section of

\bar{Γ_{5}^{*}}

by this hyperplane. Consider the three-dimensional subspace spanned by the vectors

\begin{array}{lll} x & \overset{def}{=} C & = (a, c | b) + (b, c | a), \\ y & \overset{def}{=} D & = (a, d | b) + (b, d | a), \\ - z & \overset{def}{=} [a b c d] + Z & = [a b c d] + (a, z | b) + (b, z | a); \end{array}

(77)

observe that

z

is negated. Normalize the five-variable entropic function f so that it satisfies

f (a, b | z) = 1

, then project it to this subspace. Use the scalar products

〈 f \cdot x, f \cdot y, f \cdot z 〉

as the projection coordinates. This three-dimensional cross-section of the five-variable entropy region is

Δ = \{〈 f \cdot x, f \cdot y, f \cdot z 〉 \in R^{3} : f \in \bar{Γ_{5}^{*}} such that f (a, b | z) = 1\} .

(78)

Clearly, points in

Δ

have non-negative x and y coordinates, while the z coordinate can take both positive and negative values. Since

\bar{Γ_{5}^{*}}

is a closed convex cone,

Δ

is closed and convex. We concentrate on the part above the

x y

plane:

Δ^{+} = {〈 x, y, z 〉 \in Δ : z ⩾ 0} .

(79)

Shannon inequalities provide no restriction whatsoever on

Δ^{+}

as any non-negative coordinate triplet can be realized by some polymatroid. To show this, define

r_{I}

for any subset I of the ground set

a b c d z

as

r_{I} : A \mapsto \begin{cases} 1 & \text{if } I \cap A \neq \emptyset, \\ 0 & \text{otherwise}. \end{cases}

Let, moreover,

r^{\circ}

be the function

r^{\circ} : A \mapsto \begin{cases} 2 & \text{if } | A | = 1, \\ 4 & \text{if } | A | ⩾ 3, \text{ or } A \text{ is one of } c d, c z, d z, \\ 3 & \text{otherwise}, \end{cases}

as A runs over the non-empty subsets of

a b c d z

. Both

r_{I}

and

r^{\circ}

are extremal rays of

Γ_{5}

, so they are polymatroids. For arbitrary non-negative numbers

x, y, z

the linear combination

f = r_{a b c d} + x r_{a c} + y r_{a d} + z r^{\circ}

satisfies

f (a, b | z) = 1

and has coordinates

〈 x, y, z 〉

, providing the required polymatroid.
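The claimed coordinates can be verified by direct evaluation. The Python sketch below encodes subsets of the ground set as strings and assumes the standard form of the Ingleton expression, [a b c d] = -(a, b) + (a, b | c) + (a, b | d) + (c, d), since its definition is not restated in this section; it also sets f(∅) = 0. All function names are illustrative.

```python
from itertools import combinations

GROUND = "abcdz"

def rank_r(I):
    """r_I(A) = 1 if A meets I, and 0 otherwise."""
    return lambda A: 1 if set(I) & set(A) else 0

def rank_circ(A):
    """The function r° from the text, extended by r°(empty set) = 0."""
    A = frozenset(A)
    if not A:
        return 0
    if len(A) == 1:
        return 2
    if len(A) >= 3 or A in (frozenset("cd"), frozenset("cz"), frozenset("dz")):
        return 4
    return 3

def make_f(x, y, z):
    """f = r_abcd + x * r_ac + y * r_ad + z * r°."""
    parts = [(1, rank_r("abcd")), (x, rank_r("ac")), (y, rank_r("ad")), (z, rank_circ)]
    return lambda A: sum(w * r(A) for w, r in parts)

def cmi(f, u, v, w=""):
    """(u, v | w) = f(uw) + f(vw) - f(uvw) - f(w)."""
    return f(u + w) + f(v + w) - f(u + v + w) - f(w)

def coordinates(f):
    C = cmi(f, "a", "c", "b") + cmi(f, "b", "c", "a")
    D = cmi(f, "a", "d", "b") + cmi(f, "b", "d", "a")
    Z = cmi(f, "a", "z", "b") + cmi(f, "b", "z", "a")
    # Assumption: the standard Ingleton expression.
    ingleton = (-cmi(f, "a", "b") + cmi(f, "a", "b", "c")
                + cmi(f, "a", "b", "d") + cmi(f, "c", "d"))
    return (C, D, -(ingleton + Z))  # the third coordinate is negated, as in (77)

def is_polymatroid(f):
    """Brute-force check of monotonicity and submodularity on all subsets."""
    subs = ["".join(c) for k in range(len(GROUND) + 1)
            for c in combinations(GROUND, k)]
    monotone = all(f(A) <= f(A + e) for A in subs for e in GROUND if e not in A)
    submodular = all(f(A) + f(B) >= f(set(A) | set(B)) + f(set(A) & set(B))
                     for A in subs for B in subs)
    return monotone and submodular

f = make_f(2, 3, 5)
```

With the sample weights x = 2, y = 3, z = 5, the normalization f(a, b | z) = 1 comes entirely from the r_{abcd} term, while r° contributes only through the Ingleton expression, where it evaluates to -1.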

Points of

Δ^{+}

with non-negative x and y coordinates and

z = 0

are realized by linear polymatroids; thus, the complete non-negative quadrant of the

x y

plane is a part of

Δ^{+}

. Our first non-Shannon inequality, generated by the triplet

v_{D_{1}} = 〈 1, 0, 0 〉

, is

(a, b | z) + [a b c d] + Z ⩾ 0 .

(80)

This inequality immediately limits the region

Δ^{+}

to

z ⩽ 1

; therefore, points in

Δ^{+}

have height at most 1.

Other extremal triplets provide additional linear constraints. Figure 4 illustrates the delimited part of the non-negative octant as viewed from the origin, and cut at

x ⩽ 2.5

and at

y ⩽ 2.5

. The pictured bound of

Δ^{+}

is extended to larger values of x and y. Along the x and y axes, this bound approaches the

x z

and

y z

coordinate planes as the functions

z = \sqrt{y}

and

z = \sqrt{x}

, respectively. Along the

x y

diagonal, the limiting behavior toward the z axis is similar to the entropy function

z = - (x + y) \log (x + y)

. The corner point of the plateau

z = 1

has coordinates

〈 1, 1, 1 〉

. The

θ (u + 1, v + 1)

estimate gives a smooth bound on

Δ^{+}

, which is asymptotically tight along the x and y axes.

9. Conclusions

Structural properties of the entropy region of four or more variables are mostly unknown. This region is bounded by linear inequalities corresponding to the non-negativity of Shannon information measures. Finding additional entropy inequalities is, and remains, an intriguing open problem. Previous works on generating and applying such non-Shannon entropy inequalities focused mainly on the four-variable case [4,10,14,15], and only a few sporadic five-variable non-Shannon inequalities have been discovered [18]. This work provides infinitely many five-variable non-Shannon information inequalities by systematically exploring a special property of entropic vectors. Other works utilized the Copy Lemma, a method distilled from the original Zhang–Yeung construction by Dougherty et al. [14]. Our method is based on a different paradigm derived from the principle of maximum entropy and is a special case of the Maximum Entropy Method described in [20]. As proven in Lemma 1, the principle of maximum entropy implies that every entropic polymatroid has an

n, m

-copy, which is a polymatroidal extension with special properties as defined in Definition 1. In Claim 1, we have proved that polymatroids having

n, m

-copies form a polyhedral cone and hinted at how its facets can be computed. Facet equations provide the potentially new non-Shannon entropy inequalities.

While the polyhedral computation presented in Claim 1 is numerically intractable even for small parameter values, the theoretical results of Section 4 allowed us to reduce this complexity significantly. Computational aspects of determining the facets of a high-dimensional cone are closely related to linear multi-objective optimization [22]. We have developed a specially tailored variant of Benson’s inner approximation algorithm [22,38], which takes advantage of the special properties of this enumeration problem. Computational results are reported in Section 5 for generations

n ⩽ 9

. Numerical instability, originating from both the underlying LP solver and the polyhedral algorithm, prevented the completion of the computation for larger values of n.

Non-Shannon inequalities obtained from these computations are discussed in Section 6. Based on these experimental results, two infinite families of five-variable inequalities were defined. The first family in Theorem 1 is parametrized by downward closed subsets of non-negative lattice points. The second family in Theorem 2 has a single positive integer parameter. Inequalities in both families are proved to hold for polymatroids on five elements that have an n-copy; consequently, they are all valid entropy inequalities. It is conjectured that they cover all inequalities that can be obtained by the applied method. In other words, if a polymatroid on five elements satisfies all these inequalities, then it has an n-copy for all n. This conjecture is left as an open problem. The computational results confirmed this conjecture up to

n = 9

.

Inequalities in the first family are investigated in Section 8 in more detail. They are specified by triplets

〈 α_{S}, β_{S}, γ_{S} 〉

determined by downward closed sets S of non-negative lattice points as discussed in Definition 4. Such a triplet is extremal if the corresponding inequality is not a consequence of other inequalities from the same family. Extremal triplets are determined by a special collection of downward closed sets called irreducible staircases. Based on the theoretical results in Corollary 2 and Claim 6, an incremental algorithm, sketched as Algorithm 1, was used to generate irreducible staircases up to generation 60. The converse implication, valid for the computed cases, that triplets generated by irreducible staircases are extremal, is left as an open problem. Triplets

〈 α_{S}, β_{S}, γ_{S} 〉

in the range

β_{S}, γ_{S} ⩽ 20 α_{S}

, generated by irreducible staircases, are plotted in Figure 2. The number of new irreducible staircases that remained irreducible in the subsequent generation matches the sequence A103116 in the On-Line Encyclopedia of Integer Sequences (OEIS) [39]. It is an interesting open problem to prove the equality of these sequences.

To illustrate how the newly discovered entropy inequalities delimit the five-variable entropy region, entropy vectors were normalized to satisfy

(a, b | z) = 1

and projected onto a three-dimensional subspace. Part of the projection in the non-negative octant is denoted by

Δ^{+}

. The Shannon inequalities do not provide any restriction on this part. Figure 4 illustrates the bounds implied by the new inequalities. While the non-negative quadrant of the

x y

plane is known to be part of

Δ^{+}

, and this region is known to contain points above that plane as well, it is an intriguing open problem whether our bound is, at least asymptotically, tight around the x and y axes. Showing that our bound is asymptotically tight at the zero point would amount to settling the long-standing open problem of whether the entropic region is semi-algebraic.

Author Contributions

Conceptualization, L.C. and E.P.C.; methodology, L.C. and E.P.C.; software, L.C. and E.P.C.; validation, L.C. and E.P.C.; formal analysis, L.C. and E.P.C.; investigation, L.C. and E.P.C.; resources, L.C. and E.P.C.; data curation, L.C. and E.P.C.; writing—original draft preparation, L.C. and E.P.C.; writing—review and editing, L.C. and E.P.C.; visualization, L.C. and E.P.C.; supervision, L.C. and E.P.C.; funding acquisition, L.C. All authors have read and agreed to the published version of the manuscript.

Funding

The research reported in this paper was partially supported by the ERC Advanced Grant ERMiD.

Data Availability Statement

The data presented in this study are openly available in GitHub at https://github.com/csirmaz/information-inequalities-5, accessed on 28 December 2025.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Csiszár, I.; Körner, J. Information Theory: Coding Theorems of Discrete Memoryless Systems; Akademia Kiado: New York, NY, USA; Budapest, Hungary, 1981.
Yeung, R.W. Information Theory and Network Coding; Springer: New York, NY, USA, 2008.
Beimel, A. Secret-sharing schemes: A survey. In Coding and Cryptology; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2011; Volume 6639, pp. 11–46.
Beimel, A.; Orlov, I. Secret Sharing and Non-Shannon Information Inequalities. IEEE Trans. Inf. Theory 2011, 57, 5634–5649.
Gürpınar, E.; Romashchenko, A. How to Use Undiscovered Information Inequalities: Direct Applications of the Copy Lemma. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Paris, France, 8–12 July 2019; pp. 1377–1381.
Bamiloshin, M.; Ben-Efraim, A.; Farràs, O.; Padró, C. Common information, matroid representation, and secret sharing for matroid ports. Des. Codes Cryptogr. 2021, 89, 143–166.
Martin, J.; Rombach, P. Guessing Numbers and Extremal Graph Theory. Electron. J. Comb. 2022, 29, P2.58.
Groth, J.; Ostrovsky, R. Cryptography in the Multi-string Model. J. Cryptol. 2014, 27, 506–543.
Madiman, M.; Marcus, A.W.; Tetali, P. Information-theoretic inequalities in additive combinatorics. In Proceedings of the IEEE Information Theory Workshop, Dublin, Ireland, 30 August–3 September 2010; pp. 1–4.
Studený, M. Conditional independence structures over four discrete random variables revisited. IEEE Trans. Inform. Theory 2021, 67, 7030–7049.
Yeung, R.W. A First Course in Information Theory; Kluwer Academic/Plenum Publishers: New York, NY, USA, 2002.
Pippenger, N. What are the laws of information theory. In 1986 Special Problems on Communication and Computation Conference, Proceedings of the Tenth Prague Conference on Information Theory, Statistical Decision Functions, Random Processes, Prague, Czech Republic, 7–11 July 1986; Springer: Palo Alto, CA, USA, 1986.
Zhang, Z.; Yeung, R.W. On characterization of entropy function via information inequalities. IEEE Trans. Inform. Theory 1998, 44, 1440–1452.
Dougherty, R.; Freiling, C.; Zeger, K. Non-Shannon information inequalities in four random variables. arXiv 2011, arXiv:1104.3602.
Csirmaz, L. Book inequalities. IEEE Trans. Inf. Theory 2014, 60, 6811–6818.
Matúš, F. Infinitely many information inequalities. In Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007; pp. 41–44.
Ahlswede, R.; Gács, P.; Körner, J. Bounds on conditional probabilities with applications in multi-use communication. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1976, 34, 157–177.
Makarychev, K.; Makarychev, Y.; Romashchenko, A.; Vereshchagin, N. A new class of non-Shannon-type inequalities for entropies. Commun. Inf. Syst. 2002, 2, 147–166.
Kaced, T. Equivalence of two proof techniques for non-Shannon-type inequalities. In Proceedings of the IEEE International Symposium on Information Theory, Istanbul, Turkey, 7–12 July 2013; pp. 236–240.
Csirmaz, L. Exploring the entropic region. arXiv 2025, arXiv:2509.12439.
Guiasu, S.; Shenitzer, A. The principle of maximum entropy. Math. Intell. 1985, 7, 42–48.
Csirmaz, L. Inner approximation algorithm for solving linear multiobjective optimization problems. Optimization 2020, 70, 1487–1511.
Matúš, F.; Csirmaz, L. Entropy region and convolution. IEEE Trans. Inform. Theory 2016, 62, 6007–6018.
Matúš, F.; Studený, M. Conditional Independences among Four Random Variables I. Comb. Probab. Comput. 1995, 4, 269–278.
Studený, M.; Bouckaert, R.R.; Kočka, T. Extreme Supermodular Set Functions over Five Variables; Research Report N. 1977; Institute of Information Theory and Automation: Prague, Czech Republic, 2000.
Mazumdar, S.; Seybold, D.; Kritikos, K.; Verginadis, Y. A survey on data storage and placement methodologies for Cloud-Big Data ecosystem. J. Big Data 2019, 6, 15.
Huber, M. An introduction to causal discovery. Swiss J. Econ. Stat. 2024, 160, 14.
Ziegler, G.M. Lectures on Polytopes; Graduate Texts in Mathematics; Springer: Berlin/Heidelberg, Germany, 1994; Volume 152.
Matúš, F. Adhesivity of polymatroids. Discret. Math. 2007, 307, 2464–2477.
Bell, J.; Funk, D.; Kim, D.D.; Mayhew, D. Effective Versions of Two Theorems of Rado. Q. J. Math. 2020, 71, 599–618.
Dougherty, R.; Freiling, C.; Zeger, K. Linear rank inequalities on five or more variables. arXiv 2010, arXiv:0910.0284.
Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
Csirmaz, L. One-adhesive polymatroids. Kybernetika 2020, 56, 886–902.
Matúš, F. Two constructions on limits of entropy functions. IEEE Trans. Inf. Theory 2007, 53, 320–330.
Chan, T.H. Balanced information inequalities. IEEE Trans. Inf. Theory 2003, 49, 3261–3267.
Csirmaz, E.P.; Csirmaz, L. Enumerating Extremal Submodular Functions for n = 6. Mathematics 2024, 13, 97.
Ehrgott, M.; Löhne, A.; Shao, L. A dual variant of Benson’s ‘outer approximation algorithm’ for multiple objective linear programming. J. Glob. Optim. 2012, 52, 757–778.
Löhne, A.; Weißing, B. The vector linear program solver Bensolve – notes on theoretical background. Eur. J. Oper. Res. 2017, 260, 807–813.
OEIS Foundation Inc. The On-Line Encyclopedia of Integer Sequences; OEIS Foundation Inc.: Highland Park, NJ, USA, 2019; Available online: https://oeis.org/A103116 (accessed on 28 December 2025).

Figure 1. Closest horizontal and vertical boundary segments of the gray region are marked by the arrows. Red circles indicate points to be added and subtracted.

Figure 2. Extremal configurations. Colors indicate the $α$ value.

Figure 3. The $θ(u + 1, v + 1)$ function. Colors indicate the function value.

Figure 4. Delimiting the five-variable entropy region. Entropic points are on or below the indicated surface.

Table 1. Results for the case $[abcd] < 0$.

n	Rows	Columns	Facets	Vertices	Time (s, or m:ss / h:mm:ss)
1	76	23	16	19	0.01
2	284	53	21	43	0.03
3	706	101	34	155	0.35
4	1416	171	63	675	3.54
5	2488	267	120	2171	38.25
6	3996	393	221	6275	5:24
7	6014	533	386	14,523	36:45
8	8616	751	635	31,379	2:59:17
9	11,876	991	1000	61,627	13:13:45

Table 2. Results for the case $[acbd] < 0$.

n	Rows	Columns	Facets	Vertices	Time (s, or m:ss / h:mm:ss)
1	76	23	16	19	0.00
2	284	53	18	25	0.03
3	706	101	20	35	0.14
4	1416	171	22	49	0.65
5	2488	267	24	67	2.36
6	3996	393	26	89	7.37
7	6014	533	28	115	32.21
8	8616	751	30	145	1:12
9	11,876	991	32	179	5:01
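
The running times in Tables 1 and 2 mix plain seconds with colon-separated entries. The following is a minimal sketch for putting them on one scale. It assumes that a colon-separated entry means m:ss or h:mm:ss, which the tables do not state explicitly, and the helper name `to_seconds` is hypothetical.

```python
def to_seconds(cell: str) -> float:
    """Convert a time cell to seconds.

    Assumption: 'x' is seconds, 'm:ss' is minutes:seconds,
    'h:mm:ss' is hours:minutes:seconds.
    """
    total = 0.0
    for part in cell.split(":"):
        # Each further colon-separated field is one base-60 digit.
        total = total * 60 + float(part)
    return total


# Time columns copied from Table 1 and Table 2 (n = 1, ..., 9).
table1_times = ["0.01", "0.03", "0.35", "3.54", "38.25",
                "5:24", "36:45", "2:59:17", "13:13:45"]
table2_times = ["0.00", "0.03", "0.14", "0.65", "2.36",
                "7.37", "32.21", "1:12", "5:01"]

table1_seconds = [to_seconds(t) for t in table1_times]
table2_seconds = [to_seconds(t) for t in table2_times]

for n, (t1, t2) in enumerate(zip(table1_seconds, table2_seconds), start=1):
    print(f"n={n}: case [abcd] < 0 took {t1:.2f} s, case [acbd] < 0 took {t2:.2f} s")
```

Under this reading, the n = 9 run of Table 1 corresponds to 47,625 s, against 301 s for the n = 9 run of Table 2.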

Table 3. Coefficient values for $n = 3$.

$〈 α, β, γ 〉$	$〈 α, β, γ 〉$	$〈 α, β, γ 〉$
$〈 1, 0, 0 〉$	$〈 3, 0, 3 〉$	$〈 6, 3, 5 〉$
$〈 2, 0, 1 〉$	$〈 3, 3, 0 〉$	$〈 6, 5, 3 〉$
$〈 2, 1, 0 〉$	$〈 4, 1, 3 〉$	$〈 7, 5, 5 〉$
$〈 3, 1, 1 〉$	$〈 4, 3, 1 〉$

Table 4. Downward-closed subsets of $D_{3}$ and the corresponding triplets. Triplets marked by * are consequences of the others.


1	$〈 1, 0, 0 〉$
2	$〈 2, 1, 0 〉$
3	$〈 3, 3, 0 〉$
4	$〈 2, 0, 1 〉$
5	$〈 3, 1, 1 〉$
6	$〈 4, 3, 1 〉$
* 7	$〈 5, 3, 3 〉$
8	$〈 6, 5, 3 〉$
9	$〈 3, 0, 3 〉$
10	$〈 4, 1, 3 〉$
* 11	$〈 5, 3, 3 〉$
12	$〈 6, 3, 5 〉$
13	$〈 7, 5, 5 〉$
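
The starred triplet $〈 5, 3, 3 〉$ in Table 4 can be checked arithmetically against the unstarred ones. The sketch below searches for pairs of unstarred triplets whose midpoint is $〈 5, 3, 3 〉$. It rests on an assumption not verified here: that the corresponding inequalities depend linearly on $〈 α, β, γ 〉$, so that averaging two valid inequalities yields a valid one. The function name `midpoint_pairs` is hypothetical, and the check illustrates one way a triplet can be redundant rather than the derivation used in the article.

```python
from itertools import combinations

# Unstarred triplets <alpha, beta, gamma> copied from Table 4.
triplets = [
    (1, 0, 0), (2, 1, 0), (3, 3, 0), (2, 0, 1), (3, 1, 1), (4, 3, 1),
    (6, 5, 3), (3, 0, 3), (4, 1, 3), (6, 3, 5), (7, 5, 5),
]
starred = (5, 3, 3)


def midpoint_pairs(target, points):
    """Return all pairs (p, q) from points with (p + q) / 2 == target."""
    return [
        (p, q)
        for p, q in combinations(points, 2)
        if all(a + b == 2 * t for a, b, t in zip(p, q, target))
    ]


pairs = midpoint_pairs(starred, triplets)
for p, q in pairs:
    print(f"{starred} is the midpoint of {p} and {q}")
```

With the data above, the search reports three such pairs, for example $〈 4, 3, 1 〉$ and $〈 6, 3, 5 〉$.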

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Csirmaz, L.; Csirmaz, E.P. Information Inequalities for Five Random Variables. Computation 2026, 14, 42. https://doi.org/10.3390/computation14020042

