Information Content and Maximum Entropy of Compartmental Systems in Equilibrium

Metzler, Holger; Sierra, Carlos A.

doi:10.3390/e27101085

Open AccessArticle

Information Content and Maximum Entropy of Compartmental Systems in Equilibrium

by

Holger Metzler

^1,2,3,4,*

and

Carlos A. Sierra

¹

Max Planck Institute for Biogeochemistry, Hans-Knöll-Str. 10, 07745 Jena, Germany

²

Department of Geography, Ludwig-Maximilians-Universität Munich, Luisenstr. 37, 80333 Munich, Germany

³

Department of Crop Production Ecology, Swedish University of Agricultural Sciences, Ulls väg 16, 756 51 Uppsala, Sweden

⁴

Department of Forest Ecology and Management, Swedish University of Agricultural Sciences, Skogsmarksgränd 17, 901 83 Umeå, Sweden

^*

Author to whom correspondence should be addressed.

Entropy 2025, 27(10), 1085; https://doi.org/10.3390/e27101085

Submission received: 31 August 2025 / Revised: 10 October 2025 / Accepted: 14 October 2025 / Published: 21 October 2025

(This article belongs to the Special Issue Bridging Bayesian and Information-Theoretic Approaches in Earth and Environmental System Modeling: Theory and Practical Applications)

Download

Browse Figures

Review Reports Versions Notes

Abstract

Mass-balanced compartmental systems defy classical deterministic entropy measures since both metric and topological entropy vanish in dissipative dynamics. By interpreting open compartmental systems as absorbing continuous-time Markov chains that describe the random journey of a single representative particle, we allow established information-theoretic principles to be applied to this particular type of deterministic dynamical system. In particular, path entropy quantifies the uncertainty of complete trajectories, while entropy rates measure the average uncertainty of instantaneous transitions. Using Shannon’s information entropy, we derive closed-form expressions for these quantities in equilibrium and extend the maximum entropy principle (MaxEnt) to the problem of model selection in compartmental dynamics. This information-theoretic framework not only provides a systematic way to address equifinality but also reveals hidden structural properties of complex systems such as the global carbon cycle.

Keywords:

information entropy; compartmental systems; equifinality; model identification; MaxEnt; reservoir models

1. Introduction

For many modeling applications it is of interest to quantify the complexity of the system of differential equations used to represent natural phenomena [1,2]. In principle, we are interested in selecting models that are parsimonious, i.e., models with the least degree of complexity for explaining certain patterns in nature [3]. The concept of entropy has been commonly used to characterize complexity or information content. Classical entropy measures for dynamical systems characterize the rate of increase in dynamical complexity as the system evolves over time [4]. These metrics have been used extensively to characterize chaotic behavior in complex nonlinear systems [5], but they give trivial results for a large range of models used in the natural sciences.

In a large variety of scientific fields, models are based on the principle of mass conservation. In many cases, such models are nonnegative dynamical systems that can be described by first-order systems of ordinary differential equations (ODEs) with strong structural constraints. Such systems are called compartmental systems [6,7,8].

Compartmental systems can be evaluated using diagnostic metrics that predict system-level behavior and allow comparisons of systems of very different structures. Age and transit time of material content in compartmental systems are two diagnostic metrics that have been widely studied for systems in and out of equilibrium [9,10,11,12,13,14]. They help compare the behavior and quality of different models. Nevertheless, structurally very different models might show very similar ages and transit times and might represent a given measurement equally well. If we are in the position to select one of such models, which is the one to select? This equifinality problem can be resolved by the maximum entropy principle (MaxEnt) [15,16], a generic procedure to draw unbiased inferences from measurement or stochastic data [3,17].

In order to apply MaxEnt to compartmental systems, an appropriate notion of entropy is required to measure the system’s uncertainty or information content. Two classical examples in dynamical systems theory are the topological entropy and the Kolmogorov–Sinai/metric entropy. However, open compartmental systems are dissipative, and trajectories with slightly disturbed initial conditions do not diverge. Hence, by Pesin’s theorem [18], both metric and topological entropy vanish and cannot serve as a measure of uncertainty. Alternatively, we can interpret compartmental systems as weighted directed graphs. Dehmer and Mowshowitz [19] provide a comprehensive overview of the history of graph entropy measures. Unfortunately, most such entropy measures are based on the number of vertices, vertex degree, edges, or degree sequence [20,21]. Thus, they concentrate only on the structural information of the graph. There are also graph theoretical measures that take edges and weights into account by using probability schemes. Their drawback is that the underlying meaning of uncertainty becomes difficult to interpret, because the assigned probabilities seem somewhat arbitrary [22].

To bridge this gap, we interpret deterministic compartmental systems from a probabilistic viewpoint which allows us to apply the whole information theoretical toolbox to this important class of deterministic systems. As a first step in this direction, we compute the Shannon information entropy [23] of the continuous-time Markov chain that describes the random path of a single particle through the compartmental system [13] and introduce three non-vanishing entropy measures: While the path entropy describes the uncertainty of a single particle’s path through the system, the entropy rate per unit time and the entropy rate per jump describe average uncertainties over the course of a particle’s journey.

The focus on a single particle gives our entropies microscopic system properties and consequently distinguishes our approach from the theory of maximum caliber (MaxCal) [24,25], where path entropy is interpreted as a macroscopic system property of bulk material. Furthermore, our approach differs from the thermodynamic approach to entropy, which has been developed by other authors studying energy transfers and reversibility in thermodynamic systems [8,26,27,28]. While the probabilistic interpretation of thermodynamic entropy is related to the uncertainty of the location of a typical particle at a specific point in time, the newly introduced path entropy considers all locations of a typical particle at all times while it is part of the system. A first application of our information theoretical entropy concept to compartmental systems allows us to reveal hidden inherent properties of complex systems such as the carbon cycle, from the microbial to the global scale, e.g., allowing us to partly explain why there is a large diversity of soil carbon models, while there is more consensus on how to model carbon uptake by photosynthesis.

This article is organized as follows. First, we provide the fundamentals from information theory and dynamical systems theory that are necessary to introduce path entropy as the uncertainty of a single particle traveling through the system. Then, we mathematically derive the path entropy and introduce the entropy rates per unit time and per jump as uncertainty measures of the behavior of one typical particle. The focus on a single particle gives our entropies microscopic system properties and provides insights, where the macroscopic approach via topological and metric entropy fails. Then, we prove that the new entropy rates of a finite particle path are indeed proper entropy rates of an associated stationary stochastic process, which guarantees that, in the process of model selection, there exists a unique optimal solution as long as the parameter set is convex. To assist in the interpretation of the newly introduced quantities from a system-wide point of view, we establish the link to the macroscopic system scale before we introduce the link between MaxEnt and structural model identification for compartmental systems. Afterwards, we present the introduced theory by means of simple generic examples from the field of carbon-cycle modeling exploring the effect of different parameterizations on the three entropy metrics before we apply MaxEnt to a model identification problem. Then, we discuss the results and draw final conclusions.

2. Mathematical Background: Information Entropy and Compartmental Systems as Markov Chains

First, we introduce some basic notations and well-known properties of Shannon information entropy of random variables and stochastic processes. Then, we present compartmental systems as a means to model material-cycle systems that obey the law of mass balance. We then consider such systems from a single-particle point of view and define the path of a single particle through the system along with its visited compartments, sojourn times, occupation times, and transit time.

2.1. Short Summary of Shannon Information Entropy

We introduce a few basic concepts of information entropy. Within the framework of this article, discrete entropies are usually associated with a particle’s jump into another compartment and differential entropies to a particle’s sojourn time within a specific compartment. Entropy rates are defined as average uncertainties of the particle’s path while it is in the system. See (Cover and Thomas [29] Sects. 2 and 8) for a more detailed introduction to Shannon’s information entropy and differential entropy. Entropy rates for discrete- and continuous-time stochastic processes are introduced in (Cover and Thomas [29] Sect. 4) and Bad Dumitrescu [30].

Let Y be a real-valued discrete (continuous) random variable and call p its probability mass function (probability density function). Then,

H (Y) : = - E [\log p (Y)]

(1)

is called the Shannon information entropy (differential entropy) of Y. Most of the time, we just say entropy, and the precise meaning can be derived from the context. The entropy’s unit depends on the logarithmic base. For base 2, the unit is bits, and for the natural logarithm with base e, the unit is nats. Throughout this manuscript, we use the latter if not stated otherwise.

The entropy

H (Y)

of a random variable Y has two intertwined interpretations. On the one hand, it is a measure of uncertainty, i.e., a measure of how difficult it is to predict the outcome of a realization of Y. On the other hand,

H (Y)

is also a measure of the information content of Y, i.e., a measure of how much information we gain once we learn about the outcome of a realization of Y. It is important to note that, even though their definitions and information theoretical interpretations are quite similar, Shannon and differential entropy have one main difference. Shannon entropy is always nonnegative, whereas differential entropy can have negative values. While Shannon entropy is an absolute measure of information and makes sense in its own right, differential entropy is not an absolute information measure, is not scale-invariant, and makes sense only in comparison with the differential entropy of another random variable.

Panel (a) of Figure 1 depicts the Shannon entropy with logarithmic base 2 of a Bernoulli random variable Y, with

P (Y = 1) = 1 - P (Y = 0) = p \in [0, 1]

representing a coin toss with probability of heads equal to p. The closer p is to

1 / 2

, the more difficult it is to predict the outcome. For an unbiased coin with

p = 1 / 2

, we have no information about the outcome whatsoever, and the Shannon entropy

H (Y) = - p \log (p) - (1 - p) \log (1 - p)

(2)

is maximized. Panel (b) of Figure 1 shows the differential entropy of an exponentially distributed random variable

Y \sim Exp (λ)

with rate parameter

λ > 0

, probability density function

f (y) = λ e^{- λ y}

for

y \geq 0

, and

E [Y] = λ^{- 1}

. We can imagine it to represent the duration of stay of a particle in a well-mixed compartment in an equilibrium compartmental system, where

λ

is the total outflow rate from the compartment. The higher the outflow rate, the more likely an early exit of the particle, and the easier it is to predict its moment of exit. Hence, the differential entropy

H (Y) = 1 - \log λ

(3)

decreases with increasing

λ

.

The joint entropy of two random variables

Y_{1}

and

Y_{2}

can be described as

H (Y_{1}, Y_{2}) = H (Y_{1}) + H (Y_{2} | Y_{1}),

(4)

where the conditional entropy

H (Y_{2} | Y_{1})

describes the uncertainty of

Y_{2}

under the condition that

Y_{1}

is known. The uncertainty of a stochastic process can be measured by its entropy rate, which describes the time density of the average information in the process. For a discrete-time stochastic process

Y = {(Y_{n})}_{n \geq 1}

, it is defined as ([29] Sect. 4.2)

θ (Y) = lim_{n \to \infty} \frac{1}{n} H (Y_{1}, Y_{2}, \dots, Y_{n}),

(5)

when the limit exists.

For instance, let

Z \sim Poi (λ)

be a Poisson process with intensity rate

λ > 0

describing the moments of occurrences of certain events. The interarrival times

Y = (Y_{1}, Y_{2}, \dots)

of Z (the times between events) are

Exp (λ)

-distributed and mutually independent. Hence,

θ (Y) = 1 - \log λ

. If we rescale

θ (Y)

by the mean interarrival time

1 / λ

, we obtain the entropy rate of Z being ([31] Sect. 3.3)

θ (Z) = θ (Poi (λ)) = λ (1 - \log λ) .

(6)

This entropy rate increases with

λ \in [0, 1]

, reaches its maximum at 1, and then it decreases (Figure 1c). The maximum always occurs at

λ = 1

independent of the unit of

λ

, because it is based on the differential entropy of the exponential distribution, which is not scale-invariant. Consequently, it is not an absolute measure of information content but only useful in comparison to the entropy rates of other stochastic processes.

2.2. Compartmental Systems in Equilibrium

The mass-balanced flow of material into a system, within the system, and out of the system that consists of several compartments can be modeled by so-called compartmental systems [6,32]. Compartments are always well-mixed and usually also called pools or boxes. An autonomous compartmental system can be described by the d-dimensional linear ODE system

\frac{d}{d t} x (t) = B x (t) + u, t > 0,

(7)

with some nonnegative initial condition

x (0) = x^{0} \in R_{+}^{d}

. The nonnegative vector

x (t)

describes the amount of material in the different compartments at time t, the nonnegative vector

u = {(u_{i})}_{i = 1, 2, \dots, d} \in R_{+}^{d}

is the vector of external inputs to the compartments, and the compartmental matrix

B \in R^{d \times d}

describes the flux rates between the compartments and out of the system. The nonnegative off-diagonal value

B_{i j}

is the flux rate from compartment j to compartment i, the absolute value of the negative diagonal value

B_{j j}

is the total rate of fluxes out of compartment j, and the nonnegative column sum

z_{j} = - \sum_{i = 1}^{d} B_{i j}

is the rate of the flux from compartment j out of the system. By requiring

B

to be invertible, we ensure that the system is open, i.e., all material that enters the system will eventually also leave it. Throughout this manuscript, we consider the open compartmental system (7) to have reached its unique steady-state or equilibrium compartment vector

x^{*} = - B^{- 1} u

. This implies

∥ r ∥ = ∥ u ∥

, where

r = {(r_{i})}_{i = 1, 2, \dots, d}

given by

r_{j} = z_{j} x_{j}^{*}

is the external outflux vector from the system, and

∥ \cdot ∥

denotes the sum of absolute values of a vector (

l_{1}

-norm). An open compartmental system in equilibrium given by Equation (7) is fully characterized by

u

and

B

, and we denote it by

M : = M (u, B)

.

2.3. The One-Particle Perspective

While Equation (7) describes the movement of bulk material through the system, compartmental systems in equilibrium can also be described probabilistically by considering the random path of a single particle through the system [13]. If

X_{t} \in S : = {1, 2, \dots, d}

denotes the compartment in which the single particle is at time t, and

X_{t} = d + 1

if the particle has already left the system, then

X : = {(X_{t})}_{t \geq 0}

is an absorbing continuous-time Markov chain [33] on

\tilde{S} : = S \cup {d + 1}

. Its initial distribution is given by

\tilde{β} = {(β_{1}, β_{2}, \dots, β_{d}, 0)}^{T}

, where

β : = u / ∥ u ∥

, and hence,

β_{j} = P (X_{0} = j)

is the probability of the single particle to enter the system through compartment j. The superscript T denotes the transpose of the respective vector or matrix. The transition-rate matrix of X is given by

Q = (\begin{matrix} B & 0 \\ z^{T} & 0 \end{matrix}),

(8)

and thus,

P (X_{t} = i) = {(e^{t Q} \tilde{β})}_{i} = \sum_{j = 1}^{d} {(e^{t Q})}_{i j} β_{j}, i \in \tilde{S}

(9)

is the probability of the particle to be in compartment i at time t if

i \in S

or that the particle has left the system if

i = d + 1

. Here,

e^{t Q}

denotes the matrix exponential. Furthermore,

P (X_{t} = i | X_{s} = j) = {(e^{(t - s) Q})}_{i j}, s \leq t, i, j \in \tilde{S}

(10)

is the probability that X is in state i at time t, given it was in state j at a previous time s. Since the Markov chain X and the compartmental system in equilibrium given by Equation (7) are equivalent, we can write

M = M (u, B) = M (X) .

(11)

2.4. The Path of a Single Particle

A particle’s path through the system from the moment of entering until the moment of exit can be described as a sequence of (compartment, sojourn-time)-pairs

P (X) : = ((Y_{1} = X_{0}, T_{1}), (Y_{2}, T_{2}), \dots, (Y_{N - 1}, T_{N - 1}), Y_{N} = d + 1),

(12)

where X is the absorbing Markov chain associated with the particle’s journey. The sequence

Y_{1}, Y_{2}, \dots, Y_{N - 1} \in S

represents the successively visited compartments with the associated sojourn times

T_{1}, T_{2}, \dots, T_{N - 1}

, and the random variable

N : = inf {n \in N : Y_{n} = d + 1}

(13)

denotes the first hitting time of the absorbing state

d + 1

by the embedded jump chain

Y : = {(Y_{n})}_{n = 1, 2, \dots, N}

of X [33]. With

λ_{j} : = - Q_{j j}

, the one-step transition probabilities of Y are given by, for

i, j \in \tilde{S}

,

P_{i j} : = P (Y_{n + 1} = i | Y_{n} = j) = \{\begin{matrix} 0, & i = j or λ_{j} = 0, \\ Q_{i j} / λ_{j}, & else . \end{matrix}

(14)

Let

{P |}_{S} = {(P_{i j})}_{i, j \in S}

be the restriction of

P

to S. We can also write

{P |}_{S} = B D^{- 1} + I

, where

D : = diag (λ_{1}, λ_{2}, \dots, λ_{d})

is the diagonal matrix with the diagonal entries of

B

, and

I

denotes the identity matrix of appropriate dimension. Then,

M : = {(I - P |_{S})}^{- 1}

is the fundamental matrix of Y. The entry

M_{i j}

denotes the expected number of visits to compartment i given that the particle entered the system through compartment j. Consequently, the expected number of visits to compartment

i \in S

is given by

E [N_{i}] = \sum_{j = 1}^{d} M_{i j} β_{j} = {(M β)}_{i} = {[{(I - P |_{S})}^{- 1} β]}_{i} = {(D B^{- 1} β)}_{i} = \frac{λ_{i} x_{i}^{*}}{∥ u ∥},

(15)

and the total expected number of jumps is given by

E [N] = \sum_{i = 1}^{d} {(M β)}_{i} + 1 = \sum_{i = 1}^{d} E [N_{i}] + 1,

(16)

where we take into account also the last jump out of the system.

The last jump,

N

, leads the particle out of the system such that, at the moment of this last jump, X takes on the value

d + 1

. This last jump happens at the absorption time of the Markov chain X, which is defined as

T : = inf {t > 0 : X_{t} = d + 1} .

(17)

The absorption time is phase-type distributed [34],

T \sim PH (β, B)

, with probability density function

f_{T} (t) = z^{T} e^{t B} β, t \geq 0 .

(18)

It can be shown ([13] Sect. 3.2) that the mean or expected value of

T

equals the turnover time [12] of system (7) in equilibrium and is given by total stocks over total fluxes, i.e.,

E [T] = \frac{∥ x^{*} ∥}{∥ u ∥} .

(19)

Furthermore, by construction,

\sum_{k = 1}^{N - 1} T_{k} = T

. If we denote by

1_{{A}}

the indicator function of the logical expression A given by

1_{{A}} = \{\begin{matrix} 1, A is true, \\ 0, else, \end{matrix}

(20)

then

O_{j} : = \sum_{k = 1}^{N - 1} 1_{{Y_{k} = j}} T_{k}

is the total time that the particle spends in compartment j. This time is called occupation time of j, and its mean is given by ([13] Sect. 3.3)

E [O_{j}] = \frac{x_{j}^{*}}{∥ u ∥},

(21)

which induces

E [T] = \sum_{j = 1}^{d} E [O_{j}]

.

3. Entropy Measures, MaxEnt, and Structural Model Identification

Based on these basic structures of the path of a single particle traveling through the system, we compute three different types of entropy, for which we provide a summary of the desirable relations among them below:

(1): As a particle travels through a system of interconnected compartments, it jumps a certain number of times to the next compartment until it finally jumps out of the system. Between two jumps, the particle resides in some compartment. The path entropy measures the entire uncertainty about the particle’s travel through the system, including both the sequence of visited compartments and the respective times spent there.
(2): The entire travel of the particle takes a certain time. In each unit time interval before the particle leaves the system, uncertainties exist as to whether the particle jumps, where it jumps, and even how often it jumps. The mean of these uncertainties over the mean length of the travel interval is measured by the entropy rate per unit time.
(3): Each jump comes with uncertainties about which compartment will be next and how long will the particle stay there. The entropy rate per jump measures the average of these uncertainties with respect to the mean number of jumps until the particle’s exit from the system.

Once these entropy metrics are established, we introduce MaxEnt and show how to apply it to the problem of structural model identification.

3.1. Path Entropy, Entropy Rate per Unit Time, and Entropy Rate per Jump

The path

P = P (X)

given by Equation (12) can be interpreted in three different ways. Each of these ways leads to a different interpretation of the path’s entropy. First, we can look at

P

as the result of bookkeeping of the absorbing continuous-time Markov chain X, where for all times t we note down the pair

(X_{t}, t)

of the current compartment and the current time. Second, we can consider the path as a discrete-time process. In each time step n, we choose randomly a new compartment

Y_{n + 1}

and an associated sojourn time

T_{n + 1}

of the particle in this compartment. Third, we can look at

P

as a single random variable with values in the space of all possible paths. Based on the latter interpretation we now derive the path entropy.

We are interested in the uncertainty/information content of the path

P (X)

of a single particle. Along the lines of Albert [35], we construct a space ℘ that contains all possible paths that can be taken by a particle that runs through the system until it leaves. Let

℘_{n} : = {(S \times R_{+})}^{n} \times {d + 1}

denote the space of paths that visit n compartments/states before ending up in the environmental compartment/absorbing state

d + 1

. By

℘ : = ⋃_{n = 1}^{\infty} ℘_{n}

, we denote the space of all eventually absorbed paths. Note that, since

B

is invertible, a path through the system is finite with probability 1. Let l denote the Lebesgue measure on

R_{+}

and c the counting measure on S. Furthermore, let

σ_{n}

be the

σ

-finite product measure on

℘_{n}

. It is defined by

σ_{n} : = {(c \otimes l)}^{n} \otimes c

. Almost all sample functions of

{(X_{t})}_{t \geq 0}

can be represented as a point

p \in ℘

([36] Chapter VI). Consequently, we can represent X by a finite-length path

P (X) = ((Y_{1}, T_{1}), (Y_{2}, T_{2}), \dots, (Y_{n}, T_{n}), Y_{n + 1})

for some

n \in N

, where

Y_{n + 1} = d + 1

.

For each set

W \subseteq ℘

, for which

W \cap ℘_{n}

is

σ_{n}

-measurable for each

n \in N

, we define

σ^{*} (W) : = \sum_{n = 1}^{\infty} σ_{n} (W \cap ℘_{n})

. This measure is defined on the

σ

-field

F^{*}

which is the smallest

σ

-field containing all sets

W \subseteq ℘

, whose projection on

R_{+}^{n}

is a Borel set for each

n \in N

. Let

σ

be a measure on all sample functions, defined for all subsets W whose intersection with ℘ is in

F^{*}

. We define it by

σ (W) : = σ^{*} (W \cap ℘)

.

Let

p = ((x_{1}, t_{1}), (x_{2}, t_{2},), \dots, (x_{n}, t_{n}), d + 1) \in ℘

for some

n \in N

. For

i \neq j

, denote by

N_{i j} (p)

the total number of path’s p one-step transitions from j to i and by

R_{j} (p)

the total amount of time spent in j.

Theorem 1.

The probability density function of

P = P (X)

with respect to σ is given by

\begin{matrix} f_{P} (p) = β_{x_{1}} (\prod_{j = 1}^{d} \prod_{i = 1, i \neq j}^{d + 1} & {(Q_{i j})}^{N_{i j} (p)}) \prod_{j = 1}^{d} e^{- λ_{j} R_{j} (p)}, \\ p = ((x_{1}, t_{1}), (x_{2}, t_{2}), \dots, (x_{n}, t_{n}), d + 1) \in ℘ . \end{matrix}

(22)

Proof.

Let

x_{1}, x_{2}, \dots, x_{n} \in S

,

x_{n + 1} = d + 1

, and

t_{1}, t_{2}, \dots, t_{n} \in R_{+}

. Since

\begin{matrix} P ((Y_{1} = x_{1}, T_{1} \leq t_{1}), \dots, (Y_{n} = x_{n}, T_{n} \leq t_{n}), Y_{n + 1} = d + 1) \\ = P (Y_{n + 1} = d + 1 | Y_{n} = x_{n}) \\ \cdot \prod_{k = 2}^{n} P (Y_{k} = x_{k}, T_{k} \leq t_{k} | Y_{k - 1} = x_{k - 1}) P (Y_{1} = x_{k}, T_{1} \leq t_{1}) \\ = P_{d + 1, x_{n}} [\prod_{k = 2}^{n} P_{x_{k} x_{k - 1}} (1 - e^{- λ_{x_{k}} t_{k}})] β_{x_{1}} (1 - e^{- λ_{x_{1}} t_{1}}) \\ = \int_{T_{n}} β_{x_{1}} \prod_{k = 1}^{n} Q_{x_{k + 1} x_{k}} e^{- λ_{x_{k}} τ_{k}} d τ_{1} d τ_{2} \dots d τ_{n}, \end{matrix}

(23)

with

T_{n} = {(τ_{1}, τ_{2}, \dots, τ_{n}) \in R_{+}^{n} : 0 \leq τ_{1} \leq t_{1}, 0 \leq τ_{2} \leq t_{2}, \dots, 0 \leq τ_{n} \leq t_{n}}

, the probability density function of

P = P (x)

with respect to

σ

is given by

\begin{matrix} f_{P} (p) = β_{x_{1}} \prod_{k = 1}^{n} & Q_{x_{k + 1} x_{k}} e^{- λ_{x_{k}} t_{k}}, \\ p = ((x_{1}, t_{1}), (x_{2}, t_{2}), \dots, (x_{n}, t_{n}), d + 1) \in ℘ . \end{matrix}

(24)

The term

Q_{x_{k + 1} x_{k}} = Q_{i j}

enters exactly

N_{i j} (p)

times. Furthermore,

\begin{matrix} \prod_{k = 1}^{n} e^{- λ_{x_{k}} t_{k}} & = \prod_{k = 1}^{n} \prod_{j = 1}^{d} 1_{{x_{k} = j}} e^{- λ_{j} t_{k}} = \prod_{j = 1}^{d} e^{- λ_{j} \sum_{k = 1}^{n} 1_{{x_{k} = j}} t_{k}} \\ = \prod_{j = 1}^{d} e^{- λ_{j} R_{j} (p)} . \end{matrix}

(25)

We make the according substitutions, and the proof is finished. □

The entropy of the absorbing continuous-time Markov chain X is equal to its entropy on the random but finite time horizon

[0, T]

, which in turn equals the entropy of a single particle’s path

P

through the system.

Theorem 2.

The entropy of the absorbing continuous-time Markov chain X is given by

\begin{matrix} H (X) & = H (P) \\ = - \sum_{i = 1}^{d} β_{i} \log β_{i} \\ + \sum_{j = 1}^{d} \frac{x_{j}^{*}}{∥ u ∥} [\sum_{i = 1, i \neq j}^{d} B_{i j} (1 - \log B_{i j}) + z_{j} (1 - \log z_{j})] . \end{matrix}

(26)

Proof.

Let X have the finite path representation

P = P (X) = ((Y_{1}, T_{1}), (Y_{2}, T_{2}), \dots, (Y_{n}, T_{n}), d + 1)

(27)

for some

n \in N

, and denote by

f_{P}

its probability density function. Then, by Theorem 1,

- \log f_{P} (P) = - \log β_{Y_{1}} - \sum_{j = 1}^{d} \sum_{i = 1, i \neq j}^{d + 1} N_{i j} (P) \log Q_{i j} + \sum_{j = 1}^{d} λ_{j} R_{j} (P) .

(28)

We compute the expectation and get

\begin{matrix} H (X) & = H (P) = - E [\log f_{P} (P)] \\ = - E [\log β_{Y_{1}}] - \sum_{j = 1}^{d} \sum_{i = 1, i \neq j}^{d + 1} E [N_{i j} (P)] \log Q_{i j} + \sum_{j = 1}^{d} λ_{j} E [R_{j} (P)] \\ = H (Y_{1}) + \sum_{j = 1}^{d} λ_{j} E [R_{j} (P)] - \sum_{j = 1}^{d} \sum_{i = 1, i \neq j}^{d + 1} E [N_{i j} (P)] \log Q_{i j} . \end{matrix}

(29)

Obviously,

E [R_{j} (P)] = E [O_{j}] = x_{j}^{*} / ∥ u ∥

is the mean occupation time of compartment

j \in S

by X. Furthermore, for

i \in \tilde{S}

and

j \in S

such that

i \neq j

, by Equations (14) and (15),

E [N_{i j} (P)] = E [N_{j} (P)] P_{i j} = \{\begin{matrix} \frac{x_{j}^{*}}{∥ u ∥} B_{i j}, & i \leq d, \\ \frac{x_{j}^{*}}{∥ u ∥} z_{j}, & i = d + 1 . \end{matrix}

(30)

Together with

λ_{j} = \sum_{i = 1, i \neq j}^{d} B_{i j} + z_{j}

, we obtain

\begin{matrix} H (X) & = H (Y_{1}) + \sum_{j = 1}^{d} \frac{x_{j}^{*}}{∥ u ∥} [(\sum_{i = 1, i \neq j}^{d} B_{i j} + z_{j}) \\ - \sum_{i = 1, i \neq j}^{d} B_{i j} \log B_{i j} - z_{j} \log z_{j}] \\ = - \sum_{i = 1}^{d} β_{i} \log β_{i} + \sum_{j = 1}^{d} \frac{x_{j}^{*}}{∥ u ∥} [\sum_{i = 1, i \neq j}^{d} B_{i j} (1 - \log B_{i j}) \\ + z_{j} (1 - \log z_{j})] . \end{matrix}

(31)

□

By some simple substitutions and rearrangements, we obtain two representations of

H (X) = H (P)

that are easy to interpret. For simplicity of notation, we define

H (β) : = - \sum_{i = 1}^{d} β_{i} \log β_{i} .

(32)

Proposition 1.

The entropy of the absorbing continuous-time Markov chain X is also given by

H (X) = H (β) + \sum_{j = 1}^{d} E [O_{j}] (\sum_{i = 1, i \neq j}^{d} θ (Poi (B_{i j})) + θ (Poi (z_{j})))

(33)

and

\begin{matrix} H (X) & = H (β) \\ + \sum_{j = 1}^{d} E [N_{j}] (H (Exp (λ_{j})) + H (P_{1, j}, P_{2, j}, \dots, P_{d, j}, P_{d + 1, j})), \end{matrix}

(34)

which can be rewritten as

\begin{matrix} H (X) & = H (β) + \sum_{j = 1}^{d} E [N_{j}] H (P_{1, j}, P_{2, j}, \dots, P_{d, j}, P_{d + 1, j}) \end{matrix}

(35)

\begin{matrix} + \sum_{j = 1}^{d} E [N_{j}] H (Exp (λ_{j})) . \end{matrix}

(36)

Proof.

By virtue of Equation (33), we replace

x_{j}^{*} / ∥ u ∥

by

E [O_{j}]

in Equation (26) and take into account that the entropy rate of a Poisson process with intensity rate

λ

equals

λ (1 - \log λ)

to prove Equation (33). To prove Equation (34), we use Equation (15) to replace

x_{j}^{*} / ∥ u ∥

in Equation (26) by

E [N_{j}] / λ_{j}

and obtain

\begin{matrix} H (X) & = - \sum_{i = 1}^{d} β_{i} \log β_{i} + \sum_{j = 1}^{d} E [N_{j}] (1 - \log λ_{j}) \\ + \sum_{j = 1}^{d} E [N_{j}] (- \sum_{i = 1, i \neq j}^{d} \frac{B_{i j}}{λ_{j}} \log \frac{B_{i j}}{λ_{j}} - \frac{z_{j}}{λ_{j}} \log \frac{z_{j}}{λ_{j}}) . \end{matrix}

(37)

Here,

(1 - \log λ_{j})

is the entropy of an exponential random variable with rate parameter

λ_{j}

. Using definition (14) of

P_{i j}

, we replace

B_{i j} / λ_{j}

by

P_{i j}

for

i \in S

and

z_{j} / λ_{j}

by

P_{d + 1, j}

and finish the proof. □

By identifying a compartmental system

M = M (u, B)

with its associated absorbing continuous-time Markov chain X and the according path

P = P (X)

of a single traveling particle, we transfer the concept of the path entropy

H (P)

from the probabilistic to the deterministic realm.

Definition 1.

The path entropy of the compartmental system M in equilibrium given by Equation (7) with associated absorbing continuous-time Markov chain X and path

P = P (X)

is defined by

H (P) = H (P (X)) = H (X) .

(38)

Consider a one-dimensional compartmental system

M_{λ}

in equilibrium with rate

λ > 0

and positive external input given by

\frac{d}{d t} x (t) = - λ x (t) + u, t > 0

(39)

and denote its associated path by

P_{λ}

. The entropy of the initial distribution vanishes, and we obtain

H (P_{λ}) = \frac{x^{*}}{u} λ (1 - \log λ) = \frac{1}{λ} λ (1 - \log λ) = 1 - \log λ,

(40)

which equals the differential entropy

1 - \log λ

of the exponentially distributed mean transit time

T_{λ} \sim Exp (λ)

, reflecting that the only uncertainty of the particle’s path in a one-pool system is the time of the particle’s exit. The exponential distribution with rate parameter

λ

is the distribution of the interarrival time of a Poisson process wit intensity rate

λ

. Hence, we can interpret

H (P_{λ}) = λ^{- 1} λ (1 - \log λ)

as the instantaneous Poisson entropy rate

λ (1 - \log λ)

multiplied with the expected duration

E [T_{λ}] = λ^{- 1}

of the particle’s stay in the system.

For a d-dimensional system, we can interpret

H (P)

as the entropy of a continuous-time process in the context of Equation (33) and as the entropy of a discrete-time process in the context of Equation (34). In both interpretations, the first term

H (β) = H (X_{0}) = H (Y_{1})

represents the uncertainty of the first pool through which the particle enters the system. In the continuous-time interpretation, the uncertainty of the subsequent travel is the weighted average of the superposition of d Poisson processes describing the instantaneous uncertainty of possible jumps of the particle inside the system,

θ (Poi (B_{i j}))

, and out of the system,

θ (Poi (z_{j}))

, where the weights are the expected occupation times of the different compartments

j \in S

. In the discrete-time interpretation, the subsequent travel’s uncertainty is the average of uncertainties associated to each pool, weighted by the number of visits to the respective pools. The uncertainty associated with each pool comprises the uncertainty of the length of the stay in the pool,

H (Exp (λ_{j}))

, and the uncertainty of where to jump afterwards,

H ({P_{i j} : i \in \tilde{S}, j \in S, i \neq j})

. Hence, in the context of Equation (34), we can separate the path entropy into a discrete part associated with the jump uncertainty given by Equation (35) and a continuous part associated with the sojourn time uncertainty given by Equation (36).

The two interpretations of the path entropy

H (P)

(as a continuous-time or discrete-time process) motivate two different entropy rates as described earlier. The entropy rate per unit time is given by

θ (P) = \frac{H (P)}{E [T]}

(41)

and the entropy rate per jump by

θ_{J} (P) = \frac{H (P)}{E [N]} .

(42)

While the path entropy measures the uncertainty of the entire path, entropy rates measure the average uncertainty of the instantaneous future of a particle while it is in the system: for the entropy rate per unit time, it is the uncertainty entailed by the infinitesimal future, and for the entropy rate per jump, it is the uncertainty entailed by the next jump.

For these entropy rates to be useful in the process of model selection, it is important that they guarantee the existence of a unique maximum entropy model in case of a convex parameter space, which is not obvious from their definitions. The classical entropy rate of a stochastic process as defined in (Cover and Thomas [29] Sect. 4.1) has this property, and we prove in Appendix A that

θ_{J} (P) = θ (Z)

, where the stationary process

Z = {(Z_{n})}_{n \geq 1} = {({\tilde{Y}}_{n}, {\tilde{T}}_{n})}_{n \geq 1}

on the space

(\tilde{S} \times R_{+})

describes the infinite journey of a typical particle. It is the sequence of visited compartments with the associated sojourn times of a single particle through the system with immediate jumps back into the system when leaving it. By Equations (41) and (42), the average time between two jumps is

E [T] / E [N]

. If we divide the entropy rate per jump by it, we obtain the entropy rate per unit time. Hence,

θ (P) = \frac{E [N]}{E [T]} θ (Z)

(43)

is the average uncertainty per unit time of the stationary process Z.

3.2. From Microscopic Particle Entropy to Macroscopic System Entropy

While the microscopic entropy measures provide direct insights into the uncertainties of the path of a single traveling particle, we can also scale them up to the macroscopic system scale. The combination of Equations (19), (41), and (43) inevitably leads to the following macroscopic definition.

Definition 2.

The system entropy of the compartmental system M in equilibrium given by Equation (7) with associated absorbing continuous-time Markov chain X and path

P = P (X)

is defined by

H (M) = ∥ x^{*} ∥ θ (P) = ∥ u ∥ H (P) = ∥ u ∥ E [N] θ_{J} (P) .

(44)

Consequently, the system entropy can be interpreted in three ways: (1) as the cumulated mean instantaneous uncertainty of all particles currently in the system, (2) as the cumulated uncertainty of the entire future path of all particles currently entering the system, and (3) as the cumulated mean uncertainty of all future jumps of all particles currently entering the system.

3.3. The Maximum Entropy Principle (MaxEnt)

MaxEnt arose in statistical mechanics as a variational principle to predict the equilibrium states of thermal systems and later was applied to matters of information and as a general procedure to draw inferences based on self-consistency requirements [17]. Its relationship to information theory and stochastics was established by Jaynes [15,16]. The general idea is to identify the most uninformed probability distribution to represent some given data in the sense that the maximum entropy distribution, constrained to given data, uses the information provided by the data only and nothing else. This approach ensures that no additional subjective information creeps into the distribution. For compartmental systems, data constraints could affect macroscopic quantities such as the stocks

x^{*}

, the input vector

u

, the output rates

z_{j}

, or the mean transit time

E [T]

. The goal of this section is to transfer MaxEnt to compartmental systems in order to identify the compartmental system that best represents our state of knowledge in different situations and, at the same time, to get a better understanding of the previously introduced entropy measures. In the next two examples, we identify compartmental models with maximum entropy under some restrictions. Both examples show that maximizing entropy means also maximizing symmetry as much as the given constraints allow.

Example 1.

Consider the set

M_{1}

of equilibrium compartmental systems (7) with a predefined nonzero input vector

u

, a predefined mean transit time

E [T]

, and an unknown steady-state vector

x^{*}

comprising nonzero components. We are interested in the most unbiased compartmental system that reflects our state of information, where maximum unbiasedness is achieved by identifying

M_{1}^{*} \in M_{1}

with the path

P_{1}^{*} : = P (M_{1}^{*})

such that the path entropy

H (P_{1}^{*})

or, equivalently, the entropy rate per unit time

θ (P_{1}^{*})

is maximized. We can show (see Proposition A2) that the compartmental system

M_{1}^{*} = M (u, B)

with

B = (\begin{matrix} - λ & 1 & \dots & 1 \\ 1 & - λ & 1 \dots & 1 \\ ⋮ & ⋱ & ⋮ \\ 1 & \dots & 1 & - λ \end{matrix}),

(45)

where

λ = d - 1 + 1 / E [T]

, is the maximum entropy model in

M_{1}

. In the special case

d = 1

for a one-dimensional compartmental system, we obtain

B = - 1 / E [T]

. Since, in this case,

T \sim Exp (- B)

, we see that the exponential distribution is the maximum entropy distribution in the class of all nonnegative continuous probability distributions with fixed expected value. This special case is very well-known ([29] Example 12.2.5).

Example 2.

Let us consider the subclass

M_{2} \subseteq M_{1}

of compartmental models from the previous example with the additional restriction of a predefined positive steady-state vector

x^{*}

. Then, the compartmental system

M_{2}^{*} = M (u, B)

with path

P_{2}^{*}

and

B_{i j} = \{\begin{matrix} \sqrt{\frac{x_{i}^{*}}{x_{j}^{*}}}, & i \neq j, \\ - \sum_{k = 1, k \neq j}^{d} \sqrt{\frac{x_{k}^{*}}{x_{j}^{*}}} - \frac{1}{\sqrt{x_{j}^{*}}}, & i = j, \end{matrix}

(46)

is the maximum entropy model in

M_{2}

(see Proposition A3).

3.4. Structural Model Identification Assisted by MaxEnt

Suppose that we observe a natural system and conduct measurements from which we try to construct a linear autonomous compartmental model in equilibrium that represents the observed natural system as well as possible. The first question that arises is about the number of compartments that the model should ideally have. MaxEnt cannot be helpful here because by adding more and more compartments we can theoretically increase the entropy of the model indefinitely. Consequently, the problem of finding the right dimension of system (7) has to be solved by other means. One way to do this is to analyze an impulse response function of the system and its Laplace transform, i.e., the transfer function of the system, and identify the most dominant frequencies. The impulse response or the transfer function might be possible to obtain by tracer experiments [6,37].

Once the desired number of compartments is identified, we can focus on the structure and values of external input and output fluxes as well as internal fluxes. In (Anderson [6] Chapter 16), the structural identification problem of linear autonomous systems is described as follows. Suppose that we are interested in determining a d-dimensional system of form (7). We are interested in sending an impulse to the system at time

t = 0

and analyzing its further behavior. To that end, we rewrite the system as

\begin{matrix} \frac{d}{d t} x (t) & = B x (t) + A u, t \geq 0, \\ x (0) & = 0 \in R^{d}, \\ y (t) & = C x (t), t \geq 0 . \end{matrix}

(47)

Note that the roles of

A

and

B

are interchanged here with respect to Anderson [6]. In a typical tracer experiment, we choose an input vector

u

and the input distribution matrix

A

, which defines how the input vector enters the system. Then, we decide which compartments we can observe to determine the output connection matrix

C

. The experiment is now to inject an impulse into the system and to record the output function

y (t) = C x (t)

. Bellman and Åström [38] pointed out that the input–output relation is given by

\begin{matrix} y (t) & = C x (t) = C \int_{0}^{t} e^{(t - τ) B} A u (τ) d τ \\ = [C e^{t B} A] * u (t), \end{matrix}

where ∗ is the convolution operator. The model parameters enter the input–output relation only in the matrix-valued impulse response function

Ψ (t) : = C e^{t B} A, t \geq 0,

(48)

or in the transfer function

\hat{Ψ} (s) : = C {(s I - B)}^{- 1} A, s \geq 0,

(49)

which is the Laplace transform matrix of

Ψ

. Consequently, all identifiable parameters of

A

,

B

, and

C

must be identified through

Ψ

or

\hat{Ψ}

. Difficulties arise because the entries of the matrices

Ψ

and

\hat{Ψ}

are usually nonlinear expressions of the elements of

A

,

B

, and

C

. We call system (47) identifiable if this nonlinear system of equations has a unique solution

(A, B, C)

for given

Ψ

or

\hat{Ψ}

. Otherwise, the system is called non-identifiable. Usually, the matrices

A

and

C

are already known from the experiment setup. What remains is to identify the compartmental matrix

B

, and this can be achieved by MaxEnt.

4. Application to Particular Systems

First, we apply the presented theory to some equilibrium compartmental models with very simple structure in order to grasp the new entropy concepts. Then, we compute entropy quantities for two carbon-cycle models in dependence on environmental and biochemical parameters. Finally, we apply MaxEnt to solve an equifinality problem in model selection as an example of how to tackle this problem arising from, for instance, tracer experiments.

4.1. Simple Examples

From Table 1, we can see that, depending on the connections between compartments, smaller systems can have greater path entropy and entropy rates than larger systems, even though systems with more compartments can theoretically reach higher entropy. Furthermore, we see from the depicted examples that the system with the highest path entropy does not have the highest entropy rate per unit time or per jump. Adding connections to a system, one would expect higher path entropy, but the path entropy might actually decrease, because the new connections potentially provide a faster way out of the system.

4.2. A Linear Autonomous Global Carbon-Cycle Model

We consider the global carbon-cycle model introduced by Emanuel et al. [39] (Figure 2).

The model comprises five compartments: non-woody tree parts

x_{1} = 37 PgC

, woody tree parts

x_{2} = 452 PgC

, ground vegetation

x_{3} = 69 PgC

, detritus/decomposers

x_{4} = 81 PgC

, and active soil carbon

x_{5} = 1121 PgC

. We introduce an environmental rate modifier

ξ

, which controls the speed at which carbon is cycled in all compartments. If

ξ > 1

, carbon is cycled faster in all compartments, simulating the effect of the global surface temperature increase [40]. For a given

ξ

, the equilibrium model

M_{ξ} = M (u, B_{ξ})

is given by

u = {(77; 0; 36; 0; 0)}^{T} PgC {yr}^{- 1}

(50)

and

B_{ξ} = ξ (\begin{matrix} - 77 / 37 & 0 & 0 & 0 & 0 \\ 31 / 37 & - 31 / 452 & 0 & 0 & 0 \\ 0 & 0 & - 36 / 69 & 0 & 0 \\ 21 / 37 & 15 / 452 & 12 / 69 & - 48 / 81 & 0 \\ 0 & 2 / 452 & 6 / 69 & 3 / 81 & - 11 / 1, 121 \end{matrix}) {yr}^{- 1},

(51)

where the numbers are chosen as in Thompson and Randerson [41]. The input vector is expressed in units of petagrams of carbon per year (

PgC {yr}^{- 1}

) and the fractional transfer coefficients in units of per year (

{yr}^{- 1}

). Because

B_{ξ}

is a lower triangular matrix, the model contains no feedbacks. For every value of

ξ

, the system has a different steady state (Figure 3a).The higher the value of

ξ

, the faster the system, which makes the mean transit time (Figure 3b) decrease, and because of shorter paths, the path entropy (Figure 3d) also decreases. Since

ξ

has no impact on the structure of the model, the mean number of jumps (Figure 3c) remains unaffected. This can also be seen from the solid line marked by squares in (Figure 3d). It represents the part of the path entropy related to jump-associated uncertainties (Equation (35)). The solid line marked by circles represents the part of the path entropy related to sojourn-associated uncertainties (Equation (36)), which as a weighted average of one-pool entropies decreases similarly to the entropy of an exponential distribution with an increasing rate parameter

λ

(Figure 1b). The two parts together constitute the path entropy as represented by the unmarked solid line.

The entropy rate per unit time (Figure 3e) increases until

ξ \approx 6

and decreases afterwards, because with increasing system speed the decreasing uncertainty associated with sojourn times increasingly dominates the uncertainty associated with jumps. While the uncertainty associated with jumps averaged over the path length increases, because the total jump uncertainty is constant (see solid line marked with squares in Figure 3d), and the mean path length decreases (Figure 3b), the sojourn-associated uncertainty decreases with the increasing system speed for

ξ > 6

, similar to the entropy rate of a Poisson process with intensity rate

λ > 1

(see Figure 1c). The entropy rate per jump (Figure 3f) decreases with increasing

ξ

, because the path entropy of the system decreases.

Dashed lines in Figure 3d–f show the respective entropy values for a one-pool system

M_{λ} = M ((77 + 36) P gC {yr}^{- 1}, - λ)

with the same mean transit time, i.e.,

λ^{- 1} = E [T_{ξ}]

. The solid and dashed lines intersect at

ξ \approx 4.31

in Figure 3d,e. Before this break-even point, the path of this multiple-pool model is harder to predict than the path (i.e., the exit time of the particle) of a one-pool model with the same mean transit time. After this point of breaking even, the path of the model with five compartments is easier to predict than only the transit time in a one-pool model. The reason is that, as the system becomes faster, the differential entropy of the sojourn times in slow pools decreases so fast that, at some point, the sojourn times in slow pools visited by few particles becomes rather unimportant. The one-pool model’s path becomes relatively harder to predict, because it puts too much weight on a small amount of slowly cycling particles.

Note that there is no point in comparing jump-associated uncertainties (square-marked lines) with one-pool entropies (dashed lines), because the former are discrete entropies and the latter differential entropies. Comparison of a differential entropy with another quantity only becomes reasonable if a second differential entropy is involved, as is true for the path entropy or the entropy rates

θ

and

θ_{J}

(unmarked solid lines). Hence, square- and circle-marked lines assist in understanding the composition of the entropies of the multi-pool system, and only the composition of the two can then be compared to the one-pool entropy rate.

4.3. A Nonlinear Autonomous Soil Organic Matter Decomposition Model

Consider the nonlinear two-compartment model

M_{ε} = M (u, B_{ε})

, described by Wang et al. [42], which is used to represent the dynamics of microbes and carbon substrates in soils (Figure 4). Its ODE system is given by

\frac{d}{d t} (\begin{matrix} C_{s} \\ C_{b} \end{matrix}) (t) = (\begin{matrix} - λ (x (t)) & μ_{b} \\ ε λ (x (t)) & - μ_{b} \end{matrix}) (\begin{matrix} C_{s} \\ C_{b} \end{matrix}) + (\begin{matrix} F_{NPP} \\ 0 \end{matrix}),

(52)

where

x (t) = {(C_{s}, C_{b})}^{T} (t)

. We denote by

C_{s}

and

C_{b}

substrate organic carbon and soil microbial biomass carbon (

gC m^{- 2}

), respectively, by

ε

the carbon use efficiency or fraction of assimilated carbon that is converted into microbial biomass (unit-less), by

μ_{b}

the turnover rate of microbial biomass per year (

{yr}^{- 1}

), by

F_{NPP}

the carbon influx into the soil (

gC m^{- 2} {yr}^{- 1}

), and by

V_{s}

and

K_{s}

the maximum rate of soil carbon assimilation per unit microbial biomass per year (

{yr}^{- 1}

) and the half-saturation constant for soil carbon assimilation by microbial biomass (

gC m^{- 2}

), respectively.

We consider the model in equilibrium, i.e.,

x (t) = x^{*} = {(C_{s}^{*}, C_{b}^{*})}^{T}

, with

C_{s}^{*} = \frac{K_{s}}{\frac{V_{s} ε}{μ_{b}} - 1} and C_{b}^{*} = \frac{F_{NPP}}{μ_{b} (- 1 + \frac{1}{ε})} .

(53)

The equilibrium stocks depend on the carbon use efficiency

ε

and so does the compartmental matrix

B = B_{ε}

, because

λ (x) = \frac{C_{b} V_{s}}{C_{s} + K_{s}} .

(54)

From Wang et al. [42], we take the parameter values

μ_{b} = 4.38 {yr}^{- 1}

,

F_{NPP} = 345.00 gC m^{- 2} {yr}^{- 1}

, and

K_{s}

= 53,954.83

gC m^{- 2}

. Since the description of

V_{s}

is missing in the original publication, we let it be equal to

59.13 {yr}^{- 1}

to approximately meet the given steady-state contents

C_{s}^{*}

= 12,650.00

gC m^{- 2}

and

C_{b}^{*} = 50.36 gC m^{- 2}

for the original value

ε = 0.39

. Otherwise, we leave the carbon use efficiency

ε

as a free parameter.

In contrast to the system from the first example, this system exhibits a feedback. This feedback results from dead soil microbial biomass being considered as new soil organic matter. The feedback can also be recognized by noting that

B

is not triangular. For every value of

ε

, the system has a different steady state (Figure 5a). The higher the value of

ε

, the lower the equilibrium substrate organic carbon and the higher the microbial biomass carbon. Caused by the model’s nonlinearity expressed in Equation (54), the system speed increases, and the mean transit time goes down (Figure 5b) with increasing

ε

. At the same time, higher carbon use efficiency increases the probability of each carbon atom to be reused more often; hence, the mean number of jumps increases (Figure 5c), making the entropy rate per jump decrease (Figure 5f). Even though the average paths become shorter, with increasing carbon use efficiency, the path entropy increases as well for most values of

ε

. This has two reasons. First, the mean uncertainty of where to jump from

C_{s}

increases; this uncertainty decreases then for

ε > 0.5

(solid line marked by squares in Figure 5f). Second, the rate

- B_{11}

of leaving the substrate pool is increasing and smaller than 1. The corresponding Poisson process reaches its maximum entropy rate at an intensity rate equal to 1 (Figure 1c), which corresponds to

ε \approx 0.926

. This is also reflected in the entropy rate per unit time (Figure 5e). The maximum does not exactly occur at

ε = 0.926

, because the time that the particle stays in the different pools also depends on

ε

. For

ε

approaching 1, both the path entropy and the entropy rate rapidly decline as the sojourn-associated uncertainties (solid lines with circle markers) decline sharply because of a nonlinear increase in the rate

- B_{11}

of soil organic carbon turnover.

Considering a one-pool system

M_{λ} = M (345.00 gC m^{- 2} {yr}^{- 1}, - 1 / E [T_{ε}])

with the same mean transit time, we recognize only small sensitivity of the entropies on

ε

, because the contrary effects on path length and jump- and sojourn-associated uncertainties mostly balance out (dashed lines in Figure 5d–f).

4.4. Model Identification via Maxent

The following example is inspired by (Anderson [6] Example 16 C). It shows how MaxEnt can help make a decision about which model to use if not all parameters can be uniquely determined from the transfer function

\hat{Ψ}

. We are interested in determining the entries of the compartmental matrix

B

belonging to the two-dimensional equilibrium compartmental system

\frac{d}{d t} (\begin{matrix} x_{1} \\ x_{2} \end{matrix}) (t) = (\begin{matrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{matrix}) (\begin{matrix} x_{1} \\ x_{2} \end{matrix}) (t) + (\begin{matrix} 1 \\ 0 \end{matrix}) gC {yr}^{- 1}, t > 0 .

(55)

We immediately notice that

u = {(1, 0)}^{T} gC {yr}^{- 1}

and

A = I

. Further, we decide to measure the contents of compartment 1 such that

C = (1, 0)

. We recall

z_{j} = - \sum_{i = 1}^{d} B_{i j}

and obtain

z_{1} = - B_{11} - B_{21}

and

z_{2} = - B_{22} - B_{12}

. The real-valued transfer function is then given by

\hat{Ψ} (s) = \frac{s + γ_{1}}{s^{2} + γ_{2} s + γ_{3}},

(56)

where

\begin{matrix} γ_{1} & = B_{12} + z_{2}, \\ γ_{2} & = B_{21} + z_{1} + B_{12} + z_{2}, \\ γ_{3} & = z_{1} B_{12} + z_{1} z_{2} + B_{21} z_{2} . \end{matrix}

(57)

We assume that

\hat{Ψ}

is known from measurements, i.e.,

γ_{1}

,

γ_{2}

, and

γ_{3}

are known impulse response parameters. We have the four unknown parameters

B_{11}

,

B_{12}

,

B_{21}

, and

B_{22}

, or equivalently,

B_{12}

,

B_{21}

,

z_{1}

, and

z_{2}

, but only three equations to determine them. Consequently, the system is non-identifiable and there remains a class

M

of models which all satisfy Equation (57). Which model out of

M

are we going to select now?

Here, MaxEnt comes into play. We intend to select the model that best represents the information given by our measurement data. We have to find

M^{*} = M (u, B^{*})

such that

M^{*} = \underset{M \in M}{arg max} θ (P (M)) .

(58)

Maximizing the entropy rate per unit time here leads to a feasible optimization problem, whereas maximization of the path entropy by slowing down the model and indefinitely increasing its mean transit time and its path entropy would lead to an unbounded optimization problem. The parameter space associated with

M

is given by

{p = (B_{12}, B_{21}, z_{1}, z_{2}) \in R_{+}^{4} : p satisfies Equation (57)},

(59)

which is not guaranteed to be convex in general. Consequently, by fundamental principles from mathematical optimization theory, the existence and uniqueness of

M^{*}

are not guaranteed, and we must apply optimization methods tailored to the specific case at hand.

Let us turn to a numerical example in which we suppose to be given

γ_{1} = 3 {yr}^{- 1}

,

γ_{2} = 5 {yr}^{- 1}

, and

γ_{3} = 4 {yr}^{- 1}

. Since convexity of the parameter space is not guaranteed, local optimality does not guarantee global optimality. Hence, we run local optmizations from starting points on a grid with mesh side

0.2

over the subspace

{[0, 5]}^{4}

of the parameter space and select our global maximum candidate as the local maximum with the highest entropy rate per unit time. Even though we cannot rigorously prove that our global maximum candidate

M_{\max} = M (u, B_{\max})

, as represented by the red dot in Figure 6 with

B_{\max} \approx (\begin{matrix} - 2.723 & 1.821 \\ 1.098 & - 2.277 \end{matrix}) {yr}^{- 1}

(60)

and

θ_{\max} \approx 1.916

, is a global maximum, we can clearly see that it is a good candidate. Increasing the distance of the local maximum parameters (Figure 6a) and mean transit time (Figure 6b) from the global maximum candidate leads to a decrease in the entropy rate per unit time. Furthermore, local optimizations with starting points on the grid lead only to small improvements. A good choice of starting point on the grid is crucial to find a good global maximum candidate (Figure 6c). Finally, the global maximum candidate for the entropy rate per unit time does not maximize the path entropy (Figure 6d).

5. Discussion

Based on the stochastic path that a single particle takes through a deterministic compartmental system, we introduced three types of entropy based on Shannon’s information theory. The entropy of the particle’s entire path through the system is the central concept, and the entropy rates per unit time and per jump are consistently derived from it. Even though we call

H (P)

the path entropy and identify models by maximizing it, it is different from the concept of path entropy as treated in the context of maximum caliber (MaxCal) [24,25]. We maximize here the Shannon information entropy of a single particle’s microscopic path through a compartmental system by means of an absorbing continuous-time Markov chain, whose transition probabilities are already determined by the macroscopic equilibrium state of the system. As discussed by Pressé et al. [17], MaxCal interprets the path entropy as a macroscopic system property to be maximized in order to identify a time-dependent trajectory of the entire dynamical system, not just one single particle. We derive macroscopic system entropy by multiplying microscopic entropy quantities (e.g., path entropy, entropy rate per unit time) with the associated macroscopic system quantities (e.g., total system content, total input amount).

In the field of soil carbon cycle modeling, Ågren [43] applied the maximum entropy principle to identify the distribution of soil carbon qualities within the framework of the continuous-quality theory. Given only the nonnegative mean quality, an application of MaxEnt leads to an exponential quality distribution, because, under these circumstances, the exponential distribution is the maximum entropy distribution. The path entropy generalizes this approach to several interconnected compartments and jumps between them, while each sojourn time in a compartment is exponentially distributed.

From the simple examples in Section 4.1, we can see that models can be ordered differently in terms of uncertainty, depending on whether the interest is in the uncertainty of the entire path or in some average uncertainty rate. For applications of MaxEnt without restrictions on the transit time, it is often useful to maximize an entropy rate instead of the path entropy, because, by slowing the system down more and more, the path entropy can potentially be increased indefinitely, and a maximum path entropy model does not exist. The decision to maximize a rate can potentially also be justified by a given macroscopic restriction on the stock sizes, which would increase indefinitely with indefinitely increasing path entropy by slowing down the system.

By virtue of its very mathematical definition (Equation (1)), entropy is maximized when the system’s symmetry is maximized. This is indicated by the Bernoulli entropy (Figure 1a) and supported by Example 1. Intuitively, this result is obvious. If a system has high symmetry, a particle is equally likely to jump among different pools. The Poisson process with intensity rate 1 is the one with maximum entropy rate, which follows directly from properties of the function

f (x) = x \log x

. Furthermore, the resulting rates

z_{j} = 1 / E [T]

of leaving the system are chosen such that the mean transit time constraint is fulfilled. In Example 2, the symmetry is broken by the additional restriction of a given steady-state vector. Consequently,

H (P_{2}^{*}) \leq H (P_{1}^{*})

.

When we compute entropy values for actual carbon-cycle models (Section 4.2 and Section 4.3), we note that environmental or eco-physiological factors might impact model entropies. For example, higher global surface temperatures are likely to induce a higher global carbon-cycle system speed (

1 \leq ξ ≪ 6

). This higher system speed reduces the uncertainty of the long-term future of entire paths of carbon atoms entering the terrestrial biosphere from the atmosphere. At the same time, it increases the entropy rate per unit time, i.e., the uncertainty of the short-term future of carbon atoms already in the terrestrial biosphere.

Furthermore, we see that for sufficiently fast systems, a multi-pool model has lower entropy than a one-pool model with the same system speed. The one-pool system might put too much weight on the uncertainties of a small number of slow-cycling particles, while the more detailed multi-pool model focuses more on the small uncertainties of the major amount of fast-cycling particles. The path of a detailed model that separates fast from slow paths is then even easier to predict than a one-pool model path, even though the detailed model’s path looks more complicated. However, detailed paths of slow-cycling systems are harder to predict than just the exit time in a one-pool equivalent.

The two carbon-cycle models (Section 4.2 and Section 4.3) are well-understood in equilibrium; hence, they can serve as a means to better understand properties of the newly introduced entropy metrics. Once we understand entropy properties in dependence on general system properties, we can extrapolate this understanding to far more complex systems and make qualitative statements about their predictability without going into all model details. One major insight from those two examples is that, in general, slow heterogeneous systems are much harder to predict than fast homogeneous systems. Slowness increases the uncertainty of the duration of particle’s stay in the system, and heterogeneity increases the uncertainty of a particle’s sequence of visited compartments.

These simple insights allow us to understand modeling issues in a broader sense. For instance, path entropies support the understanding of differences in the diversity of modeling approaches and predictions for carbon uptake and transfers to soils in terrestrial ecosystems. Both photosynthesis [44] and soil carbon turnover [45] are modeled by many different approaches. However, in ecosystem models, photosynthesis is almost exclusively represented [46] by the Farquhar model [47], while soil carbon dynamics are represented by a great variety of models with very different structures [48]. The latter leads to large variations in the prediction of future land carbon uptake [48,49]. A comparison of carbon simulations from eleven modeling centers showed that across models, global soil carbon varied more than twice as much as global net primary productivity [50]. Carbon dynamics in leaves are relatively fast, and the fate of this carbon is highly predictable; it is either used to fuel the metabolic activity of leaf cells or allocated to storage reserves or woody tissue [51]. In contrast, the fate of carbon entering soils is much less predictable, with a large range of potential metabolic pathways through microbial food webs or potential physico-chemical interactions with the mineral surfaces in the soil matrix that occur at longer timescales [52]. Consequently, the higher uncertainty of soil carbon cycling compared to photosynthetic carbon uptake is an inherent property of the system. Simply by the soil’s heterogeneous and slow-cycling nature, the system posseses high inherent uncertainty, which hints at a theoretical limit that cannot be overcome by any model.

The example of model identification by MaxEnt in Section 4.4 shows a major difference from the more artificial previous maximum entropy examples. The given constraints do not tell us enough about the structure of the model class

M

to ensure that an identified local maximum is also a global maximum. Owing to the nonlinear restrictions on the parameters in Equation (57), the parameter space is probably not convex. Hence, local maxima are not guaranteed to be also globally optimal. The small system size allows us to identify a reasonable global maximum candidate model by brute force, starting local maximizations on a grid over a parameter sub-space. Practical examples might include higher-dimensional systems and thus not be feasible for brute-force approaches. More sophisticated optimization methods suitable for the particular problem at hand should then be applied. However, since the newly introduced entropy measures are proper entropies, and in the case of a convex parameter space, the existence of a unique global optimum is guaranteed.

6. Conclusions

A probabilistic approach to mass-balanced deterministic systems allows basic information theoretical principles to compute the uncertainty of a wide range of models representing complex processes in nature—a task at which classical deterministic theories fail. The information content of autonomous compartmental systems in equilibrium can be assessed by the entropy of the path of particles traveling through the system of interconnected compartments. When a particle moves through a compartmental system, it creates a path from the time of its entry until the time of its exit. This path can be described in three ways: (1) as a random variable in the path space, (2) as a continuous-time stochastic process representing the occupied compartments, (3) as a discrete sequence of pairs consisting of visited compartments and associated sojourn times. Based on these three possible descriptions, we introduced, for systems in equilibrium, (1) the entropy of the entire path, (2) the entropy rate per unit time, and (3) the entropy rate per jump. These three different entropies allow us to quantify how difficult it is to predict the path of particles entering a compartmental system, serving as a measure of system uncertainty/predictability. With these measures, it is thus possible to apply maximum entropy principles to compartmental systems in equilibrium in order to address problems of equifinality in model selection.

Although the path entropy concept developed here only applies to systems in equilibrium, it sets the foundation for future research on systems out of equilibrium. This can be achieved by building on the concept of the entropy rate per unit time as an instantaneous uncertainty and interpreting non-autonomous compartmental systems as inhomogeneous Markov chains. This would allow an extension of MaxCal, which was so far applied only to the inhomogeneous embedded jump chain, as by Ge et al. [53], to incorporate sojourn times in different compartments as well.

By introducing the concept of path entropy to compartmental systems, we made a first crucial step toward a quantification of information content in models that can be compared to other methods to obtain information content from observations. Using entropy measures based on Shannon information theory in both models and observations, we can potentially advance toward better methods for model selection applying the maximum entropy principle.

Author Contributions

H.M.: Writing—review and editing, Writing—original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. C.A.S.: Writing—review and editing, Validation, Funding acquisition, Supervision, Project administration, Conceptualization. All authors have read and agreed to the published version of the manuscript.

Funding

Open Access funding provided by the Max Planck Society. This research was funded by the German Research Foundation through its Emmy Noether Program (SI 1953/2–1) and the Swedish Research Council for Sustainable Development FORMAS, under grant 2018-01820.

Data Availability Statement

The Python (version 3.12.2) code to reproduce the figures used in the manuscript is provided in the static repository https://doi.org/10.5281/zenodo.17396760.

Acknowledgments

H.M. acknowledges the support of Giulia Vico at the Swedish University of Agricultural Sciences in Uppsala, Oskar Franklin at the International Institute for Applied Systems Analysis, and Julia Pongratz at the Ludwig-Maximilians-Universität Munich for their support during the long-standing processes of transforming pure thought material into a publishable manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. The Stationary Process Z

We prove that the entropy rate per jump of a single traveling particle is a proper entropy rate in the sense of the definition as given in Equation (5). Let

Z = {(Z_{n})}_{n \geq 1} = {({\tilde{Y}}_{n}, {\tilde{T}}_{n})}_{n \geq 1}

on the space

(\tilde{S} \times R_{+})

describe the infinite journey of a typical particle. It is the sequence of visited compartments with the associated sojourn times of a single particle through the system with immediate jumps back into the system when leaving it, defined by the transition probabilities

{\tilde{P}}_{i j} (t) = P ({\tilde{Y}}_{n + 1} = i, {\tilde{T}}_{n + 1} \leq t | {\tilde{Y}}_{n} = j)

given by

{\tilde{P}}_{i j} (t) = \{\begin{matrix} 0, & i = j, \\ B_{i j} λ_{j}^{- 1} (1 - e^{- λ_{i} t}), & i, j \leq d, i \neq j, \\ z_{j} λ_{j}^{- 1}, & i = d + 1, j \leq d, \\ β_{i} (1 - e^{- λ_{i} t}), & i \leq d, j = d + 1, \end{matrix}

(A1)

and initial (stationary) distribution

π_{j} (t) = \frac{1}{E [N]} \cdot \{\begin{matrix} E [N_{j}] (1 - e^{- λ_{j} t}), & j \leq d, \\ 1, & j = d + 1 . \end{matrix}

(A2)

Proposition A1.

The entropy rate per jump,

θ_{J} (P)

, equals the entropy rate of the stationary process Z.

Proof.

Step 1. We show that

Z = (\tilde{Y}, \tilde{T})

is stationary. To that end, we define

π_{j} : = {lim}_{t \to \infty} π_{j} (t)

, and we prove

P ({\tilde{Y}}_{2} = i, {\tilde{T}}_{2} \leq t) = π_{i} (t) = P ({\tilde{Y}}_{1} = i, {\tilde{T}}_{1} \leq t)

. Stationarity follows then by induction. Let

i = d + 1

. Then,

\begin{matrix} P ({\tilde{Y}}_{2} = i, {\tilde{T}}_{2} \leq t) & = \sum_{j = 1}^{d} P ({\tilde{Y}}_{2} = i, {\tilde{T}}_{2} \leq t | {\tilde{Y}}_{1} = j) P (\tilde{Y_{1}} = j) \\ = \sum_{j = 1}^{d} {\tilde{P}}_{d + 1, j} (t) π_{j} \\ = \sum_{j = 1}^{d} \frac{z_{j}}{λ_{j}} \frac{E [N_{j}]}{E [N]} . \end{matrix}

(A3)

By Equation (15),

r_{j} = z_{j} x_{j}^{*}

, and

∥ r ∥ = ∥ u ∥

, we get

P ({\tilde{Y}}_{2} = i, {\tilde{T}}_{2} \leq t) = \frac{1}{E [N]} \sum_{j = 1}^{d} \frac{z_{j}}{λ_{j}} \frac{λ_{j} x_{j}^{*}}{∥ u ∥} = \frac{1}{E [N]} \frac{z^{T} x^{*}}{∥ u ∥} = π_{d + 1} (t) .

(A4)

Now let

i \leq d

. Then,

\begin{matrix} P ({\tilde{Y}}_{2} = i, \tilde{T_{2}} \leq t) & = \sum_{j = 1, j \neq i}^{d} \frac{B_{i j}}{λ_{j}} (1 - e^{- λ_{i} t}) \frac{E [N_{j}]}{E [N]} + β_{i} (1 - e^{- λ_{i} t}) \frac{1}{E [N]} \\ = \frac{1}{E [N]} [\sum_{i = 1, i \neq j}^{d} \frac{B_{i j} x_{j}^{*}}{∥ u ∥} + β_{i}] (1 - e^{- λ_{i} t}) \\ = \frac{1}{E [N]} [\frac{{(B x^{*})}_{i}}{∥ u ∥} - \frac{B_{i i} x_{i}^{*}}{∥ u ∥} + β_{i}] (1 - e^{- λ_{i} t}) \\ = \frac{1}{E [N]} [- \frac{u_{i}}{∥ u ∥} + \frac{λ_{i} x_{i}^{*}}{∥ u ∥} + β_{i}] (1 - e^{λ_{i} t}) \\ = \frac{1}{E [N]} E [N_{i}] (1 - e^{- λ_{i} t}) \\ = π_{i} (t) . \end{matrix}

(A5)

Step 2. Since Z is stationary, by (Cover and Thomas [29] Theorem 4.2.1), its entropy rate given by

θ (Z) = lim_{n \to \infty} H (Z_{n + 1} | Z_{n}) = H (Z_{2} | Z_{1}),

(A6)

which computes to

\begin{matrix} θ (Z) & = H (({\tilde{Y}}_{2}, {\tilde{T}}_{2}) | ({\tilde{Y}}_{1}, {\tilde{T}}_{1})) = H (({\tilde{Y}}_{2}, {\tilde{T}}_{2}) | {\tilde{Y}}_{1}) = H ({\tilde{T}}_{2} | {\tilde{Y}}_{2}, {\tilde{Y}}_{1}) + H ({\tilde{Y}}_{2} | {\tilde{Y}}_{1}) \\ = H ({\tilde{T}}_{2} | {\tilde{Y}}_{2}) + H ({\tilde{Y}}_{2} | {\tilde{Y}}_{1}) . \end{matrix}

(A7)

By stationarity,

H ({\tilde{T}}_{2} | {\tilde{Y}}_{2}) = H ({\tilde{T}}_{1} | {\tilde{Y}}_{1})

. Consequently,

\begin{matrix} θ (Z) & = H ({\tilde{T}}_{1} | {\tilde{Y}}_{1}) + H ({\tilde{Y}}_{2} | {\tilde{Y}}_{1}) \\ = \sum_{j = 1}^{d + 1} π_{j} [H (\tilde{T_{1}} | {\tilde{Y}}_{1} = j) + H ({\tilde{Y}}_{2} | {\tilde{Y}}_{1} = j)], \\ = \frac{1}{E [N]} (\sum_{j = 1}^{d} E [N_{j}] [H ({\tilde{T}}_{1} | {\tilde{Y}}_{1} = j) + H ({\tilde{Y}}_{2} | {\tilde{Y}}_{1} = j)] + H ({\tilde{Y}}_{2} | {\tilde{Y}}_{1} = d + 1)), \end{matrix}

(A8)

which together with Equation (34) finishes the proof. □

Appendix B. Proofs of the MaxEnt Examples

Recall that the path entropy of a linear autonomous compartmental system

M = M (u, B)

is given by

\begin{matrix} H (P (M)) & = H (X) \\ = - \sum_{i = 1}^{d} β_{i} \log β_{i} + \sum_{j = 1}^{d} \frac{x_{j}^{*}}{∥ u ∥} [\sum_{i = 1, i \neq j}^{d} B_{i j} (1 - \log B_{i j}) + z_{j} (1 - \log z_{j})] . \end{matrix}

(A9)

In order to obtain maximum entropy models under simple constraints, we now adapt ideas of [54].

Proposition A2.

Consider the set

M_{1}

of compartmental systems in equilibrium given by Equation (7) with a predefined nonzero input vector

u

, a predefined mean transit time

E [T]

, and an unknown steady-state vector comprising nonzero components. The compartmental system

M_{1}^{*} = M (u, B^{*})

with

B^{*} = (\begin{matrix} - λ & 1 & \dots & 1 \\ 1 & - λ & 1 \dots & 1 \\ ⋮ & ⋱ & ⋮ \\ 1 & \dots & 1 & - λ \end{matrix}),

(A10)

where

λ = d - 1 + 1 / E [T]

, is the maximum entropy model in

M_{1}

.

Proof.

We can express the constraint

E [T] = ∥ x^{*} ∥ / ∥ u ∥

by

C_{1} = \frac{1}{∥ u ∥} \sum_{j = 1}^{d} x_{j}^{*} - E [T] = 0 .

(A11)

From the steady-state formula

x^{*} = - B^{- 1} u

, we obtain another set of d constraints, which we can describe by

\frac{1}{∥ u ∥} {(B x^{*})}_{i} = - β_{i}, i = 1, 2, \dots, d .

(A12)

We rewrite the left hand side as

\begin{matrix} \frac{1}{∥ u ∥} {(B x^{*})}_{i} & = \frac{1}{∥ u ∥} \sum_{j = 1}^{d} B_{i j} x_{j}^{*} = \frac{1}{∥ u ∥} (\sum_{j = 1, j \neq i}^{d} B_{i j} x_{j}^{*} + B_{i i} x_{i}^{*}) \\ = \frac{1}{∥ u ∥} \sum_{j = 1, j \neq i}^{d} B_{i j} x_{j}^{*} - \frac{1}{∥ u ∥} x_{i}^{*} (\sum_{k = 1, k \neq i}^{d} B_{k i} + z_{i}), \end{matrix}

(A13)

which leads to the constraints

C_{2, i} = \frac{1}{∥ u ∥} \sum_{j = 1, j \neq i}^{d} B_{i j} x_{j}^{*} - \frac{1}{∥ u ∥} x_{i}^{*} (\sum_{k = 1, k \neq i}^{d} B_{k i} + z_{i}) + β_{i} = 0, i \in S .

(A14)

The Lagrangian is now given by

L = H (X) + γ_{0} C_{1} + \sum_{i = 1}^{d} γ_{i} C_{2, i}

(A15)

and its partial derivatives with respect to

B_{i j} (i \neq j)

,

z_{j}

, and

x_{j}^{*}

by

\begin{matrix} ∥ u ∥ \frac{\partial}{\partial B_{i j}} L & = - x_{j}^{*} \log B_{i j} + γ_{i} x_{j}^{*} - γ_{j} x_{j}^{*}, \\ ∥ u ∥ \frac{\partial}{\partial z_{j}} L & = - x_{j}^{*} \log z_{j} - γ_{j} x_{j}^{*}, \end{matrix}

(A16)

and

\begin{matrix} ∥ u ∥ \frac{\partial}{\partial x_{j}^{*}} L & = \sum_{i = 1, i \neq j}^{d} B_{i j} (1 - \log B_{i j}) + z_{j} (1 - \log z_{j}) \\ + γ_{0} + \sum_{i = 1, i \neq j}^{d} γ_{i} B_{i j} - γ_{j} (\sum_{k = 1, k \neq j}^{d} B_{k j} + z_{j}), \end{matrix}

(A17)

respectively. Setting

\frac{\partial}{\partial B_{i j}} L = 0

gives

B_{i j} = e^{γ_{i} - γ_{j}}

, and setting

\frac{\partial}{\partial z_{j}} L = 0

gives

z_{j} = e^{- γ_{j}}

. We plug this into

\frac{\partial}{\partial x_{j}^{*}} L = 0

and get

\begin{matrix} 0 & = \sum_{i = 1, i \neq j}^{d} e^{γ_{i} - γ_{j}} [1 - (γ_{i} - γ_{j})] + e^{- γ_{j}} [1 - (- γ_{j})] \\ + γ_{0} + \sum_{i = 1, i \neq j}^{d} γ_{i} e^{γ_{i} - γ_{j}} - γ_{j} (\sum_{k = 1, k \neq j}^{d} e^{γ_{k} - γ_{j}} + e^{- γ_{j}}) \\ = \sum_{i \neq j, i \neq j} e^{γ_{i} - γ_{j}} + e^{- γ_{j}} + γ_{0} . \end{matrix}

(A18)

Subtracting

e^{- γ_{j}}

from both sides and multiplying with

e^{γ_{j}}

leads to

γ_{0} e^{γ_{j}} + \sum_{i = 1, i \neq j}^{d} e^{γ_{i}} = - 1, j = 1, 2, \dots, d .

(A19)

This is equivalent to the linear system

Y v = - 1

with

Y = (\begin{matrix} γ_{0} & 1 & \dots & 1 \\ 1 & γ_{0} & 1 \dots & 1 \\ ⋮ & ⋱ & ⋮ \\ 1 & \dots & 1 & γ_{0} \end{matrix}), v = (\begin{matrix} e^{γ_{1}} \\ e^{γ_{2}} \\ ⋮ \\ e^{γ_{d}} \end{matrix}), - 1 = (\begin{matrix} - 1 \\ - 1 \\ ⋮ \\ - 1 \end{matrix}) .

(A20)

The case

γ_{0} = 1

has no solution

v

, since

e^{γ_{i}} > 0 > - 1

. For

γ_{0} \neq 1

, the matrix

Y

has a nonzero determinant, which makes the system uniquely solvable. For symmetry reasons,

γ_{i} = γ_{j} = : γ

for all

i, j = 1, 2, \dots, d

. Consequently, for

i \neq j

, we get

B_{i j} = 1

, and by summing Equation (A14) over

i \in S

,

\begin{matrix} 0 & = ∥ u ∥ \sum_{i = 1}^{d} C_{2, i} = \sum_{i = 1}^{d} \sum_{j = 1, j \neq i}^{d} B_{i j} x_{j}^{*} - \sum_{i = 1}^{d} x_{i}^{*} (\sum_{k = 1, k \neq i}^{d} B_{k i} + z_{i}) - ∥ u ∥ \\ = - \sum_{i = 1}^{d} x_{i}^{*} z_{i} - ∥ u ∥, \end{matrix}

(A21)

which can also be expressed by

z^{T} x^{*} = ∥ u ∥

. We simply plug in

z_{i} = e^{- γ}

and get

e^{- γ} ∥ x^{*} ∥ = ∥ u ∥

, which means

z_{i} = 1 / E [T]

. Consequently,

B^{*} = (\begin{matrix} - λ & 1 & \dots & 1 \\ 1 & - λ & 1 \dots & 1 \\ ⋮ & ⋱ & ⋮ \\ 1 & \dots & 1 & - λ \end{matrix})

(A22)

for

λ = d - 1 + 1 / E [T]

. Since uniqueness of this solution follows from its construction, we remain with showing maximality. To this end, we split the entropy into to three parts, i.e.,

H (X) = H_{1} + H_{2} + H_{3}

with

\begin{matrix} H_{1} & = - \sum_{i = 1}^{d} β_{i} \log β_{i}, \\ H_{2} & = \sum_{j = 1}^{d} \frac{x_{j}^{*}}{∥ u ∥} z_{j} (1 - \log z_{j}), and \\ H_{3} & = \sum_{j = 1}^{d} \frac{x_{j}^{*}}{∥ u ∥} \sum_{i = 1, i \neq j}^{d} B_{i j} (1 - \log B_{i j}) . \end{matrix}

(A23)

The term

H_{1}

is independent of

B_{i j}

and

z_{j}

for all

i, j \in S

and

i \neq j

and can thus be ignored. We denote by E the pool from which the particle exits from the system. Then, we can use ([13] Sect. 5.3)

P (E = j) = \frac{z_{j} x_{j}^{*}}{∥ u ∥}

(A24)

to rewrite the second term as

H_{2} = \sum_{j = 1}^{d} P (E = j) (1 - \log z_{j}) = \sum_{j = 1}^{d} H (T_{E} | E = j) P (E = j) = H (T_{E} | E),

(A25)

where

T_{E}

denotes the exponentially distributed sojourn time in E right before absorption. We see that

H_{2}

becomes maximal if the knowledge of E contains no information about

T_{E}

. Hence,

z_{j} = z_{i}

for

i, j \in S

. Since we need to ensure the systems’ constraint on

E [T]

, we get

z_{j} = 1 / E [T]

for all

j \in S

.

In order to see that

B_{i j} = 1

(

i \neq j

) leads to maximal entropy, we first note that

H_{3} = \sum_{j = 1}^{d} \frac{x_{j}^{*}}{∥ u ∥} \sum_{i = 1, i \neq j}^{d} 1 \cdot (1 - \log 1) = (d - 1) \sum_{j = 1}^{d} E [O_{j}] = (d - 1) E [T]

(A26)

by Equation (33). We now disturb

B_{k l}

for fixed

k, l \in S

with

k \neq l

by a sufficiently tiny

ε

, which may be positive or negative. We define

B_{k l} (ε) : = B_{k l} + ε

, and a change from

λ_{j}

to

λ_{j} (ε) : = λ_{j} + ε > 0

ensures

z_{j} (ε) = z_{j}

, implying that the system’s mean transit time remains unchanged, i.e.,

E [T (ε)] = E [T]

. The

ε

-disturbed

H_{3}

is given by

\begin{matrix} H_{3} (ε) & = \sum_{j = 1}^{d} \frac{x_{j}^{*} (ε)}{∥ u ∥} \sum_{i = 1, i \neq j}^{d} 1 \cdot (1 - \log 1) (1 - 1_{{i = k, j = l}}) \\ + \frac{x_{l}^{*} (ε)}{∥ u ∥} (1 + ε) [1 - \log (1 + ε)] \\ = \sum_{j = 1}^{d} \frac{x_{j}^{*} (ε)}{∥ u ∥} \sum_{i = 1, i \neq j}^{d} (1 - 1_{{i = k, j = l}}) + \frac{x_{l}^{*} (ε)}{∥ u ∥} (1 - δ) \end{matrix}

(A27)

for some

δ > 0

, since the map

x \mapsto x (1 - \log x)

has its global maximum at

x = 1

. Consequently,

\begin{matrix} H_{3} (ε) & = [\sum_{j = 1}^{d} \frac{x_{j}^{*} (ε)}{∥ u ∥} \sum_{i = 1, i \neq j}^{d} 1] - δ \frac{x_{l}^{*} (ε)}{∥ u ∥} & = (d - 1) \sum_{j = 1}^{d} E [O_{j} (ε)] - δ \frac{x_{l}^{*} (ε)}{∥ u ∥} \\ = (d - 1) E [T (ε)] - δ \frac{x_{l}^{*} (ε)}{∥ u ∥} & = (d - 1) E [T] - δ \frac{x_{l}^{*} (ε)}{∥ u ∥} \\ < H_{3} . \end{matrix}

(A28)

Hence, disturbing

B_{i j}

away from 1 reduces the entropy of the system, and the proof is complete. □

Proposition A3.

Consider the set

M_{2}

of compartmental systems in equilibrium given by Equation (7) with a predefined nonzero input vector

u

and a predefined positive steady-state vector

x^{*}

. The compartmental system

M_{2}^{*} = M (u, B^{*})

with

B^{*} = {(B_{i j})}_{i, j \in S}

given by

B_{i j} = \{\begin{matrix} \sqrt{\frac{x_{i}^{*}}{x_{j}^{*}}}, & i \neq j, \\ - \sum_{k = 1, k \neq j}^{d} \sqrt{\frac{x_{k}^{*}}{x_{j}^{*}}} - \frac{1}{\sqrt{x_{j}^{*}}}, & i = j, \end{matrix}

(A29)

is the maximum entropy model in

M_{2}

.

Proof.

The mean transit time

E [T] = ∥ x^{*} ∥ / ∥ u ∥

of the system is fixed. Hence, the Lagrangian L is the same as in Equation (A15) and setting

\partial L / \partial B_{i j} = 0

leads to

- \log B_{i j} + γ_{i} - γ_{j} = 0, i \neq j .

(A30)

An interchange of the indices and summing the two equations gives

\log B_{i j} + \log B_{j i} = 0 .

(A31)

Hence,

B_{i j} B_{j i} = 1

. A good guess gives

B_{i j}^{2} = x_{i}^{*} / x_{j}^{*}

and

γ_{j} = \frac{1}{2} \log x_{j}^{*}

. From

\frac{\partial}{\partial z_{j}} L = 0

, we get

- \log z_{j} - γ_{j} = 0, j \in S,

(A32)

and in turn

z_{j} = {(x_{j}^{*})}^{- 1 / 2}

. The maximality and uniqueness of this solution follow from the strict concavity of

H (X)

as a function of

B_{i j}

and

z_{j}

for fixed

x^{*}

. We can see this strict concavity by

\frac{\partial^{2}}{\partial B_{i j}^{2}} H (X) = \frac{\partial}{\partial B_{i j}} \frac{- x_{j}^{*}}{∥ u ∥} \log B_{i j} = - \frac{x_{j}^{*}}{∥ u ∥ B_{i j}} < 0

(A33)

and

\frac{\partial^{2}}{\partial z_{j}^{2}} H (X) = \frac{\partial}{\partial z_{j}} \frac{- x_{j}^{*}}{∥ u ∥} \log z_{j} = - \frac{x_{j}^{*}}{∥ u ∥ z_{i}} < 0 .

(A34)

□

References

Burnham, K.P.; Anderson, D.R. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach; Springer: New York, NY, USA; Berlin/Heidelberg, Germany, 2002. [Google Scholar]
Höge, M.; Wöhling, T.; Nowak, W. A Primer for Model Selection: The Decisive Role of Model Complexity. Water Resour. Res. 2018, 54, 1688–1715. [Google Scholar] [CrossRef]
Golan, A.; Harte, J. Information theory: A foundation for complexity science. Proc. Natl. Acad. Sci. USA 2022, 119, e2119089119. [Google Scholar] [CrossRef] [PubMed]
Jost, J. Dynamical Systems: Examples of Complex Behaviour; Springer: Berlin, Germany; New York, NY, USA, 2005. [Google Scholar]
Fan, J.; Meng, J.; Ludescher, J.; Chen, X.; Ashkenazy, Y.; Kurths, J.; Havlin, S.; Schellnhuber, H.J. Statistical physics approaches to the complex Earth system. Phys. Rep. 2021, 896, 1–84. [Google Scholar] [CrossRef] [PubMed]
Anderson, D.H. Compartmental Modeling and Tracer Kinetics; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1983; Volume 50. [Google Scholar]
Walter, G.G.; Contreras, M. Compartmental Modeling with Networks; Birkhäuser: Basel, Switzerland, 1999. [Google Scholar]
Haddad, W.M.; Chellaboina, V.; Hui, Q. Nonnegative and Compartmental Dynamical Systems; Princeton University Press: Princeton, NJ, USA, 2010. [Google Scholar]
Eriksson, E. Compartment Models and Reservoir Theory. Annu. Rev. Ecol. Syst. 1971, 2, 67–84. [Google Scholar] [CrossRef]
Bolin, B.; Rodhe, H. A note on the concepts of age distribution and transit time in natural reservoirs. Tellus 1973, 25, 58–62. [Google Scholar] [CrossRef]
Rasmussen, M.; Hastings, A.; Smith, M.J.; Agusto, F.B.; Chen-Charpentier, B.M.; Hoffman, F.M.; Jiang, J.; Todd-Brown, K.E.O.; Wang, Y.; Wang, Y.P.; et al. Transit times and mean ages for nonautonomous and autonomous compartmental systems. J. Math. Biol. 2016, 73, 1379–1398. [Google Scholar] [CrossRef]
Sierra, C.A.; Müller, M.; Metzler, H.; Manzoni, S.; Trumbore, S.E. The muddle of ages, turnover, transit, and residence times in the carbon cycle. Glob. Chang. Biol. 2017, 23, 1763–1773. [Google Scholar] [CrossRef]
Metzler, H.; Sierra, C.A. Linear Autonomous Compartmental Models as Continuous-Time Markov Chains: Transit-Time and Age Distributions. Math. Geosci. 2018, 50, 1–34. [Google Scholar] [CrossRef]
Metzler, H.; Müller, M.; Sierra, C.A. Transit-time and age distributions for nonlinear time-dependent compartmental systems. Proc. Natl. Acad. Sci. USA 2018, 115, 1150–1155. [Google Scholar] [CrossRef]
Jaynes, E.T. Information Theory and Statistical Mechanics. Phys. Rev. 1957, 106, 620–630. [Google Scholar] [CrossRef]
Jaynes, E.T. Information Theory and Statistical Mechanics. II. Phys. Rev. 1957, 108, 171–190. [Google Scholar] [CrossRef]
Pressé, S.; Ghosh, K.; Lee, J.; Dill, K.A. Principles of maximum entropy and maximum caliber in statistical physics. Rev. Mod. Phys. 2013, 85, 1115. [Google Scholar] [CrossRef]
Pesin, Y.B. Characteristic Lyapunov exponents and smooth ergodic theory. Uspekhi Mat. Nauk 1977, 32, 55–112. [Google Scholar]
Dehmer, M.; Mowshowitz, A. A history of graph entropy measures. Inf. Sci. 2011, 181, 57–78. [Google Scholar] [CrossRef]
Trucco, E. A note on the information content of graphs. Bull. Math. Biol. 1956, 18, 129–135. [Google Scholar] [CrossRef]
Morzy, M.; Kajdanowicz, T.; Kazienko, P. On Measuring the Complexity of Networks: Kolmogorov Complexity versus Entropy. Complexity 2017, 2017, 3250301. [Google Scholar] [CrossRef]
Bonchev, D.; Buck, G.A. Quantitative Measures of Network Csomplexity. In Complexity in Chemistry, Biology, and Ecology; Springer: Berlin/Heidelberg, Germany, 2005; pp. 191–235. [Google Scholar]
Shannon, C.E.; Weaver, W. The Mathematical Theory of Communication; The University of Illinois Press: Urbana, IL, USA, 1949. [Google Scholar]
Jaynes, E.T. Macroscopic prediction. In Complex Systems—Operational Approaches in Neurobiology, Physics, and Computers; Springer: Berlin/Heidelberg, Germany, 1985; pp. 254–269. [Google Scholar]
Roach, T.N.F. Use and Abuse of Entropy in Biology: A Case for Caliber. Entropy 2020, 22, 1335. [Google Scholar] [CrossRef] [PubMed]
Aoki, I. Entropy laws in ecological networks at steady state. Ecol. Model. 1988, 42, 289–303. [Google Scholar] [CrossRef]
Haddad, W.M. A Unification between Dynamical System Theory and Thermodynamics Involving an Energy, Mass, and Entropy State Space Formalism. Entropy 2013, 15, 1821–1846. [Google Scholar] [CrossRef]
Haddad, W.M. A Dynamical Systems Theory of Thermodynamics; Princeton University Press: Princeton, NJ, USA, 2019. [Google Scholar]
Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006. [Google Scholar]
Bad Dumitrescu, M.E. Some informational properties of Markov pure-jump processes. Časopis Pro Pěstování Mat. 1988, 113, 429–434. [Google Scholar]
Gaspard, P.; Wang, X.J. Noise, chaos, and (ε, τ)-entropy per unit time. Phys. Rep. 1993, 235, 291–343. [Google Scholar] [CrossRef]
Jacquez, J.A.; Simon, C.P. Qualitative theory of compartmental systems. Siam Rev. 1993, 35, 43–79. [Google Scholar] [CrossRef]
Norris, J.R. Markov Chains; Cambridge University Press: Cambridge, UK, 1997. [Google Scholar]
Neuts, M.F. Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach; The Johns Hopkins University Press: Baltimore, MD, USA, 1981. [Google Scholar]
Albert, A. Estimating the Infinitesimal Generator of a Continuous Time, Finite State Markov Process. Ann. Math. Stat. 1962, 33, 727–753. [Google Scholar] [CrossRef]
Doob, J.L. Stochastic Processes; Wiley: New York, NY, USA, 1953; Volume 7. [Google Scholar]
Walter, G.G. Size identifiability of compartmental models. Math. Biosci. 1986, 81, 165–176. [Google Scholar] [CrossRef]
Bellman, R.; Åström, K.J. On structural identifiability. Math. Biosci. 1970, 7, 329–339. [Google Scholar] [CrossRef]
Emanuel, W.R.; Killough, G.G.; Olson, J.S. Modelling the Circulation of Carbon in the World’s Terrestrial Ecosystems. In Carbon Cycle Modelling; SCOPE 16; John Wiley and Sons: Hoboken, NJ, USA, 1981; pp. 335–353. [Google Scholar]
Sierra, C.A.; Quetin, G.R.; Metzler, H.; Müller, M. A decrease in the age of respired carbon from the terrestrial biosphere and increase in the asymmetry of its distribution. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2023. [Google Scholar] [CrossRef]
Thompson, M.V.; Randerson, J.T. Impulse response functions of terrestrial carbon cycle models: Method and application. Glob. Chang. Biol. 1999, 5, 371–394. [Google Scholar] [CrossRef]
Wang, Y.P.; Chen, B.C.; Wieder, W.R.; Leite, M.; Medlyn, B.E.; Rasmussen, M.; Smith, M.J.; Agusto, F.B.; Hoffman, F.; Luo, Y.Q. Oscillatory behavior of two nonlinear microbial models of soil carbon decomposition. Biogeosciences 2014, 11, 1817–1831. [Google Scholar] [CrossRef]
Ågren, G.I. Investigating soil carbon diversity by combining the MAXimum ENTropy principle with the Q model. Biogeochemistry 2021, 153, 85–94. [Google Scholar] [CrossRef]
García-Rodríguez, L.d.C.; Prado-Olivarez, J.; Guzmán-Cruz, R.; Rodríguez-Licea, M.A.; Barranco-Gutiérrez, A.I.; Perez-Pinal, F.J.; Espinosa-Calderon, A. Mathematical Modeling to Estimate Photosynthesis: A State of the Art. Appl. Sci. 2022, 12, 5537. [Google Scholar] [CrossRef]
Manzoni, S.; Porporato, A. Soil carbon and nitrogen mineralization: Theory and models across scales. Soil Biol. Biochem. 2009, 41, 1355–1379. [Google Scholar] [CrossRef]
Zaehle, S.; Medlyn, B.E.; De Kauwe, M.G.; Walker, A.P.; Dietze, M.C.; Hickler, T.; Luo, Y.; Wang, Y.P.; El-Masri, B.; Thornton, P.; et al. Evaluation of 11 terrestrial carbon–nitrogen cycle models against observations from two temperate Free-Air CO₂ Enrichment studies. New Phytol. 2014, 202, 803–822. [Google Scholar] [CrossRef]
Farquhar, G.D.; von Caemmerer, S.v.; Berry, J.A. A biochemical model of photosynthetic CO₂ assimilation in leaves of C₃ species. Planta 1980, 149, 78–90. [Google Scholar] [CrossRef]
Friedlingstein, P.; Cox, P.; Betts, R.; Bopp, L.; von Bloh, W.; Brovkin, V.; Cadule, P.; Doney, S.; Eby, M.; Fung, I.; et al. Climate–carbon cycle feedback analysis: Results from the C4MIP model intercomparison. J. Clim. 2006, 19, 3337–3353. [Google Scholar] [CrossRef]
Friedlingstein, P.; Meinshausen, M.; Arora, V.K.; Jones, C.D.; Anav, A.; Liddicoat, S.K.; Knutti, R. Uncertainties in CMIP5 climate projections due to carbon cycle feedbacks. J. Clim. 2014, 27, 511–526. [Google Scholar] [CrossRef]
Todd-Brown, K.E.; Randerson, J.T.; Post, W.M.; Hoffman, F.M.; Tarnocai, C.; Schuur, E.A.; Allison, S.D. Causes of variation in soil carbon simulations from CMIP5 Earth system models and comparison with observations. Biogeosciences 2013, 10, 1717–1736. [Google Scholar] [CrossRef]
Ceballos-Núñez, V.; Müller, M.; Sierra, C.A. Towards better representations of carbon allocation in vegetation: A conceptual framework and mathematical tool. Theor. Ecol. 2020, 13, 317–332. [Google Scholar] [CrossRef]
Sierra, C.A.; Harmon, M.E.; Perakis, S.S. Decomposition of heterogeneous organic matter and its long-term stabilization in soils. Ecol. Monogr. 2011, 81, 619–634. [Google Scholar] [CrossRef]
Ge, H.; Pressé, S.; Ghosh, K.; Dill, K.A. Markov processes follow from the principle of maximum caliber. J. Chem. Phys. 2012, 136, 064108. [Google Scholar] [CrossRef]
Girardin, V. Entropy Maximization for Markov and Semi-Markov Processes. Methodol. Comput. Appl. Probab. 2004, 6, 109–127. [Google Scholar] [CrossRef]

Figure 1. Shannon entropy of a Bernoulli distribution (a), differential entropy of an exponential distribution (b), and entropy rate of a Poisson process (c). Vertical gray lines indicate the parameter values leading to the highest entropy.

Figure 2. Schematic of the linear autonomous global carbon-cycle model in steady state, introduced by Emanuel et al. [39].

Figure 3. (a) Equilibrium carbon stocks and (b–f) entropy related quantities of the global carbon-cycle model, introduced by Emanuel et al. [39], in dependence on the environmental rate coefficient

ξ

. Vertical gray lines show

ξ = 1

, the original speed of the model.

Figure 3. (a) Equilibrium carbon stocks and (b–f) entropy related quantities of the global carbon-cycle model, introduced by Emanuel et al. [39], in dependence on the environmental rate coefficient

ξ

. Vertical gray lines show

ξ = 1

, the original speed of the model.

Figure 4. Scheme of the nonlinear autonomous carbon-cycle model, introduced by Wang et al. [42], with two compartments: substrate organic carbon (

C_{s}

) and microbial biomass (

C_{b}

).

Figure 4. Scheme of the nonlinear autonomous carbon-cycle model, introduced by Wang et al. [42], with two compartments: substrate organic carbon (

C_{s}

) and microbial biomass (

C_{b}

).

Figure 5. (a) Equilibrium carbon stocks and (b–f) entropy related quantities of the global carbon-cycle model, introduced by Wang et al. [42], in dependence on the microbial carbon use efficiency

ε

. The left vertical gray lines show

ε = 0.39

, the original carbon use efficiency of the model, the right ones at

ε = 0.926

show the carbon use efficiency value with the maximum entropy rate of the Poisson process associated with

C_{s}

.

Figure 5. (a) Equilibrium carbon stocks and (b–f) entropy related quantities of the global carbon-cycle model, introduced by Wang et al. [42], in dependence on the microbial carbon use efficiency

ε

. The left vertical gray lines show

ε = 0.39

, the original carbon use efficiency of the model, the right ones at

ε = 0.926

show the carbon use efficiency value with the maximum entropy rate of the Poisson process associated with

C_{s}

.

Figure 6. Local maximizations of

θ

over a grid on a subspace of the parameter space. For better visibility, we randomly chose 1000 grid points for the plot. Blue dots show local maxima found during the global maximization procedure starting on the grid. The red dot is associated with the global maximum candidate

M_{\max}

. (a) Entropy rate per unit time versus

l_{1}

-distance of local maxima

p_{i}

parameters from the global maximum candidate parameters

p_{\max}

. (b) Entropy rate per unit time versus mean transit time. (c) Paths of entropy rate per unit time during the local maximizations on the grid. (d) Path entropy versus mean transit time.

Figure 6. Local maximizations of

θ

over a grid on a subspace of the parameter space. For better visibility, we randomly chose 1000 grid points for the plot. Blue dots show local maxima found during the global maximization procedure starting on the grid. The red dot is associated with the global maximum candidate

M_{\max}

. (a) Entropy rate per unit time versus

l_{1}

-distance of local maxima

p_{i}

parameters from the global maximum candidate parameters

p_{\max}

. (b) Entropy rate per unit time versus mean transit time. (c) Paths of entropy rate per unit time during the local maximizations on the grid. (d) Path entropy versus mean transit time.

Table 1. Overview of different entropy measures of simple models with different structures. The columns from left to right represent a schematic representation of the model structure, its mathematical representation, entropy rate per jump

θ_{J}

, mean number of jumps

E [N]

, entropy rate per unit time

θ

, mean transit time

E [T]

, and path entropy

H (P)

. Underlined numbers are the highest values per column.

Table 1. Overview of different entropy measures of simple models with different structures. The columns from left to right represent a schematic representation of the model structure, its mathematical representation, entropy rate per jump

θ_{J}

, mean number of jumps

E [N]

, entropy rate per unit time

θ

, mean transit time

E [T]

, and path entropy

H (P)

. Underlined numbers are the highest values per column.

$\frac{d}{d t} x (t)$	$θ_{J}$	$E [N]$	$θ$	$E [T]$	$H (P)$
$- λ x + 1$	$0.5 (1 - \log λ)$	2.00	$λ (1 - \log λ)$	$1 / λ$	$1 - \log λ$
$(\begin{matrix} - 1 & 0 \\ 1 & - 1 \end{matrix}) x + (\begin{matrix} 1 \\ 0 \end{matrix})$	0.67	3.00	1.00	2.00	2.00
$(\begin{matrix} - 1 & 0 \\ 0 & - 1 \end{matrix}) x + (\begin{matrix} 1 \\ 1 \end{matrix})$	0.85	2.00	1.69	1.00	1.69
$(\begin{matrix} - 1 & 1 / 2 \\ 1 & - 1 \end{matrix}) x + (\begin{matrix} 1 \\ 0 \end{matrix})$	1.08	5.00	1.35	4.00	5.39
$(\begin{matrix} - 1 & 1 / 2 \\ 1 / 2 & - 1 \end{matrix}) x + (\begin{matrix} 1 \\ 1 \end{matrix})$	1.36	3.00	2.04	2.00	4.08
$(\begin{matrix} - 1 & 0 & 0 \\ 1 & - 1 & 0 \\ 0 & 1 & - 1 \end{matrix}) x + (\begin{matrix} 1 \\ 0 \\ 0 \end{matrix})$	0.75	4.00	1.00	3.00	3.00
$(\begin{matrix} - 1 & 0 & 0 \\ 0 & - 1 & 0 \\ 0 & 0 & - 1 \end{matrix}) x + (\begin{matrix} 1 \\ 1 \\ 1 \end{matrix})$	1.05	2.00	2.10	1.00	2.10

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Metzler, H.; Sierra, C.A. Information Content and Maximum Entropy of Compartmental Systems in Equilibrium. Entropy 2025, 27, 1085. https://doi.org/10.3390/e27101085

AMA Style

Metzler H, Sierra CA. Information Content and Maximum Entropy of Compartmental Systems in Equilibrium. Entropy. 2025; 27(10):1085. https://doi.org/10.3390/e27101085

Chicago/Turabian Style

Metzler, Holger, and Carlos A. Sierra. 2025. "Information Content and Maximum Entropy of Compartmental Systems in Equilibrium" Entropy 27, no. 10: 1085. https://doi.org/10.3390/e27101085

APA Style

Metzler, H., & Sierra, C. A. (2025). Information Content and Maximum Entropy of Compartmental Systems in Equilibrium. Entropy, 27(10), 1085. https://doi.org/10.3390/e27101085

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Information Content and Maximum Entropy of Compartmental Systems in Equilibrium

Abstract

1. Introduction

2. Mathematical Background: Information Entropy and Compartmental Systems as Markov Chains

2.1. Short Summary of Shannon Information Entropy

2.2. Compartmental Systems in Equilibrium

2.3. The One-Particle Perspective

2.4. The Path of a Single Particle

3. Entropy Measures, MaxEnt, and Structural Model Identification

3.1. Path Entropy, Entropy Rate per Unit Time, and Entropy Rate per Jump

3.2. From Microscopic Particle Entropy to Macroscopic System Entropy

3.3. The Maximum Entropy Principle (MaxEnt)

3.4. Structural Model Identification Assisted by MaxEnt

4. Application to Particular Systems

4.1. Simple Examples

4.2. A Linear Autonomous Global Carbon-Cycle Model

4.3. A Nonlinear Autonomous Soil Organic Matter Decomposition Model

4.4. Model Identification via Maxent

5. Discussion

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A. The Stationary Process Z

Appendix B. Proofs of the MaxEnt Examples

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI