Empirical Nonnegative Finite-Resolution MAR-PID for Continuous Variables

Telcs, András

doi:10.3390/e28060641


by

András Telcs

HUN-REN Wigner Research Centre for Physics, Konkoly Thege Miklós út 29-33, 1121 Budapest, Hungary

Entropy 2026, 28(6), 641; https://doi.org/10.3390/e28060641

Submission received: 2 May 2026 / Revised: 2 June 2026 / Accepted: 4 June 2026 / Published: 6 June 2026

(This article belongs to the Section Information Theory, Probability and Statistics)


Abstract

We develop a finite-resolution empirical framework for applying nonnegative Mages–Anastasiadi–Rohner partial information decomposition (MAR-PID) to continuous and non-binary discrete variables. The variables are represented by recursive quantile binarization. This provides a balanced binary-tree representation at each finite depth. MAR-PID is then applied to binary target components, and the resulting atoms are aggregated back to the original target and source variables. The construction gives nonnegative target-relative information summaries for observed variables up to the chosen empirical resolution. The pipeline consists of conditional channel estimation, bit-level MAR-PID computation, projection of source atoms to the original variables, and aggregation of the resulting information contributions. The obtained quantities are empirical estimates of finite-resolution population quantities. XOR and mixed redundancy–synergy examples show how the representation separates informational mechanisms, which signed interaction-information summaries can conflate. We focus here on the finite-resolution MAR-PID construction and its information-level quantities. Downstream summaries, such as resolution-normalized PID-dimension descriptors and thresholded support-based degrees of freedom, are indicated but left for separate work.

Keywords:

partial information decomposition; MAR-PID; Blackwell order; recursive quantile binarization; continuous variables; zonogons; empirical estimation

1. Introduction

Understanding information flow in multivariate systems is a central problem in the analysis of complex data. In applications such as neuroscience, climate science, economics, and networked dynamical systems, observed signals arise from several interacting components. Pairwise measures are often insufficient in such settings, because information about a target may be redundant across sources, unique to one source, or available only through joint observation.

Classical multivariate information measures, such as interaction information, are useful algebraic summaries but can take negative values. This is not an error: the sign reflects the alternating-sum structure of the definition and the fact that redundancy and synergy contribute with opposite signs. Nevertheless, signed net quantities can be difficult to interpret when the aim is to identify distinct informational mechanisms.

Partial information decomposition (PID) addresses this issue by decomposing information about a chosen target into redundant, unique, and synergistic components. Early PID approaches introduced important conceptual distinctions but did not fully resolve the problem of obtaining nonnegative atoms in general multivariate settings. Mages, Anastasiadi, and Rohner [1] (see recent references in that paper) introduced a full-lattice PID based on the Blackwell order and the geometry of binary-input channels. We refer to this construction as MAR-PID. Its central advantage is that the resulting partial information atoms are nonnegative for a broad class of information functionals.

Partial information decomposition was introduced to separate redundant, unique, and synergistic contributions to the information that several sources provide about a target [2,3]. Since then, a number of alternative approaches to redundancy, synergy, and multivariate dependence have been developed, including optimization-based, geometric, pointwise, and dual-decomposition formulations [4]. Related work has also emphasized the role of synergy in information modification and dynamical systems, as well as the limitations of classical Shannon-type summaries for describing higher-order dependence [5] (see also the references in the cited paper). The present paper does not aim to compare these PID measures. We use the Blackwell-order construction of Mages, Anastasiadi, and Rohner because it provides nonnegative partial information atoms for a broad class of information functionals and is naturally formulated in terms of binary-input channel geometry [1,6]. The binary-input channel geometry, Blackwell order, zonogon representation, cumulative loss construction, and nonnegativity theorem are inherited from MAR-PID. The new contribution of the present paper is the finite-resolution empirical layer needed when the variables of interest are continuous or high-cardinality rather than already given in a finite binary form. The comparison made here is conceptual rather than benchmark-based: different PID proposals define different decompositions, and a numerical comparison would require separate choices of estimators, test distributions, and error criteria.

The basic idea is to represent variables by binary coordinates obtained from recursive quantile, or median, partitioning. For a continuous target variable, the finite-resolution representation is a vector of binary target components. MAR-PID is applied to each binary target component, and the resulting atomic information quantities are aggregated over target bits. For sources that are also represented by binary coordinates, the bit-level atoms are pushed forward to atoms indexed by the original source variables. Thus the binary representation acts as an intermediate computational device, while the final summaries are reported at the level of the original variables.

The resulting object is a finite-resolution MAR-PID summary induced by the chosen binary representation, not a representation-independent PID of the underlying continuous variable. This distinction is important: the paper extends the empirical use of MAR-PID to finite-resolution representations of continuous or high-cardinality observations, but it does not modify the MAR-PID nonnegativity theorem itself.

This construction should also be distinguished from set-based structural degrees of freedom. If a discretized dynamical system is known explicitly, and if one knows the set of present-time variables influencing each next-time component, then set cardinalities and set-theoretic inclusion-exclusion methods provide direct structural descriptors. In observational time-series analysis, however, the update rule and its influence sets are usually not available. The present paper does not attempt to define degrees of freedom from such observations. Instead, it provides empirical nonnegative MAR-PID atoms that can later serve as the basis for support-level or scale-based summaries.

Contributions

The main contributions of this paper are as follows:

A finite-resolution binary representation of continuous or multilevel variables based on recursive quantile binarization;
A conditional dyadic property showing that, at every fixed finite-tree level, the recursive quantile code produces balanced and conditionally independent binary coordinates under non-atomic conditional laws;
A bit-level empirical procedure that applies MAR-PID to binary target components obtained from the finite-resolution representation;
An aggregation scheme over resolved target bits, producing finite-resolution target-bit summaries for the original target variable;
A source-side pushforward that maps bit-level MAR-PID atoms to summaries indexed by atoms over the original source variables;
Illustrative examples showing how nonnegative MAR-PID atoms represent XOR-type synergy and mixed redundancy–synergy mechanisms.

Although this paper focuses on the empirical construction and aggregation of MAR-PID atoms, the procedure is not tied to one side of the redundancy–synergy distinction. The finite-resolution binarization, target-bit aggregation, and source-side pushforward apply to the resulting atoms whether they are interpreted as redundant, unique, or synergistic contributions.

The present paper deliberately remains at the information decomposition level. PID-based dimensions and support-based degrees-of-freedom summaries require additional choices concerning scale normalization, activity thresholds, and source/target-bit aggregation. These questions are natural continuations of the present construction and are left to a separate treatment.

The structure of the paper reflects this separation. Section 2 recalls only the MAR-PID background needed for the construction. Section 3 defines the finite-resolution binary representation and proves the conditional dyadic property. Section 4 defines the bit-level MAR-PID aggregation and the pushforward to original-variable atoms. Section 5 presents the empirical plug-in computation. Section 6 presents illustrative examples, and Section 7 discusses interpretation, limitations, and subsequent support- or scale-based summaries.

2. MAR-PID Background

We briefly recall the objects needed for the empirical construction. Let $V = \{V_{1}, \dots, V_{n}\}$ be a finite collection of observed variables, and let $T$ denote the target. A source is a nonempty subset $S \subseteq V$. We write $\mathcal{S} = \mathcal{P}(V) \setminus \{\emptyset\}$ for the set of all sources. An atom is an antichain in $\mathcal{S}$; that is, a collection $α \subseteq \mathcal{S}$ such that no source in $α$ is strictly contained in another. The set of atoms is denoted by $A(V)$.
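
For small $n$, the atom set can be listed by brute force. The following Python sketch is illustrative only: the function names are hypothetical, variables are represented by arbitrary labels, and the enumeration returns nonempty antichains (whether the empty antichain is counted is a convention not fixed by this sketch).

```python
from itertools import combinations


def nonempty_subsets(variables):
    """All nonempty subsets of `variables` as frozensets (the sources)."""
    items = list(variables)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def is_antichain(collection):
    """True if no member is strictly contained in another member."""
    # For frozensets, `a < b` tests proper inclusion.
    return not any(a < b for a in collection for b in collection)


def atoms(variables):
    """All nonempty antichains of sources (brute force, small n only)."""
    sources = nonempty_subsets(variables)
    found = []
    for r in range(1, len(sources) + 1):
        for combo in combinations(sources, r):
            if is_antichain(combo):
                found.append(frozenset(combo))
    return found


if __name__ == "__main__":
    for names in (["V1", "V2"], ["V1", "V2", "V3"]):
        print(len(names), "variables:", len(atoms(names)), "atoms")
```

For two variables this gives four atoms and for three variables eighteen. The count grows very quickly with $n$, so exhaustive enumeration is only practical for small source universes.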

The MAR-PID construction assigns to each atom $α \in A(V)$ a partial information contribution about the target. These atoms are obtained by applying Möbius inversion to a cumulative loss functional on the synergy/loss lattice. We write $I_{\cup, f}(α; T)$ for the cumulative loss and $Δ I_{\cup, f}(α; T)$ for its Möbius inverse. When the functional $f$ is fixed and no confusion is possible, we suppress $f$ in the notation.

The construction is based on comparing channels in the Blackwell order. For a source $S$ and a binary target $T \in \{0, 1\}$, the relevant channel is

$$κ_{S}(s \mid t) = P(S = s \mid T = t), \quad t \in \{0, 1\}.$$

For binary-input channels, the Blackwell order admits a zonogon representation, which makes joins and meets computable and supplies the geometric inequality behind the nonnegativity of the atoms. The formal lattice and channel definitions are recalled in Appendix A and Appendix B.

We use the synergy/loss orientation of the MAR-PID lattice throughout. Thus $⪯_{S}$ denotes the order on antichains used for cumulative loss and Möbius inversion, as recalled in Appendix A. The symbol $\lor$ in $κ^{\lor}(α)$ denotes the Blackwell join of the source channels $\{κ_{S} : S \in α\}$. For binary-input channels this join is represented and computed through the associated zonogon geometry; the zonogon construction is therefore a geometric representation of the Blackwell join, not a separate source lattice operation.

The functional $i_{f}(p, κ)$ denotes the information functional used in MAR-PID for a binary-input channel $κ$ and target bias $p = P(T = 1)$. The cumulative quantity used below is a loss relative to the full source channel:

$$I_{\cup, f}(α; T) = i_{f}(p, κ_{V}) - i_{f}(p, κ^{\lor}(α)).$$

Partial information atoms are then obtained by Möbius inversion with respect to $⪯_{S}$. We assume $f$ belongs to the admissible class covered by the MAR-PID nonnegativity theorem; the present paper does not modify that theorem or enlarge its scope.

A useful way to view the MAR-PID result is that it provides nonnegative target-relative atoms. In the binary-target case, these atoms satisfy

$$Δ I_{\cup, f}(α; T) \geq 0, \quad α \in A(V).$$

For finite multilevel targets, MAR-PID can be formulated through pointwise binary-input decompositions over target states. In the empirical construction developed below, we use binary components of a finite-resolution representation of the target. This keeps every computational step inside the binary-input setting while allowing the final summaries to refer to the original target variable after aggregation.

3. Recursive Quantile Binarization

In practice, variables may be continuous or have many possible values. We therefore introduce a finite-resolution binary representation. The construction is based on recursive quantile (median) partitioning. For simplicity of notation, assume first that a variable $X$ takes values in $[0, 1]$; other one-dimensional variables may be transformed to this case by a monotone distributional transform or by empirical ranks.

Let x be a realization of X.

Set $A^{(0)} = [0, 1]$.
For $k = 1, 2, \dots$:
(a) Choose the median $m^{(k - 1)}$ of $X$ conditional on $X \in A^{(k - 1)}$;
(b) Set

$$B_{k}(x) = \begin{cases} 0, & x \leq m^{(k - 1)}, \\ 1, & x > m^{(k - 1)}; \end{cases}$$

(c) Update

$$A^{(k)} = \begin{cases} A^{(k - 1)} \cap (-\infty, m^{(k - 1)}], & B_{k}(x) = 0, \\ A^{(k - 1)} \cap (m^{(k - 1)}, \infty), & B_{k}(x) = 1. \end{cases}$$

Output the binary sequence $(B_{k}(x))_{k \geq 1}$.
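
The steps above can be sketched for a finite sample as follows. This is a minimal illustration, not a reference implementation: the function name `recursive_median_bits` is hypothetical, the medians are unconditional empirical medians of the current cell (no conditioning on past information), and the lower empirical median is used as an assumed tie-handling convention.

```python
import numpy as np


def recursive_median_bits(x, depth):
    """Return an (N, depth) array of bits from recursive empirical median splits.

    Column k holds B_{k+1}(x): 0 if the sample lies at or below the empirical
    median of its current cell, 1 if it lies strictly above.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    bits = np.zeros((n, depth), dtype=np.int8)
    cell = np.zeros(n, dtype=np.int64)  # label of the current cell A^(k-1)
    for k in range(depth):
        for c in np.unique(cell):
            idx = np.flatnonzero(cell == c)
            # Lower empirical median of the samples in this cell
            # (assumed convention; other tie rules are possible).
            m = np.sort(x[idx])[(len(idx) - 1) // 2]
            bits[idx, k] = (x[idx] > m).astype(np.int8)
        # Refine the cell label with the newly resolved bit.
        cell = 2 * cell + bits[:, k]
    return bits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.random(1024)
    code = recursive_median_bits(sample, depth=3)
    words = code.astype(int) @ np.array([4, 2, 1])
    print(np.bincount(words, minlength=8))  # occupancy of the 8 depth-3 cells
```

When the sample has no ties and its size is divisible by $2^{K}$, every depth-$K$ binary word receives the same number of samples, which is the empirical counterpart of the balanced splits used in the construction.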

In empirical work, the medians are estimated from samples. For time-series prediction or conditional analysis, the medians may be conditional on past information or on a conditioning state. The following proposition records the finite-tree-level property used in the construction.

Proposition 1

(Conditional dyadic structure). Fix $n \geq 0$ and write $F_{n} = σ(X_{n})$. Assume that the recursive conditional median binarization of $X_{n + 1}$ relative to $F_{n}$ is well defined up to every finite depth. Then, for every $K \geq 1$ and every binary word $(b_{1}, \dots, b_{K}) \in \{0, 1\}^{K}$,

$$P(B_{1}^{n + 1} = b_{1}, \dots, B_{K}^{n + 1} = b_{K} \mid F_{n}) = 2^{-K} \quad \text{a.s.}$$

Consequently, for every fixed $K$, the variables $B_{1}^{n + 1}, \dots, B_{K}^{n + 1}$ are conditionally independent given $F_{n}$, and each is conditionally Bernoulli$(1/2)$.

The proof is given in Appendix C. This statement can be viewed as a finite-depth dyadic analogue of the Rosenblatt transform [7]. The construction produces balanced binary components, which is useful both statistically and computationally.

Proposition 1 is used here as a finite-resolution coding result. Recursive conditional median splitting produces balanced binary coordinates with a dyadic finite-tree structure. This gives each resolved target coordinate a common binary information scale and makes it suitable as a binary target for MAR-PID. The target-bit aggregation and the projection to original-variable atoms are separate construction steps introduced below.

Remark 1

(Representation dependence). The construction is multivariate at the PID level: MAR-PID is applied to a collection of source variables and a target. The recursive quantile binarization step, however, is specified here for scalar variables. A vector-valued observed variable can be included either by treating its coordinates as separate observed variables, or by first choosing an additional finite-resolution encoding of the vector. Different such encodings may lead to different finite-resolution MAR-PID summaries.

For scalar variables, the quantile construction is invariant under strictly monotone transformations at the population level, or under rank-based empirical implementation, because the induced order of observations is unchanged. The finite-depth dyadic property in Proposition 1 requires the conditional median splits used in the construction; unconditional quantiles would not in general yield the same conditional dyadic property. Ties or atoms require a deterministic or randomized tie-breaking convention, and the exact balanced-split statement should be understood under the stated non-atomic conditional-law assumption.

The use of recursive median splits is part of the finite-resolution representation. Median splitting is the choice used in Proposition 1, since it gives balanced binary coordinates and the finite-depth dyadic property. Other quantile splits may be useful for application-specific encodings, for example to give more resolution to tail events, but they define different finite-resolution summaries and need not satisfy the same dyadic property. For vector-valued variables, or for non-median encodings, the chosen representation should be reported.

4. Bit-Level MAR-PID and Lifting to Original Variables

The binary variables introduced above are computational intermediates. We now describe how MAR-PID is applied at the bit level and how the resulting atoms are aggregated back to finite-resolution summaries for the original variables.

Let $Y$ be an original target variable, and let

$$Y^{(K)} = (C_{1}, \dots, C_{K})$$

be its finite-resolution binary representation. Similarly, suppose that each original source variable $X_{i}$ is represented at a common source depth $k$ by

$$X_{i}^{(k)} = (B_{i, 1}, \dots, B_{i, k}).$$

The binary source universe is

$$B = \{B_{i, r} : 1 \leq i \leq n, \ 1 \leq r \leq k\}.$$

For simplicity we use a common source depth $k$ and target depth $K$. Allowing source-dependent depths $k_{i}$ is straightforward, but adds notation without changing the construction. For each binary target component $C_{ℓ}$, MAR-PID is applied to the source universe $B$ and target $C_{ℓ}$. This gives bit-level atoms

$$Δ I(α; C_{ℓ}), \quad α \in A(B).$$

If Shannon information is used, the finite-resolution contribution associated with a bit-level atom $α$ is aggregated over target bits by

$$Δ I(α; Y^{(K)}) := \sum_{ℓ = 1}^{K} Δ I(α; C_{ℓ}).$$

The normalized version is

$$\overline{Δ I}(α; Y^{(K)}) := \frac{1}{K} \sum_{ℓ = 1}^{K} Δ I(α; C_{ℓ}).$$

Throughout the paper, an overline denotes normalization by the number of resolved target bits.

When the sources are represented by binary coordinates, bit-level atoms can also be projected back to original source variables. Define the projection map on source-bit subsets by

$$π(S) := \{i : S \cap \{B_{i, 1}, \dots, B_{i, k}\} \neq \emptyset\}, \quad S \subseteq B.$$

For an atom $α \in A(B)$, the collection $\{π(S) : S \in α\}$ may fail to be an antichain, because distinct bit-level sources can project to nested original-variable sources. We therefore define $π^{*}(α)$ to be the antichain obtained by removing every projected set that is strictly contained in another projected set.
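
A minimal Python sketch of the two maps follows. The encoding of a source bit $B_{i, r}$ as the pair `(i, r)` and the function names are assumptions made only for this illustration.

```python
def project_source(bit_source):
    """pi(S): indices i of the original variables with at least one bit in S.

    A source bit B_{i,r} is encoded as the pair (i, r) (assumed encoding).
    """
    return frozenset(i for (i, _r) in bit_source)


def project_atom(bit_atom):
    """pi*(alpha): project every source in the atom, then drop each projected
    set that is strictly contained in another projected set."""
    projected = {project_source(s) for s in bit_atom}
    return frozenset(p for p in projected
                     if not any(p < q for q in projected))


if __name__ == "__main__":
    # alpha = {{B_{1,1}}, {B_{1,2}, B_{2,1}}} is an antichain of bit-level
    # sources, but its projections {1} and {1, 2} are nested.
    alpha = frozenset({frozenset({(1, 1)}), frozenset({(1, 2), (2, 1)})})
    print(project_atom(alpha))  # only the set {1, 2} survives
```

In the usage example the projected collection contains both $\{1\}$ and $\{1, 2\}$; the pruning step removes $\{1\}$, so the atom is sent to $\{\{X_{1}, X_{2}\}\}$.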

For an original-variable atom $γ \in A(\{X_{1}, \dots, X_{n}\})$, define

$$Δ I^{\mathrm{orig}}(γ; C_{ℓ}) := \sum_{α : π^{*}(α) = γ} Δ I(α; C_{ℓ}).$$

Aggregating over target bits gives

$$Δ I^{\mathrm{orig}}(γ; Y^{(K)}) := \sum_{ℓ = 1}^{K} Δ I^{\mathrm{orig}}(γ; C_{ℓ}),$$

and the normalized version is

$$\overline{Δ I}^{\mathrm{orig}}(γ; Y^{(K)}) := \frac{1}{K} \sum_{ℓ = 1}^{K} Δ I^{\mathrm{orig}}(γ; C_{ℓ}).$$

Thus all bit-level atoms projecting to the same original-variable atom are summed. The resulting $γ$-indexed quantities are pushforward summaries of genuine bit-level MAR-PID atoms. This preserves, for each target component, the total MAR-PID information mass and keeps the projected quantities nonnegative. The projection is also a coarsening step: distinct bit-level mechanisms may project to the same original-variable atom and are then no longer separated at the original-variable level.

Aggregating over target bits gives a finite-resolution target-bit summary for the original target variable. The normalized version is an average information contribution per target bit, not a support or variable-count summary. In this sense, the binary coordinates are computational intermediates: MAR-PID is computed at the binary-coordinate level, while the final summaries are indexed by atoms over the original variables.
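
The pushforward and the target-bit aggregation amount to grouped sums. The sketch below is a hedged illustration: the function name is hypothetical, the projection is passed in as a callable, and the atom labels and numerical values in the usage example are invented toy inputs, not MAR-PID results.

```python
from collections import defaultdict


def pushforward(atoms_per_target_bit, project):
    """Sum bit-level atom values over target bits and over atoms that share
    the same projected original-variable atom.

    atoms_per_target_bit: list with one dict {bit_level_atom: value} per
                          resolved target bit C_l.
    project:              callable, bit_level_atom -> original-variable atom.
    Returns (totals, per_target_bit_averages).
    """
    totals = defaultdict(float)
    for atoms_for_bit in atoms_per_target_bit:
        for alpha, value in atoms_for_bit.items():
            totals[project(alpha)] += value
    depth = len(atoms_per_target_bit)  # K, the number of resolved target bits
    averages = {gamma: v / depth for gamma, v in totals.items()}
    return dict(totals), averages


if __name__ == "__main__":
    # Toy labels and values, chosen only to show the bookkeeping.
    lookup = {"a1": "g1", "a2": "g1", "a3": "g2"}
    per_bit = [
        {"a1": 0.25, "a2": 0.5, "a3": 0.0},   # target bit C_1
        {"a1": 0.0, "a2": 0.25, "a3": 1.0},   # target bit C_2
    ]
    totals, averages = pushforward(per_bit, lookup.get)
    print(totals, averages)
```

Because the values are only regrouped and added, the total over all original-variable atoms equals the total over all bit-level atoms and target bits, and nonnegative inputs give nonnegative outputs.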

Remark 2

(Scope of the target-bit aggregation). The source-side projection described above is a pushforward of nonnegative MAR-PID atoms. The subsequent summation over $ℓ = 1, \dots, K$ is a finite-resolution target-bit aggregation: it records how MAR-PID information about the resolved target coordinates is distributed over original-variable atoms. This target-bit summary is distinct from a native MAR-PID of the full target block $Y^{(K)}$. Cross-target-bit informational mechanisms require a separate target-block analysis or an additional support-level or dimension-level construction.

5. Algorithmic Computation and Estimation

We now summarize the empirical computation. Throughout this section the target is a binary component $C_{ℓ} \in \{0, 1\}$ and the sources are finite-valued variables, either original discrete variables or binary coordinates obtained from the representation above.

5.1. Empirical Meaning and Statistical Scope

In this paper, the term empirical means that the construction starts from a finite observed sample and produces computable finite-resolution MAR-PID summaries after binarization and channel estimation. The resulting quantities are plug-in summaries of the chosen finite-resolution representation.

Their numerical values depend on the available sample, the binarization depths, and the set of source atoms included in the computation. Statistical error control, optimal resolution selection, and finite-sample confidence statements are separate questions and are not addressed here.

For reproducibility, the sample size $N$, target depth $K$, source depths $k_{i}$, tie-handling convention, and chosen source universe should be reported with any finite-sample computation.

5.2. Channel Estimation

Given samples $\{(c_{ℓ}^{(m)}, b^{(m)})\}_{m = 1}^{N}$ from the binary target $C_{ℓ}$ and the source universe, estimate $p = P(C_{ℓ} = 1)$ by

$$\hat{p} = \frac{1}{N} \sum_{m = 1}^{N} 1_{\{c_{ℓ}^{(m)} = 1\}}.$$

For each source $S$ and each value $s \in A_{S}$, where $A_{S}$ denotes the finite set of values taken by $S$, estimate

$$P(S = s \mid C_{ℓ} = t), \quad t \in \{0, 1\}.$$

Let $N_{s, t}$ be the number of observations with $S = s$ and $C_{ℓ} = t$, and let $N_{t}$ be the number of observations with $C_{ℓ} = t$. We use the plug-in estimator

$$\hat{P}(S = s \mid C_{ℓ} = t) = \frac{N_{s, t}}{N_{t}}.$$

These conditional probabilities define the empirical channel $\hat{κ}_{S}$ for each source $S$. In the remainder of this section, all channels and MAR-PID quantities are empirical plug-in quantities unless explicitly stated otherwise; hats are omitted to lighten the notation.

Regularized channel estimators may be useful in sparse empirical applications, but their choice is an implementation issue and is not part of the finite-resolution MAR-PID construction studied here.
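
The plug-in estimator can be written in a few lines. The following is a minimal sketch: the function name is hypothetical, and each source value is assumed to be encoded as a single label (for instance an integer encoding a tuple of bits). No regularization is applied.

```python
import numpy as np


def estimate_channel(target_bits, source_values):
    """Plug-in estimate of the target bias and of a binary-input channel.

    Returns (p_hat, values, channel) where channel[j, t] = N_{s_j, t} / N_t
    estimates P(S = values[j] | C = t).
    """
    c = np.asarray(target_bits).astype(int).ravel()
    s = np.asarray(source_values).ravel()
    values, codes = np.unique(s, return_inverse=True)
    counts = np.zeros((len(values), 2))
    np.add.at(counts, (codes, c), 1.0)      # counts[j, t] = N_{s_j, t}
    n_t = counts.sum(axis=0)                # N_0 and N_1
    if np.any(n_t == 0):
        raise ValueError("both target values must occur in the sample")
    return float(c.mean()), values, counts / n_t


if __name__ == "__main__":
    c = [0, 0, 1, 1, 1, 0]
    s = [0, 1, 1, 1, 0, 0]
    p_hat, values, channel = estimate_channel(c, s)
    print(p_hat)
    print(channel)  # each column sums to one
```

Each column of the returned array is an empirical conditional distribution, so the columns sum to one by construction.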

5.3. Zonogons, Joins, and Cumulative Loss

For each empirical channel $\hat{κ}_{S}$, construct the binary-input zonogon associated with its column vectors. Blackwell joins of channels are computed geometrically at the level of these zonogons. For an atom $α$, write, in the notation of Section 2,

$$κ^{\lor}(α) = \bigvee_{S \in α} κ_{S}.$$

Let $κ_{V}$ denote the channel associated with the full source universe. The empirical cumulative loss is

$$I_{\cup, f}(α; C_{ℓ}) = i_{f}(p, κ_{V}) - i_{f}(p, κ^{\lor}(α)).$$
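
The geometric step can be illustrated with a short Python sketch. It rests on assumptions that are not spelled out in this section: each source value contributes the vector $(P(s \mid 0), P(s \mid 1))$, the zonogon is the set of partial sums of these vectors, and the Blackwell join of binary-input channels is taken to correspond to the convex hull of the union of the zonogons. Since zonogons are centrally symmetric, only the upper boundary is tracked. Function names and the coordinate convention are hypothetical.

```python
import numpy as np


def upper_boundary(channel):
    """Vertices of the upper zonogon boundary of a binary-input channel.

    channel[j] = (P(S = s_j | T = 0), P(S = s_j | T = 1)).
    """
    vecs = np.asarray(channel, dtype=float)
    vecs = vecs[vecs.sum(axis=1) > 0]              # drop unobserved values
    # Steepest vectors first, then accumulate partial sums from the origin.
    order = np.argsort(-np.arctan2(vecs[:, 1], vecs[:, 0]), kind="stable")
    return np.vstack([[0.0, 0.0], np.cumsum(vecs[order], axis=0)])


def _cross(o, a, b):
    """z-component of (a - o) x (b - o); negative for a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def join_upper_boundary(channels):
    """Upper convex hull of the union of the zonogons (assumed join)."""
    pts = np.vstack([upper_boundary(ch) for ch in channels])
    pts = np.unique(np.round(pts, 12), axis=0)     # sorted by (x, y)
    hull = []
    for p in pts:
        # Keep only clockwise turns when moving left to right.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return np.array(hull)


if __name__ == "__main__":
    perfect = [[1.0, 0.0], [0.0, 1.0]]    # source value reveals the target
    useless = [[0.5, 0.5], [0.5, 0.5]]    # source independent of the target
    print(join_upper_boundary([perfect, useless]))
```

In the usage example the zonogon of the uninformative channel degenerates to the diagonal, so the hull coincides with the zonogon of the perfect channel. Differences of consecutive hull vertices give the column vectors of a channel representing the join under the stated assumptions.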

Möbius inversion on the synergy/loss lattice gives

$$Δ I_{\cup, f}(α; C_{ℓ}) = I_{\cup, f}(α; C_{ℓ}) - \sum_{β ≺_{S} α} Δ I_{\cup, f}(β; C_{ℓ}).$$

The atoms are evaluated in an order compatible with $⪯_{S}$.
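
The recursion can be sketched generically in Python. In this illustration the strict order is supplied by the caller as a predicate, so the code does not encode the MAR-PID lattice of Appendix A itself; the function name and the toy chain used in the example are hypothetical.

```python
def moebius_inversion(cumulative, strictly_below):
    """Solve Delta(a) = I(a) - sum of Delta(b) over all b strictly below a.

    cumulative:      dict {atom: cumulative value I(atom)}.
    strictly_below:  callable (b, a) -> True if b is strictly below a in a
                     partial order (assumed transitive and irreflexive).
    """
    atoms = list(cumulative)
    # The number of strict predecessors is an order-compatible ranking:
    # if b is below a, then b has strictly fewer predecessors than a.
    rank = {a: sum(strictly_below(b, a) for b in atoms) for a in atoms}
    delta = {}
    for a in sorted(atoms, key=rank.get):
        below = sum(delta[b] for b in atoms if strictly_below(b, a))
        delta[a] = cumulative[a] - below
    return delta


if __name__ == "__main__":
    # Toy three-element chain 0 < 1 < 2 with invented cumulative values.
    cumulative = {0: 0.0, 1: 0.5, 2: 1.5}
    print(moebius_inversion(cumulative, lambda b, a: b < a))
```

Sorting by the number of strict predecessors guarantees that every term on the right-hand side has already been computed when an atom is processed.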

5.4. Aggregation over Target Bits and Projection to Original Variables

Repeat the binary-target computation for $ℓ = 1, \dots, K$. The empirical contribution aggregated over target bits is

$$Δ I(α; Y^{(K)}) = \sum_{ℓ = 1}^{K} Δ I(α; C_{ℓ}),$$

or, in normalized form,

$$\overline{Δ I}(α; Y^{(K)}) = \frac{1}{K} \sum_{ℓ = 1}^{K} Δ I(α; C_{ℓ}).$$

If source variables were represented by binary coordinates, one may additionally group bit-level atoms by their projected original-variable atom $π^{*}(α)$ as in Section 4. This yields empirical original-variable summaries

$$Δ I^{\mathrm{orig}}(γ; Y^{(K)}) = \sum_{ℓ = 1}^{K} \sum_{α : π^{*}(α) = γ} Δ I(α; C_{ℓ}).$$

5.5. Algorithmic Summary

The computation proceeds as follows.

Choose finite resolutions for source and target variables.
Represent the original target Y by binary components $C_{1}, \dots, C_{K}$ .
Represent continuous or multilevel source variables by binary coordinates when required.
For each target bit $C_{ℓ}$ , estimate the binary-input channels $κ_{S}$ for all sources S under consideration.
Construct the associated zonogons and compute Blackwell joins.
Compute empirical cumulative losses $I_{\cup, f} (α; C_{ℓ})$ .
Apply Möbius inversion to obtain ${Δ I}_{\cup, f} (α; C_{ℓ})$ .
Aggregate over target bits to obtain $Δ I (α; Y^{(K)})$ or its normalized version.
If source variables were binarized, project bit-level atoms to original-variable atoms and aggregate the corresponding contributions.

6. Illustrative Examples

The examples below are controlled mechanism checks rather than empirical benchmarks. They are chosen so that the redundant, unique, or synergistic structure is known in advance. This makes it possible to check whether the finite-resolution MAR-PID construction assigns nonnegative information contributions to the expected atoms. Real-data applications require further choices, including preprocessing, source selection, resolution choice, and statistical stabilization. These choices are application dependent and are not part of the construction studied here.

We provide three examples illustrating the finite-resolution MAR-PID viewpoint. The first contrasts the negative value of classical interaction information with the nonnegative PID representation of an XOR structure. The second shows that redundant and synergistic mechanisms may coexist in a binary-target distribution. The third illustrates the treatment of a non-binary discrete target.

6.1. XOR Under the Doubling Map

Consider the doubling map on $[0, 1)$,

$$M(u) = 2u \bmod 1.$$

Write the binary expansion, ignoring the null set of dyadic rationals, as

$$u = 0.u_{0} u_{1} u_{2} \dots, \quad u_{i} \in \{0, 1\}.$$

Then $M$ acts as the left shift on binary digits.

Let $X_{0}$ and $Y_{0}$ be independent uniformly distributed points in $[0, 1)$, with binary expansions

$$X_{0} = 0.b_{0} b_{1} b_{2} \dots, \quad Y_{0} = 0.c_{0} c_{1} c_{2} \dots,$$

where $(b_{i})_{i \geq 0}$ and $(c_{i})_{i \geq 0}$ are independent i.i.d. Bernoulli$(1/2)$ sequences. Define

$$Z_{0} = 0.(b_{0} \oplus c_{0})(b_{1} \oplus c_{1})(b_{2} \oplus c_{2}) \dots,$$

where $\oplus$ denotes addition modulo two. Let

$$X_{t} = M^{t}(X_{0}), \quad Y_{t} = M^{t}(Y_{0}), \quad Z_{t} = M^{t}(Z_{0}).$$

Then $Z_{t} = X_{t} \oplus Y_{t}$ digitwise for every $t \geq 0$.

At resolution $δ = 2^{-k}$, let $X_{t}^{(k)}, Y_{t}^{(k)}, Z_{t}^{(k)}$ denote the $k$-bit prefixes. Since $Z_{t}^{(k)} = X_{t}^{(k)} \oplus Y_{t}^{(k)}$ digitwise, the classical trivariate interaction information satisfies

$$I(X_{t}^{(k)}; Y_{t}^{(k)}; Z_{t}^{(k)}) = -k$$

when information is measured in bits. Thus the signed interaction information summary is negative.
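
The value $-k$ can be checked by exact enumeration for small $k$. The sketch below assumes the sign convention $I(X; Y; Z) = I(X; Z) + I(Y; Z) - I((X, Y); Z)$, which is the convention under which XOR-type dependence gives a negative value; the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import log2


def entropy(joint, idx):
    """Shannon entropy in bits of the marginal of `joint` on positions `idx`."""
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in idx)] += p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


def mutual_information(joint, a, b):
    """I(A; B) = H(A) + H(B) - H(A, B) for coordinate groups a and b."""
    return entropy(joint, a) + entropy(joint, b) - entropy(joint, a + b)


def interaction_information(joint):
    """I(X;Y;Z) = I(X;Z) + I(Y;Z) - I((X,Y);Z) (assumed sign convention)."""
    x, y, z = (0,), (1,), (2,)
    return (mutual_information(joint, x, z)
            + mutual_information(joint, y, z)
            - mutual_information(joint, x + y, z))


def xor_joint(k):
    """Joint law of (X, Y, Z) for independent uniform k-bit blocks X, Y and
    Z equal to their digitwise XOR."""
    joint = {}
    for xb in product((0, 1), repeat=k):
        for yb in product((0, 1), repeat=k):
            zb = tuple(a ^ b for a, b in zip(xb, yb))
            joint[(xb, yb, zb)] = 4.0 ** (-k)
    return joint


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, interaction_information(xor_joint(k)))
```

The same helpers also confirm that each source block alone carries no information about the XOR block, while the pair determines it completely.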

In the MAR-PID representation, keep the sources

$$V_{1} = X_{t}^{(k)}, \quad V_{2} = Y_{t}^{(k)},$$

and choose a binary target given by one output digit of $Z_{t}$.

In this example we take the source universe to be $V = \{V_{1}, V_{2}\}$, where each $V_{i}$ is a $k$-bit finite-valued source variable. Thus the source blocks are treated as the two sources in the MAR-PID computation. One could alternatively use the individual source bits as the source universe; then the positive synergistic contribution for target bit $C_{j}$ would be localized at the corresponding pair of source bits and would project back to the original-source atom $\{\{V_{1}, V_{2}\}\}$.

Let

$$C_{j} = B_{t, j} = b_{t + j} \oplus c_{t + j}, \quad 0 \leq j \leq k - 1.$$

Then

$$I(V_{1}; C_{j}) = 0, \quad I(V_{2}; C_{j}) = 0, \quad I((V_{1}, V_{2}); C_{j}) = 1.$$

For each binary target bit $C_{j}$, the only nonzero MAR-PID contribution comes from the synergistic atom corresponding to the joint source:

$$Δ I(α_{syn}; C_{j}) = 1, \quad Δ I(α; C_{j}) = 0 \ \text{ for } α \neq α_{syn}.$$

Aggregating over the $k$ target bits gives

$$Δ I(α_{syn}; Z_{t}^{(k)}) = \sum_{j = 0}^{k - 1} Δ I(α_{syn}; C_{j}) = k.$$

Thus the negative interaction information is replaced by a nonnegative atomic description: one bit of synergistic information is present at each resolved target digit (Table 1).

6.2. A Mixed Redundancy–Synergy Example

Let $A \sim \mathrm{Bern}(1/2)$ be the binary target, and let $B \sim \mathrm{Bern}(q)$, $0 < q < 1$, be an independent latent switching variable. The observed sources are constructed as follows:

If $B = 0$ , set $V_{1} = A$ and $V_{2} = A$ . This regime introduces a redundant mechanism, since either source alone determines the target.
If $B = 1$ , sample $U \sim Bern (1 / 2)$ independently of A, and set $V_{1} = U$ and $V_{2} = U \oplus A$ . This regime introduces an XOR-type synergistic mechanism, since neither source alone determines the target, while the pair does.
Let $V_{3}$ be either independent noise or a noisy copy of A, depending on whether one wants to include an additional weakly informative source.

Since $B$ is latent, MAR-PID is applied to the marginal distribution of $(A, V_{1}, V_{2}, V_{3})$, after averaging over the switch. The resulting atoms need not decompose additively into the two latent regimes. The example should therefore be read as a qualitative mechanism test case: one regime introduces redundant information, while the other introduces XOR-type synergistic information. The redundant atom associated with $\{\{V_{1}\}, \{V_{2}\}\}$ and the synergistic atom associated with $\{\{V_{1}, V_{2}\}\}$ are the principal atoms to inspect. Additional atoms involving $V_{3}$ may appear if $V_{3}$ carries information about $A$.

For reference, Table 2 shows population Shannon summaries for the two-source version of this example, without the optional source $V_{3}$. The table is not a MAR-PID benchmark. It only records familiar information quantities for several values of the switch parameter $q$. The change of sign of $I(A; V_{1}; V_{2})$ shows how the same family moves from a redundancy-dominated regime to an XOR-type synergy-dominated regime.
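
Population Shannon summaries of this kind can be computed directly from the mixture distribution. The following sketch is an independent illustration for the two-source version, not a reproduction of the table: the function names are hypothetical, and the interaction information is taken with the sign convention $I(A; V_{1}; V_{2}) = I(A; V_{1}) + I(A; V_{2}) - I(A; (V_{1}, V_{2}))$.

```python
import numpy as np


def mixed_joint(q):
    """Joint array P[a, v1, v2] after marginalizing the switch B and the
    auxiliary bit U."""
    P = np.zeros((2, 2, 2))
    for a in (0, 1):
        P[a, a, a] += 0.5 * (1.0 - q)           # B = 0: V1 = V2 = A
        for u in (0, 1):
            P[a, u, u ^ a] += 0.5 * q * 0.5     # B = 1: V1 = U, V2 = U xor A
    return P


def H(P, keep):
    """Entropy in bits of the marginal of P on the axes listed in `keep`."""
    drop = tuple(ax for ax in range(P.ndim) if ax not in keep)
    m = (P.sum(axis=drop) if drop else P).ravel()
    m = m[m > 0]
    return float(-(m * np.log2(m)).sum())


def shannon_summaries(q):
    """Mutual informations and signed interaction information for switch q."""
    P = mixed_joint(q)
    i1 = H(P, (0,)) + H(P, (1,)) - H(P, (0, 1))
    i2 = H(P, (0,)) + H(P, (2,)) - H(P, (0, 2))
    i12 = H(P, (0,)) + H(P, (1, 2)) - H(P, (0, 1, 2))
    return {"I(A;V1)": i1, "I(A;V2)": i2,
            "I(A;V1,V2)": i12, "I(A;V1;V2)": i1 + i2 - i12}


if __name__ == "__main__":
    for q in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(q, shannon_summaries(q))
```

The endpoints $q = 0$ and $q = 1$ are included only as limiting checks: they correspond to the purely redundant and purely XOR regimes, where the signed quantity equals $+1$ and $-1$ bit, respectively.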

6.3. Discrete Variable with More than 2 Values

Let $C_{1}, C_{2}$ be independent Bernoulli$(1/2)$ variables and let

$$Y = 2 C_{1} + C_{2} \in \{0, 1, 2, 3\}.$$

Let $V_{1} = C_{1}$ and $V_{2} = C_{2}$. Then $Y$ is a four-valued target, but its quantile binary representation consists of the two independent bits $C_{1}$ and $C_{2}$. MAR-PID applied to $C_{1}$ assigns one bit uniquely to $V_{1}$, while MAR-PID applied to $C_{2}$ assigns one bit uniquely to $V_{2}$. Aggregation over target bits therefore gives two separate original-target information contributions rather than a single undifferentiated two-bit dependence.

The example also shows where representation dependence enters. Internal permutations of bits within a source variable are removed by the projection $π^{*}$, so they do not change the resulting original-variable source atom. On the target side, however, the chosen binary representation matters: a substantially different binarization of $Y$ may lead to a different target-bit summary. This is why the proposed quantities are finite-resolution summaries induced by a specified target representation.

Together, the examples illustrate the role of the finite-resolution representation. The XOR example starts from continuous variables on $[0, 1)$, but MAR-PID is applied only after resolving binary coordinates. The four-valued example shows that target binarization can expose separate informational components that would otherwise appear as a single multivalued dependence.

7. Discussion

The present paper separates two tasks that are often conflated: constructing a nonnegative empirical information decomposition, and deriving support- or scale-based summaries from it. We focus on the first task.

The main construction is a finite-resolution empirical MAR-PID pipeline. Continuous or high-cardinality variables are represented by binary coordinates using recursive quantile binarization. MAR-PID is then applied to binary target components, producing nonnegative bit-level atoms. These atoms are aggregated across target bits and lifted from source-bit atoms back to atoms over the original source variables. Thus the binary coordinates serve as computational intermediates rather than final interpretive units.

The finite-depth dyadic property justifies the use of this binary representation. Under non-atomic conditional laws, the binary tree coordinates are balanced and conditionally independent at each finite-tree level. In the present construction, the source side is resolved over the chosen binary source universe: for each target bit $C_{ℓ}$, MAR-PID is computed over $B$, and the resulting source-bit atoms are pushed forward to atoms over the original variables. On the target side, the construction preserves MAR-PID nonnegativity bit by bit: each $Δ I (α; C_{ℓ})$ is a genuine nonnegative binary-target MAR-PID atom, and summing over ℓ gives a nonnegative finite-resolution target-bit summary.
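The balance property is easy to observe empirically. The following is a minimal sketch of recursive median binarization on a single sample, assuming NumPy; the function name `quantile_bits` is hypothetical, the split is unconditional (no conditioning on a past value), and ties are broken by sample order, which is a numerical convention chosen here rather than one prescribed by the paper.

```python
import numpy as np

def quantile_bits(x, depth):
    """Return an (n, depth) array of recursive median-split bits for samples x."""
    x = np.asarray(x, dtype=float)
    bits = np.zeros((x.size, depth), dtype=np.int8)
    cells = [np.arange(x.size)]            # the depth-0 cell holds every sample
    for level in range(depth):
        children = []
        for idx in cells:
            # Sort the cell; a stable sort breaks ties by sample order.
            order = idx[np.argsort(x[idx], kind="stable")]
            half = order.size // 2         # lower half keeps bit 0, upper half gets bit 1
            bits[order[half:], level] = 1
            children += [order[:half], order[half:]]
        cells = children
    return bits

rng = np.random.default_rng(0)
samples = rng.normal(size=1024)            # continuous law, so ties have probability zero
B = quantile_bits(samples, depth=3)
words, counts = np.unique(B, axis=0, return_counts=True)
print(B.mean(axis=0))                      # per-bit frequency of the value 1
print(counts)                              # occupancy of each of the 2**3 binary words
```

With a sample size that is a multiple of $2^{K}$, every bit is exactly balanced and all $2^{K}$ words are equally occupied, which is the empirical counterpart of the finite-depth dyadic property.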

The finite-resolution pipeline is neutral with respect to the redundancy–synergy interpretation of the atoms. Once the MAR-PID atoms have been estimated, the same aggregation and lifting procedure applies to atoms interpreted as redundant, unique, or synergistic contributions. Accordingly, the proposed summaries should be interpreted as finite-resolution MAR-PID summaries induced by the chosen binary representation, rather than as a representation-independent continuous-variable PID.

The examples illustrate why the atomic decomposition matters. In the XOR example, classical interaction information is negative, while MAR-PID identifies a nonnegative synergistic contribution at each target bit. In the mixed example, redundant and synergistic mechanisms coexist. A signed net quantity may obscure this coexistence, whereas MAR-PID keeps the mechanisms separated.
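The sign of the classical quantity can be reproduced on the simplest XOR instance. The sketch below assumes two independent fair source bits and a binary target equal to their XOR, and uses the co-information convention I(T; V_{1}) + I(T; V_{2}) - I(T; V_{1}, V_{2}); the helper names are hypothetical. It computes Shannon quantities only and does not compute MAR-PID atoms.

```python
from itertools import product
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginalize a joint distribution onto the coordinates listed in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mi(joint, a, b):
    """I(A; B) = H(A) + H(B) - H(A, B) for coordinate tuples a and b."""
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))

# Outcomes are (v1, v2, t) with independent fair bits and t = v1 XOR v2.
joint = {(v1, v2, v1 ^ v2): 0.25 for v1, v2 in product((0, 1), repeat=2)}

i1 = mi(joint, (2,), (0,))       # I(T; V1)
i2 = mi(joint, (2,), (1,))       # I(T; V2)
i12 = mi(joint, (2,), (0, 1))    # I(T; V1, V2)
co_information = i1 + i2 - i12
print(i1, i2, i12, co_information)
```

Each source alone carries no information about the target, the pair carries one bit, and the signed summary is therefore minus one bit.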

The construction has several practical constraints. Finite-sample estimation of high-dimensional conditional channels is statistically demanding. In applied work, one may need restricted atom families, problem-specific source selection, or other regularized channel estimators. Such choices concern large-scale implementation and statistical stabilization; they are not part of the finite-resolution MAR-PID construction itself.

Support-based degrees of freedom, persistent supports, and PID-based dimensions are natural downstream constructions. Once nonnegative empirical atoms are available, one may threshold their support, normalize across source and target resolutions, or study scaling with resolution. These choices require separate treatment. The role of the present paper is to establish the empirical nonnegative MAR-PID layer on which such summaries can be built.

The present paper should be read as a construction paper. It defines a finite-resolution representation, applies MAR-PID at the binary target-component level, and aggregates the resulting nonnegative atoms back to original variables. It does not claim to provide a complete statistical validation procedure. Real-data benchmarking, confidence statements, optimal resolution selection, and large-scale sparse implementation require further statistical and computational choices, which matter for applications but lie outside the construction presented here.

The full-lattice version is combinatorial. If MAR-PID is computed over m source coordinates, there are $2^{m} - 1$ nonempty source subsets, and the number of nonempty antichain atoms is $M (m) - 2$, where $M (m)$ is the m-th Dedekind number. This gives 4, 18, 166, and 7579 atoms for $m = 2, 3, 4$, and $5$, respectively. In the bit-level construction, m may be $| B | = \sum_{i} k_{i}$, so source-bit unfolding can increase the lattice quickly. Full-lattice computation should therefore be regarded as a small-source construction. For larger systems, one may reduce the source universe by grouping source bits into original variables, limiting the maximum source order, preselecting variables, or using a problem-specific family of source groups. Pruning very small atoms can be useful as a reporting or post-processing step, but it is not a substitute for the exact Möbius inversion on the chosen lattice.
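The atom counts quoted above can be reproduced by direct enumeration for small m. The sketch below counts nonempty antichains of nonempty subsets by backtracking; the function name `count_atoms` is hypothetical, and the approach is only practical for small m because the number of antichains grows very fast.

```python
from itertools import combinations

def count_atoms(m):
    """Count nonempty antichains of nonempty subsets of {0, ..., m-1}."""
    sources = [frozenset(c) for r in range(1, m + 1)
               for c in combinations(range(m), r)]

    def extend(start, chosen):
        # Count the antichain `chosen` itself (if nonempty) plus every
        # antichain obtained by adding sources with index >= start.
        total = 1 if chosen else 0
        for i in range(start, len(sources)):
            s = sources[i]
            if all(not (s <= t or t <= s) for t in chosen):
                total += extend(i + 1, chosen + [s])
        return total

    return extend(0, [])

counts = {m: count_atoms(m) for m in (2, 3, 4, 5)}
print(counts)
```

Already at m = 6 the same enumeration would visit several million antichains, which illustrates why the text treats full-lattice computation as a small-source construction.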

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

During the preparation of this manuscript the author used ChatGPT 5.4 for language editing and LaTeX formatting. The author has reviewed and edited the output and takes full responsibility for the content of this publication.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

MAR-PID	Mages–Anastasiadi–Rohner partial information decomposition
PID	Partial information decomposition

Nomenclature

Symbol	Object
$T, Y$	Target variables
$C_{ℓ}$	Target bit
$X_{i}, V_{i}$	Source variables
$B_{i, r}$	Source bit
$K, k_{i}$	Resolution depths
$S$	Source family
$A (V)$	Atom lattice
$α, γ$	Atom indices
$κ_{S}$	Source channel
$Δ I (α; C_{ℓ})$	Bit-level atom
$π^{*} (α)$	Projected atom

Appendix A. Lattice Formalism for Atoms and the Synergy Order

This appendix summarizes the lattice notation used in the MAR-PID construction.

Appendix A.1. Sources and Atoms

Let $V = \{V_{1}, \dots, V_{n}\}$ be the set of observed variables. A source is a nonempty subset of V. We denote the set of all sources by

S = P (V) ∖ \{\emptyset\} .

An atom is an antichain in $S$. Thus an atom is a collection $α \subseteq S$ such that

S_{a}, S_{b} \in α, \ S_{a} \neq S_{b} ⟹ S_{a} \not\subset S_{b} \text{ and } S_{b} \not\subset S_{a} .

The set of all atoms is denoted by $A (V)$.

Appendix A.2. The Synergy/Loss Order

The synergy, or loss, order $⪯_{S}$ on $A (V)$ is defined by

α ⪯_{S} β ⟺ \forall S \in α \ \exists S^{'} \in β \text{ such that } S \subseteq S^{'} .

Equipped with this order, $A (V)$ is a finite lattice. The order is used to define cumulative loss and to perform Möbius inversion.

Appendix A.3. Möbius Inversion

Let $F : A (V) \to R$ be a real-valued function. Its Möbius inverse $Δ F$ with respect to $⪯_{S}$ is defined recursively by

Δ F (α) = F (α) - \sum_{β ≺_{S} α} Δ F (β),

or equivalently

F (α) = \sum_{β ⪯_{S} α} Δ F (β) .

In MAR-PID, F is the cumulative loss $I_{\cup, f} (α; T)$, and $Δ F$ yields the partial information atoms $Δ I_{\cup, f} (α; T)$.
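The recursion can be made concrete on the two-variable lattice. The sketch below enumerates the atoms, implements the synergy/loss order as defined in Appendix A.2, and solves the recursion bottom-up. All function names are hypothetical, and the values of F are arbitrary illustrative numbers, not cumulative losses from the paper.

```python
from itertools import combinations

def atoms(variables):
    """All nonempty antichains of nonempty subsets of `variables`."""
    sources = [frozenset(c) for r in range(1, len(variables) + 1)
               for c in combinations(variables, r)]
    found = []

    def extend(start, chosen):
        if chosen:
            found.append(frozenset(chosen))
        for i in range(start, len(sources)):
            s = sources[i]
            if all(not (s <= t or t <= s) for t in chosen):
                extend(i + 1, chosen + [s])

    extend(0, [])
    return found

def leq(alpha, beta):
    """Synergy/loss order: every source in alpha lies inside some source in beta."""
    return all(any(s <= t for t in beta) for s in alpha)

def moebius_inverse(F, lattice):
    """Solve Delta F(a) = F(a) - sum over b strictly below a of Delta F(b)."""
    delta = {}
    # Ordering atoms by the size of their down-set visits every atom
    # after all atoms strictly below it.
    for a in sorted(lattice, key=lambda a: sum(leq(b, a) for b in lattice)):
        delta[a] = F[a] - sum(delta[b] for b in lattice if b != a and leq(b, a))
    return delta

lattice = atoms(("V1", "V2"))
F = {a: float(i + 1) for i, a in enumerate(lattice)}   # arbitrary illustrative values
delta = moebius_inverse(F, lattice)
rebuilt = {a: sum(delta[b] for b in lattice if leq(b, a)) for a in lattice}
```

Summing the recovered increments over each down-set returns the original F, which is the equivalent summation form of the recursion.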

Appendix B. Binary-Input Channels, Blackwell Order, and Zonogons

This appendix recalls the channel-theoretic objects used in the MAR-PID construction. The full proof of nonnegativity is credited to Mages, Anastasiadi, and Rohner [1].

Appendix B.1. Binary-Input Channels

Assume $T \in \{0, 1\}$. A binary-input channel $κ : T \to S$ is specified by

κ (s ∣ t) = P (S = s ∣ T = t) .

Equivalently, it can be represented by column vectors

v_{s} = \begin{pmatrix} κ (s ∣ 1) \\ κ (s ∣ 0) \end{pmatrix}, \quad s \in A_{S} .

Appendix B.2. Blackwell Order

For two channels $κ_{1} : T \to S_{1}$ and $κ_{2} : T \to S_{2}$ with the same input alphabet, write $κ_{1} ⪯_{BW} κ_{2}$ if $κ_{1}$ is a garbling of $κ_{2}$; i.e., if there exists a stochastic post-processing channel $K : S_{2} \to S_{1}$ such that

κ_{1} = K \circ κ_{2} .

Thus $κ_{2}$ is at least as informative as $κ_{1}$ for all decision problems with input T.
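A garbling is simply a matrix product once channels are written as stochastic matrices. The sketch below assumes NumPy and uses binary symmetric channels as a convenient illustration; the helper `bsc` and the chosen error rates are assumptions. Channels are stored row-stochastically (rows indexed by the input, columns by the output), so the composition $K \circ κ_{2}$ becomes the product `kappa2 @ K`.

```python
import numpy as np

def bsc(eps):
    """Binary symmetric channel as a row-stochastic matrix: rows t, columns s."""
    return np.array([[1 - eps, eps], [eps, 1 - eps]])

kappa2 = bsc(0.10)        # more informative channel T -> S2
K = bsc(0.25)             # stochastic post-processing S2 -> S1
# Garbling: kappa1(s1 | t) = sum over s2 of kappa2(s2 | t) * K(s1 | s2).
kappa1 = kappa2 @ K

print(kappa1)
print(kappa1.sum(axis=1))  # each row is still a probability vector
```

Here the garbled channel is again binary symmetric, with error rate 0.1 · 0.75 + 0.9 · 0.25 = 0.3.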

Appendix B.3. Zonogon Representation

For binary-input channels, associate to $κ$ the zonogon

Z (κ) = \sum_{s \in A_{S}} [0, v_{s}] \subset R^{2},

a Minkowski sum of line segments. The Blackwell order, joins, and meets can be expressed geometrically in terms of these zonogons. This representation is both computationally useful and central to the proof of the nonnegativity of the MAR-PID atoms.
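A small sketch may help to make the geometry concrete. It assumes NumPy, stores channels as row-stochastic matrices with rows t = 0, 1, and relies on the standard characterization that, for binary-input channels, a garbling has a zonogon contained in that of the original channel. All function names are hypothetical, and the inclusion test through support functions is one convenient choice made here, not necessarily the procedure used in the paper.

```python
import numpy as np

def bsc(eps):
    """Binary symmetric channel: rows t = 0, 1 and columns s = 0, 1."""
    return np.array([[1 - eps, eps], [eps, 1 - eps]])

def generators(channel):
    """Rows of the result are the vectors v_s = (kappa(s|1), kappa(s|0))."""
    channel = np.asarray(channel, dtype=float)
    return np.column_stack([channel[1], channel[0]])

def zonogon_vertices(channel):
    """Vertices of the upper boundary of Z(kappa), from (0, 0) to (1, 1)."""
    v = generators(channel)
    # Steepest generators first, so the cumulative sums trace the upper boundary.
    order = np.argsort(-np.arctan2(v[:, 1], v[:, 0]), kind="stable")
    return np.vstack([[0.0, 0.0], np.cumsum(v[order], axis=0)])

def support(channel, u):
    """Support function of the Minkowski sum of the segments [0, v_s]."""
    return np.maximum(generators(channel) @ u, 0.0).sum()

def zonogon_subset(inner, outer, tol=1e-12):
    """Test Z(inner) within Z(outer) along the edge normals of Z(outer)."""
    for vx, vy in generators(outer):
        for normal in (np.array([-vy, vx]), np.array([vy, -vx])):
            if support(inner, normal) > support(outer, normal) + tol:
                return False
    return True

clean, noisy = bsc(0.10), bsc(0.30)
print(zonogon_vertices(clean))
print(zonogon_subset(noisy, clean), zonogon_subset(clean, noisy))
```

The noisier channel's zonogon sits inside the cleaner one's but not conversely, matching the garbling relation between the two channels.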

Appendix B.4. Nonnegativity Mechanism

For an atom $α$, define

κ_{\lor} (α) = \bigvee_{S \in α} κ_{S} .

The cumulative loss functional is

I_{\cup, f} (α; T) = i_{f} (p, κ_{V}) - i_{f} (p, κ_{\lor} (α)), \quad p = P (T = 1) .

Partial atoms are obtained by Möbius inversion on the synergy/loss lattice. The nonnegativity theorem follows from monotonicity of the channel join, monotonicity of $i_{f}$ under the Blackwell order, and a zonogon-based inclusion–exclusion inequality for binary-input channels. Hence, in the binary-target setting,

Δ I_{\cup, f} (α; T) \geq 0 \quad \forall α \in A (V) .

Appendix C. Finite-Depth Conditional Binarization

In this appendix we prove Proposition 1. We work on standard Borel spaces, so regular conditional distributions exist. Fix $n \geq 0$ and write

F_{n} = σ (X_{n}) .

Let

μ_{n} (ω, \cdot) = P (X_{n + 1} \in \cdot ∣ F_{n}) (ω)

be a regular conditional distribution. We assume that, for almost every $ω$, the conditional law $μ_{n} (ω, \cdot)$ is non-atomic on the cells generated below. This guarantees that each cell can be split into two subcells of equal conditional probability. It also avoids ambiguity from atoms or ties. In empirical implementations with ties, one must specify a deterministic or randomized tie-breaking rule; this is a numerical convention and is not part of the population statement proved here.

For each $k \geq 0$ and each binary word

β = (b_{1}, \dots, b_{k}) \in \{0, 1\}^{k},

let $C_{β}^{(k)} (ω)$ denote the corresponding depth-k cell. These cells are allowed to depend on $ω$ through the conditional law given $F_{n}$. At depth zero set

C_{\emptyset}^{(0)} = [0, 1] .

If $C_{β}^{(k)}$ has been defined, choose an $F_{n}$-measurable conditional median $m_{β}^{(k)}$ inside $C_{β}^{(k)}$ such that the two children

C_{β 0}^{(k + 1)} = C_{β}^{(k)} \cap (- \infty, m_{β}^{(k)}], \quad C_{β 1}^{(k + 1)} = C_{β}^{(k)} \cap (m_{β}^{(k)}, \infty)

have equal conditional probability given $F_{n}$ and $X_{n + 1} \in C_{β}^{(k)}$. Such measurable choices are standard under the non-atomic conditional-law assumption; alternatively, the argument may be read conditional on a fixed $ω$ outside a null set.

Define the cylinder events

E_{β}^{(k)} = \{B_{1}^{n + 1} = b_{1}, \dots, B_{k}^{n + 1} = b_{k}\} = \{X_{n + 1} \in C_{β}^{(k)}\} .

Proof of Proposition 1.

We first show by induction that

P (E_{β}^{(K)} ∣ F_{n}) = 2^{- K}

for every word $β \in \{0, 1\}^{K}$. For $K = 1$, this is exactly the first conditional median split. Assume it holds at depth K. For $β \in \{0, 1\}^{K}$ and $b \in \{0, 1\}$,

E_{β b}^{(K + 1)} \subseteq E_{β}^{(K)},

and the defining split yields

P (E_{β b}^{(K + 1)} ∣ F_{n}, E_{β}^{(K)}) = \frac{1}{2} .

Therefore

P (E_{β b}^{(K + 1)} ∣ F_{n}) = \frac{1}{2} \cdot 2^{- K} = 2^{- (K + 1)} .

This completes the induction.

For each i and $b \in \{0, 1\}$, the event $\{B_{i}^{n + 1} = b\}$ is the disjoint union of $2^{i - 1}$ cylinder events of length i, each of conditional probability $2^{- i}$. Hence

P (B_{i}^{n + 1} = b ∣ F_{n}) = \frac{1}{2} .

Finally, for any word $(b_{1}, \dots, b_{K})$,

P (B_{1}^{n + 1} = b_{1}, \dots, B_{K}^{n + 1} = b_{K} ∣ F_{n}) = 2^{- K}

and

\prod_{i = 1}^{K} P (B_{i}^{n + 1} = b_{i} ∣ F_{n}) = 2^{- K} .

Thus $B_{1}^{n + 1}, \dots, B_{K}^{n + 1}$ are conditionally independent given $F_{n}$. □
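As a worked instance of the two counting steps in the proof, take K = 2 and the second bit. The lines below only specialize the identities already established above, using the same cylinder-event notation.

```latex
% Marginal of the second bit: two cylinder events of length 2.
\{B_{2}^{n+1} = 1\} = E_{01}^{(2)} \cup E_{11}^{(2)},
\qquad
P(B_{2}^{n+1} = 1 \mid F_{n}) = 2 \cdot 2^{-2} = \tfrac{1}{2}.

% Joint law of one word versus the product of its marginals.
P(B_{1}^{n+1} = 0,\, B_{2}^{n+1} = 1 \mid F_{n})
  = P(E_{01}^{(2)} \mid F_{n})
  = \tfrac{1}{4}
  = \tfrac{1}{2} \cdot \tfrac{1}{2}.
```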

References

1. Mages, T.; Anastasiadi, E.; Rohner, C. Non-Negative Decomposition of Multivariate Information: From Minimum to Blackwell-Specific Information. Entropy 2024, 26, 424.
2. Williams, P.L.; Beer, R.D. Nonnegative decomposition of multivariate information. arXiv 2010, arXiv:1004.2515.
3. Lizier, J.T.; Bertschinger, N.; Jost, J.; Wibral, M. Information decomposition of target effects from multi-source interactions: Perspectives on previous, current and future work. Entropy 2018, 20, 307.
4. Harder, M.; Salge, C.; Polani, D. Bivariate Measure of Redundant Information. Phys. Rev. E 2013, 87, 012130.
5. Lizier, J.T.; Flecker, B.; Williams, P.L. Towards a Synergy-based Approach to Measuring Information Modification. In Proceedings of the 2013 IEEE Symposium on Artificial Life (ALife), Singapore, 16–19 April 2013; pp. 43–51.
6. Blackwell, D. Equivalent Comparisons of Experiments. Ann. Math. Stat. 1953, 24, 265–272.
7. Rosenblatt, M. Remarks on a Multivariate Transformation. Ann. Math. Stat. 1952, 23, 470–472.

Table 1. MAR-PID structure of the XOR example at resolution $δ = 2^{- k}$.

Atom $α$	Type	Aggregated Contribution
$\{\{V_{1}, V_{2}\}\}$	Synergistic	k
$\{\{V_{1}\}\}$	Unique	0
$\{\{V_{2}\}\}$	Unique	0
$\{\{V_{1}\}, \{V_{2}\}\}$	Redundant	0

Table 2. Population Shannon summaries for the mixed redundancy–synergy example without the optional source $V_{3}$. Information is measured in bits.

q	$I (A; V_{1})$	$I (A; V_{2})$	$I (A; V_{1}, V_{2})$	$I (A; V_{1}; V_{2})$
0	$1.000$	$1.000$	$1.000$	$1.000$
$0.25$	$0.456$	$0.456$	$0.741$	$0.172$
$0.50$	$0.189$	$0.189$	$0.656$	$- 0.278$
$0.75$	$0.046$	$0.046$	$0.697$	$- 0.605$
1	$0.000$	$0.000$	$1.000$	$- 1.000$

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
