Continuity of Channel Parameters and Operations under Various DMC Topologies

Rajai Nasser

doi:10.3390/e20050330

École Polytechnique Fédérale de Lausanne, Route Cantonale, 1015 Lausanne, Switzerland

This paper is an extended version of our paper that was published in the International Symposium on Information Theory 2017 (ISIT 2017).

Entropy 2018, 20(5), 330; https://doi.org/10.3390/e20050330

This article belongs to the Section Information Theory, Probability and Statistics

Abstract

We study the continuity of many channel parameters and operations under various topologies on the space of equivalent discrete memoryless channels (DMCs). We show that mutual information, channel capacity, the Bhattacharyya parameter, the probability of error of a fixed code, and the optimal probability of error for a given code rate and block length are continuous under various DMC topologies. We also show that channel operations such as sums, products, interpolations and Arıkan-style transformations are continuous.

Keywords:

continuity; channel capacity; topology

1. Introduction

This paper is an extended version of our paper that was published in the International Symposium on Information Theory 2017 (ISIT 2017) [1].

Let

X

and

Y

be two finite sets, and let W be a fixed channel with input alphabet

X

and output alphabet

Y

. It is well known that the input-output mutual information is continuous on the simplex of input probability distributions. Many other parameters that depend on the input probability distribution were shown to be continuous on the simplex in [2].

Polyanskiy studied in [3] the continuity of the Neyman–Pearson function for a binary hypothesis test that arises in the analysis of channel codes. He showed that for arbitrary input and output alphabets, this function is continuous in the input distribution in the total variation topology. He also showed that under some regularity assumptions, this function is continuous in the weak-∗ topology.

If

X

and

Y

are finite sets, the space of channels with input alphabet

X

and output alphabet

Y

can naturally be endowed with the topology of the Euclidean metric, or any other equivalent metric. It is well known that the channel capacity is continuous in this topology. If

X

and

Y

are arbitrary, one can construct a topology on the space of channels using the weak-∗ topology on the output alphabet. It was shown in [4] that the capacity is lower semi-continuous in this topology.

The continuity results that are mentioned in the previous paragraph do not take into account “equivalence” between channels. Two channels are said to be equivalent if they are degraded from each other. This means that each channel can be simulated from the other by local operations at the receiver. Two channels that are degraded from each other are completely equivalent from an operational point of view: both channels have exactly the same probability of error under optimal decoding for any fixed code. Moreover, any sub-optimal decoder for one channel can be transformed to a sub-optimal decoder for the other channel with the same probability of error and essentially the same computational complexity. This is why it makes sense, from an information-theoretic point of view, to identify equivalent channels and consider them as one point in the space of “equivalent channels”.

In [5], equivalent binary-input channels were identified with their L-density (i.e., the density of log-likelihood ratios). The space of equivalent binary-input channels was endowed with the topology of convergence in distribution of L-densities. Since the symmetric capacity (i.e., the input-output mutual information with a uniformly distributed input) and the Bhattacharyya parameter can each be written as the integral of a continuous function with respect to the L-density [5], it immediately follows that these parameters are continuous in the L-density topology.

In [6], many topologies were constructed for the space of equivalent channels sharing a fixed input alphabet. In this paper, we study the continuity of many channel parameters and operations under these topologies. The continuity of channel parameters and operations might be helpful in the following two problems:

If a parameter (such as the optimal probability of error of a given code) is difficult to compute for a channel W, one can approximate it by computing the same parameter for a sequence of channels ${(W_{n})}_{n \geq 0}$ that converges to W in some topology where the parameter is continuous.
The study of the robustness of a communication system against the imperfect specification of the channel.

In Section 2, we introduce the preliminaries for this paper. In Section 3, we recall the main results of [6] that we need here. In Section 4, we introduce the channel parameters and operations that we investigate in this paper. In Section 5, we study the continuity of these parameters and operations in the quotient topology of the space of equivalent channels with fixed input and output alphabets. The continuity in the strong topology of the space of equivalent channels sharing the same input alphabet is studied in Section 6. Finally, the continuity in the noisiness/weak-∗ and the total variation topologies is studied in Section 7.

2. Preliminaries

We assume that the reader is familiar with the basic concepts of General Topology. The main concepts and theorems that we need can be found in the Preliminaries Section of [6].

2.1. Set-Theoretic Notations

For every integer

n \geq 1

, we denote the set

{1, \dots, n}

as

[n]

.

The set of mappings from a set A to a set B is denoted as

B^{A}

.

Let A be a subset of B. The indicator mapping

𝟙_{A, B} : B \to {0, 1}

of A in B is defined as:

𝟙_{A, B} (x) = 𝟙_{x \in A} = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{otherwise} . \end{cases}

If the superset B is clear from the context, we simply write

𝟙_{A}

to denote the indicator mapping of A in B.

The power set of B is the set of subsets of B. Since every subset of B can be identified with its indicator mapping, we denote the power set of B as

{0, 1}^{B} = 2^{B}

.

Let

{(A_{i})}_{i \in I}

be a collection of arbitrary sets indexed by I. The disjoint union of

{(A_{i})}_{i \in I}

is defined as

\underset{i \in I}{∐} A_{i} = ⋃_{i \in I} (A_{i} \times {i})

. For every

i \in I

, the i-th canonical injection is the mapping

ϕ_{i} : A_{i} \to \underset{j \in I}{∐} A_{j}

defined as

ϕ_{i} (x_{i}) = (x_{i}, i)

. If no confusion can arise, we can identify

A_{i}

with

A_{i} \times {i}

through the canonical injection. Therefore, we can see

A_{i}

as a subset of

\underset{j \in I}{∐} A_{j}

for every

i \in I

.

Let R be an equivalence relation on a set T. For every

x \in T

, the set

\hat{x} = {y \in T : x R y}

is the R-equivalence class of x. The collection of R-equivalence classes, which we denote as

T / R

, forms a partition of T, and it is called the quotient space of T by R. The mapping

{Proj}_{R} : T \to T / R

defined as

{Proj}_{R} (x) = \hat{x}

for every

x \in T

is the projection mapping onto

T / R

.

2.2. Topological Notations

A topological space

(T, U)

is said to be contractible to

x_{0} \in T

if there exists a continuous mapping

H : T \times [0, 1] \to T

such that

H (x, 0) = x

and

H (x, 1) = x_{0}

for every

x \in T

, where

[0, 1]

is endowed with the Euclidean topology.

(T, U)

is strongly contractible to

x_{0} \in T

if we also have

H (x_{0}, t) = x_{0}

for every

t \in [0, 1]

.

Intuitively, T is contractible if it can be “continuously shrunk” to a single point

x_{0}

. If this “continuous shrinking” can be done without moving

x_{0}

, T is strongly contractible.

Note that contractibility is a very strong notion of connectedness: every contractible space is path-connected and simply connected. Moreover, all its homotopy, homology and cohomology groups of order

\geq 1

are zero.

Let

{(T_{i}, U_{i})}_{i \in I}

be a collection of topological spaces indexed by I. The product topology on

\prod_{i \in I} T_{i}

is denoted by

⨂_{i \in I} U_{i}

. The disjoint union topology on

\underset{i \in I}{∐} T_{i}

is denoted by

⨁_{i \in I} U_{i}

.

The following lemma is useful to show the continuity of many functions.

Lemma 1.

Let

(S, V)

and

(T, U)

be two compact topological spaces, and let

f : S \times T \to R

be a continuous function on

S \times T

. For every

s \in S

and every

ϵ > 0

, there exists a neighborhood

V_{s}

of s such that for every

s^{'} \in V_{s}

, we have:

sup_{t \in T} | f (s^{'}, t) - f (s, t) | \leq ϵ .

Proof.

See Appendix A. ☐

2.3. Quotient Topology

Let

(T, U)

be a topological space, and let R be an equivalence relation on T. The quotient topology on

T / R

is the finest topology that makes the projection mapping

{Proj}_{R}

continuous. It is given by:

U / R = \{\hat{U} \subset T / R : {Proj}_{R}^{- 1} (\hat{U}) \in U\} .

Lemma 2.

Let

f : T \to S

be a continuous mapping from

(T, U)

to

(S, V)

. If

f (x) = f (x^{'})

for every

x, x^{'} \in T

satisfying

x R x^{'}

, then we can define a transcendent mapping

f : T / R \to S

such that

f (\hat{x}) = f (x^{'})

for any

x^{'} \in \hat{x}

. f is well defined on

T / R

. Moreover, f is a continuous mapping from

(T / R, U / R)

to

(S, V)

.

Let

(T, U)

and

(S, V)

be two topological spaces, and let R be an equivalence relation on T. Consider the equivalence relation

R^{'}

on

T \times S

defined as

(x_{1}, y_{1}) R^{'} (x_{2}, y_{2})

if and only if

x_{1} R x_{2}

and

y_{1} = y_{2}

. A natural question to ask is whether the canonical bijection between

((T / R) \times S, (U / R) \otimes V)

and

((T \times S) / R^{'}, (U \otimes V) / R^{'})

is a homeomorphism. It turns out that this is not the case in general. The following theorem, which is widely used in Algebraic Topology, provides a sufficient condition:

Theorem 1.

[7] If

(S, V)

is locally compact and Hausdorff, then the canonical bijection between

((T / R) \times S

,

(U / R) \otimes V)

and

((T \times S) / R^{'}, (U \otimes V) / R^{'})

is a homeomorphism.

Corollary 1.

Let

(T, U)

and

(S, V)

be two topological spaces, and let

R_{T}

and

R_{S}

be two equivalence relations on T and S, respectively. Define the equivalence relation R on

T \times S

as

(x_{1}, y_{1}) R (x_{2}, y_{2})

if and only if

x_{1} R_{T} x_{2}

and

y_{1} R_{S} y_{2}

. If

(S, V)

and

(T / R_{T}, U / R_{T})

are locally compact and Hausdorff, then the canonical bijection between

((T / R_{T}) \times (S / R_{S}), (U / R_{T}) \otimes (V / R_{S}))

and

((T \times S) / R, (U \otimes V) / R)

is a homeomorphism.

Proof.

We just need to apply Theorem 1 twice. Define the equivalence relation

R_{T}^{'}

on

T \times S

as follows:

(x_{1}, y_{1}) R_{T}^{'} (x_{2}, y_{2})

if and only if

x_{1} R_{T} x_{2}

and

y_{1} = y_{2}

. Since

(S, V)

is locally compact and Hausdorff, Theorem 1 implies that the canonical bijection from

((T / R_{T}) \times S, (U / R_{T}) \otimes V)

to

((T \times S) / R_{T}^{'}, (U \otimes V) / R_{T}^{'})

is a homeomorphism. Let us identify these two spaces through the canonical bijection.

Now, define the equivalence relation

R_{S}^{'}

on

(T / R_{T}) \times S

as follows:

({\hat{x}}_{1}, y_{1}) R_{S}^{'} ({\hat{x}}_{2}, y_{2})

if and only if

{\hat{x}}_{1} = {\hat{x}}_{2}

and

y_{1} R_{S} y_{2}

. Since

(T / R_{T}, U / R_{T})

is locally compact and Hausdorff, Theorem 1 implies that the canonical bijection from

((T / R_{T}) \times (S / R_{S}), (U / R_{T}) \otimes (V / R_{S}))

to

(((T / R_{T}) \times S) / R_{S}^{'}, ((U / R_{T}) \otimes V) / R_{S}^{'})

is a homeomorphism.

Since we identified

((T / R_{T}) \times S, (U / R_{T}) \otimes V)

and

((T \times S) / R_{T}^{'}, (U \otimes V) / R_{T}^{'})

through the canonical bijection (which is a homeomorphism),

R_{S}^{'}

can be seen as an equivalence relation on

((T \times S) / R_{T}^{'}, (U \otimes V) / R_{T}^{'})

. It is easy to see that the canonical bijection from

(((T \times S) / R_{T}^{'}) / R_{S}^{'}, ((U \otimes V) / R_{T}^{'}) / R_{S}^{'})

to

((T \times S) / R, (U \otimes V) / R)

is a homeomorphism. We conclude that the canonical bijection from

((T / R_{T}) \times (S / R_{S}), (U / R_{T}) \otimes (V / R_{S}))

to

((T \times S) / R, (U \otimes V) / R)

is a homeomorphism. ☐

2.4. Measure-Theoretic Notations

If

(M, Σ)

is a measurable space, we denote the set of probability measures on

(M, Σ)

as

P (M, Σ)

. If the

σ

-algebra

Σ

is known from the context, we simply write

P (M)

to denote the set of probability measures.

If

P \in P (M, Σ)

and

{x}

is a measurable singleton, we simply write

P (x)

to denote

P ({x})

.

For every

P_{1}, P_{2} \in P (M, Σ)

, the total variation distance between

P_{1}

and

P_{2}

is defined as:

∥ P_{1} - P_{2} ∥_{T V} = sup_{A \in Σ} | P_{1} (A) - P_{2} (A) | .

The push-forward probability measure:
Let P be a probability measure on $(M, Σ)$ , and let $f : M \to M^{'}$ be a measurable mapping from $(M, Σ)$ to another measurable space $(M^{'}, Σ^{'})$ . The push-forward probability measure of P by f is the probability measure $f_{#} P$ on $(M^{'}, Σ^{'})$ defined as $(f_{#} P) (A^{'}) = P (f^{- 1} (A^{'}))$ for every $A^{'} \in Σ^{'}$ .
A measurable mapping $g : M^{'} \to R$ is integrable with respect to $f_{#} P$ if and only if $g \circ f$ is integrable with respect to P. Moreover,

$\int_{M^{'}} g \cdot d (f_{#} P) = \int_{M} (g \circ f) \cdot d P .$

The mapping $f_{#}$ from $P (M, Σ)$ to $P (M^{'}, Σ^{'})$ is continuous if these spaces are endowed with the total variation topology:

$∥ f_{#} P - f_{#} P^{'} ∥_{T V} \overset{(a)}{\leq} ∥ P - P^{'} ∥_{T V},$

where (a) follows from Property 1 of [8].
Probability measures on finite sets:
We always endow finite sets with their finest $σ$ -algebra, i.e., the power set. In this case, every probability measure is completely determined by its value on singletons, i.e., if P is a probability measure on a finite set $X$ , then for every $A \subset X$ , we have:

$P (A) = \sum_{x \in A} P (x) .$

If $X$ is a finite set, we denote the set of probability distributions on $X$ as $Δ_{X}$ . Note that $Δ_{X}$ is an $(| X | - 1)$ -dimensional simplex in $R^{X}$ . We always endow $Δ_{X}$ with the total variation distance and its induced topology. For every $p_{1}, p_{2} \in Δ_{X}$ , we have:

$∥ p_{1} - p_{2} ∥_{T V} = \frac{1}{2} \sum_{x \in X} | p_{1} (x) - p_{2} (x) | = \frac{1}{2} {∥ p_{1} - p_{2} ∥}_{1} .$
Products of probability measures:
We denote the product of two measurable spaces $(M_{1}, Σ_{1})$ and $(M_{2}, Σ_{2})$ as $(M_{1} \times M_{2}, Σ_{1} \otimes Σ_{2})$ . If $P_{1} \in P (M_{1}, Σ_{1})$ and $P_{2} \in P (M_{2}, Σ_{2})$ , we denote the product of $P_{1}$ and $P_{2}$ as $P_{1} \times P_{2}$ .
If $P (M_{1}, Σ_{1})$ , $P (M_{2}, Σ_{2})$ and $P (M_{1} \times M_{2}, Σ_{1} \otimes Σ_{2})$ are endowed with the total variation topology, the mapping $(P_{1}, P_{2}) \to P_{1} \times P_{2}$ is a continuous mapping (see Appendix B).
Borel sets and the support of a probability measure:
Let $(T, U)$ be a Hausdorff topological space. The Borel $σ$ -algebra of $(T, U)$ is the $σ$ -algebra generated by $U$ . We denote the Borel $σ$ -algebra of $(T, U)$ as $B (T, U)$ . If the topology $U$ is known from the context, we simply write $B (T)$ to denote the Borel $σ$ -algebra. The sets in $B (T)$ are called the Borel sets of T.

The support of a measure

P \in P (T, B (T))

is the set of all points

x \in T

for which every neighborhood has a strictly positive measure:

supp (P) = \{x \in T : P (O) > 0 \text{ for every neighborhood } O \text{ of } x\} .

If P is a probability measure on a Polish space, then

P (T \ supp (P)) = 0

.
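As a small numerical illustration of the finite-set identity stated earlier in this section, the following Python sketch compares the supremum definition of the total variation distance, evaluated by brute force over all subsets, with half of the ℓ1 distance. The function names and the two example distributions are assumptions made for illustration only.

```python
from itertools import chain, combinations


def tv_sup(p1, p2):
    """Brute-force sup over all subsets A of |P1(A) - P2(A)|."""
    indices = range(len(p1))
    subsets = chain.from_iterable(
        combinations(indices, k) for k in range(len(p1) + 1)
    )
    return max(abs(sum(p1[i] - p2[i] for i in A)) for A in subsets)


def tv_half_l1(p1, p2):
    """Half of the l1 distance between two distributions on a finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))


# Two hypothetical distributions on a three-element set.
p1 = [0.5, 0.3, 0.2]
p2 = [0.2, 0.3, 0.5]
d_sup = tv_sup(p1, p2)
d_l1 = tv_half_l1(p1, p2)
```

The brute-force version enumerates all subsets, so it is only meant as a check on very small alphabets; the ℓ1 form is the one to use in practice.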

2.5. Random Mappings

Let M and

M^{'}

be two arbitrary sets, and let

Σ^{'}

be a

σ

-algebra on

M^{'}

. A random mapping from M to

(M^{'}, Σ^{'})

is a mapping R from M to

P (M^{'}, Σ^{'})

. For every

x \in M

,

R (x)

can be interpreted as the probability distribution of the random output given that the input is x.

Let

Σ

be a

σ

-algebra on M. We say that R is a measurable random mapping from

(M, Σ)

to

(M^{'}, Σ^{'})

if the mapping

R_{B} : M \to R

defined as

R_{B} (x) = (R (x)) (B)

is measurable for every

B \in Σ^{'}

.

Note that this definition of measurability is consistent with the measurability of ordinary mappings: let f be a mapping from M to

M^{'}

, and let

D_{f} : M \to P (M^{'}, Σ^{'})

be the random mapping defined as

D_{f} (x) = δ_{f (x)}

for every

x \in M

, where

δ_{f (x)} \in P (M^{'}, Σ^{'})

is a Dirac measure centered at

f (x)

. We have:

\begin{aligned} D_{f} \text{ is measurable} & \Leftrightarrow {(D_{f})}_{B} \text{ is measurable}, \forall B \in Σ^{'} \\ & \Leftrightarrow {({(D_{f})}_{B})}^{- 1} (B^{'}) \in Σ, \forall B^{'} \in B (R), \forall B \in Σ^{'} \\ & \overset{(a)}{\Leftrightarrow} {({(D_{f})}_{B})}^{- 1} (\{1\}) \in Σ, \forall B \in Σ^{'} \\ & \overset{(b)}{\Leftrightarrow} f^{- 1} (B) \in Σ, \forall B \in Σ^{'} \\ & \Leftrightarrow f \text{ is measurable}, \end{aligned}

where (a) and (b) follow from the fact that

({(D_{f})}_{B}) (x)

is either one or zero, depending on whether

f (x) \in B

or not.

Let P be a probability measure on

(M, Σ)

, and let R be a measurable random mapping from

(M, Σ)

to

(M^{'}, Σ^{'})

. The push-forward probability measure of P by R is the probability measure

R_{#} P

on

(M^{'}, Σ^{'})

defined as:

(R_{#} P) (B) = \int_{M} R_{B} \cdot d P, \forall B \in Σ^{'} .

Note that this definition is consistent with the push-forward of ordinary mappings: if f and

D_{f}

are as above, then for every

B \in Σ^{'}

, we have:

({(D_{f})}_{#} P) (B) = \int_{M} {(D_{f})}_{B} \cdot d P = \int_{M} (𝟙_{B} \circ f) \cdot d P = \int_{M^{'}} 𝟙_{B} \cdot d (f_{#} P) = (f_{#} P) (B) .

Proposition 1.

Let R be a measurable random mapping from

(M, Σ)

to

(M^{'}, Σ^{'})

. If

g : M^{'} \to R^{+} \cup {+ \infty}

is a

Σ^{'}

-measurable mapping, then the mapping

x \to \int_{M^{'}} g (y) \cdot d (R (x)) (y)

is a measurable mapping from

(M, Σ)

to

R^{+} \cup {+ \infty}

. Moreover, for every

P \in P (M, Σ)

, we have:

\int_{M^{'}} g \cdot d (R_{#} P) = \int_{M} (\int_{M^{'}} g (y) \cdot d (R (x)) (y)) d P (x) .

Proof.

See Appendix C. ☐

Corollary 2.

If

g : M^{'} \to R

is bounded and

Σ^{'}

-measurable, then the mapping:

x \to \int_{M^{'}} g (y) \cdot d (R (x)) (y)

is bounded and Σ-measurable. Moreover, for every

P \in P (M, Σ)

, we have:

\int_{M^{'}} g \cdot d (R_{#} P) = \int_{M} (\int_{M^{'}} g (y) \cdot d (R (x)) (y)) d P (x) .

Proof.

Write

g = g^{+} - g^{-}

(where

g^{+} = max {g, 0}

and

g^{-} = max {- g, 0}

), and use the fact that every bounded measurable function is integrable over any probability distribution. ☐

Lemma 3.

For every measurable random mapping R from

(M, Σ)

to

(M^{'}, Σ^{'})

, the push-forward mapping

R_{#}

is continuous from

P (M, Σ)

to

P (M^{'}, Σ^{'})

under the total variation topology.

Proof.

See Appendix D. ☐
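When M and M' are finite, a random mapping is a row-stochastic matrix and the push-forward reduces to a vector-matrix product. The sketch below assumes this finite setting; the names `push_forward` and `tv`, and the example matrix, are hypothetical. It also checks numerically that the push-forward does not increase the total variation distance on one example, which is consistent with the continuity stated in Lemma 3 but is of course not a proof of it.

```python
def push_forward(P, R):
    """(R_# P)(y) = sum_x P(x) * R(x)(y) for finite alphabets.

    P is a list of probabilities on M; R[x][y] is the probability that
    the random mapping sends x to y.
    """
    return [
        sum(P[x] * R[x][y] for x in range(len(P)))
        for y in range(len(R[0]))
    ]


def tv(p1, p2):
    """Total variation distance on a finite set (half the l1 distance)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))


# Hypothetical random mapping from a 2-element set to a 2-element set.
R = [[0.9, 0.1],
     [0.2, 0.8]]
P1 = [0.5, 0.5]
P2 = [0.9, 0.1]

Q1 = push_forward(P1, R)
Q2 = push_forward(P2, R)
tv_before = tv(P1, P2)
tv_after = tv(Q1, Q2)
```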

Lemma 4.

Let

U

be a Polish topology on M (this assumption can be dropped; we assume that

U

is Polish only to avoid working with Moore–Smith nets), and let

U^{'}

be an arbitrary topology on

M^{'}

. Let R be a measurable random mapping from

(M, B (M))

to

(M^{'}, B (M^{'}))

. Moreover, assume that R is a continuous mapping from

(M, U)

to

P (M^{'}, B (M^{'}))

when the latter space is endowed with the weak-∗ topology. Under these assumptions, the push-forward mapping

R_{#}

is continuous from

P (M, B (M))

to

P (M^{'}, B (M^{'}))

under the weak-∗ topology.

Proof.

See Appendix D. ☐

2.6. Meta-Probability Measures

Let

X

be a finite set. A meta-probability measure on

X

is a probability measure on the Borel sets of

Δ_{X}

. It is called a meta-probability measure because it is a probability measure on the space of probability distributions on

X

.

We denote the set of meta-probability measures on

X

as

MP (X)

. Clearly,

MP (X) = P (Δ_{X})

.

A meta-probability measure MP on

X

is said to be balanced if it satisfies:

\int_{Δ_{X}} p \cdot d MP (p) = π_{X},

where

π_{X}

is the uniform probability distribution on

X

.

We denote the set of all balanced meta-probability measures on

X

as

{MP}_{b} (X)

. The set of all balanced and finitely-supported meta-probability measures on

X

is denoted as

{MP}_{b f} (X)

.

The following lemma is useful to show the continuity of functions defined on

MP (X)

.

Lemma 5.

Let

(S, V)

be a compact topological space, and let

f : S \times Δ_{X} \to R

be a continuous function on

S \times Δ_{X}

. The mapping

F : S \times MP (X) \to R

defined as:

F (s, MP) = \int_{Δ_{X}} f (s, p) \cdot d MP (p)

is continuous, where

MP (X)

is endowed with the weak-∗ topology.

Proof.

See Appendix E. ☐

Let f be a mapping from a finite set

X

to another finite set

X^{'}

. f induces a push-forward mapping

f_{#}

taking probability distributions in

Δ_{X}

to probability distributions in

Δ_{X^{'}}

.

f_{#}

is continuous because

Δ_{X}

and

Δ_{X^{'}}

are endowed with the total variation distance.

f_{#}

in turn induces another push-forward mapping taking meta-probability measures in

MP (X)

to meta-probability measures in

MP (X^{'})

. We denote this mapping as

f_{# #}

, and we call it the meta-push-forward mapping induced by f. Since

f_{#}

is a continuous mapping from

Δ_{X}

to

Δ_{X^{'}}

,

f_{# #}

is a continuous mapping from

MP (X)

to

MP (X^{'})

under both the weak-∗ and the total variation topologies.

Let

X_{1}

and

X_{2}

be two finite sets. Let

Mul : Δ_{X_{1}} \times Δ_{X_{2}} \to Δ_{X_{1} \times X_{2}}

be defined as

Mul (p_{1}, p_{2}) = p_{1} \times p_{2}

. For every

{MP}_{1} \in MP (X_{1})

and

{MP}_{2} \in MP (X_{2})

, we define the tensor product of

{MP}_{1}

and

{MP}_{2}

as

{MP}_{1} \otimes {MP}_{2} = {Mul}_{#} ({MP}_{1} \times {MP}_{2}) \in MP (X_{1} \times X_{2})

.

Note that since

Δ_{X_{1}}

,

Δ_{X_{2}}

and

Δ_{X_{1} \times X_{2}}

are endowed with the total variation topology,

Mul (p_{1}, p_{2}) = p_{1} \times p_{2}

is a continuous mapping from

Δ_{X_{1}} \times Δ_{X_{2}}

to

Δ_{X_{1} \times X_{2}}

. Therefore,

{Mul}_{#}

is a continuous mapping from

P (Δ_{X_{1}} \times Δ_{X_{2}})

to

P (Δ_{X_{1} \times X_{2}}) = MP (X_{1} \times X_{2})

under both the weak-∗ and the total variation topologies. On the other hand, Appendix B and Appendix F imply that the mapping

({MP}_{1}, {MP}_{2}) \to {MP}_{1} \times {MP}_{2}

from

MP (X_{1}) \times MP (X_{2})

to

P (Δ_{X_{1}} \times Δ_{X_{2}})

is continuous under both the weak-∗ and the total variation topologies. We conclude that the tensor product is continuous under both of these topologies.
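For finitely supported meta-probability measures, the tensor product defined above can be computed directly. The following Python sketch is one possible illustration, under the assumption that a finitely supported meta-probability measure is stored as a dictionary mapping each support point (a tuple of probabilities) to its mass; this representation and the function name are not taken from the paper. Exact rational arithmetic is used so that coinciding support points are merged reliably.

```python
from fractions import Fraction
from itertools import product


def tensor_product(mp1, mp2):
    """Push the product measure mp1 x mp2 forward through Mul.

    Each support point (p1, p2) is mapped to the product distribution
    p1 x p2 on X1 x X2, flattened in row-major order.
    """
    out = {}
    for (p1, m1), (p2, m2) in product(mp1.items(), mp2.items()):
        joint = tuple(a * b for a in p1 for b in p2)
        out[joint] = out.get(joint, 0) + m1 * m2
    return out


F = Fraction
# Hypothetical balanced meta-probability measure on a binary alphabet.
mp = {
    (F(9, 10), F(1, 10)): F(1, 2),
    (F(1, 10), F(9, 10)): F(1, 2),
}
mp_sq = tensor_product(mp, mp)

# Mean of the tensor product, coordinate by coordinate.
mean = [sum(m * p[i] for p, m in mp_sq.items()) for i in range(4)]
```

In this example, the resulting measure has four support points of mass 1/4 each, and its mean is the uniform distribution on the product alphabet.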

3. The Space of Equivalent Channels

In this section, we summarize the main results of [6].

3.1. Space of Channels from $X$ to $Y$

A discrete memoryless channel W is a three-tuple

W = (X, Y, p_{W})

where

X

is a finite set that is called the input alphabet of W,

Y

is a finite set that is called the output alphabet of W and

p_{W} : X \times Y \to [0, 1]

is a function satisfying

\forall x \in X, \sum_{y \in Y} p_{W} (x, y) = 1

.

For every

(x, y) \in X \times Y

, we denote

p_{W} (x, y)

as

W (y | x)

, which we interpret as the conditional probability of receiving y at the output, given that x is the input.

Let

{DMC}_{X, Y}

be the set of all channels having

X

as the input alphabet and

Y

as the output alphabet.

For every

W, W^{'} \in {DMC}_{X, Y}

, define the distance between W and

W^{'}

as follows:

d_{X, Y} (W, W^{'}) = \frac{1}{2} max_{x \in X} \sum_{y \in Y} | W^{'} (y | x) - W (y | x) | .

We always endow

{DMC}_{X, Y}

with the metric distance

d_{X, Y}

. This metric makes

{DMC}_{X, Y}

a compact path-connected metric space. The metric topology on

{DMC}_{X, Y}

that is induced by

d_{X, Y}

is denoted as

T_{X, Y}

.
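The distance above is straightforward to evaluate numerically. The following Python sketch assumes that a channel is stored as a nested list with `W[x][y] = W(y|x)`; the helper `bsc` (a binary symmetric channel) and the function names are illustrative choices, not notation from the paper.

```python
def channel_distance(W1, W2):
    """d_{X,Y}(W, W') = 1/2 * max_x sum_y |W'(y|x) - W(y|x)|."""
    if len(W1) != len(W2) or any(
        len(r1) != len(r2) for r1, r2 in zip(W1, W2)
    ):
        raise ValueError("channels must share input and output alphabets")
    return 0.5 * max(
        sum(abs(a - b) for a, b in zip(row1, row2))
        for row1, row2 in zip(W1, W2)
    )


def bsc(eps):
    """Binary symmetric channel with crossover probability eps."""
    return [[1 - eps, eps], [eps, 1 - eps]]


d = channel_distance(bsc(0.1), bsc(0.25))
```

For two binary symmetric channels, the distance reduces to the absolute difference of the crossover probabilities, which is 0.15 in this example.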

3.2. Equivalence between Channels

Let

W \in {DMC}_{X, Y}

and

W^{'} \in {DMC}_{X, Z}

be two channels having the same input alphabet. We say that

W^{'}

is degraded from W if there exists a channel

V \in {DMC}_{Y, Z}

such that:

W^{'} (z | x) = \sum_{y \in Y} V (z | y) W (y | x) .

W and

W^{'}

are said to be equivalent if each one is degraded from the other.
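In matrix terms, degradation is a product of row-stochastic matrices. The sketch below assumes the representation `W[x][y] = W(y|x)` and uses hypothetical names (`compose`, `split`, `merge`). It first concatenates two binary symmetric channels, and then shows a simple pair of equivalent channels: splitting an output symbol at random and merging the two copies back are both operations at the receiver, so each channel is degraded from the other.

```python
def compose(W, V):
    """Return W' with W'(z|x) = sum_y V(z|y) W(y|x)."""
    n_mid, n_out = len(V), len(V[0])
    return [
        [sum(W_x[y] * V[y][z] for y in range(n_mid)) for z in range(n_out)]
        for W_x in W
    ]


def bsc(eps):
    """Binary symmetric channel with crossover probability eps."""
    return [[1 - eps, eps], [eps, 1 - eps]]


# A BSC(0.1) followed by a BSC(0.2) is degraded from the BSC(0.1).
W_degraded = compose(bsc(0.1), bsc(0.2))

# Splitting output 0 into two equally likely copies ...
split = [[0.5, 0.5, 0.0],
         [0.0, 0.0, 1.0]]
W_split = compose(bsc(0.1), split)

# ... and merging the copies back recovers the original channel.
merge = [[1.0, 0.0],
         [1.0, 0.0],
         [0.0, 1.0]]
W_back = compose(W_split, merge)
```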

Let

Δ_{X}

and

Δ_{Y}

be the space of probability distributions on

X

and

Y

, respectively. Define

P_{W}^{o} \in Δ_{Y}

as

P_{W}^{o} (y) = \frac{1}{| X |} \sum_{x \in X} W (y | x)

for every

y \in Y

. The image of W is the set of output-symbols

y \in Y

having strictly positive probabilities:

Im (W) = {y \in Y : P_{W}^{o} (y) > 0} .

For every

y \in Im (W)

, define

W_{y}^{- 1} \in Δ_{X}

as follows:

W_{y}^{- 1} (x) = \frac{W (y | x)}{| X | P_{W}^{o} (y)}, \forall x \in X .

For every

(x, y) \in X \times Im (W)

, we have

W (y | x) = | X | P_{W}^{o} (y) W_{y}^{- 1} (x)

. On the other hand, if

x \in X

and

y \in Y \ Im (W)

, we have

W (y | x) = 0

. This shows that

P_{W}^{o}

and the collection

{W_{y}^{- 1}}_{y \in Im (W)}

uniquely determine W.

The Blackwell measure (denoted

{MP}_{W}

) of W is a meta-probability measure on

X

defined as:

{MP}_{W} = \sum_{y \in Im (W)} P_{W}^{o} (y) δ_{W_{y}^{- 1}},

where

δ_{W_{y}^{- 1}}

is a Dirac measure centered at

W_{y}^{- 1}

. In an earlier version of this work, I called

{MP}_{W}

the posterior meta-probability distribution of W. Maxim Raginsky kindly brought to my attention the fact that

{MP}_{W}

is called the Blackwell measure.

It is known that a meta-probability measure MP on

X

is the Blackwell measure of some discrete memoryless channel (DMC) with input alphabet

X

if and only if it is balanced and finitely supported [9].

It is also known that two channels

W \in {DMC}_{X, Y}

and

W^{'} \in {DMC}_{X, Z}

are equivalent if and only if

{MP}_{W} = {MP}_{W^{'}}

[9].
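The definitions of this subsection can be traced on a small example. The following Python sketch computes the Blackwell measure of a channel stored as `W[x][y] = W(y|x)` and uses it to test equivalence of two channels. The dictionary representation of the measure, the function name and the example channels are assumptions made for illustration; exact rational arithmetic is used so that equal posteriors are recognized as the same support point.

```python
from collections import defaultdict
from fractions import Fraction


def blackwell_measure(W):
    """Return {posterior W_y^{-1} (as a tuple): total mass P_W^o(y)}."""
    n_in = len(W)
    measure = defaultdict(Fraction)
    for y in range(len(W[0])):
        p_out = sum(W[x][y] for x in range(n_in)) / n_in  # P_W^o(y)
        if p_out == 0:
            continue  # y is not in Im(W)
        posterior = tuple(W[x][y] / (n_in * p_out) for x in range(n_in))
        measure[posterior] += p_out
    return dict(measure)


F = Fraction
# A binary symmetric channel with crossover probability 1/10 ...
W = [[F(9, 10), F(1, 10)],
     [F(1, 10), F(9, 10)]]
# ... and the same channel with output 0 split into two copies.
W_split = [[F(9, 20), F(9, 20), F(1, 10)],
           [F(1, 20), F(1, 20), F(9, 10)]]

mp_W = blackwell_measure(W)
mp_split = blackwell_measure(W_split)
same_class = (mp_W == mp_split)
support_size = len(mp_W)
```

Here the two channels have different output alphabets but the same Blackwell measure, which is balanced and supported on two points.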

3.3. Space of Equivalent Channels from $X$ to $Y$

Let

X

and

Y

be two finite sets. Define the equivalence relation

R_{X, Y}^{(o)}

on

{DMC}_{X, Y}

as follows:

\forall W, W^{'} \in {DMC}_{X, Y}, W R_{X, Y}^{(o)} W^{'} \Leftrightarrow W \text{ is equivalent to } W^{'} .

The space of equivalent channels with input alphabet

X

and output alphabet

Y

is the quotient of

{DMC}_{X, Y}

by the equivalence relation:

{DMC}_{X, Y}^{(o)} = {DMC}_{X, Y} / R_{X, Y}^{(o)} .

Quotient topology:

We define the topology

T_{X, Y}^{(o)}

on

{DMC}_{X, Y}^{(o)}

as the quotient topology

T_{X, Y} / R_{X, Y}^{(o)}

. We always associate

{DMC}_{X, Y}^{(o)}

with the quotient topology

T_{X, Y}^{(o)}

.

We have shown in [6] that

{DMC}_{X, Y}^{(o)}

is a compact, path-connected and metrizable space.

If

Y_{1}

and

Y_{2}

are two finite sets of the same size, there exists a canonical homeomorphism between

{DMC}_{X, Y_{1}}^{(o)}

and

{DMC}_{X, Y_{2}}^{(o)}

[6]. This allows us to identify

{DMC}_{X, Y}^{(o)}

with

{DMC}_{X, [n]}^{(o)}

, where

n = | Y |

and

[n] = {1, \dots, n}

.

Moreover, for every

1 \leq n \leq m

, there exists a canonical subspace of

{DMC}_{X, [m]}^{(o)}

that is homeomorphic to

{DMC}_{X, [n]}^{(o)}

[6]. Therefore, we can consider

{DMC}_{X, [n]}^{(o)}

as a compact subspace of

{DMC}_{X, [m]}^{(o)}

.

Noisiness metric:

For every

m \geq 1

, let

Δ_{[m] \times X}

be the space of probability distributions on

[m] \times X

. Let

Y

be a finite set, and let

W \in {DMC}_{X, Y}

. For every

p \in Δ_{[m] \times X}

, define

P_{c} (p, W)

as follows:

P_{c} (p, W) = \sup_{D \in {DMC}_{Y, [m]}} \sum_{\substack{u \in [m], \\ x \in X, \\ y \in Y}} p (u, x) W (y | x) D (u | y) .

The quantity

P_{c} (p, W)

depends only on the

R_{X, Y}^{(o)}

-equivalence class of W (see [6]). Therefore, if

\hat{W} \in {DMC}_{X, Y}^{(o)}

, we can define

P_{c} (p, \hat{W}) : = P_{c} (p, W^{'})

for any

W^{'} \in \hat{W}

.
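The supremum in the definition of P_c can be evaluated in closed form: the objective is linear in D, and D(·|y) ranges over a simplex independently for each y, so the supremum is attained by a deterministic decoder that picks, for each y, a value of u maximizing \sum_{x} p (u, x) W (y | x). The Python sketch below uses this observation; the array layout `p[u][x]`, `W[x][y]` and the example values are assumptions for illustration.

```python
def p_c(p, W):
    """P_c(p, W) = sum_y max_u sum_x p(u, x) W(y|x)."""
    n_in, n_out = len(W), len(W[0])
    return sum(
        max(sum(p_u[x] * W[x][y] for x in range(n_in)) for p_u in p)
        for y in range(n_out)
    )


# Hypothetical example with m = 2: u is uniform and x = u.
p = [[0.5, 0.0],
     [0.0, 0.5]]

W_bsc = [[0.9, 0.1], [0.1, 0.9]]
# The same channel with output 0 split into two equally likely copies.
W_split = [[0.45, 0.45, 0.1], [0.05, 0.05, 0.9]]
# A channel whose output is independent of its input.
W_useless = [[0.5, 0.5], [0.5, 0.5]]

pc_bsc = p_c(p, W_bsc)
pc_split = p_c(p, W_split)
pc_useless = p_c(p, W_useless)
```

In this example, the two equivalent channels give the same value, in line with the statement that P_c depends only on the equivalence class.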

Define the noisiness distance

d_{X, Y}^{(o)} : {DMC}_{X, Y}^{(o)} \times {DMC}_{X, Y}^{(o)} \to R^{+}

as follows:

d_{X, Y}^{(o)} ({\hat{W}}_{1}, {\hat{W}}_{2}) = \sup_{\substack{m \geq 1, \\ p \in Δ_{[m] \times X}}} | P_{c} (p, {\hat{W}}_{1}) - P_{c} (p, {\hat{W}}_{2}) | .

We have shown in [6] that

({DMC}_{X, Y}^{(o)}, T_{X, Y}^{(o)})

is topologically equivalent to

({DMC}_{X, Y}^{(o)}, d_{X, Y}^{(o)})

.

3.4. Space of Equivalent Channels with Input Alphabet $X$

The space of channels with input alphabet

X

is defined as:

{DMC}_{X, *} = \underset{n \geq 1}{∐} {DMC}_{X, [n]} .

We define the equivalence relation

R_{X, *}^{(o)}

on

{DMC}_{X, *}

as follows:

\forall W, W^{'} \in {DMC}_{X, *}, W R_{X, *}^{(o)} W^{'} \Leftrightarrow W \text{ is equivalent to } W^{'} .

The space of equivalent channels with input alphabet

X

is the quotient of

{DMC}_{X, *}

by the equivalence relation:

{DMC}_{X, *}^{(o)} = {DMC}_{X, *} / R_{X, *}^{(o)} .

For every

n \geq 1

and every

W \in {DMC}_{X, [n]}

, we identify the

R_{X, [n]}^{(o)}

-equivalence class of W with the

R_{X, *}^{(o)}

-equivalence class of it. This allows us to consider

{DMC}_{X, [n]}^{(o)}

as a subspace of

{DMC}_{X, *}^{(o)}

. Moreover,

{DMC}_{X, *}^{(o)} = ⋃_{n \geq 1} {DMC}_{X, [n]}^{(o)} .

Since any two equivalent channels have the same Blackwell measure, we can define the Blackwell measure of

\hat{W} \in {DMC}_{X, *}^{(o)}

as

{MP}_{\hat{W}} = {MP}_{W^{'}}

for any

W^{'} \in \hat{W}

. The rank of

\hat{W} \in {DMC}_{X, *}^{(o)}

is the size of the support of its Blackwell measure:

rank (\hat{W}) = | supp ({MP}_{\hat{W}}) | .

We have:

{DMC}_{X, [n]}^{(o)} = {\hat{W} \in {DMC}_{X, *}^{(o)} : rank (\hat{W}) \leq n} .

A topology

T

on

{DMC}_{X, *}^{(o)}

is said to be natural if and only if it induces the quotient topology

T_{X, [n]}^{(o)}

on

{DMC}_{X, [n]}^{(o)}

for every

n \geq 1

.

Every natural topology is

σ

-compact, separable and path-connected [6]. On the other hand, if

| X | \geq 2

, a Hausdorff natural topology is not Baire, and it is not locally compact anywhere [6]. This implies that no natural topology can be completely metrized if

| X | \geq 2

.

Strong topology on

{DMC}_{X, *}^{(o)}

:

We associate

{DMC}_{X, *}

with the disjoint union topology

T_{s, X, *} : = ⨁_{n \geq 1} T_{X, [n]}

. The space

({DMC}_{X, *}, T_{s, X, *})

is disconnected, metrizable and

σ

-compact [6].

The strong topology

T_{s, X, *}^{(o)}

on

{DMC}_{X, *}^{(o)}

is the quotient of

T_{s, X, *}

by

R_{X, *}^{(o)}

:

T_{s, X, *}^{(o)} = T_{s, X, *} / R_{X, *}^{(o)} .

We refer to the open and closed sets in

({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})

as strongly-open and strongly-closed sets, respectively. If A is a subset of

{DMC}_{X, *}^{(o)}

, then A is strongly open if and only if

A \cap {DMC}_{X, [n]}^{(o)}

is open in

{DMC}_{X, [n]}^{(o)}

for every

n \geq 1

. Similarly, A is strongly closed if and only if

A \cap {DMC}_{X, [n]}^{(o)}

is closed in

{DMC}_{X, [n]}^{(o)}

for every

n \geq 1

.

We have shown in [6] that

T_{s, X, *}^{(o)}

is the finest natural topology. The strong topology is sequential, compactly generated and

T_{4}

[6]. On the other hand, if

| X | \geq 2

, the strong topology is not first-countable anywhere [6]; hence, it is not metrizable.

Noisiness metric:

Define the noisiness metric on

{DMC}_{X, *}^{(o)}

as follows:

d_{X, *}^{(o)} (\hat{W}, {\hat{W}}^{'}) : = d_{X, [n]}^{(o)} (\hat{W}, {\hat{W}}^{'}) \text{ where } n \geq 1 \text{ satisfies } \hat{W}, {\hat{W}}^{'} \in {DMC}_{X, [n]}^{(o)} .

d_{X, *}^{(o)} (\hat{W}, {\hat{W}}^{'})

is well-defined because

d_{X, [n]}^{(o)} (\hat{W}, {\hat{W}}^{'})

does not depend on

n \geq 1

as long as

\hat{W}, {\hat{W}}^{'} \in {DMC}_{X, [n]}^{(o)}

. We can also express

d_{X, *}^{(o)}

as follows:

d_{X, *}^{(o)} (\hat{W}, {\hat{W}}^{'}) = \sup_{\substack{m \geq 1, \\ p \in Δ_{[m] \times X}}} | P_{c} (p, \hat{W}) - P_{c} (p, {\hat{W}}^{'}) | .

The metric topology on

{DMC}_{X, *}^{(o)}

that is induced by

d_{X, *}^{(o)}

is called the noisiness topology on

{DMC}_{X, *}^{(o)}

, and it is denoted as

T_{X, *}^{(o)}

. We have shown in [6] that

T_{X, *}^{(o)}

is a natural topology that is strictly coarser than

T_{s, X, *}^{(o)}

.

Topologies from Blackwell measures:

The mapping

\hat{W} \to {MP}_{\hat{W}}

is a bijection from

{DMC}_{X, *}^{(o)}

to

{MP}_{b f} (X)

. We call this mapping the canonical bijection from

{DMC}_{X, *}^{(o)}

to

{MP}_{b f} (X)

.

Since

Δ_{X}

is a metric space, there are many standard ways to construct topologies on

MP (X)

. If we choose any of these standard topologies on

MP (X)

and then relativize it to the subspace

{MP}_{b f} (X)

, we can construct topologies on

{DMC}_{X, *}^{(o)}

through the canonical bijection.

In [6], we studied the weak-∗ and the total variation topologies. We showed that the weak-∗ topology is exactly the same as the noisiness topology.

The total-variation metric distance

d_{T V, X, *}^{(o)}

on

{DMC}_{X, *}^{(o)}

is defined as:

d_{T V, X, *}^{(o)} (\hat{W}, {\hat{W}}^{'}) = {∥ {MP}_{\hat{W}} - {MP}_{{\hat{W}}^{'}} ∥}_{T V} .

The total-variation topology

T_{T V, X, *}^{(o)}

is the metric topology that is induced by

d_{T V, X, *}^{(o)}

on

{DMC}_{X, *}^{(o)}

. We proved in [6] that if

| X | \geq 2

, we have:

$T_{T V, X, *}^{(o)}$ is neither natural nor Baire; hence, it is not completely metrizable.
$T_{T V, X, *}^{(o)}$ is not locally compact anywhere.

4. Channel Parameters and Operations

4.1. Useful Parameters

Let

Δ_{X}

be the space of probability distributions on

X

. For every

p \in Δ_{X}

and every

W \in {DMC}_{X, Y}

, define

I (p, W)

as the mutual information

I (X; Y)

, where X is distributed as p and Y is the output of W when X is the input. The mutual information is computed using the natural logarithm. The capacity of W is defined as

C (W) = sup_{p \in Δ_{X}} I (p, W)

.

For every

p \in Δ_{X}

, the error probability of the MAP decoder of W under prior p is defined as:

P_{e} (p, W) = 1 - \sum_{y \in Y} max_{x \in X} {p (x) W (y | x)} .

Clearly,

0 \leq P_{e} (p, W) \leq 1

.
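As a purely illustrative aid, the two parameters defined so far can be evaluated numerically for small alphabets. The sketch below assumes that a channel is stored as a row-stochastic list of lists with `W[x][y]` standing for W(y|x); the function names `mutual_information` and `map_error` are hypothetical and are not taken from the paper.

```python
import math

def mutual_information(p, W):
    """I(p, W) in nats. Assumed layout: W[x][y] = W(y|x), p[x] = p(x)."""
    n_out = len(W[0])
    # Output distribution q(y) = sum_x p(x) W(y|x).
    q = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(n_out)]
    total = 0.0
    for x in range(len(p)):
        for y in range(n_out):
            joint = p[x] * W[x][y]
            if joint > 0:  # convention 0 log 0 = 0
                total += joint * math.log(W[x][y] / q[y])
    return total

def map_error(p, W):
    """P_e(p, W) = 1 - sum_y max_x p(x) W(y|x)."""
    n_out = len(W[0])
    return 1.0 - sum(max(p[x] * W[x][y] for x in range(len(p)))
                     for y in range(n_out))

# Example: binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
uniform = [0.5, 0.5]
print(mutual_information(uniform, bsc), map_error(uniform, bsc))
```

For the binary symmetric channel with a uniform prior, the MAP decoder errs exactly when the channel flips the input, so the second printed value should be close to the crossover probability.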

For every

W \in {DMC}_{X, Y}

, define the Bhattacharyya parameter of W as:

Z (W) = \{\begin{matrix} \frac{1}{| X | \cdot (| X | - 1)} \sum_{\begin{matrix} x_{1}, x_{2} \in X, \\ x_{1} \neq x_{2} \end{matrix}} \sum_{y \in Y} \sqrt{W (y | x_{1}) W (y | x_{2})}, & if | X | \geq 2 \\ 0 & if | X | = 1 . \end{matrix}

It is easy to see that

0 \leq Z (W) \leq 1

.

It was shown in [10,11] that

\frac{1}{4} Z {(W)}^{2} \leq P_{e} (π_{X}, W) \leq (| X | - 1) Z (W)

, where

π_{X}

is the uniform distribution on

X

.
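The two-sided bound above can be checked numerically on a small example. The following is a minimal sketch under the assumption that `W[x][y]` stores W(y|x); the helper names `bhattacharyya` and `map_error_uniform` are illustrative only.

```python
import math

def bhattacharyya(W):
    """Z(W): average of sum_y sqrt(W(y|x1) W(y|x2)) over ordered pairs x1 != x2."""
    n = len(W)
    if n == 1:
        return 0.0
    total = 0.0
    for x1 in range(n):
        for x2 in range(n):
            if x1 != x2:
                total += sum(math.sqrt(a * b) for a, b in zip(W[x1], W[x2]))
    return total / (n * (n - 1))

def map_error_uniform(W):
    """P_e(pi_X, W) for the uniform prior pi_X."""
    n = len(W)
    return 1.0 - sum(max(W[x][y] for x in range(n))
                     for y in range(len(W[0]))) / n

bsc = [[0.9, 0.1], [0.1, 0.9]]
z = bhattacharyya(bsc)
pe = map_error_uniform(bsc)
lower, upper = 0.25 * z * z, (len(bsc) - 1) * z
print(lower, pe, upper)
```

For this channel, Z(W) equals 2·sqrt(0.9·0.1) and the uniform-prior error probability equals the crossover probability, so the printed triple should be increasing.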

An

(n, M)

-code

C

on the alphabet

X

is a subset of

X^{n}

such that

| C | = M

. The integer n is the block length of

C

, and M is the size of the code. The rate of

C

is

\frac{1}{n} log M

, and it is measured in nats. The error probability of the ML decoder for the code

C

when it is used for a channel

W \in {DMC}_{X, Y}

is given by:

P_{e, C} (W) = 1 - \frac{1}{| C |} \sum_{y_{1}^{n} \in Y^{n}} max_{x_{1}^{n} \in C} \{\prod_{i = 1}^{n} W (y_{i} | x_{i})\} .

The optimal error probability of

(n, M)

-codes for a channel W is given by:

P_{e, n, M} (W) = min_{\begin{matrix} C \subset X^{n}, \\ | C | = M \end{matrix}} P_{e, C} (W) .
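For very small block lengths, both quantities can be evaluated by exhaustive enumeration. The sketch below is only an illustration: it assumes `W[x][y]` stores W(y|x), represents a code as a list of tuples of input symbols, and uses the hypothetical names `code_error` and `optimal_code_error`. The search over all codes grows combinatorially, so it is meant for toy examples only.

```python
import itertools
import math

def code_error(code, W):
    """P_{e,C}(W): ML error probability of the code (a list of equal-length tuples)."""
    n = len(code[0])
    n_out = len(W[0])
    total = 0.0
    for y in itertools.product(range(n_out), repeat=n):
        # Likelihood of y under the most likely codeword.
        total += max(math.prod(W[xi][yi] for xi, yi in zip(x, y)) for x in code)
    return 1.0 - total / len(code)

def optimal_code_error(n, M, W):
    """P_{e,n,M}(W): brute-force minimum over all codes of size M in X^n."""
    words = list(itertools.product(range(len(W)), repeat=n))
    return min(code_error(list(c), W) for c in itertools.combinations(words, M))

bsc = [[0.9, 0.1], [0.1, 0.9]]
repetition = [(0, 0, 0), (1, 1, 1)]
print(code_error(repetition, bsc), optimal_code_error(3, 2, bsc))
```

For the length-3 repetition code on a binary symmetric channel with crossover probability 0.1, ML decoding reduces to a majority vote, so the first value should agree with 3·(0.1)²·0.9 + (0.1)³.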

The following proposition shows that all the above parameters are continuous:

Proposition 2.

We have:

$I : Δ_{X} \times {DMC}_{X, Y} \to R^{+}$ is continuous, concave in p and convex in W.
$C : {DMC}_{X, Y} \to R^{+}$ is continuous and convex.
$P_{e} : Δ_{X} \times {DMC}_{X, Y} \to [0, 1]$ is continuous, concave in p and concave in W.
$Z : {DMC}_{X, Y} \to [0, 1]$ is continuous.
For every code $C$ on $X$ , $P_{e, C} : {DMC}_{X, Y} \to [0, 1]$ is continuous.
For every $n > 0$ and every $1 \leq M \leq {| X |}^{n}$ , the mapping $P_{e, n, M} : {DMC}_{X, Y} \to [0, 1]$ is continuous.

Proof.

These facts are well known, especially the continuity of I, its concavity in p and its convexity in W [12]. Since C is the supremum of a family of mappings that are convex in W, it is also convex in W. For a proof of the continuity of C, see Appendix G. The continuity of Z,

P_{e}

and

P_{e, C}

follows immediately from their definitions. Moreover, since

P_{e, n, M}

is the minimum of a finite number of continuous mappings, it is continuous. The concavity of

P_{e}

in p and in W can also be easily seen from the definition. ☐

4.2. Channel Operations

If

W \in {DMC}_{X, Y}

and

V \in {DMC}_{Y, Z}

, we define the composition

V \circ W \in {DMC}_{X, Z}

of W and V as follows:

(V \circ W) (z | x) = \sum_{y \in Y} V (z | y) W (y | x), \forall x \in X, \forall z \in Z .

For every function

f : X \to Y

, define the deterministic channel

D_{f} \in {DMC}_{X, Y}

as follows:

D_{f} (y | x) = \{\begin{matrix} 1 & if y = f (x), \\ 0 & otherwise . \end{matrix}

It is easy to see that if

f : X \to Y

and

g : Y \to Z

, then

D_{g} \circ D_{f} = D_{g \circ f}

.
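In matrix terms, composition is a product of row-stochastic matrices. The sketch below illustrates this and the identity for deterministic channels; it assumes the layout `W[x][y]` = W(y|x), and the names `compose` and `deterministic` are hypothetical.

```python
def compose(V, W):
    """(V o W)(z|x) = sum_y V(z|y) W(y|x), with W[x][y] and V[y][z] layouts."""
    return [[sum(W[x][y] * V[y][z] for y in range(len(V)))
             for z in range(len(V[0]))]
            for x in range(len(W))]

def deterministic(f, n_in, n_out):
    """D_f(y|x) = 1 if y == f(x), else 0."""
    return [[1.0 if y == f(x) else 0.0 for y in range(n_out)]
            for x in range(n_in)]

# Illustrative maps f: {0,1,2} -> {0,1,2} and g: {0,1,2} -> {0,1}.
f = lambda x: (x + 1) % 3
g = lambda y: y % 2

lhs = compose(deterministic(g, 3, 2), deterministic(f, 3, 3))
rhs = deterministic(lambda x: g(f(x)), 3, 2)
print(lhs == rhs)
```

Since all entries are exactly 0 or 1, the comparison of the two matrices is exact rather than approximate.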

For every two channels

W_{1} \in {DMC}_{X_{1}, Y_{1}}

and

W_{2} \in {DMC}_{X_{2}, Y_{2}}

, define the channel sum

W_{1} \oplus W_{2} \in {DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}

of

W_{1}

and

W_{2}

as:

(W_{1} \oplus W_{2}) (y, i | x, j) = \{\begin{matrix} W_{i} (y | x) & if i = j, \\ 0 & otherwise . \end{matrix}

W_{1} \oplus W_{2}

arises when the transmitter has two channels

W_{1}

and

W_{2}

at its disposal, and it can use exactly one of them at each channel use. It is an easy exercise to check that

e^{C (W_{1} \oplus W_{2})} = e^{C (W_{1})} + e^{C (W_{2})}

(remember that we compute the mutual information using the natural logarithm).

We define the channel product

W_{1} \otimes W_{2} \in {DMC}_{X_{1} \times X_{2}, Y_{1} \times Y_{2}}

of

W_{1}

and

W_{2}

as:

(W_{1} \otimes W_{2}) (y_{1}, y_{2} | x_{1}, x_{2}) = W_{1} (y_{1} | x_{1}) W_{2} (y_{2} | x_{2}) .

W_{1} \otimes W_{2}

arises when the transmitter has two channels

W_{1}

and

W_{2}

at its disposal, and it uses both of them at each channel use. It is an easy exercise to check that

C (W_{1} \otimes W_{2}) = C (W_{1}) + C (W_{2})

, or equivalently

e^{C (W_{1} \otimes W_{2})} = e^{C (W_{1})} \cdot e^{C (W_{2})}

. Channel sums and products were first introduced by Shannon in [13].
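The two constructions can be written down directly as block-diagonal and Kronecker-type matrices. The sketch below is illustrative: it assumes `W[x][y]` = W(y|x), orders the disjoint unions as "first alphabet, then second" and the Cartesian products lexicographically, and uses hypothetical function names. It checks two simple consequences of the definitions: mutual information is additive under product inputs, and the sum of two noiseless channels is noiseless.

```python
import math

def mutual_information(p, W):
    """I(p, W) in nats, with W[x][y] = W(y|x)."""
    q = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(len(W[0]))]
    return sum(p[x] * W[x][y] * math.log(W[x][y] / q[y])
               for x in range(len(p)) for y in range(len(W[0]))
               if p[x] * W[x][y] > 0)

def channel_sum(W1, W2):
    """W1 (+) W2: block-diagonal channel on the disjoint unions."""
    m1, m2 = len(W1[0]), len(W2[0])
    return [row + [0.0] * m2 for row in W1] + [[0.0] * m1 + row for row in W2]

def channel_product(W1, W2):
    """W1 (x) W2: inputs (x1, x2) and outputs (y1, y2) in lexicographic order."""
    return [[a * b for a in r1 for b in r2] for r1 in W1 for r2 in W2]

bsc = [[0.9, 0.1], [0.1, 0.9]]
bec = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]  # erasure probability 0.5
p1, p2 = [0.3, 0.7], [0.6, 0.4]
p12 = [a * b for a in p1 for b in p2]     # product input distribution

prod_info = mutual_information(p12, channel_product(bsc, bec))
sum_of_infos = mutual_information(p1, bsc) + mutual_information(p2, bec)

# Sum of noiseless channels on 2 and 3 symbols is noiseless on 5 symbols.
id2 = [[1.0, 0.0], [0.0, 1.0]]
id3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
noiseless_sum_info = mutual_information([0.2] * 5, channel_sum(id2, id3))
print(prod_info, sum_of_infos, noiseless_sum_info, math.log(5))
```

The last two printed values illustrate the sum formula in the noiseless case, where exp(C) is simply the number of inputs: 5 = 2 + 3.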

For every

W_{1} \in {DMC}_{X, Y_{1}}

,

W_{2} \in {DMC}_{X, Y_{2}}

and every

0 \leq α \leq 1

, we define the

α

-interpolation

[α W_{1}, (1 - α) W_{2}] \in {DMC}_{X, Y_{1} ∐ Y_{2}}

between

W_{1}

and

W_{2}

as:

[α W_{1}, (1 - α) W_{2}] (y, i | x) = \{\begin{matrix} α W_{1} (y | x) & if i = 1, \\ (1 - α) W_{2} (y | x) & if i = 2 . \end{matrix}

Channel interpolation arises when a channel behaves as

W_{1}

with probability

α

and as

W_{2}

with probability

1 - α

. The transmitter has no control over which behavior the channel chooses, but the receiver knows which one was chosen. Channel interpolations were used in [14] to construct interpolations between polar codes and Reed–Muller codes.
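Because the receiver learns which of the two channels was used, the mutual information of the interpolation is the corresponding convex combination of the two mutual informations. The sketch below checks this numerically; it assumes `W[x][y]` = W(y|x), lists the outputs of the first channel before those of the second, and uses the hypothetical name `interpolate`.

```python
import math

def mutual_information(p, W):
    """I(p, W) in nats, with W[x][y] = W(y|x)."""
    q = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(len(W[0]))]
    return sum(p[x] * W[x][y] * math.log(W[x][y] / q[y])
               for x in range(len(p)) for y in range(len(W[0]))
               if p[x] * W[x][y] > 0)

def interpolate(alpha, W1, W2):
    """[alpha W1, (1 - alpha) W2]: outputs of W1 first, then outputs of W2."""
    return [[alpha * a for a in r1] + [(1 - alpha) * b for b in r2]
            for r1, r2 in zip(W1, W2)]

bsc = [[0.9, 0.1], [0.1, 0.9]]
bec = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
p, alpha = [0.3, 0.7], 0.3

mixed = mutual_information(p, interpolate(alpha, bsc, bec))
expected = alpha * mutual_information(p, bsc) + (1 - alpha) * mutual_information(p, bec)
print(mixed, expected)
```

The endpoints alpha = 0 and alpha = 1 produce all-zero output columns for one of the two channels; the `joint > 0` guard in `mutual_information` skips them.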

Now, fix a binary operation ∗ on

X

. For every

W \in {DMC}_{X, Y}

, define

W^{-} \in {DMC}_{X, Y^{2}}

and

W^{+} \in {DMC}_{X, Y^{2} \times X}

as:

W^{-} (y_{1}, y_{2} | u_{1}) = \frac{1}{| X |} \sum_{u_{2} \in X} W (y_{1} | u_{1} * u_{2}) W (y_{2} | u_{2}),

and:

W^{+} (y_{1}, y_{2}, u_{1} | u_{2}) = \frac{1}{| X |} W (y_{1} | u_{1} * u_{2}) W (y_{2} | u_{2}) .

These operations generalize Arıkan’s polarization transformations [15].
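The two transformations can be implemented directly from their definitions. The sketch below is illustrative only: it assumes `W[x][y]` = W(y|x), takes the binary operation as a Python function `op(u1, u2)`, and uses the hypothetical names `polar_minus` and `polar_plus`. For the XOR operation on a binary alphabet (Arıkan's original setting), it checks the well-known conservation of mutual information under uniform inputs, I(W^-) + I(W^+) = 2 I(W).

```python
import math

def mutual_information(p, W):
    """I(p, W) in nats, with W[x][y] = W(y|x)."""
    q = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(len(W[0]))]
    return sum(p[x] * W[x][y] * math.log(W[x][y] / q[y])
               for x in range(len(p)) for y in range(len(W[0]))
               if p[x] * W[x][y] > 0)

def polar_minus(W, op):
    """W^-(y1, y2 | u1) = (1/|X|) sum_{u2} W(y1 | u1*u2) W(y2 | u2)."""
    n, m = len(W), len(W[0])
    return [[sum(W[op(u1, u2)][y1] * W[u2][y2] for u2 in range(n)) / n
             for y1 in range(m) for y2 in range(m)]
            for u1 in range(n)]

def polar_plus(W, op):
    """W^+(y1, y2, u1 | u2) = (1/|X|) W(y1 | u1*u2) W(y2 | u2)."""
    n, m = len(W), len(W[0])
    return [[W[op(u1, u2)][y1] * W[u2][y2] / n
             for y1 in range(m) for y2 in range(m) for u1 in range(n)]
            for u2 in range(n)]

xor = lambda a, b: a ^ b
bsc = [[0.9, 0.1], [0.1, 0.9]]
uniform = [0.5, 0.5]

i_w = mutual_information(uniform, bsc)
i_minus = mutual_information(uniform, polar_minus(bsc, xor))
i_plus = mutual_information(uniform, polar_plus(bsc, xor))
print(i_minus, i_w, i_plus)
```

On this example the printed values should be ordered as I(W^-) ≤ I(W) ≤ I(W^+), which is the polarization effect in a single step.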

Proposition 3.

We have:

The mapping $(W, V) \to V \circ W$ from ${DMC}_{X, Y} \times {DMC}_{Y, Z}$ to ${DMC}_{X, Z}$ is continuous.
The mapping $(W_{1}, W_{2}) \to W_{1} \oplus W_{2}$ from ${DMC}_{X_{1}, Y_{1}} \times {DMC}_{X_{2}, Y_{2}}$ to the space ${DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}$ is continuous.
The mapping $(W_{1}, W_{2}) \to W_{1} \otimes W_{2}$ from ${DMC}_{X_{1}, Y_{1}} \times {DMC}_{X_{2}, Y_{2}}$ to ${DMC}_{X_{1} \times X_{2}, Y_{1} \times Y_{2}}$ is continuous.
The mapping $(W_{1}, W_{2}, α) \to [α W_{1}, (1 - α) W_{2}]$ from ${DMC}_{X, Y_{1}} \times {DMC}_{X, Y_{2}} \times [0, 1]$ to ${DMC}_{X, Y_{1} ∐ Y_{2}}$ is continuous.
For any binary operation ∗ on $X$ , the mapping $W \to W^{-}$ from ${DMC}_{X, Y}$ to ${DMC}_{X, Y^{2}}$ is continuous.
For any binary operation ∗ on $X$ , the mapping $W \to W^{+}$ from ${DMC}_{X, Y}$ to ${DMC}_{X, Y^{2} \times X}$ is continuous.

Proof.

The continuity immediately follows from the definitions. ☐

5. Continuity on ${DMC}_{X, Y}^{(o)}$

It is well known that the parameters defined in Section 4.1 depend only on the

R_{X, Y}^{(o)}

-equivalence class of W. Therefore, we can define those parameters for any

\hat{W} \in {DMC}_{X, Y}^{(o)}

through the transcendent mapping (defined in Lemma 2). The following proposition shows that those parameters are continuous on

{DMC}_{X, Y}^{(o)}

:

Proposition 4.

We have:

$I : Δ_{X} \times {DMC}_{X, Y}^{(o)} \to R^{+}$ is continuous and concave in p.
$C : {DMC}_{X, Y}^{(o)} \to R^{+}$ is continuous.
$P_{e} : Δ_{X} \times {DMC}_{X, Y}^{(o)} \to [0, 1]$ is continuous and concave in p.
$Z : {DMC}_{X, Y}^{(o)} \to [0, 1]$ is continuous.
For every code $C$ on $X$ , $P_{e, C} : {DMC}_{X, Y}^{(o)} \to [0, 1]$ is continuous.
For every $n > 0$ and every $1 \leq M \leq {| X |}^{n}$ , the mapping $P_{e, n, M} : {DMC}_{X, Y}^{(o)} \to [0, 1]$ is continuous.

Proof.

Since the corresponding parameters are continuous on

{DMC}_{X, Y}

(Proposition 2), Lemma 2 implies that they are continuous on

{DMC}_{X, Y}^{(o)}

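. The only cases that need special treatment are those of I and P_{e}, since these two parameters are defined on the product Δ_{X} \times {DMC}_{X, Y} rather than on {DMC}_{X, Y} itself. We will only prove the continuity of I since the proof of continuity of P_{e} is similar.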

Define the relation R on

Δ_{X} \times {DMC}_{X, Y}

as:

(p_{1}, W_{1}) R (p_{2}, W_{2}) \Leftrightarrow p_{1} = p_{2} and W_{1} R_{X, Y}^{(o)} W_{2} .

It is easy to see that

I (p, W)

depends only on the R-equivalence class of

(p, W)

. Since I is continuous on

Δ_{X} \times {DMC}_{X, Y}

, Lemma 2 implies that the transcendent mapping of I is continuous on

(Δ_{X} \times {DMC}_{X, Y}) / R

. On the other hand, since

Δ_{X}

is locally compact, Theorem 1 implies that

(Δ_{X} \times {DMC}_{X, Y}) / R

can be identified with

Δ_{X} \times ({DMC}_{X, Y} / R_{X, Y}^{(o)}) = Δ_{X} \times {DMC}_{X, Y}^{(o)}

, and the two spaces have the same topology. Therefore, I is continuous on

Δ_{X} \times {DMC}_{X, Y}^{(o)}

. ☐

With the exception of channel composition, all the channel operations that were defined in Section 4.2 can also be “quotiented”. We just need to realize that the equivalence class of the resulting channel depends only on the equivalence classes of the channels that were used in the operation. Let us illustrate this in the case of channel sums:

Let

W_{1}, W_{1}^{'} \in {DMC}_{X_{1}, Y_{1}}

and

W_{2}, W_{2}^{'} \in {DMC}_{X_{2}, Y_{2}}

and assume that

W_{1}

is degraded from

W_{1}^{'}

and

W_{2}

is degraded from

W_{2}^{'}

. Then there exist

V_{1} \in {DMC}_{Y_{1}, Y_{1}}

and

V_{2} \in {DMC}_{Y_{2}, Y_{2}}

such that

W_{1} = V_{1} \circ W_{1}^{'}

and

W_{2} = V_{2} \circ W_{2}^{'}

. It is easy to see that

W_{1} \oplus W_{2} = (V_{1} \oplus V_{2}) \circ (W_{1}^{'} \oplus W_{2}^{'})

, which shows that

W_{1} \oplus W_{2}

is degraded from

W_{1}^{'} \oplus W_{2}^{'}

. This was proven by Shannon in [16].
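The identity used in this argument can be verified numerically on a small example. The sketch below assumes `W[x][y]` = W(y|x) and block-diagonal ordering for sums; the helper names and the particular channels are illustrative choices, not taken from the paper.

```python
def compose(V, W):
    """(V o W)(z|x) = sum_y V(z|y) W(y|x)."""
    return [[sum(W[x][y] * V[y][z] for y in range(len(V)))
             for z in range(len(V[0]))]
            for x in range(len(W))]

def channel_sum(W1, W2):
    """W1 (+) W2 as a block-diagonal matrix."""
    m1, m2 = len(W1[0]), len(W2[0])
    return [row + [0.0] * m2 for row in W1] + [[0.0] * m1 + row for row in W2]

# Hypothetical "better" channels W1', W2' and degrading channels V1, V2.
W1p = [[0.9, 0.1], [0.1, 0.9]]
W2p = [[0.7, 0.3], [0.2, 0.8]]
V1 = [[0.8, 0.2], [0.2, 0.8]]
V2 = [[0.5, 0.5], [0.5, 0.5]]

W1 = compose(V1, W1p)  # W1 is degraded from W1'
W2 = compose(V2, W2p)  # W2 is degraded from W2'

lhs = channel_sum(W1, W2)
rhs = compose(channel_sum(V1, V2), channel_sum(W1p, W2p))
max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
print(max_gap)
```

The printed gap should be zero up to floating-point rounding, reflecting that block-diagonal matrices multiply block by block.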

Therefore, if

W_{1}

is equivalent to

W_{1}^{'}

and

W_{2}

is equivalent to

W_{2}^{'}

, then

W_{1} \oplus W_{2}

is equivalent to

W_{1}^{'} \oplus W_{2}^{'}

. This allows us to define the channel sum for every

{\hat{W}}_{1} \in {DMC}_{X_{1}, Y_{1}}^{(o)}

and every

{\bar{W}}_{2} \in {DMC}_{X_{2}, Y_{2}}^{(o)}

as

{\hat{W}}_{1} \oplus {\bar{W}}_{2} = \tilde{W_{1}^{'} \oplus W_{2}^{'}} \in {DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}^{(o)}

for any

W_{1}^{'} \in {\hat{W}}_{1}

and any

W_{2}^{'} \in {\bar{W}}_{2}

, where

\tilde{W_{1}^{'} \oplus W_{2}^{'}}

is the

R_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}^{(o)}

-equivalence class of

W_{1}^{'} \oplus W_{2}^{'}

.

With the exception of channel composition, we can “quotient” all the channel operations of Section 4.2 in a similar fashion. Moreover, we can show that they are continuous:

Proposition 5.

We have:

The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \oplus {\bar{W}}_{2}$ from ${DMC}_{X_{1}, Y_{1}}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}$ to ${DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}^{(o)}$ is continuous.
The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \otimes {\bar{W}}_{2}$ from ${DMC}_{X_{1}, Y_{1}}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}$ to ${DMC}_{X_{1} \times X_{2}, Y_{1} \times Y_{2}}^{(o)}$ is continuous.
The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}, α) \to [α {\hat{W}}_{1}, (1 - α) {\bar{W}}_{2}]$ from ${DMC}_{X, Y_{1}}^{(o)} \times {DMC}_{X, Y_{2}}^{(o)} \times [0, 1]$ to ${DMC}_{X, Y_{1} ∐ Y_{2}}^{(o)}$ is continuous.
For any binary operation ∗ on $X$ , the mapping $\hat{W} \to {\hat{W}}^{-}$ from ${DMC}_{X, Y}^{(o)}$ to ${DMC}_{X, Y^{2}}^{(o)}$ is continuous.
For any binary operation ∗ on $X$ , the mapping $\hat{W} \to {\hat{W}}^{+}$ from ${DMC}_{X, Y}^{(o)}$ to ${DMC}_{X, Y^{2} \times X}^{(o)}$ is continuous.

Proof.

We only prove the continuity of the channel sum because the proof of continuity of the other operations is similar.

Let

Proj : {DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}} \to {DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}^{(o)}

be the projection onto the

R_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}^{(o)}

-equivalence classes. Define the mapping

f : {DMC}_{X_{1}, Y_{1}} \times {DMC}_{X_{2}, Y_{2}} \to {DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}^{(o)}

as

f (W_{1}, W_{2}) = Proj (W_{1} \oplus W_{2})

. Since the channel sum is continuous (Proposition 3) and Proj is continuous, f is continuous.

Now, define the equivalence relation R on

{DMC}_{X_{1}, Y_{1}} \times {DMC}_{X_{2}, Y_{2}}

as:

(W_{1}, W_{2}) R (W_{1}^{'}, W_{2}^{'}) \Leftrightarrow W_{1} R_{X_{1}, Y_{1}}^{(o)} W_{1}^{'} and W_{2} R_{X_{2}, Y_{2}}^{(o)} W_{2}^{'} .

The discussion before the proposition shows that

f (W_{1}, W_{2}) = Proj (W_{1} \oplus W_{2})

depends only on the R-equivalence class of

(W_{1}, W_{2})

. Lemma 2 now shows that the transcendent map of f defined on

({DMC}_{X_{1}, Y_{1}} \times {DMC}_{X_{2}, Y_{2}}) / R

is continuous.

Notice that

({DMC}_{X_{1}, Y_{1}} \times {DMC}_{X_{2}, Y_{2}}) / R

can be identified with

{DMC}_{X_{1}, Y_{1}}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}

. Therefore, we can define f on

{DMC}_{X_{1}, Y_{1}}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}

through this identification. Moreover, since

{DMC}_{X_{1}, Y_{1}}

and

{DMC}_{X_{2}, Y_{2}}^{(o)}

are locally compact and Hausdorff, Corollary 1 implies that the canonical bijection between

({DMC}_{X_{1}, Y_{1}} \times {DMC}_{X_{2}, Y_{2}}) / R

and

{DMC}_{X_{1}, Y_{1}}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}

is a homeomorphism.

Now, since the mapping f on

{DMC}_{X_{1}, Y_{1}}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}

is just the channel sum, we conclude that the mapping

({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \oplus {\bar{W}}_{2}

from

{DMC}_{X_{1}, Y_{1}}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}

to

{DMC}_{X_{1} ∐ X_{2}, Y_{1} ∐ Y_{2}}^{(o)}

is continuous. ☐

6. Continuity in the Strong Topology

The following lemma provides a way to check whether a mapping defined on

({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})

is continuous:

Lemma 6.

Let

(S, V)

be an arbitrary topological space. A mapping

f : {DMC}_{X, *}^{(o)} \to S

is continuous on

({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})

if and only if it is continuous on

({DMC}_{X, [n]}^{(o)}, T_{X, [n]}^{(o)})

for every

n \geq 1

.

Proof.

\begin{matrix} f is continuous on ({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)}) & \Leftrightarrow f^{- 1} (V) \in T_{s, X, *}^{(o)} \forall V \in V \\ \Leftrightarrow f^{- 1} (V) \cap {DMC}_{X, [n]}^{(o)} \in T_{X, [n]}^{(o)} \forall n \geq 1, \forall V \in V \\ \Leftrightarrow f is continuous on ({DMC}_{X, [n]}^{(o)}, T_{X, [n]}^{(o)}) \forall n \geq 1 . \end{matrix}

☐

Since the channel parameters I, C,

P_{e}

, Z,

P_{e, C}

and

P_{e, n, M}

are defined on

{DMC}_{X, [n]}^{(o)}

for every

n \geq 1

(see Section 5), they are also defined on

{DMC}_{X, *}^{(o)} = ⋃_{n \geq 1} {DMC}_{X, [n]}^{(o)}

. The following proposition shows that those parameters are continuous in the strong topology:

Proposition 6.

Let

U_{X}

be the standard topology on

Δ_{X}

. We have:

$I : Δ_{X} \times {DMC}_{X, *}^{(o)} \to R^{+}$ is continuous on $(Δ_{X} \times {DMC}_{X, *}^{(o)}, U_{X} \otimes T_{s, X, *}^{(o)})$ and concave in p.
$C : {DMC}_{X, *}^{(o)} \to R^{+}$ is continuous on $({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})$ .
$P_{e} : Δ_{X} \times {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $(Δ_{X} \times {DMC}_{X, *}^{(o)}, U_{X} \otimes T_{s, X, *}^{(o)})$ and concave in p.
$Z : {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})$ .
For every code $C$ on $X$ , $P_{e, C} : {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})$ .
For every $n > 0$ and every $1 \leq M \leq {| X |}^{n}$ , the mapping $P_{e, n, M} : {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})$ .

Proof.

The continuity of

C, Z, P_{e, C}

and

P_{e, n, M}

follows immediately from Proposition 4 and Lemma 6. Since the proofs of the continuity of I and P_{e} are similar, we only prove the continuity of I.

Due to the distributivity of the product with respect to disjoint unions, we have:

\begin{matrix} Δ_{X} \times {DMC}_{X, *} = \underset{n \geq 1}{∐} (Δ_{X} \times {DMC}_{X, [n]}), \end{matrix}

and:

\begin{matrix} U_{X} \otimes T_{s, X, *} = ⨁_{n \geq 1} (U_{X} \otimes T_{X, [n]}) . \end{matrix}

Therefore,

(Δ_{X} \times {DMC}_{X, *}, U_{X} \otimes T_{s, X, *})

is the disjoint union of the spaces

{(Δ_{X} \times {DMC}_{X, [n]})}_{n \geq 1}

. Moreover, I is continuous on

Δ_{X} \times {DMC}_{X, [n]}

for every

n \geq 1

. We conclude that I is continuous on

(Δ_{X} \times {DMC}_{X, *}, U_{X} \otimes T_{s, X, *})

.

Define the relation R on

Δ_{X} \times {DMC}_{X, *}

as follows:

(p_{1}, W_{1}) R (p_{2}, W_{2})

if and only if

p_{1} = p_{2}

and

W_{1} R_{X, *}^{(o)} W_{2}

. Since

I (p, W)

depends only on the R-equivalence class of

(p, W)

, Lemma 2 shows that the transcendent map of I is a continuous mapping from

((Δ_{X} \times {DMC}_{X, *}) / R, (U_{X} \otimes T_{s, X, *}) / R)

to

R^{+}

. On the other hand, since

Δ_{X}

is locally compact and Hausdorff, Theorem 1 implies that

((Δ_{X} \times {DMC}_{X, *}) / R, (U_{X} \otimes T_{s, X, *}) / R)

can be identified with

(Δ_{X} \times ({DMC}_{X, *} / R_{X, *}^{(o)}), U_{X} \otimes (T_{s, X, *} / R_{X, *}^{(o)})) = (Δ_{X} \times {DMC}_{X, *}^{(o)}, U_{X} \otimes T_{s, X, *}^{(o)})

. Therefore, I is continuous on

(Δ_{X} \times {DMC}_{X, *}^{(o)}, U_{X} \otimes T_{s, X, *}^{(o)})

. ☐

It is also possible to extend the definition of all the channel operations that were defined in Section 5 to

{DMC}_{X, *}^{(o)}

. Moreover, it is possible to show that many channel operations are continuous in the strong topology:

Proposition 7.

Assume that all equivalent channel spaces are endowed with the strong topology. We have:

The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \oplus {\bar{W}}_{2}$ from ${DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}$ to ${DMC}_{X_{1} ∐ X_{2}, *}^{(o)}$ is continuous.
The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \otimes {\bar{W}}_{2}$ from ${DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}$ to ${DMC}_{X_{1} \times X_{2}, *}^{(o)}$ is continuous.
The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}, α) \to [α {\hat{W}}_{1}, (1 - α) {\bar{W}}_{2}]$ from ${DMC}_{X, *}^{(o)} \times {DMC}_{X, Y_{2}}^{(o)} \times [0, 1]$ to ${DMC}_{X, *}^{(o)}$ is continuous.
For any binary operation ∗ on $X$ , the mapping $\hat{W} \to {\hat{W}}^{-}$ from ${DMC}_{X, *}^{(o)}$ to ${DMC}_{X, *}^{(o)}$ is continuous.
For any binary operation ∗ on $X$ , the mapping $\hat{W} \to {\hat{W}}^{+}$ from ${DMC}_{X, *}^{(o)}$ to ${DMC}_{X, *}^{(o)}$ is continuous.

Proof.

We only prove the continuity of the channel interpolation because the proof of the continuity of other operations is similar.

Let

U

be the standard topology on

[0, 1]

. Due to the distributivity of the product with respect to disjoint unions, we have:

{DMC}_{X, *} \times {DMC}_{X, Y_{2}} \times [0, 1] = \underset{n \geq 1}{∐} ({DMC}_{X, [n]} \times {DMC}_{X, Y_{2}} \times [0, 1]),

and:

T_{s, X, *} \otimes T_{X, Y_{2}} \otimes U = ⨁_{n \geq 1} (T_{X, [n]} \otimes T_{X, Y_{2}} \otimes U) .

Therefore, the space

{DMC}_{X, *} \times {DMC}_{X, Y_{2}} \times [0, 1]

is the topological disjoint union of the spaces

{({DMC}_{X, [n]} \times {DMC}_{X, Y_{2}} \times [0, 1])}_{n \geq 1}

.

For every

n \geq 1

, let

{Proj}_{n}

be the projection onto the

R_{X, [n] ∐ Y_{2}}^{(o)}

-equivalence classes, and let

i_{n}

be the canonical injection from

{DMC}_{X, [n] ∐ Y_{2}}^{(o)}

to

{DMC}_{X, *}^{(o)}

.

Define the mapping

f : {DMC}_{X, *} \times {DMC}_{X, Y_{2}} \times [0, 1] \to {DMC}_{X, *}^{(o)}

as:

f (W_{1}, W_{2}, α) = i_{n} ({Proj}_{n} ([α W_{1}, (1 - α) W_{2}])) = [α {\hat{W}}_{1}, (1 - α) {\bar{W}}_{2}],

where n is the unique integer satisfying

W_{1} \in {DMC}_{X, [n]}

.

{\hat{W}}_{1}

and

{\bar{W}}_{2}

are the

R_{X, [n]}^{(o)}

and

R_{X, Y_{2}}^{(o)}

-equivalence classes of

W_{1}

and

W_{2}

, respectively.

Due to Proposition 3 and due to the continuity of

{Proj}_{n}

and

i_{n}

, the mapping f is continuous on

{DMC}_{X, [n]} \times {DMC}_{X, Y_{2}} \times [0, 1]

for every

n \geq 1

. Therefore, f is continuous on

({DMC}_{X, *} \times {DMC}_{X, Y_{2}} \times [0, 1], T_{s, X, *} \otimes T_{X, Y_{2}} \otimes U)

.

Let

R^{'}

be the equivalence relation defined on

{DMC}_{X, *} \times {DMC}_{X, Y_{2}}

as follows:

(W_{1}, W_{2}) R^{'} (W_{1}^{'}, W_{2}^{'})

if and only if

W_{1} R_{X, *}^{(o)} W_{1}^{'}

and

W_{2} R_{X, Y_{2}}^{(o)} W_{2}^{'}

. Furthermore, define the equivalence relation R on

{DMC}_{X, *} \times {DMC}_{X, Y_{2}} \times [0, 1]

as follows:

(W_{1}, W_{2}, α) R (W_{1}^{'}, W_{2}^{'}, α^{'})

if and only if

(W_{1}, W_{2}) R^{'} (W_{1}^{'}, W_{2}^{'})

and

α = α^{'}

.

Since

f (W_{1}, W_{2}, α)

depends only on the R-equivalence class of

(W_{1}, W_{2}, α)

, Lemma 2 implies that the transcendent mapping of f is continuous on

({DMC}_{X, *} \times {DMC}_{X, Y_{2}} \times [0, 1]) / R

.

Since

[0, 1]

is Hausdorff and locally compact, Theorem 1 implies that the canonical bijection from

({DMC}_{X, *} \times {DMC}_{X, Y_{2}} \times [0, 1]) / R

to

(({DMC}_{X, *} \times {DMC}_{X, Y_{2}}) / R^{'}) \times [0, 1]

is a homeomorphism. On the other hand, since

({DMC}_{X, *}, T_{s, X, *})

and

{DMC}_{X, Y_{2}}^{(o)} = {DMC}_{X, Y_{2}} / R_{X, Y_{2}}^{(o)}

are Hausdorff and locally compact, Corollary 1 implies that the canonical bijection from

{DMC}_{X, *}^{(o)} \times {DMC}_{X, Y_{2}}^{(o)}

to

({DMC}_{X, *} \times {DMC}_{X, Y_{2}}) / R^{'}

is a homeomorphism. We conclude that the channel interpolation is continuous on

({DMC}_{X, *}^{(o)} \times {DMC}_{X, Y_{2}}^{(o)} \times [0, 1], T_{s, X, *}^{(o)} \otimes T_{X, Y_{2}}^{(o)} \otimes U)

. ☐

Corollary 3.

({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})

is strongly contractible to every point in

{DMC}_{X, *}^{(o)}

.

Proof.

Fix

{\hat{W}}_{0} \in {DMC}_{X, *}^{(o)}

. Define the mapping

H : {DMC}_{X, *}^{(o)} \times [0, 1] \to {DMC}_{X, *}^{(o)}

as

H (\hat{W}, α) = [α {\hat{W}}_{0}, (1 - α) \hat{W}]

. H is continuous by Proposition 7. We also have

H (\hat{W}, 0) = \hat{W}

and

H (\hat{W}, 1) = {\hat{W}}_{0}

for every

\hat{W} \in {DMC}_{X, *}^{(o)}

. Moreover,

H ({\hat{W}}_{0}, α) = {\hat{W}}_{0}

for every

0 \leq α \leq 1

. Therefore,

({DMC}_{X, *}^{(o)}, T_{s, X, *}^{(o)})

is strongly contractible to every point in

{DMC}_{X, *}^{(o)}

. ☐

The reader might be wondering why channel operations such as the channel sum were not shown to be continuous on the whole space

{DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}

instead of the smaller space

{DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, Y_{2}}^{(o)}

. The reason is that we cannot apply Corollary 1 to

{DMC}_{X_{1}, *} \times {DMC}_{X_{2}, *}

and

{DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}

since neither

{DMC}_{X_{1}, *}^{(o)}

, nor

{DMC}_{X_{2}, *}^{(o)}

is locally compact (under the strong topology).

One potential method to show the continuity of the channel sum on

({DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}, T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)})

is as follows: let R be the equivalence relation on

{DMC}_{X_{1}, *} \times {DMC}_{X_{2}, *}

defined as

(W_{1}, W_{2}) R (W_{1}^{'}, W_{2}^{'})

if and only if

W_{1} R_{X_{1}, *}^{(o)} W_{1}^{'}

and

W_{2} R_{X_{2}, *}^{(o)} W_{2}^{'}

. We can identify

({DMC}_{X_{1}, *} \times {DMC}_{X_{2}, *}) / R

with

{DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}

through the canonical bijection. Using Lemma 2, it is easy to see that the mapping

({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \oplus {\bar{W}}_{2}

is continuous from

({DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}, (T_{s, X_{1}, *} \otimes T_{s, X_{2}, *}) / R)

to

({DMC}_{X_{1} ∐ X_{2}, *}^{(o)}, T_{s, X_{1} ∐ X_{2}, *}^{(o)})

.

It was shown in [17] that the topology

(T_{s, X_{1}, *} \otimes T_{s, X_{2}, *}) / R

is homeomorphic to

κ (T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)})

through the canonical bijection, where

κ (T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)})

is the coarsest topology that is both compactly generated and finer than

T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)}

. Therefore, the mapping

({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \oplus {\bar{W}}_{2}

is continuous on

({DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}, κ (T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)}))

. This means that if

T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)}

is compactly generated, we will have

T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)} = κ (T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)})

, and so, the channel sum will be continuous on

({DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}, T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)})

. Note that although

T_{s, X_{1}, *}^{(o)}

and

T_{s, X_{2}, *}^{(o)}

are compactly generated, their product

T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)}

might not be compactly generated.

7. Continuity in the Noisiness/Weak-∗ and the Total Variation Topologies

We need to express the channel parameters and operations in terms of the Blackwell measures.

7.1. Channel Parameters

The following proposition shows that many channel parameters can be expressed as an integral of a continuous function with respect to the Blackwell measure:

Proposition 8.

For every

\hat{W} \in {DMC}_{X, *}^{(o)}

, we have:

\forall p \in Δ_{X}, I (p, \hat{W}) = H (p) + | X | \cdot \int_{Δ_{X}} (\sum_{x \in X} p (x) p^{'} (x) log \frac{p (x) p^{'} (x)}{\sum_{x^{'}} p (x^{'}) p^{'} (x^{'})}) \cdot d {MP}_{\hat{W}} (p^{'}),

\forall p \in Δ_{X}, P_{e} (p, \hat{W}) = 1 - | X | \int_{Δ_{X}} max_{x \in X} \{p (x) \times p^{'} (x)\} \cdot d {MP}_{\hat{W}} (p^{'}),

if | X | \geq 2, Z (\hat{W}) = \frac{1}{| X | - 1} \sum_{\begin{matrix} x, x^{'} \in X, \\ x \neq x^{'} \end{matrix}} \int_{Δ_{X}} \sqrt{p (x) p (x^{'})} \cdot d {MP}_{\hat{W}} (p),

For every code C \subset X^{n}, P_{e, C} (\hat{W}) = 1 - \frac{{| X |}^{n}}{| C |} \int_{Δ_{X}^{n}} max_{x_{1}^{n} \in C} \{\prod_{i = 1}^{n} p_{i} (x_{i})\} d {MP}_{\hat{W}}^{n} (p_{1}^{n}),

where

H (p)

is the entropy of p and

{MP}_{\hat{W}}^{n}

is the product measure on

Δ_{X}^{n}

obtained by multiplying

{MP}_{\hat{W}}

with itself n times. Note that we adopt the standard convention that

0 log \frac{0}{0} = 0

.

Proof.

By choosing any representative channel

W \in \hat{W}

and replacing

W (y | x)

by

| X | P_{W}^{o} (y) W_{y}^{- 1} (x)

in the definitions of the channel parameters, all the above formulas immediately follow. Let us show how this works for

P_{e}

:

\begin{matrix} P_{e} (p, \hat{W}) & = P_{e} (p, W) \overset{(a)}{=} 1 - \sum_{y \in Im (W)} max_{x \in X} {p (x) W (y | x)} \\ = 1 - \sum_{y \in Im (W)} max_{x \in X} \{p (x) \cdot | X | \cdot P_{W}^{o} (y) W_{y}^{- 1} (x)\} \\ = 1 - | X | \sum_{y \in Im (W)} max_{x \in X} {p (x) W_{y}^{- 1} (x)} \cdot P_{W}^{o} (y) \\ = 1 - | X | \int_{Δ_{X}} max_{x \in X} {p (x) p^{'} (x)} \cdot d {MP}_{W} (p^{'}) \\ = 1 - | X | \int_{Δ_{X}} max_{x \in X} {p (x) p^{'} (x)} \cdot d {MP}_{\hat{W}} (p^{'}), \end{matrix}

where (a) is true because

W (y | x) = 0

for

y \notin Im (W)

. ☐
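The substitution used in this proof can be mirrored numerically for a channel with finitely many outputs. The sketch below is an illustration under stated assumptions: `W[x][y]` = W(y|x); the output weight is taken to be P_W^o(y) = (1/|X|) Σ_x W(y|x), which is the choice consistent with the substitution W(y|x) = |X| P_W^o(y) W_y^{-1}(x) used above; and the Blackwell measure is stored as a dictionary from posterior vectors to masses. All function names are hypothetical. Posterior vectors are rounded only so that numerically equal atoms are merged.

```python
import math

def blackwell_measure(W):
    """Finite-support measure {posterior tuple: mass} built from W (assumed layout W[x][y])."""
    n = len(W)
    measure = {}
    for y in range(len(W[0])):
        mass = sum(W[x][y] for x in range(n)) / n      # assumed P_W^o(y)
        if mass > 0:                                   # keep only y in Im(W)
            post = tuple(round(W[x][y] / (n * mass), 12) for x in range(n))
            measure[post] = measure.get(post, 0.0) + mass
    return measure

def map_error_direct(p, W):
    return 1.0 - sum(max(p[x] * W[x][y] for x in range(len(p)))
                     for y in range(len(W[0])))

def map_error_blackwell(p, measure):
    """1 - |X| * integral of max_x p(x) p'(x) with respect to the measure."""
    n = len(p)
    return 1.0 - n * sum(mass * max(p[x] * post[x] for x in range(n))
                         for post, mass in measure.items())

def bhattacharyya_blackwell(measure, n):
    """(1/(|X|-1)) * sum over x != x' of the integral of sqrt(p(x) p(x'))."""
    return sum(mass * math.sqrt(post[x] * post[xp])
               for post, mass in measure.items()
               for x in range(n) for xp in range(n) if x != xp) / (n - 1)

bsc = [[0.9, 0.1], [0.1, 0.9]]
# Same channel with output 0 split into two statistically identical outputs.
split = [[0.45, 0.45, 0.1], [0.05, 0.05, 0.9]]
mu_bsc, mu_split = blackwell_measure(bsc), blackwell_measure(split)
p = [0.3, 0.7]
print(mu_bsc, mu_split)
print(map_error_direct(p, bsc), map_error_blackwell(p, mu_bsc))
```

The two channels in the example differ only by a relabeling and splitting of outputs, and the sketch returns the same measure for both, which is consistent with the measure depending only on the equivalence class.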

Proposition 9.

Let

U_{X}

be the standard topology on

Δ_{X}

. We have:

$I : Δ_{X} \times {DMC}_{X, *}^{(o)} \to R^{+}$ is continuous on $(Δ_{X} \times {DMC}_{X, *}^{(o)}, U_{X} \otimes T_{X, *}^{(o)})$ and concave in p.
$C : {DMC}_{X, *}^{(o)} \to R^{+}$ is continuous on $({DMC}_{X, *}^{(o)}, T_{X, *}^{(o)})$ .
$P_{e} : Δ_{X} \times {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $(Δ_{X} \times {DMC}_{X, *}^{(o)}, U_{X} \otimes T_{X, *}^{(o)})$ and concave in p.
$Z : {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $({DMC}_{X, *}^{(o)}, T_{X, *}^{(o)})$ .
For every code $C$ on $X$ , $P_{e, C} : {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $({DMC}_{X, *}^{(o)}, T_{X, *}^{(o)})$ .
For every $n > 0$ and every $1 \leq M \leq {| X |}^{n}$ , the mapping $P_{e, n, M} : {DMC}_{X, *}^{(o)} \to [0, 1]$ is continuous on $({DMC}_{X, *}^{(o)}, T_{X, *}^{(o)})$ .

Proof.

We associate the space

MP (X)

with the weak-∗ topology. Define the mapping:

\bar{I} : Δ_{X} \times MP (X) \to R^{+}

as follows:

\bar{I} (p, MP) = H (p) + | X | \cdot \int_{Δ_{X}} (\sum_{x \in X} p (x) p^{'} (x) log \frac{p (x) p^{'} (x)}{\sum_{x^{'}} p (x^{'}) p^{'} (x^{'})}) \cdot d MP (p^{'}) .

Lemma 5 implies that

\bar{I}

is continuous. On the other hand, Proposition 8 shows that

I (p, \hat{W}) = \bar{I} (p, {MP}_{\hat{W}})

. Therefore, I is continuous on

(Δ_{X} \times {DMC}_{X, *}^{(o)}, U_{X} \otimes T_{X, *}^{(o)})

. We can prove the continuity of

P_{e}

and Z similarly.

Now, define the mapping

\bar{C} : MP (X) \to R

as:

\bar{C} (MP) = sup_{p \in Δ_{X}} \bar{I} (p, MP) .

Fix

MP \in MP (X)

, and let

ϵ > 0

. Since

MP (X)

is compact (under the weak-∗ topology), Lemma 1 implies the existence of a weakly-∗ open neighborhood

U_{MP}

of MP such that

| \bar{I} (p, MP) - \bar{I} (p, {MP}^{'}) | < ϵ

for every

{MP}^{'} \in U_{MP}

and every

p \in Δ_{X}

. Therefore, for every

{MP}^{'} \in U_{MP}

and every

p \in Δ_{X}

, we have:

\bar{I} (p, MP) < \bar{I} (p, {MP}^{'}) + ϵ \leq \bar{C} ({MP}^{'}) + ϵ,

hence,

\bar{C} (MP) = sup_{p \in Δ_{X}} \bar{I} (p, MP) \leq \bar{C} ({MP}^{'}) + ϵ .

Similarly, we can show that

\bar{C} ({MP}^{'}) \leq \bar{C} (MP) + ϵ

. This shows that

| \bar{C} ({MP}^{'}) - \bar{C} (MP) | \leq ϵ

for every

{MP}^{'} \in U_{MP}

. Therefore,

\bar{C}

is continuous. However,

C (\hat{W}) = \bar{C} ({MP}_{\hat{W}})

, so C is continuous on

({DMC}_{X, *}^{(o)}, T_{X, *}^{(o)})

.

Now for every

0 \leq i \leq n

, define the mapping

f_{i} : Δ_{X}^{i} \times MP (X) \to R

backward-recursively as follows:

$f_{n} (p_{1}^{n}, MP) = max_{x_{1}^{n} \in C} \{\prod_{i = 1}^{n} p_{i} (x_{i})\}$ .
For every $0 \leq i < n$ , define:

$f_{i} (p_{1}^{i}, MP) = \int_{Δ_{X}} f_{i + 1} (p_{1}^{i + 1}, MP) \cdot d MP (p_{i + 1}) .$

Clearly

f_{n}

is continuous. Now, let

0 \leq i < n

, and assume that

f_{i + 1}

is continuous. If we let

S = Δ_{X}^{i} \times MP (X)

, Lemma 5 implies that the mapping

F_{i} : Δ_{X}^{i} \times MP (X) \times MP (X) \to R

defined as:

F_{i} (p_{1}^{i}, MP, {MP}^{'}) = \int_{Δ_{X}} f_{i + 1} (p_{1}^{i + 1}, MP) \cdot d {MP}^{'} (p_{i + 1})

is continuous. However,

f_{i} (p_{1}^{i}, MP) = F_{i} (p_{1}^{i}, MP, MP)

, so

f_{i}

is also continuous. Therefore,

f_{0}

is continuous. By noticing that

P_{e, C} (\hat{W}) = 1 - \frac{{| X |}^{n}}{| C |} f_{0} ({MP}_{\hat{W}})

, we conclude that

P_{e, C}

is continuous on

({DMC}_{X, *}^{(o)}, T_{X, *}^{(o)})

. Moreover, since

P_{e, n, M}

is the minimum of a finite family of continuous mappings, it is continuous. ☐
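
As a small numerical sanity check of the identity used in the proof above (not part of the original argument), the following Python sketch evaluates the backward recursion for a finitely supported Blackwell measure, where every integral against MP reduces to a finite sum. The binary symmetric channel, the repetition code and all helper names (`blackwell`, `f0`, `pe_direct`) are illustrative assumptions; the direct computation assumes maximum-likelihood decoding with a uniformly chosen codeword.

```python
from itertools import product
from math import prod

def blackwell(W):
    """Atoms (weight, posterior) of MP_W for a channel W[x][y] with uniform input."""
    nx = len(W)
    atoms = []
    for y in range(len(W[0])):
        po = sum(W[x][y] for x in range(nx)) / nx        # output probability P^o_W(y)
        if po > 0:
            atoms.append((po, tuple(W[x][y] / (nx * po) for x in range(nx))))
    return atoms

def f0(atoms, code, n):
    """Backward recursion f_n -> f_0; each integral is a finite sum over the atoms."""
    def f(ps):
        if len(ps) == n:                                  # f_n: max over codewords
            return max(prod(ps[i][x[i]] for i in range(n)) for x in code)
        return sum(w * f(ps + [p]) for w, p in atoms)     # f_i from f_{i+1}
    return f([])

def pe_direct(W, code, n):
    """Error probability of ML decoding, computed by enumerating all output words."""
    ny = len(W[0])
    pc = sum(max(prod(W[x[i]][y[i]] for i in range(n)) for x in code)
             for y in product(range(ny), repeat=n)) / len(code)
    return 1 - pc

W = [[0.9, 0.1], [0.1, 0.9]]        # BSC with crossover 0.1 (illustrative choice)
code = [(0, 0, 0), (1, 1, 1)]       # length-3 repetition code
n = 3
pe_recursive = 1 - (len(W) ** n / len(code)) * f0(blackwell(W), code, n)
```

For this example both computations give an error probability of about 0.028, which is 3ε²(1 − ε) + ε³ with ε = 0.1.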

It is worth mentioning that Proposition 6 can be shown from Proposition 9 because the noisiness topology is coarser than the strong topology.

Corollary 4.

All the mappings in Proposition 9 are also continuous if we replace the noisiness topology

T_{X, *}^{(o)}

with the total variation topology

T_{T V, X, *}^{(o)}

.

Proof.

This is true because

T_{T V, X, *}^{(o)}

is finer than

T_{X, *}^{(o)}

. ☐

7.2. Channel Operations

In the following, we show that we can express the channel operations in terms of Blackwell measures. We have all the tools to achieve this for the channel sum, channel product and channel interpolation. In order to express the channel polarization transformations in terms of the Blackwell measures, we need to introduce new definitions.

Let

X

be a finite set, and let ∗ be a binary operation on

X

. We say that ∗ is uniformity preserving if the mapping

(a, b) \to (a * b, b)

is a bijection from

X^{2}

to itself [18]. For every

a, b \in X

, we denote the unique element

c \in X

satisfying

c * b = a

as

c = a /^{*} b

. Note that

/^{*}

is a binary operation, and it is uniformity preserving.

/^{*}

is called the right-inverse of ∗. It was shown in [11] that a binary operation is polarizing if and only if it is uniformity preserving and its right-inverse is strongly ergodic.
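
The definition can be checked mechanically on small alphabets. The following Python sketch is only an illustration: the function names and the two example operations (addition and multiplication modulo 5) are assumptions made here, not taken from the paper.

```python
from itertools import product

def is_uniformity_preserving(op, X):
    """Check that (a, b) -> (a * b, b) is a bijection from X^2 to itself."""
    images = {(op(a, b), b) for a, b in product(X, repeat=2)}
    return len(images) == len(X) ** 2

def right_inverse(op, X):
    """Return /* with (a /* b) * b == a; assumes op is uniformity preserving."""
    table = {(op(c, b), b): c for c, b in product(X, repeat=2)}
    return lambda a, b: table[(a, b)]

X = list(range(5))
add = lambda a, b: (a + b) % 5      # uniformity preserving
mul = lambda a, b: (a * b) % 5      # not uniformity preserving: a * 0 == 0 for every a
div = right_inverse(add, X)         # here a /* b is (a - b) mod 5
```

In this example `div` is itself uniformity preserving, in line with the remark above.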

Binary operations that are not uniformity preserving are not interesting for polarization theory because they do not preserve the symmetric capacity [11]. Therefore, we will only focus on polarization transformations that are based on uniformity preserving operations.

Let ∗ be a fixed uniformity preserving operation on

X

. Define the mapping

C^{-, *} : Δ_{X} \times Δ_{X} \to Δ_{X}

as

(C^{-, *} (p_{1}, p_{2})) (u_{1}) = \sum_{u_{2} \in X} p_{1} (u_{1} * u_{2}) p_{2} (u_{2}) .

The probability distribution

C^{-, *} (p_{1}, p_{2})

can be interpreted as follows: let

X_{1}

and

X_{2}

be two independent random variables in

X

that are distributed as

p_{1}

and

p_{2}

, respectively, and let

(U_{1}, U_{2})

be the random pair in

X^{2}

defined as

(U_{1}, U_{2}) = (X_{1} /^{*} X_{2}, X_{2})

, or equivalently

(X_{1}, X_{2}) = (U_{1} * U_{2}, U_{2})

.

C^{-, *} (p_{1}, p_{2})

is the probability distribution of

U_{1}

.
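
The interpretation above can be verified numerically. In the following Python sketch, the operation (addition modulo 3), the two distributions and the helper names are illustrative assumptions; `law_of_u1` enumerates the pairs (X_1, X_2) and recovers U_1 = X_1 /^{*} X_2 by brute force.

```python
def c_minus(p1, p2, op):
    """(C^{-,*}(p1, p2))(u1) = sum over u2 of p1(u1 * u2) p2(u2)."""
    q = len(p1)
    return [sum(p1[op(u1, u2)] * p2[u2] for u2 in range(q)) for u1 in range(q)]

def law_of_u1(p1, p2, op):
    """Distribution of U1 = X1 /* X2 for independent X1 ~ p1 and X2 ~ p2."""
    q = len(p1)
    law = [0.0] * q
    for x1 in range(q):
        for x2 in range(q):
            u1 = next(c for c in range(q) if op(c, x2) == x1)   # u1 = x1 /* x2
            law[u1] += p1[x1] * p2[x2]
    return law

add3 = lambda a, b: (a + b) % 3
p1, p2 = [0.5, 0.3, 0.2], [0.6, 0.3, 0.1]
pu1 = c_minus(p1, p2, add3)
```

Both functions return the same probability vector, approximately (0.41, 0.29, 0.30) for these inputs.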

Clearly,

C^{-, *}

is continuous. Therefore, the push-forward mapping

C_{#}^{-, *}

is continuous from

P (Δ_{X} \times Δ_{X})

to

P (Δ_{X}) = MP (X)

under both the weak-∗ and the total variation topologies (see Section 2.6). For every

{MP}_{1}, {MP}_{2} \in MP (X)

, we define the

(-, *)

-convolution of

{MP}_{1}

and

{MP}_{2}

as:

{({MP}_{1}, {MP}_{2})}^{-, *} = C_{#}^{-, *} ({MP}_{1} \times {MP}_{2}) \in MP (X) .

Since the product of meta-probability measures is continuous under both the weak-∗ and the total variation topologies (Appendix B and Appendix F), the

(-, *)

-convolution is also continuous under these topologies.

For every

p_{1}, p_{2} \in Δ_{X}

and every

u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))

, define

C^{+, u_{1}, *} (p_{1}, p_{2}) \in Δ_{X}

as:

(C^{+, u_{1}, *} (p_{1}, p_{2})) (u_{2}) = \frac{p_{1} (u_{1} * u_{2}) p_{2} (u_{2})}{(C^{-, *} (p_{1}, p_{2})) (u_{1})} .

The probability distribution

C^{+, u_{1}, *} (p_{1}, p_{2})

can be interpreted as follows: if

X_{1}, X_{2}, U_{1}

and

U_{2}

are as above,

C^{+, u_{1}, *} (p_{1}, p_{2})

is the conditional probability distribution of

U_{2}

given

U_{1} = u_{1}

.

Define the mapping

C^{+, *} : Δ_{X} \times Δ_{X} \to P (Δ_{X}) = MP (X)

as follows:

C^{+, *} (p_{1}, p_{2}) = \sum_{u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))} (C^{-, *} (p_{1}, p_{2})) (u_{1}) \cdot δ_{C^{+, u_{1}, *} (p_{1}, p_{2})},

where

δ_{C^{+, u_{1}, *} (p_{1}, p_{2})}

is a Dirac measure centered at

C^{+, u_{1}, *} (p_{1}, p_{2})

.
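
Since this meta-probability measure has finite support, it can be represented as a list of weighted atoms. The Python sketch below is a hedged illustration (the name `c_plus`, the XOR operation and the input distributions are assumptions made for the example).

```python
def c_plus(p1, p2, op):
    """C^{+,*}(p1, p2) as a list of atoms (weight, conditional law of U2 given U1 = u1)."""
    q = len(p1)
    atoms = []
    for u1 in range(q):
        joint = [p1[op(u1, u2)] * p2[u2] for u2 in range(q)]   # P(U1 = u1, U2 = u2)
        weight = sum(joint)                                     # (C^{-,*}(p1, p2))(u1)
        if weight > 0:                                          # u1 is in the support
            atoms.append((weight, [j / weight for j in joint]))
    return atoms

xor = lambda a, b: a ^ b
p1, p2 = [0.9, 0.1], [0.8, 0.2]
atoms = c_plus(p1, p2, xor)
# Barycenter of the atoms: the unconditional law of U2 = X2.
mean = [sum(w * p[u2] for w, p in atoms) for u2 in range(len(p2))]
```

Averaging the atoms with their weights recovers p_2, because U_2 = X_2 and u_1 → u_1 ∗ u_2 is a bijection for every fixed u_2.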

If

X_{1}, X_{2}, U_{1}

and

U_{2}

are as above,

C^{+, *} (p_{1}, p_{2})

is the meta-probability measure that describes the possible conditional probability distributions of

U_{2}

that are seen by someone having knowledge of

U_{1}

. Clearly,

C^{+, *}

is a random mapping from

Δ_{X} \times Δ_{X}

to

Δ_{X}

. In Appendix H, we show that

C^{+, *}

is a measurable random mapping. We also show in Appendix H that

C^{+, *}

is a continuous mapping from

Δ_{X} \times Δ_{X}

to

MP (X)

when the latter space is endowed with the weak-∗ topology. Lemmas 3 and 4 now imply that the push-forward mapping

C_{#}^{+, *}

is continuous under both the weak-∗ and the total variation topologies.

For every

{MP}_{1}, {MP}_{2} \in MP (X)

, we define the

(+, *)

-convolution of

{MP}_{1}

and

{MP}_{2}

as:

{({MP}_{1}, {MP}_{2})}^{+, *} = C_{#}^{+, *} ({MP}_{1} \times {MP}_{2}) \in MP (X) .

Since the product of meta-probability measures is continuous under both the weak-∗ and the total variation topologies (Appendix B and Appendix F), the

(+, *)

-convolution is also continuous under these topologies.

Proposition 10.

We have:

For every ${\hat{W}}_{1} \in {DMC}_{X_{1}, *}^{(o)}$ and ${\bar{W}}_{2} \in {DMC}_{X_{2}, *}^{(o)}$ , we have:

${MP}_{{\hat{W}}_{1} \oplus {\bar{W}}_{2}} = \frac{| X_{1} |}{| X_{1} | + | X_{2} |} {MP}_{{\hat{W}}_{1}}^{'} + \frac{| X_{2} |}{| X_{1} | + | X_{2} |} {MP}_{{\bar{W}}_{2}}^{'},$

where ${MP}_{{\hat{W}}_{1}}^{'}$ (respectively ${MP}_{{\bar{W}}_{2}}^{'}$ ) is the meta-push-forward of ${MP}_{{\hat{W}}_{1}}$ (respectively ${MP}_{{\bar{W}}_{2}}$ ) by the canonical injection from $X_{1}$ (respectively $X_{2}$ ) to $X_{1} ∐ X_{2}$ .
For every ${\hat{W}}_{1} \in {DMC}_{X_{1}, *}^{(o)}$ and ${\bar{W}}_{2} \in {DMC}_{X_{2}, *}^{(o)}$ , we have:

${MP}_{{\hat{W}}_{1} \otimes {\bar{W}}_{2}} = {MP}_{{\hat{W}}_{1}} \otimes {MP}_{{\bar{W}}_{2}} .$
For every $α \in [0, 1]$ and every ${\hat{W}}_{1}, {\hat{W}}_{2} \in {DMC}_{X, *}^{(o)}$ , we have:

${MP}_{[α {\hat{W}}_{1}, (1 - α) {\hat{W}}_{2}]} = α {MP}_{{\hat{W}}_{1}} + (1 - α) {MP}_{{\hat{W}}_{2}} .$
For every uniformity preserving binary operation ∗ on $X$ , and every $\hat{W} \in {DMC}_{X, *}^{(o)}$ , we have:

${MP}_{{\hat{W}}^{-}} = {({MP}_{\hat{W}}, {MP}_{\hat{W}})}^{-, *} .$
For every uniformity preserving binary operation ∗ on $X$ and every $\hat{W} \in {DMC}_{X, *}^{(o)}$ , we have:

${MP}_{{\hat{W}}^{+}} = {({MP}_{\hat{W}}, {MP}_{\hat{W}})}^{+, *} .$

Proof.

See Appendix I. ☐

Note that the polarization transformation formulas in Proposition 10 generalize the formulas given by Raginsky in [19] for binary-input channels.
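
The last two formulas of Proposition 10 can be checked numerically on a small example. The Python sketch below is a sanity check under stated assumptions, not the proof of Appendix I: it assumes the usual Arıkan-style definitions W^{-}(y_1, y_2 | u_1) = (1/|X|) Σ_{u_2} W(y_1 | u_1 ∗ u_2) W(y_2 | u_2) and W^{+}(y_1, y_2, u_1 | u_2) = (1/|X|) W(y_1 | u_1 ∗ u_2) W(y_2 | u_2), which are not restated in this section. The channel (a binary symmetric channel with XOR) and all function names are hypothetical, and atoms are identified after rounding to nine decimals.

```python
from itertools import product

def blackwell(W):
    """Atoms (weight, posterior) of MP_W for W[x][y] with uniform input."""
    nx = len(W)
    atoms = []
    for y in range(len(W[0])):
        po = sum(W[x][y] for x in range(nx)) / nx
        if po > 0:
            atoms.append((po, tuple(W[x][y] / (nx * po) for x in range(nx))))
    return atoms

def merge(atoms, digits=9):
    """Add up the weights of atoms located at the same (rounded) point."""
    out = {}
    for w, p in atoms:
        key = tuple(round(v, digits) for v in p)
        out[key] = out.get(key, 0.0) + w
    return out

def same_measure(a, b, tol=1e-9):
    a, b = merge(a), merge(b)
    return a.keys() == b.keys() and all(abs(a[k] - b[k]) < tol for k in a)

def polar_minus(W, op):
    q, ny = len(W), len(W[0])       # outputs (y1, y2), input u1
    return [[sum(W[op(u1, u2)][y1] * W[u2][y2] for u2 in range(q)) / q
             for y1 in range(ny) for y2 in range(ny)] for u1 in range(q)]

def polar_plus(W, op):
    q, ny = len(W), len(W[0])       # outputs (y1, y2, u1), input u2
    return [[W[op(u1, u2)][y1] * W[u2][y2] / q
             for y1 in range(ny) for y2 in range(ny) for u1 in range(q)]
            for u2 in range(q)]

def conv_minus(mp1, mp2, op):
    """(-,*)-convolution: push-forward of MP1 x MP2 by C^{-,*}."""
    q = len(mp1[0][1])
    return [(w1 * w2, tuple(sum(p1[op(u1, u2)] * p2[u2] for u2 in range(q))
                            for u1 in range(q)))
            for (w1, p1), (w2, p2) in product(mp1, mp2)]

def conv_plus(mp1, mp2, op):
    """(+,*)-convolution: push-forward of MP1 x MP2 by the random mapping C^{+,*}."""
    q = len(mp1[0][1])
    atoms = []
    for (w1, p1), (w2, p2) in product(mp1, mp2):
        for u1 in range(q):
            joint = [p1[op(u1, u2)] * p2[u2] for u2 in range(q)]
            s = sum(joint)
            if s > 0:
                atoms.append((w1 * w2 * s, tuple(j / s for j in joint)))
    return atoms

xor = lambda a, b: a ^ b
W = [[0.9, 0.1], [0.1, 0.9]]
mp = blackwell(W)
minus_ok = same_measure(blackwell(polar_minus(W, xor)), conv_minus(mp, mp, xor))
plus_ok = same_measure(blackwell(polar_plus(W, xor)), conv_plus(mp, mp, xor))
```

Both flags evaluate to `True` for this channel, which is consistent with the fourth and fifth formulas of the proposition.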

Proposition 11.

Assume that all equivalent channel spaces are endowed with the noisiness/weak-∗ or the total variation topology. We have:

The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \oplus {\bar{W}}_{2}$ from ${DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}$ to ${DMC}_{X_{1} ∐ X_{2}, *}^{(o)}$ is continuous.
The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}) \to {\hat{W}}_{1} \otimes {\bar{W}}_{2}$ from ${DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}$ to ${DMC}_{X_{1} \times X_{2}, *}^{(o)}$ is continuous.
The mapping $({\hat{W}}_{1}, {\bar{W}}_{2}, α) \to [α {\hat{W}}_{1}, (1 - α) {\bar{W}}_{2}]$ from ${DMC}_{X, *}^{(o)} \times {DMC}_{X, *}^{(o)} \times [0, 1]$ to ${DMC}_{X, *}^{(o)}$ is continuous.
For every uniformity preserving binary operation ∗ on $X$ , the mapping $\hat{W} \to {\hat{W}}^{-}$ from ${DMC}_{X, *}^{(o)}$ to ${DMC}_{X, *}^{(o)}$ is continuous.
For every uniformity preserving binary operation ∗ on $X$ , the mapping $\hat{W} \to {\hat{W}}^{+}$ from ${DMC}_{X, *}^{(o)}$ to ${DMC}_{X, *}^{(o)}$ is continuous.

Proof.

The proposition directly follows from Proposition 10 and the fact that all the meta-probability measure operations that are involved in the formulas are continuous under both the weak-∗ and the total variation topologies. ☐

Corollary 5.

Both

({DMC}_{X, *}^{(o)}, T_{X, *}^{(o)})

and

({DMC}_{X, *}^{(o)}, T_{T V, X, *}^{(o)})

are strongly contractible to every point in

{DMC}_{X, *}^{(o)}

.

Proof.

The proof is the same as that of Corollary 3. ☐

8. Discussion and Conclusions

Section 5 and Section 6 show that the quotient topology is relatively easy to work with. If one is interested in the space of equivalent channels sharing the same input and output alphabets, then using the quotient formulation of the topology seems to be the easiest way to prove theorems.

The continuity of the channel sum and the channel product on the whole product space

({DMC}_{X_{1}, *}^{(o)} \times {DMC}_{X_{2}, *}^{(o)}, T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)})

remains an open problem. As we mentioned in Section 6, it is sufficient to prove that the product topology

T_{s, X_{1}, *}^{(o)} \otimes T_{s, X_{2}, *}^{(o)}

is compactly generated.

Acknowledgments

I would like to thank Emre Telatar and Mohammad Bazzi for helpful discussions. I am also grateful to Maxim Raginsky for his comments.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DMC	Discrete memoryless channel
TV	Total variation

Appendix A. Proof of Lemma 1

Fix

ϵ > 0

, and let

(s, t) \in S \times T

. Since f is continuous, there exists a neighborhood

O_{s, t}

of

(s, t)

in

S \times T

such that for every

(s^{'}, t^{'}) \in O_{s, t}

, we have

| f (s^{'}, t^{'}) - f (s, t) | < \frac{ϵ}{2}

. Moreover, since products of open sets form a base for the product topology, there exists an open neighborhood

V_{s, t}

of s in

(S, V)

and an open neighborhood

U_{s, t}

of t in T such that

V_{s, t} \times U_{s, t} \subset O_{s, t}

.

Since

(S, V)

and

(T, U)

are compact, the product space is also compact. On the other hand, we have

⋃_{(s, t) \in S \times T} V_{s, t} \times U_{s, t} = S \times T

, so

{V_{s, t} \times U_{s, t}}_{(s, t) \in S \times T}

is an open cover of

S \times T

. Therefore, there exist

s_{1}, \dots, s_{n} \in S

and

t_{1}, \dots, t_{n} \in T

such that

⋃_{i = 1}^{n} V_{s_{i}, t_{i}} \times U_{s_{i}, t_{i}} = S \times T

.

Now, fix

s \in S

, and define

V_{s} = ⋂_{\begin{matrix} 1 \leq i \leq n, \\ s \in V_{s_{i}, t_{i}} \end{matrix}} V_{s_{i}, t_{i}}

. Since

V_{s}

is the intersection of finitely many open sets containing s,

V_{s}

is an open neighborhood of s in

(S, V)

. Let

s^{'} \in V_{s}

and

t \in T

. Since

⋃_{i = 1}^{n} V_{s_{i}, t_{i}} \times U_{s_{i}, t_{i}} = S \times T

, there exists

1 \leq i \leq n

such that

(s, t) \in V_{s_{i}, t_{i}} \times U_{s_{i}, t_{i}} \subset O_{s_{i}, t_{i}}

. Since

s \in V_{s_{i}, t_{i}}

, we have

V_{s} \subset V_{s_{i}, t_{i}}

, and so,

s^{'} \in V_{s_{i}, t_{i}}

. Therefore,

(s^{'}, t) \in V_{s_{i}, t_{i}} \times U_{s_{i}, t_{i}} \subset O_{s_{i}, t_{i}}

, hence:

| f (s^{'}, t) - f (s, t) | \leq | f (s^{'}, t) - f (s_{i}, t_{i}) | + | f (s_{i}, t_{i}) - f (s, t) | < \frac{ϵ}{2} + \frac{ϵ}{2} = ϵ .

However, this is true for every

t \in T

. Therefore,

sup_{t \in T} | f (s^{'}, t) - f (s, t) | \leq ϵ .

Appendix B. Continuity of the Product of Measures

For every subset A of

M_{1} \times M_{2}

and every

x_{1} \in M_{1}

, define

A_{2}^{x_{1}} = {x_{2} \in M_{2} : (x_{1}, x_{2}) \in A}

. Similarly, for every

x_{2} \in M_{2}

, define

A_{1}^{x_{2}} = {x_{1} \in M_{1} : (x_{1}, x_{2}) \in A}

. Let

P_{1}, P_{1}^{'} \in P (M_{1}, Σ_{1})

and

P_{2}, P_{2}^{'} \in P (M_{2}, Σ_{2})

. We have:

\begin{matrix} ∥ P_{1} \times P_{2} - P_{1}^{'} \times P_{2}^{'} ∥_{T V} & = sup_{A \in Σ_{1} \otimes Σ_{2}} | (P_{1} \times P_{2}) (A) - (P_{1}^{'} \times P_{2}^{'}) (A) | \\ \leq sup_{A \in Σ_{1} \otimes Σ_{2}} \{| (P_{1} \times P_{2}) (A) - (P_{1}^{'} \times P_{2}) (A) | + | (P_{1}^{'} \times P_{2}) (A) - (P_{1}^{'} \times P_{2}^{'}) (A) |\} \\ = sup_{A \in Σ_{1} \otimes Σ_{2}} \{|\int_{M_{2}} P_{1} (A_{1}^{x_{2}}) \cdot d P_{2} (x_{2}) - \int_{M_{2}} P_{1}^{'} (A_{1}^{x_{2}}) \cdot d P_{2} (x_{2})| \\ + |\int_{M_{1}} P_{2} (A_{2}^{x_{1}}) \cdot d P_{1}^{'} (x_{1}) - \int_{M_{1}} P_{2}^{'} (A_{2}^{x_{1}}) \cdot d P_{1}^{'} (x_{1})|\} \\ \leq sup_{A \in Σ_{1} \otimes Σ_{2}} \{\int_{M_{2}} |P_{1} (A_{1}^{x_{2}}) - P_{1}^{'} (A_{1}^{x_{2}})| \cdot d P_{2} (x_{2}) + \int_{M_{1}} |P_{2} (A_{2}^{x_{1}}) - P_{2}^{'} (A_{2}^{x_{1}})| \cdot d P_{1}^{'} (x_{1})\} \\ \leq \int_{M_{2}} (sup_{A_{1} \in Σ_{1}} |P_{1} (A_{1}) - P_{1}^{'} (A_{1})|) d P_{2} + \int_{M_{1}} (sup_{A_{2} \in Σ_{2}} |P_{2} (A_{2}) - P_{2}^{'} (A_{2})|) d P_{1}^{'} \\ = ∥ P_{1} - P_{1}^{'} ∥_{T V} + ∥ P_{2} - P_{2}^{'} ∥_{T V} . \end{matrix}

This shows that the product of measures is continuous under the total variation topology.
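
On finite sets, the bound derived above is easy to test numerically. The following Python sketch is only an illustration with arbitrarily chosen distributions; it uses the fact that, on a finite set, sup_A |P(A) − Q(A)| equals half the ℓ¹ distance between the probability vectors.

```python
from itertools import product

def tv(P, Q):
    """Total variation sup_A |P(A) - Q(A)| on a finite set (half the L1 distance)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(P, Q))

def prod_measure(P1, P2):
    """Product measure on M1 x M2, flattened in row-major order."""
    return [a * b for a, b in product(P1, P2)]

P1, Q1 = [0.5, 0.3, 0.2], [0.4, 0.4, 0.2]
P2, Q2 = [0.7, 0.3], [0.6, 0.4]
lhs = tv(prod_measure(P1, P2), prod_measure(Q1, Q2))
rhs = tv(P1, Q1) + tv(P2, Q2)
```

For these values the left-hand side is about 0.13 and the right-hand side is about 0.2.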

Appendix C. Proof of Proposition 1

Define the mapping

G : M \to R^{+} \cup {+ \infty}

as follows:

G (x) = \int_{M^{'}} g (y) d (R (x)) (y) .

For every

n \geq 0

, define the mapping

g_{n} : M^{'} \to R^{+}

as follows:

g_{n} (y) = \frac{1}{2^{n}} ⌊2^{n} \times min {n, g (y)}⌋ .

Clearly, for every

y \in M^{'}

we have:

$g_{n} (y) \leq g (y)$ for all $n \geq 0$ .
$g_{n} (y) \leq g_{n + 1} (y)$ for all $n \geq 0$ .
$lim_{n \to \infty} g_{n} (y) = g (y)$ .
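
These three properties of the dyadic approximation can be observed on a few sample values. The Python sketch below is illustrative only; the function name `g_n` takes a value g(y) ≥ 0 rather than the function g itself, and the sample values are arbitrary.

```python
from math import floor

def g_n(g_value, n):
    """n-th dyadic approximation 2^{-n} * floor(2^n * min(n, g(y))) of a value g(y) >= 0."""
    return floor(2 ** n * min(n, g_value)) / 2 ** n

values = [0.0, 0.3, 1.0, 2.718281828, 7.5]
# One row per value: the approximations g_0(y), ..., g_11(y).
approx = [[g_n(v, n) for n in range(12)] for v in values]
```

Each row is nondecreasing, stays below the corresponding value, and is within 2^{-n} of it once n exceeds the value.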

Moreover, for every fixed

n \geq 0

, we have:

$g_{n}$ is $Σ^{'}$ -measurable.
$g_{n}$ takes values in $\{\frac{i}{2^{n}} : 0 \leq i \leq n 2^{n}\}$ .

For every

0 \leq i \leq n 2^{n}

, let

B_{i, n} = {y \in M^{'} : g_{n} (y) = \frac{i}{2^{n}}}

. Since

g_{n}

is

Σ^{'}

-measurable, we have

B_{i, n} \in Σ^{'}

for every

0 \leq i \leq n 2^{n}

. Now, for every

n \geq 0

, define the mapping

G_{n} : M \to R^{+} \cup {+ \infty}

as follows:

\begin{matrix} G_{n} (x) & = \int_{M^{'}} g_{n} (y) d (R (x)) (y) = \int_{M^{'}} (\sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} 𝟙_{B_{i, n}} (y)) d (R (x)) (y) \\ = \sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} (R (x)) (B_{i, n}) = \sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} R_{B_{i, n}} (x) . \end{matrix}

Since the random mapping R is measurable and since

B_{i, n} \in Σ^{'}

, the mapping

R_{B_{i, n}}

is

Σ

-measurable for every

0 \leq i \leq n 2^{n}

. Therefore,

G_{n}

is

Σ

-measurable for every

n \geq 0

. Moreover, for every

x \in M

, we have:

\begin{matrix} lim_{n \to \infty} G_{n} (x) & = lim_{n \to \infty} \int_{M^{'}} g_{n} (y) d (R (x)) (y) \overset{(a)}{=} \int_{M^{'}} g (y) d (R (x)) (y) = G (x), \end{matrix}

where (a) follows from the monotone convergence theorem. We conclude that G is

Σ

-measurable because it is the point-wise limit of

Σ

-measurable functions. On the other hand, we have:

\begin{matrix} \int_{M^{'}} g_{n} \cdot d (R_{#} P) & = \sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} (R_{#} P) (B_{i, n}) = \sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} \int_{M} R_{B_{i, n}} (x) \cdot d P (x) \\ = \sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} \int_{M} (R (x)) (B_{i, n}) \cdot d P (x) = \sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} \int_{M} (\int_{M^{'}} 𝟙_{B_{i, n}} (y) \cdot d (R (x)) (y)) d P (x) \\ = \int_{M} (\int_{M^{'}} (\sum_{i = 0}^{n 2^{n}} \frac{i}{2^{n}} 𝟙_{B_{i, n}} (y)) d (R (x)) (y)) d P (x) \\ = \int_{M} (\int_{M^{'}} g_{n} (y) d (R (x)) (y)) d P (x) = \int_{M} G_{n} \cdot d P . \end{matrix}

Therefore,

\int_{M^{'}} g \cdot d (R_{#} P) \overset{(a)}{=} lim_{n \to \infty} \int_{M^{'}} g_{n} \cdot d (R_{#} P) = lim_{n \to \infty} \int_{M} G_{n} \cdot d P \overset{(b)}{=} \int_{M} G \cdot d P,

where (a) and (b) follow from the monotone convergence theorem.

Appendix D. Continuity of the Push-Forward by a Random Mapping

Let R be a measurable random mapping from

(M, Σ)

to

(M^{'}, Σ^{'})

. Let

P_{1}, P_{2} \in P (M, Σ)

. Define the signed measure

μ = P_{1} - P_{2}

, and let

{μ^{+}, μ^{-}}

be the Jordan measure decomposition of

μ

. It is easy to see that

∥ P_{1} - P_{2} ∥_{T V} = μ^{+} (M) = μ^{-} (M)

. For every

B \in Σ^{'}

, we have:

\begin{matrix} (R_{#} (P_{1})) (B) - (R_{#} (P_{2})) (B) & = \int_{M} R_{B} \cdot d P_{1} - \int_{M} R_{B} \cdot d P_{2} = \int_{M} R_{B} \cdot d (P_{1} - P_{2}) \\ = \int_{M} R_{B} \cdot d (μ^{+} - μ^{-}) \leq \int_{M} R_{B} \cdot d μ^{+} \leq {∥ R_{B} ∥}_{\infty} \cdot μ^{+} (M) \\ \overset{(a)}{\leq} μ^{+} (M) = {∥ P_{1} - P_{2} ∥}_{T V}, \end{matrix}

where (a) follows from the fact that

| R_{B} (x) | = | (R (x)) (B) | \leq 1

for every

x \in M

. We can similarly show that:

(R_{#} (P_{2})) (B) - (R_{#} (P_{1})) (B) \leq ∥ R_{B} ∥_{\infty} \cdot μ^{-} (M) \leq {∥ P_{1} - P_{2} ∥}_{T V} .

Therefore,

\begin{matrix} ∥ R_{#} (P_{1}) - R_{#} (P_{2}) ∥_{T V} = sup_{B \in Σ^{'}} | (R_{#} (P_{1})) (B) - (R_{#} (P_{2})) (B) | \leq ∥ P_{1} - P_{2} ∥_{T V} . \end{matrix}

This shows that the push-forward mapping

R_{#}

from

P (M, Σ)

to

P (M^{'}, Σ^{'})

is continuous under the total variation topology. This concludes the proof of Lemma 3.
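
In the finite case, a random mapping is a row-stochastic matrix and the contraction property just proven can be checked directly. The Python sketch below is an illustration with an arbitrarily chosen matrix and arbitrary distributions; the helper names are assumptions.

```python
def tv(P, Q):
    """Total variation sup_A |P(A) - Q(A)| on a finite set (half the L1 distance)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(P, Q))

def push_forward(R, P):
    """(R_# P)(y) = sum over x of P(x) * R[x][y] for a row-stochastic matrix R."""
    return [sum(P[x] * R[x][y] for x in range(len(P))) for y in range(len(R[0]))]

R = [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.3, 0.3, 0.4]]
P1, P2 = [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]
before = tv(P1, P2)
after = tv(push_forward(R, P1), push_forward(R, P2))
```

Here the distance drops from about 0.4 to about 0.16 after the push-forward.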

Now, assume that

U

is a Polish topology on M and

U^{'}

is an arbitrary topology on

M^{'}

. Let R be a measurable random mapping from

(M, B (M))

to

(M^{'}, B (M^{'}))

. Moreover, assume that R is a continuous mapping from

(M, U)

to

P (M^{'}, B (M^{'}))

when the latter space is endowed with the weak-∗ topology. Let

{(P_{n})}_{n \geq 0}

be a sequence of probability measures in

P (M, B (M))

that weakly-∗ converges to

P \in P (M, B (M))

.

Let

g : M^{'} \to R

be a bounded and continuous mapping. Define the mapping

G : M \to R

as follows:

G (x) = \int_{M^{'}} g (y) \cdot d (R (x)) (y) .

For every sequence

{(x_{n})}_{n \geq 0}

converging to x in M, the sequence

{(R (x_{n}))}_{n \geq 0}

weakly-∗ converges to

R (x)

in

P (M^{'}, B (M^{'}))

because of the continuity of R. This implies that the sequence

{(G (x_{n}))}_{n \geq 0}

converges to

G (x)

. Since

U

is a Polish topology (hence, metrizable and sequential [20]), this shows that G is a bounded and continuous mapping from

(M, U)

to

R

. Therefore, we have:

\begin{matrix} lim_{n \to \infty} \int_{M^{'}} g \cdot d (R_{#} P_{n}) \overset{(a)}{=} lim_{n \to \infty} \int_{M} G \cdot d P_{n} \overset{(b)}{=} \int_{M} G \cdot d P \overset{(c)}{=} \int_{M^{'}} g \cdot d (R_{#} P), \end{matrix}

where (a) and (c) follow from Corollary 2, and (b) follows from the fact that

{(P_{n})}_{n \geq 0}

weakly-∗ converges to P. This shows that

{(R_{#} P_{n})}_{n \geq 0}

weakly-∗ converges to

R_{#} P

. Now, since

U

is Polish, the weak-∗ topology on

P (M, B (M))

is metrizable [21]; hence, it is sequential [20]. This shows that the push-forward mapping

R_{#}

from

P (M, B (M))

to

P (M^{'}, B (M^{'}))

is continuous under the weak-∗ topology.

Appendix E. Proof of Lemma 5

For every

s \in S

, define the mapping

f_{s} : Δ_{X} \to R

as

f_{s} (p) = f (s, p)

. Clearly

f_{s}

is continuous for every

s \in S

. Therefore, the mapping

F_{s} : MP (X) \to R

defined as:

F_{s} (MP) = \int_{Δ_{X}} f_{s} \cdot d MP

is continuous in the weak-∗ topology of

MP (X)

.

Fix

ϵ > 0

, and let

(s, MP) \in S \times MP (X)

. Since

F_{s}

is continuous, there exists a weakly-∗ open neighborhood

U_{s, MP}

of MP such that

| F_{s} ({MP}^{'}) - F_{s} (MP) | < \frac{ϵ}{2}

for every

{MP}^{'} \in U_{s, MP}

. On the other hand, Lemma 1 implies the existence of an open neighborhood

V_{s}

of s in

(S, V)

such that for every

s^{'} \in V_{s}

, we have:

sup_{p \in Δ_{X}} | f (s^{'}, p) - f (s, p) | \leq \frac{ϵ}{2} .

Clearly

V_{s} \times U_{s, MP}

is an open neighborhood of

(s, MP)

in

S \times MP (X)

. For every

(s^{'}, {MP}^{'}) \in V_{s} \times U_{s, MP}

, we have:

\begin{matrix} | F (s^{'}, {MP}^{'}) - F (s, MP) | & \leq | F (s^{'}, {MP}^{'}) - F (s, {MP}^{'}) | + | F (s, {MP}^{'}) - F (s, MP) | \\ = |\int_{Δ_{X}} (f (s^{'}, p) - f (s, p)) \cdot d {MP}^{'} (p)| + | F_{s} ({MP}^{'}) - F_{s} (MP) | \\ < (\int_{Δ_{X}} | f (s^{'}, p) - f (s, p) | \cdot d {MP}^{'} (p)) + \frac{ϵ}{2} \overset{(a)}{\leq} \frac{ϵ}{2} + \frac{ϵ}{2} = ϵ, \end{matrix}

where (a) follows from the fact that

{MP}^{'}

is a meta-probability measure and

| f (s^{'}, p) - f (s, p) | \leq \frac{ϵ}{2}

for every

p \in Δ_{X}

. We conclude that F is continuous.

Appendix F. Weak-∗ Continuity of the Product of Meta-Probability Measures

Let

{({MP}_{1, n})}_{n \geq 0}

and

{({MP}_{2, n})}_{n \geq 0}

be two sequences that weakly-∗ converge to

{MP}_{1}

and

{MP}_{2}

in

MP (X_{1})

and

MP (X_{2})

, respectively. Let

f : Δ_{X_{1}} \times Δ_{X_{2}} \to R

be a continuous and bounded mapping. Define the mapping

F : Δ_{X_{1}} \times MP (X_{2}) \to R

as follows:

F (p_{1}, {MP}_{2}^{'}) = \int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2}^{'} (p_{2}) .

Fix

ϵ > 0

. Since

f (p_{1}, p_{2})

is continuous, Lemma 5 implies that F is continuous. Therefore, the mapping

p_{1} \to F (p_{1}, {MP}_{2})

is continuous on

Δ_{X_{1}}

, which implies that it is also bounded because

Δ_{X_{1}}

is compact. Therefore,

lim_{n \to \infty} \int_{Δ_{X_{1}}} F (p_{1}, {MP}_{2}) d {MP}_{1, n} (p_{1}) = \int_{Δ_{X_{1}}} F (p_{1}, {MP}_{2}) d {MP}_{1} (p_{1})

because

{({MP}_{1, n})}_{n \geq 0}

weakly-∗ converges to

{MP}_{1}

. This means that there exists

n_{1} \geq 0

such that for every

n \geq n_{1}

, we have:

|\int_{Δ_{X_{1}}} F (p_{1}, {MP}_{2}) d {MP}_{1, n} (p_{1}) - \int_{Δ_{X_{1}}} F (p_{1}, {MP}_{2}) d {MP}_{1} (p_{1})| < \frac{ϵ}{2} .

On the other hand, since F is continuous and since

MP (X_{2})

is compact under the weak-∗ topology [21], Lemma 1 implies the existence of a weakly-∗ open neighborhood

U_{{MP}_{2}}

of

{MP}_{2}

such that

| F (p_{1}, {MP}_{2}^{'}) - F (p_{1}, {MP}_{2}) | \leq \frac{ϵ}{2}

for every

{MP}_{2}^{'} \in U_{{MP}_{2}}

and every

p_{1} \in Δ_{X_{1}}

. Moreover, since

{MP}_{2, n}

weakly-∗ converges to

{MP}_{2}

, there exists

n_{2} \geq 0

such that

{MP}_{2, n} \in U_{{MP}_{2}}

for every

n \geq n_{2}

.

Therefore, for every

n \geq max {n_{1}, n_{2}}

, we have:

\begin{matrix} |\int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2, n} (p_{2})) d {MP}_{1, n} (p_{1}) - \int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2} (p_{2})) d {MP}_{1} (p_{1})| \\ \leq |\int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2, n} (p_{2})) d {MP}_{1, n} (p_{1}) - \int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2} (p_{2})) d {MP}_{1, n} (p_{1})| \\ + |\int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2} (p_{2})) d {MP}_{1, n} (p_{1}) - \int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2} (p_{2})) d {MP}_{1} (p_{1})| \\ = |\int_{Δ_{X_{1}}} (F (p_{1}, {MP}_{2, n}) - F (p_{1}, {MP}_{2})) d {MP}_{1, n} (p_{1})| \\ + |\int_{Δ_{X_{1}}} F (p_{1}, {MP}_{2}) d {MP}_{1, n} (p_{1}) - \int_{Δ_{X_{1}}} F (p_{1}, {MP}_{2}) d {MP}_{1} (p_{1})| \\ < \int_{Δ_{X_{1}}} |F (p_{1}, {MP}_{2, n}) - F (p_{1}, {MP}_{2})| d {MP}_{1, n} (p_{1}) + \frac{ϵ}{2} \overset{(a)}{\leq} \int_{Δ_{X_{1}}} \frac{ϵ}{2} \cdot d {MP}_{1, n} (p_{1}) + \frac{ϵ}{2} = ϵ, \end{matrix}

where (a) follows from the fact that

{MP}_{2, n} \in U_{{MP}_{2}}

for every

n \geq n_{2}

. Therefore,

\begin{matrix} lim_{n \to \infty} \int_{Δ_{X_{1}} \times Δ_{X_{2}}} f \cdot d ({MP}_{1, n} \times {MP}_{2, n}) & \overset{(a)}{=} lim_{n \to \infty} \int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2, n} (p_{2})) d {MP}_{1, n} (p_{1}) \\ = \int_{Δ_{X_{1}}} (\int_{Δ_{X_{2}}} f (p_{1}, p_{2}) d {MP}_{2} (p_{2})) d {MP}_{1} (p_{1}) \\ \overset{(b)}{=} \int_{Δ_{X_{1}} \times Δ_{X_{2}}} f \cdot d ({MP}_{1} \times {MP}_{2}), \end{matrix}

where (a) and (b) follow from Fubini’s theorem. We conclude that

{({MP}_{1, n} \times {MP}_{2, n})}_{n \geq 0}

weakly-∗ converges to

{MP}_{1} \times {MP}_{2}

. Therefore, the product of meta-probability measures is weakly-∗ continuous.

Appendix G. Continuity of the Capacity

Since the mapping I is continuous and since the space

Δ_{X} \times {DMC}_{X, Y}

is compact, the mapping I is uniformly continuous, i.e., for every

ϵ > 0

, there exists

δ (ϵ) > 0

such that for every

(p_{1}, W_{1}), (p_{2}, W_{2}) \in Δ_{X} \times {DMC}_{X, Y}

, if

∥ p_{1} - p_{2} ∥_{1} : = \sum_{x \in X} | p_{1} (x) - p_{2} (x) | < δ (ϵ)

and

d_{X, Y} (W_{1}, W_{2}) < δ (ϵ)

, then

| I (p_{1}, W_{1}) - I (p_{2}, W_{2}) | < ϵ .

Let

W_{1}, W_{2} \in {DMC}_{X, Y}

be such that

d_{X, Y} (W_{1}, W_{2}) < δ (ϵ)

. For every

p \in Δ_{X}

, we have

{∥ p - p ∥}_{1} = 0 < δ (ϵ)

, so we must have

| I (p, W_{1}) - I (p, W_{2}) | < ϵ

. Therefore,

\begin{matrix} I (p, W_{1}) < I (p, W_{2}) + ϵ \leq sup_{p^{'} \in Δ_{X}} I (p^{'}, W_{2}) + ϵ = C (W_{2}) + ϵ . \end{matrix}

Therefore,

\begin{matrix} C (W_{1}) = sup_{p \in Δ_{X}} I (p, W_{1}) \leq C (W_{2}) + ϵ . \end{matrix}

Similarly, we can show that

C (W_{2}) \leq C (W_{1}) + ϵ

. This implies that

| C (W_{1}) - C (W_{2}) | \leq ϵ

; hence, C is continuous.
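
The key step of this argument is that a uniform bound on | I (p, W_{1}) - I (p, W_{2}) | carries over to the suprema. The Python sketch below illustrates this for binary-input channels; it replaces the supremum over Δ_{X} by a maximum over a finite grid, which is an approximation introduced here for illustration and not the method of the paper. The channels and helper names are arbitrary assumptions.

```python
from math import log2

def mutual_information(p, W):
    """I(p, W) in bits for an input distribution p and a channel W[x][y]."""
    ny = len(W[0])
    out = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(ny)]
    return sum(p[x] * W[x][y] * log2(W[x][y] / out[y])
               for x in range(len(p)) for y in range(ny)
               if p[x] > 0 and W[x][y] > 0)

# Finite grid on the binary simplex, standing in for the supremum over all p.
grid = [(k / 1000, 1 - k / 1000) for k in range(1001)]

def capacity_on_grid(W):
    return max(mutual_information(p, W) for p in grid)

W1 = [[0.9, 0.1], [0.2, 0.8]]
W2 = [[0.89, 0.11], [0.21, 0.79]]        # a small perturbation of W1
gap = max(abs(mutual_information(p, W1) - mutual_information(p, W2)) for p in grid)
diff = abs(capacity_on_grid(W1) - capacity_on_grid(W2))
```

On the grid, `diff` can never exceed `gap`, for the same reason as in the derivation above.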

Appendix H. Measurability and Continuity of C^{+, *}

Let us first show that the random mapping

C^{+, *}

is measurable. We need to show that the mapping

C_{B}^{+, *} : Δ_{X} \times Δ_{X} \to R

is measurable for every

B \in B (Δ_{X})

, where:

C_{B}^{+, *} (p_{1}, p_{2}) = (C^{+, *} (p_{1}, p_{2})) (B), \forall p_{1}, p_{2} \in Δ_{X} .

For every

u_{1} \in X

, define the set:

A_{u_{1}} = {(p_{1}, p_{2}) \in Δ_{X} \times Δ_{X} : (C^{-, *} (p_{1}, p_{2})) (u_{1}) > 0} .

Clearly,

A_{u_{1}}

is open in

Δ_{X} \times Δ_{X}

(and so it is measurable). The mapping

C^{+, u_{1}, *}

is defined on

A_{u_{1}}

, and it is clearly continuous. Therefore, for every

B \in B (Δ_{X})

,

{(C^{+, u_{1}, *})}^{- 1} (B)

is measurable. We have:

\begin{matrix} C_{B}^{+, *} (p_{1}, p_{2}) & = (C^{+, *} (p_{1}, p_{2})) (B) = \sum_{\begin{matrix} u_{1} \in supp (C^{-, *} (p_{1}, p_{2})), \\ C^{+, u_{1}, *} (p_{1}, p_{2}) \in B \end{matrix}} (C^{-, *} (p_{1}, p_{2})) (u_{1}) \\ = \sum_{\begin{matrix} u_{1} \in X, \\ (p_{1}, p_{2}) \in A_{u_{1}}, \\ C^{+, u_{1}, *} (p_{1}, p_{2}) \in B \end{matrix}} (C^{-, *} (p_{1}, p_{2})) (u_{1}) \overset{(a)}{=} \sum_{u_{1} \in X} (C^{-, *} (p_{1}, p_{2})) (u_{1}) \cdot 𝟙_{{(C^{+, u_{1}, *})}^{- 1} (B)} (p_{1}, p_{2}), \end{matrix}

where (a) follows from the fact that

(p_{1}, p_{2}) \in {(C^{+, u_{1}, *})}^{- 1} (B)

if and only if

(p_{1}, p_{2}) \in A_{u_{1}}

and

C^{+, u_{1}, *} (p_{1}, p_{2}) \in B

. This shows that

C_{B}^{+, *}

is measurable for every

B \in B (Δ_{X})

. Therefore,

C^{+, *}

is a measurable random mapping.

Let

{(p_{1, n}, p_{2, n})}_{n \geq 0}

be a sequence converging to

(p_{1}, p_{2})

in

Δ_{X} \times Δ_{X}

. Since

C^{-, *}

is continuous, we have

lim_{n \to \infty} (C^{-, *} (p_{1, n}, p_{2, n})) (u_{1}) = (C^{-, *} (p_{1}, p_{2})) (u_{1})

for every

u_{1} \in X

. Therefore, for every

u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))

, there exists

n_{u_{1}} \geq 0

such that for every

n \geq n_{u_{1}}

, we have

(C^{-, *} (p_{1, n}, p_{2, n})) (u_{1}) > 0

. Let

n_{0} = max {n_{u_{1}} : u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))}

. For every

n \geq n_{0}

, we have

supp (C^{-, *} (p_{1}, p_{2})) \subset supp (C^{-, *} (p_{1, n}, p_{2, n}))

. Therefore, for every continuous and bounded mapping

g : Δ_{X} \to R

, we have:

\begin{matrix} lim_{n \to \infty} \int_{Δ_{X}} g \cdot d (C^{+, *} (p_{1, n}, p_{2, n})) & = lim_{n \to \infty} \sum_{u_{1} \in supp (C^{-, *} (p_{1, n}, p_{2, n}))} g (C^{+, u_{1}, *} (p_{1, n}, p_{2, n})) \cdot (C^{-, *} (p_{1, n}, p_{2, n})) (u_{1}) \\ \overset{(a)}{=} lim_{n \to \infty} \sum_{u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))} g (C^{+, u_{1}, *} (p_{1, n}, p_{2, n})) \cdot (C^{-, *} (p_{1, n}, p_{2, n})) (u_{1}) \\ \overset{(b)}{=} \sum_{u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))} g (C^{+, u_{1}, *} (p_{1}, p_{2})) \cdot (C^{-, *} (p_{1}, p_{2})) (u_{1}) \\ = \int_{Δ_{X}} g \cdot d (C^{+, *} (p_{1}, p_{2})), \end{matrix}

where (b) follows from the continuity of g and

C^{-, *}

and the continuity of

C^{+, u_{1}, *}

on

A_{u_{1}}

for every

u_{1} \in X

. (a) follows from the fact that:

\begin{matrix} lim_{n \to \infty} \sum_{\begin{matrix} u_{1} \in supp (C^{-, *} (p_{1, n}, p_{2, n})), \\ u_{1} \notin supp (C^{-, *} (p_{1}, p_{2})) \end{matrix}} & | g (C^{+, u_{1}, *} (p_{1, n}, p_{2, n})) \cdot (C^{-, *} (p_{1, n}, p_{2, n})) (u_{1}) | \\ \leq {∥ g ∥}_{\infty} lim_{n \to \infty} \sum_{\begin{matrix} u_{1} \in supp (C^{-, *} (p_{1, n}, p_{2, n})), \\ u_{1} \notin supp (C^{-, *} (p_{1}, p_{2})) \end{matrix}} (C^{-, *} (p_{1, n}, p_{2, n})) (u_{1}) \\ = {∥ g ∥}_{\infty} lim_{n \to \infty} (1 - \sum_{u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))} (C^{-, *} (p_{1, n}, p_{2, n})) (u_{1})) \\ = {∥ g ∥}_{\infty} (1 - \sum_{u_{1} \in supp (C^{-, *} (p_{1}, p_{2}))} (C^{-, *} (p_{1}, p_{2})) (u_{1})) = 0 . \end{matrix}

We conclude that the mapping

C^{+, *}

is a continuous mapping from

Δ_{X} \times Δ_{X}

to

MP (X)

when the latter space is endowed with the weak-∗ topology.

Appendix I. Proof of Proposition 10

Let

{\hat{W}}_{1} \in {DMC}_{X_{1}, *}^{(o)}

and

{\bar{W}}_{2} \in {DMC}_{X_{2}, *}^{(o)}

. Fix

W_{1} \in {\hat{W}}_{1}

and

W_{2} \in {\bar{W}}_{2}

, and let

Y_{1}

and

Y_{2}

be the output alphabets of

W_{1}

and

W_{2}

, respectively. We may assume without loss of generality that

Im (W_{1}) = Y_{1}

and

Im (W_{2}) = Y_{2}

.

Let

y \in Y_{1}

. We have:

\begin{matrix} P_{W_{1} \oplus W_{2}}^{o} (y) & = \frac{1}{| X_{1} ∐ X_{2} |} \sum_{x \in X_{1} ∐ X_{2}} (W_{1} \oplus W_{2}) (y | x) \\ = \frac{1}{| X_{1} | + | X_{2} |} \sum_{x \in X_{1}} W_{1} (y | x) = \frac{| X_{1} |}{| X_{1} | + | X_{2} |} P_{W_{1}}^{o} (y) > 0 . \end{matrix}

For every

x \in X_{1}

, we have:

{(W_{1} \oplus W_{2})}_{y}^{- 1} (x) = \frac{(W_{1} \oplus W_{2}) (y | x)}{(| X_{1} | + | X_{2} |) P_{W_{1} \oplus W_{2}}^{o} (y)} = \frac{W_{1} (y | x)}{| X_{1} | P_{W_{1}}^{o} (y)} = {(W_{1})}_{y}^{- 1} (x) .

On the other hand, for every

x \in X_{2}

, we have:

{(W_{1} \oplus W_{2})}_{y}^{- 1} (x) = \frac{(W_{1} \oplus W_{2}) (y | x)}{(| X_{1} | + | X_{2} |) P_{W_{1} \oplus W_{2}}^{o} (y)} = 0 .

Therefore,

{(W_{1} \oplus W_{2})}_{y}^{- 1} = ϕ_{1 #} {(W_{1})}_{y}^{- 1}

, where

ϕ_{1}

is the canonical injection from

X_{1}

to

X_{1} ∐ X_{2}

.

Similarly, for every

y \in Y_{2}

, we have

P_{W_{1} \oplus W_{2}}^{o} (y) = \frac{| X_{2} |}{| X_{1} | + | X_{2} |} P_{W_{2}}^{o} (y) > 0

and

{(W_{1} \oplus W_{2})}_{y}^{- 1} = ϕ_{2 #} {(W_{2})}_{y}^{- 1}

, where

ϕ_{2}

is the canonical injection from

X_{2}

to

X_{1} ∐ X_{2}

. For every

B \in B (Δ_{X_{1} ∐ X_{2}})

, we have:

\begin{matrix} {MP}_{W_{1} \oplus W_{2}} (B) & = \sum_{\begin{matrix} y \in Y_{1} ∐ Y_{2}, \\ {(W_{1} \oplus W_{2})}_{y}^{- 1} \in B \end{matrix}} P_{W_{1} \oplus W_{2}}^{o} (y) \\ = (\sum_{\begin{matrix} y \in Y_{1}, \\ ϕ_{1 #} {(W_{1})}_{y}^{- 1} \in B \end{matrix}} \frac{| X_{1} |}{| X_{1} | + | X_{2} |} P_{W_{1}}^{o} (y)) + (\sum_{\begin{matrix} y \in Y_{2}, \\ ϕ_{2 #} {(W_{2})}_{y}^{- 1} \in B \end{matrix}} \frac{| X_{2} |}{| X_{1} | + | X_{2} |} P_{W_{2}}^{o} (y)) \\ = \frac{| X_{1} |}{| X_{1} | + | X_{2} |} {MP}_{W_{1}} ({(ϕ_{1 #})}^{- 1} (B)) + \frac{| X_{2} |}{| X_{1} | + | X_{2} |} {MP}_{W_{2}} ({(ϕ_{2 #})}^{- 1} (B)) \\ = \frac{| X_{1} |}{| X_{1} | + | X_{2} |} (ϕ_{1 # #} {MP}_{W_{1}}) (B) + \frac{| X_{2} |}{| X_{1} | + | X_{2} |} (ϕ_{2 # #} {MP}_{W_{2}}) (B) . \end{matrix}

Therefore,

{MP}_{{\hat{W}}_{1} \oplus {\bar{W}}_{2}} = \frac{| X_{1} |}{| X_{1} | + | X_{2} |} ϕ_{1 # #} {MP}_{{\hat{W}}_{1}} + \frac{| X_{2} |}{| X_{1} | + | X_{2} |} ϕ_{2 # #} {MP}_{{\bar{W}}_{2}} .

This shows the first formula of Proposition 10.
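
The first formula can be checked numerically on finite channels. The Python sketch below is a hedged illustration: the two channels and all helper names are assumptions, the channel sum is represented by a block-diagonal transition matrix, the meta-push-forward by a canonical injection is implemented by padding posteriors with zeros, and atoms are identified after rounding to nine decimals.

```python
def blackwell(W):
    """Atoms (weight, posterior) of MP_W for W[x][y] with uniform input."""
    nx = len(W)
    atoms = []
    for y in range(len(W[0])):
        po = sum(W[x][y] for x in range(nx)) / nx
        if po > 0:
            atoms.append((po, tuple(W[x][y] / (nx * po) for x in range(nx))))
    return atoms

def merge(atoms, digits=9):
    """Add up the weights of atoms located at the same (rounded) point."""
    out = {}
    for w, p in atoms:
        key = tuple(round(v, digits) for v in p)
        out[key] = out.get(key, 0.0) + w
    return out

def same_measure(a, b, tol=1e-9):
    a, b = merge(a), merge(b)
    return a.keys() == b.keys() and all(abs(a[k] - b[k]) < tol for k in a)

def channel_sum(W1, W2):
    """Channel sum on the disjoint unions of inputs and outputs (block-diagonal matrix)."""
    ny1, ny2 = len(W1[0]), len(W2[0])
    return [row + [0.0] * ny2 for row in W1] + [[0.0] * ny1 + row for row in W2]

def sum_formula(mp1, mp2, n1, n2):
    """Right-hand side of the first formula; posteriors are padded with zeros."""
    a = [(w * n1 / (n1 + n2), p + (0.0,) * n2) for w, p in mp1]
    b = [(w * n2 / (n1 + n2), (0.0,) * n1 + p) for w, p in mp2]
    return a + b

W1 = [[0.9, 0.1], [0.1, 0.9]]                    # |X1| = 2
W2 = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]        # |X2| = 3
lhs = blackwell(channel_sum(W1, W2))
rhs = sum_formula(blackwell(W1), blackwell(W2), len(W1), len(W2))
sum_ok = same_measure(lhs, rhs)
```

For these channels `sum_ok` is `True`: both sides have the same four atoms with the same weights.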

For every

y = (y_{1}, y_{2}) \in Y_{1} \times Y_{2}

, we have:

\begin{matrix} P_{W_{1} \otimes W_{2}}^{o} (y) & = \sum_{(x_{1}, x_{2}) \in X_{1} \times X_{2}} \frac{1}{| X_{1} \times X_{2} |} (W_{1} \otimes W_{2}) (y_{1}, y_{2} | x_{1}, x_{2}) \\ = \sum_{\begin{matrix} x_{1} \in X_{1}, \\ x_{2} \in X_{2} \end{matrix}} \frac{W_{1} (y_{1} | x_{1})}{| X_{1} |} \cdot \frac{W_{2} (y_{2} | x_{2})}{| X_{2} |} = P_{W_{1}}^{o} (y_{1}) P_{W_{2}}^{o} (y_{2}) > 0 . \end{matrix}

For every

x = (x_{1}, x_{2}) \in X_{1} \times X_{2}

, we have:

\begin{matrix} {(W_{1} \otimes W_{2})}_{y}^{- 1} (x) & = \frac{(W_{1} \otimes W_{2}) (y | x)}{| X_{1} \times X_{2} | P_{W_{1} \otimes W_{2}}^{o} (y)} = \frac{W_{1} (y_{1} | x_{1})}{| X_{1} | P_{W_{1}}^{o} (y_{1})} \cdot \frac{W_{2} (y_{2} | x_{2})}{| X_{2} | P_{W_{2}}^{o} (y_{2})} \\ = {(W_{1})}_{y_{1}}^{- 1} (x_{1}) \cdot {(W_{2})}_{y_{2}}^{- 1} (x_{2}) = ({(W_{1})}_{y_{1}}^{- 1} \times {(W_{2})}_{y_{2}}^{- 1}) (x) . \end{matrix}

For every

B \in B (Δ_{X_{1} \times X_{2}})

, we have:

\begin{matrix} {MP}_{W_{1} \otimes W_{2}} (B) & = \sum_{\begin{matrix} y \in Y_{1} \times Y_{2}, \\ {(W_{1} \otimes W_{2})}_{y}^{- 1} \in B \end{matrix}} P_{W_{1} \otimes W_{2}}^{o} (y) = \sum_{\begin{matrix} y \in Y_{1} \times Y_{2}, \\ {(W_{1})}_{y_{1}}^{- 1} \times {(W_{2})}_{y_{2}}^{- 1} \in B \end{matrix}} P_{W_{1}}^{o} (y_{1}) P_{W_{2}}^{o} (y_{2}) \\ = \sum_{\begin{matrix} y \in Y_{1} \times Y_{2}, \\ Mul ({(W_{1})}_{y_{1}}^{- 1}, {(W_{2})}_{y_{2}}^{- 1}) \in B \end{matrix}} P_{W_{1}}^{o} (y_{1}) P_{W_{2}}^{o} (y_{2}) = ({MP}_{W_{1}} \times {MP}_{W_{2}}) ({Mul}^{- 1} (B)) \\ = ({Mul}_{#} ({MP}_{W_{1}} \times {MP}_{W_{2}})) (B) = ({MP}_{W_{1}} \otimes {MP}_{W_{2}}) (B) . \end{matrix}

Therefore,

{MP}_{{\hat{W}}_{1} \otimes {\bar{W}}_{2}} = {MP}_{{\hat{W}}_{1}} \otimes {MP}_{{\bar{W}}_{2}} .

This shows the second formula of Proposition 10.
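
The product formula lends itself to a small exact check. The following sketch is an illustration under assumptions rather than part of the proof: channels are stored as `W[x][y] = W(y|x)`, measures as dictionaries from posterior vectors to masses, pairs are flattened in lexicographic order, and the names `meta_probability`, `channel_product` and `mp_tensor` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def meta_probability(W):
    """MP_W as {posterior tuple: mass}, for W[x][y] = W(y|x) and uniform input."""
    nx = len(W)
    mp = {}
    for y in range(len(W[0])):
        p_out = sum(row[y] for row in W) / nx             # P_W^o(y)
        if p_out == 0:
            continue                                      # y lies outside Im(W)
        post = tuple(row[y] / (nx * p_out) for row in W)  # W_y^{-1}
        mp[post] = mp.get(post, 0) + p_out
    return mp

def channel_product(W1, W2):
    """Product channel: W(y1, y2 | x1, x2) = W1(y1|x1) * W2(y2|x2)."""
    return [[r1[y1] * r2[y2]
             for y1 in range(len(r1)) for y2 in range(len(r2))]
            for r1 in W1 for r2 in W2]

def mp_tensor(mp1, mp2):
    """Push-forward of the product measure by Mul(p1, p2) = p1 x p2."""
    out = {}
    for (p1, m1), (p2, m2) in product(mp1.items(), mp2.items()):
        p = tuple(a * b for a in p1 for b in p2)          # same ordering as channel_product
        out[p] = out.get(p, 0) + m1 * m2
    return out

# Assumed examples: a binary symmetric channel and a binary erasure channel.
W1 = [[F(3, 4), F(1, 4)], [F(1, 4), F(3, 4)]]
W2 = [[F(2, 3), F(1, 3), F(0)], [F(0), F(1, 3), F(2, 3)]]

lhs = meta_probability(channel_product(W1, W2))
rhs = mp_tensor(meta_probability(W1), meta_probability(W2))
assert lhs == rhs
```

Both sides accumulate mass on equal posteriors, so the comparison remains valid when distinct output pairs induce the same posterior.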

Now, let

α \in [0, 1]

and

{\hat{W}}_{1}, {\hat{W}}_{2} \in {DMC}_{X, *}^{(o)}

. Fix

W_{1} \in {\hat{W}}_{1}

and

W_{2} \in {\hat{W}}_{2}

, and let

Y_{1}

and

Y_{2}

be the output alphabets of

W_{1}

and

W_{2}

, respectively. We may assume without loss of generality that

Im (W_{1}) = Y_{1}

and

Im (W_{2}) = Y_{2}

. Let

W = [α W_{1}, (1 - α) W_{2}]

. If

α = 0

, then W is equivalent to

W_{2}

and

{MP}_{W} = {MP}_{W_{2}} = α {MP}_{W_{1}} + (1 - α) {MP}_{W_{2}}

. If

α = 1

, then W is equivalent to

W_{1}

and

{MP}_{W} = {MP}_{W_{1}} = α {MP}_{W_{1}} + (1 - α) {MP}_{W_{2}}

.

Assume now that

0 < α < 1

. For every

y \in Y_{1}

, we have:

\begin{matrix} P_{W}^{o} (y) = \frac{1}{| X |} \sum_{x \in X} W (y | x) = \frac{1}{| X |} \sum_{x \in X} α \cdot W_{1} (y | x) = α P_{W_{1}}^{o} (y) > 0 . \end{matrix}

For every

x \in X

, we have:

\begin{matrix} W_{y}^{- 1} (x) = \frac{W (y | x)}{| X | P_{W}^{o} (y)} = \frac{α W_{1} (y | x)}{| X | α P_{W_{1}}^{o} (y)} = {(W_{1})}_{y}^{- 1} (x) . \end{matrix}

Similarly, for every

y \in Y_{2}

, we have

P_{W}^{o} (y) = (1 - α) P_{W_{2}}^{o} (y) > 0

and

W_{y}^{- 1} = {(W_{2})}_{y}^{- 1}

. Therefore,

\begin{matrix} {MP}_{W} & = \sum_{y \in Y_{1} ∐ Y_{2}} P_{W}^{o} (y) \cdot δ_{W_{y}^{- 1}} = (\sum_{y \in Y_{1}} α P_{W_{1}}^{o} (y) \cdot δ_{{(W_{1})}_{y}^{- 1}}) + (\sum_{y \in Y_{2}} (1 - α) P_{W_{2}}^{o} (y) \cdot δ_{{(W_{2})}_{y}^{- 1}}) \\ = α {MP}_{W_{1}} + (1 - α) {MP}_{W_{2}} . \end{matrix}

Therefore,

{MP}_{[α {\hat{W}}_{1}, (1 - α) {\hat{W}}_{2}]} = α {MP}_{{\hat{W}}_{1}} + (1 - α) {MP}_{{\hat{W}}_{2}} .

This shows the third formula of Proposition 10.
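
The interpolation formula, including the boundary cases where α is 0 or 1, can be checked with exact arithmetic. This is a minimal sketch under assumptions, not the author's construction: channels are stored as `W[x][y] = W(y|x)`, the output alphabet of the interpolated channel lists the outputs of the first channel before those of the second, and the names `meta_probability`, `interpolate` and `mix` are hypothetical.

```python
from fractions import Fraction as F

def meta_probability(W):
    """MP_W as {posterior tuple: mass}, for W[x][y] = W(y|x) and uniform input."""
    nx = len(W)
    mp = {}
    for y in range(len(W[0])):
        p_out = sum(row[y] for row in W) / nx             # P_W^o(y)
        if p_out == 0:
            continue                                      # y lies outside Im(W)
        post = tuple(row[y] / (nx * p_out) for row in W)  # W_y^{-1}
        mp[post] = mp.get(post, 0) + p_out
    return mp

def interpolate(alpha, W1, W2):
    """Interpolated channel: use W1 with probability alpha, W2 otherwise."""
    return [[alpha * w for w in r1] + [(1 - alpha) * w for w in r2]
            for r1, r2 in zip(W1, W2)]

def mix(alpha, mp1, mp2):
    """Convex combination alpha * mp1 + (1 - alpha) * mp2, dropping zero masses."""
    out = {}
    for weight, mp in ((alpha, mp1), (1 - alpha, mp2)):
        for p, m in mp.items():
            if weight != 0:
                out[p] = out.get(p, 0) + weight * m
    return out

# Assumed examples with a common binary input alphabet.
W1 = [[F(3, 4), F(1, 4)], [F(1, 4), F(3, 4)]]
W2 = [[F(2, 3), F(1, 3), F(0)], [F(0), F(1, 3), F(2, 3)]]

results = {}
for alpha in (F(0), F(1, 3), F(1)):
    lhs = meta_probability(interpolate(alpha, W1, W2))
    rhs = mix(alpha, meta_probability(W1), meta_probability(W2))
    results[alpha] = (lhs == rhs)
assert all(results.values())
```

Dropping zero masses in `mix` mirrors the fact that outputs of probability zero do not contribute to the meta-probability measure.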

Now, let

\hat{W} \in {DMC}_{X, *}^{(o)}

, and let ∗ be a uniformity preserving binary operation on

X

. Fix

W \in \hat{W}

, and let

Y

be the output alphabet of W. We may assume without loss of generality that

Im (W) = Y

.

Let

U_{1}, U_{2}

be two independent random variables uniformly distributed in

X

. Let

X_{1} = U_{1} * U_{2}

and

X_{2} = U_{2}

. Send

X_{1}

and

X_{2}

through two independent copies of W, and let

Y_{1}

and

Y_{2}

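be the respective outputs. Because ∗ is uniformity preserving, the inputs of the two copies are independent and uniformly distributed, and so the two outputs are independent.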

For every

(y_{1}, y_{2}) \in Y^{2}

, we have:

P_{W^{-}}^{o} (y_{1}, y_{2}) = P_{Y_{1}, Y_{2}} (y_{1}, y_{2}) = P_{Y_{1}} (y_{1}) P_{Y_{2}} (y_{2}) = P_{W}^{o} (y_{1}) P_{W}^{o} (y_{2}) > 0 .

For every

u_{1} \in X

, we have:

\begin{matrix} {(W^{-})}_{y_{1}, y_{2}}^{- 1} (u_{1}) & = P_{U_{1} | Y_{1}, Y_{2}} (u_{1} | y_{1}, y_{2}) = \sum_{u_{2} \in X} P_{U_{1}, U_{2} | Y_{1}, Y_{2}} (u_{1}, u_{2} | y_{1}, y_{2}) \\ = \sum_{u_{2} \in X} P_{X_{1}, X_{2} | Y_{1}, Y_{2}} (u_{1} * u_{2}, u_{2} | y_{1}, y_{2}) = \sum_{u_{2} \in X} P_{X_{1} | Y_{1}} (u_{1} * u_{2} | y_{1}) P_{X_{2} | Y_{2}} (u_{2} | y_{2}) \\ = \sum_{u_{2} \in X} W_{y_{1}}^{- 1} (u_{1} * u_{2}) W_{y_{2}}^{- 1} (u_{2}) = (C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (u_{1}) . \end{matrix}

For every

B \in B (Δ_{X})

, we have:

\begin{matrix} {MP}_{W^{-}} (B) & = \sum_{\begin{matrix} y \in Y^{2}, \\ {(W^{-})}_{y}^{- 1} \in B \end{matrix}} P_{W^{-}}^{o} (y) = \sum_{\begin{matrix} (y_{1}, y_{2}) \in Y^{2}, \\ C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1}) \in B \end{matrix}} P_{W}^{o} (y_{1}) P_{W}^{o} (y_{2}) \\ = ({MP}_{W} \times {MP}_{W}) ({(C^{-, *})}^{- 1} (B)) = (C_{#}^{-, *} ({MP}_{W} \times {MP}_{W})) (B) = {({MP}_{W}, {MP}_{W})}^{-, *} (B) . \end{matrix}

Therefore,

{MP}_{{\hat{W}}^{-}} = {({MP}_{\hat{W}}, {MP}_{\hat{W}})}^{-, *} .

This shows the fourth formula of Proposition 10.
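
The formula for the minus transformation can be checked on a small example. The sketch below is illustrative only: it assumes that ∗ is addition modulo 3 (one particular uniformity preserving operation), stores channels as `W[x][y] = W(y|x)`, and uses the hypothetical names `meta_probability`, `minus_channel` and `mp_minus`.

```python
from fractions import Fraction as F
from itertools import product

def meta_probability(W):
    """MP_W as {posterior tuple: mass}, for W[x][y] = W(y|x) and uniform input."""
    nx = len(W)
    mp = {}
    for y in range(len(W[0])):
        p_out = sum(row[y] for row in W) / nx             # P_W^o(y)
        if p_out == 0:
            continue                                      # y lies outside Im(W)
        post = tuple(row[y] / (nx * p_out) for row in W)  # W_y^{-1}
        mp[post] = mp.get(post, 0) + p_out
    return mp

def minus_channel(W, op):
    """W^-(y1, y2 | u1) = (1/|X|) * sum over u2 of W(y1 | u1*u2) W(y2 | u2)."""
    q, ny = len(W), len(W[0])
    return [[sum(W[op(u1, u2)][y1] * W[u2][y2] for u2 in range(q)) / q
             for y1 in range(ny) for y2 in range(ny)]
            for u1 in range(q)]

def mp_minus(mp, op, q):
    """Push-forward of mp x mp by C^-(p1, p2)(u1) = sum over u2 of p1(u1*u2) p2(u2)."""
    out = {}
    for (p1, m1), (p2, m2) in product(mp.items(), mp.items()):
        p = tuple(sum(p1[op(u1, u2)] * p2[u2] for u2 in range(q))
                  for u1 in range(q))
        out[p] = out.get(p, 0) + m1 * m2
    return out

def add_mod3(a, b):
    """Assumed choice of the binary operation: addition modulo 3."""
    return (a + b) % 3

# Assumed example: a ternary-input channel with three outputs.
W = [[F(1, 2), F(1, 2), F(0)], [F(0), F(1, 2), F(1, 2)], [F(1, 2), F(0), F(1, 2)]]

lhs = meta_probability(minus_channel(W, add_mod3))
rhs = mp_minus(meta_probability(W), add_mod3, len(W))
assert lhs == rhs
```

Any other operation for which the map from (a, b) to (a ∗ b, b) is a bijection could be passed in place of `add_mod3`.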

For every

(y_{1}, y_{2}, u_{1}) \in Y^{2} \times X

, we have:

\begin{matrix} P_{W^{+}}^{o} (y_{1}, y_{2}, u_{1}) & = P_{Y_{1}, Y_{2}, U_{1}} (y_{1}, y_{2}, u_{1}) = P_{Y_{1}, Y_{2}} (y_{1}, y_{2}) P_{U_{1} | Y_{1}, Y_{2}} (u_{1} | y_{1}, y_{2}) \\ = P_{W}^{o} (y_{1}) P_{W}^{o} (y_{2}) \cdot (C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (u_{1}) . \end{matrix}

Therefore,

Im (W^{+}) = ⋃_{(y_{1}, y_{2}) \in Y^{2}} \{(y_{1}, y_{2})\} \times supp (C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) .

For every

(y_{1}, y_{2}, u_{1}) \in Im (W^{+})

, we have:

\begin{matrix} {(W^{+})}_{y_{1}, y_{2}, u_{1}}^{- 1} (u_{2}) & = P_{U_{2} | Y_{1}, Y_{2}, U_{1}} (u_{2} | y_{1}, y_{2}, u_{1}) = \frac{P_{U_{1}, U_{2} | Y_{1}, Y_{2}} (u_{1}, u_{2} | y_{1}, y_{2})}{P_{U_{1} | Y_{1}, Y_{2}} (u_{1} | y_{1}, y_{2})} \\ = \frac{P_{X_{1} | Y_{1}} (u_{1} * u_{2} | y_{1}) P_{X_{2} | Y_{2}} (u_{2} | y_{2})}{(C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (u_{1})} = \frac{W_{y_{1}}^{- 1} (u_{1} * u_{2}) W_{y_{2}}^{- 1} (u_{2})}{(C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (u_{1})} \\ = (C^{+, u_{1}, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (u_{2}) . \end{matrix}

For every

B \in B (Δ_{X})

, we have:

\begin{matrix} {MP}_{W^{+}} (B) & = \sum_{(y_{1}, y_{2}) \in Y^{2}} \sum_{\begin{matrix} u_{1} \in supp (C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})), \\ C^{+, u_{1}, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1}) \in B \end{matrix}} P_{W}^{o} (y_{1}) P_{W}^{o} (y_{2}) \cdot (C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (u_{1}) \\ = \sum_{(y_{1}, y_{2}) \in Y^{2}} P_{W}^{o} (y_{1}) P_{W}^{o} (y_{2}) \sum_{\begin{matrix} u_{1} \in supp (C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})), \\ C^{+, u_{1}, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1}) \in B \end{matrix}} (C^{-, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (u_{1}) \\ = \sum_{(y_{1}, y_{2}) \in Y^{2}} P_{W}^{o} (y_{1}) P_{W}^{o} (y_{2}) (C^{+, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1})) (B) \\ = \sum_{(y_{1}, y_{2}) \in Y^{2}} P_{W}^{o} (y_{1}) P_{W}^{o} (y_{2}) C_{B}^{+, *} (W_{y_{1}}^{- 1}, W_{y_{2}}^{- 1}) \\ = \int_{Δ_{X} \times Δ_{X}} C_{B}^{+, *} (p_{1}, p_{2}) \cdot d ({MP}_{W} \times {MP}_{W}) (p_{1}, p_{2}) \\ = (C_{#}^{+, *} ({MP}_{W} \times {MP}_{W})) (B) = {({MP}_{W}, {MP}_{W})}^{+, *} (B) . \end{matrix}

Therefore,

{MP}_{{\hat{W}}^{+}} = {({MP}_{\hat{W}}, {MP}_{\hat{W}})}^{+, *} .

This shows the fifth and last formula of Proposition 10.
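
The formula for the plus transformation can be verified in the same exact manner. The sketch below is an illustration under assumptions and not part of the proof: ∗ is taken to be addition modulo 3, channels are stored as `W[x][y] = W(y|x)`, and the names `meta_probability`, `plus_channel` and `mp_plus` are hypothetical. The random mapping C^{+,∗} is realised by spreading the mass of each pair of posteriors over the values of u_1 in the support of C^{-,∗}.

```python
from fractions import Fraction as F
from itertools import product

def meta_probability(W):
    """MP_W as {posterior tuple: mass}, for W[x][y] = W(y|x) and uniform input."""
    nx = len(W)
    mp = {}
    for y in range(len(W[0])):
        p_out = sum(row[y] for row in W) / nx             # P_W^o(y)
        if p_out == 0:
            continue                                      # y lies outside Im(W)
        post = tuple(row[y] / (nx * p_out) for row in W)  # W_y^{-1}
        mp[post] = mp.get(post, 0) + p_out
    return mp

def plus_channel(W, op):
    """W^+(y1, y2, u1 | u2) = (1/|X|) * W(y1 | u1*u2) * W(y2 | u2)."""
    q, ny = len(W), len(W[0])
    return [[W[op(u1, u2)][y1] * W[u2][y2] / q
             for y1 in range(ny) for y2 in range(ny) for u1 in range(q)]
            for u2 in range(q)]

def mp_plus(mp, op, q):
    """Push-forward of mp x mp by the random mapping C^+."""
    out = {}
    for (p1, m1), (p2, m2) in product(mp.items(), mp.items()):
        for u1 in range(q):
            c_minus = sum(p1[op(u1, u2)] * p2[u2] for u2 in range(q))
            if c_minus == 0:
                continue                                  # u1 outside supp(C^-(p1, p2))
            post = tuple(p1[op(u1, u2)] * p2[u2] / c_minus for u2 in range(q))
            out[post] = out.get(post, 0) + m1 * m2 * c_minus
    return out

def add_mod3(a, b):
    """Assumed choice of the binary operation: addition modulo 3."""
    return (a + b) % 3

# Assumed example: a ternary-input channel with three outputs.
W = [[F(1, 2), F(1, 2), F(0)], [F(0), F(1, 2), F(1, 2)], [F(1, 2), F(0), F(1, 2)]]

lhs = meta_probability(plus_channel(W, add_mod3))
rhs = mp_plus(meta_probability(W), add_mod3, len(W))
assert lhs == rhs
```

The example channel has zero entries, so some triples (y_1, y_2, u_1) fall outside Im(W^+) and the support condition is exercised on both sides.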

References

1. Nasser, R. Continuity of Channel Parameters and Operations under Various DMC Topologies. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 3185–3189.
2. Polyanskiy, Y.; Poor, H.V.; Verdu, S. Channel Coding Rate in the Finite Blocklength Regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359.
3. Polyanskiy, Y. Saddle Point in the Minimax Converse for Channel Coding. IEEE Trans. Inf. Theory 2013, 59, 2576–2595.
4. Schwarte, H. On weak convergence of probability measures, channel capacity and code error probabilities. IEEE Trans. Inf. Theory 1996, 42, 1549–1551.
5. Richardson, T.; Urbanke, R. Modern Coding Theory; Cambridge University Press: New York, NY, USA, 2008.
6. Nasser, R. Topological Structures on DMC spaces. arXiv, 2017; arXiv:1701.04467.
7. Engelking, R. General Topology; Monografie Matematyczne: Warsaw, Poland, 1977.
8. Schieler, C.; Cuff, P. The Henchman Problem: Measuring Secrecy by the Minimum Distortion in a List. IEEE Trans. Inf. Theory 2016, 62, 3436–3450.
9. Torgersen, E. Comparison of Statistical Experiments; Encyclopedia of Mathematics and its Applications, Cambridge University Press: Cambridge, UK, 1991.
10. Şaşoğlu, E.; Telatar, E.; Arıkan, E. Polarization for Arbitrary Discrete Memoryless Channels. In Proceedings of the IEEE Information Theory Workshop, Taormina, Italy, 11–16 October 2009; pp. 144–148.
11. Nasser, R. An Ergodic Theory of Binary Operations, Part II: Applications to Polarization. IEEE Trans. Inf. Theory 2017, 63, 1063–1083.
12. Cover, T.; Thomas, J. Elements of Information Theory, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2006.
13. Shannon, C. The zero error capacity of a noisy channel. IRE Trans. Inf. Theory 1956, 2, 8–19.
14. Mondelli, M.; Hassani, S.H.; Urbanke, R.L. From Polar to Reed-Muller Codes: A Technique to Improve the Finite-Length Performance. IEEE Trans. Inf. Theory 2014, 62, 3084–3091.
15. Arıkan, E. Channel Polarization: A Method for Constructing Capacity-Achieving Codes for Symmetric Binary-Input Memoryless Channels. IEEE Trans. Inf. Theory 2009, 55, 3051–3073.
16. Shannon, C. A Note on a Partial Ordering for Communication Channels. Inform. Contr. 1958, 1, 390–397.
17. Steenrod, N.E. A convenient category of topological spaces. Michigan Math. J. 1967, 14, 133–152.
18. Nasser, R. An Ergodic Theory of Binary Operations, Part I: Key Properties. IEEE Trans. Inf. Theory 2016, 62, 6931–6952.
19. Raginsky, M. Channel Polarization and Blackwell Measures. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, 10–15 July 2016; pp. 56–60.
20. Franklin, S. Spaces in which sequences suffice. Fundam. Math. 1965, 57, 107–115.
21. Villani, C. Topics in Optimal Transportation; Graduate studies in mathematics, American Mathematical Society: Madison, WI, USA, 2003.

© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
