Article

A New Lower Bound for Noisy Permutation Channels via Divergence Packing †

Lugaoze Feng, Guocheng Lv, Xunan Li and Ye Jin

1 State Key Laboratory of Photonics and Communications, Peking University, Beijing 100871, China
2 National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
* Author to whom correspondence should be addressed.
† This work was presented in part at the 2025 IEEE International Symposium on Information Theory (ISIT): Lugaoze, F.; Xunan, L.; Guocheng, L.; Ye, J. New Channel Coding Lower Bounds for Noisy Permutation Channels. In Proceedings of the 2025 IEEE International Symposium on Information Theory (ISIT), Ann Arbor, MI, USA, 22–27 June 2025.
Entropy 2025, 27(11), 1101; https://doi.org/10.3390/e27111101
Submission received: 4 September 2025 / Revised: 12 October 2025 / Accepted: 23 October 2025 / Published: 25 October 2025
(This article belongs to the Special Issue Next-Generation Channel Coding: Theory and Applications)

Abstract

Noisy permutation channels are used to model biological storage systems and communication networks. For noisy permutation channels whose matrices are strictly positive, full-rank, and square, new achievability bounds are given in this paper that are tighter than existing bounds. To derive this bound, we use $\epsilon$-packing with the Kullback–Leibler divergence as a distance and introduce a novel way to characterize the overlapping relationship of error events. The new bound shows analytically that for such a matrix W, the logarithm of the achievable code size with a given blocklength n and error probability $\epsilon$ is closely approximated by $\ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W)$, where $\ell = \operatorname{rank}(W) - 1$, $G = 2\binom{\ell+1}{2}$, and $V(W)$ is a characteristic of the channel referred to as the channel volume ratio. Our numerical results show that the new achievability bound significantly improves the lower bound of channel coding. Additionally, the Gaussian approximation can replace the complex computation of the new achievability bound over a wide range of relevant parameters.

1. Introduction

The noisy permutation channel, consisting of a discrete memoryless channel (DMC) followed by a uniform random permutation block, was introduced in [1]; it is a point-to-point communication model that captures the out-of-order arrival of packets. Such channels can be used as models of communication networks and DNA storage systems, where the ordering of the codeword does not carry any information. Several advances have been made on asymptotic bounds, including binary channels [2], the capacity of full-rank DMCs [1], and converse bounds based on divergence covering [3,4].
The code lengths of practical codes in communication systems are on the order of hundreds or thousands, invalidating the asymptotic assumptions of classical information theory. We initiate the study of new channel coding bounds that extend the information-theoretic results for noisy permutation channels to finite blocklength analysis. Finite blocklength analysis and finer asymptotics are important branches of research in information theory, and interest in this topic has been growing since the seminal works [5,6,7]. These works show that the channel coding rate in the finite blocklength regime is closely related to the information density [8], i.e., a stochastic measure determined by the input distribution and the channel noise. The second-order approximation of conventional channels involves the variance of the information density, which has been shown to approximate the channel coding rate well at short blocklengths.
Because the codeword positions are randomly permuted in noisy permutation channels, conventional analysis techniques, specifically the dependence-testing (DT) bound and the random coding union (RCU) bound ([9] Theorems 17 and 18), become inapplicable. Since the messages are mapped to different probability distributions in noisy permutation channels [1], the only statistical information the receiver can use from the received codeword $Y^n$ is which marginal distribution $Y^n$ belongs to. Therefore, the finer asymptotics differ completely from those of conventional channels.
The main contributions of our work are the following:
  • We present a new nonasymptotic achievability bound for noisy permutation channels whose matrices W are strictly positive, square, and full-rank. The two main ingredients of our proof are the following: the $\epsilon$-packing [10,11] with the Kullback–Leibler (KL) divergence as a distance, and an analysis of the error events that decouples the union of error events from the message set. Additionally, this new bound is stronger than the existing bound ([1] Equation (36)).
  • We show that the finite blocklength achievable code size can be approximated by
$$\log M^*(n, \epsilon) \approx \ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W), \quad (1)$$
    where $\ell = \operatorname{rank}(W) - 1$, $G = 2\binom{\ell+1}{2}$, and $V(W)$ is the channel volume ratio.
  • To complement these results and assist in understanding them, we particularize all these results to typical DMCs, i.e., BSC and BEC permutation channels. Additionally, our Gaussian approximations, through numerical results, lead to tight approximations of the achievable code size for blocklengths n as short as 100 in these cases.
We continue this section with the motivation and application. Section 2 sets up the system model. In Section 3, we provide methods to construct a set of divergence packing centers (message set) and bounds for packing numbers. In Section 4, we present our new achievability bound and particularize this bound to the typical DMCs. Section 5 studies the asymptotic behavior of the achievability bound using Gaussian approximation analysis and applies it to the typical DMCs. In Section 6, we present numerical results. We conclude this paper in Section 7.

1.1. Motivation and Application

The noisy permutation channel models the scenario where codewords undergo reordering, which occurs in communication networks and DNA storage systems. We briefly outline some applications of this channel.
(a)
Communication Networks: First, noisy permutation channels are a suitable model for multipath routed networks in which packets arrive with different delays [12,13]. In such networks, data packets within the same group often take paths of differing lengths, bandwidths, and congestion levels as they traverse the network to the receiver. Consequently, transmission delays exhibit unpredictable variations, causing these packets to arrive at their destination in a potentially different order from their original sending sequence. Moreover, during transmission, data packets may be lost or corrupted due to reasons such as link failures or buffer overflow. Treating the set of all possible packets as the input alphabet, one can model this scenario as a noisy permutation channel.
(b)
DNA Storage Systems: DNA storage systems, known for their high density and long-term reliability, are another motivation for our research [14,15,16]. Such a system can be seen as an out-of-order communication channel [1,14,17]. The source data is written onto DNA molecules (or codeword strings) consisting of letters from an alphabet of four nucleotides $\{A, C, G, T\}$. Because physical conditions cause random fragmentation of DNA molecules, long-read sequencing technology, such as nanopore sequencing [18], is employed at the receiver to read entire, randomly permuted DNA molecules. In the noisy permutation channel, the DMC matrix models potential errors during the synthesis and storage of DNA molecules, followed by a random permutation block that represents the random permutation of DNA molecules. For a comprehensive overview of DNA storage systems, see [1,14]; studies presenting specific DNA-based storage coding schemes include [17,19,20].

1.2. Notation

We use $[n] = \{1, \ldots, n\}$ and $\mathbb{Z}_{\geq a} = \{a, a+1, a+2, \ldots\}$ to represent integer intervals. Let $\mathbb{1}\{\cdot\}$ denote the indicator function. For a given alphabet $\mathcal{X}$ and a random variable $X \in \mathcal{X}$, we write $X \sim P_X$ to indicate that the random variable X follows the distribution $P_X$. Let $X^n = (X_1, \ldots, X_n)$ and $x^n = (x_1, \ldots, x_n)$ denote a random vector and its realization in the n-fold Cartesian product $\mathcal{X}^n$, respectively. The $(|\mathcal{X}|-1)$-dimensional simplex on $\mathbb{R}^{|\mathcal{X}|}$ is the set of points
$$\Delta_{|\mathcal{X}|-1} = \left\{ (p_1, p_2, \ldots, p_{|\mathcal{X}|}) \in \mathbb{R}^{|\mathcal{X}|} \,\middle|\, \sum_{x=1}^{|\mathcal{X}|} p_x = 1,\ p_x \geq 0 \right\}. \quad (2)$$
The KL divergence and the total variation distance are denoted by $D(\cdot\|\cdot)$ and $\mathrm{TV}(\cdot,\cdot)$, respectively. For a matrix A, we use $\operatorname{rank}(A)$ to denote the rank of A. Probability and mathematical expectation are denoted by $\mathbb{P}[\cdot]$ and $\mathbb{E}[\cdot]$, respectively. The cumulative distribution function of the standard normal distribution is denoted by $\Phi(\cdot)$, and $\Phi^{-1}(\cdot)$ is its inverse.

2. System Model

The code $C_n$ consists of a message set $\mathcal{M}$, a (possibly randomized) encoder $f_n : \mathcal{M} \to \mathcal{X}^n$, and a (possibly randomized) decoder $g_n : \mathcal{Y}^n \to \mathcal{M} \cup \{e\}$, where 'e' indicates that the decoder declares an error. We write $\mathcal{X}$ for the finite input alphabet and $\mathcal{Y}$ for the finite output alphabet, and $|\mathcal{M}|$ denotes the code size of $C_n$.
The input alphabet $\mathcal{X}$ abstracts the transmitted codeword symbols in various applications. For instance, in DNA storage applications, $\mathcal{X}$ denotes the alphabet of four nucleotides, while the length-n codeword $X^n \in \mathcal{X}^n$ represents the DNA molecule formed by the corresponding n nucleotides. The sender uses the encoder $f_n$ to encode the message M into a codeword $X^n$, which is then passed through the DMC W to produce $Z^n \in \mathcal{Y}^n$. The DMC is defined by an $|\mathcal{X}| \times |\mathcal{Y}|$ matrix W, where $W(z|x)$ denotes the probability that the output $z \in \mathcal{Y}$ occurs given the input $x \in \mathcal{X}$. Finally, $Z^n$ passes through a random permutation block $P_{Y^n|Z^n}$ to generate $Y^n \in \mathcal{Y}^n$. The random permutation block operates as follows: a permutation $\sigma : \{1, \ldots, n\} \to \{1, \ldots, n\}$ is drawn uniformly at random from the symmetric group $S_n$ over $\{1, \ldots, n\}$, and $Y^n$ is generated by permuting $Z^n$ according to $Y_i = Z_{\sigma(i)}$ for all $i \in \{1, \ldots, n\}$. The receiver uses the decoder $g_n$ to produce the estimate $\hat{M}$ of the message. We can describe these steps by the following Markov chain:
$$M \to X^n \to Z^n \to Y^n \to \hat{M}. \quad (3)$$
The channel model of the noisy permutation channel is illustrated in Figure 1. For codewords drawn i.i.d. from $P_X$, the random permutation block does not change the probability distribution of the sequence [1]: if $X^n \overset{\text{i.i.d.}}{\sim} P_X$, then $Z^n \overset{\text{i.i.d.}}{\sim} P_X W$ and $Y^n \overset{\text{i.i.d.}}{\sim} P_X W$.
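To make the channel model concrete, here is a minimal simulation sketch (our own illustration, not code from the paper; the seed, blocklength, and example matrix are arbitrary choices): each codeword symbol passes through the DMC W, and the resulting sequence is then uniformly permuted.

```python
import numpy as np

def noisy_permutation_channel(x, W, rng):
    """Pass codeword x (indices into the input alphabet) through the DMC W,
    then apply a uniformly random permutation drawn from S_n."""
    # DMC: each output symbol Z_i is drawn from row x_i of W.
    z = np.array([rng.choice(W.shape[1], p=W[xi]) for xi in x])
    # Random permutation block: Y_i = Z_{sigma(i)} for a uniform sigma.
    return z[rng.permutation(len(z))]

# Example: a BSC matrix with crossover probability 0.11.
W = np.array([[0.89, 0.11],
              [0.11, 0.89]])
rng = np.random.default_rng(0)
x = np.zeros(100, dtype=int)             # the all-zeros codeword, n = 100
y = noisy_permutation_channel(x, W, rng)
print(y.mean())                          # fraction of ones, close to 0.11
```

As the i.i.d. property above suggests, only the empirical distribution of $Y^n$ is informative; the permutation destroys all positional information.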
We say W is strictly positive if all the transition probabilities are greater than 0. We impose the following restrictions on the channel.
Assumption 1.
The channel W is a strictly positive and full-rank square matrix.
For a given code $C_n$, the average error probability is
$$P_e = \mathbb{P}[M \neq \hat{M}]. \quad (4)$$
The maximal code size achievable with a given blocklength and probability of error is denoted by $M^*(n, \epsilon) = \max\{ M \mid \exists\, C_n \text{ s.t. } P_e \leq \epsilon \}$. The code rate of the encoder–decoder pair $(f_n, g_n)$ is denoted by
$$R = \frac{\log M}{\log n}, \quad (5)$$
where $\log(\cdot)$ denotes the binary logarithm (base 2) throughout this paper. Note that the rate $R(n, \epsilon)$ for the noisy permutation channel is not the conventional $R(n) = \frac{1}{n}\log M$, since noisy permutation channels would have rate 0 under that definition. The capacity of noisy permutation channels is defined as $C = \sup\{ R \geq 0 : R \text{ is achievable} \}$.

3. Message Set and Divergence Packing

A divergence packing is a set of centers in the simplex $\Delta_{|\mathcal{Y}|-1}$ such that the minimum distance between any two centers, measured by KL divergence, is at least some radius. The following definitions formalize the corresponding packing number.
Definition 1.
The achievability space of marginal distributions is defined by $\Delta^*_{|\mathcal{Y}|-1} = \{ P_X W \mid P_X \in \Delta_{|\mathcal{X}|-1} \}$.
Definition 2.
Let $\{P_1, \ldots, P_M\} \subseteq \Delta^*_{|\mathcal{Y}|-1}$ be a set of divergence packing centers. The divergence packing number on $\Delta^*_{|\mathcal{Y}|-1}$ is defined by
$$N^*(r, |\mathcal{Y}|) = \max\left\{ M \,\middle|\, \exists\, \{P_1, \ldots, P_M\} \text{ s.t. } \min_{i \neq j} D(P_i\|P_j) \geq r \right\}, \quad (6)$$
where $r > 0$ is the packing radius.
Here, we provide some intuition for using divergence packing in noisy permutation channels. Under the ML decoder, the non-asymptotic channel performance is governed by the likelihood ratio of two distributions (the decoding metric). By the law of large numbers, the empirical mean of this decoding metric approaches the KL divergence as the blocklength grows. Thus, divergence packing yields an upper bound on the error probability by lower bounding the distance between distributions, and the code size can be analyzed asymptotically via the Berry–Esseen bound (see Section 5). These factors motivate us to use the KL divergence in constructing the message set.
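As a quick numerical sanity check of this intuition (our own illustration; the pair P, Q is arbitrary), the empirical mean of the decoding metric under $Y \sim P$ indeed approaches $D(P\|Q)$:

```python
import numpy as np

rng = np.random.default_rng(1)
P = np.array([0.6, 0.4])   # a hypothetical pair of marginal distributions
Q = np.array([0.5, 0.5])

# KL divergence in bits (log is base 2 throughout the paper).
D = float(np.sum(P * np.log2(P / Q)))

# Law of large numbers: the per-letter decoding metric averages to D(P||Q).
n = 100_000
y = rng.choice(2, size=n, p=P)
metric = np.log2(P[y] / Q[y])
print(D, metric.mean())    # the two values agree to about two decimal places
```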
Since the messages correspond to different distributions in noisy permutation channels, the message set can be equivalent to the set of marginal distributions at the receiver (e.g., see [1]). In the sequel, we denote the marginal distribution corresponding to message m by P m .
Additionally, in Gaussian approximations, we need the following definition.
Definition 3.
Let $\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})$ and $\mathrm{vol}(y, \Delta_{|\mathcal{Y}|-1})$ be, respectively, the volumes of the projections of $\Delta^*_{|\mathcal{Y}|-1}$ and $\Delta_{|\mathcal{Y}|-1}$ from $\mathbb{R}^{|\mathcal{Y}|}$ to the space $\mathbb{R}^{|\mathcal{Y}|-1}$ in which the y-th dimension is removed. The channel volume ratio is defined as
$$V(W) = \max_{y \in \mathcal{Y}} \frac{\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})}{\mathrm{vol}(y, \Delta_{|\mathcal{Y}|-1})}. \quad (7)$$
Next, we present several lower bounds on packing numbers. In the two-dimensional case, our construction achieves tighter bounds. For higher dimensions, our primary tool is the volume bound for ϵ -packing (e.g., see [8] Theorem 27.3). These results form the foundation for constructing marginal probability distributions in subsequent sections, while also playing a key role in the analysis of Gaussian approximations.

3.1. Binary Case

We first give the lower bound of the packing number in the binary case. Consider $\Delta_1^* = \{ (q, 1-q) \mid \delta_1 \leq q \leq 1-\delta_2 \}$, where $0 < \delta_1 < 1-\delta_2 < 1$. Define
$$\Gamma^*_{b,2} = \left\{ (q, 1-q) \,\middle|\, q = \xi\,\frac{a}{\lfloor 1/b \rfloor} + \delta_1,\ a \in \mathbb{Z}_{\geq 0} \right\}, \quad (8)$$
where $\xi = 1 - \delta_1 - \delta_2$ and $b > 0$.
Then, we have the following result proved in Appendix A.
Proposition 1.
Fix $\Delta_1^* = \{ (q, 1-q) \mid \delta_1 \leq q \leq 1-\delta_2 \}$, where $\delta_1 > 0$ and $\delta_2 > 0$. We can construct a set of packing centers by (8) with $b = \frac{1}{\xi}\sqrt{\frac{r}{2\log e}}$ and $\xi = 1 - \delta_1 - \delta_2$ such that
$$N^*(r, 2) \geq \lfloor 1/b \rfloor + 1. \quad (9)$$

3.2. General Case

Next, we introduce a general method for constructing the set of packing centers, together with bounds on its size. Let $b > 0$ and consider the following set:
$$\Gamma^*_{b,|\mathcal{Y}|} = \left\{ P \in \Delta^*_{|\mathcal{Y}|-1} \,\middle|\, P = \left( \frac{a_1}{\lfloor 1/b \rfloor}, \ldots, \frac{a_{|\mathcal{Y}|}}{\lfloor 1/b \rfloor} \right),\ a_1, \ldots, a_{|\mathcal{Y}|} \in \mathbb{Z}_{\geq 0} \right\}. \quad (10)$$
The intuition behind this construction is that the minimum distance between distributions in this uniform grid can be bounded via the total variation distance; applying Pinsker's inequality then yields a set of divergence packing centers with a prescribed radius.
We have the following lower bound proved in Appendix A.
Theorem 1.
Fix a W that satisfies Assumption 1 and generate $\Delta^*_{|\mathcal{Y}|-1}$. We can construct a set of packing centers by (10) with $b = \sqrt{\frac{r}{2\log e}}$ such that
$$N^*(r, |\mathcal{Y}|) \geq V(W) \left( \frac{\log e}{8r} \right)^{\frac{|\mathcal{Y}|-1}{2}}, \quad (11)$$
where $r > 0$ is the packing radius.
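For intuition about the general construction, the following sketch (our own illustration; the 3 × 3 matrix W and the radius r are arbitrary) enumerates the grid (10) and keeps the points that are achievable marginals, i.e., those P for which $P W^{-1}$ is a valid input distribution:

```python
import numpy as np
from itertools import product

W = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])    # strictly positive, full-rank, square
Winv = np.linalg.inv(W)

r = 0.01                              # packing radius in bits
b = np.sqrt(r / (2 * np.log2(np.e)))  # b as in Theorem 1
k = int(np.floor(1 / b))

centers = []
for a1, a2 in product(range(k + 1), repeat=2):
    a3 = k - a1 - a2
    if a3 < 0:
        continue
    P = np.array([a1, a2, a3]) / k    # a grid point of the simplex
    if np.all(P @ Winv >= -1e-12):    # P = P_X W for some P_X in the simplex
        centers.append(P)
print(len(centers))   # size of the constructed packing, cf. the bound (11)
```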

4. New Bounds on Rate

In this section, we introduce our new bound, which is based on divergence packing and is in the spirit of the RCU bound. The key ingredient is our analysis of the error events.
To that end, we introduce some definitions. Suppose we have a set of marginal distributions $\mathcal{M}$ constructed by (8) or (10) with any $b > 0$. Fixing a $P = (p_1, \ldots, p_{|\mathcal{Y}|}) \in \mathcal{M}$, we are often concerned with the divergence packing centers close to P. To this end, we consider $Q_{a,b}(P) = (q_1, \ldots, q_{|\mathcal{Y}|})$, where $a \neq b$, $q_a = p_a + K/\lfloor 1/b \rfloor$, $q_b = p_b - K/\lfloor 1/b \rfloor$, and $q_i = p_i$ for $i \in \mathcal{Y} \setminus \{a, b\}$. Here, K is a constant equal to ξ or 1 when $\mathcal{M}$ is constructed by (8) or (10), respectively. We define $R_P$, the neighboring set of P, as
$$R_P = \left\{ Q_{a,b}(P) \,\middle|\, a, b \in \mathcal{Y},\ a \neq b \right\} \cap \mathcal{M}. \quad (12)$$
In general, the distribution $Q_{a,b}(P)$ coincides with a distribution in the set $\mathcal{M}$, except near the boundaries of the simplex, where $Q_{a,b}(P)$ may violate the constraints of the probability space. We use the intersection operation in (12) to make sure all elements of $R_P$ remain within the simplex. For convenience, we use $j \in [|R_P|]$ to index $Q_j \in R_P$, and we say $Q_j \in R_P$ is a neighboring distribution of P. By counting, we have $|R_P| \leq 2\binom{|\mathcal{Y}|}{2}$.
For the marginal distribution $P_m$ corresponding to the transmitted message m, we use the log-likelihood ratio to define the following decoding metric:
$$d(m, j, y) := \log \frac{P_m(y)}{Q_j(y)}, \quad (13)$$
where $Q_j \in R_{P_m}$.
Then, the proof of our main result consists of three parts, each detailed in one of the following subsections. In the first subsection, we introduce a lemma. This lemma shows that the message sets constructed by (8) or (10) have an overlapping relationship for error events. In the second subsection, we use this lemma to give an equivalent expression for the error probability. The third subsection contains our main result. Additionally, we particularize this new bound to BSC and BEC permutation channels in the fourth and fifth subsections, respectively.

4.1. Overlapping of Error Events

Intuitively, the rate of decay of P e is dominated by the rate of decay of the probability of error in distinguishing neighboring messages. In order to use this intuition mathematically, we need to analyze the relationship between error events. The following lemma, proved in Appendix B, does this and can be used for analyzing random coding bounds.
Lemma 1.
Let $\mathcal{M}$ be constructed by (8) or (10) with any $b > 0$. Fix a $P \in \mathcal{M}$. Then, for every $\Lambda = (\lambda_1, \ldots, \lambda_{|\mathcal{Y}|}) \in \Delta_{|\mathcal{Y}|-1}$ and $Q = (q_1, \ldots, q_{|\mathcal{Y}|}) \in \mathcal{M} \setminus (R_P \cup \{P\})$, if
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} \leq \prod_{i=1}^{|\mathcal{Y}|} q_i^{\lambda_i}, \quad (14)$$
then there exists a $Q^* = (q_1^*, \ldots, q_{|\mathcal{Y}|}^*) \in R_P$ such that
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} \leq \prod_{i=1}^{|\mathcal{Y}|} (q_i^*)^{\lambda_i}. \quad (15)$$

4.2. Equivalent Expression

In this subsection, we give a lemma tailored to our purposes. It follows directly from Lemma 1.
Lemma 2.
For the set of marginal distributions $\mathcal{M}$ constructed by (8) or (10) with any $b > 0$, we have
$$\mathbb{P}\left[ \bigcup_{j=1,\, j \neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] = \mathbb{P}\left[ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right], \quad (16)$$
where the sequence $Y^n$ is drawn i.i.d. from $P_m$ and $Q_j \in R_{P_m}$.
Proof. 
Using Lemma 1, for $j \in [|\mathcal{M}|]$, if $P_j^n(y^n) \geq P_m^n(y^n)$ occurs, then there exists a $j' \in [|R_{P_m}|]$ such that $Q_{j'}^n(y^n) \geq P_m^n(y^n)$ occurs. Then, we observe
$$\mathbb{P}\left[ \bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] = \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \left\{ P_j^n(y^n) \geq P_m^n(y^n) \right\} \right\} \quad (17)$$
$$= \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(y^n) \geq P_m^n(y^n) \right\} \right\} \quad (18)$$
$$= \mathbb{P}\left[ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right], \quad (19)$$
where in (17) we sum over all possible outputs, and (18) relies on Lemma 1 by setting Λ to be the empirical distribution of $y^n$. This completes the proof of (16). □
Remark 1.
If the transmitted message is m, Lemma 2 shows that the union of error events $\bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \{P_j^n \geq P_m^n\}$ is equivalent to the smaller union $\bigcup_{j=1}^{|R_{P_m}|} \{Q_j^n \geq P_m^n\}$ over $R_{P_m}$, whose size depends only on the size of the output alphabet.

4.3. Main Result: New Lower Bound

The main result in this section is the following. Please refer to Appendix C for the proof.
Theorem 2.
Fix a W that satisfies Assumption 1 and generate $\Delta^*_{|\mathcal{Y}|-1}$. Let the set of marginal distributions $\mathcal{M}$ be constructed by (8) or (10) with any $b > 0$. Then, there exists a code $C_n$ (under average error probability) with achievable code size $|\mathcal{M}|$ such that
$$\epsilon \leq \min\left\{ 1,\ \frac{1}{|\mathcal{M}|} \sum_{m=1}^{|\mathcal{M}|} \sum_{j=1}^{|R_{P_m}|} \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, j, y_i) \leq 0 \right\} \right\}. \quad (20)$$
Remark 2.
Theorem 2 relies on the message set constructed by (8) or (10). We restrict the channel W to be a full-rank square matrix, which makes $\Delta^*_{|\mathcal{Y}|-1}$ a full-dimensional subset of $\Delta_{|\mathcal{Y}|-1}$. Therefore, the evenly spaced grid structure on $\Delta^*_{|\mathcal{Y}|-1}$ can be constructed by using (8) or (10). Without this condition, we cannot apply (10) unless we make strong assumptions about W.
Remark 3.
Theorem 2 upper bounds the probability of error by the sum of the probabilities of error events on $R_{P_m}$ instead of $\mathcal{M}$, which makes our bound much stronger than that of [1]. In fact, if we do not use Lemma 2 but instead apply the union bound and the second moment method for the TV distance ([21] Lemma 4.2(iii)) in the proof of Theorem 2, we recover the existing bound ([1] Equation (36)).

4.4. BSC Permutation Channels

In this subsection, we particularize the nonasymptotic bounds to the BSC, i.e., the DMC matrix is
$$W = \begin{pmatrix} 1-\delta & \delta \\ \delta & 1-\delta \end{pmatrix}, \quad (21)$$
denoted $\mathrm{BSC}_\delta$. According to Proposition 1 and Theorem 1, using (8) to construct the set of marginal distributions is better than using (10) in the binary case; we therefore focus on the former in this subsection. For convenience, we denote $P_m(\cdot) = (\delta_m, 1-\delta_m)$, with $\delta_i < \delta_j$ for $i < j$, $i, j \in [|\mathcal{M}|]$. Then, for a given $P_m$, we clearly have
$$R_{P_m} = \begin{cases} \{P_{m-1}, P_{m+1}\}, & 2 \leq m \leq |\mathcal{M}|-1, \\ \{P_2\}, & m = 1, \\ \{P_{|\mathcal{M}|-1}\}, & m = |\mathcal{M}|. \end{cases} \quad (22)$$
Let
$$f_1(n, T_i) = \begin{cases} \sum_{t=0}^{\lfloor T_i \rfloor} \binom{n}{t} \delta_i^t (1-\delta_i)^{n-t}, & i \geq 2, \\ 0, & i = 1, \end{cases} \quad (23)$$
and
$$f_2(n, T_i) = \begin{cases} \sum_{t=\lceil T_i \rceil}^{n} \binom{n}{t} \delta_i^t (1-\delta_i)^{n-t}, & i \leq |\mathcal{M}|-1, \\ 0, & i = |\mathcal{M}|. \end{cases} \quad (24)$$
The following bound is a straightforward generalization of Theorem 2.
Theorem 3
(Achievability). For the BSC permutation channel with crossover probability δ, there exists a code $C_n$ such that
$$\epsilon \leq \sum_{i=1}^{|\mathcal{M}|} \frac{1}{|\mathcal{M}|} \min\left\{ 1,\ f_1(n, \bar{T}_i) + f_2(n, \underline{T}_i) \right\}, \quad (25)$$
where
$$\bar{T}_i = \frac{n \log\frac{1-\delta_{i-1}}{1-\delta_i}}{\log\frac{\delta_i(1-\delta_{i-1})}{\delta_{i-1}(1-\delta_i)}} \quad (26)$$
and
$$\underline{T}_i = \frac{n \log\frac{1-\delta_{i+1}}{1-\delta_i}}{\log\frac{\delta_i(1-\delta_{i+1})}{\delta_{i+1}(1-\delta_i)}}. \quad (27)$$
The set of marginal distributions is constructed by (8), and for the radius r we have
$$|\mathcal{M}| = \lfloor 1/b \rfloor + 1, \quad (28)$$
with b as in Proposition 1.
Proof. 
Let us assume the transmitted message is $m \in \mathcal{M}$, corresponding to the marginal distribution $P_m$. For the BSC, we focus on the set (22). Using the same argument as in the proof of Lemma 1, the term corresponding to $d(m, m-1, y_i)$ in (20) can be computed as
$$\sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, m-1, y_i) \leq 0 \right\} = \sum_{t=0}^{\lfloor \bar{T}_m \rfloor} \binom{n}{t} \delta_m^t (1-\delta_m)^{n-t}, \quad (29)$$
where $\bar{T}_m$ follows from (A19). Similarly, the term corresponding to $d(m, m+1, y_i)$ in (20) can be computed as
$$\sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, m+1, y_i) \leq 0 \right\} = \sum_{t=\lceil \underline{T}_m \rceil}^{n} \binom{n}{t} \delta_m^t (1-\delta_m)^{n-t}, \quad (30)$$
where $\underline{T}_m$ follows from (A19). Substituting (29) and (30) into Theorem 2 completes the proof. □
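The bound of Theorem 3 reduces to binomial tail sums and is cheap to evaluate. The following sketch is our own implementation of (25)–(27); for simplicity it takes the $\delta_i$ evenly spaced on $[\delta, 1-\delta]$, as construction (8) yields:

```python
import numpy as np
from scipy.stats import binom

def bsc_achievability(n, delta, M):
    """Right side of (25) for BSC_delta with M evenly spaced marginals."""
    d = np.linspace(delta, 1 - delta, M)
    total = 0.0
    for i in range(M):
        p_err = 0.0
        if i > 0:      # left neighbour: type falls at or below (26)
            T = n * np.log((1 - d[i-1]) / (1 - d[i])) / np.log(
                d[i] * (1 - d[i-1]) / (d[i-1] * (1 - d[i])))
            p_err += binom.cdf(np.floor(T), n, d[i])
        if i < M - 1:  # right neighbour: type rises to or above (27)
            T = n * np.log((1 - d[i+1]) / (1 - d[i])) / np.log(
                d[i] * (1 - d[i+1]) / (d[i+1] * (1 - d[i])))
            p_err += binom.sf(np.ceil(T) - 1, n, d[i])
        total += min(1.0, p_err)
    return total / M

print(bsc_achievability(n=1000, delta=0.11, M=8))
```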

4.5. BEC Permutation Channels

The BEC permutation channel with erasure probability δ has input alphabet $\mathcal{X} = \{0, 1\}$ and output alphabet $\mathcal{Y} = \{0, e, 1\}$, where the conditional distribution is
$$W(z|x) = \begin{cases} 1-\delta, & z = x, \\ \delta, & z = e, \\ 0, & \text{otherwise}, \end{cases} \qquad z \in \mathcal{Y},\ x \in \mathcal{X}. \quad (31)$$
Moreover, we denote such a channel as BEC δ for convenience.
Next, we have the following achievability bound.
Proposition 2.
For BEC permutation channels with erasure probability 2δ, there exists a code $C_n$ such that the average probability of error and the code size satisfy (25) and (28), respectively.
Proof. 
The derivation follows ([1] Proposition 6); we include the details for the sake of completeness. We first note that the BSC matrix satisfies the Doeblin minorization condition (e.g., see [1] Definition 5) with the uniform distribution $(1/2, 1/2)$ and constant 2δ. Using ([1] Lemma 6), we find that $\mathrm{BSC}_\delta$ is a degraded version of $\mathrm{BEC}_{2\delta}$. Then, for encoder–decoder pairs $(f_n, g_n)$ for BSC permutation channels and $(f_n, \tilde{g}_n)$ for BEC permutation channels, the average probabilities of error satisfy
$$P_e(f_n, g_n, \mathrm{BSC}_\delta) = P_e(f_n, \tilde{g}_n, \mathrm{BEC}_{2\delta}). \quad (32)$$
Then, the argument of the proof of Theorem 3 is repeated. This completes the proof. □

5. Gaussian Approximation

We turn to the asymptotic analysis of the noisy permutation channel for a given blocklength and average probability of error.

5.1. Auxiliary Lemmata

To establish our Gaussian approximation, we present two lemmata. The first is an important tool in Gaussian approximation analysis:
Lemma 3
(Berry–Esseen, [22] Chapter XVI.5, Theorem 2). Fix a positive integer n. Let $Z_i$, $i \in \{1, \ldots, n\}$, be independent random variables. Then, for any real x and $C_0 = 6$, we have
$$\left| \mathbb{P}\left[ \sum_{i=1}^n Z_i < n\mu_n + x\sqrt{n V_n} \right] - \Phi(x) \right| \leq \frac{B_n}{\sqrt{n}}, \quad (33)$$
where
$$\mu_n = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[Z_i], \qquad V_n = \frac{1}{n}\sum_{i=1}^n \mathrm{Var}[Z_i], \quad (34)$$
$$T_n = \frac{1}{n}\sum_{i=1}^n \mathbb{E}\left[ |Z_i - \mu_i|^3 \right], \qquad B_n = \frac{C_0 T_n}{V_n^{3/2}}, \quad (35)$$
and $\mu_i = \mathbb{E}[Z_i]$.
To develop the Gaussian approximation, we consider the following definitions. The variance and the third absolute moment of the log-likelihood ratio between two distributions P and Q are defined as $V(P\|Q) = \mathbb{E}\left[ \left( \log\frac{P(Y)}{Q(Y)} - D(P\|Q) \right)^2 \right]$ and $T(P\|Q) = \mathbb{E}\left[ \left| \log\frac{P(Y)}{Q(Y)} - D(P\|Q) \right|^3 \right]$, respectively, where $Y \sim P$. The following lemma, proved in Appendix D, concerns the properties of $V(P\|Q)$ and $T(P\|Q)$.
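For finite alphabets these quantities are direct sums; a small sketch (our own, with an arbitrary pair P, Q) follows:

```python
import numpy as np

def llr_moments(P, Q):
    """D(P||Q), V(P||Q), and T(P||Q) in bits, per the definitions above."""
    llr = np.log2(P / Q)                          # per-symbol log-likelihood ratio
    D = float(np.sum(P * llr))                    # mean: the KL divergence
    V = float(np.sum(P * (llr - D) ** 2))         # variance under P
    T = float(np.sum(P * np.abs(llr - D) ** 3))   # third absolute central moment
    return D, V, T

P = np.array([0.55, 0.45])
Q = np.array([0.50, 0.50])
print(llr_moments(P, Q))
```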
Lemma 4.
Fix a W that satisfies Assumption 1 and generate $\Delta^*_{|\mathcal{Y}|-1}$. Let $\mathcal{M}$, constructed by (10) with any $b > 0$, be the set of packing centers on $\Delta^*_{|\mathcal{Y}|-1}$. If the packing radius satisfies $r_0 \leq \frac{2\log e}{9}$, then for any $P \in \mathcal{M}$ and $Q \in R_P$, we have
$$V(P\|Q) = r_0 F_0, \quad (36)$$
where
$$\left( \frac{5}{8 p_{\max}} - \frac{2}{9} \right)\log e \leq F_0 \leq \frac{5\log e}{2 p_{\min}(1-p_{\max})^2}, \quad (37)$$
and $p_{\min}$ and $p_{\max}$ are constants greater than 0. Additionally, we have
$$T(P\|Q) \leq \frac{36\sqrt{2}\,(\log e)^{3/2}}{p_{\min}^2 (1-p_{\max})^3}\, r_0^{3/2}. \quad (38)$$

5.2. Main Result: Gaussian Approximation

The main result in this section is the following. Please refer to Appendix E for the proof.
Theorem 4.
Fix a strictly positive and full-rank square matrix W for the noisy permutation channel. Then, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W) + \theta \quad (39)$$
is achievable for all $n \geq N_0$, where $\ell = \operatorname{rank}(W) - 1$, $G = 2\binom{\ell+1}{2}$, $V(W)$ is the channel volume ratio, and θ is a constant.
The achievable code size (39) differs from the Gaussian approximation of traditional channels (e.g., see [7]), since our bound is obtained via the divergence packing number $N^*(r, |\mathcal{Y}|)$. The packing radius is a key ingredient affecting the lower bound on $N^*(r, |\mathcal{Y}|)$, and it also affects the error probability.

5.3. Approximation of BSC and BEC Permutation Channels

We apply Theorem 4 to obtain the following approximation.
Corollary 1.
For BSC permutation channels with crossover probability δ, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \log \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \theta \quad (40)$$
is achievable for all $n \geq N_0$, where θ is a constant.
Proof. 
For the BSC, we have $\Delta_1^* = \{ (p, 1-p) \mid \delta \leq p \leq 1-\delta \}$. By using Lagrange's formula [23], we have $V(\mathrm{BSC}_\delta) = 1 - 2\delta$. Substituting this into Theorem 4 yields the result. □
Remark 4.
We remark that the Gaussian approximation reveals some properties of the code size for a given blocklength n and probability of error ϵ. In BSC permutation channels, while the channel capacity depends only on the rank of the channel matrix, the speed at which the achievable code size approaches the capacity is affected by the crossover probability δ.
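With the constant θ dropped, approximation (40) is a one-line formula. The sketch below (our own; the parameter values mirror the figures) evaluates it for a few blocklengths:

```python
import numpy as np
from scipy.stats import norm

def gaussian_approx_bsc(n, delta, eps):
    """Approximation (40) for BSC permutation channels, with theta omitted."""
    return np.log2((1 - 2 * delta) * np.sqrt(n) / -norm.ppf(eps / 2))

for n in (100, 1000, 10000):
    print(n, gaussian_approx_bsc(n, delta=0.11, eps=1e-3))
```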
The approximation of BSC permutation channels can also be derived from Proposition 1. To use the message set constructed by (8), we need the following lemma, which is proved in Appendix D:
Lemma 5.
Fix a W that satisfies Assumption 1 and generate $\Delta_1^*$. Let $\mathcal{M}$, constructed by (8) with any $b > 0$, be the set of packing centers on $\Delta_1^*$. Then, there exists a packing radius $r_1$ such that for all $r \leq r_1$, we have
$$F_0 r \leq V(P\|Q) \leq F_1 r \quad (41)$$
and
$$T(P\|Q) \leq F_2 r^{3/2}, \quad (42)$$
where $F_0$, $F_1$, and $F_2$ are positive and finite.
Then, we have the following result.
Proposition 3.
For BSC permutation channels with crossover probability δ, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \log \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \theta \quad (43)$$
is achievable for all $n \geq N_0$, where θ is a constant.
Proof. 
Instead of using (10), we use (8) with $\Delta_1^* = \{ (p, 1-p) \mid \delta \leq p \leq 1-\delta \}$, and we repeat the argument of the proof of Theorem 4 with Lemma 4 replaced by Lemma 5. Note that for $\mathcal{M}$ constructed by (8) with any $b > 0$, we have
$$b = \frac{1}{1-2\delta}\sqrt{\frac{r}{2\log e}}. \quad (44)$$
We use Proposition 1 to continue as follows:
$$\log M^*(n, \epsilon) \geq \log\left( \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + 1 \right) + \log F_0 \geq \log \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \log F_0, \quad (45)$$
where $F_0 > 0$ is a constant. This completes the proof. □
Next, for BEC permutation channels, we have the following approximation.
Proposition 4.
For BEC permutation channels with erasure probability η, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \log \frac{(1-\eta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \theta \quad (46)$$
is achievable for all $n \geq N_0$, where θ is a constant.
Proof. 
Through Theorem 2, repeat the argument of the proof of Corollary 1 and Theorem 3, replacing δ with η / 2 . □

6. Numerical Results

In this section, we perform numerical evaluations to illustrate our results. We first validate the precision of our Gaussian approximation across a wide range of parameters. Second, we present the performance of the bounds for a binary DNA storage system and compare them with the existing bound.

6.1. Precision of the Gaussian Approximation

Here, we give the numerical results. According to Proposition 2, the BEC permutation channel with erasure probability 2δ is equivalent to the BSC permutation channel with crossover probability δ, so we focus on the numerical results for BSC permutation channels. We use Theorem 3 to compute the non-asymptotic achievability bound, searching from M = 2 upward until the right side of (25) exceeds the target error probability ϵ. For the Gaussian approximation, we use (40) and (43) but omit the remainder term θ. As Figure 2, Figure 3, Figure 4 and Figure 5 show, although the remainder term of the Gaussian approximation is a constant, the approximation is still quite close to the non-asymptotic achievability bound. In fact, for all $n \geq 20$, the difference between (43) and Theorem 3 is within 1 bit in $\log M^*(n, \epsilon)$.
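The search just described takes only a few lines; this sketch (our own) reuses the bsc_achievability function from the Theorem 3 listing and assumes the right side of (25) is increasing in M and that M = 2 is feasible:

```python
def max_code_size(n, delta, eps):
    """Largest M whose Theorem 3 bound (25) stays at or below eps."""
    M = 2
    while bsc_achievability(n, delta, M + 1) <= eps:
        M += 1
    return M

print(max_code_size(n=1000, delta=0.11, eps=1e-3))
```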

6.2. Comparison with Existing Bound

Additionally, in the context of DNA storage systems, we consider codewords composed of nucleotides $\{A, C, G, T\}$. For simplicity, A and C are regarded as the symbol 0, and G and T are regarded as the symbol 1 in the code construction, giving a binary alphabet $\{0, 1\}$. The synthesis errors and the random permutation of DNA molecules are modeled as the BSC permutation channel with crossover probability δ = 0.25. To reduce the computational complexity, we use approximation (43). Furthermore, we present numerical results for the existing lower bound, namely Makur's achievability bound ([1] Equation (36)) for BSC permutation channels. The results show that our new achievability bound is uniformly better than Makur's bound. In the setup of Figure 6, our bound quickly approaches half of the capacity ($n \approx 1000$), whereas Makur's bound reaches 20% of the channel capacity only at about $n \approx 1.4 \times 10^5$, as shown in Figure 6. This is because we exploit the overlapping relationship of error events, which reduces the number of error events when applying the union bound.

7. Conclusions and Discussion

In summary, we established a new achievability bound for noisy permutation channels with a strictly positive and full-rank square matrix. The key element is that our analysis shows the number of error events in the union to be independent of the size of the message set. This allows us to derive a refined asymptotic analysis of the achievable rate. Numerical simulations show that our new achievability bound is stronger than Makur's achievability bound in [1]. Additionally, our approximation is quite accurate, even though the remainder term is a constant. The primary direction for future work is to generalize the DMC matrix in noisy permutation channels to non-full-rank and non-strictly-positive matrices. Other future work may improve the asymptotic expansion (e.g., sharpening the remainder term to $o(1)$).

Author Contributions

Conceptualization, L.F.; Methodology, L.F.; Software, L.F.; Validation, L.F.; Formal analysis, L.F.; Investigation, L.F.; Resources, L.F.; Data curation, L.F.; Writing—original draft, L.F.; Writing—review and editing, L.F., G.L. and X.L.; Visualization, L.F.; Supervision, G.L.; Project administration, G.L. and Y.J.; Funding acquisition, G.L., X.L. and Y.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Divergence Packing

Appendix A.1. Proof of Proposition 1

We note that $\Gamma^*_{b,2}$ constructed by (8) is a subset of $\Delta_1^*$. Fix $b = \frac{1}{\xi}\sqrt{\frac{r}{2\log e}}$. Then, for $P, Q \in \Gamma^*_{b,2}$ we have
$$\min_{P \neq Q} (2\log e)\, \mathrm{TV}^2(P, Q) \geq r. \quad (A1)$$
Using Pinsker's inequality ([8] Theorem 7.10), we obtain $\min_{P \neq Q} D(P\|Q) \geq r$. We then conclude the proof by realizing that $|\Gamma^*_{b,2}| = \lfloor 1/b \rfloor + 1$.

Appendix A.2. Proof of Theorem 1

To prove Theorem 1, we consider the set
$$\Gamma_{b,|\mathcal{Y}|} = \left\{ P \in \Delta_{|\mathcal{Y}|-1} \,\middle|\, P = \left( \frac{a_1}{\lfloor 1/b \rfloor}, \ldots, \frac{a_{|\mathcal{Y}|}}{\lfloor 1/b \rfloor} \right),\ a_1, \ldots, a_{|\mathcal{Y}|} \in \mathbb{Z}_{\geq 0} \right\}. \quad (A2)$$
We have the following lemma:
Lemma A1.
Let $\Gamma_{b,|\mathcal{Y}|}$ be constructed by (A2) with any $b > 0$. For $P, Q \in \Gamma_{b,|\mathcal{Y}|}$, we have
$$\min_{P \neq Q} \mathrm{TV}(P, Q) \geq b. \quad (A3)$$
Proof. 
For every $P \in \Gamma_{b,|\mathcal{Y}|}$, a closest $Q \in \Gamma_{b,|\mathcal{Y}|}$ is obtained by choosing $m, n \in \mathcal{Y}$ with $m \neq n$ and setting $q_m > p_m$ and $q_n < p_n$ at adjacent grid values, with $q_i = p_i$ for $i \in \mathcal{Y} \setminus \{m, n\}$. Then, we obtain $\min_{P \neq Q} \mathrm{TV}(P, Q) = \min_{P \neq Q} \frac{1}{2}\sum_{i \in \mathcal{Y}} |p_i - q_i| = \frac{1}{\lfloor 1/b \rfloor} \geq b$. □
Fix a radius parameter $b = \sqrt{\frac{r}{2\log e}} > 0$ and fix a $\Gamma^*_{b,|\mathcal{Y}|}$ constructed by (10). Note that we have $\Gamma^*_{b,|\mathcal{Y}|} \subseteq \Gamma_{b,|\mathcal{Y}|}$. For any $P, Q \in \Gamma^*_{b,|\mathcal{Y}|}$, we have
$$\min_{P \neq Q} D(P\|Q) \overset{(a)}{\geq} \min_{P \neq Q} (2\log e)\, \mathrm{TV}^2(P, Q) \overset{(b)}{\geq} r, \quad (A4)$$
where (a) follows from Pinsker's inequality and (b) follows from Lemma A1. On the other hand, the right inequality of (A4) shows that
$$\min_{P \neq Q} \sum_{y \in \mathcal{Y}} |P(y) - Q(y)| \geq \sqrt{\frac{2r}{\log e}}. \quad (A5)$$
Hence, a total variation packing constructed by $\Gamma^*_{b,|\mathcal{Y}|}$ can be regarded as an $\ell_1$-norm packing of $\Delta^*_{|\mathcal{Y}|-1}$ with radius $\sqrt{2r/\log e}$. Let $B_1^{|\mathcal{Y}|-1}$ be the $\ell_1$-norm unit ball. The volume bound ([8] Theorem 27.3) for $\ell_1$-norm packing provides the lower bound on $|\Gamma^*_{b,|\mathcal{Y}|}|$:
$$|\Gamma^*_{b,|\mathcal{Y}|}| \geq \left( \frac{\log e}{2r} \right)^{\frac{|\mathcal{Y}|-1}{2}} \frac{\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})}{\mathrm{vol}(B_1^{|\mathcal{Y}|-1})}. \quad (A6)$$
Here, $\mathrm{vol}(B_1^{|\mathcal{Y}|-1})$ is the volume of the $\ell_1$-norm unit ball, and $\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})$ is the volume of the projection of $\Delta^*_{|\mathcal{Y}|-1}$ from $\mathbb{R}^{|\mathcal{Y}|}$ to the space $\mathbb{R}^{|\mathcal{Y}|-1}$, consistent with the argument used in the proof of ([24] Proposition 2).
Note that the maximum volume ratio is V(W) by Definition 3. We continue the bounding as follows:
$$|\Gamma^*_{b,|\mathcal{Y}|}| \geq \left( \frac{\log e}{2r} \right)^{\frac{|\mathcal{Y}|-1}{2}} \times \frac{V(W)\, \mathrm{vol}(y, \Delta_{|\mathcal{Y}|-1})}{\mathrm{vol}(B_1^{|\mathcal{Y}|-1})} \quad (A7)$$
$$= V(W) \left( \frac{\log e}{8r} \right)^{\frac{|\mathcal{Y}|-1}{2}}, \quad (A8)$$
where
  • (A7) holds since the projection can remove any y-th dimension, $y \in \mathcal{Y}$; consequently, the lower bound is given by taking the maximum of the volume ratio over $y \in \mathcal{Y}$;
  • (A8) follows by using Lagrange's formula [23] and the volume formula to obtain $\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1}) = \frac{V(W)}{(|\mathcal{Y}|-1)!}$ and $\mathrm{vol}(B_1^{|\mathcal{Y}|-1}) = \frac{2^{|\mathcal{Y}|-1}}{(|\mathcal{Y}|-1)!}$, respectively.
Finally, we conclude the proof by realizing that the left inequality of (A4) implies that $\Gamma^*_{b,|\mathcal{Y}|}$ gives a set of divergence packing centers of radius r. This completes the proof of Theorem 1.

Appendix B. Proof of Lemma 1

We first consider $\mathcal{M}$ constructed by (10). Fix a $P = (p_1, \ldots, p_{|\mathcal{Y}|}) \in \mathcal{M}$ and generate $R_P$ corresponding to P. Fix $a, b \in \mathcal{Y}$ with $a \neq b$. Denote by $Q^*_{a,b} = (q_1^*, \ldots, q_{|\mathcal{Y}|}^*)$ the distribution with $q_a^* = p_a + \frac{1}{\lfloor 1/b \rfloor}$, $q_b^* = p_b - \frac{1}{\lfloor 1/b \rfloor}$, and $q_i^* = p_i$ for $i \in \mathcal{Y} \setminus \{a, b\}$. Let
$$\log \frac{p_a + 1/\lfloor 1/b \rfloor}{p_a} = G_{a,b} \log \frac{p_b}{p_b - 1/\lfloor 1/b \rfloor}, \quad (A9)$$
where $G_{a,b}$ is a constant. Define
$$B_{a,b} = \left\{ (\lambda_1, \ldots, \lambda_{|\mathcal{Y}|}) \,\middle|\, G_{a,b}\,\lambda_a < \lambda_b \right\}. \quad (A10)$$
If $Q^*_{a,b} \in R_P$, then for $\Lambda = (\lambda_1, \ldots, \lambda_{|\mathcal{Y}|}) \in B_{a,b}$ and $Q^*_{a,b} = (q_1^*, \ldots, q_{|\mathcal{Y}|}^*)$, we have
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} > \prod_{i=1}^{|\mathcal{Y}|} (q_i^*)^{\lambda_i}. \quad (A11)$$
For each $a, b \in \mathcal{Y}$ with $a \neq b$, we find $B_{a,b}$ and consider the intersection $B_P = \bigcap_{a, b \in \mathcal{Y},\, a \neq b,\, Q^*_{a,b} \in R_P} B_{a,b}$. Then, for $\Lambda \in B_P$, (A11) holds for every $Q^* \in R_P$.
Now we consider $Q \in \mathcal{M} \setminus (R_P \cup \{P\})$, which is farther from P in Euclidean distance than the elements of $R_P$. For each $i \in \mathcal{Y}$, the distance between $p_i$ and $q_i$ is $K_i/\lfloor 1/b \rfloor$, where $K_i \in \mathbb{Z}_{\geq 0}$.
Since Q is a probability distribution, we have $p_i - K_i/\lfloor 1/b \rfloor > 0$ whenever $q_i < p_i$, and $K_i \geq 0$. Using the inequality $(1+x)^{K} \geq 1 + Kx$, we obtain
$$K_i \log \frac{p_i}{p_i + 1/\lfloor 1/b \rfloor} \leq \log \frac{p_i}{p_i + K_i/\lfloor 1/b \rfloor} \quad (A12)$$
and
$$K_i \log \frac{p_i}{p_i - 1/\lfloor 1/b \rfloor} \leq \log \frac{p_i}{p_i - K_i/\lfloor 1/b \rfloor}. \quad (A13)$$
Let $\mathcal{Y}^* = \{i \in \mathcal{Y} \mid q_i \neq p_i\}$, $\mathcal{Y}_0 = \{i \in \mathcal{Y}^* \mid q_i < p_i\}$, and $\mathcal{Y}_1 = \{i \in \mathcal{Y}^* \mid q_i > p_i\}$. For $i \in \mathcal{Y}_0$, let $q_i' = p_i - 1/\lfloor 1/b \rfloor$; for $i \in \mathcal{Y}_1$, let $q_i' = p_i + 1/\lfloor 1/b \rfloor$. Then, we have
$$\sum_{i \in \mathcal{Y}^*} \lambda_i \log \frac{p_i}{q_i} \geq \sum_{i \in \mathcal{Y}^*} \lambda_i K_i \log \frac{p_i}{q_i'}. \quad (A14)$$
Due to the constraints of the probability space, we have
$$\sum_{i \in \mathcal{Y}_0} K_i = \sum_{i \in \mathcal{Y}_1} K_i. \quad (A15)$$
Recall that for $\Lambda \in B_P$, (A11) holds for every $Q^* \in R_P$. Then, by the definition of $R_P$, for $\Lambda \in B_P$ we have
$$\lambda_a \log \frac{p_a}{q_a'} + \lambda_b \log \frac{p_b}{q_b'} > 0, \quad (A16)$$
where $a \in \mathcal{Y}_0$ and $b \in \mathcal{Y}_1$. We combine (A14)–(A16) to obtain that if $\Lambda \in B_P$, then
$$\sum_{i \in \mathcal{Y}^*} \lambda_i \log \frac{p_i}{q_i} > 0, \quad (A17)$$
that is,
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} > \prod_{i=1}^{|\mathcal{Y}|} q_i^{\lambda_i}. \quad (A18)$$
For $Q \in \mathcal{M} \setminus (R_P \cup \{P\})$, define $B_Q$ as the set of Λ for which (A17) holds; we have shown $B_P \subseteq B_Q$. Denote by $A_Q = \Delta_{|\mathcal{Y}|-1} \setminus B_Q$ and $A_P = \Delta_{|\mathcal{Y}|-1} \setminus B_P$ the complements of $B_Q$ and $B_P$, respectively. Note that if (14) holds, then $\Lambda \in A_Q$. Since $A_Q \subseteq A_P$, we obtain that (15) holds for some $Q^* \in R_P$. This completes the proof in the case of $\mathcal{M}$ constructed by (10).
We then give another proof for the binary case, with $\mathcal{M}$ constructed by (8). For convenience, let $p_j < p_m$ for $j < m$, where $j, m \in [|\mathcal{M}|]$. Fix $P_{m-1} = (p_{m-1}, 1-p_{m-1})$ and $P_m = (p_m, 1-p_m)$. Let
$$f(p_j) = \log \frac{1-p_j}{1-p_m} \bigg/ \log \frac{p_m (1-p_j)}{p_j (1-p_m)}. \quad (A19)$$
For any $\lambda \in [0,1]$ and $j \in [m-2]$, the inequality
$$p_j^{\lambda} (1-p_j)^{1-\lambda} \geq p_m^{\lambda} (1-p_m)^{1-\lambda} \quad (A20)$$
holds exactly for $\lambda \in [0, f(p_j)]$. Note that $f(p_j)$ is a monotonically increasing function with respect to $p_j$. Then, we obtain that
$$p_{m-1}^{\lambda} (1-p_{m-1})^{1-\lambda} \geq p_m^{\lambda} (1-p_m)^{1-\lambda} \quad (A21)$$
holds whenever (A20) holds, since $[0, f(p_j)] \subseteq [0, f(p_{m-1})]$.
For $j \in \{m+1, m+2, \ldots, |\mathcal{M}|\}$, the same argument shows that (A20) holds for $\lambda \in [f(p_j), 1]$, and that
$$p_{m+1}^{\lambda} (1-p_{m+1})^{1-\lambda} \geq p_m^{\lambda} (1-p_m)^{1-\lambda} \quad (A22)$$
holds for $\lambda \in [f(p_{m+1}), 1]$. Then, (A20) implies (A22), since $[f(p_j), 1] \subseteq [f(p_{m+1}), 1]$. This completes the proof.

Appendix C. Proof of Theorem 2

Since the matrix is full-rank, the achievability space of marginal distributions is a $(|\mathcal{Y}|-1)$-dimensional probability space $\Delta^*_{|\mathcal{Y}|-1}$. We construct the set of marginal distributions $\mathcal{M}$ using divergence packing on $\Delta^*_{|\mathcal{Y}|-1}$; i.e., let $\mathcal{M}$ be constructed by (8) or (10) with any $b > 0$.
Assume the transmitted message is m, corresponding to the marginal distribution $P_m$. After the codeword $X^n$ passes through the DMC matrix W and the random permutation block $P_{Y^n|Z^n}$, we obtain $Y^n \overset{\text{i.i.d.}}{\sim} P_m$. The error event of the maximum likelihood decoder is
$$\bigcup_{j=1,\, j \neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\}. \quad (A23)$$
Then, for $Y^n$ drawn i.i.d. from $P_m$, the probability of error satisfies
$$\mathbb{P}[\text{error} \mid M = m] \leq \mathbb{P}\left[ \bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] \quad (A24)$$
$$= \mathbb{P}\left[ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] \quad (A25)$$
$$\leq \sum_{j=1}^{|R_{P_m}|} \mathbb{P}\left[ Q_j^n(Y^n) \geq P_m^n(Y^n) \right], \quad (A26)$$
where
  • (A24) follows because we regard the equality case $P_j^n(Y^n) = P_m^n(Y^n)$ as an error event, even though the ML decoder might return the correct message;
  • (A25) follows from Lemma 2;
  • (A26) follows from the union bound.
Let the message be uniform on $\mathcal{M}$. Then, averaging over all messages, we obtain
$$\epsilon \leq \frac{1}{|\mathcal{M}|} \sum_{m=1}^{|\mathcal{M}|} \sum_{j=1}^{|R_{P_m}|} \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, j, y_i) \leq 0 \right\}. \quad (A27)$$
This completes the proof.

Appendix D. Properties of V ( P Q ) and T ( P Q )

This appendix is concerned with the behavior of V ( P Q ) and T ( P Q ) . We first prove Lemma 4.

Appendix D.1. Proof of Lemma 4

Since the DMC matrix is strictly positive, each entry of a marginal distribution P is uniformly bounded away from zero; i.e., there exists a $p_{\min} \in (0,1)$ such that $P(y) \geq p_{\min}$ for all $y \in \mathcal{Y}$. Similarly, there exists a $p_{\max} \in (0,1)$ such that $P(y) \leq p_{\max}$ for all $y \in \mathcal{Y}$. Clearly, we have $p_{\min} = \min_{y \in \mathcal{Y}} \{P(y) \mid P \in \mathcal{M}\}$ and $p_{\max} = \max_{y \in \mathcal{Y}} \{P(y) \mid P \in \Delta^*_{|\mathcal{Y}|-1}\}$. Without loss of generality, we assume $|\mathcal{M}| \geq 2$; that is, we have $r_0 \leq \frac{2\log e}{9}$.
We know that
$$P(y) = \frac{a}{\lfloor 1/b \rfloor} = \frac{a\sqrt{\frac{r_0}{2\log e}}}{1 - \delta'\sqrt{\frac{r_0}{2\log e}}}, \quad (A28)$$
where $\delta' = 1/b - \lfloor 1/b \rfloor \in [0,1)$ and $a \in \mathbb{Z}_{\geq 1}$.
We consider each entry P(y) of P separately. If $P(y) = Q(y)$, the corresponding term is obviously zero. If $P(y) > Q(y)$, we have $Q(y) = \frac{(a-1)\sqrt{\frac{r_0}{2\log e}}}{1 - \delta'\sqrt{\frac{r_0}{2\log e}}}$; here, since the distributions in $\mathcal{M}$ have no zero entries, $a \in \mathbb{Z}_{\geq 2}$. Since $a = P(y)\left( \sqrt{\frac{2\log e}{r_0}} - \delta' \right)$, we have
$$\frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right)} \leq \log \frac{P(y)}{Q(y)} \leq \frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right) - \sqrt{r_0/\log e}}. \quad (A29)$$
Consequently, we have
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \geq \frac{r_0\log e}{2 P(y)} \quad (A30)$$
and
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \leq \frac{r_0\log e}{2 (a-1)^2 P(y)/a^2} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2 \leq \frac{2 r_0\log e}{P(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2. \quad (A31)$$
If $P(y) < Q(y)$, we consider $Q(y) = \frac{(a+1)\sqrt{\frac{r_0}{2\log e}}}{1-\delta'\sqrt{\frac{r_0}{2\log e}}}$, where $a \in \mathbb{Z}_{\geq 1}$. Applying the same argument, we obtain
$$\frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right) + \sqrt{r_0/\log e}} \leq \left| \log \frac{P(y)}{Q(y)} \right| \leq \frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right)}. \quad (A32)$$
Consequently, we have
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \geq \frac{r_0\log e}{2 (a+1)^2 P(y)/a^2} \geq \frac{r_0\log e}{8 P(y)} \quad (A33)$$
and
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \leq \frac{r_0\log e}{2 P(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2. \quad (A34)$$
Then, we see that
$$V(P\|Q) = -D(P\|Q)^2 + \sum_{y} P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \quad (A35)$$
$$\geq r_0\log e\left( \frac{5}{8 p_{\max}} - \frac{r_0}{\log e} \right), \quad (A36)$$
where
  • (A35) simply expands the variance;
  • (A36) holds since the definition of $R_P$ ensures that only two terms of the sum are nonzero, and these can be bounded by (A30) and (A33), respectively.
Finally, we complete the proof of the lower bound by noting that $r_0 \leq \frac{2\log e}{9}$.
Note that
$$\left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2 \leq \frac{1}{(1-p_{\max})^2}. \quad (A37)$$
Using (A31) and (A34), we obtain the upper bound
$$V(P\|Q) \leq \frac{5\log e}{2 p_{\min} (1-p_{\max})^2}\, r_0, \quad (A38)$$
which yields (36).
We now turn to bounding $T(P\|Q)$. For $P(y) > Q(y)$, we have
$$P(y)\left| \log \frac{P(y)}{Q(y)} \right|^3 \leq 4\sqrt{2}\,\frac{(r_0\log e)^{3/2}}{P^2(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^3. \quad (A39)$$
For $P(y) < Q(y)$, we have
$$P(y)\left| \log \frac{P(y)}{Q(y)} \right|^3 \leq \frac{1}{2\sqrt{2}}\,\frac{(r_0\log e)^{3/2}}{P^2(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^3. \quad (A40)$$
Note that $D(P\|Q)$ can be bounded as follows:
$$D(P\|Q) \leq \frac{3}{2}\sqrt{2 r_0\log e}\;\frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}}. \quad (A41)$$
Using the inequality $|a-b|^3 \leq 4\left( |a|^3 + |b|^3 \right)$, we obtain
$$T(P\|Q) \leq \frac{36\sqrt{2}\,(\log e)^{3/2}}{p_{\min}^2 (1-p_{\max})^3}\, r_0^{3/2}, \quad (A42)$$
which establishes (38).

Appendix D.2. Proof of Lemma 5

We now prove Lemma 5 for $\mathcal{M}$ constructed by (8). We first note that
$$P(y) = \frac{\xi a}{\lfloor 1/b \rfloor} + \delta = \left( a - \frac{\delta'\delta}{\xi} + F_0 \right) \frac{\sqrt{\frac{r_0}{2\log e}}}{1 - \frac{\delta'}{\xi}\sqrt{\frac{r_0}{2\log e}}}, \quad (A43)$$
where $\delta' \in [0,1)$, $F_0 = \delta\sqrt{\frac{2\log e}{r_0}}$, and $a \in \mathbb{Z}_{\geq 1}$. Note that $F_0 - \frac{\delta'\delta}{\xi} \geq 0$ holds for small $r_0$. Repeating the proof of Lemma 4 with a replaced by $a - \frac{\delta'\delta}{\xi} + F_0$ and $\delta'$ replaced by $\frac{\delta'}{\xi}$, we obtain that there exists a packing radius $r_1$ such that for all $r_0 \leq r_1$, we have
$$F_0 r_0 \leq V(P\|Q) \leq F_1 r_0 \quad (A44)$$
and
$$T(P\|Q) \leq F_2 r_0^{3/2}, \quad (A45)$$
where $F_0$, $F_1$, and $F_2$ are positive and finite.

Appendix E. Proof of Theorem 4

Let $\mathcal{M}_r$, constructed by (10) with $b = \sqrt{\frac{r}{2\log e}}$, be the set of packing centers on $\Delta^*_{|\mathcal{Y}|-1}$, where $r > 0$ is the packing radius, to be specified later. For the transmitted message m, passing the codeword through the DMC matrix W and the random permutation block $P_{Y^n|Z^n}$ induces the marginal distribution $P_m$. Note that the decoding metric is a sum of independent, identically distributed variables:
$$\sum_{k=1}^n d(m, j, Y_k) = \sum_{k=1}^n \log \frac{P_m(Y_k)}{Q_j(Y_k)}. \quad (A46)$$
Each summand has mean $D(P_m\|Q_j)$, variance $V(P_m\|Q_j)$, and third absolute central moment $T(P_m\|Q_j)$. Denote
$$B_m = \max_{j \in [|R_{P_m}|]} \frac{6\, T(P_m\|Q_j)}{V(P_m\|Q_j)^{3/2}} \quad (A47)$$
and
$$P_e[m] = \sum_{j=1}^{|R_{P_m}|} \mathbb{P}\left[ \sum_{k=1}^n d(m, j, Y_k) \leq 0 \right]. \quad (A48)$$
According to Theorem 2, there exists a code with average error probability ϵ such that
$$\epsilon \leq \frac{1}{|\mathcal{M}_r|} \sum_{m=1}^{|\mathcal{M}_r|} P_e[m]. \quad (A49)$$
Denote $m^* = \operatorname{arg\,max}_{m \in \mathcal{M}_r} P_e[m]$. We continue (A49) as follows:
$$\epsilon \leq P_e[m^*] \quad (A50)$$
$$\leq |R_{P_{m^*}}| \frac{B_{m^*}}{\sqrt{n}} + \max_{j \in [|R_{P_{m^*}}|]} |R_{P_{m^*}}|\, \Phi\left( -\frac{\sqrt{n}\, D(P_{m^*}\|Q_j)}{\sqrt{V(P_{m^*}\|Q_j)}} \right) \quad (A51)$$
$$\leq |R_{P_{m^*}}| \frac{B_{m^*}}{\sqrt{n}} + |R_{P_{m^*}}|\, \Phi\left( -\sqrt{\frac{n r_0}{F_0}} \right), \quad (A52)$$
where
  • (A51) follows from Lemma 3;
  • (A52) holds for a suitable constant $F_0 > 0$ by Lemma 4.
Equating the RHS of (A52) to ϵ, and noting that $\epsilon - |R_{P_{m^*}}| B_{m^*}/\sqrt{n} > 0$ for n sufficiently large, we solve for
$$r_0 = F_0 \left( \Phi^{-1}\left( \frac{\epsilon}{|R_{P_{m^*}}|} - \frac{F_1}{\sqrt{n}} \right) \right)^2 \frac{1}{n}, \quad (A53)$$
where $|R_{P_{m^*}}| \leq 2\binom{|\mathcal{Y}|}{2}$ by its definition, and $F_0 > 0$ and $F_1 > 0$ are suitable constants. This can be done since, for suitable constants $F_2 > 0$ and $F_3 > 0$, we have $T(P_m\|Q_j) \leq F_2 r_0^{3/2}$ and $V(P_m\|Q_j) \geq F_3 r_0$ by Lemma 4; consequently, for large n, $B_m$ can be upper bounded by a suitable constant $F_1$.
The above arguments indicate that there exists a code with average error probability ϵ whose code size $|\mathcal{M}_r|$ is achievable. Let $\ell = |\mathcal{Y}| - 1$ and let $G = 2\binom{|\mathcal{Y}|}{2}$. For n sufficiently large, we obtain that for a suitable $F_5 > 0$,
$$\log |\mathcal{M}_r| = \log N^*(r, |\mathcal{Y}|) \quad (A54)$$
$$\geq \log\left( V(W) \left( \frac{\log e}{8} \right)^{\ell/2} \left( \frac{\sqrt{n/F_4}}{-\Phi^{-1}(\epsilon/G)} \right)^{\ell} \right) \quad (A55)$$
$$\geq \ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W) + \log F_5, \quad (A56)$$
where
  • (A54) follows because $\mathcal{M}_r$ constructed by (10) is a set of packing centers with packing number $N^*(r, |\mathcal{Y}|)$;
  • (A55) holds for a suitable $F_4 > 0$ by Theorem 1 and Taylor's formula for $\Phi^{-1}(\cdot)$.
This completes the proof.

References

  1. Makur, A. Coding Theorems for Noisy Permutation Channels. IEEE Trans. Inf. Theory 2020, 66, 6723–6748. [Google Scholar] [CrossRef]
  2. Makur, A. Bounds on Permutation Channel Capacity. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; pp. 762–766. [Google Scholar]
  3. Tang, J.; Polyanskiy, Y. Capacity of Noisy Permutation Channels. IEEE Trans. Inf. Theory 2023, 69, 4145–4162. [Google Scholar] [CrossRef]
  4. Feng, L.; Wang, B.; Lv, G.; Li, X.; Wang, L.; Jin, Y. New Upper Bounds for Noisy Permutation Channels. IEEE Trans. Commun. 2025, 73, 7478–7492. [Google Scholar] [CrossRef]
  5. Strassen, V. Asymptotic Estimates in Shannon’s Information Theory. In Proceedings of the Transactions of the Third Prague Conference on Information Theory, Prague, Czech Republic, 5–13 June 1962; pp. 689–723. [Google Scholar]
  6. Hayashi, M. Information Spectrum Approach to Second-Order Coding Rate in Channel Coding. IEEE Trans. Inf. Theory 2009, 55, 4947–4966. [Google Scholar] [CrossRef]
  7. Polyanskiy, Y.; Poor, H.V.; Verdu, S. Channel Coding Rate in the Finite Blocklength Regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359. [Google Scholar] [CrossRef]
  8. Polyanskiy, Y.; Wu, Y. Information Theory: From Coding to Learning; Cambridge University Press: Cambridge, UK, 2023. [Google Scholar]
  9. Polyanskiy, Y. Channel Coding: Non-Asymptotic Fundamental Limits. Ph.D. Thesis, Princeton University, Princeton, NJ, USA, 2010. [Google Scholar]
  10. Kolmogorov, A.N. Selected Works of A. N. Kolmogorov. Mathematics and Its Applications; Springer: Dordrecht, The Netherlands, 1993. [Google Scholar]
  11. Yang, Y.; Barron, A. Information-theoretic determination of minimax rates of convergence. Ann. Stat. 1999, 27, 1564–1599. [Google Scholar]
  12. Walsh, J.M.; Weber, S.; Maina, C.w. Optimal rate delay tradeoffs for multipath routed and network coded networks. In Proceedings of the 2008 IEEE International Symposium on Information Theory (ISIT), Toronto, ON, Canada, 7–12 June 2008. [Google Scholar]
  13. Walsh, J.M.; Weber, S.; Maina, C.w. Optimal Rate–Delay Tradeoffs and Delay Mitigating Codes for Multipath Routed and Network Coded Networks. IEEE Trans. Inf. Theory 2009, 55, 5491–5510. [Google Scholar]
  14. Yazdi, S.M.H.T.; Kiah, H.M.; Garcia-Ruiz, E.; Ma, J.; Zhao, H.; Milenkovic, O. DNA-Based Storage: Trends and Methods. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2015, 1, 230–248. [Google Scholar] [CrossRef]
  15. Erlich, Y.; Zielinski, D. DNA Fountain enables a robust and efficient storage architecture. Science 2017, 355, 950–954. [Google Scholar] [CrossRef] [PubMed]
  16. Heckel, R.; Shomorony, I.; Ramchandran, K.; Tse, D.N.C. Fundamental limits of DNA storage systems. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 3130–3134. [Google Scholar]
  17. Kovačević, M.; Tan, V.Y. Codes in the Space of Multisets—Coding for Permutation Channels with Impairments. IEEE Trans. Inf. Theory 2018, 64, 5156–5169. [Google Scholar] [CrossRef]
  18. Laver, T.; Harrison, J.; Moore, K.; Farbos, A.; Paszkiewicz, K.; Studholme, D. Assessing the performance of the oxford nanopore technologies minion. J. Mol. Biol. 2015, 3, 1–8. [Google Scholar] [CrossRef] [PubMed]
  19. Kovačević, M.; Tan, V.Y. Asymptotically optimal codes correcting fixed-length duplication errors in DNA storage systems. IEEE Commun. Lett. 2018, 22, 2194–2197. [Google Scholar] [CrossRef]
  20. Kiah, M.H.; Puleo, G.; Milenkovic, O. Codes for DNA sequence profiles. IEEE Trans. Inf. Theory 2016, 62, 3125–3146. [Google Scholar] [CrossRef]
  21. Evans, W.; Kenyon, C.; Peres, Y.; Schulman, L.J. Broadcasting on trees and the Ising model. Ann. Appl. Probab. 2000, 10, 410–433. [Google Scholar] [CrossRef]
  22. Feller, W. An Introduction to Probability Theory and Its Applications; Wiley: New York, NY, USA, 1971. [Google Scholar]
  23. Stein, P. A Note on the Volume of a Simplex. Am. Math. Mon. 1966, 73, 299. [Google Scholar] [CrossRef]
  24. Tang, J. Divergence Covering. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2021. [Google Scholar]
Figure 1. Illustration of a communication model with a DMC followed by a random permutation.
Figure 2. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.11$ and average block error rate $\epsilon = 10^{-3}$.
Figure 3. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.11$ and average block error rate $\epsilon = 10^{-6}$.
Figure 4. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.22$ and average block error rate $\epsilon = 10^{-3}$.
Figure 5. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.22$ and average block error rate $\epsilon = 10^{-6}$.
Figure 6. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.25$ and average block error rate $\epsilon = 10^{-3}$: example of a DNA storage system.

