Article

A New Lower Bound for Noisy Permutation Channels via Divergence Packing †

Lugaoze Feng, Guocheng Lv, Xunan Li and Ye Jin

1 State Key Laboratory of Photonics and Communications, Peking University, Beijing 100871, China
2 National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
* Author to whom correspondence should be addressed.
† This work was presented in part at the 2025 IEEE International Symposium on Information Theory (ISIT): Lugaoze, F.; Xunan, L.; Guocheng, L.; Ye, J. New Channel Coding Lower Bounds for Noisy Permutation Channels. In Proceedings of the 2025 IEEE International Symposium on Information Theory (ISIT), Ann Arbor, MI, USA, 22–27 June 2025.
Entropy 2025, 27(11), 1101; https://doi.org/10.3390/e27111101
Submission received: 4 September 2025 / Revised: 12 October 2025 / Accepted: 23 October 2025 / Published: 25 October 2025
(This article belongs to the Special Issue Next-Generation Channel Coding: Theory and Applications)

Abstract

Noisy permutation channels are used to model biological storage systems and communication networks. For noisy permutation channels whose matrices are strictly positive, full-rank, and square, new achievability bounds are given in this paper that are tighter than existing bounds. To derive this bound, we use $\epsilon$-packing with the Kullback–Leibler divergence as a distance and introduce a novel way to characterize the overlapping relationship of error events. The new bound shows analytically that for such a matrix W, the logarithm of the achievable code size with a given blocklength n and error probability $\epsilon$ is closely approximated by $\ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W)$, where $\ell = \operatorname{rank}(W) - 1$, $G = 2\binom{\ell+1}{2}$, and $V(W)$ is a characteristic of the channel referred to as the channel volume ratio. Our numerical results show that the new achievability bound significantly improves the lower bound of channel coding. Additionally, the Gaussian approximation can replace the complex computation of the new achievability bound over a wide range of relevant parameters.

1. Introduction

The noisy permutation channel, consisting of a discrete memoryless channel (DMC) followed by a uniform random permutation block, was introduced in [1]; it is a point-to-point communication model that captures the out-of-order arrival of packets. Such channels can be used as models of communication networks and DNA storage systems, where the ordering of the codeword does not carry any information. Several advances have been made on asymptotic bounds, including binary channels [2], the capacity of full-rank DMCs [1], and converse bounds based on divergence covering [3,4].
The code lengths of practical codes in communication systems are on the order of hundreds or thousands, invalidating the asymptotic assumptions of classical information theory. We initiate the study of new channel coding bounds that extend the information-theoretic results for noisy permutation channels to finite blocklength analysis. Finite blocklength analysis and finer asymptotics are important branches of research in information theory, and interest in this topic has been growing since the seminal works [5,6,7]. These works show that the channel coding rate in the finite blocklength regime is closely related to the information density [8], i.e., a stochastic measure determined by the input distribution and the channel noise. The second-order approximation of conventional channels involves the variance of the information density, which has been shown to approximate the channel coding rate well at short blocklengths.
Because the codeword positions are randomly permuted in noisy permutation channels, conventional analysis techniques, specifically the dependence-testing (DT) bound and the random coding union (RCU) bound ([9] Theorems 17 and 18), become inapplicable. Since the messages are mapped to different probability distributions in noisy permutation channels [1], the only statistical information the receiver can use from the received codeword $Y^n$ is which marginal distribution $Y^n$ belongs to. Therefore, the finer asymptotics differ completely from those of conventional channels.
The main contributions of our work are the following:
  • We present a new nonasymptotic achievability bound for noisy permutation channels whose matrices W are strictly positive, square, and full-rank. The two main ingredients of our proof are the following: the $\epsilon$-packing [10,11] with the Kullback–Leibler (KL) divergence as a distance, and an analysis of the error events that decouples the union of error events from the message set. Additionally, this new bound is stronger than the existing bound ([1] Equation (36)).
  • We show that the finite blocklength achievable code size can be approximated by
$$\log M^*(n, \epsilon) \approx \ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W), \quad (1)$$
    where $\ell = \operatorname{rank}(W) - 1$, $G = 2\binom{\ell+1}{2}$, and $V(W)$ is the channel volume ratio.
  • To complement these results and assist in understanding them, we particularize all these results to typical DMCs, i.e., BSC and BEC permutation channels. Additionally, our Gaussian approximations, through numerical results, lead to tight approximations of the achievable code size for blocklengths n as short as 100 in these cases.
We continue this section with the motivation and application. Section 2 sets up the system model. In Section 3, we provide methods to construct a set of divergence packing centers (message set) and bounds for packing numbers. In Section 4, we present our new achievability bound and particularize this bound to the typical DMCs. Section 5 studies the asymptotic behavior of the achievability bound using Gaussian approximation analysis and applies it to the typical DMCs. In Section 6, we present numerical results. We conclude this paper in Section 7.

1.1. Motivation and Application

The noisy permutation channel models the scenario where codewords undergo reordering, which occurs in communication networks and DNA storage systems. We briefly outline some applications of this channel.
(a)
Communication Networks: First, noisy permutation channels are a suitable model for multipath routed networks in which packets arrive with different delays [12,13]. In such networks, data packets within the same group often take paths of differing lengths, bandwidths, and congestion levels as they traverse the network to the receiver. Consequently, transmission delays exhibit unpredictable variations, causing these packets to arrive at their destination in a potentially different order from their original sending sequence. Moreover, during transmission, data packets may be lost or corrupted due to reasons such as link failures or buffer overflow. Treating the set of all possible packets as the input alphabet, one can model this scenario as a noisy permutation channel.
(b)
DNA Storage Systems: DNA storage systems, known for their high density and long-term reliability, are another motivation for our research [14,15,16]. Such a system can be seen as an out-of-order communication channel [1,14,17]. The source data is written onto DNA molecules (or codeword strings) consisting of letters from an alphabet of four nucleotides $\{A, C, G, T\}$. Because physical conditions cause random fragmentation of DNA molecules, long-read sequencing technology, such as nanopore sequencing [18], is employed at the receiver to read entire, randomly permuted DNA molecules. In the noisy permutation channel, the DMC matrix models potential errors during the synthesis and storage of DNA molecules, followed by a random permutation block that represents the random permutation of DNA molecules. For a comprehensive overview of DNA storage systems, see [1,14]; studies presenting specific DNA-based storage coding schemes include [17,19,20].

1.2. Notation

We use $[n] = \{1, \ldots, n\}$ and $\mathbb{Z}_{\geq a} = \{a, a+1, a+2, \ldots\}$ to represent integer intervals. Let $\mathbb{1}\{\cdot\}$ denote the indicator function. For a given alphabet $\mathcal{X}$ and a random variable $X \in \mathcal{X}$, we write $X \sim P_X$ to indicate that the random variable X follows the distribution $P_X$. Let $X^n = (X_1, \ldots, X_n)$ and $x^n = (x_1, \ldots, x_n)$ denote a random vector and its realization in the n-fold Cartesian product $\mathcal{X}^n$, respectively. The $(|\mathcal{X}|-1)$-dimensional simplex on $\mathbb{R}^{|\mathcal{X}|}$ is the set of points
$$\Delta_{|\mathcal{X}|-1} = \left\{ (p_1, p_2, \ldots, p_{|\mathcal{X}|}) \in \mathbb{R}^{|\mathcal{X}|} \,\middle|\, \sum_{x=1}^{|\mathcal{X}|} p_x = 1,\ p_x \geq 0 \right\}. \quad (2)$$
The KL divergence and the total variation distance are denoted by $D(\cdot\|\cdot)$ and $\mathrm{TV}(\cdot,\cdot)$, respectively. For a matrix A, we use $\operatorname{rank}(A)$ to denote the rank of A. Probability and mathematical expectation are denoted by $\mathbb{P}[\cdot]$ and $\mathbb{E}[\cdot]$, respectively. The cumulative distribution function of the standard normal distribution is denoted by $\Phi(\cdot)$, and $\Phi^{-1}(\cdot)$ is its inverse.

2. System Model

The code $C_n$ consists of a message set $\mathcal{M}$, a (possibly randomized) encoder $f_n : \mathcal{M} \to \mathcal{X}^n$, and a (possibly randomized) decoder $g_n : \mathcal{Y}^n \to \mathcal{M} \cup \{e\}$, where 'e' indicates that the decoder declares an error. We write $\mathcal{X}$ for the finite input alphabet and $\mathcal{Y}$ for the finite output alphabet, and $|\mathcal{M}|$ denotes the code size of $C_n$.
The input alphabet $\mathcal{X}$ abstracts the transmitted codeword symbols in various applications. For instance, in DNA storage applications, $\mathcal{X}$ denotes the alphabet of four nucleotides, while the length-n codeword $X^n \in \mathcal{X}^n$ represents the DNA molecule formed by the corresponding n nucleotides. The sender uses the encoder $f_n$ to encode the message M into a codeword $X^n$, which is then passed through the DMC W to produce $Z^n \in \mathcal{Y}^n$. The DMC is defined by an $|\mathcal{X}| \times |\mathcal{Y}|$ matrix W, where $W(z|x)$ denotes the probability that the output $z \in \mathcal{Y}$ occurs given the input $x \in \mathcal{X}$. Finally, $Z^n$ passes through a random permutation block $P_{Y^n|Z^n}$ to generate $Y^n \in \mathcal{Y}^n$. The random permutation block operates as follows: a permutation $\sigma : \{1, \ldots, n\} \to \{1, \ldots, n\}$ is drawn uniformly at random from the symmetric group $S_n$ over $\{1, \ldots, n\}$, and $Y^n$ is generated by permuting $Z^n$ according to $Y_i = Z_{\sigma(i)}$ for all $i \in \{1, \ldots, n\}$. The receiver uses the decoder $g_n$ to produce the estimate $\hat{M}$ of the message. We can describe these steps by the following Markov chain:
$$M \to X^n \to Z^n \to Y^n \to \hat{M}. \quad (3)$$
The channel model of the noisy permutation channel is illustrated in Figure 1. For codewords drawn i.i.d. from $P_X$, the random permutation block does not change the probability distribution of the sequence [1]: if $X^n \overset{\text{i.i.d.}}{\sim} P_X$, then $Z^n \overset{\text{i.i.d.}}{\sim} P_X W$ and $Y^n \overset{\text{i.i.d.}}{\sim} P_X W$.
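To make the channel model concrete, here is a minimal simulation sketch (our own illustration, not code from the paper; the seed, blocklength, and example matrix are arbitrary choices): each codeword symbol passes through the DMC W, and the resulting sequence is then uniformly permuted.

```python
import numpy as np

def noisy_permutation_channel(x, W, rng):
    """Pass codeword x (indices into the input alphabet) through the DMC W,
    then apply a uniformly random permutation drawn from S_n."""
    # DMC: each output symbol Z_i is drawn from row x_i of W.
    z = np.array([rng.choice(W.shape[1], p=W[xi]) for xi in x])
    # Random permutation block: Y_i = Z_{sigma(i)} for a uniform sigma.
    return z[rng.permutation(len(z))]

# Example: a BSC matrix with crossover probability 0.11.
W = np.array([[0.89, 0.11],
              [0.11, 0.89]])
rng = np.random.default_rng(0)
x = np.zeros(100, dtype=int)             # the all-zeros codeword, n = 100
y = noisy_permutation_channel(x, W, rng)
print(y.mean())                          # fraction of ones, close to 0.11
```

As the i.i.d. property above suggests, only the empirical distribution of $Y^n$ is informative; the permutation destroys all positional information.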
We say W is strictly positive if all the transition probabilities are greater than 0. We impose the following restrictions on the channel.
Assumption 1.
The channel W is a strictly positive and full-rank square matrix.
For a given code $C_n$, the average error probability is
$$P_e = \mathbb{P}[M \neq \hat{M}]. \quad (4)$$
The maximal code size achievable with a given blocklength and probability of error is denoted by $M^*(n, \epsilon) = \max\{ M \mid \exists\, C_n \text{ s.t. } P_e \leq \epsilon \}$. The code rate of the encoder–decoder pair $(f_n, g_n)$ is denoted by
$$R = \frac{\log M}{\log n}, \quad (5)$$
where $\log(\cdot)$ denotes the binary logarithm (base 2) throughout this paper. Note that the rate $R(n, \epsilon)$ for the noisy permutation channel is not the conventional $R(n) = \frac{1}{n}\log M$, since noisy permutation channels would have rate 0 under that definition. The capacity of noisy permutation channels is defined as $C = \sup\{ R \geq 0 : R \text{ is achievable} \}$.

3. Message Set and Divergence Packing

A divergence packing is a set of centers in the simplex $\Delta_{|\mathcal{Y}|-1}$ such that the minimum distance between any two centers, measured by KL divergence, is at least some radius. The following definitions formalize the corresponding packing number.
Definition 1.
The achievability space of marginal distributions is defined by $\Delta^*_{|\mathcal{Y}|-1} = \{ P_X W \mid P_X \in \Delta_{|\mathcal{X}|-1} \}$.
Definition 2.
Let $\{P_1, \ldots, P_M\} \subseteq \Delta^*_{|\mathcal{Y}|-1}$ be a set of divergence packing centers. The divergence packing number on $\Delta^*_{|\mathcal{Y}|-1}$ is defined by
$$N^*(r, |\mathcal{Y}|) = \max\left\{ M \,\middle|\, \exists\, \{P_1, \ldots, P_M\} \text{ s.t. } \min_{i \neq j} D(P_i\|P_j) \geq r \right\}, \quad (6)$$
where $r > 0$ is the packing radius.
Here, we provide some intuition for using divergence packing in noisy permutation channels. Under the ML decoder, the non-asymptotic channel performance is governed by the likelihood ratio of two distributions (the decoding metric). By the law of large numbers, the empirical mean of this decoding metric approaches the KL divergence as the blocklength grows. Thus, divergence packing yields an upper bound on the error probability by lower bounding the distance between distributions, and the code size can be analyzed asymptotically via the Berry–Esseen bound (see Section 5). These factors motivate us to use the KL divergence in constructing the message set.
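As a quick numerical sanity check of this intuition (our own illustration; the pair P, Q is arbitrary), the empirical mean of the decoding metric under $Y \sim P$ indeed approaches $D(P\|Q)$:

```python
import numpy as np

rng = np.random.default_rng(1)
P = np.array([0.6, 0.4])   # a hypothetical pair of marginal distributions
Q = np.array([0.5, 0.5])

# KL divergence in bits (log is base 2 throughout the paper).
D = float(np.sum(P * np.log2(P / Q)))

# Law of large numbers: the per-letter decoding metric averages to D(P||Q).
n = 100_000
y = rng.choice(2, size=n, p=P)
metric = np.log2(P[y] / Q[y])
print(D, metric.mean())    # the two values agree to about two decimal places
```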
Since the messages correspond to different distributions in noisy permutation channels, the message set can be equivalent to the set of marginal distributions at the receiver (e.g., see [1]). In the sequel, we denote the marginal distribution corresponding to message m by P m .
Additionally, in Gaussian approximations, we need the following definition.
Definition 3.
Let $\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})$ and $\mathrm{vol}(y, \Delta_{|\mathcal{Y}|-1})$ be, respectively, the volumes of the projections of $\Delta^*_{|\mathcal{Y}|-1}$ and $\Delta_{|\mathcal{Y}|-1}$ from $\mathbb{R}^{|\mathcal{Y}|}$ to the space $\mathbb{R}^{|\mathcal{Y}|-1}$ in which the y-th dimension is removed. The channel volume ratio is defined as
$$V(W) = \max_{y \in \mathcal{Y}} \frac{\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})}{\mathrm{vol}(y, \Delta_{|\mathcal{Y}|-1})}. \quad (7)$$
Next, we present several lower bounds on packing numbers. In the two-dimensional case, our construction achieves tighter bounds. For higher dimensions, our primary tool is the volume bound for ϵ -packing (e.g., see [8] Theorem 27.3). These results form the foundation for constructing marginal probability distributions in subsequent sections, while also playing a key role in the analysis of Gaussian approximations.

3.1. Binary Case

We first give the lower bound of the packing number in the binary case. Consider $\Delta_1^* = \{ (q, 1-q) \mid \delta_1 \leq q \leq 1-\delta_2 \}$, where $0 < \delta_1 < 1-\delta_2 < 1$. Define
$$\Gamma^*_{b,2} = \left\{ (q, 1-q) \,\middle|\, q = \xi\,\frac{a}{\lfloor 1/b \rfloor} + \delta_1,\ a \in \mathbb{Z}_{\geq 0} \right\}, \quad (8)$$
where $\xi = 1 - \delta_1 - \delta_2$ and $b > 0$.
Then, we have the following result proved in Appendix A.
Proposition 1.
Fix $\Delta_1^* = \{ (q, 1-q) \mid \delta_1 \leq q \leq 1-\delta_2 \}$, where $\delta_1 > 0$ and $\delta_2 > 0$. We can construct a set of packing centers by (8) with $b = \frac{1}{\xi}\sqrt{\frac{r}{2\log e}}$ and $\xi = 1 - \delta_1 - \delta_2$ such that
$$N^*(r, 2) \geq \lfloor 1/b \rfloor + 1. \quad (9)$$

3.2. General Case

Next, we introduce a general method for constructing the set of packing centers, together with bounds on its size. Let $b > 0$ and consider the following set:
$$\Gamma^*_{b,|\mathcal{Y}|} = \left\{ P \in \Delta^*_{|\mathcal{Y}|-1} \,\middle|\, P = \left( \frac{a_1}{\lfloor 1/b \rfloor}, \ldots, \frac{a_{|\mathcal{Y}|}}{\lfloor 1/b \rfloor} \right),\ a_1, \ldots, a_{|\mathcal{Y}|} \in \mathbb{Z}_{\geq 0} \right\}. \quad (10)$$
The intuition behind this construction is that the minimum distance between distributions in this uniform grid can be bounded via the total variation distance; applying Pinsker's inequality then yields a set of divergence packing centers with a prescribed radius.
We have the following lower bound proved in Appendix A.
Theorem 1.
Fix a W that satisfies Assumption 1 and generate $\Delta^*_{|\mathcal{Y}|-1}$. We can construct a set of packing centers by (10) with $b = \sqrt{\frac{r}{2\log e}}$ such that
$$N^*(r, |\mathcal{Y}|) \geq V(W) \left( \frac{\log e}{8r} \right)^{\frac{|\mathcal{Y}|-1}{2}}, \quad (11)$$
where $r > 0$ is the packing radius.
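For intuition about the general construction, the following sketch (our own illustration; the 3 × 3 matrix W and the radius r are arbitrary) enumerates the grid (10) and keeps the points that are achievable marginals, i.e., those P for which $P W^{-1}$ is a valid input distribution:

```python
import numpy as np
from itertools import product

W = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])    # strictly positive, full-rank, square
Winv = np.linalg.inv(W)

r = 0.01                              # packing radius in bits
b = np.sqrt(r / (2 * np.log2(np.e)))  # b as in Theorem 1
k = int(np.floor(1 / b))

centers = []
for a1, a2 in product(range(k + 1), repeat=2):
    a3 = k - a1 - a2
    if a3 < 0:
        continue
    P = np.array([a1, a2, a3]) / k    # a grid point of the simplex
    if np.all(P @ Winv >= -1e-12):    # P = P_X W for some P_X in the simplex
        centers.append(P)
print(len(centers))   # size of the constructed packing, cf. the bound (11)
```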

4. New Bounds on Rate

In this section, we introduce our new bound, which is based on divergence packing and is in the spirit of the RCU bound. The key ingredient is our analysis of the error events.
To that end, we introduce some definitions. Suppose we have a set of marginal distributions $\mathcal{M}$ constructed by (8) or (10) with any $b > 0$. Fixing a $P = (p_1, \ldots, p_{|\mathcal{Y}|}) \in \mathcal{M}$, we are often concerned with the divergence packing centers close to P. To this end, we consider $Q_{a,b}(P) = (q_1, \ldots, q_{|\mathcal{Y}|})$, where $a \neq b$, $q_a = p_a + K/\lfloor 1/b \rfloor$, $q_b = p_b - K/\lfloor 1/b \rfloor$, and $q_i = p_i$ for $i \in \mathcal{Y} \setminus \{a, b\}$. Here, K is a constant equal to ξ or 1 when $\mathcal{M}$ is constructed by (8) or (10), respectively. We define $R_P$, the neighboring set of P, as
$$R_P = \left\{ Q_{a,b}(P) \,\middle|\, a, b \in \mathcal{Y},\ a \neq b \right\} \cap \mathcal{M}. \quad (12)$$
In general, the distribution $Q_{a,b}(P)$ coincides with a distribution in the set $\mathcal{M}$, except near the boundaries of the simplex, where $Q_{a,b}(P)$ may violate the constraints of the probability space. We use the intersection operation in (12) to make sure all elements of $R_P$ remain within the simplex. For convenience, we use $j \in [|R_P|]$ to index $Q_j \in R_P$, and we say $Q_j \in R_P$ is a neighboring distribution of P. By counting, we have $|R_P| \leq 2\binom{|\mathcal{Y}|}{2}$.
For the marginal distribution $P_m$ corresponding to the transmitted message m, we use the log-likelihood ratio to define the following decoding metric:
$$d(m, j, y) := \log \frac{P_m(y)}{Q_j(y)}, \quad (13)$$
where $Q_j \in R_{P_m}$.
Then, the proof of our main result consists of three parts, each detailed in one of the following subsections. In the first subsection, we introduce a lemma. This lemma shows that the message sets constructed by (8) or (10) have an overlapping relationship for error events. In the second subsection, we use this lemma to give an equivalent expression for the error probability. The third subsection contains our main result. Additionally, we particularize this new bound to BSC and BEC permutation channels in the fourth and fifth subsections, respectively.

4.1. Overlapping of Error Events

Intuitively, the rate of decay of P e is dominated by the rate of decay of the probability of error in distinguishing neighboring messages. In order to use this intuition mathematically, we need to analyze the relationship between error events. The following lemma, proved in Appendix B, does this and can be used for analyzing random coding bounds.
Lemma 1.
Let $\mathcal{M}$ be constructed by (8) or (10) with any $b > 0$. Fix a $P \in \mathcal{M}$. Then, for every $\Lambda = (\lambda_1, \ldots, \lambda_{|\mathcal{Y}|}) \in \Delta_{|\mathcal{Y}|-1}$ and $Q = (q_1, \ldots, q_{|\mathcal{Y}|}) \in \mathcal{M} \setminus (R_P \cup \{P\})$, if
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} \leq \prod_{i=1}^{|\mathcal{Y}|} q_i^{\lambda_i}, \quad (14)$$
then there exists a $Q^* = (q_1^*, \ldots, q_{|\mathcal{Y}|}^*) \in R_P$ such that
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} \leq \prod_{i=1}^{|\mathcal{Y}|} (q_i^*)^{\lambda_i}. \quad (15)$$

4.2. Equivalent Expression

In this subsection, we give a lemma tailored to our purposes. It follows directly from Lemma 1.
Lemma 2.
For the set of marginal distributions $\mathcal{M}$ constructed by (8) or (10) with any $b > 0$, we have
$$\mathbb{P}\left[ \bigcup_{j=1,\, j \neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] = \mathbb{P}\left[ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right], \quad (16)$$
where the sequence $Y^n$ is drawn i.i.d. from $P_m$ and $Q_j \in R_{P_m}$.
Proof. 
Using Lemma 1, for $j \in [|\mathcal{M}|]$, if $P_j^n(y^n) \geq P_m^n(y^n)$ occurs, then there exists a $j' \in [|R_{P_m}|]$ such that $Q_{j'}^n(y^n) \geq P_m^n(y^n)$ occurs. Then, we observe
$$\mathbb{P}\left[ \bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] = \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \left\{ P_j^n(y^n) \geq P_m^n(y^n) \right\} \right\} \quad (17)$$
$$= \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(y^n) \geq P_m^n(y^n) \right\} \right\} \quad (18)$$
$$= \mathbb{P}\left[ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right], \quad (19)$$
where in (17) we sum over all possible outputs, and (18) relies on Lemma 1 by setting Λ to be the empirical distribution of $y^n$. This completes the proof of (16). □
Remark 1.
If the transmitted message is m, Lemma 2 shows that the union of error events $\bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \{P_j^n \geq P_m^n\}$ is equivalent to the smaller union $\bigcup_{j=1}^{|R_{P_m}|} \{Q_j^n \geq P_m^n\}$ over $R_{P_m}$, whose size depends only on the size of the output alphabet.

4.3. Main Result: New Lower Bound

The main result in this section is the following. Please refer to Appendix C for the proof.
Theorem 2.
Fix a W that satisfies Assumption 1 and generate $\Delta^*_{|\mathcal{Y}|-1}$. Let the set of marginal distributions $\mathcal{M}$ be constructed by (8) or (10) with any $b > 0$. Then, there exists a code $C_n$ (under average error probability) with achievable code size $|\mathcal{M}|$ such that
$$\epsilon \leq \min\left\{ 1,\ \frac{1}{|\mathcal{M}|} \sum_{m=1}^{|\mathcal{M}|} \sum_{j=1}^{|R_{P_m}|} \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, j, y_i) \leq 0 \right\} \right\}. \quad (20)$$
Remark 2.
Theorem 2 relies on the message set constructed by (8) or (10). We restrict the channel W to be a full-rank square matrix, which makes $\Delta^*_{|\mathcal{Y}|-1}$ a full-dimensional subset of $\Delta_{|\mathcal{Y}|-1}$. Therefore, the evenly spaced grid structure on $\Delta^*_{|\mathcal{Y}|-1}$ can be constructed by using (8) or (10). Without this condition, we cannot apply (10) unless we make strong assumptions about W.
Remark 3.
Theorem 2 upper bounds the probability of error by the sum of the probabilities of error events on $R_{P_m}$ instead of $\mathcal{M}$, which makes our bound much stronger than that of [1]. In fact, if we do not use Lemma 2 but instead apply the union bound and the second moment method for the TV distance ([21] Lemma 4.2(iii)) in the proof of Theorem 2, we recover the existing bound ([1] Equation (36)).

4.4. BSC Permutation Channels

In this subsection, we particularize the nonasymptotic bounds to the BSC, i.e., the DMC matrix is
$$W = \begin{pmatrix} 1-\delta & \delta \\ \delta & 1-\delta \end{pmatrix}, \quad (21)$$
denoted $\mathrm{BSC}_\delta$. According to Proposition 1 and Theorem 1, using (8) to construct the set of marginal distributions is better than using (10) in the binary case; we therefore focus on the former in this subsection. For convenience, we denote $P_m(\cdot) = (\delta_m, 1-\delta_m)$, with $\delta_i < \delta_j$ for $i < j$, $i, j \in [|\mathcal{M}|]$. Then, for a given $P_m$, we clearly have
$$R_{P_m} = \begin{cases} \{P_{m-1}, P_{m+1}\}, & 2 \leq m \leq |\mathcal{M}|-1, \\ \{P_2\}, & m = 1, \\ \{P_{|\mathcal{M}|-1}\}, & m = |\mathcal{M}|. \end{cases} \quad (22)$$
Let
$$f_1(n, T_i) = \begin{cases} \sum_{t=0}^{\lfloor T_i \rfloor} \binom{n}{t} \delta_i^t (1-\delta_i)^{n-t}, & i \geq 2, \\ 0, & i = 1, \end{cases} \quad (23)$$
and
$$f_2(n, T_i) = \begin{cases} \sum_{t=\lceil T_i \rceil}^{n} \binom{n}{t} \delta_i^t (1-\delta_i)^{n-t}, & i \leq |\mathcal{M}|-1, \\ 0, & i = |\mathcal{M}|. \end{cases} \quad (24)$$
The following bound is a straightforward generalization of Theorem 2.
Theorem 3
(Achievability). For the BSC permutation channel with crossover probability δ, there exists a code $C_n$ such that
$$\epsilon \leq \sum_{i=1}^{|\mathcal{M}|} \frac{1}{|\mathcal{M}|} \min\left\{ 1,\ f_1(n, \bar{T}_i) + f_2(n, \underline{T}_i) \right\}, \quad (25)$$
where
$$\bar{T}_i = \frac{n \log\frac{1-\delta_{i-1}}{1-\delta_i}}{\log\frac{\delta_i(1-\delta_{i-1})}{\delta_{i-1}(1-\delta_i)}} \quad (26)$$
and
$$\underline{T}_i = \frac{n \log\frac{1-\delta_{i+1}}{1-\delta_i}}{\log\frac{\delta_i(1-\delta_{i+1})}{\delta_{i+1}(1-\delta_i)}}. \quad (27)$$
The set of marginal distributions is constructed by (8), and for the radius r we have
$$|\mathcal{M}| = \lfloor 1/b \rfloor + 1, \quad (28)$$
with b as in Proposition 1.
Proof. 
Let us assume the transmitted message is $m \in \mathcal{M}$, corresponding to the marginal distribution $P_m$. For the BSC, we focus on the set (22). Using the same argument as in the proof of Lemma 1, the term corresponding to $d(m, m-1, y_i)$ in (20) can be computed as
$$\sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, m-1, y_i) \leq 0 \right\} = \sum_{t=0}^{\lfloor \bar{T}_m \rfloor} \binom{n}{t} \delta_m^t (1-\delta_m)^{n-t}, \quad (29)$$
where $\bar{T}_m$ follows from (A19). Similarly, the term corresponding to $d(m, m+1, y_i)$ in (20) can be computed as
$$\sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, m+1, y_i) \leq 0 \right\} = \sum_{t=\lceil \underline{T}_m \rceil}^{n} \binom{n}{t} \delta_m^t (1-\delta_m)^{n-t}, \quad (30)$$
where $\underline{T}_m$ follows from (A19). Substituting (29) and (30) into Theorem 2 completes the proof. □
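The bound of Theorem 3 reduces to binomial tail sums and is cheap to evaluate. The following sketch is our own implementation of (25)–(27); for simplicity it takes the $\delta_i$ evenly spaced on $[\delta, 1-\delta]$, as construction (8) yields:

```python
import numpy as np
from scipy.stats import binom

def bsc_achievability(n, delta, M):
    """Right side of (25) for BSC_delta with M evenly spaced marginals."""
    d = np.linspace(delta, 1 - delta, M)
    total = 0.0
    for i in range(M):
        p_err = 0.0
        if i > 0:      # left neighbour: type falls at or below (26)
            T = n * np.log((1 - d[i-1]) / (1 - d[i])) / np.log(
                d[i] * (1 - d[i-1]) / (d[i-1] * (1 - d[i])))
            p_err += binom.cdf(np.floor(T), n, d[i])
        if i < M - 1:  # right neighbour: type rises to or above (27)
            T = n * np.log((1 - d[i+1]) / (1 - d[i])) / np.log(
                d[i] * (1 - d[i+1]) / (d[i+1] * (1 - d[i])))
            p_err += binom.sf(np.ceil(T) - 1, n, d[i])
        total += min(1.0, p_err)
    return total / M

print(bsc_achievability(n=1000, delta=0.11, M=8))
```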

4.5. BEC Permutation Channels

The BEC permutation channel with erasure probability δ has input alphabet $\mathcal{X} = \{0, 1\}$ and output alphabet $\mathcal{Y} = \{0, e, 1\}$, where the conditional distribution is
$$W(z|x) = \begin{cases} 1-\delta, & z = x, \\ \delta, & z = e, \\ 0, & \text{otherwise}, \end{cases} \qquad z \in \mathcal{Y},\ x \in \mathcal{X}. \quad (31)$$
Moreover, we denote such a channel as BEC δ for convenience.
Next, we have the following achievability bound.
Proposition 2.
For BEC permutation channels with erasure probability 2δ, there exists a code $C_n$ such that the average probability of error and the code size satisfy (25) and (28), respectively.
Proof. 
The derivation follows ([1] Proposition 6); we include the details for the sake of completeness. We first note that the BSC matrix satisfies the Doeblin minorization condition (e.g., see [1] Definition 5) with the uniform distribution $(1/2, 1/2)$ and constant 2δ. Using ([1] Lemma 6), we find that $\mathrm{BSC}_\delta$ is a degraded version of $\mathrm{BEC}_{2\delta}$. Then, for encoder–decoder pairs $(f_n, g_n)$ for BSC permutation channels and $(f_n, \tilde{g}_n)$ for BEC permutation channels, the average probabilities of error satisfy
$$P_e(f_n, g_n, \mathrm{BSC}_\delta) = P_e(f_n, \tilde{g}_n, \mathrm{BEC}_{2\delta}). \quad (32)$$
Then, the argument of the proof of Theorem 3 is repeated. This completes the proof. □

5. Gaussian Approximation

We turn to the asymptotic analysis of the noisy permutation channel for a given blocklength and average probability of error.

5.1. Auxiliary Lemmata

To establish our Gaussian approximation, we present two lemmata. The first is an important tool in Gaussian approximation analysis:
Lemma 3
(Berry–Esseen, [22] Chapter XVI.5, Theorem 2). Fix a positive integer n. Let $Z_i$, $i \in \{1, \ldots, n\}$, be independent random variables. Then, for any real x and $C_0 = 6$, we have
$$\left| \mathbb{P}\left[ \sum_{i=1}^n Z_i < n\mu_n + x\sqrt{n V_n} \right] - \Phi(x) \right| \leq \frac{B_n}{\sqrt{n}}, \quad (33)$$
where
$$\mu_n = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[Z_i], \qquad V_n = \frac{1}{n}\sum_{i=1}^n \mathrm{Var}[Z_i], \quad (34)$$
$$T_n = \frac{1}{n}\sum_{i=1}^n \mathbb{E}\left[ |Z_i - \mu_i|^3 \right], \qquad B_n = \frac{C_0 T_n}{V_n^{3/2}}, \quad (35)$$
and $\mu_i = \mathbb{E}[Z_i]$.
To develop the Gaussian approximation, we consider the following definitions. The variance and the third absolute moment of the log-likelihood ratio between two distributions P and Q are defined as $V(P\|Q) = \mathbb{E}\left[ \left( \log\frac{P(Y)}{Q(Y)} - D(P\|Q) \right)^2 \right]$ and $T(P\|Q) = \mathbb{E}\left[ \left| \log\frac{P(Y)}{Q(Y)} - D(P\|Q) \right|^3 \right]$, respectively, where $Y \sim P$. The following lemma, proved in Appendix D, concerns the properties of $V(P\|Q)$ and $T(P\|Q)$.
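For finite alphabets these quantities are direct sums; a small sketch (our own, with an arbitrary pair P, Q) follows:

```python
import numpy as np

def llr_moments(P, Q):
    """D(P||Q), V(P||Q), and T(P||Q) in bits, per the definitions above."""
    llr = np.log2(P / Q)                          # per-symbol log-likelihood ratio
    D = float(np.sum(P * llr))                    # mean: the KL divergence
    V = float(np.sum(P * (llr - D) ** 2))         # variance under P
    T = float(np.sum(P * np.abs(llr - D) ** 3))   # third absolute central moment
    return D, V, T

P = np.array([0.55, 0.45])
Q = np.array([0.50, 0.50])
print(llr_moments(P, Q))
```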
Lemma 4.
Fix a W that satisfies Assumption 1 and generate $\Delta^*_{|\mathcal{Y}|-1}$. Let $\mathcal{M}$, constructed by (10) with any $b > 0$, be the set of packing centers on $\Delta^*_{|\mathcal{Y}|-1}$. If the packing radius satisfies $r_0 \leq \frac{2\log e}{9}$, then for any $P \in \mathcal{M}$ and $Q \in R_P$, we have
$$V(P\|Q) = r_0 F_0, \quad (36)$$
where
$$\left( \frac{5}{8 p_{\max}} - \frac{2}{9} \right)\log e \leq F_0 \leq \frac{5\log e}{2 p_{\min}(1-p_{\max})^2}, \quad (37)$$
and $p_{\min}$ and $p_{\max}$ are constants greater than 0. Additionally, we have
$$T(P\|Q) \leq \frac{36\sqrt{2}\,(\log e)^{3/2}}{p_{\min}^2 (1-p_{\max})^3}\, r_0^{3/2}. \quad (38)$$

5.2. Main Result: Gaussian Approximation

The main result in this section is the following. Please refer to Appendix E for the proof.
Theorem 4.
Fix a strictly positive and full-rank square matrix W for the noisy permutation channel. Then, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W) + \theta \quad (39)$$
is achievable for all $n \geq N_0$, where $\ell = \operatorname{rank}(W) - 1$, $G = 2\binom{\ell+1}{2}$, $V(W)$ is the channel volume ratio, and θ is a constant.
The achievable code size (39) differs from the Gaussian approximation of traditional channels (e.g., see [7]), since our bound is obtained via the divergence packing number $N^*(r, |\mathcal{Y}|)$. The packing radius is a key ingredient affecting the lower bound on $N^*(r, |\mathcal{Y}|)$, and it also affects the error probability.

5.3. Approximation of BSC and BEC Permutation Channels

We apply Theorem 4 to obtain the following approximation.
Corollary 1.
For BSC permutation channels with crossover probability δ, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \log \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \theta \quad (40)$$
is achievable for all $n \geq N_0$, where θ is a constant.
Proof. 
For the BSC, we have $\Delta_1^* = \{ (p, 1-p) \mid \delta \leq p \leq 1-\delta \}$. By using Lagrange's formula [23], we have $V(\mathrm{BSC}_\delta) = 1 - 2\delta$. Substituting this into Theorem 4 yields the result. □
Remark 4.
We remark that the Gaussian approximation reveals some properties of the code size for a given blocklength n and probability of error ϵ. In BSC permutation channels, while the channel capacity depends only on the rank of the channel matrix, the speed at which the achievable code size approaches the capacity is affected by the crossover probability δ.
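With the constant θ dropped, approximation (40) is a one-line formula. The sketch below (our own; the parameter values mirror the figures) evaluates it for a few blocklengths:

```python
import numpy as np
from scipy.stats import norm

def gaussian_approx_bsc(n, delta, eps):
    """Approximation (40) for BSC permutation channels, with theta omitted."""
    return np.log2((1 - 2 * delta) * np.sqrt(n) / -norm.ppf(eps / 2))

for n in (100, 1000, 10000):
    print(n, gaussian_approx_bsc(n, delta=0.11, eps=1e-3))
```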
The approximation of BSC permutation channels can also be derived from Proposition 1. To use the message set constructed by (8), we need the following lemma, which is proved in Appendix D:
Lemma 5.
Fix a W that satisfies Assumption 1 and generate $\Delta_1^*$. Let $\mathcal{M}$, constructed by (8) with any $b > 0$, be the set of packing centers on $\Delta_1^*$. Then, there exists a packing radius $r_1$ such that for all $r \leq r_1$, we have
$$F_0 r \leq V(P\|Q) \leq F_1 r \quad (41)$$
and
$$T(P\|Q) \leq F_2 r^{3/2}, \quad (42)$$
where $F_0$, $F_1$, and $F_2$ are positive and finite.
Then, we have the following result.
Proposition 3.
For BSC permutation channels with crossover probability δ, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \log \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \theta \quad (43)$$
is achievable for all $n \geq N_0$, where θ is a constant.
Proof. 
Instead of using (10), we use (8) with $\Delta_1^* = \{ (p, 1-p) \mid \delta \leq p \leq 1-\delta \}$, and we repeat the argument of the proof of Theorem 4 with Lemma 4 replaced by Lemma 5. Note that for $\mathcal{M}$ constructed by (8) with any $b > 0$, we have
$$b = \frac{1}{1-2\delta}\sqrt{\frac{r}{2\log e}}. \quad (44)$$
We use Proposition 1 to continue as follows:
$$\log M^*(n, \epsilon) \geq \log\left( \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + 1 \right) + \log F_0 \geq \log \frac{(1-2\delta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \log F_0, \quad (45)$$
where $F_0 > 0$ is a constant. This completes the proof. □
Next, for BEC permutation channels, we have the following approximation.
Proposition 4.
For BEC permutation channels with erasure probability η, there exists a number $N_0 \geq 1$ such that
$$\log M^*(n, \epsilon) = \log \frac{(1-\eta)\sqrt{n}}{-\Phi^{-1}(\epsilon/2)} + \theta \quad (46)$$
is achievable for all $n \geq N_0$, where θ is a constant.
Proof. 
Through Theorem 2, repeat the argument of the proof of Corollary 1 and Theorem 3, replacing δ with η / 2 . □

6. Numerical Results

In this section, we perform numerical evaluations to illustrate our results. We first validate the precision of our Gaussian approximation across a wide range of parameters. Second, we present the performance of the bounds for a binary DNA storage system and compare them with the existing bound.

6.1. Precision of the Gaussian Approximation

Here, we give the numerical results. According to Proposition 2, the BEC permutation channel with erasure probability 2δ is equivalent to the BSC permutation channel with crossover probability δ, so we focus on the numerical results for BSC permutation channels. We use Theorem 3 to compute the non-asymptotic achievability bound, searching from M = 2 upward until the right side of (25) exceeds the target error probability ϵ. For the Gaussian approximation, we use (40) and (43) but omit the remainder term θ. As Figure 2, Figure 3, Figure 4 and Figure 5 show, although the remainder term of the Gaussian approximation is a constant, the approximation is still quite close to the non-asymptotic achievability bound. In fact, for all $n \geq 20$, the difference between (43) and Theorem 3 is within 1 bit in $\log M^*(n, \epsilon)$.
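The search just described takes only a few lines; this sketch (our own) reuses the bsc_achievability function from the Theorem 3 listing and assumes the right side of (25) is increasing in M and that M = 2 is feasible:

```python
def max_code_size(n, delta, eps):
    """Largest M whose Theorem 3 bound (25) stays at or below eps."""
    M = 2
    while bsc_achievability(n, delta, M + 1) <= eps:
        M += 1
    return M

print(max_code_size(n=1000, delta=0.11, eps=1e-3))
```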

6.2. Comparison with Existing Bound

Additionally, in the context of DNA storage systems, we consider codewords composed of nucleotides $\{A, C, G, T\}$. For simplicity, A and C are regarded as the symbol 0, and G and T are regarded as the symbol 1 in the code construction, giving a binary alphabet $\{0, 1\}$. The synthesis errors and the random permutation of DNA molecules are modeled as the BSC permutation channel with crossover probability δ = 0.25. To reduce the computational complexity, we use approximation (43). Furthermore, we present numerical results for the existing lower bound, namely Makur's achievability bound ([1] Equation (36)) for BSC permutation channels. The results show that our new achievability bound is uniformly better than Makur's bound. In the setup of Figure 6, our bound quickly approaches half of the capacity ($n \approx 1000$), whereas Makur's bound reaches 20% of the channel capacity only at about $n \approx 1.4 \times 10^5$, as shown in Figure 6. This is because we exploit the overlapping relationship of error events, which reduces the number of error events when applying the union bound.

7. Conclusions and Discussion

In summary, we established a new achievability bound for noisy permutation channels with a strictly positive and full-rank square matrix. The key element is that our analysis shows the number of error events in the union to be independent of the size of the message set. This allows us to derive a refined asymptotic analysis of the achievable rate. Numerical simulations show that our new achievability bound is stronger than Makur's achievability bound in [1]. Additionally, our approximation is quite accurate, even though the remainder term is a constant. The primary direction for future work is to generalize the DMC matrix in noisy permutation channels to non-full-rank and non-strictly-positive matrices. Other future work may improve the asymptotic expansion (e.g., sharpening the remainder term to $o(1)$).

Author Contributions

Conceptualization, L.F.; Methodology, L.F.; Software, L.F.; Validation, L.F.; Formal analysis, L.F.; Investigation, L.F.; Resources, L.F.; Data curation, L.F.; Writing—original draft, L.F.; Writing—review and editing, L.F., G.L. and X.L.; Visualization, L.F.; Supervision, G.L.; Project administration, G.L. and Y.J.; Funding acquisition, G.L., X.L. and Y.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Divergence Packing

Appendix A.1. Proof of Proposition 1

We note that $\Gamma^*_{b,2}$ constructed by (8) is a subset of $\Delta_1^*$. Fix $b = \frac{1}{\xi}\sqrt{\frac{r}{2\log e}}$. Then, for $P, Q \in \Gamma^*_{b,2}$ we have
$$\min_{P \neq Q} (2\log e)\, \mathrm{TV}^2(P, Q) \geq r. \quad (A1)$$
Using Pinsker's inequality ([8] Theorem 7.10), we obtain $\min_{P \neq Q} D(P\|Q) \geq r$. We then conclude the proof by realizing that $|\Gamma^*_{b,2}| = \lfloor 1/b \rfloor + 1$.

Appendix A.2. Proof of Theorem 1

To prove Theorem 1, we consider the set
$$\Gamma_{b,|\mathcal{Y}|} = \left\{ P \in \Delta_{|\mathcal{Y}|-1} \,\middle|\, P = \left( \frac{a_1}{\lfloor 1/b \rfloor}, \ldots, \frac{a_{|\mathcal{Y}|}}{\lfloor 1/b \rfloor} \right),\ a_1, \ldots, a_{|\mathcal{Y}|} \in \mathbb{Z}_{\geq 0} \right\}. \quad (A2)$$
We have the following lemma:
Lemma A1.
Let $\Gamma_{b,|\mathcal{Y}|}$ be constructed by (A2) with any $b > 0$. For $P, Q \in \Gamma_{b,|\mathcal{Y}|}$, we have
$$\min_{P \neq Q} \mathrm{TV}(P, Q) \geq b. \quad (A3)$$
Proof. 
For every $P \in \Gamma_{b,|\mathcal{Y}|}$, a closest $Q \in \Gamma_{b,|\mathcal{Y}|}$ is obtained by choosing $m, n \in \mathcal{Y}$ with $m \neq n$ and setting $q_m > p_m$ and $q_n < p_n$ at adjacent grid values, with $q_i = p_i$ for $i \in \mathcal{Y} \setminus \{m, n\}$. Then, we obtain $\min_{P \neq Q} \mathrm{TV}(P, Q) = \min_{P \neq Q} \frac{1}{2}\sum_{i \in \mathcal{Y}} |p_i - q_i| = \frac{1}{\lfloor 1/b \rfloor} \geq b$. □
Fix a radius parameter $b = \sqrt{\frac{r}{2\log e}} > 0$ and fix a $\Gamma^*_{b,|\mathcal{Y}|}$ constructed by (10). Note that we have $\Gamma^*_{b,|\mathcal{Y}|} \subseteq \Gamma_{b,|\mathcal{Y}|}$. For any $P, Q \in \Gamma^*_{b,|\mathcal{Y}|}$, we have
$$\min_{P \neq Q} D(P\|Q) \overset{(a)}{\geq} \min_{P \neq Q} (2\log e)\, \mathrm{TV}^2(P, Q) \overset{(b)}{\geq} r, \quad (A4)$$
where (a) follows from Pinsker's inequality and (b) follows from Lemma A1. On the other hand, the right inequality of (A4) shows that
$$\min_{P \neq Q} \sum_{y \in \mathcal{Y}} |P(y) - Q(y)| \geq \sqrt{\frac{2r}{\log e}}. \quad (A5)$$
Hence, a total variation packing constructed by $\Gamma^*_{b,|\mathcal{Y}|}$ can be regarded as an $\ell_1$-norm packing of $\Delta^*_{|\mathcal{Y}|-1}$ with radius $\sqrt{2r/\log e}$. Let $B_1^{|\mathcal{Y}|-1}$ be the $\ell_1$-norm unit ball. The volume bound ([8] Theorem 27.3) for $\ell_1$-norm packing provides the lower bound on $|\Gamma^*_{b,|\mathcal{Y}|}|$:
$$|\Gamma^*_{b,|\mathcal{Y}|}| \geq \left( \frac{\log e}{2r} \right)^{\frac{|\mathcal{Y}|-1}{2}} \frac{\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})}{\mathrm{vol}(B_1^{|\mathcal{Y}|-1})}. \quad (A6)$$
Here, $\mathrm{vol}(B_1^{|\mathcal{Y}|-1})$ is the volume of the $\ell_1$-norm unit ball, and $\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1})$ is the volume of the projection of $\Delta^*_{|\mathcal{Y}|-1}$ from $\mathbb{R}^{|\mathcal{Y}|}$ to the space $\mathbb{R}^{|\mathcal{Y}|-1}$, consistent with the argument used in the proof of ([24] Proposition 2).
Note that the maximum volume ratio is V(W) by Definition 3. We continue the bounding as follows:
$$|\Gamma^*_{b,|\mathcal{Y}|}| \geq \left( \frac{\log e}{2r} \right)^{\frac{|\mathcal{Y}|-1}{2}} \times \frac{V(W)\, \mathrm{vol}(y, \Delta_{|\mathcal{Y}|-1})}{\mathrm{vol}(B_1^{|\mathcal{Y}|-1})} \quad (A7)$$
$$= V(W) \left( \frac{\log e}{8r} \right)^{\frac{|\mathcal{Y}|-1}{2}}, \quad (A8)$$
where
  • (A7) holds since the projection can remove any y-th dimension, $y \in \mathcal{Y}$; consequently, the lower bound is given by taking the maximum of the volume ratio over $y \in \mathcal{Y}$;
  • (A8) follows by using Lagrange's formula [23] and the volume formula to obtain $\mathrm{vol}(y, \Delta^*_{|\mathcal{Y}|-1}) = \frac{V(W)}{(|\mathcal{Y}|-1)!}$ and $\mathrm{vol}(B_1^{|\mathcal{Y}|-1}) = \frac{2^{|\mathcal{Y}|-1}}{(|\mathcal{Y}|-1)!}$, respectively.
Finally, we conclude the proof by realizing that the left inequality of (A4) implies that $\Gamma^*_{b,|\mathcal{Y}|}$ gives a set of divergence packing centers of radius r. This completes the proof of Theorem 1.

Appendix B. Proof of Lemma 1

We first consider $\mathcal{M}$ constructed by (10). Fix a $P = (p_1, \ldots, p_{|\mathcal{Y}|}) \in \mathcal{M}$ and generate $R_P$ corresponding to P. Fix $a, b \in \mathcal{Y}$ with $a \neq b$. Denote by $Q^*_{a,b} = (q_1^*, \ldots, q_{|\mathcal{Y}|}^*)$ the distribution with $q_a^* = p_a + \frac{1}{\lfloor 1/b \rfloor}$, $q_b^* = p_b - \frac{1}{\lfloor 1/b \rfloor}$, and $q_i^* = p_i$ for $i \in \mathcal{Y} \setminus \{a, b\}$. Let
$$\log \frac{p_a + 1/\lfloor 1/b \rfloor}{p_a} = G_{a,b} \log \frac{p_b}{p_b - 1/\lfloor 1/b \rfloor}, \quad (A9)$$
where $G_{a,b}$ is a constant. Define
$$B_{a,b} = \left\{ (\lambda_1, \ldots, \lambda_{|\mathcal{Y}|}) \,\middle|\, G_{a,b}\,\lambda_a < \lambda_b \right\}. \quad (A10)$$
If $Q^*_{a,b} \in R_P$, then for $\Lambda = (\lambda_1, \ldots, \lambda_{|\mathcal{Y}|}) \in B_{a,b}$ and $Q^*_{a,b} = (q_1^*, \ldots, q_{|\mathcal{Y}|}^*)$, we have
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} > \prod_{i=1}^{|\mathcal{Y}|} (q_i^*)^{\lambda_i}. \quad (A11)$$
For each $a, b \in \mathcal{Y}$ with $a \neq b$, we find $B_{a,b}$ and consider the intersection $B_P = \bigcap_{a, b \in \mathcal{Y},\, a \neq b,\, Q^*_{a,b} \in R_P} B_{a,b}$. Then, for $\Lambda \in B_P$, (A11) holds for every $Q^* \in R_P$.
Now we consider $Q \in \mathcal{M} \setminus (R_P \cup \{P\})$, which is farther from P in Euclidean distance than the elements of $R_P$. For each $i \in \mathcal{Y}$, the distance between $p_i$ and $q_i$ is $K_i/\lfloor 1/b \rfloor$, where $K_i \in \mathbb{Z}_{\geq 0}$.
Since Q is a probability distribution, we have $p_i - K_i/\lfloor 1/b \rfloor > 0$ whenever $q_i < p_i$, and $K_i \geq 0$. Using the inequality $(1+x)^{K} \geq 1 + Kx$, we obtain
$$K_i \log \frac{p_i}{p_i + 1/\lfloor 1/b \rfloor} \leq \log \frac{p_i}{p_i + K_i/\lfloor 1/b \rfloor} \quad (A12)$$
and
$$K_i \log \frac{p_i}{p_i - 1/\lfloor 1/b \rfloor} \leq \log \frac{p_i}{p_i - K_i/\lfloor 1/b \rfloor}. \quad (A13)$$
Let $\mathcal{Y}^* = \{i \in \mathcal{Y} \mid q_i \neq p_i\}$, $\mathcal{Y}_0 = \{i \in \mathcal{Y}^* \mid q_i < p_i\}$, and $\mathcal{Y}_1 = \{i \in \mathcal{Y}^* \mid q_i > p_i\}$. For $i \in \mathcal{Y}_0$, let $q_i' = p_i - 1/\lfloor 1/b \rfloor$; for $i \in \mathcal{Y}_1$, let $q_i' = p_i + 1/\lfloor 1/b \rfloor$. Then, we have
$$\sum_{i \in \mathcal{Y}^*} \lambda_i \log \frac{p_i}{q_i} \geq \sum_{i \in \mathcal{Y}^*} \lambda_i K_i \log \frac{p_i}{q_i'}. \quad (A14)$$
Due to the constraints of the probability space, we have
$$\sum_{i \in \mathcal{Y}_0} K_i = \sum_{i \in \mathcal{Y}_1} K_i. \quad (A15)$$
Recall that for $\Lambda \in B_P$, (A11) holds for every $Q^* \in R_P$. Then, by the definition of $R_P$, for $\Lambda \in B_P$ we have
$$\lambda_a \log \frac{p_a}{q_a'} + \lambda_b \log \frac{p_b}{q_b'} > 0, \quad (A16)$$
where $a \in \mathcal{Y}_0$ and $b \in \mathcal{Y}_1$. We combine (A14)–(A16) to obtain that if $\Lambda \in B_P$, then
$$\sum_{i \in \mathcal{Y}^*} \lambda_i \log \frac{p_i}{q_i} > 0, \quad (A17)$$
that is,
$$\prod_{i=1}^{|\mathcal{Y}|} p_i^{\lambda_i} > \prod_{i=1}^{|\mathcal{Y}|} q_i^{\lambda_i}. \quad (A18)$$
For $Q \in \mathcal{M} \setminus (R_P \cup \{P\})$, define $B_Q$ as the set of Λ for which (A17) holds; we have shown $B_P \subseteq B_Q$. Denote by $A_Q = \Delta_{|\mathcal{Y}|-1} \setminus B_Q$ and $A_P = \Delta_{|\mathcal{Y}|-1} \setminus B_P$ the complements of $B_Q$ and $B_P$, respectively. Note that if (14) holds, then $\Lambda \in A_Q$. Since $A_Q \subseteq A_P$, we obtain that (15) holds for some $Q^* \in R_P$. This completes the proof in the case of $\mathcal{M}$ constructed by (10).
We then give another proof for the binary case, with $\mathcal{M}$ constructed by (8). For convenience, let $p_j < p_m$ for $j < m$, where $j, m \in [|\mathcal{M}|]$. Fix $P_{m-1} = (p_{m-1}, 1-p_{m-1})$ and $P_m = (p_m, 1-p_m)$. Let
$$f(p_j) = \log \frac{1-p_j}{1-p_m} \bigg/ \log \frac{p_m (1-p_j)}{p_j (1-p_m)}. \quad (A19)$$
For any $\lambda \in [0,1]$ and $j \in [m-2]$, the inequality
$$p_j^{\lambda} (1-p_j)^{1-\lambda} \geq p_m^{\lambda} (1-p_m)^{1-\lambda} \quad (A20)$$
holds exactly for $\lambda \in [0, f(p_j)]$. Note that $f(p_j)$ is a monotonically increasing function with respect to $p_j$. Then, we obtain that
$$p_{m-1}^{\lambda} (1-p_{m-1})^{1-\lambda} \geq p_m^{\lambda} (1-p_m)^{1-\lambda} \quad (A21)$$
holds whenever (A20) holds, since $[0, f(p_j)] \subseteq [0, f(p_{m-1})]$.
For $j \in \{m+1, m+2, \ldots, |\mathcal{M}|\}$, the same argument shows that (A20) holds for $\lambda \in [f(p_j), 1]$, and that
$$p_{m+1}^{\lambda} (1-p_{m+1})^{1-\lambda} \geq p_m^{\lambda} (1-p_m)^{1-\lambda} \quad (A22)$$
holds for $\lambda \in [f(p_{m+1}), 1]$. Then, (A20) implies (A22), since $[f(p_j), 1] \subseteq [f(p_{m+1}), 1]$. This completes the proof.

Appendix C. Proof of Theorem 2

Since the matrix is full-rank, the achievability space of marginal distributions is a $(|\mathcal{Y}|-1)$-dimensional probability space $\Delta^*_{|\mathcal{Y}|-1}$. We construct the set of marginal distributions $\mathcal{M}$ using divergence packing on $\Delta^*_{|\mathcal{Y}|-1}$; i.e., let $\mathcal{M}$ be constructed by (8) or (10) with any $b > 0$.
Assume the transmitted message is m, corresponding to the marginal distribution $P_m$. After the codeword $X^n$ passes through the DMC matrix W and the random permutation block $P_{Y^n|Z^n}$, we obtain $Y^n \overset{\text{i.i.d.}}{\sim} P_m$. The error event of the maximum likelihood decoder is
$$\bigcup_{j=1,\, j \neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\}. \quad (A23)$$
Then, for $Y^n$ drawn i.i.d. from $P_m$, the probability of error satisfies
$$\mathbb{P}[\text{error} \mid M = m] \leq \mathbb{P}\left[ \bigcup_{j=1,\, j\neq m}^{|\mathcal{M}|} \left\{ P_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] \quad (A24)$$
$$= \mathbb{P}\left[ \bigcup_{j=1}^{|R_{P_m}|} \left\{ Q_j^n(Y^n) \geq P_m^n(Y^n) \right\} \right] \quad (A25)$$
$$\leq \sum_{j=1}^{|R_{P_m}|} \mathbb{P}\left[ Q_j^n(Y^n) \geq P_m^n(Y^n) \right], \quad (A26)$$
where
  • (A24) follows because we regard the equality case $P_j^n(Y^n) = P_m^n(Y^n)$ as an error event, even though the ML decoder might return the correct message;
  • (A25) follows from Lemma 2;
  • (A26) follows from the union bound.
Let the message be uniform on $\mathcal{M}$. Then, averaging over all messages, we obtain
$$\epsilon \leq \frac{1}{|\mathcal{M}|} \sum_{m=1}^{|\mathcal{M}|} \sum_{j=1}^{|R_{P_m}|} \sum_{y^n \in \mathcal{Y}^n} P_m^n(y^n)\, \mathbb{1}\left\{ \sum_{i=1}^n d(m, j, y_i) \leq 0 \right\}. \quad (A27)$$
This completes the proof.

Appendix D. Properties of V ( P Q ) and T ( P Q )

This appendix is concerned with the behavior of V ( P Q ) and T ( P Q ) . We first prove Lemma 4.

Appendix D.1. Proof of Lemma 4

Since the DMC matrix is strictly positive, each entry of a marginal distribution P is uniformly bounded away from zero; i.e., there exists a $p_{\min} \in (0,1)$ such that $P(y) \geq p_{\min}$ for all $y \in \mathcal{Y}$. Similarly, there exists a $p_{\max} \in (0,1)$ such that $P(y) \leq p_{\max}$ for all $y \in \mathcal{Y}$. Clearly, we have $p_{\min} = \min_{y \in \mathcal{Y}} \{P(y) \mid P \in \mathcal{M}\}$ and $p_{\max} = \max_{y \in \mathcal{Y}} \{P(y) \mid P \in \Delta^*_{|\mathcal{Y}|-1}\}$. Without loss of generality, we assume $|\mathcal{M}| \geq 2$; that is, we have $r_0 \leq \frac{2\log e}{9}$.
We know that
$$P(y) = \frac{a}{\lfloor 1/b \rfloor} = \frac{a\sqrt{\frac{r_0}{2\log e}}}{1 - \delta'\sqrt{\frac{r_0}{2\log e}}}, \quad (A28)$$
where $\delta' = 1/b - \lfloor 1/b \rfloor \in [0,1)$ and $a \in \mathbb{Z}_{\geq 1}$.
We consider each entry P(y) of P separately. If $P(y) = Q(y)$, the corresponding term is obviously zero. If $P(y) > Q(y)$, we have $Q(y) = \frac{(a-1)\sqrt{\frac{r_0}{2\log e}}}{1 - \delta'\sqrt{\frac{r_0}{2\log e}}}$; here, since the distributions in $\mathcal{M}$ have no zero entries, $a \in \mathbb{Z}_{\geq 2}$. Since $a = P(y)\left( \sqrt{\frac{2\log e}{r_0}} - \delta' \right)$, we have
$$\frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right)} \leq \log \frac{P(y)}{Q(y)} \leq \frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right) - \sqrt{r_0/\log e}}. \quad (A29)$$
Consequently, we have
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \geq \frac{r_0\log e}{2 P(y)} \quad (A30)$$
and
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \leq \frac{r_0\log e}{2 (a-1)^2 P(y)/a^2} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2 \leq \frac{2 r_0\log e}{P(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2. \quad (A31)$$
If $P(y) < Q(y)$, we consider $Q(y) = \frac{(a+1)\sqrt{\frac{r_0}{2\log e}}}{1-\delta'\sqrt{\frac{r_0}{2\log e}}}$, where $a \in \mathbb{Z}_{\geq 1}$. Applying the same argument, we obtain
$$\frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right) + \sqrt{r_0/\log e}} \leq \left| \log \frac{P(y)}{Q(y)} \right| \leq \frac{\sqrt{r_0\log e}}{P(y)\left( \sqrt{2} - \delta'\sqrt{r_0/\log e} \right)}. \quad (A32)$$
Consequently, we have
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \geq \frac{r_0\log e}{2 (a+1)^2 P(y)/a^2} \geq \frac{r_0\log e}{8 P(y)} \quad (A33)$$
and
$$P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \leq \frac{r_0\log e}{2 P(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2. \quad (A34)$$
Then, we see that
$$V(P\|Q) = -D(P\|Q)^2 + \sum_{y} P(y)\left( \log \frac{P(y)}{Q(y)} \right)^2 \quad (A35)$$
$$\geq r_0\log e\left( \frac{5}{8 p_{\max}} - \frac{r_0}{\log e} \right), \quad (A36)$$
where
  • (A35) simply expands the variance;
  • (A36) holds since the definition of $R_P$ ensures that only two terms of the sum are nonzero, and these can be bounded by (A30) and (A33), respectively.
Finally, we complete the proof of the lower bound by noting that $r_0 \leq \frac{2\log e}{9}$.
Note that
$$\left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^2 \leq \frac{1}{(1-p_{\max})^2}. \quad (A37)$$
Using (A31) and (A34), we obtain the upper bound
$$V(P\|Q) \leq \frac{5\log e}{2 p_{\min} (1-p_{\max})^2}\, r_0, \quad (A38)$$
which yields (36).
We now turn to bounding $T(P\|Q)$. For $P(y) > Q(y)$, we have
$$P(y)\left| \log \frac{P(y)}{Q(y)} \right|^3 \leq 4\sqrt{2}\,\frac{(r_0\log e)^{3/2}}{P^2(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^3. \quad (A39)$$
For $P(y) < Q(y)$, we have
$$P(y)\left| \log \frac{P(y)}{Q(y)} \right|^3 \leq \frac{1}{2\sqrt{2}}\,\frac{(r_0\log e)^{3/2}}{P^2(y)} \left( \frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}} \right)^3. \quad (A40)$$
Note that $D(P\|Q)$ can be bounded as follows:
$$D(P\|Q) \leq \frac{3}{2}\sqrt{2 r_0\log e}\;\frac{1}{1-\delta'\sqrt{\frac{r_0}{2\log e}}}. \quad (A41)$$
Using the inequality $|a-b|^3 \leq 4\left( |a|^3 + |b|^3 \right)$, we obtain
$$T(P\|Q) \leq \frac{36\sqrt{2}\,(\log e)^{3/2}}{p_{\min}^2 (1-p_{\max})^3}\, r_0^{3/2}, \quad (A42)$$
which establishes (38).

Appendix D.2. Proof of Lemma 5

We now prove Lemma 5 for $\mathcal{M}$ constructed by (8). We first note that
$$P(y) = \frac{\xi a}{\lfloor 1/b \rfloor} + \delta = \left( a - \frac{\delta'\delta}{\xi} + F_0 \right) \frac{\sqrt{\frac{r_0}{2\log e}}}{1 - \frac{\delta'}{\xi}\sqrt{\frac{r_0}{2\log e}}}, \quad (A43)$$
where $\delta' \in [0,1)$, $F_0 = \delta\sqrt{\frac{2\log e}{r_0}}$, and $a \in \mathbb{Z}_{\geq 1}$. Note that $F_0 - \frac{\delta'\delta}{\xi} \geq 0$ holds for small $r_0$. Repeating the proof of Lemma 4 with a replaced by $a - \frac{\delta'\delta}{\xi} + F_0$ and $\delta'$ replaced by $\frac{\delta'}{\xi}$, we obtain that there exists a packing radius $r_1$ such that for all $r_0 \leq r_1$, we have
$$F_0 r_0 \leq V(P\|Q) \leq F_1 r_0 \quad (A44)$$
and
$$T(P\|Q) \leq F_2 r_0^{3/2}, \quad (A45)$$
where $F_0$, $F_1$, and $F_2$ are positive and finite.

Appendix E. Proof of Theorem 4

Let $\mathcal{M}_r$, constructed by (10) with $b = \sqrt{\frac{r}{2\log e}}$, be the set of packing centers on $\Delta^*_{|\mathcal{Y}|-1}$, where $r > 0$ is the packing radius, to be specified later. For the transmitted message m, passing the codeword through the DMC matrix W and the random permutation block $P_{Y^n|Z^n}$ induces the marginal distribution $P_m$. Note that the decoding metric is a sum of independent, identically distributed variables:
$$\sum_{k=1}^n d(m, j, Y_k) = \sum_{k=1}^n \log \frac{P_m(Y_k)}{Q_j(Y_k)}. \quad (A46)$$
Each summand has mean $D(P_m\|Q_j)$, variance $V(P_m\|Q_j)$, and third absolute central moment $T(P_m\|Q_j)$. Denote
$$B_m = \max_{j \in [|R_{P_m}|]} \frac{6\, T(P_m\|Q_j)}{V(P_m\|Q_j)^{3/2}} \quad (A47)$$
and
$$P_e[m] = \sum_{j=1}^{|R_{P_m}|} \mathbb{P}\left[ \sum_{k=1}^n d(m, j, Y_k) \leq 0 \right]. \quad (A48)$$
According to Theorem 2, there exists a code with average error probability ϵ such that
$$\epsilon \leq \frac{1}{|\mathcal{M}_r|} \sum_{m=1}^{|\mathcal{M}_r|} P_e[m]. \quad (A49)$$
Denote $m^* = \operatorname{arg\,max}_{m \in \mathcal{M}_r} P_e[m]$. We continue (A49) as follows:
$$\epsilon \leq P_e[m^*] \quad (A50)$$
$$\leq |R_{P_{m^*}}| \frac{B_{m^*}}{\sqrt{n}} + \max_{j \in [|R_{P_{m^*}}|]} |R_{P_{m^*}}|\, \Phi\left( -\frac{\sqrt{n}\, D(P_{m^*}\|Q_j)}{\sqrt{V(P_{m^*}\|Q_j)}} \right) \quad (A51)$$
$$\leq |R_{P_{m^*}}| \frac{B_{m^*}}{\sqrt{n}} + |R_{P_{m^*}}|\, \Phi\left( -\sqrt{\frac{n r_0}{F_0}} \right), \quad (A52)$$
where
  • (A51) follows from Lemma 3;
  • (A52) holds for a suitable constant $F_0 > 0$ by Lemma 4.
Equating the RHS of (A52) to ϵ, and noting that $\epsilon - |R_{P_{m^*}}| B_{m^*}/\sqrt{n} > 0$ for n sufficiently large, we solve for
$$r_0 = F_0 \left( \Phi^{-1}\left( \frac{\epsilon}{|R_{P_{m^*}}|} - \frac{F_1}{\sqrt{n}} \right) \right)^2 \frac{1}{n}, \quad (A53)$$
where $|R_{P_{m^*}}| \leq 2\binom{|\mathcal{Y}|}{2}$ by its definition, and $F_0 > 0$ and $F_1 > 0$ are suitable constants. This can be done since, for suitable constants $F_2 > 0$ and $F_3 > 0$, we have $T(P_m\|Q_j) \leq F_2 r_0^{3/2}$ and $V(P_m\|Q_j) \geq F_3 r_0$ by Lemma 4; consequently, for large n, $B_m$ can be upper bounded by a suitable constant $F_1$.
The above arguments indicate that there exists a code with average error probability ϵ whose code size $|\mathcal{M}_r|$ is achievable. Let $\ell = |\mathcal{Y}| - 1$ and let $G = 2\binom{|\mathcal{Y}|}{2}$. For n sufficiently large, we obtain that for a suitable $F_5 > 0$,
$$\log |\mathcal{M}_r| = \log N^*(r, |\mathcal{Y}|) \quad (A54)$$
$$\geq \log\left( V(W) \left( \frac{\log e}{8} \right)^{\ell/2} \left( \frac{\sqrt{n/F_4}}{-\Phi^{-1}(\epsilon/G)} \right)^{\ell} \right) \quad (A55)$$
$$\geq \ell \log \frac{\sqrt{n}}{-\Phi^{-1}(\epsilon/G)} + \log V(W) + \log F_5, \quad (A56)$$
where
  • (A54) follows because $\mathcal{M}_r$ constructed by (10) is a set of packing centers with packing number $N^*(r, |\mathcal{Y}|)$;
  • (A55) holds for a suitable $F_4 > 0$ by Theorem 1 and Taylor's formula for $\Phi^{-1}(\cdot)$.
This completes the proof.

References

  1. Makur, A. Coding Theorems for Noisy Permutation Channels. IEEE Trans. Inf. Theory 2020, 66, 6723–6748. [Google Scholar] [CrossRef]
  2. Makur, A. Bounds on Permutation Channel Capacity. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; pp. 762–766. [Google Scholar]
  3. Tang, J.; Polyanskiy, Y. Capacity of Noisy Permutation Channels. IEEE Trans. Inf. Theory 2023, 69, 4145–4162. [Google Scholar] [CrossRef]
  4. Feng, L.; Wang, B.; Lv, G.; Li, X.; Wang, L.; Jin, Y. New Upper Bounds for Noisy Permutation Channels. IEEE Trans. Commun. 2025, 73, 7478–7492. [Google Scholar] [CrossRef]
  5. Strassen, V. Asymptotic Estimates in Shannon’s Information Theory. In Proceedings of the Transactions of the Third Prague Conference on Information Theory, Prague, Czech Republic, 5–13 June 1962; pp. 689–723. [Google Scholar]
  6. Hayashi, M. Information Spectrum Approach to Second-Order Coding Rate in Channel Coding. IEEE Trans. Inf. Theory 2009, 55, 4947–4966. [Google Scholar] [CrossRef]
  7. Polyanskiy, Y.; Poor, H.V.; Verdu, S. Channel Coding Rate in the Finite Blocklength Regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359. [Google Scholar] [CrossRef]
  8. Polyanskiy, Y.; Wu, Y. Information Theory: From Coding to Learning; Cambridge University Press: Cambridge, UK, 2023. [Google Scholar]
  9. Polyanskiy, Y. Channel Coding: Non-Asymptotic Fundamental Limits. Ph.D. Thesis, Princeton University, Princeton, NJ, USA, 2010. [Google Scholar]
  10. Kolmogorov, A.N. Selected Works of A. N. Kolmogorov. Mathematics and Its Applications; Springer: Dordrecht, The Netherlands, 1993. [Google Scholar]
  11. Yang, Y.; Barron, A. Information-theoretic determination of minimax rates of convergence. Ann. Stat. 1999, 27, 1564–1599. [Google Scholar]
  12. Walsh, J.M.; Weber, S.; Maina, C.w. Optimal rate delay tradeoffs for multipath routed and network coded networks. In Proceedings of the 2008 IEEE International Symposium on Information Theory (ISIT), Toronto, ON, Canada, 7–12 June 2008. [Google Scholar]
  13. Walsh, J.M.; Weber, S.; Maina, C.w. Optimal Rate–Delay Tradeoffs and Delay Mitigating Codes for Multipath Routed and Network Coded Networks. IEEE Trans. Inf. Theory 2009, 55, 5491–5510. [Google Scholar]
  14. Yazdi, S.M.H.T.; Kiah, H.M.; Garcia-Ruiz, E.; Ma, J.; Zhao, H.; Milenkovic, O. DNA-Based Storage: Trends and Methods. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2015, 1, 230–248. [Google Scholar] [CrossRef]
  15. Erlich, Y.; Zielinski, D. DNA Fountain enables a robust and efficient storage architecture. Science 2017, 355, 950–954. [Google Scholar] [CrossRef] [PubMed]
  16. Heckel, R.; Shomorony, I.; Ramchandran, K.; Tse, D.N.C. Fundamental limits of DNA storage systems. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 3130–3134. [Google Scholar]
  17. Kovačević, M.; Tan, V.Y. Codes in the Space of Multisets—Coding for Permutation Channels with Impairments. IEEE Trans. Inf. Theory 2018, 64, 5156–5169. [Google Scholar] [CrossRef]
  18. Laver, T.; Harrison, J.; Moore, K.; Farbos, A.; Paszkiewicz, K.; Studholme, D. Assessing the performance of the oxford nanopore technologies minion. J. Mol. Biol. 2015, 3, 1–8. [Google Scholar] [CrossRef] [PubMed]
  19. Kovačević, M.; Tan, V.Y. Asymptotically optimal codes correcting fixed-length duplication errors in DNA storage systems. IEEE Commun. Lett. 2018, 22, 2194–2197. [Google Scholar] [CrossRef]
  20. Kiah, M.H.; Puleo, G.; Milenkovic, O. Codes for DNA sequence profiles. IEEE Trans. Inf. Theory 2016, 62, 3125–3146. [Google Scholar] [CrossRef]
  21. Evans, W.; Kenyon, C.; Peres, Y.; Schulman, L.J. Broadcasting on trees and the Ising model. Ann. Appl. Probab. 2000, 10, 410–433. [Google Scholar] [CrossRef]
  22. Feller, W. An Introduction to Probability Theory and Its Applications; Wiley: New York, NY, USA, 1971. [Google Scholar]
  23. Stein, P. A Note on the Volume of a Simplex. Am. Math. Mon. 1966, 73, 299. [Google Scholar] [CrossRef]
  24. Tang, J. Divergence Covering. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2021. [Google Scholar]
Figure 1. Illustration of a communication model with a DMC followed by a random permutation.
Figure 2. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.11$ and average block error rate $\epsilon = 10^{-3}$.
Figure 3. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.11$ and average block error rate $\epsilon = 10^{-6}$.
Figure 4. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.22$ and average block error rate $\epsilon = 10^{-3}$.
Figure 5. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.22$ and average block error rate $\epsilon = 10^{-6}$.
Figure 6. Rate–blocklength tradeoff for the BSC with crossover probability $\delta = 0.25$ and average block error rate $\epsilon = 10^{-3}$: example of a DNA storage system.

