1. Introduction
In [1], C. Shannon established the foundations of information theory by characterizing the key mathematical properties of communication channels. For a transmission rate R that is less than the channel capacity C, the probability of erroneous decoding with respect to an optimal code decreases exponentially as the code length increases. Shannon introduced the channel reliability function E(R) as the exponent governing this exponential decrease in relation to the transmission rate R.
A major goal in information theory is to find a closed-form expression for the channel reliability function. This expression should be computable and fully determined by the parameters of the communication task. Naturally, we must define what constitutes a closed-form expression. In [2], Chow, and in [3], Borwein and Crandall discuss different approaches to defining closed-form expressions. All of these representations satisfy the requirement that the corresponding functions can be computed algorithmically using a digital computer, to any desired precision for inputs within their domain of definition.
Shannon’s characterization of the capacity for message transmission via the discrete memoryless channel (DMC) in [1], Ahlswede’s characterization of the capacity for message transmission via the multiple access channel in [4], and Ahlswede and Dueck’s characterization of the identification capacity for DMCs in [5] are all significant examples of closed-form solutions using elementary functions. These provide important instances of the computability of the corresponding performance functions, as defined in the previous context. The precise definition of computability, as outlined by Turing, is presented in Section 2.
Lovász’s characterization of the zero-error capacity of the pentagon also represents a closed-form number according to Chow’s definition in [2], which can be computed algorithmically, an outcome that is desirable. However, the characterization of the zero-error capacity of the heptagon remains an open problem. Moreover, it is still unclear whether the zero-error capacities of DMCs take computable values for computable channels. Additionally, the algorithmic computability of the broadcast capacity region is still uncertain.
In the age of artificial intelligence, it is increasingly important to determine whether a digital computer can solve a given problem or compute a given function. Since every function that can be computed by a digital computer can also be computed by a Turing machine (as will be discussed in more detail below), this question is reduced to asking whether a function is computable. It is therefore crucial to distinguish between determining how to compute the zero-error capacity and whether it is computable at all. In this work, we focus on the latter: the computability of the zero-error capacity.
The Lovász θ-function for graphs was analyzed in [6] from three distinct research perspectives related to various graph invariants. This investigation resulted in new insights into the Shannon capacity of graphs, observations on cospectral and nonisomorphic graphs, and bounds on graph invariants, while also serving as a tutorial in zero-error information theory and algebraic graph theory. Further observations on the Lovász θ-function are provided by the author in [7].
In this paper, we provide a negative answer to the question of whether the channel reliability function and several related bounds are algorithmically computable by Turing machines.
Significant research has been conducted on the channel reliability function, but many aspects of its behavior remain unresolved (see the surveys [8,9]). In fact, a complete characterization of the channel reliability function is still unknown for binary-input binary-output channels. As a result, considerable efforts have been made to derive computable lower and upper bounds for the function (see [10,11,12]).
Determining the behavior of the channel reliability function across the entire interval $(0, C)$ is a challenging problem. Various approaches have attempted to compute the reliability function algorithmically by constructing sequences of upper and lower bounds. The first significant contribution in this direction was made by Shannon, Gallager, and Berlekamp in [13].
A fundamental question that arises is whether the reliability function can be computed in this manner. To investigate this, we employ the framework of Turing computability [14]. In general, a function is considered Turing computable if there exists an algorithm capable of computing it. The Turing machine serves as the most fundamental and powerful model of computation, underpinning theoretical computer science. Unlike physical computers, which have practical constraints, a Turing machine is an abstract mathematical construct that can be rigorously analyzed using formal mathematical methods.
It is important to note that the Turing machine represents the ultimate performance limit of current digital computers, including supercomputers. A Turing machine models an algorithm or a program, where computation consists of step-by-step manipulation of symbols or characters that are read from and written to a memory tape according to a set of rules. These symbols can be interpreted in various ways, including as numbers. To perform computations on abstract sets, the elements of the set must be encoded as strings of symbols on the tape. This approach allows Turing computability to be defined for real and complex numbers.
The use of digital computers to compute approximations of channel capacities or channel reliability functions has been a prominent topic in information theory. The computation of channel capacity for discrete memoryless channels (DMCs) is a convex optimization problem, and in 1972, an algorithm for approximating the capacity of a DMC on digital computers was independently published in [15] and [16].
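The 1972 algorithm referenced here is commonly known as the Blahut–Arimoto iteration. The following minimal sketch (Python with NumPy; the function name, the fixed iteration count, and the assumption that every output symbol is reachable are choices of this sketch, not of the original papers) illustrates how the capacity of a DMC given as a row-stochastic matrix can be approximated numerically:

```python
import numpy as np

def blahut_arimoto(W, iters=200):
    """Approximate the capacity (in bits) of a DMC given by the
    row-stochastic matrix W[x, y] = W(y|x).  A fixed iteration count
    stands in for a provable stopping rule."""
    n_x = W.shape[0]
    p = np.full(n_x, 1.0 / n_x)               # start from the uniform input
    for _ in range(iters):
        q = p @ W                             # induced output distribution
        # relative-entropy terms D(W(.|x) || q), with 0 * log(0/q) = 0
        ratio = np.where(W > 0, W / np.maximum(q, 1e-300), 1.0)
        d = np.sum(W * np.log2(ratio), axis=1)
        p = p * np.exp2(d - d.max())          # stabilized multiplicative update
        p /= p.sum()
    q = p @ W
    ratio = np.where(W > 0, W / np.maximum(q, 1e-300), 1.0)
    return float(np.sum(p * np.sum(W * np.log2(ratio), axis=1)))

# Binary symmetric channel with crossover 0.1: C = 1 - h(0.1) ≈ 0.5310 bits
eps = 0.1
bsc = np.array([[1 - eps, eps], [eps, 1 - eps]])
```

For the binary symmetric channel the uniform input is already optimal, so the iteration converges immediately; for general channels the update converges to the capacity from below.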
Even for binary symmetric channels with rational crossover probabilities (excluding degenerate cases), the channel capacity is a transcendental number. As a result, despite the relative simplicity of these channels, their capacity can only be approximated with finite precision by digital computers. In contrast to the problem of computing channel capacity, determining the behavior of the channel reliability function over the entire interval $(0, C)$ is a significantly more complex task. A common approach to this challenge involves considering sequences of upper and lower bounds for $E(R)$ (see [13]).
In general, the channel reliability function is a well-studied topic in information theory. Originally introduced and analyzed for discrete memoryless channels (DMCs), the concept has since been significantly extended to various other scenarios and channel models. In [17], the reliability function of a DMC was studied at rates above capacity. Subsequent refinements and theoretical bounds were proposed, such as the Poor–Verdú upper bound addressed in [18]. Extensions beyond DMCs include continuous channels and channels with feedback or secrecy constraints. For instance, upper bounds for Gaussian channels were developed in [19], while the role of feedback in Poisson and Gaussian channels was explored in [20,21]. The impact of signal constraints was analyzed in [22], and improved Gaussian channel bounds were proposed in [23]. Secrecy considerations and cost constraints were incorporated into the analysis of the reliability and secrecy functions in [24]. The reliability function in the presence of side information, as in the Gelfand–Pinsker channel, was considered in [25]. More recently, a new upper bound for DMCs was given in [26], and noisy feedback for binary symmetric channels was studied in [27]. These developments culminated in the analysis of reliability functions in quantum communication settings. Foundational work includes [28,29], and recent advancements include [30].
In this work, we explore whether it is possible to compute the channel reliability function in this manner using a mathematically rigorous formalization of computability. Specifically, our analysis is based on the theory of Turing machines and recursive functions.
In many cases, there is no direct characterization of the behavior of a general function over an abstract set in terms of an algorithm on a Turing machine. Consequently, a common strategy is to approximate the function successively using a sequence of computable upper and lower bounds, for which an algorithm is available. One can then ask the weaker question of whether it is possible to approximate the function in a computable manner. This requires computable sequences of computable upper and lower bounds. This approach is also necessary for the reliability function, and we carry out this analysis here. Unfortunately, our results show that the channel reliability function is not a Turing computable performance function when the channel is considered as input.
We also examine several other closely related functions, including the $R_\infty$ function, the sphere packing bound function, the expurgation bound function, and the zero-error feedback capacity, all of which are closely tied to the reliability function. We treat all of these as functions of the channel.
As envisioned, the sixth generation (6G) of mobile networks will introduce a wide range of new features [31]. These innovations bring new challenges to the design of wireless communication systems. Specifically, the Tactile Internet will enable not only the control of data but also the manipulation of physical and virtual objects [31]. With such applications, there arises an increased need to ensure the trustworthiness of the system and its services [32,33].
6G will impose more diverse and demanding quality-of-service (QoS) requirements on network resilience, reliability, service availability, and delay [31]. The channel reliability function plays a vital role in the reliability and delay performance analysis of communication systems. It is therefore of interest to explore whether the reliability and delay performance of communication systems can be verified automatically on digital hardware [33]. Analyzing the channel reliability function with respect to Turing computability becomes crucial in this context. The question of Turing computability for performance functions is a central issue in information theory, as closed-form expressions are only known for a few performance functions. It is therefore important to compute corresponding performance functions on available computers with provable performance, ensuring the strict requirements for future communication systems [31,33].
The structure of this paper is as follows. In Section 2, we begin by presenting the basic definitions and known results that will be used throughout the paper. Section 3 focuses on the $R_\infty$ function. We examine the decidability of sets connected with the $R_\infty$ function and demonstrate that only an approximation from below is possible. This has implications for the sphere packing bound, and we show that it is not a Turing computable performance function.
In Section 4, we analyze the reliability function and prove that it is also not Turing computable. The same result holds for the expurgation bound. In Section 5, we investigate the zero-error feedback capacity, which is closely related to the $R_\infty$ function. We first address a question posed by Alon and Lubetzky in [34] regarding the zero-error capacity with feedback, specifically for the case without feedback (which was examined in [35]). We then show that the zero-error feedback capacity is not Banach–Mazur computable and cannot be approximated by computable increasing sequences of computable functions. Additionally, we characterize the superadditivity of the zero-error feedback capacity and demonstrate that the $R_\infty$ function is additive.
In Section 6, we analyze the behavior of the expurgation bound rates. Finally, we conclude by summarizing the implications of our results for the channel reliability function. Our findings indicate that, in general, there cannot be a simple recursive closed-form expression for the channel reliability function over a very precise interval.
Some of the results in this paper were presented at the IEEE International Symposium on Information Theory in Espoo, as noted in [36].
2. Definitions and Basic Results
2.1. Basic Concepts of Computability Theory
In this section, we present the basic definitions and results from computability theory that are necessary for this work. We begin with the fundamental definitions of computability, starting with the concept of a Turing machine [14].
A Turing machine serves as a mathematical model for what we intuitively understand as a computation machine. In this sense, it provides an abstract idealization of modern-day computers. Any algorithm that can be executed by a real-world computer can, in principle, be simulated by a Turing machine, and vice versa. However, unlike real-world computers, Turing machines are not constrained by limitations such as energy consumption, computation time, or memory size. Furthermore, all computation steps on a Turing machine are assumed to be executed flawlessly, with no possibility of error.
Recursive functions, more specifically known as μ-recursive functions, form a special subset of the set of partial functions $f : \mathbb{N}^k \hookrightarrow \mathbb{N}$, where the symbol “↪” denotes a partial mapping. The set of recursive functions provides an alternative characterization of the notion of computability. Turing machines and recursive functions are equivalent in the following sense: a function is computable by a Turing machine if and only if it is a partial recursive function.
Next, we introduce several key definitions from computable analysis [37,38,39], which we will apply in the subsequent sections.
Definition 1. A sequence of rational numbers $(r_n)_{n \in \mathbb{N}}$ is called a computable sequence if there exist recursive functions $a, b, s : \mathbb{N} \to \mathbb{N}$ with $b(n) \neq 0$ for all $n \in \mathbb{N}$ and
$$r_n = (-1)^{s(n)} \frac{a(n)}{b(n)}, \qquad n \in \mathbb{N}.$$
Definition 2. We say that a computable sequence $(r_n)_{n \in \mathbb{N}}$ of rational numbers converges effectively, i.e., computably, to a number x, if a recursive function $e : \mathbb{N} \to \mathbb{N}$ exists such that for all $N \in \mathbb{N}$ and all $n \in \mathbb{N}$ with $n \geq e(N)$, $|x - r_n| \leq 2^{-N}$ applies.
We can now introduce computable numbers.
Definition 3. A real number x is said to be computable if there exists a computable sequence of rational numbers $(r_n)_{n \in \mathbb{N}}$ such that $|x - r_n| \leq 2^{-n}$ for all $n \in \mathbb{N}$. We denote the set of computable real numbers by $\mathbb{R}_c$.
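Definitions 1 through 3 can be made concrete with a small sketch (Python; the choice of $\sqrt{2}$ as the example and the function name are illustrative). The integer square root gives an exact rational term $r_n$ with error below $2^{-n}$, so the identity function $e(N) = N$ serves as the recursive modulus of effective convergence required by Definition 2:

```python
from fractions import Fraction
from math import isqrt

def sqrt2_approx(n):
    """n-th term of a computable sequence of rationals for sqrt(2):
    r_n = floor(sqrt(2) * 2**n) / 2**n, so 0 <= sqrt(2) - r_n < 2**-n.
    Each term is produced by exact integer arithmetic, as a recursive
    function in the sense of Definition 1."""
    scale = 2 ** n
    return Fraction(isqrt(2 * scale * scale), scale)
```

Since $2 - r_n^2 = (\sqrt{2} - r_n)(\sqrt{2} + r_n) < 3 \cdot 2^{-n}$, the quality of the approximation can be certified without ever computing $\sqrt{2}$ itself.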
Next, we need suitable subsets of the natural numbers.
Definition 4. A set $A \subseteq \mathbb{N}$ is called recursive if there exists a recursive function f, such that $f(n) = 1$ if $n \in A$ and $f(n) = 0$ if $n \in A^c$, where $A^c$ stands for the complement set of A.
Definition 5. A set $A \subseteq \mathbb{N}$ is recursively enumerable if there exists a recursive function whose domain is exactly A.
Remark 1. For the definition of recursive and partial recursive functions, see [37]. Recursive functions are the building blocks for developing the framework of computability theory on rational numbers, on real numbers, and on related functions defined over these number fields. This theory captures exactly what can be achieved in theory with digital computers in these number fields. We next introduce the concept of computable performance functions on the basis of computability theory. It is important to note that computability theory formalizes exactly what is computable with perfect digital computers.
2.2. Basic Concepts of Information Theory
To define the reliability function and its related functions, we first need the definition of a discrete memoryless channel. In the theory of transmission, the receiver must be in a position to successfully decode all the messages transmitted by the sender.
Let $\mathcal{X}$ be a finite alphabet, and denote the set of all probability distributions on $\mathcal{X}$ by $\mathcal{P}(\mathcal{X})$. We define the set of computable probability distributions, $\mathcal{P}_c(\mathcal{X})$, as the subset of $\mathcal{P}(\mathcal{X})$ consisting of all distributions P for which $P(x) \in \mathbb{R}_c$ holds for all $x \in \mathcal{X}$.
Furthermore, for finite alphabets $\mathcal{X}$ and $\mathcal{Y}$, let $\mathcal{CH}(\mathcal{X}; \mathcal{Y})$ denote the set of all conditional probability distributions (or channels) $W : \mathcal{X} \to \mathcal{P}(\mathcal{Y})$. We define $\mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$ as the set of computable conditional probability distributions, i.e., those for which $W(\cdot|x) \in \mathcal{P}_c(\mathcal{Y})$ holds for every $x \in \mathcal{X}$.
Let $M \subseteq \mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$. We call M semi-decidable if and only if there is a Turing machine $TM_M$ that either stops or computes forever, depending on whether $W \in M$ is true. This means $TM_M$ accepts exactly the elements of M, and for an input $W \notin M$, it computes forever.
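The one-sided behavior of such a machine can be illustrated for a simple membership question about computable reals (a minimal sketch in Python; the function names and the rational example are illustrative, not from the text). The search halts with a certificate exactly when the answer is "yes" and runs forever otherwise:

```python
from fractions import Fraction

def semi_decide_greater(approx, lam):
    """Semi-decide 'x > lam' for a computable real x, given approx(n),
    a Fraction with |x - approx(n)| <= 2**-n, and a rational lam.
    Halts (returns True) if and only if x > lam; if x <= lam, the loop
    below searches forever, mirroring the one-sided behavior of the
    Turing machine TM_M in the definition above."""
    n = 0
    while True:
        if approx(n) - Fraction(1, 2 ** n) > lam:
            return True  # certified: x >= approx(n) - 2**-n > lam
        n += 1

# Example: x = 7/10 with the trivial (constant) approximation sequence.
x_approx = lambda n: Fraction(7, 10)
```

Note that only positive instances can be observed to terminate; there is no general way to turn this procedure into a full decision procedure.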
Definition 6. A discrete memoryless channel (DMC) is a triple $(\mathcal{X}, \mathcal{Y}, W)$, where $\mathcal{X}$ is the finite input alphabet, $\mathcal{Y}$ is the finite output alphabet, and $W(y|x) \geq 0$ for $x \in \mathcal{X}$, $y \in \mathcal{Y}$, with $\sum_{y \in \mathcal{Y}} W(y|x) = 1$ for all $x \in \mathcal{X}$. The probability that a sequence $y^n = (y_1, \dots, y_n)$ is received if $x^n = (x_1, \dots, x_n)$ was sent is defined by
$$W^n(y^n|x^n) = \prod_{i=1}^{n} W(y_i|x_i).$$
Definition 7. A (deterministic) block code with rate R and block length n consists of
A message set $\mathcal{M}$ with $|\mathcal{M}| = \lceil 2^{nR} \rceil$;
An encoding function $f : \mathcal{M} \to \mathcal{X}^n$;
A decoding function $\phi : \mathcal{Y}^n \to \mathcal{M}$.
We call such a code an $(n, \mathcal{M})$-code.
Definition 8. Let $(\mathcal{X}, \mathcal{Y}, W)$ be a DMC. Using $\mathcal{C}(n, \mathcal{M})$, we denote a block code with the block length n and message set $\mathcal{M}$.
- 1.
The individual message probability of error is defined by the conditional probability of error, given that message $m \in \mathcal{M}$ is transmitted:
$$P_e(m) = \sum_{y^n : \phi(y^n) \neq m} W^n(y^n | f(m)).$$
- 2.
We define the average probability of error by $\frac{1}{|\mathcal{M}|} \sum_{m \in \mathcal{M}} P_e(m)$. $P_e(n, \mathcal{M})$ denotes the minimum such error probability over all block codes of block length n and with message set $\mathcal{M}$.
- 3.
We define the maximal probability of error by $\max_{m \in \mathcal{M}} P_e(m)$. $P_{e,\max}(n, \mathcal{M})$ denotes the minimum such error probability over all block codes of block length n and with message set $\mathcal{M}$.
- 4.
The Shannon capacity for a channel is defined by $C(W) = \max_{P \in \mathcal{P}(\mathcal{X})} I(P; W)$.
- 5.
The zero-error capacity for a channel is defined by $C_0(W) = \lim_{n \to \infty} \frac{1}{n} \log M(n, 0)$, where $M(n, 0)$ denotes the maximal size of a message set for which a block code of length n with maximal error probability zero exists.
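For tiny codes, the quantities in Definitions 6 and 8 can be evaluated exactly by brute force. The sketch below (Python with NumPy; the repetition-code example and all names are illustrative) computes $W^n$ as the product in Definition 6 and the per-message, average, and maximal error probabilities of Definition 8:

```python
import itertools
import numpy as np

def Wn(W, y_seq, x_seq):
    """Memoryless extension: W^n(y^n | x^n) = prod_i W(y_i | x_i)."""
    p = 1.0
    for x, y in zip(x_seq, y_seq):
        p *= W[x, y]
    return p

def error_probabilities(W, enc, dec, n):
    """Per-message, average, and maximal error probabilities of a block
    code given by a codeword list enc and a decoder dec, summing
    W^n over all output sequences that decode incorrectly."""
    n_y = W.shape[1]
    per_message = []
    for m, x_seq in enumerate(enc):
        p_err = sum(Wn(W, y_seq, x_seq)
                    for y_seq in itertools.product(range(n_y), repeat=n)
                    if dec(y_seq) != m)
        per_message.append(p_err)
    return per_message, sum(per_message) / len(per_message), max(per_message)

# Example: length-3 repetition code over a BSC(0.1), majority decoding.
# The error event is "at least two of three symbols flipped":
# 3 * 0.1^2 * 0.9 + 0.1^3 = 0.028.
eps = 0.1
W = np.array([[1 - eps, eps], [eps, 1 - eps]])
enc = [(0, 0, 0), (1, 1, 1)]
dec = lambda y: int(sum(y) >= 2)
per_msg, avg_err, max_err = error_probabilities(W, enc, dec, 3)
```

The exhaustive sum over $|\mathcal{Y}|^n$ output sequences is only feasible for very small n, which is precisely why the asymptotic exponent E(R) is the object of interest.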
Remark 2. For R with $R < C(W)$, there exists a sequence of block codes of rate R such that the average probability of error tends to zero as $n \to \infty$.
We also define the discrete memoryless channel with noiseless feedback (DMCF). By this, we mean that, in addition to the DMC, there exists a return channel that sends the element of $\mathcal{Y}$ actually received back from the receiving point to the transmitting point. It is assumed that this information is received at the transmitting point before the next letter is sent and can therefore be used to choose the next letter to be sent. We assume that this feedback is noiseless. We denote the feedback capacity of a channel W by $C_F(W)$ and the zero-error feedback capacity by $C_{0F}(W)$. Shannon proved in [40] that $C_F(W) = C(W)$. This is, in general, not true for the zero-error capacity. We see that the zero-error (feedback) capacity is related to the reliability function, which we analyze in this paper. It is defined as follows.
Definition 9. The channel reliability function (error exponent) is defined by
$$E(R) = \limsup_{n \to \infty} \frac{1}{n} \log \frac{1}{P_e(n, \lceil 2^{nR} \rceil)}. \qquad (1)$$
Remark 3. We make use of the common convention that $\log \frac{1}{0} = +\infty$.
Remark 4. We need the lim sup in (1), because it is not known whether the limit value, i.e., the limit on the right-hand side of (1), exists. The first simple observation is that for $R > C(W)$, we have $E(R) = 0$, and if $R < C_0(W)$ for $C_0(W) > 0$, we have $E(R) = +\infty$. One well-known upper bound is the sphere packing bound, which can be defined as follows (see [10]).
Definition 10. Let $\mathcal{X}, \mathcal{Y}$ be finite alphabets, and $W$ be a DMC. Then, for all $R > 0$, we define the sphere packing bound function:
$$E_{sp}(R) = \sup_{\rho \geq 0} \Big[ \max_{P \in \mathcal{P}(\mathcal{X})} E_0(\rho, P) - \rho R \Big], \quad \text{where} \quad E_0(\rho, P) = -\log \sum_{y \in \mathcal{Y}} \Big( \sum_{x \in \mathcal{X}} P(x) W(y|x)^{\frac{1}{1+\rho}} \Big)^{1+\rho}.$$
Theorem 1 (Fano 1961, Shannon, Gallager, Berlekamp 1967).
For any DMC W and for all $R > 0$, it holds that $E(R) \leq E_{sp}(R)$.
The sphere packing upper bound is an important upper bound. The following two lower bounds of the reliability function are also very important. In [41], the random coding bound was defined as follows:
Definition 11. Let $\mathcal{X}, \mathcal{Y}$ be finite alphabets, and $W$ be a DMC. Then, for all $R > 0$, we define the random coding bound function as
$$E_r(R) = \max_{0 \leq \rho \leq 1} \Big[ \max_{P \in \mathcal{P}(\mathcal{X})} E_0(\rho, P) - \rho R \Big].$$
Theorem 2. Let $\mathcal{X}, \mathcal{Y}$ be finite alphabets and $W$ be a DMC; then, $E(R) \geq E_r(R)$ for all $R > 0$.
Gallager also defined in [41] the k-letter expurgation bound as follows:
Definition 12. Let $\mathcal{X}, \mathcal{Y}$ be finite alphabets and $W$ be a DMC; then, for all $R > 0$, we define the k-letter expurgation bound function:
$$E_{ex}^{(k)}(R) = \frac{1}{k} \sup_{\rho \geq 1} \Big[ \max_{P \in \mathcal{P}(\mathcal{X}^k)} E_x(\rho, P, W^k) - \rho k R \Big],$$
where $E_x(\rho, P, W^k) = -\rho \log \sum_{x^k, \bar{x}^k} P(x^k) P(\bar{x}^k) \Big( \sum_{y^k} \sqrt{W^k(y^k|x^k) W^k(y^k|\bar{x}^k)} \Big)^{\frac{1}{\rho}}.$
Theorem 3. Let $\mathcal{X}, \mathcal{Y}$ be finite alphabets and $W$ be a DMC. Then, for all $R > 0$, we have
$$E(R) \geq E_{ex}^{(k)}(R) \text{ for all } k, \qquad \lim_{k \to \infty} E_{ex}^{(k)}(R) = \sup_{k} E_{ex}^{(k)}(R). \qquad (9)$$
The inequality in (9) follows from Fekete’s lemma.
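For concreteness, the Gallager-style exponents above can be evaluated numerically. The sketch below (Python with NumPy; the base-2 logarithms, the grid search, and the restriction to a fixed input distribution are choices of this sketch, not of the cited definitions) computes $E_0(\rho, P)$ and maximizes $E_0(\rho, P) - \rho R$ over $\rho \in [0, 1]$; for the binary symmetric channel, the uniform input is optimal for every $\rho$ by symmetry:

```python
import numpy as np

def E0(rho, P, W):
    """Gallager's E_0 function with base-2 logarithms:
    E_0(rho, P) = -log2( sum_y ( sum_x P(x) W(y|x)^(1/(1+rho)) )^(1+rho) )."""
    inner = (P[:, None] * W ** (1.0 / (1.0 + rho))).sum(axis=0)
    return -np.log2((inner ** (1.0 + rho)).sum())

def random_coding_bound(R, P, W, grid=1001):
    """E_r(R) for a fixed input distribution P, maximizing over rho
    on a grid -- a numerical sketch, not an exact optimizer."""
    return max(E0(rho, P, W) - rho * R for rho in np.linspace(0.0, 1.0, grid))

# Binary symmetric channel with crossover 0.1 and the uniform input.
eps = 0.1
W = np.array([[1 - eps, eps], [eps, 1 - eps]])
P = np.array([0.5, 0.5])
```

As expected from the theory, the bound vanishes at rates at or above capacity ($C \approx 0.531$ bits here) and is strictly positive at low rates.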
The smallest value of R, at which the convex curve $E_{sp}(R)$ meets its supporting line of slope $-1$, is called the critical rate and is denoted by $R_{crit}$ [9]. For the interval $[R_{crit}, C]$, the random coding lower bound coincides with the sphere packing upper bound. The channel reliability function is therefore known for this interval. The channel reliability function is generally not known for the interval $(0, R_{crit})$. For this interval, there are also better lower bounds than the random coding lower bound.
$R_\infty$ is the infimum of all rates R such that $E_{sp}$ is finite on the open interval $(R, C)$. $E_{sp}(R) = +\infty$ applies if $R < R_\infty$. The following representation of $R_\infty$ exists (see [9]):
$$R_\infty = \max_{P \in \mathcal{P}(\mathcal{X})} \Big( -\log \max_{y \in \mathcal{Y}} \sum_{x : W(y|x) > 0} P(x) \Big).$$
There exist alphabets $\mathcal{X}, \mathcal{Y}$ and channels W such that $C_0(W) = 0$ while $R_\infty(W) > 0$.
Moreover, for the zero-error feedback capacity $C_{0F}$, it holds that $C_{0F}(W) = R_\infty(W)$ whenever $C_0(W) > 0$. However, if $C_0(W) = 0$, there exists a channel W for which $C_{0F}(W) = 0$ while $R_\infty(W) > 0$ (see [9]).
For the zero-error feedback capacity, the following is known.
Theorem 4 (Shannon 1956, [40]).
Let $C_0(W) > 0$; then,
$$C_{0F}(W) = \max_{P \in \mathcal{P}(\mathcal{X})} \Big( -\log \max_{y \in \mathcal{Y}} \sum_{x : W(y|x) > 0} P(x) \Big).$$
2.3. Lower and Upper Bounds on the Reliability Function for the Typewriter Channel
As mentioned before, Shannon, Gallager, and Berlekamp conjectured in [13] that the expurgation bound is tight. Katsman, Tsfasman, and Vladut showed in [42] a counterexample for the symmetric q-ary channel for large q. Dalai and Polyanskiy found a simpler counterexample in [43]. They showed that the conjecture is already wrong for the q-ary typewriter channel for odd q. We would like to briefly present their results here.
Definition 13. Let $q \geq 2$ and $0 < \varepsilon \leq 1/2$. The typewriter channel is defined by
$$W(y|x) = \begin{cases} 1 - \varepsilon, & y = x, \\ \varepsilon, & y = x + 1 \pmod q, \\ 0, & \text{otherwise}. \end{cases}$$
The extension of the channel is defined by $W^n(y^n|x^n) = \prod_{i=1}^{n} W(y_i|x_i)$.
For the reliability function of this channel, the interval $(C_0, C)$ is of interest. The capacity of a typewriter channel has the formula $C = \log q - h(\varepsilon)$, where $h$ is the binary entropy function. Shannon showed in [40] that $C_0$ is positive if $q \geq 4$. He showed that for even q, it holds that $C_0 = \log \frac{q}{2}$. It is difficult to get a formula for odd q. Lovász proved in [44] that Shannon’s lower bound for $q = 5$, namely $C_0 = \frac{1}{2} \log 5$, is tight. For general odd q, Lovász proved
$$C_0 \leq \log \frac{q \cos(\pi/q)}{1 + \cos(\pi/q)}.$$
It is only known for $q = 5$ that this bound is tight. In general, this is not true. For special q, there are special results outlined in [44,45,46,47].
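To make Definition 13 and the capacity formula concrete, here is a small numerical sketch (Python with NumPy; all function names are illustrative). It builds the channel matrix and checks that the uniform input, which is optimal by symmetry, reproduces $C = \log q - h(\varepsilon)$:

```python
import numpy as np

def typewriter(q, eps):
    """q-ary typewriter channel: input x is received as x with
    probability 1 - eps and as x + 1 (mod q) with probability eps."""
    W = np.zeros((q, q))
    for x in range(q):
        W[x, x] = 1.0 - eps
        W[x, (x + 1) % q] = eps
    return W

def h2(eps):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return float(-eps * np.log2(eps) - (1 - eps) * np.log2(1 - eps))

def mutual_information(P, W):
    """I(P; W) in bits, summing only over pairs with W(y|x) > 0."""
    q_out = P @ W
    mask = W > 0
    return float(np.sum((P[:, None] * W)[mask] * np.log2((W / q_out)[mask])))

# q = 5, eps = 0.1: uniform input achieves C = log2(5) - h2(0.1).
W5 = typewriter(5, 0.1)
P5 = np.full(5, 0.2)
```

Note that while the capacity C has this simple closed form, the zero-error capacity $C_0$ of the same channel for odd q is exactly where the closed-form program breaks down, as discussed above.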
Dalai and Polyanskiy provide upper and lower bounds on the reliability function in [43]. They observed that the zero-error capacity of the pentagon can be determined by a careful study of the expurgated bound.
They present an improved lower bound for the case of even and odd q, showing that it is also a precisely shifted version of the expurgated bound for the BSC. Their result also provides a new elementary disproof of the conjecture suggested in [13] that the expurgated bound is asymptotically tight when computed on arbitrarily large blocks. Furthermore, in [43], Dalai and Polyanskiy present a new upper bound for the case of odd q based on the minimum distance of codes. They use Delsarte’s linear programming method [48] (see also [49]), combining the construction used by Lovász [44] for bounding the graph capacity with the construction used by McEliece–Rodemich–Rumsey–Welch [50] for bounding the minimum distance of codes in Hamming spaces. In a special case, they give another improved upper bound for the case of odd q, following the ideas of Litsyn [51] and Barg–McGregor [52], which in turn are based on estimates for the spectra of codes originated by Kalai–Linial [53].
2.4. Computable Channels and Computable Performance Functions
We need further basic concepts for computability. We want to investigate the function $E$ and upper bounds like $E_{sp}$ as functions of W and R. These functions are generally only well defined for fixed channels W on sub-intervals as functions depending on R. For example, for W with $R_\infty(W) > 0$, $E_{sp}(R)$ is infinite for $R < R_\infty(W)$. Hence, $E_{sp}$ must be examined and computed as a function of R on the interval $(R_\infty(W), C(W))$. Similar statements also apply to the other functions that have already been introduced. We now fix non-trivial alphabets $\mathcal{X}, \mathcal{Y}$ and the corresponding sets of computable channels $\mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$ and computable distributions $\mathcal{P}_c(\mathcal{X})$.
Definition 14 (Turing computable channel function). We call a function $f : \mathcal{CH}_c(\mathcal{X}; \mathcal{Y}) \to \mathbb{R}_c$ a Turing computable channel function if there is a Turing machine that converts any program for the representation of $W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$ into a program for the computation of the computable real number $f(W)$.
We want to determine whether there is a closed form for the channel reliability function. For this, we need the following definition, which we discuss in more detail in Remark 5 below.
Definition 15 (Turing computable performance function). Let ⊥ be a symbol. We call a function $F : \mathcal{CH}_c(\mathcal{X}; \mathcal{Y}) \times \mathbb{R}_c \to \mathbb{R}_c \cup \{\bot\}$ a Turing computable performance function if there are two Turing computable channel functions $F_1$ and $F_2$ with $F_1(W) < F_2(W)$ for $W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$, and a Turing machine $TM_F$, which is defined for input $W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$ and $R \in \mathbb{R}_c$. The Turing machine stops for the variables W and R and any representation for W and R as input if and only if $F_1(W) < R < F_2(W)$, and the Turing machine delivers $F(W, R)$. If $R \notin (F_1(W), F_2(W))$, then $TM_F$ does not stop.
Remark 5. The requirement for a function to be a Turing computable performance function is relatively weak. For example, let us take W and R as inputs. Then, the interval $(F_1(W), F_2(W))$ is computed first. If R is now in the interval $(F_1(W), F_2(W))$, then the Turing machine must stop for the input and deliver the result $F(W, R)$. We impose no requirements on the behavior of the Turing machine for input W and $R \notin (F_1(W), F_2(W))$. In particular, the Turing machine does not have to stop for the input in this case.
Take, for example, any Turing computable function F with the corresponding Turing machine $TM_F$. Furthermore, let $TM_1$ and $TM_2$ be any two TMs, so that $F_1(W) < F_2(W)$ always holds for all $W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$. Then, the following Turing machine defines a Turing computable performance function.
- 1.
For any input W and R, first compute $F_1(W)$ and $F_2(W)$.
- 2.
Compute the following two tests in parallel:
- (a)
Use the Turing machine $TM_1$ and test whether $R > F_1(W)$ holds for the input.
- (b)
Use the Turing machine $TM_2$ and test whether $R < F_2(W)$ holds for the input.
Let these two tests run until both Turing machines stop. If both Turing machines stop in 2., then compute $F(W, R)$ and set this as the output.
This procedure actually generates a Turing computable performance function, and the Turing machine stops for the input if and only if $F_1(W) < R < F_2(W)$ applies. Then, it gives the value $F(W, R)$ as output. This follows from the fact that the Turing machine $TM_1$ stops for the input if and only if $R > F_1(W)$. The second Turing machine from 2. stops exactly when $R < F_2(W)$, i.e., the Turing machine in 2., which simulates $TM_1$ and $TM_2$ in parallel, stops exactly when $R \in (F_1(W), F_2(W))$ applies.
Remark 6. Using the above approach, we can try, for example, to find upper and lower bounds for the channel reliability function by allowing general Turing computable functions and algorithmically determining the interval for which the function delivers lower or upper bounds for the channel reliability function.
Definition 16 (Banach–Mazur computable channel function). We call $f : \mathcal{CH}_c(\mathcal{X}; \mathcal{Y}) \to \mathbb{R}_c$ a Banach–Mazur computable channel function if every computable sequence $(W_n)_{n \in \mathbb{N}}$ from $\mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$ is mapped by f into a computable sequence $(f(W_n))_{n \in \mathbb{N}}$ from $\mathbb{R}_c$.
For practical applications, it is necessary to have performance functions that satisfy Turing computability. Depending on W, the channel reliability function or the bounds for this function should be computed. This computation is carried out by an algorithm that also receives W as input. This means that the algorithm should also depend recursively on W; otherwise, a special algorithm, depending on W but not recursively so, would have to be developed for each individual W in order to compute the channel reliability function for this channel, or a bound for this function.
It is now clear that when defining the Turing computable performance function, the Turing computable channel functions $F_1$ and $F_2$ cannot be dispensed with, because the channel reliability function depends on the specific channel and the permissible rate region for which the function can be computed. For $E_{sp}$, one often has the representation with $F_1(W) = R_\infty(W)$ and $F_2(W) = C(W)$. For the channel reliability function, the choice $F_1(W) = C_0(W)$ with $F_2(W) = C(W)$ is a natural choice, because the channel reliability function is only useful for this interval. (We note that we showed in [35] that $C_0$ is not Turing computable in general.)
For the Turing computability of the channel reliability function or corresponding upper and lower bounds, it is therefore a necessary condition that the dependency of the relevant rate intervals on W be Turing computable—that is, recursive.
Remark 7. As noted in the Introduction, very few closed-form expressions for performance functions are known in information theory. Even for relatively simple scenarios, such as secure message transmission over a wiretap channel with an active jammer, closed-form solutions are not available (see [54,55,56]). Existing methods in information theory provide convergent multi-letter sequences for determining capacity. While these sequences enable the investigation of important properties of the capacity (see [54,57,58]), they are not yet suitable for direct numerical computation of the capacity. This is due to the reliance on Fekete’s lemma to prove the existence of the limit of these sequences. However, it was shown in [59] that Fekete’s lemma is not constructive, meaning no algorithm can effectively compute the associated limit values. Moreover, the problem of finding simple optimizers for performance functions is generally not algorithmically solvable [60,61]. For instance, the Blahut–Arimoto algorithm can be used to compute an infinite sequence of input distributions that converge to an optimal distribution. However, there is no way to halt the process based on a reliable approximation error, making it impossible to stop the computation at a specific point (see [60,61]).
3. Results for the Rate Function and Applications on the Sphere Packing Bound
In this section, we analyze the function $R_\infty$ and its implications for the sphere packing bound. Specifically, we demonstrate that $E_{sp}$ is not a Turing computable performance function.
We begin by expressing $R_\infty(W)$ as
$$R_\infty(W) = \max_{P \in \mathcal{P}(\mathcal{X})} \Big( -\log \max_{y \in \mathcal{Y}} \sum_{x : W(y|x) > 0} P(x) \Big).$$
From this, equivalent representations of $R_\infty(W)$ can be derived. In summary, the following holds true: let $\mathcal{X}, \mathcal{Y}$ be arbitrary non-trivial finite alphabets; then, for $W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$, we have $R_\infty(W) \in \mathbb{R}_c$.
Proof. Let W be fixed. We consider the vector $(P(x))_{x \in \mathcal{X}}$ of the convex set $\mathcal{P}(\mathcal{X})$. The function $P \mapsto -\log \max_{y} \sum_{x : W(y|x) > 0} P(x)$ is a computable continuous function on $\mathcal{P}(\mathcal{X})$. Thus, its maximum over $\mathcal{P}(\mathcal{X})$ is a computable real number, and thus $R_\infty(W) \in \mathbb{R}_c$. □
Remark 8. We do not know whether holds for any finite . This statement holds for , but the general case is open.
For finite alphabets $\mathcal{X}$ and $\mathcal{Y}$ with $|\mathcal{X}| \geq 2$ and $|\mathcal{Y}| \geq 2$, we want to analyze the set
$$\{ W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y}) : R_\infty(W) > \lambda \}.$$
To accomplish this, we refer to the proof of Theorem 23 in [35]. Along the same lines, one can show that the following holds true:
Theorem 5. Let $\mathcal{X}, \mathcal{Y}$ be non-trivial finite alphabets. For all $\lambda$ with $\lambda > 0$, the set $\{ W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y}) : R_\infty(W) > \lambda \}$ is not semi-decidable.
The following theorem can be derived from a combination of the proof of Theorem 5 and Theorem 24 in [35]. The proof is carried out in the same way as the proof of Theorem 24 in [35].
Theorem 6. Let $\mathcal{X}, \mathcal{Y}$ be non-trivial finite alphabets. The function $W \mapsto R_\infty(W)$ is not Banach–Mazur computable.
We now prove a stronger result than what we were able to show for $C_0$ in [35] so far. We show that the question analogous to the one posed in [34] for $C_{0F}$ can be answered positively for the function $R_\infty$.
We need a concept of distance for $\mathcal{CH}(\mathcal{X}; \mathcal{Y})$. Therefore, for fixed finite alphabets $\mathcal{X}$ and $\mathcal{Y}$, we define the distance between $W_1$ and $W_2$ based on the total variation distance:
$$d(W_1, W_2) := \max_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} | W_1(y|x) - W_2(y|x) |.$$
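This row-wise total-variation construction is straightforward to compute. A minimal sketch (Python with NumPy; the exact normalization used in the paper may differ, and this is one standard choice):

```python
import numpy as np

def channel_distance(W1, W2):
    """Distance built from the total variation of the rows:
    d(W1, W2) = max over x of sum over y of |W1(y|x) - W2(y|x)|.
    W1 and W2 are row-stochastic matrices over the same alphabets."""
    return float(np.abs(W1 - W2).sum(axis=1).max())

# Two binary symmetric channels with crossover probabilities 0.1 and 0.2.
bsc1 = np.array([[0.9, 0.1], [0.1, 0.9]])
bsc2 = np.array([[0.8, 0.2], [0.2, 0.8]])
```

Taking the maximum over inputs makes the distance an operational worst-case quantity: two channels are close exactly when every input letter induces nearby output distributions.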
Definition 17. A function $f : \mathcal{CH}(\mathcal{X}; \mathcal{Y}) \to \mathbb{R}$ is called computable continuous if the following are true:
- 1.
f is sequentially computable, i.e., f maps every computable sequence $(W_n)_{n \in \mathbb{N}}$ with $W_n \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y})$ into a computable sequence $(f(W_n))_{n \in \mathbb{N}}$ of computable numbers;
- 2.
f is effectively uniformly continuous, i.e., there is a recursive function $e : \mathbb{N} \to \mathbb{N}$ such that for all $W_1, W_2 \in \mathcal{CH}(\mathcal{X}; \mathcal{Y})$ and all $N \in \mathbb{N}$ with $d(W_1, W_2) \leq 2^{-e(N)}$, it holds that $|f(W_1) - f(W_2)| \leq 2^{-N}$.
Theorem 7. Let $\mathcal{X}, \mathcal{Y}$ be finite alphabets with $|\mathcal{X}| \geq 2$ and $|\mathcal{Y}| \geq 2$. There exists a computable sequence $(F_N)_{N \in \mathbb{N}}$ of computable continuous functions on $\mathcal{CH}(\mathcal{X}; \mathcal{Y})$ with
- 1.
$F_N(W) \geq F_{N+1}(W)$ for all $N \in \mathbb{N}$ and $W \in \mathcal{CH}(\mathcal{X}; \mathcal{Y})$;
- 2.
$\lim_{N \to \infty} F_N(W) = R_\infty(W)$ for all $W \in \mathcal{CH}(\mathcal{X}; \mathcal{Y})$.
Proof. We consider the function $F_N$ for $N \in \mathbb{N}$. For all $W \in \mathcal{CH}(\mathcal{X}; \mathcal{Y})$, each $F_N$ is a computable continuous function, and $(F_N)_{N \in \mathbb{N}}$ is a computable sequence of computable continuous functions. Moreover, $F_N(W) \geq F_{N+1}(W)$ holds for all $N \in \mathbb{N}$ and $W \in \mathcal{CH}(\mathcal{X}; \mathcal{Y})$. So, $(F_N)_{N \in \mathbb{N}}$ satisfies all properties of the theorem, and point 1 is shown. It further holds that $\lim_{N \to \infty} F_N(W) = R_\infty(W)$ for all $W \in \mathcal{CH}(\mathcal{X}; \mathcal{Y})$, which shows point 2. □
We now want to prove that the corresponding question in [34] can be answered positively for $R_\infty$.
Theorem 8. Let $\mathcal{X}, \mathcal{Y}$ be finite alphabets with $|\mathcal{X}| \geq 2$ and $|\mathcal{Y}| \geq 2$. For all $\lambda$ with $\lambda > 0$, the set
$$\{ W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y}) : R_\infty(W) < \lambda \}$$
is semi-decidable.
Proof. We use the computable sequence $(F_N)_{N \in \mathbb{N}}$ of computable continuous functions from Theorem 7. It holds that $R_\infty(W) < \lambda$ if and only if there is an $N \in \mathbb{N}$ such that $F_N(W) < \lambda$ holds. As in the proof of Theorem 28 from [35], we now use the construction of a Turing machine which accepts exactly the set $\{ W \in \mathcal{CH}_c(\mathcal{X}; \mathcal{Y}) : R_\infty(W) < \lambda \}$. □
We now consider the approximability “from below” (this can be seen as a kind of reachability). We have shown that $R_\infty(W)$ can always be represented as the limit value of a monotonically decreasing computable sequence of computable continuous functions. From this, it can be concluded that this sequence is then also a computable sequence of Banach–Mazur computable functions. We now have the following:
Theorem 9. Let be finite alphabets with and . There does not exist a sequence of Banach–Mazur computable functions with
- 1.
with and ;
- 2.
for all .
Proof. We assume that such a sequence exists. Then, from Theorem 7 and the assumptions of this theorem, it follows that the limit function is Banach–Mazur computable. This yields a contradiction. □
With this, we immediately get the following:
Corollary 1. Consider finite alphabets with , and let be a sequence of Banach–Mazur computable functions that satisfies the following:
- 1.
with and ,
- 2.
for all .
Then, there exists such that holds true.
We now apply the above results to the sphere packing bound. With the results via the rate function, we immediately get the following:
Theorem 10. Let be finite alphabets with and . The sphere packing bound is not a Turing computable performance function for .
Proof. Assume the statement of the theorem is false; then the sphere packing bound is a Turing computable performance function on the corresponding set. But then the associated channel functions must be Turing computable channel functions. As was already shown, however, one of them is not Banach–Mazur computable. This yields a contradiction. □
4. Computability of the Channel Reliability Function and the Sequence of Expurgation Bound Functions
In this section, we consider the reliability function and the expurgation bound and show that these functions are not Turing computable performance functions.
With the help of the results from [35] for noisy channels, we immediately get the following theorem:
Theorem 11. Let be finite alphabets with and . The channel reliability function is not a Turing computable performance function for .
Proof. Here, the relevant rate function is a Turing computable function, according to Definition 14. We already know that the quantity in question is not Banach–Mazur computable. The proof then proceeds in the same way as for the sphere packing bound, i.e., as in the proof of Theorem 10. □
Now, we consider the rate function for the expurgation bound. The k-letter expurgation bound, as a function of W and R, is a lower bound for the channel reliability function. The latter can only be finite on certain rate intervals; thus, we want to compute the function on these intervals. In their famous paper [13], Shannon, Gallager, and Berlekamp examined the sequence of these functions and analyzed its relationship to the channel reliability function. They conjectured that, for all R in the relevant interval (one would then have convergence and the corresponding limit identity), the relation holds. This conjecture was first refuted in [42] and later refuted by a simpler example in [43].
It was clear from its introduction that the channel reliability function exhibits complicated behavior. A closed-form formula for the channel reliability function is not yet known, and the results of this paper show that such a formula cannot exist. In 1967, Shannon, Gallager, and Berlekamp tried in [13] to find sequences of seemingly simple formulas for approximating the channel reliability function. They apparently considered the sequence of k-letter expurgation bounds to be a very good candidate for this approximation. It was hoped that these sequences could be computed more easily with the use of new powerful digital computers.
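For k = 1, Gallager's single-letter form of the expurgated exponent can be evaluated numerically. The following sketch does this for a binary symmetric channel with uniform input; the channel, rate, and grid search are illustrative assumptions, not part of the paper:

```python
import math

def expurgation_exponent_bsc(p, R, rho_max=200.0, grid=20000):
    """Single-letter (k = 1) expurgated exponent for a BSC(p) with a
    uniform input distribution, following Gallager's formula
        E_ex(R) = sup_{rho >= 1} [ -rho*log2((1 + Z**(1/rho)) / 2) - rho*R ],
    where Z = 2*sqrt(p*(1-p)) is the Bhattacharyya parameter.
    The supremum is approximated on a finite grid; rates are in bits."""
    Z = 2.0 * math.sqrt(p * (1.0 - p))
    best = 0.0  # report 0 when the bound is vacuous at this rate
    for i in range(grid):
        rho = 1.0 + (rho_max - 1.0) * i / (grid - 1)
        val = -rho * math.log2((1.0 + Z ** (1.0 / rho)) / 2.0) - rho * R
        best = max(best, val)
    return best

print(expurgation_exponent_bsc(0.05, 0.1))
```

The point of the results below is that such straightforward numerics for a single k say nothing algorithmic about the whole sequence of k-letter bounds.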
Let us now examine the sequence . We have already introduced the concept of computable sequences of computable continuous channel functions. We now introduce the concept of computable sequences of Turing computable performance functions.
Definition 18. A sequence of Turing computable performance functions is called a computable sequence if there is a Turing machine that, for input k, generates a description of the k-th function on the values for which that function is defined.
In the following theorem, we prove that the sequence of the k-letter expurgation bounds is not a computable sequence of computable performance functions. So, the hope mentioned above cannot be fulfilled.
Theorem 12. Let be finite alphabets with and . The sequence of the expurgation lower bounds is not a computable sequence of Turing computable performance functions.
Proof. We prove the theorem by contradiction, assuming that there exists a Turing machine that generates a description of the function for a given input k, as defined in its formulation. This implies that the sequence is computable, since we have an algorithm that can generate each function in the sequence.
Notably, we can express as . Given an input k, the Turing machine produces the description of , from which can be directly obtained via projection (in the sense of primitive recursive functions).
According to Shannon, Gallager, and Berlekamp [13], the corresponding limit holds for all admissible rates. Furthermore, the sequence is monotonically increasing. Let us consider the set defined above. We now construct a Turing machine with only one halting state, “stop”; that is, it either stops or computes forever. It should stop for input W if and only if W is in the above set. According to the assumption, we have a computable sequence of Turing computable channel functions, so for the input W we can generate a computable sequence of computable numbers. We now use a Turing machine that receives an arbitrary computable number x as input and stops if and only if the defining inequality holds; i.e., it has only one halting state and accepts exactly those computable numbers x for which the inequality holds. We now use this program in the following algorithm.
We start with k = 1 and let the machine compute one step for the first input. If it stops, then we stop the algorithm. If it does not stop, we set k = 2 and compute two steps for each of the first two inputs. If one of these computations stops, then the algorithm stops; if not, we increase k by one and repeat the second step.
The above algorithm stops if and only if there is a k such that the corresponding inequality holds. Because of the monotonicity of the sequence, this is the case if and only if W lies in the set above. But then this set is semi-decidable, which we have already shown to be false. This yields a contradiction. □
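The stepwise search described in the proof is a classical dovetailing. A sketch under the assumption that each machine is modeled by a step-budget predicate (the machines below are hypothetical stand-ins):

```python
def dovetail(machines):
    """Dovetailed execution from the proof sketch: at stage k, run each of
    the first k machines for k steps; halt as soon as one of them halts.
    Each machine is modeled as a function step_budget -> bool (True means
    the machine halts within that many steps).  Returns (machine index,
    stage); loops forever if no machine ever halts."""
    k = 1
    while True:
        for i in range(min(k, len(machines))):
            if machines[i](k):
                return i, k
        k += 1

# hypothetical machines: machine i halts iff given at least 3 + i steps
ms = [lambda s, i=i: s >= 3 + i for i in range(5)]
print(dovetail(ms))  # → (0, 3)
```

Dovetailing is what makes the overall procedure a semi-decision procedure: it halts exactly when some member of the sequence provides a witness.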
5. Computability of the Zero-Error Capacity of Noisy Channels with Feedback
In this section, we consider the zero-error capacity of noisy channels with feedback. In our paper [35], we examined the properties of the zero-error capacity without feedback. We already noted the characterization that Shannon showed in [40]. From (15), recall the corresponding representation; we then obtain the stated relation for the channels W under consideration. In one regime the two quantities coincide; otherwise, there is a channel W for which they differ. As in Lemma 1, we can show the following:
Lemma 2. Let be finite non-trivial alphabets. Then the stated identity holds. From Theorem 5 and the relationship between the two capacities, we get the following results for the feedback case, which we have already proved for the case without feedback in [35].
Theorem 13. Let be finite alphabets with and . For all with , the sets are not semi-decidable.
Theorem 14. Let be finite alphabets with and . Then, is not Banach–Mazur computable.
Now, we will prove the following:
Theorem 15. Let be finite alphabets with and . There is a computable sequence of computable continuous functions G with
- 1.
for and ;
- 2.
for .
Proof. We use the function defined for this setting. Then, we have the same properties as in Theorem 7, and the sequence is a monotonically decreasing upper bound for the capacity. Now, the relevant relation holds if and only if there are two indices for which the corresponding condition is satisfied. We now define the function g accordingly; g is a computable continuous function. Using g, we define the members of the sequence, which is thus a computable sequence of computable continuous functions. Obviously, point 1 is satisfied, and the membership condition holds if and only if the corresponding inequality does. So, the stated bound always holds, and for the remaining channels W, the claim is shown in the proof of Theorem 7. □
This immediately gives us the following theorem.
Theorem 16. Let be finite alphabets with and . For all with , the sets are semi-decidable.
Now, we want to look at the consequences of the results above. The same statements apply here as in Section 3 with regard to the approximation from below: the zero-error feedback capacity cannot be approximated by monotonically increasing computable sequences.
There is an elementary relationship between the two quantities, which we use in the following. Again, we assume finite non-trivial alphabets. We recall the relevant functions and the associated matrix. On the two admissible sets, we consider the function F. The function F is concave in the first variable and convex in the second. Both sets are closed, convex, and compact, and F is continuous in both variables. So, the minimax theorem applies and the order of the two optimizations can be exchanged.
Let the first argument be fixed. Then the inner optimum is attained, and evaluating it yields the first one-sided identity. Furthermore, for the second argument fixed, the analogous evaluation yields the second one-sided identity. It follows that the two optimization values coincide. We get the following lemma.
Lemma 3. Let the above assumptions hold; then the stated identity follows. We want to investigate the behavior of the capacity for the Kronecker product of two channel matrices, compared to its behavior for the individual channels. For this purpose, let be arbitrary finite non-trivial alphabets, and we consider channels for these alphabets.
Theorem 17. Let be arbitrary finite non-trivial alphabets, and consider channels for these alphabets. Then, the stated additivity holds. Proof. We use the function from above. The corresponding bound applies for all admissible arguments, chosen arbitrarily, and the reverse inequality holds as well. So, equality follows, and the theorem is proven. □
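The product channel in Theorem 17 is formed with the Kronecker product, which acts on input pairs and output pairs independently. A brief numerical illustration with hypothetical channel matrices:

```python
import numpy as np

# Two row-stochastic channel matrices (hypothetical examples).
W1 = np.array([[0.9, 0.1],
               [0.3, 0.7]])
W2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.5, 0.5]])

# The product channel: entry ((x1, x2), (y1, y2)) equals
# W1(y1|x1) * W2(y2|x2), i.e., the two components act independently.
W = np.kron(W1, W2)

print(W.shape)                          # (4, 6)
print(np.allclose(W.sum(axis=1), 1.0))  # True: rows remain stochastic
```

Additivity of the feedback capacity under this product is exactly what the theorem asserts; the construction itself is elementary linear algebra.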
We now want to investigate the behavior of the capacity for the product input compared to the individual channels. For this purpose, let be arbitrary finite non-trivial alphabets and consider channels for these alphabets.
Theorem 18. Let be arbitrary finite non-trivial alphabets, and for . Then, we have
Remark 9. The condition (33) is equivalent to the stated reformulation. Proof. (31) follows directly from the operational definition of C. Let (33) now be fulfilled; then the corresponding condition must hold, and without loss of generality we fix the roles of the two channels. If (32) is fulfilled, then the first alternative must hold, because otherwise the capacity of one channel would vanish, and thus the capacity of the product would vanish as well (since the zero-error capacity has no super-activation); this would be a contradiction. The remaining case likewise leads to a contradiction, which settles the first condition. Furthermore, the second condition must apply, because otherwise, without loss of generality, one of the channels would satisfy the degenerate condition, which again yields a contradiction. With this, we have proven the theorem. □
We now want to show for which alphabet sizes the behavior described in Theorem 18 can occur.
Theorem 19. - 1.
If , then for all with , we have - 2.
If are non-trivial alphabets with the stated size condition, then there exists with , such that
Proof. In the first case, (35) holds immediately. Otherwise, we can assume without loss of generality an ordering of the alphabet sizes. In this case, the channel matrix must be one of two particular matrices, which fixes its capacity and consequently the capacity of the product. Furthermore, in one subcase, the second channel matrix must also be one of the two matrices above, ensuring that (35) holds; in the other subcase, Theorem 17 guarantees that (35) remains valid. We now prove (36) for the smallest non-trivial alphabet sizes; if we have found channels for this case such that (36) holds, then it is also clear how the general case 2 can be proved. We fix the first channel appropriately, and for the second channel, we take the ternary typewriter channel (see [43]). Computing the resulting capacities shows that the required strict inequality holds, and we have proven case 2. □
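The ternary typewriter channel used in the proof can be written down explicitly. A sketch with an assumed illustrative crossover parameter eps (the concrete parameter from the proof is not reproduced here):

```python
import numpy as np

def typewriter_channel(q=3, eps=0.5):
    """Transition matrix of the q-ary typewriter channel: input x is
    received as x with probability 1 - eps and as x + 1 (mod q) with
    probability eps.  The proof above uses the case q = 3 (see [43]);
    eps = 0.5 is an illustrative choice."""
    W = np.zeros((q, q))
    for x in range(q):
        W[x, x] = 1.0 - eps
        W[x, (x + 1) % q] = eps
    return W

W = typewriter_channel(3, 0.5)
print(W)
# For q = 3 every pair of inputs shares a possible output, so the
# confusability graph is complete and the zero-error capacity without
# feedback is 0; with feedback, the situation is different.
```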
6. Behavior of the Expurgation-Bound Rates
In this section, we consider the behavior of the expurgation-bound rate, which occurs in the expurgation bound as a lower bound for the channel reliability function, where k is the parameter of the k-letter description. Let be arbitrary finite non-trivial alphabets, and consider channels for these alphabets. We want to examine the corresponding rate function.
Theorem 20. There exist non-trivial alphabets and channels such that for all k, there exists a rate R with the stated property. Proof. Assume, to the contrary, that the corresponding identity holds for all channels and all admissible rates. We now take channels such that the rate function is superadditive. Then, for certain parameters, combining the assumed identity with superadditivity yields a contradiction, and thus the theorem is proven. □
We improve the statement of Theorem 20 with the following theorem.
Theorem 21. There exist non-trivial alphabets, channels, and a rate such that the stated inequality holds for all k. Proof. Assume the statement of the theorem is false; that is, for all channels there exists a convergent sequence of rates along which the corresponding identity holds. We now take channels so that the rate function is superadditive for these alphabets. Then, for certain parameters, we obtain a chain of relations contradicting (38), and thus the theorem is proven. □
We have already observed that the function exhibits significantly different behavior over certain rate intervals. In particular, we have analyzed the impact of the channel product on the two relevant intervals. For the first interval, we established an additive relation. However, for the second interval, we have shown that such a simple additive behavior does not hold. Given the proof of Theorem 18, we conclude that there exist channels for which the strict inequality is satisfied for all k.
Another important aspect is understanding the conditions under which the interval causes to become infinite. This occurs if and only if , in which case, the interval is given by . Consequently, there exist channels such that for the function , this interval extends beyond .
Thus, we conclude that is generally superadditive.
7. Conclusions
We have shown that the channel reliability function is not a Turing computable performance function. The same conclusion holds for the functions associated with the sphere packing bound and the expurgation bound.
An interesting aspect of our work is that the constraints we impose on Turing computable performance functions are strictly weaker than those typically required for Turing computable functions. Specifically, we do not require that the Turing machine halt for all inputs . This means we allow the Turing machine to compute indefinitely for certain inputs, i.e., it may never halt for some inputs. Consequently, we permit performance functions that are not defined for all . However, we do require the Turing machine to halt for inputs whenever the performance function F is defined, and in such cases, the machine must return the computable value as output. This ensures that the algorithm generated corresponds to the number according to Definition 15.
Additionally, we considered the function and the zero-error feedback capacity, both of which play a critical role in the context of the channel reliability function. We demonstrated that neither the function nor the zero-error feedback capacity is Banach–Mazur computable. Furthermore, we proved that the function is additive.
We also established that for all finite alphabets with and , the channel reliability function itself is not a Turing computable performance function. Moreover, we showed that the commonly studied bounds, which have been extensively examined in the literature, are also not Turing computable performance functions. It remains unclear whether non-trivial upper bounds for the channel reliability function that are Turing computable even exist.
In [13], the sequence of k-letter expurgation bounds was considered an effective method for approximating the channel reliability function. It was hoped that these sequences could be computed more efficiently using modern digital computers. However, we have shown that this is not the case.
Table 1 gives an overview of the main results of the paper.
As mentioned in the Introduction, future communication systems, such as 6G, will face stringent requirements for trustworthiness. Ultra-reliability, along with the corresponding performance functions, is central to 6G, and this paper addresses that challenge. It is currently unclear how the non-Turing computability of performance functions will impact the system evaluation and certification of future communication systems. A recent study [62] showed that the non-Turing computability of performance functions in artificial intelligence (AI) leads to digital AI algorithms being unable to meet essential legal requirements. It is an intriguing research question whether similar issues might arise in the context of communication systems.
This work does not claim that machine learning or artificial intelligence (AI) approaches are useless for computing capacity functions. Rather, it demonstrates that certain solutions cannot be found by such methods, or that a computer may not be able to assess how close a given result is to the optimum. Nevertheless, employing machine learning tools remains valuable; one must simply be aware that these approaches do not always guarantee optimality. In such cases, alternative theoretical frameworks may be necessary.
Turing computability and Banach–Mazur computability are two central notions in the theory of computation. Every function that is Turing computable is also Banach–Mazur computable, meaning that Banach–Mazur computability subsumes Turing computability. However, the converse does not hold: not every Banach–Mazur computable function is Turing computable. In fact, if a function is not Banach–Mazur computable, then it cannot be computable under any other standard notion of computability. This underscores the foundational and maximal character of Banach–Mazur computability within the hierarchy of computability concepts. Moreover, as shown in [63], there exist even total functions (functions defined on all computable real numbers) that are Banach–Mazur computable but not Turing computable. For readers interested in a deeper understanding of computability theory, including how to determine whether a function is computable together with illustrative examples and detailed explanations, we recommend the comprehensive work by Soare [64] and Cooper's New Computational Paradigms [65]. Practical implications of these theoretical analyses, especially their relevance to real-world applications, are further explored in [66], which may be of particular interest to those seeking connections between theory and practice.