Abstract
The notion of δ-mutual information between non-stochastic uncertain variables is introduced as a generalization of Nair’s non-stochastic information functional. Several properties of this new quantity are illustrated and used in a communication setting to show that the largest δ-mutual information between received and transmitted codewords over ϵ-noise channels equals the (ϵ, δ)-capacity. This notion of capacity generalizes the Kolmogorov ϵ-capacity to packing sets of overlap at most δ and is a variation of a previous definition proposed by one of the authors. Results are then extended to more general noise models, including non-stochastic, memoryless, and stationary channels. The presented theory admits the possibility of decoding errors, as in classical information theory, while retaining the worst-case, non-stochastic character of Kolmogorov’s approach.
1. Introduction
Shannon’s celebrated channel coding theorem states that the capacity is the supremum of the mutual information between the input and the output of the channel [1]. In this setting, the mutual information is intended as the amount of information obtained regarding the random variable at the input of the channel by observing the random variable at the output of the channel, and the capacity is the largest rate of communication that can be achieved with an arbitrarily small probability of error. In an effort to provide an analogous result for safety-critical control systems where occasional decoding errors can result in catastrophic failures, Nair introduced a non-stochastic mutual information functional and established that this equals the zero-error capacity [2], namely the largest rate of communication that can be achieved with zero probability of error. Nair’s approach is based on the calculus of non-stochastic uncertain variables (UVs), and his definition of mutual information in a non-stochastic setting is based on the quantization of the range of uncertainty of a UV induced by the knowledge of the other. While Shannon’s theorem leads to a single letter expression, Nair’s result is multi-letter, involving the non-stochastic information between codeword blocks of n symbols. The zero-error capacity can also be formulated as a graph-theoretic property, and the absence of a single-letter expression for general graphs is well known [3,4]. Extensions of Nair’s non-stochastic approach to characterize the zero-error capacity in the presence of feedback from the receiver to the transmitter using nonstochastic directed mutual information have been considered in [5].
Kolmogorov introduced the notion of ϵ-capacity in the context of functional spaces as the logarithm base two of the packing number of the space, namely the logarithm of the maximum number of balls of radius ϵ that can be placed in the space without overlap [6]. Determining this number is analogous to designing a channel codebook such that the distance between any two codewords is at least 2ϵ. In this way, any transmitted codeword that is subject to a perturbation of at most ϵ can be recovered at the receiver without error. It follows that the ϵ-capacity per transmitted symbol (viz. per signal dimension) corresponds to the zero-error capacity of an additive channel having arbitrary bounded noise of radius at most ϵ. Lim and Franceschetti extended this concept by introducing the (ϵ, δ)-capacity [7], defined as the logarithm base two of the largest number of balls of radius ϵ that can be placed in the space with an average codeword overlap of at most δ. In this setting, δ measures the amount of error that can be tolerated when designing a codebook in a non-stochastic setting, and the (ϵ, δ)-capacity per transmitted symbol corresponds to the largest rate of communication with error at most δ.
The first contribution of this paper is to consider a generalization of Nair’s mutual information based on a quantization of the range of uncertainty of a UV given the knowledge of another, which reduces the uncertainty to at most δ, and to show that this new notion corresponds to the (ϵ, δ)-capacity. Our definition of (ϵ, δ)-capacity is a variation of the one in [7], as it is required to bound the overlap between any pair of balls, rather than the average overlap. For δ = 0, we recover Nair’s result for the Kolmogorov ϵ-capacity or, equivalently, for the zero-error capacity of an additive, bounded noise channel. We then extend the results to more general channels where the noise can be different across codewords and is not necessarily contained within a ball of radius ϵ. Finally, we consider the class of non-stochastic, memoryless, stationary uncertain channels, where the noise experienced by a codeword of n symbols factorizes into n identical terms describing the noise experienced by each codeword symbol. This is the non-stochastic analog of a discrete memoryless channel (DMC), where the current output symbol depends only on the current input symbol, not on any of the previous input symbols, and where the noise distribution is constant across symbol transmissions. It differs from Kolmogorov’s ϵ-noise channel, where the noise experienced by one symbol affects the noise experienced by other symbols (in Kolmogorov’s setting, the noise occurs within a ball of radius ϵ, so for any realization where the noise along one dimension (viz. symbol) is close to ϵ, the noise experienced by all other symbols lying in the remaining dimensions must be close to zero). Letting 1 − δ_n denote the confidence of correct decoding after transmitting n symbols, we introduce several notions of capacity and establish coding theorems in terms of mutual information for all of them, including a generalization of the zero-error capacity that requires the error sequence to remain constant and a non-stochastic analog of Shannon’s capacity that requires the error sequence to vanish as n → ∞.
Finally, since, as in Nair’s case, all of our results are multi-letter, in the Supplementary Materials we provide some sufficient conditions for the factorization of the mutual information leading to a single-letter expression for the non-stochastic capacity of stationary, memoryless, uncertain channels, provide some examples in which these conditions are satisfied, and compute the corresponding capacity.
The rest of the paper is organized as follows: Section 2 introduces the mathematical framework of non-stochastic uncertain variables that are used throughout the paper. Section 3 introduces the concept of non-stochastic mutual information. Section 4 gives an operational definition of the capacity of a communication channel and relates it to the mutual information. Section 5 extends the results to more general channel models, and Section 6 concentrates on the special case of stationary, memoryless, uncertain channels. Section 7 draws conclusions and discusses future directions.
2. Uncertain Variables
We start by reviewing the mathematical framework used in [2] to describe UVs. A UV X is a mapping from a sample space Ω to a set 𝒳, i.e., for all ω ∈ Ω, we have X(ω) ∈ 𝒳, namely the realization of X is x = X(ω).
Given a UV X, the marginal range of X is 〚X〛 = {X(ω) : ω ∈ Ω}.
The joint range of the two UVs X and Y is 〚X, Y〛 = {(X(ω), Y(ω)) : ω ∈ Ω}.
Given a UV Y, the conditional range of X given a realization y ∈ 〚Y〛 is 〚X|y〛 = {X(ω) : Y(ω) = y, ω ∈ Ω},
and the conditional range of X given Y is 〚X|Y〛 = {〚X|y〛 : y ∈ 〚Y〛}.
Thus, 〚X|y〛 denotes the uncertainty in X given the realization Y = y, while 〚X, Y〛 represents the total joint uncertainty of X and Y, namely 〚X, Y〛 = ∪_{y ∈ 〚Y〛} 〚X|y〛 × {y}.
Finally, two UVs, X and Y, are independent if for all y ∈ 〚Y〛, we have 〚X|y〛 = 〚X〛,
which also implies that for all x ∈ 〚X〛, we have 〚Y|x〛 = 〚Y〛.
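To make this calculus concrete, the following minimal sketch (our own illustration, not taken from [2]) computes marginal, joint, and conditional ranges for UVs defined on a finite sample space; the sample space, the variables X and Y, and all function names are hypothetical.

def marginal_range(samples, X):
    """Marginal range of X: the set of values X takes over the sample space."""
    return {X(w) for w in samples}

def joint_range(samples, X, Y):
    """Joint range of X and Y: all pairs (X(w), Y(w))."""
    return {(X(w), Y(w)) for w in samples}

def conditional_range(samples, X, Y, y):
    """Conditional range of X given the realization Y = y."""
    return {X(w) for w in samples if Y(w) == y}

def are_independent(samples, X, Y):
    """X and Y are independent iff the conditional range of X given y equals
    the marginal range of X for every y in the marginal range of Y."""
    return all(conditional_range(samples, X, Y, y) == marginal_range(samples, X)
               for y in marginal_range(samples, Y))

# Toy example: four outcomes, X = parity, Y = value mod 3 (dependent UVs).
samples = range(4)
X = lambda w: w % 2
Y = lambda w: w % 3
print(marginal_range(samples, X))           # {0, 1}
print(conditional_range(samples, X, Y, 1))  # {1}: knowing Y = 1 removes all uncertainty about X
print(are_independent(samples, X, Y))       # False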
3. δ-Mutual Information
3.1. Uncertainty Function
We now introduce a class of functions that are used to express the amount of uncertainty in determining one UV given another. In our setting, an uncertainty function associates a positive number with a given set, which expresses the “massiveness” or “size” of that set.
Definition 1.
Given the set of non-negative real numbers and any set of , is an uncertainty function if it is finite and strongly transitive:
We have , and for all
For all we have
In the case where is measurable, an uncertainty function can easily be constructed using a measure. In the case where is a bounded (not necessarily measurable) metric space and the input set contains at least two points, an example of an uncertainty function is the diameter.
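As a small illustration of the diameter example just mentioned (our own sketch; the point set and the Euclidean metric are assumptions made for this example), the uncertainty of a finite set can be computed as its largest pairwise distance.

from itertools import combinations

def diameter(points, dist):
    """Uncertainty of a set measured by its diameter; the text requires the
    input set to contain at least two points, so degenerate sets return 0."""
    if len(points) <= 1:
        return 0.0
    return max(dist(p, q) for p, q in combinations(points, 2))

euclidean = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

A = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8)]
print(diameter(A, euclidean))   # 1.0, attained by the pair (0, 0) and (0.6, 0.8)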
3.2. Association and Dissociation Between UVs
We now introduce notions of association and dissociation between UVs. In the following definitions, we let and be uncertainty functions defined over sets and corresponding to UVs X and Y. We use the notation to indicate that for all , we have . Similarly, we use to indicate that for all , we have , where and . For , we assume that is always satisfied, while is not. Whenever we consider , we also assume that and , where and .
Definition 2.
The sets of association for UVs X and Y are
The sets of association are used to describe the correlation between two uncertain variables. Since , the exclusion of the zero value in (11) and (12) occurs when there is no overlap between the two conditional ranges.
Definition 3.
For any , UVs X and Y are disassociated at levels if the following inequalities hold:
and, in this case, we write .
Having UVs X and Y be disassociated at levels indicates that at least two conditional ranges and have non-zero overlap and that, given any two conditional ranges, either they do not overlap or the uncertainty associated with their overlap is greater than a fraction of the total uncertainty associated with ; the same holds for conditional ranges and and for level . The levels of disassociation can be viewed as lower bounds in the amount of residual uncertainty in each variable when the other is known. If X and Y are independent, then all the conditional ranges completely overlap, and contain only the element one, and the variables are maximally disassociated (see Figure 1a).
Figure 1.
Illustration of disassociation between UVs. Case (a): variables are maximally disassociated, and all conditional ranges completely overlap, in that all conditional ranges are equal to (or ). Case (b): variables are disassociated at some levels , and there is some overlap between at least two conditional ranges. Case (c): variables are not disassociated at any levels, and there is no overlap between the conditional ranges.
In this case, knowledge of Y does not reduce the uncertainty of X, and vice versa. On the other hand, when the uncertainty associated with any of the non-zero intersections of the conditional ranges decreases but remains positive, X and Y become less disassociated in the sense that knowledge of Y can reduce the residual uncertainty of X, and vice versa (see Figure 1b). When the intersection between every pair of conditional ranges becomes empty, the variables cease to be disassociated (see Figure 1c). Note that excluding the value of 1 in the definition of disassociation allows us to distinguish the case of disassociation from the case of full independence.
An analogous definition of association is given to provide upper bounds on the residual uncertainty of one uncertain variable when the other is known.
Definition 4.
For any , we say that UVs X and Y are associated at levels if the following inequalities hold:
and in this case, we write .
Note that is included in Definition 4 and not in Definition 3. This is because in Definition 3, we have a strict lower bound on the uncertainty in the association sets.
The following lemma provides the necessary and sufficient conditions for association to hold at given levels. These conditions are stated for all points in the marginal ranges and . They show that in the case of association, one can also include in the definition the conditional ranges that have zero intersection. This is not the case for disassociation.
Lemma 1.
For any , if and only if for all , we have
and for all , we have
Proof.
The proof is given in Appendix A. □
An immediate yet important consequence of our definitions is that both association and disassociation at given levels cannot hold simultaneously. We also understand that, given any two UVs, one can always select and to be large enough such that they are associated at levels . In contrast, as the smallest value in the sets and tends towards zero, the variables eventually cease to be disassociated. Finally, it is possible that two uncertain variables are neither associated nor disassociated at given levels . Also, any two uncertain variables are associated at level trivially by definition.
Example 1.
Consider three individuals, a, b, and c, going for a walk along a path. Assume they take, at most, 15, 20, and 10 min to finish their walk, respectively. Assume a starts walking at time 5:00, b starts walking at 5:10, and c starts walking at 5:20. Figure 2 shows the possible time intervals for the walkers on the path. Let an uncertain variable W represent the set of walkers that are present on the path at any time and an uncertain variable T represent the time at which any walker on the path finishes their walk. Then, we have the following marginal ranges:
We also have the following conditional ranges:
For all t ∈ [5:00, 5:10), we have
for all t ∈ [5:10, 5:15], we have
for all t ∈ (5:15, 5:20), we have
and for all t ∈ [5:20, 5:30], we have
Now, let the uncertainty function of a time set be
where is the Lebesgue measure. Let the uncertainty function associated with a set of individuals be the cardinality of the set. Then, the sets of association are
It follows that for all and , we have
and the residual uncertainty in W given T is at least a fraction of the total uncertainty in W, while the residual uncertainty in T given W is at least a fraction of the total uncertainty in T. On the other hand, for all and , we have
and the residual uncertainty in W given T is at most a fraction of the total uncertainty in W, while the residual uncertainty in T given W is at most a fraction of the total uncertainty in T.
Figure 2.
Illustration of the possible time intervals for the walkers on the path.
Finally, if or , then W and T are neither associated nor disassociated.
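The time partition used in this example can be reproduced numerically with the following sketch (ours; it assumes each walker stays on the path for the full maximum duration, with times expressed in minutes after 5:00, and all names are illustrative).

intervals = {"a": (0, 15), "b": (10, 30), "c": (20, 30)}   # on-path intervals, minutes after 5:00

def walkers_at(t):
    """Set of walkers on the path at time t, playing the role of the conditional range of W given t."""
    return {w for w, (start, end) in intervals.items() if start <= t <= end}

for t in (5, 12, 17, 25):
    print(t, walkers_at(t))
# 5:  {'a'}        -> t in [5:00, 5:10)
# 12: {'a', 'b'}   -> t in [5:10, 5:15]
# 17: {'b'}        -> t in (5:15, 5:20)
# 25: {'b', 'c'}   -> t in [5:20, 5:30]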
3.3. δ-Mutual Information
We now introduce the mutual information between uncertain variables in terms of some structural properties of covering sets. Intuitively, for any δ, the δ-mutual information, expressed in bits, represents the most refined knowledge that one uncertain variable provides about the other at a given level of confidence. We express this idea by considering the quantization of the range of uncertainty of one variable, induced by the knowledge of the other. Such quantization ensures that the variable can be identified with uncertainty at most δ. The notions of association and disassociation introduced above are used to ensure that the mutual information is well defined, in that it can be positive and exhibits a certain symmetric property.
Definition 5.
δ-Connectedness and δ-isolation.
- For any , points are δ-connected via and are denoted by if there exists a finite sequence of conditional sets such that , and for all , we have . If and , then we say that and are singly δ-connected via , i.e., there exists a y such that .
- A set is δ-connected via if every pair of points in the set is δ-connected via .
- A set is singly δ-connected via if there exists a such that every point in the set is contained in , namely .
- Two sets are δ-isolated via if no point in is δ-connected to any point in .
Example 2.
Consider the same setting discussed in Example 1. For , two points at times 5:05 and 5:25 ∈ 〚T〛 are δ-connected. The sequence of conditional sets connecting the two points is , where the sets are defined in (21) and (22). This is because 5:05 ∈ 〚T|{a}〛, 5:25 ∈ 〚T|{b}〛 and
For all , two points at times 5:00 and 5:05 are singly δ-connected since 5:00, 5:05 .
Likewise, for all , the set is singly δ-connected by definition.
For , the set is δ-connected. This is because for all , one of the following scenarios holds:
- .
- .
- and , or vice-versa.
In the first two scenarios, the points and are singly δ-connected. In the third scenario, the points are δ-connected since
For all , the two sets and are δ-isolated since there is no overlap between the two sets.
Definition 6.
δ-overlap family.
For any , a δ-overlap family of , denoted by , is the largest family of distinct sets covering , such that
- Each set in the family is δ-connected and contains at least one singly δ-connected set of the form .
- The measure of overlap between any two distinct sets in the family is at most , namely for all , such that ; we also have .
- For every singly δ-connected set, there exists a set in the family containing it.
The first property of the δ-overlap family ensures that points in the same set of the family cannot be distinguished with confidence of at least , while also ensuring that each set cannot be arbitrarily small. The second and third properties ensure that points that are not covered by the same set of the family can be distinguished with confidence of at least . It follows that the cardinality of the covering family represents the most refined knowledge at a given level of confidence that we can have about X, given the knowledge of Y. This also corresponds to the most refined quantization of the set induced by Y. This interpretation is analogous to the one in [2], extending the concept of overlap partition introduced there to a δ-overlap family in this work. The stage is now set to introduce the δ-mutual information in terms of the δ-overlap family.
Definition 7.
The δ-mutual information provided by Y about X is the base-two logarithm, expressed in bits, of the cardinality of the corresponding δ-overlap family,
if such a δ-overlap family exists; otherwise, it is zero.
We now show that when variables are associated at level , there exists a -overlap family, so that the mutual information is well defined.
Theorem 1.
If , then there exists a δ-overlap family .
Proof.
We show that
satisfies all three properties of a δ-overlap family in Definition 6. First, note that is a cover of , since , even though the sets for different y may overlap with each other. Second, each set in the family is singly δ-connected via , since, trivially, any two points are singly δ-connected via the same set. It follows that Property 1 of Definition 6 holds.
Now, since , then by Lemma 1, for all , we have
which shows that Property 2 of Definition 6 holds. Finally, it is also easy to see that Property 3 of Definition 6 holds, since contains all sets . Hence, satisfies all the three properties in Definition 6, which implies that there exists at least one set satisfying these conditions. Hence, the maximum over these sets is defined and the claim follows. □
Next, we show that a -overlap family also exists when variables are disassociated at level . In this case, we also characterize the mutual information in terms of a partition of .
Definition 8.
δ-isolated partition.
A δ-isolated partition of , denoted by , is a partition of such that any two sets in the partition are δ-isolated via .
Theorem 2.
If , then the following holds:
- 1.
- There exists a unique δ-overlap family .
- 2.
- The δ-overlap family is the δ-isolated partition of largest cardinality, in that the cardinality of any δ-isolated partition is at most that of the δ-overlap family, where the equality holds if and only if the partition coincides with the δ-overlap family.
Proof.
First, we show the existence of a -overlap family. For all , let be the set of points that are -connected to x via , namely
Then, we let
and show that this is a -overlap family. First, note that since , we know that is a cover of . Second, for all , there exists a such that , and since any two points are singly -connected via , we understand that . It follows that every set in the family contains at least one singly -connected set. For all , we also have and . Since , by Lemma A2 in Appendix C, this implies that . It follows that every set in the family is -connected and contains at least one singly -connected set, and we conclude that Property 1 of Definition 6 is satisfied.
We now claim that for all , if
then
This can be proven by contradiction. Let and assume that . By (9), this implies that . We can then select , such that we have and . Since , by Lemma A2 in Appendix C, this also implies that , and, therefore, , which is a contradiction. It follows that if , then we must have , and, therefore,
We conclude that Property 2 of Definition 6 is satisfied.
Finally, we understand that for any singly δ-connected set , there exists an such that , which, by (42), implies that . Namely, for every singly δ-connected set, there exists a set in the family containing it. We can then conclude that satisfies all the properties of a δ-overlap family.
Next, we show that is a unique -overlap family, which implies that this is also the largest set satisfying the three conditions in Definition 6. By contradiction, consider another -overlap family . For all , let denote a set in containing x. Then, using the definition of and the fact that is -connected, it follows that
Next, we show that for all , we also have
from which, we conclude that .
The proof of (48) is also obtained by contradiction. Assume there exists a point . Since both x and are contained in , . Let be a point in a singly connected set that is contained in , namely . Since both x and are in , we understand that . Since , we can apply Lemma A2 in Appendix C to conclude that . It follows that there exists a sequence of conditional ranges such that and , which satisfies (35). Since is in both and , we obtain , and since , we obtain
Without loss of generality, we can then assume that the last element of our sequence is . By Property 3 of Definition 6, every conditional range in the sequence must be contained in some set of the -overlap family . Since and , it follows that there exist two consecutive conditional ranges along the sequence and two sets of the -overlap family covering them, such that , , and . Then, we have
where follows from (10) and follows from (35). It follows that
and Property 2 of Definition 6 is violated. Thus, does not exist, which implies . Combining (47) and (48), we conclude that the δ-overlap family is unique.
We now turn to the proof of the second part of the theorem. Since by (46), the uncertainty associated with the overlap between any two sets of the -overlap family is zero, it follows that is also a partition.
Now, we show that is also a δ-isolated partition. This can be proven by contradiction. Assume that is not a δ-isolated partition. Then, there exist two distinct sets such that and are not δ-isolated. This implies that there exists a point and such that . Using the fact that and are δ-connected and Lemma A2 in Appendix C, this implies that all points in the set are δ-connected to all points in the set . Now, let and be points in a singly δ-connected set contained in and , respectively: and . Since , there exists a sequence of conditional ranges satisfying (35), such that and . Without loss of generality, we can assume and . Since is a partition, we understand that and . It follows that there exist two consecutive conditional ranges along the sequence and two sets of the δ-overlap family covering them, such that and and that . Similarly to (50), we hold that
and Property 2 of Definition 6 is violated. Thus, and do not exist, which implies that is a -isolated partition.
Let be any other -isolated partition. We wish to show that and that the equality holds if and only if . First, note that every set can intersect, at most, one set in ; otherwise, the sets in would not be -isolated. Second, since is a cover of , every set in must be intersected by at least one set in . It follows that
Now, assume the equality holds. In this case, there is a one-to-one correspondence , such that for all , we have , and since both and are partitions of , it follows that . Conversely, assuming that , then follows trivially. □
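As an illustration of the construction used in the proof above (our own sketch, with the cardinality of finite sets playing the role of the uncertainty function; the conditional ranges and the threshold are illustrative assumptions), conditional ranges can be chained whenever their overlap exceeds a δ fraction of the total uncertainty, and the unions of the resulting components form a candidate δ-isolated partition whose cardinality determines the δ-mutual information of Definition 7.

import math

def delta_isolated_partition(cond_ranges, delta):
    """Chain conditional ranges whose pairwise overlap exceeds delta times the total
    uncertainty, and return the union of each chained component."""
    total = len(set().union(*cond_ranges))                    # uncertainty of the whole range
    linked = lambda A, B: len(A & B) > delta * total          # overlap large enough to connect
    components = []
    for r in cond_ranges:
        merged = [c for c in components if any(linked(r, s) for s in c)]
        for c in merged:
            components.remove(c)
        components.append([r] + [s for c in merged for s in c])
    return [set().union(*c) for c in components]

ranges = [{1, 2, 3}, {3, 4}, {6, 7}, {7, 8, 9}]               # hypothetical conditional ranges
parts = delta_isolated_partition(ranges, delta=0.05)
print(parts)                       # two blocks, {1, 2, 3, 4} and {6, 7, 8, 9}
print(math.log2(len(parts)))       # 1.0 bit of mutual information for this toy example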
We have introduced the notion of mutual information from Y to X in terms of the conditional range . Since, in general, we have , one may expect the definition of mutual information to be asymmetric in its arguments. Namely, the amount of information provided about X by the knowledge of Y may not be the same as the amount of information provided about Y by the knowledge of X. Although this is true in general, we show that for disassociated UVs, symmetry is retained, provided that when swapping X with Y, one also rescales appropriately. The following theorem establishes the symmetry in the mutual information under the appropriate scaling of the parameters and . The proof requires the introduction of the notions of taxicab connectedness, taxicab family, and taxicab partition, which are given in Appendix C.1, along with the proof of the theorem.
Theorem 3.
If and a -taxicab family of exists, then we have
4. (ϵ, δ)-Capacity
We now give a definition of the capacity of a communication channel and relate it to the notion of mutual information between the UVs introduced above. We consider a normed space to be totally bounded if, for any given radius, it can be covered by a finite number of open balls of that radius. We let be a totally bounded, normed space such that for all , we have , where represents the norm. This normalization is for notational convenience, and all results can easily be extended to metric spaces of any bounded norm. Let be a discrete set of points in the space, which represents a codebook.
Definition 9.
ϵ-perturbation channel.
A channel is called an ϵ-perturbation channel if any transmitted codeword x is received with a noise perturbation of at most ϵ. Namely, we receive a point in the ball of radius ϵ centered at x.
Given the codebook is transmitted over an -perturbation channel, all received codewords lie in the set , where . Transmitted codewords can be decoded correctly as long as the corresponding uncertainty sets at the receiver do not overlap. This can be achieved by simply associating the received codeword to the point in the codebook that is closest to it.
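The nearest-codeword decoding rule just described can be sketched as follows (a toy example of ours; the codebook, the value of ϵ, and the Euclidean norm are illustrative assumptions).

codebook = [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0)]
epsilon = 0.2          # codewords are more than 2 * epsilon apart, so the uncertainty sets do not overlap

dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def decode(received):
    """Associate the received point with the closest codeword."""
    return min(codebook, key=lambda c: dist(c, received))

received = (0.45, 0.62)            # (0.5, 0.5) perturbed by less than epsilon
print(decode(received))            # (0.5, 0.5): decoded without error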
For any ∈, we now let
where is an uncertainty function defined over the space . We also assume without loss of generality that the uncertainty associated with the whole space of received codewords is . Finally, we let be the smallest uncertainty set corresponding to a transmitted codeword, namely , where . The quantity can be viewed as the confidence we have in not confusing and in any transmission or, equivalently, as the amount of adversarial effort required to induce a confusion between the two codewords. For example, if the uncertainty function is constructed using a measure, then all the erroneous codewords generated by an adversary to decode instead of must lie inside the equivocation set depicted in Figure 3, whose relative size is given by (56). The smaller the equivocation set is, the larger the effort required by the adversary to induce an error must be. If the uncertainty function represents the diameter of the set, then all the erroneous codewords generated by an adversary to decode instead of will be close to each other in the sense of (56). Once again, the closer the possible erroneous codewords are, the harder it must be for the adversary to generate an error, since any small deviation allows the decoder to correctly identify the transmitted codeword.
Figure 3.
The size of the equivocation set is inversely proportional to the amount of adversarial effort required to induce an error.
We now introduce the notion of a distinguishable codebook, ensuring that every codeword cannot be confused with any other codeword, rather than with a specific one, at a given level of confidence.
Definition 10.
-distinguishable codebook.
For any , , a codebook is -distinguishable if for all , we have .
For any -distinguishable codebook and , we let
It now follows from Definition 10 that
and each codeword in an -distinguishable codebook can be decoded correctly with confidence of at least . Definition 10 guarantees even more, namely that the confidence of not confusing any pair of codewords is uniformly bounded by . This stronger constraint implies that we cannot “balance” the error associated with a codeword transmission by allowing some decoding pairs to have a lower confidence while enforcing other pairs to have a higher confidence. This is the main difference between our definition and the one used in [7], which bounds the average confidence, and it is what allows us to relate the notion of capacity to the mutual information between pairs of codewords.
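The following one-dimensional sketch (our own; the codebook, ϵ, and the overlap fraction used below, namely intersection length over interval length, are illustrative assumptions) contrasts the worst-case pairwise overlap bounded by our definition with the average overlap bounded in [7].

def overlap_fraction(c1, c2, eps):
    """Fraction of an epsilon-interval shared by the intervals around two codewords."""
    lo, hi = max(c1 - eps, c2 - eps), min(c1 + eps, c2 + eps)
    return max(0.0, hi - lo) / (2 * eps)

codebook, eps = [0.1, 0.35, 0.9], 0.15
pairs = [(i, j) for i in range(len(codebook)) for j in range(i + 1, len(codebook))]
fractions = [overlap_fraction(codebook[i], codebook[j], eps) for i, j in pairs]

print(max(fractions))                    # worst-case pairwise overlap: what our definition bounds
print(sum(fractions) / len(fractions))   # average overlap: what the definition in [7] bounds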
Definition 11.
(ϵ, δ)-capacity.
For any totally bounded, normed metric space , , and , the (ϵ, δ)-capacity of is
where is the set of -distinguishable codebooks.
The (ϵ, δ)-capacity represents the largest number of bits that can be communicated by using any -distinguishable codebook. The corresponding geometric picture is illustrated in Figure 4. For δ = 0, our notion of capacity reduces to Kolmogorov’s ϵ-capacity, which is the logarithm of the packing number of the space with balls of radius ϵ.
Figure 4.
Illustration of the (ϵ, δ)-capacity in terms of packing ϵ-balls with maximum overlap δ.
In the definition of capacity, we have restricted δ to rule out the cases in which the decoding error can be at least as large as the error introduced by the channel and in which the (ϵ, δ)-capacity is infinite. Also, note that since and (10) holds.
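For δ = 0, the packing picture behind this definition can be explored with a simple greedy search over a finite grid of candidate codewords (our own sketch; greedy packing only returns a feasible codebook, hence a lower bound on the packing number, and the grid, ϵ, and norm are illustrative assumptions).

from itertools import product
import math

def greedy_packing(candidates, epsilon, dist):
    """Keep a candidate only if its epsilon-ball does not overlap any previously kept ball,
    i.e., its distance to every kept codeword exceeds 2 * epsilon."""
    code = []
    for p in candidates:
        if all(dist(p, q) > 2 * epsilon for q in code):
            code.append(p)
    return code

grid = [(i / 4, j / 4) for i, j in product(range(5), repeat=2)]   # 25 candidate points in the unit square
dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

code = greedy_packing(grid, epsilon=0.2, dist=dist)
print(len(code), math.log2(len(code)))   # 9 codewords, about 3.17 bits for this discretized example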
We now relate our operational definition of capacity to the notion of UVs and mutual information introduced in Section 3. Let X be the UV corresponding to the transmitted codeword. This is a map and . Likewise, let Y be the UV corresponding to the received codeword. This is a map and . For our -perturbation channel, these UVs are such that for all and , we have
(see Figure 5). Clearly, the set in (60) is continuous, while the set in (61) is discrete.
Figure 5.
Conditional ranges and due to the -perturbation channel.
To measure the levels of association and disassociation between X and Y, we use an uncertainty function defined over and defined over . We introduce the feasible set
representing the set of UVs X such that the marginal range is a discrete set representing a codebook, and the UV can either achieve levels of disassociation or levels of association with Y. In our channel model, this feasible set also depends on the -perturbation through (60) and (61).
We can now state the non-stochastic channel coding theorem for our -perturbation channel.
Theorem 4.
For any totally bounded, normed metric space , ϵ-perturbation channel satisfying (60) and (61), , and , we have
Proof.
First, we show that there exists a UV X and such that , which implies that the supremum is well defined. Second, for all X and such that
and
we show that
Finally, we show the existence of and such that .
Let us begin with the first step. Consider a point . Let X be a UV such that
Then, we hold that the marginal range of the UV Y corresponding to the received variable is
and, therefore, for all , we have
Using Definition 2 and (67), we hold that
because consists of a single point, and, therefore, the set in (12) is empty.
On the other hand, using Definition 2 and (69), we have
Using (70), and since holds for , we have
Similarly, using (71), we have
Now, combining (72) and (73), we have
Letting , this implies that and the first step of the proof is complete.
To prove the second step, we define the set of discrete UVs
which is a larger set than the one containing all UVs X that are associated with Y. Now, we will show that if a UV , then the corresponding codebook . If , then there exists a such that for all , we have
It follows that for all , we have
Using , (60), and , for all , we have
where follows from . Putting things together, it follows that
Consider now a pair of X and such that and
If , then, using Lemma A1 in Appendix C, there exist two UVs, and and , such that
and
On the other hand, if , then (81) and (82) also trivially hold. It then follows that (81) and (82) hold for all . We now have
where follows from (81) and (82), follows from Lemma A3 in Appendix C since , follows by defining the codebook corresponding to the UV , and follows from the fact that using (81) and Lemma 1 allows , which implies for (79) that .
Finally, let
which achieves the capacity . Let be the UV whose marginal range corresponds to the codebook . It follows that for all , we have
which implies that ,
Letting , and using Lemma 1, we hold that , which implies that , and the proof is complete. □
Theorem 4 characterizes the capacity as the supremum of the mutual information over all UVs in the feasible set. The following theorem shows that the same characterization is obtained if we optimize the right-hand side in (63) over all UVs in the space. It follows by Theorem 4 that rather than optimizing over all UVs representing all the codebooks in the space, a capacity-achieving codebook can be found within the smaller class of feasible sets with error at most , since for all , .
Theorem 5.
The -capacity in (63) can also be written as
Proof.
Consider a UV , where Y is the corresponding UV at the receiver. The idea of the proof is to show the existence of a UV and the corresponding UV at the receiver, and
such that the cardinality of the overlap partitions
Let the cardinality
By Property 1 of Definition 6, we hold that for all , there exists an such that . Now, consider another UV whose marginal range is composed of K elements of , namely
Let be the UV corresponding to the received variable. Using the fact that for all , we have since (60) holds, and using Property 2 of Definition 6, for all , we obtain
where follows from the fact that using (91). Then, for all , we hold that
since . Then, by Lemma 1, it follows that
Since , we have
Therefore, and . We now hold that
where follows by applying Lemma A4 in Appendix C using (94) and (95) and follows from (90) and (91). Combining (96) with Theorem 4, the proof is complete. □
We now make some considerations with respect to previous results in the literature. First, we note that for δ = 0, all of our definitions reduce to Nair’s, and Theorem 4 recovers Nair’s coding theorem ([2] (Theorem 4.1)) for the zero-error capacity of an additive ϵ-perturbation channel.
Second, we point out that the (ϵ, δ)-capacity considered in [7] defines the set of distinguishable codewords such that the average overlap among all codewords is at most δ. In contrast, our definition requires the overlap for each pair of codewords to be at most δ. The following theorem provides the relationship between our capacity and the one considered in [7], which is defined using the Euclidean norm.
Theorem 6.
Let be the -capacity defined in [7]. We have
and
Proof.
For every codebook and , we have
Since , this implies that for all , we have
For all , the average overlap defined in ([7] (53)) is
Then, we have
where follows from (100). Thus, we have
and (97) follows.
Now, let be a codebook with average overlap at most , namely
This implies that for all , we have
where follows from the fact that . Thus, we have
and (98) follows. □
To better understand the relationship between the two capacities and show how they can be distinct, consider the case in which the output space is the union of the three ϵ-balls depicted in Figure 6; this is the only feasible output configuration.
Figure 6.
Output configuration for the computation of and .
We now compute the two capacities and in this case. We have
and the average overlap (101) is
It follows that
On the other hand, the worst case overlap is
and it follows that
5. -Capacity of General Channels
We now extend our results to more general channels where the noise can be different across codewords and is not necessarily contained within a ball of radius ϵ.
Let be a discrete set of points in the space, which represents a codebook. Any point represents a codeword that can be selected at the transmitter, sent over the channel, and received with perturbation. A channel with transition mapping associates with any point in a set in , such that the received codeword lies in the set
Figure 7 illustrates possible uncertainty sets associated with three different codewords.
Figure 7.
Uncertainty sets associated with three different codewords. Sets are not necessarily balls; they can be different across codewords and can also be composed of disconnected subsets.
All received codewords lie in the set , where . For any ∈, we now let
where is an uncertainty function defined over . We also assume without loss of generality that the uncertainty associated with the space of received codewords is . We also let , where . Thus, is the set corresponding to the minimum uncertainty introduced by the noise mapping N.
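A finite-alphabet sketch of such a transition mapping, in the spirit of Figure 7, is given below (entirely illustrative: the codewords, the output points, and the use of cardinality as the uncertainty function are our assumptions).

# Transition mapping N: each codeword is associated with an arbitrary set of possible
# received points; the sets need not be balls and may consist of disconnected pieces.
N = {
    "x1": {0, 1, 2},
    "x2": {2, 3},        # overlaps the uncertainty set of x1 in the single point 2
    "x3": {7, 9},        # a disconnected uncertainty set
}

def overlap_uncertainty(a, b, mapping, m=len):
    """Uncertainty (here, cardinality) of the overlap between two uncertainty sets."""
    return m(mapping[a] & mapping[b])

print(overlap_uncertainty("x1", "x2", N))   # 1
print(overlap_uncertainty("x1", "x3", N))   # 0: x1 and x3 can always be told apart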
Definition 12.
-distinguishable codebook.
For any , a codebook is -distinguishable if for all , we have .
Definition 13.
-capacity.
For any totally bounded, normed metric space , channel with transition mapping N, and , the -capacity of is
where .
We now relate our definition of capacity to the notion of UVs and mutual information introduced in Section 3. As usual, let X be the UV corresponding to the transmitted codeword and Y be the UV corresponding to the received codeword. For a channel with transition mapping N, these UVs are such that for all and , we have
To measure the levels of association and disassociation between UVs X and Y, we use an uncertainty function defined over , and is defined over . The definition of the feasible set is the same as the one given in (62). In our channel model, this feasible set depends on the transition mapping N through (115) and (116).
We can now state the non-stochastic channel coding theorem for channels with transition mapping N.
Theorem 7.
For any totally bounded, normed metric space , channel with transition mapping N satisfying (115) and (116), and , we have
The proof is along the same lines as the one of Theorem 4 and is omitted.
Theorem 7 characterizes the capacity as the supremum of the mutual information over all codebooks in the feasible set. The following theorem shows that the same characterization is obtained if we optimize the right hand side in (117) over all codebooks in the space. It follows by Theorem 7 that rather than optimizing over all codebooks, a capacity-achieving codebook can be found within the smaller class of feasible sets with error at most .
Theorem 8.
The -capacity in (117) can also be written as
The proof is along the same lines as the one of Theorem 5 and is omitted.
6. Capacity of Stationary Memoryless Uncertain Channels
In this section, we consider the special case of stationary, memoryless, uncertain channels.
Let be the space of -valued discrete-time functions , where is the set of positive integers denoting the time step. Let denote the function restricted over the time interval . Let be a discrete set, which represents a codebook. Also, let denote the set of all codewords up to time n and denote the set of all codeword symbols in the codebook at time n. The codeword symbols can be viewed as the coefficients representing a continuous signal in an infinite-dimensional space. For example, transmitting one symbol per time step can be viewed as transmitting a signal of unit spectral support over time. Any discrete-time function can be selected at the transmitter, sent over a channel, and received with the noise perturbation introduced by the channel. The perturbation of the signal at the receiver due to the noise can be described as a displacement experienced by the corresponding codeword symbols . To describe this perturbation, we consider the set-valued map , associating any point in to a set in , where is the space of -valued discrete-time functions. For any transmitted codeword , the corresponding received codeword lies in the set
Also, the noise set associated with is
where We are now ready to define stationary, memoryless, uncertain channels.
Definition 14.
A stationary, memoryless, uncertain channel is a transition mapping that can be factorized into identical terms describing the noise experienced by the codeword symbols. Namely, there exists a set-valued map such that for all and , we have
According to the definition, a stationary, memoryless, uncertain channel maps the nth input symbol into the nth output symbol in a way that does not depend on the symbols at other time steps, and the mapping is the same at all time steps. Since the channel can be characterized by the mapping N, to simplify the notation, we will use instead of .
Another important observation is that the ϵ-perturbation channel in Definition 9 may not admit a factorization like the one in (121). For example, consider the space to be equipped with the norm, the codeword symbols to represent the coefficients of an orthogonal representation of a transmitted signal, and the noise experienced by any codeword to be within a ball of radius ϵ. In this case, if a codeword symbol is perturbed by a value close to ϵ, the perturbation of all other symbols must be close to zero.
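This contrast can be sketched numerically as follows (a toy example of ours with codewords of two real-valued symbols; the per-symbol radius, ϵ, and the Euclidean norm are illustrative assumptions).

def memoryless_noise_set(codeword, per_symbol_radius):
    """Stationary memoryless channel: the noise set factorizes into identical per-symbol
    intervals, so the received codeword lies in an axis-aligned box around the codeword."""
    return [(s - per_symbol_radius, s + per_symbol_radius) for s in codeword]

def in_epsilon_ball(codeword, received, epsilon):
    """Kolmogorov-style noise: the total Euclidean perturbation is at most epsilon, so a
    large error on one symbol forces the errors on the remaining symbols to be small."""
    return sum((r - s) ** 2 for r, s in zip(received, codeword)) ** 0.5 <= epsilon

x = (0.4, 0.7)
print(memoryless_noise_set(x, 0.1))          # [(0.3, 0.5), (0.6, 0.8)]: every corner of the box is allowed
print(in_epsilon_ball(x, (0.5, 0.8), 0.1))   # False: perturbing both symbols by 0.1 exceeds the ball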
For stationary, memoryless, uncertain channels, all received codewords lie in the set , and the received codewords up to time n lie in the set . Then, for any , we let
where is an uncertainty function defined over the space of the received codewords. We also assume without loss of generality that at any time step n, the uncertainty associated with the space of received codewords is . We also let , where . Thus, is the set corresponding to the minimum uncertainty introduced by the noise mapping at a single time step. Finally, we let The quantity can be viewed as the confidence we have of not confusing and in any transmission or, equivalently, as the amount of adversarial effort required to induce a confusion between the two codewords. For example, if the uncertainty function is constructed using a measure, then all the erroneous codewords generated by an adversary to decode instead of must lie inside the equivocation set whose relative size is given by (122). The smaller the equivocation set is, the larger the effort required by the adversary to induce an error must be. If the uncertainty function represents the diameter of the set, then all the erroneous codewords generated by an adversary to decode instead of will be close to each other, in the sense of (122).
We now introduce the notion of a distinguishable codebook, ensuring that every codeword cannot be confused with any other codeword, rather than with a specific one, at a given level of confidence.
Definition 15.
-distinguishable codebook.
For all and , a codebook is -distinguishable if for all , we have
It immediately follows that for any -distinguishable codebook , we have
so that each codeword in can be decoded correctly with confidence at least . Definition 15 guarantees even more, namely that the confidence of not confusing any pair of codewords is at least .
We now associate with any sequence the largest distinguishable rate sequence , whose elements represent the largest rates that satisfy that confidence sequence.
Definition 16.
Largest -distinguishable rate sequence.
For any sequence , the largest -distinguishable rate sequence is such that for all , we have
where
We say that any constant rate R that lies below the largest -distinguishable rate sequence is -distinguishable. Such a -distinguishable rate ensures the existence of a sequence of distinguishable codes that, for all , have a rate of at least R and a confidence of at least .
Definition 17.
-distinguishable rate.
For any sequence , a constant rate R is said to be -distinguishable if for all , we have
We now give our first definition of capacity for stationary, memoryless, uncertain channels as the supremum of the -distinguishable rates. Using this definition, transmitting at a constant rate below capacity ensures the existence of a sequence of codes that, for all , have confidence of at least .
Definition 18.
capacity.
For any stationary, memoryless, uncertain channel with transition mapping N, and any given sequence , we let
Another definition of capacity arises if, rather than the largest lower bound to the sequence of rates, one considers the least upper bound for which we can transmit, satisfying a given confidence sequence. Using this definition, transmitting at a constant rate below capacity ensures the existence of a finite-length code (rather than a sequence of codes) that satisfies at least one confidence value along the sequence .
Definition 19.
capacity.
For any stationary, memoryless, uncertain channel with transition mapping N, and any given sequence , we define
Next, consider Definition 19 in the case of a constant sequence; namely, for all , we have . In this case, transmitting below capacity ensures the existence of a finite-length code that has confidence of at least . This is a generalization of the zero-error capacity.
Definition 20.
capacity.
For any stationary, memoryless, uncertain channel with transition mapping N and any sequence , where for all we have , we define
Letting δ = 0, we obtain the zero-error capacity. In this case, below capacity, there exists a code with which we can transmit with full confidence.
Finally, to give a definition of a non-stochastic analog of Shannon’s probabilistic capacity, we first say that any constant rate R is achievable if there exists a sequence , vanishing as n → ∞, such that R lies below . An achievable rate R then ensures that for all , there exists an infinite sequence of distinguishable codes of rate at least whose confidence tends towards one as n → ∞. It follows that in this case, we can achieve communication at rate R with arbitrarily high confidence by choosing a sufficiently large codebook.
Definition 21.
Achievable rate.
A constant rate R is achievable if there exists a sequence such that as and
We now introduce the non-stochastic analog of Shannon’s probabilistic capacity as the supremum of the achievable rates. This means that we can pick any confidence sequence such that tends towards zero as n → ∞. In this way, plays the role of the probability of error, and the capacity is the largest rate that can be achieved by a sequence of codebooks with an arbitrarily high confidence level. Using this definition, transmitting at a rate below capacity ensures the existence of a sequence of codes achieving arbitrarily high confidence by increasing the codeword size.
Definition 22.
capacity.
For any stationary, memoryless, uncertain channel with transition mapping N, we define the capacity as
We point out the key difference between Definitions 20 and 22. Transmitting below the capacity ensures the existence of a fixed codebook that has confidence of at least . In contrast, transmitting below the capacity allows us to achieve arbitrarily high confidence by increasing the codeword size.
To give a visual illustration of the different definitions of capacity, we refer to Figure 8.
Figure 8.
Illustration of capacities: This figure plots the sequence for a given sequence of with respect to .
For a given sequence , the figure sketches the largest -distinguishable rate sequence . According to Definitions 18 and 19, the capacities and are given by the supremum and infimum of this sequence, respectively. On the other hand, according to Definition 22, the capacity is the largest limsup over all vanishing sequences . Assuming the figure refers to a vanishing sequence that achieves the supremum in (133), we have
We now relate our notions of capacity to the mutual information rate between transmitted and received codewords. Let X be the UV corresponding to the transmitted codeword. This is a map and . Restricting this map to a finite time yields another UV and . Similarly, a codebook segment is a UV of marginal range . Likewise, let Y be the UV corresponding to the received codeword. It is a map and . and are UVs, and and . For a stationary, memoryless, uncertain channel with transition mapping N, these UVs are such that for all , and , and we have
Now, we define the largest -mutual information rate as the supremum mutual information per unit-symbol transmission that a codeword can provide about with confidence of at least .
Definition 23.
Largest -information rate.
For all , the largest -information rate from to is
In the following theorem, we establish the relationship between and .
Theorem 9.
Proof.
The proof of the theorem is similar to the one of Theorem 4 and is given in Appendix B. □
The following coding theorem is now an immediate consequence of Theorem 9 and of our capacity definitions.
Theorem 10.
Theorem 10 provides multi-letter expressions of capacity, since depends on according to (137). In the Supplementary Materials, we establish some special cases of uncertainty functions, confidence sequences, and classes of stationary, memoryless, uncertain channels, leading to the factorization of the mutual information and to single-letter expressions.
7. Conclusions and Future Directions
We presented a non-stochastic notion of information with worst-case confidence and related it to the capacity of a communication channel subject to unknown noise. Using the non-stochastic variables framework of Nair [5] and a generalization of the Kolmogorov capacity allowing some amount of overlap in the packing sets [7], we showed that the capacity equals the largest amount of information conveyed by the transmitter to the receiver, with a given level of confidence. These results are the natural generalization of Nair’s results, obtained in a zero-error framework, and provide an information-theoretic interpretation of the geometric problem of sphere packing with overlap, as studied in [7].
Non-stochastic approaches to information and their use to quantify the performance of various engineering systems have recently received attention in the context of estimation, control, security, communication over non-linear optical channels, and learning systems [8,9,10,11,12,13,14]. We hope that the theory developed here can be useful in the future in some of these contexts. While refinements and extensions of the theory are certainly of interest, explorations of application domains are of paramount importance. There is evidence in the literature regarding the need for a non-stochastic approach to study the flow of information in complex systems, and there is a certain tradition in computer science, especially in the field of online learning, to study various problems in both a stochastic and a non-stochastic setting [15,16,17]. Nevertheless, it seems that only a few isolated efforts have been made towards the formal development of a non-stochastic information theory. Wider involvement of the community in developing alternative, even competing, theories is certainly advisable to eventually fulfill the need of these application areas.
Supplementary Materials
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/e27050472/s1, Kolmogorov capacity with overlap.
Author Contributions
Conceptualization, M.F. and A.R.; methodology, M.F. and A.R.; investigation, M.F. and A.R.; writing—original draft preparation, A.R.; writing—review and editing, M.F. and A.R.; supervision, M.F.; project administration, M.F.; funding acquisition, M.F. All authors have read and agreed to the published version of the manuscript.
Funding
This work was partially supported by NSF Award Number: 2127605.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
Data is contained within the article.
Acknowledgments
This article is a revised and expanded version of two papers: “Towards a Non-Stochastic Information Theory”, presented at the 2019 IEEE International Symposium on Information Theory, Paris, France, 7–12 July 2019, and “Channel Coding Theorems in Non-Stochastic Information Theory”, presented at the 2021 IEEE International Symposium on Information Theory, Melbourne, VIC, Australia, 12–20 July 2021 [18,19].
Conflicts of Interest
The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Appendix A. Proof of Lemma 1
Proof.
Let . Then,
Let
Then, for all , we have
Also, if , then
and if , then using (9), we have
This along with (A1) and (A4) implies that (17) follows.
Likewise, let
Then, for all ,
Also, if , then
and if , then using (9), we have
This along with (A2) and (A8) implies that (18) follows.
Now, we prove the opposite direction of the statement. Given that for all , we have
and for all , we have
Then, using the definition of and , we have
The statement of the lemma follows. □
Appendix B. Proof of Theorem 9
Proof.
We will show (138). Then, using Lemma A4 in Appendix C, (139) follows using the same argument as in the proof of Theorem 8.
We proceed in three steps. First, we show that for all , there exists a UV and such that , which implies that is not empty, and so the supremum is well defined. Second, for all , , and such that
and
we show that
Finally, for all , we show the existence of and such that
Let us begin with the first step. Consider a point . Let be a UV such that
Then, we hold that the marginal range of the UV corresponding to the received variable is
and therefore, for all , we have
Using Definition 2 and (A18), we have
because consists of a single point, and therefore, the set in (12) is empty.
On the other hand, using Definition 2 and (A20), we have
Using (A21), and since holds for , we have
Similarly, using (A22), we have
Now, combining (A23) and (A24), we have
Letting , this implies that and that the first step of the proof is complete.
To prove the second step, we define
which is a larger set than the one containing all UVs that are associated with . Similarly to (79), it can be shown that
Consider now a pair, and , such that and
If , then, using Lemma A1 in Appendix C, there exist UVs and and such that
and
On the other hand, if , then (A29) and (A30) also trivially hold. It then follows that (A29) and (A30) hold for all . We now have
where follows from (A29) and (A30), follows from Lemma A3 in Appendix C since , follows by defining the codebook corresponding to the UV , and follows from the fact that using (A29) and Lemma 1, we have , which implies by (A27) that .
For any , let
which achieves the rate . Let be the UV whose marginal range corresponds to the codebook . It follows that for all , we have
which implies that ,
Letting , and using Lemma 1, we hold that , which implies
and (138) follows. □
Appendix C. Auxiliary Results
Lemma A1.
Proof.
Let the cardinality
By Property 1 of Definition 6, we hold that for all , there exists a such that . Now, consider a new UV whose marginal range is composed of K elements of , namely
Let be the UV corresponding to the received variable. Using the fact that for all , we have since (60) holds, and using Property 2 of Definition 6, for all , we have
where follows from the fact that using (A40). Then, for all , we hold that
where . Then, by Lemma 1, it follows that
Since , we have
Using (A43) and (A44), we now hold that
where follows from Lemma A4 in Appendix C and follows from (A39) and (A40). Hence, the statement of the lemma follows. □
Lemma A2.
Let
If and , then we hold that .
Proof.
Let be the sequence of conditional ranges connecting x and . Likewise, let be the sequence of conditional ranges connecting x and .
Lemma A3.
Consider two UVs X and Y. Let
If , then we have
Proof.
We will prove this by contradiction. Let
Then, by Property 1 of Definition 6, there exist two sets and one singly -connected set such that
Then, we have
where follows from (A62) and (10), follows from (A59), and follows from the fact that . However, by Property 2 of Definition 6, we have
Hence, we hold that (A63) and (A64) contradict each other, which implies that (A61) does not hold. Hence, the statement of the lemma follows. □
Lemma A4.
Consider two UVs X and Y. Let
For all and , if , then we have
Additionally, is a -overlap family.
Proof.
We show that
is a -overlap family. First, note that is a cover of , since . Second, each set in the family is singly -connected via , since trivially any two points are singly -connected via the same set. It follows that Property 1 of Definition 6 holds.
Now, since , then by Lemma 1 for all we have
which shows that Property 2 of Definition 6 holds. Finally, it is also easy to see that Property 3 of Definition 6 holds, since contains all sets . Hence, satisfies all the properties of -overlap family, which implies that
Since , using Lemma A3, we also have
Combining (A69), (A70) and the fact that satisfies all the properties of -overlap family, the statement of the lemma follows. □
Appendix C.1. Taxicab Symmetry of the Mutual Information
Definition A1.
-taxicab connectedness and -taxicab isolation.
- Points are -taxicab connected via and are denoted by , if there exists a finite sequence of points in such that , , and for all , we have either or . If and , then we say that and are singly -taxicab connected, i.e., either and or and .
- A set is (singly) -taxicab connected via if every pair of points in the set is (singly) -taxicab connected in .
- Two sets are -taxicab isolated via if no point in is -taxicab connected to any point in .
Definition A2.
Projection of a set
- The projection of a set on the x-axis is defined as
- The projection of a set on the y-axis is defined as
Definition A3.
-taxicab family
A -taxicab family of , denoted by , is a largest family of distinct sets covering such that
- 1.
- Each set in the family is -taxicab connected and contains at least one singly -connected set of form and at least one singly -connected set of the form .
- 2.
- The measure of overlap between the projections on the x-axis and y-axis of any two distinct sets in the family are at most and , respectively.
- 3.
- For every singly -connected set, there exists a set in the family containing it.
We now show that when hold, the cardinality of the -taxicab family is the same as the cardinality of the -overlap family and -overlap family.
Proof of Theorem 3.
We will show that . Then, can be derived along the same lines. Hence, the statement of the theorem follows.
First, we will show that
satisfies all the properties of .
Since is a covering of , we have
which implies that is a covering of .
Consider a set . For all , and are -taxicab connected. Then, there exists a taxicab sequence of the form
such that either or in Definition A1 is true. Then, the sequence yields a sequence of conditional range such that for all ,
Hence, via . Hence, is -connected via . Also, contains at least one singly -connected set of the form , which implies . Hence, contains at least one singly -connected set of the form . Hence, satisfies Property 1 in Definition 6.
For all , we have
using Property 2 in Definition A3. Hence, satisfies Property 2 in Definition 6.
Using Property 3 in Definition A3, we hold that for all , there exists a set containing it. This implies that for all , we have
Hence, satisfies Property 3 in Definition 6.
Thus, satisfies all the three properties of . This implies, along with Theorem 2, that
which implies that
Hence, the statement of the theorem follows. □
References
- Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
- Nair, G.N. A nonstochastic information theory for communication and state estimation. IEEE Trans. Autom. Control 2013, 58, 1497–1510.
- Rosenfeld, M. On a problem of C.E. Shannon in graph theory. Proc. Am. Math. Soc. 1967, 18, 315–319.
- Shannon, C. The zero error capacity of a noisy channel. IRE Trans. Inf. Theory 1956, 2, 8–19.
- Nair, G.N. A nonstochastic information theory for feedback. In Proceedings of the 2012 IEEE 51st IEEE Conference on Decision and Control (CDC), Maui, HI, USA, 10–13 December 2012; pp. 1343–1348.
- Tikhomirov, V.M.; Kolmogorov, A.N. ϵ-entropy and ϵ-capacity of sets in functional spaces. Uspekhi Mat. Nauk 1959, 14, 3–86.
- Lim, T.J.; Franceschetti, M. Information without rolling dice. IEEE Trans. Inf. Theory 2017, 63, 1349–1363.
- Borujeny, R.R.; Kschischang, F.R. A Signal-Space Distance Measure for Nondispersive Optical Fiber. IEEE Trans. Inf. Theory 2021, 67, 5903–5921.
- Ferng, C.S.; Lin, H.T. Multi-label classification with error-correcting codes. In Proceedings of the Asian Conference on Machine Learning, Taoyuan, Taiwan, 13–15 November 2011; pp. 281–295.
- Saberi, A.; Farokhi, F.; Nair, G. Estimation and Control over a Nonstochastic Binary Erasure Channel. IFAC-PapersOnLine 2018, 51, 265–270.
- Saberi, A.; Farokhi, F.; Nair, G.N. State Estimation via Worst-Case Erasure and Symmetric Channels with Memory. In Proceedings of the 2019 IEEE International Symposium on Information Theory (ISIT), Paris, France, 7–12 July 2019; pp. 3072–3076.
- Verma, G.; Swami, A. Error correcting output codes improve probability estimation and adversarial robustness of deep neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 8646–8656.
- Weng, T.W.; Zhang, H.; Chen, P.Y.; Yi, J.; Su, D.; Gao, Y.; Hsieh, C.J.; Daniel, L. Evaluating the robustness of neural networks: An extreme value theory approach. arXiv 2018, arXiv:1801.10578.
- Wiese, M.; Johansson, K.H.; Oechtering, T.J.; Papadimitratos, P.; Sandberg, H.; Skoglund, M. Uncertain wiretap channels and secure estimation. In Proceedings of the 2016 IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, 10–15 July 2016; pp. 2004–2008.
- Agrawal, R. Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Adv. Appl. Probab. 1995, 27, 1054–1078.
- Auer, P.; Cesa-Bianchi, N.; Freund, Y.; Schapire, R.E. The nonstochastic multiarmed bandit problem. SIAM J. Comput. 2002, 32, 48–77.
- Rangi, A.; Franceschetti, M. Online learning with feedback graphs and switching costs. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, Naha, Japan, 16–18 April 2019; pp. 2435–2444.
- Rangi, A.; Franceschetti, M. Towards a non-stochastic information theory. In Proceedings of the 2019 IEEE International Symposium on Information Theory (ISIT), Paris, France, 7–12 July 2019; pp. 997–1001.
- Rangi, A.; Franceschetti, M. Channel Coding Theorems in Non-Stochastic Information Theory. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021; pp. 2295–2300.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).