Proof of Lemma 18. Using
and
, it is convenient to define a joint probability density function over
as
which changes only stepwise in the x-direction. Note that
of (86) is the y-marginal of
. This gives
We proceed in two stages. First we quantize
by rounding it down and check the effect of this on the LHS of (84)–(86). Then we complement the total probability back to 1, so that the type
is conserved, and check the effect of this on the RHS of (84)–(86).
The quantization of
is done by first replacing it with its infimum in each rectangle
and then quantizing this infimum down to the nearest value
,
:
Due to (8), the integral of
over
can be smaller than 1 only by an integer multiple of
. The resulting difference from
at each point
can be bounded by a sum of two terms as
where
and K is the parameter from (12).
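As a rough numerical illustration of the rounding-down step (not part of the proof), the following Python sketch quantizes the cell values of a piecewise-constant density down to a grid. The grid step `delta`, the cell area `cell_area`, and the example values are hypothetical placeholders, not the quantities defined in (A5) to (A7). It only checks two elementary facts: rounding down never increases a value, and it loses less than one grid step per cell, so the quantized function integrates to no more than the original.

```python
import math

def quantize_down(values, delta):
    """Round each nonnegative value down to the nearest integer multiple of delta."""
    return [math.floor(v / delta) * delta for v in values]

# Hypothetical piecewise-constant density: one value per cell, each cell of area cell_area.
cell_area = 0.5
values = [0.83, 0.61, 0.37, 0.19]   # values sum to 2.0, so the total mass is 1.0
delta = 0.25                        # hypothetical quantization step

quantized = quantize_down(values, delta)
pointwise_loss = [v - q for v, q in zip(values, quantized)]

mass_before = sum(values) * cell_area
mass_after = sum(quantized) * cell_area
mass_loss = mass_before - mass_after   # total probability lost by rounding down
```

In this toy example the lost mass is what a second stage would have to add back so that the total returns to 1.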
For (86) we will require the y-marginal of
from (A6), defined in the usual manner:
where the equality (a) follows from (A3), (A5) and (A6), and the inequality (b) follows by (A7). Then
where the last inequality follows by Lemma 4, (A7), (7), with
. Note that the previously defined
, while
if and only if
and
.
Now consider the LHS of (84). Note that each function
in (84) is bounded and has a finite variance. It follows that it has a finite differential entropy. With (A3) we can rewrite the LHS of (84) as
Let us examine the possible increase in (A10) when
is replaced with
defined by (A5) and (A6). For this, let us define a set in
with respect to the parameter h of (A7):
which is a countable union of disjoint rectangles by the definition of
in (A3). Then
Note that the minimum of the function
occurs at
. Then for
we have
for all
and the first of the two terms in (A12) is upper-bounded as
where the equality (∗) is appropriate for the case when the upper bound is positive, with the definitions:
Next we upper-bound the entropy of the probability density function
on the RHS of (A13) by that of a Gaussian PDF. By (A4) we have
So we can rewrite the bound of (A13) in terms of p defined by (A14):
From (A11) and (A14) it is clear that
as
. To relate the two, let us rewrite the inequality in (A15) as
where we use the disk set
, centered around zero. This results in the following upper bound on p in terms of h:
where
. Substituting the LHS of (A17) in (A16) in place of p, we obtain the following upper bound on the first term of (A12) in terms of h of (A7) and (A11):
In the second term of (A12) for
the integrand can be upper-bounded by Lemma A1 with its parameters t and
such that
This gives
where
is the total area of A. To find an upper bound on
, we use (A4):
where in (a) we use the disk set
, centered around zero, of the same total area as A, and the resulting property that
. So that
Continuing (A19), we therefore obtain the following upper bound on the second term in (A12):
Putting (A12), (A18) and (A21) together:
where
and h are as in (A17) and (A7), respectively. So if
in (A7), then the possible increase in (A10) caused by the substitution of
in place of
is at most
.
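The Gaussian upper bound used in this part of the proof (following (A13)) rests on the standard fact that, among densities with a given variance, the Gaussian has the largest differential entropy. A minimal one-dimensional check in Python follows. The proof works with a density over two coordinates and its own constraint (A4), so the uniform and Laplace examples, the parameters `a` and `b`, and the function names are illustrative assumptions only.

```python
import math

def gaussian_entropy(variance):
    """Differential entropy (nats) of a 1-D Gaussian with the given variance."""
    return 0.5 * math.log(2 * math.pi * math.e * variance)

def uniform_entropy(a):
    """Differential entropy (nats) of the uniform density on [-a, a]."""
    return math.log(2 * a)

def laplace_entropy(b):
    """Differential entropy (nats) of the Laplace density with scale b."""
    return 1 + math.log(2 * b)

a, b = 1.5, 0.7              # hypothetical parameters
uniform_var = a * a / 3      # variance of Uniform[-a, a]
laplace_var = 2 * b * b      # variance of Laplace(0, b)

# Gap between the Gaussian entropy at matched variance and the actual entropy.
gap_uniform = gaussian_entropy(uniform_var) - uniform_entropy(a)
gap_laplace = gaussian_entropy(laplace_var) - laplace_entropy(b)
```

Both gaps are positive and do not depend on the scale parameters, which is why matching the second moment is enough to obtain such a bound.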
Later on, for the RHS of (84)–(86), we will also require the loss in the total probability incurred in the replacement of
by
. This loss is strictly positive and tends to zero with h of (A7):
where the set A in (a) is defined in (A11), (b) follows by (A14) and (A7), and (c) follows by (A17) and (A20).
The LHS of (86)
Consider next the LHS of (86). Since
and has a finite variance, its differential entropy is finite. Let us examine the possible decrease in the LHS of (86) when
is replaced with
defined in (A8). For this, let us define a set in
with respect to the parameter
of (A9):
which is a countable union of disjoint open intervals. Then
For
we have
for all
and the first of the two terms in (A25) is non-positive:
In the second term of (A25) for
, the integrand can be upper-bounded by Lemma A1 with its parameters t and
such that
where K is the parameter from (12). This gives
where
is the total length of
. It remains to find an upper bound on
. We use (A4):
where in (a) we use the interval set
, centered around zero, and of the same total length as
with the resulting property that
. So that
where
. Continuing (A27), with (A28) we obtain the following upper bound on the second term in (A25), which by (A26) is also an upper bound on the sum of both terms of (A25):
So if
in (A9), then the possible decrease caused by the substitution of
in place of
on the LHS of (86) is at most
.
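One natural reading of the centered-interval argument in this part of the proof is the rearrangement fact that, among sets of a given total length, the interval centered at zero has the smallest integral of y². That this is the property intended here is an assumption. The Python sketch below only checks the fact for one hypothetical union of disjoint intervals `B`; the helper names are likewise illustrative.

```python
def second_moment(intervals):
    """Integral of y**2 over a union of disjoint intervals [(lo, hi), ...]."""
    return sum((hi**3 - lo**3) / 3 for lo, hi in intervals)

def total_length(intervals):
    """Total length of a union of disjoint intervals."""
    return sum(hi - lo for lo, hi in intervals)

# Hypothetical set: a union of disjoint intervals on the real line.
B = [(-2.0, -1.5), (0.25, 1.0), (3.0, 3.5)]
length = total_length(B)

# Interval centered at zero with the same total length as B.
centered = [(-length / 2, length / 2)]

moment_B = second_moment(B)
moment_centered = second_moment(centered)   # equals length**3 / 12
```

Combined with a second-moment constraint, an inequality of this kind turns a bound on the integral of y² into a bound on the total length of the set.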
The LHS of (85)
Let us define two functions of
:
Then with
defined in (A6), we can obtain a lower bound for the expression on the LHS of (85):
where (a) follows because
and
, (b) follows by (A3) and Jensen’s inequality for the concave (∩) function
, and (c) follows by the condition of the lemma.
Joint type
Let us define two mutually complementary probability masses for each
:
where
is defined in (A30). It follows from (A6) and (8) that each number
is an integer multiple of
and
for each
. Then a joint type can be formed with the two definitions above:
such that
and
for each
.
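The bookkeeping behind such a type can be sketched in Python with exact rational arithmetic. Masses rounded down to integer multiples of 1/n leave a deficit that is itself an integer multiple of 1/n, so adding it back restores a total of exactly 1 while keeping every mass on the 1/n grid. The block length `n`, the example masses, and the choice to place the whole deficit on the first cell are assumptions for illustration; the proof uses its own pair of complementary masses.

```python
import math
from fractions import Fraction

def round_down_to_grid(masses, n):
    """Round each nonnegative mass down to an integer multiple of 1/n."""
    return [Fraction(math.floor(m * n), n) for m in masses]

n = 10                                                    # hypothetical block length
masses = [Fraction(5, 17), Fraction(7, 17), Fraction(5, 17)]   # sums to 1

rounded = round_down_to_grid(masses, n)
deficit = 1 - sum(rounded)            # probability lost by rounding down

# Complement the total back to 1 (here: all of the deficit goes to the first cell).
complemented = [rounded[0] + deficit] + rounded[1:]
counts = [p * n for p in complemented]   # occurrence counts in a length-n sequence
```

Each entry of `counts` is an integer and the counts sum to n, which is the defining property of a type with denominator n.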
The RHS of (85)
Having defined
and
, let us examine the possible decrease in the expression found on the RHS of (85) when
inside that expression is replaced with
:
where (a) follows by (A34), (b) follows according to the definitions (A30) and (A33), (c) follows because
and because
then (d) follows by the upper bound on
of (A23). Since by definition (A32) we also have
which is exactly the beginning of (A31), combining (A31) and (A35) gives (85). The remainder of the proof for (84) and (86) follows easily by Lemma A1 applied to the corresponding discrete entropy expressions with probability masses.
In order to upper-bound the expression on the RHS of (84), it is convenient to write:
where (a) follows by (A34); for (b) in the first term we apply the upper bound of Lemma A1 with its parameters
and
with
by (A34), (A36), for n sufficiently large such that
; while in the second term we use the equality (A36) and (7); (c) follows for
since
when positive; (d) and (e) follow respectively by the equality (A36) and the inequality (A23). Now since
the inequality in (84) follows by comparing (A10), (A22), and (A37).
The RHS of (86)
With
and
we have
where (a) follows by (A34); for (b) in the first term, we apply the lower bound of Lemma A1 with its parameters
and
with
by (A34) and (A36), for n sufficiently large such that
, while in the second term, we use the equality (A36) and (7); (c) follows because
whenever positive; (d) and (e) follow, respectively, by the equality (A36) and the inequality (A23). From (A8) and (A32) we observe that
. Since the function
is piecewise constant in
by the definition of
, it follows that
Then the inequality (86) follows by comparing (A29) and (A38). This concludes the proof of Lemma 18. □