Abstract
We consider a stationary process with values in a finite set. In this paper, we present a moving average version of the Shannon–McMillan–Breiman theorem; this generalizes the corresponding classical results. A sandwich argument reduces the proof to direct applications of the moving strong law of large numbers. The result generalizes the work of Algoet and Cover [2], while relying on a similar sandwich method. It is worth noting that, in a certain sense, the two index sequences of the moving average are symmetric, i.e., if the growth rate of the index sequence with respect to the integer n is slow enough, all conclusions in this article still hold true.
MSC:
94A17
1. Introduction
Information theory is mainly concerned with stationary random processes taking values in a finite set. The strong convergence of the entropy at time n of a random process, divided by n, to a constant limit called the entropy rate of the process is known as the ergodic theorem of information theory or the asymptotic equipartition property (AEP) [1].
Its original version, proven in the 1950s for ergodic stationary processes, is known as the Shannon–McMillan theorem for convergence in mean and as the Shannon–McMillan–Breiman theorem [2,3,4] for almost everywhere convergence. Since then, generalized versions of the Shannon–McMillan–Breiman limit theorem have been developed by many authors [1,2,4,5]. Extensions have been made in the direction of weakening the assumptions on the reference measure, the state space, the index set, and the required properties of the process. For the general development, see Girardin [6] and the references therein.
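In the notation of [1,2], where $p(X_0,\dots,X_{n-1})$ denotes the probability of the observed sample path and $H$ is the entropy rate, the classical almost-everywhere statement reads:

```latex
-\frac{1}{n} \log p\!\left(X_0, X_1, \ldots, X_{n-1}\right)
\longrightarrow H \quad \text{a.e. as } n \to \infty .
```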
In statistics, smoothing data means creating an approximating function that attempts to capture important patterns in the data while leaving out noise. One of the most widely used smoothing methods is the moving average (MA). A number of authors have studied the question of almost everywhere convergence of moving averages for a measure-preserving invertible transformation of X, e.g., Akcoglu and del Junco [7]; Bellow, Jones, and Rosenblatt [8]; del Junco and Steele [9]; Schwartz [10]; and Haili and Nair [11]. Recently, Wang and Yang [12,13] proposed a new concept of the generalized entropy density and established generalized entropy ergodic theorems for time-nonhomogeneous Markov chains and for non-null stationary processes. Shi, Wang et al. [14] studied the generalized entropy ergodic theorem for nonhomogeneous Markov chains indexed by a binary tree.
Motivated by the work above, in this paper we give a moving average version of the Shannon–McMillan–Breiman theorem. The results in this paper generalize those in [2]. It is worth noting that, in some sense, the two index sequences are symmetric. In this paper, we discuss the so-called forward moving average; if the growth rate of the index sequence with respect to the integer n is slow enough, all conclusions in this article still hold true, i.e., the corresponding results for the backward moving average are also established.
The method used to prove the main results is the “sandwich” approximation approach of Algoet and Cover [2], which depends strongly on the moving strong law of large numbers: the sample entropy is asymptotically sandwiched between two functions whose limits can be determined from the moving SLLN.
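Schematically, writing $a_n$ for the window's starting index and $\phi(n)$ for its width (notation assumed here for illustration, in the spirit of [12,13]), the sandwich reads:

```latex
% Sandwich (sketch): conditioning on the infinite past bounds from below,
% the order-m Markov approximation bounds from above
-\frac{1}{\phi(n)} \log p\!\left(X_{a_n}^{a_n+\phi(n)-1} \,\middle|\, X_{-\infty}^{a_n-1}\right)
\;\lesssim\;
-\frac{1}{\phi(n)} \log p\!\left(X_{a_n}^{a_n+\phi(n)-1}\right)
\;\lesssim\;
-\frac{1}{\phi(n)} \log p_m\!\left(X_{a_n}^{a_n+\phi(n)-1}\right),
```

where the outer terms converge, by the moving SLLN, to quantities that squeeze down to the entropy rate $H$ as $m \to \infty$.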
This paper is organized as follows. In Section 2, we introduce the necessary preparatory knowledge: some required preliminaries and three lemmas. In Section 3, we give the main results, study some of their properties, and give examples of applications.
2. Preliminaries
Throughout this section, let denote a fixed probability space and let be a stationary sequence taking values from a finite set . For the sequence , denote the partial sequence by and by for . Likewise, we write , for the sequence of and , respectively. Let
and
wherever the conditioning event has positive probability. Define random variables
by setting in the corresponding definitions. Since
the conditional probability makes sense (i.e., it holds almost everywhere under the measure).
Definition 1
(see e.g., [2]). The canonical Markov approximation of order m to the probability is defined for large as
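Following [2], with $x_i^j$ denoting the block $(x_i, \ldots, x_j)$, the order-$m$ approximation keeps only $m$ symbols of conditioning:

```latex
p_m\!\left(x_0^{n-1}\right)
= p\!\left(x_0^{m-1}\right) \prod_{i=m}^{n-1} p\!\left(x_i \,\middle|\, x_{i-m}^{i-1}\right),
\qquad n > m .
```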
We will prove a new version of AEP for a stationary process . Before developing the main theme of the paper, we shall need to derive some basic lemmas. Let be a pair of positive integers such that as , and for every , .
Lemma 1.
Let be a stationary process with values from a finite set ; then, we have
and
where the base of the logarithm is taken to be 2.
Proof.
Let A be the support set of ; then,
where indicates taking expectation under measure .
Similarly, let denote the support set of . Then, we have
By Markov’s inequality and Equation (4), we have, for any ,
Noting that , we see by the Borel–Cantelli lemma that the event occurs only finitely often, almost everywhere.
By the arbitrariness of , we have
Applying the same arguments using Markov’s inequality to Equation (3), we obtain
This proves the lemma. □
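For concreteness, the Markov-inequality step can be sketched as follows (again writing $a_n$ and $\phi(n)$ for the moving-average indices, an assumed notation): since the likelihood ratio has expectation at most 1, for any $\epsilon > 0$,

```latex
P\!\left\{ \frac{1}{\phi(n)} \log
\frac{p_m\!\left(X_{a_n}^{a_n+\phi(n)-1}\right)}{p\!\left(X_{a_n}^{a_n+\phi(n)-1}\right)}
\ge \epsilon \right\}
\le 2^{-\phi(n)\epsilon}\,
\mathbb{E}\!\left[
\frac{p_m\!\left(X_{a_n}^{a_n+\phi(n)-1}\right)}{p\!\left(X_{a_n}^{a_n+\phi(n)-1}\right)}
\right]
\le 2^{-\phi(n)\epsilon},
```

and the Borel–Cantelli lemma applies whenever $\sum_n 2^{-\phi(n)\epsilon} < \infty$, which the growth condition imposed on $\phi(n)$ guarantees.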
Lemma 2.
(SLLN for MA): For a stationary stochastic process ,
and
where , .
Proof.
By the fact that , we have
Note that
It is not difficult to verify that . An argument similar to the one used in Lemma 1 shows that
Let , and define
Since
It is straightforward to show that
Note that
By Equations (8) and (9) and the property of superior limits, we have
Setting , dividing both sides of Equation (10) by s, we obtain
Using the two elementary inequalities,
It follows from Equation (11) that
From Equations (11)–(13), we have
Putting in Equation (14), we obtain
Replacing by in the above argument, we can obtain
These imply that
Note that ; therefore, we have, by Equation (15),
Since , Equation (5) follows immediately from Equations (7) and (16).
Similarly, let s be a nonzero real number, and define
The remainder of the argument is analogous to that in proving Equation (5) and is left to the reader. □
Lemma 3.
(No gap): and .
Proof.
We know that for stationary processes , so it remains to show that .
Let . Note that is integrable. Since all the random variables are discrete, we may write
Therefore, and is measurable relative to the field ; is a non-negative supermartingale, and hence it converges to an integrable limit function for all .
Note that, for any m,
where the last equation follows from stationarity.
Since is finite and is bounded and continuous in p for all , the bounded convergence theorem allows interchange of expectation and limit, yielding
Thus, . □
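The monotone limits behind Lemma 3 can be summarized as follows (by stationarity, $H(X_m \mid X_0^{m-1}) = H(X_0 \mid X_{-m}^{-1})$):

```latex
H\!\left(X_m \,\middle|\, X_0^{m-1}\right) \searrow H,
\qquad
H\!\left(X_0 \,\middle|\, X_{-m}^{-1}\right) \searrow
\mathbb{E}\!\left[-\log p\!\left(X_0 \,\middle|\, X_{-\infty}^{-1}\right)\right],
```

and the lemma asserts that the two limits coincide, i.e., there is no gap between them.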
3. Main Results
With the preliminaries accounted for, we wish to use Lemma 1 to conclude that
It is not easy to prove Equation (17). However, the closely related quantities and are easily identified as entropy rates.
Recall that the entropy rate is given by
Of course, by stationarity and the fact that conditioning does not increase entropy. It will be crucial that .
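For a finite-valued stationary process, the two standard expressions for the entropy rate agree [1]:

```latex
H = \lim_{n\to\infty} \frac{1}{n} H\!\left(X_0, X_1, \ldots, X_{n-1}\right)
  = \lim_{n\to\infty} H\!\left(X_n \,\middle|\, X_0^{n-1}\right).
```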
With the help of the preceding lemmas, we can now prove the following theorem:
Theorem 1.
(AEP) If H is the entropy rate of a finite-valued stationary process , then it holds that
Remark 1.
In the case , Theorem 1 reduces to the famous Shannon–McMillan–Breiman theorem, which is the fundamental theorem of information theory. Let ; this gives a delayed-average version of the AEP.
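As a quick numerical illustration of Theorem 1 (a sketch, not part of the proof): for an i.i.d. Bernoulli(q) process the entropy rate is $H = -q\log_2 q-(1-q)\log_2(1-q)$, and the per-symbol log-likelihood of a moving window should approach H. The index choices below (window of width phi(n) = n starting at a_n = n) are illustrative assumptions, not the paper's notation.

```python
import math
import random

def entropy_rate(q: float) -> float:
    """Entropy (bits/symbol) of an i.i.d. Bernoulli(q) source."""
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def window_sample_entropy(xs, q: float) -> float:
    """-(1/|xs|) log2 p(xs) for an i.i.d. Bernoulli(q) model."""
    log_p = sum(math.log2(q) if x == 1 else math.log2(1 - q) for x in xs)
    return -log_p / len(xs)

random.seed(0)
q = 0.3
H = entropy_rate(q)

# Forward moving average: window of width phi(n) = n starting at a_n = n.
n = 20000
sample = [1 if random.random() < q else 0 for _ in range(2 * n)]
window = sample[n:2 * n]  # X_{a_n}, ..., X_{a_n + phi(n) - 1}
print(abs(window_sample_entropy(window, q) - H))  # small for large n
```

The discrepancy shrinks as the window width grows, in line with the a.e. convergence asserted by the theorem.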
Proof.
We argue that the sequence of random variables is asymptotically sandwiched between the upper bound and the lower bound for all . The AEP will follow since and .
From Lemma 1, we have
which we rewrite, taking the existence of the limit into account, as
for all m. Also, from Lemma 1, we have
which we rewrite as
From the definition of in Lemma 2, we have, by putting together Equations (6) and (7),
for all m.
But, by Lemma 3, . Consequently, the theorem follows. □
Now, we give some interesting applications of our main results in the next examples.
Example 1.
Let be independent, identically distributed random variables drawn from the probability mass function ; then,
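In the i.i.d. case the claim follows directly, since the summands $-\log p(X_i)$ are themselves i.i.d. with mean $H(p)$; in the assumed $a_n$, $\phi(n)$ window notation,

```latex
-\frac{1}{\phi(n)} \log p\!\left(X_{a_n}, \ldots, X_{a_n+\phi(n)-1}\right)
= \frac{1}{\phi(n)} \sum_{i=a_n}^{a_n+\phi(n)-1} \bigl(-\log p(X_i)\bigr)
\longrightarrow \mathbb{E}\bigl[-\log p(X_0)\bigr] = H(p) \quad \text{a.e.}
```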
Example 2.
Let
Let be drawn i.i.d. according to this distribution; then,
Example 3.
Let be independent identically distributed random variables drawn according to the probability mass function . Thus, . Let , where q is another probability mass function on ; then,
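The limit in Example 3 is the i.i.d. computation under the wrong model $q$: since the summands $-\log q(X_i)$ are i.i.d. under $p$, the moving SLLN gives (in the assumed $a_n$, $\phi(n)$ notation)

```latex
-\frac{1}{\phi(n)} \log q\!\left(X_{a_n}, \ldots, X_{a_n+\phi(n)-1}\right)
\longrightarrow \mathbb{E}_p\!\left[-\log q(X_0)\right]
= \sum_{x} p(x) \log \frac{1}{q(x)}
= H(p) + D(p \,\|\, q) \quad \text{a.e.}
```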
Since convergence almost everywhere implies convergence in probability, Theorem 1 has the following implication:
Definition 2.
The typical set with respect to is the set of sequences with the following properties:
As a consequence of Theorem 1, we can show that the set has the following properties:
Proposition 1.
Let be independent, identically distributed random variables drawn from the probability mass function ; then,
(1) If , then
(2) for sufficiently large n.
(3) , where denotes the number of elements in set A.
(4) for sufficiently large n.
Proof.
Property (1) is immediate from the definition of .
Property (2) follows directly from Theorem 1, since the probability of the event tends to 1 as .
Thus, for any , there exists an such that for all , we have
Setting , we have the following:
Finally, for sufficiently large n, ,
where the second inequality follows from Definition 2. Therefore,
This completes the proof of the proposition. □
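The bounds in Proposition 1 can be checked by brute force for a small i.i.d. example (an illustrative sketch; the plain window 0..n-1, i.e., a_n = 0, and the helper name typical_set are assumptions of this demo, not the paper's notation):

```python
import itertools
import math

def typical_set(n: int, q: float, eps: float):
    """Enumerate the eps-typical set for n i.i.d. Bernoulli(q) symbols."""
    H = -q * math.log2(q) - (1 - q) * math.log2(1 - q)
    typical, prob = [], 0.0
    for xs in itertools.product([0, 1], repeat=n):
        # Probability of the sequence under the i.i.d. Bernoulli(q) model.
        p = q ** sum(xs) * (1 - q) ** (n - sum(xs))
        # Definition 2: 2^{-n(H+eps)} <= p(xs) <= 2^{-n(H-eps)}.
        if 2 ** (-n * (H + eps)) <= p <= 2 ** (-n * (H - eps)):
            typical.append(xs)
            prob += p
    return typical, prob, H

n, q, eps = 12, 0.3, 0.2
T, prob, H = typical_set(n, q, eps)
# Property (3): the typical set has at most 2^{n(H+eps)} elements.
print(len(T) <= 2 ** (n * (H + eps)))  # True
# Probability of the typical set (tends to 1 as n grows).
print(round(prob, 3))
```

The size bound in property (3) holds for every n, since each typical sequence carries probability at least $2^{-n(H+\epsilon)}$ and the total probability is at most 1.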
Example 4.
Let be i.i.d. . Let . Let and . Then we have the following:
(1) ;
(2) ;
(3) , for all n;
(4) , for sufficiently large n.
Proof.
(1) By Theorem 1, the probability that the sample sequence is typical goes to 1.
(2) By the strong law of large numbers for moving averages, we have . So there exists and such that for all , and there exists such that for all . So for all ,
So for any there exists such that for all ; therefore, .
(3) By the law of total probability . Also, for , from Theorem 1 in the text, . Combining these two equations gives
Multiplying through by gives the result .
(4) Since from (2) , there exists N such that for all . From Theorem 1 in the text, for , . So, combining these two gives
Multiplying through by gives the result for sufficiently large n. □
Author Contributions
Writing—original draft, Y.R. and Z.W. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by NSF of Anhui University China (No. KJ2021A0386).
Data Availability Statement
No new data were created or analyzed in this study.
Acknowledgments
It is a pleasure to acknowledge our debt to Weicai Peng who suggested to us the problem addressed herein. We are grateful to the three anonymous referees and the editor for the useful comments and suggestions.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley-Interscience: Hoboken, NJ, USA, 2005.
- Algoet, P.H.; Cover, T.M. A sandwich proof of the Shannon–McMillan–Breiman theorem. Ann. Probab. 1988, 16, 899–909.
- Breiman, L. The individual ergodic theorem of information theory. Ann. Math. Stat. 1957, 28, 809–811.
- McMillan, B. The basic theorems of information theory. Ann. Math. Stat. 1953, 24, 196–219.
- Neshveyev, S.; Størmer, E. The McMillan theorem for a class of asymptotically abelian C*-algebras. Ergod. Theory Dyn. Syst. 2002, 22, 889–897.
- Girardin, V. On the different extensions of the ergodic theorem of information theory. In Recent Advances in Applied Probability; Springer Science+Business Media: Berlin/Heidelberg, Germany, 2005; pp. 163–179.
- Akcoglu, M.A.; del Junco, A. Convergence of averages of point transformations. Proc. Am. Math. Soc. 1975, 49, 265–266.
- Bellow, A.; Jones, R.; Rosenblatt, J.M. Convergence for moving averages. Ergod. Theory Dyn. Syst. 1990, 10, 43–62.
- del Junco, A.; Steele, J.M. Moving averages of ergodic processes. Metrika 1977, 24, 35–43.
- Schwartz, M. Polynomially moving ergodic averages. Proc. Am. Math. Soc. 1988, 103, 252–254.
- Haili, H.K.; Nair, R. Optimal continued fractions and the moving average ergodic theorem. Period. Math. Hung. 2013, 66, 95–103.
- Wang, Z.Z.; Yang, W.G. The generalized entropy ergodicity theorem for nonhomogeneous Markov chains. J. Theor. Probab. 2016, 29, 761–775.
- Wang, Z.Z.; Yang, W.G. Markov approximation and the generalized entropy ergodic theorem for non-null stationary process. Proc. Indian Acad. Sci. (Math. Sci.) 2020, 130, 13.
- Shi, Z.Y.; Wang, Z.Z.; Zhong, P.P. The generalized entropy ergodicity theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree. J. Theor. Probab. 2022, 35, 1367–1390.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).