A Dynamical Systems-Based Hierarchy for Shannon, Metric and Topological Entropy

A rigorous dynamical systems-based hierarchy is established for the definitions of entropy of Shannon (information), Kolmogorov–Sinai (metric) and Adler, Konheim & McAndrew (topological). In particular, metric entropy, with the imposition of some additional properties, is proven to be a special case of topological entropy and Shannon entropy is shown to be a particular form of metric entropy. This is the first of two papers aimed at establishing a dynamically grounded hierarchy comprising Clausius, Boltzmann, Gibbs, Shannon, metric and topological entropy in which each element is ideally a special case of its successor or some kind of limit thereof.


Introduction
Entropy, which among a variety of other things can be roughly viewed as a measure of uncertainty (cf. [1]), has been and remains a fascinating and still not completely understood concept with an amazing range of applications, including quantum and ecological systems [2]. We have discovered that many mathematical, scientific and engineering colleagues share our abiding curiosity about rigorous connections among the myriad definitions of entropy. For example, what, if any, is the relationship between classical Clausius entropy and topological entropy or even Shannon entropy? There have been several extensive investigations of various types of entropy, including historical accounts of the relevant developments and informative investigations of the linkages among the various forms of entropy, such as in [3–13], but they all appear to be somewhat lacking in terms of identification of truly compelling unifying themes for the multifarious definitions.
Our intention in this and a subsequent paper is to provide a rigorous partial answer to this question by proving that there is a dynamical systems thread connecting all of the following definitions of entropy leading, with the possible addition of some mild assumptions, to the hierarchy Clausius ← Gibbs ← Boltzmann ← Shannon ← metric ← topological. By rigor as used here, we mean that any connections and special case identifications shall be proved. In this paper, we shall establish a dynamical systems thread connecting Shannon, metric and topological entropy; in a forthcoming paper we hope to complete the hierarchy by elucidating the connections among the first three of the above and their link to the last three. On a related note, we recommend a recent interesting functorial connection among these three entropies introduced by Delvenne [14].
The material to be presented is organized as follows. In the interest of completeness, we first present the definitions and some basic properties of topological, metric and Shannon entropy. Then, in Section 3, we prove that metric entropy is a special case of topological entropy if one adds just a few assumptions. Equality of metric and topological entropy can and shall be established in the following two forms: constructively, namely by showing that certain measurable dynamical systems can be given a topology, which might be quite far removed from some of the more usual types, for which the two entropies in question are equal; and by comparison, where one can sometimes determine when a given topology on a measurable dynamical system yields equality of the entropies. Next, in Section 4, we show that Shannon entropy is a special case of metric entropy when formulated in a certain dynamical context. In particular, one can frame the underlying information theory foundation as a measurable dynamical system in which the metric and Shannon entropies are equal. As an example, we show how the result for a binary alphabet can be obtained essentially from scratch, in stark contrast to the general result due mainly to Kolmogorov and Sinai that involves very deep and extensive analysis. Finally, in Section 5 we summarize some of the conclusions reached and outline our plans for related future work on extending the hierarchy for entropies.

Topological, Metric and Shannon Entropy
In this section we define topological, metric and Shannon entropy, list some of their properties, and describe several basic relationships among them. We begin with the topological entropy of a discrete dynamical system on a topological space.

Topological Entropy
Let $X$ be a nonempty Hausdorff topological space with topology $\mathcal{T}$ (comprising the open subsets of the space). A topological discrete (semi-) dynamical system (or just discrete dynamical system for short) is a continuous action of the form $F : \mathbb{N}^* \times X \to X$, where $F(n, x) := f^n(x)$, $f : X \to X$ is a continuous map, $f^n$ is the $n$-th iterate of the map under composition $\circ$, with the convention that $f^0$ is the identity map $\mathrm{id} = \mathrm{id}_X$, and $\mathbb{N}^*$ is the abelian semigroup $\{0\} \cup \mathbb{N}$ under addition. There is a standard nomenclature (see, for example, [15]) that includes such definitions as (positive semi-) orbits $O(x) := \{f^n(x) : n \in \mathbb{N}^*\}$ and fixed points, where $f(x) = x$, so that $O(x) = \{x\}$. For convenience, we shall denote the dynamical system by $D := (f, X)$. Assuming $X$ to be compact, Adler et al. [16] defined the topological entropy of the dynamical system as follows: If $\mathcal{U} = \{U\}$ is any open covering of $X$, the minimum cardinality $N(\mathcal{U})$ over all subcoverings must be finite owing to the compactness of $X$. Now, if $\mathcal{U}$ and $\mathcal{V} = \{V\}$ are two open coverings of $X$, so is their common refinement $\mathcal{U} \vee \mathcal{V} := \{U \cap V : U \in \mathcal{U}, V \in \mathcal{V}\}$, every set of which is contained in a set of both $\mathcal{U}$ and $\mathcal{V}$, with the refinement relation being typically denoted as $\mathcal{U}, \mathcal{V} \prec \mathcal{U} \vee \mathcal{V}$. By iterating refinement, for each $n \in \mathbb{N}$ we obtain the open covering $\mathcal{U}^n := \mathcal{U} \vee f^{-1}(\mathcal{U}) \vee \cdots \vee f^{-n+1}(\mathcal{U})$, and it can be shown that the limit $h_{\mathcal{T}}(\mathcal{U}, f) := \lim_{n \to \infty} \frac{1}{n} \log N(\mathcal{U}^n)$ exists. Then, the topological entropy is defined as $h_{\mathcal{T}}(f) := \sup\{h_{\mathcal{T}}(\mathcal{U}, f) : \mathcal{U} \in OC(X)\}$, where $OC(X)$ is the set of all open coverings of $X$.
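As a quick sanity check of the definition, the following worked computation treats the identity map, for which the entropy vanishes. It is only a sketch: it writes $\mathcal{U}^n$ for the $n$-fold iterated refinement $\mathcal{U} \vee f^{-1}(\mathcal{U}) \vee \cdots \vee f^{-n+1}(\mathcal{U})$ and $N(\cdot)$ for the minimal subcover cardinality, as in the Adler et al. construction.

```latex
% Worked example: f = id_X on a compact space X.
% Since f^{-k}(\mathcal{U}) = \mathcal{U} for every k, the iterated refinement is
\[
  \mathcal{U}^n = \underbrace{\mathcal{U} \vee \cdots \vee \mathcal{U}}_{n \text{ times}} .
\]
% Each U in \mathcal{U} equals U \cap \cdots \cap U, so \mathcal{U} \subset \mathcal{U}^n
% and hence N(\mathcal{U}^n) \le N(\mathcal{U}).
% Conversely, each member of \mathcal{U}^n lies inside a member of \mathcal{U},
% so N(\mathcal{U}) \le N(\mathcal{U}^n). Therefore
\[
  N(\mathcal{U}^n) = N(\mathcal{U})
  \quad\Longrightarrow\quad
  \lim_{n \to \infty} \frac{1}{n} \log N(\mathcal{U}^n) = 0 ,
\]
% and taking the supremum over all open coverings gives zero entropy for id_X.
```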
It should be remarked that although we have confined ourselves to compact spaces, it is possible to extend the definition of topological entropy to noncompact spaces (cf. [19–21]). There is an analogous definition of topological entropy for continuous dynamical systems that applies to continuous (usually continuously differentiable) flows, which we include for possible future reference. It applies to continuous actions of the type $\varphi : \mathbb{R} \times X \to X$, where $\mathbb{R}$ is the topological abelian group of real numbers with respect to addition. These actions arise naturally from solutions of autonomous systems of ordinary differential equations satisfying conditions guaranteeing global existence and uniqueness. There is the usual notation for these systems, which includes, for example, the positive semiorbit $\{\varphi(t, x) : t \ge 0\}$ through $x \in X$. The associated time-1 map $\varphi^1$, which generates a discrete dynamical system, is defined as $\varphi^1(x) := \varphi(1, x)$, and the topological entropy of the continuous dynamical system is $h_{\mathcal{T}}(\varphi) := h_{\mathcal{T}}(\varphi^1)$ (3). Finally, we note in passing that the topological entropy in both the discrete and continuous cases can be intuitively viewed as the exponential growth rate of the number of distinct orbits with discrete or continuous time, respectively.

Some Properties of Topological Entropy
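Obviously, the topological entropy is nonnegative, and one can use the basic definitions and a bit of straightforward analysis (as in [15,16]) to prove a number of properties, including the following: (TE6) If a sequence $\{\mathcal{U}_n\}$ of open covers is refining, meaning that $\mathcal{U}_n \prec \mathcal{U}_{n+1}$ for every $n \in \mathbb{N}$ and for every open cover $\mathcal{V}$ of $X$ there exists an $n$ such that $\mathcal{V} \prec \mathcal{U}_n$, then $\{h_{\mathcal{T}}(\mathcal{U}_n, f)\}$ is a nondecreasing sequence, with $h_{\mathcal{T}}(f) = \lim_{n \to \infty} h_{\mathcal{T}}(\mathcal{U}_n, f) = \sup\{h_{\mathcal{T}}(\mathcal{U}_n, f) : n \in \mathbb{N}\}$.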

Kolmogorov-Sinai Metric Entropy
A discrete measurable dynamical system consists of a nonempty set $X$, a σ-algebra $\mathcal{M}$ of µ-measurable subsets of $X$, where µ is a probability measure (with $\mu(X) = 1$), and a measurable function $f : X \to X$. The iterates of $f$ obviously also define a discrete dynamical system, with the usual notions of orbits, fixed points and the like.
The metric (Kolmogorov-Sinai or just K-S) entropy requires a number of introductory elements for its definition. A measurable partition of $X$ is a finite, pairwise-disjoint sequence $\mathcal{P} = \{Q_1, \dots, Q_m\}$ of µ-measurable sets of positive measure such that $X = \bigcup_{k=1}^{m} Q_k$, and it is convenient to denote the set of all such measurable partitions of $X$ as $P(X)$. The entropy of a measurable partition is defined as $H(\mathcal{P}) := -\sum_{k=1}^{m} \mu(Q_k) \log \mu(Q_k)$. As with open coverings in the definition of topological entropy, we can define a common refinement of measurable partitions $\mathcal{P}$ and $\tilde{\mathcal{P}}$ as $\mathcal{P} \vee \tilde{\mathcal{P}} := \{Q \cap \tilde{Q} : (Q, \tilde{Q}) \in \mathcal{P} \times \tilde{\mathcal{P}}\}$, for which the refinement relation is analogously denoted by $\mathcal{P}, \tilde{\mathcal{P}} \prec \mathcal{P} \vee \tilde{\mathcal{P}}$. Iterating refinement in a manner analogous to that used for topological entropy, we define measurable partitions associated to $f$ of the form $\mathcal{P}^n := \mathcal{P} \vee f^{-1}(\mathcal{P}) \vee \cdots \vee f^{-n+1}(\mathcal{P})$, and define $H(\mathcal{P}, f) := \lim_{n \to \infty} \frac{1}{n} H(\mathcal{P}^n)$, where the limit can be readily shown to exist. Then, we define the metric entropy as $h_\mu(f) := \sup\{H(\mathcal{P}, f) : \mathcal{P} \in P(X)\}$ and take note of the strong similarity with the definition of topological entropy. It should also be mentioned that just as for the case of topological entropy, we can define metric entropy for continuous measurable dynamical systems using the time-1 map of the flow as in (3). As with the topological entropy, there are intuitive characterizations of metric entropy, two of which are the following: the exponential growth rate of typical orbits; and the maximum of the rate of extractable information.
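The ingredients of this definition (partition entropy, common refinement, preimage partition, iterated refinement) can be made concrete on a finite toy system. The sketch below is illustrative only: the rotation on twelve points, the uniform measure and all function names are assumptions chosen for the example, not constructions from the text. On a finite space the entropy of any refined partition is bounded by the logarithm of the number of points, so the quotient of the refined partition entropy by n must tend to zero.

```python
from math import log

def entropy(partition, mu):
    """H(P) = -sum mu(Q) log mu(Q) over the cells Q of P."""
    masses = [sum(mu[x] for x in cell) for cell in partition]
    return -sum(m * log(m) for m in masses if m > 0)

def join(p, q):
    """Common refinement P v Q: all nonempty intersections of cells."""
    return [a & b for a in p for b in q if a & b]

def preimage(f, partition, points):
    """f^{-1}(P): the nonempty cells {x : f(x) in Q} for Q in P."""
    cells = [frozenset(x for x in points if f(x) in cell) for cell in partition]
    return [c for c in cells if c]

def iterated(f, partition, points, n):
    """P^n = P v f^{-1}(P) v ... v f^{-(n-1)}(P)."""
    result, current = partition, partition
    for _ in range(n - 1):
        current = preimage(f, current, points)   # next preimage partition
        result = join(result, current)
    return result

# Hypothetical toy system: rotation by one step on Z/12 with uniform measure.
points = range(12)
mu = {x: 1 / 12 for x in points}
f = lambda x: (x + 1) % 12
P = [frozenset(range(0, 6)), frozenset(range(6, 12))]

# H(P^n)/n for growing n; bounded above by log(12)/n, hence tending to 0.
rates = [entropy(iterated(f, P, points, n), mu) / n for n in (1, 2, 4, 8, 16)]
```

For this system the refined partition consists of singletons once n reaches 8, after which the numerator stays at log 12 and the rate decays like 1/n.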

Basic Properties of K-S Entropy
The nonnegativity of K-S entropy is clear from its definition, and we have the following basic properties that are clearly analogs of those for topological entropy, the last of which follows from the concavity of −x log x (see [10,15,22–25]).
is refining -meaning that P n ≺ P n+1 for every n ∈ N and P n :

Shannon Entropy
In contrast with topological and metric entropy, Shannon entropy (see [26,27]) has an information theory rather than a dynamical system foundation. The basic elements can be distilled as follows: Let $S := \{s_1, \dots, s_m\}$ be a nonempty finite set of symbols or messages (sometimes referred to as the alphabet) with a discrete probability $p$ assigned to each, such that $p(s_i) \ge 0$ for all $1 \le i \le m$ and $p(s_1) + \cdots + p(s_m) = 1$. Then, the Shannon entropy of the message ensemble is $H(S) := -\sum_{i=1}^{m} p(s_i) \log p(s_i)$ (7), with the convention $0 \log 0 := 0$. This is just the average or expected value of the information content of the message ensemble X, which is very much like the entropy of a measurable partition used in the definition of metric entropy.
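A minimal sketch of the Shannon entropy of a finite message ensemble follows. The function name `shannon_entropy` and the optional `base` parameter are illustrative assumptions; the natural logarithm is used by default, and zero-probability symbols are skipped in keeping with the usual convention that 0 log 0 is 0.

```python
from math import log

def shannon_entropy(probs, base=None):
    """Return -sum p log p, skipping zero-probability symbols (0 log 0 := 0)."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probs must be nonnegative and sum to 1")
    h = -sum(p * log(p) for p in probs if p > 0)
    return h if base is None else h / log(base)   # change of logarithm base

fair_coin = shannon_entropy([0.5, 0.5])          # log 2 in nats
one_bit = shannon_entropy([0.5, 0.5], base=2)    # the same quantity in bits
certain = shannon_entropy([1.0, 0.0])            # a sure message carries no uncertainty
uniform4 = shannon_entropy([0.25] * 4)           # log 4, uniform weights on four symbols
```

A biased coin such as `[0.7, 0.3]` gives a strictly smaller value than the fair coin, matching the reading of entropy as average uncertainty.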

Several Properties of Shannon Entropy
It follows from (7) that the Shannon entropy is nonnegative, and there are several other readily verified properties such as:

A Relationship among the Entropies
Here we shall describe a well known relationship between topological and metric entropy obtained by Dinaburg [4,5], Goodman [8], Goodwyn [9] and Misiurewicz [28] using variational techniques (see also [12,13,15]), and often referred to as the variational principle. Let $f : X \to X$ be a continuous map on a compact metric space $X$ with topology $\mathcal{T}$, which defines a discrete dynamical system that we represent by the triple $(f, X, \mathcal{T})$. To establish the topological-metric entropy relation, it is convenient to define $M(X) := \{\mu : \mu \text{ is an invariant Borel probability measure on the } \sigma\text{-algebra } \mathcal{M}_\mu \text{ of subsets of } X\}$ (8), where Borel means that $\mathcal{T} \subset \mathcal{M}_\mu$ for all $\mu \in M(X)$. Then we have the following result, which is proved very efficiently in [28]: Theorem 1. If $X$ is a compact metric space with topology $\mathcal{T}$ and $f : X \to X$ is a continuous map, then $h_{\mathcal{T}}(f) = \sup\{h_\mu(f) : \mu \in M(X)\}$.
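To see the supremum in the variational principle at work, here is a hedged worked example. It assumes the full shift $s$ on the two-symbol sequence space $\{0,1\}^{\mathbb{Z}}$ with the product topology, for which the topological entropy $\log 2$ is a standard fact, and it restricts attention to the Bernoulli measures $\mu_p$ with weights $(p, 1-p)$, which form only a small subfamily of the invariant Borel probability measures.

```latex
% Metric entropy of the Bernoulli measure \mu_p on the full 2-shift:
\[
  h_{\mu_p}(s) = -p \log p - (1-p) \log (1-p), \qquad 0 < p < 1 .
\]
% Its derivative in p is \log\frac{1-p}{p}, which vanishes only at p = 1/2, where
\[
  h_{\mu_{1/2}}(s) = \log 2 = h_{T}(s) .
\]
% Within this subfamily the supremum is therefore attained by the uniform
% weights, and every other p gives a strictly smaller metric entropy.
```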

When Topological Entropy Equals Metric Entropy
In this section we shall prove that there is a topology on numerous discrete measurable dynamical systems (DMDS) of interest such that the corresponding topological entropy is equal to the K-S entropy. Let $(f, X, \mathcal{M}, \mu)$ be a measurable dynamical system having metric entropy $h_\mu(f)$. We shall show that in certain cases there is a topology $\mathcal{T}$ on $X$ for which $f : X \to X$ is continuous and $h_{\mathcal{T}}(f) = h_\mu(f)$. One interesting example, which is of type (F2), concerns Conjecture 3 of [16], which has since been proven (see [17,18,21,29]); namely, if $X$ is a compact separable topological group, with topology $\mathcal{T}$, and $f : X \to X$ is a continuous automorphism, then the Haar measure µ is $f$-invariant, and $h_{\mathcal{T}}(f) = h_\mu(f)$. What we also have in mind is a constructive-type problem that involves defining a topology on a given DMDS, with no pre-specified topology, such that the topological and K-S entropies coincide, and this is where we shall begin.

Constructive-Type Equality
The question is when the phase space $X$ of a given DMDS can be endowed with a compact metric topology $\mathcal{T}$ such that $h_{\mathcal{T}}(f) = h_\mu(f)$. Our main result, which follows directly from the Jewett-Krieger theorem (cf. [3,30,31]), seems not to have appeared in the literature.

Theorem 2.
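Let $D = (f, X, \mathcal{M}, \mu)$ be a measurable dynamical system, with $f : X \to X$ bijective and ergodic. Then, there exists a compact metric topology $\mathcal{T}$ on $X$ such that $h_{\mathcal{T}}(f) = h_\mu(f)$.
Proof. The Jewett-Krieger theorem implies that there is a compact metric space $\hat{X}$ with topology $\hat{\mathcal{T}}$, a continuous map $\hat{f} : \hat{X} \to \hat{X}$, an $\hat{f}$-invariant Borel measure $\hat{\mu}$, and a σ-algebra $\hat{\mathcal{M}}$ of $\hat{\mu}$-measurable subsets of $\hat{X}$ defining a measure-theoretic dynamical system $\hat{D} = (\hat{f}, \hat{X}, \hat{\mathcal{M}}, \hat{\mu})$ with the following properties: (i) There is a measure-theoretic dynamical system embedding $\Phi : D \to \hat{D}$, which means there is an injective map $\varphi : X \to \hat{X}$ compatible with the maps and measures of the two systems. (ii) If $\hat{\mu}^*$ is the restriction of $\hat{\mu}$ to $\varphi(X)$ and $\hat{\mathcal{T}}^*$ is the subspace topology induced by $\hat{\mathcal{T}}$ on $\varphi(X)$, then $h_{\hat{\mathcal{T}}}(\hat{f}) = h_{\hat{\mu}^*}(\hat{f})$.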
The choice of the desired topology can be inferred readily from the above; namely, we define $\mathcal{T}$ to be the topology induced by the injection $\varphi : X \to \hat{X}$. As is well known, this topology $\mathcal{T}$ is simply that which is generated by the set $\{\varphi^{-1}(V) : V \in \hat{\mathcal{T}}\}$. Hence, owing to the properties described above, the systems are rendered both topologically and metrically conjugate on $\varphi(X)$. This completes the proof since it follows that (ii) implies $h_{\mathcal{T}}(f) = h_\mu(f)$.
Although the above is a simple corollary of the Jewett-Krieger result, the proof of the Jewett-Krieger theorem itself is quite long and deep. This suggests that there may be a more direct proof of Theorem 2, which is something worth investigating.

Comparison-Type Equality
Since we start with a choice of the topology, the construction involves the possibility of noncompactness, for which there are approaches in the literature such as [18–21,29,32,33]. However, we avoid this issue by selecting a compact topology for X.
We begin with the construction of the topology $\mathcal{T}$ for the phase space $X$ of the DMDS $D := (f, X, \mathcal{M}, \mu)$ that we hope yields the equality of the two entropies. A rather natural choice for $\mathcal{T}$ is the topology $\mathcal{T}_0(\mu)$ generated by the sets in all possible measurable partitions of $X$ comprising subsets of positive measure, but this has inherent problems, not the least of which concerns compactness. Certainly, $(X, \mathcal{T}_0(\mu))$ is not a priori compact, nor as it turns out is it suitable for proving that the corresponding topological and given metric entropies are equal. Consequently, we must be more restrictive in the definition of the chosen topology as well as the DMDS.
First, we deal with the choice of topology and then describe some of the fundamental properties of the DMDS vis-à-vis the topology that we will use to guarantee the equality of the metric and topological entropies. Our first assumption is that $X$ can be compactly embedded in a metric space $Y$ with metric $d$, so it may be considered as a compact metric space with metric $d$ restricted to $X$, with corresponding metric topology $\mathcal{T}_d$. This immediately takes care of the question of compactness, for example, and provides other desirable properties associated with metric topologies. Moreover, it applies to any finite-dimensional, smooth compact manifold. Another convenient assumption connecting the topology $\mathcal{T}_d$ and $D$ is the following: The measure is Borel with respect to the topology; i.e., $\mathcal{T}_d \subset \mathcal{M}$ and so all open sets are µ-measurable. It is worth noting that compact smooth manifolds with the usual types of measures typically satisfy these two assumptions. We shall also find it convenient to assume that the map $f$ is continuous with respect to $\mathcal{T}_d$, which implies that the map is also proper; i.e., the preimages of compact sets are compact.

Definitions and Additional Notation
The above assumptions are basic and conveniently serve our needs, but several more subtle properties are required to obtain equality of entropies, which are best delineated by introducing some additional simplifying notation in the context of the assumptions that have so far been made. For example, the following notion shall prove useful. Definition 1. An open covering $\mathcal{U} = \{U_1, \dots, U_m\}$ of $X$ is minimal with respect to a measurable partition $\mathcal{P} = \{Q_1, \dots, Q_m\}$ of $X$ if $Q_k \subset U_k$ for every $1 \le k \le m$ and no proper subset of $\mathcal{U}$ is a covering of $X$. For convenience, we denote this as $\mathcal{U}$-min-$\mathcal{P}$.

Definition 2.
A measurable partition $\mathcal{P} = \{Q_1, \dots, Q_m\}$ is an α-β (with $0 < \alpha < \beta$) partition of $X$ with respect to $\mathcal{T}_d$, denoted as $\mathcal{P}(\alpha, \beta)$, if the following properties obtain for all the elements: (i) $\mu(Q_k) > 0$; (ii) $Q_k$ is connected in the topology $\mathcal{T}_d$; (iii) the diameter $d(Q_k) < \beta$; and (iv) there is at least one point $x \in Q_k$ such that the closed ball $\bar{B}_\alpha(x) := \{y \in X : d(x, y) \le \alpha\}$, corresponding to the open ball $B_\alpha(x) := \{y \in X : d(x, y) < \alpha\}$, is contained in $Q_k$. We denote the set of all α-β partitions as $P_{\alpha\beta}(X)$.

Definition 3.
The DMDS $D$ is of L-type for $\mathcal{T}_d$ if it satisfies the following properties: (i) $\mu(B_\alpha(x)) > 0$ for every $\alpha > 0$; (ii) $\mu(B_\alpha(x)) \to 0$ as $\alpha \to 0$; (iii) for every element $Q_k$ of an α-β partition of $X$ and $\varepsilon > 0$ there is a connected open set $U_k(\varepsilon)$ such that $Q_k \subset U_k(\varepsilon)$, $\mu(U_k(\varepsilon) \setminus Q_k) < \varepsilon$ and $d(U_k(\varepsilon)) < \beta + \varepsilon$; (iv) $d(x, \partial U_k(\varepsilon)) < \varepsilon/2$ for every $x \in \partial Q_k$; and (v) $U_k(\varepsilon_1) \subset U_k(\varepsilon_2)$ whenever $\varepsilon_1 < \varepsilon_2$, so that $\lim_{\varepsilon \downarrow 0} \mu(U_k(\varepsilon)) = \mu(Q_k)$. We note that here the L is for Lebesgue, inasmuch as these properties are well known to apply for the normalized Lebesgue measure on a compact subset of Euclidean space. The following result is a readily verified consequence of the above definitions.

The Comparison Theorem
We now assemble some of the properties defined above for use as assumptions in the main theorem of this section. Definition 5. The measurable dynamical system $D := (f, X, \mathcal{M}, \mu)$ is T-compatible if there exists a compact metric topology $\mathcal{T}_d$ on $X$ such that the following properties obtain: (i) $f$ is continuous with respect to $\mathcal{T}_d$; (ii) µ is Borel with respect to $\mathcal{T}_d$; (iii) $P_{\alpha\beta}(X) \ne \emptyset$; and (iv) $D$ is of L-type for $\mathcal{T}_d$.
The stage has now been set for the following result on comparative equality. Theorem 3. Suppose that the DMDS $D$ is T-compatible with respect to $\mathcal{T}_d$ on $X$ and the following additional properties hold: (E1) There exists a sequence of partitions $\{\mathcal{P}(\alpha_n, \beta_n)\}$ in $P_{\alpha\beta}(X)$ such that $\mathcal{P}(\alpha_n, \beta_n) \prec \mathcal{P}(\alpha_{n+1}, \beta_{n+1})$ for all $n \in \mathbb{N}$ and $\{\alpha_n\}$ and $\{\beta_n\}$ are decreasing sequences converging to zero, which means that $\{\mathcal{P}(\alpha_n, \beta_n)\}$ is refining in the sense of (ME6). (E2) Moreover, the sequence in (E1) satisfies the following property: For every increasing sequence of positive integers $\{j_k\}$ there exists a dominating sequence of natural numbers $\{n_k\}$, with $n_k > j_k$ for all $k \in \mathbb{N}$, such that $\mathcal{P}^{n_k}(\alpha_k, \beta_k) = \{Q_{1(k, n_k)}, \dots, Q_{m(k, n_k)}\}$ and $H(\mathcal{P}^{n_k}(\alpha_k, \beta_k)) = \log(m(k, n_k)) - \sigma(k, n_k)$ (10) for all $k \in \mathbb{N}$, where $\sigma(k, n_k) > 0$ is bounded for all $(k, n_k) \in \mathbb{N} \times \mathbb{N}$. Then $h_{\mathcal{T}_d}(f) = h_\mu(f)$.

Proof.
A key element of our proof is the recognition that (E2), as expressed in Equation (10), is tantamount to $\mathcal{P}^{n_k}(\alpha_k, \beta_k)$ being nearly equiprobable and that a sufficiently tight open covering $\mathcal{U}$ of $\mathcal{P}(\alpha_k, \beta_k)$ yields a tight open cover $\mathcal{U}^{n_k}$ with respect to $\mathcal{P}^{n_k}(\alpha_k, \beta_k)$ such that $\log N(\mathcal{U}^{n_k}) = \log(m(k, n_k))$. Consequently, Now, selecting $\{\mathcal{P}(\alpha_n, \beta_n)\}$ in $P_{\alpha\beta}(X)$ as in (E1), it follows from (ME6) that for every $\varepsilon > 0$ there exists a natural number $l = l(\varepsilon)$ so large that for any integer $k \ge l(\varepsilon)$ there is a $\hat{q}_k \in \mathbb{N}$ such that whenever $q_k$ is an integer greater than $\hat{q}_k$. Moreover, (E2) implies that for each $k \ge l(\varepsilon)$ there are infinitely many instances of Equation (11), which we refer to as $\tilde{q}_k$, satisfying There is, owing to Lemma 1, a decreasing sequence $\{\delta_n\}$ with $\delta_n \downarrow 0$ as $n \to \infty$ associated to a refining sequence of $\delta_k$-tight open coverings $\{\mathcal{U}_k\}$ of $\{\mathcal{P}(\alpha_k, \beta_k)\}$ that is refining in the sense of (TE6). Furthermore, the compactness of $X$, the continuity of $f : X \to X$, (TE6), (E2), Equation (10) and Lemma 1 imply that the $\{\delta_n\}$ for the sequence of tight open coverings can be chosen so that $\mathcal{U}_k^{\tilde{q}_k}$ is minimal for $\mathcal{P}^{\tilde{q}_k}(\alpha_k, \beta_k)$ for some $\tilde{q}_k$ for which Equation (12) obtains, and Accordingly it follows from Equations (12)-(14) that Hence, as $\varepsilon$ is arbitrary, the proof is complete.
We note that it appears that Theorem 3 can also be proved using the concept of an $f$-homogeneous invariant measure introduced in [17]. It is actually likely that the hypotheses in the above theorem are equivalent to Bowen's $f$-homogeneous invariant Borel probability measure µ property for a discrete dynamical system $f : X \to X$ on a compact metric space, which is defined as follows: For every $\varepsilon > 0$, there exist $\delta, c > 0$ such that $\mu\big(\bigcap_{k=0}^{n-1} f^{-k} B_\delta(f^k(y))\big) \le c\, \mu\big(\bigcap_{k=0}^{n-1} f^{-k} B_\varepsilon(f^k(x))\big)$ for all $n \ge 0$ and $x, y \in X$, where $B_\delta$ and $B_\varepsilon$ are the standard δ and ε balls, respectively, for the metric $d$ generating the topology $\mathcal{T}_d$ on $X$. He proved that if this property holds, then $h_{\mathcal{T}_d}(f) = h_\mu(f)$. In the same vein, it is likely that Theorem 3 can be employed to prove the equivalence of the metric and topological entropies for a DMDS comprising a continuous automorphism of a compact separable topological group leaving the Haar measure invariant. Another thing worth noting is the strong indication that this theorem is apt to have analogs for several variants of entropy, such as relative entropy, as well as some extensions to the noncompact versions of entropy in [18–21,29,32,33], and these seem like they might be interesting topics for future research.

Examples of Topological and Metric Entropy Equivalence
We mentioned the case of automorphisms on compact, separable topological groups and now we want to show some examples that follow from Theorems 2 and 3. Most of these examples have already been covered in the literature, so they are mainly meant for illustrative purposes.
Example 1. Let $f : S^1 \to S^1$ be a rotation that is an irrational multiple of 2π, where $S^1$ is the unit circle with the metric $d$ and measure µ both based on the normalized arclength. Then the associated DMDS is ergodic, so Theorem 2 implies that $h_{\mathcal{T}_e}(f) = h_\mu(f)$, which is readily shown to be zero, as is that of every rotation of the circle. It is also a simple matter to obtain the same result using Theorem 3. On the other hand, if we choose a rotation by a rational multiple of 2π, the DMDS is no longer ergodic, so Theorem 2 cannot be used, but the result can be obtained by employing Theorem 3.

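Example 2. Consider the standard tent map $\Lambda : [0, 1] \to [0, 1]$, defined by $\Lambda(x) := 2x$ for $0 \le x \le 1/2$ and $\Lambda(x) := 2(1 - x)$ for $1/2 < x \le 1$. Combining this with the Lebesgue measure space on the unit interval and the Euclidean topology $\mathcal{T}_e$, we obtain a DMDS on the compact unit interval. It is easy to see that by successively bisecting the unit interval, we obtain a refining sequence of partitions in $P_{\alpha\beta}([0, 1])$ satisfying the hypotheses of Theorem 3 for all $\{n_k\}$, from which we conclude that $h_{\mathcal{T}_e}(\Lambda) = h_\mu(\Lambda) = \log 2$.
For the $mx \pmod 1$ map $f_{m/1}$, with the same measure and topological structure as above for the $2x \pmod 1$ map, the result is $h_{\mathcal{T}_e}(f_{m/1}) = h_\mu(f_{m/1}) = \log m$.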
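The values log 2 and log m can be checked numerically, at least for the partition-based quantities. The sketch below is an illustration under stated assumptions rather than the argument of the theorem: it takes the partition of the unit interval into m equal subintervals, samples the midpoints of a fine m-adic grid with exact rational arithmetic, and estimates the masses of the cells of the n-fold refined partition from itineraries. The helper names (`tent`, `times_m`, `cell_masses`, `sigma`) are hypothetical. The function `sigma` measures the gap between the logarithm of the number of cells and the entropy of the refined partition, which is the quantity that condition (E2) asks to be bounded.

```python
from collections import Counter
from fractions import Fraction
from math import log

def tent(x):
    """Standard tent map on [0, 1]."""
    return 2 * x if x <= Fraction(1, 2) else 2 - 2 * x

def times_m(m):
    """The map x -> m x (mod 1) on [0, 1)."""
    return lambda x: (m * x) % 1

def cell_masses(f, m, n, depth):
    """Empirical masses of the cells of P^n, where P splits [0, 1) into m equal intervals.

    The sample points are the midpoints of the m**depth intervals of length
    m**(-depth); exact rational arithmetic avoids floating-point drift.
    """
    size = m ** depth
    counts = Counter()
    for j in range(size):
        x = Fraction(2 * j + 1, 2 * size)
        word = []
        for _ in range(n):
            word.append(min(int(m * x), m - 1))   # index of the cell containing x
            x = f(x)
        counts[tuple(word)] += 1
    return [c / size for c in counts.values()]

def entropy(masses):
    """Partition entropy from a list of positive cell masses."""
    return -sum(p * log(p) for p in masses)

def sigma(f, m, n, depth):
    """Gap log(#cells) - H(P^n); a zero gap means the cells are equiprobable."""
    masses = cell_masses(f, m, n, depth)
    return log(len(masses)) - entropy(masses)

tent_rate = entropy(cell_masses(tent, 2, 6, 12)) / 6            # H(P^6)/6 for the tent map
triple_rate = entropy(cell_masses(times_m(3), 3, 4, 7)) / 4     # H(P^4)/4 for 3x (mod 1)
tent_gap = sigma(tent, 2, 6, 12)
```

In both cases the refined cells are intervals of equal length, so the gap vanishes up to rounding and the rates agree with log 2 and log 3 respectively.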

Example 5.
We can neatly recast the $mx \pmod 1$ map examples as smooth maps on a smooth manifold and extend them to higher dimensions with ease. To begin, we again let $S^1$ be the unit circle in the complex plane $\mathbb{C}$ with the standard arclength measure µ normalized by a factor of $1/2\pi$, so that it is a probability measure, with the usual Riemannian arclength metric $d$ also scaled by a factor of $1/2\pi$. Then, the $mx \pmod 1$ maps can be identified with the smooth maps $z \mapsto z^m$ restricted to the unit circle; namely, $F_m : S^1 \to S^1$ defined by $(z^m)|_{S^1}$, so that $F_m(e^{i\theta}) := e^{im\theta}$, and this combination defines a DMDS on a compact space. By mimicking the construction in the previous example, it can be readily shown that Theorem 3 implies that $h_{\mathcal{T}_e}(F_m) = h_\mu(F_m) = \log m$. Similarly, given an $n$-tuple $m = (m_1, \dots, m_n) \in \mathbb{N}^n$, we define the smooth self map of the $n$-torus $\Phi_m : \mathbb{T}^n \to \mathbb{T}^n$ by $\Phi_m(e^{i\theta_1}, \dots, e^{i\theta_n}) := (e^{im_1\theta_1}, \dots, e^{im_n\theta_n})$, where as usual, $\mathbb{T}^n$ is just the Cartesian product of $n$ copies of the unit circle with the product topology and product measure associated to the single circle. This then comprises a smooth DMDS on the compact $n$-torus, which can be easily shown to satisfy the hypotheses of Theorem 3 and yield $h_{\mathcal{T}_e}(\Phi_m) = h_\mu(\Phi_m) = \log(m_1 \cdots m_n)$.

Shannon Entropy as a Special Case of Metric Entropy
We shall prove that Shannon's entropy may be formulated in the context of a Bernoulli scheme; whence, it is equal to the corresponding metric entropy owing to the Kolmogorov-Sinai entropy theorem [23,25], which is also covered in [12,13,15,24]. What we undertake in this section was more or less observed in the process of the development of K-S entropy, although not actually proved in detail as in what follows. In this regard, the work of Frigg [34] should also be noted. To recast Shannon entropy (described above) as a measurable dynamical system called a Bernoulli scheme or Bernoulli shift, we define the phase space $X$ as the set of all bi-infinite sequences of symbols; namely, $X := S^{\mathbb{Z}} = \{\zeta = (\dots, \zeta_{-1}, \zeta_0, \zeta_1, \dots) : \zeta_k \in S \text{ for all } k \in \mathbb{Z}\}$, together with the shift map $s : X \to X$ defined by $(s(\zeta))_k := \zeta_{k+1}$ and the following probability distribution $(p_1, \dots, p_n)$ for the symbol set $S = \{s_1, \dots, s_n\}$, where $p_k := p(s_k) > 0$ for all $1 \le k \le n$ and $p_1 + p_2 + \cdots + p_n = 1$. This definition fits rather nicely with the information foundation of Shannon entropy inasmuch as it corresponds to reading the string of symbols from left to right one at a time.
To complete the reformulation as a measurable dynamical system, it remains to define a σ-algebra of µ-measurable subsets of $X$, denoted as $\mathcal{M}$, where µ is an $s$-invariant probability measure on $X$. This can be done in many ways, so the trick is to find a definition that yields a metric entropy equal to the Shannon entropy. Toward this end, we start by defining cylinder sets of the form $C(F, \psi) := \{\zeta \in X : \zeta|_F = \psi\}$, where $F$ is a finite set of integers, $\psi \in S^F$ and $\zeta|_F$ is the restriction of ζ to $F \subset \mathbb{Z}$, and let $\mathcal{C}$ be the collection of all cylinder sets together with the null set and all of $X$.
Next, we introduce the set function $\mu_0 : \mathcal{C} \to \mathbb{R}$ defined as follows: $\mu_0(C(F, \psi)) := \prod_{k \in F} p(\psi(k))$, with $\mu_0(\emptyset) := 0$ and $\mu_0(X) := 1$. Now, there is a theorem of Kolmogorov [22] (see also, for example, [24], p. 628) of a rather technical nature stating that the above set function can be uniquely extended to a complete probability measure $\mu : \mathcal{M} \to \mathbb{R}$, called the product measure determined by the distribution $(p_1, \dots, p_n)$, where $\mathcal{M}$ is the smallest σ-algebra containing $\mathcal{C}$ and $p_k := p(s_k)$ for all $1 \le k \le n$.
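The product rule on cylinder sets is easy to experiment with. The following sketch assumes a cylinder is described by a dictionary mapping each constrained position to a symbol and that its mass is the product of the corresponding symbol probabilities; the name `cylinder_measure` and the three-symbol distribution are illustrative assumptions. It checks three elementary consequences of the product form: consistency when a cylinder is split along an extra coordinate, invariance under translating the constrained positions (the effect of the shift), and total mass one.

```python
from itertools import product
from math import prod

def cylinder_measure(psi, p):
    """Mass of the cylinder {zeta : zeta_k = psi[k] for k in F}, as a product.

    psi maps each position k in the finite set F to a symbol; p maps symbols
    to probabilities. An empty psi (no constraint) describes the whole space.
    """
    return prod(p[psi[k]] for k in psi)

p = {"a": 0.5, "b": 0.3, "c": 0.2}          # hypothetical distribution
psi = {-1: "a", 2: "c"}                      # cylinder: zeta_{-1} = a, zeta_2 = c

base = cylinder_measure(psi, p)
# Consistency: fixing one more coordinate partitions the cylinder.
split = sum(cylinder_measure({**psi, 0: sym}, p) for sym in p)
# Shift invariance: translating F by one position leaves the mass unchanged.
shifted = cylinder_measure({k + 1: v for k, v in psi.items()}, p)
# The cylinders over F = {0, 1} partition the whole space.
total = sum(cylinder_measure({0: a, 1: b}, p) for a, b in product(p, repeat=2))
```

These checks only probe the set function on cylinders; the extension to the full σ-algebra is the content of the Kolmogorov theorem cited above and is not something a finite computation can verify.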
Thus we have reformulated the Shannon communication system with alphabet $S$ as the measurable dynamical system $(s, X := S^{\mathbb{Z}}, \mathcal{M}, \mu)$, where $\mathcal{M}$ and µ are as defined above, which is called a Bernoulli scheme and often denoted as $B(p_1, \dots, p_n)$. The details necessary to prove that the metric entropy of the Bernoulli scheme equals the Shannon entropy of the corresponding communication system are quite extensive and covered rather neatly and completely in [24], so we shall only try to summarize the results in what we hope is a readily understood fashion, at least from an intuitive standpoint.
First, the Kolmogorov-Sinai entropy theorem can be roughly stated as follows: Let $D = (f, X, \mathcal{M}, \mu)$ be a measurable dynamical system with $f$ invertible, as it is in the case of $B(p_1, \dots, p_n)$, such that there is a measurable partition $\mathcal{P}$ for which $\mathcal{M}$ is the smallest σ-algebra containing all sets belonging to the partitions of the form $f^{-m}\mathcal{P} \vee f^{-m+1}\mathcal{P} \vee \cdots \vee f^{m-1}\mathcal{P} \vee f^{m}\mathcal{P}$, as $m$ ranges over the natural numbers $\mathbb{N}$. Then $h_\mu(f) = H(\mathcal{P}, f)$. Now, it is not difficult to show (see, e.g., [24]) that the measurable partition $\mathcal{P} := \{C_1, \dots, C_n\}$, where $C_k := \{\zeta \in X = S^{\mathbb{Z}} : \zeta_0 = s_k\}$, satisfies the requirements of the Kolmogorov-Sinai theorem. This follows primarily from the fact that $s^{-m}(C_k) = \{\zeta \in X : \zeta_m = s_k\}$ for all $m \in \mathbb{N}$. Now, since $\mu(C_k) = p_k$, the entropy of the partition $\mathcal{P}$ is $H(\mathcal{P}) = -\sum_{k=1}^{n} p_k \log p_k$. Moreover, we find that $H(\mathcal{P}, s) = H(\mathcal{P})$. Therefore, it follows from the two expressions above and the entropy theorem of Kolmogorov-Sinai that $h_\mu(s) = H(\mathcal{P}, s) = -\sum_{k=1}^{n} p_k \log p_k$, which is the desired result.
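The step in which the entropy rate of the time-zero partition reduces to the entropy of the partition itself can be illustrated by brute force. The sketch below assumes that each cell of the refined partition fixes the symbols at n consecutive positions and therefore has mass equal to the product of the corresponding probabilities; the function names and the sample distribution are hypothetical. Under that assumption the entropy of the refined partition should be exactly n times the single-symbol entropy.

```python
from itertools import product
from math import log, prod

def shannon_entropy(p):
    """Single-symbol entropy -sum p_k log p_k."""
    return -sum(q * log(q) for q in p if q > 0)

def block_entropy(p, n):
    """Entropy of the n-fold refinement of the time-zero partition.

    Each cell fixes the symbols at positions 0, ..., n-1, and the product
    measure assigns it mass p[w_0] * ... * p[w_{n-1}].
    """
    words = product(range(len(p)), repeat=n)
    masses = (prod(p[i] for i in word) for word in words)
    return -sum(m * log(m) for m in masses if m > 0)

p = (0.5, 0.3, 0.2)                                    # hypothetical distribution
rates = [block_entropy(p, n) / n for n in range(1, 8)]  # constant in n
```

Every entry of `rates` coincides with `shannon_entropy(p)` up to rounding, which is the numerical face of the additivity of entropy over independent coordinates.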
Our intention is to show that this is equal to the entropy with respect to the shift map $s$, that is, $h_\mu(s) = H(\mathcal{P})$.
Owing to the readily verified fact that every member of $\bigvee_{k=-m}^{m} s^{-k}\mathcal{P}$ is of the form $C(Z_{-m}^{m}, \varphi)$ with $\varphi : Z_{-m}^{m} \to \{0, 1\}$, and the Kolmogorov-Sinai entropy theorem (see, e.g., [24]), it suffices to prove that every cylinder $C(F, \psi)$ satisfies $C(F, \psi) = \bigcup \{C(Z_{-m}^{m}, \varphi) : \varphi|_F = \psi\}$ for some $m \in \mathbb{N}$, where $Z_{-m}^{m} := \{l \in \mathbb{Z} : |l| \le m\}$. This is obvious since $F$ is finite, which proves the desired result.

Concluding Remarks
After a brief review of the main features and fundamental properties of topological, metric (Kolmogorov-Sinai) and Shannon entropy, we embarked on our effort to establish dynamical systems theory as their principal connective thread. We began with comparisons between the topological entropy, which may be considered the most general manifestation of the multifarious forms, and the metric entropy, perhaps the most real-world applicable of the embodiments of entropy. For example, we recounted the variational principle that the topological entropy of a discrete dynamical system is the supremum of the metric entropies over all possible corresponding discrete measurable dynamical systems, which naturally leads to the question of whether or not the supremum is actually assumed. In that vein, we proved our main theorem giving sufficient conditions for the equality of the topological and Kolmogorov-Sinai entropies and provided several illustrative examples. Finally, we showed that Shannon's information entropy is a special case of metric entropy, inasmuch as a dynamical information system can be identified with a Bernoulli scheme for which the Kolmogorov-Sinai entropy theorem provides a formula for the entropy identical to that of Shannon. In summary then, our project aimed at providing a rigorous underlying dynamical system theme for several of the more important entropy definitions.
Author Contributions: Both authors contributed equally to the research presented, with R.A. focusing mainly on metric and Shannon entropy and D.B. working principally on topological and metric entropy.
Funding: This research received no external funding.
Acknowledgments: The authors thank the Center for Applied Mathematics and Statistics for some very helpful support. Finally, the authors thank the reviewers for their very helpful suggestions and insightful constructive criticism, which substantially improved the original version of this manuscript. In this regard, comments concerning the Jewett-Krieger theorem were especially useful.

Conflicts of Interest:
The authors declare no conflict of interest.