Article

Infinite Ergodic Walks in Finite Connected Undirected Graphs †

by
Dimitri Volchenkov
Department of Mathematics and Statistics, Texas Tech University, 1108 Memorial Circle, Lubbock, TX 79409, USA
This paper is an extended version of our paper published in The 1st Online Conference on Nonlinear Dynamics and Complexity, Central Time Zone, USA, 23–25 November 2020.
Entropy 2021, 23(2), 205; https://doi.org/10.3390/e23020205
Submission received: 28 November 2020 / Revised: 30 January 2021 / Accepted: 4 February 2021 / Published: 8 February 2021
(This article belongs to the Special Issue Entropic Forces in Complex Systems)

Abstract

The micro-canonical, canonical, and grand canonical ensembles of walks defined in finite connected undirected graphs are considered in the thermodynamic limit of infinite walk length. As infinitely long paths are extremely sensitive to structural irregularities and defects, their properties are used to describe the degree of structural imbalance, anisotropy, and navigability in finite graphs. For the first time, we introduce entropic force and pressure describing the effect of graph defects on mobility patterns associated with the very long walks in finite graphs; navigation in graphs and navigability to the nodes by the different types of ergodic walks; as well as node’s fugacity in the course of prospective network expansion or shrinking.

1. Introduction

The precursor of the concept of statistical ensembles and the related ergodic hypothesis formulated by Boltzmann [1,2] were met with a violently negative reaction by the great majority of scientists for their clumsiness and their absurd and paradoxical consequences [3], although they allowed the theoretical calculation of the equations of state for the first time. The study of statistical ensembles related to graphs and networks suffers from a similarly inhospitable reception from scientists playing cup-and-ball with a swarm of heuristic parameters and giving no importance to their connection with each other, which is often responsible for spurious conclusions on the graph’s structure and function. The thermodynamic approach to graphs was initiated in complex network theory concerned with the thermodynamic limit of infinitely large graph size N [4], in which a graph’s structural “fluctuations” become negligible. The major result of the theory on structurally homogeneous infinite graphs (random trees) is the Bose–Einstein condensation mechanism explaining the growth of complex evolving networks as a topological phase transition between a “rich-get-richer” phase and a “winner-takes-all” phase [5,6,7]. In contrast to complex network theory, we consider the statistical ensembles of walks defined on a finite connected undirected graph in the thermodynamic limit of very long walks, $n \to \infty$, which has not been addressed previously. Statistics of lengthy walks elucidates the graph structure, quantifies navigability of the graph, and evaluates the fugacity of graph nodes with respect to the entire system of infinite paths available in the graph; all of these characteristics are introduced and discussed in our work for the first time. The probability measuring the tendency of a graph to shrivel or expand at a node follows the Fermi–Dirac distribution function.
Although we have sketched a set of “ideal gas laws” for the structure of networks and graphs (in the last section of our work), we have not yet formulated a comprehensive structural "equation of state" for graphs and networks.
The probability we assign to an event depends on whether we count it as one of many, considered all at once, or as a single event of its kind. In other words, the estimated likelihood of events hinges on their assumed membership in an ensemble described by some probability distribution. The famous Two-Child Paradox [8] serves as a good example of this point: “Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?” Given that a child is either a boy (B) or a girl (G) with equal probability $1/2$, two incompatible answers may be given to this question, depending on the assumptions taken.
On the one hand, as the probability of getting a boy equals $\Pr(B) = \frac{1}{2}$ uniformly and unconditionally for all families, using Bayes’ theorem, we obtain that the probability that both children are boys, given that one of them is a boy, is the same as the probability of just having a boy, viz.,
$$\Pr(B\,\&\,B \mid B) = \frac{\Pr(B \mid B\,\&\,B) \times \Pr(B\,\&\,B)}{\Pr(B)} = \frac{1 \times \frac{1}{4}}{\frac{1}{2}} = \frac{1}{2}.$$
On the other hand, as having a boy in an ensemble of two-child families with at least one boy obviously comprises three possible events, i.e., $B\,\&\,B$, or $G\,\&\,B$, or $B\,\&\,G$, the probability of getting a boy in a family of two equals $\Pr(B) = \Pr(B\,\&\,B) + \Pr(B\,\&\,G) + \Pr(G\,\&\,B) = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = \frac{3}{4}$, and therefore
$$\Pr(B\,\&\,B \mid B) = \frac{\Pr(B \mid B\,\&\,B) \times \Pr(B\,\&\,B)}{\Pr(B)} = \frac{1 \times \frac{1}{4}}{\frac{3}{4}} = \frac{1}{3}.$$
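Both answers can be checked by brute-force enumeration of the four equiprobable two-child families. The sketch below is only an illustration: the function name is hypothetical, and reading the first answer as conditioning on one particular child (here, the first) being a boy is an assumption made for this example rather than part of the original argument.

```python
from fractions import Fraction
from itertools import product


def two_child_probabilities():
    """Return (ensemble answer, single-event answer) for Pr(B&B | B)."""
    # All four equiprobable (first child, second child) outcomes.
    families = list(product("BG", repeat=2))

    # Ensemble reading: condition on "at least one boy" (BB, BG, GB).
    ensemble = [f for f in families if "B" in f]
    p_ensemble = Fraction(ensemble.count(("B", "B")), len(ensemble))

    # Single-event reading (assumption): one particular child, here the
    # first one, is known to be a boy (BB, BG).
    single = [f for f in families if f[0] == "B"]
    p_single = Fraction(single.count(("B", "B")), len(single))
    return p_ensemble, p_single


print(two_child_probabilities())  # (Fraction(1, 3), Fraction(1, 2))
```

The two conditioning sets differ only in whether the family G & B is admitted, which is exactly the difference between the two values of $\Pr(B)$ used in the text.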
The ensemble interpretation, in which each admissible event in a family of two with a boy appears equally probable, is preferable in the context of the ergodic hypothesis, which is blind to family history. In the context of the Two-Child Paradox, there is no way in probability theory to discern whether the gender composition in such a family stays put, or the children change their sex, exploring possible gender identities during an infinite lifetime, provided at least one of them stays a boy. The ergodic hypothesis helps to avoid this awkward question by equating the ensemble and time averages while replacing a dynamic description of identity changes by the probabilistic description within the ensemble over a very long period of time. Switching between temporal and ensemble perspectives under the spell of the ergodic hypothesis is assumed in thermodynamics, equilibrium statistical mechanics, and the theory of dynamical systems.
The concepts of ensembles and ergodicity have a long history [3]. Boltzmann introduced a "monode", a family of possible stationary probability distributions over a single cyclic trajectory of a system of gas particles on an energy surface in the phase space, as early as 1884 [1,2]. According to the Boltzmann hypothesis (1), the time spent by a system in some region of the phase space is proportional to the volume of this region, so that all accessible microstates are equiprobable over a long period of time, viz.,
$$\lim_{T \to \infty} \frac{dt}{T} = \frac{\sigma\, ds}{\int \sigma\, ds}, \tag{1}$$
where $\sigma$ is the probability density of microstates on the iso-energetic surface, whose area element is $ds$. With this hypothesis, Boltzmann [1,2] and later Helmholtz [9,10] were able to explain classical equilibrium thermodynamics, which successfully describes the behavior of gases. The concept of thermodynamic ensembles was further developed and introduced to the English-speaking world by Gibbs [11].
In our work, we review three classical thermodynamic ensembles defined by Gibbs [11], namely the microcanonical (Section 2), canonical (Section 4), and grand canonical (Section 8) ensembles of very long walks defined in finite connected undirected graphs, and demonstrate that the concept of ergodic ensembles can be applied to quite abstract objects of discrete mathematics. The thermodynamic limit in our approach is defined as the limit of very long walks, $n \to \infty$, in a finite graph rather than the limit of a large number of graph nodes, $N \to \infty$. In the limit $N \to \infty$, “fluctuations” of graph structural features are negligible, and therefore the graph can be considered structurally homogeneous across all scales, i.e., random. In the limit $n \to \infty$, fluctuations of the growth rate of the number of distinguishable long walks in the graph can be ignored, and then the graph’s topological entropy $\mu = \log_2 \alpha_{\max}$ (the log of the graph’s spectral radius) and the corresponding Perron eigenvector of the graph adjacency matrix describe the degree of structural complexity, anisotropy, and navigability of the graph.
Each thermodynamic ensemble permits specific statistical behavior. For example, the microcanonical ensemble representing an isolated system (with constant energy) is defined by assigning equal probability to every walk of a given length existing in the graph. All very long walks that fit some probability distribution over the graph’s nodes constitute a macrostate in the canonical ensemble of walks defined in the graph. For example, the series of intrinsic random walks (introduced in Section 5), which assign equal probabilities to all walks of a given length starting at a node, provides an example of the canonical ensemble of walks defined on a finite graph. This canonical ensemble contains not only the well-known isotropic nearest-neighbor random walks on finite graphs [7,12], but also infinitely many types of less known anisotropic random walks on graphs; the Ruelle–Bowen random walk [13,14], which makes all infinite walks starting at each node equally probable, is one of them. While the ergodic theory for isotropic random walks on finite graphs is well developed [15,16] (we profoundly thank our referee for this remark), the ergodic properties of anisotropic random walks, including their statistical confinement in the best structurally integrated sub-graphs (see Section 5 and Section 7), have not yet been discussed in the literature. Finally, in an open system of long walks represented by the grand canonical ensemble, the chemical potential (the free energy absorbed by a very long walk seizing a graph edge) is kept fixed and equal to the graph’s topological entropy $\mu$.
We also discuss applications of ergodic walks to the structural analysis of and navigation through finite undirected connected graphs. A graph’s structural defects and boundaries repel very long walks, which can be expressed in terms of entropic pressure and force (Section 3). Intrinsic random walks forming the canonical ensemble in a graph can be used to measure the degree of the graph’s structural anisotropy (Section 5), to estimate the amount of predictable (navigable) information about the navigator’s present location (Section 6), and to assess the navigability to each graph node in proportion to its relative visiting frequency (Section 7). Navigation focuses on locating a navigator’s position compared to known locations, paths, and structural patterns [17]. The navigability to a node comprises two information components compatible with two major navigation strategies, known as path integration (which allows for keeping track of the position and heading while exploring a new space) and landmark-based piloting (re-calculating position when in a familiar environment), working in concert during navigation in humans and animals [17]. Finally, the grand canonical ensemble describes the statistics of local fluctuations of the growth rate of the numbers of long walks around the chemical potential as $n \to \infty$ (Section 8). The distribution of these fluctuations follows Fermi–Dirac statistics and marks the graph’s defects and boundary nodes hosting dramatically fewer very long walks than others.
We conclude in the last section.

2. The Micro-Canonical Ensemble of Equiprobable Walks in Finite Connected Undirected Graphs

The number of walks of length n (i.e., n-walks) in a lattice $\mathbb{Z}^d$ in d-dimensional space grows exponentially with n, $N_n = 2^{nd}$. The micro-canonical ensemble is defined by assigning equal probability to every n-walk, viz.,
$$\mathbb{P}_n = \frac{1}{2^{nd}} = \exp\left(-\frac{nd}{1/\ln 2}\right) \equiv \exp\left(-\frac{F_n}{kT}\right), \tag{2}$$
where the (Boltzmann constant and) temperature $kT \equiv 1/\ln 2$, the free energy of the n-walks is
$$F_n \equiv -\log_2 \mathbb{P}_n = kT \ln N_n \equiv kT \cdot H_n = nd, \tag{3}$$
and $H_n \equiv \ln N_n = nd \ln 2$ is the entropy in a micro-canonical ensemble. As the free energy $F_n$ is the Legendre transformation of the internal energy $U_n$, with $kT$ as the independent variable [18,19], viz.,
$$F_n \equiv U_n - kT \cdot H_n,$$
comparing this definition with (3), we conclude that the internal energy of all n-walks is $U_n = 0$ in a micro-canonical ensemble.
We also readily extend the statistical description of the micro-canonical ensemble of equiprobable walks to $\kappa$-regular graphs, in which every vertex has the same number of neighbors, $\kappa = 2^d$, by using the substitution $d = \log_2 \kappa$. As the free energy value (3) grows linearly with n, the intensive free energy (per absorbed edge), viz.,
$$\mu \equiv \lim_{n \to \infty} \frac{F_n}{n} = kT \lim_{n \to \infty} \frac{H_n}{n} = \log_2 \kappa = d, \tag{4}$$
plays the role of chemical potential describing the change to free energy after absorbing a new edge to a very long walk in a $\kappa$-regular graph in a micro-canonical ensemble.
Given a finite connected undirected graph $G(V,E)$, where $V$, $|V| = N$, is a set of vertices, and $E \subseteq V \times V$ is a set of edges, we assume that its adjacency matrix (such that $A_{ij} = 1$ if $(i,j) \in E$, and $A_{ij} = 0$ otherwise) has the following spectral decomposition $A_{ij} = \sum_{s=1}^{N} \alpha_s u_{is} u_{sj}$, with ordered eigenvalues $\alpha_{\max} \equiv \alpha_1 > \alpha_2 \geq \ldots \geq \alpha_N$. The free energy in the micro-canonical ensemble equals
$$F_n = \log_2 N_n = \log_2 \sum_{ij} (A^n)_{ij} = \log_2 \sum_{ij} \sum_{s=1}^{N} \alpha_s^n u_{is} u_{js} = \log_2 \sum_{s=1}^{N} \alpha_s^n \gamma_s^2 = \log_2 \left[ \gamma_1^2 \alpha_{\max}^n \left( 1 + \sum_{s=2}^{N} \frac{\gamma_s^2}{\gamma_1^2} \left( \frac{\alpha_s}{\alpha_{\max}} \right)^n \right) \right], \quad \gamma_s \equiv \sum_{i=1}^{N} u_{is}, \tag{5}$$
and, since $\left| \alpha_s / \alpha_{\max} \right| < 1$, the intensive free energy amounts to the logarithm of the spectral radius $\alpha_{\max}$ of the graph, viz.,
$$\mu = \lim_{n \to \infty} \frac{F_n}{n} = \lim_{n \to \infty} \frac{1}{n} \log_2 \left[ \gamma_1^2 \alpha_{\max}^n \left( 1 + \sum_{s=2}^{N} \frac{\gamma_s^2}{\gamma_1^2} \left( \frac{\alpha_s}{\alpha_{\max}} \right)^n \right) \right] = \log_2 \alpha_{\max} \equiv d_G. \tag{6}$$
In Section 8, the quantity (6) plays the role of chemical potential of an edge absorbed by a very long walk. For a $\kappa$-regular graph, the spectral radius is $\alpha_{\max} = \kappa$, so that $\mu = \log_2 \kappa = d$, in accordance with (4). The log of the graph spectral radius (6) is also called the topological entropy of the graph [20,21] because it is the exponential growth rate of the number of distinguishable walks, being a measure of complexity of the graph structure. According to (4), the topological entropy of the graph $\mu$ can also be interpreted as the effective dimension of space of the graph, $d_G$, in a micro-canonical ensemble of very long walks.
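The convergence of the intensive free energy to the log of the spectral radius can be checked numerically. The following is a minimal sketch, assuming NumPy and a small hypothetical test graph (a triangle with one pendant node); the function names are illustrative and do not come from the original work. The walk counts are accumulated in log space so that long walks do not overflow.

```python
import numpy as np


def topological_entropy(A):
    """mu = log2(alpha_max) for a symmetric adjacency matrix A."""
    return float(np.log2(np.linalg.eigvalsh(A)[-1]))


def free_energy_per_step(A, n):
    """F_n / n with F_n = log2 sum_ij (A^n)_ij, accumulated in log space."""
    v = np.ones(A.shape[0])
    log2_walks = 0.0
    for _ in range(n):
        v = A @ v                  # extend every walk by one edge
        total = v.sum()
        log2_walks += np.log2(total)
        v /= total                 # renormalise to avoid overflow
    return log2_walks / n


# Hypothetical test graph: triangle (0, 1, 2) with pendant node 3.
A_paw = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)

# Complete graph on four nodes, a 3-regular graph.
A_k4 = np.ones((4, 4)) - np.eye(4)

print(free_energy_per_step(A_paw, 2000), topological_entropy(A_paw))
print(topological_entropy(A_k4), np.log2(3))
```

For the regular graph the two printed values in the last line coincide, in line with $\mu = \log_2 \kappa$; for the irregular graph the finite-n estimate approaches $\log_2 \alpha_{\max}$ with an error of order $1/n$.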

3. Entropic Pressure and Force in Micro-Canonical Ensemble of Walks

Missing nodes and edges might dramatically reduce the number of very long walks available in a graph, reshaping the global mobility patterns in a micro-canonical ensemble of walks. The resulting statistical changes in mobility patterns can be described in terms of entropic pressure and entropic force, as follows.
A missing node depletes the number of very long walks available in the graph, and therefore reduces the corresponding free energy, $F_n = \log_2 \sum_{ij} (A^n)_{ij}$, by the following amount of local energy,
$$E_i(n) = \log_2 \sum_{j} (A^n)_{ij} = \log_2 \left[ \alpha_{\max}^n u_{i1} \gamma_1 \left( 1 + \sum_{s=2}^{N} \left( \frac{\alpha_s}{\alpha_{\max}} \right)^n \frac{u_{is}}{u_{i1}} \frac{\gamma_s}{\gamma_1} \right) \right], \tag{7}$$
corresponding to the number of very long walks anchoring at i, viz.,
$$\delta_i F_n \equiv F_n - E_i(n). \tag{8}$$
In the thermodynamic limit $n \to \infty$, the resulting local increment of free energy measuring its sensitivity to the disappearance of the node from the graph is as follows:
$$\Delta F_i = \lim_{n \to \infty} \left( \delta_i F_n - F_n \right) = \lim_{n \to \infty} \log_2 \frac{\sum_{ij} (A^n)_{ij} - \sum_{j} (A^n)_{ij}}{\sum_{ij} (A^n)_{ij}} = \log_2 \left( 1 - \frac{u_{i1}}{\gamma_1} \right) \equiv -P_i. \tag{9}$$
We call the resulting quantity (9) the entropic pressure $P_i$, as it accounts for the local stress characterizing the transfer of the walker’s mobility from i to the rest of the graph if i is not available (see Figure 1, left).
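The limit in (9) can be illustrated by comparing the Perron-eigenvector expression with the same ratio evaluated from explicit walk counts at a finite length. This is a minimal sketch, assuming NumPy, a small hypothetical test graph, and the sign convention in which the pressure is the positive quantity $P_i = -\log_2 (1 - u_{i1}/\gamma_1)$; the function names are illustrative.

```python
import numpy as np


def perron(A):
    """Largest eigenvalue and Perron eigenvector of a symmetric matrix A."""
    vals, vecs = np.linalg.eigh(A)
    # For a connected graph all entries share one sign; make them positive.
    return vals[-1], np.abs(vecs[:, -1])


def entropic_pressure(A):
    """P_i = -log2(1 - u_i1 / gamma_1), with gamma_1 = sum_i u_i1."""
    _, u = perron(A)
    return -np.log2(1.0 - u / u.sum())


def pressure_at_length(A, n):
    """Finite-n analogue computed from the n-walks anchored at each node."""
    anchored = np.linalg.matrix_power(A, n).sum(axis=1)
    return -np.log2(1.0 - anchored / anchored.sum())


# Hypothetical test graph: triangle (0, 1, 2) with pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

print(entropic_pressure(A))
print(pressure_at_length(A, 60))
```

In this toy graph the largest value is found at node 2, the node through which most long walks pass, and the smallest at the pendant node.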
Similarly, by eliminating an edge $(i,j) \in E$ from the graph, we reduce the local energy $E_i(n)$ of the node $i \in V$ (7) by the following amount, $\delta E_{i,j}(n) = \log_2 \left[ \sum_{s} (A^n)_{is} - A_{ij} \sum_{k} (A^{n-1})_{jk} \right]$, corresponding to the number of $(n-1)$-walks available from the node $j \in V$ adjacent to i, viz.,
$$\Delta E_{ij} = \lim_{n \to \infty} \log_2 \frac{\sum_{s} (A^n)_{is} - A_{ij} \sum_{k} (A^{n-1})_{jk}}{\sum_{s} (A^n)_{is}} = \lim_{n \to \infty} \log_2 \left( 1 - \frac{A_{ij} \sum_{k} (A^{n-1})_{jk}}{\sum_{s} (A^n)_{is}} \right) = \log_2 \left( 1 - \frac{A_{ij} u_{j1}}{\alpha_{\max} u_{i1}} \right) = \log_2 \left( 1 - W_{ij}^{(\infty)} \right) \equiv -F_{ij}. \tag{10}$$
The direction-dependent entropic force $F_{ij}$ introduced in (10) emerges from the statistical tendency of very long walks to follow the preferential transition $W_{ij}^{(\infty)}$ to the neighboring nodes hosting many infinitely long walks, as in the Ruelle–Bowen random walk (19) [13,14]. It is worth mentioning that the expression for the entropic force (10) has the structure of a Laplacian operator $L_{ij} = 1 - W_{ij}^{(\infty)}$ related to random walks defined in the graph $G$ by the transition matrix $W_{ij}^{(\infty)}$.
In Figure 1, we present a membrane graph with a defect, with its nodes highlighted according to the values of entropic pressure (9) (left) and the elements of the Perron eigenvector of the matrix $F_{ij}$ (10) (right).
In Figure 2, we use the graph representation of Lubbock, TX, USA acquired from the OpenStreetMap service (The OpenStreetMap database is publicly available at https://dataverse.harvard.edu/dataverse/osmnx-street-networks). To construct the spatial graph of the city, we used Python’s lxml library to parse the raw data and obtain the spatial graph adjacency matrix. The data set was cleaned further by removing disconnected neighborhoods, such as the Preston Smith International airport that is not a structural part of the city. The resulting connected city graph of Lubbock contains 10,421 nodes representing all spaces of movement, including but not limited to residential, secondary, tertiary roads, trunk links, and highways.
The value of entropic pressure in the spatial graph of Lubbock attains its maximum at the contemporary structural focus of the city, far from the city’s historical downtown (Figure 2, left). The nodes of the city spatial graph on the right-hand side of Figure 2 are highlighted according to the elements of the Fiedler eigenvector belonging to the second largest eigenvalue of the entropic force matrix $F_{ij}$ (i.e., the second smallest eigenvalue of the associated Laplacian matrix $L_{ij}$). The Fiedler eigenvector is used in spectral graph partitioning, as it bisects the graph into two connected communities based on the sign of the eigenvector entries. The Fiedler eigenvector indicates the direction of the fastest decrease of the entropic force over the city spatial graph of Lubbock (Figure 2, right). The entries of the Fiedler eigenvector are zero everywhere, except for a narrow band extending from the historical city center (where the magnitude of entropic force is positive) toward the contemporary structural focus of the city (where the magnitude of entropic force is negative). The structural focus of the city absorbs very long walks, while the historical center anchored at the abolished city railway station expels long walks. Although railway construction enhanced the city status of Lubbock in its early days, its maintenance has a continuing negative impact on urban development, since railways barricade streets, dramatically cutting down the number of possible paths people can drive or walk, and create isolated neighborhoods [22].

4. The Canonical Ensemble of Walks in Finite Connected Undirected Graphs

The canonical ensemble represents the possible states of a system in equilibrium that does not evolve over time, even though the underlying system might be in constant motion [11]. The canonical ensemble is a collection of very long walks (microstates) of length $n = \sum_{r=1}^{s} n_r \gg 1$, where $n_r$ counts the number of visits paid by a walker to the r-th vertex of a connected undirected graph $G$, compatible with a $\pi$-macrostate, a discrete probability density vector $\{\pi_r\}_{r=1}^{N}$, $\sum_{r=1}^{N} \pi_r = 1$, taken over the set of graph vertices, viz., $n_r / n \xrightarrow[n \to \infty]{} \pi_r$.
The total number of microstates (i.e., long walks) lumped into a single $\pi$-macrostate is then given by the following multinomial coefficient:
$$M_{n,s} = \frac{n!}{n_1! \cdots n_s!} = \frac{n!}{(n \pi_1)! \cdots (n \pi_s)!}. \tag{11}$$
Using Stirling’s approximation, $\ln n! \approx -n + n \ln n$, we readily obtain that
$$\ln M_{n,s} \approx \left( -n + n \ln n \right) - \left( -n_1 + n_1 \ln n_1 \right) - \ldots - \left( -n_s + n_s \ln n_s \right) = -n \sum_{r=1}^{s} \frac{n_r}{n} \cdot \ln \frac{n_r}{n}, \tag{12}$$
and therefore, as $n \to \infty$,
$$M_{n,s} \approx \exp \left( -n \sum_{r=1}^{s} \frac{n_r}{n} \cdot \ln \frac{n_r}{n} \right) \to \exp \left( -n \sum_{r=1}^{s} \pi_r \cdot \ln \pi_r \right) \equiv \exp \left( n H \right), \tag{13}$$
in which
$$H \equiv -\sum_{r=1}^{N} \pi_r \cdot \ln \pi_r, \quad 0 \cdot \log 0 = \log 0^0 = \log 1 = 0, \tag{14}$$
is the Boltzmann–Gibbs–Shannon entropy [23,24] in the canonical ensemble. If every very long walk lumped into the $\pi$-macrostate is chosen with equal probability among the other walks suited for the same macrostate, $\mathbb{P}_n \approx \exp(-nH)$, then the most probable walks would be those compatible with the uniform density $\pi_r = \frac{1}{N}$, $r = 1, \ldots, N$, maximizing the value of entropy (14), $H_{\max} = \ln N$. The free energy over the canonical ensemble of very long $\pi$-walks ($n \to \infty$) is given by
$$F_n = -kT \ln \mathbb{P}_n \approx kT \cdot nH, \tag{15}$$
and, therefore, the intensive free energy (chemical potential) equals
$$\mu = \lim_{n \to \infty} \frac{F_n}{n} = kT \cdot H = -\sum_{r=1}^{N} \pi_r \cdot \log_2 \pi_r \equiv I(\pi), \tag{16}$$
where $I(\pi)$ is the amount of information (in bits) revealed at every step of the $\pi$-walk.
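The step from the multinomial coefficient to the entropy can be verified numerically by evaluating the exact log-multinomial through the log-gamma function and comparing its growth rate with $H$. This is a minimal sketch using only the Python standard library; the density `pi` and the function names are hypothetical choices for illustration.

```python
from math import lgamma, log


def log_multinomial(counts):
    """Exact ln M = ln n! - sum_r ln n_r!, via the log-gamma function."""
    n = sum(counts)
    return lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)


def gibbs_shannon_entropy(pi):
    """H = -sum_r pi_r ln pi_r (in nats), with 0 * ln 0 = 0."""
    return -sum(p * log(p) for p in pi if p > 0)


def entropy_from_counts(pi, n):
    """(1/n) ln M for visit counts n_r = round(n * pi_r)."""
    counts = [round(p * n) for p in pi]
    return log_multinomial(counts) / sum(counts)


pi = [0.5, 0.3, 0.2]  # hypothetical macrostate over three vertices
for n in (10**2, 10**4, 10**6):
    print(n, entropy_from_counts(pi, n), gibbs_shannon_entropy(pi))
```

The gap between the two columns shrinks roughly like $\ln n / n$, which is the size of the terms dropped in Stirling’s approximation.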

5. The Canonical Ensemble of Intrinsic Random Walks in Finite Connected Undirected Graphs

Discrete time random walks $\mathcal{W} = \{X_n \in V : n \in \mathbb{Z}\}$ defined in a finite connected undirected graph $G(V,E)$ by an irreducible row-stochastic transition probability matrix $W_{ij} = \Pr\left[X_{n+1} = j \mid X_n = i\right] > 0$, $i,j \in V$, $(i,j) \in E$, are the natural candidates for the $\pi$-macrostates in the canonical ensemble of walks. Indeed, as the row-stochastic transition matrix $W_{ij}$ does not evolve over time, the unique stationary distribution of the random walk $\mathcal{W}$ is the major left eigenvector $\pi = \{\pi_r\}_{r=1}^{N}$ of the transition matrix, such that $\sum_{r=1}^{N} \pi_r W_{rs} = \pi_s$.
Given the graph adjacency matrix ($A_{ij} = 1$ if $(i,j) \in E$, and $A_{ij} = 0$ otherwise), we define the n-th order degree of the vertex $i \in V$ as the number of n-walks available at $i \in V$, viz.,
$$\kappa_i^{(n)} \equiv \sum_{j=1}^{N} (A^n)_{ij}, \quad \kappa_i^{(0)} = 1. \tag{17}$$
Taking further into account that $\kappa_i^{(n)} = \sum_{j=1}^{N} A_{ij} \kappa_j^{(n-1)}$, we derive an infinite sequence of transition probability matrices [25], viz.,
$$W_{ij}^{(n)} = \frac{A_{ij}\, \kappa_j^{(n-1)}}{\kappa_i^{(n)}} = \frac{A_{ij} \sum_{s=1}^{N} (A^{n-1})_{js}}{\sum_{s=1}^{N} A_{is} \sum_{r=1}^{N} (A^{n-1})_{sr}}, \quad \sum_{j=1}^{N} W_{ij}^{(n)} = 1, \quad n \in \mathbb{N}, \tag{18}$$
defining a countable set of intrinsic random walks in the graph G.
The first-order intrinsic random walk defined by the transition matrix $W_{ij}^{(1)} = A_{ij} / \kappa_i^{(1)}$ has been discussed in the literature for more than a century [12,26]. The walk $W_{ij}^{(1)}$ is locally isotropic, as the random walker chooses the next node to visit among all nearest neighbors of the current node with equal probability. In Figure 3, we present the densities of nodes in the membrane graph with respect to the different types of intrinsic random walks. The density of nodes with respect to $W_{ij}^{(1)}$ is proportional to their degree centrality, i.e., the number of links incident upon the nodes (Figure 3, left). Other intrinsic random walks following the transition probabilities $W_{ij}^{(n)}$, $n > 1$, make all $\kappa_i^{(n)}$ n-walks starting at the node i occur with equal probability. These random walks are locally biased (anisotropic), as transitions to the nearest neighbors providing more lengthy walks are preferred under (18) for $n > 1$ [25]. In the limit $n \to \infty$, the series of transition matrices $W_{ij}^{(n)}$ converges [25] to the Ruelle–Bowen random walk [21] (also known as the maximal entropy random walk [27]), viz.,
$$W_{ij}^{(\infty)} = \lim_{n \to \infty} W_{ij}^{(n)} = \lim_{n \to \infty} \frac{A_{ij}\, \kappa_j^{(n-1)}}{\kappa_i^{(n)}} = \lim_{n \to \infty} \frac{A_{ij}\, \alpha_{\max}^{n-1} u_{j1} \gamma_1}{\alpha_{\max}^{n} u_{i1} \gamma_1} = \frac{A_{ij} u_{j1}}{\alpha_{\max} u_{i1}}. \tag{19}$$
The anisotropic random walk $W_{ij}^{(\infty)}$ is confined in the central nodes of the membrane graph (Figure 3, right). The stationary distribution for the intrinsic random walks (18) reads as follows [25]:
$$\pi_i^{(n)} = \frac{\kappa_i^{(n)} \kappa_i^{(n-1)}}{\sum_{s=1}^{N} \kappa_s^{(n)} \kappa_s^{(n-1)}}. \tag{20}$$
For the isotropic random walks $W_{ij}^{(1)}$, the stationary distribution is $\pi_i^{(1)} = \kappa_i^{(1)} / 2E$, where $E$ is the total number of edges in the graph [12], and $\pi_i^{(\infty)} = u_{i1}^2$ for the Ruelle–Bowen random walks [27]. The stationary distribution $\pi_i^{(1)}$ reports on the degree centrality of the graph nodes (i.e., the number of links incident upon a node), and $\pi_i^{(\infty)}$ is naturally related to the eigenvector centrality $u_{i1}$ of the node i in the graph G [28].
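The family of intrinsic random walks and their stationary distributions can be built directly from walk counts. The sketch below assumes NumPy and a small hypothetical test graph, and it indexes the walks so that $n = 1$ is the isotropic walk $A_{ij}/\kappa_i^{(1)}$, i.e., $W_{ij}^{(n)} = A_{ij} \kappa_j^{(n-1)} / \kappa_i^{(n)}$; the function names are illustrative. A non-bipartite graph is used so that the sequence converges without oscillation.

```python
import numpy as np


def walk_counts(A, n):
    """kappa_i^(n) = sum_j (A^n)_ij, with kappa_i^(0) = 1."""
    k = np.ones(A.shape[0])
    for _ in range(n):
        k = A @ k
    return k


def intrinsic_walk(A, n):
    """W^(n)_ij = A_ij * kappa_j^(n-1) / kappa_i^(n), for n >= 1."""
    k_prev = walk_counts(A, n - 1)
    k_n = A @ k_prev
    return A * k_prev[None, :] / k_n[:, None]


def intrinsic_stationary(A, n):
    """pi_i^(n) proportional to kappa_i^(n) * kappa_i^(n-1)."""
    k_prev = walk_counts(A, n - 1)
    weights = (A @ k_prev) * k_prev
    return weights / weights.sum()


def ruelle_bowen(A):
    """Limit walk W_ij = A_ij u_j / (alpha_max u_i) and its density u_i^2."""
    vals, vecs = np.linalg.eigh(A)
    alpha, u = vals[-1], np.abs(vecs[:, -1])   # unit-norm Perron vector
    return A * u[None, :] / (alpha * u[:, None]), u ** 2


# Hypothetical test graph: triangle (0, 1, 2) with pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

W_inf, pi_inf = ruelle_bowen(A)
print(intrinsic_stationary(A, 1))    # degree-based density
print(intrinsic_stationary(A, 60))   # close to the Ruelle-Bowen density
print(pi_inf)
```

Because the adjacency matrix is symmetric, the product $\kappa_i^{(n)} \kappa_i^{(n-1)}$ is left invariant by $W^{(n)}$, which is why the normalized product serves as the stationary density for every order n.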
The time until a random walk approaches the stationary distribution (Figure 3) (i.e., the mixing time) is determined by the spectral gap, the difference between the two largest eigenvalues of the transition matrix. Over the canonical ensemble of intrinsic random walks, the spectral gap is maximal (and the mixing time minimal) for the anisotropic random walk $W_{ij}^{(\infty)}$ (Figure 4).
The relative entropy rate [29] between two Markov chains defined by their transition matrices,
$$\eta^{(n)} = \sum_{i=1}^{N} \pi_i^{(1)} \sum_{j=1}^{N} W_{ij}^{(1)} \log_2 \frac{W_{ij}^{(1)}}{W_{ij}^{(n)}} = \frac{1}{2E} \sum_{i,j} A_{ij} \log_2 \frac{\kappa_i^{(n)}}{\kappa_j^{(n-1)} \kappa_i^{(1)}} = \frac{1}{2E} \sum_{i,j} A_{ij} \left[ \log_2 \frac{\kappa_i^{(n)}}{\kappa_j^{(n-1)}} - \delta_{ij} \log_2 \kappa_i^{(1)} \right] \equiv \frac{1}{2E} \sum_{i,j} A_{ij} \left[ \Delta_{ij}^{(n)} - \delta_{ij} d_i \right], \tag{21}$$
can be used for measuring information divergence over the canonical ensemble of intrinsic random walks in connected undirected graphs and the degree of graph directional anisotropy [25].
In (21), we have introduced $d_i \equiv \log_2 \kappa_i^{(1)}$, a local counterpart of the space dimension parameter (4), and its generalization to n-walks, the directional graph space dimension tensor
$$\Delta_{ij}^{(n)} \equiv \log_2 \frac{\kappa_i^{(n)}}{\kappa_j^{(n-1)}}, \tag{22}$$
measuring the degree of directional anisotropy in transitions of the intrinsic random walks making up all n-walks available from the node i with equal probability. For $n = 1$, the graph space dimension tensor (22) reduces to the space dimension, $\Delta_{ij}^{(1)} = d_i$, as $\kappa_i^{(0)} = 1$ for all nodes. In the thermodynamic limit $n \to \infty$, the graph space dimension tensor reduces to a direction-dependent counterpart of the effective space dimension of the graph $d_G$ (6) (or the graph topological entropy), viz.,
$$\Delta_{ij}^{(\infty)} = \log_2 \left( \alpha_{\max} \frac{u_{i1}}{u_{j1}} \right). \tag{23}$$
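The dimension tensor and the relative entropy rate can be evaluated from the same walk counts. The following is a minimal sketch, assuming NumPy and a small hypothetical test graph; the function names are illustrative. The rate is computed from the middle expression of (21), i.e., as the edge average of $\log_2 \left[ \kappa_i^{(n)} / (\kappa_j^{(n-1)} \kappa_i^{(1)}) \right]$, which is an assumption about how the last bracket of (21) is to be read.

```python
import numpy as np


def walk_counts(A, n):
    """kappa_i^(n) = sum_j (A^n)_ij, with kappa_i^(0) = 1."""
    k = np.ones(A.shape[0])
    for _ in range(n):
        k = A @ k
    return k


def dimension_tensor(A, n):
    """Delta^(n)_ij = log2(kappa_i^(n) / kappa_j^(n-1)) for all pairs (i, j)."""
    k_prev = walk_counts(A, n - 1)
    k_n = A @ k_prev
    return np.log2(k_n[:, None] / k_prev[None, :])


def relative_entropy_rate(A, n):
    """eta^(n): divergence of W^(n) from the isotropic walk W^(1), in bits."""
    d = np.log2(A.sum(axis=1))                     # d_i = log2 kappa_i^(1)
    excess = dimension_tensor(A, n) - d[:, None]   # Delta^(n)_ij - d_i
    return float((A * excess).sum() / A.sum())     # A.sum() equals 2E


# Hypothetical test graph: triangle (0, 1, 2) with pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

for n in (1, 2, 5, 60):
    print(n, relative_entropy_rate(A, n))
```

The rate vanishes at $n = 1$ by construction and is non-negative for every n, since it is a Kullback–Leibler divergence rate between two Markov chains on the same graph.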

6. Navigation through Graphs over Canonical Ensembles of Walks

The problem of effective navigation in graphs and networks can be considered in the framework of the canonical ensemble of walks, since predicting the navigator’s location requires a known density of locations. Frequently visited sites are predicted more efficiently than rarely visited ones, especially in the long run [30].
Given a $\pi$-walk $\mathcal{W} = \{X_t \in V : t \in \mathbb{Z}\}$ defined in a connected undirected graph $G(V,E)$, Bayes’ theorem [29,31] describes the probability of the navigator’s present location $X$ based on prior knowledge of her previous location $t$ steps before, $X_{-t} \xrightarrow{t} X$. Namely, $X_{-t}$ may be a t-step precursor of $X$ with the following probability:
$$\Pr\left( X_{-t} \mid X \right) = \frac{\Pr\left( X_{-t} \xrightarrow{t} X \right) \pi\left( X_{-t} \right)}{\pi(X)}, \tag{24}$$
where $\Pr\left( X_{-t} \xrightarrow{t} X \right)$ is the probability of walking from $X_{-t}$ to $X$ in precisely t steps; $\pi(X_{-t})$ and $\pi(X)$ are the densities of locations $X_{-t}$ and $X$ with respect to the $\pi$-walk, respectively. $\Pr\left( X_{-t} \mid X \right)$ is a density of the t-step precursors for the location $X$ induced by the density of walks $\pi$. If $\Pr\left( X_{-t} \mid X \right) = \pi(X_{-t})$, it follows from (24) that the location $X$ is unpredictable (as any other location $X_{-t}$ is a precursor for $X$). The available information about visiting the location $X$ at present is therefore scattered over the entire graph in the past and can be assessed by observing all possible t-step precursors $X_{-t}$, viz.,
$$P_t(X) = \sum_{X_{-t}} \Pr\left( X_{-t} \mid X \right) \log_2 \frac{\Pr\left( X_{-t} \mid X \right)}{\pi(X_{-t})}. \tag{25}$$
The information divergence [29] (25) vanishes if and only if the density of t-step precursors $\Pr\left( X_{-t} \mid X \right)$ for the location $X$ over the graph G is identical to $\pi(X_{-t})$, so that visiting the location $X_{-t}$ in the past is statistically independent of visiting the present location $X$ t steps later, and therefore $X_{-t}$ is not a t-step precursor of $X$ [30]. The amount of information (25) attains its maximum value, viz.,
$$\max P_t(X) = -\log_2 \pi(X), \tag{26}$$
whenever the marginal probability $\pi(X)$ is the major left eigenvector of the t-step transition matrix $\Pr\left( X_{-t} \xrightarrow{t} X \right)$, so that $\Pr\left( X_{-t} \mid X \right) = 1$ for all t and $X_{-t}$, i.e., visiting any location in the graph G by the $\pi$-walk with probability 1 is a predictor for visiting any other location $X$ t steps later.
According to the Boltzmann equation (1), for ergodic observables, the time average of the maximal information (26) over the entire history of $\pi$-walks equals the entropy of the $\pi$-walk (16), viz.,
$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t} \max P_\tau(X) = -\sum_{\{X\}} \pi(X) \log_2 \pi(X) = kT \cdot H(X) \equiv I(\pi). \tag{27}$$
However, the actual amount of predictable (navigable) information about the navigator’s present location may be quite modest, much less than the amount of information revealed at every step of the $\pi$-walk ((16) and (27)): different graphs have different degrees of navigability.
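The precursor density (24) and the information divergence (25) can be computed for any stationary random walk. The sketch below assumes NumPy, a small hypothetical test graph, and the isotropic walk as the $\pi$-walk; the function names are illustrative.

```python
import numpy as np


def precursor_density(W, pi, t):
    """Q[a, b] = Pr(X_{-t} = a | X = b), obtained from Bayes' theorem (24)."""
    Wt = np.linalg.matrix_power(W, t)      # t-step transition probabilities
    return Wt * pi[:, None] / pi[None, :]


def navigable_information(W, pi, t):
    """P_t(X) of (25) for every present location X, in bits."""
    Q = precursor_density(W, pi, t)
    # Convention 0 * log 0 = 0: unreachable precursors contribute nothing.
    ratio = np.where(Q > 0, Q / pi[:, None], 1.0)
    return (Q * np.log2(ratio)).sum(axis=0)


# Hypothetical test graph: triangle (0, 1, 2) with pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = A / A.sum(axis=1, keepdims=True)       # isotropic walk W^(1)
pi = A.sum(axis=1) / A.sum()               # its stationary density

for t in (1, 2, 5, 100):
    print(t, navigable_information(W, pi, t))
print(-np.log2(pi))                        # upper bound (26) per location
```

For this aperiodic walk the divergence decays to zero as t grows, reflecting that very distant precursors carry no information about the present location, while at every t it stays below the bound $-\log_2 \pi(X)$.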

7. Navigability of Graphs and Graph Nodes over Canonical Ensembles of Walks

The information function (27) can be represented as a sum of the predictable and unpredictable information components [32], viz.,
$$I(\pi) = P(\pi) + U(\pi). \tag{28}$$
The predictable information component $P(\pi)$ measures the amount of apparent uncertainty about the navigator’s location that can be resolved with some navigation strategy compatible with the $\pi$-walks, and $U(\pi)$ gauges the amount of true uncertainty about the navigator’s location that cannot be inferred anyway. In the following, we attribute the predictable information component $P(\pi)$ to the navigability of the graph G by the $\pi$-walk.
Assuming that both information components in (28) have the same form as the information function (27), viz.,
$$P(\pi) = -\sum_{r=1}^{N} \pi_r \cdot \log_2 \varphi_r, \quad \text{and} \quad U(\pi) = -\sum_{r=1}^{N} \pi_r \cdot \log_2 \psi_r, \tag{29}$$
with some partition functions $\varphi_r$ and $\psi_r$, such that $\pi_r = \varphi_r \psi_r$, we obtain
$$I(\pi) = -\sum_{r=1}^{N} \pi_r \cdot \log_2 \left( \varphi_r \psi_r \right), \quad \varphi_r = \frac{\pi_r}{\psi_r}. \tag{30}$$
We call the partition function $\varphi_r$ the navigability to the node $r \in V$ in the graph G by the $\pi$-walk. Obviously, the navigability to the node $\varphi_r$ is proportional to its relative visiting frequency $\pi_r$ (the more frequently a location is visited, the higher its forecast accuracy) and inversely proportional to the partition function $\psi_r$ assessing the uncertainty of visiting the node r by the $\pi$-walk.
There are two major navigation strategies, landmark-based piloting and walk integration, working in concert during wayfinding in humans and animals [17]. First, the next visit location $X_{t+1}$ can be guessed from the navigator’s present position $X_t$ in the graph, and the degree of accuracy of such a guess can be assessed by the mutual information between the present and future navigator’s location conditioned on the walk history, $I\left( X_t; X_{t+1} \mid X_{t-1}, \ldots, X_1 \right)$. This strategy can be naturally associated with landmark-based piloting.
If the $\pi$-walk is a random walk defined by a transition matrix $W_{ij}$, the conditional mutual information for such a Markov chain depends only upon the navigator’s immediate past location $X_{t-1}$, but not on the entire historical sequence of locations visited by the navigator in the more distant past [32], so that
$$I\left( X_t; X_{t+1} \mid X_{t-1} \right) = H\left( X_{t+1} \mid X_{t-1} \right) - H\left( X_t \mid X_{t-1} \right) = \sum_{k=1}^{N} \pi_k \sum_{r=1}^{N} \left[ W_{kr} \log_2 W_{kr} - \left( W^2 \right)_{kr} \log_2 \left( W^2 \right)_{kr} \right]. \tag{31}$$
Second, some degree of uncertainty about the navigator’s future location $X_{t+1}$ might be resolved after all revisits and possible correlations between walks are taken into account in the course of walk integration over the presumably infinite motion history of the $\pi$-walk. The latter quantity is given by the excess entropy [33,34,35],
$$E(\pi) = I(\pi) - h(\pi), \tag{32}$$
where the entropy rate [29],
$$h(\pi) = \lim_{t \to \infty} \frac{1}{t} \sum_{k=1}^{t} H\left( X_k \mid X_{k-1}, \ldots, X_1 \right), \tag{33}$$
quantifies the mean amount of uncertainty contained in the whole (infinite) path history of the $\pi$-walks. However, it is intuitive that the values of the conditional entropies $H\left( X_t \mid X_{t-1}, \ldots, X_1 \right)$ in the r.h.s. of (33) do not increase with the length of walks and, therefore, $I(\pi) \geq h(\pi)$, so that $E(\pi) \geq 0$.
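For a random walk defined by a transition matrix $W_{ij}$, the Markov property simplifies the expression for the entropy rate (33), viz.,
$$h(\pi) = \lim_{t \to \infty} \frac{1}{t} \sum_{k=1}^{t} H\left( X_k \mid X_{k-1}, \ldots, X_1 \right) = \lim_{t \to \infty} \left[ \frac{1}{t} H(X_1) + \frac{1}{t} \left( H\left( X_2 \mid X_1 \right) + H\left( X_3 \mid X_2 \right) + \ldots \right) \right] = \lim_{t \to \infty} \left[ \frac{1}{t} H(X_1) + \frac{t-1}{t} H\left( X_2 \mid X_1 \right) \right] = H\left( X_2 \mid X_1 \right) = -\sum_{k=1}^{N} \pi_k \sum_{r=1}^{N} W_{kr} \log_2 W_{kr}, \tag{34}$$
so that the excess entropy (32) reads as follows:
$$E(\pi) = \sum_{k=1}^{N} \pi_k \cdot \left[ -\log_2 \pi_k + \sum_{r=1}^{N} W_{kr} \log_2 W_{kr} \right]. \tag{35}$$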
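By summing (31) and (35), we obtain the total amount of predictable information $P(\pi)$ revealed by the $\pi$-walk in the graph $G$, viz.,
$$
P(\pi) = E(\pi) + I\left(X_t; X_{t+1} \mid X_{t-1}\right) = -\sum_{k=1}^{N} \pi_k \cdot \left( \log_2 \pi_k + \sum_{r=1}^{N} \left(W^2\right)_{kr} \log_2 \left(W^2\right)_{kr} \right) = -\sum_{k=1}^{N} \pi_k \cdot \log_2 \left( \pi_k \cdot \prod_{r=1}^{N} \left(W^2\right)_{kr}^{\left(W^2\right)_{kr}} \right), \tag{36}
$$
so that the navigability to the node $k$ in the graph $G$ by the random walks defined by the transition matrix $W_{ij}$ is
$$
\varphi_k = \pi_k \cdot \prod_{r=1}^{N} \left(W^2\right)_{kr}^{\left(W^2\right)_{kr}}. \tag{37}
$$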
Navigability to a node evaluated by the partition function (37) depends on the strategy of walkers. In Figure 5, we illustrate the difference by highlighting the nodes of the membrane graph according to their degrees of navigability by the isotropic random walks $W_{ij}^{(1)}$ (left) and the anisotropic random walks $W_{ij}^{(\infty)}$ (right). For the isotropic random walks, the movements of walkers along the low-dimensional boundaries and at the corners of the graph are more predictable than their movements in the bulk, as all bulk locations of the same connectivity are visited with equal probability by the random walk $W_{ij}^{(1)}$. In contrast with the isotropic random walks, a navigator following the anisotropic strategy $W_{ij}^{(\infty)}$ is statistically confined within the region hosting most of the infinitely long paths available in the graph, where the navigator’s position is very likely.
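The product in (37) is straightforward to evaluate for a given transition matrix. The sketch below does so for a simple random walk on a three-node path graph; the example graph, the function names, and the convention $0^0 = 1$ for vanishing two-step probabilities are assumptions made for illustration, not part of the original analysis.

```python
def matmul(a, b):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def navigability(W, pi):
    """phi_k = pi_k * prod_r (W^2)_kr ** (W^2)_kr, skipping zero entries (0**0 = 1)."""
    W2 = matmul(W, W)  # two-step transition probabilities
    phi = []
    for pk, row in zip(pi, W2):
        prod = 1.0
        for w in row:
            if w > 0:
                prod *= w ** w
        phi.append(pk * prod)
    return phi

# Hypothetical example: simple random walk on the path graph 0 - 1 - 2.
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
pi = [0.25, 0.5, 0.25]  # stationary density, proportional to node degree

phi = navigability(W, pi)
print([round(x, 4) for x in phi])
```

In this small example the middle node obtains the largest value, because a walker leaving it is certain to be back after two steps, while a walker leaving an end node is equally likely to be found at either end.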
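As demonstrated in [32], the entropy function $I(\pi) \equiv H(X_t)$ allows for the following decomposition involving the conditional entropies:
$$
\begin{aligned}
H(X_t) &= H(X_t) - H\left(X_{t+1} \mid X_t\right) + H\left(X_{t+1} \mid X_t\right) \\
&= H(X_t) - H\left(X_{t+1} \mid X_t\right) + H\left(X_{t+1} \mid X_t\right) + H\left(X_t \mid X_{t-1}\right) - H\left(X_t \mid X_{t-1}\right) + H\left(X_{t+1} \mid X_{t-1}\right) - H\left(X_{t+1} \mid X_{t-1}\right) \\
&= \underbrace{H(X_t) - H\left(X_{t+1} \mid X_t\right)}_{E(\pi)} + \underbrace{H\left(X_{t+1} \mid X_{t-1}\right) - H\left(X_t \mid X_{t-1}\right)}_{I\left(X_t; X_{t+1} \mid X_{t-1}\right)} + \underbrace{H\left(X_{t+1} \mid X_t\right) + H\left(X_t \mid X_{t-1}\right) - H\left(X_{t+1} \mid X_{t-1}\right)}_{U(\pi)}.
\end{aligned} \tag{38}
$$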
Therefore, the remaining part of the information function (28), the last term in the decomposition (38), is the conditional entropy of the navigator’s present location conditioned on her past and future locations, viz.,
$$
U(\pi) = I(\pi) - P(\pi) = H\left(X_t \mid X_{t-1}\right) + H\left(X_t \mid X_{t-1}\right) - H\left(X_{t+1} \mid X_{t-1}\right), \tag{39}
$$
which assesses the amount of true uncertainty about a navigator’s location that can neither be inferred by integrating over the past history of the $\pi$-walk nor has any repercussion for the navigator’s walk in the future [34]. For a random walk defined by the transition matrix $W_{ij}$, we readily obtain that
$$
U(\pi) = -\sum_{k=1}^{N} \pi_k \cdot \log_2 \psi_k = -\sum_{k=1}^{N} \pi_k \cdot \log_2 \frac{\pi_k}{\varphi_k} = \sum_{k=1}^{N} \pi_k \cdot \log_2 \prod_{r=1}^{N} \left(W^2\right)_{kr}^{\left(W^2\right)_{kr}}, \tag{40}
$$
where the partition function $\psi_k$ assesses the amount of uncertainty about the navigator’s visiting the node $k \in V$.
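The three-term decomposition (38) and the interpretation of its last term as the entropy of the present location given both the past and the future locations can be verified numerically. The sketch below is a minimal check under stated assumptions: a simple random walk on a hypothetical four-node graph, with the conditional entropy $H(X_t \mid X_{t-1}, X_{t+1})$ computed directly from the joint law $p(a, b, c) = \pi_a W_{ab} W_{bc}$ and compared with the combination of conditional entropies in (39). All function names are illustrative.

```python
from math import log2

def entropy(p):
    """Shannon entropy (bits) of a probability vector, skipping zero entries."""
    return -sum(x * log2(x) for x in p if x > 0)

def matmul(a, b):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def conditional_entropy(T, pi):
    """Entropy of the next state given the current one, for transition matrix T."""
    return sum(pk * entropy(row) for pk, row in zip(pi, T))

def uncertainty_direct(W, pi):
    """H(X_t | X_{t-1}, X_{t+1}) from the joint law p(a, b, c) = pi_a W_ab W_bc."""
    W2 = matmul(W, W)
    n = len(W)
    total = 0.0
    for a in range(n):
        for b in range(n):
            for c in range(n):
                p_abc = pi[a] * W[a][b] * W[b][c]
                if p_abc > 0:
                    # p(b | a, c) = W_ab * W_bc / (W^2)_ac
                    total -= p_abc * log2(W[a][b] * W[b][c] / W2[a][c])
    return total

# Hypothetical example graph: triangle 0-1-2 with a pendant node 3 attached to node 0.
A = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
deg = [sum(row) for row in A]
W = [[a / d for a in row] for row, d in zip(A, deg)]  # simple random walk
pi = [d / sum(deg) for d in deg]                      # stationary density ~ degree

H = entropy(pi)                              # H(X_t)
h = conditional_entropy(W, pi)               # H(X_{t+1} | X_t) = H(X_t | X_{t-1})
H2 = conditional_entropy(matmul(W, W), pi)   # H(X_{t+1} | X_{t-1})

E = H - h            # first underbrace in (38)
I_cond = H2 - h      # second underbrace in (38)
U = 2 * h - H2       # third underbrace in (38), equal to the r.h.s. of (39)

print(f"E = {E:.4f}, I = {I_cond:.4f}, U = {U:.4f}, sum = {E + I_cond + U:.4f}, H = {H:.4f}")
print(f"U computed directly from the joint law: {uncertainty_direct(W, pi):.4f}")
```

The agreement between `U` and `uncertainty_direct(W, pi)` relies on the stationarity of $\pi$; for a non-stationary initial density the two one-step conditional entropies in (39) would no longer coincide.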

8. A Grand-Canonical Ensemble of Ergodic Walks in Finite Connected Undirected Graphs

The grand canonical ensemble represents the possible states of a system exchanging energy and particles with a heat bath in thermodynamic equilibrium [11]. The growth rate of the number of distinguishable paths in a graph tends to its topological entropy $\mu = \log_2 \alpha_{\max}$ in the thermodynamic limit $n \to \infty$. However, the local growth rate of the number of distinguishable paths available from a node, $\log_2 \left( \alpha_{\max} \gamma_1 u_{i1} \right)$, might differ from the graph topological entropy. In the grand canonical ensemble, the probability to observe such a "fluctuation" of the long paths growth rate inferior to the topological entropy at the node $i$ is taken to be
$$
P_i^{(n)} = \frac{1}{Z_n} \exp\left( \frac{\delta_i F_n}{kT} \right), \qquad Z_n \equiv \sum_{j=1}^{N} \exp\left( \frac{\delta_j F_n}{kT} \right), \tag{41}
$$
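where $\delta_i F_n = F_n - E_i(n)$ is a fluctuation of free energy (8) associated with the heterogeneity of the growth rate of the number of very long walks in the graph. The grand partition function $Z_n$ amasses the fugacities $\exp\left( \delta_j F_n / kT \right)$ of all nodes in the graph, playing the role of a normalization factor in (41). In the thermodynamic limit $n \to \infty$, $\lim_{n \to \infty} F_n = \lim_{n \to \infty} \log_2 \sum_{ij} \left(A^n\right)_{ij} = \lim_{n \to \infty} \log_2 \alpha_{\max}^n = n \mu$, and $\lim_{n \to \infty} E_i(n) = \lim_{n \to \infty} \log_2 \left( \alpha_{\max}^n u_{i1} \gamma_1 \right)$, so that the expression for the node’s fugacity takes the following form:
$$
\exp\left( \frac{\delta_i F}{kT} \right) = \lim_{n \to \infty} \exp\left( \frac{n \mu - E_i(n)}{kT} \right) = \exp\left( \frac{\log_2 \alpha_{\max}^n - \log_2 \left( \alpha_{\max}^n u_{i1} \gamma_1 \right)}{1 / \ln 2} \right) = \frac{\alpha_{\max}^n}{\alpha_{\max}^n u_{i1} \gamma_1} = \frac{1}{u_{i1} \gamma_1}, \qquad \gamma_1 \equiv \sum_{j=1}^{N} u_{j1}, \tag{42}
$$
the grand partition function reads as follows:
$$
Z \equiv \lim_{n \to \infty} Z_n = \frac{1}{\gamma_1} \sum_{j=1}^{N} \frac{1}{u_{j1}}, \qquad \gamma_1 \equiv \sum_{i=1}^{N} u_{i1}, \tag{43}
$$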
and, finally, the grand canonical probability (41) takes the form of a Fermi–Dirac distribution in the thermodynamic limit $n \to \infty$, viz.,
$$
P_i = \lim_{n \to \infty} P_i^{(n)} = \frac{1 / u_{i1}}{\sum_{j=1}^{N} 1 / u_{j1}} = \frac{1}{1 + u_{i1} \sum_{j \neq i}^{N} \frac{1}{u_{j1}}}. \tag{44}
$$
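The approach to the limit (44) can be observed by counting walks directly. The sketch below assumes that $E_i(n)$ is the base-2 logarithm of the number of $n$-step walks starting at node $i$ (an assumption consistent with the limit quoted above, since the row sums of $A^n$ grow as $\alpha_{\max}^n u_{i1} \gamma_1$), so that the finite-$n$ probability is proportional to the inverse of that number. The example graph and function names are hypothetical; a non-bipartite graph is chosen so that the ratios of walk counts converge.

```python
def walk_counts(A, n):
    """Number of n-step walks starting at each node: the row sums of A^n."""
    counts = [1] * len(A)
    for _ in range(n):
        counts = [sum(a * c for a, c in zip(row, counts)) for row in A]
    return counts

def grand_canonical_finite(A, n):
    """P_i^(n), taken proportional to 1 / (number of n-step walks from node i)."""
    inv = [1.0 / c for c in walk_counts(A, n)]
    total = sum(inv)
    return [x / total for x in inv]

# Hypothetical example graph: triangle 0-1-2 with a pendant node 3 attached to node 0.
A = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]

for n in (1, 5, 60):
    print(n, [round(p, 4) for p in grand_canonical_finite(A, n)])

P60 = grand_canonical_finite(A, 60)
P61 = grand_canonical_finite(A, 61)
```

The pendant node, which offers the fewest long walks, receives the largest probability, and the values for successive large $n$ agree to many digits.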
The grand potential $\Omega$, playing the role of free energy with respect to the grand partition function $Z$ in the grand canonical ensemble, equals
$$
\Omega = -kT \ln Z = -\log_2 \left( \frac{1}{\gamma_1} \sum_{j=1}^{N} \frac{1}{u_{j1}} \right). \tag{45}
$$
Having the form of the relative fugacity of a node, the grand canonical probability (44) can be regarded as measuring the ease of separation of the vertex from the rest of the graph with respect to the entire system of infinite paths. The nodes with a long paths growth rate inferior to the topological entropy are insufficiently integrated into the graph structure and might be lost or acquire new connections in the course of prospective graph structural modifications.
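The limiting quantities (42)–(45) depend only on the Perron eigenvector of the adjacency matrix, so they can be sketched in a few lines. The example below is a hedged illustration, not the procedure used for the figures: it assumes that $u_{i1}$ denotes the components of the Perron eigenvector normalized to unit Euclidean length, obtains that vector by power iteration on $A + I$ (a shift that leaves the eigenvectors unchanged and avoids oscillations on bipartite graphs), and uses a three-node path graph as a hypothetical input.

```python
from math import log2, sqrt

def perron_vector(A, iters=500):
    """Power iteration on A + I; returns the positive eigenvector with unit norm."""
    n = len(A)
    u = [1.0] * n
    for _ in range(iters):
        v = [u[i] + sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in v))
        u = [x / norm for x in v]
    return u

# Hypothetical example: the path graph 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

u = perron_vector(A)                           # components u_i1
gamma1 = sum(u)                                # gamma_1 = sum_j u_j1
fugacity = [1.0 / (ui * gamma1) for ui in u]   # node fugacities, as in (42)
Z = sum(fugacity)                              # grand partition function, as in (43)
P = [f / Z for f in fugacity]                  # grand canonical probabilities, as in (44)
Omega = -log2(Z)                               # grand potential, as in (45)

print("P =", [round(p, 4) for p in P])
print("Z =", round(Z, 4), "Omega =", round(Omega, 4))
```

The probabilities $P_i$ do not depend on how the eigenvector is normalized, whereas $Z$ and $\Omega$ do, which is why the unit-norm convention is stated explicitly as an assumption here.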
In Figure 6, we have presented the membrane graph (left) and the spatial graph of the city of Lubbock, Texas (right), with their nodes colored according to the values of the grand canonical probabilities (44). The nodes located on the low-dimensional graph boundaries, at the corners of the membrane graph, and in the loosely connected southern suburbs of the city of Lubbock have distinctly higher relative fugacity than others. These nodes can also be regarded as the points of prospective network growth, where the graph as a system of infinite paths remains open. Interestingly, the highlighted nodes in the spatial graph of Lubbock (Figure 6, right) mark the city neighborhoods currently under construction.

9. Discussion and Conclusions

We have defined three major thermodynamic ensembles of ergodic walks in connected undirected graphs in the thermodynamic limit of infinitely long walks, and shown that the ergodic mindset might be applied not only to particles of ideal gases, but also to quite abstract objects of discrete mathematics, such as graphs.
We have demonstrated that graph structural defects and irregularities, such as missing nodes and edges, might dramatically reduce the number of available very long paths, globally reshaping the mobility patterns in the entire graph. In the framework of micro-canonical ensembles, we may consider their effect as resulting from the action of the entropic pressure and force repelling walkers from structural irregularities and boundaries toward the best integrated region of the network: the laxer the connection, the stronger the repulsion. Perhaps the cumulative effect of entropic forces generated by railways and other structural obstacles, along with the unbalanced growth of urban neighborhoods, might be responsible for the urban decay process in the historical districts of some cities.
We have also shown that the problem of effective navigation [36] in graphs and networks can be considered with respect to a canonical ensemble of walks, as an effective prediction of a navigator’s position, and requires that the density of locations in the walk be known. According to the probabilistic setting, frequently visited sites are predicted more efficiently than rarely visited ones, especially in the long run: the more frequently a node is visited, the more predictable the position of a navigator visiting it. Regular lattices and homogeneous graphs lacking structural salience and landmarks might also be confusing environments, dramatically reducing the predictability of the navigator’s position.
Finally, we have studied the grand canonical ensemble of very long paths describing the statistics of fluctuations of the local path growth rate with respect to the graph topological entropy. In the thermodynamic limit of infinite paths, the distribution of the relative fugacity over the graph nodes takes the form of a Fermi–Dirac distribution function. A high relative fugacity value of a node indicates that the degree of its integration into the system of infinite paths is insufficient, and that the graph is open to prospective structural modifications associated with the node. In urban spatial graphs, the nodes of high fugacity might be concentrated in the neighborhoods under construction, marking the points of city network growth.
Future research should consider a comprehensive structural "equation of state" for networks and graphs.

Funding

Our work has been partially funded under the contract W911 W6-13-2-0004 between the Texas Tech University, the US Department of Defense, and the AVX Aircraft Company.

Acknowledgments

D.V. acknowledges administrative and technical support received from the Texas Tech University. The author is grateful to Veniamin Smirnov for downloading and processing the urban spatial data from the OpenStreetMap service database. The author profoundly thanks participants of the 1st Online Conference on Nonlinear Dynamics and Complexity held 23–25 November 2020, Central Time Zone, USA for the multiple useful discussions and commentaries on the paper. The author also thanks E.I. Kotegova for multiple inspiring discussions during preparation of the article in tumultuous time.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Boltzmann, L. Über die mechanische Bedeutung des zweiten Hauptsatzes der Wärmetheorie. In Wissenschaftliche Abhandlungen; Hasenöhrl, F.P., Ed.; Chelsea: New York, NY, USA, 1968; Volume I.
  2. Boltzmann, L. Über die Eigenschaften monozyklischer und anderer damit verwandter Systeme. In Wissenschaftliche Abhandlungen; Hasenöhrl, F.P., Ed.; Chelsea: New York, NY, USA, 1968; Volume III.
  3. Gallavotti, G. Ergodicity, Ensembles, Irreversibility in Boltzmann and Beyond. J. Stat. Phys. 1995, 78, 1571–1589.
  4. Dorogovtsev, S.N.; Mendes, J.F.F. Evolution of Networks: From Biological Nets to the Internet and WWW; Oxford University Press, Inc.: Oxford, UK, 2003.
  5. Bianconi, G.; Barabási, A.-L. Bose–Einstein condensation in complex networks. Phys. Rev. Lett. 2001, 86, 5632–5635.
  6. Bianconi, G.; Barabási, A.-L. Competition and multiscaling in evolving networks. Europhys. Lett. 2001, 54, 436–442.
  7. Albert, R.; Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002, 74, 47–49.
  8. Gardner, M. The Second Scientific American Book of Mathematical Puzzles and Diversions; Simon & Schuster: New York, NY, USA, 1961; ISBN 978-0-226-28253-4.
  9. Helmholtz, H. Principien der Statik monocyklischer Systeme. In Wissenschaftliche Abhandlungen; Johann Ambrosius Barth: Leipzig, Germany, 1895; Volume III.
  10. Helmholtz, H. Studien zur Statik monocyklischer Systeme. In Wissenschaftliche Abhandlungen; Johann Ambrosius Barth: Leipzig, Germany, 1895; Volume III.
  11. Gibbs, J.W. Elementary Principles in Statistical Mechanics; Charles Scribner’s Sons: New York, NY, USA, 1902.
  12. Lovász, L. Random Walks on Graphs: A Survey. In Combinatorics, Paul Erdös is Eighty; Mathematical Studies 2; Bolyai Society: Keszthely, Hungary, 1993; pp. 1–46.
  13. Parry, W. Intrinsic Markov Chains. Trans. Am. Math. Soc. 1964, 112, 55–66.
  14. Ruelle, D. Thermodynamic Formalism; Reading Mass; Addison-Wesley: Boston, MA, USA, 1978.
  15. Denker, M.; Grillenberger, C.; Sigmund, K. Ergodic Theory on Compact Spaces; Lecture Notes in Mathematics; Springer: Berlin, Germany; New York, NY, USA, 1976; Volume 527.
  16. Kitchens, B. Symbolic Dynamics. One-Sided, Two-Sided and Countable State Markov Shifts. Universitext; Springer: Berlin, Germany, 1998.
  17. Epstein, R.A.; Vass, L.K. Neural systems for landmark-based wayfinding in humans. Philos. Trans. R. Soc. B Biol. Sci. 2014, 369, 20120533.
  18. Zia, R.K.P.; Redish, E.F.; McKay, S.R. Making sense of the Legendre transform. Am. J. Phys. 2009, 77, 614–622.
  19. Scott Shell, M. Thermodynamics and Statistical Mechanics: An Integrated Approach; Cambridge University Press: Cambridge, UK, 2014; Chapter 7.
  20. Adler, R.L.; Konheim, A.G.; McAndrew, M.H. Topological entropy. Trans. Am. Math. Soc. 1965, 114, 309–319.
  21. Delvenne, J.C.; Libert, A.S. Centrality measures and thermodynamic formalism for complex networks. Phys. Rev. E 2011, 83, 046117.
  22. Volchenkov, D.; Smirnov, V. The City of Lubbock is Running Away. Integration and Isolation Patterns in the Wandering City. J. Vib. Test. Syst. Dyn. 2019, 3, 121–132.
  23. Khinchin, A.I. Mathematical Foundations of Information Theory; Dover: New York, NY, USA, 1957.
  24. Ramshaw, J.D. The Statistical Foundations of Entropy; World Scientific: Singapore, 2018.
  25. Volchenkov, D. Grammar Of Complexity: From Mathematics to a Sustainable World; World Scientific: Singapore, 2018.
  26. Aldous, D.; Fill, J.A. Reversible Markov Chains and Random Walks on Graphs; Unfinished Monograph, Recompiled 2014. Available online: https://www.stat.berkeley.edu/users/aldous/RWG/book.html (accessed on 28 November 2020).
  27. Burda, Z.; Duda, J.; Luck, J.M.; Waclaw, B. Localization of the Maximal Entropy Random Walk. Phys. Rev. Lett. 2009, 102, 160602.
  28. Landau, E. Zur relativen Wertbemessung der Turnierresultate. Dtsch. Wochenschach 1895, 11, 366–369.
  29. Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley: Hoboken, NJ, USA, 1991; p. 576.
  30. Smirnov, V.; Volchenkov, D. Five Years of Phase Space Dynamics of the Standard & Poor’s 500. Appl. Math. Nonlinear Sci. 2019, 4, 209–222.
  31. Lee, P.M. Bayesian Statistics; Wiley: Hoboken, NJ, USA, 2012; ISBN 978-1-1183-3257-3.
  32. Volchenkov, D. Memories of the Future. Predictable and Unpredictable Information in Fractional Flipping a Biased Coin. Entropy 2019, 21, 807.
  33. Watanabe, S.; Accardi, L.; Freudenberg, W.; Ohya, M. (Eds.) Algebraic Geometrical Method in Singular Statistical Estimation; Series in Quantum Bio-Informatics; World Scientific: Singapore, 2008; pp. 325–336.
  34. James, R.G.; Ellison, C.J.; Crutchfield, J.P. Anatomy of a bit: Information in a time series observation. Chaos 2011, 21, 037109.
  35. Travers, N.F.; Crutchfield, J.P. Infinite excess entropy processes with countable-state generators. Entropy 2014, 16, 1396–1413.
  36. Du, Y.; Wang, C.; Qiao, Y.; Zhao, D.; Guo, W. A geographical location prediction method based on continuous time series Markov model. PLoS ONE 2018, 13, e0207063.
Figure 1. Entropic pressure and force in the membrane graph. Left: The nodes are colored according to the values of entropic pressure (9). Right: The nodes are colored according to the values of the Perron eigenvector of the entropic force matrix $F_{ij}$ (10).
Figure 2. Entropic pressure and force in the city spatial graph of Lubbock, Texas (of 10,421 nodes). Left: The nodes of the city graph are colored according to values of entropic pressure (9). Right: The nodes of the city graph are colored according to values of the Fiedler eigenvector belonging to the second largest eigenvalue of the entropic force matrix $F_{ij}$ (10) (or the smallest eigenvalue of the associated Laplacian matrix). The Fiedler eigenvector indicates the direction of fastest decrease of the entropic force over the city spatial graph of Lubbock.
Figure 3. Densities of nodes in the membrane graph with respect to the isotropic and anisotropic intrinsic random walks. Left: The density of nodes with respect to the isotropic random walk $W_{ij}^{(1)}$ is proportional to their degree centrality. Right: The anisotropic random walk $W_{ij}^{(\infty)}$ is confined to the central nodes of the membrane graph.
Figure 4. The spectral gap is maximum (mixing time is minimum) over the canonical ensemble of intrinsic random walks for the anisotropic random walk $W_{ij}^{(\infty)}$.
Figure 5. Navigability to the nodes in the membrane graph by the isotropic $W_{ij}^{(1)}$ (left) and anisotropic $W_{ij}^{(\infty)}$ (right) intrinsic random walks.
Figure 6. The grand canonical probabilities in the membrane graph (left) and in the spatial graph of the city of Lubbock, Texas (right). The highlighted nodes exhibit long paths growth rates inferior to the topological entropy of the graph in the thermodynamic limit $n \to \infty$, and therefore have higher relative fugacity in the course of prospective graph structural changes.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Volchenkov, D. Infinite Ergodic Walks in Finite Connected Undirected Graphs. Entropy 2021, 23, 205. https://doi.org/10.3390/e23020205
