Article

Further Exploration of an Upper Bound for Kemeny’s Constant

by Robert E. Kooij 1,2,* and Johan L. A. Dubbeldam 1
1 Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 CD Delft, The Netherlands
2 TNO (Unit ICT, Strategy & Policy, Netherlands Organisation for Applied Scientific Research), 2595 DA The Hague, The Netherlands
* Author to whom correspondence should be addressed.
Entropy 2025, 27(4), 384; https://doi.org/10.3390/e27040384
Submission received: 12 February 2025 / Revised: 1 April 2025 / Accepted: 2 April 2025 / Published: 4 April 2025
(This article belongs to the Special Issue Complexity, Entropy and the Physics of Information II)

Abstract

Even though Kemeny’s constant was first discovered in Markov chains and expressed by Kemeny in terms of mean first passage times on a graph, it can also be expressed using the pseudo-inverse of the Laplacian matrix representing the graph, which facilitates the calculation of a sharp upper bound of Kemeny’s constant. We show that for certain classes of graphs, a previously found bound is tight, which generalises previous results for bipartite and (generalised) windmill graphs. Moreover, we show numerically that for real-world networks, this bound can be used to find good numerical approximations for Kemeny’s constant. For certain graphs consisting of up to 100 K nodes, we find a speedup of a factor 30, depending on the accuracy of the approximation that can be achieved. For networks consisting of over 500 K nodes, the approximation can be used to estimate values for the Kemeny constant, where exact calculation is no longer feasible within reasonable computation time.

1. Introduction

Kemeny’s constant, a graph metric first proposed in 1960 [1], links random walks, Markov chains, and spectral graph theory; see, for instance, [2,3,4].
An intuitive way to understand Kemeny’s constant is through random walks on a graph, which is also how it was originally presented by Kemeny [1]. For an undirected connected graph with an adjacency matrix $A$, we can define a transition matrix $P_{ij} = A_{ij}/d_i$ for the transition from state $i$ to state $j$, where $d_i$ is the degree of node $i$. This defines an irreducible finite-state Markov chain in discrete time with an $N \times N$ transition matrix $P$ [5]. If we also know the mean first-passage time matrix $m_{ij}$, denoting the average time to go from a vertex $i$ to a vertex $j$ (we take $m_{ii} = 0$ by convention), the Kemeny constant is defined by
$$K(P) = \sum_{j=1}^{N} \pi_j m_{ij}, \tag{1}$$
where $\pi_j$ is the $j$-th component of the stationary distribution of the random walk. The fact that $K(P)$ does not depend on the index $i$, which can be interpreted as the starting state of the random walk, and is therefore truly a constant, was discussed in a number of papers [6,7]. Hunter [8] and Kirkland [9] have analysed Relation (1) and established a connection with generalised matrix inverses.
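The independence of the starting node in Relation (1) can be checked numerically on a small example. The sketch below is illustrative only: the five-node graph (a 5-cycle with one chord), the variable names, and the route via the fundamental matrix $Z = (I - P + u\pi^T)^{-1}$, with $m_{ij} = (Z_{jj} - Z_{ij})/\pi_j$, are choices made here and are not taken from the authors' code.

```python
import numpy as np

# Illustrative graph: 5-cycle 0-1-2-3-4-0 plus the chord 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
N = 5
A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

d = A.sum(axis=1)              # node degrees d_i
P = A / d[:, None]             # transition matrix P_ij = A_ij / d_i
pi = d / d.sum()               # stationary distribution pi_j = d_j / (2L)

# Fundamental matrix of the chain; m_ij = (Z_jj - Z_ij) / pi_j, so m_ii = 0.
Z = np.linalg.inv(np.eye(N) - P + np.outer(np.ones(N), pi))
M = (np.diag(Z)[None, :] - Z) / pi[None, :]

# Relation (1), evaluated once for every possible starting node i.
K_per_start = M @ pi
print(K_per_start)
```

All entries of `K_per_start` coincide, which is the sense in which $K(P)$ is a constant of the chain rather than a property of the starting state.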
The Kemeny constant also has an interpretation as a ‘mixing time’, which was originally proposed by Hunter in [7]. Here, we briefly repeat the demonstration that the Kemeny constant can be identified with a mixing time and show that this can be directly interpreted in terms of entropy. Let us define the ‘time to mixing’, $T$, of a Markov chain $\{X_n\}$ following [7], as the smallest index $k$ at which $X_k = Y$, where $Y$ is a random variable distributed according to the stationary distribution $\{\pi_j\}$ of the Markov chain. We can now calculate the conditional expectation value of $T$, $E[T \mid Y, X(0) = i]$,
$$E[T \mid Y, X(0) = i] = \sum_j E[T, Y = j \mid X(0) = i]\, P[Y = j] = \sum_j E[T_{ij} \mid X(0) = i]\, \pi_j = \sum_j m_{ij} \pi_j = K(P), \tag{2}$$
where $T_{ij}$ is the first-passage time for going from node $i$ to node $j$, whose mean is $m_{ij}$.
Equation (2) for the mixing time permits an interpretation in terms of the relative entropy, or Kullback–Leibler divergence, $D(p\|\pi)$, which measures the distance between the distributions $p$ and $\pi$; see also [10]. With $p_j^{(n)}$ denoting the probability that the walk is at node $j$ after $n$ steps, the relative entropy is defined as
$$D_n(p\|\pi) = \sum_j p_j^{(n)} \log \frac{p_j^{(n)}}{\pi_j}.$$
Since $D_n(p\|\pi) \geq 0$, with equality only when $p_j^{(n)} = \pi_j$ for all $j = 1, \ldots, N$, the time to mixing can be interpreted as the smallest value of $n$ for which the relative entropy $D_n(p\|\pi) = 0$.
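To make the entropy picture concrete, the following sketch tracks the relative entropy of a random walk step by step. It is only an illustration under assumptions made here: the graph (a 5-cycle with one chord, chosen because it is not bipartite), the starting node, and the number of steps are arbitrary. In floating-point arithmetic the divergence decays towards zero rather than hitting it exactly, so a practical reading of the criterion above would use a small tolerance.

```python
import numpy as np

# Illustrative non-bipartite graph: 5-cycle 0-1-2-3-4-0 plus the chord 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
N = 5
A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
d = A.sum(axis=1)
P = A / d[:, None]             # random-walk transition matrix
pi = d / d.sum()               # stationary distribution


def relative_entropy(p, q):
    """D(p || q) = sum_j p_j log(p_j / q_j), with 0 log 0 taken as 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


p = np.zeros(N)
p[0] = 1.0                     # the walk starts in node 0
D = []
for n in range(61):
    D.append(relative_entropy(p, pi))
    p = p @ P                  # p^(n+1) = p^(n) P

print(D[0], D[10], D[60])
```

The sequence `D` is non-increasing, starts at $\log(1/\pi_0)$ for a walk started in node 0, and becomes numerically negligible after a few dozen steps on this small graph.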
Kemeny’s constant has recently also been suggested as a metric to identify bottleneck roads whose removal would greatly reduce the connectivity of the network [11] or as a metric to determine the ‘superspreader’ links that transmit disease between different communities [12].
It has already been established that there are several equivalent ways to express Kemeny’s constant: using effective graph resistance, random walks, spectral graph theory, and pseudo-inverse Laplacians; see [8].
The study of Kemeny’s constant is still an active and relevant research field, as was showcased by the mini-symposium “Kemeny’s constant on networks and its application”, which was organised as part of the 24th Conference of the International Linear Algebra Society, which took place in Galway, Ireland, 20–24 June 2022 [13] as well as recent papers addressing applications of Kemeny’s constant to different networks [14,15].
In 2017, Wang et al. [4] derived a closed-form formula for Kemeny’s constant $K(P)$ for a random walk on a graph $G$ with $N$ nodes and $L$ edges, where the transition matrix is given by $P = \Delta^{-1} A(G)$, with $A(G)$ the adjacency matrix of $G$ and $\Delta$ the diagonal matrix containing the degrees of the nodes. In [4], it was shown that $K(P)$ can be expressed in terms of the Moore–Penrose pseudo-inverse $Q^{\dagger}$ of the Laplacian matrix of $G$, as
$$K(P) = \zeta^T d - \frac{d^T Q^{\dagger} d}{2L}, \tag{3}$$
where the column vector $\zeta = (Q^{\dagger}_{11}, Q^{\dagger}_{22}, \ldots, Q^{\dagger}_{NN})^T$ and $d(G) = (d_1, d_2, \ldots, d_N)^T$ denotes the degree vector of the graph.
In [4], not only was Equation (3) derived, but also a closely connected upper bound:
$$K(P) \leq \zeta^T d - \frac{H(G)}{D(G)\,\mu_1(G)} \equiv K_U(P), \tag{4}$$
where $D(G)$ is the average degree and $\mu_1(G)$ is the largest eigenvalue of the Laplacian matrix $\Delta(G) - A(G)$ corresponding to graph $G$. Here, $\Delta(G)$ denotes the diagonal matrix containing the degrees of the nodes. The heterogeneity index $H(G)$, measuring the variability in the degrees of the nodes (see [16]), is defined as
$$H(G) = \frac{1}{N} \sum_{i=1}^{N} \left(d_i - D(G)\right)^2,$$
where $d_i$ is the degree of the $i$-th node.
It was shown in [17] that the upper bound given in Equation (4) is tight, meaning that equality holds in Equation (4), for two classes of graphs, namely complete bipartite graphs and (generalised) windmill graphs. A windmill graph $W(\eta, k)$ consists of $\eta$ copies of the complete graph $K_k$, with each node connected to a common node. Two generalisations of windmill graphs were suggested by Kooij [18] in 2019. For both generalisations, we replace the central node, connecting all $\eta$ copies of the complete graph $K_k$, with $l$ central nodes. For the first generalisation, we assume that the $l$ central nodes are all connected, i.e., they form a clique $K_l$. We call this a generalised windmill graph of Type I and denote it by $W'(\eta, k, l)$. For the second generalisation, we assume that the $l$ central nodes have no connections among each other. We will refer to it as a Type II generalised windmill graph and denote it by $W''(\eta, k, l)$. Figure 1 shows examples of a windmill graph and its two generalisations.
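As a small illustration of Equations (3) and (4), the sketch below evaluates both the exact expression and the upper bound on a complete bipartite graph, one of the classes for which the bound is reported to be tight. The helper names are hypothetical, and the pseudo-inverse is obtained from the identity $Q^{\dagger} = (Q + J/N)^{-1} - J/N$ (with $J$ the all-one matrix), which is a convenience chosen here for connected graphs and not necessarily how the authors computed it.

```python
import numpy as np


def laplacian_pinv(A):
    """Pseudo-inverse of the Laplacian of a connected graph."""
    N = A.shape[0]
    Q = np.diag(A.sum(axis=1)) - A
    J = np.ones((N, N)) / N
    return np.linalg.inv(Q + J) - J


def kemeny_exact(A):
    """Equation (3): K = zeta^T d - d^T Q^+ d / (2L)."""
    d = A.sum(axis=1)
    Qp = laplacian_pinv(A)
    return np.diag(Qp) @ d - d @ Qp @ d / d.sum()


def kemeny_upper(A):
    """Equation (4): K_U = zeta^T d - H / (D mu_1)."""
    d = A.sum(axis=1)
    Qp = laplacian_pinv(A)
    mu1 = np.linalg.eigvalsh(np.diag(d) - A)[-1]   # largest Laplacian eigenvalue
    D = d.mean()                                   # average degree
    H = np.mean((d - D) ** 2)                      # heterogeneity index
    return np.diag(Qp) @ d - H / (D * mu1)


def complete_bipartite(n1, n2):
    A = np.zeros((n1 + n2, n1 + n2))
    A[:n1, n1:] = 1.0
    A[n1:, :n1] = 1.0
    return A


A = complete_bipartite(2, 3)
K, K_U = kemeny_exact(A), kemeny_upper(A)
print(K, K_U)
```

For $K_{2,3}$ the two values agree, consistent with the tightness result quoted from [17].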
The aim of this paper is four-fold. First, we will consider a broad family of graphs, which contains complete bipartite and (generalised) windmill graphs as special cases, and show analytically that for these graphs, the bound in Equation (4) is tight. Graphs in this family have in common that they are bimodal and have a diameter of two. However, we will also show that these conditions are not sufficient to ensure that Equation (4) is tight. Next, we compare the complexity of computing the upper bound in Equation (4) with that of the exact expression for Kemeny’s constant, given by Equation (3). In [17], we have already compared the exact value of $K(P)$ with the upper bound for some real-world networks. However, the considered networks were of rather moderate size ($N \leq 754$). Here, we will assess the performance of $K_U(P)$ on real-world networks of sizes up to around 365 K nodes and 1.72 M edges.
Finally, in addition to Equation (4), we also assess the performance of an upper bound suggested by Devriendt et al. [19], based on the so-called resistance radius of a graph:
$$K(P) \leq 2L\sigma^2 \equiv K^*, \tag{5}$$
where the resistance radius $\sigma^2$ is defined as
$$\sigma^2 = \frac{1}{2}\left(u^T \Omega^{-1} u\right)^{-1},$$
with $\Omega$ denoting the resistance matrix and $u$ the all-one vector. The upper bound in Equation (5) is tight for vertex-transitive graphs. Here, we remark that vertex-transitive graphs are rather exceptional and are typically highly symmetric; examples of vertex-transitive graphs are Cayley graphs and the Petersen graph [20]. We will show in this paper that the bound $K^*$ is not a good estimate for the Kemeny constant for the classes of graphs that are considered in this paper and that $K_U$ is in general a much better estimate.

2. A Family of Biregular Graphs with Diameter 2

2.1. Construction

The aim is to construct a family of graphs that contains the complete bipartite and (generalised) windmill graphs as special cases. The construction is commonly known as the combination (join) of two regular graphs, denoted $G_1 \vee G_2$. We start the construction by considering a $k_1$-regular graph $G_1$ on $N_1$ nodes and a $k_2$-regular graph $G_2$ on $N_2$ nodes. We assume $k_1 \geq 0$ and also $k_2 \geq 0$. Finally, we connect every node in $G_1$ to every node in $G_2$ to obtain the graph $G$. The nodes in $G$ that are also in $G_1$ have degree $k_1 + N_2$, while the nodes in $G_2$ have degree $k_2 + N_1$. This construction yields a graph $G = G_1 \vee G_2$ that is a so-called biregular graph, in which all nodes of $G_1$ have the same degree and the same holds for all nodes of $G_2$; see also [21]. Only if $k_1 + N_2 = k_2 + N_1$ is the graph $G$ regular. By construction, $G$ has diameter 2.
The choice of $k_1 = 0$ and $k_2 = 0$ leads to the complete bipartite graph $K_{N_1,N_2}$. If we take $\eta$ isolated copies of the complete graph $K_k$ as $G_1$ and an isolated node for $G_2$, then $G$ is the windmill graph $W(\eta, k)$. If, instead, we let $G_2$ be a complete graph $K_l$, then $G$ is a generalised windmill graph of Type I, $W'(\eta, k, l)$, whereas if we let $G_2$ consist of $l$ isolated nodes, $G$ is a generalised windmill graph of Type II, $W''(\eta, k, l)$.
Figure 2 shows an example of a graph that belongs to the suggested family of graphs. Here, $G_1$, on the left side of the figure, is a random regular graph with $k_1 = 3$ on $N_1 = 10$ nodes, while $G_2$ is a graph on $N_2 = 8$ nodes, where each node has degree $k_2 = 5$. For the graph $G$, the nodes in $G_1$ have degree 11, while the nodes in $G_2$ have degree 15.
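The construction can be sketched in a few lines. The example below is an assumption-laden stand-in for Figure 2: instead of random regular graphs it uses circulant graphs with the same parameters (3-regular on 10 nodes and 5-regular on 8 nodes), and the function names are hypothetical.

```python
import numpy as np


def circulant_regular(n, offsets):
    """Adjacency matrix of a circulant graph: node i is linked to i + s (mod n)."""
    A = np.zeros((n, n))
    for i in range(n):
        for s in offsets:
            j = (i + s) % n
            A[i, j] = A[j, i] = 1.0
    return A


def join(A1, A2):
    """Link every node of the first graph to every node of the second graph."""
    n1, n2 = A1.shape[0], A2.shape[0]
    return np.block([[A1, np.ones((n1, n2))],
                     [np.ones((n2, n1)), A2]])


A1 = circulant_regular(10, [1, 5])      # 3-regular on N1 = 10 nodes
A2 = circulant_regular(8, [1, 2, 4])    # 5-regular on N2 = 8 nodes
A = join(A1, A2)
deg = A.sum(axis=1)

# Diameter 2: every pair of nodes is adjacent or shares a neighbour,
# and the graph is not complete.
n = A.shape[0]
within_two = (A + A @ A) > 0
np.fill_diagonal(within_two, True)
is_complete = bool(((A + np.eye(n)) > 0).all())
has_diameter_two = bool(within_two.all()) and not is_complete
print(sorted(set(deg)), has_diameter_two)
```

The degrees come out as $k_1 + N_2 = 11$ on the first block and $k_2 + N_1 = 15$ on the second, matching the values quoted for Figure 2.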

2.2. Tightness of the Upper Bound $K_U(G)$

We will now show that, for the family of graphs proposed in the previous subsection, the upper bound in Equation (4) for Kemeny’s constant is tight.
Theorem 1.
Consider two regular graphs $G_1$ and $G_2$, where all vertices in $G_1$ have degree $d_1$ and all vertices in $G_2$ have degree $d_2$. If we connect each of the vertices in $G_2$ with all nodes of $G_1$, then Kemeny’s constant $K(P)$ for the resulting graph $G$ is given by $K(P) = \zeta^T d - \frac{H(G)}{D \mu_1}$; that is, the upper bound in Equation (4) is tight.
Proof. 
First, we give expressions for the average degree $D$ and the heterogeneity index $H$, which appear in the upper bound in Equation (4). Denoting the degrees in $G$ of the nodes belonging to $G_1$ and $G_2$ as $D_1$ and $D_2$, respectively, we obtain
$$D_1 = D(G_1) + N_2 = d_1 + N_2 \tag{7}$$
and
$$D_2 = D(G_2) + N_1 = d_2 + N_1. \tag{8}$$
The average degree of $G$, $D(G)$, which we abbreviate for notational convenience to $D$, is given by
$$D = \frac{D_1 N_1 + D_2 N_2}{N_1 + N_2}. \tag{9}$$
The heterogeneity index $H(G)$, a metric which quantifies the variability of the degree distribution (see [16]), is defined as follows:
$$H(G) = \frac{1}{N} \sum_{i=1}^{N} (D_i - D)^2, \tag{10}$$
where $D_i$ denotes the degree of node $i$ in graph $G$. Using the expressions for the degrees $D_1$ and $D_2$ found in (7) and (8) and expression (9) for $D$, we obtain
$$H(G) = \frac{1}{N_1 + N_2}\left[\sum_{i=1}^{N_1} (D_1 - D)^2 + \sum_{i=N_1+1}^{N_1+N_2} (D_2 - D)^2\right] = \frac{1}{N_1 + N_2}\left[N_1 (D_1 - D)^2 + N_2 (D_2 - D)^2\right] = \frac{N_1 N_2 (D_1 - D_2)^2}{(N_1 + N_2)^2}. \tag{11}$$
We will now prove the statement by first writing down the Laplacian matrix $Q$ of the graph $G$, which has the following special structure:
$$Q = \begin{pmatrix} A_1 & -J_{N_1 \times N_2} \\ -J_{N_2 \times N_1} & A_2 \end{pmatrix}, \tag{12}$$
where $J_{N_2 \times N_1}$ is an all-one $N_2 \times N_1$ matrix, and the square matrices $A_1$ and $A_2$ are defined as
$$A_1 = Q_{G_1} + N_2\, I_{[N_1,N_1]}, \qquad A_2 = Q_{G_2} + N_1\, I_{[N_2,N_2]}, \tag{13}$$
where $Q_{G_1}$ ($Q_{G_2}$) is the Laplacian of graph $G_1$ ($G_2$), and $I_{[N_1,N_1]}$, $I_{[N_2,N_2]}$ denote the identity matrices of size $N_1 \times N_1$ and $N_2 \times N_2$, respectively. The decomposition of $Q$ into four blocks can be understood by realising that the upper right-hand block, $-J_{N_1 \times N_2}$, represents the $N_2$ links that exist between each vertex of $G_1$ and all the vertices of $G_2$. Since $Q$ is a Laplacian matrix, all rows must sum to zero, which is achieved by adding $N_2$ to each of the diagonal entries of the $N_1 \times N_1$ block in the upper left-hand corner; that is, the block $A_1$ should be as defined above. Analogously, we find that the lower left-hand and right-hand blocks should be equal to $-J_{N_2 \times N_1}$ and $A_2$, respectively.
Two eigenvectors, $v_1$ and $v_N$, can be found by inspection: $v_N = (1, \ldots, 1)^T$, which corresponds to the eigenvalue $\mu_N = 0$, and $v_1 = (N_2, \ldots, N_2, -N_1, \ldots, -N_1)^T$, which has $N_1$ entries equal to $N_2$ and $N_2$ entries equal to $-N_1$ and corresponds to $\mu_1 = N_1 + N_2$.
Because the largest Laplacian eigenvalue is upper-bounded by $N$, the number of nodes in a graph (see [22]), we directly obtain that $\mu_1$ is the largest eigenvalue of $Q$. Combining this with Equations (9)–(11), we obtain
$$\frac{H(G)}{D \mu_1} = \frac{N_1 N_2 (D_1 - D_2)^2}{(N_1 + N_2)^2 (D_1 N_1 + D_2 N_2)}. \tag{14}$$
Since eigenvectors corresponding to different eigenvalues are orthogonal, and those corresponding to the same eigenvalue can be chosen to be orthogonal, due to the symmetry of $Q$, all eigenvectors $v = (x_1, x_2, \ldots, x_{N_1+N_2})^T$ other than $v_1$ and $v_N$ are subject to
$$\begin{aligned} x_1 + x_2 + \cdots + x_{N_1+N_2} &= 0, \\ N_2 (x_1 + \cdots + x_{N_1}) - N_1 (x_{N_1+1} + \cdots + x_{N_1+N_2}) &= 0, \end{aligned}$$
which leads to
$$\begin{aligned} x_1 + x_2 + \cdots + x_{N_1} &= 0, \\ x_{N_1+1} + \cdots + x_{N_1+N_2} &= 0. \end{aligned} \tag{15}$$
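We next turn to the expression $d^T Q^{\dagger} d$, where $Q^{\dagger} = \sum_{i=1}^{N_1+N_2-1} \frac{1}{\mu_i} \hat{v}_i \hat{v}_i^T$, with $\mu_i$ the $i$-th eigenvalue of $Q$ and $\hat{v}_i$ the corresponding normalised eigenvector. The conditions (15) for the eigenvectors imply that all terms in the expression $d^T Q^{\dagger} d$ vanish except the term associated with $v_1$. More precisely, we find that
$$d^T Q^{\dagger} d = \sum_{i=2}^{N_1+N_2-1} \frac{(d^T \hat{v}_i)^2}{\mu_i} + \frac{(\hat{v}_1^T d)^2}{\mu_1} = \frac{N_1 N_2 (D_1 - D_2)^2}{(N_1 + N_2)^2}, \tag{16}$$
where $d = (D_1, \ldots, D_1, D_2, \ldots, D_2)^T$, so the first $N_1$ components are all equal to $D_1$ and the remaining $N_2$ components are equal to $D_2$, which implies $d^T \hat{v}_i = 0$ for $i = 2, \ldots, N_1 + N_2 - 1$ by Equation (15). The remaining term follows from $v_1^T d = N_1 N_2 (D_1 - D_2)$, $\|v_1\|^2 = N_1 N_2 (N_1 + N_2)$ and $\mu_1 = N_1 + N_2$. Finally, because $D = \frac{2L}{N}$ gives
$$2L = D_1 N_1 + D_2 N_2, \tag{17}$$
it follows that $\frac{d^T Q^{\dagger} d}{2L}$ equals the right-hand side of Equation (14), which proves the theorem. □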
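The individual steps of the proof can be checked numerically. The following sketch does so for one member of the family, built from two circulant regular graphs; the graph choice, the helper names, and the use of $Q^{\dagger} = (Q + J/N)^{-1} - J/N$ for the pseudo-inverse are assumptions of this illustration rather than part of the proof.

```python
import numpy as np


def circulant_regular(n, offsets):
    """Adjacency matrix of a circulant graph: node i is linked to i + s (mod n)."""
    A = np.zeros((n, n))
    for i in range(n):
        for s in offsets:
            j = (i + s) % n
            A[i, j] = A[j, i] = 1.0
    return A


N1, N2 = 10, 8
A1 = circulant_regular(N1, [1, 5])       # d1 = 3
A2 = circulant_regular(N2, [1, 2, 4])    # d2 = 5
A = np.block([[A1, np.ones((N1, N2))],
              [np.ones((N2, N1)), A2]])

N = N1 + N2
d = A.sum(axis=1)
Q = np.diag(d) - A                        # Laplacian with the block structure (12)
J = np.ones((N, N)) / N
Q_pinv = np.linalg.inv(Q + J) - J         # pseudo-inverse (connected graph)

# Eigenvector found by inspection, with eigenvalue mu_1 = N1 + N2.
v1 = np.concatenate([N2 * np.ones(N1), -N1 * np.ones(N2)])
mu1 = np.linalg.eigvalsh(Q)[-1]

# Equation (16): closed form for d^T Q^+ d.
D1, D2 = d[0], d[-1]
lhs = d @ Q_pinv @ d
rhs = N1 * N2 * (D1 - D2) ** 2 / (N1 + N2) ** 2

# Exact value (3) against the upper bound (4).
zeta = np.diag(Q_pinv)
D = d.mean()
H = np.mean((d - D) ** 2)
K = zeta @ d - lhs / d.sum()
K_U = zeta @ d - H / (D * mu1)
print(mu1, lhs, rhs, K, K_U)
```

On this example the largest eigenvalue equals $N_1 + N_2 = 18$, both sides of Equation (16) agree, and $K$ and $K_U$ coincide up to rounding.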

2.3. Some Examples

As a first example, we consider the graph depicted in Figure 2, where $N_1 = 10$, $d_1 = 3$, $N_2 = 8$ and $d_2 = 5$. Using Python (https://www.python.org/) code, we have evaluated both $K$ and $K_U$. For this network, we obtain $K = 16.33864$, which is equal to $K_U$ to numerical precision, as it should be according to Theorem 1. On the other hand, the upper bound $K^*$ based upon the resistance radius gives $K^* = 19.29615$, which is reasonably close to the actual value.
Next, we consider a graph where $N_1 = 50$, $d_1 = 4$, $N_2 = 10$ and $d_2 = 6$; see Figure 3. Here, we get $K = 59.19805$, and again, $K$ and $K_U$ are numerically extremely close. On the other hand, for this graph, the bound in Equation (5) is roughly an order of magnitude larger than the actual value: $K^* = 533.03153$.
As a final example, we consider the case where $N_1 = 100$, $d_1 = 10$, $N_2 = 20$ and $d_2 = 8$; see Figure 4. Now, $K = 119.24078$, and again $K$ and $K_U$ are equal to numerical precision. Again, the bound based on the resistance radius is much higher: $K^* = 1652.63986$.
We end this subsection by noting that the choice of examples in this subsection was rather arbitrary. We also ran our Python script on several other graphs with sizes up to 1500 nodes. Each time, it yielded the same result: $K$ and $K_U$ have values that are numerically very close (see also [4,17] for more numerical comparisons), while the upper bound $K^*$ is considerably larger than Kemeny’s constant, often by an order of magnitude or more.

3. Graphs with Diameter 2 for Which $K_U(P)$ Is Not Tight

3.1. Bimodal Graphs with Diameter 2 for Which Equation (4) Is Not Tight

The numerical results of the examples on biregular graphs with diameter 2 from the previous section showed that in all these cases, the approximation of $K$ by $K_U$ is actually exact. In other words, the bound $K_U$ is tight in these cases. Therefore, one might be tempted to believe that Equation (4) is tight for all biregular graphs with diameter 2. In this section, we show that this is not the case by giving some counterexamples.
The simplest counterexample we could find consists of the cycle graph C 5 with an additional link; see Figure 5.
For this graph, we get $K = 3.71212$, $K_U = 3.72380$ and $K^* = 4.21488$. There is a simple procedure to check whether or not a biregular graph $G$ with diameter 2 belongs to the graph family constructed in the previous section. First, partition the nodes into two sets $S_1$ and $S_2$, where all the nodes in $S_1$ have degree $D_1$, while all the nodes in $S_2$ have degree $D_2$. Next, verify whether the number of links between the two sets is $|S_1 \times S_2| = |S_1|\,|S_2|$, that is, whether all nodes of $S_1$ are linked to all nodes of $S_2$. If this is not the case, then $G \neq S_1 \vee S_2$. Otherwise, remove all $|S_1 \times S_2|$ links between the sets $S_1$ and $S_2$. If the remaining two graphs are not both regular, then the original graph $G$ does not belong to the family constructed in the previous section, that is, $G \neq S_1 \vee S_2$.
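The membership check described above can be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes a simple undirected graph given as a 0/1 adjacency matrix and returns `False` whenever the graph does not have exactly two distinct degrees, since the procedure is stated for biregular graphs only.

```python
import numpy as np


def is_join_of_two_regular_graphs(A):
    """Degree-partition test for membership of the family of Section 2."""
    deg = A.sum(axis=1)
    values = np.unique(deg)
    if len(values) != 2:
        return False                       # not biregular
    S1 = np.flatnonzero(deg == values[0])
    S2 = np.flatnonzero(deg == values[1])
    # Step 1: every node of S1 must be linked to every node of S2.
    if A[np.ix_(S1, S2)].sum() != len(S1) * len(S2):
        return False
    # Step 2: after removing the cross links, both parts must be regular.
    for S in (S1, S2):
        inner_deg = A[np.ix_(S, S)].sum(axis=1)
        if len(np.unique(inner_deg)) != 1:
            return False
    return True


def from_edges(n, edges):
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A


# Cycle C5 with one additional link (the first counterexample).
house = from_edges(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)])
# Complete bipartite graph K_{2,3}.
k23 = from_edges(5, [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)])
# Windmill graph with two triangles sharing node 0.
windmill = from_edges(5, [(0, 1), (0, 2), (1, 2), (0, 3), (0, 4), (3, 4)])

print(is_join_of_two_regular_graphs(house),
      is_join_of_two_regular_graphs(k23),
      is_join_of_two_regular_graphs(windmill))
```

The cycle with an extra link fails already at the first step, because only four of the six possible links between its degree-3 and degree-2 nodes are present.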
The second counterexample is constructed by adding a link to the Petersen graph; see Figure 6.
For this graph, we get $K = 9.76597$, $K_U = 9.77389$ and $K^* = 10.26963$.

3.2. Non-Biregular Graphs with Diameter 2

We now give an example of a non-biregular graph with diameter 2, for which the upper bound in Equation (4) also does not equal Kemeny’s constant. We construct the graph by first taking a complete graph $K_N$ on $N$ nodes. Next, we add one node and connect it to one node in $K_N$, so that the resulting graph has diameter 2. The resulting graph has $N - 1$ nodes with degree $N - 1$, one node with degree $N$, and one node with degree 1. Figure 7 shows an example with $N = 10$.
Applying Equation (3), we get $K = 9.26522$, while the upper bound of Equation (4) gives $K_U = 9.83439$ and the resistance-radius bound gives $K^* = 31.28000$.
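The three quantities for this example can be reproduced with a short dense-matrix computation. The sketch below is an illustration, not the authors' script: the function name is hypothetical, the pseudo-inverse is obtained via $Q^{\dagger} = (Q + J/N)^{-1} - J/N$, the resistance matrix is built as $\Omega_{ij} = Q^{\dagger}_{ii} + Q^{\dagger}_{jj} - 2Q^{\dagger}_{ij}$, and $K^*$ is evaluated as $2L\sigma^2$ with $2L$ the sum of the degrees, which is the normalisation that matches the values quoted in the text.

```python
import numpy as np


def kemeny_and_bounds(A):
    """Return (K, K_U, K_star) for a connected graph with adjacency matrix A."""
    N = A.shape[0]
    d = A.sum(axis=1)
    two_L = d.sum()
    Q = np.diag(d) - A
    J = np.ones((N, N)) / N
    Qp = np.linalg.inv(Q + J) - J                  # Laplacian pseudo-inverse
    zeta = np.diag(Qp)

    K = zeta @ d - d @ Qp @ d / two_L              # Equation (3)

    D = two_L / N                                  # average degree
    H = np.mean((d - D) ** 2)                      # heterogeneity index
    mu1 = np.linalg.eigvalsh(Q)[-1]                # largest Laplacian eigenvalue
    K_U = zeta @ d - H / (D * mu1)                 # Equation (4)

    Omega = zeta[:, None] + zeta[None, :] - 2.0 * Qp   # effective resistances
    u = np.ones(N)
    sigma2 = 0.5 / (u @ np.linalg.solve(Omega, u))     # resistance radius
    K_star = two_L * sigma2
    return K, K_U, K_star


# Complete graph K_10 plus one pendant node attached to node 0.
N = 10
A = np.zeros((N + 1, N + 1))
A[:N, :N] = 1.0 - np.eye(N)
A[0, N] = A[N, 0] = 1.0

K, K_U, K_star = kemeny_and_bounds(A)
print(round(K, 5), round(K_U, 5), round(K_star, 5))
```

The same function can be applied to any small connected graph; for large graphs the dense inverse is exactly the cost that the complexity discussion later in the paper seeks to avoid.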

4. Regular Graphs

In this section, we consider regular graphs on $N$ nodes with degree $r$. In this case, the relation between Kemeny’s constant and the effective graph resistance was shown [23] to be
$$K(P) = \frac{r}{N} R_G, \tag{18}$$
where $R_G$ denotes the effective graph resistance. Next, we show that for these graphs, the upper bound in Equation (4) is also tight. For this, we will use the following expression for the effective graph resistance (see [4]):
$$R_G = N \sum_{i=1}^{N} Q^{\dagger}_{ii}.$$
For $r$-regular graphs, $H = 0$, and therefore Equation (4) gives
$$K_U = \zeta^T d = r\, \zeta^T u = r \sum_{i=1}^{N} Q^{\dagger}_{ii} = \frac{r}{N} R_G,$$
hence $K_U = K$ according to Equation (18).
As an example, we consider a random 3-regular graph on 100 nodes (see Figure 8), which has diameter 10. We get numerically $K = 195.30524$, which is indeed equal to $K_U$ up to a numerical precision of $10^{-17}$. Applying Equation (5) gives $K^* = 218.32805$. In this case, the upper bound in Equation (5) is not tight because the graph is not vertex-transitive.
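For a regular graph that is also vertex-transitive, all of these quantities should coincide. The sketch below checks this on the Petersen graph; the choice of graph, the node labelling, and the variable names are assumptions of this illustration, and $K^*$ is again evaluated as $2L\sigma^2$ with $2L$ the sum of the degrees.

```python
import numpy as np

# Petersen graph: outer 5-cycle (0..4), spokes i -- i+5, inner pentagram (5..9).
N, r = 10, 3
A = np.zeros((N, N))
for i in range(5):
    for a, b in [(i, (i + 1) % 5), (i, i + 5), (5 + i, 5 + (i + 2) % 5)]:
        A[a, b] = A[b, a] = 1.0

d = A.sum(axis=1)
Q = np.diag(d) - A
J = np.ones((N, N)) / N
Qp = np.linalg.inv(Q + J) - J            # Laplacian pseudo-inverse
zeta = np.diag(Qp)

R_G = N * np.trace(Qp)                   # effective graph resistance
K = zeta @ d - d @ Qp @ d / d.sum()      # Equation (3)
K_U = zeta @ d                           # Equation (4) with H = 0
K_from_R = r / N * R_G                   # Equation (18)

# Resistance-radius bound, tight here because the graph is vertex-transitive.
Omega = zeta[:, None] + zeta[None, :] - 2.0 * Qp
u = np.ones(N)
K_star = d.sum() * 0.5 / (u @ np.linalg.solve(Omega, u))

print(K, K_U, K_from_R, K_star)
```

All four numbers agree on this graph, whereas for the random 3-regular graph discussed above only the first three would be expected to coincide.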

5. Complexity for the Computation of $K_U(P)$

The time complexity of $K(P)$, computed via Equation (3), is dominated by the Laplacian pseudo-inverse, which is as expensive as performing a dense matrix multiplication and takes $O(N^3)$ in practice with standard tools. On the other hand, the time complexity of $K_U(P)$ mainly depends on two operations: computing the largest Laplacian eigenvalue and performing the dot product of the degree vector and the vector of diagonal elements of the Laplacian pseudo-inverse. Interestingly, to compute $K_U(P)$, we can avoid the full pseudo-inversion, as only the diagonal elements of the Laplacian pseudo-inverse are required. Algorithms that approximate the diagonal (or the trace) of matrices often use iterative methods, sparse direct methods [24], Monte Carlo [25] or deterministic probing techniques [26]. Although faster than computing the full inversion, these approaches are still time-consuming in practice for large graphs [27]. For that reason, we employ a recently proposed algorithm that approximates the diagonal entries of the Laplacian pseudo-inverse using combinatorial connections [27]. This algorithm exploits the relation between effective resistance and the pseudo-inverse Laplacian. In order to calculate the diagonal elements of $Q^{\dagger}$, it is sufficient to compute the electrical farness $f^{\mathrm{el}}(u)$ of each node $u$ in the set of all nodes $V$; the farness is defined by
$$f^{\mathrm{el}}(u) = \sum_{v \in V \setminus \{u\}} R(u, v) = N\, Q^{\dagger}_{uu} + \operatorname{Tr}(Q^{\dagger}).$$
Here, $R(u, v)$ is the effective resistance between nodes $u$ and $v$, which is the potential difference between $u$ and $v$ when a unit current is injected in graph $G$ at node $u$ and extracted at node $v$ [28]. Rather than calculating $R(u, v)$ for each pair of nodes, we sample a set of uniform spanning trees. This approach provides a probabilistic absolute approximation guarantee.
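The identity between electrical farness and the diagonal of the pseudo-inverse can be verified exactly on a small graph. The sketch below does this with dense matrices and then inverts the identity to recover the diagonal from the farness values alone; it does not implement the spanning-tree sampling of [27], and the example graph and variable names are assumptions of this illustration.

```python
import numpy as np

# Illustrative graph: 5-cycle 0-1-2-3-4-0 plus the chord 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
N = 5
A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

Q = np.diag(A.sum(axis=1)) - A
J = np.ones((N, N)) / N
Qp = np.linalg.inv(Q + J) - J                 # Laplacian pseudo-inverse
zeta = np.diag(Qp)

# Effective resistances R(u, v) = Q+_uu + Q+_vv - 2 Q+_uv.
R = zeta[:, None] + zeta[None, :] - 2.0 * Qp

farness = R.sum(axis=1)                       # f_el(u); R(u, u) = 0
identity_rhs = N * zeta + np.trace(Qp)        # N Q+_uu + Tr(Q+)

# Going the other way: summing the identity over u gives Tr(Q+) = sum_u f_el(u) / (2N),
# so the diagonal follows from the farness values alone.
trace_from_farness = farness.sum() / (2 * N)
zeta_from_farness = (farness - trace_from_farness) / N
print(farness, zeta_from_farness)
```

This is why an algorithm that estimates the farness of every node also yields the diagonal entries needed for $K_U(P)$.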
The algorithm’s time complexity is summarised in the following proposition:
Proposition 1 ([27]). Let $G = (V, E)$ be an undirected and weighted graph with $N$ nodes and $L$ edges. The sampling algorithm, briefly described above, gives an approximation of the diagonal elements of $Q^{\dagger}$ with absolute error $\pm\epsilon$ with probability $1 - \delta$ in an expected time $O(L \cdot \mathrm{ecc}^3(u) \cdot \epsilon^{-2} \cdot \log(L/\delta))$, where $\mathrm{ecc}(u)$ is the length of the longest shortest path (eccentricity) starting in a selected node $u$. For small-world graphs and $\delta = 1/N$ (for high probability), this yields a time complexity of $O(L \log^4 N \cdot \epsilon^{-2})$.
For networks that have small-world characteristics, a common feature of many real-world networks [29], the above algorithm obtains a $\pm\epsilon$-approximation with high probability, in a time that is linear in $L$ up to polylogarithmic terms and quadratic in $1/\epsilon$. Furthermore, computing the largest Laplacian eigenvalue does not change the overall complexity bound. More precisely, this step often takes $O(L)$ time for sparse matrices using standard iterative methods, such as the Lanczos algorithm [30]. In general, the actual running time for this step highly depends on the desired accuracy and the eigenvalue distribution of the involved matrix. Overall, the complexity bound for computing $K_U(P)$ for small-world graphs using the above techniques is linear in the number of links $L$ (up to a polylogarithmic factor).
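To illustrate why the eigenvalue step is cheap on sparse graphs, the sketch below estimates the largest Laplacian eigenvalue with a matrix-free power iteration in which each step costs $O(L)$. Power iteration is used here as a simpler stand-in for the Lanczos method mentioned above, not as a description of the SLEPc-based setup used in the experiments; the function name, the iteration count, and the random starting vector are assumptions of this sketch.

```python
import numpy as np


def largest_laplacian_eigenvalue(edges, N, iters=500, seed=0):
    """Power iteration on the Laplacian using only the edge list."""
    edges = np.asarray(edges)
    src, dst = edges[:, 0], edges[:, 1]
    deg = np.bincount(np.concatenate([src, dst]), minlength=N).astype(float)

    rng = np.random.default_rng(seed)
    x = rng.standard_normal(N)
    x /= np.linalg.norm(x)
    mu = 0.0
    for _ in range(iters):
        # y = Q x, computed edge by edge: (Q x)_i = d_i x_i - sum_{j ~ i} x_j.
        y = deg * x
        np.subtract.at(y, src, x[dst])
        np.subtract.at(y, dst, x[src])
        mu = x @ y                      # Rayleigh quotient of the current iterate
        x = y / np.linalg.norm(y)
    return mu


# Illustrative graph: 5-cycle 0-1-2-3-4-0 plus the chord 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
mu1 = largest_laplacian_eigenvalue(edges, N=5)
print(mu1)
```

The number of iterations needed depends on the gap between the two largest eigenvalues, which mirrors the remark above that the running time of this step depends on the eigenvalue distribution.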

6. Analysis of Some Large Real-World Networks

In this section, we analyse the performance of our proposed bound, $K_U(P)$, compared to Kemeny’s constant, $K(P)$, in terms of accuracy and running time. For $K_U(P)$, our implementation uses the NetworKit [31] graph library to compute the diagonal elements of $Q^{\dagger}$ (via the algorithm of Angriman et al. [27]) and the SLEPc library (https://slepc.upv.es/, accessed on 2 December 2024) to compute the largest Laplacian eigenvalue. $K(P)$, in turn, is computed via Equation (3), and our implementation uses the Eigen library (http://eigen.tuxfamily.org, accessed on 2 December 2024) to compute the entire pseudo-inverse, $Q^{\dagger}$. We do not include any comparisons against $K^*$ since, computationally, it is as expensive as the exact computation of Kemeny’s constant. Our test machine is a shared-memory server with 2× 18-core Intel Xeon 6154 CPUs and a total of 1.5 TB RAM. To ensure reproducibility, experiments are managed by SimexPal [32]. In Table 1, we list the real-world graphs that are used in our experiments, downloaded from the SNAP [33] and NR [34] public repositories. In this context, we consider as medium graphs those whose vertex count is below 57 K. The largest graph has around 365 K nodes and 1.72 M edges.
For the medium graphs of Table 1, we are able to compare our bound $K_U(P)$ with Kemeny’s constant $K(P)$, and the results are illustrated in Figure 9. $K_U(P)$ is computed with different error bounds ($\epsilon$) for the approximation of the diagonal elements (via the algorithm of Angriman et al. [27]); these correspond to the respective numbers next to the names in Figure 9. Regarding the accuracy, we observe that our approach for computing $K_U(P)$ is overall highly accurate for all values of $\epsilon$ and all graphs. More precisely, on average (computed via the geometric mean) over the medium-size graphs, our approach is 0.33%, 0.27%, 0.25% and 1.26% away from the exact Kemeny’s constant for $\epsilon = 0.1$, $0.3$, $0.5$ and $0.9$, respectively. Meanwhile, the running time is on average 2×, 18×, 48× and 141× faster than the exact computation for each $\epsilon$, respectively. Figure 9a shows that on individual graphs, a larger $\epsilon$ value ($\epsilon = 0.9$) may result in a slightly less accurate bound, up to 10% away from the exact value (arx). Moreover, in Figure 9b, we observe that for the inf graph, computing the exact Kemeny’s constant is much faster than computing $K_U(P)$ via the algorithm of [27]. The primary reason for that is the small size of the graph (6 K edges), for which an exact computation of the entire pseudo-inverse is still fast enough. A second reason for the slow performance of the algorithm of Angriman et al. could be the high diameter of the graph in question ($\gg \log N$).
In Table 2, we illustrate our results for the largest graphs of Table 1. For this experiment, we set $\epsilon = 0.5$ for the approximation of the diagonal elements of $Q^{\dagger}$, as this offers the best trade-off between accuracy and speed, according to the previous experiment. Unfortunately, we were not able to compute exact values for Kemeny’s constant for these graphs, as all involved runs timed out at 18,000 s. This is due to the prohibitive time and space complexity of the pseudo-inversion operation required by $K(P)$.

7. Conclusions

We have investigated Kemeny’s constant $K(P)$ for a number of networks using the exact expression from [4] and compared this expression with two upper bounds: one, $K^*(P)$, was derived in Ref. [19] and is known to be tight for vertex-transitive graphs; the other, $K_U(P)$, was derived in [4] and is written in terms of the degrees of the nodes, the diagonal elements of the pseudo-inverse Laplacian, the largest eigenvalue of the Laplacian matrix and the heterogeneity of the degrees of the nodes.
We have numerically demonstrated that the bound $K_U(P)$ is generally a much better approximation for $K(P)$ than $K^*(P)$ for the networks that we have explored. Moreover, we have proved that for any graph $G$ composed of two regular graphs $G_1$ and $G_2$ with all nodes of the graph $G_1$ connected to each node of $G_2$, the bound $K_U(P)$ is tight. This generalises earlier findings that the bound $K_U(P)$ is tight for (generalised) windmill and complete bipartite graphs.
As an illustration of the advantages of using the expression $K_U(P)$ to estimate the Kemeny constant, we numerically calculated the Kemeny constant for a number of large real-world networks. We find that the calculation of $K_U(P)$ can be performed very efficiently, displaying efficiency gains in the order of a factor 100–1000, for networks up to 57 K nodes. The upper bound can still be obtained in a reasonable time for networks up to 365 K nodes.

Author Contributions

Conceptualisation, R.E.K. and J.L.A.D.; methodology, R.E.K. and J.L.A.D.; software, R.E.K.; formal analysis, R.E.K. and J.L.A.D.; investigation, R.E.K. and J.L.A.D.; writing—original draft preparation, R.E.K. and J.L.A.D.; writing—review and editing, R.E.K. and J.L.A.D.; visualisation, R.E.K. All authors have read and agreed to the published version of the manuscript.

Funding

J.L.A. Dubbeldam was partially supported by the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 955708 and the Dutch National Foundation projects OCENW.KLEIN.277.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kemeny, J.G.; Snell, J.L. Finite Markov Chains; D. Van Nostrand: Princeton, NJ, USA, 1960. [Google Scholar]
  2. Lovász, L. Random Walks on Graphs: A Survey. In Paul Erdös is Eighty; Bolyai Society, Mathematical Studies: Keszthely, Hungary, 1993; Volume 2, pp. 1–46. [Google Scholar]
  3. Palacios, J.L.; Renom, J.M. Bounds for the Kirchhoff index of regular graphs via the spectra of their random walks. Int. J. Quantum Chem. 2010, 110, 1637–1641. [Google Scholar] [CrossRef]
  4. Wang, X.; Dubbeldam, J.L.A.; Van Mieghem, P. Kemeny’s constant and the effective graph resistance. Linear Algebra Its Appl. 2017, 535, 231–244. [Google Scholar]
  5. Noh, J.D.; Rieger, H. Random Walks on Complex Networks. Phys. Rev. Lett. 2004, 92, 118701. [Google Scholar] [CrossRef] [PubMed]
  6. Levene, M.; Loizou, G. Kemeny’s Constant and the Random Surfer. Am. Math. Mon. 2002, 109, 741–745. [Google Scholar] [CrossRef]
  7. Hunter, J. Mixing times with applications to perturbed Markov chains. Linear Algebra Its Appl. 2006, 417, 108–123. [Google Scholar] [CrossRef]
  8. Hunter, J.J. The role of Kemeny’s constant in properties of Markov chains. Commun. Stat.-Theory Methods 2014, 43, 1309–1321. [Google Scholar]
  9. Kirkland, S.; Zeng, Z. Kemeny’s Constant and an Analogue of Braess’ Paradox for Trees. Electron. J. Linear Algebra 2016, 31, 444–464. [Google Scholar] [CrossRef]
  10. Thomas, M.; Cover, J.A.T. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
  11. Altafini, D.; Bini, D.A.; Cutini, V.; Meini, B.; Poloni, F. An edge centrality measure based on the Kemeny constant. arXiv 2022, arXiv:2203.06459.
  12. Yilmaz, S.; Dudkina, E.; Bin, M.; Crisostomi, E.; Ferraro, P.; Murray-Smith, R.; Parisini, T.; Stone, L.; Shorten, R. Kemeny-based testing for COVID-19. PLoS ONE 2020, 15, e0242401.
  13. Available online: https://www.niallmadden.ie/ILAS-2022.pdf (accessed on 2 December 2024).
  14. Breen, J.; Crisostomi, E.; Kim, S. Kemeny’s constant for a graph with bridges. Discret. Appl. Math. 2022, 322, 20–35.
  15. Bini, D.A.; Durastante, F.; Kim, S.; Meini, B. On Kemeny’s constant and stochastic complement. Linear Algebra Its Appl. 2024, 703, 137–162.
  16. Bell, F. A note on the irregularity of graphs. Linear Algebra Its Appl. 1992, 161, 45–54.
  17. Kooij, R.E.; Dubbeldam, J.L. Kemeny’s constant for several families of graphs and real-world networks. Discret. Appl. Math. 2020, 285, 96–107.
  18. Kooij, R.E. On generalized windmill graphs. Linear Algebra Its Appl. 2019, 565, 25–46.
  19. Devriendt, K.; Martin-Gutierrez, S.; Lambiotte, R. Variance and Covariance of Distributions on Graphs. SIAM Rev. 2022, 64, 343–359.
  20. Godsil, C.; Royle, G.F. Algebraic Graph Theory; Springer: Berlin/Heidelberg, Germany, 2013.
  21. Scheinerman, E.; Ullman, D. Fractional Graph Theory: A Rational Approach; Wiley and Sons: Hoboken, NJ, USA, 2008.
  22. Van Mieghem, P. Graph Spectra for Complex Networks; Cambridge University Press: Cambridge, UK, 2011.
  23. Palacios, J.L.; Renom, J. Broder and Karlin’s formula for hitting times and the Kirchhoff Index. Int. J. Quantum Chem. 2011, 111, 35–39.
  24. Jacquelin, M.; Lin, L.; Yang, C. PSelInv—A distributed memory parallel algorithm for selected inversion: The non-symmetric case. Parallel Comput. 2018, 74, 84–98.
  25. Hutchinson, M.F. A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. J. Commun. Statist. Simula. 1990, 19, 433–450.
  26. Bekas, C.; Kokiopoulou, E.; Saad, Y. An Estimator for the Diagonal of a Matrix. Appl. Numer. Math. 2007, 57, 1214–1229.
  27. Angriman, E.; Predari, M.; van der Grinten, A.; Meyerhenke, H. Approximation of the Diagonal of a Laplacian’s Pseudoinverse for Complex Network Analysis. In Proceedings of the ESA. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, Pisa, Italy, 7–9 September 2020; LIPIcs. Volume 173, pp. 6:1–6:24.
  28. Bollobás, B. Modern Graph Theory; Springer: Berlin/Heidelberg, Germany, 1998.
  29. Newman, M. Networks, 2nd ed.; Oxford University Press: Oxford, UK, 2018.
  30. Paige, C. Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem. Linear Algebra Its Appl. 1980, 34, 235–258.
  31. Staudt, C.L.; Sazonovs, A.; Meyerhenke, H. NetworKit: A tool suite for large-scale complex network analysis. Netw. Sci. 2016, 4, 508–530.
  32. Angriman, E.; van der Grinten, A.; von Looz, M.; Meyerhenke, H.; Nöllenburg, M.; Predari, M.; Tzovas, C. Guidelines for experimental algorithmics: A case study in network analysis. Algorithms 2019, 12, 127.
  33. Leskovec, J. Stanford Network Analysis Package (SNAP). Available online: https://snap.stanford.edu/ (accessed on 2 December 2024).
  34. Rossi, R.A.; Ahmed, N.K. The Network Data Repository with Interactive Graph Analytics and Visualization. In Proceedings of the AAAI, Austin, TX, USA, 25–30 January 2015.
Figure 1. A windmill graph and generalised windmills of Types I and II.
Figure 2. Graph $G$ on 18 nodes, where $G_1$ is a random 3-regular graph on 10 nodes, and $G_2$ is a 5-regular graph on 8 nodes.
Figure 3. Graph $G$ on 60 nodes, where $G_1$ is a random 4-regular graph on 50 nodes, and $G_2$ is a random 6-regular graph on 10 nodes.
Figure 4. Graph $G$ on 120 nodes, where $G_1$ is a random 10-regular graph on 100 nodes, and $G_2$ is a random 8-regular graph on 20 nodes.
Figure 5. Smallest biregular graph with diameter 2 for which the upper bound of Equation (4) is not tight.
Figure 6. Petersen graph with one additional link.
Figure 7. Non-biregular graph with diameter 2.
Figure 8. Random 3-regular graph on 100 nodes.
Figure 9. Relative quality (a) and speedup (b) results (per graph) for computing $K_U(P)$ for the medium graphs ($n < 57$ K) of Table 1. Results are relative to the exact computation of $K(P)$.
Table 1. Summary of graph instances, providing (in order) network name, type, abbreviation (ID), vertex count $|V|$, and edge count $|E|$.

Graph                   Type            ID    |V|      |E|
inf-power               infrastructure  inf   4 K      6 K
facebook-ego-combined   social          fac   4 K      8.8 K
p2p-Gnutella04          internet        p2p   10 K     39 K
ca-HepPh                collaboration   ca-   11 K     117 K
arxiv-astro-ph          collaboration   arx   17 K     196 K
eat                     words           eat   23 K     297 K
arenas-pgp              infrastructure  are   24 K     10 K
as-caida20071105        internet        as-   26 K     53 K
ia-email-EU             communication   ia-   32 K     54.4 K
loc-brightkite          social          lob   57 K     213 K
soc-Slashdot0902        social          soc   82 K     504 K
flickr                  images          fli   106 K    2.31 M
livemocha               social          liv   104 K    2.19 M
loc-gowalla-edges       social          log   196 K    950 K
web-NotreDame           web             web   325 K    1.09 M
citeseer                citation        cit   365 K    1.72 M
Table 2. Absolute results for $K_U(P)$ on the largest graphs of Table 1. Comparison to $K(P)$ is prohibitive due to the (large) size of the graphs in question.

Graph   K_U(P)      Time (h:min:s)
lob     80,903      48.83 s
soc     96,102      50.87 s
fli     122,185     1 min:38.11 s
liv     120,525     37.07 s
log     271,577     5 min:10.77 s
web     1,009,760   1 h:11 min:19.36 s
cit     508,244     1 h:16 min:11.51 s
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
