1. Introduction
Interactions in nature and society, such as protein–protein interactions [1], image pixel connections [2], social networks [3], or quantum system evolution [4], are often best understood through the lens of large networks. However, analyzing such networks poses significant computational challenges due to the complexity of operations on their graph matrices. Fortunately, these networks are frequently composed of smaller, more manageable structures, such as motifs [5], communities [6], or layers [7]. By utilizing the properties of these substructures, we can infer characteristics of the larger networks formed through specific operations [8,9]. In graph theory, three fundamental graph products are commonly used to construct larger networks from smaller ones: the Cartesian product, the Kronecker product (also known as the direct product), and the strong product. Each of these products defines the vertex set as ordered pairs of the vertices of the constituent graphs and employs distinct rules for forming edges based on the edges of the constituent graphs.
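The three edge rules can be made concrete at the level of adjacency matrices. The following is a minimal sketch, assuming NumPy and the standard matrix identities for these products; the function names and the small path graphs are illustrative choices, not notation from the cited works. A vertex pair (g, h) is mapped to row g * |V(H)| + h, which is the ordering produced by `np.kron`.

```python
import numpy as np

def cartesian_product(A, B):
    """Cartesian product: (g, h) ~ (g', h') if g = g' and h ~ h', or g ~ g' and h = h'."""
    I_A = np.eye(A.shape[0], dtype=A.dtype)
    I_B = np.eye(B.shape[0], dtype=B.dtype)
    return np.kron(A, I_B) + np.kron(I_A, B)

def kronecker_product(A, B):
    """Kronecker (direct) product: (g, h) ~ (g', h') if g ~ g' and h ~ h'."""
    return np.kron(A, B)

def strong_product(A, B):
    """Strong product: union of the Cartesian and Kronecker edge sets."""
    return cartesian_product(A, B) + kronecker_product(A, B)

def edge_count(A):
    """Number of edges of a simple graph given its adjacency matrix."""
    return int(A.sum()) // 2

# Two small factor graphs: the paths on 2 and 3 vertices.
P2 = np.array([[0, 1], [1, 0]])
P3 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])

for name, product in [("Cartesian", cartesian_product),
                      ("Kronecker", kronecker_product),
                      ("strong", strong_product)]:
    print(name, edge_count(product(P2, P3)))
```

Because the Cartesian and Kronecker edge sets are disjoint, the strong product is obtained here by simply adding the two adjacency matrices.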
Graph products play a crucial role in fields such as computer science, mathematics, and engineering, providing both practical applications and theoretical insights. For instance, the Kronecker product effectively models large networks, such as Internet graphs, by approximating them with powers of smaller subgraphs [10]. Recently, graph products have emerged as a significant formalism in network science for characterizing the topologies of multilayer networks [7,11,12]. Furthermore, their spectral properties have demonstrated substantial utility in various domains, including interconnection networks, parallel computing architectures, and diffusion processes [13].
In the past two decades, graph spectra have found significant applications across various fields, particularly in computer science [14,15], including Internet technologies, computer vision, data mining, and multiprocessor systems. Among graph spectra, the spectrum and eigenspaces of the Laplacian matrix have received considerable attention due to their numerous applications. For instance, they have been used to analyze spanning trees, resistance distances, community structures, and various network dynamics such as mechanical relaxation and fluorescence depolarization [16,17]. A fundamental challenge in this area is understanding how the spectral properties of graph products relate to those of their factor graphs. While relationships between the spectra of degree and adjacency matrices have been established for all major graph products, explicit characterizations of the Laplacian spectrum and eigenvectors remain limited. Complete results are available for the Cartesian product [18], but the Laplacian spectrum and eigenvectors of the Kronecker product of graphs are difficult to derive directly from their factors, as explored in [19] for the modeling of multilayer networks. This has motivated research on empirical estimation methods [20,21], as well as the derivation of explicit solutions for very specific cases [22].
This paper analyzes approximation methods for the Laplacian eigenvectors of the Kronecker product, addressing both theoretical insights and practical implications for large-scale networks, as studied in [20,21]. These studies developed practical methods for estimating the Laplacian spectrum and eigenvectors of the Kronecker product of two graphs. The approximations reveal that the estimated eigenvalues and eigenvectors behave differently depending on the network topology. Empirical and theoretical evidence suggests that the Kronecker product of eigenvectors of the (normalized) Laplacian matrices of the factor graphs, which is central to these approximation methods, can in many cases effectively approximate the eigenvectors of the Laplacian matrix of the Kronecker product. This process of approximation can be enhanced further by examining and comparing the correlation coefficients of the approximated eigenvectors. The correlation coefficient measures the degree to which an arbitrary vector approximates an eigenvector of a given matrix. It is also shown that higher correlation coefficients of the approximated eigenvectors lead to lower relative errors in the estimated eigenvalues, highlighting the accuracy of the approximation. These approximation methods generally provide a more accessible approach for analyzing large-scale, complex networks, with significant implications for network science and machine learning models, such as random fields [19].
The computation of the correlation coefficients is typically impractical because there is no explicit formula available for the (normalized) Laplacian eigenvectors of factor graphs expressed in terms of graph degrees or other structural parameters of the graph. However, Ref. [21] showed that n out of $n^2$ correlation coefficients could be explicitly determined using the approximation method proposed in [20], with the first Zagreb index playing a crucial role in the expression. In contrast, the expression that provides a lower bound (in some cases) for n correlation coefficients, as derived from the second approximation method proposed in [21], contains a factor determined by the degrees of the first factor graph, in which both the first Zagreb index and the forgotten topological index play a crucial role. The second factor of this expression, however, remains a function of an eigenvector of the normalized Laplacian of the second factor graph.
Following the preliminary section, where we outline the established formulas for
n correlation coefficients derived from the approximation formulas in [
20], as well as the lower bounds (in specific cases) for
n correlation coefficients derived from the approximation formulas in [
21], the third section provides empirical and theoretical evidence to demonstrate why the aforementioned bounds serve as reasonably accurate approximations for the actual values of the correlation coefficients. In
Section 3.1, we conduct a comparison between the set of actual values of
n correlation coefficients and the set of proposed approximated correlation coefficients. This is achieved by plotting their smoothed probability distributions for different types of random model graphs. Our observations reveal that, for sparse graphs, both the actual and approximated values of the correlation coefficients exceed 0.9. For slightly denser graphs, these values exceed 0.95 and exhibit similar behavior in both cases (see Figure 1). Additionally, we perform a direct comparison of each of the
n correlation coefficients with its corresponding estimated value by calculating the percentage errors. The results show that the distribution of percentage errors is nearly uniform around 0, as observed in certain regions of the percentage error graph. A shared characteristic of all types of random graphs considered in the experiment is that, after an initial drop in their values, the percentage errors exhibit a gradual upward trend. Subsequently, these errors can be expressed in permilles, indicating a reduction in variability as the order number of the correlation coefficient increases.
In
Section 3.2, we theoretically show that the relative error between the approximated and actual
n coefficients tends to zero as
(fixed
p) or
(fixed
n) in the case of the Erdős–Rényi graph. Experiments confirm that the accuracy of approximation improves with graph density, supporting the reliability of approximation for large or dense networks. This conclusion is established by building upon the results presented in Theorem 3 from [
21], which are derived from Lemma 1 and Lemma 2 in the same work. While these results are initially expressed in terms of inequalities, it is observed that the left-hand and right-hand sides of these inequalities become asymptotically equivalent in the case of this random graph model. The refined proofs show that certain functions of degree sequences in Erdős–Rényi graphs, interpreted as topological indices, are asymptotically equivalent. This strengthens the mathematical foundation of the approximation and broadens the application of chemical graph theory to complex networks.
The primary objective of
Section 4 is to evaluate the accuracy of the approximation method introduced in [
20], which considers only
n coefficients expressed as a function of the degree sequence of a graph. Furthermore, the section aims to evaluate the accuracy of the approximation method presented in [
21], which focuses on
n approximations of the corresponding coefficients associated with this approximation. Given that the
n correlation coefficients in the approximation method from [
20] are mutually equal, we establish the following: The maximum value of these coefficients is achieved when the factor graph (
G) is a regular graph, as shown in Theorem 3. In contrast, the minimum value occurs when
G is a star graph, as stated in Theorem 4 (these results are observed when analyzing the Laplacian matrix of the Kronecker product of graphs
G and
H). For the derived classes of graphs, it is observed that the values of the
n correlation coefficients are consistently smaller than those of the other correlation coefficients. This observation has been extended to additional classes of graphs and is empirically validated. Furthermore, a similar trend is evident among the correlation coefficients obtained via the approximation method presented in [
21]. In Theorem 7, the maximum value of the
n approximated coefficients is achieved when
G is a regular graph, provided that the normalized Laplacian vector of the factor graph (
H) is fixed. It is further proven that the approximated coefficients reach their minimum when
G belongs to a certain class of almost regular graphs, as demonstrated in Theorem 8 and some empirical comments after the theorem, for a fixed normalized Laplacian vector of the factor graph (
H).
In
Section 4.2, we analyze the relationship between the majority of the set of
n correlation coefficients from the first approximation method and the set of
n approximated correlation coefficients from the second method. We also examine how this relationship reflects the connection between the remaining coefficients of both methods. This analysis may lead us to a conclusion regarding which approximation method is more suitable for certain classes of graphs. According to Theorem 5, when
G is an almost regular graph with the degree sequence expressed as
, where
is a perfect square, the discrepancy is minimized, favoring the second set. This pattern extends to the remaining coefficients, further emphasizing the influence of the structure of
G on the methods. On the other hand, as shown in Theorems 5 and 6, along with the comments following these theorems, when
G is a star graph, the discrepancy between the approximated
n correlation coefficients of the second method and those of the first method is maximized, favoring the first set. In both cases, this observation holds true when the normalized Laplacian vector of the factor graph (
H) is fixed. This trend also applies to the remaining coefficients, highlighting the structural distinctions between the two methods. However, deviations from this trend may occur when the values of the
n correlation coefficients from both methods are close, and the relationship between them is not necessarily reflected in the remaining coefficients, as discussed at the end of the section.
The proofs generally require a thorough discussion and encompass a broad spectrum of distinct cases. These cases employ techniques derived from the intersection of graph index theory, probability theory, mathematical analysis, inequalities, polynomial theory, computational mathematics, and other relevant fields.
2. Preliminaries
Consider a simple graph $G$ of order $n$, characterized by a set of vertices $V$ and a set of edges $E$, denoted as $G = (V, E)$. The adjacency matrix $A$ of $G$ is an $n \times n$ matrix, where the entry at position $(i, j)$ is 1 if vertices $i$ and $j$ are adjacent and 0 if they are not. The degree matrix $D$ is a diagonal matrix in which the $i$-th diagonal entry is the number of edges incident to the $i$-th vertex. The Laplacian matrix of the adjacency matrix $A$ is defined as $L = D - A$, where $D$ is the degree matrix of $A$. The normalized Laplacian matrix is defined as $\mathcal{L} = D^{-1/2} L D^{-1/2}$.
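The two matrices defined above can be computed directly from an adjacency matrix. The sketch below assumes NumPy and the standard conventions L = D − A for the Laplacian and D^{-1/2} L D^{-1/2} for the normalized Laplacian; the function names and the star graph on four vertices are illustrative, and the normalized form requires a graph without isolated vertices.

```python
import numpy as np

def laplacian(A):
    """Laplacian L = D - A, where D is the diagonal matrix of vertex degrees."""
    return np.diag(A.sum(axis=1)) - A

def normalized_laplacian(A):
    """Normalized Laplacian D^{-1/2} (D - A) D^{-1/2}.

    Assumes every vertex has positive degree (true for connected graphs
    of order at least 2), so that D^{-1/2} is well defined.
    """
    degrees = A.sum(axis=1).astype(float)
    inv_sqrt = 1.0 / np.sqrt(degrees)
    # Scale row i and column j of L by 1 / sqrt(d_i) and 1 / sqrt(d_j).
    return inv_sqrt[:, None] * laplacian(A) * inv_sqrt[None, :]

# Star graph on 4 vertices: vertex 0 is the centre.
A_star = np.zeros((4, 4))
A_star[0, 1:] = 1
A_star[1:, 0] = 1

L_star = laplacian(A_star)
N_star = normalized_laplacian(A_star)
print(np.round(np.linalg.eigvalsh(L_star), 6))
print(np.round(np.linalg.eigvalsh(N_star), 6))
```

The rows of the Laplacian sum to zero, and the diagonal of the normalized Laplacian consists of ones, which gives two quick sanity checks for any input graph.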
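Let $G = (V_G, E_G)$ and $H = (V_H, E_H)$ be two simple, connected graphs, each of order $n$. The Kronecker product of graphs, denoted by $G \otimes H$, is a graph defined on the vertex set $V_G \times V_H$ such that two vertices $(g, h)$ and $(g', h')$ are adjacent if and only if $(g, g') \in E_G$ and $(h, h') \in E_H$.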
The approximation proposed in [
20] suggests that
, where
and
(
) are arbitrary eigenvectors of
and
, respectively, can be used as an approximation for the true eigenvectors of
. On the other hand, the authors of [
21] found that, in certain cases, the Kronecker product of the eigenvectors of
and
, denoted by
, where
and
(
) are arbitrary eigenvectors of
and
, provides a more accurate approximation for the eigenvectors of
than the Kronecker product of the eigenvectors of
and
. The idea also partially comes from the fact that the normalized Laplacian matrix of the Kronecker product of graphs
can be represented in terms of normalized Laplacian matrices of factor graphs
G and
H.
The closeness of the estimated vector (
x) to any of the original eigenvectors of the Laplacian of the Kronecker product of graphs
is measured using the vector correlation coefficient (cosine similarity), denoted by
, as described in [
20,
21]:
Throughout the paper, we deal with approximations that are Kronecker products of two vectors, i.e.,
and
for
, and to simplify notation, we define
and
. Since not all of the coefficients (
and
) are feasible, our focus is on those that are feasible or for which a reasonable estimation can be provided. Specifically, since
and
are eigenvectors of
and
, respectively, by substituting
and
, in accordance with Theorem 1 and Theorem 3 from [
21], we can express the following consequences: For
and
,
and
are the degrees of
G. In addition, Theorem 2 states that the expectation of
may be greater than or equal to the expectation of the expression on the right-hand side of relation (
2) in certain cases where the Kronecker product of two graphs (
G and
H) comprises Erdős–Rényi graphs. Therefore, in the following section, we demonstrate that the aforementioned expression represents a reasonable approximation of
.
For the sake of simplicity, we denote the right-hand side of expression (
2) as follows:
Furthermore, throughout the paper, we compare the value of
with this approximation (
3), and the resulting analysis ultimately addresses the following question: Which classes of graphs allow the approximation methods proposed in [
20,
21] to yield better results?
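The question posed above can be explored numerically on small graphs. The following sketch makes two labeled assumptions: the vector correlation coefficient of a vector x is taken to be the cosine similarity between x and Lx, where L is the Laplacian of the Kronecker product (this equals 1 exactly when x is an eigenvector with a positive eigenvalue), and a vector in the kernel of L is assigned the value 1. Only the first approximation method (Kronecker products of Laplacian eigenvectors of the factors) is shown, and all function names are hypothetical.

```python
import numpy as np

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

def correlation_coefficient(L, x, tol=1e-12):
    """Cosine similarity between x and L x (assumed form of the coefficient)."""
    y = L @ x
    norm_x, norm_y = np.linalg.norm(x), np.linalg.norm(y)
    if norm_y < tol * max(norm_x, 1.0):
        # x lies in the kernel of L, so it is an exact eigenvector (eigenvalue 0).
        return 1.0
    return float(x @ y / (norm_x * norm_y))

def cycle(n):
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return A

def star(n):
    A = np.zeros((n, n))
    A[0, 1:] = A[1:, 0] = 1
    return A

def first_method_coefficients(A_G, A_H):
    """Coefficients of all Kronecker products of Laplacian eigenvectors of G and H."""
    L_prod = laplacian(np.kron(A_G, A_H))
    _, U = np.linalg.eigh(laplacian(A_G))
    _, V = np.linalg.eigh(laplacian(A_H))
    return np.array([[correlation_coefficient(L_prod, np.kron(U[:, i], V[:, j]))
                      for j in range(A_H.shape[0])]
                     for i in range(A_G.shape[0])])

regular_case = first_method_coefficients(cycle(5), cycle(7))
star_case = first_method_coefficients(star(5), cycle(5))
print(regular_case.min(), star_case.min())
```

When both factors are regular, every approximated vector is an exact eigenvector, so all coefficients equal 1 up to rounding; replacing G with a star graph lowers some of them, which is the kind of contrast the later sections quantify.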
4. Comparison of Correlation Coefficients and
In this section, we further analyze the distribution of correlation coefficients associated with both approximate vectors discussed in previous sections. Initially, we evaluate these approximations by comparing their distributions of correlation coefficients, as calculating the correlation coefficients is generally not feasible for
or
when
. However, only the values of
can be explicitly calculated as presented in (
1). Additionally, there exists a reasonably good approximation for
, as discussed in the previous section and given by (
2) for
. Thus, the main question we aim to address is the following: Can we determine which approximation is more suitable for the given graphs (
G and
H) by comparing only the values of
and an estimation of
?
Motivated by the observation that when
G is
d-regular, the following inequality between
and the approximation of
, denoted by
, always holds for
:
we assume that the approximation for the eigenvectors of
presented in [
20] is more suitable than the one provided in [
21]. However, this verification is only possible if we establish that the majority of the correlation coefficients (
) are greater than those of
for
. This conclusion can be further supported by comparing the smoothed probability density functions of the vector correlation coefficients (
and
). In this analysis, we assume that
G is a regular graph of order 50 with an edge density level of 25% (see
Figure 3). Graph
H, which also has an order of 50, is an Erdős–Rényi graph with an edge density of 10% in the first case (left panel) and a Barabási–Albert graph with an edge density of 25% in the second case (right panel).
In the given plot, we compare the smoothed probability density functions of correlation coefficients derived from the Kronecker product of normalized Laplacian vectors (blue line) and Laplacian vectors (green line). The figure shows that the majority of correlation coefficients based on the Kronecker product of normalized Laplacian vectors are concentrated between 0.87 and 0.96, with a sharp peak at around 0.93. There are almost no values greater than 0.96, as seen from the sudden drop-off of the blue line beyond this point. On the other hand, the correlation coefficients based on the Kronecker product of Laplacian vectors have a broader distribution, primarily covering the range from 0.87 to 1. The green line peaks around 0.96, indicating that a considerable number of these values are concentrated around this point. Additionally, there are many values in the range from 0.96 to 1, as illustrated by the gentler slope of the green curve compared to the blue one. This suggests that the coefficients based on Laplacian vectors tend to spread out closer to 1, while those based on normalized Laplacian vectors are more narrowly concentrated.
The second plot shows an even more pronounced trend than the first. The majority of correlation coefficients based on the Kronecker product of normalized Laplacian vectors are concentrated in the range from 0.87 to 0.97, with a sharp peak around 0.94, and very few coefficients exceed 0.97. In contrast, the coefficients derived from the Kronecker product of Laplacian vectors are spread over a narrower high-value range from 0.95 to 1, peaking around 0.98, which indicates a higher concentration of values close to 1. The shape of the green curve in the second plot confirms that the coefficients based on Laplacian vectors tend to accumulate near the maximum, while those based on normalized Laplacian vectors show a slightly broader distribution but still remain tightly concentrated.
It is important to note that the sum of the cubes of the vertex degrees of a graph (
G) is referred to as the forgotten topological index, denoted by $F(G)$ [
25]. In contrast, the sum of the squares of the vertex degrees of the same graph (
G) is known as the well-established first Zagreb index, denoted by $M_1(G)$ [
26]. According to this notation, the correlation coefficient (
) and the value of
can be expressed as follows:
and
where
m represents the number of edges in the graph (
G).
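Both indices, as well as the number of edges, depend only on the degree sequence, so they are straightforward to evaluate. The sketch below uses hypothetical function names and takes the definitions stated above (sum of squared degrees, sum of cubed degrees); the edge count follows from the handshake lemma, which says that the degrees sum to twice the number of edges.

```python
def first_zagreb_index(degrees):
    """First Zagreb index: sum of the squares of the vertex degrees."""
    return sum(d ** 2 for d in degrees)

def forgotten_index(degrees):
    """Forgotten topological index: sum of the cubes of the vertex degrees."""
    return sum(d ** 3 for d in degrees)

def edge_count(degrees):
    """Number of edges m, using the handshake lemma: sum of degrees = 2m."""
    total = sum(degrees)
    if total % 2 != 0:
        raise ValueError("the degrees of a graph must sum to an even number")
    return total // 2

# Degree sequences of a star graph of order 5 and a 3-regular graph of order 6.
star_degrees = [4, 1, 1, 1, 1]
regular_degrees = [3] * 6

for degrees in (star_degrees, regular_degrees):
    print(edge_count(degrees), first_zagreb_index(degrees), forgotten_index(degrees))
```

The star and the regular graph are used here because they are the two extremal cases discussed later in the section.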
Let us consider the
function expressed in its general form as follows:
where the condition of
holds and
represents the
k-th moment, with
being positive real numbers.
If we denote and as functions of , then it follows that and . Since the extremal values of and are achievable for the same values as the extremal values of and , we analyze them in order to identify suitable candidates (G) with a particular graphical sequence () that exhibit low or high values of or .
On the other hand, for the comparison of the values of and , it is necessary to examine the extremal values of , as we can deduce that the inequality expressed as is equivalent to . Therefore, in the following part of this section, we determine the extremal values of the functions expressed as , , and , subject to the constraint of for , as the degrees of a connected graph can only take integer values between 1 and .
Theorem 2. The stationary points of , as defined by (15), under the constraints of and , for , are - (i)
for , , and ;
- (ii)
and for , , and ;
- (iii)
and for , , and .
For a suitable , there exists an integer (k) such that , and is a positive root of the polynomial expressed as .
Proof. For given values of
p,
l, and
s, we first calculate the partial derivatives of
with respect to
for
:
In order to find the stationary points of
, the following system of equations must be solved for all
:
The above quadratic equations can have, at most, two distinct solutions. Let a and b be solutions of the quadratic equation so that a stationary point is represented as .
If
, then it holds that
for all values. Consequently, the left-hand side of (
17) becomes
from which we deduce that
is a stationary point in all cases stated in the theorem. The same conclusion can be derived for
.
Now, let
. It is easy to see that
is a solution of Equation (
17) if and only if
is also a solution for some
. Therefore, if we set
and substitute
and
for
into (
17), we obtain
After substituting
and performing a short calculation, the previous equation becomes
By dividing the preceding equation by
, where
, and introducing the substitution of
, the equation can be expressed in a more simplified form as
If , , and , the equation becomes . The left-hand side of the equality can be expressed as . Consequently, it is greater than zero for . Therefore, the equation does not have a solution, which proves the first part of the theorem.
If
,
, and
, Equation (
19) becomes
. As the discriminant of this cubic equation is given by
we find that all zeros of
are real. Moreover, according to Descartes’ rule of signs, it is evident that the initial cubic function (
) has one negative and two positive roots. For
, which is a root of
, we find that
is a stationary point of
, which completes the proof of part
of the theorem.
Finally, we need to address the case where
,
, and
. In this scenario, Equation (
19) becomes
. Furthermore,
can be factorized as
, from which we conclude that the only positive root distinct from one is
. Consequently, it follows that
is a stationary point of
, completing the proof of part
of the theorem. □
4.1. Extreme Values of the Correlation Coefficients
In the following theorems, we determine the global minimum and maximum of the function expressed as to identify the extreme values of , where is defined as for a given degree sequence of a graph (G).
Theorem 3. The point expressed as represents the maximum of the function expressed as , subject to the constraint of for all .
Proof. In the previous theorem, we proved that the point expressed as
is a stationary point of
, and now, we show that this point is a global maximum of the function. According to the definition of
, for
,
. Since
are positive real numbers, the arithmetic–quadratic mean inequality implies
, leading to
□
According to the preceding theorem, we have demonstrated that the correlation coefficient (
) attains its highest value of 1 only when
G is regular, as
. On the other hand, the second part of the assertion indicates that
, since
, with equality occurring only when
G is regular. Let us note that this extremal case serves as a motivation for comparing the values of
and
in order to establish certain relations between the correlation coefficients (
and
), as mentioned at the beginning of this section. We conclude that
is always greater than or equal to
due to the presence of the
factor in the expression for
, which prevents the two values from being equal. Additionally, we experimentally demonstrate that
tends to exhibit higher values than
, as illustrated in
Figure 3. However, since the upper bound of
depends on the graph (
H), it is evident that the maximum value of
can reach 1 if
H is a regular graph, in which case it would be equal to
. Consequently, in this case (where
G and
H are regular), both approximations become actual eigenvectors of
. This straightforward observation serves as the motivation for the approximation suggested in [
20], leading to the conclusion that all values of
and
are equal to 1. This further supports our hypothesis that higher values of
and
correspond to greater values of
and
, respectively.
Consequently, we identify the classes of graphs (G) for which the values of are low and compare these with the corresponding values of . Given that and does not possess any other stationary point, except for the one at which the maximum is attained, as indicated in the previous theorem, we conclude that the minimum of this function is achieved at the boundary points of . Therefore, we prove the following theorem.
Theorem 4. The minimum of the function expressed as , given , is achieved at the point expressed as .
Proof. Since the minimum is achieved at a boundary point, we can assume, without loss of generality, that
. According to (
16), partial derivatives of
with respect to
for
are defined as
Since the partial derivatives have the same form as in the case of the function (F) when considered over the entire domain, it follows that the stationary points must satisfy the system of linear equations (). The solution to this system is given in the form of and . Since each of the equations can be transformed as , we conclude that every stationary point on the boundary of the domain satisfies . These points correspond to the global maxima. Thus, it follows that the global minimum is achieved on the boundary of the domain expressed as , implying that . Continuing this process, we can deduce that, for a point of global minimum , it holds that for all .
If , then the above equality can be expressed as .
Therefore, we proceed by comparing the values of
, where
for
. These values represent the potential points at which the function expressed as
may achieve its minimum. Given that
our goal is to find the minimum value of the function expressed as
, for
.
The first derivative of
G is expressed as
From this expression, it follows that G is an increasing function for , which implies that is the point where the function (F) attains its minimum. □
According to the preceding theorem, we conclude that for a graph (G) with a degree sequence of , the correlation coefficients () attain their lowest values. This indicates that G is a star graph. By substituting the degree sequence into the formula for , we find that , which holds true as . On the other hand, we have , implying that . In examining the formulas of and , we note that the order of magnitude of n in the first formula exceeds that in the second. This suggests that is generally greater than . However, the correlation coefficient () may decrease the value of , potentially resulting in cases where , depending on the specific classes of graphs and, consequently, the value of .
Now, we conduct experiments that compare the coefficient relations of
and
when
G is a star graph of order 50 and
H is an Erdős–Rényi graph of order 50, with an edge density of 10% in the first case (see
Figure 4, left panel) and an edge density of 30% in the second case (
Figure 4, right panel). The left plot shows that most correlation coefficients (
, blue line) cluster around 0.93, indicating a sharp peak and low dispersion, suggesting that values are concentrated within a narrow range. Meanwhile, the coefficients (
, green line) peak around 0.97 but display a wider dispersion, covering a broader range of values. This suggests that
generally takes higher values than
, except within the narrow range of the highest values (0.95 to 0.98). In the second case, with a slightly denser graph (
H), the trend becomes more pronounced. The number of correlation coefficients (
) is consistently higher than that of
across all ranges, indicating that
typically achieves higher values overall. Additionally, in both panels, we observe a slight lift in each curve: around 0.7 for the blue line and around 0.3 for the green line, corresponding to
and
, respectively.
The relationship between and is, indeed, indicative of the relationship between and . The density of graph H plays a pivotal role in amplifying through increases in . This dynamic not only increases but also ensures that becomes more dominant in relation to (right panel). By choosing graph G as a star graph, we achieve the desired dominance of over , further validating the hypothesis.
4.2. Comparison of the Correlation Coefficients and the Approximations
Based on previous discussions, we find that the value of
reaches its minimum when considering the star graph among all connected graphs of order
n. However, we have shown that, even when
G is a star graph, the value of
remains greater than
by a factor on the order of
n, up to the value of
. This suggests that the discrepancy between these two values may be at its maximum in this specific case. Therefore, in the following theorem, we compare
and
directly by examining the extreme values of the function expressed as
. It is worth noting that the comparison between
and
can be analyzed by studying the function expressed as
under the following condition, which we previously established:
In the following theorems, we aim to find the second differential of the function expressed as
and, accordingly, calculate its second partial derivatives. First, in equality (
16), let
denote the factor of
, which is given by
. Thus, the second partial derivatives are expressed as
where the partial derivative of the
function with respect to
is given by
Theorem 5. For the function expressed as , under the constraint of for , it follows that the point expressed as represents the maximum for a suitable .
Proof. According to Theorem 2, the points expressed as
, for
represent the potential points at which the function expressed as
achieves its maximum. Therefore, we proceed by comparing the values of
, where
, for every
in order to identify the maximum. First, we observe
where
. Clearly, the function expressed as
attains its minimum value when
t is as small as possible. Furthermore, since the quadratic function expressed as
for
attains its maximum at
, as established using the arithmetic–geometric mean inequality, we conclude that it reaches its minimum at the endpoints of the interval of
. This implies that
for
or
. Since
is the point at which the function expressed as
achieves its maximum in the set of points expressed as
, it remains to be proven that this point is, indeed, a global maximum of
under the constraints given in the statement of the theorem.
To establish this, we demonstrate that the second derivative of the function at point
is negative. Therefore, we calculate the second derivatives at point
following Equation (
21), from which we conclude that
, given that
at stationary point
. Since the value of
is constant, we calculate
only at stationary point
using (
22), resulting in
First, let
a be a substitute for
, and observe that
, which implies
. Therefore, by extracting the factor expressed as
, the second derivative of the function expressed as
at point
can be expressed as
Applying the arithmetic–quadratic mean inequality to variables
, i.e.,
, we deduce that
Using this inequality in Equation (
24) and factoring out the common term
, we obtain the following inequality:
Observing that
, we can rewrite the right-hand side of the above inequality in the following form by placing the sum in front of the expression:
If we factor out from the expression under the summation, we obtain a quadratic equation in terms of whose discriminant is given by . It is evident that this discriminant is negative, implying that the expression under the summation is greater than zero. This further implies that is negative, thereby completing the proof. □
Theorem 6. The point expressed as is a saddle point of the function expressed as , where the function is defined in the domain of for and .
Proof. We examine function
F around the point expressed as
along the line expressed as
, where
. The value of the function along the line, given by
, is denoted as
and is defined as
The first derivative of
G is given by
from which it follows that
G is non-decreasing in the interval of
. This implies that in an arbitrary
neighborhood of point
y, we can identify the points (
and
) where
exhibits opposite signs. This indicates that
F has a saddle point at
for
. □
Given that does not possess any stationary points other than the one at which the maximum is attained and has only a single saddle point, as established in the previous theorems, we conclude that the minimum of this function is achieved at the boundary points (). After performing the necessary derivative calculations for the function expressed as , similar to those applied to the function expressed as in the proof of Theorem 4, we find that the minimum occurs for . However, no graph (G) exists with a degree sequence of . Otherwise, two vertices would be adjacent to all other vertices, which would imply that every vertex has a degree of at least two, a contradiction. The problem of finding the minimum of at points with integer coordinates falls within the field of integer programming, where problems are often computationally intractable. In this case, experiments show that the minimum is achieved for , which implies that G is a star graph. This case has already been analyzed, so no new class of graphs is discovered; it indicates that the values of are of a significantly higher order of magnitude in n than , resulting in the greatest discrepancy between them according to the previous conclusion.
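The non-existence argument above can be checked mechanically. The following is a minimal sketch using the Havel–Hakimi test, a standard graphicality check that is not part of the method in this paper. It assumes, as a reading of the argument, that the excluded sequence has two vertices of degree n − 1 and all remaining vertices of degree 1; the name `is_graphical` is illustrative.

```python
def is_graphical(degrees):
    """Havel-Hakimi test: True if some simple graph has this degree sequence."""
    seq = sorted(degrees, reverse=True)
    if any(d < 0 for d in seq):
        return False
    while seq and seq[0] > 0:
        d = seq.pop(0)            # take the largest remaining degree
        if d > len(seq):          # not enough other vertices to attach to
            return False
        for i in range(d):        # join it to the d next-largest vertices
            seq[i] -= 1
            if seq[i] < 0:
                return False
        seq.sort(reverse=True)
    return True


n = 6
# Assumed form of the excluded sequence: two vertices adjacent to all others.
two_hubs = [n - 1, n - 1] + [1] * (n - 2)
# Degree sequence of the star graph on n vertices.
star = [n - 1] + [1] * (n - 1)

print(is_graphical(two_hubs))  # False
print(is_graphical(star))      # True
```

The first sequence fails because, once the first hub is joined to every other vertex, the pendant vertices have no remaining degree for the second hub.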
According to Theorem 5, we conclude that the function expressed as
attains its maximum value, which is equal to
, based on relation (
23) at point
for some
.
Therefore, for graph
G with a degree sequence of
, where
and
are integers, we have
which implies that
for any graph
H according to the relation (
20).
Before we proceed with further analysis of our hypothesis that the relation reflects the relation between the remaining correlation coefficients ( and ) by experimentally calculating them, let us briefly mention why a graph (G) with a degree sequence of , where and are integers, exists. Graph G exists if and only if graph exists, where v is a vertex of G such that its degree is equal to , where . The sum of the vertex degrees of is equal to , from which we can immediately conclude that the first summand is an even number. Furthermore, the second summand () can be rewritten as , indicating that it is also an even number. This conclusion follows from the fact that y, , and cannot all be odd simultaneously. However, the first summand represents the sum of the degrees of a -regular graph with vertices, while the second summand represents the sum of the degrees of a y-regular graph with vertices. These sums are evidently even, confirming the existence of a disconnected graph with two connected components. Finally, to obtain a connected graph with the same degree sequence, it is sufficient to remove an edge in the d-regular component and connect one of its ends to an arbitrary vertex of the other component.
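The parity condition on the degree sums is what guarantees that each regular component exists. As an illustration only, the sketch below builds a d-regular circulant graph on m vertices whenever d·m is even and d < m; the circulant construction and the names `regular_circulant` and `degree_sequence` are our assumptions and are not taken from the paper.

```python
from collections import Counter


def regular_circulant(m, d):
    """Edge set of a d-regular circulant graph on vertices 0..m-1.

    Requires 0 <= d < m and d*m even (the handshake parity condition).
    """
    if not (0 <= d < m) or (d * m) % 2 != 0:
        raise ValueError("no d-regular simple graph on m vertices")
    edges = set()
    for v in range(m):
        # Join v to the vertices at circular distance 1..d//2 on both sides.
        for step in range(1, d // 2 + 1):
            u = (v + step) % m
            edges.add((min(u, v), max(u, v)))
        # For odd d (so m is even), add the antipodal vertex as well.
        if d % 2 == 1:
            u = (v + m // 2) % m
            edges.add((min(u, v), max(u, v)))
    return edges


def degree_sequence(edges, m):
    """Degrees of vertices 0..m-1 for the given edge set."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[v] for v in range(m)]


print(degree_sequence(regular_circulant(6, 3), 6))  # [3, 3, 3, 3, 3, 3]
```

Taking two such graphs on disjoint vertex sets gives a disconnected graph with two regular components, as used in the existence argument.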
According to the previous discussion, we can conclude that a graph (
G) with a degree sequence of
exists if and only if the following condition is satisfied:
In the following experiments, a graph with a degree sequence of
is referred to as an almost regular graph with parameters of
y and
k, denoted by
. It is important to note that both a regular graph and the star graph can be treated as almost regular graphs with parameters of
and
, respectively (see
Figure 5).
The blue line, representing the correlation coefficients (), indicates a concentration of values around a peak of 0.94, with a moderate spread in both cases. We observe greater variability and a broader distribution in the correlation coefficients () due to the increased density of graph H. In contrast, the green line, corresponding to , consistently shows higher correlation values than , with peaks approaching 0.98 across both scenarios. The increase in the edge density of H affects slightly, whereas remains largely unchanged. A similar pattern can be seen in the comparison of the correlation coefficients ( and ). Moreover, ranges approximately from 0.4 to 0.7, while reaches 0.8, corresponding to the elevation of the green curve at that point.
From the discussion above, we conclude that the relationship between
and
is consistent with the relationship between the entire sets of correlation coefficients (
and
). This claim is validated for classes of graphs for which the values of
and
achieve their extremes. As observed in the previous assertions, these classes of graphs fall within the class of almost regular graphs. Therefore, it is useful to examine the relation between
and
within this class to substantiate the claim. This examination can be conducted by analyzing the
function expressed as (
25). Since
is a function of a single variable, it can be plotted over the interval of
(for instance, we can consider the case where
).
On the graph of
, we can identify three characteristic points, as shown in
Figure 6. The first two points
and
are considered in Theorems 6 and 5, from which we conclude that they represent a saddle point and the point of global maximum, respectively. According to the inequality in (
20), we conclude that if
is satisfied, then it holds that
. This implies that for point
, such that
and
, it holds that
for
. Moreover, for
, it could potentially be that
, depending on the value of
. After a brief calculation, we can determine that the equation expressed as
is equivalent to
which implies that
.
We conduct an experiment to calculate the correlation coefficients ( and ) and to construct their smoothed probability functions, where G is an graph and H is an Erdős–Rényi graph, with edge densities of 10% and 30%, respectively. Both graphs (G and H) are of order . Based on the previous discussion, given , we have , from which it further follows that potentially holds, provided that and . We examine the entire sets of values of and to determine their relationship.
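A setup of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not the experimental code of the paper: the edge density is taken to be the edge probability p of the G(n, p) model, a star graph stands in for G (the paper uses a specific almost regular graph), and the helper names are hypothetical. It builds the adjacency matrix of the Kronecker product and checks the standard fact that the degree of a vertex (u, v) in the product equals the product of the degrees of u and v.

```python
import random


def erdos_renyi(n, p, seed=0):
    """Adjacency matrix of a G(n, p) graph: each edge appears with probability p."""
    rng = random.Random(seed)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                A[i][j] = A[j][i] = 1
    return A


def kronecker(A, B):
    """Adjacency matrix of the Kronecker (direct) product of two graphs."""
    m = len(B)
    size = len(A) * m
    # Entry ((i, k), (j, l)) is A[i][j] * B[k][l], with row index i*m + k.
    return [[A[r // m][c // m] * B[r % m][c % m] for c in range(size)]
            for r in range(size)]


def degrees(A):
    """Row sums of a 0/1 adjacency matrix."""
    return [sum(row) for row in A]


n = 8
# Star graph as a stand-in for G (assumption; vertex 0 is the centre).
G = [[1 if i != j and (i == 0 or j == 0) else 0 for j in range(n)]
     for i in range(n)]
H = erdos_renyi(n, 0.3, seed=1)
K = kronecker(G, H)

# Degrees in the product are products of the factor degrees.
expected = [a * b for a in degrees(G) for b in degrees(H)]
print(degrees(K) == expected)  # True
```

Because the degrees of the product factor in this way, quantities defined through degree sequences can be studied on the factor graphs separately.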
In the first plot, where
H is an Erdős–Rényi graph with a 10% edge density, the green line representing the correlation coefficients (
) lacks a distinct, sharp peak, instead exhibiting a broad, flatter distribution (see
Figure 7). This contrasts with the blue line
, which has a pronounced peak around 0.95, indicating a sharp concentration of values with limited dispersion. This suggests that the values of
are generally greater than
. The right panel, representing
H with a 30% edge density, shows similar trends, though the differences between
and
become more pronounced. The
values remain concentrated around a pronounced peak near 0.97, whereas the
values also shift higher, centering around the same peak but exhibiting a wider distribution.
A similar pattern is observed in the comparison of the correlation coefficients ( and ), where values are generally higher than those of . Specifically, ranges approximately from 0.45 to 0.62 when H has an edge density of 10%, while reaches a maximum value of 0.52, corresponding to the elevation of the blue and green curves in those segments. This trend becomes more pronounced when H has an edge density of 30%, in which case nearly all values are greater than . In this case, ranges from 0.51 to 0.63, whereas remains at 0.52.
Up to this point in this subsection, we have compared the sets of correlation coefficients ( and ) only through their ratios. In the following subsection, we examine the values themselves.
4.3. Extreme Values of the Approximated Correlation Coefficients
In the following two theorems, we analyze the extreme values of by determining the minimum and maximum value of the function expressed as , as it was previously established that .
Theorem 7. The point expressed as represents the minimum of the function expressed as , subject to the constraint of for all .
Proof. In Theorem 2, we prove that the point expressed as
is a stationary point of
, and now, we show that this point is a global minimum of the function. According to the definition of
, for
, we find that
. First, observe that
, while
. Consequently, we have
which ultimately implies that
. □
Theorem 8. The maximum of the function expressed as , subject to the condition , is attained at the point expressed as for , where is a suitable constant and is a root of the following polynomial:
Proof. According to Theorem 2, the points expressed as
, for
, represent the potential points at which the function expressed as
achieves its maximum, where
represents a positive root of
. Therefore, we proceed by comparing the values of
, where
, for every
in order to identify the maximum. Given that
we aim to find the maximum value of the function expressed as
, where
is a positive solution of
, for
. The first derivative,
, can be expressed as
, where
and
are defined as
By regrouping the terms that include the
factor in
, we can represent
as
, where
After extracting the
factor from the terms of
, the resulting quotient is a polynomial in
, which has a double root at 1. Thus,
can be rewritten as
According to the given formula, after extracting the
x factor from the terms of
, the resulting quotient is a polynomial in
, which has a root at 1. Dividing this polynomial by
results in a new polynomial in the form of
. Explicitly, we have
We previously established that polynomial
, as defined in (
19), where
,
and
, has exactly one negative root and two positive roots, as demonstrated in the proof of Theorem 2. This implies that
satisfies the same condition. Moreover, since the roots of the derivative of
, calculated by
, are
, we can conclude that for the greater positive root of
, denoted as
, it holds that
. Additionally, for the smaller positive root of
, denoted as
, it follows that
, in accordance with the Gauss–Lucas theorem.
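To make the root-location step concrete, the sketch below checks numerically that the critical points of a real-rooted cubic lie between its roots, which is what the Gauss–Lucas theorem gives in the real case. The cubic used here, with roots −1, 1 and 3 (one negative and two positive), is an illustrative choice and not the polynomial from the proof; the function names are hypothetical.

```python
import math


def cubic_from_roots(r1, r2, r3):
    """Coefficients (a, b, c, d) of the monic cubic (x - r1)(x - r2)(x - r3)."""
    b = -(r1 + r2 + r3)
    c = r1 * r2 + r1 * r3 + r2 * r3
    d = -r1 * r2 * r3
    return 1.0, b, c, d


def critical_points(a, b, c):
    """Real roots of the derivative 3a x^2 + 2b x + c of a x^3 + b x^2 + c x + d."""
    disc = (2 * b) ** 2 - 4 * (3 * a) * c   # non-negative for real-rooted cubics
    s = math.sqrt(disc)
    return sorted([(-2 * b - s) / (6 * a), (-2 * b + s) / (6 * a)])


roots = (-1.0, 1.0, 3.0)                    # illustrative roots (assumption)
a, b, c, d = cubic_from_roots(*roots)
lo, hi = critical_points(a, b, c)

# Each critical point separates two consecutive roots.
print(roots[0] < lo < roots[1] < hi < roots[2])  # True
```

In particular, the larger critical point lies between the two positive roots, which is the kind of bound used above to compare the positive roots with a fixed value.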
If , we have , which implies that . Therefore, we conclude that . Furthermore, since , it follows that , leading to the inequality expressed as . Consequently, we can infer that , which indicates that is a decreasing function. This implies that reaches its maximum at within the interval of . Therefore, we conclude that the point expressed as , where is a root of polynomial and is greater than one, is the point at which the function expressed as attains its maximum within the domain defined by the set of points () for .
If , we have , given that , which implies that . Additionally, note that . On the other hand, if we rewrite as , it is evident that , which implies that ( decreases on the segment of ). Therefore, we conclude that , further inferring that . Consequently, we establish that , implying that is an increasing function. This observation implies that reaches its maximum at over the interval of . Therefore, we conclude that the point expressed as , where is a root of the polynomial expressed as and is less than one, is the point at which the function expressed as attains its maximum within the domain defined by the set of points () for .
However, if we substitute
, we obtain
, and
is a root of polynomial
, given that
. Consequently, we conclude that
, and for every positive
t and
y, it holds that
Therefore, it is sufficient to prove that the second derivative of F is negative at point , which satisfies . We omit this proof, since the calculations are lengthy and introduce no technique beyond those used in the proof of Theorem 5.
By substituting , we obtain , which essentially represents the depressed cubic form of . We can straightforwardly observe that and , which implies . However, if , which holds true for . For , we can directly verify that , which completes the proof. □
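For reference, the reduction to a depressed cubic used above follows the standard substitution. The statement below is written for a generic cubic with coefficients c_3, c_2, c_1, c_0 (placeholder symbols chosen to avoid clashing with the notation of the proof), followed by a small worked instance that is purely illustrative.

```latex
% Generic cubic (placeholder coefficients, c_3 \neq 0):
q(x) = c_3 x^3 + c_2 x^2 + c_1 x + c_0 .
% Substituting x = s - \frac{c_2}{3 c_3} removes the quadratic term:
\frac{q(x)}{c_3} = s^3 + P s + Q, \qquad
P = \frac{3 c_3 c_1 - c_2^2}{3 c_3^2}, \qquad
Q = \frac{2 c_2^3 - 9 c_3 c_2 c_1 + 27 c_3^2 c_0}{27 c_3^3} .
% The depressed cubic has three distinct real roots iff
\Delta = -4 P^3 - 27 Q^2 > 0 .
% Illustrative instance: q(x) = x^3 - 3x^2 - x + 3 with x = s + 1 gives
P = \frac{3(-1) - 9}{3} = -4, \qquad
Q = \frac{-54 - 27 + 81}{27} = 0, \qquad
s^3 - 4s = 0 \;\Rightarrow\; s \in \{-2, 0, 2\},\; x \in \{-1, 1, 3\} .
```

Once the quadratic term is removed, the signs of P and Q alone determine how the roots are distributed, which is why the sign checks in the proof suffice.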
Now, we examine the values of
and
within the class of almost regular graphs in a manner similar to how we analyzed their ratio in
Figure 6.
Let us first note that and for some graphs (G and H), where G has an order of n and a degree sequence of .
According to the two preceding theorems, we observe that the minimum and maximum values of
are attained at
and
, respectively. Additionally, as established by Theorems 3 and 4, the maximum and minimum values of
are achieved at
and
, respectively. Furthermore, since the maximum value of
occurs at
, as indicated by Theorem 5, we conclude that the maximum value of
is also attained at
. Finally, according to Equation (
26), we observe that
or, equivalently,
when
. Since
and
are functions of a single variable, they can be plotted over the interval of
for
, with the characteristic points marked on the plots (see
Figure 8).
As is the point of the global minimum of , we observe that achieves its minimum when G belongs to the class of almost regular graphs with a degree sequence of or for an appropriate integer (y) and a given graph (H). Furthermore, based on empirical evidence, we hypothesize that these points are the candidates among which the global maximum of should be located. Determining the theoretical maximum of this function is an integer programming problem, and such problems are often computationally intractable.
However, even though reaches its minimum at , we observe that the value of can be lower than the value of , as shown in the provided graph for . Specifically, it is established that for , the inequality expressed as holds. Consequently, since and , we conclude that .
Now, we provide some comments related to the comparison between and . According to the proof of Theorem 8, it is observed that , which implies that for all n satisfying , the inequality expressed as holds. If we raise both sides of the inequality expressed as to the power of three, we obtain . Similarly, squaring both sides of the inequality expressed as yields , from which we straightforwardly conclude that it is satisfied for .
In the same manner, we observe that for all n satisfying , the inequality expressed as holds. Consequently, it can be concluded that this inequality is satisfied when , which implies . For values of n in the range of , the relationship between and must be checked directly, as demonstrated for .
For the case of
, as we have observed, the values of
and
are very close, and for a fixed
,
reaches its minimum on one of the graphs (
or
). It can be noted that
; therefore, the analysis focuses on the sets of correlation coefficients of (
), despite the fact that the values differ only at the fifth decimal place. Moreover, given that
, we conclude that
. Since
and
, it follows that, for sufficiently high values of
, the value of
could exceed
. However, we conduct experiments to calculate all correlation coefficients (
and
) and compare them for graph
G being
, while
H represents an Erdős–Rényi graph, with edge densities of 10% and 30%, respectively, as shown in
Figure 9. In the first case, the values of the correlation coefficients (
) range from 0.42 to 0.63. In the second case, their lower bound is slightly higher, starting at 0.5, while the upper bound remains the same, at 0.63. Therefore, in both cases, we can infer that some of the correlation coefficients (
) are higher than
(note the upward shift of the green line in both plots around this value). Conversely, the values of
clearly dominate the values of
overall in both scenarios.
On the other hand, the situation is entirely different when considering the Kronecker product of graph
G, which is
, and the graph
H, an Erdős–Rényi graph, with edge densities of 10% and 30%, respectively (see
Figure 10). We observe that
, while
, which implies that
(note the rise of the green line near the value of 0.69). In the conducted experiment, the values of
range from 0.4 to 0.67 when
H has an edge density of 10% and from 0.5 to 0.65 when
H has an edge density of 30%, implying that
is always greater than
. Nevertheless, it is observed that the values of
are considerably higher in both plots compared to the preceding experiment. Furthermore, in a significant number of cases,
exceeds the corresponding values of
.
Based on the two preceding experiments, we believe that the relationship between and does not necessarily determine the relationship between the entire sets of correlation coefficients ( and ), especially when the absolute difference between and is small. More generally, we can conclude that the relationship between and does not significantly influence the relationship between and , provided that G is an almost regular graph () of order n and k is close to .
5. Concluding Remarks
Although the relationships between the spectral properties of a product graph and those of its factor graphs are well-established for standard graph products, the characterization of the Laplacian spectrum and eigenvectors of the Kronecker product of graphs using the Laplacian spectra of the factors remains an unresolved problem. In this work, we analyzed approximation methods recently proposed in the literature for estimating the Laplacian eigenvectors of the Kronecker product of graphs, given the eigenvectors of their factor graphs. The most common method to evaluate the extent to which an arbitrary vector belongs to a set of eigenvectors of a given matrix involves the use of a correlation coefficient. This measure is closely related to the cosine of the angle between the original vector and its image under the matrix transformation. However, the calculation of the correlation coefficient is typically infeasible, as the explicit formula for the (normalized) Laplacian eigenvectors of factor graphs is not known in the general case in terms of the parameters of the factor graphs.
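The cosine interpretation mentioned above can be illustrated on a very small example. The sketch below is only a sketch of the underlying idea and does not reproduce the exact coefficient used in the paper: it uses the combinatorial Laplacian of the path on three vertices rather than a normalized Laplacian, and the helper names are hypothetical. An eigenvector with a nonzero eigenvalue is parallel to its image, so the cosine equals 1 in absolute value, while a generic vector gives a smaller value.

```python
import math


def mat_vec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def cosine(u, w):
    """Cosine of the angle between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, w))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_u * norm_w)


# Combinatorial Laplacian of the path on three vertices (illustrative choice).
L = [[1, -1, 0],
     [-1, 2, -1],
     [0, -1, 1]]

eigvec = [1, 0, -1]   # satisfies L v = 1 * v
other = [1, 0, 0]     # not an eigenvector

print(cosine(eigvec, mat_vec(L, eigvec)))  # 1.0 up to rounding
print(cosine(other, mat_vec(L, other)))    # about 0.7071
```

Values close to 1 therefore indicate that a candidate vector is close to an eigenvector, which is how the correlation coefficients are read in the experiments.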
Fortunately,
n of the
correlation coefficients can be explicitly determined using the method proposed in [
20] and estimated using the method proposed in [
21], where
is the order of the graph formed as the Kronecker product of two graphs of order
n. The expression presented in [
21] has proven to be a highly accurate approximation of these
n correlation coefficients. This was validated through numerous experiments, which included determining the probability density functions and analyzing the percentage errors between actual and estimated values. These experiments encompassed various scenarios, such as Kronecker products of different types of random graphs and varying edge densities, consistently demonstrating the method’s reliability. Theoretical analysis shows that the expected value of the relative error between the approximated and actual values of the
n coefficients tends to zero as
with a fixed
p in the Erdős–Rényi model. Similarly, as
with a fixed
n, the error also diminishes. Both conclusions are supported by experimental verification, highlighting that the approximation improves with increasing graph density and confirming its reliability for large or dense networks.
The theoretical analysis relies on demonstrating that certain functions of degree sequences, which can be interpreted as topological indices (including the forgotten topological index), are asymptotically equivalent. This connection not only strengthens the mathematical foundation of the approximation but also provides a novel contribution to the chemical graph theory of random graphs, extending its applicability to the study of structural properties in complex networks.
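For readers less familiar with these indices, the sketch below computes the first Zagreb index (sum of squared degrees) and the forgotten topological index (sum of cubed degrees) directly from a degree sequence. The two example sequences, a star and a cycle on five vertices, are illustrative choices and are not taken from the experiments; the function names are hypothetical.

```python
def first_zagreb(degrees):
    """First Zagreb index: sum of squared vertex degrees."""
    return sum(d ** 2 for d in degrees)


def forgotten_index(degrees):
    """Forgotten topological index: sum of cubed vertex degrees."""
    return sum(d ** 3 for d in degrees)


n = 5
star = [n - 1] + [1] * (n - 1)   # degree sequence of the star on 5 vertices
cycle = [2] * n                  # degree sequence of the 5-cycle (2-regular)

print(first_zagreb(star), forgotten_index(star))    # 20 68
print(first_zagreb(cycle), forgotten_index(cycle))  # 20 40
```

The two graphs share the same first Zagreb index but differ in the forgotten index, which shows that the latter is more sensitive to a single vertex of high degree.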
The primary advantage of the
n specific correlation coefficients of the method proposed in [
20] lies in their dependence on the degree sequences of one factor graph, classifying them as structural topological indices. Moreover, in the approximation of the coefficients related to the method proposed in [
21], only one factor depends on the degree sequence, which allows the approximation to be viewed as a hybrid index, combining structural and spectral characteristics. This leads to the central motivation for the second part of the paper, which raises several questions:
Can we infer the accuracy of the first approximation method by knowing only the explicit values of the n correlation coefficients?
Similarly, can the accuracy of the second approximation method be assessed using only the approximations of the n correlation coefficients?
Finally, is it possible to determine which approximation is more suitable for specific classes of factor graphs by comparing the mentioned correlation coefficients and the approximations?
Related to the first two questions, it can be observed that the n correlation coefficients (for both approximation methods) are generally smaller than the remaining correlation coefficients. This trend is evident in all plots, where the smoothed probability distributions of the correlation coefficients show a lift on the left side of the graph, corresponding to the lower values of these n coefficients. In contrast, the dominant majority of the coefficients tend to cluster around a peak on the right side of the graphs, indicating that these coefficients typically have greater values. This naturally raised the task of determining the extreme values of these coefficients for the first approximation method and of the approximated values for the second method, because higher extreme values generally imply higher overall values across the entire set of coefficients. We proved that for the first approximation method, the maximum value of the n coefficients is achieved when the factor graph (G) is a regular graph, while the minimum value occurs when G is a star graph. Similarly, for the second approximation method, the maximum value of the n approximated coefficients is achieved when G is a regular graph, given a fixed normalized Laplacian eigenvector of factor graph H. We also proved that the approximated coefficients achieve their minimum when G belongs to the class of almost regular graphs with a degree sequence of or , where y is an appropriate integer and is defined in Theorem 8 for a fixed normalized Laplacian eigenvector of H. Based on empirical evidence, we hypothesize that these points are strong candidates for the global minimum. Determining the theoretical minimum or maximum of this function, however, involves integer programming, a field in which problems are often computationally intractable. As such, resolving these challenges represents a promising direction for future research, particularly in advancing both theoretical insights and practical applications.
Regarding the third raised question, we observed that, in most cases, the relationship between the majority of the values in the set of n correlation coefficients of the first approximation method and the approximated n correlation coefficients of the second method reflects the relationship between the remaining coefficients of both approximations. Specifically, we proved that when G is a star graph, the discrepancy between the approximated n correlation coefficients of the second method and the n correlation coefficients of the first method is largest, favoring the first set. This same behavior extends to the remaining coefficients, further emphasizing the structural distinctions between the two methods in this case. The reverse case is observed when G is an almost regular graph with a degree sequence of , where is a perfect square. In this scenario, the discrepancy between the approximated n correlation coefficients of the second method and the n correlation coefficients of the first method is smallest, favoring the second set. This pattern also extends to the remaining coefficients, further highlighting how the structural properties of G influence the behavior of the two approximation methods. Certain deviations from this trend can be observed when the values of the n correlation coefficients of the first and second methods are close, and the relationship between them cannot be reflected in the relationship between the remaining coefficients of the approximation methods.
These observations emphasize the significant impact of the graph structure, particularly highly imbalanced configurations, such as certain types of almost regular graphs (one example being the star graph), on the performance of the two approximation methods. Therefore, it will be useful to examine the accuracy of both methods on classes of graphs with more balanced degree sequences, i.e., graphs with more than two values in their degree sequence. However, the proofs generally require extensive case analysis and involve techniques based on the interplay of graph index theory, mathematical analysis, inequalities, polynomial theory, and computational mathematics, among others. Consequently, the examination of more balanced graphs would be a more demanding task. In addition, certain results concerning the examination of the functions expressed as and can stand alone in the literature on graph indices, as they establish certain relationships between the forgotten topological index and the first Zagreb index. These results contribute to the broader understanding of graph indices and may inspire further exploration in this area.