Abstract
Spectral techniques are often used to partition the set of vertices of a graph or to form clusters. They are based on the Laplacian matrix and easily incorporate weights on the edges. In this work, we introduce a p-Laplacian, a generalized Laplacian matrix with potential, which also allows us to take into account weights on the vertices. These vertex weights are independent of the edge weights. In this way, we can cluster according to the importance of the vertices, assigning more weight to some vertices than to others, rather than considering only their number. We also provide bounds, similar to Cheeger's, for the value of the minimal cut cost with weights at the vertices, as a function of the first non-zero eigenvalue of the p-Laplacian (an analog of the Fiedler eigenvalue).
MSC:
05C22; 05C35; 05C50; 05C70
1. Introduction
Informally, a cluster in a graph is a subgraph whose vertices are tightly linked to each other and loosely linked to the vertices outside the subgraph. This deliberately vague concept is useful in the description of several phenomena: walking, searching, and the decomposition of graphs [1]. The concept of a cluster is closely related to that of a community [2]. Depending on the intended application, the concept can be made precise either to better reflect the aspects to be modeled or to ease its computation. In our case, we were interested in subdividing the set of vertices into two parts of similar size, in such a way that the number of edges between the two parts is kept to a minimum. To solve this problem, there are many references in the literature on spectral tools, based on a certain eigenvector of the Laplacian operator (the Fiedler vector) [3]. These references also show how to consider weights on the edges of the graph so that the minimization takes these weights into account. In this work, we extended this tool to the case where there are also weights on the vertices, independent of those on the edges, so that the partition is made into parts of similar total weight, not necessarily a similar number of vertices, minimizing the total edge weight of the cut. Other proposals in the literature deal with this problem using generalized eigenvectors (see [4] and the references therein). In this article, we remain in the context of the usual eigenvectors. Our interest in avoiding generalized eigenvalues is that one of the possible applications is spectral clustering, along the lines of [5]. This method uses several eigenvectors to build a template for each vertex and forms the clusters from these templates. The eigenvectors are orthogonal to each other, but the generalized eigenvectors are not, so templates formed with the latter would be less effective. In other words, we would need larger templates with generalized eigenvectors than with usual eigenvectors.
An example of our work related to spectral clustering is [6].
A similar problem is studied in [7], where a Laplacian that incorporates weights both at the vertices and at the edges is defined. There, the vertex weights are integrated multiplicatively, directly as matrix factors. The novelty of our approach is that our weights (called ρ) derive from a potential p that is integrated additively on the diagonal. The potential p that must be introduced to obtain the desired weights is specified in Theorem 3. This is significant because it allows us to overcome a technical difficulty of the cited work, obtaining error bounds (Theorem 4) as a function of the potential, not of the weights. An application of this method of deriving weights from a potential is our contribution (Chapter 2.4) to the collective work [8]. In that work, we used the Laplacian matrix with potentials to perform spectral graph partitioning for process placement on heterogeneous computing platforms.
In the next section, we review standard notation and concepts about graphs and their matrix representation. Section 3 describes known facts about spectral partitioning using the Laplacian matrix. Section 4 contains our contribution: we define Laplacian matrices with potential (p-Laplacian matrices) and show that certain matrices of this type can be used to partition the set of vertices into parts of similar total weight while minimizing the edge cut (Theorem 3). It also contains a Cheeger-style bound on the difference between the true value of the minimum partition and the approximate value obtained using the Fiedler eigenvector of the p-Laplacian matrix (Theorem 4). The value considered in this result is the ratio cut of the partition, instead of the total cut. Our purpose was to show that it is possible to give bounds analogous to those that appear in the literature, but adapted to this approach of vertex weights independent of edge weights. We emphasize that this bound is an analog of Mohar's, extended to the spectral analysis of vertex-and-edge-weighted graphs.
2. Graphs and Transfer Matrices
A graph consists of a set V of vertices and a set E of subsets of two vertices (the edges). That is, . For , the edge , also denoted , is said to go between u and v. In this study, we used this definition of a graph, which models neither edge directions nor loops.
A weight on edges is a map
The weight of the edge  is denoted . If a weight on the edges is not specified, the constant unit weight is implicitly assumed (that is,  for each ).
A weight on vertices is a map
The set of these maps, that is, the set of all vertex weights, is denoted . Clearly, it is a vector space.
To represent graphs using matrices, we chose an ordering of the set of vertices, . The adjacency matrix of G (for this conventional ordering) is the matrix of values:
We represent a vertex weight as the vector of its values . The adjacency matrix operates in , the set of vertex weights, as a right matrix product (postmultiplication).
This is the transference (or shift) of the vertex weight by the graph G.
The postmultiplication of s, as a row vector, by A is usual in the matrix analysis of finite Markov chains [9]. The shift can be seen as the following action: each edge  takes the content of , that is, , and transports it to the vertex , modifying its amount by the transference factor . The sum of the values transferred to  is , with a gain  from each adjacent vertex . So, the vertex weight is:
In the literature, interest is focused on symmetric matrices (because they model undirected graphs), so there is no difference between pre- and postmultiplication by A. Note also that the main diagonal is zero because the graphs do not have loops. In Section 4, we introduce potentials, which can be viewed as a method of using the diagonal entries to carry vertex weights independently of the edge weights.
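The shift described above can be illustrated with a small sketch, assuming numpy (the example graph and all names are ours, not from the text):

```python
import numpy as np

# Illustrative example (ours): a path graph 0-1-2-3 with unit edge weights.
edges = [(0, 1), (1, 2), (2, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0  # symmetric: the graph is undirected, no loops

# A vertex weight s, written as a row vector; the shift is s @ A.
s = np.array([1.0, 2.0, 3.0, 4.0])
shifted = s @ A
# Entry v of s @ A sums the contents transported from the neighbors of v;
# e.g., vertex 1 receives s[0] + s[2] = 4.0.
print(shifted)  # [2. 4. 6. 3.]
```

Because A is symmetric here, premultiplication would give the same result; the row-vector convention matters only for directed generalizations.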
3. Laplacian and Partitions
This section summarizes standard notions about graphs, partitions, and the Laplacian operator, which can be found in [3] or [10], although the notation is adapted to our particular purposes, and Lemma 3 is proved in a novel way, without using summations.
A partition of a set V is a pair of subsets of V such that  and . A partition of a graph is a partition of the underlying set V of vertices. An edge  in G is cut by a partition if  and , or vice versa. If the graph is weighted, the total cut is
Any subset defines a cut, , which is called the cut of U.
Among the several partitions of a graph, one with a minimum number of cut edges (or minimum total cut, if weighted) is usually preferred. The cut of the graph is:
In this study, we were interested in partitions with minimal cuts but with a balanced number of vertices, that is, for an even number of vertices or for any number of vertices (a bipartition). The bipartition width [11] is:
We also used the cut ratio of U (the quotient ) and the isoperimetric number:
We expressed, using linear algebra, the combinatorial problem of finding the partitions that realize these minima.
Let us suppose we are given an ordering in the set of vertices. A vector has an entry for each . The characteristic vector of a set is with:
Sometimes it is preferable to use other values than 0 or 1 in the vector expression of a combinatorial object like a subset or partition [4]. For two real values , the -indicator vector of a partition is the vector with
For example, the (0,1)-indicator is the characteristic vector of the second set of the partition. We mainly use (1,−1)-indicators. We denote , which is the standard scalar product in . In this way, a matrix A has an associated bilinear form . The vector  has the value 1 in each component. The degree vector is , where .
Lemma 1.
For characteristic vectors :
- (i)
- . Also .
- (ii)
- .
- (iii)
- If A is the adjacency matrix of a graph, the vector has, in the i-th entry, the degree . That is, . Also .
- (iv)
- has, in the i-th entry, the number of edges to  from vertices in S.
Proof.
They are straightforward. □
Lemma 2.
Let  and  be the characteristic vectors of the sets of a partition . Then:
Proof.
By Lemma 1 (iv),  contains in its i-th entry the total weight of the edges between  and . Hence,  is the sum of the weights of the edges in the set , defined as
that is, cut. □
Calling  the diagonal matrix with the degree vector g on the diagonal and zeros off the diagonal, we have . If x is the (1,−1)-indicator of any partition, we also have , because the minus signs appear in pairs.
Defining the Laplacian as , we have:
Lemma 3.
If x is the (1,−1)-indicator of ,
Proof.
If x is the (1,−1)-indicator of , with and characteristic of and , respectively, and:
Besides, as , that is, , then . Likewise, , hence
So,
That is, . Solving for the cut, we have . As , and , we can express . Therefore,
□
We have deduced this well-known identity in matrix form instead of the usual summation form. In this way, we not only avoid index chasing but also make explicit the role of the values used in the indicators. For example, if x is a -indicator of , then cut. In general [4], for -indicators, the cut is . This deduction also clarifies the role of the diagonal degree matrix .
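The identity of Lemma 3, cut(S) = ¼ xᵀLx for a (1,−1)-indicator x, can be checked exhaustively on a small example. This is an illustrative sketch (the graph and all names are ours), assuming numpy:

```python
import numpy as np
from itertools import combinations

# Small example graph (ours): a 5-cycle with one chord.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = np.diag(A.sum(axis=1)) - A  # Laplacian L = D - A

def cut_direct(S):
    """Count edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))

def cut_quadratic(S):
    """Lemma 3: cut(S) = (1/4) x^T L x for the (1,-1)-indicator x of S."""
    x = np.array([1.0 if i in S else -1.0 for i in range(n)])
    return 0.25 * x @ L @ x

# Verify the identity on every subset of vertices:
for r in range(n + 1):
    for S in combinations(range(n), r):
        assert np.isclose(cut_direct(set(S)), cut_quadratic(set(S)))
```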
In addition to the expression of the cost as a bilinear form L, we expressed the requirement that the partition be balanced as . So, the problem of finding the bipartition of minimal cost is the following problem of combinatorial optimization:
$$\min\;\tfrac{1}{4}\,x^{\mathsf T}L\,x \quad\text{subject to}\quad x\in\{-1,1\}^{V},\;\langle x,\mathbf{1}\rangle=0.$$
This combinatorial problem is NP-complete [12]. To approximate a solution with polynomial computational cost, it is customary to relax the constraints . This relaxed problem is a numerical one with several features that ease its resolution: L is symmetric, hence its eigenvalues are real and there is an orthonormal basis of eigenvectors [13]. Besides,  is an eigenvector of eigenvalue 0, because . Additionally, L is weakly diagonally dominant with a positive diagonal; hence, by the Geršgorin disc theorem [14], its eigenvalues are nonnegative . If G is connected,  [3]. These features of L are generally deduced from its expression as a sum of squares, which we have avoided; the diagonal dominance argument we used instead is also easy to see.
The Rayleigh quotient of a symmetric matrix M is
defined for  in . It plays a role in the following Courant–Weyl min–max theorem [13], which we use without proof:
Theorem 1.
Let  be the set of subspaces of  of dimension at most k, for , and let M be symmetric with eigenvalues  and corresponding eigenvectors . Then
Besides, the argument E giving the minimum is , and an argument x giving the maximum is .
In particular, , because each span an . In the case that , the eigenvector is (a scalar multiple of) , as commented above, and the others are orthogonal to it: . In the case ,
because each x with span a . The minimum is reached in .
To relate this result to the cut value of partitions, note that if x is an indicator vector of a partition (), the cut is proportional to the Rayleigh quotient. The minimum is reached at a vector  that is an eigenvector for . That is,  is a solution to the relaxed problem, although it may not be an indicator vector. The first non-null eigenvalue  is termed the Fiedler value, and its eigenvector  is the Fiedler vector.
There are several rounding or truncation methods to obtain an indicator vector (i.e., with integer values for ) from  (whose entries are not necessarily integers). The most direct rounding is the partition by sign:  if , and  otherwise. This rounding can give a partition that is not a bipartition. Another rounding method uses the median, which always yields bipartitions: if m is the median value of the entries of , then  if  and  otherwise.
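Both roundings can be sketched on a small example: two triangles joined by a bridge edge, whose minimum balanced cut is the single bridge. The graph and names are ours; a sketch assuming numpy, not the authors' implementation:

```python
import numpy as np

# Two triangles joined by one bridge edge (our example).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = np.diag(A.sum(axis=1)) - A

vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
fiedler = vecs[:, 1]            # eigenvector of the first nonzero eigenvalue

# Sign rounding: side +1 where the entry is positive (may be unbalanced).
sign_part = np.where(fiedler > 0, 1, -1)

# Median rounding: always a bipartition (n/2 vertices on each side).
m = np.median(fiedler)
median_part = np.where(fiedler > m, 1, -1)

# Here both roundings separate the two triangles (up to a global sign flip).
assert len(set(sign_part[:3].tolist())) == 1 and sign_part[0] != sign_part[3]
```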
For these rounding methods and others that appear in the literature [15], there is an error bound. The error is the difference between the partition obtained by the rounding and the partition that actually minimizes the cut, which is obtainable by combinatorial methods. These bounds are known as discrete Cheeger bounds since they involve the first nonzero eigenvalue. In particular, we use the following Cheeger bound developed by Mohar [16]. It compares the cut ratio of the partition induced by the sign rounding, , and the isoperimetric number .
Theorem 2.
If G has more than three vertices and its maximal degree is Δ, then
This bound uses the sign rounding and the cut ratio, , instead of the median rounding and the edge cut, , which we used to describe our framework. However, we chose it because it makes it easier to present the generalization to vertex-weighted graphs in the next section. The cut ratio has also been studied in the stochastic setting [17]. In principle, the similar bounds that exist in the literature for median rounding and edge cut [18] can also be generalized to vertex-weighted graphs.
4. Laplacians with Potential
To motivate our contribution, we recall here the usual interpretation, for instance in [19], of the values of a weight s on vertices in terms of flows on graphs. The vertex weight  corresponds to the amount or magnitude of some physical substance placed at . The weights at the edges between different vertices correspond, in this interpretation, to a transmission factor or gain that affects the substance when it flows from one vertex  to another , increasing or decreasing its amount. Following this interpretation of the weight matrix as a transference or shift, the weights on the loop edges are the gain applied to the substance that stays at the same vertex .
In this diffusion process interpretation, the eigenvectors of a shift matrix are the stationary substance distributions. In particular, the Laplacian matrix has a stationary distribution that is uniform (corresponding to the null eigenvalue) because the degree values on the diagonal make the total gain of substance equal to zero. In this sense, the Laplacian process is conservative. Another example is the shift by the adjacency matrix of a connected graph, which has a positive stationary distribution (the random walk limit distribution, corresponding to the Perron eigenvector [3]), with a substance gain given by the Perron eigenvalue.
Following this vein, we show how to control the weights in the diagonal to obtain any positive as a stationary distribution, the eigenvector of a matrix similar to the Laplacian.
As commented, a weight on vertices is a function . Its diagonal form is the matrix . The generalized Laplacian with potential p (or p-Laplacian) is:
That is, . In particular, with , the 0-Laplacian is the ordinary Laplacian. Some properties of p-Laplacians are similar to those of ordinary Laplacians, as can be seen in [20] under the name of generalized Laplacians. For our purposes, we highlight the following ones, whose proofs require some technicalities:
Lemma 4.
If the graph G is connected and the potential p verifies
then:
- (a)
- The eigenvalues of are real, and the minimum eigenvalue has multiplicity 1. That is, .
- (b)
- There is a positive eigenvector corresponding to , unique up to a scalar multiple.
Proof.
If , then  is a symmetric Z-matrix [21]. As G is connected,  is irreducible. By Observation 1.4.3 of [21], the claims follow. □
The minimum eigenvalue is the Perron eigenvalue . It has multiplicity 1, and there is an eigenvector of the Perron eigenvalue with all positive entries. To fix one such eigenvector, we define the Perron eigenvector as that with .
The min–max theorem for the operator gives us that , and the minimum is reached in an eigenvector of of norm 1 (the Fiedler vector ).
With these properties, we can replicate the spectral partition methodology, because the spectral decomposition of  assures that . As with the ordinary Laplacian L above, this can be understood as follows: the positive and negative values of the Fiedler vector give us an indicator of two sets of vertices. This indicator cuts V into two parts of equal absolute sum of Perron values. That is, if  and , , then
However, in this case, the Perron vector is not the constant distribution , but a positive distribution . We use the distribution  as a measure of the relative importance of the vertices in a partition or clustering.
Note that the Perron eigenvector is defined up to a constant factor, in the sense that , for , is also a positive eigenvector of the same eigenvalue. Our choice of is conventional.
Note also that, as in the ordinary case commented in Section 3, the Fiedler vector does not necessarily have components and should be considered as an approximation to the optimal partition.
To build a potential p such that the Perron distribution of  is a given, predefined , we apply the formula of the following theorem. Note that it is scale-invariant, in the sense that  and , for , produce the same potential p.
Remember that, for a vector x, we denote its i-th component as , and a function  is identified with the vector . We can build a potential p such that the Perron vector of  has predetermined positive values :
Theorem 3.
For any vector ρ such that for , if
where  is the adjacency matrix, then the Perron vector of  is ρ, and the Perron value is 0.
Proof.
As p verifies the hypothesis of Lemma 4, has a Perron eigenvalue. Note that for each , . Also
Therefore
So, . To conclude that 0 and  are the Perron eigenvalue and eigenvector, we use the fact that, in a symmetric matrix, eigenvectors of different eigenvalues are orthogonal. Consequently, only one eigenvalue can have an associated positive eigenvector such as , and this eigenvalue is 0. □
With this result, we can perform spectral partitioning with preassigned weights on the vertices. By the above discussion, the p-Laplacian for the potential p corresponding to the given  has a Fiedler vector orthogonal to the Perron vector (that is, it produces a partition into parts of equal total weight at the vertices). The value of the cut is also well expressed by the p-Laplacian, as the following lemma shows. For a vertex set , we denote .
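Theorem 3's displayed formula is elided in this copy, but the construction can be recovered from the requirement in the proof that the p-Laplacian annihilate ρ, i.e., (L + diag(p))ρ = 0, which determines p entrywise. A numerical sketch under that reading (graph, target ρ, and all names are ours), assuming numpy:

```python
import numpy as np

# Example graph (ours): a 4-cycle. Target Perron distribution rho > 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
d = A.sum(axis=1)
rho = np.array([1.0, 2.0, 2.0, 1.0])  # desired vertex weights

# Solve (L + diag(p)) rho = 0 entrywise for the potential p:
p = (A @ rho) / rho - d
Lp = np.diag(d) - A + np.diag(p)      # p-Laplacian L_p = L + diag(p)

assert np.allclose(Lp @ rho, 0)                      # rho has eigenvalue 0
assert np.isclose(np.linalg.eigvalsh(Lp).min(), 0)   # 0 is the minimum (Perron) eigenvalue
# Scale invariance: c * rho produces the same potential p.
assert np.allclose(p, (A @ (3 * rho)) / (3 * rho) - d)
```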
Lemma 5.
If x is the (1,−1)-indicator of ,
Proof.
As ,
By Lemma 3, , and , hence the claim. □
Therefore, finding a partition (i.e., an indicator x) that minimizes  is equivalent to minimizing , because  is a constant once the potential p is given.
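Lemma 5's identity is elided in this copy; from the surrounding discussion we read it as xᵀL_p x = 4·cut(S) + p(V), with p(V) = Σᵢ pᵢ (so that p(V) is the constant mentioned above). Under that reading, the identity can be checked exhaustively on a small example of ours, assuming numpy:

```python
import numpy as np
from itertools import combinations

# Example graph and potential (ours): a 4-cycle with a chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = np.diag(A.sum(axis=1)) - A
p = np.array([0.5, -0.25, 1.0, 0.0])  # an arbitrary potential
Lp = L + np.diag(p)

for r in range(n + 1):
    for S in combinations(range(n), r):
        x = np.array([1.0 if i in S else -1.0 for i in range(n)])
        cut = sum(1 for u, v in edges if (u in S) != (v in S))
        # Since x_i^2 = 1, x^T diag(p) x = p(V) for every partition:
        assert np.isclose(x @ Lp @ x, 4 * cut + p.sum())
```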
For a predefined weight ρ on the vertices, we build the potential p by Theorem 3. The p-Laplacian specifies the vertex ρ-weighted bipartition problem:
$$\min\;\tfrac{1}{4}\bigl(x^{\mathsf T}L_{p}\,x - p(V)\bigr) \quad\text{subject to}\quad x\in\{-1,1\}^{V},\;\langle x,\rho\rangle=0.$$
This combinatorial problem is at least as hard as the conventional bipartition problem, which is NP-complete and is contained in it as a special case. In any case, we consider the relaxed problem (without the restriction ), which has as its solution, by the min–max Theorem 1, the Fiedler vector of .
By taking this Fiedler vector as an approximation to the combinatorial solution, that is, to the unrelaxed problem, the error can be bounded with a Cheeger expression similar to that of Mohar (Theorem 2). To express this bound, in the following Theorem 4, we define the cut ratio of U with respect to p as
and the isoperimetric number with respect to p as
Given a vector , , we order the set of its values as . That is, for each i in  there is exactly one j in  with . The level sets of x are, for each , . The sweep cut is the minimum cut over the level sets of x, that is:
It is clear that, for any x, . We also define , the increment of squared values along x, as
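The sweep cut can be computed by scanning the level sets in order of the entries of x. A sketch of ours (names and example are not from the text), assuming numpy and distinct entries in x:

```python
import numpy as np

def sweep_cut(x, edges):
    """Minimum cut over the level sets of x: prefixes of the vertices
    sorted by x-value (a sketch; assumes the entries of x are distinct)."""
    order = np.argsort(x)
    best = float("inf")
    for k in range(1, len(x)):  # nonempty proper level sets
        S = set(order[:k].tolist())
        c = sum(1 for u, v in edges if (u in S) != (v in S))
        best = min(best, c)
    return best

# Two triangles joined by a bridge (our example); a Fiedler-like vector
# separates them, and the sweep finds the single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
x = np.array([-1.0, -0.9, -0.5, 0.5, 0.9, 1.0])
print(sweep_cut(x, edges))  # 1
```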
Lemma 6.
Let L be the ordinary Laplacian. For any x:
- (a)
- (b)
Proof.
For (a), consider the level sets of x, that is, the vertices sorted so that , where  are the different values of x. For each level set , by the definition of the sweep cut, we have:
Besides
extending with a value to ease the notation.
Combining (1) and (2):
The last equality is because, for each pair of vertices , if and with , the double summation includes a chain of terms that collapses to .
For (b), we consider vectors , that is, indexed by the edges
If we apply the Cauchy–Schwarz inequality to the particular vectors of and , as , we have:
For the first root factor, it is known that being L the ordinary Laplacian (see for example [15]). For the second root factor, by the trivial fact that we have:
□
To prove the following claim, we apply the above lemma to , the Fiedler eigenvector of .
Theorem 4.
Being , , and ϕ the Fiedler eigenvalue and eigenvector of , then:
Proof.
For the first inequality, calling the (−1,1)-indicator of the minimal bipartition , that is, such that , by Lemma 5 we have:
that is, . Besides, for each x, . In particular, , hence  and:
For the second inequality, since  is a particular cut (the sweep cut of ), the minimal cut is less than or equal to it.
Finally, for the third inequality, we use the inequalities of Lemma 6 for the particular case , that is:
- (a)
- (b)
From (a), we have , which by (b) is less than or equal to . The values of these terms are  and . Additionally, as  and , we have
Substituting those values in the inequalities:
The last inequality is because, by definition of and , for each , we have , and also . □
This result is similar to Theorem 2 above, in this case bounding the cost of the minimal cut and the sweep cut of the Fiedler eigenvector of .
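The whole pipeline of this section can be sketched end to end: build the potential from a prescribed ρ, take the Fiedler vector of the p-Laplacian, and round it. The graph, weights, and names are ours; this is an illustrative sketch under the readings above, not the authors' implementation:

```python
import numpy as np

# Two triangles joined by a bridge; prescribed vertex importances rho.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
d = A.sum(axis=1)
rho = np.array([3.0, 1.0, 1.0, 1.0, 1.0, 3.0])

p = (A @ rho) / rho - d           # potential from rho (as in Theorem 3)
Lp = np.diag(d) - A + np.diag(p)  # p-Laplacian

vals, vecs = np.linalg.eigh(Lp)
phi = vecs[:, 1]                  # Fiedler vector of L_p
part = np.where(phi > 0, 1, -1)   # sign rounding

assert np.isclose(vals[0], 0)     # Perron value 0, as Theorem 3 states
# phi is orthogonal to the Perron vector rho: the relaxed solution
# balances total rho-weight rather than the number of vertices.
assert np.isclose(phi @ rho, 0)
```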
5. Conclusions
We introduced a spectral partitioning method for arbitrary (positive) vertex weights, unrelated to edge weights or vertex degrees. We also provided a Cheeger-type bound for the error incurred by taking the numerical eigenvector solution as an approximation for the combinatorial problem. This bound is similar to others that appear in the literature for cuts with uniform vertex weights.
In this work, we did not take into account a distribution of weights on the edges. The usual methods of considering ordinary Laplacians with edge weights (for example, [3]) can be extended to the generalized Laplacians we have described (that is, including a vertex potential that induces vertex weighting in the spectral partitioning). The two types of weights, on vertices and on edges, are independent. We omitted edge weights for simplicity; they will be the content of future work.
Author Contributions
Conceptualization, J.-L.G.-Z. and C.G.; Writing—original draft, J.-L.G.-Z.; Writing—review and editing, C.G. All authors have read and agreed to the published version of the manuscript.
Funding
This work was jointly supported by the European Regional Development Fund “A way to achieve Europe” (ERDF) and the Extremadura Local Government (Ref. IB20040) and by the Spanish Ministerio de Ciencia e Innovación through project PID2019-110315RB-I00 (APRISA).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Schaeffer, S.E. Survey: Graph Clustering. Comput. Sci. Rev. 2007, 1, 27–64. [Google Scholar] [CrossRef]
- Shang, Y. Generalized K-Core Percolation in Networks with Community Structure. SIAM J. Appl. Math. 2020, 80, 1272–1289. [Google Scholar] [CrossRef]
- Chung, F.R. Spectral Graph Theory; CBMS; American Mathematical Soc.: Providence, RI, USA, 1997; Volume 92. [Google Scholar]
- Shewchuk, J.R. Allow Me to Introduce Spectral and Isoperimetric Graph Partitioning; Technical Report; University of California Berkeley: Berkeley, CA, USA, 2016. [Google Scholar]
- Ng, A.Y.; Jordan, M.I.; Weiss, Y. On spectral clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems 14; (NIPS 2001); MIT press: Cambridge, MA, USA, 2002; pp. 849–856. [Google Scholar]
- Martins, A.; Grácio, C.; Teixeira, C.; Rodrigues, I.P.; Zapata, J.L.; Ferreira, L. Historia Augusta Authorship: An Approach based on Measurements of Complex Networks. Appl. Netw. Sci. 2021, 6, 50. [Google Scholar] [CrossRef]
- Xu, S.; Fang, J.; Li, X. Weighted Laplacian Method and Its Theoretical Applications. In IOP Conference Series: Materials Science and Engineering 2020; IOP Publishing: Bristol, UK, 2020; Volume 768, p. 072032. [Google Scholar]
- Da Costa, G.; Lastovetsky, A.; Barbosa, J.; Díaz-Martín, J.C. Chapter 2: “Programming models and runtimes”. In Ultrascale Computing Systems; Carretero, J., Jeannot, E., Zomaya, A., Eds.; Institution of Engineering and Technology: London, UK, 2019. [Google Scholar]
- Berman, A.; Plemmons, R.J. Nonnegative Matrices in the Mathematical Sciences; SIAM: Philadelphia, PA, USA, 1994. [Google Scholar]
- Cvetkovic, D.M.; Rowlinson, P.; Simic, S. An Introduction to the Theory of Graph Spectra; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
- Beineke, L.W.; Wilson, R.J.; Cameron, P.J. (Eds.) Topics in Algebraic Graph Theory; Encyclopedia of Mathematics and Its Applications; Cambridge University Press: Cambridge, UK, 2004; Volume 102. [Google Scholar]
- Papadimitriou, C.H.; Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity; Dover Publications: Mineola, NY, USA, 1998. [Google Scholar]
- Lancaster, P.; Tismenetsky, M. The Theory of Matrices: With Applications; Computer Science and Scientific Computing Series; Academic Press: Cambridge, MA, USA, 1985. [Google Scholar]
- Horn, R.A.; Johnson, C.R. Matrix Analysis, 2nd ed.; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
- Spielman, D.A.; Teng, S.H. Spectral partitioning works: Planar graphs and finite element meshes. Linear Algebra Appl. 2007, 421, 284–305. [Google Scholar] [CrossRef] [Green Version]
- Mohar, B. Isoperimetric numbers of graphs. J. Comb. Theory Ser. B 1989, 47, 274–291. [Google Scholar] [CrossRef]
- Shang, Y. Isoperimetric Numbers of Randomly Perturbed Intersection Graphs. Symmetry 2019, 11, 452. [Google Scholar] [CrossRef] [Green Version]
- Boppana, R.B. Eigenvalues and graph bisection: An average-case analysis. In Proceedings of the 28th Annual Symposium on Foundations of Computer Science (sfcs 1987), Los Angeles, CA, USA, 12–14 October 1987; IEEE: Piscataway, NJ, USA, 1987; pp. 280–285. [Google Scholar]
- Chung, F.R. Laplacians of graphs and Cheeger’s inequalities. Comb. Paul Erdos Eighty 1996, 2, 13–20. [Google Scholar]
- Bıyıkoglu, T.; Leydold, J.; Stadler, P.F. Laplacian Eigenvectors of Graphs; Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 2007; Volume 1915. [Google Scholar]
- Molitierno, J.J. Applications of Combinatorial Matrix Theory to Laplacian Matrices of Graphs; CRC Press: Boca Raton, FL, USA, 2012. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).