Article

A Restart Local Search for Solving Diversified Top-k Weight Clique Search Problem

Information Science and Technology, Northeast Normal University, Changchun 130117, China
*
Author to whom correspondence should be addressed.
Mathematics 2021, 9(21), 2674; https://doi.org/10.3390/math9212674
Submission received: 24 September 2021 / Revised: 17 October 2021 / Accepted: 19 October 2021 / Published: 21 October 2021
(This article belongs to the Special Issue Graphs, Metrics and Models)

Abstract

The diversified top-k weight clique (DTKWC) search problem is an important generalization of the diversified top-k clique (DTKC) search problem with practical applications. It aims to find k maximal cliques that together cover the maximum total weight in a vertex-weighted graph. In this work, we propose a novel local search algorithm called TOPKWCLQ for the DTKWC search problem, which relies on two main strategies. First, a restart strategy repeats the construction and updating processes of the maximal weight clique set. Second, a scoring heuristic assigns different priorities to the maximal weight cliques in the candidate set. In addition, a constraint model of the DTKWC search problem is constructed so that solution quality can be evaluated. Experimental results show that TOPKWCLQ outperforms the comparison algorithm on large-scale real-world graphs.

1. Introduction

Given an undirected graph $G = (V, E)$, a clique is a subset of the vertices of $G$ in which any two vertices are adjacent. The maximum clique (MC) is a clique with the largest cardinality in $G$. The maximum weight clique (MWC) problem is a generalization of MC in which each vertex is assigned a positive integer as its weight. The diversified top-k clique (DTKC) search problem aims to find a set of at most k maximal cliques that covers as many vertices as possible, where k is a user-supplied parameter. The diversified top-k weight clique (DTKWC) search problem [1] seeks a set of at most k maximal weight cliques in $G$ whose covered vertices have the largest total weight; it can be readily verified to be NP-hard [2].
MC and related problems have many real-world applications, such as combinatorial auctions [3], community detection [4,5] and video object segmentation [6]. Recently, considerable attention has also been paid to solving top-k problems on large graphs [2,7,8]. Problems of this kind arise in practical applications such as influential communities [9] and motif discovery in molecular biology [10]. For example, a citation network can be represented as a graph $G$ in which papers are the vertices and citation relationships are the edges between papers. The influence of a paper is viewed as its weight in $G$. Searching for the top-k maximal diverse groups from different domains in $G$ can then be regarded as finding a DTKWC solution.
To solve the MC and MWC problems on large-scale graphs effectively, several methods have been proposed. These algorithms are usually divided into two categories. The first consists of exact algorithms, which guarantee the optimality of their solutions, such as [11,12,13]. However, exact algorithms may fail to solve an instance within a reasonable time as the graph grows larger. The second consists of local search algorithms, which aim to find a suboptimal solution within a reasonable time on medium-sized or even larger graphs. A large amount of effort has been devoted to designing different local search algorithms; for example, there are many local search algorithms for MWC, e.g., [14,15]. Although many algorithms exist for the MC and MWC problems, there are currently very few methods for finding diversified top-k cohesive groups. Yuan et al. [7,8] (2015, 2016) proposed the concept of DTKC and provided an approximate algorithm for it. Wu et al. [2] (2020) provided a local search algorithm for the DTKC search problem on large graphs, which is the state-of-the-art algorithm for that problem. Wu and Yin [16] (2021) introduced a problem of finding cohesive groups, named the DTKSP problem, and developed a local search method for it based on new heuristic strategies. Zhou et al. [1] (2021) encoded the DTKWC search problem into the weighted partial MaxSAT (WPMS) problem, using direct encoding (DE) and independent set partition based encoding (ISPE), and solved the resulting WPMS instances with state-of-the-art solvers. However, this method is of limited use on large real-world graphs, because it fails to encode large graphs into WPMS.
In this work, we propose a local search algorithm for the DTKWC search problem on large graphs. It provides a locally optimal solution within a reasonable time and avoids generating and storing all maximal weight cliques, which addresses the limitation described above. The algorithm, named TOPKWCLQ (which stands for top-k weight cliques), is based on two main strategies.
The first strategy is a restart method that deals with the cycling problem. After initializing the set of maximal weight cliques, TOPKWCLQ repeatedly creates a new maximal weight clique and updates the set through a scoring function. When the candidate solution has not been improved for a fixed number of steps, the algorithm restarts, retaining the current best candidate solution.
The second strategy is a scoring function, which gives different priorities to the maximal weight cliques in the candidate solution. During the search, TOPKWCLQ constructs and then maintains a candidate solution of size at most k by adding or removing maximal weight cliques according to their scores. The score of a maximal weight clique is the total weight of the vertices that only this clique covers in the candidate solution.
To date, there is no suitable algorithm to compare against for the DTKWC search problem on large-scale real-world graphs. Thus, we compare TOPKWCLQ with the commercial solver CPLEX, using the constraint formulation proposed in this paper. Extensive experiments demonstrate that the proposed algorithm achieves both high effectiveness and efficiency on large-scale real-world graphs.
The remainder of the paper is organized as follows. In Section 2, we present the necessary background on the diversified top-k weight clique search problem and formalize the DTKC and DTKWC search problems. In Section 3, we describe the TOPKWCLQ algorithm and the techniques it implements. In Section 4, we report extensive experimental results that demonstrate the high performance of TOPKWCLQ compared with CPLEX using our model on the DTKWC search problem. Finally, conclusions are given in Section 5.

2. Diversified Top-k Weight Clique Search Problem

In this section, we introduce the notation and basic definitions used for the DTKWC search problem. Then the proof of NP-hardness of the DTKWC search problem is given. Next, we propose the constraint formulations that serve as the mathematical models given to the CPLEX solver for the DTKC and DTKWC search problems, respectively.

2.1. Definition and Notations

A weighted graph $G = (V, E, w)$ is a graph with $|V|$ vertices and $|E|$ edges, where $w$ is a weight function that assigns to each vertex $v_i \in V$ a non-negative integer $w(v_i)$ as its weight. The notation $v_i^{w(v_i)}$ indicates that vertex $v_i$ has weight $w(v_i)$.
Definition 1 (Maximal clique (MC)). Given an unweighted graph $G(V, E)$, a clique $c$ in $G$ is a set of vertices such that for any $u \in c$, $v \in c$ ($u \neq v$), we have $(u, v) \in E$. A clique $c$ in $G$ is called a maximal clique if there exists no clique $c'$ in $G$ such that $c \subset c'$.
Definition 2 (Maximal weight clique (MWC)). Given a weighted graph $G(V, E, w)$, a weight clique $c$ in $G$ is a set of vertices such that for any $u \in c$, $v \in c$ ($u \neq v$), we have $(u, v) \in E$, and the weight of $c$ is $\omega(c) = \sum_{v_i \in c} w(v_i)$. A weight clique $c$ in $G$ is called a maximal weight clique if there exists no clique $c'$ in $G$ such that $\omega(c) < \omega(c')$.
Given a set of maximal cliques $C = \{c_1, c_2, \ldots\}$, the coverage of $C$, denoted by $cov(C)$, is the set of vertices covered by $C$, i.e., $cov(C) = \bigcup_{c_i \in C} c_i$.
Definition 3 (Diversified top-k clique (DTKC)). Given an unweighted graph $G(V, E)$ and an integer $k$, the problem of diversified top-k clique search is to compute a set $C$ such that each $c \in C$ is a maximal clique, $|C| \le k$, and $|cov(C)|$ is maximized. $C$ is called the diversified top-k cliques.
Given a set of maximal (weighted) cliques $C = \{c_1, c_2, \ldots\}$, the private vertices of a maximal (weighted) clique $c$ in $C$, denoted by $priv(c, C)$, are the vertices of $c$ not contained in any other clique in $C$, i.e., $priv(c, C) = c \setminus cov(C \setminus \{c\})$. The weight of $C$, denoted by $W(C)$, is the total weight of the vertices in $G$ covered by the cliques in $C$:
$$W(C) = \sum_{v_i \in \bigcup_{c_j \in C} c_j} w(v_i) \tag{1}$$
For the DTKC search problem, $w(v_i) = 1$ for all $i \in [1, |V|]$. The overlapping of $C$, denoted by $overlap(C)$, is the set of vertices that are covered by the maximal cliques in $C$ more than once.
Definition 4 (Diversified top-k weight clique (DTKWC)). Given a weighted graph $G(V, E, w)$ and an integer $k$, the problem of diversified top-k weight clique search is to compute a set $C$ such that each $c \in C$ is a maximal weight clique, $|C| \le k$, and $W(C)$ is maximized. $C$ is called the diversified top-k weight cliques.
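The set operations behind these definitions can be sketched in a few lines of Python. This is only an illustration: the function names, the representation of a clique as a Python set of vertex indices, and the toy data are assumptions made here, not part of the paper.

```python
from typing import Dict, Iterable, List, Set

def cov(cliques: Iterable[Set[int]]) -> Set[int]:
    """cov(C): every vertex that appears in at least one clique."""
    covered: Set[int] = set()
    for c in cliques:
        covered |= c
    return covered

def priv(c: Set[int], cliques: List[Set[int]]) -> Set[int]:
    """priv(c, C): vertices of c that no other clique in C contains.

    `c` must be one of the objects stored in `cliques`.
    """
    others = [d for d in cliques if d is not c]
    return c - cov(others)

def total_weight(cliques: List[Set[int]], w: Dict[int, int]) -> int:
    """W(C): each covered vertex contributes its weight exactly once."""
    return sum(w[v] for v in cov(cliques))

def overlap(cliques: List[Set[int]]) -> Set[int]:
    """overlap(C): vertices covered by more than one clique."""
    seen: Set[int] = set()
    repeated: Set[int] = set()
    for c in cliques:
        repeated |= seen & c
        seen |= c
    return repeated

# Hypothetical toy data: a triangle {0, 1, 2} and an edge {2, 3}.
w = {0: 2, 1: 1, 2: 4, 3: 3}
C = [{0, 1, 2}, {2, 3}]
print(cov(C), priv(C[0], C), total_weight(C, w), overlap(C))
```

Vertex 2 belongs to both cliques, so it is counted once in the total weight and is the only overlapping vertex.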

2.2. Constraint Formulation for DTKWC Search Problem

The DTKWC search problem is a generalization of the DTKC search problem [2]. It aims to find, among all possible maximal clique sets of a given graph, a set of at most k maximal cliques with maximum total weight and low overlapping. Hence, we first give the formulation of the DTKC search problem and then extend it to the DTKWC search problem. The DTKC search problem can be formulated as a mixed integer linear program (MILP) as follows:
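$$\text{OBJ1}: \text{Maximize } W_1(G) = \sum_{i \in [1, |V|]} X_i \tag{2}$$
$$\text{OBJ2}: \text{Minimize } W_2(G) = \frac{\sum_{h=1}^{k} \sum_{i \in [1, |V|]} x_{ih} - \sum_{i \in [1, |V|]} X_i}{k - 1} \tag{3}$$
Subject to:
$$x_{ih} + x_{jh} \le 1, \quad \forall (i, j) \in \overline{E},\ 1 \le h \le k \tag{4}$$
$$X_i \le \sum_{i \in [1, |V|]} x_{ih}, \quad \forall i \in [1, |V|],\ 1 \le h \le k \tag{5}$$
$$x_{ih} \in \{0, 1\}, \quad \forall i \in [1, |V|],\ 1 \le h \le k \tag{6}$$
$$X_i \in \{0, 1\}, \quad \forall i \in [1, |V|] \tag{7}$$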
where $x_{ih}$ is the binary variable associated with vertex $i$, such that $x_{ih} = 1$ if vertex $v_i$ is in the $h$-th maximal clique and $x_{ih} = 0$ otherwise. $X_i$ is also a binary variable associated with vertex $i$: $X_i = 1$ if vertex $i$ is in some maximal clique and $X_i = 0$ otherwise. Constraint (4) guarantees that there is an edge between every two vertices in a clique. Constraint (5) means that if there is a clique including vertex $v_i$, then $X_i = 1$. Constraints (6) and (7) give the ranges of the variables.
According to the formulas above, we give the MILP of the DTKWC search problem below:
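$$\text{OBJ1}: \text{Maximize } W_1(G) = \sum_{i \in [1, |V|]} (X_i \cdot w_i) \tag{8}$$
$$\text{OBJ2}: \text{Minimize } W_2(G) = \frac{\sum_{h=1}^{k} \sum_{i \in [1, |V|]} (x_{ih} \cdot w_i) - \sum_{i \in [1, |V|]} (X_i \cdot w_i)}{k - 1} \tag{9}$$
Subject to:
$$x_{ih} + x_{jh} \le 1, \quad \forall (i, j) \in \overline{E},\ 1 \le h \le k \tag{10}$$
$$X_i \le \sum_{i \in V} x_{ih}, \quad \forall i \in [1, |V|],\ 1 \le h \le k \tag{11}$$
$$x_{ih} \in \{0, 1\}, \quad \forall i \in [1, |V|],\ 1 \le h \le k \tag{12}$$
$$X_i \in \{0, 1\}, \quad \forall i \in [1, |V|] \tag{13}$$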
Similarly, $x_{ih}$ and $X_i$ are the binary variables corresponding to vertex $i$: $x_{ih} = 1$ if vertex $i$ appears in the $h$-th maximal clique and $x_{ih} = 0$ otherwise; $X_i = 1$ if vertex $v_i$ belongs to any maximal clique of $C$ and $X_i = 0$ otherwise. $w_i$ denotes the weight of vertex $i$. Constraints (10)–(13) have the same purposes as constraints (4)–(7), respectively.
In the above formulations, both problems aim to minimize the value of objective OBJ2 on the basis of maximizing the value of objective OBJ1. Thus, we can obtain the optimal solution of the DTKC (DTKWC) search problem.
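To make the two objectives concrete, the following Python sketch evaluates a given candidate set of cliques against the weighted model. It is not a MILP solver and does not use CPLEX. It assumes that OBJ2 is read as the total weight summed over all clique memberships minus the covered weight, divided by $k - 1$ (so $k \ge 2$ is required); the function name `evaluate` and the toy graph are hypothetical.

```python
from itertools import combinations

def evaluate(cliques, weights, edges, k):
    """Return (feasible, obj1, obj2) for a candidate list of vertex sets."""
    adj = {frozenset(e) for e in edges}
    # Clique constraint: two vertices in the same set must be adjacent,
    # which is equivalent to x_ih + x_jh <= 1 for every non-edge (i, j).
    feasible = len(cliques) <= k and all(
        frozenset((u, v)) in adj
        for c in cliques
        for u, v in combinations(sorted(c), 2)
    )
    covered = set().union(*cliques)
    # OBJ1: weight of covered vertices, each counted once.
    obj1 = sum(weights[v] for v in covered)
    # OBJ2 (assumed reading): extra weight caused by repeated coverage.
    memberships = sum(weights[v] for c in cliques for v in c)
    obj2 = (memberships - obj1) / (k - 1)
    return feasible, obj1, obj2

# Hypothetical toy graph: triangle 0-1-2 plus the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
weights = {0: 2, 1: 1, 2: 4, 3: 3}
print(evaluate([{0, 1, 2}, {2, 3}], weights, edges, k=2))
```

Note that the check only verifies that each set is a clique; maximality is not tested here.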

3. TOPKWCLQ: A Local Search Method for the DTKWC Search Problem

In this section, we outline the framework of our algorithm. We use a restart strategy that interleaves the construction and updating of the maximal weight clique set to enhance the quality of the candidate solution.
The restart procedure of the local search abandons the previous trajectory and explores different maximal weight clique sets. We construct these sets by combining the maximal weight cliques built from different starting vertices at each iteration. Thus, with this restart strategy, TOPKWCLQ can improve the quality of the current candidate solution step by step.
At each restart iteration, we construct new maximal weight cliques one at a time and use the scoring function to eliminate an existing maximal weight clique from the current candidate solution, until no further improvement is found within the allowed number of updating steps or the time limit is reached. This bounds the search time of a single iteration so that the algorithm can restart as soon as possible. After the updating procedure, the current candidate solution is a locally optimal solution, and the algorithm compares it with the best candidate solution kept from previous iterations and keeps the better of the two. Finally, when the time limit runs out, the algorithm returns a maximal weight clique set as the best solution.
In the following, we will provide a random restart local search algorithm for the DTKWC search problem, called TOPKWCLQ.

3.1. Maximal Weight Clique Scoring Function

Before describing the algorithm framework, we first address a core issue in TOPKWCLQ: how to evaluate the priority of each maximal weight clique. During the search process, TOPKWCLQ must maintain a maximal weight clique set of size at most k as a candidate solution of the DTKWC search problem. Therefore, it is important to balance solution quality and efficiency when determining which maximal weight clique should be included in or eliminated from the current candidate solution. For this reason, we define for each maximal weight clique a scoring function, used during the updating process, based on the total weight of its private vertices as introduced in Section 2.
Definition 5 (Score function ($score(c)$)). Given a weighted graph $G = (V, E, w)$, a maximal weight clique set $C$ and a maximal weight clique $c_i$ of $G$ ($c_i \in C$), we use $score(c_i)$ to denote the benefit of $c_i$ after a maximal weight clique has been added to the set $C$. The score of $c_i$ in $C$ is defined as
$$score(c_i) = \sum_{v_j \in priv(c_i, C)} w(v_j) \tag{14}$$
The maximal weight clique selection method used in the UpdateSolution procedure is based on this scoring function. After a new maximal weight clique is added to the candidate solution $C$, the score of every maximal weight clique in $C$ is computed, and the one with the smallest $score$ value is eliminated.
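One way to compute all scores in a single pass is to count how many cliques cover each vertex; a vertex is private to a clique exactly when its count is 1. The sketch below takes this counting approach as an implementation choice made here for illustration (the paper does not state how scores are computed), and the data are hypothetical.

```python
from collections import Counter

def scores(cliques, w):
    """score(c) for every clique: total weight of the vertices only c covers.

    Assumes each clique is a set, so a vertex is counted once per clique.
    """
    cover_count = Counter(v for c in cliques for v in c)
    return [sum(w[v] for v in c if cover_count[v] == 1) for c in cliques]

# Hypothetical data: three cliques chained through shared vertices 2 and 3.
w = {0: 5, 1: 1, 2: 2, 3: 4, 4: 1}
cliques = [{0, 1, 2}, {2, 3}, {3, 4}]
result = scores(cliques, w)
evicted = cliques[result.index(min(result))]  # clique with the smallest score
print(result, evicted)
```

The middle clique has no private vertex, so its score is zero and it is the one that would be eliminated.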

3.2. TOPKWCLQ Algorithm: The Top-Level Algorithm

The proposed TOPKWCLQ algorithm (see the flowchart in Figure 1) combines an initialization procedure aiming to generate a feasible initial solution and a local search procedure aiming at improving the initial solution. The top level of TOPKWCLQ is outlined in Algorithm 1, as described below.
Algorithm 1 TOPKWCLQ(G, k, cutoff)
1: Input: a weighted graph G(V, E, w), an integer k, cutoff time
2: Output: a set C* containing at most k maximal weight cliques
3: m ← m_0, C* ← ∅; /* m_0 is a parameter used in BMS strategy */
4: while (elapsed time < cutoff) do
5:   RemainingSet ← V;
6:   /* m_max is another parameter used in BMS strategy */
7:   if (m < m_max) then
8:     m ← 2m;
9:   else
10:     m_0 ← m_0 + 1; m ← m_0;
11:   end if
12:   C ← InitKCliques(G, m, RemainingSet);
13:   if (cov(C) = V) then
14:     return C;
15:   end if
16:   C ← LocalSearch(G, C, m, RemainingSet);
17:   if (W(C*) < W(C)) then
18:     C* ← C;
19:   end if
20: end while
21: return C*;
First, we introduce the basic framework of our algorithm, which is presented in Algorithm 1. The current best global solution $C^*$ is initialized as an empty set (line 3). TOPKWCLQ then loops until the elapsed time reaches $cutoff$ (line 4). The parameters $m_0$ and $m_{max}$ of the best from multiple selection (BMS) strategy, which is used in [2] to solve the DTKC search problem, are given before this loop, and the value of $m$ is updated inside the loop (lines 7–11). TOPKWCLQ then calls the function $InitKCliques$ to construct enough maximal weight cliques to form an initial solution (line 12). After the initialization procedure, if $cov(C)$ equals $V$, TOPKWCLQ returns $C$ as the solution (lines 13–15); otherwise, it updates the current candidate solution $C$ using the $LocalSearch$ method (line 16). If the total weight of the vertices covered by $C^*$ is smaller than that of $C$, that is, $W(C^*) < W(C)$, then $C^*$ is replaced with $C$ (lines 17–19). When the elapsed time exceeds the cutoff time, TOPKWCLQ stops searching and returns $C^*$ (line 21).
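The way Algorithm 1 varies the BMS parameter $m$ across restarts can be sketched as a small Python generator. The starting values used below ($m_0 = 2$, $m_{max} = 16$) are purely illustrative assumptions and are not the tuned values of the paper.

```python
from itertools import islice

def bms_schedule(m0, m_max):
    """Yield the value of m used at each restart of the outer loop."""
    m = m0
    while True:
        if m < m_max:
            m = 2 * m          # keep doubling while below m_max
        else:
            m0 += 1            # otherwise raise the base value and reset m
            m = m0
        yield m

# Illustrative parameters only (assumed, not taken from the paper).
print(list(islice(bms_schedule(2, 16), 8)))
```

Because $m$ is doubled before the comparison with $m_{max}$ is made again, a value of $m$ can temporarily exceed $m_{max}$ before the reset takes place.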
In the following subsections, the technical details of the TOPKWCLQ algorithm are introduced. The function that creates a maximal weight clique from a random vertex is introduced in Section 3.3. In Section 3.4, the initialization procedure is presented. Section 3.5 presents the local search updating procedure of our algorithm.

3.3. Constructing a Maximal Weight Clique with Diversity

At each stage of our algorithm, we constantly need to find different maximal weight cliques to add to the candidate solution. Therefore, we design a method called $GetClique$, which uses the vertices in $RemainingSet$ to construct maximal weight cliques according to the properties of the DTKWC search problem. Let $CandSet$ denote the set of vertices that are adjacent to all vertices already in $c$. We also define a function $b[v]$, used during the initialization procedure, to represent the benefit of a vertex $v$:
$$b[v] = \sum_{u \in N(v) \cap CandSet} w(u). \tag{15}$$
Algorithm 2 shows the pseudo-code of $GetClique$. First, $c$ is initialized as an empty set. $RemainingSet$ includes all vertices in $V$ except those in the current candidate solution. If $RemainingSet$ is empty (line 2), the algorithm returns the empty $c$, which means that no further maximal weight clique can be created. Otherwise, $GetClique$ randomly selects a vertex $v$ from $RemainingSet$ and adds it to $c$. Then, the algorithm adds all neighbours of $v$ to $CandSet$ (lines 5–6). While $CandSet$ is not empty, $GetClique$ extends the current partial clique, using the BMS strategy (proposed in [17]) to select a good next vertex (lines 7–18). If the cardinality of $CandSet$ is smaller than the parameter $m$, the algorithm picks the vertex $v$ from $CandSet$ with the greatest $\hat{b}$, breaking ties in favour of the older one; otherwise, $GetClique$ selects the vertex with the biggest benefit among $m$ vertices randomly drawn from $CandSet$. In this way, a good vertex is obtained by evaluating at most $m$ vertices. $CandSet$ is then updated for selecting the next vertex of the maximal weight clique (line 21).
Algorithm 2 GetClique(G, m, RemainingSet)
1: c ← ∅;
2: if (RemainingSet = ∅) then
3:   return c;
4: end if
5: v ← randomly select a vertex from RemainingSet;
6: c ← {v}, CandSet ← {u | u ∈ N(v)};
7: while (CandSet ≠ ∅) do
8:   if (|CandSet| < m) then
9:     pick the vertex v from CandSet with the greatest b̂, breaking ties in favour of the older one;
10:   else
11:     v ← randomly select a vertex from CandSet;
12:     for (iter := 1 to m − 1) do
13:       v′ ← randomly select a vertex from CandSet;
14:       if (b̂[v′] > b̂[v]) then
15:         v ← v′, b̂[v] ← b̂[v′];
16:       end if
17:     end for
18:   end if
19:   c ← c ∪ {v};
20:   remove v from RemainingSet;
21:   CandSet ← CandSet ∩ N(v);
22: end while
23: return c
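The clique construction above can be sketched in Python as follows. Several details are assumptions made for this illustration: the graph is an adjacency dictionary of sets, the benefit $\hat{b}$ is computed with the formula given for $b[v]$, the start vertex is also removed from `remaining`, ties are broken by whichever vertex `max` meets first rather than by age, and a `random.Random` instance is passed in for reproducibility.

```python
import random

def get_clique(adj, w, m, remaining, rng):
    """Grow one maximal clique from a random start vertex in `remaining`."""
    if not remaining:
        return set()
    v = rng.choice(sorted(remaining))
    remaining.discard(v)            # assumption: the start vertex is used up too
    clique = {v}
    cand = set(adj[v])              # vertices adjacent to every clique member

    def benefit(u):
        # Total weight of u's neighbours that are still candidates.
        return sum(w[x] for x in adj[u] & cand)

    while cand:
        pool = sorted(cand)
        if len(pool) < m:
            v = max(pool, key=benefit)                   # scan all candidates
        else:
            picks = [rng.choice(pool) for _ in range(m)]
            v = max(picks, key=benefit)                  # best of m random picks
        clique.add(v)
        remaining.discard(v)
        cand &= adj[v]              # keep only common neighbours; drops v itself
    return clique

# Hypothetical toy graph: triangle 0-1-2 plus the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
w = {0: 2, 1: 1, 2: 4, 3: 3}
print(get_clique(adj, w, 2, set(adj), random.Random(1)))
```

Sampling only $m$ candidates bounds the cost of each extension step, at the price of sometimes choosing a vertex that is not the best available one.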

3.4. The Initialization Procedure

In this subsection, we explain the initialization procedure, which is outlined in Algorithm 3. It is the first stage of our algorithm. At the beginning of this procedure, the current candidate solution $C$ is set to empty. Because the DTKWC search problem asks for a set of at most k maximal weight cliques, this method creates maximal weight cliques randomly through $GetClique$, introduced in the previous subsection. An empty result from $GetClique$ is not added to $C$, and the procedure continues until every vertex is covered by $C$ or k maximal weight cliques have been created. The starting vertices are drawn randomly from the set of vertices that have never been used as a starting vertex. By repeating this random process, we obtain diversified maximal weight cliques that do not depend on information acquired during previous rounds. Moreover, starting the construction from an unvisited vertex helps to overcome the cycling problem, i.e., revisiting the same solution within a short time in the local search algorithm.
Algorithm 3 InitKCliques(G, m, RemainingSet)
1: C ← ∅;
2: while (|C| < k) do
3:   if (cov(C) = V) then
4:     return C;
5:   end if
6:   /* create a maximal weight clique with BMS strategies */
7:   c ← GetClique(G, m, RemainingSet);
8:   if (c ≠ ∅) then
9:     C ← C ∪ {c};
10:   end if
11: end while
12: return C

3.5. Local Search Updating

The candidate solution produced by the initialization procedure meets the requirements of the DTKWC search problem, but there is no guarantee that it is of high quality. Therefore, in this subsection, we design a local search method that improves the quality of this solution by exploring as many new maximal weight cliques as possible (line 4).
The local search method in Algorithm 4 explores different combinations of maximal weight cliques starting from a candidate solution, iteratively looking for a combination $C$ whose covered vertices have a greater total weight. To this end, we add a new maximal weight clique $c$ created by $GetClique$ to the current candidate solution $C$, which gives $C'$ (line 8). We then compute the $score$ function described in Section 3.1 for each of the $k + 1$ maximal weight cliques in $C'$ (line 9). Given these $k + 1$ values of $score$, we delete the maximal weight clique with the smallest value from $C'$, so that a solution of $k$ maximal weight cliques is maintained as the new candidate solution (line 11).
Algorithm 4 LocalSearch(G, C, m, RemainingSet)
1: step ← 0, C′ ← ∅;
2: while (elapsed time < cutoff) do
3:   step ← step + 1;
4:   c ← GetClique(G, m, RemainingSet);
5:   if (c = ∅) then
6:     break;
7:   end if
8:   C′ ← C ∪ {c};
9:   Compute score(c′) of each maximal weight clique c′ in C′;
10:   c_min ← arg min_{c′ ∈ C′} {score(c′)};
11:   Remove c_min from C′, breaking ties in favour of the smaller one;
12:   if (cov(C′) > cov(C)) then
13:     C ← C′, step ← 0;
14:   end if
15:   /* fs is the third parameter of TOPKWCLQ */
16:   if (step ≥ fs) then
17:     break;
18:   end if
19: end while
20: return C
However, we observed that a candidate solution sometimes cannot be improved by the normal local search method for a long time. For this reason, we add a fixed step limit, denoted by $fs$, to our local search framework, which breaks the loop if the current solution has not been improved in $fs$ steps.
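The update loop with the step limit can be sketched in Python as below. This is a simplified illustration under stated assumptions: new cliques come from a caller-supplied `next_clique` function (standing in for GetClique), the time cutoff is omitted, and a move is accepted when the total covered weight $W$ increases, which is one possible reading of the comparison in line 12 of Algorithm 4.

```python
def local_search(C, w, next_clique, fs):
    """Swap cliques in and out of C until fs steps pass without improvement."""
    def weight(cliques):
        return sum(w[v] for v in set().union(*cliques))

    def score(c, cliques):
        others = set().union(*(d for d in cliques if d is not c))
        return sum(w[v] for v in c - others)

    step = 0
    while step < fs:
        step += 1
        c = next_clique()
        if not c:                       # no further clique can be built
            break
        trial = C + [c]
        # Evict the clique with the smallest score; ties go to the smaller one.
        worst = min(trial, key=lambda d: (score(d, trial), len(d)))
        trial = [d for d in trial if d is not worst]
        if weight(trial) > weight(C):   # assumed acceptance test on W
            C, step = trial, 0
    return C

# Hypothetical data: one useful new clique, then the generator runs dry.
w = {0: 1, 1: 1, 2: 2, 3: 5, 4: 5}
stream = iter([{3, 4}, set()])
best = local_search([{0, 1}, {1, 2}], w, lambda: next(stream), fs=3)
print(best)
```

Resetting `step` to zero after every improvement means the limit $fs$ counts only consecutive non-improving steps.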

3.6. An Example of DTKWC Search Problem

 Example 1. 
Let us illustrate how to explore a solution for the DTKWC search problem by using a sample weighted graph in Figure 2.
Figure 2a gives a weighted graph $G(V, E, w)$ with ten vertices, where $v_i^{w_i}$ denotes vertex $v_i$ with $w_i = w(v_i)$. Assume the integer parameter $k = 2$. The best clique weight so far is $\omega(C_{max}) = 13$. During the first phase, InitKCliques creates a maximal weight clique set $C_{max} = \{c_1, c_2\}$, with $c_1 = \{v_0^1, v_1^2, v_2^1, v_3^2\}$ and $c_2 = \{v_3^2, v_4^3, v_5^3\}$. The total weight of $C_{max}$ is 12. In the second phase, LocalSearch tries to determine a new maximal weight clique $c_3$ by GetClique. Suppose $c_3 = \{v_5^3, v_7^3\}$. We add $c_3$ to $C_{max}$, so that $C_{max}$ now contains 3 maximal weight cliques. We evaluate the quality of these three maximal weight cliques with the proposed scoring function. $\{v_0^1, v_1^2, v_2^1\}$, $\{v_4^3\}$ and $\{v_7^3\}$ are the private vertex sets of $c_1$, $c_2$ and $c_3$, respectively, so $score(c_1) = 4$, $score(c_2) = 3$ and $score(c_3) = 3$. After this process, we remove the worst maximal weight clique, $c_2$ or $c_3$, to keep the size of the maximal weight clique set equal to 2. Observe that $\{c_1, c_2\}$ and $\{c_1, c_3\}$ are both solutions of the DTKWC search problem in this graph with $k = 2$, and $\{c_1, c_3\}$ is the best solution, with the lowest overlapping.
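The arithmetic of Example 1 can be checked with a short Python script. The vertex weights and the three cliques are copied from the example; the helper names are assumptions introduced only for this check, and the edges of Figure 2 are not needed because only the clique memberships enter the computation.

```python
# Vertex weights and cliques copied from Example 1 (vertex index -> weight).
w = {0: 1, 1: 2, 2: 1, 3: 2, 4: 3, 5: 3, 7: 3}
c1, c2, c3 = {0, 1, 2, 3}, {3, 4, 5}, {5, 7}

def covered(cliques):
    return set().union(*cliques)

def weight(cliques):
    return sum(w[v] for v in covered(cliques))

def priv(c, cliques):
    return c - covered([d for d in cliques if d is not c])

def score(c, cliques):
    return sum(w[v] for v in priv(c, cliques))

def overlap(cliques):
    return {v for v in covered(cliques) if sum(v in c for c in cliques) > 1}

trial = [c1, c2, c3]                     # candidate set after adding c3
example_scores = [score(c, trial) for c in trial]
print(example_scores)
for kept in ([c1, c2], [c1, c3]):        # the two possible sets after eviction
    print(weight(kept), sorted(overlap(kept)))
```

Both remaining pairs cover a total weight of 12, but only the pair $\{c_1, c_2\}$ shares a vertex ($v_3$), which is why $\{c_1, c_3\}$ is preferred.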

4. Experimental Evaluation

In this section, we carry out extensive experiments to evaluate the performance of TOPKWCLQ on large weighted real-world graphs. To our knowledge, there is no suitable heuristic or exact algorithm in the literature for the DTKWC search problem on large real-world graphs. We therefore compare the results of the proposed algorithm with those obtained by CPLEX, a commercial solver for many combinatorial optimization problems, given the constraint formulation of our mathematical model. The results obtained by CPLEX can thus be used as a reference for solution quality. We first describe the weighted benchmark and then present the experimental preliminaries and the parameter settings.

4.1. The Benchmark

We evaluate the TOPKWCLQ algorithm on a benchmark of weighted real-world graphs, described below.
The weighted real-world large graph benchmark used in our experiments originates from the Network Data Repository online [18] (http://www.graphrepository.com/networks.php, accessed on 1 August 2021). Many of the real-world graphs used in our experiments have millions of vertices and tens of millions of edges. The benchmark (102 instances) was transformed from unweighted graphs into weighted graphs using the weighting function $w(v_i) = (i \bmod 200) + 1$ [19]. Moreover, most of these graphs have been used as experimental instances for the maximum vertex weight clique problem [6,20,21], the coloring problem [22], the maximum k-plex problem [23] and the DTKC search problem [2]. Considering the relationship between the DTKWC search problem and these problems, these real-world graphs can naturally be used to evaluate the performance of our algorithm for the DTKWC search problem. The weighted graphs were downloaded from the author’s website (http://lcs.ios.ac.cn/~caisw/Resource/weighted-massive-graphs.zip, accessed on 1 August 2021).
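The weighting function is simple enough to state directly in code. In this sketch the function name is hypothetical, and whether vertex indices start at 0 or 1 in the benchmark files is not stated here, so the index is just taken as given.

```python
def vertex_weight(i: int) -> int:
    """Weight of vertex i under w(v_i) = (i mod 200) + 1."""
    return (i % 200) + 1

# Weights always fall in the range 1..200 and repeat every 200 indices.
print([vertex_weight(i) for i in (1, 199, 200, 201)])
```

Every weight is therefore a positive integer no larger than 200, regardless of the size of the graph.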
The graphs in our experiments are divided into 11 classes: biological networks, collaboration networks, interaction networks, infrastructure networks, recommendation networks, retweet networks, scientific computing, social networks, Facebook networks, technological networks, and web graphs.

4.2. Experimental Preliminaries and Parameter Tuning

The proposed algorithm TOPKWCLQ was implemented in C++, compiled with the “-O3” flag, and run on CentOS on a machine with a 2.4 GHz CPU and 32 GB of RAM. We ran TOPKWCLQ 10 times independently on each instance, with random seeds from 1 to 10. Each run terminates when it reaches the time limit, which is set to 600 s in this paper. CPLEX terminates either when its lower and upper bounds converge or when it reaches a time limit of 3600 s. We use the solution values of CPLEX to evaluate the quality of the solutions found by TOPKWCLQ.
For each real-world large graph used in our experiments, we set the parameter k to 10, 20, 30, 40, and 50 to obtain five DTKWC search problem instances. Hence, there were 102 × 5 = 510 DTKWC search problem instances in our experiments.
TOPKWCLQ uses three parameters for which well-working values must be found: $m_0$ and $m_{max}$ are the minimum and maximum values of the BMS strategy, respectively, and $fs$ is the maximum number of updating steps allowed per iteration. Parameters $m_0$ and $m_{max}$ are used in the BMS strategy inspired by [17]. The values of these parameters, chosen according to a preliminary tuning experiment, are given in Table 1.
The next subsection evaluates TOPKWCLQ against the lower bound (“$LB$”) and the upper bound (“$UB$”) of CPLEX on all 510 DTKWC search problem instances.

4.3. Experimental Results

We present comprehensive experimental results on the benchmark instances described in Section 4.1 for five values of the parameter k in Tables 2–6, which correspond to $k = 10, 20, 30, 40, 50$, respectively.
For each instance, the column “Instance” gives its name. For TOPKWCLQ, we report the maximum weight ($w_b$) and the average weight ($w_a$) obtained over 10 runs on each DTKWC search problem instance, together with the average run time over 10 runs ($Time$, in seconds) needed to reach the maximum weight. A “0” in the time column indicates that TOPKWCLQ obtained the best solution in less than 0.01 s. To study the effectiveness of TOPKWCLQ on the DTKWC search problem, we compare it with the CPLEX solver (version 12.9) using the mathematical model (8)–(13) introduced in Section 2.2. The best lower bound ($LB$) and upper bound ($UB$) found by CPLEX are listed in the CPLEX columns. If CPLEX was unable to find a bound on an instance, the corresponding entry is marked by “-”. If CPLEX was unable to load the model, the entry is marked by “N/A”. In each column, a bold value indicates that the algorithm obtained the same or a better objective value than the comparison algorithm.
Tables 2–6 show that TOPKWCLQ obtained the same or better objective values than CPLEX on most instances. On 5 out of 102 graphs, CPLEX can find better objective values than TOPKWCLQ. For example, on the graph rt-twitter-copen, CPLEX always finds better objective values than TOPKWCLQ except on the instance with $k = 50$. However, the instances become more challenging for CPLEX as k grows, and TOPKWCLQ becomes comparatively more effective. For 19 out of 510 real-world instances, the solutions are proved optimal, since the lower bound (“$LB$”) and upper bound (“$UB$”) in the CPLEX column are equal. In terms of computational time, TOPKWCLQ can obtain the optimal values in less than one second in most cases, and within hundreds of seconds at most. For another 84 out of 102 larger graphs, TOPKWCLQ can also obtain good objective values where CPLEX failed.
Based on the benchmark introduced in Section 4.1, Table 7 summarizes the computational results of CPLEX and TOPKWCLQ on the 102 real-world graphs. From Table 7, we observe that, for almost all instances under the five values of the parameter k, TOPKWCLQ obtains solutions better than the lower bound of CPLEX. This indicates the superiority of the proposed algorithm TOPKWCLQ.
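The comparison logic behind such a summary can be sketched as follows. This is only an illustration under stated assumptions: the function `summarize` and its counter names are hypothetical, the four sample rows are copied from Table 2 (k = 10), and the text does not spell out exactly how Table 7 treats instances without a CPLEX value, so the sketch counts them separately rather than guessing.

```python
from typing import Dict, Optional, Tuple

# (LB, UB, w_b); LB and UB are None when CPLEX reports no value ("N/A" or "-").
Row = Tuple[Optional[int], Optional[int], int]


def summarize(rows: Dict[str, Row]) -> Dict[str, int]:
    """Count comparison outcomes between the CPLEX lower bound and w_b."""
    counts = {
        "cplex_better": 0,
        "topkwclq_better": 0,
        "tie": 0,
        "cplex_missing": 0,
        "proved_optimal": 0,
    }
    for lb, ub, w_b in rows.values():
        if lb is None:
            counts["cplex_missing"] += 1   # no CPLEX objective to compare with
            continue
        if ub is not None and lb == ub:
            counts["proved_optimal"] += 1  # matching bounds certify the optimum
        if lb > w_b:
            counts["cplex_better"] += 1
        elif w_b > lb:
            counts["topkwclq_better"] += 1
        else:
            counts["tie"] += 1
    return counts


# Four rows taken from Table 2 (k = 10).
rows = {
    "bio-celegans": (6867, 6867, 6852),
    "bio-diseasome": (8672, 8672, 8672),
    "ia-fb-messages": (5738, 7441, 5792),
    "bio-dmela": (None, None, 6094),
}
print(summarize(rows))
```

On these four rows, each comparison outcome occurs once, one instance has no CPLEX value, and two instances have matching bounds.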

5. Conclusions

In this paper, we study the diversified top-k weight clique search problem and formalize the DTKC and DTKWC search problems. A scoring strategy is proposed to find diversified maximal weight cliques. Based on this scoring strategy and a random restart strategy, we then propose a local search algorithm for the DTKWC search problem, called TOPKWCLQ, which interleaves the construction and the updating of the maximal weight clique set. Experiments on the real-world benchmark show the effectiveness and efficiency of our algorithm. In future work, we plan to investigate whether the enhanced configuration checking strategy used in [2] can further improve the performance of the algorithm.
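To make the objective concrete, the following minimal sketch evaluates the total weight of the vertices covered by a set of at most k cliques and applies one simple update step. It is not the TOPKWCLQ scoring heuristic or restart procedure: the helper `try_update`, its add-or-swap rule and the toy weights are hypothetical stand-ins, and the code assumes the given vertex sets are already cliques of the graph without checking it.

```python
from typing import Dict, FrozenSet, Iterable, List

Clique = FrozenSet[int]


def covered_weight(cliques: Iterable[Clique], weight: Dict[int, int]) -> int:
    """Total weight of the vertices covered by at least one clique."""
    covered = set()
    for clique in cliques:
        covered |= clique
    return sum(weight[v] for v in covered)


def try_update(solution: List[Clique], candidate: Clique, k: int,
               weight: Dict[int, int]) -> List[Clique]:
    """Add the candidate if fewer than k cliques are held; otherwise try
    swapping it for each current clique. Keep the best strictly improving
    option, or return the solution unchanged if nothing improves."""
    if len(solution) < k:
        options = [solution + [candidate]]
    else:
        options = [solution[:i] + [candidate] + solution[i + 1:]
                   for i in range(len(solution))]
    best, best_weight = list(solution), covered_weight(solution, weight)
    for option in options:
        w = covered_weight(option, weight)
        if w > best_weight:
            best, best_weight = option, w
    return best


# Toy vertex weights and cliques (hypothetical, not from Figure 2).
weight = {1: 3, 2: 2, 3: 4, 4: 1, 5: 5}
solution = [frozenset({1, 2}), frozenset({2, 3})]
updated = try_update(solution, frozenset({4, 5}), k=2, weight=weight)
print(covered_weight(solution, weight), covered_weight(updated, weight))  # 9 12
```

Because overlapping vertices are counted only once, swapping in a clique that overlaps less with the rest of the set can raise the covered weight even when that clique is not the heaviest one on its own.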

Author Contributions

Methodology, J.W.; software, J.W.; writing—original draft preparation, J.W.; writing—review and editing, J.W. and M.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fundamental Research Funds for the Central Universities 2412018QD022, NSFC (under Grant No. 61976050, 61972384) and Jilin Provincial Science and Technology Department under Grant No. 20190302109GX.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhou, J.; Li, C.; Zhou, Y.; Li, M.; Liang, L.; Wang, J. Solving diversified top-k weight clique search problem. Sci. China Inf. Sci. 2021, 64, 150105.
  2. Wu, J.; Li, C.; Jiang, L.; Zhou, J.; Yin, M. Local search for diversified top-k clique search problem. Comput. Oper. Res. 2020, 116, 104867.
  3. Zhou, Z.; Xiao, Z.; Deng, W. Improved community structure discovery algorithm based on combined clique percolation method and K-means algorithm. Peer-to-Peer Netw. Appl. 2020, 13, 2224–2233.
  4. Pelofske, E.; Hahn, G.; Djidjev, H. Solving large maximum clique problems on a quantum annealer. In International Workshop on Quantum Technology and Optimization Problems; Springer: Berlin/Heidelberg, Germany, 2019; pp. 123–135.
  5. Chang, L. Efficient maximum clique computation and enumeration over large sparse graphs. VLDB J. 2020, 29, 999–1022.
  6. Jiang, H.; Li, C.; Manyà, F. An Exact Algorithm for the Maximum Weight Clique Problem in Large Graphs. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Singh, S.P., Markovitch, S., Eds.; AAAI Press: Menlo Park, CA, USA, 2017; pp. 830–838.
  7. Yuan, L.; Qin, L.; Lin, X.; Chang, L.; Zhang, W. Diversified top-k clique search. In Proceedings of the 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, Korea, 13–17 April 2015; Gehrke, J., Lehner, W., Shim, K., Cha, S.K., Lohman, G.M., Eds.; IEEE Computer Society: Washington, DC, USA, 2015; pp. 387–398.
  8. Yuan, L.; Qin, L.; Lin, X.; Chang, L.; Zhang, W. Diversified top-k clique search. VLDB J. 2016, 25, 171–196.
  9. Lee, C.; Reid, F.; McDaid, A.; Hurley, N. Detecting highly overlapping community structure by greedy clique expansion. arXiv 2010, arXiv:1002.1827.
  10. Zheng, X.; Liu, T.; Yang, Z.; Wang, J. Large cliques in Arabidopsis gene coexpression network and motif discovery. J. Plant Physiol. 2011, 168, 611–618.
  11. Jiang, H.; Li, C.; Liu, Y.; Manyà, F. A Two-Stage MaxSAT Reasoning Approach for the Maximum Weight Clique Problem. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, LA, USA, 2–7 February 2018; McIlraith, S.A., Weinberger, K.Q., Eds.; AAAI Press: Palo Alto, CA, USA, 2018; pp. 1338–1346.
  12. Li, C.M.; Liu, Y.; Jiang, H.; Manyà, F.; Li, Y. A new upper bound for the maximum weight clique problem. Eur. J. Oper. Res. 2018, 270, 66–77.
  13. Jain, S.; Seshadhri, C. The power of pivoting for exact clique counting. In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; pp. 268–276.
  14. Wang, Y.; Cai, S.; Chen, J.; Yin, M. SCCWalk: An efficient local search algorithm and its improvements for maximum weight clique problem. Artif. Intell. 2020, 280, 103230.
  15. Sevinc, E.; Dokeroglu, T. A novel parallel local search algorithm for the maximum vertex weight clique problem in large graphs. Soft Comput. 2020, 24, 3551–3567.
  16. Wu, J.; Yin, M. Local Search for Diversified Top-k s-plex Search Problem (Student Abstract). In Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, 2–9 February 2021; AAAI Press: New Orleans, LA, USA, 2021; pp. 15929–15930.
  17. Cai, S. Balance between Complexity and Quality: Local Search for Minimum Vertex Cover in Massive Graphs. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, 25–31 July 2015; Yang, Q., Wooldridge, M.J., Eds.; AAAI Press: Palo Alto, CA, USA, 2015; pp. 747–753.
  18. Rossi, R.A.; Ahmed, N.K. The Network Data Repository with Interactive Graph Analytics and Visualization. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; Bonet, B., Koenig, S., Eds.; AAAI Press: Palo Alto, CA, USA, 2015; pp. 4292–4293.
  19. Cai, S.; Lin, J. Fast Solving Maximum Weight Clique Problem in Massive Graphs. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016; Kambhampati, S., Ed.; IJCAI/AAAI Press: Palo Alto, CA, USA, 2016; pp. 568–574.
  20. Fan, Y.; Li, N.; Li, C.; Ma, Z.; Latecki, L.J.; Su, K. Restart and Random Walk in Local Search for Maximum Vertex Weight Cliques with Evaluations in Clustering Aggregation. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, 19–25 August 2017; Sierra, C., Ed.; AAAI Press: Palo Alto, CA, USA, 2017; pp. 622–630.
  21. Nogueira, B.C.S.; Pinheiro, R.G.S. A CPU-GPU local search heuristic for the maximum weight clique problem on massive graphs. Comput. Oper. Res. 2018, 90, 232–248.
  22. Lin, J.; Cai, S.; Luo, C.; Su, K. A Reduction based Method for Coloring Very Large Graphs. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, 19–25 August 2017; Sierra, C., Ed.; AAAI Press: Palo Alto, CA, USA, 2017; pp. 517–523.
  23. Gao, J.; Chen, J.; Yin, M.; Chen, R.; Wang, Y. An Exact Algorithm for Maximum k-Plexes in Massive Graphs. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, Stockholm, Sweden, 13–19 July 2018; Lang, J., Ed.; AAAI Press: Palo Alto, CA, USA, 2018; pp. 1449–1455.
Figure 1. The main flowchart of the proposed TOPKWCLQ algorithm.
Figure 2. A simple example for DTKWC search problem. (a) A weighted graph with 7 vertices. (b) Graph with three maximal weight cliques.
Table 1. Setting of parameters $m_0$, $m_{max}$, $f_s$.
| Parameter | Description | Range | Value |
|---|---|---|---|
| $m_0$ | minimum number of greediness iterations in BMS strategy | 2, 4, 8, 16 | 8 |
| $m_{max}$ | maximum number of greediness iterations in BMS strategy | 32, 64, 128, 256 | 64 |
| $f_s$ | control restart in update procedure | 1000, 2000, 4000, 8000 | 2000 |
Table 2. Experimental results on real-world large graphs with k = 10.
| Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) | Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bio-celegans | 6867 | 6867 | 6852 | 6808.4 | 250.97 | socfb-B-anon | N/A | N/A | 21,273 | 20,893.6 | 198.57 |
| bio-diseasome | 8672 | 8672 | 8672 | 8672 | 0.11 | socfb-Berkeley13 | N/A | N/A | 38,740 | 38,354.7 | 103.04 |
| bio-dmela | N/A | N/A | 6094 | 5994.9 | 137.32 | socfb-CMU | N/A | N/A | 32,888 | 32,488.2 | 143.52 |
| bio-yeast | 4991 | 5020 | 4991 | 4991 | 4.70 | socfb-Duke14 | N/A | N/A | 30,722 | 30,284.8 | 199.64 |
| ca-AstroPh | N/A | N/A | 47,912 | 47,445 | 152.59 | socfb-Indiana | N/A | N/A | 45,026 | 44,249.2 | 155.71 |
| ca-citeseer | N/A | N/A | 76,782 | 76,398 | 244.19 | socfb-MIT | N/A | N/A | 31,207 | 30,894.1 | 196.22 |
| ca-coauthors-dblp | N/A | N/A | 304,932 | 302,045 | 177.41 | socfb-OR | N/A | N/A | 27,601 | 27,282.7 | 186.37 |
| ca-CondMat | N/A | N/A | 20,070 | 19,949.8 | 187.83 | socfb-Penn94 | N/A | N/A | 39,164 | 38,537 | 210.69 |
| ca-CSphd | 4059 | 4059 | 4059 | 4059 | 0.30 | socfb-Stanford3 | N/A | N/A | 38,668 | 38,137.4 | 202.05 |
| ca-dblp-2010 | N/A | N/A | 64,732 | 64,245.7 | 168.86 | socfb-Texas84 | N/A | N/A | 43,443 | 42,722.4 | 153.66 |
| ca-dblp-2012 | N/A | N/A | 64,179 | 64,024.8 | 209.09 | socfb-uci-uni | N/A | N/A | 7092 | 6911.8 | 174.43 |
| ca-Erdos992 | N/A | N/A | 5800 | 5664.5 | 215.51 | socfb-UCLA | N/A | N/A | 42,919 | 42,595 | 133.01 |
| ca-GrQc | N/A | N/A | 25,844 | 25,844 | 0.16 | socfb-UConn | N/A | N/A | 38,261 | 38,075.5 | 214.20 |
| ca-HepPh | N/A | N/A | 79,624 | 79,496.8 | 225.91 | socfb-UCSB37 | N/A | N/A | 42,830 | 42,442.4 | 166.30 |
| ca-hollywood-2009 | N/A | N/A | 913,861 | 908,334.9 | 223.68 | socfb-UF | N/A | N/A | 52,626 | 52,133.5 | 231.33 |
| ca-MathSciNet | N/A | N/A | 24,017 | 23,826 | 156.59 | socfb-UIllinois | N/A | N/A | 46,705 | 46,144.6 | 219.00 |
| ca-netscience | 7588 | 7588 | 7588 | 7588 | 0.16 | socfb-Wisconsin87 | N/A | N/A | 33,594 | 33,384.2 | 183.56 |
| ia-email-EU | N/A | N/A | 8440 | 8236.2 | 239.39 | soc-flickr | N/A | N/A | 26,965 | 26,228.7 | 173.67 |
| ia-email-univ | 9570 | 9718 | 9566 | 9536.2 | 236.42 | soc-flixster | N/A | N/A | 25,505 | 24,409.3 | 158.30 |
| ia-enron-large | N/A | N/A | 16,124 | 15,984 | 194.00 | soc-FourSquare | N/A | N/A | 12,254 | 11,894.3 | 180.01 |
| ia-enron-only | 4331 | 4338 | 4331 | 4331 | 13.55 | soc-gowalla | N/A | N/A | 19,607 | 19,371.3 | 174.70 |
| ia-fb-messages | 5738 | 7441 | 5792 | 5737.7 | 211.06 | soc-karate | 472 | 472 | 472 | 472 | 0.00 |
| ia-infect-dublin | 10,946 | 11,001 | 10,946 | 10,946 | 9.17 | soc-lastfm | N/A | N/A | 13,122 | 12,833.9 | 198.74 |
| ia-infect-hyper | 4446 | 5292 | 4442 | 4416.3 | 160.16 | soc-livejournal | N/A | N/A | 138,283 | 135,583.1 | 145.34 |
| ia-reality | N/A | N/A | 2882 | 2852.8 | 212.69 | soc-LiveMocha | N/A | N/A | 9869 | 9608.8 | 148.29 |
| ia-wiki-Talk | N/A | N/A | 10,976 | 10,754 | 172.35 | soc-orkut | N/A | N/A | 42,484 | 41,987.8 | 145.63 |
| inf-power | N/A | N/A | 6613 | 6613 | 37.32 | soc-pokec | N/A | N/A | 21,995 | 21,094.4 | 130.38 |
| inf-roadNet-CA | N/A | N/A | 6367 | 6296.1 | 141.26 | soc-slashdot | N/A | N/A | 13,801 | 13,224.5 | 169.68 |
| inf-roadNet-PA | N/A | N/A | 6084 | 6082.5 | 148.54 | soc-twitter-follows | N/A | N/A | 5535 | 5346.8 | 174.89 |
| inf-road-usa | N/A | N/A | 6035 | 6002.5 | 136.92 | soc-wiki-Vote | 6376 | 6768 | 6341 | 6253.2 | 133.06 |
| rec-amazon | N/A | N/A | 8931 | 8931 | 1.73 | soc-youtube | N/A | N/A | 12,409 | 12,255 | 191.54 |
| rt-retweet | 1575 | 1578 | 1575 | 1575 | 0.06 | soc-youtube-snap | N/A | N/A | 12,146 | 11,973.7 | 140.26 |
| rt-retweet-crawl | N/A | N/A | 8742 | 8650.5 | 189.03 | tech-as-caida2007 | N/A | N/A | 7240 | 7111 | 196.61 |
| rt-twitter-copen | 4661 | 4661 | 4661 | 4659.5 | 128.18 | tech-as-skitter | N/A | N/A | 29,470 | 28,331.9 | 131.52 |
| sc-ldoor | N/A | N/A | 40,726 | 40,704.3 | 179.99 | tech-internet-as | N/A | N/A | 7457 | 7067.2 | 166.47 |
| sc-msdoor | N/A | N/A | 40,670 | 40,625.9 | 135.87 | tech-p2p-gnutella | N/A | N/A | 5829 | 5816.2 | 190.24 |
| sc-nasasrb | N/A | N/A | 43,776 | 43,714 | 173.28 | tech-RL-caida | N/A | N/A | 12,215 | 11,994.8 | 164.86 |
| sc-pkustk11 | N/A | N/A | 47,548 | 47,268.1 | 153.85 | tech-routers-rf | 3914 | 207,554 | 10,002 | 9871 | 128.18 |
| sc-pkustk13 | N/A | N/A | 57,282 | 56,738 | 185.36 | tech-WHOIS | N/A | N/A | 31,459 | 31,125.1 | 144.75 |
| sc-pwtk | N/A | N/A | 45,504 | 45,432 | 179.58 | web-arabic-2005 | N/A | N/A | 92,108 | 92,108 | 3.98 |
| sc-shipsec1 | N/A | N/A | 31,790 | 31,661.2 | 214.63 | web-BerkStan | N/A | N/A | 11,200 | 11,200 | 71.12 |
| sc-shipsec5 | N/A | N/A | 43,260 | 43,157.4 | 132.02 | web-edu | N/A | N/A | 11,250 | 11,250 | 0.17 |
| soc-BlogCatalog | N/A | N/A | 20,014 | 19,109.1 | 242.59 | web-google | 13,126 | 13,126 | 13,126 | 13,126 | 0.01 |
| soc-brightkite | N/A | N/A | 19,548 | 19,178.3 | 173.03 | web-indochina-2004 | N/A | N/A | 44,052 | 44,052 | 44.42 |
| soc-buzznet | N/A | N/A | 17,620 | 17,041.3 | 199.35 | web-it-2004 | N/A | N/A | 415,850 | 415,850 | 0.63 |
| soc-delicious | N/A | N/A | 11,711 | 11,553.1 | 136.82 | web-polblogs | 5926 | 6428 | 5918 | 5828.2 | 166.71 |
| soc-digg | N/A | N/A | 27,726 | 27,083.9 | 226.17 | web-sk-2005 | N/A | N/A | 63,930 | 63,920.4 | 132.89 |
| soc-dolphins | 1226 | 1226 | 1226 | 1226 | 0.04 | web-spam | N/A | N/A | 14,569 | 14,418.8 | 150.91 |
| soc-douban | N/A | N/A | 8983 | 8860.1 | 207.39 | web-uk-2005 | N/A | N/A | 441,613 | 441,613 | 0.41 |
| soc-epinions | N/A | N/A | 12,942 | 12,661.1 | 139.36 | web-webbase-2001 | N/A | N/A | 20,648 | 20,648 | 68.16 |
| socfb-A-anon | N/A | N/A | 22,551 | 21,896.8 | 176.24 | web-wikipedia2009 | N/A | N/A | 29,781 | 29,063.8 | 162.08 |
Table 3. Experimental results on real-world large graphs with k = 20.
| Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) | Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bio-celegans | 11,489 | 11,747 | 11,275 | 11,198.2 | 147.50 | socfb-B-anon | N/A | N/A | 38,667 | 38,284.5 | 221.32 |
| bio-diseasome | 14,313 | 14,345 | 14,313 | 14,313 | 2.06 | socfb-Berkeley13 | N/A | N/A | 67,518 | 66,896.4 | 158.57 |
| bio-dmela | N/A | N/A | 10,447 | 10,370.9 | 209.09 | socfb-CMU | N/A | N/A | 55,267 | 54,811 | 191.40 |
| bio-yeast | 8874 | 9730 | 9067 | 9060 | 125.32 | socfb-Duke14 | N/A | N/A | 54,844 | 54,091.1 | 137.00 |
| ca-AstroPh | N/A | N/A | 85,366 | 85,173.6 | 185.29 | socfb-Indiana | N/A | N/A | 80,040 | 79,382.1 | 162.68 |
| ca-citeseer | N/A | N/A | 131,947 | 131,244.5 | 148.71 | socfb-MIT | N/A | N/A | 55,086 | 54,421.2 | 156.99 |
| ca-coauthors-dblp | N/A | N/A | 531,666 | 529,145 | 235.05 | socfb-OR | N/A | N/A | 48,737 | 48,162.2 | 216.69 |
| ca-CondMat | N/A | N/A | 35,029 | 34,734 | 249.30 | socfb-Penn94 | N/A | N/A | 70,888 | 70,417.4 | 178.75 |
| ca-CSphd | - | - | 7943 | 7942 | 103.99 | socfb-Stanford3 | N/A | N/A | 66,445 | 65,926.7 | 213.33 |
| ca-dblp-2010 | N/A | N/A | 107,838 | 107,254.7 | 192.43 | socfb-Texas84 | N/A | N/A | 75,522 | 74,283.2 | 156.21 |
| ca-dblp-2012 | N/A | N/A | 102,895 | 102,577.8 | 214.70 | socfb-uci-uni | N/A | N/A | 13,586 | 13,267.4 | 191.29 |
| ca-Erdos992 | N/A | N/A | 9766 | 9580.7 | 146.01 | socfb-UCLA | N/A | N/A | 72,363 | 71,465.5 | 192.25 |
| ca-GrQc | N/A | N/A | 37,559 | 37,559 | 24.67 | socfb-UConn | N/A | N/A | 64,266 | 63,327 | 202.52 |
| ca-HepPh | N/A | N/A | 113,984 | 113,639.5 | 189.33 | socfb-UCSB37 | N/A | N/A | 68,722 | 68,234.3 | 135.47 |
| ca-hollywood-2009 | N/A | N/A | 1,305,029 | 1,293,108 | 185.10 | socfb-UF | N/A | N/A | 90,307 | 89,178.6 | 233.87 |
| ca-MathSciNet | N/A | N/A | 41,461 | 41,215.4 | 137.01 | socfb-UIllinois | N/A | N/A | 79,952 | 79,253.7 | 243.79 |
| ca-netscience | 13,178 | 13,196 | 13,189 | 13,189 | 2.16 | socfb-Wisconsin87 | N/A | N/A | 59,743 | 59,398.8 | 215.07 |
| ia-email-EU | N/A | N/A | 13,440 | 13,232.2 | 151.90 | soc-flickr | N/A | N/A | 44,279 | 43,612.1 | 194.81 |
| ia-email-univ | 11,024 | 18,426 | 15,647 | 15,525.2 | 172.17 | soc-flixster | N/A | N/A | 41,715 | 40,413 | 210.02 |
| ia-enron-large | N/A | N/A | 28,592 | 28,252.6 | 123.99 | soc-FourSquare | N/A | N/A | 20,866 | 20,602.4 | 213.53 |
| ia-enron-only | 6922 | 7237 | 6923 | 6917 | 177.34 | soc-gowalla | N/A | N/A | 35,017 | 34,396 | 185.05 |
| ia-fb-messages | 6204 | 15,510 | 10,305 | 10,207.4 | 189.05 | soc-karate | 629 | 629 | 629 | 629 | 0.00 |
| ia-infect-dublin | 15,009 | 18,694 | 17,588 | 17,527.2 | 215.34 | soc-lastfm | N/A | N/A | 23,753 | 22,918.3 | 113.19 |
| ia-infect-hyper | 5972 | 6554 | 6086 | 6068.2 | 116.70 | soc-livejournal | N/A | N/A | 222,938 | 220,213.4 | 147.43 |
| ia-reality | N/A | N/A | 5291 | 5252.3 | 161.50 | soc-LiveMocha | N/A | N/A | 16,907 | 16,391.4 | 106.99 |
| ia-wiki-Talk | N/A | N/A | 18,644 | 18,315.4 | 252.87 | soc-orkut | N/A | N/A | 79,351 | 78,363.4 | 150.28 |
| inf-power | N/A | N/A | 11,743 | 11,712 | 204.17 | soc-pokec | N/A | N/A | 38,793 | 38,128.1 | 239.65 |
| inf-roadNet-CA | N/A | N/A | 12,341 | 12,292.8 | 179.86 | soc-slashdot | N/A | N/A | 21,071 | 20,868.8 | 217.39 |
| inf-roadNet-PA | N/A | N/A | 12,049 | 12,041.1 | 157.62 | soc-twitter-follows | N/A | N/A | 9722 | 9588.3 | 208.03 |
| inf-road-usa | N/A | N/A | 11,940 | 11,827.3 | 191.68 | soc-wiki-Vote | 10,649 | 12,613 | 10,670 | 10,620.8 | 225.30 |
| rec-amazon | N/A | N/A | 17,414 | 17,414 | 10.02 | soc-youtube | N/A | N/A | 22,173 | 21,901.9 | 199.76 |
| rt-retweet | 2732 | 2758 | 2754 | 2754 | 19.86 | soc-youtube-snap | N/A | N/A | 21,687 | 21,423.5 | 162.58 |
| rt-retweet-crawl | N/A | N/A | 15,283 | 14,983.6 | 163.67 | tech-as-caida2007 | N/A | N/A | 11,632 | 11,485.1 | 114.11 |
| rt-twitter-copen | 8364 | 8525 | 8293 | 8256.7 | 142.45 | tech-as-skitter | N/A | N/A | 47,288 | 44,902.5 | 169.13 |
| sc-ldoor | N/A | N/A | 81,165 | 81,074.7 | 202.88 | tech-internet-as | N/A | N/A | 12,591 | 11,964.9 | 157.32 |
| sc-msdoor | N/A | N/A | 80,703 | 80,651.2 | 178.45 | tech-p2p-gnutella | N/A | N/A | 11,115 | 11,034.6 | 168.48 |
| sc-nasasrb | N/A | N/A | 84,638 | 84,212.2 | 162.43 | tech-RL-caida | N/A | N/A | 21,048 | 20,652.7 | 83.62 |
| sc-pkustk11 | N/A | N/A | 90,376 | 90,205.5 | 204.08 | tech-routers-rf | N/A | N/A | 15,456 | 15,301.5 | 173.66 |
| sc-pkustk13 | N/A | N/A | 108,096 | 107,515.5 | 141.38 | tech-WHOIS | N/A | N/A | 44,189 | 43,782.9 | 188.40 |
| sc-pwtk | N/A | N/A | 89,496 | 89,366 | 204.28 | web-arabic-2005 | N/A | N/A | 178,434 | 178,434 | 3.22 |
| sc-shipsec1 | N/A | N/A | 60,831 | 60,323.5 | 156.03 | web-BerkStan | N/A | N/A | 18,756 | 18,724 | 143.16 |
| sc-shipsec5 | N/A | N/A | 82,422 | 81,832.3 | 133.56 | web-edu | N/A | N/A | 16,531 | 16,498 | 170.07 |
| soc-BlogCatalog | N/A | N/A | 30,479 | 29,811 | 201.55 | web-google | 14,174 | 22,943 | 21,479 | 21,479 | 0.59 |
| soc-brightkite | N/A | N/A | 31,062 | 30,674.7 | 207.00 | web-indochina-2004 | N/A | N/A | 74,320 | 74,236.2 | 218.99 |
| soc-buzznet | N/A | N/A | 28,642 | 28,133.5 | 193.13 | web-it-2004 | N/A | N/A | 797,123 | 797,123 | 1.03 |
| soc-delicious | N/A | N/A | 20,402 | 20,228.4 | 92.03 | web-polblogs | 9808 | 11,906 | 9949 | 9909.4 | 135.08 |
| soc-digg | N/A | N/A | 42,903 | 42,169.9 | 170.04 | web-sk-2005 | N/A | N/A | 97,196 | 97,053 | 192.90 |
| soc-dolphins | 1861 | 1871 | 1861 | 1861 | 6.05 | web-spam | N/A | N/A | 23,982 | 23,625.7 | 175.29 |
| soc-douban | N/A | N/A | 15,043 | 14,670.5 | 198.98 | web-uk-2005 | N/A | N/A | 789,896 | 789,896 | 0.43 |
| soc-epinions | N/A | N/A | 22,001 | 21,671.5 | 247.93 | web-webbase-2001 | N/A | N/A | 33,438 | 33,290.2 | 154.66 |
| socfb-A-anon | N/A | N/A | 39,458 | 39,201.1 | 210.51 | web-wikipedia2009 | N/A | N/A | 48,222 | 47,165.3 | 230.13 |
Table 4. Experimental results on real-world large graphs with k = 30.
| Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) | Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bio-celegans | 14,528 | 16,193 | 14,805 | 14,740 | 203.67 | socfb-B-anon | N/A | N/A | 55,318 | 54,993.8 | 155.81 |
| bio-diseasome | 12,610 | 20,103 | 18,431 | 18,425.5 | 91.62 | socfb-Berkeley13 | N/A | N/A | 91,991 | 90,393.9 | 162.54 |
| bio-dmela | N/A | N/A | 14,497 | 14,384 | 180.80 | socfb-CMU | N/A | N/A | 72,885 | 72,032.9 | 162.68 |
| bio-yeast | - | - | 12,774 | 12,750.3 | 198.29 | socfb-Duke14 | N/A | N/A | 75,543 | 74,516.6 | 149.19 |
| ca-AstroPh | N/A | N/A | 117,328 | 116,221.3 | 241.64 | socfb-Indiana | N/A | N/A | 110,454 | 109,407.1 | 198.30 |
| ca-citeseer | N/A | N/A | 178,715 | 177,726.9 | 209.84 | socfb-MIT | N/A | N/A | 74,158 | 73,595.9 | 182.28 |
| ca-coauthors-dblp | N/A | N/A | 740,505 | 733,645.3 | 135.01 | socfb-OR | N/A | N/A | 67,317 | 66,836.6 | 179.21 |
| ca-CondMat | N/A | N/A | 48,476 | 48,219.6 | 138.60 | socfb-Penn94 | N/A | N/A | 98,255 | 97,120.6 | 165.00 |
| ca-CSphd | N/A | N/A | 11,739 | 11,731 | 150.83 | socfb-Stanford3 | N/A | N/A | 90,442 | 89,784.3 | 151.78 |
| ca-dblp-2010 | N/A | N/A | 146,421 | 145,317.6 | 171.76 | socfb-Texas84 | N/A | N/A | 103,092 | 101,991 | 212.90 |
| ca-dblp-2012 | N/A | N/A | 133,948 | 133,150.3 | 106.77 | socfb-uci-uni | N/A | N/A | 19,436 | 19,111 | 173.62 |
| ca-Erdos992 | N/A | N/A | 13,381 | 13,305.1 | 173.98 | socfb-UCLA | N/A | N/A | 97,506 | 96,350.8 | 206.58 |
| ca-GrQc | N/A | N/A | 46,262 | 46,161.4 | 151.01 | socfb-UConn | N/A | N/A | 84,391 | 83,777.7 | 183.23 |
| ca-HepPh | N/A | N/A | 140,485 | 140,068.4 | 169.58 | socfb-UCSB37 | N/A | N/A | 90,198 | 89,200.8 | 181.14 |
| ca-hollywood-2009 | N/A | N/A | 1,591,287 | 1,578,513 | 203.39 | socfb-UF | N/A | N/A | 119,700 | 117,876.9 | 124.65 |
| ca-MathSciNet | N/A | N/A | 56,826 | 56,490.5 | 202.42 | socfb-UIllinois | N/A | N/A | 108,948 | 108,470.3 | 135.60 |
| ca-netscience | 17,341 | 17,997 | 17,781 | 17,780.1 | 121.27 | socfb-Wisconsin87 | N/A | N/A | 83,837 | 82,596.5 | 211.36 |
| ia-email-EU | N/A | N/A | 17,861 | 17,546.7 | 182.28 | soc-flickr | N/A | N/A | 59,976 | 59,342.1 | 125.18 |
| ia-email-univ | 8733 | 26,220 | 20,749 | 20,538.9 | 201.94 | soc-flixster | N/A | N/A | 55,676 | 54,557.7 | 181.08 |
| ia-enron-large | N/A | N/A | 40,233 | 39,553.5 | 189.25 | soc-FourSquare | N/A | N/A | 28,775 | 28,598.1 | 233.85 |
| ia-enron-only | 8030 | 9226 | 8570 | 8542.2 | 160.40 | soc-gowalla | N/A | N/A | 48,365 | 47,895.4 | 148.82 |
| ia-fb-messages | 7135 | 23,605 | 14,396 | 14,223 | 152.57 | soc-karate | 629 | 629 | 629 | 629 | 0.00 |
| ia-infect-dublin | 12,608 | 26,004 | 22,783 | 22,627.5 | 193.70 | soc-lastfm | N/A | N/A | 32,614 | 32,036.3 | 157.81 |
| ia-infect-hyper | 6554 | 6554 | 6554 | 6554 | 1.68 | soc-livejournal | N/A | N/A | 297,475 | 289,923.2 | 167.77 |
| ia-reality | N/A | N/A | 7473 | 7404.8 | 205.85 | soc-LiveMocha | N/A | N/A | 23,056 | 22,702.8 | 187.07 |
| ia-wiki-Talk | N/A | N/A | 25,589 | 25,136.2 | 172.33 | soc-orkut | N/A | N/A | 114,442 | 112,839.3 | 187.46 |
| inf-power | N/A | N/A | 16,414 | 16,356.8 | 172.16 | soc-pokec | N/A | N/A | 55,396 | 54,045.9 | 170.75 |
| inf-roadNet-CA | N/A | N/A | 18,351 | 18,245.3 | 232.32 | soc-slashdot | N/A | N/A | 28,032 | 27,717.9 | 261.61 |
| inf-roadNet-PA | N/A | N/A | 18,005 | 17,990.4 | 186.22 | soc-twitter-follows | N/A | N/A | 13,690 | 13,527.6 | 203.29 |
| inf-road-usa | N/A | N/A | 17,728 | 17,649.3 | 156.50 | soc-wiki-Vote | 9944 | 18,426 | 14,496 | 14,390.8 | 248.59 |
| rec-amazon | N/A | N/A | 25,468 | 25,468 | 112.68 | soc-youtube | N/A | N/A | 30,905 | 30,491.7 | 185.84 |
| rt-retweet | 3609 | 3609 | 3609 | 3609 | 21.53 | soc-youtube-snap | N/A | N/A | 30,746 | 30,309.2 | 203.55 |
| rt-retweet-crawl | N/A | N/A | 21,151 | 20,857.2 | 230.71 | tech-as-caida2007 | N/A | N/A | 16,120 | 15,863.9 | 138.62 |
| rt-twitter-copen | 11,777 | 12,474 | 11,590 | 11,548.8 | 161.99 | tech-as-skitter | N/A | N/A | 60,308 | 58,536.9 | 151.54 |
| sc-ldoor | N/A | N/A | 121,303 | 121,165.8 | 214.11 | tech-internet-as | N/A | N/A | 16,818 | 16,538 | 131.74 |
| sc-msdoor | N/A | N/A | 120,426 | 120,330.8 | 250.58 | tech-p2p-gnutella | N/A | N/A | 16,023 | 15,920.1 | 239.06 |
| sc-nasasrb | N/A | N/A | 123,112 | 122,866 | 197.63 | tech-RL-caida | N/A | N/A | 28,826 | 28,485.9 | 174.90 |
| sc-pkustk11 | N/A | N/A | 132,640 | 132,153 | 244.31 | tech-routers-rf | N/A | N/A | 20,062 | 19,967.9 | 145.42 |
| sc-pkustk13 | N/A | N/A | 157,978 | 156,744.2 | 221.26 | tech-WHOIS | N/A | N/A | 53,750 | 53,301.1 | 159.00 |
| sc-pwtk | N/A | N/A | 132,744 | 132,648.4 | 245.68 | web-arabic-2005 | N/A | N/A | 263,602 | 263,602 | 7.09 |
| sc-shipsec1 | N/A | N/A | 87,853 | 87,048.3 | 150.32 | web-BerkStan | N/A | N/A | 25,848 | 25,786.1 | 122.40 |
| sc-shipsec5 | N/A | N/A | 118,524 | 117,849.5 | 174.37 | web-edu | N/A | N/A | 20,805 | 20,751 | 198.34 |
| soc-BlogCatalog | N/A | N/A | 40,724 | 39,594.4 | 184.04 | web-google | 10,767 | 28,599 | 26,866 | 26,858.8 | 131.36 |
| soc-brightkite | N/A | N/A | 41,189 | 40,361.4 | 212.23 | web-indochina-2004 | N/A | N/A | 98,809 | 98,657.3 | 176.93 |
| soc-buzznet | N/A | N/A | 38,628 | 37,701 | 174.26 | web-it-2004 | N/A | N/A | 1,115,823 | 1,115,823 | 1.32 |
| soc-delicious | N/A | N/A | 28,263 | 27,792.6 | 135.52 | web-polblogs | 8579 | 17,312 | 13,549 | 13,498.1 | 171.71 |
| soc-digg | N/A | N/A | 55,879 | 54,964.3 | 164.61 | web-sk-2005 | N/A | N/A | 126,823 | 126,524.1 | 144.92 |
| soc-dolphins | 2015 | 2015 | 2015 | 2015 | 0.00 | web-spam | N/A | N/A | 31,803 | 31,327.7 | 134.99 |
| soc-douban | N/A | N/A | 20,017 | 19,862 | 201.77 | web-uk-2005 | N/A | N/A | 1,050,477 | 1,050,477 | 0.44 |
| soc-epinions | N/A | N/A | 29,866 | 29,612.2 | 209.61 | web-webbase-2001 | N/A | N/A | 44,256 | 43,708.3 | 156.66 |
| socfb-A-anon | N/A | N/A | 57,408 | 55,907.1 | 157.96 | web-wikipedia2009 | N/A | N/A | 65,286 | 63,121.4 | 198.87 |
Table 5. Experimental results on real-world large graphs with k = 40.
| Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) | Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bio-celegans | 10,037 | 22,077 | 18,057 | 17,956.7 | 218.25 | socfb-B-anon | N/A | N/A | 71,895 | 70,879.8 | 170.92 |
| bio-diseasome | 10,450 | 24,138 | 21,945 | 21,933.6 | 188.25 | socfb-Berkeley13 | N/A | N/A | 113,862 | 111,784.4 | 173.56 |
| bio-dmela | N/A | N/A | 18,403 | 18,266.6 | 197.28 | socfb-CMU | N/A | N/A | 87,972 | 86,992.2 | 180.33 |
| bio-yeast | - | - | 16,302 | 16,247.8 | 182.60 | socfb-Duke14 | N/A | N/A | 93,294 | 92,967.7 | 224.83 |
| ca-AstroPh | N/A | N/A | 144,660 | 143,098.9 | 186.93 | socfb-Indiana | N/A | N/A | 135,869 | 135,265 | 178.45 |
| ca-citeseer | N/A | N/A | 218,277 | 217,906.1 | 194.61 | socfb-MIT | N/A | N/A | 90,760 | 89,700.5 | 182.88 |
| ca-coauthors-dblp | N/A | N/A | 924,023 | 921,156.3 | 170.24 | socfb-OR | N/A | N/A | 85,045 | 84,407.4 | 133.30 |
| ca-CondMat | N/A | N/A | 61,647 | 60,964.7 | 221.77 | socfb-Penn94 | N/A | N/A | 121,319 | 120,455.7 | 179.44 |
| ca-CSphd | N/A | N/A | 15,398 | 15,370.2 | 206.08 | socfb-Stanford3 | N/A | N/A | 112,011 | 111,138.3 | 140.08 |
| ca-dblp-2010 | N/A | N/A | 180,532 | 179,364.5 | 182.20 | socfb-Texas84 | N/A | N/A | 128,204 | 126,768.1 | 219.27 |
| ca-dblp-2012 | N/A | N/A | 160,660 | 159,929.8 | 169.46 | socfb-uci-uni | N/A | N/A | 25,650 | 25,101.6 | 226.36 |
| ca-Erdos992 | N/A | N/A | 16,877 | 16,752.9 | 125.53 | socfb-UCLA | N/A | N/A | 119,747 | 118,738.3 | 187.82 |
| ca-GrQc | N/A | N/A | 53,645 | 53,532.1 | 170.29 | socfb-UConn | N/A | N/A | 102,092 | 101,497.7 | 207.71 |
| ca-HepPh | N/A | N/A | 161,167 | 160,718.8 | 207.33 | socfb-UCSB37 | N/A | N/A | 108,210 | 107,448.6 | 210.05 |
| ca-hollywood-2009 | N/A | N/A | 1,815,958 | 1,803,757 | 213.10 | socfb-UF | N/A | N/A | 143,528 | 142,751.4 | 126.65 |
| ca-MathSciNet | N/A | N/A | 70,636 | 70,271.1 | 143.10 | socfb-UIllinois | N/A | N/A | 135,222 | 134,182.4 | 201.21 |
| ca-netscience | 10,519 | 22,011 | 21,076 | 21,072.8 | 136.11 | socfb-Wisconsin87 | N/A | N/A | 104,664 | 103,275.1 | 203.96 |
| ia-email-EU | N/A | N/A | 21,652 | 21,347.9 | 217.56 | soc-flickr | N/A | N/A | 74,830 | 74,224.6 | 160.25 |
| ia-email-univ | 10,977 | 32,641 | 25,202 | 25,079.5 | 174.42 | soc-flixster | N/A | N/A | 68,593 | 67,034.3 | 182.67 |
| ia-enron-large | N/A | N/A | 50,536 | 50,100.5 | 197.78 | soc-FourSquare | N/A | N/A | 36,905 | 36,496.1 | 180.98 |
| ia-enron-only | 8159 | 10,121 | 9610 | 9590.5 | 123.20 | soc-gowalla | N/A | N/A | 61,166 | 60,477.1 | 106.86 |
| ia-fb-messages | - | - | 18,078 | 17,939.4 | 155.67 | soc-karate | 629 | 629 | 629 | 629 | 0.00 |
| ia-infect-dublin | 17,535 | 30,618 | 26,721 | 26,652.9 | 207.08 | soc-lastfm | N/A | N/A | 40,830 | 40,624.8 | 191.26 |
| ia-infect-hyper | 6554 | 6554 | 6554 | 6554 | 0.00 | soc-livejournal | N/A | N/A | 353,252 | 345,997.4 | 159.91 |
| ia-reality | N/A | N/A | 9474 | 9427.1 | 159.85 | soc-LiveMocha | N/A | N/A | 29,376 | 28,832 | 190.46 |
| ia-wiki-Talk | N/A | N/A | 31,492 | 31,299.6 | 216.93 | soc-orkut | N/A | N/A | 146,600 | 145,529.7 | 135.93 |
| inf-power | N/A | N/A | 20,842 | 20,757.7 | 129.00 | soc-pokec | N/A | N/A | 70,454 | 69,295.1 | 140.46 |
| inf-roadNet-CA | N/A | N/A | 24,261 | 24,204.7 | 150.68 | soc-slashdot | N/A | N/A | 34,450 | 33,981 | 205.31 |
| inf-roadNet-PA | N/A | N/A | 23,940 | 23,929.1 | 134.36 | soc-twitter-follows | N/A | N/A | 17,550 | 17,336.8 | 220.63 |
| inf-road-usa | N/A | N/A | 23,594 | 23,458.5 | 146.11 | soc-wiki-Vote | 8107 | 23,864 | 18,068 | 17,883.6 | 175.77 |
| rec-amazon | N/A | N/A | 33,363 | 33,356.2 | 161.28 | soc-youtube | N/A | N/A | 39,556 | 38,949 | 91.83 |
| rt-retweet | 4161 | 4169 | 4161 | 4161 | 10.27 | soc-youtube-snap | N/A | N/A | 38,929 | 38,679.1 | 195.84 |
| rt-retweet-crawl | N/A | N/A | 27,011 | 26,594 | 138.17 | tech-as-caida2007 | N/A | N/A | 19,973 | 19,779.8 | 135.87 |
| rt-twitter-copen | 14,772 | 16,130 | 14,657 | 14,635.9 | 175.68 | tech-as-skitter | N/A | N/A | 72,571 | 71,620 | 209.40 |
| sc-ldoor | N/A | N/A | 161,126 | 160,998.6 | 178.39 | tech-internet-as | N/A | N/A | 21,029 | 20,875.3 | 175.95 |
| sc-msdoor | N/A | N/A | 159,967 | 159,737.1 | 224.65 | tech-p2p-gnutella | N/A | N/A | 20,773 | 20,588 | 110.00 |
| sc-nasasrb | N/A | N/A | 160,746 | 160,572.2 | 148.46 | tech-RL-caida | N/A | N/A | 36,579 | 36,137.1 | 133.97 |
| sc-pkustk11 | N/A | N/A | 174,249 | 173,122.1 | 163.16 | tech-routers-rf | N/A | N/A | 24,391 | 24,223 | 156.74 |
| sc-pkustk13 | N/A | N/A | 204,876 | 204,369.1 | 137.80 | tech-WHOIS | N/A | N/A | 61,598 | 60,786.6 | 181.99 |
| sc-pwtk | N/A | N/A | 175,800 | 175,532 | 183.57 | web-arabic-2005 | N/A | N/A | 348,027 | 348,027 | 5.92 |
| sc-shipsec1 | N/A | N/A | 113,485 | 112,618.6 | 214.02 | web-BerkStan | N/A | N/A | 32,566 | 32,486.7 | 199.25 |
| sc-shipsec5 | N/A | N/A | 153,006 | 151,986 | 162.57 | web-edu | N/A | N/A | 24,780 | 24,730.8 | 219.82 |
| soc-BlogCatalog | N/A | N/A | 49,422 | 48,620.8 | 193.41 | web-google | - | - | 31,201 | 31,141.2 | 148.58 |
| soc-brightkite | N/A | N/A | 49,753 | 49,285.4 | 204.41 | web-indochina-2004 | N/A | N/A | 119,525 | 119,336.2 | 220.28 |
| soc-buzznet | N/A | N/A | 47,092 | 46,465.3 | 230.39 | web-it-2004 | N/A | N/A | 1,314,144 | 1,314,144 | 9.23 |
| soc-delicious | N/A | N/A | 35,377 | 34,841.4 | 161.33 | web-polblogs | 9606 | 21,840 | 16,844 | 16,756.7 | 201.77 |
| soc-digg | N/A | N/A | 67,806 | 66,826.6 | 248.28 | web-sk-2005 | N/A | N/A | 154,004 | 153,682.4 | 175.56 |
| soc-dolphins | 2015 | 2015 | 2015 | 2015 | 0.00 | web-spam | N/A | N/A | 38,112 | 37,909.5 | 198.68 |
| soc-douban | N/A | N/A | 25,245 | 24,822.4 | 118.98 | web-uk-2005 | N/A | N/A | 1,277,887 | 1,277,887 | 0.45 |
| soc-epinions | N/A | N/A | 37,345 | 36,766.1 | 187.05 | web-webbase-2001 | N/A | N/A | 52,395 | 52,146 | 192.70 |
| socfb-A-anon | N/A | N/A | 71,908 | 71,376.4 | 198.96 | web-wikipedia2009 | N/A | N/A | 79,081 | 77,242.5 | 215.47 |
Table 6. Experimental results on real-world large graphs with k = 50.
| Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) | Instance | CPLEX LB | CPLEX UB | TOPKWCLQ $w_b$ | TOPKWCLQ $w_a$ | TOPKWCLQ Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bio-celegans | 14,811 | 25,659 | 21,032 | 20,899.3 | 242.15 | socfb-B-anon | N/A | N/A | 87,455 | 85,964.7 | 168.6 |
| bio-diseasome | 13,041 | 27,569 | 25,157 | 25,132.1 | 219.96 | socfb-Berkeley13 | N/A | N/A | 132,576 | 131,226.4 | 245.12 |
| bio-dmela | N/A | N/A | 22,060 | 21,985 | 156.03 | socfb-CMU | N/A | N/A | 101,002 | 100,407.2 | 224.22 |
| bio-yeast | N/A | N/A | 19,700 | 19,584.3 | 130.19 | socfb-Duke14 | N/A | N/A | 110,636 | 109,983.7 | 198.54 |
| ca-AstroPh | N/A | N/A | 167,224 | 166,691.2 | 161.95 | socfb-Indiana | N/A | N/A | 158,910 | 157,463.4 | 175.19 |
| ca-citeseer | N/A | N/A | 258,664 | 256,459.1 | 200.09 | socfb-MIT | N/A | N/A | 104,412 | 103,962 | 155.76 |
| ca-coauthors-dblp | N/A | N/A | 1,103,639 | 1,100,063 | 173.91 | socfb-OR | N/A | N/A | 101,953 | 100,836 | 164.1 |
| ca-CondMat | N/A | N/A | 73,224 | 72,874.1 | 174.02 | socfb-Penn94 | N/A | N/A | 143,535 | 141,653.6 | 171.98 |
| ca-CSphd | N/A | N/A | 18,913 | 18,888.6 | 215.74 | socfb-Stanford3 | N/A | N/A | 130,978 | 130,307.6 | 160.01 |
| ca-dblp-2010 | N/A | N/A | 211,258 | 210,658.3 | 202.01 | socfb-Texas84 | N/A | N/A | 150,919 | 149,626.2 | 215.77 |
| ca-dblp-2012 | N/A | N/A | 185,410 | 184,644.4 | 138.06 | socfb-uci-uni | N/A | N/A | 31,019 | 30,807.1 | 192.24 |
| ca-Erdos992 | N/A | N/A | 20,331 | 20,133.4 | 134.71 | socfb-UCLA | N/A | N/A | 141,024 | 139,929.2 | 165.78 |
| ca-GrQc | N/A | N/A | 60,410 | 60,254 | 174.04 | socfb-UConn | N/A | N/A | 119,269 | 118,552.1 | 183.93 |
| ca-HepPh | N/A | N/A | 177,770 | 177,104.2 | 157.61 | socfb-UCSB37 | N/A | N/A | 125,680 | 124,864.5 | 169.75 |
| ca-hollywood-2009 | N/A | N/A | 2,017,283 | 2,004,122 | 220.66 | socfb-UF | N/A | N/A | 166,765 | 165,868.4 | 242.62 |
| ca-MathSciNet | N/A | N/A | 84,035 | 83,379.4 | 168.59 | socfb-UIllinois | N/A | N/A | 159,630 | 157,918 | 184.39 |
| ca-netscience | 14,917 | 25,103 | 23,696 | 23,684.2 | 146.82 | socfb-Wisconsin87 | N/A | N/A | 122,934 | 122,265.3 | 172.9 |
| ia-email-EU | N/A | N/A | 25,603 | 25,175.5 | 229.9 | soc-flickr | N/A | N/A | 89,807 | 87,883 | 175.89 |
| ia-email-univ | - | - | 29,402 | 29,187.6 | 231.77 | soc-flixster | N/A | N/A | 80,451 | 79,254.8 | 212.02 |
| ia-enron-large | N/A | N/A | 60,880 | 60,270.8 | 212.19 | soc-FourSquare | N/A | N/A | 44,676 | 44,272 | 194.57 |
| ia-enron-only | 9157 | 10,428 | 10,297 | 10,266 | 141.51 | soc-gowalla | N/A | N/A | 72,608 | 72,149.7 | 213.64 |
| ia-fb-messages | - | - | 21,654 | 21,446 | 135.39 | soc-karate | 629 | 629 | 629 | 629 | 0 |
| ia-infect-dublin | 21,426 | 34,156 | 30,074 | 29,964.5 | 221.4 | soc-lastfm | N/A | N/A | 49,431 | 48,941.3 | 182.21 |
| ia-infect-hyper | 6554 | 6554 | 6554 | 6554 | 0 | soc-livejournal | N/A | N/A | 405,317 | 399,721.7 | 142.94 |
| ia-reality | N/A | N/A | 11,518 | 11,445.6 | 146.6 | soc-LiveMocha | N/A | N/A | 34,890 | 34,574.5 | 194.75 |
| ia-wiki-Talk | N/A | N/A | 38,239 | 37,417.3 | 192.79 | soc-orkut | N/A | N/A | 181,096 | 178,069.6 | 139.86 |
| inf-power | N/A | N/A | 24,968 | 24,890.9 | 224.22 | soc-pokec | N/A | N/A | 85,010 | 84,118.3 | 188.24 |
| inf-roadNet-CA | N/A | N/A | 30,251 | 30,155.4 | 111.57 | soc-slashdot | N/A | N/A | 40,150 | 39,867.5 | 207.98 |
| inf-roadNet-PA | N/A | N/A | 29,887 | 29,867.4 | 199.69 | soc-twitter-follows | N/A | N/A | 21,301 | 21,065.8 | 187.84 |
| inf-road-usa | N/A | N/A | 29,316 | 29,201.1 | 142.64 | soc-wiki-Vote | 10,097 | 28,039 | 21,419 | 21,210.6 | 187.91 |
| rec-amazon | N/A | N/A | 41,146 | 41,124.2 | 148.14 | soc-youtube | N/A | N/A | 47,160 | 46,973 | 183.03 |
| rt-retweet | 4526 | 4620 | 4526 | 4526 | 0.14 | soc-youtube-snap | N/A | N/A | 47,523 | 46,964.5 | 234.3 |
| rt-retweet-crawl | N/A | N/A | 32,586 | 32,128.1 | 175.81 | tech-as-caida2007 | N/A | N/A | 23,881 | 23,719.3 | 164.61 |
| rt-twitter-copen | 17,314 | 19,812 | 17,652 | 17,571 | 210.34 | tech-as-skitter | N/A | N/A | 84,746 | 83,760 | 193.48 |
| sc-ldoor | N/A | N/A | 200,767 | 200,601.1 | 197.73 | tech-internet-as | N/A | N/A | 25,236 | 25,078.6 | 252.39 |
| sc-msdoor | N/A | N/A | 199,076 | 198,801.3 | 129.64 | tech-p2p-gnutella | N/A | N/A | 25,238 | 25,143.9 | 187.28 |
| sc-nasasrb | N/A | N/A | 197,922 | 197,611.2 | 186.05 | tech-RL-caida | N/A | N/A | 43,728 | 43,471.3 | 151.62 |
| sc-pkustk11 | N/A | N/A | 214,700 | 213,740.4 | 140.39 | tech-routers-rf | N/A | N/A | 28,349 | 28,125.2 | 137.27 |
| sc-pkustk13 | N/A | N/A | 252,454 | 251,525.5 | 186.96 | tech-WHOIS | N/A | N/A | 68,102 | 67,397.8 | 165.01 |
| sc-pwtk | N/A | N/A | 218,360 | 217,983.9 | 161.86 | web-arabic-2005 | N/A | N/A | 430,893 | 430,893 | 12.32 |
| sc-shipsec1 | N/A | N/A | 138,034 | 137,296.4 | 181.85 | web-BerkStan | N/A | N/A | 39,160 | 38,997.3 | 183.66 |
| sc-shipsec5 | N/A | N/A | 185,496 | 184,589.5 | 171.17 | web-edu | N/A | N/A | 28,721 | 28,667.3 | 205.28 |
| soc-BlogCatalog | N/A | N/A | 57,097 | 56,401.7 | 150.99 | web-google | - | - | 34,977 | 34,930.2 | 194.03 |
| soc-brightkite | N/A | N/A | 58,014 | 57,698.3 | 213.89 | web-indochina-2004 | N/A | N/A | 138,107 | 137,833.2 | 161.98 |
| soc-buzznet | N/A | N/A | 55,420 | 54,720 | 209.46 | web-it-2004 | N/A | N/A | 1,502,580 | 1,502,580 | 31.06 |
| soc-delicious | N/A | N/A | 42,317 | 41,594 | 169.03 | web-polblogs | 9793 | 25,434 | 19,890 | 19,727.5 | 208.03 |
| soc-digg | N/A | N/A | 79,149 | 77,969.7 | 168.98 | web-sk-2005 | N/A | N/A | 180,230 | 179,935.1 | 183.74 |
| soc-dolphins | 2015 | 2015 | 2015 | 2015 | 0 | web-spam | N/A | N/A | 45,228 | 44,432.1 | 172.25 |
| soc-douban | N/A | N/A | 29,993 | 29,587.2 | 135.61 | web-uk-2005 | N/A | N/A | 1,497,314 | 1,497,314 | 0.46 |
| soc-epinions | N/A | N/A | 43,676 | 43,427.1 | 179.86 | web-webbase-2001 | N/A | N/A | 59,569 | 59,108.5 | 153.63 |
| socfb-A-anon | N/A | N/A | 87,217 | 86,621.5 | 203.76 | web-wikipedia2009 | N/A | N/A | 90,898 | 90,000.9 | 190.59 |
Table 7. Summary of comparison between CPLEX and TOPKWCLQ on real-world graphs. #Better denotes the number of graphs where an algorithm finds better objective values. #N/A denotes the number of graphs where an algorithm fails to find an objective value.
| Benchmark | k | CPLEX #Better | CPLEX #N/A | TOPKWCLQ #Better | TOPKWCLQ #N/A |
|---|---|---|---|---|---|
| real-world graphs (102) | 10 | 5 | 84 | 87 | 0 |
| real-world graphs (102) | 20 | 2 | 85 | 97 | 0 |
| real-world graphs (102) | 30 | 1 | 86 | 97 | 0 |
| real-world graphs (102) | 40 | 1 | 86 | 97 | 0 |
| real-world graphs (102) | 50 | 0 | 87 | 98 | 0 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Wu, J.; Yin, M. A Restart Local Search for Solving Diversified Top-k Weight Clique Search Problem. Mathematics 2021, 9, 2674. https://doi.org/10.3390/math9212674
