Nearest Neighbor Search in the Metric Space of a Complex Network for Community Detection

The objective of this article is to bridge the gap between two important research directions: (1) nearest neighbor search, which is a fundamental computational tool for large data analysis; and (2) complex network analysis, which deals with large real graphs but is generally studied via graph theoretic analysis or spectral analysis. We study the nearest neighbor search problem in a complex network by developing a suitable notion of nearness. We also study, and experiment with, efficient nearest neighbor search among the nodes of a complex network using the metric tree and locality sensitive hashing (LSH). To evaluate the proposed nearest neighbor search, we apply it to the network community detection problem. Experiments are performed to verify the usefulness of the nearness measures for complex networks, the role of the metric tree and LSH in computing fast, approximate node nearness, and the efficiency of community detection using nearest neighbor search. We observed that nearest neighbor search between network nodes is an efficient tool for exploring the community structure of real networks. Several approximation schemes are useful for large networks: they cause hardly any degradation of results while saving considerable computation time. The nearest neighbor based community detection approach is very competitive in terms of efficiency and time.


Introduction
Nearest neighbor (NN) search is an important computational primitive for structural analysis of data and other query retrieval purposes. NN search is very useful for dealing with massive data sets, but it suffers from the curse of dimensionality [1,2]. Though nearest neighbor search is an extensively studied research problem for low dimensional data, a recent surge of results shows that it is also a highly useful tool for analyzing very large quantities of data, provided a suitable space-partitioning data structure is used, such as the kd-tree, quad-tree, R-tree, metric tree or locality sensitive hashing [3][4][5][6]. One more advantage of using nearest neighbor search for large data analysis is the availability of efficient approximation schemes, which provide almost the same results in much less time [7,8].
Though nearest neighbor search is very successful and extensively used across the research domains of computer science, it has not been studied rigorously in complex network analysis. Complex networks are generally studied within a graph theoretic or spectral analysis framework. One basic reason for this limitation may be that the nodes of a complex network do not naturally lie in a metric space, which restricts the use of nearest neighbor analysis, since it relies on metric or nearness measures. Unlike general graphs, complex networks are characterized by a small "average path length" and a high "clustering coefficient". A network community (also known as a module or cluster) is typically a group of nodes with more interconnections among its members than with the remaining part of the network [9][10][11]. To extract such groups of nodes from a network, one generally selects an objective function that captures the possible communities as sets of nodes with better internal connectivity than external connectivity [12,13]. However, very little research on network community detection has tried to develop a nearness between the nodes of a complex network and to use nearest neighbor search for partitioning the network [14][15][16][17][18][19][20]. The way the metric is defined among the nodes should capture the crucial properties of complex networks. Therefore, the metric needs to be created very carefully so that it can explore the underlying community structure of real life networks [21].
Extracting network communities in large real graphs such as social networks, the web, collaboration networks and bio-networks is an important research direction of recent interest [11,[22][23][24][25]. In this work, we develop the notion of nearness among the nodes of the network using new matrices derived from the modified adjacency matrix of the graph. This notion is flexible across networks and can be tuned to enhance the structural properties of the network required for community detection.
The main contributions of this work are: (1) development of the concept of nearness between the nodes of a complex network; (2) comparison of the proposed nearness with other notions of similarity; (3) study of, and experiments on, approximate nearest neighbor search for complex networks using the M-tree and LSH; (4) design of an efficient community detection algorithm using nearest neighbor search. We observed that nearest neighbor search between network nodes is an efficient tool for exploring the community structure of real networks. Further, several approximation schemes are useful for large networks: they cause hardly any degradation of results while saving considerable computation time.
The rest of this paper is organized as follows. Section 2 describes the notion of nearness in a complex network and the proposed method to compute the distance between the nodes of a complex network. Sections 3 and 4 describe nearest neighbor search over a complex network using the metric tree and locality sensitive hashing, respectively. In Section 5, the proposed algorithm for network community detection using nearest neighbor search is discussed. The results of the comparison between community detection algorithms are presented in Section 6.

Proposed Notion of Nearness in Complex Network
The notion of nearness between the nodes of a graph has been used for several purposes in the graph theory literature. Most of the time, the shortest path and edge connectivity are the popular choices to describe the nearness of nodes. However, the edge count does not give a true measure of network connectivity. A true measure of nearness in a complex network should be able to determine how much one node can affect another, so as to provide a better measure of connectivity between the nodes of a real life complex network. Research in this direction needs special attention in the domain of complex network analysis; one such measure is proposed in this article and described in the following subsections.

Definitions
Definition 1 (Metric space of network). Given a graph G = (V, E), a metric is defined over the vertex set V by a function d that computes the distance between two vertices of V. The pair (V, d) is called a metric space if d satisfies reflexivity, non-negativity, symmetry and the triangle inequality.
Definition 2 (Nearest neighbor search on network). The nearest neighbor search problem in a complex network is to find the node nearest to a query vertex v_q among the other vertices V \ {v_q} of a graph G = (V, E), with respect to a metric space M(V, d) associated with G.

Nearness in Complex Network
Methods based on node neighborhoods. For a node x, let N(x) denote the set of neighbors of x in a graph G(V, E). A number of approaches are based on the idea that two nodes x and y are more likely to be affected by one another if their sets of neighbors N(x) and N(y) have a large overlap.

Common neighbors: The most direct implementation of this idea for nearness computation is to define d(x, y) := |N(x) ∩ N(y)|, the number of neighbors that x and y have in common.
Jaccard coefficient: The Jaccard coefficient, a commonly used similarity metric, measures the probability that both x and y have a feature f, for a randomly selected feature f that either x or y has. If we take features here to be neighbors in G(V, E), this leads to the measure d(x, y) := |N(x) ∩ N(y)| / |N(x) ∪ N(y)|.

Preferential attachment: The probability that a new edge involves node x is proportional to |N(x)|, the current number of neighbors of x. The probability of co-authorship of x and y is correlated with the product of the number of collaborators of x and y. This corresponds to the measure d(x, y) := |N(x)| × |N(y)|.
Katz measure: This measure directly sums over the collection of paths, exponentially damped by length to count short paths more heavily. This leads to the measure d(x, y) := Σ_{l=1}^{∞} β^l × |paths_l(x, y)|, where paths_l(x, y) is the set of all length-l paths from x to y. (β controls the effective path length: for small β, paths of length three or more contribute very little to the summation.)
Commute time: A random walk on G starts at a node x and iteratively moves to a neighbor of the current node chosen uniformly at random. The hitting time H(x, y) from x to y is the expected number of steps required for a random walk starting at x to reach y. Since the hitting time is not in general symmetric, it is also natural to consider the commute time C(x, y) := H(x, y) + H(y, x).
PageRank: Random resets form the basis of the PageRank measure for Web pages, and we can adapt it for link prediction as follows: define d(x, y) to be the stationary probability of y in a random walk that returns to x with probability α at each step, moving to a random neighbor with probability 1 − α.
Most of these methods were developed for other types of problems, such as information retrieval, ranking and prediction, and for general graphs. In [21], the authors studied a measure specially designed for complex networks.
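The neighborhood-based measures above can be sketched in a few lines. The following is only an illustration: the adjacency-set representation `Graph`, the function names, and the truncation length `max_len` of the Katz sum are assumptions made here, and the Katz sketch counts walks (paths that may revisit nodes), as the matrix form of the measure does.

```python
from collections import defaultdict
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # hypothetical representation: node -> set of neighbors


def common_neighbors(g: Graph, x: str, y: str) -> int:
    """d(x, y) := |N(x) ∩ N(y)|."""
    return len(g[x] & g[y])


def jaccard(g: Graph, x: str, y: str) -> float:
    """d(x, y) := |N(x) ∩ N(y)| / |N(x) ∪ N(y)| (0.0 if both nodes are isolated)."""
    union = g[x] | g[y]
    return len(g[x] & g[y]) / len(union) if union else 0.0


def preferential_attachment(g: Graph, x: str, y: str) -> int:
    """d(x, y) := |N(x)| * |N(y)|."""
    return len(g[x]) * len(g[y])


def katz(g: Graph, x: str, y: str, beta: float = 0.1, max_len: int = 3) -> float:
    """Katz score truncated at max_len (an assumption; the full sum is infinite)."""
    walks = {x: 1}  # number of walks of the current length from x to each node
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = defaultdict(int)
        for u, count in walks.items():
            for v in g[u]:
                nxt[v] += count
        walks = nxt
        score += beta ** length * walks.get(y, 0)
    return score


# Toy graph: triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
```

On the toy graph, a and b share the single neighbor c, so their common-neighbor score is 1 and their Jaccard coefficient is 1/3.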

Proposed Nearness in Complex Network
In this subsection, we develop the notion of nearness among the nodes of the network using a linear combination of the adjacency matrix A and the identity matrix I of the same dimension for the network G = (V, E). The similarities between the nodes are defined on the matrix L = λI + A as the spherical similarity among its rows, and a distance is obtained by applying a concave function φ to a standard notion of similarity σ, such as the Pearson coefficient (σ_PC), the Spearman coefficient (σ_SC) or the cosine similarity (σ_CS). The method picks a pair of rows from L and computes the distance φ(σ) between them; in this way φ converts the similarity function into a distance matrix. In general, a similarity function satisfies the positivity and symmetry conditions of a metric, but not the triangle inequality. To obtain a metric, φ must therefore be metric-preserving (φ(d(x_i, x_j)) = d_φ(x_i, x_j)), concave and monotonically increasing. These three conditions are referred to as the chord condition. The function φ is chosen to have minimum internal area with the chord.
The choice of λ and φ(σ) plays a crucial role in the graph to metric transformation used for community detection. A complex network is characterized by a small average diameter and a high clustering coefficient. Several studies on network structure analysis reveal that hub nodes and local nodes characterize the interesting structure of a complex network. Suppose we take φ = arccos, σ = σ_CS and a constant λ ≥ 0. λ = 0 penalizes the effect of the direct edge in the metric and is suitable for extracting communities from a highly dense graph. λ = 1 places similar weight on the direct edge and the common neighbors, which reduces the effect of the direct edge in the metric, and is suitable for extracting communities from a moderately dense graph. λ = 2 gives more importance to the direct edge than to the common neighbors (this is the common case for available real networks). λ ≥ 2 penalizes the effect of the common neighbors in the metric and is suitable for extracting communities from a very sparse graph.
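As a minimal sketch of the transformation just described, assuming φ = arccos and σ = σ_CS, the distance between two nodes is the angle between their rows of L = λI + A. The function name `metric_from_graph`, the dense list-of-lists input, and the π/2 fallback for all-zero rows are illustrative assumptions, not part of the original method description.

```python
import math
from typing import List, Sequence


def metric_from_graph(adj: Sequence[Sequence[float]], lam: float = 1.0) -> List[List[float]]:
    """Pairwise distances arccos(cosine similarity) between rows of L = lam*I + A."""
    n = len(adj)
    rows = [[adj[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    norms = [math.sqrt(sum(v * v for v in row)) for row in rows]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if norms[i] == 0.0 or norms[j] == 0.0:
                # Assumption: an all-zero row (isolated node with lam = 0)
                # is treated as orthogonal to every other row.
                angle = math.pi / 2
            else:
                dot = sum(a * b for a, b in zip(rows[i], rows[j]))
                # Clamp to [-1, 1] to guard against floating-point overshoot.
                cos = max(-1.0, min(1.0, dot / (norms[i] * norms[j])))
                angle = math.acos(cos)
            dist[i][j] = dist[j][i] = angle
    return dist


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
distances = metric_from_graph(two_triangles, lam=1.0)
```

In this example, nodes inside the same triangle end up closer to each other than to nodes of the other triangle, which is the behavior the community detection step relies on.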

Nearest Neighbor Search on Complex Network Using Metric Tree
Numerous methods have been developed for nearest neighbor search among the points of a metric space. However, nearest neighbor search on high dimensional data suffers from the curse of dimensionality. Recent research in this direction has revealed that the dimension constraint can be tackled by using efficient data structures such as the metric tree and locality sensitive hashing. In this section we explore the metric tree for nearest neighbor search on a complex network, with the help of the metric mapping of the complex network described in the previous section.

Metric-Tree
A metric tree is a data structure specially designed to answer nearest neighbor queries for points residing in a metric space; it performs well in high dimensions, particularly when some approximation is permitted. A metric tree organizes a set of points in a spatial hierarchical manner. It is a binary tree whose nodes represent sets of points. The root node represents all points, and the points represented by an internal node v are partitioned into two subsets, represented by its two children. Formally, if we use N(v) to denote the set of points represented by node v, and use v.lc and v.rc to denote the left child and the right child of node v, then we have N(v) = N(v.lc) ∪ N(v.rc) and ∅ = N(v.lc) ∩ N(v.rc) for all the non-leaf nodes. At the lowest level, each leaf node contains very few points.
An M-tree [26] consists of leaf nodes, internal nodes and routing objects. A leaf node is a set of objects N_v with a pointer to its parent object v_p. An internal node is a set of routing objects N_RO with a pointer to its parent object v_p. A routing object v_r stores its covering radius r(v_r), a pointer to its covering tree T(v_r), and its distance d(v_r, P(v_r)) from its parent object. The feature values stored in an object v_j are the object identifier oid(v_j) and the distance d(v_j, P(v_j)) of v_j from its parent object.
The key to building a metric tree is how to partition a node v. A typical way is as follows: we first choose two pivot points from N(v), denoted v.lpv and v.rpv. Ideally, v.lpv and v.rpv are chosen so that the distance between them is the largest of all distances within N(v). More specifically, ||v.lpv − v.rpv|| = max_{p1, p2 ∈ N(v)} ||p1 − p2||.
A search on a metric tree is performed using a stack. The current radius r is used to decide which child node to search first. If the query q is on the left of the current point, then v.lc is searched first; otherwise, v.rc is searched first. At all times, the algorithm maintains a candidate NN, which is the nearest neighbor found so far while traversing the tree, and its distance determines the current radius. The algorithm for nearest neighbor search using the metric tree (Algorithm 1) is given below.

Nearest Neighbor Search Algorithm Using M-Tree
The theoretical advantage of using the metric tree as a data structure for nearest neighbor search is the following. Let M = (V, d) be a bounded metric space. Then for any fixed data set V ⊂ R^n of size n, and for a constant c ≥ 1, ∃ ε such that we may compute d(q, V) with at most c · log(n) + 1 expected metric evaluations [27].
1: Insert root object v_r in stack
2: Set current radius as d(v_r, q)
3: Successively traverse the tree in search of q
4: PUSH all the objects of traversal path into stack
5: Update the current radius
6: If leaf object reached
7: POP objects from stack
8: For all points lying inside the ball of current radius centering q, verify for possible nearest neighbor and update the current radius.
9: return d(q, v_q)
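The stack-based search can be illustrated with a small binary metric tree. This is a sketch under stated assumptions rather than the M-tree of [26]: the `Node` layout, the farthest-point heuristic for picking the two pivots, and the `leaf_size` parameter are all choices made for this example. A subtree is skipped when the triangle inequality shows that its covering ball cannot contain a point closer than the current candidate.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]


@dataclass
class Node:
    pivot: Point                  # routing object of this subtree
    radius: float                 # covering radius around the pivot
    points: List[Point] = field(default_factory=list)  # filled only for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], d: Metric, leaf_size: int = 2,
          pivot: Optional[Point] = None) -> Node:
    if pivot is None:
        pivot = points[0]
    radius = max(d(pivot, p) for p in points)
    if len(points) <= leaf_size:
        return Node(pivot, radius, points=list(points))
    # Heuristic for two far-apart pivots: farthest from the current pivot,
    # then farthest from that point.
    lpv = max(points, key=lambda p: d(pivot, p))
    rpv = max(points, key=lambda p: d(lpv, p))
    left = [p for p in points if d(p, lpv) <= d(p, rpv)]
    right = [p for p in points if d(p, lpv) > d(p, rpv)]
    if not right:  # all points coincide under d; keep them in one leaf
        return Node(pivot, radius, points=list(points))
    return Node(pivot, radius,
                left=build(left, d, leaf_size, lpv),
                right=build(right, d, leaf_size, rpv))


def nearest(root: Node, q: Point, d: Metric) -> Tuple[Optional[Point], float]:
    best, best_d = None, float("inf")   # candidate NN and current radius
    stack = [root]
    while stack:
        node = stack.pop()
        # Every point p in the subtree satisfies d(q, p) >= d(q, pivot) - radius.
        if d(q, node.pivot) - node.radius >= best_d:
            continue
        if node.left is None:           # leaf: check its points directly
            for p in node.points:
                dist = d(q, p)
                if dist < best_d:
                    best, best_d = p, dist
        else:
            # Push the farther child first so the nearer child is popped first.
            children = sorted((node.left, node.right),
                              key=lambda c: d(q, c.pivot), reverse=True)
            stack.extend(children)
    return best, best_d
```

For network nodes, `d` would be the node metric of Section 2 instead of the Euclidean `math.dist` that is convenient for testing.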

Nearest Neighbor Search on Complex Network Using Locality Sensitive Hashing
Metric trees so far represent the practical state of the art for achieving efficiency in the largest dimension possible. However, many real-world problems involve very large dimensions, beyond the capability of such search structures to achieve sub-linear efficiency. Thus, the high-dimensional case is the long-standing frontier of the nearest neighbor problem. An approximate nearest neighbor can be computed very efficiently using locality sensitive hashing.

Approximate Nearest Neighbor
Given a metric space (S, d) and a finite subset S_D ⊂ S of data points on which nearest neighbor queries are to be made, our aim is to organize S_D such that NN queries can be answered more efficiently. For any q ∈ S, the NN problem consists of finding a point p ∈ S_D such that d(p, q) is minimum over all points of S_D. We denote this by p = NN(q, S_D).
An ε-approximate NN of q ∈ S is a point p ∈ S_D such that d(p, q) ≤ (1 + ε) d(x, q) ∀ x ∈ S_D.
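The ε-approximate condition can be checked directly by comparing a candidate against the exact nearest distance. The helper below is a hypothetical utility written for illustration (the name `is_approximate_nn` and its signature are assumptions); it is useful mainly for validating an approximate index on small data.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_approximate_nn(p: T, q: T, data: Iterable[T],
                      d: Callable[[T, T], float], eps: float) -> bool:
    """True if d(p, q) <= (1 + eps) * d(x, q) for every x in data."""
    exact = min(d(x, q) for x in data)  # distance to the exact nearest neighbor
    return d(p, q) <= (1.0 + eps) * exact
```

With ε = 0 the check accepts only exact nearest neighbors; larger values of ε accept progressively looser answers.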

Locality Sensitive Hashing (LSH)
Several methods for computing the first nearest neighbor query exist in the literature, and locality sensitive hashing (LSH) is the most popular because its run time is independent of the dimension [28,29]. In locality sensitive hashing, the hash function has the property that close points are hashed into the same bucket with high probability and distant points are hashed into the same bucket with low probability. Mathematically, a family H = {h : S → U} is called (r_1, r_2, p_1, p_2)-sensitive for D if, for any two points p, q ∈ S: if p ∈ B(q, r_1) then Pr_H[h(q) = h(p)] ≥ p_1, and if p ∉ B(q, r_2) then Pr_H[h(q) = h(p)] ≤ p_2, where B(q, r) denotes a hypersphere of radius r centered at q. For a locality sensitive family to be useful, it has to satisfy the inequalities p_1 > p_2 and r_1 < r_2 when D is a distance, or p_1 > p_2 and r_1 > r_2 when D is a similarity measure [4,5]. The value of δ = log(1/p_1) / log(1/p_2) determines the search performance of LSH. Given an (r, r(1 + ε), p_1, p_2)-sensitive family, the (1 + ε)-NN problem can be solved via a series of hashing and searching steps within the buckets [5,30,31].

Locality Sensitive Hash Function for Complex Network
In this subsection, we discuss the existence of locality sensitive hash function families for the proposed metric for complex networks. The LSH data structure stores all nodes in hash tables and searches for the nearest neighbor via retrieval. A hash table contains many buckets, each identified by a bucket id. Unlike conventional hashing, the LSH approach tries to maximize the probability of collision of near items and puts them into the same bucket. For a given query q, the bucket h(q) is searched for the nearest node. In general, k hash functions are chosen independently and uniformly at random from the hash family H. The output of the nearest neighbor query is taken from the union of the k buckets. The consensus of the k functions reduces the error of approximation. For the metric defined in Section 2, we consider k random points from the metric space. Each random point r_i defines a hash function h_i(x) = sign(d(x, r_i)), where d is the metric and i ∈ [1, k]. These randomized hash functions are locality sensitive [32,33].
1: Identify the buckets of query point q corresponding to the different hash functions.
2: Compute the nearest neighbor of q only among the points inside the selected buckets.
3: return d(q, V)
The theoretical advantage of using locality sensitive hashing as a data structure for nearest neighbor search is the following. Let M = (V, d) be a bounded metric space. Then for any fixed data set V ⊂ R^n of size n, and for a constant c ≥ 1, ∃ ε such that we may compute d(q, V) with at most m n^{O(1/ε)} expected metric evaluations, where m is the number of dimensions of the metric space. In the case of a complex network m = n, so the expected time is n^{O(2/ε)} [27,34].
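The two steps of the bucket-based query can be sketched as follows. Note the substitution: instead of the hash h_i(x) = sign(d(x, r_i)) described above, this sketch uses the well-known random hyperplane hash (the sign of a dot product with a random Gaussian vector), which is locality sensitive for the angular distance between the rows of L = λI + A. The class name `HyperplaneLSH` and the parameters `n_tables`, `n_bits` and `seed` are assumptions of this illustration.

```python
import math
import random
from typing import Dict, List, Optional, Sequence, Set, Tuple

Vector = Sequence[float]


def angular_distance(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return math.pi / 2  # assumption: zero vectors are treated as orthogonal
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))


class HyperplaneLSH:
    def __init__(self, dim: int, n_tables: int = 8, n_bits: int = 4, seed: int = 0):
        rng = random.Random(seed)
        # One set of n_bits random hyperplanes per hash table.
        self.planes = [[[rng.gauss(0.0, 1.0) for _ in range(dim)]
                        for _ in range(n_bits)] for _ in range(n_tables)]
        self.tables: List[Dict[Tuple[bool, ...], List[int]]] = [{} for _ in range(n_tables)]
        self.vectors: Dict[int, Vector] = {}

    def _key(self, t: int, x: Vector) -> Tuple[bool, ...]:
        # Bucket id: which side of each hyperplane the vector falls on.
        return tuple(sum(a * b for a, b in zip(plane, x)) >= 0.0
                     for plane in self.planes[t])

    def add(self, idx: int, x: Vector) -> None:
        self.vectors[idx] = x
        for t, table in enumerate(self.tables):
            table.setdefault(self._key(t, x), []).append(idx)

    def candidates(self, q: Vector) -> Set[int]:
        # Step 1: union of the buckets of q over all tables.
        found: Set[int] = set()
        for t, table in enumerate(self.tables):
            found.update(table.get(self._key(t, q), []))
        return found

    def query(self, q: Vector) -> Optional[Tuple[int, float]]:
        # Step 2: exact search restricted to the candidate buckets.
        cands = self.candidates(q)
        if not cands:
            return None
        best = min(cands, key=lambda i: angular_distance(q, self.vectors[i]))
        return best, angular_distance(q, self.vectors[best])
```

More tables raise the chance that a true near neighbor shares at least one bucket with the query, while more bits per table make each bucket smaller; both are tuning knobs rather than values taken from the experiments.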

Proposed Community Detection Based on Nearest Neighbor
In this section we describe the algorithm proposed for network community detection using nearest neighbor search. Our approach differs from the existing methods of community detection. The available algorithms are broadly categorized into three main groups: spectral (SP), graph traversal based (GT) and semi-definite programming based (SDP). A partial list of popular algorithms, with their categories, basic approaches and complexities, is given in Table 1; for example, geometric Brownian motion (2014 [60]) is listed as SDP with complexity O(n^3). Many more algorithms have been developed for the network community detection problem; a complete list can be obtained from several survey articles [11,35,36]. Theoretical limitations and evaluation strategies of community detection algorithms are provided in [37][38][39][40][41]. Content based node similarity methods (discussed in [42,43]) use additional information about the network nodes that is not available in general complex networks.

Distance Based Community Detection
To the best of our knowledge, no algorithm in the network community detection literature computes direct nearest neighbors between nodes; however, concepts of nearness are used in some algorithms, which are described below.
Walktrap Algorithm (WT): This algorithm by Pons and Latapy [14] uses a hierarchical agglomerative method. Here, the distance between two nodes is defined in terms of a random walk process. The basic idea is that if two nodes are in the same community, their probabilities of reaching a third node located in the same community through a random walk should not be very different. The distance is constructed by summing these differences over all nodes, with a correction for degree. The complexity of the algorithm is O(n^3), as reported in Pons and Latapy (walktrap, 2004 [14]).
Label Propagation Algorithm (LP): This algorithm by Raghavan et al. [57] uses the concept of node neighborhood and the diffusion of information in the network to identify communities. Initially, each node is labeled with a unique value. Then an iterative process takes place, in which each node takes the label that is most widespread in its neighborhood. This process goes on until the stopping condition, no label change, is met. The resulting communities are defined by the final label values, with complexity O(n + m) for each iteration, as reported in Raghavan et al. (label propagation, 2007 [57]).
Geometric Brownian motion (GBM): This concept was borrowed from statistical physics by Zhou et al. [47] and extended by Jin et al. [60] with the inclusion of the concept of bispace. This method develops the notion of Brownian motion on networks to compute the influences between the nodes, which are used to discover the communities of social networks. The complexity of the algorithm is O(n^3), as reported in Jin et al. (GBM, 2014 [60]).

Proposed Algorithm for Network Community Detection Using Nearest Neighbor Search
In this subsection we describe the k-central algorithm for network community detection using nearest neighbor search in a complex network. Community detection by partitioning the graph becomes possible with nearest neighbor search because the nodes of the graph are converted into points of a metric space. The algorithm converges automatically and does not compute the value of an objective function in its iterations, which reduces the computation compared with standard methods. The k-central algorithm for community detection (Algorithm 3) is given below.
4: for j = 1 to k do
5: ...
6: end for
7: end for
8: for j = 1 to k do
9: ...
10: end for
11: until |cost(T^t) − cost(T^{t+1})| = 0
12: return T = {C_1, C_2, . . ., C_k}
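A minimal sketch of a k-central style iteration over a precomputed node distance matrix is given below. It follows the two steps used in the convergence argument (assign each node to its nearest center, then re-center each community) and the stopping rule on the cost. Several details are assumptions of this example rather than the paper's listing: centers are restricted to nodes and updated to the medoid of each community, initial centers are passed in by the caller, and `max_iter` is a safety cap.

```python
from typing import List, Sequence, Tuple


def k_central(dist: Sequence[Sequence[float]], centers: Sequence[int],
              max_iter: int = 100) -> Tuple[List[int], List[int], float]:
    """Return (labels, centers, cost) for a precomputed distance matrix."""
    n = len(dist)
    centers = list(centers)
    k = len(centers)
    labels = [0] * n
    prev_cost = float("inf")
    cost = prev_cost
    for _ in range(max_iter):
        # Step 1: assign every node to its nearest center (nearest neighbor query).
        labels = [min(range(k), key=lambda j: dist[i][centers[j]]) for i in range(n)]
        # Step 2: re-center each community at its medoid, i.e. the member with
        # the smallest total distance to the other members (an assumption here).
        for j in range(k):
            members = [i for i in range(n) if labels[i] == j]
            if members:
                centers[j] = min(members,
                                 key=lambda c: sum(dist[i][c] for i in members))
        # Community-wise total distance from the corresponding centers.
        cost = sum(dist[i][centers[labels[i]]] for i in range(n))
        if cost == prev_cost:  # |cost(T^t) - cost(T^{t+1})| = 0
            break
        prev_cost = cost
    return labels, centers, cost
```

In practice the exact assignment in step 1 is where a metric tree or LSH index over the centers could be substituted for the linear scan.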

Complexity And Convergence
The complexity of network community detection algorithms is among the least studied research topics in network science. However, the rate of convergence is one of the important issues of algorithmic complexity, and a low rate of convergence is the major pitfall of most of the existing algorithms. Due to the transformation into a metric space, our algorithm benefits from the quick convergence of k-partitioning on a metric space, provided a good set of initial points is supplied. Another crucial pitfall suffered by the majority of the existing algorithms is the validation of the objective function in each iteration during convergence. Our algorithm converges automatically to the optimal partition, which reduces the cost of validation during convergence.
Theorem 4. During the course of the k-center partitioning algorithm, the cost (community-wise total distance from the corresponding centers) monotonically decreases.
Proof. Let Z^t = {z^t_1, . . ., z^t_k} and T^t = {C^t_1, . . ., C^t_k} denote the centers and clusters at the start of the t-th iteration of the k-partitioning algorithm. The first step of the iteration assigns each data point to its closest center; therefore cost(T^{t+1}, Z^t) ≤ cost(T^t, Z^t). In the second step, each cluster is re-centered at its mean; therefore cost(T^{t+1}, Z^{t+1}) ≤ cost(T^{t+1}, Z^t).
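The two inequalities of the proof can be chained into a single statement. The explicit form of the cost below is an assumption based on the parenthetical description in Theorem 4 (community-wise total distance from the corresponding centers):

```latex
\mathrm{cost}(T, Z) \;=\; \sum_{j=1}^{k} \sum_{x \in C_j} d(x, z_j),
\qquad
\mathrm{cost}(T^{t+1}, Z^{t+1}) \;\le\; \mathrm{cost}(T^{t+1}, Z^{t}) \;\le\; \mathrm{cost}(T^{t}, Z^{t}).
```

Since the cost is a sum of distances, it is bounded below by zero, so the non-increasing sequence of costs converges.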
The main achievement of our algorithm is that it can draw on the rich literature of clustering using nearest neighbors. Clustering, although NP-hard, is easier to handle in a metric space, whereas graph partitioning is NP-hard. Our algorithm converges automatically to the optimal clustering. Unlike popular approaches, it does not need to verify the value of the objective function to guide the next iteration, which saves computation time.

Experiments and Results
In this section we describe in detail several experiments to assess the proposed nearness measure for the nodes of the network, the efficiency of several approximation schemes for computing node nearness, and the performance of the proposed algorithm for community detection. The experiments are detailed below along with their parameter settings, results and conclusions.

Experimental Designs
We performed three different experiments to assess the performance of the proposed network nearest neighbor search for community detection. The first experiment is designed to evaluate the nearness measure, the second to explore the effectiveness of approximate nearest neighbor search for network community detection, and the third to verify the behavior of the algorithm and the time it requires. One of the major goals of the last experiment is to compare the behavior of the algorithm with that of other popular methods in the literature in terms of standard modularity measures. Experiments are conducted over several real networks (Table 2) to compare the results (Tables 5 and 6) of our algorithm with the state-of-the-art algorithms (Table 1) available in the literature in terms of modularity, the measure most preferred by researchers in the domain of network community detection. The details of the experiments and the analysis of the results are given in the following subsections.

Performance Indicator
Modularity: Modularity is the most popular quality measure for network community detection. The modularity index assigns high scores to communities whose internal edges are more numerous than expected in a random-network model that preserves the degree distribution of the given network.
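As an illustration, the sketch below computes the Newman-Girvan form of modularity, Q = Σ_c [e_c / m − (deg_c / 2m)^2], for an undirected, unweighted edge list. It is assumed here, not stated in the text, that this is the variant meant by "standard modularity"; the function name and the dictionary-based community assignment are likewise illustrative.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def modularity(edges: Iterable[Tuple[Hashable, Hashable]],
               community: Dict[Hashable, Hashable]) -> float:
    """Newman-Girvan modularity of a partition of an undirected, unweighted graph."""
    edges = list(edges)
    m = len(edges)
    internal = defaultdict(int)  # e_c: edges with both endpoints in community c
    degree = defaultdict(int)    # deg_c: total degree of the nodes in community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by one edge, split into their natural communities.
bridge_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q_split = modularity(bridge_edges, split)  # 5/14, about 0.357
```

Placing all six nodes in a single community gives a modularity of 0, so the split into the two triangles scores higher, as expected.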

Datasets
Real networks drawn from several real life interactions are considered in our experiments; they are shown in Table 2. We also list the number of nodes, the number of edges, the average diameter, and the k value used in Subsection 5.2. The values of the last column can be used to assess the quality of the detected communities.
In this experiment we assess the usefulness of the proposed nearness measure between the nodes of a complex network. For this purpose we equipped our algorithm with different measures of nearness alongside our measure. The experimental steps are as follows:
Nearness measures: Six different measures are used to construct the distance based community detection. They are the Jaccard coefficient (JA), preferential attachment (PA), the Katz measure (KM), commute time (CT), PageRank (PR) and the proposed metric (PM). The measures are discussed in Subsection 2.2 and the proposed metric is detailed in Subsection 2.3.
Algorithm: The community detection algorithm proposed in Section 5 is used with exact nearest neighbors between nodes, and communities are computed based on each of the different nearness measures.
Network data: Different types of real network data are used (small, large, very sparse and relatively dense); they are listed in Table 2.
Results: The community structures obtained by the algorithm, equipped with the different measures of node nearness, are compared in terms of modularity and shown in Table 3.
Observation: It can be observed from Table 3 that the algorithm based on the proposed metric (shown in column PM) provides better modularity than the others for community detection.
In this experiment we explore the effectiveness of several approximation techniques for nearest neighbor search on complex networks, designed via the metric tree and locality sensitive hashing. For this purpose we equipped our algorithm with different data structures (metric tree and LSH) with varying approximation ratios. The experimental steps are as follows:
Metric and algorithm: The algorithms considered in this experiment use the proposed measure of node nearness detailed in Subsection 2.3. The community detection algorithm proposed in Section 5 is used in this experiment.
Approximation: Communities are computed using approximate nearest neighbors via the metric tree and locality sensitive hashing. Approximation precisions ranging from 0 to 0.5 are considered, and each is computed five times for both approximation schemes.
Network data: Different types of real network data are used to verify the acceptability of the degradation across networks; they are shown in Table 4.
Results: The community structures obtained by the algorithms, equipped with approximate nearest neighbors instead of exact measures of node nearness, are compared in terms of modularity and shown in Table 4.
Observations: Both approximation schemes are very good for community detection and degrade the results only slightly over the tested range of approximations (Table 4).
In this experiment we compare several algorithms for network community detection with our proposed algorithm, developed using nearest neighbor search in complex networks and discussed in Section 5. The experiment is performed on a large list of network data sets (Table 2).
Two versions of the experiment are developed for comparison purposes, based on modularity and on time taken in seconds. The results are shown in Tables 5 and 6, respectively. The experimental steps are as follows:
Design of experiment: In this experiment we compare three groups of algorithms for network community detection with the one based on nearest neighbor search, described above.
Best of literature: Regarding the three groups of algorithms, the first group contains algorithms based on semi-definite programming, the second group contains algorithms based on graph traversal approaches, and the third group contains algorithms based on spectral analysis. For each group, we report in Table 5 the best value of modularity among all the algorithms in the group. All the algorithms considered in this experiment are detailed in Section 5.
Other distance based methods: Three different methods of network community detection, which indirectly use the influence between the nodes in their algorithms, are also considered for comparison. These methods are walktrap (WT), label propagation (LP) and geometric Brownian motion (GBM); they are discussed in Section 5 along with their references and complexities.
Proposed methods: Three versions of the proposed algorithm are compared with the other algorithms: the proposed algorithm based on exact nearest neighbors, on approximate nearest neighbors computed using the metric tree, and on approximate nearest neighbors computed using locality sensitive hashing.
Network data: A long list of real network data, described in Table 4, is used for the evaluation of modularity and time.
Efficiency and time: The community structures obtained are compared in terms of modularity and time (in seconds) taken by the algorithms, as shown in Tables 5 and 6, respectively.
The results obtained with our approach are very competitive with most of the well known algorithms in the literature, and this holds over the large collection of datasets. In addition, the time (in seconds) taken by our algorithm (Table 6) is considerably lower than that of the other methods, which agrees with the theoretical findings.

Results Analysis and Achievements
In this subsection, we analyze the results obtained in our experiments. The results of the first experiment indicate that the proposed distance is more useful for extracting the community structure of a complex network than the other measures of similarity. The results of the second experiment verify that the approximate distance is also useful for network community detection, especially for large data where time is a major concern. The results of the third experiment indicate that the proposed algorithm for community detection is very efficient compared with other existing methods in terms of modularity and time.

Conclusions
In this paper, we studied the interesting problem of nearest neighbor search among the nodes of a complex network and applied it to community detection. We used a geometric framework for network community detection instead of the traditional graph theoretic approach or spectral methods. Nearest neighbor search in complex networks cannot be achieved straightforwardly; we presented the transformation of the graph to a metric space and the efficient computation of nearest neighbors therein using the metric tree and locality sensitive hashing. To validate the performance of the proposed nearest neighbor search designed for complex networks, we applied our approaches to a community detection problem. Through several experiments, we found that community detection using nearest neighbor search is very efficient and time saving for large networks due to good approximations. The results obtained on several network data sets demonstrate the usefulness of the proposed method and provide motivation for applying nearest neighbor search to other structural analyses of complex networks.

Table 1. Algorithms for network community detection and their complexities.

Table 2. Complex network datasets and values of their parameters.

Table 5. Comparison of our approaches with other best methods in terms of modularity.

Table 6. Comparison of our approaches with other best methods in terms of time.