A General Definition of Network Communities and the Corresponding Detection Algorithm

Network structures, consisting of nodes and edges, have applications in almost all subjects. A set of nodes is called a community if the nodes have strong interrelations. Industries (including cell phone carriers and online social media companies) need community structures to allocate network resources and provide proper and accurate services. However, the current detection algorithms are each motivated by a particular practical problem, so their applicability in other fields is open to question. Consequently, for a new community problem, researchers need to derive algorithms ad hoc, which is arduous and often unnecessary. In this paper, we present a general procedure to find community structures in practice. We mainly focus on two typical types of networks: transmission networks and similarity networks. We reduce them to a unified graph model, based on which we propose a general method to define and detect communities. Readers can specialize our general algorithm to accommodate their problems. In the end, we also give a demonstration to show how the algorithm works.


I. INTRODUCTION
Our real world consists of elements associated by relations. We call the entity made up of the elements and the relations among them a network. Most real-world networks are not random: they exhibit strong inhomogeneity and a high level of order and organization [1]. For example, people working in a company may have much closer relations with each other than with people outside the company. This observation inspires people to partition the elements into groups (communities) such that the relations are strong and dense within the groups but sparse and weak among them [2][3][4].
Community detection has widespread applications. Amazon groups customers who buy similar products together for better product recommendations. Facebook clusters users by relationships, hobbies, etc. to suggest new friends and circles accurately. Carriers group locations among which customers have high transportation demands for a proper assignment of vehicles.
Most algorithms work well in the areas from which they are derived, but their reliability outside those areas is controversial. The poor adaptability of the algorithms reveals the demand for general community detection methods as well as general community definitions. Besides, most algorithms start from graphs directly, while the procedure of abstracting the mathematical model is rarely formalized. In this paper, we try to tackle these problems. In particular, we ask: 1) Is there a general way to abstract a concrete problem into a unified mathematical model? 2) Based on the unified model, are there common properties shared by most community definitions? 3) Is there a common method to detect community structures?
The rest of the paper is organized as follows. In Section II, we review the popular community detection algorithms and discuss their pros and cons. In Section III, we introduce how to reduce a concrete problem to a graph model. Based on this, we define the community structure in Section IV. In Section V, we propose and prove some propositions regarding our community definition and then provide a corresponding detection algorithm. A demonstration is given in Section VI and, finally, we discuss the limitations of our model as well as future work in Section VIII.

II. RELATED WORK
Research on community detection started from solving concrete problems. For instance, the Kernighan-Lin algorithm [5] is designed for clustering digital components into communities of equal or nearly equal size such that the connections among the sets are minimized (for cost and stability reasons, electronic engineers need to minimize the number of connections among boards).
Researchers usually reduce real-world network structures to nodes connected by edges, where the edges represent the relations among the nodes. Although the meaning of a relation varies across papers, there are two main types: two nodes are related if 1) there is material transmission between them, and/or 2) they share some identical or similar properties.
The material means concrete objects (like goods) or information (like data packages). One of the most typical examples is the transportation among cities. The cities easily communicating with each other are grouped into one community [25].
There are also many network structures constructed from the similarity of the properties of the nodes. In protein-protein interaction networks, biologists cluster proteins with equivalent or similar functions into one group [26]; the relations are then the functional similarities of the proteins. In social networks, people active in similar locations and/or time slots can be considered a community; the relations then represent location and schedule similarity. In the World Wide Web, communities may correspond to groups of pages concerned with related topics or events [27,28]; the relations then become content similarity.
Based on the graph model, lots of community detection algorithms have been proposed.
Graph partitioning methods group the vertices into a predefined number of communities and minimize the number of edges among the groups. Most algorithms of this kind solve particular practical problems well. However, they are not well suited to community detection because they require the number of groups to be specified in advance, which is in general unknown in community detection problems [1]. More seriously, graph partitioning is not derived from an explicit definition of communities, so there is no guarantee that the vertex groups it finds are communities in the intuitive sense.
Real-world networks commonly have hierarchical structures, which the abstracted graph models usually inherit. The corresponding detection algorithms fall into two types: agglomerative (bottom-up) and divisive (top-down) approaches [6]. Briefly, an agglomerative approach starts by considering each node as a community and merges pairs of communities as it moves up the hierarchy. A divisive approach works in the opposite way: it starts by grouping all the nodes in one cluster and splits recursively as it moves down the hierarchy.
Partitional clustering also plays an important role in the graph clustering family. In order to apply the algorithm, the user must specify the number of clusters, which causes the same disadvantages as the ones belonging to graph partitioning [1]. The method puts all nodes in a metric space, and thus the distance between each pair of the nodes in the space is defined. The distance used in the algorithm can be considered as a measure of dissimilarity between the nodes. The algorithm needs to cluster the nodes into a pre-specified number of groups which minimize a given cost function.
All community detection methods and techniques based on matrix eigenvectors belong to spectral clustering. The method requires a distance function to measure the similarity among the objects. The fundamental idea is to use the eigenvectors to cluster objects by connectedness rather than by distance. Although there are many successful applications in image segmentation and machine learning, researchers have found several fundamental limitations. For instance, spectral clustering can fail when it uses the first k eigenvectors to find k clusters and the clusters have various scales with respect to a multiscale landscape potential [29]. Besides, the algorithm relies on partitional clustering, whose drawbacks it inherits.
Most of the aforementioned algorithms perform remarkably well on the problems from which they are derived, and their complexities have been optimized for community detection in dynamic networks [16,30,31]. However, researchers have rarely focused on a general definition of communities and corresponding general detection algorithms. A few preliminary works exist. In particular, different community definitions and the corresponding detection algorithms have been compared [32]. Besides, some researchers believe that the definition often depends on the specific system at hand and/or the application one has in mind [1].

III. CONCRETE PROBLEM REDUCTION
Most people believe that a community is a set of objects among which the interrelations are strong. However, there is much debate about how relations should be defined and measured. Generally, we derive the relations in two ways.
One is based on material transmission. The material here can be either a physical substance (like goods) or just information (like data). Intuitively, objects that can easily communicate with each other should have strong relations among them, and those objects can then be considered a community.
The other is based on similarity (for example, people who buy similar kinds of books might be considered a community). Usually, we believe that the objects in the same community also share other similar properties. We can then make reasonable predictions at the community level (for example, Amazon uses this trick to recommend commodities).
Although the two derivations come from different motivations, we show that they can be reduced to the same graph model.
A. Transmission network
1) Transmission relation characteristics: To simplify the discussion, only one material is considered. Moreover, we need the following definitions and assumptions.
Assumption 1. For any material, there is a minimal unit that can be transferred. We call this minimal unit a point.
Definition 1 (Node). An object that receives and sends points is a node.
Definition 2 (Medium). An object that propagates points is a medium.
Assumption 2. The transmission relations are constructed in nodes, media and points, which also determine the properties of the relations.
Because of Assumption 2, the entity under the consideration consists of nodes, points and media. And we name it transmission network.
Two most important characteristics in a transmission network are the number of points transferred and the time consumed in a transmission process. Their ratio is termed speed.
Definition 3 (Speed). Speed is the number of points transferred in a unit time interval.
The transmission capability can be described by the speed function of time. For simplicity, only the node pairs connected by media directly are considered. By Assumption 2, the behaviour of the speed function f depends on the properties of the nodes, media and points. This means a specific analytic expression of the speed function cannot be given unless all the properties have been designated. However, some common characteristics can be expected. Suppose there is a pair of nodes (u, v) connected by media directly. Then (see Figure 1) the value of the speed function f (u,v) (t) remains zero until some point because of the latency caused by the sending, propagation and reception of the points. However, in the real world, TTS may never be reached, which leads to the meaninglessness of it. So instead, we can choose some proper threshold δ ∈ R * , the lowest acceptable speed. Correspondingly, we call the time to reach the threshold the critical moment (CM).

Remark 1. Suppose there are two towns A and B near a river. A is upstream of B. Consider the goods transportation on the river. Because of the water stream,
1) The shorter STT is, the stronger the corresponding relation is.
2) The shorter CM is, the stronger the corresponding relation is.
3) The shorter the time to transfer a certain number of points is, the stronger the corresponding relation is.
Although a reasonable SRSM cannot be constructed until a concrete problem is given, several key properties should be shared by all SRSMs.
Specifically, in a transmission network, relations should not cancel out each other. So the SRSM is non-negative. Besides, the strongest relation in a transmission network should be the one that the node relates to itself as the transmission speed can be considered as infinity. Intuitively, all SRSMs should have the same value in this case, which is assigned zero. Moreover, the relation strength will get weaker if the function value increases. Therefore, we have the following definition.
Definition 4 (SRSM). Given some network, let N be the set of nodes in it and P ⊆ N × N the set of node pairs connected by media directly. Then a function s :

Remark 2. Since transmission functions generally take different values when the order of the parameters is changed, so do SRSMs. Namely, s(u, v) ≠ s(v, u) in general.
With the help of SRSM, the transmission network can be reduced to a graph model. In the graph, the nodes are represented by the vertices and connected by a weighted edge if they are connected by media directly. Moreover, the weights of the edges are assigned by a specific SRSM depending on the practical problem. In general, the graph should be directed. If s(u, v) = s(v, u) for all applicable pairs of nodes (u, v) in a network, the graph can also be considered as an undirected one.
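The reduction described above can be illustrated with a minimal Python sketch. It is not part of the original formulation: the function names `build_graph` and `is_undirected`, the dictionary-based graph representation and the river travel times are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

Node = Hashable
Graph = Dict[Node, Dict[Node, float]]


def build_graph(pairs: Iterable[Tuple[Node, Node]],
                srsm: Callable[[Node, Node], float]) -> Graph:
    """Weighted digraph with one edge u -> v per directly connected pair,
    weighted by srsm(u, v)."""
    graph: Graph = {}
    for u, v in pairs:
        weight = srsm(u, v)
        if weight < 0:
            raise ValueError("an SRSM must be non-negative")
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})  # make sure v also appears as a vertex
    return graph


def is_undirected(graph: Graph) -> bool:
    """True when every edge has a reverse edge of equal weight."""
    return all(graph[v].get(u) == w
               for u, nbrs in graph.items() for v, w in nbrs.items())


# Hypothetical river example: shipping downstream (A -> B) takes less
# time than shipping upstream (B -> A), so the two weights differ.
hours = {("A", "B"): 2.0, ("B", "A"): 5.0}
river = build_graph(hours, lambda u, v: hours[(u, v)])
```

Here `is_undirected(river)` is false because the two directions carry different weights, which is the situation in which the graph has to be treated as directed.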
In Section III-A3, the graph model will be used to define a more general relation strength measurement. Besides, node and vertex are used interchangeably, and so are relation and edge.
3) Relation strength measurement for transmission network: In this part, the discussion is based on the graph model. Firstly, some notations of graphs need to be introduced.
Let $G := (V_G, E_G)$ be a weighted graph, where $V_G$ and $E_G$ are the sets of vertices and edges of $G$ respectively. For any edge $e \in E_G$, its weight is denoted by $|e|$. If there is no ambiguity about the choice of the graph, $V_G$ and $E_G$ are abbreviated to $V$ and $E$.
In the previous section, SRSM measures the direct relations between any pair of nodes. That is, the retransmission function of nodes is not taken into account. Neither is the parallel transmission on various paths. However, in the real world, the retransmission and parallel transmission of points happen frequently (like express service and data transmission on the internet). So we have to derive a more general relation strength measurement function, which is named relation strength measurement for transmission network (RSMFTN).
For the same reason in the derivation of SRSM, an analytic expression of RSMFTN cannot be given until a concrete problem is designated. But a reasonable RSMFTN must hold several key properties.
First of all, the relation between any pair of nodes should be measurable. That is, the domain of RSMFTN is $V \times V$. The properties of SRSM should be inherited, so RSMFTN is also non-negative and $\mathrm{RSMFTN}(u, v) = 0$ if and only if $u = v$. Besides, there is no relation between nodes belonging to two disconnected components. In contrast to the coincidence of two nodes, this is the other extreme. So it is reasonable to set the value of RSMFTN to infinity ($\infty$), an element greater than any real number. The relation between vertices $u$ and $v$ gets stronger as the function value $\mathrm{RSMFTN}(u, v)$ approaches zero.
Consider a linear graph (Figure 2a). The points sent by $u$ and received by $v$ must be retransmitted by $w$. So the difficulty of transferring points from $u$ to $v$ is not less than that of transferring them from $u$ to $v$ via $w$. Notice that the difficulty for nodes to receive and send points has been included in the SRSM and is therefore reflected in the weights of the edges. Hence, equality should hold. In other words,
$$\mathrm{RSMFTN}(u, v) = \mathrm{RSMFTN}(u, w) + \mathrm{RSMFTN}(w, v).$$
In addition, the transmission difficulty does not increase if we add another retransmission node $s$ (see Figure 2b). Hence, in general, we have
$$\mathrm{RSMFTN}(u, v) \le \mathrm{RSMFTN}(u, w) + \mathrm{RSMFTN}(w, v).$$
Moreover, the ratio of the relation strengths (the relative relation strength) of two pairs of nodes should be fully determined by the relative magnitude of the weights. Also, for an undirected graph, the directions of edges are not taken into account; therefore,
$$\mathrm{RSMFTN}(u, v) = \mathrm{RSMFTN}(v, u).$$
To sum up, we can define the RSMFTN as follows.
Definition 5 (Relation strength measurement for transmission network). Suppose $G$ is some directed graph and $g : V_G \times V_G \to \mathbb{R} \cup \{\infty\}$. Then $g$ is an RSMFTN if, for all $u, v, w \in V_G$,
1. $g(u, v) \ge 0$.
2. $g(u, v) = 0$ if and only if $u = v$.
3. $g(u, v) = \infty$ if there is no path between $u$ and $v$.
4. $g(u, v) \le g(u, w) + g(w, v)$. Moreover, the equality holds if the two components that contain $u$ and $v$ are connected by the cutting node $w$.
5. Suppose $G'$ is a graph which is the same as $G$ except that the edges' weights in $G'$ are all $\alpha$ times the ones in $G$. Then for the corresponding vertices $u'$ and $v'$, $g(u', v') = \alpha g(u, v)$.
Example 1. Many measurements derived by other researchers are in fact RSMFTNs. A well-known one is the shortest distance function (SDF), which evaluates the shortest distance between a pair of nodes in a graph. Here is the proof.
Proof. Suppose u, v and w are arbitrary vertices in some directed graph G. Let g denote the SDF. We prove the proposition when G is weighted. The proof for the unweighted graph follows by setting the weights of the edges to one.
Since g(u, v) returns the sum of the weights on the shortest path from u to v, g(u, v) ≥ 0, so property 1 holds. Properties 2 and 3 hold by the definition of SDF.
For property 4, assume g(u, v) > g(u, w) + g(w, v). Consider the path p consisting of the shortest path from u to w followed by the shortest path from w to v. Its length l(p) is g(u, w) + g(w, v), which is shorter than g(u, v). So g(u, v) cannot be the length of the shortest path, which is a contradiction. Moreover, if there exists a cutting node w connecting the components containing u and v respectively, every path from u to v passes through w, so the shortest path between u and v can be split into the one from u to w and the one from w to v. So g(u, v) = g(u, w) + g(w, v). Therefore, property 4 holds.
For property 5, suppose $p$ is some shortest path from $u$ to $v$ in graph $G$. Then $l(p) = g(u, v)$. Let $p'$ denote its counterpart in $G'$. Since the weights in $G'$ are $\alpha$ times the ones in $G$, so is the length of $p'$. That is, $l(p') = \alpha l(p)$. Since $p'$ connects $u'$ and $v'$ in $G'$, we have $g(u', v') \le l(p')$. In other words, $g(u', v') \le \alpha g(u, v)$. Similarly, consider the reverse transformation from $G'$ to $G$, in which all the edge weights in $G$ are $\frac{1}{\alpha}$ times the ones in $G'$. By the same argument, $g(u, v) \le \frac{1}{\alpha} g(u', v')$, which is equivalent to $g(u', v') \ge \alpha g(u, v)$. Combining the two inequalities, we conclude that $g(u', v') = \alpha g(u, v)$. So property 5 holds.
Suppose G is an undirected graph, then by the commutativity and associativity of the addition operator, g(u, v) = g(v, u). In other words, the order to add the weights of the edges compounding the shortest path does not change the final result.
To sum up, SDF is an RSMFTN.
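The argument above can also be spot-checked numerically. The following Python sketch is only an illustration on one small graph, not a proof: it computes the SDF with the Floyd-Warshall algorithm (a standard all-pairs method chosen here for brevity) and asserts non-negativity, the triangle inequality and the scaling behaviour. The function names, the example digraph and the scaling factor `alpha` are assumptions of this sketch.

```python
import itertools
import math


def shortest_distances(vertices, weights):
    """All-pairs shortest distances (Floyd-Warshall); math.inf means 'no path'."""
    dist = {(u, v): 0.0 if u == v else weights.get((u, v), math.inf)
            for u in vertices for v in vertices}
    for w in vertices:  # w is the candidate intermediate vertex
        for u in vertices:
            for v in vertices:
                dist[u, v] = min(dist[u, v], dist[u, w] + dist[w, v])
    return dist


def check_sdf_properties(vertices, weights, alpha=2.0):
    """Assert the SDF properties on one concrete graph with positive weights."""
    g = shortest_distances(vertices, weights)
    scaled = {edge: alpha * x for edge, x in weights.items()}
    g_scaled = shortest_distances(vertices, scaled)
    for u, v in itertools.product(vertices, repeat=2):
        assert g[u, v] >= 0                        # property 1: non-negative
        assert (g[u, v] == 0) == (u == v)          # zero only for u == v
        assert all(g[u, v] <= g[u, w] + g[w, v]    # property 4: triangle inequality
                   for w in vertices)
        assert g_scaled[u, v] == alpha * g[u, v]   # property 5: scaling by alpha
    return g


# Hypothetical digraph: vertex d has no edges, so it is unreachable.
V = ["a", "b", "c", "d"]
W = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 4.0, ("c", "a"): 1.0}
g = check_sdf_properties(V, W)
```

In this example the direct edge from a to c (weight 4) is longer than the route through b, and the distance from any other vertex to d is infinite, matching the convention for disconnected nodes.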
B. Similarity Network
1) Similarity function: To simplify the discussion, we first give some fundamental definitions. We name the objects that have similarity relations nodes. Each node may have various properties. Moreover, there are several possible options for each property (for example, red, blue and yellow are possible options for the property colour), and we name these options cases.
The similarity network concerns the property similarity of nodes. We assume that, for a certain problem, the set of properties is fixed and that, for each property, there exists a similarity function that maps pairs of cases to real numbers. Intuitively, a measure of similarity should not be negative. Thus, we assume that the similarity function is non-negative. Besides, to keep consistency with the definition of SRSM given in Section III-A2, the function value increases as the similarity decreases. Moreover, for objects A and B, if A is similar to B, then B is also similar to A. We formalize this preliminary idea as follows.
Definition 6. Let $P$ be some property of the nodes and $C_P$ the set of possible cases of $P$. Then we say $s : C_P \times C_P \to \mathbb{R}$ is a similarity function if, for all cases $a, b \in C_P$,
1. $s(a, b) \ge 0$.
2. $s(a, b) = 0$ if and only if $a = b$.
3. $s(a, b) = s(b, a)$.
Since more than one property $P_1, P_2, \cdots$ might be considered in general, we need to define a list of similarity functions $s_{P_1}, s_{P_2}, \cdots$. For convenience, we write them in matrix form, that is, $[P_1\ P_2\ \cdots]$ and $[s_{P_1}\ s_{P_2}\ \cdots]$.
Two lists of similarity values cannot be compared directly, and the properties may not be equally important. Hence, we need a function that translates a list of parameters into a single index. Traditionally, a function that controls the importance of each factor in a list is called a weight function. So we have the following definition.
Definition 7 (Weight function). Suppose $N$ is the set of nodes under consideration in some problem. Let $P_1, P_2, \cdots$ denote the properties and $[s_{P_1}\ s_{P_2}\ \cdots]$ the list of the corresponding similarity functions. A function of functions $w$ mapping $[s_{P_1}\ s_{P_2}\ \cdots]$ to a non-negative function $f$ is called a weight function.
Remark 3. The choice of the weight function depends on the practical problem we try to solve. A trivial weight function is just a list of weights. In more detail, suppose $[s_{P_1}\ s_{P_2}\ \cdots\ s_{P_n}]$ is a list of similarity functions. Let $\alpha_1, \alpha_2, \cdots, \alpha_n$ be non-negative weights indicating the importance of each property. Then $w = [\alpha_1\ \alpha_2\ \cdots\ \alpha_n]$ can be a potential weight function, and $f$ is
$$f = \sum_{i=1}^{n} \alpha_i s_{P_i},$$
which is a non-negative function.
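A weighted-sum weight function of this kind can be sketched in a few lines of Python. The sketch assumes that each node is represented as a tuple of cases, one per property, and that the weights are non-negative; the two per-property functions, the weights and the sample nodes are hypothetical and only serve as an illustration.

```python
def make_weighted_similarity(similarities, weights):
    """Return f(u, v) = sum_i weights[i] * similarities[i](u[i], v[i]).

    u and v are tuples holding one case per property, in the same order
    as `similarities` and `weights`.
    """
    if len(similarities) != len(weights):
        raise ValueError("one weight per similarity function is required")
    if any(a < 0 for a in weights):
        raise ValueError("weights must be non-negative")

    def f(u, v):
        return sum(a * s(cu, cv)
                   for a, s, cu, cv in zip(weights, similarities, u, v))

    return f


# Hypothetical per-property functions; larger values mean less similar.
def colour_dissimilarity(a, b):
    return 0.0 if a == b else 1.0


def age_dissimilarity(a, b):
    return abs(a - b) / 10.0


f = make_weighted_similarity([colour_dissimilarity, age_dissimilarity],
                             [2.0, 1.0])
alice = ("red", 30)
bob = ("blue", 35)
```

With these choices colour counts twice as much as age, `f(alice, alice)` is zero, and `f` is symmetric because each per-property function is symmetric.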

2) Relation strength measurement for similarity network: The weight function can generate a function $f$ to measure the similarity of a pair of nodes. However, nothing so far guarantees that $f$ always gives a measurement that follows our intuition. In particular, we require $f$ to satisfy the following properties. Suppose $N$ is the set of nodes under consideration and $P$ the set of properties. Then for $u$, $v$ and $w$ in $N$, we have
1. $f(u, v) \ge 0$.
2. $f(u, v) = 0$ if and only if $u$ and $v$ have exactly the same cases for all properties in $P$.
3. $f(u, v) \le f(u, w) + f(w, v)$.
4. $f(u, v) = f(v, u)$.
The first two properties are inherited from the similarity function. Since the similarity relation should be symmetric (that is, if A is similar to B, then B is also similar to A), we have property 4. Besides, property 3 states that the direct measurement between any pair of nodes is not greater than the sum of the measurements via an intermediate node. Since this function is defined for similarity measurement, we name it the relation strength measurement for similarity network (RSMFSN).

Remark 4. In other words, f is a distance function. In fact, the example we give in Remark 3 is an RSMFSN.

C. Relations between similarity network and transmission network
In many cases, there are very strong relations between similarity networks and transmission networks. A typical example is pathogen infection among species. Consider the DNA similarity among organisms: it is easier for a given pathogen to infect organisms that have similar DNA. In other words, the ease of pathogen transmission is positively related to DNA similarity. Therefore, the relative relation strength among the organisms should be similar whichever relation type we consider.
Since both RSMFTN and RSMFSN measure the relations among the nodes in networks, and the follow-up propositions are based on their shared properties, we refer to both as relation strength measurement (RSM) in the sequel.

IV. COMMUNITIES
After defining RSM, the definition of communities can be derived. Before giving a formal definition, an important problem needs to be discussed: the transitivity of the community relation. In other words, if A and B are contained in one community and so are B and C, can we also say that A and C are in one community? In general, this implication does not hold. A typical counterexample is "your friend's friends may not be your friends". So the relation strengths between all pairs of nodes should be considered when we define a community. Moreover, since only groups of nodes with mutually strong enough relations are considered communities, a relation strength threshold (community parameter) needs to be designated.
Definition 8 (Community). Suppose $W^G \subseteq V_G$ for some directed graph $G$, $\epsilon \in \mathbb{R}^*$, and $g$ is some RSM. Then $W^G$ is a community with respect to RSM $g$ and constant $\epsilon$ if and only if $g(u, v) \le \epsilon$ for all $(u, v) \in W^G \times W^G$. Moreover, we say $\epsilon$ is the community parameter (CP) of $W^G$ with respect to $g$. If there is no ambiguity about the choice of RSM and CP, we briefly say $W^G$ is a community.
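Definition 8 translates directly into a pairwise check. The Python sketch below is a minimal illustration; the name `is_community`, the parameter name `eps` for the community parameter and the table of RSM values are assumptions made for this example (the values were chosen so that they satisfy the triangle inequality).

```python
from itertools import product


def is_community(nodes, g, eps):
    """True when g(u, v) <= eps for every ordered pair (u, v) of `nodes`."""
    return all(g(u, v) <= eps for u, v in product(nodes, repeat=2))


# Hypothetical directed RSM values on three nodes.
table = {("a", "b"): 1, ("b", "a"): 2,
         ("b", "c"): 1, ("c", "b"): 1,
         ("a", "c"): 1, ("c", "a"): 3}


def g(u, v):
    # An RSM is zero exactly on identical nodes.
    return 0 if u == v else table[u, v]
```

With `eps = 2`, the sets {a, b} and {b, c} are communities but {a, c} is not, because the relation from c to a is too weak. This is the non-transitivity discussed before the definition.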
Since the CP gives a threshold of the relation strength, whichever pair of nodes we choose in a community, the relation strength of the pair cannot be weaker than the one the CP represents. So for problems concerned with the worst case, the CP can be designated according to some CM with a certain threshold (see Section III-A1). Then the inner structure of the community can be ignored since the poorest performance of the community satisfies the requirement. In other words, a community can be considered as a relatively independent entity, and the CP is a global property of it.
Remark 5. Notice that the definition is based on a set of vertices instead of the subgraph used in many other papers. Besides, it is worth emphasizing that the choice of communities usually considers the topology of the whole graph rather than the local one (this shows that the community is a higher-level structure based on the original graph). Since the results might differ for various choices of graph topology, superscripts are used to make the description clear (for example, $W^G$ means that $W$ is a vertex set and graph $G$ is the working topology).

COMMUNITIES
Based on the definition of RSM, an adjoint complete graph can be derived to record all the relation strengths. More precisely, the weights of the edges in the adjoint graph are determined by the corresponding RSM.
Definition 9 (Adjoint complete digraph). Suppose $G$ is some directed graph and $g$ is some RSM. Let $E = V_G \times V_G$ be a new set of edges whose weights are assigned by $g$. Then the adjoint complete digraph is $\mathrm{adj}(G, g) := (V_G, E)$.
The definition of communities uses the CP as a threshold of the relation strength. That is, if a relation is not strong enough, it is ignored during community detection. Moreover, for any pair of nodes, the definition of communities requires sufficiently strong relations in both directions. Hence, we can remove the relations that do not satisfy the requirement to simplify our graph without changing the result of community detection. With this trick in mind, we have the following transformation.
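Definition 10 (Refinement transformation). Suppose $G$ is a directed weighted graph, $D_G \subseteq E_G$, and $\epsilon \in \mathbb{R}^*$ is some CP. The refinement transformation is defined as
$$R_\epsilon(D_G) := \{(u, v) \in D_G : (v, u) \in D_G,\ |(u, v)| \le \epsilon \text{ and } |(v, u)| \le \epsilon\}.$$
Besides, all the edges' weights are set to 1 after applying the transformation.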
Remark 6. For convenience, the relations (u, v) and (v, u) are together denoted u ↔ v. In this case the weight is not applicable.
The definition of the refinement transformation shows that if some edge $(u, v)$ is in $R_\epsilon(D_G)$, then so is $(v, u)$. Moreover, the weights become unnecessary since they all equal one. Therefore, in $R_\epsilon(D_G)$, there is no need to consider the directions and weights of the edges anymore. So, from now on, $R_\epsilon(D_G)$ is regarded as a set of undirected unweighted edges. Moreover, if $(u, v) \in R_\epsilon(D_G)$, then we say the relation between $u$ and $v$ is reserved or, briefly, $u \leftrightarrow v$ is reserved.
In fact, the refinement transformation is a higher order function that applies a Boolean function to each relation in the set of edges. The Boolean function here determines whether the given relation is strong enough to be considered in the community detection. So for a certain refinement transformation, the reservation of the relation depends on the strength of the relation itself rather than the topology in which the relation is.
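Viewed as a higher-order function, the refinement transformation can be sketched in Python as a filter driven by a Boolean predicate. This is only an illustration under stated assumptions: edges are stored as a dictionary from ordered pairs to weights, reserved relations are returned as unordered pairs (`frozenset`), self-loops are skipped, and the names `refine` and `strong_enough` are hypothetical.

```python
def refine(edges, eps):
    """Keep the relations u <-> v whose weights are at most eps in BOTH directions.

    edges maps ordered pairs (u, v) to weights. The result is a set of
    unordered pairs, reflecting that directions and weights are dropped.
    """
    def strong_enough(u, v):
        # Boolean predicate applied to one directed relation.
        return (u, v) in edges and edges[u, v] <= eps

    return {frozenset((u, v)) for (u, v) in edges
            if u != v and strong_enough(u, v) and strong_enough(v, u)}


# Hypothetical weighted directed edges.
edges = {("a", "b"): 1, ("b", "a"): 2,
         ("a", "c"): 1, ("c", "a"): 3,
         ("b", "c"): 3}
reserved = refine(edges, 2)
```

Only the relation between a and b survives here: the relation from c to a is too weak, and b to c is both too weak and one-directional. Whether a relation is kept depends only on its own two weights, so applying `refine` to any subset of `edges` that contains both directions of a and b keeps that relation as well.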
Proof. Pick $(u, v) \in R_\epsilon(F_1)$ arbitrarily. So $u \leftrightarrow v$ is in $F_1$ and is reserved after applying the refinement transformation. Since
From the adjoint complete digraph, we can then derive a simplified undirected unweighted graph whose edges represent the two-direction relations strong enough to construct communities.
Definition 11 (Effective edge graph). Suppose $G$ is some directed graph, $g$ is some RSM and $\epsilon$ is some CP. Then the effective edge graph is $(V_G, R_\epsilon(E_{\mathrm{adj}(G,g)}))$ and is denoted by $\mathrm{eeg}_{(g,\epsilon)}(G)$. Moreover, suppose the vertex set $A^G$ is a subset of $V_G$. The full subgraph of $\mathrm{eeg}_{(g,\epsilon)}(G)$ over $A^G$ is denoted by $\mathrm{eeg}_{(g,\epsilon)}(G)[A^G]$.
Lemma 2. Let $g$ be some RSM, $\epsilon \in \mathbb{R}^*$ some CP and $G$ some directed weighted graph. Assume $W^G \subseteq V_G$. Then the vertex set $W^G$ is a community if and only if, for all $u, v \in W^G$, $u \leftrightarrow v$ is reserved after applying the refinement transformation.
Proof. The refinement transformation removes all the relations that cannot be used in a community structure. In other words, if all relations are reserved after applying the refinement transformation, the relation between any pair of nodes is strong enough. This is exactly what the definition of communities requires. So $W^G$ is a community. On the other hand, if $W^G$ is a community, the relation (in both directions) between any pair of nodes is strong enough. Thus, all of them are reserved after applying the refinement transformation.
Proof. Suppose $S$ is a full subgraph of $C_n$ for some $n$. Then for arbitrary vertices $u$ and $v$ in $S$, edge $(u, v) \in E_{C_n}$. So by the definition of full subgraphs, $(u, v) \in E_S$. That is, there is an edge between every pair of nodes in $S$. So $S$ is a complete graph.
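Theorem 1. Suppose $G$ is some directed graph, $g$ some RSM and $\epsilon$ some CP. Then a vertex set $A^G \subseteq V_G$ is a community if and only if $\mathrm{eeg}_{(g,\epsilon)}(G)[A^G]$ is complete.
Proof.
(⇒) Suppose $A^G$ is a community. Pick vertices $u$ and $v$ in $A^G$ arbitrarily. Since $A^G$ is a community, all the relations are reserved after applying the refinement transformation, so $u \leftrightarrow v$ is an edge of $\mathrm{eeg}_{(g,\epsilon)}(G)[A^G]$. Since we pick $u$ and $v$ arbitrarily in $A^G$, there is an edge between any pair of nodes in $\mathrm{eeg}_{(g,\epsilon)}(G)[A^G]$. So $\mathrm{eeg}_{(g,\epsilon)}(G)[A^G]$ is complete.
(⇐) Suppose $\mathrm{eeg}_{(g,\epsilon)}(G)[A^G]$ is complete. Then all the relations in $E_{\mathrm{adj}(G,g)[A^G]}$ are reserved after applying the refinement transformation. Therefore, $A^G$ is a community.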
It is easy to find that all single nodes can be considered as a community because they relate to themselves trivially, and RSM is zero. However, this kind of result does not follow our intuition since the community should be some set of nodes. The definition of the maximal community tackles this problem. For a better understanding of the definition, a theorem needs to be introduced first.
Theorem 2. Suppose $G$ is a directed graph, $\epsilon$ a CP and $g$ an RSM. Besides, suppose $A^G \subseteq B^G \subseteq V_G$. If $B^G$ is a community, then $A^G$ is also a community.
Proof. Since $B^G$ is a community, by Theorem 1, $\mathrm{eeg}_{(g,\epsilon)}(G)[B^G]$ is complete. Moreover, pick $u, v \in A^G$ arbitrarily. Then $u, v \in B^G$ as well. Notice that $\mathrm{eeg}_{(g,\epsilon)}(G)[B^G]$ is complete, so $u \leftrightarrow v$ is reserved. Hence $\mathrm{eeg}_{(g,\epsilon)}(G)[A^G]$ is complete and, by Theorem 1, $A^G$ is a community.
Theorem 2 shows that, if $B^G$ can be considered a community with respect to some RSM and CP, then all the subsets of $B^G$ can be considered communities. This observation leads to the definition of maximal community.
Definition 12 (Maximal community). Suppose $G$ is a directed graph and an RSM and a CP are given. Moreover, $A^G$ is a subset of $V_G$. Then $A^G$ is a maximal community if and only if
1. $A^G$ is a community, and
2. there is no $B^G \subseteq V_G$ such that $B^G$ is a community and $A^G \subsetneq B^G$.
With the maximal community definition in mind, we introduce an algorithm to detect maximal communities when the RSM and CP are specified. For ease of explanation, we define problem A as follows.
Definition 13 (Problem A). Given some adjoint graph $\mathrm{adj}(G, g)$ and CP $\epsilon$, find all the maximal communities in $G$ (the set of the maximal communities is denoted by $\Psi$).
Definition 14 (Problem B). Given the effective edge graph $\mathrm{eeg}_{(g,\epsilon)}(G)$, find all the maximal cliques in $\mathrm{eeg}_{(g,\epsilon)}(G)$ (the set of the maximal cliques is denoted by $\Phi$).
The following theorem shows the equivalence of problem A and problem B.
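Theorem 3. Suppose $G$ is a directed graph, $g$ an RSM, a CP is given, and $A_G$ is a subset of $V_G$. Then $A_G$ is a maximal community if and only if $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[A_G]$ is a maximal clique in the graph $\mathrm{eeg}_{(g,\mathrm{CP})}(G)$. Therefore, Ψ = Φ.
Proof. (⇒) Suppose $A_G$ is a maximal community. By Theorem 1, $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[A_G]$ is complete. Assume it is not a maximal clique. Then there exists some $B_G \subseteq V_G$ such that $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[B_G]$ is complete and $A_G \subseteq B_G$. Moreover, the equality $A_G = B_G$ cannot hold. Otherwise, the two complete graphs would coincide and $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[A_G]$ would already be a maximal clique. Therefore $A_G \subsetneq B_G$. Besides, since $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[B_G]$ is complete, $B_G$ is a community. Hence, $A_G$ cannot be a maximal community, which is a contradiction.
(⇐) Suppose $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[A_G]$ is a maximal clique. Since $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[A_G]$ is complete, $A_G$ is a community. Assume $A_G$ is not maximal. Then there exists some community $B_G$ with $A_G \subsetneq B_G$, so that $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[B_G]$ is complete and, by Lemma 1, properly contains $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[A_G]$. Therefore, $\mathrm{eeg}_{(g,\mathrm{CP})}(G)[A_G]$ cannot be maximal, which is a contradiction.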
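The equivalence stated in Theorem 3 can be checked on a toy example. The following Python sketch is only an illustration, not the implementation used in this paper. It assumes a symmetric matrix `strength` of RSM values (hypothetical data) and treats a node set as a community when every pair of its nodes has an RSM value of at most the CP, which is an assumed reading of Theorem 1. All function names are illustrative. The sketch compares a brute-force search for maximal communities (Ψ) with a basic Bron-Kerbosch enumeration of maximal cliques (Φ).

```python
from itertools import combinations


def effective_edge_graph(strength, cp):
    """Adjacency sets of the thresholded graph: u ~ v iff strength[u][v] <= cp."""
    n = len(strength)
    return {u: {v for v in range(n) if v != u and strength[u][v] <= cp}
            for u in range(n)}


def bron_kerbosch(adj):
    """Return the set of maximal cliques (as frozensets) of an undirected graph."""
    cliques = set()

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already processed vertices
        if not p and not x:
            cliques.add(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def maximal_communities_brute_force(strength, cp):
    """Enumerate every community, then keep those with no proper superset."""
    n = len(strength)

    def is_community(nodes):
        return all(strength[u][v] <= cp for u, v in combinations(nodes, 2))

    communities = [frozenset(c)
                   for k in range(1, n + 1)
                   for c in combinations(range(n), k)
                   if is_community(c)]
    return {a for a in communities if not any(a < b for b in communities)}


# Hypothetical symmetric RSM values for four nodes; only the pair (0, 3) is weak.
strength = [[0, 1, 1, 3],
            [1, 0, 1, 1],
            [1, 1, 0, 1],
            [3, 1, 1, 0]]
cp = 1.5

phi = bron_kerbosch(effective_edge_graph(strength, cp))  # maximal cliques
psi = maximal_communities_brute_force(strength, cp)      # maximal communities
print(sorted(sorted(c) for c in phi))  # [[0, 1, 2], [1, 2, 3]]
print(phi == psi)                      # True
```

The brute-force search is exponential in the number of nodes and serves only as a cross-check on very small inputs.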
Remark 7. The Bron-Kerbosch algorithm [33] is a well-known algorithm for finding maximal cliques. Hence, by Theorem 3, we can reduce problem A to problem B and apply the Bron-Kerbosch algorithm to find Φ, which equals Ψ. Figure 3 shows the relationships among the important concepts and transformations introduced.
In more detail, suppose $G$ is some graph, $g$ some RSM, and a CP is given. Moreover, all the vertices in $G$ have been indexed from 1 to $|V_G|$. Then we have the following algorithm:
procedure FINDMAXIMALCOMMUNITIES
In this section, we demonstrate how our new algorithm works by applying it to Zachary's karate club network [34]. We choose resistance distance [35] as our RSM.

A. The current model
The definition of communities requires some RSM to be chosen. We have shown in Example 1 that SDF is an RSMFTN, so SDF is an RSM. Although many community detection algorithms work on SDF, it may not always give a reasonable result. Intuitively, the relation between a pair of nodes gets stronger if there are more paths between them. However, SDF does not take this into account (see Figure 4). More specifically, in the SDF view, the relation does not get stronger unless a path shorter than the previous shortest path is added. In both cases (a) and (b) of Figure 4, the shortest distance between Node1 and Node3 is 2. So, in the SDF view, the relation strengths between Node1 and Node3 in these two cases are identical. However, there is one more path between Node1 and Node3 in (a). So, intuitively, the relation between Node1 and Node3 in (a) should be stronger than the one in (b).
To avoid this problem, we use Klein and Randic's effective resistance function (ERF) [35] instead of SDF to measure the relation strength.
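The difference between the two measures can be seen in a small numerical illustration. The two graphs below are hypothetical stand-ins for the two cases of Figure 4 (the exact graphs of the figure are not reproduced here), with unit resistance assumed on every edge. The shortest distance is computed by breadth-first search, and the effective resistance is evaluated with the usual series and parallel rules for resistors. All names are illustrative.

```python
from collections import deque


def shortest_distance(adj, source, target):
    """Breadth-first search distance in an unweighted graph (the SDF view)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return float("inf")


def series(*resistances):
    """Resistors in series add up."""
    return sum(resistances)


def parallel(*resistances):
    """Resistors in parallel: reciprocals add up."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Case (a), hypothetical: two disjoint 2-hop paths 1-2-3 and 1-4-3.
graph_a = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
# Case (b), hypothetical: a single 2-hop path 1-2-3.
graph_b = {1: [2], 2: [1, 3], 3: [2]}

sdf_a = shortest_distance(graph_a, 1, 3)
sdf_b = shortest_distance(graph_b, 1, 3)
erf_a = parallel(series(1, 1), series(1, 1))
erf_b = series(1, 1)

print(sdf_a, sdf_b)  # 2 2   -> SDF cannot tell the two cases apart
print(erf_a, erf_b)  # 1.0 2 -> the extra path halves the effective resistance
```

Under these assumptions the shortest distance is identical in both cases, while the effective resistance is smaller in case (a), which matches the intuition that an additional path strengthens the relation.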

B. Klein and Randic's effective resistance
Suppose $G$ is an undirected connected graph. Then $G$ can be considered an electrical network in which every edge is a resistor whose resistance is the corresponding weight value (if $G$ is an unweighted graph, then the resistance of every edge is one).
Let $u$ and $v$ be two vertices in the graph. Then the effective resistance between these two vertices is defined as follows.
Definition 15. Let the voltage of $u$ be $U$ and that of $v$ be 0, and measure the current $I$ from $u$ to $v$. Then the effective resistance $R(u, v)$ between $u$ and $v$ is $U/I$. Briefly, $R(u, v) = U/I$.
1) Algorithm to get the effective resistance distance: Klein and Randic [35] also provide an algorithm to compute the resistance distance for a connected undirected weighted graph.
Suppose graph $G$ is connected. Let $A$ be the adjacency matrix and $D$ the diagonal degree matrix of $G$. It is worth noting that, in a weighted undirected graph, the degree of a vertex is the sum of the weights of all its incident edges. Then the Laplacian matrix $L$ can be computed using the formula $L = D - A$. Let $L^{\dagger}$ be the generalized inverse [36] of $L$. Then the effective resistance distance $R_{i,j}$ of any pair of vertices $(i, j)$ in graph $G$ can be obtained by
$$R_{i,j} = L^{\dagger}_{i,i} + L^{\dagger}_{j,j} - L^{\dagger}_{i,j} - L^{\dagger}_{j,i}.$$
We usually call the corresponding matrix $R$ the resistance matrix.
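The computation above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in this paper: it relies on NumPy, it uses the Moore-Penrose pseudoinverse (`numpy.linalg.pinv`) as the generalized inverse, and the function name `resistance_matrix` and the example graph (an unweighted 4-cycle) are illustrative.

```python
import numpy as np


def resistance_matrix(adjacency):
    """Resistance matrix with R[i, j] = L+[i, i] + L+[j, j] - L+[i, j] - L+[j, i]."""
    a = np.asarray(adjacency, dtype=float)
    degree = np.diag(a.sum(axis=1))      # (weighted) degrees on the diagonal
    laplacian = degree - a               # L = D - A
    l_pinv = np.linalg.pinv(laplacian)   # Moore-Penrose pseudoinverse of L
    diag = np.diag(l_pinv)
    return diag[:, None] + diag[None, :] - l_pinv - l_pinv.T


# Hypothetical example: the unweighted cycle 0-1-2-3-0.
cycle = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0]]
R = resistance_matrix(cycle)

# Adjacent nodes: one edge in parallel with a 3-edge path, 1*3/(1+3) = 0.75.
# Opposite nodes: two 2-edge paths in parallel, 2*2/(2+2) = 1.0.
print(round(R[0, 1], 6), round(R[0, 2], 6))
```

For this unweighted example the values agree with the series and parallel rules for unit resistors.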
2) ERF is an RSMFTN: Since the definition of community is based on an RSM, we first have to prove that ERF is an RSM. Essentially, in this case, the relations among the nodes are derived from the electron flow in the wires among the vertices. So we need to consider the criteria of RSMFTN.
Lemma 4. Resistance is a distance. That is, the resistance satisfies the following properties for all vertices $a$, $b$ and $c$:
1. $R(a, b) \ge 0$;
2. $R(a, b) = 0$ if and only if $a = b$;
3. $R(a, b) = R(b, a)$;
4. $R(a, c) + R(c, b) \ge R(a, b)$.
Lemma 5. Let $x$ be a cut-point of a connected graph, and let $a$ and $b$ be points occurring in different components which arise upon deletion of $x$. Then $R(a, b) = R(a, x) + R(x, b)$.
Remark 8. The proofs of Lemma 4 and Lemma 5 have been given by Klein and Randic [35].
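Lemma 6. ERF satisfies property 5 of RSM.
Proof. Suppose $G$ is some graph and $G'$ is the same as $G$ except that the edge weights in $G'$ are all $\alpha$ times those in $G$. Let $A$ and $A'$ be the adjacency matrices of $G$ and $G'$ respectively. Then we have $A' = \alpha A$. So for the corresponding degree matrices $D$ and $D'$, we also have $D' = \alpha D$. Therefore, we have
$$L' = D' - A' = \alpha D - \alpha A = \alpha L.$$
Let $L^{\dagger}$ and $L'^{\dagger}$ be the generalized inverses of $L$ and $L'$ respectively. Then by the definition of the generalized inverse, we have
$$L' L'^{\dagger} L' = L'. \quad (1)$$
Since $L' = \alpha L$, we can simplify equation (1) to $L (\alpha L'^{\dagger}) L = L$. Hence, we have $L'^{\dagger} = \frac{1}{\alpha} L^{\dagger}$, and therefore $R'_{i,j} = \frac{1}{\alpha} R_{i,j}$ for every pair of vertices $(i, j)$.
Proposition. ERF is an RSMFTN.
Proof. We can define the resistance distance of a pair of vertices to be infinite if there is no path between them. Then the proposition is immediate from Lemmas 4-6.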
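The statements used in this subsection can be spot-checked numerically. The Python sketch below is a sanity check on one hypothetical graph (a triangle 0-1-2 with a pendant edge 2-3, so that node 2 is a cut-point), not a proof. It assumes NumPy, unit edge weights, and the Moore-Penrose pseudoinverse as the generalized inverse; the helper names are illustrative.

```python
import itertools

import numpy as np


def laplacian(adjacency):
    """L = D - A for a (weighted) undirected graph."""
    a = np.asarray(adjacency, dtype=float)
    return np.diag(a.sum(axis=1)) - a


def resistance_matrix(adjacency):
    """R[i, j] = L+[i, i] + L+[j, j] - L+[i, j] - L+[j, i]."""
    l_pinv = np.linalg.pinv(laplacian(adjacency))
    d = np.diag(l_pinv)
    return d[:, None] + d[None, :] - l_pinv - l_pinv.T


# Hypothetical test graph: triangle 0-1-2 plus pendant edge 2-3.
A = np.zeros((4, 4))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
R = resistance_matrix(A)

tol = 1e-9
# Lemma 4: zero diagonal, symmetry, and the triangle inequality.
is_metric = bool(
    np.allclose(np.diag(R), 0.0)
    and np.allclose(R, R.T)
    and all(R[a, c] + R[c, b] >= R[a, b] - tol
            for a, b, c in itertools.product(range(4), repeat=3))
)
# Lemma 5: node 2 separates node 0 from node 3.
cut_point_additive = bool(np.isclose(R[0, 3], R[0, 2] + R[2, 3]))
# Scaling step: multiplying L by alpha divides its pseudoinverse by alpha.
alpha = 3.0
pinv_scales = bool(np.allclose(np.linalg.pinv(alpha * laplacian(A)),
                               np.linalg.pinv(laplacian(A)) / alpha))

print(is_metric, cut_point_additive, pinv_scales)  # True True True
```

Passing these checks on a single graph only illustrates the lemmas; the general statements rest on the proofs cited above.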

C. Community detection in Zachary's karate club
The graph we use for demonstration is Zachary's karate club (Figure 5) [34], which is a popular test case in community detection research.
First, we need to choose a proper CP, which is the lower bound of the relation strength within the communities. Here, we let CP = 1.5.
Then, we use the Klein-Randic method to compute the resistance distance for each pair of nodes in the network and obtain the corresponding resistance matrix $R$.
After that, we obtain the corresponding adjoint graph (adj) from $R$ and remove all the edges whose weights are greater than CP. This gives the effective edge graph (eeg).
Then we apply the Bron-Kerbosch algorithm to the eeg and get a list of maximal communities.
In Figure 6, we plot those maximal communities on the original graph. Here, we have three maximal communities, represented by red, blue and yellow respectively. Some nodes are multi-coloured, which means that they belong to several maximal communities simultaneously.
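The steps of this subsection can be sketched end to end in Python. The sketch below is an illustration under stated assumptions rather than the code behind Figure 6: it assumes NumPy, uses the Moore-Penrose pseudoinverse as the generalized inverse, implements a basic Bron-Kerbosch recursion without pivoting, and runs on a small hypothetical graph (two triangles sharing node 2) with a hypothetical CP of 1.0 instead of the karate club data. All function names are illustrative.

```python
import numpy as np


def resistance_matrix(adjacency):
    """Resistance distances from the pseudoinverse of the Laplacian L = D - A."""
    a = np.asarray(adjacency, dtype=float)
    l_pinv = np.linalg.pinv(np.diag(a.sum(axis=1)) - a)
    d = np.diag(l_pinv)
    return d[:, None] + d[None, :] - l_pinv - l_pinv.T


def bron_kerbosch(adj):
    """Return the set of maximal cliques (as frozensets) of an undirected graph."""
    cliques = set()

    def expand(r, p, x):
        if not p and not x:
            cliques.add(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def maximal_communities(adjacency, cp):
    """Resistance matrix -> thresholded graph (eeg) -> maximal cliques."""
    r = resistance_matrix(adjacency)
    n = len(r)
    # Keep only the pairs whose resistance distance does not exceed the CP.
    eeg = {u: {v for v in range(n) if v != u and r[u, v] <= cp}
           for u in range(n)}
    return bron_kerbosch(eeg)


# Hypothetical graph: triangles 0-1-2 and 2-3-4 sharing node 2.
A = np.zeros((5, 5))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]:
    A[u, v] = A[v, u] = 1.0

communities = maximal_communities(A, cp=1.0)
print(sorted(sorted(c) for c in communities))  # [[0, 1, 2], [2, 3, 4]]
```

In this toy graph the resistance distance is 2/3 inside each triangle and 4/3 across the shared node, so node 2 belongs to both maximal communities, mirroring the multi-coloured nodes of Figure 6.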

VII. CONCLUSION
In this paper, we discussed the two most common types of networks: transmission networks and similarity networks. Two corresponding relation strength measurements (RSMFTN and RSMFSN) are defined, and we reduce them to a unified graph model. Based on this, we provide a general definition of communities and derive a corresponding detection algorithm. Finally, we give a demonstration to show how the algorithm works.
Our paper gives a general procedure to detect community structures in practical networks. Readers can specialize our algorithm according to the problems confronting them.

VIII. LIMITATIONS AND FUTURE WORK
Generally, RSMs consider the whole network topology, and so does the algorithm for finding the maximal community structures. While the algorithm gives accurate results, the problem it solves is NP-hard. So our algorithm may not suit community detection in dynamic networks. Besides, the definition we give in this paper is based on the absolute relation strengths among the nodes. So users must always supply a proper community parameter, which is sometimes hard to find.
Although we proved that SDF and ERF are RSMFTNs, many other RSMFTNs still wait to be discovered. Besides, a corresponding definition based on the relative relation strength should be derived from the definition based on the absolute relation strength. The key point is how to give a general definition of neighbour nodes when applying different RSMs.