In Search of the Densest Subgraph
Abstract
1. Introduction and Motivation
- In a social network, such as Facebook, a typical graph representation has nodes corresponding to individuals, and the edges capture some relation or interaction between them, e.g., friendship. In this model, a dense subgraph represents a community. Many types of communities exist, because people with similar tastes, choices, and preferences tend to associate with one another.
- In a communication network, a dense subgraph can capture a congested part of the network, assuming that the edges correspond to those links that have traffic load over some threshold. Identifying such congested parts, and then taking appropriate action to relieve the congestion, can have a major impact on network performance.
- In a mobile ad hoc radio network, a dense subgraph can represent a subnetwork with high radio interference. These parts tend to offer lower throughput, so identifying them can help to avoid degraded performance.
- In the World Wide Web (WWW), a natural graph representation has the websites as its nodes, and the hyperlinks between them as edges. A densely interconnected part may indicate a web community, such as a group of content creators sharing a common interest.
- Another application in the WWW is the detection of link spam. Link spam is the posting of out-of-context links on websites, discussion forums, blog comments, guest-books or other online venues that display user comments. The purpose of link spam is to increase the number of external links pointing to a page the spammer wants to promote, with the goal of increasing page rank and improving its position in search engine results. Then, the higher rankings in web searches lead to greater visibility over competitors, more visitors and potentially more paying customers.
- In bioinformatics, dense subgraphs are used for finding molecular complexes in protein–protein interaction networks, for discovering regulatory motifs in genomic DNA, and for finding complex patterns in the gene annotation graph.
- In the field of finance, dense subgraphs have been used for discovering migration patterns in financial markets.
- In data mining, correlation mining represents observation sequences as graph nodes, and strong correlations between them as edges. A dense subgraph can then capture highly correlated groups of entities, which has been used in stock market analysis and computational biology.
- Other use cases of dense subgraphs include graph compression, graph visualization, clustering, real-time identification of important stories on Twitter, and many other data mining and knowledge discovery applications.
2. Preliminaries
3. Measures of Graph Density and Associated Results
3.1. Polynomial-Time Solvable Cases
3.1.1. Edge Density: Subgraph with Maximum Average Degree
3.1.2. k-Core: Subgraph with Largest Minimum Degree
3.1.3. Subgraph with Maximum k-Clique Density
3.1.4. Subgraph with Maximum F-Density
3.1.5. Densest Subgraph with Concave Size Function
3.1.6. Densest Subgraph with Specified Subset
3.1.7. Dense Subgraph with Sparse Cut
3.1.8. Subgraph with Maximum Edge-Connectivity
Principle of the Algorithm
- Find a minimum cut in the input graph G; let its size be denoted by k.
- Let A and B be the two node sets into which the minimum cut divides the graph. A key observation is that, if there is any subgraph whose edge-connectivity is greater than k, then it must lie either fully in A or fully in B. This is because, if such a subgraph had nodes on both sides, then the found cut of size k would separate these nodes, too. Such a subgraph, however, cannot be separated by k edges, because its edge-connectivity exceeds k. Thus, if such a subgraph exists, then it must lie either fully in A or fully in B.
- Find the most connected subgraphs within the sets A and B, respectively, by recursively calling the algorithm for the smaller graphs induced by A and B.
- Return the graph that has the highest connectivity among three graphs: G itself and the two subgraphs found within A and B.
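The recursive principle above can be illustrated with a short Python sketch. It is not the authors' implementation: the minimum cut is computed with a plain Stoer–Wagner routine chosen here for brevity, the graph is assumed to be a simple undirected graph stored as a dictionary of neighbour sets, a single node is assumed to have connectivity 0, and the function names `stoer_wagner`, `induced`, and `most_connected_subgraph` are hypothetical.

```python
def stoer_wagner(adj):
    """Global minimum cut of an undirected simple graph with >= 2 nodes.
    adj: dict node -> set of neighbours (symmetric, no self-loops).
    Returns (cut_size, one_side_of_the_cut)."""
    w = {u: {v: 1 for v in nbrs} for u, nbrs in adj.items()}  # edge weights
    members = {u: {u} for u in adj}  # original nodes merged into each node
    best_size, best_side = float("inf"), None
    while len(w) > 1:
        # One phase: add nodes in order of strongest connection to the set.
        start = next(iter(w))
        added = [start]
        conn = {v: w[start].get(v, 0) for v in w if v != start}
        while conn:
            nxt = max(conn, key=conn.get)
            cut_of_phase = conn.pop(nxt)
            added.append(nxt)
            for v, wt in w[nxt].items():
                if v in conn:
                    conn[v] += wt
        s, t = added[-2], added[-1]
        if cut_of_phase < best_size:
            best_size, best_side = cut_of_phase, set(members[t])
        # Merge the last node t into the second-to-last node s.
        members[s] |= members.pop(t)
        for v, wt in w.pop(t).items():
            del w[v][t]
            if v != s:
                w[s][v] = w[s].get(v, 0) + wt
                w[v][s] = w[s][v]
    return best_size, best_side


def induced(adj, nodes):
    """Subgraph induced by the given node set."""
    return {u: adj[u] & nodes for u in nodes}


def most_connected_subgraph(adj):
    """Return (edge_connectivity, node_set) of a most edge-connected subgraph."""
    nodes = set(adj)
    if len(nodes) < 2:
        return 0, nodes  # assumption: a single node has connectivity 0
    k, side_a = stoer_wagner(adj)
    side_b = nodes - side_a
    candidates = [
        (k, nodes),  # the whole graph, whose connectivity is the min cut size
        most_connected_subgraph(induced(adj, side_a)),
        most_connected_subgraph(induced(adj, side_b)),
    ]
    return max(candidates, key=lambda c: c[0])


# A 4-clique {0,1,2,3} with a pendant path 0 - 4 - 5 attached.
graph = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3},
         3: {0, 1, 2}, 4: {0, 5}, 5: {4}}
print(most_connected_subgraph(graph))
```

On this small example the whole graph has a minimum cut of size 1, and the recursion should settle on the 4-clique, whose edge-connectivity is 3. Ties are broken in favour of the larger graph, which is a design choice of this sketch rather than something the text prescribes.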
3.2. NP-Hard Density Measures and Related Results
3.2.1. Maximum Clique
- Problem: Clique
- Input: Graph G, positive integer k
- Question: Does G have a clique of size at least k?

- Problem: Maxclique
- Input: Graph G
- Task: Find a maximum clique in G.
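To make the two problem statements concrete, the following Python sketch solves both by exhaustive backtracking with a simple size-based pruning rule. It is only an illustration for small graphs: the running time is exponential in the worst case, as expected for an NP-hard problem, and it is not one of the faster algorithms from the literature. The function names `max_clique` and `has_clique` and the dictionary-of-sets graph representation are assumptions of this sketch.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph as a set of nodes.
    adj: dict node -> set of neighbours (symmetric, no self-loops)."""
    best = set()

    def expand(clique, candidates):
        # Every node in `candidates` is adjacent to all nodes of `clique`.
        nonlocal best
        if len(clique) > len(best):
            best = set(clique)
        remaining = set(candidates)
        for v in candidates:
            # Prune: even using all remaining candidates cannot beat `best`.
            if len(clique) + len(remaining) <= len(best):
                break
            remaining.discard(v)
            expand(clique | {v}, remaining & adj[v])

    expand(set(), set(adj))
    return best


def has_clique(adj, k):
    """Decision version (Clique): is there a clique of size at least k?"""
    return len(max_clique(adj)) >= k


# A 4-clique {0,1,2,3} plus a triangle {3,4,5} sharing node 3.
graph = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
         3: {0, 1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(max_clique(graph), has_clique(graph, 5))
```

The decision version is answered here by solving the optimization version, which shows that Clique is no harder than Maxclique.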
Faster Exponential-time Algorithms
Approximation
Inapproximability
Special Graph Classes
Listing All Maximal Cliques
3.2.2. Constrained Dense Subgraphs and Quasi-Cliques
- implies.
- implies.
Density Relative to a Clique
Constrained Size Subgraphs with Maximum Number of Edges
- There is no polynomial-time approximation algorithm with an approximation factor of n^{1/(log log n)^c} for some constant c > 0, if the so-called Exponential Time Hypothesis (ETH) is true, as proved by Manurangsi [32]. The ETH asserts that the 3-Satisfiability problem cannot be solved in subexponential time in the worst case. This is a stronger assumption than P ≠ NP.
- Khot [33] proved that the problem does not have a Polynomial-Time Approximation Scheme (PTAS) for the general case, unless NP-complete problems can be solved in subexponential time with randomization. For dense input graphs, however, there is a PTAS, as mentioned above. Recall that dense means the minimum degree is at least a constant fraction of the number of nodes.
3.2.3. Dense Common Subgraphs in Multiple Graphs
- DCS-MA, where the objective is to maximize the minimum over the frames of the average degree in the induced subgraph (induced by the node set selected for the subgraph).
- DCS-AM, where the objective is to maximize the average over the frames of the minimum degree in the induced subgraph.
- DCS-MM, where the objective is to maximize the minimum over the frames of the minimum degree in the induced subgraph.
- DCS-AA, where the objective is to maximize the average over the frames of the average degree in the induced subgraph.
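The four objectives differ only in how the per-frame values are aggregated, which a small Python sketch can make explicit. The sketch evaluates all four objectives for one given node set; it does not search for an optimal set. It assumes that each frame is a graph on a common node set, stored as a dictionary of neighbour sets, and that the average degree of an induced subgraph is the sum of its degrees divided by its number of nodes. The function name `dcs_objectives` is hypothetical.

```python
def dcs_objectives(frames, subset):
    """Evaluate the four dense-common-subgraph objectives for a node set.
    frames: list of graphs, each a dict node -> set of neighbours.
    subset: non-empty collection of nodes selected for the subgraph."""
    subset = set(subset)
    avg_deg, min_deg = [], []
    for adj in frames:
        # Degrees inside the subgraph induced by `subset` in this frame.
        degs = [len(adj.get(v, set()) & subset) for v in subset]
        avg_deg.append(sum(degs) / len(degs))
        min_deg.append(min(degs))
    m = len(frames)
    return {
        "DCS-MA": min(avg_deg),      # minimum over frames of average degree
        "DCS-AM": sum(min_deg) / m,  # average over frames of minimum degree
        "DCS-MM": min(min_deg),      # minimum over frames of minimum degree
        "DCS-AA": sum(avg_deg) / m,  # average over frames of average degree
    }


# Two frames on nodes {0, 1, 2}: a triangle and a path 0 - 1 - 2.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(dcs_objectives([triangle, path], {0, 1, 2}))
```

In this example the triangle has average and minimum degree 2, while the path has average degree 4/3 and minimum degree 1, so the four objectives take four different values for the same node set.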
4. Some Related Results and Open Problems
4.1. Conditions that Force the Existence of a Dense Subgraph
4.2. Min-Max Theorems
- They provide non-trivial insight into the problem structure, which is often helpful in developing algorithms.
- They offer a way to certify the optimality of a solution. For example, if someone gives us a network flow in a large network and claims that it is a maximum flow, how can this be certified without re-running the entire algorithm? We can easily check that the flow is feasible by verifying that it satisfies all the constraints. But how do we make sure that it is indeed optimal, i.e., that there is no flow with a larger value? If we are also given a cut (a dual solution) with capacity equal to the flow value, then the cut certifies optimality, since no more flow can get through than what the bottleneck allows. The fact that such an optimality certificate always exists for this problem is not trivial; it follows from the Max Flow Min Cut Theorem.
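The certification idea in the flow example can be sketched in Python as a checker that never runs a maximum flow algorithm. It only verifies that the claimed flow is feasible and that its value equals the capacity of the supplied cut. The function name `certifies_max_flow`, the edge-dictionary representation, and the use of integer capacities (to avoid floating-point comparisons) are assumptions of this sketch.

```python
def certifies_max_flow(capacity, flow, source, sink, source_side):
    """Check that `flow` is feasible and that the cut proves it is maximum.
    capacity, flow: dict (u, v) -> integer, keyed by directed edge;
                    edges missing from `flow` carry zero flow.
    source_side:    node set of the cut containing source but not sink."""
    # Capacity constraints: 0 <= f(e) <= c(e) on every edge.
    for edge, f in flow.items():
        if f < 0 or f > capacity.get(edge, 0):
            return False
    # Flow conservation at every node other than source and sink.
    nodes = {source, sink}
    for u, v in list(capacity) + list(flow):
        nodes.update((u, v))
    net_out = {v: 0 for v in nodes}
    for (u, v), f in flow.items():
        net_out[u] += f
        net_out[v] -= f
    if any(net_out[v] != 0 for v in nodes if v not in (source, sink)):
        return False
    # The cut must separate the source from the sink.
    if source not in source_side or sink in source_side:
        return False
    cut_capacity = sum(c for (u, v), c in capacity.items()
                       if u in source_side and v not in source_side)
    # Flow value equal to cut capacity certifies optimality.
    return net_out[source] == cut_capacity


capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 3}
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
        ("a", "t"): 2, ("b", "t"): 3}
print(certifies_max_flow(capacity, flow, "s", "t", {"s"}))
```

A return value of False does not mean that the flow is suboptimal; it only means that this particular cut fails to certify it. The Max Flow Min Cut Theorem guarantees that some certifying cut exists whenever the flow is maximum.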
4.3. Open Problems
4.3.1. Density with P-computable Function of Degrees
4.3.2. Linear Combination of Polynomially Solvable Densities
4.3.3. Min-Max Theorems for Other (Polynomially Solvable) Densities
4.3.4. Further Conditions that Force the Existence of a Dense Subgraph
4.3.5. Maximum Clique in Random Graph
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Luce, R.D.; Perry, A. A method of matrix analysis of group structure. Psychometrika 1949, 14, 95–116.
- Gionis, A.; Tsourakakis, C.E. Dense Subgraph Discovery. KDD Tutorial. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’15), Sydney, Australia, 10–13 August 2015.
- Picard, J.-C.; Queyranne, M. A Network Flow Solution to Some Nonlinear 0–1 Programming Problems with Application to Graph Theory. Networks 1982, 12, 141–159.
- Gallo, G.; Grigoriadis, M.D.; Tarjan, R.E. A Fast Parametric Maximum Flow Algorithm and Applications. SIAM J. Comput. 1989, 18, 30–55.
- Charikar, M. Greedy Approximation Algorithms for Finding Dense Components in a Graph. In Approximation Algorithms for Combinatorial Optimization: Third International Workshop, APPROX 2000; Springer: Berlin, Germany, 2000; pp. 84–95.
- Dong, J.; Liu, Y. Determination of the Densest Subgraph. J. Syst. Sci. Complex. 2004, 17, 23–27.
- Faragó, A. A General Tractable Density Concept for Graphs. Math. Comput. Sci. 2008, 1, 689–699.
- Lick, D.R.; White, A.T. k-Degenerate Graphs. Can. J. Math. 1970, 22, 1082–1096.
- Matula, D.W.; Beck, L.L. Smallest-Last Ordering and Clustering and Graph Coloring Algorithms. J. ACM 1983, 30, 417–427.
- Tatti, N.; Gionis, A. Density-Friendly Graph Decomposition. In Proceedings of the 24th International World Wide Web Conference (WWW’15), Florence, Italy, 18–22 May 2015; pp. 1089–1099.
- Tsourakakis, C.E. The K-clique Densest Subgraph Problem. In Proceedings of the 24th International World Wide Web Conference (WWW’15), Florence, Italy, 18–22 May 2015; pp. 1122–1132.
- Kawase, Y.; Miyauchi, A. The Densest Subgraph Problem with a Convex/Concave Size Function. Algorithmica 2018, 80, 3461–3480.
- Saha, B.; Hoch, A.; Khuller, S.; Raschid, L.; Zhang, X.-N. Dense Subgraphs With Restrictions and Applications to Gene Annotation Graphs. In RECOMB 2010; Berger, B., Ed.; Springer: Heidelberg, Germany, 2010; Volume 6044, pp. 456–472.
- Chen, W.; Peng, L.; Wang, J.; Li, F.; Tang, M. Algorithms for the Densest Subgraph With at Least k Vertices and with a Specified Subset. In Combinatorial Optimization and Applications; Lu, Z., Kim, D., Wu, W., Li, W., Du, D.Z., Eds.; Springer: Berlin, Germany, 2015; Volume 9486, pp. 566–573.
- Miyauchi, A.; Kakimura, N. Finding a Dense Subgraph with Sparse Cut. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM’18), Torino, Italy, 22–26 October 2018; pp. 547–556.
- Nagamochi, H.; Ibaraki, T. Algorithmic Aspects of Graph Connectivity; Cambridge University Press: Cambridge, UK, 2008.
- Matula, D.W. The Cohesive Strength of Graphs. In The Many Facets of Graph Theory; Lecture Notes in Mathematics; Chartrand, G., Kapoor, S.F., Eds.; Springer: Berlin, Germany, 1969; Volume 110.
- Garey, M.R.; Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness; W. H. Freeman and Co.: San Francisco, CA, USA, 1979.
- Kosub, S. Local Density. In Network Analysis—Methodological Foundations; Brandes, U., Erlebach, T., Eds.; Springer: Berlin, Germany, 2005.
- Robson, J.M. Algorithms for maximum independent sets. J. Algorithms 1986, 7, 425–440.
- Robson, J.M. Finding a Maximum Independent Set in Time O(2^{n/4}). Technical Report. 2001. Available online: http://www.labri.fr/perso/robson/mis/techrep.html (accessed on 10 June 2019).
- Boppana, R.B.; Halldórsson, M.M. Approximating maximum independent sets by excluding subgraphs. BIT Numer. Math. 1992, 32, 180–196.
- Feige, U. Approximating Maximum Clique by Removing Subgraphs. SIAM J. Discret. Math. 2004, 18, 219–225.
- Zuckerman, D. Linear Degree Extractors and the Inapproximability of Max Clique and Chromatic Number. Theory Comput. 2007, 3, 103–128.
- ISGCI: Information System on Graph Classes and their Inclusions. Available online: http://www.graphclasses.org (accessed on 5 May 2019).
- Tsukiyama, S.; Ide, M.; Ariyoshi, H.; Shirakawa, I. A New Algorithm for Generating all the Maximal Independent Sets. SIAM J. Comput. 1977, 6, 505–517.
- Johnson, D.S.; Papadimitriou, C.H.; Yannakakis, M. On generating all maximal independent sets. Inf. Process. Lett. 1988, 27, 119–123.
- Asahiro, Y.; Hassin, R.; Iwama, K. Complexity of Finding Dense Subgraphs. Discret. Appl. Math. 2002, 121, 15–26.
- Bhaskara, A.; Charikar, M.; Chlamtac, E.; Feige, U.; Vijayaraghavan, A. Detecting High Log-densities—An O(n^{1/4}) Approximation for Densest k-Subgraph. In Proceedings of the Annual ACM Symposium on Theory of Computing (STOC 2010), Cambridge, MA, USA, 6–8 June 2010; ACM: New York, NY, USA, 2010; pp. 201–210.
- Chen, D.Z.; Fleischer, R.; Li, J. Densest k-Subgraph Approximation on Intersection Graphs. In Approximation and Online Algorithms (WAOA 2010); Jansen, K., Solis-Oba, R., Eds.; Springer: Berlin, Germany, 2010; Volume 6534, pp. 84–93.
- Arora, S.; Karger, D.; Karpinski, M. Polynomial Time Approximation Schemes for Dense Instances of NP-Hard Problems. J. Comput. Syst. Sci. 1999, 58, 193–210.
- Manurangsi, P. Almost-polynomial Ratio ETH-hardness of Approximating Densest k-Subgraph. In Proceedings of the 49th Annual ACM Symposium on Theory of Computing (STOC 2017), Montreal, PQ, Canada, 19–23 June 2017; pp. 954–961.
- Khot, S. Ruling Out PTAS for Graph Min-Bisection, Densest Subgraph and Bipartite Clique. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS’04), Rome, Italy, 17–19 October 2004; pp. 136–145.
- Charikar, M.; Naamad, Y.; Wu, J. On Finding Dense Common Subgraphs. arXiv 2018, arXiv:1802.06361.
- Semertzidis, K.; Pitoura, E.; Terzi, E.; Tsaparas, P. Finding lasting dense subgraphs. In Data Mining and Knowledge Discovery; Springer: Berlin, Germany, 2018; pp. 1–29.
- Turán, P. On an Extremal Problem in Graph Theory. Matematikai és Fizikai Lapok (Math. Phys. Lett.) 1941, 48, 436–452.
- Dirac, G.A. Extensions of Turán's Theorem on Graphs. Acta Math. Acad. Sci. Hung. 1963, 14, 417–422.
- Erdős, P.; Stone, A.H. On the Structure of Linear Graphs. Bull. Am. Math. Soc. 1946, 52, 1087–1091.
- Schrijver, A. Min-Max Results in Combinatorial Optimization. In Mathematical Programming—The State of the Art; Bachem, A., Korte, B., Grötschel, M., Eds.; Springer: Berlin, Germany, 1983.
- Gabow, H.N.; Westermann, H.H. Forests, Frames, and Games: Algorithms for Matroid Sums and Applications. Algorithmica 1992, 7, 465–497.
- Erdős, P.; Hajnal, A. On Chromatic Number of Graphs and Set-Systems. Acta Math. Hung. 1966, 17, 61–99.
- Bollobás, B. Random Graphs; Cambridge University Press: Cambridge, UK, 2001.
- Grimmett, G.; McDiarmid, C. On Colouring Random Graphs. Math. Proc. Camb. Philos. Soc. 1975, 77, 313–324.
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Faragó, A.; R. Mojaveri, Z. In Search of the Densest Subgraph. Algorithms 2019, 12, 157. https://doi.org/10.3390/a12080157