iRun: Horizontal and Vertical Shape of a Region-Based Graph Compression
Abstract
1. Introduction
- We propose a lossless graph compression algorithm, iRun, which decomposes the ordered matrix into compression-friendly HVR, considering each block;
- We propose an architecture that allows compressing a graph using mixed graph compression algorithms at the block’s sub-sub-level to efficiently reduce storage space requirements;
- We compare our proposed technique with four existing bitmap compression algorithms and four encoding schemes for graph compression;
- We carry out extensive experiments to validate the compactness and processing efficiency of the technique.
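To make the idea of choosing a compression-friendly shape per block more concrete, the sketch below run-length encodes a small binary block in both a horizontal (row-major) and a vertical (column-major) scan and keeps whichever yields fewer runs. This is not the iRun algorithm itself: the function names, the "fewest runs" heuristic, and the `"H"`/`"V"` shape tags are assumptions made purely for illustration.

```python
from typing import List, Tuple

Matrix = List[List[int]]
Runs = List[Tuple[int, int]]


def run_lengths(bits: List[int]) -> Runs:
    """Run-length encode a bit sequence as (bit, length) pairs."""
    runs: Runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def encode_block(block: Matrix) -> Tuple[str, Runs]:
    """Encode a block with the scan direction that gives fewer runs.

    Hypothetical heuristic: "H" = row-major scan, "V" = column-major scan.
    """
    horizontal = [bit for row in block for bit in row]
    vertical = [row[c] for c in range(len(block[0])) for row in block]
    h_runs, v_runs = run_lengths(horizontal), run_lengths(vertical)
    return ("H", h_runs) if len(h_runs) <= len(v_runs) else ("V", v_runs)


def decode_block(shape: str, runs: Runs, n_rows: int, n_cols: int) -> Matrix:
    """Invert encode_block, showing that the encoding is lossless."""
    bits = [b for b, n in runs for _ in range(n)]
    if shape == "H":
        return [bits[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
    return [[bits[c * n_rows + r] for c in range(n_cols)] for r in range(n_rows)]


# A block whose single dense column compresses better with a vertical scan.
block = [[1, 0, 0],
         [1, 0, 0],
         [1, 0, 0]]
shape, runs = encode_block(block)
restored = decode_block(shape, runs, 3, 3)
```

In this toy block the horizontal scan produces six runs while the vertical scan produces two, so the vertical shape is chosen, and decoding reproduces the original block exactly.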
2. Related Work
2.1. Graph-Encoding Techniques
2.1.1. Encoding Adjacency List
2.1.2. Encoding Adjacency Matrix
2.2. Bitmap Compression
3. Proposed Solution
3.1. Problem Definition
3.2. Index Run Algorithm (iRun)
Algorithm 1: iRun algorithm
Sparsity Checking
Algorithm 2: Sparsity Check algorithm
3.3. Parallel Process
Algorithm 3: HVR decomposition algorithm
Algorithm 4: Synchronous Process algorithm
3.4. HVR Decomposition
4. Experimental Setup and Evaluation
4.1. Processing Time Comparison
4.2. Memory Complexity Comparison
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Liu, Y.; Safavi, T.; Dighe, A.; Koutra, D. Graph Summarization Methods and Applications: A Survey. ACM Comput. Surv. 2018, 51, 1–34.
- Rasel, M.K.; Han, Y.; Kim, J.; Park, K.; Tu, N.A.; Lee, Y.K. itri: Index-based triangle listing in massive graphs. Inf. Sci. 2016, 336, 1–20.
- Dhulipala, L.; Kabiljo, I.; Karrer, B.; Ottaviano, G.; Pupyrev, S.; Shalita, A. Compressing graphs and indexes with recursive graph bisection. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1535–1544.
- Alam, A.; Umair, M.; Dolgorsuren, B.; Akhond, M.R.; Ali, M.A.; Qudus, U.; Lee, Y.K. Distributed In-Memory Granularity-Based Time-Series Graph Compression; Korean Society of Information Science and Technology Academic Papers; Korean Society of Information Science and Technology: Busan, Republic of Korea, 2018; pp. 235–237.
- Dolgorsuren, B.; Khan, K.U.; Rasel, M.K.; Lee, Y.K. StarZIP: Streaming graph compression technique for data archiving. IEEE Access 2019, 7, 38020–38034.
- Umair, M.; Rasel, M.K.; Lee, Y.K. BLOCK Formulation Technique for Compressed Graph Computation; Korean Society of Information Science and Technology Academic Papers; Korean Society of Information Science and Technology: Busan, Republic of Korea, 2017; pp. 263–265.
- Maserrat, H.; Pei, J. Neighbor query friendly compression of social networks. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 24–28 July 2010; pp. 533–542.
- Rasel, M.K.; Lee, Y.K. Exploiting CPU parallelism for triangle listing using hybrid summarized bit batch vector. In Proceedings of the 2016 International Conference on Big Data and Smart Computing (BigComp), Hong Kong, China, 18–20 January 2016; pp. 183–190.
- Li, G.; Rao, W.; Jin, Z. Efficient compression on real world directed graphs. In Proceedings of the Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, Beijing, China, 7–9 July 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 116–131.
- Shun, J.; Dhulipala, L.; Blelloch, G.E. Smaller and faster: Parallel processing of compressed graphs with Ligra+. In Proceedings of the Data Compression Conference (DCC), Snowbird, UT, USA, 7–9 April 2015; pp. 403–412.
- Li, G.; Rao, W. Compression-aware graph computation. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, Heidelberg, Germany, 12–16 September 2016; pp. 1295–1302.
- Lim, Y.; Kang, U.; Faloutsos, C. Slashburn: Graph compression and mining beyond caveman communities. IEEE Trans. Knowl. Data Eng. 2014, 26, 3077–3089.
- Besta, M.; Hoefler, T. Survey and taxonomy of lossless graph compression and space-efficient graph representations. arXiv 2018, arXiv:1806.01799.
- Pibiri, G.E.; Venturini, R. Techniques for inverted index compression. ACM Comput. Surv. (CSUR) 2020, 53, 1–36.
- Boldi, P.; Vigna, S. The webgraph framework I: Compression techniques. In Proceedings of the 13th International Conference on World Wide Web, Seoul, Republic of Korea, 7–11 April 2014; pp. 595–602.
- Boldi, P.; Santini, M.; Vigna, S. Permuting web and social graphs. Internet Math. 2009, 6, 257–283.
- Kang, U.; Faloutsos, C. Beyond ‘caveman communities’: Hubs and spokes for graph compression and mining. In Proceedings of the 2011 IEEE 11th International Conference on Data Mining (ICDM), Vancouver, BC, Canada, 11–14 December 2011; pp. 300–309.
- Kang, U.; Tong, H.; Sun, J.; Lin, C.Y.; Faloutsos, C. Gbase: A scalable and general graph management system. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; pp. 1091–1099.
- Kang, U.; Tong, H.; Sun, J.; Lin, C.Y.; Faloutsos, C. Gbase: An efficient analysis platform for large graphs. VLDB J.—Int. J. Very Large Data Bases 2012, 21, 637–650.
- Blandford, D.K.; Blelloch, G.E.; Kash, I.A. An Experimental Analysis of a Compact Graph Representation. In Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithmics and Combinatorics, New Orleans, LA, USA, 10 January 2004.
- Shun, J.; Blelloch, G.E. Ligra: A lightweight graph processing framework for shared memory. ACM SIGPLAN Not. 2013, 48, 135–146.
- Chan, C.Y.; Ioannidis, Y.E. Bitmap index design and evaluation. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, Seattle, WA, USA, 1–4 June 1998; pp. 355–366.
- Stabno, M.; Wrembel, R. RLH: Bitmap compression technique based on run-length and Huffman encoding. Inf. Syst. 2009, 34, 400–414.
- Deliège, F.; Pedersen, T.B. Position list word aligned hybrid: Optimizing space and performance for compressed bitmaps. In Proceedings of the 13th International Conference on Extending Database Technology, Lausanne, Switzerland, 22–26 March 2010; pp. 228–239.
- Guzun, G.; Canahuate, G.; Chiu, D.; Sawin, J. A tunable compression framework for bitmap indices. In Proceedings of the 2014 IEEE 30th International Conference on Data Engineering, Chicago, IL, USA, 31 March–4 April 2014; pp. 484–495.
- Corrales, F.; Chiu, D.; Sawin, J. Variable length compression for bitmap indices. In Proceedings of the International Conference on Database and Expert Systems Applications, Toulouse, France, 29 August–2 September 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 381–395.
- Kim, S.; Lee, J.; Satti, S.R.; Moon, B. SBH: Super byte-aligned hybrid bitmap compression. Inf. Syst. 2016, 62, 155–168.
- Antoshenkov, G. Byte-aligned bitmap compression. In Proceedings of the Data Compression Conference (DCC’95), Snowbird, UT, USA, 28–30 March 1995; p. 476.
- Wu, K.; Otoo, E.J.; Shoshani, A. Optimizing bitmap indices with efficient compression. ACM Trans. Database Syst. (TODS) 2006, 31, 1–38.
- Colantonio, A.; Di Pietro, R. Concise: Compressed ‘n’ composable integer set. Inf. Process. Lett. 2010, 110, 644–650.
- Lemire, D.; Kaser, O.; Aouiche, K. Sorting improves word-aligned bitmap indexes. Data Knowl. Eng. 2010, 69, 3–28.
- van Schaik, S.J.; de Moor, O. A memory efficient reachability data structure through bit vector compression. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, Athens, Greece, 12–16 June 2011; pp. 913–924.
- Chambi, S.; Lemire, D.; Kaser, O.; Godin, R. Better bitmap performance with roaring bitmaps. Softw. Pract. Exp. 2016, 46, 709–719.
- Chen, Z.; Wen, Y.; Cao, J.; Zheng, W.; Chang, J.; Wu, Y.; Ma, G.; Hakmaoui, M.; Peng, G. A survey of bitmap index compression algorithms for big data. Tsinghua Sci. Technol. 2015, 20, 100–115.
- Barik, R.; Minutoli, M.; Halappanavar, M.; Tallent, N.R.; Kalyanaraman, A. Vertex Reordering for Real-World Graphs and Applications: An Empirical Evaluation. In Proceedings of the 2020 IEEE International Symposium on Workload Characterization (IISWC), Beijing, China, 27–29 October 2020; pp. 240–251.
- Arai, J.; Shiokawa, H.; Yamamuro, T.; Onizuka, M.; Iwamura, S. Rabbit order: Just-in-time parallel reordering for fast graph analysis. In Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Chicago, IL, USA, 23–27 May 2016; pp. 22–31.
- Jacquelin, M.; Ng, E.G.; Peyton, B.W. Fast and effective reordering of columns within supernodes using partition refinement. In Proceedings of the Seventh SIAM Workshop on Combinatorial Scientific Computing, Bergen, Norway, 6–8 June 2018; pp. 76–86.
- Faloutsos, M.; Faloutsos, P.; Faloutsos, C. On power-law relationships of the internet topology. ACM SIGCOMM Comput. Commun. Rev. 1999, 29, 251–262.
- Sun, J.; Vandierendonck, H.; Nikolopoulos, D.S. Graphgrind: Addressing load imbalance of graph partitioning. In Proceedings of the International Conference on Supercomputing, Chicago, IL, USA, 14–16 June 2017; pp. 1–10.
System | Approach | Purpose | Computational Complexity | Distributed | Weighted | Lossless | Directed | Parallel |
---|---|---|---|---|---|---|---|---|
Ligra [21] | A parallel single-machine | Reducing the memory consumption | O(log3 M), M: sum of the sizes of the sets | ✗ | ✓ | ✓ | ✓ | ✓ |
Ace Up [9] | Clustering, structural information of the graph | Directed graph compression | Sub-linear | ✗ | ✗ | ✓ | ✓ | ✗ |
SlashBurn [12] | Reordering greedy hub selection | To reduce nonzero blocks in a resulting adjacency matrix | Iterated logarithmic | ✓ | ✗ | ✓ | ✓ | ✗ |
HVR Graph Compression | Horizontal and vertical shaped compression | To achieve a higher compression ratio | Logarithmic | ✗ | ✗ | ✓ | ✓ | ✓ |
ID | Offset | Length | Shape | Template |
---|---|---|---|---|
0 | 0 | 5 | | |
1 | 5 | 15 | | |
2 | 20 | 10 | S | |
3 | 30 | 45 | | |
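The offsets in the table above are consistent with each entry starting where the previous one ends (0 + 5 = 5, 5 + 15 = 20, 20 + 10 = 30). A minimal sketch of such an offset/length index follows, assuming the compressed entries are stored back to back in one buffer and that offsets and lengths are counted in bytes; the unit, the `IndexEntry` type, and both helper functions are hypothetical and not taken from the paper.

```python
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    """One index row: entry ID, start offset, and length (unit assumed: bytes)."""
    id: int
    offset: int
    length: int


def build_index(lengths: List[int]) -> List[IndexEntry]:
    """Assign each entry the running sum of the lengths that precede it."""
    entries: List[IndexEntry] = []
    offset = 0
    for entry_id, length in enumerate(lengths):
        entries.append(IndexEntry(entry_id, offset, length))
        offset += length
    return entries


def read_entry(payload: bytes, entry: IndexEntry) -> bytes:
    """Slice one entry out of the concatenated payload without scanning the rest."""
    return payload[entry.offset:entry.offset + entry.length]


# Lengths taken from the table: 5, 15, 10, 45.
index = build_index([5, 15, 10, 45])
payload = bytes(range(75))  # stand-in data; total length is 5 + 15 + 10 + 45
third = read_entry(payload, index[2])
```

With this layout, any entry can be located in constant time from its ID, which is the usual reason for keeping an offset next to each length.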
Real Graph Datasets | Type | # of Vertices | # of Edges | Size (MB) |
---|---|---|---|---|
Google Web Graph | Web pages network | 875713 | 510503 | 71.8 |
Road Network | Road traffic network | 196520 | 553321 | 83.7 |
- | Social network | 4847572 | 6899378 | 131 |
DBLP | Citation network | 317080 | 104986 | 13.2 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Umair, M.; Lee, Y.-K. iRun: Horizontal and Vertical Shape of a Region-Based Graph Compression. Sensors 2022, 22, 9894. https://doi.org/10.3390/s22249894