The Maximum Common Subgraph Problem: A Parallel and Multi-Engine Approach
Abstract
1. Introduction
- As sequential programming has increasingly given way to parallel programming [4,5], we present a CPU-based multi-core parallel design of the McSplit procedure. We parallelize the search by viewing the backtrack search as a tree and exploring different portions of the tree with different threads. Although high-level, architecture-independent models (such as Threading Building Blocks, OpenMP, and Cilk) integrate easily into existing applications, native threading gives much finer control over what runs on which thread. For that reason, we adopt a native threading programming model for our CPU-based application, with threads organized in a thread pool. The pool shares a queue in which tasks are inserted by a “master” thread and later extracted and solved by the first free “helper” thread. Dedicated mechanisms avoid task duplication, load imbalance, and data structure replication, since such problems may impair the original divide-and-conquer approach. A compact data structure, which represents partial results, is used to share solutions among threads.
- As General Purpose GPU (GPGPU) programming has become increasingly attractive for professional users in the last few years [6,7], we analyze how to extend the previous approach to many-core GPGPUs. First, we describe how to transform an intrinsically recursive solution into an iterative one that can handle the heavy workload imbalance and the massive data dependency of the original recursive algorithm. After that, we select the implementation framework and redesign the parallel application. Nowadays, CUDA and OpenCL are the leading GPGPU frameworks, and each has its own advantages and disadvantages. CUDA is a proprietary framework created by NVIDIA, whilst OpenCL is an open standard. Even though OpenCL is supported by more applications than CUDA, the general consensus is that CUDA delivers better performance on NVIDIA hardware, thanks to NVIDIA's tight integration. For that reason, we describe the changes required to port the multi-core CPU-based algorithm to the CUDA framework. We emphasize that our implementation works only on NVIDIA GPUs, although an OpenCL implementation would require only moderate modifications.
- As the MCS problem inherently scales poorly, we present a few heuristics that modify the original procedure on hard-to-solve instances. The first one modifies the initial ordering used by the branch-and-bound procedure to pair vertices and to generate the recursion tree. Moreover, we suggest countermeasures to deal with the “heavy-tail phenomenon”, i.e., the condition in which no larger MCS is found but the branch-and-bound procedure keeps recurring. Finally, following other application fields (such as SATisfiability solvers [8]), we propose randomized tree search and automatic search restarts for expensive computations. These heuristics perform very well on specific instances, even if they cannot outperform the original implementation on average.
- To exploit the strengths and offset the weaknesses of the various strategies we implemented, we extend the classical “winner-takes-all” approach to the MCS problem to a multi-engine (portfolio) methodology [8,9,10]. In our approach, different implementations may run on different computation units, namely CPU and GPU cores, which increases both the overall memory bandwidth and the computational power and consequently reduces the classical multi-engine fragilities.
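The randomized search with automatic restarts mentioned in the list above can be sketched as follows. This is a minimal Python sketch and not the authors' implementation: the names `mcs_size`, `solve_with_restarts`, and `BudgetExceeded`, the node budget, the budget-doubling policy, and the random reshuffling of the vertex order are all assumptions made for illustration. The inner search is a simplified McSplit-style branch and bound that only returns the size of the solution.

```python
import random


class BudgetExceeded(Exception):
    """Raised when one search attempt visits more nodes than allowed."""


def mcs_size(g, h, order, budget):
    """Size of a maximum common induced subgraph of g and h.

    g, h: dict mapping each vertex to the set of its neighbours.
    order: order in which the vertices of g are branched on.
    budget: maximum number of search nodes before giving up (assumed policy).
    """
    best = 0
    nodes = 0

    def search(size, domains):
        nonlocal best, nodes
        nodes += 1
        if nodes > budget:
            raise BudgetExceeded
        best = max(best, size)
        # Bound: each (left, right) pair can add at most min(|left|, |right|).
        bound = size + sum(min(len(l), len(r)) for l, r in domains)
        if bound <= best:
            return
        left, right = domains[0]
        v = left[0]
        for w in right:
            # Pair v with w and split every domain by adjacency to v and w.
            refined = []
            for l, r in domains:
                for adjacent in (True, False):
                    l2 = [x for x in l if x != v and (x in g[v]) == adjacent]
                    r2 = [x for x in r if x != w and (x in h[w]) == adjacent]
                    if l2 and r2:
                        refined.append((l2, r2))
            search(size + 1, refined)
        # Alternative branch: leave v unmatched.
        remaining = [(left[1:], right)] if len(left) > 1 else []
        search(size, remaining + domains[1:])

    search(0, [(list(order), list(h))])
    return best


def solve_with_restarts(g, h, budget=50, seed=0):
    """Restart with a reshuffled vertex order and a doubled budget on failure."""
    rng = random.Random(seed)
    order = list(g)
    while True:
        try:
            return mcs_size(g, h, order, budget)
        except BudgetExceeded:
            rng.shuffle(order)
            budget *= 2
```

Because the budget doubles after every failed attempt, the loop eventually gives the exact search enough room to finish, so restarts affect only the running time and not the result.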
Roadmap
2. Background and Related Works
2.1. The Maximum Common Subgraph Problem
2.2. Related Works
2.2.1. Sequential Approaches
2.2.2. Parallel Approaches
2.3. The McSplit Algorithm
3. Parallel Approaches
3.1. The CPU Multi-Core Approach
Algorithm 1. The multi-core CPU-based recursive branch-and-bound MCS procedure, consisting of the functions parallelMCS (G, H) and solve.
- When the program starts and the thread pool is first initialized (line 3), each thread starts its own cycle while the task queue is still empty. Each thread acquires the lock on the queue (one at a time) and tries to pick a task from it. Since the queue is empty, the thread does nothing and blocks on the condition variable, waiting for someone to put a task in the queue and wake it up (POSIX mutex and condition variable objects are used but not detailed in this description).
- Every time the solve function adds a task to the queue (lines 18 or 26), all pool threads that are waiting on the condition variable at that moment are released. The first thread that manages to acquire the lock proceeds within its main cycle and picks the task from the queue. Even if this task should no longer be executed by any other thread, the only entity allowed to remove it from the queue is the one that put it there.
- When the task queue is not empty, tasks are ordered so that they are executed and completed following the order of the recursion. Any task that is still being executed by some thread stays at the beginning of the queue. When a task is executed, it is marked as running, i.e., its function pointer is set to NULL in place of the actual pointer to the function used to solve the problem. Running tasks are simply skipped by any thread scanning the queue for a task to execute. When a task with a non-NULL pointer is found, the execution continues as in the previous point. If no task is found, the execution continues as in the first point.
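The queue protocol described in the three points above can be sketched as follows, in Python for brevity rather than in C with POSIX threads. This is only an illustration under stated assumptions: `threading.Condition` stands in for the mutex and condition variable pair, and the names `Task`, `TaskQueue`, `helper`, and `run_demo` are hypothetical. In this simplified version the master only inserts tasks, waits for them, and removes them.

```python
import threading


class Task:
    def __init__(self, func, arg):
        self.func = func   # set to None once a thread starts running the task
        self.arg = arg
        self.done = False


class TaskQueue:
    def __init__(self):
        self.cond = threading.Condition()   # lock plus condition variable
        self.tasks = []
        self.closed = False

    def put(self, func, arg):
        task = Task(func, arg)
        with self.cond:
            self.tasks.append(task)
            self.cond.notify_all()          # wake every waiting helper
        return task

    def wait_and_remove(self, task):
        """Called by the producer only: it alone removes its own task."""
        with self.cond:
            while not task.done:
                self.cond.wait()
            self.tasks.remove(task)

    def close(self):
        with self.cond:
            self.closed = True
            self.cond.notify_all()


def helper(queue):
    while True:
        with queue.cond:
            while True:
                # Skip running tasks, i.e., those whose func is already None.
                task = next((t for t in queue.tasks if t.func is not None), None)
                if task is not None:
                    break
                if queue.closed:
                    return
                queue.cond.wait()           # nothing to do: block until woken
            func, task.func = task.func, None   # mark the task as running
        func(task.arg)                      # run outside the lock
        with queue.cond:
            task.done = True
            queue.cond.notify_all()


def run_demo(n_helpers=4, n_tasks=10):
    queue = TaskQueue()
    results, results_lock = [], threading.Lock()

    def work(x):
        with results_lock:
            results.append(x * x)

    helpers = [threading.Thread(target=helper, args=(queue,)) for _ in range(n_helpers)]
    for t in helpers:
        t.start()
    tasks = [queue.put(work, i) for i in range(n_tasks)]
    for task in tasks:
        queue.wait_and_remove(task)
    queue.close()
    for t in helpers:
        t.join()
    return sorted(results), len(queue.tasks)
```

Clearing `func` while the lock is held is what prevents two helpers from picking the same task, which mirrors the NULL-pointer marking described above.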
3.2. The GPU Many-Core Approach
3.2.1. Labels Re-Computation
3.2.2. New Bidomain Stack
3.2.3. Our GPU Implementation
- When the partition cannot lead to a better solution, we ignore it and prune the corresponding search path (lines 7–8).
- When it can lead to a larger MCS and it sits at a shallow enough recursion level (line 9), we may solve it by running a new CUDA kernel (lines 10–15). A new kernel (function Kernel, line 13) is launched only when the number of pending sub-problems (or threads) is large enough (line 12).
- When we neither rule out the problem nor run it within a new CUDA kernel, we solve it locally (lines 17–21). In this case the resolution pattern is similar to that of the multi-core CPU version. First, we select a vertex from the left bidomain and a vertex from the right bidomain (line 18). Then, we store them in the solution M (line 19) and possibly update the incumbent solution (line 20). Finally, we compute the new bidomain set based on the selected vertices (line 21). In the recursive algorithm, a vertex is selected by swapping it with the one at the end of the domain and by decreasing the bidomain size. This is possible because, when backtracking to higher recursion levels, deep bidomains are destroyed and the ones at higher levels have not been touched by lower recursions, so the algorithm continues with consistent data structures. When recursion is removed, we lose this important property: after we select a vertex from the right bidomain and decrease its size, the original size must be restored from the value previously stored as the sixth value of the stack frame, as described in Section 3.2.2.
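The swap-and-shrink selection and the need to restore the saved size once recursion is removed can be illustrated on a toy problem. The sketch below is not the MCS search itself: it enumerates ordered selections from a single shared array with an explicit stack, and each frame keeps the size that was valid when the frame was created. The function name `ordered_selections` and the two-field frame layout are assumptions made for illustration (the stack frame described in Section 3.2.2 holds more values).

```python
def ordered_selections(items, k):
    """Enumerate all ordered selections of k distinct items, without recursion."""
    dom = list(items)        # shared array, reordered in place by every level
    size = len(dom)          # number of entries that can still be selected
    chosen, out = [], []
    stack = [[size, 0]]      # frame: [saved size, index of the next candidate]
    while stack:
        frame = stack[-1]
        saved_size, i = frame
        if i > 0:
            # Back from a deeper level: restore the size saved in this frame
            # and undo the swap made for the previous candidate.
            size = saved_size
            chosen.pop()
            dom[i - 1], dom[size - 1] = dom[size - 1], dom[i - 1]
        if len(chosen) == k:
            out.append(tuple(chosen))
            stack.pop()
            continue
        if i == size:        # every candidate of this level has been tried
            stack.pop()
            continue
        # Select candidate i: move it just past the active part and shrink.
        size -= 1
        dom[i], dom[size] = dom[size], dom[i]
        chosen.append(dom[size])
        frame[1] = i + 1
        stack.append([size, 0])
    return out
```

Without the `size = saved_size` line, a level would resume with the smaller size left behind by the deeper levels and would silently skip candidates, which is the inconsistency that the saved value in the stack frame is meant to prevent.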
Algorithm 2. The many-core GPU-based iterative branch-and-bound MCS procedure, consisting of the functions gpuParalleMCS (G, H), kernel, and solve.
3.2.4. Complexity and Computational Efficiency
4. The Portfolio Approach
4.1. Portfolio-Oriented Heuristics
4.1.1. Reordering the Adjacency Matrix
4.1.2. Dead-End Recovery and Restart Strategy
4.2. The Portfolio Engines
- The original versions (sequential and parallel, respectively) implemented by McCreesh et al. [3] in C++.
- Version 1 is a sequential re-implementation of the original code in the C language. Even if the complexity of the native-thread programming model is comparable in C and C++, the threaded work must be described as a function, so programming with native threads may look more natural in a language like C. This code is also the starting point for all our implementations, as it is more coherent with all subsequent requirements, modifications, and choices.
- Version 2 is the C multi-thread version described in Section 3.1. It logically derives from and .
- Version 3 is an intermediate CPU single-thread implementation that removes recursion and decreases memory usage. It is logically the starting point for comparison for the following two versions.
- Version 4 is a CPU multi-thread implementation based on the same principles as the following CUDA implementation.
- Version 5 is the GPU many-thread implementation described in Section 3.2. It is based on and .
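One common way to combine such engines, assumed here for illustration, is to start all of them on the same instance and keep the answer of whichever finishes first. The sketch below uses Python threads as stand-ins for the CPU and GPU engines; `run_portfolio`, `fast_engine`, `slow_engine`, and the shared `stop` event are hypothetical names, and the scheduling policy actually used by the portfolio is not taken from the text above.

```python
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_portfolio(engines, instance):
    """Run every engine on the same instance.

    engines: dict mapping an engine name to a callable(instance, stop).
    Returns (name, result) of the first engine that finishes.
    """
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {
            pool.submit(fn, instance, stop): name
            for name, fn in engines.items()
        }
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        stop.set()                      # ask the remaining engines to give up
        winner = next(iter(done))
        return futures[winner], winner.result()


def fast_engine(instance, stop):
    # Stand-in for an engine that solves the instance quickly.
    return len(instance)


def slow_engine(instance, stop):
    # Stand-in for an engine that keeps searching until told to stop.
    stop.wait(timeout=5.0)
    return None
```

A call such as `run_portfolio({"slow": slow_engine, "fast": fast_engine}, [1, 2, 3])` returns the name and result of the fast engine, while the slow one exits as soon as the stop event is set. Real engines would need to poll the event inside their search loops for the cancellation to take effect.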
5. Experimental Results
5.1. Data-Sets Description
5.2. Results on Small-Size Graphs
5.3. Results on Medium-Size Graphs
5.4. Performance Analysis
6. Conclusions
7. Future Works
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Barrow, H.G.; Burstall, R.M. Subgraph Isomorphism, Matching Relational Structures and Maximal Cliques. Inf. Process. Lett. 1976, 4, 83–84.
- Bron, C.; Kerbosch, J. Finding All Cliques of an Undirected Graph (algorithm 457). Commun. ACM 1973, 16, 575–576.
- McCreesh, C.; Prosser, P.; Trimble, J. A Partitioning Algorithm for Maximum Common Subgraph Problems. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI’17), Melbourne, Australia, 19–25 August 2017; pp. 712–719.
- Mattson, T.; Sanders, B.; Massingill, B. Patterns for Parallel Programming, 1st ed.; Addison-Wesley Professional: Boston, MA, USA, 2004; p. 384.
- McCool, M.; Reinders, J.; Robison, A. Structured Parallel Programming: Patterns for Efficient Computation, 1st ed.; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2012.
- Garbo, A.; Quer, S. A Fast MPEG’s CDVS Implementation for GPU Featured in Mobile Devices. IEEE Access 2018, 6, 52027–52046.
- Cabodi, G.; Camurati, P.; Garbo, A.; Giorelli, M.; Quer, S.; Savarese, F. A Smart Many-Core Implementation of a Motion Planning Framework along a Reference Path for Autonomous Cars. Electronics 2019, 8, 177.
- The SAT Competition Web Page. Available online: http://www.satcompetition.org/ (accessed on 1 October 2019).
- The SMT Competition Web Page. Available online: https://smt-comp.github.io/2019/index.html (accessed on 1 October 2019).
- Kotthoff, L.; McCreesh, C.; Solnon, C. Portfolios of Subgraph Isomorphism Algorithms. In Learning and Intelligent Optimization; Festa, P., Sellmann, M., Vanschoren, J., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 107–122.
- De Santo, M.; Foggia, P.; Sansone, C.; Vento, M. A Large Database of Graphs and its Use for Benchmarking Graph Isomorphism Algorithms. Pattern Recognit. Lett. 2003, 24, 1067–1079.
- Foggia, P.; Sansone, C.; Vento, M. A Database of Graphs for Isomorphism and Sub-Graph Isomorphism Benchmarking. In Proceedings of the 3rd IAPR TC-15 International Workshop on Graph-based Representations, Ischia, Italy, 23–25 May 2001; pp. 176–187.
- Bunke, H.; Foggia, P.; Guidobaldi, C.; Sansone, C.; Vento, M. A Comparison of Algorithms for Maximum Common Subgraph on Randomly Connected Graphs. In Proceedings of the Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), Windsor, ON, Canada, 6–9 August 2002; pp. 123–132.
- Conte, D.; Foggia, P.; Vento, M. Challenging Complexity of Maximum Common Subgraph Detection Algorithms: A Performance Analysis of Three Algorithms on a Wide Database of Graphs. J. Graph Algorithms Appl. 2007, 11, 99–143.
- Vismara, P.; Valery, B. Finding Maximum Common Connected Subgraphs Using Clique Detection or Constraint Satisfaction Algorithms. In Modelling, Computation and Optimization in Information Systems and Management Sciences; Le Thi, H.A., Bouvry, P., Pham Dinh, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2008; pp. 358–368.
- Minot, M.; Ndiaye, S.N. Searching for a Maximum Common Induced Subgraph by Decomposing the Compatibility Graph. In Proceedings of the Workshop in Bridging the Gap Between Theory and Practice in Constraint Solvers (CP2014), Lyon, France, 8–12 September 2014; pp. 1–17.
- Chen, A.C.L.; Elhajj, A.; Gao, S.; Sarhan, A.; Afra, S.; Kassem, A.; Alhajj, R. Approximating the Maximum Common Subgraph Isomorphism Problem with a Weighted Graph. Knowl. Based Syst. 2015, 85, 265–276.
- Bunke, H.; Foggia, P.; Guidobaldi, C.; Vento, M. Graph Clustering Using the Weighted Minimum Common Supergraph. In Proceedings of the 4th IAPR International Conference on Graph Based Representations in Pattern Recognition (GbRPR’03), York, UK, 30 June–2 July 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 235–246.
- Blondel, V.; Gajardo, A.; Heymans, M.; Senellart, P.; Van Dooren, P. A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching. SIAM Rev. 2004, 46, 647–666.
- Zager, L.A. Graph Similarity and Matching. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2005.
- Bunke, H. On a relation between graph edit distance and maximum common subgraph. Pattern Recognit. Lett. 1997, 18, 689–694.
- Venero, M.L.F.; Valiente, G. A graph distance metric combining maximum common subgraph and minimum common supergraph. Pattern Recognit. Lett. 2001, 22, 753–758.
- McGregor, J.J. Backtrack Search Algorithms and the Maximal Common Subgraph Problem. Softw. Pract. Exp. 1982, 12, 23–34.
- Ndiaye, S.M.; Solnon, C. CP Models for Maximum Common Subgraph Problems. In Proceedings of the 17th International Conference of Principles and Practice of Constraint Programming, Perugia, Italy, 12–16 September 2011; pp. 637–644.
- Balas, E.; Yu, C. Finding a Maximum Clique in an Arbitrary Graph. SIAM J. Comput. 1986, 15, 1054–1068.
- Raymond, J.W.; Willett, P. Maximum Common Subgraph Isomorphism Algorithms for the Matching of Chemical Structures. J. Comput. Aided Mol. Des. 2002, 16, 521–533.
- McCreesh, C.; Ndiaye, S.N.; Prosser, P.; Solnon, C. Clique and Constraint Models for Maximum Common (connected) Subgraph Problems. In International Conference on Principles and Practice of Constraint Programming; Springer: Berlin/Heidelberg, Germany, 2016; pp. 350–368.
- Piva, B.; de Souza, C.C. Polyhedral study of the maximum common induced subgraph problem. Ann. Oper. Res. 2012, 199, 77–102.
- Englert, P.; Kovács, P. Efficient Heuristics for Maximum Common Substructure Search. J. Chem. Inf. Model. 2015, 55, 941–955.
- Hoffmann, R.; McCreesh, C.; Reilly, C. Between subgraph isomorphism and maximum common subgraph. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 3907–3914.
- McCreesh, C.; Prosser, P. A Parallel, Backjumping Subgraph Isomorphism Algorithm Using Supplemental Graphs. In Principles and Practice of Constraint Programming; Pesant, G., Ed.; Springer International Publishing: Cham, Switzerland, 2015; pp. 295–312.
- Archibald, B.; Dunlop, F.; Hoffmann, R.; McCreesh, C.; Prosser, P.; Trimble, J. Sequential and Parallel Solution-Biased Search for Subgraph Algorithms. In Integration of Constraint Programming, Artificial Intelligence, and Operations Research; Rousseau, L.M., Stergiou, K., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 20–38.
- Minot, M.; Ndiaye, S.; Solnon, C. A Comparison of Decomposition Methods for the Maximum Common Subgraph Problem. In Proceedings of the IEEE 27th International Conference on Tools with Artificial Intelligence (ICTAI), Vietri sul Mare, Italy, 9–11 November 2015; pp. 461–468.
- McCreesh, C. Solving Hard Subgraph Problems in Parallel. Ph.D. Thesis, University of Glasgow, Glasgow, UK, 2017.
- Hoffmann, R.; McCreesh, C.; Ndiaye, S.N.; Prosser, P.; Reilly, C.; Solnon, C.; Trimble, J. Observations from Parallelising Three Maximum Common (Connected) Subgraph Algorithms. In International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research; Springer: Cham, Switzerland, 2018; pp. 298–315.
- Kimmig, R.; Meyerhenke, H.; Strash, D. Shared Memory Parallel Subgraph Enumeration. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lake Buena Vista, FL, USA, 29 May–2 June 2017; pp. 519–529.
- McCreesh, C.; Prosser, P. The Shape of the Search Tree for the Maximum Clique Problem and the Implications for Parallel Branch and Bound. ACM Trans. Parallel Comput. 2015, 2.
- Trimble, J. McSplit Implementations. Available online: https://github.com/ciaranm/cpaior2018-parallel-mcs-paper/tree/master/james-cpp-parallel (accessed on 1 October 2019).
- Lai, T.H.; Sahni, S. Anomalies in Parallel Branch-and-bound Algorithms. Commun. ACM 1984, 27, 594–602.
- Li, G.; Wah, B.W. Coping with Anomalies in Parallel Branch-and-Bound Algorithms. IEEE Trans. Comput. 1986, C-35, 568–573.
- de Bruin, A.; Kindervater, G.A.P.; Trienekens, H.W.J.M. Asynchronous parallel branch and bound and anomalies. In Parallel Algorithms for Irregularly Structured Problems; Ferreira, A., Rolim, J., Eds.; Springer: Berlin/Heidelberg, Germany, 1995; pp. 363–377.
- Malapert, A.; Régin, J.C.; Rezgui, M. Embarrassingly Parallel Search in Constraint Programming. J. Artif. Int. Res. 2016, 57, 421–464.
- Cabodi, G.; Loiacono, C.; Palena, M.; Pasini, P.; Patti, D.; Quer, S.; Vendraminetto, D.; Biere, A.; Heljanko, K. Hardware Model Checking Competition 2014: An Analysis and Comparison of Model Checkers and Benchmarks. Int. J. Satisf. Boolean Model. Comput. (JSAT) 2016, 9, 135–172.
- Bordeaux, L.; Hamadi, Y.; Samulowitz, H. Experiments with Massively Parallel Constraint Solving. In Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence, Acapulco, Mexico, 9–10 August 2003.
- Xu, L.; Hutter, F.; Hoos, H.H.; Leyton-Brown, K. SATzilla: Portfolio-based Algorithm Selection for SAT. J. Artif. Intell. Res. 2008, 32, 565–606.
- Pulina, L.; Tacchella, A. A self-adaptive multi-engine Solver for Quantified Boolean Formulas. Constraints 2009, 14, 80–116.
- Hamadi, Y.; Sais, L. ManySAT: A Parallel SAT Solver. Int. J. Satisf. Boolean Model. Comput. 2009, 6, 245–262.
- Hellerman, S.; Rarick, D.C. The Partitioned Preassigned Pivot Procedure (P4). Sparse Matrices Their Appl. 1972, 67–76.
- Gomes, C.P.; Selman, B.; Kautz, H. Boosting Combinatorial Search Through Randomization. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98) Tenth Conference on Innovative Applications of Artificial Intelligence (IAAI-98), Madison, WI, USA, 26–30 July 1998; pp. 431–437.
- Hariharan, R.; Janakiraman, A.; Nilakantan, R.; Singh, B.; Varghese, S.; Landrum, G.; Schuffenhauer, A. MultiMCS: A Fast Algorithm for the Maximum Common Substructure Problem on Multiple Molecules. J. Chem. Inf. Model. 2011, 51, 788–806.
- Dalke, A.; Hastings, J. FMCS: A novel algorithm for the multiple MCS problem. J. Cheminform. 2013, 5, 1.
(a)
v | Label | v | Label |
---|---|---|---|
b | 1 | c | 1 |
c | 1 | a | 1 |
d | 1 | d | 1 |
(b)
v | Label | v | Label |
---|---|---|---|
c | 11 | a | 11 |
d | 10 | d | 11 |
(c)
v | Label | v | Label |
---|---|---|---|
d | 101 | d | 110 |
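The way the labels in panels (a) to (c) grow can be sketched in a few lines of Python. The sketch assumes that, each time a pair of vertices is selected, every remaining vertex gets one bit appended to its label: 1 if it is adjacent to the selected vertex of its own graph, 0 otherwise. The two edge sets below are hypothetical and were chosen only so that the resulting labels match the panels; the function names `relabel` and `bidomains` are also assumptions.

```python
def relabel(labels, adj, chosen):
    """Drop the chosen vertex and append one adjacency bit to every other label."""
    return {
        v: lab + ("1" if v in adj[chosen] else "0")
        for v, lab in labels.items()
        if v != chosen
    }


def bidomains(labels_g, labels_h):
    """Group vertices by label, keeping only labels present in both graphs."""
    shared = set(labels_g.values()) & set(labels_h.values())
    return {
        lab: (
            sorted(v for v, l in labels_g.items() if l == lab),
            sorted(v for v, l in labels_h.items() if l == lab),
        )
        for lab in shared
    }


# Hypothetical graphs (adjacency sets) consistent with the label panels.
G = {"b": {"c"}, "c": {"b", "d"}, "d": {"c"}}
H = {"c": {"a", "d"}, "a": {"c"}, "d": {"c"}}

labels_g = {v: "1" for v in G}            # panel (a), left columns
labels_h = {v: "1" for v in H}            # panel (a), right columns

labels_g_b = relabel(labels_g, G, "b")    # select b in the first graph
labels_h_b = relabel(labels_h, H, "c")    # select c in the second graph

labels_g_c = relabel(labels_g_b, G, "c")  # then select c in the first graph
labels_h_c = relabel(labels_h_b, H, "a")  # and a in the second graph
```

Under these assumptions only vertices that share a label can still be paired, so after the first selection the single usable bidomain is the one labelled 11, and after the second selection no pairing is left because 101 and 110 differ.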
2586 | 2535 | 2538 | 2530 | solved | 2665 | 2636 | 2633 | 2633 |
1321 | 1478 | 1595 | 1499 | total time | 822 | 1100 | 1039 | 1027 |
0.4798 | 0.5367 | 0.5789 | 0.5442 | avg. time | 0.2984 | 0.3993 | 0.3772 | 0.3731 |
273 | 304 | 340 | 352 | # best | 250 | 345 | 331 | 315 |
Code Line | Divergence [%] | Divergence [#] | Total Executions |
---|---|---|---|
253 | 9.4 | 17,078 | 180,781 |
325 | 40.7 | 26,932 | 66,120 |
329 | 22.1 | 6806 | 30,819 |
334 | 22.3 | 6884 | 30,819 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Quer, S.; Marcelli, A.; Squillero, G. The Maximum Common Subgraph Problem: A Parallel and Multi-Engine Approach. Computation 2020, 8, 48. https://doi.org/10.3390/computation8020048