# Multi-Threading a State-of-the-Art Maximum Clique Algorithm


## Abstract


## 1. Introduction

#### 1.1. Preliminaries

## 2. Algorithms for the Maximum Clique Problem

**Algorithm 1.** Sequentially colour vertices and sort them into non-decreasing colour order.
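The body of Algorithm 1 is not reproduced in this extract. As an illustration only, the following is a minimal Python sketch of a standard greedy sequential colouring of the kind used as a bound in this family of algorithms; the function name and the adjacency-set representation `adjacent[v]` are our assumptions, not the paper's (which works over bitsets).

```python
def colour_order(vertices, adjacent):
    """Greedily colour `vertices` and return them in non-decreasing
    colour order, together with each vertex's colour number.

    `adjacent[v]` is assumed to be the set of neighbours of v.
    """
    order, colours = [], []
    uncoloured = list(vertices)
    colour = 0
    while uncoloured:
        colour += 1
        colour_class = []            # vertices receiving this colour
        candidates = list(uncoloured)
        while candidates:
            v = candidates.pop(0)
            colour_class.append(v)
            # v's neighbours cannot share its colour
            candidates = [u for u in candidates if u not in adjacent[v]]
        for v in colour_class:
            uncoloured.remove(v)
            order.append(v)
            colours.append(colour)
    return order, colours
```

Because vertices are emitted class by class, the colour numbers in `colours` are non-decreasing, which is what the bound in the search relies upon.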

**Algorithm 2.** An exact algorithm to deliver a maximum clique.
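Algorithm 2's body is likewise absent from this extract. The following self-contained Python sketch shows the general shape of a Tomita-style, colour-bounded branch-and-bound search of this family; all identifiers are illustrative, and the bitset encodings and initial vertex orderings of the paper's $\mathtt{BBMC}$-style implementation are omitted.

```python
def max_clique(vertices, adjacent):
    """Sketch of a colour-bounded branch-and-bound maximum clique search.

    `adjacent[v]` is assumed to be the set of neighbours of v; names are
    illustrative rather than the paper's.
    """
    def colour_order(candidates):
        # Greedy sequential colouring: returns candidates in
        # non-decreasing colour order, paired with their colour numbers.
        order, colours, uncoloured = [], [], list(candidates)
        colour = 0
        while uncoloured:
            colour += 1
            available = list(uncoloured)
            while available:
                v = available.pop(0)
                order.append(v)
                colours.append(colour)
                uncoloured.remove(v)
                available = [u for u in available if u not in adjacent[v]]
        return order, colours

    best = []

    def expand(clique, candidates):
        nonlocal best
        order, colours = colour_order(candidates)
        for i in range(len(order) - 1, -1, -1):    # highest colour first
            if len(clique) + colours[i] <= len(best):
                return                             # colour bound: prune
            v = order[i]
            chosen = clique + [v]
            # candidates that remain adjacent to everything chosen so far
            rest = [u for u in order[:i] if u in adjacent[v]]
            if rest:
                expand(chosen, rest)
            elif len(chosen) > len(best):
                best = chosen                      # new incumbent

    expand([], list(vertices))
    return best
```

Because `order` is in non-decreasing colour order, once the bound fails for one index it fails for all earlier indices, so the loop can return rather than continue.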

## 3. Parallel Algorithm Design

**Algorithm 3.** A threaded algorithm to deliver a maximum clique.

**Figure 2.** Work splitting and queueing mechanism for our parallel algorithm. Nodes correspond to a call to $\mathtt{expand}$. Work donation occurs once, when the donating worker’s position is at the node marked ☆.
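As a rough illustration of the queueing side of this design (not the paper's implementation), a worker pool draining a shared queue of depth-one subtrees might look like the sketch below; the one-time donation shown at ☆ is deliberately not modelled, and all names are ours.

```python
import queue
import threading

def solve_in_parallel(subproblems, solve, n_workers=4):
    """Distribute top-level subtrees (calls to expand at depth 1) over a
    shared work queue; each worker repeatedly takes the next unstarted
    subtree and publishes any improved incumbent under a lock.
    """
    q = queue.Queue()
    for sub in subproblems:
        q.put(sub)

    best = []                        # shared incumbent clique
    lock = threading.Lock()

    def worker():
        nonlocal best
        while True:
            try:
                sub = q.get_nowait()
            except queue.Empty:
                return               # no work left for this thread
            result = solve(sub)      # explore this subtree
            with lock:               # publish improved incumbent
                if len(result) > len(best):
                    best = result

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return best
```

A queue of depth-one subtrees gives many more units of work than there are threads, which is what lets an idle worker pick up something new instead of waiting.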

#### 3.1. Goals of Parallelism

`expand`. If an algorithm is not work efficient, we define its efficiency to be the ratio of the work done by the sequential algorithm to the work done by the parallel algorithm, expressed as a percentage. If the time taken to execute each “unit of work” is roughly the same, we would expect an efficiency greater than 100% to be a necessary, but not sufficient, condition for obtaining a superlinear speedup. (This is not entirely true, because of cache effects: the working memory might fit entirely in cache when using multiple cores, but not when using a single core. For our purposes, however, this effect is negligible.)
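The two quantities above are easy to conflate, so a literal transcription of the definitions may help; the function names are ours.

```python
def speedup(seq_time, par_time):
    """Wall-clock speedup of the parallel run over the sequential run."""
    return seq_time / par_time

def work_efficiency(seq_nodes, par_nodes):
    """Ratio of sequential to parallel work (search nodes), as a
    percentage; over 100% means the parallel run searched fewer nodes."""
    return 100.0 * seq_nodes / par_nodes
```

For example, `speedup(274.9, 11.7)` gives, to rounding, the 23.4 reported for brock400_1 in Table 2; the efficiency column, by contrast, is computed from node counts, not times.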

#### 3.2. Complications from Hyper-Threading

## 4. Experimental Evaluation

#### 4.1. Implementation

#### 4.2. Experimental Data and Methodology

#### 4.3. Comparison of Sequential Algorithm to Published Results

**Table 1.** Comparison of runtimes (in seconds) for our sequential implementation with San Segundo’s published results for $\mathtt{BBMCI}$ [11] (which differs slightly from our algorithm) and with runtimes using Prosser’s $\mathtt{BBMC1}$ [12] (which is the same as our algorithm, but in Java). The system used to produce these results has the same model of CPU as was used by San Segundo.

Problem | Our Runtime (s) | $\mathtt{BBMCI}$ (s) | $\mathtt{BBMC1}$ (s)
---|---|---|---
brock400_1 | 198 | 341 | 507
brock400_2 | 144 | 144 | 371
brock400_3 | 114 | 229 | 294
brock400_4 | 56 | 133 | 146

#### 4.4. Threaded Experimental Results on Standard Benchmarks

**Table 2.** Experimental results for DIMACS instances. Shown are the size of a maximum clique (ω), sequential runtimes and the number of search nodes (calls to $\mathtt{expand}$), followed by parallel runtimes, speedups and efficiencies using 4 to 24 threads on a 12-core hyper-threaded system. Superlinear speedups and efficiencies greater than 100% are shown in bold; blanks indicate unattempted problems. Problems whose sequential run took under one second are omitted.

Problem | ω | Seq. Time | Search Nodes | 4: Time | 4: Speedup | 4: Eff. | 8: Time | 8: Speedup | 8: Eff. | 12: Time | 12: Speedup | 12: Eff. | 24: Time | 24: Speedup | 24: Eff.
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
brock400_1 | 27 | 274.9 s | 2.0 × 10^8 | 69.3 s | 4.0 | 99% | 36.0 s | 7.6 | 95% | 25.3 s | 10.9 | 90% | 11.7 s | 23.4 | 159%
brock400_2 | 29 | 200.8 s | 1.5 × 10^8 | 50.7 s | 4.0 | 99% | 22.2 s | 9.0 | 117% | 16.6 s | 12.1 | 105% | 8.9 s | 22.5 | 149%
brock400_3 | 31 | 159.4 s | 1.2 × 10^8 | 38.6 s | 4.1 | 106% | 17.4 s | 9.1 | 118% | 10.6 s | 15.1 | 135% | 5.7 s | 28.0 | 193%
brock400_4 | 33 | 77.5 s | 5.4 × 10^7 | 17.9 s | 4.3 | 111% | 8.5 s | 9.1 | 118% | 1.9 s | 40.4 | 477% | 1.7 s | 45.5 | 358%
brock800_1 | 23 | 4,969.8 s | 2.2 × 10^9 | 1,216.5 s | 4.1 | 104% | 587.1 s | 8.5 | 108% | 405.6 s | 12.3 | 105% | 269.9 s | 18.4 | 122%
brock800_2 | 24 | 4,958.2 s | 2.2 × 10^9 | 1,237.8 s | 4.0 | 101% | 584.8 s | 8.5 | 109% | 386.0 s | 12.8 | 111% | 266.7 s | 18.6 | 123%
brock800_3 | 25 | 4,590.7 s | 2.1 × 10^9 | 1,125.2 s | 4.1 | 103% | 533.2 s | 8.6 | 110% | 347.8 s | 13.2 | 114% | 222.2 s | 20.7 | 138%
brock800_4 | 26 | 1,733.0 s | 6.4 × 10^8 | 408.7 s | 4.2 | 110% | 220.3 s | 7.9 | 97% | 152.3 s | 11.4 | 96% | 131.5 s | 13.2 | 77%
C250.9 | 44 | 1,606.8 s | 1.1 × 10^9 | 411.2 s | 3.9 | 98% | 228.1 s | 7.0 | 88% | 147.8 s | 10.9 | 96% | 149.0 s | 10.8 | 97%
C500.9 | ≥54 | >1 day | | | | | | | | | | | | |
C1000.9 | ≥58 | >1 day | | | | | | | | | | | | |
C2000.5 | 16 | 67,058.8 s | 1.8 × 10^10 | 17,023.9 s | 3.9 | 100% | 8,334.3 s | 8.0 | 100% | 5,633.0 s | 11.9 | 100% | 4,347.9 s | 15.4 | 100%
C2000.9 | ≥65 | >1 day | | | | | | | | | | | | |
C4000.5 | 18 | 19 days using 32 threads on a 16-core hyper-threaded dual Xeon E5-2660 shared with other users. | | | | | | | | | | | | |
DSJC500_5 | 13 | 1.0 s | 1.2 × 10^6 | 266 ms | 3.9 | 101% | 152 ms | 6.7 | 101% | 130 ms | 7.9 | 99% | 89 ms | 11.5 | 99%
DSJC1000_5 | 15 | 135.7 s | 7.7 × 10^7 | 34.7 s | 3.9 | 99% | 17.4 s | 7.8 | 98% | 11.7 s | 11.6 | 99% | 9.1 s | 15.0 | 98%
gen200_p0.9_44 | 44 | 2.5 s | 1.8 × 10^6 | 654 ms | 3.9 | 100% | 109 ms | 23.2 | 471% | 100 ms | 25.3 | 316% | 95 ms | 26.6 | 540%
gen400_p0.9_55 | 55 | 36 h using 32 threads on a 16-core hyper-threaded dual Xeon E5-2660 shared with other users. | | | | | | | | | | | | |
gen400_p0.9_65 | 65 | 431,310.7 s | 1.8 × 10^11 | 96,329.7 s | 4.5 | 114% | 38,514.8 s | 11.2 | 140% | 16,921.6 s | 25.5 | 216% | 17,755.0 s | 24.3 | 162%
gen400_p0.9_75 | 75 | 247,538.3 s | 1.0 × 10^11 | 22,715.4 s | 10.9 | 309% | 17,211.1 s | 14.4 | 214% | 11,594.2 s | 21.4 | 200% | 3,799.6 s | 65.1 | 445%
hamming10-4 | ≥40 | >1 day | | | | | | | | | | | | |
johnson32-2-4 | ≥16 | >1 day | | | | | | | | | | | | |
keller5 | 27 | 153,970.1 s | 5.1 × 10^10 | 38,817.9 s | 4.0 | 100% | 19,288.2 s | 8.0 | 100% | 12,793.6 s | 12.0 | 100% | 10,241.3 s | 15.0 | 100%
keller6 | ≥55 | >1 day | | | | | | | | | | | | |
MANN_a45 | 345 | 224.8 s | 2.9 × 10^6 | 56.3 s | 4.0 | 100% | 27.1 s | 8.3 | 108% | 18.2 s | 12.3 | 122% | 12.5 s | 17.9 | 161%
MANN_a81 | 1,100 | 31 days using 24 threads on a 12-core hyper-threaded dual Xeon E5645 shared with other users. | | | | | | | | | | | | |
p_hat300-3 | 36 | 1.1 s | 6.2 × 10^5 | 291 ms | 3.7 | 94% | 156 ms | 7.0 | 93% | 129 ms | 8.4 | 91% | 103 ms | 10.5 | 89%
p_hat500-3 | 50 | 108.7 s | 3.9 × 10^7 | 29.5 s | 3.7 | 95% | 15.1 s | 7.2 | 90% | 10.8 s | 10.1 | 86% | 8.1 s | 13.4 | 88%
p_hat700-2 | 44 | 3.1 s | 7.5 × 10^5 | 946 ms | 3.3 | 102% | 402 ms | 7.7 | 103% | 403 ms | 7.7 | 100% | 270 ms | 11.5 | 93%
p_hat700-3 | 62 | 1,627.6 s | 2.8 × 10^8 | 419.9 s | 3.9 | 98% | 223.9 s | 7.3 | 91% | 156.8 s | 10.4 | 87% | 120.4 s | 13.5 | 90%
p_hat1000-2 | 46 | 159.2 s | 3.4 × 10^7 | 40.5 s | 3.9 | 98% | 20.4 s | 7.8 | 97% | 14.3 s | 11.1 | 96% | 11.7 s | 13.7 | 92%
p_hat1000-3 | 68 | 804,428.9 s | 1.3 × 10^11 | 200,853.7 s | 4.0 | 101% | 101,303.7 s | 7.9 | 100% | 67,659.5 s | 11.9 | 100% | 53,424.6 s | 15.1 | 98%
p_hat1500-1 | 12 | 3.2 s | 1.2 × 10^6 | 821 ms | 3.9 | 100% | 433 ms | 7.4 | 100% | 341 ms | 9.4 | 100% | 259 ms | 12.4 | 100%
p_hat1500-2 | 65 | 24,338.5 s | 2.0 × 10^9 | 6,117.3 s | 4.0 | 99% | 3,089.0 s | 7.9 | 98% | 2,094.6 s | 11.6 | 96% | 1,789.1 s | 13.6 | 91%
p_hat1500-3 | 94 | 128 days using 32 threads on a 16-core hyper-threaded dual Xeon E5-2660 shared with other users. | | | | | | | | | | | | |
san200_0.9_3 | 44 | 8.5 s | 6.8 × 10^6 | 439 ms | 19.5 | 595% | 319 ms | 26.8 | 417% | 177 ms | 48.3 | 734% | 271 ms | 31.5 | 567%
san400_0.7_2 | 30 | 2.0 s | 8.9 × 10^5 | 590 ms | 3.4 | 105% | 298 ms | 6.7 | 90% | 176 ms | 11.4 | 116% | 76 ms | 26.3 | 216%
san400_0.7_3 | 22 | 1.3 s | 5.2 × 10^5 | 84 ms | 15.0 | 529% | 62 ms | 20.3 | 475% | 54 ms | 23.4 | 396% | 58 ms | 21.7 | 253%
san400_0.9_1 | 100 | 23.5 s | 4.5 × 10^6 | 5.3 s | 4.4 | 133% | 312 ms | 75.3 | 1,357% | 230 ms | 102.2 | 1,353% | 191 ms | 123.0 | 1,217%
san1000 | 15 | 1.9 s | 1.5 × 10^5 | 488 ms | 3.9 | 101% | 281 ms | 6.8 | 107% | 173 ms | 11.1 | 108% | 108 ms | 17.7 | 139%
sanr200_0.9 | 42 | 19.4 s | 1.5 × 10^7 | 5.3 s | 3.7 | 92% | 2.8 s | 6.8 | 89% | 2.2 s | 9.0 | 85% | 3.0 s | 6.4 | 66%
sanr400_0.7 | 21 | 72.1 s | 6.4 × 10^7 | 18.1 s | 4.0 | 100% | 9.1 s | 7.9 | 100% | 6.2 s | 11.7 | 100% | 4.6 s | 15.7 | 100%

**Table 3.** Experimental results for BHOSLIB instances. Shown are the size of a maximum clique (ω), sequential runtimes and the number of search nodes (calls to $\mathtt{expand}$), followed by parallel runtimes, speedups and efficiencies using 4 to 24 threads on a 12-core hyper-threaded system. Superlinear speedups and efficiencies greater than 100% are shown in bold.

Problem | ω | Seq. Time | Search Nodes | 4: Time | 4: Speedup | 4: Eff. | 8: Time | 8: Speedup | 8: Eff. | 12: Time | 12: Speedup | 12: Eff. | 24: Time | 24: Speedup | 24: Eff.
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
frb30-15-1 | 30 | 657.1 s | 2.9 × 10^8 | 160.2 s | 4.1 | 102% | 76.6 s | 8.6 | 107% | 43.9 s | 15.0 | 130% | 35.5 s | 18.5 | 127%
frb30-15-2 | 30 | 1,183.1 s | 5.6 × 10^8 | 287.7 s | 4.1 | 102% | 141.7 s | 8.3 | 105% | 93.6 s | 12.6 | 109% | 65.8 s | 18.0 | 131%
frb30-15-3 | 30 | 356.7 s | 1.7 × 10^8 | 80.8 s | 4.4 | 113% | 38.8 s | 9.2 | 118% | 25.3 s | 14.1 | 125% | 19.5 s | 18.3 | 133%
frb30-15-4 | 30 | 1,963.2 s | 9.9 × 10^8 | 496.0 s | 4.0 | 100% | 246.1 s | 8.0 | 100% | 166.0 s | 11.8 | 100% | 124.4 s | 15.8 | 104%
frb30-15-5 | 30 | 577.1 s | 2.8 × 10^8 | 129.2 s | 4.5 | 115% | 68.4 s | 8.4 | 109% | 44.4 s | 13.0 | 118% | 42.1 s | 13.7 | 100%
frb35-17-1 | 35 | 51,481.7 s | 1.3 × 10^10 | 12,072.8 s | 4.3 | 108% | 5,949.7 s | 8.7 | 110% | 3,800.8 s | 13.5 | 116% | 2,532.0 s | 20.3 | 144%
frb35-17-2 | 35 | 91,275.0 s | 2.3 × 10^10 | 21,867.3 s | 4.2 | 105% | 10,959.2 s | 8.3 | 105% | 7,175.1 s | 12.7 | 107% | 5,677.3 s | 16.1 | 108%
frb35-17-3 | 35 | 33,852.1 s | 8.2 × 10^9 | 8,278.8 s | 4.1 | 103% | 4,063.2 s | 8.3 | 105% | 2,813.6 s | 12.0 | 101% | 2,349.3 s | 14.4 | 96%
frb35-17-4 | 35 | 37,629.2 s | 8.9 × 10^9 | 9,319.5 s | 4.0 | 101% | 4,522.7 s | 8.3 | 105% | 2,638.6 s | 14.3 | 122% | 2,196.1 s | 17.1 | 111%
frb35-17-5 | 35 | 205,356.0 s | 5.8 × 10^10 | 49,901.9 s | 4.1 | 103% | 25,130.3 s | 8.2 | 102% | 16,365.4 s | 12.5 | 105% | 10,363.4 s | 19.8 | 137%

#### 4.5. Threaded Experimental Results on Larger Sparse Random Graphs

**Table 4.** Experimental results for larger, sparser random graph instances, with a sample size of 10 per instance class. Shown are average sequential runtimes, followed by average parallel runtimes and speedups using 4 to 24 threads on a 12-core hyper-threaded system.

Problem | Seq. Time | 4: Time | 4: Speedup | 8: Time | 8: Speedup | 12: Time | 12: Speedup | 24: Time | 24: Speedup
---|---|---|---|---|---|---|---|---|---
G(1,000, 0.1) | 18 ms | 18 ms | 1.0 | 23 ms | 0.8 | 26 ms | 0.7 | 33 ms | 0.5
G(1,000, 0.2) | 65 ms | 30 ms | 2.2 | 30 ms | 2.2 | 32 ms | 2.0 | 36 ms | 1.8
G(1,000, 0.3) | 532 ms | 151 ms | 3.5 | 100 ms | 5.3 | 83 ms | 6.4 | 76 ms | 7.0
G(1,000, 0.4) | 6.1 s | 1.6 s | 3.9 | 860 ms | 7.1 | 585 ms | 10.5 | 439 ms | 13.9
G(1,000, 0.5) | 138.6 s | 34.9 s | 4.0 | 17.6 s | 7.9 | 12.2 s | 11.3 | 8.9 s | 15.6
G(3,000, 0.1) | 640 ms | 220 ms | 2.9 | 171 ms | 3.7 | 156 ms | 4.1 | 168 ms | 3.8
G(3,000, 0.2) | 11.9 s | 3.1 s | 3.9 | 1.7 s | 7.0 | 1.2 s | 10.1 | 900 ms | 13.3
G(3,000, 0.3) | 358.5 s | 90.3 s | 4.0 | 45.6 s | 7.9 | 30.7 s | 11.7 | 23.2 s | 15.4
G(10,000, 0.1) | 84.6 s | 21.8 s | 3.9 | 11.5 s | 7.3 | 8.5 s | 9.9 | 7.3 s | 11.6
G(15,000, 0.1) | 403.5 s | 102.8 s | 3.9 | 53.8 s | 7.5 | 38.1 s | 10.6 | 33.2 s | 12.2

#### 4.6. A Detailed Look at a Super-Linear Speedup

**Figure 3.** Total CPU time spent (i.e., runtime multiplied by the number of threads) for “san400_0.9_1” from DIMACS, with varying numbers of threads. Also shown is the total CPU time taken to find a maximum clique, but not to prove its optimality. A linear speedup would give a horizontal line; downward-sloping lines are superlinear.

## 5. Conclusion and Future Work

## Conflict of Interest

## References

- Garey, M.R.; Johnson, D.S. Computers and Intractability; A Guide to the Theory of NP-Completeness; W. H. Freeman & Co.: New York, NY, USA, 1990. [Google Scholar]
- Cheeseman, P.; Kanefsky, B.; Taylor, W.M. Where the Really Hard Problems Are; Morgan Kaufmann: San Francisco, CA, 1991; pp. 331–337. [Google Scholar]
- Bomze, I.M.; Budinich, M.; Pardalos, P.M.; Pelillo, M. The Maximum Clique Problem. In Handbook of Combinatorial Optimization (Supplement Volume A); Kluwer Academic Publishers: Dordrecht, The Netherlands, 1999; Volume 4, pp. 1–74. [Google Scholar]
- Butenko, S.; Wilhelm, W.E. Clique-detection models in computational biochemistry and genomics. Eur. J. Oper. Res. **2006**, 173, 1–17.
- Sutter, H. The free lunch is over: A fundamental turn toward concurrency in software. Dr. Dobb’s J. **2005**, 30, 202–210.
- Gustafson, J.L. Reevaluating Amdahl’s law. Commun. ACM **1988**, 31, 532–533.
- Tomita, E.; Seki, T. An Efficient Branch-and-Bound Algorithm for Finding a Maximum Clique. In Proceedings of the 4th International Conference on Discrete Mathematics and Theoretical Computer Science (DMTCS’03), Dijon, France, 7–12 July 2003; Springer-Verlag: Berlin/Heidelberg, Germany, 2003; pp. 278–289.
- Tomita, E.; Kameda, T. An efficient branch-and-bound algorithm for finding a maximum clique with computational experiments. J. Glob. Optim. **2007**, 37, 95–111.
- Tomita, E.; Sutani, Y.; Higashi, T.; Takahashi, S.; Wakatsuki, M. A Simple and Faster Branch-and-Bound Algorithm for Finding a Maximum Clique. In Proceedings of WALCOM 2010, LNCS 5942, Dhaka, Bangladesh, 10–12 February 2010; pp. 191–203.
- San Segundo, P.; Rodríguez-Losada, D.; Jiménez, A. An exact bit-parallel algorithm for the maximum clique problem. Comput. Oper. Res. **2011**, 38, 571–581.
- San Segundo, P.; Matia, F.; Rodríguez-Losada, D.; Hernando, M. An improved bit parallel exact maximum clique algorithm. Optim. Lett. **2013**, 7, 467–479.
- Prosser, P. Exact algorithms for maximum clique: A computational study. Algorithms **2012**, 5, 545–587.
- Pattabiraman, B.; Patwary, M.M.A.; Gebremedhin, A.H.; Liao, W.-k.; Choudhary, A.N. Fast algorithms for the maximum clique problem on massive sparse graphs. CoRR **2012**, abs/1209.5818.
- Rossi, R.A.; Gleich, D.F.; Gebremedhin, A.H.; Patwary, M.M.A. A fast parallel maximum clique algorithm for large sparse graphs and temporal strong components. CoRR **2013**, abs/1302.6256.
- Gendron, B.; Crainic, T.G. Parallel branch-and-bound algorithms: Survey and synthesis. Oper. Res. **1994**, 42, 1042–1066.
- Bader, D.A.; Hart, W.E.; Phillips, C.A. Parallel Algorithm Design for Branch and Bound. In Tutorials on Emerging Methodologies and Applications in Operations Research: Presented at INFORMS 2004, Denver, CO; International Series in Operations Research & Management Science; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2004; pp. 5-1–5-44.
- McCreesh, C.; Prosser, P. Distributing an exact algorithm for maximum clique: Maximising the costup. CoRR **2012**, abs/1209.4560.
- Pardalos, P.; Rappe, J.; Resende, M. An Exact Parallel Algorithm for the Maximum Clique Problem. In High Performance Algorithms and Software in Nonlinear Optimization; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1998.
- Lai, T.H.; Sahni, S. Anomalies in parallel branch-and-bound algorithms. Commun. ACM **1984**, 27, 594–602.
- Mehrotra, R.; Gehringer, E.F. Superlinear Speedup Through Randomized Algorithms. In Proceedings of ICPP’85, University Park, PA, USA, 1985; pp. 291–300.
- Li, G.J.; Wah, B.W. Coping with anomalies in parallel branch-and-bound algorithms. IEEE Trans. Comput. **1986**, 35, 568–573.
- Clearwater, S.H.; Huberman, B.A.; Hogg, T. Cooperative solution of constraint satisfaction problems. Science **1991**, 254, 1181–1183.
- de Bruin, A.; Kindervater, G.A.P.; Trienekens, H.W.J.M. Asynchronous Parallel Branch and Bound and Anomalies. In Proceedings of the Second International Workshop on Parallel Algorithms for Irregularly Structured Problems (IRREGULAR ’95), Lyon, France, 4–6 September 1995; Springer-Verlag: London, UK, 1995; pp. 363–377.
- Marr, D.; Binns, F.; Hill, D.; Hinton, G.; Koufaty, D.; Miller, J.; Upton, M. Hyper-threading technology architecture and microarchitecture. Intel Technol. J. **2002**, 6, 4–15.
- Bulpin, J.; Pratt, I. Multiprogramming performance of the Pentium 4 with Hyper-Threading. In Proceedings of the Workshop on Duplicating, Deconstructing, and Debunking (WDDD04), München, Germany, 20 June 2004.
- McCreesh, C.; Prosser, P. Source code. Available online: http://github.com/ciaranm/multithreadedmaximumclique.
- DIMACS instances. Available online: http://dimacs.rutgers.edu/Challenges/.
- BHOSLIB instances. Available online: http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/graph-benchmarks.htm.
- Batsyn, M.; Goldengorin, B.; Maslov, E.; Pardalos, P. Improvements to MCS algorithm for the maximum clique problem. J. Comb. Optim. **2013**, 26, 1–20.

© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

McCreesh, C.; Prosser, P. Multi-Threading a State-of-the-Art Maximum Clique Algorithm. *Algorithms* **2013**, *6*, 618-635.
https://doi.org/10.3390/a6040618
