# Invariant Graph Partition Comparison Measures


## Abstract


## 1. Introduction

- Partition ${\mathcal{P}}_{1}=\{\{1,2\},\{3,4\}\}$ is mapped to the structurally equivalent partition ${\mathcal{P}}_{2}=\{\{1,4\},\{2,3\}\}$.
- Partition ${\mathcal{Q}}_{1}=\{\{1,3\},\{2,4\}\}$ is mapped to the identical partition ${\mathcal{Q}}_{2}$.

- Because ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$ are structurally equivalent, the RI should be one (as for Cases 1, 2 and 3) instead of $1/3$.
- Comparisons of structurally different partitions (Cases 4 and 5) and comparisons of structurally equivalent partitions (Case 6) should not result in the same value.
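
The pair counts behind these observations can be checked by direct enumeration. The following Python sketch counts node pairs for the three labeled partitions of $C_4$ and evaluates the Rand index; the helper names `pair_counts` and `rand_index` are illustrative choices and are not taken from the authors' software.

```python
from fractions import Fraction
from itertools import combinations


def pair_counts(p, q):
    """Count node pairs by co-membership: returns (N11, N10, N01, N00)."""
    nodes = sorted(set().union(*p))

    def together(partition, u, v):
        return any(u in cluster and v in cluster for cluster in partition)

    n11 = n10 = n01 = n00 = 0
    for u, v in combinations(nodes, 2):
        in_p, in_q = together(p, u, v), together(q, u, v)
        if in_p and in_q:
            n11 += 1
        elif in_p:
            n10 += 1
        elif in_q:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def rand_index(p, q):
    """Rand index as an exact fraction of agreeing node pairs."""
    n11, n10, n01, n00 = pair_counts(p, q)
    return Fraction(n11 + n00, n11 + n10 + n01 + n00)


P1 = [{1, 2}, {3, 4}]
P2 = [{1, 4}, {2, 3}]
Q1 = [{1, 3}, {2, 4}]

print(rand_index(P1, P1))  # identical partitions
print(rand_index(P1, P2))  # structurally equivalent partitions
print(rand_index(P1, Q1))  # structurally different partitions
```

The last two comparisons both evaluate to $1/3$, which is exactly the ambiguity described in the two points above.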

## 2. Graphs, Permutation Groups and Graph Automorphisms

- Closure: $\forall g,h\in H:g\circ h\in H$
- Unit element: The identity function $id\in H$ acts as the neutral element: $\forall g\in H:id\circ g=g\circ id=g$
- Inverse element: For any g in H, the inverse permutation function ${g}^{-1}\in H$ is the inverse of g: $\forall g\in H:g\circ {g}^{-1}={g}^{-1}\circ g=id$
- Associativity: The associative law holds: $\forall f,g,h\in H:f\circ \left(g\circ h\right)=\left(f\circ g\right)\circ h$

- ${u}^{id}=u,\phantom{\rule{0.166667em}{0ex}}\forall u\in V$
- ${\left({u}^{g}\right)}^{h}={u}^{gh},\phantom{\rule{0.166667em}{0ex}}\forall u\in V,\phantom{\rule{0.166667em}{0ex}}\forall g,h\in H$
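
The group axioms and the two action properties can be verified by brute force on a small graph. The sketch below does this for the automorphism group of the cycle graph $C_4$; the tuple encoding of permutations and the helper names (`act`, `compose`, `inverse`) are assumptions made for illustration, and `compose(g, h)` applies `g` first so that it matches the exponent notation ${\left({u}^{g}\right)}^{h}={u}^{gh}$.

```python
from itertools import permutations

NODES = (1, 2, 3, 4)
EDGE_LIST = [(1, 2), (2, 3), (3, 4), (4, 1)]  # cycle graph C4
EDGES = {frozenset(e) for e in EDGE_LIST}


def act(u, g):
    """Image u^g of node u; g is a tuple with g[u - 1] = u^g."""
    return g[u - 1]


def compose(g, h):
    """Permutation gh that applies g first and then h."""
    return tuple(act(act(u, g), h) for u in NODES)


def inverse(g):
    """Inverse permutation: maps each image back to its preimage."""
    inv = [0] * len(NODES)
    for u in NODES:
        inv[act(u, g) - 1] = u
    return tuple(inv)


def is_automorphism(g):
    """True if g maps the edge set onto itself."""
    return {frozenset((act(u, g), act(v, g))) for u, v in EDGE_LIST} == EDGES


IDENTITY = NODES
AUT = [g for g in permutations(NODES) if is_automorphism(g)]

closure = all(compose(g, h) in AUT for g in AUT for h in AUT)
has_identity = IDENTITY in AUT and all(
    compose(IDENTITY, g) == g == compose(g, IDENTITY) for g in AUT
)
has_inverses = all(
    inverse(g) in AUT and compose(g, inverse(g)) == IDENTITY for g in AUT
)
associative = all(
    compose(f, compose(g, h)) == compose(compose(f, g), h)
    for f in AUT for g in AUT for h in AUT
)
is_action = all(
    act(u, IDENTITY) == u and act(act(u, g), h) == act(u, compose(g, h))
    for u in NODES for g in AUT for h in AUT
)
print(len(AUT), closure, has_identity, has_inverses, associative, is_action)
```

Enumerating all $4!$ permutations is only feasible for tiny graphs; it is used here because it keeps the check independent of any graph library.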

**Theorem 1.**

**Definition 1.**

**Example 1.**

**Definition 2.**

**Example 2.**

## 3. Graph Partition Comparison Measures Are Not Invariant

**Definition 3.**

#### 3.1. Variant 1: Construction of a Counterexample

**Theorem 2.**

**Proof.**

#### 3.2. Variant 2: Inconsistency of the Identity and the Invariance Axiom

**Theorem 3.**

**Proof.**

- Since $Aut\left(G\right)$ is nontrivial, a nontrivial orbit with at least two different partitions, namely $\mathcal{P}$ and $\mathcal{Q}$, exists because $|{\mathcal{P}}^{Aut\left(G\right)}|>1$. It follows from the invariance axiom that $m(\mathcal{P},\mathcal{Q})=c$.
- By the identity axiom, $m(\mathcal{P},\mathcal{Q})=c$ implies $\mathcal{P}=\mathcal{Q}$.
- This contradicts the assumption that $\mathcal{P}$ and $\mathcal{Q}$ are different.
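
The nontrivial orbit used in this argument can be made concrete. The following sketch computes the orbit ${\mathcal{P}}^{Aut\left(G\right)}$ of two partitions of the cycle graph $C_4$ by applying every automorphism; the function names (`automorphisms`, `image`, `orbit`, `make`) are hypothetical helpers introduced only for this illustration.

```python
from itertools import permutations

NODES = (1, 2, 3, 4)
EDGES = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}  # C4


def automorphisms():
    """All node permutations (as dicts) that preserve the edge set."""
    group = []
    for images in permutations(NODES):
        g = dict(zip(NODES, images))
        if {frozenset(g[u] for u in e) for e in EDGES} == EDGES:
            group.append(g)
    return group


def image(partition, g):
    """Partition P^g obtained by relabeling every node u as u^g."""
    return frozenset(frozenset(g[u] for u in cluster) for cluster in partition)


def orbit(partition, group):
    """Orbit of the partition: the set of all its images under the group."""
    return {image(partition, g) for g in group}


def make(*clusters):
    """Build a hashable partition from plain sets."""
    return frozenset(frozenset(c) for c in clusters)


AUT = automorphisms()
P = make({1, 2}, {3, 4})
Q = make({1, 3}, {2, 4})

print(len(AUT))            # order of Aut(C4)
print(len(orbit(P, AUT)))  # nontrivial orbit: two distinct partitions
print(len(orbit(Q, AUT)))  # trivial orbit: a single partition
```

The orbit of `P` contains two different partitions, so any measure that is constant on orbits assigns them the value $c$ although they are not equal.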

## 4. The Construction of Invariant Measures for Finite Permutation Groups

- We construct a pseudometric space from the images of the actions of $Aut\left(G\right)$ on partitions in $P\left(V\right)$ (Definition 1).
- We extend the metrics for partition comparison by constructing invariant metrics on the pseudometric space of partitions.

#### 4.1. The Construction of the Pseudometric Space of Equivalence Classes of Graph Partitions

- Symmetry: $d(s,t)=d(t,s)$.
- Identity: $d(s,t)=0$ if and only if $s=t$.
- Triangle inequality: $d(s,u)\le d(s,t)+d(t,u)$.

- $(S,d)$ is a metric space with $S=P\left(V\right)$ and with the function $d:P\left(V\right)\times P\left(V\right)\to \mathbb{R}$.
- $({S}^{\ast},{d}^{\ast})$ is a metric space with ${S}^{\ast}=P{\left(V\right)}^{Aut\left(G\right)}=\{{\mathcal{P}}^{Aut\left(G\right)}\mid \mathcal{P}\in P\left(V\right)\}$ and the function ${d}^{\ast}$: $P{\left(V\right)}^{Aut\left(G\right)}\times P{\left(V\right)}^{Aut\left(G\right)}\to \mathbb{R}$. We construct three variants of ${d}^{\ast}$ in Section 4.2.
- $(S,{d}^{\ast})$ is the pseudometric space with $S=P\left(V\right)$ and with the metric ${d}^{\ast}$. The partitions in S are mapped to arguments of ${d}^{\ast}$ by the transformation $ec:P\left(V\right)\to P{\left(V\right)}^{Aut\left(G\right)}$, which is defined as $ec\left(\mathcal{P}\right)={\mathcal{P}}^{Aut\left(G\right)}$.

#### 4.2. The Construction of Left-Invariant and Additive Measures on the Pseudometric Space of Equivalence Classes of Graph Partitions

**Theorem 4.**

**Proof.**

- For ${d}_{L}^{\ast}$, we have:$$\begin{array}{ccc}\hfill \underset{\tilde{\mathcal{Q}}\in {\mathcal{Q}}^{Aut\left(G\right)}}{\min}d(\mathcal{P},\tilde{\mathcal{Q}})& =\underset{g\in Aut\left(G\right)}{\min}d(\mathcal{P},{\mathcal{Q}}^{g})\hfill & \\ & =\underset{{g}^{-1}\in Aut\left(G\right)}{\min}d({\mathcal{P}}^{{g}^{-1}},\mathcal{Q})\hfill & =\underset{\tilde{\mathcal{P}}\in {\mathcal{P}}^{Aut\left(G\right)}}{\min}d(\tilde{\mathcal{P}},\mathcal{Q})\hfill \end{array}$$
  $$\begin{array}{ccc}\hfill \underset{\begin{array}{c}\tilde{\mathcal{P}}\in {\mathcal{P}}^{Aut\left(G\right)},\\ \tilde{\mathcal{Q}}\in {\mathcal{Q}}^{Aut\left(G\right)}\end{array}}{\min}d(\tilde{\mathcal{P}},\tilde{\mathcal{Q}})& =\underset{g,h\in Aut\left(G\right)}{\min}d({\mathcal{P}}^{h},{\mathcal{Q}}^{g})\hfill & =\underset{g,h\in Aut\left(G\right)}{\min}d(\mathcal{P},{\mathcal{Q}}^{g{h}^{-1}})\hfill \\ & =\underset{f\in Aut\left(G\right)}{\min}d(\mathcal{P},{\mathcal{Q}}^{f})\hfill & =\underset{\tilde{\mathcal{Q}}\in {\mathcal{Q}}^{Aut\left(G\right)}}{\min}d(\mathcal{P},\tilde{\mathcal{Q}})\hfill \end{array}$$
- For the proof of ${d}_{U}^{\ast}$ for ${\mathcal{P}}^{Aut\left(G\right)}\ne {\mathcal{Q}}^{Aut\left(G\right)}$ we substitute max for min in the proof of ${d}_{L}^{\ast}$.
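
The equalities in this proof can be checked exhaustively on a small example. The sketch below enumerates all partitions of the node set of $C_4$ and confirms that minimizing over one orbit, over the other orbit, or over both orbits yields the same value. It assumes $d=1-RI$ as the base metric (any metric that is unchanged when both partitions are relabeled by the same automorphism would serve), and all function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations

NODES = (1, 2, 3, 4)
EDGES = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}  # C4
PAIRS = list(combinations(NODES, 2))


def automorphisms():
    """All node permutations (as dicts) that preserve the edge set."""
    group = []
    for images in permutations(NODES):
        g = dict(zip(NODES, images))
        if {frozenset(g[u] for u in e) for e in EDGES} == EDGES:
            group.append(g)
    return group


def orbit(partition, group):
    """All images of the partition under the group."""
    return {
        frozenset(frozenset(g[u] for u in c) for c in partition) for g in group
    }


def all_partitions(items):
    """Yield every set partition of items as a frozenset of frozensets."""
    items = list(items)
    if not items:
        yield frozenset()
        return
    first, rest = items[0], items[1:]
    for smaller in all_partitions(rest):
        yield smaller | {frozenset([first])}      # first forms its own cluster
        for c in smaller:                         # or joins an existing cluster
            yield (smaller - {c}) | {c | {first}}


def d(p, q):
    """Assumed base metric: 1 - Rand index (share of disagreeing node pairs)."""
    def together(partition, u, v):
        return any(u in c and v in c for c in partition)

    disagree = sum(together(p, u, v) != together(q, u, v) for u, v in PAIRS)
    return Fraction(disagree, len(PAIRS))


AUT = automorphisms()
PARTITIONS = list(all_partitions(NODES))

theorem_holds = True
for p in PARTITIONS:
    for q in PARTITIONS:
        orbit_p, orbit_q = orbit(p, AUT), orbit(q, AUT)
        right_only = min(d(p, qt) for qt in orbit_q)
        left_only = min(d(pt, q) for pt in orbit_p)
        both = min(d(pt, qt) for pt in orbit_p for qt in orbit_q)
        theorem_holds &= right_only == left_only == both
print(len(PARTITIONS), theorem_holds)
```

Replacing `min` by `max` in the loop gives the corresponding check for ${d}_{U}^{\ast}$.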

**Theorem 5.**

**Proof.**

**Theorem 6.**

- Identity: ${d}_{L}^{\ast}(\mathcal{P},\mathcal{Q})=0$, if ${\mathcal{P}}^{Aut\left(G\right)}={\mathcal{Q}}^{Aut\left(G\right)}$.
- Invariance: ${d}_{L}^{\ast}(\mathcal{P},\mathcal{Q})={d}_{L}^{\ast}(\tilde{\mathcal{P}},\tilde{\mathcal{Q}})$, for all $\mathcal{P},\mathcal{Q}\in P\left(V\right)$ and $\tilde{\mathcal{P}}\in {\mathcal{P}}^{Aut\left(G\right)}$, $\tilde{\mathcal{Q}}\in {\mathcal{Q}}^{Aut\left(G\right)}$.
- Symmetry: ${d}_{L}^{\ast}(\mathcal{P},\mathcal{Q})={d}_{L}^{\ast}(\mathcal{Q},\mathcal{P})$.
- Triangle inequality: ${d}_{L}^{\ast}(\mathcal{P},\mathcal{R})\le {d}_{L}^{\ast}(\mathcal{P},\mathcal{Q})+{d}_{L}^{\ast}(\mathcal{Q},\mathcal{R})$

**Proof.**

- Identity holds because of the definition of the distance ${d}^{\ast}$ between two elements in an equivalence class of the pseudometric space $(S,{d}^{\ast})$.
- Invariance of ${d}_{L}^{\ast}(\mathcal{P},\mathcal{Q})$, ${d}_{U}^{\ast}(\mathcal{P},\mathcal{Q})$ and ${d}_{av}^{\ast}(\mathcal{P},\mathcal{Q})$ is proven by Theorems 4 and 5.
- Symmetry holds, because d is symmetric, and min, max and the average do not depend on the order of their respective arguments.
- To prove the triangle inequality, we make use of Theorems 4 and 5 and of the fact that d is a metric for which the triangle inequality holds:
- (a) For ${d}_{L}^{\ast}$, it follows that:$$\begin{array}{cc}\hfill {d}_{L}^{\ast}(\mathcal{P},\mathcal{R})& =\underset{\begin{array}{c}\tilde{\mathcal{P}}\in {\mathcal{P}}^{Aut\left(G\right)},\\ \tilde{\mathcal{R}}\in {\mathcal{R}}^{Aut\left(G\right)}\end{array}}{\min}d(\tilde{\mathcal{P}},\tilde{\mathcal{R}})\hfill \\ & \le \underset{\begin{array}{c}\tilde{\mathcal{P}}\in {\mathcal{P}}^{Aut\left(G\right)},\\ \tilde{\mathcal{Q}}\in {\mathcal{Q}}^{Aut\left(G\right)},\\ \tilde{\mathcal{R}}\in {\mathcal{R}}^{Aut\left(G\right)}\end{array}}{\min}\left(d(\tilde{\mathcal{P}},\tilde{\mathcal{Q}})+d(\tilde{\mathcal{Q}},\tilde{\mathcal{R}})\right)\hfill \\ & =\underset{\tilde{\mathcal{P}}\in {\mathcal{P}}^{Aut\left(G\right)}}{\min}d(\tilde{\mathcal{P}},\mathcal{Q})+\underset{\tilde{\mathcal{R}}\in {\mathcal{R}}^{Aut\left(G\right)}}{\min}d(\mathcal{Q},\tilde{\mathcal{R}})\hfill \\ & ={d}_{L}^{\ast}(\mathcal{P},\mathcal{Q})+{d}_{L}^{\ast}(\mathcal{Q},\mathcal{R})\hfill \end{array}$$
- (b) For the proof of the triangle inequality for ${d}_{U}^{\ast}$, we substitute max for min and ${d}_{U}^{\ast}$ for ${d}_{L}^{\ast}$ in the proof of the triangle inequality for ${d}_{L}^{\ast}$.
- (c) For ${d}_{av}^{\ast}$, it follows that:$$\begin{array}{cc}\hfill {d}_{av}^{\ast}(\mathcal{P},\mathcal{R})& =\frac{1}{\left|{\mathcal{P}}^{Aut\left(G\right)}\right|\cdot \left|{\mathcal{R}}^{Aut\left(G\right)}\right|}\sum _{\tilde{\mathcal{P}}\in {\mathcal{P}}^{Aut\left(G\right)}}\sum _{\tilde{\mathcal{R}}\in {\mathcal{R}}^{Aut\left(G\right)}}d(\tilde{\mathcal{P}},\tilde{\mathcal{R}})\hfill \\ & \le \frac{1}{\left|{\mathcal{P}}^{Aut\left(G\right)}\right|\cdot \left|{\mathcal{R}}^{Aut\left(G\right)}\right|}\sum _{\tilde{\mathcal{P}}}\sum _{\tilde{\mathcal{R}}}\left[d(\tilde{\mathcal{P}},\mathcal{Q})+d(\mathcal{Q},\tilde{\mathcal{R}})\right]\hfill \end{array}$$
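
The four properties of Theorem 6 can also be tested numerically. The sketch below is one possible reading of the three measures, under stated assumptions: the base metric is $d=1-RI$, ${d}_{L}^{\ast}$ is the minimum of $d$ over both orbits, and ${d}_{U}^{\ast}$ and ${d}_{av}^{\ast}$ are the maximum and the average over both orbits when the orbits differ and zero when they coincide. The graph is again $C_4$, and every function name is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations, permutations

NODES = (1, 2, 3, 4)
EDGES = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}  # C4
PAIRS = list(combinations(NODES, 2))

# Automorphisms as dicts: permutations that map the edge set onto itself.
AUT = [
    g for g in (dict(zip(NODES, im)) for im in permutations(NODES))
    if {frozenset(g[u] for u in e) for e in EDGES} == EDGES
]


@lru_cache(maxsize=None)
def orbit(partition):
    """Equivalence class of a partition under Aut(G)."""
    return frozenset(
        frozenset(frozenset(g[u] for u in c) for c in partition) for g in AUT
    )


def all_partitions(items):
    """Yield every set partition of items as a frozenset of frozensets."""
    items = list(items)
    if not items:
        yield frozenset()
        return
    first, rest = items[0], items[1:]
    for smaller in all_partitions(rest):
        yield smaller | {frozenset([first])}
        for c in smaller:
            yield (smaller - {c}) | {c | {first}}


@lru_cache(maxsize=None)
def d(p, q):
    """Assumed base metric: 1 - Rand index."""
    def together(partition, u, v):
        return any(u in c and v in c for c in partition)

    disagree = sum(together(p, u, v) != together(q, u, v) for u, v in PAIRS)
    return Fraction(disagree, len(PAIRS))


def orbit_distances(p, q):
    return [d(pt, qt) for pt in orbit(p) for qt in orbit(q)]


@lru_cache(maxsize=None)
def d_lower(p, q):
    return min(orbit_distances(p, q))


@lru_cache(maxsize=None)
def d_upper(p, q):
    return 0 if orbit(p) == orbit(q) else max(orbit_distances(p, q))


@lru_cache(maxsize=None)
def d_average(p, q):
    if orbit(p) == orbit(q):
        return 0
    values = orbit_distances(p, q)
    return sum(values) / len(values)


PARTITIONS = list(all_partitions(NODES))


def satisfies_theorem_6(m):
    """Check identity, invariance, symmetry and the triangle inequality."""
    for p in PARTITIONS:
        for q in PARTITIONS:
            if orbit(p) == orbit(q) and m(p, q) != 0:
                return False
            if any(m(p, q) != m(pt, qt) for pt in orbit(p) for qt in orbit(q)):
                return False
            if m(p, q) != m(q, p):
                return False
            if any(m(p, r) > m(p, q) + m(q, r) for r in PARTITIONS):
                return False
    return True


print([satisfies_theorem_6(m) for m in (d_lower, d_upper, d_average)])
```

An exhaustive check on one graph is evidence rather than proof, but it is a quick way to catch a definition that was transcribed incorrectly.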

## 5. Decomposition of Partition Comparison Measures

- In Case 1, we compare two partitions from nontrivial equivalence classes: the difference of $0.4$ between ${d}_{U}^{\ast}$ and ${d}_{L}^{\ast}$ indicates that the potential maximal automorphism effect is larger than the lower measure. In addition, it is also smaller (by $0.2$) than the automorphism effect in each of the equivalence classes. That ${d}_{Aut\left(G\right)}$ is zero for the lower measure implies that the pair $(\mathcal{P},\mathcal{Q})$ is a pair with the minimal distance between the equivalence classes. The fact that ${d}_{av}^{\ast}=0.5$ is the mid-point between the lower and upper measures indicates a symmetric distribution of the distances between the equivalence classes.
- That ${d}_{Aut\left(G\right)}$ is zero for the upper measure in Case 2 means that we have found a pair with the maximal distance between the equivalence classes.
- In Case 3, we have also found a pair with maximal distance between the equivalence classes. However, the maximal potential automorphism effect is smaller than for Cases 1 and 2. In addition, the distribution of distances between the equivalence classes is asymmetric.
- Case 4 shows the comparison of a partition from a trivial equivalence class with a partition from a nontrivial equivalence class. Note that in this case all three invariant measures, as well as ${d}_{RI}$, coincide and that no automorphism effect exists.
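
As a small worked illustration, the decomposition can be evaluated by hand for the $C_4$ partitions of Table 1. Two assumptions are made here because the defining equations are not reproduced in this section: ${d}_{RI}=1-RI$, and the decomposition for the lower measure has the additive form ${d}_{RI}={d}_{L}^{\ast}+{d}_{Aut\left(G\right)}$.

$$\begin{aligned}
d_{RI}(\mathcal{P}_{1},\mathcal{P}_{2}) &= 1-\tfrac{1}{3}=\tfrac{2}{3}, &
d_{L}^{\ast}(\mathcal{P}_{1},\mathcal{P}_{2}) &= 0, &
d_{Aut(G)} &= \tfrac{2}{3}-0=\tfrac{2}{3},\\
d_{RI}(\mathcal{P}_{1},\mathcal{Q}_{1}) &= 1-\tfrac{1}{3}=\tfrac{2}{3}, &
d_{L}^{\ast}(\mathcal{P}_{1},\mathcal{Q}_{1}) &= \min\left(\tfrac{2}{3},\tfrac{2}{3}\right)=\tfrac{2}{3}, &
d_{Aut(G)} &= \tfrac{2}{3}-\tfrac{2}{3}=0.
\end{aligned}$$

Under these assumptions, the distance between ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$ is entirely an automorphism effect, whereas the equally large distance between ${\mathcal{P}}_{1}$ and ${\mathcal{Q}}_{1}$ is entirely structural.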

## 6. Invariant Measures for the Karate Graph

`tech-internet-as` are from Rossi and Ahmed [32]): for this graph, several locally optimal solutions with a modularity value above $0.694$ exist, all of which are unstable. Further analysis of the structural properties of the solution landscape of this graph is work in progress.

## 7. Discussion, Conclusions and Outlook

- A formal definition of partition stability, namely $\mathcal{P}$ is stable iff $|{\mathcal{P}}^{Aut\left(G\right)}|=1$.
- A proof of the non-invariance of all partition comparison measures if the automorphism group is nontrivial ($\left|Aut\left(G\right)\right|>1$).
- The construction of a pseudometric space of equivalence classes of graph partitions for three classes of invariant measures concerning finite permutation groups of graph automorphisms.
- The proof that the measures are invariant and that for these measures (after the transformation to a distance), the axioms of a metric space hold.
- The space of partitions is equipped with a metric (the original partition comparison measure) and a pseudometric (the invariant partition comparison measure).
- The decomposition of the value of a partition comparison measure into a structural part and a remainder that measures the effect of group actions.

## Supplementary Materials

The R package `partitionComparison` by the authors of this article, which implements the different partition comparison measures, is available at https://cran.r-project.org/package=partitionComparison.

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A. Modularity

## Appendix B. Measures for Comparing Partitions

- Pair-counting measures.
- Set-based comparison measures.
- Information theory based measures.

#### Appendix B.1. Pair-Counting Measures

**Table A1.** The pair counting measures used in Table 3 [42]. The measures listed here are similarity measures. Distance measures and non-normalized measures are listed in Table A2. For brevity: ${N}_{21}={N}_{11}+{N}_{10}$, ${N}_{12}={N}_{11}+{N}_{01}$, ${N}_{01}^{\prime}={N}_{00}+{N}_{01}$ and ${N}_{10}^{\prime}={N}_{00}+{N}_{10}$. Abbr., Abbreviation.

Abbr. | Measure | Formula | $\mathcal{P}=\mathcal{P}$ |
---|---|---|---|
RI | Rand [43] | $\frac{{N}_{11}+{N}_{00}}{\binom{n}{2}}$ | $1.0$ |
ARI | Hubert and Arabie [44] | $\frac{2({N}_{00}{N}_{11}-{N}_{10}{N}_{01})}{{N}_{01}^{\prime}{N}_{12}+{N}_{10}^{\prime}{N}_{21}}$ | $1.0$ |
H | Hamann [45] | $\frac{({N}_{11}+{N}_{00})-({N}_{10}+{N}_{01})}{\binom{n}{2}}$ | $1.0$ |
CZ | Czekanowski [46] | $\frac{2{N}_{11}}{2{N}_{11}+{N}_{10}+{N}_{01}}$ | $1.0$ |
K | Kulczynski [47] | $\frac{1}{2}\left(\frac{{N}_{11}}{{N}_{21}}+\frac{{N}_{11}}{{N}_{12}}\right)$ | $1.0$ |
MC | McConnaughey [48] | $\frac{{N}_{11}^{2}-{N}_{10}{N}_{01}}{{N}_{21}{N}_{12}}$ | $1.0$ |
P | Peirce [49] | $\frac{{N}_{11}{N}_{00}-{N}_{10}{N}_{01}}{{N}_{21}{N}_{01}^{\prime}}$ | $1.0$ |
${W}_{I}$ | Wallace [50] | $\frac{{N}_{11}}{{N}_{21}}$ | $1.0$ |
${W}_{II}$ | Wallace [50] | $\frac{{N}_{11}}{{N}_{12}}$ | $1.0$ |
FM | Fowlkes and Mallows [51] | $\sqrt{\frac{{N}_{11}}{{N}_{21}}\frac{{N}_{11}}{{N}_{12}}}$ | $1.0$ |
$\Gamma $ | Yule [52] | $\frac{{N}_{11}{N}_{00}-{N}_{10}{N}_{01}}{\sqrt{{N}_{21}{N}_{12}{N}_{10}^{\prime}{N}_{01}^{\prime}}}$ | $1.0$ |
SS1 | Sokal and Sneath [53] | $\frac{1}{4}\left(\frac{{N}_{11}}{{N}_{21}}+\frac{{N}_{11}}{{N}_{12}}+\frac{{N}_{00}}{{N}_{10}^{\prime}}+\frac{{N}_{00}}{{N}_{01}^{\prime}}\right)$ | $1.0$ |
B1 | Baulieu [54] | $\frac{{\binom{n}{2}}^{2}-\binom{n}{2}({N}_{10}+{N}_{01})+{({N}_{10}-{N}_{01})}^{2}}{{\binom{n}{2}}^{2}}$ | $1.0$ |
GL | Gower and Legendre [55] | $\frac{{N}_{11}+{N}_{00}}{{N}_{11}+\frac{1}{2}\left({N}_{10}+{N}_{01}\right)+{N}_{00}}$ | $1.0$ |
SS2 | Sokal and Sneath [53] | $\frac{{N}_{11}}{{N}_{11}+2({N}_{10}+{N}_{01})}$ | $1.0$ |
SS3 | Sokal and Sneath [53] | $\frac{{N}_{11}{N}_{00}}{\sqrt{{N}_{21}{N}_{12}{N}_{01}^{\prime}{N}_{10}^{\prime}}}$ | $1.0$ |
RT | Rogers and Tanimoto [56] | $\frac{{N}_{11}+{N}_{00}}{{N}_{11}+2({N}_{10}+{N}_{01})+{N}_{00}}$ | $1.0$ |
GK | Goodman and Kruskal [57] | $\frac{{N}_{11}{N}_{00}-{N}_{10}{N}_{01}}{{N}_{11}{N}_{00}+{N}_{10}{N}_{01}}$ | $1.0$ |
J | Jaccard [3] | $\frac{{N}_{11}}{{N}_{11}+{N}_{10}+{N}_{01}}$ | $1.0$ |
RV | Robert and Escoufier [58] | $\left({N}_{11}-\frac{1}{q}{N}_{21}-\frac{1}{p}{N}_{12}+\binom{n}{2}\frac{1}{pq}\right)\left[\left(\frac{p-2}{p}{N}_{21}+\binom{n}{2}\frac{1}{{p}^{2}}\right)\cdots \right.$ | $1.0$ |
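
To illustrate how the entries of Table A1 are evaluated, the following sketch computes the four pair counts and a handful of the listed measures (RI, ARI, CZ, J and FM) exactly as written in the table. The function names and the dictionary layout are illustrative assumptions; the example partitions are the $C_4$ partitions from the Introduction. Some formulas are undefined for degenerate inputs (for instance, FM when a partition has no co-clustered pair), which this sketch does not handle.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt


def pair_counts(p, q):
    """Count node pairs by co-membership: returns (N11, N10, N01, N00)."""
    nodes = sorted(set().union(*p))

    def together(partition, u, v):
        return any(u in c and v in c for c in partition)

    counts = {(True, True): 0, (True, False): 0, (False, True): 0, (False, False): 0}
    for u, v in combinations(nodes, 2):
        counts[(together(p, u, v), together(q, u, v))] += 1
    return (counts[(True, True)], counts[(True, False)],
            counts[(False, True)], counts[(False, False)])


def measures(p, q):
    """Evaluate a few similarity measures from Table A1."""
    n11, n10, n01, n00 = pair_counts(p, q)
    n21, n12 = n11 + n10, n11 + n01        # abbreviations from the caption
    n01p, n10p = n00 + n01, n00 + n10      # the primed abbreviations
    return {
        "RI": Fraction(n11 + n00, n11 + n10 + n01 + n00),
        "ARI": Fraction(2 * (n00 * n11 - n10 * n01), n01p * n12 + n10p * n21),
        "CZ": Fraction(2 * n11, 2 * n11 + n10 + n01),
        "J": Fraction(n11, n11 + n10 + n01),
        "FM": sqrt(Fraction(n11, n21) * Fraction(n11, n12)),
    }


P1 = [{1, 2}, {3, 4}]
P2 = [{1, 4}, {2, 3}]

same = measures(P1, P1)       # every value equals 1, as in the last column
different = measures(P1, P2)  # structurally equivalent, yet far from 1
print(same)
print(different)
```

The first call reproduces the last column of the table for these five measures; the second shows that none of them recognizes ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$ as equivalent.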

**Table A2.** Pair counting measures that are not similarity measures. For brevity: ${N}_{21}={N}_{11}+{N}_{10}$, ${N}_{12}={N}_{11}+{N}_{01}$, ${N}_{01}^{\prime}={N}_{00}+{N}_{01}$ and ${N}_{10}^{\prime}={N}_{00}+{N}_{10}$.

Abbr. | Measure | Formula | $\mathcal{P}=\mathcal{P}$ |
---|---|---|---|
RR | Russel and Rao [59] | $\frac{{N}_{11}}{\binom{n}{2}}$ | max |
M | Mirkin and Chernyi [60] | $2({N}_{01}+{N}_{10})$ | $0.0$ |
Mi | Hilbert [61] | $\sqrt{\frac{{N}_{10}+{N}_{01}}{{N}_{11}+{N}_{10}}}$ | $0.0$ |
Pe | Pearson [62] | $\frac{{N}_{11}{N}_{00}-{N}_{10}{N}_{01}}{{N}_{21}{N}_{12}{N}_{01}^{\prime}{N}_{10}^{\prime}}$ | max |
B2 | Baulieu [54] | $\frac{{N}_{11}{N}_{00}-{N}_{10}{N}_{01}}{{\binom{n}{2}}^{2}}$ | max |
LI | Lerman [63] | $\frac{{N}_{11}-E\left({N}_{11}\right)}{\sqrt{{\sigma}^{2}\left({N}_{11}\right)}}$ | max |
NLI | Lerman [63] (normalized) | $\frac{\mathrm{LI}({P}_{1},{P}_{2})}{\mathrm{LI}({P}_{1},{P}_{1})\mathrm{LI}({P}_{2},{P}_{2})}$ | $1.0$ |
FMG | Fager and McGowan [64] | $\frac{{N}_{11}}{\sqrt{{N}_{21}{N}_{12}}}-\frac{1}{2\sqrt{{N}_{21}}}$ | max |

#### Appendix B.2. Set-Based Comparison Measures

**Table A3.** References and formulas for the three set-based comparison measures used in Table 3. $\sigma $ is the result of a maximum weighted matching of a bipartite graph. The bipartite graph is constructed from the partitions that shall be compared: the two node sets are derived from the two partitions, and each cluster is represented by a node. By definition, the two node sets are disjoint. The node sets are connected by edges of weight ${w}_{ij}=\left|\{{C}_{i}\cap {C}_{j}^{\prime}\mid {C}_{i}\in \mathcal{P},{C}_{j}^{\prime}\in \mathcal{Q}\}\right|$. As in our context $\left|\mathcal{P}\right|=\left|\mathcal{Q}\right|$, the found $\sigma $ is assured to be a perfect (bijective) matching. n is the number of nodes $\left|V\right|$.

Abbr. | Measure | Formula | $\mathcal{P}=\mathcal{P}$ |
---|---|---|---|
LA | Larsen and Aone [65] | $\frac{1}{\lvert \mathcal{P}\rvert }{\sum}_{C\in \mathcal{P}}{\max}_{{C}^{\prime}\in \mathcal{Q}}\frac{2\lvert C\cap {C}^{\prime}\rvert }{\lvert C\rvert +\lvert {C}^{\prime}\rvert }$ | $1.0$ |
${d}_{CE}$ | Meilǎ and Heckerman [66] | $1-\frac{1}{n}{\max}_{\sigma}{\sum}_{C\in \mathcal{P}}\lvert C\cap \sigma (C)\rvert $ | $0.0$ |
D | van Dongen [67] | $2n-{\sum}_{C\in \mathcal{P}}{\max}_{{C}^{\prime}\in \mathcal{Q}}\lvert C\cap {C}^{\prime}\rvert -{\sum}_{{C}^{\prime}\in \mathcal{Q}}{\max}_{C\in \mathcal{P}}\lvert C\cap {C}^{\prime}\rvert $ | $0.0$ |
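
The three set-based measures of Table A3 can be sketched directly from their formulas. In the code below, the maximum weighted matching $\sigma$ is replaced by a brute-force search over all bijections between the clusters, which is an illustrative simplification that is only practical for a handful of clusters and that relies on $\left|\mathcal{P}\right|=\left|\mathcal{Q}\right|$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def larsen_aone(p, q):
    """LA: average over clusters of P of the best matching score in Q."""
    return sum(
        max(Fraction(2 * len(c & c2), len(c) + len(c2)) for c2 in q) for c in p
    ) / len(p)


def classification_error(p, q):
    """d_CE with sigma found by trying every bijection (needs |P| = |Q|)."""
    n = sum(len(c) for c in p)
    best = max(
        sum(len(c & c2) for c, c2 in zip(p, matching))
        for matching in permutations(q)
    )
    return 1 - Fraction(best, n)


def van_dongen(p, q):
    """D: 2n minus the best overlaps seen from P and seen from Q."""
    n = sum(len(c) for c in p)
    return (
        2 * n
        - sum(max(len(c & c2) for c2 in q) for c in p)
        - sum(max(len(c & c2) for c in p) for c2 in q)
    )


P1 = [{1, 2}, {3, 4}]
P2 = [{1, 4}, {2, 3}]

print(larsen_aone(P1, P1), classification_error(P1, P1), van_dongen(P1, P1))
print(larsen_aone(P1, P2), classification_error(P1, P2), van_dongen(P1, P2))
```

For larger partitions, the brute-force search would have to be replaced by a proper assignment algorithm.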

#### Appendix B.3. Information Theory-Based Measures

**Table A4.** Information theory-based measures used in Table 3. All measures are based on Shannon’s definition of entropy. Again, $n=\left|V\right|$.

Abbr. | Measure | Formula | $\mathcal{P}=\mathcal{P}$ |
---|---|---|---|
MI | e.g., Vinh et al. [68] | ${\sum}_{C\in \mathcal{P}}{\sum}_{{C}^{\prime}\in \mathcal{Q}}\frac{\lvert C\cap {C}^{\prime}\rvert }{n}\log \frac{n\lvert C\cap {C}^{\prime}\rvert }{\lvert C\rvert \lvert {C}^{\prime}\rvert }$ | max |
${\mathrm{NMI}}_{\phi}$ | Danon et al. [70] | $\frac{\mathrm{MI}}{\phi \left(H(\mathcal{P}),H(\mathcal{Q})\right)},\phantom{\rule{0.166667em}{0ex}}\phi \in \{\min ,\max \}$ | $1.0$ |
${\mathrm{NMI}}_{\mathsf{\Sigma}}$ | Danon et al. [70] | $\frac{2\cdot \mathrm{MI}}{H\left(\mathcal{P}\right)+H\left(\mathcal{Q}\right)}$ | $1.0$ |
VI | Meilǎ [69] | $H\left(\mathcal{P}\right)+H\left(\mathcal{Q}\right)-2\,\mathrm{MI}$ | $0.0$ |
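
The measures of Table A4 can be sketched as follows. The table does not spell out the entropy, so the code assumes the usual Shannon entropy of the cluster size distribution, $H(\mathcal{P})=-\sum_{C\in \mathcal{P}}\frac{|C|}{n}\log \frac{|C|}{n}$, and the natural logarithm; both are assumptions, as are the function names. The normalized variants are undefined when an entropy in the denominator is zero (a single-cluster partition), which this sketch does not handle.

```python
from math import isclose, log


def entropy(p, n):
    """Shannon entropy of the cluster size distribution (natural log)."""
    return -sum(len(c) / n * log(len(c) / n) for c in p)


def mutual_information(p, q, n):
    """MI as in Table A4; empty intersections contribute nothing."""
    total = 0.0
    for c in p:
        for c2 in q:
            overlap = len(c & c2)
            if overlap:
                total += overlap / n * log(n * overlap / (len(c) * len(c2)))
    return total


def nmi(p, q, n, phi=max):
    """NMI normalized by phi(H(P), H(Q)), with phi either min or max."""
    return mutual_information(p, q, n) / phi(entropy(p, n), entropy(q, n))


def nmi_sum(p, q, n):
    """NMI normalized by the sum of the two entropies."""
    return 2 * mutual_information(p, q, n) / (entropy(p, n) + entropy(q, n))


def variation_of_information(p, q, n):
    return entropy(p, n) + entropy(q, n) - 2 * mutual_information(p, q, n)


P1 = [{1, 2}, {3, 4}]
P2 = [{1, 4}, {2, 3}]
n = 4

print(nmi(P1, P1, n), variation_of_information(P1, P1, n))
print(nmi(P1, P2, n), variation_of_information(P1, P2, n))
```

For the structurally equivalent pair ${\mathcal{P}}_{1}$, ${\mathcal{P}}_{2}$ the mutual information vanishes, so these measures also treat the two partitions as maximally different.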

#### Appendix B.4. Summary

## References

1. Melnykov, V.; Maitra, R. CARP: Software for fishing out good clustering algorithms. J. Mach. Learn. Res. **2011**, 12, 69–73.
2. Bader, D.A.; Meyerhenke, H.; Sanders, P.; Wagner, D. (Eds.) 10th DIMACS Implementation Challenge—Graph Partitioning and Graph Clustering; Rutgers University, DIMACS (Center for Discrete Mathematics and Theoretical Computer Science): Piscataway, NJ, USA, 2012.
3. Jaccard, P. Nouvelles recherches sur la distribution florale. Bull. Soc. Vaud. Sci. Nat. **1908**, 44, 223–270.
4. Horta, D.; Campello, R.J.G.B. Comparing hard and overlapping clusterings. J. Mach. Learn. Res. **2015**, 16, 2949–2997.
5. Romano, S.; Vinh, N.X.; Bailey, J.; Verspoor, K. Adjusting for chance clustering comparison measures. J. Mach. Learn. Res. **2016**, 17, 1–32.
6. Von Luxburg, U.; Williamson, R.C.; Guyon, I. Clustering: Science or art? JMLR Workshop Conf. Proc. **2011**, 27, 65–79.
7. Hennig, C. What are the true clusters? Pattern Recognit. Lett. **2015**, 64, 53–62.
8. Van Craenendonck, T.; Blockeel, H. Using Internal Validity Measures to Compare Clustering Algorithms; Benelearn 2015 Poster Presentations (Online); Benelearn: Delft, The Netherlands, 2015; pp. 1–8.
9. Filchenkov, A.; Muravyov, S.; Parfenov, V. Towards cluster validity index evaluation and selection. In Proceedings of the 2016 IEEE Artificial Intelligence and Natural Language Conference, St. Petersburg, Russia, 10–12 November 2016; pp. 1–8.
10. MacArthur, B.D.; Sánchez-García, R.J.; Anderson, J.W. Symmetry in complex networks. Discret. Appl. Math. **2008**, 156, 3525–3531.
11. Darga, P.T.; Sakallah, K.A.; Markov, I.L. Faster Symmetry Discovery Using Sparsity of Symmetries. In Proceedings of the 2008 45th ACM/IEEE Design Automation Conference, Anaheim, CA, USA, 8–13 June 2008; pp. 149–154.
12. Katebi, H.; Sakallah, K.A.; Markov, I.L. Graph Symmetry Detection and Canonical Labeling: Differences and Synergies. In Turing-100. The Alan Turing Centenary; EPiC Series in Computing; Voronkov, A., Ed.; EasyChair: Manchester, UK, 2012; Volume 10, pp. 181–195.
13. Ball, F.; Geyer-Schulz, A. How symmetric are real-world graphs? A large-scale study. Symmetry **2018**, 10, 29.
14. Newman, M.E.J.; Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E **2004**, 69, 026113.
15. Ovelgönne, M.; Geyer-Schulz, A. An Ensemble Learning Strategy for Graph Clustering. In Graph Partitioning and Graph Clustering; Bader, D.A., Meyerhenke, H., Sanders, P., Wagner, D., Eds.; American Mathematical Society: Providence, RI, USA, 2013; Volume 588, pp. 187–205.
16. Wielandt, H. Finite Permutation Groups; Academic Press: New York, NY, USA, 1964.
17. James, G.; Kerber, A. The Representation Theory of the Symmetric Group. In Encyclopedia of Mathematics and Its Applications; Addison-Wesley: Reading, MA, USA, 1981; Volume 16.
18. Coxeter, H.; Moser, W. Generators and Relations for Discrete Groups. In Ergebnisse der Mathematik und ihrer Grenzgebiete; Springer: Berlin, Germany, 1965; Volume 14.
19. Dixon, J.D.; Mortimer, B. Permutation Groups. In Graduate Texts in Mathematics; Springer: New York, NY, USA, 1996; Volume 163.
20. Beth, T.; Jungnickel, D.; Lenz, H. Design Theory; Cambridge University Press: Cambridge, UK, 1993.
21. Erdős, P.; Rényi, A.; Sós, V.T. On a problem of graph theory. Stud. Sci. Math. Hung. **1966**, 1, 215–235.
22. Burr, S.A.; Erdős, P.; Spencer, J.H. Ramsey theorems for multiple copies of graphs. Trans. Am. Math. Soc. **1975**, 209, 87–99.
23. Ball, F.; Geyer-Schulz, A. R Package Partition Comparison; Technical Report 1-2017, Information Services and Electronic Markets, Institute of Information Systems and Marketing; KIT: Karlsruhe, Germany, 2017.
24. Doob, J.L. Measure Theory. In Graduate Texts in Mathematics; Springer: New York, NY, USA, 1994.
25. Hausdorff, F. Set Theory, 2nd ed.; Chelsea Publishing Company: New York, NY, USA, 1962.
26. Kuratowski, K. Topology Volume I; Academic Press: New York, NY, USA, 1966; Volume 1.
27. Von Neumann, J. Construction of Haar’s invariant measure in groups by approximately equidistributed finite point sets and explicit evaluations of approximations. In Invariant Measures; American Mathematical Society: Providence, RI, USA, 1999; Chapter 6; pp. 87–134.
28. Ball, F.; Geyer-Schulz, A. Weak invariants of actions of the automorphism group of a graph. Arch. Data Sci. Ser. A **2017**, 2, 1–22.
29. Zachary, W.W. An information flow model for conflict and fission in small groups. J. Anthropol. Res. **1977**, 33, 452–473.
30. Bock, H.H. Automatische Klassifikation: Theoretische und praktische Methoden zur Gruppierung und Strukturierung von Daten; Vandenhoeck und Ruprecht: Göttingen, Germany, 1974.
31. Rossi, R.; Fahmy, S.; Talukder, N. A Multi-level Approach for Evaluating Internet Topology Generators. In Proceedings of the 2013 IFIP Networking Conference, Trondheim, Norway, 2–4 June 2013; pp. 1–9.
32. Rossi, R.A.; Ahmed, N.K. The Network Data Repository with Interactive Graph Analytics and Visualization. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015.
33. Furst, M.; Hopcroft, J.; Luks, E. Polynomial-time Algorithms for Permutation Groups. In Proceedings of the 21st Annual Symposium on Foundations of Computer Science, Syracuse, NY, USA, 13–15 October 1980; pp. 36–41.
34. McKay, B.D.; Piperno, A. Practical graph isomorphism, II. J. Symb. Comput. **2014**, 60, 94–112.
35. Babai, L. Graph isomorphism in quasipolynomial time. arXiv, 2015; arXiv:1512.03547.
36. Fortunato, S.; Barthélemy, M. Resolution limit in community detection. Proc. Natl. Acad. Sci. USA **2007**, 104, 36–41.
37. Lancichinetti, A.; Fortunato, S. Limits of modularity maximization in community detection. Phys. Rev. E **2011**, 84, 66122.
38. Geyer-Schulz, A.; Ovelgönne, M.; Stein, M. Modified randomized modularity clustering: Adapting the resolution limit. In Algorithms from and for Nature and Life; Lausen, B., Van den Poel, D., Ultsch, A., Eds.; Studies in Classification, Data Analysis, and Knowledge Organization; Springer International Publishing: Heidelberg, Germany, 2013; pp. 355–363.
39. Meilǎ, M. Comparing clusterings—An information based distance. J. Multivar. Anal. **2007**, 98, 873–895.
40. Youness, G.; Saporta, G. Some measures of agreement between close partitions. Student **2004**, 51, 1–12.
41. Denœud, L.; Guénoche, A. Comparison of distance indices between partitions. In Data Science and Classification; Batagelj, V., Bock, H.H., Ferligoj, A., Žiberna, A., Eds.; Studies in Classification, Data Analysis, and Knowledge Organization; Springer: Berlin/Heidelberg, Germany, 2006; pp. 21–28.
42. Albatineh, A.N.; Niewiadomska-Bugaj, M.; Mihalko, D. On similarity indices and correction for chance agreement. J. Classif. **2006**, 23, 301–313.
43. Rand, W.M. Objective criteria for the evaluation of clustering algorithms. J. Am. Stat. Assoc. **1971**, 66, 846–850.
44. Hubert, L.; Arabie, P. Comparing partitions. J. Classif. **1985**, 2, 193–218.
45. Hamann, U. Merkmalsbestand und Verwandtschaftsbeziehungen der Farinosae: Ein Beitrag zum System der Monokotyledonen. Willdenowia **1961**, 2, 639–768.
46. Czekanowski, J. “Coefficient of Racial Likeness” und “Durchschnittliche Differenz”. Anthropol. Anz. **1932**, 9, 227–249.
47. Kulczynski, S. Zespoly roslin w Pieninach. Bull. Int. Acad. Pol. Sci. Lett. **1927**, 2, 57–203.
48. McConnaughey, B.H. The determination and analysis of plankton communities. Mar. Res. **1964**, 1, 1–40.
49. Peirce, C.S. The numerical measure of the success of predictions. Science **1884**, 4, 453–454.
50. Wallace, D.L. A method for comparing two hierarchical clusterings: Comment. J. Am. Stat. Assoc. **1983**, 78, 569–576.
51. Fowlkes, E.B.; Mallows, C.L. A method for comparing two hierarchical clusterings. J. Am. Stat. Assoc. **1983**, 78, 553–569.
52. Yule, G.U. On the association of attributes in statistics: With illustrations from the material of the childhood society. Philos. Trans. R. Soc. A **1900**, 194, 257–319.
53. Sokal, R.R.; Sneath, P.H.A. Principles of Numerical Taxonomy; W. H. Freeman: San Francisco, CA, USA; London, UK, 1963.
54. Baulieu, F.B. A classification of presence/absence based dissimilarity coefficients. J. Classif. **1989**, 6, 233–246.
55. Gower, J.C.; Legendre, P. Metric and euclidean properties of dissimilarity coefficients. J. Classif. **1986**, 3, 5–48.
56. Rogers, D.J.; Tanimoto, T.T. A computer program for classifying plants. Science **1960**, 132, 1115–1118.
57. Goodman, L.A.; Kruskal, W.H. Measures of association for cross classifications. J. Am. Stat. Assoc. **1954**, 49, 732–764.
58. Robert, P.; Escoufier, Y. A unifying tool for linear multivariate statistical methods: The RV-coefficient. J. R. Stat. Soc. Ser. C **1976**, 25, 257–265.
59. Russel, P.F.; Rao, T.R. On habitat and association of species of anopheline larvae in south-eastern madras. J. Malar. Inst. India **1940**, 3, 153–178.
60. Mirkin, B.G.; Chernyi, L.B. Measurement of the distance between partitions of a finite set of objects. Autom. Remote Control **1970**, 31, 786–792.
61. Hilbert, D. Gesammelte Abhandlungen von Hermann Minkowski, Zweiter Band; Number 2; B. G. Teubner: Leipzig, UK; Berlin, Germany, 1911.
62. Pearson, K. On the coefficient of racial likeness. Biometrika **1926**, 18, 105–117.
63. Lerman, I.C. Comparing Partitions (Mathematical and Statistical Aspects). In Classification and Related Methods of Data Analysis; Bock, H.H., Ed.; North-Holland: Amsterdam, The Netherlands, 1988; pp. 121–132.
64. Fager, E.W.; McGowan, J.A. Zooplankton species groups in the north pacific co-occurrences of species can be used to derive groups whose members react similarly to water-mass types. Science **1963**, 140, 453–460.
65. Larsen, B.; Aone, C. Fast and Effective Text Mining Using Linear-time Document Clustering. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 15–18 August 1999; ACM: New York, NY, USA, 1999; pp. 16–22.
66. Meilǎ, M.; Heckerman, D. An experimental comparison of model-based clustering methods. Mach. Learn. **2001**, 42, 9–29.
67. Van Dongen, S. Performance Criteria for Graph Clustering and Markov Cluster Experiments; Technical Report INS-R 0012; CWI (Centre for Mathematics and Computer Science): Amsterdam, The Netherlands, 2000.
68. Vinh, N.X.; Epps, J.; Bailey, J. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. **2010**, 11, 2837–2854.
69. Meilǎ, M. Comparing clusterings by the variation of information. In Learning Theory and Kernel Machines; Schölkopf, B., Warmuth, M.K., Eds.; Number 2777 in Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2003; pp. 173–187.
70. Danon, L.; Díaz-Guilera, A.; Duch, J.; Arenas, A. Comparing community structure identification. J. Stat. Mech. Theory Exp. **2005**, 2005, P09008.

**Figure 1.** Two structurally different partitions of the cycle graph ${C}_{4}$: grouping pairs of neighbors (**a**) and grouping pairs of diagonals (**d**). Equally-colored nodes represent graph clusters, and the choice of colors is arbitrary. Adding, again arbitrary, but fixed, node labels impacts the node partitions and results in the failure to recognize the structural difference when comparing these partitions with partition comparison measures (see Table 1). The different images (**b**,**c**) (${\mathcal{P}}_{1}=\{\{1,2\},\{3,4\}\}$, ${\mathcal{P}}_{2}=\{\{1,4\},\{2,3\}\}$) and (**e**,**f**) (${\mathcal{Q}}_{1}={\mathcal{Q}}_{2}=\{\{1,3\},\{2,4\}\}$) emerge from the graph’s symmetry.

**Figure 3.** The cycle graph ${C}_{36}$ (the “outer” cycle) and an initial partition of six clusters (connected nodes of the same color, separated by dashed lines). A single application of $g=(1\phantom{\rule{0.166667em}{0ex}}2\phantom{\rule{0.166667em}{0ex}}\dots \phantom{\rule{0.166667em}{0ex}}36)$ “rotates” the graph by one node (the “inner” cycle ${\mathcal{C}}_{36}^{g}$). As a consequence, in each cluster, one node drops out and is added to another cluster: For instance, Node 1 drops out of the “original” cluster $C=\{1,2,3,4,5,6\}$, and Node 7 is added, resulting in ${C}^{g}=\{2,3,4,5,6,7\}$. All dropped nodes are shown in light gray.
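
The rotation described in this caption can be reproduced in a few lines. The sketch below applies $g=(1\,2\,\dots \,36)$ to the cluster $C$ and then rotates a whole partition repeatedly until it repeats. The initial partition is assumed to consist of six clusters of six consecutive nodes each, which matches the cluster $C$ named in the caption but is otherwise an assumption; the function names are illustrative.

```python
N = 36


def rotate(u):
    """Image of node u under g = (1 2 ... 36): each node moves to its successor."""
    return u % N + 1


def image(partition):
    """Apply the rotation to every node of every cluster."""
    return frozenset(frozenset(rotate(u) for u in cluster) for cluster in partition)


# Assumed initial partition: six clusters of six consecutive nodes each.
P0 = frozenset(frozenset(range(start, start + 6)) for start in range(1, N + 1, 6))

C = frozenset(range(1, 7))
Cg = frozenset(rotate(u) for u in C)
print(sorted(Cg))  # [2, 3, 4, 5, 6, 7]

# Repeated rotation: collect distinct images until the partition repeats.
seen, current = [], P0
while current not in seen:
    seen.append(current)
    current = image(current)
print(len(seen))  # number of distinct rotated partitions
```

Under the assumed initial partition, six rotations by one node map every cluster onto its neighbor, so the rotations alone already produce six distinct partitions.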

**Figure 4.** Zachary’s Karate graph K with the vertices of the orbits of the three subgroups of $Aut(K)$ in bold and the clusters of ${P}_{O}$ separated by dashed edges.

**Table 1.** The Rand index is $RI=\frac{{N}_{11}+{N}_{00}}{{N}_{11}+{N}_{10}+{N}_{01}+{N}_{00}}$. ${N}_{11}$ is the number of node pairs that are together in a cluster in both partitions; ${N}_{10}$ and ${N}_{01}$ are the numbers of node pairs that are together in a cluster in one partition, but not in the other; and ${N}_{00}$ is the number of node pairs that are in different clusters in both partitions. See Appendix B for the formal definitions. Partitions ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$ are equivalent (yet not equal, denoted “∼”), and partitions ${\mathcal{Q}}_{1}$ and ${\mathcal{Q}}_{2}$ are identical (thus, also equivalent, denoted “=”). However, the comparison of the structurally different partitions (denoted “≠”) ${\mathcal{P}}_{i}$ and ${\mathcal{Q}}_{j}$ yields the same result as the comparison between the equivalent partitions ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$. This makes the recognition of structural differences impossible.

Case | Compared Partitions | Relation | ${N}_{11}$ | ${N}_{10}$ | ${N}_{01}$ | ${N}_{00}$ | RI |
---|---|---|---|---|---|---|---|
1 | ${\mathcal{P}}_{1},{\mathcal{P}}_{1}$ | = | 2 | 0 | 0 | 4 | 1 |
2 | ${\mathcal{P}}_{2},{\mathcal{P}}_{2}$ | = | 2 | 0 | 0 | 4 | 1 |
3 | ${\mathcal{Q}}_{1},{\mathcal{Q}}_{1}$ or ${\mathcal{Q}}_{1},{\mathcal{Q}}_{2}$ or ${\mathcal{Q}}_{2},{\mathcal{Q}}_{2}$ | = | 2 | 0 | 0 | 4 | 1 |
4 | ${\mathcal{P}}_{1},{\mathcal{Q}}_{1}$ or ${\mathcal{P}}_{1},{\mathcal{Q}}_{2}$ | ≠ | 0 | 2 | 2 | 2 | $\frac{1}{3}$ |
5 | ${\mathcal{P}}_{2},{\mathcal{Q}}_{1}$ or ${\mathcal{P}}_{2},{\mathcal{Q}}_{2}$ | ≠ | 0 | 2 | 2 | 2 | $\frac{1}{3}$ |
6 | ${\mathcal{P}}_{1},{\mathcal{P}}_{2}$ | ∼ | 0 | 2 | 2 | 2 | $\frac{1}{3}$ |
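
The pair counts and Rand index values in Table 1 can be reproduced with a few lines of code. The following is a minimal Python sketch, not the authors' implementation; the helper names `pair_counts` and `rand_index` are illustrative assumptions, and exact fractions are used so that $\frac{1}{3}$ is not rounded.

```python
from fractions import Fraction
from itertools import combinations


def pair_counts(p, q):
    """Return (N11, N10, N01, N00) over all unordered node pairs."""
    label_p = {v: i for i, cluster in enumerate(p) for v in cluster}
    label_q = {v: i for i, cluster in enumerate(q) for v in cluster}
    n11 = n10 = n01 = n00 = 0
    for u, v in combinations(sorted(label_p), 2):
        same_p = label_p[u] == label_p[v]
        same_q = label_q[u] == label_q[v]
        if same_p and same_q:
            n11 += 1
        elif same_p:
            n10 += 1
        elif same_q:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def rand_index(p, q):
    """RI = (N11 + N00) / (N11 + N10 + N01 + N00), as in the Table 1 caption."""
    n11, n10, n01, n00 = pair_counts(p, q)
    return Fraction(n11 + n00, n11 + n10 + n01 + n00)


# Partitions of the cycle graph C4 from Figure 1.
P1 = [{1, 2}, {3, 4}]
P2 = [{1, 4}, {2, 3}]
Q1 = [{1, 3}, {2, 4}]

cases = {"P1,P1": (P1, P1), "P1,Q1": (P1, Q1), "P1,P2": (P1, P2)}
for name, (a, b) in cases.items():
    print(name, pair_counts(a, b), rand_index(a, b))
```

The comparison of the equivalent partitions ${\mathcal{P}}_{1},{\mathcal{P}}_{2}$ and of the structurally different partitions ${\mathcal{P}}_{1},{\mathcal{Q}}_{1}$ produce identical counts, which is the problem that Cases 4 to 6 illustrate.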

**Table 2.** The full automorphism group $Aut({G}_{bf})=\{id,{g}_{1},\dots ,{g}_{7}\}$ of the butterfly graph in Figure 2 and its effect on three partitions. Bold partitions are distinct. A possible generator is $\left(\right)$.

Permutation | ${\mathcal{P}}_{1},\,Q=0$ | ${\mathcal{P}}_{2},\,Q=\frac{1}{9}$ | ${\mathcal{P}}_{3},\,Q=-\frac{1}{18}$ |
---|---|---|---|
$id=(1)(2)(3)(4)(5)$ | $\{\mathbf{1},\mathbf{2}\},\{\mathbf{3}\},\{\mathbf{4},\mathbf{5}\}$ | $\{\mathbf{1},\mathbf{2},\mathbf{3}\},\{\mathbf{4},\mathbf{5}\}$ | $\{\mathbf{1},\mathbf{2},\mathbf{3},\mathbf{4}\},\{\mathbf{5}\}$ |
${g}_{1}=(1\,2)$ | $\{2,1\},\{3\},\{4,5\}$ | $\{2,1,3\},\{4,5\}$ | $\{2,1,3,4\},\{5\}$ |
${g}_{2}=(4\,5)$ | $\{1,2\},\{3\},\{5,4\}$ | $\{1,2,3\},\{5,4\}$ | $\{\mathbf{1},\mathbf{2},\mathbf{3},\mathbf{5}\},\{\mathbf{4}\}$ |
${g}_{3}=(1\,2)(4\,5)$ | $\{2,1\},\{3\},\{5,4\}$ | $\{2,1,3\},\{5,4\}$ | $\{2,1,3,5\},\{4\}$ |
${g}_{4}=(1\,4)(2\,5)$ | $\{4,5\},\{3\},\{1,2\}$ | $\{\mathbf{4},\mathbf{5},\mathbf{3}\},\{\mathbf{1},\mathbf{2}\}$ | $\{\mathbf{4},\mathbf{5},\mathbf{3},\mathbf{1}\},\{\mathbf{2}\}$ |
${g}_{5}=(1\,5)(2\,4)$ | $\{5,4\},\{3\},\{2,1\}$ | $\{5,4,3\},\{2,1\}$ | $\{5,4,3,2\},\{1\}$ |
${g}_{6}=(1\,4\,2\,5)$ | $\{4,5\},\{3\},\{2,1\}$ | $\{4,5,3\},\{2,1\}$ | $\{\mathbf{4},\mathbf{5},\mathbf{3},\mathbf{2}\},\{\mathbf{1}\}$ |
${g}_{7}=(1\,5\,2\,4)$ | $\{5,4\},\{3\},\{1,2\}$ | $\{5,4,3\},\{1,2\}$ | $\{5,4,3,1\},\{2\}$ |
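
The images in Table 2 follow mechanically from applying each permutation to every node of every cluster. As a hedged illustration (a sketch with hypothetical helper names `cycles_to_map`, `image` and `orbit`, using the eight permutations exactly as listed in the table), the orbit sizes of the three partitions can be checked as follows:

```python
def cycles_to_map(cycles, nodes):
    """Turn disjoint cycles such as [(1, 4, 2, 5)] into a dict v -> image of v."""
    mapping = {v: v for v in nodes}
    for cycle in cycles:
        for i, v in enumerate(cycle):
            mapping[v] = cycle[(i + 1) % len(cycle)]
    return mapping


def image(partition, mapping):
    """Apply a permutation to every node of every cluster."""
    return frozenset(frozenset(mapping[v] for v in cluster) for cluster in partition)


NODES = range(1, 6)
# The eight automorphisms of the butterfly graph in cycle notation (Table 2).
AUT = {
    "id": [],
    "g1": [(1, 2)],
    "g2": [(4, 5)],
    "g3": [(1, 2), (4, 5)],
    "g4": [(1, 4), (2, 5)],
    "g5": [(1, 5), (2, 4)],
    "g6": [(1, 4, 2, 5)],
    "g7": [(1, 5, 2, 4)],
}


def orbit(partition):
    """Set of distinct images of a partition under all automorphisms."""
    return {image(partition, cycles_to_map(c, NODES)) for c in AUT.values()}


P1 = [{1, 2}, {3}, {4, 5}]
P2 = [{1, 2, 3}, {4, 5}]
P3 = [{1, 2, 3, 4}, {5}]

for name, p in (("P1", P1), ("P2", P2), ("P3", P3)):
    print(name, "orbit size:", len(orbit(p)))
```

Storing partitions as frozensets of frozensets makes the order of clusters and of nodes within a cluster irrelevant, so that $\{2,1\},\{3\},\{5,4\}$ and $\{1,2\},\{3\},\{4,5\}$ count as the same partition, matching the bold entries in the table.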

**Table 3.** Comparing the modularity-maximizing partitions of the cycle graph ${C}_{36}$ with modularity $Q=\frac{2}{3}$. The six optimal partitions consist of six clusters (see Figure 3). The number of pairs in the same cluster in both partitions is denoted by ${N}_{11}$, in different clusters by ${N}_{00}$ and in the same cluster in one partition, but not in the other, by ${N}_{01}$ or ${N}_{10}$. For the definitions of all partition comparison measures, see Appendix B. To compute this table, the R package `partitionComparison` has been used [23].

Measure $m({\mathcal{P}}_{0},{\mathcal{P}}_{0}^{{g}^{k}})$ with $g=(1\,2\,3\,\dots\,35\,36)$ | $k=0$ | $k=1$ | $k=2$ | $k=3$ | $k=4$ | $k=5$ |
---|---|---|---|---|---|---|
Pair counting measures ($f({N}_{11},{N}_{00},{N}_{01},{N}_{10})$; see Table A1 and Table A2) | | | | | | |
RI | $1.0$ | $0.90476$ | $0.84762$ | $0.82857$ | $0.84762$ | $0.90476$ |
ARI | $1.0$ | $0.61111$ | $0.37778$ | $0.3$ | $0.37778$ | $0.61111$ |
H | $1.0$ | $0.80952$ | $0.69524$ | $0.65714$ | $0.69524$ | $0.80952$ |
CZ | $1.0$ | $0.66667$ | $0.46667$ | $0.4$ | $0.46667$ | $0.66667$ |
K | $1.0$ | $0.66667$ | $0.46667$ | $0.4$ | $0.46667$ | $0.66667$ |
MC | $1.0$ | $0.33333$ | $-0.06667$ | $-0.2$ | $-0.06667$ | $0.33333$ |
P | $1.0$ | $0.61111$ | $0.37778$ | $0.3$ | $0.37778$ | $0.61111$ |
${W}_{I}$ | $1.0$ | $0.66667$ | $0.46667$ | $0.4$ | $0.46667$ | $0.66667$ |
${W}_{II}$ | $1.0$ | $0.66667$ | $0.46667$ | $0.4$ | $0.46667$ | $0.66667$ |
FM | $1.0$ | $0.66667$ | $0.46667$ | $0.4$ | $0.46667$ | $0.66667$ |
$\Gamma $ | $1.0$ | $0.61111$ | $0.37778$ | $0.3$ | $0.37778$ | $0.61111$ |
SS1 | $1.0$ | $0.80556$ | $0.68889$ | $0.65$ | $0.68889$ | $0.80556$ |
B1 | $1.0$ | $0.91383$ | $0.87084$ | $0.85796$ | $0.87084$ | $0.91383$ |
GL | $1.0$ | $0.95$ | $0.91753$ | $0.90625$ | $0.91753$ | $0.95$ |
SS2 | $1.0$ | $0.33333$ | $0.17949$ | $0.14286$ | $0.17949$ | $0.33333$ |
SS3 | $1.0$ | $0.62963$ | $0.42519$ | $0.36$ | $0.42519$ | $0.62963$ |
RT | $1.0$ | $0.82609$ | $0.73554$ | $0.70732$ | $0.73554$ | $0.82609$ |
GK | $1.0$ | $0.94286$ | $0.79937$ | $0.71429$ | $0.79937$ | $0.94286$ |
J | $1.0$ | $0.5$ | $0.30435$ | $0.25$ | $0.30435$ | $0.5$ |
RV | $1.0$ | $0.61039$ | $0.37662$ | $0.29870$ | $0.37662$ | $0.61039$ |
RR | $0.14286$ | $0.09524$ | $0.06667$ | $0.05714$ | $0.06667$ | $0.09524$ |
M | $0.0$ | $120.0$ | $192.0$ | $216.0$ | $192.0$ | $120.0$ |
Mi | $0.0$ | $0.81650$ | $1.03280$ | $1.09545$ | $1.03280$ | $0.81650$ |
Pe | $0.00002$ | $0.00001$ | $0.00001$ | $0.00001$ | $0.00001$ | $0.00001$ |
B2 | $0.12245$ | $0.07483$ | $0.04626$ | $0.03673$ | $0.04626$ | $0.07483$ |
LI | $24.37212$ | $14.89407$ | $9.20724$ | $7.31163$ | $9.20724$ | $14.89407$ |
NLI | $1.0$ | $0.61111$ | $0.37778$ | $0.3$ | $0.37778$ | $0.61111$ |
FMG | $0.94730$ | $0.61396$ | $0.41396$ | $0.34730$ | $0.41396$ | $0.61396$ |
Set-based comparison measures (see Table A3) | | | | | | |
LA | $1.0$ | $0.83333$ | $0.66667$ | $0.5$ | $0.66667$ | $0.83333$ |
${d}_{CE}$ | $0.0$ | $0.16667$ | $0.33333$ | $0.5$ | $0.33333$ | $0.16667$ |
D | $0.0$ | $12.0$ | $24.0$ | $36.0$ | $24.0$ | $12.0$ |
Information theory-based measures (see Table A4) | | | | | | |
MI | $1.79176$ | $1.34120$ | $1.15525$ | $1.09861$ | $1.15525$ | $1.34120$ |
NMI (max) | $1.0$ | $0.74854$ | $0.64475$ | $0.61315$ | $0.64475$ | $0.74854$ |
NMI (min) | $1.0$ | $0.74854$ | $0.64475$ | $0.61315$ | $0.64475$ | $0.74854$ |
NMI ($\mathsf{\Sigma}$) | $1.0$ | $0.74854$ | $0.64475$ | $0.61315$ | $0.64475$ | $0.74854$ |
VI | $0.0$ | $0.90112$ | $1.27303$ | $1.38629$ | $1.27303$ | $0.90112$ |
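
A few rows of Table 3 can be recomputed directly from the pair counts. The sketch below is an independent Python illustration, not the `partitionComparison` package used for the table. It assumes the initial partition ${\mathcal{P}}_{0}$ consists of the six blocks of consecutive nodes $\{1,\dots ,6\},\{7,\dots ,12\},\dots$ (as suggested by Figure 3) and uses the common textbook forms of the Jaccard index, ${N}_{11}/({N}_{11}+{N}_{10}+{N}_{01})$, and the Mirkin metric, $2({N}_{10}+{N}_{01})$, which are assumed here to agree with the definitions in Appendix B.

```python
from collections import Counter
from math import comb

N, SIZE = 36, 6  # nodes of C36 and assumed cluster size of P0


def labels_after_rotation(k):
    """Cluster label of each node 1..N in P0^(g^k); k = 0 gives P0 itself."""
    return [((v - 1 - k) % N) // SIZE for v in range(1, N + 1)]


def pair_counts(labels_p, labels_q):
    """Return (N11, N10, N01, N00) from the contingency table of two labelings."""
    joint = Counter(zip(labels_p, labels_q))
    n11 = sum(comb(c, 2) for c in joint.values())
    same_p = sum(comb(c, 2) for c in Counter(labels_p).values())
    same_q = sum(comb(c, 2) for c in Counter(labels_q).values())
    n10, n01 = same_p - n11, same_q - n11
    n00 = comb(len(labels_p), 2) - n11 - n10 - n01
    return n11, n10, n01, n00


def measures(k):
    """Rand index, Jaccard index and Mirkin metric of P0 versus P0^(g^k)."""
    n11, n10, n01, n00 = pair_counts(labels_after_rotation(0), labels_after_rotation(k))
    ri = (n11 + n00) / (n11 + n10 + n01 + n00)
    jaccard = n11 / (n11 + n10 + n01)
    mirkin = 2 * (n10 + n01)
    return ri, jaccard, mirkin


for k in range(6):
    ri, jaccard, mirkin = measures(k)
    print(k, round(ri, 5), round(jaccard, 5), mirkin)
```

Under these assumptions the printed values agree with the RI, J and M rows, and they show the same symmetry around $k=3$ as the rest of the table.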

**Table 4.** The equivalence classes of the pseudometric space $(S,{d}^{\ast})$ of the butterfly graph (see Figure 2). Classes are grouped by their partition type, which is the corresponding integer partition. k is the number of partitions per type; l is the number of clusters that the partitions of a type consist of; ${dia}_{1-RI}$ is the diameter (see Equation (2)) of the equivalence class under the distance ${d}_{RI}=1-RI$ derived from the Rand index (RI).

Class | ${\mathcal{P}}^{Aut(G)}$ | Q | ${dia}_{1-RI}$ |
---|---|---|---|
Partition type $(1,1,1,1,1)$, $k=1$, $l=5$ | | | |
${\mathit{E}}_{\mathbf{1}}$ | $\{1\},\{2\},\{3\},\{4\},\{5\}$ | $-\frac{2}{9}$ | $0.0$ |
Partition type $(1,1,1,2)$, $k=10$, $l=4$ | | | |
${E}_{2}$ | $\{1\},\{2\},\{3\},\{4,5\}\quad\{1,2\},\{3\},\{4\},\{5\}$ | $-\frac{1}{9}$ | $0.2$ |
${E}_{3}$ | $\{1\},\{2\},\{3,4\},\{5\}\quad\{1\},\{2\},\{3,5\},\{4\}\quad\{1\},\{2,3\},\{4\},\{5\}\quad\{1,3\},\{2\},\{4\},\{5\}$ | $-\frac{1}{6}$ | $0.2$ |
${E}_{4}$ | $\{1\},\{2,4\},\{3\},\{5\}\quad\{1\},\{2,5\},\{3\},\{4\}\quad\{1,4\},\{2\},\{3\},\{5\}\quad\{1,5\},\{2\},\{3\},\{4\}$ | $-\frac{5}{18}$ | $0.2$ |
Partition type $(1,1,3)$, $k=10$, $l=3$ | | | |
${E}_{5}$ | $\{1\},\{2\},\{3,4,5\}\quad\{4\},\{5\},\{1,2,3\}$ | 0 | $0.6$ |
${E}_{6}$ | $\{1\},\{3\},\{2,4,5\}\quad\{3\},\{5\},\{1,2,4\}\quad\{3\},\{4\},\{1,2,5\}\quad\{2\},\{3\},\{1,4,5\}$ | $-\frac{2}{9}$ | $0.4$ |
${E}_{7}$ | $\{1\},\{5\},\{2,3,4\}\quad\{1\},\{4\},\{2,3,5\}\quad\{2\},\{5\},\{1,3,4\}\quad\{2\},\{4\},\{1,3,5\}$ | $-\frac{1}{6}$ | $0.6$ |
Partition type $(1,2,2)$, $k=15$, $l=3$ | | | |
${\mathit{E}}_{\mathbf{8}}$ | $\{3\},\{1,2\},\{4,5\}$ | 0 | $0.0$ |
${E}_{9}$ | $\{3\},\{1,4\},\{2,5\}\quad\{3\},\{1,5\},\{2,4\}$ | $-\frac{1}{3}$ | $0.4$ |
${E}_{10}$ | $\{1\},\{2,3\},\{4,5\}\quad\{5\},\{1,2\},\{3,4\}\quad\{4\},\{1,2\},\{3,5\}\quad\{2\},\{1,3\},\{4,5\}$ | $-\frac{1}{18}$ | $0.4$ |
${E}_{11}$ | $\{1\},\{2,4\},\{3,5\}\quad\{1\},\{2,5\},\{3,4\}\quad\{5\},\{1,3\},\{2,4\}\quad\{4\},\{1,3\},\{2,5\}\quad\{2\},\{1,4\},\{3,5\}\quad\{5\},\{1,4\},\{2,3\}\quad\{2\},\{1,5\},\{3,4\}\quad\{4\},\{1,5\},\{2,3\}$ | $-\frac{2}{9}$ | $0.4$ |
Partition type $(1,4)$, $k=5$, $l=2$ | | | |
${\mathit{E}}_{\mathbf{12}}$ | $\{1,2,4,5\},\{3\}$ | $-\frac{2}{9}$ | $0.0$ |
${E}_{13}$ | $\{2,3,4,5\},\{1\}\quad\{1,2,3,4\},\{5\}\quad\{1,2,3,5\},\{4\}\quad\{1,3,4,5\},\{2\}$ | $-\frac{1}{18}$ | $0.6$ |
Partition type $(2,3)$, $k=10$, $l=2$ | | | |
${E}_{14}$ | $\{1,2\},\{3,4,5\}\quad\{4,5\},\{1,2,3\}$ | $\frac{1}{9}$ | $0.4$ |
${E}_{15}$ | $\{3,5\},\{1,2,4\}\quad\{3,4\},\{1,2,5\}\quad\{1,3\},\{2,4,5\}\quad\{2,3\},\{1,4,5\}$ | $-\frac{1}{6}$ | $0.6$ |
${E}_{16}$ | $\{2,5\},\{1,3,4\}\quad\{2,4\},\{1,3,5\}\quad\{1,4\},\{2,3,5\}\quad\{1,5\},\{2,3,4\}$ | $-\frac{2}{9}$ | $0.6$ |
Partition type $(5)$, $k=1$, $l=1$ | | | |
${\mathit{E}}_{\mathbf{17}}$ | $\{1,2,3,4,5\}$ | 0 | $0.0$ |
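
The class structure of Table 4 can be checked by brute force: enumerate all 52 partitions of the five nodes, group them into orbits under the eight automorphisms listed in Table 2, and take the largest Rand distance within each orbit as its diameter. The following Python sketch does this under those assumptions; the function names (`set_partitions`, `orbit`, `d_ri`, `diameter`) are illustrative and not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations

NODES = (1, 2, 3, 4, 5)
# Automorphisms of the butterfly graph in cycle notation (see Table 2).
AUT_CYCLES = [[], [(1, 2)], [(4, 5)], [(1, 2), (4, 5)], [(1, 4), (2, 5)],
              [(1, 5), (2, 4)], [(1, 4, 2, 5)], [(1, 5, 2, 4)]]


def to_map(cycles):
    mapping = {v: v for v in NODES}
    for cycle in cycles:
        for i, v in enumerate(cycle):
            mapping[v] = cycle[(i + 1) % len(cycle)]
    return mapping


AUT = [to_map(c) for c in AUT_CYCLES]


def set_partitions(items):
    """Yield every partition of the list `items` as a list of lists."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        yield [[first]] + smaller


def canon(partition):
    """Order-independent representation of a partition."""
    return frozenset(frozenset(c) for c in partition)


def orbit(p):
    """Equivalence class of p: all images under the automorphism group."""
    return frozenset(frozenset(frozenset(g[v] for v in c) for c in p) for g in AUT)


def d_ri(p, q):
    """Rand distance 1 - RI, computed exactly."""
    pairs = list(combinations(NODES, 2))
    together = lambda part, u, v: any(u in c and v in c for c in part)
    agree = sum(together(p, u, v) == together(q, u, v) for u, v in pairs)
    return 1 - Fraction(agree, len(pairs))


def diameter(cls):
    return max(d_ri(p, q) for p in cls for q in cls)


classes = {orbit(canon(p)) for p in set_partitions(list(NODES))}
print(len(classes), "classes;", sum(len(c) == 1 for c in classes), "stable")
for cls in sorted(classes, key=len):
    print(len(cls), float(diameter(cls)))
```

Classes with a single member have diameter 0; these correspond to the bold entries ${E}_{1}$, ${E}_{8}$, ${E}_{12}$ and ${E}_{17}$.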

**Table 5.** Measure decomposition for partitions of the butterfly graph for the Rand distance ${d}_{RI}=1-RI$.

Case | $\mathcal{P}$ | $\mathcal{Q}$ | ${d}_{RI}$ | ${d}^{\ast}$ | ${d}_{struc}$ | ${d}_{Aut(G)}$ |
---|---|---|---|---|---|---|
1 | $\{\{1,2,3,4\},\{5\}\}$ | $\{\{4\},\{5\},\{1,2,3\}\}$ | 0.3 | ${d}_{L}^{\ast}$ | 0.3 | 0.0 |
 | $\in {E}_{13}$ | $\in {E}_{5}$ | 0.3 | ${d}_{av}^{\ast}$ | 0.5 | −0.2 |
 | $dia({E}_{13})=0.6$ | $dia({E}_{5})=0.6$ | 0.3 | ${d}_{U}^{\ast}$ | 0.7 | −0.4 |
2 | $\{\{2,4\},\{1,3,5\}\}$ | $\{\{3\},\{1,4\},\{2,5\}\}$ | 0.6 | ${d}_{L}^{\ast}$ | 0.2 | 0.4 |
 | $\in {E}_{16}$ | $\in {E}_{9}$ | 0.6 | ${d}_{av}^{\ast}$ | 0.4 | 0.2 |
 | $dia({E}_{16})=0.6$ | $dia({E}_{9})=0.4$ | 0.6 | ${d}_{U}^{\ast}$ | 0.6 | 0.0 |
3 | $\{\{1\},\{2,5\},\{3,4\}\}$ | $\{\{1\},\{2,3\},\{4\},\{5\}\}$ | 0.3 | ${d}_{L}^{\ast}$ | 0.1 | 0.2 |
 | $\in {E}_{11}$ | $\in {E}_{3}$ | 0.3 | ${d}_{av}^{\ast}$ | 0.25 | 0.05 |
 | $dia({E}_{11})=0.4$ | $dia({E}_{3})=0.2$ | 0.3 | ${d}_{U}^{\ast}$ | 0.3 | 0.0 |
4 | $\{\{3\},\{1,2\},\{4,5\}\}$ | $\{\{1\},\{2,3\},\{4,5\}\}$ | 0.3 | ${d}_{L}^{\ast}$ | 0.3 | 0.0 |
 | $\in {E}_{8}$ | $\in {E}_{10}$ | 0.3 | ${d}_{av}^{\ast}$ | 0.3 | 0.0 |
 | ($dia({E}_{8})=0$, stable) | $dia({E}_{10})=0.4$ | 0.3 | ${d}_{U}^{\ast}$ | 0.3 | 0.0 |
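
The entries of Table 5 are consistent with reading ${d}_{L}^{\ast}$, ${d}_{av}^{\ast}$ and ${d}_{U}^{\ast}$ as the minimum, average and maximum of ${d}_{RI}$ over all pairs drawn from the two orbits, with the automorphism effect as the remainder ${d}_{RI}-{d}^{\ast}$. This reading is an assumption made for the sketch below and should be checked against the formal definitions in the main text; the function `decompose` and its return format are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

NODES = (1, 2, 3, 4, 5)
# Automorphisms of the butterfly graph in cycle notation (see Table 2).
AUT_CYCLES = [[], [(1, 2)], [(4, 5)], [(1, 2), (4, 5)], [(1, 4), (2, 5)],
              [(1, 5), (2, 4)], [(1, 4, 2, 5)], [(1, 5, 2, 4)]]


def to_map(cycles):
    mapping = {v: v for v in NODES}
    for cycle in cycles:
        for i, v in enumerate(cycle):
            mapping[v] = cycle[(i + 1) % len(cycle)]
    return mapping


AUT = [to_map(c) for c in AUT_CYCLES]


def orbit(p):
    """All distinct images of partition p under the automorphism group."""
    return {frozenset(frozenset(g[v] for v in c) for c in p) for g in AUT}


def d_ri(p, q):
    """Rand distance 1 - RI, computed exactly."""
    pairs = list(combinations(NODES, 2))
    together = lambda part, u, v: any(u in c and v in c for c in part)
    agree = sum(together(p, u, v) == together(q, u, v) for u, v in pairs)
    return 1 - Fraction(agree, len(pairs))


def decompose(p, q):
    """Map each variant to (d_star, d_aut) with d_ri(p, q) = d_star + d_aut.

    Assumption: lower, average and upper variants are the min, mean and max
    of d_ri over all pairs of orbit members.
    """
    d = d_ri(p, q)
    dists = [d_ri(a, b) for a, b in product(orbit(p), orbit(q))]
    variants = {"L": min(dists), "av": sum(dists) / len(dists), "U": max(dists)}
    return {name: (d_star, d - d_star) for name, d_star in variants.items()}


# Case 1 of Table 5.
P = [{1, 2, 3, 4}, {5}]
Q = [{4}, {5}, {1, 2, 3}]
for name, (d_star, d_aut) in decompose(P, Q).items():
    print(name, float(d_star), float(d_aut))
```

With this reading, Cases 1 to 3 of the table are reproduced exactly. A negative automorphism effect, as for ${d}_{av}^{\ast}$ and ${d}_{U}^{\ast}$ in Case 1, simply means that the observed distance lies below the average or maximum distance between the two orbits.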

**Table 6.** Diameter (computed using ${d}_{RI}$), orbit size and stability of partitions ${\mathcal{P}}_{O}$, ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$.

$\mathcal{X}$ | ${\mathcal{P}}_{O}$ | ${\mathcal{P}}_{1}$ | ${\mathcal{P}}_{2}$ |
---|---|---|---|
$dia(\mathcal{X})$ | $0.0000$ | $0.1176$ | $0.1390$ |
$|{\mathcal{X}}^{Aut(G)}|$ | 1 | 20 | 20 |
$\mathcal{X}$ stable? | yes | no | no |

**Table 7.** Invariant measures and automorphism effects for the Karate graph. The R package `partitionComparison` has been used for the computations [23].

Measure $d={d}_{RI}$ | $m({\mathcal{P}}_{O},{\mathcal{P}}_{1})$ | $m({\mathcal{P}}_{O},{\mathcal{P}}_{2})$ | $m({\mathcal{P}}_{1},{\mathcal{P}}_{2})$ |
---|---|---|---|
d | $0.0927$ | $0.1426$ | $0.0499$ |
${d}_{L}^{\ast}+{d}_{Aut(G)}$ | $0.0927$ | $0.1426$ | $0.0499+0.0000$ |
${d}_{U}^{\ast}-{d}_{Aut(G)}$ | $0.0927$ | $0.1426$ | $0.1676-0.1176$ |
${d}_{av}^{\ast}-{d}_{Aut(G)}$ | $0.0927$ | $0.1426$ | $0.1280-0.0781$ |
${e}_{max}^{Aut(K)}$ | $0.0000$ | $0.0000$ | $0.1176$ |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ball, F.; Geyer-Schulz, A.
Invariant Graph Partition Comparison Measures. *Symmetry* **2018**, *10*, 504.
https://doi.org/10.3390/sym10100504
