# Linking and Cutting Spanning Trees


## Abstract


## 1. Introduction

- We present a new algorithm that, given a graph G, generates a spanning tree of G uniformly at random. The algorithm uses the link-cut tree (LCT) data structure to compute the randomizing operations in $O(\log V)$ amortized time per operation. Hence, the overall algorithm takes $O(\tau \log V)$ time to obtain a uniform spanning tree of G, where $\tau$ is the mixing time of a Markov chain that depends on G. Theorem 1 summarizes this result.
- We propose a coupling to bound the mixing time $\tau$. The analysis of the coupling yields a bound for cycle graphs (Theorem 2), and for graphs that consist of simple cycles connected by bridges or articulation points (Theorem 3). We also simulate this procedure experimentally to obtain bounds for other graphs. The link-cut tree data structure is also key in this process. Section 4.3 shows experimental results, including other classes of graphs.

## 2. The Challenge

## 3. Main Idea

`Link` operation. The randomizing process needs to identify C and select $({u}^{\prime},{v}^{\prime})$ from it. The LCT can also compute this process in $O(\log V)$ amortized time. The LCT works by partitioning the represented tree into disjoint paths. Each path is stored in an auxiliary data structure, so that any of its edges can be accessed in $O(\log V)$ amortized time. To compute this process we force the path $D=C\setminus \{(u,v)\}$ to become a disjoint path. This means that D is stored entirely in one auxiliary data structure, so an edge can be selected from it efficiently; moreover, the size of D can also be computed efficiently. The exact process to force D into an auxiliary structure is to make u the root of the represented tree and then access v. Algorithm 1 shows the pseudo-code of the edge-swapping procedure. We can confirm, by inspection, that this process runs within the $O(\log V)$ amortized time bound that is crucial for our main result.

**Algorithm 1** Edge-swapping process

1: **procedure** EdgeSwap(A) ▹ A is an LCT representation of the current spanning tree

2: $(u,v)\leftarrow$ chosen uniformly at random from E

3: **if** $(u,v)\notin A$ **then** ▹ $O(\log V)$ time

4: $\mathtt{ReRoot}(A,u)$ ▹ makes u the root of A

5: $D\leftarrow \mathtt{Access}(A,v)$ ▹ obtains a representation of the path $C\setminus \{(u,v)\}$

6: $i\leftarrow$ chosen uniformly from $\{1,\dots ,|D|\}$

7: $({u}^{\prime},{v}^{\prime})\leftarrow \mathtt{Select}(D,i)$ ▹ obtains the i-th edge of D

8: $\mathtt{Cut}(A,{u}^{\prime},{v}^{\prime})$

9: $\mathtt{Link}(A,u,v)$

10: **end if**

11: **end procedure**
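The LCT operations in Algorithm 1 can be mimicked with ordinary graph search, which is useful for testing. Below is an illustrative Python sketch of one edge-swapping step; it replaces `ReRoot`/`Access`/`Select`/`Cut`/`Link` with a BFS over an adjacency-set representation, so each step costs $O(V+E)$ instead of the amortized $O(\log V)$ of the LCT. Function and variable names here are ours, not from the paper.

```python
import random
from collections import deque

def edge_swap(tree, edges, rng):
    """One step of the edge-swapping chain: a plain O(V + E) version of
    Algorithm 1, with BFS standing in for the link-cut tree operations.
    `tree` maps each vertex to the set of its tree neighbours."""
    u, v = rng.choice(edges)            # (u, v) uniform over E
    if v in tree[u]:                    # (u, v) already in the tree: self-loop
        return
    # Find the tree path D from u to v, i.e., the cycle C minus (u, v).
    parent = {u: None}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in tree[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    path = []
    x = v
    while parent[x] is not None:        # walk back from v to the root u
        path.append((parent[x], x))
        x = parent[x]
    a, b = rng.choice(path)             # (u', v') uniform over D
    tree[a].discard(b); tree[b].discard(a)   # Cut(u', v')
    tree[u].add(v); tree[v].add(u)           # Link(u, v)
```

A quick invariant check: after any number of steps the structure still has $V-1$ edges and remains connected, so it is still a spanning tree.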

**Theorem 1.**

## 4. The Details

#### 4.1. Ergodic Analysis

- If $i=j$ we use a self-loop transition.
- Otherwise, when $i\ne j$, it is possible to choose $(u,v)$ from ${A}_{j}\setminus {A}_{i}$, and $({u}^{\prime},{v}^{\prime})$ from $(C\setminus \left\{(u,v)\right\})\cap ({A}_{i}\setminus {A}_{j})=C\setminus {A}_{j}$; note that the set equality follows from the assumption that $(u,v)$ belongs to ${A}_{j}$. For the last property, note that if no such $({u}^{\prime},{v}^{\prime})$ exists then $C\subseteq {A}_{j}$, which is a contradiction because ${A}_{j}$ is a tree and C is a cycle. As mentioned above, the probability of this transition is at least $1/\left(EV\right)$. After this step the resulting tree is not necessarily ${A}_{j}$, but it is closer to that tree. More precisely, $({A}_{i}\cup \left\{(u,v)\right\})\setminus \left\{({u}^{\prime},{v}^{\prime})\right\}$ is not necessarily ${A}_{j}$, but the set ${A}_{j}\setminus (({A}_{i}\cup \left\{(u,v)\right\})\setminus \left\{({u}^{\prime},{v}^{\prime})\right\})$ is smaller than the original ${A}_{j}\setminus {A}_{i}$. Its size decreases by 1 because the edge $(u,v)$ belongs to the second set but not to the first. Therefore, this process can be iterated until the resulting set is empty, at which point the resulting tree coincides with ${A}_{j}$. The maximal size of ${A}_{j}\setminus {A}_{i}$ is $V-1$, because the size of ${A}_{j}$ is at most $V-1$. This value occurs when ${A}_{i}$ and ${A}_{j}$ do not share edges. Multiplying all the probabilities in the process of transforming ${A}_{i}$ into ${A}_{j}$ we obtain a total probability of at least $1/{\left(EV\right)}^{V-1}$.
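The iteration in this argument can be carried out mechanically. The following illustrative Python sketch (trees represented as sets of frozenset edges; names are ours) repeatedly picks $(u,v)\in {A}_{j}\setminus {A}_{i}$, locates an edge $({u}^{\prime},{v}^{\prime})\in C\setminus {A}_{j}$ on the induced cycle, and swaps; each swap shrinks ${A}_{j}\setminus {A}_{i}$ by one, so at most $V-1$ swaps are needed.

```python
from collections import deque

def tree_path(tree, u, v):
    """Edges of the unique u-v path in a tree given as a set of frozensets."""
    adj = {}
    for e in tree:
        a, b = tuple(e)
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    parent = {u: None}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, []):
            if y not in parent:
                parent[y] = x
                queue.append(y)
    path, x = [], v
    while parent[x] is not None:
        path.append(frozenset((parent[x], x)))
        x = parent[x]
    return path

def transform(Ai, Aj):
    """Transform spanning tree Ai into Aj, one valid edge swap at a time."""
    Ai, steps = set(Ai), 0
    while Aj - Ai:
        e = next(iter(Aj - Ai))                  # (u, v) in Aj \ Ai
        u, v = tuple(e)
        cycle_part = tree_path(Ai, u, v)         # C \ {(u, v)}
        out = next(d for d in cycle_part if d not in Aj)  # (u', v') in C \ Aj
        Ai.remove(out)
        Ai.add(e)
        steps += 1                               # |Aj \ Ai| dropped by 1
    return Ai, steps
```

The edge `out` always exists, exactly as argued above: otherwise the whole cycle would be contained in the tree ${A}_{j}$.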

#### 4.2. Coupling

#### 4.2.1. $d(x,y)=0$

#### 4.2.2. $d(x,y)=1$

**Lemma 1.**

- ${e}_{x}\in {E}_{x}$, ${e}_{y}\in {E}_{y}$, ${i}_{x}\in I$
- ${E}_{x}\cap {E}_{y}=\varnothing $, ${E}_{x}\cap I=\varnothing $, ${E}_{y}\cap I=\varnothing $
- ${E}_{x}\cup I={C}_{x}$, ${E}_{y}\cup I={C}_{y}$, ${E}_{x}\cup {E}_{y}={C}_{e}$.

- If the chain ${X}_{t}$ loops (${x}^{\prime}=x$), because ${i}_{x}\in {A}_{x}$, then ${Y}_{t}$ also loops and therefore ${y}^{\prime}=y$. The set ${U}_{y}$ does not change, i.e., set ${U}_{{y}^{\prime}}={U}_{y}$.
- If ${i}_{x}={e}_{y}$ and ${o}_{x}={e}_{x}$ then set ${i}_{y}={e}_{x}$ and ${o}_{y}={e}_{y}$. In this case the chains do not coalesce; they swap states because ${x}^{\prime}=y$ and ${y}^{\prime}=x$ (see Figure 5). Set ${U}_{{y}^{\prime}}={C}_{e}\setminus \left\{{e}_{{x}^{\prime}}\right\}$.
- If ${i}_{x}={e}_{y}$ and ${o}_{x}\ne {e}_{x}$ then set ${i}_{y}={e}_{x}$ and ${o}_{y}={o}_{x}$. In this case the chains coalesce, i.e., ${x}^{\prime}={y}^{\prime}$ (see Figure 6). When the chains coalesce, the edges ${e}_{{x}^{\prime}}$ and ${e}_{{y}^{\prime}}$ no longer exist and the set ${U}_{{y}^{\prime}}$ is no longer relevant.
- If ${i}_{x}\ne {e}_{y}$ set ${i}_{y}={i}_{x}$. We now have three sub-cases, which are further sub-divided. These cases depend on whether $|{C}_{x}|=|{C}_{y}|$, $|{C}_{x}|<|{C}_{y}|$ or $|{C}_{x}|>|{C}_{y}|$. We start with $|{C}_{x}|=|{C}_{y}|$ which is simpler and establishes the basic situations. When $|{C}_{x}|<|{C}_{y}|$ or $|{C}_{x}|>|{C}_{y}|$ we use some Bernoulli random variables to balance out probabilities and whenever possible reduce to the cases considered for $|{C}_{x}|=|{C}_{y}|$. When this is not possible we present the corresponding new situation.
- (a)
- If $|{C}_{x}|=|{C}_{y}|$ we have the following situations:
- (i)
- (ii)
- If ${o}_{x}\in I\setminus \left\{{i}_{x}\right\}$ then set ${o}_{y}={o}_{x}$. In this case the chains do not coalesce, in fact the exclusive edges remain unchanged, i.e., ${e}_{{x}^{\prime}}={e}_{x}$ and ${e}_{{y}^{\prime}}={e}_{y}$ (see Figure 8 and Figure 9). When ${o}_{x}\notin {C}_{e}$ the set ${C}_{{e}^{\prime}}$ remains equal to ${C}_{e}$ and likewise ${U}_{{y}^{\prime}}$ remains equal to ${U}_{y}$ (see Figure 8). Otherwise, when ${o}_{x}\in {C}_{e}$, the set ${C}_{{e}^{\prime}}$ is different from ${C}_{e}$, and we assign ${U}_{{y}^{\prime}}={U}_{y}\cap {C}_{{e}^{\prime}}$ (see Figure 9).
- (iii)
- If ${o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}$ then select ${s}_{y}$ uniformly from ${E}_{y}\setminus \left\{{e}_{y}\right\}$. If ${s}_{y}\in {U}_{y}$ then set ${o}_{y}={e}_{y}$ (see Figure 10). In this case set ${U}_{{y}^{\prime}}={E}_{x}\setminus \left\{{e}_{x}\right\}$. The alternative, when ${s}_{y}\notin {U}_{y}$, is considered in the next case (4.a.iv).
- (iv)
- If ${o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}$ and ${s}_{y}\notin {U}_{y}$, then set ${o}_{y}={s}_{y}$. This case is shown in Figure 11. In this case the distance of the coupled states increases, i.e., $d({x}^{\prime},{y}^{\prime})=2$. Therefore we include a new state ${z}^{\prime}$ in between ${x}^{\prime}$ and ${y}^{\prime}$ and define ${e}_{{z}^{\prime}}$ to be the edge in ${A}_{{z}^{\prime}}\setminus {A}_{{x}^{\prime}}$; ${e}_{{y}^{\prime}}$ the edge in ${A}_{{y}^{\prime}}\setminus {A}_{{z}^{\prime}}$; and ${e}_{{x}^{\prime}}$ the edge in ${A}_{{x}^{\prime}}\setminus {A}_{{z}^{\prime}}$. The set ${U}_{{z}^{\prime}}$ should contain the edges that provide alternatives to ${e}_{{z}^{\prime}}$. In this case set ${U}_{{z}^{\prime}}={E}_{x}\setminus \left\{{e}_{x}\right\}$ and ${U}_{{y}^{\prime}}=({U}_{y}\cap {E}_{y})\setminus \left\{{o}_{y}\right\}$.

- (b)
- If $|{C}_{x}|<|{C}_{y}|$ then ${X}_{t}$ would choose ${o}_{x}\in I$ with a higher probability than ${Y}_{t}$ should. Therefore, we use a Bernoulli random variable B with a success probability p defined as follows:$$p=\frac{{C}_{x}-1}{{C}_{y}-1}$$In Lemma 2 we prove that p properly balances the necessary probabilities. For now, note that when $|{C}_{x}|=|{C}_{y}|$ the expression for p yields $p=1$. This is coherent with the following cases, because when B yields `true` we use the choices defined for $|{C}_{x}|=|{C}_{y}|$. The following situations are possible:
- (i)
- If ${o}_{x}={e}_{x}$ then we reduce to the case 4.a.i, both when B yields `true` and when B fails with ${s}_{y}\in {U}_{y}$. Set ${o}_{y}={e}_{y}$ (see Figure 7). The new case occurs when B fails and ${s}_{y}\notin {U}_{y}$; in this situation set ${o}_{y}={s}_{y}$ and ${U}_{{y}^{\prime}}=({U}_{y}\cap {C}_{y})\setminus \left\{{o}_{y}\right\}$ (see Figure 12).
- (ii)
- If ${o}_{x}\in I\setminus \left\{{i}_{x}\right\}$ then we reduce to the case 4.a.ii when B yields `true`. Set ${o}_{y}={o}_{x}$ (see Figure 8 and Figure 9). When B fails and ${s}_{y}\in {U}_{y}$ we have a new situation. Set ${o}_{y}={e}_{y}$ and ${U}_{{y}^{\prime}}=I\setminus \left\{{i}_{x}\right\}$. The chains preserve their distance, i.e., $d({x}^{\prime},{y}^{\prime})=1$ (see Figure 13). The alternative, when ${s}_{y}\notin {U}_{y}$, is considered in the next case (4.b.iii).
- (iii)
- If ${o}_{x}\in I\setminus \left\{{i}_{x}\right\}$ and B fails and ${s}_{y}\notin {U}_{y}$, we have a new situation; set ${o}_{y}={s}_{y}$. The distance increases, $d({x}^{\prime},{y}^{\prime})=2$ (see Figure 14). Set ${U}_{{z}^{\prime}}=I\setminus \left\{{i}_{x}\right\}$ and ${U}_{{y}^{\prime}}=({U}_{y}\cap {E}_{y})\setminus \left\{{o}_{y}\right\}$.
- (iv)

- (c)
- If $|{C}_{x}|>|{C}_{y}|$ we have the following situations:
- (i)
- If ${o}_{x}={e}_{x}$ then use case 4.a.i and set ${o}_{y}={e}_{y}$ (see Figure 7). The chains coalesce.
- (ii)
- (iii)
- If ${o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}$ then we use a new Bernoulli random variable ${B}^{*}$ with a success probability ${p}^{*}$ defined as follows:$${p}^{*}=\left(\frac{1}{{C}_{y}-1}-\frac{1}{{C}_{x}-1}\right)\frac{({C}_{x}-1)(I-1)}{{E}_{x}-1}$$In Lemma 2 we prove that ${B}^{*}$ properly balances the necessary probabilities. For now, note that when $|{C}_{x}|=|{C}_{y}|$ the expression for ${p}^{*}$ yields ${p}^{*}=0$, because $1/({C}_{y}-1)-1/({C}_{x}-1)$ becomes 0. This is coherent because when ${B}^{*}$ returns `false` we use the choices defined for $|{C}_{x}|=|{C}_{y}|$. The case when ${B}^{*}$ fails is considered in the next case (4.c.iv). If ${B}^{*}$ is successful we have a new situation. Set ${o}_{y}={s}_{i}$, where ${s}_{i}$ is chosen uniformly from $I\setminus \left\{{i}_{y}\right\}$ (see Figure 15). We have ${e}_{{y}^{\prime}}={e}_{y}$, ${U}_{{y}^{\prime}}=({U}_{y}\cap {E}_{y})\setminus \left\{{o}_{y}\right\}$, ${e}_{{z}^{\prime}}={o}_{x}$ and ${U}_{{z}^{\prime}}={E}_{x}\setminus \left\{{e}_{x}\right\}$.
- (iv)
- If ${o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}$ and ${B}^{*}$ fails, we use another Bernoulli random variable ${B}^{\prime}$ with a success probability ${p}^{\prime}$ defined as follows:$${p}^{\prime}=1-\frac{({C}_{x}-1)({E}_{y}-1)}{({C}_{y}-1)({E}_{x}-1)(1-{p}^{*})}$$In Lemma 2 we prove that ${B}^{\prime}$ properly balances the necessary probabilities. In case ${B}^{\prime}$ yields `true`, use case 4.a.iii and set ${o}_{y}={e}_{y}$ (see Figure 10). Otherwise, if ${s}_{y}\in {U}_{y}$, use case 4.a.iii (Figure 10) or, if ${s}_{y}\notin {U}_{y}$, use case 4.a.iv (Figure 11).

**Lemma 2.**

**Proof.**

- $i\in {A}_{y}$: this occurs only in case 1, when ${i}_{x}\in {A}_{x}$. It may be that $i={e}_{y}$; this occurs when ${i}_{x}={e}_{x}$, in which case ${i}_{y}={e}_{y}=i$ and this is the only case where ${i}_{y}={e}_{y}$. In this case $Pr({i}_{y}=i)=Pr({i}_{x}={e}_{x})=1/E$. Otherwise, $i\in {A}_{y}\cap {A}_{x}$, in these cases ${i}_{y}={i}_{x}$, and therefore $Pr({i}_{y}=i)=Pr({i}_{x}=i)=1/E$.
- $i={e}_{x}$: this occurs in cases 2 and 3, i.e., when ${i}_{x}={e}_{y}$, which is the decisive condition for this choice. Therefore $Pr({i}_{y}=i)=Pr({i}_{x}={e}_{y})=1/E$.
- $i\in E\setminus {A}_{y}$: this occurs in case 4. In this case ${i}_{y}={i}_{x}$, so again we have that $Pr({i}_{y}=i)=Pr({i}_{x}=i)=1/E$.

- Analysis of B. We need to have ${C}_{y}-1\ne 0$ for p to be well defined. Any cycle must contain at least three edges; therefore $3\le {C}_{y}$ and hence $0<2\le {C}_{y}-1$. This guarantees that the denominator is not 0. The same argument proves that $0<{C}_{x}-1$, thus implying that $0<p$, as both expressions are positive. We also establish that $p<1$ because of the hypothesis of case 4.b which guarantees ${C}_{x}<{C}_{y}$ and therefore ${C}_{x}-1<{C}_{y}-1$.
- Analysis of ${B}^{*}$. As seen in the analysis of B, we have that $0<{C}_{y}-1$ and $0<{C}_{x}-1$, therefore those denominators are not 0. Moreover, we also need to prove that ${E}_{x}-1\ne 0$. In general we have that $1\le {E}_{y}$, because ${e}_{y}\in {E}_{y}$. Moreover, the hypothesis of case 4.c.iii is that ${C}_{y}<{C}_{x}$ and therefore ${E}_{y}<{E}_{x}$, which is obtained by removing I from both sides. This implies that $1<{E}_{x}$ and therefore $0<{E}_{x}-1$, thus establishing that the last denominator is also not 0. Let us now establish that $0\le {p}^{*}$ and ${p}^{*}<1$. Note that ${p}^{*}$ can be simplified to the expression $({C}_{x}-{C}_{y})(I-1)/\left(({C}_{y}-1)({E}_{x}-1)\right)$, where all the expressions in parenthesis are non-negative, so $0\le {p}^{*}$. For the second property we use the new expression for ${p}^{*}$ and simplify ${p}^{*}<1$ to $({E}_{x}-{E}_{y})(I-1)<({E}_{x}-1)({C}_{y}-1)$; the deduction is straightforward using the equality ${C}_{x}-{C}_{y}={E}_{x}-{E}_{y}$, obtained by removing I from the left side. The properties ${E}_{x}-{E}_{y}\le {E}_{x}-1$ and $I-1<{C}_{y}-1$ establish the desired result.
- Analysis of ${B}^{\prime}$. We established, in the analysis of B, that ${C}_{y}-1$ is non-zero. In the analysis of ${B}^{*}$ we also established that ${E}_{x}-1$ is non-zero. Note that case 4.c.iv also assumes the hypothesis that ${C}_{y}<{C}_{x}$. Moreover, in the analysis of ${B}^{*}$ we also established that ${p}^{*}<1$, which implies that $0<1-{p}^{*}$ and therefore the last denominator is also non-zero. Let us also establish that $0\le {p}^{\prime}$ and ${p}^{\prime}\le 1$. For the second property we instead prove that $0\le 1-{p}^{\prime}$, where $1-{p}^{\prime}=({C}_{x}-1)({E}_{y}-1)/\left(({C}_{y}-1)({E}_{x}-1)(1-{p}^{*})\right)$ and all of the expressions in parenthesis are non-negative. We use the following deduction of equivalent inequalities to establish that $0\le {p}^{\prime}$:$$\begin{array}{rl}0& \le {p}^{\prime}\\ -{p}^{\prime}& \le 0\\ 1-{p}^{\prime}& \le 1\\ ({C}_{x}-1)({E}_{y}-1)& \le ({C}_{y}-1)({E}_{x}-1)(1-{p}^{*})\\ ({C}_{x}-1)({E}_{y}-1)& \le ({C}_{y}-1)({E}_{x}-1)\left(1-\frac{({C}_{x}-{C}_{y})(I-1)}{({C}_{y}-1)({E}_{x}-1)}\right)\end{array}$$This last inequality is part of the hypothesis of case 4.c.iv.
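These bounds lend themselves to a numerical sanity check. The snippet below (illustrative, not from the paper) draws random admissible sizes with $C=E+I$ on both sides and ${C}_{y}<{C}_{x}$, then verifies with exact rational arithmetic that $0<p<1$ (with the roles of x and y exchanged, as in case 4.b), $0\le {p}^{*}<1$, and ${p}^{\prime}\le 1$; the bound $0\le {p}^{\prime}$ is omitted because it additionally requires the hypothesis of case 4.c.iv.

```python
from fractions import Fraction
import random

rng = random.Random(1)
for _ in range(1000):
    # Admissible sizes: ix in I, ey in Ey, Ey < Ex, and C = E + I.
    I = rng.randint(1, 8)
    Ey = rng.randint(1, 8)
    Ex = rng.randint(Ey + 1, Ey + 8)
    Cx, Cy = Ex + I, Ey + I
    if Cy < 3:
        continue                     # any cycle has at least three edges
    # p of case 4.b, with the roles of x and y exchanged (here Cy < Cx).
    p = Fraction(Cy - 1, Cx - 1)
    assert 0 < p < 1
    p_star = (Fraction(1, Cy - 1) - Fraction(1, Cx - 1)) \
        * Fraction((Cx - 1) * (I - 1), Ex - 1)
    assert 0 <= p_star < 1
    p_prime = 1 - Fraction((Cx - 1) * (Ey - 1),
                           (Cy - 1) * (Ex - 1)) / (1 - p_star)
    assert p_prime <= 1              # 0 <= p' needs the case 4.c.iv hypothesis
```

Exact `Fraction` arithmetic avoids any floating-point rounding in the comparisons.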

- When the cycles are equal ${C}_{x}={C}_{y}$. This involves cases 2 and 3.
- $o={e}_{y}$: this occurs only in case 2 and it is determined by the fact that ${o}_{x}={e}_{x}$. Therefore, $Pr({o}_{y}=o)=Pr({o}_{x}={e}_{x})=1/({C}_{x}-1)=1/({C}_{y}-1)$.
- $o\ne {e}_{y}$: this occurs only in case 3 and it is determined by the fact that ${o}_{x}\ne {e}_{x}$, in this case ${o}_{y}={o}_{x}$. Therefore $Pr({o}_{y}=o)=Pr({o}_{x}=o)=1/({C}_{x}-1)=1/({C}_{y}-1)$.

- When the cycles have the same size $|{C}_{x}|=|{C}_{y}|$ (case 4.a). The possibilities for o are as follows:
- $o={e}_{y}$: this occurs only in the case 4.a.i. This case is determined by the fact that ${o}_{x}={e}_{x}$. Therefore, $Pr({o}_{y}=o)=Pr({o}_{x}={e}_{x})=1/({C}_{x}-1)=1/({C}_{y}-1)$. Note that according to the Lemma’s hypothesis, case 4.a.iii never occurs.
- $o\in I\setminus \left\{{i}_{y}\right\}$: this occurs only in case 4.a.ii. This case is determined by the fact that ${o}_{x}\in I\setminus \left\{{i}_{x}\right\}$ and sets ${o}_{y}={o}_{x}=o$. Therefore, $Pr({o}_{y}=o)=Pr({o}_{x}=o)=1/({C}_{x}-1)=1/({C}_{y}-1)$.
- $o\in {E}_{y}\setminus \left\{{e}_{y}\right\}$: this occurs only in case 4.a.iv. This case is determined by the fact that ${o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}$ and moreover sets ${o}_{y}={s}_{y}$, which was uniformly selected from ${E}_{y}\setminus \left\{{e}_{y}\right\}$. We have the following deduction, where we use the fact that the events are independent and that $|{C}_{x}|=|{C}_{y}|$ implies $|{E}_{x}|=|{E}_{y}|$:$$\begin{array}{rl}Pr({o}_{y}=o)& =Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}\ \mathrm{and}\ {s}_{y}=o)\\ & =Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\})Pr({s}_{y}=o)\\ & =\frac{{E}_{x}-1}{{C}_{x}-1}\times \frac{1}{{E}_{y}-1}\\ & =1/({C}_{x}-1)\\ & =1/({C}_{y}-1)\end{array}$$
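Assuming, per the Lemma's hypothesis, that case 4.a.iii never occurs, the three sub-cases can be combined into an exact computation. The following illustrative snippet verifies with rational arithmetic that every candidate edge receives probability $1/({C}_{y}-1)$, and that the total mass is 1.

```python
from fractions import Fraction

# Illustrative sizes with |Cx| = |Cy|: shared part I, exclusive parts Ex = Ey.
I, Ex = 4, 5
Ey = Ex
C = Ex + I                      # |Cx| = |Cy| = C

pr_ey = Fraction(1, C - 1)                                    # case 4.a.i
pr_shared = Fraction(1, C - 1)                                # case 4.a.ii, each edge of I \ {ix}
pr_exclusive = Fraction(Ex - 1, C - 1) * Fraction(1, Ey - 1)  # case 4.a.iv, each edge of Ey \ {ey}

assert pr_ey == pr_shared == pr_exclusive == Fraction(1, C - 1)
# Total mass over the C - 1 candidate edges is 1.
assert pr_ey + (I - 1) * pr_shared + (Ey - 1) * pr_exclusive == 1
```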

- When ${C}_{x}<{C}_{y}$; this involves case 4.b. The cases for o are as follows:
- $o={e}_{y}$: this occurs only in the case 4.b.i and when B is `true`. This case occurs when ${o}_{x}={e}_{x}$. We make the following deduction, which uses the fact that the events are independent and the success probability of B:$$\begin{array}{rl}Pr({o}_{y}=o)& =Pr({o}_{x}={e}_{x}\ \mathrm{and}\ B=\mathtt{true})\\ & =Pr({o}_{x}={e}_{x})Pr(B=\mathtt{true})\\ & =\frac{1}{{C}_{x}-1}\times \frac{{C}_{x}-1}{{C}_{y}-1}\\ & =1/({C}_{y}-1)\end{array}$$
- $o\in I\setminus \left\{{i}_{y}\right\}$: this occurs only in case 4.b.ii and when B is `true`. This case is determined by the fact that ${o}_{x}\in I\setminus \left\{{i}_{x}\right\}$ and sets ${o}_{y}={o}_{x}=o$. We make the following deduction, which uses the fact that the events are independent and the success probability of B:$$\begin{array}{rl}Pr({o}_{y}=o)& =Pr({o}_{x}=o\ \mathrm{and}\ B=\mathtt{true})\\ & =Pr({o}_{x}=o)Pr(B=\mathtt{true})\\ & =\frac{1}{{C}_{x}-1}\times \frac{{C}_{x}-1}{{C}_{y}-1}\\ & =1/({C}_{y}-1)\end{array}$$
- $o\in {E}_{y}\setminus \left\{{e}_{y}\right\}$: this occurs in case 4.b.iv, but also in cases 4.b.iii and 4.b.i when B is `false`. We have the following deduction, which uses event independence, the fact that the cases are disjoint events, and the success probability of B:$$\begin{array}{rl}Pr({o}_{y}=o)& =Pr\left(4.b.\mathrm{iv}\ \mathrm{or}\ (4.b.\mathrm{iii}\ \mathrm{and}\ B=\mathtt{false})\ \mathrm{or}\ (4.b.\mathrm{i}\ \mathrm{and}\ B=\mathtt{false})\right)\\ & =Pr(4.b.\mathrm{iv})+Pr(4.b.\mathrm{iii}\ \mathrm{and}\ B=\mathtt{false})+Pr(4.b.\mathrm{i}\ \mathrm{and}\ B=\mathtt{false})\\ & =Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\})Pr({s}_{y}=o)+Pr({o}_{x}\in I\cup \left\{{e}_{x}\right\})Pr(B=\mathtt{false})Pr({s}_{y}=o)\\ & =[Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\})+Pr({o}_{x}\in I\cup \left\{{e}_{x}\right\})(1-Pr(B=\mathtt{true}))]Pr({s}_{y}=o)\\ & =[1-Pr({o}_{x}\in I\cup \left\{{e}_{x}\right\})Pr(B=\mathtt{true})]Pr({s}_{y}=o)\\ & =\left[1-\frac{I-1+1}{{C}_{x}-1}\times \frac{{C}_{x}-1}{{C}_{y}-1}\right]Pr({s}_{y}=o)\\ & =\frac{{C}_{y}-1-I}{{C}_{y}-1}\times \frac{1}{{E}_{y}-1}\\ & =\frac{1}{{C}_{y}-1}\end{array}$$
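The final simplification uses ${C}_{y}-1-I={E}_{y}-1$, which follows from ${C}_{y}={E}_{y}\cup I$. An illustrative exact check of all three marginals for case 4.b:

```python
from fractions import Fraction

# Illustrative sizes with |Cx| < |Cy| (case 4.b); C = E + I on both sides.
I, Ex, Ey = 3, 2, 6
Cx, Cy = Ex + I, Ey + I
p = Fraction(Cx - 1, Cy - 1)          # success probability of B

pr_ey = Fraction(1, Cx - 1) * p                       # 4.b.i with B true
pr_shared = Fraction(1, Cx - 1) * p                   # 4.b.ii with B true, each edge
pr_exclusive = (1 - Fraction(I, Cx - 1) * p) \
    * Fraction(1, Ey - 1)                             # B false paths plus 4.b.iv

assert pr_ey == pr_shared == pr_exclusive == Fraction(1, Cy - 1)
```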

- When ${C}_{x}>{C}_{y}$, this concerns case 4.c. The cases for o are the following:
- $o={e}_{y}$: this occurs in the case 4.c.i and in case 4.c.iv when ${B}^{\prime}$ is `true`. We use the following deduction:$$\begin{array}{rl}Pr({o}_{y}=o)& =Pr(4.c.\mathrm{i}\ \mathrm{or}\ (4.c.\mathrm{iv}\ \mathrm{and}\ {B}^{\prime}=\mathtt{true}))\\ & =Pr(4.c.\mathrm{i})+Pr(4.c.\mathrm{iv}\ \mathrm{and}\ {B}^{\prime}=\mathtt{true})\\ & =Pr({o}_{x}={e}_{x})+Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}\ \mathrm{and}\ {B}^{*}=\mathtt{false}\ \mathrm{and}\ {B}^{\prime}=\mathtt{true})\\ & =\frac{1}{{C}_{x}-1}+\frac{{E}_{x}-1}{{C}_{x}-1}(1-{p}^{*})\left(1-\frac{({C}_{x}-1)({E}_{y}-1)}{({C}_{y}-1)({E}_{x}-1)(1-{p}^{*})}\right)\\ & =\frac{1}{{C}_{x}-1}+\frac{{E}_{x}-1}{{C}_{x}-1}-\frac{{E}_{x}-1}{{C}_{x}-1}{p}^{*}-\frac{{E}_{y}-1}{{C}_{y}-1}\\ & =\frac{1}{{C}_{x}-1}+\frac{{E}_{x}-1}{{C}_{x}-1}-\left[\frac{1}{{C}_{y}-1}-\frac{1}{{C}_{x}-1}\right](I-1)-\frac{{E}_{y}-1}{{C}_{y}-1}\\ & =\frac{1}{{C}_{y}-1}\end{array}$$The last step uses the equalities ${C}_{x}={E}_{x}+I$ and ${C}_{y}={E}_{y}+I$.
- $o\in I\setminus \left\{{i}_{y}\right\}$: this occurs in case 4.c.ii and in case 4.c.iii when ${B}^{*}$ is `true`. We make the following deduction:$$\begin{array}{rl}Pr({o}_{y}=o)& =Pr(4.c.\mathrm{ii}\ \mathrm{or}\ 4.c.\mathrm{iii})\\ & =Pr(4.c.\mathrm{ii})+Pr(4.c.\mathrm{iii})\\ & =Pr({o}_{x}=o)+Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}\ \mathrm{and}\ {B}^{*}=\mathtt{true}\ \mathrm{and}\ {s}_{i}=o)\\ & =Pr({o}_{x}=o)+Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\})Pr({B}^{*}=\mathtt{true})Pr({s}_{i}=o)\\ & =\frac{1}{{C}_{x}-1}+\frac{{E}_{x}-1}{{C}_{x}-1}\times \left(\frac{1}{{C}_{y}-1}-\frac{1}{{C}_{x}-1}\right)\frac{({C}_{x}-1)(I-1)}{{E}_{x}-1}\times \frac{1}{I-1}\\ & =\frac{1}{{C}_{x}-1}+\frac{1}{{C}_{y}-1}-\frac{1}{{C}_{x}-1}\\ & =\frac{1}{{C}_{y}-1}\end{array}$$
- $o\in {E}_{y}\setminus \left\{{e}_{y}\right\}$: this occurs in case 4.c.iv when ${B}^{\prime}$ is `false`. We have the following deduction:$$\begin{array}{rl}Pr({o}_{y}=o)& =Pr(4.c.\mathrm{iv}\ \mathrm{and}\ {B}^{\prime}=\mathtt{false})\\ & =Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\}\ \mathrm{and}\ {B}^{*}=\mathtt{false}\ \mathrm{and}\ {B}^{\prime}=\mathtt{false}\ \mathrm{and}\ {s}_{y}=o)\\ & =Pr({o}_{x}\in {E}_{x}\setminus \left\{{e}_{x}\right\})Pr({B}^{*}=\mathtt{false})Pr({B}^{\prime}=\mathtt{false})Pr({s}_{y}=o)\\ & =\frac{{E}_{x}-1}{{C}_{x}-1}(1-{p}^{*})\frac{({C}_{x}-1)({E}_{y}-1)}{({C}_{y}-1)({E}_{x}-1)(1-{p}^{*})}\times \frac{1}{{E}_{y}-1}\\ & =\frac{1}{{C}_{y}-1}\end{array}$$
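As in the previous cases, the three marginals can be checked exactly. The following illustrative snippet plugs in admissible sizes with $|{C}_{x}|>|{C}_{y}|$ and confirms that every outcome has probability $1/({C}_{y}-1)$:

```python
from fractions import Fraction

# Illustrative sizes with |Cx| > |Cy| (case 4.c); C = E + I on both sides.
I, Ex, Ey = 4, 6, 2
Cx, Cy = Ex + I, Ey + I
p_star = (Fraction(1, Cy - 1) - Fraction(1, Cx - 1)) \
    * Fraction((Cx - 1) * (I - 1), Ex - 1)
p_prime = 1 - Fraction((Cx - 1) * (Ey - 1), (Cy - 1) * (Ex - 1)) / (1 - p_star)

# o = ey: case 4.c.i plus 4.c.iv with B' true.
pr_ey = Fraction(1, Cx - 1) + Fraction(Ex - 1, Cx - 1) * (1 - p_star) * p_prime
# o in I \ {iy}: case 4.c.ii plus 4.c.iii with B* true, each edge.
pr_shared = Fraction(1, Cx - 1) \
    + Fraction(Ex - 1, Cx - 1) * p_star * Fraction(1, I - 1)
# o in Ey \ {ey}: case 4.c.iv with B' false, each edge.
pr_exclusive = Fraction(Ex - 1, Cx - 1) * (1 - p_star) \
    * (1 - p_prime) * Fraction(1, Ey - 1)

assert pr_ey == pr_shared == pr_exclusive == Fraction(1, Cy - 1)
```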

**Lemma 3.**

**Proof.**

**Theorem 2.**

**Proof.**

**Theorem 3.**

**Proof.**

#### 4.3. Experimental Results

#### 4.3.1. Convergence Testing

- The bottom left plot shows the graph properties: the number of vertices V on the x axis and the number of edges E on the y axis. For the dense case, graph 0 has 10 vertices and 45 edges, while graph 6 has 40 vertices and 780 edges. These graph indexes are used in the remaining plots.
- The top left plot shows the number of iterations t of the chain on the x axis and the estimated variation distance on the y axis, for all the different graphs.
- The top right plot is similar to the top left, but the x axis contains the number of iterations divided by $({V}^{1.3}+E)$. Besides the data, this plot also shows $\ln(1/\widehat{\epsilon})$ for reference.
- The bottom right plot is the same as the top right plot, using a logarithmic scale on the y axis.
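The estimated variation distance shown on the y axis can be computed along the following lines. This is an illustrative sketch of the standard empirical estimator; the paper's exact estimation procedure for $\widehat{\epsilon}$ may differ in detail, and the total number of spanning trees must be known (for small graphs it can be obtained from Kirchhoff's matrix-tree theorem).

```python
from collections import Counter

def variation_distance(samples, num_trees):
    """Empirical total variation distance between the distribution of the
    sampled trees and the uniform distribution over num_trees trees."""
    counts = Counter(samples)
    n = len(samples)
    seen = sum(abs(c / n - 1 / num_trees) for c in counts.values())
    unseen = (num_trees - len(counts)) / num_trees  # mass of never-sampled trees
    return (seen + unseen) / 2
```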

**dmP**, namely in Figure 22, Figure 23 and Figure 24, correspond to the choices of p.

The **cycle** graphs consist of a single cycle, as shown in Figure 16. The **sparse** graphs are ladder graphs; an illustration of these graphs is shown in Figure 25. The **dense** graphs are the complete graphs ${K}_{V}$. We also generated other dense graphs, labeled **biK**, which consist of two complete graphs connected by two edges. Graphs were also generated based on the duplication model **dmP**. Let ${G}_{0}=({V}_{0},{E}_{0})$ be an undirected and unweighted graph. Given $0\le p\le 1$, the partial duplication model builds a graph $G=(V,E)$ by partial duplication as follows [12]: start with $G={G}_{0}$ at time $t=1$ and, at time $t>1$, perform a duplication step:

- Uniformly select a random vertex u of G.
- Add a new vertex v and an edge $(u,v)$.
- For each neighbor w of u, add an edge $(v,w)$ with probability p.
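The duplication step above translates directly into code. A minimal illustrative implementation (adjacency sets, integer vertex labels; names are ours):

```python
import random

def partial_duplication(g0, steps, p, rng):
    """Grow a graph by the partial duplication model: repeatedly duplicate a
    uniformly random vertex, keeping each copied edge with probability p.
    Graphs are dicts mapping a vertex to the set of its neighbours."""
    g = {u: set(nbrs) for u, nbrs in g0.items()}
    for _ in range(steps):
        u = rng.choice(sorted(g))            # uniformly random existing vertex
        v = max(g) + 1                       # fresh label for the new vertex
        g[v] = set()
        g[u].add(v); g[v].add(u)             # the mandatory edge (u, v)
        for w in list(g[u] - {v}):           # copy each edge of u with prob. p
            if rng.random() < p:
                g[v].add(w); g[w].add(v)
    return g
```

With $p=1$ every duplication copies all edges, so one step applied to the triangle ${K}_{3}$ yields ${K}_{4}$; with $p=0$ the new vertex keeps only the mandatory edge.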

#### 4.3.2. Coupling Simulation

**dmP** and **torus** graphs, and it is faster for **biK**, **cycle** and **sparse** (ladder) graphs. As expected, it is less competitive for **dense** graphs. Hence, the experimental results seem to indicate that the edge-swapping method is more competitive in practice for those instances that are harder for random walk-based methods, namely **biK** and **cycle** graphs. The results for **biK** and **dmP** are of particular interest, as most real networks seem to include these kinds of topologies, i.e., they include communities and they are scale-free [13].

## 5. Related Work

`Select` an edge from the path. By starting at the root and comparing the sub-tree sizes to i, we can determine whether the first vertex of the desired edge is in the left sub-tree, at the root, or in the right sub-tree. Likewise, we can do the same for the second vertex of the edge in question. These operations splay the vertices that they obtain and therefore the total time depends on the `Splay` operation. The precise total time of the `Splay` operation is $O((V+1)\log V)$; however, the $V\log V$ term does not accumulate over successive operations, thus yielding the bound of $O((V+\tau)\log V)$ in Theorem 1. In general the $V\log V$ term should not be a bottleneck, because for most graphs we should have $\tau >V$. This is not always the case; if G consists of a single cycle then $\tau =1$, but V may be large. Figure 16 shows an example of such a graph.

**Definition 1.**

**Definition 2.**

**Lemma 4.**

## 6. Conclusions and Future Work

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

- Aigner, M.; Ziegler, G.M.; Quarteroni, A. Proofs from the Book; Springer: Berlin/Heidelberg, Germany, 2010; Volume 274. [Google Scholar]
- Borchardt, C.W. Über Eine Interpolationsformel für Eine Art Symmetrischer Functionen und über Deren Anwendung; Math. Abh. der Akademie der Wissenschaften zu Berlin: Berlin, Germany, 1860; pp. 1–20. [Google Scholar]
- Cayley, A. A theorem on trees. Q. J. Math.
**1889**, 23, 376–378. [Google Scholar] - Galler, B.A.; Fischer, M.J. An improved equivalence algorithm. Commun. ACM
**1964**, 7, 301–303. [Google Scholar] [CrossRef] - Sleator, D.D.; Tarjan, R.E. Self-adjusting binary search trees. J. ACM
**1985**, 32, 652–686. [Google Scholar] [CrossRef] - Mitzenmacher, M.; Upfal, E. Probability and Computing: Randomized Algorithms and Probabilistic Analysis; Cambridge University Press: New York, NY, USA, 2005. [Google Scholar]
- Levin, D.A.; Peres, Y. Markov Chains and Mixing Times; American Mathematical Society: Providence, RI, USA, 2017; Volume 107. [Google Scholar]
- Sinclair, A. Improved bounds for mixing rates of Markov chains and multicommodity flow. Comb. Probab. Comput.
**1992**, 1, 351–370. [Google Scholar] [CrossRef] - Bubley, R.; Dyer, M. Path coupling: A technique for proving rapid mixing in Markov chains. In Proceedings of the 38th Annual Symposium on Foundations of Computer Science, Miami Beach, FL, USA, 20–22 October 1997; pp. 223–231. [Google Scholar]
- Kumar, V.S.A.; Ramesh, H. Markovian coupling vs. conductance for the Jerrum-Sinclair chain. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science, New York, NY, USA, 17–19 October 1999; pp. 241–251. [Google Scholar]
- Jerrum, M.; Sinclair, A. Approximating the permanent. SIAM J. Comput.
**1989**, 18, 1149–1178. [Google Scholar] [CrossRef] - Chung, F.R.K.; Lu, L.; Dewey, T.G.; Galas, D.J. Duplication models for biological networks. J. Comput. Biol.
**2003**, 10, 677–687. [Google Scholar] [CrossRef] - Chung, F.R.; Lu, L. Complex Graphs and Networks; American Mathematical Society: Providence, RI, USA, 2006; No. 107. [Google Scholar]
- Lyons, R.; Peres, Y. Probability on Trees and Networks; Cambridge University Press: Cambridge, UK, 2016; Volume 42. [Google Scholar]
- Aldous, D.J. The random walk construction of uniform spanning trees and uniform labelled trees. SIAM J. Discret. Math.
**1990**, 3, 450–465. [Google Scholar] [CrossRef] - Broder, A. Generating random spanning trees. In Proceedings of the IEEE Symposium on Foundations of Computer Science, Research Triangle Park, NC, USA, 30 October–1 November 1989; pp. 442–447. [Google Scholar]
- Aldous, D. A random tree model associated with random graphs. Random Struct. Algorithms **1990**, 1, 383–402.
- Wilson, D.B. Generating random spanning trees more quickly than the cover time. In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing (STOC ’96), Philadelphia, PA, USA, 22–24 May 1996; ACM: New York, NY, USA, 1996; pp. 296–303.
- Kirchhoff, G. Ueber die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird. Ann. Phys. **1847**, 148, 497–508.
- Guénoche, A. Random spanning tree. J. Algorithms **1983**, 4, 214–220.
- Kulkarni, V. Generating random combinatorial objects. J. Algorithms **1990**, 11, 185–207.
- Colbourn, C.J.; Day, R.P.J.; Nel, L.D. Unranking and ranking spanning trees of a graph. J. Algorithms **1989**, 10, 271–286.
- Colbourn, C.J.; Myrvold, W.J.; Neufeld, E. Two algorithms for unranking arborescences. J. Algorithms **1996**, 20, 268–281.
- Kelner, J.A.; Mądry, A. Faster generation of random spanning trees. In Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science, Atlanta, GA, USA, 24–27 October 2009; pp. 13–21.
- Mądry, A. From Graphs to Matrices, and Back: New Techniques for Graph Algorithms. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2011.
- Mądry, A.; Straszak, D.; Tarnawski, J. Fast generation of random spanning trees and the effective resistance metric. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2015), San Diego, CA, USA, 4–6 January 2015; Indyk, P., Ed.; pp. 2019–2036.
- Feder, T.; Mihail, M. Balanced matroids. In Proceedings of the Twenty-Fourth Annual ACM Symposium on Theory of Computing, Victoria, BC, Canada, 4–6 May 1992; ACM: New York, NY, USA, 1992; pp. 26–38.
- Jerrum, M.; Son, J.-B.; Tetali, P.; Vigoda, E. Elementary bounds on Poincaré and log-Sobolev constants for decomposable Markov chains. Ann. Appl. Probab. **2004**, 14, 1741–1765.
- Mihail, M. Conductance and convergence of Markov chains: A combinatorial treatment of expanders. In Proceedings of the 30th Annual Symposium on Foundations of Computer Science, Research Triangle Park, NC, USA, 30 October–1 November 1989; pp. 526–531.
- Jerrum, M.; Son, J.-B. Spectral gap and log-Sobolev constant for balanced matroids. In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, Vancouver, BC, Canada, 19 November 2002; pp. 721–729.
- Sleator, D.D.; Tarjan, R.E. A data structure for dynamic trees. In Proceedings of the Thirteenth Annual ACM Symposium on Theory of Computing (STOC ’81), Milwaukee, WI, USA, 11–13 May 1981; ACM: New York, NY, USA, 1981; pp. 114–122.
- Goldberg, A.V.; Tarjan, R.E. Finding minimum-cost circulations by canceling negative cycles. J. ACM **1989**, 36, 873–886.
- Henzinger, M.R.; King, V. Randomized dynamic graph algorithms with polylogarithmic time per operation. In Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing, Las Vegas, NV, USA, 29 May–1 June 1995; ACM: New York, NY, USA, 1995; pp. 519–527.

**Figure 3.** Edge-swap procedure. Inserting the edge $(u,v)$ into the initial tree A generates a cycle C. The edge $({u}^{\prime},{v}^{\prime})$ is removed from C.
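The edge swap in Figure 3 can be illustrated without the link-cut tree machinery. The sketch below is a minimal, assumption-laden version: trees are plain adjacency sets, the `find_path` helper is hypothetical (not part of the paper's method), and the path search costs $O(V)$ per swap rather than the $O(\log V)$ amortized bound achieved with LCTs.

```python
import random

def find_path(tree, u, v):
    """Return the list of vertices on the tree path from u to v (DFS parent trace)."""
    parent = {u: None}
    stack = [u]
    while stack:
        x = stack.pop()
        if x == v:
            break
        for y in tree[x]:
            if y not in parent:
                parent[y] = x
                stack.append(y)
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

def edge_swap(tree, u, v):
    """Insert the non-tree edge (u, v), forming cycle C, then remove an
    edge chosen uniformly from C. Choosing (u, v) itself leaves the tree
    unchanged; tree maps each vertex to a set of neighbours."""
    path = find_path(tree, u, v)                   # D = C \ {(u, v)}
    cycle = list(zip(path, path[1:])) + [(v, u)]   # the full cycle C
    a, b = random.choice(cycle)                    # uniform edge of C
    if (a, b) == (v, u):
        return tree                                # kept the tree as-is
    tree[u].add(v); tree[v].add(u)                 # link (u, v)
    tree[a].discard(b); tree[b].discard(a)         # cut (u', v') = (a, b)
    return tree
```

Whatever edge of C is removed, the result is again a spanning tree, which is the invariant the randomizing walk relies on.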

**Figure 18.** Estimation of variation distance as a function of the number of iterations for **sparse** graphs (see Section 4.3.1 for details).

**Figure 19.** Estimation of variation distance as a function of the number of iterations for **cycle** graphs (see Section 4.3.1 for details).

**Figure 20.** Estimation of variation distance as a function of the number of iterations for **dense** graphs (see Section 4.3.1 for details).

**Figure 21.** Estimation of variation distance as a function of the number of iterations for **biK** graphs (see Section 4.3.1 for details).

**Figure 22.** Estimation of variation distance as a function of the number of iterations for **dmP** graphs (see Section 4.3.1 for details).

**Figure 23.** Estimation of variation distance as a function of the number of iterations for **dmP** graphs (see Section 4.3.1 for details).

**Figure 24.** Estimation of variation distance as a function of the number of iterations for **dmP** graphs (see Section 4.3.1 for details).

**Figure 26.** Running times for **dense** (fully connected) graphs averaged over five runs, including the running time for computing the optimistic coupling estimate, the running time for generating a spanning tree based on that estimate, the edge-swapping algorithm, the running time for generating a spanning tree through a random walk, and the running time for Wilson’s algorithm.

**Figure 27.** Running times for **biK** graphs averaged over five runs, including the running time for computing the optimistic coupling estimate, the running time for generating a spanning tree based on that estimate, the edge-swapping algorithm, the running time for generating a spanning tree through a random walk, and the running time for Wilson’s algorithm.

**Figure 28.** Running times for **cycle** graphs averaged over five runs, including the running time for computing the optimistic coupling estimate, the running time for generating a spanning tree based on that estimate, the edge-swapping algorithm, the running time for generating a spanning tree through a random walk, and the running time for Wilson’s algorithm.

**Figure 29.** Running times for **sparse** (ladder) graphs averaged over five runs, including the running time for computing the optimistic coupling estimate, the running time for generating a spanning tree based on that estimate, the edge-swapping algorithm, the running time for generating a spanning tree through a random walk, and the running time for Wilson’s algorithm.

**Figure 30.** Running times for square **torus** graphs averaged over five runs, including the running time for computing the optimistic coupling estimate, the running time for generating a spanning tree based on that estimate, the edge-swapping algorithm, the running time for generating a spanning tree through a random walk, and the running time for Wilson’s algorithm.

**Figure 31.** Running times for rectangular **torus** graphs averaged over five runs, including the running time for computing the optimistic coupling estimate, the running time for generating a spanning tree based on that estimate, the edge-swapping algorithm, the running time for generating a spanning tree through a random walk, and the running time for Wilson’s algorithm.

**Figure 32.** Running times for **dmP** graphs averaged over five runs, including the running time for computing the optimistic coupling estimate, the running time for generating a spanning tree based on that estimate, the edge-swapping algorithm, the running time for generating a spanning tree through a random walk, and the running time for Wilson’s algorithm.

**Table 1.** Variation distance (VD) for different graph topologies. Median and maximum VD computed over five runs for each network. Since **dmP** graphs are random, results for **dmP** were further computed over five different graphs for each size $\left|V\right|$.

| Graph | $\left|\mathit{V}\right|$ | Median VD | Max VD |
|---|---|---|---|
| dense | $\{5,7\}$ | 0.060 | 0.194 |
| biK | $\{8,10\}$ | 0.065 | 0.190 |
| cycle | $\{16,20,24\}$ | 0.001 | 0.004 |
| sparse | $\{10,14,20\}$ | 0.053 | 0.110 |
| torus | $\{9,12\}$ | 0.094 | 0.383 |
| dmP | $\{8,10,12\}$ | 0.069 | 0.270 |
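The VD figures in Table 1 compare an empirical distribution over sampled trees against the uniform distribution on all spanning trees. A minimal sketch of that estimate, under the assumption that the graph is small enough to know the total tree count (e.g. via Kirchhoff's matrix-tree theorem) and that each sampled tree is encoded as a hashable object such as a frozenset of edges; the name `variation_distance` is illustrative, not the paper's code:

```python
from collections import Counter

def variation_distance(samples, num_trees):
    """Total variation distance between the empirical distribution of
    `samples` (hashable tree encodings) and the uniform distribution
    over all `num_trees` spanning trees."""
    counts = Counter(samples)
    n = len(samples)
    # Trees that were sampled at least once:
    seen = sum(abs(c / n - 1 / num_trees) for c in counts.values())
    # Trees never sampled each contribute |0 - 1/num_trees|:
    unseen = (num_trees - len(counts)) / num_trees
    return 0.5 * (seen + unseen)
```

For example, sampling two trees exactly evenly yields a distance of 0, while sampling only one of two trees yields 0.5, the maximum discrepancy possible for that pair.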

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Russo, L.M.S.; Teixeira, A.S.; Francisco, A.P.
Linking and Cutting Spanning Trees. *Algorithms* **2018**, *11*, 53.
https://doi.org/10.3390/a11040053
