Algorithm 3 accepts a source tree (

${T}_{1}$) and distortion parameters, and creates a target tree (

${T}_{2}$). The distortion parameters are the number of added nodes (

a), the number of deleted nodes (

d), and the percentage of matched nodes with a changed parent (

${m}_{p}$). Matched nodes are nodes with the same labels in both the source and target trees. The algorithm works as follows: first, nodes

${V}_{1}$ and edges

${E}_{1}$ of the source tree are copied into

${V}_{2}$ and

${E}_{2}$. Matched nodes are stored in set

M, which corresponds to the set

${V}_{2}$ after the nodes are deleted. The algorithm randomly selects

d nodes from set

${V}_{2}$ by using function

`getRandomNode`. Each randomly selected node

${u}_{d}$ is deleted from set

${V}_{2}$, together with the corresponding edge to the parent node from the set of edges

${E}_{2}$. The edge that is deleted is obtained by function

`getEdgeToParent`, which takes the child node as its argument. After the deletions, the algorithm creates new nodes and adds them to the node set

${V}_{2}$. For each new node

${u}_{a}$ the algorithm randomly selects its parent node

${u}_{p}$ and creates a corresponding edge

$({u}_{p},{u}_{a})$. If the parent node has children, the tree can be additionally modified by randomly selecting one of its children

c and setting

${u}_{a}$ as the new parent of that child. This modification depends on the

`spreadDecision` function, which in this paper returns a positive decision with a 50% chance. However, the function can be parametrized or defined differently depending on the specific usage. The third step of the algorithm replaces the parent node of matched nodes from set

M. First, the number of parent-changing nodes

${m}_{c}$ is calculated from the parameter

${m}_{p}$ and the number of matched nodes

$\left|M\right|$. Node

${u}_{m}$ is selected from a set of nodes

N, which initially contains matched nodes

M. In each of the

${m}_{c}$ iterations, set

N is reduced by the selected node

${u}_{m}$. The algorithm then proceeds as in node addition: it randomly selects a new parent node

${u}_{np}$ from a set of nodes

${V}_{2}\setminus \{{u}_{m},{u}_{op}\}$, where a newly added node, like a matched node, can be the new parent node. As in node addition, node

${u}_{m}$ can be added as a child of the new parent

${u}_{np}$ in two ways: by replacing an existing child node

c or by being added as a new child node. The difference is that the edge

$({u}_{op},{u}_{m})$ to the previous parent node

${u}_{op}$ is deleted from set

${E}_{2}$. Note that the final number of nodes with a changed parent may be greater than

${m}_{c}$. Deleting a node with children changes the parent of each child, and when a new or an existing node replaces a child, that child also changes its parent. However, in this paper, we treat such changes as implicitly covered by the initial distortion parameters. Future work could include an evaluation of Algorithm 3 that considers such changes separately. The worst-case complexity of Algorithm 3 is

$O({n}_{1}+{n}_{2})$ when

${n}_{1}$ nodes are deleted, and

${n}_{2}$ nodes are added, where

${n}_{1}$ corresponds to

d, and

${n}_{2}$ corresponds to

a. An example of a tree generated by Algorithm 3, using the 100-node tree in

Figure 6a as the source tree, is shown in

Figure 6b. The distortion parameters used are 10 added nodes, 10 deleted nodes, and 10% of matched nodes with changed parents (10, 10, 0.1). Note that the tree in

Figure 6a is generated by Algorithm 2 with the following distribution: 32.5%, 50%, 10%, 5%, and 2.5%. The given distribution of nodes corresponds to the distribution of the tree representing the class hierarchy of the

NewPipe program, version 0.8.9, shown in

Figure 5. The trees in

Figure 6a,b are similar, since the tree in

Figure 6b is generated by controlled distortion of the source tree in

Figure 6a. On the other hand, the tree in

Figure 6c is generated by Algorithm 2 with the same distribution as the tree in

Figure 6a. Generating both trees with Algorithm 2 is another way to obtain trees for comparison. In this case, the distribution of nodes in both trees is equal, but the distortion takes the form of a possibly different number of nodes and random connections between nodes.
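
The three steps of Algorithm 3 described above (deletion, addition, and parent change) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it favors readability over the stated complexity. Several details are assumptions made only for this illustration: the tree is stored as a child-to-parent map with integer labels, the name `distort_tree` is hypothetical, the root is never deleted or moved, children of a deleted node are re-attached to that node's parent, ${m}_{c}$ is taken as the floor of ${m}_{p}\cdot\left|M\right|$, and descendants of ${u}_{m}$ are excluded from the candidate new parents so that the result remains a tree.

```python
import random


def distort_tree(parent, a, d, m_p, rng=None, spread_decision=None):
    """Return (target parent map, matched node set) for a source tree.

    `parent` maps each node label to its parent label; the root maps to None.
    """
    rng = rng or random.Random()
    # Default spreadDecision: a positive decision with a 50% chance.
    spread = spread_decision or (lambda: rng.random() < 0.5)
    p2 = dict(parent)  # copy of V1/E1: keys are V2, pairs (p2[u], u) are E2
    root = next(u for u, p in p2.items() if p is None)

    def children(u):
        return [v for v, p in p2.items() if p == u]

    def descendants(u):
        found, stack = set(), [u]
        while stack:
            for v in children(stack.pop()):
                found.add(v)
                stack.append(v)
        return found

    # Step 1: delete d random nodes. Assumptions: the root is never deleted,
    # and children of a deleted node move to the deleted node's parent.
    for _ in range(d):
        u_d = rng.choice([u for u in p2 if u != root])
        for c in children(u_d):
            p2[c] = p2[u_d]
        del p2[u_d]
    matched = set(p2)  # M: the nodes that survive deletion

    # Step 2: add a new nodes, each under a randomly selected parent u_p.
    next_label = max(parent) + 1  # assumes integer labels
    for i in range(a):
        u_a = next_label + i
        u_p = rng.choice(list(p2))
        kids = children(u_p)
        p2[u_a] = u_p
        if kids and spread():
            p2[rng.choice(kids)] = u_a  # u_a becomes the parent of child c

    # Step 3: change the parent of m_c matched nodes.
    n_set = sorted(matched - {root})  # N, without the root (assumption)
    m_c = min(int(m_p * len(matched)), len(n_set))
    for _ in range(m_c):
        u_m = n_set.pop(rng.randrange(len(n_set)))
        u_op = p2[u_m]
        # Assumption: descendants of u_m are excluded to avoid cycles.
        excluded = {u_m, u_op} | descendants(u_m)
        pool = [u for u in p2 if u not in excluded]
        if not pool:
            continue
        u_np = rng.choice(pool)
        kids = children(u_np)
        p2[u_m] = u_np  # removes edge (u_op, u_m), adds edge (u_np, u_m)
        if kids and spread():
            p2[rng.choice(kids)] = u_m  # u_m replaces child c under u_np
    return p2, matched
```

Under these assumptions, a call such as `distort_tree(source, a=10, d=10, m_p=0.1)` would correspond to the (10, 10, 0.1) setting used for the example in Figure 6b. Passing a seeded `random.Random` as `rng` makes the distortion reproducible, and `spread_decision` can be replaced with any zero-argument function that returns a Boolean.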