# Is Network Clustering Detectable in Transmission Trees?

## Abstract


## 1. Introduction

## 2. Methods

#### 2.1. Simulating Networks and Measuring Clustering

`y_hi = simulate(y ~ gwesp(0.2,fixed=T), theta0 = 5,...`

`constraints = ~ degreedist, burnin=5e+5)`

`y_lo = simulate(y ~ gwesp(0.2,fixed=T), theta0 = -5,...`

`constraints = ~ degreedist, burnin=5e+5)`

#### 2.2. Simulating Epidemics

#### 2.3. Summarising Epidemic Data

- Repeat for $i=1,\dots,500$:
  - (a) Sample a graph ${Y}_{i}$ according to a given degree distribution.
  - (b) Simulate two further graphs ${Y}_{i}^{hi}$ and ${Y}_{i}^{lo}$ with high clustering and low clustering, respectively, using a Monte Carlo sampler that rewires ${Y}_{i}$ to alter the clustering level while preserving the degree of each node.
  - (c) Simulate SEIR epidemics over ${Y}_{i}^{hi}$ and ${Y}_{i}^{lo}$, conditioning on a major outbreak occurring in each.
  - (d) Extract the resulting transmission trees from ${Y}_{i}^{hi}$ and ${Y}_{i}^{lo}$ and calculate the respective summaries, ${S}_{i}^{hi}$ and ${S}_{i}^{lo}$.
- Compare the sets of summaries, ${S}^{hi}$ and ${S}^{lo}$.
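
The rewiring in step (b) can be illustrated with a minimal Python sketch. It is not the sampler used in the paper (which draws from an ERGM with a `gwesp` term under a `degreedist` constraint). Instead, it assumes a simple greedy double-edge swap that accepts a swap only when the triangle count moves in the desired direction. The helper names `bernoulli_graph`, `count_triangles` and `rewire`, the graph size, and the step count are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def bernoulli_graph(n, p, rng):
    """Sample an undirected simple graph where each edge is present with probability p."""
    adj = {u: set() for u in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def count_triangles(adj):
    """Count triangles in a graph stored as {node: set of neighbours}."""
    total = 0
    for u in adj:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                total += 1
    return total // 3  # each triangle is seen once from each of its three corners


def rewire(adj, direction, n_steps, rng):
    """Greedy degree-preserving rewiring (assumed stand-in for the paper's sampler).

    direction=+1 accepts swaps that do not reduce the triangle count,
    direction=-1 accepts swaps that do not increase it.
    """
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    tri = count_triangles(adj)
    for _ in range(n_steps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # pick one of the two possible re-pairings
            c, d = d, c
        # Proposed swap: (a,b),(c,d) -> (a,d),(c,b); skip if it would create
        # a self-loop or a duplicate edge.
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue
        for x, y in ((a, b), (c, d)):
            adj[x].remove(y)
            adj[y].remove(x)
        for x, y in ((a, d), (c, b)):
            adj[x].add(y)
            adj[y].add(x)
        new_tri = count_triangles(adj)
        if (new_tri - tri) * direction >= 0:
            tri = new_tri
            edges[i], edges[j] = (a, d), (c, b)
        else:  # undo the swap
            for x, y in ((a, d), (c, b)):
                adj[x].remove(y)
                adj[y].remove(x)
            for x, y in ((a, b), (c, d)):
                adj[x].add(y)
                adj[y].add(x)
    return adj


rng = random.Random(1)
y = bernoulli_graph(30, 0.15, rng)   # step (a), with an assumed Bernoulli graph
y_hi = rewire(y, +1, 2000, rng)      # step (b), high clustering
y_lo = rewire(y, -1, 2000, rng)      # step (b), low clustering
print(count_triangles(y_lo), count_triangles(y), count_triangles(y_hi))
```

Every accepted swap removes one edge and adds one edge at each of the four nodes involved, so the degree of every node is unchanged, which is the property step (b) requires. A greedy rule like this one only pushes clustering towards an extreme. Unlike a Monte Carlo sampler, it does not draw from a specified distribution over graphs.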

## 3. Results

## 4. Discussion and Conclusions

## 5. Acknowledgements


**Figure 1.** An example of a network on 7 nodes. The nodes are the red dots, labelled 1 to 7, and represent individuals in the population. The edges are shown as black lines connecting the nodes and represent possible routes of transmission. The degree of each node is the number of edges adjacent to it, so that node 5 has degree 3 and node 7 has degree 1. The degree sequence of the network is the count of nodes with each degree and can be represented by the vector $(0,2,0,3,1,1)$, showing that there are 0 nodes of degree 0, 2 of degree 1, 0 of degree 2 and so on. A cycle in the network is a path that starts at a node and follows distinct edges back to the same node. For example, the path from node 6 to node 1 to node 3 and back to node 6 is a cycle, but there is no cycle that includes node 4. Clustering is a measure of the propensity of cycles of length 3 (triangles) to form. Here, the edges (2,1) and (2,6) form a triangle with the edge (1,6), and so work to increase clustering in the network. However, the edges (2,1) and (2,5) are not part of a triangle because the edge (1,5) does not exist, and so work to decrease clustering.

**Figure 2.** An example of a transmission tree showing the labelling scheme. Five individuals are involved in the epidemic and are labelled $1,\dots ,5$. The root node is labelled $(0,1,0)$ to signify that, at time $t=0$, individual 1 was (spontaneously) infected. The internal nodes represent transmission events via a triplet such as $(1.3,2,1)$, showing that, at time $t=1.3$, individual 2 was infected by individual 1. The leaf nodes represent recovery times; for example, $(3.1,1)$ means that, at time $t=3.1$, individual 1 recovered. Note that this tree has one “cherry”, formed by the leaves labelled $(5.2,4)$ and $(6.2,5)$, out of a possible maximum of two cherries.
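
The labelling scheme and the cherry count described in this caption can be sketched in Python. The tree below is hypothetical: only the labels $(0,1,0)$, $(1.3,2,1)$, $(3.1,1)$, $(5.2,4)$ and $(6.2,5)$ come from the caption, and the remaining labels and the overall shape are assumptions chosen so that the tree has five leaves and one cherry. The names `Node`, `count_leaves` and `count_cherries` are likewise illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    # Internal nodes: (time, infectee, infector); leaves: (recovery time, individual).
    label: Tuple[float, ...]
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def count_leaves(node: Node) -> int:
    """Number of leaves (recovery events) below and including this node."""
    if node.is_leaf:
        return 1
    return sum(count_leaves(c) for c in node.children)


def count_cherries(node: Node) -> int:
    """A cherry is an internal node whose two children are both leaves."""
    here = int(len(node.children) == 2 and all(c.is_leaf for c in node.children))
    return here + sum(count_cherries(c) for c in node.children)


# Hypothetical tree; labels not listed in the caption are made up for illustration.
cherry = Node((3.5, 5, 4), [Node((5.2, 4)), Node((6.2, 5))])
tree = Node((0, 1, 0), [
    Node((1.3, 2, 1), [
        Node((3.1, 1)),
        Node((2.0, 3, 2), [
            Node((4.0, 3)),
            Node((2.8, 4, 2), [Node((4.4, 2)), cherry]),
        ]),
    ]),
])

n_leaves = count_leaves(tree)
max_cherries = n_leaves // 2  # each cherry uses two distinct leaves
print(count_cherries(tree), "of a possible", max_cherries)
```

Dividing the cherry count by `max_cherries` gives a proportion of possible cherries, which is one way to normalise the statistic across trees with different numbers of leaves.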

**Figure 4.** Empirical distributions of summary statistics for epidemics on Bernoulli networks. (top left) Number of infected individuals through time with daily mean shown in black; (top right) Length of epidemic; (centre left) Maximum number infected at peak of outbreak; and (bottom) Time of outbreak peak.

**Figure 5.** Empirical distributions of summary statistics of transmission trees from epidemics on Bernoulli networks. (top left) Mean internal branch length; (top right) Mean external branch length; (middle left) Number of secondary infections by node; (middle right) Number of total infections by node, vertical axis on log-scale; and (bottom) Number of cherries in tree as a proportion of possible cherries.

**Figure 6.** Empirical distributions of summary statistics of epidemics on power-law networks. (top left) Number of infected individuals through time with daily mean shown in black; (top right) Length of epidemic; (centre left) Maximum number infected at peak of outbreak; and (bottom) Time of outbreak peak.

**Figure 7.** Empirical distributions of summary statistics of transmission trees from epidemics on power-law networks. (top left) Mean internal branch length; (top right) Mean external branch length; (middle left) Number of secondary infections by node; (middle right) Number of total infections by node, vertical axis on log-scale; and (bottom) Number of cherries in tree as a proportion of possible cherries.


© 2011 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Welch, D.
Is Network Clustering Detectable in Transmission Trees? *Viruses* **2011**, *3*, 659-676.
https://doi.org/10.3390/v3060659
