Complex Contagion Features without Social Reinforcement in a Model of Social Information Flow

Contagion models are a primary lens through which we understand the spread of information over social networks. However, simple contagion models cannot reproduce the complex features observed in real-world data, leading to research on more complicated complex contagion models. A noted feature of complex contagion is social reinforcement: individuals require multiple exposures to information before they begin to spread it themselves. Here we show that the quoter model, a model of the social flow of written information over a network, displays features of complex contagion, including the weakness of long ties and the inhibition, rather than promotion, of information flow by increased density. Interestingly, the quoter model exhibits these features despite having no explicit social reinforcement mechanism, unlike complex contagion models. Our results highlight the need to complement contagion models with an information-theoretic view of information spreading to better understand how network properties affect information flow and which ingredients are most necessary when modeling social behavior.


Introduction
Social networks mediated through online platforms are an increasingly important way in which individuals send and receive information, and their influence is now felt in economics, politics, and the workplace [1][2][3][4][5][6]. These platforms provide rich opportunities for researchers to collect and study real-world data related to human behavior and the spread of information. In concert with these datasets, considerable research has worked towards better statistical and information-theoretic tools to quantify information flow [7][8][9] and towards more accurate mathematical models to understand and even predict information flow [10][11][12].
A common approach to measuring information flow over a network is to idealize information as a collection of 'packets,' and then track the spread of those packets throughout the network. This approach is especially common when studying social media, where keywords such as hashtags or URLs are easily tracked. More complex phenomena, such as the adoption of behaviors, can also be monitored and used as a proxy for information flow [13]. Treating information flow in this way brings to mind the spread of infections, and the use of epidemiologically inspired models is popular. In this context, the social "diffusion" of information is often characterized as either a simple contagion or a complex contagion [14]. Simple contagions are those where each exposure can independently lead to an infection. Complex contagions, in contrast, introduce a social reinforcement mechanism where multiple exposures are needed before the contagion can spread.
However, despite its simplicity and popularity, there can be drawbacks to treating information as the contagion of discrete packets. Within social media, for example, there is a wealth of written information being posted by users that is ignored when focusing only on particular keywords. Likewise, considerable information could be exchanged between individuals without leading to an observable adoption of behavior. Therefore, we argue in this work that a more nuanced approach grounded in information theory can give a better view of information flow in online social networks while more fully using the available data.
The goal of this work is to study how network properties can affect information flow when taking an information-theoretic view on information flow, and how this information-theoretic view compares to contagion. We study the quoter model [12], a simple model for individuals generating text data within social media and apply information-theoretic estimators to the model text. Using both network models and real-world network data, we compare the behavior of information flow in this model with traditional simple and complex contagion, to see the similarities and differences we may observe through these contrasting viewpoints. Interestingly, we find that the quoter model exhibits several phenomena characteristic of complex contagion, despite lacking an explicit social reinforcement mechanism, the key feature of complex contagion.
The rest of this work is organized as follows. In Section 2 we describe information-theoretic estimators of information flow and mathematical models of information flow and contagion. In Section 3 we describe the materials and methods used in this study, including simulation details, measures of information flow, the network properties we investigate, and the network data we use. Section 4 presents our results comparing contagion models with the information-theoretically motivated quoter model and exploring how various network properties affect information flow in the quoter model. We conclude with a discussion in Section 5.

Measuring Information Flow
Suppose an individual within a social network generates a stream of text representing posts shared online, on Twitter for example. The entropy rate $h$ of this text captures the information present within it. It can be challenging to estimate $h$ for natural language data as information is present in the ordering of the words, not just the relative frequencies of words [15]. To help address this challenge, Kontoyiannis et al. [16] proved that the estimator
$$\hat{h} = \frac{T \log_2 T}{\sum_{t=1}^{T} \Lambda_t} \tag{1}$$
converges to the true entropy rate $h$ of a text, where $T$ is the length of the sequence of words and $\Lambda_t$ is the match length of the prefix at position $t$: it is the length of the shortest substring (of words) starting at $t$ that has not previously appeared in the text. This estimator has been used to study human dynamics including mobility patterns and social media predictability [11,17]. Equation (1) generalizes to an estimator of the cross-entropy $h_\times$ between two texts $A$ and $B$ [11,18]:
$$\hat{h}_\times(A \mid B) = \frac{T_A \log_2 T_B}{\sum_{t=1}^{T_A} \Lambda_t(A \mid B)} \tag{2}$$
where $T_A$ and $T_B$ are the lengths of the two texts, and $\Lambda_t(A \mid B)$ is the length of the shortest substring of words starting at position $t$ of $A$ that is not seen in $B_{:t}$, the ordered sequence of words in $B$ that appear before the time of the $t$-th word in $A$. That is, the match is extended until the first substring $[A_t, \ldots, A_{t+\Lambda_t(A \mid B)-1}]$ that is not seen in $B_{:t}$. By matching the future text of $A$ (words posted at times $\geq \mathrm{time}(A_t)$) against the past text of $B$ (words posted at times $< \mathrm{time}(A_t)$) at every $t$, only $B$'s past predictive information about $A$'s future is estimated and temporal precedence is satisfied. The cross-entropy can be applied directly to the texts of a pair of individuals by choosing $B$ to be the text stream of one individual and $A$ the text stream of the other, and Equation (2) can be used to measure the information flow between those individuals by asking how much predictive information about one text is contained within the other.
This can be a quite powerful and effective measure of information flow, as it satisfies temporal precedence of the text streams and it uses all of the available (text) data for the pair of users [7,11,12,16,18]. We focus on the cross-entropy estimated using Equation (2) as a pairwise measure of information flow, but generalizations can capture information flow from multiple social ties towards a single individual [11,12]. Doing so allows for measures of more complex information flow such as analogs of transfer entropy or causation entropy [7,8,19]. The best extensions of information flow estimators beyond pairwise measures remains an active and fruitful area of research (see also our discussion in Section 5).
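The match-length computation behind the cross-entropy estimator can be sketched in a few lines of Python. This is a minimal, unoptimized illustration, not the implementation used in the study. It assumes the estimator has the form $T_A \log_2 T_B / \sum_t \Lambda_t(A \mid B)$, that each text is given as a list of words with a parallel sorted list of posting times, and that a suffix of $A$ that is matched entirely receives a match length of one more than its length. The function names are hypothetical.

```python
import math
from bisect import bisect_left


def match_length(target, t, source):
    """Length of the shortest run target[t:t+L] that does not occur
    contiguously anywhere in source (brute-force search)."""
    max_len = len(target) - t
    for L in range(1, max_len + 1):
        sub = target[t:t + L]
        found = any(source[i:i + L] == sub
                    for i in range(len(source) - L + 1))
        if not found:
            return L
    # Assumed convention: the whole remaining suffix was seen in source.
    return max_len + 1


def cross_entropy(a_words, a_times, b_words, b_times):
    """Estimate the cross-entropy of text A given the past of text B (bits/word).

    b_times must be sorted in increasing order.
    """
    T_A, T_B = len(a_words), len(b_words)
    total = 0
    for t in range(T_A):
        # Only words of B posted strictly before the time of A_t are visible.
        cut = bisect_left(b_times, a_times[t])
        total += match_length(a_words, t, b_words[:cut])
    return T_A * math.log2(T_B) / total


if __name__ == "__main__":
    b = ["x", "y", "z", "w"]
    print(cross_entropy(["a", "b", "c", "d"], [10, 11, 12, 13], b, [0, 1, 2, 3]))
    print(cross_entropy(list(b), [10, 11, 12, 13], b, [0, 1, 2, 3]))
```

When A shares no words with B every match length is 1 and the estimate equals $\log_2 T_B$; when A repeats B's past, the match lengths grow and the estimate falls, which is the sense in which lower cross-entropy indicates more information flow.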
Closely associated with the cross-entropy is the predictability $\Pi$. Predictability, given by Fano's Inequality [20], provides a bound on how accurately an ideal predictive method can perform when working with data of a given entropy: $\Pi$ is the probability that the most accurate possible method will correctly predict the subsequent word given the uncertainty of the available information (i.e., the cross-entropy):
$$h_\times \leq H(\Pi) + (1 - \Pi) \log_2 (z - 1), \tag{3}$$
where $H(\Pi) = -\Pi \log_2 \Pi - (1 - \Pi) \log_2 (1 - \Pi)$ is the binary entropy function and $z$ is the cardinality of the sample space; in our problem, this is the vocabulary size or number of unique words for the quoter model (Section 3.1). The predictability is then given by finding numerically the largest $\Pi$ that satisfies Equation (3). Equation (3) demonstrates that $h_\times$ and $\Pi$ are functionally equivalent (and inversely related, with higher $h_\times$ corresponding to lower $\Pi$ and vice versa) as $z$ is a constant for the model we study here (see also discussion in Section 5). Higher values of $\Pi$ (lower $h_\times$) correspond to higher amounts of information flow.
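The numerical step of finding the largest $\Pi$ can be illustrated with a simple bisection. The sketch below assumes the standard Fano form $h_\times \leq H(\Pi) + (1 - \Pi)\log_2(z - 1)$ with $H$ the binary entropy; on the interval $[1/z, 1]$ the right-hand side decreases monotonically from $\log_2 z$ to 0, so the largest admissible $\Pi$ can be bracketed there. The function names and tolerance are illustrative choices.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def fano_rhs(p, z):
    """Right-hand side of the assumed Fano bound for vocabulary size z."""
    return binary_entropy(p) + (1 - p) * math.log2(z - 1)


def predictability(h, z, tol=1e-10):
    """Largest Pi in [1/z, 1] with fano_rhs(Pi, z) >= h, found by bisection."""
    lo, hi = 1.0 / z, 1.0
    if h >= fano_rhs(lo, z):
        return lo  # entropy at or above log2(z): no better than uniform guessing
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fano_rhs(mid, z) >= h:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    for h in (0.0, 2.0, 5.0, math.log2(1000)):
        print(h, predictability(h, z=1000))
```

Because the mapping is monotone for fixed $z$, ranking links by cross-entropy or by predictability gives the same ordering, reversed.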

Quoter Model
To study the effects of network properties on information flow, we use the recently proposed quoter model [12]. The quoter model represents an idealized model of social conversations, meant to capture some of the processes by which individuals in an online social network post text while also being analytically tractable. Nodes in a network generate text streams both by sampling from a given vocabulary distribution and by copying ("quoting") short sub-sequences of text from their neighbors. This model provides a parameter $q$, the quote probability, that tunes the degree of information flow. (Full details of the model and how we simulate it are given in Section 3.1.) After simulating the quoter model for a given number of time steps (Section 3.1), a text stream has been generated by each node in the network, and we can estimate the cross-entropies between these texts to study the social flow of written information. See Bagrow and Mitchell [12] for full details on the quoter model.

Other Models of Information Flow
Contagion approaches are often used to model information flow [14]. A classic simple contagion approach to information flow is compartment models, taken from models of epidemics. Two simple compartment models are Susceptible-Infected (SI) and Susceptible-Infected-Recovered (SIR) models. On a network, a small number of nodes are initially "infected" while the remaining nodes are susceptible. The contagion then spreads from those infected nodes with a constant transmission rate per link so that each node in the "S" compartment has a constant probability to move to the "I" compartment with any given exposure. For SIR models, an additional "R" compartment is used to model a recovery process where infected nodes cease spreading the contagion while also becoming immune to reinfection. Many variants on these models exist.
Complex contagion phenomena are typically captured with threshold models [21,22]. Here nodes are again labeled as susceptible or infected, but the probability for a node i to become "infected" is a function of the number of neighbors of that node already infected. If too few neighbors are infected there is zero probability that i will be infected. Yet if a sufficient fraction of i's neighbors become infected, then i has a non-zero probability of becoming infected. This social reinforcement mechanism is intended to capture the cognitive mechanisms underlying opinion change, knowledge acquisition, and other facets of how individuals respond to and adopt information and ideas [23,24].
Complex contagion leads to several phenomena that differ from simple contagion. For one, there is an interesting cascade window where network density leads to a non-monotonic relationship with the spread of the contagion. Often denser networks lead to less spread, unlike simple contagion where a contagion will spread more easily as denser networks afford more opportunities (links) for spreading. Another feature of complex contagion is the complicated role of clustering, where clustering can appear to either promote or inhibit contagion [25][26][27][28]. Complex contagion also exhibits a "weakness of long ties" effect, where long ties impede the flow of contagion [29], in contrast with the seminal "strength of weak ties" result [30] that implies long-range ties have an out-sized role in promoting information flow. The goal of our work here is to study the information-theoretic view of information flow that we adopt with the quoter model and to compare it with the effects of complex contagion, which is commonly used as a non-information-theoretic view of information flow.

Materials and Methods
In this study, we use the quoter model on networks to elucidate the role of network structure on information flow. Here we describe the procedures used to simulate the quoter model and to measure information flow between nodes in networks, the network features we study in relation to information flow, and the network models (random graphs) and real-world network datasets we study.

The Quoter Model
We use the following process to simulate the quoter model on a given network. The quoter model requires a directed graph $G = (V, E)$ (where $N = |V|$ is the number of nodes and $M = |E|$ is the number of edges) and, in the most general case, quote probabilities $q_{uv}$ on each directed edge (we say node $v$ (ego) may quote $u$ (alter) if the edge $u \to v$ exists and has $q_{uv} > 0$). We simplify this for our simulations: when an ego generates new text, with probability $q$ (bidirectional quoting) we pick an alter (predecessor) uniformly at random to quote from; otherwise, with probability $1 - q$, the ego generates new content. If an ego quotes an alter (probability $q$), it copies a random segment of the alter's past text and appends this onto its growing text stream. We take the "quote length" (number of words) being copied to be Poisson-distributed (with mean $\lambda$) for all users. Otherwise, if not quoting (probability $1 - q$), the ego generates new content by sampling with replacement from a vocabulary distribution $W(w)$ and appending those samples onto its growing text stream, where the number of samples is again Poisson-distributed with mean $\lambda$. We assume a common, fixed vocabulary distribution $W(w)$ that follows a Zipf law of word use, as in prior studies and motivated by real-world language usage patterns [12]. Specifically, a Zipf law defines the probability of using word $w$ to be a power law based on the rank $r_w$ of $w$: $W(w) = H_{z,\alpha}^{-1} r_w^{-\alpha}$, where $z$ is the vocabulary size and $H_{z,\alpha} = \sum_{r=1}^{z} r^{-\alpha}$. Here we take $z = 1000$ as in [12] and, unless otherwise stated, focus on the exponent $\alpha = 1.5$, a value typical of social media data. We focus in this work on $q = 1/2$ and $\lambda = 3$ but we explore the robustness of our results to other parameter choices in Appendix A. This process repeats for $T = 1000N$ time steps so that each user has generated approximately $1000\lambda = 3000$ words when complete.
This number of time steps was chosen to ensure the entropy estimator would converge (see [16,18] for convergence proofs).
While very short amounts of text will make the estimated entropy too uncertain to be reliable, this length of text is in line with the empirical convergence of h × reported in real data [11].
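The simulation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: one ego is chosen uniformly at random at each time step, a quote attempt falls back to new content when the chosen alter has no text yet, timestamps are simply the step index, and words are represented by their Zipf rank minus one. The function names and the `predecessors` input format (a dict mapping each node to the list of nodes it may quote) are hypothetical.

```python
import math
import random
from itertools import accumulate


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_quoter(predecessors, q=0.5, lam=3, z=1000, alpha=1.5,
                    steps_per_node=1000, seed=None):
    """Return {node: [(time, word), ...]} after steps_per_node * N steps."""
    rng = random.Random(seed)
    vocab = list(range(z))
    # Cumulative (unnormalized) Zipf weights r^(-alpha) for ranks 1..z.
    cum = list(accumulate(r ** -alpha for r in range(1, z + 1)))
    nodes = list(predecessors)
    texts = {v: [] for v in nodes}
    for step in range(steps_per_node * len(nodes)):
        ego = rng.choice(nodes)            # assumption: one random ego per step
        n_words = poisson(rng, lam)
        new_words = None
        if predecessors[ego] and rng.random() < q:
            source = texts[rng.choice(predecessors[ego])]
            if source:                     # quote a random segment of past text
                start = rng.randrange(len(source))
                new_words = [w for _, w in source[start:start + n_words]]
        if new_words is None:              # otherwise generate new content
            new_words = rng.choices(vocab, cum_weights=cum, k=n_words)
        texts[ego].extend((step, w) for w in new_words)
    return texts


if __name__ == "__main__":
    ring = {0: [2], 1: [0], 2: [1]}        # hypothetical 3-node directed cycle
    out = simulate_quoter(ring, steps_per_node=100, seed=1)
    print({v: len(t) for v, t in out.items()})
```

With the default `steps_per_node=1000` and `lam=3`, each node produces roughly 3000 words, in line with the text lengths quoted above.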

Measuring Information Flow over the Network
After generating text streams for all nodes in $G$ by iterating the quoter model, the cross-entropy estimator (Equation (2)) is then applied as needed to compute $h_\times$. We compute the cross-entropy over all edges, $\{h_\times\} = \{h_\times(u \mid v) \mid (u, v) \in E\}$, and report the mean $\langle h_\times \rangle$ and variance $\mathrm{Var}(h_\times)$ of these values. (We examine the distribution of $h_\times$ in Appendix B to show that $\langle h_\times \rangle$ and $\mathrm{Var}(h_\times)$ are reasonable summaries of the distribution of $h_\times$.) Likewise, the predictability $\Pi$, given by Fano's Inequality [20], is a functionally equivalent measure of information flow (as we assume the same vocabulary sizes for nodes in the quoter model). We focus on link-based cross-entropies although the cross-entropy estimator can be applied to non-neighboring nodes. Indeed, when studying the role of community structure in modular networks (see Section 3.4), we also consider cross-entropies between nodes in different modules, to assess information flow between and within said modules.

Simulating Contagion Models
To compare and contrast information flow in the quoter model, we also simulate traditional models of information flow, specifically simple and complex contagion. For simple contagion we simulate a stochastic SIR model on different networks (1000-node Erdős-Rényi and Barabási-Albert networks, as well as a sample of real-world networks) using [31]. For the simulations here we set the transmission rate to 20 and the recovery rate to 1. We initialize with a random 5% of the nodes infected, and run 10 outbreaks on 100 realizations of the network for each choice of average degree $\langle k \rangle$. For complex contagion we use exactly the same parameters, except we introduce a threshold function for transmission as in [22], where the transmission rate is set to zero if the proportion of infected neighbors is below some threshold $\varphi$ (and we set $\varphi = 0.18$ following [22]). For all simple and complex contagion simulations we measure the peak outbreak size, noting that larger outbreak sizes conventionally correspond to greater information flow.
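To make the difference between the two contagion rules concrete, the sketch below implements a discrete-time SIR process with an optional threshold gate. It is an illustrative simplification and not the simulator of [31]: per-step transmission and recovery probabilities (`beta`, `gamma`) stand in for the continuous-time rates, and all names are hypothetical. Setting `phi=0` gives simple contagion; a positive `phi` blocks transmission to any node whose infected-neighbor fraction is below the threshold.

```python
import random


def sir_peak(adj, beta, gamma, phi=0.0, seed_frac=0.05, rng=None,
             max_steps=10_000):
    """Peak number of simultaneously infected nodes in one outbreak.

    adj: dict mapping node -> set of neighbours (undirected graph).
    """
    rng = rng or random.Random()
    nodes = list(adj)
    n_seed = max(1, round(seed_frac * len(nodes)))
    infected = set(rng.sample(nodes, n_seed))
    recovered = set()
    peak = len(infected)
    for _ in range(max_steps):
        if not infected:
            break
        newly_infected = set()
        for v in nodes:
            if v in infected or v in recovered:
                continue
            k_inf = sum(1 for u in adj[v] if u in infected)
            if k_inf == 0:
                continue
            if k_inf / len(adj[v]) < phi:
                continue  # threshold gate: too few infected neighbours
            # Each infected neighbour transmits independently with prob. beta.
            if rng.random() < 1 - (1 - beta) ** k_inf:
                newly_infected.add(v)
        newly_recovered = {v for v in infected if rng.random() < gamma}
        infected = (infected - newly_recovered) | newly_infected
        recovered |= newly_recovered
        peak = max(peak, len(infected))
    return peak


if __name__ == "__main__":
    k5 = {i: {j for j in range(5) if j != i} for i in range(5)}
    print(sir_peak(k5, 1.0, 1.0, phi=0.0, seed_frac=0.2, rng=random.Random(0)))
    print(sir_peak(k5, 1.0, 1.0, phi=0.5, seed_frac=0.2, rng=random.Random(0)))
```

In the small complete-graph example, a single seed infects every neighbor when `phi=0`, but cannot spread at all when `phi=0.5` because each susceptible node sees only one infected neighbor out of four.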

Assessing the Impact of Structure on Dynamics
In this work we use several network models (random graphs) tailored to control for various network properties such as density, clustering, and modular structure. Here we describe the models and properties we study in relation to information flow in the quoter model.

Density and Average Degree
To explore how network density relates to information flow, we create Erdős-Rényi and Barabási-Albert networks of $N$ nodes with varying average degree $\langle k \rangle$, allowing us to tune their densities. For the Erdős-Rényi networks we add edges independently with probability $p = \langle k \rangle/(N - 1)$. For the Barabási-Albert model we start with $m = \langle k \rangle/2$ nodes with no edges and add nodes which each form $m$ links with previous nodes according to preferential attachment. Here we measure how cross-entropies vary with the densities of the networks using their average degree $\langle k \rangle$ and edge density $M/\binom{N}{2}$, where $M$ is the total number of edges in the network. To complement the Erdős-Rényi and Barabási-Albert results, we also compare the densities of real networks with their average cross-entropy.
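The relationship between the Erdős-Rényi edge probability, the average degree, and the edge density can be checked with a short sketch. It uses only the standard library; the helper names are hypothetical and the generator is the naive pairwise construction, which is fine for networks of about a thousand nodes.

```python
import random
from itertools import combinations


def erdos_renyi(n, avg_k, seed=None):
    """Edge list of a G(n, p) graph with p = avg_k / (n - 1)."""
    rng = random.Random(seed)
    p = avg_k / (n - 1)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def average_degree(n, edges):
    """Realized average degree 2M / N."""
    return 2 * len(edges) / n


def edge_density(n, edges):
    """Realized density M / C(N, 2)."""
    return len(edges) / (n * (n - 1) / 2)


if __name__ == "__main__":
    n = 1000
    edges = erdos_renyi(n, avg_k=8, seed=0)
    print(average_degree(n, edges), edge_density(n, edges))
```

For any graph the two summaries are tied together by density $= \langle k \rangle/(N - 1)$, so sweeping the average degree at fixed $N$ is equivalent to sweeping the density.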

Degree Heterogeneity
To assess the role of degree heterogeneity on information flow, we study the simplest random graph model with tunable degree heterogeneity, termed "dichotomous networks" in [32]. Dichotomous networks are generated via the configuration model. They have only two types of nodes: those with degree $k_1$ and those with degree $k_2$. We assume there are $N/2$ nodes of each degree and fix $k_1 + k_2$ so that the average degree is fixed. The mean and variance of the degree distribution, respectively, are given by $\mu = \frac{1}{2}(k_1 + k_2)$ and $\sigma^2 = (k_1 - k_2)^2/4$. We are interested in how the cross-entropy varies with $k_1/k_2$. When $k_1/k_2 = 1$ the network reduces to a random $k$-regular graph ($\sigma^2 = 0$), while $\sigma^2$ grows as $k_1/k_2 \to 0$ (approaching its maximum of $\mu^2$ when $k_1 + k_2$ is held fixed).
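As a quick worked check of these formulas, consider a pair of degrees with $k_1 + k_2 = 32$. The specific values $k_1 = 8$ and $k_2 = 24$ are chosen here only for illustration.

```latex
\begin{align*}
\mu      &= \tfrac{1}{2}(k_1 + k_2) = \tfrac{1}{2}(8 + 24) = 16, \\
\sigma^2 &= \tfrac{1}{4}(k_1 - k_2)^2 = \tfrac{1}{4}(8 - 24)^2 = 64, \\
k_1/k_2  &= 8/24 = 1/3 .
\end{align*}
```

Choosing $k_1 = k_2 = 16$ instead gives the same mean with $\sigma^2 = 0$, the regular-graph case.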

Clustering
Clustering or triadic closure, the tendency towards forming triangles, is a key feature of social networks. We studied clustering using a network model with tunable numbers of triangles and with a randomization procedure that can lower the number of triangles in an existing network. We quantify a network's clustering using transitivity $T(G)$, the fraction of possible triangles in the network which actually exist: $T(G) = 3 N_{\mathrm{triangles}} / N_{\mathrm{triads}}$, where $N_{\mathrm{triangles}}$ counts the number of triangles in the network and $N_{\mathrm{triads}}$ is the number of triads or paths of length 2.
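The transitivity definition can be computed directly by enumerating paths of length 2 and checking which are closed. The sketch below is a minimal illustration for small graphs, assuming an undirected graph stored as a dict of neighbor sets; the function name is hypothetical.

```python
from itertools import combinations


def transitivity(adj):
    """Transitivity 3 * (#triangles) / (#triads) of an undirected graph.

    adj: dict mapping node -> set of neighbours.
    """
    closed = 0   # closed triads; each triangle is counted once per corner
    triads = 0   # paths of length 2, counted by their centre node
    for centre, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):
            triads += 1
            if b in adj[a]:
                closed += 1
    return closed / triads if triads else 0.0


if __name__ == "__main__":
    # A triangle 0-1-2 with a pendant node 3 attached to node 2.
    g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(transitivity(g))
```

Counting closed triads by their center node visits every triangle three times, which supplies the factor of 3 in the definition without counting triangles separately.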
We constructed "small-world" networks using the Watts-Strogatz (WS) model [33] to tune their clustering. We generated a one-dimensional periodic lattice of N nodes with k nearest-neighbor connections, and randomly rewired lattice edges with a rewiring probability p. Varying the rewiring probability p allows us to tune the network diameter and clustering.
While the Watts-Strogatz model lets us generate networks with different clustering values, a generic challenge when assessing the impact of clustering (and other network properties) on dynamics is generating networks with tunable clustering, but for which other structural properties, such as density or diameter, can be controlled for. To study the relationship between transitivity and information flow, we apply the established degree-preserving stochastic rewiring or "x-swap" method [34][35][36], in which we repeatedly choose two links at random and two randomly selected endpoints of those links are swapped as long as the number of links does not change by swapping and the network does not become disconnected. These swaps lower transitivity while fixing the number of links and degrees of all nodes in the network. We performed 5M swaps for each real network. Examining information flow on the randomized network compared with information flow on the original network can then illustrate what effect, if any, transitivity had on information flow.
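A degree-preserving rewiring step of this kind can be sketched as follows. This is an illustrative version, not the exact procedure of [34][35][36]: it assumes an undirected simple graph stored as a dict of neighbor sets with comparable node labels, counts attempted rather than accepted swaps, and rejects any swap that would create a self-loop or multi-edge or disconnect the network. The function names are hypothetical.

```python
import random
from collections import deque


def is_connected(adj):
    """Breadth-first check that every node is reachable from the first one."""
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return len(seen) == len(adj)


def _rewire(adj, old_pairs, new_pairs):
    """Remove old_pairs and add new_pairs as undirected edges, in place."""
    for u, v in old_pairs:
        adj[u].remove(v)
        adj[v].remove(u)
    for u, v in new_pairs:
        adj[u].add(v)
        adj[v].add(u)


def x_swap(adj, n_attempts, seed=None):
    """Attempt n_attempts swaps (a-b, c-d) -> (a-d, c-b); return number accepted."""
    rng = random.Random(seed)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    accepted = 0
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick which endpoints are exchanged
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue     # would create a self-loop or a duplicate edge
        old, new = [(a, b), (c, d)], [(a, d), (c, b)]
        _rewire(adj, old, new)
        if is_connected(adj):
            edges[i], edges[j] = new
            accepted += 1
        else:
            _rewire(adj, new, old)  # undo a swap that disconnects the graph
    return accepted


if __name__ == "__main__":
    n = 12  # ring lattice in which each node links to its 2 nearest neighbours per side
    g = {i: {(i + s) % n for s in (-2, -1, 1, 2)} for i in range(n)}
    print(x_swap(g, 200, seed=0), sorted(len(nb) for nb in g.values()))
```

Every accepted swap removes two links and adds two links touching the same four nodes, so each node's degree and the total number of links are unchanged while triangles tend to be broken.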

Community Structure and Modularity
Community structure is another inherent property of social networks. It is commonly quantified using modularity [37]:
$$Q = \frac{1}{2M} \sum_{i,j} \left[ a_{ij} - \frac{k_i k_j}{2M} \right] \delta(c_i, c_j),$$
where $M$ is the total number of links, the sum runs over all pairs of nodes in the network, $A = [a_{ij}]$ is the adjacency matrix of the network, $k_i$ is the degree of node $i$, $\delta$ is the Kronecker delta, and $c_i$ denotes the community containing $i$. The community structure encoded in the $\{c_i\}$ can be found using a community detection algorithm or it may be planted within a network model. To investigate community structure within a network model, we examined instances of the stochastic block model (SBM) [38,39] with $N$ nodes and two planted blocks, or groups of nodes, denoted $A$ and $B$, of equal size $m = N/2$. Here there are two connection probabilities: $p_0$ (the within-block connection probability) and $p_1$ (the between-block connection probability), governing the probability for a link to form between nodes in the same block and in different blocks, respectively. The expected modularity in this two-block stochastic block model is
$$Q = \frac{p_0 - p_1}{2(p_0 + p_1)}.$$
Our main quantities of interest are the average cross-entropy on within-block edges, $\langle h_\times \rangle_{\mathrm{within}}$, the average cross-entropy on between-block edges, $\langle h_\times \rangle_{\mathrm{between}}$, and their difference, $\Delta \langle h_\times \rangle \equiv \langle h_\times \rangle_{\mathrm{between}} - \langle h_\times \rangle_{\mathrm{within}}$. These quantities describe to what extent information flows within and between communities.
We also computed modularity for real networks using the Louvain method [40]. The Louvain method is a hierarchical community detection algorithm that finds a partition of nodes that maximizes modularity Q. As commonly done, we initialize each node in its own community.
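For a given partition, modularity can be computed from an edge list using the equivalent per-community form $Q = \sum_c [L_c/M - (d_c/2M)^2]$, where $L_c$ is the number of links inside community $c$ and $d_c$ is the total degree of its nodes. The sketch below assumes an undirected simple graph and hypothetical function names. The closed form for the two-block SBM is the large-$N$ approximation $(p_0 - p_1)/(2(p_0 + p_1))$, stated here as an assumption that follows from the expected numbers of within- and between-block links.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected simple graph.

    edges: list of (u, v) pairs, each link listed once.
    community: dict mapping node -> community label.
    """
    M = len(edges)
    within = defaultdict(int)    # L_c: links with both ends in community c
    deg_sum = defaultdict(int)   # d_c: total degree of nodes in community c
    for u, v in edges:
        deg_sum[community[u]] += 1
        deg_sum[community[v]] += 1
        if community[u] == community[v]:
            within[community[u]] += 1
    return sum(within[c] / M - (deg_sum[c] / (2 * M)) ** 2 for c in deg_sum)


def expected_sbm_modularity(p0, p1):
    """Large-N expected modularity of a two-block SBM with equal block sizes."""
    return (p0 - p1) / (2 * (p0 + p1))


if __name__ == "__main__":
    # Two triangles joined by a single bridge between nodes 2 and 3.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    blocks = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print(modularity(edges, blocks))
    print(expected_sbm_modularity(0.3, 0.1))
```

The closed form is negative exactly when $p_0 < p_1$, which corresponds to the "anti-community" regime with more links between blocks than within them.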

Multiple Vocabulary Distributions
A recent study [41] showed that heterogeneity in the dynamical parameters can be as important as structural heterogeneity. Communities offer an obvious way to implement such heterogeneity: We also investigate a two-block SBM where we distinguish the two groups A and B by giving them different Zipf exponents α A , α B , respectively, for their vocabulary distributions.

Network Datasets
To supplement the above graph models, we also studied contagion and quoter model dynamics on real-world networks. We developed a corpus of 10 social networks spanning a range of sizes and densities that were used as the basis for simulation. See Appendix C for details on network sources and processing. Table 1 shows several descriptive statistics for the networks we analyzed.

Results
Here we compare information flow in the quoter model with traditional simple and complex contagion (Section 4.1), then investigate how degree heterogeneity (Section 4.1), clustering (Section 4.2) and network modularity (Section 4.3) affect information flow. We also study how heterogeneity in the parameters affects information flow compared to the effects of network structure (Section 4.4).

Information Flow and Models of Contagion
A distinguishing feature of simple and complex contagion is that denser networks lead to higher spreading for simple contagion and lower spreading (mostly) for complex contagion. We illustrate this difference using simulations in Figure 1A,B. For the simple and complex contagion models we use the average peak size of the outbreak as our measure of information flow in the network, whereas for the quoter model we use the average predictability over links. The decrease in spreading in complex contagion is due to its social reinforcement mechanism: it is more difficult for a contagion to spread when egos have many alters as more alters must adopt the contagion before the ego does. Yet we see in Figure 1C that the quoter model, which lacks an explicit social reinforcement mechanism, also exhibits lower information flow at higher density. Here we measure information flow using predictability on links (Section 3.2), which is functionally equivalent (Section 2.1) in our simulations to the cross-entropy $h_\times$ (Figure 1C inset). Note that while the curve for $h_\times$ looks visually similar to that of simple contagion's average peak size, it is measuring the opposite effect: higher $h_\times$ corresponds to lower information flow. These results also hold on our corpus of real-world networks (Figure 2).
Somewhat surprisingly, in Figure 1C we see that Erdős-Rényi (ER) and Barabási-Albert (BA) networks are qualitatively indistinguishable in terms of information flow, despite the preponderance of hubs in the latter that we expect would play an out-sized role in information flow. To better understand this observation, we investigated the variance of $h_\times$ over links in Figure 3A. We see that the cross-entropy varies more from link to link in the BA networks than for ER networks, indicating that hubs do not move the average information flow but do create fluctuations in the flow, especially for sparser networks.
To further explore the role of network structure heterogeneity, we investigate dichotomous networks (Section 3.4). Here half the nodes have degree k 1 and the other half have degree k 2 . Varying the degree ratio k 1 /k 2 allows us to tune the degree variance within this simplified network model. In Figure 3B we see that the total number of nodes and average degree change the average information flow while the degree heterogeneity (k 1 /k 2 ) has little effect. Yet degree heterogeneity does affect the variance of information flow ( Figure 3C). These simpler dichotomous networks show the same effects as observed previously in BA networks.
The simplified bimodal degree distribution of dichotomous networks also lets us explore the effects of ego and alter degrees by computing conditional expectations of h × conditioned on degree.
We see from the grouping of curves in Figure 3D that the degree of the ego (the node being predicted) but not the alter (the node predicting) plays a role in the information flow: degree-$k_1$ egos have more information flow than degree-$k_2$ egos regardless of the degree of the alter.
Figure 3. [...] Figure 1C. Network size has a smaller effect on $h_\times$ compared to the average degree. (C) Variance of cross-entropy versus $k_1/k_2$. Higher degree heterogeneity (lower $k_1/k_2$) leads to higher variation in $h_\times$ over links, indicating the existence of highly predictive nodes and nodes that contribute little predictive information within heterogeneous networks. (D) Dichotomous networks of size $N = 1000$ and $\langle k \rangle = 16$. Average cross-entropy over links conditioned on degrees of endpoints (predicting ego from alter). Only the degree of the ego matters, approximately, not the degree of the alter.

Interplay of Clustering and Information Flow
Next, we study how clustering (transitivity) affects information flow. Clustering plays a complicated role in both simple and complex contagion [25,27] and we report interesting, if mixed, results in Figure 4 with the quoter model's information flow.
First, in Figure 4A we study information flow for small-world networks that are randomly rewired to remove clustering [33]. Regardless of network size or average degree, information flow decreases (higher h × in top panel of Figure 4A) as clustering decreases ( Figure 4A bottom panel). Please note that rewiring also changes the diameter of the small-world network, but we see that the main increase in h × occurs when clustering begins to drop. In small-world networks, clustering tends to promote information flow.
Next, in Figure 4B we investigate transitivity in the corpus of real-world networks. For each network, we compute information flow on the original network and on a replicate of the network that is randomized by the "x-swap" method. The x-swap method lowers transitivity for all networks but for half of the networks it also lowers $h_\times$, contradicting the previous results on small-world networks by indicating that transitivity inhibits information flow. However, it is challenging to draw a sharp conclusion from this x-swap procedure as it also affects other network properties simultaneously. We illustrate this in Figure 4C where we compare four network properties in the original and x-swapped networks. X-swapping affects transitivity but also average shortest path length (ASPL), modularity and assortativity (degree correlations). This means the changes in information flow seen in Figure 4B may be due to changes in a combination of these (and possibly other) network properties. Unfortunately, it remains an open research problem how best to systematically control for network properties to uncover their effects on dynamics.

Community Structure and the Weakness of Long Ties
The effects of long-range links on information flow have been investigated for some time, from the seminal "strength of weak ties" [30] and the contrasting "weakness of long ties" in complex contagion [29]. Here we investigate long ties in the context of community structure: In networks with densely connected groups of nodes, long ties act to bridge nodes in different groups. How does information flow differ between groups compared to flow within groups?
Using the stochastic block model (Section 3.4) with two groups of equal size as a model for networks with dense modules, we study in Figure 5 information flow between and within groups. The two-group SBM is parameterized by two connection probabilities, the probability for a link within each group (p 0 ) and the probability for a link between the two groups (p 1 ). In Figure 5A we see that information flow decreases as p 0 increases and the network becomes denser. Likewise, the difference in information flow ∆h × increases due to between-block links containing less predictive information ( Figure 5B). This supports the well-known "weakness of long ties" feature of complex contagion. For larger values of p 1 , when there are more links connecting the groups making them less distinct, this difference decreases. The collapse of curves in Figure 5C indicates ∆h × is entirely predicated on the network modularity Q.
Interestingly, we also remark that $\Delta \langle h_\times \rangle$ is always positive, even when $p_0 < p_1$ (equivalently, $Q < 0$). In this "anti-community" regime of the SBM, where there are more links between groups than within groups, we would expect more information flow between groups than within, yet we observe a weak effect in the opposite direction. Examining $\Delta \langle h_\times \rangle$ as a function of modularity $Q$ shows a clear collapse across values of the SBM probabilities. Anti-community structure ($Q < 0$) still leads to positive $\Delta \langle h_\times \rangle$, indicating that information flow is still more prevalent within blocks.

The Role of Dynamic Heterogeneity
In our results so far, we have treated nodes as identical within the quoter model and focused only on their topological differences within the network. Yet recent studies have underlined the importance of comparing dynamic heterogeneity with structural heterogeneity [41]. Here we take an exploratory step in this direction by considering a generalization of the quoter model where nodes have different vocabulary distributions.
We explored how information flow changes in the stochastic block model when the nodes in the two blocks have different vocabulary distributions. This is intended to model a difference in the nodes between the two groups, capturing in the quoter model a social homophily in how egos write. Specifically, we assume they have the same vocabularies and follow Zipf distributions, but the exponent of the Zipf distribution is different: nodes in block $A$ have exponent $\alpha_A$ and nodes in block $B$ have exponent $\alpha_B$. A larger $\alpha$ (steeper distribution) corresponds to a less diverse vocabulary, and could capture a group of nodes that is more consistent and repetitive in their dialog. In contrast, a lower $\alpha$ (shallower distribution) may describe a group of nodes that uses more diverse words. Figure 6 shows how information flow changes when the two blocks have different vocabulary distributions (Figure 6A,C) compared with the same distribution (Figure 6B). For illustration, we show the Zipfian vocabulary distributions for the two groups as insets in Figure 6. We observe a much larger trend in how cross-entropy changes with modularity when the exponents are not equal compared to when they are equal. This underscores how structural features (the degree of modularity) greatly magnify the effects of intrinsic dynamic heterogeneity (different vocabulary distributions). While modularity plays a role even when the two groups have identical vocabulary distributions (Figure 5), this difference is challenging to detect in Figure 6B when viewed on the scale of groups with different vocabulary distributions (Figure 6A,C).

Discussion
In this work, we have studied how the social flow of written information can be affected by network properties such as the density of links, preponderance of triangles, and modular or community structure. We focused on the quoter model, a toy model in which a network of individuals communicates by generating text sequences, and applied information-theoretic estimators of information flow to these texts. We compared results of information flow in the quoter model with traditional simple and complex contagion models.
A particularly intriguing facet of the interplay between quoter model dynamics and network topology is how the quoter model exhibits both the density-driven inhibition of information flow and the weakness of long ties that are signatures of complex contagion, despite lacking an explicit mechanism of social reinforcement. Social reinforcement, the idea that individuals adopt a piece of information only after receiving repeat exposure from social ties, is considered one of the characteristics that distinguishes complex contagion from epidemic spreading. Social reinforcement mechanisms better model how people perceive and react to information. Yet we found here that social reinforcement is not strictly necessary when modeling a more nuanced view of information flow. In particular, considering text streams (as generated by the quoter model) and predictive measures of information flow (as quantified using cross-entropy estimators) allows us to capture how information can be "drowned out" by the increased "cross-talk" that occurs in denser networks, showing how increased density can inhibit information flow. Further pursuing this line of investigation may give more insight into information flow and even human behavior within social networks.
We also found mixed results relating clustering to information flow. For small-world (Watts-Strogatz) networks, increasing the clustering leads to a significant increase in information flow (decrease in cross-entropy). At the same time, however, experiments on real-world networks showed the opposite effect: randomizing networks to lower transitivity while preserving connectedness and the degree distribution leads to a decrease in information flow. This well-established randomization procedure, however, does not control for other network properties such as modularity or average shortest path length, so it remains an open question whether the interplay of multiple effects may resolve the discrepancy between these results.
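The randomization referred to above can be sketched with double-edge swaps: two edges a-b and c-d are rewired to a-d and c-b, which leaves every degree unchanged, and a swap is rejected if it would disconnect the network. The code below is a minimal pure-Python sketch under these assumptions, not the implementation used for the experiments; the ring-lattice test graph, the number of swaps, and all function names are hypothetical.

```python
import random
from itertools import combinations


def ring_lattice(n, half_k):
    """Ring of n nodes, each linked to its half_k nearest neighbors per side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, half_k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def is_connected(adj):
    """Depth-first search from an arbitrary node reaches every node."""
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for nbr in adj[stack.pop()]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(adj)


def transitivity(adj):
    """Fraction of connected triples that are closed (3 * triangles / triples)."""
    closed = triples = 0
    for nbrs in adj.values():
        for u, v in combinations(nbrs, 2):
            triples += 1
            closed += v in adj[u]
    return closed / triples if triples else 0.0


def randomize(adj, n_swaps, rng):
    """Degree-preserving double-edge swaps that keep the graph connected."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # choose one of the two possible rewirings
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue  # would create a self-loop or a multi-edge
        for x, y in ((a, b), (c, d)):
            adj[x].remove(y)
            adj[y].remove(x)
        for x, y in ((a, d), (c, b)):
            adj[x].add(y)
            adj[y].add(x)
        if is_connected(adj):
            edges[i], edges[j] = (a, d), (c, b)
            done += 1
        else:  # undo a swap that disconnected the graph
            for x, y in ((a, d), (c, b)):
                adj[x].remove(y)
                adj[y].remove(x)
            for x, y in ((a, b), (c, d)):
                adj[x].add(y)
                adj[y].add(x)
    return adj


lattice = ring_lattice(60, 2)
shuffled = randomize(lattice, 600, random.Random(1))
print(transitivity(lattice), transitivity(shuffled))
```

The swaps constrain only degrees and connectedness, so quantities such as modularity or average shortest path length are free to change, which is the caveat raised in the text.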
Another interesting result related information flow to community structure, with the modularity Q used to measure the strength of the modular divide. When Q > 0, meaning there were fewer links between modules than expected, we found in Figure 5 an increase in cross-entropy between modules compared with the cross-entropy between nodes that share a module, as expected by the "weakness of long ties". However, we found the same increase in cross-entropy when Q < 0, where there were more links between modules than expected. We would initially expect this regime of "anti-community" structure to have more information flow between modules as there exist more links to facilitate this flow. One possible reason for this anti-community result is that nodes in the same group, while having fewer direct links to one another, may have many links to common nodes in the other group, leading to more similar inputs to their texts. This nonlocal interplay of information flow and network structure is an intriguing avenue for future work.
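To make the two regimes of Q concrete, the sketch below generates a two-block stochastic block model and evaluates the standard Newman modularity of the planted partition, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of links inside block c, d_c is the total degree of block c, and m is the number of links. The block size, the link probabilities `p_in` and `p_out`, and the function names are assumptions chosen for illustration only.

```python
import random


def two_block_sbm(n_per_block, p_in, p_out, rng):
    """Edge list and block labels for a two-block stochastic block model."""
    n = 2 * n_per_block
    block = [0] * n_per_block + [1] * n_per_block
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if block[u] == block[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return edges, block


def modularity(edges, block):
    """Newman modularity Q of the partition given by the block labels."""
    m = len(edges)
    n_blocks = max(block) + 1
    intra = [0] * n_blocks       # links with both ends in block c
    degree_sum = [0] * n_blocks  # total degree of nodes in block c
    for u, v in edges:
        degree_sum[block[u]] += 1
        degree_sum[block[v]] += 1
        if block[u] == block[v]:
            intra[block[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in range(n_blocks))


rng = random.Random(2)
# Community structure: more links within blocks than between them.
q_community = modularity(*two_block_sbm(50, 0.20, 0.02, rng))
# "Anti-community" structure: more links between blocks than within them.
q_anti = modularity(*two_block_sbm(50, 0.02, 0.20, rng))
print(q_community, q_anti)
```

With these illustrative probabilities the first network has Q > 0 and the second has Q < 0, corresponding to the two regimes discussed above.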
There are some important limitations to discuss regarding this work. We only considered undirected, unweighted networks. In the context of social networks, this implies all relationships are reciprocal and equal in strength. Future work should extend to directed, weighted networks. Furthermore, a more exhaustive study of the robustness of results to parameter choices is necessary (we take a first step towards this in Appendix A). Vocabulary size is another parameter worth exploring; here we assume it is constant across all nodes. Likewise, cross-entropy (Equation (2)) is a somewhat simplistic information-theoretic measure of information flow, and it is important to consider more advanced measures. Measures such as transfer or causation entropy can offer more insight, quantifying non-redundant information and allowing us to identify indirect influences [7,8]. However, in the context of time-ordered social text data, it is challenging to estimate conditional entropies, making it non-obvious how to implement such measures [12]. Finally, while we observed several features that are signatures of complex contagion, not all features of complex contagion are exhibited by the quoter model. For example, there is an optimal modularity that maximizes spreading of complex contagions within the stochastic block model: if Q is either too small or too large then the contagion will not spread [42]. We were unable to observe a corresponding feature within the quoter model. This warrants further investigation, in particular to understand if this is due to how the quoter model differs from complex contagion models, or if it is due to the information-theoretic measure of information flow, or a combination of the two.
In general, contagion models are a successful way to study information flow in social networks, but to gain more insight it is necessary to adopt more nuanced views of information flow. We argue here that information theory can provide a pathway towards these insights, especially when combined with models such as the quoter model that capture features of human behavior while also modeling key aspects of the data being generated by social network platforms.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Further Exploring Quoter Model Parameters
To support our results, here we explore other choices of quoter model parameters (q and λ). The simulations are done on smaller networks to make it less computationally expensive to do a wide sweep of the parameter space. We first simulate the quoter model on ER, BA, and small-world networks for q ∈ {0.1, 0.5, 0.9} and vary k or the rewiring probability, p, to support results from Section 4.1 and Section 4.2. We then simulate the ER, BA, and small-world experiments again for various combinations of the quote probability q and mean quote length λ. We evaluate the robustness of results for ER networks as follows. For each combination of (q, λ), we calculate the difference $\langle h_\times \rangle_{k=20} - \langle h_\times \rangle_{k=6}$, where by $\langle h_\times \rangle_{k=20}$ we mean the average cross-entropy on ER networks of average degree k = 20. The quantity will be positive if density inhibits information flow. This allows us to assess how the magnitude of our results varies with (q, λ), although it does not confirm that a monotonic trend holds. We repeat these calculations with the BA networks and extend them to the small-world networks by replacing k with p ∈ {0, 1}. In general, we find in Figures A1 and A2 that our results are qualitatively robust to parameter choices, with the exception of very small values of q, as we expect.
Figure A2. Effects of quoter model parameter choices on observed trends. Information flow is lower for denser ER and BA networks across a range of q and λ with the effect being more pronounced at higher values of q and λ. Likewise, for small-world networks, more clustering (lower p) exhibits higher $h_\times$ than less clustering (higher p), with the effect being most pronounced at q > 0.5 regardless of λ. Here, ER & BA networks had N = 100 and small-world networks had N = 200 and k = 6. Each cell constitutes 100 simulations.
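To make the roles of the quote probability q and the mean quote length λ concrete, the following Python sketch simulates quoter-style dynamics on a small network. It is a simplified illustration under stated assumptions rather than the simulation code behind the figures: we assume random sequential updates, a uniformly chosen neighbor, a quote copied from a uniformly chosen position in that neighbor's past text, Poisson-distributed segment lengths (with a minimum of one word) for both quoted and new text, and a Zipf vocabulary. The function names, the ring network, and all parameter values are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_quoter(adj, q, lam, vocab_size, alpha, steps, rng):
    """Return a dict mapping each node to the word sequence it has written."""
    words = range(1, vocab_size + 1)
    weights = [r ** (-alpha) for r in words]  # unnormalized Zipf weights
    nodes = sorted(adj)
    texts = {u: [] for u in nodes}
    for _ in range(steps):
        ego = rng.choice(nodes)
        length = max(1, poisson(lam, rng))
        source = texts[rng.choice(sorted(adj[ego]))]
        if source and rng.random() < q:
            # Quote: copy a contiguous segment of the neighbor's past text.
            start = rng.randrange(len(source))
            new_words = source[start:start + length]
        else:
            # Otherwise write new text drawn from the vocabulary distribution.
            new_words = rng.choices(words, weights=weights, k=length)
        texts[ego].extend(new_words)
    return texts


# Hypothetical example: a ring of 10 nodes, q = 0.5, lam = 3.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
texts = simulate_quoter(ring, q=0.5, lam=3.0, vocab_size=50, alpha=1.5,
                        steps=2000, rng=random.Random(3))
print({u: len(t) for u, t in texts.items()})
```

A sweep such as the one described above would repeat this simulation over a grid of (q, λ) values on networks of different density and compare the resulting average cross-entropies.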

Appendix B. Summarizing h ×
In this work, we summarized $h_\times$ by its mean $\langle h_\times \rangle$ and variance $\mathrm{Var}(h_\times)$. In Figure A3, we see that this choice was appropriate: examining the distributions of $h_\times$ for various networks shows that they are approximately normal. We also find the mean and median $h_\times$ to be approximately equal.
Figure 4A. Real-world networks are from 300 simulations as in Figures 2 and 4B,C. Quoter model parameters are given in Section 3.1.