Estimating the Information Source under Decaying Diffusion Rates

Recently, with the rise of online social network services such as Facebook and Twitter, people are more actively using social networks to exchange new information. In this context, finding the information source becomes an indispensable task for detecting a malicious agent or an influential person in the network. A seminal work by Shah and Zaman in 2010 showed that the detection probability cannot exceed 31%, even for regular trees, as time goes to infinity. Since that study, extensive research has been done on this problem, with the major interests being the construction of an efficient estimator and the theoretical analysis of its detection performance. However, most of these works assumed a homogeneous diffusion rate, i.e., a rate that does not change over time or across the network. In practice, it is reported that information has a lifetime and becomes less attractive over time. In this paper, we study the problem of detecting the information source when the diffusion rate decreases with the distance from the source in the network. As a result, we obtain the analytical detection performance of the Maximum Likelihood Estimator (MLE) and validate our theoretical findings over regular trees, random graphs, and real-world networks.


Introduction
In the era of big data that came with the rapid development of the Internet, information spreading is ubiquitous, including the propagation of infectious diseases, technology diffusion, computer virus/spam infection on the Internet, and the sharing of popular topics on Facebook. Information source estimation is the problem of identifying the initial seed node of the diffused information in the network. This is clearly of practical importance, because harmful diffusion can be mitigated or even blocked, e.g., by vaccinating humans or installing security updates. In the seminal work by Shah and Zaman [1], it is shown that in regular tree topologies, the detection probability cannot be above 31% under the Maximum Likelihood Estimator (MLE), and even worse, in other realistic topologies such as Facebook graphs, scale-free graphs and Internet Autonomous System (AS) graphs, the detection probability is less than 5% under the MLE-based heuristic due to the complex structure of these networks. Since then, extensive research attention has been paid to this problem for various network topologies and diffusion models [1][2][3][4][5][6][7][8], with the major interests being the construction of an efficient estimator and the theoretical analysis of its detection performance. Prior work directly or indirectly concludes that finding the information source (we use the terms "information" and "rumor" interchangeably throughout the paper) is a challenging task unless sufficient side information or multiple diffusion
To the best of our knowledge, this is the first work to consider a decaying diffusion rate in the information source detection problem.
Our main contributions are summarized in what follows.
• First, we use the MLE to find the source under decaying diffusion and show that the MLE is the same as that of the homogeneous diffusion rate if the diffusion decays with respect to the distance from the source. This implies that the MLE of our model has the same graphical centrality property, called the rumor center in [1]. This enables us to analyze the detection performance for the decaying rate scenario.
• Second, we define two exponential decaying models: (a) simple exponential decay and (b) generalized exponential decay. The simple exponential decay is a kind of light-tail distribution, whereas the generalized exponential decay covers both light-tail and heavy-tail decaying patterns. We then obtain the closed form of the detection probability of the MLE when the underlying graph is a regular tree for both decaying models. Unlike the prior result in [1], the detection probability is larger than zero in the line graph and there is a non-negligible improvement in detection for any degree of a regular tree.
• Third, we consider the case in which the decaying model parameter is hidden for the two decaying models above. This is a more realistic scenario because knowing the exact parameter of the model is not easy in practice. To do that, we first derive the MLE of this parameter and show that it needs exponential computing time. Hence, we design a heuristic estimation algorithm for the true parameter that uses the diffusion snapshot information.
• Finally, we validate our theoretical result for the regular tree using the MLE, and for popular random graphs (Erdös-Rényi, scale-free and small-world graphs) and real-world networks (US power grid, Facebook and Wiki-vote) using the heuristic Breadth-First Search (BFS) estimator.
As a result, we see that the detection probability can be above 80% for the regular tree; in the Facebook graph, it can be above 30% if the diffusion rate decays, whereas it is about 20% without decay.
The remainder of this paper is organized as follows. Section 2 discusses related literature. In Section 3, we introduce the information diffusion model and estimator. The theoretical results for detection performance of decaying rate will be presented in Section 4, and the corresponding proof will be provided in Section 5. In Section 6, we depict the simulation results and conclude the paper in Section 8. The detailed proof will be presented in the Appendix A.

Related Work
The research on information source detection has recently received significant attention. We divide them into the following three categories: (i) single source estimation, (ii) multiple sources estimation, and (iii) hiding and seeking the sources.
(i) Single source estimation. The first theoretical approach was taken by Shah and Zaman [1][2][3], who introduced a metric called rumor centrality, a simple topology-dependent metric for a given diffusion snapshot. They called a node with maximum rumor centrality the rumor center and used it as the MLE. They proved that the rumor centrality describes the likelihood function when the underlying network is a regular tree and the diffusion follows the Susceptible-Infected (SI) model, which was extended to random graph networks in [2]. Zhu and Ying [4] solved the source detection problem under the Susceptible-Infected-Removed (SIR) model with a sample path approach based on the Jordan center, which was extended to the case of sparse observations in [5]. There were several attempts to boost the detection probability. Wang et al. [6] showed that observing multiple different epidemic instances can significantly increase the detection probability. Dong et al. [7] assumed that there exists a restricted set of source candidates and showed an increased detection probability based on the Maximum a Posteriori Estimator (MAPE). Choi et al. [8] showed that anti-rumor spreading under some distance distribution of rumor and anti-rumor sources helps to find the rumor source by using the MAPE. Choi et al. [9,10] studied the effect of querying on finding the source and showed how many queries are sufficient to achieve a target detection probability. The authors in [12,13] introduced the notion of set estimation and provided analytical results on the detection performance. Luo et al. [14] considered the problem of estimating an infection source for the SI model in which not all infected nodes can be observed.
(ii) Multiple sources estimation. Different from single source estimation, multi-source estimation requires inferring the set of source nodes. Despite the difficulty of the problem, some prior studies tried to solve it by appropriate set estimation methods. Prakash et al. [15] proposed to employ the Minimum Description Length (MDL) principle to identify the best set of seed nodes and virus propagation ripple, which describes the infected graph most succinctly. They proposed a highly efficient algorithm to identify likely sets of seed nodes given a snapshot and showed that it can optimize the virus propagation ripple in a principled way by maximizing the likelihood. Zhu et al. [16] proposed a new source localization algorithm, named Optimal-Jordan-Cover (OJC). The algorithm first extracts a subgraph using a candidate selection algorithm that selects source candidates based on the number of observed infected nodes in their neighborhoods. Then, in the extracted subgraph, OJC finds a set of nodes that cover all observed infected nodes with the minimum radius. Considering heterogeneous SIR diffusion in the ER random graph, they proved that OJC can locate all sources with probability one asymptotically with partial observations. Ji et al. [17] developed a theoretical framework to estimate rumor sources, given an observation of the infection graph and the number of rumor sources.
(iii) Hiding and seeking the source. As opposed to finding the information source from a given snapshot of the epidemic, the problem of hiding the source has also been studied. Fanti et al. [18] first considered this problem and proposed Adaptive Diffusion (AD) as the information spreading protocol. They showed that AD is near-optimal for hiding the source as well as maximizing the information spreading on regular tree structures. Luo et al. [19] considered a problem in which an information source tries to hide while maximizing the spread of its information, whereas the network adversary simultaneously seeks the source. They formulated this problem as a game-theoretic model and showed that a Nash Equilibrium (NE) exists under some mild conditions on the game.
To the best of our knowledge, our paper is the first to quantitatively consider the decay of diffusion power, which is a more realistic scenario than the homogeneous diffusion rate. We find that the source can be found more easily when the diffusion rate decays with respect to the distance from the source, for real-world graphs as well as tree structures.

Model and Estimator
We consider an undirected graph $G = (V, E)$, where $V$ is a countably infinite set of nodes and $E$ is the set of edges of the form $(i, j)$ for $i, j \in V$. Each node represents an individual in a human social network or a computer host in the Internet, and each edge corresponds to a social relationship between two individuals or a physical connection between two Internet hosts [9]. As an information spreading model, we consider the SI model as in [1,2,4], where each node is in either of two states: susceptible or infected. In this model, once a node has the information, it keeps it forever, i.e., no node recovers. All nodes are initialized to be susceptible except the information source, and once a node $i$ has the information, it is able to spread it to another node $j$ if and only if there is an edge between them. We denote by $\tau_{ij}$ the random time it takes for node $j$ to receive the information from node $i$ once $i$ has it. We denote by $v^* \in V$ the information source, which initiates the diffusion, and by $V_N \subset V$ the set of $N$ infected nodes in the observed snapshot $G_N \subset G$. In this paper, we consider the case in which $G$ is a regular tree and our interest is in the regime where sufficient time has passed, as in many prior works [1,2,6,7]. Even though a real network is unlikely to be a regular tree, it is shown that many random graphs can be approximated by a regular tree due to their locally tree-like structure [20]. Note that most prior works have focused on the case in which every edge $e = (u, v)$ has the same diffusion rate (or probability), say $\lambda > 0$. In this work, however, we assume that the diffusion rate of every edge $e = (u, v)$ is a function of the distance from the information source $v^* \in V$. We assume that the $\tau_{ij}$ are independent and exponentially distributed with parameter $\lambda_h$, where $h$ is the number of hops of node $i$ from the source.
Hence, the diffusion rate depends only on the distance to the source in the graph. Further, we assume that the diffused information carries a message stating how many hops ($h$) it has passed from the source. (Using this message, each infected node spreads the information to its neighbors at the diffusion rate $\lambda_h$.)
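The diffusion model above can be sketched as a short simulation. This is a minimal sketch under stated assumptions, not the authors' simulator: it only tracks the order of infections (the embedded jump chain), using the fact that with independent exponential clocks the next boundary edge to fire is chosen with probability proportional to its rate. The function name `simulate_si_regular_tree`, the integer node labels, and the `rate` callable are hypothetical names introduced for illustration.

```python
import random

def simulate_si_regular_tree(d, n_infected, rate, seed=None):
    """Grow an SI infection on an infinite d-regular tree until n_infected nodes.

    rate(h) is the diffusion rate on edges leaving a node that is h hops
    from the source (node 0). Returns (parent, hop) dictionaries.
    """
    rng = random.Random(seed)
    parent, hop = {0: None}, {0: 0}
    # One entry per boundary edge: the infected endpoint and the edge rate.
    boundary = [0] * d            # the source has d susceptible neighbours
    weights = [rate(0)] * d
    while len(parent) < n_infected:
        # Memoryless clocks: pick the next edge proportionally to its rate.
        i = rng.choices(range(len(boundary)), weights=weights)[0]
        u = boundary[i]
        # Remove the chosen edge from the boundary (swap with last, then pop).
        boundary[i], weights[i] = boundary[-1], weights[-1]
        boundary.pop()
        weights.pop()
        v = len(parent)           # label of the newly infected node
        parent[v], hop[v] = u, hop[u] + 1
        # The new node exposes its d - 1 remaining neighbours.
        boundary.extend([v] * (d - 1))
        weights.extend([rate(hop[v])] * (d - 1))
    return parent, hop

# Example: a rate that halves with every hop (an assumed decaying rate).
parent, hop = simulate_si_regular_tree(3, 200, lambda h: 2.0 ** (-h), seed=0)
```

A constant `rate` recovers the homogeneous model, so the same sketch can be used to compare snapshots with and without decay.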

Maximum Likelihood Estimator.
As an estimator of the source, we consider the MLE for the observed graph (snapshot) $G_N$ when there are $N$ infected nodes in the network, i.e., the MLE is the node $v$ that maximizes the likelihood $P(G_N | v)$ of the diffusion snapshot $G_N$. Instead of directly computing the likelihood, which is quite difficult due to the heterogeneous diffusion rate, we consider the following proposition, which guarantees a useful graphical characterization of the MLE.
Proposition 1. Let $V_{ml}$ be the set of MLEs for the homogeneous diffusion rate (we consider a set of MLEs because there can be multiple such nodes) and let $V_{ml}(h)$ be the set of MLEs for (1) on the regular tree. Then we have $V_{ml} = V_{ml}(h)$.
The proof of Proposition 1 will be presented in Section 5. This proposition indicates that, for a given regular tree, the MLE of our decaying diffusion model is the same as that of the homogeneous diffusion rate. Consequently, we have the same graphical property of the "rumor center" as described in [1], which is one of the graph centrality concepts. In more detail, let $T^v_u$ be the number of nodes in the subtree rooted at node $u$, assuming $v$ is the root of the tree $G_N$ (see [1] for details). Then the rumor center has the following property.

Corollary 1.
Under the $d$-regular tree $G$, for a given observation $G_N$ with $N$ infected nodes, the node $v$ is an MLE of (1) if and only if $T^v_u \le N/2$ for every neighbor $u$ of $v$.
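The condition in Corollary 1 can be checked directly on a snapshot. The following is a minimal sketch, assuming the snapshot $G_N$ is given as an adjacency dictionary of a tree; the function name `rumor_centers` is hypothetical. It computes subtree sizes once from an arbitrary root and then evaluates, for every node $v$, the size of each neighboring subtree as seen from $v$.

```python
def rumor_centers(adj):
    """Return all nodes v of a tree such that every subtree hanging off a
    neighbour of v (with the tree rooted at v) has at most N/2 nodes."""
    n = len(adj)
    root = next(iter(adj))
    # BFS from an arbitrary root to get parents and a top-down order.
    parent, order = {root: None}, [root]
    for u in order:
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)
    # Subtree sizes with respect to the arbitrary root (bottom-up pass).
    size = {u: 1 for u in adj}
    for u in reversed(order[1:]):
        size[parent[u]] += size[u]
    centers = []
    for v in adj:
        worst = 0
        for w in adj[v]:
            # If w is a child of v, its subtree is size[w]; otherwise w is
            # v's parent and the subtree seen from v has n - size[v] nodes.
            t = size[w] if parent.get(w) == v else n - size[v]
            worst = max(worst, t)
        if 2 * worst <= n:
            centers.append(v)
    return centers

# A path on four nodes has two rumor centers (the two middle nodes).
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
centers = rumor_centers(path4)
```

When two nodes satisfy the condition, both are MLEs, which is the tie case that appears later in the analysis of the line graph.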
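For comparison, we first present the detection probability of the MLE for the homogeneous diffusion rate [2] as follows.
Lemma 1 ([2]). For the $d$-regular tree ($d \ge 3$) with the homogeneous diffusion rate, the detection probability of the MLE satisfies
$$\lim_{t \to \infty} P(v_{ml} = v^*) = d \, I_{1/2}\left(\frac{1}{d-2}, \frac{d-1}{d-2}\right) - (d - 1),$$
where $I_x(\alpha, \beta)$ is the incomplete Beta function with parameters $\alpha$ and $\beta$. (The incomplete Beta function $I_x(\alpha, \beta)$ is the probability that a Beta random variable with parameters $\alpha$ and $\beta$ is less than $x \in [0, 1]$, whose form is
$$I_x(\alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} \int_0^x t^{\alpha - 1} (1 - t)^{\beta - 1} \, dt,$$
where $\Gamma(\cdot)$ is the standard Gamma function [1].)
Using Lemma 1, one can check that the detection probability of the MLE under the homogeneous diffusion rate is at most 0.307 in the asymptotic case for $d$-regular trees. In the following section, we define the decaying models of interest and obtain the detection probabilities under these models.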
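The bound of 0.307 can be checked numerically. The sketch below assumes the closed form $d \, I_{1/2}(1/(d-2), (d-1)/(d-2)) - (d-1)$ for the homogeneous case from [2]; the helper names `reg_inc_beta` and `homogeneous_detection` are hypothetical. To stay within the standard library, the incomplete Beta function is evaluated with Simpson's rule after the substitution $t = u^{1/\alpha}$, which removes the singularity of $t^{\alpha-1}$ at zero; a library routine could be swapped in instead.

```python
import math

def reg_inc_beta(x, a, b, n=20000):
    """Regularized incomplete Beta I_x(a, b) for a > 0, b > 0, 0 <= x < 1.

    Uses t = u**(1/a), so the integral becomes
    (1/a) * int_0^{x**a} (1 - u**(1/a))**(b - 1) du  (n must be even).
    """
    upper = x ** a
    h = upper / n

    def g(u):
        return (1.0 - u ** (1.0 / a)) ** (b - 1.0)

    s = g(0.0) + g(upper)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(k * h)
    integral = s * h / 3.0 / a
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

def homogeneous_detection(d):
    """Asymptotic detection probability on a d-regular tree (d >= 3),
    assuming the closed form quoted in the lead-in."""
    a = 1.0 / (d - 2)
    b = (d - 1.0) / (d - 2)
    return d * reg_inc_beta(0.5, a, b) - (d - 1)

values = {d: homogeneous_detection(d) for d in (3, 4, 5, 10, 50)}
```

For $d = 3$ the parameters are $(1, 2)$ and the value is exactly $1/4$; as $d$ grows the values stay below $1 - \ln 2 \approx 0.307$.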

Main Results
In this section, we first obtain the asymptotic detection probability (when t → ∞) for two decaying models with known model parameters. Next, we consider a case that the model parameter is hidden (unknown) so that we need to estimate it. To do that, we suggest a simple and efficient parameter estimation algorithm.

Probability of Correct Detection of MLE
As a first decaying model, we define a simple decay function. To do this, we introduce a decaying parameter $p > 1$, which indicates how much the diffusion rate decays with the number of hops from the source. Using this, we have the following definition.
Definition 1 (Simple Exponential Decay). Let $\lambda_0$ be the initial diffusion rate from the information source $v^*$. We call the rate function $\lambda_h(p) = \lambda_0 p^{-h}$ the simple exponential decay w.r.t. the source with the decaying parameter $p > 1$.
Using this definition, we first obtain the detection probability of the MLE when the underlying graph is a line. To do that, we let $P(v_{ml}(p) = v^*)$ be the detection probability of the information source by the MLE for the simple exponential decay. Then we have the following result.

Theorem 1.
For the line graph with the decaying parameter $p > 1$, the detection probability of the information source by the MLE when $t$ goes to infinity is given by where $M(p)$ :
In this result, we check that if $p = 2$, the detection probability is larger than 7/16 by using an integral approximation of $M(p)$. Further, we see that the detection probability increases as $p$ increases. This is a significant enhancement of detection compared to the homogeneous diffusion rate, where the detection probability is zero for the line graph [2].
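The claim for the line graph can be explored with a small exact computation. The sketch below is not the formula of Theorem 1: it conditions on the number of infected nodes $N$ instead of on time $t$, and it assumes the rate $\lambda_h = \lambda_0 p^{-h}$ and a uniform tie-break between two rumor centers. Under these assumptions, with $n_1$ and $n_2$ infected nodes on the two sides of the source, the next infection falls on side 1 with probability $p^{-n_1}/(p^{-n_1} + p^{-n_2}) = 1/(1 + p^{k})$ where $k = n_1 - n_2$, so the difference $k$ is itself a Markov chain. The function name `line_detection_fixed_n` is hypothetical.

```python
def line_detection_fixed_n(p, n_nodes):
    """Probability that the rumor center of a line snapshot with n_nodes
    infected nodes is the source, with ties between two centers broken
    uniformly (an assumption of this sketch)."""
    dist = {0: 1.0}                       # distribution of k = n1 - n2
    for _ in range(n_nodes - 1):
        new = {}
        for k, q in dist.items():
            if q == 0.0:                  # skip underflowed states
                continue
            up = 1.0 / (1.0 + p ** k)     # next infection lands on side 1
            new[k + 1] = new.get(k + 1, 0.0) + q * up
            new[k - 1] = new.get(k - 1, 0.0) + q * (1.0 - up)
        dist = new
    # k = 0: unique center is the source; |k| = 1: source is one of two.
    return dist.get(0, 0.0) + 0.5 * (dist.get(1, 0.0) + dist.get(-1, 0.0))

odd_n = line_detection_fixed_n(2.0, 101)
even_n = line_detection_fixed_n(2.0, 100)
no_decay = line_detection_fixed_n(1.0, 401)
```

For $p = 2$ this gives roughly 0.61 when $N$ is odd and roughly 0.46 when $N$ is even, both above 7/16, while for $p = 1$ (no decay) the value shrinks toward zero as $N$ grows.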
Next, we obtain the source detection probability for the $d$-regular tree ($d \ge 3$) as follows.
Theorem 2.
For the $d$-regular tree ($d \ge 3$) with the decaying parameter $p \ge d - 1$, the MLE can detect the source with probability one as $t$ goes to infinity, i.e., $\lim_{t \to \infty} P(v_{ml}(p) = v^*) = 1$.
This result indicates that for the $d$-regular tree ($d \ge 3$), if the decaying parameter of the information is at least $d - 1$, the MLE can detect the source almost surely even as time goes to infinity. This is because strong decay makes the diffusion snapshot denser around the source, so that the MLE can find it easily. (Since the MLE is a rumor center under the decaying model, the estimator has the largest centrality in the graph.) Next, we introduce a more general exponential decaying model in which a single parameter $r$, the decay level, interpolates between the simple exponential decay and the homogeneous rate.
Definition 2 (Generalized Exponential Decay). Let $\lambda_0$ be the initial diffusion rate of the information source $v^*$. We call the rate function $\lambda_h(p, r)$ the generalized exponential decay with respect to the source if where $0 \le r \le p - 1$ is the decay level.
We see that this form is parameterized by $r$. For example, if $r = 0$ then it becomes the simple exponential decay as in Definition 1. If $r = p - 1$ then it becomes $\lambda_0$, which is the homogeneous diffusion rate. We plot the rate function $\lambda_h(p, r)$ for various values of the parameters $(p, r)$ in Figure 2. Using this, we obtain the following result.
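Theorem 3. Let $P(v_{ml}(p, r) = v^*)$ be the detection probability of the information source by the MLE under the generalized exponential decay $\lambda_h(p, r)$ in (5). Then, for the $d$-regular tree ($d \ge 3$) with $p = d - 1$ and $0 \le r \le d - 2$ (interpreted as the limit $r \to 0$ when $r = 0$),
$$\lim_{t \to \infty} P(v_{ml}(p, r) = v^*) = 1 - d \left(1 - I_{1/2}\left(\frac{1}{r}, \frac{d-1}{r}\right)\right). \quad (6)$$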
The result implies that, for a given decay level $r$, the asymptotic detection probability of the MLE is given by (6). For example, if $r = 0$ then $\lim_{t \to \infty} P(v_{ml}(p, r) = v^*) = 1$, which is the result in Theorem 2, and if $r = d - 2$ then the detection probability becomes the result in [1], which is the case of the homogeneous diffusion rate.
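To make the role of the decay level $r$ concrete, the sketch below implements one rate function that is consistent with the properties stated for the generalized exponential decay: it reduces to $\lambda_0 p^{-h}$ at $r = 0$, to $\lambda_0$ at $r = p - 1$, and for $p = d - 1$ each new infection raises the total boundary rate by exactly $r\lambda_0$. The specific recursion $\lambda_h = (\lambda_{h-1} + r\lambda_0)/p$ is an assumption made here for illustration, not necessarily the exact form of (5); the name `generalized_rate` is hypothetical.

```python
def generalized_rate(h, p, r, lam0=1.0):
    """Assumed generalized decay: lam_0 = lam0 and
    lam_h = (lam_{h-1} + r * lam0) / p for h >= 1.

    Equivalent closed form (for p > 1):
    lam0 * (p**-h + r * (1 - p**-h) / (p - 1)).
    """
    lam = lam0
    for _ in range(h):
        lam = (lam + r * lam0) / p
    return lam

# r = 0 gives simple exponential decay; r = p - 1 gives a constant rate.
simple = [generalized_rate(h, 2.0, 0.0) for h in range(4)]
flat = [generalized_rate(h, 2.0, 1.0) for h in range(4)]
```

Under this assumed form the rate decreases in $h$ whenever $r < p - 1$ and levels off at $\lambda_0 r/(p-1)$, so intermediate values of $r$ give a heavier tail than the simple exponential decay.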

Decaying Parameter Estimation
In this subsection, we consider the scenario in which the parameter $p$ of the decaying model is hidden, because in practice it is often hard to know the exact parameter even when the model is given. In this case, under the decaying model, one can estimate it by the following MLE using only the current snapshot $G_N$. To formalize this, we let $\sigma = (v_1 = v, v_2, \ldots, v_N)$ be an infection sequence which generates $G_N$ and let $\Omega(v, G_N)$ be the set of these sequences when $v$ is the source. Then we have
$$\hat{p}_{ml} = \arg\max_{p} P(G_N | p) = \arg\max_{p} \sum_{v \in G_N} P(G_N | p, v) P(v) \overset{(a)}{=} \arg\max_{p} \sum_{v \in G_N} \sum_{\sigma \in \Omega(v, G_N)} P(\sigma | p, v), \quad (7)$$
where the equality (a) is due to the uniform prior on the source. We see that computing this MLE has combinatorial complexity because the number of infection sequences in $\Omega(v, G_N)$ is exponential in the number of infected nodes $N$. Hence, instead of using it directly, we design a simple and efficient approximation algorithm to estimate the true parameter as follows.
Algorithm. We now describe our decaying parameter estimation algorithm, named DPE(K), where $K$ is the sampling cost of the algorithm, as given in Algorithm 1. Since we do not have any prior information on the parameter $p$, we set the estimation range $[p_{min}, p_{max}]$, where $p_{min}$ and $p_{max}$ are the minimum and maximum values of $p$, respectively. We set $p_{min} = 1$ (i.e., no decay) in the algorithm. The algorithm first sets $p = p_{min}$, as described in the first line. Next, for each infected node $v \in G_N$, it calculates the rumor centrality $R(v, G_N)$ using the given snapshot information $G_N$ as in [1] (Step 1). To approximate the term $\sum_{\sigma \in \Omega(v, G_N)} P(\sigma | p, v)$ in (7), the algorithm samples $K$ infection sequences uniformly at random (the uniform sampling gives a simple approximation of the mean of the infection probabilities), where $K \ge 1$ is a fixed constant, and computes
$$\hat{P}_K(\sigma | p, v) = \frac{1}{K} \sum_{i=1}^{K} P(\sigma_i | p, v),$$
where $\sigma_i \in \Omega(v, G_N)$ (Step 2). This is regarded as an average of $P(\sigma_i | p, v)$ in (7). Clearly, more samples (large $K$) give more accuracy in general. Then, we multiply $\hat{P}_K(\sigma | p, v)$ by the rumor centrality $R(v, G_N)$, which counts the sequences in $\Omega(v, G_N)$, and store the product in $\hat{R}(v, G_N)$ (Step 3). We next sum these values over all infected nodes and save the sum as $f(p)$. We repeat this procedure after adding $\delta > 0$ to the previous value of $p$. Finally, the algorithm compares all values of $f(p)$ within the range $[p_{min}, p_{max}]$ and takes the maximizing $p$ as the estimate of the decaying parameter, denoted by $\hat{p}$. We see that the algorithm complexity is $O(\max\{N, K\})$, where $N$ is the number of infected nodes in the graph. We will show how the accuracy of this algorithm varies with $K$ in the simulation section.

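Algorithm 1 Decaying Parameter Estimation (DPE(K))
Input: Diffusion snapshot $G_N$, sampling cost $K$, $p_{min}$, $p_{max}$, increasing step size $\delta$
Output: Estimated parameter $\hat{p}$
Set the initial decaying parameter $p \leftarrow p_{min}$;
while $p \le p_{max}$ do
    for each $v \in G_N$ do
        Step 1: Compute the rumor centrality $R(v, G_N)$ by a message passing algorithm [1];
        Step 2: Choose random samples $\sigma_i \in \Omega(v, G_N)$ $K$ times and compute their mean $\hat{P}_K(\sigma | p, v)$;
        Step 3: Set $\hat{R}(v, G_N) \leftarrow \hat{P}_K(\sigma | p, v) \cdot R(v, G_N)$;
    end
    Set $f(p) \leftarrow \sum_{v \in G_N} \hat{R}(v, G_N)$ and $p \leftarrow p + \delta$;
end
Return $\hat{p} = \arg\max_{p_{min} \le p \le p_{max}} f(p)$;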
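The steps of DPE(K) can be sketched in code as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: the snapshot is a tree given as an adjacency dictionary, the underlying graph is a $d$-regular tree, the rate is the simple exponential decay $\lambda_h = \lambda_0 p^{-h}$, and all quantities are kept in the log domain to avoid overflow of the rumor centrality $N!/\prod_u T^v_u$. Uniform infection sequences are drawn by repeatedly picking an available node with probability proportional to its subtree size, which is a standard way to sample a uniform ordering of a rooted tree. All function names are hypothetical.

```python
import math
import random

def rooted(adj, root):
    """Parents and BFS order of the tree adj when rooted at root."""
    parent, order = {root: None}, [root]
    for u in order:
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)
    return parent, order

def subtree_sizes(parent, order):
    size = {u: 1 for u in order}
    for u in reversed(order[1:]):
        size[parent[u]] += size[u]
    return size

def log_rumor_centrality(size):
    """log R(v, G_N) = log N! - sum of log subtree sizes (Step 1)."""
    return math.lgamma(len(size) + 1) - sum(math.log(s) for s in size.values())

def sample_sequence(adj, parent, size, root, rng):
    """Uniform random infection sequence starting at root."""
    seq, frontier = [root], list(adj[root])
    while frontier:
        nxt = rng.choices(frontier, weights=[size[w] for w in frontier])[0]
        frontier.remove(nxt)
        seq.append(nxt)
        frontier.extend(w for w in adj[nxt] if w != parent[nxt])
    return seq

def log_seq_prob(seq, parent, d, p, lam0=1.0):
    """log P(sigma | p, v) on a d-regular tree with lam_h = lam0 * p**-h."""
    hop = {seq[0]: 0}
    total = d * lam0                      # total rate over boundary edges
    logp = 0.0
    for v in seq[1:]:
        u = parent[v]
        rate_u = lam0 * p ** (-hop[u])
        logp += math.log(rate_u / total)
        hop[v] = hop[u] + 1
        # Edge (u, v) leaves the boundary; v adds d - 1 new boundary edges.
        total += (d - 1) * lam0 * p ** (-hop[v]) - rate_u
    return logp

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def dpe(adj, d, K, p_min=1.0, p_max=5.0, delta=0.1, seed=0):
    """Grid search for the p maximizing f(p) = sum_v R(v) * mean_K P(sigma|p,v)."""
    rng = random.Random(seed)
    best_p, best_f = p_min, -math.inf
    for i in range(int(round((p_max - p_min) / delta)) + 1):
        p = p_min + i * delta
        terms = []
        for v in adj:
            parent, order = rooted(adj, v)
            size = subtree_sizes(parent, order)
            logs = [log_seq_prob(sample_sequence(adj, parent, size, v, rng),
                                 parent, d, p) for _ in range(K)]       # Step 2
            terms.append(log_rumor_centrality(size)
                         + logsumexp(logs) - math.log(K))               # Step 3
        f = logsumexp(terms)              # log f(p)
        if f > best_f:
            best_p, best_f = p, f
    return best_p
```

As written, the sketch recomputes the rooted trees for every grid value of $p$; caching them per node would reduce the cost without changing the result.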

Proof of Results
In this section, we will provide the proofs of Propositions and Theorems. All proofs of Lemmas will be provided in Appendix A.

Proof of Proposition 1
To prove this, by the result in [1], it is sufficient to show that under the $d$-regular tree $G$, for a given observation $G_N$ with $N$ infected nodes, $v$ is an MLE of (1) if and only if $|T^v_u| \le N/2$ for every neighbor $u$ of $v$. Let $v \in G_N$ be a node which satisfies this condition and let $u \in G_N$ be a node that has $|T^u_w| > N/2$ for some neighbor $w$ of $u$. We will prove $P(G_N | v) > P(G_N | u)$ for all such $u \in G_N$ by contradiction. Suppose there exists a node $u$ such that $P(G_N | v) \le P(G_N | u)$.
Then, for every $\sigma \in \Omega(v, G_N)$ and $\sigma \in \Omega(u, G_N)$, we have where $\tau_j$ is the exponential random variable of the $j$-th node at the boundary of the information spread and the inequality (a) follows from the fact that if we let $L_u := \max_{s \in G_N} d(u, s)$ then, by the decreasing diffusion rate, we have $P(L_u > L_v) > P(L_u \le L_v)$, and this makes $P(\sigma | v) > P(\sigma | u)$ for any permutation $\sigma$. To see this more rigorously, we see that for any permutation $\sigma \in \Omega(u, G_N)$, there exists a distinct permutation $\sigma \in \Omega(v, G_N)$ such that (a) holds, due to the fact that $a/(a + b) > c/(b + c)$ for $a > b > c$. By using the result in [1] which makes contradiction to our hypothesis. Hence, we complete the proof of Proposition 1.

Proof of Theorem 1
For the line graph, note that there are two independent diffusion processes from the source. We let $N_i(t)$ ($i = 1, 2$) be those processes, which indicate the total number of infections on each side of the source until time $t$. Let $C^k_t := \{N_1(t) - N_2(t) = k\}$ be the event that the difference between the numbers of infected nodes $N_1(t)$ and $N_2(t)$ is $k$ at time $t$, and let $C_t := \{v_{ml}(p) = v^*\}$ be the detection event of the MLE at time $t$. Then, from Corollary 1, we have the following two cases for detecting the source: (i) the detection occurs when the MLE is uniquely defined at time $t$, denoted by $C^0_t$, and (ii) the detection occurs when there are two MLEs at time $t$ as in [1], denoted by $C^1_t$ and $C^{-1}_t$. Hence, the detection probability $P(C_t)$ is described by To obtain this, we first obtain the probability $P(C^k_t)$. From the Markov property of the diffusion process, we see that $C^k_t$ depends only on $C^{k-1}_{t-1}$ or $C^{k+1}_{t-1}$. Then, we obtain the following lemma.

Lemma 2.
If $N_1(t) = m + k$ and $N_2(t) = m$ ($m \ge 0$, $k \ge -m$: integers), then and Using this result, the probability $P(C^k_t)$ can be expressed by the following recursive form: Since $P(C^k_t)$ converges for all integers $k$ as $t \to \infty$ ($C^k_\infty = C^k$), we drop the index $t$ from now on. To compute each probability in (12), we consider the following lemma.
By using this, we finally obtain $P(C^1_t)$ and $P(C^{-1}_t)$ ($= P(C^1_t)$, because the two random processes $N_1(t)$ and $N_2(t)$ are i.i.d.) from $P(C^0_t)$. Then, from the relation (12) and by taking the limit as $t$ goes to infinity, we have the result in Theorem 1, which completes the proof of Theorem 1.

Proof of Theorem 2
First, we prove the result for $p = d - 1$. To do that, we let $T_i(t)$ be the set of infected nodes in the subtree rooted at $u_i$ at time $t$ ($u_i$ is the $i$-th neighbor of the information source $v^*$, where $1 \le i \le d$, and we omit the superscript $v^*$ in $T^{v^*}_i(t)$ for notational simplicity). Then we have the following lemma.
This lemma implies that if we consider the decaying factor $p = d - 1$, the total rate at which a new node is infected in $T_i(t)$ at any time $t$ is the constant $\lambda_0$.
Hence, using Lemma 4, the probability that the next infected node lies in $T_i(t)$ is always equal to $1/d$ for all $i$. Indeed, if we let $\tau_i(t)$ be a random variable of infection time for a node in $T_i(t)$ then be the total number of infections in the network until time $t$. Then, we have Let where (a) is from the fact that all events $E_i$ are mutually exclusive and (b) comes from Hoeffding's bound with the expectation $E(T_i(t)/N(t)) = 1/d$, i.e., , which converges to 0 as $t \to \infty$ ($N(t) \to \infty$). Thus, for a fixed $d$, as $t \to \infty$, $P(C_t)$ converges to 1. Next, to see the result for $p > d - 1$, consider the following lemma.
Using this, we obtain that for any rate $p \ge d - 1$, the detection probability goes to one as time goes to infinity. This completes the proof of Theorem 2.

Proof of Theorem 3
Let $p = d - 1$. Then, one can easily check that $\lambda_h(p, r)$ in (5) satisfies the following recurrence: This relation means that one infection of any node in a $d$-regular tree increases the total rate by exactly $r\lambda_0$. Let $S_i$ denote the sum of the rates of the boundary edges in the subtree $T_i$. At the initial time, $S_1(0) = \lambda_0$ and $\sum_{i \ne 1} S_i(0) = (d - 1)\lambda_0$. If an infection occurs in $T_1$ or in $G \setminus T_1$, then $S_1$ or $\sum_{i \ne 1} S_i$, respectively, increases by $r\lambda_0$. This process can be mapped into Polya's urn process [2], and the probability $\lim_{t \to \infty} P(T_i > N(t)/2)$ becomes $1 - I_{1/2}(1/r, (d - 1)/r)$. The rate $\lambda_h(p, r)$ decreases as $h$ increases under the condition that $r < d - 1$. If $r = 0$, an infection does not change the sums of rates $S_i$, and the model is the same as the simple decaying model with $p = d - 1$, whose detection probability converges to 1 as $t$ goes to infinity. Furthermore, if $r = d - 2$, the infection rate becomes homogeneous and the detection probability is the same as the result in [2]. This completes the proof of Theorem 3.
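The urn mapping used in this proof can be illustrated by a small Monte Carlo sketch. It assumes the urn described above, with $\lambda_0$ normalized to 1: subtree $T_1$ starts with weight 1, the other $d - 1$ subtrees together start with weight $d - 1$, and each infection adds $r$ to the weight of the side where it occurred. The function name `polya_majority_prob`, the number of steps, and the number of trials are hypothetical choices for illustration.

```python
import random

def polya_majority_prob(d, r, steps=300, trials=4000, seed=0):
    """Estimate the probability that more than half of `steps` infections
    fall into the subtree T_1 under the urn dynamics described above."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        w1, w_rest, c1 = 1.0, d - 1.0, 0
        for _ in range(steps):
            # The next infection lands in T_1 with probability w1 / total.
            if rng.random() * (w1 + w_rest) < w1:
                w1 += r
                c1 += 1
            else:
                w_rest += r
        hits += (2 * c1 > steps)
    return hits / trials

# Homogeneous case on a 3-regular tree (r = d - 2 = 1).
q = polya_majority_prob(3, 1.0)
detection_estimate = 1.0 - 3 * q
```

For $d = 3$ and $r = 1$ the limiting fraction in $T_1$ follows a Beta(1, 2) law, so the estimate should be close to $1 - I_{1/2}(1, 2) = 1/4$, and $1 - d$ times that value gives the homogeneous detection probability $1/4$. Smaller $r$ concentrates the fraction near $1/d$ and pushes the detection estimate toward one.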

Numerical and Simulation Results
In this section, we provide simulation results for the detection probabilities over three types of graph topologies: (i) regular trees, (ii) three random graphs, and (iii) three real-world networks. We propagate the information from a randomly chosen source up to 1000 infected nodes and plot the detection probability from 500 iterations. We obtain the detection probability by varying the decaying parameter $p$ in the generalized exponential decay, which includes the simple exponential decay ($r = 0$).

Regular Trees
For the regular tree, we first plot the theoretical result (asymptotic detection probability) of Theorem 3 numerically with respect to the decay level $r$ of the generalized decaying model in Figure 3. We have checked that the detection probabilities are almost one for $r = 0$ (i.e., simple exponential decay) and that they decrease as $r$ increases for three degrees ($d = 5$, $d = 10$ and $d = 15$). We see that if $r = d - 2$ ($= 3$) for $d = 5$, the detection probability is the same as that of the homogeneous rate, as expected. We further validate our theoretical result by simulation. We also check that a higher degree of the regular tree gives more chances to detect the source, because the source more frequently has the best rumor centrality in the infection graph. In Figure 4, we obtain the error of the parameter estimation for $p$ with respect to the sampling cost $K$ of the algorithm DPE(K). We consider the case $d = 5$ and measure the error between the true parameter $p$ and the estimated parameter $\hat{p}$ by the distance $|p - \hat{p}|$. We plot the error for different values of $p$ ($p = 1, 2, 3, 4$), averaged over 100 runs of the algorithm. We use the step size $\delta = 0.1$ and $p_{max} = 5$. In the result, we see that the algorithm DPE(K) estimates the true parameter more accurately when $p$ is large. This is because a large decaying parameter makes the snapshot more balanced around the source, which makes it possible to estimate the true parameter with less fluctuation.
In Figure 5, we obtain the detection probabilities for degree $d = 3$ with respect to the decaying parameter $p$. In this simulation, we compare two cases: (i) known parameter (true) and (ii) unknown parameter (estimated). The known parameter means that the true decaying parameter is given a priori, and we use it in the simulation. The unknown parameter means that we do not know the true parameter, so we first need to estimate it using the algorithm DPE(K) and then obtain the detection probability. In the result, we check that the detection probability increases as $p$ increases and that the reduction in detection performance is small even when we use the parameter estimated by DPE(K) with $K = 500$.

Random Graphs
We consider Erdös-Rényi (ER), Scale-Free (SF) and Small-World (SW) graphs. In the ER graph, we choose its parameter so that the average degree is 4 for 2000 nodes. In the SF and SW graphs, we choose the parameters so that the average ratio of edges to nodes is 1.5 for 2000 nodes. It is known that obtaining the MLE is hard for graphs with cycles, as the underlying counting problem is #P-complete. For this reason, we first construct a diffusion tree from the Breadth-First Search (BFS) as used in [1]: Let $\sigma_v$ be the infection sequence given by the BFS ordering of the nodes in the given graph, then we estimate the source $v_{bfs}$ that solves the following: where $T_b(v)$ is a BFS tree rooted at $v$ and the information spreads along it. The BFS tree is a good approximation for our model because the decaying rates at the same distance from the source are the same, and the nodes closer to the source have a higher diffusion rate. Then, we obtain the detection probabilities for each graph by increasing the parameter $p$ as in Figure 6. We first simulate the case of the true parameter, i.e., it is given a priori. Next, we also consider the case in which the parameter is hidden, so that we need to estimate it. In this case, we replace the rumor centrality $R(v, G_N)$ for a chosen node $v$ by $R(v, T_b(v))$ in the algorithm DPE(K). As a result, we see that the detection probabilities increase as the parameter $p$ increases and decrease as the parameter $r$ increases for the three random graphs. We also see that if the decay level is $r = p - 1$, the detection probabilities are the same as those of the homogeneous diffusion rate. Further, we see that the detection is better on the ER random graph than on the other two networks because the ER graph has a symmetric, locally tree-like structure, which strengthens the effect of decay on the detection probability.
Finally, for the unknown parameter p, we estimate the parameter using the algorithm DPE(K) with 100 iterations as in Table 1 and we check that the degradation of detection performance is not much even though we use the estimated parameter using DPE(K) if the sampling cost is sufficient.

Real World Graphs
For the Facebook network in Figure 7b, we use the data in [21], which gives an undirected graph consisting of 4039 nodes and 88,234 edges. Each edge corresponds to a social relationship (called FriendList) and the diameter is 8 hops. The US power grid network consists of 4941 nodes and 6594 edges, and the diameter is 46 hops. For the Wiki-vote network, we use the data in [22], which gives 7115 nodes and 103,689 edges, and the diameter is 7 hops. In these networks, we also use the BFS approach in (20) to estimate the source and apply the BFS tree to estimate the parameter $p$ in DPE(K), as in Table 1, with 100 iterations. Further, we see that the detection probabilities for the three networks increase as $p$ increases, and if the decay level is $r = p - 1$, they become the results of the homogeneous rate in Figure 7. We also see that the detection is better on the US power grid network than on the other two networks because it has a large network diameter, which gives more chances to detect the source using the diffusion snapshot.
Table 1. Averaged value of estimations of parameter $p$ using DPE(K) for general graphs ($K = 500$). We use the step size $\delta = 0.1$ and $p_{max} = 10$.

Discussion
Our results show that the decay of information diffusion with distance from the source has, in most cases, a positive effect on inferring the source of information. In particular, under mild conditions, the source can be found with probability one on the regular tree no matter how much time passes. This is because, under decaying diffusion, the information spreads more evenly around the center of the infection graph. A similar phenomenon occurs in general graphs, making it easier to find the source.
The first limitation of this result is that the diffusion model under consideration does not cover other decaying patterns, owing to technical difficulties in the analysis. For example, one could also consider decay by an exponential function such as $\lambda_h(p) = c e^{-ph}$ for some constant $c > 0$, or by a power-law function $\lambda_h(p) = \lambda_0 / h^p$. However, we found it mathematically intractable to obtain the analytical detection probability for these models. As a second limitation, we did not consider decay with respect to time, again because tracking the infection times of the diffusion is hard.
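For readers who want to experiment with these two alternative decay patterns in simulation, the rate functions can be written down directly. The sketch below is illustrative only: the function names and default constants are hypothetical, and restricting the power-law rate to hop distances $h \ge 1$ is an assumption made here because $\lambda_0 / h^p$ is undefined at $h = 0$.

```python
import math


def exp_decay_rate(h, p, c=1.0):
    """Exponential decay: lambda_h(p) = c * exp(-p * h), with constant c > 0."""
    return c * math.exp(-p * h)


def power_law_rate(h, p, lam0=1.0):
    """Power-law decay: lambda_h(p) = lam0 / h**p, defined here for h >= 1."""
    if h < 1:
        raise ValueError("power-law rate is only defined for hop distance h >= 1")
    return lam0 / h ** p


# Rates at hop distances 1..5 for the same parameter p: the exponential
# model is light-tailed, the power-law model is heavy-tailed.
p = 2.0
exp_rates = [exp_decay_rate(h, p) for h in range(1, 6)]
pow_rates = [power_law_rate(h, p) for h in range(1, 6)]
```

With the same $p$, the power-law rates fall off much more slowly in $h$ than the exponential ones, which is the heavy-tail versus light-tail distinction.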
As future work, we will consider more general decaying models that can capture both heavy-tailed and light-tailed decay patterns. We hope that this direction will offer more opportunities to find tractable analyses of various decaying models. Further, we will seek analytical results for detecting the source on the Erdős-Rényi random graph because it is the simplest random graph.

Conclusions
In this paper, we consider the information source finding problem when the diffusion rate of information decays with the distance from the source. We first show that, under such decay, the MLE is the same as that under the homogeneous diffusion rate. We then obtain the detection probability of the MLE in closed form when the underlying graph is a regular tree, for two exponential decaying models. In contrast to prior results, the detection probability is larger than zero on the line graph, and there is a non-negligible improvement in detection for any degree of the regular tree. Next, we consider the case in which the parameter of the two decaying models is hidden. For this case, we design a heuristic algorithm that estimates the true parameter from the diffusion snapshot. Finally, we validate our theoretical results on the regular tree using the MLE, and on popular random graphs and real-world networks using the heuristic Breadth-First Search (BFS) estimator. We find that the detection probability can be above 80% for the regular tree; in the real-world graphs, it can be above 30% if the diffusion rate decays, whereas it is about 20% without decay.

possible events at time $t + 1$, infection occurs in $N_1$ or $N_2$. By the exponential distributions, we obtain (13). One can easily obtain (14) by a similar calculation, and this completes the proof of Lemma 2.
Appendix A.2. Proof of Lemma 3
We prove this by induction on $k$. First, for the case $k = 0$, we have $P(C_1) = \frac{1+p}{2p} P(C_0)$. Next, for $k = 1$, we have $P(C_1) = \frac{1}{1+p^0} P(C_0) + \frac{p^2}{1+p^2} P(C_2)$, so that
$$P(C_2) = \frac{1+p^2}{p^2}\left(\frac{1+p}{2p} P(C_0) - \frac{1}{2} P(C_0)\right) = \frac{1+p^2}{2p^3} P(C_0).$$
Using Equation (12) and the induction hypothesis, we obtain
$$P(C_k) = \frac{1+p^k}{2p^{k(k+1)/2}} P(C_0), \quad k \ge 1.$$
By the symmetry $P(C_k) = P(C_{-k})$ and the fact that the sum of $P(C_k)$ over all integers $k$ must be one, we obtain
$$P(C_0)\left(2\sum_{k=1}^{\infty} \frac{1+p^k}{2p^{k(k+1)/2}} + 1\right) = 1.$$
Now let
$$E(p) = 2\sum_{k=1}^{\infty} \frac{1+p^k}{2p^{k(k+1)/2}} + 1,$$
so that $P(C_0) = 1/E(p)$ and the bound on $P(C_0)$ is determined by $E(p)$. Lemma 3 shows $P(C_0 \mid \text{even}) = 2/E(p)$, which leads to the same conclusion on the bound of $P(C_0)$. If $p \le 1$, $E(p)$ goes to infinity, which means that $P(C_0)$ and $P(C_1)$ go to 0 as time goes to infinity. If $p > 1$, $E(p)$ has lower and upper bounds $2 < E(p) < 2p$ …

Appendix A.3. Proof of Lemma 4
We prove this by induction on $|T_i(t)|$ as follows. If $|T_i(t)| = 1$, then $T_i(t) = \{u_i\}$, where $u_i$ has $d - 1$ boundary edges and each edge has diffusion rate $\frac{\lambda_0}{d-1}$, so we have $S_i(t) = (d-1)\frac{\lambda_0}{d-1} = \lambda_0$. If $|T_i(t)| = k \ge 2$, let $v$ be the last infected node in $T_i(t)$ and let $w$ be the infected node in $T_i(t)$ that spread the rumor to $v$. Before the infection of $v$, each of $w$'s boundary edges has rate $\lambda_{d(w)}(p)$, where $d(w)$ is the distance of node $w$ from the source. Since $v$ is one hop further from the rumor source than $w$, each of $v$'s boundary edges has rate $\frac{\lambda_{d(w)}(p)}{d-1}$. When $v$ is infected, one of $w$'s boundary edges is deleted from $E_i(t)$ and the $d - 1$ edges of $v$ are added to $E_i(t)$. Thus, the total sum of rates of $T_i(t)$ is the same as that of $T_i(t) \setminus \{v\}$. By the induction hypothesis, the total rate of $T_i(t) \setminus \{v\}$ is $S_i = \lambda_0$, and this completes the proof of Lemma 4.
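The invariance argument in the proof of Lemma 4 can be checked numerically. The following is a minimal sketch under the assumptions stated in that proof (a newly infected node replaces one boundary edge of rate $\lambda$ by $d - 1$ boundary edges of rate $\lambda/(d-1)$); the function name and the choice of exact rational arithmetic are illustrative and not taken from the paper.

```python
import random
from fractions import Fraction


def simulate_boundary_rate_sum(d, lam0, steps, seed=0):
    """Grow one subtree of a d-regular tree and record the total boundary rate."""
    rng = random.Random(seed)
    # Subtree containing only u_i: d-1 boundary edges, each of rate lam0/(d-1).
    boundary = [Fraction(lam0) / (d - 1)] * (d - 1)
    sums = [sum(boundary)]
    for _ in range(steps):
        # The next infection crosses a boundary edge chosen with probability
        # proportional to its rate (competing exponential clocks).
        weights = [float(r) for r in boundary]
        idx = rng.choices(range(len(boundary)), weights=weights)[0]
        rate = boundary.pop(idx)
        # The newly infected node adds d-1 boundary edges of rate/(d-1).
        boundary.extend([rate / (d - 1)] * (d - 1))
        sums.append(sum(boundary))
    return sums


sums = simulate_boundary_rate_sum(d=3, lam0=1, steps=50)
```

Every entry of `sums` equals `lam0` exactly, whatever infection order the random draws produce, which mirrors the inductive step: removing one edge of rate $\lambda$ and adding $d - 1$ edges of rate $\lambda/(d-1)$ leaves the total unchanged.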
Appendix A.4. Proof of Lemma 5
To prove this, we write $N(t) = N$ for notational simplicity and use induction on $N$, i.e., the number of infected nodes in the graph. For $N = 1$, the claim is trivial because the only infected node is the rumor source. Hence, for any diffusion rate, we have $P(|T_i(t)| > 1/2 \mid \lambda^1) = P(|T_i(t)| > 1/2 \mid \lambda^2) = 0$. Now suppose the claim is true for $N - 1$, and define three events $A_{1,s}$, $A_{2,s}$ and $A_{3,s}$, where $s = 1, 2$. Let $B_s := \{|T_i(t+1)| > N/2 \mid \lambda^s\}$. Then, by the law of total probability, we obtain the decomposition (A7), where the equality (a) follows from the facts that $P(N = \text{odd}) = P(N = \text{even}) = 1/2$, that $P(B_s \mid A_{3,s}) = 0$ when $N$ is odd, and that $P(A_{2,s}) = P(B_s \mid A_{3,s}) = 0$ when $N$ is even.
From this result, four terms remain in (A7) to be compared in order to obtain $P(B_1) \ge P(B_2)$. First, $P(A_{1,1}) \ge P(A_{1,2})$ by the induction hypothesis. Next, we show that the same ordering holds for $P(B_s \mid A_{1,s})$ (this is the case in which $N$ is even). In this case, all the transition probabilities are one except when there are $N/2$ infected nodes in $T_i(t)$ and $(N-1) - N/2$ infected nodes in one of the remaining subtrees. Let $\beta_s$ be the transition probability from this state to the state with $N/2$ infected nodes in each of those two subtrees. One can check that $\beta_1 \le \beta_2$, where $t_i$ denotes the random variable of the event that the next infection occurs in the subtree $T_i$ when there are $N - 1$ infected nodes in the graph, and $\lambda^i_{\sigma(k)}$ is the rate at which this infection occurs in $T_1$ when there are $k$ nodes. In the derivation of this inequality, the equality (a) is due to the exponentially distributed infection times of the diffusion process, and the inequality (b) holds for all $k$ because there are $d - 2$ subtrees without any infections and $\lambda^1_i \ge \lambda^2_i$ for all $i \ge 1$. Then we have $P(B_1 \mid A_{1,1}) = 1 - \beta_1 \ge 1 - \beta_2 = P(B_2 \mid A_{1,2})$. Next, we consider the probability $P(B_s \mid A_{2,s}) P(A_{2,s})$ (this is the case in which $N$ is odd), focusing first on $P(B_s \mid A_{2,s})$.
In this case, we have $P(B_1 \mid A_{2,1}) \ge P(B_2 \mid A_{2,2})$ by steps similar to those above. Finally, we need to show that $P(A_{2,1}) \ge P(A_{2,2})$. To show that this holds for all $N \ge 1$, we show that it holds for all $t > 0$ as follows.
Here, the equality (a) comes from the fact that the random processes are identical for all $T_i(t)$, and the inequality (b) follows from the independence of the random variables. Indeed, for the supporting derivation, the equality (a) follows from the fact that, since $|T_i(t)|$ takes integer values for all $t > 0$, the probability $P(|T_2(t)| = k/(d-1) \mid \lambda^1)$ is zero when $k$ is not a multiple of $d - 1$. (In the tree, there is a unique path between any two nodes, and the diffusion time from the rumor source to any node at distance $l > 0$ follows a hypo-exponential distribution with rates $(\lambda_1, \ldots, \lambda_l)$.) The inequality (b) follows from
$$P(|T_1(t)| = m(d-1) \mid \lambda^1)\, P(|T_2(t)| = m \mid \lambda^2) \ge P(|T_1(t)| = m(d-1) \mid \lambda^2)\, P(|T_2(t)| = m \mid \lambda^1),$$
since the random processes $T_1(t)$ and $T_2(t)$ have the same distribution with exponential rates $\lambda^1_i \ge \lambda^2_i$ for all $i \ge 1$ on each edge. (We use the fact that $\frac{b}{a} \ge \frac{d}{c}$ if $b \ge d$ and $a \le c$.)
Next, we note that
$$P(B_s) = \frac{1}{2}\left\{(1 + P(B_s \mid A_{1,s}))\, P(A_{1,s}) + P(B_s \mid A_{2,s})\, P(A_{2,s})\right\},$$
and, for an arbitrarily small number $\varepsilon > 0$, we consider a further relation for $s = 1, 2$. Then, from this relation and the fact that $P(N = \text{odd}) = P(N = \text{even}) = 1/2$, we have
$$P(|T_i(t)| > N/2 \mid \lambda^1) = \frac{1}{2}\left\{2P(|T_i(t)| > (N-1)/2 \mid \lambda^1) - P(|T_i(t)| = N/2 \mid \lambda^1)\right\},$$
to which the induction hypothesis is applied in step (a). Then, by using the fact that $\lim_{N \to \infty} P(|T_i(t)| = N/2 \mid \lambda^s) = 0$ and the induction hypothesis, we have $\lim_{N \to \infty} P(|T_i(t)| > N/2 \mid \lambda^1) \ge \lim_{N \to \infty} P(|T_i(t)| > N/2 \mid \lambda^2)$, and this completes the proof of Lemma 5.