Spatial-Temporal Data Collection with Compressive Sensing in Mobile Sensor Networks

Compressive sensing (CS) provides an energy-efficient paradigm for data gathering in wireless sensor networks (WSNs). However, the existing work on spatial-temporal data gathering using compressive sensing only considers either multi-hop relaying based or multiple random walks based approaches. In this paper, we exploit the mobility pattern for spatial-temporal data collection and propose a novel mobile data gathering scheme by employing the Metropolis-Hastings algorithm with delayed acceptance, an improved random walk algorithm for a mobile collector to collect data from a sensing field. The proposed scheme exploits Kronecker compressive sensing (KCS) for spatial-temporal correlation of sensory data by allowing the mobile collector to gather temporal compressive measurements from a small subset of randomly selected nodes along a random routing path. More importantly, from the theoretical perspective we prove that the equivalent sensing matrix constructed from the proposed scheme for spatial-temporal compressible signal can satisfy the property of KCS models. The simulation results demonstrate that the proposed scheme can not only significantly reduce communication cost but also improve recovery accuracy for mobile data gathering compared to the other existing schemes. In particular, we also show that the proposed scheme is robust in unreliable wireless environment under various packet losses. All this indicates that the proposed scheme can be an efficient alternative for data gathering application in WSNs.


Introduction
Wireless sensor networks have been widely deployed in a variety of applications including environmental monitoring, traffic surveillance and social sensing and analysis [1][2][3][4]. In such networks, data gathering is one of most fundamental tasks, where a large amount of sensory data is required to be transmitted to a fusion center (FC). However, owing to power limitation and resource constraints of sensor nodes, it is impractical for them to directly transfer all of data to the FC without any dimensionality reduction. Since sensory data collected in many scenarios intrinsically exhibits spatial-temporal correlations, some conventional data gathering schemes have been proposed to reduce energy consumption of data transmissions [5][6][7][8]. Recently, compressive sensing provides a new approach for data gathering applications in WSNs, which allows for the original signal recovery from a small number of measurements as long as the signal can be sparsely represented in a certain transform domain [9]. Compressive sensing is able to perform sensing and compression simultaneously to reduce the amount of data transmitted over the network so as to save energy consumption at each sensor node.
Various compressive sensing based approaches have been investigated for efficient data gathering in WSNs [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. However, most of the current CS-based data gathering schemes try to exploit either spatial or temporal correlation among nodes. Thus, the performance improvement brought by CS is limited. Considering many sensor network applications in practice especially in environmental monitoring, sensory data are usually periodically collected for a long time. Thus, the temporal correlation at each sensor node can be further exploited. Consequently, one can take full advantage of both spatial and temporal correlations to further improve the data gathering performance.
Recently, some CS-based data gathering schemes have been proposed to exploit both spatial and temporal correlations of sensory data in WSNs [28][29][30][31][32]. (See Section 2 for more discussions.) For example, Mahmudimanesh et al. proposed a balanced temporal-spatial CS scheme by applying a joint sparsity model to improve the reconstruction accuracy of the spatial-temporal profile of WSNs [28]. Quer et al. proposed an effective framework for data gathering to exploit spatial and temporal correlations by jointly using CS and Principle Component Analysis (PCA) [29]. These two schemes above mainly utilize the spatial-temporal correlation characteristic of the signal in the recovery algorithm to improve the reconstruction accuracy. Xu et al. proposed a spatial-temporal hierarchical data aggregation scheme using compressive sensing, where a subset of nodes is randomly selected to collect the corresponding subset data at each data collection instant and then matrix completion algorithm is applied to reconstruct all the data of the entire network [30]. In this scheme, some specific nodes or cluster heads are designated to forward data to the FC, which inevitably leads to unbalanced energy consumption. Wang et al. proposed a data gathering scheme based on Kronecker compressive sensing theory by exploiting both temporal and spatial sparsity of sensory data to improve reconstruction accuracy [31]. In this scheme, for each data gathering period, only a subset of nodes is randomly selected to transmit their CS measurements to the FC.
Mobile data gathering has also attracted much attention in the past few years, where one or more mobile collectors are designated to take responsibility for collecting data from sensor nodes [33][34][35][36][37][38]. The reason to use mobile collector(s) for data gathering is to reduce energy expenditure at each node. Since a mobile collector can come closely to a node for data collection, it is not necessary for a node to use a powerful transceiver to communicate with the mobile collector. Instead, a node only needs to wait for the mobile collector to get close enough, which can significantly reduce power for communication. Furthermore, it is also not necessary for all of nodes to maintain the connectivity of an entire network since they only need to communicate with the mobile collector(s) instead of with the other nodes. Due to the merits of mobile data gathering, the mobility pattern combining with compressive sensing has been exploited for data gathering in several works [24][25][26][27][39][40][41]. However, previous work above does not exploit spatial-temporal correlation of sensory data.
In this paper, we propose a novel mobile data gathering scheme by exploiting spatial-temporal correlation of sensory data to improve the performance of energy consumption and signal recovery quality in WSNs. The contributions of this paper are summarized as follows.

•
We propose a data gathering scheme using the Metropolis-Hastings algorithm with delayed acceptance (MHDA), a random walk algorithm for a mobile collector to collect data from sensor nodes. In this scheme, the mobile collector only needs to collect temporal CS measurements from a subset of nodes by sequentially visiting these nodes along a random routing path. Unlike [26,27] which need to perform multiple random walks to obtain random projections, the mobile collector takes only one random walk to collect measurements from nodes, which can significantly reduce the cost of data transmissions and improve energy efficiency. On the other hand, instead of using a standard random walk as [41], we adopt an improved Metropolis-Hastings algorithm as a mobility pattern which results in more uniform sampling distribution to improve signal reconstruction accuracy.

•
We prove that the sensing matrix for spatial dimensional signal which is constructed from the proposed random walk algorithm and a kernel-based sparsity representation basis satisfies the Restricted Isometry Property (RIP). We also exploit KCS for spatial-temporal correlation of sensory data to improve the compression performance and prove that the equivalent sensing matrix for spatial-temporal dimensional signal can satisfy the property of KCS models.

•
We also show that the proposed scheme is robust to the packet loss environment. We present simulation results to demonstrate that the proposed scheme is able to not only reduce communication cost but also improve recovery accuracy for mobile data gathering compared to some existing schemes.
The remainder of this paper is organized as follows. In Section 2, we introduce the related work. In Section 3, the preliminaries on CS basics and system model are given and the problem is formulated. In Section 4, we present a random walk algorithm with CS for mobile data gathering and the design for measurement matrix and sparsity representation basis. In Section 5, we give theoretical analysis and the performance analysis for the proposed scheme. In Section 6, we conduct extensive simulations to evaluate the performance of the proposed scheme. Finally, we draw a conclusion and discuss our future work in Section 7.

Related Work
In the past few years, much research work has been done to investigate the efficiency of compressive sensing for data gathering in WSNs [10][11][12][13][14][15][16]. For example, Luo et al. proposed a CS-based data gathering scheme for large-scale sensor networks. The goal of this work is to improve network capacity by alleviating the bottleneck effect of the sink. On the other hand, there are much work dedicated to improve the energy efficiency of data gathering by employing CS technique [10]. For instance, Xiang et al. also proposed a data gathering scheme to improve energy efficiency, where the entire network is partitioned into subnetworks and CS is employed within each subnetwork for data gathering [12]. Zhao et al. proposed a treelet-based clustered data gathering scheme to save energy expenditure, where a treelet transform is adopted as a sparsity representation basis and then CS is applied in conjunction with a clustered routing algorithm [16]. Furthermore, another research line for improving energy efficiency of data gathering by using CS is to design sparse random measurement matrices [19][20][21][22]. For instance, Zheng et al. designed a new type of random measurement matrix by using random walk algorithm for data gathering which follows the theory of expander-based compressive sensing [21]. Liu et al. proposed a non-uniform sparse random matrix by constructed from an opportunistic routing algorithm [22]. Meanwhile, Nguyen et al. exploited an integration between CS and the random mobility of sensors in distributed mobile sensor networks [26,27,39]. In their work, a small number of mobile sensors are utilized to collect data at their random positions and exchange their readings with their neighbors within the sensor transmission range to form one CS measurement. Rana et al. proposed a method for the mobile nodes to adaptively predict the number of projections based on the speed of the mobile nodes and compute a deterministic projection matrix from a learnt dictionary [40]. However, the aforementioned work only considers spatial correlation of sensing field by using CS.
Recently, some work has been dedicated to exploiting both spatial and temporal correlations of sensory data to improve the performance of data gathering in WSNs.
For example, Mahmudimanesh et al. proposed a balanced temporal-spatial CS scheme by applying a joint sparsity model to improve the reconstruction accuracy of the spatial-temporal profile of WSNs [28]. Quer et al. propose an effective framework for data gathering by exploiting spatial and temporal correlations jointly using CS and Principle Component Analysis (PCA), where they develop a sparse sampling matrix [29]. Different from their work, we study on how to collect temporal-spatial sensory data with a mobile collector whereas they focus on improving the recovery quality on the reconstruction side by exploiting spatial and temporal correlations using PCA. Xu et al. proposed a spatial-temporal hierarchical data aggregation scheme using compressive sensing, where a subset of nodes is randomly selected to collect the corresponding subset data at each data collection instant and then matrix completion algorithm is applied to reconstruct all the data of the entire network [30]. In their work, they adopted a multi-level clustering fashion to collect spatial-temporal data. Since random projections are forwarded to the FC along the tree, any packet loss at some specific nodes or cluster heads during transmission will results in the loss of previously calculated random projection. Thus, the value of random projection obtained at the FC is inaccuracy. Consequently, such an approach is susceptible to packet loss. Wang et al. proposed a data gathering strategy based on KCS theory by exploiting both temporal and spatial sparsity of sensory data to improve reconstruction accuracy [31]. Li et al. proposed a CS-based data gathering algorithm which utilize random sampling and random walks to select sensory data in temporal and spatial domains [32]. In their work, each projection is obtained by summing the received measurements and the sensing matrix is designed based on the adjacency matrix of an unbalanced expander graph. Different from the above work, we exploit the mobility pattern for spatial-temporal data collection, where it allows the mobile collector to gather data by sequentially visiting a subset of nodes, thus significantly reducing energy expenditure of sensor nodes. Furthermore, in our scheme a projection is generated from only one sensor node instead of a linear combination of the measurements from multiple nodes.

Compressive Sensing Basics
Compressive sensing provides a new paradigm for signal sampling and compression. CS theory asserts that a sparse or compressible signal can be reconstructed with high probability from a small number of measurements, which is far smaller than the length of the original signal. For example, consider an n-dimensional signal vector x = (x 1 , . . . , x n ) T . We say that x is a k-sparse signal if there are at most k(k n) nonzero elements in x. To reduce the dimensionality of x, CS applies a measurement matrix Φ ∈ R m×n onto x to obtain an m-dimensional signal y ∈ R m . The measurement matrix Φ can be a Gaussian or Bernoulli random matrix which follows the restricted isometry property (RIP) [42]. CS theory states that the k-sparse signal x can be accurately recovered with high probability from m = O(k log(n/k)) linear combinations of measurements y. It has been proven that recovering the signal x from y can be achieved through solving an 1 -minimization problem [43]: However, the signals existing in nature, such as temperature and humidity, are not perfectly k-sparse as they may have k large transform coefficients while the remaining coefficients are small. Suppose that the signal x can be represented Ψ = (ψ 1 , . . . , ψ n ) as in an n × n orthogonal basis Ψ, where θ = (θ 1 , . . . , θ n ) T is the transform coefficients of x in the basis Ψ. We reorder the coefficients θ i in decreasing magnitude such that We say that the signal x is a power-law decay signal in the basis Ψ if for some fixed p, the ith largest transform coefficient satisfies for each 1 ≤ i ≤ n, where p controls the compressibility of the transform coefficients (i.e., a smaller p implies faster decay) and R is a constant. The best k-term approximation of x (obtained by keeping the k largest coefficients and setting the others to zero) is given byx k = ∑ k i=1 θ i ψ i , where 1 ≤ k ≤ n is fixed. We say that x is compressible in Ψ when the mean squared approximation error behaves like for some fixed constant C r > 0 that only depends on p, where the parameter p controls the compressibility of x in Ψ. CS theory states that the compressible signals can also be optimally recovered from m = O(k log(n/k)) random measurements with high probability [42] through solving an 1 -minimization problem [43]:

System Model
We consider a multi-hop wireless sensor network for data gathering, which consists of n sensor nodes N = (1, . . . , n) and one mobile collector C. We assume that the sensor nodes are deployed uniformly and randomly in a unit square area to periodically sample spatial-temporal data from a sensing field. Suppose that the ith sensor node takes m readings for every sensing period T at a certain sampling speed in a data gathering tour, which is denoted as a signal vector In this paper, we model the WSN as a random geometric graph G(V, E ), where V is a set of vertices representing the sensor nodes N and E is a set of edges representing the links among the sensor nodes.
Consider a mobile collector roaming over the graph G. The mobile collector can be a physical mobile agent (e.g., data mule) [44]. We assume that the mobile collector can visit a sensor node to collect data at time t and then moves to one of its neighbors within its detection range r(n) at time t + 1. The detection range is defined as the maximal distance that the mobile collector can move in each time slot. Figure 1 illustrates an example of collecting sensory data in a WSN. In this work, sensory data collected by the mobile collector at each node comes from the compressive temporal measurements of a node in a sensing period. A similar model for mobile data gathering in WSNs was adopted in [37]. We assume that there is an edge between two sensor nodes if the Euclidean distance between them is smaller than the distance r(n). One of the merits to use the mobile collect is due to the fact that it does not need to equip with a powerful transceiver to communicate with sensor nodes. Instead, the mobile collect can roam closely to sensor nodes (much smaller than r(n)), which can also significantly reduce energy budget at each sensor node.

Problem Formulation
In this paper, we assume that the sensory data exhibits both spatial and temporal correlations, which is typical for most environmental monitoring applications in a densely deployed WSN. Let D ∈ R m×n represent spatial-temporal sensory data collected by n sensor nodes in every sensing period T, where the ith column of D is data vector sampled by the ith sensor node during the sensing period T and the jth row of D is data vector sensing by n sensor node at a sampling instant T j . We also assume that the signal D is sparse in both spatial domain Ψ s and temporal domain Ψ t . It has been proven that by using a certain type of sparse projection matrix the information of the entire sensing field can be approximately reconstructed from only a small fraction of randomly selected sensor nodes from the perspective of CS theory [23]. As discussed above, if these random selected sensor nodes directly transmit their data to the FC through multi-hop, it will consume a large amount of energy budget. Therefore, in this paper we consider a random walk based strategy for the mobile collector to travel over the graph G to collect spatial-temporal data from a small subset of sensor nodes. We tackle the following problems: (1) What is the appropriate movement strategy (i.e., the transition probability of a random walk ) for the mobile collector so that the sensing field can be reconstructed from a small fraction of sensor nodes? (2) How many nodes and how many measurements at each node should be collected in a data gathering period? (3) How many steps for the mobile collector should take to collect these measurements in a data gathering period?

Spatial-Temporal Data Gathering with Random Walk
In this section, we propose a random walk based scheme for spatial-temporal data gathering in a WSN. In every period T of data gathering, after sampling m measurements, each sensor node applies CS to compress temporal sensory data using a random projection matrix Ψ t , i.e., y i = Ψ t x i , where y i is the random projections of the ith sensor node and Ψ t ∈ R m t ×m can be a Gaussian or Bernoulli random matrix. We now describe how to employ mobile collector with a random walk algorithm to gather random projections from a fraction of sensor nodes. We adopt an improved Metropolis-Hastings random walk algorithm (MHRW), named Metropolis-Hastings random walk algorithm with delayed acceptance (MHDA) [45]. The MHDA algorithm not only achieves a uniform stationary distribution for the mobile collector to unbiasedly visit sensor nodes but also improves efficiency comparing with the traditional MH algorithm. Specially, under the MHDA algorithm, the mobile collector reduces the bias of visiting previous nodes when choosing its next step. Different from MHRW, MHDA avoids the random walk to backtrack to the previous visited node. The idea using the MHDA algorithm for mobile data gathering is described as follows. At the beginning of the algorithm, the mobile collector C randomly selects one of sensor node i ∈ N to invoke a random walk with the length t = 0 and collect random projections y i from node i. At time t, the mobile collector C chooses a node j uniformly at random from its neighbors of the current visited node u within its detection range r(n) to visit. MHDA considers two scenarios for the mobile collector to harvest data when it selects the next step. Let Y t and k be the previous visited node and the next node to be visited, respectively. First, if j is the previously visited node Y t , then the walk will stop going to j and the transition to node j is delayed. Instead, it will go to another node k ∈ N(u) \ {j} with another proposal probability 1/(d(u) − 1), where N(u) is the set of the neighbor nodes of u and d(u) denotes the degree of node u. The proposed transition to k is then accepted with another acceptance probability [45] It has been proved in [45] that when the acceptance probability A (u, k) is chosen as the condition in Equation (7) then the transition matrix P = [P(i, j)] of the random walk is irreducible and non-reversible with a unique stationary distribution and leads to the unbiased estimator. Second, if k = Y t , then the walk will go to k to collect data as in the traditional MHRW algorithm. The proposed state transition to k is accepted with an acceptance probability A(u, k) = min{1, d(u) d(k) }. Figure 2 shows two scenarios for MHDA algorithm. At each time, when the walk visits a new node, the node sends its random projections to the mobile collector and increments the length t. When t reaches a given quantity, then the mobile collector C stops the algorithm and performs the CS recovery algorithm. The procedure of the mobile collector employing the MHDA algorithm is summarized in Algorithm 1. Figure 2. Illustrating two scenarios for MHDA algorithm. (a) The next node j is the previous visited node Y t ; (b) The next node j is not the previous visited node Y t .
In the proposed scheme, it requires accurate location information of sensor nodes to help the mobile collector to find the nodes to be visited at each step. On the other hand, these location information is also needed to construct Gaussian kernel basis required for the recovery algorithm. Generally, there are mainly two categories to obtain location information in WSNs: range-free and range-based approaches. Range-free approaches provide imprecise estimation of the node location, which mainly rely on connectivity measurements (e.g., hop count), whereas range-based approaches provide more precise location estimation by measuring the Euclidean distances among the nodes with various ranging techniques such as TOA (time of arrival), TDOA (difference of arrival), AOA (angle of arrival) and RSSI (received signal strength indicators). One also can combine range-free and range-based approaches to improve the localization accuracy [46]. On the other hand, for an outdoor environment, the mobile collector can be equipped with a GPS module to obtain the location information. With the location information, the mobile collector can get closely to a node for data collection. For the recovery algorithm of CS, the mobile collector should know which nodes have been visited. Thus, the additional overhead for the mobile collector only needs to contain the corresponding indices of the nodes to indicate which nodes have been visited. We also note that this overhead does not increase with the size of sensory data collected by the mobile collectors and only depends on the number of sensor nodes in the network. For example, in practice, we only need one bit for each sensor. Therefore, the additional overhead for the random walk is very small compared to the size of data sensory collected by the mobile collector, which can be neglected.

Algorithm 1
The MHDA random walk algorithm for mobile data gathering at time t.
1: Select a node j uniformly at random from neighbors of u, i.e, N(u) 2: Generate a uniform random probability p ∈ [0, 1] 3: if p < min{1, d(u)/d(j)} then 4: if node j is the previous node Y t (i.e., Y t = j) and d(u) > 1 then 5: select a node k uniformly at random from neighbors N(u) \ {j} 6: Generate a uniform random probability q ∈ [0, 1] 7: if q < min{1, min{1, (d(u)/d(k)) 2 } max {1, (d(j)/d(u)) 2 }} then 8: C visits node k and collects random projections y k from node k 9: B t+1 = B t ∪ y k and Y t+1 ← i 10: else 11: C visits node j and collects random projections y j from node j 12: end if 14: else 15: C visits node j and collects random projections y j from node j 16: end if 18: else 19: Stay at node u and Y t+1 ← Y t 20: end if

Random Measurement Matrices Design
In this section, we study the random matrices used in the proposed scheme. As discussed above, the random matrix Φ t for temporal data can be a Gaussian or Bernoulli random matrix. In the following, we study the random matrix constructed from the MHDA algorithm. Let Φ s be an m s × n boolean random matrix. Each row of Φ s contains only one nonzero element, i.e., "1", which denotes that the corresponding sensor node has been visited by the mobile collector. For example, Φ s (i, j) = 1 denotes that node j is the ith node visited by the walk. Then each element of Φ s can be expressed as follows where V i is the set of the vertices which is the ith visited node by the mobile collector.

Spatial-Temporal Sparsity Representation Bases Design
Due to spatial-temporal correlation of sensory data, we consider a two-dimensional sparsity representation basis to sparsify a sensing field. For time-series sensory data from one sensor node, we adopt DCT transform basis Ψ t to compress these data in temporal domain. For spatial correlation of sensory data, we use a kernel-based method to sparsify a sensing field as in [41]. A two-dimensional Gaussian kernel function is adopted to construct a transform basis in spatial domain as follows where µ i ∈ R 2 represents the coordinates of node i following an i.i.d. sample with uniform distribution on [0, 1] 2 . µ i − µ j = d ij represents the distance between node i and node j. We assume these distances are known. As a result, the corresponding kernel matrix K n can be expressed as follows We adopt a centered version of the kernel matrix where data is centered in the feature space. Each entryK ij of the centered kernel matrixK n is given as [47] where K ij is the entry of the uncentered kernel matrix K n . We then diagonalize the kernel matrixK n as K n = ΨΛΨ −1 , where Ψ s is an orthonormal eigenvector basis, Λ is the diagonal matrix whose diagonal elements are the corresponding eigenvalues ofK n . In this paper, we use Ψ s as a representation basis to sparsify sensory data in spatial domain.

Reconstruction Algorithm of Spatial-Temporal Data
. . , x m,i ) denote the measurement vector taken by node i and x ∈ R m×n represent the matrix of all the measurements sampled by n nodes. Hence, x can be represented as a 2-D signal. In this work, we employ the framework of KCS for spatial-temporal sensory data [48].
As aforementioned, we use Ψ s and Ψ t as the sparsifying bases for spatial and temporal domain data, respectively. According to the theory of KCS, a single sparsifying basis for an entire multidimensional signal can be obtained by the Kronecker products of sparsifying bases for each dimensional signal [48]. Then, a single sparsifying basis for x can be obtained asΨ = Ψ s ⊗ Ψ t , which is defined asΨ Hence, x can be represented as x =Ψθ, where x is a vector-reshaped representation of x and θ is the vector of transform coefficients of x .
Similarly, we can also use Kronecker products to develop measurement matrices. As discussed above, at each sensor node the matrix Φ t is used to obtain its individual compressive measurements and the matrix Φ s is used to obtain compressive measurements of all the sensor nodes, which is a boolean random matrix. Hence, the joint measurement matrix can be expressed asΦ = Φ s ⊗ Φ t as followsΦ where 0 represents a matrix with all entries being 0. Suppose that the compressive measurements obtained at the sink are denoted as Y = (y T 1 y T 2 . . . y T m s ) T where m s is the number of the nodes visited by the mobile collector. The data obtained by the sensor nodes is denoted as X = (x T 1 x T 2 . . . x T n ) T . Then, recovering the signal X from compressive measurements Y can be conducted by solving the following 1 -minimization problem:

Algorithm and Performance Analysis
In this section, we first prove that the random CS matrix A =ΦΨ constructed from the proposed spatial-temporal data gathering scheme with the MHDA algorithm and the spatial-temporal representation basisΨ, follows the RIP. Then, we analyze the performance of the proposed scheme in terms of the actual number of steps that the mobile collector needs to take during a data gathering tour.

Algorithm Analysis
We now prove the following lemma stating that the random CS matrix A =ΦΨ follows the RIP with the following bounds of restricted isometry constants. The restricted isometry constant is defined as follows: Definition 1. The K-restricted isometry constant δ for the matrix A =ΦΨ is the smallest nonnegative number such that, for all θ ∈ R N with ||θ|| 0 = K defined as in Section 4.2 and Ψ s , Ψ t are defined as in Section 4.3, thenΦ s andΦ t satisfy the RIP with high probability, respectively, and where δ s and δ t are the restricted isometry constants ofΦ s andΦ t , respectively.

Proof.
We first prove that bothΦ s andΦ t follow the RIP with high probability. As discussed above, we adopt a random Gaussian or Bernoulli random matrix as the random measurement matrix Φ t and the DCT transform basis as the orthonormal sparsity basis Ψ t . Obviously, the random matrix Φ t = Φ t Ψ t follows the RIP with high probability. We now are ready to prove thatΦ s = Φ s Ψ s also follows the RIP with high probability. The proof can be done following the similar reasoning as in [41]. Compared with [41], a different random walk algorithm is employed in this paper, which results in a distinct node sampling probability. As proved in [41], the entries of Ψ s follow a sub-Gaussian distribution, i.e., Ψ s (j, i) ∼ Sub(c 2 ), where c > 0 is a constant. Then, for any 2 ) = m s σ 2 [49] Theorem 4.2. As discussed above, the probability that each row is selected fromΦ s is equal to the probability that the corresponding node is visited by the random walk. As proved in [45], the MHDA algorithm is theoretically guaranteed to construct a random walk algorithm to achieve a uniform stationary distribution, improving efficiency over the standard MHRW algorithm. According to the theorem in [49] Theorem 4.2, for any α ∈ (0, 1) and β ∈ [c 2 /σ 2 , β max ], there exists a constant C * ≥ 4 depending only on β max and the ratio σ 2 /c 2 such that and By normalization, Φ s = n m s (Φ s1 ,Φ s2 , . . . ,Φ s m ) T . Furthermore, we have E(Φ s (j, i) 2 ) = ( n m s ) 2 E(Φ s (j, i) 2 ) = 1 m s and E(Φ s (j, i)) = n m s E(Φ s (j, i)) = 0. Then, we obtain By setting α = 1 − δ, Equation (17) can be expressed as Similarly, by setting β = 1 + δ, Equation (18) becomes For a k-sparse signal x, there exist (n, k) possible signal sets x. Then we have (n, k) ≤ (en/k) k by Sterling's approximation. Therefore, we have the probability exceeding 1 − 2(en/k) k · e − mδ 2 C * = 1 − 2e − mδ 2 C * +k log(n/k)+k such that Therefore, we can choose m s = O(k log(n/k)) such thatΦ s satisfies the RIP with the probability approximating to 1.
From the above analysis, we have proved thatΦ s andΦ t follow the RIP with high probability, respectively. According the property of Kronecker product, we have Let δ s and δ t are the restricted isometry constants ofΦ s andΦ t , respectively. In [48], it has been proven that the RIP constant of the Kronecker product matrix δ(Φ s ⊗Φ t ) satisfies the following inequality Then, we obtain which finishes the proof.

Performance Analysis
We are now ready to calculate the actual number of steps m l that the mobile collector needs to take to visit m s distinct sensor nodes. Since the mobile collector may visit a sensor node for several times including self loop steps the actual number of steps m l should be larger than m s . The advantage of the MHDA algorithm is that it can avoid backtracking to the previously visited nodes to reduce the number of steps that it spends at the same nodes.

Lemma 2.
The expected number of steps that the mobile collector needs to take to visit m s distinct sensor nodes over the graph G is where d min and d max are the minimum and maximum degree of the graph G, respectively.
Proof. Consider a random walk with a length of m l using the MHDA algorithm and let U 1 , U 2 ,. . ., U m s be the distinct nodes that the mobile collector may visit during the random walk. Suppose that for a random walk and a sequence of denote the number of steps that the mobile collector needs to spend at node v i until it visits the next distinct node. Hence, X i is 1 plus the number of steps that the mobile collector stays at the self loop.
Note that X i is a random variable which only depends on U i and not on the previously visited node. Now consider any step at node v i . Then, the probability of X i = 1 excluding the number of steps at the self loop is d v i p ij , where d v i is the degree of v i and p ij is the transition probability of the random walk.
Note that for every sequence of node v 1 , v 2 ,. . ., the random variables X 1 , X 2 ,. . . are independent given that Note that m s is a fixed number given before the random walk runs and m l is a random variable. So E(m l ) is the actual number of steps that we want to calculate. Since ∑ m s i=1 X i = m l , we obtain: Since then we have Note that the probability p ij is the transition probability of the random walk involving the two scenarios as described in the Algorithm 1.

Discussion
In the following, we discuss the performance improvement of the proposed scheme over the other existing schemes in terms of the number of transmissions that the sensor nodes require to deliver their measurements to the mobile collector. Firstly, we take the raw data transmission without compression as a baseline scheme for comparison, where the measurements of n sensor nodes with each having m measurements are collected by a mobile collector by using a standard random walk algorithm until all of nodes have been visited. Obviously, it requires at least total mn transmissions in one round of data gathering. Additionally, it also requires O(mn log n) time slots to visit all of nodes in a random geometric network, which has been proven in [50]. In the meanwhile, compared with the schemes using multiple random walks as proposed in [21,26,27], our scheme only needs to take a random walk, which consumes E = O(k t k s log(m/k t ) log(n/k s )) transmissions and spends T = O(k t k s log(m/k t ) log(n/k s )) time slots, whereas it takes m r = O(k s log(n/k s )) random walks with a length of T r = O(n/k s ) steps [21], resulting in total E = O(k t n log(m/k t ) log(n/k s )) number of transmissions. Therefore, the proposed scheme has significant improvements on energy consumption and data gathering delay over the existing CS-based approaches.
However, we note that it is not efficient to use only one single mobile collector for data gathering, since it incurs high delay due to the low moving velocity of the mobile collector. To further reduce the collection time, multiple mobile collectors can be employed so that each mobile collector may visit fewer sensor nodes to collect their measurements. On the other hand, we also note that the proposed scheme might incur high cost due to the deployment of mobile collector. Such a mobile collector requires high power to allow it to move around a sensing field. However, it in turn helps to save energy consumption at each sensor node since it is possible for a node to use a relatively small power of its transceiver for data transmission when a mobile collector comes by.

Numerical Simulations
In this section, we evaluate the performance of the proposed scheme for spatial-temporal data gathering in WSNs through simulations. The proposed algorithm is implemented by MATLAB. A cvx package is used to solve 1 programming for CS decoding algorithm [51].

Spatial and Temporal Correlation Characteristics of the Dataset
In this section, we first introduce the dataset adopted in the simulation and then analyze the correlation characteristics. The real dataset we use is obtained from a remote sensing system to measure sea surface temperature [52]. We assume that a wireless sensor network is composed of 512 sensor nodes which are randomly and uniformly deployed in a sensing field to monitor the sea environment. Such a deployment will result in an irregular network topology. Figure 3 presents spatial-temporal temperature data collected by the sensor nodes. To analyze the correlation characteristics of the real dataset, we compute the spatial correlation and the temporal correlation of the dataset following the approach proposed by Zordan et al. in [53]. In order to calculate the spatial correlation, we randomly select 3000 pairs of points from the total number of pairs. For each pair of points, we calculate its distance d and the corresponding spatial correlation function ρ S using Equation (2) in [53]. Similarly to the procedure adopted in [53], we divide the maximum distance d max of the all pairs of points into 20 intervals. Then we calculate the average spatial correlation coefficients for all the pairs of points whose distance falls within the same interval. We also calculate the spatial correlation vs. the distance using the Power Exponential (PE) model and the Rational Quadratic (RQ) model [53]. In this simulation, we select the parameters ξ = 0.38 and ν = 0.3 for the PE model and ξ = 1.8 and ν = 1.2 for the RQ model (corresponding to Equations (3) and (4) in [53], respectively). In Figure 4, we plot the spatial correlation ρ S with the normalized distance d/d max ∈ [0, 1] for the real dataset. As shown in Figure 4, we can see that the spatial correlation of the real dataset used in this paper nicely fits the PE model. On the other hand, we also calculate the temporal correlation coefficients of the dataset using Equation (6) in [53]. We find that the average temporal correlation coefficient of the dataset is 0.9984, which shows the strong temporal correlation.

Performance of Spatial-Temporal Sparsity Representation Basis
In this section, we evaluate the performance of the proposed scheme using spatial-temporal sparsity basis and show the performance improvement over the other conventional bases. We use Gaussian kernel basis (GKB) and DCT as the spatial and temporal transform bases (GKB-DCT), respectively. The kernel parameter ω is set to 1 for GKB. The relative error is used to evaluate the reconstruction quality, which is defined as ε = x − x 2 2 / x 2 2 , where x andx are the original sensory data and the reconstructed data, respectively. Figure 5 shows 2D transform coefficients by using GKB-DCT. From Figure 5, it can be seen that there are only few transform coefficients whose absolute values are much larger than the remaining coefficients. This explains that GKB-DCT can be well used to sparsify sensory data for an irregular topology. We also compare the reconstruction performance with the other schemes using spatial sparsity bases such as Laplacian eigenvector basis (LEB), discrete wavelet transform (DWT) and discrete cosine transform (DCT) and temporal sparsity basis DCT (termed as LEB-DCT, DWT-DCT, DCT-DCT), respectively. The eigenvector of the Laplacian matrix of a graph G(V, E) has been commonly adopted to sparsify data for an irregular topology [21,54]. The Laplacian Matrix of a graph is usually used to characterize the topology of a sensor network, which is defined as follows [55]: where d i,i is the degree of node i.
From Figure 6, it is noticed that the proposed scheme using GKB-DCT outperforms the other schemes using LEB-DCT, DWT-DCT, DCT-DCT. This indicates that GKB-DCT is a more efficient spatial-temporal transform basis to sparsify data than other bases in an irregular deployment. Figure 6 also shows that the conventional bases (e.g., DCT and DWT) perform poorly. This is because such bases are more appropriate for a regular deployment.

Performance of Spatial-Temporal Data Gathering with Mobile Collector
In this section, we evaluate the performance of the proposed spatial-temporal data gathering scheme using the MHDA algorithm. We choose r(n) = 8 log n n as the detection range of the mobile collector since it has been proven that a random geometric graph has optimal partial cover time with high probability if r(n) ≥ 8c log n n where c ≥ 1 [50]. We first investigate the performance for data gathering without considering temporal data under the varying compression ratios. We compare our scheme with the other schemes using simple random walk (RW) algorithm, IID algorithm and dense random projections (DRPs). In the RW scheme, the mobile collector randomly and uniformly selects one node from its neighbors for the next visit. In the IID scheme, we assume that the fusion center randomly picks up some of the nodes to send their data. In the DRPs scheme, we employ a Bernoulli random matrix instead of using a mobility model to generate a random measurement matrix as the baseline scheme for performance comparison. We use the Gaussian kernel basis as the sparsity representation basis for all the schemes. Figure 7 plots the reconstruction error for the schemes above. The reconstruction error is computed over 20 trials for each scheme. From Figure 7, we observe that the proposed MHDA algorithm outperforms the RW algorithm when compression ratios vary from 0.1 to 0.3. This is because the node sampling distribution induced by the MHDA algorithm is more uniform than the one induced by the RW algorithm. The IID algorithm outperforms the MHDA algorithm due to the same reason. However, we also notice that when compression ratios vary from 0.4 to 0.6, the MHDA algorithm can achieve the similar reconstruction quality compared to the IID and DRPs schemes.
We next compare the performance in terms of communication cost of sensor nodes for the four schemes above. The communication cost is evaluated in terms of the number of transmissions required for the sensor nodes to send their data to the mobile collector or the fusion center. Figure 7 plots the number of transmissions required for different schemes. We can see that the proposed scheme outperforms all of the other schemes. For instance, our scheme can provide about on average 4%, 49% and 87% transmission cost reduction compared to ST-SRW, ST-IID, ST-MRW. Compared to ST-SRW, the performance gain of our scheme benefits from the fact that the MHDA algorithm makes the mobile collector avoid backtracking the previously visited nodes. We also conduct simulations to validate the analytical result derived in Lemma 2. As discussed before, since the mobile collector may backtrack to some previously visited nodes, it needs to take additional steps that it spends at the same nodes. In this simulation, we calculate the actual number of steps m l taken by the mobile collector to visit m s distinct sensor nodes under the various sizes of network n. Figure 8 plots the expected number of steps E(m l ) and the upper bound of the analytical result (i.e., the right term in Equation (30)) with the parameter m s /n for n = 512 and n = 1024, respectively. It can be seen that the actual number of steps E(m l ) is upper bounded by the analytical result, which is approximately smaller than m s d max /d min . In [56], it has been proven that the maximum and minimum degree of a connected random geometric graph G are in the same order, i.e, d max /d min = c where c is a constant. This indicates that the actual number of steps E(m l ) approximately scales linearly with the number of distinct nodes m s and has the same order with m s (i.e., E(m l ) = O(m s )).
We then compare the performance with various spatial-temporal data gathering schemes including ST-SRW, ST-IID, ST-MRW. ST-SRW and ST-IID extend the aforementioned schemes SRW and IID to the spatial-temporal case. In the ST-MRW scheme, we extend the multiple random walks based algorithm adopted in [21] to the spatial-temporal data collection scheme where an LEB-DCT transform basis is adopted. Figure 9 shows the reconstruction performance for the four schemes above. We observe the similar results as obtained in the only spatial data collection scheme above.
On the other hand, instead of using multi-hops as in ST-IID or multiple random walks as in ST-MRW, the proposed scheme performs only a random walk for data collection, which is able to significantly reduce the communication cost of sensor nodes.  On the other hand, it is worth noting that it might be not efficient by using only one single mobile collector for data gathering since it incurs high delay. In this simulation, we investigate the performance of employing multiple mobile collectors for data collection. Obviously, we can make use of multiple mobile collectors to increase the speed of data collection since each one may visit fewer sensor nodes to collect the measurements. Figure 10 plots the reconstruction error when the number of mobile collectors is nc = 1, 2, 4, 6, respectively. It is clear that the similar reconstruction quality can be achieved when the total number of measurements collected by all the mobile collectors in each scheme is the same. Therefore, our scheme can also be easily extended to the case using multiple collectors to reduce the data gathering delay although only one collector is utilized in this work.

Performance of the Proposed Scheme with Packet Loss
We further evaluate the performance of the proposed scheme against unreliable wireless channel environments. In this simulation, we assume that the temporal data at each sensor node in one round of data gathering is compressed by CS and encapsulated into a packet. We also assume that a packet at a sensor node is transmitted to the mobile collector with a packet loss rate P. we compare the performance of various spatial-temporal data gathering schemes under a 20% packet loss rate. As shown in Figure 11, the performance of the proposed scheme is superior to the other schemes compared to ST-RW, ST-IID, ST-MRW. This is due to the fact that a projection in our scheme is generated from only one sensor node instead of a linear combination of the measurements from multiple nodes as in ST-MRW. The result also demonstrates the mobile data gathering schemes outperform the multi-hop transmission strategies as adopted in ST-IID because any packet loss during the transmission to the fusion center through multi-hops will increase the loss probability of a packet.

Conclusions and Future Work
In this paper, we studied an energy-efficient data gathering scheme using compressive sensing for spatial-temporal sensory data in mobile wireless sensor networks. We proposed a novel spatial-temporal data gathering scheme using the Metropolis-Hastings random walk algorithm with delayed acceptance, which allows a mobile collector to harvest compressive measurements by sequentially visiting a small subset of nodes along a random routing path. We proved that the equivalent sensing matrix constructed from the proposed scheme for spatial-temporal dimensional compressible signal satisfies the RIP. In particular, we showed that from the mobile collector needs to visit m s = O(k s log(n/k s )) randomly selected nodes and collect m t = O(k t log(m/k t )) compressive measurements from each node so as to reconstruct a sensing field assuming that the field has k s and k t spatial and temporal dimensional sparsity, respectively. We presented extensive simulation results to demonstrate that the proposed scheme is able to not only significantly reduce communication cost but also improve reconstruction accuracy compared to some existing CS-based schemes. We also showed that the proposed scheme be also resilient to unreliable wireless environment under various packet losses. However, we shall note that it requires accurate location information of sensor nodes to construct a sparsity representation basis for signal recovery due to the use of Gaussian kernel basis. On the other hand, the mobile collector also requires these location information for the navigation in the sensing field. When these location information is not available, how to develop a more efficient sparsity representation basis and a more practical random walk algorithm may be more important for a wireless sensor network. As a result, we intend to leave it for future study. Author Contributions: Haifeng Zheng designed the algorithm and performed theoretical analysis, and wrote the entire manuscript. Jiayin Li carried out the simulation and contributed to the manuscript preparation. Xinxin Feng and Zhonghui Chen provided some suggestions on the writing of the manuscript. Wenzhong Guo provided with the support of the entire study and revised the manuscript. Neal Xiong contributed to polish the revised manuscript and gave the suggestions on simulation evaluation.

Conflicts of Interest:
The authors declare no conflict of interest.