Adaptive Compressive Sensing and Data Recovery for Periodical Monitoring Wireless Sensor Networks

The development of compressive sensing (CS) technology has inspired data gathering in wireless sensor networks to move from traditional raw data gathering towards compression-based gathering that exploits data correlations. While extensive efforts have been made to improve data gathering efficiency, little has been done for gathering and recovering data whose sparsity is unknown and dynamic. In this work, we present an adaptive compressive sensing data gathering scheme to capture the dynamic nature of signal sparsity. By re-sampling only a few measurements, the current sparsity as well as the new sampling rate can be accurately determined, thus guaranteeing recovery performance and saving energy. In order to recover a signal with unknown sparsity, we further propose an adaptive step size variation integrated with a sparsity adaptive matching pursuit algorithm to improve the recovery performance and convergence speed. Our simulation results show that the proposed algorithm can capture the variation in the sparsity of the original signal and obtain a much longer network lifetime than traditional raw data gathering algorithms.


Introduction
Wireless Sensor Networks (WSNs), which are capable of sensing, computing, and wireless communication, can be applied to a wide range of applications, such as scientific observation, emergency detection, climate detection, ecosystem surveillance, and physical hazard prevention [1]. In many of these applications, sensor nodes are battery powered and densely deployed in unattended, hostile environments. Once deployed, these nodes send their sensing results to the sink node periodically. Due to the constraints of the application environment, it is crucial to prolong the network lifetime of the WSN. According to the energy consumption model presented in [2], the energy consumed by a sensor node increases exponentially with the communication distance. As a result, sensor nodes usually route their data to the sink node via multihop transmission to save energy. Multihop communication is also essential for large-scale WSNs in which the transmission range of the sensor nodes is much smaller than the size of the target area.
Since all the sensor nodes forward their data to only one sink node, the routing tree in WSNs usually exhibits a many-to-one structure, which is often called convergecast [3,4]. The main drawback of the convergecast structure is the energy hole problem, chiefly because the sensor nodes close to the sink must relay the traffic of the rest of the network and therefore deplete their energy first.
After gathering the data with compressive sensing, another challenge is how to design a signal recovery algorithm with fast reconstruction and reliable accuracy. The signal recovery algorithms applied in WSNs fall into two categories: (1) basis pursuit (BP), which solves an $\ell_1$-minimization problem using linear programming, and (2) iterative greedy pursuit, for example, the orthogonal matching pursuit (OMP), the stagewise OMP (StOMP) [24], and the compressive sampling matching pursuit (CoSaMP) [25]. Among them, BP requires the fewest measurements; however, its high computational complexity prevents it from being used for large-scale applications. OMP and StOMP adopt a bottom-up approach in signal recovery, and their complexity is much lower than that of BP. However, they require more measurements and lack recovery guarantees. CoSaMP adopts a top-down approach and offers a recovery performance similar to that of the BP method, but with a much lower recovery complexity. However, all the algorithms discussed above assume that the sparsity of the signal is known a priori, so they cannot be directly applied to the reconstruction of a signal with unknown sparsity. For blind signal recovery with unknown sparsity, the most commonly applied algorithm is the sparsity adaptive matching pursuit (SAMP) [26], where the sparsity and the true support of the signal are estimated stage by stage with the "divide and conquer" method. SAMP can also be viewed as a generalization of existing algorithms, such as OMP or CoSaMP.
However, the design of an appropriate step variation to guarantee fast convergence and accuracy is still an open problem.
Different from the aforementioned works, the aim of this work is to design an adaptive compressive sensing framework for periodical monitoring of WSNs. To the best of our knowledge, the work most closely related to this research is the intelligent compressive sensing scheme proposed in Refs. [27,28]. The main difference between our work and theirs is how the sparsity is determined. In Ref. [27], sparsity was obtained with local correlations, and the signal was recovered by successive reconstruction applied at the sink node. In Ref. [28], compressive sampling was applied in a single-hop IoT system, and a learning phase was designed at the central smart object to select the sparsity level. In our work, compressive sampling is applied in multi-hop wireless sensor networks, and the variation in the sparsity is determined by very few re-sampling iterations. Additionally, we present an improved SAMP to recover signals with unknown sparsity. The main contributions of this paper are summarized as follows:
• We propose an adaptive compressive sensing framework for periodical monitoring of WSNs, where a reconstruction error estimation module is designed to check whether the current sampling rate is still sufficient for signal reconstruction, and a sparsity determination module is designed to estimate the sparsity and calculate the required sampling rate for the next monitoring period.
• We propose an efficient sparsity variation determination algorithm, which can determine the current sparsity as well as the new sampling rate by re-sampling only a few measurements, thus saving energy and guaranteeing the recovery performance.
• We propose an improved SAMP algorithm to recover signals with unknown sparsity, where both linear and non-linear step size variations are designed to guarantee fast convergence and reliable accuracy.
• We evaluate the proposed algorithms with extensive simulations and study the impacts of multiple environmental factors, including the number of sensors and the sampling rate. The simulation results show that our proposed algorithm achieves substantial improvements over existing algorithms in terms of sparsity matching and signal recovery.
The remainder of this paper is organized as follows. Section II presents the mathematical details for the CS and data gathering. We propose our adaptive signal sampling method with unknown sparsity in Section III. The signal recovery algorithm is presented in Section IV. Section V presents the numerical results, and Section VI concludes the paper.

Data Gathering Based on Compressive Sensing
We consider a monitoring WSN consisting of N sensor nodes and a sink node, where the set of sensor nodes is denoted as $\mathcal{N} = \{1, \ldots, N\}$, and the only sink node is denoted as S. We denote the set of monitoring periods as $\mathcal{P}$. Sensor nodes are distributed in the target area to sense the physical conditions and report their sensory readings to the sink node in each period ($p \in \mathcal{P}$).
Let the data sensed by sensor node $i$ ($i \in \mathcal{N}$) in time period $p$ be $x_{i,p}$, so that the signal of all the sensor nodes in time period $p$ can be represented by an N-dimensional vector, $x_p = (x_{1,p}, \cdots, x_{N,p})^T$. The signal $x_p$ is said to be k-sparse in domain $\Psi_p$ if it can be represented by a k-sparse vector, $d_p$, in that domain:

$$x_p = \Psi_p d_p,$$

where $\Psi_p = [\psi_{1,p}; \cdots; \psi_{N,p}]$ ($\psi_{i,p} \in \mathbb{R}^N$) is the orthonormal basis, and $d_p$ represents the transform coefficients with only k non-zero values. The compressive sensing theory states that an N-dimensional signal with k-sparsity can be represented by M (M < N) linearly projected measurements. Let $\Phi$ ($M \times N$) be the measurement matrix; the measurement vector $y_p$ is then given as

$$y_p = \Phi x_p = \Phi \Psi_p d_p.$$

According to Ref. [29], the exact recovery of $x_p$ can be achieved by solving the following combinatorial optimization problem:

$$\min_{d_p} \|d_p\|_0 \quad \text{s.t.} \quad y_p = \Phi \Psi_p d_p.$$

This is an NP-hard problem, and it is hard to obtain the optimal solution. However, it is equivalent to the following $\ell_1$ optimization problem if $\Phi$ and $\Psi_p$ satisfy the incoherence property or the matrix $\Phi \Psi_p$ satisfies the restricted isometry property (RIP) [30]:

$$\min_{d_p} \|d_p\|_1 \quad \text{s.t.} \quad y_p = \Phi \Psi_p d_p.$$

The above $\ell_1$-minimization problem is more tractable and can be solved with linear programming (LP) techniques [31]. With the recovered coefficients $\hat{d}_p$ and the known orthonormal basis $\Psi_p$, the recovered signal is obtained as $\hat{x}_p = \Psi_p \hat{d}_p$.
From the above discussion, applying CS to WSNs relies on two important issues. The first is how to choose an efficient orthonormal basis to represent the original signal. Generally, sensor readings are spatially smooth and sparse in the frequency domain; thus, the wavelet transform or the discrete cosine transform can be applied to sparsify the original signal [14]. However, for signals with abnormal readings or transmission errors, it is hard to obtain sparsity in the frequency domain. To cope with this, sparse representation based on overcomplete dictionaries has been proposed [32]. With an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Various dictionary learning methods have been proposed in the literature. For example, the original (synthesis) K-SVD is one such method, which constructs an overcomplete dictionary suitable for sparse synthesis by learning the dictionary from the data itself [32].
The second issue, once a sparsifying representation has been found, is how to design a measurement matrix ($\Phi$) such that a good RIP is attained. It has been shown that a random matrix whose entries are i.i.d. Gaussian variables drawn from $N(0, \frac{1}{M})$ has a good RIP [33]. Algorithm 1 illustrates the process of compressed sampling with sampling rate M in detail. In this algorithm, each time a sensor ($i$) in period $p$ has a value $x_{i,p}$ to transmit to the sink node, it first calculates a new value $y^m_{i,p} = \phi^m_{i,p} x_{i,p}$ by multiplying $x_{i,p}$ by a Gaussian variable $\phi^m_{i,p}$ ($1 \le m \le M$). This new value is then transmitted along the path to the parent node ($j$). In this way, the data transmitted by sensor node $j$ is the sum of its own $y^m_{j,p}$ and the values from all its children, $\sum_{i \in c_j} y^m_{i,p}$, where $c_j$ is the set of children of sensor node $j$. Finally, the sink node receives the value $\sum_{i=1}^{N} \phi^m_{i,p} x_{i,p}$. This process is repeated until M linearly projected measurements are obtained, as shown in Figure 1. Similarly to Refs. [14,17,34], we assume that the measurement coefficient $\phi^m_{i,p}$ is generated using a pseudorandom number generator seeded with the identifier of node $i$, which means that the measurement matrix can be easily constructed locally at the node itself.
Figure 1. Data gathering based on compressive sensing.

Algorithm 1: Compressive sampling with sampling rate M
  m = 1
  while m ≤ M do
    Each sensor node ($i$) calculates its value, $y^m_{i,p} = \phi^m_{i,p} x_{i,p}$
    Sensor node $i$ transmits $y^m_{i,p}$ to its parent ($j$) in the routing tree
    The parent node ($j$) receives all its children's values, calculates its own transmitted value as $y^m_{j,p} + \sum_{i \in c_j} y^m_{i,p}$, and forwards this to its upstream node
    The sink node obtains the sampling data $y^m_p = \sum_{i \in \mathcal{N}} \phi^m_{i,p} x_{i,p}$
    m = m + 1
  end

According to the CS theory in Ref. [33], when the number of measurements (M) satisfies $M \propto ck$ with a constant ($c$), the original signal with k-sparsity can be recovered with high probability. Because the sparsity of a signal usually satisfies $k \ll N$, the amount of data transmitted in CS-based data gathering is much smaller than that in traditional raw data gathering, which reduces the communication costs and improves energy efficiency during data gathering.
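The in-network aggregation of Algorithm 1 can be illustrated with a minimal sketch. The toy routing tree, the sensor readings, and the helper names (`coefficient`, `subtree_sum`, `gather`) are hypothetical, and the seed formula is an assumption: the text only states that the generator is seeded with the node identifier. The sketch checks that the value arriving at the sink equals the projection the sink could rebuild from node identifiers alone.

```python
import math
import random

def coefficient(node_id, m, period, num_measurements):
    """Regenerate phi^m_{i,p} from a PRNG seeded by node id, period and m.

    The seed formula is a hypothetical choice made for this sketch.
    """
    seed = node_id * 1_000_003 + period * 1_009 + m
    rng = random.Random(seed)
    return rng.gauss(0.0, 1.0 / math.sqrt(num_measurements))

def subtree_sum(node, children, readings, m, period, num_measurements):
    """Value sent by `node`: its own weighted reading plus its children's sums."""
    own = coefficient(node, m, period, num_measurements) * readings[node]
    return own + sum(
        subtree_sum(child, children, readings, m, period, num_measurements)
        for child in children.get(node, [])
    )

def gather(children, readings, sink, period, num_measurements):
    """Run the M rounds of Algorithm 1; return the measurements seen at the sink."""
    return [
        sum(
            subtree_sum(child, children, readings, m, period, num_measurements)
            for child in children.get(sink, [])
        )
        for m in range(1, num_measurements + 1)
    ]

# Toy routing tree: node 0 is the sink; nodes 1 and 2 report to it, and so on.
children = {0: [1, 2], 1: [3, 4], 2: [5]}
readings = {1: 20.1, 2: 19.8, 3: 20.4, 4: 20.0, 5: 19.5}
y = gather(children, readings, sink=0, period=1, num_measurements=3)

# The sink can rebuild the same projections from the node ids alone.
y_direct = [
    sum(coefficient(i, m, 1, 3) * x for i, x in readings.items())
    for m in range(1, 4)
]
```

Each node transmits exactly one value per round regardless of how many descendants it has, which is what keeps the per-node load uniform.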

Adaptive Sampling for Signals with Dynamic Sparsity
From the description of CS-based data gathering, for a signal with k-sparsity, the sampling rate should be just large enough, at least ck, to guarantee the recovery accuracy while saving communication energy. However, for a signal with unknown sparsity, it is very challenging to select the sampling rate that matches the sparsity. In this section, we therefore propose an adaptive sampling mechanism for signals with variable sparsity. In our approach, we assume that the sparsity of the signal in period $p$, denoted as $k_p$, remains fixed during the entire period, as shown in Figure 2. At the starting time of the next period, the sparsity is re-examined, and the new sampling rate is determined according to the new sparsity.
Figure 2. Periodical monitoring with data re-sampling at the start of each period.
It should be noted that at the starting time of period $p + 1$, it is hard to decide whether the previous sampling ($\alpha_p$) still matches the sparsity of the signal at period $p + 1$. A straightforward solution is to conduct raw data gathering with all active sensor nodes. This approach, however, shortens the network lifetime and is inefficient for networks with large numbers of sensor nodes. Therefore, in order to determine the minimum sampling rate required to recover a signal with unknown sparsity, a compressive sampling method based on sequential observations is proposed.
Let $\hat{x}^M$ denote the signal recovered with sampling rate M. When noiseless measurements are taken using the random Gaussian ensemble, we have the following lemma [35].
Lemma 1. For a Gaussian measurement ensemble, if $\hat{x}^{M+\alpha} = \hat{x}^M$, then we can conclude that $\hat{x}^M = x$ with a probability of 1.
According to Lemma 1, the signals recovered with M samples and with M + α samples are first compared. If the two recoveries match, we declare that the signal is correctly recovered. In a practical setting, most of the coefficients after an orthogonal transformation are relatively small rather than exactly zero, which means that only approximate sparseness is obtained. For this case, Ref. [35] expresses the recovery error between the original signal and the recovery with M sequential samples, $\|x - \hat{x}^M\|_2$, in terms of the angle θ between the vectors $\hat{x}$ and $\hat{x}^M$, and an approximation of θ allows the recovery error to be further approximated. In our paper, the recovery error between $\hat{x}^M$ and $\hat{x}^{M+\alpha}$ is applied to determine the variation of the sparsity, and an adaptive sampling approach is proposed for the periodical monitoring of sensor networks.
Algorithm 2 illustrates the process of adaptive sampling in detail. In this algorithm, $M_p$ denotes the sampling rate at period $p$. Initially, raw data gathering is applied to obtain the accurate sparsity of the entire signal. After that, at the starting time of period $p$, the sampled data obtained at period $p-1$, denoted as $y^{M_p}_{p-1}$, is recovered as $\hat{x}^M$. Meanwhile, a data re-sampling operation is executed to obtain re-sampled data ($y^{\alpha}_p$) with a fixed number of sampling points, α. After re-sampling, $y^{\alpha}_p$ and $y^{M_p}_{p-1}$ are combined as $y^{M+\alpha}$ and recovered as $\hat{x}^{M+\alpha}$ at the sink node. Then, $\hat{x}^{M}$ and $\hat{x}^{M+\alpha}$ are compared to decide whether the gap between them is larger than the predefined threshold ($T_{th}$). If the recovery gap is smaller than $T_{th}$, it is concluded that the sparsity is the same as that in period $p-1$, and the sampling rate in period $p$ is set as $M_p = M_{p-1}$. If the recovery gap is larger than $T_{th}$, the new sparsity is calculated. Because re-sampling with a fixed sampling rate (α) may be unable to recover the original signal, extra sampling with sampling rate $\alpha_s$ is executed, where $\alpha_s = \beta \|\hat{x}^M - \hat{x}^{M+\alpha}\|_2 - \alpha$, and β is a scale coefficient related to the recovery gap. From the expression of $\alpha_s$, the larger the gap is, the more re-sampling measurements are needed to obtain the new sparsity level. After that, the sampling rate for period $p$ is determined and broadcast to all the sensor nodes. Similar to Refs. [14,17], it is assumed that the signal recovery and sparsity estimation are conducted at the sink node, and only the determined sparsity is sent back to each sensor node, thus reducing the energy consumption of each sensor node. It should be noted that our algorithm adopts the same data gathering procedure (shown in Algorithm 1) as traditional compressive sensing based methods. The only difference is that the sampling rate is adapted to the signal variation, which means that the data transmission delay is almost the same. Compared with raw data gathering, compressive data transmission takes much longer, mainly because compressive data gathering needs to collect the same data M times. However, compressive data gathering obtains balanced energy consumption among all sensor nodes and has a much longer network lifetime than raw data gathering. For clarity, Figure 3 illustrates the detailed process of deciding the sparsity at period $p$.
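The threshold test and the extra-sampling rule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and rounding $\alpha_s$ up to an integer and clamping it at zero are assumptions added so that the result is a valid number of measurements.

```python
import math

def recovery_gap(x_hat_m, x_hat_m_alpha):
    """Euclidean distance between the recoveries from M and M + alpha samples."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x_hat_m, x_hat_m_alpha)))

def extra_samples(x_hat_m, x_hat_m_alpha, alpha, beta, threshold):
    """Number of extra measurements alpha_s requested by the sink.

    Returns 0 when the gap does not exceed the threshold (sparsity unchanged).
    Otherwise applies alpha_s = beta * gap - alpha; the ceil and the clamp at
    zero are assumptions of this sketch.
    """
    gap = recovery_gap(x_hat_m, x_hat_m_alpha)
    if gap <= threshold:
        return 0
    return max(0, math.ceil(beta * gap - alpha))

# Identical recoveries: the previous sampling rate is kept.
unchanged = extra_samples([1.0, 2.0], [1.0, 2.0], alpha=8, beta=10.0, threshold=0.1)

# A gap of 5 (a 3-4-5 triangle) with beta = 10 and alpha = 8.
changed = extra_samples([0.0, 0.0], [3.0, 4.0], alpha=8, beta=10.0, threshold=0.1)
```

In the second call the gap is 5, so the rule asks for 10 × 5 − 8 = 42 extra measurements.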

Algorithm 2: Adaptive compressive sampling
  Set period p = 1
  Use raw data gathering to obtain $x$
  Sparsify $x$ to obtain its sparsity ($k_p$)
  Set the sampling rate $M_p$ to $k_p \log(N)$
  Sample the compressive data using Algorithm 1 with sampling rate $M_p$
  while $\min_{i \in \mathcal{N}} E_i \ge E_{min}$ do
    p = p + 1
    The sink node obtains the sampling data $y^{M_p}_{p-1}$ and recovers it as $\hat{x}^M$
    Carry out compressive data sampling using Algorithm 1 with the sampling rate α
    The sink node obtains the sampling data $y^{\alpha}_p$
    The sink node combines $y^{\alpha}_p$ and $y^{M_p}_{p-1}$ as $y^{M+\alpha}$
    The sink node recovers $\hat{x}^{M+\alpha}$ with $y^{M+\alpha}$
    if $\|\hat{x}^M - \hat{x}^{M+\alpha}\|_2 > T_{th}$ then
      Set the amount of extra sampling as $\alpha_s = \beta \|\hat{x}^M - \hat{x}^{M+\alpha}\|_2 - \alpha$
      Carry out compressive data sampling using Algorithm 1 with the sampling rate $\alpha_s$
      The sink node obtains the extra sampling data $y^{\alpha_s}_p$
      The sink node combines $y^{M+\alpha}$ and $y^{\alpha_s}_p$ as $y^{M+\alpha+\alpha_s}$
      The sink node recovers $\hat{x}^{M+\alpha+\alpha_s}$ with $y^{M+\alpha+\alpha_s}$
      Sparsify $\hat{x}^{M+\alpha+\alpha_s}$ to obtain its sparsity ($k_p$)
      Set the sampling rate $M_p$ as $k_p \log(N)$
    else
      Set the sampling rate $M_p = M_{p-1}$
    end
    Carry out compressive data sampling using Algorithm 1 with the sampling rate $M_p$
    Update the residual energy ($E_i$) of sensor node $i$, $\forall i \in \mathcal{N}$
  end
  Return the total number of periods ($p$) of the entire network.
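The two bookkeeping steps of Algorithm 2, counting the significant transform coefficients and converting the sparsity into a sampling rate, can be sketched as below. Several details are assumptions of this sketch and not stated in the text: the tolerance used to decide whether a coefficient counts as non-zero, the use of the natural logarithm in $k_p \log(N)$, rounding up, and capping the rate at N.

```python
import math

def estimate_sparsity(coefficients, tolerance=1e-6):
    """Count transform coefficients whose magnitude exceeds a small tolerance."""
    return sum(1 for c in coefficients if abs(c) > tolerance)

def sampling_rate(sparsity, num_nodes):
    """M_p = k_p * log(N), rounded up and capped at N.

    Natural logarithm, rounding, and the cap are assumptions of this sketch.
    """
    return min(num_nodes, math.ceil(sparsity * math.log(num_nodes)))

# Hypothetical transform coefficients of a recovered signal.
coeffs = [4.2, 0.0, -1.3, 1e-9, 0.0, 0.7, 0.0, 0.0]
k = estimate_sparsity(coeffs)
m = sampling_rate(k, num_nodes=200)
```

With three significant coefficients and 200 nodes, this rule yields 16 measurements per period under the stated assumptions.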

Signal Recovery with Unknown Sparsity
In sequential-based data gathering, signal recovery requires the use of non-linear algorithms to find the sparsest signal from the measurements. One challenging question in CS research is the design of a fast reconstruction algorithm with reliable accuracy and (nearly) optimal theoretical performance.
Existing signal recovery algorithms generally require the signal sparsity to be known in advance. For signals with unknown sparsity, the sparsity adaptive matching pursuit (SAMP) has been widely used for blind recovery.
In contrast to other state-of-the-art greedy algorithms, SAMP takes advantage of both the "bottom-up" approach and the "top-down" approach: the "bottom-up" method is applied to estimate the sparsity of the signal step by step, and the "top-down" method is applied to identify the true support of the signal with a backtracking strategy. Figure 4 shows the conceptual diagram of SAMP, and it can be observed that the sizes of the candidate set ($|C_k|$) and the finalist ($|F_k|$) are adaptive. However, the recovery accuracy and speed of SAMP rely heavily on the step size ($s$). In order to avoid overestimation, the safest choice is to set $s$ as 1 for an unknown k, but many more iterations are then needed for convergence. Providing accurate recovery with fast convergence has therefore become a major challenge. In this paper, we propose an adaptive step decision that corresponds with the number of iterations. The step size at iteration $t$, denoted as $s_t$, is determined by a weight factor $\omega_t$, which regulates the trade-off between speed and accuracy, and by $\Delta_t$, the fixed step difference between any two adjacent iterations. Two step variation approaches are proposed: the linear decrement weight factor and the non-linear decrement weight factor. In the linear approach, the weight factor decreases linearly with the iteration index and is governed by the maximum and minimum weight factors, $\omega_{max}$ and $\omega_{min}$, and the maximum number of iterations, $T_{max}$. In the non-linear approach, the weight factor additionally depends on a parameter λ that controls the convergence speed, and $\omega_t$ decreases with an increase in λ. In contrast to the linear weight factor decrement, the decrease in the step size is much smaller with the non-linear method, which reduces the possibility of being trapped in a locally optimal solution. Algorithm 3 presents the detailed pseudo code of the proposed recovery algorithm with an adaptive step. Here, $\Phi^*$ represents the transpose of matrix $\Phi$, $\Phi^\dagger$ represents the Moore-Penrose inverse of matrix $\Phi$, $s$ represents the step size of the finalist, and the function Max($F$, $s$) returns the $s$ elements of vector $F$ with the largest absolute values. Additionally, for a set $\Lambda \subseteq \{1, 2, \cdots, N\}$, $\Phi_\Lambda$ is the sub-matrix of $\Phi$ with indices $i \in \Lambda$. At the t-th iteration, $S_t$, $C_t$, $F_t$, and $r_t$ denote the shortlist, the candidate list, the finalist, and the observation residual. In this paper, the maximum number of iterations ($T_{max}$) is set as the stopping rule, and the recovery threshold is applied as the halting condition.
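To make the idea of a linearly decreasing weight factor concrete, the sketch below assumes one common form, $\omega_t = \omega_{max} - (\omega_{max} - \omega_{min})\, t / T_{max}$, and maps it to an integer step size by scaling a fixed step difference. Both the closed form and the mapping `step_size` are assumptions made for illustration; they are not taken from the paper, and all parameter values are hypothetical.

```python
def linear_weight(t, t_max, w_max, w_min):
    """Assumed linear schedule: starts at w_max (t = 0), ends at w_min (t = t_max)."""
    return w_max - (w_max - w_min) * t / t_max

def step_size(weight, delta):
    """Assumed mapping from weight factor to an integer step of at least 1."""
    return max(1, round(weight * delta))

# Hypothetical settings: 10 iterations, weights from 1.0 down to 0.2, delta = 8.
T_MAX, W_MAX, W_MIN, DELTA = 10, 1.0, 0.2, 8
steps = [
    step_size(linear_weight(t, T_MAX, W_MAX, W_MIN), DELTA)
    for t in range(T_MAX + 1)
]
```

Under these assumptions the early iterations take large steps to approach the true sparsity quickly, while later iterations take small steps to limit overestimation.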

Numerical Results
In this section, we provide numerical results to illustrate the performance of our proposed algorithm. We consider a WSN with a fixed sink node and no more than 400 sensor nodes randomly deployed in a square area of size 500 × 500. It is assumed that the sink node is located in the center of the area. The initial energy of each sensor node was set to 10,800 J. The corresponding simulations were implemented in Matlab R2009a using a laptop with an Intel (i5-4300) CPU. All the results were obtained by averaging over 100 simulations.

Sparsity Analysis
First, we took raw signals from the real ocean temperature data monitored by the National Oceanic and Atmospheric Administration (NOAA) for the sparsity analysis. For instance, Figure 5a shows the sea temperature monitoring data collected at 6:30 on February 23, 2012 at the location 5° N, 95° W. This data contains 1040 temperature measurements at different depths. From Figure 5a, it can be observed that the signal in the time domain is not sparse. We therefore used the Discrete Wavelet Transform (DWT) to find its sparsity. As shown in Figure 5b, the raw data has good sparsity in the DWT domain; it has only 76 non-zero values. We thus conclude that it is a 76-sparse signal in the DWT domain. We further took a raw RSSI measurement of an access point (AP) detected by a smartphone in a real environment [36]. Due to channel-path fading and interference from nearby equipment, it was hard to find an orthonormal basis in the DWT domain, as shown in Figure 6a. We thus applied the K-SVD algorithm based on an overcomplete dictionary to sparsify it. Figure 6b shows that the raw data has good sparsity with the K-SVD algorithm. It has 13 non-zero zones, which indicates that the RSSI values are concentrated in 13 parts. Considering that the signal sampling rate relies greatly on the signal sparsity, and the number of samples M should satisfy $M \propto ck$, we compared the signal recovery performance with different c values. The measurement matrix $\Phi$ was constructed by creating an $M \times N$ matrix with i.i.d. draws from a Gaussian distribution ($N(0, 1)$). The recovered signals with different sampling rates are shown in Figures 7-9, where Figure 7 illustrates the signal recovery with a sparsity of k = 76 in the DWT domain, Figure 8 illustrates the signal recovery with a sparsity of k = 122 in the DWT domain, and Figure 9 illustrates the signal recovery of the RSSI data, which is not sparse in the DWT domain but can be sparsified with an overcomplete dictionary with a sparsity of k = 13. From these figures, it can be observed that the signal recovery performance improves as the sampling rate c increases. A larger sampling rate c results in a larger measurement matrix ($\Phi$) and hence more measurements for recovery, but also in more energy consumed in data gathering. Besides that, for signals sparsified in the frequency domain, the recovered signal is almost the same as the raw data when the sampling rate is larger than 2k. However, for signals whose sparsity is unknown in the frequency domain, the sampling rate should be set larger than 4k. Therefore, in this paper, when a signal could be sparsified in the frequency domain with sparsity k, we set the sampling rate as 2k, whereas for signals that could not be sparsified in the frequency domain and for which the sparsity (k) was detected with the overcomplete dictionary, we set the sampling rate as 4k. With this scheme, for the signals shown in Figures 7 and 8, the sampling rates were set as 2 × 76 and 2 × 122, which means that the sizes of the measurement matrices ($\Phi$) were 152 × 1024 and 244 × 1024, respectively. For the signal shown in Figure 9, the sampling rate was set as 4 × 13, and the size of its corresponding measurement matrix ($\Phi$) was 52 × 132.
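The sparsity count and the sampling-rate rule used above can be illustrated with a small sketch. It uses an orthonormal Haar wavelet transform on a hypothetical piecewise-constant toy signal (not the NOAA data), counts the non-zero coefficients, and then applies the 2k and 4k rules. The function names and the zero tolerance are assumptions of this sketch.

```python
import math

def haar_dwt(signal):
    """Full orthonormal Haar decomposition; len(signal) must be a power of two."""
    out = list(signal)
    length = len(out)
    while length > 1:
        half = length // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:length] = approx + detail
        length = half
    return out

def count_nonzero(coefficients, tolerance=1e-9):
    """Sparsity k: number of coefficients with magnitude above the tolerance."""
    return sum(1 for c in coefficients if abs(c) > tolerance)

def measurements(k, frequency_sparse):
    """Rule from the text: M = 2k for frequency-sparse signals, 4k otherwise."""
    return (2 if frequency_sparse else 4) * k

# Toy signal: dense in the sample domain, sparse in the Haar domain.
toy = [5.0] * 8 + [9.0] * 8
coeffs = haar_dwt(toy)
k_toy = count_nonzero(coeffs)

# Sampling rates for the sparsities reported in the text.
m_fig7 = measurements(76, frequency_sparse=True)
m_fig8 = measurements(122, frequency_sparse=True)
m_fig9 = measurements(13, frequency_sparse=False)
```

The toy signal has 16 non-zero samples but only two non-zero Haar coefficients, which mirrors how the temperature data becomes sparse after the DWT.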

Adaptive Sampling
In this simulation, we tested the performance of our algorithm for signals with dynamic sparsities. We considered a 200-node wireless sensor network and applied Dijkstra's algorithm for routing. We compared our algorithm with the fixed sampling algorithm proposed in Ref. [13]. Figure 10 plots the sparsity estimation in the first 30 monitoring periods, and Figure 11 plots the recovery performance over the same periods. From Figure 10, it can be observed that the sparsity of the signal changed between the 10th and 15th periods. However, the fixed sampling method still uses the sampling rate set at the initial state and cannot guarantee the required recovery performance. In contrast, our algorithm always captured the variation in the sparsity; the estimated sparsity was almost the same as the real sparsity. Our algorithm obtained a much lower recovery error rate despite the changes in the environment, which shows that it is important to apply an adaptive sampling rate for the periodical monitoring of sensor networks.
Considering that α extra data samples are needed to estimate the sparsity in each period, we further tested the energy consumption and network lifetime and compared them with those of other algorithms. Here, intelligent sampling refers to the algorithm proposed in Ref. [27]. Figure 12 plots the energy consumption of each node in the 200-node sensor network in one working period. It can be observed that our algorithm obtained more balanced energy consumption than the raw data gathering algorithm. Figure 13 plots the network lifetime versus the number of sensor nodes. It can be observed that our algorithm achieved nearly the same network lifetime as fixed sampling or intelligent sampling. This can be explained by the fact that although adaptive sensing requires extra data gathering at the start of each monitoring period, its sampling rate may be lower than that of the other two algorithms during the rest of the period. It can also be observed that our algorithm obtained a much longer network lifetime than raw data gathering, which shows that compressive sensing is an efficient data gathering method in wireless sensor networks.
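The balanced energy consumption can be illustrated by counting transmissions per node. In the sketch below, which uses a hypothetical five-node chain and hypothetical function names, a node in raw data gathering forwards one packet for itself and one for each descendant, whereas in compressive gathering every node sends exactly M aggregated values per period. Packet counts stand in for energy here, which is a simplification.

```python
def subtree_sizes(children, root):
    """Number of nodes in the subtree rooted at each node (the node included)."""
    sizes = {}

    def visit(node):
        sizes[node] = 1 + sum(visit(child) for child in children.get(node, []))
        return sizes[node]

    visit(root)
    return sizes

def raw_load(children, sink):
    """Packets sent per node in raw gathering: own reading plus all descendants'."""
    sizes = subtree_sizes(children, sink)
    return {node: size for node, size in sizes.items() if node != sink}

def cs_load(children, sink, num_measurements):
    """Packets sent per node in compressive gathering: M for every node."""
    return {node: num_measurements for node in raw_load(children, sink)}

# Chain topology: sink 0 <- 1 <- 2 <- 3 <- 4 <- 5.
chain = {0: [1], 1: [2], 2: [3], 3: [4], 4: [5]}
raw = raw_load(chain, sink=0)
cs = cs_load(chain, sink=0, num_measurements=2)
```

The node next to the sink sends five packets in raw gathering but only two in compressive gathering; leaf nodes send slightly more than before, which is the price of the uniform load.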

Signal Recovery
This simulation presents the signal recovery performance with unknown sparsities in a 200-node sensor network. Our algorithm was compared with the regularized orthogonal matching pursuit (ROMP), SAMP, and regularized adaptive matching pursuit (RAMP) [37]. We evaluated the reconstruction performance using the averaged relative error (Re) and the signal-to-noise ratio (SNR), where the relative error was defined as the average of $Re = \frac{\|x - \hat{x}\|_2}{\|x\|_2}$ over 100 trials, and the SNR was defined as $SNR = -10 \log_{10} Re$. Figures 14 and 15 plot the signal recovery performance with different algorithms.
It was observed that the signal recovery performance can be substantially improved by increasing the sampling rate. It was also observed that our algorithm obtained a similar performance with linear or non-linear factors. Table 1 further shows a comparison of the convergence times. From this table, it can be observed that our algorithm took no more than 0.5 s to converge to the optimal solutions, while the traditional SAMP with step = 1 (denoted as SAMP-1 in Table 1) needed nearly 0.5 s and the SAMP with step = 3 (denoted as SAMP-3 in Table 1) needed about 0.1 s. This shows that our algorithm not only improves the recovery performance, but also converges much faster than the traditional SAMP method.
We further tested the recovery performance with different sensor deployment densities and compared it with that of different sampling algorithms. Figure 16 plots the SNR versus the number of sensor nodes. Combining the results shown in Figures 13 and 16, it can be concluded that although our algorithm obtained a lower network lifetime than the other compressive sampling methods, it obtained the best recovery performance. In addition, its SNR value was almost the same as that of raw data gathering, while our algorithm was much more energy efficient than raw data gathering.

Conclusions
In this paper, we presented an adaptive sampling data gathering scheme for the periodical monitoring of wireless sensor networks. We developed a sequential observation based scheme to observe the variation in sparsity with few re-sampling measurements. We designed an adaptive step size determination to improve the performance of SAMP, in which both linear and non-linear step variations were designed to guarantee fast convergence and accuracy. Our simulation results demonstrate that our algorithm can efficiently capture the sparsity variation and obtain better recovery performance than existing compressive sensing methods with fixed sampling or intelligent sampling. It also obtains a much longer network lifetime than the traditional raw data gathering algorithm.