Spatio-Temporal Field Estimation Using Kriged Kalman Filter (KKF) with Sparsity-Enforcing Sensor Placement

We propose a sensor placement method for spatio-temporal field estimation based on a kriged Kalman filter (KKF) using a network of static or mobile sensors. The developed framework dynamically designs the optimal constellation to place the sensors. We combine the estimation error (for the stationary as well as the non-stationary component of the field) minimization problem with a sparsity-enforcing penalty to design the optimal sensor constellation economically. The developed sensor placement method can be directly used for a general class of covariance matrices (ill-conditioned or well-conditioned) modelling the spatial variability of the stationary component of the field, which acts as a correlated observation noise while estimating the non-stationary component of the field. Finally, a KKF estimator is used to estimate the field using the measurements from the selected sensing locations. Numerical results are provided to demonstrate the feasibility of the proposed dynamic sensor placement followed by the KKF estimation method.


Introduction
Tracking the spatio-temporal evolution of any field using a limited number of homogeneous/heterogeneous sensors with a desired accuracy is one of the most common applications of wireless sensor networks (WSNs) [1,2]. Different types of environmental, geophysical and biological processes exhibit complicated spatial as well as temporal variability. Spatial and temporal variability of a spatio-temporally stationary physical field can be modelled by its correlation over space and time [3]. If the field is non-stationary then a suitable dynamic model can be used to model the spatio-temporal evolution of the field [3]. If the field exhibits both a stationary and non-stationary behavior over space and time then the field can be dynamically monitored by the combination of kriging [3] and Kalman filtering, i.e., a kriged Kalman filter (KKF) [4] or space-time Kalman filter [5]. The key idea behind the KKF is the liaison of kriging [3] and Kalman filtering. The unknown physical field is modelled as a combination of a non-stationary (capturing the dynamics) and a stationary (capturing the low magnitude spatial effects) stochastic component. Assuming that the dynamics of the non-stationary component and the second-order statistics of the stationary component (e.g., covariance structure) are perfectly known, KKF jointly estimates both of these field components using the spatial observations at every time instant. The KKF paradigm has been used for different applications like wireless communications (e.g., spectrum sensing [6] and path delay estimation [7]) and field estimation [5]. From a practical perspective, KKF can also be used for air pollution level forecasting applications [8]. Also, a distributed implementation of KKF can be used for environment monitoring using a robotic sensor network [9].
One of the important challenges of dynamic field estimation using a WSN is the lack of sufficient measurements at every time instant. This is related to the limited sensor lifetime, the available bandwidth, and other resource-related economic constraints. In such scenarios, we need to efficiently place/move the available sensors to the most informative locations over space and time. One notable approach for sensor placement in Gaussian processes is based on exploiting the submodularity of the mutual information between the sensing locations [10]. Submodularity of the frame potential and related greedy algorithms for sensor placement are proposed in [11]. Dynamic sensor scheduling is a well-studied topic in the fields of signal processing as well as control theory [12,13]. Some recent notable contributions are [14][15][16]. Prior knowledge regarding the correlation of the field over space and time can be exploited in a multi-layer design of sensor networks [17]. Selecting the most informative sensing locations can be treated as a sensor selection problem, which can be formulated as a convex optimization problem [18]. This can be solved for linear [19] as well as non-linear measurement models [20]. Sparsity-promoting approaches for sensor placement are also presented in [15,21], where the placement algorithm is formulated using the alternating direction method of multipliers (ADMM). In [22], a generalized sparsity-enforcing and performance-constrained sensor placement method is proposed, where the field can be either stationary or non-stationary. The aforementioned method can be implemented for single-snapshot or multiple-snapshot-ahead sensor placement and for a general class of spatio-temporal covariance matrices, which can be either ill-conditioned or well-conditioned. Seminal contributions on the convex formalism of sensor selection (like [18]) assume that the measurement noise components are spatio-temporally uncorrelated.
However, this can be an unrealistic assumption in some practical scenarios [23]. Even in those scenarios, it has been shown that the sensor selection problem can be formulated as a convex optimization problem [16,24].
In this work, we develop a unified framework of sensor placement followed by a KKF estimator to dynamically monitor a physical field that exhibits both stationarity and non-stationarity over space and time. In the first step, we select the most informative locations to deploy/move the sensors and in the second step we estimate the field by using the measurements from those selected locations.
The key contributions can be summarized as follows:

• The performance metrics to estimate the stationary as well as the non-stationary components of the field are represented in closed form as an explicit function of the sensor location selection vector.

• The aforementioned analytical formalism tackles two important issues in the sensor placement step. First, the developed method takes care of the fact that the estimation of the non-stationary component of the field involves the stationary component of the field as a spatially correlated observation noise. Second, the proposed method is applicable for a general class of spatial covariance matrices of the stationary component of the field, even when they are ill-conditioned or close to singular [25].

• The proposed sensor placement problem is formulated so as to minimize a cost function that involves the sum of the mean square error (MSE) of the stationary and the non-stationary components of the field as well as a spatial sparsity-enforcing penalty. The overall optimization problem also satisfies a flexible resource constraint at every time instant.
One of the aspects that distinguishes the proposed sensor placement method from prior works on sensor placement for environmental field estimation is the specific statistical nature of the unknown physical field, which yields an additive coupling of the stationary and the non-stationary components. Secondly, we develop a unified framework for the efficient utilization of the spatio-temporal variability of the field in order to design an opportunistic sensor placement method using a convex approach. We develop a parsimonious sensor placement algorithm followed by a KKF estimator, which can be used to dynamically monitor a general class of environmental fields (based on the assumed process model and spatial statistics of the field components). However, the developed approach is similar to [22] in terms of the primary measurement model, which is considered to be linear and underdetermined. We emphasize that the proposed technique is a model-based centralized sensor placement method, where we resort to the Bayesian estimation philosophy. We assume that the available prior statistical knowledge regarding the unknown physical field, like the spatial correlation information and the dynamics, is perfectly known a priori. It is also assumed that, for the current centralized setup, the communication range of the sensors is sufficient to communicate with the fusion center, which can be located inside/outside the given service area.
Notations: Matrices are in upper case bold while column vectors are in lower case bold. The notation [X] ij is the (i, j)-th entry of the matrix X, [x] i is the i-th entry of the vector x, and tr[X] denotes the trace of X, i.e., the sum of the diagonal elements of X. The notation supp(x) is defined as the set of indices of the non-zero entries of x, while diag(x) and diag(X) are the diagonal matrix with diagonal x and the main diagonal of the matrix X, respectively. An identity matrix of size N × N is denoted by I N . The notation (·) T is the transpose operator, and x̂ is the estimate of x. Vectors of all zeros and ones of length N are denoted by 0 N and 1 N , respectively. An all-zero matrix of size N × N is given by 0 N×N . The set of symmetric matrices of size N × N and the set of symmetric positive-definite matrices of size N × N are denoted by S N and S N ++ , respectively.

Measurement Model
Let us denote the spatially continuous field by u t (x), at any discrete time index t and location x ∈ R 2 . We assume that the entire service area of interest is uniformly discretized over N pixels, where we would like to estimate the field intensities. The field intensities of the N pixels at time t are represented by u t ∈ R N . It is assumed that the field intensity is the same everywhere within a pixel, and it can be represented by [u t ] j = u t (x j ), where x j ∈ R 2 is the centroid of the j-th pixel, with j = 1, . . . , N. We consider a linear underdetermined measurement model

y t = C t (v t + s t ) + e t = C t u t + e t , (1)

where v t ∈ R N is the non-stationary component of the field, s t ∈ R N is a stationary component of the field capturing the non-dynamic spatial effects, and the overall field is u t = v t + s t . It is assumed that v t and s t are mutually uncorrelated. At any time t, the measurements y t ∈ R M t are collected from M t (M t < N) sensing locations (pixels) of the entire service area. The time-varying sensing or measurement matrix C t ∈ {0, 1} M t ×N selects M t measurements from the N field locations. The measurement matrix C t is constructed by removing the zero rows of diag(w t ), where w t ∈ {0, 1} N is a sensor location selection vector. If [w t ] j = 1 (0), then the j-th pixel is selected (not selected) for sensor placement. Based on this, at any time t, the number and the constellation of the selected sensing locations are given by 1 T w t = M t and supp(w t ), respectively. Using the considered construction of C t , we have the relations

C t C T t = I M t , C T t C t = diag(w t ). (3)

Note that the type of measurement matrix used in this work is similar to an incidence matrix, which can be viewed as a flexible data collection method using heterogeneous sensing equipment. In practice, when different types of sensing modalities are used, we may not know the process by which any of the sensors gathers its measurement; only its recorded value is important. Also, we rigorously exploit the property of the structure of C t mentioned in (3) later in this paper.
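The construction of C t from w t can be sketched as follows (a minimal numpy sketch; the example w t is an illustrative assumption):

```python
import numpy as np

# Sketch: build the measurement matrix C_t from a Boolean selection
# vector w_t by deleting the all-zero rows of diag(w_t).
def measurement_matrix(w):
    w = np.asarray(w)
    return np.diag(w)[w == 1, :]        # keep the rows of diag(w_t) with a 1

w_t = np.array([1, 0, 1, 1, 0, 0])      # N = 6 pixels, M_t = 3 selected
C_t = measurement_matrix(w_t)

# The two structural properties used throughout the paper:
assert np.allclose(C_t @ C_t.T, np.eye(int(w_t.sum())))   # C_t C_t^T = I_{M_t}
assert np.allclose(C_t.T @ C_t, np.diag(w_t))             # C_t^T C_t = diag(w_t)
```

Selecting rows of diag(w t ) in index order automatically orders the measurements from the top-left to the bottom-right pixel.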
The error incurred by the measurement process is modelled through e t , which is uncorrelated with both v t and s t . The spatio-temporally white measurement noise e t is characterized by e t ∼ N (0 M t , σ 2 e I M t ).

Modelling of the Spatial Variability
The spatial effects of the field are modelled through a spatially colored yet temporally white discrete random process s t ∼ N (µ s , Σ s ), where µ s ∈ R N is the mean and Σ s ∈ S N ++ is the spatial covariance matrix of s t . We assume that the process s t is spatially second-order stationary as well as isotropic, which means that

Cov([s t ] i , [s t ] j ) = [Σ s ] ij = cov(d ij ), (5)

where i, j = 1, . . . , N [3], i.e., the covariance between two locations depends only on the distance d ij = ∥x i − x j ∥ 2 between the centroids of the pixels. Note that here we follow the same spatial discretization as mentioned in Section 2.1. There are several empirical as well as parametric model-based approaches to model the spatial covariance. In this work, we assume that the spatial covariance function is given by a simple squared exponential function:

cov(d ij ) = exp(−d 2 ij /θ), (6)

where θ > 0 is the parameter controlling the strength of the spatial correlation. The covariance function mentioned in (6) is plotted in Figure 1a for increasing values of the pairwise distance d ij and the parameter θ. Note that the aforementioned covariance function belongs to the family of Matérn covariance functions [3], which are widely used to model the spatial variability of a field in geostatistics and environmental sciences. Using the squared exponential covariance function, the elements of the N × N spatial covariance matrix Σ s can be constructed by the Relation (5). Let us consider a service area with N = 36 pixels. The centroids of these 36 pixels, which are also the candidate locations for sensor deployment, are shown in Figure 1b. These centroids are indexed from the top left to the bottom right. The elements of Σ s are shown in Figure 1c. Note that based on the nature of the covariance Function (6), the spatial covariance matrix Σ s is symmetric, and based on the constellation of the candidate sensing locations (Figure 1b), Σ s is also a block Toeplitz matrix. We assume that µ s and Σ s are perfectly known a priori.
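The covariance construction above can be sketched as follows; the exact parameterization cov(d) = exp(−d 2 /θ) is an assumption here, chosen so that a larger θ gives a stronger (slower-decaying) correlation, consistent with the text:

```python
import numpy as np

def spatial_covariance(centroids, theta):
    # pairwise centroid distances d_ij, then cov(d) = exp(-d^2 / theta)
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    return np.exp(-d ** 2 / theta)

# 6 x 6 grid of pixel centroids (N = 36) with 1 km spacing,
# indexed from the top left to the bottom right
g = np.arange(6) + 0.5
centroids = np.array([(x, y) for y in g for x in g], dtype=float)

Sigma_s = spatial_covariance(centroids, theta=2.0)

# The condition number grows with the correlation strength theta
cond_weak, cond_strong = (np.linalg.cond(spatial_covariance(centroids, th))
                          for th in (1.0, 4.0))
```

Comparing cond_weak and cond_strong reproduces the trend discussed around Figure 2a: the stronger the spatial correlation, the more ill-conditioned Σ s becomes.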

State Model
The spatio-temporal evolution of the non-stationary component of the field, i.e., v t , can be modelled by the following state model

v t = H t v t−1 + q t . (7)

Here, the time-varying state transition/propagator matrix is given by H t ∈ R N×N . The process noise vector q t is assumed to be characterized by q t ∼ N (0 N , Q t ). The elements of the state transition matrix H t act as spatial redistribution weights for v t−1 for the temporal transition from t − 1 to t [3]. Note that this spatial redistribution can be dependent on the temporal sampling interval. We model the elements of H t by using a parameterized Gaussian kernel function

[H t ] ij = ν exp[−(x i − x j − a ij t ) T (D ij t ) −1 (x i − x j − a ij t )], (8)

where i, j = 1, . . . , N, and the spatio-temporally varying translation and dilation parameters are represented by a ij t ∈ R 2 and D ij t ∈ S 2 ++ , respectively. The scalar ν ∈ (0, 1) is a scaling parameter to avoid an explosive growth of v t , i.e., to keep the maximum eigenvalue of H t below 1. The simplest form of the state model (7) is defined by H t ≈ νI N , which is similar to a Gaussian random walk model. This corresponds to the Gaussian kernel Function (8) with D t = ζI and ζ ≪ 1, and with a t = 0. In this work, we assume that the state transition matrix H t , whose elements are parameterized by {a ij t } and {D ij t } through the Function (8), is perfectly known a priori. From a practical point of view, the aforementioned Gaussian kernel-based spatio-temporal evolution can be used for rainfall prediction [26]. This modelling approach incorporates some physical properties of environmental fields, such as diffusion and advection [26], so it can also be used for modelling the propagation of a general class of environmental fields (e.g., pollutants, aerosol movements).
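Building H t from the kernel can be sketched as follows (a minimal numpy sketch with spatially invariant a and D for brevity; the normalization that pins the spectral radius to ν is one possible choice, an assumption here):

```python
import numpy as np

def transition_matrix(centroids, a, D, nu):
    # Gaussian kernel [H]_ij ~ exp(-(x_i - x_j - a)^T D^{-1} (x_i - x_j - a)),
    # rescaled so the spectral radius equals nu < 1 (no explosive growth).
    diff = centroids[:, None, :] - centroids[None, :, :] - a
    Dinv = np.linalg.inv(D)
    H = np.exp(-np.einsum('ijk,kl,ijl->ij', diff, Dinv, diff))
    return nu * H / np.max(np.abs(np.linalg.eigvals(H)))

g = np.arange(6) + 0.5
centroids = np.array([(x, y) for y in g for x in g], dtype=float)

# D ~ 0 and a = 0: the kernel is essentially diagonal, i.e., H ~ nu * I,
# the Gaussian-random-walk special case mentioned in the text
H = transition_matrix(centroids, a=np.zeros(2), D=1e-4 * np.eye(2), nu=0.8)
assert np.max(np.abs(np.linalg.eigvals(H))) < 1.0
```

A non-zero translation a shifts the kernel mass, producing the advective behavior used later in the simulations.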

Main Problem Statement
The main problem is to design an optimal sensor placement pattern, i.e., to design w t , whose support gives the optimal locations to deploy the sensors. At any t, the design goal is to minimize the estimation error for both the stationary and the non-stationary components of the field as well as to enforce sparsity in w t , i.e., to reduce the number of required sensing locations. If the estimation error of the stationary and non-stationary components of the field can be represented by a single performance metric g(w t ), the sensor placement problem can be represented by

arg min w t ∈{0,1} N g(w t ) + λ t ∥w t ∥ 1 (9a)
s.t. K min t ≤ 1 T w t ≤ K max t . (9b)

At any t, K min t and K max t denote the lower bound on the number of available sensors, and a given budget on the maximum number of available sensors, respectively. Sparsity is enforced through a sparsity-promoting penalty, i.e., an ℓ 1 norm of w t in the second summand of the cost Function (9a), with a time-varying regularization parameter λ t > 0 controlling the sparsity of the elements of w t . A detailed description regarding the structure of the objective function and the importance of the constraints in the optimization problem (9) are discussed in Section 3.1.

Simple KKF Estimator and Estimation Error Covariance
Using the measurement and state models of (1) and (7), respectively, the state estimate û t , for t = 1, 2, . . . , can be computed following the lines of a standard KKF [5,6]. First, a simple Kalman filter is used to track the dynamic component v t , where the stationary component s t is interpreted as a noise term. In this case, the measurement model is given by

y t = C t v t + ȇ t , ȇ t = C t s t + e t , (10)

where the overall measurement noise ȇ t has the covariance matrix Ȓ t = C t Σ s C T t + σ 2 e I M t . It can be easily shown that v t and ȇ t are mutually uncorrelated, as it is already assumed in Section 2.1 that v t is mutually uncorrelated with both s t and e t . Now, using the state model of (7) and the measurement model of (10), the non-stationary component v t can be estimated following the lines of a simple Kalman filter [27]. In this case, the recursive state estimate at time t is given by

v̂ t = H t v̂ t−1 + G t (y t − C t H t v̂ t−1 ), (11)

where the Kalman gain G t can be expressed as

G t = M v t|t−1 C T t (C t M v t|t−1 C T t + Ȓ t ) −1 . (12)

The MSE matrix of the estimate v̂ t at time t is given by

M v t = (I N − G t C t )M v t|t−1 , (13)

which is related to the MSE matrix of the estimate at time t − 1, i.e., M v t−1 , by the recursive relation [27] M v t|t−1 = H t M v t−1 H T t + Q t . In the next stage, the estimate v̂ t in (11) is used to compute the stationary component s t using kriging. In spatial statistics, the intensity of an environmental field at an unknown location can be interpolated using a variogram model [3,4]. For a spatially stationary field, this variogram can be expressed as a covariance [3]. In this way, kriging can be viewed as a simple linear minimum mean square error (LMMSE) interpolator [27]. The linear model is given by y t − C t v̂ t = C t s t + e t and the related estimator has the form

ŝ t = µ s + Σ s C T t (C t Σ s C T t + σ 2 e I M t ) −1 (y t − C t v̂ t − C t µ s ), (14)

where we use the prior information s t ∼ N (µ s , Σ s ). The MSE matrix [27] of the estimate ŝ t , i.e., M s t , is given by

M s t = (Σ −1 s + σ −2 e C T t C t ) −1 = (Σ −1 s + σ −2 e diag(w t )) −1 . (15)

Finally, the overall field estimate at time t is given by û t = v̂ t + ŝ t .
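The two-stage recursion above can be sketched as follows (a minimal numpy sketch using standard Kalman/LMMSE formulas; all the numerical data in the example are illustrative assumptions):

```python
import numpy as np

def kkf_step(y, C, v_prev, M_prev, H, Q, Sigma_s, mu_s, sig2_e):
    """One KKF recursion: Kalman update for v_t (with s_t acting as
    spatially correlated measurement noise), then kriging of s_t."""
    M, N = C.shape
    R_breve = C @ Sigma_s @ C.T + sig2_e * np.eye(M)   # cov of C s_t + e_t
    # --- Kalman filter for the non-stationary component v_t ---
    v_pred = H @ v_prev
    M_pred = H @ M_prev @ H.T + Q
    G = M_pred @ C.T @ np.linalg.inv(C @ M_pred @ C.T + R_breve)
    v_hat = v_pred + G @ (y - C @ v_pred)
    M_v = (np.eye(N) - G @ C) @ M_pred
    # --- Kriging (LMMSE) of s_t from the residual y - C v_hat ---
    K = Sigma_s @ C.T @ np.linalg.inv(C @ Sigma_s @ C.T + sig2_e * np.eye(M))
    s_hat = mu_s + K @ (y - C @ v_hat - C @ mu_s)
    return v_hat + s_hat, v_hat, s_hat, M_v

# Illustrative data (all values are assumptions)
rng = np.random.default_rng(0)
N = 4
w = np.array([1, 0, 1, 0])
C = np.diag(w)[w == 1, :].astype(float)
Sigma_s = np.exp(-np.abs(np.subtract.outer(np.arange(N), np.arange(N))) ** 2 / 2.0)
u_hat, v_hat, s_hat, M_v = kkf_step(
    y=rng.standard_normal(2), C=C, v_prev=np.zeros(N), M_prev=np.eye(N),
    H=0.8 * np.eye(N), Q=0.1 * np.eye(N), Sigma_s=Sigma_s,
    mu_s=np.zeros(N), sig2_e=0.1)
```

Note that neither stage inverts Σ s itself; only M t × M t matrices built from the selected rows are inverted.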

Performance Metrics as a Function of w t
In this section, we express the MSE matrices, i.e., M v t and M s t as functions of w t . First of all, we mention some facts regarding the structure of the error covariance matrices presented in the Expressions (13) and (15).
It should be noted that the measurement noise in (10) is correlated over space through the off-diagonal elements of Ȓ t . Due to this fact, designing w t by directly optimizing the Expression (13) within the standard convex frameworks of [18,20,22] is difficult, because of the presence of the off-diagonal elements of Ȓ t . It should also be noted that Ȓ t is a function of the measurement matrix C t and thus of the selection vector w t . However, we do not encounter this issue in the performance metric for estimating the stationary component s t , i.e., (15), as the measurement noise e t is assumed to be spatially uncorrelated in this case.
In the expression of M s t , i.e., (15), we assume that Σ s is well-conditioned, i.e., accurately invertible. However, this may not be the case in some scenarios. The condition number of Σ s strongly depends on the correlation of the field, the spatial sampling distance, the grid size, etc. [25]. The variation of the condition number of Σ s with different values of θ for both N = 36 and N = 144 is plotted in Figure 2a. It is seen that for a higher resolution or a strong spatial correlation, the spatial covariance matrix becomes increasingly ill-conditioned and thus close to singular. In such circumstances, we cannot compute the estimation error covariance matrix M s t using the Expression (15). In that case, M s t can be computed using the alternate expression of (15) given by

M s t = Σ s − Σ s C T t (C t Σ s C T t + σ 2 e I M t ) −1 C t Σ s , (16)

which is obtained using the matrix inversion lemma (MIL). It should be noted that this alternative expression can be used to compute the MSE (without inverting Σ s ), but it is difficult to express it as an explicit function of w t . In Figure 2b, we plot tr[M s t ] for the best case, i.e., the MSE with all the pixel centroids equipped with sensors (w t = 1 N or C t = I N ), for different values of θ, and for two different spatial resolutions (N = 36 and N = 144) of the same 6 × 6 square km service area. It is seen that tr[M s t ] decreases as the strength of the correlation is increased by increasing θ.
To circumvent the effect of ill-conditioning as well as to handle the correlated measurement noise in the expression of M v t , we propose the following approach. We start by introducing the substitution

Σ s = Σ sr − αI N , (17)

where Σ sr = Σ s + αI N is a well-conditioned matrix and α ∈ R. More specifically, the range of α is considered to be 0 < α < σ 2 e . Substituting Σ s = Σ sr − αI and using C t C T t = I M t , we can represent the measurement error covariance matrix of (10) as

Ȓ t = C t Σ sr C T t + (σ 2 e − α)I M t = C t Σ sr C T t + ζI M t ,

where ζ = σ 2 e − α > 0. Using this and the information form of (13), the MSE matrix for the estimate of the non-stationary component is given by

M v t = [(H t M v t−1 H T t + Q t ) −1 + C T t Ȓ −1 t C t ] −1 . (18)

Using the MIL, we have the following matrix identity

(C t Σ sr C T t + ζI M t ) −1 = ζ −1 I M t − ζ −2 C t [Σ −1 sr + ζ −1 C T t C t ] −1 C T t , (19)

from which we can derive

C T t (C t Σ sr C T t + ζI M t ) −1 C t = F − F[F + ζ −1 diag(w t )] −1 F, (20)

where F = Σ −1 sr and we used C T t C t = diag(w t ). Substituting (20) in (18), we obtain the following expression for M v t :

M v t = [X − F[F + ζ −1 diag(w t )] −1 F] −1 , (21)

where X = (H t M v t−1 H T t + Q t ) −1 + F.
Next, substituting Σ s = Σ sr − αI in the inverse of the rightmost term of (16) and using C t C T t = I M t , we obtain

M s t = Σ s − Σ s C T t (C t Σ sr C T t + ζI M t ) −1 C t Σ s . (22)

Substituting the identity (20) into (22), we obtain the following expression for M s t :

M s t = Y + Z T [F + ζ −1 diag(w t )] −1 Z, (23)

where Y = Σ s − Σ s Σ −1 sr Σ s and Z = Σ −1 sr Σ s .
Note that the expression (23) avoids the inversion of an ill-conditioned Σ s ; we only need to invert the well-conditioned Σ sr .
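The MIL-based identity at the heart of this derivation can be checked numerically; below is a minimal sketch with random stand-in data, where F = Σ sr −1 , ζ = σ e 2 − α, and C is built from w as in Section 2.1 (all numerical values are assumptions):

```python
import numpy as np

# Check: C^T (C Sigma_sr C^T + zeta I)^{-1} C
#        = F - F (F + zeta^{-1} diag(w))^{-1} F,  with F = Sigma_sr^{-1}
rng = np.random.default_rng(0)
N = 8
w = np.array([1, 0, 1, 1, 0, 1, 0, 0])
C = np.diag(w)[w == 1, :].astype(float)

A = rng.standard_normal((N, N))
Sigma_sr = A @ A.T + N * np.eye(N)      # well-conditioned, positive definite
zeta = 0.5
F = np.linalg.inv(Sigma_sr)

lhs = C.T @ np.linalg.inv(C @ Sigma_sr @ C.T + zeta * np.eye(int(w.sum()))) @ C
rhs = F - F @ np.linalg.inv(F + np.diag(w) / zeta) @ F
assert np.allclose(lhs, rhs)
```

The right-hand side involves w t only through the diagonal matrix diag(w t ), which is what makes the subsequent convex reformulation possible.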
In this work, we consider the overall MSE as the performance metric for sensor placement, i.e., g(w t ) as mentioned in (9a). This is given by

g(w t ) = tr[M v t ] + tr[M s t ] = tr{[X − F[F + ζ −1 diag(w t )] −1 F] −1 } + tr[Y] + tr{Z T [F + ζ −1 diag(w t )] −1 Z}, (24)

where X = (H t M v t−1 H T t + Q t ) −1 + F, F = Σ −1 sr , Y = Σ s − Σ s Σ −1 sr Σ s , and Z = Σ −1 sr Σ s . Note that the matrices X, F, Y, and Z are all independent of w t . To model Σ sr and F + ζ −1 diag(w t ) as positive definite matrices, we need 0 < α < σ 2 e . The performance metric derived in (24) incorporates the MSE matrices of the estimates of the non-stationary (v t ) as well as the stationary (s t ) component of the field, as explicit functions of the sensor location selection vector w t . Note that a formulation similar to (23), for the computation of the MSE matrix as a function of w t , is proposed in [22], where the field is considered to be either purely stationary or non-stationary.

KKF with Sensor Placement
In this section, we relax and reformulate the proposed sensor placement problem (9) as a semidefinite programming (SDP) problem. Then we present the overall KKF estimator followed by the sensor placement to dynamically monitor the field using only the measurements from the selected sensing locations.

Sensor Placement Problem as an SDP
Solving for the best subset of sensing locations is a combinatorially complex problem. However, it can be relaxed to a convex problem [18][19][20]. As discussed in Section 2.1, the sensor location selection vector w t ∈ {0, 1} N acts as a weighting vector for all the N candidate pixels. Following the main optimization problem, i.e., (9), a sparsity-enforcing, low estimation error, and resource-constrained design of w t can be obtained by solving

arg min w t g(w t ) + λ t ∥w t ∥ 1 (25a)
s.t. K min t ≤ 1 T w t ≤ K max t , w t ∈ [0, 1] N , (25b)

where the expression of g(w t ) is given by (24). Here, we have relaxed the non-convex Boolean constraint w t ∈ {0, 1} N of (9) to the convex box constraint w t ∈ [0, 1] N ; since w t is then elementwise non-negative, ∥w t ∥ 1 = 1 T w t . The resource constraint of (25b) is affine and thus convex. Some comments regarding the formulation of the proposed sensor placement problem of (25) are presented next.
• First of all, let us consider the non-convex version of the optimization problem of (25) with λ t = 0, which is given as

arg min w t ∈{0,1} N g(w t ) s.t. K min t ≤ 1 T w t ≤ K max t . (26)

In this case, the MSE cost is minimized, i.e., the best estimation performance is achieved, when we select the maximum number of available candidate locations, i.e., when 1 T w t = K max t . In that case, there is no way to reduce the number of selected locations below K max t , and the constraint 1 T w t ≥ K min t becomes redundant.

• Notice that dropping the resource constraint (25b) and increasing λ t will reduce the number of selected sensing locations. However, there is no explicit relation between λ t and 1 T w t , i.e., it is difficult to directly control the resource allocation (i.e., K max t ) through λ t .

• We mention that the proposed formulation of (25) is not a direct MSE minimization problem; it attains a specific MSE while enforcing sparsity in the spatial sensor location selection through the second summand of (25a). The sparsity enforcement is lower bounded by the minimum number of sensing locations to be selected at any t, i.e., K min t . It should be noted that, for an arbitrary selection of λ t , at least K min t sensing locations will always be selected.

• Lastly, it should be noted that a sparsity-enforcing design of w t can be achieved by retaining only the second summand of the objective function of (25a) and using a separate performance constraint given as g(w t ) ≤ γ t,MSE [20,22]. The desired performance threshold γ t,MSE can be time-varying or independent of t based on the application. However, in many practical scenarios, it could be difficult to set the performance threshold γ t,MSE a priori for every t.
Based on the aforementioned arguments, we advocate the proposed design approach (25), which lowers the MSE while enforcing sparsity in the sensor placement and satisfying a flexible resource allocation constraint.
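For intuition only, the relaxed problem can also be attacked directly with a generic smooth solver instead of the SDP reformulation developed below; here is a toy sketch using scipy, where all problem data (Σ s , Σ sr , the surrogate for M t|t−1 , the sizes, λ t , and the budgets) are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

N, zeta, lam, Kmin, Kmax = 10, 0.5, 0.05, 2, 5

# Toy problem data (assumptions for illustration only)
d = np.abs(np.subtract.outer(np.arange(N), np.arange(N))).astype(float)
Sigma_s = np.exp(-d ** 2 / 4.0)
Sigma_sr = Sigma_s + 0.5 * np.eye(N)   # Sigma_sr = Sigma_s + alpha I
F = np.linalg.inv(Sigma_sr)
X = np.eye(N) + F                      # taking the prediction MSE as I
Y = Sigma_s - Sigma_s @ F @ Sigma_s
Z = F @ Sigma_s

def g(w):
    # overall MSE metric tr[M_v] + tr[M_s] as a function of w
    T = np.linalg.inv(F + np.diag(w) / zeta)
    return (np.trace(np.linalg.inv(X - F @ T @ F))
            + np.trace(Y) + np.trace(Z.T @ T @ Z))

res = minimize(lambda w: g(w) + lam * w.sum(),
               x0=np.full(N, 0.5),
               bounds=[(0.0, 1.0)] * N,
               constraints=[LinearConstraint(np.ones(N), Kmin, Kmax)],
               method='SLSQP')
w_hat = res.x     # relaxed selection vector in [0, 1]^N
```

This direct solve carries no global optimality certificate; the SDP route of this section is the principled one.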
After solving (25), we obtain ŵ t ∈ [0, 1] N , which can be converted to a Boolean selection vector w t ∈ {0, 1} N . This can be performed by either deterministic or stochastic rounding procedures, as discussed below.

• The simplest approach could be to set the non-zero entries of ŵ t to 1. However, there can be a huge difference between the magnitudes of any two non-zero elements of ŵ t . Considering the fact that the indices of the high-magnitude (close to 1) elements of ŵ t signify more informative sensing locations, ŵ t can be sorted in ascending order of magnitude [18] and a selection threshold γ can be chosen based on the magnitudes of the elements of the sorted ŵ t . The entries of the Boolean selection vector can then be computed as [w t ] j = 1 if [ŵ t ] j ≥ γ, and [w t ] j = 0 otherwise.
• Another approach is a stochastic one, where every entry of ŵ t is interpreted as the probability that the corresponding sensing location is selected at time t. Based on this, multiple random realizations of w t ∈ {0, 1} N are generated, where the probability that [w t ] j = 1 is given by [ŵ t ] j , for j = 1, . . . , N. Then the realization that satisfies the constraints and minimizes the estimation error, i.e., g(w t ), is selected [20].
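Both rounding schemes can be sketched as follows (a minimal numpy sketch; the threshold γ, the example ŵ t , and the number of random draws are illustrative assumptions):

```python
import numpy as np

def deterministic_round(w_hat, gamma):
    # keep the entries whose magnitude reaches the threshold gamma
    return (w_hat >= gamma).astype(int)

def stochastic_round(w_hat, g, Kmin, Kmax, n_draws=100, seed=0):
    # draw Boolean vectors with P([w]_j = 1) = [w_hat]_j and keep the
    # feasible draw with the smallest estimation-error metric g(w)
    rng = np.random.default_rng(seed)
    best, best_cost = None, np.inf
    for _ in range(n_draws):
        w = (rng.random(w_hat.size) < w_hat).astype(int)
        if Kmin <= w.sum() <= Kmax:
            cost = g(w)
            if cost < best_cost:
                best, best_cost = w, cost
    return best

w_hat = np.array([0.95, 0.05, 0.7, 0.1, 0.85])
w_det = deterministic_round(w_hat, gamma=0.5)                    # -> [1 0 1 0 1]
w_sto = stochastic_round(w_hat, g=lambda w: w.sum(), Kmin=1, Kmax=3)
```

In practice g would be the metric of (24); the simple stand-in here only illustrates the selection mechanics.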
Let us now transform the optimization problem of (25) into an SDP. From the expression of (24), it is clear that minimizing g(w t ) w.r.t. w t is equivalent to minimizing the expression

tr{[X − F[F + ζ −1 diag(w t )] −1 F] −1 } + tr{Z T [F + ζ −1 diag(w t )] −1 Z},

since tr[Y] does not depend on w t . In the first step, we represent the optimization problem of (25) in an epigraph form ([28], p. 134), ([18], Equations (25) and (26)), which is given by

arg min w t ,V,B tr[V] + tr[B] + λ t 1 T w t (27a)
s.t. V ⪰ [X − F[F + ζ −1 diag(w t )] −1 F] −1 , (27b)
B ⪰ Z T [F + ζ −1 diag(w t )] −1 Z, (27c)
K min t ≤ 1 T w t ≤ K max t , w t ∈ [0, 1] N , (27d)

where we introduce the auxiliary variables V ∈ S N and B ∈ S N . We notice that the epigraph form (27) is well-posed, since by choosing 0 < α < σ 2 e in (17) the matrix [F + ζ −1 diag(w t )] is always positive definite and symmetric. In addition, the matrix [X − F[F + ζ −1 diag(w t )] −1 F] is positive definite by construction, as derived in (18)-(21).
The epigraph form (27) is not a strictly convex program, in the sense that there are multiple V and B matrices that achieve the minimal cost value. This is due to the inequality constraints (27b) and (27c). At optimality, V and B must attain their lower bounds in (27b) and (27c). Hence, an optimizer of the problem is

V = [X − F[F + ζ −1 diag(w t )] −1 F] −1 , B = Z T [F + ζ −1 diag(w t )] −1 Z. (28)

We proceed by simplifying the constraint (27b). Let us introduce another auxiliary variable A ∈ S N and substitute (27b) with the two constraints

V ⪰ [X − A] −1 , A ⪰ F[F + ζ −1 diag(w t )] −1 F. (29)

With this in place, the optimization Problem (27) can be formulated as

arg min w t ,V,B,A tr[V] + tr[B] + λ t 1 T w t (30a)
s.t. V ⪰ [X − A] −1 , (30b)
A ⪰ F[F + ζ −1 diag(w t )] −1 F, (30c)
B ⪰ Z T [F + ζ −1 diag(w t )] −1 Z, (30d)
K min t ≤ 1 T w t ≤ K max t , w t ∈ [0, 1] N . (30e)

It can be claimed that the optimization problem (30) is equivalent to (27), given that it yields a decision variable w t with the same optimal cost as (27). To prove this, let us choose an arbitrary w t , say w̄, verifying (30e). For this fixed yet arbitrary w̄, the optimization problem (30) minimizes both V and B. This means that due to (30b) it minimizes also A: in fact, as V ⪰ [X − A] −1 , the lower bound for V is minimal if the positive definite matrix [X − A] is maximal, that is, A is minimal. Therefore, A must attain its lower bound. As mentioned earlier, there are multiple optimizers, yet one is A = F[F + ζ −1 diag(w̄)] −1 F. The same reasoning holds also for B, which at optimality is B = Z T [F + ζ −1 diag(w̄)] −1 Z. Since this reasoning is valid for any feasible w̄, it is also valid for an optimal one, and therefore the equivalence claim follows. It should be noted that a similar argument was also presented in [24], where only the issue of correlated measurement noise is considered.
Using the Schur complement lemma, the constraints (30b) and (30c) can be equivalently represented by the linear matrix inequalities (LMIs)

[V, I N ; I N , X − A] ⪰ 0, (31a)
[A, F; F, F + ζ −1 diag(w t )] ⪰ 0. (31b)

Similarly, the constraint (30d) can be equivalently represented by an LMI using the Schur complement [28]. In other words, using the fact that [F + ζ −1 diag(w t )] ≻ 0, we obtain

[B, Z T ; Z, F + ζ −1 diag(w t )] ⪰ 0. (32)

Finally, an SDP representation of the overall optimization problem of (27) can be expressed as

arg min w t ,V,B,A tr[V] + tr[B] + λ t 1 T w t (33a)
s.t. LMIs in (31a), (31b), (32), K min t ≤ 1 T w t ≤ K max t , w t ∈ [0, 1] N . (33b)

The solution of the aforementioned SDP is a selection vector ŵ t ∈ [0, 1] N .
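The Schur complement step can be verified numerically: with S = F + ζ −1 diag(w t ) ≻ 0, the block matrix [A, F; F, S] is positive semidefinite exactly when A ⪰ F S −1 F. A minimal sketch with random stand-in data (all values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6
w = rng.integers(0, 2, N).astype(float)
B0 = rng.standard_normal((N, N))
Sigma_sr = B0 @ B0.T + N * np.eye(N)     # well-conditioned, positive definite
F = np.linalg.inv(Sigma_sr)
S = F + np.diag(w) / 0.5                 # F + zeta^{-1} diag(w), zeta = 0.5

A_ok = F @ np.linalg.inv(S) @ F + 0.1 * np.eye(N)   # A strictly above F S^{-1} F
LMI = np.block([[A_ok, F], [F, S]])
assert np.min(np.linalg.eigvalsh(LMI)) > -1e-9       # block matrix is PSD

A_bad = F @ np.linalg.inv(S) @ F - 0.1 * np.eye(N)  # violates A >= F S^{-1} F
LMI_bad = np.block([[A_bad, F], [F, S]])
assert np.min(np.linalg.eigvalsh(LMI_bad)) < 0       # block matrix is not PSD
```

The same mechanics underlie (31a) and (32); the key point is that each LMI is affine (in fact linear) in the entries of w t , V, B, and A.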

Spatial Sensor Placement for Stationary Field Estimation
Let us consider the effect of the stationary component of the field s t for any t, i.e., the case v t = 0 N . The measurement model of (1) is then given by y t = C t s t + e t . Exploiting the prior information regarding s t , i.e., s t ∼ N (µ s , Σ s ), an LMMSE estimator of s t is given by ŝ t = µ s + Σ s C T t (C t Σ s C T t + σ 2 e I M t ) −1 (y t − C t µ s ). The performance of the aforementioned estimator is given by the MSE matrix

M s t = Σ s − Σ s C T t (C t Σ s C T t + σ 2 e I M t ) −1 C t Σ s . (34)

Considering the fact that Σ s can be ill-conditioned, and following the formulation of (23), M s t can be expressed as a function of w t as

M s t = Y + Z T [F + ζ −1 diag(w t )] −1 Z,

where Y = Σ s − Σ s Σ −1 sr Σ s , Z = Σ −1 sr Σ s , and F = Σ −1 sr . Note that the matrices F, Y, and Z are all independent of w t . Considering g(w t ) = tr[M s t ] and following the same SDP formulation of Section 3.1, the proposed sensor placement problem of (9) can be represented as

arg min w t ,B tr[B] + λ t 1 T w t s.t. LMI in (32), K min t ≤ 1 T w t ≤ K max t , w t ∈ [0, 1] N . (35)

The optimization problem of (35) gives the spatial sensor placement pattern for any snapshot t, when the field is stationary over space. However, if the field is also temporally stationary, then the sensor placement problem of (35) can be extended to blocks of multiple snapshots. In this case, the performance metric can be computed using the same approach as [22]. In the simulation section, we show the effects of spatial correlation on sensor placement.

Sparsity-Enhancing Iterative Design
In order to eschew the effect of the magnitude dependencies of the elements of ŵ t , we individually weigh each element of w t . In this case, we consider a vector form for the regularization parameter, λ t ∈ R N , where the weight associated with each element of w t is the corresponding element of λ t . We iteratively refine the weighting vector λ t in the ℓ 1 minimization term of the problem (33) [29]. Using this approach, higher weights are applied to the smaller elements of w t to push them towards 0, while the magnitudes of the larger elements are maintained by applying smaller weights. In this way, a sparser solution can be obtained compared to the standard sparsity-promoting method. The iterative algorithm can be summarized as follows.
• Initialize the iteration counter i = 0, the weight vector λ 0 t = 1 N , and a small ε > 0.
• For i = 0, . . . , I − 1, solve

ŵ i+1 t = arg min w t ,V,B,A tr[V] + tr[B] + (λ i t ) T w t s.t. LMIs in (31a), (31b), (32), K min t ≤ 1 T w t ≤ K max t , w t ∈ [0, 1] N , (36)

and update the weights as [λ i+1 t ] j = 1/([ŵ i+1 t ] j + ε), for every j = 1, . . . , N.
• end; set ŵ t = ŵ I t .
After solving the above algorithm, we still obtain ŵ t ∈ [0, 1] N . We convert this to a Boolean selection vector w t ∈ {0, 1} N using a deterministic/stochastic rounding method as mentioned in Section 3.1.
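The effect of the reweighting update [λ] j = λ 0 /([ŵ] j + ε) can be illustrated on a toy separable problem where each iteration has a closed-form (soft-thresholding) solution; this is only a sketch of the mechanism from [29], not the placement algorithm itself:

```python
import numpy as np

def reweighted_l1(x0, n_iter=5, lam0=0.3, eps=1e-2):
    # Each iteration solves min_x 0.5||x - x0||^2 + sum_j lam_j |x_j|
    # (closed form: soft-thresholding), then reweights: small entries
    # get a large weight and are pushed to exactly zero.
    lam = np.full_like(x0, lam0)
    for _ in range(n_iter):
        x = np.sign(x0) * np.maximum(np.abs(x0) - lam, 0.0)
        lam = lam0 / (np.abs(x) + eps)
    return x

x0 = np.array([1.2, 0.25, -0.9, 0.1, 0.6])
x = reweighted_l1(x0)   # sparser than a single soft-thresholding pass
```

On this toy data a single soft-thresholding pass keeps three non-zero entries, while the reweighted iterations keep only the dominant one, mirroring how the reweighted placement design prunes weakly selected locations.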

KKF Algorithm with Sensor Placement
The informative M t locations to deploy/move the sensors at any t are given by supp(ŵ t ), where 1 T ŵ t = M t . The noisy measurements collected from the aforementioned M t locations are stored in y t . The sensing matrix C t is constructed by removing the all-zero rows of diag(ŵ t ) at every t. This measurement matrix is used for the estimation of the non-stationary and the stationary components by (11) and (14), respectively. Then the overall field estimate at time t is computed by û t = v̂ t + ŝ t . Note that the estimation steps, i.e., (11) and (14), do not require the computation of the inverse of Σ s . The error covariance of the non-stationary component can be updated by (13), which also does not require the inverse of Σ s . At every t, the overall estimation performance can be computed by the expression of (24). The best case performance, i.e., the performance with all the locations selected, can also be computed by the expression of (24) by using w t = 1 N .
In many practical environmental fields (such as rainfall), the field is non-negative. To achieve a non-negative estimate at every t, the estimates of the stationary and non-stationary components can be projected onto the non-negative orthant, i.e., the negative values are set to zero. However, in this case, the performance metrics tr[M v t ] and tr[M s t ] are only approximations. The overall sensor placement followed by the KKF algorithm is presented in Algorithm 1.

Algorithm 1 (excerpt):
4: Prediction using the state model.
6: Correction: estimation of v t and s t using the measurements from the selected sensing locations.
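The prediction and correction steps of Algorithm 1 can be sketched in numpy as below. This is a schematic of one KKF time step, not the paper's exact Equations (11)–(14): those avoid inverting Σ s , whereas this sketch uses the textbook innovation form, with the stationary component s t absorbed into the measurement covariance as correlated noise and then kriged from the same innovation. All names, shapes, and the kriging step are illustrative assumptions.

```python
import numpy as np

def kkf_step(v_prev, M_prev, H, Q, Sigma_s, sigma_e2, w, y):
    """One schematic predict/correct step of a kriged Kalman filter.

    v_prev, M_prev : previous state estimate of v_t and its error covariance
    H, Q           : state transition matrix and process-noise covariance
    Sigma_s        : spatial covariance of the stationary component s_t
    sigma_e2       : measurement-noise variance
    w              : Boolean selection vector (which of the N pixels are sensed)
    y              : measurements from the selected locations
    """
    N = v_prev.size
    C = np.eye(N)[w.astype(bool)]       # sensing matrix: selected rows of I_N
    # Prediction using the state model.
    v_pred = H @ v_prev
    M_pred = H @ M_prev @ H.T + Q
    # Correction: s_t acts as correlated measurement noise while estimating v_t.
    m = C.shape[0]
    S = C @ (M_pred + Sigma_s) @ C.T + sigma_e2 * np.eye(m)
    G = M_pred @ C.T @ np.linalg.solve(S, np.eye(m))   # Kalman gain
    innov = y - C @ v_pred
    v_hat = v_pred + G @ innov
    M_v = M_pred - G @ C @ M_pred
    # Kriging of the stationary component from the same innovation.
    s_hat = Sigma_s @ C.T @ np.linalg.solve(S, innov)
    u_hat = np.maximum(v_hat + s_hat, 0.0)  # projection onto the non-negative orthant
    return u_hat, v_hat, s_hat, M_v
```

The final line implements the non-negativity projection discussed above, after which tr[M v t ] is only an approximate performance metric.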

Simulation Results
In this section, we perform numerical experiments to exhibit the practicality of the developed sparsity-enforcing sensor placement followed by the KKF estimation method. We select a service area of 6 × 6 square km with 1 square km spatial resolution, as illustrated in Figure 3. The spatial distribution of the non-stationary component at time t = 0, i.e., v 0 , is generated by an exponential source-field model, where K is the number of field-generating points/sources and the parameters ρ k , s k , and d k are the location, amplitude, and spatial decaying factor of the k-th source at time t = 0. Based on this model, we generate the non-stationary component of the field at time t = 0, i.e., v 0 ∈ R N , using the parameters K = 1, ρ 1 = [1.5, 1.5] T , s 1 = 2, and d 1 = 1. The spatial distribution of v 0 in the specified service area is shown in Figure 3.
The state model of the non-stationary component v t is given by (7), with the state transition matrix modelled by the Gaussian kernel function in (8). For the sake of simplicity, we consider a spatially invariant translation parameter and spatio-temporally invariant dilation parameters, given as a ij t = a t and D ij t = D, respectively, for i, j = 1, . . . , N.
The spatio-temporal evolution of the true field, i.e., u t = v t + s t , is generated in the following two ways. In the first case, we consider a pure advective process, i.e., we select a very low dilation parameter given by D = 10 −4 I 2 for all t = 1, . . . , 8 and ν = 0.8. It is assumed that the temporal resolution is 1 min and that the translation vectors a t change at every t. At t = 0, v t is generated by the source as shown in Figure 3, and the states of v t for t = 1, . . . , 8 are generated by the state model (7). The process noise is spatially colored yet temporally uncorrelated.
The stationary component s t is modelled by s t ∼ N (1 N , Σ s ). The parameters of the squared exponential covariance function in (6) are given by σ 2 s = 0.001 and θ = 1. Note that as θ increases, the field becomes spatially more correlated and the condition number of Σ s increases. However, as mentioned earlier, our proposed formulation, i.e., both the selection and the estimation, does not involve the inversion of Σ s . A highly spatially correlated s t is considered in the second case. For the first case, the simulated true field u t = v t + s t for t = 1, . . . , 8 is shown in Figure 4.
In the second case, we consider D = I 2 for all t = 1, . . . , 8, and the translation parameters are fixed as a t = [0.4, 0.4] T for t = 1, . . . , 4, with no translation for the last 4 snapshots, i.e., a t = [0, 0] T for t = 5, . . . , 8. The state of v t at t = 0 is the same as before. The scaling parameter is given by ν = 0.35, and the process noise q t is the same as before. In this case, we assume that the stationary component s t is spatially more correlated than in the first case. The parameters of the covariance function (6) are taken as σ 2 s = 0.01 and θ = 4, which generates an ill-conditioned Σ s (Figure 2a). Using these, the true field u t = v t + s t for t = 1, . . . , 8 is shown in Figure 5.
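The simulation setup above can be sketched as follows. Since the paper's exact Equations (6)–(8) are not reproduced in this section, the exponential source model, the Gaussian-kernel state transition, and the squared-exponential covariance are written in generic forms; the kernel parametrization and the jitter term are assumptions made for illustration.

```python
import numpy as np

# 6 x 6 km service area with 1 km resolution: N = 36 pixel centroids.
grid = np.array([[x + 0.5, y + 0.5] for x in range(6) for y in range(6)], float)
N = len(grid)

def source_field(points, rho, s, d):
    """Exponential source model for v_0 (generic form; K = 1 source here)."""
    r = np.linalg.norm(points - rho, axis=1)
    return s * np.exp(-r / d)

def gaussian_kernel_H(points, a, D, nu):
    """Gaussian-kernel state transition with translation a, dilation D, scaling nu."""
    Dinv = np.linalg.inv(D)
    H = np.empty((N, N))
    for i, pi in enumerate(points):
        diff = points - (pi - a)   # mass at x - a moves to x: pure advection for tiny D
        H[i] = nu * np.exp(-np.einsum('nj,jk,nk->n', diff, Dinv, diff))
    return H

def squared_exp_cov(points, sigma2, theta):
    """Squared-exponential spatial covariance of the stationary component."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return sigma2 * np.exp(-d2 / (2.0 * theta ** 2))

rng = np.random.default_rng(1)
v = source_field(grid, rho=np.array([1.5, 1.5]), s=2.0, d=1.0)   # v_0
H = gaussian_kernel_H(grid, a=np.array([1.0, 1.0]), D=1e-4 * np.eye(2), nu=0.8)
Sigma_s = squared_exp_cov(grid, sigma2=0.001, theta=1.0)
jitter = 1e-9 * np.eye(N)                      # numerical PSD safeguard
fields = []
for t in range(8):
    v = H @ v                                  # advect the non-stationary part
    s = rng.multivariate_normal(np.ones(N), Sigma_s + jitter)  # stationary part
    fields.append(v + s)                       # true field u_t = v_t + s_t
```

With D = 10^-4 I 2 the kernel is sharply peaked, so each step essentially translates v t by a, reproducing the pure advective behavior of the first case.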

Sensor Placement Followed by Field Estimation Using KKF
We select the optimal sensing locations and use them to estimate the field for t = 1, . . . , 8 snapshots for the two different scenarios of the spatio-temporal evolution of the field described in the previous section. We use the same service area as shown in Figure 1b, where the centroids of the N = 36 pixels are the candidate sensing locations. We assume that the measurement noise variance is σ 2 e = 0.001. We solve the optimization problem (36) with the parameters I = 2 and ε = 10 −6 . The weighting vector is initialized as λ 0 = 1 N . The resource constraints are given as K max t = 30 and K min t = 25 for all t. To extract the Boolean solution w t ∈ {0, 1} N from ŵ t ∈ [0, 1] N , we adopt the randomized rounding method. We use the software CVX [30] (parser CVX, solver SeDuMi [31]) to solve the SDP problem (36).
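The randomized rounding step can be sketched as below. The exact variant used in Section 3.1 is not reproduced here; the feasibility check against the budget K min t ≤ 1 T w t ≤ K max t , the number of draws, and the tie-breaking rule are assumptions.

```python
import numpy as np

def randomized_round(w_relaxed, k_min, k_max, num_draws=100, seed=0):
    """Round a relaxed selection vector in [0,1]^N to a Boolean one.

    Draws Bernoulli(w_relaxed) candidates and keeps the feasible one
    (k_min <= 1^T w <= k_max) that stays closest to the relaxed vector.
    Returns None if no feasible draw is found.
    """
    rng = np.random.default_rng(seed)
    best, best_dist = None, np.inf
    for _ in range(num_draws):
        w = (rng.random(w_relaxed.size) < w_relaxed).astype(int)
        if not (k_min <= w.sum() <= k_max):
            continue                       # violates the resource budget
        dist = np.abs(w - w_relaxed).sum() # l1 distance to the relaxed solution
        if dist < best_dist:
            best, best_dist = w, dist
    return best
```

Deterministic rounding (thresholding the largest entries of ŵ t ) is the obvious alternative; the stochastic variant explores several feasible supports and keeps the one closest to the relaxed solution.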
Following the above simulation setup, the selected sensing locations for the first and the second scenario are shown in Figure 6a,b respectively for the 8 snapshots. The indices of the pixel midpoints are the same as in Figure 1b (vertical axis). The main observations from the selected locations are listed below.

•
First of all, it is clearly seen that the selected sensing locations depend on the dynamics. Figure 6a shows the optimal placement pattern when H t changes at every t (a different a t at every t), while Figure 6b shows the optimal sensing locations when H t remains the same over several consecutive snapshots. When H t changes at every t, i.e., the spatio-temporal evolution of the field is guided by the time-varying spatial translation parameter a t (Figure 1b), the optimal selection pattern also follows this translation (Figure 6a).

•
In the second scenario, we have assumed a very low and fixed translation, i.e., a t = [0.4, 0.4] T for the first 4 snapshots and no translation, i.e., a t = [0, 0] T , for the last 4 snapshots (Figure 5). It is seen that almost the same set of sensors is selected in the last 4 snapshots of Figure 6b. In general, when H t does not change with time, the estimation error of the non-stationary component reaches a steady state after a number of snapshots and the same set of sensors is selected at every t.
The overall estimation performance using the measurements from the selected locations of Figure 6a,b is shown in Figure 7a,b, respectively. In these figures, we exhibit the pixel-wise comparison of the estimates for T = 8 snapshots, i.e., the estimation performance over 36 × 8 = 288 pixels. We initialize the KKF iterations with v̂ t = 1 N and M v t = 0.001I N at t = 0.

Performance Analysis
We evaluate the developed sensor placement method by comparing the performance metric g(ŵ t ) = tr[M s t ] + tr[M v t ] with that of a random sensor placement (with the same M t , i.e., ‖w t ‖ 0 = M t , as for the developed method) and with the best-case performance (i.e., M t = N or w t = 1 N ). For the random placement, we generate 100 different realizations of w t ∈ {0, 1} N at every t with the same M t as for the proposed method; g(w t ) is computed for every realization and then averaged. Similarly, we compute the best-case performance g(1 N ) for every t, where M v t is updated with w t = 1 N . We use the same set of parameters as in the first case of Section 4.1, except that the resource allocation constraint is simplified to 1 T w t = 15, i.e., we fix that exactly 15 sensing locations are selected at every t. The performance comparison is shown in Figure 8. It is seen that the proposed approach slightly outperforms the random placement, which, moreover, does not optimize any performance criterion.
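The Monte Carlo comparison described above can be sketched as follows. The metric here is a schematic one-step version of g(w) = tr[M s t ] + tr[M v t ] built from standard Gaussian conditioning, not the paper's exact recursions (13) and (24); the prior covariance, pixel coordinates, and noise levels are illustrative assumptions.

```python
import numpy as np

def g_metric(w, M_pred, Sigma_s, sigma_e2):
    """Schematic placement metric g(w) = tr[M_s] + tr[M_v] for a selection w."""
    sel = w.astype(bool)
    C = np.eye(len(w))[sel]                 # sensing matrix for this selection
    S = C @ (M_pred + Sigma_s) @ C.T + sigma_e2 * np.eye(sel.sum())
    M_v = M_pred - M_pred @ C.T @ np.linalg.solve(S, C @ M_pred)
    M_s = Sigma_s - Sigma_s @ C.T @ np.linalg.solve(S, C @ Sigma_s)
    return np.trace(M_v) + np.trace(M_s)

rng = np.random.default_rng(2)
N, M = 36, 15
pts = 6.0 * rng.random((N, 2))              # placeholder pixel coordinates
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
Sigma_s = 0.01 * np.exp(-d2 / 2.0)          # squared-exponential covariance
M_pred = 0.5 * np.eye(N)                    # schematic prior covariance of v_t
# Average g over 100 random placements of M = 15 sensors vs. the best case w = 1_N.
draws = [np.isin(np.arange(N), rng.choice(N, M, replace=False)) for _ in range(100)]
g_rand = float(np.mean([g_metric(w, M_pred, Sigma_s, 1e-3) for w in draws]))
g_best = g_metric(np.ones(N), M_pred, Sigma_s, 1e-3)
```

Since Gaussian conditioning on more observations never increases the posterior covariance, g(1 N ) lower-bounds the random-placement average, mirroring the ordering of the curves in Figure 8.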

Spatial Sensor Placement for Stationary Field Estimation
In this section, we show the effects of different spatial correlation patterns on sensor placement, assuming the field is purely stationary. We solve the optimization problem (35) for two different spatial covariance matrices Σ s . In the first case, we consider that Σ s is generated by the squared exponential function (6) with θ = 2 and σ 2 s = 0.01 (Figure 9a). In the second case, we consider a randomly generated Σ s (Figure 9b). The resource allocation constraint is the same as before, i.e., K min t = 25 and K max t = 30. We solve (35) with the iterative approach of (36), using the same parameters as in the previous section. The selected sensor locations (marked by black squares, where the blue dots are the candidate locations as shown in Figure 1b) are shown in Figure 10a,b for the spatial covariance matrices of Figure 9a,b, respectively.
First of all, it is observed that the spatial distribution of the optimal sensing locations depends upon the correlation pattern of the field. When Σ s is generated by a squared exponential (stationary) covariance function, the optimal sensor placement pattern is more or less symmetric and uniformly distributed over the entire service area. However, for a random spatial covariance matrix, the optimal sensing locations do not follow any specific pattern.

Conclusions and Future Work
In this work, we have developed a sparsity-enforcing sensor placement followed by a field estimation technique using a KKF. The proposed methodology selects the most informative sensing locations over space and time in a specified service area of interest. Along with minimizing the estimation error, the developed method also economizes the sensor placement (in terms of resources) at every temporal interval. The salient features of the proposed method include handling a general class of spatial covariance matrices and tackling correlated measurement noise. Numerical analysis shows the feasibility of the method. The effects of the dynamics and spatial correlation of the field in spatio-temporal sensor placement are discussed with numerical experiments.
In this work, we have assumed that the spatial variability and the dynamics of the field are perfectly known a priori. In that case, the performance of a clairvoyant Kalman setup with Gaussian measurement and process noise is optimal. However, in many practical scenarios, the aforementioned spatio-temporal prior information may not be accurate, and more information regarding the unknown field is required in the estimation step. Future research is envisioned to incorporate the effects of model imperfections in the developed method. Another future research direction is the use of distributed algorithms to apply the developed method in large-scale sensor network applications. It will also be interesting to tailor the recent progress in time-varying optimization [32] to solve the SDPs in a tracking fashion, rather than to optimality at each sampling time.