ANLoC: An Anomaly-Aware Node Localization Algorithm for WSNs in Complex Environments

Xu, Pengfei; Cui, Tianhao; Chen, Lei

doi:10.3390/s19081912


by Pengfei Xu ¹, Tianhao Cui ¹ and Lei Chen ¹,²,³,*

¹ School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

² Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts & Telecommunications, Nanjing 210023, China

³ College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

* Author to whom correspondence should be addressed.

Sensors 2019, 19(8), 1912; https://doi.org/10.3390/s19081912

Submission received: 13 March 2019 / Revised: 16 April 2019 / Accepted: 20 April 2019 / Published: 23 April 2019

(This article belongs to the Section Sensor Networks)


Abstract: Accurate and sufficient node location information is crucial for Wireless Sensor Network (WSN) applications. However, existing range-based localization methods often suffer from incomplete and distorted range measurements. To address this issue, several methods based on low-rank matrix recovery have been proposed. These methods usually assume that the noise follows a single Gaussian and/or a single Laplacian distribution, and thus cannot handle noise distributions beyond these two. In this paper, a novel Anomaly-aware Node Localization (ANLoC) method is proposed to simultaneously impute missing range measurements and detect node anomalies in complex environments. Specifically, by utilizing the inherent low-rank property of the Euclidean Distance Matrix (EDM), we formulate the range measurement imputation problem as a Robust $\ell_{2,1}$-norm Regularized Matrix Decomposition (RRMD) model, where complex noise is fitted by a Mixture of Gaussians (MoG) distribution and node anomalies are sifted by $\ell_{2,1}$-norm regularization. Meanwhile, an efficient optimization algorithm based on the Expectation Maximization (EM) method is designed to solve the proposed RRMD model. Furthermore, with the imputed EDM, all unknown nodes can be positioned by using the Multi-Dimensional Scaling (MDS) method. Finally, experiments are designed to evaluate the performance of the proposed method, and the results demonstrate that our method outperforms three state-of-the-art node localization methods.

Keywords: wireless sensor networks; anomaly-aware node localization; low-rank matrix decomposition; mixture of Gaussians

1. Introduction

Wireless Sensor Networks (WSNs) have made great progress and are widely used in fields such as environmental monitoring, intelligent transportation, and target tracking [1,2,3]. These applications work well only when accurate location information is available [4,5]. Many localization methods have been proposed, and they can be divided into two categories [6]. Range-based localization methods achieve more accurate positioning, but they incur large computation and communication overhead and require some hardware support. Range-free localization methods are generally suitable for low-power, low-cost applications, yet their positioning accuracy is low.

In this paper, we focus on range-based localization methods, which can be described as follows: in WSN applications, sensor nodes are randomly deployed, and a few of them, called anchors, obtain their actual locations from a GPS device or other equipment. Then, with the pair-wise range measurements between nodes and the actual locations of the anchors, the locations of all unknown nodes can be estimated. In general, range-based localization methods depend on a large amount of accurate inter-node distance information. However, in practical situations, limited by the energy of sensors or the distribution of nodes in the application scenario, only a small amount of inter-node distance information can be obtained [7]. Additionally, due to complex environments and uncertain hardware abnormality, the range measurements inevitably suffer from errors, which lower the positioning accuracy. Usually, these errors can be regarded as a mixture of complex noise and anomaly. In reference [6], the complex noise is considered to be caused by environmental interference, malicious attacks, or other unpredictable factors, and is assumed to be a mixture of Gaussian noise with a single distribution and outlier noise with a single distribution. Correspondingly, Xiao et al. [6] proposed a method based on low-rank matrix completion that sifts the complex noise by adopting both Frobenius-norm regularization and $\ell_{1}$-norm regularization. However, presetting the complex noise as two known noise types is too arbitrary: the noise distribution in practical applications is often unknown and extends beyond Gaussian and Laplacian distributions. On the other hand, reference [8] handles a more general application scenario with the co-existence of complex noise and anomaly nodes, where anomaly nodes are defined as nodes with an abnormal transmission module or unpredictable hardware defects. In reference [8], an $\ell_{2,1}$-norm regularization term is employed to detect the anomaly from the corrupted range measurements. However, similar to reference [6], the so-called complex noise is still modeled as two known noise types, i.e., a mixture of outlier noise and Gaussian noise.

To address this limitation, we propose a novel Anomaly-aware Node Localization (ANLoC) method that simultaneously positions the unknown nodes and detects the abnormal nodes in complex environments. Specifically, a Robust $\ell_{2,1}$-norm Regularized Matrix Decomposition (RRMD) model is constructed by introducing the Mixture of Gaussians (MoG) distribution and the $\ell_{2,1}$-norm into the conventional low-rank Matrix Decomposition (MD) model. The resulting model not only fits the intrinsic low-rank property of the Euclidean distance matrix, but is also robust against node anomaly and a wider range of complex noise distributions beyond Gaussian and Laplacian noise. Our basic idea is to encode the node anomaly as a structurally row/column-sparse matrix and the complex noise as a noise matrix whose entries follow an MoG distribution. We employ the MoG distribution as the general noise model because of its universal approximation property for any continuous distribution [9]. This idea is inspired by recent noise modeling works, including Low-Rank Matrix Factorization (LRMF) [10] and Low-Rank Representation (LRR) [11], and has been verified to be effective in complex noise scenarios. Note that although the anomaly can, in theory, also be seen as a continuous distribution approximated by MoG, we still model it explicitly by using the $\ell_{2,1}$-norm. The reason is to explicitly detect the location of the anomalous nodes, and thus provide a basis for troubleshooting. Furthermore, based on the popular Expectation Maximization (EM) method [12], an efficient optimization algorithm for solving the proposed RRMD model is designed to obtain the true underlying Euclidean Distance Matrix (EDM). Finally, the actual coordinates of the unknown nodes can be estimated by employing the Multi-Dimensional Scaling (MDS) method [13], and the abnormal nodes can also be detected.

The primary contributions of our work can be summarized as follows.

A Robust $\ell_{2,1}$-norm Regularized Matrix Decomposition (RRMD) model is proposed to jointly estimate the missing range measurements and detect the node anomaly. Solving the two tasks jointly exploits the relationship between them, so that each helps the other achieve more accurate results.
The MoG distribution is employed to fit the unknown complex noise, which allows the proposed RRMD model to adaptively handle a wider range of noise beyond the existing methods. Meanwhile, an efficient optimization algorithm is designed to solve the proposed RRMD model by adopting the popular EM method.
A novel Anomaly-aware Node Localization (ANLoC) method is proposed based on the RRMD model, and extensive experiments verify the superior positioning performance of the ANLoC method in the coexistence of node anomaly and complex noise.

The rest of this paper is organized as follows. In Section 2, we introduce the current research advances about the range-based node localization methods and low-rank matrix decomposition methods, respectively. Section 3 describes the notations used in this paper and some related mathematical foundations. In Section 4, the RRMD model is constructed and an optimization algorithm based on the EM method is designed to solve this model. This section also presents the Anomaly-aware Node Localization (ANLoC) method based on the proposed RRMD and the classic MDS. In Section 5, a series of simulation experiments are conducted to evaluate the performance of our proposed method. Finally, the conclusions are drawn in Section 6.

2. Related Work

2.1. Range-Based Node Localization

At present, typical range-based localization methods include two steps: (1) using a ranging technique, such as Time of Arrival (ToA), Time Difference of Arrival (TDoA) or Received Signal Strength Indicator (RSSI), to measure the distance between nodes; (2) using the range measurements combined with the locations of the anchor nodes to calculate the positions of the unknown nodes. The popular Maximum Likelihood (ML) localization method is asymptotically efficient [14] given enough data records. Tomic et al. [15] built a new convex estimator that approximates the ML estimator by applying efficient convex relaxations, which reduces the estimation errors. In references [16,17], the WSN localization problem is treated as a variant of the EDM recovery problem or the graph realization problem. By using the range measurements between nodes and introducing slack variables to convert the non-convex quadratic distance constraints into linear constraints, the authors formulated the WSN localization problem as a Semi-Definite Programming (SDP) problem and designed an efficient optimization method to solve it. References [18,19] employed the classical MDS method to map the distance relationship between wireless sensor nodes to a low-dimensional space, generating a relative coordinate map that fits the inter-node distances well. Then, a few anchor nodes were used to convert the relative positions to global positions. However, the above methods do not work well in real application scenarios because they generally require a complete and accurate EDM. Unfortunately, due to energy-saving requirements and the effects of potential node anomaly and complex noise, existing range-based node localization methods often suffer from incomplete and distorted range measurements. In addition, the ML and SDP methods can only be used to solve small-scale problems because of their high computational complexity.

To address the above issues, several matrix-completion-based node localization methods have been proposed in recent years. Specifically, by taking advantage of the low-rank characteristics of the EDM, Feng et al. [20] first formulated the range measurement imputation problem as a low-rank EDM Matrix Completion (MC) model and designed an efficient optimization method to solve it. However, their work assumes that the errors contained in the range measurements are Gaussian noise, ignoring more complex errors caused by node hardware failure, multipath transmission, etc. In reference [6], another kind of error, called outliers, was assumed to obey a Laplacian distribution. Then, $\ell_{1}$-norm regularization was introduced to deal with it, which effectively improved the positioning accuracy. However, presetting the complex noise as these two known types is still too simple, and the actual noise is more complicated in practical applications. Recently, Liu et al. [21] proposed a Linear Bregman Iteration based matrix completion method to localize nodes in WSNs. However, this method does not consider the realistic scenario in which complex noise and anomaly co-exist. More importantly, all these matrix-completion-based node localization methods inevitably involve the Singular Value Decomposition (SVD) operation, which has a heavy computation cost. To address these issues, the ANLoC method proposed in this paper formulates the range measurement imputation problem as a low-rank matrix decomposition model instead of a low-rank matrix completion model. Therefore, in the next section, we introduce the related work on low-rank matrix decomposition.

2.2. Low-Rank Matrix Decomposition

Low-Rank Matrix Decomposition (LRMD) is an important technique in data science, which can uncover the latent manifold structures of data, and thus obtain a low dimensional compression representation. Recently, by decomposing the target matrix into the product of two low-rank matrices, LRMD has been widely used in various fields such as dimensionality reduction, clustering, and matrix recovery.

The original LRMD model [22] can be formulated as

$$\min_{U, V} \| M - U V^{T} \|_{\ell_{p}} \tag{1}$$

where $M \in \mathbb{R}^{m \times n}$ is the target matrix to be approximated, $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$ are two low-dimensional matrix variables ($r < \min(m, n)$), and $\| \cdot \|_{\ell_{p}}$ denotes the $\ell_{p}$-norm. The $\ell_{2}$-norm and the $\ell_{1}$-norm are commonly used to make this model robust to Gaussian noise and outlier noise, respectively. When some elements of $M$ are missing, the original LRMD can be turned into a matrix recovery model by adding an orthogonal projection operator [23]:

$$\min_{U, V} \| P_{\Omega}(M - U V^{T}) \|_{\ell_{p}} \tag{2}$$

where $P_{\Omega}(\cdot)$ is an orthogonal projection operator defined as

$$[P_{\Omega}(M)]_{ij} = \begin{cases} M_{ij}, & (i, j) \in \Omega, \\ 0, & \text{otherwise}, \end{cases} \tag{3}$$

where $\Omega \subseteq [m] \times [n]$ (with $[m] = \{1, 2, \dots, m\}$ and $[n] = \{1, 2, \dots, n\}$) represents the index set of sampled elements. This model can be applied to various matrix recovery problems, such as recommender systems [24] and image representation [25], and has lower computational complexity than the low-rank matrix completion model with nuclear norm minimization. However, the $\ell_{1}$-norm and the $\ell_{2}$-norm are only optimal when the noise follows a Laplacian or a Gaussian distribution, respectively, which is often not the case in practice. To address this limitation, Meng et al. [10] proposed a noise-tolerant LRMD model based on the Mixture of Gaussians distribution:

$$\min_{U, V} \| P_{\Omega}(M - U V^{T}) \|_{MoG} \tag{4}$$

where $\| \cdot \|_{MoG}$ indicates that each element of the matrix is modeled by a Mixture of Gaussians distribution. Reference [10] has demonstrated the robustness of the MoG model to unknown noise. However, since this model does not explicitly consider the row/column-wise structural anomaly that the sampled matrix may suffer from, it cannot be directly applied to the node localization of WSNs considered in this paper.
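As a minimal sketch of the projected objective in Equation (2), the snippet below evaluates the entry-wise $\ell_p$ loss on observed entries only. The function names `masked_residual` and `lrmd_loss` and the boolean `mask` encoding of $\Omega$ are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def masked_residual(M, U, V, mask):
    """P_Omega(M - U V^T): residual on observed entries, zero elsewhere."""
    return np.where(mask, M - U @ V.T, 0.0)

def lrmd_loss(M, U, V, mask, p=2):
    """Entry-wise l_p loss of Equation (2) over the observed set Omega."""
    res = masked_residual(M, U, V, mask)
    return np.sum(np.abs(res) ** p) ** (1.0 / p)

# Toy example: rank-1 factors, one unobserved (and badly corrupted) entry.
U = np.array([[1.0], [2.0]])
V = np.array([[1.0], [1.0]])
M = np.array([[1.0, 4.0], [6.0, 102.0]])
mask = np.array([[True, True], [True, False]])

loss_l2 = lrmd_loss(M, U, V, mask, p=2)  # residuals 3 and 4 on Omega
loss_l1 = lrmd_loss(M, U, V, mask, p=1)
```

The unobserved entry does not contribute to either loss, which is the role of $P_{\Omega}$ in Equations (2) and (4).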

3. Preliminaries

3.1. Notations

We first introduce the important notations used in this paper. All italic letters denote variables and non-italic letters denote constants. Bold uppercase letters denote matrices and bold lowercase letters denote vectors. Specifically, $X_{ij}$ denotes the scalar in the $i$-th row and $j$-th column of $X$, and $x^{i}$ and $x_{j}$ represent the $i$-th row and $j$-th column of the matrix $X$, respectively. Additionally, we denote the $\ell_{p}$-norm, Frobenius-norm, and $\ell_{2,1}$-norm of a matrix $X \in \mathbb{R}^{m \times n}$ as $\| X \|_{\ell_{p}} = (\sum_{i=1}^{m} \sum_{j=1}^{n} | X_{ij} |^{p})^{1/p}$, $\| X \|_{F} = (\sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij}^{2})^{1/2}$, and $\| X \|_{2,1} = \sum_{i=1}^{m} (\sum_{j=1}^{n} X_{ij}^{2})^{1/2}$, respectively. $X^{T}$ denotes the transpose of matrix $X$. $\mathcal{N}(0, \sigma^{2})$ denotes a Gaussian distribution with mean 0 and variance $\sigma^{2}$. Finally, $\odot$ represents the Hadamard product of two matrices.
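The norms above can be checked numerically with a short NumPy sketch. The helper name `l21_norm` and the sample matrix are illustrative assumptions.

```python
import numpy as np

def l21_norm(X):
    """Sum of the Euclidean norms of the rows of X."""
    return np.sum(np.sqrt(np.sum(X ** 2, axis=1)))

X = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [5.0, 12.0]])

fro = np.sqrt(np.sum(X ** 2))   # Frobenius norm
l21_rows = l21_norm(X)          # row norms 5, 0, 13
l21_cols = l21_norm(X.T)        # same measure applied to columns via the transpose
hadamard = X * X                # Hadamard (entry-wise) product
```

Because the $\ell_{2,1}$-norm sums unsquared row norms, minimizing it drives entire rows to zero, which is why it suits row-structured anomalies; applying it to $X^{T}$ gives the column-wise counterpart.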

3.2. Mathematical Foundation

Definition 1 [26]: (Proximal Operator) Let $F(X)$ be a real convex function defined on $X \in \mathbb{R}^{m \times n}$. For any $\tau, \mu > 0$ and constant matrix $M \in \mathbb{R}^{m \times n}$, the proximal operator is defined as:

$$\mathrm{prox}_{\tau F(X)}(M) = \arg\min_{X} \tau F(X) + \frac{\mu}{2} \| X - M \|_{F}^{2} \tag{5}$$

Definition 2 [27]: (Structural Thresholding Operator) For any $\tau > 0$ and $M \in \mathbb{R}^{m \times n}$, the proximal operator of the $\ell_{2,1}$-norm, i.e., the structural thresholding operator, is defined as:

$$\mathrm{prox}_{\tau \| X \|_{2,1}}(M) = J_{\tau / \mu}(M) \tag{6}$$

where

$$(J_{\tau / \mu}(M))^{i} = \max\{ \| M^{i} \|_{2} - \tau / \mu, 0 \} \cdot M^{i} / \| M^{i} \|_{2}, \quad i = 1, 2, \dots, m \tag{7}$$
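A minimal NumPy sketch of the structural thresholding operator in Equation (7) follows. The function name `structural_threshold`, the single argument `thr` (standing for $\tau / \mu$), and the small constant guarding against division by zero are assumptions made for illustration.

```python
import numpy as np

def structural_threshold(M, thr):
    """Row-wise shrinkage J_thr(M): shrink each row norm by thr, or zero the row."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    # The guard avoids 0/0 for all-zero rows; such rows stay zero.
    scale = np.maximum(norms - thr, 0.0) / np.maximum(norms, 1e-12)
    return scale * M

M = np.array([[3.0, 4.0],    # row norm 5   -> shrunk to norm 4
              [0.3, 0.4]])   # row norm 0.5 -> below the threshold, set to zero
J = structural_threshold(M, thr=1.0)
```

Rows whose norm falls below the threshold vanish entirely, while larger rows keep their direction and lose `thr` in length.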

Definition 3 [28]: (Lipschitz Continuous Gradient) A differentiable convex function $F(X)$ defined on $\mathbb{R}^{m \times n}$ is said to have a Lipschitz continuous gradient with constant $\xi$ if there exists a constant $\xi > 0$ such that, for all $X_{1}, X_{2} \in \mathbb{R}^{m \times n}$,

$$\| \nabla F(X_{1}) - \nabla F(X_{2}) \|_{F} \leq \xi \| X_{1} - X_{2} \|_{F} \tag{8}$$

Theorem 1 [29]: Let $F_{1}$ and $F_{2}$ be two lower semi-continuous convex functions on $\mathbb{R}^{m \times n}$, where $F_{2}$ is also differentiable with a Lipschitz continuous gradient with constant $\xi$. Then, for the convex optimization problem

$$\min_{X} F_{1}(X) + F_{2}(X) \tag{9}$$

if $F_{1} + F_{2}$ is coercive and strictly convex, then for any initial value $X_{0}$ and $0 < \delta < 1 / \xi$, the iterative sequence $\{X_{k}\}$ generated by the following Equation (10) converges to the unique solution of problem (9):

$$X_{k+1} = \arg\min_{X} \delta F_{1}(X) + \frac{1}{2} \| X - (X_{k} - \delta \nabla F_{2}(X_{k})) \|_{F}^{2} \tag{10}$$

4. Anomaly-Aware Node Localization for WSNs

In this section, we first establish an RRMD model and employ the EM method to optimize it, and then propose a novel ANLoC method based on the RRMD model.

4.1. Euclidean Distance Matrix Completion

4.1.1. Problem Description and RRMD Model Construction

In a typical application scenario, $n$ sensor nodes are randomly deployed in a specific area, and their coordinates can be written as $X = [x_{1}, x_{2}, \dots, x_{n}] \in \mathbb{R}^{d \times n}$ ($d$ is usually 2 or 3, indicating the dimension of the coordinate space). Then, the corresponding EDM $D \in \mathbb{R}^{n \times n}$ can be calculated by

$$D_{ij} = \| x_{i} - x_{j} \|^{2} = x_{i}^{T} x_{i} + x_{j}^{T} x_{j} - 2 x_{i}^{T} x_{j}, \quad i, j \in \{1, 2, \dots, n\} \tag{11}$$

where $D_{ij}$ represents the squared Euclidean distance between the $i$-th and the $j$-th sensor nodes. Reference [8] has proved that the rank of $D$ is at most $d + 2$, which indicates that the EDM has an inherently strict low-rank property.

Usually, in practical applications, only a few sensor nodes can obtain their accurate coordinates via GPS, and the positions of the other, unknown nodes need to be calculated indirectly by a localization method. Due to irregular node distribution and energy consumption limits, only partial range measurements can be collected, so we can only establish a sampled EDM with missing elements. Our goal is to estimate the coordinates of the unknown nodes based on the sampled range measurements and the known coordinates of the anchor nodes. Figure 1 illustrates the pipeline of node localization for WSNs in complex environments. As shown in Figure 1, the node localization process involves two main steps: EDM recovery and node positioning. In the first step, the proposed RRMD algorithm imputes the missing range measurements and yields the estimated complete EDM, which is also called the true underlying EDM. In the second step, based on the estimated true underlying EDM, the MDS method is employed to position each unknown node.

Compared with node positioning, EDM recovery is more critical and challenging. Specifically, in practical applications, because of complex environmental interference, malicious attacks, hardware malfunction, etc., errors in range measurements are unavoidable; they corrupt the sampled EDM and reduce localization accuracy. In this paper, we assume that the sampled EDM $M \in \mathbb{R}^{n \times n}$ consists of the true underlying EDM component $\hat{D} \in \mathbb{R}^{n \times n}$, the row structural anomaly component $R \in \mathbb{R}^{n \times n}$, the column structural anomaly component $C \in \mathbb{R}^{n \times n}$, and the noise component $N \in \mathbb{R}^{n \times n}$, as illustrated in Figure 2. The noise is considered to be caused by environmental interference, malicious attacks, etc., and usually follows continuous complex distributions. The anomaly is caused by an abnormal transmission module or other unpredictable hardware defects. Specifically, when the receiving module of a node fails, it causes a row structural anomaly in the corresponding row of the EDM, and when the sending module of a node fails, the corresponding column of the EDM has a column structural anomaly [8]. To address the above problems, we need to design an efficient method to obtain the true underlying EDM. Finally, based on this EDM, the actual coordinates of the unknown nodes can be estimated by employing the classic MDS method.
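To make Equation (11) and the rank bound concrete, the following sketch builds a squared EDM from random 2-D coordinates and checks its numerical rank. The function name `squared_edm`, the random seed, and the deployment area are illustrative assumptions.

```python
import numpy as np

def squared_edm(X):
    """X is d x n (one column per node); returns the n x n squared-distance matrix."""
    g = np.sum(X ** 2, axis=0)                      # x_i^T x_i for every node
    return g[:, None] + g[None, :] - 2.0 * X.T @ X  # Equation (11)

rng = np.random.default_rng(0)
d, n = 2, 30
X = rng.uniform(0.0, 100.0, size=(d, n))  # hypothetical 100 x 100 deployment area
D = squared_edm(X)

rank = np.linalg.matrix_rank(D)           # numerical rank, expected to be at most d + 2
```

Although `D` is a 30 x 30 matrix, its rank does not grow with the number of nodes, which is what makes low-rank recovery of missing entries plausible.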

Therefore, our primary goal is to design an effective EDM reconstruction method that imputes the missing range measurements and de-noises the inaccurate ones. Intuitively, based on the low-rank characteristics of the EDM, we can cast the EDM reconstruction problem as a standard matrix decomposition model, which is formulated as

$$\min_{U, V} \| P_{\Omega}(M - U V^{T}) \|_{F} . \tag{12}$$

However, this model can only be applied to the case of single Gaussian noise, and its reconstruction performance declines dramatically when the sampled EDM is corrupted by both complex noise and anomaly. To this end, we employ the MoG distribution and the $\ell_{2,1}$-norm to improve the model, and establish a Robust $\ell_{2,1}$-norm Regularized Matrix Decomposition (RRMD) model as follows:

$$\begin{aligned} \min_{\hat{D}, R, C, N, U, V, \Pi, \Sigma} \quad & \lambda_{1} \| R \|_{2,1} + \lambda_{2} \| C^{T} \|_{2,1} - \log L(P_{\Omega}(N) \mid \Pi, \Sigma), \\ \text{s.t.} \quad & P_{\Omega}(M) = P_{\Omega}(\hat{D} + N + R + C), \ \hat{D} = U V^{T}, \ \sum_{k=1}^{n_{c}} \pi_{k} = 1, \ \pi_{k} \geq 0, \ k = 1, 2, \dots, n_{c}, \end{aligned} \tag{13}$$

where $M \in \mathbb{R}^{n \times n}$ denotes the sampled matrix, $R \in \mathbb{R}^{n \times n}$ and $C \in \mathbb{R}^{n \times n}$ represent the row structural anomaly and the column structural anomaly, respectively, $U, V \in \mathbb{R}^{n \times (d + 2)}$ are the two low-rank factors of the true underlying EDM $\hat{D}$, $\log L(P_{\Omega}(N) \mid \Pi, \Sigma)$ is the log-likelihood function, $\Pi = \{\pi_{1}, \pi_{2}, \dots, \pi_{n_{c}}\}$, $\Sigma = \{\sigma_{1}^{2}, \sigma_{2}^{2}, \dots, \sigma_{n_{c}}^{2}\}$ and $n_{c}$ are the parameters of the MoG, and $\lambda_{1}$ and $\lambda_{2}$ are tunable parameters. The key to this model is that we use the MoG distribution to fit unknown noise and the $\ell_{2,1}$-norm to detect node anomaly. The main motivations for this model are as follows. (1) In response to the complex noise, we introduce a Mixture of Gaussians distribution, which is a universal approximator to any continuous density function. Therefore, the RRMD model is robust against a wider range of complex noise distributions beyond Gaussian and Laplacian noise. (2) As the optimal measure of row sparsity of a matrix, the $\ell_{2,1}$-norm can be used to sift out the row-wise anomaly and, applied to the transpose, the column-wise anomaly, so that the abnormal nodes can be detected. (3) To deal with missing data, we choose the classic low-rank MD method, which can impute the missing elements at a lower computational cost than the low-rank MC method.

We now describe how the MoG distribution is applied in the RRMD model. Without loss of generality, each element $N_{ij}$ ($i, j = 1, 2, \dots, n$) of the noise matrix $N$ is considered to be drawn from a Mixture of Gaussians distribution, which is defined as

$$p(N_{ij}) \sim \sum_{k=1}^{n_{c}} \pi_{k} \mathcal{N}(N_{ij} \mid 0, \sigma_{k}^{2}) \tag{14}$$

where $\mathcal{N}(N_{ij} \mid 0, \sigma_{k}^{2})$ denotes the Gaussian distribution with mean 0 and variance $\sigma_{k}^{2}$, $n_{c}$ is the number of Gaussian components, and $\pi_{k} \geq 0$ represents the mixing proportion with $\sum_{k=1}^{n_{c}} \pi_{k} = 1$. Then, the likelihood function can be written as

$$L(P_{\Omega}(N) \mid \Pi, \Sigma) = \prod_{(i, j) \in \Omega} \sum_{k=1}^{n_{c}} \pi_{k} \mathcal{N}(N_{ij} \mid 0, \sigma_{k}^{2}) \tag{15}$$

where $\Pi = \{\pi_{1}, \pi_{2}, \dots, \pi_{n_{c}}\}$ and $\Sigma = \{\sigma_{1}^{2}, \sigma_{2}^{2}, \dots, \sigma_{n_{c}}^{2}\}$. For convenience of calculation, we use the log-likelihood function instead of the likelihood function, and our aim is to maximize it to obtain the MoG parameters. We can then construct the final objective function in Equation (13) by combining the negative log-likelihood function with the $\ell_{2,1}$-norm terms, and minimize it to obtain all unknown variables in the RRMD model.
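The logarithm of the likelihood in Equation (15) can be evaluated directly, as in the sketch below. The function name `mog_log_likelihood` and the boolean `mask` that encodes $\Omega$ are illustrative assumptions.

```python
import numpy as np

def mog_log_likelihood(N, mask, pi, sigma2):
    """log L(P_Omega(N) | Pi, Sigma) for a zero-mean Mixture of Gaussians."""
    pi = np.asarray(pi, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    e = N[mask][:, None]                                    # observed entries, (|Omega|, 1)
    dens = pi * np.exp(-0.5 * e ** 2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
    return np.sum(np.log(np.sum(dens, axis=1)))             # log of product = sum of logs

# Sanity case: one standard Gaussian component and an all-zero noise matrix.
N = np.zeros((2, 2))
mask = np.ones((2, 2), dtype=bool)
ll = mog_log_likelihood(N, mask, pi=[1.0], sigma2=[1.0])    # 4 * log(1 / sqrt(2*pi))
```

The sum inside the logarithm is what prevents a closed-form maximizer over $\Pi$ and $\Sigma$ and motivates the EM procedure in the next subsection.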

4.1.2. Optimizing RRMD via Expectation Maximization Method

In this section, we employ the popular EM method to solve the proposed RRMD model. We first explain the motivation for using this method. To estimate the parameters $\Pi, \Sigma$, the log-likelihood function $\log L(P_{\Omega}(N) \mid \Pi, \Sigma)$ should be maximized. However, these parameters cannot be calculated directly because of the complex expression of the log-likelihood function. Fortunately, the EM method can be employed to solve such problems, and it has been proven effective in many studies. Specifically, let $Z_{ijk} \in \{0, 1\}$ be a set of hidden variables with $\sum_{k=1}^{n_{c}} Z_{ijk} = 1$ ($i, j = 1, 2, \dots, n$), where $Z_{ijk} = 1$ implies that the noise $N_{ij}$ comes from the $k$-th Gaussian component. Then, the log-likelihood function can be rewritten in a form that is easy to optimize:

$$\log L(P_{\Omega}(N) \mid \Pi, \Sigma) = \sum_{(i, j) \in \Omega} \sum_{k=1}^{n_{c}} Z_{ijk} \left( \log \pi_{k} - \log(\sqrt{2 \pi} \sigma_{k}) - \frac{(M_{ij} - u_{i}^{T} v_{j} - R_{ij} - C_{ij})^{2}}{2 \sigma_{k}^{2}} \right) \tag{16}$$

Next, we introduce the specific optimization process for estimating the parameters ($\hat{D}, R, C, N, U, V, \Pi, \Sigma$) in Equation (13) by alternately conducting the E and M steps.

• E step:

In the E step, we calculate the conditional expectation $\Upsilon_{ijk}$ of $Z_{ijk}$:

$$\Upsilon_{ijk} = E(Z_{ijk}) = \frac{\pi_{k} \mathcal{N}(M_{ij} - u_{i}^{T} v_{j} - R_{ij} - C_{ij} \mid 0, \sigma_{k}^{2})}{\sum_{l=1}^{n_{c}} \pi_{l} \mathcal{N}(M_{ij} - u_{i}^{T} v_{j} - R_{ij} - C_{ij} \mid 0, \sigma_{l}^{2})} \tag{17}$$
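The E step of Equation (17) can be sketched as follows, assuming the residual matrix $E = M - U V^{T} - R - C$ has already been formed. The function name `e_step` and the log-space normalization (added here only for numerical stability) are assumptions, not part of the original algorithm description.

```python
import numpy as np

def e_step(E, mask, pi, sigma2):
    """Responsibilities of shape (n, n, n_c); entries outside Omega are set to zero."""
    pi = np.asarray(pi, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    # log(pi_k * N(E_ij | 0, sigma_k^2)) for every entry and component
    log_dens = (np.log(pi) - 0.5 * np.log(2.0 * np.pi * sigma2)
                - 0.5 * E[..., None] ** 2 / sigma2)
    log_dens -= log_dens.max(axis=2, keepdims=True)   # avoid underflow before exp
    gamma = np.exp(log_dens)
    gamma /= gamma.sum(axis=2, keepdims=True)         # normalize over components
    return gamma * mask[..., None]

E = np.array([[0.1, 5.0], [0.0, -4.0]])
mask = np.array([[True, True], [True, False]])
gamma = e_step(E, mask, pi=[0.5, 0.5], sigma2=[0.04, 25.0])
```

In this toy case the small residual 0.1 is attributed mainly to the narrow component, while the large residual 5.0 is attributed almost entirely to the wide one.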

• M step:

In the M step, we need to solve the following optimization problem:

$$\begin{aligned} \min_{U, V, R, C, \Pi, \Sigma} \quad & \lambda_{1} \| R \|_{2,1} + \lambda_{2} \| C^{T} \|_{2,1} - \sum_{(i, j) \in \Omega} \sum_{k=1}^{n_{c}} \Upsilon_{ijk} \left( \log \pi_{k} - \log(\sqrt{2 \pi} \sigma_{k}) - \frac{(M_{ij} - u_{i}^{T} v_{j} - R_{ij} - C_{ij})^{2}}{2 \sigma_{k}^{2}} \right) \\ \text{s.t.} \quad & \sum_{k=1}^{n_{c}} \pi_{k} = 1, \ \pi_{k} \geq 0 \end{aligned} \tag{18}$$

A natural idea is to alternately update the MoG parameters $\Pi, \Sigma$ and the matrix variables $U, V, R, C$, i.e.,

Update $\Pi, \Sigma$:

$$\pi_{k} = B_{k} / \sum_{l=1}^{n_{c}} B_{l}, \quad \text{where } B_{k} = \sum_{(i, j) \in \Omega} \Upsilon_{ijk} \tag{19}$$

$$\sigma_{k}^{2} = \frac{1}{B_{k}} \sum_{(i, j) \in \Omega} \Upsilon_{ijk} (M_{ij} - u_{i}^{T} v_{j} - R_{ij} - C_{ij})^{2} \tag{20}$$
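A short sketch of the closed-form updates in Equations (19) and (20) is given below. It assumes the responsibilities `gamma` are already zero outside $\Omega$ and that `E` holds the residual $M - U V^{T} - R - C$; the function name `update_mog` and the small floor on $B_{k}$ are illustrative assumptions.

```python
import numpy as np

def update_mog(E, gamma):
    """Return (pi, sigma2) from residuals E (n x n) and responsibilities gamma (n x n x n_c)."""
    B = gamma.sum(axis=(0, 1))                        # B_k in Equation (19)
    pi = B / B.sum()
    B_safe = np.maximum(B, 1e-12)                     # guard against empty components
    sigma2 = (gamma * E[..., None] ** 2).sum(axis=(0, 1)) / B_safe
    return pi, sigma2

# Hard assignments: first row belongs to component 0, second row to component 1.
E = np.array([[1.0, 2.0], [3.0, 4.0]])
gamma = np.zeros((2, 2, 2))
gamma[0, :, 0] = 1.0
gamma[1, :, 1] = 1.0
pi, sigma2 = update_mog(E, gamma)
```

With hard assignments the update reduces to the per-component mean of squared residuals, which makes the formula easy to verify by hand.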

Update $U, V, R, C$:

In order to update $U, V, R, C$, we need to solve the following sub-problem:

$$\min_{U, V, R, C} \lambda_{1} \| R \|_{2,1} + \lambda_{2} \| C^{T} \|_{2,1} + \sum_{(i, j) \in \Omega} \sum_{k=1}^{n_{c}} \Upsilon_{ijk} \frac{(M_{ij} - u_{i}^{T} v_{j} - R_{ij} - C_{ij})^{2}}{2 \sigma_{k}^{2}} \tag{21}$$

If we let

$$W_{ij} = \begin{cases} \sqrt{\sum_{k=1}^{n_{c}} \Upsilon_{ijk} / \sigma_{k}^{2}}, & (i, j) \in \Omega, \\ 0, & \text{otherwise}, \end{cases} \tag{22}$$

then Equation (21) can be reformulated as follows:

$$\min_{U, V, R, C} \lambda_{1} \| R \|_{2,1} + \lambda_{2} \| C^{T} \|_{2,1} + \frac{1}{2} \| W \odot (M - U V^{T} - R - C) \|_{F}^{2} \tag{23}$$
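The equivalence between the data terms of Equations (21) and (23) can be checked numerically. In the sketch below, the function name `mog_weights` and the toy values are illustrative assumptions.

```python
import numpy as np

def mog_weights(gamma, sigma2, mask):
    """Weight matrix W of Equation (22); zero outside the observed set."""
    W = np.sqrt((gamma / np.asarray(sigma2, dtype=float)).sum(axis=2))
    return np.where(mask, W, 0.0)

E = np.array([[1.0, 2.0], [3.0, 4.0]])      # residual M - U V^T - R - C
gamma = np.zeros((2, 2, 2))
gamma[0, :, 0] = 1.0                         # first row  -> component 0
gamma[1, :, 1] = 1.0                         # second row -> component 1
sigma2 = np.array([4.0, 16.0])
mask = np.ones((2, 2), dtype=bool)

W = mog_weights(gamma, sigma2, mask)
term_21 = np.sum(gamma * E[..., None] ** 2 / (2.0 * sigma2))   # data term of Eq. (21)
term_23 = 0.5 * np.sum((W * E) ** 2)                            # data term of Eq. (23)
```

Entries attributed to high-variance components receive small weights, so the weighted least-squares problem automatically down-weights noisy measurements.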

Furthermore, we can optimize Equation (23) by solving the following three sub-problems.

1) Update $U, V$ by solving the following sub-problem:

$$\min_{U, V} \frac{1}{2} \| W \odot (M - U V^{T} - R - C) \|_{F}^{2} \tag{24}$$

As a typical weighted matrix decomposition problem, it can be solved by various methods, such as ALS [30], WLRA [31] and DN [32].
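One of the options named above, alternating least squares, can be sketched for Equation (24) as follows. This is a generic weighted ALS sweep written for illustration, not necessarily the exact procedure of reference [30]; the function names and the random test data are assumptions.

```python
import numpy as np

def weighted_loss(M, U, V, W, R, C):
    """Objective of Equation (24)."""
    return 0.5 * np.sum((W * (M - U @ V.T - R - C)) ** 2)

def als_sweep(M, U, V, W, R, C):
    """One pass of row-wise weighted least squares over U, then over V."""
    E = M - R - C                       # target of the low-rank part
    U, V = U.copy(), V.copy()
    for i in range(E.shape[0]):         # fix V, solve for each row of U
        w = W[i, :]
        U[i] = np.linalg.lstsq(w[:, None] * V, w * E[i, :], rcond=None)[0]
    for j in range(E.shape[1]):         # fix U, solve for each row of V
        w = W[:, j]
        V[j] = np.linalg.lstsq(w[:, None] * U, w * E[:, j], rcond=None)[0]
    return U, V

rng = np.random.default_rng(0)
n, r = 8, 2
M = rng.normal(size=(n, r)) @ rng.normal(size=(n, r)).T
W = (rng.random((n, n)) < 0.7).astype(float)   # 0/1 weights acting as an observation mask
R = np.zeros((n, n)); C = np.zeros((n, n))
U = rng.normal(size=(n, r)); V = rng.normal(size=(n, r))

loss_before = weighted_loss(M, U, V, W, R, C)
U, V = als_sweep(M, U, V, W, R, C)
loss_after = weighted_loss(M, U, V, W, R, C)
```

Each row update is an exact minimizer of its own weighted least-squares problem, so the objective cannot increase from one sweep to the next.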

2) Update $R$ by conducting the following iteration:

$$R_{t} = J_{\lambda_{1} \tau_{R}} (R_{t-1} - \tau_{R} W \odot W \odot (R_{t-1} + C + U V^{T} - M)) \tag{25}$$

where $t$ represents the $t$-th iteration and $\tau_{R}$ is the step size of the proximal gradient.

3) Update $C$ by conducting the following iteration:

$$(C^{T})_{t} = J_{\lambda_{2} \tau_{C}} ((C_{t-1} - \tau_{C} W \odot W \odot (R + C_{t-1} + U V^{T} - M))^{T}) \tag{26}$$

where $t$ represents the $t$-th iteration and $\tau_{C}$ is the step size of the proximal gradient.
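The proximal gradient updates in Equations (25) and (26) can be sketched as below, where `L` stands for the current low-rank estimate $U V^{T}$. The function names, the toy matrix with a single corrupted row, and the chosen step size are illustrative assumptions.

```python
import numpy as np

def row_shrink(M, thr):
    """Structural thresholding: shrink each row norm by thr, or zero the row."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return np.maximum(norms - thr, 0.0) / np.maximum(norms, 1e-12) * M

def update_R(R, C, L, M, W, lam1, tau):
    """One iteration of Equation (25)."""
    grad = W * W * (R + C + L - M)
    return row_shrink(R - tau * grad, lam1 * tau)

def update_C(R, C, L, M, W, lam2, tau):
    """One iteration of Equation (26): same step applied to the transpose."""
    grad = W * W * (R + C + L - M)
    return row_shrink((C - tau * grad).T, lam2 * tau).T

n = 6
M = np.zeros((n, n)); M[2, :] = 5.0          # a single corrupted row
W = np.ones((n, n)); L = np.zeros((n, n))
R = np.zeros((n, n)); C = np.zeros((n, n))
for _ in range(50):
    R = update_R(R, C, L, M, W, lam1=1.0, tau=0.9)

anomalous_rows = np.flatnonzero(np.linalg.norm(R, axis=1) > 0)
```

With unit weights the gradient has Lipschitz constant 1, so a step size of 0.9 respects the condition of Theorem 1. The only nonzero row of `R` is the corrupted one, which illustrates how the row support of $R$ points to a faulty node.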

Based on the aforementioned analysis, we summarize the whole procedure in Algorithm 1.

Algorithm 1. Proposed Robust $\ell_{2,1}$-norm Regularized Matrix Decomposition (RRMD) Algorithm

Input: The sampled EDM matrix $M$, the index set $\Omega$, the parameters $\lambda_{1}$, $\lambda_{2}$ and threshold $\theta$, the initial number of Gaussian components $n_{c}$.

Output: The true underlying EDM matrix $\hat{D}$, the row structural anomaly $R$, and the column structural anomaly $C$.

1. Randomly initialize $U_{0}, V_{0}, \Pi_{0}, \Sigma_{0}$; initialize $R_{0}, C_{0}$ to zero matrices;

2. $t = 1$;

3. While not converged do

4. (E-step): update $\Upsilon$ according to Equation (17);

5. (M-step): update $\Pi, \Sigma$ according to Equation (19) and Equation (20);

6. (M-step): update $U, V$ by optimizing Equation (24);

7. (M-step): update $R$ according to Equation (25);

8. (M-step): update $C$ according to Equation (26);

9. (Tuning $n_{c}$): Let $g_{i}$ and $g_{j}$ represent the number of entries assigned to the $i$-th and $j$-th Gaussian components, respectively. If $| \sigma_{i}^{2} - \sigma_{j}^{2} | / (\sigma_{i}^{2} + \sigma_{j}^{2}) < \theta$, then let $\pi_{i} = \pi_{i} + \pi_{j}$, $\sigma_{i}^{2} = (g_{i} \sigma_{i}^{2} + g_{j} \sigma_{j}^{2}) / (g_{i} + g_{j})$, $n_{c} = n_{c} - 1$. Lastly, remove $\pi_{j}$ and $\sigma_{j}^{2}$ from $\Pi, \Sigma$, respectively.

10. $t = t + 1$;

11. End while

12. Return $R$, $C$ and $\hat{D} = U V^{T}$.
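Step 9 of Algorithm 1 (merging Gaussian components with similar variances) can be sketched in plain Python. The function name `merge_components` is hypothetical, and `counts` stands for the component sizes $g_{i}$, here assumed to be the number of entries assigned to each component.

```python
def merge_components(pi, sigma2, counts, theta):
    """Merge component j into component i whenever their variances are relatively close."""
    pi, sigma2, counts = list(pi), list(sigma2), list(counts)
    i = 0
    while i < len(pi):
        j = i + 1
        while j < len(pi):
            if abs(sigma2[i] - sigma2[j]) / (sigma2[i] + sigma2[j]) < theta:
                # Count-weighted average of the two variances
                sigma2[i] = ((counts[i] * sigma2[i] + counts[j] * sigma2[j])
                             / (counts[i] + counts[j]))
                pi[i] += pi[j]
                counts[i] += counts[j]
                del pi[j], sigma2[j], counts[j]   # n_c decreases by one
            else:
                j += 1
        i += 1
    return pi, sigma2, counts

pi, sigma2, counts = merge_components(
    pi=[0.5, 0.3, 0.2], sigma2=[1.0, 1.02, 9.0], counts=[50, 30, 20], theta=0.05)
```

In this example the first two components have nearly identical variances and are merged, while the wide third component is kept, so the mixture ends with two components.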

4.2. Anomaly-Aware Node Localization

Based on the proposed RRMD model, all pair-wise range measurements between nodes can be obtained. However, this is only the first step of the range-based localization method. The second step is to calculate the locations of the unknown nodes from the complete distance information and the actual coordinates of the anchor nodes, which can be implemented by using the classical MDS method [6,8,21].

MDS is a typical low-dimensional embedding method, which is originally proposed to solve the curse of dimensionality in the field of machine learning. The MDS method ensures that the distance between samples in the original high-dimensional space can be preserved in low-dimensional space. The input of this method can be high-dimensional features of samples (the pair-wise distance information can be easily obtained from high-dimensional features) or the distance information between samples, while the output is the low-dimensional features of samples in the specified

d

-dimensional space. Therefore, in sensor network localization, we can employ the MDS method to calculate the relative coordinates of each node in the

d

-dimensional space. Then, based on the actual/absolute coordinates of at least

d + 1

anchor nodes and the relative coordinates obtained, the coordinate transformation matrix between the actual coordinates and the relative coordinates can be calculated (with the details provided in Theorem 2). Finally, the relative coordinates can be mapped to the actual ones by using the coordinate transformation matrix.

Next, we introduce the specific steps to implement the above MDS method. Firstly, we let the relative coordinates and actual coordinates of

n

nodes be represented as

T = [t_{1}, t_{2}, \dots, t_{n}] \in ℝ^{d \times n}

,

A = [a_{1}, a_{2}, \dots, a_{n}] \in ℝ^{d \times n}

, respectively, and assume that nodes

1, 2, \dots, n_{a} (n_{a} \geq d + 1)

are anchor nodes. Secondly, the classic MDS method is employed to calculate the relative coordinates of all nodes. Thirdly, using the actual coordinates of the anchor nodes, we obtain the coordinate transformation matrix (where the fraction denotes right matrix division, i.e., the solution of Equation (30)):

Q = \frac{[a_{2} - a_{1}, a_{3} - a_{1}, \dots, a_{n_{a}} - a_{1}]}{[t_{2} - t_{1}, t_{3} - t_{1}, \dots, t_{n_{a}} - t_{1}]}

(27)

Finally, the coordinates of all unknown nodes can be obtained by

a_{i} = Q \cdot (t_{i} - t_{1}) + a_{1}, i = n_{a} + 1, n_{a} + 2, \dots, n

(28)

Based on the aforementioned analysis, the proposed Anomaly-aware Node Localization (ANLoC) algorithm can be summarized as in Algorithm 2.

Algorithm 2. Anomaly-aware Node Localization (ANLoC) Algorithm

Input: The sampled EDM matrix

M

, the index set

Ω

, the parameters

λ_{1}

,

λ_{2}

and threshold

θ

, the initial number of Gaussian components

n_{c}

. The coordinates of anchor nodes

{a_{1}, a_{2}, \dots, a_{n_{a}}} (n_{a} \geq d + 1),

where

n_{a}

denotes the number of anchor nodes.

Output: Coordinates of all unknown nodes

{a_{i} | i = n_{a} + 1, n_{a} + 2, \dots, n}

.

1. Estimate the true underlying EDM

\hat{D}

by using Algorithm 1;

2. Double-center the matrix

\hat{D}

:

S = - 0.5 \times J \hat{D} J

, where

J = I - 1 \cdot 1^{T} / n

and

I

is the identity matrix;

3. Perform SVD on matrix

S

:

[H, Λ, Κ] = \mathrm{svd}(S)

;

4. Calculate relative coordinates:

T = [t_{1}, t_{2}, \dots, t_{n}] = \sqrt{Λ_{d}} \cdot H_{d}^{T}

where

t_{i} \in ℝ^{d \times 1}

,

Λ_{d} = Λ (1 : d, 1 : d)

,

H_{d} = H (:, 1 : d)

;

5. Calculate the coordinate transformation matrix

Q

:

Q = [a_{2} - a_{1}, a_{3} - a_{1}, \dots, a_{n_{a}} - a_{1}] / [t_{2} - t_{1}, t_{3} - t_{1}, \dots, t_{n_{a}} - t_{1}]

;

6. Calculate and output coordinates of all unknown nodes:

a_{i} = Q \cdot (t_{i} - t_{1}) + a_{1}, i = n_{a} + 1, n_{a} + 2, \dots, n.
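Steps 2 to 6 of Algorithm 2 can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the input is assumed to hold squared pair-wise distances, the anchors are assumed to be the first n_a nodes (0-based indices 0 to n_a - 1), and the matrix "division" in step 5 is realized with a pseudo-inverse. The function name `mds_localize` is hypothetical.

```python
import numpy as np

def mds_localize(D_sq, anchors, d=2):
    """Classical MDS followed by an anchor-based coordinate transformation.

    D_sq:    (n, n) completed EDM, assumed to contain squared distances.
    anchors: (d, n_a) absolute coordinates of nodes 0..n_a-1, with n_a >= d + 1.
    Returns a (d, n) array of estimated absolute coordinates.
    """
    n = D_sq.shape[0]
    # Step 2: double centering, S = -0.5 * J * D * J.
    J = np.eye(n) - np.ones((n, n)) / n
    S = -0.5 * J @ D_sq @ J
    # Step 3: SVD of S (singular values are returned in descending order).
    H, lam, _ = np.linalg.svd(S)
    # Step 4: relative coordinates T = sqrt(Lambda_d) * H_d^T, shape (d, n).
    T = np.sqrt(lam[:d])[:, None] * H[:, :d].T
    # Step 5: solve Q * T_hat = A_hat (pseudo-inverse used as an assumption).
    n_a = anchors.shape[1]
    T_hat = T[:, 1:n_a] - T[:, [0]]
    A_hat = anchors[:, 1:n_a] - anchors[:, [0]]
    Q = A_hat @ np.linalg.pinv(T_hat)
    # Step 6: map every relative coordinate to an absolute one.
    return Q @ (T - T[:, [0]]) + anchors[:, [0]]
```

With exact distances and anchors in general position, the relative coordinates differ from the true ones only by a rotation or reflection and a translation, so the mapping recovers the true coordinates up to numerical precision.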

Theorem 2:

For sensor node localization in

d

-dimensional space, given the absolute positions

{a_{1}, a_{2}, \dots, a_{n_{a}}}

of the

n_{a}

anchor nodes and the relative coordinates

{t_{1}, t_{2}, \dots, t_{n}}

of all the

n (n ≫ n_{a})

sensor nodes, if

n_{a} \geq d + 1

, then the relative coordinates

{t_{n_{a} + 1}, t_{n_{a} + 2}, \dots, t_{n}}

can be transformed to the corresponding absolute positions

{a_{n_{a} + 1}, a_{n_{a} + 2}, \dots, a_{n}}

.

Proof:

According to reference [13], the absolute positions

a_{i}

(

i \geq n_{a} + 1

) can be computed according to

a_{i} = Q \cdot (t_{i} - t_{1}) + a_{1}, i = n_{a} + 1, n_{a} + 2, \dots, n

(29)

where

Q \in ℝ^{d \times d}

is the unknown coordinate-transform matrix, which should be determined by the following matrix equation:

Q \cdot [t_{2} - t_{1}, t_{3} - t_{1}, \dots, t_{n_{a}} - t_{1}] = [a_{2} - a_{1}, a_{3} - a_{1}, \dots, a_{n_{a}} - a_{1}]

(30)

Without loss of generality, let

\hat{T} = [t_{2} - t_{1}, t_{3} - t_{1}, \dots, t_{n_{a}} - t_{1}]

and

\hat{A} = [a_{2} - a_{1}, a_{3} - a_{1}, \dots, a_{n_{a}} - a_{1}]

, then we can see that

\hat{T} \in ℝ^{d \times (n_{a} - 1)}

and

\hat{A} \in ℝ^{d \times (n_{a} - 1)}

. Therefore, the above matrix equation can be regarded as the following equivalent system of scalar equations:

\sum_{l = 1}^{d} Q_{i l} \cdot {\hat{T}}_{l j} = {\hat{A}}_{i j}, \quad i = 1, 2, \dots, d, \quad j = 1, 2, \dots, n_{a} - 1

(31)

Obviously, if

n_{a} \leq d

, then the number of unknown entries (i.e.,

d^{2}

) is greater than the number of equations (i.e.,

d \times (n_{a} - 1)

), so we cannot uniquely determine the coordinate-transform matrix

Q

. Conversely, if

n_{a} \geq d + 1

, we only need to select

d + 1

anchor nodes that do not all lie in a common hyperplane (e.g., three non-collinear anchors in the plane), and then we can obtain the unique solution of

Q

. □
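The counting argument in the proof can be checked numerically. The following Python sketch uses small hand-picked matrices (all values are illustrative assumptions) to show that, for d = 2, two anchors leave Equation (30) with more than one solution, whereas three non-collinear anchors determine Q uniquely.

```python
import numpy as np

def solve_transform(T_hat, A_hat):
    """One solution of Q @ T_hat = A_hat, obtained with the pseudo-inverse."""
    return A_hat @ np.linalg.pinv(T_hat)

def is_uniquely_determined(T_hat):
    """Q is unique exactly when T_hat has full row rank d."""
    return bool(np.linalg.matrix_rank(T_hat) == T_hat.shape[0])

# Case n_a = d = 2: a single difference vector, 4 unknowns but 2 equations.
T_two = np.array([[1.0], [2.0]])
A_two = np.array([[3.0], [1.0]])
Q1 = solve_transform(T_two, A_two)
# Any matrix N with N @ T_two = 0 can be added without violating Equation (30).
N = np.array([[2.0, -1.0], [-4.0, 2.0]])
Q2 = Q1 + N

# Case n_a = d + 1 = 3: two independent difference vectors, 4 equations.
T_three = np.array([[1.0, 0.0], [2.0, 1.0]])
Q_true = np.array([[0.0, -1.0], [1.0, 0.0]])  # a 90-degree rotation
A_three = Q_true @ T_three
Q3 = solve_transform(T_three, A_three)
```

Here both Q1 and Q2 satisfy Equation (30) for the two-anchor case, while Q3 coincides with the rotation used to generate the three-anchor data.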

5. Performance Evaluation

In this section, we first describe the experimental setting and evaluation metrics, and then report experimental results under different scenarios.

5.1. Experimental Setting

To investigate the performance of the proposed ANLoC, we conducted a series of simulation experiments. All experiments were implemented in MATLAB 2017a and run on a PC with an Intel i5-8400 CPU and 16 GB of RAM. Firstly, we randomly deployed 100 sensor nodes in a square area of

100 \times 100

units (where the unit, such as meter, decimeter, foot, or inch, can be chosen according to the actual communication conditions); 6 of them were anchor nodes and the others were unknown nodes. Let

X \in ℝ^{2 \times 100}

be the actual coordinate matrix of all nodes and

D \in ℝ^{100 \times 100}

denote the ground truth EDM matrix between nodes. Secondly, we artificially added some error matrices into matrix

D

to simulate the complex noises and the structural anomaly, and thus obtained a corrupted matrix

D_{error}

. Specifically, the complex noise was set to a mixture of the following components: (1) Gaussian noise with mean 0 and variance 100; (2) Gaussian noise with mean 0 and variance 50; and (3) sparse noise with a pollution ratio of 1%, randomly generated within the range of

[0, 10000]

. Moreover, the structural anomalies were set to be randomly generated within the range of

[0, 500]

and with a row-wise pollution ratio of 3% and a column-wise pollution ratio of 3%. Thirdly, the sampled matrix

M

could be obtained by randomly sampling some elements from

D_{error}

. Finally, the true underlying EDM matrix could be reconstructed from

M

via Algorithm 1, and then the actual coordinates of all nodes could be calculated using Algorithm 2. All simulation parameters are summarized in Table 1.
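The data-generation procedure described above can be sketched in Python with NumPy. Several details are not specified in the text and are assumptions here: the EDM is taken to hold squared distances, the two Gaussian components share the non-sparse entries equally, all corruptions are additive, an anomaly affects every entry of the selected row or column, and entries are sampled independently. The function name and its default arguments are hypothetical.

```python
import numpy as np

def simulate_corrupted_edm(n=100, side=100.0, sample_rate=0.5, seed=0):
    """Generate node positions, the clean EDM, and a sampled corrupted EDM."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, side, size=(2, n))           # actual coordinates
    G = X.T @ X
    sq = np.diag(G)
    D = sq[:, None] + sq[None, :] - 2.0 * G           # squared distances (assumption)

    # Complex noise: each entry draws from one of three components.
    # The 49.5% / 49.5% / 1% split is an assumption; only the 1% is given.
    comp = rng.choice(3, size=(n, n), p=[0.495, 0.495, 0.01])
    gauss1 = rng.normal(0.0, np.sqrt(100.0), (n, n))  # variance 100
    gauss2 = rng.normal(0.0, np.sqrt(50.0), (n, n))   # variance 50
    sparse = rng.uniform(0.0, 10000.0, (n, n))        # sparse outliers
    noise = np.where(comp == 0, gauss1, np.where(comp == 1, gauss2, sparse))

    # Structural anomalies: 3% of the rows and 3% of the columns.
    rows = rng.choice(n, size=round(0.03 * n), replace=False)
    cols = rng.choice(n, size=round(0.03 * n), replace=False)
    R = np.zeros((n, n))
    C = np.zeros((n, n))
    R[rows, :] = rng.uniform(0.0, 500.0, (len(rows), n))
    C[:, cols] = rng.uniform(0.0, 500.0, (n, len(cols)))

    D_error = D + noise + R + C
    mask = rng.random((n, n)) < sample_rate           # observed index set
    M = np.where(mask, D_error, 0.0)                  # unobserved entries set to 0
    return X, D, M, mask, rows, cols
```

The returned `rows` and `cols` give the ground-truth anomalous indices, which are needed later to score anomaly recognition.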

The proposed ANLoC method involves three hyper-parameters

λ_{1}, λ_{2}

and

n_{c}

. These parameters must be tuned to proper values. To determine their optimal values, we employed a 10-fold cross-validation procedure combined with grid search. In particular, firstly, with

n_{c} = 5

, we searched for the optimal values of

λ_{1}

and

λ_{2}

within the range of

{0.001, 0.01, 0.1, 1, 10, 100, 1000}

by minimizing the EDM recovery error. Then, based on the selected optimal parameters

λ_{1}

and

λ_{2}

,

n_{c}

was determined within the range of

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

. Finally, we compared our method with three competing methods: the NLIRM method [6], the SVT-based method [20], and the OptSpace-based method [33]. All parameters of these competing methods were optimized by using the same nested 10-fold cross-validation procedure as for our ANLoC method.
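The two-stage parameter search described above can be outlined in Python as follows. The callable `cv_error` is a hypothetical placeholder for the cross-validated EDM recovery error of a given parameter setting; its name and signature are assumptions, and only the grids and the search order come from the text.

```python
from itertools import product

LAMBDA_GRID = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
NC_GRID = list(range(1, 11))

def two_stage_search(cv_error, nc_init=5):
    """Grid search over (lambda_1, lambda_2) first, then over n_c.

    cv_error(lam1, lam2, n_c) is assumed to return the cross-validated
    EDM recovery error for one parameter setting (lower is better).
    """
    # Stage 1: fix n_c and search the 7 x 7 grid of regularization weights.
    lam1, lam2 = min(
        product(LAMBDA_GRID, LAMBDA_GRID),
        key=lambda pair: cv_error(pair[0], pair[1], nc_init),
    )
    # Stage 2: keep the selected weights and search the number of components.
    n_c = min(NC_GRID, key=lambda k: cv_error(lam1, lam2, k))
    return lam1, lam2, n_c
```

This sequential search evaluates 49 + 10 settings instead of the 490 of a full three-dimensional grid, at the cost of assuming that the best weights do not depend strongly on the initial number of components.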

5.2. Evaluation Metrics

We used the following four metrics to evaluate the node localization performance:

EDM recovery error:

$e_{r} = \| \hat{D} - D \|_{F} / \| D \|_{F}$

(32)

where $\hat{D}$ denotes the reconstructed matrix and $D$ is the ground truth matrix.
Localization error:

$e_{l} = \| \hat{X} - X \|_{F} / n$

(33)

where $\hat{X}$ denotes the coordinates calculated by Algorithm 2, $X$ denotes the actual coordinates, and $n$ is the number of nodes.
Anomaly recognition accuracy:

$p_{ave} = (p_{row} + p_{col}) / 2$

(34)

where $p_{row}$ and $p_{col}$ represent the row anomaly recognition accuracy and the column anomaly recognition accuracy, respectively. Specifically, $p_{row}$ is calculated by

$p_{row} = 2 \times \frac{r_{pre} \cdot r_{rec}}{r_{pre} + r_{rec}}, r_{pre} = \frac{r_{tru}}{r_{all}}, r_{rec} = \frac{r_{tru}}{r_{act}}$

(35)

where $r_{all}$ represents the number of abnormal rows identified by the proposed method, $r_{tru}$ indicates the number of correctly recognized rows among them, and $r_{act}$ represents the number of actual abnormal rows. Similarly, $p_{col}$ is calculated by

$p_{col} = 2 \times \frac{c_{pre} \cdot c_{rec}}{c_{pre} + c_{rec}}, c_{pre} = \frac{c_{tru}}{c_{all}}, c_{rec} = \frac{c_{tru}}{c_{act}}$

(36)

where $c_{all}$ represents the number of abnormal columns identified by the proposed method, $c_{tru}$ indicates the number of correctly recognized columns among them, and $c_{act}$ represents the number of actual abnormal columns.
Cumulative distribution of localization errors [21]:

$e_{CD} = P(Δ l_{i} \leq σ)$

(37)

where $Δ l_{i} = \sqrt{({\hat{x}}_{i} - x_{i})^{2} + ({\hat{y}}_{i} - y_{i})^{2}}$ is the localization error of the $i$-th node, $(x_{i}, y_{i})$ and $({\hat{x}}_{i}, {\hat{y}}_{i})$ denote its actual and estimated coordinates, respectively, and $σ$ is a given error threshold.
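The four metrics in Equations (32) to (37) translate directly into code. The following Python sketch assumes coordinate matrices of shape (2, n) and anomaly sets given as collections of row or column indices; the function names are hypothetical, and returning 0 when nothing is correctly recognized is a convention chosen here to avoid division by zero.

```python
import numpy as np

def edm_recovery_error(D_hat, D):
    """Equation (32): relative Frobenius error of the reconstructed EDM."""
    return np.linalg.norm(D_hat - D, "fro") / np.linalg.norm(D, "fro")

def localization_error(X_hat, X):
    """Equation (33): Frobenius error of the coordinates divided by n."""
    return np.linalg.norm(X_hat - X, "fro") / X.shape[1]

def recognition_accuracy(detected, actual):
    """Equations (35)/(36): harmonic mean of precision and recall."""
    detected, actual = set(detected), set(actual)
    n_true = len(detected & actual)
    if n_true == 0:
        return 0.0  # convention assumed here, not stated in the text
    precision = n_true / len(detected)
    recall = n_true / len(actual)
    return 2.0 * precision * recall / (precision + recall)

def average_recognition_accuracy(rows_det, rows_act, cols_det, cols_act):
    """Equation (34): mean of the row and column recognition accuracies."""
    return 0.5 * (recognition_accuracy(rows_det, rows_act)
                  + recognition_accuracy(cols_det, cols_act))

def error_cdf(X_hat, X, sigma):
    """Equation (37): fraction of nodes whose error is at most sigma."""
    per_node = np.linalg.norm(X_hat - X, axis=0)
    return float(np.mean(per_node <= sigma))
```

Evaluating `error_cdf` over a range of thresholds yields a cumulative distribution curve of the kind shown in Figure 7a.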

5.3. Experimental Results

To investigate the performance of the proposed method in complex environments, we conducted the following four experiments under different application scenarios.

• Scenario 1: Localization without the noise and anomaly

This experiment assumed that all observed range measurements were accurate. Figure 3 shows the variation of the EDM recovery error and the average localization error of the WSN nodes when the EDM is sampled at different proportions. The horizontal axis of both subfigures represents the sampling ratio, while the vertical axis represents the EDM recovery error in Figure 3a and the localization error of the WSN nodes in Figure 3b. Comparing Figure 3a with Figure 3b, the EDM recovery error and the localization error follow the same trend as the sampling rate changes. Under this scenario, once the sampling rate reaches 0.3, the EDM recovery error and the localization error of all four methods reach a very low level. This shows that both the existing methods and the proposed method can achieve good positioning accuracy under ideal conditions.

• Scenario 2: Localization with only complex noise

This experiment examines the performance of the four methods with only complex noise; therefore, only complex noise is added to the EDM. The results after sampling the EDM at different proportions are shown in Figure 4. Under this scenario, the SVT-based method and the OptSpace-based method have large EDM recovery and localization errors even when the sampling rate is high, which indicates that these two methods do not handle complex noise well. When the sampling ratio reaches 0.3, the EDM recovery error of NLIRM stabilizes at around 0.01, and its localization error falls below 0.03. In contrast, the EDM recovery error and localization error of ANLoC reach 0.001 and 0.01, respectively. Compared with Figure 3, neither the SVT-based method nor the OptSpace-based method copes with complex noise. NLIRM has some ability to resist noise, but it is slightly worse than our ANLoC. Therefore, ANLoC performs best among the four methods under complex noise.

• Scenario 3: Localization with only anomaly

The purpose of this experiment is to examine the detection of anomalous nodes in an environment without any noise; in this case, only anomalies are added to the EDM. As shown in Figure 5, compared with Figure 3, the EDM recovery error and the localization error of all four methods increase when the EDM is corrupted by anomalies. However, compared with the other three methods, ANLoC achieves better performance at a lower sampling rate (0.2). In addition, ANLoC can also detect abnormal nodes. Specifically, Figure 5c shows that when the sampling rate reaches 0.2, the recognition accuracy of abnormal nodes reaches 100%. Although the performance of all methods degrades to varying degrees relative to Figure 3, ANLoC still performs best. Therefore, when some sensor nodes are anomalous, ANLoC both locates unknown nodes precisely and detects the abnormal nodes.

• Scenario 4: Localization with both complex noise and anomaly

This experiment evaluates ANLoC in a complex environment where complex noise and anomalies coexist. Both are added to the EDM to simulate the impact of complex real application scenarios. The experimental results are shown in Figure 6. Compared with the other three methods, ANLoC not only has the lowest EDM recovery error and localization error, but also achieves high recognition accuracy of abnormal nodes once the sampling rate reaches 0.3. Compared with Figure 3, the performance of the SVT-based method and the OptSpace-based method is much worse in the complex environment. NLIRM degrades less, but its EDM recovery accuracy and localization accuracy are still lower than those of ANLoC. In short, the proposed ANLoC is robust to complex application scenarios and detects faulty nodes well. In addition, Figure 7a shows the cumulative distribution of the localization error for the four methods when the sampling rate is 0.5 under this scenario. For ANLoC, the probability that the localization error is less than 0.5 is 98% and the probability that it is less than 1 is 100%, while the corresponding probabilities for the other three methods are all below 30%. Figure 7b visualizes the localization results of all nodes in this case, again at a sampling rate of 0.5.

5.4. Effects of the Proposed Strategies

To analyze the effects of the proposed strategies, we designed experiments that compare the proposed method with three ablated variants. Our RRMD can be viewed as LRMD + MoG +

ℓ_{2, 1}

-norm. The three ablated variants are as follows. (1) Pure LRMD: does not consider any noise or anomaly; (2) LRMD + MoG: considers only complex noise; (3) LRMD +

ℓ_{2, 1}

-norm: considers only the anomaly. These experiments are conducted under the scenario with both complex noise and anomaly, and all results are shown in Figure 8a–c. ANLoC considers both complex noise and anomaly, which leads to its low EDM recovery error and localization error. Moreover, when the sampling rate reaches 0.3, it accurately detects abnormal nodes. Because its assumptions are too idealized, pure LRMD performs worst, which means that it cannot be applied in practical situations. Although LRMD + MoG handles the error of the EDM well, it cannot detect abnormal nodes. Notably, LRMD +

ℓ_{2, 1}

-norm can hardly detect abnormal nodes in this case. We conjecture that the unprocessed noise degrades the anomaly-detection performance of this method. Therefore, another experiment is designed with LRMD +

ℓ_{2, 1}

-norm in which the EDM is corrupted only by anomaly, and the resulting anomaly recognition accuracy is plotted in Figure 8d. This supplementary experiment confirms our conjecture. In conclusion, these experiments demonstrate the effectiveness of the MoG and

ℓ_{2, 1}

-norm strategies.

5.5. Localization for Large-Scale Scenario

In practical applications, the localization area may be much larger than the ranging distance of the sensor nodes, so that only a few inter-node range measurements can be obtained. As a result, the sampled EDM based on these range measurements is very sparse, which degrades the performance of our method. To solve this problem, we propose a large-scale localization method. For convenience of discussion, we simplify the application scenario as shown in Figure 9. Suppose there is a

180 \times 100

unit area divided into three parts I, II, and III from left to right, containing 80, 20, and 80 nodes, respectively. Of the 180 nodes, 6 are anchor nodes and 6 are relay nodes. Regions I and II are combined into a localization sub-region

M

while regions II and III are combined into a localization sub-region

N

. The large-scale localization process is as follows. (1) Calculate the actual coordinates of all nodes within sub-region

M

by ANLoC. (2) Step (1) yields the actual coordinates of the relay nodes. Taking them as anchor nodes, we employ ANLoC again to obtain the coordinates of the remaining unknown nodes in sub-region

N

.
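The two-stage procedure can be sketched as follows. For brevity, this Python illustration substitutes classical MDS on exact squared distances for the full ANLoC pipeline, so it demonstrates only how estimated relay coordinates are reused as anchors. The placement of the anchors inside the first sub-region and of the relay nodes in region II, as well as all function and argument names, are assumptions.

```python
import numpy as np

def squared_distances(X):
    """Pair-wise squared distances for a (2, n) coordinate matrix."""
    G = X.T @ X
    sq = np.diag(G)
    return sq[:, None] + sq[None, :] - 2.0 * G

def localize(D_sq, anchor_idx, anchor_xy, d=2):
    """Stand-in for ANLoC: classical MDS plus anchor-based transformation."""
    n = D_sq.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    H, lam, _ = np.linalg.svd(-0.5 * J @ D_sq @ J)
    T = np.sqrt(lam[:d])[:, None] * H[:, :d].T
    t0 = T[:, [anchor_idx[0]]]
    T_hat = T[:, anchor_idx[1:]] - t0
    A_hat = anchor_xy[:, 1:] - anchor_xy[:, [0]]
    Q = A_hat @ np.linalg.pinv(T_hat)
    return Q @ (T - t0) + anchor_xy[:, [0]]

def two_stage_localize(D_sq, idx_M, idx_N, anchor_idx, anchor_xy, relay_idx):
    """Localize sub-region M from the anchors, then sub-region N from relays.

    idx_M, idx_N: global node indices of the two overlapping sub-regions.
    anchor_idx:   global indices of anchors (assumed to lie in sub-region M).
    relay_idx:    global indices of relay nodes (assumed to lie in both).
    """
    est = np.full((2, D_sq.shape[0]), np.nan)
    # Stage 1: only distances inside sub-region M are used.
    pos_M = {node: k for k, node in enumerate(idx_M)}
    local_anchors = [pos_M[node] for node in anchor_idx]
    est[:, idx_M] = localize(D_sq[np.ix_(idx_M, idx_M)], local_anchors, anchor_xy)
    # Stage 2: the estimated relay coordinates act as anchors for sub-region N.
    pos_N = {node: k for k, node in enumerate(idx_N)}
    local_relays = [pos_N[node] for node in relay_idx]
    est_N = localize(D_sq[np.ix_(idx_N, idx_N)], local_relays, est[:, relay_idx])
    remaining = [node for node in idx_N if node not in pos_M]
    est[:, remaining] = est_N[:, [pos_N[node] for node in remaining]]
    return est
```

Because the second stage relies on estimated rather than measured anchor positions, any error made on the relay nodes in the first stage propagates to the second sub-region.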

Further, to verify the feasibility of the above large-scale localization method, we designed an experiment with complex noise and anomaly. The EDM sampling rates of sub-regions

M

and

N

are both set to 0.5. As Figure 10 shows, the experimental results confirm that our method can be applied to large-scale scenarios.

6. Conclusions

In this paper, we aim to estimate node positions in WSNs in complex environments. Considering the coexistence of potential node anomaly and complex noise in practical applications, we propose a novel Anomaly-aware Node Localization (ANLoC) method to address this task. Specifically, the proposed ANLoC method is divided into two steps. First, a Robust

ℓ_{2, 1}

-norm Regularized Matrix Decomposition (RRMD) model is designed to simultaneously detect node anomaly and impute the missing range measurements. Second, based on the imputed EDM, the positions of all unknown nodes can be estimated by using the classic MDS method. Extensive experiments demonstrate that the proposed ANLoC method consistently outperforms three state-of-the-art localization methods in terms of EDM recovery error, localization error, and anomaly recognition accuracy. Additionally, the proposed ANLoC method can be extended to large-scale localization scenarios. Future work will focus on extending our ANLoC approach to an incremental version, making it suitable for handling dynamic positioning of mobile sensor nodes.

Author Contributions

Conceptualization, P.X. and L.C.; formal analysis, P.X. and L.C.; funding acquisition, L.C.; software, P.X. and T.C.; writing—original draft, P.X.; writing—review and editing, P.X., L.C., and T.C.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 61872190, Grant 61572263, and Grant 61772285, the Postdoctoral Science Foundation of China under Grant 2015M581794, the Natural Science Foundation of Jiangsu Province under Grant BK20161516, and the Postdoctoral Science Foundation of Jiangsu Province under Grant 1501023C.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zheng, K.; Wang, H.; Li, H.; Xiang, W.; Lei, L.; Qiao, J.; Shen, X.S. Energy-efficient localization and tracking of mobile devices in wireless sensor networks. IEEE Trans. Veh. Technol. 2017, 66, 2714–2726.
2. Qian, H.; Fu, P.; Li, B.; Liu, J.; Yuan, X. A Novel Loss Recovery and Tracking Scheme for Maneuvering Target in Hybrid WSNs. Sensors 2018, 18, 341.
3. Zhu, H.; Xiao, F.; Sun, L.; Wang, R.; Yang, P. R-TTWD: Robust device-free through-the-wall detection of moving human with WiFi. IEEE J. Sel. Areas Commun. 2017, 35, 1090–1103.
4. Akyildiz, I.F.; Su, W.; Sankarasubramaniam, Y.; Cayirci, E. A survey on sensor networks. IEEE Commun. Mag. 2002, 40, 102–114.
5. Patwari, N.; Ash, J.N.; Kyperountas, S.; Hero, A.O.; Moses, R.L.; Correal, N.S. Locating the nodes: Cooperative localization in wireless sensor networks. IEEE Signal Processing Mag. 2005, 22, 54–69.
6. Xiao, F.; Sha, C.; Chen, L.; Sun, L.; Wang, R. Noise-tolerant localization from incomplete range measurements for wireless sensor networks. In Proceedings of the 2015 IEEE Conference on Computer Communications (INFOCOM), Hong Kong, China, 26–30 April 2015; pp. 2794–2802.
7. Karbasi, A.; Oh, S. Robust localization from incomplete local information. IEEE/ACM Trans. Netw. 2013, 21, 1131–1144.
8. Xiao, F.; Liu, W.; Li, Z.; Chen, L.; Wang, R. Noise-tolerant wireless sensor networks localization via multinorms regularized matrix completion. IEEE Trans. Veh. Technol. 2018, 67, 2409–2419.
9. Maz’ya, V.; Schmidt, G. On approximate approximations using Gaussian kernels. IMA J. Numer. Anal. 1996, 16, 13–29.
10. Meng, D.; De La Torre, F. Robust matrix factorization with unknown noise. In Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 1–8 December 2013; pp. 1337–1344.
11. Yao, J.; Cao, X.; Zhao, Q.; Meng, D.; Xu, Z. Robust subspace clustering via penalized mixture of Gaussians. Neurocomputing 2018, 278, 4–11.
12. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39, 1–38.
13. Shang, Y.; Ruml, W.; Zhang, Y.; Fromherz, M.P. Localization from mere connectivity. In Proceedings of the 4th ACM international symposium on Mobile ad hoc networking computing, Annapolis, MD, USA, 1–3 June 2003; pp. 201–212.
14. Sheng, X.; Hu, Y.H. Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks. IEEE Signal Processing Mag. 2005, 53, 44–53.
15. Tomic, S.; Beko, M.; Dinis, R. RSS-based localization in wireless sensor networks using convex relaxation: Noncooperative and cooperative schemes. IEEE Trans. Veh. Technol. 2015, 64, 2037–2050.
16. So, A.M.C.; Ye, Y. Theory of semidefinite programming for sensor network localization. Math. Program. 2007, 109, 367–384.
17. Shamsi, D.; Taheri, N.; Zhu, Z.; Ye, Y. Conditions for correct sensor network localization using SDP relaxation. In Discrete Geometry and Optimization; Springer: Heidelberg, Germany, 2013; pp. 279–301.
18. Nguyen, T.L.N.; Shin, Y. Matrix completion optimization for localization in wireless sensor networks for intelligent IoT. Sensors 2016, 16, 722.
19. Sun, X.; Chen, T.; Li, W.; Zheng, M. Perfomance research of improved mds-map algorithm in wireless sensor networks localization. In Proceedings of the 2012 International Conference on Computer Science and Electronics Engineering (ICCSEE), Hangzhou, China, 23–25 March 2012; pp. 587–590.
20. Feng, C.; Valaee, S.; Au, W.S.A.; Tan, Z. Localization of wireless sensors via nuclear norm for rank minimization. In Proceedings of the 2010 IEEE Global Telecommunications Conference GLOBECOM 2010, Miami, FL, USA, 6–10 December 2010; pp. 1–5.
21. Liu, C.; Shan, H.; Wang, B. Wireless Sensor Network Localization via Matrix Completion Based on Bregman Divergence. Sensors 2018, 18, 2974.
22. Lee, D.D.; Seung, H.S. Algorithms for non-negative matrix factorization. Adv. Neural Inf. Processing Syst. 2001, 556–562.
23. Sun, R.; Luo, Z.Q. Guaranteed matrix completion via non-convex factorization. IEEE Trans. Inf. Theory 2016, 62, 6535–6579.
24. Koren, Y.; Bell, R.; Volinsky, C. Matrix factorization techniques for recommender systems. Computer 2009, 8, 30–37.
25. Liu, H.; Wu, Z.; Li, X.; Cai, D.; Huang, T.S. Constrained nonnegative matrix factorization for image representation. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 1299–1311.
26. Parikh, N.; Boyd, S. Proximal algorithms. Found. Trends® Optim. 2014, 1, 127–239.
27. Chen, L.; Zhang, H.; Lu, J.; Thung, K.; Aibaidula, A.; Liu, L.; Chen, S.; Jin, L.; Wu, J.; Wang, Q.; Zhou, L.; Shen, D. Multi-label nonlinear matrix completion with transductive multi-task feature selection for joint MGMT and IDH1 status prediction of patient with high-grade gliomas. IEEE Trans. Med. Imaging 2018, 37, 1775–1787.
28. Armijo, L. Minimization of functions having Lipschitz continuous first partial derivatives. Pac. J. Math. 1966, 16, 1–3.
29. Chen, L.; Yang, G.; Chen, Z.; Xiao, F.; Shi, J. Correlation consistency constrained matrix completion for web service tag refinement. Neural Comput. Appl. 2015, 26, 101–110.
30. De La Torre, F.; Black, M.J. A framework for robust subspace learning. Int. J. Comput. Vis. 2003, 54, 117–142.
31. Srebro, N.; Jaakkola, T. Weighted low-rank approximations. In Proceedings of the 20th International Conference on Machine Learning (ICML-03), Washington, DC, USA, 21–24 August 2003; pp. 720–727.
32. Buchanan, A.M.; Fitzgibbon, A.W. Damped newton algorithms for matrix factorization with missing data. In Proceedings of the 2005 Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–26 June 2005; pp. 316–322.
33. Keshavan, R.H.; Montanari, A.; Oh, S. Matrix completion from a few entries. IEEE Trans. Inf. Theory 2010, 56, 2980–2998.

Figure 1. Pipeline of node localization for Wireless Sensor Networks (WSNs) in complex environments.

Figure 2. Illustration of the sampled Euclidean Distance Matrix (EDM).

Figure 3. Performance comparison under scenario 1 without the noise and anomaly. (a) EDM recovery error and (b) localization error.

Figure 4. Performance comparison under scenario 2 with only complex noise. (a) EDM recovery error and (b) localization error.

Figure 5. Performance comparison under scenario 3 with only anomaly. (a) EDM recovery error; (b) localization error; and (c) anomaly recognition accuracy.

Figure 6. Performance comparison under scenario 4 with both complex noise and anomaly. (a) EDM recovery error; (b) localization error; and (c) anomaly recognition accuracy.

Figure 7. Illustration of localization performance under scenario 4 with both complex noise and anomaly. (a) Cumulative distribution of localization error and (b) positioning result display.

Figure 8. Illustration of the effects of the proposed strategies. (a) EDM recovery error under scenario 4 with both complex noise and anomaly; (b) localization error under scenario 4 with both complex noise and anomaly; (c) anomaly recognition accuracy under scenario 4 with both complex noise and anomaly; and (d) anomaly recognition accuracy under scenario 3 with only anomaly.

Figure 9. Illustration of large-scale scenario used in this study.

Figure 10. Localization results under the large-scale scenario with complex noise and anomaly.

Table 1. Simulation parameters.

Name	Value	Name	Value
Size of experimental scenario	$100 \times 100$	Gaussian noise 1	Mean 0, variance 100
Number of sensor nodes	100	Gaussian noise 2	Mean 0, variance 50
Number of anchors	6	Range of sparse noise	$[0, 10000]$
Range of row anomaly	$[0, 500]$	Range of column anomaly	$[0, 500]$

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

