# A Systematic Literature Review on Outlier Detection in Wireless Sensor Networks


## Abstract


## 1. Introduction

- An event-based network differs from a monitoring sensor network. Typical examples of events are earthquakes, floods, volcanic eruptions, heavy rainfall, weather changes, hazardous chemical releases, air pollution, changes in air quality, and fires. In contrast to erroneous data, outliers generated by events tend to have a much smaller probability of occurrence [21]. Deleting event outliers from the dataset can cause the loss of important information about the relevant events [22]. Several techniques have been proposed for event detection, such as [23,24,25].
- Noise or measurement errors may arise from several sources, such as a sensor fault or sensor misbehavior [20]. Faulty data are usually described as values in the dataset that differ markedly from the rest of the data. Noise and errors can also result from environmental factors, including harsh conditions and difficulties in the deployment area. Where possible, faulty and noisy data should be corrected or deleted [20].
- Malicious attacks relate to the security of the network. Outliers caused by malicious attacks begin when an attacker compromises a sensor node and injects unreliable or corrupt data into the network. Malicious attacks are classified as passive or active. A passive attack changes sensory data with the aim of disrupting the decision-making system of the network [20], whereas an active attack affects network functionality and performance and can slow or even shut down the network [26].

## 2. Application of Outlier Detection in WSNs

- Environmental monitoring: Many sensors such as temperature, humidity, air pollution, and wind speed sensors are deployed in harsh environments to monitor and analyze environmental factors.
- Industrial monitoring: Sensors such as vibration, pressure, or temperature sensors are installed on sensitive equipment to monitor the state of this equipment.
- Healthcare monitoring: Small sensors are used to monitor patients’ vital signs. These sensors are placed at different positions in or on the patient’s body to monitor blood pressure, heart rate, or enzyme and mineral levels.
- Smart cities: Different kinds of sensors such as parking sensors, dustbin sensors, and pedestrian sensors are used to make cities more comfortable for citizens.
- Forest fire detection: Forests are monitored to prevent fires using a variety of sensor nodes. Thousands of them are deployed in the target area to predict and prevent forest fires.

## 3. Review Method

## 4. Planning the Review

#### 4.1. The Need for a Systematic Review

#### 4.2. Identifying Research Questions

- RQ1: What is the designed taxonomy and framework for outlier detection techniques in WSNs?
- RQ2: What are the outlier detection techniques that have been used in WSNs?
- RQ3: What are the challenges in current outlier detection techniques in WSNs?

#### 4.3. Developing a Review Protocol

## 5. Conducting the Review

#### 5.1. Search Strategy

- Science Direct (http://www.sciencedirect.com/),
- SpringerLink (http://www.springer.com/in/),
- IEEE Xplore (http://www.ieee.org/index.html),
- Taylor and Francis Online (http://www.tandfonline.com/),
- ACM Digital Library (https://dl.acm.org/),
- MDPI (https://www.mdpi.com/).

#### 5.2. Criteria for Inclusion and Exclusion of Articles

#### 5.3. Manual Search

#### 5.4. Process for Selection of Studies

#### 5.5. Applying Quality Assessment (QA)

- QA1: Is the topic addressed in the paper related to anomaly detection in WSN?
- QA2: Is the research methodology defined in the article?
- QA3: Is there a sufficient explanation of the background in which the study was performed?
- QA4: Is there a clear declaration concerning the research objectives?

#### 5.6. Data Extraction and Synthesis

#### 5.7. Publication Sources Overview

#### 5.8. Classification of Outlier Detection Techniques Used in Previous Studies

## 6. RQ Results

#### 6.1. What is the Complete Taxonomy Framework for Outlier Detection Techniques for WSNs? (RQ1)

#### 6.2. What Are the Outlier Detection Techniques that Have Been Used for WSNs? (RQ2)

#### 6.2.1. Statistical-Based Approaches

- Parametric-Based Approaches: These strategies assume that the underlying data distribution is known and estimate the distribution parameters from the available data. Depending on the assumed distribution, they are classified as Gaussian-based or non-Gaussian-based models. Gaussian models assume that the data are normally distributed.
- Gaussian-Based Models: Two strategies described by [70] identify outlying sensors and event boundaries in sensor networks. Both rely on the spatial correlation between the readings of adjacent sensor nodes to distinguish outlying sensors from the event boundary. To recognize outlying sensors, each node calculates the difference between its own reading and the mean of the readings of its adjacent nodes, and then standardizes this difference with respect to the differences of its adjacent nodes. If the absolute value of the standardized difference is considerably higher than a predetermined threshold, the node is said to be an outlying node. The event boundary recognition strategy builds on the results of outlying sensor recognition: a node is said to be an event node if the absolute value of its deviation differs significantly across different geographical regions. These strategies do not consider the temporal correlation of sensor readings, so their precision is not very high.
- Non-Gaussian-Based Models: A statistical strategy is proposed by [155], in which outliers in the form of impulsive noise are modeled using a symmetric $\alpha$-stable ($\mathrm{S}_{\alpha}\mathrm{S}$) distribution. The strategy exploits the spatio-temporal correlations of sensor data to recognize outliers. Each node in a cluster compares its predicted data with its sensed data to identify and correct temporal outliers. The cluster-head then gathers the corrected data from the nodes to identify spatial outliers that deviate significantly from the normal data. Communication costs are reduced because most data are transferred locally, and the computation cost on ordinary nodes is minimized because a major part of the computation is carried out by the cluster-heads. However, the $\mathrm{S}_{\alpha}\mathrm{S}$ distribution may not suit real sensor data, and the cluster-based model may be vulnerable to dynamic changes in network topology.
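The outlying-sensor test described for the Gaussian-based models can be sketched in a few lines of Python. This is a minimal illustration of the three steps in the text (neighbor difference, standardization, threshold), not the implementation from [70]; the function name `outlying_nodes`, the dictionary layout, and the threshold value `theta=2.0` are assumptions made for the example, and every node is assumed to have at least one neighbor.

```python
from statistics import mean, pstdev

def outlying_nodes(readings, neighbors, theta=2.0):
    """Flag nodes whose standardized neighbor difference exceeds theta.

    readings:  dict node_id -> latest reading
    neighbors: dict node_id -> list of adjacent node_ids (non-empty)
    theta:     hypothetical threshold on the standardized difference
    """
    # Step 1: difference between each node's reading and its neighbors' mean.
    diff = {
        n: readings[n] - mean(readings[m] for m in neighbors[n])
        for n in readings
    }
    flagged = set()
    for n in readings:
        # Step 2: standardize the node's difference against the
        # differences observed in its own neighborhood.
        local = [diff[n]] + [diff[m] for m in neighbors[n]]
        mu, sigma = mean(local), pstdev(local)
        if sigma == 0:
            continue  # all differences identical: nothing stands out
        # Step 3: compare the absolute standardized difference with theta.
        if abs(diff[n] - mu) / sigma > theta:
            flagged.add(n)
    return flagged

# Eight fully connected nodes; node 7 reports a very different value.
readings = {i: 20.0 for i in range(7)}
readings[7] = 50.0
neighbors = {i: [j for j in range(8) if j != i] for i in range(8)}
print(outlying_nodes(readings, neighbors))
```

With a population standard deviation over n values, the standardized difference cannot exceed sqrt(n - 1), so the threshold has to be chosen with the neighborhood size in mind; a node with only four neighbors can never exceed 2.0.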

- Non-Parametric-Based Approaches: Non-parametric strategies do not assume that the data distribution is known. They typically define a distance measure between a new test instance and the statistical model and apply a threshold to this distance to decide whether the observation is an outlier. Histograms and kernel density estimators are widely used strategies of this kind. Histogram models count the frequency of occurrence of the different data instances to estimate the probability of occurrence of a data instance, and then compare the test instance with each category of the histogram to determine the category to which it belongs. Kernel density estimators use kernel functions to estimate the probability density function (pdf) of the normal instances; any new instance that lies in a low-probability region of the pdf is declared an outlier.
- Histogramming: A histogram-based strategy proposed by [11] identifies global outliers in data-collection applications of sensor networks. It reduces communication cost by collecting histogram information rather than raw data for further processing. The histogram information is used to extract the data distribution from the network and filter out the non-outliers. Additional histogram information can then be collected from the network to identify the outliers. Outliers are defined either by a fixed threshold distance or by their rank among all outliers. One shortcoming of this strategy is that collecting additional histogram information from the entire network increases communication cost. Moreover, the strategy considers only one-dimensional data.
- Kernel Functions: A kernel-based strategy for the online detection of outliers in streaming sensor data was proposed by [156]. It makes no assumption about the underlying data distribution and uses a kernel density estimator to approximate the distribution of sensor data. Each node identifies outliers as values that deviate significantly from the estimated model of the data distribution: a value is considered an outlier if too few other values fall within its neighborhood, as judged by a user-specified threshold. The strategy can also be applied at higher-level nodes to identify outliers more globally. Its main drawback is its strong dependence on the user-specified thresholds, which are difficult to choose appropriately. Moreover, a single threshold may not be sufficient to identify outliers in multivariate data.
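The kernel density idea above can be illustrated with a short Python sketch. It is a simplified stand-in, not the method of [156]: it flags a reading when the estimated density at that value falls below a fixed cut-off, whereas the surveyed strategy counts values in a neighborhood. The Gaussian kernel, the names `kde_density` and `is_low_density`, and the parameters `bandwidth` and `min_density` are assumptions chosen for the example.

```python
import math

def kde_density(x, sample, bandwidth):
    """Gaussian kernel density estimate at x from a window of past readings."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm

def is_low_density(x, sample, bandwidth=1.0, min_density=0.01):
    """Flag x when it lies in a low-probability region of the estimated pdf."""
    return kde_density(x, sample, bandwidth) < min_density

# A sliding window of recent temperature readings (made-up values).
window = [20.0, 20.5, 19.5, 21.0, 20.2, 19.8]
print(is_low_density(20.1, window))  # close to the bulk of the window
print(is_low_density(30.0, window))  # far from every past reading
```

The sketch also makes the drawback noted above concrete: the result depends entirely on the chosen `bandwidth` and `min_density`, and nothing in the data says what those values should be.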

- Evaluation of Statistical-Based Techniques: These strategies are mathematically justified and recognize outliers effectively when an accurate model of the probability distribution is available. Moreover, once the model has been constructed, the data on which it was built are no longer needed. In practice, however, prior knowledge of the distribution of sensor streams is usually unavailable, so parametric strategies are ineffective when the sensor data do not follow a predetermined distribution. Non-parametric strategies are more attractive because they make no assumption about the distribution. Histogram models suit univariate data, but for multivariate data they fail to capture the correlation between the attributes. For multivariate data, a kernel function is a better option, particularly in terms of computation cost.
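To make the histogram model for univariate data concrete, the following Python sketch bins the readings and flags those that fall into sparsely populated bins. It is a minimal, centralized illustration under stated assumptions and does not reproduce the distributed protocol of [11]; the function name `histogram_outliers` and the parameters `bin_width` and `min_fraction` are hypothetical.

```python
from collections import Counter

def histogram_outliers(values, bin_width=1.0, min_fraction=0.05):
    """Return the values that fall in bins holding < min_fraction of the data."""
    # Frequency of occurrence per fixed-width bin.
    bins = Counter(int(v // bin_width) for v in values)
    n = len(values)
    return [v for v in values if bins[int(v // bin_width)] / n < min_fraction]

# Forty readings between 20.0 and 21.9, plus one isolated reading.
values = [20 + (i % 20) / 10 for i in range(40)] + [35.0]
print(histogram_outliers(values))
```

Because each attribute would need its own set of bins, the sketch cannot see relationships between attributes, which is the limitation for multivariate data noted above.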

#### 6.2.2. Nearest Neighbor Based Techniques

#### 6.2.3. Clustering-Based Techniques

#### 6.2.4. Classification-Based Techniques

#### 6.2.5. Information Theoretic

#### 6.2.6. Spectral Decomposition-Based Approaches

#### 6.3. What Are the Challenges of Outlier Techniques in WSNs? (RQ3)

- **Resource limitations:** Low-cost, low-quality sensor nodes have limited memory and energy, narrow communication bandwidth, and poor computational capability. Many conventional outlier detection techniques pay little attention to these constraints: they are computationally expensive and require extensive storage and analysis. Thus, they are poorly suited to identifying outliers on common sensor nodes in WSNs [6].
- **High communication cost:** A large share of the energy in WSNs is consumed by radio communication, and the communication cost of a sensor node is higher than its computation cost. Most conventional outlier detection techniques analyze data in a centralized manner, which increases energy consumption and communication overhead, shortens network lifetime, and congests network traffic.
- **Distributed streaming data:** Sensor data arrive from many distributed streams and may change dynamically. Moreover, the underlying distribution of these data is usually not known, and computing probabilities directly is difficult. Most outlier detection techniques do not meet the requirements of processing distributed stream data, and techniques that rest on theoretical assumptions about the data are therefore unsuitable for WSNs.
- **Heterogeneity and mobility of nodes, frequent communication failures, dynamic network topology:** Sensor nodes deployed in harsh settings are prone to failure, and the network topology changes dynamically because of frequent communication failures. Nodes may also move to different positions and differ in capability, since each node may carry different kinds of sensors. Such dynamic and complex features increase the difficulty of designing a viable outlier detection method for WSNs.
- **Large-scale deployment:** WSNs may be deployed at a massive scale, which makes outlier detection a demanding task that cannot be performed by common sensors.
- **Identifying outlier sources:** A sensor network monitors activities and provides raw data, but it is difficult to determine the source of an outlier in a complex WSN. Conventional methods may not even be able to distinguish events from other outliers. Hence, separating outliers from normal events is especially challenging in WSNs.

## 7. Advantages and Disadvantages of Existing Outlier Detection Techniques

#### 7.1. Statistical-Based Techniques

**Advantages:**

- Like many outlier detection systems, statistical systems do not require prior knowledge of security flaws and attacks. Hence, they can detect zero-day or very recent attacks.
- Statistical techniques can provide accurate alerts about malicious activity that unfolds over extended periods. Thus, they are good indicators of forthcoming DoS attacks (e.g., port scans).

**Disadvantages:**

- Skilled attackers can train a statistical outlier detector to accept abnormal behavior as normal.
- It is difficult to set thresholds that balance the likelihood of false positives against that of false negatives.
- Statistical techniques require accurate statistical distributions, but not all behaviors can be modeled statistically. Most of the proposed outlier detection methods assume a quasi-stationary process, an assumption that does not hold for most data [177].
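The threshold trade-off mentioned in the list above can be made visible by sweeping candidate thresholds over a labeled validation set and tabulating both error rates. The Python sketch below is illustrative only; the function names, the scores, and the labels are made up for the example, and it assumes that both normal readings and real outliers are present in the data.

```python
def error_rates(scores, labels, threshold):
    """Return (false_positive_rate, false_negative_rate) for one threshold.

    scores: outlier scores, higher means more anomalous
    labels: True for a real outlier, False for a normal reading
    """
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y)
    normals = sum(1 for y in labels if not y)
    outliers = sum(1 for y in labels if y)
    return fp / normals, fn / outliers

def sweep(scores, labels, thresholds):
    """Tabulate the trade-off so a threshold can be chosen deliberately."""
    return {t: error_rates(scores, labels, t) for t in thresholds}

scores = [0.1, 0.3, 0.5, 0.7, 0.4, 0.8, 0.9]
labels = [False, False, False, False, True, True, True]
for t, (fpr, fnr) in sweep(scores, labels, [0.2, 0.6, 0.85]).items():
    print(f"threshold={t}: FPR={fpr:.2f}, FNR={fnr:.2f}")
```

Raising the threshold lowers the false positive rate and raises the false negative rate; in this toy data no threshold drives both to zero, which is the difficulty the list item describes.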

#### 7.2. Nearest-Neighbor-Based Techniques

#### 7.3. Clustering-Based Techniques

**Advantages:**

- Easy to adapt to an incremental mode (after the clusters have been learned, new points can be inserted into the system and tested for outliers).
- Do not require supervision.
- Suitable for detecting outliers in temporal data.
- Have a rapid testing stage because the number of clusters that require comparisons is normally small.

**Disadvantages:**

- Rely heavily on the effectiveness of the clustering algorithm in capturing the cluster structure of normal instances.
- Most methods detect outliers as a by-product of clustering and are thus not optimized for outlier detection.
- Several clustering algorithms force every instance to be assigned to some cluster. As a result, outliers may be assigned to a large cluster and treated as normal instances by techniques that assume that outliers do not belong to any cluster.
- Some clustering-based methods are effective only when the outliers do not form significant clusters among themselves.
- Computational complexity is a bottleneck, particularly when an $O(N^{2}d)$ clustering algorithm is applied.

#### 7.4. Classification-Based Techniques

**Advantages:**

- Classification-based methods, particularly multi-class approaches, apply powerful algorithms that can distinguish between instances of different classes.
- The testing stage is rapid because each data instance is only compared with a pre-computed model.

**Disadvantages:**

- They rely on the availability of accurate labels for the various normal classes, which are often difficult to obtain.
- Classification-based methods assign a label to each test instance, which becomes a drawback when an outlier score is desired for the test instances. Classification methods that derive a probabilistic estimation score from the classifier output can be used to overcome this issue [8].

#### 7.5. Information Theoretic

**Advantages:**

- Do not require supervision.
- Make no assumptions about the underlying statistical distribution of the data.

**Disadvantages:**

- Their performance depends heavily on the choice of the information-theoretic measure. Such measures often identify outliers only when they are present in large numbers.
- Information-theoretic methods applied to spatial and sequence datasets depend on the size of the sub-structure, which is difficult to determine.
- It is difficult to associate an outlier score with a test instance using an information-theoretic method.
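As an illustration of what an information-theoretic measure can look like, the sketch below uses Shannon entropy over discretized readings and scores each instance by how much the entropy of the dataset drops when that instance is removed. Both the choice of entropy as the measure and the leave-one-out scoring are assumptions made for the example, not a technique taken from the surveyed studies; the symbol names are made up.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy (in bits) of a list of discrete symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def entropy_drop_scores(symbols):
    """Score each instance by the entropy reduction its removal causes."""
    base = entropy(symbols)
    return [
        base - entropy(symbols[:i] + symbols[i + 1:])
        for i in range(len(symbols))
    ]

# Discretized readings: two frequent states and one rare one (index 15).
symbols = ["ok"] * 8 + ["hot"] * 7 + ["fault"]
scores = entropy_drop_scores(symbols)
print(scores.index(max(scores)))
```

Removing the single rare symbol makes the data noticeably more regular, while removing one of the frequent symbols barely changes the entropy. The scoring costs one entropy computation per instance, and a different measure could rank the instances differently, which reflects the dependence on the chosen measure noted in the list.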

#### 7.6. Spectral Decomposition-Based Approaches

**Advantages:**

- Spectral methods automatically reduce dimensionality and are thus suitable for high-dimensional datasets. They can also be applied as a pre-processing step, followed by an existing outlier detection method in the transformed space.
- Spectral methods do not require supervision.

**Disadvantages:**

- Spectral methods are useful only if normal data and outliers are separable in the lower-dimensional representation of the data.
- The methods typically have high computational complexity.
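A small Python sketch can show the general idea of a spectral method. It uses principal component analysis as a representative example, which is an assumption of this illustration: the data are projected onto the first principal component, and each instance is scored by its distance from that component. The leading eigenvector is obtained by power iteration, which assumes that the start vector is not orthogonal to it (this would fail, for instance, on perfectly anti-correlated two-dimensional data). All function names are hypothetical.

```python
import math

def first_component(rows, iters=100):
    """Column means and leading eigenvector of the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / n for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d  # assumed not orthogonal to the leading eigenvector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def residual_scores(rows, iters=100):
    """Distance of each row from the line spanned by the first component."""
    means, v = first_component(rows, iters)
    scores = []
    for r in rows:
        c = [x - m for x, m in zip(r, means)]
        proj = sum(a * b for a, b in zip(c, v))
        resid = [a - proj * b for a, b in zip(c, v)]
        scores.append(math.sqrt(sum(x * x for x in resid)))
    return scores

# Nine readings on the line y = 2x and one that breaks the pattern (index 9).
rows = [(float(x), 2.0 * x) for x in range(1, 10)] + [(5.0, 16.0)]
scores = residual_scores(rows)
print(scores.index(max(scores)))
```

The example works because the outlier is separable from the normal data once the dominant direction has been removed, which is exactly the condition named in the list above.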

## 8. Evaluation of Outlier Detection Techniques

**Statistical-based approaches**: These are best suited to WSN data that contain a small number of outliers. They work in an unsupervised way by building statistical models and applying descriptive statistics to detect outliers.

**Parametric-based approaches**: These are suitable when the underlying WSN data can be modeled by a probability distribution. Parametric-based approaches comprise Gaussian and non-Gaussian models. Gaussian models are used when the WSN data of a node are compared with those of its neighbors, that is, under spatial correlation; they need a pre-selected threshold to detect anomalous data. Non-Gaussian models, in contrast, are used for local outlier detection and rely on temporal correlation.

**Non-parametric-based approaches**: These approaches are attractive because they require no assumption about the distribution of WSN data. They include histogram-based and kernel-based models. Histogram-based models determine the frequency of occurrence of different data instances; they can achieve excellent results for univariate WSN data but perform less well for multivariate data with interactions between the attributes. Kernel-based models use kernel density estimation to approximate the probability density function of sensor data; they can achieve excellent results on multivariate WSN data with good computational time.

**Nearest neighbor-based approaches**: These are convenient when the distance between two neighboring sensors is the key quantity in the analysis of WSN data. The nearest neighbor technique is well known not only in WSNs but also in data mining and machine learning. It requires a distance measure between two sensor nodes, and several such measures can be used. Nearest neighbor-based approaches assume that normal WSN data occur in dense neighborhoods, while outliers lie far from their closest neighbors.
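The density assumption behind nearest neighbor-based approaches can be sketched as follows. The example scores each reading by the Euclidean distance to its k-th nearest neighbor and flags readings whose score exceeds a cut-off. The distance measure, the value of `k`, the `threshold`, and the sample readings are all assumptions chosen for illustration; the sketch requires more than `k` points.

```python
import math

def knn_scores(points, k=3):
    """Outlier score of each point: distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def knn_outliers(points, k=3, threshold=5.0):
    """Points whose k-th nearest neighbor is farther away than threshold."""
    return [p for p, s in zip(points, knn_scores(points, k)) if s > threshold]

# (temperature, humidity) pairs: five similar readings and one far away.
points = [(20.0, 50.0), (20.5, 51.0), (19.5, 49.0),
          (21.0, 50.5), (20.2, 49.5), (35.0, 80.0)]
print(knn_outliers(points))
```

This brute-force version compares every pair of points, so its cost grows quadratically with the number of readings.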

**Clustering-based approaches**: These are used when grouping similar WSN data instances is important for data mining. They group WSN data into clusters with similar behavior; points that do not fall within any cluster can then be considered anomalies.
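A minimal Python sketch of this idea is given below. It groups points by chaining together all points that lie within a radius `eps` of one another and treats groups smaller than `min_size` as anomalies. This fixed-radius chaining rule is a simple stand-in chosen for the example, not one of the surveyed clustering algorithms, and `eps`, `min_size`, and the sample points are assumptions.

```python
import math

def cluster_outliers(points, eps=2.0, min_size=3):
    """Indices of points that end up in clusters smaller than min_size."""
    unvisited = set(range(len(points)))
    outliers = []
    while unvisited:
        # Grow one cluster from an arbitrary seed by repeatedly adding every
        # unvisited point within eps of a point already in the cluster.
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        if len(cluster) < min_size:
            outliers.extend(cluster)
    return sorted(outliers)

points = [(20.0, 50.0), (20.5, 51.0), (19.5, 49.0), (21.0, 50.5),  # cluster A
          (30.0, 70.0), (31.0, 70.5), (30.5, 69.0),                # cluster B
          (50.0, 10.0)]                                            # isolated
print(cluster_outliers(points))
```

Because the outliers fall out of the clustering step rather than being searched for directly, the result is only as good as the choice of `eps` and `min_size`.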

**Classification-based approaches**: These are divided into two types: supervised and unsupervised. Supervised techniques require labeling the WSN data and dividing them into training and testing parts. Unsupervised techniques do not require labeled data; they learn the boundary around the normal instances and identify any new instance that falls outside this boundary as an outlier.
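The unsupervised case can be sketched with a deliberately simple boundary: a hypersphere centered on the mean of the normal readings, with a radius taken from the distances observed during training. This is an illustrative simplification, not one of the one-class classifiers used in the surveyed studies; the names `fit_boundary` and `outside_boundary`, the `quantile` parameter, and the sample readings are assumptions.

```python
import math

def fit_boundary(normal_points, quantile=0.95):
    """Learn a (center, radius) hypersphere enclosing most normal readings."""
    n, d = len(normal_points), len(normal_points[0])
    center = tuple(sum(p[j] for p in normal_points) / n for j in range(d))
    dists = sorted(math.dist(p, center) for p in normal_points)
    # Radius = distance at the requested quantile of the training distances.
    radius = dists[min(n - 1, int(quantile * n))]
    return center, radius

def outside_boundary(point, center, radius):
    """A new instance is an outlier when it lies outside the learned sphere."""
    return math.dist(point, center) > radius

normal = [(20.0, 50.0), (21.0, 51.0), (19.0, 49.0), (20.0, 51.0),
          (21.0, 49.0), (19.0, 51.0), (20.0, 49.0), (20.0, 50.0)]
center, radius = fit_boundary(normal)
print(outside_boundary((20.2, 50.1), center, radius))
print(outside_boundary((35.0, 80.0), center, radius))
```

Testing a new instance only requires one distance computation against the pre-computed model, and no labeled outliers are needed to fit it.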

- High outlier detection rate.
- High scalability.
- High distinction between erroneous measurements and events.
- Low computational complexity and easy implementation.
- Consideration of correlations between attributes, spatial and spatiotemporal correlations, and multivariate sensory data.
- Unsupervised techniques are preferred since the learning phase on WSN sensory data is a difficult task for supervised methods.
- Non-parametric methods are preferred for WSN sensory data due to the absence of knowledge about the data distribution.
- Energy-efficient and robust to communication failures.

## 9. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Xie, M.; Han, S.; Tian, B.; Parvin, S. Anomaly detection in wireless sensor networks: A survey. J. Netw. Comput. Appl. **2011**, 34, 1302–1325.
- Myllyla, R.; The Mendeley Support Team. Vital Sign Monitoring System with Life Emergency Event Detection using Wireless Sensor Network. In Proceedings of the 2006 5th IEEE Conference on Sensors, Daegu, Korea, 22–25 October 2006; pp. 518–521.
- You, Z.; Mills-Beale, J.; Pereles, B.D.; Ong, K.G. A Wireless, Passive Embedded Sensor for Real-Time Monitoring of Water Content in Civil Engineering Materials. IEEE Sens. J. **2008**, 8, 2053–2058.
- Hao, Q.; Brady, D.J.; Guenther, B.D.; Burchett, J.B.; Shankar, M.; Feller, S. Human tracking with wireless distributed pyroelectric sensors. IEEE Sens. J. **2006**, 6, 1683–1695.
- Subramaniam, S.; Palpanas, T.; Papadopoulos, D.; Kalogeraki, V.; Gunopulos, D. Online outlier detection in sensor data using non-parametric models. In Proceedings of the VLDB ’06: Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea, 12–15 September 2006; pp. 187–198.
- Meratnia, N.; Havinga, P. Outlier Detection Techniques for Wireless Sensor Networks: A Survey. IEEE Commun. Surv. Tutor. **2010**, 12, 159–170.
- Ganguly, A.R.; Gama, J.; Omitaomu, O.A.; Gaber, M.; Vatsavai, R.R. Knowledge Discovery From Sensor Data; CRC Press: Boca Raton, FL, USA, 2008.
- Chandola, V.; Banerjee, A.; Vipin, K. Anomaly Detection: A Survey. ACM Comput. Surv. **2009**, 41, 1–6.
- Esnaola-Gonzalez, I.; Bermúdez, J.; Fernández, I.; Fernández, S.; Arnaiz, A. Towards a Semantic Outlier Detection Framework in Wireless Sensor Networks. In Proceedings of the 13th International Conference on Semantic Systems–Semantics2017, Amsterdam, The Netherlands, 12–13 September 2017; pp. 152–159.
- Fontugne, R.; Ortiz, J.; Tremblay, N.; Borgnat, P.; Flandrin, P.; Fukuda, K.; Culler, D.; Esaki, H. Strip, bind, and search: A method for identifying abnormal energy consumption in buildings. In Proceedings of the 2013 ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), Philadelphia, PA, USA, 8–11 April 2013; pp. 129–140.
- Sheng, B.; Li, Q.; Mao, W.; Jin, W. Outlier detection in sensor networks. In Proceedings of the 8th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc ’07), Montreal, QC, Canada, 9–14 September 2007; pp. 219–228.
- Hawkins, D.M. Identification of Outliers; Springer: Dordrecht, The Netherlands, 1980.
- Titouna, C.; Aliouat, M.; Gueroui, M. Outlier Detection Approach Using Bayes Classifiers in Wireless Sensor Networks. Wirel. Pers. Commun. **2015**, 85, 1009–1023.
- Barnett, V.; Lewis, T. Outliers in Statistical Data; Wiley: Hoboken, NJ, USA, 1974.
- Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. ACM Sigmod Rec. **2000**, 29, 93–104.
- Cheng, T.; Li, Z. A multiscale approach for spatio-temporal outlier detection. Trans. GIS **2006**, 10, 253–263.
- Aggarwal, C.C.; Philip, S.Y. An effective and efficient algorithm for high-dimensional outlier detection. VLDB J. **2005**, 14, 211–221.
- Muthukrishnan, S.; Shah, R.; Vitter, J.S. Mining deviants in time series data streams. In Proceedings of the 16th International Conference on Scientific and Statistical Database Management, Santorini Island, Greece, 21–23 June 2004; pp. 41–50.
- Jiang, M.F.; Tseng, S.S.; Su, C.M. Two-phase clustering process for outliers detection. Pattern Recognit. Lett. **2001**, 22, 691–700.
- Ayadi, A.; Ghorbel, O.; Obeid, A.M.; Abid, M. Outlier detection approaches for wireless sensor networks: A survey. Comput. Netw. **2017**, 129, 319–333.
- Buratti, C.; Conti, A.; Dardari, D.; Verdone, R. An overview on wireless sensor networks technology and evolution. Sensors **2009**, 9, 6869–6896.
- Rajasegarar, S.; Leckie, C.; Palaniswami, M.; Bezdek, J. Distributed Anomaly Detection in Wireless Sensor Networks. In Proceedings of the 2006 10th IEEE Singapore International Conference on Communication Systems, Singapore, 30 October–2 November 2006; pp. 1–5.
- Krishnamachari, B.; Iyengar, S. Distributed Bayesian algorithms for fault-tolerant event region detection in wireless sensor networks. IEEE Trans. Comput. **2004**, 53, 241–250.
- Ding, M.; Chen, D.; Xing, K.; Cheng, X. Localized Fault-Tolerant Event Boundary Detection in Sensor Networks. In Proceedings of the IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies, Miami, FL, USA, 13–17 March 2005; pp. 902–913.
- Bahrepour, M.; Zhang, Y.; Meratnia, N.; Havinga, P.J. Use of event detection approaches for outlier detection in wireless sensor networks. In Proceedings of the 2009 International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), Melbourne, VIC, Australia, 7–10 December 2009; pp. 439–444.
- Shahid, N.; Naqvi, I.H.; Qaisar, S.B. Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: A survey. Artif. Intell. Rev. **2012**, 43, 193–228.
- Ghaddar, A.; Razafindralambo, T.; Simplot-Ryl, I.; Tawbi, S.; Hijazi, A. Algorithm for temporal anomaly detection in WSNs. In Proceedings of the 2011 IEEE Wireless Communications and Networking Conference, WCNC 2011, Cancun, Quintana Roo, Mexico, 28–31 March 2011; pp. 743–748.
- Hanafizadeh, P.; Keating, B.W.; Khedmatgozar, H.R. A systematic review of Internet banking adoption. Telemat. Inform. **2014**, 31, 492–510.
- Asadi, S.; Hussin, A.R.C.; Dahlan, H.M. Organizational research in the field of Green IT: A systematic literature review from 2007 to 2016. Telemat. Inform. **2017**, 34, 1191–1249.
- Asadi, S.; Abdullah, R.; Yah, Y.; Nazir, S. Understanding Institutional Repository in Higher Learning Institutions: A systematic literature review and directions for future research. IEEE Access **2019**, 7, 35242–35263.
- Kitchenham, B. Procedures for performing systematic reviews. Keele Univer. Tech. Rep. UK **2004**, 33, 1–26.
- Keele, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering; Technical Report, Ver. 2.3 EBSE Technical Report; EBSE, Keele University and Durham University: Keele, UK; Durham, UK, 2007.
- Bandara, W.; Miskon, S.; Fielt, E. A Systematic, Tool-Supported Method for Conducting Literature Reviews in IS. Inf. Syst. J. **2011**, 1–14.
- Webster, J.; Watson, R.T. Analyzing the Past to Prepare for the Future: Writing a Literature Review. MIS Q. **2002**, 26, 13–23.
- Kitchenham, B.A.; Charters, S. Guidelines for performing Systematic Literature Reviews in Software Engineering. Keele Univ. Univ. Durh. **2007**, 2, 1–65.
- Branch, J.W.; Giannella, C.; Szymanski, B.; Wolff, R.; Kargupta, H. In-Network Outlier Detection in Wireless Sensor Networks. Knowl. Inf. Syst. **2009**, 34, 23–54.
- Luo, X.; Dong, M.; Huang, Y. On distributed fault-tolerant detection in wireless sensor networks. IEEE Trans. Comput. **2006**, 55, 58–70.
- Samparthi, V.S.K.; Verma, H.K. Outlier Detection of Data in Wireless Sensor Networks Using Kernel Density Estimation. Int. J. Comput. Appl. **2010**, 5, 975–8887.
- Jiang, F.; Sui, Y.; Cao, C. Some issues about outlier detection in rough set theory. Expert Syst. Appl. **2009**, 36, 4680–4687.
- Otey, M.E.; Ghoting, A.; Parthasarathy, S. Fast distributed outlier detection in mixed-attribute data sets. Data Min. Knowl. Discov. **2006**, 12, 203–228.
- Ghorbel, O.; Obeid, A.M.; Abid, M.; Snoussi, H. One class outlier detection method in wireless sensor networks: Comparative study. In Proceedings of the 2016 24th International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 22–24 September 2016; pp. 1–8.
- Salem, O.; Mehaoua, A. Anomaly Detection in Medical Wireless Sensor Networks. J. Comput. Sci. Eng. **2013**, 7, 272–284.
- Chen, Y.; Miao, D.; Zhang, H. Neighborhood outlier detection. Expert Syst. Appl. **2010**, 37, 8745–8749.
- Rajasegarar, S.; Bezdek, J.C.; Leckie, C.; Palaniswami, M. Elliptical anomalies in wireless sensor networks. ACM Trans. Sens. Netw. **2009**, 6, 1–28.
- Bakar, Z.A.; Mohemad, R.; Ahmad, A.; Deris, M.M. A Comparative Study for Outlier Detection Techniques in Data Mining. In Proceedings of the IEEE Conference on Cybernetics and Intelligent Systems, Bangkok, Thailand, 7–9 June 2006; pp. 1–6.
- Hwj, H.; Iacca, G.; Tejada, A.; Wörtche, H.J.; Liotta, A. Spatial anomaly detection in sensor networks using neighborhood information. Inf. Fusion **2017**, 33, 41–56.
- Abid, A.; Kachouri, A.; Mahfoudhi, A. Anomaly detection through outlier and neighborhood data in Wireless Sensor Networks. In Proceedings of the 2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), Monastir, Tunisia, 21–24 March 2016; pp. 26–30.
- Xie, M.; Hu, J.; Han, S.; Chen, H.H. Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks. IEEE Trans. Parallel Distrib. Syst. **2013**, 24, 1661–1670.
- Al-Zoubi, M.; Al-Dahoud, A.; Yahya, A. New outlier detection method based on fuzzy clustering. WSEAS Trans. Inf. **2010**, 7, 681–690.
- Yang, D.; Rundensteiner, E.a.; Ward, M.O. Neighbor-based pattern detection for windows over streaming data. In Proceedings of the 12th International Conference on Extending Database Technology Advances in Database Technology, EDBT’09, Saint Petersburg, Russia, 24–26 March 2009; pp. 529–540.
- Aggarwal, C.; Philip, S. Outlier Detection with Uncertain Data. In Proceedings of the 2008 SIAM International Conference on Data Mining, Atlanta, GA, USA, 24–26 April 2008.
- Branch, J.; Szymanski, B.; Giannella, C.; Wolff, R.W.R.; Kargupta, H. In-Network Outlier Detection in Wireless Sensor Networks. In Proceedings of the 26th IEEE International Conference on Distributed Computing Systems (ICDCS’06), Lisboa, Portugal, 4–7 July 2006; p. 51.
- Garcia-Font, V.; Garrigues, C.; Rifà-Pous, H. Difficulties and challenges of anomaly detection in smart cities: A laboratory analysis. Sensors **2018**, 18, 3198.
- Wang, M.; Xue, A.; Xia, H. Abnormal Event Detection in Wireless Sensor Networks Based on Multiattribute Correlation. J. Electr. Comput. Eng. **2017**, 2017, 2587948.
- De Paola, A.; Gaglio, S.; Re, G.L.; Milazzo, F.; Ortolani, M. Adaptive distributed outlier detection for WSNs. IEEE Trans. Cybern. **2015**, 45, 888–899.
- Rajasegarar, S.; Leckie, C.; Palaniswami, M. Hyperspherical cluster based distributed anomaly detection in wireless sensor networks. J. Parallel Distrib. Comput. **2014**, 74, 1833–1847.
- Fawzy, A.; Mokhtar, H.M.O.; Hegazy, O. Outliers detection and classification in wireless sensor networks. Egypt. Inform. J. **2013**, 14, 157–164.
- Takruri, M.; Challa, S.; Chakravorty, R. Recursive bayesian approaches for auto calibration in drift aware wireless sensor networks. J. Netw. **2010**, 5, 823–832.
- Budalakoti, S.; Srivastava, A.N.; Otey, M.E. Anomaly detection and diagnosis algorithms for discrete symbol sequences with applications to airline safety. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. **2009**, 39, 101–113.
- Ghoting, A.; Parthasarathy, S.; Otey, M.E. Fast mining of distance-based outliers in high-dimensional datasets. Data Min. Knowl. Discov. **2008**, 16, 349–364.
- Zhuang, Y.; Chen, L. In-network Outlier Cleaning for Data Collection in Sensor Networks. In Proceedings of the Workshop in VLDB, Seoul, Korea, 12–15 September 2006.
- Li, Q.; Sun, R.; Wu, H.; Zhang, Q. Parallel distributed computing based wireless sensor network anomaly data detection in IoT framework. Cogn. Syst. Res. **2018**, 52, 342–350.
- Feng, Z.; Fu, J.; Du, D.; Li, F.; Sun, S. A new approach of anomaly detection in wireless sensor networks using support vector data description. Int. J. Distrib. Sens. Netw.
**2017**, 13, 155014771668616. [Google Scholar] [CrossRef][Green Version] - Gil, P.; Martins, H.; Cardoso, A.; Palma, L. Outliers detection in non-stationary time-series: Support vector machine versus principal component analysis. In Proceedings of the 2016 12th IEEE International Conference on Control and Automation (ICCA), Kathmandu, Nepal, 1–3 June 2016; Volume 1, pp. 701–706. [Google Scholar] [CrossRef]
- Ghorbel, O.; Abid, M.; Snoussi, H. Improved KPCA for outlier detection in Wireless Sensor Networks. In Proceedings of the 2014 1st International Conference on Advanced Technologies for Signal and Image Processing, ATSIP 2014, Sousse, Tunisia, 17–19 March 2014. [Google Scholar] [CrossRef]
- Livani, A.A.; Abadi, M.; Alikhani, M. Outlier detection in wireless sensor networks using distributed principal component analysis. J. Data Min.
**2013**, 1, 1–11. [Google Scholar] [CrossRef] - Zhang, Y.; Hamm, N.; Meratnia, N.; Stein, A.; van de Voort, M.; Havinga, P.J.M. Statistics-based outlier detection for wireless sensor networks. Int. J. Geogr. Inf. Sci.
**2012**, 26, 1373–1392. [Google Scholar] [CrossRef] - Rajasegarar, S.; Leckie, C.; Bezdek, J.C.; Palaniswami, M. Centered hyperspherical and hyperellipsoidal one-class support vector machines for anomaly detection in sensor networks. IEEE Trans. Inf. Forensics Secur.
**2010**, 5, 518–533. [Google Scholar] [CrossRef] - Zhang, Y.; Meratnia, N.; Havinga, P. An online outlier detection technique for wireless sensor networks using unsupervised quarter-sphere support vector machine. In Proceedings of the 2008 International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), Sydney, NSW, Australia, 15–18 December 2008; pp. 151–156. [Google Scholar] [CrossRef][Green Version]
- Wu, W.; Cheng, X.; Ding, M.; Xing, K.; Liu, F.; Deng, P. Localized outlying and boundary data detection in sensor networks. IEEE Trans. Knowl. Data Eng. **2007**, 19, 1145–1156.
- Bandyopadhyay, S.; Giannella, C.; Maulik, U.; Kargupta, H.; Liu, K.; Datta, S. Clustering distributed data streams in peer-to-peer environments. Inf. Sci. **2006**, 176, 1952–1985.
- Ramotsoela, D.; Abu-Mahfouz, A.; Hancke, G. A survey of anomaly detection in industrial wireless sensor networks with critical water system infrastructure as a case study. Sensors **2018**, 18, 2491.
- Ayadi, A.; Ghorbel, O. Performance of outlier detection techniques based classification in Wireless Sensor Networks. In Proceedings of the 13th International Wireless Communications and Mobile Computing Conference (IWCMC), Valencia, Spain, 26–30 June 2017; pp. 687–692.
- Gil, P.; Martins, H.; Januário, F. Detection and accommodation of outliers in Wireless Sensor Networks within a multi-agent framework. Appl. Soft Comput. J. **2016**, 42, 204–214.
- Ghorbel, O.; Ayedi, W.; Snoussi, H.; Abid, M. Fast and efficient outlier detection method in wireless sensor networks. IEEE Sens. J. **2015**, 15, 3403–3411.
- Govindarajan, M.; Abinaya, V. An Outlier detection approach with data mining in wireless sensor network. Int. J. Curr. Eng. Technol. **2014**, 4, 929–932.
- Kumarage, H.; Khalil, I.; Tari, Z.; Zomaya, A. Distributed anomaly detection for industrial wireless sensor networks based on fuzzy data modelling. J. Parallel Distrib. Comput. **2013**, 73, 790–806.
- Zhang, Y.; Meratnia, N.; Havinga, P.J.M. Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine. Ad Hoc Netw. **2013**, 11, 1062–1074.
- Peng, X.; Chen, J.; Shen, H. Outlier detection method based on SVM and its application in copper-matte converting. In Proceedings of the 2010 Chinese Control and Decision Conference, Xuzhou, China, 26–28 May 2010; pp. 628–631.
- Moshtaghi, M.; Rajasegarar, S.; Leckie, C.; Karunasekera, S. Anomaly Detection by Clustering Ellipsoids in Wireless Sensor Networks. In Proceedings of the 2009 International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), Melbourne, VIC, Australia, 7–10 December 2009; pp. 331–336.
- Agovic, A.; Banerjee, A.; Ganguly, A.; Protopopescu, V. Anomaly Detection in Transportation Corridors Using Manifold Embedding. In Proceedings of the ACM Workshop on Knowledge Discovery from Sensor Data: The 13th International Conference on Knowledge Discovery and Data Mining, San Jose, CA, USA, 12–15 August 2007.
- Zhang, K.; Gao, H.; Li, J.; Shi, S. Unsupervised Outlier Detection in Sensor Networks Using Aggregation Tree. Adv. Data Min. Appl. **2007**, 4632, 158–169.
- Janakiram, D.; Reddy, V.; Kumar, A.V.U.P.; V, A.M.R. Outlier Detection in Wireless Sensor Networks using Bayesian Belief Networks. In Proceedings of the 2006 1st International Conference on Communication Systems Software & Middleware, New Delhi, India, 8–12 January 2006; pp. 1–6.
- Granjal, J.; Silva, J.M.; Lourenço, N. Intrusion detection and prevention in CoAP wireless sensor networks using anomaly detection. Sensors **2018**, 18, 2445.
- Xie, M.; Hu, J.; Guo, S.; Zomaya, A.Y. Distributed Segment-Based Anomaly Detection with Kullback–Leibler Divergence in Wireless Sensor Networks. IEEE Trans. Inf. Forensics Secur. **2017**, 12, 101–110.
- Kamal, S.; Ramadan, R.; El-Refai, F. Smart outlier detection of wireless sensor network. Facta Univ. Ser. Electron. Energetics **2016**, 29, 383–393.
- Yao, H.; Cao, H.; Li, J. Comprehensive Outlier Detection in Wireless Sensor Network with Fast Optimization Algorithm of Classification Model. Int. J. Distrib. Sens. Netw. **2015**, 2015.
- Shukla, D.S.; Pandey, A.C.; Kulhari, A. Outlier detection: A survey on techniques of WSNs involving event and error based outliers. In Proceedings of the International Conference on Innovative Applications of Computational Intelligence on Power, Energy and Controls with Their Impact on Humanity, CIPECH 2014, Ghaziabad, India, 28–29 November 2014; pp. 113–116.
- Ritika; Kumar, T.; Kaur, A. Outlier Detection in WSN- A Survey. Int. J. Adv. Res. Comput. Sci. Softw. Eng. **2013**, 3, 609–617.
- Ni, K.; Pottie, G. Sensor network data fault detection with maximum a posteriori selection and Bayesian modeling. ACM Trans. Sens. Netw. **2012**, 8, 1–21.
- Kontaki, M.; Gounaris, A.; Papadopoulos, A.N.; Tsichlas, K.; Manolopoulos, Y. Continuous monitoring of distance-based outliers over data streams. In Proceedings of the IEEE 27th International Conference on Data Engineering, Hannover, Germany, 11–16 April 2011.
- Sharma, A.B.; Golubchik, L.; Govindan, R. Sensor faults. ACM Trans. Sens. Netw. **2010**, 6, 1–39.
- Yang, P.; Zhu, Q.; Zhong, X. Subtractive clustering based RBF neural network model for outlier detection. J. Comput. **2009**, 4, 755–762.
- Shuai, M.; Xie, K.; Chen, G.; Ma, X.; Song, G. A Kalman filter based approach for outlier detection in sensor networks. In Proceedings of the 2008 International Conference on Computer Science and Software Engineering, Wuhan, China, 12–14 December 2008; Volume 4, pp. 154–157.
- Angiulli, F.; Fassetti, F. Detecting distance-based outliers in streams of data. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM ’07, Lisbon, Portugal, 6–10 November 2007.
- Birant, D.; Kut, A. Spatio-temporal outlier detection in large databases. In Proceedings of the 28th International Conference on Information Technology Interfaces, Cavtat/Dubrovnik, Croatia, 19–22 June 2006.
- Yahyaoui, A.; Abdellatif, T.; Attia, R. READ: Reliable Event and Anomaly Detection System in Wireless Sensor Networks. In Proceedings of the 2018 IEEE 27th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), Paris, France, 27–29 June 2018; pp. 193–198.
- Trinh, V.V.; Tran, K.P.; Huong, T.T. Data driven hyperparameter optimization of one-class support vector machines for anomaly detection in wireless sensor networks. In Proceedings of the International Conference on Advanced Technologies for Communications, Quy Nhon, Vietnam, 18–20 October 2017; pp. 6–10.
- Can, A.; Guillaume, G.; Picaut, J. Cross-calibration of participatory sensor networks for environmental noise mapping. Appl. Acoust. **2016**, 110, 99–109.
- Kannan, K.; Manoj, K.; Sakthivel, E. A comparative study on nearest-neighbor based outlier detection in data mining. A J. Manag. NISMA Noorul Islam Strateg. Manag. Ambience **2015**, 1, 203–204.
- Pimentel, M.A.F.; Clifton, D.A.; Clifton, L.; Tarassenko, L. Review: A Review of Novelty Detection. Signal Process. **2014**, 99, 215–249.
- Chandore, P.; Chatur, D. Hybrid approach for outlier detection over wireless sensor network real time data. Int. J. Comput. Sci. Addit. Appl. **2013**, 6, 76–81.
- Warriach, E.U.; Nguyen, T.A.; Aiello, M.; Tei, K. Notice of Violation of IEEE Publication Principles: A hybrid fault detection approach for context-aware wireless sensor networks. In Proceedings of the 2012 IEEE 9th International Conference on Mobile Ad-Hoc and Sensor Systems (MASS 2012), Las Vegas, NV, USA, 8–11 October 2012; pp. 281–289.
- Bezdek, J.C.; Rajasegarar, S.; Moshtaghi, M.; Leckie, C.; Palaniswami, M.; Havens, T.C. Anomaly detection in environmental monitoring networks [application notes]. IEEE Comput. Intell. Mag. **2011**, 6, 52–58.
- Sangari, A.S. Anomaly detection in wireless sensor networks. Recent Adv. Space Technol. Serv. Clim. Chang. **2010**, 16, 1413–1432.
- Zhang, Y.; Meratnia, N.; Havinga, P. Adaptive and Online One-Class Support Vector Machine-Based Outlier Detection Techniques for Wireless Sensor Networks. In Proceedings of the 2009 International Conference on Advanced Information Networking and Applications Workshops, Bradford, UK, 26–29 May 2009; pp. 990–995.
- Wang, X.; Lizier, J.; Obst, O. Spatiotemporal anomaly detection in gas monitoring sensor networks. In Proceedings of the 5th European Conference on Wireless Sensor Networks, Bologna, Italy, 30 January–1 February 2008; pp. 90–105.
- Ni, K.; Pottie, G. Bayesian selection of non-faulty sensors. IEEE Int. Symp. Inf. Theory Proc. **2007**, 616–620.
- Chen, J.; Kher, S.; Somani, A. Distributed Fault Detection of Wireless Sensor Networks. In Proceedings of the 2006 Workshop on Dependability Issues in Wireless Ad Hoc Networks and Sensor Networks, Los Angeles, CA, USA, 26 September 2006; pp. 65–72.
- Kumar Dwivedi, R.; Pandey, S.; Kumar, R. A Study on Machine Learning Approaches for Outlier Detection in Wireless Sensor Network. In Proceedings of the 2018 8th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 11–12 January 2018; pp. 189–192.
- Trinh, V.V.; Tran, K.P.; Mai, A.T. Anomaly Detection in Wireless Sensor Networks via Support Vector Data Description with Mahalanobis Kernels and Discriminative Adjustment. In Proceedings of the 4th NAFOSTED Conference on Information and Computer Science Anomaly, Hanoi, Vietnam, 24–25 November 2017; pp. 567–586.
- Dziengel, N.; Seiffert, M.; Ziegert, M.; Adler, S.; Pfeiffer, S.; Schiller, J. Deployment and evaluation of a fully applicable distributed event detection system in Wireless Sensor Networks. Ad Hoc Netw. **2016**, 37, 160–182.
- Guo, J.; Liu, F. Automatic data quality control of observations in wireless sensor network. IEEE Geosci. Remote Sens. Lett. **2015**, 12, 716–720.
- Dauwe, S.; Oldoni, D.; De Baets, B.; Van Renterghem, T.; Botteldooren, D.; Dhoedt, B. Multi-criteria anomaly detection in urban noise sensor networks. Environ. Sci. Process. Impacts **2014**, 16, 1–10.
- Duh, D.R.; Li, S.P.; Cheng, V.W. Distributed Fault-Tolerant Event Region Detection of Wireless Sensor Networks. Int. J. Distrib. Sens. Netw. **2013**, 2013, 160523.
- Shahid, N.; Naqvi, I.H.; Qaisar, S.B. Real time energy efficient approach to outlier & event detection in wireless sensor networks. In Proceedings of the 2012 IEEE International Conference on Communication Systems (ICCS), Singapore, 21–23 November 2012; pp. 162–166.
- Farruggia, A. A Probabilistic Approach to Anomaly Detection for Wireless Sensor Networks. Ph.D. Thesis, Università degli Studi di Palermo, Palermo, Italy, 2011.
- Siripanadorn, S.; Hattagam, W.; Teaumroong, N. Anomaly detection using self-organizing map and wavelets in wireless sensor networks. In Proceedings of the 10th WSEAS International Conference on Applied Computer Science, Merida, Venezuela, 14–16 December 2010; pp. 291–297.
- Bal, M.; Shen, W.; Ghenniwa, H. Collaborative signal and information processing in wireless sensor networks: A review. In Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics, San Antonio, TX, USA, 11–14 October 2009; pp. 3151–3156.
- Rajasegarar, S.; Leckie, C.; Palaniswami, M. CESVM: Centered hyperellipsoidal support vector machine based anomaly detection. IEEE Int. Conf. Commun. **2008**, 1610–1614.
- Rajasegarar, S.; Leckie, C.; Palaniswami, M.; Bezdek, J.C. Quarter Sphere Based Distributed Anomaly Detection in Wireless Sensor Networks. In Proceedings of the 2007 IEEE International Conference on Communications, Glasgow, UK, 24–28 June 2007; pp. 3864–3869.
- Bhuse, V.; Gupta, A. Anomaly Intrusion Detection in Wireless Sensor Networks. J. High Speed Netw. **2006**, 15, 33–51.
- Du, W.; Fang, L. LAD: Localization Anomaly Detection for Wireless Sensor Networks. In Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, Denver, CO, USA, 4–8 April 2005.
- Ahmad, B.; Jian, W.; Ali, Z.A.; Tanvir, S.; Khan, M.S.A. Hybrid Anomaly Detection by Using Clustering for Wireless Sensor Network. Wirel. Pers. Commun. **2018**, 1–13.
- Kanev, A.; Nasteka, A.; Bessonova, C.; Nevmerzhitsky, D.; Silaev, A.; Efremov, A.; Nikiforova, K. Anomaly Detection in Wireless Sensor Network of the “Smart Home” System. In Proceedings of the 2017 20th Conference of Open Innovations Association (FRUCT), St. Petersburg, Russia, 3–7 April 2017; pp. 118–124.
- Li, G.; He, B.; Huang, H.; Tang, L. Temporal data-driven sleep scheduling and spatial data-driven anomaly detection for clustered wireless sensor networks. Sensors **2016**, 16, 1601.
- Haque, S.A.; Rahman, M.; Aziz, S.M. Sensor anomaly detection in wireless sensor networks for healthcare. Sensors **2015**, 15, 8764–8786.
- O’Reilly, C.; Gluhak, A.I.; Rajasegarar, S. Anomaly Detection in Wireless Sensor Networks in a Non-Stationary Environment. IEEE Commun. Surv. Tutorials **2014**, 16, 1413–1432.
- Rassam, M.A.; Zainal, A.; Maarof, M.A. Advancements of data anomaly detection research in Wireless Sensor Networks: A survey and open issues. Sensors **2013**, 13, 10087–10122.
- Ren, W.; Cui, Y. A parallel rough set tracking algorithm for wireless sensor networks. J. Netw. **2012**, 7, 972–979.
- Jurdak, R.; Wang, X.R.; Obst, O.; Valencia, P. Wireless Sensor Network Anomalies: Diagnosis and Detection Strategies. Intell. Syst. Ref. Libr. **2011**, 10, 309–325.
- Orair, G.H.; Teixeira, C.H.; Meira, W.J.; Wang, Y.; Parthasarathy, S. Distance-based outlier detection: Consolidation and renewed bearing. Proc. VLDB Endow. **2010**, 3, 1469–1480.
- Hoes, R.; Basten, T.; Tham, C.K.; Geilen, M.; Corporaal, H. Quality-of-service trade-off analysis for wireless sensor networks. Perform. Eval. **2009**, 66, 191–208.
- Rajasegarar, S.; Leckie, C.; Palaniswami, M. Anomaly detection in wireless sensor networks. IEEE Wirel. Commun. **2008**, 15, 34–40.
- Hill, D.J.; Minsker, B.S.; Amir, E. Real-Time Bayesian Anomaly Detection for Environmental Sensor Data. Water Resour. Res. **2009**, 45.
- Abe, N.; Zadrozny, B.; Langford, J. Outlier detection by active learning. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, 20–23 August 2006; pp. 504–509.
- Chen, Q.; Lam, K.Y.; Fan, P. Comments on "Distributed Bayesian algorithms for fault-tolerant event region detection in wireless sensor networks". IEEE Trans. Comput. **2005**, 54, 1182–1183.
- Hodge, V.J.; Austin, J. A Survey of Outlier Detection Methodologies. Artif. Intell. Rev. **2004**, 22, 85–126.
- McDonald, D.; Sanchez, S.; Madria, S.; Ercal, F. A Survey of Methods for Finding Outliers in Wireless Sensor Networks. J. Netw. Syst. Manag. **2013**, 23, 163–182.
- Portocarrero, J.M.T.; Delicato, F.C.; Pires, P.F.; Gámez, N.; Fuentes, L.; Ludovino, D.; Ferreira, P. Autonomic Wireless Sensor Networks: A Systematic Literature Review. J. Sens. **2014**, 2014.
- Mamun, Q.; Islam, R.; Kaosar, M. Anomaly Detection in Wireless Sensor Network. J. Netw. **2014**, 9, 2914–2924.
- Su, L.; Han, W.; Yang, S.; Zou, P.; Jia, Y. Continuous Adaptive Outlier Detection on Distributed Data Streams. Lect. Notes Comput. Sci. **2007**, 74–85.
- Agarwal, D. Detecting anomalies in cross-classified streams: A Bayesian approach. Knowl. Inf. Syst. **2007**, 11, 29–44.
- Basu, S.; Meckesheimer, M. Automatic outlier detection for time series: An application to sensor data. Knowl. Inf. Syst. **2007**, 11, 137–154.
- Budalakoti, S.; Srivastava, A.; Akella, R.; Turkov, E. Anomaly Detection in Large Sets of High-Dimensional Symbol Sequences. Tech. Rep. NASA TM-2006-214553; NASA Ames Research Center: Mountain View, CA, USA, 2006.
- He, Z.; Deng, S.; Xu, X. An Optimization Model for Outlier Detection in Categorical Data. In International Conference on Intelligent Computing; Springer: Berlin/Heidelberg, Germany, 2005; pp. 400–409.
- Hill, D.J.; Minsker, B.S.; Amir, E. Real-time Bayesian anomaly detection for environmental sensor data. Proc. Congr.-Int. Assoc. Hydraul. Res. **2006**, 32, 503.
- Farruggia, A.; Lo Re, G.; Ortolani, M. Probabilistic Anomaly Detection for Wireless Sensor Networks. In Congress of the Italian Association for Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2011; pp. 438–444.
- Chatzigiannakis, V.; Papavassiliou, S.; Grammatikou, M.; Maglaris, B. Hierarchical anomaly detection in distributed large-scale sensor networks. In Proceedings of the 11th IEEE Symposium on Computers and Communications, ISCC’06, Pula-Cagliari, Sardinia, Italy, 26–29 June 2006; pp. 761–767.
- Mohamed, M.S.; Kavitha, T. Outlier Detection Using Support Vector Machine in Wireless Sensor Network Real Time Data. Int. J. Soft Comput. Eng. (IJSCE) **2011**, 1, 68–72.
- Hassan, A.F.; Mokhtar, H.M.O.; Hegazy, O. A Heuristic Approach for Sensor Network Outlier Detection. Science **2011**, 11, 66–72.
- Banerjee, A.; Burlina, P.; Diehl, C. A support vector method for anomaly detection in hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. **2006**, 44, 2282–2291.
- Wang, C.; Viswanathan, K.; Choudur, L.; Talwar, V.; Satterfield, W.; Schwan, K. Statistical techniques for online anomaly detection in data centers. In Proceedings of the 12th IFIP/IEEE International Symposium on Integrated Network Management, IM 2011, Dublin, Ireland, 23–27 May 2011; pp. 385–392.
- Nidhra, S.; Yanamadala, M. Knowledge Transfer Challenges and Mitigation Strategies in Global Software Development. Int. J. Inf. Manag. **2012**, 33, 333–355.
- Hida, Y.; Huang, P.; Nishtala, R. Aggregation Query Under Uncertainty in Sensor Networks; Department of Electrical Engineering and Computer Science, University of California: Berkeley, CA, USA, 2007; pp. 1–17.
- Palpanas, T.; Papadopoulos, D.; Kalogeraki, V.; Gunopulos, D. Distributed deviation detection in sensor networks. ACM SIGMOD Rec. **2003**, 32, 77–82.
- Knorr, E.M.; Ng, R.T. Algorithms for Mining Distance-Based Outliers in Large Datasets. Proc. 24th VLDB Conf. **1998**, 98, 392–403.
- Ramaswamy, S.; Rastogi, R.; Shim, K. Efficient algorithms for mining outliers from large data sets. ACM SIGMOD Rec. **2000**, 427–438.
- Papadimitriou, S.; Kitagawa, H.; Gibbons, P.B.; Faloutsos, C. LOCI: Fast outlier detection using the local correlation integral. In Proceedings of the 19th International Conference on Data Engineering, Bangalore, India, 5–8 March 2003; pp. 315–326.
- Boulila, W.; Farah, I.R.; Ettabaa, K.S.; Solaiman, B.; Ghézala, H.B. Spatio-Temporal Modeling for Knowledge Discovery in Satellite Image Databases. In Proceedings of the CORIA 2010, 7th French Information Retrieval Conference, Sousse, Tunisia, 18–20 March 2010; pp. 35–49.
- Boulila, W. A top-down approach for semantic segmentation of big remote sensing images. Earth Sci. Inform. **2019**, 12, 295–306.
- Yu, D.; Sheikholeslami, G.; Zhang, A. FindOut: Finding Outliers in Very Large Datasets. Knowl. Inf. Syst. **2002**.
- Allan, J.; Carbonell, J.; Doddington, G.; Yamron, J.; Yang, Y. Topic detection and tracking pilot study: Final report. DARPA Broadcast News Transcr. Underst. Workshop **1998**.
- Marchette, D.J. A Statistical Method for Profiling Network Traffic. In Proceedings of the 1st Workshop on Intrusion Detection and Network Monitoring, Santa Clara, CA, USA, 9–12 April 1999; pp. 119–128.
- Wu, N.; Zhang, J. Factor analysis based anomaly detection. IEEE Syst. Man Cybern. Soc. Inf. Assur. Workshop **2003**.
- Vinueza, A.; Grudic, G. Unsupervised Outlier Detection and Semi-Supervised Learning; Technical Report CU-CS-976-04; University of Colorado: Boulder, CO, USA, 2004.
- Chan, P.K.; Mahoney, M.V.; Arshad, M.H. A Machine Learning Approach to Anomaly Detection; Florida Institute of Technology: Melbourne, FL, USA, 2003.
- Barbará, D.; Li, Y.; Couto, J.; Lin, J.L.; Jajodia, S. Bootstrapping a data mining intrusion detection system. In Proceedings of the 2003 ACM Symposium on Applied Computing, SAC ’03, Melbourne, FL, USA, 9–12 March 2003; pp. 421–425.
- Barbará, D.; Li, Y.; Couto, J. COOLCAT: An Entropy-Based Algorithm for Categorical Clustering. Entropy **2002**.
- Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. Density-based clustering in spatial databases: The algorithm GDBSCAN and its applications. Data Min. Knowl. Discov. **1998**.
- De Stefano, C.; Sansone, C.; Vento, M. To reject or not to reject: That is the question - an answer in case of neural classifiers. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. **2000**.
- Barbará, D.; Wu, N.; Jajodia, S. Detecting Novel Network Intrusions Using Bayes Estimators. In Proceedings of the 2001 SIAM International Conference on Data Mining, Chicago, IL, USA, 5–7 April 2001.
- Elnahrawy, E.; Nath, B. Context-aware sensors. Wirel. Sens. Netw. Proc. **2004**.
- Hawkins, S.; He, H.; Williams, G.; Baxter, R. Outlier Detection Using Replicator Neural Networks. Data Warehous. **2002**, 170–180.
- Yamanishi, K.; Takeuchi, J.I.; Williams, G.; Milne, P. On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms. Data Min. Knowl. Discov. **2004**.
- Sykacek, P. Equivalent Error Bars For Neural Network Classifiers Trained By Bayesian Inference. In Proceedings of the European Symposium on Artificial Neural Networks, Bruges, Belgium, 16–18 April 1997; pp. 121–126.
- Patcha, A.; Park, J.M. An overview of anomaly detection techniques: Existing solutions and latest technological trends. Comput. Netw. **2007**, 51, 3448–3470.
- Tan, P.N. Introduction to Data Mining; Pearson Education India, University of Minnesota: Minneapolis, MN, USA, 2006.
- Jain, A.K.; Dubes, R.C. Algorithms for Clustering Data; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 1988.
- Boulila, W.; Farah, I.R.; Ettabaa, K.S.; Solaiman, B.; Ghézala, H.B. A data mining based approach to predict spatiotemporal changes in satellite images. Int. J. Appl. Earth Obs. Geoinf. **2011**, 13, 386–395.
- Han, J.; Pei, J.; Kamber, M. Data Mining: Concepts and Techniques; Elsevier: Amsterdam, The Netherlands, 2011.
- Safaei, M.; Ismail, A.S.; Chizari, H.; Driss, M.; Boulila, W.; Asadi, S.; Safaei, M. Standalone noise and anomaly detection in wireless sensor networks: A novel time-series and adaptive Bayesian-network-based approach. J. Softw. Pract. Exp. **2020**.
- He, Z.; Deng, S.; Xu, X.; Huang, J.Z. A fast greedy algorithm for outlier mining. In Pacific-Asia Conference on Knowledge Discovery and Data Mining; Springer: Berlin/Heidelberg, Germany, 2006; pp. 567–576.
- Ando, S. Clustering needles in a haystack: An information theoretic analysis of minority and outlier detection. In Proceedings of the Seventh IEEE International Conference on Data Mining, Omaha, NE, USA, 28–31 October 2007; pp. 13–22.
- Jolliffe, I.T. Principal component analysis and factor analysis. In Principal Component Analysis; Springer: New York, NY, USA, 2002; pp. 150–166.
- Dunia, R.; Joe Qin, S. Subspace approach to multidimensional fault identification and reconstruction. AIChE J. **1998**, 44, 1813–1831.
- Jackson, J.E.; Mudholkar, G.S. Control procedures for residuals associated with principal component analysis. Technometrics **1979**, 21, 341–349.
- Lijun, C.; Xiyin, L.; Tiejun, Z.; Zhongping, Z.; Aiyong, L. A data stream outlier delection algorithm based on reverse k nearest neighbors. In Proceedings of the 2010 International Symposium on Computational Intelligence and Design, Hangzhou, China, 29–31 October 2010; Volume 2, pp. 236–239.
- Rizwan, R.; Khan, F.A.; Abbas, H.; Chauhdary, S.H. Anomaly detection in wireless sensor networks using immune-based bioinspired mechanism. Int. J. Distrib. Sens. Netw. **2015**, 11, 684952.
- Abukhalaf, H.; Wang, J.; Zhang, S. Outlier detection techniques for localization in wireless sensor networks: A survey. Int. J. Future Gener. Commun. Netw. **2015**, 8, 99–114.
- Egilmez, H.E.; Ortega, A. Spectral anomaly detection using graph-based filtering for wireless sensor networks. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 1085–1089.
- Al-Sarem, M.; Boulila, W.; Al-Harby, M.; Qadir, J.; Alsaeedi, A. Deep Learning-Based Rumor Detection on Microblogging Platforms: A Systematic Review. IEEE Access **2019**, 7, 152788–152812.

Reference | Definition |
---|---|
[11] | “A process to identify data points that are very different from the rest of the data based on a certain measure.” |
[12] | “An observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism.” |
[13] | “An observation that deviates a lot from other observations and can be generated by a different mechanism.” |
[14] | “An outlier is an observation or subset of observations that appears to be inconsistent with the rest of the set of data.” |
[15] | “An outlier is a data point which is significantly different from other data points, or does not conform to the expected normal behavior, or conforms well to a defined abnormal behavior.” |
[16] | “A spatial-temporal point, which non-spatial attribute values are significantly different from those of other spatially and temporally referenced points in its spatial or/and temporal neighborhoods, is considered as a spatial-temporal outlier.” |
[17] | “A point is considered to be an outlier if, in some lower-dimensional projection, it is present in a local region of abnormal low density.” |
[18] | “If the removal of a point from the time sequence results in a sequence that can be represented more briefly than the original one, then the point is an outlier.” |
[19] | “Outliers are points that do not belong to clusters of a dataset or clusters that are significantly smaller than other clusters.” |
[15] | “Outliers are points that lie in the lower local density with respect to the density of their local neighborhoods.” |

Inclusion Criteria | Exclusion Criteria |
---|---|

Studies written in English | Studies whose full text is not available |

Studies published between 2004 and 2018 | Duplicated studies |

Studies published in the selected databases listed above | Studies that are not related to outlier detection in the wireless network domain |

Studies that provide answers to the research questions | Articles that do not match the inclusion criteria |

2018 | 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

[2] | ||||||||||||||

[4] | ||||||||||||||

[5] | ||||||||||||||

[8] | [22] | |||||||||||||

[6] | [36] | [37] | ||||||||||||

[38] | [39] | [40] | ||||||||||||

[20] | [41] | [42] | [43] | [44] | [45] | |||||||||

[46] | [47] | [48] | [49] | [50] | [51] | [52] | ||||||||

[53] | [54] | [55] | [56] | [57] | [26] | [58] | [59] | [60] | [11] | [61] | ||||

[62] | [63] | [64] | [13] | [65] | [66] | [67] | [68] | [59] | [69] | [70] | [71] | |||

[72] | [73] | [74] | [75] | [76] | [77] | [78] | [1] | [79] | [80] | [81] | [82] | [83] | ||

[84] | [85] | [86] | [87] | [88] | [89] | [90] | [91] | [92] | [93] | [94] | [95] | [96] | [17] | |

[97] | [98] | [99] | [100] | [101] | [102] | [103] | [104] | [105] | [106] | [107] | [108] | [109] | [24] | |

[110] | [111] | [112] | [113] | [114] | [115] | [116] | [117] | [118] | [119] | [120] | [121] | [122] | [123] | |

[124] | [125] | [126] | [127] | [128] | [129] | [130] | [131] | [132] | [133] | [134] | [135] | [136] | [137] | [138] |

S-ID | Reference | Year | Type | Methodology | Taxonomy | Dataset |
---|---|---|---|---|---|---|

S1 | [53] | 2018 | Journal | Support Vector Machines | Classification | Smart city dataset |

S2 | [87] | 2015 | Journal | Quarter-sphere support vector machine (QSSVM) | Classification | – |

S3 | [55] | 2015 | Journal | Bayesian network | Classification | Mica2Dot sensor nodes dataset at Berkeley Lab |

S4 | [89] | 2013 | Journal | Survey | – | – |

S5 | [64] | 2016 | Conference | Support vector machine technique within a sliding window-based learning algorithm | Classification and Spectral Decomposition | Univariate datasets: an artificial dataset plus the Well-Log and Dow Jones datasets |

S6 | [74] | 2016 | Journal | Support vector machine and a sliding window learning | Classification and Spectral Decomposition | Benchmark three-tank system |

S7 | [101] | 2014 | Journal | Review Paper | – | – |

S8 | [47] | 2016 | Conference | Nearest neighbor | Classification | Intel Berkeley base |

S9 | [13] | 2015 | Journal | Naïve Bayesian | Classification | Intel Berkeley Research Lab |

S10 | [41] | 2016 | Conference | Kernel principal component analysis (KPCA) | Statistical | Intel Berkeley (IBRL), Grand-St-Bernard (GStB), and Sensorscope (LUCE) |

S11 | [92] | 2010 | Journal | Rule, LLSE, time series forecasting, and HMMs | – | Sensor Scope, INTEL Lab, GDI, NAMOS |

S12 | [6] | 2010 | Journal | Survey | – | – |

S13 | [26] | 2012 | Journal | Survey | – | – |

S14 | [75] | 2015 | Journal | KPCA based Mahalanobis kernel | Statistical and Classification | Intel Berkeley Research Lab (IBRL), Grand St. Bernard (GStB), Sensorscope Lausanne Urban Canopy Experiment (LUCE) |

S15 | [86] | 2016 | Journal | STODM algorithm and fuzzy logic | – | St. Bernard wireless sensor network |

S16 | [119] | 2009 | Conference | Review | – | – |

S17 | [71] | 2006 | Journal | K-Means | Clustering | Dataset generated from multivariate Gaussian distribution |

S18 | [99] | 2016 | Journal | Cross calibration | – | Simulated dataset |

S19 | [109] | 2006 | Journal | Localized fault detection | – | – |

S20 | [137] | 2005 | Journal | Bayesian algorithm | Classification based | – |

S21 | [114] | 2014 | Journal | Multi criteria | Statistical | Real-world dataset |

S22 | [24] | 2005 | Conference | Boundary detection | – | – |

S23 | [115] | 2013 | Journal | Fault-tolerant | Clustering | – |

S24 | [112] | 2016 | Journal | Data compression | – | Events collected data from different locations |

S25 | [113] | 2015 | Journal | Automatic data quality control | – | Real dataset |

S26 | [138] | 2004 | Journal | Survey | – | – |

S27 | [133] | 2009 | Journal | Pareto algebra | Statistical | Simulated dataset |

S28 | [46] | 2017 | Journal | Dynamically aggregated neighboring information | Nearest Neighbor based | Dataset from Sensor Scope Grand St. Bernard scenario |

S29 | [131] | 2011 | Journal | – | – | – |

S30 | [8] | 2009 | Journal | Survey | – | – |

S31 | [82] | 2007 | Journal | Aggregation tree | Classification | Dataset provided by Berkeley research lab |

S32 | [37] | 2006 | Journal | Bayesian and Neyman Pearson | Statistical | Simulated dataset |

S33 | [139] | 2013 | Journal | Survey | – | – |

S34 | [108] | 2007 | Conference | Bayesian | Classification | Simulated dataset and actual environmental dataset collected in the forest |

S35 | [90] | 2012 | Journal | Hierarchical Bayesian spatio temporal (HBST) modeling | Classification | Three simulated datasets |

S36 | [140] | 2014 | Journal | Systematic Literature Review | – | – |

S37 | [22] | 2006 | Conference | Clustering | Clustering | Simulated dataset from the Great Duck Island project |

S38 | [128] | 2014 | Journal | Review of anomaly detection methods | – | – |

S39 | [116] | 2012 | Journal | Support vector machine | Classification | Synthetic and real datasets |

S40 | [11] | 2007 | Journal | Histogram | Statistical | Dataset of temperature records |

S41 | [58] | 2010 | Journal | Bayesian network | Classification | Simulated dataset |

S42 | [107] | 2008 | Journal | Bayesian network | Classification | Dataset gathered from deployed sensor networks in existing Australian coal mines |

S43 | [70] | 2007 | Journal | Two localized algorithms | – | Simulated dataset |

S44 | [48] | 2013 | Journal | k-nearest neighbor | Nearest neighbor | Real WSN dataset |

S45 | [93] | 2009 | Journal | Clustering | Clustering | Dataset acquired from the UCI Machine Learning Repository |

S46 | [67] | 2012 | Journal | Time series analysis and geostatistics | Statistical | Real dataset from the Swiss Alps |

S47 | [54] | 2017 | Journal | Bayesian network | Classification | Dataset of Intel Lab |

S48 | [63] | 2017 | Journal | Support vector machine | Classification | UCI dataset and IBRL dataset of WSNs |

S49 | [85] | 2017 | Journal | Segment-based anomaly detection | – | Dataset of real-world received signal strength (RSS) |

S50 | [73] | 2017 | Conference | Five different classifiers: Bayesian network, neural network, nearest neighbors, support vector machine, and decision tree | Classification | Dataset from WSN in static and dynamic environments |

S51 | [141] | 2014 | Journal | Voronoi diagram based network | – | IBRL dataset |

S52 | [1] | 2011 | Journal | Survey | – | – |

S53 | [36] | 2009 | Journal | Non-parametric and unsupervised methods | Statistical | Simulated data |

S54 | [120] | 2008 | Journal | Survey | – | – |

S55 | [78] | 2012 | Journal | Support vector machine | Classification | Two synthetic datasets and a real dataset gathered at the Grand St. Bernard, Switzerland |

S56 | [5] | 2006 | Conference | Non Parametric | Statistical | Simulated dataset and real dataset from Pacific Northwest region |

S57 | [25] | 2009 | Conference | Bayesian network and support vector machine | Classification | Simulated dataset and real dataset from Grand-St-Bernard, Switzerland |

S58 | [69] | 2008 | Conference | Quarter sphere Support vector machine (SVM) | Classification | Dataset of Intel Berkeley Research Laboratory |

S59 | [123] | 2005 | Conference | Gaussian distribution | Statistical | Simulated data |

S60 | [105] | 2010 | Conference | Survey | – | – |

S61 | [122] | 2006 | Journal | Lightweight methods | – | Simulated data |

S62 | [118] | 2010 | Conference | Discrete Wavelet Transform (DWT) and the self-organizing map (SOM) | Classification | Synthetic dataset and actual dataset collected from a wireless sensor network |

S63 | [43] | 2010 | Journal | Neighborhood | Nearest neighbor | Real life datasets (Annealing and Cancer) |

S64 | [132] | 2010 | Conference | Optimization | – | Real and synthetic datasets |

S65 | [121] | 2007 | Conference | Quarter sphere support vector machines | Classification | Real dataset gathered from a deployment of wireless sensors in the Great Duck Island project |

S66 | [134] | 2008 | Conference | Hyperellipsoidal support vector machine | Classification | Real dataset from the Great Duck Island Project |

S67 | [68] | 2010 | Journal | Support vector machine (CESVM) and one class quarter sphere support vector machine | Classification | Synthetic and real datasets: GDI, Ionosphere, Banana, and Synth |

S68 | [83] | 2006 | Conference | Bayesian Networks | Classification | Habitat monitoring dataset from Great Duck Island |

S69 | [106] | 2009 | Conference | One class support vector machine | Classification | Synthetic and real datasets of the Sensor Scope System |

S70 | [61] | 2006 | Conference | Wavelet based outlier correction and DTW distance | – | Simulated dataset |

S71 | [40] | 2006 | Journal | Outlier detection algorithm | Statistical | – |

S72 | [142] | 2007 | Journal | Kernel density estimation and micro-cluster | Statistical and Classification | – |

S73 | [57] | 2013 | Journal | k-nearest neighbor | Clustering based and nearest neighbor based | Intel Berkeley research lab and synthetic dataset |

S74 | [136] | 2006 | Conference | Unsupervised learning | Classification and nearest neighbor | Dataset of the KDD-Cup 1999 network |

S75 | [143] | 2007 | Journal | Hierarchical Bayesian model within a decision theoretic framework | Classification | Simulated dataset |

S76 | [51] | 2008 | Conference | Density estimation | Statistical | Datasets from the UCI Machine Learning Repository and a number of synthetic datasets |

S77 | [81] | 2008 | Conference | One class Support Vector machine | Classification | Simulated datasets |

S78 | [45] | 2006 | Conference | Linear regression and control chart | Statistical | Air pollution observations recorded in Kuala Lumpur |

S79 | [144] | 2007 | Journal | One sided and two sided median | Statistical | Dataset of a flight data recorder (FDR) |

S80 | [145] | 2006 | Journal | Bayesian network | Classification | Simulated datasets |

S81 | [8] | 2009 | Journal | Survey | – | – |

S82 | [146] | 2005 | Journal | Local search heuristic | – | Real life datasets (lymphography and cancer) and synthetic datasets |

S83 | [39] | 2009 | Journal | Sequence-based method | Statistical | Real life datasets (lymphography and cancer) |

S84 | [130] | 2012 | Journal | Neural Network and Rough set | Classification | Simulated Data |

S85 | [92] | 2010 | Journal | Rules, Time series analysis, learning, and estimation methods | – | Real World datasets |

S86 | [147] | 2006 | Conference | Dynamic Bayesian networks | Classification | SERF windspeed sensor dataset streams from Corpus Christi Bay |

S87 | [148] | 2011 | Journal | Bayesian networks | Classification | – |

S88 | [149] | 2006 | Conference | Neighboring network | Nearest neighbor | Meteorological dataset from various neighboring ground stations on the island of Crete, Greece |

S89 | [60] | 2008 | Journal | Distance | Nearest neighbor | Real and synthetic datasets |

S90 | [38] | 2010 | Journal | Kernel Density Estimation | Statistical | Real dataset from Intel Berkeley Research lab |

S91 | [49] | 2010 | Journal | Fuzzy clustering | Clustering | Three datasets |

S92 | [150] | 2011 | Journal | Support vector machine | Classification | Real dataset collected from a closed neighborhood from a WSN deployed in Grand-St-Bernard |

S93 | [151] | 2011 | Journal | Clustering | Clustering | Real dataset obtained from Intel Lab’s web site and synthetic dataset |

S94 | [80] | 2009 | Conference | Hyper-ellipsoidal | Clustering | Real life dataset called the IBRL and a synthetic dataset |

S95 | [44] | 2009 | Journal | Statistical analysis | Statistical | Dataset from a real sensor network obtained from the Intel Berkeley Research Laboratory (IBRL) |

S96 | [42] | 2013 | Journal | Linear regression | Statistical | Real medical dataset with many (both real and synthetic) anomalies |

S97 | [56] | 2014 | Journal | Hyperspherical clusters | Clustering | Two real sensor network deployment datasets and two synthetic datasets for evaluation purposes, namely the IBRL, GDI, Banana and Gaussmix datasets |

S98 | [77] | 2013 | Journal | Fuzzy clustering | Clustering and Statistical | Real dataset from 54 sensors deployed at the Intel Berkeley Research Lab and artificial datasets from Intel Lab |

S99 | [66] | 2013 | Journal | Principal component analysis (PCA) | Clustering | Real sensed dataset collected by 54 Mica2Dot sensors deployed in Intel Berkeley Research Lab |

S100 | [152] | 2006 | Journal | Support vector machine | Classification | Dataset of the wide area airborne mine detection (WAAMD) and hyperspectral digital imagery collection experiment (HYDICE) |

S101 | [153] | 2011 | Conference | Tukey and relative entropy statistics | Statistical | Dataset from RUBiS |

S102 | [59] | 2009 | Journal | Sequence Miner | Clustering | Synthetic dataset and real dataset |

S103 | [129] | 2013 | Journal | Survey | – | – |

S104 | [76] | 2014 | Journal | Decision tree | Classification | Intel Berkeley lab dataset |

S105 | [27] | 2011 | Conference | Temporal technique | Statistical technique combined with nearest neighbor technique | Real datasets in different fields |

S106 | [20] | 2017 | Journal | Survey | – | – |

S107 | [62] | 2018 | Journal | DUCF protocol based on a fuzzy logic inference system | Clustering | Real dataset |

S108 | [72] | 2018 | Journal | Case Study | Machine learning and Classification | – |

S109 | [84] | 2018 | Journal | Support vector machine | Classification | Real dataset |

S110 | [126] | 2016 | Journal | Kriging | Clustering | Real dataset |

S111 | [127] | 2015 | Journal | Support vector machine | Classification | Multiparameter Intelligent Monitoring in Intensive Care (MIMIC) dataset |

S112 | [98] | 2017 | Conference | Support vector machine | Classification | IBRL dataset |

S113 | [97] | 2018 | Conference | Support vector machine | Classification | IBRL real dataset |

S114 | [111] | 2017 | Conference | Support vector machine | Classification | Intel Berkeley Research Laboratory (IBRL) |

S115 | [125] | 2017 | Conference | Neural network | Classification | Real dataset |

S116 | [110] | 2018 | Conference | Bayesian network | Classification | – |

S117 | [124] | 2018 | Journal | K-medoids | Clustering | Synthetic datasets provided by NS2 and R studio |

S-ID | QA1 | QA2 | QA3 | QA4 | Score |
---|---|---|---|---|---|

S1 | 2 | 2 | 2 | 2 | 8 |

S2 | 2 | 2 | 2 | 2 | 8 |

S3 | 2 | 1 | 1 | 2 | 6 |

S4 | 2 | 2 | 1 | 1 | 6 |

S5 | 2 | 2 | 1 | 2 | 7 |

S6 | 2 | 1 | 2 | 2 | 7 |

S7 | 1 | 2 | 1 | 1 | 5 |

S8 | 2 | 2 | 2 | 2 | 8 |

S9 | 2 | 2 | 2 | 2 | 8 |

S10 | 2 | 2 | 2 | 2 | 8 |

S11 | 1 | 1 | 2 | 2 | 6 |

S12 | 2 | 1 | 2 | 2 | 7 |

S13 | 2 | 2 | 2 | 2 | 8 |

S14 | 2 | 2 | 2 | 2 | 8 |

S15 | 2 | 1 | 2 | 2 | 7 |

S16 | 2 | 2 | 2 | 2 | 8 |

S17 | 2 | 2 | 2 | 2 | 8 |

S18 | 2 | 1 | 2 | 2 | 7 |

S19 | 2 | 2 | 2 | 2 | 8 |

S20 | 2 | 2 | 2 | 2 | 8 |

S21 | 2 | 1 | 2 | 2 | 7 |

S22 | 2 | 2 | 2 | 2 | 8 |

S23 | 2 | 2 | 2 | 2 | 8 |

S24 | 2 | 1 | 2 | 2 | 7 |

S25 | 2 | 2 | 2 | 2 | 8 |

S26 | 2 | 2 | 2 | 2 | 8 |

S27 | 2 | 1 | 2 | 2 | 7 |

S28 | 2 | 2 | 2 | 2 | 8 |

S29 | 2 | 2 | 2 | 2 | 8 |

S30 | 2 | 1 | 2 | 2 | 7 |

S31 | 2 | 2 | 2 | 2 | 8 |

S32 | 2 | 2 | 2 | 2 | 8 |

S33 | 2 | 1 | 2 | 2 | 7 |

S34 | 2 | 2 | 2 | 2 | 8 |

S35 | 2 | 2 | 2 | 2 | 8 |

S36 | 2 | 1 | 2 | 2 | 7 |

S37 | 2 | 2 | 2 | 2 | 8 |

S38 | 2 | 2 | 2 | 2 | 8 |

S39 | 2 | 1 | 2 | 2 | 7 |

S40 | 2 | 2 | 2 | 2 | 8 |

S41 | 2 | 2 | 2 | 2 | 8 |

S42 | 2 | 1 | 2 | 2 | 7 |

S43 | 1 | 2 | 1 | 1 | 5 |

S44 | 2 | 1 | 1 | 1 | 4 |

S45 | 1 | 2 | 1 | 1 | 5 |

S46 | 1 | 1 | 2 | 1 | 5 |

S47 | 1 | 2 | 1 | 2 | 6 |

S48 | 2 | 2 | 1 | 2 | 7 |

S49 | 2 | 1 | 1 | 1 | 5 |

S50 | 2 | 1 | 2 | 1 | 6 |

S51 | 1 | 1 | 1 | 1 | 4 |

S52 | 1 | 1 | 1 | 1 | 4 |

S53 | 1 | 2 | 2 | 1 | 6 |

S54 | 1 | 2 | 1 | 1 | 5 |

S55 | 2 | 1 | 2 | 1 | 6 |

S56 | 2 | 1 | 1 | 1 | 5 |

S57 | 2 | 1 | 1 | 2 | 6 |

S58 | 2 | 1 | 1 | 1 | 5 |

S59 | 2 | 1 | 1 | 2 | 6 |

S60 | 2 | 1 | 2 | 2 | 7 |

S61 | 2 | 2 | 2 | 1 | 7 |

S62 | 2 | 1 | 1 | 1 | 5 |

S63 | 2 | 1 | 1 | 2 | 6 |

S64 | 2 | 1 | 2 | 2 | 7 |

S65 | 2 | 1 | 1 | 2 | 6 |

S66 | 2 | 2 | 2 | 1 | 7 |

S67 | 2 | 1 | 1 | 1 | 5 |

S68 | 1 | 2 | 2 | 2 | 7 |

S69 | 2 | 2 | 1 | 2 | 7 |

S70 | 2 | 1 | 1 | 1 | 5 |

S71 | 2 | 1 | 1 | 1 | 5 |

S72 | 2 | 2 | 1 | 1 | 6 |

S73 | 2 | 2 | 2 | 2 | 8 |

S74 | 2 | 1 | 1 | 2 | 6 |

S75 | 2 | 1 | 1 | 1 | 5 |

S76 | 1 | 2 | 2 | 1 | 6 |

S77 | 1 | 2 | 1 | 1 | 5 |

S78 | 2 | 1 | 1 | 1 | 5 |

S79 | 2 | 1 | 2 | 2 | 7 |

S80 | 1 | 2 | 2 | 1 | 6 |

S81 | 1 | 2 | 2 | 2 | 7 |

S82 | 2 | 1 | 2 | 1 | 6 |

S83 | 1 | 1 | 2 | 1 | 5 |

S84 | 1 | 1 | 0 | 1 | 3 |

S85 | 1 | 2 | 0 | 1 | 4 |

S86 | 1 | 1 | 1 | 1 | 4 |

S87 | 1 | 1 | 1 | 1 | 4 |

S88 | 1 | 2 | 0 | 1 | 4 |

S89 | 2 | 2 | 1 | 1 | 6 |

S90 | 1 | 1 | 1 | 1 | 4 |

S91 | 2 | 1 | 1 | 1 | 5 |

S92 | 2 | 1 | 1 | 1 | 5 |

S93 | 1 | 1 | 1 | 1 | 4 |

S94 | 2 | 0 | 1 | 1 | 4 |

S95 | 2 | 1 | 1 | 1 | 5 |

S96 | 2 | 2 | 2 | 2 | 8 |

S97 | 1 | 1 | 2 | 1 | 5 |

S98 | 2 | 1 | 1 | 1 | 5 |

S99 | 2 | 1 | 1 | 2 | 6 |

S100 | 2 | 1 | 1 | 1 | 5 |

S101 | 2 | 1 | 1 | 1 | 5 |

S102 | 2 | 2 | 1 | 1 | 6 |

S103 | 2 | 1 | 1 | 2 | 6 |

S104 | 2 | 2 | 1 | 1 | 6 |

S105 | 2 | 2 | 2 | 1 | 7 |

S106 | 1 | 1 | 2 | 2 | 6 |

S107 | 2 | 1 | 2 | 1 | 6 |

S108 | 2 | 2 | 1 | 2 | 7 |

S109 | 1 | 2 | 2 | 1 | 6 |

S110 | 1 | 2 | 1 | 1 | 5 |

S111 | 2 | 1 | 1 | 1 | 5 |

S112 | 2 | 1 | 2 | 2 | 7 |

S113 | 1 | 1 | 2 | 1 | 5 |

S114 | 2 | 1 | 1 | 1 | 5 |

S115 | 2 | 1 | 1 | 2 | 6 |

S116 | 1 | 1 | 2 | 2 | 6 |

S117 | 2 | 1 | 2 | 1 | 6 |
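
In the rows above, the Score column appears to be the sum of the four quality-assessment ratings QA1 to QA4. The following minimal sketch recomputes that sum and lists rows whose stated score differs from it. It assumes each rating is 0, 1, or 2 (the only values that occur in the table); the function names `quality_score` and `mismatched_rows` are hypothetical and are not taken from the reviewed study.

```python
def quality_score(qa_ratings):
    """Sum four QA ratings, each assumed to be 0, 1, or 2."""
    if len(qa_ratings) != 4:
        raise ValueError("expected four QA ratings")
    for rating in qa_ratings:
        if rating not in (0, 1, 2):
            raise ValueError(f"rating out of range: {rating}")
    return sum(qa_ratings)


def mismatched_rows(rows):
    """Return the study IDs whose listed score differs from the recomputed sum."""
    return [sid for sid, ratings, listed in rows
            if quality_score(ratings) != listed]


# A few rows copied from the table above:
# (S-ID, (QA1, QA2, QA3, QA4), listed Score).
rows = [
    ("S1", (2, 2, 2, 2), 8),
    ("S7", (1, 2, 1, 1), 5),
    ("S44", (2, 1, 1, 1), 4),
    ("S84", (1, 1, 0, 1), 3),
]
print(mismatched_rows(rows))
```

On these four rows the check reports only S44, whose ratings sum to 5 while the table lists 4; the table alone does not indicate which of the two values is the intended one.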

Extracted Data | Description |
---|---|

Study ID | Unique identity for each article |

Authors | Authors’ names |

Year | Publication year |

Type | Journal or conference |

Methodology | e.g., Bayesian network, k-nearest neighbor (kNN), support vector machine, etc. |

Taxonomy | Comparative techniques that are addressed in each paper |

Datasets | e.g., simulated data, real data, etc. |

References | Detection Technique | Outlier Dimension: Univariate | Outlier Dimension: Multivariate | Detection Mode: Online | Detection Mode: Offline | Detection Model: Local | Detection Model: Global | Detection Model: Centralized |
---|---|---|---|---|---|---|---|---|

[104] | Clustering | x | ✓ | x | ✓ | x | ✓ | x |

[57] | Hybrid | ✓ | x | x | ✓ | x | ✓ | x |

[94] | Statistical | - | - | x | ✓ | ✓ | ✓ | x |

[150] | Classification | x | ✓ | ✓ | x | ✓ | x | x |

[38] | Statistical | ✓ | x | x | ✓ | - | - | - |

[188] | Nearest neighbor | - | - | x | ✓ | x | ✓ | x |

[100] | Nearest neighbor | - | - | x | ✓ | x | x | ✓ |

[67] | Statistical | - | - | ✓ | x | x | ✓ | x |

[47] | Nearest neighbor | - | - | x | ✓ | x | ✓ | x |

[96] | Clustering | - | - | x | ✓ | ✓ | ✓ | x |

[79] | Classification | x | ✓ | - | - | ✓ | x | x |

[102] | Hybrid | x | ✓ | ✓ | x | - | - | - |

[65] | Classification | - | - | ✓ | x | - | - | - |

[55] | Classification | - | - | ✓ | x | - | - | - |

[103] | Hybrid | ✓ | x | - | - | ✓ | ✓ | x |

[80] | Clustering | x | - | x | ✓ | x | ✓ | x |

References | Detection Technique | Characteristics | Usability/Limitations |
---|---|---|---|

[189] | kNN | The complexity of this technique depends on the number of dimensions | Valuable, scalable, efficient, and human-independent solution |

[88,190] | Spectral | The detection performance depends heavily on the choice of features and distance measure | Robust to parameter perturbations, with good performance across different anomaly scoring metrics |

[117,191] | Gaussian | Use of the spatial correlation to determine outlying sensors and event boundaries | Accuracy is relatively low because the temporal correlation of sensor readings is ignored |

[89,191] | Non-Gaussian | Use of the spatio-temporal correlations of data to locally detect outliers | Reduction of the communication cost (due to local transmission) and of the computational cost (due to the execution of tasks by the cluster-heads) |

[138,191] | Kernel | Use of a kernel density estimator to approximate the underlying distribution of sensor data | High dependency on threshold definition (choosing an appropriate threshold is difficult, and a single threshold may not be suitable for outlier detection in multi-dimensional data) |

[20,191] | Histogram | Reduction of the communication cost by collecting histogram information rather than collecting raw data for centralized processing | The collection of more histogram information from the whole network will cause a communication overhead. In addition, this technique only considers one-dimensional data |

[89,191] | Naïve Bayesian Network | Computation of the probabilities of each node locally | The spatial neighborhood under the dynamic change of network topology is not specified. In addition, this technique deals only with one-dimensional data |

[89,191] | Bayesian Network (BN) | Use of BN to capture the spatio-temporal correlations that exist between the observations of sensor nodes and the conditional dependence between the observations of sensor attributes | Improvement of the accuracy in detecting outliers as it considers conditional dependencies between the attributes |

[89,191] | Dynamic Bayesian Network | Identification of outliers by computing the posterior probability of the most recent data values in a sliding window | Possibility of operation on several data streams at once |

[26,89,191] | Support vector machine | Mapping of the data into a higher dimensional feature space where it can be easily separated by a hyperplane | Outliers are identified from measurements collected over a long time window, so detection is not performed in real time. In addition, this technique ignores the spatial correlation between neighboring nodes, which leads to inaccurate detection of local outliers |
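
To make the kernel row of the table above more concrete, the following is a minimal sketch of density-based outlier flagging on one-dimensional sensor readings. It is not the method of any cited study: the Gaussian kernel, the leave-one-out evaluation, the function names, the `bandwidth` and `threshold` defaults, and the sample readings are all assumptions chosen for illustration.

```python
import math


def gaussian_kde_density(x, samples, bandwidth):
    """Estimate the density at x from samples using a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )


def kde_outliers(readings, bandwidth=1.0, threshold=0.05):
    """Return indices of readings whose estimated density is below threshold."""
    flagged = []
    for i, x in enumerate(readings):
        # Leave the reading itself out so it cannot support its own density.
        others = readings[:i] + readings[i + 1:]
        if gaussian_kde_density(x, others, bandwidth) < threshold:
            flagged.append(i)
    return flagged


# Hypothetical temperature readings (degrees Celsius) with one spike.
readings = [21.0, 21.3, 20.8, 21.1, 35.0, 21.2, 20.9]
print(kde_outliers(readings))                 # flags only the spike at index 4
print(kde_outliers(readings, threshold=1.0))  # an overly high threshold flags every reading
```

The second call illustrates the limitation noted in the table: the result depends strongly on the chosen threshold, and a value that suits one data stream may flag everything, or nothing, on another.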

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Safaei, M.; Asadi, S.; Driss, M.; Boulila, W.; Alsaeedi, A.; Chizari, H.; Abdullah, R.; Safaei, M.
A Systematic Literature Review on Outlier Detection in Wireless Sensor Networks. *Symmetry* **2020**, *12*, 328.
https://doi.org/10.3390/sym12030328
