Wireless sensor networks (WSNs) have gained popularity in recent years as the world embraces Internet of Things (IoT) applications as part of the fourth industrial revolution [1]. The sensors in WSN applications are used to gather data about the environment, and these nodes communicate with each other, and possibly with a base station, in order to collaboratively monitor the application environment and make “intelligent” decisions in reaction to the sensed conditions [5]. A problem arises in WSNs where, due to the nature of these devices, data can be lost or corrupted during the transmission phase because of external interference on the communication line or malfunctioning sensors that produce faulty, unreliable data [6]. Furthermore, the cost of implementing or replacing many physical sensors in a network can become prohibitively expensive.
The integrity of received information is an important issue in the modern age. Sensors play a pivotal role in electronic devices of all shapes, sizes and functions, especially in WSNs where data loss is an expected occurrence [7]. Data integrity in WSNs is not only an issue with regard to naturally occurring external factors; there are also sinister threats to these vulnerable devices. The practical limitations of these devices mean that they are vulnerable to security attacks from people with malicious intent [8]. This means that an attacker may be able to seize control of one or more sensor nodes and alter the data in order to manipulate the system into making potentially disastrous decisions which could negatively affect the application environment [9].
In network security, there is a heavy emphasis on preventative security mechanisms which provide an external security perimeter to prevent an attacker from gaining access to the system [10]. When these preventative security mechanisms fail, detection mechanisms, diversionary tactics and countermeasures can be used to limit the potential damage an attacker can cause [11]. Intrusion Detection and Prevention Systems (IDPS) can thus be used in WSNs to detect and limit the damage of successful attacks by, for example, disregarding sensor readings from suspicious nodes [12]. Another important research focus is locating the malicious node, which is not trivial in resource-constrained environments [13]. This is because these countermeasures may consume the limited system resources which are required by the main application.
Data imputation is a method that allows a system to counteract the effect of data loss by accurately substituting the values that the real sensors would most likely have returned [16]. This allows the system to still make use of incomplete data rather than completely discarding affected data entries. In the previous example, this means that the IDPS would be able to disregard the suspicious sensor readings without significantly affecting the system’s performance. The focus of this paper is not on the detection of intrusions, but rather on a possible countermeasure that can be used once an attack has been successfully detected. This countermeasure can also be used when a node is disconnected from the system or produces faulty readings in non-malicious circumstances. Data imputation is traditionally used to replace missing/faulty sensor values at particular time instances. This is possible when the previous values of the affected sensor are used to infer what the current value should be. For algorithms that use the other sensor values to infer what the affected sensor value should be, a separate imputation/filtering technique is necessary to infer values at specific time instances, because these algorithms require the remaining sensor values to be accurate in order to effectively infer the missing values. These types of algorithms are more suitable as virtual sensors, which replace the affected sensor values for a specified time period instead of only at specific time instances.
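To make this distinction concrete, the short Python sketch below (with hypothetical readings and an assumed set of learned weights) contrasts a time-instance estimate built from a node's own history with a cross-sensor estimate that relies on the other nodes' current readings being trustworthy:

# Minimal illustration (hypothetical data): two ways to fill a missing reading.
import numpy as np

history = np.array([21.3, 21.4, 21.4, 21.5])   # recent readings of the affected node
neighbours = np.array([21.9, 20.8, 21.6])      # current readings of the other nodes

# (1) Time-instance imputation: use the node's own history (e.g., a moving average).
estimate_from_history = history[-3:].mean()

# (2) Cross-sensor (virtual sensor) estimate: use the other nodes' current values.
#     This only works if the neighbouring readings are themselves reliable.
weights = np.array([0.4, 0.25, 0.35])          # assumed, previously learned relationship
estimate_from_neighbours = float(weights @ neighbours)

print(estimate_from_history, estimate_from_neighbours)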
Traditionally, statistical techniques were the preferred choice to impute data, but in recent years machine learning has been increasingly applied to the field [17]. These machine learning techniques have been shown to be more accurate and robust than their statistical counterparts [19]. K-Nearest Neighbours (KNN: lazy learning) [20], multi-layered perceptrons (MLP: supervised learning) [21] and self-organizing maps (SOM: unsupervised learning) [22] are three popular machine learning methods that have been used to great effect to solve the missing data problem in various applications. These algorithms were all able to outperform traditional imputation techniques such as hot-swapping by significant margins in applications such as breast cancer detection, seed classification and sonar imaging. Most of the published work in this regard applies data mining techniques to incomplete datasets, while the proposed work focuses on the real-time imputation of sensor values in resource-constrained WSNs. This is important because sensor nodes are both vulnerable to security attacks and prone to random non-malicious failures [23]. It is not always possible to expeditiously thwart a security attack or replace faulty nodes, so a temporary solution is required. The proposed solutions should not have a large enough overhead to interfere with the main application of the WSN, as that would render them infeasible in practice. The resource limitations of these systems, however, make this challenging because traditional network security approaches are mostly not applicable in this setting [24].
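As an illustrative example of how one of these methods is typically applied to missing data (not the specific implementations evaluated in the cited works), the following Python sketch uses scikit-learn's KNNImputer to fill a lost reading from the most similar complete records; the dataset and parameter choices are assumptions:

# Illustrative KNN-based imputation using scikit-learn (hypothetical readings).
import numpy as np
from sklearn.impute import KNNImputer

# Rows are time steps, columns are three temperature sensors; np.nan marks a lost reading.
readings = np.array([
    [21.1, 20.9, 21.4],
    [21.3, 21.0, 21.6],
    [21.2, np.nan, 21.5],   # sensor 2 value missing at this time step
    [21.4, 21.1, 21.7],
])

imputer = KNNImputer(n_neighbors=2)      # average the 2 most similar complete rows
completed = imputer.fit_transform(readings)
print(completed[2, 1])                   # imputed estimate for the missing value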
This paper proposes the use of data imputation methods and machine learning in WSNs to realise virtual sensors. These virtual sensors are able to completely replace physical sensor nodes and provide accurate substituted data in place of nodes with failed sensor modules. A Kalman filter is used for the imputation of missing/faulty sensor readings at particular time instances, and a multilayer perceptron is used to infer the virtual sensor values. The former is necessary due to the error-prone sensor readings and anomalous environmental conditions which could affect the virtual sensor predictions. The main function of the proposed system is to ensure the robustness of WSNs by ensuring that damaged or compromised nodes in these systems can be replaced by these machine learning-based virtual sensors. This intervention reduces the effect on system performance of not being able to use the sensor data from the affected nodes, allowing the system to continue operating with little to no degradation while using imputed values that closely resemble the affected sensor nodes’ would-be data.
The rest of this paper is organised as follows: Section 2 gives a brief background of all the relevant topics, while Section 3 broadly describes the proposed system. The detailed design of the system is outlined in Section 4, and the results are presented and discussed in Section 5 and Section 6, respectively. Finally, the paper is concluded in Section 7.
3. System Overview
This paper is largely concerned with identifying the relationship between sensor nodes in a sensor network using data that has been collected by the network itself; hence many of the above algorithms are applicable to the proposed scenario. Multiple temperature sensor nodes were deployed to generate training data, and each sensor node had a corresponding virtual sensor. An FFNN was used to model the relationship between the deployed sensor nodes in the network, and these models were used to create the virtual sensors. For training, a genetic algorithm (GA) was chosen due to experimental evidence showing that this training method converges much faster than the back-propagation algorithm [41]. This reduces the required training time as well as the likelihood of falling into the trap of local optima, which may require a reinitialisation of the training due to stagnant results.
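For illustration, a minimal Python sketch of the kind of fitness evaluation such a GA requires is given below: each candidate's flat weight vector is decoded into a small FFNN, run forward over the training data, and scored by its prediction error. The layer sizes and training pairs are assumptions, not the topology actually used in the experiments:

# Sketch of a GA fitness evaluation for a small FFNN (assumed sizes: 2 inputs,
# one hidden layer of 4 neurons, 1 output). Higher fitness means lower error.
import numpy as np

N_IN, N_HID, N_OUT = 2, 4, 1
N_WEIGHTS = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT  # weights + biases

def forward(weights, x):
    """Run the FFNN encoded by a flat weight vector on the input matrix x."""
    i = 0
    w1 = weights[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    b1 = weights[i:i + N_HID]; i += N_HID
    w2 = weights[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b2 = weights[i:i + N_OUT]
    hidden = np.tanh(x @ w1 + b1)
    return hidden @ w2 + b2

def fitness(weights, x_train, y_train):
    """Negative mean squared error: higher is fitter."""
    predictions = forward(weights, x_train)
    return -np.mean((predictions - y_train) ** 2)

# Hypothetical training pair: two neighbouring sensors predict the third.
x_train = np.array([[21.1, 21.4], [21.3, 21.6], [21.2, 21.5]])
y_train = np.array([[20.9], [21.0], [21.0]])
candidate = np.random.default_rng(0).normal(size=N_WEIGHTS)
print(fitness(candidate, x_train, y_train))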
A star topology was chosen for the sensor network, using WiFi communication technology, with the sensor nodes communicating with a central server. Low power usage is not a primary concern in this paper; the focus is on sensor networks where the nodes are not too far apart from each other (<50 m), which means that more expensive technologies such as ZigBee and LoRa are not required. The database of the system will be stored locally by the server and used as training and test data for the VS once enough data has been gathered. A separate database will collect the VS outputs for comparison to the real sensed data once training is complete. Figure 1 shows the conceptual design of the proposed system.
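A minimal sketch of how a node in this star topology could report a reading to the central server over TCP/IP is shown below; the server address, port and JSON message format are assumptions made purely for illustration:

# Minimal sketch of a node reporting a reading to the central server over TCP/IP.
# The server address, port and message format are illustrative assumptions.
import json
import socket
import time

SERVER_ADDRESS = ("192.168.0.10", 5000)   # hypothetical server on the local WiFi network
NODE_ID = "node-1"

def send_reading(temperature_c: float) -> None:
    message = json.dumps({"node": NODE_ID, "temp_c": temperature_c,
                          "timestamp": time.time()}).encode()
    with socket.create_connection(SERVER_ADDRESS, timeout=5) as conn:
        conn.sendall(message)

# Example usage (a real node would read its temperature sensor here):
# send_reading(21.4)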
3.1. Physical Sensor Nodes
The ambient temperature where the sensor nodes would be deployed was not expected to change abruptly under normal conditions, so a sampling rate of one sample every 30 s was chosen. No scaling circuit was designed, as the modified Steinhart-Hart equation [42] was deemed a better software solution than using hardware to convert the sensed values to corresponding temperatures. A filtering algorithm was required to account for the noise that was experienced. This noise was due to temperature anomalies, caused by hot air pockets moving through the buildings, as well as hardware glitches. A scalar Kalman filter (SKF) was designed and implemented in software on the server to deal with these anomalies, as it is accurate as well as computationally simple.
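The sketch below illustrates these two software steps in Python: a thermistor resistance is converted to a temperature using the B-parameter form of the Steinhart-Hart relation (used here only as a stand-in for the modified equation cited above), and the result is smoothed by a scalar Kalman filter. All constants and noise variances are illustrative assumptions:

# Sketch of the sensed-value pipeline: thermistor resistance -> temperature
# (B-parameter Steinhart-Hart form) followed by a scalar Kalman filter.
# All constants below (R0, B, T0, noise variances) are illustrative assumptions.
import math

R0, B, T0 = 10_000.0, 3950.0, 298.15   # 10 kOhm thermistor at 25 C (assumed)

def resistance_to_celsius(r_ohm: float) -> float:
    inv_t = 1.0 / T0 + math.log(r_ohm / R0) / B
    return 1.0 / inv_t - 273.15

class ScalarKalmanFilter:
    """One-dimensional Kalman filter with a constant-value process model."""
    def __init__(self, initial, process_var=0.01, measurement_var=0.25):
        self.x = initial          # current temperature estimate
        self.p = 1.0              # estimate variance
        self.q = process_var      # process noise variance
        self.r = measurement_var  # measurement noise variance

    def update(self, z: float) -> float:
        self.p += self.q                      # predict step (value assumed constant)
        k = self.p / (self.p + self.r)        # Kalman gain
        self.x += k * (z - self.x)            # correct with the new measurement
        self.p *= (1.0 - k)
        return self.x

skf = ScalarKalmanFilter(initial=resistance_to_celsius(11_000.0))
for r in (11_000.0, 10_800.0, 9_500.0, 10_900.0):   # raw resistances, incl. one glitch
    print(skf.update(resistance_to_celsius(r)))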
3.2. Virtual Sensor
For the virtual sensor, a supervised MLP algorithm was used to find a model representation of the relationship between the deployed sensors. An algorithm was implemented to iterate through various topologies, varying the number of hidden nodes and hidden layers, to determine an optimal middle ground where an acceptable accuracy was reached by the MLP while taking training time into consideration. During training, the fittest neural networks are selected for breeding using the GA. Parents are then randomly selected for breeding, irrespective of which parent is the fittest in the subset population. Weights are randomly selected from both parents’ weight arrays to build a child neural network’s weight array, with a small chance of completely randomising a single weight in the array to act as a mutation operator, and this continues until the original population size is reached. The process is repeated over a specified number of epochs until a candidate neural network emerges for a specific VS.
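The following Python sketch shows one plausible implementation of the selection, crossover and mutation steps described above; the population size, elite fraction, mutation rate and fitness function are assumptions rather than the parameters actually used in the experiments:

# Sketch of the GA breeding step described above: keep the fittest networks,
# pair random parents, build children by uniform crossover over the flat weight
# arrays, and occasionally randomise a single weight as a mutation.
# Population size, elite fraction and mutation rate are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
N_WEIGHTS = 17            # length of each network's flat weight array (assumed)
POP_SIZE = 40
ELITE_FRACTION = 0.25
MUTATION_RATE = 0.02

def fitness(weights: np.ndarray) -> float:
    # Placeholder: in the real system this would be the negative prediction
    # error of the FFNN decoded from `weights` on the collected sensor data.
    return -np.sum(weights ** 2)

def evolve(population: np.ndarray) -> np.ndarray:
    scores = np.array([fitness(w) for w in population])
    elite = population[np.argsort(scores)[::-1][:int(POP_SIZE * ELITE_FRACTION)]]
    children = list(elite)
    while len(children) < POP_SIZE:
        p1, p2 = elite[rng.choice(len(elite), size=2, replace=False)]
        mask = rng.random(N_WEIGHTS) < 0.5          # uniform crossover
        child = np.where(mask, p1, p2)
        if rng.random() < MUTATION_RATE:            # mutate one random weight
            child[rng.integers(N_WEIGHTS)] = rng.normal()
        children.append(child)
    return np.stack(children)

population = rng.normal(size=(POP_SIZE, N_WEIGHTS))
for epoch in range(100):
    population = evolve(population)
best = max(population, key=fitness)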
The VSs are deployed on both the server and the sensor nodes, applying the weights that were found during the training phase. The system identifies when a sensor node is not communicating with the server and uses the appropriate VS to take over sensing operations until the sensor node is able to reconnect to the network. Otherwise, the sensor node sends both VS data and physically sensed data back to the server.
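A minimal sketch of the server-side failover decision is given below; the timeout value, data structures and virtual sensor interface are assumptions used only to illustrate the logic:

# Sketch of the server-side failover decision: if a node has not reported within
# a timeout, its virtual sensor supplies the value instead. The 90 s timeout and
# the `virtual_sensors` interface are illustrative assumptions.
import time

REPORT_TIMEOUT_S = 90          # three missed 30 s samples (assumed threshold)
last_report = {}               # node id -> (timestamp, last physical reading)
virtual_sensors = {}           # node id -> callable returning the imputed value

def record_reading(node_id: str, value: float) -> None:
    last_report[node_id] = (time.time(), value)

def current_value(node_id: str) -> float:
    timestamp, value = last_report.get(node_id, (0.0, None))
    if time.time() - timestamp > REPORT_TIMEOUT_S:
        return virtual_sensors[node_id]()      # node silent: use its virtual sensor
    return value                               # node healthy: use the physical reading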
The success of any virtual sensor system with machine learning at the core of its design depends on the successful design and implementation of the chosen machine learning method as well as the quality of the training data used to train the virtual sensors. This was evident in the results obtained for the overall virtual sensor network and its associated subsystems. When the accuracy results are compared with regard to the size of the building, it was found that a virtual sensor network deployed in a large residential building very marginally outperforms one deployed in a small residential building, but the difference was negligible. It was further found that the system was able to impute values with an accuracy of around 95% for both deployed scenarios. The system was trained and evaluated using real sensor readings from the experimental setup described in the previous section. This means that the trained MLPs were deployed on the sensor nodes and could impute values in real time on the physical system. A more detailed discussion of the computational considerations is deferred to later in this section.
When compared to state-of-the-art techniques, it is evident that the proposed approach achieves comparable or better results on average. Zhang [28] evaluated three kNN implementations on two popular datasets. The mean accuracy for all three algorithms on both datasets ranged from 85% in the worst case to 98% in the best case. As mentioned previously, though, the approach proposed in this paper is more suitable for WSNs than kNN because of the resource constraints of the application environment. Folguera et al. [36] proposed an SOM technique for data imputation of physicochemical water parameters and compared it to data imputed by expert inference. The expert data yielded a mean accuracy of around 84%, while the proposed SOM technique achieved a mean accuracy of almost 90%. The approach proposed in this paper is again preferred in the application environment because it is more resource friendly.
The choice of using genetic algorithms to train the MLP proved to be well suited in all three virtual sensor cases, as evidenced by the standard deviation measurements in Table 3 and visualised by the plots shown in Figure 8 and Figure 9. It was seen that the virtual sensors accurately depicted the changes, with very little variation between the measured and imputed temperature values over several days.
It was found that the increase in modern-day computational power results in very fast turnaround times when imputing sensor values using an MLP. Even the 40 MHz PIC32 was able to impute values in under a second, with the longest imputation lasting 0.588 s. On the server side, the desktop PC used for this system was able to impute values in 9.379 ms on average for all three sensors. Depending on the size and topology of the network, imputing the sensor values on the physical sensor node may not always be practical. In large networks where multiple sensor values have to be imputed, an execution time of 0.588 s could cause significant delays which could affect overall system performance. The purpose of the proposed system is to maintain system performance even when some of the sensors are malfunctioning, so if the proposed scheme itself affects overall system performance it would be counterproductive.
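Server-side figures of this kind can be reproduced with a simple timing harness such as the sketch below; the small network and the repetition count are assumptions made for illustration:

# Sketch of how the per-imputation latency could be measured on the server side.
# The small FFNN below and the repetition count are illustrative assumptions.
import time
import numpy as np

rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(2, 4)), rng.normal(size=4)
w2, b2 = rng.normal(size=(4, 1)), rng.normal(size=1)

def impute(neighbour_values: np.ndarray) -> float:
    hidden = np.tanh(neighbour_values @ w1 + b1)
    return float(hidden @ w2 + b2)

REPEATS = 1000
x = np.array([21.3, 21.6])
start = time.perf_counter()
for _ in range(REPEATS):
    impute(x)
elapsed_ms = (time.perf_counter() - start) * 1000 / REPEATS
print(f"average imputation time: {elapsed_ms:.3f} ms")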
The approach of implementing the wireless communication in the WSN using WiFi was a positive design choice, as many IoT applications in the literature make use of the TCP/IP stack, which works well over the 2.4 GHz WiFi spectrum and allows easier management of the clients that connect to the network. It was found that, due to the power spikes that occur when transmitting data from the server, a query could only be sent every 5 s or the WiFi module connected to the server would crash and require a cold restart, which is caused by the oscillations in current draw.
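In software, this constraint can be enforced by throttling the query loop, as in the sketch below; the 5 s interval follows from the observation above, while query_node is a hypothetical placeholder for the actual request:

# Sketch of throttling server-side queries so the WiFi module is polled at most
# once every 5 s. `query_node` is a hypothetical placeholder for the real request.
import time

MIN_QUERY_INTERVAL_S = 5.0
_last_query_time = 0.0

def throttled_query(node_id: str):
    global _last_query_time
    wait = MIN_QUERY_INTERVAL_S - (time.monotonic() - _last_query_time)
    if wait > 0:
        time.sleep(wait)            # keep the WiFi module within its safe duty cycle
    _last_query_time = time.monotonic()
    return query_node(node_id)      # hypothetical function that performs the request

def query_node(node_id: str):
    ...  # placeholder: send the actual query to the given sensor node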