Differential Run-Length Encryption in Sensor Networks

Energy is a main concern in the design and deployment of wireless sensor networks because sensor nodes are constrained by limited battery capacity, memory, and processing power. A number of techniques have been proposed to address this power problem. Among them, data compression reduces the volume of data to be transmitted. This article presents a data compression algorithm called Differential Run Length Encryption (D-RLE) consisting of three steps. First, readings are divided into groups by using a threshold based on Chauvenet's criterion. Second, each group is subdivided into subgroups whose consecutive member values are determined by a subtraction scheme under a K-RLE based threshold. Third, the member values are encoded to binary using our ad hoc scheme to compress the data. The experimental results show that D-RLE achieves data rate savings of up to 90% and saves more than 90% of the energy compared to data transmission without compression.


Introduction
Wireless sensor networks (WSNs) consist of smart wireless sensors working together to monitor areas and to collect data such as temperature and humidity from the environment. However, sensor nodes face resource constraints in terms of energy, memory, and processing power [1]. Many works [2][3][4] have proposed a variety of solutions to overcome these restrictions. One challenge is how to prolong sensor lifetime during data delivery from sensor nodes to a base station. Energy consumption is a hard problem in the design and deployment of WSNs because sensor nodes may be deployed in harsh environments where it is not easy to replace the batteries [5]. Energy is mostly consumed in either data computing or data transmission. Wang et al. [6] reported that the ratio of energy consumed by computing to energy consumed by communication is about 1:3000; therefore, sensor nodes should focus on effective data communication. Reducing the number of data transmissions saves energy across the entire network. The end-to-end energy cost and network lifetime are greatly restricted if the cooperative transmission model is not designed properly [7]. The most common technique for saving energy is a sleep-wake scheduling scheme [8] in which a significant part of the sensor's transceiver is switched off. However, this solution introduces a time synchronization problem [9] and the possibility of retransmitting data. Sensor network topologies also have a large impact on energy usage in data transmission. In tree-based topologies [10], data aggregation approaches are often used to reduce data redundancy and thereby decrease the number of data transmissions. However, aggregation yields only an approximate data value for a local area [11]. In cluster-based topologies [7,12], a cluster head plays an important role: it collects all data from neighboring nodes and forwards them to the base station.
The cluster head consumes more energy than its neighboring nodes and fails sooner because its energy runs out more quickly. If the number of failures exceeds the tolerance level, the system may collapse [13]. To sidestep this problem, the cluster head in [14] is assumed to be a special node with more energy than its neighboring nodes. In contrast, our proposed solution does not require a special node. It can be applied to all sensor nodes, including a cluster head that may be exhausted by a heavy workload. Residual energy is a criterion in the selection of a cluster head [15].
In this paper, data compression is the proposed solution: it reduces the data packet size and the amount of data transmission, which prolongs the battery life of wireless sensor nodes. The proposed concept shown in Figure 1 can apply either lossless or lossy data compression. Furthermore, the proposed data compression does not require extra RAM. Data compression can be divided into lossless and lossy algorithms. Lossless compression preserves data accuracy but normally requires extensive use of memory for building a lookup table. The sensor LZW (S-LZW) algorithm [16] is an extension of the lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch (LZW) [17,18]. Capo-Chichi et al. [19] and Roy et al. [20] described S-LZW as a dictionary-based algorithm that is initialized by all standard characters of 255 ASCII codes. However, each new string in the input stream creates a new dictionary entry, which strains the limited memory of a sensor node. The schemes in [21][22][23][24] are based on lossless entropy compression (LEC), which uses Huffman variable length codes. The difference between consecutive data values is the input to an entropy encoder. Because LEC is one of the efficient data compression schemes, it has been applied to reliable data transmission for structural health monitoring in wireless sensor networks [25]. On the other hand, lossy compression [26] is appropriate for sending approximate data or repeated data [27]. The K-Run-Length Encoding (K-RLE) algorithm [28] is a lossy compression scheme adapted from RLE [17]. K-RLE's data accuracy and compression ratio depend on the K-precision.

Related Work
In [26], researchers presented a comparison of data compression schemes with different sensor data types and sensor data sets in WSNs. In [24], the data types considered for compression are divided into smooth temperature and relative humidity data and dynamic volcanic data, which exhibit dramatically different characteristics. Since power consumption is one of the main concerns, Koc et al. [29] measured power consumption during data compression on the MSP432 family of microcontrollers. With the same fixed parameters of the wireless environment, the energy used to deliver a fixed-size packet is the same for every packet, so reducing the number of data packets reduces energy consumption. Data compression therefore plays a significant role in WSNs. Compression schemes are generally categorized as lossy or lossless. Lossy compression permanently removes a certain amount of data, reducing the data to a much smaller size than the original, but it degrades data quality. Lossless compression reduces the data size without any loss of quality, but its compression rate is lower. In our experiments, we compared our results with three lossless algorithms, namely LEC [21], Lempel-Ziv-Welch (LZW) [17,18], and run-length encoding (RLE), and with K-RLE for the lossy scheme. LEC is a lossless algorithm based on the Huffman concept in which the entropy is used for defining the Huffman codes. Table 1 shows the prefix and suffix codes used in LEC. LEC computes the differences between consecutive data values and then replaces each difference with the corresponding codes from Table 1. The LEC algorithm is shown in Algorithm 1.
When d_i is negative, the low-order bits of the two's complement representation of (d_i − 1) are used for the suffix code. For example, suppose we have a data set: <19, 18, 20, 21>. Starting with the first value 19, we compute the difference between each pair of consecutive values, resulting in <19, −1, 2, 1>. By using Table 1 and Algorithm 1 above, we obtain the following encoded bit strings bs_i: (0001 0011), (010,0), (011,10), (010,1).
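The worked example above can be reproduced with a short Python sketch. Table 1 is not reproduced in this text, so the prefix table below is an assumption taken from the commonly published LEC code table; only its entries for one-bit and two-bit differences (010 and 011) are confirmed by the example. The function names and the choice to send the first reading as a raw 8-bit value (as the example does) are likewise illustrative, not the authors' implementation.

```python
# Prefix codes indexed by n, the number of bits needed for |d|.
# ASSUMPTION: standard LEC table; only entries 1 and 2 appear in the text.
PREFIX = ["00", "010", "011", "100", "101", "110", "1110", "11110", "111110"]


def lec_encode_diff(d):
    """Return the (prefix, suffix) bit strings for one difference d."""
    n = abs(d).bit_length()            # bits needed to represent |d|
    if n == 0:
        return PREFIX[0], ""           # zero difference: prefix only
    if d < 0:
        d -= 1                         # negative: two's complement of (d - 1)
    suffix = format(d & ((1 << n) - 1), f"0{n}b")  # keep the low-order n bits
    return PREFIX[n], suffix


def lec_encode(readings, first_bits=8):
    """Encode the first reading raw, then each consecutive difference."""
    out = [format(readings[0], f"0{first_bits}b")]
    for prev, cur in zip(readings, readings[1:]):
        out.append(lec_encode_diff(cur - prev))
    return out


print(lec_encode([19, 18, 20, 21]))
# ['00010011', ('010', '0'), ('011', '10'), ('010', '1')]
```

Differences needing more than eight bits fall outside the truncated table in this sketch and would raise an IndexError; a full implementation would extend the table.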
LZW is a dictionary-based lossless compression. The LZW used in the experiments assigns new codes from 256 onwards to avoid clashing with the first 256 ASCII codes. The algorithm repeatedly reads an input symbol, appends it to the current string, and checks whether the resulting string is in the dictionary. Once a string that is not in the dictionary is found, the output code for that string without its last symbol, which is the longest match in the dictionary, is sent out, and the newly found string is added to the dictionary with the next available output code. Table 2 shows an example for the input string AAAABAAAABCC. Applying Algorithm 2, seven new codes (256 to 262) are added to the dictionary and the output strings are <A, AA, A, B, AAA, AB, C, C>, which together cover all 12 input symbols. The output codes for those output strings are <65, 256, 65, 66, 257, 258, 67, 67>, where each output code uses nine bits, so in total the encoded output uses 72 bits compared to the original 96 bits. Nevertheless, a larger dictionary requires more memory. RLE is the simplest compression scheme: it counts runs of consecutive identical data and replaces each run with its length followed by the data symbol. For example, the data AAABBCEEFFFFFFFFAA are compressed to 3A2B1C2E8F2A, which means that there are 3 A's, 2 B's, 1 C, 2 E's, 8 F's, and 2 A's in sequence. RLE pseudocode is shown in Algorithm 3.
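A minimal Python sketch of the textbook LZW encoder makes the dictionary growth in this example easy to trace. It is not the authors' Algorithm 2 (which is not reproduced in this text); the function name and return shape are illustrative. Running it on AAAABAAAABCC gives eight output codes, the sixth being 258 (the dictionary entry for AB), and seven new dictionary entries.

```python
def lzw_encode(text):
    """Textbook LZW: return (output codes, number of new dictionary entries)."""
    dictionary = {chr(i): i for i in range(256)}  # codes 0..255 are preloaded
    next_code = 256
    current = ""
    codes = []
    for symbol in text:
        if current + symbol in dictionary:
            current += symbol                     # keep extending the match
        else:
            codes.append(dictionary[current])     # emit the longest match
            dictionary[current + symbol] = next_code
            next_code += 1
            current = symbol
    if current:
        codes.append(dictionary[current])         # flush the final match
    return codes, next_code - 256


codes, new_entries = lzw_encode("AAAABAAAABCC")
print(codes)            # [65, 256, 65, 66, 257, 258, 67, 67]
print(new_entries)      # 7
print(len(codes) * 9)   # 72 bits at nine bits per code
```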

Algorithm 3 RLE Pseudocode
while there are still input symbols do
    count = 0
    repeat
        get input symbol
        count = count + 1
    until symbol unequal to next symbol
    output count and symbol
end while

K-RLE is based on the RLE algorithm but tolerates a bounded loss of quality. The value of K indicates the range of data values that are treated as equal. If K = 1, for example, the data <19, 18, 20, 21> will be encoded as <(3,19), (1,21)> because the first three values are in the range of 19 ± 1, while the last value, 21, falls outside that range and starts a new run. The encoded data <(3,19), (1,21)> will then be decoded as <19, 19, 19, 21>. The decoded data differ from the original data because the scheme is lossy. K-RLE pseudocode is shown in Algorithm 4.
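The RLE and K-RLE behaviour described above can be sketched in a few lines of Python. This is an illustrative version, not the authors' Algorithm 4: it assumes each run is anchored at its first value and absorbs any following value within ± K of that anchor, which reproduces the example in the text. Setting K = 0 gives plain RLE.

```python
def k_rle_encode(values, k=0):
    """Return a list of (count, value) runs; k = 0 is lossless RLE."""
    runs = []
    for v in values:
        if runs and abs(v - runs[-1][1]) <= k:
            count, anchor = runs[-1]
            runs[-1] = (count + 1, anchor)   # extend the current run
        else:
            runs.append((1, v))              # start a new run anchored at v
    return runs


def k_rle_decode(runs):
    """Expand (count, value) runs back into a flat list."""
    out = []
    for count, anchor in runs:
        out.extend([anchor] * count)
    return out


data = [19, 18, 20, 21]
lossy = k_rle_encode(data, k=1)
print(lossy)                     # [(3, 19), (1, 21)]
print(k_rle_decode(lossy))       # [19, 19, 19, 21]

# Plain RLE (k = 0) on the character example, using code points as values.
runs = k_rle_encode([ord(c) for c in "AAABBCEEFFFFFFFFAA"])
print("".join(f"{n}{chr(v)}" for n, v in runs))   # 3A2B1C2E8F2A
```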
LZW compresses identical data with the same encoding even when these data are at different positions in the input stream. In contrast, RLE requires identical data to be adjacent. Both LEC and LZW apply a similar concept in terms of the prefix codes. However, LEC also has suffix codes that carry the difference value; hence, each encoded difference in the input stream consists of a prefix code and a suffix code. If the input stream has many runs of consecutive identical data, RLE performs very well. LZW is preferable to RLE if the input stream contains many repeating data that map to shorter output codes. LEC works even better if the input stream has many difference values of the same level with shorter prefix and suffix codes. The aforementioned algorithms have linear time complexity O(n), and each performs best on data with its own favorable characteristics, not on general datasets. This gap motivates combining the strong points of those algorithms to compress the data. To this end, we have developed Differential Run Length Encryption (D-RLE), which also has linear time complexity O(n), and we explain its concept in the next section.

Differential Run Length Encryption
This section presents the proposed algorithm called Differential Run Length Encryption or D-RLE, which consists of three steps. First, raw data are collected and divided into several groups by using a threshold based on Chauvenet's criterion. Second, consecutive data are subtracted and arranged into multiple subgroups of differential values based on a K-RLE threshold. Third, our adaptive data compression, which is the DSC-based scheme [30], is applied to each piece of subgroup data. Data formats in D-RLE are shown in Table 3. As a result, D-RLE significantly reduces the amount of data delivered, saves energy, and prolongs the battery life of the sensor nodes.
Table 3. Data formats in D-RLE.

Group Division by Chauvenet's Criterion
The raw data can be considered as a random sample <r_0, r_1, ..., r_m>, and they are grouped by using a pre-defined Chauvenet's criterion [31,32] threshold D_max, which is set to 1.96 according to the significance level of 0.05 from statistics. Each r_i is passed to Equation (1) to calculate D_i, in which µ and σ are the mean and standard deviation, respectively. The value of D_i is then compared with D_max to decide whether r_i belongs to the current group. We keep r_i in the same group if D_max is greater than or equal to D_i; otherwise, r_i starts the next group. For example, suppose we have the following raw data <21, 25, ...>. Once a group is split, the mean and standard deviation are recalculated from the remaining data and used in Equation (1) for the next round of group division. Algorithm 5 shows the group division step.
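The group division step can be sketched as follows. Equation (1) and Algorithm 5 are not reproduced in this text, so the sketch rests on several assumptions: D_i takes the usual Chauvenet form |r_i − µ| / σ, µ and σ are the population mean and standard deviation of all data not yet assigned to a group, and a group always keeps its first reading so that the loop makes progress. The function name is hypothetical, and the statistics are recomputed from scratch after each split for clarity rather than efficiency.

```python
from statistics import mean, pstdev

D_MAX = 1.96  # threshold quoted in the text (significance level 0.05)


def chauvenet_groups(readings, d_max=D_MAX):
    """Split readings into consecutive groups wherever D_i exceeds d_max."""
    groups = []
    start = 0
    while start < len(readings):
        remaining = readings[start:]
        mu, sigma = mean(remaining), pstdev(remaining)  # recomputed per group
        end = start + 1                 # ASSUMPTION: first reading always stays
        while end < len(readings):
            # D_i = |r_i - mu| / sigma; a zero sigma means no deviation at all
            d_i = abs(readings[end] - mu) / sigma if sigma > 0 else 0.0
            if d_i > d_max:
                break                   # r_i starts the next group
            end += 1
        groups.append(readings[start:end])
        start = end
    return groups


sample = [10] * 9 + [100]               # hypothetical readings with one jump
print(chauvenet_groups(sample))
# [[10, 10, 10, 10, 10, 10, 10, 10, 10], [100]]
```

In this hypothetical sample the mean is 19 and the standard deviation is 27, so the final reading has D_i = 3, which exceeds 1.96 and opens a second group.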

Subgroups Division
Each group from the first step is subdivided based on the K-RLE scheme. The value of K sets the degree of tolerance; the scheme is lossless if K equals zero. The result of the subgroup division is written in the compact format <r_b, |c_1, d_1|_1, |c_2, d_2|_2, ..., |c_n, d_n|_n>, where the symbol |c_i, d_i|_i delimits the ith subgroup and r_b is the first value of the group. The number of elements in the ith subgroup is denoted by c_i, and the difference between the last raw value of the ith subgroup and the last raw value of the (i − 1)th subgroup is denoted by d_i. The algorithm for finding c_i and d_i is shown in Algorithm 6. As the index starts from zero, m + 1 is the number of data members in the group. The index n is the number of subgroups, which is also the number of members in the sets C_i and D_i. The relationship between m and c_i is given by Equation (2). For the first group <21, 25, 28, 30, 31, 35, 37, 42, 47, 49, 50, 55, 62, 76, 82, 95>, we ran Algorithm 6 to obtain the subgroup division result. The variable c_i is simply a counter, starting from one, of the pairs within the same subgroup whose first and next raw values differ in absolute value by no more than the pre-defined K. The variable d_i roughly indicates how different the data are between the present and previous subgroups.

Adaptive Data Encoding
The last step is the process of adaptively encoding each subgroup division result. Shortened opcodes for c_i and d_i are used to compress raw data via the encoding process. The number of bits to represent c_i and d_i is determined by the set C_i = {c_1, c_2, ..., c_n} and the set D_i = {d_1, d_2, ..., d_n}, respectively.

Performance Evaluation
In our testbed, we assumed that data sets are already stored in a sensor node. The sensor node could be a cluster head that collects data either from itself or from neighboring nodes. Later, the sensor node performs data compression and wirelessly sends compressed data to a receiver. This section analyzes the performance of our algorithm and compares the performance with four benchmark algorithms on five datasets.

Effect of K Value
We have evaluated the performance of our algorithm in terms of data rate savings (DRS) [30] as shown in Equation (7).
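Equation (7) is not reproduced in this text. A small Python sketch, assuming the usual definition of data rate savings as the percentage reduction in size relative to the original, shows how the figures quoted in this paper can be read. The function name is hypothetical, and the 120-bit case is an invented illustration of how an algorithm that expands the data ends up with a negative DRS.

```python
def data_rate_savings(original_bits, compressed_bits):
    """ASSUMED definition: percentage of the original size that is saved."""
    return 100.0 * (original_bits - compressed_bits) / original_bits


# LZW example from the Related Work section: 96 bits reduced to 72 bits.
print(data_rate_savings(96, 72))    # 25.0

# Hypothetical expansion: output larger than input gives a negative DRS.
print(data_rate_savings(96, 120))   # -25.0
```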
In particular, the case of lossless compression (K = 0) and the case of lossy compression (K = 1) were evaluated. The lossy case for K = 1 has been shown in the previous section and its DRS is 63.31%.
We repeated the same procedure with K = 0 for the lossless case on the same raw data <21, 25, ...>. To achieve more data rate savings, we could allow the data to be lost or distorted to some extent. How much quality loss is acceptable depends on the application, and it is controlled via the value of K in the algorithm. Table 4 shows the effect of varying K. We obtained more data rate savings when K ≥ 4 for the same raw data previously illustrated.

Evaluation Results
We have split our experiments into two sets. The datasets in the first set vary in size (roughly speaking, small, medium, and large), while the datasets in the second set all have the same fixed size. The benchmark algorithms and our proposed D-RLE algorithm were applied to compress those datasets, and the energy consumption for data compression and transmission was then measured.
Starting with the first set, we simulated a 100-byte temperature dataset, shown in Figure 2c, whose shape resembles Figure 2d, a real dataset collected by [33]. In addition to these datasets, three more datasets, referred to as the sine-like, chaotic, and temperatureMin datasets and illustrated in Figure 2, were used for evaluating the effectiveness of the compression algorithms in our experiments. The simulated temperature dataset was created by Algorithm 7, while the temperatureHr dataset was the actual temperature data recorded hourly for 48 h. The temperatureMin dataset was retrieved from the same source as the temperatureHr dataset, but the data were recorded every minute. For the sine-like dataset, the minimum and maximum data values are 2 and 19, respectively. The value following 2 is 3, and the value is then increased by 3 until it reaches the maximum. The value after the maximum was set to 18 and then decreased by 3 until it reached the minimum. Repeating this for two cycles gives the sine-like dataset 30 pieces of data. We fixed the data range from 2 to 19 for the sine-like dataset, whereas we randomly selected data ranging from 0 to 20 for the chaotic dataset. The chaotic and simulated temperature datasets each consist of 100 pieces of raw data. The temperatureHr dataset has only 48 pieces of data, as the data were recorded hourly for 48 h, whereas the temperatureMin dataset has 2880 pieces of data, since the data were recorded every minute for the same 48 h period. The raw data in each dataset were recorded as a series of strings; hence, the bit sizes of the data 2, 12, and 102 are 8, 16, and 24 bits, respectively.

Algorithm 7 Creating simulated temperature data
Require: g, uppertemp, lowertemp
Ensure: t[g] = {t_1, t_2, ..., t_g}
initialize min = 0, max = 10, temp = 20, i = 1, up = true

For the second set of experiments, we extended the size of each dataset except for the temperatureMin dataset to 46,080 bits. The purpose of this experiment is to investigate the energy usage of lengthy data transmission in one shot compared to multishot transmission of small amounts of data. We expected one shot delivery to use energy more efficiently than multishot delivery, since the sensor node in the single shot case stays in a silent or power saving mode longer, whereas the multishot case wakes up the sensor node more frequently. For both sets of experiments, four benchmark compression algorithms, namely RLE, K-RLE, LEC, and LZW, were compared to our D-RLE algorithm. For K-RLE, we set K = 1. A sensor node ran these algorithms to compress the datasets before transmitting the data to a base station. The DRS obtained and the energy consumed by each algorithm were then recorded for comparison. The algorithms were implemented on a LAUNCHXL-CC1310 board [34] acting as the sensor node equipped with an RF module. To measure the power and energy usage in data compression and transmission, an MSP430FR5969 board [35] and the Code Composer Studio (CCS) program (version 8.0.0, Texas Instruments Inc., Dallas, Texas, USA) [36] were used. The MSP430FR5969 board was connected to the LAUNCHXL-CC1310 board as shown in Figure 3, and the amount of energy used was measured by the EnergyTrace function in the CCS software. We removed the jumpers connecting the microcontroller and the debug parts of the CC1310 board to ensure that the power came from the MSP430FR5969 board. The corresponding 3.3 V Vcc and ground pins of the two boards were wired together as illustrated by the black and white lines in Figure 3.
Tables 5 and 6 show the comparison results and total energy consumption on the datasets with different sizes, while Tables 7 and 8 show the comparison results and total energy consumption on the lengthy datasets with the same size of 46,080 bits. In terms of DRS, the sine-like data change values gradually and have no repeating values in adjacent data, so RLE and K-RLE perform poorly while D-RLE works much better than the others. On the other hand, the temperatureMin dataset has many repeating values, and this characteristic helps RLE, K-RLE, and D-RLE achieve higher DRS than LEC and LZW. The temperatureMin dataset has many nearby identical repeating data, making it possible for LZW to create a dictionary with shorter encoding bits than LEC, and hence LZW compresses the data better than LEC. For the simulated temperature dataset, LZW, RLE, and K-RLE perform worse than LEC and D-RLE since there are no repeating data values. LEC and D-RLE share a similar concept in the way they encode the difference values. While LEC treats all of the data as one group for encoding, D-RLE divides the data into several subgroups whose members do not differ much, leading to fewer encoding bits. For the temperatureHr dataset, D-RLE performs very well, K-RLE, LEC, and LZW perform comparably, and RLE is the worst. RLE is the worst algorithm for datasets that have no repeating values. A negative DRS for RLE means that RLE could not compress the data at all and instead added overhead to the raw data. For the chaotic dataset, LEC, LZW, and D-RLE perform better than the RLE family because the data fluctuate and rarely repeat. We found that, in terms of DRS on those datasets, D-RLE performs best in both the single shot and multishot patterns.
Though the compression time of D-RLE is longer than that of the others except for LZW, the compression energy used by D-RLE is not much different from the others in the multishot pattern. For the single shot pattern, D-RLE spends compression energy similar to LEC; it consumes more compression energy than RLE and K-RLE, but less than LZW. We suggest using D-RLE for datasets that contain a long sequence of data; it works best for repeating data or gradually changing data values. As a result of its highest DRS, D-RLE needs fewer packets for data delivery than the other algorithms, leading to less transmission energy. In terms of energy use, D-RLE uses the least total energy compared to the other algorithms on most of the datasets, as shown in Tables 6 and 8. For example, for the temperatureHr dataset of 46,080 bits, D-RLE sends only about 12 packets with a total energy use of 18.82 mJ, while the others use more than 30 packets with a total energy greater than 45 mJ. The total power use, P_total, consists of two parts from the compression and transmission steps, referred to as compression power, P_c, and transmission power, P_t, respectively. P_total is determined by Equation (8), in which the subscript i indicates the ith group when we compress the data, and j indicates the jth payload or packet that we deliver. The values of g and p are the number of groups and the number of packets. To calculate the corresponding energy used in each step, we use the relationships between energy and power from Equations (9) and (10), where T_c_i and T_t_j are the time spent compressing the ith group and the time spent delivering the jth packet, respectively. Lastly, the total energy consumption, E_total, is computed by Equation (11), which adds the transmission and compression energy.
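Equations (8)-(11) are not reproduced in this text. Based only on the verbal description above (power summed over g groups and p packets, energy as power multiplied by time, and total energy as the sum of both parts), they plausibly take the following form. This is a reconstruction under those assumptions, including the assumed assignment of (9) to compression and (10) to transmission, and the published equations may differ in notation.

```latex
\begin{align}
P_{\mathrm{total}} &= \sum_{i=1}^{g} P_{c_i} + \sum_{j=1}^{p} P_{t_j} \tag{8} \\
E_{c_i} &= P_{c_i}\, T_{c_i} \tag{9} \\
E_{t_j} &= P_{t_j}\, T_{t_j} \tag{10} \\
E_{\mathrm{total}} &= \sum_{i=1}^{g} E_{c_i} + \sum_{j=1}^{p} E_{t_j} \tag{11}
\end{align}
```

Read this way, a scheme can afford a larger compression term as long as the reduction in the number of packets p shrinks the transmission term by more.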
The experimental results show that D-RLE achieves minimal transmission energy in exchange for slightly more compression energy. The trade-off is worthwhile, as D-RLE significantly reduces total energy use, while the benchmark algorithms consume much more total energy on the same datasets. Figure 4 compares the power and energy consumed by D-RLE during data compression and transmission between the single shot and multishot cases on the temperatureMin dataset. Both cases have the same data length of 46,080 bits. The single shot case receives all of the data before starting compression and transmission, while the multishot case receives several data portions of equal length. According to the graph, the transmission period demands higher power than the compression period. However, the energy consumed during transmission, indicated as T in the graph, is less than the energy used during compression, indicated as C in the graph. The smaller size of the compressed data gives a shorter transmission time in the single shot case. On the other hand, the multishot case has to repeat many compression and transmission cycles and takes more time. In each multishot cycle, the compression step is reinitiated, which lowers compression efficiency, and hence the accumulated energy used by the multishot case is greater than that used by the single shot case. Therefore, the single shot is more efficient and saves more energy than the multishot. The graphs for the other datasets behave in the same manner as for the temperatureMin dataset.

Performance Visualization
We have plotted radar charts, shown in Figure 5, over five categories to provide a simple visual comparison of the algorithms. The first two categories are DRS and data accuracy (in the sense of how much the decoded data differ from the original data). The last three ad hoc categories are called compression time efficiency (CTE), compression energy efficiency (CEE), and transmission energy efficiency (TEE), and they are defined by Equations (12)-(14), respectively. The parameter A in those equations is the number of algorithms we used in the experiments, i.e., A = 5. T_c_i, E_c_i, and E_t_i are the compression time, compression energy, and transmission energy of the ith algorithm, respectively. Each category has a score from 0 to 100; the higher the score, the better the performance. The left panel of Figure 5 compares RLE, K-RLE, and D-RLE, while the right panel compares LEC, LZW, and D-RLE. For the left panel, D-RLE performs better than the others in most categories except for CTE, because D-RLE takes more time in the compression step. For the right panel, D-RLE performs equally or slightly better than the others in terms of CEE, TEE, and accuracy. D-RLE lies between LEC and LZW on CTE, whereas D-RLE mostly achieves better DRS (the DRS results on the radar charts in Figure 5a-c might not be as clearly visible as the numbers in Table 7). On average, D-RLE scores high and is well balanced, reaching the vertices of the pentagon in the graph when compared to the other algorithms in each category.

Conclusions
We have presented a compression algorithm called D-RLE for wireless sensor nodes, a domain in which energy use is one of the most important aspects. It starts by dividing the data into groups based on Chauvenet's criterion; each group is then divided into subgroups to which an adaptive encoding is applied. According to the experimental results, D-RLE performs very well, gives the highest data rate savings, and spends less energy than the benchmark algorithms. In particular, D-RLE is suitable for large amounts of data with repeating or gradually changing values and for the single shot delivery mode. Because of its high compression rate, the amount of data transmission is significantly reduced and hence less energy is demanded. This prolongs the battery life of the sensor nodes. This work offers an alternative way to improve the energy performance of sensor nodes.
Author Contributions: C.C. designed the core architecture and performed the hardware/software implementation and experiments; A.B. provided supervision to the project and has the responsibility as the main corresponding author; S.S. co-supervised the project and contributed to the experimental design, partial software development and result analysis.