A Formal Approach to the Selection by Minimum Error and Pattern Method for Sensor Data Loss Reduction in Unstable Wireless Sensor Network Communications

There are wireless networks in which communications are typically unsafe. Most terrestrial wireless sensor networks belong to this category. Another example of an unsafe communication network is an underwater acoustic sensor network (UWASN). In UWASNs in particular, communication failures occur frequently, and a failure can last from seconds up to a few hours, days, or even weeks. Depending on the application area, these communication failures can cause data losses significant enough to seriously harm human life or property. In this paper, we propose a framework to reduce sensor data loss during communication failures, and we present a formal approach to the Selection by Minimum Error and Pattern (SMEP) method, which plays the most important role in reducing sensor data loss under the proposed framework. The SMEP method is compared with other methods through experiments using real-field sensor data sets. Based on our experimental results and performance comparisons, the SMEP method performs better than the others in terms of the average sensor data value error rate caused by sensor data loss.


Introduction
There are wireless networks in which communications are typically unsafe. Most wireless sensor networks belong to this category. Usually, they have several shortcomings caused by limited resources, such as short-range and low-speed communications, limited-power batteries, low memory capacity, and low-speed processors [1][2][3][4][5]. In particular, their communications are vulnerable to environmental changes, such as foggy, rainy, hot, cold, or dry weather conditions, or location, among other factors [6][7][8][9][10][11][12]. Another example of an unsafe communication network is an underwater acoustic sensor network (UWASN). In underwater acoustic communication, several barriers need to be overcome: low bandwidth, high propagation delay, high error rate, multi-path propagation, strong signal attenuation, and time-variant channels [13][14][15][16][17][18]. The time variation of channels in particular is one of the primary causes of unsafe underwater communications. Underwater environmental changes, such as temperature, salinity, pressure, underwater fluid, bubbles, and noise, among others, are the main causes of the time variation of channels [13][14][15][16][17][18], but many other factors also influence it. Owing to the time-variant channels, communication failures occur frequently, and a failure may last from seconds up to a few hours, days, or even weeks. After the failure has ended, in most cases the communication resumes, but during the communication

Communication Framework Overview for Sensor Data Loss Reduction
In this section, we present an overall framework to reduce sensor data loss in the course of a communication failure. This framework gives requirement analysts and designers a guideline for identifying and realizing the functionalities necessary to reduce sensor data loss during communication failures in wireless sensor networks.
Our overall framework has five sequential modes on a sensor node, as shown in Figure 1: the Normal Operation (NO), Communication Failure Propagation (CFP), Sensor Data Save and Compression (SDSC), Communication Recovery Propagation (CRP), and Compressed Data Transmission and Recovery (CDTR) modes. The framework works as follows. In the NO mode, each sensor node in a sensor network performs its operations normally, such as sampling, processing, sending, or receiving sensor data. No communication failure occurs in this mode. When a communication failure occurs, the mode moves from the NO mode to the CFP mode. In this mode, the first sensor node to detect the communication failure propagates a Communication Failure (CF) message to its neighbor nodes to notify them of the failure. Every sensor node that receives a CF message tries to find an alternative routing path that does not go via the sender of the CF message. If the sensor node finds a new routing path, it invalidates the routing path via the sender of the CF message and transmits through the new routing path. If no new routing path is found, the sensor node propagates the CF message to its neighbor nodes and its mode moves into the SDSC mode. In this mode, the sensor node starts saving the sensor data generated by its sensor into its data queue. When the data queue is full, the sensor node compresses some or all of the sensor data in the queue. As soon as the sensor node receives, from the node that previously sent it the CF message, a Communication Recovery (CR) message announcing that the communication failure has been recovered, its mode moves from the SDSC mode into the CRP mode.
In CRP mode, the sensor node revalidates the invalidated routing path via the sender, propagates the CR message to its neighbor nodes, and then moves to the CDTR mode. In the CDTR mode, the sensor node reverts to using the revalidated (recovered) routing path from the previous mode as its primary routing path. In this mode, the sensor node sends all the saved or compressed sensor data in its data queue, together with the compression information, through the recovered routing path. After that, the sensor node returns to the NO mode and proceeds with normal operations. The destination node that receives the compressed sensor data and the compression information restores the sensor data lost by the compression in SDSC mode to approximated values that are similar to the original sensor data. Sensor data loss occurs because of the compression operations in SDSC mode. In fact, sensor data loss is inevitable in communication failures with no alternative routing path.
Specifically, in long-term communication failures, if no action is taken once the data queue or the routing queue is full, almost all newly generated and routed sensor data will inevitably be lost. Such lost sensor data cannot be restored to the same values as the original sensor data, or even to similar values. The only way to resolve this problem is to devise a compression method that compresses the sensor data when the queue is full and restores the sensor data lost in the compression process to the same or similar values. For this reason, the compression in the SDSC mode is the most important task in our framework for reducing sensor data loss, and the SMEP method presented from Section 4 to Section 8 is one such compression method.
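The mode transitions described in this section can be sketched as a small state machine. This is only an illustration under our own assumptions: the event names are hypothetical labels for the conditions in the text, and the return from CFP to NO when an alternative routing path is found is our reading of the description rather than a transition stated explicitly.

```python
from enum import Enum


class Mode(Enum):
    NO = "Normal Operation"
    CFP = "Communication Failure Propagation"
    SDSC = "Sensor Data Save and Compression"
    CRP = "Communication Recovery Propagation"
    CDTR = "Compressed Data Transmission and Recovery"


# Hypothetical event names; each one labels a condition from the text.
TRANSITIONS = {
    (Mode.NO, "failure_detected"): Mode.CFP,
    # Assumption: with an alternative path the node keeps operating normally.
    (Mode.CFP, "alternative_path_found"): Mode.NO,
    (Mode.CFP, "no_alternative_path"): Mode.SDSC,
    (Mode.SDSC, "cr_message_received"): Mode.CRP,
    (Mode.CRP, "cr_message_propagated"): Mode.CDTR,
    (Mode.CDTR, "queue_transmitted"): Mode.NO,
}


def next_mode(mode, event):
    """Return the mode after `event`; unknown events leave the mode unchanged."""
    return TRANSITIONS.get((mode, event), mode)


def run(events, start=Mode.NO):
    """Apply a list of events in order and return the visited modes."""
    visited = [start]
    for event in events:
        visited.append(next_mode(visited[-1], event))
    return visited
```

Feeding the events of a failure without an alternative path, followed by recovery, visits the five modes in the order NO, CFP, SDSC, CRP, CDTR and then returns to NO.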

Basic Definitions and Properties
Fundamental formal definitions and properties are presented in this section. These are necessary to introduce the SMEP method in Sections 5-7, and the decompression and recovery algorithms in Section 8.
Definition 1. The sensor data sequence $S$ is a list of periodic sensor data, $S_0, S_1, \ldots, S_{n-1}$, such that for each $i$ ($0 \le i \le n-1$) $S_i$ is a sensor data and, if $j < k$ ($0 \le j, k \le n-1$), then $S_j$ is a sensor data generated before the sensor data $S_k$. We represent the sensor data sequence as $S = S_0, S_1, \ldots, S_{n-1}$.
We call the number of sensor data in the sensor data sequence $S$ the size of $S$ or the length of $S$ and represent it with $|S|$.
For simplicity, let us define the terms ground sequence and base sequence.
Definition 2. The ground sequence, or ground sensor data sequence, is defined as a sensor data sequence in which no sensor data has passed through any compression since its generation by a sensor.

Definition 3. The base sequence is defined as a sensor data sequence in which every sensor data has been selected through the same compressions and which is the target of another compression.
Definition 4. Given a sensor data sequence $S = S_0, S_1, \ldots, S_{n-1}$, a compression interval $I$ with size $m$ with respect to $S$ is defined as a part of $S$, i.e., $I = S_i, S_{i+1}, \ldots, S_{i+m}$, which consists of $m + 1$ consecutive sensor data in $S$ and is a target for compression.
For simplicity, we represent a compression interval $I$ with size $m$, $S_i, S_{i+1}, \ldots, S_{i+m}$, as $[i, i+m]$, using only the first and last subscripts of its $m + 1$ consecutive sensor data. Clearly, if $I = [i, j]$ is a compression interval, then the size of $I$ is $j - i$. A compression interval is also a sensor data sequence. Note, however, that the size of $I$ as a compression interval differs from its size as a sensor data sequence: the former is 1 less than the latter. A compression on the compression interval $I = [i, i+m]$ is an operation that selects one or more sensor data from $I$ and discards all the others.

Definition 5. Consider two compression intervals $I = [i, j]$ and $J = [k, l]$ with respect to a sensor data sequence $S$. If $j = k$ or $l = i$, then $I$ and $J$ are defined as consecutive compression intervals in $S$. If both consecutive compression intervals have the same size $m$, we call them consecutive compression intervals with size $m$.
According to this definition, if two compression intervals are consecutive, then the last sensor data of one of them is the same as the first sensor data of the other.
Theorem 1. Let $I = [i, j]$ and $J = [k, l]$ be consecutive compression intervals with sizes $m$ and $n$, respectively, in a sensor data sequence $S$. If $i < k$ or $i < l$ or $j < l$, then $[i, l]$ is a compression interval with size $m + n$ with respect to $S$ (for the proof, refer to Appendix A).
Given two consecutive compression intervals, $I = [i, j]$ and $J = [k, l]$, of a sensor data sequence, if $i < k$ or $j < l$ (that is, $j = k$), then we write $I < J$ and say that $I$ is smaller than $J$ or that $J$ is greater than $I$.
Definition 6. Suppose that $I = [i, j]$ and $J = [k, l]$ are consecutive compression intervals with respect to a sensor data sequence $S$, and $I < J$. We define the merging operation $\bullet$ of $I$ and $J$ as $I \bullet J = [i, l]$ and we call $I \bullet J$ the merged compression interval of $I$ and $J$.
We can easily prove the following theorem, the associative law of the merging operation.
Theorem 2. Let $I$, $J$, and $K$ be consecutive compression intervals with $I < J < K$. Then, the merging operation satisfies the associative law. That is, $(I \bullet J) \bullet K = I \bullet (J \bullet K)$.
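As an illustration of Definitions 4-6 and Theorem 2, the following minimal sketch represents a compression interval by its first and last subscripts. The class and function names are ours and are not part of the formal development.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A compression interval [first, last] over sequence subscripts."""
    first: int
    last: int

    @property
    def size(self):
        # The size is last - first, one less than the number of sensor data.
        return self.last - self.first


def are_consecutive(a, b):
    """Definition 5: the last data of one is the first data of the other."""
    return a.last == b.first or b.last == a.first


def merge(a, b):
    """Definition 6: merge consecutive intervals a < b into [a.first, b.last]."""
    if a.last != b.first:
        raise ValueError("merge requires consecutive intervals with a < b")
    return Interval(a.first, b.last)
```

For example, with `I = Interval(0, 2)`, `J = Interval(2, 4)`, and `K = Interval(4, 6)`, both `merge(merge(I, J), K)` and `merge(I, merge(J, K))` yield `Interval(0, 6)`, whose size 6 is the sum of the three sizes.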

Definition 7. Consider consecutive compression intervals $I_0, I_1, \ldots, I_{m-1}$ with respect to a sensor data sequence $S$, where $I_0 < I_1 < I_2 < \ldots < I_{m-1}$. If $S = I_0 \bullet I_1 \bullet I_2 \bullet \ldots \bullet I_{m-1}$, then the set $C_S = (I_0, I_1, I_2, \ldots, I_{m-1})$ is called a compression interval covering on $S$. When every compression interval in the compression interval covering $C_S$ has the same size $k$, we call $C_S$ the $k$-size compression interval covering.
From the definitions, we obtain the following relationship between a sensor data sequence and its 2-size compression interval covering. For the proof, refer to Appendix B.
Theorem 3. Let $S$ and $C_S = (I_0, I_1, I_2, \ldots, I_{m-1})$ be a sensor data sequence and a 2-size compression interval covering on $S$, respectively. Then, the following propositions are true:
Proposition 2. Given a compression interval $I_i = S_p, S_q, S_r$ for some integer $i$, $0 \le i \le m-1$, the positions of $S_p$, $S_q$, and $S_r$ in $S$ are $2i$, $2i + 1$, and $2i + 2$, respectively.
In the SDSC mode introduced in Section 3, so that a sensor node does not lose the sensor data generated periodically by its sensor, the node must have its own data queue in which to save those sensor data. However, whenever the data queue is full in the SDSC mode, the 2MC method executes a compression on the sensor data sequence of its data queue. Now, let us define a compression round as follows:
Definition 8. A round compression is an operation defined as follows, and a compression round is the number of compression executions:
(i) A ground sequence is in the 0-th round compression.
(ii) The $i$-th round compression is a compression executed when $S$ is a sensor data sequence corresponding to a full data queue and each sensor data in $S$ has been derived through the $(i-1)$-th round compression.
When a compression is executed, some sensor data of the sequence are selected and the others are discarded. The selected ones become the compressed sensor data and the discarded ones become the lost sensor data. For each lost sensor data to be restored, its position must be known or found. The SMEP method uses position information about the selected sensor data to find the positions of the sensor data lost in compression.
Besides the above definitions, more definitions are introduced in the following sections whenever necessary. All main notations and terms are listed in Table 1 with their definition numbers. Refer to the definitions if necessary.

Basic Concepts for the SMEP Method
The SMEP method uses two selection rules on the consecutive compression intervals that cover a sensor data sequence for compression: selection by compression interval pattern and selection by minimum error. This section presents the basic concepts of the SMEP method, including the two selection rules, which help in understanding the data structure elements and algorithms introduced in detail in Sections 6-8.

SMEP Compression Process Overview
The SMEP method performs compression with a 2-size consecutive compression interval covering of a sensor data sequence of size $2^m + 1$ for an integer $m$, $m \ge 1$. The size $2^m + 1$ or $2^m$ is the same as the length of the part of the data queue used for saving and compressing the sensor data generated by a sensor in the SDSC mode, and the sensor data sequence for compression is the sequence of sensor data in the queue. For the size $2^m + 1$, we get the following theorem (for the proof, refer to Appendix C):
Theorem 4. Let $S$ be a sensor data sequence and let $C_S = (I_0, I_1, \ldots, I_l)$ be a 2-size compression interval covering of $S$. If the size of $S$ is $2^m + 1$ for an integer $m \ge 1$, then $l = 2^{m-1} - 1$; that is, the number of compression intervals of $C_S$ is $2^{m-1}$.
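The count in Theorem 4 can be checked with a short sketch that builds the 2-size compression interval covering of a sequence from its length alone, using the positions $2i$ and $2i + 2$ given in Theorem 3. The function name is ours.

```python
def two_size_covering(length):
    """Return the 2-size covering of a sequence with `length` sensor data.

    Each interval is a (first, last) pair of subscripts, and the i-th
    interval is [2i, 2i + 2]. The length must be odd and at least 3 so
    that the intervals tile the sequence exactly.
    """
    if length < 3 or length % 2 == 0:
        raise ValueError("length must be odd and at least 3")
    return [(2 * i, 2 * i + 2) for i in range((length - 1) // 2)]


# A queue of 17 = 2**4 + 1 sensor data is covered by 2**3 = 8 intervals.
covering = two_size_covering(17)
```

Here `covering[2]` is `(4, 6)`, which corresponds to the interval written as $[4, 6]_0$ for a ground sequence of size 17.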
Whenever a compression round proceeds, one sensor data in each compression interval is selected and the others in it are discarded. The discarded sensor data become the lost sensor data. Figures 2 and 3 show the compression process on a sensor data sequence in a data queue of size $17 = 2^4 + 1$. As shown in Figure 2, in the $c$-th ($c = 1, 2, 3, \ldots$) round compression, the sensor data remaining after the $(c-1)$-th round compression are selected as follows. First, the first sensor data $S_{c-1,0}$ is selected unconditionally as $S_{c,0}$, where $S_{c,i}$ denotes the $i$-th sensor data of the sensor data sequence derived through the $c$-th round compression. Then, by selecting one sensor data per compression interval, in the order of the compression intervals in the 2-size compression interval covering, the compression derives a new compressed sensor data sequence. Either the 2nd or the 3rd sensor data of each interval is selected, so that the SMEP method keeps the rule of one selection per compression interval for all compression intervals except the 1st compression interval. During selection, the $i$-th selected sensor data becomes the $i$-th sensor data of the new compressed sensor data sequence. After the compression, the size of the new sensor data sequence shrinks to about half of the previous one, that is, $2^{m-1} + 1$, occupying data queue space of the same size, and the $2^{m-1} + 1$ size queue space remains empty. After that, as sensor data newly generated by the sensor are inserted into the remaining queue space, the queue becomes full again, as in Figures 2 and 3, and the new sensor data are compressed in the same way until all sensor data in the full data queue have been derived through $c$ compressions. When all sensor data of the sensor data sequence in the full data queue have been derived through $c$ compressions, the $c$-th round compression is complete and the $(c+1)$-th round compression begins.
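One compression pass over a full queue can be sketched as follows. The selection rule is deliberately left as a caller-supplied function, because the actual SMEP rules are introduced later; the names `compress_once` and `choose` are ours, and `choose` returning 0 for the 2nd sensor data and 1 for the 3rd is an illustrative convention.

```python
def compress_once(sequence, choose):
    """Keep the first sensor data plus one sensor data per 2-size interval.

    `sequence` must have odd length of at least 3. `choose(k, second, third)`
    returns 0 to keep the 2nd sensor data of the k-th interval, or 1 to keep
    the 3rd. The result has (len(sequence) - 1) // 2 + 1 elements.
    """
    if len(sequence) < 3 or len(sequence) % 2 == 0:
        raise ValueError("sequence length must be odd and at least 3")
    selected = [sequence[0]]  # the first sensor data is kept unconditionally
    for k in range((len(sequence) - 1) // 2):
        second, third = sequence[2 * k + 1], sequence[2 * k + 2]
        selected.append(third if choose(k, second, third) else second)
    return selected


# Using subscripts as stand-in values: 17 sensor data shrink to 9.
halved = compress_once(list(range(17)), lambda k, second, third: 1)
```

With a rule that always keeps the 3rd sensor data, the 17 subscripts 0 to 16 are reduced to the 9 even subscripts, which matches the shrinkage from $2^m + 1$ to $2^{m-1} + 1$ described above.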
Figure 3 shows only the first round compression process.
Figure 4 shows the compression process from the viewpoint of merging two intervals. Compression is conducted by selecting just one of the sensor data in a compression interval that corresponds to the merging of two consecutive compression intervals of the base sequence. For example, in Figure 4, the sensor data $S_{1,3}$ is selected from the compression interval $I_{0,2} = [4, 6]_0$. Here, $I_{c,i}$ denotes the $i$-th compression interval in the 2-size compression interval covering on the sensor data sequence derived through the $c$-th round compression. Additionally, $[p, q]_c$ represents a compression interval $[p, q]$ on a sequence derived through the $c$-th round compression. In short, $(I_{0,2}, S_{1,3})$ in the figure means that the sensor data $S_{1,3}$ is selected in the compression interval $I_{0,2}$.

I-bit Position and Generation
For further description, we define the term cover:
Definition 9. Let $S_i$ and $I$ be a sensor data and a compression interval with respect to a sensor data sequence, respectively. $S_i$ covers $I$ if and only if $S_i$ is selected in $I$.
In Figure 4, $S_{2,2}$ covers $I_{1,1}$, and it is selected as one of $S_{1,3}$ and $S_{1,4}$, which cover $I_{0,2}$ and $I_{0,3}$, respectively. Note that $S_{2,2}$ covers $I_{0,2} \bullet I_{0,3}$ in this figure. We define the term cover from the viewpoint of compression intervals:
According to the definition, $I_{1,1}$ covers $I_{0,2} \bullet I_{0,3}$ in Figure 4. Moreover, $I_{2,0}$ covers $I_{1,0} \bullet I_{1,1}$, and $I_{1,0}$ and $I_{1,1}$ also cover $I_{0,0} \bullet I_{0,1}$ and $I_{0,2} \bullet I_{0,3}$, respectively. Since $S_{3,1}$ covers $I_{2,0}$ and $S_{3,1}$ covers $I_{0,0} \bullet I_{0,1} \bullet I_{0,2} \bullet I_{0,3}$, $I_{2,0}$ also covers $I_{0,0} \bullet I_{0,1} \bullet I_{0,2} \bullet I_{0,3}$. To generalize this property, we introduce Lemma 1, Lemma 2 and Theorem 5 below. For their proofs, refer to Appendix H, Appendix I, and Appendix D, respectively.
Theorem 5. Let $S$ and $S_c = S_{c,0}, S_{c,1}, \ldots, S_{c,2^m}$ for integer $m \ge 0$ be a ground sequence and a sensor data sequence derived through the $c$-th round compression, respectively, based on $S$. Then, in the SMEP method, for integers $c \ge 1$ and for any integer $i \ge 1$, $S_{c,i}$ covers $I_{0,(i-1)2^{c-1}} \bullet I_{0,(i-1)2^{c-1}+1} \bullet \ldots \bullet I_{0,i2^{c-1}-1}$.
This theorem tells us that $S_{c,i}$ is the sensor data selected in $I_{0,(i-1)2^{c-1}} \bullet \ldots \bullet I_{0,i2^{c-1}-1}$. Since $I_{0,k} = [2k, 2k+2]_0$, this merged compression interval is $[(i-1)2^c, i2^c]_0$, which gives the following corollary.
Corollary. Let $S_{c,i}$ be the $i$-th position sensor data in a sensor data sequence derived through the $c$-th round compression. Then, the position of $S_{c,i}$ lies in $[(i-1)2^c, i2^c]_0$.
As described before, the positions of the selected data must be known in order to restore each lost sensor data. The function that lets the SMEP method know where each sensor data is either selected or lost is the I-bit position.
Definition 11. I-bit position is defined recursively as a function that generates the bit string for the position of $S_{c,i}$ in a compression interval according to the following rules:
Rule 1. For any $c = 1, 2, \ldots, n$, I-bit position($S_{c,0}$) is not defined.
Rule 2. In the 1st round compression, for each $I_{0,k}$ in $C_{S_0}$ for $k = 0, 1, \ldots, |C_{S_0}| - 1$ and for $S_{1,i}$ selected from
In short, we call the value of I-bit position($S_{c,i}$) the I-bit position of $S_{c,i}$. The I-bit position represents the relative position of a sensor data within a compression interval. In fact, given an I-bit position($S_{c,i}$), the real relative position of $S_{c,i}$ in its interval is (the value of I-bit position($S_{c,i}$)) + 1. As compression progresses, the two consecutive compression intervals covered by the two sensor data competing for selection are merged into one compression interval covered by the selected sensor data. The above definition tells us how the position of a selected sensor data in a merged compression interval is determined by the positions of the two competing sensor data. Figure 5 illustrates this process of generating the I-bit position of a selected sensor data in compression. In the 1st round compression, $S_{0,0}$, $S_{0,2}$, $S_{0,4}$, $S_{0,5}$, and $S_{0,7}$ are selected as $S_{1,0}$, $S_{1,1}$, $S_{1,2}$, $S_{1,3}$, and $S_{1,4}$ from the compression intervals $I_{0,0}$, $I_{0,1}$, $I_{0,2}$, and $I_{0,3}$, respectively. Here, note that $S_{0,0}$ is unconditionally selected as $S_{1,0}$ and its I-bit position is undefined according to Rule 1, and that $S_{0,2}$ is selected as $S_{1,1}$ in the compression interval $I_{0,0}$.
According to Rule 2 of the above definition, their corresponding I-bit positions are null (i.e., undefined), 1, 1, 0, and 0 in the 1st round compression. In the 2nd round compression, $I_{0,0}$ and $I_{0,1}$ are merged into $I_{1,0}$, and $I_{0,2}$ and $I_{0,3}$ are merged into $I_{1,1}$; $S_{2,0}$, $S_{2,1}$ and $S_{2,2}$ are selected from $S_{1,0}$, $S_{1,1}$, and $S_{1,4}$, respectively. Each of them is selected as the one sensor data per compression interval after the 1st compression, except the unconditionally selected $S_{2,0}$ in $I_{1,0}$. $S_{1,0}$ and $S_{1,1}$ are positioned in the first compression interval $I_{1,0}$ and $S_{1,4}$ is positioned in the second one, $I_{1,1}$. According to Rule 1, the I-bit position of $S_{2,0}$ is again undefined (null). In addition, by applying the first condition of Rule 3 to the first compression interval, the I-bit position of $S_{2,1}$ is created by adding a 0 bit to the front of the I-bit position of $S_{1,1}$, so it becomes 01 as the relative position bit string for the merged compression interval $I_{2,0}$. Meanwhile, $S_{2,2}$, selected from $S_{1,4}$, lies in the second compression interval $I_{1,1}$ of the two consecutive compression intervals covered by $I_{2,0}$. In this case, we have to apply the second condition of Rule 3, and so we get the bit string 10 by adding the bit 1 to the front of the I-bit position of $S_{1,4}$. After that, $S_{3,0}$ and $S_{3,1}$ are selected from $S_{2,0}$ and $S_{2,2}$, located in the first and second compression intervals $I_{1,0}$ and $I_{1,1}$ covered by $I_{2,0}$. Therefore, we can apply Rule 1 and the second condition of Rule 3 to generate the I-bit positions of $S_{3,0}$ and $S_{3,1}$, obtaining null and 110, respectively.
With each new round compression, the I-bit position of a selected sensor data becomes one bit longer than in the previous round, so that it is ultimately a bit string of length $c$ in the $c$-th round compression.
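The I-bit position generation walked through for Figure 5 can be sketched as below. The formal statements of Rule 2 and Rule 3 are not reproduced here, so the two functions are our reconstruction from the worked example only: a first-round bit of 0 for the 2nd sensor data of an interval and 1 for the 3rd, and a leading 0 or 1 depending on whether the selected sensor data comes from the first or the second of the two merged intervals. All function names are hypothetical.

```python
def first_round_ibit(picked_third):
    """Assumed Rule 2: '0' if the 2nd sensor data is kept, '1' if the 3rd."""
    return "1" if picked_third else "0"


def next_round_ibit(previous_ibit, in_second_interval):
    """Assumed Rule 3: prefix '0' (first interval) or '1' (second interval)."""
    return ("1" if in_second_interval else "0") + previous_ibit


def relative_position(ibit):
    """Relative position in the covered interval: the bit value plus 1."""
    return int(ibit, 2) + 1


# Figure 5 example, following the ground data S_{0,2} and S_{0,7}.
s11 = first_round_ibit(picked_third=True)             # S_{1,1} = S_{0,2}
s14 = first_round_ibit(picked_third=False)            # S_{1,4} = S_{0,7}
s21 = next_round_ibit(s11, in_second_interval=False)  # S_{2,1}
s22 = next_round_ibit(s14, in_second_interval=True)   # S_{2,2}
s31 = next_round_ibit(s22, in_second_interval=True)   # S_{3,1}
```

Under these assumptions the sketch reproduces the bit strings 01, 10, and 110 of the example, and the relative positions 2, 3, and 7 place the selected data at ground subscripts 2, 7, and 7 within $[0, 4]_0$, $[4, 8]_0$, and $[0, 8]_0$.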

Absolute Position
As soon as a sensor data is selected during a compression, the SMEP method generates the I-bit position corresponding to the sensor data and updates the I-bit position sequence defined below by appending the I-bit position to it: Moreover, the position of the first bit of the I-bit position sequence is defined as 0. From the above definition, note that the I-bit position corresponding to the first sensor data is excluded from the I-bit position sequence. We can omit it because the position of the first sensor data is always fixed as the first for every compression. In Figure 5, the I-bit position sequence of $S_1$ is 1100 (= 1 Θ 1 Θ 0 Θ 0), generated by concatenating the I-bit positions of $S_{1,1}$, $S_{1,2}$, $S_{1,3}$, and $S_{1,4}$, excluding $S_{1,0}$. In the same way, the I-bit position sequences of $S_2$ and $S_3$ are 0110 (= 01 Θ 10) and 110, respectively.
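The following sketch shows how an I-bit position sequence can be built and then used to recover a position in the ground sequence. It relies on two facts stated earlier: every I-bit position in the $c$-th round has $c$ bits, and $S_{c,k}$ lies in $[(k-1)2^c, k2^c]_0$ at a relative position equal to its I-bit value plus 1. Combining them into the formula in `ground_position` is our own assumption, and all function names are hypothetical.

```python
def ibit_sequence(ibits):
    """Concatenate the I-bit positions of S_{c,1}, S_{c,2}, ... in order."""
    return "".join(ibits)


def ibit_of(sequence, c, k):
    """Slice out the c-bit I-bit position of S_{c,k} for k >= 1."""
    if k < 1:
        raise ValueError("the I-bit position of S_{c,0} is undefined")
    return sequence[c * (k - 1): c * k]


def ground_position(sequence, c, k):
    """Assumed position of S_{c,k} in the ground sequence."""
    if k == 0:
        return 0  # the first sensor data always stays first
    return (k - 1) * 2 ** c + int(ibit_of(sequence, c, k), 2) + 1


# Figure 5 example: sequences of S_1, S_2 and S_3.
seq1 = ibit_sequence(["1", "1", "0", "0"])
seq2 = ibit_sequence(["01", "10"])
seq3 = ibit_sequence(["110"])
```

For the Figure 5 sequences this gives ground subscripts 2, 4, 5, 7 for $S_{1,1}, \ldots, S_{1,4}$, subscripts 2 and 7 for $S_{2,1}$ and $S_{2,2}$, and subscript 7 for $S_{3,1}$, in agreement with the selections described for that figure.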
Given an I-bit position sequence $B_p$ of $S_c$, for any sensor data $S_{c,k}$ we can easily get the I-bit position of $S_{c,k}$ by taking the substring of $B_p$ from the $c(k-1)$-th position bit to the $(ck-1)$-th position bit. So, let us now define a function, BitSubstring:
Assume that $S_c$ is a sensor data sequence in a data queue for compression and $C_{S_c} = (I_{c,1}, I_{c,2}, \ldots, I_{c,m})$ is a 2-size compression interval covering with respect to $S_c$. Then, we define the selection by compression interval pattern as follows:
Definition 15. Given a compression interval $I_{c,i} = S_{c,i}, S_{c,i+1}, S_{c,i+2}$ in $C_{S_c}$ with respect to $S_c$, the selection by compression interval pattern with respect to $I_{c,i}$ is defined as the selection of a sensor data among $S_{c,i}$, $S_{c,i+1}$, $S_{c,i+2}$ according to the following selection by pattern rule:
Selection by Pattern Rule: If Pattern 1, Pattern 2, or Pattern 3 among the nine patterns is true, then $S_{c,i+2}$ is selected; otherwise $S_{c,i+1}$ is selected.
For brevity, we will call the selection by compression interval pattern rule the pattern selection rule or the selection by pattern rule. The selection by pattern rule is derived not from novel theoretical results that minimize the errors between every lost sensor data and its corresponding original sensor data over several compressions, but from empirical results obtained through several experiments with several sets of real-field sensor data. Thus, a compression that uses only the pattern selection method occasionally tends to perform worse than the CQP method and others. For this reason, we have devised a method that combines the pattern selection method with the selection by minimum error method in Section 5.5.
On the other hand, in two consecutive compression intervals of a 2-size compression interval covering, the first sensor data of the second interval is the same as the third sensor data of the first interval, according to Definition 5 in Section 4. If the third sensor data is selected in the first interval, the first sensor data cannot be selected in the second interval. For this reason, the selection by pattern rule selects only one of the 2nd sensor data ($S_{c,i+1}$) and the 3rd sensor data ($S_{c,i+2}$), excluding the 1st sensor data ($S_{c,i}$) of the compression interval.

Selection by Minimum Error Rule
In the SMEP method, given a 2-size compression interval covering and a compression interval for compression, the selection by minimum error rule uses two beforehand-selected sensor data from two neighbor compression intervals for sensor data selection. That is, this rule selects the sensor data to minimize error using those two beforehand-selected sensor data. Before the detailed description, we introduce the selection error of a sensor data, for the time being: Definition 16. The selection error of a sensor data is defined as an error incurred by selecting a sensor data among two or more selectable sensor data.
Our purpose in compression is to minimize selection errors so that the lost sensor data, i.e., the unselected sensor data, can be restored to values close to the original sensor data. For this purpose, we need an estimation measure that allows the selection errors of sensor data to be compared with each other. For this reason, we introduce the line interpolation error as one such measure. Figure 7 shows the line interpolation error measure used as the selection error measure, and it is defined below: Definition 17. Let $I_{k-1}$, $I_k$ and $I_{k+1}$ be compression intervals and let $S_i$ and $S_j$ be the previously selected sensor data in $I_{k-1}$ and $I_{k+1}$, respectively. For selectable consecutive sensor data $S_p$ and $S_q$ ($p \le q$), consider the two lines $S_p S_j$ and $S_i S_q$, calculated using the absolute positions of $S_i$ and $S_j$. Consider also two points, $a = (q', A)$ and $b = (p', B)$, on $S_p S_j$ and $S_i S_q$, respectively, where $p'$ and $q'$ are the absolute positions of $p$ and $q$. Then, the line interpolation errors, $E_p$ and $E_q$, of $S_p$ and $S_q$ are defined as follows: From now on, we regard the selection error as the line interpolation error. In the definition, the absolute position of every sensor data is used to calculate the line interpolation errors. There is a valid reason for using absolute positions. Consider three or more consecutive sensor data that have undergone one or more compressions. At a glance, they appear to be separated from each other by the same unit distance of 1. In general, however, their absolute positions can differ greatly. For example, consider the three sensor data $S_{2,0}$, $S_{2,1}$ and $S_{2,2}$ in Figure 2, which are consecutive with the same unit distance of 1 in the 2nd round compressed sensor data sequence. Their corresponding original sensor data, however, are $S_{0,0}$, $S_{0,2}$ and $S_{0,7}$, and the distances between them differ: 2 between $S_{0,0}$ and $S_{0,2}$, and 5 between $S_{0,2}$ and $S_{0,7}$.
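The displayed formulas for $E_p$ and $E_q$ did not survive in this copy of the text, so the following Python sketch rests on one plausible reading of Definition 17, not on the authors' exact equations. It assumes that $E_p$ is the distance between the true value of $S_q$ and the value $A$ interpolated at $q'$ on the line $S_p S_j$ (the error of restoring $S_q$ when $S_p$ is kept), and symmetrically that $E_q$ is the distance between $S_p$ and $B$. The function names, the `(absolute position, value)` tuple layout, and the tie-breaking toward $S_p$ are all illustrative assumptions.

```python
from typing import Tuple

Point = Tuple[float, float]  # (absolute position, sensor value)


def interpolate(left: Point, right: Point, position: float) -> float:
    """Value at `position` on the straight line through `left` and `right`."""
    (x0, y0), (x1, y1) = left, right
    if x0 == x1:
        raise ValueError("endpoints must have distinct absolute positions")
    return y0 + (y1 - y0) * (position - x0) / (x1 - x0)


def line_interpolation_errors(s_i: Point, s_p: Point,
                              s_q: Point, s_j: Point) -> Tuple[float, float]:
    """Return (E_p, E_q) under the assumed reading stated in the lead-in."""
    a_value = interpolate(s_p, s_j, s_q[0])  # A of point a = (q', A) on line S_p S_j
    b_value = interpolate(s_i, s_q, s_p[0])  # B of point b = (p', B) on line S_i S_q
    e_p = abs(s_q[1] - a_value)  # assumed: error of restoring S_q if S_p is kept
    e_q = abs(s_p[1] - b_value)  # assumed: error of restoring S_p if S_q is kept
    return e_p, e_q


def select_by_minimum_error(s_i: Point, s_p: Point,
                            s_q: Point, s_j: Point) -> Point:
    """Keep the candidate with the smaller assumed line interpolation error."""
    e_p, e_q = line_interpolation_errors(s_i, s_p, s_q, s_j)
    return s_p if e_p <= e_q else s_q  # tie-breaking toward S_p is an assumption
```

With the hypothetical points `(0, 10)`, `(2, 14)`, `(3, 20)` and `(7, 24)` for $S_i$, $S_p$, $S_q$ and $S_j$, this reading gives $E_p = 4$ and $E_q \approx 2.67$, so $S_q$ would be kept. Because absolute positions are used, the uneven spacing of the four points directly affects both errors.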
Exceptionally, the SMEP method uses $S_{c,0}$ as the predetermined sensor data $S_{c,i}$ in the definition in order to select one of $S_{c,1}$ and $S_{c,2}$ for $I_{c,0}$. The reason is that no interval containing a predetermined sensor data exists on the left-hand side of $I_{c,0}$; if such an interval could exist, its unique sensor data would be $S_{c,0}$ itself, which can serve as the left boundary sensor data shared with $I_{c,0}$.
The SMEP method executes the selection by minimum error rule for each $I_{2k}$ and the selection by pattern rule for each $I_{2k+1}$ in $CS_c$, for $k = 0, 1, \ldots, |CS_c|$. Therefore, the compression intervals to which the SMEP method actually applies the selection by minimum error rule are $I_{2k}$ for $k = 0, 1, \ldots, |CS_c|$. Now, focusing on $I_{2k}$ ($k = 0, 1, \ldots, |CS_c|$), we investigate some properties for minimizing the selection error.

Lemma 5.
Let $I_{c,2k-1}$, $I_{c,2k}$ and $I_{c,2k+2}$ be consecutive compression intervals in $CS_c$ of $S_c$ and let $S_{c,i}$ and $S_{c,j}$ be the previously selected sensor data in $I_{2k-1}$ and $I_{2k+2}$, respectively. Given $B_p$, an I-bit position sequence of $S_c$, for $i'$, $j'$, $p'$ and $q'$ in the line interpolation error definition, Moreover, for $A$ and $B$ in the definition, For the proof of Lemma 5, refer to Appendix L.
Theorem 7. For selectable $S_p$ and $S_q$ in $I_{2k}$ in the line interpolation error definition (Definition 17), For the proof of Theorem 7, refer to Appendix E. Until now, in this section we have used the term selection by minimum error rule without an exact formal definition. The rule is formally defined below: Definition 18. Given the two selectable sensor data, $S_{c,4k+1}$ and $S_{c,4k+2}$, in $I_{c,2k}$ in $CS_c$ of $S_c$, the selection by minimum error rule is defined as the rule that selects $S_{c,i}$ as $S_{c+1,k}$ in the $(c+1)$th round compression such that:

Main SMEP Elements
SMEP algorithms create and use a few data structures as main elements for the compression of sensor data sequences. These data structures are DataQueue, Zones, ZoneBpSeq, ZoneCompHistory, and AuxDataQueue. Moreover, the information in these main elements is transmitted to one or more consumer (i.e., destination) sites so that they can decompress the compressed sensor data and recover the lost sensor data. This section presents each of these data structures, their operations, and the related properties significant to the SMEP algorithms.

Data Queue and Zones
DataQueue is an array that stores ground sensor data and compressed sensor data, and compression is executed on this DataQueue. Each compression reduces the number of targeted ground or compressed sensor data in DataQueue by half. For this reason, the size of DataQueue must be $2^m + 1$. Furthermore, the Zones array, whose entries correspond to ranges of DataQueue, is the basis of each compression. The Zones data structure is defined below: Definition 19. Given a data queue, DataQueue[0..$2^m$], of size $2^m + 1$, Zones with respect to the data queue is defined as a data structure of size m, that is, Zones[0..m − 1], as follows:
data queue corresponding to the Zones[m − 1] have been compressed.
When a zone corresponds to a range of DataQueue, we say that the zone covers the range. Figure 8 illustrates Zones[0..3] with respect to DataQueue[0..$2^4$]. In this figure, Zones[0] covers the range from $S_{3,0}$ to $S_{3,8}$. This range, which corresponds to one plus half of the DataQueue size, is the area that stores the selected sensor data when DataQueue is full of ground or compressed sensor data. In Figure 8, all sensor data in this range have been selected through three compressions. Zones[1] covers the range from $S_{2,0}$ (DataQueue[9]) to $S_{2,4}$ (DataQueue[12]), obtained through two compressions, and the number of sensor data in this range is $2^2$. Zones[2] and Zones[3] cover the range from $S_{2,5}$ (DataQueue[13]) to $S_{2,6}$ (DataQueue[14]) and the range from $S_{0,0}$ (DataQueue[15]) to $S_{0,1}$ (DataQueue[16]), respectively. In particular, the range that Zones[3] covers has not passed through any compression in Figure 8. [...] the values of Zones[2] and Zones[3] all become 0, which means that the sensor data in these zone ranges have not undergone any compression, so that newly generated sensor data can be stored and compressed there. When the ranges covered by these zero-valued zones are full again, the second compression sets the value of the second zone, Zones[1], to 1, the number of compressions, and it makes Zones[2] and Zones[3] [...]
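The formula of Definition 19 is missing from this copy, so the following Python sketch only generalizes the layout described for Figure 8, where m = 4 and the zones cover DataQueue indices 0 to 8, 9 to 12, 13 to 14, and 15 to 16. It assumes that zone 0 covers $2^{m-1} + 1$ positions, that zone k for 0 < k < m − 1 covers $2^{m-1-k}$ positions, and that the last zone covers 2 positions. The function name `zone_ranges` and the restriction to m ≥ 2 are illustrative choices, not part of the paper.

```python
from typing import List, Tuple


def zone_ranges(m: int) -> List[Tuple[int, int]]:
    """Inclusive DataQueue index range covered by each of the m zones.

    Assumed layout (generalized from the Figure 8 example with m = 4):
    zone 0 -> 2**(m-1) + 1 slots, zone k -> 2**(m-1-k) slots for
    0 < k < m-1, and the last zone -> 2 slots.
    """
    if m < 2:
        raise ValueError("this sketch assumes m >= 2")
    sizes = [2 ** (m - 1) + 1]                              # zone 0
    sizes += [2 ** (m - 1 - k) for k in range(1, m - 1)]    # zones 1 .. m-2
    sizes.append(2)                                         # last zone m-1
    ranges, start = [], 0
    for size in sizes:
        ranges.append((start, start + size - 1))
        start += size
    # The zones must tile the whole DataQueue[0..2**m].
    assert start == 2 ** m + 1
    return ranges
```

For m = 4 this sketch reproduces the index ranges quoted from Figure 8, and for any m ≥ 2 the zone sizes add up to the DataQueue size $2^m + 1$.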

Zone Bit Sequence
Now, we introduce ZoneBpSeq[0..m − 1], an array of references to the I-bit position sequences of Zones. The numbers of compressions in the zones of DataQueue generally differ from each other, and the length of the I-bit position of the sensor data in one zone can differ from that in other zones, too. Therefore, we cannot handle all the sensor data in DataQueue as a single sensor data sequence in compressions. Consequently, we must treat the DataQueue sensor data as one sensor data sequence per zone. The I-bit position lengths of the sensor data in the same zone are all the same, since they have passed through the same number of compressions. For this reason, we use ZoneBpSeq[0..m − 1] as an array of pointers to the I-bit position sequences of the sensor data sequences corresponding to the zones of DataQueue.

Definition 21.
Given a zone of DataQueue, its zone bit sequence is defined as the I-bit position sequence corresponding to the sensor data sequence in the zone, and we call ZoneBpSeq[0..m − 1] a zone bit sequence array.
With the arrays Zones and ZoneBpSeq, we can find the absolute position of any sensor data in DataQueue according to the following theorem (for the proof, refer to Appendix F).

Zone Compression History
In every compression, the sensor data $S_{c,0}$ in zone 0 is selected unconditionally in the compression interval sequence of the zone. In fact, $S_{c,0}$ serves as the base for sensor data selection in the first application of the selection by minimum error rule in zone 0. We call a sensor data such as $S_{c,0}$ the base sensor data of the zone of DataQueue. In other words, a base sensor data is the first sensor data in the first interval of a zone of DataQueue, and it serves as the base for the first application of the selection by minimum error rule in the zone. A problem, however, is that no zone other than zone 0 has a base sensor data, which prevents a new sensor data from being selected from the first interval of such a zone during compression. To resolve this problem, we make the last selected sensor data of the previous zone the base for the compression of the sensor data sequence that follows that zone. Even so, another problem occurs if the zones keep no history of the sensor data selected in every compression. Figure 10 shows such an example. In this figure, the sensor data sequence spanning zone 1 and zone 2 should be compressed, and the numbers of compressions in zone 0 and zone 1 differ: 2 and 0, respectively. In this case, compression is impossible without the value and absolute position of the sensor data last selected in zone 0 in the 1st compression, kept as a historical sensor data, because otherwise there is no base sensor data for the 1st compression. For this reason, we introduce the array ZoneCompHistory as the data structure that keeps the historical information about zone compressions. As shown in Figure 10, the rows of the ZoneCompHistory array correspond to the zones of DataQueue and its columns correspond to the number of compressions. Each entry is a pair of the value and absolute position of the sensor data last selected in the zone through that compression.
As an example, in Figure 10, ZoneCompHistory[0,1] indicates that the sensor data with value 28 and absolute position 7 was selected as the last sensor data in zone 0 through the 1st compression. This sensor data is the base for the 1st compression of the sensor data sequence spanning zones 1 and 2.
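The base lookup described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the history is modeled as one dictionary per zone, keyed by the compression number and holding a (value, absolute position) pair, and the helper names `make_history` and `base_sensor_data` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Entry = Tuple[float, int]  # (sensor value, absolute position)


def make_history(m: int) -> List[Dict[int, Entry]]:
    """One row per zone; each row maps a compression number to an Entry."""
    return [dict() for _ in range(m)]


def base_sensor_data(history: List[Dict[int, Entry]],
                     data_queue: Sequence[float],
                     zone: int, compression: int) -> Entry:
    """Base for the given compression of the sequence starting at `zone`.

    Zone 0 always uses DataQueue[0] at absolute position 0; any other zone
    uses the entry that the previous zone recorded for the same compression.
    A KeyError signals that no such history entry has been recorded.
    """
    if zone == 0:
        return (data_queue[0], 0)
    return history[zone - 1][compression]
```

Using the Figure 10 values, storing `(28, 7)` in `history[0][1]` makes `base_sensor_data(history, data_queue, 1, 1)` return `(28, 7)`, the base for the 1st compression of the sequence that starts at zone 1.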

Auxiliary Data Queue
As long as the number of compressions in the last zone of DataQueue is less than that in its previous zone, no new compression with its neighbor zones occurs even though DataQueue is filled with compressed sensor data. In this situation, AuxDataQueue, an auxiliary data queue, is used to store new ground sensor data and to raise the number of compressions in the last zone until it equals the number of compressions in its previous zone. Using AuxDataQueue, the number of compressions of the last zone increases one by one until it equals that of its previous zone. Figure 11 shows an example of using an auxiliary queue for a compression on the last zone; its numbered steps illustrate, in order, the main steps and changes of the last zone compression process.
Given Zones[m − 1], the number of compressions in the last zone, AuxDataQueue[0..$2^{Zones[m-1]+1}$], an array of size $2^{Zones[m-1]+1} + 1$, is required to increase the number of compressions of the last zone by one (step 1 in Figure 11). Note that in Figure 11 [...] (Figure 11). Moreover, the final compression on the auxiliary queue is always completed with the selection by pattern rule (step 5 in Figure 11). Next, one of the two sensor data in the last zone is selected according to the selection by minimum error rule, using ZoneCompHistory[m − 2, Zones[m − 1] + 1] as the base and the remaining auxiliary queue sensor data as the sensor data selected through the selection by pattern rule (step 6 in Figure 11). This selected sensor data and the remaining auxiliary queue sensor data are then stored at the position before the last and at the last position in DataQueue, respectively (step 7 in Figure 11). During these steps, the I-bit positions of each selected sensor data are calculated according to the I-bit position definition (Definition 11), and they are used to find the absolute positions and to build the I-bit position sequence of the last zone. Accordingly, ZoneCompHistory is updated with the last selected sensor data value and its absolute position (step 8 in Figure 11). Finally, when the compression on the two last sensor data has finished, the last value of Zones is changed to the same value as the Zones value of the zone before the last zone (step 9 in Figure 11).

SMEP Algorithms
Compression algorithms that execute the SMEP method use the rules, data structures, operations, and theorems in Sections 5 and 6. For compressions, these algorithms follow the Zones operations rules in Definition 20, partly or fully. The AuxDataQueueCompression algorithm, introduced in Section 7.1 as the smallest procedure unit, executes compressions on a ground sensor data sequence in AuxDataQueue. The LastZoneCompression algorithm, introduced in Section 7.2, compresses the two sensor data in the last zone using AuxDataQueueCompression. The EquivalentZonesCompression algorithm, introduced in Section 7.3, performs compression over consecutive zones with the same number of compressions, according to the Zones operations rules, when DataQueue is full and the number of compressions of the last zone is the same as that of its neighbor zone. Following the Zones operations rules as a whole, the SMEP main algorithm, introduced in Section 7.4, controls and manages compressions in the various cases using the above algorithms. We now introduce these algorithms in turn in the following subsections.

AuxDataQueue Compression Algorithm
The AuxDataQueueCompression algorithm compresses a sensor data sequence in AuxDataQueue. This algorithm is illustrated in Algorithm 1. Finally, the algorithm executes the zoneCValue-th compression in line 19 by applying the selection by pattern rule, selects only one sensor data, and returns its value, absolute position, and I-bit position.

Last Zone Compression Algorithm
LastZoneCompression is a procedure that reads ground sensor data and compresses them using the AuxDataQueueCompression procedure in order to increase the number of compressions of the last zone by one. The LastZoneCompression procedure is called by the SMEP main procedure, introduced in Section 7.4, when the last zone becomes full and the number of compressions in the last zone is less than that in its previous neighbor zone. The last zone compression algorithm is shown in Algorithm 2. The LastZoneCompression procedure allocates the AuxDataQueue array space so that the last zone can undergo one more compression (procedure line 4). Hence, the size of the AuxDataQueue array space must be $2^{Zones[m-1]+1}$. Next, the procedure begins to read ground sensor data and inserts them into AuxDataQueue (procedure lines 6 to 15). Whenever the procedure reads a ground sensor data, it checks whether the communication failure has been recovered, using the Receive(CRmessage) function (procedure line 10). If the check is true, the procedure returns to the calling procedure, the SMEP main procedure, and informs it of the recovery from the communication failure by returning 'FinishMode' (procedure line 11). After filling AuxDataQueue, the procedure calls the AuxDataQueueCompression procedure to compress the ground sensor data in AuxDataQueue (procedure line 16). The AuxDataQueueCompression procedure returns the finally selected sensor data value, its absolute position, and its zone bit sequence (procedure line 16). The procedure uses the returned value and absolute position to select one of the two last sensor data by the selection by minimum error rule (procedure line 17). The finally selected sensor data and the returned value are inserted into

Consecutive Equivalent Zones Compression Algorithm
Let us say that zones are equivalent when they have the same number of compressions.
EquivalentZonesCompression is a procedure that compresses the consecutive zones equivalent to the last zone while keeping Rule 2 (i) and (ii) of the Zones operations definition (Definition 20). This procedure also returns to the SMEP main procedure the value and absolute position of the last selected sensor data, that is, the last sensor data of the beginning zone of the consecutive equivalent zones after their sensor data have been compressed. The SMEP main procedure calls this procedure, passing the beginning zone of the consecutive equivalent zones as the parameter StartZone, as shown in Algorithm 3. [...] Position is used as its absolute position (procedure line 4). The compression proceeds over the 2-size compression interval covering of this sensor data sequence (procedure lines 4 to 11). In this compression, the procedure alternately applies the selection by pattern rule and the selection by minimum error rule for the odd-numbered intervals and the even-numbered intervals, respectively (procedure lines 6 and 7). Here, DQGP is used to calculate the DataQueue position at which the sensor data selected for each interval is stored (procedure lines 8 and 9). The sensor data sequence from the StartZone zone to the last zone, m − 1, is compressed into the StartZone zone, which is proved at the end of this section. Additionally, the zone bit sequence of the StartZone zone is generated during compression using the I-bit positions of $S_{c,p}$ and $S_{c,j}$ and the zone bit strings corresponding to them (procedure lines 10 and 12). The procedure calculates the absolute position of the last selected sensor data in the compression using Zones [...] Finally, we prove the theorem below (for the proof, refer to Appendix G): Theorem 9. In the compression of the sensor data sequence in consecutive equivalent zones from the StartZone zone to the last zone, m − 1, in DataQueue, this sensor data sequence is compressed into the StartZone zone.

Main Algorithm
The SMEP main procedure stores and compresses the growing ground sensor data sequence while enforcing Zones operations Rule 1 and Rule 2 of Definition 20 in Section 6.1. This procedure uses the data structures and procedures introduced in the previous sections. The algorithm is illustrated in Algorithm 4.
At first, the procedure initializes the Zones and ZoneCompHistory data structures, the absolute position variable Ap, the DataQueue index DQinx, and the zone index variable ZNinx (procedure lines 1 to 7). Then, the procedure reads a ground sensor data and inserts it into DataQueue (procedure lines 9 and 10).
If DataQueue is not full and the communication failure has not yet been recovered (procedure line 12), the procedure continues to read new sensor data and insert them into DataQueue (procedure lines 14 and 15). If the communication failure is recovered (procedure line 12), the procedure returns all sensor data in DataQueue, the Zones data, the ZoneBpSeq data, the ZoneCompHistory data, and the absolute position Ap of the last ground sensor data to CDTR mode via CRP mode (procedure line 13). The CDTR mode procedure transmits them to the monitoring and control center (i.e., the destination site), where the original ground data and the lost sensor data are ultimately restored. How the data are transmitted in CDTR mode via CRP mode is beyond the scope of this paper.
If DataQueue becomes full, the procedure repeatedly compresses the last zone while the number of compressions of the last zone is less than that of its previous zone (procedure lines 18 to 24). For the last zone compression, the procedure calls the LastZoneCompression procedure. As described before, the LastZoneCompression procedure returns 'FinishMode' when it receives a CR message while reading and inserting new ground sensor data (procedure lines 10 and 11 in Algorithm 2). In that case, the procedure returns DataQueue, Zones, ZoneBpSeq, ZoneCompHistory, Ap, and AuxDataQueue, which are essential for recovering the original and lost sensor data (procedure lines 20, 21, and 22). If the procedure does not receive any CR message, it increases the number of compressions of the last zone by one (procedure line 23).
When the last zone and its previous zone become equivalent, the procedure finds the start zone of the consecutive equivalent zones according to the Zones operations rule (procedure line 25). Then, the procedure calls the EquivalentZonesCompression procedure to compress these zones into the start zone, and it saves the return values, namely the last selected sensor data and its absolute position, to ZoneCompHistory so that they can serve in future compressions as the base sensor data and absolute position for the next zone (procedure line 26). The procedure also increases the number of compressions of the last zone by one (procedure line 27), because the sensor data sequence in the consecutive equivalent zones has been compressed once more into the start zone. As the sensor data sequence from the start zone to the last zone shrinks into the start zone alone through the consecutive equivalent zones compression, the DataQueue memory spaces corresponding to the remaining zones become empty, and the first position of the zone immediately after the start zone is the DataQueue location at which new ground sensor data begin to be stored. Thus, the procedure must initialize the parts of the data structures that correspond to these empty zones, that is, the entries of Zones, ZoneBpSeq, and ZoneCompHistory from the zone immediately after the start zone to the last zone. Procedure lines 28 to 34 perform these initializations. The first position at which a new sensor data is stored is calculated into the DQinx variable in procedure line 35.
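The two zone-level decisions described above can be sketched in Python as follows. This is a minimal illustration rather than Algorithm 4 itself: it assumes that Zones is a plain list holding the number of compressions of each zone, and the function names are hypothetical. It covers only the test that triggers a last zone compression and the search for the start zone of the consecutive equivalent zones; the compression, bookkeeping, and reset steps are not modeled.

```python
from typing import List


def needs_last_zone_compression(zones: List[int]) -> bool:
    """True while the last zone has fewer compressions than its previous zone."""
    return len(zones) >= 2 and zones[-1] < zones[-2]


def find_start_zone(zones: List[int]) -> int:
    """Index of the first zone in the run of zones equivalent to the last zone.

    Zones are equivalent when they have the same number of compressions, so
    the scan walks backward from the last zone while that number stays equal.
    """
    if not zones:
        raise ValueError("Zones must contain at least one zone")
    start = len(zones) - 1
    while start > 0 and zones[start - 1] == zones[-1]:
        start -= 1
    return start
```

For instance, with the hypothetical value `zones = [3, 2, 2, 2]`, the last zone is already equivalent to its previous zone, and `find_start_zone` returns 1, so zones 1 to 3 would be handed to the equivalent zones compression.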

Compressed Sensor Data Decompression and Lost Sensor Data Recovery
The final consumers of the values returned by SMEP main are one or more destinations that receive the values via a wireless sensor network and one or more existing networks and use them for their own purposes. One of the main consumers is generally a monitoring and control center that monitors the external or internal environment of each sensor node using the sensor data transmitted by the nodes, and controls the nodes by sending control messages if necessary. To use the SMEP main return values, the consumer must decompress them and recover the sensor data that were left unselected while sensor data were being stored and compressed. In this section, we introduce algorithms for the decompression of compressed sensor data and the recovery of lost sensor data.

Sensor Data Line Interpolation Algorithm
The Sensor Data Line Interpolation Algorithm uses the line interpolation method to recover the lost sensor data. Line interpolation is a very simple method for interpolating the unknown values of points between two known endpoints. Algorithm 5 illustrates SensorDataLineInterpolation, a line interpolation algorithm for lost sensor data recovery. The procedure finds the equation of the line passing through the two endpoints (procedure line 4). If one or more absolute positions exist between these two endpoints, the sensor data at these absolute positions are those lost in compressions. Therefore, the procedure calculates approximate values at these absolute positions by substituting each of them into the line equation one by one (procedure lines 3 to 7) and outputs each pair of value and absolute position (procedure line 5). This procedure is used in the SensorDataRecovery procedure introduced in the next section.
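The recovery step just described can be sketched in Python as follows. This is a minimal illustration of line interpolation between two known endpoints, not a transcription of Algorithm 5: the function name, the `(absolute position, value)` tuple layout, and the choice to return a list instead of outputting each pair are assumptions made for the example.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (absolute position, sensor value)


def sensor_data_line_interpolation(left: Sample, right: Sample) -> List[Sample]:
    """Recover the samples strictly between two known endpoints.

    Every integer absolute position between the endpoints is treated as a
    sensor data lost in compression and receives the value of the straight
    line through the endpoints at that position.
    """
    (x0, y0), (x1, y1) = left, right
    if x1 <= x0:
        raise ValueError("right endpoint must lie after the left endpoint")
    slope = (y1 - y0) / (x1 - x0)
    return [(x, y0 + slope * (x - x0)) for x in range(x0 + 1, x1)]
```

With the hypothetical endpoints `(0, 10.0)` and `(4, 18.0)`, the sketch recovers the three positions 1, 2 and 3 with values 12.0, 14.0 and 16.0; adjacent endpoints yield an empty list because nothing was lost between them.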

Sensor Data Recovery Algorithm
SensorDataRecovery is an algorithm that receives the data structures (DataQueue, Zones, ZoneBpSeq, ZoneCompHistory, and AuxDataQueue) and the variable Ap from the SMEP main procedure via a wireless sensor network and existing networks, and that decompresses and recovers the sensor data compressed and lost by the sensor node running SMEP main during the communication failure. All of the sensor data in each zone of DataQueue are compressed or original ground sensor data finally selected through the same number of compressions. The SensorDataRecovery algorithm finds the absolute positions of the remaining sensor data in each zone. For this, the procedure uses the FindAbsoluteAddress function, which finds an absolute address using the DataQueue, Zones, and ZoneBpSeq arrays when a sensor data position in DataQueue is given. Since the calculation of the absolute address from these arrays has already been given as a formula in Theorem 8 in Section 6.2, we omit the detailed procedure of that function in this paper. With these sensor data values and the absolute positions found, the SensorDataRecovery procedure recovers the values of the lost sensor data as interpolated values using the SensorDataLineInterpolation procedure described in the previous section. The detailed algorithm is presented in Algorithm 6. The procedure first finds the last zone whose zone value is not zero (procedure line 3). The zones from zone 0 to this non-zero-valued zone are the target zones for decompression and recovery. The zones whose value is 0 contain original ground sensor data without any compression. The sensor data in the zero-valued zones need no decompression or recovery at all; only their values and absolute positions need to be output. The pseudocode for this corresponds to procedure lines 29 to 40. The pseudocode that executes decompression and recovery for the non-zero-valued target zones is shown in procedure lines 5 to 27.
Each zone needs a base sensor data for the first decompression and recovery. The pseudocode that determines the base sensor data is shown in procedure lines 6 to 9. The base sensor data of the 0th zone is DataQueue[0], that is, $S_{c,0}$, and its absolute address is 0 (procedure line 7), since $S_{c,0}$ is always selected unconditionally in every compression. Meanwhile, the base sensor data of each target zone other than the 0th zone is the value and absolute position stored in ZoneCompHistory for the previous zone at the same number of compressions as the target zone (procedure line 8). To execute decompression and recovery on the DataQueue sensor data of each target zone, the procedure needs to know the boundaries of the DataQueue space corresponding to the target zone. The lowest boundary is determined by the greatest boundary of the previous target zone. Hence, the procedure determines only the greatest boundary, as shown in procedure lines 10 to 13. Then, the procedure repeatedly performs decompression and recovery for all sensor data (procedure lines 15 to 21). While doing this, the procedure finds the absolute position of each new sensor data in DataQueue through the FindAbsolutePosition function (procedure line 17). Then, the procedure calls SensorDataLineInterpolation with the values and absolute positions of the old and new sensor data so that each lost sensor data between these two sensor data is recovered as a line-interpolated value with its absolute position and output (procedure line 18). Line interpolation needs two points. The last sensor data in each zone serves as one of the two points, as the partner of a previous or base sensor data, but in the next decompression and recovery it needs a new partner for the line interpolation.
Therefore, the procedure uses as its new partner the sensor data in ZoneCompHistory whose value and absolute position served as the base sensor data for the next zone (procedure lines 22 to 25). Then, the procedure recovers and outputs the compressed and lost sensor data between the last sensor data and its partner by calling SensorDataLineInterpolation (procedure lines 26 to 29).
If the zone is the last target zone, the first sensor data of the next zone is a ground sensor data; the procedure therefore selects, as the partner of the last sensor data, the ZoneCompHistory entry of that zone with the lowest number of compressions, 1 (procedure line 23). There may be lost sensor data between the last partner selected from ZoneCompHistory and the first ground sensor data in the first zero-valued zone. This recovery process appears in procedure lines 32 to 37. The process is unnecessary when the last zero-valued zone is zone 0 or when the last partner selected from ZoneCompHistory is the sensor data with the greatest absolute position covered by the last target zone for decompression and recovery. These cases are handled in procedure lines 32 and 34, respectively. Meanwhile, if the last target zone is the last zone, m − 1, of DataQueue, the next new partner of the last partner selected from ZoneCompHistory exists in AuxDataQueue, because no next zone exists to contain it. Otherwise, the new partner exists in DataQueue. The recovery processes for these two cases are handled in procedure lines 34 and 35 and in procedure lines 32 and 33, respectively. Furthermore, each case is involved in the next process for ground sensor data in the zero-valued zones and AuxDataQueue. In other words, when the last target zone is not the last zone of DataQueue, the procedure simply outputs each remaining sensor data in DataQueue with its absolute position (procedure lines 38 to 42), since all of the ground sensor data remain in DataQueue. Otherwise, the procedure outputs each remaining sensor data in AuxDataQueue with its absolute position (procedure lines 38 to 42), because ground sensor data no longer exist in DataQueue.

Performance Comparisons and Analysis
Until now, we have focused on minimizing the error of the lost sensor data in compression intervals rather than on minimizing energy consumption, as have most current research works. Therefore, our performance evaluation focuses on the average error rate per recovered sensor data as one of the performance factors. This section presents the average error rates obtained from experiments based on real-field sensor data sets.

Experimental Sensor Data Sets and Samples
Samples were extracted from each of four real-field sensor data sets: underwater pH in the ocean, underwater temperature in the ocean, relative humidity in a city, and air temperature in a city. The underwater pH and underwater temperature sets were collected through a wireless sensor network that we developed, in a real field: a littoral sea near Yokjido, a small island in Tongyeong-Si, Gyeongsangnam-do, Korea. The other sets, relative humidity and air temperature, were collected in Seoul, Korea, by the Meteorological Administration. Each sensor data in each of these four sets was collected every alternate hour.
Before choosing these four sensor data sets, we had considered two types of sensor data sets as experimental sets: one in which changes in the difference among sensor data values are frequent and high, and one in which such changes are infrequent and low. One reason for this consideration is that we had predicted higher error rates in sets with frequent and high value changes than in sets with infrequent and low value changes. Another reason was to ensure the validity of the average error rate of the SMEP method in sets with frequent and high value changes. Specifically, our main concerns are whether the average error rate of the SMEP method is valid in sets with frequent and high value changes, and up to how many compressions are generally reasonable. Two sensor data sets, the relative humidity set and the air temperature set, have more frequent and higher value changes than the two remaining sets, the underwater pH set and the underwater temperature set. We chose different samples from these four sensor data sets, as shown in Table 2, depending on the experiment. How and why they were chosen is described in detail in Sections 9.2.1 and 9.2.2.

Comparison Target Methods
For comparison, we have chosen five compression methods, i.e., winavg, delta, CQP, 2MC, and SMEP. Each method's own scheme was applied to each sample for compression. We have applied a simple line interpolation to decompression and recovery in the SMEP method. We have not applied other interpolations, such as two- or three-point spline interpolation, because when we tried them their average error rates were higher in each sample than those of the line interpolation.
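The recovery step described above can be illustrated with a minimal sketch of line (linear) interpolation. It assumes that each retained sensor data is available as an (absolute position, value) pair with strictly increasing positions; the function name `linear_recover` is hypothetical and is not part of the SMEP procedure itself.

```python
def linear_recover(kept: list[tuple[int, float]]) -> list[float]:
    """Rebuild one value per absolute position between the first and last kept data.

    `kept` holds (absolute_position, value) pairs sorted by strictly increasing
    position (an assumption of this sketch).
    """
    if not kept:
        raise ValueError("at least one retained sensor data is required")
    recovered: list[float] = []
    for (p0, v0), (p1, v1) in zip(kept, kept[1:]):
        if p1 <= p0:
            raise ValueError("positions must be strictly increasing")
        step = (v1 - v0) / (p1 - p0)  # slope of the straight line between partners
        recovered.extend(v0 + step * k for k in range(p1 - p0))
    recovered.append(kept[-1][1])  # the last retained value closes the sequence
    return recovered


if __name__ == "__main__":
    # Positions 1 and 3 were dropped by compression and are rebuilt on the line.
    print(linear_recover([(0, 1.0), (2, 3.0), (4, 3.0)]))
```

Retained values are reproduced unchanged, so only the dropped positions contribute to the recovery error.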

Performance Evaluation Measure
For performance evaluation, we use the average error rate defined below.

Definition 22. Let $M$ and $S_0 = S_{0,0}, S_{0,1}, \cdots, S_{0,n-1}$ be a compression method and a ground sensor data sequence, respectively. Given a compressed sensor data sequence of $S_0$, $S_c = S_{c,0}, S_{c,1}, \cdots, S_{c,m}$, let $\hat{S}_0 = \hat{S}_{0,0}, \hat{S}_{0,1}, \cdots, \hat{S}_{0,n-1}$ be a decompressed and recovered sensor data sequence of $S_c$. Then, the average error rate (AER) $E_{M,S_c}$ of $S_c$ with respect to $S_0$ in the compression method $M$ is defined as the ratio of the average difference value between the recovered sensor data and the original sensor data to the average original sensor data value.
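A minimal sketch of how such an average error rate could be computed is given here. It rests on two assumptions that the text does not spell out: the difference between a recovered and an original value is taken as an absolute difference, and the rate is expressed as a percentage. The function name `average_error_rate` is hypothetical.

```python
from typing import Sequence


def average_error_rate(original: Sequence[float], recovered: Sequence[float]) -> float:
    """Average error rate in percent (assumed form).

    Mean absolute difference between recovered and original values,
    divided by the mean original value.
    """
    if len(original) != len(recovered) or len(original) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(original)
    mean_abs_diff = sum(abs(r - o) for o, r in zip(original, recovered)) / n
    mean_original = sum(original) / n
    if mean_original == 0:
        raise ValueError("mean original value is zero; the rate is undefined")
    return 100.0 * mean_abs_diff / mean_original


if __name__ == "__main__":
    ground = [10.0, 20.0, 30.0]
    rebuilt = [10.0, 22.0, 30.0]
    print(average_error_rate(ground, rebuilt))
```

Under these assumptions, a perfectly recovered sequence yields a rate of 0, and the rate grows with the mean deviation relative to the typical magnitude of the data.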

Experimental Tool and Method
We have used MATLAB R2014a as an experimental tool. With this tool, we have experimented with four compression methods per sample, while increasing the number of compressions one-by-one. With our experimental results, we have analyzed and evaluated our SMEP method, comparing it with other methods.

Experiments, Experimental Results and Analysis
In the wireless sensor network literature, only a few methods have been proposed for reducing sensor data loss during communication failure, whereas many methods have been presented for reducing the number of sensor data transmissions for energy efficiency. The former are the winavg, delta, CQP, and 2MC methods, briefly discussed in Section 2; among them, the winavg and delta methods are others' works. In this section, we show experimental results comparing the SMEP method with the winavg, delta, CQP, and 2MC methods.
We performed experiments for two categories of analysis: the AER analysis in round compressions and the AER analysis in Zones value patterns. In Sections 9.2.1 and 9.2.2, we present the above two categories of experiments, their experimental results, and the analysis on them, respectively. In Section 9.2.2, moreover, we analyze characteristics of the SMEP method presenting graphs of Zones value patterns.

Average Error Rates in Round Compressions
We experimented with four samples for each of the five compression methods: winavg, delta, CQP, 2MC, and SMEP. We constructed one sample from each of the four real-field sensor data sets: relative humidity, air temperature, underwater pH, and underwater temperature. Each sample consisted of 129 consecutive ground sensor data extracted from the corresponding sensor data set. Varying the number of round compressions from one to four for each of the four samples per method, we estimated the individual AERs and compared them across the five methods (here, one to four round compressions reduce the original sample size to compression ratios of 50%, 25%, 12.5%, and 6.25%, respectively). Table 3 presents the results. In the sensor data samples with more frequent and higher value changes, i.e., the relative humidity and air temperature samples, the SMEP method shows a better AER than the other methods in every round compression.
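The compression ratios quoted for one to four round compressions follow from halving the sample once per round. The sketch below tabulates this for a 129-data sample, assuming that each round keeps the first sensor data and halves the number of intervals that follow it; the function name `round_compression_sizes` is hypothetical.

```python
def round_compression_sizes(n_ground: int, rounds: int) -> list[tuple[int, int, float]]:
    """Return (round, retained_count, nominal_ratio_percent) for each round.

    Assumes the sample has 2**m + 1 data, so the n_ground - 1 intervals
    can be halved evenly in every round.
    """
    intervals = n_ground - 1
    rows: list[tuple[int, int, float]] = []
    for r in range(1, rounds + 1):
        if intervals % 2 != 0:
            raise ValueError("the number of intervals cannot be halved evenly")
        intervals //= 2
        rows.append((r, intervals + 1, 100.0 / 2 ** r))
    return rows


if __name__ == "__main__":
    for r, kept, ratio in round_compression_sizes(129, 4):
        print(f"round {r}: {kept} data retained, nominal ratio {ratio}%")
```

The nominal ratio is 100 / 2^r; the retained count includes the first sensor data, so the exact fraction (for instance 65 of 129 after one round) is marginally above the nominal value.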
In the underwater pH and underwater temperature samples, which have infrequent and low value changes, the differences among the AERs of the five methods in the same round compression are small, but the SMEP method still shows a better AER than the other methods in most cases. In particular, note that the AER of the SMEP method is better than that of the 2MC method. Meanwhile, Table 3 shows that the AER differences between the SMEP and 2MC methods and the other methods are much bigger in the relative humidity and air temperature samples than in the underwater pH and underwater temperature samples. For example, in Table 3, the AER difference between the SMEP and delta methods in the 3rd round compression of the underwater temperature sample is only 0.37%, whereas their AER difference in the same round compression of the air temperature sample is 21.44%. Therefore, in compressions of ground sensor data with more frequent and higher value changes, the SMEP and 2MC methods are more effective than the other methods.
In all the compression methods, the number of round compressions up to which a method remains feasible clearly depends on the change properties of the sensor data. As shown in Table 3, for all methods, in the underwater pH and underwater temperature samples with infrequent and low value changes, the AER difference between the 4th round compression and each of the other round compressions is so small that it can generally be ignored. In contrast, every method shows that the 4th round compression is not feasible in the relative humidity and air temperature samples with frequent and high value changes. Although the maximum feasible number of round compressions depends on sensor data properties and sampling periods, Table 3 shows that the SMEP method is generally feasible up to the 3rd round compression for the samples with frequent and high value changes. In fact, the maximum feasible number of round compressions depends entirely on how finely or frequently sensor data are sampled: the more frequently sensor data are sampled, the higher the maximum feasible number of round compressions.

Average Error Rates in Zones Value Patterns
One of the major merits of the SMEP method is Zones compression. This merit is based on the Zones rules, under which the number of compressions can differ among the zones of DataQueue. In fact, the round compressions in the experiments described in Section 9.2.1 are special cases in which the values of the non-zero-valued zones are all the same and there is no ground sensor data in zero-valued zones. Such cases are rare. The cases of Zones value patterns introduced in this section are much more general and frequent than the cases of round compressions. From now on, we let an m-n-o-p Zones value pattern or, for short, an m-n-o-p Zones pattern mean that the values of Zones[0], Zones[1], Zones[2], and Zones[3] are m, n, o, and p, respectively, in DataQueue[0..16]. In other words, an m-n-o-p Zones pattern means that the sensor data corresponding to Zones[0], Zones[1], Zones[2], and Zones[3] in DataQueue[0..16] are compressed m, n, o, and p times, respectively.
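To make the m-n-o-p notation concrete, the following sketch counts how many ground sensor data a given Zones pattern represents. It assumes that position 0 is always kept uncompressed, that zone j holds 2^(m−j−1) entries of DataQueue, and that the final position 2^m is counted in the last zone; the function names are hypothetical. Under these assumptions the counts agree with the sample sizes used in the experiments of this section.

```python
def zone_sizes(m: int) -> list[int]:
    """Number of DataQueue entries per zone for a queue of 2**m + 1 entries."""
    sizes = [2 ** (m - j - 1) for j in range(m)]
    sizes[-1] += 1  # assumption: the final position 2**m belongs to the last zone
    return sizes


def ground_span(zones: list[int]) -> int:
    """Ground sensor data represented by DataQueue under the given Zones values.

    Each entry of zone j stands for 2**zones[j] ground data; position 0
    contributes one uncompressed datum.
    """
    sizes = zone_sizes(len(zones))
    return 1 + sum(size * 2 ** count for size, count in zip(sizes, zones))


if __name__ == "__main__":
    for pattern in ([1, 0, 0, 0], [2, 0, 0, 0], [3, 0, 0, 0], [3, 2, 2, 1], [3, 2, 2, 2]):
        print("-".join(map(str, pattern)), ground_span(pattern))
```

For DataQueue[0..16] (m = 4), the zones hold 8, 4, 2, and 2 entries, so a 3-2-2-1 pattern represents 1 + 64 + 16 + 8 + 4 = 93 ground sensor data.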
We carried out experiments to investigate the merits brought by Zones patterns. For these, we prepared DataQueue[0..16]. In practice, DataQueue is much bigger than this, because its size can be whatever space accommodates $2^m + 1$ sensor data within the available free memory. With current technologies, real DataQueue sizes can generally range from dozens of kilobytes to hundreds of megabytes of available flash memory. We nevertheless prepared such a small DataQueue in order to validate the merits of the SMEP method in Zones compressions even with a small DataQueue size. We also prepared five Zones patterns, i.e., 1-0-0-0, 2-0-0-0, 3-0-0-0, 3-2-2-1, and 3-2-2-2, chosen randomly because there are too many possible Zones patterns to investigate them all. The sample sizes corresponding to these patterns are 25, 41, 73, 93, and 97, so that each zero-valued zone is full of ground sensor data. We experimented with five samples of these sizes for each of the five compression methods: winavg, delta, CQP, 2MC, and SMEP. Five samples were constructed for each of the four real-field sensor data sets: relative humidity, air temperature, underwater pH, and underwater temperature. According to the sample sizes corresponding to the five Zones patterns, each of the five samples consisted of consecutive ground sensor data extracted randomly from each of these four sensor data sets. With these samples, we estimated the individual AERs and compared them across the five methods. Table 4 presents the results. Table 4 shows that, in terms of Zones value patterns as well, the AER of the SMEP method is better than the AERs of the other methods. Figure 12 shows more explicitly how the graph shapes change in accordance with Zones patterns.
The figure shows not only the graph patterns of sensor data recovered from relative humidity sensor data compressed by the winavg, delta, CQP, 2MC, and SMEP methods, but also comparisons between these patterns and the original sensor data pattern.

Table 4. Average error rates in sensor data samples corresponding to Zones value patterns (DataQueue size: 16, CR: Compression Ratio (%)). Column headers: Sensor Data Sets (Relative Humidity, Air Temperature), Methods, Zones.

In the 1-0-0-0 Zones pattern, sensor data from the 1st to the 17th, covered by Zones[0], are one-time compressed data, whereas sensor data from the 18th to the 25th, covered by Zones[1], Zones[2], and Zones[3], of which the values are 0, are ground sensor data. Figure 12a shows the 1-0-0-0 Zones pattern. In this graph, we can see that the CQP, 2MC, and SMEP methods follow patterns similar to the original sensor data.
Figure 12b shows the 2-0-0-0 Zones pattern. In this pattern, sensor data between the 1st and the 33rd are two-time compressed data covered by Zones[0], whereas sensor data between the 34th and the 41st are original sensor data. As in the 1-0-0-0 Zones pattern, the CQP, 2MC, and SMEP methods follow patterns similar to the original sensor data. Nonetheless, unlike the other methods, including CQP and 2MC, the SMEP method produces a graph part between the 34th and the 41st sensor data that has the same pattern as the original sensor data graph. This graph part indicates that the SMEP method has a better AER between the original sensor data and its own sensor data part than the other methods.
In the 3-0-0-0 Zones pattern, sensor data between the 1st and the 65th are three-time compressed data covered by Zones[0], whereas sensor data between the 66th and the 73rd are original sensor data covered by Zones[1], Zones[2], and Zones[3]. Figure 12c shows that, for the SMEP method, the graph part covered by Zones[0] is more similar to the corresponding original sensor data part than for the other methods. In particular, the graph part covered by Zones[1], Zones[2], and Zones[3] is exactly the same as the corresponding original sensor data part.
In the 3-2-2-1 and 3-2-2-2 Zones patterns, Figure 12d,e shows that the graph shapes of the SMEP method are much more similar to the graph shape of the original sensor data than those of the other methods. In fact, the graph shape of the SMEP method becomes more similar to the original sensor data than those of the other methods if the Zones[0] value is greater than the values of the next non-zero-valued zones.
One of the reasons for this, as shown in Section 9.2.1, is that the SMEP method generally has a better AER performance than the other methods. Moreover, the more important reason is that in the SMEP method, the number of compressions of sensor data in one zone is different from that in the other zones if the Zones[0] value is greater than the values of the next non-zero value zones. In the other methods, however, every sensor data has the same number of compressions as Zones[0]. Thereby, the SMEP method tends to exhibit better AER performance effects in the zones with the lower number of compressions than that in Zones[0].

Conclusions
In this paper, we have proposed a communication framework and a feasible method for reducing the sensor data loss that can occur during communication failure. Our formal approach also provides a theoretical basis for problems on the reduction of sensor data loss during communication failure. In the comparison with current compression methods, the SMEP method has shown better performance than the others in average error rate per sensor data.
Meanwhile, current technologies have produced and commercialized cheap microcontroller units with flash memories in which program code or data can be permanently saved. Moreover, they are currently used in various wireless sensor network applications. For example, the ATmega128 and Cortex M3 have 128 KB and 512 KB flash memory, respectively. Not all of this memory space can be used, but some part of it can. Given this, the point is that most current sensor nodes can permanently save compressed and ground sensor data to their own flash memory. Accordingly, our proposed SMEP method is feasible in most current sensor node technologies. Without taking any such action, most sensor data will be lost during communication failures.
In the implementation of the SMEP method, the SMEP functionalities cannot be loaded into the routing layer of the protocol stack, because sensor data compression is not an intrinsic function of the routing layer and because, in the SMEP method, each compressed or ground sensor data must be handled not as a packet with extra information, such as a header or footer, but as bare data without any extra information. For these reasons, we recommend that the SMEP functionalities be embedded into a layer above the routing layer. Several research works have introduced and implemented a kind of database layer, such as a query layer [23,34,35,36,45]. In these works, the database layer commonly manages data between the routing layer and upper layers, such as the application layer. For details on the database layer, refer to the above-referenced literature. Such a database layer or query layer could be one of our recommended upper layers. Another possibility for implementation is a cross layer that merges the functionalities of the database and routing layers.
On the other hand, the SMEP method alternately applies the selection by pattern rule and the selection by minimum error rule for each compression interval. Therefore, given two consecutive compression intervals, the selection by pattern rule is applied to one and the selection by minimum error rule to the other. We expect that the average error rate can be improved if the selection by minimum error rule is applied not to one compression interval but to several consecutive compression intervals. As further work, we are going to research not only how to design this method, but also how many consecutive compression intervals are optimal for performance and how much the performance improves depending on the number of consecutive compression intervals.

$2^{m-j-1} > 0$, that is, $k = \max\{t \mid 0 < i - \sum_{j=0}^{t-1} 2^{m-j-1}\}$. Since $\sum_{j=0}^{t-1} 2^{m-j-1} = 2^{m-1} + 2^{m-2} + \cdots + 2^{m-t} = 2^m - 2^{m-t} = 2^{m-t}(2^t - 1)$, $k = \max\{t \mid 0 < i - 2^{m-t} \cdot (2^t - 1)\}$. If $i = 2^m$, $i$ is the last position. Hence, $i$ exists in the last zone $m - 1$.
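The zone lookup derived above can be checked numerically with a short sketch. It evaluates the largest t in 0..m−1 satisfying 0 < i − 2^(m−t)(2^t − 1) and treats i = 2^m as the special case that falls in the last zone; the function name `zone_of_position` is hypothetical, and position 0 is excluded because it is handled separately.

```python
def zone_of_position(i: int, m: int) -> int:
    """Zone index k of DataQueue position i, for a queue of 2**m + 1 entries."""
    if not 1 <= i <= 2 ** m:
        raise ValueError("position must lie in 1..2**m")
    if i == 2 ** m:
        return m - 1  # the last position belongs to the last zone
    # Largest t whose preceding zones (2**(m-t) * (2**t - 1) positions) end before i.
    return max(t for t in range(m) if 0 < i - 2 ** (m - t) * (2 ** t - 1))


if __name__ == "__main__":
    # DataQueue[0..16]: m = 4, zones hold 8, 4, 2, and 2 positions.
    print([zone_of_position(i, 4) for i in range(1, 17)])
```

For m = 4, positions 1 to 8 map to zone 0, positions 9 to 12 to zone 1, positions 13 and 14 to zone 2, and positions 15 and 16 to zone 3.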
In the proof of (ii), first, if $i = 0$, then $P_i = 0$, since $S_{c,0} = S_{0,0}$ for any number of compressions $c$. Now, we prove the case for $i \neq 0$ and $k \geq 0$. Let $k$ be the zone in which position $i$ is located. The absolute address $P_i$ of position $i$ in DataQueue consists of a sum of three parts:

$$P_i = P_B + P_{k,i-1} + P_{k,I_i},$$

where $P_B$ is the number of absolute positions covered by the zones from zone 0 to zone $k - 1$, $P_{k,i-1}$ is the number of absolute positions within zone $k$ covered by the sensor data from the 1st sensor data to the $(i-1)$-th positioned sensor data in the zone, and $P_{k,I_i}$ is an I-bit position address of the $i$-th position within the compression interval in which the $i$-th position is located. Each zone $j$ has $2^{m-j-1}$ sensor data in DataQueue and each of them has been selected among $2^{Zones[j]}$. Therefore, each zone $j$ covers $2^{m-j-1} \cdot 2^{Zones[j]}$ absolute positions. Since the zones from zone 0 to zone $k - 1$ are located before zone $k$,

$$P_B = \sum_{j=0}^{k-1} 2^{m-j-1} \cdot 2^{Zones[j]}.$$

Given a sensor data sequence of even size, the sequence is reduced to half its size by compression. Thus, the sensor data sequence with size $L$ is reduced by compression to the size $\frac{1}{2}L = \frac{1}{2} \cdot 2^{m-StartZone} = 2^{m-StartZone-1}$, which is the same as the size of the StartZone zone. Thus, the sensor data sequence in the consecutive equivalent zones from the StartZone zone to the last zone $m - 1$ is compressed into the StartZone zone.
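As a small numerical illustration of the per-zone coverage used in the proof of (ii), the sketch below sums the absolute positions covered by the zones that precede zone k, using the stated coverage of 2^(m−j−1) · 2^(Zones[j]) positions for zone j. The function name `positions_before_zone` is hypothetical, and the Zones values in the example are chosen only for illustration.

```python
def positions_before_zone(zones: list[int], k: int) -> int:
    """Absolute positions covered by zones 0..k-1, where m = len(zones)."""
    m = len(zones)
    if not 0 <= k < m:
        raise ValueError("zone index out of range")
    # Zone j holds 2**(m-j-1) entries, each standing for 2**zones[j] positions.
    return sum(2 ** (m - j - 1) * 2 ** zones[j] for j in range(k))


if __name__ == "__main__":
    zones = [3, 2, 2, 1]  # illustrative Zones values for m = 4
    print([positions_before_zone(zones, k) for k in range(len(zones))])
```

With Zones = [3, 2, 2, 1], zone 0 covers 8 · 8 = 64 positions, zone 1 covers 4 · 4 = 16, and zone 2 covers 2 · 4 = 8, so the counts before zones 0 to 3 are 0, 64, 80, and 88.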