Bounded-Error-Pruned Sensor Data Compression for Energy-Efficient IoT of Environmental Intelligence

Abstract: Emerging IoT (Internet of Things) technologies have enjoyed tremendous success in a variety of applications. Since sensors in IoT consume a lot of energy to transmit their data, data compression used to prolong system lifetime has become a hot research topic. In many real-world applications, such as IEI (IoT of environmental intelligence), the required sensing data usually have limited error tolerance according to the QoS 2 (quality of sensor service) or QoD (quality of decision-making) required. Within that tolerance, bounded-error-pruned sensor data can achieve higher data correlation for better compression without jeopardizing QoS 2/QoD. Moreover, the sensing data in widely spread sensors usually have strong temporal and spatial correlations. We thus propose a BESDC (bounded-error-pruned sensor data compression) scheme to achieve a better trade-off between the bounded error and the compression ratio for sensor data. In this paper, our experiments on a sensor network with a two-tier tree architecture consider four different environmental datasets, including PM 2.5, CO, temperature and seismic wave, with different scales of bounded errors. With the bounded errors required by the given IEI applications, our BESDC can reduce the total size of data transmission to minimize both the energy consumption of sensor-tier devices and the storage of fog-tier servers. The experimental results demonstrate that our BESDC can reduce transmission data by over 55% and save 50% of energy consumption when assigning 1% of error tolerance within the QoS 2/QoD requirement. To the best of our knowledge, the proposed BESDC scheme can help other energy-efficient IoT schemes that apply network topologies and routing protocols to further enhance energy-efficient IoT services.


Introduction
In recent years, emerging IoT (Internet of Things) technologies have enjoyed tremendous success in a variety of applications [1][2][3][4][5][6]. They require sensor data to be timely, reliable and suitable for decision-making [7][8][9]. The quality of decision-making (QoD) depends on the quality of the information from sensor data. Although sensors are available for measuring many kinds of indicators, energy consumption remains one of the major limitations of IEI (IoT of environmental intelligence). Since sensors in IoT consume considerable energy to transmit their data, data compression to prolong system lifetime has become a hot research topic. Different types of IEI applications can impose different quality of sensor service (QoS 2) requirements. For example, an air quality monitoring application gathering air parameter measurements has comparatively less stringent timing requirements.
The system architecture of an IEI application is illustrated in Figure 1. Many sensor nodes are deployed, and they are divided into two categories: end device (ED) and super node (SN). SNs are usually more powerful and faster than EDs in computing and communication resources, and also much more expensive. In this paper, without loss of generality, we simply assume that each SN and its EDs are homogeneous and that they are deployed properly in the monitored environment for IEI applications. The application-specific bounded error is given by the server (i.e., the bounded error regulator) deployed on the fog tier according to the QoS 2 or QoD required for IEI applications.
In this paper, we apply a two-tier scheme to assign proportionally divided error bounds (pdEB) to the related sensor-tier devices (i.e., ED and SN) to guarantee the bounded error assignment (BEA) of pdEB from the bounded error regulator on the fog tier. Then, BESDC determines whether the new sensing data should be compressed by comparing them, within the tolerated BEA, not only with the previous sensed data for temporal correlation but also with the neighboring sensed data for spatial correlation. In addition, after bounded-error-pruning of the sensor data within the required QoS 2 or QoD, the dataset size for encoding can be reduced further, saving storage and transmission resources for sensor nodes and fog-computing machines. As a result, the bounded-error-pruned sensed values compress better, which saves more energy.
Experiments on four different environmental datasets, including PM 2.5, CO, temperature and seismic wave, are presented. Performance results show that the energy consumption of IEI sensor data transmission can be greatly reduced even when a small bounded error is given. For example, when the error tolerance is assigned as 1%, BESDC can reduce transmission data by at least 55%. For a temperature sensor, a 1% error tolerance on a two-digit temperature value corresponds to a bounded error of around 0.5 degrees Celsius. Such a tolerance is therefore acceptable for applications such as a weather report that requires a bounded error of 1 degree Celsius in QoS 2/QoD.
The remainder of this paper is organized as follows. In Section 2, we survey the literature for IoT data compression including bounded error and unbounded error methods. In Section 3, we present BESDC and introduce the applied data, temporal and spatial correlation compression schemes. We use two-tier BEA for different devices running BESDC for sensor data. In Section 4, experiments and performance evaluation results of BESDC are shown. Finally, conclusions and future work are given in Section 5.

Related Works
IEI applications need rapid information delivery about environmental transitions. They require a timely, reliable and suitable process for collecting diversified sensor data. As sensor data communication dominates the energy consumption in IoT, many compression mechanisms have been applied to prolong the IoT lifetime [12]. Data compression schemes applied for energy saving in IoT can be divided into two categories: lossy and lossless. Lossy compression schemes usually give much better compression ratios but lose some of the signal fidelity that the required QoS 2/QoD depends on. Schemes in this category include JPEG [13][14][15], discrete cosine transform (DCT) [16], wavelet-based [17], vector quantization (VQ) [18], and multi-resolution spatial and temporal coding (MRCQ) [19]. As the errors between the compressed and original data are unbounded, these works are unable to support real-world IEI applications in making proper decisions for users.
Modern IEI applications are usually based on lossless compression schemes including Huffman coding (HC) [20], static HC (SHC) [21], adaptive HC (AHC) [22], modified AHC (MAHC) [23] and improved MAHC (IMAHC) [24]. HC uses a variable-length codebook in which symbols that appear with higher probability are encoded with fewer bits. Given statistical information about the source symbols, SHC constructs a codebook with an optimal prefix code. AHC separates encoded data into prefixes and suffixes based on the weight of the data; the prefix can be redefined adaptively to obtain a better compression ratio. MAHC merges the codebook with the optimal prefix code of SHC and the codebook with the dynamic suffix of AHC to obtain a better compression ratio. IMAHC applies a dynamic codebook to adjust to rapid environmental changes for IoT. These lossless compression schemes maintain complete data integrity but have poor compression ratios compared with lossy compression.
Correlations considered in data compression can be spatial correlation, data correlation and temporal correlation. Spatial correlation suggests that sensor nodes located close together in a dense area usually have similar sensed data. Thus, we can apply spatial correlation to compress transmission data. Spatial correlation compression schemes include minimum fusion Steiner tree (MFST) [25], adaptive fusion Steiner tree (AFST) [26], data fusion (DFuse) [27], context adaptive clustering (CAC) [28], CODA [29], clustered aggregation (CAG) [30], prediction-based monitoring (PREMON) [31], and data fusion using temporal and spatial correlations (DF-TS) [32]. Different from spatial correlation, data correlation uses information theory to encode high-probability data with short output. An example of a data correlation compression scheme is decentralized data fusion (DDF) [33]. Temporal correlation is based on the idea that a sensor node usually has similar sensed data within a short period of time. Temporal correlation compression schemes include multi-level fusion synchronization (MFS) [34], temporal coherency-aware in-network aggregation (TiNA) [35], CAG, PREMON, and DF-TS.
Usually, IEI applications will tolerate limited errors for their required QoD and QoS 2. Bounded error can be considered valuable information to reduce the communication cost. The simplest way, as in the TiNA scheme, is to use the bounded error between current and former data to decide whether or not to transmit the current data. In this paper, we use bounded error in advance for data compression, not only to reduce the communication/storage cost for sensors and fog-computing machines but also to cost-efficiently extend the sensors' lifetime. A tolerable bounded error is helpful not only to improve the data compression ratio [36][37][38][39] for energy-efficient IoT applications but also to reduce the query response time for big sensor data [40,41].
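The transmit-or-suppress decision described above can be sketched as follows. This is a minimal illustration and not TiNA's actual implementation: the function name `suppress_within_bound` is hypothetical, and comparing each reading against the last transmitted value (rather than the immediately preceding reading) is an assumption made here so that the receiver's error stays within the bound.

```python
from typing import List, Optional


def suppress_within_bound(readings: List[float], tau: float) -> List[Optional[float]]:
    """Return the value sent at each step, or None when the reading is suppressed."""
    sent: List[Optional[float]] = []
    last: Optional[float] = None  # last value actually transmitted
    for r in readings:
        if last is None or abs(r - last) > tau:
            last = r            # change exceeds the bound: transmit
            sent.append(r)
        else:
            sent.append(None)   # within the bound: the receiver reuses `last`
    return sent
```

With a bound of 0.5, a slowly drifting series is transmitted only when it has moved by more than 0.5 from the last value the receiver holds.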
In our previous work [37] for a preliminary wireless sensor network (WSN) topology, we first applied static bounded error in codebook for data correlation compression and outperformed DF-TS in sensor energy saving. Then, the dynamic bounded error in more realistic WSN topology was considered in [38] to further validate the energy efficiency by four different BEA in linear/non-linear proportions. Furthermore, the authors of [36] improved previous works described above by applying run length encoding (RLE) in data correlation compression and revising the spatial correlation compression method.
Different from [36][37][38], a recent article [39] on wireless body sensor networks (WBSN) first applied an error-bounded lossy compression scheme [42] from high performance computing (HPC) and then used a machine learning technique to rebuild the transmitted data with the given bounded error on the edge node. In contrast, for IEI sensor networks and the different QoS 2/QoD required by their various applications, our proposed architecture applies lossless compression schemes throughout, exploiting data, spatial and temporal correlations among bounded-error-pruned sensor data. Because the fog-computing machines do not need to rebuild lossy transmitted data, the total data size for transmission and storage can be minimized, which saves energy, prolongs IoT lifetime for IEI applications and allows quick access to the bounded-error-pruned sensor data.

Bounded-Error Sensor Data Compression (BESDC)
In an IEI application, the application-specific bounded error is given by a fog-computing machine called the bounded error regulator, according to the QoS 2 or QoD required by IEI decision-makers. Then, a two-tier BEA scheme is applied, based on the current topology of the deployed IoT, to assign the error bounds of pdEB to the fog-tier machines such as the sink and to the sensor-tier devices SN and ED. With the given error bounds, EDs use temporal and data correlations to compress the sensed data and send the compressed sensor data to the corresponding SN. With the incoming compressed data from different neighboring EDs, the SN uses spatial and data correlation compression to compress the incoming sensor data further. Then, the SN sends the compressed data to the corresponding sink, and the fog-tier sink forwards these compressed data to the fog-tier servers. Finally, the corresponding servers store and decompress the compressed sensor data according to the pdEBs, and save the decompressed bounded-error-pruned data to the database for later queries from users.

Basic Processing Flow
Appl. Sci. 2020, 10, 6512
The flow chart of our proposed BESDC scheme is illustrated in Figure 2. It can be divided into two phases: an offline phase and an online phase. The offline phase is processed only when a sensor node is newly deployed and needs to generate its initial codebook. In two-tier BEA, a bounded error for the fog-tier sink is obtained from the application-specific bounded error of the user-required QoS 2 or QoD. We assign the fog-tier bounded error $\tau_{total}$ to the sink by proportionally dividing the BEA, as shown in Figure 3. The sink calculates the sensor-tier bounded errors $\tau_1$ to $\tau_9$ by Equation (1) with the proportional weight $w_i$ according to the status, capability, etc., of sensor $i$. The value $w_i$ can be intelligently assigned by the fog-tier server (i.e., the bounded error regulator, BER) or simply assigned equally as $w_i = 1/n$ via the fog-tier sink. After the bounded error $\tau_i$ is assigned to sensor node $i$, we evaluate the data correlation of the sensed data to construct the initial codebook. Usually, the servers can spend a period of time evaluating the data correlation of sensed data for constructing the initial codebook. To reduce the offline phase latency of constructing the initial codebook, different sensors can be assigned the same initial codebook by default or by a fog-tier server such as the BER.
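The proportional division of the fog-tier bound among sensor nodes can be illustrated with a short sketch. Equation (1) is not reproduced in this text, so the form $\tau_i = w_i \times \tau_{total}$ with weights summing to 1 is an assumption, and the function name `assign_error_bounds` is hypothetical.

```python
from typing import List, Optional


def assign_error_bounds(tau_total: float, n: int,
                        weights: Optional[List[float]] = None) -> List[float]:
    """Split tau_total among n sensors in proportion to their weights."""
    if weights is None:
        weights = [1.0 / n] * n  # equal assignment, w_i = 1/n
    if len(weights) != n:
        raise ValueError("one weight per sensor is required")
    # Assumption: the proportional weights sum to 1.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return [w * tau_total for w in weights]
```

For nine sensors and equal weights, each sensor receives one ninth of the fog-tier bound; passing explicit weights models an assignment made by the bounded error regulator.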
In the online phase, the newly sensed data of an ED, represented as the sequence in Equation (3), are compared with the former sensed data (i.e., temporal correlation between sensor values $s_{j+1}$ and $s_j$) to decide whether they should be compressed, and are encoded through the codebook (i.e., data correlation in sensor data similarity, as shown in Equation (5)) within the bounded error constraint $\tau_i$ (the criterion given in Equation (6)). In Equation (3), $S_{ED}$ is defined as a sequence of the ED's sensor values of length $n$. $S^{\tau_i}_{ED}$ is defined as the sequence of bounded-error-pruned sensor values derived from $S_{ED}$ under the bounded error $\tau_i$, as given in Equation (4).
As shown in Figure 2, BESDC consists of EB-HC (error-bounded HC) for data correlation compression and EB-RLE (error-bounded RLE) for temporal correlation compression [38]. After pruning, EB-HC benefits from more identical values and EB-RLE from longer runs, which reduces the amount of data to encode relative to $|S_{ED}|$ while satisfying the bounded error criterion between the pruned and original data (i.e., $s^{\tau_i}_j$ and $s_j$) in Equation (6). The pruned sensor data sequence $S^{\tau_i}_{ED}$ will then have similar QoS 2 and QoD to the original sensor data sequence $S_{ED}$, as shown in Equations (7) and (8).
In an SN, the newly received data from an ED are compared with the neighboring sensed data (i.e., spatial correlation) to decide whether they should be compressed again, and are encoded through the codebook (i.e., data correlation) within the given bounded error. These compressed data after bounded-error-pruning in BESDC, which do not jeopardize the required QoS 2 and QoD (as indicated in Equations (7) and (8)), are then forwarded to the server via the sink. Finally, the fog-tier server decompresses the compressed data and reports the results for IEI applications.

EB-RLE, EB-HC and Spatial Correlation Compression
RLE shortens the representation of a long run of one symbol to the run length and the symbol itself. For example, "rrrrrrrr" can be encoded as "8,r". Real sensor data, however, usually contain no long runs of exactly the same value, so traditional RLE seems unsuitable for sensor data compression. In our EB-RLE, if a long sequence of data lies within a certain bounded error range, it can be encoded by RLE. In this paper, we propose a greedy algorithm to find the "longest sequence" of sensor data which can be represented by the same value within the given bounded error. Figure 4 shows an example with a sequence of 10 sensor data. The longest leading run within the given bounded error 0.5 contains four values, which can be encoded as "4,22.7". Then, we continue with the next longest run (23.5, 23.7, 24.2, 23.6, 23.3) within the given bounded error 0.5, which can be encoded as "5,23.8". The final EB-RLE result is the RLE sequence ("4,22.7", "5,23.8", "1,23.3").
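The greedy run extension can be sketched as below. This is an illustrative reading of the description, not the authors' code: the names `eb_rle` and `eb_rle_decode` are hypothetical, a run is extended while its value spread stays within twice the bound, and the midpoint is used as the representative, which is an assumption (the text does not state how the representative is chosen).

```python
from typing import List, Tuple

EPS = 1e-9  # floating-point slack (assumption, not part of the scheme)


def eb_rle(values: List[float], tau: float) -> List[Tuple[int, float]]:
    """Greedy error-bounded RLE: each run is (length, representative)."""
    runs: List[Tuple[int, float]] = []
    i = 0
    while i < len(values):
        lo = hi = values[i]
        j = i + 1
        # Extend the run while a single value can still represent every
        # member within +/- tau, i.e. while max - min <= 2 * tau.
        while j < len(values):
            new_lo, new_hi = min(lo, values[j]), max(hi, values[j])
            if new_hi - new_lo > 2 * tau + EPS:
                break
            lo, hi = new_lo, new_hi
            j += 1
        runs.append((j - i, (lo + hi) / 2))  # midpoint representative (assumption)
        i = j
    return runs


def eb_rle_decode(runs: List[Tuple[int, float]]) -> List[float]:
    """Expand runs back into a bounded-error-pruned sequence."""
    return [rep for length, rep in runs for _ in range(length)]
```

Applied to the five values (23.5, 23.7, 24.2, 23.6, 23.3) with a bound of 0.5, the sketch produces a single run of length 5 whose midpoint is approximately 23.75, whereas the example above reports 23.8, so the original representative rule evidently differs from the midpoint used here.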

EB-HC
In conventional HC, the code length depends on how frequently each data value appears, so the frequency of each identical value is counted. Different from HC, EB-HC can merge all data within the given bounded error range into one code. In this paper, we propose a greedy algorithm to find the "largest group" of sensor data which can be represented by the same value within the given bounded error. Considering the same example sensor data sequence (Figure 4), our EB-HC algorithm sorts the upper bounds and the lower bounds of the bounded error bars to obtain scan lines EB1 to EB17. For each scan line, the related counter represents its "bump-to-bar" counting result, which indicates the number of sensor data that can be represented by one single value. The related counters are EBi = 1, 2, 3, 4, 5, 5, 5, 7, 7, 7, 6, 5, 5, 5, 4, 2, 1 for i = 1 to 17. By the greedy rule, we select EB8 to represent seven data values. The final result is the bit sequence 00111101111 with the codebook ("1,23.2", "01,22.7", "00,24.7").
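The scan-line grouping followed by Huffman coding can be sketched as follows. The code is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, candidate representatives are taken to be the lower and upper ends of each remaining error bar, ties are broken by the smallest candidate, and a small `EPS` slack is added for floating-point comparisons.

```python
import heapq
from itertools import count
from typing import Dict, List, Tuple

EPS = 1e-9  # floating-point slack (assumption)


def greedy_groups(values: List[float], tau: float) -> List[float]:
    """Map every value to a representative, largest group first."""
    remaining = list(range(len(values)))
    rep = [0.0] * len(values)
    while remaining:
        # Scan lines: both ends of each remaining error bar.
        lines = sorted({values[i] + d for i in remaining for d in (-tau, tau)})
        # Counter per scan line: how many error bars the line crosses.
        best = max(lines, key=lambda c: sum(
            abs(values[i] - c) <= tau + EPS for i in remaining))
        covered = [i for i in remaining if abs(values[i] - best) <= tau + EPS]
        for i in covered:
            rep[i] = best
        remaining = [i for i in remaining if i not in covered]
    return rep


def huffman_codebook(symbols: List[float]) -> Dict[float, str]:
    """Standard Huffman construction over symbol frequencies."""
    freq: Dict[float, int] = {}
    for s in symbols:
        freq[s] = freq.get(s, 0) + 1
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so dicts are never compared
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]


def eb_hc(values: List[float], tau: float) -> Tuple[str, Dict[float, str]]:
    """Prune values to group representatives, then Huffman-encode them."""
    rep = greedy_groups(values, tau)
    book = huffman_codebook(rep)
    return "".join(book[r] for r in rep), book


def eb_hc_decode(bits: str, book: Dict[float, str]) -> List[float]:
    """Decode a prefix-free bit string back to representatives."""
    inverse = {code: value for value, code in book.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return out
```

Merging values before building the codebook shrinks the alphabet, so frequent groups receive short codes; in the example above, the seven-value group is the one assigned the 1-bit code.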

Spatial Correlation Compression
For spatial correlation compression, the SN processes multiple encoded data from neighboring EDs to further reduce data transmission to the fog-tier sink, which extends the SN lifetime and saves storage in fog-computing machines. When the SN receives packets from neighboring EDs, it checks whether each sensor data encoding part (SDEP) in the payloads of two packets from similar geographic regions is identical. In our proposed spatial correlation compression, if two packets share exactly the same SDEP, BESDC uses _S_NoOpt to replace one of the SDEPs. _S_NoOpt is encoded as "1", and the other SDEPs are labeled by adding a "0" in front of them before they are encoded. Figure 5 illustrates two packets sent from two neighboring EDs and the packet format after they are compressed. The second SDEPs of these two packets are identical, so one of them is replaced by _S_NoOpt. This spatial correlation encoding scheme thus further compresses the incoming sensor data from different EDs, reducing the data transmission from the SN to the fog-tier sink and saving more energy for the SN.
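The _S_NoOpt substitution can be illustrated with a small sketch. The packet layout is an assumption made for illustration: each payload is modeled as a list of SDEP bit strings of equal count, the first packet serves as the reference, and SDEPs are compared position by position. The function names are hypothetical.

```python
from typing import List, Tuple

S_NOOPT = "1"  # flag for an SDEP identical to the reference packet's SDEP


def spatial_compress(reference: List[str],
                     neighbor: List[str]) -> Tuple[List[str], List[str]]:
    """Flag every SDEP; replace neighbor SDEPs equal to the reference by '1'."""
    ref_out = ["0" + s for s in reference]
    nb_out = [S_NOOPT if s == r else "0" + s
              for s, r in zip(neighbor, reference)]
    return ref_out, nb_out


def spatial_decompress(ref_out: List[str],
                       nb_out: List[str]) -> Tuple[List[str], List[str]]:
    """Strip the flags and restore replaced SDEPs from the reference."""
    reference = [s[1:] for s in ref_out]
    neighbor = [r if s == S_NOOPT else s[1:]
                for s, r in zip(nb_out, reference)]
    return reference, neighbor
```

Because every retained SDEP starts with "0", the single bit "1" is unambiguous, and the saving grows with the number of identical SDEPs between neighboring EDs.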

Experimental Results and Performance Evaluations
In this section, the compression ratio and energy consumption of IoT in BESDC for IEI applications are evaluated with four different datasets. Different scales of bounded errors are applied in these datasets. These environmental datasets are temperature, PM 2.5 and CO of air quality monitoring from [43] in Taiwan and seismic wave [44] from the Taiwan Chi-Chi earthquake which happened on 21 September 1999. Table 1 shows the features of correlation in each dataset. This experiment with performance evaluation is divided into four parts. The first part introduces the processing in two-tier BEA for this experiment. Then, the second part introduces the calculation of compression ratio and energy consumption in performance evaluations from experiments. The third part evaluates the performance improvement from both temporal and data correlation compression of EB-RLE and EB-HC. The fourth part evaluates the performance improvement from spatial correlation compression of SN in BESDC scheme. The performance improvement is focused on the sensor data compression and energy saving ratios as compared with the original sensor data without compression.

Two-Tier BEA
The two-tier BEA of BESDC assigns the corresponding bounded error to each sink from the IEI BEA server. Without loss of generality, $\tau_{total}$ is set to the value described in Equation (9), where $SD_{Max}$ is the maximal value among all sensor data collected from the $n$ devices (including SN and ED) under the sink and $\rho_j$ is the $j$th inserted bounded error ($0 < \rho_j < 1$). In our experiment, the inserted proportional weight $w_i$ for each sensor device $i$ is simply $1/n$, and $\rho_j$ takes the values 0.25%, 0.5%, 0.75% and 1.0%. Then, the bounded error $\tau_i$ for each sensor device $i$ can be rewritten as Equation (10).

Calculations of Compression and Energy Consumption Ratios for BESDC
The compression performance of BESDC is calculated by comparing the compressed sensor data size with the original sensor data size for the above-mentioned four environmental datasets, as given in Equation (11).
$$\text{compression ratio} = \frac{\text{BESDC compressed data size}}{\text{original data size}} \tag{11}$$
To evaluate the energy consumption, the parameters shown in Table 2 are adopted from [45]. According to these parameters from the CC2420 transceiver and ARM7TDMI data sheets, the energy consumed in data compression is calculated from the number of machine instructions of each type executed by the proposed BESDC algorithms on ED and SN, counted in the assembly code output from Visual Studio 2010; the instruction types are ADD (addition), MUL (multiply), CMP (comparison) and SHF (shift). Similarly, the energy consumed in data transmission (transmit and receive) is calculated from the number of bits remaining after BESDC compression in data, temporal and spatial correlations. In our experiments, the energy consumption ratio is calculated as defined in Equation (12).
$$\text{energy consumption ratio} = \frac{\text{energy consumption with compression}}{\text{energy consumption without compression}} \tag{12}$$
The energy consumption function $eg_{TX}(n)$, in joules (J), for an ED or SN to transmit $n$ bits is defined in Equation (13). For example, with the parameters in Table 2, transmitting 1 bit consumes $eg_{TX}(1) = I_{TX} \times V_{TX} \times T_{TX} \times 1$ = 17.4 mA × 3.3 V × 32 µs/bit × 1 bit = 1.83744 µJ, where $T_{TX}$ is the time for an ED or SN to send 1 bit to the network. The energy consumption $eg_{RV}(m)$ for an ED or SN to receive $m$ bits is defined in Equation (14); thus, $eg_{RV}(1) = I_{RV} \times V_{RV} \times T_{RV} \times 1$ = 2.08032 µJ.
Then, the energy consumption function $eg_{COM}(n, P_{sensor})$ for compressing $n$ bits, based on the assembly code output of the data compression program $P_{sensor}$ running on a sensor (an ED or SN in the proposed BESDC), is defined in Equation (15).
In Equation (15), $i^{P_{sensor}}_{add}$, $i^{P_{sensor}}_{mul}$, $i^{P_{sensor}}_{cmp}$ and $i^{P_{sensor}}_{shf}$ are the respective numbers of ADD, MUL, CMP and SHF assembly instructions counted in running program $P_{sensor}$. The values $\varepsilon_{add}$, $\varepsilon_{mul}$, $\varepsilon_{cmp}$ and $\varepsilon_{shf}$ are the energy consumption values measured experimentally in [45] for each ADD, MUL, CMP and SHF instruction, respectively, in a program encoding 1 bit. Therefore, for an ED running EB-RLE with EB-HC in program $P_{ED}$, the total energy consumption function $eg_{ED}(n)$ for compressing $n$-bit data and transmitting the resulting compressed data to the SN is shown in Equation (16). Equation (17) gives the total energy consumption function $eg_{SN}(m)$ for an SN running spatial correlation compression with EB-HC in program $P_{SN}$, which receives $m$-bit data from the ED, compresses them and finally sends the compressed data to the fog-computing machines.
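$$eg_{TX}(n) = I_{TX} \times V_{TX} \times T_{TX} \times n \tag{13}$$
$$eg_{RV}(m) = I_{RV} \times V_{RV} \times T_{RV} \times m \tag{14}$$
$$eg_{COM}(n, P_{sensor}) = n \times \left( i^{P_{sensor}}_{add} \times \varepsilon_{add} + i^{P_{sensor}}_{mul} \times \varepsilon_{mul} + i^{P_{sensor}}_{cmp} \times \varepsilon_{cmp} + i^{P_{sensor}}_{shf} \times \varepsilon_{shf} \right) \tag{15}$$
$$eg_{ED}(n) = eg_{COM}(n, P_{ED}) + eg_{TX}(n') \tag{16}$$
$$eg_{SN}(m) = eg_{RV}(m) + eg_{COM}(m, P_{SN}) + eg_{TX}(m') \tag{17}$$
Here $n'$ and $m'$ denote the numbers of bits remaining after compression on the ED and the SN, respectively.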
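The energy model can be turned into a small calculator, sketched below. The transmit parameters and the per-bit receive energy are the values quoted in the text; the per-instruction energies and instruction counts are not listed here (they come from Table 2 and the assembly output), so they are left as caller-supplied dictionaries with hypothetical keys. The argument names `n_comp` and `m_comp`, standing for the compressed sizes in bits, are also assumptions of this sketch.

```python
from typing import Dict

I_TX = 17.4e-3   # transmit current in A, from the text
V_TX = 3.3       # supply voltage in V, from the text
T_TX = 32e-6     # time per bit in s, from the text
E_RV_BIT = 2.08032e-6  # energy to receive 1 bit in J, from the text


def eg_tx(n: float) -> float:
    """Energy in joules to transmit n bits."""
    return I_TX * V_TX * T_TX * n


def eg_rv(m: float) -> float:
    """Energy in joules to receive m bits."""
    return E_RV_BIT * m


def eg_com(n: float, counts: Dict[str, int], eps: Dict[str, float]) -> float:
    """Energy to compress n bits: n times the per-bit instruction energy."""
    return n * sum(counts[k] * eps[k] for k in counts)


def eg_ed(n: float, n_comp: float,
          counts: Dict[str, int], eps: Dict[str, float]) -> float:
    """ED: compress n bits, then transmit the n_comp compressed bits."""
    return eg_com(n, counts, eps) + eg_tx(n_comp)


def eg_sn(m: float, m_comp: float,
          counts: Dict[str, int], eps: Dict[str, float]) -> float:
    """SN: receive m bits, compress them, transmit m_comp compressed bits."""
    return eg_rv(m) + eg_com(m, counts, eps) + eg_tx(m_comp)


def ed_energy_ratio(n: float, n_comp: float,
                    counts: Dict[str, int], eps: Dict[str, float]) -> float:
    """Energy with compression divided by energy to send n raw bits."""
    return eg_ed(n, n_comp, counts, eps) / eg_tx(n)
```

Since the per-bit radio cost is on the order of microjoules, the ratio is driven mainly by `n_comp / n` whenever the instruction energies are small by comparison, which matches the observation that transmission dominates sensor energy consumption.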

Performance Improvement from Temporal and Data Correlation Compression
Figure 6 shows the compression ratio of EB-RLE (temporal correlation compression) and EB-HC (data correlation compression) in BESDC for the four environmental datasets of PM 2.5, temperature, CO and seismic wave under four bounded errors of 0.25%, 0.5%, 0.75% and 1%. As shown in Figure 6, the seismic wave dataset has a worse EB-RLE/EB-HC compression ratio than the others, since high-frequency seismic waves usually exhibit low temporal and data correlations in their sensor data values. Since the energy consumption of a sensor node is dominated by data transmission, the computation of EB-RLE and EB-HC in BESDC has little effect on the overall energy consumption of sensor nodes (i.e., EDs). Thus, the energy consumption ratio shown in Figure 7 follows a similar trend to Figure 6. This indicates that stronger data compression yields lower energy consumption, which substantially extends the IoT lifetime for IEI applications. As expected, a larger bounded error should yield better compression and energy consumption ratios. However, the compression and energy consumption ratios of the PM 2.5 dataset with a bounded error of 0.75% are almost the same as with a bounded error of 0.5%. This is because the bounded errors of 0.5% and 0.75% cover much the same value ranges in the experimental PM 2.5 dataset.

Performance Improvement from Spatial Correlation Compression
BESDC spatial correlation compression helps the SN reduce the sensor data transmitted to the sink: after collecting compressed sensor data (data correlation and temporal correlation compression) from its neighboring EDs, the SN compares their similarity and compresses them further (spatial correlation). In the experiments on the compression and energy consumption performance of spatial correlation compression in BESDC, we use three different environmental datasets of PM 2.5, temperature and CO on two neighboring EDs (both in Taichung City, Taiwan) and two faraway EDs (in Northern and Southern Taiwan) to demonstrate the BESDC performance in spatial correlation compression and energy consumption on the SN. In addition, we include the experimental results simulated for another energy-efficient IoT work [32] (i.e., DF-TS on the temperature dataset, as shown in Figures 8-11), taken from our previous work [37], to further compare the performance improvement from the proposed BESDC.

As shown in Figures 8 and 9, for an SN with two neighboring EDs, the BESDC spatial correlation compression scheme improves both the compression ratio and the energy consumption ratio, which extends the SN lifetime for IEI applications.
For the SN with two faraway EDs, the data compression and energy consumption ratios on the SN under spatial correlation compression are shown in Figures 10 and 11, respectively. Compared with the results in Figures 8 and 9 (i.e., SN with two neighboring EDs), these ratios are worse, since the two EDs are far apart and their experimental datasets have low spatial correlation.
Figure 11. Energy consumption ratio in SN with "faraway" EDs by spatial correlation compression.
Note, however, that in the experimental results for the SN with two neighboring EDs, the compression and energy consumption ratios of the temperature dataset with a bounded error of 0.75% are almost the same as those with a bounded error of 0.5% (i.e., BESDC temperature in Figures 8 and 9). The reason is the same as mentioned in the previous subsection: for the experimental temperature dataset, bounded errors of 0.5% and 0.75% preserve much the same value ranges. Compared with the quantitative results of DF-TS temperature in Figures 8-11, BESDC temperature achieves better compression and energy consumption ratios. Therefore, according to the overall performance results, BESDC improves both data compression and energy consumption for the SN, extending the IoT lifetime of IEI applications.

Conclusions and Future Work
In this paper, BESDC is proposed to reduce energy consumption and thereby extend the IoT lifetime under a given bounded error. Two-tier BEA is applied to assign the proper bounded error to EDs and SNs. Then, EB-RLE and EB-HC are used to improve temporal and data correlation (codebook) compression of sensor data within the given bounded error. Spatial correlation compression is applied at the SN to compress the differences among the bounded-error sensor data received from different neighboring EDs. Thus, the total size of transmission data in sensor-tier devices and fog-tier machines can be minimized to achieve energy and storage savings within the given bounded errors.
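To make the idea of codebook compression within a bounded error concrete, the following Python sketch combines a uniform quantizer (bins of width 2*tol, so every value lies within tol of its bin centre) with standard Huffman coding of the bin indices. This is a generic construction offered as an assumption-laden illustration, not the paper's EB-HC; the function names, the sample temperatures and the choice to ignore codebook transmission cost are all hypothetical.

```python
import heapq
from collections import Counter


def quantize(values, tol):
    """Map each value to the index of a bin of width 2*tol."""
    step = 2 * tol
    return [round(v / step) for v in values]


def dequantize(symbols, tol):
    """Recover bin centres; each is within tol of the original value."""
    return [s * 2 * tol for s in symbols]


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encoded_bits(values, tol):
    """Payload size in bits after quantizing and Huffman coding (codebook excluded)."""
    symbols = quantize(values, tol)
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)


# Hypothetical temperature readings (deg C); not data from the paper.
temps = [20.1, 20.2, 20.2, 20.4, 20.9, 21.0, 21.1, 20.3]
for tol in (0.05, 0.25):
    print(f"tol={tol}: {encoded_bits(temps, tol)} payload bits")
```

A larger tolerance merges more readings into the same symbol, which shortens the Huffman codes; the reconstruction from dequantize still stays within the chosen tolerance.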
The experiments on four different environmental datasets in Taiwan (forest temperature, PM 2.5 and CO for air quality, and seismic wave) demonstrate that BESDC can reduce transmission data by around 28% to 68% and save around 20% to 72% of the energy consumption for the ED when the assigned error tolerance ranges from 0.25% to 1%. For the SN under the same bounded error range, spatial correlation compression can reduce transmission data by around 34% to 73% and save around 45% to 78% of the energy consumption, given that the SN connects to two neighboring EDs. Moreover, for the SN with two faraway EDs, transmission data can be reduced by around 28% to 72% and energy consumption by 38% to 77%.
Since the proposed BESDC scheme is implemented between the perception and network layers of IoT, we believe it can help other energy-efficient IoT schemes that apply higher-layer approaches, such as network topologies and routing protocols (e.g., [25,26,46]), to reduce their payload overhead and further enhance their energy-efficient IoT services without affecting the required QoS 2/QoD.
Future work will consider improving the proposed BESDC with cost-effective prediction of lost sensor data by learning from previously received data. We will also consider applying smart contract techniques to fast big sensor data queries [40,41] with different bounded errors for users at differentiated service levels, since smart contracts built on blockchain technology can preserve resilience and transparency for big sensor data access at different levels of privacy control.