Sensor Data Compression Using Bounded Error Piecewise Linear Approximation with Resolution Reduction

Abstract: Smart production, one of the key issues for the world to advance toward Industry 4.0, has been a research focus in recent years. In a smart factory, hundreds or even thousands of sensors and smart devices are often deployed to enhance product quality. Generally, sensor data provide abundant information for artificial intelligence (AI) engines to make decisions, so that these smart devices collect more data or carry out required activities. However, transmitting the sensor data via networks and storing them in data centers also consumes a lot of energy. Data compression is a common approach to reducing the sensor data size so as to lower transmission energy. The literature indicates that many Bounded-Error Piecewise Linear Approximation (BEPLA) methods have been proposed to achieve this. Given an error bound, they try to approximate the original sensor data with fewer line segments. In this paper, we further consider resolution reduction, which sets a new restriction on the position of line segment endpoints, and propose Swing-RR (Resolution Reduction). It has O(1) complexity in both space and time per data record. In other words, Swing-RR is suitable for compressing sensor data, particularly when the volume of the data is huge. Our experimental results on real-world datasets show that the size of compressed data is significantly reduced, and the energy consumed decreases accordingly. When using minimal resolution, Swing-RR achieved the best compression ratios for all tested datasets. Consequently, fewer bits are transmitted through networks and less disk space is required to store the data in data centers, thus consuming less data transmission and storage power.


Introduction
Recently, Industry 4.0 has been commonly referred to as the fourth industrial revolution. It mainly focuses on manufacturing automation, which is enabled by the Internet of Things (IoT), big data, cloud computing, and artificial intelligence (AI) to enhance the schedules and processes of production lines, aiming to reduce production costs, improve product quality, and shorten production time. Smart production is one of the key issues for the world to advance toward Industry 4.0. Automatic production optimization is also one of the methods to improve production throughput and to respond quickly and flexibly to a customer-oriented market. In a smart factory, hundreds or even thousands of sensors and smart devices are often deployed, e.g., for production line monitoring or product inspection. By analyzing collected sensor data, smart engines can make proper decisions, e.g., to stop the production line immediately when a severe anomaly is detected, to avoid producing defective products.
For a time series y(t) = (y_1, y_2, y_3, ..., y_n), an approximation y'(t) = (y_1', y_2', y_3', ..., y_n') to y(t) is bounded-error with respect to a preassigned error bound ε when |y_i − y_i'| ≤ ε for all i = 1, 2, ..., n. In this paper, y_i' is the approximated data point of data point y_i.
In the literature, many criteria have been used to assess the quality of an approximation against the original time series. One criterion, the p-norm of the errors between them, denoted by l_p-error, is shown in Equation (1) and measures the distance between them. Commonly used l_p-errors are the l_1-error, l_2-error, and l_∞-error, which are shown in Equations (2)-(4), respectively. These distances are frequently employed in time series analyses. The l_1-error and l_2-error are also called the Manhattan distance and Euclidean distance, respectively. Since the whole series is taken into consideration, they are rarely used in online scenarios, in which every portion of the approximation has to be generated as soon as it is available. These two l_p-errors measure an average distance globally rather than an instant distance locally. For an approximation that has a small l_1-error and/or l_2-error, there is no trivial bound on the error between any particular data point and its approximated data point.
Unlike the l_1-error and l_2-error, the l_∞-error measures the largest error between any data point and its approximated data point. When the l_∞-error is bounded, loose bounds on the l_1-error and l_2-error for any portion of the approximation can be easily derived. For example, for any line segment s_i of length m, the l_1-error and l_2-error between s_i and y(t) are bounded by mε and √m·ε, respectively.
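The three error measures discussed above can be illustrated with a short sketch. Equations (1)-(4) are not reproduced in this text, so the code below assumes the standard definitions (sum of absolute errors, square root of the sum of squared errors, and maximum absolute error); the function name lp_errors is hypothetical.

```python
import math

def lp_errors(y, y_approx):
    """Return the (l1, l2, l_inf) errors between a series and its approximation.

    Assumes the standard definitions; the paper's Equations (1)-(4) are not
    reproduced in this text.
    """
    diffs = [abs(a - b) for a, b in zip(y, y_approx)]
    l1 = sum(diffs)                            # Manhattan distance
    l2 = math.sqrt(sum(d * d for d in diffs))  # Euclidean distance
    linf = max(diffs)                          # largest pointwise error
    return l1, l2, linf

# Every approximated point is off by exactly 0.5, so the bounds are tight.
y = [1.0, 2.0, 4.0, 3.0]
y_approx = [1.5, 1.5, 3.5, 3.5]
eps = 0.5
m = len(y)

l1, l2, linf = lp_errors(y, y_approx)
print(linf <= eps, l1 <= m * eps, l2 <= math.sqrt(m) * eps)
```

In this example a bounded l_∞-error (at most ε) implies the bounds mε and √m·ε on the l_1-error and l_2-error, which is the property stated in the text.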

Definition 2: Piecewise Linear Approximation (PLA)
PLA, a classic representation method, is often used to describe two-dimensional data series, ((x_1, y_1), (x_2, y_2), ..., (x_n, y_n)). A PLA to a data series consists of k (k ≤ n) continuous non-overlapping line segments (s_1, s_2, ..., s_k), in which s_i (1 ≤ i ≤ k), approximating a portion of the original data, is represented by two endpoints, i.e., (s_i.start.x, s_i.start.y) and (s_i.stop.x, s_i.stop.y). The length of s_i along the x-axis is expressed as s_i.length = s_i.stop.x − s_i.start.x + 1.
PLA has many applications [35]. For example, in piecewise linear regression, the original data are divided into several segments, each of which is fitted by using linear regression. Note that linear regression tries to find the best line to approximate the original data series while minimizing the l_2-error. As mentioned above, when using PLA, there is no bound on the error between any particular data point and its approximated data point.
In recent years, BEPLA for sensor data has attracted researchers' attention once again [29-31,34]. As shown in Figure 1, given a time series y(t) and a preassigned maximal error ε, we can set an upper bound H(t) and a lower bound G(t) for y'(t), as listed in Equations (5) and (6), respectively.
Although the two εs in both Equations (5) and (6) are the same, they may be different in different applications.
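Equations (5) and (6) are not reproduced in this text. A form consistent with the surrounding description, assuming the same ε is used for both bounds, would be:

```latex
H(t) = y(t) + \varepsilon, \qquad G(t) = y(t) - \varepsilon
```

Under this assumption, the bounded-error condition |y_i − y_i'| ≤ ε is equivalent to requiring G(t) ≤ y'(t) ≤ H(t) at every time tick.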
In fact, a BEPLA y'(t) = (s_1, s_2, ..., s_k) is a PLA to y(t), and the l_∞-error between y(t) and y'(t) is bounded by ε if and only if all points in all line segments lie between G(t) and H(t). For any line segment s_i of length m, it is easy to prove that the l_1-error and l_2-error between s_i and y(t) are bounded by mε and √m·ε, respectively. As shown in Figure 2, there are two types of BEPLA, joint and disjoint. In joint BEPLA, consecutive line segments share an endpoint, i.e., s_{i+1}.start is actually s_i.stop (1 ≤ i < k), as shown in Figure 2a. In addition to the start point of the first line segment (i.e., s_1.start), the stop point of each line segment is recorded. Specifically, y'(t) is recorded as (s_1.start.y, (s_1.length, s_1.stop.y), (s_2.length, s_2.stop.y), ..., (s_k.length, s_k.stop.y)). Note that s_1.start.x = 1, and s_i.stop.x = s_{i+1}.start.x = s_i.start.x + s_i.length − 1 for all i, 1 ≤ i < k. When the value of y is stored in p bits and the length of a line segment is stored in q bits, y'(t) is recorded in p + (p + q)k bits. Thus, the compression ratio for joint BEPLA can be calculated by using Equation (7),
where n is the number of elements in the original time series y(t), and 32 is the number of bits used to represent y_i in y(t), for all i. In disjoint BEPLA, a line segment s_{i+1} does not have to start from the stop point of s_i (1 ≤ i < k). In practice, s_{i+1} starts from the next time tick after the stop point of s_i, i.e., s_{i+1}.start.x = s_i.stop.x + 1 for all i, 1 ≤ i < k, as shown in Figure 2b. Since there are no shared endpoints between consecutive line segments, both the start and stop points of a line segment need to be recorded. Specifically, y'(t) is recorded as ((s_1.start.y, s_1.length, s_1.stop.y), (s_2.start.y, s_2.length, s_2.stop.y), ..., (s_k.start.y, s_k.length, s_k.stop.y)). Note that s_1.start.x = 1, and s_i.stop.x = s_i.start.x + s_i.length − 1 for all i, 1 ≤ i ≤ k. Thus, y'(t) is recorded in (2p + q)k bits. The compression ratio for disjoint BEPLA is calculated by using Equation (8).
In fact, there are k − 1 hidden line segments whose length is 1 time tick, as illustrated by the dashed line segment in Figure 2b. They are (s_i.stop.y, 1, s_{i+1}.start.y) for all i, 1 ≤ i < k, and are embedded in the representation of y'(t). In other words, there are actually 2k − 1 line segments. These hidden line segments are not explicitly recorded since they can be derived from the representation of disjoint BEPLA.
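The bit counts stated above, p + (p + q)k for joint BEPLA and (2p + q)k for disjoint BEPLA, can be computed as in the following sketch. Equations (7) and (8) are not reproduced in this text; the helper size_fraction assumes that the ratio is taken as encoded bits over raw bits (32n), which is only one possible convention. All function names are hypothetical.

```python
def joint_bits(k, p, q):
    """Bits for a joint BEPLA: one start value plus (length, stop value) per segment."""
    return p + (p + q) * k

def disjoint_bits(k, p, q):
    """Bits for a disjoint BEPLA: (start value, length, stop value) per segment."""
    return (2 * p + q) * k

def size_fraction(encoded_bits, n, raw_bits=32):
    """Encoded size relative to the raw series (assumed convention: encoded / raw)."""
    return encoded_bits / (raw_bits * n)

# Example: 10 segments over 1000 points, 32-bit values, 8-bit lengths.
print(joint_bits(10, 32, 8), disjoint_bits(10, 32, 8))
print(size_fraction(disjoint_bits(10, 32, 8), n=1000))
```

For equal k, p, and q, a disjoint BEPLA costs more bits than a joint one; it only pays off when the extra freedom in choosing start points reduces k sufficiently.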
Energies 2019, 12, 2523

Figure 2. Joint and disjoint BEPLA. In (a), joint BEPLA, two joint line segments that share an endpoint are used to approximate y(t) shown in Figure 1. In (b), disjoint BEPLA, two disjoint line segments that do not share endpoints are used to approximate y(t). Between them, there is a hidden line segment (dashed), the length of which is one time tick.
Many BEPLA algorithms have been proposed so far [27-34], trying to extend the line segments as long as possible and thus minimize the number of line segments. In 2009, Elmeleegy et al. [31] reinvented Swing filter and Slide filter for joint and disjoint BEPLA, respectively. Swing filter is the simplest, but not optimal in terms of the number of line segments. Its time and space complexities per data point are both O(1). In fact, Swing filter was presented by Gritzali and Papakonstantinou in 1983 [33]. An optimal joint BEPLA algorithm, referred to as Cont-PLA in this paper, was introduced by Hakimi and Schmeichel in 1991 [27,28]. Slide filter is optimal in minimizing the number of line segments. Actually, an optimal disjoint BEPLA algorithm was proposed by O'Rourke in 1991 [32]. Xie et al. [29] in 2014 improved the running time efficiency of Slide filter. Zhao et al. [30] in 2016 improved the efficiency of Swing filter based on [29]. Using convex hull and similar techniques, the amortized time complexity of both Cont-PLA and Slide filter per data point is also O(1).
For data compression, each line segment consumes p + q bits in joint BEPLA and 2p + q bits in disjoint BEPLA. Since disjoint BEPLA algorithms have more freedom in selecting the start points of line segments, they usually use fewer line segments. As shown in [29], in terms of the bits representing the resultant BEPLA, Cont-PLA and Slide filter mutually outperformed each other on different datasets, and both achieved 15-25% superiority over Swing filter on all datasets. In 2015, Luo et al. [34] introduced Mixed-PLA, which uses both joint and disjoint line segments. The authors employed a dynamic programming technique and showed that Mixed-PLA was roughly 15% better than Cont-PLA and Slide filter in terms of the bits representing the resultant BEPLA.

BEPLA with Resolution Reduction
Here, we define the problems that BEPLA with resolution reduction may face. As described above, the only requirement of a proper BEPLA is that all line segments must lie between G(t) and H(t). In fact, even the x-coordinates of line segment endpoints are not restricted to be aligned with the time ticks. As a result, the x- and y-coordinates of line segment endpoints are real numbers and are stored as floating-point numbers, which are typically 32 bits long. If we set some restriction on the position of line segment endpoints, we can use fewer bits to encode their x- and y-coordinates.
In this paper, resolution reduction is further taken into consideration. In practice, error bounds (ε) typically range from 0.5% to 5% of the whole range of possible sensor data [29,31,34]. We note that, before data compression, an extreme filter [36] can be applied to remove the influence of unfavorable data points. Therefore, 2ε ranges from 1% to 10%. If r bits are used to approximate the data, the whole range of possible sensor data is divided into 2^r blocks, and the center of each block is coded accordingly; such a center is called a coded data point in this paper. When the block size (1/2^r) is smaller than 2ε, there must be at least ⌊ε·2^(r+1)⌋ coded data points between G(t) and H(t) at any particular time. Table 1 shows the minimal resolution (in bits), calculated by using Equation (9), for typical error bounds.

Minimal Resolution
In Equation (9), ⌈·⌉ is the ceiling function. In other words, when r is larger than or equal to the minimal resolution for the preassigned error bound ε, for any particular data point y, there must be at least one coded data point y* such that the distance between y* and y is bounded by ε. When resolution reduction is adopted by BEPLA algorithms, the endpoints of line segments must all be coded data points, as shown in Figure 3, where coded data points are depicted as black circles. Note that, in this configuration, the approximated data points on a line segment other than its start point and stop point are not necessarily coded data points.
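Equation (9) is not reproduced in this text. The sketch below assumes the form ⌈log2(1/(2ε))⌉, which follows from requiring the block size 1/2^r to be at most 2ε and agrees with the values quoted in the text (7 bits for ε = 0.5%, 5 bits for 2-3%, 4 bits for 4-5%). Sensor values are assumed to be normalized to [0, 1], and both function names are hypothetical.

```python
import math

def minimal_resolution(eps):
    """Smallest r such that the block size 1 / 2**r is at most 2 * eps.

    eps is the error bound as a fraction of the data range (e.g., 0.005 for 0.5%).
    Assumed reconstruction of Equation (9), not a quotation of it.
    """
    return math.ceil(math.log2(1.0 / (2.0 * eps)))

def nearest_coded_point(y, r):
    """Return (index, value) of the block center nearest to y, with y in [0, 1]."""
    blocks = 2 ** r
    index = min(int(y * blocks), blocks - 1)  # y == 1.0 falls into the last block
    return index, (index + 0.5) / blocks

r = minimal_resolution(0.05)
index, value = nearest_coded_point(0.3, r)
print(r, index, value, abs(value - 0.3) <= 0.05)
```

Because the nearest block center is at most half a block (1/2^(r+1)) away from any value in range, choosing r at or above the minimal resolution guarantees a coded data point within ε of every data point.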
Apparently, the adoption of resolution reduction by BEPLA puts a new restriction on endpoint selection for line segments. As shown in Figure 3a,b, the first line segments in both joint and disjoint BEPLA stop after three data points are examined. However, in Figure 2a,b, the first line segments in both joint and disjoint BEPLA stop after six data points are examined, and their lengths in time are both five time ticks. This implies that the adoption of resolution reduction might shorten the line segments used to approximate the original time series, thus generating more line segments. Referring to Equations (7) and (8), even when the number of line segments (k) increases, we can reduce the size of BEPLA by using fewer bits (p and q bits) to represent the approximated data point and the length of a line segment. For example, when the error bound is set to 0.5% and the minimal resolution is used, the y-coordinate of a line segment endpoint is stored in 7 bits rather than 32 bits. The reduction is significant for both Equations (7) and (8) and can usually compensate for the increase of k, i.e., the length of a BEPLA.
In this paper, the simplest method, Swing filter [31], is extended to take resolution reduction into consideration. To the best of the authors' knowledge, Swing-RR is the first BEPLA algorithm that adopts resolution reduction.
Swing-RR generates disjoint BEPLA rather than joint BEPLA. Its time and space complexities are both O(1), the same as those of Swing filter. Hence, Swing-RR can be used in sensor networks, edge computing, and other scenarios in which computing power and energy are limited. For real-time event detection and processing, in which line segments must be generated before a preassigned number of data points is sensed, the length of line segments can be bounded by a maximal delay in Swing-RR.
Real-world datasets from the UCR time series classification archive [37] are used to investigate the performance of Swing-RR. Experimental results show that Swing-RR significantly outperforms Swing filter. The bits used to represent BEPLA generated by Swing-RR are only 20-35% of those produced by Swing filter for typical error bounds. Swing-RR generates more line segments than Swing filter. The lengths of its line segments in time are shorter, thus better fitting the original data. The mean square errors of BEPLA generated by Swing-RR are smaller than those produced by Swing filter.

Method
This section describes how to generate a BEPLA by Swing-RR. Algorithm 1 shows the pseudocode of Swing-RR (). Given error bound ε, maximal delay delay, and resolution r bits in length, whenever a new data point d is sensed, d is processed by Swing-RR(d) and line segments are generated on the fly.
Similar to Swing filter, Swing-RR maintains a data structure for holding possible line segments. As shown in Figure 4, two auxiliary lines s→u_s and s→l_s are maintained. The possible line segments for the current processing window must lie between s→u_s and s→l_s. Both auxiliary lines start from the start point s of the current processing window. When a window is initialized, a coded data point in the bounded range of the first sensed data point is chosen, possibly at random (please refer to lines 2 and 3 of Algorithm 1 and Figure 4a). In this experiment, the coded data point nearest to the original data is chosen.
When the second data point d in this window is processed, two support points, u_s = d + ε and l_s = d − ε, are initialized accordingly (please refer to lines 4~6 of Algorithm 1 and Figure 4b). The upper support point u_s bounds the maximal slope of possible line segments, i.e., s→u_s. The lower support point l_s bounds the minimal slope of possible line segments, i.e., s→l_s. Similar to Swing filter, whenever a new data point d is sensed, the two support points, u_s and l_s, are maintained according to the positions of d + ε and d − ε. s→u_s may swing down, and s→l_s may swing up (please refer to lines 7~11).
If d + ε is below s→l_s, so that s→u_s would swing down too much, a line segment is generated. Likewise, if d − ε is above s→u_s, so that s→l_s would swing up too much, a line segment is generated (please refer to lines 7~8). Otherwise, u_s and l_s are maintained to update the range of possible line segments (please refer to lines 10 and 11). If d − ε is above s→l_s, s→l_s swings up by updating l_s = d − ε, as shown in Figure 4a,b. If d + ε is below s→u_s, s→u_s swings down by updating u_s = d + ε, as shown in Figure 4c,d. When resolution reduction is adopted (please see function check_range()), Swing-RR further checks whether there are coded data points between l and u, which are extended from s→l_s and s→u_s, respectively. Specifically, Swing-RR calculates l^+ and u^−, where l^+ is the smallest coded data point larger than or equal to l, and u^− is the largest coded data point smaller than or equal to u, as shown in Figure 4b-d. We note that s, l^+, and u^− must be coded data points, while u_s, l_s, u, and l are not restricted to be coded data points. When l^+ is smaller than or equal to u^−, there must be at least one coded data point between l and u. A coded data point between l^+ and u^− is chosen as the stop point of a line segment candidate for this window. Swing-RR adopts the middle coded data point between l^+ and u^− (please refer to line 21). On the other hand, when l^+ is larger than u^−, there is no coded data point between l and u, as shown in Figure 4d. Swing-RR then generates a line segment and initializes a new window (please refer to line 26 and Figure 4e).
When the length of a line segment candidate is equal to delay, Swing-RR generates the line segment candidate and initializes a new window (please refer to lines 22-24).
Adoption of resolution reduction puts a restriction on endpoint selection for BEPLA. As shown in Figure 4d, when there are no coded data points between l and u, Swing-RR has to close the current window and generate a line segment, while Swing filter can further process new data points. Obviously, the higher the resolution, the more coded data points there are between d − ε and d + ε. When the resolution r increases, there might be more coded data points between l and u. As shown in Figure 4f, when r is increased by one, there is one coded data point between l and u, and Swing-RR does not have to close the window.
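The procedure described above can be sketched as follows. This is not a transcription of Algorithm 1, which is not reproduced in this text: it is a minimal illustration that assumes data normalized to [0, 1], one data point per time tick, delay ≥ 2, and a resolution r at or above the minimal resolution. It tracks the slopes of the two auxiliary lines instead of the support points u_s and l_s, which is equivalent, and the names swing_rr and reconstruct are hypothetical.

```python
import math

def swing_rr(data, eps, r, delay):
    """Sketch of Swing-RR. Returns disjoint segments (start_idx, length, stop_idx);
    the indices identify coded data points (block centers) at resolution r."""
    blocks = 2 ** r

    def center(i):                      # value of coded data point i
        return (i + 0.5) / blocks

    def nearest(d):                     # coded data point nearest to d in [0, 1]
        return min(int(d * blocks), blocks - 1)

    segments, start = [], None
    for d in data:
        if start is None:               # first point of a new window
            start = stop = nearest(d)
            length, lo, hi = 1, -math.inf, math.inf
            continue
        dx, y0 = length, center(start)  # ticks since the start point
        new_lo = max(lo, (d - eps - y0) / dx)   # s->l_s may swing up
        new_hi = min(hi, (d + eps - y0) / dx)   # s->u_s may swing down
        ok = new_lo <= new_hi
        if ok:                          # look for coded points between l and u
            l, u = y0 + new_lo * dx, y0 + new_hi * dx
            l_plus = max(0, math.ceil(l * blocks - 0.5))
            u_minus = min(blocks - 1, math.floor(u * blocks - 0.5))
            ok = l_plus <= u_minus
        if ok:
            lo, hi = new_lo, new_hi
            stop = (l_plus + u_minus) // 2      # middle coded data point
            length += 1
            if length == delay:         # maximal delay reached: emit segment
                segments.append((start, length, stop))
                start = None
        else:                           # emit candidate, reopen window at d
            segments.append((start, length, stop))
            start = stop = nearest(d)
            length, lo, hi = 1, -math.inf, math.inf
    if start is not None:               # flush the last open window
        segments.append((start, length, stop))
    return segments

def reconstruct(segments, r):
    """Decode segments back into one approximated value per time tick."""
    blocks, out = 2 ** r, []
    for start, length, stop in segments:
        y0, y1 = (start + 0.5) / blocks, (stop + 0.5) / blocks
        for j in range(length):
            out.append(y0 if length == 1 else y0 + (y1 - y0) * j / (length - 1))
    return out

data = [0.5 + 0.4 * math.sin(i / 5.0) for i in range(100)]
segs = swing_rr(data, eps=0.05, r=4, delay=16)
approx = reconstruct(segs, r=4)
print(len(segs), max(abs(a - b) for a, b in zip(data, approx)))
```

The state kept between data points is a handful of scalars (start, stop, length, and the two slopes), which is consistent with the O(1) time and space per data point claimed for Swing-RR.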

Experiment
In this section, we investigate the performance of Swing-RR. An archive that consists of several real-world datasets [37] is used. As in [34], eight datasets are chosen: Cricket_X, Cricket_Y, Cricket_Z, FaceFour, Lighting2, Lighting7, MoteStrain, and wafer. Table 2 lists related information about these datasets. All data points are stored in IEEE 754 single-precision floating-point format [38], i.e., 32 bits are used to store a data point.
Note that x-coordinates of line segment endpoints in BEPLA produced by Swing filter and Swing-RR are aligned with time ticks. As a result, the lengths of line segments are all the same in the following experiment.

Experimental Setup
In addition to typical error bounds ranging from 0.5% to 5% of the whole range of possible data points, in this experiment we also examine scenarios with a small error bound, ranging between 0.1% and 0.4%. Note that in this case, a higher resolution is needed to ensure the existence of coded data points whose values lie between the upper and lower bounds. Likewise, when the error bound is small, the space available for BEPLA is also small. Consequently, the expected lengths of line segments of BEPLA are short, so the maximal delay can be shortened further and fewer bits are required to record the lengths of these line segments. Table 3 shows the resolutions and maximal delays (in time ticks) employed in this experiment. Maximal delays are usually set based on the applications of sensor networks. The stricter the real-time requirements, the smaller the maximal delays. However, the lengths of line segments are restricted more by the given error bound. When the maximal delays are too short, long line segments, if they exist, are forced to be cut. When the maximal delays are longer than the lengths of most line segments, bit usage in recording the lengths of line segments is inefficient.
Given an error bound ε, for all datasets, Swing-RR utilizes different resolutions, from the minimal resolution (please refer to Equation (9) and Table 1) upward. When the minimal resolution is employed, there is at least one coded data point whose value is between y − ε and y + ε for a data point y. When one more bit is used for the resolution, the number of coded data points is doubled.
We compare the performance of Swing-RR and Swing filter [31]. Three criteria are investigated: compression ratio, the lengths of line segments and their distribution, and mean square error (MSE). MSE is calculated by Equation (10).
Previous methods focused on how to approximate the original data by using fewer line segments. With resolution reduction, we further examine the compression ratios for different resolutions and maximal delays. Investigating the lengths of line segments and their distribution helps us understand the tradeoff in selecting maximal delays. The BEPLAs generated by Swing-RR and Swing filter are all bounded by ε. MSE, which is related to the l2-error, provides additional information about these BEPLAs. In general, a BEPLA with a smaller MSE fits the original data better than one with a larger MSE.
Figure 5 compares the sizes of the BEPLAs generated by Swing-RR and Swing filter, where maximal delays and resolutions are distinguished by different point types and colors, respectively. Specifically, we calculate the ratio of the sizes (in bits) of the two BEPLAs generated by Swing-RR and Swing filter. For all eight datasets, Swing-RR uses significantly fewer bits than Swing filter does. For typical error bounds ranging from 0.5% to 5%, Swing-RR consumes only 20-35% of the bits needed by Swing filter to represent its BEPLA. Even for a much smaller error bound, e.g., 0.1-0.4%, Swing-RR needs only 30-45% of the bits required by Swing filter.

Compression Ratio
The upper bound H(t) and lower bound G(t) determined by ε restrict the space of BEPLA. As shown in Figures 6 and 7, when the error bound is enlarged, the bits used to represent the BEPLA generated by both Swing-RR and Swing filter are further reduced, as expected. Since the figures for all datasets are very similar, only a part of them is shown.
For the eight datasets employed, Swing-RR using the minimal resolution (please refer to Equation (9) and Table 1) always generates the best compression ratios. As shown in Figure 5a-h, when the error bound is set to 4% or 5%, Swing-RR with the minimal resolution (4 bits) has the most significant bit reduction. Compared to the number of bits needed by Swing filter to represent its BEPLA, only about 20% of the bits (colored in orange) are consumed by Swing-RR. When the error bound is set to 2% or 3%, Swing-RR with the minimal resolution (5 bits) again has the most significant bit reduction: only about 25% of the bits (colored in green) are required, compared to those needed by Swing filter. For other error bounds, similar results are obtained.
In most scenarios, given an error bound, the locations of points are strongly related to their colors (which represent the resolution), as shown in Figure 5a-h. In fact, the differences in the numbers of line segments of BEPLA generated by Swing-RR with different resolutions and maximal delays are small compared to the total number of line segments, as will be shown in the following. As Equations (7) and (8) indicate, because the change in k (i.e., the number of line segments) is small, p (i.e., bits for resolution) and q (i.e., bits for maximal delay) play more important roles in determining the size of the compressed data. Moreover, given an error bound, the resolution plays a more important role in reducing the size of the compressed data than the maximal delay does. In Figure 5a-h, points of different types (which represent the maximal delay) but of the same color (which represents the resolution) are very close.
Figure 6 shows the sizes of the BEPLA generated by Swing filter and Swing-RR and their energy consumption, given different resolutions on the Cricket_X dataset. Scenarios with maximal delays of 63 and 127 are shown. Here energy consumption EC is defined as $EC = N \times U_E$, where N is the number of bits actually transmitted and U_E is the energy consumed for delivering a bit, a typical value of which is 2.5 pJ/bit (i.e., picojoules per bit) [39]. As described above, Swing-RR significantly outperforms Swing filter, and Swing-RR with the minimal resolution achieves much better compression than Swing filter does. Since IoT data are transmitted all year long, either continuously or intermittently, the accumulated energy savings should be large in the long term. Note that Figure 6a and Figure 6b are very similar. Figure 7 shows the sizes of the BEPLA generated by Swing filter and Swing-RR and their energy consumption, given different maximal delays on the wafer dataset. For typical error bounds, i.e., from 0.5% to 5%, and resolutions of 7 and 8 bits, Swing-RR with a maximal delay of 127 or 255 compresses the wafer dataset better than Swing-RR with a maximal delay of 63 or 511 does. When the maximal delay is too short, as mentioned before, long line segments are cut. When the maximal delay is longer than the lengths of most line segments, some bits allocated to record the length are not used, and bit efficiency is reduced.
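The energy estimate reduces to a single multiplication. The following is a minimal sketch, assuming that EC is the product of N and U_E as the definitions above imply, and using the typical per-bit energy of 2.5 pJ from [39]. The function names and the example bit counts are made up for illustration and are not measurements from the experiments.

```python
U_E_PJ_PER_BIT = 2.5  # typical energy per transmitted bit, in picojoules [39]


def energy_consumption_pj(n_bits: int, u_e: float = U_E_PJ_PER_BIT) -> float:
    """Energy in picojoules for transmitting n_bits, assuming EC = N * U_E."""
    if n_bits < 0:
        raise ValueError("n_bits must be non-negative")
    return n_bits * u_e


def energy_saved_pj(baseline_bits: int, compressed_bits: int,
                    u_e: float = U_E_PJ_PER_BIT) -> float:
    """Difference in transmission energy between two encodings of the same data."""
    return energy_consumption_pj(baseline_bits, u_e) - energy_consumption_pj(compressed_bits, u_e)


if __name__ == "__main__":
    # Hypothetical sizes: a baseline of 100,000 bits compressed to 25% of that.
    baseline = 100_000
    compressed = 25_000
    print(energy_consumption_pj(baseline), energy_consumption_pj(compressed))
    print(energy_saved_pj(baseline, compressed))
```

Because the relation is linear, any percentage reduction in transmitted bits translates into the same percentage reduction in transmission energy under this model.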
As described above, Slide filter and Cont-PLA mutually outperform each other given different datasets, while Mixed-PLA outperforms Swing filter, Slide filter, and Cont-PLA on the eight real-world datasets [29,34]. The size of the BEPLA generated by Mixed-PLA is about 50-60% of that produced by Swing filter. These methods focus on how to minimize the number of line segments. By contrast, when using the minimal resolution, Swing-RR requires only 20-25% of the bits needed by Swing filter. The accumulated energy savings are correspondingly large.
Figure 8a,b show the numbers of line segments required to represent the BEPLA generated by Swing-RR on the Lighting7 dataset given different maximal delays and resolutions. For a given error bound, as mentioned above, the differences in the numbers of line segments generated with different maximal delays are small compared to the total number of line segments. On the other hand, the bit reduction caused by using fewer bits to represent the lengths of line segments is trivial. This is also true for different resolutions.

Number of Segments and Their Length
In fact, the distribution of the numbers of line segments over their lengths follows a power law [40]. Most line segments are short, and only a small portion is long. As shown in Figure 8c, for the Lighting7 dataset with an error bound of 1%, the numbers of short line segments produced by Swing filter and Swing-RR with different maximal delays (63, 127, 255) are similar. The peaks at length 63 for both schemes show that some line segments are longer than 63, so these long line segments are cut at this length. As shown in Figure 8a, when the error bound is 1%, roughly 2500 line segments in the BEPLA are generated by Swing-RR with different maximal delays. Among them, as shown in Figure 8c, about 1000 have a length of 1 and about 500 have a length of 2. More than one half of the line segments are very short. In fact, fewer than 100 line segments are longer than 63. As a result, it might be worthwhile to specify a smaller maximal delay and thus represent the lengths of line segments with fewer bits.
Further, the length of an IP header is about 32 bytes (between 20 and 60 bytes), meaning that 256 bits ordinarily accompany the delivered data of a line segment. The energy consumed for transmitting IP headers, denoted by EIH, is defined as $EIH = 256 \times M \times U_E$, where M is the number of line segments generated. As shown in Figure 8, the total numbers of line segments produced by Swing filter are higher than those yielded by Swing-RR. For example, in Figure 8b, when the error bound is 1%, the numbers generated by Swing-RR are about 2500, whereas those produced by Swing filter are about 4500. The difference between the two consumed energies is 256 × 2.5 × (4500 − 2500) pJ = 1.28 × 10^6 pJ, i.e., about 1.28 µJ. The phenomenon also occurs in Figure 8d, i.e., the accumulated number of line segments. In Figure 8, we do not show the corresponding energy since its computation is trivial; only M is shown. The consumed energy can be obtained by multiplying M by 256 × 2.5 pJ.
Figure 9 shows the MSE between the original data and the BEPLA generated by Swing filter and Swing-RR, given the FaceFour dataset. Swing filter yields joint BEPLA, in which the start point of a line segment is the stop point of its immediately preceding line segment. To extend the line segments as far as possible and thus minimize the number of line segments, the line segments generated by Swing filter usually reach the upper bound H(t) or the lower bound G(t). On the other hand, the available approximated data points for Swing-RR are seldom close to H(t) or G(t). When Swing-RR is employed, the start point of each line segment in disjoint BEPLA is the coded data point nearest to the original data point. Furthermore, most line segments are very short. In fact, more than one half of the line segments have lengths of 1 or 2 time ticks. For these short line segments, it is expected that the stop points are also individually close to their own original data. As a result, the BEPLA produced by Swing-RR has a smaller MSE than that yielded by Swing filter.

Mean Square Error
When the resolution increases, many more coded data points lie between the upper and lower bounds. The MSE between the original data and the BEPLA generated by Swing-RR with a higher resolution is therefore further reduced, as shown in Figure 9a,b.
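The two quality criteria used here can be checked with a few lines of code. The sketch assumes the usual definition of MSE, i.e., the mean of the squared differences between original and approximated data points, which is presumably what Equation (10) expresses. The function names and sample values are hypothetical.

```python
def mean_square_error(original, approximated):
    """Mean of squared differences between original and approximated data points."""
    if len(original) != len(approximated):
        raise ValueError("both series must have the same length")
    if len(original) == 0:
        raise ValueError("series must not be empty")
    return sum((y - a) ** 2 for y, a in zip(original, approximated)) / len(original)


def is_bounded(original, approximated, epsilon):
    """True when |y_i - y_i'| <= epsilon holds for every data point."""
    return all(abs(y - a) <= epsilon for y, a in zip(original, approximated))


if __name__ == "__main__":
    # Made-up values: two approximations that both respect epsilon = 0.5.
    y = [1.0, 2.0, 3.0, 4.0]
    near_bound = [1.5, 2.5, 2.5, 3.5]    # every point sits on the error bound
    near_data = [1.25, 2.0, 3.0, 3.75]   # points stay close to the data
    print(is_bounded(y, near_bound, 0.5), mean_square_error(y, near_bound))
    print(is_bounded(y, near_data, 0.5), mean_square_error(y, near_data))
```

Both sample approximations satisfy the same error bound, yet the one whose points stay near the data has the smaller MSE, which mirrors the comparison between the two filters discussed above.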

Conclusions
In Industry 4.0, sensors play a very important role in manufacturing automation. However, the big data generated by these sensors consumes a lot of energy, both for transmitting data from sensors to data centers and for storing data in the data centers. Best practices for saving power in data centers include reducing storage disk space and network port power consumption. Data compression is a common approach to reducing the amount of big data transmitted via networks. BEPLA retains a certain level of quality of the original sensor data for later analysis. Previous methods focused on how to extend the line segments as far as possible, thus minimizing the number of line segments in BEPLA. However, the length of a line segment, or the length to which a line segment can extend, largely depends on the given error bound and data variation.
In this paper, Swing-RR is presented to produce disjoint BEPLA with Resolution Reduction for sensor data compression. To the best of the authors' knowledge, it is the first attempt to take Resolution Reduction into consideration for BEPLA. Real-world datasets [37] are used to evaluate the performance and energy consumption of Swing-RR. Our experimental results show that Swing-RR outperforms Swing filter. For typical error bounds, i.e., 0.5-5%, Swing-RR with the minimal resolution achieves much better compression ratios. Swing-RR uses 7 (for 0.5%) to 4 (for 5%) bits, rather than 32 bits, to store an approximated data point. As Equations (7) and (8) indicate, since p decreases significantly, the size reduction of BEPLA is substantial. Compared to the number of bits needed by Swing filter to represent its BEPLA, Swing-RR uses only 20-25% of the bits, while state-of-the-art methods use 50-60% of the bits. As a result, fewer bits are transmitted in the network and less disk space is required to store the sensor data in the data center, so power consumption is largely reduced. In addition, the MSE of the BEPLA generated by Swing-RR is smaller than that produced by Swing filter.
The resolution plays a more important role than the maximal delay. In this study, for all datasets, Swing-RR using the minimal resolution always yields better compression ratios than using a higher resolution does. Most line segments are short. Only a small portion is long. Thus, it is worth using fewer bits to encode the lengths of segments.

The time and space complexities of Swing-RR are both O(1) per data record. This makes Swing-RR suitable for sensors with limited resources and energy.
In this paper, only temporal correlation in sensor data is leveraged. Data and spatial correlations [26] are currently under investigation. In addition, we are going to explore more complex data structures, e.g., the convex hull used in previous methods [31,34].
Also, we assume that the collected data are precise. In reality, there can be some bias in data collection. For example, the environments monitored by sensors may be under attack [41], and the sensed data may be contaminated. Depending on their roles, edge nodes can preprocess the data before the data are sent to their data centers. They can do nothing and send the raw data, or they can remove the outliers, predict the missing data, and measure the robustness of the network. The authors will further study these issues in the future.