Adaptive Segmentation of Streaming Sensor Data on Edge Devices

Sensor data streams often represent signals/trajectories which are twice differentiable (e.g., to give a continuous velocity and acceleration), and this property must be reflected in their segmentation. An adaptive streaming algorithm for this problem is presented. It is based on the greedy look-ahead strategy and is built on the concept of a cubic splinelet. A characteristic feature of the proposed algorithm is the real-time simultaneous segmentation, smoothing, and compression of data streams. The segmentation quality is measured in terms of the signal approximation accuracy and the corresponding compression ratio. The numerical results show the relatively high compression ratios (from 135 to 208, i.e., compressed stream sizes up to 208 times smaller) combined with the approximation errors comparable to those obtained from the state-of-the-art global reference algorithm. The proposed algorithm can be applied to various domains, including online compression and/or smoothing of data streams coming from sensors, real-time IoT analytics, and embedded time-series databases.


Introduction
Sensor signal chain solutions used to rely entirely on cloud infrastructure whenever high-level data processing was required. In most cases this was effective, because the amounts of data to be transferred were small and any real-time constraints were not excessive. For contemporary systems, however, this approach is often unacceptable, since transferring the full bandwidth of sampled data will almost always cause network congestion and/or create a significant bottleneck at the aggregation node (e.g., a wireless gateway).
An obvious solution is to compress the data before uploading it. To realize this and to address the above issues, the edge computing approach emerged; it can be treated as a decentralized cloud that brings computing power-and thus the capability to pre-process and compress data streams-closer to data sources such as sensors, Internet of Things (IoT) devices, and wearable devices [1,2].
Locating computing power closer to data sources is indispensable for some applications requiring almost real-time responses, such as for example autonomous vehicles and e-health. Real-time requirements of such applications cannot be met by the regular cloud in the case of numerous sensors because of high latency and ineffective bandwidth [1,2]. The computing power available in edge devices also opens up new possibilities for advanced data stream pre-processing such as smoothing and/or segmentation.
In many instances, certain properties of the input signal-typically represented as a series of data points obtained by sampling-are known and must be taken into account during segmentation. A common example is signal smoothness, measured by the differentiability class C k , with C 2 often being the target (f ∈ C 2 if it is twice continuously differentiable; for instance, in robotics or control systems, a movement is smooth only if its trajectory is twice differentiable, giving continuous velocity and acceleration).
With no access to future values, an effective algorithm for the segmentation of streaming data must be entirely local. Although such algorithms exist (for instance, PLA, PMC-MR, and Linear Filter [3]), their outputs are not C 2 -continuous. The same applies to cubic Hermite spline-based segmentations, which are only C 1 -continuous.
The second group of potential solutions-represented by cubic smoothing splines-gives C 2 -continuous outputs, yet the corresponding algorithms are not local, since they require the solution of a system of linear equations whose coefficients depend on the whole data set. To the best of our knowledge, no streaming algorithm combines the above properties, i.e., is local and computes C 2 -segmentations.
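To make the C 1 versus C 2 distinction concrete, the following sketch (our own illustration, not from the paper) checks the highest continuity class, up to C 2, of two cubic segments at a shared knot; a cubic Hermite joint typically matches value and slope but not curvature:

```python
# Illustrative sketch (not the paper's algorithm): determining the smoothness
# class of a piecewise cubic at a knot. Two segments meet at x = 1; we compare
# their value, first derivative, and second derivative there.

def poly_eval(c, x):
    """Evaluate a cubic with coefficients c = (c0, c1, c2, c3) at x."""
    return c[0] + c[1] * x + c[2] * x**2 + c[3] * x**3

def poly_d1(c, x):
    """First derivative of the cubic at x."""
    return c[1] + 2 * c[2] * x + 3 * c[3] * x**2

def poly_d2(c, x):
    """Second derivative of the cubic at x."""
    return 2 * c[2] + 6 * c[3] * x

def continuity_class(left, right, knot, tol=1e-9):
    """Return the highest k (up to 2) with C^k continuity at the knot,
    or -1 if even the values disagree."""
    checks = [
        abs(poly_eval(left, knot) - poly_eval(right, knot)) < tol,
        abs(poly_d1(left, knot) - poly_d1(right, knot)) < tol,
        abs(poly_d2(left, knot) - poly_d2(right, knot)) < tol,
    ]
    k = -1
    for ok in checks:
        if not ok:
            break
        k += 1
    return k

# p(x) = x^3 on [0, 1]; r(x) = -2 + 3x matches p's value and slope at x = 1,
# but not its curvature (r'' = 0, p''(1) = 6) -> C^1 only, the Hermite case.
p = (0.0, 0.0, 0.0, 1.0)
r = (-2.0, 3.0, 0.0, 0.0)
```

A C 2 joint would additionally require the curvatures to agree, which is exactly the condition Hermite-based streaming methods drop.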
Our aim is to propose such an algorithm. The presented algorithm is based on the greedy look-ahead strategy and built upon the concept of a cubic splinelet (see Figure 1). One of its key properties is the real-time simultaneous segmentation, smoothing, and compression of noisy data streams. This means it can be applied to various domains including online compression and/or smoothing of streaming data, real-time IoT analytics, and embedded time-series databases.
The main contributions of this paper are the following:
• the cubic splinelet of type WSSR min -a special type of splinelet that minimizes the Weighted Sum of Squared Residuals (Section 4.1),
• the algorithm for C 2 -continuous WSSR min -cubic splinelet-based adaptive segmentation of streaming sensor data (Sections 4.2 and 4.3),
• numerical results which demonstrate the effectiveness of the algorithm (Section 5).
The remainder of this paper is organized as follows. Section 2 contains an overview of related work. In Section 3, the problem statement is given, and in Section 4, the proposed solution is described. In Section 5, the solution is evaluated, and the obtained results are presented and discussed. Section 6 contains the conclusions of the study.

Figure 1. Conceptual diagram of the considered problem: the streaming preprocessor (segmenter) maps a stream of data points to a stream of cubic spline segments, which form a C 2 -continuous curve.

Related Work
Stream computing requires low-latency real-time algorithms that can process massive amounts of data generated by multiple sources at very high speed [4]. Such algorithms should be able to pre-process and analyze, on the fly, high-velocity streams of data coming from sources such as Internet of Things (IoT) devices, sensor networks, wearable and mobile devices, and market data feeds.
The key value of data coming from such sources is their "freshness", and they should be processed and analyzed as soon as they arrive, which is the key assumption of the big data stream analytics [4]. Such a requirement leads to the need for low-latency real-time algorithms because the batch computing approach, in which data should be stored first before it is processed and analyzed, is not sufficient [4][5][6]. In recent years, the research has mainly been focused on algorithms for real-time analysis of big data streams and there was not much research into the noisy or incomplete streaming data pre-processing phase [4].
Below, the research related to the proposed algorithm characteristic features-the real-time data stream segmentation, smoothing, and compression-is presented. Moreover, the related research works on splines are mentioned and the selected possible application areas for the proposed algorithm are reviewed.

Data Stream Segmentation
Sensors located in IoT devices generate data streams continuously. For some application areas, it is crucial to partition such data into segments to perform successful analysis using advanced algorithms, for example the machine learning ones.
Streaming segmentation of the signal realized on edge devices allows for:
• reconstruction of a sampled noisy signal (maintaining its continuity/smoothness class as a key feature, like non-negativity) before the network transmission,
• signal compression (in experiments with test signals, we observed a data size reduction of 135 to 208 times, i.e., two orders of magnitude),
• reduction of network traffic (in the entire infrastructure),
• energy savings (in the entire infrastructure-the network communication is energy-intensive, and additionally the signal smoothed on edge devices no longer needs to be pre-processed in the cloud).
Recognition and prediction of human activities by real-time analysis of data streams coming from sensors and actuators is one of the areas of application of data stream segmentation [7]. The problem of automatically segmenting a data stream into activities in real time is a difficult one-there is no general approach for determining the end of a detected activity [7]. Sensor data stream segmentation has been the subject of many research works. Some proposed approaches were based on a time window with fixed length or on a dynamic time window including a fixed number of events [8,9]. Other approaches were based on a real-time analysis of temporal information, but either very intensive pre-processing was required, or the application was limited (for example, to location analysis) [10,11].
An approach for continuous activity recognition based on the real-time sensor data segmentation was proposed in [12]. The proposed method was based on dynamically resized time windows (taking into account temporal sensor data and the state of activity recognition) and the ontology-based activity recognition algorithm. A real-time activity prediction method using automatic data stream segmentation based on Jaro-Winkler distance measurement was proposed in [7].
A method for unsupervised on-the-fly segmentation and classification of time-series data was proposed in [13]. The approach was based on data density distribution estimation, and the data stream was processed incrementally, using a fixed amount of resources (memory and CPU). It even worked in real time when the sampling rate of the data stream was kept at a certain level [13]. A semantic-based approach for real-time separation and segmentation of a sensor data stream into multiple threads of activities was proposed in [14].
None of the above-mentioned approaches can provide online C 2 -continuous segmentation. This is crucial if we must deal with physical constraints (for example, velocity and acceleration must be continuous).
Our proposed adaptive segmentation of the data stream (sampled signal) takes place in the approximation space of third-order (cubic) splines, which represents the space of twice differentiable functions, i.e., the problem domain. Its result is a stream of segments that represents the reconstructed true signal. A spline constructed in this way recreates the signal taking into account its known continuity (smoothness) class. Therefore, we segment the data stream taking into account the features (the continuity/smoothness class) of the processed signal.

Data Stream Smoothing
Advanced driving assistance systems and adaptive cruise control systems require high-accuracy, low-noise (or at least smoothed) data for proper functioning [15]. Data streams of estimated vehicle position can be obtained from different types of sensors: GPS, radar, LiDAR, gyroscopes, accelerometers, wheel speed sensors. Data coming from those sensors can be noisy and inaccurate due to many technical reasons. Such inaccuracies may lead to incorrect absolute positioning, unrealistic kinematics and inconsistent spacing between vehicles [15].
In car-following applications, the Kalman smoothing was used for improving the quality of data coming from one source (GPS) [16] or from multiple sensors [15,17]. The Kalman smoothing approach was also used for improving the vehicle positioning data coming from GPS and internal dead reckoning (gyroscopes, accelerometers, wheel speed) sensors [18].
A method for smoothing the data stream coming from ultrasonic sensors measuring the water level was proposed in [19]. The proposed approach included outlier detection using modified Z-scores based on the median absolute deviation, and stream data smoothing based on the exponentially weighted moving average. In [20], a relational database system was extended to include a real-time method based on dynamic probabilistic models for filtering and smoothing data streams. In their approach, the authors used particle filters (a class of sequential Monte Carlo algorithms).
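As an illustration of the two techniques attributed to [19], here is a minimal sketch (our own hedged reconstruction, not the authors' code) combining modified Z-score outlier rejection with EWMA smoothing; the 0.6745 scaling factor and 3.5 threshold are the conventional choices for the modified Z-score method, and the readings are synthetic:

```python
from statistics import median

def modified_z_scores(xs):
    """Modified Z-scores based on the median absolute deviation (MAD)."""
    med = median(xs)
    mad = median(abs(x - med) for x in xs)
    if mad == 0:
        return [0.0 for _ in xs]
    return [0.6745 * (x - med) / mad for x in xs]

def ewma(xs, alpha=0.3):
    """Exponentially weighted moving average, processed point by point."""
    out, s = [], xs[0]
    for x in xs:
        s = alpha * x + (1 - alpha) * s
        out.append(s)
    return out

readings = [5.0, 5.1, 4.9, 5.2, 50.0, 5.0, 5.1]   # one obvious spike
scores = modified_z_scores(readings)
clean = [x for x, z in zip(readings, scores) if abs(z) <= 3.5]  # drop outliers
smoothed = ewma(clean)
```

Note that, unlike the approach proposed in this paper, this pipeline neither segments nor compresses the stream.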
The data stream smoothing methods mentioned above include neither data stream segmentation nor compression. An approach using wavelet-based Kalman data smoothing for processing uncertain oil well-testing data, which included compression, was presented in [21]. However, the backward data smoothing was performed offline. The real-time approach proposed in this paper provides C 2 -continuous segmentation and data smoothing.
Signals of C 2 continuity, recorded by sensors, constitute a substantial class (for example, the recorded location of autonomous vehicles, drones, or industrial robots). C 2 continuity (as a measure of smoothness) is a key characteristic of a signal (similar to, for example, monotonicity or non-negativity) and determines the problem domain. It means that the signal can be differentiated twice. For example, it allows the velocity and acceleration of an object to be determined from its recorded location, without the need to additionally register these quantities, which in turn:
• significantly reduces the size of the data that needs to be transferred from the edge layer to the cloud,
• reduces delays and speeds up data transmission,
• reduces energy consumption (in the entire infrastructure).
In the case of a noisy signal (the typical case), taking the continuity class into account allows for a more accurate reproduction of the true signal (noise removal). The continuity class is an important element of our knowledge about the signal, and ignoring it reduces the accuracy of the reproduction: a signal that is not twice differentiable cannot represent, for example, an object's location over time, since it would imply infinitely large forces, which do not occur in nature.
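The point about velocity and acceleration can be made concrete: once a position segment is available as a cubic polynomial, both quantities follow from its exact derivatives, with no extra sensors or numerical differencing. A small illustrative sketch (the segment coefficients below are hypothetical):

```python
# For a position segment p(t) = c0 + c1*t + c2*t^2 + c3*t^3, velocity and
# acceleration are its first and second derivatives, both continuous by
# construction for a C^2 spline.

def position(c, t):
    c0, c1, c2, c3 = c
    return c0 + c1 * t + c2 * t**2 + c3 * t**3

def velocity(c, t):
    c0, c1, c2, c3 = c
    return c1 + 2 * c2 * t + 3 * c3 * t**2

def acceleration(c, t):
    c0, c1, c2, c3 = c
    return 2 * c2 + 6 * c3 * t

seg = (0.0, 1.0, 0.5, -0.1)   # hypothetical segment coefficients
t = 2.0
p_t, v_t, a_t = position(seg, t), velocity(seg, t), acceleration(seg, t)
```

Transmitting only the position spline thus implicitly carries the velocity and acceleration signals as well.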

Data Stream Compression
Cloud computing is an indispensable part of the Internet of Things (IoT). However, its use introduces several problems, including transmission latency, bandwidth constraints, and high energy consumption [2]. The micro-controller, transceiver, and sensor units are the parts of smart devices that consume most of the energy, and data transmission is the most power-hungry task [22,23]. Energy efficiency can be improved by moving computation tasks from the cloud to edge devices, and by reducing the amount of data transferred from IoT devices to the edge (which additionally conserves the edge devices' storage space) [1,2,24].
To address the above issue, a lightweight version of a fast error-bounded lossy compression algorithm [37] was proposed in [2]. The authors showed that the proposed approach was able to reduce the amount of data transmitted from the wearable device to the edge device by approximately 103 times, without worsening the results of data analytics.
None of the above-mentioned compression algorithms pre-process data into a form that would potentially accelerate the operation of machine learning algorithms on edge devices, for example by segmenting or smoothing the data stream before sending it from sensors or IoT devices.
Our algorithm segments the data stream (a sampled signal belonging to continuity class C 2 ) in an online way. Segmentation allows for a significant degree of compression (we observed a data size reduction of 135 to 208 times), and the C 2 class of the signal helps in this process, since three out of four coefficients of each spline segment (apart from the first one) can be calculated from the continuity conditions. Omitting the continuity class of the processed signal (known in advance, because we know what we are measuring) from the segmentation process could potentially allow a slightly better compression ratio, but at the cost of the accuracy of signal reproduction (which in many cases is unacceptable). The primary goal is to recreate the qualitative characteristics of the signal (its continuity/smoothness class), and only secondarily the accuracy of approximation (a quantitative feature, measured, for example, with the mean squared error).
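The coefficient-saving argument can be sketched as follows, under an assumed local-coordinate convention (each segment parameterized from x = 0 at its left knot); this is an illustration of the idea, not the paper's exact encoding:

```python
# With C^2 continuity, three of the four coefficients of the next cubic
# segment follow from the previous segment's value, slope, and curvature at
# the shared knot, so only one new coefficient (c3) per segment needs to be
# stored or transmitted.

def segment_end_state(c, h):
    """Value, first and second derivative of a cubic at its end, x = h."""
    c0, c1, c2, c3 = c
    y = c0 + c1 * h + c2 * h**2 + c3 * h**3
    d1 = c1 + 2 * c2 * h + 3 * c3 * h**2
    d2 = 2 * c2 + 6 * c3 * h
    return y, d1, d2

def next_segment(prev, h, c3_new):
    """Coefficients of the next segment in its own local frame (x = 0 at
    the knot): c0, c1, c2 come from the continuity conditions."""
    y, d1, d2 = segment_end_state(prev, h)
    return (y, d1, d2 / 2.0, c3_new)

prev = (1.0, 0.5, -0.2, 0.05)           # hypothetical previous segment
nxt = next_segment(prev, h=2.0, c3_new=0.01)
```

By construction, the chained segments agree in value, slope, and curvature at every knot, which is exactly the C 2 condition exploited for compression.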

Splines
A spline-a flexible strip of wood that was used to draw smooth curves-was mentioned for the first time in [38], as indicated in [39]. The new idea of a spline curve, represented as piece-wise polynomial curves with certain smoothness properties, was proposed in [40], where the mathematical foundations for spline interpolation and approximation were presented. More information on splines can be found, for example, in [41][42][43][44][45].
An approximation of a linearly varying curvature by three cubic curve segments was proposed in [46]. An online algorithm for the generation of minimum time joint industrial manipulators trajectories, using similar representation of a curve as in [46], was proposed in [47].
To the best of our knowledge, there is no online algorithm which can compute C 2 -continuous segmentations. The approach proposed in this paper can perform online C 2 -continuous cubic splinelet-based adaptive segmentation and can process data streams in real time. It can also be applied offline when dealing with huge amounts of data, which cannot be processed by traditional algorithms due to memory limitations.

Possible Application Areas
The application areas, for which there is a need for on-the-fly algorithms allowing for data stream segmentation, smoothing, and compression include, but are not limited to, IoT devices, sensor networks, edge computing, and autonomous vehicles (cars, robots, drones). The need for real-time pre-processing of big data streams coming from multiple sensors results from data noise, bandwidth limitations and energy efficiency requirements. Below, the selected research in three areas (sensor networks, Unmanned Aerial Vehicles teams and robot teams), in which the proposed algorithm could be applied, is presented.
Weather prediction generally requires expensive weather stations and supercomputers for computations. An alternative can be the approach using Distributed Sensor Network for collecting data and performing weather prediction computations [56].
Such an approach requires real-time pre-processing, segmentation, smoothing, and compression of data streams, because the weather stations used continuously communicate with each other and exchange large amounts of sensor data to compute the predictions. The approach proposed in this paper meets all the requirements of this application area: it allows for online C 2 -continuous cubic splinelet-based adaptive segmentation, compression, and smoothing of noisy data streams.
Using multiple Unmanned Aerial Vehicles (UAVs) for surveillance, environmental monitoring, and rescue operations has become an increasingly popular research topic in recent years [57,58]. Tracking single or multiple moving ground targets requires continuously updated and accurate data about their position. The accuracy of data coming from a UAV's sensors is crucial for that task. However, data coming from sensors such as GPS, radar, LiDAR, gyroscopes, and accelerometers can be noisy and inaccurate due to many technical reasons. Using multiple coordinated UAVs allows for combining data coming from their sensors and thus using more accurate information about the current target(s) position [57]. Furthermore, the navigation and coordination of a group of UAVs requires real-time continuous exchange of large amounts of data coming from each unit's sensors [59].
Using a real-time algorithm that can segment, smooth, and compress data streams coming from each UAV's sensors will be crucial for successfully navigating and coordinating a whole team. The online algorithm proposed in this paper not only computes C 2 -segmentations, but also smooths and compresses data in real time, which makes it fully applicable in such a domain as UAV sensor data analytics. The online C 2 -continuous segmentation is crucial for UAVs because, for example, the approximated trajectory of a vehicle must be twice differentiable to give continuous velocity and acceleration.
Research on multi-robot systems (MRS) has gained importance and developed significantly in recent years. Some of the most important research problems in the MRS domain include communication mechanisms, planning and coordination strategies, and decision-making algorithms [60]. Research issues related to team coordination [61] and to sharing data, intelligence, and resources between many robots [62] are of great importance. Effective communication between many robots under limited bandwidth resulting from environmental conditions (for example, the underwater environment) in which teams of robots operate is also the subject of intensive research [63].
As in the case of drone teams, also in the case of robot teams coordinating their actions and sharing data, the essential issue is to deal in real time with data streams, which additionally can be noisy and incomplete. In such a case, using a method of segmenting, smoothing and compressing data in real time is crucial. The proposed approach not only does this, but also provides C 2 -segmentations, which is of crucial importance when we use the data to plan the trajectories for robots. In such a case the trajectory must be twice differentiable to give a continuous velocity and acceleration.

Problem Formulation
Consider a stream of sensor data points, S D = (D 0 , D 1 , D 2 , . . . ), that arrive (or are accessed) sequentially and describe an underlying signal f (q), q ∈ R (in the subsequent formulae, q stands for any independent variable; typically it refers to time, t), where D k = (q k , y k ). In the general case this stream is "noisy", i.e., y k = g(q k ) + ε k , where g(·) is the true signal and ε k ∼ N (µ, σ 2 ), i.e., Gaussian noise. This model is shown in Figure 1.
Problem statement. Given a stream of data points S D = (D 0 , D 1 , D 2 , . . . ), where D k = (q k , y k ) with y k = g(q k ) + ε k , k = 0, 1, 2, . . . , find the C 2 -continuous cubic spline whose segments correspond-in the space generated by the user-defined segment length adaptation strategy (δ)-to the optimal segmentation of S D , with regard to the reconstruction of the original signal (g).
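A minimal generator for test data following this noise model (the choice of g = sin as the true signal and the parameter values are arbitrary illustrations):

```python
import math
import random

def noisy_stream(g, qs, sigma=0.1, mu=0.0, seed=42):
    """Yield data points D_k = (q_k, y_k) with y_k = g(q_k) + eps_k,
    eps_k ~ N(mu, sigma^2), matching the problem statement's noise model."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    for q in qs:
        yield (q, g(q) + rng.gauss(mu, sigma))

g = math.sin                            # hypothetical true signal
stream = list(noisy_stream(g, [k * 0.01 for k in range(1000)], sigma=0.05))
```

A segmenter consumes such a stream point by point and must reconstruct g without ever seeing future points.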

Remark 1.
The adaptation strategy, δ, usually depends on the target platform capabilities (e.g., the available memory), and on the required accuracy of the solution. In specific cases, it can be very sophisticated, e.g., Machine Learning (ML)-based.

Proposed Solution
The streaming algorithm we propose is based upon the concept of a cubic splinelet-a local building block of an "on the fly" constructed global cubic spline, which is by definition C 2 -continuous (see Section 4.1 and [64]). This local, three-segment building block introduces a look-ahead capability to the algorithm, which-because of its online characteristic-must be greedy. Indeed, we can construct the global cubic spline using only the first segment of each splinelet, while the remaining two-treated as a "look-ahead" part-can be dropped (see Algorithm 1).
The key elements of the proposed algorithm (including its pseudo-code) are given in the following three subsections.

Cubic Splinelet of Type WSSR min -The Solution Building Block
Without loss of generality, we can consider the problem in the following local frame: x = q − q 0 , which means that an interval [q A , q D ], given in the global frame, Oqy, is shifted in the q-direction by the offset q 0 , becoming [x A , x D ] = [q A − q 0 , q D − q 0 ]. In the special case when q 0 = q A , we get [x A , x D ] = [0, q D − q A ]. Note: this local frame, Oxy, will be used in the following paragraphs.

Definition 1.
A cubic splinelet of type WSSR min is a three-segment piece-wise cubic function, s(x), defined in the local frame (when q 0 = q A ) as in [64] (Equation (7)), which satisfies the given boundary conditions and minimizes the Weighted Sum of Squared Residuals (Equation (9)).

To find the splinelet corresponding to Equation (9), we first note that each coefficient in Equation (7) can be expressed as a linear combination of the boundary parameters (Equation (10)), where i = 1, 2, 3, j = 1, 2, 3, 4, and (α 1 , α 2 , α 3 , α 4 ) = (s D , s' D , s'' D , 1). Equation (9) can now be restated as a parametric optimization problem: arg min over (α 1 , α 2 , α 3 ) of the objective J(α 1 , α 2 , α 3 ). Next, we set ∂J/∂α i = 0, i = 1, 2, 3, which leads to a system of three linear equations (J is quadratic in the α i , so each ∂J/∂α i is linear with respect to them). The solution (the optimal values of α i , i = 1, 2, 3) is substituted into Equation (10), yielding the coefficients of the corresponding splinelet.
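The mechanics of the WSSR minimization can be illustrated on a deliberately reduced case: a single cubic segment with one free coefficient instead of the paper's three-segment splinelet with three free parameters. Setting the derivative of the weighted sum of squared residuals to zero then yields one linear equation instead of a 3 × 3 system; the data below are synthetic:

```python
# Reduced analogue of the WSSR_min construction: the first three coefficients
# of the cubic are fixed by the C^2 initial conditions, and the single free
# coefficient c3 minimizes J(c3) = sum_k w_k * (y_k - s(x_k))^2.
# Since dJ/dc3 is linear in c3, the minimizer has a closed form:
#   c3 = sum(w * x^3 * r) / sum(w * x^6),  r = y - (c0 + c1*x + c2*x^2).

def fit_free_coefficient(points, weights, c0, c1, c2):
    num = den = 0.0
    for (x, y), w in zip(points, weights):
        r = y - (c0 + c1 * x + c2 * x**2)   # residual of the fixed part
        num += w * x**3 * r
        den += w * x**6
    return num / den

# Synthetic check: data generated from a known cubic is recovered exactly.
true = (1.0, -0.5, 0.25, 2.0)
pts = [(x / 10.0,
        true[0] + true[1] * (x / 10.0) + true[2] * (x / 10.0)**2
        + true[3] * (x / 10.0)**3)
       for x in range(1, 11)]
c3 = fit_free_coefficient(pts, [1.0] * len(pts), *true[:3])
```

The full splinelet case proceeds identically, except that three partial derivatives produce the 3 × 3 linear system described above.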

Segmentation Heuristic Overview
The cubic splinelet defined in the previous section gives the locally optimal approximant in the given interval [0, x D ]. However, the following questions remain unanswered:
• What should be the value of x D that corresponds to the locally optimal stream segmentation/partitioning?
• What should be the search space for this optimization task?
The search space (interval) can be straightforwardly derived from the selected spline adaptation strategy; it may be as simple as the interval given by Equation (15). The best x D can then be computed using an iterative improvement method. At each of its steps:

1. the search interval is divided into a predefined number of sub-intervals,
2. the sub-intervals are evaluated (see Equation (16) below) by sampling and interpolating (note: this step can be accelerated by memoization/caching),
3. the best sub-interval becomes the new search interval.
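The three steps above can be sketched as follows; the sub-interval and iteration counts are illustrative, and a scalar score function stands in for the splinelet fitness evaluation the real algorithm performs:

```python
# Iterative-improvement search: split the current interval into n_sub
# sub-intervals, score the candidate end points, and shrink the interval
# around the best candidate.

def refine_search(score, lo, hi, n_sub=4, n_iter=6):
    """Maximize score(x) over [lo, hi] by repeated sub-interval selection."""
    for _ in range(n_iter):
        step = (hi - lo) / n_sub
        candidates = [lo + i * step for i in range(n_sub + 1)]
        best = max(candidates, key=score)    # candidate scores could be cached
        lo = max(lo, best - step)            # new search interval surrounds
        hi = min(hi, best + step)            # the best candidate
    return (lo + hi) / 2.0

# Hypothetical unimodal fitness with a peak at x = 7.3:
x_best = refine_search(lambda x: -(x - 7.3)**2, 5.0, 10.0)
```

Each iteration shrinks the interval by roughly a factor of n_sub / 2, so a handful of iterations suffices for segment-length resolution.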
The fitness (objective) function, φ, used in the above search algorithm is defined by Equation (16); it is based on R 2 adj , the Adjusted Coefficient of Determination, on MAE, the Mean Absolute Error, and on Σ s , a set of candidate splinelets (corresponding to different values of x D ).
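The two ingredients of φ can be computed as follows; this sketch shows only the standard definitions of MAE and the Adjusted Coefficient of Determination, not the paper's exact combination of them:

```python
# Standard definitions of the fitness-function ingredients.

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def r2_adjusted(y_true, y_pred, n_params):
    """Adjusted Coefficient of Determination for a model with n_params
    fitted parameters; penalizes extra parameters relative to plain R^2."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((a - b)**2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean)**2 for a in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
```

A candidate splinelet is scored by evaluating it at the buffered sample points and feeding the predictions into these measures.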

The Algorithm
Algorithm 1 presents a high-level view of the whole computational process (see also Appendix A). The sliding window-based streaming segmentation that we propose is clearly reflected in its structure. It is also worth noting that:
• the sliding window buffer size, h, can be either fixed upfront (e.g., depending on the input signal characteristics and/or real-time constraints) or constantly adapted (e.g., using ML algorithms); a good strategy for the first approach is to use the value of h corresponding to the maximum acceptable buffering delay (latency),
• to increase the readability of the pseudo-code, checking for exceptional/corner cases (e.g., too few data points at the end of the sliding window buffer to build one more segment) was omitted in some places,
• the algorithm presents one possible way of handling the end of the stream (s Best is computed with no looking ahead); if necessary, this computation can be more sophisticated (e.g., the stream can be "artificially" extended),
• the number of sub-intervals that a given interval is divided into can be either fixed upfront or variable (e.g., a simple dependence on the length of the interval, or ML-based).

Input: Non-empty stream of data points; sliding window buffer size

Output: Stream of cubic spline segments which form a C 2 -continuous curve

 1  Open the input and output streams
 2  Estimate initial conditions for the first segment
 3  Initialize the buffer offset and search interval (see Equation (15))
 4  while not end of input stream do
 5      Fill the sliding window buffer
 6      while not end of sliding window buffer do
 7          Compute the new search interval (see Equation (15))
 8          Find s Best -the best splinelet in this interval
 9          Add the first segment of s Best to the output stream
10          Compute the new initial conditions
11  Add the remaining two segments of s Best to the output stream
12  Close the input and output streams
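Structurally, Algorithm 1 can be sketched as the following stream-consumer skeleton; the splinelet search is stubbed out (the stub simply takes three points per "segment"), so this shows only the sliding-window loop and the greedy emit-first-segment-only policy, not the numerical core:

```python
# Skeleton of a sliding-window streaming segmenter: fill the buffer, find a
# candidate, emit only its first segment (the look-ahead part is dropped),
# and slide the window over the consumed points.

def segment_stream(points, h, find_best_splinelet):
    """Consume an iterable of data points; yield emitted segments.
    find_best_splinelet(buffer) -> (first_segment, n_points_consumed)."""
    buffer = []
    for p in points:
        buffer.append(p)
        if len(buffer) == h:                       # sliding window is full
            first_segment, consumed = find_best_splinelet(buffer)
            yield first_segment                    # greedy: first segment only
            del buffer[:consumed]                  # slide the window
    if buffer:                                     # end of stream: flush rest
        yield tuple(buffer)

def take_three(buf):
    """Stub splinelet search: 'first segment' = first three points."""
    return tuple(buf[:3]), 3

segments = list(segment_stream(range(10), 6, take_three))
```

In the real algorithm, find_best_splinelet would run the interval-refinement search over candidate splinelets and return the winning first segment together with its length.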

Results and Discussion
To evaluate the proposed algorithm, a series of numerical experiments was carried out, mostly in the form of a comparative analysis. As a point of reference, the results obtained from R function smooth.spline (accessed on 15 July 2021) were used. It is worth noting that this function-an example of state-of-the-art solutions-is global (i.e., the whole stream must be given as its input).
A summary of the evaluation process used is given in Section 5.1 and the results of the experiments are presented in Sections 5.2 and 5.3.

Evaluation Process Overview
The key aspects of the evaluation process-test streams, algorithm performance descriptors, and the reference function (algorithm) used-are briefly described in this section.

Performance Descriptors
Each of the solutions, s, was evaluated using the following measures:
• Mean Absolute Error (MAE),
• Root Mean Squared Error (RMSE),
• Normalized Root Squared Error (NRSE),
• Mean Absolute Error Quotient, Q MAE (local-to-global algorithm ratio 1),
• Root Mean Squared Error Quotient, Q RMSE (local-to-global algorithm ratio 2),
• Compression Ratio, CR (note: due to the C 2 -continuity of cubic splines, only one of the four coefficients of each spline segment, apart from the first one, needs to be stored),
• Absolute Error, AE, and Squared Error, SQE (as functions).

Remark 4. The above set covers local (AE and SQE), global (MAE, RMSE, and NRSE), and competitive (Q MAE , Q RMSE , and CR) performance descriptors.
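Standard forms of some of these descriptors can be sketched as follows; the NRSE normalization (by the signal range) and the compression-ratio accounting (two stored values per segment) are assumptions of this sketch, not the paper's exact formulas:

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - b)**2 for a, b in zip(y_true, y_pred))
                     / len(y_true))

def nrse(y_true, y_pred):
    """Normalized Root Squared Error; normalization by the signal range is
    an assumption here."""
    return rmse(y_true, y_pred) / (max(y_true) - min(y_true))

def compression_ratio(n_points, n_segments):
    """Raw-to-compressed size ratio, assuming 2 values per input point
    (q, y) and, thanks to C^2 continuity, roughly 2 stored values per
    segment (knot position plus one free coefficient)."""
    return (2 * n_points) / (2 * n_segments)
```

The competitive quotients Q MAE and Q RMSE are then simply the local algorithm's MAE/RMSE divided by the global reference algorithm's values.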

Reference Algorithm and Its Limitations
Remember that an online (local, streaming) algorithm is one that can process its input piece-by-piece in a serial fashion without having the entire input available from the beginning (as is the case for offline/global algorithms). As a result, it might make "decisions" that later turn out not to be optimal. Consequently, a local algorithm cannot outperform its global (optimal) counterpart. To compare these two, a "local-to-global algorithm ratio" is often used.
Unfortunately, this approach cannot be directly applied to the problem under consideration (i.e., adaptive segmentation of streaming data with the use of C 2 -continuous cubic splines) because there is no other algorithm to compare it with. With this in mind, we can assume that the stream is finite and then use an existing cubic spline-based approximator as a (global) point of reference. An example of such an approximator is the R language smoothing spline function smooth.spline (accessed on 15 July 2021).
It turns out, however, that this is still not a complete solution: from the automatic segmentation point of view, this reference function (algorithm) does not handle data streams longer than about 6% of the length of the test streams (as shown in Figure 4). For longer streams, we need to specify the number of smoothing spline segments (knots) manually. As shown in Figure 5, we can expect accurate approximations for all test streams when using more than 4 × 10 3 knots.

Remark 5.
In the evaluation process used, the number of knots for function smooth.spline was set to be the same as that found by the splinelet-based segmentation algorithm (which in all cases was more than 4 × 10 3 ).

Evaluation Results: Approximation Errors and Compression Ratio
Given a signal, the quality of its splinelet-based approximation-measured in terms of absolute and quadratic errors (see Section 5.1.2)-is the key performance indicator of the corresponding segmentation, which, in turn, is strongly related to the signal compression ratio. The corresponding evaluation results are presented in Table 1 and in Figures 5 and 7. The values of the error quotients Q MAE and Q RMSE (Table 1) show that the splinelet-based solutions, despite being completely local, are in most cases almost as good as their global (smoothing spline-based) counterparts.
The same can be observed in Figures 6 and 7, which provide additional insight into the splinelet-based approximation: they supplement the integral-measure view, given by the error quotients, with error distributions. The almost identical shapes of the density lines corresponding to the two compared solutions confirm the high quality of the splinelet-based solution. In this context, the values of the compression ratios, CR(s, t), given in Table 1 can be considered high (from 135 to 208, meaning that the compressed stream sizes are up to 208 times smaller).

Evaluation Results: Segment Length Auto-Adaptation
Since the spline segment length auto-adaptation mechanism determines the search space at each segmentation step, it has significant impact on the algorithm's overall performance. Not only does this refer to the approximation quality (discussed in Section 5.2), but also-probably even more importantly-to the algorithm's stability, which becomes essential in the context of the C 2 -continuous streaming approximation. Figure 8 shows concisely the spline segment auto-adaptation related results in the form of a segment length distribution for each tested data stream.
We can see that:
• in all cases, the dominating segment lengths belong to the interval [5, 10] (recall that the test streams differ only in their signal-to-noise ratios-see Equations (23) and (26)),
• the lower the noise level, the more distinct the three existing maxima of the density function become (they correspond to the main "building blocks" used by the segmentation algorithm to restore the true signal, which is periodic),
• the higher the noise level, the closer to uniform the segment length distribution becomes, and the longer the segments are (because of a higher error tolerance).

Conclusions
It has been shown that the C 2 -continuous cubic splinelet-based adaptive segmentation of streaming data-despite its local/online character-is not only possible but can also be effective. The key element in achieving this was basing the algorithm on the greedy look-ahead strategy and on the concept of a cubic splinelet-a building block for C 2 -continuous cubic splines. A characteristic feature of the proposed algorithm is the simultaneous segmentation, smoothing, and compression of sensor data streams, performed in real time.
The segmentation quality has been measured in terms of the signal approximation accuracy and the corresponding compression ratio. The numerical results show relatively high compression ratios (from 135 to 208, see Table 1) combined with approximation errors comparable to those obtained from the (global) reference algorithm (see Figures 6 and 7).
The proposed algorithm can be applied to various domains, including online compression and/or smoothing of streaming data coming from IoT devices, sensor networks, and sensors located in autonomous vehicles (cars, drones) and robots. The possible application areas also include real-time IoT analytics, and embedded time-series databases. Further exploration of this idea could be the first possible future research direction. Another could be related to more advanced auto-adaptation mechanisms of the search space.

Input:
h - sliding window buffer size, S in - non-empty stream of data points
...
17  ...  // s Best .x S = s Best .x B , see Equation (6)
18  ...  // s Best .x E = s Best .x D , see Equation (6)
    /* add the remaining two segments, s (2) Best and s (3) Best , to S out */
19  put(s (2) Best , S out )
20  put(s (3) Best , S out )
    /* close the streams */
21  close(S in ); close(S out )

...Table A1, but in graphical form.