Sensors
  • Article
  • Open Access

24 October 2025

Frugal Self-Optimization Mechanisms for Edge–Cloud Continuum

1 Faculty of Mathematics and Information Science, Warsaw University of Technology, 00-662 Warsaw, Poland
2 Systems Research Institute, Polish Academy of Sciences, 01-447 Warsaw, Poland
3 Department of Computer Science and Engineering, University of Aizu, Aizu-Wakamatsu 965-8580, Japan
* Authors to whom correspondence should be addressed.
This article belongs to the Special Issue Artificial Intelligence and Edge Computing in IoT-Based Applications

Abstract

The increasing complexity of the Edge–Cloud Continuum (ECC), driven by the rapid expansion of the Internet of Things (IoT) and data-intensive applications, necessitates innovative methods for automated and efficient system management. In this context, recent studies have focused on self-* capabilities that can enhance system autonomy and increase operational proactiveness. Separately, anomaly detection and adaptive sampling techniques have been explored to optimize data transmission and improve system reliability. The integration of these techniques within a single, lightweight, and extendable self-optimization module is the main subject of this contribution. The module was designed to be well suited for distributed systems composed of highly resource-constrained operational devices (e.g., wearable health monitors, IoT sensors in vehicles, etc.), where it can be utilized to self-adjust data monitoring and enhance the resilience of critical processes. The focus is put on the implementation of two core mechanisms, derived from the current state-of-the-art: (1) density-based anomaly detection in real-time resource utilization data streams, and (2) a dynamic adaptive sampling technique, which employs the Probabilistic Exponentially Weighted Moving Average. The performance of the proposed module was validated using both synthetic and real-world datasets, including a sample collected from the target infrastructure. The main goal of the experiments was to showcase the effectiveness of the implemented techniques in different, close-to-real-life scenarios. Moreover, the results of the performed experiments were compared with other state-of-the-art algorithms in order to examine their advantages and inherent limitations. With the emphasis put on frugality and real-time operation, this contribution offers a novel perspective on resource-aware autonomic optimization for next-generation ECC.

1. Introduction

With the ongoing expansion of the Internet of Things (IoT) and the increasing demand for data-intensive applications that involve capturing, analyzing, and exchanging large amounts of data used for decision-making, the concept of the Edge–Cloud Continuum (ECC) has gained significant popularity [1]. Its use cases span several domains in which heterogeneous resources are integrated to provide complex solutions. These include, among others, smart homes and cities [2,3], autonomous vehicles [4], smart manufacturing [5], energy management [6,7], and agriculture [8]. As outlined in [9], ECC can be understood as a complex distributed infrastructure that encompasses a wide variety of technologies and interconnected devices, seamlessly integrating the edge (where data sources and actuators are located) and the cloud (infrastructure with computational resources). Through such integration, ECC aims to harness the combined benefits of both paradigms, while addressing their inherent limitations [10]. In particular, edge devices located in close proximity to data sensors and actuators are often utilized for time-critical or privacy-sensitive applications, which require minimal latency [11]. On the other hand, cloud resources provide the ability to perform computationally intensive tasks that exceed the capabilities of resource-restricted devices. With dynamic resource provisioning and scalability, cloud solutions can also better accommodate long-running workflows, or those executed over complex interconnected architectures [12,13].
However, despite the aforementioned benefits, ECC also faces a number of challenges [1,14]. (1) The heterogeneity of computational resources, system topologies, and network connectivity schemes hinders the possibility of applying unified orchestration methods, making system management particularly difficult [15,16]. This encompasses not only aspects of resource provisioning, task offloading, or scheduling, but also (2) mechanisms for data management [17]. Achieving data consensus in an ECC with distributed data sources requires advanced mechanisms for synchronization, transmission policies, and control over access rights. Here, an additional level of complexity arises from (3) the need to perform all these operations on real-time streaming data. Moreover, (4) processing data on often resource-limited edge devices requires the use of algorithms with low computational overhead, occupying little storage [18]. Developing such frugal algorithms typically involves finding a reasonable compromise between efficiency and effectiveness. Finally, (5) the dynamic and distributed nature of the environment impedes the detection of potential errors that can undermine the system’s stability, resulting in temporary service interruptions, performance deterioration, and breaches of Service Level Agreements (SLAs) [19,20].
Therefore, attaining the reliability of ECC-based systems requires the incorporation of smart mechanisms providing fault detection and enabling system adaptability to ever-changing conditions. In order to achieve these objectives in an automated and feasible manner, self-* mechanisms [21] have become the focus of recent research studies. These mechanisms aim to enhance the system’s autonomy, allowing it to initiate decisions and adaptations with little to no human intervention. In the survey [22], the authors presented a taxonomy that distinguishes nine different types of self-*, of which the primary four are: (1) self-healing, (2) self-protection, (3) self-configuration, and (4) self-optimization. Among these, self-optimization constitutes the focus of this contribution.
Typically, self-optimization is understood as a property of the system that not only recognizes the need for adaptations when its high-level objectives are violated, but also attempts to improve the system’s performance in its stable state. This makes it particularly important for complex ECC infrastructures that require automation of complex processes. Interestingly, as identified in [22], among the self-* modules, self-optimization was among those mentioned least frequently in the existing literature. This fact can be attributed to several factors, including (1) the difficulty of specifying the necessary key performance indicators, (2) the vague definition of the self-optimization scope, or (3) the need to incorporate domain-specific expert knowledge. Moreover, the existing works target mostly cloud applications, relying on non-frugal algorithms that either require substantial computational power (e.g., those based on deep neural networks) or storage (e.g., making decisions based on historical data). As a result, they cannot be considered suitable for use on edge nodes that are highly resource-constrained. Interestingly enough, even though this is not reflected in the literature, self-optimization close to the edge appears to be a very attractive solution: deployed in the proximity of data sources, it facilitates fast reactions and adaptations.
In this context, the research in edge self-optimization still exhibits several unresolved gaps. In terms of applicability domains, the majority of the studies discuss self-optimization in the context of adaptable resource management, such as resource allocation, auto-scaling, replica control, and task migration. Apart from the use cases of adaptable workload prediction [23], self-optimization does not appear directly in the context of data dissemination, which constitutes a significant concern in ECC, since frequent transmissions of large data volumes involve considerable energy consumption. Additionally, the current studies mainly focus on how to execute adaptation rather than when the adaptation should be initiated (e.g., when a potential abnormality appears in the system). Consequently, the timely identification of opportunities for adaptation remains insufficiently explored. Moreover, even though some of these works address edge self-optimization, they still require resource-intensive operations to be offloaded to the cloud [24], making them reliant on external high-performance hardware.
The aforementioned gaps in the existing self-optimization literature have been partially addressed in separate studies devoted to anomaly detection [25] and adaptive sampling [26,27]. Consequently, this work will examine how to employ both of these techniques within the self-optimization module.
In particular, this contribution presents a frugal self-optimization module, in which statistical-based techniques are used to identify potential adaptation opportunities. The article describes two principal roles of the module:
  • Anomaly detection—detection of potential resource utilization abnormalities, achieved through density-based analysis of real-time data streams collected from computational nodes;
  • Adaptive sampling—estimation of the optimal sampling period for the monitoring of resource utilization and power consumption, obtained by analyzing the changes in data distribution with the Probabilistic Exponentially Weighted Moving Average (PEWMA).
In the context of ECC, such a module is particularly important, since it enables automated decision-making directly at the resource-constrained edge devices, achieving optimizations in (soft) real-time. By adjusting sampling rates autonomously, the module ensures that data are monitored and (in many cases) transmitted only when necessary. This, in turn, can reduce energy consumption and lessen network congestion. Moreover, applying the anomaly detection mechanisms can support not only fault prevention and the security of ECC but may also serve as a trigger for autonomous adaptations that could help preserve the computational resources.
For example, consider a remote health monitoring system, where patients wear electrocardiogram (ECG) smart watches that measure their vital signs (e.g., heart rate). These ECGs serve as sensors that send the collected data to patients’ smartphones (edge devices), which are responsible for making immediate, critical decisions locally and forwarding relevant information to the healthcare provider’s cloud for more extensive, long-term analysis. In this scenario, the proposed self-optimization module could be utilized to automate the collection of patients’ vital signs and recognize potential critical health events. Specifically, by lowering the data collection frequency during long periods of physiological stability (e.g., sleeping or resting), adaptive sampling could prolong the battery life of wearable ECGs. Simultaneously, the anomaly detection mechanism could streamline detection of irregularities (e.g., in a heart rhythm) that could trigger alerts requiring immediate action (e.g., increasing the ECG sampling rate and sending a signal to healthcare professionals). From the perspective of system resilience, anomaly detection could also be used to monitor sensor resources (e.g., temperature) to detect their malfunction (e.g., overheating) and initiate autonomous adaptations (e.g., by switching off non-critical processing).
However, in order to make the self-optimization module deployable on the edge, it was necessary to propose an approach that would address the limitations of resource-constrained edge devices. Specifically, the selection of algorithms was guided by factors such as (1) computational efficiency and (2) frugality, understood in the context of both utilized resources and processed data. Therefore, this contribution also provides a structured perspective on the approach to real-time optimization in the context of edge computing. Moreover, the novelty of the proposed solution lies in its modular architecture, which (1) enables the dynamic extension of models to accommodate new data metrics (e.g., network throughput, sensor battery level) and (2) facilitates Human-In-The-Loop (HiTL) interaction for model parameter adjustment (e.g., acceptable estimation imprecision).
The conceptualized self-optimization module was implemented as part of the Horizon 2020 European research project aerOS (https://aeros-project.eu/, access date: 24 October 2025). aerOS is a platform-agnostic, intelligent meta-operating system for the edge-IoT-Cloud continuum. It is composed of a wide variety of heterogeneous computational nodes called Infrastructure Elements (IE)s, which operate on different types of devices ranging from embedded microcontrollers to cloud servers that can be potentially distributed between different client domains. In this contribution, aerOS was used as a primary testbed for validating the self-optimization module. Nevertheless, it should be noted that the module is also suitable for other ECC applications.
The performance of the developed module has been evaluated using synthetic and real-world complex datasets. In particular, the anomaly detection model was assessed using (1) the Numenta Anomaly Benchmark (NAB) [28] on the synthetic time-series data and (2) a snapshot of CPU utilization monitored in the aerOS continuum. On the other hand, the effectiveness of sampling period estimation was evaluated on both of the aforementioned datasets and, additionally, on RainMon CPU utilization traces (https://github.com/mrcaps/rainmon/blob/master/data/README.md, access date: 24 October 2025). Finally, the achieved results were compared with selected frugal state-of-the-art approaches, including Adaptive Window-Based Sampling (AWBS) [29] and User-Driven Adaptive Sampling Algorithm (UDASA) [30].
The remainder of the article is structured as follows. Section 2 contextualizes the implemented approach by reviewing existing studies on the topics of self-optimization, adaptive sampling, and real-time anomaly detection. Next, Section 3 describes the developed self-optimization module, discussing its main assumptions and requirements, as well as explaining the details of the proposed design. It is followed by Section 4, which presents the selected test cases and discusses the achieved results. Finally, Section 6 summarizes the findings of the study and indicates future research directions.

3. Implemented Approach

The self-optimization module belongs to the collection of nine self-* modules available for deployment in aerOS. Its primary role is to analyze the current state of the computational resources and to suggest potential adaptations that could lead to broadly understood efficiency improvements. In order to achieve this objective, self-optimization employs two mechanisms: (1) adaptive sampling, which is used to autonomously control the frequency of data monitoring, and (2) anomaly detection, which aims to identify and alert about potential abnormalities in resource utilization. Before proceeding to the in-depth description of each of these mechanisms, let us first discuss the requirements that guided their design. It should be emphasized that although these requirements were formulated in accordance with the particular infrastructure (i.e., aerOS) in which the module was tested, they are generalizable to other scenarios as well, making the proposed design versatile across different applications.

3.1. Requirements

The requirements were derived based on the specificity of the environment in which the self-optimization module is intended to operate. Therefore, prior to listing them, it is necessary to outline the main characteristics of aerOS.
Let us recall that aerOS is an intelligent meta-operating system for the edge-IoT-Cloud continuum. The operations in aerOS are performed on the computational nodes called IEs. As these may differ in terms of available resources or their internal architecture, no assumptions should (and can) be made regarding their computational capabilities. Moreover, it should be noted that tasks executed on each IE are treated as “black boxes”, meaning that the only attributes known to the system are their execution requirements.
In order to improve the management and performance of the entire infrastructure, aerOS provides a set of mandatory and auxiliary services deployable on individual IEs. Among these services are nine self-* modules [22], which aim to enrich IEs with autonomous capabilities. In the context of self-optimization, the most important self-* modules are (1) self-orchestration, which invokes rule-based mechanisms that autonomously manage IE resources, and (2) self-awareness, which continuously monitors the IE state, providing insights into its current resource utilization and power consumption. From the self-optimization perspective, these modules act as points of interaction with the external environment. However, they can be substituted with any components that provide monitoring data and accept the output of the module under discussion.
Taking into account all the aforementioned features of aerOS, the self-optimization module was designed according to the following requirements:
  • Tailored to operate on data streams: since the self-optimization module receives data from the self-awareness module, its internal mechanisms must be tailored to work on real-time data streams. Therefore, at this stage, methods that utilize large data batches or process entire datasets were excluded from consideration.
  • Employ only frugal techniques: all the implemented algorithms should be computationally efficient and require a minimal amount of storage. This requirement was established to enable deploying self-optimization on a wide variety of IEs, including those operating on small edge devices. Consequently, no additional storage for historical data was considered, which eliminated the possibility of implementing some of the more advanced analytical algorithms. However, it should be stressed that this was a design decision rooted in in-depth analysis of pilots, guiding scenarios, and use cases of aerOS and other real-life-anchored projects dealing with ECC.
  • Facilitate modular design: all internal parts of the self-optimization should be seamlessly extendable to enable accommodating new requirements, or analyzing new types of monitoring data. Therefore, self-optimization should employ external interfaces that would facilitate the interaction with human operators, or other components of the aerOS continuum. Fulfilling this requirement will allow for generic adaptability of the module since data/metrics to be analyzed may be deployment-specific.
These requirements dictated the overall architectural design of the self-optimization module, as well as influenced the selection of its underlying algorithms, which are described throughout the following sub-sections.

3.2. Self-Optimization Architecture

The high-level architecture of the self-optimization is depicted in Figure 1. It illustrates the core parts of the module, which are: (1) Collector/Parser, (2) Shift/Anomaly Detection Model, (3) Sampling Model and (4) Recommender. Let us now describe the responsibilities of each of these components and their role in processing the received monitoring data.
Figure 1. Architecture of self-optimization module.
The Collector/Parser is the component that receives the hardware state of IE sent periodically by the self-awareness module. It can be thought of as a gateway that collects raw input and maps it into a format that can later be used for further analysis. In particular, the data sent by the self-awareness module consists of a set of different metrics, among which not all are relevant from the perspective of the implemented analytical models (e.g., IE location, CPU architecture, or trust score).
Therefore, it is the responsibility of Collector/Parser to extract the required metrics. Currently, these include the CPU cores, RAM, and disk utilization, as well as the power consumption. Here, it should be noted that these metrics are stored in the form of key-value pairs; hence, extending their scope in the future would be easily possible.
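As an illustration, the following minimal Python sketch shows the kind of key-value extraction performed by the Collector/Parser; the field names of the self-awareness payload are assumptions introduced here for illustration only.

```python
# Minimal sketch of the Collector/Parser extraction step.
# The payload field names below are hypothetical; the actual self-awareness
# schema is not reproduced in this article.
from typing import Any, Dict

RELEVANT_METRICS = ("cpu_cores_used", "ram_used_mb", "disk_used_mb", "power_consumption_w")

def parse_ie_state(raw_state: Dict[str, Any]) -> Dict[str, float]:
    """Keep only the key-value pairs needed by the analytical models."""
    # Fields such as IE location, CPU architecture, or trust score are dropped here.
    return {key: float(raw_state[key]) for key in RELEVANT_METRICS if key in raw_state}

# Example usage with a hypothetical self-awareness message:
raw = {"ie_id": "ie-42", "location": "edge-site-1", "cpu_architecture": "arm64",
       "trust_score": 0.93, "cpu_cores_used": 3.2, "ram_used_mb": 2048,
       "disk_used_mb": 10500, "power_consumption_w": 11.5}
print(parse_ie_state(raw))  # {'cpu_cores_used': 3.2, 'ram_used_mb': 2048.0, ...}
```

Because the retained metrics remain key-value pairs, extending their scope would only require adding entries to the list of relevant keys.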
After formatting the IE data, Collector/Parser forwards it to two analytical components— Shift/Anomaly Detection Model and Sampling Model, which are responsible for the identification of potential adaptation possibilities.
In particular, the role of the Shift/Anomaly Detection Model is to determine whether the value of any of the considered utilization metrics should be classified as anomalous. Separately, the Sampling Model estimates the most appropriate frequency for monitoring the IE state in self-awareness. The process of detecting the anomalies is further described in Section 3.3, while the applied adaptive sampling approach is discussed in Section 3.4. The results of both analyses are passed to the last self-optimization component, which is the Recommender.
The Recommender is responsible for forwarding the suggested adaptations to the associated self-* modules. Specifically, its functionality is organized into two distinct flows. (1) Upon receiving the proposed optimal sampling period from the Sampling Model, the Recommender forwards that information to self-awareness so that it can adjust the monitoring frequency. Moreover, (2) after obtaining the details about the detected anomalies from the Shift/Anomaly Detection Model, it maps each anomaly type into a corresponding unique code and sends them to the self-orchestrator. In this context, a unique code acts as a particular type of alert (e.g., a sudden increase in CPU usage in a particular infrastructure device can be mapped into a single code 001). The self-orchestrator maps each type of alert into a corresponding mitigation/recovery action, represented through a serverless function, and triggers its execution on the common serverless platform OpenFaaS [48]. In aerOS, such functions are predefined by the developers; hence, the type of applied mitigation action is specific to the given application domain (e.g., it may involve re-allocation of deployed services). Here, let us also note that the communication between self-optimization and the self-orchestrator is highly generic, operating only by means of a single code. Hence, the proposed self-optimization module can be plugged in as a standalone component into any other type of system that defines its own mechanisms for performing adaptation actions.
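To make these two flows concrete, the sketch below shows one possible way of mapping anomaly types to unique alert codes before forwarding them; the codes, metric names, and the callables standing in for the self-awareness and self-orchestrator interfaces are hypothetical and not taken from the aerOS implementation.

```python
# Illustrative sketch of the Recommender's two flows; codes and transports are assumptions.
ANOMALY_CODES = {
    ("CPU", "INCREASE"): "001", ("CPU", "DECREASE"): "002",
    ("RAM", "INCREASE"): "003", ("RAM", "DECREASE"): "004",
    ("DISK", "INCREASE"): "005", ("DISK", "DECREASE"): "006",
}

def recommend(sampling_period_ms, anomalies, notify_awareness, notify_orchestrator):
    """Forward adaptation suggestions to the associated self-* modules."""
    # Flow (1): pass the proposed sampling period to self-awareness.
    notify_awareness({"sampling_period_ms": sampling_period_ms})
    # Flow (2): map each detected anomaly to its unique code for the self-orchestrator,
    # which resolves the code to a predefined serverless mitigation function.
    for metric, kind in anomalies:
        notify_orchestrator({"alert_code": ANOMALY_CODES[(metric, kind)]})

# Example usage with stub transports:
recommend(3000, [("CPU", "INCREASE")], print, print)
```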
Let us observe that the architecture of the self-optimization module facilitates the separation of concerns. In fact, it resembles a simplified version of the MAPE-K model. In particular, the data collection and pre-processing in the Collector/Parser can be viewed as the MAPE-K Monitoring component. The MAPE-K Analysis component is reflected by the Sampling Model and the Shift/Anomaly Detection Model, since these identify potential adaptation possibilities. Finally, the Recommender corresponds to the MAPE-K Planning and Execution. Using such a design supports the high-level modularity of the system, enabling its further extensions.
Since the core components of the analysis are Shift/Anomaly Detection Model and Sampling Model, let us describe them in further detail.

3.3. Shift/Anomaly Detection Model

The primary responsibility of the Shift/Anomaly Detection Model is to identify anomalies in the current IE state. Since the representation of the IE state is composed of multiple metrics, there are two main approaches that can be applied in terms of anomaly detection.
The first approach considers the IE state as a whole, meaning that the detected anomalies represent a potential degradation of overall IE performance but are not associated with any particular (individual) metric of the resource utilization. Applying such an approach requires either scalarizing the IE state to a single value or implementing anomaly detection techniques that operate on multivariate data. While this may simplify anomaly detection, requiring only one model to be implemented, it also sacrifices metric-level interpretability. In particular, since there is no identification of specific anomalous metrics, the root cause of the problem cannot be easily diagnosed. Therefore, determining appropriate mitigation measures may also be very challenging. As an example, an anomaly may be detected due to a memory leak or excessive CPU utilization. These are two separate issues that require completely different handling solutions.
The second approach addresses such cases by detecting anomalies for separate metrics. In particular, the metrics are divided into scalar values based on their type and fed into separate anomaly detection models. Models of this type require individual parameter tuning, which may be viewed as both a disadvantage and an advantage. On the one hand, such an approach requires the adjustment of a vast number of parameters, which may not always be straightforward. However, on the other hand, it may also improve the performance of the models, since it allows their behaviors to be adjusted based on the evolution and distribution of individual metric streams. Here, obviously, it is possible that subgroups of parameters can be used for anomaly detection, but further exploring such possibilities is out of the scope of this contribution. Suffice it to say that, for obvious reasons, the proposed approach can support anomaly detection based on any subgroup of parameters. All that is needed is to: (a) adjust the Collector/Parser to extract the needed parameters, and (b) insert the needed Shift/Anomaly Detection Model.
The Shift/Anomaly Detection Model, considered in what follows, employs the second approach, which is illustrated on the high-level architecture of the component in Figure 2.
Figure 2. Architecture of Shift/Anomaly Detection Model.
The overall flow of the Shift/Anomaly Detection Model is as follows. The IE data are initially sent to the Data Entrypoint, which is responsible for extracting values corresponding to each metric type. Let $x_m$ denote the observed value of metric $m$. Currently, the Shift/Anomaly Detection Model takes into account only three types of metrics: (1) CPU utilization $x_{CPU}$, (2) RAM utilization $x_{RAM}$, and (3) disk utilization $x_{disk}$. However, as noted, this scope may be extended in the future, without affecting the remaining parts of the system. For each metric $m$, the Data Entrypoint fetches the corresponding model parameters from the Model Config component. Let $\theta_m$ represent the parameter set associated with the model that handles metric $m$. Each pair $(x_m, \theta_m)$ is subsequently fed into the corresponding model, which applies a density-based anomaly detection algorithm [44]. After finishing the detection process, the anomalies returned by individual models are aggregated and passed to the Recommender (see Section 3.2). Moreover, the statistics characterizing the temporal evolution of each metric stream (i.e., the sample mean $\mu_{i,m}$ and the scalar product $\Sigma_{i,m}$ of metric $m$ at time step $i$) are updated and stored in a lightweight Local Cache, to support incremental computations.
Taking this into account, in what follows, a detailed description of the algorithm responsible for anomaly detection is presented.

3.3.1. Density-Based Anomaly Detection

As indicated in Section 2, Shift/Anomaly Detection Model implements a density-based point anomaly detection algorithm described in [44]. Specifically, the algorithm represents the evolution of the metric stream through the computation of consecutive sample densities. The notation used in the subsequent algorithm’s description is summarized in Table 3.
Table 3. Notation used in the density-based anomaly detection algorithm.
The computations begin upon receiving the next observation from the Collector/Parser component. In the first step, the sample mean and the scalar product are calculated, using the recursive formulas presented in Equations (1) and (2). It should be noted that both formulas do not rely on any additional data, apart from the previously computed statistics (i.e., $\mu_{i-1}$, $\Sigma_{i-1}$). Hence, the method does not require any additional storage aside from a lightweight local cache, which is important in the case of resource-constrained devices.
$$\mu_i = \begin{cases} \mu_{i-1} + \dfrac{x_i - \mu_{i-1}}{N}, & \text{if } N > 0 \\ x_i, & \text{otherwise} \end{cases} \tag{1}$$
$$\Sigma_i = \begin{cases} \Sigma_{i-1} + \dfrac{x_i^2 - \Sigma_{i-1}}{N}, & \text{if } N > 0 \\ x_i^2, & \text{otherwise} \end{cases} \tag{2}$$
After updating the sample mean and the scalar product, the obtained values are used in the computation of the sample density. The density formula is inspired by the concept of data clouds (i.e., fuzzy representations of data clusters) introduced in [49]. Here, the Cauchy kernel was used over the data cloud density in order to control the contribution of individual sample points in the estimation, mitigating the impact of outliers. In comparison to other kernels (e.g., Gaussian), Cauchy exhibits heavier tails and slower decay, by which it strongly suppresses the impact of distant samples, while maintaining the contribution of nearby points. As such, it is less sensitive to extreme deviations, anticipating their occurrence and consequently mitigating their impact on the final estimation. Thus, Cauchy kernels provide greater robustness in scenarios with highly noisy or irregular data. To accommodate the requirements of incremental, stream-based computations, the recursive formula presented in Equation (3) is applied.
$$D_i = \begin{cases} \dfrac{1}{1 + (x_i - \mu_i)^2 + \Sigma_i - \mu_i^2}, & \text{if } N > 0 \\ x_i, & \text{otherwise} \end{cases} \tag{3}$$
The updated density is then used to calculate the mean density, as represented in Equation (4), where the density difference $\Delta D_i$ is computed as formulated in Equation (5).
$$\bar{D}_i = \begin{cases} \left(D_{i-1} + \dfrac{\Delta D_i}{K}\right) \cdot \left(1 - |\Delta D_i|\right) + D_i \cdot |\Delta D_i|, & \text{if } N > 0 \\ 1, & \text{otherwise} \end{cases} \tag{4}$$
$$\Delta D_i = D_i - D_{i-1} \tag{5}$$
In the presented equation, the absolute density difference $|\Delta D_i|$ serves as a weighting factor, enabling adjustment of the impact of previous and current densities on the estimation. In particular, when the changes in density are minor (i.e., similar observations), the updated mean remains closer to the previous density (i.e., $D_{i-1} + \frac{\Delta D_i}{K}$). On the other hand, significant variations favor the contribution of the current density (i.e., $D_i$). This allows the model to adapt more effectively to large changes, while smoothing out negligible fluctuations.
The mean density is used to control the transition between the normal and the anomalous states. In particular, if the current observation is considered to be normal, then it may switch to the anomalous state according to Equation (6).
$$state = \begin{cases} anomalous, & \text{if } D_i < \bar{D}_i \cdot Th_{anomaly} \,\wedge\, K \geq W_{anomaly} \\ normal, & \text{otherwise} \end{cases} \tag{6}$$
There are two conditions that must be satisfied for the anomalous state to occur: (1) the current density must be less than the weighted mean density, and (2) the number of consecutive observations $K$ that remained in the same state (in this case, potentially anomalous) must exceed a specified threshold $W_{anomaly}$. Please note that if the first condition (i.e., $D_i < \bar{D}_i \cdot Th_{anomaly}$) is fulfilled, then the counter $K$ is incremented, but the state does not change to anomalous until $K$ reaches a sufficient size. Therefore, by applying the weighting and the threshold parameters, the sensitivity of the model can be seamlessly adjusted for specific use cases.
A similar transition also occurs from the anomalous to the normal state. However, since it is not the main subject of this study, more details about this mechanism are omitted, while they can be found in [44].
Upon recognizing the anomalous state, the category of anomaly must be specified. However, it should be noted that, since the algorithm does not rely on historical or labeled data, it cannot learn different types of complex anomalies. Therefore, in Shift/Anomaly Detection Model, two anomaly classes were, by default, specified for each type of metric: (1) INCREASE (i.e., for density being above the mean density) and (2) DECREASE (i.e., for density being below the mean density). These categories aim to characterize the type of abrupt shifts in resource utilization. However, it should be noted that due to the modular architecture of the module, the anomaly classes can be easily extended or modified, depending on the particular application context.
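A minimal Python sketch of a per-metric detector implementing Equations (1)–(6) is shown below. The parameter defaults, the simplified anomalous-to-normal transition, the guard on the $K$ counter in Equation (4), and the rule used to label shifts as INCREASE or DECREASE (comparing the observation with its running mean) are assumptions of this sketch rather than details confirmed by the article or by [44].

```python
from dataclasses import dataclass

@dataclass
class DetectorConfig:
    th_anomaly: float = 0.8   # Th_anomaly in [0, 1]
    w_anomaly: int = 3        # W_anomaly >= 1

class DensityAnomalyDetector:
    """Per-metric density-based point anomaly detection, following Eqs. (1)-(6).

    Only running statistics are kept, so no historical samples are stored."""

    def __init__(self, cfg: DetectorConfig):
        self.cfg = cfg
        self.n = 0               # number of previously processed observations N
        self.mu = 0.0            # sample mean, Eq. (1)
        self.sigma = 0.0         # scalar product, Eq. (2)
        self.density = 0.0       # sample density D_i, Eq. (3)
        self.mean_density = 1.0  # mean density, Eq. (4)
        self.k = 0               # consecutive observations in the candidate state
        self.state = "normal"

    def update(self, x: float):
        if self.n == 0:
            self.mu, self.sigma, self.density = x, x * x, x
            self.mean_density = 1.0
        else:
            self.mu += (x - self.mu) / self.n                       # Eq. (1)
            self.sigma += (x * x - self.sigma) / self.n             # Eq. (2)
            prev_density = self.density
            self.density = 1.0 / (1.0 + (x - self.mu) ** 2
                                  + self.sigma - self.mu ** 2)      # Eq. (3)
            delta = self.density - prev_density                     # Eq. (5)
            self.mean_density = ((prev_density + delta / max(self.k, 1))
                                 * (1.0 - abs(delta))
                                 + self.density * abs(delta))       # Eq. (4), guarded K
        self.n += 1

        anomaly = None
        if self.density < self.mean_density * self.cfg.th_anomaly:  # condition (1) of Eq. (6)
            self.k += 1
            if self.k >= self.cfg.w_anomaly:                        # condition (2) of Eq. (6)
                self.state = "anomalous"
                # Labelling by comparing x with the running mean is an assumption.
                anomaly = "INCREASE" if x > self.mu else "DECREASE"
        else:
            self.k = 0
            self.state = "normal"  # simplified reverse transition (see [44])
        return anomaly

# Example: one detector per metric, fed by the stream of parsed IE states.
cpu_detector = DensityAnomalyDetector(DetectorConfig())
for value in [2.1, 2.3, 2.2, 2.4, 14.0, 14.2, 14.1, 14.3]:
    print(value, cpu_detector.update(value))
```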
The quality of anomaly detection is influenced significantly by the parameters of the individual models. In this context, the next subsection describes which parameters were considered and how the human operator can be involved in their dynamic modification.

3.3.2. Model Configuration

The Shift/Anomaly Detection Model is composed of several sub-models, each of which requires adjustment of its individual parameters. The density-based anomaly detection parameters encompass: (1) $Th_{anomaly}$, with a value in the interval [0, 1], (2) $Th_{normal}$, with a value in the interval [0, 1], (3) $W_{anomaly}$, with an integer value greater than or equal to 1, and (4) $W_{normal}$, with an integer value greater than or equal to 1.
All of them are defined accordingly in Table 3. In general, adjusting these parameters is a non-trivial task, which depends on the specific application context. For instance, in time-sensitive applications, such as critical healthcare systems, the detection of anomalies may require a high level of granularity to promptly alert about singular deviations. In such cases, the parameters $W_{anomaly}$ and $W_{normal}$ should be minimized to accommodate transient fluctuations. Conversely, in systems characterized by frequent data fluctuations (e.g., high-traffic web services), $W_{anomaly}$ and $W_{normal}$ should be adjusted to larger values so that the model reacts only to sustained abnormalities.
In this context, it should be noted that the module is not domain-specific and can be utilized in a variety of applications. However, this contributes to the complexity of the parameter fine-tuning, as no universal method exists (or can be applied) to determine their optimal combination. Furthermore, since the self-optimization was designed with modularity in mind, it was considered that the Shift/Anomaly Detection Model may potentially be extended (in the future) to support new sub-models. Consequently, it was determined that the configuration of parameters must be: (1) adjustable—to the specific application context, and (2) extendable—to support future modifications.
To accommodate these requirements, each of the sub-models was identified by a unique name (e.g., CPU_MODEL), allowing it to be associated with an individual set of parameters. These parameters were stored in an adaptable component configuration Model Config. Moreover, in order to facilitate the integration of the Shift/Anomaly Detection Model into different systems, dedicated interfaces were introduced to support dynamic parameter adjustment by the HiTL.
In particular, the endpoint /anomaly/parameters/{type} was implemented to provide an on-the-fly parameter modification capability, e.g., for the external human operator. In addition, in order to ensure continuity of system operation, the newly introduced parameters are applied only once the next sample is received, whereas the ongoing analysis proceeds using the previous parameters.
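For example, a human operator could adjust the CPU sub-model on the fly as sketched below; the HTTP method, host, port, and payload field names are assumptions of this sketch, whereas the endpoint path, the sub-model identifier, and the deferred activation of new parameters follow the description above.

```python
# Hypothetical on-the-fly parameter update for the CPU sub-model.
import requests

payload = {"Th_anomaly": 0.7, "Th_normal": 0.9, "W_anomaly": 2, "W_normal": 5}
response = requests.put(
    "http://localhost:8080/anomaly/parameters/CPU_MODEL",  # host/port are placeholders
    json=payload,
    timeout=5,
)
response.raise_for_status()
# The new values are applied only once the next sample is received;
# the ongoing analysis proceeds with the previous parameters.
```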
Apart from the Shift/Anomaly Detection Model, the second model employed in the self-optimization is the Sampling Model. Its architecture, along with the utilized method, is described in the subsequent section.

3.4. Sampling Model

The Sampling Model follows a similar architectural approach as the Shift/Anomaly Detection Model. In particular, as illustrated in Figure 3, it is also composed of the (1) Data Entrypoint, (2) Model Config and (3) Local Cache. However, instead of utilizing separate analytical models for each metric type, it contains a single Adaptive Sampling Model. This model accepts the entire IE data and handles its processing internally using two sub-models: Resource Sampling Model and Power Consumption Sampling Model.
Figure 3. Architecture of Sampling Model.
The process of handling the IE data in the Sampling Model is as follows. As with the Shift/Anomaly Detection Model, the Sampling Model receives the information about the current IE state in the Data Entrypoint. The Data Entrypoint can operate in two modes: (1) processing resource utilization data, and (2) processing power consumption data. Each of these modes is identified by a unique name, being, respectively, RESOURCE or POWER. The mode that should be adopted by the Data Entrypoint is specified along with the data passed by the Collector/Parser component (Section 3.2). Let $[x_{CPU}, x_{RAM}, x_{disk}]^T$ be a vector representing the resource utilization metrics, and let $x_P$ be a scalar representing the current power consumption. Instead of splitting the data into different metric types, the main responsibility of the Data Entrypoint is to select the parameters of the corresponding model. These parameters are obtained, based on the mode, from the Model Config. After obtaining the parameters, the Data Entrypoint creates corresponding pairs: (1) $([x_{CPU}, x_{RAM}, x_{disk}]^T, \theta_{RESOURCE})$, where $\theta_{RESOURCE}$ denotes the set of parameters of the Resource Sampling Model for the RESOURCE mode, and (2) $(x_P, \theta_{POWER})$, where $\theta_{POWER}$ denotes the set of parameters of the Power Consumption Sampling Model for the POWER mode. These pairs are then passed to the Adaptive Sampling Model along with the information about the mode type, so that the component can recognize which internal model should be called. Each of these internal models applies the same AdaM algorithm [34], which, as an output, recommends the optimal sampling frequency. Similarly to the algorithm utilized in the Shift/Anomaly Detection Model, AdaM is memory efficient; however, it requires storing updated statistics of the metric stream evolution in the lightweight Local Cache, which happens as the last step. The next subsection describes the details of the AdaM adaptive sampling algorithm.

3.4.1. AdaM Adaptive Sampling

The original AdaM framework [34] introduces two algorithms: (1) adaptive sampling, and (2) adaptive filtering. In the context of this work, the emphasis is placed only on the adaptive sampling. The core of the algorithm is the estimation of metric stream evolution using PEWMA. Table 4 outlines all the notation used in this section. It should be noted that this section provides an overview of the applied formulas. In order to gain a better understanding of the underlying rationale of the selected statistics, it is advised to see the original article [34].
Table 4. Notation used in the AdaM adaptive sampling algorithm.
In order to calculate the PEWMA, it is necessary, first, to compute the distance between the two consecutive observations. In the implemented algorithm, the Manhattan distance formula is used, as presented in Equation (7).
$$\delta_i = \begin{cases} |x_i - x_{i-1}|, & \text{if } N > 0 \\ 0, & \text{otherwise} \end{cases} \tag{7}$$
The distance $\delta_i$ is utilized in the computation of the probability $P_i$, which is a core component of the PEWMA estimation. The probability is computed under the Gaussian distribution, in order to accommodate sudden spikes in the data, while simultaneously controlling their long-term influence. In particular, it prevents singular fluctuations from dominating the overall estimation. The formula for $P_i$ is outlined in Equation (8). It should be noted that $P_i$ is computed only when $N > 0$, since, in the other case, the default value of the PEWMA estimation is used.
$$P_i = \frac{1}{\sqrt{2\pi}} \cdot \exp\left(-\frac{z_i^2}{2}\right) \tag{8}$$
Here, $z_i$ represents the z-score at time $i$, which is obtained using the current distance $\delta_i$, the estimated distance $\mu_{i-1}$ (i.e., the previous PEWMA estimation), and the estimated moving standard deviation $\bar{\sigma}_i$, as indicated in Equation (9). It represents the deviation of $\delta_i$ relative to the expected data distribution.
$$z_i = \frac{\delta_i - \mu_{i-1}}{\bar{\sigma}_i} \tag{9}$$
In the formula, $\bar{\sigma}_i$ corresponds to the moving standard deviation, which can be calculated using the previous PEWMA estimation and the smoothed second moment (i.e., the PEWMA of squared deviations, $\nu_{i-1}$). Specifically, it is derived using Equation (10). It should be noted that since all of the formulas are recursive and depend only on previous values, they are well aligned with the needs of incremental computations.
$$\begin{aligned} \nu_{i-1} &= \alpha \cdot (1 - \beta P_{i-1}) \cdot \nu_{i-2} + \left(1 - \alpha \cdot (1 - \beta P_{i-1})\right) \cdot \delta_{i-1}^2 \\ \bar{\sigma}_i &= \sqrt{\nu_{i-1} - \mu_{i-1}^2} \end{aligned} \tag{10}$$
Finally, the computed probability P i is substituted in the recursive PEWMA formula, presented in Equation (11).
$$\mu_i = \begin{cases} \alpha \cdot (1 - \beta P_i) \cdot \mu_{i-1} + \left(1 - \alpha \cdot (1 - \beta P_i)\right) \cdot \delta_i, & \text{if } N > 0 \\ \delta_i, & \text{otherwise} \end{cases} \tag{11}$$
In Equations (10) and (11), the parameters α and β are the weighting factors that control the influence of the previously observed values and the probability.
Changes in the metric stream are detected based on the difference between the actual observed standard deviation $\sigma_i$ and the standard deviation estimated using PEWMA, $\bar{\sigma}_i$. In particular, Equation (12) presents the formula of a confidence interval, which specifies the degree of certainty of the estimation. It should be noted that $\sigma_i$ may be equal to zero when the consecutive observations are equal. In such a case, the evolution of the metric stream is stable (distance equal to zero). Consequently, the confidence should be maximized in order to increase the sampling period as much as possible, which explains the second condition.
$$c_i = \begin{cases} 1 - \dfrac{|\bar{\sigma}_i - \sigma_i|}{\sigma_i}, & \text{if } \sigma_i \neq 0 \\ 1, & \text{otherwise} \end{cases} \tag{12}$$
When the confidence is above the acceptable imprecision threshold (i.e., $1 - \gamma$), a new sampling period is proposed, according to Equation (13).
$$T_{i+1} = \begin{cases} T_i + \lambda \cdot \left(1 + \dfrac{c_i - \gamma}{c_i}\right), & \text{if } c_i \geq 1 - \gamma \\ T_{min}, & \text{otherwise} \end{cases} \tag{13}$$
The formula uses the $\lambda$ parameter, which allows scaling the estimated sampling period value. It can be particularly useful when applying the AdaM technique in different scenarios that operate on different units of the sampling period (e.g., milliseconds, seconds).
The last step, before returning the estimated sampling period, is the verification of whether it falls within the boundaries $[T_{min}, T_{max}]$. If the value does not comply with the specified limits, it is clamped.
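The following Python sketch puts Equations (7)–(13) together for a single scalar stream. The default parameter values mirror those used later in the experiments, while the way the actual observed standard deviation $\sigma_i$ is maintained (incrementally, over the distance stream) and the default PEWMA initialization are assumptions of this sketch rather than details taken from [34].

```python
import math
from dataclasses import dataclass

@dataclass
class AdaMConfig:
    t_min: float = 1000.0   # T_min, minimal sampling period (ms)
    t_max: float = 6000.0   # T_max, maximal sampling period (ms)
    lam: float = 1000.0     # lambda, scales the period adjustment
    gamma: float = 0.6      # gamma, acceptable imprecision
    alpha: float = 0.5      # alpha, PEWMA weighting factor
    beta: float = 1.0       # beta, probability weighting factor

class AdaMSampler:
    """PEWMA-based adaptive sampling for a scalar metric stream, Eqs. (7)-(13)."""

    def __init__(self, cfg: AdaMConfig):
        self.cfg = cfg
        self.n = 0
        self.prev_x = 0.0
        self.mu = 0.0        # PEWMA estimate of the distances, Eq. (11)
        self.nu = 0.0        # PEWMA of squared distances, Eq. (10)
        self.d_mean = 0.0    # running mean of distances (for the actual sigma_i)
        self.d_sq = 0.0      # running mean of squared distances
        self.period = cfg.t_min

    def update(self, x: float) -> float:
        cfg = self.cfg
        delta = abs(x - self.prev_x) if self.n > 0 else 0.0              # Eq. (7)
        self.prev_x = x

        if self.n == 0:
            self.mu, self.nu = delta, delta * delta                      # default PEWMA values
        else:
            sigma_est = math.sqrt(max(self.nu - self.mu ** 2, 0.0))      # Eq. (10)
            z = (delta - self.mu) / sigma_est if sigma_est > 0 else 0.0  # Eq. (9)
            p = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)        # Eq. (8)
            w = cfg.alpha * (1.0 - cfg.beta * p)
            self.mu = w * self.mu + (1.0 - w) * delta                    # Eq. (11)
            self.nu = w * self.nu + (1.0 - w) * delta * delta

        # Actual observed statistics of the distance stream (incremental; an assumption).
        self.n += 1
        self.d_mean += (delta - self.d_mean) / self.n
        self.d_sq += (delta * delta - self.d_sq) / self.n
        sigma_act = math.sqrt(max(self.d_sq - self.d_mean ** 2, 0.0))
        sigma_est = math.sqrt(max(self.nu - self.mu ** 2, 0.0))

        conf = (1.0 - abs(sigma_est - sigma_act) / sigma_act) if sigma_act != 0 else 1.0  # Eq. (12)
        if conf >= 1.0 - cfg.gamma:                                      # Eq. (13)
            self.period += cfg.lam * (1.0 + (conf - cfg.gamma) / conf)
        else:
            self.period = cfg.t_min
        self.period = min(max(self.period, cfg.t_min), cfg.t_max)        # clamp to [T_min, T_max]
        return self.period
```

Feeding the stream sample-by-sample through `update()` returns the clamped sampling period after each observation; the sketch only illustrates how the recursions combine under the stated assumptions.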
As can be observed, the AdaM algorithm operates on scalar observations. This perfectly accommodates estimating the sampling period in the case of power consumption. However, applying this technique in the case of the RESOURCE mode, which uses a vector of values, requires introducing additional modifications.
In particular, the proposed approach is based on the minimization of the sampling period. First, the sampling period is computed separately for each resource utilization metric while using a common set of model parameters. Afterwards, the minimal estimated sampling period is selected in order to account for frequent fluctuations. Taking into account the sampling period of relatively stable metrics (associated with the greater values of the sampling period) would not be feasible, since the substantial changes in rapidly shifting ones would not be captured in subsequent monitoring cycles. Such an approach has been adopted due to the architecture of the system, in which the self-optimization module was verified. In particular, in self-awareness all resource utilization metrics are monitored within a single iteration cycle. Consequently, it was not possible to adjust their sampling periods individually. Nevertheless, it should be noted that deploying the self-optimization module within different architectures that support adjusting the sampling period for each metric may not require this unification step.
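Building on the AdaMSampler sketched above (the class and metric names are hypothetical), the RESOURCE-mode unification step can be expressed as the minimum over per-metric estimates computed with a shared configuration:

```python
# One sampler per resource utilization metric, sharing a common configuration.
samplers = {m: AdaMSampler(AdaMConfig()) for m in ("cpu", "ram", "disk")}

def resource_sampling_period(utilization: dict) -> float:
    """utilization maps metric name to current value, e.g. {'cpu': 3.2, 'ram': 2048, 'disk': 10500}."""
    # The minimal estimate is selected so that rapidly shifting metrics are not missed.
    return min(samplers[m].update(value) for m, value in utilization.items())
```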
As with the Shift/Anomaly Detection Model, the performance of the Sampling Model heavily relies on the selection of appropriate parameters as described in the following subsection.

3.4.2. Model Configuration

The parameters used in the AdaM computation are: (1) $T_{min}$, (2) $T_{max}$, (3) $\lambda$, (4) $\gamma$, with a value between 0 and 1, (5) $\alpha$, with a value between 0 and 1, and (6) $\beta$, with a value between 0 and 1. All of these parameters are described in Table 4. They are configured separately for each sub-model, i.e., the Resource Sampling Model and the Power Consumption Sampling Model. As the number of parameters is relatively large, relying on a fixed specification, without incorporating domain knowledge, would not be feasible.
Therefore, an approach similar to the Shift/Anomaly Detection Model was employed. In particular, an external interface that allows the modification of all parameters on the fly was implemented. Through the endpoint /sampling/parameters/{type}, a human operator, or another system component, can dynamically adjust the selected model settings. This mechanism works in the same manner as the one introduced for the Shift/Anomaly Detection Model, which was described in Section 3.3.2.
It is easy to notice that the performance of the entire self-optimization module relies primarily on its analytical components. Therefore, they have been evaluated in different experimental settings, as described in the following section.

4. Experimental Validation

The main objective of the conducted experiments was to evaluate the effectiveness of the implemented algorithms against the selected benchmark datasets and compare them to other state-of-the-art approaches. Here, it should be noted that while frugality was a key design consideration, the current evaluation did not explicitly assess the performance of both models on resource-constrained devices. Yet, such an assessment is recognized as a critical next step and is planned to be included in future research.
As of now, all of the experiments were conducted on a computer with an Intel(R) Core(TM) i5-10210U CPU @ 1.60 GHz (2.11 GHz), 4 cores, and 16 GB RAM. For the assessment, four different real-world datasets were selected:
  • RainMon monitoring dataset (https://github.com/mrcaps/rainmon/blob/master/data/README.md, access date: 24 October 2025): an unlabeled collection of real-world CPU utilization data that consists of 800 observations. It was obtained from the publicly available data corpus of the RainMon research project [50]. The CPU utilization traces of this dataset exhibit non-stationary, highly dynamic behavior, including multiple abrupt spikes. Therefore, it provides an excellent basis for validating the effectiveness of adaptive sampling in capturing critical variations in monitoring data.
  • NAB, synthetic anomaly dataset (anomalies jumps-up) (https://github.com/numenta/NAB/blob/master/data/README.md, access date: 24 October 2025): time-series, labeled, dataset composed of 4032 observations, which was obtained from the NAB data corpus [28]. It features artificially generated anomalies that form a periodic pattern. In particular, in this dataset, CPU usage exhibits regular, continuous bursts of high activity, followed by sharp declines in activity. Consequently, it provides a testbed for anomaly detection, allowing the assessment of its contextual anomaly detection capabilities.
  • NAB, synthetic anomaly dataset (load balancer spikes) (https://github.com/numenta/NAB/blob/master/data/README.md, access date: 24 October 2025): similarly to the previous one, a time-series, labeled dataset composed of 4032 observations, which was obtained from the NAB data corpus [28]. It also features artificially generated anomalies, but of different traits than in the anomalies jumps-up dataset. In particular, these represent abrupt individual spikes, providing a basis for the evaluation of both the point-based and the contextual anomaly detection.
  • aerOS cluster IE traces: resource utilization traces of a single IE that were collected using the self-awareness component in the aerOS continuum. They span 87 observations obtained during one hour. Although these traces do not exhibit any substantially abrupt behaviors, they were selected for the analysis since they closely resemble the conditions under which the self-optimization algorithms are to operate.
Since the external datasets (i.e., RainMon and NAB) use data formats that differ from the representation of the IE state in the aerOS continuum, their pre-processing was required. In particular, in order to emulate the results retrieved from self-awareness, individual scalar values had to be mapped to the resource utilization metrics of the IE. To achieve that, it was necessary to first predefine the test configuration of IE. Accordingly, it was assumed that the IE, which was the subject of monitoring in all conducted tests, had 20 CPU cores, 15615 MB RAM capacity, and 68172 MB disk capacity.
For the RainMon dataset, individual percentage CPU utilization values were converted into a number of utilized CPU cores. A similar operation was conducted for the NAB dataset, since it was assumed that its synthetic traces also represent percentage CPU utilization. In both cases, all remaining monitoring metrics (i.e., RAM and disk utilization) were set to 0 values, since they were not relevant to the experiments.
All of the experiments were executed using a simple, dedicated test framework that was developed as part of the self-optimization module. In particular, the framework emulates the monitoring behavior of self-awareness by transforming the continuous, time-based process into a discrete one (a minimal sketch of this idea is given after this paragraph). This is achieved by representing fixed temporal intervals (e.g., five seconds) using steps between successive observations in a dataset (e.g., a gap of 5). The instantiated framework allows specifying the configuration of each individual test scenario, which includes: (1) the name of the scenario, (2) a description of the scenario, (3) a specification of the IE data, (4) a reference to the JSON file containing monitoring data, (5) a list of evaluation metrics, and (6) a list of the names and parameters of additional algorithms that are to be used in results comparison. As such, it allows the execution of repeatable experiments for different scenarios. Let us now describe the experiments conducted for the Sampling Model and the Shift/Anomaly Detection Model.
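Before moving to the individual experiments, the replay idea underlying the framework can be sketched as follows; the step interval represented by one observation and the sampler interface are assumptions of this sketch, not the framework's actual API.

```python
# Discrete emulation of time-based monitoring: a sampling period of, e.g., 5000 ms
# with a 1000 ms step interval translates to skipping 5 observations.
def replay(observations, sampler, step_ms: float = 1000.0):
    i, monitored = 0, []
    while i < len(observations):
        monitored.append((i, observations[i]))
        period = sampler.update(observations[i])   # e.g., the AdaMSampler sketched earlier
        i += max(1, round(period / step_ms))       # advance by the emulated sampling period
    return monitored
```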

4.1. Sampling Model Verification

The Sampling Model was tested on Dataset 1, Dataset 2, and Dataset 4. It was not tested on Dataset 3, since, apart from abrupt metric stream spikes (already present in Dataset 1), this dataset does not exhibit characteristics that would be relevant from the perspective of adaptive sampling.
In all conducted tests, it was assumed that the values of the $\alpha$ and $\beta$ parameters are static, i.e., $\alpha = 0.5$ and $\beta = 1$. This assumption was made based on the results presented in [34], which indicated that $\alpha$ and $\beta$ do not have a significant impact on the performance of the model. Similarly, the minimal ($T_{min}$) and maximal ($T_{max}$) sampling periods were fixed to the values determined after consultation with the creators of the self-awareness module, i.e., $T_{min} = 1000$ ms and $T_{max} = 6000$ ms. This ensured that the chosen values closely reflect real-world conditions.
The only parameters that were not statically defined were the multiplicity $\lambda$ and the imprecision $\gamma$. Therefore, in order to select their best combination, comparative tests were conducted on each dataset. First, the focus was on establishing the best $\gamma$. Thus, $\lambda$ was fixed to 1000, which is a reasonable baseline, as it represents a multiplicity of 1 (no scaling). After selecting the optimal $\gamma$, it was used in the tests aimed at selecting the best $\lambda$.
The assessment was based on two different quality measures. The first (following [34]) was the Mean Absolute Percentage Error (MAPE). It illustrates the error between the predicted stream values and the actual ones. The formula for MAPE is presented in Equation (14), where $x_{actual}$ is an actual value, $x_{predicted}$ is a predicted value (set to the last monitored value if the observation is skipped), and $n$ is the size of the sample.
$$MAPE = \frac{\sum_{i=1}^{n} \frac{|x_{predicted} - x_{actual}|}{x_{actual}}}{n} \cdot 100 \tag{14}$$
A second metric was the Sample Ratio (SR), which quantifies the percentage of samples that were monitored. This measure provides insight into the reduction in the data dissemination volume. It is computed using Equation (15).
$$SR = \frac{n_{monitored}}{n_{actual}} \cdot 100 \tag{15}$$
Those two quality metrics were used to compute the Joint-Performance Metric (JPM), which is formulated in Equation (16). Given that both MAPE and SR are to be minimized in order to achieve the best performance, JPM serves as an aggregate that evaluates the overall trade-off between these goals. Here, a higher JPM indicates better effectiveness.
$$JPM = 100 - \frac{MAPE + SR}{2} \tag{16}$$
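Under the conventions above (skipped observations are predicted by repeating the last monitored value), the three measures can be computed as sketched below; the guard against zero-valued actual observations in the MAPE term is an assumption added here.

```python
def evaluate(actual, monitored_indices):
    """Compute MAPE, SR, and JPM (Eqs. (14)-(16)) for a replayed monitoring run."""
    monitored = set(monitored_indices)
    predicted, last = [], actual[0]
    for i, x in enumerate(actual):
        if i in monitored:
            last = x                      # monitored observation: prediction equals the value
        predicted.append(last)            # skipped observation: repeat the last monitored value
    errors = [abs(p - a) / a for p, a in zip(predicted, actual) if a != 0]  # zero actuals skipped
    mape = 100.0 * sum(errors) / len(actual)                                # Eq. (14)
    sr = 100.0 * len(monitored) / len(actual)                               # Eq. (15)
    jpm = 100.0 - (mape + sr) / 2.0                                         # Eq. (16)
    return mape, sr, jpm
```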
Table 5 presents the results of the tests executed for different γ values for all three datasets. In the case of the aerOS dataset, the MAPE was calculated by averaging the MAPE of all metrics (i.e., CPU, RAM, and disk utilization).
Table 5. Results of the tests of different γ values obtained for all three datasets.
As can be observed, the best results were achieved for γ = 0.6. In the majority of cases, SR exhibited a decreasing trend as γ increased, which is consistent with the expected behavior. In particular, a higher tolerance of estimation errors, expressed through γ, leads to longer sampling intervals, which, as a consequence, reduces the number of monitored samples. Interestingly, for γ = 1.0, the SR unexpectedly increased in all conducted tests, yielding slightly degraded overall performance (JPM). Additionally, the experiments conducted on the aerOS dataset yielded the least data volume reduction, with SR values between 80 and 100%, compared to NAB and RainMon. This difference may potentially be attributed to the fact that the sampling period computation on the aerOS dataset was performed on multivariate observations (i.e., it accounted for RAM and disk utilization), making the final sampling frequency dependent on the distribution of all three utilization metrics.
Moreover, it is observable that the MAPE values are significantly below the acceptable imprecision γ in all cases. However, the results of tests conducted on different datasets exhibit significant differences in the behavior of this metric, as illustrated in Figure 4.
Figure 4. The comparison of achieved MAPE values for different imprecisions and datasets.
For the RainMon dataset, the MAPE values were fairly regular and oscillated between 5 and 10%. On the other hand, for the NAB dataset, the MAPE values were the lowest, with almost all of them being below 1%. Finally, for the aerOS dataset, the average MAPE values were the least regular, making it difficult to observe any trends. These outcomes emphasize the criticality of adjusting the model parameters for the different application scenarios.
Based on the observed results, the value of γ was fixed at 0.6 . Next, it was used in the selection of the multiplicity parameter ( λ ). For λ , four different values were considered: 500, 1000, 2000, and 3000. Table 6 summarizes the outcomes of the tests.
Table 6. Results of the tests of different λ values for all three datasets.
The obtained results demonstrate that λ and MAPE exhibit a positive correlation, while λ and SR have an inverse relationship. Since the changes in MAPE are relatively small (being significantly below the considered imprecision parameter), a clear increase also appears in the case of JPM. Therefore, in the non-critical applications, where a moderate level of estimation error is tolerable, selecting a higher value of λ may be beneficial. However, in cases where precision is critical, the overall JPM value might be of secondary importance to maintaining a lower MAPE.
Taking into account the aforementioned findings, the combination of γ = 0.6 and λ = 3000 was considered optimal. As such, it was used in the subsequent experiments evaluating the performance of the implemented adaptive sampling approach against other existing algorithms. In particular, the (1) UDASA [30] and (2) AWBS [29] adaptive sampling techniques were used in the assessment. However, it should be stressed again that establishing these values has to be performed for each application of the proposed approach.
For UDASA, the following parameters were applied: (1) window size N = 15, (2) saving level (controlling the level of precision) n = 5, and (3) baseline sampling period T = 1000. These parameters were determined by analyzing the results of the experiments presented in [30] and adjusting them to the specificity of the test datasets (e.g., the window size was decreased due to a smaller number of observations).
Similarly, for AWBS, the parameters included: (1) maximal window size m_x = 6, (2) initial window size w_initial = 1, and (3) threshold (used to determine anomalies) th = 40%. Here, let us note that in the original version of AWBS, the threshold was not parametrized but determined with respect to the percentage difference of the first two moving averages. However, this approach does not capture edge cases (e.g., when the percentage difference is 0), which appear in the considered test datasets. In such scenarios, since the threshold is static, it does not adapt to the evolution of a metric stream (i.e., it would remain 0). Consequently, there is almost no sampling adjustment. To mitigate this, it was decided that the threshold should be predefined. Hence, it was set to 40% so that it corresponds to the imprecision parameter γ. The results of all experiments are summarized in Table 7.
Table 7. Comparison of the performance of all selected algorithms run against all three datasets.
It can be observed that in terms of JPM, the AdaM algorithm implemented in Sampling Model yields comparable results to the remaining state-of-the-art approaches. It outperformed UDASA in all executed tests while achieving slightly worse outcomes (by around 2%) than AWBS. Among the evaluated algorithms, UDASA demonstrated the least effective data volume reduction, which was associated with a relatively high prediction error. This is clearly observable in Figure 5, which depicts the monitoring of the RainMon dataset under various adaptive sampling strategies. While UDASA is able to account for spikes with a larger magnitude, it smooths out smaller fluctuations, which contributes to the prediction error.
Figure 5. Results of adaptive sampling on the RainMon dataset.
On the contrary, in terms of data volume reduction (SR), the best performance among all algorithms was achieved by AWBS. However, it should be noted that this result was accompanied by a significantly higher prediction error, as indicated by substantially worse MAPE values. This is reflected, among others, in Figure 6, where the results of adaptive data sampling on the NAB dataset are demonstrated. In AWBS, large portions of fluctuations are bypassed by averaging them within sliding windows. As a result, although it significantly reduces the amount of disseminated data, it introduces substantial delays in capturing the evolution of the metric stream, which manifests as estimation errors.
Figure 6. Results of adaptive sampling on NAB (anomaly jumps-up) dataset.
Overall, it can be concluded that among all presented algorithms, the implemented AdaM approach offers the most balanced trade-off between quality and performance. This observation is supported by the results of the tests carried out on the aerOS dataset, which are presented in Figure 7. Here, the CPU utilization exhibited highly systematic fluctuations with relatively low amplitude. Moreover, there were only a few observations that did not follow this pattern.
Figure 7. Results of adaptive sampling on aerOS dataset.
It is also noticeable that AWBS smoothed the majority of observations, failing to capture any of the isolated spikes. The only apparent fluctuations that it detected appeared at the beginning of the metric stream. This is likely due to an insufficient number of data samples to compute the average. In comparison, UDASA performed slightly better, as it was able to capture some of the variations. However, it failed to immediately account for the spikes, which manifests in Figure 7c as shifted slopes. Additionally, UDASA rarely adjusted to decreasing values. Finally, the AdaM technique exhibited greater responsiveness to the more substantial fluctuations, even though these were not of large magnitude.
It can be concluded that among the evaluated frugal adaptive sampling algorithms, AdaM is able to best accommodate different types of metric streams, including not only those characterized by abrupt changes but also those with stable behavior. The conducted experiments demonstrated that the method is able to respond quickly to substantial fluctuations, adjusting the sampling frequency accordingly. Furthermore, its performance aligns well with the operational characteristics of real-life ECC, as demonstrated in the tests executed over monitoring data obtained from the aerOS continuum.
Let us now describe the evaluation of the algorithm implemented in Shift/Anomaly Detection Model.

4.2. Shift/Anomaly Detection Model Verification

The process of evaluating the Shift/Anomaly Detection Model can be divided into two main parts: (1) assessing the quality of the anomaly detection using labeled datasets and comparing its performance to other existing state-of-the-art approaches, and (2) running the Shift/Anomaly Detection Model on unlabeled aerOS data and reviewing the obtained results with aerOS domain experts to assess their accuracy.
For the first part, Dataset 2 and Dataset 3 were used. Both of them provide a ground-truth mask, which, for each anomaly, specifies the time interval during which it is supposed to be detected. In this context, Dataset 1 was not selected for the tests, since it does not provide any labels enabling the distinguishing of anomalous observations from normal ones.
In all conducted experiments, it was assumed that the main objective was to detect anomalies as soon as possible. Hence, the parameters W_normal and W_anomaly were fixed at 1. In this way, the model was able to respond to abrupt fluctuations without delaying alerts regarding anomalous observations. Similarly, the value of Th_normal was fixed at 0.05, since the objective was to investigate the performance of the transition from the normal to the anomalous state, and not the other way around. Consequently, the only parameter that had to be adjusted was Th_anomaly. Therefore, in order to establish its optimal value, a set of comparative experiments was conducted.
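For illustration, a minimal sketch of the detection logic is shown below. It assumes a recursive density estimate of the type proposed by Angelov and Yager [49] and a simple hysteresis rule in which Th_anomaly acts as the density level below which an observation is considered anomalous (so higher values make the detector more sensitive), Th_normal as the tolerance for returning to the normal state, and W_anomaly / W_normal as the numbers of consecutive confirming samples. These transition conditions are illustrative assumptions, not the exact rules of the implemented model.

```python
class DensityShiftDetector:
    """Illustrative density-based shift/anomaly detector with hysteresis."""

    def __init__(self, th_anomaly=0.3, th_normal=0.05, w_anomaly=1, w_normal=1):
        self.th_anomaly = th_anomaly   # density level below which a sample looks anomalous
        self.th_normal = th_normal     # tolerance for switching back to the normal state
        self.w_anomaly = w_anomaly     # consecutive low-density samples needed to raise an alert
        self.w_normal = w_normal       # consecutive high-density samples needed to clear it
        self.k, self.mean, self.sq_mean = 0, 0.0, 0.0
        self.state, self._low, self._high = "NORMAL", 0, 0

    def _density(self, x):
        """Recursive density estimate: 1 / (1 + (x - mean)^2 + variance)."""
        self.k += 1
        self.mean += (x - self.mean) / self.k
        self.sq_mean += (x * x - self.sq_mean) / self.k
        variance = max(self.sq_mean - self.mean ** 2, 0.0)
        return 1.0 / (1.0 + (x - self.mean) ** 2 + variance)

    def update(self, x):
        """Ingest one observation and return the current state ('NORMAL' or 'ANOMALY')."""
        d = self._density(x)
        if self.state == "NORMAL":
            self._low = self._low + 1 if d < self.th_anomaly else 0
            if self._low >= self.w_anomaly:
                self.state, self._low = "ANOMALY", 0
        else:
            self._high = self._high + 1 if d > 1.0 - self.th_normal else 0
            if self._high >= self.w_normal:
                self.state, self._high = "NORMAL", 0
        return self.state

# Example: a stable CPU stream with a single abrupt jump.
detector = DensityShiftDetector(th_anomaly=0.3, w_anomaly=1)
for value in [40.0, 40.5, 39.8, 40.1, 92.0, 40.2, 40.0]:
    print(value, detector.update(value))
```

With W_anomaly = W_normal = 1, a single observation suffices to trigger a transition, mirroring the parameter choice described above.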
The performance of the anomaly detection was numerically assessed using the normalized anomaly score (S) quality measure. This measure was introduced as part of NAB [28], and it weights individual detections, relative to the expected time interval (anomaly window) in which they were supposed to be captured. In particular, it works in the following way. Detected anomalies are classified into three categories: (1) detected before the anomaly window, (2) detected within the anomaly window, and (3) detected after the anomaly window. Each such detected anomaly is associated with its individual score, which indicates how much a given detection should contribute to the overall quality of the algorithm. The detection score of individual anomalies is computed using a sigmoid function according to Equation (17).
$$\sigma(y_i) = \begin{cases} -1, & \text{if } y_i \text{ before window} \\ \dfrac{2}{1 + e^{5 y_i}} - 1, & \text{otherwise} \end{cases} \qquad (17)$$
In this equation, y_i represents the position of the detected anomaly i relative to the anomaly window. It is calculated using the formula in Equation (18).
$$y_i = \begin{cases} \dfrac{idx_{anomaly} - idx_{start\_window}}{window\_size} - 1, & \text{if } y_i \text{ inside window} \\[4pt] \dfrac{idx_{anomaly} - idx_{end\_window}}{window\_size}, & \text{otherwise} \end{cases} \qquad (18)$$
As can be observed, whenever the anomaly detection (idx_anomaly) is closer to the beginning of the anomaly window (idx_start_window), the relative position approaches −1, hence resulting in a larger positive value of the detection score. Therefore, early detection of anomalies is prioritized. On the other hand, when the anomaly is detected just at the end of the anomaly window (idx_end_window), the relative position is 0, neutralizing the impact of its detection score. For the anomalies that go beyond the window, the value of the detection score becomes increasingly negative as their relative distance from the detection window grows. Readers interested in a more in-depth rationale behind these formulas are referred to [28].
After scoring individual detections, each score is weighted (see Equation (19)) using the true positive weight (W_tp) or the false positive weight (W_fp). Then, all weighted individual detections are summed. Lastly, the number of windows with undetected anomalies (n_undetected), weighted by the false negative weight (W_fn), is subtracted from the sum. Hence, the overall formula of the score is as presented in Equation (20).
$$\sigma_A(y_i) = \begin{cases} \sigma(y_i) \cdot W_{tp}, & \text{if } y_i \text{ inside window} \\ \sigma(y_i) \cdot W_{fp}, & \text{otherwise} \end{cases} \qquad (19)$$
$$S_n = \sum_{i=1}^{n} \sigma_A(y_i) - n_{undetected} \cdot W_{fn} \qquad (20)$$
In the experiments, the normalized formula of the score was used as outlined in Equation (21).
$$S = 100 \cdot \frac{S_n - S_{null}}{S_{perfect} - S_{null}} \qquad (21)$$
Here, (1) S_perfect represents the perfect score (i.e., all anomalies recognized at the beginning of their anomaly windows), and (2) S_null represents the case when no anomalies were detected (i.e., a baseline defined individually for each NAB dataset).
The normalized score S was utilized as the basis for assessing the quality of the anomaly detection.
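To make the scoring procedure concrete, the sketch below implements Equations (17)–(21) in Python. Associating each detection with the window whose end is closest, and the default weights (W_tp = W_fn = 1.0 together with the lowered W_fp = 0.05 discussed below), are simplifying assumptions, and all function names are illustrative.

```python
import math

def detection_score(y, before_window):
    """Equation (17): -1 for detections preceding the window, scaled sigmoid otherwise."""
    if before_window:
        return -1.0
    return 2.0 / (1.0 + math.exp(5.0 * y)) - 1.0

def relative_position(idx_anomaly, idx_start, idx_end):
    """Equation (18): -1 at the window start, 0 at the window end, positive afterwards
    (both branches of the equation reduce to this single expression)."""
    return (idx_anomaly - idx_end) / (idx_end - idx_start)

def raw_score(detections, windows, w_tp=1.0, w_fp=0.05, w_fn=1.0):
    """Equations (19)-(20): weighted detection scores minus the penalty for missed windows."""
    total, covered = 0.0, set()
    for idx in detections:
        # Associate each detection with the window whose end is closest (simplifying assumption).
        start, end = min(windows, key=lambda w: abs(idx - w[1]))
        y = relative_position(idx, start, end)
        inside = start <= idx <= end
        sigma = detection_score(y, before_window=idx < start)
        total += sigma * (w_tp if inside else w_fp)
        if inside:
            covered.add((start, end))
    return total - (len(windows) - len(covered)) * w_fn

def normalized_score(s_n, s_null, s_perfect):
    """Equation (21): normalization against the null and perfect detector baselines."""
    return 100.0 * (s_n - s_null) / (s_perfect - s_null)

# Example: one anomaly window, one early in-window detection and one late false positive.
windows = [(100, 120)]
s_n = raw_score(detections=[102, 300], windows=windows)
s_perfect = raw_score([100], windows)   # detection exactly at the window start
s_null = raw_score([], windows)         # no detections at all
print(normalized_score(s_n, s_null, s_perfect))
```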
It should be noted that the standard weighting profile, proposed by the authors of NAB, makes the anomaly score extremely sensitive to false positives. Specifically, for the standard value W_fp = 0.11, the raw score S_n attains large negative values in scenarios characterized by a large number of false positives (e.g., when the model detects point anomalies rather than contextual ones). At the same time, the scale of the possible false positives is not accounted for in the S_null computation. As a result, under such conditions, the normalized score S may fall below 0, violating the intended 0–100% range. To accommodate this issue, in the conducted experiments, the standard weighting profile was modified by lowering W_fp to 0.05.
In this context, the results of tests conducted on all datasets using different values of T h a n o m a l y are presented in Table 8.
Table 8. Results of the tests of different T h a n o m a l y values obtained for both NAB datasets.
The results illustrate that for low threshold values (i.e., Th_anomaly = 0.1), the model fails to detect any anomalies; hence, the score S is 0%. Then, for slightly larger Th_anomaly (i.e., Th_anomaly = 0.2 or Th_anomaly = 0.3), S begins to increase sharply. This can be attributed to the fact that, for such parameters, the system remains moderately sensitive, rejecting minor fluctuations (i.e., minimizing false positives) while detecting substantial spikes. The peak results are obtained for Th_anomaly = 0.3 (93.12%) in the case of anomaly jumps-up and Th_anomaly = 0.2 (21.47%) for load balancer spikes. Beyond these values, the performance begins to steadily decline. This can be a result of the increased sensitivity of the system and, hence, a larger number of false positive anomalies. Moreover, it should be noted that the NAB datasets label contextual anomalies. Consequently, although the density-based anomaly detection algorithm implemented in Shift/Anomaly Detection Model is able to detect point-based anomalies, it may not be well suited for contextual ones.
This last conclusion is particularly supported by the S results obtained for the load balancer spikes dataset, which are significantly lower than in the case of anomaly jumps-up. It can be observed in Figure 8, which presents the comparison of the anomalies detected for anomaly jumps-up (Figure 8a) and load balancer spikes (Figure 8b) for Th_anomaly = 0.3. In the figures, the red semi-transparent area represents the labeled anomaly window. In the case of anomaly jumps-up, only one anomaly was detected because there was only one fluctuation of large magnitude. However, in the case of load balancer spikes, the data stream had multiple high spikes, each of which was classified as an anomaly, even though they were outside of the anomaly window. It is clearly visible that the algorithm was not able to recognize the anomalies based on the periodicity and frequency of their occurrences.
Figure 8. Results of anomaly detection for Th_anomaly = 0.3 and NAB datasets (anomaly jumps-up and load balancer spikes).
Based on the obtained results, the threshold Th_anomaly = 0.3 was selected to be used in subsequent experiments, comparing the performance of Shift/Anomaly Detection Model to other existing state-of-the-art approaches. In particular, for the comparison, the Contextual Anomaly Detector CAD-OSE [51] and ARTime [52] were used. Both of these algorithms were implemented as part of NAB and achieved the best results on the NAB scoreboard when tested against different corpus datasets. Therefore, it was decided that they would serve as a suitable basis for the performance evaluation. The results obtained by running Shift/Anomaly Detection Model, CAD-OSE, and ARTime on anomaly jumps-up and load balancer spikes are summarized in Table 9.
Table 9. Comparison of the performance of different anomaly detection algorithms run against NAB anomaly jumps-up and load balancer spikes datasets.
As can be observed, the density-based algorithm implemented in Shift/Anomaly Detection Model achieved a performance comparable to the remaining tested approaches. In particular, for both datasets, it scored higher than, or equal to, CAD-OSE, but slightly lower than ARTime. The superior performance achieved in both scenarios by ARTime can be primarily attributed to its inherent complexity, as it is based on the Adaptive Resonance Theory (ART) neural network. Consequently, among all applied algorithms, it most effectively captures contextual anomalies and reduces false positive rates. Moreover, since ART networks are designed to quickly adapt to new input patterns without retraining, ARTime is able to respond immediately to the detected deviations. This is clearly illustrated in Figure 9 and Figure 10, which depict the anomaly detection results on the anomaly jumps-up and the load balancer spikes datasets.
Figure 9. Results of anomaly detection on NAB (anomaly jumps-up).
Figure 10. Results of anomaly detection on NAB (load balancer spikes) dataset.
In the case of the anomaly jumps-up dataset (Figure 9), all three approaches were able to capture the anomaly within the intended anomaly window. Furthermore, none of them detected false positives. Here, the difference in their performance, reflected by the value of S, arises from the time at which the anomalies were detected. Specifically, ARTime was the fastest in anomaly detection, followed by the Shift/Anomaly Detection Model, with CAD-OSE demonstrating the slowest response. However, in this particular case, the differences in anomaly detection time were negligible. Therefore, in applications where rapid anomaly detection is not highly critical (e.g., monitoring trends in streaming sensor data), all three algorithms can be considered equivalent in terms of performance.
On the other hand, for the experiments performed on the load balancer spikes dataset (Figure 10), the differences are more prominent. In particular, all algorithms detected a substantial number of false positive observations, with the highest number produced by Shift/Anomaly Detection Model. This strengthens the conclusion that the implemented density-based algorithm is better suited for point anomaly detection than for contextual anomalies. Nevertheless, it is important to note that Shift/Anomaly Detection Model detected the true anomaly relatively early, within the intended anomaly window, which contributed positively to its overall anomaly score. In contrast, the number of false positives detected by ARTime and CAD-OSE was comparably lower. The difference in final performance between these two algorithms resulted from the time taken to detect the true anomaly, with ARTime identifying it significantly faster than CAD-OSE.
The conducted experiments showcase that ARTime achieves the best overall performance. However, it is important to emphasize that this particular algorithm is computationally demanding, which hinders its applicability in environments that rely on resource-restricted devices (e.g., edge sensors). Consequently, given the comparable performance achieved by Shift/Anomaly Detection Model, it may be an appropriate alternative for anomaly detection in environments that require frugal use of computational resources.
In order to assess the performance of Shift/Anomaly Detection Model under close-to-real-life conditions, the final tests were conducted on Dataset 1 (i.e., the aerOS dataset). Since the aerOS dataset is characterized by low data fluctuations, the experiments were conducted first for Th_anomaly = 0.3 and then for Th_anomaly = 0.95. The main purpose was to evaluate the behavior of the model under increased sensitivity to anomaly detection. The results obtained for both configurations are illustrated in Figure 11. Here, it should be noted that the aerOS dataset does not provide ground-truth labels describing which observations are anomalous. Therefore, the reliability of the obtained results has to be confirmed by the aerOS experts.
Figure 11. Results of anomaly detection on aerOS dataset.
As can be observed, for Th_anomaly = 0.3, Shift/Anomaly Detection Model detected no anomalies. This behavior is fully expected, as in such a configuration the model is not sensitive to minor fluctuations, such as those observed in the aerOS dataset. Singular anomalies were detected only after increasing Th_anomaly to 0.95, representing a very high sensitivity setting. Under this configuration, the model identified several small spikes. In this particular case, these detections mainly illustrate that the sensitivity of Shift/Anomaly Detection Model can be effectively and seamlessly tuned via its parameters.
The results of the experiments demonstrate the model’s versatility, making it adaptable for a wide range of application domains, ranging from those that require the detection of subtle variations (e.g., healthcare applications) to those that should respond only to more substantial fluctuations (e.g., industrial fault detection).

5. Limitations

The implemented approach was evaluated on a set of synthetic and real-world datasets, proving its effectiveness by showing that individual models achieve performance comparable to or better than other existing state-of-the-art approaches. However, there are still significant limitations that need to be addressed further.
For instance, the implemented Shift/Anomaly Detection Model performs well only on point-based anomalies, which hinders its applicability in systems that rely on long-term contextual information (e.g., identifying anomalies in network traffic). Hence, future research should explore possibilities of incorporating context into the anomaly detection process, while still preserving the frugal nature of the approach.
Separately, the Sampling Model has been primarily tested on the univariate metric streams, under-representing the real-world applications that rely on multivariate data. As such, a broader validation within the multivariate data domain is required to fully assess the performance of the model.
Moreover, as demonstrated by the results of the experiments, both models are highly reliant on their parameters, requiring substantial knowledge from domain experts for their tuning. Although this has been partially addressed by introducing interfaces supporting on-the-fly parameter modification, the current solution is fully manual and may not be feasible in large-scale complex systems. Consequently, future efforts should investigate automated mechanisms for parameter tuning or potentially propose means for autonomous self-configuration.
Finally, while the current study validates the proposed self-optimization module in simulation and dataset-driven scenarios, its real-world efficacy remains to be assessed. This limitation is particularly relevant since only experiments conducted on real-life hardware can capture the true impact of the algorithms in terms of actual resource utilization and energy consumption. Furthermore, since frugality was a central aspect of the design of the presented algorithms, so that they can accommodate resource-restricted devices, experiments performed solely in simulated environments cannot fully verify whether this objective was achieved. Therefore, future deployments and experiments of the self-optimization module must be conducted on physical ECC infrastructures. Such evaluations would require the careful selection and configuration of edge devices, as well as the design of representative test scenarios along with domain-specific evaluation metrics. Given the extent of such experiments, they fall beyond the scope of the present contribution but will be undertaken as a continuation of the current research.
Nevertheless, the preliminary results obtained within this contribution indicate a strong potential in the implemented self-optimization module, driven by its modularity, generalizability, and frugal approach.

6. Concluding Remarks

The contribution introduced a novel perspective on the self-optimization for resource-constrained edge devices that transmit and process streaming data. Specifically, it presented a design of a modular and pluggable self-optimization module comprising two core components: (1) Shift/Anomaly Detection Model and (2) Sampling Model. In light of an ambiguity existing in the current literature regarding the definition of self-optimization, it can be stated that, in the context of this work, self-optimization was perceived from the perspective of enhancing operational efficiency and minimizing energy consumption.
On the one hand, in order to reduce the energy overhead associated with the transmission of large data volumes, the focus was placed on the optimization of the data-monitoring process. The PEWMA-based adaptive sampling technique, implemented within Sampling Model, was utilized to autonomously adjust the monitoring rate, reducing the overall disseminated data volume while preserving the relevant information. Such an approach bridges a significant gap in the existing self-optimization literature, where the subject of optimizing the data transmission process is often overlooked.
On the other hand, through Shift/Anomaly Detection Model, this study also addressed the challenges of resource optimization by analyzing current resource utilization and detecting potential abnormalities. However, it should be noted that this process pertained only to the identification of potential adaptation opportunities, rather than suggesting particular actions to undertake. By doing so, the presented approach provides a greater scope for generalization, since adaptation actions are often highly context-dependent and application-specific. Following such a design enables the self-optimization module to be seamlessly integrated into different infrastructures, where it can interface with external components responsible for executing adaptations. Nevertheless, examples of such integrations are yet to be studied in future publications.
Moreover, it should be noted that the algorithms implemented in both models were derived from the analysis of the current state-of-the-art, where frugality, understood from the perspectives of (1) resource utilization and (2) required data volume, constituted the primary criterion behind the algorithms' selection. Consequently, it can be argued that the introduced approach accommodates well the requirements of a wide variety of ECC applications, including those with substantial processing limitations. For instance, the developed self-optimization module may equally be deployed on an ECC composed of large edge servers processing satellite images, on embedded devices located within vehicles, or on small wearable smartwatches used to monitor the vital parameters of patients.
As such, it can be summarized that the implemented self-optimization module may constitute a viable component for next-generation intelligent ECC.

Author Contributions

Conceptualization, Z.W. and K.W.-M.; methodology, Z.W. and K.W.-M.; software, Z.W.; validation, Z.W.; investigation, Z.W.; writing—original draft preparation, Z.W.; writing—review and editing, K.W.-M., Y.W., M.G., and M.P.; visualization, Z.W.; supervision, M.G. and M.P. All authors have read and agreed to the published version of the manuscript.

Funding

The work of Zofia Wrona and Katarzyna Wasielewska-Michniewska was funded by the European Union’s “Horizon Europe” research and innovation funding program as part of the “Autonomous, scalable, trustworthy, intelligent European meta-operating system for the IoT edge–cloud continuum” (aerOS) project under Grant Agreement No. 101069732. Moreover, the work of Maria Ganzha and Marcin Paprzycki was funded by the European Union’s “Horizon Europe” research and innovation funding program as part of the “Open CloudEdgeIoT Platform Uptake in Large Scale Cross-Domain Pilots” (O-CEI) project under Grant Agreement No. 101189589.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

NAB datasets used as part of this study can be found at: https://github.com/numenta/NAB (Access date: 24 October 2025). The aerOS dataset presented in this article is not readily available yet because it is a subject of an ongoing study as part of the aerOS project and will be published along with the completed open-source code. Requests to access the dataset earlier should be directed to zofia.wrona.dokt@pw.edu.pl.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

IoT: Internet of Things
ECC: Edge–Cloud Continuum
SLA: Service Level Agreement
NAB: Numenta Anomaly Benchmark
PEWMA: Probabilistic Exponential Weighted Moving Average
HiTL: Human in the Loop
IE: Infrastructure Element
AWBS: Adaptive Window-Based Sampling
UDASA: User-Driven Adaptive Sampling Algorithm
QoS: Quality of Service
MAD: Median Absolute Deviation
AdaM: Adaptive Monitoring Framework
MAPE: Mean Absolute Percentage Error
SR: Sample Ratio
JPM: Joint-Performance Metric

References

  1. Gkonis, P.; Giannopoulos, A.; Trakadas, P.; Masip-Bruin, X.; D’Andria, F. A Survey on IoT-Edge-Cloud Continuum Systems: Status, Challenges, Use Cases, and Open Issues. Future Internet 2023, 15, 383. [Google Scholar] [CrossRef]
  2. Trigka, M.; Dritsas, E. Edge and Cloud Computing in Smart Cities. Future Internet 2025, 17, 118. [Google Scholar] [CrossRef]
  3. Belcastro, L.; Marozzo, F.; Orsino, A.; Talia, D.; Trunfio, P. Edge-Cloud Continuum Solutions for Urban Mobility Prediction and Planning. IEEE Access 2023, 11, 38864–38874. [Google Scholar] [CrossRef]
  4. Xu, Y.; He, H.; Liu, J.; Shen, Y.; Taleb, T.; Shiratori, N. IDADET: Iterative Double-Sided Auction-Based Data-Energy Transaction Ecosystem in Internet of Vehicles. IEEE Internet Things J. 2023, 10, 10113–10130. [Google Scholar] [CrossRef]
  5. Ying, J.; Hsieh, J.; Hou, D.; Hou, J.; Liu, T.; Zhang, X.; Wang, Y.; Pan, Y.T. Edge-enabled cloud computing management platform for smart manufacturing. In Proceedings of the 2021 IEEE International Workshop on Metrology for Industry 4.0 & IoT (MetroInd4.0&IoT), Rome, Italy, 7–9 June 2021; pp. 682–686. [Google Scholar] [CrossRef]
  6. Boiko, O.; Komin, A.; Malekian, R.; Davidsson, P. Edge-Cloud Architectures for Hybrid Energy Management Systems: A Comprehensive Review. IEEE Sens. J. 2024, 24, 15748–15772. [Google Scholar] [CrossRef]
  7. Fu, W.; Wan, Y.; Qin, J.; Kang, Y.; Li, L. Privacy-Preserving Optimal Energy Management for Smart Grid With Cloud-Edge Computing. IEEE Trans. Ind. Inform. 2022, 18, 4029–4038. [Google Scholar] [CrossRef]
  8. Dhifaoui, S.; Houaidia, C.; Saidane, L.A. Cloud-Fog-Edge Computing in Smart Agriculture in the Era of Drones: A systematic survey. In Proceedings of the 2022 IEEE 11th IFIP International Conference on Performance Evaluation and Modeling in Wireless and Wired Networks (PEMWN), Rome, Italy, 8–10 November 2022; pp. 1–6. [Google Scholar] [CrossRef]
  9. Yousif, M. The Edge, the Cloud, and the Continuum. In Proceedings of the 2022 Cloud Continuum, Los Alamitos, CA, USA, 5 December 2022; pp. 1–2. [Google Scholar] [CrossRef]
  10. Weinman, J. Trade-Offs Along the Cloud-Edge Continuum. In Proceedings of the 2022 Cloud Continuum, Los Alamitos, CA, USA, 5 December 2022; IEEE: Piscataway, NJ, USA, 2022; Volume 12, pp. 1–7. [Google Scholar] [CrossRef]
  11. Kong, X.; Wu, Y.; Wang, H.; Xia, F. Edge Computing for Internet of Everything: A Survey. IEEE Internet Things J. 2022, 9, 23472–23485. [Google Scholar] [CrossRef]
  12. Alalawi, A.; Al-Omary, A. Cloud Computing Resources: Survey of Advantage, Disadvantages and Pricing. In Proceedings of the 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI), Sakheer, Bahrain, 26–27 October 2020; pp. 1–6. [Google Scholar] [CrossRef]
  13. George, A.S.; George, A.S.H.; Baskar, T. Edge Computing and the Future of Cloud Computing: A Survey of Industry Perspectives and Predictions. Partners Univers. Int. Res. J. 2023, 2, 19–44. [Google Scholar] [CrossRef]
  14. Azmy, S.B.; El-Khatib, R.F.; Zorba, N.; Hassanein, H.S. Extreme Edge Computing Challenges on the Edge-Cloud Continuum. In Proceedings of the 2024 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), Kingston, ON, Canada, 6–9 August 2024; pp. 99–100. [Google Scholar] [CrossRef]
  15. Vaño, R.; Lacalle, I.; Sowiński, P.; S-Julián, R.; Palau, C.E. Cloud-Native Workload Orchestration at the Edge: A Deployment Review and Future Directions. Sensors 2023, 23, 2215. [Google Scholar] [CrossRef]
  16. Chiang, Y.; Zhang, Y.; Luo, H.; Chen, T.Y.; Chen, G.H.; Chen, H.T.; Wang, Y.J.; Wei, H.Y.; Chou, C.T. Management and Orchestration of Edge Computing for IoT: A Comprehensive Survey. IEEE Internet Things J. 2023, 10, 14307–14331. [Google Scholar] [CrossRef]
  17. Karami, A.; Karami, M. Edge computing in big data: Challenges and benefits. Int. J. Data Sci. Anal. 2025, 20, 6183–6226. [Google Scholar] [CrossRef]
  18. Andriulo, F.C.; Fiore, M.; Mongiello, M.; Traversa, E.; Zizzo, V. Edge Computing and Cloud Computing for Internet of Things: A Review. Informatics 2024, 11, 71. [Google Scholar] [CrossRef]
  19. Zhang, S.; Liu, C.; Li, X.; Han, Y. Runtime reconfiguration of data services for dealing with out-of-range stream fluctuation in cloud-edge environments. Digit. Commun. Netw. 2022, 8, 1014–1026. [Google Scholar] [CrossRef]
  20. Sheikh, A.M.; Islam, M.R.; Habaebi, M.H.; Zabidi, S.A.; Bin Najeeb, A.R.; Kabbani, A. A Survey on Edge Computing (EC) Security Challenges: Classification, Threats, and Mitigation Strategies. Future Internet 2025, 17, 175. [Google Scholar] [CrossRef]
  21. Kephart, J.; Chess, D. The Vision Of Autonomic Computing. Computer 2003, 36, 41–50. [Google Scholar] [CrossRef]
  22. S-Julián, R.; Lacalle Úbeda, I.; Vaño, R.; Boronat, F.; Palau, C. Self-* Capabilities of Cloud-Edge Nodes: A Research Review. Sensors 2023, 23, 2931. [Google Scholar] [CrossRef] [PubMed]
  23. Jayakumar, V.K.; Lee, J.; Kim, I.K.; Wang, W. A Self-Optimized Generic Workload Prediction Framework for Cloud Computing. In Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, 18–22 May 2020; pp. 779–788. [Google Scholar] [CrossRef]
  24. Betancourt, V.P.; Kirschner, M.; Kreutzer, M.; Becker, J. Policy-Based Task Allocation at Runtime for a Self-Adaptive Edge Computing Infrastructure. In Proceedings of the 2023 IEEE 15th International Symposium on Autonomous Decentralized System (ISADS), Mexico City, Mexico, 15–17 March 2023; pp. 1–8. [Google Scholar] [CrossRef]
  25. Krzysztoń, E.; Rojek, I.; Mikołajewski, D. A Comparative Analysis of Anomaly Detection Methods in IoT Networks: An Experimental Study. Appl. Sci. 2024, 14, 11545. [Google Scholar] [CrossRef]
  26. De Oliveira, E.A.; Rocha, A.; Mattoso, M.; Delicato, F. Latency and Energy-Awareness in Data Stream Processing for Edge Based IoT Systems. J. Grid Comput. 2022, 20, 27. [Google Scholar] [CrossRef]
  27. Giouroukis, D.; Dadiani, A.; Traub, J.; Zeuch, S.; Markl, V. A Survey of Adaptive Sampling and Filtering Algorithms for the Internet of Things. In Proceedings of the DEBS ’20: 14th ACM International Conference on Distributed and Event-Based Systems, Montreal, QC, Canada, 13–17 July 2020. [Google Scholar]
  28. Lavin, A.; Ahmad, S. Evaluating Real-Time Anomaly Detection Algorithms – The Numenta Anomaly Benchmark. In Proceedings of the 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 9–11 December 2015; pp. 38–44. [Google Scholar] [CrossRef]
  29. Hafeez, T.; McArdle, G.; Xu, L. Adaptive Window Based Sampling on The Edge for Internet of Things Data Streams. In Proceedings of the 2020 11th International Conference on Network of the Future (NoF), Bordeaux, France, 12–14 October 2020; pp. 105–109. [Google Scholar] [CrossRef]
  30. Kim-Hung, L.; Le-Trung, Q. User-Driven Adaptive Sampling for Massive Internet of Things. IEEE Access 2020, 8, 135798–135810. [Google Scholar] [CrossRef]
  31. Singh, S.; Chana, I.; Singh, M.; Buyya, R. SOCCER: Self-Optimization of Energy-efficient Cloud Resources. Clust. Comput. 2016, 19, 1787–1800. [Google Scholar] [CrossRef]
  32. Kumar, A.; Lal, M.; Kaur, S. An autonomic resource management system for energy efficient and quality of service aware resource scheduling in cloud environment. Concurr. Comput. Pract. Exp. 2023, 35, e7699. [Google Scholar] [CrossRef]
  33. Mordacchini, M.; Ferrucci, L.; Carlini, E.; Kavalionak, H.; Coppola, M.; Dazzi, P. Self-organizing Energy-Minimization Placement of QoE-Constrained Services at the Edge. In Proceedings of the Economics of Grids, Clouds, Systems, and Services, Virtual Event, 21–23 September 2020; Tserpes, K., Altmann, J., Bañares, J.Á., Agmon Ben-Yehuda, O., Djemame, K., Stankovski, V., Tuffin, B., Eds.; Springer: Cham, Switzerland, 2021; pp. 133–142. [Google Scholar]
  34. Trihinas, D.; Pallis, G.; Dikaiakos, M.D. AdaM: An adaptive monitoring framework for sampling and filtering on IoT devices. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October–1 November 2015; pp. 717–726. [Google Scholar] [CrossRef]
  35. Zhang, D.; Ni, C.; Zhang, J.; Zhang, T.; Yang, P.; Wang, J.; Yan, H. A Novel Edge Computing Architecture Based on Adaptive Stratified Sampling. Comput. Commun. 2022, 183, 121–135. [Google Scholar] [CrossRef]
  36. Lou, P.; Shi, L.; Zhang, X.; Xiao, Z.; Yan, J. A Data-Driven Adaptive Sampling Method Based on Edge Computing. Sensors 2020, 20, 2174. [Google Scholar] [CrossRef] [PubMed]
  37. Choi, D.Y.; Kim, S.W.; Choi, M.A.; Geem, Z.W. Adaptive Kalman Filter Based on Adjustable Sampling Interval in Burst Detection for Water Distribution System. Water 2016, 8, 142. [Google Scholar] [CrossRef]
  38. Hawkins, D.M. Identification of Outliers; Springer: Berlin/Heidelberg, Germany, 1980; Volume 11. [Google Scholar]
  39. Das, R.; Luo, T. LightESD: Fully-Automated and Lightweight Anomaly Detection Framework for Edge Computing. In Proceedings of the 2023 IEEE International Conference on Edge Computing and Communications (EDGE), Chicago, IL, USA, 2–8 July 2023; pp. 150–158. [Google Scholar] [CrossRef]
  40. Kim, T.; Park, C.H. Anomaly pattern detection for streaming data. Expert Syst. Appl. 2020, 149, 113252. [Google Scholar] [CrossRef]
  41. Liu, X.; Lai, Z.; Wang, X.; Huang, L.; Nielsen, P.S. A Contextual Anomaly Detection Framework for Energy Smart Meter Data Stream. In Proceedings of the Neural Information Processing, Vancouver, BC, Canada, 6–12 December 2020; Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I., Eds.; Springer: Cham, Switzerland, 2020; pp. 733–742. [Google Scholar]
  42. Jain, P.; Jain, S.; Zaïane, O.R.; Srivastava, A. Anomaly Detection in Resource Constrained Environments with Streaming Data. IEEE Trans. Emerg. Top. Comput. Intell. 2022, 6, 649–659. [Google Scholar] [CrossRef]
  43. Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422. [Google Scholar] [CrossRef]
  44. Shirazi, S.N.; Simpson, S.; Gouglidis, A.; Mauthe, A.; Hutchison, D. Anomaly Detection in the Cloud Using Data Density. In Proceedings of the 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), San Francisco, CA, USA, 27 June–2 July 2016; pp. 616–623. [Google Scholar] [CrossRef]
  45. Garcia-Martin, E.; Bifet, A.; Lavesson, N.; König, R.; Linusson, H. Green Accelerated Hoeffding Tree. arXiv 2022, arXiv:2205.03184. [Google Scholar] [CrossRef]
  46. Manapragada, C.; Webb, G.I.; Salehi, M. Extremely Fast Decision Tree. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, New York, NY, USA, 19–23 August 2018; KDD ’18. pp. 1953–1962. [Google Scholar] [CrossRef]
  47. Lu, T.; Wang, L.; Zhao, X. Review of Anomaly Detection Algorithms for Data Streams. Appl. Sci. 2023, 13, 6353. [Google Scholar] [CrossRef]
  48. OpenFaas. Available online: https://www.openfaas.com/ (accessed on 28 September 2025).
  49. Angelov, P.; Yager, R. Simplified fuzzy rule-based systems using non-parametric antecedents and relative data density. In Proceedings of the 2011 IEEE Workshop on Evolving and Adaptive Intelligent Systems (EAIS), Paris, France, 14–15 April 2011; pp. 62–69. [Google Scholar] [CrossRef]
  50. Shafer, I.; Ren, K.; Boddeti, V.N.; Abe, Y.; Ganger, G.R.; Faloutsos, C. RainMon: An integrated approach to mining bursty timeseries monitoring data. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; ACM: New York, NY, USA, 2012. KDD ’12. pp. 1158–1166. [Google Scholar] [CrossRef]
  51. Contextual Anomaly Detector for NAB. Available online: https://github.com/smirmik/CAD (accessed on 8 May 2025).
  52. ARTime Anomaly Detector for NAB. Available online: https://github.com/markNZed/ARTimeNAB.jl (accessed on 8 May 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
