
Search Results (10)

Search Parameters:
Keywords = Service Level Objectives (SLO)

32 pages, 2917 KiB  
Article
Self-Adapting CPU Scheduling for Mixed Database Workloads via Hierarchical Deep Reinforcement Learning
by Suchuan Xing, Yihan Wang and Wenhe Liu
Symmetry 2025, 17(7), 1109; https://doi.org/10.3390/sym17071109 - 10 Jul 2025
Viewed by 328
Abstract
Modern database systems require autonomous CPU scheduling frameworks that dynamically optimize resource allocation across heterogeneous workloads while maintaining strict performance guarantees. We present a novel hierarchical deep reinforcement learning framework augmented with graph neural networks to address CPU scheduling challenges in mixed database environments comprising Online Transaction Processing (OLTP), Online Analytical Processing (OLAP), vector processing, and background maintenance workloads. Our approach introduces three key innovations: first, a symmetric two-tier control architecture where a meta-controller allocates CPU budgets across workload categories using policy gradient methods while specialized sub-controllers optimize process-level resource allocation through continuous action spaces; second, graph neural network-based dependency modeling that captures complex inter-process relationships and communication patterns while preserving inherent symmetries in database architectures; and third, meta-learning integration with curiosity-driven exploration enabling rapid adaptation to previously unseen workload patterns without extensive retraining. The framework incorporates a multi-objective reward function balancing Service Level Objective (SLO) adherence, resource efficiency, symmetric fairness metrics, and system stability. Experimental evaluation through high-fidelity digital twin simulation and production deployment demonstrates substantial performance improvements: 43.5% reduction in p99 latency violations for OLTP workloads and 27.6% improvement in overall CPU utilization, with successful scaling to 10,000 concurrent processes maintaining sub-3% scheduling overhead. This work represents a significant advancement toward truly autonomous database resource management, establishing a foundation for next-generation self-optimizing database systems with implications extending to broader orchestration challenges in cloud-native architectures. 
(This article belongs to the Section Computer)

19 pages, 1008 KiB  
Article
On the Analysis of Inter-Relationship between Auto-Scaling Policy and QoS of FaaS Workloads
by Sara Hong, Yeeun Kim, Jaehyun Nam and Seongmin Kim
Sensors 2024, 24(12), 3774; https://doi.org/10.3390/s24123774 - 10 Jun 2024
Cited by 1 | Viewed by 1807
Abstract
Recent developments in cloud computing have introduced serverless technology, enabling convenient and flexible management of cloud-native applications. Typically, Function-as-a-Service (FaaS) solutions rely on serverless backends such as Kubernetes (K8s) and Knative to leverage resource-management features for the underlying containers, including auto-scaling and pod scheduling. To exploit these advantages, cloud service providers increasingly deploy self-hosted serverless services on their own on-premise FaaS platforms rather than relying on commercial public cloud offerings. However, the lack of standardized guidelines for configuring K8s auto-scaling to fairly schedule and allocate resources in such on-premise hosting environments poses challenges in meeting the service level objectives (SLOs) of diverse workloads. This study fills this gap by exploring the relationship between auto-scaling behavior and the performance of FaaS workloads under different scaling-related configurations in K8s. Based on comprehensive measurement studies, we derive guidance on which scaling configuration, such as the base metric and threshold, should be applied to which workload to maximize latency SLO attainment and the number of responses. Additionally, we propose a methodology to assess the scaling efficiency of the related K8s configurations with respect to the quality of service (QoS) of FaaS workloads.
(This article belongs to the Special Issue Edge Computing in Internet of Things Applications)

28 pages, 845 KiB  
Article
SLA-Adaptive Threshold Adjustment for a Kubernetes Horizontal Pod Autoscaler
by Olesia Pozdniakova, Dalius Mažeika and Aurimas Cholomskis
Electronics 2024, 13(7), 1242; https://doi.org/10.3390/electronics13071242 - 27 Mar 2024
Cited by 4 | Viewed by 1950
Abstract
Kubernetes is an open-source container orchestration system that provides a built-in module for dynamic resource provisioning named the Horizontal Pod Autoscaler (HPA). The HPA identifies the number of resources to be provisioned by calculating the ratio between the current and target utilisation metrics. The target utilisation metric, or threshold, directly impacts how many and how quickly resources will be provisioned. However, the determination of the threshold that would allow satisfying performance-based Service Level Objectives (SLOs) is a long, error-prone, manual process because it is based on the static threshold principle and requires manual configuration. This might result in underprovisioning or overprovisioning, leading to the inadequate allocation of computing resources or SLO violations. Numerous autoscaling solutions have been introduced as alternatives to the HPA to simplify the process. However, the HPA is still the most widely used solution due to its ease of setup, operation, and seamless integration with other Kubernetes functionalities. The present study proposes a method that utilises exploratory data analysis techniques along with moving average smoothing to identify the target utilisation threshold for the HPA. The objective is to ensure that the system functions without exceeding the maximum number of events that result in a violation of the response time defined in the SLO. A prototype was created to adjust the threshold values dynamically, utilising the proposed method. This prototype enables the evaluation and comparison of the proposed method with the HPA, which has the highest threshold set that meets the performance-based SLOs. The results of the experiments proved that the suggested method adjusts the thresholds to the desired service level with a 1–2% accuracy rate and only 4–10% resource overprovisioning, depending on the type of workload.
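For readers unfamiliar with the mechanism this abstract builds on: the HPA's replica count is documented in Kubernetes as ceil(currentReplicas × currentMetric / targetMetric). The sketch below (the function name is illustrative, not from the paper) shows why the chosen threshold directly drives how aggressively resources are provisioned:

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float) -> int:
    """Replica count as computed by the HPA control loop:
    ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# The same observed load scales out further under a lower threshold:
print(desired_replicas(4, 90.0, 60.0))  # 6 pods at a 60% CPU target
print(desired_replicas(4, 90.0, 80.0))  # 5 pods at an 80% CPU target
```

Lowering the target from 80% to 60% CPU provisions one extra pod for identical load; this sensitivity of the static threshold is what the paper's dynamic adjustment method exploits.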

39 pages, 1887 KiB  
Article
Efficient Resource Utilization in IoT and Cloud Computing
by Vivek Kumar Prasad, Debabrata Dansana, Madhuri D. Bhavsar, Biswaranjan Acharya, Vassilis C. Gerogiannis and Andreas Kanavos
Information 2023, 14(11), 619; https://doi.org/10.3390/info14110619 - 19 Nov 2023
Cited by 12 | Viewed by 6644
Abstract
With the proliferation of IoT devices, there has been exponential growth in data generation, placing substantial demands on both cloud computing (CC) and internet infrastructure. CC, renowned for its scalability and virtual resource provisioning, is of paramount importance in e-commerce applications. However, the dynamic nature of IoT and cloud services introduces unique challenges, notably in the establishment of service-level agreements (SLAs) and the continuous monitoring of compliance. This paper presents a versatile framework for the adaptation of e-commerce applications to IoT and CC environments. It introduces a comprehensive set of metrics designed to support SLAs by enabling periodic resource assessments, ensuring alignment with service-level objectives (SLOs). This policy-driven approach seeks to automate resource management in the era of CC, thereby reducing the dependency on extensive human intervention in e-commerce applications. This paper culminates with a case study that demonstrates the practical utilization of metrics and policies in the management of cloud resources. Furthermore, it provides valuable insights into the resource requisites for deploying e-commerce applications within the realms of the IoT and CC. This holistic approach holds the potential to streamline the monitoring and administration of CC services, ultimately enhancing their efficiency and reliability.
(This article belongs to the Special Issue Systems Engineering and Knowledge Management)

26 pages, 4909 KiB  
Article
FireFace: Leveraging Internal Function Features for Configuration of Functions on Serverless Edge Platforms
by Ming Li, Jianshan Zhang, Jingfeng Lin, Zheyi Chen and Xianghan Zheng
Sensors 2023, 23(18), 7829; https://doi.org/10.3390/s23187829 - 12 Sep 2023
Cited by 2 | Viewed by 2054
Abstract
Serverless computing has emerged as a captivating paradigm for deploying cloud applications, alleviating developers’ concerns about infrastructure resource management: they need only configure necessary parameters such as latency and memory constraints. Existing resource configuration solutions for cloud-based serverless applications can be broadly classified into modeling based on historical data or a combination of sparse measurements and interpolation/modeling. In pursuit of faster service response and reduced network bandwidth, platforms have progressively expanded from the traditional cloud to the edge. Compared to cloud platforms, serverless edge platforms often incur more running overhead due to their limited resources, resulting in undesirable financial costs for developers when using the existing solutions. Meanwhile, it is extremely challenging to handle the heterogeneity of edge platforms, characterized by distinct pricing owing to their varying resource preferences. To tackle these challenges, we propose an adaptive and efficient approach called FireFace, consisting of prediction and decision modules. The prediction module extracts the internal features of all functions within the serverless application and uses this information to predict the execution time of the functions under specific configuration schemes. Based on the prediction module, the decision module analyzes the environment information and uses the Adaptive Particle Swarm Optimization with Genetic Algorithm Operator (APSO-GA) algorithm to select the most suitable configuration plan for each function, including CPU, memory, and edge platforms. In this way, it is possible to effectively minimize the financial overhead while fulfilling the Service Level Objectives (SLOs). Extensive experimental results show that our prediction model obtains optimal results under all three metrics, and the prediction error rate for real-world serverless applications is in the range of 4.25∼9.51%. Our approach can find the optimal resource configuration scheme for each application, which saves 7.2∼44.8% on average compared to other classic algorithms. Moreover, FireFace exhibits rapid adaptability, efficiently adjusting resource allocation schemes in response to dynamic environments.

31 pages, 7425 KiB  
Article
Evaluating Task-Level CPU Efficiency for Distributed Stream Processing Systems
by Johannes Rank, Jonas Herget, Andreas Hein and Helmut Krcmar
Big Data Cogn. Comput. 2023, 7(1), 49; https://doi.org/10.3390/bdcc7010049 - 10 Mar 2023
Viewed by 3234
Abstract
Big Data and primarily distributed stream processing systems (DSPSs) are growing in complexity and scale. As a result, effective performance management to ensure that these systems meet the required service level objectives (SLOs) is becoming increasingly difficult. A key factor to consider when evaluating the performance of a DSPS is CPU efficiency, which is the ratio of the workload processed by the system to the CPU resources invested. In this paper, we argue that developing new performance tools for creating DSPSs that can fulfill SLOs while using minimal resources is crucial. This is especially significant in edge computing situations where resources are limited and in large cloud deployments where conserving power and reducing computing expenses are essential. To address this challenge, we present a novel task-level approach for measuring CPU efficiency in DSPSs. Our approach supports various streaming frameworks, is adaptable, and comes with minimal overheads. This enables developers to understand the efficiency of different DSPSs at a granular level and provides insights that were not previously possible.
(This article belongs to the Special Issue Distributed Applications and Services for Future Internet)
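The efficiency metric this abstract defines, workload processed divided by CPU invested, can be sketched in a few lines (the function name and the sample numbers are illustrative, not measurements from the paper):

```python
def cpu_efficiency(events_processed: int, cpu_seconds_used: float) -> float:
    """Task-level CPU efficiency: workload processed per CPU-second invested."""
    if cpu_seconds_used <= 0:
        raise ValueError("CPU time invested must be positive")
    return events_processed / cpu_seconds_used

# Two tasks of the same streaming job, measured over the same window:
print(cpu_efficiency(1_200_000, 30.0))  # 40000.0 events per CPU-second
print(cpu_efficiency(600_000, 30.0))    # 20000.0 events per CPU-second
```

Comparing tasks at this granularity, rather than judging the whole DSPS by aggregate utilization, is what lets a developer spot which operator wastes CPU.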

19 pages, 632 KiB  
Article
Tail Prediction for Heterogeneous Data Center Clusters
by Sharaf Malebary, Sami Alesawi and Hao Che
Processes 2023, 11(2), 407; https://doi.org/10.3390/pr11020407 - 30 Jan 2023
Viewed by 1955
Abstract
Service providers need to meet their service level objectives (SLOs) to ensure better client experiences. Predicting tail sojourn times of applications is an essential step in combating long tail latency. To further demonstrate the power of our prediction model, this research introduces new study scenarios for heterogeneous environments, using either of two methods: white- or black-box solutions. It presents several techniques for modeling clusters of inhomogeneous nodes, recognized as heterogeneous fork-join queuing networks (HFJQNs). Also included is a nested-event-based simulation model that draws on multi-core technologies: it adopts multiprocessing in its design to enable different architectural designs for all computing nodes. This implementation of the simulation model is a logical next step for research targeting heterogeneous clusters, beyond the several provided scenarios. Experimental results confirm that even under such heterogeneous conditions, tail latency can be predicted in high-load regions with an approximate relative error of less than 15%.
(This article belongs to the Special Issue Trends of Machine Learning in Multidisciplinary Engineering Processes)

18 pages, 553 KiB  
Article
HetSev: Exploiting Heterogeneity-Aware Autoscaling and Resource-Efficient Scheduling for Cost-Effective Machine-Learning Model Serving
by Hao Mo, Ligu Zhu, Lei Shi, Songfu Tan and Suping Wang
Electronics 2023, 12(1), 240; https://doi.org/10.3390/electronics12010240 - 3 Jan 2023
Cited by 2 | Viewed by 2803
Abstract
To accelerate the inference of machine-learning (ML) model serving, clusters of machines require the use of expensive hardware accelerators (e.g., GPUs) to reduce execution time. Advanced inference serving systems are needed to satisfy latency service-level objectives (SLOs) in a cost-effective manner. Novel autoscaling mechanisms that greedily minimize the number of service instances while ensuring SLO compliance are helpful. However, we find that it is not adequate to guarantee cost effectiveness across heterogeneous GPU hardware, and this does not maximize resource utilization. In this paper, we propose HetSev to address these challenges by incorporating heterogeneity-aware autoscaling and resource-efficient scheduling to achieve cost effectiveness. We develop an autoscaling mechanism which accounts for SLO compliance and GPU heterogeneity, thus provisioning the appropriate type and number of instances to guarantee cost effectiveness. We leverage multi-tenant inference to improve GPU resource utilization, while alleviating inter-tenant interference by avoiding the co-location of identical ML instances on the same GPU during placement decisions. HetSev is integrated into Kubernetes and deployed onto a heterogeneous GPU cluster. We evaluated the performance of HetSev using several representative ML models. Compared with default Kubernetes, HetSev reduces resource cost by up to 2.15× while meeting SLO requirements.

16 pages, 1186 KiB  
Article
User-Engagement Score and SLIs/SLOs/SLAs Measurements Correlation of E-Business Projects Through Big Data Analysis
by Solomiia Fedushko, Taras Ustyianovych, Yuriy Syerov and Tomas Peracek
Appl. Sci. 2020, 10(24), 9112; https://doi.org/10.3390/app10249112 - 20 Dec 2020
Cited by 33 | Viewed by 4530
Abstract
The COVID-19 lockdown caused a rapid shift to remote working and learning and created the need to develop and maintain e-commerce and web-education projects. However, the resulting increase in internet traffic has a direct impact on infrastructure and software performance. We study the problem of accurately and quickly identifying web-project infrastructure issues, bottlenecks, and overloads. The research aims to ensure the reliability and availability of a commerce/educational web project through system observability and Site Reliability Engineering (SRE) methods. We propose methods for technical condition assessment that correlate a user-engagement score with Service Level Indicator (SLI)/Service Level Objective (SLO)/Service Level Agreement (SLA) measurements to identify user satisfaction types along with the infrastructure state. Our solution helps to improve content quality and, mainly, to detect abnormal system behavior and poor infrastructure conditions. A straightforward interpretation of potential performance bottlenecks and vulnerabilities is achieved with a contingency table and correlation matrix developed for that purpose. We identify big data, system logs, and metrics as the central sources for detecting performance issues during web-project usage. Through analysis of an educational platform dataset, we found the main features of web-project content that drive high user engagement and provide value to services’ customers. According to our study, correlating SLOs/SLAs with other critical metrics, such as user satisfaction or engagement, improves early indication of potential system issues and avoids having users face them. These findings correspond to the concepts of SRE that focus on maintaining high service availability.
(This article belongs to the Special Issue Digital Transformation in Manufacturing Industry Ⅱ)
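The correlation step at the heart of this method, relating an engagement score to an SLI measurement, can be sketched with a plain Pearson coefficient (the data below are invented for illustration and are not from the paper's dataset):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient, used here to relate a
    user-engagement score to an SLI such as request latency."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative data: as p95 latency rises, engagement drops,
# yielding a strong negative correlation (close to -1).
latency_ms = [120, 150, 200, 320, 450]
engagement = [0.91, 0.88, 0.80, 0.62, 0.45]
print(round(pearson(latency_ms, engagement), 2))
```

A matrix of such coefficients over many metric pairs is what the paper's correlation matrix summarizes, flagging which SLIs track user satisfaction closely enough to serve as early warnings.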

21 pages, 374 KiB  
Article
A Taxonomy of Techniques for SLO Failure Prediction in Software Systems
by Johannes Grohmann, Nikolas Herbst, Avi Chalbani, Yair Arian, Noam Peretz and Samuel Kounev
Computers 2020, 9(1), 10; https://doi.org/10.3390/computers9010010 - 11 Feb 2020
Cited by 3 | Viewed by 5287
Abstract
Failure prediction is an important aspect of self-aware computing systems. Therefore, a multitude of different approaches has been proposed in the literature over the past few years. In this work, we propose a taxonomy for organizing works focusing on the prediction of Service Level Objective (SLO) failures. Our taxonomy classifies related work along the dimensions of the prediction target (e.g., anomaly detection, performance prediction, or failure prediction), the time horizon (e.g., detection or prediction, online or offline application), and the applied modeling type (e.g., time series forecasting, machine learning, or queueing theory). The classification is derived based on a systematic mapping of relevant papers in the area. Additionally, we give an overview of different techniques in each sub-group and address remaining challenges in order to guide future research.
(This article belongs to the Special Issue Applications in Self-Aware Computing Systems and their Evaluation)
