Article

Performance Evaluation of a Cloud Datacenter Using CPU Utilization Data

1 Department of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka 8208502, Japan
2 Graduate School of Information Science and Technology, Osaka University, Osaka 5650871, Japan
3 Graduate School of Advanced Science Engineering, Hiroshima University, Higashihiroshima 7398527, Japan
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(3), 513; https://doi.org/10.3390/math11030513
Submission received: 26 November 2022 / Revised: 26 December 2022 / Accepted: 5 January 2023 / Published: 18 January 2023

Abstract: Cloud computing and its associated virtualization have become the most vital architectures in modern computer system design. Given the popularity and progress of cloud computing across organizations, performance evaluation of cloud platforms is particularly significant, as it helps computer designers plan the system's capacity. This paper evaluates the performance of a cloud datacenter, Bitbrains, using a queueing model driven only by CPU utilization data. More precisely, a simple but non-trivial queueing model is used to represent the task processing of each virtual machine (VM) in the cloud, where the input stream is assumed to follow a non-homogeneous Poisson process (NHPP). The parameters of the arrival stream for each VM in the cloud are then estimated. Furthermore, the superposition of the estimated arrivals is applied to represent the CPU behavior of an integrated virtual platform. Finally, the performance of the integrated virtual platform is evaluated based on this superposition.

1. Introduction

With the vigorous growth of big data and large-scale data processing, traditional computing models can no longer meet daily computing needs [1]. Cloud computing and its associated virtualization are the most vital architectures for providing cloud services to users and have become the standard infrastructure for supporting Internet services [2]. In general, cloud computing has a service-oriented architecture, in which services are categorized into IaaS (Infrastructure-as-a-Service), PaaS (Platform-as-a-Service), and SaaS (Software-as-a-Service) according to the layer provided to clients [3]. Specifically, IaaS provides essential computing, storage, and networking resources on demand. PaaS allows users to access hardware and software computing platforms, such as virtualized servers and operating systems, over the internet. SaaS provides cloud users access to hosted applications and other hosted services over the internet. In recent decades, many companies have attempted to integrate such servers into a virtual server using the PaaS architecture to reduce server management and maintenance costs. For example, Bitbrains is a service provider specializing in hosted services and business computing for enterprises [4]. Its clients include banks (ING), credit card operators (ICS), insurers (Aegon), etc. Bitbrains hosts applications in the solvency domain, and examples of its application vendors are Towers Watson and Algorithmics.
PaaS cloud service providers need to gain insight into the relationship between the performance of the cloud service platform and the available resources, not only to meet users' performance needs but also to fully utilize the infrastructure and resources of the cloud service platform. For cloud service users, performance evaluation quantitatively assesses the various cloud services to ensure that their needs are met. For example, users can quantitatively compare the same service offered by different cloud service providers through performance evaluation and make the best service selection and decision. Generally, a PaaS cloud computing platform has a vast number of collaborative physical machines, each of which hosts multiple virtual machines (VMs). Additionally, performance evaluation of the integrated virtual platform during the design phase of a computer system can help the computer designer plan the system's capacity [5]. However, with the rapid increase in data, every upgrade to software or hardware (e.g., CPU, memory) comes with high risk and expense [6]. Performance evaluation that assesses the utility of a given upgrade facilitates cost reduction and construction optimization at a datacenter, while erroneous analysis ultimately leads to huge losses. Therefore, it is essential to estimate system performance from the statistics of existing servers.
Cloud services differ from traditional hosting in three main aspects: first, the cloud provides service on demand; second, cloud services are elastic, because users can use the services they want at any given time; third, the cloud provider fully manages the cloud services [7]. Queueing models provide an efficient way to simulate the behavior and evaluate the performance of a cloud datacenter. Queueing theory often models web applications as queues and VMs as facility (service) nodes [8]. The parameters of queueing models (e.g., arrival and service rates) can then be estimated as services arrive in a first-come-first-served (FCFS) manner [9]. Usually, a queueing model is written in the standard form A/B/S/K, where A and B represent the arrival and service distributions, respectively, and S and K are the number of service nodes and the queue capacity, respectively. For example, M/M/1/K means that the arrivals form a Poisson process, the service times follow an exponential distribution, the number of servers is one, and the queue capacity is K. Tasks sent to the cloud datacenter are usually served within a suitable waiting time and leave the queue when the service is over. However, despite their ability to represent the behavior of cloud datacenters, queueing models still face many challenging issues [10]. A cloud center usually has many service nodes, while traditional queueing models rarely consider the system size. Moreover, approximation methods are sensitive to the probability distributions of the arrival and service times and can be inaccurate. Furthermore, traditional queueing systems usually observe inter-arrival times or waiting times to estimate arrival rates for VMs, but in a cloud datacenter, task arrivals and waiting times are difficult to monitor and collect.
As a solution, our previous work [11] proposed an approach to estimate the arrival intensity of computer systems only from CPU utilization data. CPU utilization data is one of the most commonly used statistics to monitor CPU behavior during task execution, and most operating systems calculate CPU utilization by default. In that work, the CPU behavior of an existing server was modeled by an M_t/M/1/K queueing system, where the arrival stream follows a non-homogeneous Poisson process (NHPP) and the service times obey an exponential distribution. The NHPP is the best-known generalization of the Poisson process, in which the arrival intensity is given as a function of time t [12]. Therefore, an NHPP can approximate the task arrival process more accurately than a homogeneous Poisson process (HPP) [13]. For the Poisson process, the renewal process is the partial-sum process associated with independent, exponentially distributed random variables [14]. An alternative way to avoid the independent and identically distributed requirement is the Markovian arrival process (MAP) [15].
To reasonably plan the cloud computing platform and improve its performance, we aim in this paper to evaluate the performance of Bitbrains cloud servers in a PaaS architecture. More precisely, we use an M_t/M/1/K queueing model to represent the CPU behavior of each VM, which can dynamically capture the task arrival rate at different times. In addition, because the arrivals and waiting times of tasks in computer systems are difficult to monitor and collect, we estimate the arrival intensity of each VM in the cloud datacenter from CPU utilization data. Finally, the performance of the integrated virtual platform is evaluated by applying the superposition technique of NHPPs. The main contributions are as follows:
  • Performance evaluation of a cloud server: It is non-trivial and significant to evaluate the performance of a cloud datacenter (i.e., Bitbrains) to meet user performance needs and make full use of the resources of the PaaS cloud service platform.
  • Parameter estimation for a queueing model subject to NHPP arrivals using CPU utilization data: An NHPP is used as the arrival process of the queueing model, which can dynamically capture the task arrival rate at different times. Additionally, we use CPU utilization, the most commonly collected statistic, to estimate the parameters, because the task arrival and service processes and the waiting times of the computer system are unobservable.
  • Flexibility and scalability for performance evaluation: Any queueing model based on utilization data can be solved using our approach. In addition, our model can be combined with distributed computing to further improve the capability of performance evaluation.
The remainder of the paper is organized as follows. Related research is reviewed in Section 2. In Section 3, the cloud datacenter system (i.e., Bitbrains) is first introduced in detail, and the EM algorithm-based parameter estimation method is then proposed. In Section 4, we first estimate the parameters (i.e., the arrival intensity function) of the M_t/M/1/K queueing model and then evaluate the performance of Bitbrains using only CPU utilization data. Finally, the paper is concluded in Section 5.

2. Literature Review

In recent years, data has exploded with the rapid growth of computers and the Internet. As one of the solutions to cope with the era of big data, cloud computing has gained wide attention and application. It is an extremely worthwhile task to evaluate the performance of cloud computing platforms. For cloud computing platform designers, performance evaluation can help them decide the size of system memory, the number of CPUs, etc. For cloud computing providers, performance evaluation can help them allocate facilities and resources appropriately. For users, it can help them choose the right provider.

2.1. Queueing Models as Solutions

Only a small part of the literature involves the performance evaluation of cloud computing datacenters, and many researchers prefer to evaluate the performance of cloud computing centers using queueing models [16,17,18,19,20]. Moreover, most of the literature estimates the parameters of the queueing model by collecting data such as queue lengths or waiting times. For example, Thiruvaiyaru et al. [16] collected queue length data and estimated the parameters of an M/M/1 queueing model. Ross et al. [19] collected queue length data at successive time points and estimated the parameters of an M/M/c queueing system; they then generated density-dependent transition rates for Markov processes by scaling the arrival rates in the same order as the number of servers. Liu et al. [20] calculated queue lengths using the number of vehicles in the queueing system and proposed a real-time queue length estimation method based on probe vehicles. Waiting time data is another commonly used source for estimating the parameters of queueing models, since waiting times contain partial information about inter-arrival and service times. Basawa et al. [17] collected the waiting times of n successive customers from M/M/1 and M/E_k/1 queues, respectively. Fischer et al. [18] used the Laplace transform method to approximate the distribution of waiting times and then estimated the parameters of an M/G/1 queueing model. For performance evaluation of cloud computing systems, Khazaei et al. [10] modeled a cloud center as an M/G/m/m queueing system; they combined a transformation-based analytical model with an embedded Markov chain model to obtain the complete probability distributions of the response times and of the number of tasks, and then evaluated the performance of the system.
However, for observable queueing systems, collecting queue lengths and waiting times is time-consuming. For non-observable queueing systems, data such as queue lengths and waiting times, which directly reflect queue information, are difficult to collect at all. For example, the arrival times, inter-arrival intervals, and waiting times of successive tasks are not observable in a cloud computing system.
CPU utilization data is one of the most common statistics used to monitor CPU behavior during task execution, and most operating systems calculate CPU utilization by default. However, unlike queue lengths and waiting times, CPU utilization is aggregated data: CPU monitoring is not continuous but occurs at certain time intervals. Therefore, CPU utilization data is incomplete data, which increases the difficulty of parameter estimation for a queueing model. Moreover, utilization data does not reflect queue information as directly as queue lengths do; it reflects this information only implicitly. For example, high CPU utilization indicates that many tasks are waiting to be processed in a queue or that the current task is taking a long time to process. It is therefore more challenging to estimate parameters and evaluate performance from CPU utilization data. To the best of our knowledge, there are no other papers on queueing parameter estimation based on utilization data besides our previous work [11,15].

2.2. Non-Homogeneous Poisson Process

Most works assume the arrival process of tasks to be a homogeneous Poisson process (HPP) with a constant arrival rate. However, a homogeneous Poisson arrival process is often a poor choice in practice. For example, visits to a cloud computing datacenter are generally higher during working hours than in the evening and higher on weekdays than on weekends. The arrival process of tasks thus usually varies dynamically with time, i.e., it is a non-homogeneous Poisson process (NHPP) [12]. In general, parameter estimation for queueing systems with NHPP arrivals is more difficult than for HPP-based systems. Rothkopf and Oren [21] proposed an approximate method to estimate a dynamically varying arrival process of a queueing system. Heyman and Whitt [22] modeled an M_t/G/c queueing system to analyze the asymptotic behavior under a nonstationary arrival process, defining an intensity function λ(t) to represent the time-varying Poisson arrival rate. In addition, Green et al. [23] evaluated the performance of a queueing system with NHPP arrivals and exponential service times. Pant et al. [24] assumed an M_t/M/1 queueing system in which customers arrive with a sinusoidal arrival intensity function λ(t).

2.3. Parameter Estimation of Queueing Models

Maximum likelihood estimation (MLE) is the most commonly used estimation method for queueing models [25,26,27]. Wang et al. [26] proposed an M/M/R/N queueing model with R servers and used MLE to estimate the HPP arrival rate and the rate of the exponentially distributed service times. Amit et al. [27] collected the number of customers in an M/M/1 queueing system and then estimated the traffic intensity by MLE. We have stated above that CPU utilization data, unlike directly observable data, is incomplete data. Therefore, a statistical inference technique for incomplete data is required. Expectation maximization (EM) [28] is a useful algorithm that iteratively computes the MLE from partial data, and it is powerful for stochastic models with multiple parameters. An EM algorithm generally has two steps: the expectation step and the maximization step. The algorithm alternates between these two steps until the likelihood converges. Wu [29] verified the convergence of the EM algorithm theoretically, showing that the EM sequence converges to a unique MLE when the likelihood function is unimodal and differentiable. Rydén [30] estimated the parameters of a queueing model with a Markov-modulated Poisson process (MMPP) using MLE via an EM algorithm. Similarly, Basawa et al. [31] estimated the parameters of a GI/G/1 queueing system from waiting times using an EM algorithm. Okamura et al. [32] defined group data and estimated the arrival and processing rates of a queueing system with a Markovian arrival process (MAP) using an EM algorithm-based approach.
In this paper, we propose a novel statistical inference technique for incomplete data to evaluate the performance of cloud datacenters. Empirically, the task arrival process in a datacenter often exhibits a cyclical, recurring nature, with a cycle of a day, a week, a year, or another unit. For example, datacenter accesses are greater during business hours than in the early morning and evening, and weekday accesses are larger than weekend accesses. An NHPP can represent this cyclic characteristic well. Therefore, we model each virtual machine (VM) of a cloud datacenter as an M_t/M/1/K queueing system and estimate its parameters only from CPU utilization data using the EM algorithm. In a cloud datacenter, CPU utilization is the most commonly used statistic to monitor the behavior of each VM, and because any computer operating system can calculate CPU utilization by default, we do not need to spend time collecting information such as queue lengths and waiting times in the queueing system.

3. Methodology

In this section, we first briefly introduce the Bitbrains cloud datacenter. The system behavior can be modeled by an M t / M / 1 / K queueing model. Then, we define utilization data formally and approximate the NHPP to a series of HPPs. Finally, the details of the MLE optimization method based on an EM algorithm are described.

3.1. Bitbrains Cloud Datacenter

Bitbrains is a service provider that specializes in managed hosting and business computation for enterprises [33]. One of the typical applications of Bitbrains is financial reporting, which is used predominantly at the end of financial quarters. The workloads of Bitbrains follow master-worker models, where the workers are used to calculate Monte Carlo simulations [34]. For example, a customer would request a cluster of computing nodes to run such simulations. The request is accompanied by the following requirements: first, data transmission between the customer and the datacenter occurs through a secure channel; second, computing nodes are rented as VMs in the datacenter to provide predictable performance; third, high availability is guaranteed for running critical business simulations.
Bitbrains uses the standard VMware provisioning mechanisms to manage computing resources, such as dynamic resource scheduling and storage dynamic resource scheduling. In general, Bitbrains consists of three types of VMs: management servers, application servers, and computing nodes. The management servers are used for the daily operation of customer environments. Application servers are used as database servers, web servers, and head nodes. Computing nodes are mainly used to compute and simulate financial risk assessments. The CPU utilization data of Bitbrains used in this work were collected between August and September 2013 in two traces, which are described in Table 1.

3.2. Collection of CPU Utilization Data

CPU utilization is the most commonly used statistic to monitor the behavior of each VM in the cloud datacenter Bitbrains. We denote the length of each monitoring interval of a VM by Δt. CPU utilization can be considered as the ratio of the busy time to the total time of one monitoring interval, where the busy time is the cumulative time in which the server is processing a task. For each fixed time interval, a computer system or VM reports this time fraction as its utilization. Parameter estimation from utilization data is more challenging than estimation from directly observed queue data. Based on the behavior of the CPU utilization data, we assume that each time interval consists of an unobserved period followed by an observed period. Since the CPU cannot be monitored during an unobserved period, we can only collect CPU utilization data during an observed period. Each observation period is short (only a few milliseconds), so there is at most one change from busy to idle or from idle to busy during each observation period. Formally, let t_u and t_o be the lengths of the unobserved and observed periods of each time interval, respectively, and let B_t and I_t be the lengths of the busy and idle times in time slot t. Under the above assumptions, the CPU utilization for one time interval t_u + t_o is defined as

u = B_{t_o} / (B_{t_o} + I_{t_o}), \qquad (1)

where u is the CPU utilization for one time interval, B_{t_o} and I_{t_o} are the lengths of the busy and idle times in the observed period t_o, respectively, and t_o ≪ t_u.
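As a toy numerical check of Equation (1) (the durations below are hypothetical, not from the Bitbrains traces), the utilization of one observed window is a one-line computation:

```python
def cpu_utilization(busy_ms, idle_ms):
    """Utilization u = B / (B + I) over one observed window."""
    total = busy_ms + idle_ms
    if total == 0:
        raise ValueError("empty observation window")
    return busy_ms / total

# Hypothetical observed window of t_o = 10 ms: 4 ms busy, 6 ms idle.
u = cpu_utilization(4.0, 6.0)  # -> 0.4
```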

3.3. System Behavior as the M t / M / 1 / K Queueing Model

First, users send the task requests that need to be processed by Bitbrains. Then, Bitbrains allocates computing resources to each task. Note that user tasks are independent of each other. When the CPU of the VM is idle, the first assigned task can be sent directly to the CPU for processing; otherwise, it must wait in the buffer of size K. For a VM with a buffer size of K, arriving tasks that exceed K cannot enter the buffer for processing. The tasks waiting in the buffer are processed by the CPU in turn, and a served task leaves the VM once it is finished.
Therefore, the system behavior of Bitbrains can be represented by a queueing model: web applications are modeled as queues, and VMs are modeled as service nodes. We assume that the arrivals obey an NHPP with intensity function λ(t) and that the service times follow an exponential distribution with rate μ. As shown in Figure 1, the system can be modeled by an M_t/M/1/K queueing model.
In queueing theory, we usually use a continuous time Markov chain (CTMC) to formalize the behavior of a queueing model. For the M t / M / 1 / K queueing model, the infinitesimal generator matrix of the CTMC can be expressed as follows:
Q(t) = \begin{pmatrix}
-\lambda(t) & \lambda(t) & & \\
\mu & -(\mu+\lambda(t)) & \lambda(t) & \\
& \ddots & \ddots & \ddots \\
& & \mu & -\mu
\end{pmatrix}. \qquad (2)
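For concreteness, the generator Q(t) above can be assembled numerically. The sketch below builds the (K+1)-by-(K+1) matrix for a fixed arrival rate λ, i.e., for one interval of the piecewise-constant approximation used later; the rates are chosen only for illustration:

```python
import numpy as np

def generator(lam, mu, K):
    """(K+1)x(K+1) infinitesimal generator of an M/M/1/K birth-death chain
    with arrival rate lam and service rate mu (states = queue length 0..K)."""
    Q = np.zeros((K + 1, K + 1))
    for k in range(K):
        Q[k, k + 1] = lam      # arrival: k -> k+1
        Q[k + 1, k] = mu       # service completion: k+1 -> k
    np.fill_diagonal(Q, -Q.sum(axis=1))  # each row of a generator sums to zero
    return Q

Q = generator(lam=3.6, mu=3.0, K=20)
```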

3.4. Approximate NHPP as a Series of HPPs

The time-dependent intensity λ(t) of an NHPP is a function of time t and is hard to estimate directly. To simplify the model and the estimation, the intensity function λ(t) of an NHPP can be approximated by a series of independent HPPs. Specifically, assume that the total time for utilization data collection is T. T is divided into n (n ≥ 1) equal periods, each of length Δt; in practice, a large value of n is chosen. The n HPPs then approximate an NHPP with intensity λ(t). In the ith (1 ≤ i ≤ n) time interval, the arriving tasks obey an HPP with a constant arrival rate λ_i. The approximated piecewise-constant intensity function of the NHPP is

\lambda(t) = \begin{cases} \lambda_1 & (0 \le t \le \Delta t), \\ \lambda_2 & (\Delta t < t \le 2\Delta t), \\ \;\;\vdots & \\ \lambda_n & ((n-1)\Delta t < t \le T). \end{cases} \qquad (3)
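A minimal sketch of this piecewise-constant approximation (the interval count and the rates below are illustrative):

```python
import numpy as np

def piecewise_lambda(rates, dt):
    """Return lambda(t) as a step function: rates[0] on [0, dt],
    rates[i] on (i*dt, (i+1)*dt] for i >= 1."""
    rates = np.asarray(rates, dtype=float)

    def lam(t):
        i = min(int(np.ceil(t / dt)) - 1, len(rates) - 1)  # interval index
        return rates[max(i, 0)]
    return lam

# n = 3 intervals over T = 24 with hypothetical rates.
lam = piecewise_lambda([1.0, 2.5, 0.5], dt=8.0)
```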

3.5. Parameter Estimation of the M t / M / 1 / K Queueing Model

According to Equation (3), the infinitesimal generator matrix of Equation (2) can be decomposed into n independent infinitesimal generator matrices, where the ith matrix is denoted by Q_i (1 ≤ i ≤ n). Let Q_i^{(0)} denote the part of the ith generator matrix for CPU-state transitions that keep the CPU idle or keep it busy:

Q_i^{(0)} = \begin{pmatrix}
-\lambda_i & & & \\
& -(\mu+\lambda_i) & \lambda_i & \\
& \mu & \ddots & \ddots \\
& & \mu & -\mu
\end{pmatrix}. \qquad (4)

Similarly, let Q_i^{(1)} denote the part of the ith generator matrix for CPU-state transitions from idle to busy or from busy to idle:

Q_i^{(1)} = \begin{pmatrix}
0 & \lambda_i & \\
\mu & 0 & \\
& & O
\end{pmatrix}, \qquad (5)

where O is a zero matrix and

Q_i = Q_i^{(0)} + Q_i^{(1)}. \qquad (6)
Define the utilization data as D = (D_1, D_2, …, D_n), where the utilization data in the ith period Δt is D_i = (u_{i1}, u_{i2}, …, u_{ik}), 0 ≤ u_{ij} ≤ 1. Then the likelihood function of the utilization data is formulated by MLE as

L(λ_1, …, λ_n; D) = p L_1(λ_1; D_1) ⋯ L_n(λ_n; D_n) 1, \qquad (7)
L_i(λ_i; D_i) = L_i(u_{i1}) ⋯ L_i(u_{ik}). \qquad (8)

With Equations (4)–(6), the factors of Equation (8) can be expressed as

L_i(u) = exp(Q_i t_u) Λ_0 exp(Q_i^{(0)} (1−u) t_o) Q_i^{(1)} exp(Q_i^{(0)} u t_o) + exp(Q_i t_u) Λ_1 exp(Q_i^{(0)} u t_o) Q_i^{(1)} exp(Q_i^{(0)} (1−u) t_o), if 0 < u < 1, \qquad (9)
L_i(u) = exp(Q_i t_u) Λ_0 exp(Q_i^{(0)} t_o), if u = 0, \qquad (10)
L_i(u) = exp(Q_i t_u) Λ_1 exp(Q_i^{(0)} t_o), if u = 1, \qquad (11)

where p is the initial probability vector and 1 is a column vector of ones. Λ_0 and Λ_1 are (K+1)-by-(K+1) diagonal matrices:

Λ_0 = diag(1, 0, …, 0), \quad Λ_1 = diag(0, 1, …, 1). \qquad (12)

3.6. EM Algorithm for CPU Utilization Data

An EM algorithm is an effective machine learning algorithm for parameter estimation of queueing models with incomplete data. We have stated that CPU utilization data is incomplete data and have assumed that each time interval consists of an unobserved period followed by an observed period. Since data can be collected only during the observed periods, we use the EM algorithm to estimate the parameters. The EM algorithm finds the MLE of the M_t/M/1/K queueing model from incomplete observations. Its two steps are as follows:
  • Expectation step: The expected log-likelihood function is calculated using the posterior probabilities of the hidden variables:
    E[log p(D, U; θ)], \qquad (13)
    where D and U are the observed data and the missing data in the unobserved time intervals, respectively, and θ is the vector of parameters to be estimated.
  • Maximization step: The parameter θ is updated by maximizing the expected log-likelihood function found in the expectation step:
    θ ← arg max_θ E[log p(D, U; θ)]. \qquad (14)
Then, the parameter estimates from the maximization step are used as the parameters in the next expectation step to determine the distribution of the latent variables. The optimal parameters are obtained by iterating these two steps until convergence. The EM algorithm represents the transition rate λ_{i,j} as

λ_{i,j} = E[N_{i,j}] / E[S_i] = E[N_{i,j}^U + N_{i,j}^O | D] / E[S_i^U + S_i^O | D], \qquad (15)

where λ_{i,j} is the transition rate from state i to state j, N_{i,j} is the number of transitions from state i to state j, S_i is the sojourn time in state i, N_{i,j}^U and N_{i,j}^O are the numbers of transitions from state i to state j in the unobserved and observed periods, respectively, and S_i^U and S_i^O are the sojourn times in state i in the unobserved and observed periods, respectively.
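The paper's E- and M-steps operate on the matrix-exponential likelihood of Equations (9)–(11). As a self-contained illustration of the same alternation (a toy analogue, not the paper's estimator), consider EM for the rate of an exponential distribution when some sojourn times are right-censored, which mirrors the unobserved periods: the E-step fills in the expected duration of each censored sample, and the M-step re-estimates the rate as events over expected total time, exactly as in Equation (15):

```python
def em_exponential_rate(observed, censored, iters=200):
    """EM for an exponential rate with right-censored samples.
    E-step: E[T | T > c] = c + 1/rate for each censoring point c.
    M-step: rate = n / (expected total time)."""
    rate = 1.0  # initial guess
    n = len(observed) + len(censored)
    for _ in range(iters):
        expected_total = sum(observed) + sum(c + 1.0 / rate for c in censored)
        rate = n / expected_total
    return rate

# Converges to the closed-form MLE: (#uncensored events) / (total observed time).
rate = em_exponential_rate(observed=[2.0, 3.0, 1.0], censored=[4.0, 5.0])  # -> ~0.2
```

Here the fixed point of the iteration is 3 / (2 + 3 + 1 + 4 + 5) = 0.2, the known censored-data MLE, illustrating how the E/M alternation recovers the estimator despite the missing information.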

3.7. The Number of Time Intervals

n, the total number of divided time intervals within the data collection time T, is a hyperparameter of the proposed approach; different values of n determine different models. If n is too small, the estimated intensity function does not reflect the non-homogeneity property; for example, n = 1 means that the NHPP reduces to an HPP. If n is too large, the estimated intensity function overreacts to small fluctuations, resulting in overfitting. To choose an appropriate n, we use the Akaike Information Criterion (AIC) [35] to quantify the goodness of fit of the candidate models. Note that a smaller AIC indicates a better fit, so we choose the n at which the AIC attains its minimum value:

AIC = −2 LLF + 2 × (number of parameters), \qquad (16)

where LLF denotes the maximum value of the log-likelihood function.
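The AIC-based choice of n can be sketched as follows; the log-likelihood values are hypothetical, and each candidate model with n intervals contributes n parameters (its rates λ_1, …, λ_n):

```python
from math import inf

def select_n_by_aic(llf_by_n):
    """Pick the number of intervals n minimizing AIC = -2*LLF + 2*n,
    where llf_by_n maps a candidate n to its maximized log-likelihood."""
    best_n, best_aic = None, inf
    for n, llf in llf_by_n.items():
        aic = -2.0 * llf + 2.0 * n
        if aic < best_aic:
            best_n, best_aic = n, aic
    return best_n, best_aic

# Hypothetical maximized log-likelihoods for candidate models n = 1, 2, 3.
best_n, best_aic = select_n_by_aic({1: -3725.0, 2: -3724.8, 3: -3724.7})
```

With these numbers the extra parameters of n = 2 and n = 3 do not buy enough likelihood, so the HPP-like model n = 1 wins, matching the behavior seen for most VMs in Section 4.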

3.8. Superposition of Arrival Intensities

In the PaaS environment, the cloud server of Bitbrains provides computer platforms (e.g., CPUs) as a service by using a hardware virtualization technique. Therefore, the arrival process for the non-virtual CPU can be estimated by a superposition of the arrival processes of the virtual servers. Formally, define λ_i(t) (i = 1, …, m) as the arrival intensity of the NHPP for the ith virtualized platform. Since the CPU task arrival processes of the virtualized platforms can be regarded as independent stochastic processes, the CPU task arrival process in the PaaS is given by

\lambda_{all}(t) = \sum_{i=1}^{m} \lambda_i(t). \qquad (17)
The procedure for the parameter estimation of the M_t/M/1/K model is summarized in Algorithm 1.
Algorithm 1 Parameter estimation procedure for the M_t/M/1/K model
Step 1:
Divide the total time interval [0, T] into n fixed periods: 0 = t_0 < t_1 < ⋯ < t_n = T, with t_i = iΔt.
Step 2:
Approximate λ(t) by a piecewise-constant intensity function: λ(t) ≈ λ_i (t_{i−1} < t ≤ t_i).
Step 3:
Determine the parameters by maximizing the log-likelihood function: (λ̂_1, …, λ̂_n) = arg max_{λ_1, …, λ_n} log L(λ_1, …, λ_n; D).
Step 4:
Select the optimal model by minimizing the AIC: AIC = −2(log-likelihood − n).
Step 5:
Evaluate the performance of the cloud datacenter (e.g., the average response time).
Finally, we evaluate the performance of the integrated platform by using the estimated parameters, such as the arrival intensity and service rate.

4. Results

We randomly select five VMs from the Rnd trace of Bitbrains and estimate the arrival intensities of their M_t/M/1/K queueing models from CPU utilization data. Then, we evaluate the average response time of the superposed platform. The service rate of the exponential distribution is set to μ = 3, and the buffer size of the M_t/M/1/K queueing model is set to K = 20. Because Bitbrains monitors CPU utilization every 0.3 s, we fix the unobserved and observed time lengths to t_u = 0.29 s and t_o = 0.01 s, respectively.
Figure 2 shows the CPU utilization data collected from the five VMs. The CPU utilization collected from VM 1, VM 2, and VM 3 is dense; their utilization values vary drastically within the intervals [0, 0.2], [0, 0.02], and [0, 0.1], respectively. Compared with VM 1, VM 2, and VM 3, the CPU utilization data collected from VM 4 and VM 5 are sparser, with their utilization varying within the interval [0, 0.1]. Moreover, the five sets of data appear to have a certain periodicity, which is one of the reasons we choose an NHPP as the arrival stream of Bitbrains.

4.1. Results of Parameter Estimation

To determine the optimal value of n, the AIC values are calculated. Table 2 exhibits the AIC values for n = 1, 2, …, 20 for the five VMs. From the table, the optimal number of time intervals for each VM is obtained where its AIC is smallest (i.e., 7448.08 for VM 1, 4428.04 for VM 2, 7953.98 for VM 3, 886.04 for VM 4, and 7957.62 for VM 5). The optimal values of n are n_1 = 1, n_2 = 1, n_3 = 1, n_4 = 19, and n_5 = 1 for the five VMs. The five estimated intensity functions are shown in Figure 3.
From Figure 3, we find that the task arrival rates for VM 1, VM 2, VM 3, and VM 5 are 3.60, 3.08, 3.72, and 0.43, respectively. In other words, these four VMs follow HPPs rather than NHPPs. Furthermore, since the AIC of VM 4 is smallest when n = 19, the NHPP arrivals of VM 4 can be approximated by 19 HPPs with different arrival rates, and the arrival rate reaches its maximum in the 17th time interval. In summary, the arrival rate estimated from the CPU utilization of VM 3 is the largest, and its arrival process obeys an HPP, while the arrival rate estimated for VM 4 is the smallest, and its arrivals obey an NHPP.

4.2. Results of Performance Evaluation

Finally, we evaluate the performance of the integrated system using the estimated arrival rates and intensity functions. According to Equation (17), we calculate the integrated arrival intensity function λ_all(t). Using λ_all(t), we simulate arrival streams whose intensity obeys λ_all(t), which gives the arrival time of each task. Then, by sampling processing times from an exponential distribution, we simulate the CPU processing time of each task. From the arrival and processing times, we calculate the average response time of the arrival stream. The procedure is shown in Algorithm 2.
Algorithm 2 Performance evaluation for the M_t/M/1/K model
Step 1: Calculate the intensity function λ_all(t) of the integrated system, which obeys an NHPP, according to Equation (17): λ_all(t) = Σ_{i=1}^{m} λ_i(t).
Step 2: Simulate the arrival times (t_arr) whose arrival intensity obeys λ_all(t): t_arr,1, t_arr,2, …, t_arr,1000.
Step 3: Simulate the service times (t_ser) from an exponential distribution with service rate μ: t_ser,1, t_ser,2, …, t_ser,1000.
Step 4: Calculate the average response time (T_res).
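Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the authors' code: the per-VM intensities are stand-ins (the four HPP rates from Section 4.1 plus one hypothetical piecewise-constant NHPP standing in for VM 4), and arrivals are drawn by Lewis–Shedler thinning, which needs an upper bound lam_max on λ_all(t).

```python
import random

def superpose(intensities):
    """Step 1: integrated intensity lambda_all(t) = sum of per-VM intensities."""
    return lambda t: sum(lam(t) for lam in intensities)

def simulate_nhpp(lam, lam_max, t_end, rng):
    """Step 2: Lewis-Shedler thinning; lam_max must bound lam(t) on [0, t_end]."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(lam_max)        # candidate point from a dominating HPP
        if t > t_end:
            return arrivals
        if rng.random() < lam(t) / lam_max:  # keep with probability lam(t)/lam_max
            arrivals.append(t)

rng = random.Random(42)
vm_rates = [lambda t: 3.60, lambda t: 3.08, lambda t: 3.72, lambda t: 0.43,
            lambda t: 2.0 if t < 5.0 else 4.0]  # last entry: a toy NHPP
t_arr = simulate_nhpp(superpose(vm_rates), lam_max=15.0, t_end=10.0, rng=rng)
```

For Step 3, one service time per arrival can then be drawn with rng.expovariate(mu).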
Table 3 shows an example of the calculation of response times, where the response time of a task is the time it waits before its service begins. During the processing of the first arrival (P1), P2 has to wait until 8 ms before it can be processed. Since the arrival time of P2 is 1 ms, the response time of P2 is 8 − 1 = 7 ms. Similarly, P3 has to wait until P1 and P2 are served (i.e., until 8 + 7 = 15 ms). Since the arrival time of P3 is 2 ms, the response time of P3 is 15 − 2 = 13 ms.
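The calculation in Table 3 can be reproduced in a few lines; as in the example above, "response time" here means the waiting time before service starts under FCFS.

```python
def response_times(arrivals_ms, services_ms):
    """FCFS single server: each task waits until the server frees up."""
    free_at, waits = 0, []
    for t_arr, t_ser in zip(arrivals_ms, services_ms):
        start = max(t_arr, free_at)   # service begins when the server is idle
        waits.append(start - t_arr)   # response (waiting) time of this task
        free_at = start + t_ser       # server stays busy until completion
    return waits

print(response_times([0, 1, 2], [8, 7, 10]))  # [0, 7, 13], matching Table 3
```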
We run ten simulation loops and calculate the average response times. The average response times of the superposition of the five VMs are evaluated by varying the service rate from μ = 11 to μ = 30. The results are given in Table 4 and, to make the trend more intuitive, plotted in Figure 4.
When the service rate is small (μ ≤ 15), the average response times of the integrated system are very large (T_res ≥ 1.687 s). As the service rate increases (μ ≥ 16), the effect of μ on the average response time becomes small. In other words, a designer of this cloud computing platform needs to provision at least 16 CPUs to meet user requirements, and for a provider, 16 CPUs suffice to deliver the service; 16 CPUs can therefore be allocated to all users in the interval [0, T]. A user of the cloud platform can then choose an appropriate number of CPUs to balance performance and cost.
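Given the measurements in Table 4, this provisioning rule amounts to scanning for the smallest service rate whose average response time stays below a target. The 1 s threshold below is an assumed service-level objective for illustration, not one stated in the paper.

```python
# Average response times from Table 4: service rate mu -> T_res in seconds
t_res = {11: 12.680, 12: 8.127, 13: 4.904, 14: 3.057, 15: 1.687,
         16: 0.815, 17: 0.621, 18: 0.554, 19: 0.311, 20: 0.354,
         21: 0.196, 22: 0.168, 23: 0.150, 24: 0.156, 25: 0.116,
         26: 0.113, 27: 0.103, 28: 0.105, 29: 0.089, 30: 0.085}

def min_capacity(t_res, target_s):
    """Smallest mu whose measured average response time meets the target."""
    feasible = [mu for mu, t in sorted(t_res.items()) if t <= target_s]
    return feasible[0] if feasible else None

print(min_capacity(t_res, target_s=1.0))  # 16
```

A linear scan is used because the simulated times are not strictly monotone in μ (e.g., μ = 19 vs. μ = 20 in Table 4).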

5. Conclusions

In this paper, we have modeled the behavior of five VMs of Bitbrains using an M_t/M/1/K queueing model. In particular, the model parameters were estimated by approximating an NHPP with a series of discrete HPPs and applying maximum likelihood estimation via the EM algorithm. The performance of the integrated virtual platform was evaluated based on the superposition of the estimates for the five VMs.
However, our proposed approach has a main limitation. In general, a cloud datacenter contains a large number of physical and virtual machines. Because the arrival and service rates of the tasks can be computed independently for each virtual machine, the parameters could in principle be estimated in a distributed manner. Due to hardware limitations, however, we could not compute the parameters of every virtual machine; in this paper, we estimated the parameters of five VMs.
In the future, we would like to address the above limitation first, by evaluating all the arrival processes of the VMs in Bitbrains offline in a distributed manner. Because of a fundamental weakness of the EM algorithm, its iterative convergence is time-consuming, so we would adopt a faster optimizer, such as Adam, for the parameter estimation. The performance of the whole cloud computing platform could then be evaluated, which is meaningful for both cloud service providers and users.
In addition, an MAP/M/1/K assumption will be considered to estimate the arrival rates and evaluate the system performance. As a generalization of the NHPP, an MAP takes the dependency between consecutive arrivals into account and is often used to model complex, bursty, and correlated traffic streams. Therefore, we would like to concentrate on MAP parameter estimation for quasi-birth-death queueing systems using utilization data.

Author Contributions

C.L., J.Z., H.O. and T.D. conceived the original idea for the study, analyzed the experiment results, and revised the manuscript. C.L. performed the experiments, wrote the manuscript, analyzed the data, and validated the experiment results. All authors have read and approved the submitted manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

Abbreviation: Meaning
IaaS: Infrastructure-as-a-Service
PaaS: Platform-as-a-Service
SaaS: Software-as-a-Service
VM: Virtual machine
ME: Moment estimates
FCFS: First-come-first-served
HPP: Homogeneous Poisson process
NHPP: Non-homogeneous Poisson process
MAP: Markovian arrival process
LLF: Log-likelihood function
EM: Expectation maximization
AIC: Akaike's information criterion

Notation: Meaning
M_t: Non-homogeneous Poisson process
K: Capacity of a queueing system
E_k: Erlang distribution
G: Geometric distribution
λ: Arrival rate of an HPP
λ(t): Intensity function of an NHPP
μ: Service rate of an exponential distribution
Q(t): Infinitesimal generator matrix of an NHPP
λ_i: Arrival rate of the i-th time interval
T: Total observation time interval
n: Number of time intervals
Δt: A time interval
t_u: Time length of an unobserved period
t_o: Time length of an observed period
B_t: Time length of busy time
I_t: Time length of idle time
O: Zero matrix
D: Utilization data
D_i: Utilization data in the i-th time period
u_i: i-th utilization sample of the observable period
p: Initial probability vector
Λ_0: (K+1)-by-(K+1) block matrix
Λ_1: (K+1)-by-(K+1) block matrix
U: Missing data in an unobserved time interval
θ: Vector of parameters to be estimated
λ_{i,j}: Arrival rate from state i to state j
N_{i,j}: Number of transitions from state i to state j
S_i: Sojourn time in state i
N^U_{i,j}: Number of transitions from state i to state j in an unobserved time period
N^O_{i,j}: Number of transitions from state i to state j in an observed time period
S^U_i: Sojourn time in state i in an unobserved period
S^O_i: Sojourn time in state i in an observed period
λ_all(t): Integrated intensity function of the virtual servers
λ̂_i: i-th estimated arrival rate of a series of HPPs

Figure 1. An M_t/M/1/K queueing model.
Figure 2. CPU utilization data collected from five VMs, named VM 1 to VM 5.
Figure 3. Estimated intensities of the VMs.
Figure 4. Variation curve of the average response times with service rate μ.
Table 1. Workload traces of the Bitbrains cloud datacenter.

Name of Trace | #VMs | Period of Data Collection | Storage Technology | Memory Size (GB) | Cores
fastStorage | 1250 | 1 month | SAN | 17,729 | 4057
Rnd | 500 | 3 months | NAS and SAN | 5485 | 1444
Total | 1750 | | | 23,214 | 5501
Table 2. AIC values of the NHPP models with n = 1, 2, …, 20 for the five VMs.

n | VM 1 LLF | VM 1 AIC | VM 2 LLF | VM 2 AIC | VM 3 LLF | VM 3 AIC | VM 4 LLF | VM 4 AIC | VM 5 LLF | VM 5 AIC
1 | 3723.04 | 7448.08 | 2213.02 | 4428.04 | 3975.99 | 7953.98 | −592.64 | 1187.28 | 3977.81 | 7957.62
2 | 3724.93 | 7453.86 | 2258.62 | 4521.24 | 3975.99 | 7955.98 | −568.31 | 1140.62 | 3977.81 | 7959.62
3 | 3727.64 | 7461.28 | 2252.30 | 4510.60 | 3976.78 | 7959.56 | −543.53 | 1093.06 | 3978.60 | 7963.20
4 | 3728.80 | 7465.60 | 2259.45 | 4526.90 | 3975.99 | 7959.98 | −511.13 | 1030.26 | 3977.81 | 7963.62
5 | 3730.77 | 7471.54 | 2259.37 | 4528.74 | 3975.99 | 7961.98 | −490.91 | 991.82 | 3977.81 | 7965.62
6 | 3731.20 | 7474.40 | 2258.76 | 4529.52 | 3974.40 | 7960.80 | −493.50 | 999.00 | 3976.22 | 7964.44
7 | 3733.26 | 7480.52 | 2260.99 | 4535.98 | 3974.40 | 7962.80 | −498.08 | 1010.2 | 3976.22 | 7966.44
8 | 3736.95 | 7489.90 | 2261.39 | 4538.78 | 3975.99 | 7967.98 | −510.05 | 1036.1 | 3977.81 | 7971.62
9 | 3735.16 | 7488.32 | 2261.14 | 4540.28 | 3972.01 | 7962.02 | −498.33 | 1014.7 | 3973.83 | 7965.66
10 | 3741.32 | 7502.64 | 2262.18 | 4544.36 | 3975.99 | 7971.98 | −450.52 | 921.04 | 3977.81 | 7975.62
11 | 3741.63 | 7505.26 | 2270.92 | 4563.84 | 3979.96 | 7981.92 | −440.00 | 902.00 | 3981.78 | 7985.56
12 | 3728.69 | 7481.38 | 2256.75 | 4537.50 | 3969.62 | 7963.24 | −440.40 | 904.80 | 3971.44 | 7966.88
13 | 3727.72 | 7481.44 | 2262.06 | 4550.12 | 3969.62 | 7965.24 | −457.46 | 940.92 | 3971.44 | 7968.88
14 | 3733.88 | 7495.76 | 2261.93 | 4551.86 | 3974.40 | 7976.80 | −463.62 | 955.24 | 3976.22 | 7980.44
15 | 3735.56 | 7501.12 | 2263.24 | 4556.48 | 3972.01 | 7974.02 | −456.38 | 942.76 | 3973.83 | 7977.66
16 | 3740.05 | 7512.10 | 2258.22 | 4548.44 | 3969.62 | 7971.24 | −458.24 | 948.48 | 3971.44 | 7974.88
17 | 3749.15 | 7532.30 | 2265.31 | 4564.62 | 3974.40 | 7982.80 | −432.36 | 898.72 | 3976.22 | 7986.44
18 | 3755.99 | 7547.98 | 2268.34 | 4572.68 | 3979.17 | 7994.34 | −440.88 | 917.76 | 3980.99 | 7997.98
19 | 3753.52 | 7545.04 | 2263.81 | 4565.62 | 3973.60 | 7985.20 | −424.02 | 886.04 | 3975.42 | 7988.84
20 | 3759.30 | 7558.60 | 2266.39 | 4572.78 | 3975.99 | 7991.98 | −429.57 | 899.14 | 3977.81 | 7995.62
Table 3. An example of the calculation of response times.

Process | Arrival Time (t_arr) | Service Time (t_ser) | Response Time (t_res)
P1 | 0 ms | 8 ms | 0 ms
P2 | 1 ms | 7 ms | 7 ms
P3 | 2 ms | 10 ms | 13 ms
Table 4. Average response times of the M_t/M/1/K queueing model.

Service Rate μ | T_res (s) | Service Rate μ | T_res (s)
11 | 12.680 | 21 | 0.196
12 | 8.127 | 22 | 0.168
13 | 4.904 | 23 | 0.150
14 | 3.057 | 24 | 0.156
15 | 1.687 | 25 | 0.116
16 | 0.815 | 26 | 0.113
17 | 0.621 | 27 | 0.103
18 | 0.554 | 28 | 0.105
19 | 0.311 | 29 | 0.089
20 | 0.354 | 30 | 0.085

Share and Cite

MDPI and ACS Style

Li, C.; Zheng, J.; Okamura, H.; Dohi, T. Performance Evaluation of a Cloud Datacenter Using CPU Utilization Data. Mathematics 2023, 11, 513. https://doi.org/10.3390/math11030513

