Machine Learning Failure-Aware Scheme for Profit Maximization in the Cloud Market

Igried, Bashar; Al-Serhan, Atalla Fahed; Alsarhan, Ayoub; Aljaidi, Mohammad; Aldweesh, Amjad

doi:10.3390/fi15010001

by Bashar Igried ¹, Atalla Fahed Al-Serhan ², Ayoub Alsarhan ³, Mohammad Aljaidi ⁴,* and Amjad Aldweesh ⁵,*

¹ Department of Computer Science and Applications, Faculty of Prince Al-Hussein Bin Abdallah II for Information Technology, The Hashemite University, Zarqa 13133, Jordan
² Department of Business Administration, Al-Bayt University, Al-Mafraq 25113, Jordan
³ Department of Information Technology, Faculty of Prince Al-Hussein Bin Abdallah II for Information Technology, The Hashemite University, Zarqa 13133, Jordan
⁴ Department of Computer Science, Zarqa University, Zarqa 13110, Jordan
⁵ College of Computing and Information Technology, Shaqra University, Riyadh 11911, Saudi Arabia
* Authors to whom correspondence should be addressed.

Future Internet 2023, 15(1), 1; https://doi.org/10.3390/fi15010001

Submission received: 24 October 2022 / Revised: 24 November 2022 / Accepted: 16 December 2022 / Published: 20 December 2022

(This article belongs to the Section Network Virtualization and Edge/Fog Computing)


Abstract:

A successful cloud trading system requires suitable financial incentives for all parties involved. Cloud providers in the cloud market provide computing services to clients in order to perform their tasks and earn extra money. Unfortunately, the applications in the cloud are prone to failure for several reasons. Cloud service providers are responsible for managing the availability of scheduled computing tasks in order to provide high-level quality of service for their customers. However, the cloud market is extremely heterogeneous and distributed, making resource management a challenging problem. Protecting tasks against failure is a challenging and non-trivial mission due to the dynamic, heterogeneous, and largely distributed structure of the cloud environment. The existing works in the literature focus on task failure prediction and neglect the remedial (post) actions. To address these challenges, this paper suggests a fault-tolerant resource management scheme for the cloud computing market in which the optimal amount of computing resources is extracted at each system epoch to replace failed machines. When a cloud service provider detects a malfunctioning machine, they transfer the associated work to new machinery.

Keywords:

cloud market; fault tolerance; task failure

1. Introduction

Recently, cloud computing has been recognized as an innovative strategy to increase cloud provider (CP) profit while satisfying a diverse range of consumers worldwide. Cloud computing is predicated on the idea that consumers can use a CP’s computing resources in exchange for payment [1,2,3]. Long-term success of cloud computing necessitates numerous technological, economic, and data security advancements. It is critical, in particular, to establish a proper mechanism that provides sufficient incentives for CPs to provide their computing resources for client sharing. Market-driven cloud computing is a viable approach for addressing the issue of client and CP incentives. In the cloud market, CPs lease their computing resources (CRs) temporarily to clients all over the world [1,4,5,6]. Access to CRs is often unpredictable, unlike the supply of more common goods. Furthermore, when working in an online environment such as the cloud, a task may fail for a variety of reasons, including software bugs, hardware failure, and insufficient resources. According to the Google cloud status dashboard [7], infrastructure problems caused YouTube and Gmail to be down for an hour on 14 December 2020. When tasks fail, the customer’s Quality of Service (QoS) suffers dramatically. As a result, it is critical to take actions that can assist in compensating and protecting clients when task failure occurs [6,7,8,9]. Due to task failure, the availability of CRs is unknown. CPs typically cannot guarantee a service that meets their clients’ expectations. From the perspective of a CP, the goal is to gain maximum profit, which can be obtained by serving the clients’ requests within the agreed-upon deadline. To prevent a penalty for a delay, the deadline constraint should be met. Consequently, CPs need to be ready for any CR failure that could cause a violation of the deadline constraint. Furthermore, CPs should take steps to lower customer queue wait times.
To address this issue, we propose a novel method that ensures the QoS of all requests while drastically reducing resource waste. Earlier research has not addressed the needs and applications for fault-tolerant computing, as well as fundamental subjects, components, and trading metrics in the context of fault tolerance in the cloud. This paper investigates fault tolerance in the cloud market in a straightforward manner. To that end, we study short-term trading in the cloud market, where trade takes place between the CP (a monopoly supplier) and multiple clients. In this market, each client comes to an arrangement, known as a contract, with the CP. The contract specifies the following key elements:

The client’s demand (i.e., the number of CRs);
Payment in a given period;
Rental period.

Clients in the cloud market rent CRs in real time and on demand through an auction. Clients compete for CRs, and through pre-defined contracts, CPs insure clients against the uncertainty of future supply. The main concern of this research is to boost the CP’s profit. Cloud market information, which includes service demand, supply, and customer bids, is typically erratic. This problem is formulated as a profit maximization problem. The following are the main contributions:

A new modeling and solution technique is proposed to handle task failure for clients. In this work, we tackle the problem of profit maximization in cloud markets under stochastic network information. Clients’ satisfaction is guaranteed by providing the committed number of CRs to guarantee the QoS for clients. The CP reserves a few CRs to deal with machine failure, and new requests are served based on an auction policy that guarantees the availability of CRs.
Under stochastic network information, we extract the optimal number of CRs that can be used to replace failed CRs.
We study the CP’s profit and QoS constraints for clients under stochastic cloud market information.

The remainder of this article is organized as follows: First, the related work and our contributions are introduced in Section 2. Next, the cloud market is presented in Section 3. We describe the proposed fault-tolerant trading scheme in Section 4. Then, we present the performed tests and show the performance of the trading scheme under different conditions in Section 5. Finally, the article is concluded in Section 6.

2. Related Work

Trading is disrupted when machines fail in the cloud market, decreasing consumer confidence in online purchases. It is possible for any machine, process, or part of a network to malfunction. The result is dissatisfied customers who are less likely to return and less likely to shop online in the future. The success of CR trading depends on the efficient handling of faults through the creation of new fault tolerance mechanisms to safeguard customers from any potential outages. A CP can still satisfy the request even if some of the CRs fail thanks to the fault-tolerant scheme employed [10,11]. A new scheme for replacing failed machines is proposed in [12]. The main concern of the proposed scheme is improving the reliability of cloud services based on the replication-based fault tolerance method. The three components of the proposed methodology are the selection of a host server, the optimization of the placement of virtual machines, and the selection of a recovery strategy.

The authors of [13] propose a new scheme to optimize cloud-based reliability in a versatile and adaptable manner. A peer-to-peer checkpointing method was used, which allows clients’ consistency points and levels to be optimized in light of each one’s unique needs and the full range of data center resources. In [14], the authors propose a new scheme for dealing with machine failure by migrating a job to new machines if the current set of machines is unable to finish the job. To identify the most suitable replacement virtual machines (VMs) for a failed task, the authors of [15] propose a new resource-aware virtual machine migration technique. A CP selects the most appropriate target virtual machines by keeping an eye on resource utilization and job arrival rate. In [16], the authors propose a new model to allocate transfer and compression rate to each virtual machine in order to reduce job migration time. To manage job migration, geometric programming was used. This requires allocation of transfer rate to these VMs, which is usually done such that the total migration time, the total downtime, or both are minimized without considering the penalty imposed for the service downtime during migration.

In order to determine whether or not CRs are alive, the majority of today’s methods, such as system layer heart beating, rely on a single, unreliable detector. Regardless of the nature of the fault being detected, there is a minimum amount of time that must elapse before the results of a single unreliable detector can be trusted. There are, however, other detectors that can find many faults much faster. Considering the need for rapid fault detection in cloud computing environments, the authors in [17] propose an online liveness fault detection mechanism that integrates existing detectors. In [18], the authors propose a new scheme to manage resources in the cloud market and to reduce the service level agreement violation, cost, energy usage, and time using fuzzy logic. In [19], various techniques and architecture of fault tolerance systems are discussed. The authors describe the procedures of handling failure in the cloud market. In [20], the authors introduce a fault-tolerant framework for highly efficient cloud-based computing. To speed up the execution of computationally intensive programs, the authors suggest using process level redundancy (PLR) techniques. Previous research on fault tolerance in the cloud is summarized in [21]. Recent cloud-based environments have introduced novel difficulties in improving fault tolerance while also presenting novel opportunities for the creation of novel strategies, architectures, and standards. In [22], the authors suggest a new scheme for calculating the deadline miss rate in the cloud market. The proposed scheme’s main aim is to maximize the CP’s profit by using the proposed method for multi-server configuration. The authors suggest a new scheme in [23] for service pricing that takes into account the consumer perceived value. The suggested scheme enables CPs to accurately estimate supply and demand in the cloud market. Furthermore, the method determines the best multi-server configuration to maximize the CP’s profit. 
Authors propose new models for cloud service revenue and costs in [24]. The trading cloud resources problem was formulated as a profit optimization problem, and a heuristic method based on a grouped grey wolf optimizer (GWO) was proposed to extract the optimal multi-server setup for a given client demand. In [25], the authors analyzed a deadline constraint that may affect the profit of CPs. Furthermore, a new mathematical model was proposed to analyze the relationship between customer satisfaction and the revenues of CPs.

It can be noticed from the previous discussion that in the cloud market there are different motivations for clients and the CP. These motivations for both sides should be evaluated and appropriately addressed so that a more thorough CR trading algorithm can be developed, which might offer both parties sufficient incentives to stay and engage with the cloud, leading to a sustainable system. However, the majority of past research has approached this issue from the incentive of either the clients or the CP, but not both. In contrast, this work attempts to investigate the CR trading problem in cloud computing by addressing the key motives for both parties, namely, maximizing CP profit and meeting client expectations.

When figuring out the CRs redundancy strategy in the cloud market, many state-of-the-art methods fail to consider the issue of the network’s potentially enormous resource consumption. Moreover, traditional approaches could not adjust to the market shift toward CR failure in the cloud. With this in mind, we propose a novel adaptive fault-tolerant scheme, distinct from existing approaches, in which the optimal number of CRs to be used for replacing a failed CR is calculated at each system epoch. Scalability is achieved by grouping customers into manageable clusters and having a central CP oversee all trading activity for each group. For the trading of CRs, our scheme can be implemented with little to no additional infrastructure. In the event of CRs failure, the job can continue to run thanks to migration strategies proposed in the literature. Nonetheless, shoddy migrations lengthen the migration process, cause disruptions in service, and reduce application performance.

3. The Cloud Market and Problem Formulation

We consider a cloud market with one CP and multiple clients. The CP owns K CRs to serve clients. Depending on CR availability and the demand for CRs, the CP serves clients. In the cloud market, we consider short-term CR trading, where the CP leases the CRs to clients on an availability basis. The main motivation for adopting short-term CR trading is that CR availability changes randomly due to the uncertainty and variability of service demand.

$V_{i}^{t}$ denotes the availability of the ith CR at time $t$. $V_{i}^{t} = 1$ indicates that the ith CR is idle and is not used by any client at time $t$. The supply of service is defined as the total number of CRs that can be used to serve clients. Service supply at time $t$ can be expressed as follows:

S_{t} = \{V_{i}^{t} | V_{i}^{t} = 1\}

(1)

The size of total supply $Z$ at time $t$ can be expressed as follows:

Z = |S_{t}| = \sum_{i = 1}^{N} V_{i}^{t}

(2)

where N is the maximum number of CRs that the CP may utilize to support the cloud market customers. Due to the uncertainty of clients’ demand for service, $V_{i}^{t}$ changes randomly across both time and CRs. The demand for service at time $t$ can be represented as follows:

D_{t} = \{d_{1}^{t}, d_{2}^{t}, \dots, d_{m}^{t}\}

(3)

where $d_{i}^{t}$ is the number of CRs required by the ith client, and $m$ is the number of requests in the cloud market. Total demand $U$ for service can be represented as follows:

U = \sum_{i = 1}^{m} d_{i}^{t}

(4)
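As a minimal sketch of Equations (1)–(4), the following Python snippet computes the supply size $Z$ and the total demand $U$ from an availability vector and a demand vector. The function names, the `shortfall` quantity, and the sample values are illustrative assumptions and are not part of the original scheme.

```python
from typing import Sequence


def supply_size(availability: Sequence[int]) -> int:
    """Z = sum of V_i^t, where V_i^t = 1 marks an idle CR (Equation (2))."""
    return sum(1 for v in availability if v == 1)


def total_demand(demands: Sequence[int]) -> int:
    """U = sum of d_i^t over all m requests (Equation (4))."""
    return sum(demands)


# Hypothetical snapshot at one time step: N = 6 CRs, m = 3 client requests.
availability = [1, 0, 1, 1, 0, 1]  # V_i^t for i = 1..N
demands = [2, 1, 3]                # d_i^t for i = 1..m

Z = supply_size(availability)
U = total_demand(demands)
shortfall = max(0, U - Z)  # CRs requested beyond what is currently idle
```

In this made-up snapshot demand exceeds supply, which is the situation in which clients would have to compete for CRs.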

In the cloud market, the demand and supply functions are independent over time. Each CR can be used by only one client. Function $Y$ represents CR availability. For the ith CR, $B_{i}^{t}$ is the idle probability and $1 - B_{i}^{t}$ is the busy probability.

3.1. Overview of Cloud Market

In the cloud computing market, contracts between a CP and its customers are typically short-term. Contract theory is used in our work to create contracts between the CP and clients. The following are some of the provisions of the contract:

Cost of the service being provided;
Time of rental;
The total number of CRs.

The CP agrees to deliver the agreed CRs to the client within the allotted time frame. In exchange for the ith client’s potential utility loss, the CP should pay a penalty $b_{i}^{t}$. The client only makes a request to rent CRs when needed, and if there are multiple clients with the same need, they must bid against one another to receive the service. The winner receives immediate access to the service at a price determined by market competition and client valuations in real time.

3.2. Contract Structure in the Cloud Market

From the customer’s point of view, there are a number of goals that might be defined, but the most appealing feature is the ability to successfully serve their requests at low cost within the deadline. Nonetheless, customers will move to another CP if request execution consistently falls behind schedule. To avoid delays in satisfying client requests, our scheme has taken steps to address CR failure, such as replacing failed CRs and migrating jobs to new machines. In the cloud market, the contract for the ith client at time $t$ can be written as follows:

C_{i}^{t} = \{d_{i}^{t}, p_{i}^{t}, b_{i}^{t}\}

(5)

where $p_{i}^{t}$ is the ith client’s payment, and $b_{i}^{t}$ is the CP’s penalty for CR failure. The revenue earned by the CP after completing the contract for the ith client is calculated as follows:

R_{i}^{t} = p_{i}^{t} - b_{i}^{t}

(6)

Thus, when accepting a contract for the ith client, the CP’s revenue collected is determined by the CP’s penalty and the payment made by the ith client. The CP’s penalty is calculated as follows:

b_{i}^{t} = a (d_{i}^{t} - f_{i}^{t})

(7)

where $f_{i}^{t}$ is the number of failed machines in the system, and $a$ is the unit penalty price for a CR. Clients achieve certain benefits from using CRs. A client’s valuation of a specific CR represents the client’s benefit from using it. Client satisfaction is determined by service quality, which reflects how good or effective the CR is, and by a user-specific preference, which reflects how efficiently a client can use CRs and how urgently a client requires CRs.

4. Computing Resources Trading Based on a Fault Tolerance Scheme

Our scheme allows a client’s request to be migrated to a new CR if execution is not possible on the present machine. The CP should guarantee the QoS for clients.

Definition 1.

For any request, the CP assigns the requested number of CRs to serve the request from the pool of available CRs. The CP receives the reward for serving the request if the request is successfully completed before the deadline.

The CP’s primary objective is to maximize profit by utilizing CRs as much as possible while minimizing the chance of deadline violation. However, if the demand for the service is high, the risk of CR failure increases dramatically, which may result in contract violation and a reduction in client satisfaction. Furthermore, frequent CR failures prohibit the CP from meeting clients’ cloud expectations, which inevitably harms the CP’s reputation, resulting in economic loss in the long run. In this regard, the CP’s primary concern is to ensure that all requests are served on time. Obviously, the main concern of the CP is maximizing its profit as follows:

P = \sum_{t = 0}^{T} λ (p_{i}^{t} - b_{i}^{t})

(8)

where $T$ is the simulation time, and $λ$ is the request arrival rate. We assume that clients’ requests arrive according to a Poisson process with arrival rate λ, and that the service time of an incoming request is exponentially distributed with service rate μ. These assumptions reflect some of the reality of trading applications. To ensure that clients receive a high-quality service, a CP must keep a certain number of CRs on hand in case of CR failure. The CP should consider the failure probability of each machine when deciding how many CRs to keep in reserve. As a result, and because service demand changes over time, the CP requires a policy to protect clients from machine failure. In our work, a state-dependent policy based on a Markov decision process (MDP) is proposed to model CR trading in the cloud market. We integrate the CP’s penalty for CR failure in the MDP model. An MDP algorithm is used to extract the optimal management policy that maximizes the CP’s profit for a given cloud market state $H_{t}$. The optimal trading policy will be extracted to select the set of possible service requests that maximize the net profit for the CP. The net profit $G$ is computed as follows:

G = \max_{i \in A} λ (p_{i}^{t} - b_{i}^{t})

(9)
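Equation (9) can be illustrated with a short Python sketch that scores each candidate request by $λ (p_{i}^{t} - b_{i}^{t})$ and keeps the best one. This is a greedy, single-step illustration and not the MDP policy itself; the function name, the tuple layout, and the sample values are assumptions.

```python
from typing import List, Optional, Tuple

Request = Tuple[float, float]  # (payment p_i^t, penalty b_i^t)


def best_request(
    requests: List[Request], arrival_rate: float
) -> Optional[Tuple[int, float]]:
    """Return (index, net profit) of the request maximizing lambda * (p - b).

    Requests whose net profit is not positive are rejected; None is
    returned when no request is profitable.
    """
    best: Optional[Tuple[int, float]] = None
    for i, (p, b) in enumerate(requests):
        g = arrival_rate * (p - b)
        if g <= 0:
            continue  # reject: the request generates no extra profit
        if best is None or g > best[1]:
            best = (i, g)
    return best


# Hypothetical action set A with three candidate requests.
candidates = [(10.0, 4.0), (6.0, 7.0), (12.0, 3.0)]
choice = best_request(candidates, arrival_rate=0.5)
```

Here the second candidate is rejected because its penalty exceeds its payment, and the third candidate is selected.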

All requests that do not generate extra profit are rejected by the CP. Generally, the sensitivity of clients’ payments to the number of failed CRs can be approximated by the cost of penalty for CR failure:

\frac{\partial P}{\partial F} = S (F)

(10)

where $F$ is the number of failed CRs. $S (F)$ is a measure of the change in client willingness to pay in the cloud market based on the number of failed CRs. In order to maximize profit, the CP must respond to fluctuations in service demand by adjusting the price of services and the number of CRs dealing with machine failure. The sensitivity of the CP’s revenue to the number of failed CRs can be represented as follows:

\frac{\partial R}{\partial F} = \frac{\partial P}{\partial F} - \frac{\partial b}{\partial F}

(11)

The cost of machine failure is a linear function of $F$ and it is computed as follows:

b = a F

(12)

Substituting (10) and (12) in (11) gives $\frac{\partial R}{\partial F} = S (F) - a$, so the CP’s net revenue is maximized when the number of failed machines equals the root of:

S (F) - a = 0

(13)

Newton’s method of successive linear approximations is used [26,27] to find the root of Equation (13). The new number of failed machines $F_{t + 1}$ at each iteration step t is computed as follows:

F_{t + 1} = F_{t} - \frac{S (F_{t}) - a}{\frac{\partial (S (F_{t}) - a)}{\partial F}}

(14)

The derivative in Equation (14) at step t is approximated by a finite difference:

\frac{\partial (S (F_{t}) - a)}{\partial F} = \frac{\partial S (F_{t})}{\partial F} ≅ \frac{S (F_{t}) - S (F_{t - 1})}{F_{t} - F_{t - 1}}

(15)

Substituting (15) in (14), the expected number of failed machines is represented as follows:

F_{t + 1} = F_{t} - (F_{t} - F_{t - 1}) \frac{S (F_{t}) - a}{S (F_{t}) - S (F_{t - 1})}

(16)

The expected number of failed machines at step t can be extracted using the following Algorithm 1:

Algorithm 1 Finding the optimal number of failed machines at step t.

Input: $F_{t}$, $a$, and $ε$.

Output: The optimal number of CRs dedicated to replacing failed CRs.

Arbitrarily initialize $H_{t}$;
if CR-Failure()
{
    while $(|S (F_{t}) - a| > ε)$
    {
        compute $S (F_{t})$;
        $F_{t + 1} = F_{t} - (F_{t} - F_{t - 1}) \frac{S (F_{t}) - a}{S (F_{t}) - S (F_{t - 1})}$;
    }
    return $F_{t + 1}$;
}
End
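The iteration in Algorithm 1 can be sketched in Python as a finite-difference (secant) approximation of Newton’s method applied to $S (F) - a = 0$. The paper does not give a closed form for $S (F)$, so the sensitivity function, the unit penalty price, the starting guesses, and the `max_iter` guard below are all assumptions chosen only to make the sketch runnable.

```python
from typing import Callable


def optimal_failed_crs(
    S: Callable[[float], float],
    a: float,
    f_prev: float,
    f_curr: float,
    eps: float = 1e-9,
    max_iter: int = 100,
) -> float:
    """Secant iteration for the root of S(F) - a = 0.

    f_prev and f_curr are two distinct starting guesses (F_{t-1}, F_t).
    """
    for _ in range(max_iter):
        g_curr = S(f_curr) - a
        if abs(g_curr) <= eps:  # stopping rule: |S(F_t) - a| <= eps
            return f_curr
        denom = S(f_curr) - S(f_prev)
        if denom == 0:
            raise ZeroDivisionError("S(F_t) equals S(F_{t-1}); step undefined")
        # F_{t+1} = F_t - (F_t - F_{t-1}) * (S(F_t) - a) / (S(F_t) - S(F_{t-1}))
        f_next = f_curr - (f_curr - f_prev) * g_curr / denom
        f_prev, f_curr = f_curr, f_next
    raise RuntimeError("iteration did not converge within max_iter steps")


# Hypothetical sensitivity function; its root for a = 2.0 is F = 4.
def S(F: float) -> float:
    return 10.0 / (1.0 + F)


root = optimal_failed_crs(S, a=2.0, f_prev=1.0, f_curr=2.0)
```

The result is real-valued; a CP would presumably round it to a whole number of CRs before sizing the replacement pool, although that step is not specified in the text.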

Given a good initial approximation, the time complexity of Newton’s method to compute a root of a function $\frac{\partial R}{\partial F}$ with $ε$-digit precision is O((log $ε$) F($ε$)), where F($ε$) is the cost of calculating $\frac{S (F_{t}) - a}{S (F_{t}) - S (F_{t - 1})}$ with $ε$-digit precision.

Definition 2.

Any request from any client is accepted if the number of free CRs is sufficient to serve the request and the CP is capable of meeting the QoS requirements for the request.
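The acceptance rule of Definition 2 can be sketched as a simple predicate. Excluding the reserved replacement CRs from the free count and reducing the QoS check to a boolean flag are simplifying assumptions made here for illustration; the function and parameter names are hypothetical.

```python
def accept_request(
    free_crs: int, reserved_pool: int, demand: int, qos_ok: bool
) -> bool:
    """Accept a request only if QoS can be met and enough free CRs remain.

    reserved_pool CRs are held back to replace failed machines, so they
    are not counted as available for new requests (an assumption).
    """
    return qos_ok and demand <= free_crs - reserved_pool


# Hypothetical state: 10 free CRs, 2 of them reserved for replacement.
fits = accept_request(free_crs=10, reserved_pool=2, demand=8, qos_ok=True)
too_large = accept_request(free_crs=10, reserved_pool=2, demand=9, qos_ok=True)
```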

5. Results, Analysis, and Discussion

In this section, we evaluate our proposed scheme (FS) against an intolerance scheme (IS) in which a failed task is re-added to the waiting queue. Profit and end-to-end delay are the performance metrics used in the comparison. To gauge our scheme’s efficacy, we looked at how well it improves a CP’s profits while cutting down on client request waiting times. An adequate number of replicates were performed, and the results were averaged to ensure a 95% level of confidence and relative errors of less than 5%. We examined the performance of the FS under a variety of parameter settings. Table 1 shows the parameters used to evaluate the proposed scheme.

Although some CRs may have failed, our scheme will continue to process requests from clients. Customers who are unhappy with their service because of machine failure are more likely to consider switching to a different CP, thereby reducing the profits of the CP. Figure 1 depicts the effect of CR failure on the profit of the CP. The figure displays the results of our method (FS) and the intolerance scheme (IS). In the experiment, we took the profit at various percentage values of failed CRs. Clearly, as the number of failed machines increases in the cloud market, the profit of the CP decreases significantly. Fortunately, our technique keeps serving clients if some of the CRs fail. The FS moves failed requests into some CRs that are part of a pool of CRs ready to replace any failed CR. The IS, on the other hand, re-adds the failed requests to the queue and begins looking for available CRs. Because it serves all requests in-service, our scheme outperforms the IS in terms of profit.
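The difference between the two schemes on a CR failure can be sketched as follows. This is a toy illustration rather than the simulator used for the figures; the function names, the string identifiers, and the fallback to the queue when the pool is empty are assumptions.

```python
from collections import deque
from typing import Deque, List


def handle_failure_fs(task: str, pool: List[str], queue: Deque[str]) -> str:
    """FS: migrate the failed task to a spare CR from the replication pool."""
    if pool:
        spare = pool.pop()
        return f"{task} migrated to {spare}"
    queue.append(task)  # fallback when the pool is exhausted (assumption)
    return f"{task} queued"


def handle_failure_is(task: str, queue: Deque[str]) -> str:
    """IS: the failed task is re-added to the waiting queue."""
    queue.append(task)
    return f"{task} queued"


# Hypothetical state: one spare CR in the pool, empty waiting queues.
fs_queue: Deque[str] = deque()
is_queue: Deque[str] = deque()
fs_result = handle_failure_fs("job-1", ["cr-spare-1"], fs_queue)
is_result = handle_failure_is("job-1", is_queue)
```

Under the FS the task keeps running on the spare CR and the queue stays empty, whereas under the IS the same task waits behind any requests already queued.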

Figure 2 shows a comparison between our suggested scheme and the IS for the percentage of completed jobs. As the proportion of failed CRs rises, the percentage of jobs that are successfully completed falls for both schemes. The system’s capacity to deal with rising client demand is constrained by failed CRs. Instead of canceling them, jobs that have failed are executed by CRs that are dedicated to replication in our system. If a CR in service fails, the CP can decide which CR should take over the failed task. The CP can increase the number of requests it can fulfill by replacing failed CRs. However, the IS places the failed tasks in a queue, which increases the wait time for clients and decreases the proportion of successfully completed jobs.

Figure 3 depicts the size of the replication pool and the number of CRs set aside to replace failed CRs. It is obvious that as more CRs in the cloud market fail, the pool size grows as well. The CP requires more machines to satisfy client requests and perform replication tasks as there are more and more CRs failing.

Figure 4 displays the replication pool size as a function of the penalty for CR failure. It is evident that when the penalty cost rises, the pool size expands to minimize additional losses caused by machine failure. As more CRs fail, the CP requires more machines to serve client requests and complete replication activities. Figure 5 compares the delay of our suggested scheme with that of the IS. The simulated scenario used to measure the delay of requests in the two schemes clearly shows that the delay increases significantly as the percentage of failed machines increases. Because the replication pool is used to replace failed CRs, which greatly reduces the delay, our technique outperforms the IS in terms of latency.

6. Conclusions and Future Work

The CP’s profit can be improved significantly by handling the QoS requirements of clients effectively. Because cloud services are prone to a range of failures, fault tolerance issues must be considered while developing new strategies to address all cloud market needs. In order to reduce resource waste while accommodating CR failure and increasing CP profit, these strategies must strike a compromise between competing objectives.

We proposed a new scheme that keeps the services provided by the CP running regardless of faults. In our scheme, the optimal number of CRs dedicated to replacing failed CRs is extracted. Newton’s method is adopted for specifying the optimal size of the replication pool. The proposed model moves the request to a new CR from the pool of CRs if a machine fails to perform the task. This action minimizes the waiting time for clients and reduces the total migration time significantly. Current strategies for establishing fault tolerance in the cloud market do not take into account computing the appropriate number of CRs allocated to handle any partial failure in the market.

When the replication of failed CRs is used by the FS instead of the IS, the profit of the CP is greatly enhanced and the waiting time for clients is reduced. In the future, we plan to compute the time complexity of the proposed approach and compare it to alternative fault-tolerant schemes in terms of additional performance metrics such as request blocking probability. Furthermore, we will evaluate our technique with a verification tool to test its resilience against various faults and show that it optimizes CP profit under various conditions.

Author Contributions

Conceptualization, B.I. and A.A. (Ayoub Alsarhan); methodology, B.I.; software, A.A. (Ayoub Alsarhan); validation, B.I., A.A. (Ayoub Alsarhan) and A.F.A.-S.; formal analysis, B.I.; investigation, B.I.; writing—original draft preparation, M.A.; writing—review and editing, A.A. (Amjad Aldweesh); visualization, A.F.A.-S.; supervision, B.I.; All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the Deanship of Scientific Research at Shaqra University (KSA) and The Hashemite University (Jordan) for supporting this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

Alsarhan, A.; Itradat, A.Y.; Al-Dubai, A.Y.Z.; Min, G. Adaptive Resource Allocation and Provisioning in Multi-Service Cloud Environments. IEEE Trans. Parallel Distrib. Syst. 2018, 29, 31–42. [Google Scholar] [CrossRef] [Green Version]
Chen, F.; Zhu, Z.; Chen, G.; Min, X.; Zheng, X.; Rong, C. Resource Allocation for Cloud-Based Software Services Using Prediction-Enabled Feedback Control with Reinforcement Learning. IEEE Trans. Cloud Comput. 2022, 10, 1117–1129. [Google Scholar] [CrossRef]
Guezzaz, A.; Azrour, M.; Benkirane, S.; Mohyeddine, M.; Attou, H.; Douiba, M. A Lightweight Hybrid Intrusion Detection framework using Machine Learning for Edge-Based IIoT Security. Int. Arab. J. Inf. Technol. 2022, 19, 5. [Google Scholar] [CrossRef]
Chen, X.; Zhang, Y.; Chen, Y. Cost-Efficient Request Scheduling and Resource Provisioning in Multiclouds for Internet of Things. IEEE Internet Things J. 2020, 7, 1594–1602. [Google Scholar] [CrossRef]
Alsarhan, A.; Al-Sarayreh, K.T.; Al-Ghuwairi, A.R.; Kilani, Y. Resource trading in cloud environments for profit maximisation using an auction model. Int. J. Adv. Intell. Paradig. 2014, 6, 176–190. [Google Scholar] [CrossRef]
Upadhyaya, A.N.; Shah, J.S. Attacks on vanet security. Int. J. Comp. Eng. Tech. 2018, 9, 8–19. [Google Scholar]
Google Cloud Status Dashboard. Available online: https://status.cloud.google.com/summary (accessed on 20 May 2021).
Radhika, E.G.; Sadasivam, G.S. Budget optimized dynamic virtual machine provisioning in hybrid cloud using fuzzy analytic hierarchy process. Expert Syst. Appl. 2021, 183, 115398. [Google Scholar] [CrossRef]
Li, C.; Sun, H.; Chen, Y.; Luo, Y. Edge cloud resource expansion and shrinkage based on workload for minimizing the cost. Future Gener. Comput. Syst. 2019, 101, 327. [Google Scholar] [CrossRef]
Gokhroo, M.K.; Govil, M.C.; Pilli, E.S. Detecting and mitigating faults in cloud computing environment. In Proceedings of the International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 9–10 February 2017. [Google Scholar]
Charity, T.J.; Hua, G.C. Resource reliability using fault tolerance in cloud computing. In Proceedings of the Next Generation Computing Technologies (NGCT), Dehradun, India, 14–16 October 2016. [Google Scholar]
Zhou, A.; Wang, S.; Cheng, B.; Zheng, Z.; Yang, F.; Chang, R.N.; Lyu, M.R.; Buyya, R. Cloud Service Reliability Enhancement via Virtual Machine Placement Optimization. IEEE Trans. Serv. Comput. 2017, 10, 902–913. [Google Scholar] [CrossRef]
Zhao, J.; Xiang, Y.; Lan, T.; Huang, H.H.; Subramaniam, S. Elastic reliability optimization through peer-to-peer checkpointing in cloud computing. IEEET Trans. Parallel Distrib. Syst. 2017, 28, 491–502. [Google Scholar]
Raseena, H.; Khaled, M.K.; Nhlabatsi, A. Live migration of virtual machine memory content in networked systems. Comput. Netw. 2022, 209, 22.
Paulraj, G.J.L.; Francis, S.A.J.; Peter, J.D.; Jebadurai, I.J. Resource-aware virtual machine migration in IoT cloud. Future Gener. Comput. Syst. 2018, 85, 173–183.
Singha, G.; Singh, A.K. Optimizing multi-VM migration by allocating transfer and compression rate using Geometric Programming. Simul. Model. Pract. Theory 2021, 106, 102201.
Lee, Y.L.; Liang, D.; Wang, W.J. Optimal Online Liveness Fault Detection for Multilayer Cloud Computing Systems. IEEE Trans. Dependable Secur. Comput. 2021, 19, 3464–3477.
Dewangan, K.B.; Amit, A.; Tanupriya, C.; Ashutosh, P. Workload aware autonomic resource management scheme using grey wolf optimization in cloud environment. IET Commun. 2021, 15, 1869–1882.
Robel, M.R.A.; Bharati, S.; Podder, P.; Raihan-Al-Masud, M.; Mandal, S. Fault Tolerance in Cloud Computing—An Algorithmic Approach. In Innovations in Bio-Inspired Computing and Applications. IBICA 2019; Abraham, A., Panda, M., Pradhan, S., Garcia-Hernandez, L., Ma, K., Eds.; Advances in Intelligent Systems and Computing; Springer: Berlin/Heidelberg, Germany, 2021; Volume 1180.
Egwutuoha, I.P.; Chen, S.; Levy, D.; Selic, B. A Fault Tolerance Framework for High Performance Computing in Cloud. In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), Ottawa, ON, Canada, 13–16 May 2012; pp. 709–710.
Rehman, A.U.; Aguiar, R.L.; Barraca, J.P. Fault-Tolerance in the Scope of Cloud Computing. IEEE Access 2022, 10, 63422–63441.
Wang, T.; Zhou, J.; Li, L.; Zhang, G.; Li, K.; Hu, X.S. Deadline and Reliability Aware Multiserver Configuration Optimization for Maximizing Profit. IEEE Trans. Parallel Distrib. Syst. 2022, 33, 3772–3786.
Wang, T.; Zhou, J.; Zhang, G.; Wei, T.; Hu, S. Customer perceived value- and risk-aware multiserver configuration for profit maximization. IEEE Trans. Parallel Distrib. Syst. 2020, 31, 1074–1088.
Cong, P.; Hou, X.; Zou, M.; Dong, J.; Chen, M.; Zhou, J. Multiserver configuration for cloud service profit maximization in the presence of soft errors based on grouped grey wolf optimizer. J. Syst. Archit. 2022, 127, 102512.
Chen, S.; Liu, J.; Ma, F.; Huang, H. Customer-Satisfaction-Aware and Deadline-Constrained Profit Maximization Problem in Cloud Computing. J. Parallel Distrib. Comput. 2022, 163, 198–213.
Bertsimas, D.; Tsitsiklis, J. Introduction to Linear Optimization; Athena Scientific: Belmont, MA, USA, 1997.
Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.

Figure 1. Profit under different percentages of failed CRs.

Figure 2. Percentage of executed jobs under different percentages of failed CRs.

Figure 3. Pool size under different percentages of failed CRs.

Figure 4. Pool size under different values of penalty cost.

Figure 5. End-to-end delay under different percentages of failed CRs.

Table 1. Simulation parameters.

Parameter	Value
number of CRs	200
number of clients	350
number of requests per client	random
$λ$	1
service price	10
penalty cost	5
simulation time	1000 s
number of served requests	1,000,000
simulation device: CPU	Intel Core i5, 2.50 GHz
simulation device: processor cores	2 × 2.50 GHz
simulation device: RAM	6 GB
simulation device: OS	Windows 7 64 bit

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
