Energy-Efficient Task Scheduling and Resource Allocation for Improving the Performance of a Cloud–Fog Environment

Abstract: Inadequate resources and facilities with zero latency affect the efficiencies of task scheduling (TS) and resource allocation (RA) in the fog paradigm. Incoming tasks can only be completed within their deadlines if the resource availability in the cloud and fog is symmetrically matched with them. A container-based TS algorithm (CBTSA) determines the symmetry relationship of the task/workload with the fog node (FN) or the cloud to decide where workloads are scheduled (in the fog or in the cloud). Furthermore, by allocating and de-allocating resources, the RA algorithm reduces workload delays while increasing resource utilization. However, unbounded cloud resources and the computational difficulty of determining resource usage have not been considered in CBTSA. Hence, this article proposes an enhanced CBTSA with intelligent RA (ECBTSA-IRA), which symmetrically balances energy efficiency, cost, and the performance-effectiveness of TS and RA. Initially, this algorithm determines whether the workloads are accepted for scheduling. An energy-cost-makespan-aware scheduling algorithm is proposed that uses a directed acyclic graph (DAG) to represent the dependency of tasks in the workload as a graph. Workloads are prioritized, and a node is selected to process each prioritized workload. The selected node may be an FN or the cloud and is decided by an optimum efficiency factor that trades off schedule length, cost, and energy. Moreover, a Markov decision process (MDP) was adopted to allocate the best resources using a reinforcement learning scheme. Finally, the experimental findings reveal the efficacy of the presented algorithms compared to the existing CBTSA in terms of various performance metrics.


Introduction
In the era of network technology, cloud computing has experienced extraordinary growth owing to the explosive usage of the web and innovations in communication technology for solving large-scale problems. This has made both software and hardware resources available over the web to cloud clients. Typically, it denotes a web-based computing framework that distributes data, resources, and services to different client systems upon request. It minimizes both the computation and processing difficulties of conventional information processing systems. However, several challenges have been observed in the development of cloud computing with the Internet of Things (IoT). The IoT paradigm is altering the mode in which clients interact with the physical world, increasing the growth of connectivity towards it [1].
It can lead to novel systems with infinite abilities and huge influence by facilitating many client contributions, and it exploits cloud computing to manage a huge quantity of data.
Alsaffar et al. [8] noted that RA depends on the interaction between fog and cloud systems. In this structure, novel algorithms, namely selection policies of the linearized choice tree, were adopted depending on the number of facilities, the execution interval, and the VM's ability for directing and assigning client demands. Further, an algorithm was presented for allocating resources to meet the SLA and quality-of-service (QoS) requirements, including modification of big data sharing in fog and cloud systems.
Yu et al. [9] formulated a fog-enabled effective price reduction challenge for a cloud source in which the price included the power of cloud servers, system spectrum, and income failure (because of the propagation latency and the monetary reward to fog systems). After that, an analogous and dispersed scheme was suggested depending on the proximal Jacobian alternating direction method of multipliers (PJ-ADMM). Hoang and Dang [10] developed a region-based cloud algorithm for TS in fog computing. At first, the fog-based region and cloud (FBRC) structure was developed to bring resources closer to users. Cases were considered in which the computations were performed at distant data centers, in neighboring areas, or both. Moreover, the TS problem was formulated as an integer program.
Pham et al. [11] developed a cost-makespan-aware scheduling (CMaS) scheme for TS in fog computing. In this algorithm, a workload reallocation method was introduced to filter the CMaS scheme outcomes that satisfied the client-described target limits. Moreover, a balance between essential costs was achieved for the usage of cloud resources and the efficiency of application execution. Ni et al. [12] developed a RA method in the fog paradigm depending on the valued duration of Petri Nets, which were used for choosing the satisfying resources autonomously by accounting for the expense and duration needed to execute the workload and the integrity analysis of fog resources and users. Moreover, it was built according to the characteristics of fog resources.
Sun and Zhang [13] developed the RA framework depending on the repetitive players in the fog paradigm. Initially, a system framework was presented depending on deep learning that used the qualities of cloud and fog servers. After that, a reward and punishment method was established based on the resource set-funding mechanism for incorporating sporadic resources to construct the variable resource group, which optimized the spare resources in the neighboring system. Further, an incentive method was applied to motivate several resource providers to distribute their resources with the resource group and manage the resource followers since they vigorously executed their workloads.
Nie et al. [14] designed a VM allocation algorithm based on multi-dimensional resources that considered the diversity of users' requests. At first, the utilization of each dimension resource of physical machines was considered and a D-dimensional resource state model was constructed. Then, an energy-resource state metric was introduced and an energy-aware multi-dimensional RA named MRBEA was adopted to assign the resources according to the resource state and energy consumption of physical machines.
Liu et al. [15] developed a general multi-user system framework for achieving efficient TS in heterogeneous fog systems. First, processing competence was applied to combine computing resources and transfer facilities. Then, a dispersive stable TS (DATS) strategy was applied to lessen the service latency using two major elements: a processing efficacy-based progressive computing resources competition (PCRC) and a synchronized TS (STS).
Yang et al. [16] developed a widespread systematic framework for precisely analyzing the total power efficacy of homogeneous fog systems. First, the tradeoff between efficiency and power expense in the combined workload offloading was analyzed. Then, a maximal energy-efficient TS (MEETS) scheme was employed to derive the best-assigned choice for a workload.
Jia et al. [17] investigated the computing RA challenge in a three-layer fog computing network. The key goal of this analysis was to develop a RA scheme to increase cost efficiency. For this purpose, the RA challenge was formulated as a deferred acceptance-based double-matching scheme (DA-DMS) according to the price efficiency obtained by evaluating the efficacy and price of the fog computing systems.
Yang et al. [18] analyzed the collaborative TS problem for general homogeneous fog networks. In this analysis, fair network efficiency in provision latency and power usage was achieved. To achieve this, client data, neighborhood implementation, incentive limit, queuing, workload offloading, and wireless communications were considered. Moreover, delay energy-balanced TS (DEBTS) was adopted for TS. In this method, the evaluation of the control variable was integrated with the Lyapunov optimizer to reduce power usage and provision latency in fog networks.
Balevi and Gitlin [19] presented a stochastic geometry analysis for computing the optimal amount of FNs while the end systems transmitted their packets to the FNs. In this model, FNs and end systems were considered points in a two-dimensional Euclidean space. Through this model, the average data rate was enhanced and the transmission delay was minimized. Moreover, the optimal amount of FNs was reduced for high path loss exponent channels denoting that FNs should have been chosen among the nodes that had maximum computational power for these channels.
Wang et al. [20] designed the dynamic TS strategy depending on a weighted bi-graph (DTSWB), which formulated the scheduling dilemma as the maximum WB harmonizing challenge. This dilemma was resolved using different operations: state data acquisition of offloaded workloads and network operators, correlation finding, revenue matrix determination, and best harmonizing. Li et al. [21] suggested an energy-efficient computation offloading and RA (ECORA) method in the fog paradigm. In ECORA, the computation offloading challenge was decoupled into sub-problems of RA and offloading decisions.
Zhou et al. [22] developed a new algorithm called the improved TS algorithm (ITSA) using the gain value of a workload swap. Initially, the idea of the gain value of a workload swap was introduced. After that, the workload with the least gain value and the workload with the highest gain value were combined to create a workload pair. Further, scheduling was performed by the greedy mechanism.
Ren et al. [23] designed an improved three-layer fog-to-cloud structure and a schedule-fit algorithm to use fog-to-cloud resources effectively, achieving QoS regarding delay and service failure probability. Guevara and da Fonseca [24] presented two schedulers depending on integer linear programming, which schedule workloads either in the cloud or on fog devices. The schedulers used the class of facilities to choose the processing components wherein the workloads must be performed.
Movahedi Z. and Defude [25] presented a fog-based structure to handle the TS requests and give the best decisions. Then, the TS issue was modeled as an integer linear programming optimization, which took both time and fog energy usage. Further, this issue was resolved by an opposition-based chaotic whale optimization algorithm (OppoCWOA).
Yin et al. [26] developed the TS mechanism using workload priority. Initially, a cloud-fog model was constructed for smart production lines and the multi-objective function was created for TS, which reduced the service latency and energy usage of the workloads. Moreover, an improved hybrid monarch butterfly optimization and improved ant colony optimization algorithm (called HMA) were utilized to explore the best TS mechanism.
Guo et al. [27] designed an intelligent genetic scheme (IGS) for multi-objective collaboration service scheduling. In the initial population selection step, the initial population creation method was modified, a portion of the population was arbitrarily chosen, and the selection procedure was iteratively optimized. The diversity of the population in the dynamic selection was enhanced by the mutation aspects depending on individual innate efficiencies. According to the fitness function, the optimal collaborative services were scheduled, which reduced the cost and enhanced efficiency.

Drawbacks in the Existing Algorithms
• In [7], the authors did not consider network bandwidth, co-location, and parallelization, which were major concerns due to data centers in various locations, serving user requests, network criteria, and delays.
• In [8], the computational time complexity was high. Moreover, the authors did not consider minimizing the workload completion time to resolve the TS dilemma in cloud-fog networks.
• In [9], memory and execution time were not taken into the objective function.
• In [10], the longer queuing delay provided a high computation time.
• The CMaS scheme [11] was not suitable for real-world applications.
• In [12], the authors did not provide more suitable services to users.
• In [13], the costs of the service providers and the power usage in fog servers were not reduced.
• MRBEA [14] was not effective due to excessive VM migrations.
• In [15], the authors did not consider the power-delay tradeoff dilemma in heterogeneous fog networks while constructing the preference profile.
• The MEETS scheme [16] did not consider offloading workloads in heterogeneous fog systems.
• The computational complexity of the DA-DMS [17] was high.
• The DEBTS scheme [18] was not effective for complex homogeneous fog networks.
• In [19], the optimum locations of FNs were not determined, and caching efficiency was not enhanced.
• In [20], the authors did not consider the dependent types of workloads and the execution order of the workloads for further enhancing the performance.
• The ECORA [21] method has high complexity.
• In ITSA [22], workloads were not managed instantly in a few situations where workloads contained arbitrariness.
• In [23], the service failure probability was high because, in this structure, the workload was performed on a single VM, which restricted the CPU resources. Moreover, the workload was not properly sent when there were several workloads in the network.
• In [24], the decision was contradictory in real-time scenarios since the framework was only suitable for a single objective function.
• In OppoCWOA [25], the scenario was not considered where workload implementation was unsuccessful on a fog node because of the destruction of the CPU.
• In [26], the authors did not consider execution time and memory in determining the objective function.
• In [27], the running period of IGS increased when the population was huge.
From this literature, most of the algorithms did not consider the computational costs (i.e., processing time, completion time, idle time, and so on) of TS as objective functions. Moreover, unbounded cloud services were not considered. To resolve these challenges, this study considers computational time reduction as an objective (cost) function to schedule workloads and distributes the optimal resources for cloud-fog systems.

Contribution of the Study
The major contributions of this study are the following:
• It determines whether the workloads are decided (accepted for scheduling) or not.
• An energy-cost-makespan-aware scheduling algorithm is applied, in which the decided workloads are denoted as graphs using a DAG.
• Once TS is completed, this ECBTSA-IRA algorithm, using different reinforcement learning methods such as QL, SARSA, ESARSA, and MC, is performed to allocate the resources efficiently.

Proposed Methodology
In this section, the proposed algorithms are briefly explained. At first, CBTSA is executed to determine whether the given workload is decided or not [6]. Then, an ECBTSA algorithm is described, which schedules the workloads according to their priority levels and efficiency factors. Moreover, an ECBTSA-IRA algorithm is explained for allocating resources efficiently. Figure 1 portrays the flow diagram of the proposed ECBTSA-IRA algorithm.



Energy-Efficient, Cost- and Performance-Effective Container-Based Task Scheduling Algorithm
Once the workloads (t ∈ T) are decided, the workload scheduler distributes the decided workloads according to their priority levels. If the workload can be completed by only the cloud or only the fog node, it is simply assigned there. If both the cloud and the fog nodes can finish the workload, the workload assigner must choose where to place it. A workload requests access to the processed information in the cloud when it requires more resources. For this reason, a threshold is set and fine-tuned according to the present group of workloads to maximize the number of effective workloads on the node.
The workload assigner must allocate the choice threshold δ_p^j for every interval, i.e., the resource threshold of node j in interval p. First, the resource requirement Re_avg^j per period of the workload is computed by the workload assigner.
If the resource requirement is greater than δ_p^j, the workload is distributed to the cloud; otherwise, the workload is allocated to the node for implementation. If the number of workloads decided in interval p exceeds the number decided in p − 1, then the threshold of p is not optimal, so the workload assigner lowers the threshold according to the mean demand of the workloads in p (Equation (1)). In Equation (1), R_j is the overall resource of the node in any interval.
After the threshold update, a priority for each decided workload is required to compute the most suitable schedule for executing the decided workloads. The node selection then allocates each decided and prioritized workload to a suitable cloud or fog node to achieve the best efficiency (cost) factor. In this step, the decided workloads are ranked depending on their arrangement priorities. Consider pri_{t_p}^j as the priority of t_p at node j, defined by Equation (2). In Equation (2), w_{t_p}^j denotes the average computation time of t_p at j and is calculated by Equation (3). In Equation (3), v_{t,data} is the amount of data that needs to be processed by t during p, w_{t_p}^j is the computation time of t_p at j during p, P_t^max is the maximum interval the workload can sustain, P_t is the definite interval used by t, and r_{t,p}^j is the resources allocated by fog node j. To satisfy the delay constraint, P_t^max is computed by Equation (4), where delay_t is the deadline of t and p_j is the constant, i.e., the time of j. Thus, each decided workload is prioritized and ranked in non-increasing order of priority, which gives a topological level for each workload; hence, the priority order among all decided workloads is preserved.
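The placement and threshold-update rules above can be sketched as follows. This is a minimal illustration; the function names and the reading of Equation (1) as the mean per-workload demand in interval p are assumptions, not the paper's exact formulation.

```python
def place_workload(resource_requirement: float, threshold: float) -> str:
    """Send a workload to the cloud if its per-period resource
    requirement exceeds the node threshold; otherwise keep it on the node."""
    return "cloud" if resource_requirement > threshold else "fog_node"

def update_threshold(threshold: float, demands_this_interval: list,
                     demands_prev_interval: list) -> float:
    """If more workloads were decided in interval p than in p-1, the
    current threshold is deemed non-optimal and is lowered toward the
    mean per-workload demand of interval p (an assumed reading of
    Equation (1))."""
    if len(demands_this_interval) > len(demands_prev_interval) and demands_this_interval:
        return sum(demands_this_interval) / len(demands_this_interval)
    return threshold
```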

Node Selection Step
A lower-priority workload t_p only initiates its execution after the higher-priority workloads of node j are finished. Consider DTT_t^j as the completion time of the last decided least-priority t of j; it is the interval at which the incoming information of t is ready to be broadcast to the chosen node for executing t, and it is computed by Equation (5). In Equation (5), ω_n^j is the bandwidth that j assigns to t for linking it to the cloud. For the entry task, DTT_t^j = 0. If t_p is allocated to j, then the execution time of t, Ex_t^j, is determined by Equation (6). The ready time of t on j, rdy_t^j, is the interval at which the essential incoming information of t, transferred from the memory disk on either the cloud or an FN, reaches the target j; it is defined by Equation (7), where n is the number of fog/cloud nodes in the network. Consider EST_t^j and EFT_t^j as the earliest start and earliest finish times of t on j (Equations (8) and (9)). In Equation (9), w_t^j is the computation interval of t on j. Moreover, the model accounts for the economic price that cloud clients are charged for the utilization of data center resources, storage, and memory. The fog-cloud architecture is made up of FNs and is developed using the cloud nodes offered as data center services.
Thus, consider cost_t^j as the economic cost of executing t on j. If j is a cloud node, cost_t^j comprises the processing, storage, and memory expenses of t on j, as well as the transfer expense for the departing information from other cloud nodes to the desired j to execute t. If j is an FN, then cloud clients are charged for broadcasting the departing information from the cloud nodes to the desired FN in the neighboring network. As a result, cost_t^j is defined by Equation (10), in which every expense is computed as follows. The processing expense is defined by Equation (11), where c_1 refers to the processing expense per interval of task implementation on j. Consider c_2 as the storage expense per unit of information and str_t as the storage volume of t on j; the storage expense of t is then determined by Equation (12). Moreover, the expense of utilizing the memory of j for t is determined by Equation (13), where s_mem is the amount of memory utilized for t and c_3 is the memory expense per unit of information. If c_4 is the total expense per unit of information to broadcast departing information from j, then the transfer expense is computed by Equation (14). Then, an efficiency factor is defined by Equation (15) to compute the tradeoff between the expense and EFT. After that, t is allocated to the j that gives the highest tradeoff U_t^j. After t is allocated on j, the initial completion interval of t, i.e., AFT_t^j, is equal to the EFT_t^j value. Thus, all of the decided workloads are prioritized and scheduled to complete their executions efficiently.
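The cost components of Equations (11)-(14) combine linearly, which can be sketched as below. The efficiency factor is only hedged here: the exact form of Equation (15) is not reproduced in the text, so a weighted inverse combination of cost and EFT stands in for it as an assumption.

```python
def execution_cost(proc_time: float, str_t: float, s_mem: float,
                   data_out: float, c1: float, c2: float,
                   c3: float, c4: float) -> float:
    """cost_t^j = processing + storage + memory + transfer expenses,
    following the per-unit rates c1..c4 described for Equations (11)-(14)."""
    return c1 * proc_time + c2 * str_t + c3 * s_mem + c4 * data_out

def efficiency_factor(cost: float, eft: float, w_cost: float = 0.5) -> float:
    """Illustrative cost-vs-EFT tradeoff U_t^j: higher is better, so both
    terms are inverted. This form is an assumption standing in for
    Equation (15), which is not given in the text."""
    return w_cost / cost + (1.0 - w_cost) / eft
```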

ECBTSA with Intelligent Resource Allocation
The main concept of this IRA is that the FN learns about the IoT paradigm through an interface and adapts to it. The FN receives incentives for each resource decision it makes; once the FN has learned an effective resource strategy, it will be able to increase its expected cumulative incentives, adapt to the IoT paradigm, and fulfill the requirements.

Input: Decided workload set (t ∈ T)
Output: A workload schedule
Initialize;
Determine the priority range pri_{t_p}^j of every t ∈ T;
Rank all decided workloads T into a list L according to their priority levels;
for (all t ∈ L)
    for (all j ∈ n)
        Compute EST_t^j, EFT_t^j, and cost_t^j;
        Compute U_t^j;
    end for
    Allocate t to the j that maximizes U_t^j of t;
end for
End
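A minimal executable reading of the listing above, assuming the priority and tradeoff computations are supplied as callables (the bodies of Equations (2), (9), (10), and (15) are not reproduced here):

```python
def ecbtsa_schedule(workloads, nodes, priority, tradeoff):
    """Rank decided workloads by priority (non-increasing), then assign
    each to the node that maximizes its efficiency factor U_t^j.
    `priority(t)` and `tradeoff(t, j)` are caller-supplied stand-ins
    for the paper's equations."""
    ranked = sorted(workloads, key=priority, reverse=True)
    schedule = {}
    for t in ranked:
        # pick the fog/cloud node with the highest cost-EFT tradeoff for t
        schedule[t] = max(nodes, key=lambda j: tradeoff(t, j))
    return schedule
```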

MDP Problem Formulation
For a resource demand from a client with efficiency u_τ at interval τ, if the FN takes the action x_τ = supply, indicating that it serves the user at the edge, it obtains a direct incentive r_τ, and some resource blocks (RBs) are engaged. Otherwise, for x_τ = discard, indicating that it declines to serve the edge client and transfers the demand to the cloud, the FN retains its accessible RBs and obtains r_τ. The value of r_τ depends on x_τ and u_τ. Assume the quantized u_τ ∈ {1, 2, . . . , U}; the FN state s_τ at τ is represented by Equation (16), where b_τ ∈ {0, 1, . . . , N} is the number of engaged RBs at τ. Notice that the succeeding state s_{τ+1} relies only on the present state s_τ, the efficiency u_{τ+1} of the next service demand, and the action taken (supply or discard), ensuring the Markov property P(s_{τ+1}|s_0, . . . , s_{τ−1}, s_τ, x_τ) = P(s_{τ+1}|s_τ, x_τ).
So, the cloud-fog RA dilemma is devised as an MDP, represented by the tuple ⟨S, A, P_{ss′}^x, R_{ss′}^x⟩, where S is the set of every possible state, i.e., s_τ ∈ S; A is the set of actions, i.e., x_τ ∈ A = {supply, discard}; P_{ss′}^x is the transition probability from s to s′ when x is taken, i.e., P_{ss′}^x = P(s′|s, x), where s′ is shorthand for s_{τ+1}; and R_{ss′}^x is the direct incentive received if x is taken at s and ends in s′, e.g., r_τ = R_{s_τ s_{τ+1}}^{x_τ} ∈ R. The profit G_τ is the total discounted incentives received from τ onward, described by Equation (17). In Equation (17), Y ∈ [0, 1] indicates the discount rate, i.e., the weight of upcoming incentives relative to the direct incentive: Y = 0 rejects upcoming incentives, and Y = 1 weights upcoming incentives equally with the direct incentive. The MDP's purpose is to maximize the expected main profit E[G_0]. In this MDP, for an FN consisting of N RBs, there are U(N + 1) states, s_τ ∈ S = {1, . . . , U(N + 1)}, where U refers to the highest discrete efficiency level. For τ = 0, every RB is accessible, i.e., b = 0, so from Equation (16) there are U possible s_0 ∈ {1, . . . , U} based on u_0. An episode terminates at τ if every RB is engaged, i.e., b_τ = N, so there are U terminal states s_τ ∈ {UN + 1, UN + 2, . . . , U(N + 1)}. Moreover, the incentive strategy is defined depending on the decided efficiency and the action taken for it.
In particular, at τ, depending on u_τ and x_τ, the FN receives r_τ ∈ R = {r_sh, r_sl, r_rh, r_rl} and moves to state s_{τ+1}, where r_sh represents the incentive for supplying a high-efficiency demand, r_sl the incentive for supplying a low-efficiency demand, and r_rh and r_rl the incentives for rejecting high- and low-efficiency demands, respectively. A demand is classified as high- or low-efficiency depending on the threshold δ_τ^j for each interval, i.e., the resource threshold of FN j in τ. For instance, δ_τ^j may be chosen as a particular average of the efficiency in the framework. The incentive factor is then determined by Equation (18).
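The state encoding of Equation (16) and the four-level incentive rule can be sketched as follows. The mapping (b_τ, u_τ) → b_τ·U + u_τ reproduces the stated state range {1, . . . , U(N + 1)}; the concrete reward magnitudes are illustrative assumptions, not values from the paper.

```python
def encode_state(b: int, u: int, U: int) -> int:
    """Map (engaged RBs b in 0..N, quantized efficiency u in 1..U) to a
    single state index in 1..U(N+1), consistent with Equation (16)'s
    state count."""
    return b * U + u

def reward(action: str, u: int, threshold: float,
           r_sh: float = 1.0, r_sl: float = 0.2,
           r_rh: float = -1.0, r_rl: float = 0.1) -> float:
    """Incentive for supplying/discarding a high- or low-efficiency
    demand, classified against the threshold δ; the four magnitudes
    here are assumed for illustration."""
    high = u >= threshold
    if action == "supply":
        return r_sh if high else r_sl
    return r_rh if high else r_rl
```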

Optimum Strategies
The state value function V(s) in Equation (19) is the long-term value of being in s, based on the expected profit gathered from this state onward until the end. The terminal state has a value of 0 because no incentive is gathered from it, and the initial state value corresponds to E[G_0]. Moreover, a state's value is composed of the direct incentive from the action taken and the discounted value of s_{τ+1}. Likewise, the action value function Q(s, x) is the expected profit realized after taking action x at s, as given in Equation (20). The expressions in Equations (19) and (20) are called the Bellman expectation equations for state and action values, respectively.
Here, x′ is the succeeding action at s′. The FN's goal is to employ its N RBs for high-efficiency IoT uses. This is realized by maximizing the initial state value, for which the best decision strategy is essential. A strategy π is a method for choosing actions: it is the set of probabilities of taking a given action in a state, i.e., π = {P(x|s)} for every possible state-action pair. Moreover, π is termed optimal if it maximizes every state's value, i.e., π* = arg max_π V_π(s), ∀s. Thus, to resolve this MDP dilemma, the FN requires the best strategy, found by discovering the optimal state value function V*(s) = max_π V_π(s), which is identical to discovering the optimal action value function Q*(s, x) = max_π Q_π(s, x) for every state-action pair.
From Equations (19) and (20), the Bellman optimality equations for V*(s) and Q*(s, x) follow as Equations (21) and (22). The concept of the optimal state value function V*(s) significantly shortens the search for the best strategy: because the objective of maximizing the expected upcoming incentives is captured by the optimal value of s_{τ+1}, V*(s′) is taken in expectation in Equation (21). Therefore, the best strategy takes the optimal greedy action in every state. Using Q*(s, x) to select the best actions is even less complicated, since with Q*(s, x) the FN does not need to perform a one-step-lookahead search; rather, it chooses the action that maximizes Q*(s, x) at every state. The best actions are described by Equation (23). Once the efficiency is discretized into U stages, the state space becomes well-behaved with cardinality |S| = U(N + 1); thus, the best strategy is trained in this scenario by determining the optimal value functions with the help of MC, SARSA, E-SARSA, and QL.
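When the transition probabilities and incentives are known, the Bellman optimality backup of Equations (21) and (22) can be solved by plain value iteration. This model-based sketch complements the model-free learners the paper actually uses; the tensor layout of P and R is an implementation assumption.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-9):
    """Iterate V(s) <- max_x sum_s' P[x][s][s'] * (R[x][s][s'] + gamma*V(s'))
    until convergence; returns (V*, greedy policy). P[x][s][s2] is the
    transition probability and R[x][s][s2] the direct incentive."""
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states
    while True:
        Q = [[sum(P[x][s][s2] * (R[x][s][s2] + gamma * V[s2])
                  for s2 in range(n_states))
              for s in range(n_states)] for x in range(n_actions)]
        V_new = [max(Q[x][s] for x in range(n_actions)) for s in range(n_states)]
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            policy = [max(range(n_actions), key=lambda x: Q[x][s])
                      for s in range(n_states)]
            return V_new, policy
        V = V_new
```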
Initially, the FN receives a demand from an IoT system with efficiency u and must choose between supplying and discarding, i.e., the incentive for supplying r_s ∈ {r_sh, r_sl} and the incentive for discarding r_r ∈ {r_rh, r_rl} are known at the moment of choice. Therefore, from Equations (19) and (20), the best action at s is defined by Equation (24). In Equation (24), s_supply refers to the succeeding state if x = supply, s_discard denotes the succeeding state if x = discard, and E_u refers to the expectation with respect to u in the IoT paradigm. Value iteration with MC estimates is preferred for determining the optimal state values needed by the best strategy. Given the variables N, Y, {u_h, r_sh, r_sl, r_rh, r_rl} and the information of IoT clients {u_τ}, MC trains the best strategy for this MDP dilemma.
Observe that {u_τ} is actual information from the IoT paradigm and its implementations, whose probability distribution is identified. In this algorithm, the Profits array is a matrix that stores the profit of all states at every iteration. Initially, every state value is 0, and the present state values that comprise the present strategy are applied to choose actions until the terminal state is reached. Then, a profit vector for each state in the iteration, G(s), is added to the Profits array to update the state values.
This continues until every state value converges, and the actions are then computed using Equation (24). Analogously to Equation (24), the best action at s is defined in terms of Q*(s, x) by Equation (25); the computed V*(s) is exploited to discover the best actions using Equation (24). The optimal action value functions needed by the best strategy in Equation (25) are computed by the QL, SARSA, and E-SARSA schemes, which train the best strategy for the MDP by determining Q*(s, x). Moreover, α denotes the training rate, ε denotes the probability of taking an arbitrary action for exploration, and n is the number of interval slots after which Q(s, x) is modified. QL, E-SARSA, and SARSA then exploit the Q*(s, x) computed in Q to obtain π* using Equation (25). Thus, this algorithm can be used to assign resources and effectively schedule the workload.
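The three update rules differ only in their bootstrap target, which can be shown in one tabular sketch. The two actions supply/discard follow the MDP above; the ε-greedy expectation used for E-SARSA and the function shape are assumptions, since the paper's exact listing is not reproduced.

```python
def td_update(Q, s, x, r, s2, alpha, gamma,
              method="qlearning", epsilon=0.1, x2=None):
    """One tabular TD update on Q, a dict mapping (state, action) to value.
    Q-learning bootstraps on max_x' Q(s', x'); SARSA on the actually
    chosen next action x2; Expected SARSA (E-SARSA) on the epsilon-greedy
    expectation over next actions."""
    actions = ["supply", "discard"]
    if method == "qlearning":
        target = max(Q[(s2, a)] for a in actions)
    elif method == "sarsa":
        target = Q[(s2, x2)]
    else:  # expected sarsa
        best = max(actions, key=lambda a: Q[(s2, a)])
        probs = {a: epsilon / len(actions) for a in actions}
        probs[best] += 1.0 - epsilon
        target = sum(probs[a] * Q[(s2, a)] for a in actions)
    Q[(s, x)] += alpha * (r + gamma * target - Q[(s, x)])
    return Q
```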

Simulation Results
In this section, the ECBTSA and ECBTSA-IRA are simulated using the standard CloudSimAPI 3.0.3 and their effectiveness are compared with the existing CBTSA. In this simulation, the processing ability of processors is specified by MIPS (million instructions per second). The mixture of 30 cloud nodes with multiple settings and 20 fog nodes is considered. The fog node's processing rate is assigned less than that of cloud nodes and the fog node's spectrum is greater than that of cloud nodes. The workload data varies between 100 and 500 MB. Moreover, 10 arbitrary DAGs having multiple edges and node weights for all workload pattern densities are generated. After that, each algorithm is executed for scheduling these workload graphs and assigning the resources. The comparison is prepared in terms of the mean interval of received workloads, schedule length, cost-makespan tradeoff, and economic cost. The simulation parameters of the cloud and fog environment are presented in Table 1. Figure 2 shows the average 'reducing' times of the accepted workloads under different latency limits. According to the results, ECBTSA-IRA and ECBTSA used for TS and RA significantly reduce the delay while the workload latency limit is between 100 and 500 ms. This is because of the concept that the resource allocation per interval is adjusted as the reassigned amount of a workload rises. The goal of these algorithms is to improve the resource use of the fog and cloud nodes, which refers to maximizing the node's information processing facility efficiently.  Figure 2 shows the average 'reducing' times of the accepted workloads under different latency limits. According to the results, ECBTSA-IRA and ECBTSA used for TS and RA significantly reduce the delay while the workload latency limit is between 100 and 500 ms. This is because of the concept that the resource allocation per interval is adjusted as the reassigned amount of a workload rises. 
The goal of these algorithms is to improve the resource use of the fog and cloud nodes, i.e., to maximize the nodes' information processing facility efficiently. Figure 3 demonstrates the schedule lengths according to different workloads for ECBTSA-IRA, ECBTSA, and CBTSA. From this analysis, it is observed that ECBTSA-IRA achieves shorter schedule lengths than the other algorithms. The schedule length of ECBTSA-IRA is 26.81% less than that of CBTSA and 10.14% less than that of ECBTSA.

Cost-Makespan Tradeoff
This is used for estimating the optimum strategy at every workload pattern density, as defined in Equation (26). In Equation (26), AL = {a1, . . . , an} represents the catalog of all scheduling strategies, wherein the cost-makespan tradeoff of every strategy ai ∈ AL is computed. A higher cost-makespan tradeoff indicates a better tradeoff rank between the economic cost and schedule duration. Figure 4 depicts the cost-makespan tradeoffs according to different workloads for ECBTSA-IRA, ECBTSA, and CBTSA. From this analysis, it is observed that ECBTSA-IRA achieves better cost-makespan tradeoffs than the other algorithms. The cost-makespan tradeoff of ECBTSA-IRA is about 13.77% higher than that of CBTSA and 5.85% higher than that of ECBTSA.
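Ranking strategies by such a tradeoff can be sketched as follows. Note that the exact form of Equation (26) is not reproduced here; the normalised score below is an assumed illustration in which each strategy is rated by how much of the worst-case cost and makespan it saves (higher CMT = better tradeoff), consistent with "higher tradeoff, higher rank".

```python
def cost_makespan_tradeoff(strategies):
    """Rank scheduling strategies by a cost-makespan tradeoff (CMT).

    `strategies` maps a strategy name to its (economic_cost, makespan).
    The CMT formula is an illustrative normalisation, not the paper's
    Equation (26): each strategy's score is the averaged relative saving
    on cost and on makespan versus the worst strategy observed.
    """
    costs = [c for c, _ in strategies.values()]
    spans = [m for _, m in strategies.values()]
    c_min, c_max = min(costs), max(costs)
    m_min, m_max = min(spans), max(spans)

    def cmt(cost, makespan):
        # relative saving per objective; guard against division by zero
        # when all strategies tie on one objective
        c_save = (c_max - cost) / (c_max - c_min) if c_max > c_min else 1.0
        m_save = (m_max - makespan) / (m_max - m_min) if m_max > m_min else 1.0
        return 0.5 * (c_save + m_save)

    scores = {name: cmt(c, m) for name, (c, m) in strategies.items()}
    best = max(scores, key=scores.get)  # optimum strategy at this density
    return scores, best
```

Applied to the costs and schedule lengths reported later for 100 workloads, this scoring ranks ECBTSA-IRA first, matching the reported tradeoff ordering.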

Economic Cost
The economic cost is defined as the price that must be paid for executing all workloads. Figure 5 portrays the economic costs of cloud resources according to different workloads. From this analysis, it is observed that ECBTSA-IRA achieves a lower cost than the other algorithms. The cost of ECBTSA-IRA is 30.46% less than that of CBTSA and 20.29% less than that of ECBTSA. Therefore, the efficiency of ECBTSA-IRA improves as the number of workloads increases.

Discussion and Limitations
The ECBTSA-IRA achieves the minimum cost, average 'reducing' time of accepted workloads, and schedule length. This is because both the cloud and fog layers implement priority scheduling and place workloads in the proper priority levels according to their tolerable delays. Scheduling according to these priority levels then increases the number of completed tasks, which results in the minimum overall response time and total cost. The ECBTSA-IRA algorithm achieves a good balance between cost savings and schedule length. A greater cost-makespan tradeoff (CMT) indicates a better tradeoff level between the economic cost and schedule length that an algorithm can provide. The proposed ECBTSA-IRA algorithm reduces the schedule length while needing a much lower economic cost for cloud resources than the other algorithms. At the same time, although ECBTSA-IRA improved over traditional TS and RA algorithms, the time required to allocate the resources requested by each device was high. Moreover, the ECBTSA-IRA algorithm does not support dynamic RA, which impacts the network QoS efficiency. The experimental results show that the ECBTSA-IRA algorithm has a clear advantage in TS and RA, which also implies that it could be utilized to solve the problem of optimizing heterogeneous IoT applications.

Conclusions
An ECBTSA-IRA is proposed to improve the CBTSA by considering both the computation time and the finiteness of cloud resources. An optimum value of an efficiency factor is obtained by using this algorithm to measure the tradeoff between the schedule length, cost, and energy. The algorithm models the RA problem as an MDP and executes QL, SARSA, E-SARSA, and MC to allocate the optimal resources. It reduces the response time and significantly minimizes the cost. This is due to the effective prioritization of the tasks according to their delay, schedule length, energy, and cost, resulting in a lower mean response time and overall cost. Eventually, the simulation outcomes proved that ECBTSA-IRA has a 7.1 ms average 'reducing' time of accepted workloads for 100 ms workload delays, whereas the ECBTSA and CBTSA algorithms have 7.5 ms and 8 ms, respectively. Moreover, for 100 workloads, ECBTSA-IRA has a 577 s schedule length, a 0.84 cost-makespan tradeoff, and a USD 27,300 G cost, whereas ECBTSA has a 611 s schedule length, a 0.8 cost-makespan tradeoff, and a USD 32,000 G cost, and CBTSA has a 700 s schedule length, a 0.75 cost-makespan tradeoff, and a USD 35,000 G cost.
In future work, we will further consider dynamic RA for heterogeneous IoT devices so that they achieve their QoS requirements, taking into account more network parameters such as throughput, average link usage, and so on. At the same time, further research will be carried out on how to apply the ECBTSA-IRA algorithm in real-time application deployments, such as healthcare monitoring, smart manufacturing, and so on.