Article

Water Pumping and Refilling (WPR): A Resource Allocation Algorithm for Maximizing Acceptance Ratio in Asymmetrical Edge Computing Networks

School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
*
Author to whom correspondence should be addressed.
Symmetry 2023, 15(5), 985; https://doi.org/10.3390/sym15050985
Submission received: 2 March 2023 / Revised: 27 March 2023 / Accepted: 19 April 2023 / Published: 26 April 2023
(This article belongs to the Special Issue Asymmetrical Network Control for Complex Dynamic Services)

Abstract

Computation offloading has received significant attention in recent years, with many researchers proposing joint offloading decision and resource allocation schemes. However, although existing delay minimization schemes achieve minimum delay costs, they do so at the cost of forgoing a possible further increase in the number of serviced requests. Furthermore, the asymmetry between uplink and downlink poses challenges to resource allocation in edge computing. This paper addresses this issue by formulating the joint computation offloading and edge resource allocation problem as a mixed-integer nonlinear programming (MINLP) problem in an edge-enabled asymmetrical network. Leveraging the margin between a delay-minimum scheme and a near-deadline scheme, a water pumping and refilling (WPR) algorithm is proposed to maximize the number of accepted requests. The WPR algorithm can function both as a supplementary algorithm to a given offloading scheme and as a standalone algorithm that obtains a resource allocation scheme following a customizable refilling policy. The simulation results demonstrate that the proposed algorithm outperforms delay-minimum schemes in achieving a high acceptance ratio.

1. Introduction

Computation offloading has emerged as a prominent research area in edge computing-enabled systems in recent years [1]. Its fundamental concept is to leverage the computing resources deployed at the network edge to offer faster and more satisfactory computational services to user devices (UDs) [2,3]. However, the burgeoning number of Internet of Things [4] devices further strains the already constrained computing resources of edge nodes. When an edge node in an asymmetrical network is overloaded, it cannot accommodate all the computation offloading requests. Therefore, how to allocate resources to maximize the number of serviced requests in asymmetrical systems, while respecting the given resource limitations, remains a crucial issue.
As computation offloading requests arrive, the base station (BS) decides which requests should be accepted and how many resources should be allocated to the accepted requests. Different objectives lead to different resource allocation schemes. The primary objective of current research in computation offloading is to minimize delay or energy consumption [5]. In situations where multiple users demand offloading services, it is common to minimize the weighted sum or the plain sum of users’ processing delays [6,7]. However, weight setting is typically empirical, and schemes with different weightings may yield distinct performances. Although designs that minimize delay cost lead to attractive resource allocation schemes, most of their resource allocation results tend to be over-provisioned to some extent. On the other hand, energy consumption minimization schemes, from the perspective of user devices, extend the battery life of UDs and achieve energy efficiency at the system level [8]. However, such schemes may not be optimal from the perspective of network operators (NOs), because they fail to accommodate additional computation offloading requests, thereby reducing the economic revenues of NOs.
Several existing studies propose incentive-driven computation offloading and resource allocation schemes to benefit edge service providers [9]. For example, Yuan et al. [10] aimed to maximize the profits of remote clouds by accepting more requests without causing high bandwidth consumption and energy consumption for task processing. However, such schemes may not be suitable for edge servers with limited resources, particularly when facing excessive terminal requests. Under such circumstances, the edge server may be forced to exhaust its available resources to accommodate end requests, which involves both request acceptance and resource allocation. Therefore, a joint request acceptance and resource allocation scheme with maximized accepted requests is necessary.
Although these designs function well when there are sufficient resources in the serving network, the focus of request acceptance and resource allocation should shift towards utilizing constrained resources to accept as many requests as possible when resources are insufficient. From the user’s perspective, a satisfactory service does not necessarily need to have the shortest service delay, as long as a certain level of service agreement [11] is met. Therefore, NOs do not have to distribute the network resources for a system-wide delay-minimal objective, which releases the network’s potential to accommodate more requests. In this regard, this paper proposes a joint computation offloading acceptance and edge resource allocation scheme to maximize the number of accepted requests in an edge-enabled asymmetrical network. Additionally, a scalable water pumping and refilling algorithm is proposed to accommodate requests on the basis of the aforementioned delay-minimal schemes. The primary innovations of this paper are summarized as follows:
  • A low-complexity water pumping and refilling (WPR) algorithm is proposed to release the untapped potential of the network and accommodate more requests, based on the delay cost minimization scheme. This approach can serve as both a supplementary method and a standalone method when combined with a specific customization strategy.
  • The joint computation offloading and edge resource allocation problem is formulated as a mixed integer nonlinear programming problem with the objective of maximizing the number of accepted requests. Resource margins between the delay cost minimization scheme and the desirable quality of service (QoS) scheme are exploited to accommodate more requests.
  • We evaluate the performance of the proposed algorithm under various conditions. The simulation results demonstrate that our WPR algorithm outperforms delay-cost-minimization-based schemes in terms of acceptance ratio.
The remainder of this paper is organized as follows. In Section 2, we review related work on computation offloading and resource allocation in edge computing systems. In Section 3, we present the system model and relevant assumptions. The proposed water pumping and refilling algorithm is introduced in Section 4. In Section 5, we analyze and discuss the simulation results. Finally, our paper is concluded in Section 6.

2. Related Work

Among the delay-minimal schemes, Ren et al. [7] proposed a partial offloading model that divides a computational task into two parts that are collaboratively processed by an edge server and a remote cloud, and provided a closed-form task splitting ratio and resource allocation scheme. Similarly, Wang et al. [12] aimed to minimize task duration while meeting energy consumption constraints by allocating bandwidth equally to connected offloading user devices, using the alternating direction method of multipliers (ADMM) algorithm to determine the computation node selection and computing resource allocation schemes. Wei et al. [13] jointly selected the computation node, decided on the content caching, and determined the resource allocation (including radio bandwidth and computing resources) with a two-hidden-layer deep neural network to minimize the end-to-end delay of computation offloading and content delivery services, where the channel state and resource allocation are discretized. Although these schemes have achieved impressive performance, they still fall into the category of delay-minimal schemes, which means their resource allocation results could be fine-tuned to accept more requests within a deadline.
Similarly, many research efforts have been made to devise energy cost minimization (ECM) schemes [14,15]. For example, Wang et al. [5] formulated ECM and delay cost minimization (DCM) problems for single-user partial computation offloading, using the dynamic voltage scaling technique [16]. They optimized transmission power, computation resource allocation, and the offloading ratio, and reached the insightful conclusion that if a task has a stringent delay requirement (less than a threshold), it cannot be processed in a partitioned way. In [17], requests from representative locations are grouped, and computation results for requests with duplicated inputs are selectively cached. The authors formulated the cache decision, bandwidth allocation, and computing resource allocation problem as an MINLP problem, aiming to minimize the energy consumption of the base station (BS) and all users. In [18], the deep deterministic policy gradient (DDPG) algorithm is adopted to solve the joint computation node selection and computing resource allocation problem, aiming to minimize system energy consumption; the authors constructed an SBS–MBS three-layer offloading model for delay-stringent tasks. Apart from cloud- and edge-enabled processing models, some researchers enlist user devices to enhance performance. For instance, Huang et al. [19] proposed an edge–end cooperation scheme where mobile devices act as computing servers, to minimize the energy consumed by the mobile devices. In [20], a three-node computation offloading scenario was studied, in which the user device near the access point (AP) is exploited to relay and compute the task of the far user device, with the aim of minimizing the energy consumed by the AP in a wirelessly powered system.
Several studies have investigated resource allocation in computation offloading from different perspectives. In [21], the authors assumed adequate bandwidth between vehicles and the associated MEC server and adopted a Q-value-based deep reinforcement learning method to maximize the acceptance rate in vehicular networks. Zhou and Hu [22] aimed to maximize the ratio of processed bits to the energy consumed by energy-harvesting user devices in non-orthogonal multiple access and time division multiple access systems. Mukherjee et al. [23] focused on maximizing system revenue by analyzing the pricing strategy for offloaded tasks with different time constraints. Yan et al. [24] formulated the DCM and revenue maximization problem as a two-stage game, where computing resources at the BS were equally allocated. Wang et al. [25] designed an online auction mechanism to maximize the profit of resource providers in an energy-effective way. Zhou and Zhang [26] proposed compensating tasks with a higher delay-to-deadline ratio to minimize the maximal ratio among users, which was solved using an evolutionary algorithm. Hejja et al. [27] maximized the number of serviced offloading requests under the network function virtualization framework. Finally, Meng et al. [28] developed a task dispatching and scheduling algorithm that focuses on computing node selection and task scheduling to maximize the number of tasks meeting deadline requirements. We summarize some of the studies mentioned above in Table 1.
Furthermore, recent research has explored combining deep reinforcement learning with computation offloading to enhance performance [13,17,30,31,32,33]. However, these approaches do not address the problem of maximizing the number of accepted requests among a flood of requests with limited resources in an edge-enabled asymmetrical network. This requires efficiently allocating resources and selecting the appropriate requests while considering their respective delay requirements. This paper proposes a solution to this problem, wherein tasks are processed at the edge server and rejected tasks are considered task failures. Notably, our proposed algorithm can function both as a supplementary approach and as a standalone approach.

3. System Model

As shown in Figure 1, the system consists of a single BS and M UDs. The BS is equipped with an edge computing server with a computing power of $F^e$ (in CPU cycles per second). The BS is connected to the edge server via a fiber link [34] to provide computing services for UDs. UDs issue computation offloading requests to the associated BS if local computing resources cannot complete the task within the deadline $T^{dl}$, and such requests are considered to be timeouts if they are rejected. It is assumed that the computation-intensive tasks arrive at the beginning of a schedule interval [35] and only one task is generated from each UD. Local processing is not considered, and requests with no resources allocated are treated as timeouts.
There are K kinds of computation applications running in the system. In this paper, it is assumed that the input of computational application k is one of the $F_k$ input files. A computation request of application k is specified by the task input size $l_f^k$ (in bits), the computation load $L_f^k$ (the desired computing resource, in CPU cycles), and the task deadline $T_f^k$ (this can also be a QoS-related delay parameter). Thus, a UD m requesting computation offloading of application k with task input f ($f \in \mathcal{F}_k = \{1, 2, 3, \ldots, F_k\}$) can be specified with $r_m = (l_f^k, L_f^k, T_f^k)$. $r_m$ indicates the computation task from UD m in the sequel. For simplicity of notation, the parameters of task $r_m$ are written as $(l_m, L_m, T_m^{dl})$.
When UDs cannot complete their tasks within the given deadline, they request the associated BS for computation offloading [3]. While some previous research considered local processing power on the UDs, this paper excludes UDs that can process tasks without exceeding the deadline ($L_m / f_m \le T_m^{dl}$, where $f_m$ denotes the local computing power of UD m). As a result, the BS can either accept or decline the computation offloading requests, depending on its processing capabilities. The BS tries to accommodate the requests to the best of its abilities. When task $r_m$ is accepted (i.e., $x_m = 1$), the BS ensures that resource allocation meets the delay requirement (or QoS-related delay requirement). If task $r_m$ is denied (i.e., $x_m = 0$), the BS incurs a penalty. The BS makes decisions on joint offloading request acceptance and resource allocation with the aim of maximizing the number of accepted requests.

3.1. Transmission Model

The whole system is presumed to function in an orthogonal frequency division multiple access (OFDMA) mode with a bandwidth of B (in hertz). In this way, interference is not considered in our model. Owing to the asymmetry between uplink and downlink, the result download stage is not considered in our model [36]. Wireless access bandwidth is only allocated to accepted requests, and the BS assigns a bandwidth fraction $\alpha_m$ to UD m to upload the task input file (of size $l_m$). Consequently, the maximal upload data rate between UD m and the BS can be expressed as follows:
$$R_m = \alpha_m B \log_2\!\left(1 + \frac{p_m h_m^2}{\sigma^2}\right) = \alpha_m \hat{R}_m, \tag{1}$$
where $p_m$ is the upload transmission power (usually determined by the association control scheme) of UD m, and $h_m$ is the instantaneous channel gain between UD m and the BS. As the transmission may occur over several slots, the averaged channel gain $\bar{h}_m$ is employed to substitute the instantaneous channel gain [7] across multiple frames (with channel estimation technologies [37]). Supposing request $r_m$ from UD m is accepted, the averaged upload transmission time for UD m to upload its computation task input can be written as:
$$t_m^{up} = \frac{l_m}{\alpha_m B \log_2\!\left(1 + \frac{p_m \bar{h}_m^2}{\sigma^2}\right)} = \frac{l_m}{\alpha_m \bar{R}_m}, \tag{2}$$
where $\bar{R}_m$ is the average of $B \log_2\!\left(1 + \frac{p_m \bar{h}_m^2}{\sigma^2}\right)$ during $t_m^{up}$.
In addition, the allocated access bandwidth of the BS cannot exceed its capacity:
$$\sum_{m \in \mathcal{M}_1} \alpha_m \le 1, \tag{3}$$
where $\mathcal{M}_1$ denotes the set of accepted requests from UDs.

3.2. Computing Model

The heterogeneity in computing resource requirements among accepted requests necessitates efficient allocation of computing resources by the edge server. $\mathcal{M}_1$ denotes the set of accepted requests and $\mathcal{M}_0$ denotes the set of rejected requests. The BS assigns a fraction $\beta_m$ of its computing resources to UD m to process the computation task. The resulting computation delay can be written as:
$$t_m^e = \frac{L_m}{\beta_m F^e}. \tag{4}$$
The allocation of computing resources must adhere to its capacity constraint, which can be formally expressed as:
$$\sum_{m \in \mathcal{M}_1} \beta_m \le 1. \tag{5}$$
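To make the model concrete, the following minimal Python sketch (all names are ours, not the paper’s) encodes a request $r_m = (l_m, L_m, T_m^{dl})$ together with its average rate $\bar{R}_m$ and evaluates the delays of Equations (2) and (4):

```python
from dataclasses import dataclass

@dataclass
class Request:
    l: float      # task input size l_m (bits)
    L: float      # computation load L_m (CPU cycles)
    T_dl: float   # deadline T_m^dl (seconds)
    R_bar: float  # average full-band upload rate R̄_m (bits/s)

def upload_time(r: Request, alpha: float) -> float:
    """Eq. (2): averaged upload time given a bandwidth fraction alpha."""
    return r.l / (alpha * r.R_bar)

def compute_time(r: Request, beta: float, F_e: float) -> float:
    """Eq. (4): edge computation delay given a computing fraction beta."""
    return r.L / (beta * F_e)
```

For example, with $l_m = 10^7$ bits, $\bar{R}_m = 10^8$ bits/s, $L_m = 2 \times 10^8$ cycles, $F^e = 10^{10}$ cycles/s (the value used in Section 5), and half of each resource ($\alpha_m = \beta_m = 0.5$), the end-to-end delay is $0.2 + 0.04 = 0.24$ s.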

3.3. Problem Formulation

If a request r m is accepted, the BS ensures that the associated delay constraint is not violated. This requirement can be expressed as:
$$T_m = t_m^{up} + t_m^e \le T_m^{dl}, \quad \forall m \in \mathcal{M}_1. \tag{6}$$
The main objective is to maximize the number of accommodated requests subject to the constraints imposed by the limited system resources and deadlines, which can be formally expressed as:
$$\text{(P1):} \quad \max_{\boldsymbol{x}, \boldsymbol{\alpha}, \boldsymbol{\beta}} \; \sum_{m=1}^{M} \mathbb{1}(x_m)$$
$$\text{s.t.} \quad \text{C1: } \sum_{m=1}^{M} x_m \alpha_m \le 1, \qquad \text{C2: } \sum_{m=1}^{M} x_m \beta_m \le 1, \qquad \text{C3: } x_m \left( \frac{l_m}{\alpha_m \bar{R}_m} + \frac{L_m}{\beta_m F^e} \right) \le T_m^{dl}, \; \forall m \in \mathcal{M}.$$
$\mathbb{1}(x_m)$ is an indicator function that takes the value one when $x_m = 1$ and zero otherwise. Constraints C1 and C2 ensure that the bandwidth and computing resources allocated to the accepted requests do not exceed the system’s capacity. Constraint C3 ensures that the delay requirements of the accepted requests are met.
Problem (P1) is known to be intractable, due to the non-smooth and non-convex nature of the objective function. To overcome this challenge, Problem (P1) is transformed into Problem (P2) with the objective of minimizing the total delay cost of all requests. This transformation reduces the search domain of Problem (P1) and enables more efficient solution approaches to solve Problem (P2). Two key observations form the basis of this transformation. First, for an accepted request, the processing delay is always less than the corresponding deadline, while a rejected request receives a relatively large penalty. In this way, the system delay cost can be reduced by accepting more requests. Second, the optimal allocation scheme to Problem (P1) does not necessarily minimize the overall delay cost. Such a scheme could lead to higher delays compared to Problem (P2), which leaves more vacant resources for other requests. Consequently, solutions to Problem (P1) can be obtained by first solving the delay cost minimization (P2) and pushing the resource allocation scheme near to the deadline.
In this paper, rejected requests are treated as timed-out and discarded, incurring penalties. Specifically, the penalty of a timed-out request is denoted as $\eta T_n^{dl}, n \in \mathcal{M}_0$, where $\eta \ge 1$ is a constant factor for all requests. This penalty factor reflects the severity of a timed-out request in terms of delay cost. To account for these penalties, Problem (P2) is formulated by adding penalty terms for timed-out requests:
$$\text{(P2):} \quad \min_{\boldsymbol{x}, \boldsymbol{\alpha}, \boldsymbol{\beta}} \; \sum_{m=1}^{M} x_m \left( t_m^{up} + t_m^e \right) + (1 - x_m) \eta T_m^{dl} \qquad \text{s.t.} \quad \text{C1, C2, C3}.$$
Problem (P2) is a challenging MINLP problem with coupled decision variables. To address this challenge, we follow previous works [38] and decompose Problem (P2) into two sub-problems: request acceptance and resource allocation. The binary request acceptance problem can be solved with a coordinate descent algorithm [39]. Note that both request acceptance and resource allocation influence the final effect. Specifically, this paper focuses on solving the resource allocation sub-problem (P2.1) to gain insight into Problem (P2).
$$\text{(P2.1):} \quad \min_{\boldsymbol{\alpha}, \boldsymbol{\beta}} \; \sum_{m \in \mathcal{M}_1} \left( t_m^{up} + t_m^e \right) + \sum_{n \in \mathcal{M}_0} \eta T_n^{dl}$$
$$\text{s.t.} \quad \text{C4: } \sum_{m \in \mathcal{M}_1} \alpha_m \le 1, \qquad \text{C5: } \sum_{m \in \mathcal{M}_1} \beta_m \le 1, \qquad \text{C6: } \frac{l_m}{\alpha_m \bar{R}_m} + \frac{L_m}{\beta_m F^e} \le T_m^{dl}, \; \forall m \in \mathcal{M}_1.$$
It can be seen from the above that, once $\boldsymbol{x}$ is given ($\mathcal{M}_0$ and $\mathcal{M}_1$ are determined), Problem (P2.1) can be reformulated as a convex optimization problem [38]. Once an optimal resource allocation scheme is obtained, we readjust the resource allocation result in $\mathcal{M}_1$ and try to allocate resources to requests in $\mathcal{M}_0$, to accommodate more requests without violating the deadline constraints, and thereby obtain solutions to the original Problem (P1). This process allows us to iteratively refine our solutions and obtain an optimized allocation of system resources that maximizes the number of accommodated requests within the constraints of the system’s limited resources and deadlines.

4. The Water-Pumping Algorithm

In this section, the classic Lagrange multiplier method and KKT conditions are adopted to derive the solution to Problem (P2.1). Based on this solution, the water pumping and refilling algorithm is proposed to solve the original asymmetric resource allocation problem stated in Problem (P1).

4.1. The Delay Minimum Solution

By introducing Lagrange multipliers $\lambda_1$, $\lambda_2$, and $\boldsymbol{\nu} = (\nu_1, \nu_2, \ldots, \nu_m), m \in \mathcal{M}_1$, the Lagrange function of (P2.1) can be formulated as:
$$\mathcal{L}(\boldsymbol{\alpha}, \boldsymbol{\beta}, \lambda_1, \lambda_2, \boldsymbol{\nu}) = \sum_{m \in \mathcal{M}_1} \left( \frac{l_m}{\alpha_m \bar{R}_m} + \frac{L_m}{\beta_m F^e} \right) + \sum_{n \in \mathcal{M}_0} \eta T_n^{dl} + \lambda_1 \left( \sum_{m \in \mathcal{M}_1} \alpha_m - 1 \right) + \lambda_2 \left( \sum_{m \in \mathcal{M}_1} \beta_m - 1 \right) + \sum_{m \in \mathcal{M}_1} \nu_m \left( \frac{l_m}{\alpha_m \bar{R}_m} + \frac{L_m}{\beta_m F^e} - T_m^{dl} \right). \tag{7}$$
Based on Equation (7) and in-depth analysis of Problem (P2.1), using KKT conditions, the following corollaries can be obtained.
Corollary 1. 
The necessary condition for an optimal bandwidth and computing resource allocation scheme for $\mathcal{M}_1$ is given by:
$$(\alpha_m, \beta_m) = \left( \sqrt{\frac{(1+\nu_m) l_m}{\bar{R}_m \lambda_1}}, \; \sqrt{\frac{(1+\nu_m) L_m}{F^e \lambda_2}} \right) = \left( \frac{\sqrt{(1+\nu_m) l_m / \bar{R}_m}}{\sum_{i \in \mathcal{M}_1} \sqrt{(1+\nu_i) l_i / \bar{R}_i}}, \; \frac{\sqrt{(1+\nu_m) L_m / F^e}}{\sum_{i \in \mathcal{M}_1} \sqrt{(1+\nu_i) L_i / F^e}} \right). \tag{8}$$
Proof. 
Please see the detailed proof in Appendix A.    □
The optimal bandwidth and computing resource allocation scheme for UD m can be obtained from Equation (8). It can be observed that the optimal bandwidth allocation is proportional to $\sqrt{l_m / \bar{R}_m}$, which implies that requests with larger input data sizes and poorer channel conditions receive a larger share of the bandwidth allocation. Similarly, requests with higher computation loads are allocated more edge computing resources.
Corollary 2. 
The optimal allocation scheme achieves the minimal delay cost for all $m \in \mathcal{M}_1$ when the Lagrange multipliers $\nu_m$ take the same value (e.g., 0), which can be denoted as:
$$(\alpha_m^*, \beta_m^*) = \left( \frac{\sqrt{l_m / \bar{R}_m}}{\sum_{i \in \mathcal{M}_1} \sqrt{l_i / \bar{R}_i}}, \; \frac{\sqrt{L_m / F^e}}{\sum_{i \in \mathcal{M}_1} \sqrt{L_i / F^e}} \right). \tag{9}$$
Proof. 
Please see the detailed proof in Appendix B.    □
Equation (9) indicates that the delay-minimum resource allocation scheme for Problem (P2.1) exhausts all system resources on the accepted requests, without taking into account the possibility of over-provisioning a request relative to its maximum acceptable delay. In general, the ratio of a user’s processing delay to its deadline under the delay-minimum scheme is less than one ($T_m < T_m^{dl}, \forall m \in \mathcal{M}_1$), where the processing delay of an accepted request $r_m$ can be represented as:
$$T_m = \sqrt{\frac{\lambda_1}{1+\nu_m}} \sqrt{\frac{l_m}{\bar{R}_m}} + \sqrt{\frac{\lambda_2}{1+\nu_m}} \sqrt{\frac{L_m}{F^e}}. \tag{10}$$
Upon comparing Equation (8) with Equation (9), it can be observed that the Lagrange multipliers $\nu_m, m \in \mathcal{M}_1$ play a crucial role in regulating the allocation of resources, which, in turn, impacts the processing delays of accepted requests. Thus, the proposed WPR algorithm increases the number of accepted requests by reallocating resources (setting $\nu_m$) to extend the processing delays of accepted requests up to their respective deadlines ($T_m \to T_m^{dl}, \forall m \in \mathcal{M}_1$).
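The closed forms above translate directly into code. The sketch below (our own naming, building on the Request class introduced in Section 3) evaluates the general allocation of Equation (8), the delay-minimum special case of Equation (9) with all $\nu_m = 0$, and the resulting delays of Equation (10):

```python
import math

def allocation_with_nu(requests, nus, F_e):
    """Eq. (8): resource shares as a function of the multipliers nu_m.
    Also returns √λ1 and √λ2, computed as in Eqs. (A14) and (A15)."""
    wa = [math.sqrt((1 + v) * r.l / r.R_bar) for r, v in zip(requests, nus)]
    wb = [math.sqrt((1 + v) * r.L / F_e) for r, v in zip(requests, nus)]
    sqrt_l1, sqrt_l2 = sum(wa), sum(wb)
    return [w / sqrt_l1 for w in wa], [w / sqrt_l2 for w in wb], sqrt_l1, sqrt_l2

def processing_delay(r, nu, sqrt_l1, sqrt_l2, F_e):
    """Eq. (10): delay of an accepted request under the allocation of Eq. (8)."""
    return (sqrt_l1 * math.sqrt(r.l / ((1 + nu) * r.R_bar))
            + sqrt_l2 * math.sqrt(r.L / ((1 + nu) * F_e)))

def delay_minimum_allocation(requests, F_e):
    """Eq. (9): the delay-minimum scheme is the special case of equal nu_m."""
    return allocation_with_nu(requests, [0.0] * len(requests), F_e)
```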

4.2. Water Pumping

It can be inferred from Equation (10) that $T_m$ can be extended to equal $T_m^{dl}$ by adjusting $\nu_m$, while keeping $\lambda_1, \lambda_2$ (which can be calculated with $\boldsymbol{\nu} = (\nu_1, \nu_2, \ldots, \nu_m), m \in \mathcal{M}_1$, using Equation (8)) unchanged. This can be denoted as:
$$T_m^{dl} = \sqrt{\frac{\lambda_1}{1+\nu_m^p}} \sqrt{\frac{l_m}{\bar{R}_m}} + \sqrt{\frac{\lambda_2}{1+\nu_m^p}} \sqrt{\frac{L_m}{F^e}}. \tag{11}$$
Based on this observation, we propose the term “water pumping” to describe the procedure whereby the value of $\nu_m$, the water level parameter of an accepted request $r_m$, is adjusted from its initial value $\nu_m$ to a new value $\nu_m^p$. To further characterize this process, we define the ratio $a_m = \frac{T_m}{T_m^{dl}} = \sqrt{\frac{1+\nu_m^p}{1+\nu_m}}$. By manipulating the value of $\nu_m$, the processing delay $T_m$ is extended to the desired maximal acceptable deadline $T_m^{dl}$ for request $r_m$:
$$\nu_m^p = a_m^2 (1 + \nu_m) - 1. \tag{12}$$
Notably, the proposed scheme can be readily extended to a cloud–edge collaboration model [12,40,41]. In this case, the ratio $a_m$ can be redefined as $\frac{T_m}{T_m^{dl} - T^{rtt}}$, where $T^{rtt}$ denotes the round-trip time between the edge node and the remote cloud server. As we adjust the value of $\nu_m$ to $\nu_m^p$, the vector $\boldsymbol{\nu} = (\nu_1, \nu_2, \ldots, \nu_m), m \in \mathcal{M}_1$ shifts to the new value $\boldsymbol{\nu}^p = (\nu_1, \nu_2, \ldots, \nu_m^p), m \in \mathcal{M}_1$, and $\lambda_1$ and $\lambda_2$ shift to $\lambda_1^p$ and $\lambda_2^p$ accordingly. The decrement of $\sqrt{\lambda_1}, \sqrt{\lambda_2}$, which is termed the “pumped water”, can be expressed as follows:
$$\Delta_1 = \sqrt{\lambda_1} - \sqrt{\lambda_1^p} = \sqrt{\frac{l_m}{\bar{R}_m}} \left( \sqrt{1+\nu_m} - \sqrt{1+\nu_m^p} \right) = (1 - a_m) \sqrt{\frac{(1+\nu_m) l_m}{\bar{R}_m}}, \tag{13}$$
$$\Delta_2 = \sqrt{\lambda_2} - \sqrt{\lambda_2^p} = \sqrt{\frac{L_m}{F^e}} \left( \sqrt{1+\nu_m} - \sqrt{1+\nu_m^p} \right) = (1 - a_m) \sqrt{\frac{(1+\nu_m) L_m}{F^e}}. \tag{14}$$
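A single pumping step is then only a few lines of Python (again a sketch with our own naming; it reuses the Request class and builds on the functions above):

```python
def pump(r, nu, T, F_e):
    """One 'water pumping' step, Eqs. (12)-(14): shrink the multiplier of
    request r so that its delay T stretches to the deadline, and return the
    freed ('pumped') water on each resource."""
    a = T / r.T_dl                                           # a_m = T_m / T_m^dl <= 1
    nu_p = a * a * (1 + nu) - 1                              # Eq. (12)
    delta1 = (1 - a) * math.sqrt((1 + nu) * r.l / r.R_bar)   # Eq. (13)
    delta2 = (1 - a) * math.sqrt((1 + nu) * r.L / F_e)       # Eq. (14)
    return nu_p, delta1, delta2
```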

4.3. Water Refilling

To accommodate request $r_{m+1}$, the pumped water $\Delta_1$ and $\Delta_2$ should be refilled to ensure that $r_{m+1}$ does not violate its deadline constraint. The procedure of setting a feasible $\nu_{m+1}$ for $r_{m+1}$ is termed “water refilling”. Successful water refilling involves finding a suitable $r_{m+1}$ and setting $\nu_{m+1}$ within the following constraints:
$$\sqrt{\lambda_1'} = \sqrt{\frac{(1+\nu_1) l_1}{\bar{R}_1}} + \cdots + \sqrt{\frac{(1+\nu_{m-1}) l_{m-1}}{\bar{R}_{m-1}}} + \sqrt{\frac{(1+\nu_m^p) l_m}{\bar{R}_m}} + \sqrt{\frac{(1+\nu_{m+1}) l_{m+1}}{\bar{R}_{m+1}}} = \sqrt{\lambda_1^p} + \sqrt{\frac{(1+\nu_{m+1}) l_{m+1}}{\bar{R}_{m+1}}} \le \sqrt{\lambda_1}, \tag{15}$$
$$\sqrt{\lambda_2'} = \sqrt{\frac{(1+\nu_1) L_1}{F^e}} + \cdots + \sqrt{\frac{(1+\nu_{m-1}) L_{m-1}}{F^e}} + \sqrt{\frac{(1+\nu_m^p) L_m}{F^e}} + \sqrt{\frac{(1+\nu_{m+1}) L_{m+1}}{F^e}} = \sqrt{\lambda_2^p} + \sqrt{\frac{(1+\nu_{m+1}) L_{m+1}}{F^e}} \le \sqrt{\lambda_2}, \tag{16}$$
$$T_{m+1} = \sqrt{\frac{\lambda_1'}{1+\nu_{m+1}}} \sqrt{\frac{l_{m+1}}{\bar{R}_{m+1}}} + \sqrt{\frac{\lambda_2'}{1+\nu_{m+1}}} \sqrt{\frac{L_{m+1}}{F^e}} \le T_{m+1}^{dl}. \tag{17}$$
Here, $\lambda_1', \lambda_2'$ are obtained with $\boldsymbol{\nu}' = (\nu_1, \nu_2, \ldots, \nu_{m-1}, \nu_m^p, \nu_{m+1}), i \in \mathcal{M}_1'$, where $\mathcal{M}_1' = \mathcal{M}_1 \cup \{m+1\}$, according to Equations (A14) and (A15) ($\lambda_1^p$ and $\lambda_2^p$ are obtained with $\boldsymbol{\nu}^p$). The inequalities in Equations (15) and (16) ensure that there is no violation of the resource constraints C1 and C2. Equation (17) guarantees that the allocated resource for $r_{m+1}$ meets its deadline requirement. In this regard, compared with Equations (13) and (14), Equations (15) and (16) can be rewritten as:
$$\sqrt{\frac{(1+\nu_{m+1}) l_{m+1}}{\bar{R}_{m+1}}} \le \Delta_1, \tag{18}$$
$$\sqrt{\frac{(1+\nu_{m+1}) L_{m+1}}{F^e}} \le \Delta_2. \tag{19}$$
Furthermore, the value of $\nu_{m+1}$ for the newly accepted request $r_{m+1}$ can be decided with the following formula:
$$\nu_{m+1} = \min\left\{ \frac{\bar{R}_{m+1}}{l_{m+1}} \Delta_1^2 - 1, \; \frac{F^e}{L_{m+1}} \Delta_2^2 - 1 \right\}, \quad \text{s.t. } T_{m+1} \le T_{m+1}^{dl}. \tag{20}$$
Unfortunately, a single trial of “water pumping” may not necessarily result in successful “water refilling”, so situations where multiple trials of “water pumping” are necessary to ensure successful refilling should be considered. Denote $\mathcal{S}$ ($\mathcal{S} \subseteq \mathcal{M}_1$) as the current set of “pumped requests”, containing the requests with shrunken $\nu$ ($\nu_i \to \nu_i^p, \forall i \in \mathcal{S}$), and $\mathcal{U}$ ($\mathcal{U} \subseteq \mathcal{M}_1$, $\mathcal{U} \cap \mathcal{S} = \emptyset$ and $\mathcal{U} \cup \mathcal{S} = \mathcal{M}_1$) as the set of “unpumped requests”. According to a predetermined rule (the pumping policy $\mathcal{P}$), a request can be selected from $\mathcal{U}$ to perform “water pumping”. Assuming $r_{m+1}$ is the request chosen to be “refilled” based on another predetermined rule, the refilling policy $\mathcal{R}$, after several pumping trials $\nu_{m+1}$ can be determined without violating the following constraints:
$$\sqrt{\lambda_1'} = \sum_{i \in \mathcal{S}} \sqrt{\frac{(1+\nu_i^p) l_i}{\bar{R}_i}} + \sum_{j \in \mathcal{U}} \sqrt{\frac{(1+\nu_j) l_j}{\bar{R}_j}} + \sqrt{\frac{(1+\nu_{m+1}) l_{m+1}}{\bar{R}_{m+1}}} \le \sqrt{\lambda_1}, \tag{21}$$
$$\sqrt{\lambda_2'} = \sum_{i \in \mathcal{S}} \sqrt{\frac{(1+\nu_i^p) L_i}{F^e}} + \sum_{j \in \mathcal{U}} \sqrt{\frac{(1+\nu_j) L_j}{F^e}} + \sqrt{\frac{(1+\nu_{m+1}) L_{m+1}}{F^e}} \le \sqrt{\lambda_2}. \tag{22}$$
Compared with Equations (18) and (19), Equations (21) and (22) can be rewritten as follows:
$$\Sigma\Delta_1 = \sqrt{\lambda_1} - \sqrt{\lambda_1'} = \sum_{i \in \mathcal{S}} (1 - a_i) \sqrt{\frac{(1+\nu_i) l_i}{\bar{R}_i}} - \sqrt{\frac{(1+\nu_{m+1}) l_{m+1}}{\bar{R}_{m+1}}} \ge 0, \tag{23}$$
$$\Sigma\Delta_2 = \sqrt{\lambda_2} - \sqrt{\lambda_2'} = \sum_{i \in \mathcal{S}} (1 - a_i) \sqrt{\frac{(1+\nu_i) L_i}{F^e}} - \sqrt{\frac{(1+\nu_{m+1}) L_{m+1}}{F^e}} \ge 0. \tag{24}$$
In this way, the $\nu_{m+1}$ after multiple pumping trials can be obtained from:
$$\nu_{m+1} = \min\left\{ \frac{\bar{R}_{m+1}}{l_{m+1}} (\Sigma\Delta_1)^2 - 1, \; \frac{F^e}{L_{m+1}} (\Sigma\Delta_2)^2 - 1 \right\}. \tag{25}$$
It should be noted that requests $r_i, i \in \mathcal{S}$ do not violate their deadline requirements after accepting the new request $r_{m+1}$. Equation (11) indicates that the processing time of a pumped request is extended to its deadline. Equations (15) and (21) guarantee that $\lambda_1' \le \lambda_1$; similarly, $\lambda_2' \le \lambda_2$ holds. As a result, the processing delay of a pumped request can be reformulated as:
$$T_m = \sqrt{\frac{\lambda_1'}{1+\nu_m^p}} \sqrt{\frac{l_m}{\bar{R}_m}} + \sqrt{\frac{\lambda_2'}{1+\nu_m^p}} \sqrt{\frac{L_m}{F^e}} \le \sqrt{\frac{\lambda_1}{1+\nu_m^p}} \sqrt{\frac{l_m}{\bar{R}_m}} + \sqrt{\frac{\lambda_2}{1+\nu_m^p}} \sqrt{\frac{L_m}{F^e}} = T_m^{dl}. \tag{26}$$
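In code, a refill attempt reduces to Equation (25) plus a deadline check. The sketch below (our naming; it simplifies by checking the deadline conservatively with the pre-refill $\sqrt{\lambda_1}, \sqrt{\lambda_2}$, which upper-bound $\sqrt{\lambda_1'}, \sqrt{\lambda_2'}$ by Equations (21) and (22)) returns the new multiplier or None:

```python
def try_refill(r_new, sum_d1, sum_d2, sqrt_l1, sqrt_l2, F_e):
    """'Water refilling' for a rejected request, Eqs. (23)-(25).
    sum_d1/sum_d2 hold the cumulative pumped water ΣΔ1, ΣΔ2."""
    # Eq. (25): the largest multiplier that fits inside the pumped water
    nu = min(r_new.R_bar / r_new.l * sum_d1 ** 2 - 1,
             F_e / r_new.L * sum_d2 ** 2 - 1)
    if nu <= -1:               # no positive resource share can be carved out
        return None
    # Eq. (17), checked conservatively with √λ1 >= √λ1' and √λ2 >= √λ2'
    T = processing_delay(r_new, nu, sqrt_l1, sqrt_l2, F_e)
    return nu if T <= r_new.T_dl else None
```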
The WPR algorithm is, in essence, a way of allocating network resources so that accepted computation offloading requests receive near-to-deadline service. The algorithm converges to its final result by continuously pumping and refilling until no further request can be successfully added. The whole procedure of the WPR algorithm is summarized in Algorithm 1, and the notations are listed in Table 2.

4.4. Complexity Analysis

The proposed WPR algorithm provides a solution to Problem (P1) and offers flexibility in designing service strategies by allowing the choice of requests to shrink during each iteration. It is important to note that the WPR algorithm converges within at most M refilling iterations, provided that the exit condition in Line 14 of Algorithm 1 is not met. Assume that the last successful one-to-one pumping and refilling happens when $|\mathcal{M}_1^c| = m_1$. After this moment, multiple rounds of pumping are necessary to ensure a successful refilling. Once $|\mathcal{M}_1^c| = m_2$, no further refilling attempts succeed, and the algorithm terminates. Therefore, the total pumping procedure takes at most $m_1 + (m_1 + 1) + (m_1 + 2) + \cdots + (m_2 - 1) + m_2 \le |M|^2$ iterations. Thus, the time complexity of the WPR algorithm is $O(M^2)$.
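For completeness, the arithmetic-series bound behind the $O(M^2)$ claim can be made explicit (a one-line derivation of ours, using $1 \le m_1 \le m_2 \le M$):

$$\sum_{k=m_1}^{m_2} k = \frac{(m_1 + m_2)(m_2 - m_1 + 1)}{2} \le \frac{(2M) \cdot M}{2} = M^2.$$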
Algorithm 1: Water Pumping and Refilling
Input: initial accepted requests set $\mathcal{M}_1$, initial $\boldsymbol{\nu}$ of $\mathcal{M}_1$, pumping policy $\mathcal{P}$, and refilling policy $\mathcal{R}$;
Output: final accepted requests set $\mathcal{M}_1^f$, final $\boldsymbol{\nu}$ of $\mathcal{M}_1$;
1: current set of accepted requests that have not been pumped $\mathcal{U} = \mathcal{M}_1$;
2: current shrunk set $\mathcal{S} = \emptyset$; last $\boldsymbol{\nu}^l = \boldsymbol{\nu}$ of the accepted requests $\mathcal{M}_1$ after a successful refilling;
3: current rejected requests set $\mathcal{M}_0 = \mathcal{M} \setminus \mathcal{M}_1^c$; current $\boldsymbol{\nu}^c$ of the current accepted requests $\mathcal{M}_1^c$;
4: last $\mathcal{M}_1^l = \mathcal{M}_1^c$ after reset (used as the condition to exit the loop); $\Sigma\Delta_1 = 0$, $\Sigma\Delta_2 = 0$;
5: while $\mathcal{U} \neq \emptyset$ do:
6:     get the request $r_m$ to be pumped from $\mathcal{U}$ according to $\mathcal{P}$, $\mathcal{U} = \mathcal{U} \setminus \{m\}$;
7:     get $a_m$ and update $\nu_m$ with Equation (12) (the water pumping), update $\boldsymbol{\nu}^c$;
8:     get $\Delta_1$, $\Delta_2$ with Equations (13) and (14), update $\Sigma\Delta_1 \mathrel{+}= \Delta_1$, $\Sigma\Delta_2 \mathrel{+}= \Delta_2$;
9:     get the request $r_{m+1}$ to be refilled from $\mathcal{M}_0$ according to $\mathcal{R}$, obtain $\nu_{m+1}$ with Equation (25) (water refilling), and check whether $r_{m+1}$ exceeds its deadline;
10:    if the water refilling succeeds in accommodating $r_{m+1}$ then
11:        $\mathcal{M}_1^c = \mathcal{M}_1^c \cup \{m+1\}$, $\boldsymbol{\nu}^c = \boldsymbol{\nu}^c \cup \{\nu_{m+1}\}$, $\boldsymbol{\nu}^l = \boldsymbol{\nu}^c$, $\mathcal{U} = \mathcal{U} \cup \{m+1\}$, $\mathcal{M}_0 = \mathcal{M}_0 \setminus \{m+1\}$, $\Sigma\Delta_1 = 0$, $\Sigma\Delta_2 = 0$;
12:    else
13:        $\mathcal{M}_1^c$, $\boldsymbol{\nu}^c$, and $\mathcal{M}_0$ remain unchanged;
14:        if $\mathcal{U} = \emptyset$ then        ▹ reset $\mathcal{U}$ until no new request is accepted
15:            if $\mathcal{M}_1^c \neq \mathcal{M}_1^l$ then
16:                $\mathcal{M}_1^l = \mathcal{M}_1^c$;
17:                $\mathcal{U} = \mathcal{M}_1^c$, $\Sigma\Delta_1 = 0$, $\Sigma\Delta_2 = 0$;
18:            else
19:                return scheme $\mathcal{M}_1^c$, $\boldsymbol{\nu}^l$.
20: return scheme $\mathcal{M}_1^c$, $\boldsymbol{\nu}^l$.
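Putting the pieces together, the following condensed Python sketch mirrors the main loop of Algorithm 1 (our naming throughout; it hard-codes the default policies of Section 5.1: $\mathcal{P}$ pumps the unpumped request with the largest $a_m$, and $\mathcal{R}$ tries the rejected request with the smallest resource demand; a real implementation would also track $\mathcal{S}$, $\boldsymbol{\nu}^l$, and the reset bookkeeping of Lines 14–19):

```python
def wpr(accepted, nus, rejected, F_e):
    """Condensed sketch of Algorithm 1: pump until a refill succeeds,
    restart after every success, stop when a full pass adds nothing.
    'accepted' must be seeded with at least one request."""
    while rejected:
        _, _, s1, s2 = allocation_with_nu(accepted, nus, F_e)
        delays = [processing_delay(r, v, s1, s2, F_e)
                  for r, v in zip(accepted, nus)]
        # Policy P: pump in decreasing order of a_m = T_m / T_m^dl
        order = sorted(range(len(accepted)), reverse=True,
                       key=lambda i: delays[i] / accepted[i].T_dl)
        sum_d1 = sum_d2 = 0.0
        refilled = False
        for i in order:
            nus[i], d1, d2 = pump(accepted[i], nus[i], delays[i], F_e)
            sum_d1, sum_d2 = sum_d1 + d1, sum_d2 + d2
            # Policy R: try the rejected request with the smallest demand
            cand = min(rejected,
                       key=lambda r: processing_delay(r, 0.0, s1, s2, F_e))
            nu_new = try_refill(cand, sum_d1, sum_d2, s1, s2, F_e)
            if nu_new is not None:               # water refilling succeeded
                accepted.append(cand)
                nus.append(nu_new)
                rejected.remove(cand)
                refilled = True
                break
        if not refilled:                         # no request can be added
            break
    return accepted, nus
```

Seeded with the single cheapest request (per $\mathcal{R}$) and $\boldsymbol{\nu} = (0)$, this loop corresponds to the standalone WPR baseline of Section 5; seeded with a DCM solution, it corresponds to WPDCM.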

5. Simulation Results and Discussion

In this section, we present results and discussions concerning the edge-enabled asymmetrical network under different parameters.

5.1. Simulation Setting

The system parameters are defined as follows. The bandwidth of the BS is $8 \times 10^6$ Hz, and the computing capacity of the connected edge server is $F^e = 1 \times 10^{10}$ CPU cycles per second. The channel model utilized in this study was the same as the one presented in [38]. UDs upload task data with a fixed transmission power $p_m = 0.2, \forall m \in \mathcal{M}$. The system contains M = 20 UDs, and only one type of computation application is considered ($K = 1$). Accordingly, $F = 10$ input files are considered, with the input size $l_m^f$ (in bits) ranging from $l_{min} = 1 \times 10^6$ to $l_{max} = 20 \times 10^6$. Specifically, the input size of $r_m$ with input file f takes the value $l_m^f = l_{min} + (f-1) \frac{l_{max} - l_{min}}{F}$, with the file index f following a uniform distribution, a Zipf distribution [41,42] (skewness factor $\alpha = 1$) prioritizing small loads, or a Zipf distribution prioritizing large loads; a sketch of this task generator follows the baseline list below. The computation load $L_m$ of each task ranges from $L_{min} = 0.5 \times 10^8$ to $L_{max} = 4 \times 10^8$ (in CPU cycles), and the computation load of $r_m$ with input file f takes the value $L_m^f = L_{min} + (f-1) \frac{L_{max} - L_{min}}{F}$, following the same distribution as $l_m^f$. $T_m^{dl} = 0.4$ and the penalty factor $\eta = 10$. Unless otherwise specified, the results were obtained based on the uniform distribution. In this paper, the default policy $\mathcal{P}$ selects the request $r_m$ with the highest $a_m$ in $\mathcal{U}$ to pump first, and $\mathcal{R}$ tries to refill the request $r_i$ with the smallest $\sqrt{\lambda_1 l_i / \bar{R}_i} + \sqrt{\lambda_2 L_i / F^e}$ in $\mathcal{M}_0$. It should be noted that other $\mathcal{P}$ and $\mathcal{R}$ can be customized and adopted, such as smallest file size first (SFWPR) and best channel condition first (not considered in this paper). The following baseline algorithms were used in this paper:
  • Delay cost minimization (DCM): this scheme allocates resources for accepted requests with the aim of minimizing system delay costs. In cases where a request is rejected, the DCM scheme imposes a penalty instead of the processing delay.
  • Water pumping and refilling based on DCM (WPDCM): this scheme uses the results obtained from DCM as the input $\mathcal{M}_1$ of WPR and sets each item of $\boldsymbol{\nu}$ to 1.
  • Water pumping and refilling (WPR): the initial $\mathcal{M}_1$ only includes the request with the smallest $\sqrt{\lambda_1 l_i / \bar{R}_i} + \sqrt{\lambda_2 L_i / F^e}$, which is the refilling policy $\mathcal{R}$ used by default.
  • Smallest input file first water pumping and refilling (SFWPR): uses the default policy $\mathcal{P}$ while refilling requests with the smallest input size. The initial $\mathcal{M}_1$ only includes the request with the smallest input size.
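The task generator described above can be sketched as follows (parameter names are ours; the Zipf variants place probability mass $\propto 1/f^{\alpha}$ on the file indices, reversed when large files are prioritized):

```python
import numpy as np

def make_tasks(M=20, F=10, l_min=1e6, l_max=20e6, L_min=0.5e8, L_max=4e8,
               dist="uniform", zipf_alpha=1.0, large_first=False, seed=0):
    """Draw one task per UD following Sec. 5.1: a file index f in {1..F}
    is sampled, and l and L grow linearly in f."""
    rng = np.random.default_rng(seed)
    if dist == "uniform":
        p = np.full(F, 1.0 / F)
    else:                                   # truncated Zipf over the F files
        p = 1.0 / np.arange(1, F + 1) ** zipf_alpha
        p /= p.sum()
        if large_first:
            p = p[::-1]                     # put the mass on large files
    f = rng.choice(np.arange(1, F + 1), size=M, p=p)
    l = l_min + (f - 1) * (l_max - l_min) / F
    L = L_min + (f - 1) * (L_max - L_min) / F
    return l, L
```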

5.2. Result Discussion

Figure 2 illustrates the relationship between the number of UDs and the acceptance ratio of offloading requests in the system. As the number of UDs requesting computation offloading increased, the acceptance ratio of requests tended to decrease. The reason for this is clear: the limited system resources could not handle requests beyond the system’s capability. The WPR-based algorithms achieved a higher acceptance ratio than the DCM algorithm, because the DCM algorithm prioritized minimizing the sum processing delay of the accepted requests, at the expense of the system’s capacity to handle a high volume of user requests. In other words, while minimizing processing delay is a desirable goal for improving system performance, it may reduce the capacity to handle additional user requests. Moreover, the WPR scheme proposed in this study demonstrated competitive performance and the highest acceptance ratio among all the schemes considered, including WPDCM. Therefore, the proposed WPR algorithm can be used both as a supplementary algorithm to DCM and as an independent algorithm, offering competitive results.
Figure 3a illustrates that the aggregate system delay cost increased with the number of offloading UDs. This escalation is primarily attributed to the penalty incurred by rejecting redundant offloading requests that exceed the system’s resource capacity. Notably, the DCM algorithm incurred the highest delay cost, mainly due to its low acceptance ratio. In contrast, Figure 3b illustrates that the WPR-based algorithms incurred a higher average processing delay per accepted request than the DCM algorithm. The DCM algorithm aimed to minimize the processing delay of each accepted request, while the WPR-based algorithms permitted the processing delay of an accepted request to approach its deadline. This approach left more system resources available to accommodate additional requests.
From Figure 4, several conclusions can be drawn. Firstly, it can be observed that all schemes achieved higher acceptance ratios when the input sizes of the majority of tasks fell in a small range, as depicted in Figure 4a,b, because smaller service loads require fewer resources. Secondly, the proposed SFWPR scheme outperformed the DCM scheme in maximizing the acceptance ratio when more tasks carried smaller input sizes. As shown in Figure 4b, the acceptance ratio of SFWPR was higher than that of DCM and very close to those of WPR and WPDCM. However, it should be noted that SFWPR might not be suitable for scenarios wherein tasks with large input sizes constitute the majority. Lastly, as the input size of tasks increased, the acceptance ratio decreased, due to the increased service load.
Figure 5 presents the performance of the different schemes when the maximum computation load shifted from $1 \times 10^8$ CPU cycles to $4 \times 10^8$ CPU cycles. Similar to the observations obtained from Figure 4, all schemes achieved higher acceptance ratios when the computation loads of tasks varied in a small range. With a small range of computation loads, SFWPR outperformed DCM under the uniform distribution (Figure 5a) and the Zipf distribution prioritizing small input files (Figure 5b); this is because the minor differences in computation load made the refilling policies $\mathcal{R}$ of SFWPR and WPR approximately the same. However, as the maximum computation load increased, the performance of SFWPR degraded rapidly, and it became inferior to DCM in the case of the Zipf distribution prioritizing large input files (Figure 5c). The evident degradation shown in Figure 5c suggests that SFWPR was sensitive to the computing load.
Figure 6 depicts the acceptance ratio achieved by the various schemes as the system bandwidth varied from $6 \times 10^6$ Hz to $1.8 \times 10^7$ Hz. A larger bandwidth led to a higher acceptance ratio for all schemes, primarily because the over-provisioned bandwidth reduced the pressure on computing resources: with shorter transmission delays, there is a more considerable margin to compensate for computational delay. As a result, SFWPR outperformed DCM thanks to the additional system bandwidth, as compared with the results presented in Figure 2.
Figure 7 illustrates how the performance of the aforementioned schemes differed as the edge computational capacity increased from $5 \times 10^9$ CPU cycles per second to $30 \times 10^9$ CPU cycles per second. It can be concluded that the performance of SFWPR degraded rapidly as the computational capacity became smaller. This is attributed to its refilling policy $\mathcal{R}$, which inherently required more computing resources than that of the WPR scheme. Moreover, the performance gap between the WPR-based schemes (WPDCM, WPR) and DCM widened as the edge computational capacity increased, indicating the superior flexibility of WPR-based schemes in resource allocation.

6. Conclusions

Current minimum-latency schemes are inadequate for fully utilizing the resources of an asymmetrical system to provide satisfactory computation offloading services to UDs with a maximal request acceptance ratio. In this paper, we proposed a water pumping and refilling algorithm that exploits the margin between the delay-minimum scheme and the near-deadline scheme to maximize the acceptance ratio. We first solved the resource allocation sub-problem of the delay-minimum scheme, which inspired the design of the water pumping and refilling algorithm. The WPR algorithm can function not only as a supplementary algorithm to a given scheme, but also as a standalone algorithm that obtains a resource allocation scheme following a customizable refilling policy $\mathcal{R}$. The simulation results demonstrated that our proposed algorithm outperforms delay-minimum schemes in achieving a high acceptance ratio.
In the future, we plan to investigate computation offloading and resource allocation schemes under energy consumption constraints. Additionally, we will explore system models with collaborations between multiple base stations.

Author Contributions

Conceptualization, L.D. and Y.L.; methodology, L.D.; software, L.D.; validation, L.D. and W.H.; formal analysis, L.D.; investigation, L.D.; resources, Y.L.; data curation, L.D. and W.H.; writing—original draft preparation, L.D.; writing—review and editing, L.D. and W.H.; visualization, L.D.; supervision, Y.L.; project administration, Y.L.; funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof. 
The resulting Karush–Kuhn–Tucker (KKT) conditions can be denoted as follows:
$$\frac{\partial \mathcal{L}}{\partial \alpha_m} = \lambda_1 - \frac{(1+\nu_m) l_m}{\alpha_m^2 \bar{R}_m} = 0, \quad \forall m \in \mathcal{M}_1, \tag{A1}$$
$$\frac{\partial \mathcal{L}}{\partial \beta_m} = \lambda_2 - \frac{(1+\nu_m) L_m}{\beta_m^2 F^e} = 0, \quad \forall m \in \mathcal{M}_1, \tag{A2}$$
$$\sum_{m \in \mathcal{M}_1} \alpha_m - 1 \le 0, \tag{A3}$$
$$\lambda_1 \left( \sum_{m \in \mathcal{M}_1} \alpha_m - 1 \right) = 0, \tag{A4}$$
$$\sum_{m \in \mathcal{M}_1} \beta_m - 1 \le 0, \tag{A5}$$
$$\lambda_2 \left( \sum_{m \in \mathcal{M}_1} \beta_m - 1 \right) = 0, \tag{A6}$$
$$\nu_m \left( \frac{l_m}{\alpha_m \bar{R}_m} + \frac{L_m}{\beta_m F^e} - T_m^{dl} \right) = 0, \quad \forall m \in \mathcal{M}_1, \tag{A7}$$
$$\frac{l_m}{\alpha_m \bar{R}_m} + \frac{L_m}{\beta_m F^e} - T_m^{dl} \le 0, \tag{A8}$$
$$\lambda_1, \lambda_2 \ge 0. \tag{A9}$$
With some manipulation of Equations (A1) and (A2), we can obtain the bandwidth and computing resource allocation solution as follows:
$$\alpha_m = \sqrt{\frac{(1+\nu_m) l_m}{\bar{R}_m \lambda_1}}, \tag{A10}$$
$$\beta_m = \sqrt{\frac{(1+\nu_m) L_m}{F^e \lambda_2}}. \tag{A11}$$
With the contradiction technique adopted in Appendix B in [7], we have:
$$\sum_{m \in \mathcal{M}_1} \alpha_m - 1 = 0, \tag{A12}$$
$$\sum_{m \in \mathcal{M}_1} \beta_m - 1 = 0. \tag{A13}$$
Substituting Equation (A10) into Equation (A12) and Equation (A11) into Equation (A13), respectively, the Lagrange multipliers $\lambda_1, \lambda_2$ can be derived as:
$$\sqrt{\lambda_1} = \sum_{i \in \mathcal{M}_1} \sqrt{\frac{(1+\nu_i) l_i}{\bar{R}_i}}, \tag{A14}$$
$$\sqrt{\lambda_2} = \sum_{j \in \mathcal{M}_1} \sqrt{\frac{(1+\nu_j) L_j}{F^e}}. \tag{A15}$$
Finally, combining Equation (A10) with Equation (A14) and Equation (A11) with Equation (A15), the bandwidth and computing resource allocation scheme for Problem (P2.1) can be denoted as follows:
$$(\alpha_m, \beta_m) = \left( \frac{\sqrt{(1+\nu_m) l_m / \bar{R}_m}}{\sum_{i \in \mathcal{M}_1} \sqrt{(1+\nu_i) l_i / \bar{R}_i}}, \; \frac{\sqrt{(1+\nu_m) L_m / F^e}}{\sum_{i \in \mathcal{M}_1} \sqrt{(1+\nu_i) L_i / F^e}} \right), \quad \forall m \in \mathcal{M}_1. \tag{A16}$$
This ends the proof.  □

Appendix B

Proof. 
The sum of the upload transmission times of the UDs in $\mathcal{M}_1$ can be written as follows:
$$\sum_{m \in \mathcal{M}_1} \frac{l_m}{\alpha_m \bar{R}_m} = \sum_{m \in \mathcal{M}_1} \frac{l_m}{\alpha_m \bar{R}_m} \sum_{m \in \mathcal{M}_1} \alpha_m \ge \left( \sum_{m \in \mathcal{M}_1} \sqrt{\frac{l_m}{\alpha_m \bar{R}_m} \alpha_m} \right)^2 = \left( \sum_{m \in \mathcal{M}_1} \sqrt{\frac{l_m}{\bar{R}_m}} \right)^2. \tag{A17}$$
The first equality comes from Equation (A12), and the inequality comes from the Cauchy–Buniakowsky–Schwarz inequality, where equality holds if, for all UDs in $\mathcal{M}_1$:
$$\frac{\sqrt{l_1 / (\alpha_1 \bar{R}_1)}}{\sqrt{\alpha_1}} = \frac{\sqrt{l_2 / (\alpha_2 \bar{R}_2)}}{\sqrt{\alpha_2}} = \cdots = \frac{\sqrt{l_m / (\alpha_m \bar{R}_m)}}{\sqrt{\alpha_m}} = c, \tag{A18}$$
where c is a constant. With some manipulation, we have
$$\frac{l_m}{\alpha_m^2 \bar{R}_m} = \frac{\lambda_1}{1+\nu_m} = \frac{\left( \sum_{i \in \mathcal{M}_1} \sqrt{(1+\nu_i) l_i / \bar{R}_i} \right)^2}{1+\nu_m}, \tag{A19}$$
where the first equality comes from Equation (A10) and the second from Equation (A14). Thus, to achieve the minimum of Equation (A17), we need to determine the values of $\nu_m, m \in \mathcal{M}_1$ that make $\frac{\lambda_1}{1+\nu_m}$ constant (equal to $c^2$). An intuitive scheme is to assign $\nu_m, m \in \mathcal{M}_1$ equal values (for example, 1).
Similarly, we can conclude that the sum of the edge processing times $\sum_{m \in \mathcal{M}_1} \frac{L_m}{\beta_m F^e}$ achieves its minimum $\left( \sum_{m \in \mathcal{M}_1} \sqrt{L_m / F^e} \right)^2$ if:
$$\frac{L_m}{\beta_m^2 F^e} = \frac{\lambda_2}{1+\nu_m} = \frac{\left( \sum_{i \in \mathcal{M}_1} \sqrt{(1+\nu_i) L_i / F^e} \right)^2}{1+\nu_m} = d, \quad \forall m \in \mathcal{M}_1, \tag{A20}$$
where d is a constant. Thus, setting equal values for $\nu_m, \forall m \in \mathcal{M}_1$ obtains the minimum delay cost $\left( \sum_{m \in \mathcal{M}_1} \sqrt{l_m / \bar{R}_m} \right)^2 + \left( \sum_{m \in \mathcal{M}_1} \sqrt{L_m / F^e} \right)^2$. Accordingly, the optimal Lagrange multipliers $\lambda_1^*, \lambda_2^*$ and resource allocation decisions can be denoted as:
$$\lambda_1^* = \left( \sum_{i \in \mathcal{M}_1} \sqrt{\frac{l_i}{\bar{R}_i}} \right)^2, \quad \lambda_2^* = \left( \sum_{i \in \mathcal{M}_1} \sqrt{\frac{L_i}{F^e}} \right)^2, \tag{A21}$$
$$\alpha_m^* = \frac{\sqrt{l_m / \bar{R}_m}}{\sum_{i \in \mathcal{M}_1} \sqrt{l_i / \bar{R}_i}}, \quad \beta_m^* = \frac{\sqrt{L_m / F^e}}{\sum_{i \in \mathcal{M}_1} \sqrt{L_i / F^e}}. \tag{A22}$$
This ends the proof.  □

References

  1. Dai, Y.; Xu, D.; Maharjan, S.; Zhang, Y. Joint computation offloading and user association in multi-task mobile edge computing. IEEE Trans. Veh. Technol. 2018, 67, 12313–12325.
  2. Hu, Y.C.; Patel, M.; Sabella, D.; Sprecher, N.; Young, V. Mobile edge computing—A key technology towards 5G. ETSI White Pap. 2015, 11, 1–16.
  3. Lin, L.; Liao, X.; Jin, H.; Li, P. Computation Offloading Toward Edge Computing. Proc. IEEE 2019, 107, 1584–1607.
  4. Khan, A.A.; Laghari, A.A.; Shaikh, Z.A.; Dacko-Pikiewicz, Z.; Kot, S. Internet of Things (IoT) security with blockchain technology: A state-of-the-art review. IEEE Access 2022, 10, 122679–122695.
  5. Wang, Y.; Sheng, M.; Wang, X.; Wang, L.; Li, J. Mobile-edge computing: Partial computation offloading using dynamic voltage scaling. IEEE Trans. Commun. 2016, 64, 4268–4282.
  6. Zhang, J.; Hu, X.; Ning, Z.; Ngai, E.C.H.; Zhou, L.; Wei, J.; Cheng, J.; Hu, B. Energy-latency tradeoff for energy-aware offloading in mobile edge computing networks. IEEE Internet Things J. 2017, 5, 2633–2645.
  7. Ren, J.; Yu, G.; He, Y.; Li, G.Y. Collaborative cloud and edge computing for latency minimization. IEEE Trans. Veh. Technol. 2019, 68, 5031–5044.
  8. Mach, P.; Becvar, Z. Mobile edge computing: A survey on architecture and computation offloading. IEEE Commun. Surv. Tutor. 2017, 19, 1628–1656.
  9. Shakarami, A.; Ghobaei-Arani, M.; Masdari, M.; Hosseinzadeh, M. A survey on the computation offloading approaches in mobile edge/cloud computing environment: A stochastic-based perspective. J. Grid Comput. 2020, 18, 639–671.
  10. Yuan, H.; Bi, J.; Zhou, M. Geography-aware task scheduling for profit maximization in distributed green data centers. IEEE Trans. Cloud Comput. 2020, 10, 1864–1874.
  11. Yuan, H.; Zhou, M. Profit-maximized collaborative computation offloading and resource allocation in distributed cloud and edge computing systems. IEEE Trans. Autom. Sci. Eng. 2020, 18, 1277–1287.
  12. Wang, Y.; Tao, X.; Zhang, X.; Zhang, P.; Hou, Y.T. Cooperative task offloading in three-tier mobile computing networks: An ADMM framework. IEEE Trans. Veh. Technol. 2019, 68, 2763–2776.
  13. Wei, Y.; Yu, F.R.; Song, M.; Han, Z. Joint optimization of caching, computing, and radio resources for fog-enabled IoT using natural actor–critic deep reinforcement learning. IEEE Internet Things J. 2018, 6, 2061–2073.
  14. Khan, A.A.; Laghari, A.A.; Shafiq, M.; Awan, S.A.; Gu, Z. Vehicle to Everything (V2X) and Edge Computing: A Secure Lifecycle for UAV-Assisted Vehicle Network and Offloading with Blockchain. Drones 2022, 6, 377.
  15. Mohajer, A.; Daliri, M.S.; Mirzaei, A.; Ziaeddini, A.; Nabipour, M.; Bavaghar, M. Heterogeneous computational resource allocation for NOMA: Toward green mobile edge-computing systems. IEEE Trans. Serv. Comput. 2022, 16, 1225–1238.
  16. Zhang, W.; Wen, Y.; Guan, K.; Kilper, D. Energy-Optimal Mobile Cloud Computing under Stochastic Wireless Channel. IEEE Trans. Wirel. Commun. 2013, 12, 4569–4581.
  17. Chen, J.; Xing, H.; Lin, X.; Nallanathan, A.; Bi, S. Joint Resource Allocation and Cache Placement for Location-Aware Multi-User Mobile-Edge Computing. IEEE Internet Things J. 2022, 9, 25698–25714.
  18. Dai, Y.; Zhang, K.; Maharjan, S.; Zhang, Y. Edge intelligence for energy-efficient computation offloading and resource allocation in 5G beyond. IEEE Trans. Veh. Technol. 2020, 69, 12175–12186.
  19. Huang, P.Q.; Wang, Y.; Wang, K.; Liu, Z.Z. A bilevel optimization approach for joint offloading decision and resource allocation in cooperative mobile edge computing. IEEE Trans. Cybern. 2019, 50, 4228–4241.
  20. Zeng, S.; Huang, X.; Li, D. Joint Communication and Computation Cooperation in Wireless Powered Mobile Edge Computing Networks with NOMA. IEEE Internet Things J. 2023.
  21. Karimi, E.; Chen, Y.; Akbari, B. Task offloading in vehicular edge computing networks via deep reinforcement learning. Comput. Commun. 2022, 189, 193–204.
  22. Zhou, F.; Hu, R.Q. Computation efficiency maximization in wireless-powered mobile edge computing networks. IEEE Trans. Wirel. Commun. 2020, 19, 3170–3184.
  23. Mukherjee, M.; Kumar, V.; Zhang, Q.; Mavromoustakis, C.X.; Matam, R. Optimal Pricing for Offloaded Hard- and Soft-Deadline Tasks in Edge Computing. IEEE Trans. Intell. Transp. Syst. 2021, 23, 9829–9839.
  24. Yan, J.; Bi, S.; Duan, L.; Zhang, Y.J.A. Pricing-driven service caching and task offloading in mobile edge computing. IEEE Trans. Wirel. Commun. 2021, 20, 4495–4512.
  25. Wang, Q.; Guo, S.; Liu, J.; Pan, C.; Yang, L. Profit maximization incentive mechanism for resource providers in mobile edge computing. IEEE Trans. Serv. Comput. 2019, 15, 138–149.
  26. Zhou, J.; Zhang, X. Fairness-aware task offloading and resource allocation in cooperative mobile-edge computing. IEEE Internet Things J. 2021, 9, 3812–3824.
  27. Hejja, K.; Berri, S.; Labiod, H. Network slicing with load-balancing for task offloading in vehicular edge computing. Veh. Commun. 2022, 34, 100419.
  28. Meng, J.; Tan, H.; Li, X.Y.; Han, Z.; Li, B. Online deadline-aware task dispatching and scheduling in edge computing. IEEE Trans. Parallel Distrib. Syst. 2019, 31, 1270–1286.
  29. Ren, J.; Yu, G.; Cai, Y.; He, Y. Latency optimization for resource allocation in mobile-edge computation offloading. IEEE Trans. Wirel. Commun. 2018, 17, 5506–5519.
  30. Bi, S.; Huang, L.; Wang, H.; Zhang, Y.J.A. Lyapunov-guided deep reinforcement learning for stable online computation offloading in mobile-edge computing networks. IEEE Trans. Wirel. Commun. 2021, 20, 7519–7537.
  31. Qiu, X.; Zhang, W.; Chen, W.; Zheng, Z. Distributed and collective deep reinforcement learning for computation offloading: A practical perspective. IEEE Trans. Parallel Distrib. Syst. 2020, 32, 1085–1101.
  32. Zhou, H.; Jiang, K.; Liu, X.; Li, X.; Leung, V.C. Deep reinforcement learning for energy-efficient computation offloading in mobile-edge computing. IEEE Internet Things J. 2021, 9, 1517–1530.
  33. Zhu, X.; Luo, Y.; Liu, A.; Bhuiyan, M.Z.A.; Zhang, S. Multiagent deep reinforcement learning for vehicular computation offloading in IoT. IEEE Internet Things J. 2020, 8, 9763–9773.
  34. Ndikumana, A.; Tran, N.H.; Ho, T.M.; Han, Z.; Saad, W.; Niyato, D.; Hong, C.S. Joint communication, computation, caching, and control in big data multi-access edge computing. IEEE Trans. Mob. Comput. 2019, 19, 1359–1374.
  35. Wen, W.; Cui, Y.; Quek, T.Q.; Zheng, F.C.; Jin, S. Joint optimal software caching, computation offloading and communications resource allocation for mobile edge computing. IEEE Trans. Veh. Technol. 2020, 69, 7879–7894.
  36. Gong, Y.; Yao, H.; Wang, J.; Li, M.; Guo, S. Edge intelligence-driven joint offloading and resource allocation for future 6G industrial internet of things. IEEE Trans. Netw. Sci. Eng. 2022.
  37. Liu, Y.; Tan, Z.; Hu, H.; Cimini, L.J.; Li, G.Y. Channel estimation for OFDM. IEEE Commun. Surv. Tutor. 2014, 16, 1891–1908.
  38. Dong, L.; He, W.; Yao, H. Task Offloading and Resource Allocation for Tasks with Varied Requirements in Mobile Edge Computing Networks. Electronics 2023, 12, 366.
  39. Bi, S.; Zhang, Y.J. Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading. IEEE Trans. Wirel. Commun. 2018, 17, 4177–4190.
  40. Xu, X.; Li, Y.; Huang, T.; Xue, Y.; Peng, K.; Qi, L.; Dou, W. An energy-aware computation offloading method for smart edge computing in wireless metropolitan area networks. J. Netw. Comput. Appl. 2019, 133, 75–85.
  41. Fang, C.; Xu, H.; Yang, Y.; Hu, Z.; Tu, S.; Ota, K.; Yang, Z.; Dong, M.; Han, Z.; Yu, F.R.; et al. Deep-reinforcement-learning-based resource allocation for content distribution in fog radio access networks. IEEE Internet Things J. 2022, 9, 16874–16883.
  42. Fang, C.; Liu, C.; Wang, Z.; Sun, Y.; Ni, W.; Li, P.; Guo, S. Cache-assisted content delivery in wireless networks: A new game theoretic model. IEEE Syst. J. 2020, 15, 2653–2664.
Figure 1. System model.
Figure 2. Acceptance ratio versus UDs.
Figure 3. Delay cost versus UDs: (a) sum delay cost of all UDs; (b) average processing delay for an accepted request.
Figure 4. Acceptance ratio versus maximum file size: (a) uniform distribution; (b) Zipf distribution prioritizing small input files; (c) Zipf distribution prioritizing large input files.
Figure 5. Acceptance ratio versus maximum computation load: (a) uniform distribution; (b) Zipf distribution prioritizing small input files; (c) Zipf distribution prioritizing large input files.
Figure 6. Acceptance ratio versus system bandwidth.
Figure 7. Acceptance ratio versus edge computational capacity.
Table 1. Summary of the discussed work.

Work | Nodes with Computing Power | Variables to Be Optimized | Objective | Methodology
[7] | Edge Server (ES), Cloud Server (CS) | λ ⁵, x ¹, b ², α ³ | DCM | Decomposition and KKT conditions
[12] | UD, ES, CS | x, y ⁴, α | DCM | ADMM
[13] | ES, CS | y, b, α | DCM | Actor–critic-based DRL
[29] | UD, ES | λ, b, α | DCM | The Lagrange multiplier method
[18] | UD, ES | x, y, α | ECM | DDPG
[30] | UD, ES | x, b, α | Computation rate maximization | Lyapunov optimization and DRL
[26] | ES | x, b, α | Minimize maximal delay ratio | Evolutionary algorithm
[19] | UD, ES | x, α | ECM | Ant colony-based algorithm
[24] | UD, ES | x, c ⁶ | DCM and revenue maximization | Stackelberg game
[25] | UD, ES | x, c | ECM and revenue maximization | Market auction theory
Our work | ES | x, b, α | Acceptance ratio maximization | WPR

¹ x is the binary offloading vector; ² b is the bandwidth allocation vector; ³ α is the computing resource allocation vector; ⁴ y denotes the computation node selection vector; ⁵ λ denotes the splitting ratio; ⁶ c denotes the pricing vector.
Table 2. Summary of notations used in the WPR algorithm.

Notation | Description
$\mathcal{P}$ | the pumping policy deciding the pumping order of the accepted requests
$\mathcal{R}$ | the refilling policy deciding the refilling order of the rejected requests
$\alpha_m$ | the bandwidth fraction allocated to request $r_m$
$\beta_m$ | the computing resource fraction allocated to request $r_m$
$\mathcal{M}_1, \mathcal{M}_1^c, \mathcal{M}_0$ | the sets of accepted requests, currently accepted requests, and rejected requests
$\mathcal{M}_1^l$ | the latest set after the last successful refilling
$\mathcal{U}$ | the set of “unpumped requests”, $\mathcal{U} \subseteq \mathcal{M}_1$, $\mathcal{U} \cap \mathcal{S} = \emptyset$ and $\mathcal{U} \cup \mathcal{S} = \mathcal{M}_1$
$\mathcal{S}$ | the set of “pumped requests” ($\mathcal{S} \subseteq \mathcal{M}_1$), $\nu_i \to \nu_i^p, \forall i \in \mathcal{S}$ according to Equation (11)
$\lambda_1, \lambda_1^p, \lambda_1'$ | the Lagrange multipliers obtained with $\boldsymbol{\nu}, \boldsymbol{\nu}^p, \boldsymbol{\nu}'$ according to Equation (8)
$\boldsymbol{\nu}$ | the Lagrange multipliers of accepted requests, $\boldsymbol{\nu} = (\nu_1, \nu_2, \ldots, \nu_m), m \in \mathcal{M}_1$
$\boldsymbol{\nu}^p$ | the Lagrange multipliers after pumping, $\boldsymbol{\nu}^p = (\nu_1, \nu_2, \ldots, \nu_m^p), m \in \mathcal{M}_1$
$\Sigma\Delta_1, \Sigma\Delta_2$ | the cumulative “pumped water” after multiple pumping trials