Edge Server Placement by a Novel Hybrid Meta-Heuristic Algorithm with Alternating Iteration

Si, Weili; Zhang, Zhifeng; Wang, Bo

doi:10.3390/digital6020044

by Weili Si ¹, Zhifeng Zhang ² and Bo Wang ³,*

¹ School of Big Data and Artificial Intelligence, Zhengzhou University of Science and Technology, Zhengzhou 450064, China

² Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou 450001, China

³ School of Computer Science and Technology, Henan Institute of Technology, Xinxiang 453003, China

* Author to whom correspondence should be addressed.

Digital 2026, 6(2), 44; https://doi.org/10.3390/digital6020044

Submission received: 23 April 2026 / Revised: 30 May 2026 / Accepted: 1 June 2026 / Published: 2 June 2026

Abstract

With the rapid growth of edge computing applications, optimizing both edge server placement and task offloading decisions is critical for minimizing system latency in edge–cloud environments. However, these two problems are tightly coupled and jointly form a binary non-linear programming (BNLP) problem that is NP-hard. To address this challenge, this paper proposes a novel hybrid meta-heuristic algorithm with alternating iteration, which decouples the joint optimization into two interdependent subproblems: edge server placement and task offloading. These subproblems are solved alternately using particle swarm optimization (PSO) for placement and a genetic algorithm (GA) for offloading, respectively. PSO efficiently explores the discrete placement space under bound constraints, while GA effectively navigates the high-dimensional binary offloading space. Compact encoding schemes are designed to inherently satisfy problem constraints, reducing search overhead and improving convergence. The overall algorithm exhibits polynomial-time complexity, making it scalable for practical deployments. Extensive experiments comparing the proposed method against ten baseline algorithms demonstrate that it achieves the best latency with the smallest standard deviation. The results validate the effectiveness, robustness, and scalability of the proposed alternating iterative hybrid meta-heuristic approach for joint edge server placement and task offloading optimization.

Keywords: edge–cloud; edge server placement; genetic algorithm; particle swarm optimization; alternating iteration

1. Introduction

With the rapid proliferation of Internet of Things (IoT) devices, mobile applications, and real-time services, the volume of data generated at the network edge has grown explosively [1]. Typical application scenarios include IoT networks, where massive numbers of sensors and devices generate delay-sensitive data (e.g., smart cities, environmental monitoring); mobile communication systems (e.g., 5G and beyond), where user equipment requires low-latency edge processing for enhanced mobile broadband and ultra-reliable low-latency communications; and emerging real-time interactive services such as autonomous driving, augmented reality (AR), virtual reality (VR), and industrial automation. These applications demand end-to-end latencies in the millisecond range, which traditional cloud-centric computing models, where all data are transmitted to centralized cloud data centers, cannot satisfy due to physical distance and shared resource contention.

Edge computing has emerged as a promising paradigm to address these limitations by bringing computation and storage resources closer to end users. In edge computing, data processing occurs at edge nodes (e.g., base stations, edge servers) deployed near user devices, significantly reducing network latency and alleviating bandwidth pressure on the core network [2]. However, edge nodes typically possess limited computing resources compared to centralized cloud data centers. Consequently, modern edge–cloud systems adopt a three-tier architecture comprising the user tier, the edge tier, and the cloud tier. In such systems, requests from user devices can be processed either locally at edge servers or offloaded to the cloud, depending on resource availability and service requirements [3].

A fundamental design problem in edge–cloud systems is the joint optimization of edge server placement and task offloading decisions. Edge server placement determines where to deploy a limited number of edge servers among candidate base stations, directly affecting the computing capacity available at each location. Task offloading decides whether each user request should be processed at the edge or in the cloud. These two decisions are inherently coupled: placement configurations determine which base stations have edge servers to process requests locally, while offloading decisions influence the resource demands placed on each edge server. Optimizing both aspects jointly is critical for minimizing overall system latency and ensuring efficient resource utilization.

The joint optimization of edge server placement and task offloading presents several significant challenges. First, the two decisions are mutually dependent: placement determines which base stations can process requests locally, while offloading determines the workload distribution across edge servers. Second, both decision spaces are discrete and combinatorial. Placement assigns a limited number of edge servers to candidate base stations, and offloading makes a binary decision for each request, so the combined search space grows exponentially with problem size and exhaustive enumeration is infeasible for realistic scenarios. Third, the system latency objective function exhibits non-linear characteristics due to the proportional resource allocation mechanism among requests processed at the same edge computing center, further complicating the optimization process. Fourth, the problem is fundamentally a binary non-linear programming (BNLP) problem, which is known to be NP-hard, making exact solution methods such as branch-and-bound computationally intractable even for moderate-sized instances. Finally, practical edge–cloud deployments involve hundreds of base stations, thousands of user requests, and multiple edge servers, demanding solution approaches that scale polynomially with problem dimensions while delivering high-quality solutions.

Considerable research efforts have been devoted to edge server placement and task offloading, both separately and jointly [4,5]. Early work on edge server placement primarily formulated the problem as a facility location problem solved using greedy heuristics or integer programming, but typically assumed fixed offloading policies or did not fully account for dynamic task characteristics. Task offloading research has produced a rich body of work ranging from game-theoretic approaches and Lyapunov optimization to deep reinforcement learning, yet most of these studies assume that edge servers are pre-deployed at fixed locations, which may not reflect real-world deployment scenarios where placement decisions are under the control of the system operator. Early attempts at joint optimization employed integer programming with linearization techniques, but these methods scale poorly to practical problem sizes. Consequently, most recent studies in this field model the joint optimization problem as a non-linear problem (e.g., binary non-linear programming) and adopt meta-heuristic or machine learning-based approaches. However, several research gaps remain: most joint optimization methods fail to leverage the natural decomposition of the problem; hybrid approaches combining multiple algorithms often operate in parallel or sequentially without alternating iterative refinement; systematic justification for algorithm selection based on subproblem characteristics is lacking; and comprehensive comparative evaluations against a broad set of heuristic and meta-heuristic baselines remain insufficient.

To address the aforementioned challenges and research gaps, this paper proposes an alternating iterative framework that decomposes the joint edge server placement and task offloading optimization problem into two interdependent subproblems, solved alternately using particle swarm optimization (PSO) for placement and a genetic algorithm (GA) for offloading. The core idea is to leverage the complementary strengths of these two meta-heuristics: PSO excels at exploring discrete placement spaces with bound constraints, while GA effectively navigates high-dimensional binary offloading spaces. This structural decomposition significantly reduces the complexity of each optimization step, enabling more focused search; the alternating iterative mechanism allows each subproblem to benefit from improvements in the other, progressively guiding the search toward globally high-quality joint solutions. Efficient encoding schemes are designed to inherently satisfy the atomicity constraints of edge server placement, eliminating infeasible solutions and reducing search overhead. The overall algorithm scales polynomially with the number of requests, base stations, and edge servers, making it practical for realistic deployment scenarios. Experimental results demonstrate that the proposed method consistently outperforms ten baseline algorithms across average latency, minimum latency, and standard deviation metrics, validating its effectiveness and robustness in solving the joint optimization problem. The core methodological novelty and main contributions of this paper are threefold.

First, unlike monolithic optimization approaches that treat the joint problem as a whole or sequential methods that solve the two subproblems independently, our framework establishes a closed-loop co-evolution mechanism where the placement solution guides offloading optimization and the refined offloading solution in turn improves placement, enabling mutual reinforcement.
Second, we provide a principle-driven algorithm selection justified by the inherent characteristics of each subproblem, where PSO efficiently explores the discrete bounded placement space, while GA excels at navigating the high-dimensional binary offloading space, distinguishing our work from empirical trial-and-error hybridizations.
Third, we design compact encoding schemes that embed the atomicity constraint of edge servers directly into the solution representation, eliminating infeasible solutions and reducing search overhead. The overall algorithm achieves polynomial-time complexity, making it scalable for practical deployments. Extensive experiments comparing the proposed method against ten baseline algorithms demonstrate that it achieves the lowest average latency, best minimum latency, and smallest standard deviation, validating its effectiveness and robustness.

The remainder of this paper is organized as follows. Section 2 discusses the related work. Section 3 presents the system model, including the three-tier edge–cloud architecture, resource allocation mechanisms, delay formulations, and the complete mathematical formulation of the joint optimization problem. Section 4 details the proposed alternating iterative solution method, covering the algorithm selection rationale, search space encoding schemes, detailed algorithmic flow, and complexity analysis. Section 5 reports the experimental evaluation, including parameter settings, baseline algorithms, performance metrics, and comprehensive results analysis. Section 6 concludes the paper and discusses directions for future work.

2. Related Work

Edge server placement and task offloading have been extensively studied in the context of mobile edge computing, with research efforts spanning from single-objective optimization to multi-objective formulations, and from heuristic methods to meta-heuristic and learning-based approaches.

In the domain of resource allocation and task offloading, Zhang et al. [6] proposed a Stackelberg game-based multi-agent algorithm for resource allocation and task offloading in MEC-enabled cooperative intelligent transportation systems, effectively balancing the interests of multiple stakeholders. Similarly, Zhao et al. [7] developed EdgePro, a safe deep reinforcement learning framework for adaptive edge service provision, ensuring service reliability while optimizing resource efficiency. For a broader perspective, Bahrami et al. [8] provided a comprehensive survey of edge server placement problems in multi-access edge computing, covering models, techniques, and applications, which serves as a foundational reference for our work.

Another line of research focuses on edge application deployment under various constraints. A series of studies by Zhao et al. has addressed this topic from different angles: they considered joint shareability and interference for multiple edge application deployment [9], joint coverage-reliability for budgeted edge application deployment [10], availability-aware revenue-effective deployment [11], and more recently, maximizing revenue for reliability-aware edge application deployment [12]. Li et al. [13] introduced READ, a robustness-oriented edge application deployment framework that prioritizes resilience against failures. Although these works share similar optimization structures with our edge server placement problem (ESPP), they focus on application deployment rather than server placement and task offloading.

In the context of large-scale AI model inference, Lyu et al. [14] studied quantization-aware collaborative inference for large embodied AI models, highlighting the need for efficient edge–cloud partitioning. Their work addresses challenges analogous to our ESPP and task offloading, determining where to execute model components under resource constraints. Notably, our proposed alternating iterative framework could be extended to such AI deployment scenarios, where placement decisions correspond to assigning model components to edge servers and offloading decisions determine edge/cloud execution. This potential extension demonstrates the broader applicability of our methodology.

Addressing practical constraints such as heterogeneous server capacities and fairness, Tiwari [15] proposed a knapsack-based metaheuristic that formulates the placement problem as a 0–1 knapsack problem, enabling effective server allocation under heterogeneous capacity settings. Wu et al. [16] investigated the fairness-aware budgeted edge server placement problem in roadside units for connected autonomous vehicles, proving its NP-hardness and proposing both an exact integer programming approach for small-scale problems and an approximation method for large-scale scenarios.

Recent advances have leveraged clustering and machine learning techniques to solve the edge server placement problem. Vali et al. [17] introduced RESP, a recursive clustering technique based on cluster medians determined by base station counts, designed to achieve workload balance and minimize network traffic. Zhang et al. [18] proposed a graph clustering-based model incorporating a two-layer graph convolutional network with a differentiable K-means component, transforming placement into an end-to-end learning optimization problem that jointly optimizes average delay and load balancing. Zhou and Abawajy [19] combined improved spectral clustering with Q-learning (ISC-QL) for edge server placement in intelligent transportation systems, aiming to balance load, reduce energy consumption, and minimize delay.

Considering user mobility adds another layer of complexity. Li [20] systematically addressed mobility-aware server placement and power allocation in environments with randomly walking mobile users, establishing M/G/1 and M/G/k queueing systems under both synchronous and asynchronous mobility models, and developing optimization algorithms that consider two task offloading strategies.

The interdependence between server placement and workload distribution has motivated integrated approaches. Zarei et al. [21] formulated the edge server placement and load distribution problems as a mixed-integer nonlinear programming framework, employing ant colony optimization for placement and heuristic algorithms for request scheduling. Wang et al. [22] derived optimal request dispatch based on queueing theory and proposed a hybrid PSO-GA algorithm with niching technology (nPGSAO) that achieves improved response time with linear time complexity.

Recognizing that single-objective approaches often fail to capture the inherent trade-offs, several studies have explored multi-objective formulations. Ghasemzadeh et al. [23] proposed a multi-objective solution based on NSGA-II that simultaneously optimizes workload variance and latency reduction. Surayya et al. [24] evaluated multiple evolutionary algorithms for edge server placement in vehicular edge computing, providing insights into their respective strengths in terms of convergence speed, solution distribution, and energy efficiency.

Despite these substantial research efforts, existing studies exhibit several limitations. First, while many approaches address placement and offloading separately or treat them as a monolithic optimization, few exploit the natural structural decomposition between these two interdependent problems through alternating iterative refinement. Second, hybrid meta-heuristic methods often combine algorithms in parallel or sequential manners without systematic alternation that allows each subproblem to benefit from improvements in the other. Third, algorithm selection is frequently based on empirical preference rather than principled justification grounded in the inherent characteristics of each subproblem. Fourth, comprehensive comparative evaluations against a diverse set of baseline algorithms, including both heuristics and multiple meta-heuristics, remain limited. These gaps motivate our proposed alternating iterative framework, which leverages the complementary strengths of PSO for placement optimization and GA for offloading optimization in a coordinated alternating manner, providing both theoretical justification and thorough experimental validation.

3. Problem Formulation

We consider a three-tier edge–cloud computing system composed of the user tier, the edge tier, and the cloud tier, as illustrated in Figure 1. The user tier consists of geographically distributed user devices (UDs), such as IoT sensors, smartphones, or vehicular terminals, that generate and send diverse requests to the edge–cloud system. The edge tier comprises multiple edge computing centers (ECs), each equipped with a base station (BS) (e.g., 5G gNB or Wi-Fi access point) that provides wireless connectivity for data transmission between user devices and the center. Each EC can host one or more edge servers (ESs) for processing the requests received from UDs. However, due to the limited computing capacity of the edge tier, some requests may be offloaded to the cloud tier to enhance overall system performance. The cloud tier offers abundant computing resources, but it typically suffers from higher network latency because of the long physical distance to UDs and the shared nature of its resources among numerous cloud users. In this paper, we focus on the Edge Server Placement Problem (ESPP), which involves two coupled decisions: (i) where to place each purchased ES (i.e., at which BS) for request processing, and (ii) where to process each request (at the edge or in the cloud). The objective is to minimize the average processing latency across all requests, i.e., the system latency.

3.1. System Model

In the considered edge–cloud system, there are R requests generated by user devices (UDs) that need to be processed. The system has B pre-deployed BSs, and E ESs to be placed at these BSs. Due to the limited coverage of wireless signals, each BS can only receive requests from UDs within a specific geographic area. We define the binary indicator $h_{i, j}$ for $i = 1, \dots, R$ and $j = 1, \dots, B$, as shown in Equation (1), to represent whether the i-th request can be received by the j-th BS. Specifically, $h_{i, j} = 1$ if the i-th request is within the coverage area of the j-th BS and can be received by it, and $h_{i, j} = 0$ otherwise. We assume that each user request is associated with a single BS, determined by the strongest received signal or the closest geographic proximity. Consequently, for each request i, there exists exactly one j such that $h_{i, j} = 1$. This simplification is commonly adopted in ES placement literature and focuses the optimization on placement and offloading rather than on access point selection. Extensions to multi-coverage scenarios with an additional decision variable for BS selection are left for future work.

$h_{i, j} = \begin{cases} 1, & \text{if the } i\text{-th request is received by the } j\text{-th BS} \\ 0, & \text{otherwise} \end{cases} .$

(1)
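To make the single-association assumption concrete, the following minimal sketch builds the indicator of Equation (1) by assigning each request to its geographically closest BS. The coordinates and the function name `coverage_indicator` are illustrative assumptions and do not come from the paper; association by strongest received signal is not modeled here.

```python
import math

def coverage_indicator(request_xy, bs_xy):
    """Build h as an R x B list of 0/1 rows with exactly one 1 per row."""
    h = []
    for rx, ry in request_xy:
        # Euclidean distance from this request to every BS.
        dists = [math.hypot(rx - bx, ry - by) for bx, by in bs_xy]
        nearest = dists.index(min(dists))  # ties go to the lowest BS index
        h.append([1 if j == nearest else 0 for j in range(len(bs_xy))])
    return h

# Hypothetical positions: three requests, two base stations.
requests = [(0.0, 0.0), (9.0, 1.0), (4.0, 4.0)]
stations = [(1.0, 1.0), (10.0, 0.0)]
h = coverage_indicator(requests, stations)
print(h)
```

Because each row contains a single 1, the outer sum over j in later constraints simply selects the one BS associated with request i.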

Each BS j provides a network transmission rate $w_{j}$ for uploading the input data of requests that are to be processed at the ESs collocated with that BS.

For each request i, let $d_{i}$ denote the size of its input data required for processing, and $c_{i}$ denote the amount of computing resources needed to complete the request.

There are E ESs available for placement. Each ES k is equipped with computing capacity $f_{k}$ to satisfy the processing requirements of the requests assigned to it. The computing resources of an ES are shared among all requests received by the BS where that ES is placed.

The cloud tier boasts abundant resources. It provides a computing capacity of F for processing every request and a network transmission rate of W for receiving every request’s input data.

To formulate the ESPP, as shown in Equations (2) and (3), we introduce two sets of binary decision variables, $x_{j, k}$ (for $j = 1, \dots, B$ and $k = 1, \dots, E$) and $y_{i}$ (for $i = 1, \dots, R$), which capture the decisions of ES placement and task offloading, respectively. Specifically, $x_{j, k} = 1$ if the k-th ES is placed at the j-th BS, and $x_{j, k} = 0$ otherwise. For task offloading, $y_{i} = 1$ indicates that the i-th request is processed at its associated ES (i.e., processed locally at the edge), while $y_{i} = 0$ indicates that it is processed in the cloud. The assumption that a request can only be processed either by its associated BS's edge server(s) or offloaded to the cloud is a standard simplification in related work. It reflects the fact that in most practical systems, a user device is wirelessly connected to a single BS (the one with the strongest signal), and edge processing must occur at that same BS due to coverage and backhaul constraints. Nevertheless, we acknowledge that more advanced scenarios (e.g., multi-connectivity, server-to-server forwarding) exist, and addressing them is one of our future research directions to enhance the generality of our method.

$x_{j, k} = \begin{cases} 1, & \text{if the } k\text{-th ES is placed at the } j\text{-th BS} \\ 0, & \text{otherwise} \end{cases} ,$

(2)

$y_{i} = \begin{cases} 1, & \text{if the } i\text{-th request is processed at its local ES} \\ 0, & \text{otherwise} \end{cases} .$

(3)

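Because each ES is indivisible, it can be deployed at no more than one BS, and thus Equation (4) holds for every ES k. The final model in Section 3.4 tightens this inequality to an equality, since all purchased servers are deployed.

$\sum_{j = 1}^{B} x_{j, k} \leq 1 .$

(4)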

3.2. Processing in Edge

For the i-th request, if it is processed at the edge ($y_{i} = 1$), then the associated BS, i.e., the j-th BS with $h_{i, j} = 1$, must have at least one ES deployed. This requirement is captured by constraint (5).

$y_{i} \leq \sum_{j = 1}^{B} \left( h_{i, j} \cdot \sum_{k = 1}^{E} x_{j, k} \right) .$

(5)

Here, $\sum_{k = 1}^{E} x_{j, k}$ denotes the number of ESs placed at the j-th BS, and the outer summation over j selects the BS that actually serves request i (i.e., where $h_{i, j} = 1$). Thus, the right-hand side of the constraint gives the total number of ESs deployed at the BS associated with request i.

If no ES is placed at that BS, the right-hand side equals 0, and the inequality forces $y_{i} \leq 0$. Since $y_{i}$ is binary, this implies $y_{i} = 0$, meaning the request must be offloaded to the cloud. Conversely, if at least one ES is present, the right-hand side is at least 1, and $y_{i}$ can be either 0 or 1, allowing the request to be processed either locally at the edge or offloaded to the cloud.
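The logic of constraint (5) can be checked with a few lines of code. The sketch below evaluates the right-hand side for one request and, as a purely illustrative repair step that is not described in the paper, resets $y_{i}$ to 0 wherever the associated BS hosts no ES. The matrix layout (`x[j][k]`, `h[i][j]`) and both function names are assumptions.

```python
def servers_at_associated_bs(h_row, x):
    """Right-hand side of constraint (5): number of ESs at the BS serving one request."""
    return sum(h_ij * sum(x_j) for h_ij, x_j in zip(h_row, x))

def repair_offloading(y, h, x):
    """Hypothetical repair: force y_i = 0 when the associated BS hosts no ES."""
    return [y_i if servers_at_associated_bs(h_i, x) >= 1 else 0
            for y_i, h_i in zip(y, h)]

# Two BSs, two ESs, both ESs placed at BS 0 (x[j][k] = 1 if ES k is at BS j).
x = [[1, 1],
     [0, 0]]
# Three requests: requests 0 and 2 belong to BS 0, request 1 to BS 1.
h = [[1, 0],
     [0, 1],
     [1, 0]]
y = [1, 1, 1]                      # everyone asks for edge processing
repaired = repair_offloading(y, h, x)
print(repaired)                    # request 1 is sent to the cloud
```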

When the i-th request is processed at the edge, two main types of delay contribute to its completion time: the input data transfer delay from the user device to the BS, and the computing delay on the deployed ESs. Since a BS and its associated ESs are co-located at the same EC, the data transmission delay between them is negligible. Moreover, the size of the computation result is typically much smaller than that of the input data, so the return transmission delay from the edge to the user can be ignored.

Both network and computing resources at an EC are shared among multiple requests processed at that center. For the j-th EC, the set of requests processed locally is $\{ i \mid y_{i} = 1 \text{ and } h_{i, j} = 1 \}$. For each resource type (network or computing), the available capacity is allocated to requests in proportion to their individual demands. Consequently, for a request i processed at the j-th EC, the allocated network rate and computing resources are given by Equations (6) and (7), respectively.

$b_{i, j} = \frac{d_{i}}{\sum_{i = 1}^{R} (y_{i} \cdot h_{i, j} \cdot d_{i})} \cdot w_{j},$

(6)

$r_{i, j} = \frac{c_{i}}{\sum_{i = 1}^{R} (y_{i} \cdot h_{i, j} \cdot c_{i})} \cdot \sum_{k = 1}^{E} (x_{j, k} \cdot f_{k}) .$

(7)

Note that the computing resources of all ESs placed at the same EC are aggregated into a single pool. A request is therefore not assigned to a specific ES. Instead, it receives a share of the total computing capacity proportional to its computing demand relative to the total demand of all requests processed at that EC.

Then, the processing delay of request i when processed at the j-th EC ($D_{i, j}^{edge}$) is expressed as Equation (8). For any request i that is either not processed at the edge ($y_{i} = 0$) or not associated with the j-th EC ($h_{i, j} = 0$), we have $D_{i, j}^{edge} = 0$.

$\begin{aligned} D_{i, j}^{edge} & = y_{i} \cdot h_{i, j} \cdot \left( \frac{d_{i}}{b_{i, j}} + \frac{c_{i}}{r_{i, j}} \right) \\ & = y_{i} \cdot h_{i, j} \cdot \left( \frac{\sum_{i = 1}^{R} (y_{i} \cdot h_{i, j} \cdot d_{i})}{w_{j}} + \frac{\sum_{i = 1}^{R} (y_{i} \cdot h_{i, j} \cdot c_{i})}{\sum_{k = 1}^{E} (x_{j, k} \cdot f_{k})} \right) . \end{aligned}$

(8)
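A small numeric check may help show why the second line of Equation (8) no longer depends on the individual demands $d_{i}$ and $c_{i}$: under proportional sharing, every request processed at the same EC finishes with the same delay. The sketch below applies Equations (6) and (7) to two requests at one EC. All numbers and the helper name `edge_shares` are made up for illustration, and units are left unspecified as in the text.

```python
import math

def edge_shares(d, c, w_j, capacity_j):
    """Equations (6)-(7) for the requests processed at one EC: returns (b, r)."""
    total_d, total_c = sum(d), sum(c)
    b = [d_i / total_d * w_j for d_i in d]          # network rate shares
    r = [c_i / total_c * capacity_j for c_i in c]   # computing shares
    return b, r

# Hypothetical EC: two local requests, rate w_j = 4, pooled ES capacity = 20.
d = [2.0, 6.0]       # input data sizes
c = [10.0, 30.0]     # computing demands
w_j, capacity_j = 4.0, 20.0

b, r = edge_shares(d, c, w_j, capacity_j)
# First line of Equation (8): per-request transfer delay plus computing delay.
delays = [d_i / b_i + c_i / r_i for d_i, b_i, c_i, r_i in zip(d, b, c, r)]
# Second line of Equation (8): total demand over total capacity.
closed_form = sum(d) / w_j + sum(c) / capacity_j
print(delays, closed_form)
```

Both requests obtain the same delay as the closed form, even though the second request is three times larger than the first.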

3.3. Processing in Cloud

Different from the resource provisioning in ECs, where limited resources are shared among multiple requests, the cloud can allocate resources to each offloaded request independently. Following many related works, we assume that each offloaded request receives a dedicated network transmission rate W and a dedicated computing capacity F. This assumption is justified because the number of requests offloaded from our edge system is typically small relative to the cloud's total capacity, making intra-system contention negligible. Nevertheless, the cloud suffers from high propagation delay and resource sharing with external users, which is reflected in the chosen values of W and F. Thus, if the i-th request is offloaded to the cloud, its data transmission delay and computing delay are $d_{i} / W$ and $c_{i} / F$, respectively, and its total delay $D_{i}^{cloud}$ is given by Equation (9). When the request is processed at the edge ($y_{i} = 1$), we have $1 - y_{i} = 0$, resulting in $D_{i}^{cloud} = 0$.

$D_{i}^{cloud} = (1 - y_{i}) \cdot \left( \frac{d_{i}}{W} + \frac{c_{i}}{F} \right) .$

(9)

3.4. Problem Model

Based on the system model and delay expressions derived in the previous subsections, we now formally formulate the ESPP. Recall the two sets of binary decision variables introduced in Section 3.1:

ES placement variables: $x_{j, k} \in \{0, 1\}$ for $j = 1, \dots, B$ and $k = 1, \dots, E$. $x_{j, k} = 1$ if the k-th ES is placed at the j-th BS (i.e., at the corresponding EC), and 0 otherwise.
Task offloading variables: $y_{i} \in \{0, 1\}$ for $i = 1, \dots, R$. $y_{i} = 1$ if the i-th request is processed at the edge (i.e., at its associated ES(s)), and $y_{i} = 0$ if it is offloaded to the cloud.

The objective is to minimize the average system latency, which is the mean processing delay over all R requests. The total latency for a request consists of two mutually exclusive parts: the edge processing delay (if $y_{i} = 1$) and the cloud processing delay (if $y_{i} = 0$). Using the expressions derived in Equations (8) and (9), the overall objective function is given by Equation (10).

$\min L = \frac{1}{R} \sum_{i = 1}^{R} \left( \sum_{j = 1}^{B} D_{i, j}^{edge} + D_{i}^{cloud} \right),$

(10)

where $D_{i, j}^{edge}$ is defined in Equation (8) and $D_{i}^{cloud}$ in Equation (9). The optimization is subject to the following constraints.

Atomicity of ESs: Each ES must be placed at exactly one base station, i.e., all purchased servers are deployed. This is enforced by Equation (11).

$\sum_{j = 1}^{B} x_{j, k} = 1, \forall k = 1, \dots, E .$

(11)
Feasibility of edge processing: A request can be processed at the edge only if its associated EC/BS hosts at least one ES. This logical condition is captured by Equation (12).

$y_{i} \leq \sum_{j = 1}^{B} (h_{i, j} \cdot \sum_{k = 1}^{E} x_{j, k}), \forall i = 1, \dots, R .$

(12)

Here, $h_{i, j}$ is the binary receivability indicator defined in Section 3.1. The right-hand side of Equation (12) gives the total number of ESs placed at the BS that can serve request i. If no server is present, the right-hand side is zero, forcing $y_{i} = 0$ (i.e., cloud offloading). Otherwise, $y_{i}$ can be either 0 or 1.
Domain constraints: All decision variables are binary, which is expressed as Equations (13) and (14).

$x_{j, k} \in \{0, 1\}, \forall j, k,$

(13)

$y_{i} \in \{0, 1\}, \forall i .$

(14)

Equations (11)–(14) together with the objective (10) define the ESPP as a binary non-linear programming (BNLP) problem. The non-linearity arises from the resource allocation terms within $D_{i, j}^{edge}$ (see Equation (8)), where the $x_{j, k}$ appear in denominators via summations over requests and ESs. This problem is NP-hard, motivating the meta-heuristic solution presented in Section 4.
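As an illustration of how the objective in Equation (10) could be evaluated for a candidate solution, the following sketch computes the average latency from Equations (8) and (9) and rejects solutions that violate constraint (12). It is a minimal reading of the formulation, not the authors' implementation; the function name `system_latency`, the nested-list layout (`x[j][k]`, `h[i][j]`), and all numeric values are assumptions.

```python
def system_latency(x, y, h, d, c, w, f, W, F):
    """Average latency L of Equation (10) for placement x and offloading y."""
    R, B, E = len(y), len(w), len(f)
    total = 0.0
    for j in range(B):
        # Requests processed locally at EC j.
        local = [i for i in range(R) if y[i] == 1 and h[i][j] == 1]
        if not local:
            continue
        capacity = sum(x[j][k] * f[k] for k in range(E))  # pooled ES capacity
        if capacity == 0:
            raise ValueError("constraint (12) violated: edge request at a BS with no ES")
        # Equation (8): every local request experiences the same shared delay.
        shared = sum(d[i] for i in local) / w[j] + sum(c[i] for i in local) / capacity
        total += len(local) * shared
    # Equation (9): dedicated cloud resources for offloaded requests.
    total += sum((1 - y[i]) * (d[i] / W + c[i] / F) for i in range(R))
    return total / R

# Hypothetical instance: 3 requests, 2 BSs, 1 ES placed at BS 0.
h = [[1, 0], [1, 0], [0, 1]]
d, c = [2.0, 6.0, 4.0], [10.0, 30.0, 20.0]
w, f = [4.0, 4.0], [20.0]
W, F = 1.0, 40.0
x = [[1], [0]]

lat_mixed = system_latency(x, [1, 1, 0], h, d, c, w, f, W, F)
lat_cloud = system_latency(x, [0, 0, 0], h, d, c, w, f, W, F)
print(lat_mixed, lat_cloud)
```

In this toy instance, processing the two requests of BS 0 at the edge gives a lower average latency than sending everything to the cloud, because the assumed cloud rate W is small.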

4. Method

4.1. Overview and Rationale

The ESPP formulated in Section 3 couples two types of binary decisions: where to place edge servers (placement variables $x_{j, k}$) and whether to process each request at the edge or in the cloud (offloading variables $y_{i}$). Solving this joint problem directly using a single monolithic meta-heuristic can be inefficient because the two decision spaces have fundamentally different structures. The placement space is discrete but bounded, consisting of E-dimensional vectors with each element in $\{1, \dots, B\}$; the offloading space is a high-dimensional binary hypercube of dimension R. A monolithic optimizer struggles to navigate both types of landscapes simultaneously and may converge to suboptimal solutions.
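The E-dimensional integer representation mentioned above can be related to the binary variables $x_{j, k}$ with a short decoder. The sketch below is one plausible reading of that representation, with the hypothetical name `decode_placement`; it shows that every integer vector decodes to a matrix whose columns each sum to 1, so the atomicity constraint holds by construction.

```python
def decode_placement(p, B):
    """Map p (p[k] in 1..B is the BS hosting ES k) to the B x E matrix x."""
    E = len(p)
    x = [[0] * E for _ in range(B)]
    for k, j in enumerate(p):
        if not 1 <= j <= B:
            raise ValueError(f"BS index {j} for ES {k + 1} is out of range 1..{B}")
        x[j - 1][k] = 1
    return x

B = 3
p = [2, 2, 1]                      # ES 1 and ES 2 at BS 2, ES 3 at BS 1
x = decode_placement(p, B)
column_sums = [sum(x[j][k] for j in range(B)) for k in range(len(p))]

# Size of the encoded space versus the raw binary matrix space.
encoded_size = B ** len(p)
binary_size = 2 ** (B * len(p))
print(x, column_sums, encoded_size, binary_size)
```

For this toy size the integer encoding has 27 candidates, compared with 512 raw binary matrices, most of which would violate atomicity.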

To address this challenge, we propose an alternating iterative decomposition strategy. Instead of optimizing both sets of variables together, we decouple the joint problem into two interdependent subproblems: ES placement (with offloading fixed) and task offloading (with placement fixed). These two subproblems are solved alternately, each using a meta-heuristic specifically chosen to match its structural characteristics. The key insight is that by fixing one set of decisions, the remaining subproblem becomes more tractable, and the alternating exchange of solutions creates a co-evolutionary effect that progressively refines both.

The selection of PSO for placement and GA for offloading is grounded in their complementary strengths. PSO is well-suited for the placement subproblem because the search space is a bounded integer lattice; the velocity-position update mechanism naturally enforces bounds through clamping, and the swarm-based exploration efficiently locates promising regions. GA is ideal for offloading because the decision space is binary and high-dimensional; crossover and mutation operators effectively recombine good offloading patterns and maintain population diversity, preventing premature convergence. This principled selection avoids empirical trial-and-error commonly seen in hybrid meta-heuristics.

The overall framework alternates between the two modules: starting from an initial offloading decision (e.g., all requests processed at the edge), the PSO module optimizes placement given the current offloading; the resulting placement is then passed to the GA module, which optimizes offloading given the fixed placement. This loop repeats until convergence. The decomposition reduces the effective dimensionality of each optimization step, while the closed-loop iteration ensures that improvements in one subproblem benefit the other. The following subsections detail the encoding schemes, fitness function, algorithmic procedures, and complexity analysis.

4.2. Algorithm Framework

The joint optimization problem defined in the previous section can be decomposed into two interdependent subproblems by fixing one set of decision variables while optimizing the other.

Placement subproblem (solved by PSO): Given a fixed offloading decision $Y$, the goal is to minimize the average system latency by choosing the placement variables $X = \{x_{j, k}\}$, subject to the atomicity constraint and binary domain. Formally,

$\min_{X} L(X; Y), \quad \text{s.t.} \quad \sum_{j = 1}^{B} x_{j, k} = 1, \ \forall k; \quad x_{j, k} \in \{0, 1\}.$

(15)
Offloading subproblem (solved by GA): Given a fixed placement decision $X$, the goal is to minimize the average system latency by choosing the offloading variables $Y = \{y_{i}\}$, subject to the feasibility constraint and binary domain. Formally,

$\min_{Y} L(Y; X), \quad \text{s.t.} \quad y_{i} \leq \sum_{j = 1}^{B} \left( h_{i, j} \cdot \sum_{k = 1}^{E} x_{j, k} \right), \ \forall i; \quad y_{i} \in \{0, 1\}.$

(16)

Both subproblems share the same objective function L (the average system latency defined in Equation (10)), but they operate on different decision spaces and constraints. The alternating iterative framework solves these two subproblems sequentially, passing the solution of one as a fixed parameter to the other.

The overall algorithm operates in an alternating iterative manner, as illustrated in Algorithm 1. The process begins with an initial offloading decision (e.g., all requests processed at the edge). At each major iteration, the framework first fixes the current offloading decisions and invokes the PSO module to optimize the placement of edge servers (solving the placement subproblem). The resulting placement solution is then passed to the GA module, which optimizes the offloading decisions while keeping the placement fixed (solving the offloading subproblem). This alternation continues until convergence is achieved (i.e., no further improvement in system latency) or a predefined maximum number of iterations is reached.

Algorithm 1 Alternating Iterative PSO-GA Framework

Require: Request set $R$ (R requests), BS set $B$ (B BSs), ES set $E$ (E ESs), system parameters, algorithm parameters $N_{p}$, $T_{p}$, $\omega$, $c_{1}$, $c_{2}$, $N_{g}$, $T_{g}$, $p_{c}$, $p_{m}$;
Ensure: Optimal placement $X^{*}$ and offloading $Y^{*}$;

1: Initialization
2: $Y^{0} \leftarrow 1$ ▷ All requests processed at edge initially
3: $t \leftarrow 0$
4: while $t < K_{max}$ and improvement $\geq \epsilon$ do
5:   Phase 1: PSO for Edge Server Placement (fixed $Y^{t}$)
6:   for $p = 1$ to $N_{p}$ do
7:     Initialize position $X_{p}$ as E-dim vector with each element in $\{1, \dots, B\}$
8:     $V_{p} \leftarrow 0$
9:     Evaluate fitness $F(X_{p}, Y^{t})$ using Equation (10)
10:     ${pbest}_{p} \leftarrow X_{p}$
11:   end for
12:   $gbest \leftarrow \arg\min_{p} F({pbest}_{p}, Y^{t})$
13:   for $iter = 1$ to $T_{p}$ do
14:     for $p = 1$ to $N_{p}$ do
15:       Update velocity: $V_{p} \leftarrow \omega V_{p} + c_{1} r_{1} ({pbest}_{p} - X_{p}) + c_{2} r_{2} (gbest - X_{p})$
16:       Update position: $X_{p} \leftarrow round(X_{p} + V_{p})$, clamp to $[1, B]$
17:       Evaluate fitness $F(X_{p}, Y^{t})$
18:       if $F(X_{p}, Y^{t}) < F({pbest}_{p}, Y^{t})$ then
19:         ${pbest}_{p} \leftarrow X_{p}$
20:       end if
21:       if $F(X_{p}, Y^{t}) < F(gbest, Y^{t})$ then
22:         $gbest \leftarrow X_{p}$
23:       end if
24:     end for
25:   end for
26:   $X^{t + 1} \leftarrow gbest$
27:   Phase 2: GA for Task Offloading (fixed $X^{t + 1}$)
28:   for $q = 1$ to $N_{g}$ do
29:     Initialize individual $Y_{q}$ as R-dim binary vector
30:     Evaluate fitness $F(X^{t + 1}, Y_{q})$
31:   end for
32:   $Y_{best} \leftarrow \arg\min_{q} F(X^{t + 1}, Y_{q})$
33:   for $gen = 1$ to $T_{g}$ do
34:     $P_{new} \leftarrow \emptyset$
35:     for $q = 1$ to $N_{g}$ do
36:       Select parents $Y_{a}$, $Y_{b}$ via tournament selection
37:       if $rand() < p_{c}$ then
38:         Perform uniform crossover to produce $Y_{off1}$, $Y_{off2}$
39:       else
40:         $Y_{off1} \leftarrow Y_{a}$, $Y_{off2} \leftarrow Y_{b}$
41:       end if
42:       Apply bit-flip mutation with probability $p_{m}$ to both offspring
43:       Evaluate $F(X^{t + 1}, Y_{off1})$ and $F(X^{t + 1}, Y_{off2})$
44:       Add $Y_{off1}$, $Y_{off2}$ to $P_{new}$
45:     end for
46:     Replace worst individuals in $P_{new}$ with $Y_{best}$ (elitism)
47:     $Y_{best} \leftarrow \arg\min_{Y \in P_{new}} F(X^{t + 1}, Y)$
48:   end for
49:   $Y^{t + 1} \leftarrow Y_{best}$
50:   $t \leftarrow t + 1$
51: end while
52: Return $X^{t}$, $Y^{t}$

The alternating structure ensures that each subproblem is solved under the most recent information from the other, progressively refining both decision sets toward a high-quality joint solution.
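The outer loop of Algorithm 1 can be sketched independently of the two inner solvers. In the minimal example below, `alternate`, `solve_placement`, `solve_offloading`, and the toy objective are all hypothetical names introduced for illustration; the inner solvers are exhaustive searches over a tiny integer range rather than PSO and GA, so that the control flow of the alternation is easy to follow.

```python
def alternate(fitness, solve_placement, solve_offloading, y0, k_max=20, eps=1e-6):
    """Alternate between two sub-solvers until the gain drops below eps.

    solve_placement(y) returns a placement for fixed offloading y;
    solve_offloading(x) returns an offloading for fixed placement x.
    """
    x, y, best = None, y0, float("inf")
    for _ in range(k_max):
        x_new = solve_placement(y)          # Phase 1: placement, offloading fixed
        y_new = solve_offloading(x_new)     # Phase 2: offloading, placement fixed
        value = fitness(x_new, y_new)
        if best - value < eps:              # improvement too small: stop
            if value < best:
                x, y, best = x_new, y_new, value
            break
        x, y, best = x_new, y_new, value
    return x, y, best

# Toy stand-in for the latency objective (an assumption, not Equation (10)).
def toy_fitness(x, y):
    return (x - 3) ** 2 + (y - x) ** 2

candidates = range(6)
x_star, y_star, best = alternate(
    toy_fitness,
    solve_placement=lambda y: min(candidates, key=lambda x: toy_fitness(x, y)),
    solve_offloading=lambda x: min(candidates, key=lambda y: toy_fitness(x, y)),
    y0=0,
)
```

On this toy objective the loop stops at (2, 2) with value 1, although (3, 3) attains 0. This illustrates a general property of alternating schemes: they stop when neither subproblem can improve on its own, which need not coincide with the joint optimum.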

4.3. Search Space Encoding and Fitness Function

To ensure solution feasibility and efficient exploration, we design specialized encoding schemes for the two sets of decision variables. For ES placement, each candidate solution is represented as an E-dimensional vector $X = (x_{1}, x_{2}, \dots, x_{E})$, where each element $x_{k} \in \{1, 2, \dots, B\}$ denotes the index of the BS to which the k-th ES is assigned. This encoding inherently satisfies the atomicity constraint $\sum_{j = 1}^{B} x_{j, k} = 1$, as each ES is placed at exactly one BS. Consequently, placements that violate atomicity are excluded from the search space, reducing invalid exploration overhead and accelerating convergence.
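To make the relation between the compact encoding and the original binary variables concrete, the sketch below expands an E-dimensional placement vector into the B×E matrix of $x_{j, k}$ values. The helper name `decode_placement` is an assumption for this example.

```python
import numpy as np

def decode_placement(x_vec, num_bs):
    """Expand a placement vector (1-based BS index per ES) into the
    binary matrix x[j, k] of shape (num_bs, E)."""
    x_vec = np.asarray(x_vec, dtype=int)
    x_bin = np.zeros((num_bs, x_vec.size), dtype=int)
    # Column k gets a single 1 in the row of the BS hosting ES k.
    x_bin[x_vec - 1, np.arange(x_vec.size)] = 1
    return x_bin
```

Because every column of the decoded matrix contains exactly one 1, any vector with entries in $\{1, \dots, B\}$ satisfies atomicity by construction. Several ESs may still share the same BS, as in `decode_placement([2, 2, 4], 4)`.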

For task offloading, each offloading decision is represented as an R-dimensional binary vector $Y = (y_{1}, y_{2}, \dots, y_{R})$, where $y_{i} \in \{0, 1\}$ indicates whether the i-th request is processed at the edge ($y_{i} = 1$) or offloaded to the cloud ($y_{i} = 0$). This encoding mirrors the original decision variable definition, enabling straightforward fitness evaluation without additional decoding.

Given a placement solution $X$ and an offloading solution $Y$, the fitness (system latency) is computed as the average delay over all requests using the objective function in Equation (10). The fitness value guides both the PSO and GA modules during optimization.

4.4. Algorithmic Procedures

This subsection details the internal operations of the two modules within the alternating iterative framework. Given a fixed offloading decision, the PSO module optimizes the ES placement; given a fixed placement, the GA module optimizes the task offloading decision. The two modules are executed alternately in the main loop, with the output of one serving as the input to the other, as outlined in Algorithm 1.

(1) Particle Swarm Optimization for ES placement. Given a fixed offloading solution $Y^{t}$, the PSO module optimizes the placement decisions. A swarm of $N_{p}$ particles is initialized, where each particle represents a candidate placement vector $X_{p}$ with each element randomly chosen from the integers in $[1, B]$. Velocities $V_{p}$ are initialized to zero. In each iteration, the velocity of each particle is updated using the standard PSO rule $V_{p} \leftarrow \omega V_{p} + c_{1} r_{1} ({pbest}_{p} - X_{p}) + c_{2} r_{2} (gbest - X_{p})$, where $\omega$ is the inertia weight, $c_{1}$ and $c_{2}$ are acceleration coefficients, and $r_{1}, r_{2}$ are random numbers in $[0, 1]$. The position is then updated by $X_{p} \leftarrow round(X_{p} + V_{p})$, with each element clamped to the interval $[1, B]$. The fitness of each particle is evaluated using the fixed offloading decisions via Equation (10). Personal best ${pbest}_{p}$ and global best $gbest$ are updated accordingly. This process repeats for $T_{p}$ iterations, and the final $gbest$ is output as the optimized placement $X^{t + 1}$.
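The velocity and position update described above can be sketched for a whole swarm at once. This is a minimal illustration under stated assumptions: the function name `pso_step`, the default values of `omega`, `c1`, and `c2`, and the choice to draw $r_{1}$ and $r_{2}$ independently per dimension are not specified in the text and are chosen here only for the example.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, num_bs, omega=0.7, c1=1.5, c2=1.5, rng=None):
    """One discrete PSO update for integer placement vectors.

    x, v, pbest: arrays of shape (N_p, E) or (E,); gbest: shape (E,).
    Returns the rounded, clamped positions and the new velocities.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    # Assumption: r1, r2 are drawn per particle and per dimension.
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v_new = omega * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    # Round to the nearest integer, then clamp to valid BS indices 1..B.
    # Note: np.rint rounds exact halves to the nearest even integer.
    x_new = np.clip(np.rint(x + v_new), 1, num_bs).astype(int)
    return x_new, v_new
```

Rounding followed by clamping keeps every particle inside the encoded search space, so no separate repair step is needed for the placement vector.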

(2) Genetic algorithm for task offloading. Given a fixed placement solution $X^{t + 1}$, the GA module optimizes the offloading decisions. A population of $N_{g}$ binary strings is initialized randomly, each representing an offloading vector $Y_{q}$ of length R. Tournament selection is used to choose parent individuals based on fitness (lower latency is better). Offspring are generated via uniform crossover with probability $p_{c}$: each bit of the offspring is inherited from either parent with equal chance. Bit-flip mutation is applied with probability $p_{m}$, where a bit is flipped from 0 to 1 or vice versa. Elitism is employed to carry the best individual(s) unchanged into the next generation. Fitness evaluation uses the fixed placement solution and Equation (10). The algorithm runs for $T_{g}$ generations, and the best individual $Y_{best}$ becomes the optimized offloading $Y^{t + 1}$.
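The three GA operators named above can be sketched as follows. The function names and the tournament size `k = 2` are assumptions for illustration; the text does not state the tournament size, and the mutation probability is interpreted here as a per-bit probability.

```python
import numpy as np

def tournament(pop, fit, rng, k=2):
    """Pick k random individuals and return the one with the lowest fitness."""
    idx = rng.integers(0, len(pop), size=k)
    return pop[idx[np.argmin(fit[idx])]]

def uniform_crossover(a, b, rng):
    """Each bit of a child comes from either parent with equal chance;
    the second child receives the complementary choice."""
    mask = rng.random(a.size) < 0.5
    return np.where(mask, a, b), np.where(mask, b, a)

def bit_flip(y, p_m, rng):
    """Flip each bit independently with probability p_m."""
    flip = rng.random(y.size) < p_m
    return np.where(flip, 1 - y, y)
```

A usage example: `c1, c2 = uniform_crossover(parent_a, parent_b, rng)` followed by `bit_flip(c1, p_m, rng)` yields one mutated offspring, where `rng = np.random.default_rng(seed)`.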

The alternating process continues until convergence or a maximum number of iterations is reached.

Constraint (12) is enforced during fitness evaluation without modifying the actual decision variables. In the PSO module, for a given placement $X$ and fixed offloading $Y^{t}$, any request i with $y_{i} = 1$ whose associated BS has no ES is temporarily considered as offloaded to the cloud when computing its delay. In the GA module, for a given offloading individual $Y$ and fixed placement $X^{t + 1}$, the same rule applies: if $y_{i} = 1$ but the corresponding BS lacks an ES, the request is treated as cloud-processed. This approach maintains the feasibility of all evaluated solutions while preserving the original search directions of PSO and GA.

5. Performance Evaluation

In this section, we evaluate the performance of the proposed alternating iterative PSO-GA (AIPG) algorithm against several baseline and state-of-the-art methods. We describe the experimental setup, present the comparative results, and provide detailed analyses of the key performance metrics.

5.1. Experimental Setup

The simulated edge–cloud system consists of $R = 1000$ user requests, $B = 50$ BSs, and $E = 10$ ESs to be placed. The input data size of each request $d_{i}$ is uniformly distributed in the range $[1000, 3000]$ KB, and the required computing amount $c_{i}$ follows a uniform distribution in $[1, 100]$ MI (million instructions). Each ES provides computing capacity $f_{k}$ uniformly distributed in $[10, 100]$ MIPS (million instructions per second), while each BS offers a network transmission rate $w_{j}$ uniformly distributed in $[10, 100]$ Mbps. For cloud processing, the cloud provides a computing capacity of $F = 100$ MIPS and a network transmission rate of $W = 10$ Mbps, which are shared among all offloaded requests.
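For readers who wish to reproduce a comparable setting, the parameter ranges above can be sampled as in the sketch below. The function name `make_instance`, the dictionary layout, and the seed handling are assumptions; the receivability indicators $h_{i, j}$ are omitted because their generation is not described in this section.

```python
import numpy as np

def make_instance(num_req=1000, num_bs=50, num_es=10, seed=0):
    """Sample one synthetic instance using the ranges stated in Section 5.1."""
    rng = np.random.default_rng(seed)
    return {
        "d": rng.uniform(1000, 3000, num_req),  # input data size per request, KB
        "c": rng.uniform(1, 100, num_req),      # computing amount per request, MI
        "f": rng.uniform(10, 100, num_es),      # ES computing capacity, MIPS
        "w": rng.uniform(10, 100, num_bs),      # BS transmission rate, Mbps
        "F": 100.0,                             # cloud computing capacity, MIPS
        "W": 10.0,                              # cloud transmission rate, Mbps
    }
```

Calling `make_instance(seed=s)` for 20 different values of `s` would give one instance per run, which is one possible reading of the repeated-run protocol.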

To comprehensively evaluate the performance of the proposed AIPG method, we compare it against the following ten algorithms:

AllCloud: All requests are offloaded to the cloud.
AllEdge: All requests are processed at the edge, with edge servers randomly placed.
RAND: Random edge server placement combined with random task offloading decisions.
MLF: ESs are deployed on the ECs with the highest computing overload first.
GA: Genetic algorithm applied to the joint optimization problem without decomposition.
DE: Differential evolution algorithm applied to the joint optimization problem.
PSO: Particle swarm optimization applied to the joint optimization problem.
PSOGA [25]: A hybrid method combining PSO and GA without alternating iteration.
GWO [26]: Grey wolf optimizer applied to the joint optimization problem.
FOX [27]: Fox optimizer applied to the joint optimization problem.

The primary performance metric is the average processing latency (system latency), defined as the mean delay across all requests. All experiments are repeated 20 times with different random seeds to reduce the effect of randomness. For each algorithm, we report the average, minimum, and standard deviation of the system latency across the 20 runs.
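The three reported statistics can be computed from the per-run latencies as sketched below. The function name `summarize` is hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not specify which is reported.

```python
import statistics

def summarize(latencies):
    """Average, minimum, and sample standard deviation over repeated runs."""
    return {
        "avg": statistics.fmean(latencies),
        "min": min(latencies),
        "std": statistics.stdev(latencies),  # assumption: sample (n - 1) form
    }
```

For example, `summarize(run_latencies)` with a list of 20 per-run system latencies yields the values plotted for one algorithm in Figures 2–4.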

5.2. Results and Analysis

5.2.1. Performance Comparison

Figure 2, Figure 3 and Figure 4 show the average latency, minimum latency, and standard deviation, respectively, for each algorithm over 20 independent runs. The proposed AIPG algorithm achieves the lowest average system latency among all compared methods, 0.079 s. The second-best algorithms (GA and PSOGA) both average 0.098 s, so AIPG reduces latency by approximately 19.4%. Compared with the heuristic AllEdge method (0.245 s) and the MLF method (0.235 s), AIPG reduces the average latency by about 67.8% and 66.4%, respectively. Notably, the baseline AllCloud scheme yields the worst performance (0.701 s) due to the limited cloud network capacity and shared resource contention, highlighting the importance of edge computing for latency-sensitive applications. These results suggest that the alternating iterative decomposition of placement and offloading decisions effectively captures the interdependencies between the two subproblems, leading to improved joint optimization.

As shown in Figure 2, the average latency metric reflects the overall system performance under typical conditions. AIPG achieves the lowest average (0.079 s), outperforming both GA and PSOGA (both 0.098 s) by a clear margin. This indicates that the alternating iterative strategy converges to better solutions than directly applying GA or PSOGA without alternating refinement. PSO and GWO, which treat the joint problem as a single optimization task, achieve average latencies of 0.399 s and 0.405 s, respectively, significantly higher than AIPG. This suggests that the coupled nature of placement and offloading decisions makes joint optimization challenging for these monolithic meta-heuristics, whereas the decomposition-based approach effectively reduces problem complexity. DE achieves an average of 0.351 s, while FOX attains 0.369 s, both substantially worse than AIPG. The heuristic MLF method (0.235 s) and AllEdge (0.245 s) outperform the random and cloud-only baselines, and also the monolithic PSO, GWO, DE, and FOX, but still lag well behind GA, PSOGA, and AIPG. Overall, AIPG demonstrates superior capability in finding low-latency solutions by leveraging the complementary strengths of PSO and GA in an alternating framework.

As shown in Figure 3, the minimum latency achieved across 20 runs represents the best-case solution each algorithm can attain. AIPG again attains the lowest minimum value (0.071 s), followed by GA and PSOGA (both 0.083 s). Based on these values, the improvement over the second-best methods is approximately 14.5%, which confirms that AIPG not only performs well on average but also has the potential to reach exceptionally high-quality solutions. AllEdge achieves a minimum of 0.125 s, which is relatively low compared to its average, indicating that random placement can occasionally yield favorable configurations, albeit with high variability. The meta-heuristic methods that do not employ decomposition (PSO, GWO, DE, FOX) achieve minimum latencies ranging from 0.320 s to 0.369 s, which are substantially higher than those of GA, PSOGA, and AIPG. This underscores the importance of addressing the placement and offloading subproblems in a structured alternating manner rather than treating the entire problem as a monolithic optimization.

Standard deviation measures the stability and robustness of an algorithm across multiple runs. As shown in Figure 4, AIPG exhibits the smallest standard deviation (0.005 s), indicating that it consistently produces high-quality solutions with minimal variability. GA, DE, and PSOGA also achieve low standard deviations (0.008 s each), demonstrating stable performance, while GWO is moderate (0.016 s). FOX has a relatively large deviation (0.034 s), suggesting less consistent performance. The largest deviations belong to RAND (0.042 s) and AllEdge (0.047 s), reflecting the high variability inherent in random placement decisions. The low standard deviation of AIPG indicates that the alternating iterative mechanism not only improves solution quality but also enhances algorithm robustness by guiding the search toward stable configurations.

In summary, the proposed AIPG algorithm demonstrates superior performance across all evaluated metrics. It achieves the lowest average system latency (0.079 s), the best minimum latency (0.071 s), and the smallest standard deviation (0.005 s) among all compared methods. The results validate the effectiveness of the alternating iterative decomposition strategy, which enables PSO to focus on optimal edge server placement while GA refines task offloading decisions in a coordinated manner. This structured approach yields significant improvements over monolithic meta-heuristic methods and heuristic baselines, establishing AIPG as a robust and efficient solution for the edge server placement and task offloading joint optimization problem in edge–cloud computing systems.
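The relative reductions quoted in this subsection follow from the reported averages by a one-line formula. The sketch below shows the arithmetic; the function name `reduction` is introduced here only for illustration, and the inputs are the average latencies stated in the text.

```python
def reduction(baseline, proposed):
    """Relative latency reduction of `proposed` over `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline

vs_ga = reduction(0.098, 0.079)        # AIPG vs. GA / PSOGA
vs_alledge = reduction(0.245, 0.079)   # AIPG vs. AllEdge
vs_cloud = reduction(0.701, 0.079)     # AIPG vs. AllCloud
```

Rounded to one decimal place, these evaluate to 19.4%, 67.8%, and 88.7%, respectively.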

5.2.2. Impact of the Number of BSs

To evaluate how the number of base stations B affects the performance of different algorithms, we conducted experiments with B varying from 10 to 50 (in steps of 10), while keeping $R = 1000$ requests and $E = 10$ ESs fixed. Figure 5 reports the average system latency for each algorithm under different B values.

From the results, as the number of base stations increases, most algorithms exhibit a decreasing trend in average latency. This is because a larger set of BSs offers more flexibility for placing edge servers, allowing requests to be distributed more evenly across the network, which reduces the load on individual edge servers and shortens transmission distances. The reduction is most significant for methods that intelligently select placement locations, such as AIPG, GA, and PSOGA, while naive heuristics (AllEdge, RAND, MLF) also benefit but to a lesser extent.

AIPG achieves the lowest latency for every B value except $B = 10$, where GA and PSOGA show a marginally lower value (0.124 s vs. 0.125 s). However, the difference is extremely small (0.001 s) and likely within statistical noise. For $B \geq 20$, AIPG clearly outperforms all other methods. Notably, AIPG’s latency drops from 0.125 s at $B = 10$ to 0.079 s at $B = 50$, a reduction of 36.8%, while GA (and PSOGA) only drops from 0.124 s to 0.098 s (about 21.0%). This indicates that AIPG is more effective at exploiting the increased number of candidate locations due to its alternating iterative mechanism, which jointly optimizes placement and offloading.

AllEdge (random placement) improves substantially when more BSs become available, from 0.665 s at $B = 10$ to 0.245 s at $B = 50$, because more random placements can accidentally hit good configurations. However, its latency remains much higher than that of AIPG. MLF, which places servers on the most overloaded ECs, also benefits but saturates around 0.235 s. PSO and GWO, which treat the joint problem monolithically, show moderate improvement but still lag far behind AIPG, with latencies around 0.4 s at $B = 50$.

AllCloud shows almost no variation with B (around 0.70 s), as expected because its performance depends only on cloud resources and is independent of edge infrastructure. This highlights the importance of edge computing: even a small number of edge servers ($E = 10$) can reduce latency by nearly an order of magnitude compared to the cloud-only approach (0.079 s for AIPG vs. about 0.70 s at $B = 50$).

In summary, the proposed AIPG algorithm demonstrates superior scalability and adaptability to larger numbers of base stations, achieving the lowest average latency for all $B \geq 20$ and an essentially tied result at $B = 10$. The performance gap between AIPG and other meta-heuristics widens as B increases, indicating that the alternating iterative decomposition effectively exploits the expanded placement space.

5.2.3. Impact of the Number of ESs

To investigate how the amount of edge computing resources affects system performance, we varied the number of edge servers E from 6 to 10 in steps of 1, while keeping $R = 1000$ requests and $B = 50$ base stations fixed. Figure 6 reports the average system latency for each algorithm under different E values.

As the number of edge servers increases, the average latency of all edge-aware algorithms decreases significantly. This is expected because more servers provide additional computing capacity, allowing more requests to be processed at the edge and reducing the need for cloud offloading. For most algorithms, the most substantial improvements occur when E increases from 6 to 8. Beyond 8, the marginal gain diminishes as the system becomes adequately provisioned.

AIPG achieves the lowest latency for every value of E. At $E = 6$, AIPG (0.143 s) outperforms GA and PSOGA (0.195 s) by 26.7%. At $E = 10$, the gap narrows slightly in relative terms (AIPG: 0.079 s, GA/PSOGA: 0.098 s, a 19.4% improvement), but AIPG remains the best. Notably, the performance of GA and PSOGA is identical in this experiment, suggesting that PSOGA’s non-alternating hybrid does not yield an advantage over pure GA. AIPG’s alternating iteration, by contrast, consistently finds better placements and offloading policies.

Comparison with single-population methods. PSO and GWO, which treat the joint problem monolithically, improve slowly as E increases, but their latencies remain high (0.399 s and 0.405 s at $E = 10$). DE shows moderate improvement but still lags far behind AIPG. These results indicate that simply adding more edge servers does not compensate for the inability to jointly optimize placement and offloading; a structured decomposition is necessary.

Heuristic baselines. AllEdge and RAND show substantial latency reductions when more servers are available, but they still perform poorly compared to the best meta-heuristics. MLF (most load first) improves more dramatically, especially from $E = 8$ to $E = 10$, because having many servers allows the heuristic to balance load effectively. Nevertheless, at $E = 10$, the latency of MLF (0.235 s) is still nearly three times that of AIPG (0.079 s).

Cloud-only baseline. AllCloud remains constant at around 0.70 s, independent of E, confirming that the latency gains of the other methods come from edge resources. The fact that AIPG with only 6 edge servers achieves 0.143 s, roughly a 5× improvement over the cloud-only baseline, demonstrates the effectiveness of our approach even under tight resource budgets.

In summary, the proposed AIPG algorithm consistently outperforms all baselines across the tested range of edge server counts. It effectively exploits additional servers to reduce latency, and it achieves this with better scalability than monolithic meta-heuristics. These results further support the practical value of the alternating iterative framework for edge server placement and task offloading.

5.2.4. Impact of the Number of User Requests

To evaluate the scalability of the proposed AIPG algorithm with respect to the number of user requests, we varied R from 100 to 1000 in steps of 100, while keeping $B = 50$ base stations and $E = 10$ edge servers fixed. Figure 7 reports the average system latency for each algorithm under different R values.

As the number of user requests increases, the average latency of most algorithms grows, because more requests compete for the limited computing and network resources. However, the rate of increase varies significantly across algorithms.

AIPG achieves the lowest latency for all values of R. For very small request volumes ($R = 100$), AIPG (0.004 s) and GA/PSOGA (0.005 s) both achieve extremely low latency, nearly reaching the lower bound. As R grows, the gap between AIPG and GA/PSOGA widens: at $R = 1000$, AIPG (0.079 s) outperforms GA/PSOGA (0.098 s) by 19.4%. This indicates that the alternating iterative mechanism is more effective at managing the increased complexity of the offloading space when many requests are present.

AIPG’s latency grows gradually with R (from 0.004 s at $R = 100$ to 0.079 s at $R = 1000$), with no abrupt degradation as the offloading space expands. In contrast, monolithic meta-heuristics such as PSO and GWO show a steeper initial increase and then plateau at relatively high latencies (around 0.39–0.40 s). DE shows a rapid rise from 0.122 s at $R = 100$ to 0.351 s at $R = 1000$, indicating poor scalability.

AllEdge, RAND, and MLF perform poorly even for small R. AllEdge (random placement) has high latency (0.239 s at $R = 100$) that remains largely unchanged with R, because random placement fails to exploit the limited number of edge servers. MLF (most load first) improves over AllEdge but still lags far behind the best meta-heuristic methods. AllCloud shows almost no variation with R in our results, but its latency is consistently high (about 0.70 s), making it unsuitable for delay-sensitive applications.

The proposed AIPG algorithm demonstrates excellent scalability with respect to the number of user requests. It consistently outperforms all baselines, especially when R is large, thanks to its efficient alternating decomposition that handles the growing offloading space without suffering from the curse of dimensionality. These results confirm the practical applicability of AIPG in edge–cloud systems with realistic request volumes.

6. Conclusions

This paper investigated the joint optimization of ES placement and task offloading in edge–cloud computing systems, recognizing the inherent coupling between these two decisions and the NP-hard nature of the resulting binary non-linear programming formulation. To address this challenge, we proposed an alternating iterative framework that decomposes the joint problem into two interdependent subproblems, ES placement optimized by particle swarm optimization (PSO) and task offloading optimized by genetic algorithm (GA), leveraging the complementary strengths of these meta-heuristics to efficiently explore the discrete placement space and the high-dimensional binary offloading space, respectively. Through compact encoding schemes that inherently satisfy placement constraints and a detailed complexity analysis demonstrating polynomial-time scalability, the proposed method balances solution quality with computational efficiency. Extensive experiments comparing the proposed algorithm against ten baseline methods, including heuristics and state-of-the-art meta-heuristics, showed that our approach achieves the lowest average latency among all compared methods, confirming the effectiveness, robustness, and scalability of the alternating iterative framework.

The current work assumes a static batch of requests, which may not capture the dynamics of real-world edge environments where user requests arrive over time. Extending the alternating iterative framework to handle online optimization with dynamic request arrivals is an important direction. One possible approach is to combine the proposed method with sliding-window optimization, where the batch window moves over time, or to integrate it with reinforcement learning to adapt placement and offloading decisions in response to changing workload patterns. Such dynamic formulations would better reflect practical edge–cloud systems and will be pursued in our future research.

Author Contributions

Conceptualization, W.S.; methodology, Z.Z.; software, W.S.; validation, B.W.; formal analysis, B.W.; investigation, W.S.; resources, W.S.; data curation, Z.Z.; writing—original draft preparation, W.S.; writing—review and editing, B.W.; visualization, W.S.; supervision, Z.Z.; project administration, B.W.; funding acquisition, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the key scientific and technological projects of Henan Province (Grant Nos. 262102211130, 252102211072 and 252102221017).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset is available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

BNLP	Binary Non-Linear Programming
BS	Base Station
DE	Differential Evolution
DNSGA	Non-Dominated Sorting Genetic Algorithm
EC	Edge Computing Center
ES	Edge Server
ESPP	Edge Server Placement Problem
GA	Genetic Algorithm
GWO	Grey Wolf Optimizer
IoT	Internet of Things
MI	Million Instructions
MIPS	Million Instructions Per Second
MLF	Most Load First
PSO	Particle Swarm Optimization
PSOGA	Hybrid PSO and GA
UD	User Device

References

Zreikat, A.I.; AlArnaout, Z.; Abadleh, A.; Elbasi, E.; Mostafa, N. The Integration of the Internet of Things (IoT) Applications into 5G Networks: A Review and Analysis. Computers 2025, 14, 250.
Kong, L.; Tan, J.; Huang, J.; Chen, G.; Wang, S.; Jin, X.; Zeng, P.; Khan, M.; Das, S.K. Edge-computing-driven Internet of Things: A Survey. ACM Comput. Surv. 2022, 55, 174.
Asghari, A.; Sohrabi, M.K. Server placement in mobile cloud computing: A comprehensive survey for edge computing, fog computing and cloudlet. Comput. Sci. Rev. 2024, 51, 100616.
Zhao, J.; Quan, H.; Ge, P.; Huang, Y.; Xiao, Y. Vehicular Edge Computing System: A Survey. IEEE Internet Things Mag. 2025, 8, 122–128.
Peng, P.; Lin, W.; Wu, W.; Zhang, H.; Peng, S.; Wu, Q.; Li, K. A survey on computation offloading in edge systems: From the perspective of deep reinforcement learning approaches. Comput. Sci. Rev. 2024, 53, 100656.
Zhang, S.; Tong, X.; Chi, K.; Gao, W.; Chen, X.; Shi, Z. Stackelberg Game-Based Multi-Agent Algorithm for Resource Allocation and Task Offloading in MEC-Enabled C-ITS. IEEE Trans. Intell. Transp. Syst. 2025, 26, 17940–17951.
Zhao, L.; Wu, Z.; Zhou, J.; Cai, H.; Li, B.; Xiao, F. EdgePro: Adaptive Edge Service Provision via Safe Deep Reinforcement Learning. In Proceedings of the 2025 IEEE International Conference on Web Services (ICWS), Helsinki, Finland, 7–12 July 2025; pp. 742–752.
Bahrami, B.; Khayyambashi, M.R.; Mirjalili, S. Edge server placement problem in multi-access edge computing environment: Models, techniques, and applications. Clust. Comput. 2023, 26, 3237–3262.
Zhao, L.; Tan, W.; Li, B.; He, Q.; Huang, L.; Sun, Y.; Xu, L.; Yang, Y. Joint Shareability and Interference for Multiple Edge Application Deployment in Mobile-Edge Computing Environment. IEEE Internet Things J. 2022, 9, 1762–1774.
Zhao, L.; Li, B.; Tan, W.; Cui, G.; He, Q.; Xu, X.; Xu, L.; Yang, Y. Joint Coverage-Reliability for Budgeted Edge Application Deployment in Mobile Edge Computing Environment. IEEE Trans. Parallel Distrib. Syst. 2022, 33, 3760–3771.
Zhao, L.; Xiao, F.; Li, B.; Zhou, J.; Xu, X.; Yang, Y. Availability-Aware Revenue-Effective Application Deployment in Multi-Access Edge Computing. IEEE Trans. Parallel Distrib. Syst. 2024, 35, 1268–1280.
Zhao, L.; Li, B.; Zhou, J.; Chen, C.; Xiao, F.; Yang, Y. Maximizing Revenue for Reliability-Aware Edge Application Deployment. IEEE Trans. Ind. Inform. 2026, 22, 4593–4603.
Li, B.; He, Q.; Cui, G.; Xia, X.; Chen, F.; Jin, H.; Yang, Y. READ: Robustness-Oriented Edge Application Deployment in Edge Computing Environment. IEEE Trans. Serv. Comput. 2022, 15, 1746–1759.
Lyu, Z.; Xiao, M.; Skoglund, M.; Debbah, M.; Poor, H.V. Quantization-Aware Collaborative Inference for Large Embodied AI Models. arXiv 2026, arXiv:2602.13052.
Tiwari, V.; Pandey, C.; Dahal, A.; Roy, D.S.; Fiore, U. A Knapsack-based Metaheuristic for Edge Server Placement in 5G networks with heterogeneous edge capacities. Future Gener. Comput. Syst. 2024, 153, 222–233.
Wu, J.; Xu, X.; Cui, G.; Zhang, Y.; Qi, L.; Dou, W.; Cai, Z. Fairness-Aware Budgeted Edge Server Placement for Connected Autonomous Vehicles. IEEE Trans. Mob. Comput. 2025, 24, 4762–4776.
Vali, A.A.; Azizi, S.; Shojafar, M. RESP: A Recursive Clustering Approach for Edge Server Placement in Mobile Edge Computing. ACM Trans. Internet Technol. 2024, 24, 13.
Zhang, S.; Yu, J.; Hu, M. An edge server placement based on graph clustering in mobile edge computing. Sci. Rep. 2024, 14, 29986.
Zhou, Z.; Abawajy, J. Reinforcement Learning-Based Edge Server Placement in the Intelligent Internet of Vehicles Environment. IEEE Trans. Intell. Transp. Syst. 2025; in press.
Li, K. Mobility-aware server placement and power allocation for randomly walking mobile users. J. Parallel Distrib. Comput. 2026, 210, 105216.
Zarei, S.; Azizi, S.; Ahmed, A. Optimizing Edge Server Placement and Load Distribution in Mobile Edge Computing Using ACO and Heuristic algorithms. J. Supercomput. 2025, 81, 257.
Wang, B.; Zhang, Z.; Song, Y.; Chen, M.; Liu, D. nPGSAO: A Hybrid Particle Swarm Optimization and Genetic Algorithm with Niching Technology for Edge Server Placement. IEEE Internet Things J. 2025, 12, 19370–19383.
Ghasemzadeh, A.; Aghdasi, H.S.; Saeedvand, S. Edge server placement and allocation optimization: A tradeoff for enhanced performance. Clust. Comput. 2024, 27, 5783–5797.
Surayya, A.; Muzakkir Hussain, M.; Reddy, V.D.; Abdul, A.; Gazi, F. Evolutionary Algorithms for Edge Server Placement in Vehicular Edge Computing. IEEE Access 2025, 13, 79030–79052.
Song, M.; An, M.; He, W.; Wu, Y. Research on land use optimization based on PSO-GA model with the goals of increasing economic benefits and ecosystem services value. Sustain. Cities Soc. 2025, 119, 106072.
Jiang, J.; Zhao, Z.; Liu, Y.; Li, W.; Wang, H. DSGWO: An improved grey wolf optimizer with diversity enhanced strategy based on group-stage competition and balance mechanisms. Knowl.-Based Syst. 2022, 250, 109100.
Mohammed, H.; Rashid, T. FOX: A FOX-inspired optimization algorithm. Appl. Intell. 2023, 53, 1030–1050.

Figure 1. The architecture of the edge–cloud system.

Figure 2. The average system latency for each algorithm over 20 independent runs.

Figure 3. The minimum system latency for each algorithm over 20 independent runs.

Figure 4. The standard deviation of the system latency for each algorithm over 20 independent runs.

Figure 5. Average latency vs. number of base stations (B).

Figure 6. Average latency vs. number of edge servers (E).

Figure 7. Average latency vs. number of user requests (R).


