HILANDER: High-Performance Intelligent Learning-Based Task Offloading for Network-Aware Dynamic Edge Resource Allocation

Mdemaya, Garrik Brel Jagho; Ngomade, Armel Nkonjoh; Velempini, Mthulisi

doi:10.3390/iot7020038

by Garrik Brel Jagho Mdemaya ¹,*, Armel Nkonjoh Ngomade ² and Mthulisi Velempini ¹

¹ Department of Computer Science, University of Limpopo, Polokwane 0727, South Africa
² Department of Software Engineering, Higher Institute of Transport Logistics and Commerce, University of Ebolowa, Ebolowa P.O. Box 886, Cameroon
* Author to whom correspondence should be addressed.

IoT 2026, 7(2), 38; https://doi.org/10.3390/iot7020038

Submission received: 3 November 2025 / Revised: 12 February 2026 / Accepted: 16 February 2026 / Published: 27 April 2026

Abstract

Edge computing has emerged as a promising paradigm to minimize latency and energy consumption while improving computational efficiency for mobile devices. Latency-sensitive applications such as autonomous driving, augmented reality, and industrial automation require ultra-low response times, making efficient task offloading a necessity in edge computing. However, optimally distributing computational tasks among edge servers remains a challenge, especially when latency, energy consumption, and workload balancing are considered simultaneously. Although existing approaches have focused on one or two of these objectives, they do not provide a holistic solution that incorporates all three. In addition, some existing solutions do not take advantage of parallelism at the edge layer, resulting in bottlenecks and inefficient resource usage. In this paper, we propose a novel learning-based task offloading model that integrates parallel processing at the edge layer, adaptive workload balancing, and joint latency–energy optimization. Moreover, by dynamically adjusting the number of edge servers selected for parallel execution, our approach achieves optimal trade-offs between performance and resource efficiency. Our experimental setup includes several edge servers and several randomly deployed devices, and employs Apache HTTP Benchmark (AB) to generate realistic Mobile Edge Computing workloads. The results show that our method outperforms existing approaches by reducing latency, lowering energy consumption, and maintaining a balanced workload across edge nodes.

Keywords:

energy consumption; internet of things; latency; mobile edge computing; parallel computing

1. Introduction

The Internet of Things and next-generation networks have introduced a plethora of new energy-intensive and latency-sensitive applications [1,2]. In an IoT network such as a smart city, IoT devices generally have fairly limited capacities in terms of battery, storage memory, and processor frequency, which means that some heavy tasks may not be processed because they consume a lot of resources and exhaust the batteries [3,4]. Mobile edge computing (MEC) is a paradigm in which edge servers, with capabilities similar to those of cloud-layer computers, are positioned close to IoT objects. The main objective is to process requests from these objects faster, thereby reducing request-processing latency compared with sending the requests to the cloud layer [5,6,7]. In such an environment, IoT objects can send certain heavy tasks to the edge layer for faster execution: this is known as task offloading [8,9] and it is illustrated in Figure 1.

Figure 1 clearly shows that task offloading in an MEC environment can be carried out using different strategies, including full offloading, partial offloading, and device-to-device (D2D) offloading. Each of these techniques has advantages and disadvantages that influence their applicability depending on the application context.

Full offloading [10] involves delegating an entire task from an IoT device to an edge server. This approach maximizes the use of edge server computing resources while minimizing the energy consumption of IoT devices, which are typically resource-constrained. However, a potential drawback of full offloading is the latency induced by data transmission and the dependence on network quality;
Partial offloading [11] involves fragmenting the task into several subtasks, some of which are executed locally on the IoT device while others are transferred to the edge server. This approach aims to balance energy consumption and latency, but it presents several challenges. It increases the complexity of the decision-making process, as it requires determining the optimal parts of the task to offload. In addition, synchronizing local subtasks with those executed at the edge can introduce additional delays and computational overhead, potentially reducing the expected benefits;
D2D offloading [12] is based on the idea that an IoT object can transfer a task to a neighboring IoT object instead of sending it directly to an edge server. This technique can reduce congestion on MEC servers, but it is limited by the availability and capacity of other IoT objects, which may themselves be resource-constrained. In addition, unstable connections between IoT objects can lead to increased transmission delays and decreased reliability of the offloading process.

In our approach, we adopted full offloading for several reasons. Firstly, it simplifies decision-making by avoiding the complexity of task slicing and synchronization, as is the case with partial offloading. Secondly, it makes full use of edge server resources, which are far more capable than those of IoT objects. Thirdly, it allows for better management of the energy consumption of IoT objects by avoiding any local computational load, which is essential to extend their autonomy. Finally, thanks to our machine learning model, edge server selection is performed intelligently, thus reducing the latency and network congestion generally associated with full offloading.

However, when offloading a task, many parameters need to be taken into account, such as the following:

Task offloading is performed only when the latency resulting from execution at the edge layer is less than the time required to process the task locally on the IoT object;
The energy consumption used to offload the task to the edge layer is less than the energy consumed if the task is executed locally on the IoT object.

The main challenge is therefore to intelligently determine whether a task should be executed locally on an IoT device or offloaded to an edge server. A poorly optimized decision can lead either to overloading the edge servers or to excessive energy consumption on the local device, thereby degrading the overall performance of the system. Current approaches to offloading in MEC have several limitations. Heuristic methods are not sufficiently adaptive and cannot dynamically adjust to variations in network conditions [13,14]. Many machine learning-based solutions, on the other hand, take into account only a single optimization criterion, latency or energy, without offering a holistic approach that integrates both [15,16]. Several works, such as those of Mengyu et al. [17] and Dinesh et al. [18], included both latency and energy in their offloading solutions but did not distribute user requests among edge servers for efficient workload balancing. In addition, the use of parallelism at the edge layer has received limited attention, although it can significantly improve execution performance.

In this paper, we propose an innovative decision model that takes all these challenges into consideration to improve the management of offloading and distributed computing in an MEC environment. Our contribution is based on three main premises: (1) Holistic optimization: previous work often ignores the joint optimization of latency, energy consumption, and workload balancing, whereas our model is designed to provide a holistic solution integrating all three; (2) Adaptive parallelism: we introduce adaptive parallel processing at the edge layer, where the system dynamically selects the number of edge servers used for execution to achieve the best trade-off between performance and resource efficiency, which is a key differentiator from existing approaches; (3) Contextual decision making: an optimal offloading decision must be context-aware, integrating not just latency and energy models but also dynamic thresholds that reflect the relative priority of these metrics based on the application’s needs. Through simulations, we demonstrate that our approach reduces the energy consumption of IoT devices while improving task allocation and overall system latency. These results pave the way for smarter and more efficient resource management in next-generation MEC infrastructure.

The rest of this paper is organized as follows. Section 2 presents related works. Section 3 presents and discusses our proposed solution. Section 4 presents the results of our simulations, Section 5 discusses these results, and Section 6 concludes the paper.

2. State of the Art

Task offloading in Mobile Edge Computing is crucial for improving computational efficiency, minimizing latency, and boosting energy efficiency for mobile devices. Despite the introduction of various strategies and techniques, key challenges such as resource allocation, scalability, and mobility remain unresolved. This section reviews recent works that address these challenges. Offloading strategies include full, partial, and collaborative offloading models, while decision-making approaches use heuristic and machine learning methods to optimize offloading decisions. Some works focus on resource management [19,20], and others on latency and energy efficiency [20,21]. Recently, some works have included parallelism techniques [15,16] and/or artificial intelligence techniques such as Machine Learning models for intelligent decision-making.

Regarding resource allocation, Minwoo Kim et al. [20] tackle the challenge of optimizing three key factors that shape users’ quality of experience in an MEC network: battery usage, computation latency, and communication latency. They formulate an NP-hard combinatorial problem that optimizes these three aspects and derive a closed-form solution for resource allocation. They then propose a pricing-based approach for user association, which minimizes communication overhead. With language-based AI incorporated in the simulation process, the results demonstrate that the proposed method significantly outperforms the baseline models, achieving up to 1.62 times shorter latency and 41.2% lower energy consumption per task. These results also indicate that as the number of edge servers increases, the proposed method maintains its robustness and effectively reduces latency in various network conditions. However, this study does not consider constraints that might arise from the physical limitations of edge servers, such as computational capacity or energy availability during peak usage. The work also focuses on static network scenarios, whereas rapidly changing environments may require adaptive strategies that can respond to fluctuations in user demands and network loads.

To minimize energy consumption and execution latency, Abbas et al. [8] proposed a heuristic-based approach using variants of well-known algorithms such as Ant Colony Optimization, the Whale Optimization Algorithm, and Grey Wolf Optimization. These algorithms are adapted to determine an optimal offloading strategy that balances the computation load between mobile devices and MEC servers. The proposed method evaluates trade-offs between local execution and offloading, ensuring improved response times and energy efficiency. The simulation results demonstrate that Grey Wolf Optimization outperforms the others in terms of energy efficiency and latency reduction. However, the proposed metaheuristic algorithms are inherently sequential, limiting their efficiency in large-scale MEC environments. Parallelizing these algorithms (e.g., parallel swarm intelligence techniques) could significantly accelerate the offloading decision-making process, making it more suitable for real-time applications. In addition, the evolutionary nature of metaheuristic algorithms increases computational overhead, making real-time offloading decisions challenging for delay-sensitive applications. Wang et al. [22] proposed an energy-efficient offloading scheme for edge computing which divides tasks into subtasks assigned to multiple servers based on characteristics such as distance and CPU capacity. Using the Hungarian algorithm, the Multi-subTasks-to-Multi-Servers (MTMS) scheme optimizes subtask allocation to minimize latency and energy consumption. Simulations demonstrate that MTMS significantly reduces delay and energy usage compared to non-assignment, greedy, and random methods, making it suitable for large-scale IoT environments. However, the scheme’s complexity and reliance on accurate server information may limit scalability. In the same way, Gu et al. [23] present a comprehensive optimization approach aimed at minimizing energy consumption while respecting a predefined execution time constraint. The authors formulate an optimization problem that jointly considers communication and computational resources, obtaining analytical solutions for the optimal transmission power, offload ratio, and rate. The strategy distinguishes between full offloading and partial offloading based on channel conditions, introducing a channel gain threshold to determine the optimal offloading decision. The proposed approach adapts to varying system parameters, such as latency constraints, task complexity, and local computing power, ensuring energy efficiency and the timely completion of tasks. The simulation results validate the effectiveness of the strategy, showing significant energy savings and reduced latency. However, because the approach relies on a single edge server, it may not scale well in scenarios with numerous mobile devices or fluctuating demands. As more devices connect and offload tasks, the edge server may become overwhelmed, leading to increased latency and reduced energy savings.

Many works follow a similar process. Beck et al. [24] introduced a coordinate descent algorithm that iteratively refines the locally optimal solution by modifying the binary offloading decision of one user at a time. The method employs convex relaxation techniques to manage binary variables but faces challenges in achieving a balance between performance and computational complexity. Michael P. et al. [25] propose an optimization framework that integrates computation offloading with resource scheduling, using a multiobjective approach to balance latency and energy trade-offs. The solution leverages convex optimization techniques to dynamically allocate computational resources and determine optimal offloading decisions for user devices. In [26], the authors study and compare three scenarios: full offloading, partial offloading, and D2D-enabled partial offloading. For full offloading, a heuristic algorithm is introduced to reduce computational complexity while achieving suboptimal solutions. Building on this, an algorithm for partial offloading is proposed, and the D2D-enabled scenario is broken into two subproblems, which are solved independently and then combined to form an improved solution to the original problem. The numerical results demonstrate the effectiveness of the proposed algorithms in reducing system delay, particularly in high-density computing task environments. Hu et al. [27] investigate a hybrid edge and central cloud computing architecture in heterogeneous cellular networks, focusing on wireless backhaul optimization. The authors propose QoS-aware resource allocation algorithms and analyze the performance of the scheme through simulations, demonstrating that integrated edge-central schemes outperform isolated approaches, especially with low-complexity massive MIMO backhaul solutions.
Various optimization techniques, such as linear relaxation [28] and quadratic approximation [29], have been employed to handle binary variables. However, these methods often face challenges in balancing performance and computational complexity, particularly when managing integer variables. As a result, they do not consistently deliver high-quality solutions in dynamic environments, making them less suitable for real-time decision-making. Recently, Eang et al. [30] introduced a Lagrange Duality Resource Optimization Algorithm (LDROA) for efficient task offloading and resource allocation in Mobile Edge Computing environments. The proposed approach demonstrates significant improvements over conventional methods like Greedy Latency, Load Balancing, and Random Offloading. Specifically, LDROA achieves better cost efficiency and lower latency as the number of tasks increases. The algorithm intelligently adjusts resources for each task, minimizing both computation costs and latency while adhering to the IoT application time constraints. The experimental results show that LDROA outperforms existing schemes in varying numbers of MEC servers, user device capacities, and congestion conditions. For instance, it reduces total latency compared to Greedy Latency and lowers costs compared to Random Offloading. The method also considers energy consumption, enhancing the battery life of user devices. The algorithm’s performance relies on accurate Lagrange multiplier tuning, which may require significant computational overhead and expertise for large-scale deployments. Based on parallel or distributed techniques and Machine Learning algorithms, many studies have been conducted [31,32,33,34,35]. In [31], the authors propose a framework that integrates wireless power transfer with binary offloading, enabling efficient resource allocation and task execution. 
A key feature of the approach is its ability to dynamically adjust resource allocation based on real-time demands, leveraging deep neural networks (DNNs) to optimize offloading decisions. Although they obtained significant results, the reliance on DNNs for decision-making can introduce computational overhead, particularly in scenarios with a high volume of tasks, limiting their scalability in large-scale MEC environments. Also, Perin et al. [36] proposed a decentralized predictive optimization framework for MEC networks with energy-harvesting edge servers, combining Model Predictive Control (MPC) with a customized Distributed Regularized Splitting (DRS) algorithm. The approach improves energy efficiency and workload management by dynamically optimizing load and consolidating the servers under varying network conditions. The approach reduces energy consumption, dependence on cloud resources, and the overall carbon footprint. The framework shows rapid convergence with convex cost functions at high loads and maintains fairness and efficiency across different prediction horizons. However, it introduces added complexity and computational overhead due to its predictive and decentralized nature. Its effectiveness also depends on the accuracy of the models and predictions, which can be compromised by unpredictable energy or network fluctuations. Furthermore, non-convex cost functions can slow convergence, limiting the applicability of the model in real-time scenarios. The work in [32] proposes a Green and Sustainable Mobile Edge Computing Framework to address energy sustainability and timeliness in IoT systems constrained by delay. The framework integrates Energy Harvesting Technologies to enable IoT devices to self-power using ambient energy, reducing battery maintenance costs. A dynamic parallel computing offloading and energy management algorithm is designed to minimize task response time and packet losses while ensuring energy queue stability. 
The algorithm leverages Lyapunov optimization to dynamically allocate resources, including energy harvesting, transmission power, CPU frequency, and offloading decisions. The model assumes a quasistatic environment, which may not hold in dynamic real-world scenarios with fluctuating energy availability and varying channel conditions. DongYu Lua et al. [35] present a deep reinforcement learning framework to enhance network function parallelism. The authors utilize Network Function Virtualization to optimize the deployment of Service Function Chains which play a crucial role in real-time network services such as firewalls, encryption, and intrusion detection. The approach is to adopt a Double Deep Q-Network algorithm to tackle the Fairness-Aware Throughput Maximization Problem. Through the reuse of initialized Virtual Network Functions and optimizing resource allocation, this improves throughput, reduces service delays, and maintains Quality of Service standards. As in other methods, the high computational overhead in Large-Scale Deployments is the main drawback of this method.

From the various studies analyzed, task offloading in Mobile Edge Computing remains a critical challenge, primarily due to the trade-offs between latency, energy consumption, and computational efficiency. While some approaches focus solely on energy optimization, others emphasize latency reduction, and few integrate both aspects as seen in Table 1. Additionally, recent advancements explore parallel processing and intelligent decision-making using Machine Learning to enhance offloading efficiency. However, many existing solutions fail to achieve a holistic balance between performance, resource utilization, and dynamic network conditions. This review highlights the gaps in current approaches, such as inefficient edge workload balancing, the lack of adaptive mechanisms for varying network loads, and the absence of integrated models that dynamically optimize both energy and latency. These limitations justify the need for a more intelligent and adaptive task-offloading strategy that efficiently distributes workloads while maintaining system performance. Our proposed approach addresses these challenges by using Machine Learning for decision-making and introducing an adaptive mechanism for selecting edge servers based on their computational capacities and network conditions.

3. The Proposed Scheme

In this paper, we propose a novel approach to enable IoT objects to make intelligent offloading decisions in a Mobile Edge Computing environment. First, we establish an analytical formula that evaluates the latency of executing a task locally versus offloading that task. This formula takes into account the possibility of executing the task in parallel across several edge servers, intelligently selected to ensure good load balancing and optimize the overall execution time. This modeling is essential, because a poor choice of task placement can not only lead to congestion on certain servers but also limit the expected performance gains. Next, we propose a second analytical formula for evaluating the energy consumption of local execution versus offloaded execution. The aim here is to minimize the energy consumption of IoT objects, which often have limited energy resources, while guaranteeing efficient task execution. However, optimizing latency alone or energy alone can lead to sub-optimal decisions. This is why we introduce a machine learning model that integrates these two criteria in order to optimize decision-making according to the execution context. This model enables the adoption of a more refined and adaptive strategy, guaranteeing a good compromise between execution speed and energy efficiency. Figure 2 describes our solution. In Figure 2, if the IoT object has the resources required to perform a task (IOT 1 and IOT 3), then that object performs the task locally. If an IoT object does not have enough resources to perform the task (IOT 2 and IOT 4), then the task is offloaded to the edge layer. In the edge layer, if the edge server receiving the task can perform that task locally and reduce overall latency and energy without being overloaded (ES5), it performs the task locally and returns the results to the IoT object.
However, if it is better to execute that task in parallel among other edge servers, then the edge server receiving the task will select the best edge servers and distribute the task among them (ES3). In Section 3.2, we detail the analytical modeling of latency and the approach adopted to integrate parallelism at the edge layer. Next, in Section 3.3 we introduce the analytical modeling of energy consumption, followed by a description of our machine learning model and its training process in Section 3.4.

3.1. Notations

Notations and symbols used to describe the proposed solution are presented in Table 2.

3.2. Latency-Based Offloading Decision

Offloading a task from an IoT device to an MEC server is an effective strategy to reduce the execution time of resource-intensive applications. However, this approach is beneficial in terms of latency only under certain specific conditions. The benefits of offloading depend mainly on the relationship between local computing time and total remote processing time, which includes data transmission time (to send and receive data) and MEC server computing time. When the computing power of the IoT device is insufficient to process the task in a reasonable time, sending the task to an edge server with higher capacity can significantly improve the latency perceived by the user. However, if the network load is high or if the MEC server is already saturated, the additional cost in terms of communication and processing time may reverse the expected gains and make local execution more advantageous. Furthermore, for tasks of low complexity or those that require real-time interaction with the user, the additional cost of sending data over the network can be prohibitive.

The decision to offload must therefore be based on a dynamic assessment of latency, incorporating parameters such as network load, the state of the MEC servers, and the nature of the task to be executed. In our analytical framework, we deliberately abstract away several secondary latency components to maintain tractability while capturing the dominant performance factors. For local execution, we exclude flash or disk access overheads, namely, seek, read, and write times, because IoT endpoints typically leverage in-memory or flash-cached computations, rendering these delays negligible compared to processor cycle costs.

Similarly, when offloading, we do not model buffering delays, serialization/deserialization overheads, TLS handshake and connection setup times, or queuing at the network and server stack. These effects are highly variable and often insignificant once amortized through persistent connections and batching; modeling them would complicate the model without substantially enhancing its accuracy for the wide range of MEC scenarios under consideration. Let $T_{iot}$ be the processing time of a task locally on an IoT object, and $T_{off}$ the processing time of the task when it is offloaded onto the edge layer. Equation (1) states the conditions under which offloading is or is not beneficial.

$$\begin{cases} T_{off} < T_{iot}, & \text{offloading is beneficial in terms of latency} \\ T_{off} \geq T_{iot}, & \text{local processing is beneficial in terms of latency} \end{cases} \tag{1}$$

The time taken by an IoT object to process the task locally ($T_{iot}$) depends on the processor frequency of that object ($f_{iot}$, in GHz) and the total workload of the task to be executed ($C_{tot}$, in number of cycles). Therefore, $T_{iot}$ can be calculated using Equation (2).

$$T_{iot} = \frac{C_{tot}}{f_{iot}} \tag{2}$$
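As a quick worked instance of Equation (2), consider illustrative values that are assumptions and not taken from the paper: a task of $C_{tot} = 2 \times 10^{9}$ cycles on a device clocked at $f_{iot} = 1$ GHz. Since the frequency is quoted in GHz, it has to be converted to cycles per second ($1\ \text{GHz} = 10^{9}$ cycles/s) for the result to come out in seconds:

```latex
T_{iot} = \frac{C_{tot}}{f_{iot}}
        = \frac{2 \times 10^{9}\ \text{cycles}}{1 \times 10^{9}\ \text{cycles/s}}
        = 2\ \text{s}
```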

The latency for processing a task when it is offloaded ($T_{off}$) depends on three parameters: the time to send data to the edge layer ($T_{upl}$), the time to process data on the edge layer ($T_{proc}$), and the time to return the result to the IoT object ($T_{down}$). Therefore, we can estimate the processing time of a task when it is offloaded using Equation (3).

$$T_{off} = T_{upl} + T_{proc} + T_{down} \tag{3}$$

Let $Z$ be the size of the data to be offloaded, $Z^{'}$ the size of the result, $R_{upl}$ the available network throughput during the offloading process, $R_{down}$ the effective network throughput when the result is sent back to the IoT object, and $G$ the congestion coefficient. Using the Shannon capacity model [37], we can evaluate the maximum achievable throughput of the network with Equation (4). Therefore, $R_{upl}$ and $R_{down}$ are computed using Equation (5).

$$G_{max} = B \times \log_{2}\left(1 + \frac{P \times {|h|}^{2}}{N_{0}}\right) \tag{4}$$

where $G_{max}$ is the maximum achievable network throughput, $B$ is the transmission bandwidth, $P$ is the transmission power, ${|h|}^{2}$ represents the channel condition, and $N_{0}$ is the noise power spectral density.

$$\begin{aligned} R_{upl} &= G_{max} (1 - G_{upl}) \\ R_{down} &= G_{max} (1 - G_{down}) \end{aligned} \tag{5}$$
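Equations (4) and (5) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the numeric values, and the assumption that the congestion coefficients $G_{upl}$ and $G_{down}$ lie in $[0, 1)$ are all hypothetical choices made for the example.

```python
import math


def max_throughput(bandwidth_hz: float, tx_power: float,
                   channel_gain: float, noise: float) -> float:
    """Equation (4): G_max = B * log2(1 + P * |h|^2 / N_0)."""
    return bandwidth_hz * math.log2(1.0 + tx_power * channel_gain / noise)


def effective_rate(g_max: float, congestion: float) -> float:
    """Equation (5): R = G_max * (1 - G).

    Assumption (not stated in the text): the congestion coefficient G
    lies in [0, 1), so that the effective rate stays strictly positive.
    """
    if not 0.0 <= congestion < 1.0:
        raise ValueError("congestion coefficient must lie in [0, 1)")
    return g_max * (1.0 - congestion)


# Illustrative values only: 10 MHz bandwidth and a ratio P*|h|^2/N_0 of 100.
g_max = max_throughput(bandwidth_hz=10e6, tx_power=0.1,
                       channel_gain=1e-6, noise=1e-9)
r_upl = effective_rate(g_max, congestion=0.2)   # uplink, G_upl = 0.2
r_down = effective_rate(g_max, congestion=0.1)  # downlink, G_down = 0.1
print(f"G_max = {g_max:.3e} bit/s, R_upl = {r_upl:.3e}, R_down = {r_down:.3e}")
```

With these inputs the uplink is more congested than the downlink, so `r_upl` comes out lower than `r_down` even though both share the same `g_max`.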

Therefore, $T_{upl}$ and $T_{down}$ can be computed with Equations (6) and (7), respectively.

$$T_{upl} = \frac{Z}{R_{upl}} \tag{6}$$

with $Z$ the size of the data to be offloaded;

$$T_{down} = \frac{Z^{'}}{R_{down}} \tag{7}$$

with $Z^{'}$ the size of the result sent back to the IoT object.
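Putting Equations (1), (2), (3), (6), and (7) together, the latency side of the offloading decision can be sketched as follows. The function names and all numeric values are hypothetical, frequencies are expressed in Hz (cycles per second) rather than GHz so that times come out in seconds, and $T_{proc}$ is passed in as a given value because its computation depends on the edge-side strategy.

```python
def local_latency(c_tot_cycles: float, f_iot_hz: float) -> float:
    """Equation (2): T_iot = C_tot / f_iot."""
    return c_tot_cycles / f_iot_hz


def offload_latency(z_bits: float, z_result_bits: float,
                    r_upl: float, r_down: float, t_proc: float) -> float:
    """Equation (3), with T_upl and T_down from Equations (6) and (7)."""
    t_upl = z_bits / r_upl            # Equation (6)
    t_down = z_result_bits / r_down   # Equation (7)
    return t_upl + t_proc + t_down    # Equation (3)


def offloading_reduces_latency(t_off: float, t_iot: float) -> bool:
    """Equation (1): offloading is beneficial only if T_off < T_iot."""
    return t_off < t_iot


# Illustrative values: a 2e9-cycle task on a 1 GHz device, 8 Mbit of input,
# 0.8 Mbit of result, 40 Mbit/s in both directions, 0.5 s of edge processing.
t_iot = local_latency(c_tot_cycles=2e9, f_iot_hz=1e9)
t_off = offload_latency(z_bits=8e6, z_result_bits=8e5,
                        r_upl=40e6, r_down=40e6, t_proc=0.5)
print(t_iot, t_off, offloading_reduces_latency(t_off, t_iot))
```

Note that a tie ($T_{off} = T_{iot}$) resolves to local execution, matching the $\geq$ branch of Equation (1).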

When a task is offloaded to an MEC server, the latter must decide on the best execution strategy to optimize the computing time ($T_{proc}$). There are two options: execute the task locally on the server receiving the request, or execute it in parallel with other available edge nodes. This decision is based primarily on a dynamic assessment of the available resources and the cost of communication between the nodes. If the local MEC server has sufficient computing power and a low workload, local execution can minimize processing latency by avoiding the delays associated with distributing and synchronizing sub-tasks. On the other hand, for complex tasks that require a long processing time, parallel execution becomes a viable alternative, enabling intelligent load distribution and a reduction in total execution time. However, this approach introduces an additional cost associated with data exchanges between edge servers, making it essential to select the best participating nodes according to their computing capacity, availability, and estimated communication time. The decision must therefore be based on an adaptive model that strikes a balance between the performance gains from parallelism and the cost of task distribution. Let $T_{esLocal}$ be the time taken to process the task locally on the edge server that received the task, and $T_{par}$ the time taken to process the task in parallel. Equation (8) states the conditions under which executing the task in parallel among the edge servers is or is not beneficial.

$$\begin{cases} T_{par} < T_{esLocal}, & \text{parallel processing is beneficial in terms of latency} \\ T_{par} \geq T_{esLocal}, & \text{local processing is beneficial in terms of latency} \end{cases} \tag{8}$$

3.2.1. Computation of the Local Execution Time on the Edge Server $T_{e s L o c a l}$

Let C be the number of processor cycles required to perform the task, and

f_{e s L o c a l}

the processor frequency of the edge server that is to process the task locally. The time required to perform this task is estimated using Equation (9).

$T_{esLocal} = \frac{C}{f_{esLocal}}$

(9)

3.2.2. Computation of the Parallel Execution Time $T_{p a r}$

Running the task in parallel between edge servers can be beneficial if the task to be performed is very large. However, only certain nodes should be selected for parallel execution. As a result, the main edge server receiving the task must choose which of the available edge servers it can collaborate with to implement the parallel execution of tasks, taking the following into consideration:

The computing power of the servers;
The communication time (round trip time) between the main edge server and other edge servers;
Workload balancing between the chosen edge servers.

The aim is to avoid overloaded edge servers (which could delay execution) and ensure efficient balancing while minimizing total processing time.

(i) Notations
Let us consider the following notations:
– $S$: The set of edge servers;
– $S^{*}$: The set of selected edge servers ($S^{*} \subset S$);
– $f_{i}$: Processor frequency of edge server $i$ (in GHz);
– $L_{i}$: Workload already present on node $i$ (in number of cycles);
– $L_{max}$: Maximum workload that a node can manage;
– $T_{com,i}$: Communication time between the main edge server and another edge server $i$;
– $C_{tot}$: Total workload of the task to be executed;
– $w_{i}$: Computed weight of an edge server $i$, representing its priority in case of the parallel computation of a task.
(ii) Computation of processing time
The parallel calculation time corresponds to the calculation time of the slowest edge server among the edge servers selected to process the task. This includes the communication time between the main edge server and each node, added to the calculation time on each node, taking into account the workload already present and the workload allocated by the main edge server. Consequently, the optimum choice of edge servers should minimize the total parallel execution time (Equation (10)).

$T_{par} = \min \left[ \max_{i \in S^{*}} \left( \frac{C_{i} + L_{i}}{f_{i}} + T_{com,i} \right) \right]$

(10)

The constraints in Equation (10) are provided in Equation (11).

$\begin{matrix} C_{i} = C_{tot} \times \frac{w_{i}}{\sum_{j \in S^{*}} w_{j}} \\ \text{where} \\ w_{i} = \frac{f_{i}}{T_{com,i} \times \left(1 + \frac{L_{i}}{L_{max}}\right)} \end{matrix}$

(11)

Therefore, the following hold:
– If $L_{i}$ is large, then $w_{i}$ decreases, so edge server $i$ has lower priority;
– If $T_{com,i}$ is large, then $w_{i}$ decreases; this avoids edge servers that are too far away;
– If $f_{i}$ is large, then $w_{i}$ increases, which favors powerful edge servers.
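As an illustration, Equations (10) and (11) can be sketched in Python for one fixed candidate set $S^{*}$. This is a minimal sketch under stated assumptions, not the authors' implementation: the `EdgeServer` class, the function names and the two-server inputs are hypothetical, and frequencies are expressed in cycles per second so that $(C_{i} + L_{i}) / f_{i}$ comes out in seconds. The outer minimization of Equation (10) over candidate sets is not performed here.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    f: float      # processor frequency f_i, in cycles per second
    load: float   # existing workload L_i, in cycles
    t_com: float  # communication time T_com,i, in seconds


def weight(server, l_max):
    """Equation (11): w_i = f_i / (T_com,i * (1 + L_i / L_max))."""
    return server.f / (server.t_com * (1.0 + server.load / l_max))


def shares(selected, c_tot, l_max):
    """Equation (11): C_i = C_tot * w_i / sum of w_j over the selected set."""
    w = [weight(s, l_max) for s in selected]
    total = sum(w)
    return [c_tot * wi / total for wi in w]


def parallel_time(selected, c_tot, l_max):
    """Inner max of Equation (10): finishing time of the slowest selected server."""
    c = shares(selected, c_tot, l_max)
    return max((ci + s.load) / s.f + s.t_com for ci, s in zip(c, selected))


# Hypothetical inputs: two idle servers, one twice as fast as the other.
servers = [EdgeServer(4e9, 0.0, 0.01), EdgeServer(2e9, 0.0, 0.01)]
c = shares(servers, c_tot=6e8, l_max=1e9)
t_par = parallel_time(servers, c_tot=6e8, l_max=1e9)
print(c, t_par)
```

With these inputs the faster server receives twice the share of the slower one, so both finish at the same time, which is the balancing behavior the weights are meant to produce.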
From Equations (3), (6), (7), (9) and (10), we can compute the total time required when a task is offloaded ( $T_{o f f}$ ) and executed either locally on the edge server receiving the task, or in parallel. This is given by Equation (12).

$T_{off} = \begin{cases} \frac{Z}{R_{upl}} + \frac{C}{f_{esLocal}} + \frac{Z^{'}}{R_{down}} & \text{if the main server performs the task} \\ \frac{Z}{R_{upl}} + \min \left[ \max_{i \in S^{*}} \left( \frac{C_{i} + L_{i}}{f_{i}} + T_{com,i} \right) \right] + \frac{Z^{'}}{R_{down}} & \text{if the task is executed in parallel} \end{cases}$

(12)
(iii) Dynamic computation of the number of edge servers
In order to select the best possible edge servers for parallel computing and avoid over- or under-using the other edge servers, we introduce Equation (13).

$\begin{matrix} S^{*} = \{ i \in S \mid w_{i} \geq \alpha \times w_{max} \} \\ \text{where} \\ w_{max} = \max_{i \in S} (w_{i}) \\ \alpha \text{ is the dynamic selection threshold (e.g., 0.7 or 70\%)} \end{matrix}$

(13)
(iv) Proof of concept
Let us consider the following data:
– There are 10 edge servers;
– $C_{tot} = 10^{9}$ cycles;
– $L_{max} = 3 \times 10^{9}$ cycles;
– Processor frequencies in GHz: $f_{i}$ = [3; 4; 5; 2.5; 6; 3.2; 4.5; 5.5; 2.8; 6.1];
– Existing workload on edge servers in number of cycles: $L_{i}$ = [1.2; 0.8; 2.5; 1.1; 0.4; 2.1; 1.8; 0.7; 1.0; 0.3] $\times 10^{9}$;
– Communication time in seconds: $T_{com,i}$ = [0.02; 0.015; 0.018; 0.03; 0.012; 2.025; 0.017; 0.022; 0.028; 0.010].
(a) Step 1: Computation of weights $w_{i}$ and selection of the best edge servers
Equation (11) gives the formula for $w_{i}$, with $i \in S$. Table 3 gives the value of $w_{i}$ for each edge server.
If we consider $\alpha = 37\%$, then $\alpha \times w_{max} = 0.37 \times 554.55 = 205.18$. The edge servers selected for parallel computation are therefore those with a weight $w_{i}$ greater than or equal to 205.18: edge server 2 ($w_{2} = 210.52$), edge server 5 ($w_{5} = 428.57$) and edge server 10 ($w_{10} = 554.55$).
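The weight computation of Equation (11) and the threshold selection of Equation (13) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the weights are recomputed from the inputs listed in this proof of concept rather than copied from Table 3, and frequencies are kept in GHz as in the worked example (only the relative size of the weights matters for the selection).

```python
# Inputs listed in the proof of concept (workloads in units of 1e9 cycles).
F_GHZ = [3, 4, 5, 2.5, 6, 3.2, 4.5, 5.5, 2.8, 6.1]
LOAD = [1.2, 0.8, 2.5, 1.1, 0.4, 2.1, 1.8, 0.7, 1.0, 0.3]
T_COM = [0.02, 0.015, 0.018, 0.03, 0.012, 2.025, 0.017, 0.022, 0.028, 0.010]
L_MAX = 3.0   # same unit as LOAD
ALPHA = 0.37  # dynamic selection threshold


def compute_weights(freqs, loads, t_coms, l_max):
    """Equation (11): w_i = f_i / (T_com,i * (1 + L_i / L_max))."""
    return [f / (t * (1.0 + l / l_max)) for f, l, t in zip(freqs, loads, t_coms)]


def select_servers(weights, alpha):
    """Equation (13): keep servers with w_i >= alpha * w_max (1-based ids)."""
    threshold = alpha * max(weights)
    return [i + 1 for i, w in enumerate(weights) if w >= threshold]


weights = compute_weights(F_GHZ, LOAD, T_COM, L_MAX)
selected = select_servers(weights, ALPHA)
print(selected)  # [2, 5, 10]
```

Raising `ALPHA` shrinks the selected set toward the single best server, while lowering it admits more servers at the price of including slower or more distant ones.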
(b) Step 2: Distribution of the workload and computation of the execution time in parallel
Equation (11) provides the formula needed to compute the workload of each edge server, with $\sum w_{j} = w_{2} + w_{5} + w_{10} = 1193.64$. The execution time is then computed using Equation (10). Table 4 gives the values of $C_{i}$ and $T_{i}$ for each selected edge server.
From Equation (10), we can conclude that $T_{par} = \max [0.184; 0.139; 0.138] = 0.184$ s. Since the calculation times are close to one another on the selected edge servers, the balancing is successful.
(c) Step 3: Decision
* If we assume that the processor frequency of the main edge server that received the request is 3 GHz ($f_{esLocal} = 3$ GHz), then $T_{esLocal} = \frac{10^{9}}{3 \times 10^{9}} = 0.33$ s. From Equation (8), $T_{par} < T_{esLocal}$ because 0.184 s < 0.33 s; therefore, parallel execution is beneficial in this case;
* According to Equation (3), $T_{off} = T_{upl} + T_{proc} + T_{down}$; if we consider $T_{upl} = 0.05$ s and $T_{down} = 0.03$ s, then $T_{off} = 0.05 + 0.184 + 0.03 = 0.264$ s;
* Let us consider $f_{iot} = 1.5$ GHz; then $T_{iot} = \frac{10^{9}}{1.5 \times 10^{9}} = 0.67$ s;
* In conclusion, since $T_{off} < T_{iot}$ (0.264 s < 0.67 s), offloading this task is beneficial in terms of latency according to Equation (1).
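The arithmetic of this decision step can be checked with a few lines of Python. The sketch below takes $T_{par} = 0.184$ s as given from step 2 and uses the frequencies and transfer times assumed in the text; the function and variable names are hypothetical.

```python
def execution_time(cycles, freq_hz):
    """Equations (2) and (9): cycles divided by processor frequency."""
    return cycles / freq_hz


C_TOT = 1e9          # cycles required by the task
T_PAR = 0.184        # parallel time reported in step 2, in seconds
T_UPL, T_DOWN = 0.05, 0.03

t_es_local = execution_time(C_TOT, 3e9)    # main edge server at 3 GHz
parallel_is_better = T_PAR < t_es_local    # Equation (8)
t_proc = T_PAR if parallel_is_better else t_es_local

t_off = T_UPL + t_proc + T_DOWN            # Equation (3)
t_iot = execution_time(C_TOT, 1.5e9)       # IoT device at 1.5 GHz
offload = t_off < t_iot                    # Equation (1)

print(round(t_es_local, 2), round(t_off, 3), round(t_iot, 2), offload)
```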

3.3. Energy Consumption-Based Offloading Decision

Offloading a task to the edge layer can be an effective way to save energy on IoT devices, which are often constrained by limited batteries. However, this approach is only advantageous in energy terms if the overall energy cost of offloading is lower than that of local execution. This is because the energy consumption of an IoT device comprises several factors: the energy required to execute the task locally, the energy consumed in deciding to offload, and the energy spent transmitting the data to the MEC server and receiving the result of the processing. When the task is computationally intensive, running it locally can lead to excessive energy consumption due to prolonged use of the processor. In this case, outsourcing the computation to a more powerful and energy-optimized edge server can improve the longevity of the terminal battery. However, if the size of the data to be transferred is large or if the network is congested, the energy cost of transmitting the data may outweigh the savings made on computing, making offloading counterproductive. In addition, the energy efficiency of the MEC server plays a decisive role: an overloaded or energy-inefficient edge server could reverse the expected benefits. Consequently, the decision to offload a task must be based on a detailed analysis of local energy consumption, the energy cost of network transmissions, and the load and efficiency conditions of the MEC layer.

Let us consider $E_{iot}$ as the energy consumed to execute the task locally, $E_{offl}$ the energy consumed when the task is offloaded, and $E_{decision}$ the energy consumed by the IoT device to execute the decision-making algorithm. The total energy consumed by an IoT object for a task T is defined by Equation (14).

$E_{total} = E_{decision} + \begin{cases} E_{iot} & \text{if } E_{iot} < E_{offl} \\ E_{offl} & \text{if } E_{iot} > E_{offl} \end{cases}$

(14)

Since, in this context, we are interested in the energy consumption of the IoT object, the energy consumed by the task offloading is the sum of the energy consumed to send the task to the edge layer and the energy consumed to receive the result. For the decision algorithm, this requires computing time on the IoT CPU. Consequently, the energy consumed by the IoT object during decision making is given by Equation (15).

$\begin{matrix} E_{decision} = P_{iot} \times T_{decision} \\ \text{where} \\ P_{iot} \text{ is the IoT object processor power,} \\ T_{decision} \text{ is the time needed to make the decision.} \end{matrix}$

(15)

On the one hand, if the task is carried out locally on the IoT object, the energy consumed depends on the use of all the components involved in carrying out the task. Thus, let $P_{i}$ be the power consumed by component $i$ (CPU, memory, etc.), and $T_{i}$ the time component $i$ is used during the execution of the task. The energy consumed by the IoT object when it performs the task locally is therefore given by Equation (16).

$E_{iot} = \sum_{i} P_{i} \times T_{i}$

(16)

On the other hand, if the task is offloaded to the edge layer, then the energy consumed by the IoT object depends on the energy consumed to transmit the data ($E_{tx}$), the energy consumed to receive the result ($E_{recep}$) and the energy consumed while waiting for the result ($E_{idle}$). Consequently, the energy consumed by the IoT object when the task is offloaded can be calculated using Equation (17).

$\begin{matrix} E_{offl} = E_{tx} + E_{idle} + E_{recep} \\ \text{where} \\ E_{tx} = P_{tx} \times T_{tx} \\ E_{recep} = P_{recep} \times T_{recep} \\ E_{idle} = P_{idle} \times T_{idle} \end{matrix}$

(17)

In Equation (17), $P_{tx}$ is the power used by the IoT object to transmit data to the edge layer, while $T_{tx}$ is the time spent during transmission. Likewise, $P_{recep}$ is the power used by the IoT object to receive the result from the edge layer, while $T_{recep}$ is the time spent during reception. $P_{idle}$ is the power used by the IoT object while waiting for the result of the edge layer, and $T_{idle}$ is the time taken by the edge layer to perform the task. To compute $T_{tx}$ and $T_{recep}$, we use Equations (6) and (7), respectively.
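The energy model of Equations (14) to (17) can be sketched in Python as shown below. This is a hedged illustration: the function names and every numeric value (powers in watts, times in seconds) are hypothetical and chosen only to exercise the formulas. Using `min` in `total_energy` also covers the case $E_{iot} = E_{offl}$, where both branches of Equation (14) give the same value.

```python
def decision_energy(p_iot, t_decision):
    """Equation (15): energy spent running the decision algorithm."""
    return p_iot * t_decision


def local_energy(components):
    """Equation (16): sum of P_i * T_i over (power, time) component pairs."""
    return sum(p * t for p, t in components)


def offload_energy(p_tx, t_tx, p_idle, t_idle, p_recep, t_recep):
    """Equation (17): transmit + idle (waiting) + receive energy."""
    return p_tx * t_tx + p_idle * t_idle + p_recep * t_recep


def total_energy(e_decision, e_iot, e_offl):
    """Equation (14): decision cost plus the cheaper execution option."""
    return e_decision + min(e_iot, e_offl)


# Hypothetical device: CPU and memory active for 0.5 s in local execution.
e_dec = decision_energy(p_iot=0.5, t_decision=0.002)
e_iot = local_energy([(0.9, 0.5), (0.1, 0.5)])
e_offl = offload_energy(p_tx=1.0, t_tx=0.05, p_idle=0.1, t_idle=0.2,
                        p_recep=0.8, t_recep=0.03)
e_total = total_energy(e_dec, e_iot, e_offl)
print(e_dec, e_iot, e_offl, e_total)
```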

3.4. A Machine Learning Approach for Global Offloading Decision

In an MEC environment, the decision to offload tasks by an IoT object must be optimized to minimize both execution latency and energy consumption. To achieve this, we propose an approach based on machine learning that enables an IoT object to dynamically assess whether it should execute a task locally or offload it to the edge layer. Our solution is based on a classification model that uses a set of characteristics, such as the computing power of the IoT and edge servers, network congestion, the size of the data exchanged, and the available bandwidth. The machine learning model does not learn static rules. It is trained on labels generated by our analytical latency and energy models, which incorporate dynamic mechanisms such as adaptive parallelism and load balancing. The proposed approach thus makes it possible to improve resource management while reducing the latency and energy consumption of IoT objects, as demonstrated by the experimental evaluations presented in this article.

3.4.1. Problem Formulation

The aim is to enable an IoT object to decide intelligently whether to execute a task locally or offload it to the edge layer, minimizing both latency and energy consumption. In this way, the IoT object makes an offloading decision based on a Machine Learning model that predicts whether executing locally or offloading the task to the edge layer is more advantageous depending on the two criteria (latency and energy). We then formulate the following problem as a binary classification:

Class 1 (Offloading recommended): if sending to the edge layer is more advantageous in terms of latency and energy;
Class 0 (local execution preferred): if local execution is more efficient.

3.4.2. Dataset Definition

To train a Machine Learning model, we need to collect a representative set of data. Each training example corresponds to a task to be executed, described by a set of parameters influencing latency and energy consumption. These parameters represent the characteristics of our dataset, which are described in Table 5. The first column of Table 5 gives the name of the feature, while the second column describes the elements that influence this feature.

Since we implement a binary classification model, the target variable $y$ can only take two values: $y = 1$ (meaning that offloading is preferable) and $y = 0$ (meaning that local execution is preferable). In Section 3.2 we computed the execution time of a task on the edge layer (Equation (3)) and the execution time of the same task locally on the IoT object (Equation (2)). Thereafter, in Section 3.3, we computed the energy consumed by an IoT object when the task is executed by it (Equation (16)) and the energy consumed when the task is offloaded (Equation (17)). In the best case, offloading is preferable ($y = 1$) if the execution time on the edge layer is less than the execution time on the IoT object, and the energy used to offload the task is less than the energy used to compute the task on the IoT object ($y = 1$ if $T_{off} < T_{iot}$ and $E_{offl} < E_{iot}$). However, in some cases, the execution time in offloading is less than the execution time in local, while the energy consumed in offloading is greater than the energy consumed when executing the task locally, and vice versa. So, to determine the different values of the target variable, we introduce a more nuanced decision, which depends on the following:

The relative importance of latency and energy for each task: some applications are more sensitive to latency (e.g., video streaming, telemedicine) while others prioritize energy savings (e.g., low-power IoT sensors);
The tolerance threshold to the degradation of a criterion: if energy increases slightly, but latency decreases significantly, offloading may be justified. Conversely, if latency increases but energy is reduced, this may be acceptable.

We therefore introduce two dynamic thresholds:

$γ_{l a t e n c y}$ : tolerance to increased latency;
$γ_{e n e r g y}$ : tolerance to increased energy consumption.

Parameters $\gamma_{latency}$ and $\gamma_{energy}$ are weights that represent the relative priority assigned to minimizing latency versus minimizing energy, or vice versa, in the integrated cost function used for labeling the ML model’s training data. Thus, Equation (18) defines the improvement factor for each criterion.

$\begin{matrix} R_{L} = \frac{T_{iot} - T_{off}}{T_{iot}} \\ R_{E} = \frac{E_{iot} - E_{offl}}{E_{iot}} \end{matrix}$

(18)

In our dataset, the target variable can therefore be defined by Algorithm 1 and described as follows:

Offloading is preferable ($y = 1$) if
- $T_{off} < T_{iot}$ and $E_{offl} < E_{iot}$ (lines 8 to 9 of Algorithm 1);
- OR $T_{off} < T_{iot}$, but $E_{offl} > E_{iot}$ and $R_{L} > \gamma_{latency}$: the improvement in latency is significant (lines 10 to 11 of Algorithm 1);
- OR $T_{off} > T_{iot}$, but $E_{offl} < E_{iot}$ and $R_{E} > \gamma_{energy}$: the reduction in energy consumption is significant (lines 12 to 13 of Algorithm 1).
Otherwise, local execution occurs on the IoT object ($y = 0$) (lines 14 and 15 of Algorithm 1).
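The labeling rule described above can be sketched as a small Python function. This follows the three conditions as stated in the text and does not reproduce the line numbering of Algorithm 1; the function name, argument names and example values are hypothetical.

```python
def offload_label(t_iot, t_off, e_iot, e_offl, gamma_latency, gamma_energy):
    """Return 1 if offloading is preferable, 0 for local execution."""
    # Equation (18): relative improvement in latency and energy.
    r_l = (t_iot - t_off) / t_iot
    r_e = (e_iot - e_offl) / e_iot

    # Offloading wins on both criteria.
    if t_off < t_iot and e_offl < e_iot:
        return 1
    # Faster but more energy-hungry: accept if the latency gain is large enough.
    if t_off < t_iot and e_offl > e_iot and r_l > gamma_latency:
        return 1
    # Slower but more frugal: accept if the energy gain is large enough.
    if t_off > t_iot and e_offl < e_iot and r_e > gamma_energy:
        return 1
    return 0


# Hypothetical tasks with both thresholds set to 0.15.
print(offload_label(1.0, 0.80, 1.0, 1.1, 0.15, 0.15))  # 20% faster, 10% more energy
print(offload_label(1.0, 0.95, 1.0, 1.1, 0.15, 0.15))  # only 5% faster
print(offload_label(1.0, 1.05, 1.0, 0.8, 0.15, 0.15))  # 5% slower, 20% less energy
```

The function performs a fixed number of arithmetic operations and comparisons, which is the constant-time behavior stated in Lemma 1.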

Lemma 1. The complexity of Algorithm 1 is $\Gamma_{0} = O(1)$.

Proof. Algorithm 1, which is responsible for deciding whether a given task should be offloaded or executed locally, is designed with a strong emphasis on computational efficiency. Its execution consists of two main stages: (1) the computation of the latency and energy gain ratios ($R_{L}$ and $R_{E}$), and (2) the decision-making process based on these metrics and predefined tolerance thresholds.

Step 1: Ratio computation: This step involves a fixed number of arithmetic operations: two subtractions ($T_{iot} - T_{off}$ and $E_{iot} - E_{offl}$) and two divisions to normalize the differences. These operations are performed on scalar values and thus require a constant number of steps regardless of the input. The time complexity for this stage is therefore $O(1)$;
Step 2: Decision-making logic: The decision stage evaluates, at most, three mutually exclusive conditional branches, each involving simple comparisons (e.g., <, >) and logical conjunctions. The number of comparisons and conditional statements is fixed and does not depend on the nature or complexity of the task T. Consequently, this phase also operates in constant time;
Overall time and space complexity: Since both phases of the algorithm consist solely of constant-time operations with no loops or recursion, the overall time complexity of Algorithm 1 is $O (1)$ . Similarly, the algorithm uses only a constant amount of memory to store intermediate values (e.g., $R_{L}$ , $R_{E}$ ), leading to a space complexity of $O (1)$ .

□

Algorithm 1: Offloading prediction

Let us consider the following application examples to better explain how the target variable is obtained:

For a real-time video surveillance application, latency is critical (high $γ_{l a t e n c y}$ ). Energy is a lower priority (low $γ_{e n e r g y}$ ). If latency is reduced by 20% but energy increases by 10%, offloading is justified;
For an IoT temperature sensor with a limited battery, the priority is energy autonomy (high $γ_{e n e r g y}$ ). Latency is secondary (low $γ_{l a t e n c y}$ ). If the energy saved is significant (>15%) even with slightly higher latency (+5%), offloading is recommended.

3.4.3. Machine Learning Model Architecture

The decision model is a binary classifier whose objective is to determine for each task whether local execution on the IoT object or offloading to the edge layer is the optimal strategy. During the initial comparative testing within the resource-constrained environment, we conducted a preliminary evaluation comparing our binary classifier model to other algorithms, including Logistic Regression, Support Vector Machines, Random Forest, Gradient Boosting (XGBoost), Multi-Layer Perceptron (MLP), k-Nearest Neighbors (k-NN), Naive Bayes and a shallow Decision Tree. The comparison models were selected to cover the main algorithmic families used in recent work on offloading decisions and MEC optimization [22,25,30,31,36]. This choice ensures a comprehensive evaluation covering: (i) interpretable and lightweight models suitable for constrained environments, (ii) nonlinear models with high approximation capacity, and (iii) ensemble methods considered to be performance standards in the recent literature. Comparisons were made using criteria such as training time, inference latency, accuracy, F1-score, energy consumption per inference and memory footprint on simulated edge devices. A dataset of 10,000 samples was used, and to ensure robustness, we performed 5-fold cross-validation on the training set. For each model, hyperparameters were optimized using grid search. These hyperparameters are as follows:

For logistic regression: regularization strength C ∈ {0.01; 0.1; 1; 10};
For random forest: number of trees T ∈ {50; 100; 200};
For XGBoost: learning rate ∈ {0.01; 0.05; 0.1}, max depth ∈ {3; 5; 7};
For MLP: hidden units ∈ {32; 64; 128};
For k-NN: K ∈ {3; 5; 7};

Table 6 shows the comparison results. The results show that ensemble models, particularly XGBoost and Random Forest, achieve the best performance in terms of accuracy and F1-score. XGBoost achieves the best average accuracy (95.2 ± 0.7%) and the highest F1-score (0.94 ± 0.01), confirming its ability to capture complex nonlinear relationships between network variables (bandwidth, transmission power, channel conditions, noise, CPU load) and the offloading decision. In the context of IoT and edge computing, where real-time decision making is required, computational resources are constrained, and energy efficiency is critical, the marginal accuracy improvement (approximately 2–3%) achieved by complex ensemble models does not justify their substantially higher computational overhead. Our proposed model achieves competitive predictive performance, the lowest inference latency, minimal memory footprint and the lowest energy consumption per prediction.

Inference Latency: Inference latency is a critical factor in real-time IoT scenarios, where the decision to execute locally or offload must be made in a matter of milliseconds in order to avoid degrading quality of service. The results show a significant difference: our model: 0.05 ms; XGBoost: 3.12 ms; Random Forest: 2.34 ms; k-NN: 8.9 ms. In MEC systems where thousands of decisions can be made per second, this difference becomes structurally decisive.
Memory Footprint and Deployability: The size of the model directly impacts the feasibility of deployment on IoT devices with limited memory. The results show: our model: 12 KB; XGBoost: 920 KB; Random Forest: 680 KB; MLP: 410 KB. The proposed model therefore has a memory footprint roughly 57 to 77 times smaller than the ensemble models (680 KB and 920 KB versus 12 KB). This difference becomes critical for microcontrollers or embedded devices with limited memory resources. In a large-scale distributed MEC environment, replicating heavy models across multiple edge nodes can also introduce significant additional network and storage costs.
Energy Consumption: Energy consumption per inference grows with computation time and algorithmic complexity. The results indicate: our model: 0.9 mJ; XGBoost: 7.4 mJ; Random Forest: 6.1 mJ; k-NN: 12.3 mJ. In a battery-powered IoT scenario, a multiplier of roughly 7 to 14 on energy consumption per decision (the ratios of these values to 0.9 mJ) can significantly reduce the device’s battery life. Thus, even though XGBoost’s gain in accuracy is measurable, its energy cost makes it less suitable for constrained environments.

These results are also demonstrated by the evaluation metrics in Section 4. Therefore, from a system-level perspective, the proposed lightweight model provides the best trade-off between predictive accuracy and deployability in real-world MEC environments.

Our proposed model is implemented as a supervised classifier, which can be interpreted as a two-layer neural network: an input layer containing ten neurons (corresponding to the input features of our dataset) with a ReLU activation function and an output layer with one neuron with a Sigmoid activation function for binary classification (0 or 1, representing Local or Offload). Formally, the model computes the output, as shown by Equation (19):

$\begin{matrix} \hat{y} = \sigma (W^{T} X + b) \\ \text{where} \\ X \text{ is the input feature vector,} \\ W \text{ and } b \text{ are the trainable weight vector and bias,} \\ \sigma (\cdot) \text{ is the sigmoid activation function.} \end{matrix}$

(19)
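Equation (19) can be sketched in plain Python as follows. This is a minimal illustration of the formula as written, not the trained model: the weights, bias and feature values are hypothetical, only two features are used instead of ten for brevity, and the 0.5 decision threshold is an assumption.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x, b):
    """Equation (19): y_hat = sigmoid(W^T X + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def predict(w, x, b, threshold=0.5):
    """Map the probability to a class: 1 = offload, 0 = local."""
    return 1 if predict_proba(w, x, b) >= threshold else 0


# Hypothetical parameters and feature vectors.
w, b = [1.0, -1.0], 0.0
print(predict(w, [2.0, 0.5], b), predict(w, [0.5, 2.0], b))
```

Inference amounts to one dot product and one sigmoid evaluation, which is consistent with the low inference latency and memory footprint reported for the proposed model.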

The choice of a lightweight supervised model is motivated by the need for explainability, low latency inference, and the feasibility of its practical deployment in IoT environments. Unlike complex deep learning architectures, the proposed model achieves an effective balance between decision accuracy and computational efficiency, while leveraging analytically derived latency and energy models to guide the learning process.

3.4.4. Cost Function Definition

The aim of our machine learning model is to learn how to make the best offloading decision based on training data. To do this, we use a classifier, which should minimize errors during learning. The cost function allows us to measure the extent to which our model makes accurate or inaccurate predictions. In our binary classification model (0 = local execution, 1 = offloading), we use cross-entropy loss as the cost function for several reasons [38]:

It penalizes classification errors more severely than a simple quadratic error;
It is well-suited to binary classification problems, where we want the model to give well-calibrated probabilities;
It favors confident and accurate predictions (close to 0 or 1).

So, the cost function is defined using Equation (20).

$\begin{matrix} J (\Theta) = - \frac{1}{N} \sum_{i = 1}^{N} \left[ y_{i} \log (\hat{y}_{i}) + (1 - y_{i}) \log (1 - \hat{y}_{i}) \right] \\ \text{with} \\ N \text{: total number of examples in the training set,} \\ y_{i} \text{: the real offloading decision (0 or 1) for example } i \text{,} \\ \hat{y}_{i} \text{: the probability predicted by the model that example } i \text{ will be offloaded,} \\ \Theta \text{: the model parameters.} \end{matrix}$

(20)

If our model predicts correctly ($\hat{y}_{i}$ is close to $y_{i}$), then the value of the cost function is low. If our model makes an error ($\hat{y}_{i}$ is far from $y_{i}$), then the cost function is high. The cost function thus tells us how wrong our model is, and training seeks to minimize it. To reduce the cost function, the learning algorithm adjusts its parameters by learning from past examples. The more it minimizes this function, the better it becomes at deciding when to offload an IoT task.
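The behavior of Equation (20) can be illustrated with a short Python sketch. The function name and the example probabilities are hypothetical, and clipping the predictions with a small `eps` is an added safeguard against `log(0)` rather than part of the equation.

```python
import math


def cross_entropy(y_true, y_prob, eps=1e-12):
    """Equation (20): mean binary cross-entropy over N examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


labels = [1, 0]
confident_right = cross_entropy(labels, [0.9, 0.1])
undecided = cross_entropy(labels, [0.5, 0.5])
confident_wrong = cross_entropy(labels, [0.1, 0.9])
print(confident_right, undecided, confident_wrong)
```

The three values increase in that order: confident correct predictions give a low cost, undecided predictions give $\ln 2$, and confident wrong predictions are penalized most heavily.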

3.4.5. The Training Phase

The training data is generated through simulations with the OMNeT++ simulation tool, based on the analytical latency and energy models described in Section 3.2 and Section 3.3. The goal is to create a labeled dataset where each record represents a potential execution scenario. Thus, each instance in the dataset corresponds to a task generated by an IoT object and is represented by a vector of features describing its execution context. These features include:

The estimated local execution time;
The execution time in case of offloading (including transmission, edge processing, and return of the result);
Local energy consumption;
Energy consumption related to offloading;
Relative gains in latency and energy;
Tolerance parameters $γ_{l a t e n c y}$ and $γ_{e n e r g y}$ ;

The labels associated with the data are constructed from the analytical models proposed in Section 3.2 and Section 3.3, incorporating realistic trade-offs between latency and energy. This strategy allows offloading decisions to be made that are consistent with system objectives, while avoiding rigid rules. A Python 3.12.12 script then trains the machine learning model defined in Section 3.4.3 on the synthetic dataset generated through simulations. The model is trained by adjusting its internal parameters to minimize classification error, i.e., the frequency with which it chooses a suboptimal decision relative to the label (the best decision, calculated analytically).
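As a rough illustration of this training phase, the sketch below fits the model of Equation (19) by batch gradient descent on the cost of Equation (20). It is not the authors' training script: the dataset is a tiny random stand-in with only two features (the gain ratios $R_{L}$ and $R_{E}$), the thresholds, learning rate and epoch count are hypothetical, and plain gradient descent is an assumed optimizer.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def label(r_l, r_e, gamma_l=0.3, gamma_e=0.3):
    """Analytical label from the gain ratios (hypothetical thresholds)."""
    if r_l > 0 and r_e > 0:
        return 1
    if r_l > 0 and r_e < 0 and r_l > gamma_l:
        return 1
    if r_l < 0 and r_e > 0 and r_e > gamma_e:
        return 1
    return 0


def loss(w, b, xs, ys):
    """Equation (20) evaluated on the whole dataset."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(xs)


def train(xs, ys, lr=0.5, epochs=300):
    """Batch gradient descent on the cross-entropy cost."""
    w, b, n = [0.0] * len(xs[0]), 0.0, len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)  # fixed seed for repeatability
xs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
ys = [label(r_l, r_e) for r_l, r_e in xs]

initial_loss = loss([0.0, 0.0], 0.0, xs, ys)
w, b = train(xs, ys)
final_loss = loss(w, b, xs, ys)
hits = sum((sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5) == (y == 1)
           for x, y in zip(xs, ys))
accuracy = hits / len(xs)
print(initial_loss, final_loss, accuracy)
```

Because the labeling rule is piecewise rather than linear in the gain ratios, a linear model of this kind can only approximate it, which is one reason a richer feature vector is useful in practice.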

3.4.6. The Prediction Phase

Once the model was trained, it was deployed on the nodes to make real-time decisions.

1. Real-Time Context Data Capture: When an IoT device has a new task to perform, it gathers the necessary contextual information:

Locally: Task characteristics (size, cycles required), and IoT device capabilities (CPU frequency, battery status).
Via the Edge Server: The device communicates with the Edge to obtain the current status of network and Edge resources (throughput, load on neighboring Edge servers $L_{i}$ , frequencies $f_{i}$ ).

2. Prediction Execution: This context data is injected as input into the trained machine learning model. The classification model instantly predicts which is the optimal offloading decision among the learned options (Local, Edge Local, Edge Parallel).

3. Implementation of the Decision:

If the prediction is “Local”, the IoT device executes the task on its own processor. This is often the case for lightweight tasks or when network quality is poor, making transmission more costly in terms of latency and energy than local execution;
If the prediction is “Offloading Local Edge”, the IoT device transfers the entire task to the nearest Edge server. This server executes it locally;
If the prediction is “Parallel Edge Offloading”, the IoT device sends the task to the main Edge server. This server then uses its prediction information to select the optimal set of servers $S^{*}$ for parallelization, avoiding overloaded servers ($L_{i} \approx L_{max}$), and distributes the work ($C_{tot}$) to the selected servers based on the calculated weights ($w_{i}$), ensuring adaptive workload distribution.

4. Simulations and Analysis

The simulations were carried out in a development environment based on Jupyter Notebook 3, run using Anaconda Navigator. This environment was chosen for its flexibility and ease of use, allowing for the interactive execution of the Python scripts and efficient visualization of the results in the form of figures and tables [39]. The machine learning model was implemented using Scikit-learn to train and evaluate the model. Training data was processed using Pandas and NumPy, while Matplotlib v3.10.6 and Seaborn v0.13.2 were used to generate visualizations that illustrate the impact of offloading decisions on latency and energy consumption. In addition, the Apache HTTP Benchmark (AB) tool was used to measure the impact of offloading decisions on the latency and performance of the MEC servers, as shown in [40]. The Apache AB configuration was defined to simulate realistic network loads with the following parameters:

Total number of requests sent: 10,000;
Number of concurrent requests: 100;
Type of requests: HTTP requests to MEC servers to simulate the sending and processing of offloaded tasks.
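For reference, the two numeric parameters above map onto the `-n` (total requests) and `-c` (concurrency) options of the `ab` command-line tool. The Python sketch below only assembles such a command line and prints it; the target URL is a hypothetical placeholder, and the exact invocation used in the experiments is not given in the text.

```python
import shlex


def ab_command(url, requests=10_000, concurrency=100):
    """Build an Apache HTTP Benchmark argument list (not executed here)."""
    return ["ab", "-n", str(requests), "-c", str(concurrency), url]


# Hypothetical MEC endpoint; replace with the server under test.
cmd = ab_command("http://mec-server.example/offload")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where `ab` is installed.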

These parameters can be used to analyze the ability of MEC servers to process offloaded tasks efficiently and to assess the improvement in average latency and energy consumption achieved by the proposed approach. OMNeT++ was used to simulate a realistic MEC environment, including multiple Edge servers and IoT devices with random deployments. Table 7 gives the characteristics of the fifteen edge servers used during the simulations.

In our study, we opted for high-fidelity synthetic data over open benchmark datasets for two primary reasons:

Context Specificity: Existing public datasets often lack the fine-grained, dynamic parameters critical to our model, such as inter-edge communication time ( $T_{c o m m}$ ), parallel processing overhead, and dynamically varying network congestion ( $ρ$ ). Our model’s core novelty lies in integrating these factors, requiring a dataset that captures these features under various load and network configurations.
Control over Trade-offs: To train a robust classifier that intelligently optimizes latency and energy, we need control over the target variable generation based on dynamic thresholds ( $γ_{l a t e n c y}$ ) and ( $γ_{e n e r g y}$ ). A synthetic environment allowed us to generate a balanced dataset covering scenarios where latency and energy dominate, or a given trade-off is required, ensuring the model learns the nuanced decision logic defined in Algorithm 1.

To ensure full reproducibility, Table 8 details the simulation environment built with OMNeT++ (using the INET framework) and Apache HTTP Benchmark (AB).

4.1. Evaluation of Our Model

The analysis of the experimental results in Figure 3, Figure 4 and Figure 5 highlights the relevance of our machine learning-based approach to offloading decisions in an MEC environment. We also implemented a decision algorithm labeled “Without our model” in Figure 3, Figure 4 and Figure 5, based on a simple heuristic strategy on the latency threshold, for comparison with our model. Specifically, each IoT object decides to offload a task only if the estimated latency for local execution exceeds a predefined threshold (e.g., 250 ms). Below this threshold, the task is executed locally; above the threshold, it is transferred to the edge layer. The choice of this approach is based on the fact that it is a naive but realistic strategy, often encountered in embedded systems where no adaptive intelligence is deployed. It provides a relevant basis for comparison when evaluating the contribution of our machine learning model, as it simulates deterministic behavior without dynamic optimization. However, this fixed-threshold method does not take into account other important factors, such as the energy consumption, the current edge server load, or the variability of network conditions. This underlines the value of our intelligence-based approach, which is capable of simultaneously integrating latency and energy consumption for finer-grained, contextually optimal decision making.
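The fixed-threshold baseline described above can be sketched in a few lines of Python. The function name is hypothetical and the 250 ms default is the example value quoted in the text; a task whose estimated local latency equals the threshold is treated here as local, which is an assumption.

```python
def baseline_offload(t_local_ms, threshold_ms=250.0):
    """Offload only if the estimated local latency exceeds the threshold."""
    return t_local_ms > threshold_ms


# The rule looks at local latency alone: energy, edge load and network
# conditions play no part in the decision.
for t_local in (100.0, 250.0, 300.0):
    print(t_local, "offload" if baseline_offload(t_local) else "local")
```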

4.1.1. Energy Consumption Curve with and Without Our Model

Figure 3a shows the average energy consumption of tasks depending on whether the offloading decision is made with or without our model. If the energy consumption with our model is lower, this means that the model favors energy-efficient execution choices. If it is the same or slightly higher, this may be a compromise to reduce latency. The analysis of average energy consumption shows that our machine learning approach allows us to reduce energy expenditure when offloading is advantageous, or to promote latency gains while keeping consumption under control.

4.1.2. Average Latency Curve with and Without Our Model

Figure 3b compares the average execution latency of the task when the offload decision is made with and without our machine learning-based approach. Without our model, offloading is performed statically, which can lead to suboptimal decisions. However, our model learns to predict the best offloading cases, reducing latency when relevant. A reduction in latency with our model confirms that the algorithm is able to optimize task execution by selecting the right environment (local or MEC). The significant improvement in average latency observed with our machine learning-based approach demonstrates the relevance of intelligent offloading to optimize execution performance.

4.1.3. Interpretation of the Confusion Matrix

The confusion matrix in Figure 4a is used to assess the performance of our machine learning model by illustrating the distribution of correct and incorrect predictions. It consists of four values:

True positives: the number of times that the model correctly predicted that the task should run locally (1118 in Figure 4a);
False positives: the number of times the model recommended local execution when, in reality, offloading was more advantageous (247 in Figure 4a);
False negatives: the number of times the model recommended offloading tasks when, in reality, local execution would have been optimal (361 in Figure 4a);
True negatives: the number of times that the model correctly predicted that the task should be offloaded (774 in Figure 4a).

Since the values on the main diagonal (true positives and true negatives) are dominant (1118 and 774, respectively, i.e., 1892 of the 2500 evaluated decisions), our model makes the correct offloading decision in most cases.
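
The four counts reported for Figure 4a are enough to derive the usual summary metrics. The sketch below only applies the standard definitions to those counts, taking local execution as the positive class, as in the list above; the variable names are illustrative.

```python
# Counts as reported for Figure 4a (positive class = local execution).
tp, fp, fn, tn = 1118, 247, 361, 774

total = tp + fp + fn + tn
accuracy = (tp + tn) / total          # share of correct decisions
precision = tp / (tp + fp)            # correct among predicted "local"
recall = tp / (tp + fn)               # found among truly "local"
f1 = 2 * precision * recall / (precision + recall)

print(f"total={total} accuracy={accuracy:.4f} "
      f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```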

4.1.4. Precision Recall Curve

The precision–recall curve (Figure 4b) provides a fine-grained assessment of model performance, particularly in contexts where classes are imbalanced, as is often the case in MEC environments. In our study, precision measures the proportion of tasks predicted to be offloaded that actually should be offloaded, while recall indicates the proportion of tasks that should be offloaded that the model has correctly identified. The results obtained reflect a realistic compromise between these two metrics, as they reveal a slight drop in precision as the recall increases, which is expected in a real environment. This indicates that our model, while maintaining a strong ability to identify the right tasks to offload, remains cautious in order to limit prediction errors that could lead to unnecessary offloading, with a cost in terms of transmission and energy consumption. In short, these results indicate that our machine learning model makes balanced and robust decisions, aligned with the latency and energy performance objectives in an intelligent edge computing environment.
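
To make the construction of such a curve concrete, the sketch below sweeps a decision threshold over classifier scores and records one (recall, precision) point per threshold. The labels and scores are toy values invented for illustration, not data from the paper, and the helper `precision_recall_points` is a hypothetical name; it assumes all scores are distinct and that label 1 means "should be offloaded".

```python
def precision_recall_points(y_true, scores):
    """Return (recall, precision) pairs, lowering the threshold one sample at a time."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(y_true)
    tp = fp = 0
    points = []
    for i in order:
        if y_true[i] == 1:
            tp += 1   # sample i is now predicted positive, correctly
        else:
            fp += 1   # sample i is now predicted positive, wrongly
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


# Toy example: 1 = should be offloaded, 0 = should run locally.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = precision_recall_points(y_true, scores)
for recall, precision in points:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this toy run, precision falls as recall approaches 1, which is the same qualitative trade-off the paragraph describes.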

4.1.5. Learning Rate Curve

Figure 5a represents the evolution of the learning rate over the iterations of our model. It illustrates how the model gradually adjusts its weights to converge towards an optimal solution. A decrease in the learning rate is generally desirable as it allows the model to refine its decision by stabilizing its learning. Decreasing the learning rate too quickly could lead to premature convergence towards a suboptimal solution, while decreasing it too slowly could prolong the learning time. In our case, the evolution of the learning rate shows a gradual convergence of the model towards a better decision-making ability. A controlled decrease in the learning rate ensures the effective stabilization of the algorithm and avoids excessive adjustments.
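
The effect of decreasing the learning rate too quickly or too slowly can be illustrated with a simple schedule. The inverse-time decay below is an assumption chosen for illustration; the schedule actually used to produce Figure 5a is not stated here, and `inverse_time_decay`, the initial rate 0.1 and the decay constants are hypothetical.

```python
def inverse_time_decay(lr0: float, decay: float, step: int) -> float:
    """Learning rate after `step` iterations: lr0 / (1 + decay * step)."""
    return lr0 / (1.0 + decay * step)


steps = list(range(0, 101, 25))
# A small decay constant shrinks the rate gradually; a large one collapses it early.
slow = [inverse_time_decay(0.1, 0.01, s) for s in steps]
fast = [inverse_time_decay(0.1, 1.0, s) for s in steps]

for s, a, b in zip(steps, slow, fast):
    print(f"step={s:3d} slow={a:.4f} fast={b:.5f}")
```

With the fast schedule, the updates become very small after a few iterations, which is how premature convergence can arise; with the slow schedule, the rate stays large for longer and training takes more iterations to settle.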

4.1.6. Workload on Edge Servers

In Figure 5b, we show the results of the workload evaluation on each edge server when there are 200, 500, 1000 and 10,000 tasks to be executed by IoT objects. Figure 5b shows that when there are 10,000 tasks to be executed and some of them are offloaded to the edge layer, edge server 1 uses 38.20% of its resources, edge server 2 uses 42.03%, edge server 3 uses 37.36%, edge server 4 uses 35.96%, edge server 5 uses 40.12%, edge server 6 uses 36.85%, edge server 7 uses 39.76%, edge server 8 uses 41.64%, edge server 9 uses 43.04%, edge server 10 uses 35.97%, edge server 11 uses 39.61%, edge server 12 uses 42.23%, edge server 13 uses 37.61%, edge server 14 uses 44.69% and edge server 15 uses 36.37% of its resources. This shows that the workload is well-distributed among the nodes. Table 9 supports the results presented in Figure 5b: it reports the standard deviation of the workload across the edge servers when there are 200, 500, 1000 and 10,000 tasks to be executed in the network. The standard deviation remains low, reflecting an effective balance of the workload between the edge servers, and it stays almost the same as the number of tasks increases (between 0.5621 and 0.6868), which indicates that our protocol adapts to more heavily loaded environments.
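
The per-server utilization figures quoted above for 10,000 tasks can be summarized directly. The sketch below computes their mean, range and population standard deviation in percentage points; it is a plain descriptive summary and is not claimed to reproduce the values of Table 9, whose normalization is not specified in this section.

```python
from statistics import mean, pstdev

# Resource utilization (%) per edge server for 10,000 tasks, as quoted in the text.
utilization = {
    1: 38.20, 2: 42.03, 3: 37.36, 4: 35.96, 5: 40.12,
    6: 36.85, 7: 39.76, 8: 41.64, 9: 43.04, 10: 35.97,
    11: 39.61, 12: 42.23, 13: 37.61, 14: 44.69, 15: 36.37,
}

values = list(utilization.values())
average = mean(values)
spread = pstdev(values)              # population standard deviation
least, most = min(values), max(values)

print(f"mean={average:.2f}%  min={least:.2f}%  max={most:.2f}%  stdev={spread:.2f} points")
```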

4.2. Comparison of Our Approach with Several Others

To validate our solution, we compared it with the ones proposed by Abednego et al. [31], Eang et al. [30], Michael et al. [25], Perin et al. [36] and Jin Wang et al. [22] in terms of energy consumption and latency. First, we evaluated latency and energy by varying the number of tasks to be executed by the IoT objects, and then we varied the number of IoT objects. For each evaluation, we considered two cases: the case of an application domain where latency is considered a priority over energy ($γ_{latency} = 0.7$ and $γ_{energy} = 0.3$), and the case of an application domain where energy is considered a priority over latency ($γ_{latency} = 0.3$ and $γ_{energy} = 0.7$). Each IoT object has 1000 J of battery.
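
To illustrate how the two weight settings can change a decision, the sketch below scores local and offloaded execution with a weighted sum of normalized latency and energy. This is an assumed, simplified cost, not the cost function defined in Section 3.4.4; the function `choose`, the normalization by the larger of the two values, and the sample latency and energy figures are all hypothetical.

```python
def weighted_cost(latency, energy, latency_ref, energy_ref, gamma_latency, gamma_energy):
    """Weighted sum of latency and energy, each divided by a reference value."""
    return gamma_latency * latency / latency_ref + gamma_energy * energy / energy_ref


def choose(local, offload, gamma_latency, gamma_energy):
    """local and offload are (latency_s, energy_j) pairs; return the cheaper option."""
    latency_ref = max(local[0], offload[0])
    energy_ref = max(local[1], offload[1])
    cost_local = weighted_cost(*local, latency_ref, energy_ref, gamma_latency, gamma_energy)
    cost_offload = weighted_cost(*offload, latency_ref, energy_ref, gamma_latency, gamma_energy)
    return "offload" if cost_offload < cost_local else "local"


# Hypothetical task: offloading is faster but costs the device more energy.
local = (0.4, 0.2)     # 0.4 s, 0.2 J
offload = (0.1, 0.5)   # 0.1 s, 0.5 J

print(choose(local, offload, gamma_latency=0.7, gamma_energy=0.3))
print(choose(local, offload, gamma_latency=0.3, gamma_energy=0.7))
```

For this hypothetical task, the latency-priority weights favor offloading and the energy-priority weights favor local execution, which is the kind of context-dependent behavior the two evaluation cases are meant to exercise.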

4.2.1. Latency and Energy Comparisons with Number of Tasks Varying

Figure 6 and Figure 7 display the results when varying the number of tasks to be executed by the IoT objects. Considering the latency evaluation (Figure 6), the results show that in an environment where latency is more important than energy (Figure 6b), the average latency of our solution is much lower than that of the other solutions; in an environment where energy is more important than latency (Figure 6a), our solution still has the lowest latency, although it is very close to that of Abednego et al. [31]. Our machine learning model, together with the fact that our solution adapts to the type of environment, explains these results as the number of tasks varies. Considering energy consumption (Figure 7), we evaluated the average remaining energy of the IoT objects after all the tasks were executed. In an environment where latency is more important than energy, Figure 7b shows that our solution remains the best compared to those proposed by Abednego et al. [31], Eang et al. [30] and Michael et al. [25], although it is very close to that of Abednego et al. [31]. When energy is more important than latency (Figure 7a), our solution outperforms all the others.

4.2.2. Latency and Energy Comparisons with Number of IoT Objects Varying

Figure 8 displays the results when our solution is compared to those of Abednego et al. [31], Eang et al. [30] and Michael et al. [25] in terms of latency, while Figure 9 presents the results in terms of energy consumption when we vary the number of IoT objects in the network. Latency increases as the number of IoT objects increases: the more IoT objects there are in the network, the more messages are exchanged and the more the bandwidth is saturated, leading to congestion. Likewise, the remaining energy decreases as the number of IoT objects increases, since more messages mean a higher probability of collisions and retransmissions. Since our machine learning model incorporates the characteristics of the network at time t into its decision-making process, our solution achieves the best results.

5. Discussion

This section explains why HILANDER outperforms the baseline (“Without our model” in Figure 3, Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9) by highlighting the following points:

Holistic Optimization vs. Static Heuristic: The baseline relies on a fixed latency threshold for offloading decisions, ignoring energy consumption, edge server load, and network dynamics. In contrast, HILANDER uses a machine learning model that jointly optimizes latency and energy under dynamic conditions, guided by user-defined weights ($γ_{latency}$ and $γ_{energy}$);
Quantified Performance Gains: Simulation results (Figure 3, Figure 6, Figure 8 and Figure 9) show that HILANDER significantly reduces average latency (e.g., Figure 3b) while maintaining lower energy consumption (e.g., Figure 3a), demonstrating effective multi-objective trade-off optimization;
Adaptive Workload Management: HILANDER dynamically selects MEC servers for parallel execution, preventing the bottlenecks and inefficient resource usage common in static approaches. The high accuracy shown in the confusion matrix (Figure 4a) confirms reliable decision-making and optimized edge resource utilization.

6. Conclusions

In this paper, we proposed a hybrid approach integrating analytical models and machine learning to optimize offloading in Mobile Edge Computing networks. Our methodology is based on the definition of a mathematical expression for latency, taking into account local execution as well as distributed processing on the edge layer with parallelization between servers and workload balancing. In addition, we developed a formula for estimating energy consumption, differentiating between the cost of local execution and the cost of transmission to the edge layer. These two models were then integrated into a machine learning algorithm capable of making optimal decisions based on the constraints of the network and the resources of the IoT objects. Our experimental results show a significant reduction in latency while optimizing energy consumption, particularly in highly constrained environments where resources are limited. The confusion matrix and the various evaluation metrics confirm that our approach enables finer-grained decision making, avoiding offloading errors that could overload edge servers or drain the batteries of IoT objects unnecessarily. This solution is particularly well-suited to embedded systems requiring fast, efficient processing, such as intelligent drones, real-time surveillance applications and sensor networks for energy management in critical infrastructures. Its integration into commercial MEC systems could improve resource management and offer a better quality of service to end users.

Author Contributions

Conceptualization, G.B.J.M. and M.V.; methodology, G.B.J.M. and A.N.N.; software, G.B.J.M.; validation, G.B.J.M., A.N.N. and M.V.; formal analysis, G.B.J.M.; investigation, A.N.N.; resources, M.V.; data curation, G.B.J.M.; writing—original draft preparation, G.B.J.M. and A.N.N.; writing—review and editing, G.B.J.M., A.N.N. and M.V.; visualization, A.N.N.; supervision, G.B.J.M. and M.V.; project administration, G.B.J.M. and A.N.N.; funding acquisition, M.V. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Research Foundation of South Africa (Grant Number: 141918).

Data Availability Statement

Data supporting this research are available at https://github.com/Garrik10/hilander_paper_data, accessed on 15 February 2026.

Acknowledgments

The authors used chatPdf 1.1.4 to summarize documents for literature review.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Aouedi, O.; Vu, T.H.; Sacco, A.; Nguyen, D.C.; Piamrat, K.; Marchetto, G.; Pham, Q.V. A survey on intelligent Internet of Things: Applications, security, privacy, and future directions. IEEE Commun. Surv. Tutor. 2024, 27, 1238–1292.
2. Jagho Mdemaya, G.B.; Kengne Tchendji, V.; Velempini, M.; Atchaze, A. ATENA: Adaptive TEchniques for Network Area Coverage and Routing in IoT-Based Edge Computing. J. Netw. Syst. Manag. 2024, 32, 83.
3. Bomgni, A.B.; Mdemaya, G.B.J.; Ali, H.M.; Zanfack, D.G.; Zohim, E.G. ESPINA: Efficient and secured protocol for emerging IoT network applications. Clust. Comput. 2022, 26, 85–98.
4. Park, H.; Song, S.; Nguyen, T.H.; Park, L. Machine Learning for Internet of Things: Applications and Discussions. In Proceedings of the 2024 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Osaka, Japan, 19–22 February 2024; pp. 459–462.
5. Loutfi, S.I.; Shayea, I.; Tureli, U.; El-Saleh, A.A.; Tashan, W. An overview of mobility awareness with mobile edge computing over 6G network: Challenges and future research directions. Results Eng. 2024, 23, 102601.
6. Wang, J.; Zhao, L.; Liu, J.; Kato, N. Smart Resource Allocation for Mobile Edge Computing: A Deep Reinforcement Learning Approach. IEEE Trans. Emerg. Top. Comput. 2021, 9, 1529–1541.
7. Mdemaya, G.B.J.; Ndadji, M.M.Z.; Sindjoung, M.L.F.; Velempini, M. Efficient Load-Balancing and Container Deployment for Enhancing Latency in an Edge Computing-Based IoT Network Using Kubernetes for Orchestration. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 10.
8. Dong, S.; Tang, J.; Abbas, K.; Hou, R.; Kamruzzaman, J.; Rutkowski, L.; Buyya, R. Task offloading strategies for mobile edge computing: A survey. Comput. Netw. 2024, 254, 110791.
9. Chen, Y.; Xu, J.; Wu, Y.; Gao, J.; Zhao, L. Dynamic Task Offloading and Resource Allocation for NOMA-Aided Mobile Edge Computing: An Energy Efficient Design. IEEE Trans. Serv. Comput. 2024, 17, 1492–1503.
10. Cao, B.; Zhang, L.; Li, Y.; Feng, D.; Cao, W. Intelligent Offloading in Multi-Access Edge Computing: A State-of-the-Art Review and Framework. IEEE Commun. Mag. 2019, 57, 56–62.
11. Zhang, H.; Liu, X.; Xu, Y.; Li, D.; Yuen, C.; Xue, Q. Partial Offloading and Resource Allocation for MEC-Assisted Vehicular Networks. IEEE Trans. Veh. Technol. 2024, 73, 1276–1288.
12. Mi, X.; He, H.; Shen, H. A Multi-Agent RL Algorithm for Dynamic Task Offloading in D2D-MEC Network with Energy Harvesting. Sensors 2024, 24, 2779.
13. Abbas, A.; Raza, A.; Aadil, F.; Maqsood, M. Meta-heuristic-based offloading task optimization in mobile edge computing. Int. J. Distrib. Sens. Netw. 2021, 17, 15501477211023021.
14. Li, K. Heuristic Computation Offloading Algorithms for Mobile Users in Fog Computing. ACM Trans. Embed. Comput. Syst. 2021, 20, 11.
15. Yu, Z.; Xu, X.; Zhou, W. Task offloading and resource allocation strategy based on deep learning for mobile edge computing. Comput. Intell. Neurosci. 2022, 2022, 1427219.
16. Yang, S.; Lee, G.; Huang, L. Deep Learning-Based Dynamic Computation Task Offloading for Mobile Edge Computing Networks. Sensors 2022, 22, 4088.
17. Sun, M.; Quan, S.; Wang, X.; Huang, Z. Latency-aware scheduling for data-oriented service requests in collaborative IoT-edge-cloud networks. Future Gener. Comput. Syst. 2025, 163, 107538.
18. Dinesh, S.; Nidhi; Rajnish, C.; Shiv, P.; Tiansheng, Y.; Singh, R.R.; Idrees, A. Optimizing energy and latency in edge computing through a Boltzmann driven Bayesian framework for adaptive resource scheduling. Sci. Rep. 2025, 15, 30452.
19. Zhang, C.; Liu, S.; Yang, H.; Cui, G.; Li, F.; Wang, X. Joint Task Offloading and Resource Allocation in Mobile Edge Computing-Enabled Medical Vehicular Networks. Mathematics 2025, 13, 52.
20. Kim, M.; Jang, J.; Choi, Y.; Yang, H.J. Distributed Task Offloading and Resource Allocation for Latency Minimization in Mobile Edge Computing Networks. IEEE Trans. Mob. Comput. 2024, 23, 15149–15166.
21. Zhao, L.; Fang, H.; Wu, J.; Wan, Y.; Zhang, Y.; Chen, J. Computing Task Offloading Based on An Enhanced Genetic Algorithm in Mobile Edge Computing. In Proceedings of the 2023 4th International Conference on Computer Science and Management Technology; Association for Computing Machinery: New York, NY, USA, 2024; pp. 117–121.
22. Wang, J.; Wu, W.; Liao, Z.; Sangaiah, A.K.; Simon Sherratt, R. An Energy-Efficient Off-Loading Scheme for Low Latency in Collaborative Edge Computing. IEEE Access 2019, 7, 149182–149190.
23. Gu, X.; Ji, C.; Zhang, G. Energy-Optimal Latency-Constrained Application Offloading in Mobile-Edge Computing. Sensors 2020, 20, 3064.
24. Beck, A.; Tetruashvili, L. On the Convergence of Block Coordinate Descent Type Methods. SIAM J. Optim. 2013, 23, 2037–2060.
25. Mahenge, M.P.J.; Li, C.; Sanga, C.A. Energy-efficient task offloading strategy in mobile edge computing for resource-intensive mobile applications. Digit. Commun. Netw. 2022, 8, 1048–1058.
26. Cai, Y.; Ran, L.; Zhang, J.; Zhu, H. Latency optimization for D2D-enabled parallel mobile edge computing in cellular networks. EURASIP J. Wirel. Commun. Netw. 2021, 2021, 133.
27. Hu, X.; Wang, L.; Wong, K.K.; Tao, M.; Zhang, Y.; Zheng, Z. Edge and Central Cloud Computing: A Perfect Pairing for High Energy Efficiency and Low-Latency. IEEE Trans. Wirel. Commun. 2020, 19, 1070–1083.
28. Du, J.; Yu, F.R.; Chu, X.; Feng, J.; Lu, G. Computation Offloading and Resource Allocation in Vehicular Networks Based on Dual-Side Cost Minimization. IEEE Trans. Veh. Technol. 2019, 68, 1079–1092.
29. Dinh, T.Q.; Tang, J.; La, Q.D.; Quek, T.Q.S. Offloading in Mobile Edge Computing: Task Allocation and Computational Frequency Scaling. IEEE Trans. Commun. 2017, 65, 3571–3584.
30. Eang, C.; Ros, S.; Kang, S.; Song, I.; Tam, P.; Math, S.; Kim, S. Offloading decision and resource allocation in mobile edge computing for cost and latency efficiencies in real-time IoT. Electronics 2024, 13, 1218.
31. Acheampong, A.; Zhang, Y.; Xu, X. A parallel computing based model for online binary computation offloading in mobile edge computing. Comput. Commun. 2023, 203, 248–261.
32. Deng, Y.; Chen, Z.; Yao, X.; Hassan, S.; Ibrahim, A.M.A. Parallel Offloading in Green and Sustainable Mobile Edge Computing for Delay-Constrained IoT System. IEEE Trans. Veh. Technol. 2019, 68, 12202–12214.
33. Carvalho, G.; Woungang, I.; Jaseemuddin, M. Analysis of joint parallelism in wireless and cloud domains on mobile edge computing over 5G systems. J. Commun. Netw. 2018, 20, 565–577.
34. Qin, Y.; Chen, J.; Jin, L.; Yao, R.; Gong, Z. Task offloading optimization in mobile edge computing based on a deep reinforcement learning algorithm using density clustering and ensemble learning. Sci. Rep. 2025, 15, 211.
35. Lu, D.; Long, S. Enhancing network function parallelism in mobile edge computing using Deep Reinforcement Learning. ICT Express 2025, 11, 41–46.
36. Perin, G.; Berno, M.; Erseghe, T.; Rossi, M. Towards Sustainable Edge Computing Through Renewable Energy Resources and Online, Distributed and Predictive Scheduling. IEEE Trans. Netw. Serv. Manag. 2022, 19, 306–321.
37. Comisso, M.; Vatta, F.; Buttazzoni, G.; Babich, F. Shannon Capacity Evaluation for 5G Communications Using the 3D Random Waypoint Mobility Model. In Proceedings of the 2020 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 17–19 September 2020; pp. 1–6.
38. Li, L.; Doroslovački, M.; Loew, M.H. Approximating the Gradient of Cross-Entropy Loss Function. IEEE Access 2020, 8, 111626–111635.
39. Cardoso, A.; Leitão, J.; Teixeira, C. Using the Jupyter Notebook as a Tool to Support the Teaching and Learning Processes in Engineering Courses. In Proceedings of The Challenges of the Digital Transformation in Education; Auer, M.E., Tsiatsos, T., Eds.; Springer: Cham, Switzerland, 2019; pp. 227–236.
40. Nguyen, Q.M.; Phan, L.A.; Kim, T. Load-Balancing of Kubernetes-Based Edge Computing Infrastructure Using Resource Adaptive Proxy. Sensors 2022, 22, 2869.

Figure 1. Overview of task offloading approaches.

Figure 2. Global solution description.

Figure 3. (a) Average energy consumption. (b) Average task execution latency.

Figure 4. (a) Confusion matrix evaluation. (b) Precision recall evaluation.

Figure 5. (a) Learning rate evaluation. (b) Workload evaluation.

Figure 6. Latency comparison with number of tasks varying: (a) when $γ_{latency} = 0.3$ and $γ_{energy} = 0.7$; (b) when $γ_{latency} = 0.7$ and $γ_{energy} = 0.3$.

Figure 7. Energy comparison with number of tasks varying: (a) when $γ_{latency} = 0.3$ and $γ_{energy} = 0.7$; (b) when $γ_{latency} = 0.7$ and $γ_{energy} = 0.3$.

Figure 8. Latency comparison with varying numbers of IoT objects: (a) when $γ_{latency} = 0.3$ and $γ_{energy} = 0.7$; (b) when $γ_{latency} = 0.7$ and $γ_{energy} = 0.3$.

Figure 9. Energy comparison with a varying number of IoT objects: (a) when $γ_{latency} = 0.3$ and $γ_{energy} = 0.7$; (b) when $γ_{latency} = 0.7$ and $γ_{energy} = 0.3$.

Table 1. Comparison of literature review works.

Author/Work	Method Type	Optimization Objectives	Support for Parallel/Distributed Execution	Awareness of Dynamic Network Conditions
Minwoo Kim et al. [20]	Learning-based (AI + NP-Hard optimization)	Latency, energy, workload balancing	Not explicitly specified	Yes (robust across varying servers)
Abbas et al. [8]	Heuristic (metaheuristics: Ant Colony, Whale, Grey Wolf Optimization)	Energy, latency	Limited (sequential metaheuristics)	Not specifically addressed
Wang et al. [6]	Heuristic (Hungarian algorithm-based)	Latency, energy	Yes (subtask assignment to multiple servers)	Not explicitly addressed
Gu et al. [23]	Analytical + Heuristic (channel gain threshold)	Energy, latency	Not specified	Handles varying system parameters
Beck et al. [24]	Heuristic (coordinate descent)	Latency, energy	Not explicitly specified	Not addressed
Michael P. et al. [25]	Convex optimization (multiobjective)	Latency, energy	Not specified	Not explicitly addressed
Wang et al. [19]	Heuristic (full, partial, D2D offloading)	Delay, energy	Yes (independent sub-problems)	Not specified
Hu et al. [36]	Analytical + heuristic (QoS-aware algorithms)	Latency, energy, Backhaul efficiency	Yes (integrated edge and cloud architecture)	Yes (network condition adaptation)
Eang et al. [30]	Distributed (LDROA Lagrange Duality)	Latency, cost, energy	Yes	Yes (adjusts resources according to network conditions)

Table 2. Table of symbols.

Symbol	Description
$T_{iot}$	Processing time of a task locally on an IoT object
$T_{off}$	Processing time of a task when it is offloaded onto the edge layer
$G_{max}$	Maximum achievable network throughput
$T_{esLocal}$	Time taken to process a task locally on the edge server that received the task
$T_{par}$	Time taken to process the task in parallel
$C$	Number of processor cycles required to perform the task
$f_{esLocal}$	Processor frequency of the edge server that processes a task locally

Table 3. Values of $w_{i}$.

Item	Application of Equation (11) for $w_{i}$	Result
$w_{1}$	$\frac{3}{0.02 \times (1 + \frac{1.2}{3})}$	107.14
$w_{2}$	$\frac{4}{0.015 \times (1 + \frac{0.8}{3})}$	210.52
$w_{3}$	$\frac{5}{0.018 \times (1 + \frac{2.5}{3})}$	151.51
$w_{4}$	$\frac{2.5}{0.03 \times (1 + \frac{1.1}{3})}$	60.98
$w_{5}$	$\frac{6}{0.012 \times (1 + \frac{0.4}{3})}$	428.57
$w_{6}$	$\frac{3.2}{0.025 \times (1 + \frac{2.1}{3})}$	74.42
$w_{7}$	$\frac{4.5}{0.017 \times (1 + \frac{1.8}{3})}$	166.67
$w_{8}$	$\frac{5.5}{0.022 \times (1 + \frac{0.7}{3})}$	203.70
$w_{9}$	$\frac{2.8}{0.028 \times (1 + \frac{1}{3})}$	75.68
$w_{10}$	$\frac{6.1}{0.010 \times (1 + \frac{0.3}{3})}$	554.55

Table 4. Values of $C_{i}$ for each selected edge server.

Server	Application of Equation (11)	Results for $C_{i}$	Application of Equation (10)	Results for $T_{i}$
Server 2	$C_{2}$ = $10^{9} \times \frac{210.52}{1193.64}$	$C_{2}$ = $1.76 \times 10^{8}$	$T_{2}$ = $\frac{1.76 \times 10^{8} + 0.8 \times 10^{9}}{4 \times 10^{9}}$	$T_{2}$ = 0.184 s
Server 5	$C_{5}$ = $10^{9} \times \frac{428.57}{1193.64}$	$C_{5}$ = $3.59 \times 10^{8}$	$T_{5}$ = $\frac{3.59 \times 10^{8} + 0.4 \times 10^{9}}{6 \times 10^{9}}$	$T_{5}$ = 0.139 s
Server 10	$C_{10}$ = $10^{9} \times \frac{554.55}{1193.64}$	$C_{10}$ = $4.65 \times 10^{8}$	$T_{10}$ = $\frac{4.65 \times 10^{8} + 0.3 \times 10^{9}}{6.1 \times 10^{9}}$	$T_{10}$ = 0.138 s
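
The proportional split of cycles in Table 4 can be rechecked numerically. The sketch below assumes, as the table's formulas suggest, a task of $10^{9}$ cycles shared among servers 2, 5 and 10 in proportion to their weights $w_{i}$ from Table 3; the variable names are illustrative.

```python
TOTAL_CYCLES = 1e9  # cycles of the task being split, as used in Table 4

# Weights w_i of the three selected servers, taken from Table 3.
weights = {2: 210.52, 5: 428.57, 10: 554.55}

weight_sum = sum(weights.values())
# Each server receives a share of the cycles proportional to its weight.
shares = {server: TOTAL_CYCLES * w / weight_sum for server, w in weights.items()}

print(f"sum of weights = {weight_sum:.2f}")
for server, cycles in shares.items():
    print(f"C_{server} = {cycles:.3e} cycles")
```

Servers with a larger weight receive a larger share of the task, and the shares add up to the full $10^{9}$ cycles.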

Table 5. Features of our dataset.

Feature	Influenced by
Properties of the task to be executed	$C_{t}$: Number of CPU cycles required to execute the task; $S_{t}$: Size of task data (in KB or MB).
IoT object characteristics	$f_{iot}$: IoT processor frequency (GHz); $p_{iot}$: IoT power in run mode (W).
Network characteristics	$B$: Available bandwidth (Mbps); $d$: Network latency (ms); $ρ$: Network congestion (signal-to-noise ratio).
Edge servers characteristics	$f_{edge}$: Edge server frequency (GHz); $p_{edge}$: Edge server power (W); $T_{comm}$: Inter-edge communication time (ms), if the task is executed in parallel.

Table 6. Quantitative comparison of machine learning models for offloading decision.

Model	Accuracy (%)	F1-Score	Inference Latency (ms)	Training Time (s)	Memory Footprint (KB)	Energy (mJ)
Our model	$92.4 \pm 1.1$	$0.91 \pm 0.01$	0.05	0.42	12	0.9
SVM (RBF)	$93.8 \pm 0.9$	$0.92 \pm 0.02$	1.87	3.6	240	4.8
Random Forest	$94.6 \pm 0.8$	$0.93 \pm 0.01$	2.34	5.2	680	6.1
XGBoost	$95.2 \pm 0.7$	$0.94 \pm 0.01$	3.12	8.7	920	7.4
MLP (2 layers)	$94.1 \pm 1.0$	$0.93 \pm 0.02$	1.95	6.4	410	5.5
k-NN (k = 5)	$90.8 \pm 1.4$	$0.89 \pm 0.02$	8.9	0.1	20	12.3
Decision Tree	$91.6 \pm 1.2$	$0.90 \pm 0.02$	0.75	0.8	85	2.1
Naive Bayes	$88.7 \pm 1.6$	$0.87 \pm 0.03$	0.22	0.15	29	1.1

Table 7. Edge server characteristics.

Edge Server	Number of CPUs	Disc Storage (GB)	RAM (GB)	CPU Frequency (GHz)
Edge server 1	2	120	8	2.2
Edge server 2	2	120	3	2.5
Edge server 3	3	150	2	1.8
Edge server 4	2	100	3	2.8
Edge server 5	1	80	2	3.0
Edge server 6	2	150	2	2.0
Edge server 7	3	100	2	3.1
Edge server 8	2	110	2	2.1
Edge server 9	1	90	2	2.5
Edge server 10	2	120	2	2.8
Edge server 11	1	100	2	2.1
Edge server 12	2	130	1	1.8
Edge server 13	1	80	1	2.5
Edge server 14	1	70	1	1.8
Edge server 15	2	100	2	1.8

Table 8. Simulation parameters.

Parameter	Value/Range	Justification
Number of IoT Devices	Randomly deployed (100 to 1000)	Simulates dynamic network density
Number of edge servers	15 (Fixed characteristics)	Based on Table 7
Task Size ($S_{t}$)	Uniformly distributed [50 KB, 5 MB]	Represents diverse application requirements
CPU Cycles ($C_{t}$)	Uniformly distributed [$10^{7}$, $10^{9}$] cycles	Represents varying computational complexity
IoT Device Frequency ($f_{iot}$)	Uniformly distributed [0.8 GHz, 1.5 GHz]	Models heterogeneity of mobile devices
Available Bandwidth ($B$)	Fluctuating [50 Mbps, 300 Mbps]	Simulated using OMNeT++ congestion models
Network Congestion ($ρ$)	Dynamically varied (Signal-to-Noise Ratio)	Crucial for modeling realistic transmission time
Workload Generator	Apache HTTP Benchmark (AB)	10,000 requests, 100 concurrent requests
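
As a rough illustration of how the ranges in Table 8 translate into synthetic task records, the sketch below draws each parameter independently. It is not the authors' generator (which relies on OMNeT++ and AB): the function `sample_task` and its field names are hypothetical, the cycle range is read as $10^{7}$ to $10^{9}$, 1 KB is taken as 1000 bytes, and the fluctuating bandwidth is approximated by a uniform draw.

```python
import random


def sample_task(rng: random.Random) -> dict:
    """Draw one synthetic task record from the ranges listed in Table 8."""
    return {
        "size_bytes": rng.uniform(50e3, 5e6),        # task size: 50 KB to 5 MB
        "cpu_cycles": rng.uniform(1e7, 1e9),         # required CPU cycles
        "f_iot_ghz": rng.uniform(0.8, 1.5),          # IoT device frequency
        "bandwidth_mbps": rng.uniform(50.0, 300.0),  # assumed uniform, see lead-in
    }


rng = random.Random(42)  # fixed seed so that runs are repeatable
tasks = [sample_task(rng) for _ in range(1000)]

first = tasks[0]
# Local execution time follows from cycles / frequency (GHz converted to Hz).
local_time_s = first["cpu_cycles"] / (first["f_iot_ghz"] * 1e9)
print(f"first task: {first['cpu_cycles']:.3e} cycles, local time {local_time_s:.3f} s")
```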

Table 9. Evaluation of standard deviation.

Number of tasks	200	500	1000	10,000
Our protocol	0.6455	0.5621	0.6868	0.6189

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
