DRL-Driven Intelligent SFC Deployment in MEC Workload for Dynamic IoT Networks
Abstract
1. Introduction
1.1. Background and Motivation
1.2. Motivation and Contributions
- We formulate the VNF placement and resource allocation problem as a multi-objective optimization task that balances conflicting performance goals: minimizing latency, conserving energy, and optimizing resource utilization.
- By intelligently distributing workloads across geographically distributed edge nodes based on real-time system states, the model ensures balanced resource utilization and reduces service degradation in high-density IoT deployments. Our scheme can proactively reallocate resources to avoid and minimize overload risks, thereby improving the long-term sustainability of edge nodes.
- We design a joint optimization algorithm for task offloading and resource allocation based on the Deep Q-Network (DQN), with the objective of assisting the NFVO in instantiating resources according to the workload requirements of IoT devices.
- The reward function is designed to reflect comprehensive network performance. We evaluate the proposed scheme with DRL frameworks in a network environment and compare it against existing solutions in terms of energy, latency, packet delivery ratio, packet drop ratio, and throughput.
1.3. Paper Organization
2. Related Work
3. Model and Problem Formulation
3.1. Network Model
- Offloading tasks from IoT devices to the MEC server for computation: In each timeslot-t, IoT devices offload their tasks to the MEC server. Equation (1) gives the communication model between the end devices and the MEC server, in which the data rate is determined by the allocated bandwidth, the channel gain, the transmission power, and the noise.
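As an illustration of the communication model, the sketch below computes a data rate from the four quantities named above. It assumes the common Shannon-capacity form, rate = B · log2(1 + p·h / noise); the exact form of Equation (1) is not recoverable here, so the formula, the function names, and the units are assumptions rather than the paper's definition.

```python
import math


def data_rate_bps(bandwidth_hz: float, channel_gain: float,
                  tx_power_w: float, noise_w: float) -> float:
    """Data rate in bit/s under an ASSUMED Shannon-capacity model."""
    if bandwidth_hz < 0 or channel_gain < 0 or tx_power_w < 0:
        raise ValueError("bandwidth, gain and power must be non-negative")
    if noise_w <= 0:
        raise ValueError("noise power must be positive")
    snr = tx_power_w * channel_gain / noise_w  # received signal-to-noise ratio
    return bandwidth_hz * math.log2(1.0 + snr)


def offload_delay_s(task_bits: float, rate_bps: float) -> float:
    """Time to transmit a task of task_bits over a link of rate_bps."""
    if rate_bps <= 0:
        raise ValueError("rate must be positive")
    return task_bits / rate_bps


if __name__ == "__main__":
    # Hypothetical numbers: 1 MHz of bandwidth and an SNR of 3.
    rate = data_rate_bps(1e6, 1.0, 3.0, 1.0)
    print(rate, offload_delay_s(512_000, rate))
```

Under this assumed model, allocating more bandwidth raises the rate linearly, while raising the transmission power only helps logarithmically, which is one reason bandwidth allocation is a natural part of the state.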
3.2. SFC Requests
3.2.1. Resource Constraints
3.2.2. Delay Constraint
3.3. Optimization Modelling Designs
3.3.1. State Space
- Resource coordination state: the resources coordinated in the MEC server so that it retains the ability to handle the computation tasks that IoT devices offload to MEC-m.
- Utilization bound: the upper-bound resource utilization in MEC-m at timeslot-t.
- Communication state: the communication from the local device, described by the state of total bandwidth observed from the total bandwidth allocation and the channel gain.
- Bandwidth allocation: how much bandwidth is allocated between nodes during the paging tasks.
- Task tuple: a subset of a tuple that describes the processing tasks, including the consumed resources and energy and the time spent, taken from experience.
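The state components listed above can be flattened into a single observation vector before being fed to a Q-network. The following is a minimal sketch of one possible encoding; the class name, field names, and ordering are hypothetical, since the original symbols are not available.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MecState:
    """Hypothetical container for the state components (names are assumptions)."""
    remaining_resources: List[float]  # per-server resources still available
    utilization_bound: List[float]    # per-server upper-bound utilization
    total_bandwidth: float            # total bandwidth state of the link
    allocated_bandwidth: float        # bandwidth currently allocated between nodes
    channel_gain: float               # channel gain of the device-to-server link
    task_features: List[float]        # e.g. consumed resources, energy, time spent

    def to_vector(self) -> List[float]:
        """Flatten all components into one list in a fixed order."""
        return [
            *self.remaining_resources,
            *self.utilization_bound,
            self.total_bandwidth,
            self.allocated_bandwidth,
            self.channel_gain,
            *self.task_features,
        ]


if __name__ == "__main__":
    state = MecState([0.5, 0.2], [0.9, 0.9], 1.0, 0.25, 0.8, [0.1, 0.2, 0.3])
    print(len(state.to_vector()))
```

Keeping the order fixed matters because the network's input layer has no other way to tell which entry corresponds to which component.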
3.3.2. Action Space
3.3.3. Reward Based on Policy Charging Selection
3.4. Pseudo DQN Algorithm Designs
Algorithm 1: Pseudo-code of DQN-based VNF placement algorithms
| Step | Operation |
|---|---|
| | Input |
| | Output |
| | Learning process: |
| 1 | Initialize the experience replay. |
| 2 | For each |
| 3 | of VNF from SFR without repetition |
| 4 | do |
| 5 | Generate a random value |
| 6 | then |
| 7 | |
| 8 | Else |
| 9 | |
| 10 | based on online_net |
| 11 | End if |
| 12 | Enforce the action that deploys the VNF and calculate the shortest path of the VNFFG |
| 13 | |
| 14 | |
| 15 | |
| 16 | steps |
| 17 | If the VNF placement and deployment are completed |
| 18 | Then |
| 20 | End if |
| 21 | End for |
| | Execution process: |
| 22 | Read the online_net and target_net |
| 23 | |
| 24 | Do |
| 25 | Select the action with the maximum Q-value |
| 26 | Execute the action and update the placement scheme: |
| 27 | |
| 28 | |
| 29 | If the VNF placement is completed, then |
| 31 | End if |
| 32 | End for |
| 33 | Return |
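The learning process in Algorithm 1 appears to follow the usual DQN pattern: epsilon-greedy action selection, an experience replay buffer, and separate online and target estimators. The sketch below illustrates those three pieces using only the Python standard library. To stay self-contained it replaces the neural networks with a tabular Q stand-in (a dictionary from state to a list of Q-values), so it is not the paper's model; all names and the update rule details are assumptions.

```python
import copy
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity: int):
        self.buf = deque(maxlen=capacity)

    def push(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size: int, rng: random.Random):
        return rng.sample(list(self.buf), min(batch_size, len(self.buf)))

    def __len__(self):
        return len(self.buf)


def select_action(q_values, epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy: explore with probability epsilon, else take argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward: float, next_q_target, gamma: float, done: bool) -> float:
    """Bootstrapped target computed from the target estimator."""
    return reward if done else reward + gamma * max(next_q_target)


def train_step(online, target, batch, gamma: float, lr: float) -> None:
    """Move online Q-values toward the TD target (tabular stand-in for SGD)."""
    for state, action, reward, next_state, done in batch:
        y = td_target(reward, target[next_state], gamma, done)
        online[state][action] += lr * (y - online[state][action])


def sync_target(online):
    """Periodic hard update: the target becomes a copy of the online table."""
    return copy.deepcopy(online)


if __name__ == "__main__":
    rng = random.Random(0)
    online = {"s0": [0.0, 0.0], "s1": [0.0, 0.0]}
    target = sync_target(online)
    buffer = ReplayBuffer(capacity=100)
    buffer.push(("s0", 1, 1.0, "s1", True))
    train_step(online, target, buffer.sample(32, rng), gamma=0.95, lr=0.5)
    print(online["s0"])
```

In a full implementation the two dictionaries would be replaced by neural networks and `train_step` by a gradient step on the squared difference between the online prediction and the target.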
4. Performance Evaluation
4.1. Comparison of Proposed and Reference Schemes
- DQN-MIoT is our proposed method, which utilizes DRL algorithms combined with NFVO in MEC to allocate resources for VNF placement. Our proposed DRL-based NFV control leverages a DQN architecture to approximate the Q-value function, enabling collaborative configuration of SDN/NFV flow rules for optimized placement and resource allocation. This method effectively trains the function approximator to manage high-dimensional state observations and representations. As detailed in Section 3, the approach incorporates experience replay, neural network training, and synchronization with the SDN/NFV controller to learn sophisticated policies.
- DQL-MIoT uses the Deep Q-learning approach, which learns from the network environment and applies actions to control the policy; however, DQL achieves low performance in complex network topologies and can only handle lightweight topology cases.
- Greedy-MIoT represents the traditional SDN/NFV approach to computing task management for IoT-MEC servers. This baseline uses a centralized MANO that manages resources and controls the characteristics of IoT traffic, adhering to standard NFV rules. The resource management controller bases its decisions on topology, traffic conditions, service requirements, and offloading policies. Greedy-MIoT relies on network-level optimization, efficient resource management, and resource allocation for non-complex applications.
- Random-MIoT makes its choices stochastically: for every incoming IoT-R, it randomly selects an MEC server from the ingress source, a VNF resource instance to traverse, and a routing path that chains each pair of adjacent VNF instances.
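To make the two non-learning baselines concrete, the sketch below places the VNFs of one chain on MEC servers with a greedy rule and with a random rule. The greedy criterion used here (pick the feasible server with the most remaining capacity) and the single scalar capacity per server are assumptions chosen for illustration; the source does not specify the exact rule used by Greedy-MIoT, and routing is omitted.

```python
import random
from typing import Callable, Dict, List, Optional


def greedy_place(demand: float, remaining: Dict[str, float]) -> Optional[str]:
    """ASSUMED greedy rule: feasible server with the most remaining capacity."""
    feasible = {m: c for m, c in remaining.items() if c >= demand}
    if not feasible:
        return None
    return max(feasible, key=feasible.get)


def random_place(demand: float, remaining: Dict[str, float],
                 rng: random.Random) -> Optional[str]:
    """Random rule: any feasible server, chosen uniformly."""
    feasible = sorted(m for m, c in remaining.items() if c >= demand)
    return rng.choice(feasible) if feasible else None


def place_chain(demands: List[float], remaining: Dict[str, float],
                chooser: Callable[[float, Dict[str, float]], Optional[str]]
                ) -> Optional[List[str]]:
    """Place each VNF of a chain in order; return None if any VNF cannot fit."""
    capacity = dict(remaining)  # work on a copy so the caller's view is untouched
    placement = []
    for demand in demands:
        server = chooser(demand, capacity)
        if server is None:
            return None
        capacity[server] -= demand
        placement.append(server)
    return placement


if __name__ == "__main__":
    servers = {"mec1": 4.0, "mec2": 8.0}  # hypothetical capacities
    rng = random.Random(7)
    print(place_chain([5.0, 3.0, 2.0], servers, greedy_place))
    print(place_chain([1.0, 1.0], servers, lambda d, c: random_place(d, c, rng)))
```

Both baselines decide one VNF at a time from the current capacities only, which is the main contrast with a learned policy that can account for the long-term effect of a placement.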
4.2. Results and Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Notation | Description |
|---|---|
| | Set of MEC servers |
| E | Set of links |
| V | Set of VNFs |
| | Set of tasks |
| | Set of IoT devices |
| | Number of timeslots |
| | Resources of MEC servers |
| | Set of all VNFs requested by IoT-Rj |
| | Upper-bound capacity of MEC servers (CPU, RAM, disk) |
| BW | Maximum bandwidth |
| | Propagation delay between nodes on link e |
| F | Number of service requests |
| | Upper-bound bandwidth of MEC-m at timeslot-t |
| | Remaining resources of MEC-m at timeslot-t |
| | Decision variable |
| | Equals one if the VNF of the service request is deployed at node-v at timeslot-t, otherwise zero |
| | Equals one if the virtual link to node-j is selected at timeslot-t, otherwise zero |
| Parameters | Specifications |
|---|---|
| Hosting infrastructure | AMD Ryzen 7 5700X3D CPU @ 3.0 GHz, 32 GB RAM, NVIDIA RTX 4080 GPU |
| Number of IoT devices | 100 |
| Number of MEC servers | 5 |
| Service request configuration | NF (types = 5) |
| | SFC type of length (2–5) |
| | Flow rate (64 Kbps–4 Mbps) |
| | Tolerable time (100–500) |
| Task complexity and sizes | Random set (low, normal, high): (256 Kbits, 512 Kbits, 1024 Kbits) |
| Learning rate | 0.001 |
| Discount factor | 0.95 |
| Batch size | Random set (32, 64, 128, 256) |
| Exploration | 0.5 |
| Number of episodes | 400 |
| Python platform | PyTorch |
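For reproducibility, the simulation parameters in the table above could be collected in a single configuration object. The sketch below is one hypothetical way to do so; the field names, the conversion of 4 Mbps to 4000 Kbps, and the uniform sampling of service requests are assumptions, and the unit of the tolerable time is left unspecified because the table does not state it.

```python
import random
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SimulationConfig:
    """Values copied from the parameter table; field names are hypothetical."""
    num_iot_devices: int = 100
    num_mec_servers: int = 5
    nf_types: int = 5
    sfc_length_range: Tuple[int, int] = (2, 5)
    flow_rate_kbps_range: Tuple[float, float] = (64.0, 4000.0)  # assumes 1 Mbps = 1000 Kbps
    tolerable_time_range: Tuple[int, int] = (100, 500)  # unit not stated in the table
    task_sizes_kbits: Tuple[int, ...] = (256, 512, 1024)
    learning_rate: float = 0.001
    discount_factor: float = 0.95
    batch_sizes: Tuple[int, ...] = (32, 64, 128, 256)
    exploration: float = 0.5
    num_episodes: int = 400


def sample_request(cfg: SimulationConfig, rng: random.Random) -> Dict[str, object]:
    """Draw one service request uniformly from the configured ranges (assumption)."""
    length = rng.randint(*cfg.sfc_length_range)  # randint is inclusive on both ends
    return {
        "chain": [rng.randrange(cfg.nf_types) for _ in range(length)],
        "flow_rate_kbps": rng.uniform(*cfg.flow_rate_kbps_range),
        "tolerable_time": rng.randint(*cfg.tolerable_time_range),
        "task_size_kbits": rng.choice(cfg.task_sizes_kbits),
    }


if __name__ == "__main__":
    cfg = SimulationConfig()
    print(sample_request(cfg, random.Random(1)))
```

Freezing the dataclass prevents accidental changes to hyperparameters partway through a run.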
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ros, S.; Ryoo, I.; Kim, S. DRL-Driven Intelligent SFC Deployment in MEC Workload for Dynamic IoT Networks. Sensors 2025, 25, 4257. https://doi.org/10.3390/s25144257