Enhancing the Resilience of ROS 2-Based Multi-Robot Systems with Kubernetes: A Case Study on UWB-Based Relative Positioning
Abstract
1. Introduction
- Kubernetes-orchestrated ROS 2 edge architecture. We design and implement a K3S-based orchestration of containerized ROS 2 nodes for a multi-robot relative localization task, detailing deployment descriptors and node placement (worker/master) relevant to resource-constrained robots.
- UWB error mitigation at the edge. We integrate an LSTM-based ranging error estimator as distributed ROS 2 nodes on each robot, feeding a particle filter relative pose estimator.
- Fault injection evaluation of resilience. We induce controlled LSTM node failures (scenarios F1–F5) during operation and measure localization performance and recovery, showing graceful degradation relative to a no-LSTM baseline (NOLSTM) and rapid restoration via Kubernetes self-healing.
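
The graceful degradation mentioned in the last contribution can be pictured with a minimal Python sketch. It assumes, hypothetically, that the LSTM estimator outputs an additive ranging error that is subtracted from the raw UWB range, and that the particle filter falls back to the raw range whenever no recent estimate is available (for example, while an LSTM pod is restarting). The function name, the sign convention, and the `max_age_s` threshold are illustrative assumptions rather than the deployed implementation.

```python
from typing import Optional


def corrected_range(
    raw_range: float,
    predicted_error: Optional[float],
    estimate_age_s: float,
    max_age_s: float = 0.5,  # hypothetical staleness threshold, in seconds
) -> float:
    """Return the range value handed to the particle filter.

    Assumption: predicted_error = measured range - true range, so the
    correction is a subtraction. If the estimate is missing or older than
    max_age_s, the raw range is passed through unchanged.
    """
    if predicted_error is None or estimate_age_s > max_age_s:
        return raw_range  # degraded mode: behaves like the NOLSTM baseline
    return raw_range - predicted_error
```

Under these assumptions, a failed LSTM node only removes the correction for the affected ranges; the positioning pipeline itself keeps running.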
2. Background and Related Works
2.1. Conceptual Background
2.2. Related Works
3. Resilient Multi-Robot Relative Positioning Using Kubernetes and ROS 2
3.1. System Architecture
3.2. System Deployment
3.2.1. Docker Registry
3.2.2. Particle Filter Node

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: pf-ros2-deployment
      labels:
        app: pf-ros2
    spec:
      selector:
        matchLabels:
          app: pf-ros2
      template:
        metadata:
          labels:
            app: pf-ros2
        spec:
          containers:
          - name: pf-ros2
            image: pf-ros2:ros2core

3.2.3. LSTM Nodes

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: uwb-lstm-deploymentxxx
      labels:
        app: uwb-lstmxxx
    spec:
      selector:
        matchLabels:
          app: uwb-lstmxxx
      template:
        metadata:
          labels:
            app: uwb-lstmxxx
        spec:
          containers:
          - name: uwb-lstm-ros2core-xxx
            image: uwb-lstm-ros2core:xxx

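
The LSTM descriptor above is a template. A minimal Python sketch of how it could be instantiated once per robot follows, assuming that `xxx` stands for a per-robot identifier and that the `apiVersion` matches the particle filter descriptor (`apps/v1`). The function name `lstm_deployment` and the identifiers `tb1` and `tb2` are hypothetical; the output is JSON, which `kubectl apply -f` accepts in addition to YAML.

```python
import json


def lstm_deployment(robot_id: str) -> dict:
    """Build the LSTM Deployment manifest for one robot.

    Assumption: robot_id replaces the `xxx` placeholder of the template.
    """
    app = f"uwb-lstm{robot_id}"
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {
            "name": f"uwb-lstm-deployment{robot_id}",
            "labels": {"app": app},
        },
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": {"app": app}},
            "template": {
                "metadata": {"labels": {"app": app}},
                "spec": {
                    "containers": [
                        {
                            "name": f"uwb-lstm-ros2core-{robot_id}",
                            "image": f"uwb-lstm-ros2core:{robot_id}",
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical robot identifiers, one manifest printed per robot.
    for rid in ["tb1", "tb2"]:
        print(json.dumps(lstm_deployment(rid), indent=2))
```

Generating the manifests from one function keeps the name, label, and selector fields consistent, which is where hand-edited copies most often diverge.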
3.2.4. K3S Edge Cluster
3.3. Evaluation Metrics
3.4. Experimental Setup
3.4.1. Overall Setting
3.4.2. Software Information
3.4.3. Hardware Information
4. Experimental Results
5. Conclusions and Discussion
5.1. Conclusions
5.2. Discussion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Scenario | LSTM Pods Running at s | Failure Event at s | LSTM Pods Running 5 s After Failure | Purpose |
---|---|---|---|---|
NOLSTM | 0 | — | 0 | Pure UWB + PF baseline (no correction) |
SALL | 5 | none | 5 | Ideal case: full correction, no faults |
F1 | 5 | kill 1 pod | 5 (after restart) | Single-node fault, tests self-healing |
F2 | 5 | kill 2 pods | 5 (after restart) | Dual-node fault, moderate stress |
F3 | 5 | kill 3 pods | 5 (after restart) | Majority failure, high stress |
F4 | 5 | kill 4 pods | 5 (after restart) | Only one pod initially survives |
F5 | 5 | kill all 5 pods | 5 (after restarts) | Worst case: complete outage, full recovery required |
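
The fault scenarios in the table can be expressed as a small fault injection plan. The Python sketch below only builds `kubectl delete pod` command lines and does not execute them. The pod names, the seeded random choice of victims, and the helper name `kill_commands` are assumptions made for illustration; the table specifies only how many pods are killed per scenario. NOLSTM is omitted because no LSTM pods are deployed in that baseline.

```python
import random

# Number of LSTM pods killed per scenario, transcribed from the table.
PODS_KILLED = {"SALL": 0, "F1": 1, "F2": 2, "F3": 3, "F4": 4, "F5": 5}


def kill_commands(scenario: str, pod_names: list[str], seed: int = 0) -> list[list[str]]:
    """Return one `kubectl delete pod <name>` command per pod to kill.

    Victims are drawn with a seeded RNG so that a run can be repeated
    (an assumption; any selection rule would fit the table).
    """
    n = PODS_KILLED[scenario]
    if n > len(pod_names):
        raise ValueError("scenario kills more pods than are running")
    victims = random.Random(seed).sample(sorted(pod_names), n)
    return [["kubectl", "delete", "pod", name] for name in sorted(victims)]


if __name__ == "__main__":
    pods = [f"uwb-lstm-{i}" for i in range(5)]  # hypothetical pod names
    for cmd in kill_commands("F3", pods):
        print(" ".join(cmd))
```

Because each pod belongs to a Deployment, deleting it causes Kubernetes to create a replacement, which is the self-healing behaviour the scenarios are designed to exercise.
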
Scenario | TB1 (RMSE/STD) | TB2 (RMSE/STD) | TB3 (RMSE/STD) | TB4 (RMSE/STD) |
---|---|---|---|---|
SALL | (0.121/0.064) | (0.122/0.064) | (0.117/0.057) | (0.133/0.070) |
F1 | (0.116/0.058) | (0.118/0.058) | (0.113/0.057) | (0.132/0.065) |
F2 | (0.120/0.060) | (0.123/0.062) | (0.116/0.059) | (0.136/0.074) |
F3 | (0.118/0.062) | (0.125/0.067) | (0.113/0.056) | (0.138/0.071) |
F4 | (0.118/0.058) | (0.121/0.061) | (0.116/0.059) | (0.131/0.067) |
F5 | (0.230/0.195) | (0.122/0.060) | (0.118/0.058) | (0.136/0.070) |
NOLSTM | (0.661/0.528) | (0.134/0.076) | (0.188/0.127) | (0.255/0.186) |
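
The RMSE/STD pairs in the table can be related to per-sample errors with a short Python sketch. It assumes that each sample is a scalar positioning error and that STD is the population standard deviation of those errors; both are assumptions about the metric definition. The second helper compares scenarios using the TB1 values transcribed from the table; the names `rmse_and_std` and `rmse_ratio` are hypothetical.

```python
import math
import statistics


def rmse_and_std(errors: list[float]) -> tuple[float, float]:
    """Root-mean-square error and population standard deviation."""
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    std = statistics.pstdev(errors)
    return rmse, std


# TB1 RMSE values transcribed from the results table.
TB1_RMSE = {"SALL": 0.121, "F5": 0.230, "NOLSTM": 0.661}


def rmse_ratio(table: dict[str, float], scenario: str, reference: str = "SALL") -> float:
    """RMSE of a scenario relative to the fault-free reference."""
    return table[scenario] / table[reference]


if __name__ == "__main__":
    for name in ("F5", "NOLSTM"):
        print(name, round(rmse_ratio(TB1_RMSE, name), 2))
```

For TB1, the complete outage in F5 gives an RMSE about 1.9 times that of SALL, whereas the NOLSTM baseline is about 5.5 times that of SALL.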
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, J.; Yu, X.; Westerlund, T. Enhancing the Resilience of ROS 2-Based Multi-Robot Systems with Kubernetes: A Case Study on UWB-Based Relative Positioning. Sensors 2025, 25, 5067. https://doi.org/10.3390/s25165067