3.1. Implementation
In the following, we discuss the implementation details of our Q8S prototype with examples taken from our OpenStack environment.
Depending on the configuration of a given OpenStack cloud, the default network masks may vary. In our OpenStack environment, all hosts are part of the 10.254.1.0/24 network range, and the QEMU VMs use 192.11.3.0/24. When assigning the IP of a QEMU VM, we configured Q8S to automatically mirror the host IP, i.e., to reuse its last octet in the VM subnet; e.g., the host with 10.254.1.3 internally runs a VM with 192.11.3.3. This also ensures that the QEMU VM IPs are unique across both networks.
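A minimal sketch of this mapping, assuming it simply reuses the host's last octet within the VM subnet (the helper function and constants below are ours for illustration, not part of the Q8S code base):

```python
import ipaddress

HOST_NET = ipaddress.ip_network("10.254.1.0/24")  # OpenStack host network
VM_NET = ipaddress.ip_network("192.11.3.0/24")    # QEMU VM network

def vm_ip_for_host(host_ip: str) -> str:
    """Derive the QEMU VM IP by reusing the host's offset within its /24 subnet."""
    host = ipaddress.ip_address(host_ip)
    if host not in HOST_NET:
        raise ValueError(f"{host_ip} is not in {HOST_NET}")
    offset = int(host) - int(HOST_NET.network_address)
    return str(VM_NET.network_address + offset)

print(vm_ip_for_host("10.254.1.3"))  # -> 192.11.3.3
```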
Figure 3 provides an overview of a cluster created by Q8S with the 10.254.1.0/24 IP range used for the hosts. The name of the default network interface of the hosts in the OpenStack network is ens3 here, but it may be different in other OpenStack clouds.
The starting instance in the top left of Figure 3 is the node on which the user started Q8S and which initialized the cluster as its first control plane node. Below it are the other control plane nodes, which do not install an additional virtualization layer. On the right, the worker nodes are depicted, numbered 1 to n. Note that Figure 3 shows two worker types: x86 and arm64.
The worker nodes deploy QEMU and connect its internal network via a bridge to the host network. From the perspective of the other nodes, the additional virtualization layer provided by QEMU is not visible on the network layer, as any requests sent to a worker node are forwarded by the iptables rules to the internal VM via NAT.
The internal networking of a worker host is depicted in detail in Figure 4, with the white box on the left listing the iptables rules. These include prerouting DNAT rules that forward any traffic sent to the host to the QEMU VM on the same port. The exceptions are port 22 of the host, which still provides regular SSH access to the host node, and port 2222, which is redirected to port 22 of the QEMU VM for SSH access. For outgoing traffic, SNAT is used to masquerade requests as the host node.
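Q8S installs these rules through its setup scripts; the listing below is only a hedged sketch of how the described forwarding could be expressed from Python. The interface name, host IP, and VM IP are example values from this environment, and the exact rule set used by Q8S may differ:

```python
import subprocess

HOST_IF = "ens3"        # host's interface in the OpenStack network (example name)
HOST_IP = "10.254.1.3"  # example host IP
VM_IP = "192.11.3.3"    # matching QEMU VM IP (see the IP mapping above)

# Rule order matters: the port-2222 redirect must precede the catch-all DNAT rules.
rules = [
    # SSH to host port 2222 is redirected to port 22 of the QEMU VM.
    ["-t", "nat", "-A", "PREROUTING", "-i", HOST_IF, "-p", "tcp", "--dport", "2222",
     "-j", "DNAT", "--to-destination", f"{VM_IP}:22"],
    # All other TCP traffic, except host SSH on port 22, is forwarded to the VM on the same port.
    ["-t", "nat", "-A", "PREROUTING", "-i", HOST_IF, "-p", "tcp", "!", "--dport", "22",
     "-j", "DNAT", "--to-destination", VM_IP],
    # UDP traffic is forwarded to the VM on the same port.
    ["-t", "nat", "-A", "PREROUTING", "-i", HOST_IF, "-p", "udp",
     "-j", "DNAT", "--to-destination", VM_IP],
    # Outgoing VM traffic is masqueraded as the host via SNAT.
    ["-t", "nat", "-A", "POSTROUTING", "-s", VM_IP, "-o", HOST_IF,
     "-j", "SNAT", "--to-source", HOST_IP],
]

for rule in rules:
    subprocess.run(["iptables", *rule], check=True)
```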
The right half of Figure 4 shows the network bridge provided by libvirt, labeled virbr0, which connects the host and QEMU VM networks. Inside the QEMU VM is the enp1s0 interface, which holds the IP of the overall QEMU VM. Inside the VM is the local network created by Flannel, which is used to assign IPs to the individual containers. Flannel is shown in Figure 4 to use the node's public-ip for inter-node communication, which is necessary for it to accept the requests forwarded via the iptables rules.
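How Q8S passes the public IP to Flannel is part of its setup scripts; purely as an illustration of one mechanism Flannel offers for this (not necessarily the one used by Q8S), the advertised public IP of a node can be overridden with a node annotation, e.g., via the Kubernetes Python client. Node name and IP below are placeholders:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

node_name = "q8s-worker-1"   # hypothetical node name
host_ip = "10.254.1.3"       # the host IP that the iptables rules forward from

# Annotation understood by Flannel to overwrite the public IP it advertises for this node.
patch = {"metadata": {"annotations": {
    "flannel.alpha.coreos.com/public-ip-overwrite": host_ip,
}}}
v1.patch_node(node_name, patch)
```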
Q8S requires two configuration files at the start: the YAML cluster configuration file and a file with credentials for the OpenStack cloud. The official Python OpenStack SDK used by Q8S is able to process clouds.yaml files, which can be downloaded through the OpenStack Horizon web interface under application credentials. With these application credentials, Q8S is able to query the OpenStack API for the OpenStack project for which the credentials were created.
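As an illustration, opening a connection from such a clouds.yaml file with the OpenStack SDK can look as follows; the cloud entry name "openstack" is an assumption, and Q8S's actual entry point may differ:

```python
import openstack

# The SDK searches for clouds.yaml in ./, ~/.config/openstack/, and /etc/openstack/.
# "openstack" is the name of the cloud entry inside clouds.yaml (assumed here).
conn = openstack.connect(cloud="openstack")

# With valid application credentials, the project's resources can be queried, e.g.:
for server in conn.compute.servers():
    print(server.name, server.status)
```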
The cluster configuration file includes the user-provided settings for the cluster to be created. An example of such a file is given in Listing 1. The configuration includes the following fields, which the users are expected to adjust to their needs:
git_url: A URL pointing to a Q8S git repository, which is either public or accessible via embedded access tokens. This is required for the later installation stages to download the respective setup scripts on the new nodes.
private_network_id: The id of the internal network in which all hosts reside; it can be found in the OpenStack Horizon web interface under networks as the id of the private subnet. This is needed to request new VMs via the OpenStack API.
remote_ip_prefix: The IP mask of the OpenStack network. In our environment, this is 10.254.1.0/24.
default_image_name: The name of the OS image that should be used for the OpenStack VMs. Q8S expects this to be an Ubuntu image.
name_of_initial_instance: The name of the starting instance in OpenStack. This is needed to update its security groups.
security_groups: The list of OpenStack security groups that should be added to each node. This list must at least contain q8s-cluster, which is the group configured for internal communication of the Kubernetes cluster.
required_tcp_ports: The list of TCP ports that should be opened in the q8s-cluster security group for inter-node communication. The list given in the example in Listing 1 should be kept, but further ports may be added.
required_udp_ports: The same as for TCP ports above, but for UDP.
worker_port_range_min: This is the lower end of the port range that is to be opened in addition to the TCP ports specified above and used for Kubernetes container node ports.
worker_port_range_max: The upper end of the port range described above.
master_node_flavor: The flavor to be used by additional control plane nodes. The flavor in OpenStack specifies the number of CPUs and amount of system memory OpenStack should allocate from a project quota to a specific VM.
number_additional_master_nodes: The number of control plane nodes that Q8S should deploy in addition to the starting instance. The control plane IP is always set to the IP of the node running Q8S, and no failover mechanism is deployed. Therefore, even if the created cluster includes multiple control plane nodes, it is not a high-availability (HA) deployment.
worker: The vm_types specified in the next section can be used here to indicate how many instances of a given type should be deployed by Q8S.
The vm_types a user wishes to use should each be specified as a dictionary with the following fields:
architecture: System architecture of the emulated CPU. Our Q8S prototype supports x86_64 and ARM_64.
num_cpus: The number of emulated CPUs that should be available in the QEMU VM.
cpu_model: Specific CPU model that should be emulated by QEMU, which also determines the available CPU speed. The list of supported CPU models depends on QEMU and can be found in its documentation.
machine_model: Machine model requested through QEMU. This should be kept as virt.
ram: The amount of system memory to allocate for the QEMU VM in MB.
storage: Amount of storage to allocate for the QEMU VM in GB.
openstack_flavor: Flavor to use in OpenStack for the host. The flavor should include at least as many CPUs as the emulated node.
Listing 1. An example file for a cluster definition. The notation using '!', e.g., !ClusterData and !VmType, is used by Q8S to map the respective sections of the configuration to Python data classes.
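As a hedged illustration of that mapping, custom YAML tags can be bound to data classes with PyYAML constructors. The sketch below is simplified: the actual Q8S classes and loader setup may differ, the fields follow the descriptions above, and the layout of the worker mapping is an assumption:

```python
from dataclasses import dataclass
from typing import Dict, List

import yaml


@dataclass
class VmType:
    architecture: str
    num_cpus: int
    cpu_model: str
    machine_model: str
    ram: int          # MB
    storage: int      # GB
    openstack_flavor: str


@dataclass
class ClusterData:
    git_url: str
    private_network_id: str
    remote_ip_prefix: str
    default_image_name: str
    name_of_initial_instance: str
    security_groups: List[str]
    required_tcp_ports: List[int]
    required_udp_ports: List[int]
    worker_port_range_min: int
    worker_port_range_max: int
    master_node_flavor: str
    number_additional_master_nodes: int
    worker: Dict[str, int]  # vm_type name -> number of instances (layout assumed)


def _constructor(cls):
    def construct(loader, node):
        return cls(**loader.construct_mapping(node, deep=True))
    return construct


yaml.SafeLoader.add_constructor("!ClusterData", _constructor(ClusterData))
yaml.SafeLoader.add_constructor("!VmType", _constructor(VmType))

with open("cluster.yaml") as f:
    cluster = yaml.safe_load(f)
```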
Figure 2 gives a high-level overview of the Q8S workflow and the interactions between the components. To provide a more detailed insight into the steps involved, the following presents the concrete operations that Q8S executes in order:
Creation of the security group q8s-cluster if it does not exist and the configuration of the rules as specified in the settings file;
Creation of an SSH key pair, which is uploaded to OpenStack such that the new VMs are initialized with it and can later on be accessed;
Creation of the OpenStack VMs for the control plane and worker nodes via the OpenStack API (a sketch of this step is given after this list);
Waiting for all new OpenStack VMs to be reachable via SSH;
Installation of Kubernetes dependencies and system configurations required for Kubernetes on the instance running Q8S;
Initializing the Kubernetes cluster and installing Flannel;
Extracting the Kubernetes token for the joining of worker nodes and uploading the cluster certificates required for joining control plane nodes;
Installation of Kubernetes dependencies and system configurations on the additional control plane nodes;
Joining of the additional control plane nodes to the cluster;
Installation of QEMU and libvirt on the worker hosts;
Configuration of the QEMU network to ensure that the desired IP address matching the host IP is assigned to the QEMU VM;
Downloading of the Ubuntu Cloud-Images;
Preparation of the user-data and meta-data files for the Cloud-Image VM;
Preparation of the Cloud-Image for QEMU;
Creation of the QEMU VM from the Cloud-Image;
Installation of Kubernetes, its dependencies, and the required system configurations on the QEMU VMs, as triggered by cloud-init;
Waiting for all worker nodes to join the Kubernetes cluster.
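As an illustration of the VM creation step referenced in the list, a minimal sketch using the OpenStack SDK's cloud layer might look as follows; all concrete names and values are placeholders derived from the configuration fields described earlier, not the literal calls made by Q8S:

```python
import openstack

conn = openstack.connect(cloud="openstack")

# Values correspond to fields of the cluster configuration; concrete values are placeholders.
server = conn.create_server(
    name="q8s-worker-x86-1",
    image="Ubuntu 22.04",            # default_image_name
    flavor="m1.large",               # openstack_flavor of the vm_type
    network="private",               # the network referenced by private_network_id
    key_name="q8s-key",              # SSH key pair uploaded in an earlier step
    security_groups=["q8s-cluster"],
    wait=True,                       # block until OpenStack reports the VM as active
)
print(server.name, server.status)
```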
Once all these steps are completed, the cluster is ready to be used, and Q8S is no longer required. In the prototype implementation of Q8S, the above steps are executed sequentially and are still open to optimization through parallelization, for example, by letting the deployment process of the additional control plane nodes and the emulated worker nodes run in parallel.
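As a hedged sketch of such an optimization (not part of the current prototype; the setup routines below are placeholders for the existing sequential steps), the per-node setup could be dispatched through a thread pool:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def setup_control_plane_node(node):
    """Placeholder for the existing control plane setup routine."""
    ...

def setup_worker_host(node):
    """Placeholder for the worker setup: QEMU install, image preparation, VM creation."""
    ...

def deploy_in_parallel(control_plane_nodes, worker_nodes, max_workers=10):
    """Run the per-node setup routines concurrently instead of sequentially."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(setup_control_plane_node, n) for n in control_plane_nodes]
        futures += [pool.submit(setup_worker_host, n) for n in worker_nodes]
        for future in as_completed(futures):
            future.result()  # re-raise any setup error
```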
3.2. Evaluation
In this section, we discuss the measurements that were acquired according to the methods described in Section 2.2.
Table 2 shows the bandwidth measured through the benchmark for pod-to-pod and pod-to-service communication for emulated and non-emulated nodes via TCP and UDP. Except for UDP being slightly slower on the emulated nodes, the performance is almost identical despite the additional layer of networking that applies for the emulated nodes.
The difference in UDP speed could be related to the UDP packet size [34] being non-optimal for wrapping in the Flannel environment. As the difference is marginal, we consider the networking performance to be effectively identical.
We measured the following network latencies:
Worker-Node-to-Worker-Node round-trip latency: ≈3.1 ms;
Pod-to-Pod (different nodes) round-trip latency: ≈4.8 ms;
Pod-to-Pod (same node) round-trip latency: ≈1.1 ms;
Worker-Node-to-Control-Plane-Node round-trip latency: ≈1.5 ms;
Control-Plane-Node-to-Control-Plane-Node round-trip latency: ≈0.5 ms.
Between control plane nodes, no additional iptables rules are applied; between a worker node and a control plane node, one set of iptables rules is applied; and between worker nodes, the iptables rules are applied both when sending and when receiving.
Going from no iptables rules to one pass adds about 1 ms of latency (0.5 ms vs. 1.5 ms), and the second pass adds another 1.6 ms (1.5 ms vs. 3.1 ms). Pod-to-pod communication across two worker nodes is slower still (≈4.8 ms), as it additionally has to pass through the Kubernetes-internal networking.
Considering the low latency for direct pings between two hosts, these latencies created by the additional iptables and network virtualization layers seem high but are overall still relatively low and acceptable for the majority of use cases.
The CPU usage of the nodes during the throughput benchmark as well as in the idle state is given in Table 3. The benchmark consists of a client-and-server pairing, where the client sends requests to the server.
The CPU usage is significantly higher for the emulated nodes than for the OpenStack nodes. Even during the benchmark, the CPU usage of the OpenStack node barely goes up except for the UDP communication, where it is significantly increased for the client node but not the server node.
The reason for the increased CPU usage likely lies in the emulation, as, for each CPU instruction, the hypervisor has to perform additional operations to check and translate the instruction in addition to actually performing the instruction. This overhead factor even varies between client and server, such that we can assume that sending and receiving requests require different quantities of overhead operations by the hypervisor.
Regarding the increased client CPU usage for UDP, other benchmarks created with k8s-bench-suite [35] also show an increase in CPU usage for UDP clients. Our test setup features only 2 CPU cores; therefore, we can speculate that the performance differences would be less pronounced when using more cores, as in the external example.
The memory usage of the nodes is given in Table 4 and shows no significant changes during the benchmark. Notably, the memory usage of the emulated server is even slightly lower than that of the OpenStack server. Overall, the memory usage does not vary significantly, such that it appears to be handled well by the emulation layer.
Moreover, to verify that Q8S is capable of deploying larger clusters, we deployed a cluster with 50 emulated worker nodes, consisting of 30 emulated x86_64 nodes and 20 emulated ARM64 nodes, as well as two additional master nodes. The time to complete the deployment was 77 min, with the times taken for the individual stages shown in Table 5.
The first stage in Table 5 represents the time taken for OpenStack to provision the nodes and ends when OpenStack reports all nodes as ready. However, for OpenStack, a node appears ready once it has successfully booted, while it might still be starting up and launching services, such that access to it via SSH is not yet possible. In the next stage, Q8S distributes setup scripts and configuration files to the nodes; this takes only a few seconds per node but is performed sequentially in our implementation, and because nodes might not yet accept an SSH connection, the script has to wait.
The third major stage in Table 5 is the setup of the hosts, which involves installing QEMU as well as downloading and preparing the guest image for the emulated workers. This stage is processed in parallel. Afterwards, the cloud-init script has to run in all emulated workers to install Kubernetes and finally join the cluster. After joining the cluster, all nodes must report a ready status for the next stage, before the network setup is completed in the last stage.
The slowest operation by far was the setup of the emulated nodes, including running cloud-init and installing Kubernetes, which took about 23 min for the fastest node and 39 min for the slowest node. Moreover, all x86_64 nodes had completed this stage after 25 min, while it took the ARM64 nodes at least 35 min. Another longer stage was the initial setup of each of the hosts, i.e., the installation of QEMU and other dependencies, which took 24 min.
In this part, we discuss how the presented Q8S prototype aligns with the functional requirements that we defined in Section 2.
As we were able to perform the benchmarking on a Kubernetes cluster deployed through Q8S, we can consider FR5 to be completed. The benchmarking was performed on an OpenStack cluster for which we did not request any additional privileges, fulfilling FR1 and FR6. Specifying the node settings, including system architecture, CPU speed, core count, memory, and storage size, also worked, which completes FR2. Q8S was also able to deploy the desired number of nodes of each type and set up a working emulation for them, which matches FR3 and FR4.
Table 6 provides an overview of the status of the requirements.
However, while Q8S completes the goals we set out for it, we still see further features and improvements that should be made. These include support for limiting bandwidth and latency along with simulating network failure rates to emulate edge nodes. Moreover, Q8S is slow: a full deployment can take over an hour, highlighting the need to parallelize and optimize the process. Nevertheless, once the deployment is complete, the created cluster can be used for multiple rounds of experiments without having to rerun Q8S, making Q8S a valuable tool for emulating instead of simulating heterogeneous Kubernetes clusters.