Q8S: Emulation of Heterogeneous Kubernetes Clusters Using QEMU
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This study makes a significant contribution to the subject of Kubernetes scheduling in heterogeneous environments, and its overall structure and technique are satisfactory. However, by resolving the following issues, the paper can be reinforced even more:
- Abstract Improvement: A succinct and straightforward problem statement that highlights the increasing significance of heterogeneous infrastructure in cloud workloads would be a good place to start for the abstract. This will engage the reader right away and better contextualize the necessity for the suggested solution (Q8S).
- Literature Gap: Although the introduction gives a broad summary of related technologies, the gap in the literature is not yet clearly stated. It is advised to conclude Section 1 with a succinct explanation of the particular drawbacks of the current tools (such as CloudSim and K8sSim) and how Q8S specifically overcomes them.
- Performance Metrics: For a more comprehensive evaluation of the proposed framework, it is advisable to incorporate and report on a broader set of performance metrics. These could include:
- Scheduler efficiency metrics (e.g., scheduling latency, throughput, placement accuracy)
- System performance metrics such as CPU/GPU utilization per node, memory and I/O usage, and job completion time (turnaround time)
- QoS/SLA-related metrics, which are critical for assessing the ability of the scheduler to meet service-level expectations
Author Response
Thank you for your feedback. We have made improvements to the paper in line with your feedback and the feedback from other reviewers. Please find below how we addressed your individual comments.
Comments 1: Abstract Improvement: A succinct and straightforward problem statement that highlights the increasing significance of heterogeneous infrastructure in cloud workloads would be a good place to start for the abstract. This will engage the reader right away and better contextualize the necessity for the suggested solution (Q8S).
Response 1: We agree that we missed pointing out the need for support for heterogeneous infrastructure in scheduling solutions. We changed the first sentence about heterogeneous infrastructures to: "Heterogeneous infrastructures, which include IoT devices or specialized hardware, have become more widespread and require specialized tuning to optimize workload assignment for which researchers and developers working on scheduling systems require access to heterogeneous hardware for development and testing, which may not be available."
Comments 2: Literature Gap: Although the introduction gives a broad summary of related technologies, the gap in the literature is not yet clearly stated. It is advised to conclude Section 1 with a succinct explanation of the particular drawbacks of the current tools (such as CloudSim and K8sSim) and how Q8S specifically overcomes them.
Response 2: We have extended the last paragraph of section 1.0 to better emphasize the advantages of an emulation over a simulation to verify a scheduling algorithm: "Q8S offers a greater level of detail than simulations as instead of a simulated cluster with artificial nodes and tasks, Q8S provides a real Kubernetes cluster, which can execute actual workloads. Therefore, Q8S can serve to discover technical problems in the implementation of a given algorithm, which might otherwise not surface until a new algorithm is deployed in a production environment."
Comments 3: Performance Metrics: For a more comprehensive evaluation of the proposed framework, it is advisable to incorporate and report on a broader set of performance metrics. These could include:
- Scheduler efficiency metrics (e.g., scheduling latency, throughput, placement accuracy)
- System performance metrics such as CPU/GPU utilization per node, memory and I/O usage, and job completion time (turnaround time)
- QoS/SLA-related metrics, which are critical for assessing the ability of the scheduler to meet service-level expectations
Response 3: We agree that these metrics are relevant for evaluating a new scheduling approach. However, Q8S itself is intended for deploying a test environment for arbitrary Kubernetes scheduling systems that should be evaluated on a heterogeneous cluster. Reports on CPU, memory and network metrics are already included in the paper; the other requested metrics are specific to a particular scheduling solution, of which there is none to report on.
Reviewer 2 Report
Comments and Suggestions for Authors
Overall, the authors have addressed a clear gap in existing solutions, namely the lack of detailed emulation tools for heterogeneous Kubernetes environments. This paper presents a valuable contribution to the field of cloud computing research by providing a tool for emulating heterogeneous Kubernetes clusters. While the technical implementation appears solid and addresses a genuine need, the paper would benefit from extended validation, clearer discussion of performance implications, and addressing the scope limitations more directly. The implementation of Q8S as presented provides a functional foundation that could be expanded in future work. The technical implementation is well-described with appropriate diagrams. However, there are still major issues of concern that need to be addressed.
Technical Issues:
- While the paper claims to emulate heterogeneous clusters, the current implementation only supports x86_64 and ARM64 architectures. Other dimensions of heterogeneity (like specialized hardware or IoT devices) are mentioned but not implemented.
- The CPU overhead appears significant (Table 2), but there's insufficient analysis of the implications. How would this impact the training of scheduling algorithms or the validity of performance measurements?
- The paper describes using a one-node-per-host solution to avoid network configuration complexity, but doesn't adequately explore the cost implications or scalability challenges of this approach.
- The benchmark analysis compares emulated vs. non-emulated nodes, but doesn't validate against real heterogeneous hardware, which would be necessary to establish that the emulation is adequate for its intended purpose.
- The implementation has been validated only on Kubernetes 1.28.0. There should be discussion about compatibility with other versions.
- While the paper claims greater detail than simulations, there's no direct comparison with CloudSim or K8sSim to quantify this advantage.
- No investigation of how the solution scales with increasing cluster size or complexity.
- Add validation against real heterogeneous hardware to verify emulation fidelity.
- Include direct comparisons with simulation tools to quantify the advantages of emulation.
- Integrate with container performance analysis tools to provide deeper insights.
- Test with larger and more diverse cluster configurations to demonstrate scalability.
- It is highly recommended to present experimental evaluation results in forms other than tables, which would make them more verifiable.
Writing, Formatting Concerns:
- The abstract could better highlight the specific contributions and limitations of the work.
- The paper sometimes uses "virtualization" and "emulation" interchangeably, though they have distinct technical meanings.
- References appear inconsistent and some include URLs that may not be persistent.
- The user interface section mentions only CLI, but Figure 2 doesn't show user interface details. (Line: 233)
- The conclusion about networking performance being "identical" despite the measurements showing differences should be more precisely worded. (Line: 436)
- "These differences could have become even more pronounced" is speculative and could benefit from supporting evidence. (Line: 470)
- The paper would benefit from consistency in how it refers to Q8S capabilities - sometimes they are described as "simulation" and other times as "emulation". Why?
Grammatical Concerns:
- "More and more complex" is informal; consider "increasingly complex". (Line: 25)
- "May be unable to properly evaluate" - passive voice could be replaced with active voice. (Line: 31)
- "Greater level of detail" lacks precision; specify what details are improved. (Line: 42)
- "Requires a VM to be aware that..." - awkward phrasing. (Line: 83)
- "Most probably" is informal; consider "likely" or providing evidence. (Line: 437)
- "are given" is passive; consider active voice. (Line: 455)
- "There are still some limitations" is vague; be more direct. (Line: 496)
Author Response
Thank you very much for your extensive feedback.
We were able to make many improvements to the paper thanks to your feedback and the feedback from other reviewers.
Please find below how we addressed your individual comments.
Comments 1: While the paper claims to emulate heterogeneous clusters, the current implementation only supports x86_64 and ARM64 architectures. Other dimensions of heterogeneity (like specialized hardware or IoT devices) are mentioned but not implemented.
Response 1: We agree that this is a limitation of the version of Q8S presented in the paper, which could be overcome by providing additional base images, for example to support 32-bit systems. However, a much more limiting factor for heterogeneous Kubernetes clusters is the availability of Kubernetes images for such architectures. Checking the official Kubernetes repository (docker manifest inspect registry.k8s.io/kube-apiserver:v1.28.0) shows that in addition to x86_64 and arm64, only s390x and ppc64le are available, which are either defunct or have a negligible market share. Therefore, we conclude that the majority of systems that will run Kubernetes utilize either the x86_64 or arm64 architecture. We added a paragraph to the end of 4.1 to reflect this argument: "However, in order to deploy a Kubernetes cluster on such infrastructure, the respective container images must also be available. According to the image manifests available for the official Kubernetes repositories, only four architectures are supported: amd64 (i.e. x86\_64), arm64, ppc64le and s390x. As ppc64le and s390x have either a negligible market share or are effectively discontinued, we conclude that the majority of systems that will run Kubernetes utilize either x86\_64 or ARM64."
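As an illustration of this check, the architecture list can be extracted from the manifest JSON programmatically. The snippet below is only a sketch: the embedded manifest is a trimmed, hypothetical sample reflecting the four architectures named above, not the full output of `docker manifest inspect`.

```python
import json

# Trimmed, hypothetical excerpt of the output of
#   docker manifest inspect registry.k8s.io/kube-apiserver:v1.28.0
# reduced to the four architectures reported in the response above.
manifest = json.loads("""
{
  "manifests": [
    {"platform": {"architecture": "amd64", "os": "linux"}},
    {"platform": {"architecture": "arm64", "os": "linux"}},
    {"platform": {"architecture": "ppc64le", "os": "linux"}},
    {"platform": {"architecture": "s390x", "os": "linux"}}
  ]
}
""")

# Collect the architecture of each platform entry in the manifest list.
architectures = [m["platform"]["architecture"] for m in manifest["manifests"]]
print(architectures)  # ['amd64', 'arm64', 'ppc64le', 's390x']
```

In practice one would pipe the real `docker manifest inspect` output into such a filter (or use `jq`) rather than embedding sample data.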
Comments 2: The CPU overhead appears significant (Table 2), but there's insufficient analysis of the implications. How would this impact the training of scheduling algorithms or the validity of performance measurements?
Response 2: We acknowledge this as a limitation and have added a paragraph to 4.1.: "The emulation introduces performance overhead resulting in higher resource consumption compared to a native performance as was shown in Table 2.
If Q8S is to be used for gathering training data for scheduling algorithms or CPU metrics are used for evaluating a given algorithm, this overhead has to be taken into account."
Comments 3: The paper describes using a one-node-per-host solution to avoid network configuration complexity, but doesn't adequately explore the cost implications or scalability challenges of this approach.
Response 3: We added a discussion about this decision to the respective paragraph in 2.1.: "We considered operating multiple emulated Kubernetes workers from a single OpenStack VM by having QEMU run multiple VMs but decided on a one-node-per-host solution so as not to add additional complexity to the network configuration.
This decision comes at the price of having to run a host operating system for each emulated node instead of providing the combined resources to a single VM host to emulate multiple nodes.
When scaling out a Q8S cluster, 2 CPU cores and 2 GB RAM should be dedicated to the Ubuntu hosts for the emulated nodes such that the emulations may run without interruption due to background activities from the host.
By grouping, for example, four emulated nodes on the same host, this overhead would also be reduced by a factor of four.
This is not yet implemented in the presented version of Q8S."
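The resource arithmetic behind this argument can be sketched as follows, using the 2 CPU cores and 2 GB RAM per-host reservation quoted above; the helper function and the node counts are illustrative, not part of Q8S.

```python
import math

# Per-host reservation for the Ubuntu host OS (figures from the text above).
HOST_CPU_CORES = 2
HOST_RAM_GB = 2

def host_overhead(num_nodes: int, nodes_per_host: int = 1) -> tuple[int, int]:
    """CPU cores and GB of RAM reserved for host OSes across the cluster."""
    hosts = math.ceil(num_nodes / nodes_per_host)
    return hosts * HOST_CPU_CORES, hosts * HOST_RAM_GB

# One node per host (the current Q8S design): 8 emulated nodes
# reserve 16 cores and 16 GB solely for the host operating systems.
print(host_overhead(8, nodes_per_host=1))  # (16, 16)

# Grouping four emulated nodes per host (not yet implemented)
# would reduce this overhead by a factor of four.
print(host_overhead(8, nodes_per_host=4))  # (4, 4)
```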
Comments 4: The benchmark analysis compares emulated vs. non-emulated nodes, but doesn't validate against real heterogeneous hardware, which would be necessary to establish that the emulation is adequate for its intended purpose.
Response 4:
The limited availability of real heterogenous hardware was one of the original reasons that brought us to develop Q8S.
We agree that ideally there should also be a comparison of emulated heterogeneous nodes and real heterogeneous hardware.
Nevertheless, as we had disabled hardware acceleration features during our comparison of emulated x86_64 and non-emulated x86_64 nodes, we consider this a sufficient analogous comparison to validate Q8S.
Comments 5: The implementation has been validated only on Kubernetes 1.28.0. There should be discussion about compatibility with other versions.
Response 5: We added a sentence to 3.0 to clarify this: "Q8S has no strict dependency on specific software versions of Kubernetes, Flannel or the host Ubuntu system and can therefore be used with newer versions as well as long as there are no breaking changes to iptables, default ports used or the Flannel configuration."
Comments 6: While the paper claims greater detail than simulations, there's no direct comparison with CloudSim or K8sSim to quantify this advantage.
Response 6: We have extended the last paragraph of section 1.0 to better emphasize the advantages of an emulation over a simulation to verify a scheduling algorithm: "Q8S offers a greater level of detail than simulations as instead of a simulated cluster with artificial nodes and tasks, Q8S provides a real Kubernetes cluster, which can execute actual workloads.
Therefore, Q8S can serve to discover technical problems in the implementation of a given algorithm, which might otherwise not surface until a new algorithm is deployed in a production environment."
Comments 7: No investigation of how the solution scales with increasing cluster size or complexity.
Response 7: Q8S is essentially only a tool for deploying a Kubernetes cluster, which includes emulated nodes. After Q8S finishes, the provisioned cluster operates as a regular Kubernetes environment with the scalability that can be expected from Kubernetes.
Comments 8: Add validation against real heterogeneous hardware to verify emulation fidelity.
Response 8: See Response 4.
Comments 9: Include direct comparisons with simulation tools to quantify the advantages of emulation.
Response 9: See Response 6.
Comments 10: Integrate with container performance analysis tools to provide deeper insights.
Response 10: In the paper we presented our performance benchmarking using k8s-bench-suite, which provided us the data needed to assert the functionality and performance overhead of nodes deployed via Q8S. We consider deeper analysis of the performance of containers in emulated environments outside the scope of our work.
Comments 11: Test with larger and more diverse cluster configurations to demonstrate scalability.
Response 11: See Response 7.
Comments 12: It is highly recommended to present experimental evaluation results other than the tabular results, which could be more verifiable.
Response 12: The tabular results document the successful experimental evaluation of Q8S. It is unclear to us what other, more verifiable, results are requested here.
Comments 13: The abstract could better highlight the specific contributions and limitations of the work.
Response 13: We have adjusted parts of the abstract to mention the support for only x86_64 and ARM64 architectures as well as to specify what we mean by a higher level of detail compared to simulations: "To address this, we introduce Q8S, a tool for emulating heterogeneous Kubernetes clusters including x86\_64 and ARM64 architectures on OpenStack using QEMU. Emulations created through Q8S provide a higher level of detail than simulations and can be used to train machine learning scheduling algorithms. By providing an environment capable of executing real workloads, Q8S enables researchers and developers to test and refine their scheduling algorithms, ultimately leading to more efficient and effective heterogeneous cluster management."
Comments 14: The paper sometimes uses "virtualization" and "emulation" interchangeably, though they have distinct technical meanings.
Response 14: We checked all references to virtualization and emulation and found two instances of incorrect or unclear usage, which we have clarified.
Comments 15: References appear inconsistent and some include URLs that may not be persistent.
Response 15: We checked the references and agree that the reference to the CNCF report may not persist and replaced it with a link to the web archive version of the same article.
Comments 16: The user interface section mentions only CLI, but Figure 2 doesn't show user interface details. (Line: 233)
Response 16: We clarified this by extending the respective sentence to: "This operation is reflected in the first step shown in Figure 2 where Q8S represents the CLI tool as the interface to the user."
Comments 17: The conclusion about networking performance being "identical" despite the measurements showing differences should be more precisely worded. (Line: 436)
Response 17: We changed it to "almost identical" to reflect the minor differences in the measurements.
Comments 18: "These differences could have become even more pronounced" is speculative and could benefit from supporting evidence. (Line: 470)
Response 18: We changed the sentence to imply that this is indeed speculation from our side: "Our test setup features only 2 CPU cores, therefore, we can speculate that the performance differences would be less pronounced when using more cores as in the external example."
Comments 19: The paper would benefit from consistency in how it refers to Q8S capabilities - sometimes they are described as "simulation" and other times as "emulation". Why?
Response 19: We checked the usage of these words around Q8S and found that simulating network failures is mentioned as a potential feature for Q8S. As the network failure would be artificially introduced we consider this a simulation and not an emulation.
Comments 20: "More and more complex" is informal; consider "increasingly complex". (Line: 25)
Response 20: Thank you for the suggestion, we have updated the sentence accordingly.
Comments 21: "May be unable to properly evaluate" - passive voice could be replaced with active voice. (Line: 31)
Response 21: We changed the sentence to "Researchers working on scheduling algorithms for heterogeneous compute clusters may struggle to evaluate their work properly due to limited access to heterogeneous components."
Comments 22: "Greater level of detail" lacks precision; specify what details are improved. (Line: 42)
Response 22: We have extended the last paragraph of section 1.0 to better emphasize the advantages of an emulation over a simulation to verify a scheduling algorithm: "Q8S offers a greater level of detail than simulations as instead of a simulated cluster with artificial nodes and tasks, Q8S provides a real Kubernetes cluster, which can execute actual workloads.
Therefore, Q8S can serve to discover technical problems in the implementation of a given algorithm, which might otherwise not surface until a new algorithm is deployed in a production environment."
Comments 23: "Requires a VM to be aware that..." - awkward phrasing. (Line: 83)
Response 23: We updated it to say "Requires a VM to recognize that".
Comments 24: "Most probably" is informal; consider "likely" or providing evidence. (Line: 437)
Response 24: We replaced "Most probably" with "likely" as suggested.
Comments 25: "are given" is passive; consider active voice. (Line: 455)
Response 25: We changed the sentence to "We measured the following network latencies".
Comments 26: "There are still some limitations" is vague; be more direct. (Line: 496)
Response 26: We adjusted the sentence to "While Q8S fulfills its initial requirements, several limitations and considerations remain that should be mentioned".
Reviewer 3 Report
Comments and Suggestions for Authors
In this paper, the authors propose a tool for creating a heterogeneous Kubernetes cluster, the operation of which is emulated in a virtual environment using OpenStack and QEMU software tools. According to the authors, the advantage of emulating a Kubernetes cluster over its simulation, which is used in similar systems, is a higher level of detail.
Comments
1) The paper specifies the tools against which the proposed tool is compared, but does not provide a performance comparison with them.
2) The paper only provides synthetic performance tests of the proposed tool, but does not provide examples of code testing, algorithm evaluation, and training of machine learning models on a heterogeneous Kubernetes cluster running in a virtual environment.
3) The paper is based on one of the authors' master's thesis, which was written using ChatGPT. I believe that the use of fragments of a master's thesis written using ChatGPT is inadmissible when writing a research paper.
4) It is necessary to specify the novelty of the research results presented in the article in comparison with the results of the master's thesis.
5) The review should include references to modern research papers from the last 5 years.
Author Response
Thank you for your feedback.
We have made improvements to the paper in line with your feedback and the feedback from other reviewers.
Please find below how we addressed your individual comments.
Comments 1: The paper specifies the tools against which the proposed tool is compared, but does not provide a performance comparison with them.
Response 1: We have extended the last paragraph of section 1.0 to better emphasize the advantages of an emulation over a simulation to verify a scheduling algorithm: "Q8S offers a greater level of detail than simulations as instead of a simulated cluster with artificial nodes and tasks, Q8S provides a real Kubernetes cluster, which can execute actual workloads.
Therefore, Q8S can serve to discover technical problems in the implementation of a given algorithm, which might otherwise not surface until a new algorithm is deployed in a production environment."
Comments 2: The paper only provides synthetic performance tests of the proposed tool, but does not provide examples of code testing, algorithm evaluation, and training of machine learning models on a heterogeneous Kubernetes cluster running in a virtual environment.
Response 2: We conducted a performance evaluation to determine the overhead of the emulation layer in the cluster. Besides these overheads, the cluster is a regular Kubernetes cluster and can be used for any code testing, algorithm evaluation and training of machine learning models. However, we acknowledge the limitation that these overheads, especially for CPU performance, may be a problem when using such metrics during training of machine learning models and added a paragraph to 4.1.: "The emulation introduces performance overhead resulting in higher resource consumption compared to a native performance as was shown in Table 2.
If Q8S is to be used for gathering training data for scheduling algorithms or CPU metrics are used for evaluating a given algorithm, this overhead has to be taken into account."
Comments 3: The paper is based on one of the authors' master's thesis, which was written using ChatGPT. I believe that the use of fragments of a master's thesis written using ChatGPT is inadmissible when writing a research paper.
Response 3: Note that while the master's thesis used ChatGPT to improve the language and write small sections, all of its content was carefully reviewed. The paper reuses graphics, results and arguments from the master's thesis but was written completely from scratch to fit the format of a research paper. We have declared any usage of AI tools as required by the MDPI author guidelines (https://www.mdpi.com/ethics#_bookmark3).
Comments 4: It is necessary to specify the novelty of the research results presented in the article in comparison with the results of the master's thesis.
Response 4: We must disagree here as a master's thesis is not a peer-reviewed publication and therefore this paper does not constitute an extension of the work in the master's thesis but a first-time publication of its results. We believe we follow the common practice of creating publications from theses such as MSc and PhD theses.
Comments 5: The review should include references to modern research papers from the last 5 years.
Response 5: We have checked again to see if there are new references that we have missed but found that recent developments in this area are very limited. We are therefore unclear about this concern, as the article already includes references to multiple publications from the last 5 years, most significantly K8sSim from 2023.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have now addressed all the suggestions and concerns properly.
Author Response
Thank you for the kind response.
Nevertheless, as requested by other reviewers, we have extended our review of related works and included 7 additional papers from the last 5 years, which discuss simulation or emulation approaches.
With that we added the following to the Related Work section 1.2:
"CloudNativeSim \cite{wuCloudnativesimToolkitModeling} also takes inspiration from CloudSim and provides a toolkit for simulating cloud applications that utilize microservice architectures.
It implements many additional features such as various scaling techniques, request processing and QoS metrics.
While it employs containerization and pods as an abstraction for instances, it does not fully simulate Kubernetes deployments.
Moreover, its support for heterogeneous environments is limited to simulating instances with different code environments and hardware capacities.
PerficientCloudSim \cite{zakaryaPerficientCloudSimToolSimulate2021} provides an extension of CloudSim for it to support large-scale simulations using heterogeneous resources such as different CPU types.
Different CPU types are simulated by considering their performance distribution in terms of the instructions they can process per second.
While this provides a tool for testing scheduling systems across different CPU types, it does not cover potential issues with varying CPU architectures.
ServiceSim \cite{shiServiceSimModellingSimulation2023} is another toolkit built on top of CloudSim and has a focus on simulations of microservices in cloud-edge environments.
One of its main goals is to enable the validation of policies against QoS requirements in heterogeneous networks.
However, it does not specifically consider that edge nodes may employ heterogeneous hardware.
faas-sim \cite{raithFaassimTracedrivenSimulation2023} is a simulation framework, which also supports edge environments but specifically focuses on serverless edge computing.
Function life cycle management is based on Kubernetes, and heterogeneous nodes are supported via varying capabilities and capacities.
CloudSimSC \cite{mampageCloudSimSCToolkitModeling2023}, similar to faas-sim, provides tools for simulating serverless applications and bases its assumptions about function life cycles on Kubernetes.
However, it explicitly focuses on data center deployments and does not further consider heterogeneous environments.
iContinuum \cite{akbariIContinuumEmulationToolkit2024} proposes an emulation toolkit for cloud-edge environments with IoT devices.
It also argues for using emulations for higher accuracy compared to simulations as nuanced interactions can be observed, which would be missing from the abstractions used in simulations.
The emulated environment utilizes Kubernetes with Mininet to emulate network topologies.
By incorporating edge devices such as Raspberry Pis, it is possible to construct heterogeneous clusters.
Emulation of edge hardware is not supported by iContinuum.
Large-Scale Cloud Emulation \cite{zengDevelopmentLargeScaleCloud2024} provides an emulation approach for large clouds, i.e., tens of thousands to hundreds of thousands of nodes.
For this it utilizes densely packed node emulations on top of Kubernetes clusters.
Overall, these related works display many simulation approaches focusing on various aspects of cloud computing such as support for complex network topologies \cite{shiServiceSimModellingSimulation2023}, heterogeneous resources \cite{zakaryaPerficientCloudSimToolSimulate2021} or even specific applications such as serverless computing \cite{raithFaassimTracedrivenSimulation2023, mampageCloudSimSCToolkitModeling2023}.
Moreover, some of these approaches take inspiration from or mimic Kubernetes scheduling \cite{raithFaassimTracedrivenSimulation2023, mampageCloudSimSCToolkitModeling2023, wenK8sSimSimulationTool2023}.
However, as argued by \cite{akbariIContinuumEmulationToolkit2024}, simulations are limited in the accuracy they can provide to the level of abstraction their models utilize.
Frameworks such as \cite{akbariIContinuumEmulationToolkit2024, zengDevelopmentLargeScaleCloud2024} provide more details through emulation but in order to support heterogeneous hardware they require physical access to such hardware.
Q8S on the other hand employs hardware emulation, enabling running emulations even when only homogeneous hardware is available."
Moreover, we added the following paragraph to the end of the penultimate paragraph of the introduction 1.0 as one of the new papers supports our argument regarding the usage of emulations over simulations:
"Emulations provide higher accuracy than simulations \cite{akbariIContinuumEmulationToolkit2024} as simulations apply abstractions that miss out on details that are not explicitly included by the developers of a given simulation."
Finally, we adjusted part of the last paragraph of the introduction 1.0 to also consider these new references:
"Other emulation efforts such as \cite{akbariIContinuumEmulationToolkit2024} and \cite{zengDevelopmentLargeScaleCloud2024} focus on the emulation of networks and large amounts of nodes, respectively.
Q8S on the other hand fills the gap for emulating heterogeneous hardware through actual hardware emulation techniques instead of requiring physical heterogeneous hardware or simulating heterogeneous capabilities of nodes."
Reviewer 3 Report
Comments and Suggestions for Authors
The authors' responses to the comments reflect changes in the paper. Unfortunately, I did not see any fundamental corrections that would significantly improve the scientific soundness, novelty, and practical significance. The paper presents a separate practical experiment conducted as part of a master's thesis. It is obvious that the results of the experiment require additional comprehensive justification. The authors did not provide such a justification. In particular, they did not improve the review of related works. Therefore, in my opinion, the paper should be rejected.
Author Response
Comments 1: The authors' responses to the comments reflect changes in the paper. Unfortunately, I did not see any fundamental corrections that would significantly improve the scientific soundness, novelty, and practical significance. The paper presents a separate practical experiment conducted as part of a master's thesis. It is obvious that the results of the experiment require additional comprehensive justification. The authors did not provide such a justification. In particular, they did not improve the review of related works. Therefore, in my opinion, the paper should be rejected.
Response 1: We have carefully considered your feedback and performed another literature search to identify recent works that should be included and that serve to justify the research gap, which our work addresses.
We have extended our review of related works and included 7 additional papers from the last 5 years, which discuss simulation or emulation approaches.
With that we added the following to the Related Work section 1.2:
CloudNativeSim \cite{wuCloudnativesimToolkitModeling} also takes inspiration from CloudSim and provides a toolkit for simulating cloud applications that utilize microservice architectures.
It implements many additional features such as various scaling techniques, request processing and QoS metrics.
While it employs containerization and pods as an abstraction for instances, it does not fully simulate Kubernetes deployments.
Moreover, its support for heterogeneous environments is limited to simulating instances with different code environments and hardware capacities.
PerficientCloudSim \cite{zakaryaPerficientCloudSimToolSimulate2021} provides an extension of CloudSim for it to support large-scale simulations using heterogeneous resources such as different CPU types.
Different CPU types are simulated by considering their performance distribution in terms of the instructions they can process per second.
While this provides a tool for testing scheduling systems across different CPU types, it does not cover potential issues arising from varying CPU architectures.
ServiceSim \cite{shiServiceSimModellingSimulation2023} is another toolkit built on top of CloudSim and has a focus on simulations of microservices in cloud-edge environments.
One of its main goals is to enable the validation of policies against QoS requirements in heterogeneous networks.
However, it does not specifically consider that edge nodes may employ heterogeneous hardware.
faas-sim \cite{raithFaassimTracedrivenSimulation2023} is a simulation framework, which also supports edge environments but specifically focuses on serverless edge computing.
Function life-cycle management is based on Kubernetes, and heterogeneous nodes are supported via varying capabilities and capacities.
CloudSimSC \cite{mampageCloudSimSCToolkitModeling2023}, similar to faas-sim, provides tools for simulating serverless applications and bases its assumptions about function life cycles on Kubernetes.
However, it explicitly focuses on data center deployments and does not further consider heterogeneous environments.
iContinuum \cite{akbariIContinuumEmulationToolkit2024} proposes an emulation toolkit for cloud-edge environments with IoT devices.
It also argues for using emulations for higher accuracy compared to simulations, since nuanced interactions can be observed that would be missing from the abstractions used in simulations.
The emulated environment utilizes Kubernetes with Mininet to emulate network topologies.
By incorporating edge devices such as Raspberry Pis, it is possible to construct heterogeneous clusters.
However, emulation of edge hardware is not supported by iContinuum.
Large-Scale Cloud Emulation \cite{zengDevelopmentLargeScaleCloud2024} provides an emulation approach for large clouds, i.e., tens of thousands to hundreds of thousands of nodes.
For this it utilizes densely packed node emulations on top of Kubernetes clusters.
Overall, these related works present many simulation approaches focusing on various aspects of cloud computing, such as support for complex network topologies \cite{shiServiceSimModellingSimulation2023}, heterogeneous resources \cite{zakaryaPerficientCloudSimToolSimulate2021}, or even specific applications such as serverless computing \cite{raithFaassimTracedrivenSimulation2023, mampageCloudSimSCToolkitModeling2023}.
Moreover, some of these approaches take inspiration from or mimic Kubernetes scheduling \cite{raithFaassimTracedrivenSimulation2023, mampageCloudSimSCToolkitModeling2023, wenK8sSimSimulationTool2023}.
However, as argued by \cite{akbariIContinuumEmulationToolkit2024}, simulations are limited in the accuracy they can provide to the level of abstraction their models utilize.
Frameworks such as \cite{akbariIContinuumEmulationToolkit2024, zengDevelopmentLargeScaleCloud2024} provide more details through emulation but in order to support heterogeneous hardware they require physical access to such hardware.
Q8S on the other hand employs hardware emulation, enabling running emulations even when only homogeneous hardware is available.
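To illustrate the hardware-emulation technique referred to above, the following sketch shows how an ARM64 node could be booted on an x86_64 host with QEMU. The machine type, CPU model, sizing, and disk image path are illustrative assumptions, not Q8S's actual configuration.

```shell
# Hypothetical sketch (not Q8S's actual configuration): boot an emulated
# ARM64 node on an x86_64 host. qemu-system-aarch64 performs full CPU
# emulation, so no physical ARM hardware is required.
qemu-system-aarch64 \
  -machine virt \
  -cpu cortex-a72 \
  -smp 4 \
  -m 4096 \
  -drive file=node.qcow2,format=qcow2 \
  -nographic
```

Running the kubelet inside such a guest makes the emulated ARM64 machine appear to Kubernetes as an ordinary worker node, which is what enables testing architecture-aware scheduling on homogeneous hosts.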
Moreover, we added the following paragraph to the end of the penultimate paragraph of the introduction 1.0 as one of the new papers supports our argument regarding the usage of emulations over simulations:
"Emulations provide higher accuracy than simulations \cite{akbariIContinuumEmulationToolkit2024} as simulations apply abstractions that miss out on details that are not explicitly included by the developers of a given simulation."
Finally, we adjusted part of the last paragraph of the introduction 1.0 to also consider these new references:
"Other emulation efforts such as \cite{akbariIContinuumEmulationToolkit2024} and \cite{zengDevelopmentLargeScaleCloud2024} focus on the emulation of networks and large amounts of nodes, respectively.
Q8S on the other hand fills the gap for emulating heterogeneous hardware through actual hardware emulation techniques instead of requiring physical heterogeneous hardware or simulating heterogeneous capabilities of nodes."
Given that we have found multiple works that employ simulation and even emulation, we still consider Q8S to be a novel work as it addresses the need for emulating hardware within an emulated cluster.
Round 3
Reviewer 3 Report
Comments and Suggestions for Authors
The authors provided references to modern research papers from the last five years and briefly considered different simulation tools. However, the main questions still remain: What is the novelty and practical significance of the proposed tool? How is the effectiveness of the proposed tool confirmed? What are the results of a comprehensive comparative analysis of the proposed tool compared to other similar tools? Answers to these questions require additional experiments. The Q8S evaluation, which was carried out by the authors in comparison with OpenStack, used only part of the computational resources of two computer nodes. These comparison results may be sufficient for a master's thesis. However, in my opinion, they are not indicative of a serious scientific study. Real problems are solved using tens, hundreds, or thousands of computer nodes.
Author Response
Comments 1: The authors provided references to modern research papers from the last five years and briefly considered different simulation tools. However, the main questions still remain: What is the novelty and practical significance of the proposed tool? How is the effectiveness of the proposed tool confirmed? What are the results of a comprehensive comparative analysis of the proposed tool compared to other similar tools? Answers to these questions require additional experiments. The Q8S evaluation, which was carried out by the authors in comparison with OpenStack, used only part of the computational resources of two computer nodes. These comparison results may be sufficient for a master's thesis. However, in my opinion, they are not indicative of a serious scientific study. Real problems are solved using tens, hundreds, or thousands of computer nodes.
Response 1: To showcase that our project, Q8S, fills a research gap, we expanded the related work section 1.2 and added a table (below converted from latex to markdown) as well as a discussion:
| Feature | EMUSIM | iContinuum | Large-Scale Cloud Emulation | Q8S |
|-----------------------|------------|----------------|--------------|-----|
| Real Workloads | sampling | yes | yes | yes |
| Kubernetes Support | no | yes | yes | yes |
| Advanced Networking | no | yes | no | no |
| Dense Packing | - | no | yes | no |
| Hardware Emulation | no | no | no | yes |
Feature comparison for frameworks that employ emulation.
Real workloads: Capability to test against actual workloads instead of simulated ones.
Kubernetes support: Ability to run Kubernetes on framework nodes.
Dense packing: Deploy multiple emulated nodes on the same physical node/VM.
Advanced networking: Setup complex network topologies (e.g., edge device emulation).
Hardware emulation: Emulate heterogeneous hardware (e.g., ARM on x86 hosts).
"Table 1 summarizes the capabilities of the above discussed frameworks that employ emulation and the features that they support.
EMUSIM is a simulation framework but it is included here as it employs sampling of workloads in emulated environments.
The other frameworks fully focus on emulation and each provide features specialized for certain use cases.
iContinuum supports complex network topologies as found in edge network setups.
Large-Scale Cloud Emulation supports emulations with many nodes by packing multiple nodes on the same host.
Q8S on the other hand employs hardware emulation, enabling running emulations even when only homogeneous hardware is available."
To show that Q8S is able to serve for larger amounts of nodes, we have successfully tested its capability to deploy a cluster with over 50 nodes.
In line with this, we added a paragraph at the end of the benchmark section in Evaluation 3.2:
"Moreover, to verify that Q8S is capable of deploying larger clusters, we have deployed a cluster with 50 emulated worker nodes, including 30 emulated x86\_64 nodes and 20 emulated ARM64 nodes, as well as two additional master nodes."
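A quick way to confirm such a mixed-architecture deployment (assuming a running cluster and a configured kubectl) is to list the nodes together with the standard architecture label; the commands below are illustrative and not part of Q8S itself.

```shell
# List all nodes with their CPU architecture; kubernetes.io/arch is a
# well-known label set by the kubelet (e.g. amd64 or arm64).
kubectl get nodes -L kubernetes.io/arch

# Count only the ARM64 workers (illustrative check):
kubectl get nodes -l kubernetes.io/arch=arm64 --no-headers | wc -l
```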
Q8S originated as a master's thesis, but as shown in the paper, it fills a real research gap.
We are looking to utilize it in projects such as DECICE, where we need to verify our scheduler software builds against heterogeneous clusters while being limited to a single actually heterogeneous cluster due to project budget constraints.
Round 4
Reviewer 3 Report
Comments and Suggestions for Authors
The paper has been sufficiently revised to address the comments, so it can be accepted for publication.
Author Response
Thank you for your feedback to help us improve our paper.
At the request of the editors, we have expanded the discussion around our larger deployment to include the time it took to complete the deployment, as given below:
"The time to complete the deployment was 77\,min.
By far the slowest operation was the setup of the emulated nodes, including running cloud-init and installing Kubernetes, which took about 23\,min for the fastest node and 39\,min for the slowest node.
Moreover, all x86\_64 nodes completed their deployment after 25\,min while it took the ARM64 nodes at least 35\,min in this stage.
Another longer stage was the initial setup of each of the hosts, i.e., the installation of QEMU and other dependencies, which took 24\,min.
The remaining time is spent on waiting for OpenStack to provision the nodes and distributing setup scripts and configuration files."
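The cloud-init stage mentioned above could, as a rough sketch, look like the following cloud-config fragment. The package list and the join command placeholders are illustrative assumptions, not the actual Q8S configuration.

```yaml
#cloud-config
# Hypothetical sketch of a worker-node cloud-init file; package names and
# the join command are illustrative, not Q8S's actual configuration.
package_update: true
packages:
  - containerd
runcmd:
  # Install Kubernetes components and join the cluster (placeholder values).
  - apt-get install -y kubelet kubeadm kubectl
  - kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash <hash>
```

Since this runs on first boot inside each emulated guest, its duration is dominated by package installation under CPU emulation, which is consistent with the ARM64 nodes finishing this stage noticeably later than the x86\_64 nodes.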