Article

Security Hardening and Compliance Assessment of Kubernetes Control Plane and Workloads

Department of System Engineering and Cybersecurity, Algebra Bernays University, 10000 Zagreb, Croatia
* Authors to whom correspondence should be addressed.
J. Cybersecur. Priv. 2025, 5(2), 30; https://doi.org/10.3390/jcp5020030
Submission received: 26 April 2025 / Revised: 14 May 2025 / Accepted: 26 May 2025 / Published: 4 June 2025
(This article belongs to the Special Issue Cyber Security and Digital Forensics—2nd Edition)

Abstract:
Containerized applications are pivotal to contemporary cloud-native architectures, yet they present novel security challenges. Kubernetes, a prevalent open-source platform for container orchestration, provides robust automation but lacks inherent security measures. The intricate architecture and scattered security documentation may result in misconfigurations and vulnerabilities, jeopardizing system confidentiality, integrity, and availability. This paper analyzes the key aspects of Kubernetes security by combining theoretical examination with practical application, concentrating on architectural hardening, access control, image security, and compliance assessment. The text commences with a synopsis of Kubernetes architecture, networking, and storage, analyzing prevalent security issues in containerized environments. The emphasis transitions to practical methodologies for safeguarding clusters, encompassing image scanning, authentication and authorization, monitoring, and logging. The paper also examines recognized Kubernetes CVEs and illustrates vulnerability scanning on a local cluster. The objective is to deliver explicit, implementable recommendations for enhancing Kubernetes security, assisting organizations in constructing more robust containerized systems.

1. Introduction

Kubernetes has become the prevailing standard for orchestrating containerized applications in the swiftly advancing realm of cloud-native computing. As digital transformation progresses across sectors, enterprises progressively embrace microservices architectures to enhance scalability, resilience, and development speed. Kubernetes is pivotal in overseeing containerized workloads by offering a cohesive framework for automating deployment, scaling, and maintenance. Nonetheless, the characteristics that render Kubernetes potent—its distributed architecture, adaptability, and extensibility—simultaneously give rise to numerous security vulnerabilities.
Security in Kubernetes is not activated by default, and improper configuration can readily expose an entire cluster to unauthorized access, data breaches, and denial-of-service attacks. Each layer requires individual evaluation and reinforcement, considering the variety of components in a Kubernetes ecosystem, spanning the API server, etcd, container runtimes, and ingress controllers. Furthermore, containerized environments function on the principle of ephemeral infrastructure, wherein services can dynamically emerge and vanish, exacerbating the challenge of implementing uniform security policies.
Information regarding Kubernetes security is frequently dispersed among community forums, official documentation, third-party tools, and vendor-specific directives. The absence of centralized, organized knowledge hinders practitioners from balancing operational efficiency with risk management. This paper aims to tackle these challenges by providing a thorough, pragmatic, and technically sound framework for fortifying Kubernetes environments. This work is limited to control plane and workload hardening, authentication and authorization, observability, and compliance verification. Network security and runtime protection are acknowledged as vital concerns, but they fall outside the scope of this analysis.
This paper offers a systematic analysis and demonstration of optimal security practices for Kubernetes. The primary contribution is a thorough analysis of Kubernetes architectural components, elucidating their functions in the orchestration process and highlighting their security ramifications. The second contribution pertains to the execution of pragmatic hardening strategies on essential elements of the control plane and worker nodes, including the Kubelet, API server, and etcd. Third, the paper presents a comprehensive methodology for securing container images and implementing authentication and authorization protocols. Ultimately, it examines practical auditing, logging, and monitoring methods, illustrating how observability facilitates proactive threat identification and response in dynamic settings.
The remainder of the paper is structured as follows: Section 2 delineates pertinent research and fundamental principles regarding Kubernetes security. Section 3 outlines the basic architecture of Kubernetes, offering essential terminology and system context. Section 4 presents sophisticated hardening techniques, encompassing subsections on container image security, identity and access management, and cluster observability via auditing and monitoring. Section 5 compiles effective best practices for securing Kubernetes environments. Section 6 addresses prospective future research avenues, while the final section provides concluding observations, encapsulating essential insights and their implications for industrial applications.

2. Related Works

Curtis and Eisty [1] analyzed four years of Stack Overflow discussions using AI, identifying Kubernetes security as a top concern, representing 12.3% of posts. While awareness is high, they noted a lack of tools and actionable guidance. Their study also revealed that security interest has grown significantly, yet the practical implementation of best practices still lags.
Rahman et al. [2] conducted a large-scale study of 2039 open-source Kubernetes manifests, uncovering 1051 security misconfigurations across 11 categories. They developed SLI-KUBE, a static analysis tool to detect these flaws. Despite promising detection rates, the challenge remained in convincing developers to consistently remediate issues—only 60% of their recommended fixes were applied.
Wang et al. [3] proposed KubeRM, a distributed, rule-based system for real-time Kubernetes security enforcement. Their system enabled exploits and abnormal traffic detection without centralized bottlenecks. However, the authors faced issues scaling detection consistency across large clusters and noted difficulties in synchronizing rules efficiently across all nodes.
Bose et al. [4] investigated under-reported security defects in Kubernetes manifests by analyzing over 5000 commits from open-source repositories. Only 0.79% of commits addressed security issues, suggesting a significant under-reporting trend. Their key issue was inconsistent commit messaging, which made it hard to identify all relevant security-related changes.
Kutsa [5] discussed security strategies for Kubernetes in multi-cloud setups, emphasizing the use of CNAPP platforms and Zero Trust principles. She identified configuration and compliance issues as main threats, but found implementing integrated solutions complex due to tool incompatibility, high customization requirements, and overlapping security scopes.
Kampa [6] explored Kubernetes-specific threats like API exposure, unscanned images, and misconfigured RBAC policies. The paper reviewed current tools and mitigation strategies. However, the author noted that many recommendations lacked empirical validation, making it hard for practitioners to assess their effectiveness in real-world scenarios.
Corin et al. [7] enhanced Kubernetes with an algorithm for application-aware provisioning of security services. They demonstrated improved service-level performance and security. Nevertheless, integration proved difficult in existing clusters due to limitations in the default scheduler and increased system overhead in high-availability environments.
Gunathilake and Ekanayake [8] developed K8s Pro Sentinel to automate encryption and RBAC configuration for Secret Objects. Their operator reduced human error and improved cluster confidentiality. Still, they acknowledged the complexity of integrating it with Kubernetes API server extensions and potential configuration conflicts in real deployments.
Ascensão et al. [9] compared three Kubernetes distributions—K8s, K3s, and K0s—on performance and security. K0s outperformed others in both areas, while K3s struggled with scalability. However, differences in infrastructure and edge device limitations made it difficult to draw generalizable conclusions applicable to all environments.
Kazenas and Ponomareva [10] evaluated container security tools in Kubernetes clusters, emphasizing real-time monitoring. They found most tools lacked cohesion and had high false-positive rates, reducing their effectiveness. They called for improved integration and better event correlation to avoid operator fatigue and alert oversaturation.
Zhu and Gehrmann [11] proposed Kub-Sec, a system to auto-generate AppArmor profiles for Kubernetes containers. Their method successfully blocked XML-RPC attacks on WordPress apps. However, the approach required extensive behavioral data collection, raising concerns about data privacy and the feasibility of continuous profiling in dynamic environments.
Kamieniarz and Mazurczyk [12] compared four Kubernetes deployments—two managed services and two on-premises clusters. They used automated tools to assess misconfigurations and vulnerabilities. While insightful, the study was limited by a lack of long-term metrics and variation in cluster configurations.
Aly et al. [13] introduced a machine learning-based method for multi-class threat detection in Kubernetes environments. Their hybrid PCA-autoencoder-Naive Bayes model achieved 91% accuracy. Although promising, the model’s performance depended heavily on large, labeled datasets and high computational costs, limiting its use in small organizations.
Russell and Dev [14] developed a centralized defense solution for detecting Kubernetes misconfigurations using open-source tools. Their system successfully aggregated diagnostic logs. Nevertheless, they acknowledged the challenge of high infrastructure costs and noted that the solution may not scale well in cost-sensitive or resource-limited environments.
Acharekar [15] presented a layered security analysis of Kubernetes components, from containers to clusters. The study mapped vulnerabilities and proposed high-availability strategies. However, the paper was largely theoretical, lacking implementation and testing of the suggested methods in production or simulated Kubernetes environments.
Surantha and Ivan [16] applied the Zero Trust model to Kubernetes networking for a financial services firm. Their network design improved segmentation and control. While simulations showed strong results, the authors noted that deploying their model in production could be difficult due to existing vendor lock-ins and integration constraints.
Molleti [17] designed a Kubernetes multi-tenancy model for fintech firms that used namespaces, RBAC, and pod security policies. The architecture showed promising results in tests. Still, the authors faced challenges in resource allocation during peak loads and ensuring strict isolation under dynamically changing tenant requirements.
Surantha et al. [18] revisited Zero Trust principles in Kubernetes for enterprise data centers. They proposed a network design validated through simulations. Although the model adhered to vendor guidelines, compatibility with legacy infrastructure and a lack of vendor-neutral implementations were significant deployment barriers.
Chinnam [19] examined Kubernetes in healthcare IT systems, focusing on secure data processing and infrastructure resilience. While highlighting scalability and automation benefits, he acknowledged issues like data privacy, regulatory compliance, and difficulty integrating with legacy systems in health environments.
Fowley and May [20] assessed automated attack tools targeting Kubernetes, finding them effective in exploiting default configurations. Their study emphasized the risks of improper setup and the steep learning curve faced by operators. However, the research was limited to simulated environments and did not evaluate long-term impacts.
These studies demonstrate that Kubernetes security has emerged as a significant and thoroughly examined area within cloud-native computing. Researchers have tackled various challenges, including misconfigurations, threat detection, and the implementation of zero-trust frameworks. Despite these advancements, the domain remains deficient in standardized frameworks, large-scale empirical validation, and holistic solutions for real-time, multi-tenant, and hybrid-cloud security issues, underscoring the necessity for ongoing, thorough inquiry.
The following section offers a comprehensive examination of the Kubernetes architecture, establishing the essential foundation for comprehending the components and interactions supporting containerized workloads’ orchestration. The following section outlines a detailed security framework encompassing cluster-level protections, container image fortification, identity and access management, and integrated observability via auditing, logging, and monitoring. Kubernetes penetration testing will subsequently assess the efficacy of the proposed security strategies in practical scenarios. The paper concludes with a discourse on prospective research avenues, emphasizing outstanding challenges, potential advancements, and a final synthesis of findings and implications for secure Kubernetes deployment.

3. Kubernetes Architecture

Although virtualization transformed IT infrastructure by allowing multiple operating systems to operate on a single physical machine, it did not provide inherent support for rapid scaling, lightweight deployment, and application-centric management. Kubernetes enhances the containerization paradigm by providing an elevated level of abstraction aimed at orchestrating application containers within a distributed cluster. It underscores portability, automation, and resilience—essential attributes lacking in conventional virtualization.
Kubernetes is a robust, open-source framework for automating the deployment, scaling, and administration of containerized applications. It is intended to oversee clusters of physical or virtual machines, facilitating the orchestration of containers as a cohesive infrastructure layer.
The basics of Kubernetes architecture are shown in Figure 1:
Kubernetes’ declarative configuration model and automation features have established it as the foundation of contemporary cloud-native systems. Comprehending Kubernetes architecture is crucial for utilizing its functionalities securely and effectively. This section delineates the principal architectural components, their interrelations, and their functions in facilitating scalable, resilient, and dynamic application management.

3.1. Control Plane Elements

The control plane functions as the central nervous system of a Kubernetes cluster. It regulates the system’s functionality by preserving the desired state, reacting to user inputs, and orchestrating resource allocation among nodes.
  • Kubernetes API Server: functions as the front-end interface for the cluster. All operations, whether initiated by users or automated, interface with the cluster through this RESTful API.
  • etcd: a distributed key-value repository that maintains the complete cluster state, encompassing configuration, metadata, and status data. The consistency and availability are essential for the reliability of the cluster.
  • Controller Manager: manages multiple controllers that oversee the status of cluster resources and align them with the intended state; for example, the ReplicationController guarantees that the designated number of pod replicas is operational.
  • Scheduler: tasked with allocating pods to nodes. It determines actions based on resource necessities, affinity regulations, and policy limitations, guaranteeing optimal workload allocation.
These components uphold the cluster’s intended state and continuously monitor for discrepancies to trigger corrective measures.

3.2. Node Components

Worker nodes are the physical or virtual machines that execute application workloads. Each node is outfitted with components that interact with the control plane and guarantee the integrity of container execution.
  • Kubelet: The principal node agent that registers the node with the control plane and guarantees the operational integrity and health of the containers specified in a pod definition.
  • Container Runtime: responsible for initiating and overseeing container lifecycles. Kubernetes accommodates multiple runtimes, such as containerd, CRI-O, and, previously, Docker.
  • Kube-proxy: manages network routing on every node. It facilitates internal communication among services and pods across nodes and external communication when exposed through an ingress.
Collectively, these node-level components facilitate the deployment and execution of containers in alignment with control plane directives.

3.3. Networking and Service Identification

Kubernetes networking simplifies the intricacies of inter-service communication. Each pod is assigned a distinct IP address, and services facilitate stable, discoverable endpoints.
  • Cluster Networking establishes a flat network architecture, enabling direct communication among all pods without requiring NAT. It is generally administered through CNI plugins such as Antrea, Calico, Flannel, or Cilium.
  • Services function as reliable front-ends for pod access, frequently load-balanced and reachable through DNS. Kubernetes features an internal DNS service that autonomously allocates DNS records to services and pods.
  • Ingress Controllers regulate external access to services, typically through HTTP/S routing protocols and TLS termination, and play a crucial role in securely managing north-south traffic.
This system enables dependable communication between services internally within the cluster (east-west) and externally (north-south).

3.4. Extensibility and Declarative Configuration

Kubernetes’s primary advantage is its extensibility and declarative resource management method. Administrators and developers specify the desired state using YAML or JSON manifests.
  • Custom Resource Definitions (CRDs) facilitate introducing novel resource types, permitting integration with domain-specific logic.
  • Operators enhance controller functionality to oversee intricate, stateful applications by utilizing domain expertise embedded in automation.
  • Admission Controllers are modular components that intercept API requests to enforce cluster policies or execute validation and mutation.
These extensibility mechanisms enable teams to customize Kubernetes to their requirements while ensuring uniform management practices across environments.
The architecture of Kubernetes offers a modular and extensible framework for orchestrating containerized applications; however, each component—the control plane, node agents, and networking layers—poses distinct security challenges. Misconfigurations, excessive privileges, insecure defaults, and insufficient runtime protections are prevalent risks that can jeopardize cluster integrity. A comprehensive comprehension of these architectural components is crucial for implementing effective, layered security measures, which will be analyzed in detail in the subsequent section, encompassing identity management, auditability, hardening techniques, and secure communication. By implementing security measures at every architectural layer, administrators can markedly diminish the attack surface and enhance the overall resilience of Kubernetes deployments. We will examine fundamental mechanisms and optimal practices to safeguard each architectural layer, from securing the control plane and worker nodes to enforcing robust access controls, monitoring, and hardening techniques for constructing resilient and reliable Kubernetes environments.

4. Security Hardening of Kubernetes Components

This section emphasizes the architectural and operational fortification of Kubernetes clusters, outlining the essential measures necessary to secure both control plane and worker node components. Securing Kubernetes requires a layered defense strategy due to its open and dynamic architecture, encompassing network segmentation, secure API access, container runtime isolation, and node hardening. The section delineates essential practices, including role-based access control (RBAC), audit logging, secure communication, and optimal configuration protocols. These measures diminish the attack surface, enforce the principle of least privilege, and guarantee robust, policy-driven cluster operations in production environments.

4.1. Control Plane Security

The control plane functions as the central intelligence of a Kubernetes cluster, managing scheduling, resource distribution, and the intended state of all workloads. Securing the control plane is essential for safeguarding the Kubernetes ecosystem’s integrity, availability, and confidentiality. Due to their elevated rights and access to necessary configuration and state data, these components are prime targets for attackers. This section delineates critical security protocols for fortifying the principal control plane elements: the Kubernetes API server, etcd, and the scheduler. Each topic delineates certain configuration modifications, certificate administration, and access controls that mitigate attack surfaces and avert unauthorized interactions with the fundamental orchestration logic.

4.1.1. Securing Kubernetes API

The Kubernetes API server listens on port 6443 on the first non-localhost network interface, secured by TLS; in some production deployments, the API server is instead exposed on port 443. The --secure-port and --bind-address flags change the serving port and the address the API server binds to, respectively. In the example below, the port is changed from the default to 8443 and the API server is bound to the control plane node’s private IP address by modifying /etc/kubernetes/manifests/kube-apiserver.yaml, as shown in Figure 2:
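The exact manifest content depends on the distribution; a minimal sketch of the relevant flags, assuming a control plane node with the private address 10.0.0.10 (used purely for illustration), could look as follows:

    # /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
    spec:
      containers:
      - command:
        - kube-apiserver
        - --secure-port=8443                  # serve the API on 8443 instead of the default 6443
        - --bind-address=10.0.0.10            # bind only to the node's private interface
        - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
        - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256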
Firewall rules should also be configured to restrict API server access.
The API server presents a certificate for secure communication. This certificate can be issued by a private certificate authority (CA) or by a public key infrastructure tied to a trusted CA. The certificate and private key reside in /etc/kubernetes/pki/. Figure 2 shows the --tls-cert-file and --tls-private-key-file options, which encrypt all Kubernetes API server traffic over HTTPS using the specified certificates. To establish secure connections, the client must trust a copy of the CA certificate configured in ~/.kube/config and present its own TLS certificate. TLS security also depends on cipher suite strength, so it is crucial to enable only strong cipher suites, such as AES-GCM, and disable weaker ones.
Once TLS is established, the HTTP request undergoes authentication, during which the API server evaluates headers and client certificates. Client certificates, passwords, plain tokens, bootstrap tokens, and JSON Web Tokens are the available authentication modules. Multiple authentication modules can be configured and are tried sequentially until one succeeds.
If the request cannot be authenticated, HTTP status 401 is returned. Some authenticators provide group memberships, but not all. Kubernetes uses usernames for access control and logging but does not store them or other user data.

4.1.2. Securing etcd

To safeguard etcd, it is essential to implement either firewall configurations or security protocols based on the X.509 Public Key Infrastructure. Encrypted communication channels can be established by generating key and certificate pairs. The pair peer.key and peer.cert can be employed to secure communication among etcd nodes, while client.key and client.cert protect interactions between etcd and its clients.
To facilitate secure client communication for etcd, the flags --key-file=k8sclient.key and --cert-file=k8sclient.cert must be specified, and the URL protocol HTTPS should be employed. Client communication pertains to the interaction between etcd clients (such as kubectl or the API server) and the etcd server. Ensuring secure client communication with etcd is essential, as etcd contains sensitive cluster state information, including configuration data and potentially confidential secrets. Utilizing TLS certificates and private keys (via the --key-file and --cert-file flags) while enforcing HTTPS as the URL protocol ensures that communication between etcd clients (such as kubectl or the Kubernetes API server) and the etcd server is both encrypted and authenticated. The given code snippet illustrates how a client can mandate secure etcd communication.
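A sketch of such a client invocation, assuming the certificates reside under /etc/kubernetes/pki/etcd/ (the paths are illustrative), might be:

    ETCDCTL_API=3 etcdctl \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/k8sclient.cert \
      --key=/etc/kubernetes/pki/etcd/k8sclient.key \
      endpoint health      # any etcdctl subcommand now uses mutually authenticated TLS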
Setting ETCDCTL_API=3 enables the etcdctl tool to utilize its most recent API version for interactions. The --endpoints flag designates the secure endpoint of the etcd server, which operates on port 2379. The --cert and --key options supply the TLS certificate and the associated private key for the client, whereas --cacert indicates the Certificate Authority’s certificate utilized to authenticate the etcd server’s certificate. These parameters guarantee that the exchanged data are encrypted and that both the client and server execute the necessary verifications, which is vital for preserving the confidentiality and integrity of essential cluster information.
For backup and restoration, etcdctl can be utilized, as etcd stores data regarding all Kubernetes resources in its key-value database. etcdctl is the official command-line client for interfacing with an etcd key-value store, rendering it essential for administering etcd instances in Kubernetes or any system utilizing etcd. It is especially advantageous for backups as it can generate consistent, point-in-time snapshots of the cluster data using the etcdctl snapshot save command. The restoration process employs the identical tool through the etcdctl snapshot restore command, guaranteeing a uniform method for both backup and recovery. Upon generating an etcdctl snapshot, we can reinstate the state, as illustrated in Figure 3:
The system administrator must encrypt the snapshot files to safeguard sensitive Kubernetes data. The etcdctl snapshot save command generates a snapshot from a live etcd database. Restoring etcd via etcdutl necessitates reconstructing the etcd data directory from a snapshot file.
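A hedged sketch of this backup and restore workflow, with illustrative snapshot and data-directory paths, is shown below:

    # take a consistent, point-in-time snapshot of the live etcd database
    ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/k8sclient.cert \
      --key=/etc/kubernetes/pki/etcd/k8sclient.key \
      snapshot save /var/backups/etcd-snapshot.db

    # rebuild the etcd data directory offline from the snapshot file
    etcdutl snapshot restore /var/backups/etcd-snapshot.db --data-dir /var/lib/etcd-restored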

4.1.3. Securing Kube-Scheduler

Given the kube-scheduler’s role in workload placement, it must be adequately secured to prevent unauthorized access or disruptions in cluster operations. Compromising the kube-scheduler component can affect the cluster’s pod performance and availability. To safeguard the kube-scheduler, Figure 4 shows the security actions to be considered:
The configuration file in the figure contains measures to turn off profiling and restrict external connection requests. Since profiling exposes system-specific details, setting the --profiling option to false in /etc/kubernetes/scheduler.conf helps minimize the attack surface. Profiling endpoints typically expose detailed runtime information such as CPU and memory usage, stack traces, and other debugging data; attackers can analyze this data to gain insights into the system, identify potential weaknesses, or plan further attacks. Next, external access to the kube-scheduler should be restricted in the scheduler.conf file, located in the /etc/kubernetes directory.
Furthermore, the “AllowExtTrafficLocalEndpoint” feature gate, when set to “false” (“--feature-gates=AllowExtTrafficLocalEndpoint=false”), removes the ability for Services with the externalTrafficPolicy set to “Local” to use a local endpoint for external traffic [20]. By turning this feature off, the cluster will not allow traffic from outside the cluster to be routed exclusively to nodes that have a local Pod endpoint for that Service. This configuration can be necessary in environments where preserving the source IP of external traffic is either unnecessary or undesirable, or where uniform distribution of incoming traffic across all nodes is preferred over local endpoint routing.
The TLS certificates and keys are also specified in the kube-scheduler static pod manifest (a configuration sketch follows the list below). The relevant flags are as follows:
  • --tls-cert-file=/etc/kubernetes/pki/scheduler.crt: specifies the kube-scheduler’s TLS certificate for secure communication.
  • --tls-private-key-file=/etc/kubernetes/pki/scheduler.key: specifies the private key associated with the TLS certificate.
  • --client-ca-file=/etc/kubernetes/pki/ca.crt: CA file for validating incoming client certificates.
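A minimal sketch of how these flags might appear in the kube-scheduler static pod manifest (values are illustrative and follow the file locations used in this paper):

    # /etc/kubernetes/manifests/kube-scheduler.yaml (excerpt)
    spec:
      containers:
      - command:
        - kube-scheduler
        - --profiling=false                    # do not expose runtime profiling endpoints
        - --bind-address=127.0.0.1             # restrict external access to the scheduler
        - --tls-cert-file=/etc/kubernetes/pki/scheduler.crt
        - --tls-private-key-file=/etc/kubernetes/pki/scheduler.key
        - --client-ca-file=/etc/kubernetes/pki/ca.crt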
By enforcing certificate-based authentication, administrators can ensure that only authorized components interact with the scheduler. TLS protects sensitive data between the scheduler and other components like the API server. Securing the Kubernetes kube-scheduler is critical, as it decides on the placement of pods across the cluster based on scores and resource availability. If compromised, an attacker could manipulate pod scheduling, leading to resource exhaustion, unauthorized workload deployments, or denial of service.
The control plane manages the orchestration of the cluster, while the application workloads are executed on worker nodes. These nodes operate containerized services and are frequently more susceptible to external traffic, making them vital in a defense-in-depth strategy. The subsequent section emphasizes node-level security, specifically, the fortification of the Kubelet and the runtime environment it oversees. Effectively securing the nodes prevents unauthorized users from directly manipulating workloads or establishing a foothold that could compromise the control plane.

4.2. Node Security

Worker nodes constitute the execution layer of a Kubernetes cluster, tasked with operating the containers that provide application functionality. Every node interacts with the control plane and comprises essential components, including the Kubelet and container runtime, both require safeguarding to avert lateral movement or privilege escalation. Due to nodes frequently managing external workloads and directly interfacing with the underlying host operating system, vulnerabilities at this level can result in complete node compromise. This section emphasizes the fortification of the Kubelet through the implementation of stringent authentication and authorization protocols, the deactivation of anonymous access, and the assurance of encrypted communications, all intended to mitigate the danger of unauthorized acts on the node.

Securing Kubelet

The next step is securing the Kubelet component. The Kubelet offers significant node and container control via an HTTPS endpoint, which is exposed unauthenticated by default. Kubelet authentication and authorization are recommended for production clusters. Any requests to the kubelet HTTPS endpoint that are not rejected by other configured authentication methods are treated as anonymous and are assigned the username system:anonymous and the group system:unauthenticated.
Kubelet’s default configuration with --anonymous-auth=true treats any request not matched by other authentication mechanisms as an anonymous request and allows it if authorization policies do not explicitly deny it. That often gives unauthenticated access to the Kubelet API. With --anonymous-auth=false, the Kubelet denies unauthenticated requests, so the client must be authenticated and authorized to act on the node. Disabling anonymous access restricts Kubelet endpoint requests to authenticated principals (users or service accounts) with the correct permissions. To disable anonymous access and return 401 Unauthorized responses for unauthenticated requests, start the kubelet with --anonymous-auth=false, as shown in Figure 5:
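Equivalently, anonymous access can be disabled declaratively in the kubelet configuration file; the following is a minimal sketch under the assumption that the kubelet reads its configuration from /var/lib/kubelet/config.yaml:

    # /var/lib/kubelet/config.yaml (excerpt)
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    authentication:
      anonymous:
        enabled: false              # reject unauthenticated requests with 401 Unauthorized
      webhook:
        enabled: true               # delegate bearer token review to the API server
      x509:
        clientCAFile: /etc/kubernetes/pki/ca.crt
    authorization:
      mode: Webhook                 # delegate authorization decisions to the API server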
Securing the kubelet is vital as it directly governs the workloads operating on the node. Any misconfiguration or security vulnerability may result in significant repercussions, such as unauthorized access to sensitive workloads, data exfiltration, or total compromise of worker nodes.
In addition to safeguarding the infrastructure that executes workloads, it is crucial to analyze the workloads themselves, particularly the container images and runtime configurations. If inadequately assessed, container images may contain known vulnerabilities or excessive rights, rendering them a possible access point for attackers. This section outlines optimal techniques for container image scanning, privilege limitation, and secure integration into CI/CD pipelines to guarantee that only trusted and fortified workloads are deployed within the cluster.

4.3. Image and Workload Hardening

Container image scanning tools examine the container filesystem to collect metadata and identify any vulnerable components present within the image. A plethora of open-source and commercial enterprise solutions exist for this task, providing integration with CI/CD pipelines and an extensive array of scanning functionalities. Since containers are integral to contemporary software development and delivery pipelines, it is essential to ensure that container images are devoid of known security vulnerabilities to uphold integrity, availability, and confidentiality. Here are several key reasons why container vulnerability scanning is essential:
  • Proactive detection of vulnerabilities: Container vulnerability scanning allows organizations to identify security concerns in container images during the development phase and before deployment in production environments.
  • Compliance: Numerous organizations must comply with regulatory standards that mandate particular security practices and controls. Vulnerability scanning guarantees that container images are devoid of recognized vulnerabilities.
  • Minimizing the attack surface: Scanning container images for vulnerabilities enables organizations to reduce their attack surface by eliminating superfluous components, packages, and dependencies that may present security threats.
What is the functionality of container image scanning tools? The process of identifying vulnerabilities can be delineated through several interrelated phases:
  • Retrieving images: Images must be sourced from a local or reputable repository.
  • Extracting image layers: To extract an image’s layers, a vulnerability scanner must proficiently handle the Open Container Initiative format and accommodate the processing of tar archives.
  • Identifying packages and dependencies: Dependencies within a container are represented as operating system packages, non-OS packages, files, application libraries, and other components.
  • Examine vulnerability databases: Databases predominantly employ the CVE system. An essential element of vulnerability scanners is precisely identifying all vulnerabilities in the database and linking the appropriate package or range of packages as susceptible.
  • Match: A fundamental component of a vulnerability scanner is juxtaposing installed packages against recognized vulnerabilities. Diverse methodologies may be employed for this comparison, but the most direct method is name-based matching.
  • Clean outcomes: Adhering to a general matching algorithm may yield multiple outcomes. The name-based matching approach is prone to generating false positive outcomes. The results must be analyzed to mitigate inaccuracies before reporting.
In summary, container image scanners operate through a detailed process that includes retrieving images, extracting layers, identifying dependencies, examining vulnerability databases, and precisely correlating potential vulnerabilities with installed packages. By executing these procedures, scanners enhance the security of containerized environments by promptly identifying vulnerabilities.

4.3.1. Trivy (v0.62)

Container image scanners such as Trivy operate through a detailed procedure that includes retrieving images, extracting layers, identifying dependencies, examining vulnerability databases, and precisely correlating potential vulnerabilities with installed packages. By executing these procedures, scanners enhance containerized environments’ security by promptly identifying vulnerabilities.
The selected container image scanning tool for evaluating vulnerabilities is Trivy, an open-source security scanner designed to identify vulnerabilities and misconfigurations within various components of the software ecosystem. Upon initiating a scan, Trivy autonomously acquires the necessary vulnerability databases, preserving them locally for subsequent use. The central database, refreshed every six hours on GitHub, consolidates vulnerability information from credible sources, including operating system vendor advisories and community-driven repositories. Trivy exercises caution with packages with undefined versions by excluding vulnerability assessments to reduce false alerts. Users can activate a comprehensive detection mode that identifies vulnerabilities within unspecified version ranges by utilizing the lowest version for evaluation.
Trivy initiates image retrieval by obtaining the container image from a local repository or an online registry. That guarantees access to the most current image for analysis. Subsequently, a comprehensive inspection is conducted to analyze the image’s layers and identify installed system packages and software dependencies. Trivy’s efficacy is rooted in its vulnerability assessment phase, which cross-references detected packages with a comprehensive vulnerability database. That enables Trivy to identify and flag vulnerabilities accurately, ensuring no potential risk is neglected. Ultimately, Trivy consolidates its findings into an exhaustive report, providing actionable insights encompassing vulnerability identifiers, impacted components, and suggested remedial measures. Figure 6 presents an analysis of the ubuntu:20.04 image:
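The scan in Figure 6 can be reproduced with an invocation along the following lines; the severity filter is optional and included only as an assumption about a typical workflow:

    # scan a public base image and report only the more severe findings
    trivy image --severity HIGH,CRITICAL ubuntu:20.04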
Vulnerabilities are categorized by severity: low, medium, high, and critical. Each entry comprises a vulnerability ID, references for vulnerability patches and fixes, severity level, and package information.
Container image scanning is a crucial practice for preserving the security and integrity of containerized applications. Organizations can mitigate the risk of deploying insecure workloads by proactively identifying vulnerabilities, misconfigurations, and obsolete dependencies in container images. Consistent scanning, automated security protocols, and continuous integration pipelines guarantee early detection of vulnerabilities in the development lifecycle.

4.3.2. Preventing Containers from Running as Root

To enhance the overall security of the container environment, it is essential to avoid running containers as the root user. The principal advantage of executing containers with non-root privileges is protecting the application environment. Kubernetes enables the specification of a security context for pods and containers, delineating privilege and access control configurations. The objective is to guarantee that their operation adheres to the principle of least privilege. Running containers without root privileges inhibits malicious code from acquiring elevated permissions on the container host.
This limitation guarantees that unauthorized individuals who obtain the container from a public registry cannot access essential server resources. When containers run with elevated privileges, unauthorized users can execute undesirable processes or access confidential information. The following are the crucial fields that can be used in Kubernetes YAML files to achieve this objective (a sample pod definition is sketched below):
  • runAsNonRoot: guarantees that the container operates without root user privileges; Kubernetes will inhibit the container’s initiation if the UID is configured to 0 (root).
  • runAsUser and runAsGroup: designate the user and group ID under which the container operates, facilitating the enforcement of a specific non-root user.
  • allowPrivilegeEscalation: inhibits a process from acquiring elevated privileges.
  • privileged: When enabled, the container gains access to host devices and can alter the kernel.
Non-root containers are crucial in multi-tenant Kubernetes environments, where workload isolation and the enforcement of stringent security policies are imperative.
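A hedged sketch of a pod definition applying these fields (the pod name, image reference, and numeric IDs are illustrative) is given below:

    # non-root-pod.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: hardened-app
    spec:
      securityContext:
        runAsNonRoot: true              # refuse to start if the image would run as UID 0
        runAsUser: 1000
        runAsGroup: 3000
      containers:
      - name: app
        image: registry.example.com/app:1.0
        securityContext:
          allowPrivilegeEscalation: false
          privileged: false
          capabilities:
            drop: ["ALL"]               # grant the container process no Linux capabilities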
This configuration inhibits attackers from exploiting vulnerable applications to obtain root access and control the host system, as the security context delineates privilege and access control settings at the pod and container levels. The suggested pod definition can be implemented in the cluster using the command kubectl apply -f file_name.yaml, where file_name.yaml is the configuration file name.
Turning off privileged mode prevents the container from accessing host resources such as device files, kernel modules, or root processes, thereby protecting the host operating system from direct interference. Eliminating all capabilities guarantees that the container process receives minimal permissions, thereby obstructing attackers from executing privileged operations such as altering the system time, binding to privileged ports, or adjusting network configurations.

4.3.3. Integrating Hardening with CI/CD Pipelines

The last hardening step for images is integrating container image scanning into the whole CI/CD pipeline. That is where GitLab integration becomes very useful. The Trivy tool acts as the scanner, whereas GitLab CI/CD operates as the orchestrator. Each time GitLab executes the “scan” job, it employs Trivy to assess the image for vulnerabilities. Should any critical issues be identified, the pipeline fails, thereby upholding security standards within the development lifecycle. Integrating security at each phase of the development and deployment lifecycle is the primary aim of this integration and an essential element of the shift-left strategy in DevOps methodologies.
Integrating image scanning into the CI/CD workflow enables development teams to obtain security assessments immediately upon a developer’s code commitment to the repository. New vulnerabilities are continually identified, necessitating the maintenance of an updated vulnerability database. That indicates that an image scan may succeed during the build phase but could subsequently fail at runtime if a newly identified critical vulnerability exists within the image. In such instances, workload deployment must be suspended, and suitable mitigation strategies should be enacted.
In practical CI/CD processes, a failed Trivy scan, primarily upon identifying high or critical vulnerabilities, usually results in the automatic cessation of the pipeline. That inhibits the deployment of vulnerable images to production registries or clusters. Continuous Integration technologies, such as GitLab CI/CD or Jenkins, can be configured to terminate the build process if certain severity thresholds are reached. For instance, GitLab permits conditional job failure contingent upon Trivy’s exit codes or severity filters (e.g., --severity CRITICAL, HIGH). These controls implement security measures into DevSecOps pipelines, facilitating early correction before deploying insecure workloads in production systems.
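A minimal .gitlab-ci.yml sketch of such a gate is shown below; the stage layout and the use of the predefined GitLab variables $CI_REGISTRY_IMAGE and $CI_COMMIT_SHORT_SHA are assumptions about a typical pipeline:

    # .gitlab-ci.yml (excerpt)
    scan:
      stage: test
      image:
        name: aquasec/trivy:latest
        entrypoint: [""]
      script:
        # fail the job (non-zero exit code) if HIGH or CRITICAL vulnerabilities are found
        - trivy image --exit-code 1 --severity HIGH,CRITICAL "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"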
Although fortified container images and secure workload configurations diminish the probability of vulnerabilities at the application layer, a secure Kubernetes environment also relies on regulating access to cluster resources. Unauthorized or overly permissive access remains a prevalent root cause of breaches in cloud-native environments. The subsequent section analyzes identity and access management in Kubernetes, emphasizing authentication methods, certificate utilization, token-based access, and the implementation of granular permissions via Role-Based Access Control (RBAC) and node authorization policies.

4.4. Identity and Access Management

Authentication and authorization are essential components of Kubernetes security. They regulate access to cluster resources and ensure that only authenticated users or services can execute particular activities. Several Kubernetes authentication mechanisms are disabled by default and, if improperly configured, may permit malicious entities to access the API server or internal data without adequate verification. That highlights the necessity for intentional and secure configuration of identity management systems.
This section examines the authentication domain comprehensively, detailing how Kubernetes verifies the identity of users, services, and applications via mechanisms such as client certificates, static and bootstrap tokens, and authenticating proxies. These tactics are essential for implementing mutual TLS authentication, improving identity verification, and operational adaptability.
In addition to authentication, we analyze authorization solutions, emphasizing Role-Based Access Control (RBAC), which delineates specific permissions for users, groups, and service accounts. RBAC guarantees that each identity functions within explicitly delineated parameters, restricting the potential consequences of compromised credentials or errant services. We further address node authorization as a specific mechanism for safeguarding interactions initiated by the Kubelet.
Collectively, these measures reinforce the principle of least privilege, safeguard against unlawful access, and facilitate adherence to contemporary infrastructure governance mandates. Due to Kubernetes’s declarative configuration style and flexibility, these access controls may be seamlessly integrated into the cluster’s operational routines via YAML or JSON manifests.

4.4.1. Using Certificates

The API server enables Client certificate authentication by supplying the --client-ca-file parameter. The API server configuration is in /etc/kubernetes/manifests/kube-apiserver.yaml (static pod) or /etc/kubernetes/kube-apiserver (if operating as a systemd service). The designated file must contain one or more certificate authorities to authenticate client certificates presented to the API server. Figure 7 illustrates the method for supplying the --client-ca-file parameter to the API server within the configuration file:
Upon the provision and successful authentication of a client certificate, the subject’s common name is utilized as the username for the request. Every user holds a distinct X.509 client certificate when using this method. X.509 certificates are essential for securing communication among cluster components, authenticating users, and encrypting data. Kubernetes extensively employs X.509 certificates to secure its control plane and manage cluster security via TLS encryption.
The API server authenticates the client certificate with a specified certificate authority. Upon successful validation, the common name from the subject’s details functions as the username for the request, while any organizations linked to the subject are regarded as groups. Certificates can be generated manually using the OpenSSL tool. The OpenSSL cryptographic library offers access to encryption algorithms employed in various Internet protocols and standards.
Its functionalities include symmetric encryption, asymmetric key cryptography, key exchange protocols, certificate management, cryptographic hashing algorithms, secure pseudo-random number generators, message authentication codes (MACs), key derivation functions (KDFs), and various auxiliary tools. The certificates can be generated by adhering to the procedure outlined below:
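A hedged outline of that procedure, signing a client certificate with the cluster CA (the key size, validity period, and the user name "alice" with group "developers" are illustrative), is as follows:

    # 1. generate a private key for the user
    openssl genrsa -out alice.key 2048

    # 2. create a certificate signing request; the CN becomes the username, O the group
    openssl req -new -key alice.key -out alice.csr -subj "/CN=alice/O=developers"

    # 3. sign the request with the cluster certificate authority
    openssl x509 -req -in alice.csr -CA /etc/kubernetes/pki/ca.crt \
      -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -out alice.crt -days 365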

4.4.2. Using Tokens

Tokens in Kubernetes function as a versatile and secure mechanism for authenticating users and services that engage with the Kubernetes API server. By comprehending the various categories of tokens, including bootstrap tokens, static tokens, and OIDC tokens, system administrators can devise authentication strategies customized for specific environments.
  • Static tokens are a straightforward mechanism for user authentication to the Kubernetes API server. These tokens are pre-established and maintained in a static file on the control plane. They offer a simple method for configuring basic authentication in small-scale or testing environments; however, they are not advisable for production due to security issues. Static tokens are provided as a CSV file containing tokens, usernames, user IDs, and optional group names. The API server retrieves bearer tokens from a file when the --token-auth-file=FILENAME option is specified on the command line. A static token file can be created in /etc/kubernetes/static-token.csv on the controller node and must contain a minimum of three columns: token, username, and user-id. The API server manifest at /etc/kubernetes/manifests/kube-apiserver.yaml must be updated to include the --token-auth-file flag (a sketch follows this list).
  • Bootstrap tokens are basic bearer tokens designed to create new clusters or integrate new nodes into an existing cluster. They are ephemeral tokens engineered explicitly for the bootstrapping process to authenticate a node to the cluster’s control plane. Consequently, they are designed to facilitate the kubeadm process but may also be utilized in alternative contexts. Bootstrap tokens are retained as secrets within the kube-system namespace. The bootstrap token authenticator can be activated with the --enable-bootstrap-token-auth=true flag on the API server. Figure 8 illustrates the default bootstrap token created by the kubeadm init process and stored as a secret in the kube-system namespace:
  • OIDC tokens: The OpenID Connect protocol is developed by extending the existing OAuth2 protocol. Kubernetes lacks a built-in OpenID Connect identity provider; external identity providers such as Google or Azure may be utilized and can be seamlessly integrated with Kubernetes authentication workflows as needed. OpenID Connect is a derivative of OAuth2 endorsed by numerous OAuth2 providers. The principal improvement it provides over OAuth2 is incorporating an ID Token in the response in addition to the access token, with the ID Token being a JSON Web Token (JWT). To implement OpenID Connect with Kubernetes, one must first authenticate with an identity provider, such as Microsoft Entra ID, Salesforce, or Google. Upon logging in, the identity provider will issue an access_token, an id_token, and a refresh_token. When utilizing kubectl to engage with the Kubernetes cluster, the id_token must be provided either via the --token flag or by incorporating it directly into the kubeconfig file. Upon configuration, kubectl will transmit the id_token in the authorization header during API server requests. The API server will validate the JWT signature and check whether the token has expired to confirm its validity, and will respond to kubectl if the user possesses authorization.
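As a sketch of the static token mechanism referenced above, the token file and the corresponding API server flag might look as follows (the token value, username, user-id, and group are placeholders):

    # /etc/kubernetes/static-token.csv
    0123456789abcdef,alice,u1001,"developers"

    # /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
    - --token-auth-file=/etc/kubernetes/static-token.csv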
Using tokens in Kubernetes improves security by facilitating precise access control and authentication. Tokens, including service account tokens and ephemeral tokens from OIDC providers, ensure that only authorized users or services can access the Kubernetes API. They can be precisely defined to restrict access to resources or namespaces and are more readily rotated and revoked than static credentials. That mitigates the risk of credential exposure and enhances auditability and compliance in secure settings.

4.4.3. Using an Authentication Proxy

An authentication proxy functions by intercepting API requests and validating the identity of the client or user initiating the request. Diverse proxy solutions, such as OAuth2_proxy or nginx, can be employed for this objective. Upon successful authentication, the proxy transmits the request to the Kubernetes API server. Should the authentication be unsuccessful, the proxy generates an error message. The API server must be configured to recognize users based on request header values, such as X-Remote-User. It is intended for use alongside an authenticating proxy. Figure 9 illustrates a configuration example for the kube-api server in this context:
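A sketch approximating the configuration in Figure 9, reusing the certificate paths referenced elsewhere in this paper, is shown below:

    # /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
    - --requestheader-client-ca-file=/etc/kubernetes/pki/ca.crt
    - --requestheader-username-headers=X-Remote-User
    - --requestheader-group-headers=X-Remote-Group
    - --proxy-client-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/apiserver.key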
The options used in the example are as follows:
  • --requestheader-client-ca-file=/etc/kubernetes/pki/ca.crt: designates the Certificate Authority certificate file utilized to authenticate client certificates submitted by the proxy, thereby ensuring that the proxy interacting with the API server is both trusted and authorized.
  • --requestheader-username-headers=X-Remote-User: designates the HTTP header (X-Remote-User) utilized by the proxy to transmit the authenticated user’s identity. The initial non-empty header is regarded as the username. The proxy establishes this header, which the API server utilizes to recognize the authenticated user.
  • --requestheader-group-headers=X-Remote-Group: establishes the header (X-Remote-Group) that enumerates the groups associated with the authenticated user.
  • --proxy-client-cert-file=/etc/kubernetes/pki/apiserver.crt: supplies the client certificate for the API server to authenticate with the proxy server, which is utilized during mutual TLS authentication when the API server interacts with the proxy.
  • --proxy-client-key-file=/etc/kubernetes/pki/apiserver.key: designates the private key associated with the certificate mentioned above (apiserver.crt), utilized to establish a secure connection between the API server and the proxy.
To establish an authentication proxy, nginx may be utilized by implementing ConfigMap and Deployment configuration files. A service to expose the nginx proxy should be deployed. The authentication proxy secures access to the Kubernetes API server by functioning as an intermediary between clients and the API server, enforcing authentication and authorization protocols before requests reach the API server.
Upon successful request authentication, nginx inserts headers including X-Remote-User and X-Remote-Group, which the API server utilizes for user identification.

4.4.4. Role-Based Access Control

Role-Based Access Control (RBAC) regulates access to network resources according to the roles designated to users within an organization. RBAC authorization employs the rbac.authorization.k8s.io API group to facilitate authorization determinations, allowing dynamic policy adjustments via the Kubernetes API. The implementation of RBAC in Kubernetes consists of a two-step procedure, as detailed below. The initial step is to establish a Role or ClusterRole; the latter is a cluster-wide object, whereas the former is namespace-scoped. A Role or ClusterRole comprises rules specifying verbs and resources that delineate the permitted actions on those resources. The subsequent step involves establishing a RoleBinding or ClusterRoleBinding, wherein the privileges outlined in the initial step are conferred upon the user or group. Figure 10 illustrates the creation of an RBAC role that confers read-only access to Pods within the test namespace, with the verbs specifying permitted actions on resources (get, list, and watch):
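A hedged reconstruction of the Role from Figure 10, together with a RoleBinding that grants it to a hypothetical user alice, is sketched below:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      namespace: test
      name: pod-reader
    rules:
    - apiGroups: [""]                     # "" denotes the core API group
      resources: ["pods"]
      verbs: ["get", "list", "watch"]     # read-only access
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      namespace: test
      name: read-pods
    subjects:
    - kind: User
      name: alice
      apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: Role
      name: pod-reader
      apiGroup: rbac.authorization.k8s.io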
Role-Based Access Control (RBAC) is a crucial element of Kubernetes’ authorization strategy, offering a systematic and secure approach to managing permissions within a cluster. RBAC delineates Roles and Role Bindings to guarantee that users, groups, and service accounts possess only the permissions requisite for their designated functions. Effectively implementing RBAC enhances security and facilitates adherence to governance policies and industry standards. The effective execution of RBAC necessitates meticulous planning and continuous oversight. Administrators must assess user and service account needs to establish suitable roles with the least required permissions.

4.4.5. Node Authorization

Node authorization is a specific access control mechanism intended to authenticate API requests initiated by the kubelet client component. It can permit or prohibit a kubelet from executing API operations such as the following:
  • Reading operations: pods, nodes, endpoints, secrets, config maps, and volumes.
  • Write operations: nodes and node status, pods and pod status, and associated events.
  • Authentication operations: read/write access to Certificate Signing Requests for TLS generation, and the creation of TokenReviews and SubjectAccessReviews for delegated authentication and authorization assessments.
To configure the Node authorizer in Kubernetes, the kube-apiserver must be set up with the --authorization-config flag, as illustrated in Figure 11:
The auth-config.yaml file located in the /etc/kubernetes directory should contain the content depicted in Figure 12, where we can see the resources and actions permissible for kubelets:
This configuration guarantees that kubelet is restricted to executing actions (such as get and list) on designated resources (pods, services, and nodes). By limiting the extent of their permissions, nodes can access only the information necessary for their operation. This method mitigates the risk of inadvertent or malevolent exploitation of resources. In conclusion, the node authorizer in Kubernetes is a specific access control mechanism intended to authenticate API requests initiated by kubelet components.
Consequently, when a kubelet interacts with the Kubernetes API server, it must authenticate using the identity of its host node. The node authorizer ensures that the kubelet is permitted to execute only those operations for which it is authorized, contingent upon the resources allocated to its designated node. It limits the kubelet’s ability to perform operations not directly associated with the pod operating on that node. Kubelets can access the Pod resources for which they are accountable, but they are prohibited from accessing pods on other nodes, secrets, or configuration data unless explicitly permitted.
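As a minimal sketch, an auth-config.yaml enabling the Node authorizer ahead of RBAC could use the structured authorization configuration format shown below. The apiVersion varies between Kubernetes releases (apiserver.config.k8s.io/v1beta1 in older versions, /v1 in newer ones), and the exact file depicted in Figure 12 may contain additional settings.

```yaml
# Illustrative /etc/kubernetes/auth-config.yaml referenced by --authorization-config.
apiVersion: apiserver.config.k8s.io/v1
kind: AuthorizationConfiguration
authorizers:
  - type: Node      # evaluates kubelet requests against their node identity
    name: node
  - type: RBAC      # all other requests fall through to RBAC
    name: rbac
```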
Despite robust access constraints, ensuring visibility into cluster activity is essential for identifying misconfigurations, unauthorized operations, and early signs of compromise. Security encompasses not just prevention but also detection and response. The subsequent section emphasizes observability and auditing within Kubernetes settings, encompassing audit logging, system monitoring, and integration with logging and visualization tools that help administrators maintain situational awareness and respond effectively to issues.

4.5. Observability and Auditing

Activity logs record discernible actions within the cluster. A comprehensive logging system and regular log analysis are crucial for confirming the proper operation and configuration of services and maintaining the system’s security. Comprehensive security audit standards necessitate regular and thorough evaluations of security configurations to identify potential breaches. Kubernetes can generate audit logs to oversee identifiable operations within the cluster and monitor essential resource utilization metrics, including CPU and memory consumption. Nevertheless, it lacks an integrated, all-encompassing monitoring or alerting system.

4.5.1. Auditing

The Kubernetes auditing policy delineates the logging of API server activities, specifying the types of events to be recorded and the level of detail required. An example of an audit policy definition yaml file is presented in Figure 13:
An effective auditing policy records API calls throughout various lifecycle phases (such as request receipt and response transmission) and encompasses contextual information, including user identity, resource type, and namespace. This detailed configuration aids in identifying misconfigurations and potential security incidents, while facilitating compliance monitoring in extensive environments [14,21]. Centralized logging frameworks augment observability and threat detection within distributed Kubernetes clusters [14].
Auditing regulations are delineated in the policy and serve as criteria for determining the volume of information documented for correlating API events. These regulations assess parameters such as users, actions (create, delete), or target resources and designate a log level: None, Metadata, Request, or RequestResponse. Evaluation is hierarchical, with the initial corresponding rule being implemented. Meticulously formulated regulations diminish log volume while maintaining critical security insights, facilitating effective anomaly detection and forensic examination [14,22].
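A minimal policy sketch consistent with these rules is shown below; the users, namespaces, and levels are illustrative rather than the exact policy from Figure 13.

```yaml
# Illustrative audit policy; rules are evaluated top-down and the first match wins.
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  # Ignore routine watch traffic from trusted system components.
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
  # Record only metadata (no request/response bodies) for sensitive objects.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Record full request and response bodies for Pod changes in "test".
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    namespaces: ["test"]
    resources:
      - group: ""
        resources: ["pods"]
  # Everything else at Metadata level.
  - level: Metadata
```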
The Kubernetes audit backend is an essential element of the auditing framework, tasked with storing and managing audit logs produced by the API server. It specifies the destination and format of audit data, accommodating multiple backends including log files, webhooks, or external logging solutions. This adaptability enables the integration of logs with SIEM platforms or centralized log collectors such as the ELK stack or PLG stack [14,23]. An optimally configured audit backend enhances compliance, enables real-time threat detection, and aids post-incident analysis. Audit logs can be directed through webhook backends to external processors for sophisticated filtering, correlation, and enrichment, which is especially crucial in microservices environments characterized by high log volume [14,24]. Furthermore, distributed log aggregation systems utilizing microservices for log collection across nodes have demonstrated efficacy in scalable Kubernetes deployments [24]. Backend configurations must equilibrate log granularity with resource limitations to ensure security and performance, utilizing container-native methodologies like eBPF for improved visibility and efficiency [25].
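As a sketch, the log backend can be enabled through the kube-apiserver static Pod manifest as follows; the paths and retention values are illustrative, the corresponding volume mounts are omitted for brevity, and a webhook backend would instead be configured with --audit-webhook-config-file.

```yaml
# Fragment of /etc/kubernetes/manifests/kube-apiserver.yaml enabling the log backend.
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
        - --audit-log-path=/var/log/kubernetes/audit/audit.log
        - --audit-log-maxage=30      # days to keep rotated audit files
        - --audit-log-maxbackup=10   # number of rotated files to keep
        - --audit-log-maxsize=100    # megabytes per file before rotation
```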

4.5.2. Logging

Kubernetes facilitates comprehensive logging methodologies at both the node and pod tiers. Node-level agent logging consolidates log collection throughout the cluster, whereas sidecar containers within pods provide application-specific log management. Collectively, these logging approaches augment observability and operational insight. We will now assess these logging approaches’ capabilities and architectural trade-offs in scalable and resilient Kubernetes environments.
Kubernetes Node-Level Agent Logging typically entails the deployment of agents such as Fluentd, Filebeat, or Logstash directly on each node. These agents gather logs from diverse sources, including container stdout/stderr, system logs, and Kubernetes components such as kubelet and container runtime logs. They standardize, process, and transmit logs to centralized logging systems such as Elasticsearch or cloud-native alternatives. This approach guarantees the reliable capture of all node and container logs, even for ephemeral pods [22,23]. It also improves observability by offering host-level context and facilitates consistent logging across environments. Contemporary designs integrate lightweight, scalable agents that diminish performance overhead and facilitate dynamic microservices [24]. The first part of the standard Fluentd deployment configuration is shown in Figure 14:
Fluentd is a flexible tool that significantly improves Kubernetes logging through centralized log aggregation, efficient log forwarding, and log processing capabilities. It is exceptionally efficient in performance and adept at managing substantial quantities of log data with negligible overhead. That is crucial in Kubernetes environments, where thousands of pods and containers produce logs at scale.
This paper utilizes Fluentd as a centralized logging agent, implemented as a DaemonSet to bolster Kubernetes security via consistent log aggregation and forwarding at the node level to Elasticsearch. This method guarantees uniform capture of all cluster activities, even amidst node failures or container crashes, thus enhancing observability and forensic capabilities. This configuration highlights Fluentd’s contribution to a security-first architecture, in contrast to previous implementations like the CMSWEB infrastructure at CERN, which utilized Fluentd as a sidecar or node-level agent to transmit logs to both S3 and Elasticsearch [26]. It closely integrates with auditing and monitoring systems and customizes Fluentd’s configuration to conform to security best practices, including role-based access control (RBAC), TLS encryption, and secure multi-tenancy. Using Fluentd in this context fosters a defense-in-depth approach, wherein improved logging underpins real-time alerting, incident response, and sustained audit compliance, representing a progression beyond conventional logging frameworks that frequently emphasize operational visibility at the expense of security robustness.
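A condensed sketch of such a node-level Fluentd DaemonSet is given below; the image tag, Elasticsearch endpoint, and tolerations are illustrative and differ from the full configuration in Figure 14, which additionally covers RBAC and TLS settings.

```yaml
# Condensed node-level Fluentd DaemonSet forwarding logs to Elasticsearch.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd          # bound to a least-privilege RBAC role
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          effect: NoSchedule               # also collect control-plane node logs
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch.logging.svc"
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
```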
A sidecar container is a companion container in the same pod as the main application and is responsible for handling logging. It captures or gathers application logs (e.g., from shared volumes or stdout) and transmits them to a logging backend. This pattern segregates log collection from business logic, enhances modularity, and facilitates the customization of application-specific log routing and formats [19,23].
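The following Pod manifest is a minimal sketch of this sidecar pattern, assuming the application writes its log file to a shared emptyDir volume; in practice the sidecar would typically run a log shipper such as Fluent Bit rather than a plain tail.

```yaml
# Illustrative sidecar logging Pod; the application and the sidecar share a volume.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: log-sidecar
      image: busybox
      # Stream the shared log file to the sidecar's stdout, where a node-level
      # agent (or a dedicated shipper) can collect it.
      command: ["sh", "-c", "touch /var/log/app/app.log; tail -f /var/log/app/app.log"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
  volumes:
    - name: app-logs
      emptyDir: {}
```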

4.5.3. Kubernetes Monitoring

The standard way to monitor Kubernetes is to use Prometheus and Grafana. Prometheus is a powerful, open-source monitoring tool extensively utilized for Kubernetes observability. It aggregates metrics from multiple components, including kubelet, cAdvisor, and application pods via HTTP endpoints. Prometheus functions through a pull-based model, systematically retrieving metrics from designated targets and archiving them as time-series data. Kubernetes-native service discovery streamlines target management by automatically adjusting to cluster modifications. Prometheus is frequently integrated with Grafana for data visualization, allowing users to develop comprehensive dashboards that emphasize node health, pod utilization, and application performance trends [27,28]. Figure 15 shows a Prometheus dashboard for Kubernetes:
This integrated stack facilitates anomaly detection, SLA monitoring, and cluster health management. For example, automatic anomaly detection can be integrated with Prometheus data to identify faults before they impact performance, facilitating proactive issue resolution and enhancing operational efficiency [29]. These attributes render Prometheus an essential element of any scalable Kubernetes monitoring strategy.
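To make the pull-based, service-discovery-driven model concrete, the snippet below sketches a Prometheus scrape configuration that discovers pods via the Kubernetes API and keeps only those opting in through the common prometheus.io/scrape annotation; the job name and relabeling rules are illustrative, not a prescribed configuration.

```yaml
# Illustrative Prometheus scrape configuration using Kubernetes pod discovery.
scrape_configs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Carry namespace and pod name over as metric labels.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```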
Grafana software enhances Prometheus by serving as a visualization and dashboarding tool. Prometheus and Grafana collectively provide an all-encompassing solution for real-time monitoring. This amalgamation is extensively utilized in contemporary cloud-native ecosystems owing to its adaptability, scalability, and open-source characteristics. Figure 16 illustrates an instance of Kubernetes cluster monitoring within Grafana:
Connecting to Prometheus as a data source allows Grafana to facilitate the development of interactive dashboards for monitoring cluster health, resource utilization, and performance trends. It is intended to monitor containerized microservices and applications operating within extensive Kubernetes ecosystems. Prometheus and Grafana exhibit high scalability, rendering them appropriate for clusters of any magnitude. Prometheus adeptly manages substantial metric volumes and effectively stores time-series data, whereas Grafana proficiently oversees diverse visualizations within intricate Kubernetes ecosystems. This scalability guarantees that organizations can oversee extensive and expanding Kubernetes deployments without compromising performance or reliability.
In production environments, the data gathered by Fluentd and visualized in Prometheus and Grafana dashboards is used for active analysis rather than passive observation. Threshold-based alerts can be established, for instance on CPU or RAM surges, container restarts, or atypical API activity, and these trigger automated responses through alerting systems such as Alertmanager, Alerta, or Opsgenie. Responses may include scaling services, restarting malfunctioning pods, running mitigation scripts, or notifying incident response teams. For example, excessive failed login attempts recorded in Fluentd logs can trigger webhook actions that revoke access tokens or impose temporary network restrictions.
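As a sketch of the threshold-based alerting described above, the following Prometheus rule flags frequently restarting containers. It assumes kube-state-metrics is deployed (the source of the restart counter); the threshold, duration, and labels are illustrative values that Alertmanager would route to the appropriate receiver.

```yaml
# Illustrative Prometheus alerting rule for the container-restart case.
groups:
  - name: kubernetes-workloads
    rules:
      - alert: PodRestartingFrequently
        expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container restarting frequently"
          description: "{{ $labels.namespace }}/{{ $labels.pod }} restarted more than 3 times in 15 minutes."
```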

5. Kubernetes Vulnerabilities and Kubernetes Penetration Testing

This section examines optimal strategies for reducing vulnerabilities in Kubernetes environments, emphasizing practical tools, methodologies, and reference data sources. First, we analyze the function of vulnerability databases and the significance of tracking Kubernetes-specific CVEs for prompt risk evaluation and patch administration. This section provides practical insights into penetration testing methodologies specific to Kubernetes clusters, illustrating how adversarial techniques can uncover security vulnerabilities. Collectively, these practices constitute a proactive defense strategy that enhances architectural hardening and runtime protections.

5.1. Vulnerability Databases

Vulnerability databases are crucial in contemporary cybersecurity frameworks, especially for cloud-native environments like Kubernetes. These archives methodically document known security vulnerabilities, frequently outlining the technical characteristics of each defect and its possible effects on systems. They also suggest remediation measures such as patches or configuration adjustments. Each item is generally allocated a unique identity, such as a CVE (Common Vulnerabilities and Exposures) ID, to guarantee standardized monitoring throughout organizations and tools [30]. Notable platforms comprise the National Vulnerability Database (NVD), overseen by NIST, which enhances CVE entries with scoring data and metadata via the Common Vulnerability Scoring System (CVSS) [31].
Vulnerability databases are essential for Kubernetes and other containerized systems, aiding in patch priority, risk assessment, and regulatory compliance. Nevertheless, dependence on these databases presents its difficulties. Research has revealed disparities among databases such as NVD, OSVDB, and others, particularly in severity scoring and affected software versions [32]. These discrepancies can result in differing risk evaluations and influence mitigation strategies. Furthermore, the imperative for patch creation, especially in high-severity situations, may engender new vulnerabilities if not rigorously verified, further complicating security management [33]. Access to comprehensive vulnerability data may differ based on user responsibilities or corporate security clearances, affecting repair tactics and response times.
Notwithstanding these issues, vulnerability databases are fundamental to proactive security tactics. They consolidate and distribute threat knowledge, optimize security automation, and facilitate manual and AI-assisted vulnerability triage procedures. Their ongoing advancement—via enhancements in data quality, machine-readable formats, and AI-augmented analysis—is essential for safeguarding Kubernetes deployments and broader software ecosystems [30,34].

5.2. Kubernetes CVEs

Kubernetes has become a fundamental technology in the swiftly advancing field of cloud-native infrastructure, managing containerized applications on a large scale. Its widespread popularity has rendered it a common subject for security analysis and exploitation. Organizations adopt a proactive security approach by utilizing standardized vulnerability disclosures via the Common Vulnerabilities and Exposures (CVE) system, which is essential for evaluating risks, prioritizing patches, and implementing countermeasures in Kubernetes environments. CVEs document vulnerabilities’ technical specifics, facilitate tools’ interoperability, and enhance transparency within the security community [30].
Recent studies underscore the limitations in the consistency, timeliness, and accuracy of CVE data, particularly when utilized in complex platforms such as Kubernetes. Documented inconsistencies in version information, discrepancies in CVSS scores, and traceability gaps in public databases such as NVD and CVE may influence organizational responses to emerging threats [32,34]. Considering the rising frequency of vulnerability disclosures in cloud platforms and third-party extensions, it is essential to contextualize CVE data with specific mitigation strategies. This section consolidates the most recent Kubernetes-related CVEs, providing succinct descriptions and suggested actions based on the latest intelligence. Here is a list of some of the latest, significant Kubernetes-related CVEs and their recommended mitigation strategies:
  1. CVE-2025-1974
Component: Ingress-NGINX Controller
Description: This significant vulnerability permits unauthenticated attackers within the pod network to execute arbitrary code in the context of the ingress-nginx controller. Successful exploitation may result in the disclosure of confidential information and the compromise of resources across the cluster.
Mitigation: Upgrade ingress-nginx to a patched version listed in the Kubernetes security bulletins. Implement network access restrictions and role-based access control (RBAC) policies to reduce exposure to untrusted sources.
  2. CVE-2025-1098
Component: Ingress-NGINX Controller
Description: This vulnerability enables remote code execution by exploiting the ingress-nginx controller, representing a significant risk to Kubernetes environments. Attackers may obtain access to cluster internals or confidential information.
Mitigation: Apply the most recent security update for ingress-nginx. Regularly review cluster ingress configurations and implement network segmentation to restrict access.
  3. CVE-2025-1097
Component: Ingress-NGINX Controller
Description: This critical vulnerability enables attackers to execute remote code by leveraging configuration deficiencies, potentially resulting in total control over ingress-nginx and its related workloads.
Mitigation: Update to the most recent version of ingress-nginx, perform regular configuration assessments, and apply least-privilege principles.
  4. CVE-2025-24514
Component: Ingress-NGINX Controller
Description: This vulnerability allows attackers to achieve unauthorized code execution via specially crafted requests aimed at ingress-nginx, compromising container integrity and security.
Mitigation: Upgrade ingress-nginx as recommended in the Kubernetes CVE feed. Examine logs for anomalous ingress activity and enforce rigorous input validation on annotations.
  5. CVE-2024-9042
Component: Kubelet on Windows Nodes
Description: Insufficient validation of the "pattern" parameter in NodeLogQuery requests may permit command injection on Windows worker nodes, facilitating privilege escalation or unauthorized command execution.
Mitigation: Upgrade Windows kubelets to versions v1.32.1, v1.31.5, v1.30.9, or v1.29.13. Segregate Windows workloads using stringent RBAC and firewall rules.
  6. CVE-2024-7646
Component: Ingress-NGINX Controller
Description: A validation bypass vulnerability enables attackers to alter annotations and inject harmful configurations, which may result in privilege escalation or data exfiltration.
Mitigation: Upgrade to ingress-nginx version 1.11.2 or later. Validate ingress annotations and limit annotation use via policy enforcement mechanisms such as OPA/Gatekeeper.
  7. CVE-2024-29990
Component: Azure Kubernetes Service (AKS)
Description: A security vulnerability in AKS may result in unauthorized access or privilege escalation within the cluster environment, jeopardizing the confidentiality and integrity of workloads.
Mitigation: Apply the most recent AKS updates from Microsoft and review Azure role assignments. Use Azure Policy and Defender for Cloud to enforce secure cluster configurations.
  8. CVE-2024-9486
Component: Kubernetes Image Builder
Description: This vulnerability permits unauthorized individuals to obtain root-level access to virtual machines used in image construction, potentially jeopardizing the supply chain.
Mitigation: Upgrade the Kubernetes Image Builder to version 0.1.38 or later. Limit access to build infrastructure and segregate it from production systems.
  9. CVE-2024-10220
Component: gitRepo Volume Plugin
Description: Individuals possessing pod creation privileges can employ the gitRepo volume type to access local git repositories from other pods, resulting in potential data leakage.
Mitigation: Upgrade to Kubernetes versions v1.31.0, v1.30.3, v1.29.7, or v1.28.12. Refrain from using gitRepo volumes and instead perform Git clone operations through init containers.
Table 1 summarizes the CVEs examined in Section 5.2, contextualizing real-world threats within the proposed framework. Each item links a specific risk to its affected component and suggests mitigation strategies, underscoring the importance of architectural fortification in protecting against recognized threats.
The enumerated CVEs underscore vulnerabilities within Kubernetes ecosystems, especially ingress controllers, runtime elements, and improperly configured integrations. These findings emphasize the necessity for ongoing vulnerability surveillance, swift patch management, and proactive architectural design—all fundamental principles of the methodology presented in this study.
The analyzed CVEs highlight the advancing security issues within the Kubernetes ecosystem, especially concerning ingress controllers, configuration vulnerabilities, and platform integrations. Prompt patching, stringent Role-Based Access Control (RBAC), and ongoing monitoring are crucial for risk mitigation and ensuring strong cluster security.

5.3. Penetration Testing Tools, Demonstration, and Security Compliance Assessment

Penetration testing in Kubernetes environments is crucial for securing contemporary cloud-native infrastructures. As the leading container orchestration platform, Kubernetes features a vast and modular architecture that offers numerous potential attack vectors, including misconfigurations, privilege escalations, and exposed APIs. A range of open-source tools has been created to aid security teams in proactively evaluating and addressing these vulnerabilities. These tools facilitate offensive and defensive security measures by replicating real-world attacks and assessing a system’s robustness against various threat scenarios [35,36]. The open-source characteristics and robust community backing of tools such as Kube-hunter, kube-bench, and KubeFuzzer guarantee ongoing updates corresponding to the changing Kubernetes threat environment [35,37].
Kube-hunter, created by Aqua Security, is a tool that has become prominent for auditing Kubernetes environments. It identifies security vulnerabilities through passive scanning—non-intrusive detection that does not engage with the cluster—and active scanning, which rigorously tests for potential weaknesses. Besides vulnerability enumeration, tools such as kube-bench synchronize audits with CIS Benchmarks, providing recommendations for optimal configuration hardening. Advanced methodologies, including RESTful API fuzzing through KubeFuzzer, automate the identification of latent defects in the Kubernetes API layer by employing semantic-driven request generation, thereby enhancing code coverage and detection precision [38]. These tools can be customized to conform to organizational policies and compliance frameworks, facilitating regular assessments and high-assurance security validations. Their continual application fortifies Kubernetes environments against advanced threats, rendering penetration testing a fundamental element of resilient cloud-native defense strategies [2]. Kube-hunter is easy to deploy if we follow the procedure detailed in Figure 17:
By default, kube-hunter vulnerability scanning initiates an interactive session, allowing users to select from the following scanning options. These are the available options:
  • Remote scanning: To specify a remote machine for scanning, kube-hunter should be started with the --remote option.
  • Interface scanning: When selecting interface scanning, the --interface option must be specified; this option will scan all network interfaces on the localhost.
  • Network scanning: A specific CIDR can be selected for scanning using the --cidr option.
Figure 18 illustrates the output of kube-hunter with the scanning option enabled. Upon selection, the tool automatically initiates a scan for vulnerabilities on the specified IP address:
The identified vulnerability is KHV002, disclosure of the Kubernetes version. An attacker can use the exposed version to target the environment with vulnerabilities newly identified for that specific release. This information may have been acquired from the Kubernetes API endpoint or the kubelet's debug endpoint. To mitigate this vulnerability, the kubelet flag --enable-debugging-handlers should be disabled [37].
Kube-hunter is an essential tool for detecting security vulnerabilities in Kubernetes clusters. It assists security professionals and system administrators in evaluating the security posture of their Kubernetes environments through comprehensive automated security scans. By simulating potential attack vectors, Kube-hunter identifies misconfigurations, obsolete components, and other vulnerabilities that attackers could exploit. It can be crucial in securing Kubernetes infrastructure but must be incorporated into a comprehensive security framework for optimal defense against contemporary threats.
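In addition to the remote scan shown above, kube-hunter can be launched inside the cluster to simulate an attacker who has already gained pod-level access. The following Job manifest is a minimal sketch based on the project's published examples; the image tag and report format should be checked against the current release.

```yaml
# Illustrative in-cluster kube-hunter Job ("pod" scanning mode).
apiVersion: batch/v1
kind: Job
metadata:
  name: kube-hunter
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: kube-hunter
          image: aquasec/kube-hunter
          command: ["kube-hunter"]
          args: ["--pod", "--report", "json"]   # scan from inside the pod network
```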
Compliance assessment can be performed by using kube-bench. Kube-bench is an open-source utility that evaluates a Kubernetes cluster’s configuration following the CIS (Center for Internet Security) Kubernetes Benchmark, a compilation of best practices designed to enhance the security of Kubernetes deployments.
It automates the auditing of Kubernetes clusters, evaluating compliance with the CIS Kubernetes Benchmark and generating a comprehensive report on security configurations. The benchmark encompasses multiple Kubernetes components, such as the API server, Kubelet, controller manager, and scheduler, guaranteeing secure configuration for each element. The outcomes of the Kube-bench assessment can be categorized into the following segments: Master Node Security Configuration (including Master Node Configuration Files, API Server, Controller Manager, Scheduler), Etcd Node Configuration, among others. Kube-bench starts with assessing master node security compliance, as shown in Figure 19:
The full summary of these scans is presented in Figure 20:
The tool additionally offers remediation guidance for test results categorized as "Warning" and "Failed". Several of the recommended fixes were already described in the previous section, which substantiates the fundamental issue this paper addresses: newly deployed Kubernetes clusters are not secured by default. Implementing robust measures is therefore crucial to achieve confidentiality, integrity, and availability and to strengthen the cluster's security posture. Figure 21 below illustrates remediation proposals for the master node:
Kube-bench is a crucial instrument for verifying the security and compliance of Kubernetes clusters. Kube-bench conducts automated security assessments based on the CIS Kubernetes Benchmark, offering critical insights into cluster configuration, aiding in identifying vulnerabilities, risk mitigation, and preserving a secure environment. It is an essential tool for Kubernetes administrators seeking to ensure compliance, bolster cluster security, or adhere to best practices in securing their environment.
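For completeness, the following condensed Job manifest sketches how kube-bench can be executed directly on a node. The official manifests mount additional host paths (for example, kubelet and PKI directories) and pin a specific image version, so this should be treated as an illustrative starting point rather than a production configuration.

```yaml
# Condensed kube-bench Job; results are read with: kubectl logs job/kube-bench
apiVersion: batch/v1
kind: Job
metadata:
  name: kube-bench
spec:
  template:
    spec:
      hostPID: true                  # required to inspect host processes
      restartPolicy: Never
      containers:
        - name: kube-bench
          image: aquasec/kube-bench:latest
          command: ["kube-bench"]
          volumeMounts:
            - name: etc-kubernetes
              mountPath: /etc/kubernetes
              readOnly: true
            - name: var-lib-kubelet
              mountPath: /var/lib/kubelet
              readOnly: true
      volumes:
        - name: etc-kubernetes
          hostPath:
            path: /etc/kubernetes
        - name: var-lib-kubelet
          hostPath:
            path: /var/lib/kubelet
```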

5.4. Use Case: Hardening a Local Kubernetes Cluster

A local Kubernetes cluster was established using kubeadm on two virtual machines to illustrate the practical application and efficacy of the suggested hardening techniques: one serving as the control plane (k8s-master) and the other as a worker node (k8s-worker). The cluster ran Kubernetes version 1.31.3 and was deliberately initialized with default configurations to evaluate its initial security posture, which was subsequently improved through incremental hardening measures.

5.4.1. Baseline Vulnerability Assessment Using Trivy

The base image, ubuntu:20.04, was initially analyzed with Trivy. The preliminary scan identified 31 vulnerabilities, comprising six medium-severity issues and no serious vulnerabilities, as indicated in the earlier Trivy result (Figure 6). Although no serious vulnerabilities were found, components including libxml2 and OpenSSL were flagged as outdated. The Dockerfile was modified and the image rebuilt and re-scanned, reducing the findings to 12 low-severity issues and demonstrating successful remediation.
The image scanning procedure was incorporated into a GitLab CI/CD pipeline, wherein a .gitlab-ci.yml file featured a security stage that performed a Trivy scan with --exit-code 1 for high-severity vulnerabilities. The pipeline automatically failed when a known vulnerable package was reintroduced to the image for testing, hence preventing the image from being pushed to the registry or deployed. That confirmed the efficacy of integrating security earlier in the deployment lifecycle.
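The following .gitlab-ci.yml fragment is a minimal sketch of the security stage described above; the image reference, predefined CI variables, and surrounding build and deploy stages are illustrative, while the second Trivy invocation fails the job (exit code 1) whenever HIGH or CRITICAL findings are present, blocking later stages.

```yaml
# Illustrative GitLab CI security stage with a blocking Trivy scan.
stages:
  - build       # build/push jobs omitted for brevity
  - security
  - deploy

trivy_scan:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    # Report low/medium findings without failing the pipeline.
    - trivy image --exit-code 0 --severity LOW,MEDIUM "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
    # Fail the job (and the pipeline) on high/critical findings.
    - trivy image --exit-code 1 --severity HIGH,CRITICAL "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```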

5.4.2. Compliance Scanning with Kube-Bench

The kube-bench tool was then run on the master node to assess the cluster against the CIS Kubernetes Benchmark for version 1.31.3. The preliminary scan indicated 12 failures and 41 warnings, predominantly concerning API server flags, missing certificate configuration, and lenient RBAC policies. Particular concerns encompassed the following:
  • API server accepting anonymous requests (--anonymous-auth=true);
  • Unsecured etcd communication channels;
  • Kubelet authorization set to AlwaysAllow.
The remedial procedures recommended by kube-bench were implemented as follows:
  • --anonymous-auth=false added to the API server manifest;
  • Mutual TLS was enforced for etcd using custom CA-signed certificates;
  • Kubelet authorization mode changed to Webhook, and the webhook cache TTL for unauthorized requests tightened.
A subsequent scan indicated substantial progress, with all critical problems rectified and outstanding warnings about documentation and manual file permission checks.
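The fragments below sketch what these remediations can look like in practice: an excerpt of the kube-apiserver static Pod manifest with anonymous access disabled and mutual TLS to etcd, followed by the relevant kubelet configuration switching authorization to webhook mode. Certificate paths and TTL values are illustrative, not the exact values used in the test cluster.

```yaml
# Fragment of /etc/kubernetes/manifests/kube-apiserver.yaml after remediation.
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --anonymous-auth=false
        - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
        - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
        - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
---
# Fragment of the kubelet configuration (/var/lib/kubelet/config.yaml).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: "5m"
    cacheUnauthorizedTTL: "30s"
```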

5.4.3. Penetration Testing with Kube-Hunter

Kube-hunter was executed in remote scanning mode to emulate an external attacker, targeting the master node IP. The tool initially reported the following findings:
  • Disclosure of API Server version;
  • Kubelet API accessible on port 10250;
  • Etcd port 2379 is available without firewall restrictions.
Following remedial measures, including the deactivation of debug handlers (--enable-debugging-handlers=false), implementation of host-based firewall regulations, and restriction of network CIDR ranges in the API server manifest, a subsequent scan revealed no publicly accessible sensitive endpoints, thereby validating the mitigation efforts.
This use case illustrates the gradual enhancement of a vulnerable default Kubernetes configuration into a fortified, compliant, and monitored cluster. It emphasizes that incorporating Trivy, kube-bench, and kube-hunter into the operational lifecycle can enhance early detection, remediation, and ongoing assurance of security posture.
Table 2 consolidates the practical implementations covered in the paper, summarizing the hardening of each Kubernetes architectural component and the associated security outcomes. This review improves the scientific foundation of the study by linking specific technical measures to their functional security advantages and emphasizing the breadth of the suggested framework.
This structured mapping illustrates how integrating layered controls across components collectively enhances the cluster’s security posture, resilience, and alignment with compliance requirements.

6. Future Works

Following this paper, we see three promising future research directions: native integration of automated security benchmarking into Kubernetes, ZTA (Zero Trust Architecture) deployments in Kubernetes, and adaptive threat detection using AI-powered behavior analytics.
A promising research avenue entails integrating tools such as kube-bench, kube-hunter, and others directly into Kubernetes distributions for immediate auditing and hardening capabilities. Utilizing configuration management and orchestration tools such as Ansible or Helm to automate these checks could enhance compliance with CIS benchmarks and diminish dependence on manual security audits. Integrating security tools into the CI/CD pipeline and incorporating them into the Kubernetes control plane would allow for ongoing configuration validation and promote the early identification of misconfigurations. That would improve scalability and consistency in multi-cluster or enterprise deployments, especially in regulated settings. Research may investigate the development of operator-based solutions or Kubernetes Admission Controllers that perform automated verifications during workload deployment.
While this paper focuses on architectural and operational hardening techniques, future work should include performance benchmarking, comparative studies of alternative configurations (e.g., Calico vs. Cilium for network policy), and deployment case studies across cloud and hybrid environments. These additions would provide empirical validation and broaden the proposed framework’s academic contribution.
Zero Trust Architecture signifies a transition from perimeter-centric defenses to identity-focused security, rendering it particularly appropriate for dynamic, microservice-oriented platforms like Kubernetes. Future research may explore the practical implementation of ZTA principles within Kubernetes clusters, encompassing identity-aware proxies, mutual TLS among pods, and ongoing authorization enforcement. Kubernetes-native components such as service meshes (e.g., Istio, Linkerd) and Open Policy Agent (OPA) provide an advantageous framework for executing Zero Trust policies in real-time. Research may investigate policy automation and context-aware access control informed by real-time indicators (e.g., pod health, user location, or role). A comprehensive Zero Trust framework could significantly mitigate lateral movement within clusters and fortify the environment against insider threats and supply chain attacks.
Developing AI/ML-driven threat detection models designed explicitly for Kubernetes workloads is an emerging and significant trend. These models could acquire foundational behaviors of containers, services, and users, subsequently identifying anomalies such as privilege escalations, reverse shells, or resource exploitation in real time. Research may concentrate on incorporating such models into monitoring tools such as Prometheus or Falco or integrating them into log pipelines for distributed inference. The transient and fluid characteristics of containerized environments render conventional signature-based detection inadequate, while adaptive behavior modeling may address this deficiency. Moreover, federated learning or edge-based inference may improve privacy and performance in extensive, multi-tenant clusters.

7. Conclusions

The findings presented in this paper underscore a critical reality: Default Kubernetes deployments are inherently insecure. By default, Kubernetes clusters are inadequately hardened, rendering them vulnerable to significant misconfigurations, privilege escalations, and remote exploits. This fundamental vulnerability requires a thorough reevaluation and redesign of Kubernetes’ security defaults and proactive measures from operators. The paper proposed a systematic, sequential methodology integrating architectural insights with specific technical implementations, including control plane protection, node-level hardening, robust authentication, observability, and CVE-based mitigation. When consistently implemented, this methodology substantially enhances the security posture of Kubernetes environments.
Moreover, the suggested research avenues—such as incorporating automated auditing tools like kube-bench, implementing Zero Trust Architecture models, and investigating AI-driven behavior detection—illustrate that Kubernetes security remains an unresolved issue, representing a dynamic and developing domain. Future research in these areas is essential for automating secure configurations, reducing human error, and guaranteeing real-time adaptability to advanced threats. Future research should extend this work by addressing Kubernetes network policies, service mesh configurations, and runtime security controls such as system call filtering and workload isolation.

Author Contributions

Conceptualization: Z.M. and V.D.; Data curation: T.Č.; Formal Analysis: T.Č.; Investigation: T.Č.; Methodology: Z.M. and T.Č.; Project administration: Z.M.; Resources: V.D.; Software: V.D.; Supervision: Z.M.; Validation: Z.M.; Visualization: Z.M.; Writing—original draft: Z.M. and V.D.; Writing—review and editing: Z.M. and V.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Curtis, J.A.; Eisty, N.U. The Kubernetes Security Landscape: AI-Driven Insights from Developer Discussions. arXiv 2024, arXiv:2409.04647. [Google Scholar] [CrossRef]
  2. Rahman, A.; Shamim, S.I.; Bose, D.B.; Pandita, R. Security Misconfigurations in Open Source Kubernetes Manifests: An Empirical Study. ACM Trans. Softw. Eng. Methodol. 2023, 32, 1–36. [Google Scholar] [CrossRef]
  3. Wang, H.; Zhang, G.; Wang, D.; Deng, J. KubeRM: A Distributed Rule-Based Security Management System in Cloud Native Environment. In Proceedings of the International Conference on Cloud Computing, Internet of Things, and Computer Applications (CICA 2022), Luoyang, China, 28 July 2022; p. 128. [Google Scholar] [CrossRef]
  4. Bose, D.B.; Rahman, A.; Shamim, S.I. ‘Under-Reported’ Security Defects in Kubernetes Manifests. In Proceedings of the 2021 IEEE/ACM 2nd International Workshop on Engineering and Cybersecurity of Critical Systems (EnCyCriS), Madrid, Spain, 3–4 June 2021; pp. 9–12. [Google Scholar] [CrossRef]
  5. Kutsa, D. Fortifying Multi-Cloud Kubernetes: Security Strategies for the Modern Enterprise. World J. Adv. Res. Rev. 2024, 23, 2719–2724. [CrossRef]
  6. Kampa, S. Navigating the Landscape of Kubernetes Security Threats and Challenges. J. Knowl. Learn. Sci. Technol. 2024, 3, 274–281. [Google Scholar] [CrossRef]
  7. Doriguzzi-Corin, R.; Cretti, S.; Catena, T.; Magnani, S.; Siracusa, D. Towards Application-Aware Provisioning of Security Services with Kubernetes. In Proceedings of the 2022 IEEE 8th International Conference on Network Softwarization (NetSoft) 2022, Milan, Italy, 27 June 2022–1 July 2022; pp. 284–286. [Google Scholar] [CrossRef]
  8. Gunathilake, K.; Ekanayake, I. K8s Pro Sentinel: Extend Secret Security in Kubernetes Cluster. In Proceedings of the 2024 9th International Conference on Information Technology Research (ICITR) 2024, Colombo, Sri Lanka, 5–6 December 2024; pp. 1–5. [Google Scholar] [CrossRef]
  9. Ascensão, P.; Neto, L.F.; Velasquez, K.; Abreu, D.P. Assessing Kubernetes Distributions: A Comparative Study. In Proceedings of the 2024 IEEE 22nd Mediterranean Electrotechnical Conference (MELECON) 2024, Porto, Portugal, 25–27 June 2024; pp. 832–837. [Google Scholar] [CrossRef]
  10. German, K.; Ponomareva, O. An Overview of Container Security in a Kubernetes Cluster. In Proceedings of the 2023 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT) 2023, Yekaterinburg, Russia, 15–17 May 2023; pp. 283–285. [Google Scholar] [CrossRef]
  11. Zhu, H.; Gehrmann, C. Kub-Sec, an Automatic Kubernetes Cluster AppArmor Profile Generation Engine. In Proceedings of the 2022 14th International Conference on COMmunication Systems & NETworkS (COMSNETS) 2022, Bangalore, India, 4–8 January 2022; pp. 129–137. [Google Scholar] [CrossRef]
  12. Kamieniarz, K.; Mazurczyk, W. A Comparative Study on the Security of Kubernetes Deployments. In Proceedings of the 2024 International Wireless Communications and Mobile Computing (IWCMC) 2024, Ayia Napa, Cyprus, 27–31 May 2024; pp. 718–723. [Google Scholar] [CrossRef]
  13. Aly, A.; Fayez, M.; Al-Qutt, M.; Hamad, A.M. Multi-Class Threat Detection Using Neural Network and Machine Learning Approaches in Kubernetes Environments. In Proceedings of the 2024 6th International Conference on Computing and Informatics (ICCI) 2024, Cairo, Egypt, 6–7 March 2024; pp. 103–108. [Google Scholar] [CrossRef]
  14. Russell, E.; Dev, K. Centralized Defense: Logging and Mitigation of Kubernetes Misconfigurations with Open Source Tools. arXiv 2024, arXiv:2408.03714. [Google Scholar] [CrossRef]
  15. Exploring Security Challenges and Solutions in Kubernetes: A Comprehensive Survey of Challenges and State-of-the-Art Approaches. Int. J. Adv. Res. Sci. Commun. Technol. 2024, 4, 34–38. [CrossRef]
  16. Surantha, N.; Ivan, F. Secure Kubernetes Networking Design Based on Zero Trust Model: A Case Study of Financial Service Enterprise in Indonesia. Adv. Intell. Syst. Comput. 2019, 348–361. [Google Scholar] [CrossRef]
  17. Molleti, R. Highly Scalable and Secure Kubernetes Multi Tenancy Architecture for Fintech. J. Eng. App. Sci. Technol. 2022, 4, 1–5. [Google Scholar] [CrossRef]
  18. Surantha, N.; Ivan, F.; Chandra, R. A Case Analysis for Kubernetes Network Security of Financial Service Industry in Indonesia Using Zero Trust Model. Bull. EEI 2023, 12, 3142–3152. [Google Scholar] [CrossRef]
  19. Chinnam, S.K. Enhancing Patient Care Through Kubernetes-Powered Healthcare Data Management. Int. J. Res. Appl. Sci. Eng. Technol. 2024, 12, 859–865. [Google Scholar] [CrossRef]
  20. Huang, K.; Jumde, P. Learn Kubernetes Security: Securely Orchestrate, Scale, and Manage Your Microservices in Kubernetes Deployments; Packt Publications: Birmingham, UK, 2020; ISBN 978-1839216503. [Google Scholar]
  21. Reddy Chittibala, D. Security in Kubernetes: A Comprehensive Review of Best Practices. Int. J. Sci. Res. 2023, 12, 2966–2970. [Google Scholar] [CrossRef]
  22. Horalek, J.; Urbanik, P.; Sobeslav, V.; Svoboda, T. Proposed Solution for Log Collection and Analysis in Kubernetes Environment. In Proceedings of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, Guilin, China, 22–24 July 2023; pp. 9–22. [Google Scholar] [CrossRef]
  23. Chen, T.; Suo, H.; Xu, W. Design of Log Collection Architecture Based on Cloud Native Technology. In Proceedings of the 2023 4th Information Communication Technologies Conference (ICTC), Nanjing, China, 17–19 May 2023; pp. 311–315. [Google Scholar] [CrossRef]
  24. LISBOA; da Silva, F.G. A Scalable Distributed System Based on Microservices for Collecting Pod Logs from a Kubernetes Cluster. In Proceedings of the Anais do XVIII Congresso Latino-Americano de Software Livre e Tecnologias Abertas (Latinoware 2021), Online, 13–15 October 2021; pp. 122–125. [Google Scholar] [CrossRef]
  25. Lim, S.Y.; Stelea, B.; Han, X.; Pasquier, T. Secure Namespaced Kernel Audit for Containers. In Proceedings of the ACM Symposium on Cloud Computing 2021, Seattle, WA, USA, 1–4 November 2021; pp. 518–532. [Google Scholar] [CrossRef]
  26. Imran, M.; Kuznetsov, V.; Paparrigopoulos, P.; Trigazis, S.; Pfeiffer, A. Evaluation and Implementation of Various Persistent Storage Options for CMSWEB Services in Kubernetes Infrastructure at CERN. J. Phys. Conf. Ser. 2023, 2438, 012035. [Google Scholar] [CrossRef]
  27. Sai, K. Enhanced Visibility for Real-Time Monitoring and Alerting in Kubernetes by Integrating Prometheus, Grafana, Loki, and Alerta. Int. J. Sci. Res. Eng. Manag. 2024, 08, 1–5. [Google Scholar] [CrossRef]
  28. Li, K.; Xiao, X.; Gao, C.; Yu, S.; Tang, X.; Tan, G. Implementation of High-Performance Automated Monitoring Collection Based on Kubernetes. In Proceedings of the 2024 3rd International Conference on Cloud Computing, Big Data Application and Software Engineering (CBASE) 2024, Hangzhou, China, 11–13 October 2024; pp. 838–843. [Google Scholar] [CrossRef]
  29. Mart, O.; Negru, C.; Pop, F.; Castiglione, A. Observability in Kubernetes Cluster: Automatic Anomalies Detection Using Prometheus. In Proceedings of the 2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Yanuca Island, Cuvu, Fiji, 14–16 December 2020; pp. 565–570. [Google Scholar] [CrossRef]
  30. Dimitrov, V. CVE (NVD) Ontology. In Proceedings of the CEUR Workshop Proceedings, Information Systems & Grid Technologies: Fifteenth International Conference ISGT’2022, Sofia, Bulgaria, 27–28 May 2022; pp. 220–227. [Google Scholar]
  31. Johnson, P.; Lagerstrom, R.; Ekstedt, M.; Franke, U. Can the Common Vulnerability Scoring System Be Trusted? A Bayesian Analysis. IEEE Trans. Dependable Secur. Comput. 2018, 15, 1002–1015. [Google Scholar] [CrossRef]
  32. Dong, Y.; Guo, W.; Chen, Y.; Xing, X.; Zhang, Y.; Wang, G. Towards the detection of inconsistencies in public security vulnerability reports. In Proceedings of the 28th USENIX Conference on Security Symposium, Santa Clara, CA, USA, 14–16 August 2019; pp. 869–885. [Google Scholar]
  33. He, Y.; Wang, Y.; Zhu, S.; Wang, W.; Zhang, Y.; Li, Q.; Yu, A. Automatically Identifying CVE Affected Versions With Patches and Developer Logs. IEEE Trans. Dependable Secur. Comput. 2024, 21, 905–919. [Google Scholar] [CrossRef]
  34. Sun, J.; Xing, Z.; Xu, X.; Zhu, L.; Lu, Q. Heterogeneous Vulnerability Report Traceability Recovery by Vulnerability Aspect Matching. In Proceedings of the 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME) 2022, Limassol, Cyprus, 3–7 October 2022; pp. 175–186. [Google Scholar] [CrossRef]
  35. Mytilinakis, P. Attack Methods and Defenses on Kubernetes, 2020. Available online: https://doi.org/10.26267/UNIPI_DIONE/311 (accessed on 21 February 2025).
  36. Zheng, T.; Tang, R.; Chen, X.; Shen, C. KubeFuzzer: Automating RESTful API Vulnerability Detection in Kubernetes. CMC 2024, 81, 1595–1612. [Google Scholar] [CrossRef]
  37. Aqua Kube-Hunter, KHV002—Kubernetes Version Disclosure Issue Description. Available online: https://aquasecurity.github.io/kube-hunter/kb/KHV002.html (accessed on 25 April 2025).
  38. Straesser, M.; Haas, P.; Frank, S.; Hakamian, A.; van Hoorn, A.; Kounev, S. Kubernetes-in-the-Loop: Enriching Microservice Simulation Through Authentic Container Orchestration. In Proceedings of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2024, Hong Kong, China, 9–10 December 2024; pp. 82–98. [Google Scholar] [CrossRef]
Figure 1. Kubernetes architecture, https://commons.wikimedia.org/wiki/File:Kubernetes.png, accessed on 23 April 2025.
Figure 2. Modified kube-apiserver.yaml file.
Figure 3. Restoring etcd state from its snapshot.
Figure 4. Kube-scheduler security measures.
Figure 5. Kubelet security measures, kubelet.yaml file.
Figure 6. Using Trivy to scan the ubuntu:20.04 image.
Figure 7. Using client certificates for authorization.
Figure 8. Using bootstrap tokens.
Figure 9. Kube-api auth proxy configuration.
Figure 10. Initializing the RBAC role (a) and binding the RBAC role (b).
Figure 11. API-server node authorization settings.
Figure 12. Node authorization configuration.
Figure 13. Proposed starting auditing policy.
Figure 14. Standard Fluentd DaemonSet configuration.
Figure 15. Kubernetes dashboard for Prometheus (v3.2.1).
Figure 16. Grafana (v11.6) dashboard for Kubernetes.
Figure 17. Kube-hunter deployment process.
Figure 18. Kube-hunter scanning options.
Figure 19. Kube-bench master node scanning.
Figure 20. Kube-bench full scan results.
Figure 21. Kube-bench remediation proposals.
Table 1. Summary of Kubernetes CVEs and mitigations.

| CVE ID | Affected Component | Vulnerability/Risk | Recommended Mitigation |
|---|---|---|---|
| CVE-2025-1974 | Ingress-NGINX Controller | Unauthenticated remote code execution | Upgrade ingress-nginx; apply network restrictions and RBAC |
| CVE-2025-1098 | Ingress-NGINX Controller | Remote code execution | Update ingress-nginx; enforce network segmentation and review ingress policies |
| CVE-2025-1097 | Ingress-NGINX Controller | Remote code execution via misconfiguration | Patch ingress-nginx; implement configuration validation and least privilege |
| CVE-2025-24514 | Ingress-NGINX Controller | Unauthorized code execution through malicious requests | Upgrade ingress-nginx; monitor ingress logs; sanitize input annotations |
| CVE-2024-9042 | Kubelet on Windows Nodes | Command injection via insecure parameter handling | Upgrade Windows kubelet; restrict access using RBAC and firewalls |
| CVE-2024-7646 | Ingress-NGINX Controller | Privilege escalation via annotation injection | Upgrade to ingress-nginx ≥ v1.11.2; apply OPA/Gatekeeper policies |
| CVE-2024-29990 | Azure Kubernetes Service (AKS) | Privilege escalation within AKS | Apply the latest AKS patches; enforce Azure role and policy restrictions |
| CVE-2024-9486 | Kubernetes Image Builder | Root access exposure in image-building environments | Upgrade the builder; isolate build infrastructure from production workloads |
| CVE-2024-10220 | gitRepo Volume Plugin | Data exfiltration via pod-to-pod repository access | Upgrade Kubernetes; replace gitRepo volumes with init containers for controlled cloning |
Table 2. Summary of security measures by architectural component.

| Kubernetes Component | Applied Security Controls | Security Outcome |
|---|---|---|
| API Server | TLS encryption, client certs, RBAC, token auth, restricting anonymous access | Secure API access, authenticated requests only, reduced attack surface |
| etcd | TLS encryption, certificate authentication, restricted file permissions, snapshot encryption | Confidentiality and integrity of the cluster state and secrets |
| Kubelet | Disabled anonymous access, webhook authorization, certificate-based TLS, restricted permissions | Controlled access to workloads and node-level operations |
| kube-scheduler | TLS encryption, profiling disabled, access restriction via bind-address | Prevention of workload manipulation or unauthorized scheduling |
| Container Images | Trivy scanning, rootless containers, minimal base images, secure build pipeline | Reduced supply chain risk and minimized attack surface of workloads |
| Access Management (IAM) | Client certificates, bootstrap/static tokens, OpenID Connect, RBAC, node authorization | Enforced least-privilege access and contextual identity control |
| Auditing and Logging | Audit policies, Fluentd log collection, Prometheus metrics, alert integration | Detection of misconfigurations, threat visibility, and compliance support |
| CI/CD Integration | Trivy in pipeline, conditional build failure on vulnerability detection | Early detection and prevention of insecure image deployments |
| Compliance Monitoring | kube-bench scanning with CIS benchmarks | Measurable security baseline and identification of hardening gaps |
| Penetration Testing | kube-hunter (active/passive scanning) | Validation of cluster surface exposure and misconfiguration detection |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
