KazHackStan 2025: "Containers Without Right to Escape"

This blog post reviews the talk “Containers without right to escape”, given by a Lead Software Engineer at the KazHackStan 2025 cybersecurity conference in Almaty.

The speaker outlined three main topics and demonstrated each with practical examples:

Why container runtime protection is critical in 2026
The growing importance of securing container runtimes in light of emerging threats and compliance requirements.
Kubernetes security policies: from PSS to Kyverno
Kubernetes' native security controls, from Pod Security Standards (PSS) to modern policy engines like Kyverno.
Container isolation: gVisor and Kata Containers
Exploring isolation technologies designed to reduce the risk of container escapes and strengthen workload security.

Why Runtime Protection Matters

To emphasize the importance of container security, the speaker referred to findings from a Kaspersky report (2025) that examined container and Kubernetes-related incidents across organizations in the past year.

Incident type	Organizations affected
Errors in configuration	34%
Malware in containers	32%
Excessive access rights	32%
Vulnerabilities in container images/containers	29%
Cyber incident during runtime	26%
Attacks via container environments	24%
Unauthorized third-party access to image registry	21%
Confidential information became public	21%
Open-source code unintentionally made public	20%
Failed audit	17%
No container/Kubernetes-related incidents reported	15%

The survey highlighted several recurring problems, with the following issues considered especially critical for container runtimes:

Errors in configuration (34%) – Misconfigured Kubernetes clusters, insecure defaults, or missing restrictions remain the leading cause of runtime incidents.
Malware in containers (32%) – Compromised images or injected malicious code during the build or deployment stage.
Excessive access rights (32%) – Over-privileged service accounts or roles that violate the Principle of Least Privilege (PoLP) and give attackers unnecessary control inside clusters.
Attacks via container environments (24%) – Exploits that target weaknesses in the container runtime itself, potentially enabling lateral movement or container escapes.

Other common issues included vulnerabilities in images and containers (29%), unauthorized access to image registries (21%), exposure of confidential information (21%), and failed audits (17%).

Key takeaway: Although containers are designed to isolate workloads, misconfigurations, weak access controls, and unverified images remain major attack vectors. This is why runtime protection is increasingly critical, especially as compliance standards become stricter leading into 2026.

Vulnerabilities in 2025

One of the simplest yet most striking slides in the presentation showed the number 604.
The speaker explained that this represents the average number of known vulnerabilities found in a single container image in 2025.

This figure illustrates a critical reality: most container images available in registries - whether public or private - ship with outdated packages and unpatched libraries. Even widely used base images often include dozens or hundreds of vulnerabilities.

The number 604 serves as a wake-up call: containerization simplifies deployment, but without proactive security practices, it can also multiply risks.

💡

To see detailed statistics for Docker vulnerabilities, go here.

Kubernetes Security Policies: From PSS to Kyverno

The speaker illustrated this topic with practical scenarios showing how misconfigurations at the container runtime level can create significant risks.

Scenario 1: Running Containers as Root

A DevOps engineer mistakenly pushed a staging container without a proper securityContext defined. By default, the container ran as the root user with privilege escalation enabled.

💡

A securityContext in Kubernetes defines privilege and access control settings for a Pod or its individual containers. It's a crucial feature for hardening your workloads by specifying constraints like which user or group a process runs as (runAsUser, runAsGroup), preventing privilege escalation, and making the root filesystem read-only. By using securityContext, you can enforce the principle of least privilege, significantly reducing the potential damage if a container is compromised.
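
A minimal sketch of a Pod manifest that applies these settings. The pod name, image reference, and numeric IDs are hypothetical placeholders, not values from the talk; the securityContext fields themselves are standard Kubernetes Pod API fields.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app                # hypothetical name
spec:
  securityContext:                  # pod-level defaults for all containers
    runAsNonRoot: true              # refuse to start if the image would run as UID 0
    runAsUser: 10001                # illustrative non-root UID
    runAsGroup: 10001               # illustrative GID
    seccompProfile:
      type: RuntimeDefault          # use the container runtime's default seccomp filter
  containers:
    - name: app
      image: registry.example.com/team/app:1.4.2   # hypothetical pinned image
      securityContext:              # container-level settings
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]             # drop every Linux capability the app does not need
```

Container-level fields override pod-level ones where both exist, so shared defaults can live at the pod level while per-container exceptions stay explicit.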

Key points emphasized:

Never run containers as root. Containers should be assigned only the permissions strictly required to operate.
Follow the Principle of Least Privilege (PoLP): restrict privileges so that even if a vulnerability is exploited, the damage is limited.

Risks of Misconfiguration

If a container running as root contains any vulnerability, it can allow an attacker to:

Escape the container and gain access to the host system.
From the host, pivot into other containers, access sensitive data, or exploit the underlying infrastructure.
Open multiple attack vectors, effectively turning one misconfigured container into a gateway for a broader compromise.

💡

Escaping a container means a process breaks out of its isolated environment and gains unauthorized access to the underlying host operating system or other containers on the same host. This is a severe security breach, typically achieved by exploiting a software vulnerability in the container runtime or the host's kernel, or through a serious misconfiguration, such as mounting sensitive host directories. A successful container escape can allow an attacker to compromise the entire host system and any other workloads it runs.

Pod Security Standards (PSS)

As one solution, the speaker introduced Pod Security Standards (PSS), Kubernetes' native set of pod-level security rules, enforced by the built-in Pod Security Admission controller.

The three PSS levels (Privileged, Baseline, Restricted), referred to as modes in the talk, define progressively stricter rules:

Feature / Mode	Privileged	Baseline	Restricted
Privileged containers	✅	✗	✗
Host networking (hostNet)	✅	✗	✗
Host path volumes	✅	✗	✗
runAsNonRoot	✗	⚠️ Optional	✅
readOnlyRootFilesystem (mounts the container's root file system as read-only)	✗	⚠️ Optional	✅
Dangerous capabilities	✅	✗	✗
Seccomp profile	✗	✗	✅

Restricted mode provides the strongest protection and is recommended for most production workloads.
Baseline mode, as the speaker explained, starts with all restrictive settings similar to Restricted, but administrators can selectively relax restrictions where necessary. This gives teams flexibility to balance security with workload requirements.

When PSS is Sufficient

The speaker highlighted situations where PSS is a practical choice:

When you need a fast, native solution for basic defense.
If you prefer not to install additional CRDs or admission webhooks (required by external tools like Kyverno).
When the restrictions imposed by Restricted mode or selectively relaxed Baseline mode align with your workload requirements.
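
For the "fast, native" case, cluster-wide defaults can be set without any CRDs by configuring the built-in PodSecurity admission plugin. The following is a sketch, assuming Kubernetes v1.25 or later; the chosen levels and the exempted namespace are illustrative and not taken from the talk. The file is passed to the API server through its `--admission-control-config-file` flag.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      defaults:
        enforce: "baseline"          # reject pods that violate Baseline
        enforce-version: "latest"
        audit: "restricted"          # record Restricted violations in the audit log
        audit-version: "latest"
        warn: "restricted"           # show Restricted violations to the user
        warn-version: "latest"
      exemptions:
        usernames: []
        runtimeClasses: []
        namespaces: ["kube-system"]  # illustrative exemption for system components
```

Enforcing Baseline while auditing and warning on Restricted is one way to see what a stricter level would break before switching to it.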

How PSS Protects in Practice

With PSS enabled in Restricted mode, Kubernetes automatically rejects manifests that do not meet key security requirements such as:

runAsNonRoot: true
allowPrivilegeEscalation: false
seccompProfile enabled

This means misconfigured workloads are blocked at the admission stage before being deployed to the cluster, even if they passed code review. In practice, this prevents risky containers from ever running in the environment.
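
As a sketch of how this is switched on, Pod Security Admission reads labels on the namespace. The namespace name below is hypothetical; the label keys are the ones Kubernetes defines for this purpose.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging                                        # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted     # reject non-compliant pods
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted       # also record violations in the audit log
    pod-security.kubernetes.io/warn: restricted        # also warn the user at apply time
```

With these labels in place, a pod in `staging` that omits `runAsNonRoot: true` or leaves `allowPrivilegeEscalation` at its default is rejected by the API server.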

From PSS to Kyverno: Securing Image Usage

Scenario 2: The Risk of Using Public Images and the `latest` Tag

The speaker presented another real-world scenario involving insecure image usage. In a rush, a developer pulled the python:3.11-slim image from Docker Hub using the latest tag. The issue was that the tag had recently been updated, and the new image introduced an environment change along with potentially insecure dependencies.

Two critical mistakes were emphasized:

Using public repositories – Public Docker Hub images may be outdated, compromised, or modified without notice. As a golden rule, organizations should maintain their own private registries with vetted and approved images.
Using the latest tag – The latest tag is mutable and may change over time, breaking compatibility or introducing vulnerabilities. The speaker recalled a case at a bank where one developer's reliance on the latest tag led to two hours of debugging in production: integration tests with Kafka failed after an image update.

Risks Highlighted

The speaker summarized the risks of poor image management:

Delivery of malware through a compromised public registry.
Compatibility issues caused by unpinned images or images with the latest tag, potentially leading to failed builds or broken production deployments.
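
The private-registry rule can be checked at admission time rather than left to convention. Below is a minimal sketch using a Kyverno ClusterPolicy (Kyverno is compared with PSS in the next section). The policy name and the registry host `registry.example.com` are hypothetical placeholders.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries        # hypothetical policy name
spec:
  validationFailureAction: Enforce       # block, rather than only report, violations
  background: true
  rules:
    - name: validate-registries
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must come from registry.example.com."
        pattern:
          spec:
            # =(...) is Kyverno's conditional anchor: checked only if the field exists.
            =(initContainers):
              - image: "registry.example.com/*"
            containers:
              - image: "registry.example.com/*"
```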

Comparing PSS and Kyverno

To address these risks, the talk shifted to a comparison between Pod Security Standards (PSS) and Kyverno.

Feature	PSS (Pod Security Standards)	Kyverno
Installation	Built-in (stable since Kubernetes v1.25)	Requires installation of CRDs
Security levels	Only Privileged, Baseline, Restricted	Customizable policies beyond baseline levels
Flexibility and Customization	Strict, limited to predefined checks	Highly flexible
Works with ConfigMaps, Secrets, etc.	Pods only	Any Kubernetes resource
Automatic mutation/patching	No (reject or log only)	Can automatically fix the manifest
Checks of extra fields	Only a limited set of securityContext fields	Allows checks on any field
Variable substitution from request context	Not supported	Yes, including checks on namespaces and JWTs
Learning/Validation mode	enforce, audit, warn	enforce, audit, warn

💡

Kubernetes enforces the Pod Security Standards (PSS) through admission control, which can be configured per namespace with three modes: enforce, audit, and warn. The mode determines how the API server reacts when a pod specification violates the policy: enforce rejects the pod, audit allows it but records the violation in the audit log, and warn allows it but returns a warning to the user.

Key insight:
While PSS provides a good baseline for security, Kyverno has become the de-facto standard for enterprises in recent years because of its flexibility, ability to enforce organization-wide policies, and support for more than just pods.
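
To illustrate the "can automatically fix the manifest" row of the comparison, here is a sketch of a Kyverno mutate rule that fills in a security default when a pod omits it. The policy and rule names are hypothetical, and this example is not from the talk.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-securitycontext      # hypothetical policy name
spec:
  rules:
    - name: set-run-as-non-root
      match:
        any:
          - resources:
              kinds:
                - Pod
      mutate:
        patchStrategicMerge:
          spec:
            securityContext:
              # +(...) is Kyverno's add-if-not-present anchor:
              # the value is set only when the pod did not specify one.
              +(runAsNonRoot): true
```

PSS would simply reject such a pod; a mutate rule instead patches it on admission, which is a design choice between strictness and convenience.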

Example Kyverno Policy: Disallowing the `latest` Tag

As a practical demonstration, the speaker shared a Kyverno policy designed to prevent the use of container images tagged with latest.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
  annotations:
    policies.kyverno.io/title: Disallow Latest Tag
    policies.kyverno.io/category: Best Practices
    policies.kyverno.io/severity: medium
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/description: >
      Images with the ':latest' tag are not permitted as they can lead to
      unpredictable and insecure deployments. This policy enforces the
      use of explicit image tags.
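spec:
  # Capitalized form; Kyverno documents Enforce and Audit as the accepted values.
  validationFailureAction: Enforce
  background: true
  rules:
    - name: validate-image-tag
      match:
        # The any/all form replaces the older match.resources shorthand.
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Usage of an image with the ':latest' tag is not allowed."
        pattern:
          spec:
            containers:
              # Negated wildcard: reject any image reference ending in :latest.
              - image: "!*:latest"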

With this policy applied, any deployment manifest referencing an image with the :latest tag will be blocked at admission time, ensuring developers cannot accidentally introduce insecure or unstable builds into the cluster.
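
One gap to be aware of: an image reference with no tag at all (for example `nginx`) also resolves to latest, yet it does not match the `*:latest` pattern. Below is a sketch of a companion policy that requires an explicit tag, using the same pattern syntax. The policy and rule names are hypothetical and were not part of the talk.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-image-tag                # hypothetical policy name
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-explicit-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "An explicit image tag is required."
        pattern:
          spec:
            containers:
              # The image reference must contain a colon, i.e. name:tag.
              - image: "*:*"
```

This is a coarse check: a registry host with a port, such as `registry.example.com:5000/app`, also contains a colon and would pass without a tag. Pinning by digest is a stricter alternative.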

Scenario 3: Isolating Containers with gVisor

The third scenario described by the speaker focused on the risks of running user-supplied code inside Kubernetes.

The Scenario: Running Scripts in Containers

The example was a platform similar to Replit or JupyterHub, where users can write and execute Python or JavaScript scripts directly from their browser. Each script runs inside a Kubernetes container.

While convenient, this model carries a serious risk. If an attacker gains control of the container, they can attempt to:

Escape the container into the host system.
Run commands like mount, ptrace, or sh to access the host filesystem.
Explore or compromise other pods running on the same cluster.

In such a breach, one insecure container could give attackers numerous attack vectors into the entire platform. The solution, the speaker stressed, is strong isolation.

gVisor as a Solution

The proposed solution was gVisor, a sandboxed container runtime developed and maintained by Google.

gVisor is an application kernel: it implements much of the Linux system call interface in user space.
It sits between the container and the host kernel, acting as a proxy.
Instead of letting containers interact directly with the host, gVisor intercepts all system calls and processes them in its own controlled environment (the sandbox).

The practical result: even if a malicious script tries to execute dangerous system commands, those calls remain trapped inside gVisor and never reach the host kernel.

The speaker emphasized two key points:

Lightweight – gVisor introduces isolation without the heavy overhead of full virtual machines.
Reliability – being maintained by Google, the project has strong backing, which reduces the risk of it being abandoned.

Internal Architecture of gVisor

The speaker presented the internal components of gVisor, explaining how each contributes to isolation:

System calls from a gVisor container are redirected to the Sentry component (via KVM in the configuration shown), while file I/O is handled by the Gofer component.

Sentry
- The central userspace process that implements core parts of the Linux kernel.
- It receives and handles intercepted syscalls from the container.
- Essentially acts as the “brain” of the sandbox.
Gofer
- A helper process responsible for handling filesystem access.
- Separates file I/O from syscall handling, which increases security by limiting direct access to the host’s filesystem.
Ptrace Mode
- One way gVisor intercepts syscalls is through Linux’s ptrace mechanism.
- This provides compatibility but comes with higher performance overhead.
KVM Mode
- An alternative execution mode where gVisor uses the Kernel-based Virtual Machine (KVM) for running the sandbox.
- This mode reduces overhead and integrates better in environments that already use KVM.
- The speaker noted this as particularly beneficial for enterprises, as it combines efficiency with stronger isolation.

Outcome

By deploying each container inside a gVisor sandbox:

Syscalls are intercepted and processed in userspace.
Potentially malicious commands never reach the host kernel.
Even if an attacker injects harmful code into a container, its impact is confined and cannot spread across the platform.

The key message was that sandboxed runtimes like gVisor significantly reduce the risk of container escape attacks in multi-tenant environments.
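
In Kubernetes, a sandboxed runtime is typically selected per pod through a RuntimeClass. Below is a minimal sketch, assuming the nodes' container runtime is already configured with a gVisor handler named `runsc` (the name of gVisor's runtime binary). The pod name and image are hypothetical.

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc              # assumption: must match the handler name configured on the nodes
---
apiVersion: v1
kind: Pod
metadata:
  name: user-script-runner  # hypothetical pod that executes user-supplied code
spec:
  runtimeClassName: gvisor  # run this pod inside a gVisor sandbox
  containers:
    - name: sandbox
      image: registry.example.com/sandbox/python-runner:3.11.9   # hypothetical image
```

Pods without `runtimeClassName` keep using the cluster's default runtime, so only the untrusted workloads pay the sandboxing cost.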

Scenario 4: Isolating Sensitive Workloads with Kata Containers

After discussing gVisor, the speaker shifted focus to another sandboxing approach: Kata Containers.

Scenario: Processing Healthcare Data

The example scenario involved a healthcare service where images are processed and ML inference is performed through microservices. The backend, deployed in Kubernetes, stores personally identifiable information (PII) of patients.

The risks are familiar:

An attacker compromising a container could attempt to run commands such as mount, ptrace, or sh.
Successful container escape could expose the host filesystem or other pods.

Because this environment handles sensitive patient data, the security requirements are significantly higher.

Kata Containers Explained

The solution proposed was Kata Containers, an open-source project maintained by the OpenInfra Foundation (formerly the OpenStack Foundation).

Core idea:

Kata Containers combine the ease of use of containers with the rigid isolation of virtual machines.
Each container (or pod) is run inside a lightweight VM.
Every pod receives:
- Its own kernel
- Its own isolated memory space
- Hypervisor-level separation from the host and other pods

This design ensures that even if a pod is compromised, it cannot break out to the host or neighboring workloads.

Solution in Practice

For workloads that involve sensitive data, the speaker recommended:

Wrapping sensitive pods into Kata Containers.
Ensuring that each pod runs in its own lightweight VM, isolated at the kernel and memory levels.

Result:

Even if a pod is compromised, attackers remain trapped within the VM boundaries.
According to the speaker, this approach supports compliance with regulations such as GDPR and HIPAA, which demand strong safeguards for sensitive data.
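
Wrapping a pod in Kata Containers uses the same RuntimeClass mechanism. The following is a sketch, assuming Kata is installed on the nodes and exposed under a handler named `kata`; actual handler names depend on the installation. The overhead values, pod name, and image are illustrative placeholders.

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata                   # assumption: handler name varies by installation
overhead:
  podFixed:                     # illustrative: resources reserved for the per-pod VM
    memory: "160Mi"
    cpu: "250m"
---
apiVersion: v1
kind: Pod
metadata:
  name: patient-image-inference # hypothetical sensitive workload
spec:
  runtimeClassName: kata        # run this pod inside its own lightweight VM
  containers:
    - name: inference
      image: registry.example.com/health/inference:2.3.0   # hypothetical image
```

Declaring `overhead` lets the scheduler account for the VM's own resource use, which is part of the cost the talk attributes to Kata.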

gVisor vs Kata Containers

The speaker then compared gVisor and Kata Containers, outlining where each fits best.

Feature / Aspect	gVisor (User-Space Kernel)	Kata Containers (Lightweight VMs)
Isolation model	User-space kernel (Sentry intercepts syscalls)	Hypervisor-based isolation (KVM/QEMU/Firecracker)
Isolation from the host	High	Very High
Syscall handling	Trapped in userspace (sandboxed by Sentry)	Executed inside VM with its own kernel
POSIX API support	Partial	Full
GPU support	Limited	Native GPU passthrough supported
Performance	Better than VMs, lower than native runtime	Better than gVisor
Startup time	Fast	Slower compared to gVisor
Compatibility with images	May require additional adjustments	High
Resource consumption	Low	Medium to high

Guidance from the Speaker

Use Kata Containers when the environment demands rigid, hypervisor-level security (e.g., healthcare, finance, government workloads with PII).
Use gVisor when you need a lightweight sandbox that improves security without the overhead of virtual machines; it suits startups or less critical workloads.
Both solutions can coexist; the choice depends on the organization's trade-off between security and performance.

Q&A Session Highlights

The final part of the talk featured questions from the audience, covering practical concerns about policy engines, runtime isolation tools, and migration challenges.

Q1. Do you use OPA Gatekeeper?

Speaker: Yes, we use Gatekeeper and it remains a useful tool. In our case, we apply Gatekeeper as a CRD resource to validate JWT tokens and services. However, if asked to choose, I recommend Kyverno - it is more flexible, widely adopted, and actively maintained in the industry.

Q2. Can PSS in Restricted Mode control execution of `exec` health probes in Kubernetes?

Speaker: My assumption is no. PSS operates at the pod level and cannot restrict syscalls. Also, I do not believe root permissions are required for health checks.
Note: From a best-practice perspective, exec probes are discouraged in production. They can create cascading failures and are security-sensitive, so readiness and liveness probes should rely on HTTP or TCP instead. For more information, see Exec Probes: A Story about Experiment and Relevant Problems.
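
A sketch of probes that avoid exec, assuming the application serves a health endpoint over HTTP. The pod name, image, port, path, and timings are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-server              # hypothetical name
spec:
  containers:
    - name: api
      image: registry.example.com/team/api:2.0.1   # hypothetical image
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:                # the kubelet sends an HTTP GET; no process is spawned in the container
          path: /healthz        # assumption: the app exposes this endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10
      readinessProbe:
        tcpSocket:              # succeeds once the port accepts connections
          port: 8080
        periodSeconds: 5
```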

Q3. Which is harder to migrate to: gVisor or Kata Containers?

Speaker: I cannot provide a definitive answer. In our case, the security department led the integration. There are many configuration details involved. We use a central repository with a large YAML manifest to configure and deploy Kata containers.

Q4. Is it as complex as SELinux?

Speaker: No. It is not as complicated. In our setup, one manifest file configures and deploys Kata containers into the Kubernetes cluster.

Q5. Why is gVisor considered better for startups?

Speaker: gVisor works as a proxy, intercepting syscalls without deploying a VM for each pod. Kata Containers, by contrast, create a VM per pod, which consumes more resources. For startups with limited budgets, gVisor is often the better choice.

Q6. Are there still risks after implementing gVisor or Kata?

Speaker: Security is never absolute. Even with isolation technologies in place, attackers may exploit other vectors - such as stolen developer credentials.

Q7. Historical comparison: FreeBSD Jails, Solaris Zones, and Linux Containers

Audience question: FreeBSD had jails, Solaris had zones, and Linux developed containers. Does Kata simply bring us back to Linux containers?
Speaker: Yes, in principle. Kata re-introduces strong isolation by running containers inside virtual machines. The main difference is that in Kubernetes, these workloads are now virtualized to provide stronger boundaries.

FreeBSD Jails and Solaris Zones were early container-like technologies that allowed running applications in isolated environments.
Linux Containers (LXC) followed with similar concepts.
Kata Containers extend this by combining the container model with VM isolation, giving each pod its own kernel and memory space.

Q8. How does Kata perform with heavy workloads like LLMs?

Speaker: In practice, we used Kata Containers for simple workloads (e.g., Ubuntu or OpenBSD images). For performance-heavy workloads such as machine learning inference, users report challenges in forums. These include errors during installation and runtime limitations. Kata is well-suited for industries like fintech and telecom with relatively straightforward workloads, but it can struggle with more complex, performance-intensive applications.

Key Takeaways from the Talk

The session “Containers without right to escape” delivered a practical overview of modern container security challenges and solutions. The main lessons are:

Runtime protection is critical
- Misconfigurations, excessive privileges, and insecure defaults remain the leading causes of container incidents.
- The average container image contains hundreds of known vulnerabilities, making runtime defenses essential.
Avoid risky practices
- Do not run containers as root; always enforce the Principle of Least Privilege (PoLP).
- Never use the latest tag for container images. Always pin explicit versions and rely on trusted private registries.
Use Kubernetes-native security policies
- Pod Security Standards (PSS) provide a baseline defense with Privileged, Baseline, and Restricted modes.
- PSS is fast and built-in but limited in scope.
- Kyverno offers more flexibility, supports custom policies, and is now the de-facto standard for enterprises.
Leverage sandbox runtimes for stronger isolation
- gVisor intercepts syscalls and processes them in a userspace sandbox, offering lightweight protection with minimal overhead.
- Kata Containers isolate workloads at the hypervisor level by running each pod in its own lightweight VM, aligning with strict standards such as GDPR and HIPAA.
Choose the right tool for your context
- gVisor is lightweight and resource-efficient, well-suited for startups or less sensitive workloads.
- Kata Containers provide stronger isolation but come with higher overhead and are best for industries requiring maximum protection (e.g., healthcare, finance).
Security is never absolute
- Even with PSS, Kyverno, gVisor, or Kata, attackers may exploit other vectors such as developer credentials. Defense-in-depth and continuous monitoring remain essential.
Practical Q&A insights
- Gatekeeper is still in use, but Kyverno is generally preferred.
- Exec probes in Kubernetes are discouraged, as they introduce both operational and security risks.
- Migration to gVisor or Kata involves fine-tuning but is not as complex as SELinux.
