KubeCon + CloudNativeCon Europe 2025

Does your organization build firmware for hardware devices on Kubernetes? Do you still test firmware on hardware manually? Jumpstarter, an open-source project started by Red Hat, connects your software factory to your hardware, modernizing embedded software development. Developed in collaboration with a leading automotive manufacturer, Jumpstarter bridges the gap between embedded and cloud-native workflows.

This session demonstrates how to automate software testing on physical devices within Kubernetes using Tekton Pipelines and GitLab, leasing devices for tasks like flashing firmware, booting, and interfacing through serial, CAN bus, audio, and video. Eclipse Che will also be showcased for developing and debugging tests.

The presentation will include a live demo and will share deployment instructions, workflow examples, and real-world use cases from Red Hat and other community projects.

Speakers

Miguel Angel Ajo Pelayo

Senior Principal Software Engineer, Red Hat

Miguel has been an upstream contributor to open-source projects throughout his career at Red Hat. He has always been interested in hardware and the low-level details of how technology works. Before joining Red Hat, he ran a small consulting startup that developed embedded systems...

Thursday April 3, 2025 11:00 - 11:30 BST
Level 1 | Hall Entrance S10 | Room D

Content Experience Level Intermediate

11:45 BST

Beyond the Limits: Scaling Kubernetes Controllers Horizontally - Tim Ebert, STACKIT

Thursday April 3, 2025 11:45 - 12:15 BST

Do your Kubernetes controllers struggle to keep up with the demands of your growing infrastructure? As your clusters scale, traditional controller setups face increasing challenges, leading to slow reconciliation times, impacting application performance and overall cluster stability.

This session introduces sharding for Kubernetes controllers as a groundbreaking solution. By horizontally scaling controller workloads across multiple instances, it significantly improves scalability and addresses the inherent limitations of traditional leader election mechanisms.

In this session, we'll dive deep into the technical details of applying proven sharding mechanisms from distributed databases to effectively partition controller workloads. We'll explore the underlying concepts and how to implement sharding in your own Kubernetes controllers.
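The abstract does not say which sharding mechanism the talk uses. As one illustration, the sketch below assumes consistent hashing, a technique common in distributed databases, to assign objects to controller instances. The instance names, object keys, and the `HashRing` class are hypothetical and not taken from the talk or any Kubernetes library.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each shard owns several virtual points."""

    def __init__(self, shards, vnodes=100):
        # Place `vnodes` points per shard on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{shard}-{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, object_key: str) -> str:
        # The owner is the first ring point at or after the object's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect(self._points, _hash(object_key)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical object keys in "namespace/name" form.
objects = [f"default/app-{n}" for n in range(1000)]

before = HashRing(["controller-0", "controller-1", "controller-2"])
after = HashRing(["controller-0", "controller-1", "controller-2", "controller-3"])

# Count how many objects change owner when a fourth instance joins.
moved = sum(before.shard_for(o) != after.shard_for(o) for o in objects)
print(f"{moved} of {len(objects)} objects change owner when a fourth instance joins")
```

With this scheme, adding an instance only moves objects onto the new instance. Objects that stay put keep their previous owner, which limits how much state has to be handed over.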

Join us to learn how to overcome the scalability challenges of your Kubernetes controllers and unlock the full potential of your infrastructure.

Speakers

Tim Ebert

Cloud Engineer, STACKIT

Tim loves designing, developing, and operating cloud native systems at STACKIT. He is knee-deep in managing infrastructure and Kubernetes clusters themselves using Kubernetes operators. Tim is a core developer of Gardener, an open source project for managing Kubernetes clusters at...

Thursday April 3, 2025 11:45 - 12:15 BST
Level 1 | Hall Entrance S10 | Room D

Content Experience Level Intermediate

14:15 BST

eBPF and Wasm: Unifying Userspace Extensions With Bpftime - Yusheng Zheng, eunomia-bpf

Thursday April 3, 2025 14:15 - 14:45 BST

In cloud-native systems, extending and customizing applications is key to improving development, deployment, and observability. eBPF is powerful for kernel-level enhancements, and WebAssembly brings extension to userspace. Yet, both face challenges when userspace extensions need to interact deeply with host applications. eBPF's kernel-focused design struggles in diverse userspace environments, and Wasm’s sandboxing introduces overhead and complexity due to extra checks and data copying. Enter bpftime, a framework that extends eBPF’s capabilities into userspace. Using dynamic binary instrumentation, bytecode verification, and hardware isolation, bpftime allows secure, high-performance extensions without the overhead of Wasm’s sandboxing. This talk explores how bpftime works with the eBPF Interface to simplify userspace extensions, compares the evolution of eBPF and Wasm, and shows how bpftime can power observability, networking, and other cloud-native extensions.

Speakers

Yusheng Zheng

OSS maintainer, eunomia-bpf

Yusheng Zheng is an open-source maintainer and researcher focused on improving complex systems through comprehensive understanding and strategic, small-scale modifications. As the co-founder of the eunomia-bpf open-source community and a PhD student, Yusheng is at the forefront of...

Thursday April 3, 2025 14:15 - 14:45 BST
Level 1 | Hall Entrance S10 | Room D

Content Experience Level Advanced

15:00 BST

Dynamic Multi-Cluster Controllers With Controller-runtime - Marvin Beckers, Kubermatic & Stefan Schimanski, Upbound

Thursday April 3, 2025 15:00 - 15:30 BST

controller-runtime is the most popular SDK to write controllers for individual Kubernetes clusters. But the Kubernetes landscape is changing quickly: multi-cluster is becoming ubiquitous (e.g. through Cluster API), with clusters joining and leaving dynamically. controller-runtime has had no direct support, making writing uniform multi-cluster controllers hard and fracturing the emerging ecosystem.

This talk explores how to build controllers that reconcile resources across a dynamic fleet of Kubernetes clusters. A key change is the ability to plug in a dynamic cluster provider that registers new Kubernetes clusters from a specific source. While implementation internals are briefly discussed, focus is on a hands-on walkthrough for writing your own cluster provider, event handlers and reconciler functions.

We discuss a simplistic cluster provider implementation for “kind” clusters as an example and extrapolate from that what more complex providers could look like (e.g. for CAPI or kcp).

Speakers

Stefan Schimanski

Senior Principal Engineer, Upbound

Stefan is a Senior Principal Engineer at Upbound working on control planes, Kubernetes, and kcp, and serving as a tech lead in SIG API Machinery. He contributed a major part of the CRD feature set. Stefan is a second-time Google Summer of Code mentor with CNCF, loves to teach and help people to learn...

Marvin Beckers

Team Lead, Kubermatic

Marvin started out as a sysadmin, gradually turned into a software engineer and now works as a Software Engineering Team Lead at Kubermatic. He has always had a passion for effective management of large server fleets, which turned his attention to Kubernetes in 2018. He has been...

Thursday April 3, 2025 15:00 - 15:30 BST
Level 1 | Hall Entrance S10 | Room D

Content Experience Level Intermediate

16:00 BST

Get WITty: Evolving Kubernetes Scheduling With the WebAssembly Component Model - Dejan Pejchev & Jonathan Giannuzzi, G-Research

Thursday April 3, 2025 16:00 - 16:30 BST

At KubeCon NA 2024, we introduced WASM + KWOK Wizardry: Writing and Testing Kubernetes Scheduler Plugins at Scale, showcasing how WASM plugins transform Kubernetes scheduling. This session continues the story, highlighting our progress toward a language-agnostic framework using the WebAssembly Component Model.

The current Go-centric WASM plugin SDK restricts innovation to a single language. By adopting the Component Model, we enable developers to write plugins in Rust, Python, JavaScript, and more, unlocking new possibilities. This approach enhances modularity, simplifies integration with standardized interfaces, and strengthens security through improved isolation.

We’ll also showcase how this aligns with the Kubernetes Scheduler Simulator, providing a powerful testing environment for these advanced plugins. Join us to see how the Component Model fosters collaboration, innovation, and extensibility in Kubernetes scheduling. Let’s move beyond wizardry and get truly WITty!

Speakers

Dejan Zele Pejchev

Open Source Software Engineer, G-Research

Dejan is a seasoned Software Engineer with over 8 years of experience building and scaling distributed systems and an advocate of open source & Kubernetes-native solutions. Dejan is also a maintainer of Armada, the Kubernetes multi-cluster batch scheduling tool, Testkube, the Kubernetes-native...

Jonathan Giannuzzi

Open Source Evangelist, G-Research

Jonathan is an Open Source Evangelist at G-Research, where he applies his nerdy wizardry powers to solve deep problems that can bubble up all the way to the end-user.

Thursday April 3, 2025 16:00 - 16:30 BST
Level 1 | Hall Entrance S10 | Room D

Content Experience Level Advanced

16:45 BST

GPU Sharing at CERN: Cutting the Cake Without Losing a Slice! - Diana Gaponcic, CERN

Thursday April 3, 2025 16:45 - 17:15 BST

GPUs and accelerators are changing traditional High Energy Physics (HEP) deployments while also being the key to enabling efficient machine learning. However, their high cost and increasing demand oblige service managers to look into ways to maximize hardware utilization through sharing. While the existing methods are flexible and easy to use, complex use cases still require building custom components on top of the existing device plugin API.

This talk explores the new, exciting way of allocating and sharing GPUs - using Dynamic Resource Allocation (DRA). We go over the multiple options for GPU scheduling: time sharing, MPS, and MIG. We cover the features and limitations of each option and present extensive benchmark results that helped us assign each of our ML and scientific workloads to the most appropriate layout. Finally, we describe how managing GPUs in a centralized way improves resource utilization across interactive and batch workloads while optimizing costs in the long run.

Speakers

Diana Gaponcic

Computing Engineer, CERN

Diana is a Computing Engineer in the CERN IT department. After an internship at CERN focusing on containerization of ETL applications, she joined the Kubernetes team, working on the GitOps and monitoring infrastructure. Her current focus is on optimizing the usage of GPUs and...

Thursday April 3, 2025 16:45 - 17:15 BST
Level 1 | Hall Entrance S10 | Room D

Content Experience Level Intermediate

17:30 BST

Image Snapshotters for Efficient Container Execution in Particle Physics - Clemens Lange, Paul Scherrer Institute & Valentin Volkl, CERN

Thursday April 3, 2025 17:30 - 18:00 BST

In particle physics, compute-intensive workloads often involve thousands of "embarrassingly parallel" jobs relying on multi-gigabyte container images. A large fraction of these workloads is executed using software containers. Efficient execution across large-scale computing environments demands advanced caching and image loading techniques to prevent network saturation and reduce startup times. Leveraging the industry-standard containerd runtime, we evaluate snapshotter plugins such as CVMFS (a CERN-developed distributed file system for large-scale software distribution), SOCI, and Stargz, which use "lazy" image loading to optimise performance. This talk includes an analysis of metrics such as container startup time and image data downloaded, alongside usability evaluations in a research environment. We demonstrate how these tools enhance the reusability and reproducibility of physics analyses, with insights relevant to broader high-performance computing scenarios.

Speakers

Clemens Lange

Research Physicist, Paul Scherrer Institute

Clemens is a particle physicist at Switzerland’s Paul Scherrer Institute, where he contributes to the CMS experiment at CERN’s Large Hadron Collider. He focusses on Higgs boson analysis, the development of new particle detectors, and is passionate about computing and open science...

Valentin Volkl

Systems Software Engineer, CERN

Valentin is a physicist and staff software engineer at CERN. In the past he has worked on software and simulations for the next generation of particle colliders. Since 2023 he has been lead developer for the CernVM-FileSystem (CVMFS) that is used to distribute software for users in science...

Thursday April 3, 2025 17:30 - 18:00 BST
Level 1 | Hall Entrance S10 | Room D

Content Experience Level Intermediate

11:00 BST

Quantum-Ready Kubernetes: How Do We Get There? - Nikhita Raghunath & Natalie Fisher, Broadcom; Paul Schweigert, IBM; Ricardo Rocha, CERN; Tomas Gustavsson, Keyfactor

Friday April 4, 2025 11:00 - 11:30 BST

As AI continues to evolve, quantum computing is poised to disrupt Kubernetes in ways we can’t ignore. By 2035, the US government will only procure quantum-safe solutions, and if our infrastructure isn’t ready soon we’ll be scrambling to catch up.

This panel brings together experts to explore:
- What quantum computing is & why it’s a game changer
- How to orchestrate quantum workloads on Kubernetes
- Middleware needed to bridge classical and quantum resources
- Redesigning infrastructure to meet NIST’s quantum-safe standards with an agile long-term strategy
- Building infrastructure for real-world use cases like scientific simulations
- How quantum machine learning can help run AI workloads

You don’t need to be a quantum expert to join! You’ll walk away with actionable insights on architectural trade-offs for running quantum workloads and learn how to implement quantum-safe security. This is your chance to spark fresh ideas & take the lead in shaping the next decade of technology!

Speakers

Ricardo Rocha

Computing Engineer, CERN

Ricardo leads the Platform Infrastructure team at CERN with a strong focus on cloud native deployments and machine learning. For several years he has led the internal effort to transition services and workloads to use cloud native technologies, as well as dissemination and training...

Nikhita Raghunath

Principal Engineer, Broadcom

Nikhita is a Principal Engineer at Broadcom, past co-chair of KubeCon and a maintainer of the Kubernetes project. She is the vice chair of the CNCF Technical Oversight Committee and won the CNCF Top Committer Award in 2021 for her technical contributions. She was also a member...

Paul Schweigert

Senior Software Engineer, IBM

Paul Schweigert works on quantum and AI technologies at IBM. He has extensive experience in open source (Knative and Kubernetes in particular) and has spoken at numerous conferences. He has also led various platform engineering and data science teams. In a previous life, he studied...

Tomas Gustavsson

Chief PKI Officer, Keyfactor

Tomas Gustavsson is the chief public key infrastructure (PKI) officer at Keyfactor. He pioneered open source public key infrastructure with EJBCA, which now sees over 3,000 downloads per month. With a background in computer science, Tomas established EJBCA to fortify trusted digital...

Natalie Fisher

Technology Product Manager, Broadcom

Natalie is a Technology Product Manager at VCF. A lifelong learner, she’s always been fascinated with emerging technology and the endless possibilities and solutions one could dream up. Having spent many years in product and working in companies ranging from e-Commerce, Data Analytics...

Friday April 4, 2025 11:00 - 11:30 BST
Level 1 | Hall Entrance S10 | Room B

Content Experience Level Intermediate

11:45 BST

Testing AI Containers for Digital Twins in Science: A Cloud-HPC Workflow - Matteo Bunino, CERN & Diego Ciangottini, INFN

Friday April 4, 2025 11:45 - 12:15 BST

CERN is advancing the development of AI-based digital twins in science through projects like interTwin, an EC-funded project to develop a digital twin engine for science. These digital twins rely on HPC resources for training multi-node, multi-GPU models using containerized workflows.

Developing such containers for HPC systems presents unique challenges, including accessing restricted HPC resources and integrating with HPC software stacks, while ensuring the interoperability between different container runtimes.

We introduce a CI/CD workflow that bridges cloud and HPC and enables automated testing of AI/ML containers on the same SLURM-managed clusters where they will be deployed. By integrating Dagger’s reproducible CI runtime with HPC offloading, this approach validates both the software in the containers and their compatibility with HPC environments. This ensures the seamless deployment of AI-based digital twins, addressing the critical need for robust testing in hybrid environments.

Speakers

Diego Ciangottini

Technologist, INFN

Diego Ciangottini is a physicist and received his PhD from the University of Perugia, Italy in 2012. He now works as a technologist at INFN (Italian National Institute for Nuclear Physics), researching cloud-native solutions for the scientific use cases of the institute. In that...

Matteo Bunino

Computing Engineer, CERN

Matteo holds a double Master’s degree in Computer Engineering from PoliTO and EURECOM. At CERN, he focuses on AI-based digital twins in science, integrating AI, HPC, and real-time data processing. As part of CERN openlab, he collaborates with industry and academia on R&D projects...

Friday April 4, 2025 11:45 - 12:15 BST
Level 1 | Hall Entrance S10 | Room B

Content Experience Level Advanced

13:45 BST

Thousands of Virtual Kubelets: 1-to-1 Mapping a Supercomputer To Kubernetes With Supernetes - Dennis Marttinen, Aalto University

Friday April 4, 2025 13:45 - 14:15 BST

Bridging the gap between High-Performance Computing (HPC) and the cloud is an ongoing challenge in the cloud-native ecosystem. Most projects migrate some parts of the batch job scheduling from Slurm to Kubernetes. However, with many HPC systems rigidly tied to Slurm and its features, where is the integration limit?

Introducing Supernetes: an open source HPC-to-cloud bridge that bidirectionally reconciles all Slurm tasks to v1/Pods, and all Slurm nodes to v1/Nodes, 1-to-1. Supernetes tolerates the strictest HPC limitations: tight firewalls, no root, no fakeroot, no namespaces, no slurmrestd API. If you can run sbatch and scontrol, you can run Supernetes.

In this session, Dennis presents his quest to integrate LUMI, a global top-10 supercomputer, with Kubernetes. Starting from HPC-to-cloud bridge basics, the talk evolves into running thousands of virtual kubelet instances and hacking FluxCD to reconcile from a gRPC tunnel. The session concludes with a live demo of Supernetes on LUMI.

Speakers

Dennis Marttinen

Security and Cloud Computing (SECCLO) Master Student, Aalto University

Dennis is a Security and Cloud Computing (SECCLO) double-degree master's student with a broad background in Kubernetes, supercomputing/HPC, networking and cloud security. He is the co-author of Weave Ignite, a container-to-microVM solution, and Racklet, a scale model rack project presented...

Friday April 4, 2025 13:45 - 14:15 BST
Level 1 | Hall Entrance S10 | Room B

Content Experience Level Intermediate

14:30 BST

Transparent, Infra-Level Checkpoint and Restore for Resilient AI/ML Workloads at Scale - Ganeshkumar Ashokavardhanan, Microsoft & Bernie Wu, MemVerge

Friday April 4, 2025 14:30 - 15:00 BST

While model checkpointing at the application framework level provides basic failure recovery for AI/ML training, it burdens developers with complex config requirements. As the scale of production workload increases, infra-level checkpointing using Checkpoint/Restore in Userspace (CRIU) can provide fault-tolerance and live migration transparently to the end user. We will demonstrate with a k8s operator how to checkpoint and restore distributed ML workloads, showcasing novel extensions across CRIU, CRI-O, and cuda-checkpoint.

Our talk focuses on implementing synchronization mechanisms for JobSets running stateful workloads to be checkpointed in unison, while minimizing interruption overhead. The presentation explores how this infra-level approach accelerates recovery and workload reprioritization. Key topics include network state handling in distributed training and GPU memory checkpoint management, highlighting benefits for stateful applications requiring higher resiliency.

Speakers

Bernie Wu

VP Technology Partnerships, MemVerge

Bernie is VP of Technology Partnerships and leads the Kubernetes, AI/ML, and CXL Memory initiatives for MemVerge. He has 25+ years of experience as a senior executive for data center hardware and software infrastructure companies, including Conner/Seagate, Cheyenne Software, Trend...

Ganeshkumar Ashokavardhanan

Software Engineer, Microsoft

Ganesh is a Software Engineer on the Azure Kubernetes Service team at Microsoft, and is the lead for the GPU workload experience and error handling on this Kubernetes platform. He collaborates with partners in the ecosystem to support operator models for machine learning workloads...

Friday April 4, 2025 14:30 - 15:00 BST
Level 1 | Hall Entrance S10 | Room B