In-person
1-4 April 2025

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon Europe 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in British Summer Time (BST) (UTC +1). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis. 
Wednesday, April 2
 

13:30 BST

🪧 Poster Session: A New Approach To Cluster Infrastructure Management for Kubernetes Service Providers - Pascal Fries & Sascha Rauch, ATIX AG
Wednesday April 2, 2025 13:30 - 14:30 BST
Providing Kubernetes as a service is difficult, since clusters have to be administered extrinsically (i.e., at infrastructure level), as well as intrinsically (i.e., at API level). While platform providers will often want to delegate the latter task to their customers, separating responsibilities is not always easy because infrastructure components are usually deployed as API resources themselves. Externalising the control plane partially solves this issue, yet components such as network, storage, and monitoring still run as pods in the cluster.

In this session, we present a novel approach to cluster infrastructure that we call “ephemeral nodes”. Utilizing two kubelets, our method achieves separate interfaces for platform providers and users. Along with a general discussion, we provide an implementation based on mutating webhooks and a CSI shim plugin. Along the way, we also show how the present method can be used for bare metal node pooling without the need for virtualisation.
Speakers

Pascal Fries

Senior IT Consultant, ATIX AG
Pascal Fries is a Senior IT Consultant working at ATIX AG, Germany. He is passionate about optimising workflows in Kubernetes and container infrastructures in general. As a former high energy physicist, he loves taking things apart, seeing how they work in detail, and reassembling them... Read More →

Sascha Rauch

Lead DevOps Consultant, ATIX AG
Sascha has several years of experience in managing cloud projects and designing highly available cloud architectures. He is a specialist in DevSecOps and container orchestration and primarily supports companies in building cluster solutions, CI/CD chains and analytics stacks.
Level 1 | Hall Entrances S8 - S9, N8 - N9

15:15 BST

The Security Challenges of Running Untrusted Code in Production on Kubernetes at Internet Scale - Christian Weichel & Alejandro de Brito Fontes, Gitpod
Wednesday April 2, 2025 15:15 - 15:45 BST
Running untrusted code from 1.5 million developers presents unique security challenges that push container isolation to its limits. At Gitpod, we spent six years building secure boundaries for development environments on Kubernetes, ultimately discovering fundamental security limitations that led us to rearchitect our platform. Our recent technical deep-dive blog ended up on Hacker News and sparked quite the intense debate (speakers are the OP).

This deep-dive examines our security evolution from standard container isolation to custom security implementations involving user namespaces, seccomp profiles, and network isolation. We'll explore how we handled privileged operations like Docker-in-Docker, FUSE filesystems, and root access requests while maintaining isolation. Whether you're dealing with multi-tenant workloads or running untrusted code, you'll gain practical insights about our learnings on real-world security boundaries in Kubernetes.
Speakers

Alejandro de Brito Fontes

Senior Engineer, Gitpod
Alejandro is a software entrepreneur and systems architect with more than 20 years of experience designing, building, and operating mission-critical IT infrastructure.

Christian Weichel

Chief Technology Officer, Gitpod
Chris Weichel is the Chief Technology Officer at Gitpod, where he leads the engineering team that builds and maintains the cloud-native platform for software development. With over 20 years of experience in software engineering and human-computer interaction, he has a comprehensive... Read More →
Level 0 | ICC Auditorium
  Security

16:15 BST

Where’s All My Memory Gone? Mapping K8s Memory Metrics To Physical Resources - Mahé Tardy, Isovalent at Cisco
Wednesday April 2, 2025 16:15 - 16:45 BST
Understanding memory statistics in Kubernetes is critical for reducing an application’s memory impact or avoiding the Out-Of-Memory (OOM) killer. In this talk, we’ll decode the complexities behind Kubernetes memory metrics (did you say container_memory_working_set_bytes?), tracing them from the kubelet binary to the host’s memory control groups.

The memory metrics we observe, whether through kubectl top or Prometheus, are the result of a complex journey, from memory control group statistics, through calculations by libraries like opencontainers/libcontainer, to cAdvisor or the container runtime, and finally, to the kubelet. We’ll deep dive into the role of cAdvisor and the container runtime in memory tracking, the interaction with the OOM killer, and the impact of control groups (cgroups) versions on metric calculations. By the end of this session, you’ll be able to better interpret memory statistics and troubleshoot memory-related issues in your clusters.
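As a rough illustration of the last step in that journey, here is a minimal sketch of how a working-set figure can be derived from memory control group statistics: the reported usage minus the inactive file cache, clamped at zero. This is the commonly described calculation behind container_memory_working_set_bytes, not the speaker's material. The function names and the sample numbers are hypothetical, and the choice of key (`inactive_file` on cgroup v2, `total_inactive_file` on cgroup v1) is stated as an assumption about how the statistics are read.

```python
def parse_memory_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup memory.stat file into a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(usage_bytes: int, memory_stat: dict[str, int],
                      cgroup_v2: bool = True) -> int:
    """Estimate the working set as usage minus inactive file cache.

    Assumption: the inactive file cache is read from 'inactive_file' on
    cgroup v2 and from 'total_inactive_file' on cgroup v1.
    """
    key = "inactive_file" if cgroup_v2 else "total_inactive_file"
    inactive = memory_stat.get(key, 0)
    # Never report a negative working set.
    return max(usage_bytes - inactive, 0)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real container.
    sample_stat = "anon 1000\nfile 5000\ninactive_file 3000\nactive_file 2000\n"
    print(working_set_bytes(8000, parse_memory_stat(sample_stat)))  # prints 5000
```

On a real node, the usage value would come from `memory.current` (cgroup v2) or `memory.usage_in_bytes` (cgroup v1) in the container's cgroup directory. Because active file cache stays in the working set, the figure can be higher than the memory the application has allocated itself.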
Speakers

Mahé Tardy

Software Engineer, Isovalent at Cisco
Mahé is a security engineer at Isovalent and an active contributor to Kubernetes SIG Security. He was previously working as a security researcher and loves working with Linux, security, and Kubernetes!
Level 1 | Hall Entrance S10 | Room C
  Operations + Performance

16:15 BST

Making CRDs Delightful: Beyond the Pitfalls - Evan Anderson, Stacklok, Inc
Wednesday April 2, 2025 16:15 - 16:45 BST
CRDs have a lot of traps for new operator authors; this is a different talk about developing for Kubernetes! If you're building Kubernetes resource types, let's talk about how to make them satisfying and enjoyable for your users. Using examples from multiple popular projects, Evan will provide 10 tips on how to make your APIs friendly to Kubernetes beginners and experts alike.

* Use status for humans and machines
* Condition super-powers with one simple rule!
* How to avoid needing to build a CLI
* When to build one anyway
* Day-1 RBAC for everyone
* Supporting GitOps gracefully
* Status-free objects: Policies and Classes
* The beauty of zero
* Borrowing is best: embedding known types
* Operating someone else's CRD: labels and annotations

Evan has been extending and operating Kubernetes for the last 6 years. The above patterns will be illustrated with examples from his experience with ArgoCD, Cert-Manager, Gateway-API, Knative, and Kubernetes, among others.
Speakers

Evan Anderson

Software Engineer, Stacklok, Inc
Founder and maintainer on Knative serverless project. Currently at Stacklok working on supply chain security, previously at Google and VMware; recovering SRE.
Level 0 | ICC Capital Hall | Room 2
  Platform Engineering

17:00 BST

Kubernetes Backup Legitimized: CSI Changed Block Tracking Has Arrived - Mark Lavi, Carl Braganza & Prasad Ghangal, Veeam; Xing Yang, VMware by Broadcom
Wednesday April 2, 2025 17:00 - 17:30 BST
Kubernetes storage is compared to traditional facilities for backup, disaster recovery, cyber-resilience against ransomware, and audit compliance. To meet the fastest recovery point and return-to-production objectives, one critical area has been missing: Changed Block Tracking (CBT). Since 2018, Kubernetes has deprecated "in-tree" storage drivers in favor of the Container Storage Interface (CSI) specification for industry-wide collaboration and standardization. CBT radically improves backup efficiency, yet meeting business needs has so far required proprietary storage drivers. For over two years, the Kubernetes Data Protection Working Group has worked to bring CBT to the CSI specification and Kubernetes API. Join us to learn how cloud native storage backup and disaster recovery can finally compete with traditional infrastructure, progress made with storage and backup vendors and projects, and the architecture, security, testing, and scalability of Kubernetes CSI CBT.
Speakers

Xing Yang

Tech Lead, VMware by Broadcom
Xing Yang is a Tech Lead in the Cloud Native Storage team at VMware by Broadcom. She is a co-chair of CNCF Storage TAG, a co-chair of the Kubernetes Storage SIG, a co-chair of the Data Protection WG, and a maintainer in Kubernetes CSI. Before joining VMware, Xing was the Lead Architect... Read More →

Mark Lavi

Principal Cloud Native Product Manager, Veeam Software
Mark was an early web developer, administrator, and advocate at Netscape, Silicon Graphics, CNN, and News Corp., spending over 20 years in Silicon Valley with numerous start-ups across engineering, IT, and marketing. As a Cloud Native Product Manager at Veeam, Mark drives Kubernetes... Read More →

Carl Braganza

Software Engineer, Veeam
I've worked in the data storage and protection space for most of my career, most recently on Kasten by Veeam, a Kubernetes backup product. I'm a member of the Kubernetes SIG-Storage Data Protection Working Group and have co-authored the Changed Block Tracking KEP and its associated... Read More →

Prasad Ghangal

Member of Technical Staff, Veeam
Prasad works as an MTS at Kasten by Veeam (kasten.io). His main areas of interest are Kubernetes, distributed systems, and Open source. He likes to create and talk about dev tools. He is the creator of an open-source tool BotKube (botkube.io) and a contributor to the Changed Block... Read More →
Level 1 | Hall Entrance S10 | Room D
  Data Processing + Storage

17:45 BST

Kubernetes CRD Design for the Long Haul: Tips, Tricks, and Lessons Learned - Christian Schlotter, Broadcom & Fabrizio Pandini, VMware by Broadcom
Wednesday April 2, 2025 17:45 - 18:15 BST
Custom Resource Definitions (CRDs) are the present and future of Kubernetes, serving as the bridge between Kubernetes and your own applications, processes, and tooling.

However, as we’ve all learned the hard way, designing and evolving a good CRD is not as simple as it seems.

Join this talk to discover tips, tricks and lessons learned for designing CRDs that can support your cloud native journey for the next 10 years.

Let’s embark on this journey together to shed light on the intricacies of CRD design and implementation, so we can transform arcane CRDs into simple, consistent API types that everyone can comfortably work with.
Speakers

Christian Schlotter

Software Engineer, Broadcom
Christian is a Software Engineer at Broadcom. He is an active maintainer at the Cluster API and Cluster API Provider vSphere projects of SIG Cluster Lifecycle as well as emeritus maintainer of the Cluster API Provider OpenStack project. Since messing up his father's internet dial-up... Read More →

Fabrizio Pandini

Software Engineer, VMware by Broadcom
A Kubernetes contributor obsessed with making Kubernetes lifecycle simple and consistent across all types of infrastructures, so everyone can build amazing applications on top of it. When I’m not busy as a SIG Cluster Lifecycle tech lead or as a project maintainer in Cluster API... Read More →
Level 1 | Hall Entrance N11
  Application Development
 
Thursday, April 3
 

11:00 BST

Development Environments on Kubernetes: Lessons From Six Years at Internet Scale - Christian Weichel & Alejandro de Brito Fontes, Gitpod
Thursday April 3, 2025 11:00 - 11:30 BST
Running dev environments at scale presents unique challenges that push Kubernetes to the limit. After 6 years of operating development environments for 1.5 million users and as long-time contributors to the Kubernetes community, we encountered fundamental limitations with our use-case that led us to rearchitect Gitpod away from Kubernetes. Our recent technical deep-dive blog ended up on Hacker News and sparked quite the intense debate (speakers are the OP).

This talk dives into our journey: kernel modifications; custom controllers; user namespaces with shiftfs for UID mapping; seccomp notify for proc masking; custom device policies for FUSE; tackling CPU throttling with custom CFS controllers; experiments with cgroupv2; and why Kubernetes 1.26's dynamic resource allocation didn’t solve our challenges. We share these hard-won insights with the community to continue the discussion around development environment infrastructure, whether on or off Kubernetes.
Speakers

Alejandro de Brito Fontes

Senior Engineer, Gitpod
Alejandro is a software entrepreneur and systems architect with more than 20 years of experience designing, building, and operating mission-critical IT infrastructure.

Christian Weichel

Chief Technology Officer, Gitpod
Chris Weichel is the Chief Technology Officer at Gitpod, where he leads the engineering team that builds and maintains the cloud-native platform for software development. With over 20 years of experience in software engineering and human-computer interaction, he has a comprehensive... Read More →
Level 1 | Hall Entrance S10 | Room A
  Operations + Performance

11:00 BST

Identity-based Trust - Till Death Do We Part? - John Kjell, TestifySec & Kairo De Araujo, Independent
Thursday April 3, 2025 11:00 - 11:30 BST
With the rise in adoption of identity-based trust, it is increasingly important to understand the threats to such systems. PyPI, NPM, RubyGems, and Homebrew have all established models for “trusted publishing” attestation, based on OIDC. Many of these implementations rely on Project Sigstore’s projects Fulcio and Rekor.

Sigstore’s Rekor is an append-only log. There’s no way to remove entries, even if they’re illegitimate. In the case of an identity compromise, most individuals would prefer to avoid a divorce from their identity, allowing for recovery and the establishment of future trust in their name.

In this session, we’ll examine a threat model and mechanisms for compromise in a Sigstore-based identity signing system. Once established, we’ll describe ways to mitigate and resolve the threats, leveraging the CNCF projects in-toto and The Update Framework (TUF). Beyond theoretical designs, we’ll look at how this system has been implemented in in-toto’s sub-project Archivista.
Speakers

John Kjell

Director of Open Source, TestifySec
John is responsible for open source at TestifySec, a software supply chain security startup. He is a maintainer for the Witness and Archivista sub-projects under in-toto. Additionally, John is an active contributor to CNCF's TAG Security and multiple projects within the OpenSSF. Before... Read More →

Kairo De Araujo

Open Source Engineer, Independent
Kairo is a Senior Open Source Engineer. Kairo maintains python-tuf and is the author of Repository Service for TUF (RSTUF). His past roles include Senior Open Source Software Engineer at TestifySec, VMware, Senior Software Engineer at IBM, ING, Forescout, and a former System Engineer... Read More →
Level 1 | Hall Entrance S10 | Room C
  Security

14:15 BST

eBPF and Wasm: Unifying Userspace Extensions With Bpftime - Yusheng Zheng, eunomia-bpf
Thursday April 3, 2025 14:15 - 14:45 BST
In cloud-native systems, extending and customizing applications is key to improving development, deployment, and observability. eBPF is powerful for kernel-level enhancements, and WebAssembly brings extension to userspace. Yet, both face challenges when userspace extensions need to interact deeply with host applications. eBPF's kernel-focused design struggles in diverse userspace environments, and Wasm’s sandboxing introduces overhead and complexity due to extra checks and data copying. Enter bpftime, a framework that extends eBPF’s capabilities into userspace. Using dynamic binary instrumentation, bytecode verification, and hardware isolation, bpftime allows secure, high-performance extensions without the overhead of Wasm’s sandboxing. This talk explores how bpftime works with the eBPF Interface to simplify userspace extensions, compares the evolution of eBPF and Wasm, and shows how bpftime can power observability, networking, and other cloud-native extensions.
Speakers

Yusheng Zheng

OSS maintainer, eunomia-bpf
Yusheng Zheng is an open-source maintainer and researcher focused on improving complex systems through comprehensive understanding and strategic, small-scale modifications. As the co-founder of the eunomia-bpf open-source community and a PhD student, Yusheng is at the forefront of... Read More →
Level 1 | Hall Entrance S10 | Room D
  Emerging + Advanced

14:15 BST

Tutorial: Rock, Paper, Scissors! Build an AI-powered Interactive Game With Argo CD and Kubeflow - Natale Vinto, Daniel Oh, Roberto Carratala & Alex Soto Bueno, Red Hat; Hind Azegrouz, Intel
Thursday April 3, 2025 14:15 - 15:30 BST
Explore the exciting world of modern and AI-powered application development with our hands-on lab. This comprehensive session will guide you through the process of deploying and upgrading models, pipelines, and more for the classic game of Rock, Paper, Scissors, showcasing the capabilities of Kubeflow and Argo CD.

Throughout the lab/demo, you will:

- Learn how to deploy an AI model for the interactive game using Argo CD & KServe Model Mesh
- Discover how data scientists can efficiently test and experiment with their models
- Visualize model automation based on Kubeflow pipelines
- Utilize Argo CD for streamlined application deployment and updates
- Implement GitOps methodology for enhanced collaboration and automation in AI application development and deployment

At the end of the session, attendees will have a better understanding of CI/CD for AI and apps, and of how to combine both with Argo CD and GitOps for a perfect match in Kubernetes!
Speakers

Roberto Carratala

AI Architect, Red Hat
Cloud Services Black Belt specialized in Container Orchestration Platforms (OpenShift & Kubernetes), Cloud Services, DevSecOps and CI/CD.

Daniel Oh

Senior Principal Developer Advocate, Red Hat
Daniel Oh is a Java Champion and Senior Principal Developer Advocate at Red Hat, where he evangelizes building cloud-native and serverless apps on Kubernetes ecosystems. He's also contributing to various cloud open-source projects and ecosystems as a CNCF ambassador for accelerating... Read More →

Natale Vinto

Director of Developer Advocacy, Red Hat
Natale Vinto is a Software Engineer with more than 10 years of expertise on IT and ICT technologies, and a consolidated background on Telecommunications, DevOps and Linux operating systems. Today Natale is Director of Developer Advocacy at Red Hat and author of "Modernizing Enterprise... Read More →

Hind Azegrouz

EMEA AI Inference Lead, Intel

Alex Soto Bueno

Developer Advocate, Red Hat
Level 1 | Hall Entrance N11
  Tutorials, Application Development

16:00 BST

Get WITty: Evolving Kubernetes Scheduling With the WebAssembly Component Model - Dejan Pejchev & Jonathan Giannuzzi, G-Research
Thursday April 3, 2025 16:00 - 16:30 BST
At KubeCon NA 2024, we introduced WASM + KWOK Wizardry: Writing and Testing Kubernetes Scheduler Plugins at Scale, showcasing how WASM plugins transform Kubernetes scheduling. This session continues the story, highlighting our progress toward a language-agnostic framework using the WebAssembly Component Model.

The current Go-centric WASM plugin SDK restricts innovation to a single language. By adopting the Component Model, we enable developers to write plugins in Rust, Python, JavaScript, and more, unlocking new possibilities. This approach enhances modularity, simplifies integration with standardized interfaces, and strengthens security through improved isolation.

We’ll also showcase how this aligns with the Kubernetes Scheduler Simulator, providing a powerful testing environment for these advanced plugins. Join us to see how the Component Model fosters collaboration, innovation, and extensibility in Kubernetes scheduling. Let’s move beyond wizardry and get truly WITty!
Speakers

Dejan Zele Pejchev

Open Source Software Engineer, G-Research
Dejan is a seasoned Software Engineer with over 8 years of experience building and scaling distributed systems and an advocate of open source & Kubernetes-native solutions. Dejan is also a maintainer of Armada, the Kubernetes multi-cluster batch scheduling tool, Testkube, the Kubernetes-native... Read More →

Jonathan Giannuzzi

Open Source Evangelist, G-Research
Jonathan is an Open Source Evangelist at G-Research, where he applies his nerdy wizardry powers to solve deep problems that can bubble up all the way to the end-user.
Level 1 | Hall Entrance S10 | Room D
  Emerging + Advanced

16:45 BST

Building & Operating a Large-scale HPC AI Cluster on Kubernetes - Kalyan Saladi & Chandan Avdhut, Meta Platforms Inc.
Thursday April 3, 2025 16:45 - 17:15 BST
We explore the challenges of building and running a large-scale AI/ML cluster in the cloud that can handle high-performance ML training jobs. We will cover the benefits of using a container orchestration platform like Kubernetes for managing AI/ML workloads and how Slurm can be used to schedule and manage jobs on a cluster. We will also dive into cluster health management and meeting performance expectations.

We share lessons from building a state-of-the-art 12K-GPU HPC cluster with high-performance storage systems and an InfiniBand network fabric, hosting workloads that range from tens to thousands of GPUs and last days to weeks.

We highlight the importance of health checks and telemetry in understanding and reacting to the various failure modes experienced in HPC clusters, and how to mitigate their impact on AI training jobs.

Finally, we share insights, pitfalls, and best practices from operating the cluster for more than 6 months.
Speakers

Kalyan Saladi

Software Engineer, Meta Platforms Inc.
Kalyan is a software engineering lead at Meta Platforms in the research org (FAIR). He has built and operated multiple large AI clusters, both bare-metal and in the cloud. He supported several leading large model training efforts in FAIR over the years, including LLAMA-2. Kalyan... Read More →

Chandan Avdhut

Production Engineer, Meta Platforms Inc.
As a seasoned Production Engineer with a strong background in Kubernetes, public cloud infrastructure, and large-scale AI/ML clusters, I bring a unique blend of technical expertise and real-world experience to the table. With a proven track record of designing and operating complex... Read More →
Level 1 | Hall Entrance S10 | Room B
  AI + ML
 
Friday, April 4
 

11:45 BST

Data Gravity and Kubernetes: Managing Large-Scale Data Ingest With Minimal Latency - Abhishek Bhattacharjee, Quasitech Innovations Private Limited & Arya Soni, Zupee
Friday April 4, 2025 11:45 - 12:15 BST
Kubernetes environments, particularly those handling large-scale data ingest across APIs, face unique challenges posed by data gravity. This presentation explores newer ways to overcome these challenges, such as local storage layer optimizations, integration of edge computing, and network efficiencies that can help reduce latency. Participants will learn ways to reduce data transfer costs, increase data transfer rates, and improve data storage characteristics without sacrificing the scalability of the system. Many of the examples are drawn from real situations, which will help the audience apply these techniques effectively in complex, real-life Kubernetes environments.
Speakers

Abhishek Bhattacharjee

CEO at Wooak, Quasitech Innovations Private Limited
I am Abhishek Bhattacharjee, Co-Founder & CEO of Wooak, an AI-driven HRMS platform redefining workforce management. With a strong background in tech and leadership, I specialize in building scalable, user-focused solutions. Passionate about innovation, I aim to empower businesses... Read More →

Arya Soni

DevOps Engineer, Zupee
I’m a DevOps Engineer with over two years of experience in cloud-native technologies, automation, and infrastructure optimization. As a co-organizer of the CNCG Bihar Chapter, I’ve led initiatives promoting open-source contributions and community growth. I’ve contributed to... Read More →
Level 1 | Hall Entrance S10 | Room C
  Data Processing + Storage

11:45 BST

Testing AI Containers for Digital Twins in Science: A Cloud-HPC Workflow - Matteo Bunino, CERN & Diego Ciangottini, INFN
Friday April 4, 2025 11:45 - 12:15 BST
CERN is advancing the development of AI-based digital twins in science through projects like interTwin, an EC-funded project to develop a digital twin engine for science. These digital twins rely on HPC resources for training multi-node, multi-GPU models using containerized workflows.
Developing such containers for HPC systems presents unique challenges, including accessing restricted HPC resources and integrating with HPC software stacks, while ensuring the interoperability between different container runtimes.
We introduce a CI/CD workflow that bridges cloud and HPC and enables automated testing of AI/ML containers on the same SLURM-managed clusters where they will be deployed. By integrating Dagger’s reproducible CI runtime with HPC offloading, this approach validates both the software in the containers and their compatibility with HPC environments. This ensures the seamless deployment of AI-based digital twins, addressing the critical need for robust testing in hybrid environments.
Speakers

Diego Ciangottini

Technologist, INFN
Diego Ciangottini is a physicist and received his PhD from the University of Perugia, Italy in 2012. He now works as a technologist at INFN (Italian National Institute for Nuclear Physics), researching cloud-native solutions for the scientific use cases of the institute. In that... Read More →

Matteo Bunino

Computing Engineer, CERN
Matteo holds a double Master’s degree in Computer Engineering from PoliTO and EURECOM. At CERN, he focuses on AI-based digital twins in science, integrating AI, HPC, and real-time data processing. As part of CERN openlab, he collaborates with industry and academia on R&D projects... Read More →
Level 1 | Hall Entrance S10 | Room B
  Emerging + Advanced

13:45 BST

Failure Is Not an Option: Durable Execution + Dapr = 🚀 - Marc Duiker, Diagrid
Friday April 4, 2025 13:45 - 14:15 BST
Applications break all the time: there could be a network issue, a cloud provider outage, or just a glitch in the matrix. But as a developer, you really need your applications to be resilient without the need to recover databases and restart services manually.

In this session, I'll demonstrate how Dapr Workflow provides durable execution, which enables you to write reliable workflows as code. In addition, I'll show how resiliency policies in Dapr improve reliable communication across services and resources when developing distributed applications.

I'll go into specific workflow features, such as scheduling, sequential and parallel execution, and waiting for external events. I'll show many code samples (in C#) for each of these features and will run the applications using the Dapr CLI to demonstrate their resiliency.

By the end of the session, you will have a good grasp of how durable execution with Dapr Workflow and resiliency policies can help you build resilient applications.
Speakers

Marc Duiker

Developer Advocate, Diagrid
Marc is a Sr Developer Advocate at Diagrid with a strong focus on event-driven architectures. He loves helping developers to achieve more every day. You might have seen Marc at a developer meetup or conference, since he's a regular speaker and event-organizer in the area of Dapr... Read More →
Level 0 | ICC Capital Hall | Room 1
  Application Development

14:30 BST

Data Processing Efficiency: Optimizing Batch Workloads on Kubernetes With Custom Schedulers - Sigmar Stefánsson, NetApp & Hichem Kenniche, NetApp Instaclustr
Friday April 4, 2025 14:30 - 15:00 BST
Kubernetes is the leading platform for deploying major data processing frameworks like Apache Spark. However, its default scheduler falls short in meeting some of the advanced and specific requirements of batch workloads.

This presentation explores the necessity and benefits of custom schedulers, with a deep dive on the implementation of Volcano and Apache YuniKorn in multi-cloud Kubernetes environments running large and complex Apache Spark applications. Discover how these tools can optimize cluster management for batch and ML workloads.
Speakers

Hichem Kenniche

Principal OSS Product Architect, NetApp Instaclustr
Hichem is passionate about open-source technologies such as Kubernetes and its ecosystem, Apache Spark, Kafka, Airflow, and many others. With over 10 years of experience in Data Analytics and AI/ML, he is currently an OSS Product Architect at NetApp Instaclustr. In this role, he collaborates... Read More →

Sigmar Stefánsson

Software Engineer, NetApp
Sigmar is a Software Engineer at NetApp, where he has been instrumental in advancing the integration of Apache Spark within Kubernetes environments. With a robust background in software development and a keen focus on big data technologies, Sigmar has dedicated years to optimizing... Read More →
Level 1 | Hall Entrance S10 | Room C
  Data Processing + Storage

15:15 BST

AI Beyond Autocomplete: Using LLMs To Create 1000 Kubernetes Controllers - Justin Santa Barbara & Walter Fender, Google
Friday April 4, 2025 15:15 - 15:45 BST
LLMs can generate React apps, poems, and even music. But can they rise to the ultimate challenge: writing reliable Kubernetes controllers? The Config Connector team say "yes!" We are successfully using AI to write production controllers for a thousand Google Cloud resources.

Our path was to first break the problem into LLM-friendly steps (such as generating KRM types, the mocks and the reconciler). For each step, we invoke custom fine-tuned LLMs in a novel way with custom “jigs”. We add testing to create an “interlock” that mitigates hallucinations.

This journey changed our whole codebase philosophy: from optimizing for lines of code, we now prioritize the ability to safely and easily author and merge focused changes (at the expense of having lots of code). Although AI motivated this trade-off, it also aids development as OSS.

We believe our approach is generally applicable; join us to learn lessons that will apply as your project embraces the AI-assisted future.
Speakers

Justin Santa Barbara

Software Engineer, Google
Justin has been contributing to Kubernetes since 2014, initially as the primary maintainer of the Kubernetes AWS support; he also started the kOps project. He loves helping users adopt and grow their use of Kubernetes, and believes that we have only scratched the surface of the Kubernetes... Read More →

Walter Fender

Staff Engineer, Google
Graduated from U.C. Berkeley. Working at Google and on Kubernetes API Machinery and Cloud Provider for eight years. Maintainer for the APIServer Network Proxy and Config Connector projects.
Level 0 | ICC Capital Hall | Room 1
  Application Development
 
