The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon Europe 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.
Kubernetes was designed for microservices. With AI advancing rapidly, Kubernetes must adapt to support both AI training and multi-node inference. It needs to improve not only at scheduling these workloads across the cluster, but also at fine-grained resource assignment on the nodes themselves.
High Performance Computing (HPC) systems rely on workload managers such as Slurm. Slurm, the most widely used HPC workload manager, with over two decades of development behind it, excels at gang scheduling, fair-share usage, job planning, and batch scheduling.
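To make the gang-scheduling idea concrete, here is a minimal, hypothetical sketch of its core rule: a job's tasks are admitted all-or-nothing, so a multi-node workload never starts with only part of its resources and deadlocks waiting for the rest. The function and job names are illustrative and are not Slurm's actual API.

```python
# Hypothetical sketch of gang scheduling's all-or-nothing admission rule.
# Each job requests a node count; it is started only if the whole request
# fits at once, otherwise it stays pending. Not Slurm's real scheduler.

def gang_schedule(jobs, free_nodes):
    """Admit each job only if ALL of its requested nodes fit at once."""
    scheduled, pending = [], []
    for name, nodes_needed in jobs:
        if nodes_needed <= free_nodes:
            free_nodes -= nodes_needed
            scheduled.append(name)
        else:
            pending.append(name)  # never start a partial gang
    return scheduled, pending

# Example: a 16-node cluster with three jobs.
jobs = [("train-a", 8), ("train-b", 12), ("infer-c", 4)]
print(gang_schedule(jobs, free_nodes=16))
# → (['train-a', 'infer-c'], ['train-b'])
```

Note that `train-b` waits rather than launching on 8 of its 12 requested nodes; real schedulers layer backfill, priorities, and fair-share accounting on top of this basic rule.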
We will show the current state of Slinky, a fully open-source toolset designed to integrate Slurm with Kubernetes and to address the difficulty of running AI clusters with higher performance and efficiency. Slinky includes a Slurm operator, a Slurm client library, and a metrics exporter. Here, we will outline our architecture and discuss the challenges of achieving the fine-grained control needed in Kubernetes to fully support AI and HPC workloads.
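A Kubernetes operator like the one described above is built around a reconcile loop: it compares the desired state declared in a custom resource with the state it observes and acts to close the gap. The sketch below shows that generic pattern in the simplest possible form; it is an illustration of the operator pattern, not Slinky's actual implementation.

```python
# Hypothetical sketch of the reconcile loop at the heart of a Kubernetes
# operator: compare desired state (e.g. a requested Slurm node count from
# a custom resource) with observed state, and emit the converging action.
# This is the generic operator pattern, not Slinky's real code.

def reconcile(desired_nodes: int, observed_nodes: int) -> str:
    """Return the action needed to converge observed state to desired."""
    if observed_nodes < desired_nodes:
        return f"scale-up:{desired_nodes - observed_nodes}"
    if observed_nodes > desired_nodes:
        return f"scale-down:{observed_nodes - desired_nodes}"
    return "in-sync"

print(reconcile(desired_nodes=4, observed_nodes=2))  # → scale-up:2
print(reconcile(desired_nodes=4, observed_nodes=4))  # → in-sync
```

In a real operator this loop runs continuously against the API server, so the cluster converges on the declared state even after failures or manual changes.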
Tim Wickberg is the Chief Technology Officer of SchedMD and is responsible for the technical direction and development of the open-source Slurm Workload Manager.