In-person
1-4 April 2025

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon Europe 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: Session times are shown in British Summer Time (BST, UTC+1). The schedule is subject to change, and session seating is available on a first-come, first-served basis.
Thursday April 3, 2025 17:30 - 18:00 BST
Balancing resource provisioning for LLM workloads is critical for maintaining both cost efficiency and service quality. Kubernetes horizontal autoscaling offers a cloud-native way to address this challenge, relying on metrics to make its scaling decisions. However, the efficiency of metrics collection affects how quickly and accurately the autoscaler responds to LLM workload demands. This session explores strategies to enhance metrics collection for autoscaling LLM workloads, covering:
1. The fundamentals of how horizontal autoscaling works in Kubernetes
2. The unique challenges of autoscaling LLM workloads
3. A comparison of existing Kubernetes autoscaling solutions for custom metrics, with their pros and cons
4. How optimizing metrics collection through push-based approaches can improve scaling responsiveness
The session will demonstrate an integrated solution built on KServe, the OpenTelemetry Collector, and KEDA, showing how they can be combined to optimize LLM workload autoscaling.
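As background for the first topic, the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler can be sketched in a few lines. This is only an illustrative sketch and is not taken from the session: the function name, the example metric (queued requests per replica), and the numbers are assumptions, and the real controller also handles details such as pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Sketch of the documented HPA rule:
    desired = ceil(current_replicas * current_metric / target_metric).

    `tolerance` mirrors the default 10% band inside which the
    autoscaler leaves the replica count unchanged.
    """
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical example: 4 replicas, target of 10 queued requests per replica.
print(desired_replicas(4, 30.0, 10.0))   # load is 3x the target
print(desired_replicas(4, 10.5, 10.0))   # within tolerance, no change
print(desired_replicas(4, 5.0, 10.0))    # load is half the target
```

Because the result depends entirely on `current_metric`, a stale metric value yields a stale replica count, which is why the latency of metrics collection matters for responsiveness.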
Speakers
Vincent Hou

Senior Software Engineer, Bloomberg
Vincent Hou is a senior software engineer on Bloomberg’s Cloud Native Compute Services AI Inference engineering team, which he joined in 2023 after working for IBM for 13 years. He has been an active open source contributor since 2010. He previously was an active contributor to...
Jiří Kremser

YAML Engineer, kedify.io
whois jkremser? A software engineer and open-source enthusiast currently working on kedify.io. Previously at GiantSwarm.io, ABSA, Red Hat, etc. He likes road trips and 3D printing, and he is also a proud contributor to the CNCF sandbox project k8gb.io.
Level 1 | Hall Entrance N10 | Room G
  Observability
