Name: A Practical Guide To Benchmarking AI and GPU Workloads in Kubernetes - Yuan Chen, NVIDIA & Chen Wang, IBM Research
Start: 2025-04-03T11:00:00+0100
End: 2025-04-03T11:30:00+0100

In-person
1-4 April 2025



Thursday April 3, 2025 11:00 - 11:30 BST

Level 1 | Hall Entrance S10 | Room B

Effective benchmarking is required to optimize GPU resource efficiency and enhance performance for AI workloads. This talk provides a practical guide on setting up, configuring, and running various GPU and AI workload benchmarks in Kubernetes.

The talk covers benchmarks for a range of use cases, including model serving, model training, and GPU stress testing. The tools covered include NVIDIA Triton Inference Server, fmperf (an open-source tool for benchmarking LLM serving performance), MLPerf (an open benchmark suite for comparing the performance of machine learning systems), GPUStressTest, gpu-burn, and CUDA benchmark. The talk will also introduce GPU monitoring and load generation tools.

Through step-by-step demonstrations, attendees will gain practical experience using benchmark tools. They will learn how to effectively run benchmarks on GPUs in Kubernetes and leverage existing tools to fine-tune and optimize GPU resource and workload management for improved performance and resource efficiency.
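
As a rough illustration of what running a GPU stress test in Kubernetes can look like, the sketch below builds a `batch/v1` Job manifest that requests one GPU through the `nvidia.com/gpu` resource and prints it as JSON. It is not taken from the talk: the helper name `gpu_benchmark_job`, the image reference, and the `./gpu_burn 60` command line are assumptions chosen for illustration, and the cluster is assumed to expose GPUs through the NVIDIA device plugin.

```python
import json


def gpu_benchmark_job(name, image, command, gpus=1):
    """Return a batch/v1 Job manifest (as a dict) that requests `gpus` NVIDIA GPUs.

    Hypothetical helper for illustration; not part of any benchmark tool.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Do not retry a failed benchmark run automatically.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": command,
                            # GPUs are requested as an extended resource in limits.
                            "resources": {"limits": {"nvidia.com/gpu": gpus}},
                        }
                    ],
                }
            },
        },
    }


manifest = gpu_benchmark_job(
    name="gpu-burn-60s",
    image="example.registry/gpu-burn:latest",  # hypothetical image, replace with your own
    command=["./gpu_burn", "60"],  # assumed invocation: stress the GPU for 60 seconds
)

# kubectl accepts JSON as well as YAML, so the output can be applied directly.
print(json.dumps(manifest, indent=2))
```

If the script is saved as, say, `make_job.py`, its output could be piped to `kubectl apply -f -`. Generating the manifest from code rather than writing it by hand makes it easier to sweep parameters such as GPU count or run duration across repeated benchmark runs.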

Speakers

Chen Wang

Senior Research Scientist, IBM Research

Chen Wang is a Senior Research Scientist at the IBM T.J. Watson Research Center. Her interests lie in Kubernetes, Container Cloud Resource Management, Cloud Native AI & LLM systems, and applying AI in Cloud system management. She is an open-source advocate, a Kubernetes & CNCF contributor...

Yuan Chen

Principal Software Engineer, NVIDIA

Yuan Chen is a Principal Software Engineer at NVIDIA. Before joining NVIDIA, Yuan served as a Staff Software Engineer at Apple, where he contributed to the development of Apple's Kubernetes infrastructure beginning in 2019. Yuan has actively contributed to the Kubernetes projects...


AI + ML

Content Experience Level Beginner

KubeCon + CloudNativeCon Europe 2025

