Model inferencing on Kubernetes with KServe (Peter Cseh, Gábor Lovass, Altair)

Published: 17 June 2026
on channel: KCD Budapest


Deploying machine learning models requires an efficient serving infrastructure. In this session, we’ll explore KServe, a Kubernetes-native solution for model inferencing. You'll learn how KServe simplifies model deployment, fast inference, monitoring, and cost optimization while supporting multiple frameworks, including Scikit-Learn, TensorFlow, PyTorch, LightGBM, Paddle, PMML, Spark MLlib, XGBoost, and ONNX. It also lets users deploy large language models (LLMs) from Hugging Face. This open source project provides a simple, pluggable solution for common infrastructure issues with inference models, such as GPU scaling, along with ModelMesh serving for high-volume, high-density use cases.

Whether you’re an ML engineer, DevOps expert, or AI leader, this session will equip you with best practices and hands-on insights to take your model serving to the next level.

---

This presentation was delivered at KCD Budapest 2025: https://kcdbudapest.hu/2025

You can find the slide decks here: https://kcdbudapest.hu/2025/slides

And the pictures taken during the event here: https://kcdbudapest.hu/2025/pictures