SyFI Lab Systems for Future Intelligence

System Architectures for Serving Agentic Applications

May 1, 2026 Saurabh Agarwal — UT Austin

Abstract

The rapid evolution of LLMs requires a fundamental rethink of serving infrastructure. This talk explores the broad spectrum of challenges associated with deploying these models, extending beyond baseline inference to address the dynamic and stateful requirements of agentic applications. First, we will detail our recent advancements in the memory layer, designed to optimize multi-turn LLM workloads. We will then transition to the specific complexities of multi-step orchestration, introducing Ventis, a novel system architecture to serve agentic workflows.

Speaker Bio

Saurabh is a postdoc at UT Austin, working with Aditya Akella. He graduated from UW-Madison in 2024. He works primarily on systems for machine learning and has published at conferences including ICML, NeurIPS, SOSP, NSDI, and MLSys. He was selected as an ML Systems rising star in 2024.