CS SEMINAR

Designing Serverless Cloud Systems for Large-Scale Generative AI

Speaker
Dr. Dmitrii Ustiugov, Assistant Professor, College of Computing & Data Science, Nanyang Technological University
Chaired by
Dr. LI Jialin, Sung Kah Kay Assistant Professor, School of Computing
lijl@comp.nus.edu.sg

21 Nov 2024 Thursday, 02:00 PM to 03:30 PM

SR21, COM3 02-60

Abstract:
Serverless computing has revolutionized cloud architecture by offloading resource management to cloud providers, thereby speeding up the deployment of commercial services and broadening their adoption. As the demand for Generative AI (GAI) applications grows, existing CPU-centric cloud architectures fall short of meeting the needs of large-scale GAI applications. In particular, existing cloud systems lack support for the elastic management of heterogeneous clusters with CPU/GPU/xPU hardware and the high-speed communication fabrics essential for large-scale GAI deployments. In this talk, I will discuss the state of serverless cloud systems and introduce the serverless research ecosystem we have been building with colleagues from the University of Edinburgh and ETH Zurich. Then, I will discuss our latest work on providing elasticity in serverless systems for modern and emerging cloud applications, including large-scale LLM inference.

Bio:
Dmitrii is an Assistant Professor at NTU Singapore. Previously, he was a Postdoctoral Researcher at ETH Zurich. Dmitrii received a Ph.D. from the University of Edinburgh and BSc and MSc degrees from the Moscow Institute of Physics and Technology. Dmitrii’s research interests lie at the intersection of Computer Systems and Architecture, with a current focus on support for serverless cloud and large-scale AI systems. His work has been published in top-tier computer systems and architecture venues, such as OSDI, ASPLOS, and ISCA.