Continuous Optimization of Distributed Stream Programs
Dr Peh Li Shiuan, Professor, School of Computing
Dr Saman Amarasinghe, Professor, Massachusetts Institute of Technology
COM2 Level 2
MR3, COM2-02-26
Abstract:
Stream programs are an important class of applications in computer science, spanning domains such as financial markets, multimedia, and radio astronomy. Owing to recent advances in collecting and storing data at massive scale, stream programs need to be distributed and scaled over dozens of nodes to attain high performance, which necessitates specialized system support. Because modern high-performance hardware environments are highly dynamic, the system must also support programs throughout their lifetime to ensure the desired quality of service.
However, building a highly optimizing domain-specific compiler for stream programming comes with several inherent challenges. Specifically, it has become less viable for traditional heuristics-based compilers to generate optimal code for each and every hardware platform. Further, maintaining the desired quality of service during dynamic changes requires advanced features.
In this thesis work, we have designed and implemented StreamJIT, a distributed compiler and runtime system for high-performance stream processing. StreamJIT can automatically distribute and scale programs over cluster nodes according to each program's needs. Further, StreamJIT continuously optimizes programs through advanced features such as cluster-wide dynamic recompilation, downtime-free live reconfiguration, and online autotuning. Cluster-wide dynamic recompilation makes all ahead-of-time domain-specific optimizations available to a running program. Downtime-free live reconfiguration recompiles and redistributes program instances across cluster nodes on the fly. Online autotuning, unlike previous offline autotuning systems, continuously optimizes the code in the background. A series of experiments shows that StreamJIT delivers high performance, scales well, and is resilient to dynamic changes.