Quantizing neural networks for ultra-low-precision computation
Tuesday, 14 Jan 2020, 11:00 AM to 12:30 PM
COM2 Level 4
Executive Classroom, COM2-04-02
Minimizing bit-width is essential for efficient neural network design in terms of chip area, code size, and, most importantly, energy efficiency. In this talk, we first review state-of-the-art quantization methods from industry and academia, then introduce our ideas of outlier quantization, precision highway, and quantization-error-fluctuation-aware training, which together enable 4-bit linear weight/activation quantization of MobileNet v3.
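For context, the basic 4-bit linear quantization that the talk builds on can be sketched as below. This is a generic uniform symmetric scheme, not the speaker's specific method; the function names and the max-based scale choice are illustrative assumptions.

```python
import numpy as np

def linear_quantize(x, num_bits=4):
    """Uniform (linear) symmetric quantization to num_bits.

    A minimal sketch: the outlier-aware and precision-highway
    techniques discussed in the talk refine this basic scheme.
    """
    qmax = 2 ** (num_bits - 1) - 1           # e.g. 7 for signed 4-bit
    scale = np.abs(x).max() / qmax           # map largest magnitude to qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximate float tensor from integer codes.
    return q.astype(np.float32) * scale

# Example: quantize a weight vector and check the reconstruction error,
# which is bounded by half the quantization step (scale / 2).
w = np.random.randn(64).astype(np.float32)
q, s = linear_quantize(w, num_bits=4)
w_hat = dequantize(q, s)
max_err = np.abs(w - w_hat).max()
```

With only 16 levels, the largest-magnitude value sets the scale, so a few outliers can inflate the error for all other values; handling those outliers separately is exactly the motivation for outlier quantization.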
Sungjoo Yoo received his Ph.D. from Seoul National University in 2000. From 2000 to 2004, he was a researcher in the system-level synthesis (SLS) group at TIMA Laboratory, Grenoble, France. From 2004 to 2008, he led the system-level design team at System LSI, Samsung Electronics, as a principal engineer. From 2008 to 2015, he was an associate professor at POSTECH. In 2015, he joined Seoul National University, where he is now a full professor. In 2018, he spent his sabbatical at Facebook, Menlo Park, US. His current research interests are software/hardware co-design of deep neural networks and machine-learning-based optimization of computer architecture.