CS SEMINAR

Faster Neural Machine Translation

Speaker
Dr Kenneth Heafield, Assistant Professor, University of Edinburgh
Chaired by
Dr KAN Min Yen, Associate Professor, School of Computing
kanmy@comp.nus.edu.sg

07 Mar 2019 Thursday, 11:00 AM to 12:00 PM

Executive Classroom, COM2-04-02

Abstract:

We dominated a shared task on translation speed run by the Workshop on Neural Machine Translation. The speedups came from several levels: reduced model complexity, teacher-student compression, and efficient kernels. Compressing the model is particularly important because memory bandwidth is the limiting factor on GPUs with tensor cores and on CPUs. I wrote 8-bit integer multiplication in AVX512 intrinsics, which reduced translation latency by 2.7x. Much of the systems-for-ML work addresses vision tasks; large parameter skew and variable-size input make sequential models difficult and interesting.
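
As a rough illustration of the 8-bit idea mentioned above, the following minimal sketch quantizes two float vectors to signed 8-bit integers, multiplies and accumulates in integer arithmetic, and rescales the result. It is not the speaker's AVX512 implementation: the function names (scale_for, quantize, int8_dot) and the per-vector max-abs scaling scheme are assumptions chosen for illustration, and plain Python is used only to show the arithmetic, not the speed.

```python
def scale_for(values):
    """Pick a scale so the largest magnitude maps to 127 (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values, scale):
    """Map floats to integers clamped to the signed 8-bit range [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def int8_dot(a, b):
    """Approximate the dot product of a and b using 8-bit quantized values."""
    scale_a, scale_b = scale_for(a), scale_for(b)
    qa, qb = quantize(a, scale_a), quantize(b, scale_b)
    # Products of 8-bit values are accumulated as wider integers.
    acc = sum(x * y for x, y in zip(qa, qb))
    # Undo both scales to return to the original float range.
    return acc * scale_a * scale_b


a = [0.5, -1.0, 0.25, 0.75]
b = [1.0, 0.5, -0.5, 0.25]
exact = sum(x * y for x, y in zip(a, b))
approx = int8_dot(a, b)
print(exact, approx)
```

Each 8-bit value occupies a quarter of the space of a 32-bit float, so the same weights need less memory traffic, which is why this kind of compression helps when memory bandwidth is the bottleneck.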


Biodata:

Kenneth Heafield is a Lecturer (roughly equivalent to a US Assistant Professor) leading a machine translation group at the University of Edinburgh. He works on efficient neural networks, low-resource translation, mining petabytes of data for translations, and occasionally grammatical error correction. The ParaCrawl project (https://paracrawl.eu/) provides large, free parallel corpora pairing 24 languages with English.