CS SEMINAR

A Deep Learning Regression Approach to Spectral Mapping with Application to Speech Enhancement, Source Separation and Speech Dereverberation

Speaker
Professor Chin-Hui Lee
School of Electrical and Computer Engineering
Georgia Institute of Technology

Chaired by
Dr WANG Ye, Associate Professor, School of Computing
wangye@comp.nus.edu.sg

22 Dec 2017 Friday, 04:00 PM to 05:00 PM

MR1, COM1-03-19

Abstract:

We cast classical speech processing problems into a new nonlinear regression setting by using deep neural networks (DNNs) to map the log power spectral features of noisy speech to those of clean speech. DNN-enhanced speech obtained by the proposed approach demonstrates better quality and intelligibility than speech obtained with conventional state-of-the-art algorithms. Furthermore, this new paradigm facilitates an integrated deep learning framework that trains the three key modules of an automatic speech recognition (ASR) system, namely signal conditioning, feature extraction and acoustic phone models, together in a unified manner. The proposed framework was tested on the recent challenging ASR tasks CHiME-2, CHiME-4 and REVERB, which are designed to evaluate ASR robustness in mixed-speaker, multi-channel and reverberant conditions, respectively. Leveraging this new approach, our team achieved the lowest word error rates in all three tasks, using acoustic pre-processing algorithms for speech separation, microphone array-based speech enhancement and speech dereverberation.
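The regression setting described above can be illustrated with a minimal sketch. It is not the speaker's system: the synthetic chirp signal, the white-noise corruption, the frame sizes, the single tanh hidden layer and the names log_power_spectrum and train_regressor are all assumptions chosen to keep the example small and self-contained. It extracts log power spectral features from a noisy and a clean signal, then fits a small network to map one to the other by minimizing mean squared error.

```python
import numpy as np

def log_power_spectrum(signal, frame_len=256, hop=128, eps=1e-10):
    """Frame the signal, apply a Hann window, return log |FFT|^2 per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spectrum) ** 2 + eps)  # (n_frames, frame_len // 2 + 1)

def train_regressor(x, y, hidden=64, lr=0.02, epochs=300, seed=0):
    """Fit a one-hidden-layer network to map x to y (full-batch gradient descent, MSE)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, np.sqrt(1.0 / x.shape[1]), (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, np.sqrt(1.0 / hidden), (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)          # hidden activations
        err = (h @ w2 + b2) - y           # prediction error
        losses.append(float(np.mean(err ** 2)))
        g_pred = 2.0 * err / err.size     # dLoss/dPrediction
        g_z = (g_pred @ w2.T) * (1.0 - h ** 2)  # backprop through tanh
        w2 -= lr * (h.T @ g_pred)
        b2 -= lr * g_pred.sum(axis=0)
        w1 -= lr * (x.T @ g_z)
        b1 -= lr * g_z.sum(axis=0)
    return (w1, b1, w2, b2), losses

# Synthetic stand-in for speech (assumption): an amplitude-modulated chirp.
rng = np.random.default_rng(0)
t = np.arange(16000) / 8000.0
clean = np.sin(2 * np.pi * (200 * t + 300 * t ** 2)) * (0.5 + 0.5 * np.sin(2 * np.pi * 2 * t))
noisy = clean + 0.3 * rng.normal(size=clean.shape)

noisy_lps = log_power_spectrum(noisy)
clean_lps = log_power_spectrum(clean)

# Normalize inputs and targets with global statistics so errors stay near unit scale.
x = (noisy_lps - noisy_lps.mean()) / noisy_lps.std()
y = (clean_lps - clean_lps.mean()) / clean_lps.std()

params, losses = train_regressor(x, y)
print(f"MSE before training: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

A practical system would train on many hours of paired noisy and clean recordings, use deeper networks with context windows of neighboring frames, and resynthesize a waveform from the predicted spectra; none of that is attempted here.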


Biodata:

Chin-Hui Lee is a professor at the School of Electrical and Computer Engineering, Georgia Institute of Technology. Before joining academia in 2001, he accumulated 20 years of industrial experience, ending at Bell Laboratories, Murray Hill, as a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. Dr. Lee is a Fellow of the IEEE and a Fellow of the International Speech Communication Association (ISCA). He has published over 500 papers and holds 30 patents, with more than 30,000 citations and an h-index of 75 on Google Scholar. He has received numerous awards, including the Bell Labs President's Gold Award in 1998. He won the SPS's 2006 Technical Achievement Award for "Exceptional Contributions to the Field of Automatic Speech Recognition". In 2012 he gave an ICASSP plenary talk on the future of automatic speech recognition. In the same year he was awarded the ISCA Medal in scientific achievement for "pioneering and seminal contributions to the principles and practice of automatic speech and speaker recognition".