PH.D. DEFENCE - PUBLIC SEMINAR

Regularization at Ease for Deep Learning Applications

Speaker
Mr Luo Zhaojing
Advisor
Dr Ooi Beng Chin, Lee Kong Chian Centennial Professor, School of Computing


27 Nov 2019 Wednesday, 10:00 AM to 11:30 AM

Executive Classroom, COM2-04-02

Abstract:

Deep Learning (DL) and Machine Learning models have recently been shown to be effective in many real-world applications. While these models achieve increasingly better predictive performance, their structures have also become much more complex. A common and difficult problem for complex models is overfitting. Regularization penalizes the complexity of a model in order to avoid overfitting. However, in most learning frameworks, the regularization function is specified through hyper-parameters, whose best setting is difficult to find. In this thesis, we work towards automatically learning regularization for DL models. Specifically, we propose an adaptive regularization method, a knowledge driven regularization method for the input layer, and a knowledge driven regularization method for the hidden layers.
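
Concretely, most frameworks minimize a penalized objective of the following form, where L(theta) is the training loss and the penalty and its strength lambda are the hand-set hyper-parameters referred to above (the L2 choice of penalty is only illustrative):

    \min_{\theta} \; L(\theta) + \lambda\,\Omega(\theta),
    \qquad \text{e.g. } \Omega(\theta) = \lVert\theta\rVert_2^2 .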

First, from a Bayesian viewpoint, the regularization term corresponds to a prior distribution over the model parameters. At the same time, the intermediate model parameters learned during training are informative for approximating the actual prior distribution of the model parameters. Based on this insight, we propose an adaptive regularization method, GM-Reg, which captures the actual prior distribution of the model parameters in order to exert the best regularization on them.
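
For reference, the Bayesian correspondence invoked here is the standard maximum-a-posteriori (MAP) identity, a textbook fact rather than anything specific to GM-Reg:

    \hat{\theta}_{\mathrm{MAP}}
    = \arg\max_{\theta}\, p(\theta \mid \mathcal{D})
    = \arg\min_{\theta}\, \big[-\log p(\mathcal{D} \mid \theta) - \log p(\theta)\big],

so the negative log-prior -log p(theta) plays the role of the regularization term: a fixed Gaussian prior recovers the usual L2 penalty, whereas GM-Reg instead re-estimates the prior from the intermediate parameters observed during training.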

Second, the success of DL models is typically associated with large amounts of training data. However, in many real-world tasks, the training data for domain-specific applications is insufficient. Fortunately, knowledge associated with such applications is often available as unstructured data and can be used to complement the limited training data. Techniques based on word embeddings are typically used to incorporate knowledge from external corpora, but they are not the best fit for some domain-specific tasks because they cannot handle polysemy and synonyms well. Consequently, we propose LDA-Reg, a novel knowledge driven regularization method based on Latent Dirichlet Allocation (LDA), as an alternative to word embedding methods that adaptively incorporates abundant knowledge into DL models and prevents overfitting.
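
As a rough illustration of the idea, the following minimal NumPy sketch penalizes input-layer weights that deviate from a reconstruction based on each word's LDA topic proportions. The names (theta, B, lda_reg_penalty) and the specific quadratic form are assumptions made for this sketch, not the published LDA-Reg formulation:

    import numpy as np

    rng = np.random.default_rng(0)
    V, K, H = 1000, 20, 64  # vocabulary size, LDA topics, hidden units (illustrative)

    # Word-topic proportions, as would be obtained by fitting LDA on an
    # external corpus; here they are randomly drawn stand-ins.
    theta = rng.dirichlet(np.ones(K), size=V)  # shape (V, K)

    # Input-layer weights of the DL model: one H-dimensional vector per word.
    W = rng.normal(scale=0.1, size=(V, H))     # shape (V, H)

    # Hypothetical shared topic-to-hidden basis coupling words with similar
    # topic proportions (an assumption of this sketch).
    B = rng.normal(scale=0.1, size=(K, H))     # shape (K, H)

    def lda_reg_penalty(W, theta, B, lam=1e-3):
        """Penalty pulling each word's weights toward its topic-based reconstruction."""
        residual = W - theta @ B               # shape (V, H)
        return lam * np.sum(residual ** 2)

    def lda_reg_grad_W(W, theta, B, lam=1e-3):
        """Gradient of the penalty w.r.t. W; added to the task-loss gradient."""
        return 2.0 * lam * (W - theta @ B)

    print(lda_reg_penalty(W, theta, B))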

Third, LDA-Reg exerts knowledge driven regularization only on the input layer of a DL model and cannot be exploited for its hidden layers. We therefore propose CORR-Reg, which takes advantage of correlation knowledge between hidden layers to exert knowledge driven regularization on them.
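
A minimal sketch of what a correlation-weighted penalty between two adjacent hidden layers could look like; the use of Pearson correlations over mini-batch activations and the (1 - |corr|) weighting are assumptions of this sketch, not the exact CORR-Reg definition:

    import numpy as np

    rng = np.random.default_rng(1)
    N, D1, D2 = 256, 32, 16  # mini-batch size, widths of two adjacent hidden layers

    # Activations of two consecutive hidden layers for one mini-batch
    # (random stand-ins for an actual forward pass).
    h1 = rng.normal(size=(N, D1))
    h2 = np.tanh(h1 @ rng.normal(scale=0.2, size=(D1, D2)))

    # Correlation knowledge: |Pearson correlation| between every pair of
    # neurons across the two layers.
    corr = np.corrcoef(np.hstack([h1, h2]), rowvar=False)[:D1, D1:]  # (D1, D2)
    significance = np.abs(corr)

    def corr_reg_penalty(W, significance, lam=1e-3):
        """Weighted L2: connections between weakly correlated neurons are shrunk harder."""
        return lam * np.sum((1.0 - significance) * W ** 2)

    W12 = rng.normal(scale=0.1, size=(D1, D2))  # weights connecting the two layers
    print(corr_reg_penalty(W12, significance))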

For all three regularization methods, we develop effective update methods that update the model parameters and the regularization-related parameters respectively. To improve efficiency, we design a lazy update method and a sparse update method that reduce the computational cost.
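
To give a flavour of how a lazy update saves computation, the sketch below applies the accumulated weight decay to a coordinate only when its feature is next active, instead of decaying every parameter at every step. This mirrors the general lazy-update idea for sparse data; the thesis's actual update rules for the regularization-related parameters may differ:

    import numpy as np

    lr, lam = 0.1, 0.01                 # learning rate and regularization strength
    w = np.ones(5)                      # model parameters
    last_step = np.zeros(5, dtype=int)  # step at which each coordinate was last updated

    def lazy_update(active, grads, step):
        """Catch up on skipped decay for active coordinates, then take a gradient step."""
        for j, g in zip(active, grads):
            w[j] *= (1.0 - lr * lam) ** (step - last_step[j])  # accumulated decay
            w[j] -= lr * g                                     # SGD step on this coordinate
            last_step[j] = step

    # Two sparse mini-batches: only the listed coordinates have nonzero gradients.
    for t, (active, grads) in enumerate([([0, 2], [0.5, -0.3]),
                                         ([2, 4], [0.1, 0.2])], start=1):
        lazy_update(active, grads, t)
    print(w)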

We validate the effectiveness of our regularization methods through extensive experimental studies on standard benchmark datasets and different kinds of deep learning and machine learning models. The results show that our proposed regularization methods achieve significant improvements over their respective baselines. Additionally, since all three proposed methods are flexible and general enough to apply to different kinds of DL models, we integrate them into the GEMINI software stack, which is designed to support healthcare data analytics.