PhD DEFENCE - PUBLIC SEMINAR

Principled Learning

Speaker
Mr. Theivendiram Pranavan
Advisor
Dr Terence Sim Mong Cheng, Associate Professor, School of Computing


26 Mar 2024 Tuesday, 09:00 AM to 10:30 AM

MR20, COM3-02-59

Abstract:

This thesis investigates how the brain's learning principles can be incorporated into machine learning models. We focus on four such principles: continual learning, recency, similarity-based learning, and predictive coding.

Firstly, we propose a novel supervised machine learning strategy, inspired by human learning, that enables an Agent to learn continually over its lifetime. A natural consequence is that the Agent must be able to handle an input whose label is delayed until a later time or may not arrive at all. Our Agent learns in two steps: a short Seeding phase, in which the Agent's model is initialized with labelled inputs, and an indefinitely long Growing phase, in which the Agent refines and assesses its model when a label accompanies an input, but stores the input in a finite-length queue when the label is missing. Queued items are matched against input-label pairs that arrive later, and the model is then updated. Our strategy also allows the delayed feedback to take a different form. For example, in an image captioning task, the feedback could be a semantic segmentation rather than a textual caption. We show with extensive experiments that our strategy enables an Agent to learn flexibly and efficiently. In this work, we show how the principles of continual learning and recency improve performance on image classification and image captioning.
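The two-phase strategy above can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual implementation: the model here is a nearest-centroid classifier, and the matching rule for queued inputs is a simple distance threshold, both of which are assumptions made for brevity.

```python
from collections import deque

class Agent:
    """Sketch of the two-phase strategy: a short Seeding phase,
    then an indefinitely long Growing phase with a finite queue
    for inputs whose labels are delayed or missing."""

    def __init__(self, queue_size=100, match_threshold=1.0):
        self.centroids = {}  # label -> (running sum vector, count)
        self.queue = deque(maxlen=queue_size)  # finite-length queue
        self.match_threshold = match_threshold

    def _update(self, x, y):
        s, n = self.centroids.get(y, ([0.0] * len(x), 0))
        self.centroids[y] = ([si + xi for si, xi in zip(s, x)], n + 1)

    @staticmethod
    def _dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

    def seed(self, labelled_pairs):
        # Seeding phase: initialize the model with labelled inputs.
        for x, y in labelled_pairs:
            self._update(x, y)

    def grow(self, x, y=None):
        # Growing phase: learn when the label is given,
        # otherwise queue the input and wait for future feedback.
        if y is None:
            self.queue.append(x)
            return
        self._update(x, y)
        # Match queued unlabelled inputs against the newly arrived
        # input-label pair; matches inherit the label.
        matched = [q for q in self.queue
                   if self._dist(q, x) < self.match_threshold]
        for q in matched:
            self.queue.remove(q)
            self._update(q, y)

    def predict(self, x):
        # Nearest-centroid prediction.
        def centroid(y):
            s, n = self.centroids[y]
            return [si / n for si in s]
        return min(self.centroids, key=lambda y: self._dist(x, centroid(y)))
```

A usage sketch: seed with a few labelled points, stream in an unlabelled input (it is queued), then supply a nearby labelled input, which drains the matching queued item into the model.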

Secondly, we propose a novel way in which similarity is used to boost performance in multi-task learning. In supervised multi-task learning, auxiliary tasks are usually chosen manually. This immediately raises the issue of selecting suitable tasks, which in turn requires measuring the similarity between tasks. In this chapter, we propose a task-similarity metric that depends solely on the task labels, not on the machine learning models. We show that auxiliary tasks may be synthetically generated from the main tasks with any desired similarity, and that labels for these virtual tasks can also be generated. In addition, we show that learning these virtual tasks alongside the main task leads to real performance gains. In this work, we show how the principle of similarity-based learning helps improve multi-task learning.
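One plausible way to realize a label-only similarity metric and virtual-task generation can be sketched as below. Both the metric (fraction of agreeing label assignments) and the generation rule (re-labelling a fraction of examples) are illustrative assumptions; the thesis's actual metric and generation procedure may differ.

```python
import random

def label_similarity(labels_a, labels_b):
    """Similarity between two classification tasks over the same inputs,
    computed purely from their labels (no model involved): the fraction
    of examples on which the two label assignments agree."""
    agree = sum(a == b for a, b in zip(labels_a, labels_b))
    return agree / len(labels_a)

def make_virtual_task(labels, target_similarity, num_classes, seed=0):
    """Synthesize an auxiliary (virtual) task with a desired similarity
    to the main task by re-labelling a (1 - s) fraction of examples
    to a different, randomly chosen class."""
    rng = random.Random(seed)
    n = len(labels)
    n_flip = round((1 - target_similarity) * n)
    virtual = list(labels)
    for i in rng.sample(range(n), n_flip):
        # Always pick a class different from the original label,
        # so exactly n_flip assignments disagree with the main task.
        choices = [c for c in range(num_classes) if c != labels[i]]
        virtual[i] = rng.choice(choices)
    return virtual
```

Under this construction the achieved similarity matches the target exactly, because every re-labelled example disagrees with the main task by design.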

Thirdly, we propose a way to perform anomaly detection using predictive coding. Anomaly detection in multi-variate time series (MVTS) data poses a significant challenge, as it requires simultaneously representing long-term temporal dependencies and correlations across multiple variables. This complexity is often reduced by modelling one dependency at a time. This chapter proposes Time-series Representational Learning through Contrastive Predictive Coding (TRL-CPC) for anomaly detection in MVTS data. First, we jointly optimize an encoder, an auto-regressor, and a non-linear transformation function to effectively learn representations of the MVTS data sets for predicting future trends. It must be noted that the context vectors are representative of the observation window in the MVTS. Next, the latent representations for the successive instants, obtained through non-linear transformations of these context vectors, are contrasted with the latent representations of the encoder for the multiple variables such that the density for the positive pair is maximized. Thus, TRL-CPC helps to model the temporal dependencies and the correlations of the parameters for a healthy signal pattern. Then, the latent representations are fit to a Gaussian scoring function to detect anomalies. Evaluation of the proposed TRL-CPC on three MVTS data sets against SOTA anomaly detection methods shows the superiority of TRL-CPC. In this work, we use ideas from the principle of predictive coding to learn good representations for anomaly detection.
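The TRL-CPC pipeline above can be sketched at the level of its forward pass. This is a toy sketch under strong assumptions: the encoder, auto-regressor, and non-linear transformation are untrained stand-ins (a linear embedding, a simple recurrence in place of a GRU, and a tanh map), and the contrastive step is a standard InfoNCE loss; the thesis jointly trains these components, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

D, H = 4, 8                       # number of variables, latent size
W_enc = rng.normal(size=(D, H))   # encoder: per-timestep embedding
W_ar = rng.normal(size=(H, H))    # auto-regressor weights
W_tr = rng.normal(size=(H, H))    # non-linear transformation weights

def encode(x):
    """Encoder: map an observation window (T, D) to latents (T, H)."""
    return np.tanh(x @ W_enc)

def context(z):
    """Auto-regressor: summarize the observation window's latents
    into a context vector (toy recurrence in place of a GRU)."""
    c = np.zeros(H)
    for z_t in z:
        c = np.tanh(c @ W_ar + z_t)
    return c

def info_nce(c, z_candidates):
    """Contrast the predicted future latent (a non-linear transform of
    the context vector) against encoder latents, so the density ratio
    for the positive pair (row 0) is maximized."""
    pred = np.tanh(c @ W_tr)          # predicted future latent
    scores = z_candidates @ pred      # (K,): positive first, then negatives
    scores -= scores.max()            # numerical stability
    p = np.exp(scores) / np.exp(scores).sum()
    return -np.log(p[0])              # InfoNCE loss for this step

def gaussian_score(z, mu, cov_inv):
    """Anomaly score: Mahalanobis distance of a latent under a Gaussian
    fit to latents of healthy signal patterns."""
    d = z - mu
    return float(d @ cov_inv @ d)
```

In use, the Gaussian is fit to latents of healthy windows; a test latent far from that distribution receives a high score and is flagged as anomalous.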

Finally, we show the benefit of continual learning in medical imaging. This chapter discusses a practical scenario of continual learning in X-ray image processing, where multiple data sources may be unable to share information with one another due to regulations and confidentiality. We demonstrate that continual learning, through model sharing across multiple data sources, is a helpful tool for preserving data security.