PH.D. DEFENCE - PUBLIC SEMINAR

Generalizable Deep Learning: Addressing Distribution Shifts for Visual Recognition

Speaker
Mr. Zhang Yifan
Advisor
Dr Bryan Hooi Kuen Yew, Assistant Professor, School of Computing


12 Sep 2024 Thursday, 10:00 AM to 11:30 AM

MR20, COM3-02-59

Abstract:

Deep foundation models have revolutionized the field of artificial intelligence and machine learning. However, their vulnerability to distribution shifts, including data distribution shifts (covariate shifts) and class distribution shifts (concept drift), hinders their effective deployment in real-world applications like healthcare and autonomous driving. In this talk, we aim to advance generalizable deep learning by developing innovative techniques for addressing challenges arising from both data and class distribution shifts.

First, we explore Test-time Adaptation (TTA) to address distribution shifts between training and test data. Our primary goal is to address the stability issues of TTA in real-world applications. We find that batch normalization layers are a critical factor in TTA instability, especially under challenging conditions such as mixed distribution shifts and small batch sizes. To mitigate this issue, we propose a novel Sharpness-aware and Reliable Entropy Minimization (SAR) method, which selectively removes noisy samples from adaptation and pushes the model towards a stable, flat minimum.
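As a rough illustration of the reliable-sample idea, the sketch below filters test samples by prediction entropy before they would contribute to an entropy-minimization loss. This is a minimal NumPy toy, not SAR itself: the sharpness-aware weight perturbation is omitted, and the threshold factor and function names are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def select_reliable(logits, margin=0.4):
    """Keep only test samples whose prediction entropy is below a
    fraction of the maximum possible entropy log(C); the rest are
    treated as noisy and excluded from entropy minimization."""
    probs = softmax(logits)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    threshold = margin * np.log(logits.shape[1])  # fraction of log(C)
    return entropy < threshold

# A confident prediction is kept; a near-uniform one is filtered out.
logits = np.array([[5.0, 0.0, 0.0],   # confident -> reliable
                   [0.1, 0.0, 0.0]])  # near-uniform -> noisy
mask = select_reliable(logits)
```

In the full method, only the surviving samples would drive the adaptation update, which is what stabilizes TTA under small batches and mixed shifts.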

Second, we explore long-tailed recognition to address class distribution shifts between training and test data. We focus on test-agnostic long-tailed recognition, where the training class distribution is long-tailed, while the test class distribution is unknown and not necessarily balanced. Besides class imbalance, this task poses an additional key challenge: the class distribution shifts between training and test data are unknown. To address this task, we propose a novel framework called Self-supervised Aggregation of Diverse Experts (SADE), with a new skill-diverse expert learning strategy and a novel test-time expert aggregation strategy.
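The test-time aggregation idea can be sketched as follows: combine expert predictions with learnable weights, and tune those weights to maximize agreement between two augmented views of the same unlabeled test data. This is a toy NumPy version under stated assumptions (a numeric gradient instead of the real training procedure, and synthetic expert outputs); it is not the actual SADE implementation.

```python
import numpy as np

def aggregate(expert_probs, w):
    """Softmax-weighted mixture of per-expert class probabilities.
    expert_probs has shape (experts, samples, classes)."""
    a = np.exp(w - w.max())
    a /= a.sum()
    return np.einsum('e,enc->nc', a, expert_probs)

def fit_weights(view1, view2, steps=50, lr=0.5, eps=1e-4):
    """Self-supervised weight fitting: maximize the agreement (dot
    product) between aggregated predictions of two augmented views,
    here via a simple numeric-gradient ascent for illustration."""
    w = np.zeros(view1.shape[0])
    score = lambda w: (aggregate(view1, w) * aggregate(view2, w)).sum()
    for _ in range(steps):
        g = np.zeros_like(w)
        for i in range(w.size):
            d = np.zeros_like(w); d[i] = eps
            g[i] = (score(w + d) - score(w - d)) / (2 * eps)
        w = w + lr * g
    return w

# Expert 0 predicts consistently across views; experts 1 and 2 do not.
sharp = np.eye(3)[[0, 1, 2, 0]] * 0.85 + 0.05   # stable expert
always0 = np.tile([1.0, 0.0, 0.0], (4, 1))
always1 = np.tile([0.0, 1.0, 0.0], (4, 1))
uniform = np.full((4, 3), 1 / 3)
view1 = np.stack([sharp, always0, uniform])
view2 = np.stack([sharp, always1, uniform])
w = fit_weights(view1, view2)
```

The learned weights concentrate on the expert whose predictions are stable across augmentations, which is the self-supervised signal that lets aggregation adapt to an unknown test class distribution without labels.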

Lastly, we explore foundation generative models to address the generalization problem of models in small-scale data scenarios. The generalization of DNNs relies heavily on the quantity and quality of training data. However, collecting and annotating data at scale is often expensive and time-consuming. To address this issue, we explore a new task, termed dataset expansion, which aims to expand a ready-to-use small dataset by automatically creating new labeled samples. To this end, we present a Guided Imagination Framework (GIF) that leverages cutting-edge generative models like DALL-E 2 and Stable Diffusion (SD) to "imagine" and create informative new data from the input seed data.
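The expand-then-select loop can be caricatured in a few lines: perturb a seed sample, then keep only candidates that a classifier still assigns to the seed's label. This toy NumPy sketch uses random feature-space noise as a stand-in for a real generative model's latent perturbation; `toy_classifier`, `expand`, and the confidence threshold are illustrative assumptions, not GIF's actual guidance criteria.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_classifier(x, W):
    """Linear softmax model standing in for a pretrained classifier."""
    z = x @ W
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def expand(seed, label, W, n_candidates=20, noise=0.5, min_conf=0.8):
    """Perturb a seed sample (stand-in for generative 'imagination'),
    then keep only candidates that still classify as the seed's label,
    a toy analogue of class-maintained guided selection."""
    cands = seed + noise * rng.standard_normal((n_candidates, seed.size))
    conf = toy_classifier(cands, W)[:, label]
    return cands[conf >= min_conf]

W = np.eye(2)                   # 2 features -> 2 classes
seed = np.array([5.0, 0.0])     # confidently class 0
new_samples = expand(seed, label=0, W=W)
```

In the real framework, the perturbation happens in a diffusion or DALL-E 2 latent space and the guidance additionally rewards informative, diverse samples rather than just label preservation.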

In summary, we introduce a suite of techniques to improve the generalizability of deep models for real-world applications in the face of data and class distribution shifts.