PH.D DEFENCE - PUBLIC SEMINAR

Multi-channel correlation filters with limited boundaries: theory and applications

Speaker
Mr. Hamed Kiani Galoogahi
Advisor
Dr Terence Sim, Associate Professor, School of Computing


08 Aug 2014 Friday, 10:30 AM to 12:00 PM

MR6, AS6-05-10

Abstract

Correlation filters have been widely used in computer vision for pattern recognition and matching. The core idea of all correlation filters is to learn a filter/template that produces desired correlation outputs when correlated with a set of training examples. Correlation filters exhibit a number of characteristics that make them interesting to the vision community, e.g. shift-invariance, robustness to noise, closed-form solutions and, most importantly, memory and computational efficiency.
In spite of recent progress in correlation filters, there remains plenty of scope for new extensions and improvements of traditional correlation filters for vision problems.
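
As a rough illustration of the core idea described above, the following sketch trains a traditional single-channel filter in closed form in the Fourier domain (a MOSSE-style regularised least-squares solution). It is not the multi-channel formulation proposed in the thesis. The function names, the regularisation weight lam and the synthetic Gaussian target are assumptions made for the example.

```python
import numpy as np

def train_filter(images, targets, lam=1e-2):
    """Closed-form single-channel correlation filter (MOSSE-style sketch).

    Returns the conjugate filter in the Fourier domain, so that
    response = ifft2(H_conj * fft2(image)).
    """
    num = 0.0
    den = lam  # regularisation keeps the division well conditioned
    for x, g in zip(images, targets):
        X = np.fft.fft2(x)
        G = np.fft.fft2(g)
        num = num + G * np.conj(X)         # cross-spectrum of target and image
        den = den + (X * np.conj(X)).real  # energy spectrum of the image
    return num / den

def correlate(H_conj, x):
    """Apply the learned filter to an image and return the real-valued response."""
    return np.real(np.fft.ifft2(H_conj * np.fft.fft2(x)))

def peak_location(response):
    """Row and column of the strongest response."""
    idx = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in idx)

# Synthetic training example: random image, desired output is a Gaussian peak.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
rows, cols = np.mgrid[0:32, 0:32]
g = np.exp(-((rows - 10) ** 2 + (cols - 12) ** 2) / (2 * 2.0 ** 2))

H_conj = train_filter([x], [g])
peak = peak_location(correlate(H_conj, x))

# Shifting the input circularly shifts the response peak by the same amount.
shifted_peak = peak_location(correlate(H_conj, np.roll(x, (3, 5), axis=(0, 1))))
```

In this toy setting the response to the training image peaks at the centre of the desired output, (10, 12), and the circularly shifted image moves the peak to (13, 17), which is the shift-invariance property mentioned above.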

In this research, we introduce the following improvements to correlation filter theory for vision applications. First, traditional correlation filters are limited to single-channel image representations (e.g. intensity). We propose an extension to canonical correlation filter theory that is able to handle multi-channel signals/features, which we refer to as multi-channel correlation filters. This allows one to exploit modern image descriptors (e.g. HOG and SIFT) to learn discriminative filters for challenging pattern classification and detection tasks.

Second, we demonstrate that multi-channel correlation filters can be directly applied to learn spatio-temporal patterns in videos with no extra memory or computation overhead. Third, traditional correlation filters are trained on shifted patches that are created implicitly by circular boundary effects. These shifted patches are not representative of real patches and can drastically reduce the discriminative power of the trained filter. We propose correlation filters with limited boundaries that can significantly reduce the number of patches affected by boundary effects.
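
The boundary effect can be illustrated with a small counting sketch. It assumes a 1-D signal of length T and a filter support of length D, and counts how many circular shifts yield a patch free of wrap-around. The function name and the parameters T and D are hypothetical, chosen only for this example; it illustrates the general idea rather than the formulation used in the thesis.

```python
import numpy as np

def count_clean_patches(T, D):
    """Count circular shifts of a length-T signal whose first D samples
    form a contiguous patch, i.e. contain no wrap-around."""
    idx = np.arange(T)
    clean = 0
    for s in range(T):
        patch_idx = np.roll(idx, -s)[:D]     # sample indices of the patch at shift s
        if np.all(np.diff(patch_idx) == 1):  # contiguous indices mean no wrap-around
            clean += 1
    return clean

# Filter as large as the signal: only the unshifted patch is free of wrap-around.
same_size = count_clean_patches(8, 8)

# Filter support smaller than the signal: most shifts give real patches.
smaller_support = count_clean_patches(32, 8)
```

When the filter support equals the signal length, only 1 of the 8 shifts is a real patch, whereas a support of 8 samples within a signal of 32 leaves 25 of the 32 shifts unaffected.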

Finally, we propose to apply a set of multi-channel correlation filters with different spatial supports in a cascaded framework for coarse-to-fine facial landmark detection. We demonstrate the superior performance and the memory and computational efficiency of all the techniques proposed in this thesis through an extensive set of experiments including visual object tracking, object localization, human action recognition and robust facial landmark detection.