A Compact Network Model for Learning in Distribution Space

Ms Connie Kou Khor Li
Dr Ng Teck Khim, Associate Professor (Practice), School of Computing
Dr Lee Hwee Kuan, Adjunct Associate Professor, School of Computing

Tuesday, 07 Apr 2020, 09:30 AM to 11:00 AM

Executive Classroom, COM2-04-02


Despite the superior performance of deep learning methods on a wide range of tasks, challenges remain in regression on function spaces. In this work, we address the problem of regression where the data are distributions. In contrast to prediction on single instances, distribution regression is useful for population-based studies and for problems that are inherently statistical in nature. However, neural networks are not designed for distribution inputs: the networks cannot encode function inputs compactly, since each node encodes only a single real value. To address this shortcoming, we propose a novel idea: we encode an entire function in each network node. Our proposed model, which we call the distribution regression network (DRN), propagates distributions layerwise through its nodes, with a propagation form inspired by statistical physics. We conduct theoretical analysis and derive several properties of DRN in comparison to conventional neural networks. DRN's propagation is highly regularized, with rich propagation behavior controlled by very few network parameters.
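To give a flavor of the idea, the sketch below shows a single DRN-style node acting on a discretized distribution: the output is obtained by integrating the input distribution against an energy-based kernel exp(-E) and renormalizing. The quadratic energy w*(s - s')^2 used here is an illustrative assumption, not the exact propagation form presented in the talk; the point is that one scalar weight governs the entire node's behavior.

```python
import numpy as np

def drn_node(p_in, w, support=(0.0, 1.0)):
    """Illustrative DRN-style node (assumed quadratic energy).

    The input is a discretized probability distribution p_in over a
    1D support; the output distribution is the normalized integral of
    exp(-E(s, s')) against p_in, so a single weight w controls the
    whole node's propagation behavior.
    """
    n = len(p_in)
    s = np.linspace(support[0], support[1], n)   # bin centres
    energy = w * (s[:, None] - s[None, :]) ** 2  # pairwise energy E(s, s')
    kernel = np.exp(-energy)
    p_out = kernel @ p_in                        # unnormalized propagation
    return p_out / p_out.sum()                   # renormalize to a pdf

# Toy input: a peaked histogram over 50 bins.
p = np.exp(-((np.linspace(0.0, 1.0, 50) - 0.3) ** 2) / 0.005)
p /= p.sum()
q = drn_node(p, w=20.0)
```

A larger w sharpens the kernel so the output stays close to the input, while a small w diffuses it toward uniform, illustrating how rich behavior can hinge on very few parameters.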

We conducted comprehensive experiments comparing DRN against other methods. Our findings show that DRN requires at least two times less training data and achieves better accuracy even as data sampling noise and task difficulty increase. Furthermore, the theoretical properties of DRN help explain its generalization performance.

As an application, we use DRN in a defense against adversarial attacks on convolutional neural networks by integrating it with existing transformation-based defenses. Exploiting the fact that these transformations are stochastic, our method samples a population of transformed images and performs the final classification on the distributions of their softmax probabilities. We train a separate, compact distribution classifier to recognize distinctive features in the distributions of softmax probabilities of transformed clean images. Without retraining the original CNN, our distribution classifier improves the performance of transformation-based defenses on both clean and adversarial images.
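The sampling step described above can be sketched as follows. Here `stochastic_transform` and `cnn_softmax` are hypothetical stand-ins (additive noise and a fixed linear-softmax map) for the actual randomized transformation and the unmodified CNN; the sketch only illustrates how per-class softmax samples are summarized as histograms before being handed to the distribution classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_transform(image, rng):
    # Stand-in for a randomized input transformation (e.g. random
    # resizing/padding or pixel dropout); here simply additive noise.
    return image + rng.normal(0.0, 0.1, size=image.shape)

def cnn_softmax(image):
    # Stand-in for the unmodified CNN's softmax output over 10 classes:
    # a fixed linear map followed by softmax, for illustration only.
    weights = np.linspace(-1.0, 1.0, image.size * 10).reshape(10, image.size)
    logits = weights @ image.ravel()
    z = np.exp(logits - logits.max())
    return z / z.sum()

def softmax_distribution_features(image, n_samples=50, n_bins=20, rng=rng):
    """Sample many stochastic transforms of one image, run the CNN on
    each, and summarize the per-class softmax values as normalized
    histograms: an (n_classes, n_bins) feature for the compact
    distribution classifier."""
    probs = np.stack([cnn_softmax(stochastic_transform(image, rng))
                      for _ in range(n_samples)])    # (n_samples, 10)
    feats = np.stack([
        np.histogram(probs[:, c], bins=n_bins, range=(0.0, 1.0))[0]
        for c in range(probs.shape[1])
    ]) / n_samples
    return feats

img = rng.random((8, 8))
features = softmax_distribution_features(img)        # shape (10, 20)
```

Because the original CNN is only queried, never retrained, the defense layers cleanly on top of any transformation-based scheme.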