PH.D. DEFENCE - PUBLIC SEMINAR

Sample-Efficient Automated Machine Learning with Bayesian Optimization

Speaker
Mr Dai Zhongxiang
Advisor
Dr Low Kian Hsiang, Associate Professor, School of Computing


22 Jun 2021 Tuesday, 10:00 AM to 11:30 AM

Zoom presentation

Abstract:

Automated hyperparameter optimization of machine learning (ML) models, referred to as AutoML, has long been a challenging problem for practitioners, mainly due to the high computational cost of training modern ML models and the lack of gradient information with respect to the model hyperparameters. Consequently, the black-box optimization method of Bayesian optimization (BO) has become a prominent approach for optimizing the hyperparameters of ML models, thanks to its impressive sample efficiency and theoretical convergence guarantee. Despite recent advances, there remain important scenarios where the sample efficiency of BO for AutoML can be further improved by exploiting naturally available auxiliary information, or where the applicability of BO can be extended to other ML tasks. This thesis identifies five such scenarios and, for each of them, proposes a novel BO algorithm that is both theoretically grounded and practically effective.
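To make the setting concrete, the following is a minimal sketch of the standard BO loop that the thesis builds on: a Gaussian process (GP) surrogate is fit to the hyperparameter evaluations observed so far, and the next query maximizes an upper confidence bound (GP-UCB) acquisition. This is generic background, not code from the thesis; the toy one-dimensional objective and all function names are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    # Squared-exponential kernel between row vectors of A and B.
    d2 = (A[:, None, :] - B[None, :, :]) ** 2
    return np.exp(-d2.sum(-1) / (2 * lengthscale ** 2))

def gp_posterior(X, y, Xs, noise=1e-4):
    # Standard GP posterior mean and variance at test points Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf_kernel(Xs, Xs)) - (v ** 2).sum(0)
    return mu, np.maximum(var, 1e-12)

def bo_ucb(f, bounds, n_init=3, n_iter=15, beta=2.0, seed=0):
    # BO loop: fit GP, pick the maximizer of the UCB score mu + beta*sigma.
    rng = np.random.default_rng(seed)
    X = rng.uniform(*bounds, size=(n_init, 1))
    y = np.array([f(x[0]) for x in X])
    grid = np.linspace(*bounds, 200)[:, None]
    for _ in range(n_iter):
        mu, var = gp_posterior(X, y, grid)
        x_next = grid[np.argmax(mu + beta * np.sqrt(var))]
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next[0]))
    return X[np.argmax(y), 0], y.max()

# Toy black-box objective with its maximum at x = 0.3.
x_best, y_best = bo_ucb(lambda x: -(x - 0.3) ** 2, (0.0, 1.0))
```

Each of the five contributions below modifies some part of this loop: when to stop an evaluation, who supplies the observations, and how the surrogate or acquisition is constructed.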

Firstly, many ML models are trained iteratively, so every hyperparameter evaluation during BO must run for a number of training epochs. As a result, the auxiliary observations from intermediate training epochs can be exploited to early-stop the evaluations of unpromising hyperparameter configurations and save resources. We propose the BO with Bayesian optimal stopping (BO-BOS) algorithm, which incorporates BOS into BO to improve the epoch efficiency of BO through a principled optimal stopping mechanism. BO-BOS preserves the asymptotic no-regret property of BO under our specified setting of the BOS parameters, which admits an elegant interpretation in terms of the exploration-exploitation trade-off, and performs competitively in a number of AutoML experiments.
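As a loose illustration of the early-stopping idea only (not the actual BOS decision rule, which uses a principled optimal stopping formulation), one can stop a training run when an optimistic extrapolation of the observed learning curve still falls short of the incumbent. The heuristic and all names below are hypothetical.

```python
def should_stop(curve, incumbent, margin=0.02):
    # Hypothetical early-stopping heuristic: compute an optimistic bound on
    # the final validation accuracy from the recent improvement rate, and
    # stop if even that bound trails the incumbent by `margin`.
    recent_gain = max(curve[-1] - curve[max(0, len(curve) - 3)], 0.0)
    optimistic_final = curve[-1] + 5 * recent_gain  # assume ~5 more epochs of similar gains
    return optimistic_final + margin < incumbent

# A plateaued mediocre run is stopped; an improving run keeps training.
stop_flat = should_stop([0.50, 0.51, 0.515], incumbent=0.90)
stop_rising = should_stop([0.50, 0.60, 0.70], incumbent=0.90)
```

The point of BOS is to replace such ad hoc thresholds with an optimal stopping rule whose parameters provably preserve BO's no-regret property.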

Secondly, the widely celebrated federated learning (FL) setting supports only first-order optimization techniques and is hence unable to handle zeroth-order optimization tasks such as hyperparameter optimization. We extend BO to the FL setting (federated BO, or FBO) and derive the federated Thompson sampling (FTS) algorithm, which improves the efficiency of BO in the FL setting by exploiting auxiliary information from other agents. FTS tackles a number of major challenges faced by FBO in a principled way: it uses a random Fourier features approximation to derive the parameters to be communicated so that raw data are never shared, adopts the Thompson sampling algorithm to reduce the number of parameters exchanged, and is robust against heterogeneous agents thanks to a robust theoretical convergence guarantee.
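The two ingredients named above can be sketched generically: random Fourier features (RFF) turn the GP surrogate into Bayesian linear regression over a finite weight vector, so an agent can communicate (samples of) that weight vector instead of its raw observations, and Thompson sampling draws one such vector to define the next query. This is a standard-techniques sketch under assumed shapes and names, not the FTS communication protocol itself.

```python
import numpy as np

def rff_features(X, W, b):
    # Random Fourier features approximating an RBF kernel:
    # phi(x) = sqrt(2/M) * cos(W x + b), with W ~ N(0, 1/lengthscale^2).
    M = W.shape[0]
    return np.sqrt(2.0 / M) * np.cos(X @ W.T + b)

def local_posterior(X, y, W, b, noise=0.01):
    # Bayesian linear regression in RFF space; the posterior over the
    # finite weight vector is what an agent could share instead of raw data.
    Phi = rff_features(X, W, b)
    S_inv = Phi.T @ Phi / noise + np.eye(Phi.shape[1])
    mean = np.linalg.solve(S_inv, Phi.T @ y / noise)
    return mean, S_inv

def thompson_sample(mean, S_inv, rng):
    # One posterior sample of the weights defines a sampled surrogate
    # function, whose maximizer becomes the next query point.
    return rng.multivariate_normal(mean, np.linalg.inv(S_inv))

# Fit the RFF posterior on a toy function and draw one Thompson sample.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (30, 1))
y = np.sin(3 * X[:, 0])
W = rng.normal(0, 3.0, (200, 1))          # frequencies for lengthscale ~ 0.33
b = rng.uniform(0, 2 * np.pi, 200)
mean, S_inv = local_posterior(X, y, W, b)
w_sample = thompson_sample(mean, S_inv, rng)
```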

Thirdly, the FTS algorithm above is unfortunately not equipped with a rigorous privacy guarantee, which is an important consideration in FL. To this end, we integrate differential privacy (DP) into FTS through a general framework for adding DP to iterative algorithms. Moreover, we leverage the ability of this general DP framework to handle different parameter vectors, together with the technique of local modeling for BO, to further improve the utility of our algorithm through distributed exploration (DE). The resulting DP-FTS-DE algorithm improves an agent's sample efficiency by exploiting auxiliary information from other agents while rigorously hiding its participation in the algorithm. DP-FTS-DE admits a number of interesting theoretical insights regarding the privacy-utility trade-off, and achieves competitive utility with strong privacy guarantees in real-world experiments.
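The standard building block for adding DP to an iterative algorithm that communicates vectors is the Gaussian mechanism: clip each vector's L2 norm to bound its sensitivity, then add calibrated Gaussian noise. The sketch below shows only this generic primitive, with hypothetical parameter names; how DP-FTS-DE composes it across rounds and subregions is part of the thesis, not shown here.

```python
import numpy as np

def gaussian_mechanism(vec, clip_norm, noise_multiplier, rng):
    # Clip the vector to L2 norm <= clip_norm (bounding the sensitivity),
    # then add Gaussian noise proportional to that sensitivity. Larger
    # noise_multiplier -> stronger privacy, lower utility.
    norm = np.linalg.norm(vec)
    clipped = vec * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_multiplier * clip_norm, size=vec.shape)

rng = np.random.default_rng(0)
raw = np.ones(4) * 10.0                      # L2 norm 20, above the clip threshold
private = gaussian_mechanism(raw, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

The privacy-utility trade-off discussed in the abstract is visible directly in the two knobs: `clip_norm` and `noise_multiplier`.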

Fourthly, when BO is used for hyperparameter optimization on a dataset, we often have access to previously completed hyperparameter optimization tasks on other, potentially related datasets. This raises the question of whether these completed tasks can be leveraged to improve the efficiency of the current BO task through meta-learning, while ensuring robustness against dissimilar tasks. We introduce a scalable, principled, and robust meta-BO algorithm called robust meta-Gaussian process-upper confidence bound (RM-GP-UCB). We show that RM-GP-UCB is asymptotically no-regret even when all previous tasks are dissimilar to the current task, and admits a principled method for learning the weights assigned to the individual previous tasks through regret minimization via online learning. RM-GP-UCB performs effectively in a wide range of real-world experiments.
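A crude way to picture the two pieces, purely as an illustration: blend the current task's UCB score with posterior means from previous tasks, weighted by task similarity, and update those weights multiplicatively so that tasks whose predictions incur higher loss on the current task are down-weighted. RM-GP-UCB's actual acquisition and its online-learning weight update differ in their details; every name below is hypothetical.

```python
import numpy as np

def meta_ucb(mu, sigma, prior_means, weights, beta=2.0):
    # Blend the current task's GP-UCB score with weighted posterior means
    # from previous tasks evaluated on the same candidate points.
    meta_term = sum(w * m for w, m in zip(weights, prior_means))
    return mu + beta * sigma + meta_term

def update_weights(weights, losses, eta=0.5):
    # Multiplicative-weights update from online learning: previous tasks
    # that predict the current task poorly (high loss) lose influence.
    w = weights * np.exp(-eta * np.asarray(losses))
    return w / w.sum()

# Two previous tasks; the first predicts the current task much better.
w = update_weights(np.array([0.5, 0.5]), losses=[0.1, 2.0])
scores = meta_ucb(np.zeros(3), np.ones(3),
                  [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])], w)
```

With all weights driven to zero, the score degenerates to plain GP-UCB on the current task, which is one intuition for why robustness against fully dissimilar tasks is achievable.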

Lastly, many ML tasks, such as adversarial ML, can be modeled as repeated games between boundedly rational, self-interested agents with unknown, complex, and costly-to-evaluate payoff functions. We introduce a recursive reasoning formalism of BO, called recursive reasoning-based BO (R2-B2), which extends the applicability of BO to provide efficient strategies for players in this type of game. Under certain conditions, using R2-B2 to reason at one level higher than the other agents achieves faster asymptotic convergence to no regret than without recursive reasoning. R2-B2 performs effectively in practice in adversarial ML and multi-agent reinforcement learning experiments.
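The recursive reasoning idea can be pictured with the classic level-k scheme on a finite zero-sum payoff matrix: level 0 plays uniformly at random, and level k best-responds to an opponent assumed to reason at level k-1. This toy stand-in ignores what makes R2-B2 nontrivial (unknown, costly payoffs modeled with GPs over continuous action spaces); it only shows the recursion itself.

```python
import numpy as np

def level_k_action(payoff, k):
    # Level-k reasoning on a finite zero-sum game. Rows of `payoff` are our
    # actions, columns the opponent's; the opponent's payoff matrix (from
    # its own perspective) is -payoff.T.
    n_self = payoff.shape[0]
    if k == 0:
        return np.ones(n_self) / n_self          # level 0: uniform mixed strategy
    opp_strategy = level_k_action(-payoff.T, k - 1)  # opponent reasons at level k-1
    expected = payoff @ opp_strategy             # our expected payoff per action
    best = np.zeros(n_self)
    best[np.argmax(expected)] = 1.0              # best response (pure strategy)
    return best

# Reasoning one level deeper flips the chosen action in this toy game.
payoff = np.array([[1.0, 0.0],
                   [0.0, 2.0]])
a1 = level_k_action(payoff, 1)
a2 = level_k_action(payoff, 2)
```

Against a level-0 opponent the second row (payoff 2) is best, but a level-2 reasoner anticipates a level-1 opponent avoiding that column and switches, mirroring the abstract's claim that reasoning one level higher than the other agents pays off.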