Scaling up decision-making under uncertainty
Abstract:
Recent advances in deep learning and reinforcement learning have demonstrated impressively scalable performance in many tasks. This naturally raises the question whether approaches rooted in probabilistic methods can also be increased in scalability to be competitive with these approaches.
We answer this question in the affirmative in both the deep learning and reinforcement learning setting. We consider the problem scenario of decision-making under uncertainty, which is a general problem setting that arises both in deep learning and reinforcement learning.
In deep learning, an important problem setting is the early pruning of neural network elements. Here, early pruning means pruning ineffective network elements during the training process to speed up the training process. This is posed as a decision-making problem where decisions are made on which elements to keep or prune and when to prune these elements.
In reinforcement learning, an important problem setting is finding the optimal policy for a given task. This is posed as a decision-making problem where decisions are made on which policy to attempt next given the history of previous policies attempted so far.
We make progress on the above decision-making problems and show superior performance to competing approaches under specific scenarios.
To tie together these works, we isolate a scenario where Reinforcement Learning, an approach optimized by gradient descent, is empirically outperformed by Bayesian Optimization, a competing optimization approach. Gradient descent is at the heart of the deep learning and reinforcement learning revolution. We present a set of scenarios, termed Adversarially Designed Games, where Bayesian Optimization significantly outperforms gradient descent, thus showing the value of approaches rooted in probabilistic methods.
Finally, we present an open question on a suboptimality gap between gradient descent and Bayesian Optimization. If proved, this open question would definitively show there exist scenarios and problem settings where gradient descent is outperformed by Bayesian Optimization. We also present a preliminary proof design for solving this open problem.