PH.D. DEFENCE - PUBLIC SEMINAR

PLANNING UNDER UNCERTAINTY: FROM INFORMATIVE PATH PLANNING TO PARTIALLY OBSERVABLE SEMI-MDPS

Speaker
Mr Lim Zhan Wei
Advisor
Dr Lee Wee Sun, Associate Professor, School of Computing


13 Aug 2015 Thursday, 02:00 PM to 03:30 PM

MR3, COM2-02-26

Abstract:

Planning under uncertainty is crucial to the success of many autonomous systems. An agent acting in the real world often has to deal with uncertainty due to an unknown environment, noisy sensor measurements, and imprecise actuation. It also has to continuously adapt as circumstances unfold. The Partially Observable Markov Decision Process (POMDP) is an elegant and general framework for modeling planning under such uncertainties. Unfortunately, solving POMDPs is computationally intractable. This thesis examines useful subclasses of POMDPs and algorithms to solve them efficiently.
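As background (this is standard in the POMDP literature, not specific to the thesis): a POMDP is specified by a tuple (S, A, O, T, Z, R) of states, actions, observations, a transition function T(s' | s, a), an observation function Z(o | s', a), and a reward function R(s, a). Because the state is hidden, the agent plans over beliefs b, probability distributions over S, updated after taking action a and observing o by

    b'(s') = \eta \, Z(o \mid s', a) \sum_{s \in S} T(s' \mid s, a) \, b(s),

where \eta is a normalizing constant. The intractability stems from the belief space being a continuous, high-dimensional simplex.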

We look at informative path planning (IPP) problems, where an agent seeks a minimum-cost path to sense the world and gather information. IPP generalizes the well-known optimal decision tree problem from selecting a subset of tests to selecting a path. We present Recursive Adaptive Identification (RAId), a new polynomial-time algorithm, and obtain a polylogarithmic approximation bound for IPP problems without observation noise.
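To make the noiseless IPP setting concrete, the sketch below sets up a toy instance: each hypothesis deterministically fixes the observation at every location, so visiting a location eliminates every hypothesis that disagrees with what was observed. The function names and the myopic selection rule are illustrative assumptions of ours; this is a naive baseline for exposition, not the RAId algorithm.

    # Toy noiseless IPP instance. Each hypothesis assigns a fixed observation
    # to every location; observing a location rules out all hypotheses that
    # disagree with what was seen. Illustration only -- NOT RAId.

    def identify(start, locations, cost, obs, hypotheses, true_h):
        """Greedily visit locations until one hypothesis remains.

        cost(u, v)   -> travel cost between locations
        obs[h][loc]  -> deterministic observation under hypothesis h
        """
        consistent = set(hypotheses)
        here, total = start, 0.0
        unvisited = set(locations)
        while len(consistent) > 1 and unvisited:
            # Myopic heuristic: hypotheses eliminated in the worst case,
            # per unit of travel cost.
            def score(loc):
                groups = {}
                for h in consistent:
                    groups.setdefault(obs[h][loc], set()).add(h)
                eliminated = len(consistent) - max(len(g) for g in groups.values())
                return eliminated / max(cost(here, loc), 1e-9)
            loc = max(unvisited, key=score)
            total += cost(here, loc)
            here = loc
            unvisited.discard(loc)
            o = obs[true_h][loc]          # noiseless observation
            consistent = {h for h in consistent if obs[h][loc] == o}
        return consistent, total

    if __name__ == "__main__":
        # Two locations distinguish three hypotheses about a hidden label.
        obs = {"h1": {"A": 0, "B": 0}, "h2": {"A": 0, "B": 1}, "h3": {"A": 1, "B": 1}}
        dist = lambda u, v: 1.0
        print(identify("start", ["A", "B"], dist, obs, list(obs), "h2"))

RAId itself is a recursive strategy carrying the polylogarithmic guarantee stated above; this sketch only illustrates the problem structure it operates on.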

We also study adaptive stochastic optimization problems, which generalize IPP from gathering information to achieving general goals. In an adaptive stochastic optimization problem, an agent minimizes the cost of the sequence of actions needed to achieve its goal under uncertainty, where progress towards the goal is measured by an appropriate function. We propose the marginal likelihood rate bound, a condition on pointwise submodular functions that allows efficient approximation of adaptive stochastic optimization problems. We develop Recursive Adaptive Coverage (RAC), a near-optimal polynomial-time algorithm that exploits properties of the marginal likelihood rate bound to solve problems that optimize such functions. We further propose a more general condition, the marginal likelihood bound, which holds for all finite pointwise submodular monotone functions. Using a modified version of RAC, we obtain an approximation bound that depends on a problem-specific constant under the marginal likelihood bound condition.
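For reference, one common formalization of this objective class (our notation; the thesis's precise definitions, including the marginal likelihood rate bound itself, may differ in detail): a function f : 2^X \times H \to \mathbb{R}_{\ge 0} over items X and realizations H is pointwise monotone submodular if, for every realization h \in H, f(\cdot, h) is monotone and submodular, i.e.

    f(A, h) \le f(B, h) \quad \text{and} \quad f(A \cup \{x\}, h) - f(A, h) \ge f(B \cup \{x\}, h) - f(B, h)

for all A \subseteq B \subseteq X and x \in X \setminus B. The goal is achieved once f reaches a target value under the true realization; the diminishing-returns structure is what coverage-style algorithms such as RAC typically exploit.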

Finally, scaling up POMDP algorithms is hard when a task takes many actions to complete. We examine the special case of POMDPs that can be well approximated using sequences of macro-actions, each encapsulating several primitive actions. We give sufficient conditions for the macro-action model to retain the good theoretical properties of the POMDP. We introduce Macro-Monte Carlo Value Iteration (Macro-MCVI), an algorithm that enables the use of macro-actions in POMDPs. Macro-MCVI needs only a generative model for the macro-actions, making it easy to specify macro-actions for effective approximation.
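To give a feel for what "only a generative model" means, here is a hedged Python sketch of a macro-action simulator built on a primitive generative model step(s, a) -> (s', o, r). The class name, method names, and termination convention are our illustrative assumptions, not the Macro-MCVI API.

    class MacroAction:
        """A closed-loop macro-action: a primitive policy plus a stopping test."""

        def __init__(self, policy, is_done, gamma=0.95, max_steps=100):
            self.policy = policy        # maps the latest observation to a primitive action
            self.is_done = is_done      # termination test on the latest observation
            self.gamma = gamma          # discount applied to primitive rewards
            self.max_steps = max_steps  # safety cap on macro-action length

        def simulate(self, step, state, obs):
            """Roll the macro-action forward through the primitive generative
            model `step`; return (next_state, macro_observation, discounted_reward)."""
            total, discount = 0.0, 1.0
            trace = []
            for _ in range(self.max_steps):
                a = self.policy(obs)
                state, obs, r = step(state, a)
                total += discount * r
                discount *= self.gamma
                trace.append(obs)
                if self.is_done(obs):
                    break
            # The macro-observation summarizes the primitive observations seen
            # while the macro-action ran; Monte Carlo planning only samples
            # from this black box and never inspects its internals.
            return state, tuple(trace), total

Because planning only samples (state, observation, reward) triples from such a black box, a macro-action can wrap an arbitrary closed-loop controller, for example "move to the door" or "grasp until contact".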