Act to See and See to Act: A Robotic System for Object Retrieval in Clutter
COM1 Level 3
MR1, COM1-03-19
Abstract:
Object retrieval in clutter is extraordinarily challenging for robots. The difficulty stems from incomplete knowledge of the environment: occlusion among objects leaves the robot with imperfect sensing, and at the same time the robot must physically interact with objects whose physical properties are unknown.
Humans naturally adopt the strategy of Act to See and See to Act when retrieving objects in clutter. We may rearrange (Act) objects to better understand (See) the scene, which in turn guides us to select better actions (Act) towards achieving the goal. This thesis adopts the same strategy to enable a robotic system to retrieve objects in clutter robustly and efficiently, despite uncertainty in sensing due to occlusion and uncertainty in control due to objects' unknown physical properties, such as center of mass. To alleviate uncertainty in sensing, we formulate object search in clutter as a Partially Observable Markov Decision Process (POMDP) with large state, action and observation spaces. Using insights into the spatial constraints of the problem, we improve a state-of-the-art POMDP solver so that it solves this POMDP efficiently. Through experiments in simulation, we show that the proposed planner selects near-optimal actions to remove occlusion and reveal the target object efficiently. We further conclude that POMDP planning is effective for problems that require multi-step lookahead search.
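To make the idea of acting in order to see more concrete, the following is a minimal sketch of a belief update for object search. It is not the planner developed in the thesis. It assumes a small discrete set of candidate hiding cells, a single target, and a noise-free observation after an occluder is removed. The function name `update_belief` and the cell labels are hypothetical and chosen only for illustration.

```python
def update_belief(belief, removed_cell, target_seen):
    """Bayes update of a discrete belief over where the target is hidden.

    belief       -- dict mapping each candidate cell to the probability
                    that the target is hidden there (sums to 1)
    removed_cell -- the cell whose occluder was just removed (the action)
    target_seen  -- True if the target became visible after the removal
                    (assumed noise-free in this sketch)
    """
    if target_seen:
        # The target is now located with certainty.
        return {cell: 1.0 if cell == removed_cell else 0.0 for cell in belief}

    # The target was not behind the removed occluder: rule that cell out
    # and renormalise the remaining probability mass.
    posterior = {cell: (0.0 if cell == removed_cell else p)
                 for cell, p in belief.items()}
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation is inconsistent with the belief")
    return {cell: p / total for cell, p in posterior.items()}


# Usage: three candidate cells; the robot clears cell "A" and sees nothing.
prior = {"A": 0.5, "B": 0.25, "C": 0.25}
posterior = update_belief(prior, "A", target_seen=False)
```

A planner with multi-step lookahead would evaluate sequences of such removals against the updated belief rather than greedily clearing the most likely cell, which matters when removing one occluder changes which other objects can be reached.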
To handle uncertainty in control, we devise Push-Net, a deep recurrent neural network that enables a robot to push an object from one configuration to another robustly and efficiently. Capturing the history of push interactions enables Push-Net to push novel objects robustly. We perform experiments in simulation and on a real robot, and show that embedding physical understanding of objects (their center of mass) in Push-Net helps it select more effective push actions.
Finally, we improve the POMDP planner and Push-Net and integrate both into a real robotic system. We evaluate the system on a set of challenging scenarios. The results demonstrate that the proposed system retrieves the target object robustly and efficiently in clutter. We attribute the success of the system to (1) the ability to handle perceptual uncertainty due to occlusion, (2) the ability to push objects of unknown physical properties in clutter, and (3) the ability to perform multi-step lookahead planning for efficient object search in complex environments.