CS SEMINAR

Melding Mechanism Design with Machine Learning in a Multi-Armed Bandit Context

Speaker
Prof. Y. Narahari
Department of Computer Science and Automation
Indian Institute of Science, Bangalore

Chaired by
Dr Mohan KANKANHALLI, Provost's Chair Professor, School of Computing
mohan@comp.nus.edu.sg

30 Sep 2015 Wednesday, 10:30 AM to 12:00 PM

SR7, COM1-02-07

Abstract:

Mechanism design (MD) provides a game theoretic framework to explore if the given social choice function may be implemented as an equilibrium outcome of an induced game. In a multi-agent setting, machine learning (ML) seeks to learn the preferences or types of the agents through any available data or through intelligent exploration. Many emerging problems involve strategic agents, private information, unknown information, and opportunities to explore and interact with agents; for example, Internet advertising auctions, Crowdsourcing, and online educational forums. ML and MD are well investigated as individual problems, however, interesting research questions arise when we try to meld them together. We address these issues in the context of the following problem in crowdsourcing. Consider a requester who wishes to crowdsource a series of identical tasks from a pool of workers so as to achieve an assured accuracy for each task, in a cost optimal way. The workers are heterogeneous with unknown qualities and moreover their costs are private. The problem is to select an optimal subset of the workers to work on each task so that the outcome obtained from aggregating labels from them guarantees a target accuracy. The requester not only has to learn the qualities of the workers but also elicit their true costs. We develop a novel multi-armed bandit (MAB) mechanism for solving this problem. In particular, we propose an adaptive, exploration separated MAB algorithm. We derive an upper bound on the number of exploration steps which depends on the target accuracy and true qualities. We show that our algorithm produces an ex-post monotone allocation rule which can be transformed into an ex-post incentive compatible and ex-post individually rational mechanism that learns qualities of the workers and guarantees the target accuracy in a cost optimal way.


Biodata:

Y. Narahari is currently a Professor at the Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India. The focus of his research in the last decade has been to explore problems at the interface of computer science and game theory. He is a Fellow of IEEE. He has just completed a textbook entitled Game Theory and Mechanism Design brought out by the IISc Press and the World Scientific Publishing Company.
More details at: http://lcm.csa.iisc.ernet.in/hari/