Near-Optimal Bayesian Active Learning with Noisy Observations.

link: http://arxiv.org/abs/1010.3091
Abstract

We tackle the fundamental problem of Bayesian active learning with noise,
where we need to adaptively select from a number of expensive tests in order to
identify an unknown hypothesis sampled from a known prior distribution. In the
case of noise-free observations, a greedy algorithm called generalized binary
search (GBS) is known to perform near-optimally. We show that if the
observations are noisy, perhaps surprisingly, GBS can perform very poorly. We
develop EC2, a novel, greedy active learning algorithm and prove that it is
competitive with the optimal policy, thus obtaining the first competitiveness
guarantees for Bayesian active learning with noisy observations. Our bounds
rely on a recently discovered diminishing returns property called adaptive
submodularity, generalizing the classical notion of submodular set functions to
adaptive policies. Our results hold even if the tests have non-uniform cost and
their noise is correlated. We also propose EffECXtive, a particularly fast
approximation of EC2, and evaluate it on a Bayesian experimental design problem
involving human subjects, intended to tease apart competing economic theories
of how people make decisions under uncertainty.