Sparsity regret bounds for individual sequences in online linear regression.

link: http://arxiv.org/abs/1101.1057
Abstract

We consider the problem of online linear regression on arbitrary
deterministic sequences when the ambient dimension $d$ can be much larger than
the number of time rounds $T$. In this framework we prove deterministic online
counterparts of the so-called sparsity oracle inequalities introduced in the
stochastic setting in the past decade. They indicate that the task consisting
in predicting almost as well as an unknown high-dimensional target vector is
still statistically feasible if this target vector has only few non-zero
coordinates. Our online-learning algorithm SeqSEW is based on exponential
weighting and data-driven truncation. In a second part we apply a
parameter-independent version of this algorithm to the regression model with
random or fixed design. In this setting the sparsity regret bounds proved on
arbitrary deterministic sequences yield sparsity oracle inequalities with
leading constant $1$ which are of the same flavor as in Dalalyan and Tsybakov
(2008; 2010) but are adaptive (up to a logarithmic factor) to the unknown
variance of the noise if the latter is Gaussian (weaker bounds are also proved
under weaker assumptions). The framework of prediction of individual sequences
thus offers a unifying setting to address tuning issues in both the random and
the fixed design cases.