The estimation of a sparse vector in the linear model is a fundamental
problem in signal processing, statistics, and compressive sensing. This paper
establishes a lower bound on the mean-squared error, which holds regardless of
the sensing/design matrix being used and regardless of the estimation
procedure. This lower bound very nearly matches the known upper bound one gets
by taking a random projection of the sparse vector followed by an $\ell_1$
estimation procedure such as the Dantzig selector. In this sense, compressive
sensing techniques cannot essentially be improved.
This paper develops a general framework for solving a variety of convex cone
problems that frequently arise in signal processing, machine learning,
statistics, and other fields. The approach works as follows: first, determine a
conic formulation of the problem; second, determine its dual; third, apply
smoothing; and fourth, solve using an optimal first-order method. A merit of
this approach is its flexibility: for example, all compressed sensing problems
can be solved via this approach.
Testing for the significance of a subset of regression coefficients in a
linear model, a staple of statistical analysis, goes back at least to the work
of Fisher who introduced the analysis of variance (ANOVA). We study this
problem under the assumption that the coefficient vector is sparse, a common
situation in modern high-dimensional settings. Suppose the regression vector is
of dimension p with S non-zero coefficients with S = p^{1 -alpha}. Under
moderate sparsity levels, i.e. alpha <= 1/2, we show that ANOVA is essentially
optimal under some conditions on the design.
We consider the fundamental problem of estimating the mean of a vector
$y=X\beta+z$, where $X$ is an $n\times p$ design matrix in which one can have
far more variables than observations, and $z$ is a stochastic error term--the
so-called "$p>n$" setup. When $\beta$ is sparse, or, more generally, when there
is a sparse subset of covariates providing a close approximation to the unknown
mean vector, we ask whether or not it is possible to accurately estimate
$X\beta$ using a computationally tractable algorithm.