Consider a linear regression model where the design matrix X has n rows and p
columns. We assume (a) p is much larger than n, (b) the coefficient vector beta
is sparse in the sense that only a small fraction of its coordinates is
nonzero, and (c) the Gram matrix G = X'X is sparse in the sense that each row
has relatively few large coordinates (diagonals of G are normalized to 1).
This paper develops a general theoretical framework to analyze structured
sparse recovery problems using the notion of dual certificate. Although
certain aspects of the dual certificate idea have already been used in some
previous work, due to the lack of a general and coherent theory, the analysis
has so far only been carried out in limited scopes for specific problems. In
this context the current paper makes two contributions. First, we introduce a
general definition of dual certificate, which we then use to develop a unified
theory of sparse recovery analysis for convex programming.
We propose a new penalized method for variable selection and estimation that
explicitly incorporates the correlation patterns among predictors. This method
is based on a penalty function that combines the minimax concave penalty with a
Laplacian quadratic associated with a graph. We call it the
sparse Laplacian shrinkage (SLS) method. The SLS uses the minimax concave
penalty to encourage sparsity and the Laplacian quadratic penalty to promote
smoothness among coefficients associated with correlated predictors.
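
As a rough illustration of how such a combined penalty could be evaluated, the sketch below adds an MCP term to a Laplacian quadratic term. It is not taken from the paper: the function names `mcp` and `sls_penalty`, the tuning parameters `lam1`, `lam2` and `gamma`, the factor 1/2, and the use of an unsigned, unnormalized graph Laplacian built from a nonnegative adjacency matrix are all assumptions made for the example.

```python
import numpy as np

def mcp(t, lam, gamma):
    """Minimax concave penalty, evaluated elementwise.

    Equals lam*|t| - t^2/(2*gamma) for |t| <= gamma*lam and the
    constant gamma*lam^2/2 beyond that point.
    """
    a = np.abs(np.asarray(t, dtype=float))
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2.0 * gamma),
                    0.5 * gamma * lam ** 2)

def sls_penalty(beta, adjacency, lam1, lam2, gamma):
    """Hypothetical SLS-style penalty: MCP sum plus a Laplacian quadratic.

    Assumes `adjacency` is symmetric with nonnegative weights and a zero
    diagonal, so that beta' L beta = sum_{j<k} a_jk * (beta_j - beta_k)^2.
    """
    beta = np.asarray(beta, dtype=float)
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A          # unnormalized graph Laplacian
    sparsity_term = mcp(beta, lam1, gamma).sum()
    smoothness_term = 0.5 * lam2 * beta @ L @ beta
    return sparsity_term + smoothness_term
```

Under these assumptions, two connected predictors with equal coefficients incur no smoothness cost, while coefficients of opposite sign on connected predictors are penalized quadratically in their difference.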
The purpose of this paper is to propose methodologies for statistical
inference of low-dimensional parameters with high-dimensional data. We focus on
constructing confidence intervals for individual coefficients and linear
combinations of several of them in a linear regression model, although our
ideas are applicable in a much broader context. The theoretical results presented
here provide sufficient conditions for the asymptotic normality of the proposed
estimators along with a consistent estimator for their finite-dimensional
covariance matrices.
Concave regularization methods provide natural procedures for sparse
recovery. However, they are difficult to analyze in the high-dimensional
setting. Only recently have a few sparse recovery results been established for
some specific local solutions obtained via specialized numerical procedures.
Still, the fundamental relationships among these solutions, such as whether
they are identical and how they relate to the global minimizer of the
underlying nonconvex formulation, remain unknown.
Scaled sparse linear regression jointly estimates the regression coefficients
and noise level in a linear model. It chooses an equilibrium with a sparse
regression method by iteratively estimating the noise level via the mean
residual squares and scaling the penalty in proportion to the estimated noise
level.
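
The alternating scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Lasso is used as the sparse regression method, it is solved by a plain coordinate descent routine (`lasso_cd`, a hypothetical helper), and the names `scaled_lasso`, `lam0`, `n_sweeps`, `n_iter` and `tol` are introduced only for this example.

```python
import numpy as np

def soft(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, beta, n_sweeps=100):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1.

    Updates `beta` in place (warm start) and returns it.
    """
    n, p = X.shape
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ beta                      # current residual
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * beta[j]        # remove coordinate j from the fit
            z = X[:, j] @ r / n
            beta[j] = soft(z, lam) / col_sq[j]
            r -= X[:, j] * beta[j]        # add the updated coordinate back
    return beta

def scaled_lasso(X, y, lam0, n_iter=50, tol=1e-8):
    """Alternate between a noise-level update and a Lasso fit.

    The noise level is the root of the mean residual squares, and the
    Lasso penalty level is set to sigma * lam0 at each step.
    """
    n, p = X.shape
    beta = np.zeros(p)
    sigma = np.sqrt(np.mean(y ** 2))      # initial noise level with beta = 0
    for _ in range(n_iter):
        beta = lasso_cd(X, y, sigma * lam0, beta)
        new_sigma = np.sqrt(np.mean((y - X @ beta) ** 2))
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return beta, sigma
```

A common choice in this literature is a base penalty level of order sqrt(2 log(p) / n) for `lam0`; that value is an assumption here and is not stated in the text above.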
The covariance matrix plays a central role in multivariate statistical analysis.
Significant advances have been made recently on developing both theory and
methodology for estimating large covariance matrices. However, a minimax theory
has yet to be developed. In this paper we establish the optimal rates of
convergence for estimating the covariance matrix under both the operator norm
and Frobenius norm. It is shown that optimal procedures under the two norms are
different and consequently matrix estimation under the operator norm is
fundamentally different from vector estimation.
We propose MC+, a fast, continuous, nearly unbiased and accurate method of
penalized variable selection in high-dimensional linear regression. The LASSO
is fast and continuous, but biased. The bias of the LASSO may prevent
consistent variable selection. Subset selection is unbiased but computationally
costly. The MC+ has two elements: a minimax concave penalty (MCP) and a
penalized linear unbiased selection (PLUS) algorithm. The MCP provides the
convexity of the penalized loss in sparse regions to the greatest extent given
certain thresholds for variable selection and unbiasedness.
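
To make the bias contrast between the LASSO and the MCP concrete, the sketch below compares their univariate thresholding rules. It assumes a single predictor standardized to unit scale and a concavity parameter `gamma` greater than 1; the function names `soft_threshold` and `mcp_threshold` are chosen for this example, and the code does not implement the PLUS algorithm.

```python
import numpy as np

def soft_threshold(z, lam):
    """LASSO solution for one standardized predictor: shrinks every
    nonzero estimate toward zero by lam, which is the source of its bias."""
    z = np.asarray(z, dtype=float)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def mcp_threshold(z, lam, gamma):
    """MCP solution for one standardized predictor (requires gamma > 1).

    Zero for |z| <= lam, a rescaled soft threshold for
    lam < |z| <= gamma*lam, and the unshrunk value z beyond gamma*lam.
    """
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 for a unique minimizer")
    z = np.asarray(z, dtype=float)
    a = np.abs(z)
    shrunk = np.sign(z) * np.maximum(a - lam, 0.0) / (1.0 - 1.0 / gamma)
    return np.where(a <= gamma * lam, shrunk, z)
```

With `lam = 1` and `gamma = 3`, an input of 5 is returned unchanged by `mcp_threshold` but is shrunk to 4 by `soft_threshold`, while both rules set an input of 0.5 to zero.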
This paper establishes a necessary and sufficient condition for the
asymptotic normality of the nonparametric estimator of sample coverage proposed
by Good [Biometrika 40 (1953) 237--264]. This new necessary and sufficient
condition extends the validity of the asymptotic normality beyond the
previously proven cases.
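
For readers unfamiliar with the estimator, a minimal sketch of its usual form follows: one minus the proportion of observations that belong to species seen exactly once. The function name `good_coverage` and the representation of the sample as a sequence of hashable labels are assumptions made for this example.

```python
from collections import Counter

def good_coverage(sample):
    """Good's estimate of sample coverage: 1 - N1 / n, where N1 is the
    number of distinct labels observed exactly once and n is the sample size."""
    counts = Counter(sample)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("sample must be non-empty")
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / n
```

For the sample `"aabbbc"`, only `c` appears once among six observations, so the estimate is 1 - 1/6.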