Cun-Hui Zhang

  1. Optimality of Graphlet Screening in High Dimensional Variable Selection.

    Authors: Cun-Hui Zhang, Qi Zhang, Jiashun Jin
    Subjects: Statistics
    Abstract

    Consider a linear regression model where the design matrix X has n rows and p
    columns. We assume (a) p is much larger than n, (b) the coefficient vector beta
    is sparse in the sense that only a small fraction of its coordinates is
    nonzero, and (c) the Gram matrix G = X'X is sparse in the sense that each row
    has relatively few large coordinates (diagonals of G are normalized to 1).

  2. A General Framework of Dual Certificate Analysis for Structured Sparse Recovery Problems.

    Authors: Tong Zhang, Cun-Hui Zhang
    Subjects: Machine Learning
    Abstract

    This paper develops a general theoretical framework to analyze structured
    sparse recovery problems using the notion of dual certificate. Although
    certain aspects of the dual certificate idea have already been used in some
    previous work, due to the lack of a general and coherent theory, the analysis
    has so far only been carried out in limited scopes for specific problems. In
    this context the current paper makes two contributions. First, we introduce a
    general definition of dual certificate, which we then use to develop a unified
    theory of sparse recovery analysis for convex programming.

  3. The sparse Laplacian shrinkage estimator for high-dimensional regression.

    Authors: Cun-Hui Zhang, Shuangge Ma, Jian Huang, Hongzhe Li
    Subjects: Statistics
    Abstract

    We propose a new penalized method for variable selection and estimation that
    explicitly incorporates the correlation patterns among predictors. This method
    is based on a combination of the minimax concave penalty and Laplacian
    quadratic associated with a graph as the penalty function. We call it the
    sparse Laplacian shrinkage (SLS) method. The SLS uses the minimax concave
    penalty for encouraging sparsity and Laplacian quadratic penalty for promoting
    smoothness among coefficients associated with the correlated predictors.
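
    The combined penalty described above can be sketched as follows. This is
    an illustrative sketch, not the authors' code: the function names, the
    parameters lam1, lam2 and gamma, and the use of a signed symmetric
    adjacency matrix A with zero diagonal are assumptions made for the example.
    The Laplacian is taken as L = D - A with D = diag(sum_k |a_jk|), so that
    beta' L beta = sum_{j<k} |a_jk| (beta_j - sign(a_jk) beta_k)^2.

```python
import numpy as np


def mcp(t, lam, gamma):
    """Minimax concave penalty, applied elementwise (assumed standard form)."""
    a = np.abs(t)
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2.0 * gamma),
                    gamma * lam ** 2 / 2.0)


def sls_penalty(beta, A, lam1, lam2, gamma):
    """MCP sparsity term plus a Laplacian quadratic smoothness term.

    A is a hypothetical signed, symmetric adjacency matrix with zero
    diagonal; its entries encode the correlation pattern among predictors.
    """
    beta = np.asarray(beta, dtype=float)
    A = np.asarray(A, dtype=float)
    laplacian = np.diag(np.abs(A).sum(axis=1)) - A
    sparsity = mcp(beta, lam1, gamma).sum()
    smoothness = 0.5 * lam2 * beta @ laplacian @ beta
    return sparsity + smoothness


# Two connected predictors and one isolated predictor.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
equal_coefs = sls_penalty([1.0, 1.0, 0.0], A, lam1=1.0, lam2=1.0, gamma=3.0)
opposite_coefs = sls_penalty([1.0, -1.0, 0.0], A, lam1=1.0, lam2=1.0, gamma=3.0)
```

    In this toy example the smoothness term vanishes when the two connected
    coefficients agree and is positive when they have opposite signs, which is
    the behavior the abstract attributes to the Laplacian quadratic.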

  4. Confidence Intervals for Low-Dimensional Parameters With High-Dimensional Data.

    Authors: Cun-Hui Zhang, Stephanie S. Zhang
    Subjects: Methodology
    Abstract

    The purpose of this paper is to propose methodologies for statistical
    inference of low-dimensional parameters with high-dimensional data. We focus on
    constructing confidence intervals for individual coefficients and linear
    combinations of several of them in a linear regression model, although our
    ideas are applicable in a much broader context. The theoretical results presented
    here provide sufficient conditions for the asymptotic normality of the proposed
    estimators along with a consistent estimator for their finite-dimensional
    covariance matrices.

  5. A General Theory of Concave Regularization for High Dimensional Sparse Estimation Problems.

    Authors: Tong Zhang, Cun-Hui Zhang
    Subjects: Machine Learning
    Abstract

    Concave regularization methods provide natural procedures for sparse
    recovery. However, they are difficult to analyze in the high dimensional
    setting. Only recently a few sparse recovery results have been established for
    some specific local solutions obtained via specialized numerical procedures.
    Still, the fundamental relationship between these solutions, such as whether
    they are identical or how they relate to the global minimizer of the
    underlying nonconvex formulation, is unknown.

  6. Scaled Sparse Linear Regression.

    Authors: Cun-Hui Zhang, Tingni Sun
    Subjects: Machine Learning
    Abstract

    Scaled sparse linear regression jointly estimates the regression coefficients
    and noise level in a linear model. It chooses an equilibrium with a sparse
    regression method by iteratively estimating the noise level via the mean
    residual squares and scaling the penalty in proportion to the estimated noise
    level.
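
    The iteration described above can be sketched as follows. This is a
    minimal sketch under stated assumptions, not the authors' implementation:
    the sparse regression method is taken to be the LASSO solved by a plain
    coordinate descent, and the function names, the base penalty level lam0,
    and the stopping rule are choices made for the example.

```python
import numpy as np


def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, beta0=None, n_sweeps=200, tol=1e-8):
    """Minimize ||y - X b||^2 / (2n) + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p) if beta0 is None else beta0.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            rho = X[:, j] @ resid / n + col_sq[j] * old
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta


def scaled_lasso(X, y, lam0, n_iter=50, tol=1e-6):
    """Alternate between a LASSO fit and a noise-level update."""
    n, p = X.shape
    beta = np.zeros(p)
    sigma = np.sqrt(np.mean(y ** 2))  # residual scale at beta = 0
    for _ in range(n_iter):
        # Penalty scaled in proportion to the current noise estimate.
        beta = lasso_cd(X, y, sigma * lam0, beta0=beta)
        # Noise level re-estimated from the mean residual squares.
        sigma_new = np.sqrt(np.mean((y - X @ beta) ** 2))
        converged = abs(sigma_new - sigma) < tol * sigma
        sigma = sigma_new
        if converged:
            break
    return beta, sigma


# Synthetic data; the sizes and the choice of lam0 are illustrative assumptions.
rng = np.random.default_rng(0)
n, p = 60, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.5 * rng.standard_normal(n)
beta_hat, sigma_hat = scaled_lasso(X, y, lam0=np.sqrt(2.0 * np.log(p) / n))
```

    Warm-starting each LASSO fit from the previous coefficients keeps the
    outer loop cheap, since the penalty level changes only slightly between
    iterations once the noise estimate settles.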

  7. Optimal rates of convergence for covariance matrix estimation.

    Authors: Cun-Hui Zhang, T. Tony Cai, Harrison H. Zhou
    Subjects: Statistics
    Abstract

    The covariance matrix plays a central role in multivariate statistical analysis.
    Significant advances have been made recently on developing both theory and
    methodology for estimating large covariance matrices. However, a minimax theory
    has yet to be developed. In this paper we establish the optimal rates of
    convergence for estimating the covariance matrix under both the operator norm
    and Frobenius norm. It is shown that optimal procedures under the two norms are
    different and consequently matrix estimation under the operator norm is
    fundamentally different from vector estimation.

  8. Nearly unbiased variable selection under minimax concave penalty.

    Authors: Cun-Hui Zhang
    Subjects: Statistics
    Abstract

    We propose MC+, a fast, continuous, nearly unbiased and accurate method of
    penalized variable selection in high-dimensional linear regression. The LASSO
    is fast and continuous, but biased. The bias of the LASSO may prevent
    consistent variable selection. Subset selection is unbiased but computationally
    costly. The MC+ has two elements: a minimax concave penalty (MCP) and a
    penalized linear unbiased selection (PLUS) algorithm. The MCP provides the
    convexity of the penalized loss in sparse regions to the greatest extent given
    certain thresholds for variable selection and unbiasedness.
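
    For reference, the MCP is commonly written as below. The abstract does not
    state the formula, so the notation here is an assumption: lambda is the
    penalty level and gamma > 0 is a second parameter controlling concavity.

```latex
\[
\rho(t;\lambda)
  = \lambda \int_0^{|t|} \Bigl(1 - \frac{x}{\gamma\lambda}\Bigr)_+ \, dx
  =
  \begin{cases}
    \lambda |t| - \dfrac{t^2}{2\gamma}, & |t| \le \gamma\lambda, \\[4pt]
    \dfrac{\gamma\lambda^2}{2},        & |t| > \gamma\lambda .
  \end{cases}
\]
```

    The derivative of the penalty falls linearly from lambda at t = 0 to zero
    at |t| = gamma * lambda, so small coefficients are penalized at the LASSO
    rate while coefficients beyond that threshold receive no further shrinkage.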

  9. Asymptotic normality of a nonparametric estimator of sample coverage.

    Authors: Cun-Hui Zhang, Zhiyi Zhang
    Subjects: Statistics
    Abstract

    This paper establishes a necessary and sufficient condition for the
    asymptotic normality of the nonparametric estimator of sample coverage proposed
    by Good [Biometrika 40 (1953) 237--264]. This new necessary and sufficient
    condition extends the validity of the asymptotic normality beyond the
    previously proven cases.
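
    As a reminder of the estimator in question, Good's estimator of sample
    coverage is usually stated as one minus the proportion of observations
    that belong to species seen exactly once. The sketch below assumes that
    form; the function name and the toy sample are hypothetical.

```python
from collections import Counter


def good_coverage_estimate(sample):
    """Estimate sample coverage as 1 - (number of singletons) / n."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / n


# Seven observations; species "c" and "d" each appear exactly once.
coverage = good_coverage_estimate(list("aabbbcd"))
```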
