Tong Zhang

  1. A General Framework of Dual Certificate Analysis for Structured Sparse Recovery Problems.

    Authors: Tong Zhang, Cun-Hui Zhang
    Subjects: Machine Learning
    Abstract

    This paper develops a general theoretical framework to analyze structured
    sparse recovery problems using the notation of dual certificate. Although
    certain aspects of the dual certificate idea have already been used in some
    previous work, due to the lack of a general and coherent theory, the analysis
    has so far only been carried out in limited scopes for specific problems. In
    this context the current paper makes two contributions. First, we introduce a
    general definition of dual certificate, which we then use to develop a unified
    theory of sparse recovery analysis for convex programming.

  2. A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem.

    Authors: Tong Zhang, Lin Xiao
    Subjects: Optimization and Control
    Abstract

    We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS)
    problem in the context of sparse recovery, for applications such as compressed
    sensing. The standard proximal gradient method, also known as iterative
    soft-thresholding when applied to this problem, has low computational cost per
    iteration but a rather slow convergence rate. Nevertheless, when the solution
    is sparse, it often exhibits fast linear convergence in the final stage.

  3. Mutual-Information Optimized Quantization for LDPC Decoding of Accurately Modeled Flash Data.

    Authors: Tong Zhang, Richard Wesel, Jiadong Wang, Guiqiang Dong
    Subjects: Information Theory
    Abstract

    High-capacity NAND flash memories use multi-level cells (MLCs) to store
    multiple bits per cell and achieve high storage densities. Higher densities
    cause increased raw bit error rates (BERs), which demand powerful error
    correcting codes. Low-density parity-check (LDPC) codes are a well-known class
    of capacity-approaching codes in AWGN channels. However, LDPC codes
    traditionally use soft information while the flash read channel provides only
    hard information. Low resolution soft information may be obtained by performing
    multiple reads per cell with distinct word-line voltages.

  4. Effective bound of linear series on arithmetic surfaces.

    Authors: Tong Zhang, Xinyi Yuan
    Subjects: Number Theory
    Abstract

    We prove an effective upper bound on the number of effective sections of a
    nef hermitian line bundle over an arithmetic surface. It is an effective
    version of the arithmetic Hilbert--Samuel formula. As a consequence, we obtain
    effective lower bounds on the Faltings height and on the self-intersection of
    the canonical bundle in terms of the number of singular points on fibers of the
    arithmetic surface.

  5. Truncated Power Method for Sparse Eigenvalue Problems.

    Authors: Tong Zhang, Xiao-Tong Yuan
    Subjects: Machine Learning
    Abstract

    This paper considers the sparse eigenvalue problem, which is to extract
    dominant (largest) sparse eigenvectors with at most $k$ non-zero components. We
    propose a simple yet effective solution called truncated power method that can
    approximately solve the underlying nonconvex optimization problem. A strong
    sparse recovery result is proved for the truncated power method, and this
    theory is our key motivation for developing the new algorithm. The proposed
    method is tested on applications such as sparse principal component analysis
    and the densest $k$-subgraph problem.

  6. Learning Nonlinear Functions Using Regularized Greedy Forest.

    Authors: Tong Zhang, Rie Johnson
    Subjects: Machine Learning
    Abstract

    We apply the concept of structured sparsity to improve boosted decision trees
    with general loss functions. The existing approach to this problem is
    Friedman's gradient boosting procedure using a regression tree base learner.
    Although this method has led to many successful industrial applications, it
    suffers from several theoretical and practical drawbacks. By employing the idea
    of structured greedy search, we are able to design a regularized greedy forest
    procedure to address these issues.

  7. A tail inequality for quadratic forms of subgaussian random vectors.

    Authors: Tong Zhang, Sham M. Kakade, Daniel Hsu
    Subjects: Probability
    Abstract

    We prove an exponential probability tail inequality for positive semidefinite
    quadratic forms in a subgaussian random vector. The bound is analogous to one
    that holds when the vector has independent Gaussian entries.

  8. A General Theory of Concave Regularization for High Dimensional Sparse Estimation Problems.

    Authors: Tong Zhang, Cun-Hui Zhang
    Subjects: Machine Learning
    Abstract

    Concave regularization methods provide natural procedures for sparse
    recovery. However, they are difficult to analyze in the high dimensional
    setting. Only recently a few sparse recovery results have been established for
    some specific local solutions obtained via specialized numerical procedures.
    Still, the fundamental relationship between these solutions such as whether
    they are identical or their relationship to the global minimizer of the
    underlying nonconvex formulation is unknown.

  9. Spectral Methods for Learning Multivariate Latent Tree Structure.

    Authors: Tong Zhang, Animashree Anandkumar, Sham M. Kakade, Kamalika Chaudhuri, Daniel Hsu, Le Song
    Subjects: Learning
    Abstract

    This work considers the problem of learning the structure of a broad class of
    multivariate latent variable tree models, which include a variety of continuous
    and discrete models (including the widely used linear-Gaussian models, hidden
    Markov models, and Markov evolutionary trees). The setting is one where we only
    have samples from certain observed variables in the tree and our goal is to
    estimate the tree structure (i.e., the graph of how the underlying hidden
    variables are connected to the observed variables).

  10. Efficient Optimal Learning for Contextual Bandits.

    Authors: Tong Zhang, John Langford, Daniel Hsu, Lev Reyzin, Nikos Karampatziakis, Miroslav Dudik, Satyen Kale
    Subjects: Learning
    Abstract

    We address the problem of learning in an online setting where the learner
    repeatedly observes features, selects among a set of actions, and receives
    reward for the action taken. We provide the first efficient algorithm with an
    optimal regret. Our algorithm uses a cost sensitive classification learner as
    an oracle and has a running time $\mathrm{polylog}(N)$, where $N$ is the number
    of classification rules among which the oracle might choose. This is
    exponentially faster than all previous algorithms that achieve optimal regret
    in this setting.

  11. An Analysis of Random Design Linear Regression.

    Authors: Tong Zhang, Sham M. Kakade, Daniel Hsu
    Subjects: Statistics
    Abstract

    The random design setting for linear regression concerns estimators based on
    a random sample of covariate/response pairs. This work gives explicit bounds on
    the prediction error for the ordinary least squares estimator and the ridge
    regression estimator under mild assumptions on the covariate/response
    distributions. In particular, this work provides sharp results on the
    "out-of-sample" prediction error, as opposed to the "in-sample" (fixed design)
    error. Our analysis also explicitly reveals the effect of noise vs. modeling
    errors.

  12. Robust Matrix Decomposition with Outliers.

    Authors: Tong Zhang, Sham M. Kakade, Daniel Hsu
    Subjects: Machine Learning
    Abstract

    Suppose a given observation matrix can be decomposed as the sum of a low-rank
    matrix and a sparse matrix (outliers), and the goal is to recover these
    individual components from the observed sum. Such additive decompositions have
    applications in a variety of numerical problems including system
    identification, latent variable graphical modeling, and principal components
    analysis. We study conditions under which recovering such a decomposition is
    possible via a combination of $\ell_1$ norm and trace norm minimization.

  13. Some sharp performance bounds for least squares regression with $L_1$ regularization.

    Authors: Tong Zhang
    Subjects: gr. Statistics
    Abstract

    We derive sharp performance bounds for least squares regression with $L_1$
    regularization from parameter estimation accuracy and feature selection quality
    perspectives. The main result proved for $L_1$ regularization extends a similar
    result in [Ann. Statist. 35 (2007) 2313--2351] for the Dantzig selector. It
    gives an affirmative answer to an open question in [Ann. Statist. 35 (2007)
    2358--2364]. Moreover, the result leads to an extended view of feature
    selection that allows less restrictive conditions than some recent work.

Syndicate content