Bin Yu

  1. Complexity Analysis of the Lasso Regularization Path.

    Authors: Bin Yu, Julien Mairal
    Subjects: Machine Learning
    Abstract

    The regularization path of the Lasso can be shown to be piecewise linear,
    making it possible to "follow" and explicitly compute the entire path. We
    analyze in this paper this popular strategy, and prove that its worst case
    complexity is exponential in the number of variables. We then oppose this
    pessimistic result to an (optimistic) approximate analysis: We show that an
    approximate path with at most O(1/sqrt(epsilon)) linear segments can always be
    obtained, where every point on the path is guaranteed to be optimal up to a
    relative epsilon-duality gap.

  2. Supervised Feature Selection in Graphs with Path Coding Penalties and Network Flows.

    Authors: Bin Yu, Julien Mairal
    Subjects: Machine Learning
    Abstract

    We consider supervised learning problems where the features are embedded in a
    graph, such as gene expressions in a gene network. In this context, it is of
    much interest to take into account the problem structure, and automatically
    select a subgraph with a small number of connected components. By exploiting
    prior knowledge, one can indeed improve the prediction performance and/or
    obtain better interpretable results. Regularization or penalty functions for
    selecting features in graphs have recently been proposed but they raise new
    algorithmic challenges.

  3. Co-clustering for Directed Graphs; the Stochastic Co-Blockmodel and a Spectral Algorithm.

    Authors: Bin Yu, Karl Rohe
    Subjects: Machine Learning
    Abstract

    Communities of highly connected actors form an essential feature in the
    structure of several empirical directed and undirected networks. However,
    compared to the amount of research on clustering for undirected graphs, there
    is relatively little understanding of clustering in directed networks.

  4. Encoding and decoding V1 fMRI responses to natural images with sparse nonparametric models.

    Authors: Bin Yu, Pradeep Ravikumar, Vincent Q Vu, Thomas Naselaris, Kendrick N Kay, Jack L Gallant
    Subjects: Applications
    Abstract

    Functional MRI (fMRI) has become the most common method for investigating the
    human brain. However, fMRI data present some complications for statistical
    analysis and modeling. One recently developed approach to these data focuses on
    estimation of computational encoding models that describe how stimuli are
    transformed into brain activity measured in individual voxels. Here we aim at
    building encoding models for fMRI signals recorded in primary visual cortex of
    the human brain. We use residual analyses to reveal systematic nonlinearity
    across voxels not taken into account by previous models.

  5. Remembering Leo.

    Authors: Bin Yu
    Subjects: Applications
    Abstract

    I do not remember when was the first time that I met Leo, but I have a clear
    memory of going to Leo's office on the 4th floor of Evans Hall to talk to him
    in my second year in Berkeley's Ph.D. program in 1986. The details of the
    conversation are not retained but a visual image of his clean and orderly
    office remains, in a stark contrast to a high entropy state of the same office
    now being used by myself.

  6. The Lasso under Heteroscedasticity.

    Authors: Bin Yu, Karl Rohe, Jinzhu Jia
    Subjects: Machine Learning
    Abstract

    The performance of the Lasso is well understood under the assumptions of the
    standard linear model with homoscedastic noise. However, in several
    applications, the standard model does not describe the important features of
    the data. This paper examines how the Lasso performs on a non-standard model
    that is motivated by medical imaging applications. In these applications, the
    variance of the noise scales linearly with the expectation of the observation.
    Like all heteroscedastic models, the noise terms in this Poisson-like model are
    \textit{not} independent of the design matrix.

  7. A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers.

    Authors: Martin J. Wainwright, Bin Yu, Pradeep Ravikumar, Sahand Negahban
    Subjects: Statistics
    Abstract

    High-dimensional statistical inference deals with models in which the the
    number of parameters $p$ is comparable to or larger than the sample size $n$.
    Since it is usually impossible to obtain consistent procedures unless $p/n
    \rightarrow 0$, a line of recent work has studied models with various types of
    low-dimensional structure (e.g., sparse vectors; block-structured matrices;
    low-rank matrices; Markov assumptions).

  8. Minimax-optimal rates for sparse additive models over kernel classes via convex programming.

    Authors: Martin J. Wainwright, Bin Yu, Garvesh Raskutti
    Subjects: Statistics
    Abstract

    Sparse additive models are families of $d$-variate functions that have the
    additive decomposition \mbox{$f^* = \sum_{j \in S} f^*_j$,} where $S$ is a
    unknown subset of cardinality $s \ll d$. We consider the case where each
    component function $f^*_j$ lies in a reproducing kernel Hilbert space, and
    analyze a simple kernel-based convex program for estimating the unknown
    function $f^*$. Working within a high-dimensional framework that allows both
    the dimension $d$ and sparsity $s$ to scale, we derive convergence rates in the
    $L^2(\mathbb{P})$ and $L^2(\mathbb{P}_n)$ norms.

  9. Spectral clustering and the high-dimensional Stochastic Block Model.

    Authors: Sourav Chatterjee, Bin Yu, Karl Rohe
    Subjects: Machine Learning
    Abstract

    Networks or graphs can easily represent a diverse set of data sources that
    are characterized by interacting units or actors. Social networks, representing
    people who communicate with each other, are one example. Communities or
    clusters of highly connected actors form an essential feature in the structure
    of several empirical networks. Spectral clustering is a popular and
    computationally feasible method to discover these communities. The Stochastic
    Block Model (Holland et al., 1983) is a social network model with well defined
    communities; each node is a member of one community.

  10. Data spectroscopy: Eigenspaces of convolution operators and clustering.

    Authors: Bin Yu, Mikhail Belkin, Tao Shi
    Subjects: Machine Learning
    Abstract

    This paper focuses on obtaining clustering information about a distribution
    from its i.i.d. samples. We develop theoretical results to understand and use
    clustering information contained in the eigenvectors of data adjacency matrices
    based on a radial kernel function with a sufficiently fast tail decay. In
    particular, we provide population analyses to gain insights into which
    eigenvectors should be used and when the clustering information for the
    distribution can be recovered from the sample.

  11. Minimax rates of estimation for high-dimensional linear regression over $\ell_q$-balls.

    Authors: Martin J. Wainwright, Bin Yu, Garvesh Raskutti
    Subjects: Statistics
    Abstract

    Consider the standard linear regression model $\y = \Xmat \betastar + w$,
    where $\y \in \real^\numobs$ is an observation vector, $\Xmat \in
    \real^{\numobs \times \pdim}$ is a design matrix, $\betastar \in \real^\pdim$
    is the unknown regression vector, and $w \sim \mathcal{N}(0, \sigma^2 I)$ is
    additive Gaussian noise. This paper studies the minimax rates of convergence
    for estimation of $\betastar$ for $\ell_\rpar$-losses and in the
    $\ell_2$-prediction loss, assuming that $\betastar$ belongs to an
    $\ell_{\qpar}$-ball $\Ballq(\myrad)$ for some $\qpar \in [0,1]$.

  12. Minimax rates of estimation for high-dimensional linear regression over $\ell_q$-balls.

    Authors: Martin J. Wainwright, Bin Yu, Garvesh Raskutti
    Subjects: Statistics
    Abstract

    Consider the standard linear regression model $\y = \Xmat \betastar + w$,
    where $\y \in \real^\numobs$ is an observation vector, $\Xmat \in
    \real^{\numobs \times \pdim}$ is a design matrix, $\betastar \in \real^\pdim$
    is the unknown regression vector, and $w \sim \mathcal{N}(0, \sigma^2 I)$ is
    additive Gaussian noise. This paper studies the minimax rates of convergence
    for estimation of $\betastar$ for $\ell_\rpar$-losses and in the
    $\ell_2$-prediction loss, assuming that $\betastar$ belongs to an
    $\ell_{\qpar}$-ball $\Ballq(\myrad)$ for some $\qpar \in [0,1]$.

  13. The composite absolute penalties family for grouped and hierarchical variable selection.

    Authors: Peng Zhao, Guilherme Rocha, Bin Yu
    Subjects: Statistics
    Abstract

    Extracting useful information from high-dimensional data is an important
    focus of today's statistical research and practice. Penalized loss function
    minimization has been shown to be effective for this task both theoretically
    and empirically. With the virtues of both regularization and sparsity, the
    $L_1$-penalized squared error minimization method Lasso has been popular in
    regression models and beyond.

  14. The composite absolute penalties family for grouped and hierarchical variable selection.

    Authors: Peng Zhao, Guilherme Rocha, Bin Yu
    Subjects: Statistics
    Abstract

    Extracting useful information from high-dimensional data is an important
    focus of today's statistical research and practice. Penalized loss function
    minimization has been shown to be effective for this task both theoretically
    and empirically. With the virtues of both regularization and sparsity, the
    $L_1$-penalized squared error minimization method Lasso has been popular in
    regression models and beyond.

Syndicate content