Peter Bühlmann

  1. Two Optimal Strategies for Active Learning of Causal Models from Interventions.

    Authors: Peter Bühlmann, Alain Hauser
    Subjects: Methodology
    Abstract

    From observational data alone, a causal DAG is in general only identifiable
    up to Markov equivalence. Interventional data generally improves
    identifiability; however, the gain of an intervention strongly depends on the
    intervention target, i.e., the intervened variables. We present active learning
    strategies calculating optimal interventions for two different learning goals.
    The first one is a greedy approach using single-vertex interventions that
    maximizes the number of edges that can be oriented after each intervention.

  2. Identifiability of Gaussian Structural Equation Models with Same Error Variances.

    Authors: Peter Bühlmann, Jonas Peters
    Subjects: Machine Learning
    Abstract

    We consider structural equation models (SEMs) in which variables can be
    written as a function of their parents and noise terms (the latter are assumed
    to be jointly independent). Corresponding to each SEM, there is a directed
    acyclic graph (DAG) G_0 describing the relationships between the variables. In
    Gaussian SEMs with linear functions, the graph can be identified from the joint
    distribution only up to Markov equivalence classes (assuming faithfulness). It
    has been shown, however, that this constitutes an exceptional case.

  3. Asymptotic optimality of the Westfall-Young permutation procedure for multiple testing under dependence.

    Authors: Marloes H. Maathuis, Peter Bühlmann, Nicolai Meinshausen
    Subjects: Statistics
    Abstract

    Test statistics are often strongly dependent in large-scale multiple testing
    applications. Most corrections for multiplicity are unduly conservative for
    correlated test statistics, resulting in a loss of power to detect true
    positives. We show that the Westfall-Young permutation method has
    asymptotically optimal power for a broad class of testing problems with a
    block-dependence and sparsity structure among the tests, when the number of
    tests tends to infinity.
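
    As a rough illustration of the permutation idea (not the authors' code), the
    sketch below computes single-step max-statistic adjusted p-values. It assumes
    a two-group design and uses the absolute difference in group means as the
    test statistic; the function names and the toy data are hypothetical. Because
    group labels are permuted jointly across all tests, the dependence among the
    test statistics is preserved in the permutation distribution.

```python
import random
from statistics import mean


def abs_mean_diff(rows, labels):
    """Per-test |mean(group 1) - mean(group 0)|; rows has one list per sample."""
    m = len(rows[0])
    g1 = [r for r, lab in zip(rows, labels) if lab == 1]
    g0 = [r for r, lab in zip(rows, labels) if lab == 0]
    return [abs(mean(r[j] for r in g1) - mean(r[j] for r in g0)) for j in range(m)]


def maxt_adjusted_pvalues(rows, labels, n_perm=500, seed=0):
    """Single-step maxT adjustment (an illustrative simplification)."""
    rng = random.Random(seed)
    observed = abs_mean_diff(rows, labels)
    exceed = [0] * len(observed)
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)  # permute group labels jointly for all tests
        max_stat = max(abs_mean_diff(rows, perm))
        for j, t in enumerate(observed):
            if max_stat >= t:
                exceed[j] += 1
    # The +1 terms count the observed labelling as one permutation.
    return [(c + 1) / (n_perm + 1) for c in exceed]


# Toy data: 20 samples, 4 tests; only test 0 differs between the groups.
rows = [[5.0 if i >= 10 else 0.0] + [((i * 7 + j * 3) % 5) / 5.0 for j in range(3)]
        for i in range(20)]
labels = [0] * 10 + [1] * 10
adjusted = maxt_adjusted_pvalues(rows, labels, n_perm=200, seed=1)
```

    In this toy setting the adjusted p-value of test 0 is the smallest of the
    four, since its observed statistic dominates the permutation maxima.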

  4. Introduction to the Lehmann special section.

    Authors: Peter Bühlmann, Tony Cai
    Subjects: Statistics
    Abstract

    The current Special Issue of The Annals of Statistics contains three invited
    articles. Javier Rojo discusses Erich's scientific achievements and provides
    complete lists of his scientific writings and his former Ph.D. students.

  5. Conditional Transformation Models.

    Authors: Peter Bühlmann, Thomas Kneib, Torsten Hothorn
    Subjects: Methodology
    Abstract

    The ultimate goal of regression analysis is to obtain information about the
    conditional distribution of a response given a set of explanatory variables.
    This goal is, however, seldom achieved because most established regression
    models only estimate the conditional mean as a function of the explanatory
    variables and assume that higher moments are not affected by the regressors.
    The underlying reason for such a restriction is the assumption of additivity of
    signal and noise. We propose to relax this common assumption in the framework
    of transformation models.

  6. GLMMLasso: An Algorithm for High-Dimensional Generalized Linear Mixed Models Using L1-Penalization.

    Authors: Peter Bühlmann, Jürg Schelldorfer
    Subjects: Computation
    Abstract

    We propose an L1-penalized algorithm for fitting high-dimensional generalized
    linear mixed models. Generalized linear mixed models (GLMMs) can be viewed as
    an extension of generalized linear models for clustered observations. This
    Lasso-type approach for GLMMs should mainly be used as a variable screening
    method to reduce the number of variables below the sample size. We then suggest
    a refitting by maximum likelihood based on the selected variables only. This is
    an effective correction to overcome problems stemming from the variable
    screening procedure which are more severe with GLMMs.

  7. Stable Graphical Model Estimation with Random Forests for Discrete, Continuous, and Mixed Variables.

    Authors: Peter Bühlmann, Bernd Fellinghauer, Martin Ryffel, Michael von Rhein, Jan D. Reinhardt
    Subjects: Methodology
    Abstract

    A conditional independence graph is a concise representation of pairwise
    conditional independence among many variables. We propose Graphical Random
    Forests (GRaFo) for estimating pairwise conditional independence relationships
    among mixed-type, i.e. continuous and discrete, variables. The number of edges
    is a tuning parameter in any graphical model estimator and there is no obvious
    number that constitutes a good choice. Stability Selection helps to choose this
    parameter with respect to a bound on the expected number of false positives
    (error control).

  8. Characterization and Greedy Learning of Interventional Markov Equivalence Classes of Directed Acyclic Graphs.

    Authors: Peter Bühlmann, Alain Hauser
    Subjects: Methodology
    Abstract

    The investigation of directed acyclic graphs (DAGs) encoding the same Markov
    property, that is, the same conditional independence relations of multivariate
    observational distributions, has a long tradition; many algorithms exist for
    model selection and structure learning in Markov equivalence classes. In this
    paper, we extend the notion of Markov equivalence of DAGs to the case of
    interventional distributions arising from multiple intervention experiments.

  9. Remembrance of Leo Breiman.

    Authors: Peter Bühlmann
    Subjects: Applications
    Abstract

    In 1994, I came to Berkeley and was fortunate to stay there three years,
    first as a postdoctoral researcher and then as Neyman Visiting Assistant
    Professor. For me, this period was a unique opportunity to see other aspects
    and learn many more things about statistics: the Department of Statistics at
    Berkeley was much bigger and hence broader than my home at ETH Zürich and I
    enjoyed very much that the science was perhaps a bit more speculative.

  10. Missing values: sparse inverse covariance estimation and an extension to sparse regression.

    Authors: Peter Bühlmann, Nicolas Städler
    Subjects: Methodology
    Abstract

    We propose an l1-regularized likelihood method for estimating the inverse
    covariance matrix in the high-dimensional multivariate normal model in the
    presence of missing data. Our method is based on the assumption that the data
    are missing at random (MAR), which also covers the missing completely at
    random case. The implementation of the method is non-trivial as the observed negative
    log-likelihood generally is a complicated and non-convex function. We propose
    an efficient EM-algorithm for optimization with provable numerical convergence
    properties.

  11. Estimation for High-Dimensional Linear Mixed-Effects Models Using $\ell_1$-Penalization.

    Authors: Peter Bühlmann, Jürg Schelldorfer
    Subjects: Methodology
    Abstract

    We propose an $\ell_1$-penalized estimation procedure for high-dimensional
    linear mixed-effects models. The models are useful whenever there is a grouping
    structure among high-dimensional observations, i.e. for clustered data. We
    prove a consistency and an oracle optimality result and we develop an algorithm
    with provable numerical convergence. Furthermore, we demonstrate the
    performance of the method on simulated data and on a real high-dimensional
    dataset.

  12. High-dimensional additive modeling.

    Authors: Peter Bühlmann, Lukas Meier, Sara van de Geer
    Subjects: Machine Learning
    Abstract

    We propose a new sparsity-smoothness penalty for high-dimensional generalized
    additive models. The combination of sparsity and smoothness is crucial for
    mathematical theory as well as performance for finite-sample data. We present a
    computationally efficient algorithm, with provable numerical convergence
    properties, for optimizing the penalized likelihood. Furthermore, we provide
    oracle results which yield asymptotic optimality of our estimator for
    high-dimensional but sparse additive models.

  13. Decomposition and Model Selection for Large Contingency Tables.

    Authors: Markus Kalisch, Peter Bühlmann, Corinne Dahinden
    Subjects: Methodology
    Abstract

    Large contingency tables summarizing categorical variables arise in many
    areas, for example in biology when a large number of biomarkers are
    cross-tabulated according to their discrete expression level. Interactions of
    the variables are generally studied with log-linear models and the structure of
    a log-linear model can be visually represented by a graph from which the
    conditional independence structure can then be read off.

  14. High dimensional sparse covariance estimation via directed acyclic graphs.

    Authors: Peter Bühlmann, Philipp Rütimann
    Subjects: Methodology
    Abstract

    We present a graph-based technique for estimating sparse covariance matrices
    and their inverse from high-dimensional data. The method is based on learning a
    directed acyclic graph (DAG) and estimating parameters of a multivariate
    Gaussian distribution based on a DAG. For inferring the underlying DAG we use
    the PC-algorithm and for estimating the DAG-based covariance matrix and its
    inverse, we use a Cholesky decomposition approach which provides a positive
    (semi-)definite sparse estimate.
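
    The link between a DAG and a sparse inverse covariance can be sketched as
    follows. This is a minimal illustration under stated assumptions, not the
    paper's estimator: the DAG, the edge weights B (row i holds the regression
    coefficients of node i on its parents, nodes in topological order) and the
    noise variances are taken as already known or estimated, and the function
    name is hypothetical. For a linear Gaussian model X = BX + e with diagonal
    noise covariance D, the inverse covariance is (I - B)^T D^{-1} (I - B), which
    is positive semi-definite by construction.

```python
def precision_from_dag(B, noise_var):
    """Inverse covariance (I - B)^T D^{-1} (I - B) of the model X = B X + e.

    B[i][j] is the weight of edge j -> i (0.0 if absent) and noise_var[i]
    is the variance of the noise term of node i.
    """
    p = len(B)
    A = [[(1.0 if i == j else 0.0) - B[i][j] for j in range(p)] for i in range(p)]
    return [[sum(A[k][i] * A[k][j] / noise_var[k] for k in range(p))
             for j in range(p)]
            for i in range(p)]


# Chain X0 -> X1 -> X2 with weights 0.5 and 2.0 and unit noise variances.
B = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [0.0, 2.0, 0.0]]
K = precision_from_dag(B, [1.0, 1.0, 1.0])
# K[0][2] is zero: X0 and X2 are conditionally independent given X1.
```

    The zero pattern of K mirrors the missing edges of the chain, which is where
    the sparsity of the estimate comes from.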

  15. Variable selection in high-dimensional linear models: partially faithful distributions and the PC-simple algorithm.

    Authors: Marloes H. Maathuis, Markus Kalisch, Peter Bühlmann
    Subjects: Methodology
    Abstract

    We consider variable selection in high-dimensional linear models where the
    number of covariates greatly exceeds the sample size. We introduce the new
    concept of partial faithfulness and use it to infer associations between the
    covariates and the response.

  17. On the conditions used to prove oracle results for the Lasso.

    Authors: Peter Bühlmann, Sara A. van de Geer
    Subjects: Statistics
    Abstract

    Oracle inequalities and variable selection properties for the Lasso in linear
    models have been established under a variety of different assumptions on the
    design matrix. We show in this paper how the different conditions and concepts
    relate to each other. The restricted eigenvalue condition (Bickel et al., 2009)
    or the slightly weaker compatibility condition (van de Geer, 2007) are
    sufficient for oracle results. We argue that both these conditions allow for a
    fairly general class of design matrices.

  18. Estimating high-dimensional intervention effects from observational data.

    Authors: Marloes H. Maathuis, Markus Kalisch, Peter Bühlmann
    Subjects: Methodology
    Abstract

    We assume that we have observational data generated from an unknown
    underlying directed acyclic graph (DAG) model. A DAG is typically not
    identifiable from observational data, but it is possible to consistently
    estimate the equivalence class of a DAG. Moreover, for any given DAG, causal
    effects can be estimated using intervention calculus. In this paper, we combine
    these two parts. For each DAG in the estimated equivalence class, we use
    intervention calculus to estimate the causal effects of the covariates on the
    response.
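
    For intuition, the adjustment step can be sketched in the linear Gaussian
    case, where the causal effect of X on Y in a given DAG is the coefficient of
    X when Y is regressed on X and the parents of X. The sketch below works from
    a known covariance matrix rather than from data; the helper names, the toy
    model and the candidate parent sets are assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def causal_effect(cov, x, y, parents_of_x):
    """Coefficient of x in the regression of y on x and the parents of x."""
    if y in parents_of_x:
        return 0.0  # y is a parent of x, so x has no causal effect on y
    S = [x] + list(parents_of_x)
    A = [[cov[i][j] for j in S] for i in S]
    b = [cov[i][y] for i in S]
    return solve(A, b)[0]


# Toy model (indices: Z=0, X=1, Y=2) with unit noise variances:
#   Z = e_z,  X = 0.8 Z + e_x,  Y = 1.5 X + 2.0 Z + e_y
cov = [[1.0, 0.8, 3.2],
       [0.8, 1.64, 4.06],
       [3.2, 4.06, 13.49]]

# One estimate per candidate parent set of X, as if each came from a
# different DAG in the equivalence class.
effects = [causal_effect(cov, 1, 2, pa) for pa in ([0], [])]
```

    With parent set {Z} the sketch recovers the true coefficient 1.5, whereas the
    empty parent set gives the confounded value 4.06 / 1.64. Collecting one value
    per DAG in the equivalence class yields a set of possible causal effects.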
