Carlos Guestrin

  1. Distributed GraphLab: A Framework for Machine Learning in the Cloud.

    Authors: Danny Bickson, Carlos Guestrin, Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Joseph M. Hellerstein
    Subjects: Databases
    Abstract

    While high-level data parallel frameworks, like MapReduce, simplify the
    design and implementation of large-scale data processing systems, they do not
    naturally or efficiently support many important data mining and machine
    learning algorithms and can lead to inefficient learning systems. To help fill
    this critical void, we introduced the GraphLab abstraction which naturally
    expresses asynchronous, dynamic, graph-parallel computation while ensuring data
    consistency and achieving a high degree of parallel performance in the
    shared-memory setting.

  2. Concept Modeling with Superwords.

    Authors: Emily B. Fox, Carlos Guestrin, Khalid El-Arini
    Subjects: Machine Learning
    Abstract

    In information retrieval, a fundamental goal is to transform a document into
    concepts that are representative of its content. The term "representative" is
    in itself challenging to define, and various tasks require different
    granularities of concepts. In this paper, we aim to model concepts that are
    sparse over the vocabulary, and that flexibly adapt their content based on
    other relevant semantic information such as textual structure or associated
    image features.

  3. Multiresolution Cube Estimators for Sensor Network Aggregate Queries.

    Authors: Carlos Guestrin, Joseph M. Hellerstein, Alexandra Meliou
    Subjects: Databases
    Abstract

    In this work we present in-network techniques to improve the efficiency of
    spatial aggregate queries. Such queries are very common in sensor networks
    and call for more targeted handling techniques. Our approach constructs
    and maintains multi-resolution cube hierarchies inside the network, which can
    be constructed in a distributed fashion. In case of failures, recovery can also
    be performed with in-network decisions.

  4. GraphLab: A New Framework for Parallel Machine Learning.

    Authors: Danny Bickson, Carlos Guestrin, Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Joseph M. Hellerstein
    Subjects: Learning
    Abstract

    Designing and implementing efficient, provably correct parallel machine
    learning (ML) algorithms is challenging. Existing high-level parallel
    abstractions like MapReduce are insufficiently expressive while low-level tools
    like MPI and Pthreads leave ML experts repeatedly solving the same design
    challenges. By targeting common patterns in ML, we developed GraphLab, which
    improves upon abstractions like MapReduce by compactly expressing asynchronous
    iterative algorithms with sparse computational dependencies while ensuring data
    consistency and achieving a high degree of parallel performance.

  5. Uncovering the Riffled Independence Structure of Rankings.

    Authors: Jonathan Huang, Carlos Guestrin
    Subjects: Learning
    Abstract

    Representing distributions over permutations can be a daunting task because
    the number of permutations of $n$ objects scales factorially in
    $n$. One recent approach to reducing storage complexity has been to
    exploit probabilistic independence, but as we argue, full independence
    assumptions impose strong sparsity constraints on distributions and are
    unsuitable for modeling rankings.
