Clayton Scott

  1. Active Diagnosis via AUC Maximization: An Efficient Approach for Multiple Fault Identification in Large Scale, Noisy Networks.

    Authors: Clayton Scott, Gowtham Bellala, Jason Stanley, Suresh K. Bhavnani
    Subjects: Learning
    Abstract

    The problem of active diagnosis arises in several applications such as
    disease diagnosis, and fault diagnosis in computer networks, where the goal is
    to rapidly identify the binary states of a set of objects (e.g., faulty or
    working) by sequentially selecting, and observing, (noisy) responses to binary
    valued queries. Current algorithms in this area rely on loopy belief
    propagation for active query selection. These algorithms have an exponential
    time complexity, making them slow and even intractable in large networks.

  2. Calibrated Surrogate Losses for Classification with Label-Dependent Costs.

    Authors: Clayton Scott
    Subjects: Machine Learning
    Abstract

    We present surrogate regret bounds for arbitrary surrogate losses in the
    context of binary classification with label-dependent costs. Such bounds relate
    a classifier's risk, assessed with respect to a surrogate loss, to its
    cost-sensitive classification risk. Two approaches to surrogate regret bounds
    are developed. The first is a direct generalization of Bartlett et al.

  3. Statistical File Matching of Flow Cytometry Data.

    Authors: Clayton Scott, Gyemin Lee, William Finn
    Subjects: Applications
    Abstract

    Flow cytometry is a technology that rapidly measures antigen-based markers
    associated to cells in a cell population. Although analysis of flow cytometry
    data has traditionally considered one or two markers at a time, there has been
    increasing interest in multidimensional analysis. However, flow cytometers are
    limited in the number of markers they can jointly observe, which is typically a
    fraction of the number of markers of interest. For this reason, practitioners
    often perform multiple assays based on different, overlapping combinations of
    markers.

  4. Query Learning with Exponential Query Costs.

    Authors: Clayton Scott, Gowtham Bellala, Suresh Bhavnani
    Subjects: Machine Learning
    Abstract

    In query learning, the goal is to identify an unknown object while minimizing
    the number of "yes" or "no" questions (queries) posed about that object. A
    well-studied algorithm for query learning is known as generalized binary search
    (GBS). We show that GBS is a greedy algorithm to optimize the expected number
    of queries needed to identify the unknown object. We also generalize GBS in two
    ways. First, we consider the case where the cost of querying grows
    exponentially in the number of queries and the goal is to minimize the expected
    exponential cost.

  5. Group-based Query Learning for rapid diagnosis in time-critical situations.

    Authors: Clayton Scott, Gowtham Bellala, Suresh Bhavnani
    Subjects: Machine Learning
    Abstract

    In query learning, the goal is to identify an unknown object while minimizing
    the number of "yes or no" questions (queries) posed about that object. We
    consider three extensions of this fundamental problem that are motivated by
    practical considerations in real-world, time-critical identification tasks such
    as emergency response. First, we consider the problem where the objects are
    partitioned into groups, and the goal is to identify only the group to which
    the object belongs.

  6. Adaptive Hausdorff estimation of density level sets.

    Authors: Aarti Singh, Clayton Scott, Robert Nowak
    Subjects: gr. Statistics
    Abstract

    Consider the problem of estimating the $\gamma$-level set
    $G^*_{\gamma}=\{x:f(x)\geq\gamma\}$ of an unknown $d$-dimensional density
    function $f$ based on $n$ independent observations $X_1,...,X_n$ from the
    density. This problem has been addressed under global error criteria related to
    the symmetric set difference. However, in certain applications a spatially
    uniform mode of convergence is desirable to ensure that the estimated set is
    close to the target set everywhere. The Hausdorff error criterion provides this
    degree of uniformity and, hence, is more appropriate in such situations.

Syndicate content