Aarti Singh

  1. Density-Sensitive Semisupervised Inference.

    Authors: Larry Wasserman, Aarti Singh, Martin Azizyan
    Subjects: Statistics
    Abstract

    Semisupervised methods are techniques for using labeled data $(X_1,Y_1),...,
    (X_n,Y_n)$ together with unlabeled data $X_{n+1},..., X_N$ to make predictions.
    These methods invoke some assumption that links the marginal distribution $P_X$
    of $X$ to the regression function $f(x)$. For example, it is common to assume
    that $f$ is very smooth over high density regions of $P_X$. Many of the methods
    are ad-hoc and have been shown to work in specific examples but are lacking a
    theoretical foundation. We provide a minimax framework for analyzing
    semisupervised methods.

  2. Adaptive Semisupervised Inference.

    Authors: Larry Wasserman, Aarti Singh, Martin Azizyan
    Subjects: Statistics
    Abstract

    Semisupervised methods inevitably invoke some assumption that links the
    marginal distribution of the features to the regression function of the label.
    Most commonly, the cluster or manifold assumptions are used which imply that
    the regression function is smooth over high-density clusters or manifolds
    supporting the data.

  3. Active Clustering: Robust and Efficient Hierarchical Clustering using Adaptively Selected Similarities.

    Authors: Aarti Singh, Robert Nowak, Brian Eriksson, Gautam Dasarathy
    Subjects: Information Theory
    Abstract

    Hierarchical clustering based on pairwise similarities is a common tool used
    in a broad range of scientific applications. However, in many problems it may
    be expensive to obtain or compute similarities between the items to be
    clustered. This paper investigates the hierarchical clustering of N items based
    on a small subset of pairwise similarities, significantly less than the
    complete set of N(N-1)/2 similarities.

  4. Stability of Density-Based Clustering.

    Authors: Larry Wasserman, Aarti Singh, Alessandro Rinaldo, Rebecca Nugent
    Subjects: Machine Learning
    Abstract

    High density clusters can be characterized by the connected components of a
    level set $L(\lambda) = \{x:\ p(x)>\lambda\}$ of the underlying probability
    density function $p$ generating the data, at some appropriate level
    $\lambda\geq 0$. The complete hierarchical clustering can be characterized by a
    cluster tree ${\cal T}= \bigcup_{\lambda} L(\lambda)$. In this paper, we study
    the behavior of a density level set estimate $\widehat L(\lambda)$ and cluster
    tree estimate $\widehat{\cal{T}}$ based on a kernel density estimator with
    kernel bandwidth $h$.

  5. Detecting Weak but Hierarchically-Structured Patterns in Networks.

    Authors: Aarti Singh, Robert Calderbank, Robert D. Nowak
    Subjects: Information Theory
    Abstract

    The ability to detect weak distributed activation patterns in networks is
    critical to several applications, such as identifying the onset of anomalous
    activity or incipient congestion in the Internet, or faint traces of a
    biochemical spread by a sensor network. This is a challenging problem since
    weak distributed patterns can be invisible in per node statistics as well as a
    global network-wide aggregate. Most prior work considers situations in which
    the activation/non-activation of each node is statistically independent, but
    this is unrealistic in many problems.

  6. Adaptive Hausdorff estimation of density level sets.

    Authors: Aarti Singh, Clayton Scott, Robert Nowak
    Subjects: gr. Statistics
    Abstract

    Consider the problem of estimating the $\gamma$-level set
    $G^*_{\gamma}=\{x:f(x)\geq\gamma\}$ of an unknown $d$-dimensional density
    function $f$ based on $n$ independent observations $X_1,...,X_n$ from the
    density. This problem has been addressed under global error criteria related to
    the symmetric set difference. However, in certain applications a spatially
    uniform mode of convergence is desirable to ensure that the estimated set is
    close to the target set everywhere. The Hausdorff error criterion provides this
    degree of uniformity and, hence, is more appropriate in such situations.

Syndicate content