We find lower and upper bounds for the risk of estimating a manifold in
Hausdorff distance under several models. We also show that there are close
connections between manifold estimation and the problem of deconvolving a
singular measure.
We find the minimax rate of convergence in Hausdorff distance for estimating
a manifold M of dimension d embedded in R^D given a noisy sample from the
manifold. We assume that the manifold satisfies a smoothness condition and that
the noise distribution has compact support. We show that the optimal rate of
convergence is n^{-2/(2+d)}. Thus, the minimax rate depends only on the
dimension of the manifold, not on the dimension of the space in which M is
embedded.
We develop nonparametric methods for estimating filamentary structure from
planar point process data and find the minimax lower bound for this problem. We
show that, under weak conditions, the filaments have a simple geometric
representation as the medial axis of the data distribution's support. Our
methods convert an estimator of the support's boundary into an estimator of the
filaments. We find the rates of convergence of our estimators and show that
when using an optimal boundary estimator, they achieve the minimax rate.
We consider the problem of reliably finding filaments in point clouds.
Realistic data sets often have numerous filaments of various sizes and shapes.
Statistical techniques exist for finding one (or a few) filaments but these
methods do not handle noisy data sets with many filaments. Other methods can be
found in the astronomy literature but they do not have rigorous statistical
guarantees. We propose the following method. Starting at each data point we
construct the steepest ascent path along a kernel density estimator.
We consider the problem of reliably finding filaments in point clouds.
Realistic data sets often have numerous filaments of various sizes and shapes.
Statistical techniques exist for finding one (or a few) filaments but these
methods do not handle noisy data sets with many filaments. Other methods can be
found in the astronomy literature but they do not have rigorous statistical
guarantees. We propose the following method. Starting at each data point we
construct the steepest ascent path along a kernel density estimator.