In many signal detection and classification problems, we have knowledge of
the distribution under each hypothesis, but not the prior probabilities.
Standard compressive sensing results state that to exactly recover an s
sparse signal in R^p, one requires O(s\cdotlog p) measurements. While this
bound is extremely useful in practice, often real world signals are not only
sparse, but also exhibit structure in the sparsity pattern. We focus on
group-structured patterns in this paper. Under this model, groups of signal
coefficients are active (or inactive) together. The groups are predefined, but
the particular set of groups that are active (i.e., in the signal support) must
be learned from measurements.
This paper addresses the problem of inferring sparse causal networks modeled
by multivariate auto-regressive (MAR) processes. Conditions are derived under
which the Group Lasso (gLasso) procedure consistently estimates sparse network
structure. The key condition involves a "false connection score." In
particular, we show that consistent recovery is possible even when the number
of observations of the network is far less than the number of parameters
describing the network, provided that the false connection score is less than
one.
This paper presents results pertaining to sequential methods for support
recovery of sparse signals in noise. Specifically, we show that any sequential
measurement procedure fails provided the average number of measurements per
dimension grows less then log s / D(f0||f1) where s is the level of sparsity,
and D(f0||f1) the Kullback-Leibler divergence between the underlying
distributions.
Hierarchical clustering based on pairwise similarities is a common tool used
in a broad range of scientific applications. However, in many problems it may
be expensive to obtain or compute similarities between the items to be
clustered. This paper investigates the hierarchical clustering of N items based
on a small subset of pairwise similarities, significantly less than the
complete set of N(N-1)/2 similarities.
This work presents GROUSE (Grassmanian Rank-One Update Subspace Estimation),
an efficient online algorithm for tracking subspaces from highly incomplete
observations. GROUSE requires only basic linear algebraic manipulations at each
iteration, and each subspace update can be performed in linear time in the
dimension of the subspace. The algorithm is derived by analyzing incremental
gradient descent on the Grassmannian manifold of subspaces.
We consider the problem of deciding whether a highly incomplete signal lies
within a given subspace. This problem, Matched Subspace Detection, is a
classical, well-studied problem when the signal is completely observed. High-
dimensional testing problems in which it may be prohibitive or impossible to
obtain a complete observation motivate this work. The signal is represented as
a vector in R^n, but we only observe m << n of its elements.
Adaptive sampling results in dramatic improvements in the recovery of sparse
signals in white Gaussian noise. A sequential adaptive sampling-and-refinement
procedure called distilled sensing (DS) is proposed and analyzed. DS is a form
of multi-stage experimental design and testing. Because of the adaptive nature
of the data collection, DS can detect and localize far weaker signals than
possible from non-adaptive measurements.
Consider the problem of estimating the $\gamma$-level set
$G^*_{\gamma}=\{x:f(x)\geq\gamma\}$ of an unknown $d$-dimensional density
function $f$ based on $n$ independent observations $X_1,...,X_n$ from the
density. This problem has been addressed under global error criteria related to
the symmetric set difference. However, in certain applications a spatially
uniform mode of convergence is desirable to ensure that the estimated set is
close to the target set everywhere. The Hausdorff error criterion provides this
degree of uniformity and, hence, is more appropriate in such situations.