We provide a unifying framework linking two classes of statistics used in
two-sample and independence testing: on the one hand, the energy distances and
distance covariances from the statistics literature; on the other, distances
between embeddings of distributions into reproducing kernel Hilbert spaces
(RKHS), as established in machine learning. The equivalence holds when energy
distances are computed with semimetrics of negative type, in which case a
kernel may be defined such that the RKHS distance between distributions
corresponds exactly to the energy distance.
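The correspondence can be checked numerically. The following is a minimal sketch, assuming the Euclidean distance as the semimetric, an arbitrary centre point `z0`, and biased (V-statistic) estimates; the function names are illustrative and are not taken from the paper. With the distance-induced kernel k(x, y) = [rho(x, z0) + rho(y, z0) - rho(x, y)] / 2, the terms involving `z0` cancel and the energy distance equals twice the squared RKHS distance under this convention.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def energy_distance(x, y):
    """Biased (V-statistic) estimate of 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (2.0 * pairwise_dist(x, y).mean()
            - pairwise_dist(x, x).mean()
            - pairwise_dist(y, y).mean())

def distance_kernel(a, b, z0):
    """Kernel induced by the distance, centred at the (assumed) point z0."""
    da = pairwise_dist(a, z0[None, :])        # shape (n, 1)
    db = pairwise_dist(b, z0[None, :])        # shape (m, 1)
    return 0.5 * (da + db.T - pairwise_dist(a, b))

def mmd_squared(x, y, z0):
    """Biased estimate of the squared RKHS distance between mean embeddings."""
    return (distance_kernel(x, x, z0).mean()
            + distance_kernel(y, y, z0).mean()
            - 2.0 * distance_kernel(x, y, z0).mean())

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(50, 2))
y = rng.normal(0.5, 1.0, size=(60, 2))
z0 = np.zeros(2)                              # any centre gives the same MMD

energy = energy_distance(x, y)
mmd2 = mmd_squared(x, y, z0)
print(energy, 2.0 * mmd2)                     # the two values should agree
```

Changing `z0` changes the kernel but not the resulting distance between the two samples.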
This paper presents a kernel-based discriminative learning framework on
probability measures. Rather than relying on large collections of vectorial
training examples, our framework learns using a collection of probability
distributions that have been constructed to meaningfully represent training
data. By representing these probability distributions as mean embeddings in a
reproducing kernel Hilbert space (RKHS), we are able to apply many standard
kernel-based learning techniques in a straightforward fashion.
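As a rough illustration of learning on distributions, the inner product of two empirical mean embeddings is the average of the base kernel over all sample pairs. The sketch below assumes a Gaussian base kernel and three synthetic sample sets ("bags"); the names `embedding_kernel` and `bags` are hypothetical. The resulting Gram matrix could then be passed to any standard kernel method.

```python
import numpy as np

def gaussian_gram(a, b, bandwidth=1.0):
    """Gaussian kernel values between the rows of a and the rows of b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def embedding_kernel(p, q, bandwidth=1.0):
    """Inner product of the empirical mean embeddings of samples p and q."""
    return gaussian_gram(p, q, bandwidth).mean()

rng = np.random.default_rng(0)
# Each bag is a sample standing in for one training distribution (assumed data).
bags = [rng.normal(loc=m, scale=1.0, size=(30, 2)) for m in (0.0, 0.1, 3.0)]

# Gram matrix over distributions, usable by a kernel classifier or regressor.
gram = np.array([[embedding_kernel(p, q) for q in bags] for p in bags])
print(gram)
```

Bags drawn from nearby distributions receive a larger kernel value than bags drawn from distant ones.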
We propose a new approach to the theoretical analysis of Loopy Belief
Propagation (LBP) and the Bethe free energy (BFE) by establishing a formula to
connect LBP and BFE with a graph zeta function. The proposed approach is
applicable to a wide class of models including multinomial and Gaussian types.
The connection yields a number of new theoretical results on LBP and BFE. This
paper focuses on two such topics.
Discussion of "Brownian distance covariance" by Gábor J. Székely and
Maria L. Rizzo [arXiv:1010.0297]
A kernel method is proposed for realizing Bayes' rule, based on
representations of probability distributions in reproducing kernel Hilbert
spaces (RKHS). The empirical RKHS embeddings of the conditional probabilities
and prior are expressed as feature mappings of samples, and an RKHS embedding
of the posterior distribution is computed, again based on a feature mapping of
a sample. This kernel Bayes' rule can be applied to a wide variety of
nonparametric Bayesian inference problems. As an example, the approach is used
in filtering with a nonparametric state-space model.
A Hilbert space embedding for probability measures has recently been
proposed, wherein any probability measure is represented as a mean element in a
reproducing kernel Hilbert space (RKHS). Such an embedding has found
applications in homogeneity testing, independence testing, dimensionality
reduction, etc., with the requirement that the reproducing kernel is
characteristic, i.e., the embedding is injective.
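To make the injectivity requirement concrete, the sketch below compares two small samples that share a mean but differ in spread. It assumes one-dimensional data, a biased estimate of the squared embedding distance, and a Gaussian kernel with unit bandwidth; all names are illustrative. The linear kernel, which is not characteristic, cannot separate the two samples, while the Gaussian kernel can.

```python
import numpy as np

def mmd_squared(x, y, kernel):
    """Biased estimate of the squared distance between mean embeddings."""
    return (kernel(x, x).mean() + kernel(y, y).mean()
            - 2.0 * kernel(x, y).mean())

def linear_kernel(a, b):
    """Linear kernel: its embedding only retains the mean."""
    return a[:, None] * b[None, :]

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel, which is characteristic on the real line."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * bandwidth ** 2))

# Same mean (zero), different spread.
x = np.array([-1.0, 1.0])
y = np.array([-2.0, 2.0])

mmd2_linear = mmd_squared(x, y, linear_kernel)
mmd2_gauss = mmd_squared(x, y, gaussian_kernel)
print(mmd2_linear, mmd2_gauss)
```

The linear-kernel value is zero even though the samples differ, which is exactly the failure that a characteristic kernel rules out.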
We propose a new approach to the analysis of Loopy Belief Propagation (LBP)
by establishing a formula that connects the Hessian of the Bethe free energy
with the edge zeta function. The formula has a number of theoretical
implications for LBP. It yields a sufficient condition for the Hessian of the
Bethe free energy to be positive definite, and it shows non-convexity for
graphs with multiple cycles. The formula clarifies the
relation between the local stability of a fixed point of LBP and local minima
of the Bethe free energy.
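For readers unfamiliar with graph zeta functions, the sketch below illustrates only the single-variable (Ihara) special case, not the Hessian formula of this paper. It numerically checks the classical Ihara-Bass identity det(I - uM) = (1 - u^2)^(|E| - |V|) det(I - uA + u^2 (D - I)), where M is the non-backtracking matrix on directed edges. The example graph and the value of `u` are arbitrary assumptions.

```python
import numpy as np

# Small example graph with two independent cycles (assumed for illustration).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)]
n_vertices = 4
u = 0.3

# Each undirected edge gives two directed edges.
directed = edges + [(b, a) for a, b in edges]
m = len(directed)

# Non-backtracking matrix: edge (a -> b) may be followed by (b -> d) if d != a.
M = np.zeros((m, m))
for i, (a, b) in enumerate(directed):
    for j, (c, d) in enumerate(directed):
        if b == c and d != a:
            M[i, j] = 1.0

# Adjacency and degree matrices of the undirected graph.
A = np.zeros((n_vertices, n_vertices))
for a, b in edges:
    A[a, b] = A[b, a] = 1.0
D = np.diag(A.sum(axis=1))

# Reciprocal of the Ihara zeta function, computed two ways.
lhs = np.linalg.det(np.eye(m) - u * M)
rhs = ((1.0 - u ** 2) ** (len(edges) - n_vertices)
       * np.linalg.det(np.eye(n_vertices) - u * A + u ** 2 * (D - np.eye(n_vertices))))
print(lhs, rhs)  # the two determinants should agree
```

The edge zeta function generalizes this by attaching a separate variable to each directed edge in place of the single `u`.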
We introduce two graph polynomials and discuss their properties. The first is a
polynomial of two variables, motivated by performance analysis of the Bethe
approximation of the Ising partition function. The second, a polynomial of one
variable, is obtained as a specialization of the first. It is shown that these
polynomials satisfy deletion-contraction relations and are essentially new
examples of V-functions, which were introduced by Tutte (1947, Proc. Cambridge
Philos. Soc. 43, 26-40).
A class of distance measures on probabilities -- the integral probability
metrics (IPMs) -- is addressed: these include the Wasserstein distance, Dudley
metric, and Maximum Mean Discrepancy. IPMs have thus far mostly been used in
more abstract settings, for instance as theoretical tools in mass
transportation problems, and in metrizing the weak topology on the set of all
Borel probability measures defined on a metric space.
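Two of these metrics are easy to estimate from samples. The sketch below is a minimal illustration under stated assumptions: one-dimensional samples of equal size (so that the Wasserstein-1 distance between the empirical measures reduces to matching sorted values) and a Gaussian kernel with unit bandwidth for the Maximum Mean Discrepancy. The function names are illustrative.

```python
import numpy as np

def wasserstein1_1d(x, y):
    """Wasserstein-1 distance between equal-size 1D empirical measures."""
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal size")
    # In one dimension the optimal coupling matches sorted values.
    return np.abs(np.sort(x) - np.sort(y)).mean()

def mmd(x, y, bandwidth=1.0):
    """Biased Maximum Mean Discrepancy estimate with a Gaussian kernel."""
    def k(a, b):
        return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * bandwidth ** 2))
    value = k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()
    return np.sqrt(max(value, 0.0))  # guard against tiny negative round-off

x = np.array([0.0, 1.0, 2.0])
y = x + 1.0                          # the same sample shifted by one unit

w1 = wasserstein1_1d(x, y)
m = mmd(x, y)
print(w1, m)
```

For a pure shift, the Wasserstein-1 distance equals the size of the shift, while the MMD value depends on the kernel and its bandwidth.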
We present a new methodology for sufficient dimension reduction (SDR). Our
methodology derives directly from the formulation of SDR in terms of the
conditional independence of the covariate $X$ from the response $Y$, given the
projection of $X$ on the central subspace [cf. J. Amer. Statist. Assoc. 86
(1991) 316--342 and Regression Graphics (1998) Wiley].