The Multivariate Watson Distribution: Maximum-Likelihood Estimation and other Aspects.

Authors: Suvrit Sra, Dmitrii Karp
Subjects: Computation
link: http://arxiv.org/abs/1104.4422
Abstract

This paper studies fundamental aspects of modelling data using multivariate
Watson distributions. Although these distributions are natural for modelling
axially symmetric data (i.e., unit vectors where $\pm \x$ are equivalent), for
high-dimensions using them can be difficult. Why so? Largely because for Watson
distributions even basic tasks such as maximum-likelihood are numerically
challenging. To tackle the numerical difficulties some approximations have been
derived---but these are either grossly inaccurate in high-dimensions
(\emph{Directional Statistics}, Mardia & Jupp. 2000) or when reasonably
accurate (\emph{J. Machine Learning Research, W. & C.P., v2}, Bijral \emph{et
al.}, 2007, pp. 35--42), they lack theoretical justification. We derive new
approximations to the maximum-likelihood estimates; our approximations are
theoretically well-defined, numerically accurate, and easy to compute. We build
on our parameter estimation and discuss mixture-modelling with Watson
distributions; here we uncover a hitherto unknown connection to the
"diametrical clustering" algorithm of Dhillon \emph{et al.}
(\emph{Bioinformatics}, 19(13), 2003, pp. 1612--1619).