We consider the predictive problem of supervised ranking, where the task is
to rank sets of candidate items returned in response to queries. Although there
exist statistical procedures that come with guarantees of consistency in this
setting, these procedures require that individuals provide a complete ranking
of all items, which is rarely feasible in practice. Instead, individuals
routinely provide partial preference information, such as pairwise comparisons
of items, and more practical approaches to ranking have aimed at modeling this
partial preference data directly.
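
To make the form of such partial preference data concrete, the following is a minimal sketch that aggregates pairwise comparisons into an ordering by net win count. It is only an illustration of the input format, not the procedure studied in this work; the function name `rank_by_wins` and the toy comparisons are assumptions introduced for the example.

```python
from collections import defaultdict

def rank_by_wins(comparisons):
    """Order items by net wins; `comparisons` is a list of (winner, loser) pairs."""
    score = defaultdict(int)
    for winner, loser in comparisons:
        score[winner] += 1
        score[loser] -= 1
    # Sort by descending net score; ties are broken by item name for determinism.
    return sorted(score, key=lambda item: (-score[item], item))

# Hypothetical comparisons: each pair records which item was preferred.
comparisons = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
ranking = rank_by_wins(comparisons)
```

Note that no individual supplies a complete ranking here; the ordering is assembled entirely from pairwise judgments.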
In this work, we establish novel connections between the Bayesian
nonparametric clustering and featural paradigms by considering the problem of
admixture modeling. We examine the Dirichlet process (and its unnormalized
Poisson point process generation via the gamma process) on the traditional
clustering side of Bayesian nonparametrics. On the featural side, we examine
the beta process and introduce a new model, the beta negative binomial process
(BNBP), for admixture modeling.
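
As a finite-dimensional analogue of the relationship between the gamma process and the Dirichlet process, the sketch below draws independent gamma masses and normalizes them, which yields a Dirichlet-distributed weight vector. The function name `dirichlet_via_gammas` and the parameter values are assumptions chosen for illustration; the nonparametric construction itself involves infinitely many atoms and is not reproduced here.

```python
import random

def dirichlet_via_gammas(alphas, rng):
    # Draw independent Gamma(alpha_k, 1) masses (the "unnormalized" weights).
    masses = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(masses)
    # Normalizing the masses gives a draw from Dirichlet(alphas).
    return [m / total for m in masses]

rng = random.Random(0)
weights = dirichlet_via_gammas([0.5, 1.0, 2.0], rng)
```

The unnormalized masses correspond to the gamma-process view, and the normalized weights to the clustering view in which they act as mixing proportions.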
This work introduces SubMF, a parallel divide-and-conquer framework for noisy
matrix factorization. SubMF divides a large-scale matrix factorization task
into smaller subproblems, solves each subproblem in parallel using an arbitrary
base matrix factorization algorithm, and combines the subproblem solutions
using techniques from randomized matrix approximation. Our experiments with
collaborative filtering, video background modeling, and simulated data
demonstrate the near-linear to super-linear speed-ups attainable with this
approach.
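
The divide, factor, and combine steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SubMF implementation: the base algorithm is a truncated SVD stand-in, the blocks are column partitions, and the combination step projects all block estimates onto the column space of the first estimate, which is one simple choice among possible combination rules. All function names are hypothetical.

```python
import numpy as np

def truncated_svd(block, rank):
    # Stand-in base algorithm: best rank-`rank` approximation of one block.
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def divide_factor_combine(matrix, num_blocks, rank):
    # Divide: split the columns into roughly equal blocks.
    blocks = np.array_split(matrix, num_blocks, axis=1)
    # Factor: blocks are independent, so this loop could be run in parallel.
    estimates = [truncated_svd(b, rank) for b in blocks]
    # Combine: project every block estimate onto the column space
    # of the first block's estimate (an assumed combination rule).
    basis, _, _ = np.linalg.svd(estimates[0], full_matrices=False)
    basis = basis[:, :rank]
    return basis @ (basis.T @ np.hstack(estimates))

# Simulated data: a rank-3 matrix observed with small additive noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 40))
noisy = low_rank + 0.01 * rng.standard_normal((30, 40))
estimate = divide_factor_combine(noisy, num_blocks=4, rank=3)
rel_error = np.linalg.norm(estimate - low_rank) / np.linalg.norm(low_rank)
```

Because each subproblem touches only a fraction of the columns, the per-block cost shrinks with the number of blocks, which is the source of the parallel speed-up in this kind of scheme.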