We consider a collection of prediction experiments that are clustered in
the sense that groups of experiments exhibit a similar relationship between the
predictor and response variables. Both the experiment clusters and the
regression relationships are unknown. The regression relationships define
the experiment clusters; in general, the predictor and response variables
themselves need not exhibit any clustering. We call this prediction problem
clustered regression with unknown clusters (CRUC), and in this paper we focus
on linear regression.
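
To make the CRUC setting concrete, the following is a minimal synthetic sketch, not the method of the paper. It assumes scalar linear regression without an intercept, Gaussian predictors and noise, and two clusters with hypothetical slopes; the names `make_experiments` and `ls_slope` are illustrative only. The predictors are drawn from the same distribution in every experiment, so only the regression relationship distinguishes the clusters.

```python
import random

# Hypothetical cluster slopes; any distinct values would do.
TRUE_SLOPES = [-2.0, 3.0]

def make_experiments(slopes, per_cluster, n_samples, noise_std, seed=0):
    """Generate experiments whose cluster is defined only by the slope."""
    rng = random.Random(seed)
    experiments = []
    for cluster_id, slope in enumerate(slopes):
        for _ in range(per_cluster):
            # Same predictor distribution in every cluster.
            xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
            ys = [slope * x + rng.gauss(0.0, noise_std) for x in xs]
            experiments.append((cluster_id, xs, ys))
    rng.shuffle(experiments)  # hide the cluster ordering
    return experiments

def ls_slope(xs, ys):
    """Least-squares slope for the no-intercept model y = w * x."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx

experiments = make_experiments(TRUE_SLOPES, per_cluster=5,
                               n_samples=200, noise_std=0.1)
# Per-experiment estimates; the true cluster label is kept only for checking.
estimates = [(cid, ls_slope(xs, ys)) for cid, xs, ys in experiments]
```

In this toy setup the per-experiment slope estimates concentrate around the two cluster slopes, which is what allows the experiments to be grouped even though the predictors carry no cluster information.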
Motivated by applications such as recommendation systems, we consider the
estimation of a binary random field X obtained by row and column permutations
of a block constant random matrix. The estimation of X is based on observations
Y, which are obtained by passing entries of X through a binary symmetric
channel (BSC) and an erasure channel. We focus on the analysis of a specific
algorithm based on local popularity when the erasure rate approaches unity at a
specified rate. We study the bit error rate (BER) in the limit as the matrix
size approaches infinity.
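
The observation model described above can be sketched as follows. This is an illustrative simulation under stated assumptions, not the algorithm analyzed in the paper: rows and columns are assigned to two clusters uniformly at random (equivalent to permuting a block-constant matrix), erasures and bit flips are independent across entries, and the names `block_constant_matrix`, `observe`, and `ERASED` are hypothetical.

```python
import random

ERASED = None  # marker for an erased (unobserved) entry

def block_constant_matrix(row_clusters, col_clusters, block_values):
    """X[i][j] depends only on the clusters of row i and column j."""
    return [[block_values[r][c] for c in col_clusters] for r in row_clusters]

def observe(X, flip_prob, erasure_prob, rng):
    """Pass each entry through an erasure channel and a BSC."""
    Y = []
    for row in X:
        out = []
        for x in row:
            if rng.random() < erasure_prob:
                out.append(ERASED)                      # entry is missing
            elif rng.random() < flip_prob:
                out.append(x ^ 1)                       # BSC flips the bit
            else:
                out.append(x)                           # bit passes unchanged
        Y.append(out)
    return Y

rng = random.Random(1)
n_rows, n_cols, n_clusters = 6, 8, 2
row_clusters = [rng.randrange(n_clusters) for _ in range(n_rows)]
col_clusters = [rng.randrange(n_clusters) for _ in range(n_cols)]
block_values = [[rng.randrange(2) for _ in range(n_clusters)]
                for _ in range(n_clusters)]

X = block_constant_matrix(row_clusters, col_clusters, block_values)
Y = observe(X, flip_prob=0.1, erasure_prob=0.5, rng=rng)
```

The regime studied in the text corresponds to letting `erasure_prob` approach 1 as the matrix grows; the fixed value used here is only for illustration.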
We consider the problem of collaborative filtering from a channel coding
perspective. We model the underlying rating matrix as a finite alphabet matrix
with block constant structure. The observations are obtained from this
underlying matrix through a discrete memoryless channel with a noisy part
representing noisy user behavior and an erasure part representing missing data.
Moreover, the clusters over which the underlying matrix is constant are
unknown.