Statistical Methods for Analyzing Tissue Microarray Images - Algorithmic Scoring and Co-training.

link: http://arxiv.org/abs/1102.0059
Abstract

Recent advances in tissue microarray technology have allowed
immunohistochemistry to become a powerful medium-to-high throughput analysis
tool, particularly for the validation of diagnostic and prognostic biomarkers.
However, as study size grows, the manual evaluation of these assays becomes a
prohibitive limitation; it vastly reduces throughput and greatly increases
variability and expense. We propose an algorithm - Tissue Array Co-Occurrence
Matrix Analysis (TACOMA) - for quantifying cellular phenotypes based on
textural regularity summarized by local inter-pixel relationships. The
algorithm can be easily trained for any staining pattern, has no
sensitive tuning parameters, and can report the salient pixels in an
image that contribute to its score. Pathologists' input via informative
training patches is an important aspect of the algorithm, allowing it to be
trained for any specific marker or cell type. With co-training, TACOMA can be
trained with a radically small training sample (e.g., of size 30). We give
theoretical insights into the success of co-training via thinning of the
feature set in a high dimensional setting when there is "sufficient" redundancy
among the features. TACOMA is flexible, transparent and provides a scoring
process that can be evaluated with clarity and confidence. In a study based on
an estrogen receptor (ER) marker, we show that TACOMA performs comparably to,
or better than, pathologists in terms of accuracy and repeatability.
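
To make "local inter-pixel relationships" concrete, the following is a minimal
sketch of a generic gray-level co-occurrence matrix, the kind of texture
summary the algorithm's name refers to. It is not the authors' implementation:
the function name, the toy image, the number of gray levels, and the
single horizontal offset are all assumptions chosen for illustration, and the
abstract does not specify how TACOMA quantizes images or which offsets it uses.

```python
def cooccurrence_matrix(image, levels, offset=(0, 1)):
    """Count ordered pairs of gray levels separated by a fixed offset.

    image  : list of rows, each a list of integer gray levels in [0, levels)
    levels : number of distinct gray levels (assumed already quantized)
    offset : (row, col) displacement from a pixel to its neighbor;
             (0, 1) means "the pixel immediately to the right"
    """
    dr, dc = offset
    n_rows, n_cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + dr, c + dc
            # Skip pairs whose neighbor falls outside the image.
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


# Hypothetical 4x4 image with 4 gray levels (illustrative data only).
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = cooccurrence_matrix(toy, levels=4)
for row in glcm:
    print(row)
```

Entry (i, j) of the result counts how often gray level i has gray level j as
its right-hand neighbor. Because each count can be traced back to specific
pixel pairs, a summary of this kind is compatible with reporting which pixels
contribute to a score, as the abstract describes.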