Marina Sapir

  1. Bipartite ranking algorithm for classification and survival analysis.

    Authors: Marina Sapir
    Subjects: Learning
    Abstract

    Unsupervised aggregation of independently built univariate predictors is
    explored as an alternative regularization approach for noisy, sparse datasets.
    Smooth Rank, a bipartite ranking algorithm implementing this approach, is
    introduced. The advantages of this algorithm are demonstrated on two types of
    problems. First, Smooth Rank is applied to two-class problems from the
    biomedical field, where ranking is often preferable to classification. In a
    comparison against SVMs with radial and linear kernels, Smooth Rank had the
    best performance on 8 out of 12 benchmark datasets.
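
    The general idea of aggregating independently built univariate predictors
    can be illustrated with a minimal sketch. This is not the Smooth Rank
    algorithm itself, whose details are not given in the abstract. It assumes
    that each feature is oriented by its own univariate AUC, that each
    univariate score is a plain rank, and that aggregation is an unweighted
    mean. The function names (ranks, auc, fit_directions, aggregate_scores) are
    hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks of values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_directions(X, y):
    """Build one univariate predictor per feature, independently of the others.

    Assumption of this sketch: a predictor is just an orientation,
    +1 if larger feature values favor the positive class, otherwise -1.
    """
    directions = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        directions.append(1 if auc(column, y) >= 0.5 else -1)
    return directions


def aggregate_scores(X, directions):
    """Aggregate univariate rank scores by an unweighted mean (labels unused)."""
    per_feature = []
    for j, d in enumerate(directions):
        column = [d * row[j] for row in X]
        per_feature.append(ranks(column))
    return [mean(f[i] for f in per_feature) for i in range(len(X))]


if __name__ == "__main__":
    # Toy data invented for illustration only.
    X = [[1, 9], [2, 7], [3, 8], [4, 2], [5, 3], [6, 1]]
    y = [0, 0, 0, 1, 1, 1]
    directions = fit_directions(X, y)
    scores = aggregate_scores(X, directions)
    print(directions, scores, auc(scores, y))
```

    Because the aggregation step never looks at the labels, there are no joint
    weights to fit, which is the sense in which such a scheme can act as a
    regularizer on small, noisy samples.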

  2. Bias Plus Variance Decomposition for Survival Analysis Problems.

    Authors: Marina Sapir
    Subjects: Learning
    Abstract

    Bias-variance decomposition of the expected error, defined for regression
    and classification problems, is an important tool for studying and comparing
    algorithms and for finding the areas where each applies best. Here the
    decomposition is introduced for the survival analysis problem. In our
    experiments, we study the bias and variance parts of the expected error for
    two algorithms: the original Cox proportional hazards regression and CoxPath,
    a path algorithm for L1-regularized Cox regression, on a series of training
    sets of increasing size.
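
    For orientation, the classical decomposition for regression under squared
    loss is shown below. It is the textbook form, not the survival-analysis
    decomposition that the paper introduces. The notation is an assumption of
    this note: y = f(x) + \varepsilon with zero-mean noise of variance
    \sigma^2, and \hat{f}_D is the model fitted on a random training set D.

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\bigl(y - \hat{f}_D(x)\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\bigl(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```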

  3. New Risk Modeling Method for Robust Learning on Smaller Samples.

    Authors: Marina Sapir
    Subjects: Learning
    Abstract

    Prognosis of disease progression is necessary for the development of
    individualized treatment and for understanding the disease. Risk modeling is a
    challenging problem, and too often the number of available relevant
    observations is not sufficient to build a quality model with traditional
    approaches. A new method for survival analysis and risk modeling, Smooth
    Rank, is introduced here. Smooth Rank is robust against overfitting on
    relatively small samples. The method is compared with established risk
    modeling methods on 10 real-life datasets.
