Owen Kaser

  1. Reordering Columns for Smaller Indexes.

    Authors: Daniel Lemire, Owen Kaser
    Subjects: Databases
    Abstract

    Column-oriented indexes-such as projection or bitmap indexes-are compressed
    by run-length encoding to reduce storage and increase speed. Sorting the tables
    improves compression. On realistic data sets, permuting the columns in the
    right order before sorting can reduce the number of runs by a factor of two or
    more. For many cases, we prove that the number of runs in table columns is
    minimized if we sort columns by increasing cardinality. Yet-maybe
    surprisingly-we must sometimes maximize the number of runs to minimize the
    index size.

RSS-материал