Jun Yang

  1. Optimizing I/O for Big Array Analytics.

    Authors: Yi Zhang, Jun Yang
    Subjects: Databases
    Abstract:

    Big array analytics is becoming indispensable in answering important
    scientific and business questions. Most analysis tasks consist of multiple
    steps, each making one or more passes over the arrays to be analyzed and
    generating intermediate results. In the big data setting, I/O optimization is
    key to efficient analytics. In this paper, we develop a framework and
    techniques for capturing a broad range of analysis tasks expressible in
    nested-loop forms, representing them in a declarative way, and optimizing their
    I/O by identifying sharing opportunities.

  2. RIOT: I/O-Efficient Numerical Computing without SQL.

    Authors: Yi Zhang, Herodotos Herodotou, Jun Yang
    Subjects: Databases
    Abstract:

    R is a numerical computing environment that is widely popular for statistical
    data analysis. Like many such environments, R performs poorly for large
    datasets whose sizes exceed that of physical memory. We present our vision of
    RIOT (R with I/O Transparency), a system that makes R programs I/O-efficient in
    a way transparent to the users. We describe our experience with RIOT-DB, an
    initial prototype that uses a relational database system as a backend.
