ROSA: R Optimizations with Static Analysis
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | R is a popular language and programming environment for data scientists. It
is increasingly co-packaged with both relational and Hadoop-based data
platforms and can often be the most dominant computational component in data
analytics pipelines. Recent work has highlighted inefficiencies in executing R
programs, both in terms of execution time and memory requirements, which in
practice limit the size of data that can be analyzed by R. This paper presents
ROSA, a static analysis framework to improve the performance and space
efficiency of R programs. ROSA analyzes input programs to determine program
properties such as reaching definitions, live variables, aliased variables, and
types of variables. These inferred properties enable program transformations
such as C++ code translation, strength reduction, vectorization, and code motion,
in addition to interpretive optimizations such as avoiding redundant object
copies and performing in-place evaluations. An empirical evaluation shows
substantial reductions by ROSA in execution time and memory consumption over
both CRAN R and Microsoft R Open. |
---|---|
DOI: | 10.48550/arxiv.1704.02996 |
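
The summary links inferred properties such as live variables to optimizations such as in-place evaluation. The following is a minimal sketch of that idea, not ROSA's actual algorithm: a backward live-variable pass over a straight-line program, followed by a check for operands that are dead after a statement and whose storage could therefore be reused. The statement encoding, the names `PROGRAM`, `live_after`, and `in_place_candidates`, and the example statements are all assumptions made for illustration; the sketch is written in Python rather than R and ignores aliasing, which the summary lists as a separate analysis.

```python
# Hypothetical straight-line program: each statement is (target, operands).
# This encoding is an assumption for illustration, not ROSA's representation.
PROGRAM = [
    ("a", ["x"]),       # a <- x * 2
    ("b", ["a"]),       # b <- a + 1
    ("c", ["b", "x"]),  # c <- b * x
]


def live_after(program, live_out):
    """Return, for each statement, the set of variables live just after it."""
    live = set(live_out)
    result = [None] * len(program)
    # Walk backwards: live_before = (live_after - {target}) | operands
    for i in range(len(program) - 1, -1, -1):
        target, operands = program[i]
        result[i] = set(live)
        live.discard(target)
        live.update(operands)
    return result


def in_place_candidates(program, live_out):
    """List (statement index, operand) pairs where the operand is not live
    after the statement, so its storage could be reused for the result.
    Aliasing is ignored here; a real analysis would have to rule it out."""
    after = live_after(program, live_out)
    return [
        (i, op)
        for i, (_target, operands) in enumerate(program)
        for op in operands
        if op not in after[i]
    ]


# Assume only "c" is needed once the program finishes.
LIVE = live_after(PROGRAM, {"c"})
CANDIDATES = in_place_candidates(PROGRAM, {"c"})
```

In this sketch, `a` is no longer live once `b` has been computed, so the second statement is flagged as a candidate for reusing the storage of `a` instead of allocating a new object. The input `x` is flagged only at the last statement, and only under the stated assumption that nothing else refers to it.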