Streamlining data processing optimizations for machine learning workloads

Bibliographic details
Main authors: MIN, Hong, ZHANG, Qi, NAIR, Ravi, YU, Lei, RAMJI, Shyam, KAWAHITO, Motohiro, NOVOTNY, Petr, NAKAIKE, Takuya
Format: Patent
Language: English
Description
Summary: Techniques for refinement of data pipelines are provided. An original file of serialized objects is received, and an original pipeline comprising a plurality of transformations is identified based on the original file. A first computing cost is determined for a first transformation of the plurality of transformations. The first transformation is modified using a predefined optimization, and a second cost of the modified first transformation is determined. Upon determining that the second cost is lower than the first cost, the first transformation is replaced, in the original pipeline, with the optimized first transformation.
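
The cost comparison described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `measure_cost` and `refine_pipeline`, the use of wall-clock time as the computing cost, and the representation of a pipeline as a list of callables are all assumptions made for illustration. Recovering the pipeline from the file of serialized objects is omitted.

```python
import time
from typing import Any, Callable, List, Optional, Sequence

# Assumption: a transformation is any callable that maps data to data.
Transformation = Callable[[Any], Any]
# Assumption: a predefined optimization maps a transformation to a modified
# version of it, or returns None when it does not apply.
Optimization = Callable[[Transformation], Optional[Transformation]]


def measure_cost(transform: Transformation, sample: Any, repeats: int = 3) -> float:
    """Hypothetical cost model: best wall-clock time over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        transform(sample)
        best = min(best, time.perf_counter() - start)
    return best


def refine_pipeline(
    pipeline: Sequence[Transformation],
    optimizations: Sequence[Optimization],
    sample: Any,
    cost_fn: Callable[[Transformation, Any], float] = measure_cost,
) -> List[Transformation]:
    """Replace a transformation only when a modified version costs less."""
    refined: List[Transformation] = []
    data = sample
    for transform in pipeline:
        best, best_cost = transform, cost_fn(transform, data)  # first cost
        for optimize in optimizations:
            candidate = optimize(transform)
            if candidate is None:
                continue  # this optimization does not apply here
            candidate_cost = cost_fn(candidate, data)  # second cost
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
        refined.append(best)
        data = best(data)  # feed the output to the next transformation
    return refined
```

Passing the cost function as a parameter keeps the comparison logic separate from how cost is measured, so a different cost model (for example memory use) could be substituted without changing the replacement step.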