Data pipeline definition using descriptive language

Bibliographic details
Main authors: Damodaran, Bhargavi, Hariharan, Srinivasan, Bysani, Puneet, Venkateswaran, Lakshmi Ranjani, Rajanna, Uday, Gu, Yifan
Format: Patent
Language: English
Description
Summary: A computer-implemented method executed using a first networked computer and comprising:
receiving a digitally stored workflow pattern that specifies at least an input data source, a data transformation process, and an output data destination, the workflow pattern comprising a structured plurality of name declarations and value specifications that are human readable and machine readable, the data transformation process specified in the workflow pattern including one or more references to processing logic, a processing logic source outside the workflow pattern at which the processing logic is stored, and one or more available process engines that are capable of processing the processing logic;
machine parsing the workflow pattern and dividing the workflow pattern into a plurality of execution units, each execution unit being associated with a particular process engine among the one or more available process engines;
accessing the input data source specified in the workflow pattern and loading at least a portion of data from the input data source into main memory;
accessing the processing logic source at a second networked computer and loading a copy of the processing logic specified in the workflow pattern from the second networked computer;
for each of the execution units, selecting a particular process engine among the plurality of available process engines, calling the particular process engine, programmatically providing access to the portion of data and the copy of the processing logic, and receiving output data that has been created by the particular process engine after transforming the portion of data.
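
The steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The JSON layout of the workflow pattern, all names (PATTERN, ExecutionUnit, parse_pattern, run_pipeline), and the in-memory dictionaries standing in for the input data source, the remote processing logic source, and the process engines are hypothetical. Feeding each execution unit's output into the next unit is also an assumption; the summary does not state how units are chained.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical workflow pattern: name declarations and value specifications
# that are both human readable and machine readable (JSON is an assumption).
PATTERN = """
{
  "input": {"source": "memory://numbers"},
  "logic_source": "repo://example",
  "transform": [
    {"name": "double", "logic": "double.py", "engine": "python"},
    {"name": "total", "logic": "total.py", "engine": "python"}
  ],
  "output": {"destination": "memory://result"}
}
"""

# Stand-ins (assumptions) for the external systems named in the summary.
DATA_SOURCES: Dict[str, List[int]] = {"memory://numbers": [1, 2, 3]}
LOGIC_SOURCE: Dict[str, Callable[[List[int]], List[int]]] = {
    "double.py": lambda rows: [r * 2 for r in rows],
    "total.py": lambda rows: [sum(rows)],
}
ENGINES: Dict[str, Callable[[Callable, List[int]], List[int]]] = {
    # A process engine receives the processing logic and the data.
    "python": lambda logic, data: logic(data),
}


@dataclass
class ExecutionUnit:
    name: str
    logic_ref: str  # reference to processing logic stored outside the pattern
    engine: str     # process engine associated with this unit


def parse_pattern(text: str) -> Tuple[Dict[str, Any], List[ExecutionUnit]]:
    """Parse the pattern and divide it into execution units."""
    spec = json.loads(text)
    units = [
        ExecutionUnit(step["name"], step["logic"], step["engine"])
        for step in spec["transform"]
    ]
    return spec, units


def run_pipeline(text: str) -> Tuple[str, List[int]]:
    spec, units = parse_pattern(text)
    # Load a portion of the input data into main memory.
    data = list(DATA_SOURCES[spec["input"]["source"]])
    for unit in units:
        logic = LOGIC_SOURCE[unit.logic_ref]  # load a copy of the logic
        engine = ENGINES[unit.engine]         # select the process engine
        data = engine(logic, data)            # call it and receive output data
    return spec["output"]["destination"], data


if __name__ == "__main__":
    print(run_pipeline(PATTERN))
```

In this sketch the pattern only names the logic and the engine; the logic itself lives in a separate store, which mirrors the summary's separation between the workflow pattern and the processing logic source.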