Managing provenance information for data processing pipeline
Main authors: | , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
Summary: | A method is disclosed for managing provenance information associated with one or more interconnected provenance entities in a provenance system over a network interface for data processing pipelines in a distributed cloud environment, where each data processing pipeline is configured to read in data, transform the data, and output the transformed data. The method comprises the following steps performed by a configuration component: obtaining at least one declarative intent representing a configuration indicative of a requirement and a priority level for storing provenance information for each data processing pipeline; deriving, based on the obtained at least one declarative intent, requirements and priority levels for storing provenance information for each data processing pipeline, where one of the priority levels (a first priority level) is higher than the other (a second priority level); estimating a storage capacity for storing provenance information in the provenance system based on the derive |
---|
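
The steps named in the summary (obtaining a declarative intent, deriving per-pipeline requirements and priority levels, and estimating storage capacity) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the patent record: the `Intent` and `StorageRule` types, the `"full"`/`"summary"` requirement labels, the byte sizes, and the pipeline names are all hypothetical. The record is cut off before it says how the capacity estimate is computed, so the simple sum below is only a placeholder.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    SECOND = 1  # the lower of the two priority levels
    FIRST = 2   # the higher of the two priority levels


@dataclass(frozen=True)
class Intent:
    """Hypothetical declarative intent for one data processing pipeline."""
    pipeline: str
    store: str     # hypothetical requirement label: "full" or "summary"
    priority: str  # "first" or "second"


@dataclass(frozen=True)
class StorageRule:
    """Requirement and priority level derived from an intent."""
    pipeline: str
    bytes_per_run: int
    priority: Priority


# Assumed provenance size per pipeline run for each requirement label.
# These numbers are illustrative only.
BYTES_PER_RUN = {"full": 4096, "summary": 512}


def derive_rules(intents):
    """Derive a storage rule per pipeline, highest priority first."""
    rules = [
        StorageRule(
            pipeline=intent.pipeline,
            bytes_per_run=BYTES_PER_RUN[intent.store],
            priority=Priority[intent.priority.upper()],
        )
        for intent in intents
    ]
    return sorted(rules, key=lambda rule: rule.priority, reverse=True)


def estimate_capacity(rules, runs_per_pipeline):
    """Placeholder estimate: total bytes needed across all pipelines."""
    return sum(rule.bytes_per_run * runs_per_pipeline for rule in rules)


if __name__ == "__main__":
    intents = [
        Intent(pipeline="etl-a", store="summary", priority="second"),
        Intent(pipeline="etl-b", store="full", priority="first"),
    ]
    rules = derive_rules(intents)
    for rule in rules:
        print(rule.pipeline, rule.priority.name, rule.bytes_per_run)
    print("estimated bytes:", estimate_capacity(rules, runs_per_pipeline=10))
```

Ordering the derived rules by priority level is one plausible way a configuration component could decide which provenance information to favor when storage is limited; the record itself does not state what happens after the estimate.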