OPTIMIZING PIPELINING RESULT SETS WITH FAULT TOLERANCE IN DISTRIBUTED QUERY EXECUTION
Aspects extend to methods, systems, and computer program products for optimally pipelining result sets with fault tolerance in distributed query execution. Distributed computing jobs are optimized by dividing the distributed computing jobs into one or more bubbles for execution. Each bubble can be i...
Gespeichert in:
Hauptverfasser: | , , , , , , , , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Aspects extend to methods, systems, and computer program products for optimally pipelining result sets with fault tolerance in distributed query execution. Distributed computing jobs are optimized by dividing the distributed computing jobs into one or more bubbles for execution. Each bubble can be independently executed, potentially in parallel with other bubbles, when resources to handle the bubble are available. Intra-bubble communication can be streamed between vertices within a bubble. Inter-bubble communication can be stored to durable storage. Bubbles provide a failure boundary for a job graph and re-executing a bubble along with storage of intermediate results in durable storage can be used to recover from failures. When a vertex inside a bubble fails, computation can resume by rescheduling the execution of the failed bubble from the durable inputs for that bubble. Durable storage provides a light-weight failover to handle non-deterministic behavior. Jobs can also leverage streaming to increase performance |
---|