Scheduling computations with provably low synchronization overheads
Format: Article
Language: English
Online Access: Order full text
Abstract: Work Stealing has been a very successful algorithm for scheduling parallel computations, and is known to achieve high performance even for computations exhibiting fine-grained parallelism. We present a variant of Work Stealing that provably avoids most synchronization overheads by keeping processors' deques entirely private by default, and only exposing work when requested by thieves. This is the first paper that obtains bounds on the synchronization overheads that are (essentially) independent of the total amount of work, a significant improvement, in both algorithm design and theory, over state-of-the-art Work Stealing algorithms. Consider any computation with work $T_{1}$ and critical-path length $T_{\infty}$ executed by $P$ processors using our scheduler. Our analysis shows that the expected execution time is $O\left(\frac{T_{1}}{P} + T_{\infty}\right)$, and the expected synchronization overheads incurred during the execution are at most $O\left(\left(C_{CAS} + C_{MFence}\right)PT_{\infty}\right)$, where $C_{CAS}$ and $C_{MFence}$ respectively denote the maximum cost of executing a Compare-And-Swap instruction and a Memory Fence instruction.
DOI: 10.48550/arxiv.1810.10615
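
The following is a minimal, illustrative C++ sketch of the private-deque idea described in the abstract, not the paper's actual scheduler or its analysis: every name in it (Worker, push, pop, communicate, steal, the request/response handshake) is an assumption made for this example. It shows how the owner can operate on a plain, unsynchronized deque and only pay for a CAS and the fences implied by the atomic handshake when a thief has explicitly posted a steal request, which matches the intuition that synchronization overhead scales with the number of steal attempts (roughly $PT_{\infty}$) rather than with the total work $T_{1}$.

```cpp
// Sketch only: owner-side deque operations use no atomics; synchronization
// happens solely during an explicit steal handshake.
#include <atomic>
#include <deque>
#include <functional>
#include <iostream>
#include <optional>
#include <thread>
#include <utility>

using Task = std::function<void()>;

struct Worker {
    std::deque<Task> deque;        // strictly private to the owning processor
    std::atomic<int> request{-1};  // id of a thief asking for work, or -1
    std::atomic<int> response{0};  // 0 = pending, 1 = no work, 2 = task ready
    Task transfer;                 // handoff slot, guarded by the handshake

    void push(Task t) { deque.push_back(std::move(t)); }

    // Owner-side: answer at most one outstanding steal request.
    void communicate() {
        if (request.load() >= 0 && response.load() == 0) {
            if (deque.empty()) {
                response.store(1);                    // nothing to expose
            } else {
                transfer = std::move(deque.front());  // expose the oldest task
                deque.pop_front();
                response.store(2);
            }
        }
    }

    // Owner-side pop: purely private apart from the request check above.
    std::optional<Task> pop() {
        communicate();
        if (deque.empty()) return std::nullopt;
        Task t = std::move(deque.back());
        deque.pop_back();
        return t;
    }
};

// Thief-side: post a request with a single CAS, then wait for the answer.
std::optional<Task> steal(Worker& victim, int my_id) {
    int expected = -1;
    if (!victim.request.compare_exchange_strong(expected, my_id))
        return std::nullopt;                          // another thief is asking
    int r;
    while ((r = victim.response.load()) == 0)
        std::this_thread::yield();                    // victim has not answered
    std::optional<Task> result;
    if (r == 2) result = std::move(victim.transfer);
    victim.response.store(0);                         // reset the handshake
    victim.request.store(-1);                         // let the next thief in
    return result;
}

int main() {
    Worker w;
    std::atomic<bool> thief_done{false};
    for (int i = 0; i < 8; ++i)
        w.push([i] { std::cout << "task " << i << " ran\n"; });

    std::thread thief([&] {
        for (int attempt = 0; attempt < 4; ++attempt)
            if (auto t = steal(w, /*my_id=*/1)) (*t)();
        thief_done.store(true);
    });

    while (auto t = w.pop()) (*t)();             // owner drains its private deque
    while (!thief_done.load()) w.communicate();  // keep answering until thief exits
    thief.join();
}
```

The design choice this sketch tries to convey is that thieves never touch the victim's deque directly: they only write a request flag, so in steal-free stretches of an execution the owner's push/pop path involves no atomic instructions at all.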