Root-to-Leaf Scheduling in Write-Optimized Trees
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Write-optimized dictionaries are a class of cache-efficient data structures
that buffer updates and apply them in batches to optimize the amortized cache
misses per update. For example, a B^epsilon tree inserts updates as messages at
the root. B^epsilon trees only move ("flush") messages when they have total
size close to a cache line, optimizing the amount of work done per cache line
written. Thus, recently-inserted messages reside at or near the root and are
only flushed down the tree after a sufficient number of new messages arrive.
Although this lazy approach works well for many operations, some types of
updates do not complete until the update message reaches a leaf. For example,
deferred queries and secure deletes must flush through all nodes along their
root-to-leaf path before taking effect. What happens when we want to service a
large number of (say) secure deletes as quickly as possible? Classic techniques
leave us with an unsavory choice. On the one hand, we can group the delete
messages using a write-optimized approach and move them down the tree lazily.
But then many individual deletes may be left incomplete for an extended period
of time, as their messages wait to be grouped with a sufficiently large number
of related messages. On the other hand, we can ignore cache efficiency and
perform a root-to-leaf flush for each delete. This begins work on individual
deletes immediately, but harms system throughput. This paper investigates a new
framework for efficiently flushing collections of messages from the root to
their leaves in a write-optimized data structure. Our goal is to minimize the
average time at which messages reach the leaves. We give an algorithm that
O(1)-approximates the optimal average completion time in this model. Along the
way, we give a new 4-approximation algorithm for scheduling parallel tasks for
weighted completion time with tree precedence constraints. |
---|---|
DOI: | 10.48550/arxiv.2404.17544 |
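
The trade-off described in the summary (lazy batched flushing versus an immediate root-to-leaf flush per message) can be illustrated with a small toy model. The sketch below is not the paper's algorithm and not a real B^epsilon tree: it assumes a hypothetical two-level structure with a single root buffer over a fixed number of leaves, and the names ToyBufferedTree, flush_threshold, insert_lazy, and insert_eager are invented for illustration only.

```python
from collections import defaultdict


class ToyBufferedTree:
    """Hypothetical two-level model: one root buffer above `num_leaves` leaves."""

    def __init__(self, num_leaves, flush_threshold):
        self.num_leaves = num_leaves
        self.flush_threshold = flush_threshold  # assumed batch size that justifies a flush
        self.root_buffer = defaultdict(list)    # leaf index -> messages still at the root
        self.leaves = defaultdict(list)         # leaf index -> messages that have completed
        self.flushes = 0                        # number of root-to-leaf flushes performed

    def _leaf_for(self, key):
        # Assumption: keys are integers routed to leaves by a simple modulus.
        return key % self.num_leaves

    def _flush(self, leaf):
        """Move every pending message for `leaf` down in one batch."""
        batch = self.root_buffer.pop(leaf, [])
        if batch:
            self.leaves[leaf].extend(batch)
            self.flushes += 1

    def insert_lazy(self, key, msg):
        """Write-optimized path: flush only once enough messages share a leaf."""
        leaf = self._leaf_for(key)
        self.root_buffer[leaf].append((key, msg))
        if len(self.root_buffer[leaf]) >= self.flush_threshold:
            self._flush(leaf)

    def insert_eager(self, key, msg):
        """Latency-first path: flush to the leaf immediately for every message."""
        leaf = self._leaf_for(key)
        self.root_buffer[leaf].append((key, msg))
        self._flush(leaf)

    def pending(self):
        """Count messages that have not yet reached a leaf."""
        return sum(len(batch) for batch in self.root_buffer.values())


if __name__ == "__main__":
    lazy = ToyBufferedTree(num_leaves=4, flush_threshold=3)
    eager = ToyBufferedTree(num_leaves=4, flush_threshold=3)
    for key in range(8):
        lazy.insert_lazy(key, "secure-delete")
        eager.insert_eager(key, "secure-delete")
    print("lazy: ", lazy.flushes, "flushes,", lazy.pending(), "pending")
    print("eager:", eager.flushes, "flushes,", eager.pending(), "pending")
```

In this toy run the lazy variant performs no flushes and leaves all eight deletes incomplete, while the eager variant completes every delete at the cost of one flush per message. The scheduling framework in the summary targets the space between these two extremes.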