Root-to-Leaf Scheduling in Write-Optimized Trees

Write-optimized dictionaries are a class of cache-efficient data structures that buffer updates and apply them in batches to optimize the amortized cache misses per update. For example, a B^epsilon tree inserts updates as messages at the root. B^epsilon trees only move ("flush") messages w...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Chung, Christopher, Jannen, William, McCauley, Samuel, Simon, Bertrand
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Chung, Christopher
Jannen, William
McCauley, Samuel
Simon, Bertrand
description Write-optimized dictionaries are a class of cache-efficient data structures that buffer updates and apply them in batches to optimize the amortized cache misses per update. For example, a B^epsilon tree inserts updates as messages at the root. B^epsilon trees only move ("flush") messages when they have total size close to a cache line, optimizing the amount of work done per cache line written. Thus, recently-inserted messages reside at or near the root and are only flushed down the tree after a sufficient number of new messages arrive. Although this lazy approach works well for many operations, some types of updates do not complete until the update message reaches a leaf. For example, deferred queries and secure deletes must flush through all nodes along their root-to-leaf path before taking effect. What happens when we want to service a large number of (say) secure deletes as quickly as possible? Classic techniques leave us with an unsavory choice. On the one hand, we can group the delete messages using a write-optimized approach and move them down the tree lazily. But then many individual deletes may be left incomplete for an extended period of time, as their messages wait to be grouped with a sufficiently large number of related messages. On the other hand, we can ignore cache efficiency and perform a root-to-leaf flush for each delete. This begins work on individual deletes immediately, but harms system throughput. This paper investigates a new framework for efficiently flushing collections of messages from the root to their leaves in a write-optimized data structure. Our goal is to minimize the average time that messages reach the leaves. We give an algorithm that O(1)-approximates the optimal average completion time in this model. Along the way, we give a new 4-approximation algorithm for scheduling parallel tasks for weighted completion time with tree precedence constraints.
doi_str_mv 10.48550/arxiv.2404.17544
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2404_17544</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2404_17544</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2404_175443</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjEw0TM0NzUx4WQwCMrPL9Etydf1SU1MUwhOzkhNKc3JzEtXyMxTCC_KLEnV9S8oyczNrEpNUQgpSk0t5mFgTUvMKU7lhdLcDPJuriHOHrpgs-MLijJzE4sq40F2xIPtMCasAgAbiDBn</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Root-to-Leaf Scheduling in Write-Optimized Trees</title><source>arXiv.org</source><creator>Chung, Christopher ; Jannen, William ; McCauley, Samuel ; Simon, Bertrand</creator><creatorcontrib>Chung, Christopher ; Jannen, William ; McCauley, Samuel ; Simon, Bertrand</creatorcontrib><description>Write-optimized dictionaries are a class of cache-efficient data structures that buffer updates and apply them in batches to optimize the amortized cache misses per update. For example, a B^epsilon tree inserts updates as messages at the root. B^epsilon trees only move ("flush") messages when they have total size close to a cache line, optimizing the amount of work done per cache line written. Thus, recently-inserted messages reside at or near the root and are only flushed down the tree after a sufficient number of new messages arrive. Although this lazy approach works well for many operations, some types of updates do not complete until the update message reaches a leaf. For example, deferred queries and secure deletes must flush through all nodes along their root-to-leaf path before taking effect. What happens when we want to service a large number of (say) secure deletes as quickly as possible? Classic techniques leave us with an unsavory choice. On the one hand, we can group the delete messages using a write-optimized approach and move them down the tree lazily. But then many individual deletes may be left incomplete for an extended period of time, as their messages wait to be grouped with a sufficiently large number of related messages. On the other hand, we can ignore cache efficiency and perform a root-to-leaf flush for each delete. This begins work on individual deletes immediately, but harms system throughput. This paper investigates a new framework for efficiently flushing collections of messages from the root to their leaves in a write-optimized data structure. Our goal is to minimize the average time that messages reach the leaves. We give an algorithm that O(1)-approximates the optimal average completion time in this model. Along the way, we give a new 4-approximation algorithm for scheduling parallel tasks for weighted completion time with tree precedence constraints.</description><identifier>DOI: 10.48550/arxiv.2404.17544</identifier><language>eng</language><subject>Computer Science - Data Structures and Algorithms</subject><creationdate>2024-04</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2404.17544$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2404.17544$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Chung, Christopher</creatorcontrib><creatorcontrib>Jannen, William</creatorcontrib><creatorcontrib>McCauley, Samuel</creatorcontrib><creatorcontrib>Simon, Bertrand</creatorcontrib><title>Root-to-Leaf Scheduling in Write-Optimized Trees</title><description>Write-optimized dictionaries are a class of cache-efficient data structures that buffer updates and apply them in batches to optimize the amortized cache misses per update. For example, a B^epsilon tree inserts updates as messages at the root. B^epsilon trees only move ("flush") messages when they have total size close to a cache line, optimizing the amount of work done per cache line written. Thus, recently-inserted messages reside at or near the root and are only flushed down the tree after a sufficient number of new messages arrive. Although this lazy approach works well for many operations, some types of updates do not complete until the update message reaches a leaf. For example, deferred queries and secure deletes must flush through all nodes along their root-to-leaf path before taking effect. What happens when we want to service a large number of (say) secure deletes as quickly as possible? Classic techniques leave us with an unsavory choice. On the one hand, we can group the delete messages using a write-optimized approach and move them down the tree lazily. But then many individual deletes may be left incomplete for an extended period of time, as their messages wait to be grouped with a sufficiently large number of related messages. On the other hand, we can ignore cache efficiency and perform a root-to-leaf flush for each delete. This begins work on individual deletes immediately, but harms system throughput. This paper investigates a new framework for efficiently flushing collections of messages from the root to their leaves in a write-optimized data structure. Our goal is to minimize the average time that messages reach the leaves. We give an algorithm that O(1)-approximates the optimal average completion time in this model. Along the way, we give a new 4-approximation algorithm for scheduling parallel tasks for weighted completion time with tree precedence constraints.</description><subject>Computer Science - Data Structures and Algorithms</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjEw0TM0NzUx4WQwCMrPL9Etydf1SU1MUwhOzkhNKc3JzEtXyMxTCC_KLEnV9S8oyczNrEpNUQgpSk0t5mFgTUvMKU7lhdLcDPJuriHOHrpgs-MLijJzE4sq40F2xIPtMCasAgAbiDBn</recordid><startdate>20240426</startdate><enddate>20240426</enddate><creator>Chung, Christopher</creator><creator>Jannen, William</creator><creator>McCauley, Samuel</creator><creator>Simon, Bertrand</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240426</creationdate><title>Root-to-Leaf Scheduling in Write-Optimized Trees</title><author>Chung, Christopher ; Jannen, William ; McCauley, Samuel ; Simon, Bertrand</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2404_175443</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Data Structures and Algorithms</topic><toplevel>online_resources</toplevel><creatorcontrib>Chung, Christopher</creatorcontrib><creatorcontrib>Jannen, William</creatorcontrib><creatorcontrib>McCauley, Samuel</creatorcontrib><creatorcontrib>Simon, Bertrand</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chung, Christopher</au><au>Jannen, William</au><au>McCauley, Samuel</au><au>Simon, Bertrand</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Root-to-Leaf Scheduling in Write-Optimized Trees</atitle><date>2024-04-26</date><risdate>2024</risdate><abstract>Write-optimized dictionaries are a class of cache-efficient data structures that buffer updates and apply them in batches to optimize the amortized cache misses per update. For example, a B^epsilon tree inserts updates as messages at the root. B^epsilon trees only move ("flush") messages when they have total size close to a cache line, optimizing the amount of work done per cache line written. Thus, recently-inserted messages reside at or near the root and are only flushed down the tree after a sufficient number of new messages arrive. Although this lazy approach works well for many operations, some types of updates do not complete until the update message reaches a leaf. For example, deferred queries and secure deletes must flush through all nodes along their root-to-leaf path before taking effect. What happens when we want to service a large number of (say) secure deletes as quickly as possible? Classic techniques leave us with an unsavory choice. On the one hand, we can group the delete messages using a write-optimized approach and move them down the tree lazily. But then many individual deletes may be left incomplete for an extended period of time, as their messages wait to be grouped with a sufficiently large number of related messages. On the other hand, we can ignore cache efficiency and perform a root-to-leaf flush for each delete. This begins work on individual deletes immediately, but harms system throughput. This paper investigates a new framework for efficiently flushing collections of messages from the root to their leaves in a write-optimized data structure. Our goal is to minimize the average time that messages reach the leaves. We give an algorithm that O(1)-approximates the optimal average completion time in this model. Along the way, we give a new 4-approximation algorithm for scheduling parallel tasks for weighted completion time with tree precedence constraints.</abstract><doi>10.48550/arxiv.2404.17544</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2404.17544
ispartof
issn
language eng
recordid cdi_arxiv_primary_2404_17544
source arXiv.org
subjects Computer Science - Data Structures and Algorithms
title Root-to-Leaf Scheduling in Write-Optimized Trees
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T06%3A45%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Root-to-Leaf%20Scheduling%20in%20Write-Optimized%20Trees&rft.au=Chung,%20Christopher&rft.date=2024-04-26&rft_id=info:doi/10.48550/arxiv.2404.17544&rft_dat=%3Carxiv_GOX%3E2404_17544%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true