The Subset Assignment Problem for Data Placement in Caches
We introduce the subset assignment problem in which items of varying sizes are placed in a set of bins with limited capacity. Items can be replicated and placed in any subset of the bins. Each (item, subset) pair has an associated cost. Not assigning an item to any of the bins is not free in general...
Published in: | arXiv.org 2016-10 |
---|---|
Main authors: | Ghandeharizadeh, Shahram; Irani, Sandy; Lam, Jenny |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Ghandeharizadeh, Shahram; Irani, Sandy; Lam, Jenny |
description | We introduce the subset assignment problem in which items of varying sizes are placed in a set of bins with limited capacity. Items can be replicated and placed in any subset of the bins. Each (item, subset) pair has an associated cost. Not assigning an item to any of the bins is not free in general and can potentially be the most expensive option. The goal is to minimize the total cost of assigning items to subsets without exceeding the bin capacities. This problem is motivated by the design of caching systems composed of banks of memory with varying cost/performance specifications. The ability to replicate a data item in more than one memory bank can benefit the overall performance of the system with a faster recovery time in the event of a memory failure. For this setting, the number \(n\) of data objects (items) is very large and the number \(d\) of memory banks (bins) is a small constant (on the order of \(3\) or \(4\)). Therefore, the goal is to determine an optimal assignment in time that minimizes dependence on \(n\). The integral version of this problem is NP-hard since it is a generalization of the knapsack problem. We focus on an efficient solution to the LP relaxation as the number of fractionally assigned items will be at most \(d\). If the data objects are small with respect to the size of the memory banks, the effect of excluding the fractionally assigned data items from the cache will be small. We give an algorithm that solves the LP relaxation and runs in time \(O({3^d \choose d+1} \text{poly}(d) n \log(n) \log(nC) \log(Z))\), where \(Z\) is the maximum item size and \(C\) the maximum storage cost. |
format | Article |
publisher | Ithaca: Cornell University Library, arXiv.org |
date | 2016-10-01 |
rights | 2016. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2016-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2080718248 |
source | Free E- Journals |
subjects | Algorithms; Bins; Caching; Knapsack problem; Operations research; Recovery time; Time dependence |
title | The Subset Assignment Problem for Data Placement in Caches |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T16%3A30%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=The%20Subset%20Assignment%20Problem%20for%20Data%20Placement%20in%20Caches&rft.jtitle=arXiv.org&rft.au=Ghandeharizadeh,%20Shahram&rft.date=2016-10-01&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2080718248%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2080718248&rft_id=info:pmid/&rfr_iscdi=true |
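
To make the problem statement in the abstract concrete, here is a minimal sketch of the integral subset assignment problem solved by exhaustive search. It is not the LP-relaxation algorithm the abstract describes, and it runs in exponential time, so it only suits toy inputs. The function names, the cost table, and the instance (three items, two bins) are hypothetical and chosen purely for illustration.

```python
from itertools import combinations, product


def all_subsets(d):
    """Every subset of bins {0, ..., d-1}, including the empty set."""
    return [frozenset(c) for r in range(d + 1) for c in combinations(range(d), r)]


def brute_force_assignment(sizes, capacities, cost):
    """Exhaustively solve the integral problem (illustration only).

    sizes[i]      -- size of item i
    capacities[j] -- capacity of bin j
    cost(i, S)    -- cost of storing item i in exactly the bins in S;
                     S may be empty, meaning the item is not cached at all
    """
    d = len(capacities)
    subsets = all_subsets(d)
    best_cost, best_choice = None, None
    # Try every way of giving each item one subset of the bins.
    for choice in product(subsets, repeat=len(sizes)):
        load = [0] * d
        for i, S in enumerate(choice):
            for j in S:  # a replicated item takes space in every bin of S
                load[j] += sizes[i]
        if any(load[j] > capacities[j] for j in range(d)):
            continue  # some bin capacity is exceeded
        total = sum(cost(i, S) for i, S in enumerate(choice))
        if best_cost is None or total < best_cost:
            best_cost, best_choice = total, choice
    return best_cost, best_choice


# Hypothetical instance: bin 0 is a small fast bank, bin 1 a smaller slow one.
# Leaving an item out of the cache (empty subset) is the most expensive option.
COST = {
    frozenset(): 10,
    frozenset({0}): 2,
    frozenset({1}): 4,
    frozenset({0, 1}): 1,
}
sizes = [2, 2, 1]
capacities = [3, 2]
best_cost, best_choice = brute_force_assignment(
    sizes, capacities, lambda i, S: COST[S]
)
print(best_cost)  # 8
```

In this instance the total item size equals the total capacity, so caching every item leaves no room for replication: one size-2 item goes to bin 1 and the other two items share bin 0. The search enumerates \((2^d)^n\) assignments, which illustrates why a method whose running time depends only mildly on \(n\) matters when \(n\) is very large and \(d\) is a small constant.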