Controlling AWS Costs with Data Carousel
How can the costs of a 2.4-petabyte dataset hosted on AWS be managed? This question was posed by the EOSDIS Large, Mission Scale Data working group. Part of the answer lies in keeping the data in low-cost Glacier storage; however, unbounded data retrieval costs are incompatible with federal...
Main authors: | Galewsky, Benjamin; Petravick, Donald; Daues, Greg; Readey, John; Kolak, Ryan |
---|---|
Format: | Video |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Galewsky, Benjamin; Petravick, Donald; Daues, Greg; Readey, John; Kolak, Ryan |
description | How can the costs of a 2.4-petabyte dataset hosted on AWS be managed? This question was posed by the EOSDIS Large, Mission Scale Data working group. Part of the answer lies in keeping the data in low-cost Glacier storage; however, unbounded data retrieval costs are incompatible with federal budget rules. We describe and demonstrate a data carousel model in which data is restored on a fixed, regular schedule and research jobs are run against it before it is returned to cold storage. This gives NASA a bounded, fixed operating cost and lets researchers scale their analysis as their budgets and needs permit. This presentation was given at the Earth Science Information Partners (ESIP) Summer Meeting, held online in July 2020. |
doi_str_mv | 10.6084/m9.figshare.12690038 |
format | Video |
fullrecord | <record><control><sourceid>datacite_PQ8</sourceid><recordid>TN_cdi_datacite_primary_10_6084_m9_figshare_12690038</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_6084_m9_figshare_12690038</sourcerecordid><originalsourceid>FETCH-datacite_primary_10_6084_m9_figshare_126900383</originalsourceid><addsrcrecordid>eNpjYJAxNNAzM7Aw0c-11EvLTC_OSCxK1TM0MrM0MDC24GTQcM7PKynKz8nJzEtXcAwPVnDOLy4pVijPLMlQcEksSVRwTizKLy1OzeFhYE1LzClO5YXS3Awmbq4hzh66KUBVyZklqfEFRZm5iUWV8YYG8SAL43Mt42EWxsMsNCZTGwBdvz0f</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>video</recordtype></control><display><type>video</type><title>Controlling AWS Costs with Data Carousel</title><source>DataCite</source><creator>Galewsky, Benjamin ; Petravick, Donald ; Daues, Greg ; Readey, John ; Kolak, Ryan</creator><creatorcontrib>Galewsky, Benjamin ; Petravick, Donald ; Daues, Greg ; Readey, John ; Kolak, Ryan</creatorcontrib><description>How to manage the costs associated with a 2.4 Petabyte dataset hosted on AWS? This is a question posed by the EOSDIS Large, Mission Scale Data working group. Part of the answer lies in keeping the data in low-cost Glacier storage, however unbounded data retrieval costs are incompatible with federal budget rules. We will describe and demonstrate a data carousel model where data is restored on a fixed regular schedule and research jobs are run against the data before it is again placed in cold storage. This provides a bounded, fixed cost to NASA to operate, and allows the researchers to scale their analysis as their budgets and needs permit. 
This presentation was given at the Earth Science Information Partners (ESIP) Summer Meeting held online in July 2020.</description><identifier>DOI: 10.6084/m9.figshare.12690038</identifier><language>eng</language><publisher>ESIP</publisher><subject>Climate Science ; FOS: Electrical engineering, electronic engineering, information engineering ; Input, Output and Data Devices</subject><creationdate>2020</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,1894</link.rule.ids><linktorsrc>$$Uhttps://commons.datacite.org/doi.org/10.6084/m9.figshare.12690038$$EView_record_in_DataCite.org$$FView_record_in_$$GDataCite.org$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Galewsky, Benjamin</creatorcontrib><creatorcontrib>Petravick, Donald</creatorcontrib><creatorcontrib>Daues, Greg</creatorcontrib><creatorcontrib>Readey, John</creatorcontrib><creatorcontrib>Kolak, Ryan</creatorcontrib><title>Controlling AWS Costs with Data Carousel</title><description>How to manage the costs associated with a 2.4 Petabyte dataset hosted on AWS? This is a question posed by the EOSDIS Large, Mission Scale Data working group. Part of the answer lies in keeping the data in low-cost Glacier storage, however unbounded data retrieval costs are incompatible with federal budget rules. We will describe and demonstrate a data carousel model where data is restored on a fixed regular schedule and research jobs are run against the data before it is again placed in cold storage. This provides a bounded, fixed cost to NASA to operate, and allows the researchers to scale their analysis as their budgets and needs permit. 
This presentation was given at the Earth Science Information Partners (ESIP) Summer Meeting held online in July 2020.</description><subject>Climate Science</subject><subject>FOS: Electrical engineering, electronic engineering, information engineering</subject><subject>Input, Output and Data Devices</subject><fulltext>true</fulltext><rsrctype>video</rsrctype><creationdate>2020</creationdate><recordtype>video</recordtype><sourceid>PQ8</sourceid><recordid>eNpjYJAxNNAzM7Aw0c-11EvLTC_OSCxK1TM0MrM0MDC24GTQcM7PKynKz8nJzEtXcAwPVnDOLy4pVijPLMlQcEksSVRwTizKLy1OzeFhYE1LzClO5YXS3Awmbq4hzh66KUBVyZklqfEFRZm5iUWV8YYG8SAL43Mt42EWxsMsNCZTGwBdvz0f</recordid><startdate>20200722</startdate><enddate>20200722</enddate><creator>Galewsky, Benjamin</creator><creator>Petravick, Donald</creator><creator>Daues, Greg</creator><creator>Readey, John</creator><creator>Kolak, Ryan</creator><general>ESIP</general><scope>DYCCY</scope><scope>PQ8</scope></search><sort><creationdate>20200722</creationdate><title>Controlling AWS Costs with Data Carousel</title><author>Galewsky, Benjamin ; Petravick, Donald ; Daues, Greg ; Readey, John ; Kolak, Ryan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-datacite_primary_10_6084_m9_figshare_126900383</frbrgroupid><rsrctype>videos</rsrctype><prefilter>videos</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Climate Science</topic><topic>FOS: Electrical engineering, electronic engineering, information engineering</topic><topic>Input, Output and Data Devices</topic><toplevel>online_resources</toplevel><creatorcontrib>Galewsky, Benjamin</creatorcontrib><creatorcontrib>Petravick, Donald</creatorcontrib><creatorcontrib>Daues, Greg</creatorcontrib><creatorcontrib>Readey, John</creatorcontrib><creatorcontrib>Kolak, Ryan</creatorcontrib><collection>DataCite (Open Access)</collection><collection>DataCite</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Galewsky, Benjamin</au><au>Petravick, Donald</au><au>Daues, Greg</au><au>Readey, John</au><au>Kolak, Ryan</au><genre>unknown</genre><ristype>VIDEO</ristype><title>Controlling AWS Costs with Data Carousel</title><date>2020-07-22</date><risdate>2020</risdate><abstract>How to manage the costs associated with a 2.4 Petabyte dataset hosted on AWS? This is a question posed by the EOSDIS Large, Mission Scale Data working group. Part of the answer lies in keeping the data in low-cost Glacier storage, however unbounded data retrieval costs are incompatible with federal budget rules. We will describe and demonstrate a data carousel model where data is restored on a fixed regular schedule and research jobs are run against the data before it is again placed in cold storage. This provides a bounded, fixed cost to NASA to operate, and allows the researchers to scale their analysis as their budgets and needs permit. This presentation was given at the Earth Science Information Partners (ESIP) Summer Meeting held online in July 2020.</abstract><pub>ESIP</pub><doi>10.6084/m9.figshare.12690038</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.6084/m9.figshare.12690038 |
ispartof | |
issn | |
language | eng |
recordid | cdi_datacite_primary_10_6084_m9_figshare_12690038 |
source | DataCite |
subjects | Climate Science; FOS: Electrical engineering, electronic engineering, information engineering; Input, Output and Data Devices |
title | Controlling AWS Costs with Data Carousel |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T07%3A30%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-datacite_PQ8&rft_val_fmt=info:ofi/fmt:kev:mtx:&rft.genre=unknown&rft.au=Galewsky,%20Benjamin&rft.date=2020-07-22&rft_id=info:doi/10.6084/m9.figshare.12690038&rft_dat=%3Cdatacite_PQ8%3E10_6084_m9_figshare_12690038%3C/datacite_PQ8%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
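
The data carousel model summarized in the description can be illustrated with a small sketch. It is not the presenters' implementation: the class `CarouselPlan`, the function `jobs_for_turn`, the number of slices, and the retrieval price are all hypothetical and chosen only for illustration, and 2.4 petabytes is taken as 2,400 terabytes. The sketch shows why the cost is bounded: one slice is restored per turn on a fixed rotation, so the retrieval bill per turn depends on the slice size and not on how many research jobs run.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import cycle, islice


@dataclass(frozen=True)
class CarouselPlan:
    """Hypothetical carousel settings; none of these values come from the talk."""

    dataset_tb: float          # total dataset size in terabytes
    slices: int                # number of equal slices the dataset is split into
    restore_usd_per_tb: float  # placeholder retrieval price, NOT a real AWS rate

    @property
    def slice_tb(self) -> float:
        """Size of the single slice that is warm during one turn."""
        return self.dataset_tb / self.slices

    def cost_per_turn(self) -> float:
        """Retrieval cost of one turn; independent of how many jobs run."""
        return self.slice_tb * self.restore_usd_per_tb

    def schedule(self, turns: int) -> list[int]:
        """Slice index restored on each turn, wrapping around like a carousel."""
        return list(islice(cycle(range(self.slices)), turns))


def jobs_for_turn(plan: CarouselPlan, turn: int,
                  pending: list[tuple[str, int]]) -> list[str]:
    """Names of pending jobs whose requested slice is warm on this turn."""
    warm = turn % plan.slices
    return [name for name, wanted in pending if wanted == warm]


# Illustrative values only: 2.4 PB taken as 2400 TB, 12 slices, made-up price.
plan = CarouselPlan(dataset_tb=2400.0, slices=12, restore_usd_per_tb=2.5)
pending_jobs = [("trend-analysis", 0), ("cloud-mask", 3), ("reprocess", 0)]
```

With these placeholder values, `plan.cost_per_turn()` is the same on every turn, and `jobs_for_turn(plan, 12, pending_jobs)` returns the two jobs waiting on slice 0, because turn 12 wraps back to the first slice. Researchers add more jobs at their own compute cost without changing the retrieval cost of the rotation.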