Metabolic Flux Analysis in the Cloud
The MapReduce pattern popularized by Google has successfully been utilized in several scientific applications. In this paper, it is investigated whether a MapReduce approach utilizing on-demand resources from a Cloud is beneficial to perform simulation tasks in the area of Systems Biology and whethe...
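Main authors: | Dalman, Tolga ; Doernemann, Tim ; Juhnke, Ernst ; Weitzel, Michael ; Smith, Matthew ; Wiechert, Wolfgang ; Noh, Katharina ; Freisleben, Bernd |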
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 64 |
---|---|
container_issue | |
container_start_page | 57 |
container_title | |
container_volume | |
creator | Dalman, Tolga Doernemann, Tim Juhnke, Ernst Weitzel, Michael Smith, Matthew Wiechert, Wolfgang Noh, Katharina Freisleben, Bernd |
description | The MapReduce pattern popularized by Google has successfully been utilized in several scientific applications. In this paper, it is investigated whether a MapReduce approach utilizing on-demand resources from a Cloud is beneficial to perform simulation tasks in the area of Systems Biology and whether it can be seamlessly integrated into a service-oriented scientific workflow framework. In particular, an Amazon Elastic Map Reduce Cloud implementation of the 13C-MFA (Metabolic Flux Analysis) Monte Carlo bootstrap approach aimed at the integration into an existing BPEL-based scientific workflow system is presented. A comparison of a 64-node MapReduce cluster with a single-node computation approach reveals a total performance gain of up to a factor of 14, with a total cost for on-demand resources of 11. The most critical factor in terms of performance is I/O, i.e. our application suffers from the fact that I/O operations on many small files are expensive using Amazon S3 and the Hadoop DFS. |
doi_str_mv | 10.1109/eScience.2010.20 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5693899</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5693899</ieee_id><sourcerecordid>5693899</sourcerecordid><originalsourceid>FETCH-LOGICAL-c137t-e5c8f0d18af99d862a8678dff71d77b44e0a2bd3652163bb9e47075ae1f37ae83</originalsourceid><addsrcrecordid>eNotjD1LA0EUAFdEUOP1gs0Wthf37dfbV4bDqJCQQq3D3u1bXDkvkruA-fcGdJqBKUaIW1BzAEUP_NoVHjqea3VKWp2JijAo9OSsJmXPxTVYbW0gh3ApqnH8VCecRgS8EvdrnmK760snl_3hRy6G2B_HMsoyyOmDZdPvDulGXOTYj1z9eybel49vzXO92jy9NItV3YHBqWbXhawShJiJUvA6Bo8h5YyQEFtrWUXdJuOdBm_altiiQhcZssHIwczE3d-3MPP2e1--4v64dZ5MIDK_RKBAkw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Metabolic Flux Analysis in the Cloud</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Dalman, Tolga ; Doernemann, Tim ; Juhnke, Ernst ; Weitzel, Michael ; Smith, Matthew ; Wiechert, Wolfgang ; Noh, Katharina ; Freisleben, Bernd</creator><creatorcontrib>Dalman, Tolga ; Doernemann, Tim ; Juhnke, Ernst ; Weitzel, Michael ; Smith, Matthew ; Wiechert, Wolfgang ; Noh, Katharina ; Freisleben, Bernd</creatorcontrib><description>The MapReduce pattern popularized by Google has successfully been utilized in several scientific applications. In this paper, it is investigated whether a MapReduce approach utilizing on-demand resources from a Cloud is beneficial to perform simulation tasks in the area of Systems Biology and whether it can be seamlessly integrated into a service-oriented scientific workflow framework. In particular, an Amazon Elastic Map Reduce Cloud implementation of the 13C-MFA (Metabolix Flux Analysis) Monte Carlo bootstrap approach aimed at the integration into an existing BPEL-based scientific workflow system is presented. 
A comparison of a 64 node MapReduce cluster with a single node computation approach reveals a total performance gain up to a factor of 14, with a total cost for on-demand resources of 11. The most critical factor in terms of performance is I/O, i.e. our application suffers from the fact that I/O operations on many small files are expensive using Amazon S3 and the Hadoop DFS.</description><identifier>ISBN: 1424489571</identifier><identifier>ISBN: 9781424489572</identifier><identifier>EISBN: 9780769542904</identifier><identifier>EISBN: 0769542905</identifier><identifier>DOI: 10.1109/eScience.2010.20</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Analytical models ; Cloud computing ; Computational modeling ; Data models ; Hadoop ; MapReduce ; MFA ; Monte Carlo methods ; Systems Biology</subject><ispartof>2010 IEEE Sixth International Conference on e-Science, 2010, p.57-64</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c137t-e5c8f0d18af99d862a8678dff71d77b44e0a2bd3652163bb9e47075ae1f37ae83</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5693899$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2051,27904,54898</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5693899$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Dalman, Tolga</creatorcontrib><creatorcontrib>Doernemann, Tim</creatorcontrib><creatorcontrib>Juhnke, Ernst</creatorcontrib><creatorcontrib>Weitzel, Michael</creatorcontrib><creatorcontrib>Smith, Matthew</creatorcontrib><creatorcontrib>Wiechert, Wolfgang</creatorcontrib><creatorcontrib>Noh, Katharina</creatorcontrib><creatorcontrib>Freisleben, Bernd</creatorcontrib><title>Metabolic Flux Analysis in the 
Cloud</title><title>2010 IEEE Sixth International Conference on e-Science</title><addtitle>escience</addtitle><description>The MapReduce pattern popularized by Google has successfully been utilized in several scientific applications. In this paper, it is investigated whether a MapReduce approach utilizing on-demand resources from a Cloud is beneficial to perform simulation tasks in the area of Systems Biology and whether it can be seamlessly integrated into a service-oriented scientific workflow framework. In particular, an Amazon Elastic Map Reduce Cloud implementation of the 13C-MFA (Metabolix Flux Analysis) Monte Carlo bootstrap approach aimed at the integration into an existing BPEL-based scientific workflow system is presented. A comparison of a 64 node MapReduce cluster with a single node computation approach reveals a total performance gain up to a factor of 14, with a total cost for on-demand resources of 11. The most critical factor in terms of performance is I/O, i.e. our application suffers from the fact that I/O operations on many small files are expensive using Amazon S3 and the Hadoop DFS.</description><subject>Algorithm design and analysis</subject><subject>Analytical models</subject><subject>Cloud computing</subject><subject>Computational modeling</subject><subject>Data models</subject><subject>Hadoop</subject><subject>MapReduce</subject><subject>MFA</subject><subject>Monte Carlo methods</subject><subject>Systems 
Biology</subject><isbn>1424489571</isbn><isbn>9781424489572</isbn><isbn>9780769542904</isbn><isbn>0769542905</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotjD1LA0EUAFdEUOP1gs0Wthf37dfbV4bDqJCQQq3D3u1bXDkvkruA-fcGdJqBKUaIW1BzAEUP_NoVHjqea3VKWp2JijAo9OSsJmXPxTVYbW0gh3ApqnH8VCecRgS8EvdrnmK760snl_3hRy6G2B_HMsoyyOmDZdPvDulGXOTYj1z9eybel49vzXO92jy9NItV3YHBqWbXhawShJiJUvA6Bo8h5YyQEFtrWUXdJuOdBm_altiiQhcZssHIwczE3d-3MPP2e1--4v64dZ5MIDK_RKBAkw</recordid><startdate>201012</startdate><enddate>201012</enddate><creator>Dalman, Tolga</creator><creator>Doernemann, Tim</creator><creator>Juhnke, Ernst</creator><creator>Weitzel, Michael</creator><creator>Smith, Matthew</creator><creator>Wiechert, Wolfgang</creator><creator>Noh, Katharina</creator><creator>Freisleben, Bernd</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201012</creationdate><title>Metabolic Flux Analysis in the Cloud</title><author>Dalman, Tolga ; Doernemann, Tim ; Juhnke, Ernst ; Weitzel, Michael ; Smith, Matthew ; Wiechert, Wolfgang ; Noh, Katharina ; Freisleben, Bernd</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c137t-e5c8f0d18af99d862a8678dff71d77b44e0a2bd3652163bb9e47075ae1f37ae83</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Algorithm design and analysis</topic><topic>Analytical models</topic><topic>Cloud computing</topic><topic>Computational modeling</topic><topic>Data models</topic><topic>Hadoop</topic><topic>MapReduce</topic><topic>MFA</topic><topic>Monte Carlo methods</topic><topic>Systems Biology</topic><toplevel>online_resources</toplevel><creatorcontrib>Dalman, 
Tolga</creatorcontrib><creatorcontrib>Doernemann, Tim</creatorcontrib><creatorcontrib>Juhnke, Ernst</creatorcontrib><creatorcontrib>Weitzel, Michael</creatorcontrib><creatorcontrib>Smith, Matthew</creatorcontrib><creatorcontrib>Wiechert, Wolfgang</creatorcontrib><creatorcontrib>Noh, Katharina</creatorcontrib><creatorcontrib>Freisleben, Bernd</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dalman, Tolga</au><au>Doernemann, Tim</au><au>Juhnke, Ernst</au><au>Weitzel, Michael</au><au>Smith, Matthew</au><au>Wiechert, Wolfgang</au><au>Noh, Katharina</au><au>Freisleben, Bernd</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Metabolic Flux Analysis in the Cloud</atitle><btitle>2010 IEEE Sixth International Conference on e-Science</btitle><stitle>escience</stitle><date>2010-12</date><risdate>2010</risdate><spage>57</spage><epage>64</epage><pages>57-64</pages><isbn>1424489571</isbn><isbn>9781424489572</isbn><eisbn>9780769542904</eisbn><eisbn>0769542905</eisbn><abstract>The MapReduce pattern popularized by Google has successfully been utilized in several scientific applications. In this paper, it is investigated whether a MapReduce approach utilizing on-demand resources from a Cloud is beneficial to perform simulation tasks in the area of Systems Biology and whether it can be seamlessly integrated into a service-oriented scientific workflow framework. 
In particular, an Amazon Elastic Map Reduce Cloud implementation of the 13C-MFA (Metabolix Flux Analysis) Monte Carlo bootstrap approach aimed at the integration into an existing BPEL-based scientific workflow system is presented. A comparison of a 64 node MapReduce cluster with a single node computation approach reveals a total performance gain up to a factor of 14, with a total cost for on-demand resources of 11. The most critical factor in terms of performance is I/O, i.e. our application suffers from the fact that I/O operations on many small files are expensive using Amazon S3 and the Hadoop DFS.</abstract><pub>IEEE</pub><doi>10.1109/eScience.2010.20</doi><tpages>8</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 1424489571 |
ispartof | 2010 IEEE Sixth International Conference on e-Science, 2010, p.57-64 |
issn | |
language | eng |
recordid | cdi_ieee_primary_5693899 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Algorithm design and analysis Analytical models Cloud computing Computational modeling Data models Hadoop MapReduce MFA Monte Carlo methods Systems Biology |
title | Metabolic Flux Analysis in the Cloud |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T01%3A42%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Metabolic%20Flux%20Analysis%20in%20the%20Cloud&rft.btitle=2010%20IEEE%20Sixth%20International%20Conference%20on%20e-Science&rft.au=Dalman,%20Tolga&rft.date=2010-12&rft.spage=57&rft.epage=64&rft.pages=57-64&rft.isbn=1424489571&rft.isbn_list=9781424489572&rft_id=info:doi/10.1109/eScience.2010.20&rft_dat=%3Cieee_6IE%3E5693899%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9780769542904&rft.eisbn_list=0769542905&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5693899&rfr_iscdi=true |
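
The description field of this record summarizes a Monte Carlo bootstrap expressed as a MapReduce job. The following is a minimal local sketch of that general pattern only, not the authors' implementation: the "fit" is a toy stand-in (a plain mean instead of a 13C-MFA flux estimation), the function names `map_task`, `reduce_task` and `bootstrap` are hypothetical, and nothing here touches Hadoop, Amazon Elastic MapReduce or S3. Each map task perturbs the measurements with Gaussian noise and re-runs the fit; the reduce step folds the resulting estimates into a mean and a standard deviation.

```python
import random
import statistics
from functools import reduce


def map_task(seed, measurements, sigma):
    """One Monte Carlo sample: perturb the data and re-run a toy fit.

    The toy fit is the arithmetic mean; a real workflow would call the
    flux estimation here (assumption for illustration only).
    """
    rng = random.Random(seed)  # per-task seed keeps samples reproducible
    perturbed = [m + rng.gauss(0.0, sigma) for m in measurements]
    return statistics.fmean(perturbed)


def reduce_task(acc, estimate):
    """Fold one estimate into (count, sum, sum of squares)."""
    n, s, ss = acc
    return (n + 1, s + estimate, ss + estimate * estimate)


def bootstrap(measurements, sigma, samples):
    """Run `samples` independent map tasks and reduce them to mean and std."""
    estimates = (map_task(seed, measurements, sigma) for seed in range(samples))
    n, s, ss = reduce(reduce_task, estimates, (0, 0.0, 0.0))
    mean = s / n
    variance = max(ss / n - mean * mean, 0.0)  # guard against rounding below zero
    return mean, variance ** 0.5


if __name__ == "__main__":
    mean, std = bootstrap([1.0, 2.0, 3.0], sigma=0.1, samples=200)
    print(f"bootstrap mean={mean:.4f} std={std:.4f}")
```

Because every map task depends only on its own seed and the shared input, the samples can be distributed across nodes without coordination; the reduce step needs only three running totals per partition, which is one common way to keep the aggregation cheap.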